
# ZotMD

Sync your Zotero library to Markdown files with automatic updates and PDF annotation extraction.

Perfect for use with Obsidian, Logseq, or any Markdown-based note-taking app.

## Features

- 📚 Smart Sync: Incremental sync only updates changed items
- 📝 PDF Annotations: Automatically extracts highlights and notes
- 🎨 Customizable Templates: Use Jinja2 templates to format your notes
- 🔑 Citation Keys: Uses Better BibTeX for consistent filenames
- 💾 User Notes: Preserves your custom notes across syncs
- ⚙️ Configurable: Simple TOML configuration
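
The page doesn't show the generated configuration, but `zotmd init` writes a TOML file along these lines. Every key name below is an illustrative guess, not ZotMD's actual schema — run `zotmd init` and inspect the file it creates for the real layout:

```toml
# Hypothetical zotmd configuration -- the real schema comes from `zotmd init`
[zotero]
library_id = "1234567"   # your Zotero user or group library ID
api_key = "your-api-key" # created at zotero.org/settings/keys

[output]
directory = "~/notes/references" # where Markdown files are written
template = "default.md.j2"       # Jinja2 template used for each item
```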

## Quick Start

```shell
# Install with uv (https://docs.astral.sh/uv/)
uv tool install zotmd

# Set up configuration
zotmd init

# Sync your library
zotmd sync
```
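
The "smart sync" feature can be pictured as a version diff: the Zotero Web API reports a per-item version number, so a client only needs to re-fetch items whose version differs from its local cache. A minimal sketch of that idea — an illustration of the technique, not ZotMD's actual implementation:

```python
def plan_sync(cached: dict[str, int], remote: dict[str, int]):
    """Compare cached item versions against the library's current versions.

    Returns (item keys to fetch or update, item keys deleted upstream).
    """
    # An item needs fetching if it is new or its version number changed.
    to_update = [key for key, version in remote.items()
                 if cached.get(key) != version]
    # An item cached locally but absent remotely was deleted upstream.
    to_delete = [key for key in cached if key not in remote]
    return to_update, to_delete

# Item B's version was bumped and C is new, so both are fetched;
# A no longer exists upstream, so its note can be flagged or removed.
changed, removed = plan_sync({"A": 10, "B": 11}, {"B": 12, "C": 3})
```

Only the changed items are then rendered to Markdown, which is why repeated syncs of a large library stay fast.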

## Requirements

## Documentation

📖 [Full Documentation](https://adbX.github.io/zotmd/)

## Example Output

---
title: "Reproscreener: Leveraging LLMs for Assessing Computational Reproducibility of Machine Learning Pipelines"
citekey: "bhaskarReproscreenerLeveragingLLMs2024"
itemType: conferencePaper
venue: "Association for Computing Machinery"
year: 2024
dateAdded: 2025-02-08
authors:
  - "Adhithya Bhaskar"
  - "Victoria Stodden"
status: unread
tags:
  - automated-checks
  - metrics
  - tools/LLM
  - references
links:
  - "zotero://select/library/items/RUNIG8WJ"
  - "https://doi.org/10.1145/3641525.3663629"
aliases:
  - "Reproscreener Leveraging LLMs for Assessing Computational Reproducibility of Machine Learning Pipelines"
  - "bhaskarReproscreenerLeveragingLLMs2024"
---

# @bhaskarReproscreenerLeveragingLLMs2024


> [!abstract]-
> The increasing reliance on machine learning models in scientific research and day-to-day applications – and the near-opacity of their associated computational methods – creates a widely recognized need to enable others to verify results coming from Machine Learning Pipelines. In this work we use an empirical approach to build on efforts to define and deploy structured publication standards that allow machine learning research to be automatically assessed and verified, enabling greater reliability and trust in results. To automate the assessment of a set of publication standards for Machine Learning Pipelines we developed Reproscreener; a novel, open-source software tool (see https://reproscreener.org/). We benchmark Reproscreener’s automatic reproducibility assessment against a novel manually labeled “gold standard” dataset of machine learning arXiv preprints. Our empirical evaluation has a dual goal: to assess Reproscreener’s performance; and to uncover gaps and opportunities in current reproducibility standards. We develop reproducibility assessment metrics we called the Repo Metrics to provide a novel overall assessment of the re-executability potential of the Machine Learning Pipeline, called the ReproScore. We used two approaches to the automatic identification of reproducibility metrics, keywords and LLM tools, and found the reproducibility metric evaluation performance of Large Language Model (LLM) tools superior to keyword associations.


# Notes

%% begin notes %%
-----------------------
%% end notes %%

# Annotations

%% begin annotations %%
- <mark class="hltr-purple">We adapt the following three criteria from prior work [15]: (1) README file presence:</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=6PZJVP8Y)
- <mark class="hltr-purple">(2) Wrapper scripts:</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=D4FD7GNU)
- <mark class="hltr-purple">(3) Software dependencies:</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=E6V5IGGC)
- <mark class="hltr-green">Reproscreener’s architecture has 3 key stages: Ingest, Evaluate and Report. First, it analyzes the preprint’s TEX files and repository contents, including README files and filenames. Next, keyword searches based on predefined criteria are performed on the parsed data which return a set of pass and fail results used to generate a reproducibility score. Along with these scores, Reproscreener5 provides a table highlighting areas for improvement.</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=ZFFVD2VA)
- <mark class="hltr-green">There are three evaluations:  (1) The 9 selected Gunderson metrics on the full text of the preprint. (2) The 9 selected Gunderson metrics on the abstract. (3) The 6 Repo Metrics on the code repositories.</mark> [Page 105](zotero://open-pdf/library/items/PN6G5V8A?page=4&annotation=4QS5BUN5)
- <mark class="hltr-red">We find that researchers often include the experimental setup (74%) and dataset (62%) but details such as the problem, objective, hypothesis, and research questions were mentioned less frequently. ReproScreener performed best on the ‘Code Available’ metric (82%) which is not surprising since this metric is assessed by extracting and parsing URLs from the preprint as opposed to keyword searches for other metrics, where the latter is more likely to have false positives.</mark> [Page 107](zotero://open-pdf/library/items/PN6G5V8A?page=6&annotation=5QGJXVA5)
%% end annotations %%

## Related Literature

```dataview
TABLE year, venue, status
FROM "references/"
WHERE contains(file.inlinks, this.file.link) OR contains(file.outlinks, this.file.link)
```

## Annotation Color Key

| Highlight Color | Meaning              |
| --------------- | -------------------- |
| Red             | Important            |
| Purple          | General              |
| Green           | Potential References |
| Orange          | Applications         |
| Yellow          | Technical Details    |
| Blue            | Personal Insights    |

## Formatted bibliography

[1]

Adhithya Bhaskar, Victoria Stodden, "Reproscreener: Leveraging LLMs for Assessing Computational Reproducibility of Machine Learning Pipelines," Unknown, pp. 101–109, July 11, 2024, doi: 10.1145/3641525.3663629.

%% Import Date: 2025-12-26T19:27:41.672+03:00 %%
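
A note like the example above is rendered from a Jinja2 template. A simplified sketch of what such a template could look like — the variable names (`item.title`, `item.citekey`, and so on) are hypothetical and may not match ZotMD's actual template context:

```jinja
---
title: "{{ item.title }}"
citekey: "{{ item.citekey }}"
itemType: {{ item.item_type }}
year: {{ item.year }}
authors:
{% for author in item.authors %}  - "{{ author }}"
{% endfor %}---

# @{{ item.citekey }}

%% begin annotations %%
{% for a in item.annotations %}- {{ a.text }} [Page {{ a.page }}]({{ a.link }})
{% endfor %}%% end annotations %%
```

Because the output is plain Markdown with YAML frontmatter, the same template mechanism works for Obsidian, Logseq, or any other Markdown-based app.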


## License

MIT License - see [LICENSE](LICENSE) for details.

## Contributing

Contributions welcome! Please see the [documentation](https://adbX.github.io/zotmd/) for development setup.

## Support

- 📝 [Report Issues](https://github.com/adbX/zotmd/issues)
- 💬 [Discussions](https://github.com/adbX/zotmd/discussions)
- 📖 [Documentation](https://adbX.github.io/zotmd/)

