# ZotMD
Sync your Zotero library to Markdown files with automatic updates and PDF annotation extraction.
Perfect for use with Obsidian, Logseq, or any Markdown-based note-taking app.
## Features
- 📚 Smart Sync: Incremental sync only updates changed items
- 📝 PDF Annotations: Automatically extracts highlights and notes
- 🎨 Customizable Templates: Use Jinja2 templates to format your notes
- 🔑 Citation Keys: Uses Better BibTeX for consistent filenames
- 💾 User Notes: Preserves your custom notes across syncs
- ⚙️ Configurable: Simple TOML configuration
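To illustrate what preserving user notes across syncs can involve, here is a minimal sketch. It assumes the user-editable region is delimited by marker lines such as the `%% begin notes %%` and `%% end notes %%` pair that appears in the example output. The helper name `carry_over_notes` is hypothetical and this is not taken from ZotMD's source.

```python
import re

# Assumed marker lines, copied from the example note; a custom template may differ.
BEGIN = "%% begin notes %%"
END = "%% end notes %%"

# Match everything between the two markers, including newlines.
_NOTES_RE = re.compile(re.escape(BEGIN) + r"(.*?)" + re.escape(END), re.DOTALL)


def carry_over_notes(existing: str, rendered: str) -> str:
    """Return `rendered` with its notes block replaced by the one in `existing`.

    Hypothetical helper: if either text lacks the marker pair, `rendered`
    is returned unchanged.
    """
    old = _NOTES_RE.search(existing)
    if old is None or _NOTES_RE.search(rendered) is None:
        return rendered
    kept = BEGIN + old.group(1) + END
    # A function replacement avoids backslash-escape handling in the kept text.
    return _NOTES_RE.sub(lambda _m: kept, rendered, count=1)


existing_note = "# Notes\n%% begin notes %%\nMy own summary.\n%% end notes %%\n"
fresh_render = "# Notes\n%% begin notes %%\n\n%% end notes %%\n# Annotations\n"
merged = carry_over_notes(existing_note, fresh_render)
print(merged)
```

Everything outside the markers comes from the fresh render, so metadata and annotations can be regenerated freely while the text between the markers survives.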
## Quick Start
```bash
# Install with uv (https://docs.astral.sh/uv/)
uv tool install zotmd

# Set up configuration
zotmd init

# Sync your library
zotmd sync
```
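The sync step is described as incremental: only changed items are updated. One common way to achieve this is to remember a version number per item and compare it with the remote one on the next run. The sketch below illustrates that idea only; `plan_sync`, the item keys, and the version numbers are hypothetical, and ZotMD's actual bookkeeping may work differently.

```python
def plan_sync(local: dict[str, int], remote: dict[str, int]) -> dict[str, list[str]]:
    """Classify item keys by comparing stored and remote version numbers."""
    new = sorted(k for k in remote if k not in local)
    changed = sorted(k for k in remote if k in local and remote[k] > local[k])
    removed = sorted(k for k in local if k not in remote)
    return {"new": new, "changed": changed, "removed": removed}


# Hypothetical item keys and version numbers, for illustration only.
local_versions = {"AAAA1111": 10, "BBBB2222": 12, "CCCC3333": 7}
remote_versions = {"AAAA1111": 10, "BBBB2222": 15, "DDDD4444": 16}

plan = plan_sync(local_versions, remote_versions)
print(plan)
# {'new': ['DDDD4444'], 'changed': ['BBBB2222'], 'removed': ['CCCC3333']}
```

Items whose version is unchanged (`AAAA1111` here) appear in none of the lists, so their Markdown files would not need to be rewritten.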
## Requirements
- Python 3.13+
- Better BibTeX (Zotero plugin)
- Zotero API access
## Documentation
See the [documentation](https://adbX.github.io/zotmd/).
## Example Output
---
title: "Reproscreener: Leveraging LLMs for Assessing Computational Reproducibility of Machine Learning Pipelines"
citekey: "bhaskarReproscreenerLeveragingLLMs2024"
itemType: conferencePaper
venue: "Association for Computing Machinery"
year: 2024
dateAdded: 2025-02-08
authors:
- "Adhithya Bhaskar"
- "Victoria Stodden"
status: unread
tags:
- automated-checks
- metrics
- tools/LLM
- references
links:
- "zotero://select/library/items/RUNIG8WJ"
- "https://doi.org/10.1145/3641525.3663629"
- "https://doi.org/10.1145/3641525.3663629"
aliases:
- "Reproscreener Leveraging LLMs for Assessing Computational Reproducibility of Machine Learning Pipelines"
- "bhaskarReproscreenerLeveragingLLMs2024"
---
# @bhaskarReproscreenerLeveragingLLMs2024
> [!abstract]-
> The increasing reliance on machine learning models in scientific research and day-to-day applications – and the near-opacity of their associated computational methods – creates a widely recognized need to enable others to verify results coming from Machine Learning Pipelines. In this work we use an empirical approach to build on efforts to define and deploy structured publication standards that allow machine learning research to be automatically assessed and verified, enabling greater reliability and trust in results. To automate the assessment of a set of publication standards for Machine Learning Pipelines we developed Reproscreener; a novel, open-source software tool (see https://reproscreener.org/). We benchmark Reproscreener’s automatic reproducibility assessment against a novel manually labeled “gold standard” dataset of machine learning arXiv preprints. Our empirical evaluation has a dual goal: to assess Reproscreener’s performance; and to uncover gaps and opportunities in current reproducibility standards. We develop reproducibility assessment metrics we called the Repo Metrics to provide a novel overall assessment of the re-executability potential of the Machine Learning Pipeline, called the ReproScore. We used two approaches to the automatic identification of reproducibility metrics, keywords and LLM tools, and found the reproducibility metric evaluation performance of Large Language Model (LLM) tools superior to keyword associations.
# Notes
%% begin notes %%
-----------------------
%% end notes %%
# Annotations
%% begin annotations %%
- <mark class="hltr-purple">We adapt the following three criteria from prior work [15]: (1) README file presence:</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=6PZJVP8Y)
- <mark class="hltr-purple">(2) Wrapper scripts:</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=D4FD7GNU)
- <mark class="hltr-purple">(3) Software dependencies:</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=E6V5IGGC)
- <mark class="hltr-green">Reproscreener’s architecture has 3 key stages: Ingest, Evaluate and Report. First, it analyzes the preprint’s TEX files and repository contents, including README files and filenames. Next, keyword searches based on predefined criteria are performed on the parsed data which return a set of pass and fail results used to generate a reproducibility score. Along with these scores, Reproscreener5 provides a table highlighting areas for improvement.</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=ZFFVD2VA)
- <mark class="hltr-green">There are three evaluations: (1) The 9 selected Gunderson metrics on the full text of the preprint. (2) The 9 selected Gunderson metrics on the abstract. (3) The 6 Repo Metrics on the code repositories.</mark> [Page 105](zotero://open-pdf/library/items/PN6G5V8A?page=4&annotation=4QS5BUN5)
- <mark class="hltr-red">We find that researchers often include the experimental setup (74%) and dataset (62%) but details such as the problem, objective, hypothesis, and research questions were mentioned less frequently. ReproScreener performed best on the ‘Code Available’ metric (82%) which is not surprising since this metric is assessed by extracting and parsing URLs from the preprint as opposed to keyword searches for other metrics, where the latter is more likely to have false positives.</mark> [Page 107](zotero://open-pdf/library/items/PN6G5V8A?page=6&annotation=5QGJXVA5)
%% end annotations %%
## Related Literature
```dataview
TABLE year, venue, status
FROM "references/"
WHERE contains(file.inlinks, this.file.link) OR contains(file.outlinks, this.file.link)
```
## Annotation Color Key
| Highlight Color | Meaning |
|---|---|
| Red | Important |
| Purple | General |
| Green | Potential References |
| Orange | Applications |
| Yellow | Technical Details |
| Blue | Personal Insights |
## Formatted bibliography
[1] Adhithya Bhaskar, Victoria Stodden, "Reproscreener: Leveraging LLMs for Assessing Computational Reproducibility of Machine Learning Pipelines," Unknown, pp. 101–109, July 11, 2024, doi: 10.1145/3641525.3663629.
%% Import Date: 2025-12-26T19:27:41.672+03:00 %%
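Because the annotation lines in the example note follow a regular shape, they can be post-processed with a short script. The sketch below parses them and tallies highlights by the meanings in the Annotation Color Key. The regular expression is inferred from the example lines only, and `parse_annotations` is a hypothetical helper, so a customized template may need a different pattern.

```python
import re
from collections import Counter

# Meanings copied from the Annotation Color Key table in the example note.
COLOR_KEY = {
    "red": "Important",
    "purple": "General",
    "green": "Potential References",
    "orange": "Applications",
    "yellow": "Technical Details",
    "blue": "Personal Insights",
}

# Pattern inferred from the example annotation lines; other templates may differ.
ANNOTATION_RE = re.compile(
    r'^- <mark class="hltr-(?P<color>\w+)">(?P<text>.*?)</mark> '
    r"\[Page (?P<page>[^\]]+)\]\((?P<link>[^)]+)\)$"
)


def parse_annotations(note: str) -> list[dict[str, str]]:
    """Return one dict per annotation line found in `note`."""
    found = []
    for line in note.splitlines():
        match = ANNOTATION_RE.match(line.strip())
        if match:
            found.append(match.groupdict())
    return found


sample = """\
%% begin annotations %%
- <mark class="hltr-purple">(2) Wrapper scripts:</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=D4FD7GNU)
- <mark class="hltr-purple">(3) Software dependencies:</mark> [Page 103](zotero://open-pdf/library/items/PN6G5V8A?page=2&annotation=E6V5IGGC)
%% end annotations %%
"""

annotations = parse_annotations(sample)
counts = Counter(COLOR_KEY.get(a["color"], "Unknown") for a in annotations)
print(counts)
```

Each parsed entry keeps the `zotero://` link, so a summary built this way can still jump back to the highlighted passage in the PDF.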
## License
MIT License - see [LICENSE](LICENSE) for details.
## Contributing
Contributions welcome! Please see the [documentation](https://adbX.github.io/zotmd/) for development setup.
## Support
- 📝 [Report Issues](https://github.com/adbX/zotmd/issues)
- 💬 [Discussions](https://github.com/adbX/zotmd/discussions)
- 📖 [Documentation](https://adbX.github.io/zotmd/)