RefCatch
A Python package for processing academic references from plaintext files. RefCatch extracts references from text files (markdown, txt, etc.), attempts to find their DOIs using the CrossRef API, and outputs the results.
Installation
```shell
pip install refcatch
```
Or install directly from the repository:
```shell
git clone https://github.com/AhsanKhodami/refcatch.git
cd refcatch
pip install -e .
```
Publishing to PyPI
To publish RefCatch to PyPI:
```shell
# Install build and twine
pip install build twine

# Build the package
python -m build

# Upload to PyPI (use --repository-url https://test.pypi.org/legacy/ for TestPyPI)
python -m twine upload dist/*
```
Usage
As a Python Package
```python
from refcatch import refcatch

# Basic usage
refcatch("path/to/references.md", "path/to/output.md")

# With all options
refcatch(
    input_file="path/to/references.md",
    output_file="path/to/output.md",
    doi_file="path/to/dois.txt",  # Optional, will be auto-generated if not provided
    log=True,  # Set to False to disable logging
)
```
Command Line Interface
```shell
# Basic usage
refcatch references.md

# Specify output file
refcatch references.md -o output.md

# Specify DOI file and run silently
refcatch references.md -o output.md -d dois.txt --silent
```
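To illustrate how the command-line flags shown above could map onto the keyword arguments of the Python function, here is a minimal sketch using the standard `argparse` module. It is not RefCatch's actual CLI code: the helper names `build_parser` and `to_kwargs` are hypothetical, the defaults of `None` are placeholders, and the mapping of `--silent` to `log=False` is an assumption based on the usage examples.

```python
import argparse


def build_parser():
    """Build a parser that accepts the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(prog="refcatch")
    parser.add_argument("input_file", help="plaintext file containing the references")
    # Only the short flags appear in the README, so no long forms are defined here.
    parser.add_argument("-o", dest="output_file", default=None, help="output file path")
    parser.add_argument("-d", dest="doi_file", default=None, help="file that receives the DOIs")
    parser.add_argument("--silent", action="store_true", help="disable logging")
    return parser


def to_kwargs(argv):
    """Translate command-line arguments into keyword arguments (assumed mapping)."""
    args = build_parser().parse_args(argv)
    return {
        "input_file": args.input_file,
        "output_file": args.output_file,
        "doi_file": args.doi_file,
        "log": not args.silent,  # assumption: --silent corresponds to log=False
    }
```

Under these assumptions, `to_kwargs(["references.md", "-o", "output.md", "--silent"])` produces a dictionary that could be passed on as `refcatch(**kwargs)`.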
Example
Input file (references.md):
1. Wong WL, Su X, Li X, et al. Global prevalence of age-related macular degeneration and disease burden projection for 2020 and 2040: a systematic review and meta-analysis. Lancet Glob Health. 2014;2(2):e106-116.
2. Flaxman SR, Bourne RRA, Resnikoff S, et al. Global causes of blindness and distance vision impairment 1990-2020: a systematic review and meta-analysis. Lancet Glob Health. 2017;5(12):e1221-e1234.
Output file:
1. Wong WL, Su X, Li X, et al. Global prevalence of age-related macular degeneration and disease burden projection for 2020 and 2040: a systematic review and meta-analysis. Lancet Glob Health. 2014;2(2):e106-116.
DOI: 10.1016/S2214-109X(13)70145-1
2. Flaxman SR, Bourne RRA, Resnikoff S, et al. Global causes of blindness and distance vision impairment 1990-2020: a systematic review and meta-analysis. Lancet Glob Health. 2017;5(12):e1221-e1234.
DOI: 10.1016/S2214-109X(17)30393-5
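The transformation in this example can be sketched in a few lines of Python. This is only an illustration of the input and output layout shown above, not RefCatch's own parser: the names `split_references` and `format_output` are hypothetical, and the `not found` placeholder for a missing DOI is an assumption.

```python
import re

# A numbered entry looks like "1. Wong WL, Su X, ..." at the start of a line.
_NUMBERED = re.compile(r"^\s*(\d+)\.\s+(.*\S)\s*$")


def split_references(text):
    """Collect numbered references, joining wrapped continuation lines."""
    refs = []
    for line in text.splitlines():
        match = _NUMBERED.match(line)
        if match:
            refs.append(match.group(2))
        elif refs and line.strip():
            # A non-empty, unnumbered line is treated as part of the previous entry.
            refs[-1] += " " + line.strip()
    return refs


def format_output(refs, dois):
    """Write each reference followed by a 'DOI:' line, as in the output example."""
    lines = []
    for number, (ref, doi) in enumerate(zip(refs, dois), start=1):
        lines.append(f"{number}. {ref}")
        lines.append(f"DOI: {doi if doi else 'not found'}")  # placeholder is an assumption
    return "\n".join(lines)
```

One limitation of this sketch: a wrapped line that happens to begin with a number followed by a period and a space would be mistaken for a new entry.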
Features
- Extracts references from plaintext files
- Makes multiple attempts to find DOIs with different search strategies
- Outputs references with their DOIs
- Saves DOIs to a separate file
- Optional logging of the process
- Simple command-line interface
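The "multiple attempts" feature can be pictured as a loop over progressively looser queries against the CrossRef REST API. The sketch below is an assumption about how such a lookup might work, not RefCatch's actual search strategies: the two queries (full reference first, then the longest sentence-like segment, which is usually the title) and the helper names `candidate_queries`, `build_url`, and `find_doi` are hypothetical. The `/works` endpoint and its `query.bibliographic` and `rows` parameters are part of the public CrossRef API.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works"


def candidate_queries(reference):
    """Return query strings to try in order (hypothetical strategies)."""
    full = " ".join(reference.split())
    queries = [full]
    # The longest ". "-separated segment is usually the article title.
    longest = max(full.split(". "), key=len).strip(" .")
    if longest and longest != full:
        queries.append(longest)
    return queries


def build_url(query, rows=1):
    """Build a CrossRef /works URL for a free-text bibliographic query."""
    params = urllib.parse.urlencode({"query.bibliographic": query, "rows": rows})
    return f"{CROSSREF_WORKS}?{params}"


def _fetch_json(url):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def find_doi(reference, fetch=None):
    """Try each query in turn and return the first DOI found, or None."""
    fetch = fetch or _fetch_json
    for query in candidate_queries(reference):
        try:
            data = fetch(build_url(query))
        except OSError:
            continue  # network or HTTP error: move on to the next strategy
        items = data.get("message", {}).get("items", [])
        if items and items[0].get("DOI"):
            return items[0]["DOI"]
    return None
```

Taking the top hit without checking it can return the DOI of a different paper, so a real implementation would likely compare the returned title or year against the reference before accepting a match. The `fetch` parameter exists so the lookup can be exercised without network access.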
License
MIT
File details
Details for the file refcatch-0.1.0.tar.gz.
File metadata
- Download URL: refcatch-0.1.0.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 755a2dbddf7f777703433a74d1c87cc1c6bb59391b88d35867616f45aab898fa |
| MD5 | c9f412872bb7468a194dd33410dd1ff3 |
| BLAKE2b-256 | 2fdbde467df98ac02e9358067cd78aefc1b55bca4d7fc74b89d2b820e6737a24 |
File details
Details for the file refcatch-0.1.0-py3-none-any.whl.
File metadata
- Download URL: refcatch-0.1.0-py3-none-any.whl
- Upload date:
- Size: 7.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/4.0.2 CPython/3.11.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | af4bf05575215ced41b2d93014578bff75ad0504f3cf681206805ac1da9241d9 |
| MD5 | a3d66673b43450e8cb5851f502437961 |
| BLAKE2b-256 | 94373447653d1b8ac1977d46d3b5790871849e3e7281beb79a146cee6e2934e7 |