Intelligent academic paper and media renaming tool with multi-source metadata extraction

Project description

CiteWright

Anybody else have a huge folder full of files with names like 235680_download.PDF and smith_et_al_2008_full.pdf(2)?

... yeah.

I wrote this because I got tired of mass-downloading papers from Sci-Hub and then staring at a folder of cryptic filenames, wondering which one was the paper about transformer attention mechanisms and which one was about soil bacteria. Life's too short.

What It Does

Extracts text from documents and queries arXiv, Semantic Scholar, Crossref, PubMed, Open Library, and Unpaywall to find the actual source
Renames files to Author_Year_Title.ext like a civilized person
Handles PDF, TXT, Markdown, DOC/DOCX, and Python files - throw it at it, let's find out
Maintains a BibTeX database so you don't have to
Logs everything, doesn't break anything, asks before doing anything destructive
Optionally uses a local LLM (Ollama) or cloud providers (OpenAI, Anthropic, Gemini) if the free APIs come up empty
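
To make the Author_Year_Title.ext convention concrete, here is a minimal sketch of how such a name could be assembled from metadata. The `build_filename` helper and its `max_title_words` parameter are hypothetical and only illustrate the idea; CiteWright's own rules for truncation, accents, and multiple authors may well differ.

```python
import re
import unicodedata


def build_filename(author: str, year: int, title: str, ext: str,
                   max_title_words: int = 8) -> str:
    """Return an Author_Year_Title.ext style name (hypothetical helper)."""

    def clean(text: str) -> str:
        # Fold accents to plain ASCII, then keep only runs of letters and digits.
        folded = unicodedata.normalize("NFKD", text)
        ascii_text = folded.encode("ascii", "ignore").decode("ascii")
        return "_".join(re.findall(r"[A-Za-z0-9]+", ascii_text))

    # Accept both "Family, Given" and "Given Family" forms.
    family = author.split(",")[0].split()
    surname = clean(family[-1]) if family else "Unknown"

    # Cap the title length so filenames stay manageable.
    words = clean(title).split("_")[:max_title_words]
    title_part = "_".join(words) or "Untitled"

    suffix = ext.lower() if ext.startswith(".") else "." + ext.lower()
    return f"{surname}_{year}_{title_part}{suffix}"


print(build_filename("Ashish Vaswani", 2017, "Attention Is All You Need", ".pdf"))
# Vaswani_2017_Attention_Is_All_You_Need.pdf
```

Stripping everything except letters and digits keeps the result safe on most filesystems, at the cost of losing punctuation from the title.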

The Philosophy

I built this with a "try the free stuff first" approach. Why pay for API calls when Crossref is right there?

Tier	What Happens
1	Check if the PDF already has metadata embedded. Usually garbage, but sometimes you get lucky.
2	Extract DOIs, arXiv IDs, ISBNs from the text and look them up. This is where the magic happens.
3	Search academic APIs using whatever title/author text it can scrape. Works more often than you'd think.
4	(Optional) Throw the text at an LLM and ask nicely. Costs money unless you're running Ollama locally.
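
Tier 2 is easy to picture with a couple of regular expressions. The sketch below pulls DOIs and arXiv IDs out of extracted text; the patterns and the `find_identifiers` name are illustrative assumptions, not CiteWright's actual implementation, and ISBN handling is left out for brevity.

```python
import re

# Illustrative patterns (assumptions). Real-world DOIs and arXiv IDs
# have more edge cases than these cover, e.g. old-style arXiv IDs.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+", re.IGNORECASE)
ARXIV_RE = re.compile(r"arXiv:\s*(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE)


def find_identifiers(text: str) -> dict[str, list[str]]:
    """Collect DOI and arXiv ID candidates from a blob of document text."""
    # Trailing punctuation usually belongs to the sentence, not the DOI.
    dois = [match.rstrip(".,;)") for match in DOI_RE.findall(text)]
    arxiv_ids = [m.group(1) for m in ARXIV_RE.finditer(text)]
    return {"doi": dois, "arxiv": arxiv_ids}


sample = "See https://doi.org/10.1038/nature14539. Also arXiv:1706.03762v5 [cs.CL]"
print(find_identifiers(sample))
# {'doi': ['10.1038/nature14539'], 'arxiv': ['1706.03762']}
```

An identifier found this way can be looked up directly, which tends to be far more reliable than a fuzzy title search.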

Installation

git clone https://github.com/lukeslp/citewright.git
cd citewright
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install .

Want the LLM-powered features and media processing?

pip install ".[all]"

Usage

Preview what would happen (dry run, safe):

citewright pdf ~/papers

Actually rename things:

citewright pdf ~/papers --execute

Go recursive and spit out a BibTeX file:

citewright pdf ~/papers -r --execute --bibtex library.bib

Let the LLM analyze the stubborn ones:

citewright pdf ~/papers --ai --execute

Rename photos and videos too (uses EXIF data):

citewright media ~/photos --execute

Use vision models to describe images:

citewright media ~/photos --ai --execute

Oh no go back:

citewright undo

Configuration

Config lives at ~/.config/citewright/config.json, or use the CLI:

citewright config --show
citewright config --ai-provider openai  # Select LLM provider
citewright config --ai-enabled
citewright config --unpaywall-email "you@example.com"

The Unpaywall email is optional but they appreciate it. Be cool.
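
Some background on why the email exists at all: the public Unpaywall REST API takes the caller's email address as a query parameter on each request. A minimal sketch of building such a request URL is below; the `unpaywall_url` helper is hypothetical, and how CiteWright actually issues the request is not shown here.

```python
from urllib.parse import quote, urlencode

UNPAYWALL_BASE = "https://api.unpaywall.org/v2/"


def unpaywall_url(doi: str, email: str) -> str:
    """Build the lookup URL for one DOI (hypothetical helper)."""
    # Keep "/" unescaped so the DOI stays readable in the path.
    path = quote(doi, safe="/")
    return f"{UNPAYWALL_BASE}{path}?{urlencode({'email': email})}"


print(unpaywall_url("10.1038/nature14539", "you@example.com"))
# https://api.unpaywall.org/v2/10.1038/nature14539?email=you%40example.com
```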

License

MIT. Do whatever.

Author

Luke Steuber
https://github.com/lukeslp
luke@actuallyuseful.ai

Project details

Release history

0.1.1 (this version), Feb 9, 2026
0.1.0, Jan 12, 2026

Download files

Source Distribution

citewright-0.1.1.tar.gz (26.8 kB), uploaded Feb 9, 2026

Built Distribution

citewright-0.1.1-py3-none-any.whl (33.1 kB), uploaded Feb 9, 2026, Python 3

File details

Details for the file citewright-0.1.1.tar.gz.

File metadata

Download URL: citewright-0.1.1.tar.gz
Upload date: Feb 9, 2026
Size: 26.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for citewright-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`fb27c3e77573d3736e4185e0c1be9969d8aacaea5248698105c3e36277022ef7`
MD5	`0f93c6f307402fa83c470fb893769b4b`
BLAKE2b-256	`d6c3efaee32f3872ed22ce158746a5a9ba7b29f0b3901e13560971a9b9947d8e`

File details

Details for the file citewright-0.1.1-py3-none-any.whl.

File metadata

Download URL: citewright-0.1.1-py3-none-any.whl
Upload date: Feb 9, 2026
Size: 33.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for citewright-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0f10220ed226cb550bb3d245cfb576af1e4cd60e0c3f4659d4bbf63068ae8462`
MD5	`24a3397b432c3936757bf51a0399e7cf`
BLAKE2b-256	`1038c738280fe14ff839760cf8e8e2cc74ff1ba14d1100261ef9330917622f47`
