antifile

Extract BibTeX metadata from PDFs, EPUBs, URLs, and identifiers (DOI, arXiv, ISBN) and append it to a .bib file.

Install

pip install antifile
# or
uv tool install antifile

Usage

antifile INPUT -o refs.bib

INPUT can be:

  • a PDF or EPUB file: antifile paper.pdf -o refs.bib
  • a folder of PDFs/EPUBs: antifile ~/Downloads/papers -o refs.bib (add --recursive to descend into subfolders)
  • a URL: antifile https://example.com/article -o refs.bib
  • a DOI: antifile 10.1145/3292500 -o refs.bib
  • an arXiv ID: antifile arXiv:1706.03762 -o refs.bib
  • an ISBN: antifile 9780262033848 -o refs.bib
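
How a single INPUT argument could be told apart into these six kinds is sketched below. This is an illustration only, not antifile's actual detection logic: the function name classify_input, the regular expressions, and the order of the checks are all assumptions.

```python
import re
from pathlib import Path

# Illustrative patterns (assumptions, not antifile's real rules).
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_RE = re.compile(r"^(?:arxiv:)?\d{4}\.\d{4,5}(?:v\d+)?$", re.IGNORECASE)
ISBN_RE = re.compile(r"\d{13}|\d{9}[\dXx]")


def classify_input(value: str) -> str:
    """Label an INPUT string as url, doi, arxiv, isbn, folder, file, or unknown.

    Hypothetical helper: identifiers are checked before filesystem paths.
    """
    text = value.strip()
    if re.match(r"^https?://", text, re.IGNORECASE):
        return "url"
    if DOI_RE.match(text):
        return "doi"
    if ARXIV_RE.match(text):
        return "arxiv"
    # ISBNs are often written with hyphens or spaces; strip them first.
    digits = text.replace("-", "").replace(" ", "")
    if ISBN_RE.fullmatch(digits):
        return "isbn"
    path = Path(text).expanduser()
    if path.is_dir():
        return "folder"
    if path.suffix.lower() in {".pdf", ".epub"}:
        return "file"
    return "unknown"
```

Checking identifiers before paths keeps a DOI such as 10.1145/3292500, which contains a slash, from being mistaken for a relative file path.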

Entries are appended with de-duplication. A new entry that matches an existing one (by DOI, arXiv ID, ISBN, or normalized title+author) fills in the existing entry's missing fields instead of creating a duplicate. Pass --no-merge to skip the new entry when a duplicate is found, or --force to append it anyway under an auto-suffixed key.
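
The merge behaviour can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: entries are plain dicts with a "key" field, the arXiv ID is stored under "eprint", normalization means lowercasing and dropping non-alphanumerics, --force takes precedence over --no-merge, and the suffix format ("-2", "-3", ...) is invented. antifile's real matching and key scheme may differ.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]+", "", text.lower())


def same_work(a: dict, b: dict) -> bool:
    """True if two entries share an identifier or a normalized title+author."""
    for field in ("doi", "eprint", "isbn"):
        if a.get(field) and b.get(field) and a[field].lower() == b[field].lower():
            return True
    title_a, author_a = normalize(a.get("title", "")), normalize(a.get("author", ""))
    title_b, author_b = normalize(b.get("title", "")), normalize(b.get("author", ""))
    return bool(title_a and author_a) and (title_a, author_a) == (title_b, author_b)


def unique_key(key: str, taken: set) -> str:
    """Append -2, -3, ... until the key is unused (hypothetical suffix format)."""
    candidate, n = key, 2
    while candidate in taken:
        candidate = f"{key}-{n}"
        n += 1
    return candidate


def add_entry(library: list, new: dict, merge: bool = True, force: bool = False) -> str:
    """Add `new` to `library`; return 'appended', 'merged', or 'skipped'."""
    match = next((entry for entry in library if same_work(entry, new)), None)
    if match is None or force:
        entry = dict(new)
        entry["key"] = unique_key(entry["key"], {e["key"] for e in library})
        library.append(entry)
        return "appended"
    if not merge:  # corresponds to --no-merge
        return "skipped"
    for field, value in new.items():
        # Only fill fields the existing entry lacks; never overwrite.
        if field != "key" and not match.get(field):
            match[field] = value
    return "merged"
```

With this sketch, adding an entry whose DOI matches an existing one and which carries an extra year field leaves the library at the same length, with the year copied into the existing entry.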

Options

Flag                                                            Effect
-o, --output FILE                                               target .bib (required; created if missing)
--method {auto,doi,arxiv,isbn,crossref,llm,claude-code,codex}   force a PDF extraction method (default: auto)
--recursive                                                     recurse into subfolders for folder input
--no-preview                                                    skip the first-page PDF preview
--no-merge                                                      on duplicate, skip instead of filling missing fields
--force                                                         append even if a duplicate exists
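
For readers who want to script around these options, the flags above could be declared with Python's standard argparse module roughly as follows. This mirrors the documented flags only; it is not antifile's actual parser, and build_parser is a hypothetical name.

```python
import argparse

# Method names copied from the --method choices listed above.
METHODS = ["auto", "doi", "arxiv", "isbn", "crossref", "llm", "claude-code", "codex"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="antifile")
    parser.add_argument("input", metavar="INPUT",
                        help="file, folder, URL, DOI, arXiv ID, or ISBN")
    parser.add_argument("-o", "--output", metavar="FILE", required=True,
                        help="target .bib file")
    parser.add_argument("--method", choices=METHODS, default="auto",
                        help="force a PDF extraction method")
    parser.add_argument("--recursive", action="store_true",
                        help="recurse into subfolders for folder input")
    parser.add_argument("--no-preview", action="store_true",
                        help="skip the first-page PDF preview")
    parser.add_argument("--no-merge", action="store_true",
                        help="on duplicate, skip instead of filling fields")
    parser.add_argument("--force", action="store_true",
                        help="append even if a duplicate exists")
    return parser


args = build_parser().parse_args(["paper.pdf", "-o", "refs.bib", "--no-merge"])
```

argparse exposes hyphenated flags with underscores, so --no-merge is read as args.no_merge.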

LLM-assisted extraction

When a PDF has no resolvable identifier, antifile falls back to an LLM to read the first page. Set whichever API key you have; antifile picks it up from the environment:

export ANTHROPIC_API_KEY=...   # or OPENAI_API_KEY, or GEMINI_API_KEY / GOOGLE_API_KEY

With no API key set, antifile falls back to a CLI agent if one is installed — first the claude CLI, then the codex CLI. Each tier is skipped if its binary isn't on your PATH. Full chain:

DOI → arXiv → ISBN → CrossRef title → LLM API → claude CLI → codex CLI → none
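
The chain reads as "try each tier in order and stop at the first one that yields an entry". A minimal sketch of that pattern follows, assuming each tier is a function that returns a dict or None. The names available_llm_tiers and first_hit, and the tier labels, are hypothetical; only the environment variable names and the claude and codex binary names come from the text above.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

API_KEY_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GEMINI_API_KEY", "GOOGLE_API_KEY")
CLI_AGENTS = ("claude", "codex")  # tried in this order


def available_llm_tiers(
    env: Optional[Mapping[str, str]] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list:
    """List the LLM tiers usable on this machine, in fallback order."""
    env = os.environ if env is None else env
    tiers = []
    if any(env.get(name) for name in API_KEY_VARS):
        tiers.append("llm-api")
    # A CLI tier is skipped when its binary is not found on PATH.
    tiers.extend(f"{binary}-cli" for binary in CLI_AGENTS if which(binary))
    return tiers


def first_hit(tiers, pdf_path: str):
    """Run (name, resolver) pairs in order; return the first non-None result."""
    for name, resolver in tiers:
        entry = resolver(pdf_path)
        if entry is not None:
            return name, entry
    return "none", None
```

Passing env and which as parameters is a choice made here so the sketch can be exercised without touching the real environment or PATH.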

Related

  • antilibrary — manage BibTeX libraries from the terminal (can call antifile via --add-from-files).
  • antifind — online metadata search → BibTeX.

License

MIT
