BibTeX parser for Python
citerra
BibTeX parser for Python.
citerra parses, validates, edits, and writes BibTeX documents. It supports
strict parsing by default, opt-in tolerant recovery, diagnostics with source
locations, raw-text retention, source-preserving writes, name/date/identifier
helpers, and plain-record projection for application code.
The package is distributed as ABI3 wheels for Python 3.8 and newer.
Performance Snapshot
Measured on tests/fixtures/tugboat.bib: 2,701,551 bytes, 73,993 lines, and
3,644 entries. Hardware was AMD Ryzen 5 5600G, 6 cores / 12 threads. Measured
on 2026-05-13 with Python 3.11.14; throughput is input-size normalized.
The comparison used citerra 0.2.2, bibtexparser 1.4.4,
bibtexparser 2.0.0b9, and pybtex 0.26.1.
citerra structured parse disables source capture and raw preservation for the
closest parser-output comparison. citerra source-preserving parse includes
raw source text, source locations, diagnostics, and source-order blocks.
| Python parser / mode | Version | Median parse time | Throughput | Relative time |
|---|---|---|---|---|
| citerra structured parse | 0.2.2 | 0.058 s | 44.3 MiB/s | 1.0x |
| citerra source-preserving parse | 0.2.2 | 0.065 s | 39.9 MiB/s | 1.1x |
| bibtexparser parse | 2.0.0b9 | 0.372 s | 6.9 MiB/s | 6.4x |
| pybtex parse | 0.26.1 | 0.859 s | 3.0 MiB/s | 14.8x |
| bibtexparser parse | 1.4.4 | 10.483 s | 0.2 MiB/s | 180.1x |
| Python writer / mode | Version | Median write time | Throughput | Relative time |
|---|---|---|---|---|
| citerra raw-preserving write | 0.2.2 | 0.003 s | 953.2 MiB/s | 1.0x |
| citerra normalized write | 0.2.2 | 0.014 s | 181.3 MiB/s | 5.3x |
| bibtexparser write | 1.4.4 | 0.106 s | 24.3 MiB/s | 39.2x |
| bibtexparser write | 2.0.0b9 | 0.493 s | 5.2 MiB/s | 182.2x |
| pybtex write | 0.26.1 | 3.942 s | 0.7 MiB/s | 1458.5x |
Reproduction commands are listed in Reproducing Benchmarks.
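The throughput and relative-time columns follow from the fixture size and the median times. The sketch below reproduces that arithmetic, assuming 1 MiB = 1,048,576 bytes and using the rounded medians from the parser table. The function names are illustrative and not part of citerra.

```python
FIXTURE_BYTES = 2_701_551  # size of tests/fixtures/tugboat.bib, as stated above
MIB = 1024 * 1024  # assumption: MiB means 2**20 bytes


def throughput_mib_s(seconds: float, size_bytes: int = FIXTURE_BYTES) -> float:
    """Input-size-normalized throughput in MiB/s."""
    return size_bytes / MIB / seconds


def relative_time(seconds: float, baseline_seconds: float) -> float:
    """How many times longer a run took than the baseline run."""
    return seconds / baseline_seconds


# Rounded median parse times copied from the parser table.
medians = {
    "citerra structured": 0.058,
    "bibtexparser 2.0.0b9": 0.372,
    "pybtex 0.26.1": 0.859,
}
baseline = medians["citerra structured"]
for name, seconds in medians.items():
    print(
        f"{name}: {throughput_mib_s(seconds):.1f} MiB/s, "
        f"{relative_time(seconds, baseline):.1f}x"
    )
```

Because the table medians are rounded to three decimals, recomputed values can differ from the published columns in the last digit.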
Install
pip install citerra
The distribution name and import name are both citerra:
import citerra
Parse
import citerra
document = citerra.parse(
    '@article{paper, author = "Jane Doe", title = "Example Paper", year = 2026}',
    expand_values=True,
)
entry = document.entry("paper")
assert entry is not None
assert entry.entry_type == "article"
assert entry.get("title") == "Example Paper"
assert entry.date_parts().year == 2026
File helpers are available:
from pathlib import Path
import citerra
document = citerra.parse_path("references.bib", tolerant=True)
Path("normalized.bib").write_text(citerra.dumps(document), encoding="utf-8")
File-like helpers are also available:
with open("references.bib", encoding="utf-8") as handle:
    document = citerra.load(handle, tolerant=True)
text = citerra.dumps(document)
Document Model
- Document contains entries, comments, preambles, string definitions, source-order blocks, diagnostics, and validation helpers.
- Entry exposes the citation key, entry type, fields, source text, semantic helpers, and field mutation methods.
- Field exposes the original field name, parsed value, optional raw source text, and optional source location.
- Value represents string literals, numbers, variables, and concatenations.
- Diagnostic reports parse or validation problems with stable codes and source locations when available.
Tolerant Parsing And Diagnostics
document = citerra.parse(
    text,
    tolerant=True,
    capture_source=True,
    preserve_raw=True,
    source="refs/main.bib",
)
if document.status != "ok":
    for diagnostic in document.diagnostics:
        span = diagnostic.source
        if span is None:
            print(diagnostic.code, diagnostic.message)
        else:
            print(diagnostic.code, span.line, span.column, diagnostic.message)
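For editor or CI integration it can help to render diagnostics in a compiler-style `path:line:column` form. The helper below is a sketch that relies only on the attributes used in the loop above (`code`, `message`, `source`, `line`, `column`). The function name, the output format, and the stand-in objects are assumptions made for illustration, not part of citerra.

```python
from types import SimpleNamespace


def format_diagnostic(diagnostic, path="<input>"):
    """Render one diagnostic as 'path[:line:column]: code: message'."""
    span = diagnostic.source
    if span is None:
        # No source location available: fall back to the path alone.
        return f"{path}: {diagnostic.code}: {diagnostic.message}"
    return f"{path}:{span.line}:{span.column}: {diagnostic.code}: {diagnostic.message}"


# Stand-in objects for illustration only. Real diagnostics would come from
# document.diagnostics; the codes and messages here are made up.
located = SimpleNamespace(
    code="E001",
    message="example message",
    source=SimpleNamespace(line=12, column=5),
)
unlocated = SimpleNamespace(code="W002", message="example warning", source=None)

for item in (located, unlocated):
    print(format_diagnostic(item, "refs/main.bib"))
```

The helper is duck-typed, so it should work with any object exposing those attributes.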
Raw Text And Source-Preserving Writes
document = citerra.parse(
    text,
    tolerant=True,
    capture_source=True,
    preserve_raw=True,
)
entry = document.entry("paper")
if entry is not None:
    print(entry.raw)
    print(entry.field("title").raw_value)
Use WriterConfig(preserve_raw=True) for low-churn output that reuses retained
source text where possible. Use WriterConfig(preserve_raw=False) for
normalized structured output.
document.rename_key("paper", "paper-v2")
document.set_field("paper-v2", "note", "accepted")
document.remove_export_fields(["abstract", "keywords"])
config = citerra.WriterConfig(
    preserve_raw=True,
    trailing_comma=True,
)
output = document.write(config)
Plain Records
Some application code wants ordinary dictionaries for filtering, indexing, or
bulk transforms. citerra provides explicit helpers for that shape without
changing the document model:
document = citerra.parse_path("references.bib")
records = citerra.document_to_dicts(document)
selected = [record for record in records if record.get("year") == "2026"]
text = citerra.write_entries(
    selected,
    field_order=["author", "title", "journal", "year", "doi"],
    sort_by=["ID"],
    trailing_comma=True,
)
Plain records use ENTRYTYPE and ID keys for the entry type and citation key.
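To make the record shape concrete, the sketch below filters and sorts hand-written dictionaries with the standard library only. The sample records, the helper names, and the assumption that field values are plain strings are illustrative. Only the ENTRYTYPE and ID keys come from the description above.

```python
# Hypothetical records in the documented shape: ENTRYTYPE and ID, plus
# field names mapped to values (string values are an assumption here).
records = [
    {"ENTRYTYPE": "article", "ID": "zeta2026", "title": "Z", "year": "2026"},
    {"ENTRYTYPE": "book", "ID": "alpha2025", "title": "A", "year": "2025"},
    {"ENTRYTYPE": "article", "ID": "beta2026", "title": "B", "year": "2026"},
]

RESERVED_KEYS = {"ENTRYTYPE", "ID"}


def select_by_year(items, year):
    """Return records whose 'year' field equals `year`, sorted by citation key."""
    matches = (record for record in items if record.get("year") == year)
    return sorted(matches, key=lambda record: record["ID"])


def field_names(record):
    """Return the BibTeX field names of a record, excluding the reserved keys."""
    return sorted(key for key in record if key not in RESERVED_KEYS)


for record in select_by_year(records, "2026"):
    print(record["ENTRYTYPE"], record["ID"], field_names(record))
```

Keeping the entry type and citation key under reserved names lets ordinary field lookups such as `record.get("year")` work without special cases.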
Helpers
assert citerra.normalize_doi("https://doi.org/10.1000/XYZ.") == "10.1000/xyz"
assert citerra.latex_to_unicode("Jos\\'e") == "José"
names = citerra.parse_names("Jane Doe and {Research Group}")
assert names[1].literal == "Research Group"
date = citerra.parse_date("2026-05-13")
assert (date.year, date.month, date.day) == (2026, 5, 13)
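As a rough picture of what the DOI assertion above implies, the sketch below is a standard-library approximation: it strips a resolver prefix, drops trailing periods, and lowercases. It is not citerra's implementation. The function name and the handling of `doi:` and `dx.doi.org` prefixes are assumptions, and the real helper may apply different rules.

```python
import re

# Assumed prefixes: http(s)://doi.org/, http(s)://dx.doi.org/, and "doi:".
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi_sketch(value: str) -> str:
    """Approximate DOI normalization: strip prefix, trailing periods, lowercase."""
    doi = _DOI_PREFIX.sub("", value.strip())
    return doi.rstrip(".").lower()


print(normalize_doi_sketch("https://doi.org/10.1000/XYZ."))
```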
Reproducing Benchmarks
The comparison script uses whichever optional packages are installed in the active environment:
python python/benchmarks/compare_parsers.py tests/fixtures/tugboat.bib
python python/benchmarks/compare_parsers.py tests/fixtures/tugboat.bib --write
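For readers who want to time a parser of their own, a median-of-repeats measurement can be sketched with the standard library as follows. This is not the project's `compare_parsers.py`. The function name, the repeat count, and the stand-in workload are assumptions.

```python
import statistics
import time
from typing import Callable, List


def median_seconds(run: Callable[[], object], repeats: int = 5) -> float:
    """Call `run` `repeats` times and return the median wall-clock duration."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in workload. In practice `run` would wrap a parser call on the
# fixture text, for example a lambda that parses a preloaded string.
text = "@article{paper, title = {Example}}\n" * 1000
elapsed = median_seconds(lambda: text.count("@"), repeats=5)
print(f"median: {elapsed:.6f} s")
```

The median is less sensitive than the mean to a single slow run caused by unrelated system activity.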
Implementation
citerra is implemented as a native extension. Wheels include the parser
engine, so ordinary Python installs do not require a Rust toolchain.
Local Build
Use the project manifest for local development:
guix shell -m manifest.scm -- maturin build --release --out target/wheels
For local tests without installing into the user environment, unpack the built
wheel into a temporary import directory and run pytest with that directory on
PYTHONPATH:
rm -rf target/python-test
python3 - <<'PY'
from pathlib import Path
from zipfile import ZipFile
wheel = sorted(Path("target/wheels").glob("citerra-*.whl"))[-1]
target = Path("target/python-test")
target.mkdir(parents=True, exist_ok=True)
with ZipFile(wheel) as archive:
    archive.extractall(target)
PY
guix shell -m manifest.scm -- env PYTHONPATH=target/python-test python3 -m pytest tests/python
License
Licensed under either of Apache-2.0 or MIT, at your option.