A command-line tool that converts a BibTeX file to a TSV of author-paper pairs for NSF COA forms
bib2coa
A command-line tool and Python library that converts a BibTeX file (.bib) into a tab-separated file (TSV) of unique author–paper pairs — designed to jumpstart the creation of NSF-compliant Collaborators & Other Affiliations (COA) documents.
What does this do?
- Parses a .bib file and extracts every author and their associated paper title + year.
- Merges duplicate entries (same year + title) so each author appears only once.
- Writes a clean two-column TSV: author<TAB>year - paper title.
- Optionally prints summary statistics (number of entries, unique papers, unique authors).
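The steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not bib2coa's actual implementation: it assumes the BibTeX entries have already been parsed into dictionaries with author, title, and year keys, and the function name build_pairs is hypothetical.

```python
import re


def build_pairs(entries):
    """Return (author, "YEAR - Title") pairs, keeping each author once.

    `entries` is assumed to be a list of dicts with "author", "title",
    and "year" keys (an assumption about the parsed BibTeX shape).
    """
    seen_papers = set()
    seen_authors = set()
    pairs = []
    for entry in entries:
        # Strip LaTeX braces and collapse runs of whitespace in the title.
        raw_title = entry.get("title", "").replace("{", "").replace("}", "")
        title = " ".join(raw_title.split())
        label = f"{entry.get('year', '')} - {title}"
        # Entries with the same year + title count as one paper.
        if label in seen_papers:
            continue
        seen_papers.add(label)
        # BibTeX separates author names with the word "and".
        for author in re.split(r"\s+and\s+", entry.get("author", "").strip()):
            if author and author not in seen_authors:
                seen_authors.add(author)
                pairs.append((author, label))
    return pairs


entries = [
    {"author": "Smith, John and Doe, Jane",
     "title": "{A} study of protein folding", "year": "2020"},
    {"author": "Smith, John and Doe, Jane",
     "title": "A study of protein folding", "year": "2020"},
    {"author": "Garcia, Maria and Smith, John",
     "title": "Phase separation in the cell", "year": "2021"},
]
pairs = build_pairs(entries)
```

Here the second entry is dropped as a duplicate paper, and Smith, John is not repeated for the 2021 paper because the name was already recorded for the 2020 one.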
What does this NOT do?
It does not automatically look up author affiliations, institutions, or any other metadata beyond what is in the BibTeX file. You will still need to fill those in manually — but this tool saves you from copying and pasting author names out of PDFs.
Installation
From PyPI (stable release)
pip install bib2coa
From source
git clone https://github.com/holehouse-lab/bib2coa.git
cd bib2coa
pip install .
Requires Python 3.6+ and the bibtexparser package (installed automatically).
Command-line usage
Basic usage
bib2coa my_references.bib
This reads my_references.bib and writes coa_initial.tsv in the current directory.
Custom output filename
bib2coa my_references.bib --output my_coa.tsv
Show summary statistics
bib2coa my_references.bib --stats
Example output:
+----------------------+
| Summary statistics |
+----------------------+
Number of bibtex entries : 42
Number of unique paper names : 38
Number of unique authors : 107
Wrote 107 author-paper pairs to coa_initial.tsv
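To make the three counts concrete, here is a rough sketch of how such statistics could be computed. It is not the tool's own code: it assumes entries are already parsed into dictionaries, and the helper name summarize is hypothetical. The dictionary keys mirror the ones documented for the Python API.

```python
import re


def summarize(entries):
    """Count entries, unique (year, title) papers, and unique author names."""
    papers = set()
    authors = set()
    for entry in entries:
        # Normalize the title the same way for every entry: drop braces,
        # collapse whitespace.
        raw_title = entry.get("title", "").replace("{", "").replace("}", "")
        papers.add((entry.get("year", ""), " ".join(raw_title.split())))
        for name in re.split(r"\s+and\s+", entry.get("author", "").strip()):
            if name:
                authors.add(name)
    return {
        "num_entries": len(entries),
        "num_unique_papers": len(papers),
        "num_unique_authors": len(authors),
    }


stats = summarize([
    {"author": "Smith, John and Doe, Jane",
     "title": "A study of protein folding", "year": "2020"},
    {"author": "Smith, John and Doe, Jane",
     "title": "{A study of protein folding}", "year": "2020"},
    {"author": "Garcia, Maria",
     "title": "Phase separation in the cell", "year": "2021"},
])
```

Because each author is written once, the number of author-paper pairs in the output matches the number of unique authors, as in the example above (107 in both places).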
Show version
bib2coa --version
All options
| Flag | Description |
|---|---|
| file (positional) | Path to the BibTeX .bib file |
| --output FILENAME | Output TSV filename (default: coa_initial.tsv) |
| --stats | Print summary statistics after processing |
| --version | Show version and exit |
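For readers who want to wrap or mimic this interface, the options above map naturally onto argparse. The sketch below is an assumption about how such a parser might look, not bib2coa's real parser; build_parser and the version string are placeholders.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="bib2coa")
    parser.add_argument("file", help="Path to the BibTeX .bib file")
    parser.add_argument("--output", metavar="FILENAME",
                        default="coa_initial.tsv",
                        help="Output TSV filename")
    parser.add_argument("--stats", action="store_true",
                        help="Print summary statistics after processing")
    # Placeholder version text; the real tool reports its own version.
    parser.add_argument("--version", action="version",
                        version="%(prog)s (version placeholder)")
    return parser


args = build_parser().parse_args(["my_references.bib", "--stats"])
```

With these arguments, args.output falls back to the documented default because --output was not given.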
Python API
You can also use bib2coa as a library in your own scripts:
from bib2coa import parse_bibtex, write_tsv
# Parse a BibTeX file
unique_pairs, stats = parse_bibtex("my_references.bib")
# unique_pairs is a list of (author, paper_label) tuples
for author, paper in unique_pairs:
    print(f"{author} -> {paper}")
# stats is a dict with keys:
# 'num_entries', 'num_unique_papers', 'num_unique_authors'
print(f"Found {stats['num_unique_authors']} unique authors")
# Write to a TSV file
write_tsv(unique_pairs, "coa_initial.tsv")
parse_bibtex(filepath)
Parse a BibTeX file and return unique author–paper pairs.
Parameters:
- filepath (str): Path to a .bib file.
Returns:
- unique_pairs (list of tuple): Each tuple is (author_name, paper_label), where paper_label is formatted as "YEAR - Title".
- stats (dict): Keys: num_entries, num_unique_papers, num_unique_authors.
Raises:
- Bib2COAException: If the file does not exist or cannot be parsed.
write_tsv(unique_pairs, output_path)
Write author–paper pairs to a tab-separated file.
Parameters:
- unique_pairs (list of tuple): Output from parse_bibtex().
- output_path (str): Path for the output TSV file.
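If you want to see what writing such a file involves, or need a variant with different columns, a function of this kind can be sketched with the standard csv module. This is an assumption about one reasonable approach, not the library's source; write_pairs_tsv is a hypothetical name chosen to avoid confusion with the real write_tsv.

```python
import csv
import os
import tempfile


def write_pairs_tsv(pairs, output_path):
    """Write (author, paper_label) tuples as tab-separated rows."""
    # newline="" lets the csv module control line endings itself.
    with open(output_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(pairs)


pairs = [
    ("Smith, John", "2020 - A study of protein folding"),
    ("Doe, Jane", "2020 - A study of protein folding"),
]

# Write into a temporary directory and read the result back for inspection.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "coa_initial.tsv")
    write_pairs_tsv(pairs, path)
    with open(path, encoding="utf-8") as handle:
        written = handle.read()
```

The comma inside each author name needs no quoting because the delimiter is a tab; csv.writer only quotes fields that contain the delimiter, a quote character, or a line break.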
Bib2COAException
Custom exception raised when a file is missing or a BibTeX file cannot be parsed.
Output format
The output TSV has two columns separated by a tab character:
Smith, John 2020 - A study of protein folding
Doe, Jane 2020 - A study of protein folding
Garcia, Maria 2021 - Phase separation in the cell
Kumar, Raj 2019 - Intrinsically disordered regions
- Column 1: Author name (as it appears in the BibTeX file).
- Column 2: Year and cleaned title (LaTeX braces {} are stripped).
Each author appears exactly once, associated with the first paper where they were encountered.
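Since the format is plain two-column TSV, it is easy to post-process. As an illustration, the sketch below reads rows in this format and groups authors by paper, which can be handy when checking the file before editing it. The function name authors_by_paper is hypothetical, and the sample data reuses the names from the example above.

```python
import csv
import io
from collections import defaultdict


def authors_by_paper(handle):
    """Group author names by paper label from a two-column TSV stream."""
    grouped = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        author, paper = row
        grouped[paper].append(author)
    return dict(grouped)


# An in-memory stand-in for an opened coa_initial.tsv file.
sample = io.StringIO(
    "Smith, John\t2020 - A study of protein folding\n"
    "Doe, Jane\t2020 - A study of protein folding\n"
    "Garcia, Maria\t2021 - Phase separation in the cell\n"
)
grouped = authors_by_paper(sample)
```

To run this against a real file, pass an open file handle instead of the StringIO object, for example one opened with newline="" and encoding="utf-8".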
Recommended workflow
Below is the workflow we recommend for making an NSF-compliant COA file.
1. Build your references
Use your reference manager (we recommend PaperPile) to select all the references you want to process and export them as a single BibTeX file.
For PaperPile, select all the papers you want to include, copy the citations as BibTeX keys to your clipboard, and save them as a .bib file (e.g. nsf.bib).
2. Run bib2coa
bib2coa --stats nsf.bib
This generates coa_initial.tsv.
3. Edit in a spreadsheet
Open coa_initial.tsv in Excel, Google Sheets, or Numbers using a tab delimiter. From there, reorganize and add affiliations to match the appropriate table in the NSF COA template.
License
MIT