Module to complete bibtex files by polling online databases

Bibtex Autocomplete

bibtexautocomplete, or btac, is a Python package to autocomplete BibTeX bibliographies. It is inspired by, and expands on, the solution provided by thando in this TeX StackExchange post.

It attempts to complete a BibTeX file by querying the following domains: crossref, arxiv, dblp, researchr and unpaywall.

Big thanks to all of them for allowing open, easy and well-documented access to their databases.

Quick overview

How does it find matches?

btac queries the websites using the entry's DOI if it is known, and its title otherwise, so entries that have neither field will not be completed. The title should also be the full title: titles are compared ignoring case and punctuation, but any missing word counts as a mismatch.
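
The comparison rule above can be sketched as follows. This is only an illustration under a simple assumption (lowercase, ASCII punctuation replaced by spaces, whitespace collapsed); the names normalize_title and titles_match are hypothetical and btac's actual matching code may differ.

```python
import string

# Hypothetical helpers illustrating the comparison rule described above;
# this is not btac's real implementation.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def normalize_title(title: str) -> str:
    """Lowercase, replace ASCII punctuation with spaces, collapse whitespace."""
    return " ".join(title.lower().translate(_PUNCT_TO_SPACE).split())


def titles_match(a: str, b: str) -> bool:
    """True when both titles are equal once case and punctuation are ignored."""
    return normalize_title(a) == normalize_title(b)


# Case and punctuation differences are ignored...
print(titles_match("Super Interesting Article!", "super interesting article"))
# ...but a missing word is a mismatch.
print(titles_match("Super Interesting Article!", "Interesting Article"))
```

The first call prints True and the second prints False, which is why a shortened title in your bibtex file will not be matched.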

Disclaimers

  • There is no guarantee that the script will find matches for your entries, or that the websites will have any data to add to them (or even that the website data is correct, but that's not for me to say...).

  • The script is designed to minimize the chance of false positives, that is, adding data from another similar-ish entry to your entry. If you find any such false positive, please report it using the issue tracker.

How are entries completed?

Once responses from all websites have been received, the script adds fields from them with the following priority: crossref > arxiv > dblp > researchr > unpaywall.

So if both crossref's and dblp's responses contain a publisher, the one from crossref will be used.

The script will not overwrite any user-given non-empty fields, unless the -f/--force-overwrite flag is given.
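
As a rough sketch of these rules (not btac's actual code), the merge can be pictured as below. The function merge_fields and its arguments are hypothetical names chosen for illustration; only the source order and the overwrite behaviour come from the description above.

```python
from typing import Dict, List, Optional

# Source order taken from the priority list above (highest priority first).
PRIORITY: List[str] = ["crossref", "arxiv", "dblp", "researchr", "unpaywall"]


def merge_fields(
    entry: Dict[str, str],
    responses: Dict[str, Optional[Dict[str, str]]],
    force_overwrite: bool = False,
) -> Dict[str, str]:
    """Hypothetical merge: each field comes from the highest-priority source."""
    merged = dict(entry)
    claimed = set()  # fields already supplied by a higher-priority source
    for source in PRIORITY:
        data = responses.get(source) or {}  # None means no match was found
        for field, value in data.items():
            if not value or field in claimed:
                continue
            # Keep user-given non-empty values unless overwriting is forced
            if merged.get(field) and not force_overwrite:
                continue
            merged[field] = value
            claimed.add(field)
    return merged


entry = {"title": "Some title", "publisher": ""}
responses = {
    "crossref": {"publisher": "Publisher A"},
    "arxiv": None,
    "dblp": {"publisher": "Publisher B", "year": "2020"},
}
print(merge_fields(entry, responses))
```

Here the empty publisher is filled from crossref rather than dblp, the year comes from dblp because no higher-priority source has one, and the existing title is left alone.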

Installation

btac can be installed with pip:

pip install bibtexautocomplete

You should now be able to run the script using either command:

btac --version
python3 -m bibtexautocomplete --version

Dependencies

This package has two dependencies (automatically installed by pip) :

Usage

The command line tool can be used as follows:

btac [-flags] <input_files>

Examples:

  • btac my/db.bib : reads from ./my/db.bib, writes to ./my/db.btac.bib
  • btac -i db.bib : reads from db.bib and overwrites it (inplace flag)
  • btac db1.bib db2.bib -o out1.bib -o out2.bib : reads multiple files and writes their outputs to out1.bib and out2.bib respectively
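
The way inputs map to output files in these examples (together with the -o and -i options described below) can be sketched like this. The helper output_paths is a hypothetical name, and the code is an assumption about the behaviour based on this README, not btac's own implementation.

```python
from pathlib import Path
from typing import List


def output_paths(inputs: List[str], outputs: List[str], inplace: bool = False) -> List[Path]:
    """Hypothetical helper: pair each input with the file it is written to."""
    paths = []
    for i, name in enumerate(inputs):
        src = Path(name)
        if inplace:
            paths.append(src)  # -i: overwrite the input, -o is ignored
        elif i < len(outputs):
            paths.append(Path(outputs[i]))  # -o values are used in order
        else:
            # Default name: my/db.bib -> my/db.btac.bib
            paths.append(src.with_suffix(".btac" + src.suffix))
    return paths


print(output_paths(["my/db.bib"], []))
print(output_paths(["db1.bib", "db2.bib", "db3.bib"], ["out1.bib", "out2.bib"]))
```

In the second call the third input has no matching -o value, so it falls back to the default db3.btac.bib name.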

Optional arguments:

  • -o --output <file.bib>

    Write output to the given file. Can be used multiple times when giving multiple inputs, in which case inputs are mapped to outputs in order. Any extra inputs use the default name (old_name.btac.bib). Ignored in inplace (-i) mode.

  • -q --only-query <website> or -Q --dont-query <website>

    Restrict which websites to query. <website> must be one of: crossref, dblp, researchr, unpaywall. These arguments can be used multiple times; for example, to only query crossref and dblp, use -q crossref -q dblp or -Q researchr -Q unpaywall.

  • -e --only-entry <id> or -E --exclude-entry <id>

    Restrict which entries should be autocompleted. <id> is the entry id used in your bibtex file (e.g. @inproceedings{<id> ... }). These arguments can also be used multiple times to select or exclude multiple entries.

  • -c --only-complete <field> or -C --dont-complete <field>

    Restrict which fields you wish to autocomplete. <field> is a bibtex field (e.g. author, doi, ...). So if you only wish to add missing DOIs, use -c doi.
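
The three pairs of options above (-q/-Q, -e/-E, -c/-C) all follow the same keep-only or leave-out pattern. A minimal sketch of that pattern is given below; select is a hypothetical function, and how btac behaves when both flags of a pair are given together is an assumption here, not something this README states.

```python
from typing import Iterable, List, Optional


def select(
    available: Iterable[str],
    only: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical filter mirroring the -q/-Q, -e/-E and -c/-C pairs."""
    items = list(available)
    if only:  # "only" flags keep just the listed names
        items = [x for x in items if x in only]
    if exclude:  # "dont"/"exclude" flags drop the listed names
        items = [x for x in items if x not in exclude]
    return items


SITES = ["crossref", "dblp", "researchr", "unpaywall"]
# Equivalent of: -q crossref -q dblp
print(select(SITES, only=["crossref", "dblp"]))
# Equivalent of: -Q researchr -Q unpaywall
print(select(SITES, exclude=["researchr", "unpaywall"]))
```

Both calls print ['crossref', 'dblp'], matching the two equivalent command lines given for -q and -Q.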

Output formatting:

  • --fa --align-values pad fieldnames to align all values

    @article{Example,
      author = {Someone},
      doi    = {10.xxxx/yyyyy},
    }
    
  • --fc --comma-first use comma first syntax

    @article{Example
      , author = {Someone}
      , doi = {10.xxxx/yyyyy}
      ,
    }
    
  • --fl --no-trailing-comma don't add the last trailing comma

  • --fi --indent <space> space used for indentation, default is a tab
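
To make the effect of these options concrete, here is a small sketch of a formatter that reproduces the layouts shown above. format_entry and its parameters are hypothetical names; btac's real writer is not shown in this README and may work differently.

```python
from typing import Dict


def format_entry(
    kind: str,
    key: str,
    fields: Dict[str, str],
    indent: str = "\t",
    align_values: bool = False,
    comma_first: bool = False,
    trailing_comma: bool = True,
) -> str:
    """Hypothetical formatter reproducing the layouts shown above."""
    # With align_values, pad every field name to the longest one
    width = max(map(len, fields), default=0) if align_values else 0
    pairs = [f"{name.ljust(width)} = {{{value}}}" for name, value in fields.items()]
    if comma_first:
        lines = [f"@{kind}{{{key}"] + [f"{indent}, {p}" for p in pairs]
        if trailing_comma:
            lines.append(f"{indent},")
    else:
        lines = [f"@{kind}{{{key},"] + [f"{indent}{p}," for p in pairs]
        if pairs and not trailing_comma:
            lines[-1] = lines[-1][:-1]  # drop the comma after the last field
    return "\n".join(lines + ["}"])


fields = {"author": "Someone", "doi": "10.xxxx/yyyyy"}
print(format_entry("article", "Example", fields, indent="  ", align_values=True))
print(format_entry("article", "Example", fields, indent="  ", comma_first=True))
```

The indent argument plays the role of --fi, with a tab as the default as stated above.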

Flags:

  • -i --inplace Modify input files inplace, ignores any specified output files

  • -f --force-overwrite Overwrite fields that are already present. By default, a field is only filled in if it is empty or absent.

  • -t --timeout <float> Set the timeout on requests, in seconds (default: 10.0). Increase this if you are getting a lot of timeouts.

  • -d --dump-data <file.json> writes matching entries to the given JSON file.

    This lets you see duplicate fields from different sources that are otherwise overwritten when merged into a single entry.

    The JSON file will have the following formatting:

    [
      {
        "entry": "<entry_id>",
        "new-fields": 8,
        "crossref": {
          "query-url": "https://api.crossref.org/...",
          "query-response-time": 0.556,
          "query-response-status": 200,
          "author" : "Lastname, Firstnames and Lastname, Firstnames ...",
          "title" : "super interesting article!",
          "..." : "..."
        },
        "arxiv": null, // null when no match found
        "dblp": ...,
        "researchr": ...,
        "unpaywall": ...
      },
      ...
    ]
    
  • -O --no-output don't write any output files (except the one specified by --dump-data)

  • -v --verbose verbose mode shows more info. It details entries as they are being processed and shows a summary of new fields and their sources at the end. Using it more than once (up to three times) prints debug info.

  • -s --silent hide info and the progress bar, but keep showing warnings and errors. Use twice to also hide warnings, thrice to also hide errors, and four times to also hide critical errors, effectively silencing all output.

  • -n --no-color don't use ANSI codes to color and stylise output

  • --version show version number

  • -h --help show help
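
The JSON file written by -d/--dump-data can be inspected with a few lines of Python. The sketch below assumes the structure shown in the -d description (one object per entry, one key per source, null when no match was found, and query metadata keys starting with "query-"). The function sources_per_field and the sample data are made up for illustration.

```python
import json
from typing import Dict, List

# Keys that describe the query itself rather than a bibtex field
# (assumed from the example dump format in the -d description).
META_PREFIX = "query-"


def sources_per_field(dump_text: str) -> Dict[str, Dict[str, List[str]]]:
    """For each entry, map every field to the sources that returned it."""
    result: Dict[str, Dict[str, List[str]]] = {}
    for record in json.loads(dump_text):
        fields: Dict[str, List[str]] = {}
        for source, data in record.items():
            # Skips "entry", "new-fields" and sources that are null
            if not isinstance(data, dict):
                continue
            for name in data:
                if not name.startswith(META_PREFIX):
                    fields.setdefault(name, []).append(source)
        result[record["entry"]] = fields
    return result


# Made-up sample in the documented shape
sample = """[{"entry": "Example", "new-fields": 1,
  "crossref": {"query-url": "https://api.crossref.org/...", "title": "a title"},
  "arxiv": null,
  "dblp": {"title": "A Title", "publisher": "Some Publisher"}}]"""
print(sources_per_field(sample))
```

With a real dump you would pass the file contents instead, for example the text read from the path given to -d. Fields listed under several sources are the duplicates that the merge would otherwise hide.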
