A useful tool for looking up Bib entries using a DOI, a PubMed ID (or URL), or an arXiv ID (or URL).
bib_lookup
It is an updated version of https://github.com/wenh06/utils/blob/master/utils_universal/utils_bib.py
NOTE that an internet connection is required to use bib_lookup.
Installation
Run
python -m pip install bib-lookup
or install the latest version from GitHub using
python -m pip install git+https://github.com/DeepPSP/bib_lookup.git
or clone this repository with git and install it locally via
cd bib_lookup
python -m pip install .
Requirements
- requests
- feedparser
- pandas
Usage Examples
>>> from bib_lookup import BibLookup
>>> bl = BibLookup(align="middle")
>>> res = bl("1707.07183")
@article{wen2017_1707.07183v2,
author = {Hao Wen and Chunhui Liu},
title = {Counting Multiplicities in a Hypersurface over a Number Field},
journal = {arXiv preprint arXiv:1707.07183v2},
year = {2017},
month = {7},
}
>>> bl("10.1109/CVPR.2016.90")
@inproceedings{He_2016,
author = {Kaiming He and Xiangyu Zhang and Shaoqing Ren and Jian Sun},
title = {Deep Residual Learning for Image Recognition},
booktitle = {2016 {IEEE} Conference on Computer Vision and Pattern Recognition ({CVPR})},
doi = {10.1109/cvpr.2016.90},
year = {2016},
month = {6},
publisher = {{IEEE}},
}
>>> bl("10.23919/cinc53138.2021.9662801", align="left-middle")
@inproceedings{Wen_2021,
author = {Hao Wen and Jingsu Kang},
title = {Hybrid Arrhythmia Detection on Varying-Dimensional Electrocardiography: Combining Deep Neural Networks and Clinical Rules},
booktitle = {2021 Computing in Cardiology ({CinC})},
doi = {10.23919/cinc53138.2021.9662801},
year = {2021},
month = {9},
publisher = {{IEEE}},
}
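The examples above pass either an arXiv ID or a DOI to the same callable. As a rough illustration of how such identifiers can be told apart, here is a minimal sketch using regular expressions. The patterns and the `classify_identifier` helper are assumptions made for this example; they are not taken from bib_lookup and cover only the common identifier shapes.

```python
import re
from typing import Tuple

# Hypothetical patterns for illustration only; real identifiers have more variants.
DOI_PATTERN = re.compile(
    r"^(?:doi:\s*|https?://(?:dx\.)?doi\.org/)?(10\.\d{4,9}/\S+)$", re.IGNORECASE
)
ARXIV_PATTERN = re.compile(
    r"^(?:arxiv:\s*|https?://arxiv\.org/abs/)?(\d{4}\.\d{4,5}(?:v\d+)?)$", re.IGNORECASE
)
PUBMED_PATTERN = re.compile(
    r"^(?:pmid:\s*|https?://pubmed\.ncbi\.nlm\.nih\.gov/)?(\d{1,8})/?$", re.IGNORECASE
)


def classify_identifier(identifier: str) -> Tuple[str, str]:
    """Return (kind, bare_identifier), where kind is 'doi', 'arxiv', 'pubmed' or 'unknown'."""
    text = identifier.strip()
    for kind, pattern in (
        ("doi", DOI_PATTERN),
        ("arxiv", ARXIV_PATTERN),
        ("pubmed", PUBMED_PATTERN),
    ):
        match = pattern.match(text)
        if match:
            # Group 1 holds the identifier without any prefix or URL part.
            return kind, match.group(1)
    return "unknown", text
```

With these patterns, a prefixed string such as "DOI: 10.1142/S1005386718000305" is reduced to the bare DOI before any lookup would take place.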
Command-line Usage
After installation, one can use bib-lookup in the command line:
bib-lookup 10.1109/CVPR.2016.90 10.23919/cinc53138.2021.9662801 --ignore-fields url doi -i path/to/input.txt -o path/to/output.bib
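The command above passes `--ignore-fields url doi`, which presumably omits those fields from the generated entries. The sketch below illustrates that effect on an entry string. The `drop_fields` helper is hypothetical, is not part of bib_lookup, and assumes one field per line, as in the outputs shown in this README.

```python
import re

# Matches a line of the form "name = ..." and captures the field name.
FIELD_LINE = re.compile(r"^\s*([A-Za-z_]+)\s*=")


def drop_fields(entry: str, ignore_fields) -> str:
    """Remove the lines of `entry` whose field name is in `ignore_fields`.

    Assumes each field occupies exactly one line (an assumption of this sketch).
    """
    ignored = {name.lower() for name in ignore_fields}
    kept = []
    for line in entry.splitlines():
        match = FIELD_LINE.match(line)
        if match and match.group(1).lower() in ignored:
            continue  # skip an ignored field
        kept.append(line)
    return "\n".join(kept)
```

Multi-line field values would need a real BibTeX parser; this sketch only handles the single-line layout.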
Output (Append) to a .bib File
Each time a bib item is successfully found, it is cached. One can call the save function to write the cached bib items to a .bib file in append mode.
>>> from bib_lookup import BibLookup
>>> bl = BibLookup()
>>> bl(["10.1109/CVPR.2016.90", "10.23919/cinc53138.2021.9662801", "DOI: 10.1142/S1005386718000305"]);
>>> len(bl)
3
>>> bl[0]
'10.1109/CVPR.2016.90'
>>> bl.save([0, 2], "path/to/some/file.bib") # save the bib items corresponding to "10.1109/CVPR.2016.90" and "DOI: 10.1142/S1005386718000305"
>>> len(bl)
1
>>> bl.pop(0) # remove the bib item corresponding to "10.23919/cinc53138.2021.9662801", equivalent to `bl.pop("10.23919/cinc53138.2021.9662801")`
>>> len(bl)
0
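To make the append behaviour concrete, the following is a minimal sketch of appending entries to a .bib file while skipping labels that are already present in it. It is not bib_lookup's implementation; `append_new_entries` and the label pattern are assumptions made for this example.

```python
import re
from pathlib import Path

# Captures the label in "@type{label," (hypothetical pattern for this sketch).
ENTRY_LABEL = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def append_new_entries(entries, bib_path) -> int:
    """Append entries to bib_path, skipping labels already in the file.

    Returns the number of entries actually written.
    """
    path = Path(bib_path)
    existing_text = path.read_text(encoding="utf-8") if path.exists() else ""
    seen = set(ENTRY_LABEL.findall(existing_text))
    written = 0
    with path.open("a", encoding="utf-8") as handle:  # "a" = append mode
        for entry in entries:
            match = ENTRY_LABEL.search(entry)
            if match is None or match.group(1) in seen:
                continue  # not a recognisable entry, or label already saved
            handle.write(entry.rstrip() + "\n\n")
            seen.add(match.group(1))
            written += 1
    return written
```

Comparing labels only is a deliberate simplification: two different papers can receive the same label, so a stricter check might also compare the DOI or title.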
Bib Items Checking
One can use BibLookup to check the validity (required fields, duplicate labels, etc.) of bib items in a Bib file. The following example uses a Bib file that contains incorrect and duplicate bib items.
>>> from bib_lookup import BibLookup
>>> bl = BibLookup()
>>> bl.check_bib_file("./test/invalid_items.bib")
Bib item "He_2016"
starting from line 3 is not valid.
Bib item of entry type "inproceedings" should have the following fields:
['author', 'title', 'booktitle', 'year']
Bib item "Wen_2018"
starting from line 16 is not valid.
Bib item of entry type "article" should have the following fields:
['author', 'title', 'journal', 'year']
Bib items "He_2016" starting from line 3
and "He_2016" starting from line 45 is duplicate.
[3, 16, 45]
or from the command line:
bib-lookup -c ./test/invalid_items.bib
bib-lookup --ignore-fields url doi -i ./test/sample_input.txt -o ./tmp/a.bib -c true
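The two kinds of problems reported above (missing required fields and duplicate labels) can be sketched as follows. This is an illustration only, not the code behind `check_bib_file`. The required-field lists are copied from the output shown above, while `check_bib_text` and its one-field-per-line assumption are hypothetical.

```python
import re
from collections import defaultdict

# Required fields as reported in the example output above.
REQUIRED_FIELDS = {
    "article": ["author", "title", "journal", "year"],
    "inproceedings": ["author", "title", "booktitle", "year"],
}
ENTRY_START = re.compile(r"^\s*@(\w+)\s*\{\s*([^,\s]+)\s*,")
FIELD_NAME = re.compile(r"^\s*([A-Za-z_]+)\s*=")


def check_bib_text(text):
    """Return sorted start lines of entries with missing fields or duplicate labels.

    Assumes each entry starts on its own line and each field sits on one line.
    """
    entries = []  # tuples of (start_line, entry_type, label, field_names)
    for line_number, line in enumerate(text.splitlines(), start=1):
        start = ENTRY_START.match(line)
        if start:
            entries.append((line_number, start.group(1).lower(), start.group(2), set()))
            continue
        field = FIELD_NAME.match(line)
        if field and entries:
            entries[-1][3].add(field.group(1).lower())

    problem_lines = set()
    lines_by_label = defaultdict(list)
    for start_line, entry_type, label, fields in entries:
        required = REQUIRED_FIELDS.get(entry_type, [])
        if any(name not in fields for name in required):
            problem_lines.add(start_line)  # a required field is missing
        lines_by_label[label].append(start_line)
    for lines in lines_by_label.values():
        if len(lines) > 1:
            problem_lines.update(lines)  # every copy of a duplicated label
    return sorted(problem_lines)
```

As in the output above, the result is a list of starting line numbers, with both copies of a duplicated label included.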
TODO
- (:heavy_check_mark:) add CLI support;
- use eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi for PubMed, as in [3];
- try using the Google Scholar API described in [4] (unfortunately, [4] is a paid service);
- use Flask to write a simple browser-based UI;
- (:heavy_check_mark:) check whether the bib item already exists in the output file, and skip saving it if so;
WARNING
Many journals have specific requirements for Bib entries; for example, the title and/or journal (and/or booktitle) should be capitalized in a particular way. This cannot be done automatically, since
- some abbreviations in a title should be entirely in upper case, for example
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
- some should be entirely in lower case, for example
mixup: Beyond Empirical Risk Minimization
- and some others should be in mixed case, for example
KeMRE: Knowledge-enhanced Medical Relation Extraction for Chinese Medicine Instructions
Users should correct such fields themselves where necessary (which is rare), and remember to enclose them in double curly braces.
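As a small illustration of the double-curly-brace convention, the sketch below formats a field line whose value keeps its exact capitalization. The `format_protected_field` helper is hypothetical and not part of bib_lookup; it expects the raw value without any surrounding braces.

```python
def format_protected_field(name: str, value: str) -> str:
    """Return a Bib field line with the value enclosed in double curly braces.

    The inner pair of braces tells BibTeX styles not to change the case.
    """
    return name + " = {{" + value + "}},"


# Example: keep the lower-case "mixup" at the start of the title.
print(format_protected_field("title", "mixup: Beyond Empirical Risk Minimization"))
```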
Biblatex Cheatsheet
This file, downloaded from [6], gives a comprehensive overview of bib entries.
References