Small library for extracting references used in scholarly communication.
Free software: GPLv2
Documentation: http://pythonhosted.org/refextract/
Originally exported from Invenio https://github.com/inveniosoftware/invenio.
Installation
    pip install refextract
Usage
To get structured info from a publication reference:
    from refextract import extract_journal_reference
    reference = extract_journal_reference("J.Phys.,A39,13445")
    print(reference)
    {
        'extra_ibids': [],
        'is_ibid': False,
        'misc_txt': u'',
        'page': u'13445',
        'title': u'J. Phys.',
        'type': 'JOURNAL',
        'volume': u'A39',
        'year': ''
    }
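The returned dictionary can be turned back into a compact citation string. The sketch below is only an illustration: `format_journal_reference` is a hypothetical helper, not part of refextract, and it assumes the dictionary carries the 'title', 'volume', 'page' and 'year' keys shown in the output above.

```python
def format_journal_reference(ref):
    """Build a short citation string from a parsed journal reference.

    Hypothetical helper, not part of refextract. Assumes `ref` has the
    'title', 'volume', 'page' and 'year' keys shown in the output above.
    """
    parts = [ref.get('title', ''), ref.get('volume', '')]
    citation = ' '.join(p for p in parts if p)
    if ref.get('page'):
        citation += ', ' + ref['page']
    if ref.get('year'):
        citation += ' (' + ref['year'] + ')'
    return citation


# Sample input copied from the output shown above.
sample = {
    'extra_ibids': [],
    'is_ibid': False,
    'misc_txt': '',
    'page': '13445',
    'title': 'J. Phys.',
    'type': 'JOURNAL',
    'volume': 'A39',
    'year': '',
}

print(format_journal_reference(sample))
```

Empty fields such as 'year' in this sample are skipped rather than rendered as empty parentheses.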
To extract references from a publication full-text PDF:
    from refextract import extract_references_from_file
    reference = extract_references_from_file("some/fulltext/1503.07589v1.pdf")
    print(reference)
    {
        'references': [
            {'author': [u'F. Englert and R. Brout'],
             'doi': [u'10.1103/PhysRevLett.13.321'],
             'journal_page': [u'321'],
             'journal_reference': ['Phys.Rev.Lett.,13,1964'],
             'journal_title': [u'Phys.Rev.Lett.'],
             'journal_volume': [u'13'],
             'journal_year': [u'1964'],
             'linemarker': [u'1'],
             'title': [u'Broken symmetry and the mass of gauge vector mesons'],
             'year': [u'1964']}, ...
        ],
        'stats': {
            'author': 15,
            'date': '2016-01-12 10:52:58',
            'doi': 1,
            'misc': 0,
            'old_stats_str': '0-1-1-15-0-1-0',
            'reportnum': 1,
            'status': 0,
            'title': 1,
            'url': 0,
            'version': u'0.1.0.dev20150722'
        }
    }
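Every field of an extracted reference is a list, even when it holds a single value. A minimal sketch of post-processing that structure follows. Both `flatten_reference` and `collect_dois` are hypothetical helpers, not part of refextract, and they assume the result has the shape shown in the output above.

```python
def flatten_reference(ref):
    """Collapse single-element lists to plain values.

    Hypothetical helper. Assumes every field is a list, as in the
    output above. Fields with zero or several values stay as lists.
    """
    flat = {}
    for key, values in ref.items():
        flat[key] = values[0] if len(values) == 1 else list(values)
    return flat


def collect_dois(result):
    """Return every DOI found in result['references'], in order."""
    dois = []
    for ref in result.get('references', []):
        dois.extend(ref.get('doi', []))
    return dois


# Sample trimmed from the output shown above.
result = {
    'references': [
        {'author': ['F. Englert and R. Brout'],
         'doi': ['10.1103/PhysRevLett.13.321'],
         'journal_title': ['Phys.Rev.Lett.'],
         'journal_volume': ['13'],
         'year': ['1964']},
    ],
}

first = flatten_reference(result['references'][0])
print(first['author'])
print(collect_dois(result))
```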
You can also extract directly from a URL:
    from refextract import extract_references_from_url
    reference = extract_references_from_url("http://arxiv.org/pdf/1503.07589v1.pdf")
    print(reference)
    {
        'references': [
            {'author': [u'F. Englert and R. Brout'],
             'doi': [u'10.1103/PhysRevLett.13.321'],
             'journal_page': [u'321'],
             'journal_reference': ['Phys.Rev.Lett.,13,1964'],
             'journal_title': [u'Phys.Rev.Lett.'],
             'journal_volume': [u'13'],
             'journal_year': [u'1964'],
             'linemarker': [u'1'],
             'title': [u'Broken symmetry and the mass of gauge vector mesons'],
             'year': [u'1964']}, ...
        ],
        'stats': {
            'author': 15,
            'date': '2016-01-12 10:52:58',
            'doi': 1,
            'misc': 0,
            'old_stats_str': '0-1-1-15-0-1-0',
            'reportnum': 1,
            'status': 0,
            'title': 1,
            'url': 0,
            'version': u'0.1.0.dev20150722'
        }
    }
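When processing several URLs, it can help to keep going after a failed download or parse. The sketch below shows one way to do that. `extract_many` and `fake_extractor` are hypothetical names, not part of refextract, and because the exceptions refextract raises are not described here, the sketch catches `Exception` broadly as a stated assumption.

```python
def extract_many(urls, extractor):
    """Run `extractor` on each URL and separate successes from failures.

    Hypothetical helper. `extractor` is any callable that takes a URL
    and returns a result dictionary, e.g. extract_references_from_url.
    Catching Exception is an assumption; narrow it in real code.
    """
    results = {}
    failures = {}
    for url in urls:
        try:
            results[url] = extractor(url)
        except Exception as exc:
            failures[url] = str(exc)
    return results, failures


def fake_extractor(url):
    """Stand-in for extract_references_from_url so the sketch runs offline."""
    if not url.endswith('.pdf'):
        raise ValueError('not a PDF: ' + url)
    return {'references': [], 'stats': {'status': 0}}


results, failures = extract_many(
    ['http://arxiv.org/pdf/1503.07589v1.pdf', 'http://example.org/page.html'],
    fake_extractor,
)
print(sorted(results))
print(failures)
```

With refextract installed, pass `extract_references_from_url` in place of `fake_extractor`.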
Changes
Version 0.1.0 (2016-01-12)
Initial export from Invenio Software <https://github.com/inveniosoftware/invenio>
Restructured into a stripped-down, standalone version