Skip to main content

Reads RIS files into dictionaries via a generator for large files

Project description

A Python 3.6+ reader/writer of RIS reference files.

Usage

Parsing:

>>> import os
>>> from pprint import pprint
>>> import rispy
>>> filepath = 'tests/data/example_full.ris'
>>> with open(filepath, 'r') as bibliography_file:
...     entries = rispy.load(bibliography_file)
...     for entry in entries:
...         print(entry['id'])
...         print(entry['first_authors'])
12345
['Marx, Karl', 'Lindgren, Astrid']
12345
['Marxus, Karlus', 'Lindgren, Astrid']

Writing:

>>> import os
>>> import rispy
>>> entries = [
... {'type_of_reference': 'JOUR',
...  'id': '42',
...  'primary_title': 'The title of the reference',
...  'first_authors': ['Marxus, Karlus', 'Lindgren, Astrid']
...  },{
... 'type_of_reference': 'JOUR',
...  'id': '43',
...  'primary_title': 'Reference 43',
...  'abstract': 'Lorem ipsum'
...  }]
>>> filepath = 'export.ris'
>>> with open(filepath, 'w') as bibliography_file:
...     rispy.dump(entries, bibliography_file)

Example RIS entry

1.
TY  - JOUR
ID  - 12345
T1  - Title of reference
A1  - Marx, Karl
A1  - Lindgren, Astrid
A2  - Glattauer, Daniel
Y1  - 2014//
N2  - BACKGROUND: Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.  RESULTS: Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. CONCLUSIONS: Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium.
KW  - Pippi
KW  - Nordwind
KW  - Piraten
JF  - Lorem
JA  - lorem
VL  - 9
IS  - 3
SP  - e0815
CY  - United States
PB  - Fun Factory
PB  - Fun Factory USA
SN  - 1932-6208
M1  - 1008150341
L2  - http://example.com
ER  -

TAG_KEY_MAPPING

The most fields contain string values, but some like first_authors (A1) are parsed into lists. The default mapping were created from specifications scattered around the web, but to our knowledge there is not one single source of RIS truth, so these may need to be modified for specific export systems:

Complete list of ListType tags

>>> from rispy.config import LIST_TYPE_TAGS
>>> pprint(LIST_TYPE_TAGS)
['A1', 'A2', 'A3', 'A4', 'AU', 'KW', 'N1']

Complete default mapping

>>> from rispy.config import TAG_KEY_MAPPING
>>> pprint(TAG_KEY_MAPPING)
{'A1': 'first_authors',
 'A2': 'secondary_authors',
 'A3': 'tertiary_authors',
 'A4': 'subsidiary_authors',
 'AB': 'abstract',
 'AD': 'author_address',
 'AN': 'accession_number',
 'AU': 'authors',
 'C1': 'custom1',
 'C2': 'custom2',
 'C3': 'custom3',
 'C4': 'custom4',
 'C5': 'custom5',
 'C6': 'custom6',
 'C7': 'custom7',
 'C8': 'custom8',
 'CA': 'caption',
 'CN': 'call_number',
 'CY': 'place_published',
 'DA': 'date',
 'DB': 'name_of_database',
 'DO': 'doi',
 'DP': 'database_provider',
 'EP': 'end_page',
 'ER': 'end_of_reference',
 'ET': 'edition',
 'ID': 'id',
 'IS': 'number',
 'J2': 'alternate_title1',
 'JA': 'alternate_title2',
 'JF': 'alternate_title3',
 'JO': 'journal_name',
 'KW': 'keywords',
 'L1': 'file_attachments1',
 'L2': 'file_attachments2',
 'L4': 'figure',
 'LA': 'language',
 'LB': 'label',
 'M1': 'note',
 'M3': 'type_of_work',
 'N1': 'notes',
 'N2': 'notes_abstract',
 'NV': 'number_of_volumes',
 'OP': 'original_publication',
 'PB': 'publisher',
 'PY': 'year',
 'RI': 'reviewed_item',
 'RN': 'research_notes',
 'RP': 'reprint_edition',
 'SE': 'section',
 'SN': 'issn',
 'SP': 'start_page',
 'ST': 'short_title',
 'T1': 'primary_title',
 'T2': 'secondary_title',
 'T3': 'tertiary_title',
 'TA': 'translated_author',
 'TI': 'title',
 'TT': 'translated_title',
 'TY': 'type_of_reference',
 'UK': 'unknown_tag',
 'UR': 'url',
 'VL': 'volume',
 'Y1': 'publication_year',
 'Y2': 'access_date'}

Override key mapping

The parser use a TAG_KEY_MAPPING, which one can override by calling rispy.load() with a custom mapping.

>>> import os
>>> from copy import deepcopy
>>> import rispy
>>> from pprint import pprint

>>> filepath = 'tests/data/example_full.ris'
>>> mapping = deepcopy(rispy.TAG_KEY_MAPPING)
>>> mapping["SP"] = "pages_this_is_my_fun"
>>> with open(filepath, 'r') as bibliography_file:
...     entries = rispy.load(bibliography_file, mapping=mapping)
...     pprint(sorted(entries[0].keys()))
['alternate_title2',
 'alternate_title3',
 'file_attachments2',
 'first_authors',
 'id',
 'issn',
 'keywords',
 'note',
 'notes_abstract',
 'number',
 'pages_this_is_my_fun',
 'place_published',
 'primary_title',
 'publication_year',
 'publisher',
 'secondary_authors',
 'type_of_reference',
 'url',
 'volume']

Developer instructions

Common developer commands are in the provided Makefile; if you don’t have make installed, you can view the make commands and run the commands from the command-line manually:

# setup environment
python -m venv venv
source venv/bin/activate
pip install -e .[dev,test]

# check if code format changes are required
make lint

# reformat code
make format

# run tests
make test

Github Actions are currently enabled to run lint and test when submitting a pull-request.

Project details


Release history Release notifications

This version

0.5.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for rispy, version 0.5.0
Filename, size File type Python version Upload date Hashes
Filename, size rispy-0.5.0-py3-none-any.whl (15.6 kB) File type Wheel Python version py3 Upload date Hashes View
Filename, size rispy-0.5.0.tar.gz (26.6 kB) File type Source Python version None Upload date Hashes View

Supported by

Pingdom Pingdom Monitoring Google Google Object Storage and Download Analytics Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page