Utility class wrapping lxml for reading data from MODS v3.4 XML metadata into Python data types.
Project description
pymods is a utility module for working with the Library of Congress’s MODS XML standard: Metadata Object Description Schema (MODS). It is a wrapper for the lxml module, specific to deserializing data out of MODSXML into Python data types.
If you need a module to serialize data into MODSXML, see the other pymods by Matt Cordial.
Installing
Recommended:
pip install pymods
Using
Basics
XML is parsed using the MODSReader class:
mods_records = pymods.MODSReader('some_file.xml')
The reader is iterable, and each record it yields is a pymods.Record object:
In [3]: len(mods_records)
Out[3]: 3
In [5]: for record in mods_records:
....:     print(record)
....:
<Element Record at 0x6fffe5d50e8>
<Element Record at 0x6fffe5d5188>
<Element Record at 0x6fffe5d5228>
Or they can be accessed individually through the MODSReader.record_list attribute:
In [8]: print(mods_records.record_list[0].title_constructor())
['Fire Line System']
MODSReader will work with mods:modsCollection documents, outputs from OAI-PMH feeds, or individual MODSXML documents with mods:mods as the root element. When parsing only a single record, the MODSReader class will still store the record in the record_list attribute. To access that record, iterate over the reader or index record_list.
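To illustrate why one reader can handle all three document shapes, here is a minimal sketch using only the standard library's xml.etree.ElementTree. It is not pymods' implementation; the sample documents and the helper names find_records and titles are hypothetical.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Hypothetical sample documents, not taken from the pymods test data.
COLLECTION = f"""<modsCollection xmlns="{MODS_NS}">
  <mods><titleInfo><title>Fire Line System</title></titleInfo></mods>
  <mods><titleInfo><title>Second record</title></titleInfo></mods>
</modsCollection>"""

SINGLE = f"""<mods xmlns="{MODS_NS}">
  <titleInfo><title>Fire Line System</title></titleInfo>
</mods>"""


def find_records(xml_text):
    """Return every mods:mods element, whatever the root element is."""
    root = ET.fromstring(xml_text)
    # Element.iter() visits the root itself and all descendants, so a bare
    # mods:mods root and a wrapped collection are handled the same way.
    return list(root.iter(f"{{{MODS_NS}}}mods"))


def titles(xml_text):
    """Collect the text of the first mods:title under each record."""
    ns = {"mods": MODS_NS}
    return [
        rec.findtext("mods:titleInfo/mods:title", namespaces=ns)
        for rec in find_records(xml_text)
    ]


print(len(find_records(COLLECTION)), len(find_records(SINGLE)))
print(titles(SINGLE))
```

Because the search starts at the root and includes it, a single-record document still produces a one-item list, which mirrors the record_list behaviour described above.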
pymods.Record
The MODSReader class parses each mods:mods element into a pymods.Record object. pymods.Record is a custom wrapper class for the lxml.etree.ElementBase class. All children of pymods.Record inherit the lxml.etree._Element and lxml.etree.ElementBase methods.
Methods
All functions return data either as a string, list, or dict. See the appropriate docstrings for details.
Examples
Importing
from pymods import MODSReader, Record
Parsing a file
>>> mods = MODSReader('example.xml')
>>> len(mods)
3
Simple tasks
Generating a title list
In [14]: for record in mods:
....:     print(record.title_constructor())
....:
['Fire Line System']
['$93,668.90. One Mill Tax Apportioned by Various Ways Proposed']
['Broward NOW News: National Organization for Women, February 1987']
Creating a subject list
In [17]: for record in mods:
....:     for subject in record.subject_constructor():
....:         print(subject)
....:
Concert halls
Architecture
Architectural drawings
Structural systems
Structural systems drawings
Structural drawings
Safety equipment
Construction
Mechanics
Structural optimization
Architectural design
Fire prevention--Safety measures
7013143
Taxes
Tax payers
Tax collection
Organizations
Feminism
Sex discrimination against women
Women's rights
Equal rights amendments
2020598
Women--Societies and clubs
National Organization for Women
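The output above mixes textual headings with two bare numeric strings (7013143 and 2020598), which look like identifiers rather than headings; the source does not say what they identify. A minimal sketch for separating the two kinds of value, assuming the subjects are plain strings as printed above (split_subjects is a hypothetical helper, not part of pymods):

```python
# Sample values copied from the subject output above; in practice they would
# come from record.subject_constructor().
subjects = [
    "Concert halls",
    "Fire prevention--Safety measures",
    "7013143",
    "Taxes",
    "2020598",
    "Women--Societies and clubs",
]


def split_subjects(values):
    """Separate textual headings from strings made up only of digits."""
    headings = [v for v in values if not v.isdigit()]
    identifiers = [v for v in values if v.isdigit()]
    return headings, identifiers


headings, identifiers = split_subjects(subjects)
print(headings)
print(identifiers)
```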
More complex tasks
Creating a list of subject URIs only for LCSH subjects
In [18]: for record in mods:
....:     for subject in record.subject():
....:         if 'authority' in subject.keys() and 'lcsh' == subject['authority']:
....:             print(subject['valueURI'])
....:
http://id.loc.gov/authorities/subjects/sh85082767
http://id.loc.gov/authorities/subjects/sh88004614
http://id.loc.gov/authorities/subjects/sh85132810
http://id.loc.gov/authorities/subjects/sh85147343
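The loop above would raise a KeyError for an LCSH subject that carries no valueURI. Whether pymods ever returns such a subject is an assumption here; the sketch below shows one defensive variant using dict.get(). The sample dicts are hypothetical stand-ins for the values returned by record.subject(), and lcsh_uris is not a pymods function.

```python
def lcsh_uris(subjects):
    """Return the valueURI of each subject dict whose authority is 'lcsh'."""
    return [
        s["valueURI"]
        for s in subjects
        # dict.get() returns None for a missing key instead of raising.
        if s.get("authority") == "lcsh" and s.get("valueURI")
    ]


# Hypothetical stand-ins; only the 'authority' and 'valueURI' keys are taken
# from the example above.
sample = [
    {"authority": "lcsh",
     "valueURI": "http://id.loc.gov/authorities/subjects/sh85082767"},
    {"authority": "lcsh"},  # LCSH subject with no URI recorded
    {"authority": "other", "valueURI": "http://example.org/not-lcsh"},
    {"valueURI": "http://example.org/no-authority"},
]

print(lcsh_uris(sample))
```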
Get URLs for objects using a No Copyright US rightsstatements.org URI
In [23]: for record in mods:
....:     if record.rights()['URI'] == 'http://rightsstatements.org/vocab/NoC-US/1.0/':
....:         print(record.purl_search())
....:
http://purl.flvc.org/fsu/fd/FSU_MSS0204_B01_F10_09
http://purl.flvc.org/fsu/fd/FSU_MSS2008003_B18_F01_004
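The source does not say what rights() returns for a record with no rights statement. The sketch below assumes it might be None or a dict without a 'URI' key and filters accordingly. It works on hypothetical (rights, purl) pairs rather than real Record objects, and purls_with_rights is an illustrative helper, not part of pymods.

```python
NOC_US = "http://rightsstatements.org/vocab/NoC-US/1.0/"


def purls_with_rights(pairs, rights_uri):
    """Return the PURLs whose rights dict carries the given URI.

    pairs is an iterable of (rights, purl), where rights is a dict or None.
    """
    return [
        purl
        for rights, purl in pairs
        # Skip records with no rights data or no 'URI' key.
        if rights and rights.get("URI") == rights_uri
    ]


# Hypothetical stand-ins for (record.rights(), record.purl_search()).
sample = [
    ({"URI": NOC_US}, "http://purl.flvc.org/fsu/fd/FSU_MSS0204_B01_F10_09"),
    (None, "http://example.org/no-rights"),
    ({"text": "Rights statement without a URI"}, "http://example.org/no-uri"),
    ({"URI": NOC_US}, "http://purl.flvc.org/fsu/fd/FSU_MSS2008003_B18_F01_004"),
]

for purl in purls_with_rights(sample, NOC_US):
    print(purl)
```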
Download files
Source Distribution
Built Distribution
File details
Details for the file pymods-1.0.0.tar.gz.
File metadata
- Download URL: pymods-1.0.0.tar.gz
- Upload date:
- Size: 13.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | f1d4b7d6dfd2b78562fede54d0714abb08ee1759046c664d7c297f2aed15b2fe
MD5 | a1ff541921d8db31112b1482d29cc5d8
BLAKE2b-256 | 4329a13f18d2de1b80bbc2618fe1ac1461b03bba3798e026021c1168cc34e1af
File details
Details for the file pymods-1.0.0-py3-none-any.whl.
File metadata
- Download URL: pymods-1.0.0-py3-none-any.whl
- Upload date:
- Size: 15.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9e874c0677c022d7f4a5625d63d2f466d53a719350e59c28d968addddc1e1943
MD5 | f3d39684f1a96f1411d196dc2bc8032f
BLAKE2b-256 | 96216de8d63cb20d8148315a6b0319c2d884f0e828ccf1b3eff6d5de53d53783