A Python client for the Wilson Center Digital Archive API.
Digital Archive
A Python client for the Wilson Center's Digital Archive ("DA") of historical primary sources. This library provides an ORM for searching and accessing documents and other resources in the Digital Archive.
Usage
>>> import digitalarchive
# Search for documents:
>>> soviet_docs = digitalarchive.Document.match(name="soviet").all()
# Collections and other resource types are also searchable.
>>> soviet_collections = digitalarchive.Collection.match(name="soviet")
# Grab a single, specific document:
>>> document = digitalarchive.Document.match(id="112566").first()
# Pull transcripts, translations, and original scans of documents:
>>> document.hydrate()
>>> transcript_html = document.transcripts[0].html
# Pull the metadata and other assets for an entire resultset.
>>> chernobyl_docs = digitalarchive.Document.match(name="chernobyl")
>>> chernobyl_docs.hydrate()
>>> chernobyl_docs.all()
# Or just download all the documents!
>>> all_documents = digitalarchive.Document.match().all()
Disclaimers
- This is an unofficial library. I am not presently affiliated with the Wilson Center. I understand that the API is unlikely to change in the near future, but I cannot guarantee that this library won't break without warning.
- If you plan to scrape the DA, please be respectful.
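One way to keep bulk access polite is to pause between requests. The sketch below is a generic helper and not part of this library: the name `throttled` and its `delay` parameter are made up for illustration, and the commented usage assumes only the `Document.match(id=...).first()` call shown in the Usage section.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled(
    items: Iterable[T], fetch: Callable[[T], R], delay: float = 1.0
) -> Iterator[R]:
    """Yield fetch(item) for each item, sleeping `delay` seconds between calls.

    Hypothetical helper for illustration; it is not provided by digitalarchive.
    """
    for index, item in enumerate(items):
        if index > 0:
            time.sleep(delay)  # pause between requests, but not before the first
        yield fetch(item)


# Hypothetical usage (requires network access and the digitalarchive package):
# import digitalarchive
# ids = ["112566"]
# docs = list(throttled(ids, lambda i: digitalarchive.Document.match(id=i).first()))
```

A one-second default is an arbitrary choice; pick a delay that suits how much you intend to download.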
Planned Features
- Support for searching by date range.
- Asynchronous hydration of large result sets.
- For Collections, include keyword hits in short_description for searches. (Modify collection searches to use the record.json endpoint instead of the collection.json endpoint.)
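Until asynchronous hydration lands, something similar might be approximated from user code with a small thread pool. This is a rough sketch, not the planned implementation: it uses threads rather than asyncio, the helper name `hydrate_concurrently` is made up, and it assumes that each document exposes the `hydrate()` method shown in the Usage section and that calling it from several threads is safe, which has not been verified.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def hydrate_concurrently(documents: Iterable, max_workers: int = 4) -> List:
    """Call .hydrate() on each document using a small thread pool.

    Hypothetical helper, not part of digitalarchive. Assumes hydrate()
    is safe to call from multiple threads (unverified).
    Returns the documents in their original order.
    """
    docs = list(documents)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for every call and re-raises the first exception, if any
        list(pool.map(lambda doc: doc.hydrate(), docs))
    return docs


# Hypothetical usage (requires network access and the digitalarchive package):
# import digitalarchive
# docs = digitalarchive.Document.match(name="chernobyl").all()
# hydrate_concurrently(docs, max_workers=2)
```

Keeping `max_workers` low limits the load placed on the DA.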
Download files
Source Distribution
digitalarchive-0.1.1.tar.gz (8.1 kB)