A Python library that makes it easy for users to access data from the Columbia History Lab's database by functioning as a wrapper around the History Lab's API.
History as Data Science
History Lab API
The History Lab focuses on digitizing historical documents and turning them into a format more amenable to the tools of modern data analysis. As part of this, the History Lab has compiled a database of more than 3 million declassified historical documents.
Traversing a large database of this sort can be tedious, though. histlabapi is a Python library that aims to solve this, making it easier for users to access data from the History Lab's database by wrapping the History Lab's API.
Installation and setup
Installation is straightforward with pip. This package is only compatible with Python 3.9+ due to its use of the requests dependency and its reliance on sphinx to generate its documentation.
$ pip install histlabapi
Once installed, you can import the package with this:
from histlabapi import histlabapi
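As a quick sanity check of the setup above, the sketch below confirms that the interpreter meets the 3.9+ requirement and that histlabapi can be found on the import path. It uses only the standard library; the helper names python_is_supported and package_is_installed are illustrative and are not part of histlabapi.

```python
import importlib.util
import sys

MINIMUM_PYTHON = (3, 9)  # taken from the compatibility note in this README


def python_is_supported(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


def package_is_installed(name="histlabapi"):
    """Return True if a top-level package with this name is on the import path."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("histlabapi installed:", package_is_installed())
```

If the second check reports False, running the pip command above in the same environment is the likely fix.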
Usage
Before extracting documents left and right, it's important to get your bearings on how the History Lab stores and structures its documents. To that end, I've compiled a quick guide to the collections and fields that you can access through this API here.
Once that's settled, you can use this package's various functions to extract information in all kinds of ways:
- An overview of all the collections currently available in the API
- Listing all the entities of a certain type that appear across all collections
- Searching and extracting documents by text, entity, date or document ID
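This page does not name the functions behind these capabilities, so one way to find them is to inspect the installed module. The sketch below relies only on the standard library's inspect module; list_public_functions is a hypothetical helper written for this example, not part of histlabapi, and the full documentation remains the authoritative reference for names and parameters.

```python
import inspect


def list_public_functions(module):
    """Return the sorted names of public functions found on a module."""
    return sorted(
        name
        for name, _ in inspect.getmembers(module, inspect.isfunction)
        if not name.startswith("_")
    )


if __name__ == "__main__":
    try:
        # Same import line as shown in the installation section.
        from histlabapi import histlabapi
    except ImportError:
        print("histlabapi is not installed; run `pip install histlabapi` first")
    else:
        for name in list_public_functions(histlabapi):
            print(name)
```

Calling help() on any of the printed names in an interactive session should then show its docstring and parameters.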
Documentation
Full documentation can be accessed at Read the Docs.
Support
Feel free to contact me at dg3279@columbia.edu if you have any questions and/or want to contribute!
License
histlabapi was created by Derrick Gozal. It is licensed under the terms of the MIT license.
Credits
histlabapi was created with cookiecutter and the py-pkgs-cookiecutter template.
Many thanks also to Professor Raymond Hicks and the rest of the History Lab team for all their support in building this package.
File details
Details for the file histlabapi-0.1.1.tar.gz.
File metadata
- Download URL: histlabapi-0.1.1.tar.gz
- Upload date:
- Size: 10.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.6.1 CPython/3.11.4 Windows/10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 4d86a456c9197ad57ab2b4d7b35ccd8fe9c08860b67df5460f2b201fc04827ba
MD5 | 39526c9b090698fbf28c016e6c2f2ba4
BLAKE2b-256 | d8a636d6beccb3560a12d2c476ec24c8572569a89cefba3b179b9ed5cc14c82f
File details
Details for the file histlabapi-0.1.1-py3-none-any.whl.
File metadata
- Download URL: histlabapi-0.1.1-py3-none-any.whl
- Upload date:
- Size: 10.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.6.1 CPython/3.11.4 Windows/10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 778648299c1a3148ba8a3185d0e86872aeae93ec1a0ce9e33358c8a0d713f833
MD5 | 4197d9f224e8c6c04d9fa3edab5ae15c
BLAKE2b-256 | 3fc3f8f31c4b382ebdec6bb6089ad200d8be3263141aa52ae5fb5c8794df6a3c