A Python library that makes it easy for users to access data from the Columbia History Lab's database by functioning as a wrapper around the History Lab's API.

Project description


History as Data Science


History Lab API

The History Lab focuses on digitizing historical documents and turning them into a format more amenable to the tools of modern data analysis. As part of this, the History Lab has compiled a database of more than 3 million declassified historical documents.

Traversing a database of this size can be tedious, though. histlabapi is a Python library that aims to ease this by wrapping the History Lab's API, giving users simpler access to the data in the History Lab's database.

Installation and setup

Installation is straightforward with pip. This package is only compatible with Python 3.9+ because of its requests dependency and its reliance on sphinx to generate its documentation.

$ pip install histlabapi

Once installed, you can import the package with this:

from histlabapi import histlabapi
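
If the import fails, it can help to confirm the interpreter version and whether pip actually installed the package. The following is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are made up for this example and are not part of histlabapi.

```python
import sys
from importlib import metadata

# Minimum version stated in this README.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info):
    """Return True if the given (major, minor, ...) version is 3.9 or newer."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print("Python OK:", python_is_supported())
print("histlabapi version:", installed_version("histlabapi"))
```

A result of `None` for the version suggests that pip installed the package into a different environment than the one running the script.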

Usage

Before extracting documents left and right, it's important to get your bearings on how the History Lab stores and structures its documents. To help with that, I've compiled a quick guide where you can look up the collections and fields that you can access through this API here.

Once that's settled, you can use this package's various functions to extract information in all kinds of ways:

  • Get an overview of all the collections currently available in the API
  • List all the entities of a certain type that appear across all collections
  • Search and extract documents by text, entity, date or document ID
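
To illustrate what "wrapping an API" means for the search cases above, here is a rough sketch of how a wrapper might turn search criteria into a request URL. Everything in it is an assumption made for illustration: the base URL is a placeholder rather than the History Lab's real endpoint, and `build_search_url` and its parameter names are hypothetical, not histlabapi functions. See the package documentation for the actual function names and arguments.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; NOT the real History Lab API URL.
BASE_URL = "https://api.example.org/documents"


def build_search_url(text=None, entity=None, start_date=None,
                     end_date=None, doc_id=None, limit=10):
    """Assemble a query URL from whichever search criteria are supplied.

    All parameter names here are hypothetical.
    """
    params = {
        "text": text,
        "entity": entity,
        "start_date": start_date,
        "end_date": end_date,
        "doc_id": doc_id,
        "limit": limit,
    }
    # Leave out criteria the caller did not set.
    query = {key: value for key, value in params.items() if value is not None}
    return f"{BASE_URL}?{urlencode(query)}"


print(build_search_url(text="cuban missile crisis", limit=5))
```

A wrapper library hides this kind of URL assembly (plus the HTTP request and response parsing) behind ordinary Python function calls, which is the convenience histlabapi aims to provide.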

Documentation

Full documentation is available at Read the Docs.

Support

Feel free to contact me at dg3279@columbia.edu if you have any questions and/or want to contribute!

License

histlabapi was created by Derrick Gozal. It is licensed under the terms of the MIT license.

Credits

histlabapi was created with cookiecutter and the py-pkgs-cookiecutter template. Many thanks also to Professor Raymond Hicks and the rest of the History Lab team for all their support in building this package.

Download files

Source Distribution

histlabapi-0.1.1.tar.gz (10.9 kB view details)

Uploaded Source

Built Distribution

histlabapi-0.1.1-py3-none-any.whl (10.3 kB view details)

Uploaded Python 3

File details

Details for the file histlabapi-0.1.1.tar.gz.

File metadata

  • Download URL: histlabapi-0.1.1.tar.gz
  • Size: 10.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.11.4 Windows/10

File hashes

Hashes for histlabapi-0.1.1.tar.gz
Algorithm Hash digest
SHA256 4d86a456c9197ad57ab2b4d7b35ccd8fe9c08860b67df5460f2b201fc04827ba
MD5 39526c9b090698fbf28c016e6c2f2ba4
BLAKE2b-256 d8a636d6beccb3560a12d2c476ec24c8572569a89cefba3b179b9ed5cc14c82f

File details

Details for the file histlabapi-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: histlabapi-0.1.1-py3-none-any.whl
  • Size: 10.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.6.1 CPython/3.11.4 Windows/10

File hashes

Hashes for histlabapi-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 778648299c1a3148ba8a3185d0e86872aeae93ec1a0ce9e33358c8a0d713f833
MD5 4197d9f224e8c6c04d9fa3edab5ae15c
BLAKE2b-256 3fc3f8f31c4b382ebdec6bb6089ad200d8be3263141aa52ae5fb5c8794df6a3c
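
The SHA256 digests listed for each file can be used to check a manual download. Below is a minimal sketch using Python's standard hashlib module; the helper names `sha256_of` and `matches_published_hash` are made up for this example, and the file path is whatever location you saved the download to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 digest equals the published value."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, calling `matches_published_hash("histlabapi-0.1.1.tar.gz", "<SHA256 value listed for that file>")` should return True for an intact copy of the source distribution.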
