
Code to tidy archivist catalogues into human- and computer-readable formats.

Project description

The Tidy Archive Catalogues project

The Tidy Archive Catalogues project aims to make archivists' catalogues both human- and computer-readable. We want to create a Python package that will:

  1. enable researchers to analyse and visualise a whole catalogue, or any subset of a catalogue that they find interesting.
  2. make checking for mistakes in archiving (typos, or non-compliance with archiving standards) easy for archivists, who work quickly and make many subtle decisions.
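As a rough illustration of the second aim, the sketch below flags pairs of names that are nearly, but not exactly, identical, which is one common sign of a typo. It uses only the standard library's `difflib`. The function name `find_possible_typos`, the 0.85 threshold, and the sample names are assumptions made for this example and are not part of the package.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_possible_typos(names, threshold=0.85):
    """Return (name_a, name_b, similarity) for distinct names that look alike.

    Hypothetical helper: the threshold is an illustrative guess and would
    need tuning against a real catalogue.
    """
    unique = sorted(set(names))
    pairs = []
    for a, b in combinations(unique, 2):
        # Compare case-insensitively so capitalisation alone is not flagged here.
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs


# Made-up sample values standing in for a "name" column of a catalogue.
names = ["Laurence Olivier", "Lawrence Olivier", "Vivien Leigh", "Bristol Old Vic"]
suspects = find_possible_typos(names)
for a, b, score in suspects:
    print(f"Check: {a!r} vs {b!r} (similarity {score})")
```

A pairwise comparison like this grows quadratically with the number of distinct names, so a large archive would likely need blocking or indexing first.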

Road-map

The project currently aims to have the following functionality:

  1. Enable archivists to search their archives for potential typos and mistakes. Specifically, the package will:
     A. Return specific places in the archive to check for mistakes (e.g. specific reference numbers/codes, and columns), particularly for:
        i. Dates
        ii. Named entities (places, people, and businesses)
     B. Flag inconsistencies in the archive according to a chosen set of archiving guidelines. For example, if 'c.', 'c', and 'circa' are all used to mean "approximately", suggest a single option.
        i. According to National Archives guidelines
        ii. According to Theatre Collection guidelines
  2. Enable digital humanities researchers to digitally visualise and analyse the collection, by creating machine-readable and human-readable (for labels, etc.) versions of the following:
     A. Dates, date ranges, and date uncertainty
     B. Named entities (places, people, and businesses)
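The consistency check in item 1 could be sketched along the following lines. This is a minimal illustration, not the package's implementation: the list of variants, the choice of 'c.' as the preferred form, the function name `flag_circa_variants`, and the reference codes are all assumptions, and a real check would take its rules from the chosen archiving guidelines.

```python
import re

# Assumed spellings of "approximately" that appear directly before a number.
CIRCA_PATTERN = re.compile(r"\b(circa|ca\.?|c\.?)\s*(?=\d)", re.IGNORECASE)
# Illustrative preferred form; the real choice depends on the guidelines in use.
PREFERRED = "c."


def flag_circa_variants(records):
    """Return (reference, original, suggestion) for non-preferred circa markers.

    `records` is an iterable of (reference_code, date_text) pairs.
    """
    flagged = []
    for ref, date_text in records:
        suggestion = CIRCA_PATTERN.sub(PREFERRED + " ", date_text)
        if suggestion != date_text:
            flagged.append((ref, date_text, suggestion))
    return flagged


# Hypothetical reference codes and date strings.
records = [
    ("TC/1/1", "c. 1920"),
    ("TC/1/2", "circa 1920"),
    ("TC/1/3", "c1920"),
    ("TC/1/4", "1920"),
]
flagged = flag_circa_variants(records)
for ref, original, suggestion in flagged:
    print(f"{ref}: {original!r} -> suggest {suggestion!r}")
```

Returning the reference code with each suggestion keeps the decision with the archivist: the sketch only points at places to check and does not rewrite the catalogue.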

The output of (2) should provide the data in a format that works readily with existing Python/R libraries.

For example, dates should be in a datetime format, and places should be either points in longitude/latitude or polygonal areas (e.g. in GeoJSON), or be easy to convert to these. There should be tutorial notebooks for creating simple visualisations of:
  A. Social networks
  B. Geographical areas
  C. Timelines
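To make the target formats concrete, here is a small sketch of what such output might look like, using only the standard library. The helper names `parse_year_range` and `to_geojson_point` are hypothetical, the date parser only handles four-digit years with an optional circa marker, and the coordinates for Bristol are approximate values chosen for illustration.

```python
import json
import re
from datetime import date


def parse_year_range(text):
    """Parse strings such as 'c. 1920' or '1920-1925'.

    Returns (start_date, end_date, approximate), or None if no year is found.
    Hypothetical helper covering year-level dates only.
    """
    approximate = bool(
        re.match(r"\s*(circa|ca\.?|c\.?)\s*(?=\d)", text, re.IGNORECASE)
    )
    years = [int(y) for y in re.findall(r"(?<!\d)\d{4}(?!\d)", text)]
    if not years:
        return None
    # Represent a year (or span of years) as its first and last calendar day.
    return date(min(years), 1, 1), date(max(years), 12, 31), approximate


def to_geojson_point(name, longitude, latitude):
    """Build a GeoJSON Feature; GeoJSON orders coordinates as [longitude, latitude]."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": {"name": name},
    }


print(parse_year_range("c. 1920"))
print(parse_year_range("1920-1925"))

# Approximate, illustrative coordinates.
feature = to_geojson_point("Bristol", -2.59, 51.45)
print(json.dumps(feature))
```

Keeping the uncertainty flag separate from the start and end dates means the machine-readable values can go straight into a timeline, while the flag remains available for human-readable labels.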

Both (1) and (2) should be tested on large archives (e.g. British Library and The National Collection) and on smaller ones (e.g. The Theatre Collection and the Harry Ransom Centre).

Contributors

Contributors are recognised using all-contributors guidelines.

Natalie Thurlby, Julian Warren, Jo Elseworth, Elaine McGirr, Emma Howgill

Download files

Source Distribution

tidy-archive-catalogues-0.0.1.tar.gz (5.3 kB)

Uploaded Source

Built Distribution

tidy_archive_catalogues-0.0.1-py3-none-any.whl (5.8 kB)

Uploaded Python 3

File details

Details for the file tidy-archive-catalogues-0.0.1.tar.gz.

File metadata

  • Download URL: tidy-archive-catalogues-0.0.1.tar.gz
  • Size: 5.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/2.7.15

File hashes

Hashes for tidy-archive-catalogues-0.0.1.tar.gz
Algorithm Hash digest
SHA256 b5a904b41e031443ba18156d497a8aa9ad705be05635183c972f0555cd179721
MD5 583b86fc33805371ea74db35a1a544e9
BLAKE2b-256 e1e5d87f744aacc6930a6df63437284726e9656ea27bbd6c9a75fc8a39e12120

File details

Details for the file tidy_archive_catalogues-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: tidy_archive_catalogues-0.0.1-py3-none-any.whl
  • Size: 5.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.32.1 CPython/2.7.15

File hashes

Hashes for tidy_archive_catalogues-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 cfcea243dba014e1196bfcf1fa36e2c64c5e50c4d7ee70b2353404d3cf1d012a
MD5 f5d9a6ddc0894b969b7dc3337654112a
BLAKE2b-256 5f0b3ca508dbe34dddbba36b3b1f15e420ef4be1757bde656a73578742922448
