Text processing and analysis for HathiTrust Research Center

htrc-text-processing Library [Under Development]

Table of Contents

  1. What is htrc-text-processing Library
  2. How to Install/Use
  3. Usage
  4. Examples
  5. What else?

About htrc-text-processing Library

Description goes here.

How to Install

Currently, the only way to install the library is to download the htrc_text_processing folder and place it in your working directory.

The easiest way is to clone the repo and run example1.py.

TODO: create a pip-installable version (once all functionality is implemented).

What you can do with this

The library provides a function that finds the zip files at the end of a pairtree, moves them to a new folder, and expands them, removing the zips afterwards.

import htrc_text_processing as htrc_tp

# Expand each zip file separately into the given output folder
htrc_tp.get_zips_extract('sample-pairtree-data-parent/sample-pairtree-data', 'output_unziped_files')

# If you only need the zip files themselves (not expanded), use this function instead
htrc_tp.get_zips_only('pairtree-data', 'output_only_zip_files')
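
For readers curious about what "finding the zips at the end of the pairtree and expanding them" involves, here is a minimal standard-library sketch of the idea. It is not the library's implementation: the function name `extract_pairtree_zips` and the one-subfolder-per-zip layout are assumptions made for illustration, and unlike the behaviour described above it leaves the original zip files in place.

```python
import zipfile
from pathlib import Path


def extract_pairtree_zips(pairtree_root, output_dir):
    """Hypothetical helper (not part of htrc_text_processing).

    Expands every zip found under pairtree_root into its own
    subfolder of output_dir and returns the list of subfolders.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    # rglob searches recursively, so zips at any pairtree depth are found.
    for zip_path in sorted(Path(pairtree_root).rglob("*.zip")):
        # Assumption: one output subfolder per zip, named after the zip file.
        target = out / zip_path.stem
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(target)
        extracted.append(target)
    return extracted
```

Calling `extract_pairtree_zips('pairtree-data', 'output')` would return the list of folders it created, which makes it easy to check that every volume was expanded.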
