A package for downloading bulk files from CourtListener
Project description
Easy Bulk export, no cap
This repository provides scripts and notebooks that make it easy to export data in bulk from CourtListener's freely available downloads.
- Create a first version of the notebook suitable for data scientists
- Create the appropriate dtypes to optimize pandas storage
- Select only the necessary columns with usecols; for example, the 'created_by' date field, which only records a database insert, isn't needed
- Read opinions.csv (190+ GB) from disk one chunk at a time while converting it to JSON
- Create a standalone script that can be piped to other tools
- Create PyPI library using Poetry: package
- Output from the script in JSON Lines format
- Improve speed by using a Dask DataFrame
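Several of the items above (explicit dtypes, usecols, chunked reads, JSON Lines output) can be illustrated together. The following is a minimal sketch using pandas, not this package's actual implementation. The function name `csv_to_jsonl`, the column names, and the dtypes are assumptions chosen for illustration; the real opinions.csv schema may differ.

```python
import io
import json

import pandas as pd

# Hypothetical columns and dtypes, for illustration only; the real
# opinions.csv schema may differ.
USECOLS = ["id", "type", "plain_text"]
DTYPES = {"id": "int32", "type": "category"}


def csv_to_jsonl(source, sink, chunksize=2):
    """Stream a CSV into JSON Lines one chunk at a time; return rows written."""
    rows = 0
    # chunksize makes read_csv return an iterator of DataFrames, so the
    # whole file never has to fit in memory at once.
    reader = pd.read_csv(source, usecols=USECOLS, dtype=DTYPES, chunksize=chunksize)
    for chunk in reader:
        # orient="records" with lines=True emits one JSON object per row.
        text = chunk.to_json(orient="records", lines=True)
        sink.write(text if text.endswith("\n") else text + "\n")
        rows += len(chunk)
    return rows


# Tiny in-memory stand-in for a very large CSV on disk.
sample = io.StringIO(
    "id,created_by,type,plain_text\n"
    "1,2020-01-01,lead,First opinion\n"
    "2,2020-01-02,lead,Second opinion\n"
    "3,2020-01-03,dissent,Third opinion\n"
)
out = io.StringIO()
written = csv_to_jsonl(sample, out)
records = [json.loads(line) for line in out.getvalue().splitlines()]
```

In a standalone script, `sink` could be `sys.stdout` so the output can be piped to other tools, and `chunksize` would typically be much larger than the toy value used here.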
Download files
Source Distribution
lil_nocap-0.5.4.tar.gz (11.3 kB)
Built Distribution
lil_nocap-0.5.4-py3-none-any.whl (13.7 kB)
File details
Details for the file lil_nocap-0.5.4.tar.gz.
File metadata
- Download URL: lil_nocap-0.5.4.tar.gz
- Upload date:
- Size: 11.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.3.2 CPython/3.10.10 Darwin/21.5.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8ab8b4a2c1f087af857f643aab445f55038686eef777caf7526a83d953a3665a
MD5 | 68fd2c5e9a1da7d5d8556432961cd55c
BLAKE2b-256 | 56786056a109a171c976f36de98aa1b804846373107460e97d14a137dc75a4d2
File details
Details for the file lil_nocap-0.5.4-py3-none-any.whl.
File metadata
- Download URL: lil_nocap-0.5.4-py3-none-any.whl
- Upload date:
- Size: 13.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.3.2 CPython/3.10.10 Darwin/21.5.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9f68262713d2837b3573f931a6961e5a9dfb9883e4171a60f3781afc236df01e
MD5 | 666731e084acb7508a327115ee8637cf
BLAKE2b-256 | 97be6313b256f9111442e38f3ceb0d2ff3ce1bd36c8702d8374da0189754a007