Creates a JSON file of the Library of Congress Classification system
Library of Congress Classification to JSON
lcc2json outputs a single JSON file of the
Library of Congress Classification system.
For input, at the default crawl depth of 2, it downloads 699 .json files,
14 megabytes in total, from the Library of Congress.
Install
Install from PyPI:
pip install lcc2json
Or install from the main source repository:
git clone https://spacecruft.org/books/lcc2json
cd lcc2json/
python -m venv venv
source venv/bin/activate
pip install -U setuptools pip wheel
pip install -e .
Usage
Run the two commands in order.
Download the source JSON files from the Library of Congress:
lcc2json-dl
Parse the downloaded JSON files and output a single JSON file:
lcc2json
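The two steps can also be driven from Python. The following is a minimal sketch, not part of lcc2json itself: `build_commands` and `run_pipeline` are hypothetical helper names, and the only options used are the ones listed in the Help section (`-o`, `--max-depth`, `-i`). It assumes both commands are on `PATH` and exit with a non-zero status on failure.

```python
import subprocess


def build_commands(data_dir="json", output="lcc.json", max_depth=2):
    """Return the two command lines: download first, then parse.

    Hypothetical helper; defaults mirror the documented CLI defaults.
    """
    download = ["lcc2json-dl", "-o", data_dir, "--max-depth", str(max_depth)]
    parse = ["lcc2json", "-i", data_dir, "-o", output]
    return [download, parse]


def run_pipeline(**kwargs):
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_commands(**kwargs):
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline(data_dir="lcc_data", max_depth=2)` would download into `./lcc_data/` and then write `lcc.json` from those files.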
Help
Download script help:
$ lcc2json-dl --help
usage: lcc2json-dl [-h] [-o OUTPUT_DIR] [-d MAX_DEPTH] [-v] [--dry-run]
Download Library of Congress Classification JSON files from id.loc.gov
options:
-h, --help show this help message and exit
-o, --output-dir OUTPUT_DIR
Output directory for JSON files (default: json)
-d, --max-depth MAX_DEPTH
Maximum depth to crawl (default: 2)
-v, --verbose Enable verbose logging
--dry-run Show what would be downloaded without actually downloading
Examples:
lcc2json-dl # Download all classifications to ./json/ (depth 2)
lcc2json-dl --max-depth 4 # Download to depth 4 (includes subdivisions)
lcc2json-dl -o lcc_data # Download to ./lcc_data/
lcc2json-dl -v # Verbose output
lcc2json-dl --dry-run # Show what would be downloaded
Depth levels:
0 = Root classification scheme
1 = Main classes (A-Z)
2 = Subclass ranges (e.g., PR1-PR9680) [default]
3 = Period/topic divisions (e.g., PR6050-PR6076)
4 = Alphabetical ranges (e.g., PR6066.A-PR6066.Z)
5+ = Individual entries (e.g., PR6066.A84)
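For scripts that pick a `--max-depth` value, the depth levels above can be kept as a small lookup table. This is only an illustrative sketch: `DEPTH_LEVELS` and `describe_depth` are hypothetical names, and the descriptions are copied from the help text.

```python
# Depth descriptions copied from the lcc2json-dl help text.
DEPTH_LEVELS = {
    0: "Root classification scheme",
    1: "Main classes (A-Z)",
    2: "Subclass ranges (e.g., PR1-PR9680)",
    3: "Period/topic divisions (e.g., PR6050-PR6076)",
    4: "Alphabetical ranges (e.g., PR6066.A-PR6066.Z)",
}
DEFAULT_DEPTH = 2  # documented default for --max-depth


def describe_depth(depth=DEFAULT_DEPTH):
    """Return the description for a crawl depth; 5 and above share one label."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return DEPTH_LEVELS.get(depth, "Individual entries (e.g., PR6066.A84)")
```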
Output JSON script help:
$ lcc2json --help
usage: lcc2json [-h] [-i INPUT_DIR] [-o OUTPUT] [-v] [--ranges]
Extract LCC outlines from downloaded JSON files.
options:
-h, --help show this help message and exit
-i, --input-dir INPUT_DIR
Directory containing JSON files (default: json)
-o, --output OUTPUT Output file path (default: lcc.json)
-v, --verbose Enable verbose output
--ranges Include start/stop/prefix range fields in output (larger file size)
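Once `lcc2json` has written its output (default `lcc.json`), a quick sanity check is to load the file and look at its top-level shape. The sketch below deliberately assumes nothing about the output schema, which this README does not describe; `summarize_json` is a hypothetical helper name.

```python
import json
from pathlib import Path


def summarize_json(path="lcc.json"):
    """Load a JSON file and report its top-level type and item count.

    Returns (type_name, length); length is None for scalar values.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    size = len(data) if isinstance(data, (dict, list)) else None
    return type(data).__name__, size
```

Comparing the reported count for a run with and without `--ranges` is one way to confirm that the flag only adds fields rather than entries, though that is an expectation to check, not something stated above.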
JSON Data
Depth 1
- 21 files.
- ~5 second download.
- 215K size.
- 21 classification entries.
Depth 2
- 698 files.
- 2 minute download.
- 14M size.
- 14,786 classification entries.
- 516 unique prefixes.
Depth 3
- 14,581 files.
- 2 hour download.
- 161M size.
- 101,699 classification entries.
Depth 4
- 100,551 files.
- 14 hour download.
- 824M size.
- 344,073 classification entries.
- Two missing (404) files.
Depth 5
- 342,499 files.
- 2 day download.
- 2.9G size.
- 766,892 classification entries.
- Three missing files.
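To compare the depths, the figures above can be turned into a download rate and a mean file size. This is a rough sketch built on the rounded numbers in the lists, and it assumes K/M/G are decimal units (1 K = 1,000 bytes), which the source does not state.

```python
# (files, download time in seconds, total size in bytes), copied from the
# lists above. Assumption: K/M/G are decimal units.
STATS = {
    1: (21, 5, 215_000),
    2: (698, 2 * 60, 14_000_000),
    3: (14_581, 2 * 3600, 161_000_000),
    4: (100_551, 14 * 3600, 824_000_000),
    5: (342_499, 2 * 86_400, 2_900_000_000),
}


def files_per_second(depth):
    """Average number of files fetched per second at this depth."""
    files, seconds, _ = STATS[depth]
    return files / seconds


def mean_file_size_kb(depth):
    """Average size of one downloaded file, in kilobytes."""
    files, _, size = STATS[depth]
    return size / files / 1000


for depth in sorted(STATS):
    print(
        f"depth {depth}: {files_per_second(depth):.1f} files/s, "
        f"{mean_file_size_kb(depth):.1f} kB/file"
    )
```

With these rounded inputs the rate settles near two files per second from depth 3 onward, so download time grows roughly in proportion to the file count.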
Downloads
JSON data snapshots are also available at this URL, so downloading with this script is optional:
📄 License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Copyright © 2025 Jeff Moe
File details
Details for the file lcc2json-1.3.0.tar.gz.
File metadata
- Download URL: lcc2json-1.3.0.tar.gz
- Upload date:
- Size: 18.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 820e1bf2081d28fcc35e31ac224436880a6a966bad08d28ac4f94303fa04e94e |
| MD5 | 249de70a8c8f135739add8c783db1bc6 |
| BLAKE2b-256 | 05f8bc4c86df474b1fbedb25ac8f8fb7e89f13e142c3e506873e9be53a1feb37 |
File details
Details for the file lcc2json-1.3.0-py3-none-any.whl.
File metadata
- Download URL: lcc2json-1.3.0-py3-none-any.whl
- Upload date:
- Size: 17.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ed174d94cba6ae65d787ae5ec8b274692b6d85255dae07d85fe1350e2a951b7d |
| MD5 | 0d91496c74fac1c1168102eb54a56972 |
| BLAKE2b-256 | 978363a422f000037c019eae7fbd91536cca754a71f288b8ecd587b8f9ea438b |
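A manually downloaded release file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the standard library; `sha256_of`, `verify`, and `EXPECTED` are hypothetical names, and the digests are copied from the file details for release 1.3.0.

```python
import hashlib

# SHA256 digests as listed in the file details for release 1.3.0.
EXPECTED = {
    "lcc2json-1.3.0.tar.gz":
        "820e1bf2081d28fcc35e31ac224436880a6a966bad08d28ac4f94303fa04e94e",
    "lcc2json-1.3.0-py3-none-any.whl":
        "ed174d94cba6ae65d787ae5ec8b274692b6d85255dae07d85fe1350e2a951b7d",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """True if the file at `path` matches the listed digest for `name`."""
    return sha256_of(path) == EXPECTED[name]
```

For example, `verify("lcc2json-1.3.0.tar.gz", "lcc2json-1.3.0.tar.gz")` should return `True` for an intact copy of the source distribution.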