Automatic generation of codebooks from dataframes.
Codebooks
Automatically generate codebooks from dataframes. Includes methods to:
- Infer variable type (as unique key, indicator, categorical, or continuous).
- Summarize values with histograms and KDEs.
- Generate a self-contained HTML report (may be extended to PDF or other formats in the future).
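The four variable types listed above can be illustrated with a small sketch. This is not the package's actual inference logic: the function name `infer_variable_type`, the `max_categories` threshold, and the specific rules are hypothetical, chosen only to show what such a classification might look like for one column of raw CSV values.

```python
def _parses(value, cast):
    """Return True if cast(value) succeeds (cast is int or float)."""
    try:
        cast(value)
        return True
    except (TypeError, ValueError):
        return False


def infer_variable_type(values, max_categories=20):
    """Classify one column of raw values with a hypothetical heuristic.

    Returns "empty", "indicator", "unique key", "continuous", or "categorical".
    The rules and the max_categories threshold are assumptions for illustration.
    """
    # Ignore missing entries (None or blank strings).
    present = [str(v).strip() for v in values if v is not None and str(v).strip()]
    if not present:
        return "empty"

    distinct = set(present)
    numeric = all(_parses(v, float) for v in distinct)
    all_int = all(_parses(v, int) for v in distinct)

    # At most two distinct values: treat as an indicator (e.g. 0/1, yes/no).
    if len(distinct) <= 2:
        return "indicator"
    # Every value differs and looks like an ID (integer or text), not a measurement.
    if len(distinct) == len(present) and (all_int or not numeric):
        return "unique key"
    # Numeric with many distinct values: treat as continuous.
    if numeric and len(distinct) > max_categories:
        return "continuous"
    return "categorical"


print(infer_variable_type(["0", "1", "1", "0"]))
print(infer_variable_type([str(i) for i in range(100)]))
print(infer_variable_type(["a", "b", "c", "a", "b"]))
print(infer_variable_type([str(i * 0.5 + 0.25) for i in range(50)]))
```

A real implementation would likely work on dataframe columns and their dtypes rather than raw strings. The string-based version is used here only to keep the sketch dependency-free.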
Usage:
codebooks -o output.html input.csv
Adding variable descriptions
You can specify a CSV file that maps variable names to descriptions using:
codebooks --desc descriptions.csv -o output.html input.csv
The CSV file is expected to have two columns: variable name and description.
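To check a descriptions file before passing it to `--desc`, a short script along these lines may help. It is only a sketch: the helper name `load_descriptions` and the sample variable names are hypothetical, and it assumes the file has no header row, which the text above does not specify.

```python
import csv
import io


def load_descriptions(lines):
    """Map variable name -> description from two-column CSV rows.

    Assumption: there is no header row. Rows with fewer than two
    columns (including blank lines) are skipped.
    """
    descriptions = {}
    for row in csv.reader(lines):
        if len(row) < 2:
            continue
        descriptions[row[0].strip()] = row[1].strip()
    return descriptions


# Hypothetical file contents; a description containing a comma must be quoted.
sample = io.StringIO(
    'age,Age in years at enrollment\n'
    'sex,"Sex, as recorded at intake"\n'
)
print(load_descriptions(sample))
```

For a file on disk, the same function can be called as `load_descriptions(open("descriptions.csv", newline=""))`.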
License
3-Clause BSD (see LICENSE)
Tests
The test/ subdirectory contains a script to generate a synthetic data set, an integration test for the codebooks package, and a benchmark script used to test performance optimizations. You can run these with:
cd test
python dataset.py
codebooks dataset.csv
python benchmark.py
Authors
Mark Howison
http://mark.howison.org