Automatic generation of codebooks from dataframes.
Codebooks
Automatically generate codebooks from dataframes. Includes methods to:
- Infer variable type (as unique key, indicator, categorical, or continuous).
- Summarize values with histograms and KDEs.
- Generate a self-contained HTML report (may be extended to PDF or other formats in the future).
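The type-inference step could look roughly like the sketch below. It is an illustration only: the function name `infer_variable_type`, the `max_categories` threshold, and the rules themselves are assumptions, not the logic the codebooks package actually uses. To stay self-contained it takes a column as a plain list of values rather than a dataframe column.

```python
from numbers import Real


def infer_variable_type(values, max_categories=20):
    """Classify a column as 'unique key', 'indicator', 'categorical', or 'continuous'.

    Illustrative heuristic only: the rules and the max_categories threshold
    are assumptions, not the logic used by the codebooks package.
    """
    # Ignore missing values (None) before counting distinct values.
    present = [v for v in values if v is not None]
    distinct = set(present)
    # Numbers count as numeric; booleans are excluded on purpose.
    is_numeric = all(isinstance(v, Real) and not isinstance(v, bool) for v in present)
    # Fractional floats suggest a measurement rather than an identifier.
    has_fraction = any(isinstance(v, float) and not v.is_integer() for v in present)

    if len(distinct) <= 2:
        return "indicator"
    if len(distinct) == len(present) and not has_fraction:
        return "unique key"
    if is_numeric and len(distinct) > max_categories:
        return "continuous"
    return "categorical"


# Hypothetical columns, for illustration only.
columns = {
    "id": [101, 102, 103, 104],
    "flag": [0, 1, 1, 0],
    "color": ["r", "g", "b", "r"],
    "height": [1.5, 2.25, 3.1, 4.7],
}
inferred = {name: infer_variable_type(col, max_categories=3) for name, col in columns.items()}
```

The order of the checks matters in this sketch: a column with at most two distinct values is treated as an indicator before the uniqueness test runs, so a two-row key column would be misclassified. A real implementation would need more careful rules.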
Usage:
codebooks -o output.html input.csv
Adding variable descriptions
You can supply a CSV file that maps variable names to descriptions:
codebooks --desc descriptions.csv -o output.html input.csv
The CSV file is expected to have two columns (variable, description).
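As a minimal sketch, a descriptions file in this two-column layout can be produced with Python's standard csv module. The helper name `write_descriptions`, the example variables, and the presence of a header row are assumptions; the text above only states that the file has two columns (variable, description).

```python
import csv
import io

# Hypothetical variable names and descriptions, for illustration only.
descriptions = {
    "id": "Unique record identifier",
    "height": "Height in meters",
}


def write_descriptions(handle, mapping):
    """Write a two-column CSV mapping variable names to descriptions."""
    writer = csv.writer(handle)
    # Header row is an assumption; check what the tool expects.
    writer.writerow(["variable", "description"])
    writer.writerows(mapping.items())


# Write to an in-memory buffer here; to create a real file, pass
# open("descriptions.csv", "w", newline="") instead.
buffer = io.StringIO()
write_descriptions(buffer, descriptions)
print(buffer.getvalue())
```

Using csv.writer rather than joining strings by hand means descriptions that contain commas or quotes are escaped correctly.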
License
3-Clause BSD (see LICENSE)
Tests
The test/ subdirectory contains a script to generate a synthetic data set, an integration test for the codebooks package, and a benchmark script used to test performance optimizations. You can run these with:
cd test
python dataset.py
codebooks --desc desc.csv dataset.csv
python benchmark.py
Authors
Mark Howison
http://mark.howison.org