data-science-utils
Utility code for data science projects, covering data cleaning, ETL, EDA, NLP, visualization, feature engineering, feature selection, and model validation.
Project Organization
├── README.md <- The top-level README for developers using this project.
├── gists <- Gists of commonly used code (change to the root
│ directory, connect to a database, profile data, etc.)
├── io <- Code for input/output utilities
├── etl <- For building reproducible ETL pipelines, including data
│ checks and transformers
├── ml <- Machine learning utility code (feature engineering, etc.)
├── pandas <- Pandas related utility code
│ ├── analysis
│ ├── cleaning
│ ├── engineering
│ ├── text
│ ├── datetime
│ ├── optimization
│ └── profiling
├── text <- Code for working with text. Includes distributed loading of a text corpus,
│ entity statement extraction, sentiment analysis, etc.
├── __init__.py <- Makes data_science_utils a Python package
├── project_utils.py <- For project specific utilities
└── LICENSE
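The "change to root directory" gist mentioned in the tree above is not shown in this README, so the following is only a minimal sketch of what such a helper might look like. The function names `find_project_root` and `change_to_project_root`, and the choice of `README.md` and `.git` as root markers, are assumptions made for illustration and are not taken from the package.

```python
import os
from pathlib import Path


def find_project_root(start=None, markers=("README.md", ".git")):
    """Walk upward from `start` (default: cwd) to the nearest directory
    that contains one of the marker files or directories.

    Hypothetical helper: the marker names are an assumption, not the
    package's own convention.
    """
    current = Path(start or Path.cwd()).resolve()
    # Check the starting directory first, then each parent in turn.
    for candidate in (current, *current.parents):
        if any((candidate / marker).exists() for marker in markers):
            return candidate
    raise FileNotFoundError(f"No project root found at or above {current}")


def change_to_project_root(start=None, markers=("README.md", ".git")):
    """Change the working directory to the project root and return it."""
    root = find_project_root(start, markers)
    os.chdir(root)
    return root
```

Calling `change_to_project_root()` at the top of a notebook stored in a subdirectory would let relative paths such as `data/raw.csv` resolve from the repository root, which is presumably the purpose of the gist.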
Download files
Source Distribution
data_science_toolbox-0.1.0.tar.gz (77.1 kB)
Built Distribution
data_science_toolbox-0.1.0-py3-none-any.whl
Hashes for data_science_toolbox-0.1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 7508ee4f4597fa210c7260ef81be511fc7a00a2e4143a46bd25ea15fec5d0b07
MD5 | 07822e95e5db2264b0e72d8c9d6d27b7
BLAKE2b-256 | bae76e8ce46effd096d6e59a826dd294ffecdf055296b68da9435d71e8fcd37c
Hashes for data_science_toolbox-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | a859a543da84872a8f2b35c6967d61a0eb4a94c787e99c2cf2d69a2a0581e2a5
MD5 | a8ec59b05ed03a2f6e6642e3cdac62b7
BLAKE2b-256 | d197d69a5444b060d4973328204bdbcaedf41187de950e00ed7d798013b97310