data_science_toolbox
=====================
Utility code for data science projects, covering tasks such as data cleaning, ETL, exploratory data analysis (EDA), NLP, visualization, feature engineering, feature selection, and model training and validation.
Installation
Using pip
Install with the pip package manager by running:

    pip install data-science-toolbox
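To confirm that the install succeeded, one option is to ask Python for the installed distribution's version. The snippet below is a minimal sketch, not part of the package: `installed_version` is a hypothetical helper name, and it relies on the standard-library `importlib.metadata` module, which assumes Python 3.8 or newer.

```python
from importlib import metadata


def installed_version(distribution_name="data-science-toolbox"):
    """Return the installed version string, or None if the distribution is absent.

    Hypothetical helper for illustration; the default name is the one
    passed to `pip install` above.
    """
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("data-science-toolbox is not installed in this environment")
else:
    print(f"data-science-toolbox {version} is installed")
```

Looking up the distribution metadata avoids importing the package itself, so the check works even if an optional dependency is missing.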
Project Organization
├── README.md
├── data_science_toolbox      <- Project source code
│   │
│   ├── gists                 <- Gists of commonly used code (change to root
│   │                            directory, connect to database, profile data, etc.)
│   ├── data_checks           <- Code for data checks and assertions
│   ├── io                    <- Code for input/output utilities
│   ├── etl                   <- For building reproducible ETL pipelines, including data
│   │                            checks and transformers
│   ├── ml                    <- Machine learning utility code (feature engineering, etc.)
│   ├── pandas                <- Pandas-related utility code
│   │   ├── analysis
│   │   ├── cleaning
│   │   ├── engineering
│   │   ├── text
│   │   ├── datetime
│   │   ├── optimization
│   │   └── profiling
│   ├── project_utils.py      <- Project-specific utilities
│   │
│   ├── text                  <- Code for working with text, including distributed loading of a
│   │                            text corpus, entity statement extraction, sentiment analysis,
│   │                            PII removal, etc.
│   └── __init__.py           <- Makes data_science_toolbox a Python package
├── tests                     <- Pytest unit tests
├── dist                      <- Source tarballs (.tar.gz) and wheels (.whl) of version builds
├── LICENSE
├── poetry.lock
└── pyproject.toml
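Once the package is installed, the layout above can be compared against what is actually importable. The sketch below uses only the standard library and makes no claims about the package's API: `list_submodules` is a hypothetical helper, and the import name `data_science_toolbox` is assumed from the source directory shown in the tree. The result depends on the installed version, so no expected output is shown.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return the sorted names of a package's direct submodules and subpackages.

    Hypothetical helper for illustration. Returns an empty list when the
    package cannot be imported or is a plain module without a search path.
    """
    try:
        package = importlib.import_module(package_name)
    except ImportError:
        return []
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(info.name for info in pkgutil.iter_modules(search_path))


# Assumed import name, taken from the source directory in the tree above.
for name in list_submodules("data_science_toolbox"):
    print(name)
```

Note that this imports the top-level package, so it prints nothing if the package or one of its import-time dependencies is missing.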
File details
Details for the file data_science_toolbox-0.1.4.tar.gz.
File metadata
- Download URL: data_science_toolbox-0.1.4.tar.gz
- Upload date:
- Size: 66.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/0.12.17 CPython/3.6.7 Windows/10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c3d0f2b3ea88c3386768c1801076c18737b10a73076cb1aaa3acfdf5ca4af111 |
| MD5 | a70ce9d8abe8f0e7a3ce3d1cf7e950f1 |
| BLAKE2b-256 | 81a6e139ab0173b9ed288db43ebb8e3e013c013ad0efe8c4cb446256f04f1024 |
File details
Details for the file data_science_toolbox-0.1.4-py3-none-any.whl.
File metadata
- Download URL: data_science_toolbox-0.1.4-py3-none-any.whl
- Upload date:
- Size: 109.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/0.12.17 CPython/3.6.7 Windows/10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e7b8cc6e6b9a24a7db992b70fee02405bb42f24944d7f59f9e991312de891959 |
| MD5 | cb110de958d84187d5402df29e1e55f4 |
| BLAKE2b-256 | a844d6e1c92c87bb01f507e5fdea96d84eeec6f06046614b057962b0eea0a7ff |
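The SHA256 digests listed above can be used to check that a downloaded file is intact. The following is a minimal sketch using the standard-library `hashlib` module. The helper names `sha256_of_file` and `matches_published_digest` are hypothetical, and the local file path in the usage comment is an assumption about where the wheel was saved.

```python
import hashlib

# SHA256 digest published above for data_science_toolbox-0.1.4-py3-none-any.whl.
WHEEL_SHA256 = "e7b8cc6e6b9a24a7db992b70fee02405bb42f24944d7f59f9e991312de891959"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming the wheel was downloaded to the current directory:
#   matches_published_digest(
#       "data_science_toolbox-0.1.4-py3-none-any.whl", WHEEL_SHA256
#   )
```

Reading in chunks keeps memory use flat regardless of file size. pip can perform a similar check itself when a requirements file pins hashes and is installed with `--require-hashes`.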