data_science_toolbox
====================

A collection of utilities for data science projects, covering data cleaning, ETL, EDA, NLP, visualization, feature engineering, feature selection, and model training and validation.

Installation

Using pip

Install the package with the pip package manager by running:

pip install data-science-toolbox
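
To confirm that the install succeeded, one option is to query the installed distribution's version from Python. This is a minimal sketch that uses only the standard library. The helper name `installed_version` is hypothetical and is not part of data-science-toolbox; the distribution name is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "data-science-toolbox") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not provided by the package itself.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("data-science-toolbox is not installed in this environment")
    else:
        print(f"data-science-toolbox version: {version}")
```

Returning `None` rather than raising keeps the check usable in setup scripts that need to decide whether to run the install step.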

Project Organization


├── README.md              
├── data_science_toolbox   <- Project source code
│   │
│   ├── gists              <- Code gists with commonly used code (change to root
│   │                         directory, connect to database, profile data, etc)
│   ├── data_checks        <- Code for data checks and assertions
│   ├── io                 <- Code for input/output utilities
│   ├── etl                <- For building reproducible ETL pipelines, including data
│   │                         checks and transformers
│   ├── ml                 <- Machine Learning utility code (feature engineering, etc) 
│   ├── pandas             <- Pandas related utility code
│   │   ├── analysis                  
│   │   ├── cleaning
│   │   ├── engineering
│   │   ├── text    
│   │   ├── datetime     
│   │   ├── optimization       
│   │   └── profiling   
│   ├── project_utils.py   <- For project specific utilities
│   │
│   ├── text               <- Code for working with text, including distributed loading of a text
│   │                         corpus, entity statement extraction, sentiment analysis, PII removal, etc.
│   └── __init__.py        <- Makes data_science_toolbox a Python package
├── tests                  <- Pytest unit tests 
├── dist                   <- Source tarballs (.tar.gz) and wheels (.whl) for each version build
├── LICENSE
├── poetry.lock
└── pyproject.toml 
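
The layout above can be compared against an installed copy by listing the package's top-level submodules. The sketch below assumes the import name is `data_science_toolbox`, as the tree suggests. The helper name `list_submodules` is hypothetical, and the names it returns depend on the installed version, so they may not match the tree exactly.

```python
import importlib
import pkgutil
from typing import List


def list_submodules(package_name: str = "data_science_toolbox") -> List[str]:
    """List the top-level submodules and subpackages of an importable package.

    Hypothetical helper for illustration; it makes no assumption about the
    functions that each submodule exposes.
    """
    try:
        package = importlib.import_module(package_name)
    except ImportError:
        # The package (or one of its dependencies) is not importable here.
        return []

    # Only packages have __path__; plain modules have no submodules to list.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []

    return sorted(info.name for info in pkgutil.iter_modules(search_path))


if __name__ == "__main__":
    for name in list_submodules():
        print(name)
```

Discovering names this way avoids hard-coding module paths from the tree, which may change between releases.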


Files for data-science-toolbox, version 0.1.4

Filename                                      Size       File type   Python version
data_science_toolbox-0.1.4-py3-none-any.whl   109.2 kB   Wheel       py3
data_science_toolbox-0.1.4.tar.gz             66.9 kB    Source      None
