data_science_toolbox
====================

A collection of code to support data science projects, covering data cleaning, ETL, EDA, NLP, visualization, feature engineering, feature selection, and model training and validation.

Installation

Using pip

Install the package with the pip package manager by running:

pip install data-science-toolbox
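
After installing, a quick check that the package is importable can catch environment mix-ups (for example, installing into a different virtual environment than the one in use). The following is a minimal sketch, assuming the import name matches the `data_science_toolbox` source directory shown under Project Organization; the helper name `is_installed` is made up for illustration and is not part of the package.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the current import path."""
    # find_spec returns None (rather than raising) when a top-level name is missing.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumed import name, taken from the source directory in the project tree.
    name = "data_science_toolbox"
    print(f"{name} installed: {is_installed(name)}")
```

Because the check only looks up the module spec, it does not execute any of the package's code.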

Project Organization


├── README.md              
├── data_science_toolbox   <- Project source code
│   │
│   ├── gists              <- Gists of commonly used code (change to root
│   │                         directory, connect to database, profile data, etc.)
│   ├── data_checks        <- Code for data checks and assertions
│   ├── io                 <- Code for input/output utilities
│   ├── etl                <- For building reproducible ETL pipelines, including data
│   │                         checks and transformers
│   ├── ml                 <- Machine learning utility code (feature engineering, etc.)
│   ├── pandas             <- Pandas related utility code
│   │   ├── analysis                  
│   │   ├── cleaning
│   │   ├── engineering
│   │   ├── text    
│   │   ├── datetime     
│   │   ├── optimization       
│   │   └── profiling   
│   ├── project_utils.py   <- For project specific utilities
│   │
│   ├── text               <- Code for working with text, including distributed loading of text corpora,
│   │                         entity statement extraction, sentiment analysis, PII removal, etc.
│   └── __init__.py        <- Makes data_science_toolbox a Python package
├── tests                  <- Pytest unit tests 
├── dist                   <- Source tarballs and wheels for each version build
├── LICENSE
├── poetry.lock
└── pyproject.toml 
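
To compare an installed copy against the layout above, the subpackages can be listed at runtime with the standard library. This is a rough sketch, not part of the package: `list_submodules` is a hypothetical helper, and it assumes the installed import name is `data_science_toolbox`, as in the tree.

```python
import importlib
import pkgutil
from typing import List


def list_submodules(package_name: str) -> List[str]:
    """Return the sorted names of a package's direct submodules and subpackages."""
    package = importlib.import_module(package_name)
    # Only packages have __path__; plain modules have nothing to iterate over.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    return sorted(info.name for info in pkgutil.iter_modules(search_path))


if __name__ == "__main__":
    try:
        # Assumed import name, taken from the project tree above.
        print(list_submodules("data_science_toolbox"))
    except ImportError:
        print("data_science_toolbox is not installed in this environment")
```

If the installed version matches the tree, the output would be expected to include names such as `etl`, `ml`, and `pandas`, though this depends on the release.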

Download files

Source Distribution

data_science_toolbox-0.1.4.tar.gz (66.9 kB)

Uploaded Source

Built Distribution

data_science_toolbox-0.1.4-py3-none-any.whl (109.2 kB)

Uploaded Python 3

File details

Details for the file data_science_toolbox-0.1.4.tar.gz.

File metadata

  • Download URL: data_science_toolbox-0.1.4.tar.gz
  • Upload date:
  • Size: 66.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/0.12.17 CPython/3.6.7 Windows/10

File hashes

Hashes for data_science_toolbox-0.1.4.tar.gz

Algorithm    Hash digest
SHA256       c3d0f2b3ea88c3386768c1801076c18737b10a73076cb1aaa3acfdf5ca4af111
MD5          a70ce9d8abe8f0e7a3ce3d1cf7e950f1
BLAKE2b-256  81a6e139ab0173b9ed288db43ebb8e3e013c013ad0efe8c4cb446256f04f1024
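
A downloaded file can be checked against the published SHA256 digest using Python's standard `hashlib` module. The sketch below is illustrative only: the function names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path in the usage comment is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage, assuming the archive was saved to the current directory:
# matches_published_hash(
#     "data_science_toolbox-0.1.4.tar.gz",
#     "c3d0f2b3ea88c3386768c1801076c18737b10a73076cb1aaa3acfdf5ca4af111",
# )
```

The same helpers apply to the wheel file by substituting its file name and SHA256 digest.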

File details

Details for the file data_science_toolbox-0.1.4-py3-none-any.whl.

File hashes

Hashes for data_science_toolbox-0.1.4-py3-none-any.whl

Algorithm    Hash digest
SHA256       e7b8cc6e6b9a24a7db992b70fee02405bb42f24944d7f59f9e991312de891959
MD5          cb110de958d84187d5402df29e1e55f4
BLAKE2b-256  a844d6e1c92c87bb01f507e5fdea96d84eeec6f06046614b057962b0eea0a7ff
