A data science library for data science and data analysis teams

These details have not been verified by PyPI

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

Dataramp

Dataramp is a Python library designed to streamline data science and data analysis workflows. It offers a collection of utility functions and tools tailored to assist data science teams in various aspects of their projects.

Key Features

1. Project Management

Simplify project setup with a single function call to generate a standardized project directory structure.
Organize datasets, model outputs, scripts, notebooks, and more in predefined folders for better project management.

2. Model Saving and Loading

Save and load trained machine learning models effortlessly.
Supports multiple formats including joblib, pickle, and keras for compatibility with diverse model types.
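
As a rough illustration of the save/load round trip, the sketch below uses only the standard-library pickle module. The helper names save_model and load_model are hypothetical and are not Dataramp's API; the dictionary stands in for a fitted estimator, and any picklable model object should behave the same way.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Serialize a picklable model object to disk (hypothetical helper)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create outputs/models if needed
    with path.open("wb") as fh:
        pickle.dump(model, fh)
    return path


def load_model(path):
    """Load a pickled model. Only unpickle files from sources you trust."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


# Stand-in for a trained model: fitted parameters kept in a plain dict.
model = {"coef": [0.5, -1.2], "intercept": 0.1}

with tempfile.TemporaryDirectory() as tmp:
    saved_path = save_model(model, Path(tmp) / "outputs" / "models" / "model.pkl")
    restored = load_model(saved_path)

print(restored == model)
```

joblib exposes a similar dump/load pair and is often preferred for models that hold large NumPy arrays.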

3. Data Exploration and Visualization

Explore datasets and generate summary statistics with ease.
Visualize feature distributions and missing data patterns to gain insights into your data.
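
For readers who want to see what such a summary involves, here is a minimal sketch in plain pandas. The function missing_summary and the toy data are assumptions made for illustration; they are not Dataramp functions and may differ from what the library reports.

```python
import pandas as pd


def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Count and percentage of missing values per column, worst column first."""
    counts = df.isna().sum()
    summary = pd.DataFrame(
        {"missing": counts, "percent": (counts / len(df) * 100).round(2)}
    )
    return summary.sort_values("missing", ascending=False)


# Small made-up frame with a few gaps in the numeric columns.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, None, 4.7, 4.6],
        "sepal_width": [3.5, 3.0, None, None],
        "species": ["setosa", "setosa", "setosa", "setosa"],
    }
)

print(df.describe())        # summary statistics for the numeric columns
print(missing_summary(df))  # missing-value counts and percentages per column
```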

4. Feature Engineering

Handle missing data and outliers effectively.
Drop columns whose share of missing values exceeds a user-defined threshold, and detect outliers using Tukey's Interquartile Range (IQR) method.
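
The IQR rule can be sketched with the standard library alone. The function name tukey_outliers and the multiplier k=1.5 (Tukey's conventional choice) are assumptions for this example, not Dataramp's implementation; quartile conventions also vary between libraries, and this sketch uses linear interpolation over the observed range.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return the values outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


data = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]
print(tukey_outliers(data))  # [95]
```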

5. Model Evaluation and Cross-Validation

Evaluate model performance with comprehensive metrics such as accuracy, F1-score, precision, and recall.
Generate classification reports and support cross-validation for robust model evaluation.
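
To make the metric definitions concrete, the sketch below computes them for a binary problem from first principles. The function binary_metrics is a hypothetical helper written for this example, not part of Dataramp; in practice scikit-learn's metrics module provides equivalent functions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)

    # Fall back to 0.0 when a denominator is zero (no predicted or actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(binary_metrics(y_true, y_pred))
```

With these labels there are three true positives, one false positive and one false negative, so all four metrics come out to 0.75.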

6. Scaling and Normalization

Scale and normalize data using min-max scaling and z-score normalization techniques.
Bring features to a common scale for improved model performance.
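
The two techniques reduce to short formulas: min-max scaling maps x to (x - min) / (max - min), and z-score normalization maps x to (x - mean) / std. The sketch below is a standard-library illustration with hypothetical function names, not Dataramp's API; it assumes the population standard deviation, which is one common convention.

```python
from statistics import fmean, pstdev


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center values on the mean and divide by the population standard deviation."""
    mu, sigma = fmean(values), pstdev(values)
    if sigma == 0:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


heights = [150.0, 160.0, 170.0, 180.0, 190.0]
print(min_max_scale(heights))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(z_score(heights))
```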

By providing a range of functionalities, Dataramp aims to enhance productivity and efficiency in data science projects, empowering teams to focus on deriving meaningful insights from their data.

Quickstart

Install Dataramp from PyPI with pip:

pip install dataramp

To upgrade an existing installation of Dataramp, use:

pip install --upgrade dataramp
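
One way to confirm which version ended up installed is to query the package metadata from Python. This sketch relies only on the standard-library importlib.metadata module (Python 3.8+); the helper name installed_version is an assumption for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print(installed_version("dataramp"))  # None if Dataramp is not installed
```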

Getting Started

Once installed, you can import the library and explore its functionality:

import dataramp as dr

Creating a New Project

To create a new project using Dataramp, run:

dr.core.create_project("project-name")

This creates a project with the standardized directory layout shown below.

Project Directory Structure

project-name/
├── datasets
│   └── dataset.csv
├── outputs
│   └── models
├── README.md
└── src
    ├── notebooks
    │   └── notebook.ipynb
    └── scripts
        ├── ingest
        └── tests
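
For intuition, a layout like this can be produced with a few lines of pathlib. The sketch below is not Dataramp's implementation: scaffold_project is a hypothetical helper, ingest and tests are assumed to be directories, and the files are created as empty placeholders.

```python
import tempfile
from pathlib import Path

# Directories and placeholder files taken from the tree above.
LAYOUT_DIRS = [
    "datasets",
    "outputs/models",
    "src/notebooks",
    "src/scripts/ingest",  # assumed to be a directory
    "src/scripts/tests",   # assumed to be a directory
]
LAYOUT_FILES = [
    "README.md",
    "datasets/dataset.csv",
    "src/notebooks/notebook.ipynb",
]


def scaffold_project(root):
    """Create the directory layout under root and return the root path."""
    root = Path(root)
    for directory in LAYOUT_DIRS:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for file in LAYOUT_FILES:
        (root / file).touch(exist_ok=True)  # empty placeholder file
    return root


with tempfile.TemporaryDirectory() as tmp:
    project = scaffold_project(Path(tmp) / "project-name")
    created = sorted(p.relative_to(project).as_posix() for p in project.rglob("*"))

print(created)
```

Using exist_ok=True makes the helper safe to run again on an existing project without raising or overwriting files.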

Sample Usage

import dataramp as dr  # import the dataramp library
import pandas as pd

from dataramp.utils import (
    describe_df,
    get_cat_vars,
    feature_summary,
    display_missing,
    get_unique_counts,
)

df = pd.read_csv("data/iris.csv")  # load iris dataset

print(df.head())  # preview the first five rows

missing = display_missing(df)
print(missing)

Project Links

GitHub Repository: dataramp
PyPI Package: dataramp

Project details

Release history

1.0.1.dev190 (pre-release): Feb 23, 2024
1.0.1.dev189 (pre-release): Feb 22, 2024
1.0.1.dev184 (pre-release): Feb 21, 2024
1.0.1.dev181 (pre-release): Feb 21, 2024
1.0.1.dev177 (pre-release, this version): Feb 21, 2024
1.0.1.dev173 (pre-release): Feb 19, 2024
1.0.1.dev171 (pre-release): Feb 19, 2024
1.0.1.dev169 (pre-release): Feb 19, 2024
0.1.2: May 22, 2024
0.1.1: May 22, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dataramp-1.0.1.dev177.tar.gz (13.9 kB)

Uploaded Feb 21, 2024 Source

Built Distribution

dataramp-1.0.1.dev177-py2.py3-none-any.whl (14.9 kB)

Uploaded Feb 21, 2024 Python 2 Python 3

Hashes for dataramp-1.0.1.dev177.tar.gz

Algorithm	Hash digest
SHA256	`a00319655610fce8093278fc387f1f2be27a8fd82077b0e78a70e5e25c68b3e7`
MD5	`b42ec22c36ea885640acc4e871e5b887`
BLAKE2b-256	`5de269dfddac981c77de0463df36c0ee2fa24044b4b9797c8aa511936a2b7718`

Hashes for dataramp-1.0.1.dev177-py2.py3-none-any.whl

Algorithm	Hash digest
SHA256	`ba9626decb75e1f3428b0593932afb44a2f0a1f4b11d7d1989c698f76c973b53`
MD5	`91da1bf3d76de6331585bfd77ec89cc3`
BLAKE2b-256	`290bb11459ae8143c2d0291ca36a7636cc555d77e1ac302481ba752ae4a89b41`