A data science library for data science and data analysis teams
Project description
dataramp
dataramp is a Python library designed to assist data science and data analysis teams in their workflow. It provides various utility functions and tools to streamline common data science tasks.
Features
dataramp offers the following key features:
- Project Management: Simplifies the creation of standard data science project structures. With a single function call, you can generate a well-organized project directory with predefined folders for datasets, processed data, raw data, outputs, models, scripts, notebooks, and more.
- Model Saving and Loading: Provides easy-to-use functions for saving and loading trained machine learning models. It supports various formats such as joblib, pickle, and keras, enabling seamless integration with different model types.
- Data Exploration and Visualization: Includes functions for data exploration, summary statistics, and visualization. Quickly generate feature vi plots and visualize missing data to gain insights into your datasets.
- Feature Engineering: Methods for handling missing data and noise in your datasets. Offers functions for dropping missing columns based on a specified threshold and detecting outliers using Tukey's Interquartile Range (IQR) method.
- Model Evaluation and Cross-Validation: Provides tools to evaluate model performance, including functions to calculate accuracy, F1-score, precision, recall, and generate classification reports. Also supports cross-validation for model evaluation.
- Scaling and Normalization: Offers functions for min-max scaling and z-score normalization of data to bring features to a common scale.
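To make the project-structure idea concrete, here is a minimal sketch that builds a folder tree with the standard library only. It does not use dataramp: the function name `scaffold_project` and the exact folder layout are assumptions for illustration, and the directories dataramp actually creates may differ.

```python
from pathlib import Path

# Hypothetical layout based on the folders named in the feature list;
# the real dataramp layout may be different.
FOLDERS = [
    "datasets/raw",
    "datasets/processed",
    "outputs/models",
    "scripts",
    "notebooks",
]


def scaffold_project(root):
    """Create the folder tree under `root` and return the created paths."""
    root = Path(root)
    created = []
    for rel in FOLDERS:
        path = root / rel
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `scaffold_project("my_project")` twice is harmless, because existing folders are left untouched.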
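The outlier and scaling techniques named in the list can be sketched in plain Python. This is not dataramp's implementation: the function names, the `k=1.5` fence multiplier, the choice of the "inclusive" quartile method, and the handling of constant inputs are all assumptions made for this example.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Constant input: every value maps to 0.0 (an arbitrary choice here).
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def z_score(values):
    """Standardize values to zero mean and unit population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # Constant input: every value sits exactly at the mean.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


print(tukey_outliers([1, 2, 3, 4, 100]))
print(min_max_scale([2, 4, 6]))
```

Min-max scaling is sensitive to extreme values, so it is common to inspect or handle outliers before scaling.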
Quickstart
To use dataramp in your data science projects, you can install it via pip:
pip install dataramp
Once installed, you can import the library and explore its functionality:
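import pandas as pd # needed for pd.read_csv below
import dataramp as dh # import the dataramp library
df = pd.read_csv("data/iris.csv") # load iris dataset
df.head()
# list the categorical variables
cats = dh.eda.get_cat_vars(df)
print(cats)
# list the numerical variables
num_var = dh.eda.get_num_vars(df)
print(num_var)
# count the values in each categorical variable
cat_count = dh.eda.get_cat_counts(df)
cat_count
# summarize missing values
missing = dh.eda.display_missing(df)
missing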
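The calls above need a local `data/iris.csv`. As a rough, self-contained illustration of the kind of summaries they are described as producing, the sketch below uses only pandas on a tiny made-up frame. It does not call dataramp and makes no claim about how its `eda` functions are implemented or what they return; the column names and values are invented.

```python
import pandas as pd

# Tiny invented sample standing in for data/iris.csv.
df = pd.DataFrame(
    {
        "sepal_length": [5.1, None, 6.3],
        "petal_width": [0.2, 1.3, 2.5],
        "species": ["setosa", "versicolor", None],
    }
)

# Split columns into numerical and non-numerical (treated here as categorical).
num_cols = df.select_dtypes(include="number").columns.tolist()
cat_cols = df.select_dtypes(exclude="number").columns.tolist()

# Count missing values per column.
missing_counts = df.isna().sum()

# Count occurrences of each category (missing values are skipped by default).
cat_counts = {col: df[col].value_counts() for col in cat_cols}

print(num_cols)
print(cat_cols)
print(missing_counts)
```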
Links
Project: https://github.com/kimxons/dataramp
PyPI: https://pypi.org/project/dataramp/
Documentation
For detailed usage instructions and API reference, please refer to the official documentation at https://dataramp-docs.example.com
We use SemVer for versioning.
Contribution
dataramp is an open-source project, and we welcome contributions from the data science community. If you find a bug, have a feature request, or want to contribute improvements, please open an issue or submit a pull request on our GitHub repository at https://github.com/kimxons/dataramp.
License
dataramp is licensed under the MIT License. See the LICENSE file for more details.
Contact
If you have any questions or feedback, feel free to reach out to our support team at dev.kitonga@gmail.com or join our community forum at https://community.dataramp.com. We are here to assist you in making your data science journey smooth and successful!
File details
Details for the file dataramp-1.0.1.dev169.tar.gz.
File metadata
- Download URL: dataramp-1.0.1.dev169.tar.gz
- Upload date:
- Size: 13.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9b6f1d2bc146b96e70daf8dc2fe062286f526e6926055a4436237e3f35843324
MD5 | 8fac8126081cba5f729742723a1c9662
BLAKE2b-256 | ff5d63ab37eaf1957064e3698abca4b9288b982dc560a0811fdb921bbc7ebc37
File details
Details for the file dataramp-1.0.1.dev169-py2.py3-none-any.whl.
File metadata
- Download URL: dataramp-1.0.1.dev169-py2.py3-none-any.whl
- Upload date:
- Size: 12.3 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.6
File hashes
Algorithm | Hash digest
---|---
SHA256 | d575cc7b916a3d7a34b45f01caf31cbf9af26839ac90fd23104ef1a1b2d9caea
MD5 | 832768e6ef846ebef063dadf3b6be279
BLAKE2b-256 | b5ca5cb621bb3182564cc5ac2355558882ff0532f8866fd1e41f4a2b6756ada1