DataJazz
Tools to jazz up your data.
DataJazz is a toolkit for manipulating and optimizing your data for analysis, machine learning, and extract, transform, load (ETL) workflows.
Contributing
DataJazz is an open-source project founded and maintained to better serve the data science and machine learning community. Please feel free to submit pull requests to contribute to the project. By participating, you are expected to adhere to DataJazz's code of conduct.
Installation
pip install datajazz
Example usage
Create a dataframe with different datatypes
import pandas as pd
import numpy as np
rng = pd.date_range('2015-02-24', periods=5, freq='20h')
df = pd.DataFrame({ 'Start_Time': rng, 'Values': np.random.randn(len(rng)), 'Categories': ['A']*len(rng) })
df.head()
Create time-of-time features
import datajazz as dj
df = dj.timeoftime(df)
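What dj.timeoftime adds is not documented here, so the following is only a rough plain-pandas sketch of the general idea of deriving time features from a datetime column. The helper name add_time_features and the choice of hour, day-of-week, and month columns are assumptions for illustration, not DataJazz's actual output.

```python
import pandas as pd

def add_time_features(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper (not part of DataJazz): expand each
    datetime column into hour, day-of-week, and month columns."""
    out = df.copy()
    # Column list is computed once, before any new columns are added.
    for col in out.select_dtypes(include="datetime64").columns:
        out[f"{col}_hour"] = out[col].dt.hour
        out[f"{col}_dayofweek"] = out[col].dt.dayofweek  # Monday is 0
        out[f"{col}_month"] = out[col].dt.month
    return out

rng = pd.date_range("2015-02-24", periods=5, freq="20h")
features = add_time_features(pd.DataFrame({"Start_Time": rng}))
print(features)
```

Compare the columns this produces with those returned by dj.timeoftime to see which features the library actually creates.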
One-hot encode your categorical columns
import datajazz as dj
df = dj.onehot_categories(df)
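For comparison, one-hot encoding can be done in plain pandas with pd.get_dummies. This is a minimal sketch on a made-up frame; whether dj.onehot_categories uses the same column naming or detects categorical columns automatically is not stated here, so the explicit columns argument below is an assumption.

```python
import pandas as pd

# Small illustrative frame (made-up data, two categories).
df = pd.DataFrame({
    "Values": [1.0, 2.0, 3.0],
    "Categories": ["A", "B", "A"],
})

# Replace the categorical column with one 0/1 indicator column per category.
encoded = pd.get_dummies(df, columns=["Categories"], dtype=int)
print(encoded)
```

Note that a column with a single category, such as 'Categories' in the example dataframe above, yields just one indicator column that is always 1.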
Remove redundant rows and columns
import datajazz as dj
df = dj.remove_redundancies(df)
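"Redundant" is not defined here. One plausible reading, sketched below in plain pandas, is: duplicate rows, columns holding a single constant value, and columns that repeat another column's values. The helper name drop_redundant and these three rules are assumptions, not a description of what dj.remove_redundancies does.

```python
import pandas as pd

def drop_redundant(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper (not part of DataJazz)."""
    out = df.drop_duplicates()                        # identical rows
    out = out.loc[:, out.nunique(dropna=False) > 1]   # constant columns
    keep = ~out.T.duplicated().to_numpy()             # repeated columns
    return out.loc[:, keep].reset_index(drop=True)

df = pd.DataFrame({
    "a": [1, 2, 2, 3],
    "b": [1, 2, 2, 3],          # same values as "a"
    "c": ["x", "x", "x", "x"],  # constant
    "d": [4, 5, 5, 7],
})
cleaned = drop_redundant(df)
print(cleaned)
```

Transposing a wide frame to find repeated columns can be slow, so this approach is best suited to small or medium datasets.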
Many more use cases to come! Submit a pull request to add a new use case.