a utility for space efficient dataframes
pandas hug
sometimes you need to embrace your data and get a little, or a lot, more of it into memory.
your column data types are rarely space-efficient. most of the time this is because they were chosen by someone else, but sometimes it's just a hassle to find the most space-efficient types.
pandas-hug is here to help crush your data to fit in memory.
installation
pip install pandas-hug
usage
import pandas as pd
import pandas_hug
S = pd.Series([2**8])
A = pd.Series([f'a{i}' for i in range(100)])
M = pd.Series([42])
E = pd.Series(['a', 'b', 'c'] * 15)
df = pd.DataFrame({'S': S, 'A': A, 'M': M, 'E': E})
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   S       1 non-null      float64
 1   A       100 non-null    object 
 2   M       1 non-null      float64
 3   E       45 non-null     object 
dtypes: float64(2), object(2)
memory usage: 3.2+ KB
df.convert_dtypes().hug().info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype   
---  ------  --------------  -----   
 0   S       1 non-null      UInt16  
 1   A       100 non-null    string  
 2   M       1 non-null      UInt8   
 3   E       45 non-null     category
dtypes: UInt16(1), UInt8(1), category(1), string(1)
memory usage: 1.6 KB
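the output above shows S landing in UInt16 and M in UInt8. one way such a choice could be made is to check the column's value range against each integer width in turn. the sketch below is only an illustration of that idea: `smallest_nullable_int_dtype` is a hypothetical helper, not part of pandas-hug, and the package may well pick types differently.

```python
import numpy as np
import pandas as pd


def smallest_nullable_int_dtype(series: pd.Series) -> str:
    """Return the name of the narrowest nullable integer dtype that fits every value.

    Hypothetical helper for illustration only; not pandas-hug's API.
    """
    values = series.dropna()
    if values.empty:
        return "UInt8"
    lo, hi = int(values.min()), int(values.max())
    # prefer unsigned types when the column has no negative values
    if lo >= 0:
        candidates = [("UInt8", np.uint8), ("UInt16", np.uint16),
                      ("UInt32", np.uint32), ("UInt64", np.uint64)]
    else:
        candidates = [("Int8", np.int8), ("Int16", np.int16),
                      ("Int32", np.int32), ("Int64", np.int64)]
    for name, np_type in candidates:
        info = np.iinfo(np_type)
        if info.min <= lo and hi <= info.max:
            return name
    raise OverflowError("values do not fit in a 64-bit integer")


s = pd.Series([2**8, None], dtype="Int64")
m = pd.Series([42, None], dtype="Int64")
s_small = s.astype(smallest_nullable_int_dtype(s))
m_small = m.astype(smallest_nullable_int_dtype(m))
print(s_small.dtype, m_small.dtype)  # UInt16 UInt8
```

256 is one past the UInt8 maximum of 255, which is why S needs 16 bits while 42 fits in 8.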
pandas-hug monkey-patches pandas.DataFrame and pandas.Series to add the hug() method.
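for readers unfamiliar with monkey-patching, here is a minimal sketch of the general technique: assigning a plain function to a class attribute so that every instance gains a new method. the method name `describe_size` and its body are made up for this example and have nothing to do with how hug() is implemented.

```python
import pandas as pd


def _describe_size(self) -> str:
    """Hypothetical method: report the object's deep memory footprint in bytes."""
    usage = self.memory_usage(deep=True)
    # DataFrame.memory_usage returns a Series (one entry per column plus the index);
    # Series.memory_usage returns a single integer
    total = int(usage.sum()) if isinstance(usage, pd.Series) else int(usage)
    return f"{total} bytes"


# attaching the function to the classes makes it available on every instance
pd.DataFrame.describe_size = _describe_size
pd.Series.describe_size = _describe_size

df = pd.DataFrame({"E": ["a", "b", "c"] * 15})
print(df.describe_size())
print(df["E"].astype("category").describe_size())
```

the patch takes effect as a side effect of running the assignments, which is why importing a package that does this is enough to make the new method appear.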
you should call convert_dtypes() before hugging your data. this does useful things like converting float to int (pandas >=1.2.0, dec 2020) and object to string where appropriate.
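a small sketch of what convert_dtypes() does on its own, using only pandas. the frame and column names here are made up for the example. the exact dtype reported for the text column can vary between pandas versions, so only the integer conversion is annotated.

```python
import pandas as pd

df = pd.DataFrame({
    "S": [2**8, None, None],   # whole numbers held as float64 because of the missing values
    "A": ["a0", "a1", "a2"],   # text, typically held as object
})
print(df.dtypes)

converted = df.convert_dtypes()
print(converted.dtypes)        # S becomes Int64 (pandas >= 1.2.0)
```

once S is a nullable integer column rather than float64, a narrower integer type becomes a possibility at all, which is why this step comes before hug().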
Hashes for pandas_hug-0.12.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5c4633c88a94caa8f17be35ffa0915052e4d3755458cc02ecf5255bc82225dd5
MD5 | 232af2a62db1f7a4e7273c6304fa5904
BLAKE2b-256 | af82595ea80cc3ba53891547f153c0ba4586d76962c5550a288b238bfe56596a