a utility for space-efficient dataframes
pandas hug
sometimes you need to embrace your data and get a little, or a lot, more of it into memory.
your column data types are rarely space-efficient. most of the time this is because they were chosen by someone else, but sometimes it's just a hassle to find the most space-efficient types.
pandas-hug is here to help crush your data to fit in memory.
installation
pip install pandas-hug
usage
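import pandas as pd
import pandas_hug  # the import itself adds hug() to DataFrame and Series

# the series have different lengths, so the shorter columns are padded with NaN
S = pd.Series([2**8])
A = pd.Series([f'a{i}' for i in range(100)])
M = pd.Series([42])
E = pd.Series(['a', 'b', 'c'] * 15)
df = pd.DataFrame({'S': S, 'A': A, 'M': M, 'E': E})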
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   S       1 non-null      float64
 1   A       100 non-null    object
 2   M       1 non-null      float64
 3   E       45 non-null     object
dtypes: float64(2), object(2)
memory usage: 3.2+ KB
df.convert_dtypes().hug().info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   S       1 non-null      UInt16
 1   A       100 non-null    string
 2   M       1 non-null      UInt8
 3   E       45 non-null     category
dtypes: UInt16(1), UInt8(1), category(1), string(1)
memory usage: 1.6 KB
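the + in "3.2+ KB" means info() only reports a lower bound, because the contents of object columns are not counted. a minimal sketch for measuring the full footprint with plain pandas, assuming nothing about pandas-hug itself (memory_bytes is a hypothetical helper name, not part of the package):

```python
import pandas as pd


def memory_bytes(df: pd.DataFrame) -> int:
    """Total bytes used by df, including the Python objects inside object columns."""
    # deep=True inspects the stored objects instead of only counting pointers
    return int(df.memory_usage(deep=True).sum())


# the same repeated labels as column E above, just more of them
labels = pd.DataFrame({"E": ["a", "b", "c"] * 500})

before = memory_bytes(labels)
after = memory_bytes(labels.astype("category"))
print(f"as loaded: {before} bytes, as category: {after} bytes")
```

df.info(memory_usage="deep") prints the same deep figure in the summary line.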
pandas-hug monkey-patches pandas.DataFrame and pandas.Series to add the hug() method.
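to illustrate what monkey-patching means here, the sketch below attaches a method to pandas.Series and pandas.DataFrame by plain attribute assignment. it is not the pandas-hug implementation: shrink, shrink_series and shrink_frame are hypothetical names, and the logic only handles integer columns, picking the narrowest nullable integer dtype that holds the observed range.

```python
import numpy as np
import pandas as pd

# candidate nullable dtypes, narrowest first (hypothetical selection order)
UNSIGNED = [("UInt8", np.uint8), ("UInt16", np.uint16), ("UInt32", np.uint32), ("UInt64", np.uint64)]
SIGNED = [("Int8", np.int8), ("Int16", np.int16), ("Int32", np.int32), ("Int64", np.int64)]


def shrink_series(s: pd.Series) -> pd.Series:
    """Downcast an integer series; leave every other dtype untouched."""
    if not pd.api.types.is_integer_dtype(s.dtype) or not s.notna().any():
        return s
    lo, hi = int(s.min()), int(s.max())  # min/max skip missing values by default
    for name, np_type in (UNSIGNED if lo >= 0 else SIGNED):
        limits = np.iinfo(np_type)
        if limits.min <= lo and hi <= limits.max:
            return s.astype(name)
    return s


def shrink_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Apply shrink_series column by column (assumes unique column names)."""
    return pd.DataFrame({col: shrink_series(df[col]) for col in df.columns}, index=df.index)


# the monkey-patch itself: after these two lines every Series and DataFrame has .shrink()
pd.Series.shrink = shrink_series
pd.DataFrame.shrink = shrink_frame

demo = pd.DataFrame({"a": [1, 2, 300], "b": [-1, 0, 1], "c": ["x", "y", "z"]})
print(demo.shrink().dtypes)
```

patching the classes keeps call sites short, at the cost of the method only existing once the patching module has been imported.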
you should call convert_dtypes() before hugging your data. it moves columns to pandas' nullable dtypes where appropriate, for example float columns that hold only whole numbers to int (pandas >=1.2.0, dec 2020) and object to string.
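a small illustration of that step on its own, using only pandas and the column names from the usage example (the values are made up for the demonstration):

```python
import pandas as pd

# a missing value forces M to float64 and A to a generic text dtype on load
raw = pd.DataFrame({"M": [42.0, None], "A": ["a0", None]})
converted = raw.convert_dtypes()

print(raw.dtypes)
print(converted.dtypes)
```

with pandas >=1.2.0, M should come back as the nullable Int64 dtype and A as a string dtype, with the missing entries kept as missing values rather than forcing a wider type.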
File details
Details for the file pandas_hug-0.14.0-py3-none-any.whl.
File metadata
- Download URL: pandas_hug-0.14.0-py3-none-any.whl
- Upload date:
- Size: 5.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.8
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8bbb26072e34362a45e45c09c2283829e389c1436bc211907d8ed4158ca30703 |
| MD5 | 016642c9039b1e0c6cfbfd07374f65b4 |
| BLAKE2b-256 | 750bd7343affcda64bca4385c89a06946f1e57ab4e3ae3734d9e6816c33200f3 |