Pack Pandas data frames into smaller, more memory-efficient data types.
owid-repack-py
Pack Pandas DataFrames into smaller, more memory-efficient types.
Overview
When you load data into Pandas, it will use standard types by default:
- object for strings
- int64 for integers
- float64 for floating point numbers
However, for many datasets there is a much more compact representation that Pandas could be using for that data. Using a more compact representation leads to lower memory usage, and smaller binary files on disk when using formats such as Feather and Parquet.
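To make the saving concrete, here is a minimal sketch using only plain Pandas (it does not use this library). The variable names are illustrative, and the example assumes values small enough to fit in an 8-bit integer.

```python
import pandas as pd

# 100 small integers stored with the default 64-bit type.
values = pd.Series(range(100), dtype="int64")

# The same values fit in 8 bits each, since the largest is 99.
compact = values.astype("int8")

# Bytes used by the values themselves (index excluded).
default_bytes = values.memory_usage(index=False)
compact_bytes = compact.memory_usage(index=False)

print(default_bytes, compact_bytes)  # 800 100
```

The data is unchanged; only the storage width per value differs.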
This library does just one thing: it shrinks your data frames to use smaller types.
Installing
pip install owid-repack
Usage
The owid.repack module exposes two methods, repack_series() and repack_frame().
repack_series() will detect the smallest type that can accurately fit the existing data in the series.
In [1]: import pandas as pd; from owid import repack
In [2]: pd.Series([1, 2, 3])
Out[2]:
0    1
1    2
2    3
dtype: int64
In [3]: repack.repack_series(pd.Series([1.5, 2, 3]))
Out[3]:
0    1.5
1    2.0
2    3.0
dtype: float32
In [4]: repack.repack_series(pd.Series([1, None, 3]))
Out[4]:
0       1
1    <NA>
2       3
dtype: UInt8
In [5]: repack.repack_series(pd.Series([-1, None, 3]))
Out[5]:
0      -1
1    <NA>
2       3
dtype: Int8
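The idea of detecting the smallest integer type can be sketched in plain Pandas and NumPy. This is not the library's implementation: smallest_int_dtype is a hypothetical helper written for illustration, and it assumes the series holds only whole numbers and missing values.

```python
import numpy as np
import pandas as pd


def smallest_int_dtype(series: pd.Series) -> pd.Series:
    """Hypothetical helper: cast to the narrowest nullable integer dtype that fits."""
    lo, hi = series.min(), series.max()  # missing values are skipped
    # Unsigned types suffice when there are no negative values.
    if lo >= 0:
        candidates = ["UInt8", "UInt16", "UInt32", "UInt64"]
    else:
        candidates = ["Int8", "Int16", "Int32", "Int64"]
    for name in candidates:
        info = np.iinfo(np.dtype(name.lower()))
        if info.min <= lo and hi <= info.max:
            # Nullable dtypes keep missing values as <NA>.
            return series.astype(name)
    return series  # nothing fits (or the series is all missing): leave as-is


unsigned = smallest_int_dtype(pd.Series([1, None, 3]))
signed = smallest_int_dtype(pd.Series([-1, None, 3]))
wider = smallest_int_dtype(pd.Series([1, 300]))
print(unsigned.dtype, signed.dtype, wider.dtype)
```

The capitalised dtype names (UInt8, Int8) are Pandas' nullable integer types, which is why a series containing None can still be stored as integers rather than falling back to float64.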
The repack_frame() method simply does this across every column in your DataFrame, returning a new DataFrame.
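The column-by-column pattern can be approximated with Pandas' own pd.to_numeric(downcast=...). This is a rough stand-in, not what repack_frame() actually does: shrink_frame is a hypothetical name, and this sketch only handles columns without missing values.

```python
import pandas as pd


def shrink_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in: downcast numeric columns, return a new frame."""
    out = {}
    for name, col in df.items():
        if pd.api.types.is_integer_dtype(col):
            out[name] = pd.to_numeric(col, downcast="integer")
        elif pd.api.types.is_float_dtype(col):
            out[name] = pd.to_numeric(col, downcast="float")
        else:
            out[name] = col  # non-numeric columns are left untouched
    return pd.DataFrame(out, index=df.index)


df = pd.DataFrame(
    {
        "year": [1990, 2000, 2010],
        "share": [0.5, 1.5, 2.5],
        "country": ["a", "b", "c"],
    }
)
small = shrink_frame(df)
print(small.dtypes)
```

With the real library, the equivalent call would be repack.repack_frame(df), as described above.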
Releases
- 0.1.0:
  - Migrate first version from owid-catalog-py repo