Faster loading of pandas data frames by saving them as numpy arrays and pickling their meta info (row+column names, column dtype info).
numpickle
Install
pip install numpickle
Usage
import pandas as pd
import numpickle as npl
# create example data frame with non-numeric and numeric columns
df = pd.DataFrame([[1, 2, 'a'], [3, 4, 'b']])
df.columns = ["A", "B", "C"]
df.index = ["row1", "row2"]
df
#       A  B  C
# row1  1  2  a
# row2  3  4  b
df.dtypes
# A     int64
# B     int64
# C    object
# dtype: object
# save data frame as numpy array and pickle row and column names
# into helper pickle file "/home/user/test.npy.pckl"
npl.save_numpickle(df, "/home/user/test.npy")
# load the saved data
df_ = npl.load_numpickle("/home/user/test.npy")
df_
#       A  B  C
# row1  1  2  a
# row2  3  4  b
df_.dtypes
# A     int64
# B     int64
# C    object
# dtype: object
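# check that every value survived the round trip
# (all(df == df_) would only iterate over the column labels)
(df == df_).all().all()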
# True
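How it works (sketch)
The description above says the values are saved as a numpy array and the meta info (row and column names, column dtypes) is pickled separately. The block below is a minimal sketch of that idea and not numpickle's actual source. The helper names save_sketch and load_sketch, the layout of the meta dict, and the temporary file path are assumptions made for illustration; only the ".pckl" suffix of the helper file is taken from the usage example.

```python
import os
import pickle
import tempfile

import numpy as np
import pandas as pd


def save_sketch(df, path):
    """Hypothetical helper: values via np.save, meta info via pickle."""
    # Mixed-type frames become a single object-dtype array.
    np.save(path, df.to_numpy(), allow_pickle=True)
    # Row names, column names and per-column dtypes go into a side file.
    meta = {
        "index": df.index,
        "columns": df.columns,
        "dtypes": df.dtypes.to_dict(),
    }
    with open(path + ".pckl", "wb") as fh:
        pickle.dump(meta, fh)


def load_sketch(path):
    """Hypothetical helper: rebuild the frame and restore column dtypes."""
    values = np.load(path, allow_pickle=True)
    with open(path + ".pckl", "rb") as fh:
        meta = pickle.load(fh)
    df = pd.DataFrame(values, index=meta["index"], columns=meta["columns"])
    # The array has one dtype for all columns, so cast each column back.
    return df.astype(meta["dtypes"])


# Round trip in a temporary directory (path chosen for this sketch only).
original = pd.DataFrame(
    [[1, 2, "a"], [3, 4, "b"]],
    columns=["A", "B", "C"],
    index=["row1", "row2"],
)
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "test.npy")
    save_sketch(original, target)
    restored = load_sketch(target)
```

Note that an object-dtype array is itself stored with pickle inside the .npy file, so a frame with mixed column types presumably benefits less from this layout than an all-numeric one. Loading with allow_pickle=True should only be done for files from a trusted source.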