df_io
Python helpers for doing IO with Pandas DataFrames
Available methods
write_df
This method supports:
- streaming writes
- chunked writes
- gzip/zstandard compression
- passing parameters to Pandas' writers
- writing to AWS S3 and local files
Examples
Write a Pandas DataFrame (df) to an S3 path in CSV format (the default):
```python
import df_io

df_io.write_df(df, 's3://bucket/dir/mydata.csv')
```
The same with gzip compression:
```python
df_io.write_df(df, 's3://bucket/dir/mydata.csv.gz')
```
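As the examples suggest, the compression format is picked up from the file extension. For a rough local illustration of the same idea using plain pandas (this is only a sketch of the extension-based inference; df_io adds streaming and S3 handling on top):

```python
import gzip
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": ["a", "b", "c"]})

# pandas likewise infers gzip compression from a ".gz" suffix on local paths
path = os.path.join(tempfile.mkdtemp(), "mydata.csv.gz")
df.to_csv(path, index=False)

# confirm the file really is gzip-compressed, then round-trip it
with gzip.open(path, "rt") as fh:
    header = fh.readline().strip()

restored = pd.read_csv(path)
```

Reading the header back through `gzip.open` confirms the bytes on disk are genuinely compressed, not just named `.gz`.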
With zstandard compression using pickle:
```python
df_io.write_df(df, 's3://bucket/dir/mydata.pickle.zstd', fmt='pickle')
```
Using JSON lines:
```python
df_io.write_df(df, 's3://bucket/dir/mydata.json.gz', fmt='json')
```
Passing writer parameters:
```python
df_io.write_df(df, 's3://bucket/dir/mydata.json.gz', fmt='json',
               writer_options={'lines': False})
```
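`writer_options` is handed through to the underlying pandas writer, so any `DataFrame.to_json` keyword works. For instance, the `lines` flag toggles between JSON-lines output and a single JSON array, as plain pandas shows (the `orient='records'` setting here is an assumption for illustration, not a documented df_io default):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

# lines=True emits one JSON object per line (JSON lines)
jsonl = df.to_json(orient="records", lines=True)

# lines=False emits a single JSON array
array = df.to_json(orient="records", lines=False)
```

With `lines=True` each row becomes its own line (`{"a":1}` then `{"a":2}`); with `lines=False` the rows are wrapped in one array.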
Chunked write (splitting the DataFrame into equally sized parts and writing a separate output for each):

```python
df_io.write_df(df, 's3://bucket/dir/mydata.json.gz', fmt='json', chunksize=10000)
```
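The splitting step itself amounts to slicing the DataFrame into consecutive row blocks of at most `chunksize` rows; the last block may be smaller. A minimal sketch of that step (`split_into_chunks` is a hypothetical helper for illustration; the per-chunk output naming df_io uses is not shown here):

```python
import pandas as pd

def split_into_chunks(df: pd.DataFrame, chunksize: int):
    """Yield consecutive row blocks of at most `chunksize` rows."""
    for start in range(0, len(df), chunksize):
        yield df.iloc[start:start + chunksize]

df = pd.DataFrame({"n": range(25)})
chunks = list(split_into_chunks(df, 10))

# 25 rows with chunksize=10 -> blocks of 10, 10 and 5 rows
sizes = [len(c) for c in chunks]
```

Concatenating the chunks back together reproduces the original DataFrame, which is the property a chunked writer relies on.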