chunkr
A library for chunking different types of data files.
Getting started
pip install chunkr
Usage
Suppose you want to split a CSV file of 1 million records into 10 chunks of 100,000 records each. You can do it like this:
from chunkr import create_chunks_dir
import pandas as pd

with create_chunks_dir(
    'csv',
    'csv_test',
    'path/to/file',
    'temp/output',
    100_000,
    None,
    None,
    quote_char='"',
    delimiter=',',
    escape_char='\\',
) as chunks_dir:
    assert 1_000_000 == sum(
        len(pd.read_parquet(file)) for file in chunks_dir.iterdir()
    )
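For intuition about what chunking by record count involves, here is a minimal sketch using only the Python standard library. It is not chunkr's implementation: `split_csv` and `count_rows` are hypothetical helpers written for this illustration, and the sketch writes CSV chunks rather than the parquet files that the usage example reads. The demo uses 1,000 records and a chunk size of 100 to stay small.

```python
from __future__ import annotations

import csv
import tempfile
from itertools import islice
from pathlib import Path


def split_csv(source: Path, output_dir: Path, chunk_size: int) -> list[Path]:
    """Split `source` into CSV files of at most `chunk_size` data rows each.

    Hypothetical helper for illustration only; not part of chunkr.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    with source.open(newline="") as src:
        reader = csv.reader(src)
        header = next(reader)  # the header is repeated in every chunk
        index = 0
        while True:
            # Pull the next `chunk_size` rows without loading the whole file.
            rows = list(islice(reader, chunk_size))
            if not rows:
                break
            chunk_path = output_dir / f"chunk_{index:04d}.csv"
            with chunk_path.open("w", newline="") as dst:
                writer = csv.writer(dst)
                writer.writerow(header)
                writer.writerows(rows)
            chunk_paths.append(chunk_path)
            index += 1
    return chunk_paths


def count_rows(path: Path) -> int:
    """Count data rows (excluding the header) in a CSV file."""
    with path.open(newline="") as f:
        return sum(1 for _ in csv.reader(f)) - 1


# Small demo: 1,000 records split into chunks of 100 records.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    source = tmp_path / "input.csv"
    with source.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id", "value"])
        writer.writerows([i, f"row {i}"] for i in range(1_000))

    chunks = split_csv(source, tmp_path / "chunks", chunk_size=100)
    chunk_count = len(chunks)
    total_rows = sum(count_rows(p) for p in chunks)
```

As in the usage example, the check that matters is that the row counts of all chunks add up to the row count of the original file.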