Sort huge CSV files.
Disk CSV Sort
Description
Sort huge CSV files using disk space and RAM together.
Currently, only CSV files with a header row are supported.
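The package's own algorithm is not documented here, so the following is only a minimal sketch of the classic external merge sort, the general technique for sorting with disk and RAM together. The function `external_sort` and its `chunk_rows` parameter are hypothetical names for illustration and are not part of the diskcsvsort API.

```python
import csv
import heapq
import tempfile
from pathlib import Path


def external_sort(src, dst, key, chunk_rows=100_000):
    """Sort CSV `src` (with a header) into `dst`, reading `chunk_rows` rows at a time."""
    with tempfile.TemporaryDirectory() as tmp, open(src, newline="") as fin:
        reader = csv.DictReader(fin)
        fields = reader.fieldnames
        chunk_paths = []

        def flush(rows):
            # Sort one chunk in memory and write it to its own temporary file.
            path = Path(tmp) / f"chunk_{len(chunk_paths)}.csv"
            with open(path, "w", newline="") as fout:
                writer = csv.DictWriter(fout, fieldnames=fields)
                writer.writeheader()
                writer.writerows(sorted(rows, key=key))
            chunk_paths.append(path)

        # Phase 1: split the input into sorted chunks on disk.
        rows = []
        for row in reader:
            rows.append(row)
            if len(rows) >= chunk_rows:
                flush(rows)
                rows = []
        if rows:
            flush(rows)

        # Phase 2: merge the sorted chunks lazily, one pending row per chunk.
        files = [open(path, newline="") for path in chunk_paths]
        try:
            with open(dst, "w", newline="") as fout:
                writer = csv.DictWriter(fout, fieldnames=fields)
                writer.writeheader()
                readers = (csv.DictReader(f) for f in files)
                writer.writerows(heapq.merge(*readers, key=key))
        finally:
            for f in files:
                f.close()
```

In this sketch `dst` should be a different path from `src`. Memory use is bounded by `chunk_rows`, while the temporary chunk files need roughly as much disk space as the input.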
Usage
For example, take a CSV file with movies:
name | year |
---|---|
Batman Begins | 2005 |
Blade Runner 2049 | 2017 |
Dune | 2021 |
Snatch | 2000 |
Sort this CSV file, stored in movies.csv, by year and then by name.
Note: the order of the columns matters when sorting.
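To see why the column order matters, here is a small in-memory illustration using only the standard library (it does not use diskcsvsort). The helper `sort_rows` is a hypothetical name; the data is the movies table from above.

```python
import csv
import io

# The sample movies table, inlined so the snippet is self-contained.
MOVIES_CSV = """name,year
Batman Begins,2005
Blade Runner 2049,2017
Dune,2021
Snatch,2000
"""


def sort_rows(text, key):
    """Parse CSV text with a header and return its rows sorted by `key`."""
    return sorted(csv.DictReader(io.StringIO(text)), key=key)


# Tuples compare element by element, so the first column dominates the order.
by_year_then_name = sort_rows(MOVIES_CSV, lambda row: (int(row["year"]), row["name"]))
by_name_then_year = sort_rows(MOVIES_CSV, lambda row: (row["name"], int(row["year"])))

print([row["name"] for row in by_year_then_name])
# ['Snatch', 'Batman Begins', 'Blade Runner 2049', 'Dune']
print([row["name"] for row in by_name_then_year])
# ['Batman Begins', 'Blade Runner 2049', 'Dune', 'Snatch']
```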
Using diskcsvsort package
from pathlib import Path
from diskcsvsort import CSVSort

csvsort = CSVSort(
    src=Path('movies.csv'),
    key=lambda row: (int(row['year']), row['name']),
)
csvsort.apply()
Using diskcsvsort CLI
python -m diskcsvsort movies.csv --by year:int --by name:str
Note: columns year and name will be converted to int and str, respectively.
Available types:
- str
- int
- float
- datetime
- date
- time
Types usage:
- str: column:str
- int: column:int
- float: column:float
- datetime: column:datetime(%Y-%m-%d %H:%M:%S)
- date: column:datetime(%Y-%m-%d)
- time: column:datetime(%H:%M:%S)
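As a rough illustration of how a spec such as column:int or column:datetime(%Y-%m-%d) could be turned into a converter function, consider the sketch below. It is not diskcsvsort's internal code; `parse_spec`, `SIMPLE_TYPES` and `SPEC_RE` are hypothetical names, and the sketch assumes column names contain no colon.

```python
import re
from datetime import datetime

# Hypothetical lookup of simple type names; not taken from diskcsvsort.
SIMPLE_TYPES = {"str": str, "int": int, "float": float}

# column, then ":", then a type name, then an optional "(format)" suffix.
SPEC_RE = re.compile(r"^(?P<column>[^:]+):(?P<type>\w+)(?:\((?P<fmt>.*)\))?$")


def parse_spec(spec):
    """Split 'column:type' or 'column:datetime(format)' into (column, converter)."""
    match = SPEC_RE.match(spec)
    if match is None:
        raise ValueError(f"invalid spec: {spec!r}")
    column, type_name, fmt = match["column"], match["type"], match["fmt"]
    if type_name in SIMPLE_TYPES:
        return column, SIMPLE_TYPES[type_name]
    if type_name == "datetime" and fmt:
        # strptime parses the cell text according to the given format string.
        return column, lambda value: datetime.strptime(value, fmt)
    raise ValueError(f"unsupported type in spec: {spec!r}")


column, convert = parse_spec("year:int")
print(column, convert("2005"))  # year 2005
```

Parsing the value before comparing it is what makes numeric and date columns sort by value instead of lexicographically as text.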
Algorithm
TODO
Metrics
TODO
Download files
Source Distribution
diskcsvsort-0.1.1.tar.gz (10.9 kB)
Built Distribution
File details
Details for the file diskcsvsort-0.1.1.tar.gz.
File metadata
- Download URL: diskcsvsort-0.1.1.tar.gz
- Upload date:
- Size: 10.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.11 CPython/3.10.0 Windows/10
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | e6b9bfbc9c08a9d9f38cd22c3aaf40e1064137f2a3a7de40f6471af27c26af7a |
MD5 | 82870e87854a02bc4f097bd48eb404f0 |
BLAKE2b-256 | 326bbb2abd3af42692ace817f3418c5b0ef3fdbcb20e4f1e745abbcdbdb4c4ac |
File details
Details for the file diskcsvsort-0.1.1-py3-none-any.whl.
File metadata
- Download URL: diskcsvsort-0.1.1-py3-none-any.whl
- Upload date:
- Size: 9.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.11 CPython/3.10.0 Windows/10
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | dab56c5b5dd381792c544f9b24f32de0e06939b3a8c9c99be1104892bae2a5d5 |
MD5 | 02f0e6564071077517e659b68e5a9d93 |
BLAKE2b-256 | 1de02037b4741af3729e72948a212c3f186fc47741bc07c1304496392f07454c |