Sort huge csv files.

Disk CSV Sort

Description

Sort huge CSV files using disk space and RAM together.

Currently, only CSV files with a header row are supported.
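The project does not document its algorithm, so the following is only a minimal sketch of one common way to combine disk and RAM for sorting: an external merge sort. It is not taken from the diskcsvsort source. The function name `external_sort_csv`, its parameters, and the `chunk_rows` default are hypothetical, and the output is assumed to go to a separate `dst` file.

```python
import csv
import heapq
import tempfile
from pathlib import Path


def external_sort_csv(src, dst, key, chunk_rows=100_000):
    """Sort a CSV file with a header via sorted runs on disk and a k-way merge.

    Hypothetical helper for illustration; not part of diskcsvsort.
    """
    src, dst = Path(src), Path(dst)
    with tempfile.TemporaryDirectory() as tmp, src.open(newline='') as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        runs = []

        def flush(rows):
            # Sort one in-memory chunk and write it to disk as a sorted run.
            rows.sort(key=key)
            path = Path(tmp) / f'run{len(runs)}.csv'
            with path.open('w', newline='') as out:
                writer = csv.DictWriter(out, fieldnames=fieldnames)
                writer.writeheader()
                writer.writerows(rows)
            runs.append(path)

        # Phase 1: read the source in chunks that fit in RAM.
        chunk = []
        for row in reader:
            chunk.append(row)
            if len(chunk) >= chunk_rows:
                flush(chunk)
                chunk = []
        if chunk:
            flush(chunk)

        # Phase 2: merge the runs lazily; heapq.merge pulls rows on demand,
        # so roughly one row per run is held in memory at a time.
        files = [path.open(newline='') for path in runs]
        try:
            with dst.open('w', newline='') as out:
                writer = csv.DictWriter(out, fieldnames=fieldnames)
                writer.writeheader()
                readers = (csv.DictReader(fh) for fh in files)
                writer.writerows(heapq.merge(*readers, key=key))
        finally:
            for fh in files:
                fh.close()
```

In this sketch `chunk_rows` is the knob that trades RAM for disk: smaller chunks use less memory but produce more run files to merge.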

Usage

Sort the CSV file path/to/file.csv by the column Some Column.

from pathlib import Path
from diskcsvsort import CSVSort

csvsort = CSVSort(
    src=Path('path/to/file.csv'),
    key=lambda row: row['Some Column'],
)
csvsort.apply()

CLI

Sort the CSV file path/to/file.csv by the columns col1 and col2. Values in col1 are converted to Python str and values in col2 to Python int.

python -m diskcsvsort path/to/file.csv --by col1:str --by col2:int
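For comparison, a key function with the same intent can be written for the Python API. This is a sketch under two assumptions: that `key` receives each row as a dict-like mapping of column name to string, as the Usage example suggests, and that a tuple key gives the desired multi-column ordering. How the CLI builds its key internally is not documented here. The name `by_col1_then_col2` and the sample rows are made up for illustration.

```python
def by_col1_then_col2(row):
    # CSV values are read as strings, so col2 is converted to int explicitly.
    # The tuple orders rows by col1 first, then by col2 within equal col1.
    return (str(row['col1']), int(row['col2']))


# Illustrative rows standing in for parsed CSV records.
rows = [
    {'col1': 'b', 'col2': '10'},
    {'col1': 'a', 'col2': '10'},
    {'col1': 'a', 'col2': '9'},
]
ordered = sorted(rows, key=by_col1_then_col2)
```

Without the `int` conversion, '10' would sort before '9' because strings compare character by character. Under the assumptions above, the same function could be passed as `key=by_col1_then_col2` to `CSVSort`.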

Available types:

  • str
  • int
  • float
  • datetime
  • date
  • time

Types usage:

  • str: column:str
  • int: column:int
  • float: column:float
  • datetime: column:datetime(%Y-%m-%d %H:%M:%S)
  • date: column:datetime(%Y-%m-%d)
  • time: column:datetime(%H:%M:%S)
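To make the `column:type` and `column:type(format)` notation concrete, here is a small sketch of how such specs could be turned into converters. It is not the package's own parser. `parse_spec` and `make_key` are hypothetical names, and only the `str`, `int`, `float`, and `datetime(...)` forms listed above are handled.

```python
import re
from datetime import datetime

# 'column:type' with an optional '(format)' suffix, e.g. 'ts:datetime(%Y-%m-%d)'.
SPEC = re.compile(r'^(?P<column>.+):(?P<type>[a-z]+)(?:\((?P<fmt>.*)\))?$')

SIMPLE_TYPES = {'str': str, 'int': int, 'float': float}


def parse_spec(spec):
    """Split a spec into (column, converter). Hypothetical helper."""
    match = SPEC.match(spec)
    if match is None:
        raise ValueError(f'cannot parse spec: {spec!r}')
    column, type_name, fmt = match.group('column', 'type', 'fmt')
    if type_name == 'datetime':
        if fmt is None:
            raise ValueError(f'datetime needs a format: {spec!r}')
        # The format string is handed to strptime unchanged.
        return column, lambda value: datetime.strptime(value, fmt)
    if type_name not in SIMPLE_TYPES or fmt is not None:
        raise ValueError(f'unsupported type in spec: {spec!r}')
    return column, SIMPLE_TYPES[type_name]


def make_key(specs):
    """Build a row -> tuple key function from several specs."""
    parsed = [parse_spec(spec) for spec in specs]
    return lambda row: tuple(convert(row[column]) for column, convert in parsed)
```

For example, `make_key(['col1:str', 'col2:int'])` returns a function that maps a row dict to a `(str, int)` tuple suitable for sorting.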

Algorithm

TODO
