Sort large CSV files on disk rather than in memory

CSV Sort

Sort a CSV file on disk rather than in memory. An external merge sort splits the original file into smaller chunks, sorts each chunk in memory, writes the sorted chunks to disk, and then merges them into the final sorted output.
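
The chunk-sort-merge idea described above can be sketched with the standard library alone. This is an illustration of the general technique, not csvsort's actual implementation: the function name external_sort_csv and the chunk_rows parameter are hypothetical, and the sketch compares values as strings (so "10" sorts before "2").

```python
import csv
import heapq
import os
import tempfile
from itertools import islice


def external_sort_csv(input_path, output_path, columns,
                      has_header=True, chunk_rows=100_000):
    """Sort a CSV by 0-indexed columns, holding at most chunk_rows rows in memory.

    Hypothetical helper for illustration; values are compared as strings.
    """
    def key(row):
        return [row[c] for c in columns]

    chunk_paths = []
    with open(input_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None) if has_header else None
        # Phase 1: read a bounded number of rows, sort them in memory,
        # and write each sorted chunk to its own temporary file.
        while True:
            rows = list(islice(reader, chunk_rows))
            if not rows:
                break
            rows.sort(key=key)
            fd, path = tempfile.mkstemp(suffix=".csv")
            with os.fdopen(fd, "w", newline="") as chunk:
                csv.writer(chunk).writerows(rows)
            chunk_paths.append(path)

    # Phase 2: k-way merge of the sorted chunks; heapq.merge streams the
    # rows lazily, so only one row per chunk is held in memory at a time.
    files = [open(p, newline="") for p in chunk_paths]
    try:
        with open(output_path, "w", newline="") as dst:
            writer = csv.writer(dst)
            if header is not None:
                writer.writerow(header)
            readers = [csv.reader(f) for f in files]
            writer.writerows(heapq.merge(*readers, key=key))
    finally:
        for f in files:
            f.close()
        for p in chunk_paths:
            os.remove(p)
```

A smaller chunk_rows lowers peak memory use at the cost of more temporary files to merge.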

Example usage:

>>> from csvsort import csvsort
>>> # sort this CSV on the 4th and 2nd columns (columns are 0 indexed)
>>> csvsort('test1.csv', [4,2])
>>> # sort this CSV with no header on 3rd column and save results to separate file
>>> csvsort('test2.csv', [3], output_file='test3.csv', has_header=False)

csvsort can also be used from the command line:

$ # sort this CSV on 0th column
$ python csvsort.py test1.csv --column=0

$ # sort this tab separated file (TSV) on 3rd and 1st columns
$ python csvsort.py test3.tsv --delimiter='\t' -c 3 -c 1
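
For readers wondering how a repeated -c flag can yield an ordered list of sort columns, here is a minimal sketch using argparse. It is an assumption about one way to build such a command line interface, not csvsort's actual parser; the function name parse_args and the option destinations are hypothetical.

```python
import argparse


def parse_args(argv):
    """Hypothetical parser: collects every -c/--column value into a list."""
    parser = argparse.ArgumentParser(description="Sort a CSV file on disk")
    parser.add_argument("input_file")
    # action="append" keeps each occurrence of -c, in the order given.
    parser.add_argument("-c", "--column", dest="columns",
                        action="append", type=int)
    parser.add_argument("--delimiter", default=",")
    args = parser.parse_args(argv)
    # The shell passes '\t' as a literal backslash followed by 't',
    # so translate it to a real tab character here.
    if args.delimiter == "\\t":
        args.delimiter = "\t"
    return args


args = parse_args(["test3.tsv", "--delimiter=\\t", "-c", "3", "-c", "1"])
print(args.columns)
```

The first -c value becomes the primary sort column and later values break ties, matching the order of the column list in the Python examples.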

Install

$ pip install csvsort

