Sort large CSV files on disk rather than in memory

Project description

CSV Sort

Sorts CSV files on disk that do not fit into memory. A merge sort is used: the original file is split into smaller chunks, each chunk is sorted in memory and written back to disk, and the sorted chunk files are then merged into the final result.
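The split, sort, and merge steps described above can be sketched with the standard library alone. This is a minimal illustration and not the package's actual implementation: the names `external_csv_sort`, `_write_sorted_chunk`, and `chunk_rows` are hypothetical, chunks are bounded by row count rather than by megabytes, and fields are compared as plain strings.

```python
import csv
import heapq
import os
import tempfile


def _write_sorted_chunk(rows, key):
    """Sort one chunk in memory and write it to a temporary CSV file."""
    rows.sort(key=key)
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as tmp:
        csv.writer(tmp).writerows(rows)
    return path


def external_csv_sort(input_path, output_path, columns,
                      chunk_rows=1000, has_header=True):
    """Sort a CSV on 0-indexed `columns` while holding at most
    `chunk_rows` rows in memory during the split phase (hypothetical helper)."""
    def key(row):
        # Assumption: fields are compared as strings.
        return [row[i] for i in columns]

    # Phase 1: split the input into sorted chunk files.
    chunk_paths = []
    with open(input_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None) if has_header else None
        chunk = []
        for row in reader:
            chunk.append(row)
            if len(chunk) >= chunk_rows:
                chunk_paths.append(_write_sorted_chunk(chunk, key))
                chunk = []
        if chunk:
            chunk_paths.append(_write_sorted_chunk(chunk, key))

    # Phase 2: lazily merge the sorted chunks into the output file.
    files = [open(p, newline="") for p in chunk_paths]
    try:
        with open(output_path, "w", newline="") as dst:
            writer = csv.writer(dst)
            if header is not None:
                writer.writerow(header)
            readers = [csv.reader(f) for f in files]
            writer.writerows(heapq.merge(*readers, key=key))
    finally:
        for f in files:
            f.close()
        for p in chunk_paths:
            os.remove(p)
```

`heapq.merge` reads one row at a time from each chunk file, so the merge phase keeps only one row per chunk in memory regardless of the total file size.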

Example usage

>>> from csvsort import csvsort
>>> # sort this CSV on the 5th and 3rd columns (columns are 0 indexed)
>>> csvsort('test1.csv', [4,2])
>>> # sort this CSV with no header on the 4th column and save the results to a separate file
>>> csvsort('test2.csv', [3], output_filename='test3.csv', has_header=False)
>>> # sort this TSV on the first column and use a maximum of 10MB per split
>>> csvsort('test3.tsv', [0], max_size=10, delimiter='\t')
>>> # sort this CSV on the first column and force quotes around every field (default is csv.QUOTE_MINIMAL)
>>> import csv
>>> csvsort('test4.csv', [0], quoting=csv.QUOTE_ALL)
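
The `quoting` values in the last example are constants from Python's standard `csv` module. The sketch below uses only the standard library to show how `csv.QUOTE_MINIMAL` and `csv.QUOTE_ALL` differ when rows are written. The helper `render` is hypothetical, and it is an assumption that csvsort hands the `quoting` argument to `csv.writer` in this way.

```python
import csv
import io


def render(rows, quoting):
    """Write rows to an in-memory buffer with the given quoting mode."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


rows = [["id", "note"], ["1", "plain"], ["2", "has, comma"]]

# QUOTE_MINIMAL quotes only fields that need it (here, the one with a comma).
minimal = render(rows, csv.QUOTE_MINIMAL)

# QUOTE_ALL wraps every field in quotes.
quoted = render(rows, csv.QUOTE_ALL)

print(minimal)
print(quoted)
```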

Install

Supports Python 2 and 3:

$ pip install csvsort
$ pip3 install csvsort

Download files

Source Distribution

csvsort-1.6.1.tar.gz (3.5 kB view details)

Uploaded Source

File details

Details for the file csvsort-1.6.1.tar.gz.

File metadata

  • Download URL: csvsort-1.6.1.tar.gz
  • Size: 3.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.3

File hashes

Hashes for csvsort-1.6.1.tar.gz
Algorithm Hash digest
SHA256 b2d34042979cd843c29c50ff6cd8e62579e9e0008b7551fbb11f06cfe41004ab
MD5 a2774219450189f688c1215215db4e02
BLAKE2b-256 dab8f16eddfcf3f0dccfff358167d19ea61b83c8636ef5ba612f52c6ffbb2587
