Sort large CSV files on disk rather than in memory

Project description

CSV Sorter

A Python 3 fork of csvsort, for sorting CSV files on disk that do not fit into memory. It uses a merge sort: the original file is split into smaller chunks, each chunk is sorted in memory and written to its own file, and the sorted chunk files are then merged into the final output.
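
The split, sort, and merge steps described above can be sketched with the standard library alone. This is an illustration of the general technique under stated assumptions, not csvsorter's actual implementation: the function name `external_sort_csv` and the `rows_per_chunk` parameter are hypothetical, chunks are bounded by row count rather than by size in MB, and values are compared as strings.

```python
import csv
import heapq
import os
import tempfile


def external_sort_csv(input_path, output_path, columns,
                      rows_per_chunk=1000, has_header=True):
    """Sort a CSV on 0-indexed `columns` via sorted chunk files and a k-way merge.

    Hypothetical helper for illustration; not part of csvsorter.
    """
    def key(row):
        # Values are compared as strings, so "10" sorts before "9".
        return [row[i] for i in columns]

    chunk_paths = []

    def flush(rows):
        # Sort one chunk in memory and write it to its own temporary file.
        rows.sort(key=key)
        fd, path = tempfile.mkstemp(suffix=".csv")
        with os.fdopen(fd, "w", newline="") as f:
            csv.writer(f).writerows(rows)
        chunk_paths.append(path)

    # Step 1: split the input into sorted chunk files.
    with open(input_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None) if has_header else None
        chunk = []
        for row in reader:
            chunk.append(row)
            if len(chunk) >= rows_per_chunk:
                flush(chunk)
                chunk = []
        if chunk:
            flush(chunk)

    # Step 2: merge the sorted chunk files lazily, one row per file in memory.
    files = [open(p, newline="") for p in chunk_paths]
    try:
        with open(output_path, "w", newline="") as dst:
            writer = csv.writer(dst)
            if header is not None:
                writer.writerow(header)
            readers = [csv.reader(f) for f in files]
            writer.writerows(heapq.merge(*readers, key=key))
    finally:
        for f in files:
            f.close()
        for p in chunk_paths:
            os.remove(p)
```

Because `heapq.merge` pulls rows from each chunk file on demand, peak memory is governed by the chunk size rather than by the size of the whole file.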

Example usage

>>> from csvsorter import csvsort
>>> # sort this CSV on the 5th and 3rd columns (columns are 0 indexed)
>>> csvsort('test1.csv', [4,2])
>>> # sort this CSV with no header on 4th column and save results to separate file
>>> csvsort('test2.csv', [3], output_file='test3.csv', has_header=False)
>>> # sort this TSV on the first column and use a maximum of 10MB per split
>>> csvsort('test3.tsv', [0], max_size=10, delimiter='\t')
>>> # sort this CSV on the first column, force quotes around every field (default is csv.QUOTE_MINIMAL) and use windows-1250 encoding
>>> import csv
>>> csvsort('test4.csv', [0], quoting=csv.QUOTE_ALL, encoding='windows-1250')

Install

$ pip install csvsorter

Project details


This version

1.4

Download files

Source Distribution

csvsorter-1.4.tar.gz (3.3 kB view details)

Uploaded Source

File details

Details for the file csvsorter-1.4.tar.gz.

File metadata

  • Download URL: csvsorter-1.4.tar.gz
  • Size: 3.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for csvsorter-1.4.tar.gz
Algorithm Hash digest
SHA256 2b3c6a944f8541e76ff18d2b14f672a3a936cff5ce7e54a0d4fe1ce90e9248ff
MD5 59311f75f74d6ba668088e8b34303bf2
BLAKE2b-256 a7b533f8c2541df92aec7dccbf90e5bc0ecdd8f5f86481e57c8eff80e8e7df3f
