Sort large CSV files on disk rather than in memory
CSV Sort
Sorts CSV files on disk that do not fit into memory. It uses an external merge sort: the original file is split into smaller chunks, each chunk is sorted in memory and written to a temporary file, and the sorted chunk files are then merged into the final output.
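The split, sort, and merge steps can be illustrated with a minimal standard-library sketch. This is not csvsort's actual implementation: the function name `external_sort_sketch` and the `rows_per_chunk` parameter are hypothetical (csvsort itself limits chunks by size in MB via `max_size`), and the sketch compares fields as plain strings, which may differ from what the package does.

```python
import csv
import heapq
import os
import tempfile
from itertools import islice


def external_sort_sketch(input_path, output_path, columns,
                         rows_per_chunk=100000, has_header=True):
    """Illustrative external merge sort for a CSV file (hypothetical helper)."""
    # Sort key: the selected columns, compared as strings (an assumption).
    def key(row):
        return [row[i] for i in columns]

    chunk_paths = []
    with open(input_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None) if has_header else None
        # Phase 1: read a bounded number of rows, sort them in memory,
        # and write each sorted chunk to its own temporary file.
        while True:
            chunk = list(islice(reader, rows_per_chunk))
            if not chunk:
                break
            chunk.sort(key=key)
            fd, path = tempfile.mkstemp(suffix=".csv")
            with os.fdopen(fd, "w", newline="") as tmp:
                csv.writer(tmp).writerows(chunk)
            chunk_paths.append(path)

    # Phase 2: k-way merge of the sorted chunk files. heapq.merge reads
    # lazily, so only one row per chunk is held in memory at a time.
    files = [open(p, newline="") for p in chunk_paths]
    try:
        with open(output_path, "w", newline="") as dst:
            writer = csv.writer(dst)
            if header is not None:
                writer.writerow(header)
            readers = [csv.reader(f) for f in files]
            writer.writerows(heapq.merge(*readers, key=key))
    finally:
        for f in files:
            f.close()
        for p in chunk_paths:
            os.remove(p)
```

Peak memory is bounded by the chunk size during the first phase and by the number of chunks during the merge, which is why this approach works for files larger than available RAM.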
Example usage
>>> from csvsort import csvsort
>>> # sort this CSV on the 5th and 3rd columns (columns are 0 indexed)
>>> csvsort('test1.csv', [4,2])
>>> # sort this CSV with no header on 4th column and save results to separate file
>>> csvsort('test2.csv', [3], output_filename='test3.csv', has_header=False)
>>> # sort this TSV on the first column and use a maximum of 10MB per split
>>> csvsort('test3.tsv', [0], max_size=10, delimiter='\t')
>>> # sort this CSV on the first column and force quotes around every field (default is csv.QUOTE_MINIMAL)
>>> import csv
>>> csvsort('test4.csv', [0], quoting=csv.QUOTE_ALL)
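To see what the `quoting` option changes in the output, the following sketch uses only Python's standard `csv` module. It assumes csvsort hands the `quoting` value to a `csv` writer unchanged, which the README implies but does not state; the `render` helper is hypothetical and exists only for this illustration.

```python
import csv
import io


def render(rows, quoting):
    """Write rows to an in-memory buffer with the given quoting mode."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=quoting, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


rows = [["id", "note"], ["1", "plain"], ["2", "has, comma"]]

# QUOTE_MINIMAL quotes only fields that need it (here, the one with a comma).
minimal = render(rows, csv.QUOTE_MINIMAL)

# QUOTE_ALL wraps every field in quotes.
quoted = render(rows, csv.QUOTE_ALL)

print(minimal)
print(quoted)
```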
Install
Supports Python 2 and 3:
$ pip install csvsort
$ pip3 install csvsort
Source Distribution
csvsort-1.6.1.tar.gz (3.5 kB)
File metadata
- Download URL: csvsort-1.6.1.tar.gz
- Upload date:
- Size: 3.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | b2d34042979cd843c29c50ff6cd8e62579e9e0008b7551fbb11f06cfe41004ab
MD5 | a2774219450189f688c1215215db4e02
BLAKE2b-256 | dab8f16eddfcf3f0dccfff358167d19ea61b83c8636ef5ba612f52c6ffbb2587