High-performance CSV parser with SIMD optimizations

Project description

CISV Python Bindings (nanobind)

High-performance Python bindings for the CISV CSV parser using nanobind.

Performance

These bindings are 10-100x faster than the ctypes-based bindings because they:

  1. Use the batch API: All data is parsed in C and returned at once, eliminating millions of per-field callbacks
  2. Use nanobind: Much lower overhead than ctypes or pybind11
  3. Release the GIL: Parallel parsing runs without holding the Python GIL

File size                   ctypes   nanobind   Speedup
142MB (1M rows × 10 cols)   ~20s     <0.8s      25x+

Installation

From PyPI (recommended)

pip install cisv

From source

cd bindings/python-nanobind
pip install .

Development install

cd bindings/python-nanobind
pip install -e .

Usage

import cisv

# Parse a file
rows = cisv.parse_file('data.csv')

# Parse with options
rows = cisv.parse_file(
    'data.csv',
    delimiter=';',
    quote="'",
    trim=True,
    skip_empty_lines=True
)

# Parse large files in parallel (faster on multi-core systems)
rows = cisv.parse_file('large.csv', parallel=True)

# Parse a string
rows = cisv.parse_string("a,b,c\n1,2,3")

# Count rows quickly (SIMD-accelerated)
count = cisv.count_rows('data.csv')

# Row-by-row iteration (memory efficient, supports early exit)
with cisv.CisvIterator('large.csv') as reader:
    for row in reader:
        print(row)  # List[str]
        if row[0] == 'stop':
            break  # Early exit - no wasted work

# Or use the convenience function
for row in cisv.open_iterator('data.csv', delimiter=',', trim=True):
    process(row)
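
The calls above return rows as lists of fields. A minimal sketch of turning that shape into dictionaries keyed by column name follows. `rows_to_dicts` is a hypothetical helper, not part of cisv, and it assumes that the first row is a header and that fields are strings (the iterator example above annotates rows as List[str]).

```python
from typing import Dict, List


def rows_to_dicts(rows: List[List[str]]) -> List[Dict[str, str]]:
    """Map each data row to a dict keyed by the header row (hypothetical helper)."""
    if not rows:
        return []
    header, *data = rows
    # zip() stops at the shorter sequence, so short rows simply omit trailing keys.
    return [dict(zip(header, row)) for row in data]


# Same shape as the rows in the usage above, e.g. the result of
# cisv.parse_string("a,b,c\n1,2,3").
sample = [["a", "b", "c"], ["1", "2", "3"]]
records = rows_to_dicts(sample)
print(records)  # [{'a': '1', 'b': '2', 'c': '3'}]
```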

API Reference

parse_file(path, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False, parallel=False, num_threads=0)

Parse a CSV file and return all rows.

Parameters:

  • path: Path to the CSV file
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines
  • parallel: Use multi-threaded parsing (faster for large files)
  • num_threads: Number of threads for parallel parsing (0 = auto-detect)

Returns: List of rows, where each row is a list of field values.
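
One way to use these parameters together is to enable parallel parsing only for files above some size. The sketch below builds the keyword arguments from the file size. `parse_options` and its `parallel_threshold` cut-off are assumptions made for illustration, not values recommended by cisv.

```python
import os
import tempfile


def parse_options(path: str, parallel_threshold: int = 64 * 1024 * 1024) -> dict:
    """Build keyword arguments for cisv.parse_file (hypothetical helper).

    parallel_threshold is an arbitrary illustrative cut-off in bytes.
    """
    size = os.path.getsize(path)
    return {
        "delimiter": ",",
        "quote": '"',
        "trim": False,
        "skip_empty_lines": True,
        "parallel": size >= parallel_threshold,
        "num_threads": 0,  # 0 = auto-detect, per the parameter list above
    }


# Small demo file so the sketch runs without cisv installed.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("a,b\n1,2\n")
opts = parse_options(tmp.name)
print(opts["parallel"])  # False: the file is far below the threshold
os.remove(tmp.name)

# With cisv installed and data.csv present:
#   import cisv
#   rows = cisv.parse_file("data.csv", **parse_options("data.csv"))
```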

parse_string(content, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False)

Parse a CSV string and return all rows.

Parameters:

  • content: CSV content as a string
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines

Returns: List of rows, where each row is a list of field values.
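
When adopting a new parser it can help to compare its output with the standard library on a small input. The sketch below builds such a reference result with Python's csv module. `reference_parse` is a hypothetical helper, and the comparison with cisv is left as a comment because whether the two agree on every edge case is not stated here.

```python
import csv
import io
from typing import List


def reference_parse(content: str, delimiter: str = ",", quote: str = '"') -> List[List[str]]:
    """Parse CSV text with the standard library, for comparison only."""
    reader = csv.reader(
        io.StringIO(content, newline=""),
        delimiter=delimiter,
        quotechar=quote,
    )
    return [row for row in reader]


expected = reference_parse("a,b,c\n1,2,3")
print(expected)  # [['a', 'b', 'c'], ['1', '2', '3']]

# With cisv installed, the two results can be compared:
#   import cisv
#   print(cisv.parse_string("a,b,c\n1,2,3") == expected)
```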

count_rows(path)

Count the number of rows in a CSV file without full parsing.

It is fast because it only scans for newlines using SIMD instructions instead of parsing fields.

Parameters:

  • path: Path to the CSV file

Returns: Number of rows in the file.
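
The idea of counting rows by scanning for newlines can be illustrated in pure Python. This is only a sketch of the technique, not cisv's SIMD implementation. A plain newline scan also counts newlines embedded in quoted fields, and how count_rows treats those, a header row, or a final line without a trailing newline is not stated here.

```python
import os
import tempfile


def count_newlines(path: str, chunk_size: int = 1 << 20) -> int:
    """Count newline bytes in a file, reading fixed-size binary chunks."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += chunk.count(b"\n")
    return total


# Small demo file: three newline-terminated lines.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("a,b\n1,2\n3,4\n")
n = count_newlines(tmp.name)
print(n)  # 3
os.remove(tmp.name)
```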

CisvIterator(path, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False)

Row-by-row iterator for streaming CSV parsing with minimal memory footprint.

Provides fgetcsv-style iteration that supports early exit: breaking out of the loop stops parsing immediately, with no wasted work.

Parameters:

  • path: Path to the CSV file
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines

Methods:

  • next(): Get the next row as List[str], or None if at end of file
  • close(): Close the iterator and release resources
  • closed: Property indicating whether the iterator has been closed

Protocols:

  • Iterator protocol: Usable in for loops (for row in iterator)
  • Context manager: Usable in a with statement for automatic cleanup

Example:

# Context manager (recommended)
with cisv.CisvIterator('data.csv') as reader:
    for row in reader:
        if row[0] == 'target':
            print(f"Found: {row}")
            break  # Early exit

# Manual iteration
reader = cisv.CisvIterator('data.csv')
try:
    while True:
        row = reader.next()
        if row is None:
            break
        process(row)
finally:
    reader.close()
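
The manual next()/close() pattern above can be wrapped in a generator so that cleanup happens in one place. The sketch below assumes only the documented shape (next() returns a row or None, close() releases resources). `iter_rows` and `FakeReader` are hypothetical names; `FakeReader` is a stand-in so the example runs without cisv.

```python
from typing import Iterator, List, Optional


def iter_rows(reader) -> Iterator[List[str]]:
    """Yield rows from an object exposing next() -> row or None, and close()."""
    try:
        while True:
            row = reader.next()
            if row is None:
                return
            yield row
    finally:
        # Runs when the rows are exhausted or the generator is closed.
        reader.close()


class FakeReader:
    """Stand-in with the same next()/close() shape as the iterator described above."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self.closed = False

    def next(self) -> Optional[List[str]]:
        return next(self._rows, None)

    def close(self) -> None:
        self.closed = True


fake = FakeReader([["id", "name"], ["1", "ada"]])
collected = [row for row in iter_rows(fake)]
print(collected)    # [['id', 'name'], ['1', 'ada']]
print(fake.closed)  # True
```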

open_iterator(path, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False)

Convenience function that returns a CisvIterator. Same parameters as CisvIterator.

Example:

for row in cisv.open_iterator('data.csv'):
    print(row)

Running Tests

cd bindings/python-nanobind
pip install -e ".[test]"
pytest

Benchmarking

pip install -e ".[benchmark]"
python -c "
import cisv
import time

# Create test file
with open('/tmp/test.csv', 'w') as f:
    f.write('col1,col2,col3\n')
    for i in range(100000):
        f.write(f'value{i}_1,value{i}_2,value{i}_3\n')

# Benchmark (perf_counter is a monotonic, high-resolution timer)
start = time.perf_counter()
rows = cisv.parse_file('/tmp/test.csv')
print(f'Parsed {len(rows)} rows in {time.perf_counter()-start:.3f}s')
"
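
A single timed run can be noisy, so a slightly more careful sketch repeats the measurement and keeps the fastest run, with the standard library csv module as a baseline. `best_of` and `parse_with_stdlib` are hypothetical helpers, and the cisv call is shown only in a comment so the block runs without the package installed.

```python
import csv
import time
from typing import Callable


def best_of(fn: Callable[[], object], repeats: int = 3) -> float:
    """Return the fastest wall-clock time in seconds over several runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def parse_with_stdlib(path: str) -> list:
    """Baseline: parse the whole file with the standard library."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


# Demo on a trivial workload so the sketch is self-contained.
elapsed = best_of(lambda: sum(range(10_000)))
print(f"{elapsed:.6f}s")

# With cisv installed and /tmp/test.csv created as above:
#   import cisv
#   print(best_of(lambda: cisv.parse_file("/tmp/test.csv")))
#   print(best_of(lambda: parse_with_stdlib("/tmp/test.csv")))
```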

Download files

Download the file for your platform.

Source Distribution

cisv-0.4.6.tar.gz (48.8 kB): Source

Built Distributions

cisv-0.4.6-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (108.0 kB): CPython 3.13; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
cisv-0.4.6-cp313-cp313-macosx_11_0_x86_64.whl (74.4 kB): CPython 3.13; macOS 11.0+ x86-64
cisv-0.4.6-cp313-cp313-macosx_11_0_arm64.whl (70.7 kB): CPython 3.13; macOS 11.0+ ARM64
cisv-0.4.6-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (108.0 kB): CPython 3.12; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
cisv-0.4.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (118.0 kB): CPython 3.12; manylinux: glibc 2.17+ x86-64
cisv-0.4.6-cp312-cp312-macosx_11_0_x86_64.whl (74.4 kB): CPython 3.12; macOS 11.0+ x86-64
cisv-0.4.6-cp312-cp312-macosx_11_0_arm64.whl (70.7 kB): CPython 3.12; macOS 11.0+ ARM64
cisv-0.4.6-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (108.5 kB): CPython 3.11; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
cisv-0.4.6-cp311-cp311-macosx_11_0_x86_64.whl (74.8 kB): CPython 3.11; macOS 11.0+ x86-64
cisv-0.4.6-cp311-cp311-macosx_11_0_arm64.whl (71.3 kB): CPython 3.11; macOS 11.0+ ARM64
cisv-0.4.6-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (108.7 kB): CPython 3.10; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
cisv-0.4.6-cp310-cp310-macosx_11_0_x86_64.whl (75.0 kB): CPython 3.10; macOS 11.0+ x86-64
cisv-0.4.6-cp310-cp310-macosx_11_0_arm64.whl (71.5 kB): CPython 3.10; macOS 11.0+ ARM64
cisv-0.4.6-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (109.0 kB): CPython 3.9; manylinux: glibc 2.27+ x86-64; manylinux: glibc 2.28+ x86-64
cisv-0.4.6-cp39-cp39-macosx_11_0_x86_64.whl (75.2 kB): CPython 3.9; macOS 11.0+ x86-64
cisv-0.4.6-cp39-cp39-macosx_11_0_arm64.whl (71.7 kB): CPython 3.9; macOS 11.0+ ARM64
