High-performance CSV parser with SIMD optimizations

Project description

CISV Python Bindings (nanobind)

High-performance Python bindings for the CISV CSV parser using nanobind.

Performance

These bindings are 10-100x faster than the ctypes-based bindings because they:

  1. Use the batch API: All data is parsed in C and returned at once, eliminating millions of per-field callbacks
  2. Use nanobind: Much lower overhead than ctypes or pybind11
  3. Release the GIL: Parallel parsing runs without holding the Python GIL

| File Size | ctypes | nanobind | Speedup |
| --- | --- | --- | --- |
| 142 MB (1M rows × 10 cols) | ~20s | <0.8s | 25x+ |

Installation

From PyPI (recommended)

pip install cisv

From source

cd cisv
pip install .

Development install

cd cisv
pip install -e .

Usage

import cisv

# Parse a file
rows = cisv.parse_file('data.csv')

# Parse with options
rows = cisv.parse_file(
    'data.csv',
    delimiter=';',
    quote="'",
    trim=True,
    skip_empty_lines=True
)

# Parse large files in parallel (faster on multi-core systems)
rows = cisv.parse_file('large.csv', parallel=True)

# Parse a string
rows = cisv.parse_string("a,b,c\n1,2,3")

# Count rows quickly (SIMD-accelerated)
count = cisv.count_rows('data.csv')

# Row-by-row iteration (memory efficient, supports early exit)
with cisv.CisvIterator('large.csv') as reader:
    for row in reader:
        print(row)  # List[str]
        if row[0] == 'stop':
            break  # Early exit - no wasted work

# Or use the convenience function
for row in cisv.open_iterator('data.csv', delimiter=',', trim=True):
    process(row)

API Reference

parse_file(path, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False, parallel=False, num_threads=0)

Parse a CSV file and return all rows.

Parameters:

  • path: Path to the CSV file
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines
  • parallel: Use multi-threaded parsing (faster for large files)
  • num_threads: Number of threads for parallel parsing (0 = auto-detect)

Returns: List of rows, where each row is a list of field values.
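
As a sketch of the expected return shape (not cisv's actual implementation), an equivalent pure-Python parse using the stdlib csv module looks like this; cisv.parse_file should return the same nested list-of-lists structure:

```python
import csv
import tempfile

# Write a small CSV file to parse (illustrative data).
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as f:
    f.write("name,age\nalice,30\nbob,25\n")
    path = f.name

# Stdlib reference: the same list-of-lists shape cisv.parse_file returns.
with open(path, newline="") as f:
    rows = list(csv.reader(f))

print(rows)  # [['name', 'age'], ['alice', '30'], ['bob', '25']]
```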

parse_string(content, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False)

Parse a CSV string and return all rows.

Parameters:

  • content: CSV content as a string
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines

Returns: List of rows, where each row is a list of field values.
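
The delimiter and quote semantics can be illustrated with the stdlib csv module, which supports the same dialect options; cisv.parse_string is expected to produce the same rows for this input (a sketch for illustration, not cisv's implementation):

```python
import csv
import io

content = "a;'hello; world';c\n1;2;3"

# Stdlib equivalent of cisv.parse_string(content, delimiter=';', quote="'"):
# a quoted field may contain the delimiter without being split.
rows = list(csv.reader(io.StringIO(content), delimiter=";", quotechar="'"))
print(rows)  # [['a', 'hello; world', 'c'], ['1', '2', '3']]
```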

count_rows(path)

Count the number of rows in a CSV file without full parsing.

This is very fast as it only scans for newlines using SIMD instructions.

Parameters:

  • path: Path to the CSV file

Returns: Number of rows in the file.
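
Conceptually, count_rows amounts to counting newline bytes; cisv does this scan with SIMD instructions, while the sketch below is a plain-Python stand-in for illustration only:

```python
def count_rows_reference(path, chunk_size=1 << 20):
    """Count newline bytes in a file, reading in chunks.

    Plain-Python stand-in for cisv.count_rows, which performs the
    same newline scan with SIMD instructions.
    """
    count = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            count += chunk.count(b"\n")
    return count
```

Note that a raw newline scan counts physical lines; a row containing embedded newlines inside quoted fields would require quote-aware scanning to count as one row.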

CisvIterator(path, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False)

Row-by-row iterator for streaming CSV parsing with minimal memory footprint.

Provides fgetcsv-style iteration that supports early exit: breaking out of the loop stops parsing immediately, with no wasted work.

Parameters:

  • path: Path to the CSV file
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines

Methods:

  • next(): Get the next row as List[str], or None if at end of file
  • close(): Close the iterator and release resources
  • closed: Property indicating whether the iterator has been closed

Protocols:

  • Iterator protocol: Use in a for loop with for row in iterator
  • Context manager: Use in a with statement for automatic cleanup

Example:

# Context manager (recommended)
with cisv.CisvIterator('data.csv') as reader:
    for row in reader:
        if row[0] == 'target':
            print(f"Found: {row}")
            break  # Early exit

# Manual iteration
reader = cisv.CisvIterator('data.csv')
try:
    while True:
        row = reader.next()
        if row is None:
            break
        process(row)
finally:
    reader.close()

open_iterator(path, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False)

Convenience function that returns a CisvIterator. Same parameters as CisvIterator.

Example:

for row in cisv.open_iterator('data.csv'):
    print(row)

Running Tests

cd cisv
pip install -e ".[test]"
pytest

Benchmarking

pip install -e ".[benchmark]"
python -c "
import cisv
import time

# Create test file
with open('/tmp/test.csv', 'w') as f:
    f.write('col1,col2,col3\n')
    for i in range(100000):
        f.write(f'value{i}_1,value{i}_2,value{i}_3\n')

# Benchmark
start = time.time()
rows = cisv.parse_file('/tmp/test.csv')
print(f'Parsed {len(rows)} rows in {time.time()-start:.3f}s')
"

Download files

Source Distribution

  • cisv-0.4.7.tar.gz (55.3 kB)

Built Distribution

  • cisv-0.4.7-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (118.0 kB, CPython 3.12, manylinux: glibc 2.17+ x86-64)

File details

Details for the file cisv-0.4.7.tar.gz.

File metadata

  • Size: 55.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

  • SHA256: ac0e425d98fb5bc7529bb77a241086ed1bcd239884d2a027843ba5104a6be62a
  • MD5: 6078ee680fea4aa8ffddb07961fd60d5
  • BLAKE2b-256: 2e8dbe2c3285762952c6ef297c9a6c314f7db07f60e4f322c62566eebf2e70db

File details

Details for the file cisv-0.4.7-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

  • SHA256: 596c7250e4b345ecffdd3d625aeeb19fc0328b8c693fa9110d28b8797a737634
  • MD5: 21b7067a4404fd0dd4de7fb6b241aae7
  • BLAKE2b-256: c9bd22e535c8b846512fcc9a4164f4bd8c1502628fc62031c787c7b11f3b10c8
