High-performance CSV parser with SIMD optimizations

CISV Python Bindings (nanobind)

High-performance Python bindings for the CISV CSV parser using nanobind.

Performance

These bindings are roughly 10-100x faster than the ctypes-based bindings (about 25x on the benchmark below) because they:

  1. Use the batch API: All data is parsed in C and returned at once, eliminating millions of per-field callbacks
  2. Use nanobind: Much lower overhead than ctypes or pybind11
  3. Release the GIL: Parallel parsing runs without holding the Python GIL

File Size                    ctypes   nanobind   Speedup
142MB (1M rows × 10 cols)    ~20s     <0.8s      25x+
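
The figures in the table above can be turned into a speedup and a rough throughput. The sketch below uses only the quoted numbers. It treats the `<0.8s` bound as exactly 0.8 s, so every result is a lower bound, and it assumes 1 MB = 10^6 bytes, which the table does not specify.

```python
# Figures quoted in the table; 0.8 s is an upper bound on the nanobind time,
# so the derived values below are lower bounds.
file_size_mb = 142
rows = 1_000_000
ctypes_seconds = 20.0
nanobind_seconds = 0.8

speedup = ctypes_seconds / nanobind_seconds        # how many times faster
throughput_mb_s = file_size_mb / nanobind_seconds  # megabytes parsed per second
rows_per_second = rows / nanobind_seconds          # rows parsed per second

print(f"speedup    >= {speedup:.0f}x")
print(f"throughput >= {throughput_mb_s:.1f} MB/s")
print(f"rows/s     >= {rows_per_second:,.0f}")
```

This is consistent with the `25x+` entry in the Speedup column.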

Installation

From PyPI (recommended)

pip install cisv

From source

cd bindings/python-nanobind
pip install .

Development install

cd bindings/python-nanobind
pip install -e .

Usage

import cisv

# Parse a file
rows = cisv.parse_file('data.csv')

# Parse with options
rows = cisv.parse_file(
    'data.csv',
    delimiter=';',
    quote="'",
    trim=True,
    skip_empty_lines=True
)

# Parse large files in parallel (faster on multi-core systems)
rows = cisv.parse_file('large.csv', parallel=True)

# Parse a string
rows = cisv.parse_string("a,b,c\n1,2,3")

# Count rows quickly (SIMD-accelerated)
count = cisv.count_rows('data.csv')
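
Because the parse functions return a list of rows, each a list of field values, a header row can be paired with the data rows in a few lines. The helper below is a hypothetical illustration and not part of cisv. It assumes the first row is a header and that fields come back as strings, and it uses a literal list as a stand-in for parser output so that it runs without cisv installed.

```python
from typing import Dict, List


def rows_to_dicts(rows: List[List[str]]) -> List[Dict[str, str]]:
    """Pair each data row with the header row (assumed to be rows[0]).

    zip() stops at the shorter sequence, so extra fields in a row are dropped
    and missing fields are simply absent from that row's dict.
    """
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


# Stand-in for what cisv.parse_string("a,b,c\n1,2,3") is assumed to return.
rows = [["a", "b", "c"], ["1", "2", "3"]]
records = rows_to_dicts(rows)
print(records)
```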

API Reference

parse_file(path, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False, parallel=False, num_threads=0)

Parse a CSV file and return all rows.

Parameters:

  • path: Path to the CSV file
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines
  • parallel: Use multi-threaded parsing (faster for large files)
  • num_threads: Number of threads for parallel parsing (0 = auto-detect)

Returns: List of rows, where each row is a list of field values.
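
Since `parallel` and `num_threads` are keyword-only options, a small wrapper can decide when to enable them. The sketch below is hypothetical: `parse_auto` and the 64 MiB threshold are illustrative names and values, not part of cisv, and the parser is passed in as an argument so the wrapper can be exercised without the package installed.

```python
import os
from typing import Callable, List

# Hypothetical cut-off; a sensible value depends on the machine and the data.
PARALLEL_THRESHOLD_BYTES = 64 * 1024 * 1024


def parse_auto(
    path: str,
    parse_file: Callable[..., List[List[str]]],
    threshold: int = PARALLEL_THRESHOLD_BYTES,
    **options,
) -> List[List[str]]:
    """Call parse_file with parallel=True only for files at or above threshold."""
    use_parallel = os.path.getsize(path) >= threshold
    # num_threads=0 is documented as "auto-detect".
    return parse_file(path, parallel=use_parallel, num_threads=0, **options)
```

With cisv installed, a call might look like `rows = parse_auto('large.csv', cisv.parse_file, delimiter=';')`.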

parse_string(content, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False)

Parse a CSV string and return all rows.

Parameters:

  • content: CSV content as a string
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines

Returns: List of rows, where each row is a list of field values.
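
To sanity-check results on small inputs, it can help to have a pure-Python reference with the same signature. The sketch below uses the standard library `csv` module. `reference_parse` is a hypothetical helper, and its handling of `trim` (strip whitespace from each field) and `skip_empty_lines` (drop blank lines) is an assumption about what those options mean, not a statement of how cisv implements them.

```python
import csv
import io
from typing import List


def reference_parse(
    content: str,
    delimiter: str = ",",
    quote: str = '"',
    *,
    trim: bool = False,
    skip_empty_lines: bool = False,
) -> List[List[str]]:
    """Pure-Python stand-in that mirrors the parse_string signature."""
    reader = csv.reader(
        io.StringIO(content, newline=""), delimiter=delimiter, quotechar=quote
    )
    rows = []
    for row in reader:
        if skip_empty_lines and not row:
            continue  # csv.reader yields [] for a blank line
        rows.append([field.strip() for field in row] if trim else row)
    return rows


print(reference_parse("a,b,c\n1,2,3"))
```

With cisv installed, the output of `cisv.parse_string` could be compared against this reference on the same input; any difference would show where the assumptions above do not hold.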

count_rows(path)

Count the number of rows in a CSV file without full parsing.

This is fast because it only scans for newline characters using SIMD instructions instead of parsing individual fields.

Parameters:

  • path: Path to the CSV file

Returns: Number of rows in the file.
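
The idea of counting rows by scanning for newlines can be illustrated in plain Python. This is a scalar sketch of the technique, not cisv's SIMD implementation, and `count_newlines` is a hypothetical name. How `count_rows` treats a header row, a final line without a trailing newline, or newlines inside quoted fields is not stated here, so the two functions may disagree on such files.

```python
def count_newlines(path: str, chunk_size: int = 1 << 20) -> int:
    """Count newline bytes by reading the file in fixed-size binary chunks."""
    total = 0
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object at end of file.
        while chunk := f.read(chunk_size):
            total += chunk.count(b"\n")
    return total
```

Reading in chunks keeps memory use bounded regardless of file size.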

Running Tests

cd bindings/python-nanobind
pip install -e ".[test]"
pytest

Benchmarking

pip install -e ".[benchmark]"
python -c "
import cisv
import time

# Create test file
with open('/tmp/test.csv', 'w') as f:
    f.write('col1,col2,col3\n')
    for i in range(100000):
        f.write(f'value{i}_1,value{i}_2,value{i}_3\n')

# Benchmark (perf_counter is a monotonic, high-resolution timer)
start = time.perf_counter()
rows = cisv.parse_file('/tmp/test.csv')
print(f'Parsed {len(rows)} rows in {time.perf_counter()-start:.3f}s')
"
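
A single timed run can be noisy, so a variant of the benchmark above takes the best of several runs and times the standard library `csv` module as a baseline. This is a sketch under assumptions: `best_of` and `stdlib_parse` are hypothetical helpers, the file is written to a temporary location instead of `/tmp/test.csv`, and cisv is timed only if it can be imported.

```python
import csv
import os
import tempfile
import time


def best_of(fn, repeats=3):
    """Return the fastest wall-clock time, in seconds, over several runs of fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def stdlib_parse(path):
    """Baseline: parse the whole file with the standard library csv module."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


# Same file shape as the benchmark above: a header plus 100,000 three-column rows.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as f:
    f.write("col1,col2,col3\n")
    for i in range(100_000):
        f.write(f"value{i}_1,value{i}_2,value{i}_3\n")
    path = f.name

try:
    print(f"csv.reader: {best_of(lambda: stdlib_parse(path)):.3f}s")
    try:
        import cisv  # optional: only timed when the package is installed
        print(f"cisv:       {best_of(lambda: cisv.parse_file(path)):.3f}s")
    except ImportError:
        print("cisv not installed; skipped")
finally:
    os.remove(path)  # clean up the temporary file
```

Taking the minimum rather than the mean reduces the influence of one-off delays such as a cold file cache.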

Download files

Source Distribution

  • cisv-0.2.2.tar.gz (41.6 kB): Source

Built Distributions

  • cisv-0.2.2-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (77.6 kB): CPython 3.13, manylinux (glibc 2.24+, glibc 2.28+), x86-64
  • cisv-0.2.2-cp313-cp313-macosx_11_0_arm64.whl (54.9 kB): CPython 3.13, macOS 11.0+, ARM64
  • cisv-0.2.2-cp313-cp313-macosx_10_14_x86_64.whl (57.4 kB): CPython 3.13, macOS 10.14+, x86-64
  • cisv-0.2.2-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (77.6 kB): CPython 3.12, manylinux (glibc 2.24+, glibc 2.28+), x86-64
  • cisv-0.2.2-cp312-cp312-macosx_11_0_arm64.whl (54.9 kB): CPython 3.12, macOS 11.0+, ARM64
  • cisv-0.2.2-cp312-cp312-macosx_10_14_x86_64.whl (57.4 kB): CPython 3.12, macOS 10.14+, x86-64
  • cisv-0.2.2-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (77.7 kB): CPython 3.11, manylinux (glibc 2.24+, glibc 2.28+), x86-64
  • cisv-0.2.2-cp311-cp311-macosx_11_0_arm64.whl (55.1 kB): CPython 3.11, macOS 11.0+, ARM64
  • cisv-0.2.2-cp311-cp311-macosx_10_14_x86_64.whl (57.4 kB): CPython 3.11, macOS 10.14+, x86-64
  • cisv-0.2.2-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (77.8 kB): CPython 3.10, manylinux (glibc 2.24+, glibc 2.28+), x86-64
  • cisv-0.2.2-cp310-cp310-macosx_11_0_arm64.whl (55.2 kB): CPython 3.10, macOS 11.0+, ARM64
  • cisv-0.2.2-cp310-cp310-macosx_10_14_x86_64.whl (57.5 kB): CPython 3.10, macOS 10.14+, x86-64
  • cisv-0.2.2-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (77.9 kB): CPython 3.9, manylinux (glibc 2.24+, glibc 2.28+), x86-64
  • cisv-0.2.2-cp39-cp39-macosx_11_0_arm64.whl (55.2 kB): CPython 3.9, macOS 11.0+, ARM64
  • cisv-0.2.2-cp39-cp39-macosx_10_14_x86_64.whl (57.6 kB): CPython 3.9, macOS 10.14+, x86-64
