High-performance CSV parser with SIMD optimizations

Project description

CISV Python Bindings (nanobind)

High-performance Python bindings for the CISV CSV parser using nanobind.

Performance

These bindings are 10-100x faster than the ctypes-based bindings because they:

  1. Use the batch API: All data is parsed in C and returned at once, eliminating millions of per-field callbacks
  2. Use nanobind: Much lower overhead than ctypes or pybind11
  3. Release the GIL: Parallel parsing runs without holding the Python GIL

File Size                    ctypes    nanobind    Speedup
142MB (1M rows × 10 cols)    ~20s      <0.8s       25x+
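
To make the first point concrete, here is a toy pure-Python model of the two calling styles. It is only an illustration built on the standard `csv` module; the function names are hypothetical and it does not reflect CISV's internals. It shows that a per-field design needs one Python-level call per field and per row, while a batch design needs a single call.

```python
import csv
import io


def parse_with_callbacks(text, on_field, on_row_end):
    """Per-field model: one Python call for every field and every row."""
    for row in csv.reader(io.StringIO(text, newline="")):
        for field in row:
            on_field(field)
        on_row_end()


def parse_batch(text):
    """Batch model: the parser builds all rows and returns them once."""
    return list(csv.reader(io.StringIO(text, newline="")))


def collect_via_callbacks(text):
    """Rebuild the rows from callbacks and count how many calls were made."""
    rows, current = [], []
    calls = 0

    def on_field(field):
        nonlocal calls
        calls += 1
        current.append(field)

    def on_row_end():
        nonlocal calls
        calls += 1
        rows.append(current.copy())
        current.clear()

    parse_with_callbacks(text, on_field, on_row_end)
    return rows, calls


sample = "a,b,c\n1,2,3\n"
callback_rows, callback_calls = collect_via_callbacks(sample)
batch_rows = parse_batch(sample)
print(callback_calls, "callbacks versus 1 batch call")
```

For a file with 1M rows and 10 columns, the per-field style implies on the order of ten million such calls, which is the overhead the batch API avoids.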

Installation

From PyPI (recommended)

pip install cisv

From source

cd bindings/python-nanobind
pip install .

Development install

cd bindings/python-nanobind
pip install -e .

Usage

import cisv

# Parse a file
rows = cisv.parse_file('data.csv')

# Parse with options
rows = cisv.parse_file(
    'data.csv',
    delimiter=';',
    quote="'",
    trim=True,
    skip_empty_lines=True
)

# Parse large files in parallel (faster on multi-core systems)
rows = cisv.parse_file('large.csv', parallel=True)

# Parse a string
rows = cisv.parse_string("a,b,c\n1,2,3")

# Count rows quickly (SIMD-accelerated)
count = cisv.count_rows('data.csv')
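
Because rows come back as plain lists of field values, turning them into records is a small post-processing step. The helper below is a sketch, not part of cisv: `rows_to_dicts` is a hypothetical name, and it assumes the header line is returned as the first row.

```python
def rows_to_dicts(rows):
    """Treat the first row as the header and map each later row to a dict."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


# Stand-in for the result of cisv.parse_string("a,b,c\n1,2,3"),
# assuming the header line comes back as the first row.
rows = [["a", "b", "c"], ["1", "2", "3"]]
records = rows_to_dicts(rows)
print(records)
```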

API Reference

parse_file(path, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False, parallel=False, num_threads=0)

Parse a CSV file and return all rows.

Parameters:

  • path: Path to the CSV file
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines
  • parallel: Use multi-threaded parsing (faster for large files)
  • num_threads: Number of threads for parallel parsing (0 = auto-detect)

Returns: List of rows, where each row is a list of field values.
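
One way to use the `parallel` and `num_threads` options is to enable them only for files above some size. The sketch below is an assumption-laden example: the helper name and the 64 MiB cutoff are hypothetical, and the right threshold should be found by measuring on your own hardware.

```python
import os
import tempfile

# Hypothetical cutoff, not a value recommended by the project.
PARALLEL_THRESHOLD_BYTES = 64 * 1024 * 1024


def parse_options_for(path, threshold=PARALLEL_THRESHOLD_BYTES):
    """Pick keyword arguments for parse_file based on the file size."""
    size = os.path.getsize(path)
    # num_threads=0 asks the parser to auto-detect the thread count.
    return {"parallel": size >= threshold, "num_threads": 0}


# Small demonstration on a temporary 12-byte file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "wb") as f:
        f.write(b"a,b\n1,2\n3,4\n")
    small_file_options = parse_options_for(demo_path)
    forced_parallel_options = parse_options_for(demo_path, threshold=1)

# With cisv installed, the result can be passed straight through:
#   rows = cisv.parse_file(path, **parse_options_for(path))
```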

parse_string(content, delimiter=',', quote='"', *, trim=False, skip_empty_lines=False)

Parse a CSV string and return all rows.

Parameters:

  • content: CSV content as a string
  • delimiter: Field delimiter character (default: ',')
  • quote: Quote character (default: '"')
  • trim: Whether to trim whitespace from fields
  • skip_empty_lines: Whether to skip empty lines

Returns: List of rows, where each row is a list of field values.
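
As a rough guide to what the options mean, the following reference model mirrors the documented signature using the standard `csv` module. It is an approximation for illustration only: `parse_string_reference` is a hypothetical name, and cisv's handling of edge cases (for example blank lines when `skip_empty_lines` is off, or whitespace around quoted fields) may differ.

```python
import csv
import io


def parse_string_reference(content, delimiter=",", quote='"', *,
                           trim=False, skip_empty_lines=False):
    """Pure-Python model of the documented parse_string options."""
    reader = csv.reader(io.StringIO(content, newline=""),
                        delimiter=delimiter, quotechar=quote)
    rows = []
    for row in reader:
        if skip_empty_lines and not row:
            continue  # csv.reader yields [] for a blank line
        if trim:
            row = [field.strip() for field in row]
        rows.append(row)
    return rows


plain = parse_string_reference("a,b,c\n1,2,3")
custom = parse_string_reference(
    "x ;'y;z'\n\n 1;2 ",
    delimiter=";",
    quote="'",
    trim=True,
    skip_empty_lines=True,
)
print(plain)
print(custom)
```

A model like this can serve as a cross-check in tests: parse the same input with both functions and compare the results on inputs where the semantics are expected to agree.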

count_rows(path)

Count the number of rows in a CSV file without full parsing.

This is very fast as it only scans for newlines using SIMD instructions.

Parameters:

  • path: Path to the CSV file

Returns: Number of rows in the file.
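
The idea of counting rows by scanning for newlines can be sketched in plain Python as below. This is a scalar stand-in for the SIMD scan, not cisv's code, and `count_newlines` is a hypothetical name. Note that a pure newline count also includes newlines inside quoted fields and the header line; how `count_rows` treats those cases, or a final line without a trailing newline, is not stated here.

```python
import os
import tempfile


def count_newlines(path, chunk_size=1 << 20):
    """Count b'\\n' bytes by reading the file in fixed-size chunks."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            total += chunk.count(b"\n")
    return total


# Small demonstration on a temporary file with three lines.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    with open(demo_path, "wb") as f:
        f.write(b"a,b\n1,2\n3,4\n")
    newline_count = count_newlines(demo_path)

print(newline_count)
```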

Running Tests

cd bindings/python-nanobind
pip install -e ".[test]"
pytest

Benchmarking

pip install -e ".[benchmark]"
python -c "
import cisv
import time

# Create test file
with open('/tmp/test.csv', 'w') as f:
    f.write('col1,col2,col3\n')
    for i in range(100000):
        f.write(f'value{i}_1,value{i}_2,value{i}_3\n')

# Benchmark
start = time.time()
rows = cisv.parse_file('/tmp/test.csv')
print(f'Parsed {len(rows)} rows in {time.time()-start:.3f}s')
"

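For a fairer comparison, it can help to time a baseline parser on the same file and to take the best of several runs. The sketch below uses only the standard library; `best_of` and `parse_with_stdlib` are hypothetical helper names, and the 1,000-row file is kept small so the example runs quickly.

```python
import csv
import os
import tempfile
import time


def best_of(func, *args, repeats=3):
    """Run func(*args) several times; return (best seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


def parse_with_stdlib(path):
    """Baseline: parse the whole file with the standard csv module."""
    with open(path, newline="") as f:
        return list(csv.reader(f))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bench.csv")
    with open(path, "w", newline="") as f:
        f.write("col1,col2,col3\n")
        for i in range(1000):
            f.write(f"value{i}_1,value{i}_2,value{i}_3\n")
    elapsed, rows = best_of(parse_with_stdlib, path)

print(f"stdlib csv: {len(rows)} rows in {elapsed:.4f}s")
```

With cisv installed, calling `best_of(cisv.parse_file, path)` on the same file while it still exists gives a directly comparable number.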
Download files

Source Distribution

  • cisv-0.2.0.tar.gz (34.2 kB): Source

Built Distributions

  • cisv-0.2.0-cp313-cp313-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (64.8 kB): CPython 3.13; manylinux glibc 2.24+ x86-64, glibc 2.28+ x86-64
  • cisv-0.2.0-cp313-cp313-macosx_11_0_arm64.whl (43.9 kB): CPython 3.13; macOS 11.0+ ARM64
  • cisv-0.2.0-cp313-cp313-macosx_10_14_x86_64.whl (46.9 kB): CPython 3.13; macOS 10.14+ x86-64
  • cisv-0.2.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (64.8 kB): CPython 3.12; manylinux glibc 2.24+ x86-64, glibc 2.28+ x86-64
  • cisv-0.2.0-cp312-cp312-macosx_11_0_arm64.whl (43.9 kB): CPython 3.12; macOS 11.0+ ARM64
  • cisv-0.2.0-cp312-cp312-macosx_10_14_x86_64.whl (46.9 kB): CPython 3.12; macOS 10.14+ x86-64
  • cisv-0.2.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (65.0 kB): CPython 3.11; manylinux glibc 2.24+ x86-64, glibc 2.28+ x86-64
  • cisv-0.2.0-cp311-cp311-macosx_11_0_arm64.whl (44.2 kB): CPython 3.11; macOS 11.0+ ARM64
  • cisv-0.2.0-cp311-cp311-macosx_10_14_x86_64.whl (47.0 kB): CPython 3.11; macOS 10.14+ x86-64
  • cisv-0.2.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (65.0 kB): CPython 3.10; manylinux glibc 2.24+ x86-64, glibc 2.28+ x86-64
  • cisv-0.2.0-cp310-cp310-macosx_11_0_arm64.whl (44.2 kB): CPython 3.10; macOS 11.0+ ARM64
  • cisv-0.2.0-cp310-cp310-macosx_10_14_x86_64.whl (47.0 kB): CPython 3.10; macOS 10.14+ x86-64
  • cisv-0.2.0-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (65.1 kB): CPython 3.9; manylinux glibc 2.24+ x86-64, glibc 2.28+ x86-64
  • cisv-0.2.0-cp39-cp39-macosx_11_0_arm64.whl (44.2 kB): CPython 3.9; macOS 11.0+ ARM64
  • cisv-0.2.0-cp39-cp39-macosx_10_14_x86_64.whl (47.1 kB): CPython 3.9; macOS 10.14+ x86-64
