High-performance CSV parser with SIMD acceleration

vroom-csv

High-performance CSV parser with SIMD acceleration for Python.

Features

  • SIMD-accelerated parsing using Google Highway for portable vectorization
  • Multi-threaded parsing for large files
  • Automatic dialect detection (delimiter, quoting, line endings)
  • Automatic type inference for integers, floats, booleans, and strings
  • Arrow PyCapsule interface for zero-copy interoperability with PyArrow, Polars, and DuckDB
  • Full type annotations with IDE autocomplete support

Installation

pip install vroom-csv

For Arrow interoperability:

pip install "vroom-csv[arrow]"   # PyArrow support
pip install "vroom-csv[polars]"  # Polars support

For development:

pip install "vroom-csv[dev]"

Quick Start

import vroom_csv

# Read a CSV file
table = vroom_csv.read_csv("data.csv")

print(f"Loaded {table.num_rows} rows, {table.num_columns} columns")
print(f"Columns: {table.column_names}")

# Access data
names = table.column("name")
first_row = table.row(0)

Arrow Interoperability

vroom-csv implements the Arrow PyCapsule interface for zero-copy data exchange:

PyArrow

import pyarrow as pa
import vroom_csv

table = vroom_csv.read_csv("data.csv")
arrow_table = pa.table(table)  # Zero-copy conversion

# Now use PyArrow's features
arrow_table.to_pandas()

Polars

import polars as pl
import vroom_csv

table = vroom_csv.read_csv("data.csv")
df = pl.from_arrow(table)  # Zero-copy conversion

# Now use Polars' features
df.filter(pl.col("age") > 30)

DuckDB

import duckdb
import vroom_csv

csv_data = vroom_csv.read_csv("data.csv")
# "table" is a reserved SQL keyword, so the variable needs a different name
result = duckdb.query("SELECT * FROM csv_data WHERE age > 30")

API Reference

read_csv(path, ...)

Read a CSV file and return a Table object.

read_csv(
    path: str,
    delimiter: str | None = None,
    quote_char: str = '"',
    has_header: bool = True,
    skip_rows: int = 0,
    n_rows: int | None = None,
    usecols: list[str | int] | None = None,
    dtype: dict[str, str] | None = None,
    null_values: list[str] | None = None,
    empty_is_null: bool = True,
    encoding: str = "utf-8",
    num_threads: int = 1,
) -> Table

Parameters:

  • path: Path to the CSV file
  • delimiter: Field delimiter (auto-detected if None)
  • quote_char: Quote character for quoted fields
  • has_header: Whether the first row contains column headers
  • skip_rows: Number of data rows to skip after the header
  • n_rows: Maximum number of rows to read (None = all)
  • usecols: List of column names or indices to read
  • dtype: Dict mapping column names to types ("string", "int", "float", "bool")
  • null_values: List of strings to treat as null values
  • empty_is_null: Whether empty fields are treated as null
  • encoding: File encoding (currently UTF-8 only)
  • num_threads: Number of threads for parsing
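
As a rough illustration of how these parameters combine, the sketch below writes a small semicolon-delimited file and collects a set of keyword arguments for read_csv. The sample data, the helper names (write_sample, load_sample), and the option values are assumptions made for the example; only the parameter names come from the signature above.

```python
import os

# Illustrative values only; the keys are parameter names from the signature above.
READ_OPTIONS = {
    "delimiter": ";",            # skip auto-detection
    "usecols": ["name", "age"],  # read two of the three columns
    "dtype": {"age": "int"},     # override inference for one column
    "null_values": ["NA"],       # treat "NA" as null
    "n_rows": 1000,              # cap the number of rows read
    "num_threads": 4,            # the documented default is 1
}

SAMPLE = "name;age;city\nAlice;30;Paris\nBob;NA;Berlin\n"


def write_sample(directory):
    """Write the sample CSV into `directory` and return its path."""
    path = os.path.join(directory, "people.csv")
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(SAMPLE)
    return path


def load_sample(path):
    """Parse `path` with vroom-csv using the options above."""
    import vroom_csv  # imported here so the helpers above work without the package

    return vroom_csv.read_csv(path, **READ_OPTIONS)
```

With vroom-csv installed, load_sample(write_sample(some_directory)) would be expected to return a Table limited to the name and age columns, though the exact behavior of each option should be checked against the installed version.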

detect_dialect(path)

Detect CSV dialect (delimiter, quoting, line endings).

dialect = vroom_csv.detect_dialect("data.csv")
print(dialect.delimiter)   # ','
print(dialect.quote_char)  # '"'
print(dialect.has_header)  # True
print(dialect.confidence)  # 0.95
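
For comparison, Python's standard library offers a simpler form of dialect sniffing through csv.Sniffer. The sketch below is not vroom-csv's detection algorithm and reports no confidence score; it only illustrates the kind of inference detect_dialect performs, and could serve as a cross-check. The sample text and the sniff_sample helper are made up for the example.

```python
import csv

SAMPLE = (
    "name;age;city\n"
    "Alice;30;Paris\n"
    "Bob;25;Berlin\n"
    "Carol;41;Rome\n"
)


def sniff_sample(text):
    """Guess the delimiter, quote character, and header presence of `text`."""
    sniffer = csv.Sniffer()
    # Restrict the candidates to common delimiters to keep the guess stable.
    dialect = sniffer.sniff(text, delimiters=",;\t|")
    return dialect.delimiter, dialect.quotechar, sniffer.has_header(text)


delimiter, quote_char, has_header = sniff_sample(SAMPLE)
print(f"delimiter={delimiter!r} quote_char={quote_char!r} has_header={has_header}")
```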

Table

The Table class represents parsed CSV data.

Properties:

  • num_rows: Number of data rows (excluding header)
  • num_columns: Number of columns
  • column_names: List of column names

Methods:

  • column(index_or_name): Get column data as a list
  • row(index): Get row data as a list
  • has_errors(): Check if any parse errors occurred
  • error_summary(): Get a summary of parse errors
  • errors(): Get list of parse error messages
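
One way to use the error-reporting methods is to check has_errors() right after parsing and surface the details before trusting the data. The helper below is a sketch: describe_parse_errors and its max_messages parameter are hypothetical names, the value returned by error_summary() is assumed to be printable, and errors() is assumed to return a list of strings as described above. It accepts any object that exposes these three methods.

```python
def describe_parse_errors(table, max_messages=5):
    """Return a short report of parse errors, or None if parsing was clean."""
    if not table.has_errors():
        return None

    messages = list(table.errors())
    lines = [f"Parse errors: {table.error_summary()}"]
    # Show only the first few messages so the report stays readable.
    lines.extend(f"  - {message}" for message in messages[:max_messages])

    remaining = len(messages) - max_messages
    if remaining > 0:
        lines.append(f"  ... and {remaining} more")
    return "\n".join(lines)
```

With a Table returned by read_csv, this might be called as report = describe_parse_errors(table), logging or raising when the result is not None.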

Arrow PyCapsule Protocol:

  • __arrow_c_schema__(): Export schema via Arrow C Data Interface
  • __arrow_c_stream__(): Export data via Arrow C Stream Interface
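
Because the protocol is just a pair of dunder methods, a consumer can check for it without knowing anything about vroom-csv. The sketch below shows that duck-typed check; supports_arrow_stream and to_arrow_table are hypothetical helper names, and the conversion assumes a PyArrow release whose pa.table() accepts objects exposing __arrow_c_stream__.

```python
def supports_arrow_stream(obj):
    """Return True if `obj` exposes the Arrow C Stream PyCapsule method."""
    return callable(getattr(obj, "__arrow_c_stream__", None))


def to_arrow_table(obj):
    """Convert a PyCapsule-compatible object to a pyarrow.Table."""
    if not supports_arrow_stream(obj):
        raise TypeError(
            f"{type(obj).__name__} does not implement __arrow_c_stream__"
        )
    import pyarrow as pa  # imported lazily; only needed for the conversion

    return pa.table(obj)
```

Checking first gives a clearer error message than letting the conversion fail inside PyArrow.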

Examples

See the examples directory for Jupyter notebooks.

Benchmarks

Run performance benchmarks against pandas, Polars, PyArrow, and DuckDB:

cd benchmarks
pip install pandas polars pyarrow duckdb
python benchmark_csv.py --sizes 1,10,100

See benchmarks/README.md for details.

License

MIT License - see LICENSE file in the repository root.

Download files

Source Distribution

vroom_csv-0.1.0.tar.gz (45.6 kB)

Uploaded Source

Built Distributions

vroom_csv-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (483.6 kB)

Uploaded CPython 3.12, manylinux: glibc 2.17+ x86-64

vroom_csv-0.1.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (456.6 kB)

Uploaded CPython 3.12, manylinux: glibc 2.17+ ARM64

vroom_csv-0.1.0-cp312-cp312-macosx_11_0_arm64.whl (267.3 kB)

Uploaded CPython 3.12, macOS 11.0+ ARM64

vroom_csv-0.1.0-cp312-cp312-macosx_10_15_x86_64.whl (299.2 kB)

Uploaded CPython 3.12, macOS 10.15+ x86-64

vroom_csv-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (484.8 kB)

Uploaded CPython 3.11, manylinux: glibc 2.17+ x86-64

vroom_csv-0.1.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (457.6 kB)

Uploaded CPython 3.11, manylinux: glibc 2.17+ ARM64

vroom_csv-0.1.0-cp311-cp311-macosx_11_0_arm64.whl (265.9 kB)

Uploaded CPython 3.11, macOS 11.0+ ARM64

vroom_csv-0.1.0-cp311-cp311-macosx_10_15_x86_64.whl (296.0 kB)

Uploaded CPython 3.11, macOS 10.15+ x86-64

vroom_csv-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (483.3 kB)

Uploaded CPython 3.10, manylinux: glibc 2.17+ x86-64

vroom_csv-0.1.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (456.2 kB)

Uploaded CPython 3.10, manylinux: glibc 2.17+ ARM64

vroom_csv-0.1.0-cp310-cp310-macosx_11_0_arm64.whl (264.2 kB)

Uploaded CPython 3.10, macOS 11.0+ ARM64

vroom_csv-0.1.0-cp310-cp310-macosx_10_15_x86_64.whl (294.3 kB)

Uploaded CPython 3.10, macOS 10.15+ x86-64

vroom_csv-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (483.8 kB)

Uploaded CPython 3.9, manylinux: glibc 2.17+ x86-64

vroom_csv-0.1.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (456.7 kB)

Uploaded CPython 3.9, manylinux: glibc 2.17+ ARM64

vroom_csv-0.1.0-cp39-cp39-macosx_11_0_arm64.whl (264.4 kB)

Uploaded CPython 3.9, macOS 11.0+ ARM64

vroom_csv-0.1.0-cp39-cp39-macosx_10_15_x86_64.whl (294.4 kB)

Uploaded CPython 3.9, macOS 10.15+ x86-64

File details

Details for the file vroom_csv-0.1.0.tar.gz.

File metadata

  • Download URL: vroom_csv-0.1.0.tar.gz
  • Upload date:
  • Size: 45.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vroom_csv-0.1.0.tar.gz
Algorithm Hash digest
SHA256 dbd363add9941313f532f30a1f9eca852eb6443495d3fcc432e57dce196ffb35
MD5 93485a8451296a00d96b01dc6a486093
BLAKE2b-256 221c7ce8e7b6b21af0148d7e9874652e8b14191442d395eb9c70a23e77e1f434

Provenance

The following attestation bundles were made for vroom_csv-0.1.0.tar.gz:

Publisher: python-wheels.yml on jimhester/libvroom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file vroom_csv-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d74e2af9f59b2a812eaed4d2f3e04c3f5a92a1a9a3255b16400256682a4a6823
MD5 dce50ea77371c6e6c61d6caa57d60e06
BLAKE2b-256 f9f4a6f547f2c4292b51008a627c700578a3f4a76dcdb2c4dc6df5ad29f67329

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 24c277174caf3f76dc11a5b515b364d7537dc5fbe58e25ffde9526fc015e372f
MD5 004f1f71ec83af1ffbc62e559ec64052
BLAKE2b-256 7d0e641393ab6798052e14b53fd4270949adbfe7c791fa719582539eae1a4c12

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 2ca6acd6a24e5c4c55e3acd5e4a5575a531e61c63703d193c2a06edba8b73fbe
MD5 8566eccbd95da02f8756fd559ee3256a
BLAKE2b-256 8359dd5b7dc1ad9284e68fb67f5b360783d53747e64f2fdc7c4ae66d111abc23

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp312-cp312-macosx_11_0_arm64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp312-cp312-macosx_10_15_x86_64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp312-cp312-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 cd71fa0d777136eee7504a2fed7b51a3a6f5ae3d982461cc73671872370ba2a2
MD5 610ef93f8da2254f4ad09172f517a812
BLAKE2b-256 dfd092c7c3411709e2aa933af0961ca493d249ee2f20620eff79e5afe65720b0

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp312-cp312-macosx_10_15_x86_64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 3888774164194dbf398953ef38f10d9938d152f3366203322418dc785d49b5a6
MD5 02e7ddbaa6f0bf076675525335099d8e
BLAKE2b-256 ecd05f6aaf535ded810c2b07a27ecfe9d2548b310671553f0e7c0b2a01b53c1f

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 401f3be4cf9159616b1eb97aa82dc0099fc0943c480aa62736acf17e67d4afc7
MD5 b89a7d44b97dbaf30d60d471623ae25d
BLAKE2b-256 d4b6e27cce598e54acc0c461ab6ff3f16ed3f146d364f16087d8f66fec38d7a7

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 2b7ec9019900b37fc58f65b58131fea387c999e6bfc5cb28b3830708ac9edb2b
MD5 761dc798dd3b5260dc497b6079cd8dc6
BLAKE2b-256 114dce57551f6a46fa70dbd1108bedddbeb4c125b1e5657b71d9eccd06e71a66

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp311-cp311-macosx_11_0_arm64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp311-cp311-macosx_10_15_x86_64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp311-cp311-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 82f83b0d4b53adb0f282a2d49d79865d645ef24c4b472a05a55e5285b92bcef9
MD5 ba6c568a5a2531cc1d98c2d4f4e84def
BLAKE2b-256 e81dde752ffc17739654a00e67710f8d10d0b2f788f5dbcc420dfeb2a3f44427

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp311-cp311-macosx_10_15_x86_64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e80a14e22da22516dec98560f57da3a91df3d728a72cbe4a3d4b090796b6d22e
MD5 fc904be0acba2ce5778a3df5f280a924
BLAKE2b-256 c5d3049b675453b25e8f6637676c260ce682c58fa4a2f33052bdace385a12adf

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 d3416a81c853633c29e773f3a7c01ccca1631316e98faf47a0fd43774a2f0c89
MD5 766b5dead2f9c5ac8cbeaa0a3c16216c
BLAKE2b-256 1c609a7978c6bf26fd11f8f4a439f7e3432152eb3e16b38a63d67994308dddfc

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp310-cp310-macosx_11_0_arm64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 391fa049ce5a92b72bc5f40cff5d6b23e63bcf033d6bd810faa0cdca2cdb38d9
MD5 0508b64285209416ddaf2509a4de1539
BLAKE2b-256 fec69b37513cd19f17d104276a2e4455401b7ddebd66d0943fee23d99474a5e9

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp310-cp310-macosx_11_0_arm64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp310-cp310-macosx_10_15_x86_64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp310-cp310-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 0f512ccfb513390c10da474574775ac185efcb766f6216ed917a3f3168921c8b
MD5 b436cb41914130e217a22f650c394d4f
BLAKE2b-256 bec6dc5b4b3fc9e01c60ed292ad10dd61a9ed455b51b82ae5371ede95e85a9d8

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp310-cp310-macosx_10_15_x86_64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 1bbd1267eb037787db1be258ee3a80d9e582fe757485a32653e64fad4ae3e962
MD5 b54a3d2a46bc455d5c591f4fbda40679
BLAKE2b-256 7165adfff87078c56f7362011b647911b4dec63b04481fa3809e760b112d5831

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 2a078dd5f0c0200cd40bbf11696d91995e32d0966bd1c978b250a79fc09bb1bf
MD5 355d8405c49489aeaea0651c4ad9b228
BLAKE2b-256 8b5161a2b530d2bf6aab08b47786173404544ae7ee334ad146f1a8d08537e989

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp39-cp39-macosx_11_0_arm64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp39-cp39-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 8926bb5b08fe24d2287ee90fe1b7a03e4548369a520da170012dd6816f2e7a7f
MD5 504396a4d4e9df942d19db1827da16f1
BLAKE2b-256 893a87e19c8aad2fbe29da3b87111c4f70b37ee7bed2300d5bc0c0ee925253c6

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp39-cp39-macosx_11_0_arm64.whl:

Publisher: python-wheels.yml on jimhester/libvroom

File details

Details for the file vroom_csv-0.1.0-cp39-cp39-macosx_10_15_x86_64.whl.

File hashes

Hashes for vroom_csv-0.1.0-cp39-cp39-macosx_10_15_x86_64.whl
Algorithm Hash digest
SHA256 4c0f36a89f619de7438db47cd569f71cbb2e78845744cf6c01d8365d2f2d8679
MD5 ebc3ebdf04184d9fd3d816a2bde08d1b
BLAKE2b-256 b85985c5664ad1a53094b710c1a70709df150a459e07c0fe0aab9b223348406d

Provenance

The following attestation bundles were made for vroom_csv-0.1.0-cp39-cp39-macosx_10_15_x86_64.whl:

Publisher: python-wheels.yml on jimhester/libvroom
