Blazing-fast drop-in replacement for Python's csv module, powered by Rust

zcsv

import zcsv as csv: same API, with reads 4-6x faster and writes about 1.4x faster in the benchmarks below.

Why zcsv?

Python's built-in csv module is implemented in C but still creates Python objects for every field of every row. For a 100K-row file with 50 columns, that's 5 million string allocations just to iterate.

zcsv eliminates this. The Rust core parses CSV via SIMD instructions into a single contiguous buffer. The Python-facing Row object is a zero-copy cursor — it holds a pointer and an index, nothing more. Python strings are created only when you access a field, and only for the fields you actually use.
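The lazy-access idea can be illustrated in pure Python. This is only a conceptual sketch, not zcsv's Rust implementation: the `LazyRow` class, the `split_spans` helper, and the offset layout are hypothetical names chosen for illustration, and quoting is ignored.

```python
class LazyRow:
    """Toy cursor over a shared bytes buffer; decodes a field only on access."""

    def __init__(self, buffer: bytes, spans: list[tuple[int, int]]):
        self._buffer = buffer      # one contiguous buffer shared by all fields
        self._spans = spans        # (start, end) byte offsets of each field
        self.decoded = 0           # how many Python strings were materialized

    def __len__(self) -> int:
        return len(self._spans)

    def __getitem__(self, index: int) -> str:
        start, end = self._spans[index]   # negative indexes work via list indexing
        self.decoded += 1
        return self._buffer[start:end].decode("utf-8")


def split_spans(line: bytes, delimiter: bytes = b",") -> list[tuple[int, int]]:
    """Record field boundaries without creating any strings (no quoting support)."""
    spans, start = [], 0
    while True:
        end = line.find(delimiter, start)
        if end == -1:
            spans.append((start, len(line)))
            return spans
        spans.append((start, end))
        start = end + 1


buffer = b"Alice,30,NYC"
row = LazyRow(buffer, split_spans(buffer))
print(row[0], row[-1], len(row), row.decoded)   # only two of three fields decoded
```

Only the fields that are indexed become Python strings; the middle field stays as raw bytes in the buffer.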

Features you won't find elsewhere

CSV Injection Protection: safe=True escapes fields that begin with =, +, -, @, \t, or \r (OWASP best practice)
RFC 4180 Strict Mode: strict=True validates field counts and quoting rules, and reports line numbers in errors
Delimiter Autodetection: frequency analysis in Rust over the candidates , ; \t | :
Encoding Autodetection: UTF-8, UTF-16 (LE/BE), Latin-1, with BOM handling
Automatic Type Inference: zcsv.read() returns a typed list[dict] (int, float, bool, str)
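To give a feel for delimiter autodetection by frequency analysis, here is a minimal pure-Python sketch. It is not zcsv's Rust algorithm: the `guess_delimiter` function and its scoring rule are assumptions made for illustration, and quoted fields are not handled.

```python
from collections import Counter

CANDIDATES = [",", ";", "\t", "|", ":"]


def guess_delimiter(sample: str, candidates=CANDIDATES) -> str:
    """Pick the candidate that appears on every line with the most consistent count."""
    lines = [line for line in sample.splitlines() if line]
    best, best_score = ",", -1.0          # fall back to a comma when nothing matches
    for cand in candidates:
        counts = [line.count(cand) for line in lines]
        if not counts or min(counts) == 0:
            continue                      # a real delimiter should appear on every line
        # Most common per-line count and the number of lines that agree with it
        mode_count, agreeing = Counter(counts).most_common(1)[0]
        score = (agreeing / len(counts)) * mode_count
        if score > best_score:
            best, best_score = cand, score
    return best


print(repr(guess_delimiter("a;b;c\n1;2;3\n")))
print(repr(guess_delimiter("name,time\nAlice,10:30\nBob,11:45\n")))
```

The second sample shows why consistency matters: the colon appears in the data rows but not in the header, so it is rejected in favor of the comma.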

Install

pip install zcsv

From source (requires Rust toolchain):

git clone https://github.com/Seinarukiro2/zcsv.git
cd zcsv
pip install maturin
maturin develop --release

Quick Start

Drop-in replacement

import zcsv as csv

# Same calls as the stdlib reader, about 4x faster in the benchmarks below
with open("data.csv") as f:
    for row in csv.reader(f):
        print(row[0], row[1])

# DictReader: 5-6x faster in the benchmarks below
with open("data.csv") as f:
    for row in csv.DictReader(f):
        print(row["name"], row["age"])

# Writer: about 1.4x faster in the benchmarks below
with open("out.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["name", "age"])
    w.writerows([["Alice", "30"], ["Bob", "25"]])

zcsv extensions

import zcsv

# Read with automatic type inference
data = zcsv.read("data.csv")
# [{"name": "Alice", "age": 30, "active": True}, ...]

# Batch reading for large files
for batch in zcsv.read_batches("huge.csv", batch_size=10_000):
    process(batch)

# Write with CSV injection protection (safe=True by default)
zcsv.write("out.csv", data)

API Reference

`zcsv.reader(csvfile, **kwargs)`

Returns a cursor iterator. Each iteration advances to the next row. Access fields with row[0] (by index) or row.to_list() for a full list.

with open("data.csv") as f:
    for row in zcsv.reader(f):
        name = row[0]          # lazy — creates Python string only now
        last = row[-1]         # negative indexing works
        print(len(row))        # field count
        print(repr(row))       # ['Alice', '30', 'NYC']

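Storing rows: Unlike the stdlib reader, the cursor reuses the same object on every iteration, so a list built directly from the loop variable holds repeated references to one object. To keep rows, copy each one with snapshot() or to_list():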

with open("data.csv") as f:
    rows = [row.snapshot() for row in zcsv.reader(f)]
    # or: [row.to_list() for row in zcsv.reader(f)]

Parameters: delimiter, quotechar, strict (RFC 4180 validation)
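The aliasing pitfall that snapshot() avoids can be reproduced with a toy cursor in plain Python. `CursorReader` below is a hypothetical stand-in written for this example, not zcsv's class; it only mimics the "return self from `__next__`" behavior.

```python
class CursorReader:
    """Toy reader that yields itself on every iteration, like a cursor."""

    def __init__(self, lines):
        self._lines = iter(lines)
        self._fields = []

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the inner iterator ends the loop
        self._fields = next(self._lines).split(",")
        return self                      # same object every time

    def __getitem__(self, index):
        return self._fields[index]

    def to_list(self):
        return list(self._fields)        # independent copy of the current row


lines = ["Alice,30", "Bob,25"]
aliased = [row for row in CursorReader(lines)]          # two references to one cursor
copied = [row.to_list() for row in CursorReader(lines)]  # one list per row

print([row[0] for row in aliased])   # every entry reflects the last row read
print(copied)
```

Copying inside the loop costs one allocation per stored row, which is the price of keeping data after the cursor has moved on.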

`zcsv.DictReader(f, fieldnames=None, **kwargs)`

Same cursor pattern with dict-like access:

with open("data.csv") as f:
    for row in zcsv.DictReader(f):
        row["name"]           # by key
        row[0]                # also by index
        row.keys()            # column names
        row.values()          # all values
        row.items()           # (key, value) pairs
        row.get("x", "N/A")  # with default
        "name" in row         # membership test

`zcsv.read(path, **kwargs) -> list[dict]`

Read entire file with automatic type inference.

zcsv.read("data.csv",
    delimiter=None,       # None = autodetect
    has_header=True,
    schema={"id": int, "price": float},  # override types
    skip_rows=0,
    max_rows=None,
    columns=["name", "age"],  # select columns
    null_values=["", "NA", "null", "None"],
    encoding=None,        # None = autodetect
    strict=False,         # RFC 4180 validation
    n_threads=None,       # parallel type conversion
)
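As a rough illustration of what type inference with null_values involves, the sketch below converts fields one at a time on top of the stdlib csv module. It is a simplification under stated assumptions: `infer_value` and `read_typed` are hypothetical helpers, nulls are mapped to None, and inference is per field rather than per column, which may differ from what zcsv.read() does.

```python
import csv
import io


def infer_value(text: str, null_values=("", "NA", "null", "None")):
    """Convert one CSV field to None, bool, int, or float, else leave it as str."""
    if text in null_values:
        return None                      # assumption: null markers become None
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):            # try the narrower type first
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def read_typed(text: str) -> list[dict]:
    """Parse CSV text with a header row and infer a type for every field."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: infer_value(value) for key, value in row.items()} for row in reader]


sample = "name,age,active,city\nAlice,30,True,NYC\nBob,NA,false,\n"
for record in read_typed(sample):
    print(record)
```

A per-column approach would additionally make sure that every value in a column ends up with the same type, which this per-field version does not guarantee.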

`zcsv.write(path, data, **kwargs)`

zcsv.write("out.csv", data,
    delimiter=",",
    safe=True,    # CSV injection protection (default: True)
    strict=False,
)
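For readers unfamiliar with CSV injection, the sketch below shows one common mitigation from the OWASP guidance: prefixing risky cells with a single quote. How zcsv escapes values internally is not documented here, so the prefix strategy and the `neutralize` and `write_safe` helpers are assumptions for illustration only.

```python
import csv
import io

# Leading characters that spreadsheet applications may treat as a formula
DANGEROUS_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def neutralize(value) -> str:
    """Prefix a cell with a single quote if it could be read as a formula."""
    text = str(value)
    if text.startswith(DANGEROUS_PREFIXES):
        return "'" + text
    return text


def write_safe(rows) -> str:
    """Serialize rows to CSV text with every cell neutralized."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow([neutralize(cell) for cell in row])
    return out.getvalue()


print(write_safe([["name", "note"], ["Alice", "=SUM(A1:A9)"], ["Bob", "-5"]]))
```

Note the trade-off: legitimate negative numbers such as -5 are escaped too, which is one reason a stdlib-compatible writer might leave this protection off by default.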

`zcsv.writer(csvfile, **kwargs)` / `zcsv.DictWriter(csvfile, fieldnames, **kwargs)`

Stdlib-compatible streaming writers. Here safe defaults to False so that output matches the stdlib; zcsv.write() is the entry point that enables it by default.

`zcsv.read_batches(path, batch_size=10_000, **kwargs)`

Memory-efficient iterator yielding list[dict] batches.
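The batching pattern itself is easy to sketch with the stdlib. The `iter_batches` helper below is a hypothetical illustration of yielding list[dict] chunks; it takes an open file object rather than a path and does no type inference, so it is not equivalent to zcsv.read_batches().

```python
import csv
import io
from itertools import islice


def iter_batches(csvfile, batch_size=10_000):
    """Yield lists of at most batch_size row dicts from an open CSV file."""
    reader = csv.DictReader(csvfile)
    while True:
        batch = list(islice(reader, batch_size))   # pull up to batch_size rows
        if not batch:
            return                                  # input exhausted
        yield batch


text = "id,name\n" + "".join(f"{i},user{i}\n" for i in range(5))
sizes = [len(batch) for batch in iter_batches(io.StringIO(text), batch_size=2)]
print(sizes)
```

Only one batch is held in memory at a time, so peak memory depends on batch_size rather than on the file size.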

Architecture

Python API        ┌─ reader() ─── cursor Row (zero-copy, lazy strings)
                  ├─ DictReader() ─── cursor with field names
                  ├─ writer() / DictWriter() ─── raw FFI serialization
                  ├─ read() ─── type inference + parallel conversion
                  └─ write() ─── CSV injection protection

Rust Core         ┌─ simd-csv ─── SIMD-accelerated CSV parsing
(PyO3 + FFI)      ├─ SharedData ─── single Vec<u8> buffer for all rows
                  ├─ memmap2 ─── memory-mapped I/O for large files
                  ├─ rayon ─── parallel column type conversion
                  ├─ encoding_rs ─── charset detection + conversion
                  └─ fast_pyobjects ─── raw CPython FFI (PyUnicode_New, PyList_SET_ITEM)
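The charset-detection step in the diagram can be approximated in a few lines of Python. This sketch is not how encoding_rs or zcsv works internally; `detect_encoding` is a hypothetical helper that checks for a byte order mark, then tries UTF-8, then falls back to Latin-1. UTF-32 is deliberately ignored.

```python
import codecs


def detect_encoding(raw: bytes) -> str:
    """Guess a Python codec name from a BOM, then from UTF-8 validity."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"               # this codec strips the BOM when decoding
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"                  # this codec reads byte order from the BOM
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "latin-1"                 # Latin-1 can decode any byte sequence


for data in (b"name,age\n", codecs.BOM_UTF8 + b"name\n",
             "name\n".encode("utf-16"), "caf\xe9\n".encode("latin-1")):
    encoding = detect_encoding(data)
    print(encoding, repr(data.decode(encoding)))
```

Because Latin-1 accepts every byte, it works as a last resort but can silently mislabel other single-byte encodings.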

Key design decisions

Zero-copy Row: All CSV data lives in one contiguous Vec<u8>. Row is Arc<SharedData> + u32 — 12 bytes, no per-row heap allocation. Python strings created only on field access via raw PyUnicode_New.
Cursor pattern: reader.__next__() returns self with Py_INCREF (~10ns) instead of allocating a new object (~900ns).
String dedup cache: Repeated values (countries, categories, booleans) are cached. Auto-disables after 200 samples if hit rate < 20%.
GIL release: File I/O, SIMD parsing, type inference, CSV serialization all run with GIL released.
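The string dedup cache with its auto-disable rule can be sketched in Python as follows. The `DedupCache` class is a hypothetical illustration: it uses the 200-sample and 20% thresholds quoted above, but evaluating the hit rate exactly once, at the 200th lookup, is an assumption about how such a rule might be applied.

```python
class DedupCache:
    """Reuse decoded strings for repeated values; give up if hits are rare."""

    def __init__(self, sample_size=200, min_hit_rate=0.20):
        self._cache = {}
        self._lookups = 0
        self._hits = 0
        self._sample_size = sample_size
        self._min_hit_rate = min_hit_rate
        self.enabled = True

    def get(self, raw: bytes) -> str:
        if not self.enabled:
            return raw.decode("utf-8")             # plain decode, no bookkeeping
        self._lookups += 1
        text = self._cache.get(raw)
        if text is None:
            text = raw.decode("utf-8")
            self._cache[raw] = text
        else:
            self._hits += 1
        if self._lookups == self._sample_size:     # evaluate once after the sample
            if self._hits / self._lookups < self._min_hit_rate:
                self.enabled = False
                self._cache.clear()
        return text


countries = DedupCache()
for i in range(200):
    countries.get(b"US" if i % 2 else b"DE")       # highly repetitive column

ids = DedupCache()
for i in range(200):
    ids.get(str(i).encode())                       # every value is unique

print(countries.enabled, ids.enabled)
```

The repetitive column keeps its cache, while the unique-ID column turns it off so later rows skip the dictionary lookup entirely.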

Benchmarks

100,000 rows, Python 3.13, Apple Silicon M4:

Operation	stdlib `csv`	`zcsv`	Speedup
`reader()` 10 cols	0.080s	0.018s	4.4x
`reader()` 50 cols	0.335s	0.073s	4.6x
`DictReader()` 10 cols	0.124s	0.025s	5.0x
`DictReader()` 50 cols	0.491s	0.082s	6.0x
`writer()` 10 cols	0.160s	0.112s	1.4x
`writer()` 50 cols	0.780s	0.548s	1.4x
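To reproduce a comparison of this kind locally, a small timing harness along the following lines could be used. The script that produced the table above is not shown, so everything here (`write_sample`, `best_time`, the synthetic data, the best-of-three timing) is an assumption; the zcsv comparison only runs if the package is installed.

```python
import csv
import os
import tempfile
import time


def write_sample(path, rows, cols):
    """Write a synthetic CSV file with a header row and `rows` data rows."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([f"col{c}" for c in range(cols)])
        for r in range(rows):
            writer.writerow([f"v{r}_{c}" for c in range(cols)])


def best_time(module, path, repeats=3):
    """Return (best wall-clock seconds, rows read) for module.reader over the file."""
    best, count = float("inf"), 0
    for _ in range(repeats):
        start = time.perf_counter()
        count = 0
        with open(path, newline="") as f:
            for row in module.reader(f):
                row[0]                   # touch one field per row
                count += 1
        best = min(best, time.perf_counter() - start)
    return best, count


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bench.csv")
    write_sample(path, rows=1_000, cols=10)
    stdlib_time, n_rows = best_time(csv, path)
    print(f"stdlib csv: {stdlib_time:.4f}s for {n_rows} rows")
    try:
        import zcsv
    except ImportError:
        print("zcsv not installed; skipping comparison")
    else:
        zcsv_time, _ = best_time(zcsv, path)
        print(f"zcsv: {zcsv_time:.4f}s ({stdlib_time / zcsv_time:.1f}x)")
```

Results depend heavily on hardware, file size, and how many fields are accessed per row, so numbers from a run like this are not directly comparable to the table.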

License

MIT

Release history

0.2.1 (Apr 5, 2026)
0.2.0 (Apr 5, 2026, this version)
0.1.0 (Apr 5, 2026)

Download files

Source distribution

zcsv-0.2.0.tar.gz (34.2 kB), uploaded Apr 5, 2026 via maturin/1.12.6, without Trusted Publishing.

Algorithm	Hash digest
SHA256	`4992834dcd9627f3c6fc17dacc122eac99edeb175e1fcfa07b21b90ef949d68d`
MD5	`93d0543c4b09e85b6e6e96010fa43fc2`
BLAKE2b-256	`3b263f801b2646dc5b43ea4cf7b0c34995e09bdc74a1b5925853414fffd692df`

Built distribution

zcsv-0.2.0-cp313-cp313-macosx_11_0_arm64.whl (468.5 kB), for CPython 3.13 on macOS 11.0+ ARM64, uploaded Apr 5, 2026 via maturin/1.12.6, without Trusted Publishing.

Algorithm	Hash digest
SHA256	`9ebee8c5e53820354b805fac7c5faa192e601d4c0e6964b3d16d5db7df0a9cdb`
MD5	`49086133acd0ff781e221cb15a218d7a`
BLAKE2b-256	`d261b672f0fdf10a310c0517aa0fab410ac1dcd80d215fb4a2c541729cf90bbd`
