Fast CSV querying from Python — powered by a Zig/SIMD engine

csvql-query

Query CSV files with SQL from Python — powered by a Zig/SIMD engine.

Zero-copy mmap reads and SIMD parsing happen inside the Zig engine, before Python ever sees the data. The project reports typical speedups of 5–9x over DuckDB on 1M-row CSVs (see Performance), and the package has no required dependencies.

Installation

pip install csvql-query

Quick Start

import csvql

# Returns a list of dicts (like csv.DictReader, but with SQL)
rows = csvql.query("SELECT name, salary FROM 'employees.csv' WHERE salary > 100000 ORDER BY salary DESC")
# [{'name': 'Alice', 'salary': '185000'}, ...]

# Raw CSV string
csv_str = csvql.query_csv("SELECT * FROM 'data.csv' LIMIT 10")

# pandas DataFrame (pandas must be installed)
df = csvql.query_df("SELECT category, COUNT(*) as n FROM 'sales.csv' GROUP BY category")

# (headers, rows) tuples — no dependencies
headers, rows = csvql.query_tuples("SELECT name, age FROM 'users.csv' WHERE age > 25")
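
The sample output above shows field values as strings ('185000'). Assuming that holds for all results, a small post-processing step can turn numeric-looking strings into numbers. This is a minimal sketch: `coerce_value` and `coerce_rows` are hypothetical helpers, not part of csvql, and the sample rows are hand-written stand-ins for query results.

```python
def coerce_value(text):
    """Return an int or float if text parses as one; otherwise return it unchanged."""
    if not isinstance(text, str):
        return text
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def coerce_rows(rows):
    """Apply coerce_value to every field of every row dict."""
    return [{key: coerce_value(value) for key, value in row.items()} for row in rows]


# Hand-written data shaped like the Quick Start output (not real query results).
sample = [
    {"name": "Alice", "salary": "185000"},
    {"name": "Bob", "salary": "120500.5"},
]
typed = coerce_rows(sample)
```

With the package installed, the same helper could be applied as `coerce_rows(csvql.query(...))`. Note that Python's `float` also accepts strings such as "nan" and "inf", so columns holding free text may need a stricter check.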

API

Function            Returns                     Description
query(sql)          list[dict]                  Execute SQL, get list of dicts
query_csv(sql)      str                         Execute SQL, get raw CSV string
query_df(sql)       DataFrame                   Execute SQL, get pandas DataFrame
query_tuples(sql)   (list[str], list[tuple])    Execute SQL, get (headers, rows)
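
The return shapes in the table can be converted into one another with the standard library. The sketch below uses hand-written stand-in data and two hypothetical helpers (`tuples_to_dicts`, `csv_to_dicts`) that are not part of csvql. It assumes the CSV string starts with a header row, which the table does not state explicitly.

```python
import csv
import io


def tuples_to_dicts(headers, rows):
    """Turn a (headers, rows) pair into a list of dicts, one per row."""
    return [dict(zip(headers, row)) for row in rows]


def csv_to_dicts(csv_text):
    """Parse a CSV string whose first line is a header row (an assumption)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Stand-in values shaped like query_tuples() and query_csv() results.
headers = ["name", "age"]
rows = [("Alice", "31"), ("Bob", "42")]
csv_text = "name,age\nAlice,31\nBob,42\n"

from_tuples = tuples_to_dicts(headers, rows)
from_csv = csv_to_dicts(csv_text)
```

Both conversions produce the same list of dicts here, which is the shape `query()` is documented to return.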

SQL Support

The CSV file path is embedded in the SQL query string, quoted after FROM (same as the CLI):

# Filtering, ordering, limiting
csvql.query("SELECT name, city FROM 'data.csv' WHERE age > 30 ORDER BY name LIMIT 5")

# Aggregation
csvql.query("SELECT department, AVG(salary) FROM 'emp.csv' GROUP BY department")

# Unix pipes: use '-' as the filename to read CSV data from stdin,
# e.g. `cat data.csv | python script.py` (sketch; assumes '-' is quoted like any other path)
csvql.query("SELECT * FROM '-' LIMIT 5")

Full SQL reference: SIMPLE_QUERY_LANGUAGE.md
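
Because the file path lives inside the query string, queries are often assembled from variables. The following is a minimal sketch of a hypothetical `build_query` helper (not part of csvql) that produces strings in the form used by the examples above. How the engine escapes quotes inside paths is not documented here, so the helper simply rejects such paths.

```python
def build_query(path, columns=("*",), where=None, limit=None):
    """Assemble a SELECT statement with the CSV path quoted after FROM."""
    if "'" in path:
        # Escaping rules for quotes in paths are unknown, so refuse rather than guess.
        raise ValueError("path contains a single quote")
    sql = f"SELECT {', '.join(columns)} FROM '{path}'"
    if where:
        # The WHERE clause is passed through as-is; validate untrusted input first.
        sql += f" WHERE {where}"
    if limit is not None:
        sql += f" LIMIT {int(limit)}"
    return sql


sql = build_query("data.csv", columns=["name", "city"], where="age > 30", limit=5)
```

The resulting string can then be passed to any of the query functions, for example `csvql.query(sql)`.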

Performance

  • mmap + SIMD parsing — data is never copied into Python memory
  • Parallel chunk processing on multi-core machines
  • Typically 5–9x faster than DuckDB on 1M-row CSVs
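
Speedups depend on the data and the query, so it can be worth measuring on your own files. The sketch below is a generic timing harness, not the project's benchmark: `time_query` and `fake_engine` are hypothetical names, and the stand-in callable only exists so the block runs without csvql installed.

```python
import statistics
import time


def time_query(run, sql, repeats=5):
    """Call run(sql) several times and report best and median wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run(sql)
        timings.append(time.perf_counter() - start)
    return {"best": min(timings), "median": statistics.median(timings)}


def fake_engine(sql):
    """Stand-in for a real query function; returns a dummy row."""
    return [{"n": str(len(sql))}]


result = time_query(fake_engine, "SELECT 1", repeats=3)
```

To compare engines, pass `csvql.query` and an equivalent callable for the other tool, using the same file and query for both.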

Requirements

  • Python ≥ 3.10
  • macOS (x86_64 / arm64) or Linux (x86_64)
  • pandas optional — only needed for query_df()
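
The requirements above can be checked at runtime before importing the package. This is a sketch using only the standard library. `check_environment` is a hypothetical helper, and the platform names are the values `platform.system()` and `platform.machine()` are expected to report on the listed targets.

```python
import importlib.util
import platform
import sys

# (system, machine) pairs matching the platforms listed above.
SUPPORTED = {("Darwin", "x86_64"), ("Darwin", "arm64"), ("Linux", "x86_64")}


def check_environment():
    """Report whether the interpreter, platform, and optional pandas look usable."""
    return {
        "python_ok": sys.version_info >= (3, 10),
        "platform_ok": (platform.system(), platform.machine()) in SUPPORTED,
        "pandas_available": importlib.util.find_spec("pandas") is not None,
    }


report = check_environment()
```

If `pandas_available` is false, `query()`, `query_csv()`, and `query_tuples()` remain usable; only `query_df()` needs pandas.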

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

csvql_query-1.5.0-py3-none-manylinux_2_17_x86_64.whl (1.6 MB): Python 3, manylinux (glibc 2.17+), x86-64

csvql_query-1.5.0-py3-none-macosx_12_0_x86_64.whl (353.8 kB): Python 3, macOS 12.0+, x86-64

csvql_query-1.5.0-py3-none-macosx_12_0_arm64.whl (353.8 kB): Python 3, macOS 12.0+, ARM64
