Fast CSV querying from Python — powered by a Zig/SIMD engine

Project description

csvql-query

License: MIT

Query CSV files with SQL from Python — powered by a Zig/SIMD engine.

Zero-copy mmap reads and SIMD parsing happen before Python ever sees the data. Typically 5–9x faster than DuckDB on 1M-row CSVs, with no required dependencies.

Installation

pip install csvql-query

Quick Start

import csvql

# Returns a list of dicts (like csv.DictReader, but with SQL)
rows = csvql.query("SELECT name, salary FROM 'employees.csv' WHERE salary > 100000 ORDER BY salary DESC")
# [{'name': 'Alice', 'salary': '185000'}, ...]

# Raw CSV string
csv_str = csvql.query_csv("SELECT * FROM 'data.csv' LIMIT 10")

# pandas DataFrame (pandas must be installed)
df = csvql.query_df("SELECT category, COUNT(*) as n FROM 'sales.csv' GROUP BY category")

# (headers, rows) tuples — no dependencies
headers, rows = csvql.query_tuples("SELECT name, age FROM 'users.csv' WHERE age > 25")
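
The sample output above shows `salary` coming back as the string `'185000'`. Assuming values are returned as strings in general (the README shows this only for that one example), a small helper can convert selected columns after the query. `cast_columns` and the sample rows below are hypothetical and not part of csvql:

```python
from typing import Any, Callable


def cast_columns(
    rows: list[dict[str, str]],
    casts: dict[str, Callable[[str], Any]],
) -> list[dict[str, Any]]:
    """Return new row dicts with the named columns converted.

    Columns not listed in `casts` pass through unchanged, and empty
    strings are left alone so missing values do not raise.
    """
    converted = []
    for row in rows:
        new_row = dict(row)  # copy so the original rows are not mutated
        for column, cast in casts.items():
            if column in new_row and new_row[column] != "":
                new_row[column] = cast(new_row[column])
        converted.append(new_row)
    return converted


# Stand-in for the result of csvql.query(...); values are illustrative only.
sample_rows = [{"name": "Alice", "salary": "185000"}, {"name": "Bob", "salary": ""}]
typed_rows = cast_columns(sample_rows, {"salary": int})
```

With real data, the list returned by `csvql.query(...)` would take the place of `sample_rows`.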

API

Function            Returns                     Description
query(sql)          list[dict]                  Execute SQL, get list of dicts
query_csv(sql)      str                         Execute SQL, get raw CSV string
query_df(sql)       DataFrame                   Execute SQL, get pandas DataFrame
query_tuples(sql)   (list[str], list[tuple])    Execute SQL, get (headers, rows)
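
To show how the return shapes in the table relate, here is a minimal sketch that turns a `(headers, rows)` pair into a list of dicts. It assumes `query_tuples` returns the shape listed above. Whether `query` is built this way internally is not stated, and `tuples_to_dicts` is a hypothetical helper:

```python
def tuples_to_dicts(headers: list[str], rows: list[tuple]) -> list[dict]:
    """Pair each row tuple with the headers to build one dict per row."""
    return [dict(zip(headers, row)) for row in rows]


# Stand-in for: headers, rows = csvql.query_tuples("SELECT name, age FROM 'users.csv'")
# The values are made up for illustration.
headers = ["name", "age"]
rows = [("Ada", "36"), ("Linus", "54")]
records = tuples_to_dicts(headers, rows)
```

The tuple form avoids repeating the column names for every row, which may matter for large result sets; the dict form is more convenient for lookups by column name.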

SQL Support

The CSV file path is embedded in the SQL query string (same as the CLI):

# Filtering, ordering, limiting
csvql.query("SELECT name, city FROM 'data.csv' WHERE age > 30 ORDER BY name LIMIT 5")

# Aggregation
csvql.query("SELECT department, AVG(salary) FROM 'emp.csv' GROUP BY department")

# Unix pipes: use '-' as the filename to read CSV data from stdin
# (assumed to behave as in the CLI), e.g. `cat data.csv | python script.py`
csvql.query("SELECT name, city FROM '-' WHERE age > 30")

Full SQL reference: SIMPLE_QUERY_LANGUAGE.md

Performance

  • mmap + SIMD parsing: input data is not copied into Python memory during parsing; only the query results become Python objects
  • Parallel chunk processing on multi-core machines
  • Typically 5–9x faster than DuckDB on 1M-row CSVs
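
Speedup figures depend on the data and the query, so it can be worth measuring on your own files. The following is a minimal timing sketch, not the project's benchmark. `median_seconds` and `speedup` are hypothetical helpers, and the commented-out wiring assumes both `csvql` and `duckdb` are installed and that a `sales.csv` exists:

```python
import time
from statistics import median
from typing import Callable


def median_seconds(run: Callable[[], object], repeats: int = 5) -> float:
    """Call `run` several times and return the median wall-clock duration."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        timings.append(time.perf_counter() - start)
    return median(timings)


def speedup(
    baseline: Callable[[], object],
    candidate: Callable[[], object],
    repeats: int = 5,
) -> float:
    """Ratio of baseline time to candidate time (>1 means candidate is faster)."""
    return median_seconds(baseline, repeats) / median_seconds(candidate, repeats)


# Example wiring (not run here; requires both packages and your own CSV):
# import csvql, duckdb
# sql = "SELECT category, COUNT(*) AS n FROM 'sales.csv' GROUP BY category"
# print(speedup(lambda: duckdb.sql(sql).fetchall(), lambda: csvql.query(sql)))
```

The median is used rather than the mean so that a single slow run, such as a cold file cache on the first call, does not dominate the result.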

Requirements

  • Python ≥ 3.10
  • macOS (x86_64 / arm64) or Linux (x86_64)
  • pandas optional — only needed for query_df()
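
The requirements above can be checked at startup before importing the package. This is a sketch only: `is_supported` is a hypothetical helper, and the platform strings are the values `platform.system()` and `platform.machine()` usually report on these systems:

```python
import platform
import sys

# (system, machine) pairs matching the platforms listed above.
SUPPORTED = {("Darwin", "x86_64"), ("Darwin", "arm64"), ("Linux", "x86_64")}


def is_supported(system: str, machine: str, version: tuple[int, int]) -> bool:
    """True if the interpreter version and platform match the listed requirements."""
    return version >= (3, 10) and (system, machine) in SUPPORTED


# Check the current interpreter and platform.
ok = is_supported(platform.system(), platform.machine(), sys.version_info[:2])
```

Note that Linux on ARM typically reports `aarch64`, which is not in the list above.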

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • csvql_query-1.4.0-py3-none-manylinux_2_17_x86_64.whl (1.6 MB): Python 3, manylinux (glibc 2.17+), x86-64
  • csvql_query-1.4.0-py3-none-macosx_12_0_x86_64.whl (353.8 kB): Python 3, macOS 12.0+, x86-64
  • csvql_query-1.4.0-py3-none-macosx_12_0_arm64.whl (353.8 kB): Python 3, macOS 12.0+, ARM64
