Fast CSV querying from Python — powered by a Zig/SIMD engine

Project description

csvql-query

License: MIT

Query CSV files with SQL from Python — powered by a Zig/SIMD engine.

The engine memory-maps the file (zero-copy) and parses it with SIMD before Python ever sees the data. It is typically 5–9x faster than DuckDB on 1M-row CSVs and has no required dependencies.

Installation

pip install csvql-query

Quick Start

import csvql

# Returns a list of dicts (like csv.DictReader, but with SQL)
rows = csvql.query("SELECT name, salary FROM 'employees.csv' WHERE salary > 100000 ORDER BY salary DESC")
# [{'name': 'Alice', 'salary': '185000'}, ...]

# Raw CSV string
csv_str = csvql.query_csv("SELECT * FROM 'data.csv' LIMIT 10")

# pandas DataFrame (pandas must be installed)
df = csvql.query_df("SELECT category, COUNT(*) as n FROM 'sales.csv' GROUP BY category")

# (headers, rows) tuples — no dependencies
headers, rows = csvql.query_tuples("SELECT name, age FROM 'users.csv' WHERE age > 25")
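
The sample output above shows `salary` as the string `'185000'`, which suggests that values come back as text. A minimal sketch of converting selected columns after the query, assuming string values and assuming empty strings mark missing data; `coerce_rows` is a hypothetical helper, not part of csvql:

```python
from typing import Callable


def coerce_rows(rows: list[dict], casts: dict[str, Callable]) -> list[dict]:
    """Return copies of the rows with the named columns converted.

    Hypothetical helper: empty strings are left untouched, on the
    assumption that they represent missing values.
    """
    out = []
    for row in rows:
        converted = dict(row)  # copy so the original row is not mutated
        for column, cast in casts.items():
            if column in converted and converted[column] != "":
                converted[column] = cast(converted[column])
        out.append(converted)
    return out


# Stand-in for the result of csvql.query(...), shaped like the sample output.
rows = [{"name": "Alice", "salary": "185000"}, {"name": "Bob", "salary": "120000"}]
typed = coerce_rows(rows, {"salary": int})
total = sum(r["salary"] for r in typed)
```

If pandas is available, `query_df()` followed by the usual DataFrame type conversion may be a simpler route.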

API

| Function | Returns | Description |
|---|---|---|
| query(sql) | list[dict] | Execute SQL, get list of dicts |
| query_csv(sql) | str | Execute SQL, get raw CSV string |
| query_df(sql) | DataFrame | Execute SQL, get pandas DataFrame |
| query_tuples(sql) | (list[str], list[tuple]) | Execute SQL, get (headers, rows) |
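
To show how the return shapes relate, here is a small sketch that turns a `(headers, rows)` pair, as returned by `query_tuples()`, into the list-of-dicts shape returned by `query()`. The helper name `tuples_to_dicts` is hypothetical, and whether both functions yield identical values for the same SQL is an assumption rather than something stated here:

```python
def tuples_to_dicts(headers, rows):
    """Pair each row tuple with the headers to build one dict per row."""
    return [dict(zip(headers, row)) for row in rows]


# Stand-in for: headers, rows = csvql.query_tuples("SELECT name, age FROM 'users.csv'")
headers = ["name", "age"]
rows = [("Ada", "36"), ("Linus", "54")]

records = tuples_to_dicts(headers, rows)
```

The tuple form avoids building one dict per row, which may matter for large results; the dict form is more convenient for lookups by column name.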

SQL Support

The CSV file path is embedded in the query string as a single-quoted name (same as the CLI):

# Filtering, ordering, limiting
csvql.query("SELECT name, city FROM 'data.csv' WHERE age > 30 ORDER BY name LIMIT 5")

# Aggregation
csvql.query("SELECT department, AVG(salary) FROM 'emp.csv' GROUP BY department")

# Unix pipes: use '-' as the filename to read CSV data from stdin
# (for example: cat data.csv | python script.py)
rows = csvql.query("SELECT * FROM '-' LIMIT 10")

Full SQL reference: SIMPLE_QUERY_LANGUAGE.md
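
Because the file path lives inside the SQL text, queries are often assembled from variables. A minimal sketch of doing that, using only the clauses shown above; `build_query` is a hypothetical helper, and rejecting paths that contain a single quote is a conservative assumption, since the escaping rules are not described here:

```python
def build_query(path, columns=("*",), where=None, order_by=None, limit=None):
    """Assemble a query string with the CSV path as a single-quoted name."""
    if "'" in path:
        # Assumption: no documented way to escape quotes, so refuse them.
        raise ValueError("path must not contain a single quote")
    parts = [f"SELECT {', '.join(columns)} FROM '{path}'"]
    if where:
        parts.append(f"WHERE {where}")
    if order_by:
        parts.append(f"ORDER BY {order_by}")
    if limit is not None:
        parts.append(f"LIMIT {int(limit)}")
    return " ".join(parts)


sql = build_query(
    "data.csv",
    columns=("name", "city"),
    where="age > 30",
    order_by="name",
    limit=5,
)
# The resulting string can then be passed to csvql.query(sql).
```

The `where` and `order_by` fragments are inserted as-is, so they should come from trusted code rather than from user input.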

Performance

  • mmap + SIMD parsing — data is never copied into Python memory
  • Parallel chunk processing on multi-core machines
  • Typically 5–9x faster than DuckDB on 1M-row CSVs
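
Speedup figures depend on the data and the machine, so it can be worth measuring on your own files. A minimal timing sketch using only the standard library; `time_call` and `speedup` are hypothetical helpers, and the workload below is a stand-in, not a csvql benchmark:

```python
import statistics
import time


def time_call(fn, *args, repeats=5):
    """Run fn(*args) several times and return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Stand-in workload; in practice pass csvql.query and an SQL string,
# then time the alternative tool the same way and compare.
elapsed = time_call(sum, range(100_000))
```

Taking the median of several runs reduces the influence of cold file caches and one-off scheduling noise.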

Requirements

  • Python ≥ 3.10
  • macOS (x86_64 / arm64) or Linux (x86_64)
  • pandas optional — only needed for query_df()
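
Since pandas is optional, calling code may want to check for it before choosing `query_df()`. A small sketch of that check, together with a check of the platform list above; the helper names are hypothetical and the machine strings are what `platform.machine()` commonly reports, which is an assumption about your environment:

```python
import importlib.util
import platform
import sys


def pandas_available():
    """True if pandas can be imported, without actually importing it."""
    return importlib.util.find_spec("pandas") is not None


def environment_supported():
    """Compare the interpreter and OS with the requirements listed above."""
    python_ok = sys.version_info >= (3, 10)
    system = platform.system()
    machine = platform.machine().lower()
    os_ok = (system == "Darwin" and machine in {"x86_64", "arm64"}) or (
        system == "Linux" and machine == "x86_64"
    )
    return python_ok and os_ok


# Pick the DataFrame API only when pandas is installed.
query_function_name = "query_df" if pandas_available() else "query_tuples"
```

With the name in hand, `getattr(csvql, query_function_name)` would give the function to call.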

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

  • csvql_query-1.5.1-py3-none-manylinux_2_17_x86_64.whl (1.6 MB): Python 3, manylinux (glibc 2.17+), x86-64
  • csvql_query-1.5.1-py3-none-macosx_12_0_x86_64.whl (353.8 kB): Python 3, macOS 12.0+, x86-64
  • csvql_query-1.5.1-py3-none-macosx_12_0_arm64.whl (353.8 kB): Python 3, macOS 12.0+, ARM64

File hashes

Hashes for csvql_query-1.5.1-py3-none-manylinux_2_17_x86_64.whl
SHA256 9eeaaf9ba77fca39a17cbcc7ee8c460823d02fee72a7a9cf17a85e128ce20d81
MD5 3aaa00f5e01bdba0880e09c9c6f18037
BLAKE2b-256 76385838e46a0098bffc8220fdb97f8a318a8cc3d4dbdefb31ebddb099973fb6

Hashes for csvql_query-1.5.1-py3-none-macosx_12_0_x86_64.whl
SHA256 276861c736564136d0150c13dfb325f4121b4e421145dfdbe8ffe51f0cb18fb8
MD5 2d2b8b64a015967a130024f4d6268177
BLAKE2b-256 24613897afb0913b0b7dc875803ab82f151c1536090f7e48bb9b62cc74991452

Hashes for csvql_query-1.5.1-py3-none-macosx_12_0_arm64.whl
SHA256 165db71893b308c99cb63cf68a230781bf86139a90a3674e3f2f37643e496c9a
MD5 eb69f5f23b582bd1765f2bf299cfd500
BLAKE2b-256 ff4653b826ad6ba89f80b7a992e9e39943587c7ef12f4bab542ba49d091e7978
