Fast CSV querying from Python — powered by a Zig/SIMD engine

csvql-query

License: MIT

Query CSV files with SQL from Python — powered by a Zig/SIMD engine.

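Zero-copy mmap reads and SIMD parsing happen before Python ever sees the data. The project reports being typically 5–9x faster than DuckDB on 1M-row CSVs, and the package has no required dependencies (pandas is optional and only needed for query_df()).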

Installation

pip install csvql-query

Quick Start

import csvql

# Returns a list of dicts (like csv.DictReader, but with SQL)
rows = csvql.query("SELECT name, salary FROM 'employees.csv' WHERE salary > 100000 ORDER BY salary DESC")
# [{'name': 'Alice', 'salary': '185000'}, ...]

# Raw CSV string
csv_str = csvql.query_csv("SELECT * FROM 'data.csv' LIMIT 10")

# pandas DataFrame (pandas must be installed)
df = csvql.query_df("SELECT category, COUNT(*) as n FROM 'sales.csv' GROUP BY category")

# (headers, rows) tuples — no dependencies
headers, rows = csvql.query_tuples("SELECT name, age FROM 'users.csv' WHERE age > 25")
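
The sample output above shows salary as the string '185000', which suggests that query() returns field values as text. Assuming that holds, a small helper can convert selected columns after the query. This is a sketch only: coerce_rows is a hypothetical helper, not part of csvql, and the sample rows are made up so the block runs without any CSV file.

```python
def coerce_rows(rows, casts):
    """Return copies of the row dicts with selected columns converted.

    rows  -- list of dicts, shaped like the documented query() result
    casts -- mapping of column name to a callable such as int or float
    """
    out = []
    for row in rows:
        converted = dict(row)  # copy so the caller's rows are left untouched
        for column, cast in casts.items():
            # Skip missing columns and empty fields rather than raising
            if column in converted and converted[column] != "":
                converted[column] = cast(converted[column])
        out.append(converted)
    return out


# Made-up sample data in the same shape as the Quick Start output comment
sample = [
    {"name": "Alice", "salary": "185000"},
    {"name": "Bob", "salary": "120000"},
]

typed = coerce_rows(sample, {"salary": int})
print(typed[0]["salary"] + 1)  # 185001
```

With csvql installed, the same helper would be applied to the result of csvql.query(...) instead of the sample list.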

API

Function | Returns | Description
query(sql) | list[dict] | Execute SQL, get list of dicts
query_csv(sql) | str | Execute SQL, get raw CSV string
query_df(sql) | DataFrame | Execute SQL, get pandas DataFrame
query_tuples(sql) | (list[str], list[tuple]) | Execute SQL, get (headers, rows)
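
The return shapes listed above can be converted into one another with the standard library alone. The sketch below uses made-up values in place of real query results, and it assumes that the string returned by query_csv() begins with a header row, which the source does not state. tuples_to_dicts and parse_csv_text are hypothetical helper names.

```python
import csv
import io


def tuples_to_dicts(headers, rows):
    """Turn a (headers, rows) pair into a list of dicts keyed by header."""
    return [dict(zip(headers, row)) for row in rows]


def parse_csv_text(csv_text):
    """Parse a CSV string whose first line is a header row into dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Stand-ins for what query_tuples() and query_csv() might return
headers = ["name", "age"]
rows = [("Alice", "31"), ("Bob", "42")]
csv_text = "name,age\nAlice,31\nBob,42\n"

as_dicts = tuples_to_dicts(headers, rows)
print(as_dicts[0])  # {'name': 'Alice', 'age': '31'}
print(parse_csv_text(csv_text) == as_dicts)  # True
```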

SQL Support

The path to the CSV file is embedded in the SQL string as a single-quoted name in the FROM clause (same as the CLI):

# Filtering, ordering, limiting
csvql.query("SELECT name, city FROM 'data.csv' WHERE age > 30 ORDER BY name LIMIT 5")

# Aggregation
csvql.query("SELECT department, AVG(salary) FROM 'emp.csv' GROUP BY department")

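# Unix pipes: use '-' as the filename
# (assumed here to make the engine read CSV data from this process's stdin,
#  e.g. when the script is run as: cat data.csv | python script.py)
csvql.query("SELECT * FROM '-' LIMIT 5")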
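
Because the file path travels inside the SQL text, queries built from variables need the path quoted consistently. The following is a minimal sketch of a string builder, not something csvql provides: build_select and quote_path are hypothetical names, and since the engine's escaping rules for quotes inside paths are not documented here, the helper refuses such paths instead of guessing.

```python
def quote_path(path):
    """Wrap a file path in single quotes for use in a FROM clause."""
    if "'" in path:
        # Escaping rules are not documented in this README, so fail loudly
        raise ValueError("path must not contain a single quote: %r" % path)
    return "'" + path + "'"


def build_select(columns, path, where=None, order_by=None, limit=None):
    """Assemble a SELECT statement from already-trusted fragments."""
    parts = ["SELECT " + ", ".join(columns), "FROM " + quote_path(path)]
    if where:
        parts.append("WHERE " + where)
    if order_by:
        parts.append("ORDER BY " + order_by)
    if limit is not None:
        parts.append("LIMIT " + str(int(limit)))  # int() rejects non-numeric limits
    return " ".join(parts)


sql = build_select(["name", "city"], "data.csv",
                   where="age > 30", order_by="name", limit=5)
print(sql)
# SELECT name, city FROM 'data.csv' WHERE age > 30 ORDER BY name LIMIT 5
```

The resulting string can be passed to any of the query functions. The where and order_by fragments are inserted verbatim, so they should never come from untrusted input.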

Full SQL reference: SIMPLE_QUERY_LANGUAGE.md

Performance

  • mmap + SIMD parsing: the input file is memory-mapped and parsed by the engine before Python sees any data
  • Parallel chunk processing on multi-core machines
  • Typically 5–9x faster than DuckDB on 1M-row CSVs
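
Speed claims like these depend on hardware, file shape, and query, so it is worth measuring on your own data. The sketch below is a generic best-of-N timer built on the standard library; best_of and speedup are hypothetical helpers, and the two workloads are stand-ins so the block runs without csvql or DuckDB installed.

```python
import time


def best_of(fn, repeats=5):
    """Run fn several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Stand-in workloads; replace these with real calls, for example
#   lambda: csvql.query("SELECT ... FROM 'big.csv' ...")
# and the equivalent query in the tool you are comparing against.
def slow_workload():
    return sum(i * i for i in range(200_000))


def fast_workload():
    return sum(i * i for i in range(20_000))


baseline = best_of(slow_workload)
candidate = best_of(fast_workload)
print("speedup: %.1fx" % speedup(baseline, candidate))
```

Taking the minimum of several runs reduces noise from caches and background load, although it hides cold-start costs such as the first page faults on a memory-mapped file.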

Requirements

  • Python ≥ 3.10
  • macOS (x86_64 / arm64) or Linux (x86_64)
  • pandas optional — only needed for query_df()
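
The requirements above can be checked at startup before importing the package. This is a sketch under the assumption that platform.system() and platform.machine() report the values listed in the SUPPORTED set on the named platforms; is_supported and has_pandas are hypothetical helpers, not part of csvql.

```python
import importlib.util
import platform
import sys

# (system, machine) pairs taken from the Requirements list
SUPPORTED = {
    ("Darwin", "x86_64"),
    ("Darwin", "arm64"),
    ("Linux", "x86_64"),
}


def is_supported(py_version, system, machine):
    """True if the interpreter and platform match the stated requirements."""
    return tuple(py_version[:2]) >= (3, 10) and (system, machine) in SUPPORTED


def has_pandas():
    """True if pandas can be imported, which query_df() needs."""
    return importlib.util.find_spec("pandas") is not None


print("platform supported:", is_supported(sys.version_info,
                                          platform.system(),
                                          platform.machine()))
print("query_df() usable:", has_pandas())
```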

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • csvql_query-1.3.0-py3-none-manylinux_2_17_x86_64.whl (1.6 MB): Python 3, manylinux (glibc 2.17+), x86-64
  • csvql_query-1.3.0-py3-none-macosx_12_0_x86_64.whl (353.2 kB): Python 3, macOS 12.0+, x86-64
  • csvql_query-1.3.0-py3-none-macosx_12_0_arm64.whl (353.2 kB): Python 3, macOS 12.0+, ARM64
