High-performance Qlik QVD file reader/writer with Parquet conversion — Rust-powered Python bindings
qvd
High-performance Rust library for reading, writing, and converting Qlik QVD files, with Parquet/Arrow interop, a streaming reader, a CLI tool, and Python bindings.
First and only QVD crate on crates.io.
Features
- Read/Write QVD — byte-identical roundtrip, zero-copy where possible
- Parquet ↔ QVD — convert in both directions with compression support (snappy, zstd, gzip, lz4)
- Arrow RecordBatch — convert QVD to/from Arrow for integration with DataFusion, DuckDB, Polars
- Streaming reader — read QVD files in chunks without loading everything into memory
- EXISTS() index — O(1) hash lookup, like Qlik's EXISTS() function
- CLI tool — qvd-cli convert, inspect, head, schema
- Python bindings — via PyO3/maturin, 20-35x faster than PyQvd
- Zero dependencies for core QVD read/write (Parquet/Arrow/Python are optional features)
Performance
Tested on 20 real QVD files (11 KB to 2.8 GB):
| File | Size | Rows | Columns | Read | Write |
|---|---|---|---|---|---|
| sample_tiny.qvd | 11 KB | 12 | 5 | 0.0s | 0.0s |
| sample_small.qvd | 418 KB | 2,746 | 8 | 0.0s | 0.0s |
| sample_medium.qvd | 41 MB | 465,810 | 12 | 0.5s | 0.0s |
| sample_large.qvd | 587 MB | 5,458,618 | 15 | 6.1s | 0.4s |
| sample_xlarge.qvd | 1.7 GB | 87,617,047 | 6 | 36.8s | 1.6s |
| sample_huge.qvd | 2.8 GB | 11,907,648 | 42 | 24.3s | 2.4s |
All 20 files — byte-identical roundtrip (MD5 match).
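The roundtrip guarantee can be verified for any file by comparing the rewritten output byte-for-byte against the original. A minimal sketch of that check using only the standard library (the scratch files here stand in for an original .qvd and its rewritten copy):

```rust
use std::fs;

// Returns true when the two files have exactly the same bytes,
// which is the property the MD5 comparison above attests to.
fn files_identical(a: &str, b: &str) -> std::io::Result<bool> {
    Ok(fs::read(a)? == fs::read(b)?)
}

fn main() -> std::io::Result<()> {
    // Demonstrate with two scratch files; in practice, compare
    // the source .qvd against the file produced by the writer.
    let dir = std::env::temp_dir();
    let p1 = dir.join("roundtrip_a.bin");
    let p2 = dir.join("roundtrip_b.bin");
    fs::write(&p1, b"qvd bytes")?;
    fs::write(&p2, b"qvd bytes")?;
    assert!(files_identical(p1.to_str().unwrap(), p2.to_str().unwrap())?);
    println!("byte-identical");
    Ok(())
}
```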
vs PyQvd (Pure Python)
| File | PyQvd | qvd (Rust) | Speedup |
|---|---|---|---|
| 10 MB, 1.4M rows | 5.0s | 0.17s | 29x |
| 41 MB, 466K rows | 8.5s | 0.5s | 16x |
| 480 MB, 12M rows | 79.4s | 2.3s | 35x |
| 1.7 GB, 87M rows | >10 min | 29.6s | >20x |
Installation
Rust
# Core QVD read/write (zero dependencies)
[dependencies]
qvd = "0.1"
# With Parquet/Arrow support
[dependencies]
qvd = { version = "0.1", features = ["parquet_support"] }
CLI
cargo install qvd --features cli
Python
pip install qvdrs
Or with uv:
uv pip install qvdrs
Quick Start — Rust
Read/Write QVD
use qvd::{read_qvd_file, write_qvd_file};
let table = read_qvd_file("data.qvd")?;
println!("Rows: {}, Cols: {}", table.num_rows(), table.num_cols());
// Byte-identical roundtrip
write_qvd_file(&table, "output.qvd")?;
Convert Parquet ↔ QVD
use qvd::{convert_parquet_to_qvd, convert_qvd_to_parquet, ParquetCompression};
// Parquet → QVD
convert_parquet_to_qvd("input.parquet", "output.qvd")?;
// QVD → Parquet (with zstd compression)
convert_qvd_to_parquet("input.qvd", "output.parquet", ParquetCompression::Zstd)?;
Arrow RecordBatch
use qvd::{read_qvd_file, qvd_to_record_batch, record_batch_to_qvd};
let table = read_qvd_file("data.qvd")?;
let batch = qvd_to_record_batch(&table)?;
// Use with DataFusion, DuckDB, Polars, etc.
// Arrow → QVD
let qvd_table = record_batch_to_qvd(&batch, "my_table")?;
Streaming Reader
use qvd::open_qvd_stream;
let mut reader = open_qvd_stream("huge_file.qvd")?;
println!("Total rows: {}", reader.total_rows());
while let Some(chunk) = reader.next_chunk(65536)? {
// Process 65K rows at a time
println!("Chunk: {} rows starting at {}", chunk.num_rows, chunk.start_row);
}
EXISTS() — O(1) Lookup
use qvd::{read_qvd_file, ExistsIndex, filter_rows_by_exists_fast};
let clients = read_qvd_file("clients.qvd")?;
let index = ExistsIndex::new(&clients, "ClientID");
// O(1) lookup
assert!(index.exists("12345"));
// Filter another table
let facts = read_qvd_file("facts.qvd")?;
let filtered = filter_rows_by_exists_fast(&facts, "ClientID", &index);
Quick Start — Python
import qvd
# Read QVD
table = qvd.read_qvd("data.qvd")
print(table.columns, table.num_rows)
print(table.head(5))
# Save QVD
table.save("output.qvd")
# Parquet → QVD
qvd.convert_parquet_to_qvd("input.parquet", "output.qvd")
# QVD → Parquet
qvd.convert_qvd_to_parquet("input.qvd", "output.parquet", compression="zstd")
# Load Parquet as QvdTable
table = qvd.QvdTable.from_parquet("input.parquet")
table.save("output.qvd")
table.save_as_parquet("output.parquet", compression="snappy")
# EXISTS — O(1) lookup
idx = qvd.ExistsIndex(table, "ClientID")
print("12345" in idx) # True/False
# Filter rows
rows = qvd.filter_exists(other_table, "ClientID", idx)
CLI
# Convert Parquet → QVD
qvd-cli convert input.parquet output.qvd
# Convert QVD → Parquet (with compression)
qvd-cli convert input.qvd output.parquet --compression zstd
# Inspect QVD metadata
qvd-cli inspect data.qvd
# Show first 20 rows
qvd-cli head data.qvd --rows 20
# Show Arrow schema
qvd-cli schema data.qvd
Architecture
src/
├── lib.rs — public API, re-exports
├── error.rs — error types (QvdError, QvdResult)
├── header.rs — XML header parser/writer (custom, zero-dep)
├── value.rs — QVD data types (QvdSymbol, QvdValue)
├── symbol.rs — symbol table binary reader/writer
├── index.rs — index table bit-stuffing reader/writer
├── reader.rs — high-level QVD reader
├── writer.rs — high-level QVD writer + QvdTableBuilder
├── exists.rs — ExistsIndex with HashSet + filter functions
├── streaming.rs — streaming chunk-based QVD reader
├── parquet.rs — Parquet/Arrow ↔ QVD conversion (optional)
├── python.rs — PyO3 bindings (optional)
└── bin/qvd.rs — CLI binary (optional)
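The bit-stuffing handled by index.rs packs each row's symbol indices as fixed-width bit fields in a little-endian byte buffer. A minimal sketch of that kind of fixed-width bit extraction (a simplified illustration only — the field widths and layout below are made up, not the crate's actual index format):

```rust
// Extract the `width`-bit field starting at absolute bit offset `bit_pos`
// from a little-endian packed byte buffer.
fn read_bits(buf: &[u8], bit_pos: usize, width: u32) -> u64 {
    let mut value: u64 = 0;
    for i in 0..width as usize {
        let pos = bit_pos + i;
        let bit = (buf[pos / 8] >> (pos % 8)) & 1;
        value |= (bit as u64) << i;
    }
    value
}

fn main() {
    // One record packing a 3-bit and a 5-bit column index:
    // col0 = 5 (101b) in bits 0..3, col1 = 19 (10011b) in bits 3..8.
    let packed: u16 = 5 | (19 << 3);
    let buf = [(packed & 0xff) as u8, (packed >> 8) as u8];
    assert_eq!(read_bits(&buf, 0, 3), 5);
    assert_eq!(read_bits(&buf, 3, 5), 19);
    println!("ok");
}
```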
Feature Flags
| Feature | Dependencies | Description |
|---|---|---|
| (default) | none | Core QVD read/write |
| parquet_support | arrow, parquet, chrono | Parquet/Arrow conversion |
| cli | + clap | CLI binary |
| python | + pyo3 | Python bindings |
Publishing
crates.io
- Go to crates.io/settings/tokens
- Click "New Token"
- Name: github-actions, Scopes: publish-update for crate qvd
- Copy the token
- In GitHub repo → Settings → Secrets and variables → Actions → New repository secret
- Name: CARGO_REGISTRY_TOKEN, Value: paste the token
PyPI
- Go to pypi.org/manage/account/publishing
- Add a new Trusted Publisher (pending):
  - PyPI project name: qvdrs
  - Owner: bintocher
  - Repository: qvdrs
  - Workflow name: release-pypi.yml
  - Environment name: pypi
- In GitHub repo → Settings → Environments → Create "pypi" environment
Triggering a release
git tag v0.1.0
git push origin v0.1.0
Then create a GitHub Release from the tag — both crates.io and PyPI workflows will trigger automatically.
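For reference, the PyPI side of the release could be sketched roughly as follows (a hypothetical sketch, not the repository's actual release-pypi.yml — the job layout, action versions, and build arguments are assumptions):

```yaml
# .github/workflows/release-pypi.yml (sketch)
name: release-pypi
on:
  release:
    types: [published]
jobs:
  publish:
    runs-on: ubuntu-latest
    environment: pypi          # must match the Trusted Publisher environment name
    permissions:
      id-token: write          # required for Trusted Publishing (OIDC)
    steps:
      - uses: actions/checkout@v4
      - uses: PyO3/maturin-action@v1
        with:
          command: build
          args: --release --out dist
      - uses: pypa/gh-action-pypi-publish@release/v1
        with:
          packages-dir: dist
```

With Trusted Publishing configured as above, no PyPI API token needs to be stored in the repository; only the crates.io workflow needs the CARGO_REGISTRY_TOKEN secret.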
Author
Stanislav Chernov (@bintocher)
License
MIT — see LICENSE
Download files
Download the file for your platform.
Source Distribution
Built Distributions
File details
Details for the file qvdrs-0.1.0.tar.gz.
File metadata
- Download URL: qvdrs-0.1.0.tar.gz
- Upload date:
- Size: 54.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5fb1a89d2cdf4df18859556cbcd598393f0d3ef7344a56acc42eee255dfdfd55 |
| MD5 | fc78dac5b6863127aae0dd4bf1700627 |
| BLAKE2b-256 | b6c0acbfd679449ed1739fc188750ecab00c206dd30862babb526b6836aec9fe |
File details
Details for the file qvdrs-0.1.0-cp313-cp313-win_amd64.whl.
File metadata
- Download URL: qvdrs-0.1.0-cp313-cp313-win_amd64.whl
- Upload date:
- Size: 3.2 MB
- Tags: CPython 3.13, Windows x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 849fa255ea8d6c04fb91a7b5fe14c244a4e97517913b97024d0f7ce78b8fb259 |
| MD5 | 0b82b02588060c397f82c76800e28f17 |
| BLAKE2b-256 | 1fe123960280696ec78b51bab1a6a850e247095ac5b5e88998f967f9d8f7f728 |
File details
Details for the file qvdrs-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: qvdrs-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 4.0 MB
- Tags: CPython 3.13, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 88d804e8484fe0b3f0c10bda177ab3867f76e6b8cc1d3c084c1f46fdd427a8a1 |
| MD5 | 557fe1ba6c76281d9f43c406bfc867b6 |
| BLAKE2b-256 | d79a7643c59e8b146ed0490d96566530e9405cca0af0cfa732d55aef0581222d |
File details
Details for the file qvdrs-0.1.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: qvdrs-0.1.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 3.8 MB
- Tags: CPython 3.13, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 31b67c74a83d297fe89b43ad5f8f5cb5e5e069e31c24bb0cfe41fdbddeb5af64 |
| MD5 | 64f23f4537012f01e2ece8a6c5d22315 |
| BLAKE2b-256 | ba77578992a595d94a9b9ae32c6c5bf57b80702bf053d5409d1e382c85438e03 |
File details
Details for the file qvdrs-0.1.0-cp313-cp313-macosx_11_0_arm64.whl.
File metadata
- Download URL: qvdrs-0.1.0-cp313-cp313-macosx_11_0_arm64.whl
- Upload date:
- Size: 3.4 MB
- Tags: CPython 3.13, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ec7c82bad17cc6abc8478904e876f8fa55a1f12fc258d23ceaca57ca93441383 |
| MD5 | 61e96c280264e0bac896e9185ae27999 |
| BLAKE2b-256 | fc77f18e05c3a45a692e11bad7d250bd750336e007a187ab45ddae384e92dbc6 |
File details
Details for the file qvdrs-0.1.0-cp313-cp313-macosx_10_12_x86_64.whl.
File metadata
- Download URL: qvdrs-0.1.0-cp313-cp313-macosx_10_12_x86_64.whl
- Upload date:
- Size: 3.5 MB
- Tags: CPython 3.13, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: maturin/1.12.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7de8924d7da3731694803dbb446edfc5a6936f521a59fdbabb567c05c6f9d810 |
| MD5 | 7f75fd11012ff5743537ca3f95af701c |
| BLAKE2b-256 | 4fc4d40e3ddd72c8652590a61a4b4fe74a59ff7cee0f35cc4b2bfe1fe6418de7 |