High-performance Qlik QVD file reader/writer with Parquet conversion — Rust-powered Python bindings

Project description

qvd


High-performance Rust library for reading, writing, and converting Qlik QVD files, with Parquet/Arrow interop, a streaming reader, a CLI tool, and Python bindings.

The first QVD crate published on crates.io.

Features

  • Read/Write QVD — byte-identical roundtrip, zero-copy where possible
  • Parquet ↔ QVD — convert in both directions with compression support (snappy, zstd, gzip, lz4)
  • Arrow RecordBatch — convert QVD to/from Arrow for integration with DataFusion, DuckDB, Polars
  • Streaming reader — read QVD files in chunks without loading everything into memory
  • EXISTS() index — O(1) hash lookup, like Qlik's EXISTS() function
  • CLI tool — qvd-cli convert, inspect, head, schema
  • Python bindings — via PyO3/maturin, 20-35x faster than PyQvd
  • Zero dependencies for core QVD read/write (Parquet/Arrow/Python are optional features)

Performance

Tested on 20 real QVD files (11 KB to 2.8 GB):

File               Size     Rows        Columns  Read    Write
sample_tiny.qvd    11 KB    12          5        0.0s    0.0s
sample_small.qvd   418 KB   2,746       8        0.0s    0.0s
sample_medium.qvd  41 MB    465,810     12       0.5s    0.0s
sample_large.qvd   587 MB   5,458,618   15       6.1s    0.4s
sample_xlarge.qvd  1.7 GB   87,617,047  6        36.8s   1.6s
sample_huge.qvd    2.8 GB   11,907,648  42       24.3s   2.4s

All 20 files — byte-identical roundtrip (MD5 match).

vs PyQvd (Pure Python)

File              PyQvd    qvd (Rust)  Speedup
10 MB, 1.4M rows  5.0s     0.17s       29x
41 MB, 466K rows  8.5s     0.5s        16x
480 MB, 12M rows  79.4s    2.3s        35x
1.7 GB, 87M rows  >10 min  29.6s       >20x

Installation

Rust

# Core QVD read/write (zero dependencies)
[dependencies]
qvd = "0.1"

# With Parquet/Arrow support
[dependencies]
qvd = { version = "0.1", features = ["parquet_support"] }

CLI

cargo install qvd --features cli

Python

pip install qvdrs

Or with uv:

uv pip install qvdrs

Quick Start — Rust

Read/Write QVD

use qvd::{read_qvd_file, write_qvd_file};

let table = read_qvd_file("data.qvd")?;
println!("Rows: {}, Cols: {}", table.num_rows(), table.num_cols());

// Byte-identical roundtrip
write_qvd_file(&table, "output.qvd")?;
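
The snippets above use `?`, so they belong inside a function that returns a Result. A minimal runnable wrapper might look like this (a sketch, assuming the QvdResult alias from error.rs, listed under Architecture below, is re-exported at the crate root):

use qvd::{read_qvd_file, write_qvd_file, QvdResult};

fn main() -> QvdResult<()> {
    // Read, inspect, and write back out; errors propagate via `?`
    let table = read_qvd_file("data.qvd")?;
    println!("Rows: {}, Cols: {}", table.num_rows(), table.num_cols());
    write_qvd_file(&table, "output.qvd")?;
    Ok(())
}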

Convert Parquet ↔ QVD

use qvd::{convert_parquet_to_qvd, convert_qvd_to_parquet, ParquetCompression};

// Parquet → QVD
convert_parquet_to_qvd("input.parquet", "output.qvd")?;

// QVD → Parquet (with zstd compression)
convert_qvd_to_parquet("input.qvd", "output.parquet", ParquetCompression::Zstd)?;

Arrow RecordBatch

use qvd::{read_qvd_file, qvd_to_record_batch, record_batch_to_qvd};

let table = read_qvd_file("data.qvd")?;
let batch = qvd_to_record_batch(&table)?;
// Use with DataFusion, DuckDB, Polars, etc.

// Arrow → QVD
let qvd_table = record_batch_to_qvd(&batch, "my_table")?;
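
As one example of the integrations mentioned above, the batch can be registered with DataFusion and queried with SQL. A sketch, assuming datafusion and tokio are added as separate dependencies (this is not part of the qvd API):

use datafusion::prelude::*;
use qvd::{read_qvd_file, qvd_to_record_batch};

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    // Convert the QVD into an Arrow RecordBatch (as shown above)
    let table = read_qvd_file("data.qvd").expect("read QVD");
    let batch = qvd_to_record_batch(&table).expect("convert to Arrow");

    // Register the batch as an in-memory table and query it
    let ctx = SessionContext::new();
    ctx.register_batch("data", batch)?;
    ctx.sql("SELECT COUNT(*) FROM data").await?.show().await?;
    Ok(())
}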

Streaming Reader

use qvd::open_qvd_stream;

let mut reader = open_qvd_stream("huge_file.qvd")?;
println!("Total rows: {}", reader.total_rows());

while let Some(chunk) = reader.next_chunk(65536)? {
    // Process up to 65,536 rows per chunk
    println!("Chunk: {} rows starting at {}", chunk.num_rows, chunk.start_row);
}

EXISTS() — O(1) Lookup

use qvd::{read_qvd_file, ExistsIndex, filter_rows_by_exists_fast};

let clients = read_qvd_file("clients.qvd")?;
let index = ExistsIndex::new(&clients, "ClientID");

// O(1) lookup
assert!(index.exists("12345"));

// Filter another table
let facts = read_qvd_file("facts.qvd")?;
let filtered = filter_rows_by_exists_fast(&facts, "ClientID", &index);

Quick Start — Python

import qvd

# Read QVD
table = qvd.read_qvd("data.qvd")
print(table.columns, table.num_rows)
print(table.head(5))

# Save QVD
table.save("output.qvd")

# Parquet → QVD
qvd.convert_parquet_to_qvd("input.parquet", "output.qvd")

# QVD → Parquet
qvd.convert_qvd_to_parquet("input.qvd", "output.parquet", compression="zstd")

# Load Parquet as QvdTable
table = qvd.QvdTable.from_parquet("input.parquet")
table.save("output.qvd")
table.save_as_parquet("output.parquet", compression="snappy")

# EXISTS — O(1) lookup
idx = qvd.ExistsIndex(table, "ClientID")
print("12345" in idx)  # True/False

# Filter rows of another table (other_table is a second QvdTable, loaded elsewhere)
rows = qvd.filter_exists(other_table, "ClientID", idx)
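
The conversion helpers compose naturally with the standard library. For example, a batch-conversion sketch (the data/ directory is hypothetical; only convert_qvd_to_parquet from above is used):

import qvd
from pathlib import Path

# Convert every QVD in a hypothetical data/ directory to zstd-compressed Parquet
for path in Path("data").glob("*.qvd"):
    qvd.convert_qvd_to_parquet(str(path), str(path.with_suffix(".parquet")), compression="zstd")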

CLI

# Convert Parquet → QVD
qvd-cli convert input.parquet output.qvd

# Convert QVD → Parquet (with compression)
qvd-cli convert input.qvd output.parquet --compression zstd

# Inspect QVD metadata
qvd-cli inspect data.qvd

# Show first 20 rows
qvd-cli head data.qvd --rows 20

# Show Arrow schema
qvd-cli schema data.qvd

Architecture

src/
├── lib.rs          — public API, re-exports
├── error.rs        — error types (QvdError, QvdResult)
├── header.rs       — XML header parser/writer (custom, zero-dep)
├── value.rs        — QVD data types (QvdSymbol, QvdValue)
├── symbol.rs       — symbol table binary reader/writer
├── index.rs        — index table bit-stuffing reader/writer
├── reader.rs       — high-level QVD reader
├── writer.rs       — high-level QVD writer + QvdTableBuilder
├── exists.rs       — ExistsIndex with HashSet + filter functions
├── streaming.rs    — streaming chunk-based QVD reader
├── parquet.rs      — Parquet/Arrow ↔ QVD conversion (optional)
├── python.rs       — PyO3 bindings (optional)
└── bin/qvd.rs      — CLI binary (optional)

Feature Flags

Feature          Dependencies            Description
(default)        none                    Core QVD read/write
parquet_support  arrow, parquet, chrono  Parquet/Arrow conversion
cli              + clap                  CLI binary
python           + pyo3                  Python bindings
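
Features can be combined at install time; for example, building the CLI with Parquet conversion enabled (an assumption: this is only needed if the cli feature does not already imply parquet_support):

cargo install qvd --features "cli,parquet_support"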

Publishing

crates.io

  1. Go to crates.io/settings/tokens
  2. Click "New Token"
  3. Name: github-actions, Scopes: publish-update for crate qvd
  4. Copy the token
  5. In GitHub repo → Settings → Secrets and variables → Actions → New repository secret
  6. Name: CARGO_REGISTRY_TOKEN, Value: paste the token
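
The repository's actual workflow is not shown on this page, but a minimal publish job using that secret could look like the following hypothetical sketch (workflow file name and action versions are assumptions):

name: release-crates
on:
  release:
    types: [published]

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      # CARGO_REGISTRY_TOKEN is the repository secret created in step 6
      - run: cargo publish --token ${{ secrets.CARGO_REGISTRY_TOKEN }}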

PyPI

  1. Go to pypi.org/manage/account/publishing
  2. Add a new Trusted Publisher (registered as a "pending publisher" if the project does not exist on PyPI yet):
    • PyPI project name: qvdrs
    • Owner: bintocher
    • Repository: qvdrs
    • Workflow name: release-pypi.yml
    • Environment name: pypi
  3. In GitHub repo → Settings → Environments → Create "pypi" environment
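
A hypothetical sketch of release-pypi.yml matching that configuration; maturin-action and the official PyPI publish action are common choices here, not confirmed from the repository:

name: release-pypi
on:
  release:
    types: [published]

jobs:
  publish:
    runs-on: ubuntu-latest
    environment: pypi          # must match the environment created in step 3
    permissions:
      id-token: write          # required for Trusted Publishing
    steps:
      - uses: actions/checkout@v4
      - uses: PyO3/maturin-action@v1
        with:
          command: build
          args: --release --out dist
      - uses: pypa/gh-action-pypi-publish@release/v1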

Triggering a release

git tag v0.1.0
git push origin v0.1.0

Then create a GitHub Release from the tag — both crates.io and PyPI workflows will trigger automatically.

Author

Stanislav Chernov (@bintocher)

License

MIT — see LICENSE

Download files

Download the file for your platform.

Source Distribution

qvdrs-0.1.0.tar.gz (54.2 kB)

Uploaded: Source

Built Distributions


qvdrs-0.1.0-cp313-cp313-win_amd64.whl (3.2 MB)

Uploaded: CPython 3.13, Windows x86-64

qvdrs-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.0 MB)

Uploaded: CPython 3.13, manylinux: glibc 2.17+ x86-64

qvdrs-0.1.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.8 MB)

Uploaded: CPython 3.13, manylinux: glibc 2.17+ ARM64

qvdrs-0.1.0-cp313-cp313-macosx_11_0_arm64.whl (3.4 MB)

Uploaded: CPython 3.13, macOS 11.0+ ARM64

qvdrs-0.1.0-cp313-cp313-macosx_10_12_x86_64.whl (3.5 MB)

Uploaded: CPython 3.13, macOS 10.12+ x86-64

File details

Details for the file qvdrs-0.1.0.tar.gz.

File metadata

  • Download URL: qvdrs-0.1.0.tar.gz
  • Size: 54.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.12.6

File hashes

Hashes for qvdrs-0.1.0.tar.gz
Algorithm Hash digest
SHA256 5fb1a89d2cdf4df18859556cbcd598393f0d3ef7344a56acc42eee255dfdfd55
MD5 fc78dac5b6863127aae0dd4bf1700627
BLAKE2b-256 b6c0acbfd679449ed1739fc188750ecab00c206dd30862babb526b6836aec9fe


File details

Details for the file qvdrs-0.1.0-cp313-cp313-win_amd64.whl.

File metadata

  • Download URL: qvdrs-0.1.0-cp313-cp313-win_amd64.whl
  • Size: 3.2 MB
  • Tags: CPython 3.13, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: maturin/1.12.6

File hashes

Hashes for qvdrs-0.1.0-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 849fa255ea8d6c04fb91a7b5fe14c244a4e97517913b97024d0f7ce78b8fb259
MD5 0b82b02588060c397f82c76800e28f17
BLAKE2b-256 1fe123960280696ec78b51bab1a6a850e247095ac5b5e88998f967f9d8f7f728


File details

Details for the file qvdrs-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for qvdrs-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 88d804e8484fe0b3f0c10bda177ab3867f76e6b8cc1d3c084c1f46fdd427a8a1
MD5 557fe1ba6c76281d9f43c406bfc867b6
BLAKE2b-256 d79a7643c59e8b146ed0490d96566530e9405cca0af0cfa732d55aef0581222d


File details

Details for the file qvdrs-0.1.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File hashes

Hashes for qvdrs-0.1.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 31b67c74a83d297fe89b43ad5f8f5cb5e5e069e31c24bb0cfe41fdbddeb5af64
MD5 64f23f4537012f01e2ece8a6c5d22315
BLAKE2b-256 ba77578992a595d94a9b9ae32c6c5bf57b80702bf053d5409d1e382c85438e03


File details

Details for the file qvdrs-0.1.0-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for qvdrs-0.1.0-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 ec7c82bad17cc6abc8478904e876f8fa55a1f12fc258d23ceaca57ca93441383
MD5 61e96c280264e0bac896e9185ae27999
BLAKE2b-256 fc77f18e05c3a45a692e11bad7d250bd750336e007a187ab45ddae384e92dbc6


File details

Details for the file qvdrs-0.1.0-cp313-cp313-macosx_10_12_x86_64.whl.

File hashes

Hashes for qvdrs-0.1.0-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 7de8924d7da3731694803dbb446edfc5a6936f521a59fdbabb567c05c6f9d810
MD5 7f75fd11012ff5743537ca3f95af701c
BLAKE2b-256 4fc4d40e3ddd72c8652590a61a4b4fe74a59ff7cee0f35cc4b2bfe1fe6418de7

