Quantum data preparation — the missing preprocessing layer between classical datasets and quantum computing frameworks
QuPrep — Quantum Data Preparation
QuPrep converts classical datasets into quantum-circuit-ready format. It is not a quantum computing framework, simulator, or training tool — it is the preprocessing step that feeds into Qiskit, PennyLane, Cirq, TKET, and any other quantum workflow.
CSV / DataFrame / NumPy / images / text / graphs → QuPrep → circuit-ready output
What QuPrep does
- Ingest tabular data, time series, images, text, and graphs — all in the same pipeline API
- Clean, normalize, and reduce dimensionality to fit your hardware qubit budget
- Encode data into circuits using 13 encoding methods (Angle, Amplitude, IQP, ZZFeatureMap, GraphState, and more)
- Recommend, compare, and auto-select the best encoding for your dataset and task
- Export circuits to 8 frameworks: OpenQASM 3.0, Qiskit, PennyLane, Cirq, TKET, Braket, Q#, IQM
- Formulate combinatorial optimization problems as QUBO / Ising models; export as QAOA circuit templates for your quantum framework
QuPrep does not train models, simulate circuits, run on quantum hardware, or optimize variational parameters.
Installation
pip install quprep
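With optional extras (in shells such as zsh that treat square brackets as glob patterns, quote the argument, e.g. pip install "quprep[qiskit]"):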
# Framework exporters
pip install quprep[qiskit] # Qiskit QuantumCircuit
pip install quprep[pennylane] # PennyLane QNode
pip install quprep[cirq] # Cirq Circuit
pip install quprep[tket] # TKET/pytket Circuit
pip install quprep[braket] # Amazon Braket Circuit
pip install quprep[qsharp] # Q# / Azure Quantum
pip install quprep[iqm] # IQM native format
pip install quprep[frameworks] # all framework exporters at once
# Data modalities
pip install quprep[image] # image ingestion (Pillow)
pip install quprep[text] # text embeddings (sentence-transformers, ~2 GB)
pip install quprep[modalities] # image + text at once
# Other
pip install quprep[umap] # UMAP dimensionality reduction
pip install quprep[viz] # matplotlib circuit diagrams
pip install quprep[all] # everything
Requirements: Python ≥ 3.10. Core dependencies: numpy, scipy, pandas, scikit-learn.
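Because every exporter and modality lives behind an optional extra, it can help to check which third-party packages are importable before building a pipeline. The sketch below uses only the standard library. The mapping from extra name to import name is an assumption made for illustration (QuPrep's packaging metadata is the authoritative source), `available_extras` is a hypothetical helper rather than part of QuPrep, and the `qsharp` and `iqm` extras are left out.

```python
from importlib.util import find_spec

# Assumed import names of the packages behind each extra (illustrative only).
OPTIONAL_MODULES = {
    "qiskit": "qiskit",
    "pennylane": "pennylane",
    "cirq": "cirq",
    "tket": "pytket",
    "braket": "braket",
    "image": "PIL",
    "text": "sentence_transformers",
    "umap": "umap",
    "viz": "matplotlib",
}


def available_extras(modules=OPTIONAL_MODULES):
    """Map each extra name to True if its module can be imported, else False."""
    return {extra: find_spec(module) is not None for extra, module in modules.items()}


# Print a small installed/missing report for the current environment.
for extra, present in available_extras().items():
    print(f"{extra:10s} {'installed' if present else 'missing'}")
```

`find_spec` only locates the module without importing it, so the check stays fast even for heavy packages such as sentence-transformers.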
Quickstart
One-liner
import quprep as qd
result = qd.prepare("data.csv", encoding="angle", framework="qasm")
print(result.circuit)
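For intuition about what an angle-encoded circuit in OpenQASM 3.0 contains, the sketch below writes one by hand: one qubit per feature and one Ry rotation per qubit. It is not QuPrep's exporter, and the actual `result.circuit` string may differ in gate choice, feature scaling, and layout. `angle_encoding_qasm` is a hypothetical helper introduced here.

```python
import math


def angle_encoding_qasm(features):
    """Build an OpenQASM 3.0 program that applies ry(x_i) to qubit i."""
    n = len(features)
    lines = [
        "OPENQASM 3.0;",
        'include "stdgates.inc";',  # standard gate library, defines ry
        f"qubit[{n}] q;",
    ]
    for i, x in enumerate(features):
        lines.append(f"ry({x:.6f}) q[{i}];")
    return "\n".join(lines)


# Three features already expressed as rotation angles in radians.
print(angle_encoding_qasm([0.0, math.pi / 2, math.pi]))
```

The circuit has depth 1 regardless of the number of features, which matches the O(1) depth usually quoted for angle encoding.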
Full pipeline
import quprep as qd
pipeline = qd.Pipeline(
    cleaner=qd.Imputer(),
    reducer=qd.PCAReducer(n_components=8),
    encoder=qd.IQPEncoder(reps=2),
    exporter=qd.PennyLaneExporter(),  # pip install quprep[pennylane]
)
result = pipeline.fit_transform("data.csv")
qnode = result.circuit # callable qml.QNode
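The classical half of a pipeline like this one (impute, reduce to the qubit budget, rescale to rotation angles) can be sketched with NumPy alone so that each step is visible. This is an illustration of the general technique, not QuPrep's implementation: the helper names are hypothetical, mean imputation and the [0, π] target range are assumptions, and the data is random.

```python
import numpy as np


def impute_mean(X):
    """Replace each NaN with the mean of its column."""
    X = np.array(X, dtype=float)
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X


def pca_reduce(X, n_components):
    """Project centered data onto its top principal components via SVD."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def scale_to_angles(X, low=0.0, high=np.pi):
    """Min-max scale each column into [low, high]."""
    mins, maxs = X.min(axis=0), X.max(axis=0)
    span = np.where(maxs > mins, maxs - mins, 1.0)  # avoid division by zero
    return low + (X - mins) / span * (high - low)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 12))            # 50 samples, 12 features
X[rng.random(X.shape) < 0.05] = np.nan   # inject some missing values

angles = scale_to_angles(pca_reduce(impute_mean(X), n_components=8))
print(angles.shape)  # (50, 8)
```

Reducing 12 features to 8 components means an encoding that uses one qubit per feature would need 8 qubits instead of 12.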
Data modalities — time series, images, text, graphs
import quprep as qd
# Time series — sliding window then encode
from quprep.ingest.time_series_ingester import TimeSeriesIngester
from quprep.clean.window_transformer import WindowTransformer
result = qd.Pipeline(
    preprocessor=WindowTransformer(window_size=5, step=1),
    encoder=qd.AngleEncoder(),
).fit_transform(TimeSeriesIngester(time_column="date").load("sensor.csv"))
# Images — pip install quprep[image]
from quprep.ingest.image_ingester import ImageIngester
result = qd.prepare("images/", encoding="angle", ingester=ImageIngester(size=(8, 8), grayscale=True))
# Text — TF-IDF (no deps) or sentence-transformers (pip install quprep[text])
from quprep.ingest.text_ingester import TextIngester
texts = ["quantum computing is powerful", "machine learning meets QML", ...]
result = qd.prepare(texts, encoding="angle", ingester=TextIngester(method="tfidf", max_features=16))
# Graphs — lossless graph state encoding
from quprep.ingest.graph_ingester import GraphIngester
from quprep.encode.graph_state import GraphStateEncoder
import numpy as np
graph_list = [np.array([[0,1,1],[1,0,0],[1,0,0]], dtype=float), ...] # adjacency matrices
result = qd.Pipeline(encoder=GraphStateEncoder()).fit_transform(
    GraphIngester(features="adjacency").load(graph_list)
)
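The sliding-window step in the time series example turns one long series into fixed-length rows that an encoder can treat as feature vectors. The NumPy sketch below shows the general idea for `window_size=5, step=1`. It is an assumption about what such a transform does, not the code behind `WindowTransformer`, and `make_windows` is a hypothetical helper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def make_windows(series, window_size=5, step=1):
    """Return a 2-D array whose rows are overlapping windows of `series`."""
    series = np.asarray(series, dtype=float)
    windows = sliding_window_view(series, window_shape=window_size)
    return windows[::step]  # keep every `step`-th window


readings = np.arange(8.0)  # stand-in for a sensor column
windows = make_windows(readings, window_size=5, step=1)
print(windows.shape)  # (4, 5)
print(windows[0])     # [0. 1. 2. 3. 4.]
```

A series of length T yields T - window_size + 1 windows at step 1, so each window becomes one sample with `window_size` features.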
More features
| Feature | Docs |
|---|---|
| Encoding recommendation — ranked by dataset profile and task | guide |
| Qubit budget suggestion — NISQ-safe ceiling with reasoning | API |
| Side-by-side encoder comparison — depth, gates, NISQ safety | API |
| Data drift detection — warn when new data leaves training distribution | API |
| Pipeline save / load — serialize fitted pipelines, no re-fitting | API |
| Schema validation & cost estimation — gate count before encoding | guide |
| QUBO / Ising formulation — Max-Cut, TSP, Knapsack, QAOA circuits, D-Wave export | guide |
| Plugin system — register custom encoders and exporters | guide |
| Circuit visualization — ASCII (no deps) or matplotlib | API |
| Batch QASM export — save all samples to disk as individual files | API |
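As background for the QUBO / Ising row, the sketch below builds the standard Max-Cut QUBO matrix for a small unweighted graph and solves it by brute force. It illustrates the textbook formulation only. It does not use QuPrep's API, and `maxcut_qubo` and `brute_force_minimum` are hypothetical helpers.

```python
import itertools

import numpy as np


def maxcut_qubo(adjacency):
    """Upper-triangular Q with x^T Q x == -(cut size) for binary x.

    Assumes a symmetric adjacency matrix with a zero diagonal.
    """
    A = np.asarray(adjacency, dtype=float)
    Q = 2.0 * np.triu(A, k=1)                    # +2 * x_i * x_j per edge
    Q[np.diag_indices_from(Q)] = -A.sum(axis=1)  # -degree(i) * x_i
    return Q


def brute_force_minimum(Q):
    """Try every bitstring; only feasible for a handful of variables."""
    n = Q.shape[0]
    best = min(
        (np.array(bits) for bits in itertools.product([0, 1], repeat=n)),
        key=lambda x: x @ Q @ x,
    )
    return best, float(best @ Q @ best)


# Star graph: node 0 connected to nodes 1 and 2.
adjacency = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
Q = maxcut_qubo(adjacency)
assignment, energy = brute_force_minimum(Q)
print(assignment, energy)
```

The minimum energy equals minus the maximum cut, so the energy of -2 found here corresponds to cutting both edges of the star.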
Supported encodings
| Encoding | Qubits | Depth | NISQ-safe | Best for |
|---|---|---|---|---|
| Angle (Ry/Rx/Rz) | n = d | O(1) | ✅ Excellent | Most QML tasks |
| Amplitude | ⌈log₂ d⌉ | O(2ⁿ) | ❌ Poor | Qubit-limited scenarios |
| Basis | n = d | O(1) | ✅ Excellent | Binary features / QAOA |
| Entangled Angle | n = d | O(d · layers) | ✅ Good | Feature correlations |
| IQP | n = d | O(d² · reps) | ⚠️ Medium | Kernel methods |
| Re-uploading | n = d | O(d · layers) | ✅ Good | High-expressivity QNNs |
| Hamiltonian | n = d | O(d · steps) | ⚠️ Medium | Physics simulation / VQE |
| ZZ Feature Map | n = d | O(d² · reps) | ⚠️ Medium | Quantum kernel methods |
| Pauli Feature Map | n = d | O(d² · reps) | ⚠️ Medium | Configurable kernel methods |
| Random Fourier | n_components | O(1) | ✅ Excellent | RBF kernel approximation |
| Tensor Product | ⌈d/2⌉ | O(1) | ✅ Excellent | Qubit-efficient encoding |
| QAOA Problem | n = d | O(p) | ✅ Good | QAOA warm-start, problem-inspired maps |
| Graph State | n = nodes | O(edges) | ✅ Good | Graph-structured data (lossless) |
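The Qubits column can be turned into a quick budget check. The sketch below computes qubit counts for a few of the encodings in the table and shows why amplitude encoding needs only ⌈log₂ d⌉ qubits: the feature vector is zero-padded to a power of two and normalized so it can serve as a state vector. The encoding labels and both helper functions are hypothetical, and zero-padding is an assumption about how non-power-of-two inputs would be handled.

```python
import numpy as np


def qubits_required(encoding, d):
    """Qubit count for d features, following the Qubits column of the table."""
    if encoding == "amplitude":
        # ceil(log2(d)) computed with integers; at least one qubit (assumption).
        return max(1, (d - 1).bit_length())
    if encoding == "tensor_product":
        return (d + 1) // 2  # ceil(d / 2)
    if encoding in {"angle", "basis", "iqp", "zz_feature_map"}:
        return d
    raise ValueError(f"unknown encoding: {encoding}")


def amplitude_vector(x):
    """Zero-pad x to the next power of two and normalize to unit L2 norm."""
    x = np.asarray(x, dtype=float)
    n = qubits_required("amplitude", len(x))
    padded = np.zeros(2**n)
    padded[: len(x)] = x
    norm = np.linalg.norm(padded)
    if norm == 0:
        raise ValueError("cannot amplitude-encode the zero vector")
    return padded / norm


for name in ("angle", "tensor_product", "amplitude"):
    print(name, qubits_required(name, 16))

print(amplitude_vector([3.0, 4.0, 0.0]))
```

For 16 features this gives 16, 8, and 4 qubits respectively, which is the trade-off the table describes: amplitude encoding saves qubits at the cost of a much deeper state-preparation circuit.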
Supported export frameworks
| Framework | Install | Output |
|---|---|---|
| OpenQASM 3.0 | (included) | str |
| Qiskit | pip install quprep[qiskit] | QuantumCircuit |
| PennyLane | pip install quprep[pennylane] | qml.QNode |
| Cirq | pip install quprep[cirq] | cirq.Circuit |
| TKET | pip install quprep[tket] | pytket.Circuit |
| Amazon Braket | pip install quprep[braket] | braket.Circuit |
| Q# | pip install quprep[qsharp] | Q# operation string |
| IQM | pip install quprep[iqm] | IQM circuit JSON |
Documentation
Full documentation at docs.quprep.org
Examples
Contributing
Contributions are welcome. Please read CONTRIBUTING.md before opening a pull request.
- Open an issue for bugs or feature requests
- Start a discussion for questions or ideas
License
Apache 2.0 — see LICENSE.
Citation
If you use QuPrep in your research, please cite:
@software{quprep2026,
author = {Perera, Hasarindu},
title = {QuPrep: Quantum Data Preparation},
year = {2026},
publisher = {Zenodo},
version = {0.8.0},
doi = {10.5281/zenodo.19286258},
url = {https://doi.org/10.5281/zenodo.19286258},
license = {Apache-2.0},
}