Unified Transformer for Multi-Task Data Quality
UNIDQ: Unified Data Quality
A unified transformer architecture for multi-task data quality assessment.
UNIDQ addresses six data quality tasks with a single model:
- ✅ Error Detection (F1=0.894, +42% vs Raha)
- ✅ Data Repair
- ✅ Missing Value Imputation (R²=0.941, +295% vs MICE)
- ✅ Label Noise Detection (F1=0.856, +28% vs Cleanlab)
- ✅ Label Classification
- ✅ Data Valuation
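The scores above are reported as F1 (detection tasks) and R² (imputation). As a reference for reading them, here is a minimal NumPy sketch of the standard definitions of both metrics. The helper names `f1_score` and `r2_score` are illustrative and are not part of the `unidq` package; how the reported numbers were actually computed (datasets, averaging, thresholds) is not stated here.

```python
import numpy as np


def f1_score(true_mask, pred_mask):
    """F1 for binary masks, treating True as the positive (erroneous) class."""
    true_mask = np.asarray(true_mask, dtype=bool)
    pred_mask = np.asarray(pred_mask, dtype=bool)
    tp = np.sum(true_mask & pred_mask)    # errors correctly flagged
    fp = np.sum(~true_mask & pred_mask)   # clean cells wrongly flagged
    fn = np.sum(true_mask & ~pred_mask)   # errors that were missed
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    if ss_tot == 0:
        # Constant targets: R² is undefined; return 0.0 by convention here.
        return 0.0
    return float(1.0 - ss_res / ss_tot)
```

An R² of 1.0 means the imputed values match the true values exactly, while 0.0 means the predictions do no better than always predicting the mean of the true values.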
Installation
pip install unidq
Quick Start
from unidq import UNIDQ, MultiTaskDataset, UNIDQTrainer

# Load your data. X_dirty, X_clean, errors, y, and X_new are placeholders
# for your own arrays; they are not defined by this snippet.
dataset = MultiTaskDataset(
    dirty_features=X_dirty,
    clean_features=X_clean,
    error_mask=errors,
    labels=y,
)

# Initialize the model with one input per feature column
model = UNIDQ(n_features=X_dirty.shape[1])

# Train
trainer = UNIDQTrainer(model)
trainer.fit(dataset)

# Predict on new data
results = model.predict(X_new)
print(f"Detected errors: {results['errors']}")
print(f"Imputed values: {results['imputed']}")
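The Quick Start assumes that `X_dirty`, `X_clean`, `errors`, and `y` already exist. For a first experiment, the sketch below builds toy versions of those four inputs with NumPy only. The helper `make_toy_inputs`, the array shapes, and the dtypes are assumptions made for illustration; the formats `MultiTaskDataset` actually expects are not documented here, so check them against the package before relying on this.

```python
import numpy as np


def make_toy_inputs(n_rows=200, n_features=5, error_rate=0.1, seed=0):
    """Build synthetic (X_dirty, X_clean, errors, y) arrays.

    Hypothetical helper: shapes and dtypes are guesses, not the
    documented input format of unidq.
    """
    rng = np.random.default_rng(seed)

    # Ground-truth feature matrix.
    X_clean = rng.normal(size=(n_rows, n_features))

    # Boolean mask marking which cells get corrupted.
    errors = rng.random((n_rows, n_features)) < error_rate

    # Corrupt the masked cells with a large additive offset.
    noise = rng.normal(loc=5.0, scale=1.0, size=X_clean.shape)
    X_dirty = np.where(errors, X_clean + noise, X_clean)

    # Simple binary label derived from the first clean feature.
    y = (X_clean[:, 0] > 0).astype(np.int64)

    return X_dirty, X_clean, errors, y


X_dirty, X_clean, errors, y = make_toy_inputs()
```

Because the mask is generated before the corruption is applied, `errors` marks exactly the cells where `X_dirty` and `X_clean` differ, which gives a known ground truth for checking detection and repair output.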
Citation
If you use UNIDQ in your research, please cite:
@inproceedings{unidq2026,
  title={UNIDQ: A Unified Transformer Architecture for Multi-Task Data Quality},
  author={shivakoreddi and sravanisowrupilli},
  booktitle={Proceedings of the VLDB Endowment},
  year={2026}
}
License
MIT License
File details
Details for the file unidq-0.1.1.tar.gz.
File metadata
- Download URL: unidq-0.1.1.tar.gz
- Upload date:
- Size: 15.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.8.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 633393977c117a38960932dab9b67456df638af574845bf0d175858b9b331b09 |
| MD5 | 2aba7e183316b55104d8a08aeac359fc |
| BLAKE2b-256 | 555e617f4881a0b841199d70313f10e14c258d54bf9129602b1fb7639ccfeb9d |
File details
Details for the file unidq-0.1.1-py3-none-any.whl.
File metadata
- Download URL: unidq-0.1.1-py3-none-any.whl
- Upload date:
- Size: 13.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.8.18
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 43ccf7dd7949893470dd057d21a574311e0eb337b6434e905d84b14158152364 |
| MD5 | 9071e25a312e0f431ccedeb3de4bda1e |
| BLAKE2b-256 | f776d6f3a480bfca481572fb5bff4825d211e3036eca99a695fd2e9b1bc0de8e |