A flexible data ingestion library for various file formats
Data Ingestors
Get your data into the tracebloc training environment: validated, clean, and ready for model evaluation.
These pipelines handle the full data preparation workflow: validation, preprocessing, and secure transfer into your Kubernetes cluster. A metadata representation syncs to the tracebloc web app so you can manage datasets visually. Your raw data never leaves your infrastructure.
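As a rough illustration of the validate, preprocess, transfer sequence described above, the sketch below chains three stages over in-memory records. It does not use the tracebloc-ingestor API. Every name in it (`validate`, `preprocess`, `transfer`, `run_pipeline`, the `id` and `label` fields) is hypothetical, and the transfer step only appends to a list that stands in for storage inside your cluster.

```python
from typing import Dict, Iterable, List

Record = Dict[str, str]

# Hypothetical schema: fields every record must carry to pass validation.
REQUIRED_FIELDS = ("id", "label")


def validate(records: Iterable[Record]) -> List[Record]:
    """Keep only records with a non-empty value for every required field."""
    return [
        r for r in records
        if all(r.get(field, "").strip() for field in REQUIRED_FIELDS)
    ]


def preprocess(records: Iterable[Record]) -> List[Record]:
    """Normalize values: trim whitespace everywhere and lowercase the label."""
    cleaned = []
    for r in records:
        row = {key: value.strip() for key, value in r.items()}
        row["label"] = row["label"].lower()
        cleaned.append(row)
    return cleaned


def transfer(records: Iterable[Record], destination: List[Record]) -> int:
    """Stand-in for the transfer step: copy records into `destination`."""
    before = len(destination)
    destination.extend(records)
    return len(destination) - before


def run_pipeline(raw: Iterable[Record], destination: List[Record]) -> int:
    """Run the three stages in order; return how many records were transferred."""
    return transfer(preprocess(validate(raw)), destination)
```

Calling `run_pipeline(raw_records, store)` with a list of dictionaries drops records that fail validation, normalizes the rest, and returns the number of records written to `store`.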
How it works
   Your raw data
         │
         ▼
┌─────────────────┐     ┌──────────────────────────────────┐
│  Data ingestor  │────►│  Your Kubernetes cluster         │
│                 │     │                                  │
│  Validates      │     │  Validated dataset               │
│  Preprocesses   │     │  (ready for training)            │
│  Transfers      │     │                                  │
└─────────────────┘     └────────────────┬─────────────────┘
                                         │
                                   Metadata only
                                         │
                                         ▼
                           ┌──────────────────────────┐
                           │  tracebloc web app       │
                           │  (dataset management UI) │
                           └──────────────────────────┘
Data stays on your infrastructure. Only metadata (structure, schema, statistics) syncs to the web app for dataset management and vendor guidance.
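To make the "metadata only" idea concrete, here is a minimal sketch of the kind of summary (column names, inferred types, basic statistics) that can be derived from a tabular file without copying any row values. The function name `summarize_csv` and the shape of the returned dictionary are assumptions for illustration. The actual metadata format that tracebloc syncs is not described here.

```python
import csv
import io
from typing import Any, Dict, TextIO


def _is_number(value: str) -> bool:
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def summarize_csv(stream: TextIO) -> Dict[str, Any]:
    """Build a schema-and-statistics summary of a CSV stream.

    Only aggregate facts are kept (row count, per-column type, missing
    count, numeric min/max); individual cell values are not stored.
    """
    reader = csv.DictReader(stream)
    names = reader.fieldnames or []
    stats = {
        n: {"missing": 0, "numeric": True, "min": None, "max": None}
        for n in names
    }
    rows = 0
    for row in reader:
        rows += 1
        for n in names:
            value = (row.get(n) or "").strip()
            s = stats[n]
            if not value:
                s["missing"] += 1
            elif s["numeric"] and _is_number(value):
                x = float(value)
                s["min"] = x if s["min"] is None else min(s["min"], x)
                s["max"] = x if s["max"] is None else max(s["max"], x)
            else:
                # One non-numeric value makes the whole column text.
                s["numeric"] = False
                s["min"] = s["max"] = None

    columns = {}
    for n in names:
        s = stats[n]
        if s["missing"] == rows:
            kind = "empty"
        elif s["numeric"]:
            kind = "numeric"
        else:
            kind = "text"
        columns[n] = {
            "type": kind,
            "missing": s["missing"],
            "min": s["min"],
            "max": s["max"],
        }
    return {"rows": rows, "columns": columns}


if __name__ == "__main__":
    sample = io.StringIO("age,city\n30,Berlin\n,Munich\n25.5,\n")
    print(summarize_csv(sample))
```

The summary describes the dataset well enough to list and compare it in a UI, while the rows themselves stay where they are.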
Supported data types
| Type | Examples |
|---|---|
| Image | Classification, detection, segmentation datasets |
| Text / NLP | Document classification, sentiment, named entities |
| Tabular | Structured CSV data, feature tables |
| Time series | Sequential measurements, forecasting datasets |
Install
pip install tracebloc-ingestor
Prerequisites
- Python 3.8+
- A tracebloc account with an active use case
- A running tracebloc client on your infrastructure
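The first prerequisite and the install step can be checked from Python itself. The sketch below uses only the standard library. The helper names `python_ok` and `installed_version` are made up for this example, and it cannot verify the account or the running client, which have to be checked on the tracebloc side.

```python
import sys
from importlib import metadata  # in the standard library since Python 3.8
from typing import Optional, Sequence

MIN_PYTHON = (3, 8)
PACKAGE = "tracebloc-ingestor"  # distribution name used in the pip command


def python_ok(version_info: Sequence[int] = sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package: str = PACKAGE) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python 3.8+:", python_ok())
    print("tracebloc-ingestor version:", installed_version() or "not installed")
```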
For step-by-step data preparation instructions, see the Prepare Data guide.
Links
Platform · Docs · Data preparation guide · Discord
License
Apache 2.0; see LICENSE.
Questions? support@tracebloc.io or open an issue.