
A flexible data ingestion library for various file formats

Project description


Data Ingestors 📊

Get your data into the tracebloc training environment: validated, clean, and ready for model evaluation.

These pipelines handle the full data preparation workflow: validation, preprocessing, and secure transfer into your Kubernetes cluster. A metadata representation syncs to the tracebloc web app so you can manage datasets visually. Your raw data never leaves your infrastructure.

How it works

Your raw data
     │
     ▼
┌──────────────────┐     ┌───────────────────────────────────┐
│  Data ingestor   │────►│  Your Kubernetes cluster          │
│                  │     │                                   │
│  Validates       │     │  Validated dataset                │
│  Preprocesses    │     │  (ready for training)             │
│  Transfers       │     │                                   │
└──────────────────┘     └─────────────┬─────────────────────┘
                                       │
                                 Metadata only
                                       │
                                       ▼
                         ┌───────────────────────────┐
                         │  tracebloc web app        │
                         │  (dataset management UI)  │
                         └───────────────────────────┘

Data stays on your infrastructure. Only metadata (structure, schema, statistics) syncs to the web app for dataset management and vendor guidance.
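To make "metadata only" concrete, here is a minimal sketch of a summary that carries structure, schema, and statistics but no raw rows. It is not the tracebloc-ingestor API: the function name `summarize_csv`, the output keys, and the two-type schema ("number" / "string") are assumptions chosen for illustration, and it uses only the Python standard library.

```python
import csv
import io
import statistics


def summarize_csv(text):
    """Return a metadata-only summary of CSV text (hypothetical helper).

    The result holds column names, a row count, an inferred type per
    column, and basic statistics for numeric columns. No cell values
    from the input are copied into the result.
    """
    reader = csv.DictReader(io.StringIO(text))
    columns = list(reader.fieldnames or [])
    cells = {name: [] for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            cells[name].append(row[name])

    schema = {}
    stats = {}
    for name in columns:
        column = cells[name]
        if not column:
            # No rows at all: the type cannot be inferred.
            schema[name] = "unknown"
            continue
        try:
            numbers = [float(cell) for cell in column]
        except (TypeError, ValueError):
            # At least one cell is missing or non-numeric.
            schema[name] = "string"
            continue
        schema[name] = "number"
        stats[name] = {
            "min": min(numbers),
            "max": max(numbers),
            "mean": statistics.fmean(numbers),
        }

    return {"columns": columns, "rows": row_count, "schema": schema, "stats": stats}


sample = "age,label\n31,cat\n45,dog\n"
summary = summarize_csv(sample)
print(summary)
```

The summary describes the shape of the dataset while the rows themselves stay where they are, which is the property the diagram above relies on.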

Supported data types

Type         Examples
Image        Classification, detection, segmentation datasets
Text / NLP   Document classification, sentiment, named entities
Tabular      Structured CSV data, feature tables
Time series  Sequential measurements, forecasting datasets

Install

pip install tracebloc-ingestor
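One way to confirm the install succeeded is to ask Python for the installed distribution's version. The sketch below uses only the standard library's `importlib.metadata`; the distribution name is taken from the pip command above, and the helper name `installed_version` is an illustrative assumption, not part of the package.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("tracebloc-ingestor")
if found is None:
    print("tracebloc-ingestor is not installed in this environment")
else:
    print(f"tracebloc-ingestor {found} is installed")
```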

Prerequisites

For step-by-step data preparation instructions, see the Prepare Data guide.

Links

Platform · Docs · Data preparation guide · Discord

License

Apache 2.0. See LICENSE.

Questions? support@tracebloc.io or open an issue.

Download files


Source Distribution

tracebloc_ingestor-0.2.11.tar.gz (45.3 kB)

Uploaded Source

Built Distribution


tracebloc_ingestor-0.2.11-py3-none-any.whl (61.1 kB)

Uploaded Python 3

File details

Details for the file tracebloc_ingestor-0.2.11.tar.gz.

File metadata

  • Download URL: tracebloc_ingestor-0.2.11.tar.gz
  • Upload date:
  • Size: 45.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for tracebloc_ingestor-0.2.11.tar.gz

Algorithm    Hash digest
SHA256       337c026a7900e158e9005cec8d5ccd7b242a013679dbcb94ac09eb0ac4452e13
MD5          8ac109ab8e75acff11a5990facb65c28
BLAKE2b-256  fa27d23196708e012a79413abf21f4b1a00a0eecd19a52165e50e43e78ad8d90

See more details on using hashes here.
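A downloaded file can be checked against its published SHA256 digest with the standard library alone. The sketch below assumes the source archive was saved to the current working directory under its published file name; the helper names `sha256_of` and `matches_published` are illustrative, not part of any tool. The same function works for the wheel if you pass its file name and its digest.

```python
import hashlib
import hmac
import os


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex):
    """Return True if the file's SHA256 digest equals the expected value."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Digest copied from the "File hashes" listing for the source distribution.
SDIST_NAME = "tracebloc_ingestor-0.2.11.tar.gz"
SDIST_SHA256 = "337c026a7900e158e9005cec8d5ccd7b242a013679dbcb94ac09eb0ac4452e13"

# Assumption: the archive sits in the current directory; skip otherwise.
if os.path.exists(SDIST_NAME):
    ok = matches_published(SDIST_NAME, SDIST_SHA256)
    print("digest matches" if ok else "digest does NOT match")
else:
    print(f"{SDIST_NAME} not found; download it first")
```

Reading in chunks keeps memory use flat regardless of file size, and a mismatch means the file should not be installed.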

File details

Details for the file tracebloc_ingestor-0.2.11-py3-none-any.whl.

File metadata

File hashes

Hashes for tracebloc_ingestor-0.2.11-py3-none-any.whl

Algorithm    Hash digest
SHA256       2e25a4fe5d81fb855561e222572c1636c91b87836831678d0f287ee5f604a88d
MD5          b06a19d61fb81ba61af6a13c25b751c0
BLAKE2b-256  0f3788cc84532879cb44f866e37a4a8c7d1b718860c1f94eab80fe74acd3a32e

