The Ultimate Data Cleaning Engine for Python

Tidely

The Operating System for Data Quality

Zero-Configuration • Explainable • Deterministic • Fast


Install

pip install tidely

The Magic

import tidely as td

# Clean the file in a single call
result = td.clean("sales.csv")

# The cleaned data
clean_df = result.df

# Summary of the cleaning run
print(result.summary())

Why Tidely?

Real-world datasets are messy.

  • Missing values

  • Broken dates

  • Mixed datatypes

  • Duplicate records

  • Memory waste

  • Encoding issues

  • Schema drift

Normally you spend hours writing cleaning scripts.

Tidely turns all of that into a single function call.
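
For contrast, here is a minimal sketch of the kind of hand-written pandas script this section refers to. It is not Tidely's implementation; the sample data, the column names, and the `clean_manually` helper are all hypothetical and exist only for illustration.

```python
import io

import pandas as pd

# Hypothetical messy input, standing in for a file such as sales.csv.
RAW = """order_id,order_date,amount
1,2024-01-05,10.5
2,not a date,20
2,not a date,20
3,2024-02-11,unknown
"""


def clean_manually(csv_text: str) -> pd.DataFrame:
    """Hand-rolled cleaning: every step is written and maintained by you."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Duplicate records: drop exact duplicate rows.
    df = df.drop_duplicates().reset_index(drop=True)
    # Broken dates: unparseable values become NaT instead of raising.
    df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
    # Mixed datatypes: non-numeric entries become NaN.
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    return df


cleaned = clean_manually(RAW)
print(cleaned.dtypes)
```

Each new dataset tends to need its own variant of a script like this, which is the repetitive work a single cleaning call is meant to replace.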


Dataset Intelligence

profile = td.inspect("sales.csv")

profile.show()

Output

✔ Trust Score

✔ Dataset DNA

✔ Semantic Detection

✔ Missing Values

✔ Duplicate Analysis

✔ Memory Analysis

✔ ML Readiness

✔ Data Quality Score
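
To give a feel for what a few of these checks measure, the sketch below computes missing values, duplicate rows, and memory use with plain pandas. It is an assumption-laden illustration, not how `td.inspect` works; `basic_profile` is a hypothetical helper, and the scores listed above (Trust Score, Data Quality Score, and so on) are not reproduced here.

```python
import pandas as pd


def basic_profile(df: pd.DataFrame) -> dict:
    """Collect a few simple data-quality measurements (hypothetical helper)."""
    return {
        "rows": len(df),
        # Count of missing cells in each column.
        "missing_per_column": df.isna().sum().to_dict(),
        # Rows that exactly repeat an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
        # deep=True also counts the contents of string/object columns.
        "memory_bytes": int(df.memory_usage(deep=True).sum()),
    }


df = pd.DataFrame({"city": ["Oslo", "Oslo", None], "sales": [10, 10, 7]})
report = basic_profile(df)
print(report)
```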


Why Use Tidely?

Feature | Pandas | Tidely
Read CSV | |
Auto Detect Dates | |
Auto Clean Dataset | |
Memory Optimization | Manual | Automatic
Duplicate Detection | Manual | Automatic
Missing Value Strategy | Manual | Automatic
Semantic Column Detection | |
Explain Every Change | |
Health Score | |
Trust Score | |
Production Summary | |

Production Validation

Tidely has been validated on the following dataset types:

Dataset Type Status
CSV
Excel (.xlsx)
ARFF
Government Open Data
Educational Data
ML Benchmark Datasets
Large CSV (>3 Million Rows)
Time Series
Mixed Datatypes
Corrupted Data

Validation Results

Version

v1.3.0b2

Dataset | Rows | Health Before | Health After
Parking Meters | 52 | 94 | 96
Credit-G | 1000 | 86 | 90
Diabetes | 768 | 86 | 92
Iris | 150 | 92 | 92
Allegations | 57 | 95 | 92
Mathematics | 59 | 97 | 94

Benchmarks

3,055,000 Row Dataset

Metric | Result
Runtime | 2.37 sec
Original Memory | 148 MB
Final Memory | 58 MB
Memory Saved | 61%
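
The figures above can be cross-checked with a little arithmetic. The snippet below only reuses the numbers from the table; the throughput value is derived from them and is not a separately reported result.

```python
rows = 3_055_000
runtime_sec = 2.37
original_mb = 148
final_mb = 58

# Fraction of memory saved: (148 - 58) / 148
saved_fraction = (original_mb - final_mb) / original_mb
print(f"Memory saved: {saved_fraction:.0%}")

# Derived throughput, roughly 1.29 million rows per second
rows_per_sec = rows / runtime_sec
print(f"Throughput: {rows_per_sec:,.0f} rows/sec")
```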

Supported Formats

  • CSV

  • Excel

  • Parquet

  • JSON

  • TSV

  • Feather

  • ARFF

More coming soon.
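
As an illustration of how several formats can sit behind one entry point, here is a sketch that picks a pandas reader from the file extension. This is an assumption about one possible design, not Tidely's actual loader; `READERS` and `pick_reader` are hypothetical names. ARFF is left out because pandas has no built-in ARFF reader.

```python
from functools import partial
from pathlib import Path

import pandas as pd

# Map file extensions to pandas reader functions (illustrative only).
READERS = {
    ".csv": pd.read_csv,
    ".tsv": partial(pd.read_csv, sep="\t"),
    ".xlsx": pd.read_excel,
    ".parquet": pd.read_parquet,
    ".json": pd.read_json,
    ".feather": pd.read_feather,
}


def pick_reader(path: str):
    """Return the reader for a path, based on its (case-insensitive) suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix!r}")
    return READERS[suffix]


reader = pick_reader("sales.CSV")
print(reader.__name__)
```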


Explainable Cleaning

Tidely never silently changes your data.

Every transformation is documented.

Example

✓ Converted "Order Date" to datetime

  Reason: Detected temporal values.

  Impact: Allows time-series operations.


✓ Downcasted int64 → int16

  Reason: Values fit inside int16.

  Impact: 61% lower memory.
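
The change / reason / impact pattern shown above can be sketched with plain pandas. This is a hedged illustration of the idea, not Tidely's code: `downcast_with_log` and the shape of its log entries are hypothetical.

```python
import pandas as pd


def downcast_with_log(df: pd.DataFrame):
    """Downcast integer columns and record why each change was made."""
    out = df.copy()
    log = []
    for col in out.columns:
        if not pd.api.types.is_integer_dtype(out[col]):
            continue
        before = out[col].dtype
        # pandas picks the smallest signed integer type that holds all values.
        converted = pd.to_numeric(out[col], downcast="integer")
        after = converted.dtype
        if after != before:
            out[col] = converted
            saved = 1 - after.itemsize / before.itemsize
            log.append({
                "change": f'Downcast "{col}" from {before} to {after}',
                "reason": f"All values fit inside {after}.",
                "impact": f"{saved:.0%} less memory for this column.",
            })
    return out, log


df = pd.DataFrame({"quantity": pd.Series([1, 250, 30000], dtype="int64")})
cleaned, log = downcast_with_log(df)
for entry in log:
    print(entry["change"], "|", entry["reason"], "|", entry["impact"])
```

For a single column, moving from 8-byte to 2-byte integers saves 75% of that column's memory; the saving for a whole dataset depends on its other columns.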


Philosophy

Tidely follows three principles.

Never silently modify data. Every transformation is visible.

Deterministic. Same input, same output, every time.

Local first. Runs entirely on your machine: no cloud, no API keys, no LLMs.
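
One way to check the "same input, same output" property for any cleaning step is to fingerprint the result and compare across runs. The sketch below does this for a toy cleaning function; `frame_fingerprint` and `toy_clean` are hypothetical stand-ins, and nothing here is taken from Tidely's internals.

```python
import hashlib

import pandas as pd


def frame_fingerprint(df: pd.DataFrame) -> str:
    """Hash a DataFrame's contents (values and index) into a hex digest."""
    row_hashes = pd.util.hash_pandas_object(df, index=True)
    return hashlib.sha256(row_hashes.to_numpy().tobytes()).hexdigest()


def toy_clean(df: pd.DataFrame) -> pd.DataFrame:
    """Stand-in for a cleaning step: drop duplicates, then sort by id."""
    return df.drop_duplicates().sort_values("id").reset_index(drop=True)


raw = pd.DataFrame({"id": [3, 1, 1], "value": ["c", "a", "a"]})
first = frame_fingerprint(toy_clean(raw))
second = frame_fingerprint(toy_clean(raw.copy()))
print(first == second)
```

A check like this fits naturally in a test suite: if a cleaning step ever depends on randomness or iteration order, the fingerprints stop matching.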


Roadmap

  • CSV Cleaning

  • Explainable Reports

  • Memory Optimization

  • Semantic Detection

  • ARFF Support

  • Excel Support

  • Intelligent Missing Value Imputation

  • Fuzzy Duplicate Detection

  • Streaming Engine

  • DuckDB Integration

  • Out-of-Core Cleaning

  • Auto Feature Engineering

  • SQL Dataset Support

  • Distributed Processing


Contributing

PRs, bug reports, and feature requests are all welcome.


License

MIT
