
The Ultimate Data Cleaning Engine for Python


Tidely

The Operating System for Data Quality


Zero-Configuration • Explainable • Deterministic • Fast


Install

pip install tidely

The Magic

import tidely as td

result = td.clean("sales.csv")
clean_df = result.df
print(result.summary())

Why Tidely?

Real-world datasets are messy.

Missing values.

Broken dates.

Mixed datatypes.

Duplicate records.

Memory waste.

Encoding issues.

Schema drift.

Normally you spend hours writing cleaning scripts.

Tidely turns all of that into a single function call.
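
For contrast, here is a minimal sketch of what a hand-written script for three of the problems above might look like. It uses plain pandas, not Tidely, and the column names and sample rows are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data: one unparseable date, one missing amount,
# and one exact duplicate row.
RAW = """order_id,order_date,amount
1,2024-01-05,19.99
2,not a date,5.00
3,2024-01-07,
3,2024-01-07,
"""

df = pd.read_csv(io.StringIO(RAW))

# Broken dates: values that cannot be parsed become NaT instead of raising.
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")

# Duplicate records: keep the first occurrence of each identical row.
df = df.drop_duplicates().reset_index(drop=True)

# Missing values: fill numeric gaps with the column median (one possible strategy).
df["amount"] = df["amount"].fillna(df["amount"].median())

print(df)
```

Each step is a choice someone has to make and maintain by hand, which is the work a single cleaning call is meant to replace.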


Dataset Intelligence

profile = td.inspect("sales.csv")
profile.show()

Output

✔ Trust Score

✔ Dataset DNA

✔ Semantic Detection

✔ Missing Values

✔ Duplicate Analysis

✔ Memory Analysis

✔ ML Readiness

✔ Data Quality Score
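
To give a feel for what three of these checks measure, the sketch below computes missing values, duplicate rows, and memory use by hand with pandas. The sample frame is hypothetical, and this is not how Tidely computes its scores; it only shows the raw numbers such a profile could start from.

```python
import pandas as pd

# Hypothetical sample frame with one missing value and one duplicated row.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Lima", None],
        "sales": [120, 85, 85, 40],
    }
)

# Missing values: count of empty cells per column.
missing_per_column = df.isna().sum()

# Duplicate analysis: rows identical to an earlier row.
duplicate_rows = int(df.duplicated().sum())

# Memory analysis: total bytes, including the string payloads (deep=True).
memory_bytes = int(df.memory_usage(deep=True).sum())

print(missing_per_column.to_dict())
print("duplicate rows:", duplicate_rows)
print("memory (bytes):", memory_bytes)
```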


Why Use Tidely?

Feature                      Pandas   Tidely
Read CSV
Auto Detect Dates
Auto Clean Dataset
Memory Optimization          Manual   Automatic
Duplicate Detection          Manual   Automatic
Missing Value Strategy       Manual   Automatic
Semantic Column Detection
Explain Every Change
Health Score
Trust Score
Production Summary

Production Validation

Tidely has been validated on the following dataset types:

Dataset Type Status
CSV
Excel (.xlsx)
ARFF
Government Open Data
Educational Data
ML Benchmark Datasets
Large CSV (>3 Million Rows)
Time Series
Mixed Datatypes
Corrupted Data

Validation Results

Version: v1.3.0-beta

Dataset          Rows   Health Before   Health After
Parking Meters     52              94             96
Credit-G         1000              86             90
Diabetes          768              86             92
Iris              150              92             92
Allegations        57              95             92
Mathematics        59              97             94

Benchmarks

3,055,000 Row Dataset

Metric            Result
Runtime           2.37 sec
Original Memory   148 MB
Final Memory      58 MB
Memory Saved      61%
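
The figures above can be cross-checked with a few lines of arithmetic. The sketch below only reuses the numbers from the table; the throughput value is derived from them and is not a separately reported result.

```python
rows = 3_055_000
runtime_sec = 2.37
original_mb = 148
final_mb = 58

saved_mb = original_mb - final_mb            # 90 MB
saved_pct = 100 * saved_mb / original_mb     # about 60.8, reported as 61%
rows_per_sec = rows / runtime_sec            # roughly 1.29 million rows per second

print(f"Memory saved: {saved_mb} MB ({saved_pct:.1f}%)")
print(f"Throughput: {rows_per_sec:,.0f} rows/sec")
```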

Supported Formats

  • CSV

  • Excel

  • Parquet

  • JSON

  • TSV

  • Feather

  • ARFF

More coming soon.
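
For comparison, loading these formats by hand usually means picking a reader per file extension. The sketch below shows one way to do that with pandas. The `READERS` table and the `pick_reader` and `load` helpers are hypothetical names for this example, not part of Tidely. ARFF is left out because pandas has no ARFF reader (SciPy's `scipy.io.arff.loadarff` is one option).

```python
from pathlib import Path

# Extension -> (pandas reader name, extra keyword arguments).
READERS = {
    ".csv": ("read_csv", {}),
    ".tsv": ("read_csv", {"sep": "\t"}),
    ".xlsx": ("read_excel", {}),
    ".parquet": ("read_parquet", {}),
    ".json": ("read_json", {}),
    ".feather": ("read_feather", {}),
}


def pick_reader(path):
    """Return the (reader name, kwargs) pair for a file, based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in READERS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return READERS[suffix]


def load(path):
    """Load a file into a DataFrame using the reader chosen by pick_reader."""
    import pandas as pd  # imported lazily so pick_reader works without pandas

    name, kwargs = pick_reader(path)
    return getattr(pd, name)(path, **kwargs)


print(pick_reader("sales.tsv"))
```

Note that `read_excel` and `read_parquet` need an extra engine installed (for example openpyxl and pyarrow).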


Explainable Cleaning

Tidely never silently changes your data.

Every transformation is documented.

Example

✓ Converted "Order Date" to datetime

Reason

Detected temporal values.

Impact

Allows time-series operations.


✓ Downcast int64 → int16

Reason

Values fit inside int16.

Impact

61% lower memory.
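
The downcast entry above boils down to a range check plus a written record of the decision. Here is a minimal sketch of that idea using only the standard library. The `Change` record and both helper functions are hypothetical and are not Tidely's actual data structures.

```python
from dataclasses import dataclass

# Inclusive value ranges of the signed integer widths, smallest first.
INT_RANGES = [
    ("int8", -2**7, 2**7 - 1),
    ("int16", -2**15, 2**15 - 1),
    ("int32", -2**31, 2**31 - 1),
    ("int64", -2**63, 2**63 - 1),
]
BYTES = {"int8": 1, "int16": 2, "int32": 4, "int64": 8}


@dataclass(frozen=True)
class Change:
    """One documented transformation (hypothetical structure)."""
    action: str
    reason: str
    impact: str


def smallest_int_type(values):
    """Return the narrowest signed integer type that holds every value."""
    lo, hi = min(values), max(values)
    for name, type_min, type_max in INT_RANGES:
        if type_min <= lo and hi <= type_max:
            return name
    raise OverflowError("values do not fit in int64")


def explain_downcast(values, current="int64"):
    """Describe a downcast as a Change, or return None if nothing is gained."""
    target = smallest_int_type(values)
    if BYTES[target] >= BYTES[current]:
        return None
    saved = 100 * (1 - BYTES[target] / BYTES[current])
    return Change(
        action=f"Downcast {current} -> {target}",
        reason=f"Values range from {min(values)} to {max(values)}, which fits in {target}.",
        impact=f"{saved:.0f}% less memory for this column.",
    )


print(explain_downcast([120, 4500, 31000]))
```

A single int64 column stored as int16 shrinks by 75%. The 61% in the example above matches the whole-dataset figure reported under Benchmarks.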


Philosophy

Tidely follows three principles.

Never silently modify data.

Every transformation is visible.

Deterministic.

Same input.

Same output.

Every time.

Local First.

Runs entirely on your machine.

No cloud.

No API keys.

No LLMs.
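
The determinism principle is easy to test from the outside: clean the same input twice and compare fingerprints of the results. The sketch below does this with a tiny stand-in cleaner. `clean_rows` and `fingerprint` are hypothetical helpers written for this example and do not call Tidely.

```python
import hashlib
import json


def clean_rows(rows):
    """Stand-in cleaner: trim strings, drop exact duplicates, keep input order."""
    seen = set()
    cleaned = []
    for row in rows:
        normalized = {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
        key = json.dumps(normalized, sort_keys=True)
        if key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return cleaned


def fingerprint(rows):
    """SHA-256 of a canonical JSON encoding of the rows."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


rows = [
    {"city": " Oslo ", "sales": 120},
    {"city": "Oslo", "sales": 120},
    {"city": "Lima", "sales": 85},
]

first = fingerprint(clean_rows(rows))
second = fingerprint(clean_rows(rows))

# Same input, same output: the two fingerprints must be identical.
print(first == second)
```

The same pattern works for any cleaner that returns serializable output: store the fingerprint from one run and compare it on the next.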


Roadmap

  • CSV Cleaning

  • Explainable Reports

  • Memory Optimization

  • Semantic Detection

  • ARFF Support

  • Excel Support

  • Intelligent Missing Value Imputation

  • Fuzzy Duplicate Detection

  • Streaming Engine

  • DuckDB Integration

  • Out-of-Core Cleaning

  • Auto Feature Engineering

  • SQL Dataset Support

  • Distributed Processing


Contributing

PRs are welcome.

Bug reports are welcome.

Feature requests are welcome.


License

MIT

