Fast, consensus-based date format inference

Reason this release was yanked:

Superseded by 0.1.4

fastdateinfer

Fast, consensus-based date format inference written in Rust with Python bindings.

License: MIT. Requires Python 3.10+.

Why?

The problem: Is 01/02/2025 January 2nd or February 1st?

| Library | Approach | Problem |
|---|---|---|
| pandas | dayfirst=True hint | You must know the format |
| dateutil | Guess per-element | Inconsistent results |
| hidateinfer | Consensus voting | Correct, but slow |

The solution: If your data contains 15/03/2025, we know it's DD/MM/YYYY (15 can't be a month). Assuming the whole column shares one format, that evidence applies to every date in it, which resolves ambiguous ones like 01/02/2025.

fastdateinfer implements this consensus algorithm in Rust and runs about 270x faster than hidateinfer in the benchmark below.
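The ambiguity and its resolution can be reproduced with the standard library alone. The sketch below is not fastdateinfer's implementation. It simply tries two candidate formats with datetime.strptime and keeps those that every date satisfies. The helper names `parses` and `formats_matching_all` are made up for this illustration.

```python
from datetime import datetime

# Only the two slash-separated candidates are considered in this sketch.
CANDIDATES = ["%d/%m/%Y", "%m/%d/%Y"]


def parses(date_string: str, fmt: str) -> bool:
    """Return True if date_string is valid under fmt."""
    try:
        datetime.strptime(date_string, fmt)
        return True
    except ValueError:
        return False


def formats_matching_all(dates: list[str]) -> list[str]:
    """Keep only the candidate formats that every date in the list satisfies."""
    return [fmt for fmt in CANDIDATES if all(parses(d, fmt) for d in dates)]


# On its own, 01/02/2025 is valid under both formats.
print(formats_matching_all(["01/02/2025"]))
# Adding 15/03/2025 rules out month-first, because 15 is not a valid month.
print(formats_matching_all(["01/02/2025", "15/03/2025"]))
```

Trying every candidate against every date is the slow, brute-force version of the idea; it is shown here only to make the reasoning concrete.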

Installation

pip install fastdateinfer

Quick Start

import fastdateinfer

# Infer format from dates
result = fastdateinfer.infer(["15/03/2025", "01/02/2025", "28/12/2025"])
print(result.format)      # %d/%m/%Y
print(result.confidence)  # 1.0

# Just get the format string
fmt = fastdateinfer.infer_format(["2025-01-15", "2025-03-20"])
print(fmt)  # %Y-%m-%d

# Use with pandas
import pandas as pd
dates = ["15/03/2025", "01/02/2025", "28/12/2025"]
fmt = fastdateinfer.infer_format(dates)
parsed = pd.to_datetime(dates, format=fmt)  # a DatetimeIndex, not a DataFrame

Benchmarks

vs hidateinfer (Python)

Tested on 29,351 real-world dates across multiple formats:

| Library | Time | Speedup |
|---|---|---|
| fastdateinfer | 22.5 ms | |
| hidateinfer | 6,075 ms | 270x slower |

vs pandas / polars

Comparison on synthetic data (DD/MM/YYYY format):

| Dates | fastdateinfer | pandas (explicit) | pandas (mixed) | Ratio |
|---|---|---|---|---|
| 100 | 0.05 ms | 0.24 ms | 0.25 ms | 5x faster |
| 1,000 | 0.48 ms | 0.97 ms | 1.02 ms | 2x faster |
| 10,000 | 0.74 ms | 2.14 ms | 2.20 ms | 3x faster |
| 100,000 | 3.39 ms | 17.00 ms | 17.50 ms | 5x faster |

Note: the two tools do different work. pandas parses every date against a known format, while fastdateinfer only infers the format and fully analyzes about 1000 sampled dates, which is why it finishes sooner.

Scaling

| Dates | Time | Per-date |
|---|---|---|
| 1,000 | 0.48 ms | 0.48 µs |
| 10,000 | 0.74 ms | 0.07 µs |
| 100,000 | 3.39 ms | 0.03 µs |
| 1,000,000 | ~35 ms | 0.03 µs |

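Per-date cost falls as the input grows because only ~1000 dates are fully analyzed regardless of input size. In the table above it levels off at about 0.03 µs per date from 100,000 dates onward, so total time grows roughly linearly beyond that point.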
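How the library chooses its sample is not described here. As one hypothetical strategy, an evenly spaced stride caps the sample size while still drawing from the whole input. The function name `sample_evenly` and the default cap of 1000 are assumptions for illustration only.

```python
def sample_evenly(dates: list[str], max_samples: int = 1000) -> list[str]:
    """Return at most max_samples items spread evenly across dates."""
    if max_samples <= 0:
        raise ValueError("max_samples must be positive")
    if len(dates) <= max_samples:
        return list(dates)
    step = len(dates) / max_samples
    # step > 1 here, so int(i * step) is strictly increasing and never repeats.
    return [dates[int(i * step)] for i in range(max_samples)]


# 112,000 synthetic DD/MM/YYYY strings, reduced to a fixed-size sample.
big_input = [f"{day:02d}/03/2025" for day in range(1, 29)] * 4_000
sample = sample_evenly(big_input)
print(len(big_input), len(sample))
```

A stride is deterministic and cheap, but it can miss evidence that is clustered in a small part of the input; a random sample is another reasonable choice.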

Supported Formats

| Format | Example | Output |
|---|---|---|
| European | 15/03/2025 | %d/%m/%Y |
| American | 03/15/2025 | %m/%d/%Y |
| ISO 8601 | 2025-03-15 | %Y-%m-%d |
| ISO datetime | 2025-03-15T10:30:00 | %Y-%m-%dT%H:%M:%S |
| Month name | 15 Mar 2025 | %d %b %Y |
| Month name (full) | 15 March 2025 | %d %B %Y |
| Month first | Mar 15, 2025 | %b %d, %Y |
| 2-digit year | 15/03/25 | %d/%m/%y |
| With time | 15/03/25 10.30.00 | %d/%m/%y %H.%M.%S |
| Month-year only | March, 2025 | %B, %Y |
| Day-month only | 15/Mar | %d/%b |
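The output is an ordinary strptime format string, so it can be checked with the standard library alone. The sketch below parses a few rows of the table with datetime.strptime. It does not call fastdateinfer, and it assumes month names are matched under an English (C) locale.

```python
from datetime import datetime

# (example, format) pairs copied from the table above.
ROWS = [
    ("15/03/2025", "%d/%m/%Y"),
    ("03/15/2025", "%m/%d/%Y"),
    ("2025-03-15T10:30:00", "%Y-%m-%dT%H:%M:%S"),
    ("15 March 2025", "%d %B %Y"),
    ("Mar 15, 2025", "%b %d, %Y"),
    ("15/03/25 10.30.00", "%d/%m/%y %H.%M.%S"),
]

for example, fmt in ROWS:
    parsed = datetime.strptime(example, fmt)
    # Every row describes 15 March 2025, whatever the layout.
    print(f"{example!r:26} {fmt!r:24} -> {parsed.isoformat()}")
```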

API Reference

infer(dates, prefer_dayfirst=True, min_confidence=0.0, strict=False)

Infer date format from a list of date strings.

Arguments:

  • dates: List of date strings
  • prefer_dayfirst: Use DD/MM for fully ambiguous dates (default: True)
  • min_confidence: Minimum confidence threshold (default: 0.0)
  • strict: Raise error if any date doesn't match (default: False)

Returns: InferResult with:

  • format: strptime format string
  • confidence: float between 0.0 and 1.0
  • token_types: list of resolved token types

result = fastdateinfer.infer(["01/02/2025", "03/04/2025"], prefer_dayfirst=False)
print(result.format)  # %m/%d/%Y (American format)

infer_format(dates, prefer_dayfirst=True)

Convenience function that returns only the format string.

fmt = fastdateinfer.infer_format(["2025-01-15", "2025-03-20"])
print(fmt)  # %Y-%m-%d

infer_batch(columns, prefer_dayfirst=True)

Infer formats for multiple columns at once.

results = fastdateinfer.infer_batch({
    "transaction_date": ["15/03/2025", "01/02/2025"],
    "created_at": ["2025-01-15T10:30:00", "2025-01-16T14:45:00"],
    "value_date": ["15-Mar-2025", "01-Feb-2025"]
})

for col, result in results.items():
    print(f"{col}: {result.format}")
# transaction_date: %d/%m/%Y
# created_at: %Y-%m-%dT%H:%M:%S
# value_date: %d-%b-%Y

How It Works

  1. Tokenize: Split "15/03/2025" into [15, /, 03, /, 2025]
  2. Constrain: 15 can only be Day (>12), 03 could be Day or Month, 2025 is Year
  3. Vote: Across all dates, count evidence for each position
  4. Resolve: Position 1 has strong Day evidence → Position 2 must be Month
  5. Format: Output %d/%m/%Y

The key insight: consensus converges quickly. Even with 1 million dates, we only need to analyze ~1000 to determine the format with high confidence.
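The five steps can be sketched in pure Python. This is a deliberately small illustration, not the Rust implementation: it assumes every date has exactly three numeric fields (day, month, four-digit year) in the same layout, and the names `tokenize`, `candidates`, and `infer_format_sketch` are invented for this example.

```python
import re
from collections import Counter

TOKEN_RE = re.compile(r"\d+|\D+")


def tokenize(date_string: str) -> list[str]:
    """Step 1: split into digit runs and separator runs."""
    return TOKEN_RE.findall(date_string)


def candidates(token: str) -> set[str]:
    """Step 2: list the roles (d, m, Y) a numeric token could play."""
    if len(token) == 4:
        return {"Y"}
    value = int(token)
    roles = set()
    if 1 <= value <= 31:
        roles.add("d")
    if 1 <= value <= 12:
        roles.add("m")
    return roles


def infer_format_sketch(dates: list[str], prefer_dayfirst: bool = True) -> str:
    if not dates:
        raise ValueError("need at least one date")
    layout = tokenize(dates[0])
    numeric = [i for i, tok in enumerate(layout) if tok.isdigit()]

    # Step 3: a date votes only where its token has exactly one possible role.
    votes = {i: Counter() for i in numeric}
    for date_string in dates:
        tokens = tokenize(date_string)
        if len(tokens) != len(layout):
            continue  # a different layout is simply ignored in this sketch
        for i in numeric:
            if tokens[i].isdigit():
                roles = candidates(tokens[i])
                if len(roles) == 1:
                    votes[i][roles.pop()] += 1

    # Step 4: positions with evidence take their majority role;
    # positions without evidence share the roles that are left.
    resolved = {i: votes[i].most_common(1)[0][0] for i in numeric if votes[i]}
    order = ["d", "m", "Y"] if prefer_dayfirst else ["m", "d", "Y"]
    remaining = [role for role in order if role not in resolved.values()]
    for i in numeric:
        if i not in resolved:
            resolved[i] = remaining.pop(0)

    # Step 5: rebuild the layout with strptime directives in place of numbers.
    return "".join(
        f"%{resolved[i]}" if i in resolved else tok for i, tok in enumerate(layout)
    )


print(infer_format_sketch(["15/03/2025", "01/02/2025", "28/12/2025"]))
print(infer_format_sketch(["01/02/2025", "03/04/2025"], prefer_dayfirst=False))
```

In the first call, 15 and 28 can only be days, so the first field is resolved by evidence and the second field takes the remaining role. In the second call no field carries evidence, so the `prefer_dayfirst` fallback alone decides the order.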

Use Cases

CSV/Data Processing

import pandas as pd
import fastdateinfer

# Read raw data
df = pd.read_csv("data.csv")

# Detect format automatically
fmt = fastdateinfer.infer_format(df["date"].dropna().tolist())

# Parse with detected format
df["date"] = pd.to_datetime(df["date"], format=fmt)

Multi-format Data Pipeline

# Different columns may have different formats
results = fastdateinfer.infer_batch({
    col: df[col].dropna().astype(str).tolist()
    for col in ["date", "value_date", "created_at"]
})

for col, result in results.items():
    df[col] = pd.to_datetime(df[col], format=result.format)

Validation

# Ensure high confidence
result = fastdateinfer.infer(dates, min_confidence=0.9)
if result.confidence < 0.9:
    raise ValueError(f"Low confidence: {result.confidence}")
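An independent sanity check is to measure how many inputs actually parse under the inferred format. The sketch below uses only datetime.strptime. `match_rate` is a hypothetical helper and is not how the library defines `confidence`.

```python
from datetime import datetime


def match_rate(dates: list[str], fmt: str) -> float:
    """Return the fraction of dates that datetime.strptime accepts under fmt."""
    if not dates:
        return 0.0
    matched = 0
    for date_string in dates:
        try:
            datetime.strptime(date_string, fmt)
        except ValueError:
            continue  # this string does not fit the format
        matched += 1
    return matched / len(dates)


sample = ["15/03/2025", "01/02/2025", "2025-12-28", "31/01/2025"]
rate = match_rate(sample, "%d/%m/%Y")
print(f"{rate:.2f}")  # 0.75: three of the four strings fit the format
```

A rate below 1.0 points at rows worth inspecting before the whole column is parsed.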

Comparison

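| Feature | fastdateinfer | hidateinfer | pandas | dateutil |
|---|---|---|---|---|
| Consensus-based | | | | |
| Speed (10k dates) | 0.74 ms | 200 ms | 2 ms* | N/A |
| Returns strptime format | | | | |
| Batch inference | | | | |
| Type hints | | | | |
| Pure Rust core | | | | |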

*pandas time is for parsing only (you must already know the format)

Building from Source

# Prerequisites
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
pip install maturin

# Clone and build
git clone https://github.com/coledrain/fastdateinfer
cd fastdateinfer
maturin develop --release

# Run tests
cargo test

License

MIT License. See LICENSE for details.

Contributing

Contributions welcome! Please open an issue or PR on GitHub.

Acknowledgments

  • Inspired by hidateinfer
  • Built with PyO3 for Python bindings
  • Built for high-volume data processing pipelines
