datemonkey
Batch date parsing with ambiguity detection, confidence scores, and format lock-in.
The problem: dateutil.parser.parse("01/02/03") silently guesses and is often wrong. DD/MM vs MM/DD ambiguity corrupts joins, aggregations, and reports. datemonkey detects ambiguity and tells you about it instead of guessing.
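To make the ambiguity concrete, here is a minimal stdlib-only sketch (it does not use datemonkey) that reads the same string under three plausible layouts. The layout labels and format strings are illustrative assumptions, not datemonkey's internal format list.

```python
from datetime import datetime

# One input string, three plausible layouts.
# These format strings are illustrative; they are not datemonkey's internal list.
value = "01/02/03"
candidates = {
    "MM/DD/YY": "%m/%d/%y",
    "DD/MM/YY": "%d/%m/%y",
    "YY/MM/DD": "%y/%m/%d",
}

readings = {}
for label, fmt in candidates.items():
    try:
        readings[label] = datetime.strptime(value, fmt).date().isoformat()
    except ValueError:
        # This layout cannot produce a valid date for this string.
        continue

for label, iso in readings.items():
    print(f"{label}: {iso}")
# MM/DD/YY: 2003-01-02
# DD/MM/YY: 2003-02-01
# YY/MM/DD: 2001-02-03
```

All three layouts accept the string and each yields a different calendar date, so any parser that returns a single answer without a warning has picked one arbitrarily.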
Install
pip install datemonkey
Quick Start
Detect format from a column of values
from datemonkey import detect_format
result = detect_format(["15/03/2024", "20/04/2024", "25/12/2024"])
print(result.format.label) # "European date (DD/MM/YYYY)"
print(result.confidence) # Confidence.HIGH
print(result.is_ambiguous) # False — day > 12 resolves it
Ambiguity detection
result = detect_format(["01/02/2024", "03/04/2024", "05/06/2024"])
print(result.is_ambiguous) # True
print(result.ambiguities) # [AmbiguityType.DAY_MONTH_SWAP]
print(result.warnings)
# ["Ambiguous: cannot distinguish US date (MM/DD/YYYY) from European date (DD/MM/YYYY) ..."]
Resolve ambiguity with locale preference
result = detect_format(["01/02/2024", "03/04/2024"], locale_preference="eu")
print(result.format.label) # "European date (DD/MM/YYYY)"
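The "day > 12 resolves it" rule and the locale fallback shown above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not datemonkey's implementation: `classify_day_month` is a hypothetical helper, it only handles slash-separated values with a trailing year, and its return strings are invented for the example.

```python
def classify_day_month(values, locale_preference=None):
    """Classify slash-separated dates as "DD/MM", "MM/DD", "ambiguous", or "inconsistent".

    Hypothetical helper for illustration; not part of datemonkey.
    """
    first_over_12 = False
    second_over_12 = False
    for value in values:
        first, second, _year = value.split("/")
        if int(first) > 12:
            first_over_12 = True   # the first field cannot be a month
        if int(second) > 12:
            second_over_12 = True  # the second field cannot be a month

    if first_over_12 and second_over_12:
        return "inconsistent"      # the column mixes both layouts
    if first_over_12:
        return "DD/MM"
    if second_over_12:
        return "MM/DD"

    # Nothing in the data settles it; fall back to the caller's preference, if any.
    if locale_preference == "eu":
        return "DD/MM"
    if locale_preference == "us":
        return "MM/DD"
    return "ambiguous"


print(classify_day_month(["15/03/2024", "20/04/2024", "25/12/2024"]))  # DD/MM
print(classify_day_month(["01/02/2024", "03/04/2024", "05/06/2024"]))  # ambiguous
print(classify_day_month(["01/02/2024", "03/04/2024"], "eu"))          # DD/MM
```

Note that the preference is consulted only after the data has failed to decide, which matches the "only used when data alone can't resolve" behaviour described in the API reference.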
Parse a batch of dates
from datemonkey import parse_dates
batch = parse_dates(["2024-03-15", "2024-04-20", "2024-12-25"])
print(batch.ok) # True
print(batch.dates) # [datetime(2024,3,15), datetime(2024,4,20), datetime(2024,12,25)]
print(batch.iso_strings) # ["2024-03-15T00:00:00", ...]
Format lock-in
from datemonkey import parse_dates, ISO_8601
batch = parse_dates(["2024-03-15", "03/15/2024"], format=ISO_8601)
print(batch.results[0].ok) # True — matches ISO
print(batch.results[1].ok) # False — doesn't match, flagged not re-guessed
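The idea behind format lock-in can be sketched with the standard library alone: every value is tried against one fixed format, and a mismatch is recorded as a failure instead of triggering a second guess. `parse_locked` and its tuple layout are hypothetical, chosen for this illustration; datemonkey's own result objects are richer.

```python
from datetime import datetime


def parse_locked(values, fmt):
    """Parse every value with a single strptime format; never fall back to another.

    Hypothetical helper. Returns a list of (original, parsed_or_None, ok) tuples.
    """
    results = []
    for value in values:
        try:
            results.append((value, datetime.strptime(value, fmt), True))
        except ValueError:
            # Flag the violation instead of re-guessing a different format.
            results.append((value, None, False))
    return results


rows = parse_locked(["2024-03-15", "03/15/2024"], "%Y-%m-%d")
for original, parsed, ok in rows:
    print(original, ok, parsed)
```

Keeping the failed value in the output, rather than dropping it, lets a caller report exactly which rows broke the column's format.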
Strict mode
batch = parse_dates(["01/02/2024", "03/04/2024"], strict=True)
print(batch.parsed_count) # 0 — refuses to parse ambiguous data
print(batch.warnings) # ["Strict mode: refusing to parse due to DD/MM vs MM/DD ambiguity..."]
Excel serial dates
from datemonkey import parse_dates, excel_serial_to_datetime
# Single value
dt = excel_serial_to_datetime(45292) # datetime(2024, 1, 1)
# Batch — auto-detected
batch = parse_dates(["45292", "45293", "45294"])
print(batch.detected_format.label) # "Excel serial date number"
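For readers unfamiliar with Excel serials, the conversion can be approximated in a few lines. This sketch assumes Excel's 1900 date system and uses the common 1899-12-30 epoch, which is only correct for serials of 61 and above (earlier serials are affected by Excel's fictitious 29 February 1900). `serial_to_datetime` is a hypothetical name; how datemonkey handles these edge cases is not shown in this README.

```python
from datetime import datetime, timedelta

# Assumption: 1900 date system. With this epoch the result is correct for
# serials >= 61; smaller serials are off by one day because Excel counts a
# nonexistent 1900-02-29.
EXCEL_EPOCH = datetime(1899, 12, 30)


def serial_to_datetime(serial):
    """Convert an Excel serial (int or float) to a datetime. Hypothetical helper."""
    # The fractional part of the serial encodes the time of day.
    return EXCEL_EPOCH + timedelta(days=float(serial))


print(serial_to_datetime(45292))    # 2024-01-01 00:00:00
print(serial_to_datetime(45292.5))  # 2024-01-01 12:00:00
```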
Per-value results
batch = parse_dates(["2024-03-15", "garbage", "2024-12-25"], format="%Y-%m-%d")
for r in batch.results:
    print(f"{r.original:20s} ok={r.ok} parsed={r.iso} warnings={r.warnings}")
# 2024-03-15           ok=True parsed=2024-03-15T00:00:00 warnings=[]
# garbage              ok=False parsed=None warnings=[...]
# 2024-12-25           ok=True parsed=2024-12-25T00:00:00 warnings=[]
CLI
# Detect format
datemonkey detect "15/03/2024" "20/04/2024" "25/12/2024"
# Detect with JSON output
datemonkey detect --json "01/02/2024" "03/04/2024"
# Parse dates
datemonkey parse "2024-03-15" "2024-04-20"
# Parse from CSV file (column 2, skip header)
datemonkey parse --file data.csv --column 2 --skip-header
# Parse with explicit format
datemonkey parse --format "%d-%m-%Y" "15-03-2024"
# Parse in strict mode
datemonkey parse --strict "01/02/2024" "03/04/2024"
# List known formats
datemonkey formats
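The same column extraction that `--file`, `--column`, and `--skip-header` perform on the command line could be done in Python before handing values to the library. The sketch below uses only the `csv` module; `read_column` is a hypothetical helper, and it assumes a zero-based column index, which may or may not match how the CLI numbers columns.

```python
import csv
import io


def read_column(text, column, skip_header=False):
    """Return one column of a CSV document as a list of strings.

    Hypothetical helper; `column` is zero-based here (an assumption).
    """
    reader = csv.reader(io.StringIO(text))
    if skip_header:
        next(reader, None)  # discard the header row if there is one
    # Skip rows that are too short to contain the requested column.
    return [row[column] for row in reader if len(row) > column]


sample = "id,name,signup\n1,Ann,15/03/2024\n2,Bob,20/04/2024\n"
values = read_column(sample, 2, skip_header=True)
print(values)  # ['15/03/2024', '20/04/2024']
```

The resulting list is the kind of input the Quick Start examples pass to `detect_format` and `parse_dates`.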
API Reference
detect_format(values, *, locale_preference=None, formats=None) -> FormatDetectionResult
Analyze a batch and determine the most likely format, reporting ambiguity.
- values: List of date-like values (strings, ints, floats, None)
- locale_preference: "us" for MM/DD, "eu" for DD/MM (only used when data alone can't resolve)
- formats: Custom list of DateFormat objects to test
parse_dates(values, *, format=None, locale_preference=None, strict=False) -> BatchResult
Parse a batch with format lock-in.
- format: A DateFormat object or strftime string. If None, auto-detected.
- strict: If True, refuse to parse when DD/MM vs MM/DD is ambiguous.
excel_serial_to_datetime(serial) -> datetime | None
Convert an Excel serial date number to a Python datetime.
Result Objects
| Object | Key Properties |
|---|---|
| FormatDetectionResult | .format, .confidence, .is_ambiguous, .ambiguities, .candidates, .warnings |
| BatchResult | .ok, .results, .detected_format, .dates, .iso_strings, .failed, .succeeded, .success_ratio |
| DateResult | .ok, .original, .parsed, .date, .iso, .confidence, .warnings, .row_index |
Confidence Levels
| Level | Meaning |
|---|---|
| HIGH | Unambiguous parse, format is certain |
| MEDIUM | Likely correct, minor ambiguity (e.g. two-digit year) |
| LOW | Ambiguous: DD/MM vs MM/DD unresolved, or poor match ratio |
| FAILED | Could not parse or detect |
Design
- Batch-first: Designed for columns of data, not single strings
- No silent guessing: Ambiguity is reported, not hidden
- Format lock-in: Once detected, the format is enforced — violations are flagged
- Structured results: Every parse returns confidence scores and warnings
- Zero dependencies: Pure Python, stdlib only
Built for LLMs
datemonkey is designed to work well as a tool for large language models. Date parsing is a common source of silent errors in LLM-driven data pipelines — ambiguous formats lead to wrong guesses, wasted tokens on retries, and broken downstream logic. datemonkey reduces that complexity: a single call returns a structured result with the detected format, confidence level, and any ambiguities — no multi-step prompting or validation loops required. Fewer tokens in, reliable answers out.
License
MIT