
Extract datetime objects from natural language text



A python module for locating dates inside text. Use this package to extract date-like strings from documents and turn them into useful datetime/temporal objects.

As of 1.0.0, find_dates(...) defaults to the v2 compatibility engine. The original engine remains available as find_dates_legacy(...).

Installation

Requires Python 3.9+.

With pip

pip install datefinder

If a compatible prebuilt wheel is unavailable for your platform, pip will build from source and requires a Rust toolchain.

Note: I do not publish the conda-forge build of this package and cannot verify its integrity.

What You Can Do With datefinder

datefinder is a Python date parser for extracting dates from unstructured text. It is useful when your data is not already normalized, for example:

  • emails, tickets, and support conversations

  • contracts, policies, and legal text

  • logs, reports, and markdown/wiki pages

  • scraped HTML and mixed-format documents

You can use it to:

  • parse explicit calendar dates like January 4th, 2017 or 2024-11-03 18:00

  • parse relative expressions like tomorrow, yesterday, and in 3 days

  • parse multiple date formats in one pass (month-name, slash, ISO, hyphen)

  • anchor relative parsing to a reference/base date

  • return either compatibility datetimes or typed structured match objects

In short: if you need to find and parse dates from text in Python, especially inside large documents with mixed formatting, datefinder is designed for that.
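To make "anchoring to a reference date" concrete, here is a minimal standard-library sketch of the idea. It is not datefinder's implementation: the `RELATIVE_OFFSETS` table and the `resolve_relative` helper are hypothetical names, and the sketch assumes that a relative expression keeps the reference's time of day, which datefinder may or may not do.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table for illustration only; datefinder's own grammar
# is much broader and is not reproduced here.
RELATIVE_OFFSETS = {
    "yesterday": timedelta(days=-1),
    "today": timedelta(days=0),
    "tomorrow": timedelta(days=1),
}


def resolve_relative(expression: str, reference: datetime) -> datetime:
    """Resolve a simple relative expression against an explicit reference."""
    key = expression.strip().lower()
    if key in RELATIVE_OFFSETS:
        return reference + RELATIVE_OFFSETS[key]
    # Handle the "in N days" form, e.g. "in 3 days".
    parts = key.split()
    if (
        len(parts) == 3
        and parts[0] == "in"
        and parts[1].isdigit()
        and parts[2] in ("day", "days")
    ):
        return reference + timedelta(days=int(parts[1]))
    raise ValueError(f"unsupported expression: {expression!r}")


ref = datetime(2026, 3, 19, 12, 0, tzinfo=timezone.utc)
print(resolve_relative("tomorrow", ref))
print(resolve_relative("in 3 days", ref))
```

The point of passing `reference` explicitly is reproducibility: the same text resolves to the same datetimes regardless of when the code runs.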

Common workflows:

  • migration from legacy date extraction code: use find_dates_legacy(...) for parity, then move to find_dates(...)

  • modern typed extraction: use extract(...) to get match kinds, spans, confidence, and structured values

  • command line processing: use datefinder --engine extract --json in shell pipelines

Example (Python):

import datefinder
from datetime import datetime, timezone

text = "Meeting tomorrow; launch on 2024-11-03 18:00 UTC."
ref = datetime(2026, 3, 19, 12, 0, tzinfo=timezone.utc)

# Compatibility datetimes
print(list(datefinder.find_dates(text, base_date=ref)))

# Typed extraction
for match in datefinder.extract(text, reference_dt=ref):
    print(match.kind, match.text, match.value)

Example (CLI):

datefinder --reference "2026-03-19T12:00:00+00:00" --json \
  "Meeting tomorrow; launch on 2024-11-03 18:00 UTC."

How to Use

In [1]: string_with_dates = """
   ...: ...
   ...: entries are due by January 4th, 2017 at 8:00pm
   ...: ...
   ...: created 01/15/2005 by ACME Inc. and associates.
   ...: ...
   ...: """

In [2]: import datefinder

In [3]: matches = datefinder.find_dates(string_with_dates)

In [4]: for match in matches:
   ...:     print(match)
   ...:
2017-01-04 20:00:00
2005-01-15 00:00:00
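Matches arrive in document order, so a common follow-up step is to sort them and serialize them. The sketch below assumes the matches are plain `datetime` objects, as the printed output above suggests; `to_iso_strings` is a hypothetical helper, not part of datefinder.

```python
from datetime import datetime
from typing import Iterable, List


def to_iso_strings(matches: Iterable[datetime]) -> List[str]:
    """Sort datetimes chronologically and format them as ISO 8601 strings.

    Sorting raises TypeError if naive and timezone-aware datetimes are
    mixed, so normalize them first when a document contains both.
    """
    return [dt.isoformat() for dt in sorted(matches)]


# Stand-in values equal to the two matches printed above.
sample = [datetime(2017, 1, 4, 20, 0), datetime(2005, 1, 15)]
print(to_iso_strings(sample))
```

With datefinder installed, `to_iso_strings(datefinder.find_dates(string_with_dates))` would be the equivalent call.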

CLI

The package now includes a CLI entrypoint:

datefinder --json "tomorrow and 2024-12-10"

You can also run it as a module:

python -m datefinder --engine extract --json --reference "2026-03-18T00:00:00+00:00" "in 3 days"

Engine options:

  • default: find_dates(...) (v2 compatibility default)

  • legacy: find_dates_legacy(...)

  • compat: find_dates_compat(...)

  • extract: typed extract(...) output

Common options:

  • --reference <ISO8601>: anchor for relative dates/times (equivalent to base_date/reference_dt)

  • --first {month,day,year}: disambiguation for numeric dates

  • --strict: stricter matching

  • --json / --pretty: machine-readable output

  • --source / --index: include source span details (default/legacy only)

  • --locale <code>: locale hint for extract (repeatable)

  • --no-month-only: disable month-only inference ("May" -> YYYY-05-01)

  • --compact-numeric: enable compact numeric parsing (e.g. 20240315)

  • --no-multiline: disable cross-line matching
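The `--first` option matters because a numeric date such as 01/02/2003 has more than one valid reading. The standard-library sketch below shows the ambiguity itself. The mapping from `month`, `day`, and `year` to these field orders is an assumption for illustration, and `parse_numeric` is a hypothetical helper rather than a datefinder function.

```python
from datetime import datetime

# Assumed field orders for each --first value; illustration only.
FIELD_ORDERS = {
    "month": "%m/%d/%Y",
    "day": "%d/%m/%Y",
    "year": "%Y/%m/%d",
}


def parse_numeric(text: str, first: str = "month") -> datetime:
    """Parse a slash-separated numeric date under a given field order."""
    return datetime.strptime(text, FIELD_ORDERS[first])


print(parse_numeric("01/02/2003", first="month").date())  # 2003-01-02
print(parse_numeric("01/02/2003", first="day").date())    # 2003-02-01
print(parse_numeric("2003/01/02", first="year").date())   # 2003-01-02
```

The same string yields January 2 or February 1 depending on the order, which is why it is worth setting `--first` explicitly when the source locale is known.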

Examples:

# default engine (v2 compatibility), anchored relative parsing
datefinder --reference "2026-03-19T12:00:00+00:00" --json "tomorrow and 2024-12-10"

# explicit legacy behavior, include source text and indices
datefinder --engine legacy --source --index --json "created 01/15/2005 by ACME"

# typed extract output with locale hints
datefinder --engine extract --locale en --locale fr --pretty --json "in 3 days and demain"

# read long input from stdin
cat document.txt | datefinder --engine extract --json
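If you would rather drive the CLI from Python than from a shell pipeline, a thin wrapper along these lines may help. It uses only the flags listed above and sends the text on stdin, as in the last example. The helper names are hypothetical, and the sketch assumes that `--json` prints a single JSON document to stdout and that failures produce a non-zero exit code; the shape of the decoded JSON is not specified here.

```python
import json
import subprocess
from typing import Any, List, Optional


def build_argv(engine: Optional[str] = None,
               reference: Optional[str] = None) -> List[str]:
    """Assemble the command line using only the flags documented above."""
    argv = ["datefinder", "--json"]
    if engine is not None:
        argv += ["--engine", engine]
    if reference is not None:
        argv += ["--reference", reference]
    return argv


def run_datefinder(text: str,
                   engine: Optional[str] = None,
                   reference: Optional[str] = None) -> Any:
    """Pipe text to the CLI on stdin and decode whatever JSON it prints."""
    completed = subprocess.run(
        build_argv(engine, reference),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


print(build_argv(engine="extract", reference="2026-03-19T12:00:00+00:00"))
```

Keeping argument assembly in its own function makes the command easy to log and to test without spawning a process.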

Relative and duration values:

  • default / legacy / compat engines emit datetimes.

  • extract emits typed values:

      - relative includes both resolved_datetime and delta_seconds.

      - duration includes total_seconds and normalized components.
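The field names above suggest how the two typed values relate to plain `datetime` arithmetic. The sketch below models them with hypothetical dataclasses. `RelativeValue`, `DurationValue`, and the two factory functions are illustrative stand-ins, not datefinder's classes, and the days/hours/minutes/seconds breakdown is only one plausible reading of "normalized components".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict


@dataclass(frozen=True)
class RelativeValue:  # hypothetical stand-in
    resolved_datetime: datetime
    delta_seconds: float


@dataclass(frozen=True)
class DurationValue:  # hypothetical stand-in
    total_seconds: float
    components: Dict[str, int]


def make_relative(reference: datetime, delta: timedelta) -> RelativeValue:
    """A relative value is the reference plus an offset, with both kept."""
    return RelativeValue(reference + delta, delta.total_seconds())


def make_duration(delta: timedelta) -> DurationValue:
    """Split a non-negative, whole-second duration into normalized parts."""
    total = int(delta.total_seconds())
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    parts = {"days": days, "hours": hours, "minutes": minutes, "seconds": seconds}
    return DurationValue(float(total), parts)


ref = datetime(2026, 3, 18, tzinfo=timezone.utc)
print(make_relative(ref, timedelta(days=3)))
print(make_duration(timedelta(minutes=90)))
```

Keeping `delta_seconds` alongside `resolved_datetime` lets a caller re-anchor a relative match to a different reference without re-parsing the text.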

V2 Typed API

This repository includes a v2 extraction API with typed match objects and first-class support for relative expressions and durations.

import datefinder
from datetime import datetime, timezone

matches = datefinder.extract(
    "in 3 days we deploy on 2024-11-03 18:00",
    reference_dt=datetime.now(timezone.utc),
)
for m in matches:
    print(m.kind, m.text, m.value)
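Because each match carries a kind, downstream code can route matches by type. The helper below relies only on the `kind`, `text`, and `value` attributes used in the example above. The stand-in objects and the kind labels `"relative"` and `"datetime"` are assumptions for illustration, so check the labels your installed version actually emits.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List, Tuple


def group_by_kind(matches: Iterable[Any]) -> Dict[str, List[Tuple[str, Any]]]:
    """Bucket (text, value) pairs by each match's kind attribute."""
    grouped: Dict[str, List[Tuple[str, Any]]] = defaultdict(list)
    for match in matches:
        grouped[match.kind].append((match.text, match.value))
    return dict(grouped)


# Stand-in objects so the sketch runs without datefinder installed.
sample = [
    SimpleNamespace(kind="relative", text="in 3 days", value=None),
    SimpleNamespace(kind="datetime", text="2024-11-03 18:00", value=None),
    SimpleNamespace(kind="relative", text="tomorrow", value=None),
]
for kind, items in group_by_kind(sample).items():
    print(kind, [text for text, _ in items])
```

With datefinder installed, pass the result of `datefinder.extract(...)` in place of `sample`.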

There is also a compatibility helper for migrating existing code:

for dt in datefinder.find_dates_compat("tomorrow and 2024-12-10"):
    print(dt)

If you need the original parser behavior exactly:

for dt in datefinder.find_dates_legacy("April 9, 2013 at 6:11 a.m."):
    print(dt)
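When migrating, it can help to run both engines over a sample of your own text and review only the inputs where they disagree. The sketch below takes the two parsers as arguments so it stays independent of any particular version. `diff_engines` is a hypothetical helper, and the fake parsers exist only to make the example self-contained.

```python
from datetime import datetime
from typing import Callable, Iterable, List, Tuple

Parser = Callable[[str], Iterable[datetime]]


def diff_engines(
    texts: Iterable[str], legacy_fn: Parser, new_fn: Parser
) -> List[Tuple[str, List[datetime], List[datetime]]]:
    """Return (text, legacy_result, new_result) for every disagreement."""
    differences = []
    for text in texts:
        old = list(legacy_fn(text))
        new = list(new_fn(text))
        if old != new:
            differences.append((text, old, new))
    return differences


# Fake parsers for illustration: they agree on "a" and disagree on "b".
def fake_legacy(text: str) -> List[datetime]:
    return [datetime(2013, 4, 9)]


def fake_new(text: str) -> List[datetime]:
    return [datetime(2013, 4, 9)] if text == "a" else []


print(diff_engines(["a", "b"], fake_legacy, fake_new))
```

With datefinder installed, `diff_engines(samples, datefinder.find_dates_legacy, datefinder.find_dates)` would compare the two engines directly.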

Rust kernel source is under rust/datefinder-kernel and is required for v2/default runtime behavior.

Rust Portability

  • Compiled Rust extensions are platform-specific, so a given wheel does not run on every system.

  • Release wheel targets:

      - Linux glibc: x86_64 and aarch64 (manylinux2014)

      - Linux musl: x86_64 and aarch64 (musllinux_1_2)

      - macOS: x86_64 and arm64

      - Windows: x86_64

  • If no compatible wheel is available, pip builds from source and requires a Rust toolchain.
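To get a rough idea of whether pip will find a wheel or fall back to a source build, you can compare the current platform against the targets listed above. This is a coarse sketch: `likely_has_wheel` is a hypothetical helper, the machine-name normalization is an assumption, and it does not check glibc, musl, or macOS versions, so pip remains the final authority.

```python
import platform
import sys
from typing import Tuple

# Targets copied from the list above, keyed by lowercase platform.system().
WHEEL_TARGETS = {
    "linux": {"x86_64", "aarch64"},
    "darwin": {"x86_64", "arm64"},
    "windows": {"x86_64"},
}


def normalize_machine(system: str, machine: str) -> str:
    """Map common platform.machine() spellings onto the names used above."""
    name = machine.lower()
    if name == "amd64":
        return "x86_64"
    if system == "linux" and name == "arm64":
        return "aarch64"
    return name


def likely_has_wheel(system: str, machine: str,
                     python_version: Tuple[int, int] = (3, 9)) -> bool:
    """Coarse check against the published targets and the 3.9+ requirement."""
    system = system.lower()
    if python_version < (3, 9):
        return False
    return normalize_machine(system, machine) in WHEEL_TARGETS.get(system, set())


print(likely_has_wheel(platform.system(), platform.machine(), sys.version_info[:2]))
```

If this prints False, expect pip to need a Rust toolchain as described above.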

Conformance and Ambiguity Reports

Build a reproducible corpus from legacy tests and generate differential reports between legacy behavior and find_dates_compat:

python scripts/build_conformance_corpus.py
python scripts/diff_legacy_v2.py

This writes:

  • conformance/legacy_parity_cases.jsonl

  • conformance/reports/legacy_v2_diff_report.md

  • conformance/reports/ambiguity_showcase.md

  • conformance/reports/behavior_change_changelog.md

The ambiguity showcase also supports interpretation judgments in conformance/interpretation_judgments.jsonl to assess whether legacy behavior is semantically preferable for ambiguous real-world cases.

See also:

  • CONTRIBUTING.md for developer setup and validation commands.

  • RELEASE.md for release checklist.

Benchmark Snapshot

The command below generates a local benchmark snapshot comparing:

  • v2: datefinder.extract(...)

  • legacy: datefinder.find_dates_legacy(...)

  • dateparser: dateparser.search.search_dates

  • duckling_http: Duckling POST /parse

Run:

# optional: run duckling locally
docker run --rm -p 8000:8000 rasa/duckling:latest

python bench/bench_readme_compare.py \
  --iterations-small 12 \
  --iterations-large 2

Latest local snapshot (2026-03-19 UTC):

dataset           size    v2 median (s)  legacy median (s)  dateparser median (s)  duckling_http median (s)  v2 vs legacy  v2 vs dateparser  v2 vs duckling_http
----------------  ------  -------------  -----------------  ---------------------  ------------------------  ------------  ----------------  -------------------
core_corpus       498     0.000236       0.003042           0.180596               0.050266                  12.91x        766.74x           213.41x
seattle_html_76k  74838   0.037436       0.281466           0.771712               25.353595                 7.52x         20.61x            677.24x
test_data_560k    552301  0.239391       2.840845           n/a                    n/a                       11.87x        n/a               n/a

Notes:

  • n/a means unavailable/failed for that dataset in this run.

  • dateparser/duckling_http are skipped by default for documents larger than 200k bytes unless forced.

  • Match counts differ across engines because behavior targets differ (e.g. relative/duration support and false-positive tolerance).

  • Results are hardware/environment dependent and should be treated as directional.
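The ratio columns read as the other engine's median divided by the v2 median. The sketch below shows that arithmetic together with a minimal median-timing loop. It is not the code in bench/bench_readme_compare.py; `median_seconds` and `speedup` are hypothetical helpers.

```python
import statistics
import time
from typing import Callable


def median_seconds(fn: Callable[[], object], iterations: int) -> float:
    """Time fn() repeatedly and return the median wall-clock duration."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(other_median: float, v2_median: float) -> float:
    """How many times faster v2 is than the other engine."""
    return other_median / v2_median


# core_corpus row, using the rounded medians from the table.
print(f"{speedup(0.003042, 0.000236):.2f}x")
print(median_seconds(lambda: sum(range(1000)), iterations=5))
```

Recomputing from the rounded medians gives roughly 12.89x rather than the table's 12.91x; the small gap is likely because the published ratios were computed from unrounded timings.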

Release Notes

  • docs/releases/1.0.0.md documents GA scope, behavior changes, and migration.
