
Extract datetime objects from natural language text

A Python module for locating dates inside text. Use this package to extract date-like strings from documents and turn them into useful datetime/temporal objects.

As of 1.0.0rc3, find_dates(...) defaults to the v2 compatibility engine. The original engine remains available as find_dates_legacy(...).

Installation

Requires Python 3.9+.

With pip

pip install datefinder

If a compatible prebuilt wheel is unavailable for your platform, pip will build from source and requires a Rust toolchain.

Note: I do not publish the conda-forge build of this package and cannot verify its integrity.

What You Can Do With datefinder

datefinder is a Python date parser for extracting dates from unstructured text. It is useful when your data is not already normalized, for example:

  • emails, tickets, and support conversations

  • contracts, policies, and legal text

  • logs, reports, and markdown/wiki pages

  • scraped HTML and mixed-format documents

You can use it to:

  • parse explicit calendar dates like January 4th, 2017 or 2024-11-03 18:00

  • parse relative expressions like tomorrow, yesterday, and in 3 days

  • parse multiple date formats in one pass (month-name, slash, ISO, hyphen)

  • anchor relative parsing to a reference/base date

  • return either compatibility datetimes or typed structured match objects

In short: if you need to find and parse dates from text in Python, especially inside large documents with mixed formatting, datefinder is designed for that.

Common workflows:

  • migration from legacy date extraction code: use find_dates_legacy(...) for parity, then move to find_dates(...)

  • modern typed extraction: use extract(...) to get match kinds, spans, confidence, and structured values

  • command line processing: use datefinder --engine extract --json in shell pipelines
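
A migration usually starts by checking where the two engines disagree on your own text. The sketch below is a minimal, hypothetical harness: compare_engines is not part of datefinder, and it only assumes that each engine is a callable returning an iterable of datetimes, as find_dates_legacy(...) and find_dates(...) are described above. Stand-in engines are used so the sketch runs without the package installed.

```python
from datetime import datetime
from typing import Callable, Iterable, List, Tuple

Engine = Callable[[str], Iterable[datetime]]


def compare_engines(text: str, old: Engine, new: Engine) -> Tuple[List[datetime], List[datetime]]:
    """Return (only_old, only_new): results one engine produced and the other did not."""
    old_hits = list(old(text))
    new_hits = list(new(text))
    only_old = [dt for dt in old_hits if dt not in new_hits]
    only_new = [dt for dt in new_hits if dt not in old_hits]
    return only_old, only_new


# Stand-in engines (hypothetical results, not real datefinder output).
def fake_old(text: str):
    return [datetime(2005, 1, 15)]


def fake_new(text: str):
    return [datetime(2005, 1, 15), datetime(2017, 1, 4, 20, 0)]


only_old, only_new = compare_engines("sample text", fake_old, fake_new)
print(only_old)  # []
print(only_new)  # [datetime.datetime(2017, 1, 4, 20, 0)]
```

With the real package, datefinder.find_dates_legacy and datefinder.find_dates would be passed as old and new.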

Example (Python):

import datefinder
from datetime import datetime, timezone

text = "Meeting tomorrow; launch on 2024-11-03 18:00 UTC."
ref = datetime(2026, 3, 19, 12, 0, tzinfo=timezone.utc)

# Compatibility datetimes
print(list(datefinder.find_dates(text, base_date=ref)))

# Typed extraction
for match in datefinder.extract(text, reference_dt=ref):
    print(match.kind, match.text, match.value)

Example (CLI):

datefinder --reference "2026-03-19T12:00:00+00:00" --json \
  "Meeting tomorrow; launch on 2024-11-03 18:00 UTC."

How to Use

In [1]: string_with_dates = """
   ...: ...
   ...: entries are due by January 4th, 2017 at 8:00pm
   ...: ...
   ...: created 01/15/2005 by ACME Inc. and associates.
   ...: ...
   ...: """

In [2]: import datefinder

In [3]: matches = datefinder.find_dates(string_with_dates)

In [4]: for match in matches:
   ...:     print(match)
   ...:
2017-01-04 20:00:00
2005-01-15 00:00:00

CLI

The package now includes a CLI entrypoint:

datefinder --json "tomorrow and 2024-12-10"

You can also run it as a module:

python -m datefinder --engine extract --json --reference "2026-03-18T00:00:00+00:00" "in 3 days"
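
The --reference value is an ISO 8601 string, while the Python API takes a datetime object (base_date or reference_dt). As a small illustration using only the standard library (no datefinder call is involved), the string from the command above can be converted into the equivalent timezone-aware datetime:

```python
from datetime import datetime, timezone

# Same value as the --reference argument in the module invocation above.
reference_text = "2026-03-18T00:00:00+00:00"

# datetime.fromisoformat keeps the UTC offset, so the result is timezone-aware.
ref = datetime.fromisoformat(reference_text)

print(ref == datetime(2026, 3, 18, tzinfo=timezone.utc))  # True
print(ref.isoformat())  # 2026-03-18T00:00:00+00:00
```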

Engine options:

  • default: find_dates(...) (v2 compatibility default)

  • legacy: find_dates_legacy(...)

  • compat: find_dates_compat(...)

  • extract: typed extract(...) output

Common options:

  • --reference <ISO8601>: anchor for relative dates/times (equivalent to base_date/reference_dt)

  • --first {month,day,year}: disambiguation for numeric dates

  • --strict: stricter matching

  • --json / --pretty: machine-readable output

  • --source / --index: include source span details (default/legacy only)

  • --locale <code>: locale hint for extract (repeatable)

  • --no-month-only: disable month-only inference ("May" -> YYYY-05-01)

  • --compact-numeric: enable compact numeric parsing (e.g. 20240315)

  • --no-multiline: disable cross-line matching
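
To see why --first matters, consider a numeric date whose fields could each be a month, a day, or a year. The sketch below uses only datetime.strptime from the standard library to show three readings of the same string. The mapping from each --first choice to a format string is an assumption made for illustration, not datefinder's implementation.

```python
from datetime import datetime

ambiguous = "01/02/03"

# Hypothetical mapping from a --first choice to a strptime format.
READINGS = {
    "month": "%m/%d/%y",  # month first: January 2, 2003
    "day": "%d/%m/%y",    # day first: 1 February 2003
    "year": "%y/%m/%d",   # year first: 2001, February 3
}

for first, fmt in READINGS.items():
    print(first, datetime.strptime(ambiguous, fmt).date())
# month 2003-01-02
# day 2003-02-01
# year 2001-02-03
```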

Examples:

# default engine (v2 compatibility), anchored relative parsing
datefinder --reference "2026-03-19T12:00:00+00:00" --json "tomorrow and 2024-12-10"

# explicit legacy behavior, include source text and indices
datefinder --engine legacy --source --index --json "created 01/15/2005 by ACME"

# typed extract output with locale hints
datefinder --engine extract --locale en --locale fr --pretty --json "in 3 days and demain"

# read long input from stdin
cat document.txt | datefinder --engine extract --json
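
The --json output can also be consumed from Python. The helper below is a sketch, not part of datefinder: run_json is a hypothetical name, and because the JSON schema is not described here, it only decodes whatever the command prints. The runnable part uses a stand-in command so it works without datefinder installed.

```python
import json
import subprocess
import sys
from typing import Any, List, Optional


def run_json(cmd: List[str], stdin_text: Optional[str] = None) -> Any:
    """Run a command, optionally feeding text on stdin, and decode its stdout as JSON."""
    result = subprocess.run(
        cmd,
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


# Stand-in command that prints a JSON list.
demo_cmd = [sys.executable, "-c", "import json; print(json.dumps(['ok']))"]
print(run_json(demo_cmd))  # ['ok']
```

With datefinder installed, the stdin example above would correspond to the command ["datefinder", "--engine", "extract", "--json"], with the document text passed as stdin_text.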

Relative and duration values:

  • default / legacy / compat engines emit datetimes.

  • extract emits typed values:

      - relative includes both resolved_datetime and delta_seconds.

      - duration includes total_seconds and normalized components.
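
The two relative fields describe the same match in two ways: one as an absolute result, the other as an offset. The sketch below illustrates that relationship with the standard library only. The dictionary is a hand-written stand-in, not real extract output, and the idea that resolved_datetime equals the reference plus delta_seconds is an assumption rather than something stated above.

```python
from datetime import datetime, timedelta, timezone

reference = datetime(2026, 3, 18, tzinfo=timezone.utc)

# Hand-written stand-in for the value of a relative match such as "in 3 days".
delta_seconds = 3 * 24 * 60 * 60
relative_value = {
    "delta_seconds": delta_seconds,
    "resolved_datetime": reference + timedelta(seconds=delta_seconds),
}

print(relative_value["delta_seconds"])      # 259200
print(relative_value["resolved_datetime"])  # 2026-03-21 00:00:00+00:00

# A duration has no anchor, only a length; timedelta normalizes it into components.
duration = timedelta(hours=36)
print(duration.total_seconds())         # 129600.0
print(duration.days, duration.seconds)  # 1 43200
```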

V2 Typed API

This repository includes a v2 extraction API with typed match objects and first-class support for relative expressions and durations.

import datefinder
from datetime import datetime, timezone

matches = datefinder.extract(
    "in 3 days we deploy on 2024-11-03 18:00",
    reference_dt=datetime.now(timezone.utc),
)
for m in matches:
    print(m.kind, m.text, m.value)

There is also a compatibility helper for migrating existing code:

for dt in datefinder.find_dates_compat("tomorrow and 2024-12-10"):
    print(dt)

If you need the original parser behavior exactly:

for dt in datefinder.find_dates_legacy("April 9, 2013 at 6:11 a.m."):
    print(dt)

The Rust kernel source is under rust/datefinder-kernel. The kernel is required for v2/default runtime behavior.

Rust Portability

  • Compiled Rust extensions are platform-specific: a wheel built for one operating system and CPU architecture does not run on another.

  • Release wheel targets:

      - Linux glibc: x86_64 and aarch64 (manylinux2014)

      - Linux musl: x86_64 and aarch64 (musllinux_1_2)

      - macOS: x86_64 and arm64

      - Windows: x86_64

  • If no compatible wheel is available, pip builds from source and requires a Rust toolchain.
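
Whether pip is likely to find a prebuilt wheel can be estimated up front. The check below is a rough sketch based only on the targets listed above. has_release_wheel is a hypothetical helper, the machine-name aliases are the spellings platform.machine() commonly reports, and pip's real tag matching also considers the interpreter and OS version.

```python
import platform

# Release wheel targets from the list above, keyed by platform.system().
WHEEL_TARGETS = {
    "Linux": {"x86_64", "aarch64"},  # glibc (manylinux2014) and musl (musllinux_1_2)
    "Darwin": {"x86_64", "arm64"},   # macOS
    "Windows": {"x86_64"},
}

# Windows reports x86_64 as "AMD64"; other names are kept as-is (lowercased).
ALIASES = {"amd64": "x86_64"}


def has_release_wheel(system: str, machine: str) -> bool:
    """Return True if (system, machine) matches one of the listed wheel targets."""
    arch = machine.lower()
    arch = ALIASES.get(arch, arch)
    return arch in WHEEL_TARGETS.get(system, set())


print(has_release_wheel("Windows", "AMD64"))  # True
print(has_release_wheel("Linux", "riscv64"))  # False
print(has_release_wheel(platform.system(), platform.machine()))
```

A False result suggests that pip install datefinder would fall back to a source build and need a Rust toolchain.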

Conformance and Ambiguity Reports

Build a reproducible corpus from legacy tests and generate differential reports between legacy behavior and find_dates_compat:

python scripts/build_conformance_corpus.py
python scripts/diff_legacy_v2.py

This writes:

  • conformance/legacy_parity_cases.jsonl

  • conformance/reports/legacy_v2_diff_report.md

  • conformance/reports/ambiguity_showcase.md

  • conformance/reports/behavior_change_changelog.md

The ambiguity showcase also supports interpretation judgments in conformance/interpretation_judgments.jsonl to assess whether legacy behavior is semantically preferable for ambiguous real-world cases.
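
The corpus and judgment files are JSON Lines, so they can be inspected with a few lines of standard-library Python. This is a generic sketch: read_jsonl is a hypothetical helper, and no field names are assumed because the record schema is not described here. The runnable part writes a tiny temporary file with made-up records so it works outside the repository.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Iterator


def read_jsonl(path: Path) -> Iterator[Any]:
    """Yield one decoded JSON value per non-empty line."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


# Stand-in file with made-up records.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.jsonl"
    sample.write_text('{"id": 1}\n\n{"id": 2}\n', encoding="utf-8")
    records = list(read_jsonl(sample))

print(len(records))  # 2
```

In a checkout, the same helper could be pointed at Path("conformance/legacy_parity_cases.jsonl") after running the build script.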

See also:

  • CONTRIBUTING.md for developer setup and validation commands.

  • RELEASE.md for the release checklist.

Benchmark Snapshot

The command below generates a local benchmark snapshot comparing:

  • v2: datefinder.extract(...)

  • legacy: datefinder.find_dates_legacy(...)

  • dateparser: dateparser.search.search_dates

  • duckling_http: Duckling POST /parse

Run:

# optional: run duckling locally
docker run --rm -p 8000:8000 rasa/duckling:latest

python bench/bench_readme_compare.py \
  --iterations-small 12 \
  --iterations-large 2

Latest local snapshot (2026-03-19 UTC):

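| dataset          | size   | v2 median (s) | legacy median (s) | dateparser median (s) | duckling_http median (s) | v2 vs legacy | v2 vs dateparser | v2 vs duckling_http |
|------------------|--------|---------------|-------------------|-----------------------|--------------------------|--------------|------------------|---------------------|
| core_corpus      | 498    | 0.000236      | 0.003042          | 0.180596              | 0.050266                 | 12.91x       | 766.74x          | 213.41x             |
| seattle_html_76k | 74838  | 0.037436      | 0.281466          | 0.771712              | 25.353595                | 7.52x        | 20.61x           | 677.24x             |
| test_data_560k   | 552301 | 0.239391      | 2.840845          | n/a                   | n/a                      | 11.87x       | n/a              | n/a                 |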

Notes:

  • n/a means unavailable/failed for that dataset in this run.

  • dateparser/duckling_http are skipped by default for documents larger than 200k bytes unless forced.

  • Match counts differ across engines because behavior targets differ (e.g. relative/duration support and false-positive tolerance).

  • Results are hardware/environment dependent and should be treated as directional.
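
The ratio columns appear to be the other engine's median divided by the v2 median. The sketch below shows that arithmetic and a generic way to take a median timing with the standard library. It is not the code in bench/bench_readme_compare.py; median_seconds and speedup are hypothetical helpers. Ratios recomputed from the rounded medians in the table can differ from the printed ones in the last digits.

```python
import statistics
import time
from typing import Callable


def median_seconds(fn: Callable[[], object], iterations: int) -> float:
    """Call fn `iterations` times and return the median wall-clock duration in seconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(other_median: float, v2_median: float) -> float:
    """How many times faster v2 is than the other engine."""
    return other_median / v2_median


# seattle_html_76k row: legacy and dateparser medians against the v2 median.
print(f"{speedup(0.281466, 0.037436):.2f}x")  # 7.52x
print(f"{speedup(0.771712, 0.037436):.2f}x")  # 20.61x

# Timing a trivial stand-in workload.
print(median_seconds(lambda: sum(range(1000)), iterations=5) >= 0.0)  # True
```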

Release Notes

  • docs/releases/1.0.0rc3.md documents RC scope, behavior changes, and migration.
