Extract datetime objects from natural language text

A Python module for locating dates inside text. Use this package to extract date-like strings from documents and turn them into useful datetime/temporal objects.

As of 1.0.0rc2, find_dates(...) defaults to the v2 compatibility engine. The original engine remains available as find_dates_legacy(...).

Installation

Requires Python 3.9+.

With pip

pip install datefinder

If a compatible prebuilt wheel is unavailable for your platform, pip will build from source and requires a Rust toolchain.

Note: I do not publish the version on conda-forge and cannot verify its integrity.

What You Can Do With datefinder

datefinder is a Python date parser for extracting dates from unstructured text. It is useful when your data is not already normalized, for example:

  • emails, tickets, and support conversations

  • contracts, policies, and legal text

  • logs, reports, and markdown/wiki pages

  • scraped HTML and mixed-format documents

You can use it to:

  • parse explicit calendar dates like January 4th, 2017 or 2024-11-03 18:00

  • parse relative expressions like tomorrow, yesterday, and in 3 days

  • parse multiple date formats in one pass (month-name, slash, ISO, hyphen)

  • anchor relative parsing to a reference/base date

  • return either compatibility datetimes or typed structured match objects

In short: if you need to find and parse dates from text in Python, especially inside large documents with mixed formatting, datefinder is designed for that.

Common workflows:

  • migration from legacy date extraction code: use find_dates_legacy(...) for parity, then move to find_dates(...)

  • modern typed extraction: use extract(...) to get match kinds, spans, confidence, and structured values

  • command line processing: use datefinder --engine extract --json in shell pipelines

Example (Python):

import datefinder
from datetime import datetime, timezone

text = "Meeting tomorrow; launch on 2024-11-03 18:00 UTC."
ref = datetime(2026, 3, 19, 12, 0, tzinfo=timezone.utc)

# Compatibility datetimes
print(list(datefinder.find_dates(text, base_date=ref)))

# Typed extraction
for match in datefinder.extract(text, reference_dt=ref):
    print(match.kind, match.text, match.value)

Example (CLI):

datefinder --reference "2026-03-19T12:00:00+00:00" --json \
  "Meeting tomorrow; launch on 2024-11-03 18:00 UTC."

How to Use

In [1]: string_with_dates = """
   ...: ...
   ...: entries are due by January 4th, 2017 at 8:00pm
   ...: ...
   ...: created 01/15/2005 by ACME Inc. and associates.
   ...: ...
   ...: """

In [2]: import datefinder

In [3]: matches = datefinder.find_dates(string_with_dates)

In [4]: for match in matches:
   ...:     print(match)
   ...:
2017-01-04 20:00:00
2005-01-15 00:00:00

CLI

The package now includes a CLI entrypoint:

datefinder --json "tomorrow and 2024-12-10"

You can also run it as a module:

python -m datefinder --engine extract --json --reference "2026-03-18T00:00:00+00:00" "in 3 days"

Engine options:

  • default: find_dates(...) (v2 compatibility default)

  • legacy: find_dates_legacy(...)

  • compat: find_dates_compat(...)

  • extract: typed extract(...) output

Common options:

  • --reference <ISO8601>: anchor for relative dates/times (equivalent to base_date/reference_dt)

  • --first {month,day,year}: disambiguation for numeric dates

  • --strict: stricter matching

  • --json / --pretty: machine-readable output

  • --source / --index: include source span details (default/legacy only)

  • --locale <code>: locale hint for extract (repeatable)

  • --no-month-only: disable month-only inference ("May" -> YYYY-05-01)

  • --compact-numeric: enable compact numeric parsing (e.g. 20240315)

  • --no-multiline: disable cross-line matching
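The --first option exists because an all-numeric date has no single correct reading. The sketch below uses only the standard library to show the three readings of one string. It illustrates the ambiguity itself and is not datefinder's implementation: the helper name interpret_numeric_date and the FORMATS mapping are hypothetical, and the order assumed for the two remaining fields is an assumption.

```python
from datetime import date, datetime

# Hypothetical mapping from a --first style choice to a strptime format.
# Assumption: once the first field is fixed, the other two keep a
# conventional order (m/d/y, d/m/y, y/m/d).
FORMATS = {
    "month": "%m/%d/%y",
    "day": "%d/%m/%y",
    "year": "%y/%m/%d",
}


def interpret_numeric_date(text: str, first: str) -> date:
    """Parse a slash-separated numeric date under one reading."""
    return datetime.strptime(text, FORMATS[first]).date()


for first in ("month", "day", "year"):
    print(first, interpret_numeric_date("03/04/05", first))
```

The same input yields three different calendar dates, which is why pinning the reading explicitly is safer than relying on a default when the source locale is known.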

Examples:

# default engine (v2 compatibility), anchored relative parsing
datefinder --reference "2026-03-19T12:00:00+00:00" --json "tomorrow and 2024-12-10"

# explicit legacy behavior, include source text and indices
datefinder --engine legacy --source --index --json "created 01/15/2005 by ACME"

# typed extract output with locale hints
datefinder --engine extract --locale en --locale fr --pretty --json "in 3 days and demain"

# read long input from stdin
cat document.txt | datefinder --engine extract --json
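When the CLI is driven from Python rather than a shell, the same stdin pattern can be expressed with subprocess. This is a minimal sketch, assuming datefinder is installed for the current interpreter. The helper names build_argv and run_extract are hypothetical, and because the JSON output shape is not described here, the parsed result is returned without interpretation.

```python
import json
import subprocess
import sys
from typing import Optional


def build_argv(engine: str = "extract", reference: Optional[str] = None) -> list:
    """Assemble a `python -m datefinder` command line from the flags listed above."""
    argv = [sys.executable, "-m", "datefinder", "--engine", engine, "--json"]
    if reference is not None:
        argv += ["--reference", reference]
    return argv


def run_extract(text: str, reference: Optional[str] = None):
    """Send `text` on stdin and return whatever JSON the CLI prints."""
    proc = subprocess.run(
        build_argv(reference=reference),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return json.loads(proc.stdout)
```

Using sys.executable keeps the call inside the active virtual environment instead of depending on whichever datefinder script is first on PATH.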

Relative and duration values:

  • default / legacy / compat engines emit datetimes.

  • extract emits typed values:

      - relative includes both resolved_datetime and delta_seconds.

      - duration includes total_seconds and normalized components.
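One plausible reading of the relative fields is that resolved_datetime equals the reference plus delta_seconds. The sketch below works that arithmetic through with the standard library for "in 3 days". The relationship is an assumption about how the two fields relate, not a statement about datefinder's internals, and resolve_relative is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def resolve_relative(reference: datetime, delta_seconds: float) -> datetime:
    """Apply a signed offset in seconds to a reference datetime."""
    return reference + timedelta(seconds=delta_seconds)


reference = datetime(2026, 3, 18, tzinfo=timezone.utc)
delta_seconds = timedelta(days=3).total_seconds()  # 259200.0

print(resolve_relative(reference, delta_seconds).isoformat())
# 2026-03-21T00:00:00+00:00
```

Keeping the offset alongside the resolved value lets a caller re-anchor the same expression to a different reference later without parsing the text again.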

V2 Typed API

This repository includes a v2 extraction API with typed match objects and first-class support for relative expressions and durations.

import datefinder
from datetime import datetime, timezone

matches = datefinder.extract(
    "in 3 days we deploy on 2024-11-03 18:00",
    reference_dt=datetime.now(timezone.utc),
)
for m in matches:
    print(m.kind, m.text, m.value)

There is also a compatibility helper for migrating existing code:

for dt in datefinder.find_dates_compat("tomorrow and 2024-12-10"):
    print(dt)

If you need the original parser behavior exactly:

for dt in datefinder.find_dates_legacy("April 9, 2013 at 6:11 a.m."):
    print(dt)

Rust kernel source is under rust/datefinder-kernel and is required for v2/default runtime behavior.

Rust Portability

  • Compiled Rust extensions are platform-specific; they do not run on every system by default.

  • Release wheel targets:

      - Linux glibc: x86_64 and aarch64 (manylinux2014)

      - Linux musl: x86_64 and aarch64 (musllinux_1_2)

      - macOS: x86_64 and arm64

      - Windows: x86_64

  • If no compatible wheel is available, pip builds from source and requires a Rust toolchain.
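A quick way to guess whether a machine falls inside the wheel targets above is to compare what the platform module reports against that list. This is a rough sketch only: pip's real decision uses full compatibility tags (interpreter, ABI, libc version), which this check ignores. The helper name likely_has_wheel and the SUPPORTED set are hypothetical, and the machine strings are assumed to match what each operating system typically reports.

```python
import platform

# (system, machine) pairs derived from the release wheel targets.
# Assumption: platform.machine() reports "AMD64" on 64-bit Windows,
# "arm64" on Apple Silicon macOS, and "aarch64" on 64-bit ARM Linux.
SUPPORTED = {
    ("Linux", "x86_64"),
    ("Linux", "aarch64"),
    ("Darwin", "x86_64"),
    ("Darwin", "arm64"),
    ("Windows", "AMD64"),
}


def likely_has_wheel(system: str, machine: str) -> bool:
    """Return True when the OS/architecture pair is among the listed targets."""
    return (system, machine) in SUPPORTED


print(platform.system(), platform.machine(),
      likely_has_wheel(platform.system(), platform.machine()))
```

A False result suggests pip would fall back to a source build, so a Rust toolchain should be available before installing.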

Conformance and Ambiguity Reports

Build a reproducible corpus from legacy tests and generate differential reports between legacy behavior and find_dates_compat:

python scripts/build_conformance_corpus.py
python scripts/diff_legacy_v2.py

This writes:

  • conformance/legacy_parity_cases.jsonl

  • conformance/reports/legacy_v2_diff_report.md

  • conformance/reports/ambiguity_showcase.md

  • conformance/reports/behavior_change_changelog.md

The ambiguity showcase also supports interpretation judgments in conformance/interpretation_judgments.jsonl to assess whether legacy behavior is semantically preferable for ambiguous real-world cases.
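The idea of a differential report can be sketched as a small harness that runs two extractors over the same inputs and keeps only the cases where they disagree. This is not what scripts/diff_legacy_v2.py does internally; diff_engines is a hypothetical helper, and the two engines are passed in as plain callables so the sketch runs without datefinder installed.

```python
from typing import Callable, Iterable, List, Tuple


def diff_engines(
    cases: Iterable[str],
    legacy: Callable[[str], Iterable],
    candidate: Callable[[str], Iterable],
) -> List[Tuple[str, list, list]]:
    """Return (text, legacy_result, candidate_result) for each differing case."""
    mismatches = []
    for text in cases:
        # Materialize both results so generators can be compared as lists.
        old, new = list(legacy(text)), list(candidate(text))
        if old != new:
            mismatches.append((text, old, new))
    return mismatches


# Stand-in engines for demonstration only.
def fake_legacy(text: str):
    return [text.upper()]


def fake_candidate(text: str):
    return [] if text == "b" else [text.upper()]


print(diff_engines(["a", "b"], fake_legacy, fake_candidate))
```

With datefinder installed, passing datefinder.find_dates_legacy and datefinder.find_dates_compat as the two callables would compare the engines named above.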

See also:

  • CONTRIBUTING.md for developer setup and validation commands.

  • RELEASE.md for release checklist.

Benchmark Snapshot

The command below generates a local benchmark snapshot comparing:

  • v2: datefinder.extract(...)

  • legacy: datefinder.find_dates_legacy(...)

  • dateparser: dateparser.search.search_dates

  • duckling_http: Duckling POST /parse

Run:

# optional: run duckling locally
docker run --rm -p 8000:8000 rasa/duckling:latest

python bench/bench_readme_compare.py \
  --iterations-small 12 \
  --iterations-large 2

Latest local snapshot (2026-03-19 UTC):

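| dataset          | size   | v2 median (s) | legacy median (s) | dateparser median (s) | duckling_http median (s) | v2 vs legacy | v2 vs dateparser | v2 vs duckling_http |
|------------------|--------|---------------|-------------------|-----------------------|--------------------------|--------------|------------------|---------------------|
| core_corpus      | 498    | 0.000236      | 0.003042          | 0.180596              | 0.050266                 | 12.91x       | 766.74x          | 213.41x             |
| seattle_html_76k | 74838  | 0.037436      | 0.281466          | 0.771712              | 25.353595                | 7.52x        | 20.61x           | 677.24x             |
| test_data_560k   | 552301 | 0.239391      | 2.840845          | n/a                   | n/a                      | 11.87x       | n/a              | n/a                 |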

Notes:

  • n/a means unavailable/failed for that dataset in this run.

  • dateparser/duckling_http are skipped by default for documents larger than 200k bytes unless forced.

  • Match counts differ across engines because behavior targets differ (e.g. relative/duration support and false-positive tolerance).

  • Results are hardware/environment dependent and should be treated as directional.
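The ratio columns appear to be the other engine's median divided by the v2 median. The sketch below shows one way to collect a median timing and format such a ratio with the standard library. It is an illustration under that assumption, not the code in bench/bench_readme_compare.py, and median_seconds and speedup are hypothetical names.

```python
import statistics
import time
from typing import Callable


def median_seconds(fn: Callable[[], object], iterations: int) -> float:
    """Time `fn` repeatedly and return the median wall-clock duration."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_s: float, candidate_s: float) -> str:
    """Format how many times faster the candidate is than the baseline."""
    return f"{baseline_s / candidate_s:.2f}x"


# seattle_html_76k: legacy median vs v2 median from the table
print(speedup(0.281466, 0.037436))
# 7.52x
```

Because the published medians are rounded to six decimals, ratios recomputed from them can differ from the table in the last digit.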

Release Notes

  • docs/releases/1.0.0rc2.md documents RC scope, behavior changes, and migration.
