Extract datetime objects from natural language text
Project description
A Python module for locating dates inside text. Use this package to extract date-like strings from documents and turn them into useful datetime/temporal objects.
As of 1.0.0rc1, find_dates(...) defaults to the v2 compatibility engine. The original engine remains available as find_dates_legacy(...).
Installation
Requires Python 3.9+.
With pip
pip install datefinder
If a compatible prebuilt wheel is unavailable for your platform, pip will build from source and requires a Rust toolchain.
Note: I do not publish the version on conda-forge and cannot verify its integrity.
What You Can Do With datefinder
datefinder is a Python date parser for extracting dates from unstructured text. It is useful when your data is not already normalized, for example:
emails, tickets, and support conversations
contracts, policies, and legal text
logs, reports, and markdown/wiki pages
scraped HTML and mixed-format documents
You can use it to:
parse explicit calendar dates like January 4th, 2017 or 2024-11-03 18:00
parse relative expressions like tomorrow, yesterday, and in 3 days
parse multiple date formats in one pass (month-name, slash, ISO, hyphen)
anchor relative parsing to a reference/base date
return either compatibility datetimes or typed structured match objects
In short: if you need to find and parse dates from text in Python, especially inside large documents with mixed formatting, datefinder is designed for that.
Common workflows:
migration from legacy date extraction code: use find_dates_legacy(...) for parity, then move to find_dates(...)
modern typed extraction: use extract(...) to get match kinds, spans, confidence, and structured values
command line processing: use datefinder --engine extract --json in shell pipelines
Example (Python):
import datefinder
from datetime import datetime, timezone
text = "Meeting tomorrow; launch on 2024-11-03 18:00 UTC."
ref = datetime(2026, 3, 19, 12, 0, tzinfo=timezone.utc)
# Compatibility datetimes
print(list(datefinder.find_dates(text, base_date=ref)))
# Typed extraction
for match in datefinder.extract(text, reference_dt=ref):
    print(match.kind, match.text, match.value)
Example (CLI):
datefinder --reference "2026-03-19T12:00:00+00:00" --json \
"Meeting tomorrow; launch on 2024-11-03 18:00 UTC."
How to Use
In [1]: string_with_dates = """
...: ...
...: entries are due by January 4th, 2017 at 8:00pm
...: ...
...: created 01/15/2005 by ACME Inc. and associates.
...: ...
...: """
In [2]: import datefinder
In [3]: matches = datefinder.find_dates(string_with_dates)
In [4]: for match in matches:
...:     print(match)
...:
2017-01-04 20:00:00
2005-01-15 00:00:00
CLI
The package now includes a CLI entrypoint:
datefinder --json "tomorrow and 2024-12-10"
You can also run it as a module:
python -m datefinder --engine extract --json --reference "2026-03-18T00:00:00+00:00" "in 3 days"
Engine options:
default: find_dates(...) (v2 compatibility default)
legacy: find_dates_legacy(...)
compat: find_dates_compat(...)
extract: typed extract(...) output
Common options:
--reference <ISO8601>: anchor for relative dates/times (equivalent to base_date/reference_dt)
--first {month,day,year}: disambiguation for numeric dates
--strict: stricter matching
--json / --pretty: machine-readable output
--source / --index: include source span details (default/legacy only)
--locale <code>: locale hint for extract (repeatable)
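To see why an ordering hint such as --first matters, the sketch below reads the same all-numeric string under different field orders. It uses only the standard library; interpret_numeric_date and the ORDERS mapping are hypothetical names for illustration, and the mapping of each choice to a field order (month: M/D/Y, day: D/M/Y, year: Y/M/D) is an assumption, not a description of how datefinder resolves the flag.

```python
from datetime import date

# Hypothetical illustration only: not part of datefinder.
# Assumed field order for each value of the ordering hint.
ORDERS = {
    "month": ("month", "day", "year"),
    "day": ("day", "month", "year"),
    "year": ("year", "month", "day"),
}


def interpret_numeric_date(text: str, first: str = "month") -> date:
    """Read a three-part numeric date such as 01/02/2003 under a given ordering."""
    parts = [int(p) for p in text.replace("-", "/").split("/")]
    if len(parts) != 3:
        raise ValueError(f"expected three numeric fields, got {text!r}")
    fields = dict(zip(ORDERS[first], parts))
    return date(fields["year"], fields["month"], fields["day"])


# The same string yields two different calendar dates.
for first in ("month", "day"):
    print(first, interpret_numeric_date("01/02/2003", first=first))
```

Strings such as 01/15/2005 are only valid under one ordering, which is why ambiguity mostly affects dates whose first two fields are both 12 or less.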
Examples:
# default engine (v2 compatibility), anchored relative parsing
datefinder --reference "2026-03-19T12:00:00+00:00" --json "tomorrow and 2024-12-10"
# explicit legacy behavior, include source text and indices
datefinder --engine legacy --source --index --json "created 01/15/2005 by ACME"
# typed extract output with locale hints
datefinder --engine extract --locale en --locale fr --pretty --json "in 3 days and demain"
# read long input from stdin
cat document.txt | datefinder --engine extract --json
Relative and duration values:
default / legacy / compat engines emit datetimes.
extract emits typed values:
- relative includes both resolved_datetime and delta_seconds.
- duration includes total_seconds and normalized components.
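The relationship between these fields can be sketched with the standard library alone. The field names resolved_datetime, delta_seconds, and total_seconds come from the description above; the helper functions and the days/hours/minutes/seconds component layout are assumptions for illustration, not the package's actual output schema.

```python
from datetime import datetime, timedelta, timezone


def resolve_relative(reference_dt: datetime, delta_seconds: float) -> datetime:
    """A relative match resolves to the reference moment plus its offset."""
    return reference_dt + timedelta(seconds=delta_seconds)


def normalize_duration(total_seconds: int) -> dict:
    """Split a duration into components (assumed layout, for illustration)."""
    days, rem = divmod(total_seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return {"days": days, "hours": hours, "minutes": minutes, "seconds": seconds}


ref = datetime(2026, 3, 18, tzinfo=timezone.utc)

# "in 3 days" corresponds to an offset of 3 * 86400 seconds from the reference.
print(resolve_relative(ref, 3 * 86400))

# A duration has no anchor: it is just a length of time.
print(normalize_duration(93784))
```

The point of the distinction is that a relative value changes when the reference changes, while a duration does not depend on the reference at all.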
V2 Typed API
This repository includes a v2 extraction API with typed match objects and first-class support for relative expressions and durations.
import datefinder
from datetime import datetime, timezone
matches = datefinder.extract(
    "in 3 days we deploy on 2024-11-03 18:00",
    reference_dt=datetime.now(timezone.utc),
)
for m in matches:
    print(m.kind, m.text, m.value)
There is also a compatibility helper for migrating existing code:
for dt in datefinder.find_dates_compat("tomorrow and 2024-12-10"):
    print(dt)
If you need the original parser behavior exactly:
for dt in datefinder.find_dates_legacy("April 9, 2013 at 6:11 a.m."):
    print(dt)
Rust kernel source is under rust/datefinder-kernel and is required for v2/default runtime behavior.
Rust Portability
Compiled Rust extensions are platform-specific: a wheel built for one operating system and CPU architecture does not run on another.
Release wheel targets:
- Linux glibc: x86_64 and aarch64 (manylinux2014)
- Linux musl: x86_64 and aarch64 (musllinux_1_2)
- macOS: x86_64 and arm64
- Windows: x86_64
If no compatible wheel is available, pip builds from source and requires a Rust toolchain.
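Whether a wheel is compatible is encoded in its filename. As a rough sketch, the helper below splits a wheel filename into its tags following the standard wheel naming convention (distribution, version, optional build tag, Python tag, ABI tag, platform tag). split_wheel_filename and WheelTags are hypothetical names; this is not how pip selects wheels internally, only a way to read the tags by eye.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def split_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash-separated tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return WheelTags(*parts)


tags = split_wheel_filename("datefinder-1.0.0rc1-cp39-abi3-musllinux_1_2_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

A platform tag may list several dot-separated aliases, for example manylinux_2_17_aarch64.manylinux2014_aarch64, which name the same glibc baseline in two notations.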
Conformance and Ambiguity Reports
Build a reproducible corpus from legacy tests and generate differential reports between legacy behavior and find_dates_compat:
python scripts/build_conformance_corpus.py
python scripts/diff_legacy_v2.py
This writes:
conformance/legacy_parity_cases.jsonl
conformance/reports/legacy_v2_diff_report.md
conformance/reports/ambiguity_showcase.md
conformance/reports/behavior_change_changelog.md
The ambiguity showcase also supports interpretation judgments in conformance/interpretation_judgments.jsonl to assess whether legacy behavior is semantically preferable for ambiguous real-world cases.
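For a quick parity check without the full scripts, a differential comparison can be sketched as below. This is not the logic of scripts/diff_legacy_v2.py, whose internals are not shown here; parity_diff and the two stub extractors are hypothetical and exist only so the example runs without datefinder installed.

```python
from datetime import datetime
from typing import Callable, Iterable, List

Extractor = Callable[[str], Iterable[datetime]]


def parity_diff(cases: List[str], legacy: Extractor, candidate: Extractor) -> List[dict]:
    """Return the cases where two extractors produce different datetimes."""
    mismatches = []
    for text in cases:
        old = list(legacy(text))
        new = list(candidate(text))
        if old != new:
            mismatches.append({"text": text, "legacy": old, "candidate": new})
    return mismatches


# Stub extractors standing in for two real engines (illustration only).
def stub_legacy(text: str) -> List[datetime]:
    return [datetime(2005, 1, 15)] if "01/15/2005" in text else []


def stub_candidate(text: str) -> List[datetime]:
    return [datetime(2005, 1, 15)] if "2005" in text else []


cases = ["created 01/15/2005 by ACME", "revision 2005 notes"]
for row in parity_diff(cases, stub_legacy, stub_candidate):
    print(row["text"], row["legacy"], row["candidate"])
```

With the package installed, the same function could presumably be called as parity_diff(cases, datefinder.find_dates_legacy, datefinder.find_dates_compat), since both are described above as yielding datetimes.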
See also:
CONTRIBUTING.md for developer setup and validation commands.
RELEASE.md for release checklist.
Benchmark Snapshot
The command below generates a local benchmark snapshot comparing:
v2: datefinder.extract(...)
legacy: datefinder.find_dates_legacy(...)
dateparser: dateparser.search.search_dates
duckling_http: Duckling POST /parse
Run:
# optional: run duckling locally
docker run --rm -p 8000:8000 rasa/duckling:latest
python bench/bench_readme_compare.py \
--iterations-small 12 \
--iterations-large 2
Latest local snapshot (2026-03-19 UTC):
| dataset | size | v2 median (s) | legacy median (s) | dateparser median (s) | duckling_http median (s) | v2 vs legacy | v2 vs dateparser | v2 vs duckling_http |
|---|---|---|---|---|---|---|---|---|
| core_corpus | 498 | 0.000236 | 0.003042 | 0.180596 | 0.050266 | 12.91x | 766.74x | 213.41x |
| seattle_html_76k | 74838 | 0.037436 | 0.281466 | 0.771712 | 25.353595 | 7.52x | 20.61x | 677.24x |
| test_data_560k | 552301 | 0.239391 | 2.840845 | n/a | n/a | 11.87x | n/a | n/a |
Notes:
n/a means unavailable/failed for that dataset in this run.
dateparser/duckling_http are skipped by default for documents larger than 200k bytes unless forced.
Match counts differ across engines because behavior targets differ (e.g. relative/duration support and false-positive tolerance).
Results are hardware/environment dependent and should be treated as directional.
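To reproduce this kind of comparison on your own inputs, a median-of-runs timing can be sketched as follows. It assumes the "vs" columns are ratios of median runtimes, which the snapshot does not state explicitly; median_runtime and speedup are hypothetical helpers and not part of bench/bench_readme_compare.py.

```python
import statistics
import time
from typing import Callable


def median_runtime(fn: Callable[[], object], iterations: int) -> float:
    """Run fn repeatedly and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


# Rounded medians from the core_corpus row (legacy vs v2).
print(f"{speedup(0.003042, 0.000236):.2f}x")
```

Recomputing from the rounded medians gives a value close to, but not exactly equal to, the published ratio, which suggests the published column was derived from unrounded timings.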
Release Notes
docs/releases/1.0.0rc1.md documents RC scope, behavior changes, and migration.
Download files
Source Distributions
Built Distributions
File details
Details for the file datefinder-1.0.0rc1-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 1.0 MB
- Tags: PyPy, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3a24211660bf84a9bc9ba797e1d23f1a8424cca2bdf4508f7aeec6382a20eaa7 |
| MD5 | 58fdd5e7983dcd8617dfb82709b4ce45 |
| BLAKE2b-256 | 19e0b9d3a7f82cbb735e72ca4f5ac32b4acfda02beaae5c0b11f793f975ef72c |
File details
Details for the file datefinder-1.0.0rc1-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 1.0 MB
- Tags: PyPy, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 41fc18e0511648c91008a74e17891557e5fae6cfaca34138846f3c7f0c8ad536 |
| MD5 | 0d4846b2c34137fe4f2724b589459d97 |
| BLAKE2b-256 | 454dd5c354b65e4c1983ba572e9739dcc7085a19d6cf618e25b7a892f224c7c1 |
File details
Details for the file datefinder-1.0.0rc1-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 1.0 MB
- Tags: PyPy, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8c4c415f359a6b3ed35de68cac44b22776f5a1b05dd9405a9a231d07283da8d4 |
| MD5 | c0820a450bf5d519230db59d98606104 |
| BLAKE2b-256 | 5f43690ada6dc81fbc787ede288506e9770e4e81587cd9e215ffc4da52147396 |
File details
Details for the file datefinder-1.0.0rc1-pp37-pypy37_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-pp37-pypy37_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 1.0 MB
- Tags: PyPy, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3c0ed165122bd81ae33de1a23e1ff58b48ff8e10a3a41e716eed8c0941aa1ec0 |
| MD5 | a5c3720b223c08504ddd9fb6c87d05fe |
| BLAKE2b-256 | 0053bb4d85cf565be0e6958437cd2efb036f6806f3ed4de154a220c74e66966a |
File details
Details for the file datefinder-1.0.0rc1-cp39-abi3-win_amd64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-cp39-abi3-win_amd64.whl
- Upload date:
- Size: 785.3 kB
- Tags: CPython 3.9+, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 467e3b86dcf0ec4a36e602006dd322294ff0334a77f7a3b7d13b706b7e1acae8 |
| MD5 | cf09fa86c0738ffa91eb4443c19b0024 |
| BLAKE2b-256 | 90646f125365b46532e7157f9e9c6bf0e1efb530077527ef3d2de176462e735c |
File details
Details for the file datefinder-1.0.0rc1-cp39-abi3-musllinux_1_2_x86_64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-cp39-abi3-musllinux_1_2_x86_64.whl
- Upload date:
- Size: 1.3 MB
- Tags: CPython 3.9+, musllinux: musl 1.2+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 553d6205786ee6d03a235f295c08eb0f4ab66fdf95316b6997609304da1c88b7 |
| MD5 | 229c4086a55a7c0bd9fee69c1ebc64cc |
| BLAKE2b-256 | 5670c02fcfec5e714b4f3fdf2e7251cb94abc0462bcb379de2e16fa55ee2eb76 |
File details
Details for the file datefinder-1.0.0rc1-cp39-abi3-musllinux_1_2_aarch64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-cp39-abi3-musllinux_1_2_aarch64.whl
- Upload date:
- Size: 1.2 MB
- Tags: CPython 3.9+, musllinux: musl 1.2+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5c3b2033598bdf145235778c44d17e6011bc0dd8752c6ab49df7777b6eb6edc8 |
| MD5 | d990b9d113a6903672ab9016c4cfb008 |
| BLAKE2b-256 | b90d0836e45d8057561083cab77a1a2f2aafadf154d524723bcd0ac122e1f96b |
File details
Details for the file datefinder-1.0.0rc1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 1.0 MB
- Tags: CPython 3.9+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 078ee2b0c02592eefb1f3bf041ed3bc1e3c8aee359f2e2573c92b06335a3e892 |
| MD5 | 1a96a71c43c4c59224a9cfbcc6088b7d |
| BLAKE2b-256 | 30ab6a489c56c4cbdf4657f3f56aa89ea39766e20f9ff6ded418ebd745aa0fc7 |
File details
Details for the file datefinder-1.0.0rc1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
- Upload date:
- Size: 1.0 MB
- Tags: CPython 3.9+, manylinux: glibc 2.17+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3d78bf9be17c572d4f60e59bcf02feced68c2adf34f725a964410485eef93b0f |
| MD5 | be7f194f7ba1985c7a1d78812c117054 |
| BLAKE2b-256 | 6b5b415e4642035ca3d4c0cc9bce54d694c5bbcd842959e18bd8e1d24b79c699 |
File details
Details for the file datefinder-1.0.0rc1-cp39-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-cp39-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 880.4 kB
- Tags: CPython 3.9+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 10257f3a374522483dab23bc2fc1881391bbb8b7f72bdad5e60ddeace8b9b0dc |
| MD5 | 153c6c099d87a397b7ab8d516039b6e9 |
| BLAKE2b-256 | dfc018c2e99c9b33559146fd19e56921631d0a9280fb48ac66280e31a44a5614 |
File details
Details for the file datefinder-1.0.0rc1-cp39-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: datefinder-1.0.0rc1-cp39-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 924.0 kB
- Tags: CPython 3.9+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.25
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 93efee37543a2025f984cb8b8dffed5c849208556acc8d8466a3b01bef04d2a4 |
| MD5 | 9a9fddb333b83f267d466ee4e6371941 |
| BLAKE2b-256 | 6bab46be75b752d341035d50100e32236a35d43d6ffa7f23230a00a599a739a8 |
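A downloaded wheel can be checked against its published SHA256 digest with the standard library. The sketch below is one way to do that; sha256_of_file and matches_published_hash are hypothetical helper names, and the path in the commented usage line is a placeholder for wherever the file was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large wheels are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (placeholder path; paste the SHA256 digest listed for that exact file):
# matches_published_hash("datefinder-1.0.0rc1-cp39-abi3-win_amd64.whl", "<sha256 digest>")
```

pip can enforce the same check at install time through hash-pinned requirements files, which avoids a separate manual step.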