puckling

A Python port of Facebook's Duckling. It parses natural-language English and Arabic text into structured values: numbers, ordinals, dates, durations, distances, temperatures, money, emails, URLs, phone numbers, and more.

Quick start

Install the dependencies and run the test suite:

uv sync --all-extras
uv run pytest

Then parse text from Python:

import datetime as dt
from puckling import Context, Lang, Locale, Options, parse

ctx = Context(
    reference_time=dt.datetime(2013, 2, 12, 4, 30, tzinfo=dt.UTC),
    locale=Locale(Lang.EN),
)

for entity in parse("I'll meet you tomorrow at 5pm for $50", ctx, Options()):
    print(entity)

Architecture

Puckling mirrors Duckling's parsing model in idiomatic, functional Python:

  • Rules are pure data: Rule(name, pattern, prod).
  • Patterns are tuples of RegexItem (regex over source text) and PredicateItem (predicates over existing tokens).
  • Productions are pure functions tuple[Token, ...] → Token | None.
  • The engine is a saturating fixed-point parser that applies rules iteratively until no new tokens appear.
  • Resolution is context-aware (reference time, locale) and dimension-specific.

All public types are @dataclass(frozen=True, slots=True), so instances cannot be mutated. Cross-dimension references go through predicates (is_numeral, is_grain, …), never imports, so each rule file stays independent.

Engine budgets

The saturating fixed-point parser is bounded by three caps to prevent runaway parses on pathological compositional inputs:

Options field      Default   Disable with
parse_timeout_ms   2000      None
max_tokens         10000     n/a
max_iterations     50        n/a

When any cap is hit, the engine returns the tokens it has accumulated so far (a valid, possibly partial parse). For offline corpus runs where you want unbounded analysis, pass Options(parse_timeout_ms=None).
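How the three caps might interact can be sketched with a generic fixed-point loop. This is an assumption-laden illustration, not puckling's engine code: saturate and its step callback are hypothetical names, and only the default values are taken from the table above.

```python
import time
from typing import Callable, Hashable, Iterable, Optional


def saturate(
    step: Callable[[frozenset], Iterable[Hashable]],
    *,
    timeout_ms: Optional[int] = 2000,
    max_tokens: int = 10_000,
    max_iterations: int = 50,
) -> frozenset:
    """Apply `step` until nothing new appears or one of the budgets runs out."""
    # timeout_ms=None disables the wall-clock cap, mirroring the table above.
    deadline = None if timeout_ms is None else time.monotonic() + timeout_ms / 1000
    tokens: set = set()
    for _ in range(max_iterations):
        if deadline is not None and time.monotonic() >= deadline:
            break  # time budget exhausted: return what we have so far
        new = set(step(frozenset(tokens))) - tokens
        if not new:
            break  # fixed point reached: a complete parse
        tokens |= new
        if len(tokens) >= max_tokens:
            break  # token budget exhausted
    return frozenset(tokens)


def runaway(tokens: frozenset) -> set:
    """A pathological step that always invents one more token."""
    return {len(tokens)}


print(len(saturate(runaway)))                 # stopped by max_iterations
print(len(saturate(runaway, max_tokens=10)))  # stopped by max_tokens
```

Whichever cap trips first, the loop returns the set accumulated so far, which matches the "valid, possibly partial parse" behaviour described above.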

Running scripts safely

Always wrap inline smoke tests in the shell's timeout command so that a runaway parse cannot outlive the calling shell:

timeout 5 uv run python -c "
from puckling import parse, Context, Locale, Lang, Options
import datetime as dt
ctx = Context(reference_time=dt.datetime.now(dt.UTC), locale=Locale(Lang.EN))
print(parse('tomorrow at 5pm', ctx, Options()))
"

The engine's own budget should be enough on its own, but the shell-level timeout is belt-and-suspenders against any future engine path that bypasses the budget check.
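When a smoke test is launched from Python instead of a shell, the same outer guard can be approximated with the standard library. The helper below is a sketch; run_with_deadline is a hypothetical name, not part of puckling. It relies on subprocess.run killing the child process when its timeout expires.

```python
import subprocess
import sys
from typing import Optional


def run_with_deadline(code: str, seconds: float) -> Optional[str]:
    """Run `code` in a fresh interpreter; return its stdout, or None on timeout."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=seconds,  # the child is killed if it runs longer than this
            check=True,
        )
    except subprocess.TimeoutExpired:
        return None
    return done.stdout


print(run_with_deadline("print('ok')", 5))
print(run_with_deadline("import time; time.sleep(30)", 0.5))
```

The snippet passed as code could be the same parse call used in the shell example; a deliberately slow sleep is used here so the sketch runs without puckling installed.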

Adding a dimension or locale

To port a Duckling rule file, add:

src/puckling/dimensions/<dim>/<lang>/__init__.py
src/puckling/dimensions/<dim>/<lang>/rules.py     # exports RULES: tuple[Rule, ...]
src/puckling/dimensions/<dim>/<lang>/corpus.py    # exports CORPUS: tuple[Example, ...]
tests/dimensions/test_<dim>_<lang>.py

The registry auto-discovers any <dim>/<lang>/rules.py exporting RULES, so there is no central registration list to update.
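One way such discovery could work is sketched below. It is an illustration under stated assumptions, not the registry's actual code: discover_rules is a hypothetical helper that scans a directory tree for <dim>/<lang>/rules.py files and keeps those exporting RULES. The demo builds a throwaway tree in a temporary directory so it runs on its own.

```python
import importlib.util
import tempfile
from pathlib import Path


def discover_rules(root: Path) -> dict[tuple[str, str], tuple]:
    """Map (dim, lang) to the RULES tuple of every <dim>/<lang>/rules.py under root."""
    found: dict[tuple[str, str], tuple] = {}
    for path in sorted(root.glob("*/*/rules.py")):
        dim, lang = path.parts[-3], path.parts[-2]
        # Load the file as an anonymous module; the module name is arbitrary.
        spec = importlib.util.spec_from_file_location(f"_sketch_{dim}_{lang}", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        rules = getattr(module, "RULES", None)
        if rules is not None:  # files without RULES are silently skipped
            found[(dim, lang)] = tuple(rules)
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "numeral" / "en").mkdir(parents=True)
    (root / "numeral" / "en" / "rules.py").write_text("RULES = ('integer', 'dozen')\n")
    (root / "time" / "ar").mkdir(parents=True)
    (root / "time" / "ar" / "rules.py").write_text("HELPERS = ()\n")  # no RULES
    registry = discover_rules(root)

print(registry)
```

With this shape, adding a dimension or locale is purely additive: dropping a new rules.py into the tree is enough for it to be picked up.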

Status

Foundation complete. Per-dimension rule sets are being ported in parallel from upstream Haskell sources. See src/puckling/dimensions/* for current coverage.

License

Apache-2.0, mirroring upstream Duckling.
