Extracts and structures dates and times from natural French text.

chronostring

chronostring is a Python library designed to extract dates and times from natural language strings written in French. It transforms text like "du 3 au 5 juillet" into Python datetime objects, making it easier to process temporal information from unstructured data.

It is designed to handle flexible, informal expressions such as:

  • "5 et 6 juin 2024"
  • "du 3 au 5 juillet"
  • "lundi 4 et mardi 5 mars 2025"
  • "les 1er, 2 et 5 juin à 10h"
  • "le 8 et le 9 mai"
  • "vendredi 12/01/2025 à 18h30"

Features

  • Extracts single dates and date ranges from French strings
  • Handles time expressions like "à 18h" or "de 10:00 à 12:00"
  • Recognizes partial dates and completes them from context
  • Produces structured output: each item in the returned list is a date, a datetime, or a datetime range.
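
Since the output list can mix several kinds of items, calling code usually has to branch on the item type. The following is a minimal sketch of such a branch, assuming the date and datetime items are the standard library types. The helper name describe is hypothetical, and because this text does not specify the concrete type used for ranges, the sketch simply falls through to a generic case for them.

```python
from datetime import date, datetime


def describe(item):
    """Return a short label for one parsed item (hypothetical helper)."""
    # datetime is a subclass of date, so it has to be tested first.
    if isinstance(item, datetime):
        return f"datetime: {item:%Y-%m-%d %H:%M}"
    if isinstance(item, date):
        return f"date: {item:%Y-%m-%d}"
    # Assumption: anything else is a range; its real type is not documented here.
    return f"range or other: {item!r}"


print(describe(date(2024, 6, 5)))
print(describe(datetime(2024, 6, 5, 10, 0)))
```

The order of the isinstance checks matters: testing date first would also catch every datetime and silently drop the time part from the label.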

How chronostring Works

chronostring extracts structured date and time data from natural-language strings in several stages.

First, the input string is tokenized into elementary Token objects, identifying both literal tokens (such as conjunctions and delimiters) and content-bearing ones (such as partial or complete dates and times).
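
To make the tokenization step concrete, here is a minimal sketch of a regex-based tokenizer for a small subset of French date vocabulary. It is not the library's implementation: the Token dataclass, the kind labels (TIME, DAY, FROM, and so on) and the tokenize function are all hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical token: a kind label plus the matched text."""
    kind: str
    text: str


MONTHS = (
    "janvier|février|mars|avril|mai|juin|juillet|"
    "août|septembre|octobre|novembre|décembre"
)

# Order matters: earlier alternatives win at a given position.
TOKEN_SPEC = [
    ("TIME", r"\b\d{1,2}h(?:\d{2})?\b"),  # 10h, 18h30
    ("YEAR", r"\b\d{4}\b"),               # 2024
    ("DAY", r"\b\d{1,2}(?:er)?\b"),       # 5, 1er
    ("MONTH", rf"\b(?:{MONTHS})\b"),
    ("FROM", r"\bdu\b"),
    ("TO", r"\bau\b"),
    ("AND", r"\bet\b"),
    ("AT", r"\bà\b"),
    ("COMMA", r","),
]

MASTER = re.compile(
    "|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC),
    re.IGNORECASE,
)


def tokenize(text):
    """Split text into Tokens; words outside the vocabulary are skipped."""
    return [Token(m.lastgroup, m.group().lower()) for m in MASTER.finditer(text)]


for token in tokenize("du 3 au 5 juillet"):
    print(token.kind, token.text)
```

Literal tokens such as FROM, TO and AND carry no date information themselves, but keeping them in the stream is what lets a later stage tell "3 et 5 juillet" (two days) from "du 3 au 5 juillet" (a range).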

These tokens are then enriched through a second stage that detects and completes partial dates and times when possible. The next stage involves matching token sequences to known temporal patterns, using a symbolic representation (each token class mapped to a character) and applying regular expressions on these symbolic strings.
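
The symbolic matching idea can be sketched as follows: each token kind is mapped to one character, the characters are joined into a string, and an ordinary regular expression is run over that string. Because one character stands for one token, a match position is also a token index. Everything named here (SYMBOLS, RANGE_PATTERN, find_ranges, the default_year parameter, the tuple-based tokens) is an assumption made for illustration, and how the library actually completes a missing year is not described in this text.

```python
import re
from datetime import date

# Hypothetical mapping from token kind to a one-character symbol.
SYMBOLS = {"FROM": "f", "TO": "t", "DAY": "d", "MONTH": "m", "YEAR": "y"}

MONTH_NAMES = [
    "janvier", "février", "mars", "avril", "mai", "juin", "juillet",
    "août", "septembre", "octobre", "novembre", "décembre",
]
MONTH_NUMBERS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}

# "du <day> au <day> <month> [<year>]" expressed over symbols.
RANGE_PATTERN = re.compile(r"fdtdm(y)?")


def find_ranges(tokens, default_year):
    """Return (start, end) date pairs for every range pattern in tokens.

    tokens is a list of (kind, text) pairs. default_year fills in a
    missing year; this completion rule is an assumption of the sketch.
    """
    symbolic = "".join(SYMBOLS.get(kind, "?") for kind, _ in tokens)
    ranges = []
    for match in RANGE_PATTERN.finditer(symbolic):
        i = match.start()  # one symbol per token, so this is a token index
        first_day = int(tokens[i + 1][1])
        last_day = int(tokens[i + 3][1])
        # The month is stated once but completes both partial dates.
        month = MONTH_NUMBERS[tokens[i + 4][1]]
        year = int(tokens[i + 5][1]) if match.group(1) else default_year
        ranges.append((date(year, month, first_day), date(year, month, last_day)))
    return ranges


tokens = [("FROM", "du"), ("DAY", "3"), ("TO", "au"), ("DAY", "5"), ("MONTH", "juillet")]
print(find_ranges(tokens, default_year=2025))
```

Working on the symbolic string keeps the patterns short and readable: "fdtdm" is easier to maintain than a single regular expression that has to describe French day, month and year spellings all at once.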

Matched patterns are replaced with richer temporal objects such as date, datetime, or datetime intervals. The whole pipeline is implemented as a series of chainable processors built on Python generators (yield), so tokens flow through the stages lazily instead of being materialized in intermediate lists.
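
The chaining of lazy processors can be illustrated with plain generator functions. This is only a sketch of the general technique, not the library's processor API: lowercase, drop_articles, chain and traced are hypothetical names, and the items flowing through are bare strings rather than real Token objects.

```python
def lowercase(items):
    """Processor: normalize each item to lower case."""
    for item in items:
        yield item.lower()


def drop_articles(items):
    """Processor: discard French articles that carry no date information."""
    for item in items:
        if item not in {"le", "les"}:
            yield item


def chain(source, *processors):
    """Feed source through each processor in order, without consuming it."""
    stream = iter(source)
    for processor in processors:
        stream = processor(stream)
    return stream


def traced(words, log):
    """Source that records each word at the moment it is actually read."""
    for word in words:
        log.append(word)
        yield word


log = []
pipeline = chain(traced(["Le", "8", "et", "le", "9", "mai"], log), lowercase, drop_articles)
print(log)             # nothing has been read yet
print(next(pipeline))  # pulls just enough input to produce one item
print(log)
```

Building the pipeline reads nothing from the source. Each call to next pulls only as many input items as are needed to produce one output item, which is what makes the chain lazy.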

The library also supports internationalization: all language-specific tokens and processors for French are grouped into a separate implementation (tokens_fr and processors_fr), making it easy to extend support for other languages.
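
One way to picture this separation is a language pack: a small object that holds every language-specific word, passed to language-neutral code. The sketch below is an illustration of that design only. LanguagePack, FRENCH, ENGLISH and month_number are hypothetical names and do not reflect the actual contents of tokens_fr or processors_fr.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguagePack:
    """Hypothetical bundle of the vocabulary one language needs."""
    months: dict       # month name -> month number
    range_start: str   # word opening a range, e.g. "du"
    range_end: str     # word closing a range, e.g. "au"


def _numbered(names):
    """Map each month name to its 1-based position in the year."""
    return {name: number for number, name in enumerate(names, start=1)}


FRENCH = LanguagePack(
    months=_numbered([
        "janvier", "février", "mars", "avril", "mai", "juin", "juillet",
        "août", "septembre", "octobre", "novembre", "décembre",
    ]),
    range_start="du",
    range_end="au",
)

ENGLISH = LanguagePack(
    months=_numbered([
        "january", "february", "march", "april", "may", "june", "july",
        "august", "september", "october", "november", "december",
    ]),
    range_start="from",
    range_end="to",
)


def month_number(word, pack):
    """Language-neutral lookup: returns None if word is not a month in pack."""
    return pack.months.get(word.lower())


print(month_number("Juillet", FRENCH))
print(month_number("July", ENGLISH))
```

With this layout, adding a language means supplying a new pack rather than editing the shared lookup and matching code.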

Installation

pip install chronostring

Usage

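from chronostring import parse_dates

# French input: a list of days sharing one month and one time
text = "Les 5, 6 et 7 juin à 10h"
dates = parse_dates(text)

# Each item is a date, a datetime, or a datetime range
for dt in dates:
    print(dt)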

Test suite

To run the test suite, run make test from the project root. This invokes pytest on all unit tests in the tests/ directory. Install the required dependencies first.

License

This project is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later).

See LICENSE for details.

Authors

See AUTHORS.md for the list of contributors.
