Skip to main content

Extracts and structures dates and times from natural French text.

Project description

chronostring

chronostring is a Python library designed to extract dates and times from natural language strings written in French. It transforms text like "du 3 au 5 juillet" into Python datetime objects, making it easier to process temporal information from unstructured data.

It is designed to handle flexible, informal expressions such as:

  • "5 et 6 juin 2024"
  • "du 3 au 5 juillet"
  • "lundi 4 et mardi 5 mars 2025"
  • "les 1er, 2 et 5 juin à 10h"
  • "le 8 et le 9 mai"
  • "vendredi 12/01/2025 à 18h30"

Features

  • Extracts single dates and date ranges from French strings
  • Handles time expressions like "à 18h" or "de 10:00 à 12:00"
  • Recognizes partial dates and completes them from context
  • Produces clean and structured outputs: each item in the output list is either a date, a datetime or a datetime range.

How chronostring Works

chronostring operates through a multi-step process designed for extracting structured date and time data from natural language strings.

First, the input string is tokenized into elementary Token objects, identifying both literal tokens (such as conjunctions and delimiters) and content-bearing ones (such as partial or complete dates and times).

These tokens are then enriched through a second stage that detects and completes partial dates and times when possible. The next stage involves matching token sequences to known temporal patterns, using a symbolic representation (each token class mapped to a character) and applying regular expressions on these symbolic strings.

Matched patterns are replaced with more complex temporal objects such as datetime, date, or datetime intervals. This entire processing pipeline is implemented via a series of processors, designed to be chained and optimized with Python's yield mechanism for lazy evaluation and efficiency.

The library also supports internationalization: all language-specific tokens and processors for French are grouped into a separate implementation (tokens_fr and processors_fr), making it easy to extend support for other languages.

Installation

pip install chronostring

Usage

from chronostring import parse_dates

text = "Les 5, 6 et 7 juin à 10h"
dates = parse_dates(text)

for dt in dates:
    print(dt)

Test suite

To run the test suite, simply use the command make test from the root directory of the project. This command will automatically invoke pytest and run all unit tests defined in the tests/ directory. Make sure you have the required dependencies installed beforehand.

License

This project is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0).

See LICENSE for details.

Authors

See AUTHORS.md for the list of contributors.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

chronostring-0.1.5-py3-none-any.whl (24.9 kB view details)

Uploaded Python 3

File details

Details for the file chronostring-0.1.5-py3-none-any.whl.

File metadata

  • Download URL: chronostring-0.1.5-py3-none-any.whl
  • Upload date:
  • Size: 24.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.11

File hashes

Hashes for chronostring-0.1.5-py3-none-any.whl
Algorithm Hash digest
SHA256 f9c5128392c9bbf62eb243dbfd29a3bd1e8275323b68976109b543bb05330c54
MD5 689166a34c18a5ef8f32ae0c77c62939
BLAKE2b-256 47610bfbe07d93e218288499a320957ea392b5bfc47d344c70e36df1a57508bb

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page