Extracts and structures dates and times from natural French text.

chronostring

chronostring is a Python library designed to extract dates and times from natural language strings written in French. It transforms text like "du 3 au 5 juillet" into Python datetime objects, making it easier to process temporal information from unstructured data.

It is designed to handle flexible, informal expressions such as:

  • "5 et 6 juin 2024"
  • "du 3 au 5 juillet"
  • "lundi 4 et mardi 5 mars 2025"
  • "les 1er, 2 et 5 juin à 10h"
  • "le 8 et le 9 mai"
  • "vendredi 12/01/2025 à 18h30"

Features

  • Extracts single dates and date ranges from French strings
  • Handles time expressions like "à 18h" or "de 10:00 à 12:00"
  • Recognizes partial dates and completes them from context
  • Produces structured output: each item in the returned list is a date, a datetime, or a datetime range
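
To illustrate what completing a partial date from context can mean, here is a minimal sketch. It does not use chronostring's internals: PartialDate and complete_from_right are hypothetical names, and the rule used (a missing month or year is inherited from the nearest more complete date to the right) is an assumption about how an expression like "5 et 6 juin 2024" might be resolved.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class PartialDate:
    """Hypothetical container: month and year may be missing (None)."""
    day: int
    month: Optional[int] = None
    year: Optional[int] = None


def complete_from_right(parts: List[PartialDate]) -> List[date]:
    """Fill missing month/year from the nearest more complete date to the right."""
    month: Optional[int] = None
    year: Optional[int] = None
    completed: List[date] = []
    # Walk right to left so later, more complete dates provide the context.
    for part in reversed(parts):
        month = part.month if part.month is not None else month
        year = part.year if part.year is not None else year
        if month is None or year is None:
            raise ValueError(f"cannot complete {part!r} from context")
        completed.append(date(year, month, part.day))
    completed.reverse()
    return completed


# "5 et 6 juin 2024": the first day has no month or year of its own.
parts = [PartialDate(day=5), PartialDate(day=6, month=6, year=2024)]
resolved = complete_from_right(parts)
```

Here resolved holds two date objects, one per day, both in June 2024.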

How chronostring Works

chronostring processes an input string in several stages to turn natural language into structured date and time data.

First, the input string is tokenized into elementary Token objects, identifying both literal tokens (such as conjunctions and delimiters) and content-bearing ones (such as partial or complete dates and times).
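
The tokenization step can be sketched with a small regex-based scanner. This is not the library's tokenizer: the Token class, the token kinds, and the patterns below are illustrative assumptions that cover only a handful of French words (weekday names, for instance, are not handled and raise an error).

```python
import re
from dataclasses import dataclass
from typing import Iterator

# Hypothetical token kinds and patterns; order matters (TIME before NUMBER).
TOKEN_SPEC = [
    ("TIME", r"\d{1,2}h(?:\d{2})?\b"),
    ("NUMBER", r"\d{1,4}(?:er)?\b"),
    ("MONTH", r"\b(?:janvier|février|mars|avril|mai|juin|juillet|août"
              r"|septembre|octobre|novembre|décembre)\b"),
    ("LITERAL", r"\b(?:les|le|du|au|et|à)\b|,"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Token:
    kind: str
    text: str


def tokenize(text: str) -> Iterator[Token]:
    """Yield tokens left to right; whitespace is consumed but not emitted."""
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected text at position {pos}: {text[pos:]!r}")
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())


tokens = list(tokenize("Les 1er, 2 et 5 juin à 10h"))
```

Literal tokens ("Les", ",", "et", "à") and content-bearing tokens ("1er", "juin", "10h") end up in one flat sequence that later stages can enrich.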

A second stage enriches these tokens by detecting and completing partial dates and times where possible. A third stage matches token sequences against known temporal patterns: each token class is mapped to a single character, the token sequence becomes a symbolic string, and regular expressions are applied to that string.
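
The symbolic matching idea can be illustrated as follows. The token kinds, the symbol characters, and the single range pattern are assumptions chosen for this sketch, not the library's actual tables. Because each token maps to exactly one character, a match offset in the symbolic string is also a token index.

```python
import re
from typing import List, Tuple

# Hypothetical mapping from token class to one symbol character.
SYMBOLS = {"DATE": "D", "TIME": "T", "FROM": "f", "TO": "t", "AND": "&", "AT": "@"}

# "du <date> au <date>" corresponds to the symbolic string "fDtD".
RANGE_PATTERN = re.compile(r"fDtD")


def to_symbols(kinds: List[str]) -> str:
    """Encode a sequence of token kinds as a string, one character per token."""
    return "".join(SYMBOLS[kind] for kind in kinds)


def find_ranges(kinds: List[str]) -> List[Tuple[int, int]]:
    """Return (start, end) token index spans that match the range pattern."""
    return [m.span() for m in RANGE_PATTERN.finditer(to_symbols(kinds))]


# Token kinds for something like "<date> et du <date> au <date> à <time>".
kinds = ["DATE", "AND", "FROM", "DATE", "TO", "DATE", "AT", "TIME"]
symbols = to_symbols(kinds)
spans = find_ranges(kinds)
```

The tokens in each returned span are the ones a later stage would replace with a single range object.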

Matched sequences are replaced with higher-level temporal objects: a date, a datetime, or a datetime interval. The whole pipeline is implemented as a series of chainable processors written as Python generators (yield), so tokens are evaluated lazily and only as needed.
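
A generator-based processor chain of this kind might look like the sketch below. The processors shown (spy, drop_blanks, lowercase) and the run_pipeline helper are hypothetical and operate on plain strings rather than the library's Token objects. The spy processor records what has been pulled from the source, which makes the lazy evaluation visible.

```python
from typing import Callable, Iterable, Iterator, List

Processor = Callable[[Iterable[str]], Iterator[str]]

seen: List[str] = []


def spy(tokens: Iterable[str]) -> Iterator[str]:
    """Record each token as it is pulled from the source."""
    for token in tokens:
        seen.append(token)
        yield token


def drop_blanks(tokens: Iterable[str]) -> Iterator[str]:
    """Discard whitespace-only tokens."""
    for token in tokens:
        if token.strip():
            yield token


def lowercase(tokens: Iterable[str]) -> Iterator[str]:
    """Normalize case."""
    for token in tokens:
        yield token.lower()


def run_pipeline(source: Iterable[str], *processors: Processor) -> Iterator[str]:
    """Chain processors so each one consumes the previous one's output."""
    stream: Iterable[str] = source
    for processor in processors:
        stream = processor(stream)
    return iter(stream)


pipeline = run_pipeline(["Du", " ", "3", "AU", "5"], spy, drop_blanks, lowercase)
first = next(pipeline)          # pulls exactly one token through every stage
seen_after_first = list(seen)   # only "Du" has been read from the source so far
rest = list(pipeline)           # draining the pipeline processes the remainder
```

No intermediate list is built between stages, so each token flows through the whole chain before the next one is read.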

The library also supports internationalization: all language-specific tokens and processors for French are grouped into a separate implementation (tokens_fr and processors_fr), making it easy to extend support for other languages.
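
One way to picture this separation is a per-language lookup table selected by a language code. The sketch below is an assumption about the general approach, not the contents of tokens_fr or processors_fr; MONTH_TABLES, month_number, and the English table are hypothetical.

```python
from typing import Dict

MONTHS_FR: Dict[str, int] = {
    "janvier": 1, "février": 2, "mars": 3, "avril": 4, "mai": 5, "juin": 6,
    "juillet": 7, "août": 8, "septembre": 9, "octobre": 10, "novembre": 11,
    "décembre": 12,
}

# Hypothetical second language, added without touching the French table.
MONTHS_EN: Dict[str, int] = {
    "january": 1, "february": 2, "march": 3, "april": 4, "may": 5, "june": 6,
    "july": 7, "august": 8, "september": 9, "october": 10, "november": 11,
    "december": 12,
}

MONTH_TABLES: Dict[str, Dict[str, int]] = {"fr": MONTHS_FR, "en": MONTHS_EN}


def month_number(name: str, lang: str = "fr") -> int:
    """Look up a month name in the table for the given language code."""
    try:
        table = MONTH_TABLES[lang]
    except KeyError:
        raise ValueError(f"unsupported language: {lang!r}") from None
    return table[name.lower()]
```

Language-independent code only ever calls month_number, so supporting another language means adding a table rather than changing the pipeline.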

Installation

pip install chronostring

Usage

from chronostring import parse_dates

text = "Les 5, 6 et 7 juin à 10h"
dates = parse_dates(text)

for dt in dates:
    print(dt)
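
Since each item may be a date, a datetime, or a datetime range, calling code usually needs to branch on the type. The sketch below shows one way to do that. The describe helper is hypothetical, and representing a range as a (start, end) tuple is an assumption: check the actual return type of parse_dates before relying on it. Note that datetime is a subclass of date, so the datetime check has to come first.

```python
from datetime import date, datetime


def describe(item) -> str:
    """Return a short label for one parsed item."""
    # ASSUMPTION: a range is represented as a (start, end) tuple.
    if isinstance(item, tuple):
        start, end = item
        return f"range: {start.isoformat()} -> {end.isoformat()}"
    # datetime must be tested before date, because datetime subclasses date.
    if isinstance(item, datetime):
        return f"datetime: {item.isoformat()}"
    if isinstance(item, date):
        return f"date: {item.isoformat()}"
    raise TypeError(f"unexpected item type: {type(item).__name__}")


# Hand-built sample items standing in for parser output.
samples = [
    date(2024, 6, 5),
    datetime(2024, 6, 5, 10, 0),
    (datetime(2024, 7, 3), datetime(2024, 7, 5)),
]
labels = [describe(item) for item in samples]
```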

Test suite

To run the test suite, run make test from the project root. This invokes pytest on the unit tests in the tests/ directory. Install the required dependencies first.

License

This project is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later).

See LICENSE for details.

Authors

See AUTHORS.md for the list of contributors.
