Python-native temporal expression parser and normalizer

Timenorm-Py

A Python-native temporal expression parser and normalizer based on the timenorm library.

Overview

Timenorm-Py finds and normalizes temporal expressions in natural language text using a neural parser built on the SCATE schema (Semantically Compositional Annotation of Time Expressions).

Example:

from timenorm import TemporalParser
import datetime

parser = TemporalParser()
text = "I saw her last week and will meet her next Tuesday."
anchor = datetime.datetime(2024, 11, 15)

results = parser.parse(text, anchor)
# With a trained detection model loaded, returns temporal expressions with normalized intervals:
# - "last week" → Interval(2024-11-08, 2024-11-15)
# - "next Tuesday" → Interval(2024-11-19, 2024-11-20)

Features

  • 🧠 Neural Parser: Character-level RNN for accurate temporal expression identification
  • 🔧 Compositional Operators: Build complex temporal expressions from simple operators (Last, Next, This, Before, After, etc.)
  • 📅 Python-Native: Built on the standard-library datetime module and python-dateutil for seamless integration
  • Well-Tested: Comprehensive test suite matching the original Scala implementation

Installation

From GitHub

# Clone the repository
git clone https://github.com/dadhichgaurav1/temporalextractor-timenorm-py.git
cd temporalextractor-timenorm-py

# Install in development mode
pip install -e .

From PyPI

pip install timenorm-py

Requirements

  • Python 3.10+
  • python-dateutil
  • tensorflow (optional, for neural network inference)

Quick Start

Simple Parsing

from timenorm import TemporalParser, Interval

# Create parser
parser = TemporalParser()

# Parse with anchor time (document creation time)
anchor = Interval.of(2024, 11, 19)
text = "I saw her last week and will meet her next Tuesday."

# Note: Requires TensorFlow model for detection
# Currently returns empty without model, but infrastructure is ready
results = parser.parse(text, anchor=anchor)

Using Direct Temporal Algebra

from timenorm import Interval, Period, Last, Next, DAY, WEEK, MONTH
import datetime

# Create intervals
anchor = Interval.of(2024, 11, 19)
year_2024 = Interval.of(2024)
march_15 = Interval.of(2024, 3, 15)

# Period arithmetic
three_months = Period(MONTH, 3)
start = datetime.datetime(2024, 1, 1)
interval = start + three_months  # January 1 + 3 months = April 1

# Temporal operators
last_week = Last(anchor, Period(DAY, 7))
print(f"Last week: {last_week.start} to {last_week.end}")

next_weeks = Next(anchor, Period(WEEK, 3))
print(f"Next 3 weeks: {next_weeks.start} to {next_weeks.end}")

Parsing from XML (Anafora Format)

from timenorm import TemporalParser, Interval

parser = TemporalParser()
anchor = Interval.of(2024, 11, 19)

# Parse from Anafora XML file
results = parser.parse_xml("annotations.xml", anchor=anchor)

for expr in results:
    print(f"{expr}: {expr.start} to {expr.end}")
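To make the input format concrete, here is a minimal sketch of reading entity annotations from an Anafora-style XML document with the standard library. It does not use or describe the internals of `parse_xml`. The element names (`entity`, `id`, `span`, `type`, `properties`) follow the usual Anafora layout, but the sample document and the `read_entities` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample in the usual Anafora layout.
SAMPLE = """<data>
  <annotations>
    <entity>
      <id>1@e@sample@gold</id>
      <span>0,4</span>
      <type>Year</type>
      <properties><Value>2024</Value></properties>
    </entity>
  </annotations>
</data>"""


def read_entities(xml_text):
    """Collect id, type, character span, and properties of each entity.

    Only contiguous spans of the form "start,end" are handled here.
    """
    root = ET.fromstring(xml_text)
    entities = []
    for entity in root.iter("entity"):
        start, end = (int(n) for n in entity.findtext("span").split(","))
        props_node = entity.find("properties")
        props = {}
        if props_node is not None:
            props = {child.tag: (child.text or "") for child in props_node}
        entities.append(
            {
                "id": entity.findtext("id"),
                "type": entity.findtext("type"),
                "span": (start, end),
                "properties": props,
            }
        )
    return entities


for item in read_entities(SAMPLE):
    print(item["type"], item["span"], item["properties"])
```

Each entity carries a character span into the source text plus typed properties, which is the information a compositional normalizer needs to build operators such as Last or Next.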

Batch Processing

from timenorm import TemporalParser, Interval

parser = TemporalParser()
text = "Monday meeting. Tuesday lunch. Wednesday presentation."
# Half-open character offsets (start inclusive, end exclusive) of each sentence
spans = [(0, 15), (16, 30), (31, 54)]
anchor = Interval.of(2024, 11, 19)

results = parser.parse_batch(text, spans, anchor=anchor)
for i, batch_result in enumerate(results):
    print(f"Batch {i+1}: {batch_result}")
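Hand-counted character offsets are easy to get wrong, so it can help to derive the spans from the text itself. The sketch below uses only the standard library. `sentence_spans` is a hypothetical helper, not part of timenorm, and it assumes that spans are half-open `(start, end)` character offsets, as in Python slicing.

```python
import re
from typing import List, Tuple


def sentence_spans(text: str) -> List[Tuple[int, int]]:
    """Return half-open (start, end) offsets of naive sentence segments.

    A segment starts at a non-space character and runs through the next
    '.', '!' or '?'. This is a rough splitter, good enough for short inputs.
    """
    return [m.span() for m in re.finditer(r"[^.!?\s][^.!?]*[.!?]", text)]


text = "Monday meeting. Tuesday lunch. Wednesday presentation."
spans = sentence_spans(text)
print(spans)  # [(0, 15), (16, 30), (31, 54)]
print([text[start:end] for start, end in spans])
```

Slicing the text with each span is a quick sanity check that the offsets cover exactly the intended segment.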

Core Concepts

Intervals

Temporal intervals on the timeline with start (inclusive) and end (exclusive) points:

from timenorm import Interval

# Year 2024
year = Interval.of(2024)  # 2024-01-01 to 2025-01-01

# Specific day
day = Interval.of(2024, 11, 15)  # 2024-11-15 to 2024-11-16
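The half-open convention can be illustrated with plain `datetime` values. This is a minimal sketch, not the library's `Interval` class: `interval_of` and `contains` are hypothetical helpers that mirror the two examples above.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def interval_of(
    year: int, month: Optional[int] = None, day: Optional[int] = None
) -> Tuple[datetime, datetime]:
    """Return a half-open [start, end) pair covering a year, a month, or a day."""
    if month is None:
        return datetime(year, 1, 1), datetime(year + 1, 1, 1)
    if day is None:
        start = datetime(year, month, 1)
        # December rolls over into January of the following year.
        end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
        return start, end
    start = datetime(year, month, day)
    return start, start + timedelta(days=1)


def contains(interval: Tuple[datetime, datetime], moment: datetime) -> bool:
    """The start belongs to the interval, the end does not."""
    start, end = interval
    return start <= moment < end


day = interval_of(2024, 11, 15)
print(contains(day, datetime(2024, 11, 15, 23, 59)))  # True
print(contains(day, datetime(2024, 11, 16)))          # False
```

Because the end is exclusive, adjacent intervals such as two consecutive days share a boundary without overlapping.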

Periods

Amounts of time independent of the timeline:

from timenorm import Period, MONTH, WEEK

three_months = Period(MONTH, 3)
two_weeks = Period(WEEK, 2)
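"Independent of the timeline" means a period such as three months has no fixed length in days until it is attached to a starting point. The sketch below shows this with a hypothetical `add_months` helper built on the standard library. It is not the library's `Period` implementation, and the choice to clamp to the last day of a shorter month is an assumption (python-dateutil's `relativedelta` clamps the same way).

```python
import calendar
from datetime import datetime


def add_months(moment: datetime, months: int) -> datetime:
    """Shift a datetime by whole calendar months, clamping the day if needed."""
    index = moment.year * 12 + (moment.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


# The same three-month period spans a different number of days
# depending on where it is anchored.
winter = add_months(datetime(2024, 1, 1), 3) - datetime(2024, 1, 1)
summer = add_months(datetime(2024, 7, 1), 3) - datetime(2024, 7, 1)
print(winter.days, summer.days)  # 91 92
```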

Operators

Compositional operators for building complex temporal expressions:

from timenorm import Interval, Last, Next, Period, Repeating, DAY, WEEK

anchor = Interval.of(2024, 11, 15)

# "last 7 days"
last_week = Last(anchor, Period(DAY, 7))

# "next Tuesday"
next_tuesday = Next(anchor, Repeating(DAY, WEEK, value=1))  # Tuesday = 1
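As a rough illustration of what these operators compute, the sketch below reproduces the results from the Overview example with plain `datetime` arithmetic. The helper functions are hypothetical and are not the library's classes. The semantics (Last ends where the anchor starts, Next begins where the anchor ends) are an assumption inferred from that example.

```python
from datetime import datetime, timedelta
from typing import Tuple

Span = Tuple[datetime, datetime]  # half-open [start, end)


def last_period(anchor: Span, length: timedelta) -> Span:
    """The stretch of the given length that ends where the anchor starts."""
    start, _ = anchor
    return start - length, start


def next_period(anchor: Span, length: timedelta) -> Span:
    """The stretch of the given length that begins where the anchor ends."""
    _, end = anchor
    return end, end + length


def next_weekday(anchor: Span, weekday: int) -> Span:
    """First whole day with the given weekday (Monday = 0) after the anchor."""
    _, end = anchor
    day = datetime(end.year, end.month, end.day)
    if day < end:
        # The anchor ends mid-day, so start looking from the following midnight.
        day += timedelta(days=1)
    day += timedelta(days=(weekday - day.weekday()) % 7)
    return day, day + timedelta(days=1)


anchor = (datetime(2024, 11, 15), datetime(2024, 11, 16))
print(last_period(anchor, timedelta(days=7)))  # 2024-11-08 to 2024-11-15
print(next_weekday(anchor, 1))                 # 2024-11-19 to 2024-11-20
```

Each function takes an interval and returns an interval, so the results can be fed into further operators.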

Requirements

  • Python >= 3.10
  • TensorFlow >= 2.12 (optional, for neural network inference)
  • python-dateutil >= 2.8

Credits

This is a Python-native reimplementation of timenorm, originally developed by Steven Bethard, Egoitz Laparra, and Dongfang Xu at the University of Arizona's Computational Language Understanding Lab (CLU Lab).

Original Authors

  • Steven Bethard - University of Arizona
  • Egoitz Laparra - University of Arizona
  • Dongfang Xu - University of Arizona

Research Papers

The temporal expression normalization approach implemented in this library is based on:

  1. Laparra, E., Xu, D., & Bethard, S. (2018). From Characters to Time Intervals: New Paradigms for Evaluation and Neural Parsing of Time Normalizations. Transactions of the Association for Computational Linguistics, 6, 343-356.

  2. Xu, D., Laparra, E., & Bethard, S. (2019). Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: A Detailed Analysis. Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics.

Acknowledgments

This Python implementation:

  • Maintains API compatibility with the original SCATE component
  • Uses the same compositional semantics for temporal expressions
  • Follows the architectural patterns from the original Scala implementation
  • Includes resources (vocabularies, labels, schemas) from the original project

License

Both the original timenorm and this Python implementation are licensed under the Apache License 2.0. See the LICENSE file for details.

Contributing

Contributions are welcome! This project aims to maintain compatibility with the original timenorm while providing a pure-Python implementation.

Areas for contribution:

  • Neural network model integration
  • Additional language support
  • Performance optimizations
  • Documentation improvements

Citation

If you use this library in research, please cite the original papers:

@article{laparra2018characters,
  title={From Characters to Time Intervals: New Paradigms for Evaluation and Neural Parsing of Time Normalizations},
  author={Laparra, Egoitz and Xu, Dongfang and Bethard, Steven},
  journal={Transactions of the Association for Computational Linguistics},
  volume={6},
  pages={343--356},
  year={2018}
}

@inproceedings{xu2019pre,
  title={Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: A Detailed Analysis},
  author={Xu, Dongfang and Laparra, Egoitz and Bethard, Steven},
  booktitle={Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics},
  year={2019}
}

Contact

For questions about this Python implementation, please open an issue on GitHub.

For questions about the original timenorm project, see https://github.com/clulab/timenorm

Download files

Source Distribution

timenorm_py-0.1.0.tar.gz (40.2 kB)

Uploaded Source

Built Distribution

timenorm_py-0.1.0-py3-none-any.whl (37.0 kB)

Uploaded Python 3

File details

Details for the file timenorm_py-0.1.0.tar.gz.

File metadata

  • Download URL: timenorm_py-0.1.0.tar.gz
  • Upload date:
  • Size: 40.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for timenorm_py-0.1.0.tar.gz
Algorithm Hash digest
SHA256 61ac44350170c98a29c06849d68c000d6cdc247590b270ab2440f4339f8b01c3
MD5 102c80ab069c60a11d22dbcac9287882
BLAKE2b-256 12fff04acabfcc425ea32be0b2ffe66187327e951dd3d2cdfe8be14c16f2b420

File details

Details for the file timenorm_py-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: timenorm_py-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 37.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.0

File hashes

Hashes for timenorm_py-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 08d6dfc21ed2a71bc4bcc0f6d241916e16af9d9ca4c64e18fdce9c0b9eb750a7
MD5 8778709c8f513a37dcdfe7cd43193446
BLAKE2b-256 693a622521891d6e489b42e8775cab747c025b4d17a762c328e5728e25c26065
