Python-native temporal expression parser and normalizer
Timenorm-Py
A Python-native temporal expression parser and normalizer based on the timenorm library.
Overview
Timenorm-Py finds and normalizes temporal expressions in natural language text using a neural parser built on the SCATE schema (Semantically Compositional Annotation of Time Expressions).
Example:
from timenorm import TemporalParser
import datetime
parser = TemporalParser()
text = "I saw her last week and will meet her next Tuesday."
anchor = datetime.datetime(2024, 11, 15)
results = parser.parse(text, anchor)
# Intended output (detection requires the TensorFlow model; see Quick Start):
# - "last week" → Interval(2024-11-08, 2024-11-15)
# - "next Tuesday" → Interval(2024-11-19, 2024-11-20)
Features
- 🧠 Neural Parser: Character-level RNN for accurate temporal expression identification
- 🔧 Compositional Operators: Build complex temporal expressions from simple operators (Last, Next, This, Before, After, etc.)
- 📅 Python-Native: Built with Python's datetime and dateutil for seamless integration
- ✅ Well-Tested: Comprehensive test suite matching the original Scala implementation
Installation
From GitHub
# Clone the repository
git clone https://github.com/dadhichgaurav1/temporalextractor-timenorm-py.git
cd temporalextractor-timenorm-py
# Install in development mode
pip install -e .
From PyPI
pip install timenorm-py
Requirements
- Python 3.10+
- python-dateutil
- tensorflow (optional, for neural network inference)
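Because TensorFlow is optional, it can help to check what is available before relying on neural detection. The helper below is a minimal sketch using only the standard library; `environment_report` is a hypothetical name and is not part of timenorm.

```python
import importlib.util
import sys


def environment_report(min_python=(3, 10)):
    """Report whether the interpreter and the listed dependencies are importable.

    Hypothetical helper for illustration; it is not provided by timenorm.
    """
    return {
        # Tuple comparison: (3, 11, ...) >= (3, 10) is True.
        "python_ok": sys.version_info >= min_python,
        # find_spec returns None when a top-level package is not installed.
        "dateutil": importlib.util.find_spec("dateutil") is not None,
        "tensorflow": importlib.util.find_spec("tensorflow") is not None,
    }


report = environment_report()
if not report["tensorflow"]:
    print("TensorFlow not found: neural detection will be unavailable.")
```

The temporal algebra (intervals, periods, operators) should not need TensorFlow, so a missing entry here only affects text detection.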
Quick Start
Simple Parsing
from timenorm import TemporalParser, Interval
# Create parser
parser = TemporalParser()
# Parse with anchor time (document creation time)
anchor = Interval.of(2024, 11, 19)
text = "I saw her last week and will meet her next Tuesday."
# Note: detection requires the TensorFlow model.
# Without a model, parse() currently returns no results; the surrounding infrastructure is in place.
results = parser.parse(text, anchor=anchor)
Using Direct Temporal Algebra
from timenorm import Interval, Period, Last, Next, DAY, WEEK, MONTH
import datetime
# Create intervals
anchor = Interval.of(2024, 11, 19)
year_2024 = Interval.of(2024)
march_15 = Interval.of(2024, 3, 15)
# Period arithmetic
three_months = Period(MONTH, 3)
start = datetime.datetime(2024, 1, 1)
interval = start + three_months # January 1 + 3 months = April 1
# Temporal operators
last_week = Last(anchor, Period(DAY, 7))
print(f"Last week: {last_week.start} to {last_week.end}")
next_weeks = Next(anchor, Period(WEEK, 3))
print(f"Next 3 weeks: {next_weeks.start} to {next_weeks.end}")
Parsing from XML (Anafora Format)
from timenorm import TemporalParser, Interval
parser = TemporalParser()
anchor = Interval.of(2024, 11, 19)
# Parse from Anafora XML file
results = parser.parse_xml("annotations.xml", anchor=anchor)
for expr in results:
    print(f"{expr}: {expr.start} to {expr.end}")
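To show what an Anafora file roughly contains, here is a minimal sketch that reads entities with the standard library. The sample XML is hand-written for illustration, the `read_entities` helper is hypothetical, and this is not how `parse_xml` is implemented; real annotation files carry more fields.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hand-written sample for illustration only; not taken from a real annotation file.
TEXT = "I saw her last week."
SAMPLE_XML = """
<data>
  <annotations>
    <entity>
      <id>1@e@sample@gold</id>
      <span>10,14</span>
      <type>Last</type>
      <properties><Semantics>Standard</Semantics></properties>
    </entity>
    <entity>
      <id>2@e@sample@gold</id>
      <span>15,19</span>
      <type>Calendar-Interval</type>
      <properties><Type>Week</Type></properties>
    </entity>
  </annotations>
</data>
"""


def read_entities(xml_text: str) -> List[Dict]:
    """Collect the id, type, character spans, and properties of every <entity>."""
    entities = []
    for node in ET.fromstring(xml_text.strip()).iter("entity"):
        # A span is "start,end"; discontinuous spans are assumed to be ";"-separated.
        spans = [
            tuple(int(n) for n in part.split(","))
            for part in (node.findtext("span") or "").split(";")
            if part.strip()
        ]
        props_node = node.find("properties")
        props = {} if props_node is None else {p.tag: (p.text or "") for p in props_node}
        entities.append({
            "id": node.findtext("id"),
            "type": node.findtext("type"),
            "spans": spans,
            "properties": props,
        })
    return entities


entities = read_entities(SAMPLE_XML)
for entity in entities:
    start, end = entity["spans"][0]
    print(entity["type"], repr(TEXT[start:end]))
```

Each entity points back into the source text by character offsets, which is what lets the operators be composed over the original wording.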
Batch Processing
from timenorm import TemporalParser, Interval
parser = TemporalParser()
text = "Monday meeting. Tuesday lunch. Wednesday presentation."
# (start, end) character offsets, one per sentence; an exclusive end is assumed here
spans = [(0, 15), (16, 30), (31, 54)]
anchor = Interval.of(2024, 11, 19)
results = parser.parse_batch(text, spans, anchor=anchor)
for i, batch_result in enumerate(results):
    print(f"Batch {i+1}: {batch_result}")
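Hand-counting character offsets is error-prone, so it can be safer to derive spans from the text. The sketch below assumes spans are `(start, end)` offsets with an exclusive end, as in Python slicing; the source does not state the convention, and `sentence_spans` is a hypothetical helper, not part of timenorm.

```python
import re
from typing import List, Tuple


def sentence_spans(text: str) -> List[Tuple[int, int]]:
    """Return (start, end) offsets, end exclusive, for each run of text ending in a period."""
    # Start at a non-space, non-period character and extend through the next period.
    return [match.span() for match in re.finditer(r"[^.\s][^.]*\.", text)]


text = "Monday meeting. Tuesday lunch. Wednesday presentation."
spans = sentence_spans(text)

# Slicing with each span shows exactly which characters it covers.
for start, end in spans:
    print((start, end), repr(text[start:end]))
```

Checking `text[start:end]` for every span before calling `parse_batch` catches off-by-one offsets early.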
Core Concepts
Intervals
Temporal intervals on the timeline with start (inclusive) and end (exclusive) points:
from timenorm import Interval
# Year 2024
year = Interval.of(2024) # 2024-01-01 to 2025-01-01
# Specific day
day = Interval.of(2024, 11, 15) # 2024-11-15 to 2024-11-16
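The half-open convention can be illustrated with a small stand-in class. This is a sketch built on the standard library only; `SketchInterval` is a hypothetical name and does not reproduce timenorm's `Interval` implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class SketchInterval:
    """Half-open interval [start, end); a stand-in, not timenorm's Interval."""
    start: datetime
    end: datetime

    @classmethod
    def of(cls, year: int, month: Optional[int] = None,
           day: Optional[int] = None) -> "SketchInterval":
        if month is None:
            # Whole year: ends at the first instant of the following year.
            return cls(datetime(year, 1, 1), datetime(year + 1, 1, 1))
        if day is None:
            # Whole month: December rolls over into January of the next year.
            nxt = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
            return cls(datetime(year, month, 1), nxt)
        start = datetime(year, month, day)
        return cls(start, start + timedelta(days=1))

    def __contains__(self, moment: datetime) -> bool:
        # Start is inclusive, end is exclusive.
        return self.start <= moment < self.end


year = SketchInterval.of(2024)
day = SketchInterval.of(2024, 11, 15)
print(datetime(2024, 11, 15, 23, 59) in day)  # inside the day
print(datetime(2024, 11, 16) in day)          # the end point itself is excluded
```

An exclusive end means adjacent intervals (one day and the next) share a boundary without overlapping.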
Periods
Amounts of time independent of the timeline:
from timenorm import Period, MONTH, WEEK
three_months = Period(MONTH, 3)
two_weeks = Period(WEEK, 2)
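A period only becomes a concrete span once it is applied to a point on the timeline, and calendar months make that step non-trivial because their lengths differ. The following is a minimal stdlib sketch of month arithmetic; `add_months` is a hypothetical helper, and the day-clamping rule is an assumption rather than timenorm's documented behavior.

```python
import calendar
from datetime import datetime


def add_months(moment: datetime, months: int) -> datetime:
    """Shift by whole calendar months, clamping the day to the target month's length."""
    # Work in a zero-based month index so year rollover is a single divmod.
    index = moment.year * 12 + (moment.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


start = datetime(2024, 1, 1)
print(add_months(start, 3))                    # three months after January 1
print(add_months(datetime(2024, 1, 31), 1))    # day is clamped in a shorter month
```

Weeks and days, by contrast, have fixed lengths and can be added with a plain `datetime.timedelta`.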
Operators
Compositional operators for building complex temporal expressions:
from timenorm import Interval, Last, Next, Period, Repeating, DAY, WEEK
anchor = Interval.of(2024, 11, 15)
# "last 7 days"
last_week = Last(anchor, Period(DAY, 7))
# "next Tuesday"
next_tuesday = Next(anchor, Repeating(DAY, WEEK, value=1)) # Tuesday = 1
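The semantics of these two operators can be sketched with plain `datetime` values. The functions below are hypothetical stand-ins that assume "last" ends where the anchor starts and "next" begins at or after the anchor's end; they are not timenorm's `Last` and `Next`, and the library's edge-case behavior may differ.

```python
from datetime import datetime, timedelta
from typing import Tuple

Span = Tuple[datetime, datetime]  # half-open (start, end)


def last_days(anchor: Span, days: int) -> Span:
    """The `days` days that end where the anchor starts."""
    return anchor[0] - timedelta(days=days), anchor[0]


def next_weekday(anchor: Span, weekday: int) -> Span:
    """First whole day with the given weekday (Monday=0) starting at or after the anchor's end."""
    end = anchor[1]
    day = datetime(end.year, end.month, end.day)
    if day < end:
        # Anchor ends partway through a day, so begin from the next midnight.
        day += timedelta(days=1)
    day += timedelta(days=(weekday - day.weekday()) % 7)
    return day, day + timedelta(days=1)


anchor = (datetime(2024, 11, 15), datetime(2024, 11, 16))
last_week = last_days(anchor, 7)
next_tuesday = next_weekday(anchor, 1)  # Tuesday = 1, matching datetime.weekday()
print(last_week)
print(next_tuesday)
```

With the anchor on 2024-11-15, these reproduce the intervals shown in the Overview example for "last week" and "next Tuesday".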
Requirements
- Python >= 3.10
- TensorFlow >= 2.12 (optional, for neural network inference)
- python-dateutil >= 2.8
Credits
This is a Python-native reimplementation of timenorm, originally developed by Steven Bethard, Egoitz Laparra, and Dongfang Xu at the University of Arizona's Computational Language Understanding Lab (CLU Lab).
Original Authors
- Steven Bethard - University of Arizona
- Egoitz Laparra - University of Arizona
- Dongfang Xu - University of Arizona
Research Papers
The temporal expression normalization approach implemented in this library is based on:
- Laparra, E., Xu, D., & Bethard, S. (2018). From Characters to Time Intervals: New Paradigms for Evaluation and Neural Parsing of Time Normalizations. Transactions of the Association for Computational Linguistics, 6, 343-356.
- Xu, D., Laparra, E., & Bethard, S. (2019). Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: A Detailed Analysis. Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics.
Acknowledgments
This Python implementation:
- Maintains API compatibility with the original SCATE component
- Uses the same compositional semantics for temporal expressions
- Follows the architectural patterns from the original Scala implementation
- Includes resources (vocabularies, labels, schemas) from the original project
License
Both the original timenorm and this Python implementation are licensed under the Apache License 2.0. See the LICENSE file for details.
Contributing
Contributions are welcome! This project aims to maintain compatibility with the original timenorm while providing a pure-Python implementation.
Areas for contribution:
- Neural network model integration
- Additional language support
- Performance optimizations
- Documentation improvements
Citation
If you use this library in research, please cite the original papers:
@article{laparra2018characters,
title={From Characters to Time Intervals: New Paradigms for Evaluation and Neural Parsing of Time Normalizations},
author={Laparra, Egoitz and Xu, Dongfang and Bethard, Steven},
journal={Transactions of the Association for Computational Linguistics},
volume={6},
pages={343--356},
year={2018}
}
@inproceedings{xu2019pre,
title={Pre-trained Contextualized Character Embeddings Lead to Major Improvements in Time Normalization: A Detailed Analysis},
author={Xu, Dongfang and Laparra, Egoitz and Bethard, Steven},
booktitle={Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics},
year={2019}
}
Contact
For questions about this Python implementation, please open an issue on GitHub.
For questions about the original timenorm project, see https://github.com/clulab/timenorm