library for working with uncertain, fuzzy, or partially unknown dates and date intervals

undate

undate is a Python library for working with uncertain or partially known dates.

Note: This is beta software; it is still in development and not fully feature complete. If you use it, please let us know and share your feedback.

Currently undate supports parsing, formatting, and reasoning with dates in varying precision and calendars; dates with different precision and from different original calendars can be used together. Supported formats include:

  • portions of EDTF (Extended Date Time Format)
  • ISO8601
  • parsing and calendar conversion for dates in Hebrew Anno Mundi and Islamic Hijri calendars
  • Gregorian dates with full or abbreviated month names in any order for multiple languages (English, Spanish, French, German, Kinyarwanda, Ganda, Tigrinya)
  • Christian liturgical dates (fixed holidays and movable feasts)

For unambiguous dates, there is an experimental omnibus parser which combines all available date parsers (bare years are currently assumed to be in the Gregorian calendar).

For more about the origin and goals of undate, read our 2025 software paper:

Rebecca Sutton Koeser, Julia Damerow, Robert Casties, and Cole Crawford. “Undate: Humanistic Dates for Computation.” Computational Humanities Research, August 5, 2025.


Project documentation is available on ReadTheDocs.

Installation

Recommended: use pip to install the latest published version from PyPI:

pip install undate

To install a development version or a specific tag or branch, you can install from GitHub. Use the @name notation to specify the branch or tag; e.g., to install the development version:

pip install git+https://github.com/dh-tech/undate-python@develop#egg=undate

Example Usage

Often humanities and cultural data include imprecise or uncertain temporal information. We want to store that information but also work with it in a structured way, not just treat it as text for display. Different projects may need to work with or convert between different date formats or even different calendars.

An undate.Undate is analogous to Python’s built-in datetime.date object, but with support for varying degrees of precision and unknown information. You can initialize an Undate with either strings or numbers for whichever parts of the date are known or partially known. An Undate can take an optional label.

from undate import Undate

november7 = Undate(2000, 11, 7)
november = Undate(2000, 11)
year2k = Undate(2000)
november7_some_year = Undate(month=11, day=7)

partially_known_year = Undate("19XX")
partially_known_month = Undate(2022, "1X")

easter1916 = Undate(1916, 4, 23, label="Easter 1916")

You can convert an Undate to string using a date formatter (current default is ISO8601):

>>> [str(d) for d in [november7, november, year2k, november7_some_year]]
['2000-11-07', '2000-11', '2000', '--11-07']

If enough information is known, an Undate object can report on its duration:

>>> december = Undate(2000, 12)
>>> feb_leapyear = Undate(2024, 2)
>>> feb_regularyear = Undate(2023, 2)
>>> for d in [november7, november, december, year2k, november7_some_year, feb_regularyear, feb_leapyear]:
...    print(f"{d}  - duration in days: {d.duration().days}")
...
2000-11-07  - duration in days: 1
2000-11  - duration in days: 30
2000-12  - duration in days: 31
2000  - duration in days: 366
--11-07  - duration in days: 1
2023-02  - duration in days: 28
2024-02  - duration in days: 29
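
The durations above follow from ordinary calendar arithmetic. As a rough illustration using only the standard library (this is not undate's implementation, and `duration_in_days` is a hypothetical helper that assumes the year is known):

```python
import calendar
from typing import Optional


def duration_in_days(year: int, month: Optional[int] = None, day: Optional[int] = None) -> int:
    """Number of days covered by a date known to year, month, or day precision."""
    if month is None:
        # Year precision: the whole year, 366 days in a leap year.
        return 366 if calendar.isleap(year) else 365
    if day is None:
        # Month precision: monthrange returns (weekday of the 1st, days in month).
        return calendar.monthrange(year, month)[1]
    # Day precision: a single day.
    return 1


print(duration_in_days(2000, 11))  # 30
print(duration_in_days(2024, 2))   # 29
print(duration_in_days(2000))      # 366
```

The sketch does not cover dates with an unknown year such as `--11-07`; it only mirrors the cases where year, month, and day precision are enough to count days.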

If enough of the date is known and the precision supports it, you can check if one date falls within another date:

>>> november7 = Undate(2000, 11, 7)
>>> november2000 = Undate(2000, 11)
>>> year2k = Undate(2000)
>>> ad100 = Undate(100)
>>> november7 in november2000
True
>>> november2000 in year2k
True
>>> november7 in year2k
True
>>> november2000 in ad100
False
>>> november7 in ad100
False
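
One way to picture these containment checks is to treat every date as a range from its earliest to its latest possible day. The following is a minimal sketch under that assumption, using only the standard library; `date_range` and `falls_within` are hypothetical names, not part of undate, and the library's own logic may differ:

```python
import calendar
from datetime import date
from typing import Optional, Tuple

DateRange = Tuple[date, date]


def date_range(year: int, month: Optional[int] = None, day: Optional[int] = None) -> DateRange:
    """Earliest and latest calendar day consistent with the known parts."""
    first_month, last_month = (month, month) if month else (1, 12)
    first_day = day if day else 1
    # When the day is unknown, the latest day is the last day of the month.
    last_day = day if day else calendar.monthrange(year, last_month)[1]
    return date(year, first_month, first_day), date(year, last_month, last_day)


def falls_within(inner: DateRange, outer: DateRange) -> bool:
    """True when the inner range lies entirely inside the outer range."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


print(falls_within(date_range(2000, 11, 7), date_range(2000, 11)))  # True
print(falls_within(date_range(2000, 11), date_range(100)))          # False
```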

For dates that are imprecise or partially known, undate calculates the earliest and latest possible dates for comparison purposes, so you can sort dates and compare them with equals, greater than, and less than. You can also compare with Python datetime.date objects.

>>> november7_2020 = Undate(2020, 11, 7)
>>> november_2001 = Undate(2001, 11)
>>> year2k = Undate(2000)
>>> ad100 = Undate(100)
>>> sorted([november7_2020, november_2001, year2k, ad100])
[undate.Undate(year=100, calendar="Gregorian"), undate.Undate(year=2000, calendar="Gregorian"), undate.Undate(year=2001, month=11, calendar="Gregorian"), undate.Undate(year=2020, month=11, day=7, calendar="Gregorian")]
>>> november7_2020 > november_2001
True
>>> year2k < ad100
False
>>> from datetime import date
>>> year2k > date(2001, 1, 1)
False
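
To make the idea of earliest and latest possible values concrete for partially known input such as `"19XX"` or a month of `"1X"`, here is a small standard-library sketch. It assumes unknown digits are written as `X`; `year_bounds` and `month_bounds` are hypothetical helpers for illustration and do not reflect how undate computes its bounds:

```python
from typing import Tuple


def year_bounds(pattern: str) -> Tuple[int, int]:
    """Smallest and largest year matching a pattern such as "19XX"."""
    earliest = int(pattern.replace("X", "0"))
    latest = int(pattern.replace("X", "9"))
    return earliest, latest


def month_bounds(pattern: str) -> Tuple[int, int]:
    """Same idea for months, clamped to the valid range 1 to 12."""
    earliest = max(1, int(pattern.replace("X", "0")))
    latest = min(12, int(pattern.replace("X", "9")))
    return earliest, latest


print(year_bounds("19XX"))   # (1900, 1999)
print(month_bounds("1X"))    # (10, 12)

# Sort by earliest possible year, breaking ties on the latest possible year.
print(sorted(["2000", "19XX", "198X", "1XXX"], key=year_bounds))
# ['1XXX', '19XX', '198X', '2000']
```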

When dates cannot be compared due to ambiguity or precision, comparison methods raise a NotImplementedError.

>>> november_2020 = Undate(2020, 11)
>>> november7_2020 > november_2020
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/rkoeser/workarea/github/undate-python/src/undate/undate.py", line 262, in __gt__
    return not (self < other or self == other)
  File "/Users/rkoeser/workarea/github/undate-python/src/undate/undate.py", line 245, in __lt__
    raise NotImplementedError(
NotImplementedError: Can't compare when one date falls within the other
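
The refusal to compare can be illustrated with plain (earliest, latest) pairs: when one range sits inside the other, neither "before" nor "after" is a meaningful answer. This is a simplified sketch, not the library's code; `is_before` is a hypothetical function, and partially overlapping ranges are simply reported as not before:

```python
from datetime import date
from typing import Tuple

DateRange = Tuple[date, date]


def is_before(a: DateRange, b: DateRange) -> bool:
    """Strict ordering of two (earliest, latest) ranges."""
    a_in_b = b[0] <= a[0] and a[1] <= b[1]
    b_in_a = a[0] <= b[0] and b[1] <= a[1]
    if a != b and (a_in_b or b_in_a):
        # One date falls within the other, so no ordering is defined.
        raise NotImplementedError("Can't compare when one date falls within the other")
    # Otherwise a is before b only if it ends before b starts.
    return a[1] < b[0]


november7_2020 = (date(2020, 11, 7), date(2020, 11, 7))
november_2020 = (date(2020, 11, 1), date(2020, 11, 30))
november_2001 = (date(2001, 11, 1), date(2001, 11, 30))

print(is_before(november_2001, november7_2020))  # True
try:
    is_before(november7_2020, november_2020)
except NotImplementedError as err:
    print(f"not comparable: {err}")
```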

An UndateInterval is a date range between two Undate objects. Intervals can be open-ended, allow for optional labels, and can calculate duration if enough information is known. UndateIntervals are inclusive (i.e., a closed interval), and include both the earliest and latest date as part of the range.

>>> from undate import UndateInterval
>>> UndateInterval(Undate(1900), Undate(2000))
undate.UndateInterval(earliest=undate.Undate(year=1900, calendar="Gregorian"), latest=undate.Undate(year=2000, calendar="Gregorian"))
>>> UndateInterval(Undate(1801), Undate(1900), label="19th century")
undate.UndateInterval(earliest=undate.Undate(year=1801, calendar="Gregorian"), latest=undate.Undate(year=1900, calendar="Gregorian"), label="19th century")
>>> UndateInterval(Undate(1801), Undate(1900), label="19th century").duration().days
36524
>>> UndateInterval(Undate(1901), Undate(2000), label="20th century")
undate.UndateInterval(earliest=undate.Undate(year=1901, calendar="Gregorian"), latest=undate.Undate(year=2000, calendar="Gregorian"), label="20th century")
>>> UndateInterval(latest=Undate(2000))  # before 2000
undate.UndateInterval(latest=undate.Undate(year=2000, calendar="Gregorian"))
>>> UndateInterval(Undate(1900))  # after 1900
undate.UndateInterval(earliest=undate.Undate(year=1900, calendar="Gregorian"))
>>> UndateInterval(Undate(1900), Undate(2000)).duration().days
36890
>>> UndateInterval(Undate(2000, 1, 1), Undate(2000, 1,31)).duration().days
31
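
Because intervals are closed, both endpoints count toward the duration, which is why the figures above are one day larger than a plain date subtraction. A minimal standard-library check of that arithmetic (`inclusive_days` is a hypothetical helper, and the year-precision endpoints are expanded by hand to January 1 and December 31):

```python
from datetime import date


def inclusive_days(earliest: date, latest: date) -> int:
    """Days in a closed interval: both the first and the last day count."""
    return (latest - earliest).days + 1


print(inclusive_days(date(2000, 1, 1), date(2000, 1, 31)))   # 31
print(inclusive_days(date(1801, 1, 1), date(1900, 12, 31)))  # 36524
print(inclusive_days(date(1900, 1, 1), date(2000, 12, 31)))  # 36890
```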

You can initialize Undate or UndateInterval objects by parsing a date string with a specific converter, and you can also output an Undate object in those formats. Currently available converters are "ISO8601" and "EDTF", along with converters for the supported calendars.

>>> from undate import Undate
>>> Undate.parse("2002", "ISO8601")
undate.Undate(year=2002, calendar="Gregorian")
>>> Undate.parse("2002-05", "EDTF")
undate.Undate(year=2002, month=5, calendar="Gregorian")
>>> Undate.parse("--05-03", "ISO8601")
undate.Undate(month=5, day=3, calendar="Gregorian")
>>> Undate.parse("--05-03", "ISO8601").format("EDTF")
'XXXX-05-03'
>>> Undate.parse("1800/1900", format="EDTF")
undate.UndateInterval(earliest=undate.Undate(year=1800, calendar="Gregorian"), latest=undate.Undate(year=1900, calendar="Gregorian"))
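
To show roughly what such a conversion involves, the sketch below parses the ISO 8601 forms used above into (year, month, day) parts, with `None` for anything unknown, and writes them back out in EDTF style with `XXXX` for an unknown year. It is a toy illustration built on a regular expression; the names `ISO_PARTIAL`, `parse_iso_partial`, and `format_edtf` are hypothetical, and undate's own converters are not implemented this way:

```python
import re
from typing import Optional, Tuple

Parts = Tuple[Optional[int], Optional[int], Optional[int]]

# Matches "2002", "2002-05", "2002-05-03", and the year-less "--05-03".
ISO_PARTIAL = re.compile(
    r"^(?:(?P<year>\d{4})|-(?=-))(?:-(?P<month>\d{2}))?(?:-(?P<day>\d{2}))?$"
)


def parse_iso_partial(text: str) -> Parts:
    """Split a partial ISO 8601 date into (year, month, day), None if unknown."""
    match = ISO_PARTIAL.match(text)
    if match is None:
        raise ValueError(f"unsupported date string: {text!r}")
    year, month, day = match.group("year", "month", "day")
    return (
        int(year) if year else None,
        int(month) if month else None,
        int(day) if day else None,
    )


def format_edtf(parts: Parts) -> str:
    """Render parts in EDTF style, using XXXX for an unknown year."""
    year, month, day = parts
    pieces = ["XXXX" if year is None else f"{year:04d}"]
    if month is not None:
        pieces.append(f"{month:02d}")
    if day is not None:
        pieces.append(f"{day:02d}")
    return "-".join(pieces)


print(parse_iso_partial("2002-05"))               # (2002, 5, None)
print(format_edtf(parse_iso_partial("--05-03")))  # XXXX-05-03
```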

Calendars

All Undate objects are calendar aware, and date converters include support for parsing and working with dates from other calendars. The Gregorian calendar is used by default; currently undate supports the Islamic Hijri calendar and the Hebrew Anno Mundi calendar based on calendar conversion logic implemented in the convertdate package.

Dates are stored with the year, month, day, and appropriate precision for the original calendar; internally, earliest and latest dates are calculated in the Gregorian / proleptic Gregorian calendar for standardized comparison across dates from different calendars.

>>> from undate import Undate
>>> tammuz4816 = Undate.parse("26 Tammuz 4816", "Hebrew")
>>> tammuz4816
undate.Undate(year=4816, month=4, day=26, label="26 Tammuz 4816 Anno Mundi", calendar="Hebrew")
>>> rajab495 = Undate.parse("Rajab 495", "Islamic")
>>> rajab495
undate.Undate(year=495, month=7, label="Rajab 495 Islamic", calendar="Islamic")
>>> year2001 = Undate.parse("2001", "EDTF")
>>> year2001
undate.Undate(year=2001, calendar="Gregorian")
>>> [str(d.earliest) for d in [rajab495, tammuz4816, year2001]]
['1102-04-28', '1056-07-17', '2001-01-01']
>>> [str(d.precision) for d in [rajab495, tammuz4816, year2001]]
['MONTH', 'DAY', 'YEAR']
>>> sorted([rajab495, tammuz4816, year2001])
[undate.Undate(year=4816, month=4, day=26, label="26 Tammuz 4816 Anno Mundi", calendar="Hebrew"), undate.Undate(year=495, month=7, label="Rajab 495 Islamic", calendar="Islamic"), undate.Undate(year=2001, calendar="Gregorian")]

For more examples, refer to the code notebooks included in the examples directory in this repository.

Documentation

Project documentation is available on ReadTheDocs.

For instructions on setting up for local development, see Developer Notes.

See Contributors for more detailed information about contributors.

License

This software is licensed under the Apache 2.0 License.
