
library for working with uncertain, fuzzy, or partially unknown dates and date intervals

undate

undate is a Python library for working with uncertain or partially known dates.

Warning: This is beta software and is not yet feature complete! Use with caution and give us feedback. Currently undate supports parsing and formatting dates in ISO8601, some portions of EDTF (Extended Date Time Format), and parsing and conversion for dates in Hebrew Anno Mundi and Islamic Hijri calendars.

Undate was initially created as part of a DH-Tech hackathon in November 2022.


Read Contributors for detailed contribution information.

Installation

Recommended: use pip to install the latest published version from PyPI:

pip install undate

To install a development version or a specific tag or branch, you can install from GitHub. Use the @name notation to specify the branch or tag; e.g., to install the development version:

pip install git+https://github.com/dh-tech/undate-python@develop#egg=undate

Example Usage

Often humanities and cultural data include imprecise or uncertain temporal information. We want to store that information but also work with it in a structured way, not just treat it as text for display. Different projects may need to work with or convert between different date formats or even different calendars.

An undate.Undate is analogous to Python’s built-in datetime.date object, but with support for varying degrees of precision and unknown information. You can initialize an Undate with either strings or numbers for whichever parts of the date are known or partially known. An Undate can take an optional label.

from undate import Undate

november7 = Undate(2000, 11, 7)
november = Undate(2000, 11)
year2k = Undate(2000)
november7_some_year = Undate(month=11, day=7)

partially_known_year = Undate("19XX")
partially_known_month = Undate(2022, "1X")

easter1916 = Undate(1916, 4, 23, label="Easter 1916")
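
To make the idea of a partially known value such as "19XX" or "1X" concrete, here is a minimal standard-library sketch of how the smallest and largest possible values could be derived. The helper name `possible_range` is hypothetical and is not part of undate; this illustrates the concept and is not the library's implementation.

```python
from typing import Tuple


def possible_range(value: str, missing: str = "X") -> Tuple[int, int]:
    """Return the smallest and largest integers a partially known value could be.

    Hypothetical helper for illustration only; not part of the undate API.
    """
    lowest = int(value.replace(missing, "0"))   # every unknown digit at its minimum
    highest = int(value.replace(missing, "9"))  # every unknown digit at its maximum
    return lowest, highest


# A year written as "19XX" could be anything from 1900 through 1999.
year_range = possible_range("19XX")

# A month written as "1X" must also be a valid month number, so clamp to 1..12.
low, high = possible_range("1X")
month_range = (max(low, 1), min(high, 12))

print(year_range, month_range)
```

Under these assumptions, "19XX" covers the years 1900 to 1999 and "1X" covers the months October through December.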

You can convert an Undate to a string using a date formatter (current default is ISO8601):

>>> [str(d) for d in [november7, november, year2k, november7_some_year]]
['2000-11-07', '2000-11', '2000', '--11-07']

If enough information is known, an Undate object can report on its duration:

>>> december = Undate(2000, 12)
>>> feb_leapyear = Undate(2024, 2)
>>> feb_regularyear = Undate(2023, 2)
>>> for d in [november7, november, december, year2k, november7_some_year, feb_regularyear, feb_leapyear]:
...    print(f"{d}  - duration in days: {d.duration().days}")
...
2000-11-07  - duration in days: 1
2000-11  - duration in days: 30
2000-12  - duration in days: 31
2000  - duration in days: 366
--11-07  - duration in days: 1
2023-02  - duration in days: 28
2024-02  - duration in days: 29
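
The durations above follow from the precision of each date: a day lasts one day, a month lasts as many days as that month has in that year, and a year lasts 365 or 366 days. A rough standard-library sketch of that reasoning follows; the function `duration_days` is a hypothetical name, it assumes the year is always known, and it is not how undate itself is implemented.

```python
import calendar
from typing import Optional


def duration_days(year: int, month: Optional[int] = None, day: Optional[int] = None) -> int:
    """Number of days covered by a date with year, month, or day precision.

    Hypothetical helper for illustration; assumes the year is known.
    """
    if day is not None:
        return 1  # day precision: a single day
    if month is not None:
        # monthrange returns (weekday of first day, number of days in month)
        return calendar.monthrange(year, month)[1]
    return 366 if calendar.isleap(year) else 365  # year precision


for args in [(2000, 11, 7), (2000, 11), (2000, 12), (2000,), (2023, 2), (2024, 2)]:
    print(args, duration_days(*args))
```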

If enough of the date is known and the precision supports it, you can check if one date falls within another date:

>>> november7 = Undate(2000, 11, 7)
>>> november2000 = Undate(2000, 11)
>>> year2k = Undate(2000)
>>> ad100 = Undate(100)
>>> november7 in november2000
True
>>> november2000 in year2k
True
>>> november7 in year2k
True
>>> november2000 in ad100
False
>>> november7 in ad100
False
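
One way to picture these containment checks is to expand each date to the earliest and latest calendar day it could represent, then test whether one range sits inside the other. The sketch below uses only the standard library; `date_bounds` and `falls_within` are hypothetical names, and the exact rules undate applies (for example, for dates with identical ranges) may differ.

```python
import calendar
from datetime import date
from typing import Optional, Tuple


def date_bounds(year: int, month: Optional[int] = None, day: Optional[int] = None) -> Tuple[date, date]:
    """Earliest and latest possible days for a date with a known year (illustrative helper)."""
    first_month, last_month = (month, month) if month else (1, 12)
    first_day = day if day else 1
    last_day = day if day else calendar.monthrange(year, last_month)[1]
    return date(year, first_month, first_day), date(year, last_month, last_day)


def falls_within(inner: Tuple[date, date], outer: Tuple[date, date]) -> bool:
    """True when the inner range lies entirely inside the outer range."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


november7 = date_bounds(2000, 11, 7)
november2000 = date_bounds(2000, 11)
year2k = date_bounds(2000)
ad100 = date_bounds(100)

print(falls_within(november7, november2000))  # True
print(falls_within(november2000, year2k))     # True
print(falls_within(november7, ad100))         # False
```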

For dates that are imprecise or partially known, undate calculates earliest and latest possible dates for comparison purposes so you can sort dates and compare with equals, greater than, and less than. You can also compare with Python datetime.date objects.

>>> november7_2020 = Undate(2020, 11, 7)
>>> november_2001 = Undate(2001, 11)
>>> year2k = Undate(2000)
>>> ad100 = Undate(100)
>>> sorted([november7_2020, november_2001, year2k, ad100])
[undate.Undate(year=100, calendar="Gregorian"), undate.Undate(year=2000, calendar="Gregorian"), undate.Undate(year=2001, month=11, calendar="Gregorian"), undate.Undate(year=2020, month=11, day=7, calendar="Gregorian")]
>>> november7_2020 > november_2001
True
>>> year2k < ad100
False
>>> from datetime import date
>>> year2k > date(2001, 1, 1)
False

When dates cannot be compared due to ambiguity or precision, comparison methods raise a NotImplementedError.

>>> november_2020 = Undate(2020, 11)
>>> november7_2020 > november_2020
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/rkoeser/workarea/github/undate-python/src/undate/undate.py", line 262, in __gt__
    return not (self < other or self == other)
  File "/Users/rkoeser/workarea/github/undate-python/src/undate/undate.py", line 245, in __lt__
    raise NotImplementedError(
NotImplementedError: Can't compare when one date falls within the other
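
The reason such a comparison is refused can be sketched with plain date ranges: when two ranges do not overlap, their order is clear, but when they overlap there is no single correct answer. The following is a simplified illustration with a hypothetical `is_before` function; it raises on any overlap, which is an assumption of this sketch and may be stricter than undate's own comparison logic.

```python
from datetime import date
from typing import Tuple

DateRange = Tuple[date, date]  # (earliest possible day, latest possible day)


def is_before(a: DateRange, b: DateRange) -> bool:
    """Order two date ranges, refusing to guess when they overlap (illustrative only)."""
    if a == b:
        return False  # identical ranges: neither comes before the other
    if a[1] < b[0]:
        return True   # a ends before b starts
    if a[0] > b[1]:
        return False  # a starts after b ends
    raise NotImplementedError("ranges overlap, so the order is ambiguous")


november7_2020 = (date(2020, 11, 7), date(2020, 11, 7))
november_2001 = (date(2001, 11, 1), date(2001, 11, 30))
november_2020 = (date(2020, 11, 1), date(2020, 11, 30))

print(is_before(november_2001, november7_2020))  # True

try:
    is_before(november7_2020, november_2020)
except NotImplementedError as err:
    print(f"cannot compare: {err}")
```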

An UndateInterval is a date range between two Undate objects. Intervals can be open-ended, allow for optional labels, and can calculate duration if enough information is known. UndateIntervals are inclusive (i.e., a closed interval), and include both the earliest and latest date as part of the range.

>>> from undate import UndateInterval
>>> UndateInterval(Undate(1900), Undate(2000))
undate.UndateInterval(earliest=undate.Undate(year=1900, calendar="Gregorian"), latest=undate.Undate(year=2000, calendar="Gregorian"))
>>> UndateInterval(Undate(1801), Undate(1900), label="19th century")
undate.UndateInterval(earliest=undate.Undate(year=1801, calendar="Gregorian"), latest=undate.Undate(year=1900, calendar="Gregorian"), label="19th century")
>>> UndateInterval(Undate(1801), Undate(1900), label="19th century").duration().days
36524
>>> UndateInterval(Undate(1901), Undate(2000), label="20th century")
undate.UndateInterval(earliest=undate.Undate(year=1901, calendar="Gregorian"), latest=undate.Undate(year=2000, calendar="Gregorian"), label="20th century")
>>> UndateInterval(latest=Undate(2000))  # before 2000
undate.UndateInterval(latest=undate.Undate(year=2000, calendar="Gregorian"))
>>> UndateInterval(Undate(1900))  # after 1900
undate.UndateInterval(earliest=undate.Undate(year=1900, calendar="Gregorian"))
>>> UndateInterval(Undate(1900), Undate(2000)).duration().days
36890
>>> UndateInterval(Undate(2000, 1, 1), Undate(2000, 1, 31)).duration().days
31
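
Because intervals are inclusive, the day counts above are one more than a plain date subtraction would give. A small standard-library check of that arithmetic, using a hypothetical helper `inclusive_days` and taking each year's first and last day as the interval bounds:

```python
from datetime import date


def inclusive_days(earliest: date, latest: date) -> int:
    """Days in a closed interval: both endpoints count (illustrative helper)."""
    return (latest - earliest).days + 1


# January 2000: subtraction gives 30, the inclusive count is 31.
print(inclusive_days(date(2000, 1, 1), date(2000, 1, 31)))

# 1801 through 1900: 100 years containing 24 leap days (1900 is not a leap year).
print(inclusive_days(date(1801, 1, 1), date(1900, 12, 31)))

# 1900 through 2000: 101 years containing 25 leap days.
print(inclusive_days(date(1900, 1, 1), date(2000, 12, 31)))
```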

You can initialize Undate or UndateInterval objects by parsing a date string with a specific converter, and you can also output an Undate object in those formats. Currently available converters are "ISO8601", "EDTF", and converters for the supported calendars.

>>> from undate import Undate
>>> Undate.parse("2002", "ISO8601")
undate.Undate(year=2002, calendar="Gregorian")
>>> Undate.parse("2002-05", "EDTF")
undate.Undate(year=2002, month=5, calendar="Gregorian")
>>> Undate.parse("--05-03", "ISO8601")
undate.Undate(month=5, day=3, calendar="Gregorian")
>>> Undate.parse("--05-03", "ISO8601").format("EDTF")
'XXXX-05-03'
>>> Undate.parse("1800/1900", format="EDTF")
undate.UndateInterval(earliest=undate.Undate(year=1800, calendar="Gregorian"), latest=undate.Undate(year=1900, calendar="Gregorian"))
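
To show roughly what a converter does, here is a toy parser and formatter written with the standard library only. It handles just the three ISO8601 shapes used above (year, year-month, and month-day with an unknown year) and writes them back in EDTF style; the names `parse_iso_parts` and `format_edtf` are hypothetical, and undate's real converters are more complete than this sketch.

```python
import re
from typing import Dict, Optional

# Year is four digits, or a lone "-" when the year is omitted (as in "--05-03").
ISO_PATTERN = re.compile(
    r"^(?P<year>\d{4}|-)(?:-(?P<month>\d{2}))?(?:-(?P<day>\d{2}))?$"
)


def parse_iso_parts(text: str) -> Dict[str, Optional[int]]:
    """Split a simple ISO8601 date string into year, month, and day (None if absent)."""
    match = ISO_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unsupported date string: {text!r}")
    return {
        name: int(value) if value and value != "-" else None
        for name, value in match.groupdict().items()
    }


def format_edtf(parts: Dict[str, Optional[int]]) -> str:
    """Write the parts in EDTF style, using XXXX for an unknown year."""
    year = parts["year"]
    pieces = [f"{year:04d}" if year is not None else "XXXX"]
    if parts["month"] is not None:
        pieces.append(f"{parts['month']:02d}")
        if parts["day"] is not None:
            pieces.append(f"{parts['day']:02d}")
    return "-".join(pieces)


print(parse_iso_parts("2002-05"))
print(format_edtf(parse_iso_parts("--05-03")))
```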

Calendars

All Undate objects are calendar aware, and date converters include support for parsing and working with dates from other calendars. The Gregorian calendar is used by default. Currently undate also supports the Islamic Hijri calendar and the Hebrew Anno Mundi calendar, based on calendar conversion logic implemented in the convertdate package.

Dates are stored with the year, month, day and appropriate precision for the original calendar; internally, earliest and latest dates are calculated in Gregorian / Proleptic Gregorian calendar for standardized comparison across dates from different calendars.

>>> from undate import Undate
>>> tammuz4816 = Undate.parse("26 Tammuz 4816", "Hebrew")
>>> tammuz4816
undate.Undate(year=4816, month=4, day=26, label="26 Tammuz 4816 Anno Mundi", calendar="Hebrew")
>>> rajab495 = Undate.parse("Rajab 495", "Islamic")
>>> rajab495
undate.Undate(year=495, month=7, label="Rajab 495 Islamic", calendar="Islamic")
>>> year2001 = Undate.parse("2001", "EDTF")
>>> year2001
undate.Undate(year=2001, calendar="Gregorian")
>>> [str(d.earliest) for d in [rajab495, tammuz4816, year2001]]
['1102-04-28', '1056-07-17', '2001-01-01']
>>> [str(d.precision) for d in [rajab495, tammuz4816, year2001]]
['MONTH', 'DAY', 'YEAR']
>>> sorted([rajab495, tammuz4816, year2001])
[undate.Undate(year=4816, month=4, day=26, label="26 Tammuz 4816 Anno Mundi", calendar="Hebrew"), undate.Undate(year=495, month=7, label="Rajab 495 Islamic", calendar="Islamic"), undate.Undate(year=2001, calendar="Gregorian")]
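
The sort order above follows from the Gregorian earliest dates, not from the year numbers in each original calendar. The sketch below reproduces that ordering with plain datetime.date values copied from the earliest dates printed above; the dictionary and its labels are illustrative only, and no calendar conversion is performed here.

```python
from datetime import date

# Earliest Gregorian dates as reported by the example output above.
earliest_gregorian = {
    "Rajab 495 (Islamic)": date(1102, 4, 28),
    "26 Tammuz 4816 (Hebrew)": date(1056, 7, 17),
    "2001 (Gregorian)": date(2001, 1, 1),
}

# Sorting on the shared Gregorian date puts dates from different calendars in one order.
ordered = sorted(earliest_gregorian, key=earliest_gregorian.get)
print(ordered)
```

Comparing the raw year numbers (4816, 495, 2001) would give a different and misleading order, which is why a common reference calendar is needed.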

For more examples, refer to the code notebooks included in the examples directory in this repository.

Documentation

Project documentation is available on ReadTheDocs.

For instructions on setting up for local development, see Developer Notes.

License

This software is licensed under the Apache 2.0 License.
