Infer plausible time zones for a time series dataset based on Daylight Saving Time (DST) switches

tz-canary - Time Zone Canary

In a perfect world, all time series data is time-zone-aware and stored in UTC. Sadly, we do not live in a perfect world. Time series data often lacks a time zone identifier, or worse, does not actually adhere to the time zone it claims to be in.

tz-canary inspects the Daylight Saving Time (DST) switches in a time series to work out which time zones the data could plausibly be in. You can use it in two ways: to infer the full set of plausible time zones for the data, or to validate whether one given time zone is plausible.
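
To make the idea of a "DST switch in the data" concrete, here is a minimal standard-library sketch. It is not tz-canary's implementation; the helper names `record_wall_times` and `irregular_steps` are made up for illustration. It simulates hourly readings logged as naive local wall-clock times and shows that a series recorded in Europe/Amsterdam skips an hour at the spring switch, while the same instants logged in UTC do not.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def record_wall_times(start_utc, hours, tz):
    """Simulate hourly readings logged as naive local wall-clock times."""
    return [
        (start_utc + timedelta(hours=h)).astimezone(tz).replace(tzinfo=None)
        for h in range(hours)
    ]


def irregular_steps(naive_times, expected=timedelta(hours=1)):
    """Return consecutive timestamp pairs whose spacing is not `expected`."""
    return [(a, b) for a, b in zip(naive_times, naive_times[1:]) if b - a != expected]


# Six hourly readings around the 2023 spring switch in the Netherlands.
start = datetime(2023, 3, 25, 22, 0, tzinfo=timezone.utc)
amsterdam_times = record_wall_times(start, 6, ZoneInfo("Europe/Amsterdam"))
utc_times = record_wall_times(start, 6, timezone.utc)

print(irregular_steps(amsterdam_times))  # one jump: 01:00 to 03:00 on 2023-03-26
print(irregular_steps(utc_times))  # empty list: UTC has no DST switch
```

The date and direction of such jumps differ between regions, which is what makes them usable as a fingerprint for the time zone.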

Installation

tz-canary is available on PyPI, so you can install it just like any other Python package:

pip install tz-canary

Usage

Time zone validation

The simplest way to use tz-canary is to validate a given time zone for a time series:

import pandas as pd
from tz_canary import validate_time_zone

df = pd.read_csv("docs/data/example_data.csv", index_col="datetime", parse_dates=True)

validate_time_zone(df.index, "Europe/Amsterdam")  # will pass
validate_time_zone(df.index, "America/New_York")  # will raise ImplausibleTimeZoneError
validate_time_zone(df.index, "UTC")  # will raise ImplausibleTimeZoneError
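
As a rough illustration of why a zone can be implausible, the sketch below checks one possible signal using only the standard library: a wall-clock time that never occurred in the candidate zone because it falls in a spring-forward gap. This is an assumption about one kind of evidence, not a description of how validate_time_zone works internally, and `wall_time_exists` is a hypothetical helper. It also cannot explain the UTC case above, since UTC has no gaps at all.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def wall_time_exists(naive_time, tz):
    """Return True if `naive_time` is a wall-clock time that occurred in `tz`.

    A time inside a spring-forward gap does not survive a round trip through
    UTC: converting it to UTC and back lands on a different wall-clock time.
    """
    round_trip = naive_time.replace(tzinfo=tz).astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) == naive_time


# 02:00 on 2023-03-12 was skipped in New York but was an ordinary hour in Amsterdam.
reading = datetime(2023, 3, 12, 2, 0)
print(wall_time_exists(reading, ZoneInfo("America/New_York")))  # False
print(wall_time_exists(reading, ZoneInfo("Europe/Amsterdam")))  # True
```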

Time zone inference

You can also get the set of all plausible time zones for a time series:

from pprint import pprint

import pandas as pd
from tz_canary import infer_time_zone

df = pd.read_csv("docs/data/example_data.csv", index_col="datetime", parse_dates=True)

plausible_time_zones = infer_time_zone(df.index)
pprint(plausible_time_zones)

# Output:
# {zoneinfo.ZoneInfo(key='Africa/Ceuta'),
#  zoneinfo.ZoneInfo(key='Arctic/Longyearbyen'),
#  zoneinfo.ZoneInfo(key='Europe/Amsterdam'),
#  ...
#  zoneinfo.ZoneInfo(key='Europe/Zurich')}
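
The output contains many zones because every zone that shares the same DST rules over the data's time span is equally consistent with the data. The following brute-force sketch illustrates that effect with the standard library only. It is a simplified stand-in, not tz-canary's algorithm: it assumes a gap-free, regularly sampled series, and the helpers `simulate_wall_times` and `plausible_zones` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def simulate_wall_times(first_naive, count, step, tz):
    """Wall-clock times a regular series would show if it were recorded in `tz`."""
    start_utc = first_naive.replace(tzinfo=tz).astimezone(timezone.utc)
    return [
        (start_utc + i * step).astimezone(tz).replace(tzinfo=None)
        for i in range(count)
    ]


def plausible_zones(naive_times, step, candidates):
    """Keep the candidate zones that reproduce the observed timestamps exactly."""
    return {
        name
        for name in candidates
        if simulate_wall_times(naive_times[0], len(naive_times), step, ZoneInfo(name))
        == naive_times
    }


# Observed data: six hourly readings recorded in Amsterdam across the 2023 spring switch.
hour = timedelta(hours=1)
observed = simulate_wall_times(
    datetime(2023, 3, 25, 23, 0), 6, hour, ZoneInfo("Europe/Amsterdam")
)

candidates = ["Europe/Amsterdam", "Europe/Zurich", "America/New_York", "UTC"]
print(plausible_zones(observed, hour, candidates))
```

Europe/Amsterdam and Europe/Zurich both survive because they switched at the same moment in 2023; the data alone cannot tell them apart.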

Advanced usage: inference with cached TransitionsData

When processing many time series, it can be useful to cache the transitions data that tz-canary uses to infer time zones. To do this, create a TransitionsData object once and pass it to infer_time_zone (this also works for validate_time_zone):

import pandas as pd

from tz_canary import TransitionsData, infer_time_zone

# Create the TransitionsData object once so the transitions are not recomputed on
#   every call to infer_time_zone.
transition_data = TransitionsData(2010, 2023)

for i in range(10):
    df = pd.read_csv(
        "docs/data/example_data.csv",  # In reality, these would be different files
        index_col="datetime",
        parse_dates=True,
    )
    plausible_time_zones = infer_time_zone(df.index, transition_data=transition_data)
    print(i, plausible_time_zones)
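
The benefit of computing transitions once can be sketched with the standard library alone. The example below is not the TransitionsData implementation; it is a hypothetical `transitions` function that finds UTC-offset changes by scanning a year range hour by hour, wrapped in `functools.lru_cache` so repeated lookups for the same zone and years are served from memory.

```python
from datetime import datetime, timedelta, timezone
from functools import lru_cache
from zoneinfo import ZoneInfo


@lru_cache(maxsize=None)
def transitions(tz_name, year_start, year_end):
    """Find UTC-offset changes in `tz_name` by scanning the years hour by hour.

    Returns a tuple of (UTC instant of the change, change in offset).
    """
    tz = ZoneInfo(tz_name)
    instant = datetime(year_start, 1, 1, tzinfo=timezone.utc)
    end = datetime(year_end + 1, 1, 1, tzinfo=timezone.utc)
    previous = instant.astimezone(tz).utcoffset()
    found = []
    while instant < end:
        instant += timedelta(hours=1)
        current = instant.astimezone(tz).utcoffset()
        if current != previous:
            found.append((instant, current - previous))
            previous = current
    return tuple(found)


first = transitions("Europe/Amsterdam", 2023, 2023)  # scans the whole year
second = transitions("Europe/Amsterdam", 2023, 2023)  # answered from the cache
for when, change in first:
    print(when.isoformat(), change)
print(transitions.cache_info().hits)  # 1
```

An hourly scan is slow compared with reading the zone's transition table directly, which is exactly why paying for it only once matters when many series are processed in a loop.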

Development

  1. Make sure you have git, git LFS, and Poetry installed.
  2. Clone this repository:
    git clone https://github.com/leonoverweel/tz-canary
    cd tz-canary
    
  3. Install the development requirements:
    poetry install --with dev
    
  4. Install the pre-commit hooks (used for linting):
    pre-commit install
    
  5. Run the tests:
    poetry run pytest
    

Making a release

  1. Bump the version number in pyproject.toml and commit the change.
  2. Make a new release on GitHub.
  3. Build the package:
    poetry build
    
  4. Publish the package to PyPI:
    poetry publish
    

Contributing

Please don't hesitate to open issues and PRs!

GitHub repository: https://github.com/leonoverweel/tz-canary.
