Keeping track of aliases

Project description

A very small Python package for keeping track of aliases.

Installation

$ pip install aliases

Getting Started

Keeping track of aliases in your data can be annoying. This small package provides three small classes that help with the bookkeeping that aliases in your data require. It also provides pandas accessors that enforce aliases across a whole pandas Series or DataFrame at once.

The AliasSpace object keeps track of existing aliases. As input it accepts a dictionary in which each key (the “preferred” form) maps to a list of all its aliases. Using the str method on the space, we can transform regular strings into AliasAwareString objects.

>>> from aliases import AliasSpace  # assumption: the class is exported at package level
>>> s = AliasSpace(
...     {
...         "The Netherlands": ["NL", "Netherlands", "Holland"],
...         "The Hague": ["Den Haag", "'s-Gravenhage"],
...         "Amsterdam": ["Adam"],
...     },
...     case_sensitive=False,
... )
>>>
>>> s.str("nl")
<'nl' in AliasSpace>

The preferred form of an AliasAwareString is called its representative.

>>> s.str("nl").representative
'The Netherlands'
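
To make the idea of a representative concrete, here is a minimal, library-independent sketch of a case-insensitive alias lookup. The names `build_lookup` and `representative` are hypothetical and are not part of the aliases package; the sketch only assumes that every known form maps back to its preferred form.

```python
def build_lookup(aliases):
    """Map the case-folded spelling of every known form to its preferred form."""
    lookup = {}
    for preferred, alternatives in aliases.items():
        # The preferred form is also a valid spelling of itself.
        for form in (preferred, *alternatives):
            lookup[form.casefold()] = preferred
    return lookup


def representative(text, lookup):
    """Return the preferred form, or the text itself when it is unknown."""
    return lookup.get(text.casefold(), text)


lookup = build_lookup(
    {
        "The Netherlands": ["NL", "Netherlands", "Holland"],
        "The Hague": ["Den Haag", "'s-Gravenhage"],
        "Amsterdam": ["Adam"],
    }
)
print(representative("nl", lookup))  # The Netherlands
```

Falling back to the input for unknown strings is a design choice of this sketch, not a statement about how AliasSpace treats them.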

AliasAwareString objects with the same representative are considered equal and have the same hash.

>>> s.str("holland") == s.str("NL")
True
>>>
>>> data = {s.str("holland"): 12345}
>>> data[s.str("nl")]
12345
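
Equality and hashing by representative can be sketched in plain Python as follows. This is an illustration of the technique under stated assumptions, not the package's implementation: `AliasString` and `LOOKUP` are hypothetical names, and matching is assumed to be case-insensitive.

```python
# Hypothetical lookup table: case-folded alias -> preferred form.
LOOKUP = {
    "the netherlands": "The Netherlands",
    "nl": "The Netherlands",
    "netherlands": "The Netherlands",
    "holland": "The Netherlands",
}


class AliasString:
    """A string that compares and hashes by its preferred form."""

    def __init__(self, text, lookup=LOOKUP):
        self.text = text
        # Unknown strings act as their own representative in this sketch.
        self.representative = lookup.get(text.casefold(), text)

    def __eq__(self, other):
        if not isinstance(other, AliasString):
            return NotImplemented
        return self.representative == other.representative

    def __hash__(self):
        # Equal objects must have equal hashes, so hash the same key
        # that __eq__ compares.
        return hash(self.representative)


data = {AliasString("holland"): 12345}
print(data[AliasString("NL")])  # 12345
```

Because `__eq__` and `__hash__` use the same key, any two spellings of one entity land in the same dictionary slot.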

The example above already shows how alias-aware strings let you store data without worrying about which alias was used. However, casting to an AliasAwareString manually every time is still annoying. To solve this, you can use the AliasAwareDict, which is created using the dict method on the space.

>>> data = s.dict(holland=12345)
>>> data['nl']
12345
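
A dictionary that normalizes its keys on every access could look roughly like the sketch below. `AliasDict` and `LOOKUP` are hypothetical names used for illustration only; how AliasAwareDict is actually implemented is not documented here.

```python
from collections.abc import MutableMapping

# Hypothetical lookup table: case-folded alias -> preferred form.
LOOKUP = {
    "the netherlands": "The Netherlands",
    "nl": "The Netherlands",
    "netherlands": "The Netherlands",
    "holland": "The Netherlands",
}


class AliasDict(MutableMapping):
    """A mapping whose keys are stored under their preferred form."""

    def __init__(self, lookup, /, **items):
        self._lookup = lookup
        self._data = {}
        for key, value in items.items():
            self[key] = value  # goes through __setitem__, so keys are normalized

    def _key(self, key):
        # Unknown keys are stored unchanged in this sketch.
        return self._lookup.get(key.casefold(), key)

    def __getitem__(self, key):
        return self._data[self._key(key)]

    def __setitem__(self, key, value):
        self._data[self._key(key)] = value

    def __delitem__(self, key):
        del self._data[self._key(key)]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


data = AliasDict(LOOKUP, holland=12345)
print(data["nl"])  # 12345
```

Building on `MutableMapping` rather than subclassing `dict` means `update`, `get`, and `in` all route through the normalizing methods without extra overrides.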

Finally, when you have pandas installed, the aliases package registers accessors for Series and DataFrames. This allows you to easily enforce aliases in your pandas data. The following example was the original motivation for building this package:

>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         "Country": ["NL", "Netherlands", "Belgium"],
...         "City": ["Den Haag", "amsterdam", "Brussel"],
...         "SomeData": [10, 11, 12],
...     }
... )
>>> df
       Country       City  SomeData
0           NL   Den Haag        10
1  Netherlands  amsterdam        11
2      Belgium    Brussel        12
>>>
>>> df.Country.aliases.representative(space=s)
0    The Netherlands
1    The Netherlands
2            Belgium
Name: Country, dtype: object
>>>
>>> df.aliases.representative(space=s, missing=pd.NA)
           Country       City  SomeData
0  The Netherlands  The Hague        10
1  The Netherlands  Amsterdam        11
2             <NA>       <NA>        12
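
Judging only from the two outputs above, unknown values appear to be kept as-is by default and replaced when a `missing` value is passed, while non-string data is left alone. The pandas-free sketch below mimics that behaviour on a plain list; `to_representatives`, `LOOKUP`, and the `_KEEP` sentinel are hypothetical names, and this is not the accessor's actual code.

```python
# Hypothetical lookup table: case-folded alias -> preferred form.
LOOKUP = {
    "the netherlands": "The Netherlands",
    "nl": "The Netherlands",
    "netherlands": "The Netherlands",
    "holland": "The Netherlands",
}

_KEEP = object()  # sentinel meaning "leave unknown values unchanged"


def to_representatives(values, lookup, missing=_KEEP):
    """Replace each known string by its preferred form."""
    result = []
    for value in values:
        if not isinstance(value, str):
            result.append(value)  # non-string data passes through untouched
            continue
        key = value.casefold()
        if key in lookup:
            result.append(lookup[key])
        else:
            # Unknown string: keep it, or substitute the caller's marker.
            result.append(value if missing is _KEEP else missing)
    return result


countries = ["NL", "Netherlands", "Belgium"]
print(to_representatives(countries, LOOKUP))
print(to_representatives(countries, LOOKUP, missing=None))
```

A dedicated sentinel is used instead of `None` as the default so that callers can still pass `None` as an explicit missing marker.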

Documentation

Coming soon…
