dated-translator

A Python package that helps translate from one term to another, depending on a passed date, from a CSV that contains some verified information.

Getting started

Installation

You can install this package using pip:

$ pip install dated_translator

First lookup

Set up the lookup object first. In this case, we have a data_file.csv which contains (at least) four required columns: Term 1, Term 2, Start Date, and End Date. For a more advanced setup, see below.

lookup = Lookup(dataset="data_file.csv")

lookup.left_translate("Term 1", "1800-01-01") # Returns a list of the term 2 values whose start and end dates span the given date

lookup.right_translate("Term 2", "1800-01-01") # Returns a list of the term 1 values whose start and end dates span the given date
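As a rough illustration of the minimal file layout described above, the sketch below builds a tiny CSV with the four required column names and checks that none of them is missing. The sample rows, the ISO date format, and the missing_columns helper are assumptions made for this example; they are not part of the package.

```python
import csv
import io

# Column names taken from the description above.
REQUIRED_COLUMNS = ["Term 1", "Term 2", "Start Date", "End Date"]

# Hypothetical file contents; the ISO date format is an assumption.
SAMPLE_CSV = """Term 1,Term 2,Start Date,End Date
A,X,1800-01-01,1850-12-31
A,Y,1851-01-01,1900-12-31
"""


def missing_columns(csv_text):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [name for name in REQUIRED_COLUMNS if name not in header]


print(missing_columns(SAMPLE_CSV))  # []
```

Running a check like this before creating the Lookup object can catch a misspelled header early.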

Advanced setup

The following is a real-world example from the Living with Machines project. To test it yourself (after installing the package), clone this repository and check out the example/Example.ipynb notebook.

Say that we have a list of newspaper titles with different abbreviations, and we need to check which identification number (NLP) each abbreviation is associated with within a certain date range.

The file that we'd pass to the setup of the Lookup object, in this example called JISC-papers.csv, would look something like this:

| Newspaper Title | NLP | Normalised Title | Abbr | StartD | StartM | StartY | EndD | EndM | EndY |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Aberdeen Journal and general advertiser for the north of Scotland, The | 31 | Aberdeen Journal | ANJO | 1 | Jan | 1800 | 23 | Aug | 1876 |
| Aberdeen Weekly Journal and general advertiser for the north of Scotland | 32 | Aberdeen Journal | ANJO | 30 | Aug | 1876 | 31 | Dec | 1900 |

In this example, we want to get the resulting NLP 31 for any ANJO abbreviation (Abbr) between 1800-01-01 and 1876-08-23, and 32 for the same abbreviation between 1876-08-30 and 1900-12-31.

To set this up, we need to pass the dataset's name and specify the names of the lookup's term 1 (Abbr) and term 2 (NLP) columns. Note: You can assign either column to either term, but the assignment determines what the methods return further down the line: left_translate goes from term 1 to term 2, and right_translate goes from term 2 to term 1.

We also need to specify the date column format used in our file. Since we're not using the standard setup here (a single Start Date column and a single End Date column), we can pass a dictionary with three items, each mapping the name of the year, month, or day column to its date format code. We do so for both the start date and the end date:

lookup = Lookup(
    dataset="JISC-papers.csv",
    term_1_column="Abbr",
    term_2_column="NLP",
    start_date_column={
        "StartY": "%Y",
        "StartM": "%b",
        "StartD": "%d",
    },
    end_date_column={
        "EndY": "%Y",
        "EndM": "%b",
        "EndD": "%d",
    },
)
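To make the column-to-format mapping more concrete, here is a minimal sketch of how three separate columns and their format codes could be combined into a single date with the standard library. The build_date helper is hypothetical and is not necessarily how the package parses dates internally. It also assumes an English locale, since %b matches abbreviated month names such as "Jan" and "Aug".

```python
from datetime import datetime


def build_date(row, column_formats):
    """Combine several CSV columns into one date.

    column_formats maps a column name to its strptime format code,
    for example {"StartY": "%Y", "StartM": "%b", "StartD": "%d"}.
    """
    columns = list(column_formats)
    # Join the values and the format codes in the same order, then parse once.
    value = " ".join(str(row[column]).strip() for column in columns)
    fmt = " ".join(column_formats[column] for column in columns)
    return datetime.strptime(value, fmt).date()


# First row of the example file, reduced to its date columns.
row = {
    "StartD": "1", "StartM": "Jan", "StartY": "1800",
    "EndD": "23", "EndM": "Aug", "EndY": "1876",
}

start = build_date(row, {"StartY": "%Y", "StartM": "%b", "StartD": "%d"})
end = build_date(row, {"EndY": "%Y", "EndM": "%b", "EndD": "%d"})
print(start, end)  # 1800-01-01 1876-08-23
```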

After this setup, we can run the left_translate method to check what the NLP is for the abbreviation "ANJO" on the date 1800-01-01:

lookup.left_translate("ANJO", "1800-01-01")

This should return the value: [31], that is, a list of the possible NLPs for this abbreviation on this particular date.

Similarly, we can run the right_translate method to check what the Abbr is for a given NLP (31) on the date 1800-01-01:

lookup.right_translate(31, "1800-01-01")

The result should, in a reverse of the result above, be ['ANJO'], that is, a list of the possible abbreviations for this NLP on this particular date.
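The date-span logic behind both calls can be sketched in plain Python. This is an illustration only, not the package's implementation: the sketch_left_translate and sketch_right_translate names are hypothetical, the two rows are copied from the example file, and treating both the start and end dates as inclusive is an assumption.

```python
from datetime import date

# The two example rows, reduced to (Abbr, NLP, start date, end date).
ROWS = [
    ("ANJO", 31, date(1800, 1, 1), date(1876, 8, 23)),
    ("ANJO", 32, date(1876, 8, 30), date(1900, 12, 31)),
]


def translate(term, on_date, source_index, target_index, rows=ROWS):
    """Return target values for rows that match term and span on_date."""
    day = date.fromisoformat(on_date)
    return [
        row[target_index]
        for row in rows
        # Assumption: both ends of the span are inclusive.
        if row[source_index] == term and row[2] <= day <= row[3]
    ]


def sketch_left_translate(abbr, on_date):
    """Term 1 (Abbr) to term 2 (NLP)."""
    return translate(abbr, on_date, 0, 1)


def sketch_right_translate(nlp, on_date):
    """Term 2 (NLP) to term 1 (Abbr)."""
    return translate(nlp, on_date, 1, 0)


print(sketch_left_translate("ANJO", "1800-01-01"))   # [31]
print(sketch_left_translate("ANJO", "1880-06-15"))   # [32]
print(sketch_left_translate("ANJO", "1876-08-25"))   # []
print(sketch_right_translate(31, "1800-01-01"))      # ['ANJO']
```

The third call falls in the gap between the two rows (24 to 29 August 1876), so the sketch returns an empty list for it.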
