Correspondentia

Python library to map correspondence tables in different formats to data structures.

A quick example:

from correspondentia import match_fields

numbers_to_names = {
    1: [{"value": "one", "type": "exact"}],
    2: [{"value": "two", "weight": 0.5, "type": "disaggregation"},
        {"value": "deux", "weight": 0.5, "type": "disaggregation"}],
}

my_data = [{
    'count': 1,
    'name': 'foo'
}, {
    'count': 2,
    'name': 'bar'
}]

list(match_fields(my_data, numbers_to_names, "count"))
# Returns:
# [{'count': 'one', 'name': 'foo'},
#  {'count': 'two', 'name': 'bar', 'correspondentia_allocation': 0.5},
#  {'count': 'deux', 'name': 'bar', 'correspondentia_allocation': 0.5}]

match_fields returns a generator, which is why the example wraps the call in list() to materialize the results.
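To make the behaviour in the quick example concrete, here is a minimal pure-Python sketch that reproduces the documented output. It is not the library's implementation: the name match_fields_sketch is hypothetical, and the handling of rows without a correspondence (passed through unchanged) is an assumption that the documentation does not confirm.

```python
def match_fields_sketch(data, table, field):
    """Lazily yield copies of each row with `field` mapped through `table`.

    Illustrative only; not the correspondentia implementation.
    """
    for row in data:
        targets = table.get(row[field])
        if not targets:
            # Assumption: rows without a correspondence pass through unchanged.
            yield dict(row)
            continue
        for target in targets:
            new_row = dict(row)  # copy so the input row is never mutated
            new_row[field] = target["value"]
            if target["type"] == "disaggregation" and "weight" in target:
                # One output row per destination label, tagged with its share.
                new_row["correspondentia_allocation"] = target["weight"]
            yield new_row


numbers_to_names = {
    1: [{"value": "one", "type": "exact"}],
    2: [{"value": "two", "weight": 0.5, "type": "disaggregation"},
        {"value": "deux", "weight": 0.5, "type": "disaggregation"}],
}
my_data = [{"count": 1, "name": "foo"}, {"count": 2, "name": "bar"}]

matched = list(match_fields_sketch(my_data, numbers_to_names, "count"))
```

Because the function is a generator, nothing is computed until the result is iterated, so large inputs can be streamed rather than held in memory.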

Input data

Input data should be an iterable of objects supporting the dictionary interface.
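Any iterable of dict-like rows should qualify, including a lazy one. As a sketch, the standard library's csv.DictReader yields one mapping per row; the helper rows_with_int_count is hypothetical and exists only to convert the string values that CSV parsing produces into the integer labels used in the quick example.

```python
import csv
import io

# Stand-in for an open file; the column names mirror the quick example.
raw = io.StringIO("count,name\n1,foo\n2,bar\n")


def rows_with_int_count(reader):
    """Yield plain dicts, converting the 'count' column from str to int."""
    for row in reader:
        row = dict(row)
        row["count"] = int(row["count"])
        yield row


# A generator of dicts: an iterable of objects supporting the dictionary interface.
rows = rows_with_int_count(csv.DictReader(raw))
```

The type conversion matters because a correspondence table keyed by the integer 1 will not match the string "1".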

Input tables

correspondentia can currently import the following format:

  • CSVs following the simple schema

We plan to also eventually support the following:

  • RDF (Turtle) correspondence tables following the BONSAI spec
  • CSVs with BONSAI ontology predicates

You can also write custom importers, or define correspondence tables manually. In either case, the correspondence table data should include at least the following fields (additional fields are also allowed):

{
    "label in origin schema (usually str, but can be int or float)": {
        "value": "label in destination schema (usually str, but can be int or float)",
        "type": one of ["exact", "disaggregation"],
        "weight": float, # optional
    }
}
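When defining a table by hand, a small check can catch missing fields early. The sketch below is a hypothetical helper, not part of correspondentia. It assumes each origin label maps to either a single entry (as in the schema above) or a list of entries (as in the quick example), and accepts both.

```python
ALLOWED_TYPES = {"exact", "disaggregation"}


def validate_table(table):
    """Raise ValueError if any entry lacks a required field; return the table otherwise.

    Hypothetical helper for manually defined correspondence tables.
    """
    for origin, entries in table.items():
        # Assumption: a single dict is treated as a one-element list.
        if isinstance(entries, dict):
            entries = [entries]
        for entry in entries:
            if "value" not in entry:
                raise ValueError(f"{origin!r}: entry is missing 'value'")
            if entry.get("type") not in ALLOWED_TYPES:
                raise ValueError(f"{origin!r}: 'type' must be one of {sorted(ALLOWED_TYPES)}")
            weight = entry.get("weight")  # optional field
            if weight is not None and not isinstance(weight, (int, float)):
                raise ValueError(f"{origin!r}: 'weight' must be a number")
    return table
```

Additional fields are left untouched, in line with the note that extra fields are allowed.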

Simple CSV schema for input tables

A CSV file with two required columns and one optional column:

  • First column: Label in origin schema
  • Second column: Label in destination schema
  • Third column (optional): Weight used for disaggregation.

For 1-to-N or N-to-1 matches, use multiple rows that repeat the shared label.

CSVs should follow the Open Knowledge CSV spec. Do not use column headers.
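As an illustration of the simple schema, a custom importer could be sketched as follows. The function load_simple_csv is hypothetical, not the library's own importer. It also assumes that a row with a weight is a "disaggregation" and a row without one is "exact", which the schema does not state explicitly.

```python
import csv
import io
from collections import defaultdict


def load_simple_csv(fileobj):
    """Read a headerless two- or three-column CSV into a correspondence table."""
    table = defaultdict(list)
    for row in csv.reader(fileobj):
        if not row:
            continue  # skip blank lines
        origin, destination = row[0], row[1]
        entry = {"value": destination, "type": "exact"}
        if len(row) > 2 and row[2].strip():
            # Assumption: a weight in the third column marks a disaggregation.
            entry["type"] = "disaggregation"
            entry["weight"] = float(row[2])
        table[origin].append(entry)
    return dict(table)


# Repeating the origin label "2" expresses a 1-to-N match.
sample = io.StringIO("1,one\n2,two,0.5\n2,deux,0.5\n")
table = load_simple_csv(sample)
```

Labels read from a CSV are strings, so the keys here are "1" and "2" rather than integers; convert them if the input data uses numeric labels.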

Installation

Install via the normal pathways (for example, pip install correspondentia). The library currently has no dependencies.

Contributing

Follow standard fork/pull-request procedure.
