A tool that allows users to build and run highly configurable record linkage/entity resolution pipelines.

Project description

EasyLink is a tool that allows users to build and run highly configurable record linkage/entity resolution pipelines. Its configurability enables users to “mix and match” different pieces of record linkage software by ensuring that each piece of the pipeline conforms to standard patterns.

For example, users at the Census Bureau could easily evaluate whether using a more sophisticated “blocking” method would improve results in a certain pipeline, without having to rewrite the entire pipeline.

In its current state, EasyLink provides only one or two implementations for each step, does not yet have documentation to support users in creating their own implementations, and is not yet stable enough to be recommended as a tool for production pipelines.

Supported Python versions: 3.11, 3.12

Installation

NOTE: This package requires the AMD64 (x86-64) CPU architecture; it is not compatible with Apple’s ARM64 architecture (e.g. M1 and newer Macs).

There are a few things to install in order to use this package:

  • Set up Linux.

    Singularity (and thus EasyLink) requires Linux to run. If you are not already using Linux, you will need to set up a virtual machine; refer to the Singularity documentation for installing on Windows or Mac.

  • Install Singularity.

    First check if you already have Singularity installed by running the command singularity --version. For an existing installation, your Singularity version number is printed.

    If Singularity is not yet installed, you will need to install it; refer to the Singularity docs for installing on Linux.

    Note that this requires administrator privileges; you may need to request installation from your system admin if you are working in a shared computing environment.

  • Install conda.

    We recommend miniforge. You can check if you already have conda installed by running the command conda --version. For an existing installation, a version will be displayed.

  • Create a conda environment with Python, Graphviz, and GCC (pinned below version 14) installed, then activate it.

    $ conda create --name easylink -c conda-forge python=3.12 graphviz 'gcc<14' -y
    $ conda activate easylink

  • Install easylink in the environment.

    Option 1 - Install from PyPI with pip:

    $ pip install easylink

    Option 2 - Build from source with pip:

    $ pip install git+https://github.com/ihmeuw/easylink.git
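
Before working through the steps above, it can help to confirm that the machine meets the stated prerequisites. The following is a minimal sketch, not part of EasyLink: the helper names check_cmd and check_arch are hypothetical, and it assumes a POSIX shell where uname -m reports x86_64 (or amd64) on AMD64 hardware. It only reports what it finds and does not install anything.

```shell
#!/bin/sh
# Hypothetical pre-install check for the prerequisites listed above.
# Not part of EasyLink; helper names are illustrative only.

# Print "found" if the named command is on PATH, otherwise "missing".
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

# Classify a machine string as reported by `uname -m`.
# Assumption: AMD64 systems report x86_64 or amd64.
check_arch() {
    case "$1" in
        x86_64|amd64) echo "ok" ;;
        *) echo "unsupported" ;;
    esac
}

echo "OS:           $(uname -s) (Linux required)"
echo "Architecture: $(uname -m) -> $(check_arch "$(uname -m)")"
echo "singularity:  $(check_cmd singularity)"
echo "conda:        $(check_cmd conda)"
```

Any line reporting "missing" or "unsupported" points to the corresponding installation step. Once easylink is installed, running pip show easylink inside the activated environment should print the installed version.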

Documentation

You can view documentation at https://easylink.readthedocs.io/en/latest/

Download files

Source distribution: easylink-0.2.2.tar.gz (3.9 MB)

Built distribution: easylink-0.2.2-py3-none-any.whl (236.2 kB, Python 3)
