matchescu-base

This Python package provides common abstract data types (the adt package), utilities (the common package) and generic data extraction algorithms for entity resolution. Other packages build on these abstractions, including:

  • matchescu-reference-extraction: extracts entity references from data sources
  • matchescu-reference-stores: stores entity references efficiently
  • matchescu-comparison-space-generation: generates the comparison space used for matching or clustering
  • matchescu-matching: various methods of scoring the similarity of entity references
  • matchescu-clustering: various methods of scoring the colocation of entity references
  • matchescu-profile-assembly: algorithms used to build concrete entity profiles from specific data structures (tuples, lists or graphs)

On its own, the package can serve as a foundation for other structured approaches to entity resolution, particularly those based on the Resolvi reference architecture.
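
To make the stages named in the package list above more concrete, here is a minimal, self-contained sketch of an entity resolution pipeline: reference extraction, comparison space generation, matching and clustering. It uses only the Python standard library. Every name in it (EntityReference, extract_references, comparison_space, similarity, cluster) is hypothetical and chosen for illustration; none of it is the API of matchescu-base or of the related packages, and the string-ratio matcher and connected-components clustering are stand-ins for whatever methods those packages actually implement.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations


# Hypothetical type for illustration only; not part of matchescu-base.
@dataclass(frozen=True)
class EntityReference:
    source: str
    record_id: int
    attributes: tuple[str, ...]


def extract_references(source, rows):
    """Reference extraction: turn raw rows into normalized entity references."""
    return [
        EntityReference(source, i, tuple(str(v).strip().lower() for v in row))
        for i, row in enumerate(rows)
    ]


def comparison_space(refs):
    """Comparison space generation: every unordered pair (no blocking)."""
    return list(combinations(refs, 2))


def similarity(a, b):
    """Matching: score a pair with a plain string-similarity ratio in [0, 1]."""
    left, right = " ".join(a.attributes), " ".join(b.attributes)
    return SequenceMatcher(None, left, right).ratio()


def cluster(refs, pairs, threshold):
    """Clustering: connected components over pairs scoring >= threshold."""
    parent = {r: r for r in refs}

    def find(r):
        # Union-find lookup with path halving.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in pairs:
        if similarity(a, b) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for r in refs:
        groups.setdefault(find(r), []).append(r)
    return list(groups.values())


rows = [("John Smith", "Boston"), ("john smith", "boston"), ("Alice Jones", "Denver")]
refs = extract_references("crm", rows)
clusters = cluster(refs, comparison_space(refs), threshold=0.8)
for group in clusters:
    print([r.record_id for r in group])
```

In this sketch the two "John Smith" rows normalize to identical attributes and end up in one cluster, while the "Alice Jones" row stays on its own. Comparing every pair is quadratic in the number of references, which is why a dedicated comparison space generation step is usually needed on real data.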

Set up dev environment

  1. (optional) install pyenv
  2. install Python 3.11
  3. install Poetry
  4. clone this repository
  5. install the project dependencies from the repository root:
$ cd <REPO_ROOT>
$ poetry install

Run tests

$ poetry run pytest

Activate virtual environment

$ poetry shell

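-or-, if poetry shell is unavailable (in Poetry 2.x it requires the poetry-plugin-shell plugin) and the virtual environment lives inside the repository: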

$ source .venv/bin/activate
