An open-source library that builds powerful end-to-end Entity Resolution workflows.

pyJedAI


An open-source library that leverages Python’s data science ecosystem to build
powerful end-to-end Entity Resolution workflows.

Overview

pyJedAI is a Python framework that aims to offer both experts and novice users robust and fast solutions for multiple types of Entity Resolution problems. It is built on state-of-the-art Python frameworks. pyJedAI constitutes the sole open-source Link Discovery tool that is capable of exploiting the latest breakthroughs in Deep Learning and NLP techniques, which are publicly available through the Python data science ecosystem. This applies to both blocking and matching, thus ensuring high time efficiency, high scalability, and high effectiveness, without requiring any labelled instances from the user.
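To make the two stages mentioned above concrete, here is a minimal, library-free sketch of blocking followed by matching. It does not use pyJedAI's API: the function names, the token-based blocking, the Jaccard similarity, and the 0.5 threshold are all assumptions chosen for illustration. Blocking groups records that share a token so that only those pairs are compared, and matching then scores each candidate pair without any labelled data.

```python
from collections import defaultdict
from itertools import combinations


def tokenize(text):
    """Lowercase a string and split it into a set of whitespace-separated tokens."""
    return set(text.lower().split())


def token_blocking(records):
    """Group record ids by shared token; a token seen in 2+ records forms a block."""
    blocks = defaultdict(set)
    for rid, text in records.items():
        for token in tokenize(text):
            blocks[token].add(rid)
    return {tok: ids for tok, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect the distinct record pairs that co-occur in at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match(records, threshold=0.5):
    """Return candidate pairs whose Jaccard similarity reaches the threshold."""
    pairs = candidate_pairs(token_blocking(records))
    return sorted(
        (x, y) for x, y in pairs
        if jaccard(tokenize(records[x]), tokenize(records[y])) >= threshold
    )


# Hypothetical toy data: two descriptions each of two products.
records = {
    1: "Apple iPhone 13 128GB",
    2: "iphone 13 apple 128gb black",
    3: "Samsung Galaxy S21",
    4: "Galaxy S21 by Samsung",
}
matches = match(records)
print(matches)
```

Because records 1 and 3 share no token, they never land in the same block and are never compared, which is where blocking saves time on larger inputs.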

Key Features

  • Independent of input data type: both structured and semi-structured data can be processed.
  • Implements a variety of algorithms.
  • Easy to use.
  • Builds on well-known, cutting-edge machine learning packages.
  • Offers supervised and unsupervised ML techniques.
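As a rough illustration of the first feature, one common way to treat structured and semi-structured records uniformly is to flatten each record into a single schema-agnostic text profile. The sketch below is an assumption about how that could look in plain Python, not a description of pyJedAI's internals; `flatten` and `to_profile` are hypothetical helper names.

```python
def flatten(value):
    """Yield every leaf value of a nested dict/list structure as a string."""
    if isinstance(value, dict):
        for item in value.values():
            yield from flatten(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from flatten(item)
    elif value is not None:
        # Missing values (None) are skipped rather than rendered as text.
        yield str(value)


def to_profile(record):
    """Concatenate all leaf values into one schema-agnostic text profile."""
    return " ".join(flatten(record))


# Hypothetical records: one flat (structured), one nested (semi-structured).
structured = {"name": "Ada Lovelace", "city": "London"}
semi_structured = {
    "name": "Ada Lovelace",
    "address": {"city": "London", "tags": ["math", None]},
}

print(to_profile(structured))
print(to_profile(semi_structured))
```

Once both kinds of record are reduced to plain text, the same blocking and matching steps can be applied to either.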

Open demos are available in:

       

Google Colab Hands-on demo:

Install

pyJedAI has been tested on Windows and Linux.

Basic requirements:

  • Python 3.8 or later.
  • On Windows, Microsoft Visual C++ 14.0 is required. Download it from the official Microsoft site.

PyPI

Install the latest version of pyjedai:

pip install pyjedai

More on PyPI.
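After installing, a quick sanity check can confirm that the interpreter meets the stated Python 3.8 requirement and that the package is visible to it. This is a small sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are made up for this example.

```python
import sys
from importlib import metadata


def python_is_supported(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def installed_version(package="pyjedai"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python supported:", python_is_supported())
print("pyjedai version:", installed_version() or "not installed")
```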

Git

Set up locally:

git clone https://github.com/AI-team-UoA/pyJedAI.git

Go to the root directory with cd pyJedAI and run:

pip install .

Docker

Available on Docker Hub, or clone this repository and, from its root directory, run:

docker build -f Dockerfile .

Dependencies

         


           

See the full list of dependencies, with the versions used, in this file.

Bugs, Discussions & News

GitHub Discussions is the forum for general questions and discussions, and our recommended starting point. Please report any bugs you find here.

Java - Web Application

Java users can check out the original JedAI, which provides Java-based code and a web application for interactively creating ER workflows.

JedAI is an open-source, highly scalable toolkit that offers out-of-the-box solutions for any data integration task, e.g., Record Linkage, Entity Resolution and Link Discovery. At its core lies a set of domain-independent, state-of-the-art techniques that apply to both RDF and relational data.


Team & Authors

This is a research project by the AI-Team of the Department of Informatics and Telecommunications at the University of Athens.

Cite us

If you use this code or find it helpful in your research, please cite it with the following BibTeX entry:

@inproceedings{DBLP:conf/semweb/Nikoletos0K22,
  author       = {Konstantinos Nikoletos and
                  George Papadakis and
                  Manolis Koubarakis},
  editor       = {Anastasia Dimou and
                  Armin Haller and
                  Anna Lisa Gentile and
                  Petar Ristoski},
  title        = {pyJedAI: a Lightsaber for Link Discovery},
  booktitle    = {Proceedings of the {ISWC} 2022 Posters, Demos and Industry Tracks:
                  From Novel Ideas to Industrial Practice co-located with 21st International
                  Semantic Web Conference {(ISWC} 2022), Virtual Conference, Hangzhou,
                  China, October 23-27, 2022},
  series       = {{CEUR} Workshop Proceedings},
  volume       = {3254},
  publisher    = {CEUR-WS.org},
  year         = {2022},
  url          = {https://ceur-ws.org/Vol-3254/paper366.pdf},
  timestamp    = {Fri, 10 Mar 2023 16:23:05 +0100},
  biburl       = {https://dblp.org/rec/conf/semweb/Nikoletos0K22.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

License

Released under the Apache-2.0 license (see LICENSE.txt).

Copyright © 2024 AI-Team, University of Athens



       

This project is funded in the context of STELAR, a Horizon Europe project.
