A Python implementation of the IAMsystem algorithm

iamsystem

A Python implementation of the IAMsystem algorithm, a fast dictionary-based approach for semantic annotation, also known as entity linking.

Installation

pip install iamsystem

Usage

Provide a list of keywords to detect in a document. You can add and combine abbreviations, normalization methods (lemmatization, stemming), and approximate string matching algorithms; the IAMsystem algorithm then performs the semantic annotation.

See the documentation for the configuration details.

Quick example

from iamsystem import Matcher

matcher = Matcher.build(
    keywords=["North America", "South America"],
    stopwords=["and"],
    abbreviations=[("amer", "America")],
    spellwise=[dict(measure="Levenshtein", max_distance=1)],
    w=2,
)
annots = matcher.annot_text(text="Northh and south Amer.")
for annot in annots:
    print(annot)
# Northh Amer	0 6;17 21	North America
# south Amer	11 21	South America
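The commented output above can be read as three tab-separated fields: the matched text, the character offsets, and the keyword. A discontinuous match lists one `start end` pair per fragment, separated by `;`. The following is a minimal sketch for checking those offsets against the input text, assuming the printed line has exactly this layout; `parse_annotation_line` is a hypothetical helper written for this illustration, not part of the iamsystem API.

```python
def parse_annotation_line(line):
    """Split a printed annotation line into (matched text, spans, keyword).

    Assumes three tab-separated fields, with offsets written as
    'start end' pairs joined by ';' as in the example output above.
    """
    matched, offsets, keyword = line.split("\t")
    spans = [tuple(int(n) for n in part.split()) for part in offsets.split(";")]
    return matched, spans, keyword


text = "Northh and south Amer."
matched, spans, keyword = parse_annotation_line(
    "Northh Amer\t0 6;17 21\tNorth America"
)
# Slice the original text with each (start, end) pair to recover the fragments.
fragments = [text[start:end] for start, end in spans]
print(keyword, spans, fragments)
```

Slicing the text with each span recovers the fragments that were matched, which shows that the stopword "and" and the token "south" were skipped for the first annotation.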

Algorithm

The algorithm was developed in the context of a PhD thesis. It offers a way to quickly annotate documents using a large dictionary (> 300K keywords) and fuzzy matching algorithms. This package does not implement any string distance algorithm itself; it imports and leverages external libraries such as spellwise, pysimstring and nltk. Its algorithmic complexity is O(n log(m)), where n is the number of tokens in a document and m is the size of the dictionary. The formalization of the algorithm is available in this paper.
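To make the general idea of dictionary-based annotation with fuzzy matching more concrete, here is a minimal, self-contained sketch. It is not the package's implementation and makes no claim about its internals or its complexity: it assumes keywords are stored in a token-level trie, that a partial match may skip tokens as long as the gap stays within a window `w`, and that a document token matches a dictionary token exactly, through an abbreviation, or within a small edit distance. All names (`Node`, `build_trie`, `token_matches`, `annotate`) and the five-character threshold for fuzzy matching are assumptions made for this illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Trie node; `keyword` is set when a dictionary keyword ends here."""
    children: dict = field(default_factory=dict)
    keyword: Optional[str] = None


def tokenize(text):
    """Lowercased word tokens with their (start, end) character offsets."""
    return [(m.group().lower(), m.start(), m.end())
            for m in re.finditer(r"\w+", text)]


def build_trie(keywords):
    """Store each keyword as a path of tokens starting at the root."""
    root = Node()
    for keyword in keywords:
        node = root
        for token, _, _ in tokenize(keyword):
            node = node.children.setdefault(token, Node())
        node.keyword = keyword
    return root


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def token_matches(doc_token, dict_token, abbreviations, max_distance):
    """Exact match, abbreviation match, or fuzzy match on longer tokens."""
    if doc_token == dict_token or abbreviations.get(doc_token) == dict_token:
        return True
    # Assumption: skip fuzzy matching on short tokens to limit false positives.
    return len(doc_token) >= 5 and levenshtein(doc_token, dict_token) <= max_distance


def annotate(text, root, stopwords=(), abbreviations=None, max_distance=1, w=1):
    """Return (keyword, spans) pairs for every keyword found in the text."""
    abbreviations = abbreviations or {}
    tokens = [t for t in tokenize(text) if t[0] not in stopwords]
    active = []  # partial matches: (node, spans, index of last matched token)
    found = []
    for i, (token, start, end) in enumerate(tokens):
        # Forget partial matches whose last token is more than w tokens back.
        active = [state for state in active if i - state[2] <= w]
        advanced = []
        for node, spans, _ in [(root, [], i)] + active:
            for dict_token, child in node.children.items():
                if token_matches(token, dict_token, abbreviations, max_distance):
                    new_spans = spans + [(start, end)]
                    advanced.append((child, new_spans, i))
                    if child.keyword is not None:
                        found.append((child.keyword, new_spans))
        active.extend(advanced)
    return found


root = build_trie(["North America", "South America"])
for keyword, spans in annotate("Northh and south Amer.", root, stopwords={"and"},
                               abbreviations={"amer": "america"}, w=2):
    print(keyword, spans)
```

On the quick example's input, this sketch reports the same two keywords with the same character offsets. It compares each token against every child of a trie node, which is simple but slow; a real implementation would rely on indexed lookups and on the external string matching libraries mentioned above.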

The algorithm was initially developed in Java (https://github.com/scossin/IAMsystem). It has taken part in several semantic annotation competitions in the medical field, where it obtained satisfactory results, including the best results in the Codiesp shared task. A dictionary-based model can achieve performance close to that of a transformer-based model when the task is simple or when the training set is small. Its main advantage is its speed, which allows a baseline to be generated quickly.

Citation

@article{cossin_iam_2018,
	title = {{IAM} at {CLEF} {eHealth} 2018: {Concept} {Annotation} and {Coding} in {French} {Death} {Certificates}},
	shorttitle = {{IAM} at {CLEF} {eHealth} 2018},
	url = {http://arxiv.org/abs/1807.03674},
	urldate = {2018-07-11},
	journal = {arXiv:1807.03674 [cs]},
	author = {Cossin, Sébastien and Jouhet, Vianney and Mougin, Fleur and Diallo, Gayo and Thiessard, Frantz},
	month = jul,
	year = {2018},
	note = {arXiv: 1807.03674},
	keywords = {Computer Science - Computation and Language},
}

Release history

0.6.1 (this version): Apr 30, 2024
0.6.0: Jan 8, 2024
0.5.1: Apr 19, 2023
0.5.0: Mar 22, 2023
0.4.0: Mar 11, 2023
0.3.0: Feb 19, 2023
0.2.0: Feb 12, 2023
0.1.1: Feb 2, 2023

Download files

Source Distribution

iamsystem-0.6.1.tar.gz (68.2 kB view details)

Uploaded Apr 30, 2024 Source

Built Distribution

iamsystem-0.6.1-py3-none-any.whl (56.1 kB view details)

Uploaded Apr 30, 2024 Python 3

File details

Details for the file iamsystem-0.6.1.tar.gz.

File metadata

Download URL: iamsystem-0.6.1.tar.gz
Upload date: Apr 30, 2024
Size: 68.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for iamsystem-0.6.1.tar.gz
Algorithm	Hash digest
SHA256	`f5200c6969984c3d286fee021c22d327afee0444c6702eba0e14e459e18f8221`
MD5	`d05e90ac8693960cdac6c6d53af2239b`
BLAKE2b-256	`55d0c30ede1487a9218c80cc709c138ea96f9367c4f4b2205eea46903128f08d`

File details

Details for the file iamsystem-0.6.1-py3-none-any.whl.

File metadata

Download URL: iamsystem-0.6.1-py3-none-any.whl
Upload date: Apr 30, 2024
Size: 56.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.10.12

File hashes

Hashes for iamsystem-0.6.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6576c21d860d954e8be39fb7e9c7bc540293df51f01f7f52d483abab0c7aa173`
MD5	`408f350582e49e121c7673ce59f8e627`
BLAKE2b-256	`7da34598df7318de97d46e84639299c8c6342a3badf5f0a3eae2b99daadfaf3e`
