Skip to main content

Vocabulary suggestion module based on BARTOC FAST

Project description

bartocsuggest

A Python module that suggests vocabularies given a list of words based on the BARTOC FAST API (https://bartoc-fast.ub.unibas.ch/bartocfast/api).

Installation

pip install bartocsuggest

Example

from bartocsuggest import Session

my_words = ["auction", "market", "marketing", "market economy", "perfect competition", "capitalism", "stock market"]

session = Session(my_words)
session.suggest(verbose=True)

The output to the console should look something like this:

73 vocabularies given sensitivity 1. From best to worst (vocabularies with no matches are excluded):
psh.ntkcz.cz, recall: 1.0
vocabulary.worldbank.org, recall: 1.0
zbw.eu, recall: 1.0
eurovoc.europa.eu, recall: 0.8571428571428571
lod.gesis.org, recall: 0.8571428571428571
www.yso.fi/onto/yso, recall: 0.7142857142857143
www.yso.fi/onto/koko, recall: 0.7142857142857143
www.yso.fi/onto/liito, recall: 0.7142857142857143
data.bibliotheken.nl, recall: 0.7142857142857143
lod.nal.usda.gov, recall: 0.7142857142857143
www.yso.fi/onto/juho, recall: 0.5714285714285714
crai.ub.edu, recall: 0.5714285714285714
www.twse.info, recall: 0.5714285714285714
thesaurus.web.ined.fr, recall: 0.5714285714285714
aims.fao.org, recall: 0.5714285714285714
...

Preloading responses

The latency for a response from BARTOC FAST is about 5 seconds per word. Preloading responses is hence useful for dealing with long lists of words or for trying out different types of suggestions for a given list of words without having to resend each query.

from bartocsuggest import Session, Average

# preload words:
session = Session(300_word_list, "my/preload/folder")
session.preload(0-99)
session.preload(100-199)
session.preload(200-299)

# try out different suggestions:
suggestion = session.suggest(remote=False, verbose=True)
suggestion_low_sensitivity = session.suggest(remote=False, sensitivity=5, verbose=True)
suggestion_average = session.suggest(remote=False, score_type="Average", verbose=True)

Exporting suggestions

The input words and the suggested vocabularies are modelled as JSKOS concept schemes (see https://gbv.github.io/jskos/jskos.html). The the concordance between the input words and any suggested vocabulary can be exported as JSON-file. Similarily, the mappings between the input words and any suggested vocabulary can be exported as NDJSON-file (e.g., for use in the Concoda Mapping Tool, see https://coli-conc.gbv.de/cocoda/app).

suggestion.save_concordance("my/save/folder")
suggestion.save_mappings("my/save/folder", vocabulary_uri="vocabulary.worldbank.org")

Annif wrapper

The Annif wrapper is built using the Annif-client module (https://pypi.org/project/annif-client) and enables bartocsuggest to suggest vocabularies based on texts:

from bartocsuggest import AnnifSession

my_text = "Plant viruses are widespread and economically important pathogens..."

# generate subject indexing for my_text:
annif_session = AnnifSession(my_text, project_id="yso-en")

# make suggestion based on subject indexing:
annif_session.suggest(verbose=True)

Documentation

Documentation available at: https://readthedocs.org/projects/bartocsuggest/

License

bartocsuggest is released under the MIT License.

Contact

Maximilian Hindermann
maximilian.hindermann@unibas.ch
https://orcid.org/0000-0002-9337-4655

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

bartocsuggest-0.0.3.tar.gz (14.5 kB view details)

Uploaded Source

Built Distribution

bartocsuggest-0.0.3-py3-none-any.whl (14.6 kB view details)

Uploaded Python 3

File details

Details for the file bartocsuggest-0.0.3.tar.gz.

File metadata

  • Download URL: bartocsuggest-0.0.3.tar.gz
  • Upload date:
  • Size: 14.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.3

File hashes

Hashes for bartocsuggest-0.0.3.tar.gz
Algorithm Hash digest
SHA256 cbed470d4c208c8cd209848ae13b3003caedd40fd713a5f8195f9de0cfe90d6d
MD5 48b587fc696ad4c9f9cdb040a7a39b7c
BLAKE2b-256 1d4d8b74a92543c46fb60547dea1f87cee5830b88a7c4aa69d43f628cf0a03d1

See more details on using hashes here.

File details

Details for the file bartocsuggest-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: bartocsuggest-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 14.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.45.0 CPython/3.7.3

File hashes

Hashes for bartocsuggest-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 d52233d9ceb479592d3f2be235c147fc79c1c28c8742ae062d0d0a5ea861af2a
MD5 b71814f2bacc90ed0f15921c3e1c5add
BLAKE2b-256 37b6d0df8a4058cc4f9453bd6ac9626219d13b541de42c95dfae93c76ab29608

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page