Package to create and use Simple Explainable Language Multiset Representations

Project description

This crate provides a library for generating and using simple text data structures that work like language models. The data structures do not use real-valued vector embeddings; instead they use the mathematical concept of multisets and are derived directly from plain text data.

The data structures are named Simple Explainable Language Multiset Representations (SELMRs). They consist of multisets created from all multi-word expressions and all multi-word-context combinations contained in a collection of documents, subject to some constraints. The multisets can be used for downstream NLP tasks such as text classification and search, in much the same way as real-valued vector embeddings.
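To make the multiset idea concrete, here is a minimal sketch in plain Python. It does not use the selmr API: the function name `build_multisets`, the whitespace tokenizer, and the parameters `max_phrase_len` and `context_window` are all assumptions chosen for illustration. A context is taken to be the pair of words immediately left and right of a phrase.

```python
from collections import Counter, defaultdict


def build_multisets(documents, max_phrase_len=2, context_window=1):
    """Count phrase/context co-occurrences (hypothetical helper, not the selmr API).

    Returns two mappings:
      phrase_contexts[phrase]  -> Counter of (left, right) contexts
      context_phrases[context] -> Counter of phrases seen in that context
    """
    phrase_contexts = defaultdict(Counter)
    context_phrases = defaultdict(Counter)
    for doc in documents:
        tokens = doc.lower().split()  # naive tokenizer, an assumption
        for n in range(1, max_phrase_len + 1):
            # Only keep phrases that have a full context on both sides.
            for i in range(context_window, len(tokens) - n - context_window + 1):
                phrase = " ".join(tokens[i:i + n])
                left = " ".join(tokens[i - context_window:i])
                right = " ".join(tokens[i + n:i + n + context_window])
                context = (left, right)
                phrase_contexts[phrase][context] += 1
                context_phrases[context][phrase] += 1
    return phrase_contexts, context_phrases


docs = ["the cat sat on the mat", "the dog sat on the rug"]
phrase_contexts, context_phrases = build_multisets(docs)
# Phrases that occur between "the" and "sat", with their counts.
print(context_phrases[("the", "sat")])
```

Because the counts are stored as ordinary multisets, every entry can be traced back to occurrences in the source text.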

SELMRs produce explainable results without any randomness and enable explicit links with lexical, linguistic and terminological annotations. No model is trained and no dimensionality reduction is applied.
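As an illustration of why multiset results are explainable and deterministic, the sketch below compares two phrases by the contexts they share. The similarity measure (a weighted Jaccard over counts) and the function names are assumptions for this example, not necessarily what the package computes, and the context counts are made up.

```python
from collections import Counter


def shared_contexts(a: Counter, b: Counter) -> Counter:
    """Multiset intersection: the contexts both phrases occur in (minimum counts)."""
    return a & b


def similarity(a: Counter, b: Counter) -> float:
    """Weighted Jaccard similarity between two multisets (assumed measure)."""
    union_size = sum((a | b).values())
    return sum((a & b).values()) / union_size if union_size else 0.0


# Hypothetical context multisets for two phrases.
cat = Counter({("the", "sat"): 3, ("a", "ran"): 1})
dog = Counter({("the", "sat"): 2, ("a", "barked"): 1})

print(similarity(cat, dog))       # 0.4
print(shared_contexts(cat, dog))  # the contexts that explain the score
```

The score is a ratio of counts, so the same input always gives the same output, and the intersection lists exactly which contexts contributed to it.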

For information on how to use this package, please look here.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

selmr-0.4.0.tar.gz (28.0 kB view details)

Uploaded Source

Built Distribution

selmr-0.4.0-cp310-none-win_amd64.whl (1.4 MB view details)

Uploaded CPython 3.10 Windows x86-64

File details

Details for the file selmr-0.4.0.tar.gz.

File metadata

  • Download URL: selmr-0.4.0.tar.gz
  • Upload date:
  • Size: 28.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for selmr-0.4.0.tar.gz
Algorithm Hash digest
SHA256 b157262a355af0f83b08ab54f26cf27b7d7dc1590b88949c6ae5d620547a4991
MD5 bb1491af34a6bbf17ca9b6a242802ea5
BLAKE2b-256 013390630102da62e2508b1fea90a874e4789424cea0f08c08a1a5aa3c20f4cf
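
A downloaded file can be checked against the SHA256 digest listed above using only the Python standard library. This is a minimal sketch; the helper names `sha256_of_file` and `matches_published_hash` are made up for illustration, and the file is assumed to be in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published one (hypothetical helper)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage, assuming selmr-0.4.0.tar.gz has been downloaded locally:
# matches_published_hash(
#     "selmr-0.4.0.tar.gz",
#     "b157262a355af0f83b08ab54f26cf27b7d7dc1590b88949c6ae5d620547a4991",
# )
```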

See more details on using hashes here.

File details

Details for the file selmr-0.4.0-cp310-none-win_amd64.whl.

File metadata

  • Download URL: selmr-0.4.0-cp310-none-win_amd64.whl
  • Upload date:
  • Size: 1.4 MB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for selmr-0.4.0-cp310-none-win_amd64.whl
Algorithm Hash digest
SHA256 35cdf4edb47fc520cdaa95047df003bb978807a9fe53eefd73a3482cdc5f2ac4
MD5 ad70367f61414878154ce641046964a6
BLAKE2b-256 561027e61f0d1ee3c1ced09b5b9ca95d033417c6f58ad39266c26064421f86cf

