
PYthon Multilingual Ucrel Semantic Analysis System


PyMUSAS

Python Multilingual Ucrel Semantic Analysis System (PyMUSAS) is a rule-based token and Multi Word Expression (MWE) semantic tagger. The tagger can support any semantic tagset; however, the tagset we have concentrated on, and released pre-configured spaCy components for, is the Ucrel Semantic Analysis System (USAS).
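To make "rule based" concrete, here is a minimal sketch of a lexicon lookup tagger in plain Python. It does not use PyMUSAS and should not be read as how the library is implemented. The lexicon entries, the tag labels, the function name `tag_tokens`, and the `UNMATCHED_TAG` fallback are all assumptions chosen for illustration. It only shows the general idea of matching multi word expressions before falling back to single tokens.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicons: these entries and tag labels are illustrative only.
SINGLE_WORD_LEXICON: Dict[str, List[str]] = {
    "river": ["W3"],
    "bank": ["I1.1", "W3"],
}
MWE_LEXICON: Dict[Tuple[str, ...], List[str]] = {
    ("new", "york"): ["Z2"],
}
UNMATCHED_TAG = "Z99"  # assumed label for tokens that no rule matches


def tag_tokens(tokens: List[str]) -> List[Tuple[str, List[str]]]:
    """Tag tokens, preferring the longest multi word expression match."""
    max_mwe_len = max((len(key) for key in MWE_LEXICON), default=1)
    tagged: List[Tuple[str, List[str]]] = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, down to two tokens.
        for span_len in range(min(max_mwe_len, len(tokens) - i), 1, -1):
            span = tuple(t.lower() for t in tokens[i:i + span_len])
            if span in MWE_LEXICON:
                # Every token in the expression receives the expression's tags.
                tagged.extend((t, MWE_LEXICON[span]) for t in tokens[i:i + span_len])
                i += span_len
                matched = True
                break
        if not matched:
            # Single token lookup, with a fallback tag when nothing matches.
            tags = SINGLE_WORD_LEXICON.get(tokens[i].lower(), [UNMATCHED_TAG])
            tagged.append((tokens[i], tags))
            i += 1
    return tagged


result = tag_tokens(["New", "York", "river", "bank", "walk"])
for token, tags in result:
    print(f"{token}\t{tags}")
```

In this sketch "New York" is tagged as one expression, "bank" keeps both of its candidate tags, and "walk" falls through to the unmatched tag because it is in neither lexicon.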



Documentation

  • 📚 Usage Guides - What the package is, tutorials, how-to guides, and explanations.
  • 🔎 API Reference - The docstrings of the library, with minimum working examples.
  • 🚀 Roadmap

Language support

PyMUSAS currently supports 8 different languages with pre-configured spaCy components that can be downloaded. Each language has its own guide on how to tag text using PyMUSAS. Below we show the languages supported, whether the model for that language supports Multi Word Expression (MWE) identification and tagging (all languages support token level tagging by default), and the size of the model:

| Language (BCP 47 language code) | MWE Support | Size |
| --- | --- | --- |
| Mandarin Chinese (cmn) | ✔️ | 1.28MB |
| Welsh (cy) | ✔️ | 1.09MB |
| Spanish, Castilian (es) | ✔️ | 0.20MB |
| French (fr) | ❌ | 0.08MB |
| Indonesian (id) | ❌ | 0.24MB |
| Italian (it) | ✔️ | 0.50MB |
| Dutch, Flemish (nl) | ❌ | 0.15MB |
| Portuguese (pt) | ✔️ | 0.27MB |

Install PyMUSAS

PyMUSAS can be installed on all operating systems and supports Python versions >= 3.7. To install, run:

pip install pymusas
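As a quick sanity check around the install step, the sketch below tests the interpreter against the minimum version stated above and reports whether the `pymusas` package can be found. The helper names `python_is_supported` and `pymusas_is_installed` are hypothetical and not part of PyMUSAS.

```python
import importlib.util
import sys

MINIMUM_PYTHON = (3, 7)  # taken from the requirement stated above


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the interpreter meets the minimum supported version."""
    return tuple(version_info[:2]) >= MINIMUM_PYTHON


def pymusas_is_installed() -> bool:
    """Return True if the `pymusas` package can be found, without importing it."""
    return importlib.util.find_spec("pymusas") is not None


print(f"Python supported: {python_is_supported()}")
print(f"pymusas installed: {pymusas_is_installed()}")
```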

Development

When developing on the project you will want to install the Python package locally in editable mode with all the extra requirements. This can be done like so:

pip install -e .[tests]

In a zsh shell, which is the default shell on newer Macs, you will need to escape the brackets with \:

pip install -e .\[tests\]
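The escaping is needed because zsh treats square brackets as glob characters. As an illustration only, the sketch below uses Python's standard `shlex.quote` to show which arguments of the install command need protecting. Wrapping the argument in single quotes is an alternative to the backslashes. The variable names are hypothetical, and nothing is installed by running this snippet.

```python
import shlex
import sys

# The same editable install as above, expressed as an argument list.
command = [sys.executable, "-m", "pip", "install", "-e", ".[tests]"]

# shlex.quote wraps arguments that contain shell-special characters,
# such as the square brackets, in single quotes.
quoted_extra = shlex.quote(".[tests]")
shell_line = " ".join(shlex.quote(arg) for arg in command)

print(quoted_extra)  # prints '.[tests]' including the single quotes
print(shell_line)
```

Passing the `command` list directly to `subprocess.run` without `shell=True` avoids the problem altogether, because no shell is involved to expand the brackets.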

Running linters and tests

This code base uses isort, flake8, and mypy to ensure that the code is consistently formatted and contains type hints. The flake8 settings can be found in ./setup.cfg and the mypy settings in ./pyproject.toml. To run these linters:

isort pymusas tests scripts
flake8
mypy
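If you prefer a single entry point, the three commands above can be wrapped in a small script. This is a sketch and not part of the repository. The names `LINTER_COMMANDS`, `run_linters`, and the `runner` parameter are hypothetical, and it assumes the three tools are installed and that it is called from the repository root.

```python
import subprocess
from typing import Callable, List, Sequence

# The three commands listed above, as argument lists.
LINTER_COMMANDS: List[List[str]] = [
    ["isort", "pymusas", "tests", "scripts"],
    ["flake8"],
    ["mypy"],
]


def _exit_code(command: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(command)).returncode


def run_linters(
    commands: Sequence[Sequence[str]] = LINTER_COMMANDS,
    runner: Callable[[Sequence[str]], int] = _exit_code,
) -> List[str]:
    """Run every command and return the names of those that failed."""
    failed: List[str] = []
    for command in commands:
        # Keep going after a failure so that all problems are reported at once.
        if runner(command) != 0:
            failed.append(command[0])
    return failed
```

Calling `run_linters()` returns an empty list when every tool exits with code 0. The `runner` parameter exists only so that the function can be exercised without the tools installed.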

To run the tests with code coverage (NOTE: these are the code coverage tests that the Continuous Integration (CI) reports at the top of this README; the doc tests are not part of this report):

coverage run # Runs the tests (uses pytest)
coverage report # Produces a report on the test coverage

To run the doc tests, which ensure that the examples within the documentation run as expected:

coverage run -m pytest --doctest-modules pymusas/ # Runs the doc tests
coverage report # Produces a report on the doc tests coverage
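For readers unfamiliar with doc tests, the sketch below shows the kind of docstring example that `--doctest-modules` collects. The function `count_tokens` is hypothetical and not part of PyMUSAS. The snippet runs the docstring through Python's standard `doctest` module directly so that it is self-contained.

```python
import doctest


def count_tokens(text: str) -> int:
    """
    Count whitespace separated tokens in `text`.

    The lines starting with `>>>` are the doc test: they are executed
    and the result is compared with the line that follows.

    >>> count_tokens("The river bank")
    3
    >>> count_tokens("")
    0
    """
    return len(text.split())


# Parse the docstring into a doc test and run it.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    count_tokens.__doc__, {"count_tokens": count_tokens}, "count_tokens", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)

print(f"attempted={results.attempted} failed={results.failed}")
```

If the function's behaviour drifts from its documented examples, `results.failed` becomes non-zero, which is the same signal the doc test run gives for the library's documentation.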

Team

PyMUSAS is an open-source project that has been created and funded by the University Centre for Computer Corpus Research on Language (UCREL) at Lancaster University. For more information on who has contributed to this code base see the contributions page.

