
COMBO-NLP

A library for Morphosyntactic Tagging and Dependency Parsing based on Universal Dependencies.

Installation

pip install combo-nlp

LAMBO segmenter (optional)

A segmenter is only needed when passing raw text strings to COMBO. If you provide pre-tokenized input (list[str] or list[list[str]]), no segmenter is required.

When you initialize COMBO with a language name (e.g. COMBO("Polish")), it automatically loads a LAMBO segmenter. If LAMBO is not installed, an ImportError is raised. LAMBO is hosted on a custom PyPI index and must be installed separately:

pip install --index-url https://pypi.clarin-pl.eu/ lambo
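To decide at runtime between raw-text and pre-tokenized input, you can check whether LAMBO is importable before constructing COMBO with a language name. A minimal sketch using only the standard library; the import name `lambo` is an assumption about how the package exposes itself:

```python
import importlib.util

def lambo_available() -> bool:
    # True if the LAMBO package can be imported in this environment.
    # "lambo" as the import name is an assumption, not confirmed by the docs.
    return importlib.util.find_spec("lambo") is not None

if not lambo_available():
    print("LAMBO not installed: pass pre-tokenized input, "
          "or install it from the CLARIN-PL index.")
```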

Usage

Full text input

from combo import COMBO

# Load by HuggingFace model ID:
nlp = COMBO.from_pretrained("clarin-pl/combo-nlp-xlm-roberta-base-polish-pdb-ud2.17")
result = nlp("Ala ma kota.")

# Or load by language name (with LAMBO segmenter):
nlp = COMBO("Polish")
result = nlp("Ala ma kota.")

# Or use the Language enum:
from combo import Language
nlp = COMBO(Language.POLISH)
result = nlp("Ala ma kota.")

# Multiple sentences:
result = nlp(["Ala ma kota.", "Kot śpi na kanapie."])

# Access results:
for sentence in result:
    for token in sentence:
        print(token.form, token.upos, token.head, token.deprel, token.lemma)
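The attributes read in the loop above map directly onto CoNLL-U columns, which is a convenient way to dump results to a file. A rough sketch; the `Token` dataclass below is illustrative, not COMBO's actual token class, and only its attribute names come from the loop above:

```python
from dataclasses import dataclass

@dataclass
class Token:
    # Stand-in for COMBO's token objects; field names mirror the
    # attributes used in the loop above (form, lemma, upos, head, deprel).
    id: int
    form: str
    lemma: str
    upos: str
    head: int
    deprel: str

def to_conllu(sentence: list[Token]) -> str:
    """Render one parsed sentence as 10-column CoNLL-U lines
    (unfilled columns written as '_')."""
    lines = []
    for t in sentence:
        cols = [str(t.id), t.form, t.lemma, t.upos,
                "_", "_", str(t.head), t.deprel, "_", "_"]
        lines.append("\t".join(cols))
    return "\n".join(lines)

sent = [
    Token(1, "Ala", "Ala", "PROPN", 2, "nsubj"),
    Token(2, "ma", "mieć", "VERB", 0, "root"),
    Token(3, "kota", "kot", "NOUN", 2, "obj"),
    Token(4, ".", ".", "PUNCT", 2, "punct"),
]
print(to_conllu(sent))
```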

Pre-tokenized input

from combo import COMBO

nlp = COMBO.from_pretrained("clarin-pl/combo-nlp-xlm-roberta-base-polish-pdb-ud2.17")

# Single sentence:
result = nlp(["Ala", "ma", "kota", "."], tokenized=True)

# Multiple sentences:
result = nlp([["Ala", "ma", "kota", "."], ["Kot", "śpi", "na", "kanapie", "."]], tokenized=True)
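If you have raw text but no segmenter installed, a crude splitter can produce the list[list[str]] shape shown above. This is a naive stand-in for illustration only, not a substitute for LAMBO's segmentation:

```python
import re

def naive_tokenize(text: str) -> list[list[str]]:
    # Split into sentences at terminal punctuation, then into
    # word and punctuation tokens. Good enough to demonstrate
    # the tokenized=True input shape; not linguistically robust.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [re.findall(r"\w+|[^\w\s]", s) for s in sentences]

print(naive_tokenize("Ala ma kota. Kot śpi na kanapie."))
# → [['Ala', 'ma', 'kota', '.'], ['Kot', 'śpi', 'na', 'kanapie', '.']]
```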

