
COMBO-NLP

A library for Morphosyntactic Tagging and Dependency Parsing based on Universal Dependencies.

Installation

pip install combo-nlp

LAMBO segmenter (optional)

A segmenter is only needed when passing raw text strings to COMBO. If you provide pre-tokenized input (list[str] or list[list[str]]), no segmenter is required.

When you initialize COMBO with a language name (e.g. COMBO("Polish")), it automatically loads a LAMBO segmenter. If LAMBO is not installed, an ImportError is raised. LAMBO is hosted on a custom PyPI index and must be installed separately:

pip install --index-url https://pypi.clarin-pl.eu/ lambo
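If you want to decide at runtime between raw-text and pre-tokenized input, you can check whether LAMBO is importable before constructing the pipeline. A minimal sketch; the `has_module` helper is illustrative and not part of COMBO's API:

```python
import importlib.util

def has_module(name: str) -> bool:
    """Return True if the named top-level package can be imported."""
    return importlib.util.find_spec(name) is not None

# Choose the input mode based on whether the LAMBO segmenter is installed.
if has_module("lambo"):
    print("LAMBO available: raw text input is supported")
else:
    print("LAMBO missing: pass pre-tokenized input instead")
```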

Usage

Full text input

from combo import COMBO

# Load by HuggingFace model ID:
nlp = COMBO.from_pretrained("clarin-pl/combo-nlp-xlm-roberta-base-polish-pdb-ud2.17")
result = nlp("Ala ma kota.")

# Or load by language name (with LAMBO segmenter):
nlp = COMBO("Polish")
result = nlp("Ala ma kota.")

# Or use the Language enum:
from combo import Language
nlp = COMBO(Language.POLISH)
result = nlp("Ala ma kota.")

# Multiple sentences:
result = nlp(["Ala ma kota.", "Pies je."])

# Access results:
for sentence in result:
    for token in sentence:
        print(token.form, token.upos, token.head, token.deprel, token.lemma)
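The token attributes printed above map directly onto CoNLL-U columns. A minimal sketch of serializing one parsed sentence to CoNLL-U-style rows, using a stand-in `Token` dataclass instead of COMBO's own token objects (the field names `form`, `lemma`, `upos`, `head`, `deprel` follow the loop above; everything else is illustrative):

```python
from dataclasses import dataclass

@dataclass
class Token:
    # Stand-in for a COMBO token; real tokens expose (at least) these fields.
    form: str
    lemma: str
    upos: str
    head: int
    deprel: str

def to_conllu(sentence: list[Token]) -> str:
    """Render a sentence as tab-separated rows: ID, FORM, LEMMA, UPOS, HEAD, DEPREL."""
    rows = []
    for i, tok in enumerate(sentence, start=1):
        rows.append("\t".join([str(i), tok.form, tok.lemma, tok.upos,
                               str(tok.head), tok.deprel]))
    return "\n".join(rows)

sentence = [
    Token("Ala", "Ala", "PROPN", 2, "nsubj"),
    Token("ma", "mieć", "VERB", 0, "root"),
    Token("kota", "kot", "NOUN", 2, "obj"),
    Token(".", ".", "PUNCT", 2, "punct"),
]
print(to_conllu(sentence))
```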

Pre-tokenized input

from combo import COMBO

nlp = COMBO.from_pretrained("clarin-pl/combo-nlp-xlm-roberta-base-polish-pdb-ud2.17")

# Single sentence:
result = nlp(["Ala", "ma", "kota", "."], tokenized=True)

# Multiple sentences:
result = nlp([["Ala", "ma", "kota", "."], ["Pies", "je", "."]], tokenized=True)
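Pre-tokenized input lets you bring your own segmentation when LAMBO is not installed. A naive whitespace-and-punctuation splitter sketch that produces the `list[list[str]]` shape used above (purely illustrative; for real use, prefer a proper segmenter such as LAMBO):

```python
import re

def naive_tokenize(text: str) -> list[list[str]]:
    """Split text into sentences on terminal punctuation, then split each
    sentence into word tokens, keeping punctuation as separate tokens."""
    sentences = []
    for sent in re.findall(r"[^.!?]+[.!?]?", text):
        tokens = re.findall(r"\w+|[^\w\s]", sent)
        if tokens:
            sentences.append(tokens)
    return sentences

print(naive_tokenize("Ala ma kota. Pies je."))
# → [['Ala', 'ma', 'kota', '.'], ['Pies', 'je', '.']]
```

The output can then be passed directly as `nlp(naive_tokenize(text), tokenized=True)`.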
