Skip to main content

DSL for building language rules

Project description

Rita Logo

RITA DSL

Documentation Status codecov made-with-python Maintenance PyPI version fury.io PyPI download month GitHub license

This is a language, loosely based on language Apache UIMA RUTA, focused on writing manual language rules, which compiles into either spaCy compatible patterns, or pure regex. These patterns can be used for doing manual NER as well as used in other processes, like retokenizing and pure matching

An Introduction Video

Intro

Links

Support

reddit Gitter

Install

pip install rita-dsl

Simple Rules example

rules = """
cuts = {"fitted", "wide-cut"}
lengths = {"short", "long", "calf-length", "knee-length"}
fabric_types = {"soft", "airy", "crinkled"}
fabrics = {"velour", "chiffon", "knit", "woven", "stretch"}

{IN_LIST(cuts)?, IN_LIST(lengths), WORD("dress")}->MARK("DRESS_TYPE")
{IN_LIST(lengths), IN_LIST(cuts), WORD("dress")}->MARK("DRESS_TYPE")
{IN_LIST(fabric_types)?, IN_LIST(fabrics)}->MARK("DRESS_FABRIC")
"""

Loading in spaCy

import spacy
from rita.shortcuts import setup_spacy


nlp = spacy.load("en")
setup_spacy(nlp, rules_string=rules)

And using it:

>>> r = nlp("She was wearing a short wide-cut dress")
>>> [{"label": e.label_, "text": e.text} for e in r.ents]
[{'label': 'DRESS_TYPE', 'text': 'short wide-cut dress'}]

Loading using Regex (standalone)

import rita

patterns = rita.compile_string(rules, use_engine="standalone")

And using it:

>>> list(patterns.execute("She was wearing a short wide-cut dress"))
[{'end': 38, 'label': 'DRESS_TYPE', 'start': 18, 'text': 'short wide-cut dress'}]

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rita-dsl-0.7.4.tar.gz (20.0 kB view hashes)

Uploaded source

Built Distribution

rita_dsl-0.7.4-py3-none-any.whl (24.8 kB view hashes)

Uploaded py3

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Huawei Huawei PSF Sponsor Microsoft Microsoft PSF Sponsor NVIDIA NVIDIA PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page