Rule-based fact extraction for the Russian language
Yargy uses rules and dictionaries to extract structured information from Russian texts. It is similar to the Tomita parser.
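To make the idea of "rules and dictionaries" concrete, here is a toy sketch in plain Python. It does not use Yargy and is not how Yargy is implemented: the `POSITIONS` dictionary, the `Person` dataclass, and the `extract` function are hypothetical names made up for illustration, and the rule is a plain regular expression with no morphological analysis.

```python
import re
from dataclasses import dataclass
from typing import Iterator

# Hypothetical toy dictionary of job titles (not part of Yargy).
POSITIONS = ("managing director", "deputy mayor")


@dataclass
class Person:
    position: str
    first: str
    last: str


# Toy rule: <position from the dictionary> <Capitalized word> <Capitalized word>
PATTERN = re.compile(
    r"(?P<position>" + "|".join(map(re.escape, POSITIONS)) + r")"
    r"\s+(?P<first>[A-Z][a-z]+)\s+(?P<last>[A-Z][a-z]+)"
)


def extract(text: str) -> Iterator[Person]:
    """Yield one structured record per span of text that satisfies the rule."""
    for m in PATTERN.finditer(text):
        yield Person(m["position"], m["first"], m["last"])


text = "Yesterday managing director Ivan Ulyanov met deputy mayor Anna Petrova."
people = list(extract(text))
for person in people:
    print(person)
```

A regular expression like this breaks as soon as a word is inflected, which is why a real extractor for Russian needs morphology-aware rules such as the ones in the Usage section.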
Install
Yargy supports Python 3.7+ and PyPy 3, and depends only on Pymorphy2.
$ pip install yargy
Usage
from yargy import Parser, rule, and_, not_
from yargy.interpretation import fact
from yargy.predicates import gram
from yargy.relations import gnc_relation
from yargy.pipelines import morph_pipeline

# Fact schemas: the structured records the parser will produce
Name = fact(
    'Name',
    ['first', 'last'],
)
Person = fact(
    'Person',
    ['position', 'name']
)

# A surname that is not an abbreviation
LAST = and_(
    gram('Surn'),
    not_(gram('Abbr')),
)
# A first name that is not an abbreviation
FIRST = and_(
    gram('Name'),
    not_(gram('Abbr')),
)

# Dictionary of positions, matched in any morphological form
POSITION = morph_pipeline([
    'управляющий директор',
    'вице-мэр'
])

# Require agreement in gender, number, and case
gnc = gnc_relation()

NAME = rule(
    FIRST.interpretation(
        Name.first
    ).match(gnc),
    LAST.interpretation(
        Name.last
    ).match(gnc)
).interpretation(
    Name
)

PERSON = rule(
    POSITION.interpretation(
        Person.position
    ).match(gnc),
    NAME.interpretation(
        Person.name
    )
).interpretation(
    Person
)

parser = Parser(PERSON)
match = parser.match('управляющий директор Иван Ульянов')
# match.fact holds the interpreted Person fact
print(match.fact)

Person(
    position='управляющий директор',
    name=Name(
        first='Иван',
        last='Ульянов'
    )
)
Documentation
All materials are in Russian:
Support
- Chat — https://t.me/natural_language_processing
- Issues — https://github.com/natasha/yargy/issues
- Commercial support — https://lab.alexkuk.ru
Development
Dev env
brew install graphviz
python -m venv ~/.venvs/natasha-yargy
source ~/.venvs/natasha-yargy/bin/activate
pip install -r requirements/dev.txt
pip install -e .
python -m ipykernel install --user --name natasha-yargy
Test + lint
make test
Update docs
make exec-docs
# Manually check git diff docs/, commit
Release
# Update setup.py version
git commit -am 'Up version'
git tag v0.16.0
git push
git push --tags
# GitHub Actions builds dist and publishes to PyPI
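The release steps rely on the tag matching the version in setup.py. Below is a minimal sketch of a check for that. It assumes setup.py declares the version as a plain `version='...'` literal and that tags are the version prefixed with `v`, as `git tag v0.16.0` suggests. The helper names `read_version` and `tag_for` are hypothetical and not part of the Yargy repository.

```python
import re

# Hypothetical helper, not part of the Yargy repository.
# Assumes a literal version='...' (or version="...") in setup.py.
VERSION_RE = re.compile(r"version\s*=\s*['\"]([^'\"]+)['\"]")


def read_version(setup_py_source: str) -> str:
    """Return the version string declared in the given setup.py source."""
    found = VERSION_RE.search(setup_py_source)
    if found is None:
        raise ValueError("no version='...' literal found")
    return found.group(1)


def tag_for(setup_py_source: str) -> str:
    """Build the tag name the release steps would use for this version."""
    return "v" + read_version(setup_py_source)


# Illustrative setup.py fragment (assumed layout, not the real file).
sample = "setup(\n    name='yargy',\n    version='0.16.0',\n)\n"
print(tag_for(sample))
```

In practice, the source string would be read from setup.py and the result compared against the tag about to be pushed.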