
Text tagger based on ELMo embeddings and a recurrent neural network, with a simple sklearn-like interface

neuro_tagger
============
A text tagger based on a recurrent neural network. It can be used as a
named entity recognizer, dependency parser, morphological analyzer, etc.

The goal of this project is to create a simple Python package with an
sklearn-like interface for text tagging tasks (named entity
recognition, dependency parsing, etc.) when the number of labeled
texts is very small (no more than several thousand). Special word
embeddings called `ELMo <https://arxiv.org/pdf/1802.05365.pdf>`_
(**E**mbeddings from **L**anguage **Mo**dels) make this possible:
because these embeddings are contextual, they yield a simpler and more
separable feature space for the words in a text.
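
To make "sklearn-like interface" concrete, here is a minimal sketch of the ``fit``/``predict`` shape such a tagger could expose. The class ``MostFrequentTagger`` is hypothetical and is not part of neuro_tagger; it is a trivial baseline that memorizes the most frequent tag per token, used only to illustrate the calling convention (texts as lists of tokens, labels as lists of tags of the same length).

```python
from collections import Counter, defaultdict


class MostFrequentTagger:
    """Hypothetical baseline with an sklearn-style API (not the neuro_tagger class)."""

    def __init__(self, default_tag="O"):
        self.default_tag = default_tag

    def fit(self, X, y):
        # X: list of token lists; y: list of tag lists with matching lengths.
        counts = defaultdict(Counter)
        for tokens, tags in zip(X, y):
            if len(tokens) != len(tags):
                raise ValueError("each text needs exactly one tag per token")
            for token, tag in zip(tokens, tags):
                counts[token.lower()][tag] += 1
        # A trailing underscore marks a fitted attribute, as in scikit-learn.
        self.tag_by_token_ = {
            token: tag_counts.most_common(1)[0][0]
            for token, tag_counts in counts.items()
        }
        return self

    def predict(self, X):
        # Unknown tokens fall back to the default tag.
        return [
            [self.tag_by_token_.get(token.lower(), self.default_tag) for token in tokens]
            for tokens in X
        ]


X_train = [["Alice", "visited", "Paris"], ["Bob", "left", "Paris"]]
y_train = [["B-PER", "O", "B-LOC"], ["B-PER", "O", "B-LOC"]]

tagger = MostFrequentTagger().fit(X_train, y_train)
predicted = tagger.predict([["Alice", "left", "Berlin"]])
print(predicted)  # [['B-PER', 'O', 'O']]
```

The unseen token "Berlin" falls back to the default tag, which is exactly the weakness that contextual embeddings are meant to address: a memorizing baseline cannot generalize from a few thousand labeled texts.
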

ELMo embeddings are used as word features, and different neural
network architectures (BiLSTM, hybrid BiLSTM-CRF, or pure CRF) can be
used as the final classifier (tagger). I recommend the
`TensorFlow Hub ELMo <https://tfhub.dev/google/elmo/2>`_
for English NLP tasks and the `DeepPavlov ELMo
<http://docs.deeppavlov.ai/en/master/apiref/models/embedders.html#deeppavlov.models.embedders.elmo_embedder.ELMoEmbedder>`_
(http://files.deeppavlov.ai/deeppavlov_data/elmo_ru-news_wmt11-16_1.5M_steps.tar.gz)
for the same tasks in Russian.
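
As an illustration of what a CRF-style output layer adds over independent per-token classification, the sketch below runs Viterbi decoding over per-token tag scores plus tag-to-tag transition scores. This is a generic textbook decoder written for this note, not code from neuro_tagger. The function name ``viterbi_decode`` and all scores are made up; in a BiLSTM-CRF the emission scores would come from the network and the transition scores would be learned.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions: one dict per token, mapping tag -> score.
    transitions: dict mapping (previous_tag, tag) -> score; missing pairs score 0.0.
    """
    if not emissions:
        return []
    # best[t][tag] is the best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []
    for scores in emissions[1:]:
        current, pointers = {}, {}
        for tag in tags:
            prev_tag = max(
                tags, key=lambda p: best[-1][p] + transitions.get((p, tag), 0.0)
            )
            current[tag] = (
                best[-1][prev_tag] + transitions.get((prev_tag, tag), 0.0) + scores[tag]
            )
            pointers[tag] = prev_tag
        best.append(current)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


tags = ["O", "B-PER", "I-PER"]
# Made-up scores for three tokens, e.g. "John", "Smith", "sleeps".
emissions = [
    {"O": 1.0, "B-PER": 0.9, "I-PER": 0.0},
    {"O": 0.2, "B-PER": 0.1, "I-PER": 1.5},
    {"O": 2.0, "B-PER": 0.0, "I-PER": 0.0},
]
# Penalize the invalid BIO transition O -> I-PER.
transitions = {("O", "I-PER"): -10.0}

greedy = [max(tags, key=scores.get) for scores in emissions]
decoded = viterbi_decode(emissions, transitions, tags)
print(greedy)   # ['O', 'I-PER', 'O']
print(decoded)  # ['B-PER', 'I-PER', 'O']
```

Picking the best tag for each token independently yields an ``I-PER`` that follows ``O``, which is not a valid BIO sequence. Decoding with transition scores repairs the first tag to ``B-PER``.
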


