Skip to main content

Natural Language Procecssing Toolkit with support for tokenization, sentence splitting, lemmatization, tagging and parsing for more than 60 languages

Project description

NLP-Cube

NLP-Cube is an opensource Natural Language Processing Framework with support for languages which are included in the UD Treebanks. Use NLP-Cube if you need:

  • Sentence segmentation
  • Tokenization
  • POS Tagging (both language independent (UPOSes) and language dependent (XPOSes and ATTRs))
  • Lemmatization
  • Dependency parsing

Example input: "This is a test.", output is:

1       This    this    PRON    DT      Number=Sing|PronType=Dem        4       nsubj   _
2       is      be      AUX     VBZ     Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin   4       cop     _
3       a       a       DET     DT      Definite=Ind|PronType=Art       4       det     _
4       test    test    NOUN    NN      Number=Sing     0       root    SpaceAfter=No
5       .       .       PUNCT   .       _       4       punct   SpaceAfter=No

For user that just want to run it, here's how to set up and use NLP-Cube in a few lines: Quick Start Tutorial.

For advanced users that want to create and train their own models, please the the Advanced Tutorials in examples/, starting with how to locally install NLP-Cube.

Simple (PIP) installation

Install (or update) NLP-Cube with:

pip3 install -U nlpcube

API Usage

To use NLP-Cube *programmatically (in Python), follow this tutorial The summary would be:

from cube.api import Cube       # import the Cube object
cube=Cube(verbose=True)         # initialize it
cube.load("en")                 # select the desired language (it will auto-download the model)
text="This is the text I want segmented, tokenized, lemmatized and annotated with POS and dependencies."
sentences=cube(text)            # call with your own text (string) to obtain the annotations

The sentences object now contains the annotated text, one sentence at a time.

Webserver Usage

To use NLP-Cube as a web service, you need to locally install NLP-Cube and start the server:

For example, the following command will start the server and preload languages: en, fr and de.

cd cube
python3 webserver.py --port 8080 --lang=en --lang=fr --lang=de

To test, open the following link (please copy the address of the link as it is a local address and port link)

Cite

If you use NLP-Cube in your research we would be grateful if you would cite the following paper:

  • NLP-Cube: End-to-End Raw Text Processing With Neural Networks, Boroș, Tiberiu and Dumitrescu, Stefan Daniel and Burtica, Ruxandra, Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, Association for Computational Linguistics. p. 171--179. October 2018

or, in bibtex format:

@InProceedings{boro-dumitrescu-burtica:2018:K18-2,
  author    = {Boroș, Tiberiu  and  Dumitrescu, Stefan Daniel  and  Burtica, Ruxandra},
  title     = {{NLP}-Cube: End-to-End Raw Text Processing With Neural Networks},
  booktitle = {Proceedings of the {CoNLL} 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies},
  month     = {October},
  year      = {2018},
  address   = {Brussels, Belgium},
  publisher = {Association for Computational Linguistics},
  pages     = {171--179},
  abstract  = {We introduce NLP-Cube: an end-to-end Natural Language Processing framework, evaluated in CoNLL's "Multilingual Parsing from Raw Text to Universal Dependencies 2018" Shared Task. It performs sentence splitting, tokenization, compound word expansion, lemmatization, tagging and parsing. Based entirely on recurrent neural networks, written in Python, this ready-to-use open source system is freely available on GitHub. For each task we describe and discuss its specific network architecture, closing with an overview on the results obtained in the competition.},
  url       = {http://www.aclweb.org/anthology/K18-2017}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nlpcube-0.1.0.2.tar.gz (71.2 kB view hashes)

Uploaded Source

Built Distribution

nlpcube-0.1.0.2-py3-none-any.whl (97.5 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page