The Python JSON-NLP package
Python JSON-NLP Module
Brought to you by NLP-Lab.org!
The Python JSON-NLP module contains general mapping functions for JSON-NLP to CoNLL-U, a validator for the generated output, a Natural Language Processing (NLP) pipeline interface (for Flair, spaCy, NLTK, Polyglot, Xrenner, etc.), and various utility functions.
For more details, see JSON-NLP.
This module is a wrapper for outputs from different NLP pipelines and modules into a standardized JSON-NLP format.
To install this package, run the following command:
pip install pyjsonnlp
You might have to use pip3 on some systems.
JSON-NLP is based on a schema, built by NLP-Lab.org, to comprehensively and concisely represent linguistic annotations. We provide a validator to help ensure that generated JSON validates against the schema:
result = MyPipeline().process(text="I am a sentence")
assert pyjsonnlp.validation.is_valid(result)
To enable interoperability with other annotation formats, we support conversions between them. Note that a conversion can be lossy if the two formats do not annotate to the same depth. Currently we have a CoNLL-U to JSON-NLP converter that covers most annotations.
This functionality is still a work in progress.
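To illustrate what such a conversion involves, here is a minimal sketch that reads one CoNLL-U sentence block into a list of token dictionaries using only the standard library. It is not the package's converter: the function name `parse_conllu_sentence` and the output layout are hypothetical, and the field names simply follow the ten columns of the CoNLL-U format rather than the JSON-NLP schema.

```python
from collections import OrderedDict

# The ten tab-separated columns defined by the CoNLL-U format.
CONLLU_FIELDS = ("id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


def parse_conllu_sentence(block: str) -> list:
    """Turn one CoNLL-U sentence block into a list of token dicts.

    Hypothetical helper: the output layout is NOT the JSON-NLP schema.
    """
    tokens = []
    for line in block.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank lines and sentence-level comments are dropped
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        # "_" marks an unspecified value; keep it only where it could be
        # a literal word form or lemma.
        token = OrderedDict(
            (name, value)
            for name, value in zip(CONLLU_FIELDS, columns)
            if value != "_" or name in ("form", "lemma")
        )
        tokens.append(token)
    return tokens


sample = (
    "# text = I am\n"
    "1\tI\tI\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tam\tbe\tAUX\t_\t_\t0\troot\t_\t_\n"
)
tokens = parse_conllu_sentence(sample)
```

Note that this sketch discards the `# text = ...` comment line, which is a small instance of the lossiness mentioned above: information present in one format has no slot in the target structure unless the converter makes one.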
JSON-NLP provides a simple Pipeline interface that should be implemented for embedding into a microservice:
from collections import OrderedDict

import pyjsonnlp.pipeline

class MockPipeline(pyjsonnlp.pipeline.Pipeline):
    @staticmethod
    def process(text='', coreferences=False, constituents=False, dependencies=False, expressions=False, **kwargs) -> OrderedDict:
        # A real pipeline would build and return the JSON-NLP document here.
        return OrderedDict()
The provided keyword arguments should be used to toggle on or off processing components within the method.
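As a rough illustration of that toggling, the sketch below implements `process` with a whitespace tokenizer and adds optional sections only when their flags are set. The `Pipeline` base class here is a local stand-in so the example runs without the package installed, `WhitespacePipeline` is a hypothetical name, and the keys and token layout in the returned dictionary are placeholders rather than schema-valid JSON-NLP.

```python
from collections import OrderedDict


class Pipeline:
    """Local stand-in for pyjsonnlp.pipeline.Pipeline (assumption)."""

    @staticmethod
    def process(text='', coreferences=False, constituents=False,
                dependencies=False, expressions=False, **kwargs) -> OrderedDict:
        raise NotImplementedError


class WhitespacePipeline(Pipeline):
    """Hypothetical pipeline that only splits text on whitespace."""

    @staticmethod
    def process(text='', coreferences=False, constituents=False,
                dependencies=False, expressions=False, **kwargs) -> OrderedDict:
        result = OrderedDict()
        # Tokens are always produced, regardless of the flags.
        result['tokenList'] = [
            {'id': index, 'text': word}
            for index, word in enumerate(text.split(), start=1)
        ]
        # Each optional component runs only when its flag is switched on.
        # The empty lists are placeholders for real component output.
        if dependencies:
            result['dependencies'] = []
        if constituents:
            result['constituents'] = []
        if coreferences:
            result['coreferences'] = []
        if expressions:
            result['expressions'] = []
        return result


output = WhitespacePipeline.process(text="I am a sentence", dependencies=True)
```

Keeping every component behind its own flag means a caller that only needs tokens never pays for a parser it did not ask for.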
The next step is the JSON-NLP Microservice class, which comes with a pre-built implementation for Flask.
from pyjsonnlp.microservices.flask_server import FlaskMicroservice

app = FlaskMicroservice(__name__, MyPipeline(), base_route='/')
We recommend creating a server.py with the FlaskMicroservice class, which extends the Flask app. A corresponding WSGI file would contain:
from mypipeline.server import app as application
To disable a pipeline component (such as phrase structure parsing), add:
application.constituents = False
The full list of properties that can be enabled or disabled is:
The microservice exposes the following URIs:
These URIs are shortcuts to disable the other components of the parse. In all cases, tokenList will be included in the JSON-NLP output. An example URL is:
http://localhost:5000/dependencies?text=I am a sentence
Text is provided to the microservice with the text parameter, via either GET or POST. If you pass url as a parameter, the microservice will scrape that URL and process the text of the website.
Other parameters specific to your pipeline implementation can be passed as well:
http://localhost:5000?lang=en&constituents=0&text=I am a sentence.
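The example URLs above contain literal spaces, which need to be encoded before they are sent. The following sketch builds the same requests with the standard library, so the encoding is handled for you. The helper names `build_get_url` and `build_post_request` are hypothetical, the base URL assumes a local development server, and sending POST parameters as form data is an assumption about how your deployment reads them.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:5000"  # assumed local development server


def build_get_url(route: str, **params) -> str:
    """Return a GET URL with the parameters encoded into the query string."""
    return f"{BASE_URL}{route}?{urlencode(params)}"


def build_post_request(route: str, **params) -> Request:
    """Return an unsent POST request carrying the parameters as form data."""
    body = urlencode(params).encode("utf-8")
    return Request(f"{BASE_URL}{route}", data=body, method="POST")


# Shortcut route: only the dependency component (plus tokenList).
dependencies_url = build_get_url("/dependencies", text="I am a sentence")

# Base route with pipeline-specific parameters.
custom_url = build_get_url("/", lang="en", constituents=0,
                           text="I am a sentence.")

# The same text sent in a POST body instead of the query string.
post_request = build_post_request("/", text="I am a sentence")
```

Nothing is sent here. Passing `dependencies_url` or `post_request` to `urllib.request.urlopen` would issue the request against a running service.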