A Toolkit for Pre-trained Sequence Labeling Models Inference
# LightNER
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![PyPI version](https://badge.fury.io/py/LightNER.svg)](https://badge.fury.io/py/LightNER)
[![Downloads](https://pepy.tech/badge/lightner)](https://pepy.tech/project/lightner)
<!-- [![Documentation Status](https://readthedocs.org/projects/tensorboard-wrapper/badge/?version=latest)](http://tensorboard-wrapper.readthedocs.io/en/latest/?badge=latest) -->
**Check Our New NER Toolkit🚀🚀🚀**
- **Inference**:
  - **[LightNER](https://github.com/LiyuanLucasLiu/LightNER)**: inference w. models pre-trained / trained w. *any* following tools, *efficiently*.
- **Training**:
  - **[LD-Net](https://github.com/LiyuanLucasLiu/LD-Net)**: train NER models w. efficient contextualized representations.
  - **[VanillaNER](https://github.com/LiyuanLucasLiu/Vanilla_NER)**: train vanilla NER models w. pre-trained embedding.
- **Distant Training**:
  - **[AutoNER](https://github.com/shangjingbo1226/AutoNER)**: train NER models w.o. line-by-line annotations and get competitive performance.
--------------------------------
This package supports inference with models pre-trained by:
- [Vanilla_NER](https://github.com/LiyuanLucasLiu/Vanilla_NER): vanilla sequence labeling models.
- [LD-Net](https://github.com/LiyuanLucasLiu/LD-Net): sequence labeling models w. efficient contextualized representation.
- [AutoNER](https://github.com/shangjingbo1226/AutoNER): distantly supervised named entity recognition models (*no line-by-line annotations for training*).
We are in an early-release beta. Expect some adventures and rough edges.
## Quick Links
- [Installation](#installation)
- [Usage](#usage)
## Installation
To install via PyPI:
```
pip install lightner
```
To build from source:
```
pip install git+https://github.com/LiyuanLucasLiu/LightNER
```
or
```
git clone https://github.com/LiyuanLucasLiu/LightNER.git
cd LightNER
python setup.py install
```
## Usage
### Pre-trained Models
| | Model | Task | Performance |
| ------------- |------------- | ------------- | ------------- |
| [LD-Net](https://github.com/LiyuanLucasLiu/LD-Net) | [pner1.th](http://dmserv4.cs.illinois.edu/pner1.th) | NER for (PER, LOC, ORG & MISC) | F1 92.21 |
| [LD-Net](https://github.com/LiyuanLucasLiu/LD-Net) | [pnp0.th](http://dmserv4.cs.illinois.edu/pnp0.th) | Chunking | F1 95.79 |
| Vanilla_NER | | NER for (PER, LOC, ORG & MISC) | |
| Vanilla_NER | | Chunking | |
| [AutoNER](https://github.com/shangjingbo1226/AutoNER) | [autoner0.th](http://dmserv4.cs.illinois.edu/bioner_models/autoner0.th) | Distant NER trained w.o. line-by-line annotations (Disease, Chemical) | F1 85.30 |
### Decode API
The decode API can be called as follows:
```
from lightner import decoder_wrapper
model = decoder_wrapper()
model.decode(["Ronaldo", "won", "'t", "score", "more", "than", "30", "goals", "for", "Juve", "."])
```
The ```decode()``` method can also decode at the document level (a list of lists of ```str```) or at the corpus level (a list of lists of lists of ```str```).
The ```decoder_wrapper``` function can be customized by choosing a different pre-trained model or by passing an additional ```configs``` file:
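The three input shapes differ only in how deeply the token lists are nested. The sketch below builds one example of each and uses a small helper, `input_level`, to tell them apart. `input_level` is a hypothetical name introduced here for illustration and is not part of LightNER; the calls to `model.decode` are left as a comment because they need an installed package and a downloaded checkpoint.

```python
# Hypothetical helper for illustration only; it is not part of the LightNER API.
def input_level(data):
    """Classify nested token lists by nesting depth."""
    depth = 0
    while isinstance(data, list):
        depth += 1
        if not data:
            break
        data = data[0]
    levels = {1: "sentence", 2: "document", 3: "corpus"}
    if depth not in levels:
        raise ValueError(f"expected 1 to 3 levels of list nesting, got {depth}")
    return levels[depth]


# Sentence level: a list of str.
sentence = ["Ronaldo", "won", "'t", "score", "more", "than", "30", "goals", "for", "Juve", "."]
# Document level: a list of sentences.
document = [sentence, ["He", "plays", "in", "Turin", "."]]
# Corpus level: a list of documents.
corpus = [document, [["Another", "document", "."]]]

for data in (sentence, document, corpus):
    print(input_level(data))

# With a loaded model, each object would be passed the same way:
# model.decode(sentence); model.decode(document); model.decode(corpus)
```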
```
model = decoder_wrapper(URL_OR_PATH_TO_CHECKPOINT, configs)
```
You can list the available config options with:
```
lightner decode -h
```
### Console
After installing the package and downloading the pre-trained models, run inference with:
```
lightner decode -m MODEL_FILE -i INPUT_FILE -o OUTPUT_FILE
```
You can find more options by:
```
lightner decode -h
```
The currently accepted input format is shown below (one token per line; the ```-DOCSTART-``` line is optional):
```
-DOCSTART-
Ronaldo
won
't
score
more
than
30
goals
for
Juve
.
```
The output would be:
```
<PER> Ronaldo </PER> won 't score more than 30 goals for <ORG> Juve </ORG> .
```
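For scripting around the console command, a minimal sketch of the two file-handling steps might look like the following: writing tokens in the one-token-per-line input format, and pulling entity spans out of a tagged output line. The function names `write_input_file` and `extract_entities` are hypothetical and not part of LightNER, and the parser assumes that tagged spans look exactly like the example above and are never nested.

```python
import re
import tempfile
from pathlib import Path


def write_input_file(tokens, path, docstart=True):
    """Write one token per line, optionally preceded by a -DOCSTART- line."""
    lines = (["-DOCSTART-"] if docstart else []) + list(tokens)
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


# Matches "<TYPE> tokens </TYPE>" spans as they appear in the sample output.
ENTITY_PATTERN = re.compile(r"<(\w+)> (.*?) </\1>")


def extract_entities(tagged_line):
    """Return (entity_type, entity_text) pairs from one tagged output line."""
    return [(m.group(1), m.group(2)) for m in ENTITY_PATTERN.finditer(tagged_line)]


tokens = ["Ronaldo", "won", "'t", "score", "more", "than", "30", "goals", "for", "Juve", "."]

# Write the tokens to a temporary file and read them back to check the layout.
with tempfile.TemporaryDirectory() as tmp:
    input_path = Path(tmp) / "input.txt"
    write_input_file(tokens, input_path)
    written = input_path.read_text(encoding="utf-8").splitlines()

tagged = "<PER> Ronaldo </PER> won 't score more than 30 goals for <ORG> Juve </ORG> ."
entities = extract_entities(tagged)
print(entities)
```

The file written by `write_input_file` can then be passed as `INPUT_FILE` to `lightner decode`.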