
# LightNER

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![PyPI version](https://badge.fury.io/py/LightNER.svg)](https://badge.fury.io/py/LightNER)
[![Downloads](https://pepy.tech/badge/lightner)](https://pepy.tech/project/lightner)

**Check Our New NER Toolkit🚀🚀🚀**
- **Inference**:
  - **[LightNER](https://github.com/LiyuanLucasLiu/LightNER)**: inference w. models pre-trained / trained w. *any* of the following tools, *efficiently*.
- **Training**:
  - **[LD-Net](https://github.com/LiyuanLucasLiu/LD-Net)**: train NER models w. efficient contextualized representations.
  - **[VanillaNER](https://github.com/LiyuanLucasLiu/Vanilla_NER)**: train vanilla NER models w. pre-trained embedding.
- **Distant Training**:
  - **[AutoNER](https://github.com/shangjingbo1226/AutoNER)**: train NER models w.o. line-by-line annotations and get competitive performance.

--------------------------------

This package supports inference with models pre-trained by:
- [Vanilla_NER](https://github.com/LiyuanLucasLiu/Vanilla_NER): vanilla sequence labeling models.
- [LD-Net](https://github.com/LiyuanLucasLiu/LD-Net): sequence labeling models w. efficient contextualized representation.
- [AutoNER](https://github.com/shangjingbo1226/AutoNER): distant supervised named entity recognition models (*no line-by-line annotations for training*).

We are in an early-release beta. Expect some adventures and rough edges.

## Quick Links

- [Installation](#installation)
- [Usage](#usage)

## Installation

To install via PyPI:
```
pip install lightner
```

To build from source:
```
pip install git+https://github.com/LiyuanLucasLiu/LightNER
```
or
```
git clone https://github.com/LiyuanLucasLiu/LightNER.git
cd LightNER
python setup.py install
```

## Usage

### Pre-trained Models

| Toolkit | Model | Task | Performance |
| ------------- |------------- | ------------- | ------------- |
| [LD-Net](https://github.com/LiyuanLucasLiu/LD-Net) | [pner1.th](http://dmserv4.cs.illinois.edu/pner1.th) | NER for (PER, LOC, ORG & MISC) | F1 92.21 |
| [LD-Net](https://github.com/LiyuanLucasLiu/LD-Net) | [pnp0.th](http://dmserv4.cs.illinois.edu/pnp0.th) | Chunking | F1 95.79 |
| Vanilla_NER | | NER for (PER, LOC, ORG & MISC) | |
| Vanilla_NER | | Chunking | |
| [AutoNER](https://github.com/shangjingbo1226/AutoNER) | [autoner0.th](http://dmserv4.cs.illinois.edu/bioner_models/autoner0.th) | Distant NER trained w.o. line-by-line annotations (Disease, Chemical) | F1 85.30 |

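To fetch one of the checkpoints listed above for local use, something along these lines should work (the URL comes from the table; the local filename is arbitrary):
```
# Download the LD-Net NER checkpoint to a local file for later use.
import urllib.request

url = "http://dmserv4.cs.illinois.edu/pner1.th"
urllib.request.urlretrieve(url, "pner1.th")
```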

### Decode API

The decode API can be called as follows:
```
from lightner import decoder_wrapper
model = decoder_wrapper()
model.decode(["Ronaldo", "won", "'t", "score", "more", "than", "30", "goals", "for", "Juve", "."])
```

The ```decode()``` method can also conduct decoding at the document level (taking a list of lists of ```str``` as input) or at the corpus level (taking a list of lists of lists of ```str``` as input), as sketched below.
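
A minimal sketch, assuming the ```model``` loaded above and using made-up sentences:
```
# Document-level decoding: one document = a list of sentences,
# each sentence = a list of str tokens.
document = [
    ["Ronaldo", "won", "'t", "score", "more", "than", "30", "goals", "for", "Juve", "."],
    ["He", "scored", "28", "goals", "last", "season", "."],
]
model.decode(document)

# Corpus-level decoding: one corpus = a list of documents.
corpus = [document]
model.decode(corpus)
```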

The ```decoder_wrapper``` method can be customized by choosing a different pre-trained model or by passing an additional ```configs``` file:
```
model = decoder_wrapper(URL_OR_PATH_TO_CHECKPOINT, configs)
```
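
For example, a sketch that loads the LD-Net NER checkpoint from the table above (this assumes ```configs``` can be omitted, as in the default call; a locally downloaded path should also work in place of the URL):
```
from lightner import decoder_wrapper

# Checkpoint URL taken from the pre-trained models table; a local path works too.
model = decoder_wrapper("http://dmserv4.cs.illinois.edu/pner1.th")
model.decode(["Ronaldo", "won", "'t", "score", "more", "than", "30", "goals", "for", "Juve", "."])
```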
You can list the available config options with:
```
lightner decode -h
```

### Console

After installing the package and downloading a pre-trained model, run inference with:
```
lightner decode -m MODEL_FILE -i INPUT_FILE -o OUTPUT_FILE
```

You can find more options by:
```
lightner decode -h
```

The currently accepted document format is as below (one token per line; ```-DOCSTART-``` is optional):
```
-DOCSTART-

Ronaldo
won
't
score
more
than
30
goals
for
Juve
.
```

The output would be:
```
<PER> Ronaldo </PER> won 't score more than 30 goals for <ORG> Juve </ORG> .
```
