A SpaCy wrapper for the GLiNER model for enhanced Named Entity Recognition capabilities
GLiNER SpaCy Wrapper
Introduction
This project is a wrapper that integrates GLiNER, a Named Entity Recognition (NER) model, with the SpaCy Natural Language Processing (NLP) library. GLiNER (Generalist and Lightweight model for Named Entity Recognition) recognizes entities in text for the labels you give it. The wrapper exposes GLiNER as a SpaCy pipeline component, so the recognized entities are available through the usual SpaCy Doc interface.
For GLiNER to work properly, use Python 3.7 to 3.10.
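A quick way to confirm that the running interpreter falls inside that range is sketched below. The helper name `is_supported` is hypothetical and not part of gliner-spacy; the bounds are simply the ones stated above.

```python
import sys

# Supported range as stated above: Python 3.7 through 3.10 (inclusive).
MIN_VERSION = (3, 7)
MAX_VERSION = (3, 10)


def is_supported(version=None):
    """Return True if (major, minor) falls inside the stated range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) <= MAX_VERSION


if not is_supported():
    # Only a warning: the package may still import on other versions.
    print(
        "Warning: Python %d.%d is outside the 3.7-3.10 range stated for GLiNER."
        % sys.version_info[:2]
    )
```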
Features
- Integrates GLiNER with SpaCy for advanced NER tasks.
- Customizable chunk size for processing large texts.
- Support for specific entity labels like 'person' and 'organization'.
- Configurable output style for entity recognition results.
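To give a feel for what the chunk-size feature listed above implies, here is a minimal sketch of splitting a long text into pieces before running a model on each piece. It assumes chunks are measured in characters and cut at whitespace; that is an assumption made for illustration, and the wrapper's own chunking logic may differ. The function name `chunk_text` is hypothetical here.

```python
def chunk_text(text, chunk_size):
    """Split text into pieces of roughly chunk_size characters, cutting at whitespace."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        # Push the cut forward to the next whitespace so no word is split in two.
        while end < len(text) and not text[end].isspace():
            end += 1
        chunks.append(text[start:end])
        start = end
    return chunks


text = "This is a text about Bill Gates and Microsoft."
chunks = chunk_text(text, 10)
# The chunks are contiguous slices, so joining them restores the input.
print("".join(chunks) == text)
```

Because a cut only ever moves forward to the next whitespace, a chunk can be somewhat longer than `chunk_size` when a long word straddles the boundary.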
Installation
Install the library with pip:
pip install gliner-spacy
Usage
To use this wrapper in your SpaCy pipeline, follow these steps:
- Import SpaCy.
- Create a SpaCy Language instance.
- Add the gliner_spacy component to the SpaCy pipeline.
- Process text using the pipeline.
Example code:
import spacy
nlp = spacy.blank("en")
nlp.add_pipe("gliner_spacy")
text = "This is a text about Bill Gates and Microsoft."
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_)
Expected Output
Bill Gates person
Microsoft organization
Example with Custom Configs
import spacy
custom_spacy_config = {
    "gliner_model": "urchade/gliner_multi",
    "chunk_size": 250,
    "labels": ["people", "company"],
    "style": "ent",
}
nlp = spacy.blank("en")
nlp.add_pipe("gliner_spacy", config=custom_spacy_config)
text = "This is a text about Bill Gates and Microsoft."
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_, ent._.score)
# Output
# Bill Gates people 0.9967108964920044
# Microsoft company 0.9966742992401123
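Since each entity carries a score in `ent._.score`, the results can also be filtered after the pipeline has run. The sketch below shows one way to do that. The helper `filter_by_score` and the stand-in objects are hypothetical, and the scores are invented for illustration; with a real pipeline you would pass `doc.ents` instead of the stand-ins.

```python
from types import SimpleNamespace


def filter_by_score(ents, min_score):
    """Keep entities whose ent._.score is at least min_score."""
    return [ent for ent in ents if ent._.score >= min_score]


def make_ent(text, label, score):
    """Hypothetical stand-in for a SpaCy entity that exposes ent._.score."""
    return SimpleNamespace(text=text, label_=label, _=SimpleNamespace(score=score))


# Invented scores, used only to show the filtering step.
ents = [
    make_ent("Bill Gates", "people", 0.99),
    make_ent("text", "company", 0.31),
]
kept = filter_by_score(ents, min_score=0.9)
print([(ent.text, ent.label_) for ent in kept])
```

This is a post-processing step on the output and is separate from the wrapper's own threshold setting, which acts inside the model call.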
Example with Loading an ONNX Model
import spacy
custom_spacy_config = {
    "gliner_model": "onnx-community/gliner_base",
    "chunk_size": 250,
    "labels": ["people", "company"],
    "style": "ent",
    "load_onnx_model": True,
    "onnx_model_file": "onnx/model.onnx",
}
nlp = spacy.blank("en")
nlp.add_pipe("gliner_spacy", config=custom_spacy_config)
text = "This is a text about Bill Gates and Microsoft."
doc = nlp(text)
for ent in doc.ents:
    print(ent.text, ent.label_, ent._.score)
# Output
# Bill Gates people 0.9937531352043152
# Microsoft company 0.994135856628418
Configuration
The default configuration of the wrapper can be modified according to your requirements. The configurable parameters are:
- gliner_model: The GLiNER model to be used.
- chunk_size: Size of the text chunk to be processed at once.
- labels: The entity labels to be recognized.
- style: The style of output for the entities (either 'ent' or 'span').
- threshold: The threshold of the GLiNER model (controls the degree to which a hit is considered an entity).
- map_location: The device on which to run the model: cpu or cuda.
- load_onnx_model: Whether the gliner_model specified is an ONNX model (False by default).
- onnx_model_file: The path to the ONNX file in the Hugging Face repo. Defaults to model.onnx.
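A misspelled key is easy to miss in a config dictionary, so it can help to check the dictionary against the parameter names listed above before handing it to the pipeline. The sketch below does that. The helpers `check_gliner_config` and `build_pipeline` are hypothetical and not part of gliner-spacy, and the `threshold` and `map_location` values are only illustrative.

```python
# Parameter names are taken from the list above; the validation helper itself
# is hypothetical and not part of gliner-spacy.
ALLOWED_KEYS = {
    "gliner_model", "chunk_size", "labels", "style", "threshold",
    "map_location", "load_onnx_model", "onnx_model_file",
}
ALLOWED_STYLES = {"ent", "span"}


def check_gliner_config(config):
    """Raise ValueError on unknown keys or an unsupported style; return config."""
    unknown = sorted(set(config) - ALLOWED_KEYS)
    if unknown:
        raise ValueError(f"Unknown config keys: {unknown}")
    style = config.get("style")
    if style is not None and style not in ALLOWED_STYLES:
        raise ValueError(f"style must be 'ent' or 'span', got {style!r}")
    return config


def build_pipeline(config):
    """Create a blank English pipeline with the validated GLiNER component."""
    import spacy  # imported lazily so the checks above run without SpaCy

    nlp = spacy.blank("en")
    nlp.add_pipe("gliner_spacy", config=check_gliner_config(config))
    return nlp


# Illustrative values; adjust them to your own model and hardware.
config = check_gliner_config({
    "gliner_model": "urchade/gliner_multi",
    "labels": ["person", "organization"],
    "style": "span",
    "threshold": 0.5,
    "map_location": "cpu",
})
```

Calling `build_pipeline(config)` then adds the component exactly as in the examples above, and a typo such as `chunk_sise` fails early with a clear message.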
Contributing
Contributions to this project are welcome. Please ensure that your code adheres to the project's coding standards and include tests for new features.
File details
Details for the file gliner_spacy-0.0.11.tar.gz.
File metadata
- Download URL: gliner_spacy-0.0.11.tar.gz
- Upload date:
- Size: 6.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 80d07933c7a2c2e457b0e0274f69962425603c7dbeaac5838c010bfaf3178091 |
| MD5 | 8115f26e23a5885f034f4c1fe436da0b |
| BLAKE2b-256 | 001bc737e988cddedc00aadd90bd6ca9cae2a1b974e3a3d4ea7ca64dcf7670d2 |
File details
Details for the file gliner_spacy-0.0.11-py3-none-any.whl.
File metadata
- Download URL: gliner_spacy-0.0.11-py3-none-any.whl
- Upload date:
- Size: 6.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b44836c4b4a895307aaaf15694b1c2cd100d2365eed510ca49062227be85d000 |
| MD5 | 93a438de3f835fb3cd51321e6bde477f |
| BLAKE2b-256 | 757d3942a58b5d3be6021f93552fb170d18ecf2079bc73927c7af36524938900 |