Generalist model for Relation Extraction (extract any relation type from text)

GLiREL: Generalist and Lightweight model for Zero-Shot Relation Extraction

GLiREL is a Relation Extraction model capable of classifying unseen relations given the entities within a text. It builds upon the excellent work done by Urchade Zaratiana, Nadi Tomeh, Pierre Holat, and Thierry Charnois on the GLiNER library, which enables efficient zero-shot Named Entity Recognition.

📄 GLiREL Paper • 📄 GLiNER Paper • 🤗 Demo • 🤗 Available models

Installation

pip install glirel

Usage

Once you've installed the GLiREL library, you can import the GLiREL class. You can then load a model using GLiREL.from_pretrained and predict relations with predict_relations.

from glirel import GLiREL
import spacy

model = GLiREL.from_pretrained("jackboyla/glirel-large-v0")

nlp = spacy.load('en_core_web_sm')

text = 'Derren Nesbitt had a history of being cast in "Doctor Who", having played villainous warlord Tegana in the 1964 First Doctor serial "Marco Polo".'
doc = nlp(text)
tokens = [token.text for token in doc]

labels = ['country of origin', 'licensed to broadcast to', 'father', 'followed by', 'characters']

# Each entry is [start, end, type, text], with inclusive token indices.
# 'type' is not used -- it can be any string!
ner = [[26, 27, 'PERSON', 'Marco Polo'], [22, 23, 'Q2989412', 'First Doctor']]

relations = model.predict_relations(tokens, labels, threshold=0.0, ner=ner, top_k=1)

print('Number of relations:', len(relations))

sorted_data_desc = sorted(relations, key=lambda x: x['score'], reverse=True)
print("\nDescending Order by Score:")
for item in sorted_data_desc:
    print(f"{item['head_text']} --> {item['label']} --> {item['tail_text']} | score: {item['score']}")

Expected Output

Number of relations: 2

Descending Order by Score:
['Marco', 'Polo'] --> characters --> ['First', 'Doctor'] | score: 0.9923334121704102
['First', 'Doctor'] --> characters --> ['Marco', 'Polo'] | score: 0.9915636777877808

Each relation is a dict that also carries the token positions. The first relation above, printed in full:

{'head_pos': [26, 28], 'tail_pos': [22, 24], 'head_text': ['Marco', 'Polo'], 'tail_text': ['First', 'Doctor'], 'label': 'characters', 'score': 0.9923334121704102}
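
Since each predicted relation is a plain dict with head_text, tail_text, label and score keys, post-processing needs nothing beyond standard Python. The following is a minimal sketch, assuming the dict layout shown in this section; filter_relations and format_relation are hypothetical helper names, not part of the GLiREL API, and the 0.992 cut-off is chosen only to separate the two sample scores.

```python
from typing import Dict, List

# Sample relation dicts, copied from the example output in this section.
relations = [
    {'head_pos': [26, 28], 'tail_pos': [22, 24], 'head_text': ['Marco', 'Polo'],
     'tail_text': ['First', 'Doctor'], 'label': 'characters', 'score': 0.9923334121704102},
    {'head_pos': [22, 24], 'tail_pos': [26, 28], 'head_text': ['First', 'Doctor'],
     'tail_text': ['Marco', 'Polo'], 'label': 'characters', 'score': 0.9915636777877808},
]


def filter_relations(rels: List[Dict], min_score: float) -> List[Dict]:
    """Keep relations scoring at least min_score, highest score first."""
    kept = [r for r in rels if r['score'] >= min_score]
    return sorted(kept, key=lambda r: r['score'], reverse=True)


def format_relation(rel: Dict) -> str:
    """Render one relation as 'head --> label --> tail (score)'."""
    head = ' '.join(rel['head_text'])  # token list -> readable string
    tail = ' '.join(rel['tail_text'])
    return f"{head} --> {rel['label']} --> {tail} ({rel['score']:.3f})"


lines = [format_relation(r) for r in filter_relations(relations, min_score=0.992)]
for line in lines:
    print(line)
```

With the sample data, only the first relation clears the cut-off. Lowering min_score to 0.0 keeps both.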

Constrain labels

In practice, we usually want to define the types of entities that can exist as a head and/or tail of a relationship. This is already implemented in GLiREL:

labels = {"glirel_labels": {
    'co-founder': {"allowed_head": ["PERSON"], "allowed_tail": ["ORG"]}, 
    'no relation': {},  # head and tail can be any entity type 
    'country of origin': {"allowed_head": ["PERSON", "ORG"], "allowed_tail": ["LOC", "GPE"]}, 
    'parent': {"allowed_head": ["PERSON"], "allowed_tail": ["PERSON"]}, 
    'located in or next to body of water': {"allowed_head": ["LOC", "GPE", "FAC"], "allowed_tail": ["LOC", "GPE"]},  
    'spouse': {"allowed_head": ["PERSON"], "allowed_tail": ["PERSON"]},  
    'child': {"allowed_head": ["PERSON"], "allowed_tail": ["PERSON"]},  
    'founder': {"allowed_head": ["PERSON"], "allowed_tail": ["ORG"]},  
    'founded on date': {"allowed_head": ["ORG"], "allowed_tail": ["DATE"]},
    'headquartered in': {"allowed_head": ["ORG"], "allowed_tail": ["LOC", "GPE", "FAC"]},  
    'acquired by': {"allowed_head": ["ORG"], "allowed_tail": ["ORG", "PERSON"]},  
    'subsidiary of': {"allowed_head": ["ORG"], "allowed_tail": ["ORG", "PERSON"]}, 
    }
}
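
To make the meaning of allowed_head and allowed_tail concrete, here is a small illustration in plain Python. It is not GLiREL's internal implementation: is_pair_allowed and candidate_labels are hypothetical helpers, and the sketch assumes that a missing key means "any entity type", as the comment on 'no relation' suggests.

```python
from typing import Dict, List

# A shortened version of the label constraints shown in this section.
labels = {"glirel_labels": {
    'co-founder': {"allowed_head": ["PERSON"], "allowed_tail": ["ORG"]},
    'no relation': {},  # no constraints: any head and tail type
    'founded on date': {"allowed_head": ["ORG"], "allowed_tail": ["DATE"]},
}}


def is_pair_allowed(constraints: Dict[str, List[str]], head_type: str, tail_type: str) -> bool:
    """True if the (head_type, tail_type) pair satisfies the constraints.

    Assumption: an absent 'allowed_head' or 'allowed_tail' key allows any type.
    """
    head_ok = "allowed_head" not in constraints or head_type in constraints["allowed_head"]
    tail_ok = "allowed_tail" not in constraints or tail_type in constraints["allowed_tail"]
    return head_ok and tail_ok


def candidate_labels(label_config: dict, head_type: str, tail_type: str) -> List[str]:
    """List the relation labels that may apply to an entity pair of these types."""
    return [
        name
        for name, constraints in label_config["glirel_labels"].items()
        if is_pair_allowed(constraints, head_type, tail_type)
    ]


print(candidate_labels(labels, "PERSON", "ORG"))
print(candidate_labels(labels, "ORG", "DATE"))
```

Under these assumptions, a PERSON/ORG pair can only be labelled 'co-founder' or 'no relation', while an ORG/DATE pair can only be labelled 'no relation' or 'founded on date'.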

Usage with spaCy

You can also load GLiREL into a regular spaCy NLP pipeline. Here's an example using an English pipeline.

import spacy
import glirel

# Load an existing spaCy pipeline (it needs an "ner" component, since GLiREL is added after it below)
nlp = spacy.load('en_core_web_sm')

# Add the GLiREL component to the pipeline
nlp.add_pipe("glirel", after="ner")

# Now you can use the pipeline with the GLiREL component
text = "Apple Inc. was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in April 1976. The company is headquartered in Cupertino, California."

labels = {"glirel_labels": {
    'co-founder': {"allowed_head": ["PERSON"], "allowed_tail": ["ORG"]}, 
    'country of origin': {"allowed_head": ["PERSON", "ORG"], "allowed_tail": ["LOC", "GPE"]}, 
    'licensed to broadcast to': {"allowed_head": ["ORG"]},  
    'no relation': {},  
    'parent': {"allowed_head": ["PERSON"], "allowed_tail": ["PERSON"]}, 
    'followed by': {"allowed_head": ["PERSON", "ORG"], "allowed_tail": ["PERSON", "ORG"]},  
    'located in or next to body of water': {"allowed_head": ["LOC", "GPE", "FAC"], "allowed_tail": ["LOC", "GPE"]},  
    'spouse': {"allowed_head": ["PERSON"], "allowed_tail": ["PERSON"]},  
    'child': {"allowed_head": ["PERSON"], "allowed_tail": ["PERSON"]},  
    'founder': {"allowed_head": ["PERSON"], "allowed_tail": ["ORG"]},  
    'headquartered in': {"allowed_head": ["ORG"], "allowed_tail": ["LOC", "GPE", "FAC"]},  
    'acquired by': {"allowed_head": ["ORG"], "allowed_tail": ["ORG", "PERSON"]},  
    'subsidiary of': {"allowed_head": ["ORG"], "allowed_tail": ["ORG", "PERSON"]}, 
    }
}

# Pass the labels as context at inference time; with as_tuples=True,
# nlp.pipe yields (doc, context) pairs
docs = list(nlp.pipe([(text, labels)], as_tuples=True))
relations = docs[0][0]._.relations

print('Number of relations:', len(relations))

sorted_data_desc = sorted(relations, key=lambda x: x['score'], reverse=True)
print("\nDescending Order by Score:")
for item in sorted_data_desc:
    print(f"{item['head_text']} --> {item['label']} --> {item['tail_text']} | score: {item['score']}")

Expected Output

Number of relations: 5

Descending Order by Score:
['Apple', 'Inc.'] --> headquartered in --> ['California'] | score: 0.9854260683059692
['Apple', 'Inc.'] --> headquartered in --> ['Cupertino'] | score: 0.9569844603538513
['Steve', 'Wozniak'] --> co-founder --> ['Apple', 'Inc.'] | score: 0.09025496244430542
['Steve', 'Jobs'] --> co-founder --> ['Apple', 'Inc.'] | score: 0.08805803954601288
['Ronald', 'Wayne'] --> co-founder --> ['Apple', 'Inc.'] | score: 0.07996643334627151
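
In this output the two 'headquartered in' relations score above 0.95 while the three 'co-founder' relations score below 0.1, so the choice of score cut-off changes the result considerably. The sketch below groups relations by label above a cut-off; group_by_label is a hypothetical helper and the cut-off values are illustrative assumptions, not recommendations from the library. The sample dicts contain only the keys printed above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Sample relations, copied from the expected output in this section.
relations = [
    {'head_text': ['Apple', 'Inc.'], 'tail_text': ['California'],
     'label': 'headquartered in', 'score': 0.9854260683059692},
    {'head_text': ['Apple', 'Inc.'], 'tail_text': ['Cupertino'],
     'label': 'headquartered in', 'score': 0.9569844603538513},
    {'head_text': ['Steve', 'Wozniak'], 'tail_text': ['Apple', 'Inc.'],
     'label': 'co-founder', 'score': 0.09025496244430542},
    {'head_text': ['Steve', 'Jobs'], 'tail_text': ['Apple', 'Inc.'],
     'label': 'co-founder', 'score': 0.08805803954601288},
    {'head_text': ['Ronald', 'Wayne'], 'tail_text': ['Apple', 'Inc.'],
     'label': 'co-founder', 'score': 0.07996643334627151},
]


def group_by_label(rels: List[dict], threshold: float) -> Dict[str, List[Tuple[str, str]]]:
    """Map each label to its (head, tail) pairs whose score is >= threshold."""
    grouped = defaultdict(list)
    for rel in rels:
        if rel['score'] >= threshold:
            pair = (' '.join(rel['head_text']), ' '.join(rel['tail_text']))
            grouped[rel['label']].append(pair)
    return dict(grouped)


strict = group_by_label(relations, threshold=0.5)
lenient = group_by_label(relations, threshold=0.05)
print(strict)
print(lenient)
```

With a cut-off of 0.5 only the 'headquartered in' pairs survive; with 0.05 the three 'co-founder' pairs are kept as well.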

Example training data

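Note that the entity indices are inclusive, i.e. "Binsey" is [7, 7]. This differs from spaCy, where the end index is exclusive (in this case spaCy would set the indices to [7, 8]).

JSONL file (the records are pretty-printed here for readability; in a JSON Lines file each record occupies a single line, with no comma between records and no # comments):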

{
  "ner": [
    [7, 7, "Q4914513", "Binsey"], 
    [11, 12, "Q19686", "River Thames"]
  ], 
  "relations": [
    {
      "head": {"mention": "Binsey", "position": [7, 7], "type": "LOC"}, # 'type' is not used -- it can be any string!
      "tail": {"mention": "River Thames", "position": [11, 12], "type": "Q19686"}, 
      "relation_text": "located in or next to body of water"
    }
  ], 
  "tokenized_text": ["The", "race", "took", "place", "between", "Godstow", "and", "Binsey", "along", "the", "Upper", "River", "Thames", "."]
},
{
  "ner": [
    [9, 10, "Q4386693", "Legislative Assembly"], 
    [1, 3, "Q1848835", "Parliament of Victoria"]
  ], 
  "relations": [
    {
      "head": {"mention": "Legislative Assembly", "position": [9, 10], "type": "Q4386693"}, 
      "tail": {"mention": "Parliament of Victoria", "position": [1, 3], "type": "Q1848835"}, 
      "relation_text": "part of"
    }
  ], 
  "tokenized_text": ["The", "Parliament", "of", "Victoria", "consists", "of", "the", "lower", "house", "Legislative", "Assembly", ",", "the", "upper", "house", "Legislative", "Council", "and", "the", "Queen", "of", "Australia", "."]
}
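
Because the inclusive-index convention is easy to get wrong, a quick sanity check on training records can help. The sketch below is an illustration under stated assumptions, not part of GLiREL: to_inclusive and check_record are hypothetical helpers, and the check assumes that every mention equals its tokens joined by single spaces.

```python
import json
from typing import List


def to_inclusive(start: int, end_exclusive: int) -> List[int]:
    """Convert a spaCy-style [start, end) token span to inclusive [start, end]."""
    return [start, end_exclusive - 1]


def check_record(record: dict) -> List[str]:
    """Return a list of problems found in one training record (empty if none)."""
    tokens = record["tokenized_text"]
    problems = []
    # Entity spans: [start, end, type, mention], with an inclusive end index.
    for start, end, _type, mention in record["ner"]:
        found = " ".join(tokens[start:end + 1])
        if found != mention:
            problems.append(f"ner [{start}, {end}]: expected {mention!r}, found {found!r}")
    # Relation arguments carry their own inclusive positions.
    for rel in record["relations"]:
        for role in ("head", "tail"):
            start, end = rel[role]["position"]
            found = " ".join(tokens[start:end + 1])
            if found != rel[role]["mention"]:
                problems.append(f"{role} [{start}, {end}]: expected {rel[role]['mention']!r}, found {found!r}")
    return problems


record = {
    "ner": [[7, 7, "Q4914513", "Binsey"], [11, 12, "Q19686", "River Thames"]],
    "relations": [{
        "head": {"mention": "Binsey", "position": [7, 7], "type": "LOC"},
        "tail": {"mention": "River Thames", "position": [11, 12], "type": "Q19686"},
        "relation_text": "located in or next to body of water",
    }],
    "tokenized_text": ["The", "race", "took", "place", "between", "Godstow", "and",
                       "Binsey", "along", "the", "Upper", "River", "Thames", "."],
}

line = json.dumps(record)  # one record per line, as a JSONL file expects
print(check_record(json.loads(line)))
print(to_inclusive(7, 8))
```

The first record from the example passes the check, and to_inclusive(7, 8) maps spaCy's span for "Binsey" to [7, 7].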

License

GLiREL by Jack Boylan is licensed under CC BY-NC-SA 4.0.

Citation

If you use code or ideas from this project, please cite:

@misc{boylan2025glirelgeneralistmodel,
      title={GLiREL -- Generalist Model for Zero-Shot Relation Extraction},
      author={Jack Boylan and Chris Hokamp and Demian Gholipour Ghalandari},
      year={2025},
      eprint={2501.03172},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.03172},
}

Release history

1.2.1: Apr 11, 2025
1.2.0: Apr 9, 2025 (this version)
1.1.1: Mar 26, 2025
1.1.0: Mar 13, 2025
1.0.1: Nov 23, 2024
1.0.0: Nov 1, 2024
0.1.4: Jul 5, 2024
0.1.3: Jun 30, 2024
0.1.2: Jun 30, 2024
0.1.1: Jun 7, 2024
0.1.0: Jun 7, 2024

Download files

Source Distribution

glirel-1.2.0.tar.gz (49.6 kB), uploaded Apr 9, 2025, Source

Built Distribution

glirel-1.2.0-py3-none-any.whl (54.3 kB), uploaded Apr 9, 2025, Python 3

File details

Details for the file glirel-1.2.0.tar.gz.

File metadata

Download URL: glirel-1.2.0.tar.gz
Upload date: Apr 9, 2025
Size: 49.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for glirel-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`bb17b3d3c1f7c0f224a514402cc642f40b76564c73dfdfb31a574d86a96de9ae`
MD5	`5236883ac8b1e16476853d2f9d995285`
BLAKE2b-256	`ea5a1ae6127d6a95968def45e95ae8c82f75e7b63df94cfe648d21b797fb280d`

File details

Details for the file glirel-1.2.0-py3-none-any.whl.

File metadata

Download URL: glirel-1.2.0-py3-none-any.whl
Upload date: Apr 9, 2025
Size: 54.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for glirel-1.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`66db0b60ddc6eb0a2dca6200bcea3b478ee95985bb629a26fdd6bb8cc9904985`
MD5	`3c7e9ae422aa1362cecf1a8c784676a9`
BLAKE2b-256	`1be163a2e91973395d055fb6381955ad4c45a505e55c3634e137e09413558022`
