A Python library for structured information extraction with LLMs.

Struct-IE: Structured Information Extraction with Large Language Models

struct-ie is a Python library for named entity extraction with transformer-based large language models.

Installation

You can install the struct-ie library from PyPI:

pip install struct_ie

Usage

The examples below show three ways to use the EntityExtractor:

1. Basic Usage

from struct_ie import EntityExtractor

# Define the entity types; descriptions are optional (use None to omit one)
entity_types_with_descriptions = {
    "Name": "Names of individuals like 'Jane Doe'",
    "Award": "Names of awards or honors such as the 'Nobel Prize' or the 'Pulitzer Prize'",
    "Date": None,
    "Competition": "Names of competitions or tournaments like the 'World Cup' or the 'Olympic Games'",
    "Team": None
}

# Initialize the EntityExtractor
extractor = EntityExtractor("Qwen/Qwen2-0.5B-Instruct", entity_types_with_descriptions, device="cpu")

# Example text for entity extraction
text = "Cristiano Ronaldo won the Ballon d'Or. He was the top scorer in the UEFA Champions League in 2018."

# Extract entities from the text
entities = extractor.extract_entities(text)
print(entities)
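
The return format of extract_entities is not documented above. As a minimal sketch, assuming it yields (text, label) pairs in the same shape as the few-shot outputs shown later, the result could be grouped by entity type as follows. The helper group_by_type and the sample list are hypothetical and are not part of struct_ie.

```python
from collections import defaultdict


def group_by_type(entities):
    """Group (text, label) pairs into {label: [text, ...]}, dropping duplicates."""
    grouped = defaultdict(list)
    for span, label in entities:
        # Keep first-seen order and skip repeated mentions of the same span.
        if span not in grouped[label]:
            grouped[label].append(span)
    return dict(grouped)


# Hand-written stand-in for extractor output (an assumption, not real model output).
sample_entities = [
    ("Cristiano Ronaldo", "Name"),
    ("Ballon d'Or", "Award"),
    ("UEFA Champions League", "Competition"),
    ("2018", "Date"),
]

grouped = group_by_type(sample_entities)
print(grouped)
```

If the library returns a different structure, only the loop that unpacks each item would need to change.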

2. Usage with a Custom Prompt

from struct_ie import EntityExtractor

# Define the entity types; descriptions are optional (use None to omit one)
entity_types_with_descriptions = {
    "Name": "Names of individuals like 'Jean-Luc Picard' or 'Jane Doe'",
    "Award": "Names of awards or honors such as the 'Nobel Prize' or the 'Pulitzer Prize'",
    "Date": None,
    "Competition": "Names of competitions or tournaments like the 'World Cup' or the 'Olympic Games'",
    "Team": "Names of sports teams or organizations like 'Manchester United' or 'FC Barcelona'"
}

# Initialize the EntityExtractor
extractor = EntityExtractor("Qwen/Qwen2-0.5B-Instruct", entity_types_with_descriptions, device="cpu")

# Example text for entity extraction
text = "Cristiano Ronaldo won the Ballon d'Or. He was the top scorer in the UEFA Champions League in 2018."

# Custom prompt for entity extraction
prompt = "You are an expert on Named Entity Recognition. Extract entities from this text."

# Extract entities from the text using a custom prompt
entities = extractor.extract_entities(text, prompt=prompt)
print(entities)
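
Because the prompt argument is a plain string, it can also be assembled programmatically. The following is a minimal sketch, assuming you want the instruction to list the entity types explicitly. Whether the library already adds the type descriptions to its own prompt is not stated here, and build_prompt is a hypothetical helper, not part of struct_ie.

```python
def build_prompt(entity_types, base="You are an expert on Named Entity Recognition."):
    """Compose an instruction string that lists each entity type on its own line."""
    lines = [base, "Extract entities of the following types from the text:"]
    for label, description in entity_types.items():
        # Types without a description (None) are listed by name only.
        lines.append(f"- {label}: {description}" if description else f"- {label}")
    return "\n".join(lines)


example_types = {
    "Name": "Names of individuals like 'Jane Doe'",
    "Date": None,
}

prompt = build_prompt(example_types)
print(prompt)
```

The resulting string can be passed as prompt=prompt in the call shown above.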

3. Usage with Few-shot Examples

from struct_ie import EntityExtractor

# Define the entity types; descriptions are optional (use None to omit one)
entity_types_with_descriptions = {
    "Name": "Names of individuals like 'Jean-Luc Picard' or 'Jane Doe'",
    "Award": "Names of awards or honors such as the 'Nobel Prize' or the 'Pulitzer Prize'",
    "Date": None,
    "Competition": "Names of competitions or tournaments like the 'World Cup' or the 'Olympic Games'",
    "Team": "Names of sports teams or organizations like 'Manchester United' or 'FC Barcelona'"
}

# Initialize the EntityExtractor
extractor = EntityExtractor("Qwen/Qwen2-0.5B-Instruct", entity_types_with_descriptions, device="cpu")

# Example text for entity extraction
text = "Cristiano Ronaldo won the Ballon d'Or. He was the top scorer in the UEFA Champions League in 2018."

# Few-shot examples for improved entity extraction
demonstrations = [
    {"input": "Lionel Messi won the Ballon d'Or 7 times.", "output": [("Lionel Messi", "Name"), ("Ballon d'Or", "Award")]}
]

# Extract entities from the text using few-shot examples
entities = extractor.extract_entities(text, few_shot_examples=demonstrations)
print(entities)
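
Few-shot examples are easy to get subtly wrong, for instance by using a label that is not among the defined entity types. A minimal sketch of a sanity check follows, assuming the demonstration format shown above (a dict with "input" and "output" keys, where "output" is a list of (text, label) pairs). The helper check_demonstrations is hypothetical and is not part of struct_ie.

```python
def check_demonstrations(demonstrations, entity_types):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for i, demo in enumerate(demonstrations):
        if "input" not in demo or "output" not in demo:
            problems.append(f"example {i}: missing 'input' or 'output' key")
            continue
        for span, label in demo["output"]:
            # Every label should be one of the declared entity types.
            if label not in entity_types:
                problems.append(f"example {i}: unknown label {label!r}")
            # Every annotated span should appear verbatim in the input text.
            if span not in demo["input"]:
                problems.append(f"example {i}: {span!r} not found in input text")
    return problems


entity_types = {"Name": None, "Award": None, "Date": None}
demonstrations = [
    {
        "input": "Lionel Messi won the Ballon d'Or 7 times.",
        "output": [("Lionel Messi", "Name"), ("Ballon d'Or", "Award")],
    }
]

problems = check_demonstrations(demonstrations, entity_types)
print(problems)
```

Running a check like this before calling extract_entities catches label typos early, without loading the model.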

License

This project is licensed under the Apache-2.0 License.
