Drug discovery NER wrapper around LangExtract — zero-config entity extraction for chemistry and biology.
structflo.ner
Drug discovery NER powered by LangExtract.
Extract compounds, targets, bioactivity data, diseases, and more from scientific text — zero configuration required.
Install
pip install structflo-ner
# or with uv
uv add structflo-ner
# optional pandas support
pip install "structflo-ner[dataframe]"
Quick start
from structflo.ner import NERExtractor
extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
result = extractor.extract(
    "Gefitinib (ZD1839) is a first-generation EGFR inhibitor with IC50 = 0.033 µM approved for NSCLC."
)
print(result.compounds) # [ChemicalEntity(text='Gefitinib', ...)]
print(result.targets) # [TargetEntity(text='EGFR', ...)]
print(result.bioactivities) # [BioactivityEntity(text='IC50 = 0.033 µM', ...)]
print(result.diseases) # [DiseaseEntity(text='NSCLC', ...)]
df = result.to_dataframe() # flat pandas DataFrame
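To process several passages, the same call can be wrapped in a loop. The following is a minimal sketch and not part of structflo.ner: `extract_many` is a hypothetical helper that relies only on the `extract()` method and on the `compounds` and `targets` attributes shown in the quick start, and it assumes each entity exposes a `text` attribute as in the printed output above.

```python
from typing import Any, Iterable


def extract_many(extractor: Any, texts: Iterable[str]) -> list[dict]:
    """Run extractor.extract() on each text and collect entity strings.

    Hypothetical helper: assumes each result has `compounds` and `targets`
    lists whose items carry a `text` attribute.
    """
    rows = []
    for text in texts:
        result = extractor.extract(text)
        rows.append(
            {
                "source": text,
                "compounds": [c.text for c in result.compounds],
                "targets": [t.text for t in result.targets],
            }
        )
    return rows
```

The extractor is passed in as an argument, so the helper works with either a cloud-backed or a locally served `NERExtractor`.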
Local models via Ollama
Run extraction entirely on your own hardware — no API key needed:
extractor = NERExtractor(
    model_id="gemma3:27b",
    model_url="http://localhost:11434",
)
result = extractor.extract("Sorafenib inhibits VEGFR-2 and RAF kinases.")
Any model served by Ollama works (gemma, llama, mistral, qwen, deepseek, etc.).
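Before pointing the extractor at a local server, it can help to confirm that Ollama is running and that the model has been pulled. The sketch below is independent of structflo.ner and uses only the standard library. It queries Ollama's `GET /api/tags` endpoint, which lists locally available models. The function names are illustrative, and the base URL is assumed to match the `model_url` used above.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list[str]:
    """Return model names from the JSON body of Ollama's GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def list_ollama_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models are available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.load(resp))
```

If the `model_id` you plan to use is missing from the returned list, pull it first with `ollama pull <model>`.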
Built-in profiles
| Profile | Entity classes |
|---|---|
| FULL (default) | compounds, targets, diseases, bioactivities, assays, mechanisms |
| CHEMISTRY | compound names, SMILES, CAS numbers, molecular formulas |
| BIOLOGY | targets, gene names, protein names |
| BIOACTIVITY | bioactivity measurements, assays |
| DISEASE | diseases and clinical indications |
from structflo.ner import NERExtractor, CHEMISTRY
extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
text = "Sorafenib inhibits VEGFR-2 and RAF kinases."  # any input passage
result = extractor.extract(text, profile=CHEMISTRY)
Profiles can be merged:
from structflo.ner import CHEMISTRY, BIOLOGY
combined = CHEMISTRY.merge(BIOLOGY)
result = extractor.extract(text, profile=combined)
Custom profiles
from structflo.ner import NERExtractor, EntityProfile
my_profile = EntityProfile(
    name="kinase_inhibitors",
    entity_classes=["compound_name", "smiles", "target", "bioactivity"],
    prompt="Extract kinase inhibitor names, SMILES, targets, and potency values.",
    examples=my_examples,  # few-shot examples that you define elsewhere
)
result = extractor.extract(text, profile=my_profile)
Visualization
Render results as color-coded, interactive HTML directly in Jupyter notebooks:
# Auto-renders when result is the last expression in a cell
result
# Or call explicitly
result.display()
Each entity type gets a distinct color with a superscript label. Hover over any highlight to see its type and attributes. Click legend items to toggle categories on or off.
To get the raw HTML string (useful outside Jupyter):
from structflo.ner import render_html
html_str = render_html(result)
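Outside Jupyter, the returned string can be written to a file and opened in a browser. This is a minimal sketch using only the standard library. `save_html` is a hypothetical helper, not part of structflo.ner, and it makes no assumption about whether `render_html` returns a full page or a fragment.

```python
from pathlib import Path


def save_html(html_str: str, path: str = "entities.html") -> Path:
    """Write an HTML string to disk as UTF-8 and return the resolved path."""
    out = Path(path)
    # Explicit UTF-8 keeps characters such as "µ" intact on every platform.
    out.write_text(html_str, encoding="utf-8")
    return out.resolve()
```

For example, `save_html(render_html(result))` would write `entities.html` to the current working directory.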
Working with results
result.all_entities() # flat list of every entity
result.to_dict() # plain dictionary
result.to_dataframe() # pandas DataFrame (requires structflo-ner[dataframe])
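The flat list from `all_entities()` is handy for quick summaries. The sketch below is a hypothetical helper, not part of structflo.ner. It groups entities by their Python class name and assumes only that each entity has a `text` attribute, as the quick start output suggests.

```python
from collections import Counter, defaultdict


def summarize(entities) -> tuple[Counter, dict]:
    """Count entities per class name and list the distinct texts per class."""
    entities = list(entities)  # allow generators; we iterate twice
    counts = Counter(type(e).__name__ for e in entities)
    texts = defaultdict(list)
    for e in entities:
        name = type(e).__name__
        if e.text not in texts[name]:  # keep first occurrence, drop repeats
            texts[name].append(e.text)
    return counts, dict(texts)
```

Calling `summarize(result.all_entities())` would then return a per-class count together with the distinct mentions found for each class.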
Notebooks
See the notebooks/ directory for worked examples:
- 01_quickstart.ipynb — end-to-end extraction with cloud and local models