Drug discovery NER wrapper around LangExtract — zero-config entity extraction for chemistry and biology.

structflo.ner

Drug discovery NER powered by LangExtract.

Extract compounds, targets, bioactivity data, diseases, and more from scientific text — zero configuration required.

Install

pip install structflo-ner
# or with uv
uv add structflo-ner

# optional pandas support
pip install "structflo-ner[dataframe]"
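To confirm that the install succeeded, one option is to query the installed distribution with the standard library. This is a minimal sketch; the helper name `installed_version` is illustrative and not part of structflo-ner.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str = "structflo-ner"):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"structflo-ner: {found or 'not installed'}")
```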

Quick start

from structflo.ner import NERExtractor

extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
result = extractor.extract(
    "Gefitinib (ZD1839) is a first-generation EGFR inhibitor with IC50 = 0.033 µM approved for NSCLC."
)

print(result.compounds)      # [ChemicalEntity(text='Gefitinib', ...)]
print(result.targets)        # [TargetEntity(text='EGFR', ...)]
print(result.bioactivities)  # [BioactivityEntity(text='IC50 = 0.033 µM', ...)]
print(result.diseases)       # [DiseaseEntity(text='NSCLC', ...)]

df = result.to_dataframe()   # flat pandas DataFrame (requires structflo-ner[dataframe])
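Extracted bioactivity text such as 'IC50 = 0.033 µM' is still a string. A downstream step might normalize it to a molar value and a pIC50. The sketch below uses only the standard library; `parse_potency`, the regular expression, and the unit table are assumptions made for illustration and are not part of structflo-ner.

```python
import math
import re

# Multipliers that convert a concentration unit to mol/L.
# Both the micro sign (U+00B5) and Greek mu (U+03BC) are accepted.
_UNIT_TO_MOLAR = {
    "M": 1.0,
    "mM": 1e-3,
    "\u00b5M": 1e-6,
    "\u03bcM": 1e-6,
    "uM": 1e-6,
    "nM": 1e-9,
    "pM": 1e-12,
}

_PATTERN = re.compile(
    r"(?P<kind>IC50|EC50|Ki|Kd)\s*=\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[m\u00b5\u03bcunp]?M)"
)


def parse_potency(text: str):
    """Parse 'IC50 = 0.033 µM'-style text; return None if nothing usable is found."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    molar = float(match.group("value")) * _UNIT_TO_MOLAR[match.group("unit")]
    if molar <= 0:
        return None  # log10 is undefined for zero
    return {
        "kind": match.group("kind"),
        "molar": molar,
        "p_value": -math.log10(molar),  # e.g. pIC50
    }


if __name__ == "__main__":
    print(parse_potency("IC50 = 0.033 \u00b5M"))
```

For the gefitinib sentence above this gives a pIC50 of roughly 7.48. Real-world potency strings vary widely (ranges, inequalities, "±" errors), so a production parser would need more cases than this pattern covers.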

Local models via Ollama

Run extraction entirely on your own hardware — no API key needed:

extractor = NERExtractor(
    model_id="gemma3:27b",
    model_url="http://localhost:11434",
)
result = extractor.extract("Sorafenib inhibits VEGFR-2 and RAF kinases.")

Any model served by Ollama works (gemma, llama, mistral, qwen, deepseek, etc.).
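Before pointing the extractor at a local server, it can help to check that Ollama is reachable and that the model has been pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models. The helper names are illustrative and not part of structflo-ner.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def tags_url(model_url: str) -> str:
    """Build the URL of Ollama's model-listing endpoint."""
    return model_url.rstrip("/") + "/api/tags"


def model_names(payload: dict) -> list:
    """Pull model names out of an /api/tags JSON payload."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(model_url: str = "http://localhost:11434", timeout: float = 3.0):
    """Return the names of locally available models, or None if the server is unreachable."""
    try:
        with urlopen(tags_url(model_url), timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (URLError, OSError, ValueError):
        return None


if __name__ == "__main__":
    names = list_local_models()
    if names is None:
        print("Ollama is not reachable at http://localhost:11434")
    else:
        print("Available models:", ", ".join(names) or "(none pulled yet)")
```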

Built-in profiles

Profile          Entity classes
FULL (default)   compounds, targets, diseases, bioactivities, assays, mechanisms
CHEMISTRY        compound names, SMILES, CAS numbers, molecular formulas
BIOLOGY          targets, gene names, protein names
BIOACTIVITY      bioactivity measurements, assays
DISEASE          diseases and clinical indications

from structflo.ner import NERExtractor, CHEMISTRY

text = "Gefitinib (ZD1839) is a first-generation EGFR inhibitor."  # any input string

extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
result = extractor.extract(text, profile=CHEMISTRY)

Profiles can be merged:

from structflo.ner import CHEMISTRY, BIOLOGY

combined = CHEMISTRY.merge(BIOLOGY)
result = extractor.extract(text, profile=combined)
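One way to picture a merge is as an order-preserving union of the two profiles' entity classes. The sketch below is a conceptual stand-in only: `ProfileSketch` is a hypothetical class written for this illustration, and the library's actual `merge` may combine prompts and examples differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileSketch:
    """Hypothetical stand-in for a profile; not the structflo-ner class."""

    name: str
    entity_classes: tuple

    def merge(self, other: "ProfileSketch") -> "ProfileSketch":
        # dict.fromkeys drops duplicates while keeping first-seen order.
        merged = tuple(dict.fromkeys(self.entity_classes + other.entity_classes))
        return ProfileSketch(f"{self.name}+{other.name}", merged)


if __name__ == "__main__":
    chem = ProfileSketch("chemistry", ("compound_name", "smiles"))
    bio = ProfileSketch("biology", ("target", "compound_name"))
    print(chem.merge(bio))
```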

Custom profiles

from structflo.ner import NERExtractor, EntityProfile

my_profile = EntityProfile(
    name="kinase_inhibitors",
    entity_classes=["compound_name", "smiles", "target", "bioactivity"],
    prompt="Extract kinase inhibitor names, SMILES, targets, and potency values.",
    examples=my_examples,  # your own example set, defined beforehand
)
result = extractor.extract(text, profile=my_profile)

Visualization

Render results as color-coded, interactive HTML directly in Jupyter notebooks:

# Auto-renders when result is the last expression in a cell
result

# Or call explicitly
result.display()

Each entity type gets a distinct color with a superscript label. Hover over any highlight to see its type and attributes. Click legend items to toggle categories on or off.

To get the raw HTML string (useful outside Jupyter):

from structflo.ner import render_html

html_str = render_html(result)
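Outside Jupyter, the HTML string can be written to a file and opened in a browser. A minimal sketch follows, using a placeholder string in place of real output; `save_html` is an illustrative helper, not part of structflo-ner. Writing as UTF-8 keeps characters such as µ intact.

```python
import tempfile
from pathlib import Path


def save_html(html_str: str, path) -> Path:
    """Write an HTML string to disk as UTF-8 and return the resulting path."""
    out = Path(path)
    out.write_text(html_str, encoding="utf-8")
    return out


if __name__ == "__main__":
    # Placeholder markup; in practice this would be the string from render_html(result).
    demo = "<p>IC50 = 0.033 \u00b5M</p>"
    with tempfile.TemporaryDirectory() as tmp:
        saved = save_html(demo, Path(tmp) / "entities.html")
        print(saved.name, saved.stat().st_size, "bytes")
```

To view a saved file, `webbrowser.open(saved.resolve().as_uri())` from the standard library opens it in the default browser.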

Working with results

result.all_entities()   # flat list of every entity
result.to_dict()        # plain dictionary
result.to_dataframe()   # pandas DataFrame (requires structflo-ner[dataframe])
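A plain dictionary is convenient for persisting results as JSON. The sketch below uses a stand-in dictionary whose keys and shape are assumed for illustration; the actual structure returned by `to_dict()` may differ. Passing `ensure_ascii=False` keeps unit symbols such as µ readable instead of escaping them.

```python
import json


def to_json(data: dict) -> str:
    """Serialize a result dictionary to pretty-printed JSON, keeping non-ASCII text."""
    return json.dumps(data, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Stand-in for result.to_dict(); the real keys and nesting are an assumption here.
    sample = {"bioactivities": [{"text": "IC50 = 0.033 \u00b5M"}]}
    print(to_json(sample))
```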

Notebooks

See the notebooks/ directory for worked examples:

  • 01_quickstart.ipynb — end-to-end extraction with cloud and local models
