
Drug discovery NER wrapper around LangExtract — zero-config entity extraction for chemistry and biology.


structflo.ner

Drug discovery NER powered by LangExtract.

Extract compounds, targets, bioactivity data, diseases, and more from scientific text. The built-in profiles work out of the box: supply a Gemini API key or point the extractor at a local Ollama model.

Install

pip install structflo-ner
# or with uv
uv add structflo-ner

# optional pandas support
pip install "structflo-ner[dataframe]"

Quick start

from structflo.ner import NERExtractor

extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
result = extractor.extract(
    "Gefitinib (ZD1839) is a first-generation EGFR inhibitor with IC50 = 0.033 µM approved for NSCLC."
)

print(result.compounds)      # [ChemicalEntity(text='Gefitinib', ...)]
print(result.targets)        # [TargetEntity(text='EGFR', ...)]
print(result.bioactivities)  # [BioactivityEntity(text='IC50 = 0.033 µM', ...)]
print(result.diseases)       # [DiseaseEntity(text='NSCLC', ...)]

df = result.to_dataframe()   # flat pandas DataFrame (requires structflo-ner[dataframe])
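
The quick start passes the key as a string literal. A minimal sketch of reading it from the environment instead is shown here; the variable name `GEMINI_API_KEY` and the helper `load_api_key` are assumptions made for illustration and are not part of structflo-ner.

```python
import os
from typing import Mapping, Optional


def load_api_key(name: str = "GEMINI_API_KEY",
                 env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key stored under `name`, or raise if it is missing."""
    # Fall back to the real process environment unless a mapping is injected.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first")
    return key


# Usage (commented out because it needs the package and a real key):
# from structflo.ner import NERExtractor
# extractor = NERExtractor(api_key=load_api_key())
```

Accepting an optional `env` mapping keeps the helper easy to test without touching the real environment.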

Local models via Ollama

Run extraction entirely on your own hardware — no API key needed:

extractor = NERExtractor(
    model_id="gemma3:27b",
    model_url="http://localhost:11434",
)
result = extractor.extract("Sorafenib inhibits VEGFR-2 and RAF kinases.")

Any model served by Ollama works (gemma, llama, mistral, qwen, deepseek, etc.).
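
Before pointing the extractor at a local server, it can help to confirm that the model has actually been pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models; the helper names `parse_model_names` and `list_local_models` are hypothetical and independent of structflo-ner.

```python
import json
import urllib.request
from typing import List


def parse_model_names(payload: dict) -> List[str]:
    """Pull model names out of an Ollama /api/tags response body."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def list_local_models(base_url: str = "http://localhost:11434",
                      timeout: float = 5.0) -> List[str]:
    """Ask a running Ollama server which models it has available."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(json.load(resp))
```

With a server running, a check such as `"gemma3:27b" in list_local_models()` tells you whether the `model_id` used above is ready before you construct the extractor.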

Built-in profiles

Profile          Entity classes
FULL (default)   compounds, targets, diseases, bioactivities, assays, mechanisms
CHEMISTRY        compound names, SMILES, CAS numbers, molecular formulas
BIOLOGY          targets, gene names, protein names
BIOACTIVITY      bioactivity measurements, assays
DISEASE          diseases and clinical indications

Pass a profile to extract() to restrict extraction to its entity classes:

from structflo.ner import NERExtractor, CHEMISTRY

text = "Sorafenib inhibits VEGFR-2 and RAF kinases."  # any input passage
extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
result = extractor.extract(text, profile=CHEMISTRY)

Profiles can be merged:

from structflo.ner import CHEMISTRY, BIOLOGY

combined = CHEMISTRY.merge(BIOLOGY)
result = extractor.extract(text, profile=combined)
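
The source does not say how `merge` combines two profiles. One plausible reading is an ordered union of their entity classes, sketched below with a stand-in class. `SketchProfile` and the `gene_name` class are hypothetical; this is not the library's implementation, and the real `merge` may also combine prompts and examples.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SketchProfile:
    """Hypothetical stand-in for a profile: a name plus entity classes."""
    name: str
    entity_classes: Tuple[str, ...]

    def merge(self, other: "SketchProfile") -> "SketchProfile":
        # dict.fromkeys keeps first-seen order while dropping duplicates.
        merged = tuple(dict.fromkeys(self.entity_classes + other.entity_classes))
        return SketchProfile(f"{self.name}+{other.name}", merged)


chem = SketchProfile("chemistry", ("compound_name", "smiles"))
bio = SketchProfile("biology", ("target", "gene_name"))
combined_sketch = chem.merge(bio)
print(combined_sketch.entity_classes)
```

Returning a new frozen object rather than mutating either input means the original profiles stay reusable after merging.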

Custom profiles

from structflo.ner import NERExtractor, EntityProfile

# my_examples: few-shot examples you define yourself (not shown here)
my_profile = EntityProfile(
    name="kinase_inhibitors",
    entity_classes=["compound_name", "smiles", "target", "bioactivity"],
    prompt="Extract kinase inhibitor names, SMILES, targets, and potency values.",
    examples=my_examples,
)
# extractor and text are the objects created in the earlier snippets
result = extractor.extract(text, profile=my_profile)

Visualization

Render results as color-coded, interactive HTML directly in Jupyter notebooks:

# Auto-renders when result is the last expression in a cell
result

# Or call explicitly
result.display()

Each entity type gets a distinct color with a superscript label. Hover any highlight to see its type and attributes. Click legend items to toggle categories on or off.

To get the raw HTML string (useful outside Jupyter):

from structflo.ner import render_html

html_str = render_html(result)
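
Outside Jupyter, the usual next step is to write that string to a file and open it in a browser. A minimal sketch using only the standard library follows; `save_html` is a hypothetical helper, and it assumes the string returned by `render_html` can be viewed as a standalone page.

```python
from pathlib import Path


def save_html(html_str: str, path: str) -> Path:
    """Write an HTML string to disk as UTF-8 and return the resolved path."""
    out = Path(path)
    # Create missing parent directories so nested report paths work.
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(html_str, encoding="utf-8")
    return out.resolve()


# Usage (needs a real extraction result):
# report = save_html(render_html(result), "reports/gefitinib.html")
```

Writing explicitly as UTF-8 keeps characters such as "µ" in bioactivity values intact regardless of the platform's default encoding.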

Working with results

result.all_entities()   # flat list of every entity
result.to_dict()        # plain dictionary
result.to_dataframe()   # pandas DataFrame (requires structflo-ner[dataframe])
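
For batch runs, the dictionary form lends itself to JSON Lines output, one document per row. The sketch below assumes `to_dict()` returns JSON-serializable data; the `sample` dictionary is a stand-in, and the real key names may differ.

```python
import json


def to_json_line(record: dict) -> str:
    """Serialize one result dictionary as a single JSON Lines row."""
    # ensure_ascii=False keeps characters such as 'µ' readable in the file;
    # default=str is a fallback for values json cannot encode natively.
    return json.dumps(record, ensure_ascii=False, default=str)


# Stand-in for result.to_dict(); the real structure is an assumption.
sample = {
    "compounds": [{"text": "Gefitinib"}],
    "bioactivities": [{"text": "IC50 = 0.033 µM"}],
}
line = to_json_line(sample)
```

Appending `line + "\n"` to a file opened with `encoding="utf-8"` gives one record per processed document.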

Notebooks

See the notebooks/ directory for worked examples:

  • 01_quickstart.ipynb — end-to-end extraction with cloud and local models
