Drug discovery NER wrapper around LangExtract — zero-config entity extraction for chemistry and biology.

structflo.ner

Drug discovery NER powered by LangExtract.

Extract compounds, targets, bioactivity data, diseases, and more from scientific text — zero configuration required.

Install

pip install structflo-ner
# or with uv
uv add structflo-ner

# optional pandas support
pip install "structflo-ner[dataframe]"

Quick start

from structflo.ner import NERExtractor

extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
result = extractor.extract(
    "Gefitinib (ZD1839) is a first-generation EGFR inhibitor with IC50 = 0.033 µM approved for NSCLC."
)

print(result.compounds)      # [ChemicalEntity(text='Gefitinib', ...)]
print(result.targets)        # [TargetEntity(text='EGFR', ...)]
print(result.bioactivities)  # [BioactivityEntity(text='IC50 = 0.033 µM', ...)]
print(result.diseases)       # [DiseaseEntity(text='NSCLC', ...)]

df = result.to_dataframe()   # flat pandas DataFrame (requires structflo-ner[dataframe])
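The quick start passes the key as a string literal. A minimal sketch of reading it from the environment instead, so it stays out of source control. The helper name `load_api_key` and the variable name `GEMINI_API_KEY` are assumptions made for illustration, not something the library defines.

```python
import os


def load_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Return the key stored in `env_var` (a hypothetical name), or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Usage (assumes structflo-ner is installed and the variable is set):
# from structflo.ner import NERExtractor
# extractor = NERExtractor(api_key=load_api_key())
```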

Local models via Ollama

Run extraction entirely on your own hardware — no API key needed:

extractor = NERExtractor(
    model_id="gemma3:27b",
    model_url="http://localhost:11434",
)
result = extractor.extract("Sorafenib inhibits VEGFR-2 and RAF kinases.")

Any model served by Ollama works (gemma, llama, mistral, qwen, deepseek, etc.).
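Before pointing the extractor at a local server, it can help to confirm that Ollama is running and that the model has been pulled. The sketch below queries Ollama's own `/api/tags` endpoint with the standard library; it is independent of structflo-ner, and the helper names `model_names` and `list_ollama_models` are illustrative.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list[str]:
    """Pull model names out of the JSON body returned by Ollama's /api/tags."""
    return [m["name"] for m in tags_payload.get("models", []) if "name" in m]


def list_ollama_models(base_url: str = "http://localhost:11434",
                       timeout: float = 5.0) -> list[str]:
    """Ask a running Ollama server which models it has available locally."""
    url = f"{base_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return model_names(json.load(resp))


# Usage (requires a running Ollama server):
# if "gemma3:27b" not in list_ollama_models():
#     print("Run `ollama pull gemma3:27b` first.")
```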

Built-in profiles

Profile        | Entity classes
FULL (default) | compounds, targets, diseases, bioactivities, assays, mechanisms
CHEMISTRY      | compound names, SMILES, CAS numbers, molecular formulas
BIOLOGY        | targets, gene names, protein names
BIOACTIVITY    | bioactivity measurements, assays
DISEASE        | diseases and clinical indications

from structflo.ner import NERExtractor, CHEMISTRY

extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
text = "Sorafenib inhibits VEGFR-2 and RAF kinases."  # any input string
result = extractor.extract(text, profile=CHEMISTRY)

Profiles can be merged:

from structflo.ner import CHEMISTRY, BIOLOGY

combined = CHEMISTRY.merge(BIOLOGY)
result = extractor.extract(text, profile=combined)
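This README does not spell out what `merge` does internally. One plausible reading is an order-preserving union of the two profiles' entity classes. The sketch below illustrates that idea only, using a hypothetical stand-in class `SketchProfile`; it is not the library's `EntityProfile` and may differ from the real behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchProfile:
    """Hypothetical stand-in for illustration, NOT structflo.ner's EntityProfile."""
    name: str
    entity_classes: tuple[str, ...] = ()

    def merge(self, other: "SketchProfile") -> "SketchProfile":
        # dict.fromkeys keeps first-seen order while dropping duplicates.
        classes = tuple(dict.fromkeys(self.entity_classes + other.entity_classes))
        return SketchProfile(name=f"{self.name}+{other.name}", entity_classes=classes)


chem = SketchProfile("chemistry", ("compound_name", "smiles"))
bio = SketchProfile("biology", ("target", "compound_name"))
print(chem.merge(bio).entity_classes)  # ('compound_name', 'smiles', 'target')
```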

Custom profiles

from structflo.ner import NERExtractor, EntityProfile

my_profile = EntityProfile(
    name="kinase_inhibitors",
    entity_classes=["compound_name", "smiles", "target", "bioactivity"],
    prompt="Extract kinase inhibitor names, SMILES, targets, and potency values.",
    examples=my_examples,  # your own few-shot examples, defined elsewhere
)
result = extractor.extract(text, profile=my_profile)

Visualization

Render results as color-coded, interactive HTML directly in Jupyter notebooks:

# Auto-renders when result is the last expression in a cell
result

# Or call explicitly
result.display()

Each entity type gets a distinct color with a superscript label. Hover any highlight to see its type and attributes. Click legend items to toggle categories on or off.

To get the raw HTML string (useful outside Jupyter):

from structflo.ner import render_html

html_str = render_html(result)
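Outside Jupyter, the returned string can be written to disk and opened in a browser. A minimal sketch using only the standard library; `save_html` and the default file name are illustrative, and the explicit UTF-8 encoding is there so characters such as µ survive the round trip.

```python
from pathlib import Path


def save_html(html_str: str, path: str = "ner_result.html") -> Path:
    """Write an HTML string to `path` as UTF-8 and return the absolute path."""
    out = Path(path)
    out.write_text(html_str, encoding="utf-8")
    return out.resolve()


# Usage (assumes `html_str` came from render_html(result)):
# print(save_html(html_str))
```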

Working with results

result.all_entities()   # flat list of every entity
result.to_dict()        # plain dictionary
result.to_dataframe()   # pandas DataFrame (requires structflo-ner[dataframe])
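The flat list from `all_entities()` can be regrouped without pandas. The sketch below groups entity texts by class name; it assumes each entity exposes a `.text` attribute, as the Quick start output suggests, and uses small stand-in classes so it runs without the library installed. The helper name `group_texts_by_type` is illustrative.

```python
from collections import defaultdict
from dataclasses import dataclass


def group_texts_by_type(entities) -> dict[str, list[str]]:
    """Map each entity's class name to the list of its `.text` values."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[type(entity).__name__].append(entity.text)
    return dict(grouped)


# Stand-in classes for the demo only; the real ones come from structflo.ner.
@dataclass
class ChemicalEntity:
    text: str


@dataclass
class TargetEntity:
    text: str


demo = [ChemicalEntity("Sorafenib"), TargetEntity("VEGFR-2"), TargetEntity("RAF")]
print(group_texts_by_type(demo))
# {'ChemicalEntity': ['Sorafenib'], 'TargetEntity': ['VEGFR-2', 'RAF']}
```

With a real result, the call would be `group_texts_by_type(result.all_entities())`.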

Notebooks

See the notebooks/ directory for worked examples:

  • 01_quickstart.ipynb — end-to-end extraction with cloud and local models
