Drug discovery NER wrapper around LangExtract — zero-config entity extraction for chemistry and biology.

structflo.ner

Drug discovery NER powered by LangExtract.

Extract compounds, targets, bioactivity data, diseases, and more from scientific text — zero configuration required.

Install

pip install structflo-ner
# or with uv
uv add structflo-ner

# optional pandas support
pip install "structflo-ner[dataframe]"

Quick start

from structflo.ner import NERExtractor

extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
result = extractor.extract(
    "Gefitinib (ZD1839) is a first-generation EGFR inhibitor with IC50 = 0.033 µM approved for NSCLC."
)

print(result.compounds)      # [ChemicalEntity(text='Gefitinib', ...)]
print(result.targets)        # [TargetEntity(text='EGFR', ...)]
print(result.bioactivities)  # [BioactivityEntity(text='IC50 = 0.033 µM', ...)]
print(result.diseases)       # [DiseaseEntity(text='NSCLC', ...)]

df = result.to_dataframe()   # flat pandas DataFrame
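
Hard-coding the key as in the quick start is fine for a first run, but in shared notebooks it is usually safer to read it from the environment. The helper below is a minimal sketch using only the standard library; the variable name `GEMINI_API_KEY` and the function `load_api_key` are assumptions for illustration, not names the library defines.

```python
import os


def load_api_key(var_name: str = "GEMINI_API_KEY") -> str:
    """Return the API key stored in the environment variable `var_name`.

    The default name GEMINI_API_KEY is an assumption, not a library requirement.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early with a readable message instead of a late authentication error
        raise RuntimeError(f"Set the {var_name} environment variable before running extraction.")
    return key
```

The returned string can then be passed along as `NERExtractor(api_key=load_api_key())`.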

Local models via Ollama

Run extraction entirely on your own hardware — no API key needed:

extractor = NERExtractor(
    model_id="gemma3:27b",
    model_url="http://localhost:11434",
)
result = extractor.extract("Sorafenib inhibits VEGFR-2 and RAF kinases.")

Any model served by Ollama works (gemma, llama, mistral, qwen, deepseek, etc.).
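
Before pointing the extractor at a local server, it can help to confirm that Ollama is running and that the model has been pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models. The helper names `model_names_from_tags` and `list_ollama_models` are made up for this example and are not part of structflo.ner.

```python
import json
from urllib.request import urlopen


def model_names_from_tags(payload: dict) -> list[str]:
    """Pull model names out of the JSON body returned by Ollama's /api/tags."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def list_ollama_models(base_url: str = "http://localhost:11434", timeout: float = 5.0) -> list[str]:
    """Ask a running Ollama server which models are available locally."""
    url = f"{base_url.rstrip('/')}/api/tags"
    with urlopen(url, timeout=timeout) as resp:
        return model_names_from_tags(json.load(resp))
```

With a server running, `"gemma3:27b" in list_ollama_models()` indicates whether the model used above is ready; a connection error suggests the server is not reachable at that URL.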

Built-in profiles

Profile         Entity classes
FULL (default)  compounds, targets, diseases, bioactivities, assays, mechanisms
CHEMISTRY       compound names, SMILES, CAS numbers, molecular formulas
BIOLOGY         targets, gene names, protein names
BIOACTIVITY     bioactivity measurements, assays
DISEASE         diseases and clinical indications

from structflo.ner import NERExtractor, CHEMISTRY

extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
text = "Sorafenib inhibits VEGFR-2 and RAF kinases."  # any scientific text
result = extractor.extract(text, profile=CHEMISTRY)

Profiles can be merged:

from structflo.ner import CHEMISTRY, BIOLOGY

combined = CHEMISTRY.merge(BIOLOGY)
result = extractor.extract(text, profile=combined)
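
How `merge` combines two profiles internally is not described here. One plausible reading is an order-preserving union of the entity classes, sketched below with a hypothetical stand-in class. `SketchProfile` and its fields are illustrative assumptions and do not reflect the library's actual `EntityProfile` implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchProfile:
    """Hypothetical stand-in for a profile; NOT structflo.ner's EntityProfile."""

    name: str
    entity_classes: tuple[str, ...]

    def merge(self, other: "SketchProfile") -> "SketchProfile":
        # dict.fromkeys keeps first-seen order while dropping duplicates
        merged = tuple(dict.fromkeys(self.entity_classes + other.entity_classes))
        return SketchProfile(name=f"{self.name}+{other.name}", entity_classes=merged)


chem = SketchProfile("chemistry", ("compound_name", "smiles"))
bio = SketchProfile("biology", ("target", "compound_name"))
combined = chem.merge(bio)
print(combined.entity_classes)  # ('compound_name', 'smiles', 'target')
```

Under this reading, a class listed in both profiles appears once in the merged profile, in the position where it was first seen.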

Custom profiles

from structflo.ner import NERExtractor, EntityProfile

# my_examples holds your own few-shot examples; define it before this snippet
my_profile = EntityProfile(
    name="kinase_inhibitors",
    entity_classes=["compound_name", "smiles", "target", "bioactivity"],
    prompt="Extract kinase inhibitor names, SMILES, targets, and potency values.",
    examples=my_examples,
)
# extractor and text are defined as in the earlier snippets
result = extractor.extract(text, profile=my_profile)

Visualization

Render results as color-coded, interactive HTML directly in Jupyter notebooks:

# Auto-renders when result is the last expression in a cell
result

# Or call explicitly
result.display()

Each entity type gets a distinct color with a superscript label. Hover any highlight to see its type and attributes. Click legend items to toggle categories on or off.

To get the raw HTML string (useful outside Jupyter):

from structflo.ner import render_html

html_str = render_html(result)
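
Outside Jupyter, the simplest way to view the HTML string is to write it to a file and open it in a browser. The helper below is a small sketch using the standard library; `save_html` is a name chosen for this example, not a library function.

```python
from pathlib import Path


def save_html(html_str: str, path: str) -> Path:
    """Write an HTML string to `path` as UTF-8, creating parent folders if needed."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    # Explicit UTF-8 keeps characters such as "µ" intact on every platform
    out.write_text(html_str, encoding="utf-8")
    return out
```

For example, `save_html(html_str, "reports/gefitinib.html")` would produce a file that can be opened directly or attached to a report.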

Working with results

result.all_entities()   # flat list of every entity
result.to_dict()        # plain dictionary
result.to_dataframe()   # pandas DataFrame (requires structflo-ner[dataframe])
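
The plain dictionary is convenient for saving results alongside the source text. The sketch below serializes such a dictionary to JSON. The exact shape of the `to_dict()` output is not shown here, so the sample dictionary is a hypothetical stand-in, and `to_json` is a helper name invented for this example.

```python
import json


def to_json(data: dict) -> str:
    """Serialize a result dictionary to readable JSON.

    ensure_ascii=False keeps units such as "µM" readable;
    default=str is a fallback for any value json cannot encode natively.
    """
    return json.dumps(data, ensure_ascii=False, indent=2, default=str)


# Hypothetical stand-in for result.to_dict(); the real keys may differ
sample = {"compounds": [{"text": "Gefitinib"}], "bioactivities": [{"text": "IC50 = 0.033 µM"}]}
payload = to_json(sample)
```

In practice, `to_json(result.to_dict())` would take the place of the sample above.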

Notebooks

See the notebooks/ directory for worked examples:

  • 01_quickstart.ipynb — end-to-end extraction with cloud and local models
