Drug discovery NER wrapper around LangExtract — zero-config entity extraction for chemistry and biology.

structflo.ner

Drug discovery NER powered by LangExtract.

Extract compounds, targets, bioactivity data, diseases, and more from scientific text — zero configuration required.

Install

pip install structflo-ner
# or with uv
uv add structflo-ner

# optional pandas support
pip install "structflo-ner[dataframe]"

Quick start

from structflo.ner import NERExtractor

extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
result = extractor.extract(
    "Gefitinib (ZD1839) is a first-generation EGFR inhibitor with IC50 = 0.033 µM approved for NSCLC."
)

print(result.compounds)      # [ChemicalEntity(text='Gefitinib', ...)]
print(result.targets)        # [TargetEntity(text='EGFR', ...)]
print(result.bioactivities)  # [BioactivityEntity(text='IC50 = 0.033 µM', ...)]
print(result.diseases)       # [DiseaseEntity(text='NSCLC', ...)]

df = result.to_dataframe()   # flat pandas DataFrame (requires structflo-ner[dataframe])
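
The bioactivity text above is a raw string such as 'IC50 = 0.033 µM'. The entity may already carry structured fields that this page does not show; if you only have the text, a small standard-library parser can turn it into a number for filtering or sorting. The sketch below is not part of structflo.ner: the function name parse_bioactivity and the choice of nanomolar as the target unit are assumptions made for illustration.

```python
import re

# Multiplier from each SI prefix to nanomolar. Both the micro sign (U+00B5)
# and the Greek letter mu (U+03BC) are accepted, as is the ASCII fallback "u".
_PREFIX_TO_NANOMOLAR = {
    "p": 1e-3,
    "n": 1.0,
    "u": 1e3,
    "\u00b5": 1e3,
    "\u03bc": 1e3,
    "m": 1e6,
    "": 1e9,  # plain molar
}

# measure (e.g. IC50), relation (=, <, >, ...), numeric value, molar unit
_PATTERN = re.compile(
    r"(?P<measure>[A-Za-z0-9]+)\s*(?P<relation>[=<>≤≥~]+)\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<prefix>[pnum\u00b5\u03bc]?)M\b"
)


def parse_bioactivity(text):
    """Return (measure, relation, value_in_nM), or None if nothing matches."""
    match = _PATTERN.search(text)
    if match is None:
        return None
    value_nm = float(match["value"]) * _PREFIX_TO_NANOMOLAR[match["prefix"]]
    return match["measure"], match["relation"], value_nm
```

Called on 'IC50 = 0.033 µM', this returns the measure IC50, the relation =, and a value of about 33 nM. It only handles molar concentrations; other units (for example percent inhibition) fall through to None.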

Local models via Ollama

Run extraction entirely on your own hardware — no API key needed:

extractor = NERExtractor(
    model_id="gemma3:27b",
    model_url="http://localhost:11434",
)
result = extractor.extract("Sorafenib inhibits VEGFR-2 and RAF kinases.")

Any model served by Ollama works (gemma, llama, mistral, qwen, deepseek, etc.).
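
Before pointing the extractor at a local server, it can help to confirm that Ollama is reachable and that the model has been pulled. The following is a minimal sketch using only the standard library and Ollama's /api/tags endpoint, which lists locally available models. The helper names model_names_from_tags and ollama_has_model are hypothetical and not part of structflo.ner.

```python
import json
import urllib.error
import urllib.request


def model_names_from_tags(payload):
    """Pull model names out of the JSON body returned by Ollama's /api/tags."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def ollama_has_model(model_id, base_url="http://localhost:11434", timeout=5.0):
    """Return True if the server answers and lists model_id, else False."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            payload = json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, connection refused, or the body was not JSON.
        return False
    return model_id in model_names_from_tags(payload)
```

For example, ollama_has_model("gemma3:27b") returns False when the server is down or the model has not been pulled. The comparison is an exact string match, and Ollama lists untagged models with a ":latest" suffix, so pass the full name including the tag.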

Built-in profiles

| Profile | Entity classes |
|---|---|
| FULL (default) | compounds, targets, diseases, bioactivities, assays, mechanisms |
| CHEMISTRY | compound names, SMILES, CAS numbers, molecular formulas |
| BIOLOGY | targets, gene names, protein names |
| BIOACTIVITY | bioactivity measurements, assays |
| DISEASE | diseases and clinical indications |

from structflo.ner import NERExtractor, CHEMISTRY

extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
text = "Sorafenib inhibits VEGFR-2 and RAF kinases."  # any input string
result = extractor.extract(text, profile=CHEMISTRY)

Profiles can be merged:

from structflo.ner import CHEMISTRY, BIOLOGY

combined = CHEMISTRY.merge(BIOLOGY)
# `extractor` is an NERExtractor instance and `text` is the input string
result = extractor.extract(text, profile=combined)
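
This page does not say how merge combines two profiles. One plausible reading is an order-preserving union of their entity classes, which the sketch below illustrates with a stand-in dataclass. SketchProfile is hypothetical and is not the library's EntityProfile; the real merge may also combine prompts and examples, and may name the result differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchProfile:
    """Hypothetical stand-in for a profile; not structflo.ner's EntityProfile."""

    name: str
    entity_classes: tuple

    def merge(self, other):
        # dict.fromkeys keeps first-seen order while dropping duplicates.
        merged = tuple(dict.fromkeys(self.entity_classes + other.entity_classes))
        return SketchProfile(name=f"{self.name}+{other.name}", entity_classes=merged)


# Class names are illustrative; "target" appears in both to show de-duplication.
chemistry = SketchProfile("chemistry", ("compound_name", "smiles", "target"))
biology = SketchProfile("biology", ("target", "gene_name"))
combined = chemistry.merge(biology)
```

Here combined.entity_classes contains each class once, in first-seen order, so an entity class shared by both profiles is only requested once.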

Custom profiles

from structflo.ner import NERExtractor, EntityProfile

my_profile = EntityProfile(
    name="kinase_inhibitors",
    entity_classes=["compound_name", "smiles", "target", "bioactivity"],
    prompt="Extract kinase inhibitor names, SMILES, targets, and potency values.",
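    examples=my_examples,  # few-shot examples you define beforehand (not shown here)
)
extractor = NERExtractor(api_key="YOUR_GEMINI_KEY")
# `text` is the input string to extract from
result = extractor.extract(text, profile=my_profile)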

Working with results

result.all_entities()   # flat list of every entity
result.to_dict()        # plain dictionary
result.to_dataframe()   # pandas DataFrame (requires structflo-ner[dataframe])
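
If pandas is not installed, the dictionary form can still be flattened into rows by hand. The sketch below assumes that to_dict() returns a mapping from entity type to a list of per-entity dictionaries with at least a "text" key. That shape is an assumption, not something this page documents, so inspect the real output and adjust before relying on it. The helper name flatten_entities is hypothetical.

```python
def flatten_entities(result_dict):
    """Turn {entity_type: [entity_dict, ...]} into one row per entity."""
    rows = []
    for entity_type, entities in result_dict.items():
        for entity in entities:
            # Keep every field of the entity and record which group it came from.
            rows.append({"entity_type": entity_type, **entity})
    return rows


# Stand-in for result.to_dict(); the real structure may differ.
sample = {
    "compounds": [{"text": "Gefitinib"}],
    "targets": [{"text": "EGFR"}],
    "diseases": [{"text": "NSCLC"}],
}
rows = flatten_entities(sample)
```

Each row is a plain dictionary, so the list can be passed to csv.DictWriter or json.dumps without further dependencies.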

Notebooks

See the notebooks/ directory for worked examples:

  • 01_quickstart.ipynb — end-to-end extraction with cloud and local models
