Drug target intelligence aggregator — fetch, collate, and visualize public data for any protein target in one command.

TargetRecon

Aggregate UniProt · PDB · AlphaFold · ChEMBL · STRING-DB into a single interactive report — in seconds.

What is TargetRecon?

TargetRecon is a Python CLI and web app that pulls data from five public databases (UniProt, RCSB PDB, AlphaFold DB, ChEMBL, and STRING-DB) and compiles it into a single, richly formatted report for any protein drug target. No API keys. No account. No manual copy-pasting.

Think of it as gget for drug discovery — or TargetDB reimagined for the AlphaFold era.

Five Data Sources, One Report

| Source | Data |
| --- | --- |
| UniProt | Function, subcellular location, GO terms, diseases, keywords |
| RCSB PDB | Up to 50 experimental structures, filtered by resolution (default ≤ 4.0 Å), sorted by resolution ascending |
| AlphaFold DB | Predicted structure with pLDDT confidence coloring |
| ChEMBL | Bioactivity data (IC50, Ki, Kd, EC50) sorted by pChEMBL descending |
| STRING-DB | Protein–protein interaction network |

Intelligent ID Resolution

Accepts gene names, UniProt accessions, or ChEMBL target IDs:

targetrecon EGFR          # Gene name
targetrecon P00533        # UniProt accession
targetrecon CHEMBL203     # ChEMBL target ID
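
The three query forms above can be told apart by shape alone. The sketch below is an illustration, not TargetRecon's actual resolver: `classify_query` and both patterns are hypothetical names, and the UniProt pattern follows the accession format published by UniProt.

```python
import re

# Hypothetical patterns for illustration; the real resolver may work differently.
CHEMBL_ID = re.compile(r"^CHEMBL\d+$", re.IGNORECASE)
# Accession format as documented by UniProt (6 or 10 characters).
UNIPROT_ACC = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def classify_query(query: str) -> str:
    """Return 'chembl', 'uniprot', or 'gene' for a raw query string."""
    q = query.strip()
    if CHEMBL_ID.match(q):
        return "chembl"
    if UNIPROT_ACC.match(q.upper()):
        return "uniprot"
    # Anything else is treated as a gene name and would need a lookup.
    return "gene"


for example in ("EGFR", "P00533", "CHEMBL203"):
    print(example, classify_query(example))
```

The ChEMBL check runs first because it is the most specific; anything that matches neither pattern falls through to a gene-name lookup.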

Bioactivity Data

What is pChEMBL?

pChEMBL is a unified potency scale — the negative log₁₀ of the molar affinity. Higher = more potent.

pChEMBL = -log₁₀(affinity_M)

| pChEMBL | Affinity | Interpretation |
| --- | --- | --- |
| 9 | 1 nM | Very potent |
| 7 | 100 nM | Potent |
| 6 | 1 µM | Moderate |
| < 5 | > 10 µM | Weak |

ChEMBL natively reports pChEMBL values. TargetRecon uses this scale directly.
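
For values measured outside ChEMBL, the formula above is easy to apply by hand. A minimal sketch, assuming the affinity is given in nanomolar; the function names are hypothetical and not part of TargetRecon:

```python
import math


def pchembl_from_nm(affinity_nm: float) -> float:
    """Convert an affinity in nM to the pChEMBL scale: -log10(affinity in M)."""
    if affinity_nm <= 0:
        raise ValueError("affinity must be positive")
    return -math.log10(affinity_nm * 1e-9)  # nM -> M, then negative log10


def nm_from_pchembl(pchembl: float) -> float:
    """Inverse conversion: pChEMBL back to an affinity in nM."""
    return 10 ** (9 - pchembl)


for nm in (1, 100, 1000):
    print(f"{nm} nM -> pChEMBL {pchembl_from_nm(nm):.1f}")
```

Each unit on the scale is a tenfold change in potency, which is why 1 nM (pChEMBL 9) and 100 nM (pChEMBL 7) sit two units apart.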

Full pipeline — what happens when you run a query

1. Resolve query → UniProt ID / ChEMBL target ID
         │
         ▼
2. Fetch in parallel (async):
   ├── ChEMBL API  → top-N bioactivity records, sorted by pChEMBL desc (server-side)
   ├── RCSB PDB    → experimental structures
   ├── AlphaFold   → predicted structure
   └── STRING DB   → protein interactions
         │
         ▼
3. Apply min_pchembl filter (if set) to ChEMBL records
         │
         ▼
4. Deduplicate by canonical SMILES → Ligand Summary
         │
         ▼
5. Sort Ligand Summary by best pChEMBL descending
         │
         ▼
6. Output: TargetReport (bioactivities + ligand_summary + structures + ...)
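
Step 2 of the pipeline, the parallel fetch, can be sketched with `asyncio.gather`. This is an illustration under stated assumptions: `fetch_stub` stands in for the real HTTP clients, and the source names are placeholders rather than TargetRecon internals.

```python
import asyncio


async def fetch_stub(source: str, target_id: str) -> dict:
    """Stand-in for a real HTTP client call (hypothetical)."""
    await asyncio.sleep(0)  # yield control, as a network request would
    return {"source": source, "target": target_id}


async def fetch_all(target_id: str) -> dict:
    """Start all four fetches concurrently and collect results by source."""
    sources = ["chembl", "pdb", "alphafold", "string"]
    results = await asyncio.gather(*(fetch_stub(s, target_id) for s in sources))
    # gather preserves input order, so zip pairs each source with its result.
    return dict(zip(sources, results))


report = asyncio.run(fetch_all("P00533"))
print(sorted(report))
```

Because the four requests are independent, total latency is roughly that of the slowest source rather than the sum of all four.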

Fetching strategy

ChEMBL — server-side sort + pagination:

Sends order_by=-pchembl_value to the API
Only the top-N most potent records are fetched — no wasted API calls
Records with no pChEMBL value are excluded at the API level
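
A request along these lines might be built as follows. Only `order_by=-pchembl_value` comes from the description above; the endpoint URL, the `pchembl_value__isnull` filter, and the `limit`/`offset` paging parameters are assumptions based on the public ChEMBL web services, and `build_activity_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public ChEMBL REST API.
CHEMBL_ACTIVITY_URL = "https://www.ebi.ac.uk/chembl/api/data/activity.json"


def build_activity_url(target_chembl_id: str, limit: int = 1000, offset: int = 0) -> str:
    """Build a URL for one page of activities, most potent first."""
    params = {
        "target_chembl_id": target_chembl_id,
        "pchembl_value__isnull": "false",  # assumption: drop records without pChEMBL
        "order_by": "-pchembl_value",      # leading '-' requests descending order
        "limit": limit,                    # page size
        "offset": offset,                  # start of the page
    }
    return f"{CHEMBL_ACTIVITY_URL}?{urlencode(params)}"


print(build_activity_url("CHEMBL203", limit=500))
```

Sorting on the server means the first page already holds the most potent records, so fetching can stop as soon as the cap is reached.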

Cap behavior

The default is 1000 records:

| Setting | ChEMBL | Total |
| --- | --- | --- |
| Default (1000) | top 1000 most potent | up to 1000 |
| `--max-bioactivities 500` | top 500 most potent | up to 500 |
| `--max-bioactivities all` | all records | all available |

Because sorting happens before the cap, you always get the most potent compounds — never a random subset.

| Interface | No-limit syntax |
| --- | --- |
| CLI | `--max-bioactivities all` |
| Web UI | Drag the Max bioactivities slider to All |
| Python API | `max_bioactivities=None` |
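
The sort-then-cap rule, including `None` as "no limit", can be illustrated in memory. TargetRecon does the sorting on the ChEMBL server, so this is only a sketch of the semantics; `top_n_by_pchembl` and the record layout are hypothetical.

```python
from typing import Optional


def top_n_by_pchembl(records: list[dict], max_bioactivities: Optional[int] = 1000) -> list[dict]:
    """Sort by pChEMBL descending, then keep at most max_bioactivities records."""
    ranked = sorted(
        (r for r in records if r.get("pchembl_value") is not None),
        key=lambda r: r["pchembl_value"],
        reverse=True,
    )
    # None means no cap; otherwise slice after sorting so the best survive.
    return ranked if max_bioactivities is None else ranked[:max_bioactivities]


records = [
    {"id": "a", "pchembl_value": 6.0},
    {"id": "b", "pchembl_value": None},  # excluded: no pChEMBL value
    {"id": "c", "pchembl_value": 9.1},
    {"id": "d", "pchembl_value": 7.2},
]
print([r["id"] for r in top_n_by_pchembl(records, 2)])
```

Slicing before sorting would keep whichever records happened to arrive first, which is exactly the random subset the cap is designed to avoid.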

Ligand deduplication

ChEMBL records are grouped by canonical SMILES (via RDKit). If the same molecule appears in multiple assays:

It becomes one entry in the Ligand Summary
The best pChEMBL across all assays is kept
num_assays counts the total assay measurements

The final Ligand Summary is sorted by best pChEMBL descending — the most potent unique compound is always first.
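
The grouping rules above can be sketched in plain Python. This is an illustration rather than TargetRecon's code: `summarize_ligands` and the record keys are hypothetical, and canonicalization is passed in as a function so the example runs without RDKit installed.

```python
from typing import Callable


def summarize_ligands(
    records: list[dict],
    canonicalize: Callable[[str], str] = lambda smiles: smiles,
) -> list[dict]:
    """Group records by canonical SMILES, keeping the best pChEMBL per molecule."""
    summary: dict[str, dict] = {}
    for rec in records:
        key = canonicalize(rec["smiles"])
        entry = summary.setdefault(
            key, {"smiles": key, "best_pchembl": rec["pchembl_value"], "num_assays": 0}
        )
        entry["best_pchembl"] = max(entry["best_pchembl"], rec["pchembl_value"])
        entry["num_assays"] += 1  # one count per assay measurement
    # Most potent unique compound first.
    return sorted(summary.values(), key=lambda e: e["best_pchembl"], reverse=True)


records = [
    {"smiles": "CCO", "pchembl_value": 6.0},
    {"smiles": "c1ccccc1", "pchembl_value": 7.0},
    {"smiles": "CCO", "pchembl_value": 7.5},
]
for ligand in summarize_ligands(records):
    print(ligand)
```

With RDKit available, a real canonicalizer could be passed as `lambda s: Chem.MolToSmiles(Chem.MolFromSmiles(s))`, so that different spellings of the same molecule collapse into one entry.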

CLI

pip install targetrecon

`targetrecon` / `targetrecon run` — Single target

targetrecon EGFR
targetrecon P00533 -f html -f json -f sdf -o ./reports/
targetrecon BRAF --min-pchembl 7.0 --max-resolution 2.5
targetrecon CDK2 --max-bioactivities 5000         # up to 5000 records
targetrecon CDK2 --max-bioactivities all          # no limit

| Option | Default | Description |
| --- | --- | --- |
| `-f, --format [json\|html\|sdf]` | `html json sdf` | Output formats (repeat for multiple) |
| `-o, --output PATH` | `.` | Output directory |
| `--max-resolution FLOAT` | `4.0` | Max PDB resolution in Å (up to 50 structures returned, sorted by resolution) |
| `--max-bioactivities INT\|all` | `1000` | Max ChEMBL bioactivity records; `all` = no limit |
| `--min-pchembl FLOAT` | none | Minimum pChEMBL value filter |
| `--top-ligands INT` | `20` | Number of top ligands for SDF export |
| `-q, --quiet` | off | Suppress progress messages |

`targetrecon batch` — Multiple targets

# Pass targets directly
targetrecon batch EGFR BRAF CDK2 ABL1

# From a file (one target per line, # = comment)
targetrecon batch -i targets.txt

# With filters and format selection
targetrecon batch -i targets.txt -f html -f sdf --min-pchembl 6.0 --skip-errors

# Unlimited bioactivities
targetrecon batch -i targets.txt --max-bioactivities all

| Option | Default | Description |
| --- | --- | --- |
| `-i, --input PATH` | none | Text file, one target per line |
| `-o, --output PATH` | `./batch_reports` | Output directory |
| `-f, --format [json\|html\|sdf]` | `html json sdf` | Output formats (repeat for multiple) |
| `--max-resolution FLOAT` | `4.0` | Max PDB resolution in Å (up to 50 structures returned, sorted by resolution) |
| `--max-bioactivities INT\|all` | `1000` | Max ChEMBL bioactivity records; `all` = no limit |
| `--min-pchembl FLOAT` | none | Minimum pChEMBL value filter |
| `--top-ligands INT` | `20` | Ligands per SDF file |
| `--skip-errors` | off | Continue if a single target fails |
| `-q, --quiet` | off | Suppress progress messages |

After completion, a summary table is printed showing structures / bioactivities / ligands per target.
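
The input file format (one target per line, `#` for comments) is simple enough to generate or validate in a script. A minimal sketch, assuming that only lines starting with `#` count as comments and that blank lines are ignored; `parse_targets` is a hypothetical helper, not part of the package.

```python
def parse_targets(text: str) -> list[str]:
    """Return target names from file contents, skipping blanks and comments."""
    targets = []
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: a comment is a line whose first non-space character is '#'.
        if not line or line.startswith("#"):
            continue
        targets.append(line)
    return targets


example = """\
# kinase panel
EGFR
BRAF

P00533
"""
print(parse_targets(example))
```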

`targetrecon serve` — Launch web interface

targetrecon serve                  # http://localhost:5000
targetrecon serve --port 8080
targetrecon serve --host 0.0.0.0   # expose on all interfaces

| Option | Default | Description |
| --- | --- | --- |
| `--port INT` | `5000` | Port to listen on |
| `--host TEXT` | `0.0.0.0` | Host to bind |
| `--debug` | off | Enable Flask debug mode |

Web UI

targetrecon serve
# Open http://localhost:5000

Dark-themed interface with animated molecular backdrop
Search by gene name, UniProt accession, or ChEMBL target ID
Molecule sketcher (Ketcher) — draw a structure to find matching targets
Sidebar controls: max PDB resolution, min pChEMBL, ChEMBL toggle, max bioactivities slider (100–5000, or drag to All for no limit)

Report tabs

| Tab | Contents |
| --- | --- |
| Overview | UniProt summary, GO terms, diseases, protein stats |
| 3D Viewer | AlphaFold pLDDT coloring + PDB experimental structures (3Dmol.js) |
| Bioactivity | pChEMBL distribution histogram, method breakdown chart |
| Ligands | Sortable table ranked by potency: SMILES, ChEMBL links, activity type, source |
| PDB | All experimental structures with resolution, method, ligand count |
| Interactions | STRING protein–protein interaction network (Cytoscape.js) |

Export from the UI

Every report page has one-click download buttons:

JSON — full machine-readable report
HTML — self-contained interactive report (works fully offline)
SDF — top ligands with 3D conformers, ready for docking

AI Agent

An AI chat panel is available on every report page. Click the AI button (bottom-right corner) to open it.

Providers & models

| Provider | Models |
| --- | --- |
| Anthropic | claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5 |
| OpenAI | gpt-4o, gpt-4o-mini |
| Groq | llama-3.3-70b, mixtral |

Bring your own API key: keys are never stored and are discarded at the end of each browser session
Context-aware: the agent already knows the target you're looking at
Tools: search targets, fetch bioactivities, query PDB structures, protein interactions, compare targets
Streaming responses with stop button
Resizable and minimizable panel

Example questions

What are the best scaffolds for covalent inhibition?
Which PDB structures are most suitable for docking?
Compare the selectivity profile of this target vs CDK4.
Summarize the druggability of this target.

Python API

import targetrecon

# Single target — works in scripts and Jupyter
report = targetrecon.recon("EGFR")
print(report.uniprot.protein_name)      # "Epidermal growth factor receptor"
print(report.num_pdb_structures)         # up to 50 (PDB structure cap)
print(report.num_bioactivities)          # up to 1000 (default ChEMBL cap)
print(report.best_ligand.best_pchembl)   # e.g. 10.52

# With options
report = targetrecon.recon(
    "BRAF",
    max_bioactivities=5000,   # up to 5000 ChEMBL records
    min_pchembl=7.0,
    max_pdb_resolution=2.5,
)

# No limit — fetch all available records
report = targetrecon.recon("BRAF", max_bioactivities=None)

# Async (for use with asyncio.run or inside async functions)
import asyncio
report = asyncio.run(targetrecon.recon_async("BRAF"))

# Async with all options
report = asyncio.run(targetrecon.recon_async(
    "CDK2",
    max_bioactivities=2000,
    # max_bioactivities=None  # no limit
))

Accessing the data

# UniProt info
report.uniprot.protein_name           # e.g. "Epidermal growth factor receptor"
report.uniprot.gene_name              # e.g. "EGFR"
report.uniprot.organism               # e.g. "Homo sapiens"
report.uniprot.function_description   # functional annotation text
report.uniprot.subcellular_locations  # list[str]
report.uniprot.disease_associations   # list[str]
report.uniprot.keywords               # list[str]
report.uniprot.go_terms               # list[GoTerm] — each has .go_id, .term, .category
report.uniprot.sequence_length        # int

# PDB structures
for pdb in report.pdb_structures[:5]:
    print(pdb.pdb_id, pdb.resolution, pdb.method)
    for lig in pdb.ligands:           # list[PDBLigand] — each has .ligand_id, .smiles, .name
        print(lig.ligand_id, lig.name)

# AlphaFold
report.alphafold.pdb_url        # URL to AlphaFold PDB structure
report.alphafold.model_url      # URL to AlphaFold CIF model

# Bioactivity records (sorted by pChEMBL descending)
for b in report.bioactivities[:10]:
    print(b.source, b.activity_type, b.value, b.pchembl_value, b.smiles)

# Ligand summary (deduplicated by canonical SMILES, sorted by best pChEMBL)
for lig in report.ligand_summary[:10]:
    print(lig.name, lig.chembl_id, lig.best_pchembl, lig.best_activity_type, lig.num_assays)
    print(lig.sources)                # e.g. ["ChEMBL"]

report.best_ligand               # most potent unique ligand overall

Export

from targetrecon.core import save_html, save_json, save_sdf

save_html(report, "EGFR_report.html")
save_json(report, "EGFR_report.json")

# SDF with filters
save_sdf(report, "EGFR_ligands.sdf",
         top_n=50,              # limit to top 50
         min_pchembl=7.0,       # only pChEMBL ≥ 7
         activity_type="IC50")  # only IC50 records

Batch (async, concurrent)

import asyncio

import targetrecon

async def run_batch(targets):
    reports = await asyncio.gather(*[
        targetrecon.recon_async(t) for t in targets
    ])
    return reports

reports = asyncio.run(run_batch(["EGFR", "BRAF", "CDK2"]))
for r in reports:
    print(r.uniprot.gene_name, r.num_bioactivities, r.num_unique_ligands)
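
For long target lists, launching every request at once may put unnecessary load on the public APIs. One possible refinement, sketched here with a stand-in coroutine instead of `targetrecon.recon_async`, is to bound concurrency with a semaphore and collect failures instead of aborting, similar in spirit to `--skip-errors`. `gather_limited` and `fake_recon` are hypothetical names.

```python
import asyncio


async def gather_limited(worker, targets, limit=3):
    """Run worker(target) for every target, at most `limit` at a time."""
    sem = asyncio.Semaphore(limit)

    async def run_one(target):
        async with sem:  # wait for a free slot before starting
            return await worker(target)

    # return_exceptions=True keeps one failure from cancelling the rest.
    return await asyncio.gather(*(run_one(t) for t in targets), return_exceptions=True)


async def fake_recon(target):
    """Stand-in for targetrecon.recon_async (hypothetical behavior)."""
    await asyncio.sleep(0)
    if target == "BAD":
        raise ValueError(f"could not resolve {target}")
    return f"report:{target}"


results = asyncio.run(gather_limited(fake_recon, ["EGFR", "BAD", "CDK2"], limit=2))
for item in results:
    print("failed:" if isinstance(item, Exception) else "ok:", item)
```

Passing `targetrecon.recon_async` as `worker` should work the same way, since it is awaited with a single target argument in the batch example above.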

Comparison

| Feature | TargetRecon | TargetDB (2020) | gget | Open Targets |
| --- | --- | --- | --- | --- |
| AlphaFold integration | ✅ | ❌ | ✅ | ✅ (web) |
| ChEMBL bioactivity | ✅ | ✅ | ❌ | Partial |
| Interactive HTML report | ✅ | ❌ | ❌ | Web only |
| 3D structure viewer | ✅ | ❌ | ❌ | Web only |
| Molecule sketcher → targets | ✅ | ❌ | ❌ | ❌ |
| Docking-ready SDF export | ✅ | ❌ | ❌ | ❌ |
| AI agent chat | ✅ | ❌ | ❌ | ❌ |
| Batch CLI processing | ✅ | ❌ | ✅ | N/A |
| pip install + single command | ✅ | Partial | ✅ | N/A |

Installation

pip install targetrecon

Quick start:

targetrecon EGFR

Produces EGFR_report.html (interactive, self-contained), EGFR_report.json, and EGFR_top_ligands.sdf — ready for docking.

Development:

git clone https://github.com/nagarh/targetrecon.git
cd targetrecon
pip install -e ".[dev]"

Architecture

src/targetrecon/
├── cli.py           # Click CLI — run, batch, serve
├── webapp.py        # Flask web app — UI, report pages, AI agent routes
├── core.py          # Orchestration, aggregation, export (HTML/JSON/SDF)
├── models.py        # Pydantic data models
├── resolver.py      # Gene → UniProt → ChEMBL ID resolution
├── report.py        # Jinja2 HTML report generator (standalone)
├── agent_chat.py    # AI agent — tool definitions, streaming, multi-provider
└── clients/
    ├── uniprot.py   # UniProt REST API
    ├── pdb_client.py  # RCSB PDB REST + Search API
    ├── alphafold.py # AlphaFold Database API
    ├── chembl.py    # ChEMBL REST API
    └── string_db.py # STRING-DB REST API

Author

Hemantn Nagar 📧 hn533621@ohio.edu 🔗 github.com/nagarh

References

Data from: UniProt · RCSB PDB · AlphaFold DB · ChEMBL · STRING-DB

Visualization: 3Dmol.js · Chart.js · Cytoscape.js

Sketcher: Ketcher

Inspired by TargetDB and gget.

License

MIT License — see LICENSE for details.

Release history

0.1.12 (this version): Mar 22, 2026
0.1.11: Mar 22, 2026
0.1.10: Mar 22, 2026
0.1.9: Mar 22, 2026
0.1.8: Mar 21, 2026
0.1.7: Mar 21, 2026
0.1.6: Mar 21, 2026
0.1.5: Mar 21, 2026
0.1.4: Mar 21, 2026
0.1.3: Mar 21, 2026
0.1.2: Mar 21, 2026
0.1.1: Mar 21, 2026
0.1.0: Mar 21, 2026
