Thompson Sampling-Assisted Chemical Targeting and Iterative Compound Selection for Drug Discovery

TACTICS: Thompson Sampling-Assisted Chemical Targeting and Iterative Compound Selection for Drug Discovery

A comprehensive library for Thompson Sampling-based optimization of chemical combinatorial libraries, featuring a unified architecture with flexible strategy selection, modern Pydantic configuration, and preset configurations for out-of-the-box usage.

Quick Start with Interactive Tutorials

TACTICS includes interactive marimo notebooks for learning and exploration. For full documentation, see the TACTICS Documentation.

Installation

pip install "chem-tactics[tutorials]"  # Includes marimo; quotes keep shells such as zsh from expanding the brackets

Running Tutorials

As an interactive app (recommended for exploration):

marimo run tutorials/thompson_sampling_tutorial.py

In edit mode (for learning/modification):

marimo edit tutorials/thompson_sampling_tutorial.py

Available Tutorials

Tutorial                          Description
library_enumeration_tutorial.py   SynthesisPipeline and enumeration
thompson_sampling_tutorial.py     Selection strategies comparison
reaction_config_builder.py        ReactionConfig builder
library_component_comparison.py   Library component analysis
legacy_vs_current_comparison.py   Legacy vs current benchmark

Note: Tutorials default to the bundled Thrombin dataset. Select "Local Data" mode to use your own files.

Key Features

  • Unified Thompson Sampling Framework: Single ThompsonSampler with pluggable selection strategies
  • Multiple Selection Strategies:
    • Greedy (pure exploitation)
    • Roulette Wheel (adaptive thermal cycling)
    • UCB (Upper Confidence Bound)
    • Epsilon-Greedy (balanced exploration/exploitation)
    • Bayes-UCB (Bayesian upper confidence bound)
    • Boltzmann (temperature-based selection)
  • Warmup Strategies: Balanced (recommended), Standard, Enhanced
  • Preset Configurations: 5 ready-to-use presets for common use cases
  • Modern Pydantic Configuration: Type-safe configuration with full validation
  • Parallel Processing: Batch mode with multiprocessing for expensive evaluators
  • Multiple Evaluators: Lookup, Database, ROCS, Fred, ML classifiers, and more
  • Synthesis Pipeline: SynthesisPipeline architecture for single-step, alternative SMARTS, and multi-step reactions
  • SMARTS Toolkit: ReactionDef with built-in validation, visualization, and protecting group support
  • Library Enumeration: Efficient generation of combinatorial reaction products with write_enumerated_library()
  • Library Analysis: Comprehensive analysis and visualization tools
  • Polars DataFrames: Fast, efficient data handling throughout
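
To make the difference between some of the selection strategies listed above concrete, here is a minimal standard-library sketch of three of them (Greedy, Epsilon-Greedy, UCB) applied to the running statistics of one reagent component. It does not use the TACTICS API: the function names, the `c` constant, and the example statistics are all assumptions made for illustration, and the library's own implementations may differ.

```python
import math
import random


def greedy(means):
    """Pure exploitation: index of the highest estimated mean."""
    return max(range(len(means)), key=lambda i: means[i])


def epsilon_greedy(means, epsilon, rng):
    """Explore uniformly with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(means))
    return greedy(means)


def ucb(means, counts, c=2.0):
    """Upper Confidence Bound: mean plus a bonus that shrinks as a reagent is sampled more."""
    total = sum(counts)

    def bound(i):
        if counts[i] == 0:
            return math.inf  # untried reagents are always tried first
        return means[i] + math.sqrt(c * math.log(total) / counts[i])

    return max(range(len(means)), key=bound)


# Hypothetical running statistics for three reagents in one component.
means = [0.40, 0.55, 0.50]
counts = [50, 50, 2]
rng = random.Random(0)

greedy_pick = greedy(means)      # reagent 1: highest mean
ucb_pick = ucb(means, counts)    # reagent 2: rarely sampled, so its bonus dominates
eps_picks = [epsilon_greedy(means, 0.2, rng) for _ in range(1000)]
```

Greedy commits to the reagent with the best mean so far, UCB favours the barely sampled reagent because its uncertainty bonus is large, and epsilon-greedy mostly follows greedy but spends a fixed fraction of picks on random exploration.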

Package Structure

TACTICS/
├── thompson_sampling/
│   ├── config.py              # ThompsonSamplingConfig (Pydantic v2)
│   ├── presets.py             # Preset configurations
│   ├── factories.py           # Factory functions for component creation
│   ├── core/                  # Core unified sampler
│   │   ├── sampler.py         # ThompsonSampler (unified)
│   │   ├── evaluators.py      # All evaluator classes
│   │   └── evaluator_config.py # Evaluator Pydantic configs
│   ├── strategies/            # Selection strategies
│   │   ├── greedy.py
│   │   ├── roulette_wheel.py
│   │   ├── ucb.py
│   │   ├── epsilon_greedy.py
│   │   ├── bayes_ucb.py
│   │   └── config.py          # Strategy Pydantic configs
│   ├── warmup/                # Warmup strategies
│   │   └── config.py          # Warmup Pydantic configs (Balanced, Standard, Enhanced)
│   └── baseline.py            # Random baseline sampling
├── library_enumeration/       # Library generation tools
│   ├── synthesis_pipeline.py  # SynthesisPipeline - main entry point
│   ├── enumeration_utils.py   # EnumerationResult, EnumerationError
│   ├── file_writer.py         # write_enumerated_library()
│   ├── generate_products.py   # Product generation utilities
│   └── smarts_toolkit/        # SMARTS validation and configuration
│       ├── config.py          # ReactionDef, ReactionConfig, StepInput, DeprotectionSpec
│       ├── _validator.py      # ValidationResult, internal validation
│       └── constants.py       # Protecting groups, salt fragments
└── library_analysis/          # Analysis and visualization

Repository Structure

TACTICS/
├── src/TACTICS/              # Core package (pip installable)
│   ├── thompson_sampling/    # Thompson Sampling algorithms
│   ├── library_enumeration/  # Library generation tools
│   ├── library_analysis/     # Analysis and visualization
│   └── data/                 # Bundled tutorial datasets
│       └── thrombin/         # Thrombin inhibitor dataset
│
├── tutorials/                # Interactive marimo tutorials
├── tests/                    # Unit and integration tests
└── docs/                     # Sphinx documentation

Quick Start

Simple Out-of-the-Box Usage with Presets (Recommended)

The easiest way to get started is using presets with SynthesisPipeline:

from TACTICS.library_enumeration import SynthesisPipeline, ReactionConfig, ReactionDef
from TACTICS.thompson_sampling import ThompsonSampler, get_preset
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

# 1. Create synthesis pipeline (single source of truth for reactions)
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0,
        description="Amide coupling"
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# 2. Create evaluator config
evaluator = LookupEvaluatorConfig(ref_filename="scores.csv")

# 3. Get a preset configuration
config = get_preset(
    "fast_exploration",  # Quick screening with epsilon-greedy
    synthesis_pipeline=pipeline,
    evaluator_config=evaluator,
    mode="minimize",  # Use "minimize" for docking scores
    num_iterations=1000
)

# 4. Create sampler from config and run
sampler = ThompsonSampler.from_config(config)
warmup_df = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
results_df = sampler.search(num_cycles=config.num_ts_iterations)
sampler.close()

# 5. Analyze top results
print(results_df.sort("score").head(10))

Available Presets:

  • "fast_exploration" - Epsilon-greedy strategy, quick screening
  • "parallel_batch" - Batch processing with multiprocessing (for slow evaluators)
  • "conservative_exploit" - Greedy strategy, focus on best reagents
  • "balanced_sampling" - UCB strategy with theoretical guarantees
  • "diverse_coverage" - Maximum diversity exploration
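
The quick start calls `warm_up()` followed by `search()`. The sketch below illustrates the general idea behind those two phases for a two-component library, using only the standard library. It is not the TACTICS implementation: `ReagentBelief`, `toy_score`, and `run` are hypothetical names, the Gaussian belief update is a simplifying assumption, and the toy evaluator stands in for a real scoring function. Scores are maximized here; for a "minimize" objective such as docking, one option is to negate the scores before updating.

```python
import math
import random


class ReagentBelief:
    """Running mean and variance of the scores seen for one reagent (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0

    def update(self, score):
        self.n += 1
        delta = score - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (score - self.mean)

    def sample(self, rng):
        # Draw a plausible mean from an approximate posterior N(mean, var / n).
        var = self.m2 / (self.n - 1) if self.n > 1 else 1.0
        return rng.gauss(self.mean, math.sqrt(var / max(self.n, 1)))


def toy_score(i, j):
    """Stand-in evaluator (assumption): higher is better, best pair is (7, 3)."""
    return -abs(i - 7) - abs(j - 3)


def run(n_acids=10, n_amines=10, warmup_trials=3, cycles=200, seed=0):
    rng = random.Random(seed)
    beliefs = [
        [ReagentBelief() for _ in range(n_acids)],
        [ReagentBelief() for _ in range(n_amines)],
    ]
    results = []

    def evaluate(i, j):
        score = toy_score(i, j)
        beliefs[0][i].update(score)
        beliefs[1][j].update(score)
        results.append((score, i, j))

    # Warm-up: pair every reagent with random partners a few times.
    for _ in range(warmup_trials):
        for i in range(n_acids):
            evaluate(i, rng.randrange(n_amines))
        for j in range(n_amines):
            evaluate(rng.randrange(n_acids), j)

    # Search: sample one value per reagent, then combine the best of each component.
    for _ in range(cycles):
        i = max(range(n_acids), key=lambda k: beliefs[0][k].sample(rng))
        j = max(range(n_amines), key=lambda k: beliefs[1][k].sample(rng))
        evaluate(i, j)
    return results


results = run()
best = max(results)  # (score, acid index, amine index) of the best product seen
```

Only 260 of the 100 possible pairs' repeated evaluations are needed in this toy case, but the same loop structure is what makes the approach attractive when the full library is far too large to enumerate and score.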

Parallel Batch Processing (for Expensive Evaluators)

For slow evaluators (docking, ML models), use batch mode with multiprocessing:

from TACTICS.library_enumeration import SynthesisPipeline, ReactionConfig, ReactionDef
from TACTICS.thompson_sampling import ThompsonSampler, get_preset
from TACTICS.thompson_sampling.core.evaluator_config import FredEvaluatorConfig

# Create synthesis pipeline
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# Configure slow evaluator (molecular docking)
evaluator = FredEvaluatorConfig(design_unit_file="receptor.oedu")

# Get parallel batch preset
config = get_preset(
    "parallel_batch",
    synthesis_pipeline=pipeline,
    evaluator_config=evaluator,
    mode="minimize",  # Docking scores (lower is better)
    batch_size=100,   # Sample 100 compounds per cycle
)

# Create sampler and run
sampler = ThompsonSampler.from_config(config)
warmup_df = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
results_df = sampler.search(num_cycles=config.num_ts_iterations)
sampler.close()

Custom Configuration (Advanced)

For full control, create custom configurations:

from TACTICS.library_enumeration import SynthesisPipeline, ReactionConfig, ReactionDef
from TACTICS.thompson_sampling import ThompsonSampler, ThompsonSamplingConfig
from TACTICS.thompson_sampling.strategies.config import RouletteWheelConfig
from TACTICS.thompson_sampling.warmup.config import BalancedWarmupConfig
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

# Create synthesis pipeline
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# Create fully customized configuration
config = ThompsonSamplingConfig(
    synthesis_pipeline=pipeline,
    num_ts_iterations=5000,
    num_warmup_trials=5,
    strategy_config=RouletteWheelConfig(
        mode="maximize",
        alpha=0.1,  # Initial heating temperature
        beta=0.1,   # Initial cooling temperature
    ),
    warmup_config=BalancedWarmupConfig(
        observations_per_reagent=5,
        use_per_reagent_variance=True,
    ),
    evaluator_config=LookupEvaluatorConfig(
        ref_filename="scores.csv",
        score_col="binding_affinity"
    ),
    batch_size=10,
    log_filename="optimization.log"
)

# Create sampler and run
sampler = ThompsonSampler.from_config(config)
warmup_df = sampler.warm_up(num_warmup_trials=config.num_warmup_trials)
results_df = sampler.search(num_cycles=config.num_ts_iterations)
sampler.close()

# Save results
results_df.write_csv("my_results.csv")
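
The roulette-wheel strategy configured above is temperature-based. As a rough illustration of how temperature shifts the balance between exploitation and exploration, the following sketch implements a plain Boltzmann (softmax) roulette wheel with the standard library. It is a generic textbook version, not TACTICS's adaptive thermal cycling: the `alpha` and `beta` parameters are not modelled, and `boltzmann_weights` and `roulette_pick` are hypothetical helper names.

```python
import math
import random


def boltzmann_weights(scores, temperature):
    """Softmax of scores / temperature, shifted by the max for numerical stability."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def roulette_pick(scores, temperature, rng):
    """Spin the wheel: each index is chosen with probability equal to its weight."""
    weights = boltzmann_weights(scores, temperature)
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


scores = [1.0, 2.0, 3.0]                 # hypothetical per-reagent mean scores
cold = boltzmann_weights(scores, 0.1)    # low temperature: nearly greedy
hot = boltzmann_weights(scores, 10.0)    # high temperature: nearly uniform
pick = roulette_pick(scores, 0.1, random.Random(0))
```

At low temperature almost all probability mass sits on the best-scoring reagent, while at high temperature the wheel approaches uniform sampling.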

Random Baseline Sampling

from TACTICS.library_enumeration import SynthesisPipeline, ReactionConfig, ReactionDef
from TACTICS.thompson_sampling import RandomBaselineConfig, run_random_baseline
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

# Create synthesis pipeline
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

config = RandomBaselineConfig(
    synthesis_pipeline=pipeline,
    evaluator_config=LookupEvaluatorConfig(ref_filename="scores.csv"),
    num_trials=1000,
    num_to_save=100,
    ascending_output=False,
    outfile_name="random_results.csv"
)

results_df = run_random_baseline(config)
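
Conceptually, a random baseline scores uniformly random reagent combinations and keeps the best few, which gives a reference point for judging whether a guided search is doing better than chance. A minimal standard-library sketch of that idea follows. It is an assumption about the general approach rather than a description of `run_random_baseline`; `random_baseline` and `toy_score` are hypothetical names, and pairs are drawn with replacement, so duplicates can occur.

```python
import heapq
import random


def random_baseline(n_acids, n_amines, score_fn, num_trials, num_to_save, seed=0):
    """Score uniformly random reagent pairs and keep the num_to_save highest."""
    rng = random.Random(seed)
    trials = []
    for _ in range(num_trials):
        i, j = rng.randrange(n_acids), rng.randrange(n_amines)
        trials.append((score_fn(i, j), i, j))
    # nlargest returns the kept trials sorted from best to worst.
    return heapq.nlargest(num_to_save, trials)


def toy_score(i, j):
    """Stand-in evaluator (assumption): higher is better, best pair is (7, 3)."""
    return -abs(i - 7) - abs(j - 3)


top = random_baseline(10, 10, toy_score, num_trials=1000, num_to_save=100)
```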

Configuration

Pydantic Configuration Models

The package uses Pydantic v2 for robust configuration validation:

from TACTICS.library_enumeration import SynthesisPipeline, ReactionConfig, ReactionDef
from TACTICS.thompson_sampling import ThompsonSamplingConfig
from TACTICS.thompson_sampling.strategies.config import EpsilonGreedyConfig
from TACTICS.thompson_sampling.warmup.config import BalancedWarmupConfig
from TACTICS.thompson_sampling.core.evaluator_config import LookupEvaluatorConfig

# Create synthesis pipeline
rxn_config = ReactionConfig(
    reactions=[ReactionDef(
        reaction_smarts="[C:1](=O)[OH].[NH2:2]>>[C:1](=O)[NH:2]",
        step_index=0
    )],
    reagent_file_list=["acids.smi", "amines.smi"]
)
pipeline = SynthesisPipeline(rxn_config)

# Automatic validation and type checking
config = ThompsonSamplingConfig(
    synthesis_pipeline=pipeline,  # Required: single source of truth
    num_ts_iterations=1000,
    strategy_config=EpsilonGreedyConfig(mode="maximize", epsilon=0.2),
    warmup_config=BalancedWarmupConfig(),
    evaluator_config=LookupEvaluatorConfig(ref_filename="scores.csv"),
)

Configuration Validation

from pydantic import ValidationError
from TACTICS.library_enumeration import ReactionDef

# Invalid configuration raises ValidationError
try:
    rxn = ReactionDef(
        reaction_smarts="invalid-smarts",  # ValidationError: Invalid SMARTS
        step_index=0
    )
except ValidationError as e:
    print(f"Configuration error: {e}")

Testing

The package includes comprehensive tests for configuration validation:

# Run all tests
pytest tests/

# Run configuration tests
pytest tests/test_config_validation.py -v

# Run with coverage
pytest tests/ --cov=TACTICS --cov-report=html

Documentation

  • Full Documentation: TACTICS Documentation
  • Interactive Tutorials: See tutorials/ for marimo notebooks
  • API Reference: Build locally with cd docs && make html

Installation from Source

# Clone repository and install package in development mode
git clone https://github.com/aakankschit/TACTICS.git
cd TACTICS
pip install -e .

# With interactive tutorials (marimo):
pip install -e ".[tutorials]"

# With test dependencies:
pip install -e ".[test]"

Requirements

  • Python 3.11+
  • Multiprocessing support

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass
  6. Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use TACTICS in your research, please cite:

@software{tactics,
    title={TACTICS: Thompson Sampling-Assisted Chemical Targeting and Iterative Compound Selection for Drug Discovery},
    author={Aakankschit Nandkeolyar},
    year={2025},
    url={https://github.com/aakankschit/TACTICS}
}

Support

For questions and support:


This work builds on previous work by Patrick Walters. The project is a collaboration between the University of California, Irvine, Leiden University, and the University of Groningen.
