synthetic-models is a virtual cell generation library for generating biological synthetic data that can be used for benchmarking predictive modelling workflows. The synthetic data are generated using ODE models formulated in biochemical laws common in cancer cell signalling networks. Synthetic can create datasets in the format of scikit-learn's make_regression method.

Project description

Synthetic

Synthetic (aka ModelBuilder) is a virtual cell generation library for generating biological synthetic data that can be used for benchmarking predictive modelling workflows. The synthetic data are generated using ODE models formulated in biochemical laws common in cancer cell signalling networks. Synthetic can create datasets in the format of scikit-learn's make_regression method.

Installation

pip install synthetic-models

For development:

git clone https://github.com/AnEvilBurrito/Synthetic.git
cd Synthetic
uv sync

Alpha Testing

For lab members participating in alpha testing, install directly from the lab's GitHub repository:

pip install git+https://github.com/IntegratedNetworkModellingLab/Synthetic.git

To install a specific version or commit:

pip install git+https://github.com/IntegratedNetworkModellingLab/Synthetic.git@v0.1.0
# or
pip install git+https://github.com/IntegratedNetworkModellingLab/Synthetic.git@<commit-hash>

For lab members with write access, you can use SSH:

pip install git+ssh://git@github.com/IntegratedNetworkModellingLab/Synthetic.git

Quick Start

The high-level API provides a simple interface for creating virtual cell models and generating sklearn-compatible datasets:

from synthetic import Builder, make_dataset_drug_response

# Create a virtual cell model (auto-compiles with default settings)
vc = Builder.specify(degree_cascades=[1, 2, 5], random_seed=42)

# Generate a sklearn-compatible dataset
X, y = make_dataset_drug_response(n=1000, cell_model=vc, target_specie='Oa')

print(f"Feature matrix shape: {X.shape}")  # (1000, n_features)
print(f"Target vector shape: {y.shape}")    # (1000,)

Examples

The examples/ directory contains practical examples demonstrating various features of Synthetic:

create_dataset.py

Basic example showing how to generate a synthetic drug response dataset and analyze feature correlations.

python examples/create_dataset.py

Demonstrates:

Creating a virtual cell model with custom network topology
Generating sklearn-compatible datasets
Calculating Pearson correlations between features and targets
Identifying predictive features in the network

export_model_sbml.py

Shows how to export synthetic models to standard SBML and Antimony formats for interoperability with other tools.

python examples/export_model_sbml.py

Demonstrates:

Creating a virtual cell model
Accessing the underlying ModelBuilder
Exporting models to SBML format (for COPASI, libRoadRunner, etc.)
Exporting models to Antimony format (human-readable)
Retrieving model strings for programmatic use

detailed_examples.md

Comprehensive guide with detailed workflows and advanced use cases, including:

Feature correlation analysis - Complete workflow with CSV export
Combination therapy - Testing multi-drug interactions
Step-by-step explanations of each workflow

View the guide:

cat examples/detailed_examples.md

or open it in your preferred markdown viewer.

API Overview

VirtualCell

The VirtualCell class represents a virtual cell model with a hierarchical network topology.

from synthetic import VirtualCell

# Create a virtual cell without auto-compiling
vc = VirtualCell(
    degree_cascades=[1, 2, 5],      # Network topology: 1 cascade degree 1, 2 cascades degree 2, 5 cascades degree 3
    name="MyCell",
    random_seed=42,
    feedback_density=0.5,           # 50% of possible feedback connections
    auto_compile=False,             # Disable auto-compile for custom configuration
    auto_drug=True,                 # Auto-generate a drug targeting degree 1 species
    drug_name="D",
    drug_start_time=5000.0,          # Drug becomes active at t=5000
    drug_value=100.0,                # Drug active concentration
    drug_regulation_type="down",    # Drug inhibits targets
    simulation_end=10000.0,          # Default simulation end time
)

# Compile the model (required before simulation)
vc.compile(
    mean_range_species=(50, 150),          # Initial concentration range
    rangeScale_params=(0.8, 1.2),          # Parameter scaling range
    use_kinetic_tuner=True,                # Use KineticParameterTuner for biologically plausible parameters
    active_percentage_range=(0.3, 0.7),     # Target 30-70% active species
)

Methods

Method	Description
`add_drug(...)`	Add a drug to the model
`list_drugs()`	List all drugs in the system
`compile(...)`	Compile the model (lazy initialization)
`get_species_names(exclude_drugs=True)`	Get list of species names
`get_initial_values(exclude_drugs=True)`	Get initial species values
`get_target_concentrations()`	Get target active concentrations from kinetic tuner

Properties

Property	Description
`spec`	Underlying `DegreeInteractionSpec` object
`model`	Underlying `ModelBuilder` object
`tuner`	Underlying `KineticParameterTuner` object (if used)

Builder

The Builder class provides a factory method for creating VirtualCell instances with a clean, fluent API.

from synthetic import Builder

# Basic usage (auto-compiles)
vc = Builder.specify(degree_cascades=[1, 2, 5])

# With custom parameters
vc = Builder.specify(
    degree_cascades=[3, 6, 15, 25],
    name="ComplexCell",
    random_seed=42,
    feedback_density=0.5,
    auto_compile=True,
    auto_drug=True,
    drug_name="Inhibitor",
    drug_start_time=5000.0,
    drug_value=100.0,
    drug_regulation_type="down",
    simulation_end=10000.0,
)

Adding Drugs

Drugs can be added manually to target degree 1 R species:

# Create a virtual cell without auto-drug
vc = Builder.specify(degree_cascades=[1, 2, 5], auto_drug=False)

# Add a drug manually (returns self for chaining)
vc.add_drug(
    name="DrugX",
    start_time=500.0,               # Time at which drug becomes active
    default_value=0.0,              # Default value when not active
    regulation=["R1_1", "R1_2"],     # Target species (must be degree 1 R species)
    regulation_type=["down", "down"], # Regulation types: "up" or "down"
    value=100.0,                    # Optional override for active concentration
)

# Compile to apply drugs
vc.compile()

List all drugs in the system:

drugs = vc.list_drugs()
for drug in drugs:
    print(f"Drug: {drug['name']}, Targets: {drug['targets']}, Types: {drug['types']}")

make_dataset_drug_response

Generate synthetic drug response datasets compatible with scikit-learn's make_regression format.

from synthetic import Builder, make_dataset_drug_response

vc = Builder.specify(degree_cascades=[1, 2, 5], random_seed=42)

# Basic usage
X, y = make_dataset_drug_response(
    n=1000,                          # Number of samples
    cell_model=vc,                   # Compiled VirtualCell instance
    target_specie='Oa',              # Outcome species to use as target
)

# With custom parameters
X, y = make_dataset_drug_response(
    n=500,
    cell_model=vc,
    target_specie='Oa',
    perturbation_type='gaussian',    # 'uniform', 'gaussian', 'lognormal', 'lhs'
    perturbation_params={'rsd': 0.2}, # Relative standard deviation
    simulation_params={'start': 0, 'end': 10000, 'points': 101},
    solver_type='scipy',            # 'scipy' or 'roadrunner'
    jit=True,                        # Enable JIT compilation (scipy only)
    verbose=True,                    # Show progress bar
    seed=42,                         # Random seed for reproducibility
)

Extended Return Format

For more detailed output, use return_details=True:

result = make_dataset_drug_response(
    n=100,
    cell_model=vc,
    return_details=True,
)

# Access extended data structure
X = result['X']                      # Feature matrix
y = result['y']                      # Target vector
features = result['features']       # Feature dataframe (initial values)
targets = result['targets']         # Target dataframe (outcome values)
timecourse = result['timecourse']   # Timecourse simulation data
metadata = result['metadata']        # Metadata about generation

Perturbation Types

Type	Description	Parameters
`uniform`	Uniform distribution	`min`, `max`
`gaussian`	Gaussian (normal) distribution	`rsd` (relative standard deviation)
`lognormal`	Log-normal distribution	`rsd`
`lhs`	Latin Hypercube Sampling	`rsd`

Kinetic Parameter Perturbation

You can also perturb kinetic parameters:

X, y = make_dataset_drug_response(
    n=100,
    cell_model=vc,
    parameter_values={'Km_J0': 100.0},      # Base parameters to perturb
    param_perturbation_type='gaussian',
    param_perturbation_params={'rsd': 0.2},
    param_seed=42,                          # Separate seed for parameters
)

Simulation Robustness

The API includes robust simulation handling with automatic resampling:

X, y = make_dataset_drug_response(
    n=100,
    cell_model=vc,
    resample_size=10,           # Number of alternative samples on failure
    max_retries=3,              # Maximum retries per failed index
    require_all_successful=False, # If False, returns partial results
)

Solver Options

Two solver backends are available:

Solver	Description	Features
`scipy`	Uses `scipy.integrate.odeint` with JIT compilation	Fast for batch simulations
`roadrunner`	Uses libRoadRunner for SBML simulation	Mature, robust

# Scipy solver (default)
X, y = make_dataset_drug_response(
    n=100, cell_model=vc, solver_type='scipy', jit=True
)

# Roadrunner solver
X, y = make_dataset_drug_response(
    n=100, cell_model=vc, solver_type='roadrunner'
)

Example: Complete Workflow

from synthetic import Builder, make_dataset_drug_response
import numpy as np

# 1. Create a virtual cell with custom network topology
vc = Builder.specify(
    degree_cascades=[3, 6, 15],   # 3 degree-1, 6 degree-2, 15 degree-3 cascades
    random_seed=42,
    feedback_density=0.5,
)

# 4. Generate dataset
X, y = make_dataset_drug_response(
    n=1000,
    cell_model=vc,
    target_specie='Oa',
    perturbation_type='gaussian',
    perturbation_params={'rsd': 0.2},
    simulation_params={'start': 0, 'end': 8000, 'points': 101},
    seed=42,
    verbose=True,
)

# 5. Use with sklearn
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestRegressor(random_state=42)
model.fit(X_train, y_train)

score = model.score(X_test, y_test)
print(f"Model R^2 score: {score:.3f}")

Network Topology

The degree_cascades parameter defines the hierarchical network structure:

# Simple network: 1 cascade each across 3 degrees
vc = Builder.specify(degree_cascades=[1, 1, 1])
# Creates: R1_1 -> I1_1, R2_1 -> I2_1, R3_1 -> I3_1, with hierarchical connections

# More complex network
vc = Builder.specify(degree_cascades=[3, 6, 15, 25])
# Creates many parallel cascades with increasing complexity per degree

Each cascade creates:

A Receptor (R) species at each degree
An Intermediate (I) species at each degree
The first degree also connects to an Outcome (O) species

Kinetic Parameter Tuning

The library includes KineticParameterTuner for generating biologically plausible kinetic parameters:

from synthetic import KineticParameterTuner

vc = Builder.specify(degree_cascades=[3, 6, 15], random_seed=42)
vc.compile(
    use_kinetic_tuner=True,
    active_percentage_range=(0.3, 0.7),  # Target 30-70% active states
    X_total_multiplier=5.0,               # Km_b = X_total * 5.0
    ki_val=100.0,                        # Constant Ki for inhibitors
    v_max_f_random_range=(5.0, 10.0),    # Forward Vmax range
)

# Access target concentrations
targets = vc.get_target_concentrations()
for species, concentration in targets.items():
    print(f"{species}: {concentration:.2f}")

Advanced Usage

Accessing Underlying Components

vc = Builder.specify(degree_cascades=[1, 2, 5])

# Access the specification
spec = vc.spec
print(spec.species_list)

# Access the model builder
model = vc.model
print(model.get_antimony_model())

# Access the tuner (if used)
tuner = vc.tuner
print(tuner.get_target_concentrations())

Direct Model Building

For more control, you can use the lower-level API:

from synthetic.Specs.DegreeInteractionSpec import DegreeInteractionSpec
from synthetic.Specs.Drug import Drug
from synthetic.utils.kinetic_tuner import KineticParameterTuner

# Create specification
spec = DegreeInteractionSpec(degree_cascades=[1, 2, 5])
spec.generate_specifications(random_seed=42, feedback_density=0.5)

# Add drug
drug = Drug(
    name="D",
    start_time=5000.0,
    default_value=0.0,
    regulation=["R1_1", "R1_2", "R1_3"],
    regulation_type=["down", "down", "down"],
)
spec.add_drug(drug, value=100.0)

# Generate model
model = spec.generate_network(
    network_name="MyNetwork",
    random_seed=42,
)
model.precompile()

# Apply kinetic tuning
tuner = KineticParameterTuner(model, random_seed=42)
updated_params = tuner.generate_parameters(
    active_percentage_range=(0.3, 0.7),
    X_total_multiplier=5.0,
    ki_val=100.0,
)
for param_name, value in updated_params.items():
    model.set_parameter(param_name, value)

Development

Running Tests

pytest

Run specific tests:

pytest tests/test_api.py::TestVirtualCell::test_compile_with_kinetic_tuner

Project details

Release history Release notifications | RSS feed

This version

0.1.3

Mar 30, 2026

0.1.2

Mar 30, 2026

0.1.1

Mar 26, 2026

0.1.0

Mar 26, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

synthetic_models-0.1.3.tar.gz (695.3 kB view details)

Uploaded Mar 30, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

synthetic_models-0.1.3-py3-none-any.whl (97.9 kB view details)

Uploaded Mar 30, 2026 Python 3

File details

Details for the file synthetic_models-0.1.3.tar.gz.

File metadata

Download URL: synthetic_models-0.1.3.tar.gz
Upload date: Mar 30, 2026
Size: 695.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.2 {"installer":{"name":"uv","version":"0.11.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for synthetic_models-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`bd3cce1032264998e43a678c43ddcb287ea496ecb05e75955e08a04f4325ae81`
MD5	`5ee1640999e0c229894c73013fcf7bde`
BLAKE2b-256	`2f53c6d211bb213c4b53f1a8d9143e752436cdee9fb3d44a19f8c54aaec79a6f`

See more details on using hashes here.

File details

Details for the file synthetic_models-0.1.3-py3-none-any.whl.

File metadata

Download URL: synthetic_models-0.1.3-py3-none-any.whl
Upload date: Mar 30, 2026
Size: 97.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.11.2 {"installer":{"name":"uv","version":"0.11.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for synthetic_models-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9168175e06dcf7a1164133a42e9b0b05a9b25f82861f227cebcf67a490e3cb7a`
MD5	`c584d500d207e4311e3c95fd0255732f`
BLAKE2b-256	`b7dd5752fbfb409852596919194533aa3f3f92ecec0a0e65a4abd1c91ef6264c`

See more details on using hashes here.

synthetic-models 0.1.3

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

Synthetic

Installation

Alpha Testing

Quick Start

Examples

create_dataset.py

export_model_sbml.py

detailed_examples.md

API Overview

VirtualCell

Methods

Properties

Builder

Adding Drugs

make_dataset_drug_response

Extended Return Format

Perturbation Types

Kinetic Parameter Perturbation

Simulation Robustness

Solver Options

Example: Complete Workflow

Network Topology

Kinetic Parameter Tuning

Advanced Usage

Accessing Underlying Components

Direct Model Building

Development

Running Tests

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes