Python SDK for the Discovery Engine API

Discovery Engine Python SDK

Python client library for the Discovery Engine API.

Installation

pip install leap-discovery-client

For pandas DataFrame support:

pip install leap-discovery-client[pandas]

Quick Start

from discovery import Client

# Initialize client - automatically uses the production API
client = Client(api_key="your-api-key")

# Analyze a dataset and wait for results
result = client.analyze(
    file="data.csv",
    target_column="price",
    mode="fast",
    description="House price dataset from Kaggle",
    column_descriptions={
        "age": "Age of the house in years",
        "price": "Sale price in USD"
    },
    visibility="public",
    wait=True  # Wait for completion and return full results
)

print(f"Run ID: {result.run_id}")
print(f"Status: {result.status}")
print(f"Found {len(result.patterns)} patterns")

Features

Simple API: A single analyze() method handles the entire workflow
Complete Results: Returns everything shown in the Discovery dashboard
Pandas Support: Upload DataFrames directly with automatic column inference
Async Support: Use analyze_async() for async workflows
Polling: Pass wait=True to block until the run completes, with a configurable wait_timeout

What You Get Back

The SDK returns an AnalysisResult with everything the Discovery dashboard shows:

Summary (LLM-generated)

result.summary.overview           # High-level explanation of findings
result.summary.key_insights       # List of main takeaways
result.summary.novel_patterns     # Novel pattern explanations
result.summary.surprising_findings
result.summary.statistically_significant
result.summary.data_insights      # Important features, correlations

Patterns

for pattern in result.patterns:
    print(f"Pattern {pattern.id}: {pattern.description}")
    print(f"  Direction: {pattern.direction}")
    print(f"  Lift: {pattern.lift_value}")
    print(f"  Support: {pattern.support_count} ({pattern.support_percentage:.1%})")
    print(f"  P-value: {pattern.p_value}")
    print(f"  Type: {pattern.pattern_type} / {pattern.novelty_type}")
    print(f"  Conditions: {pattern.conditions}")
    print(f"  Citations: {len(pattern.citations)}")
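
To narrow a long pattern list, one option is to filter on p-value and sort by lift. The sketch below is not part of the SDK: top_patterns and its 0.05 default are hypothetical, and it assumes only that each pattern exposes the p_value and lift_value attributes shown above. SimpleNamespace objects stand in for real patterns so the snippet runs without an API key.

```python
from types import SimpleNamespace


def top_patterns(patterns, max_p_value=0.05, limit=5):
    """Return up to `limit` patterns with p_value <= max_p_value, highest lift first."""
    significant = [p for p in patterns if p.p_value <= max_p_value]
    return sorted(significant, key=lambda p: p.lift_value, reverse=True)[:limit]


# Stand-in objects; with the SDK you would pass result.patterns instead.
demo_patterns = [
    SimpleNamespace(id="a", p_value=0.01, lift_value=1.4),
    SimpleNamespace(id="b", p_value=0.20, lift_value=3.0),
    SimpleNamespace(id="c", p_value=0.03, lift_value=2.1),
]

for pattern in top_patterns(demo_patterns):
    print(pattern.id, pattern.lift_value)
```

Sorting by lift is only one possible ranking; target_score or support_count may suit other questions better.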

Columns with Feature Importance

for col in result.columns:
    print(f"{col.display_name}")
    print(f"  Type: {col.type} ({col.data_type})")
    print(f"  Stats: mean={col.mean}, std={col.std}, min={col.min}, max={col.max}")
    print(f"  Null %: {col.null_percentage}")
    if col.feature_importance_score is not None:  # keep scores of exactly 0.0
        print(f"  Importance: {col.feature_importance_score}")

Correlation Matrix

for entry in result.correlation_matrix:
    print(f"{entry.feature_x} <-> {entry.feature_y}: {entry.value:.3f}")
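
A flat list of entries can be reduced to the few strongest relationships. This is a rough sketch rather than SDK functionality: strongest_correlations is a hypothetical helper, and it assumes the list may contain diagonal entries and both orientations of a pair, which the source does not confirm. It relies only on the feature_x, feature_y, and value attributes used above.

```python
from types import SimpleNamespace


def strongest_correlations(entries, limit=3):
    """Return the `limit` strongest distinct feature pairs by absolute correlation."""
    best = {}
    for entry in entries:
        if entry.feature_x == entry.feature_y:
            continue  # skip the diagonal (a feature against itself)
        pair = tuple(sorted((entry.feature_x, entry.feature_y)))
        best.setdefault(pair, entry.value)  # keep one of (x, y) / (y, x)
    ranked = sorted(best.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:limit]


# Stand-in entries; with the SDK you would pass result.correlation_matrix.
demo_entries = [
    SimpleNamespace(feature_x="age", feature_y="age", value=1.0),
    SimpleNamespace(feature_x="age", feature_y="price", value=-0.62),
    SimpleNamespace(feature_x="price", feature_y="age", value=-0.62),
    SimpleNamespace(feature_x="rooms", feature_y="price", value=0.48),
]

for (left, right), value in strongest_correlations(demo_entries):
    print(f"{left} <-> {right}: {value:.3f}")
```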

Feature Importance

if result.feature_importance:
    print(f"Model type: {result.feature_importance.kind}")
    print(f"Baseline: {result.feature_importance.baseline}")
    for score in result.feature_importance.scores:
        print(f"  {score.feature}: {score.score}")
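
When only the leading features matter, the scores can be ranked first. The helper below is a hypothetical sketch, not an SDK function; it assumes the feature and score attributes shown above and ranks by absolute value in case scores can be negative, which the source does not state.

```python
from types import SimpleNamespace


def top_features(feature_importance, limit=3):
    """Return (feature, score) pairs ranked by absolute score, largest first."""
    if feature_importance is None:
        return []  # feature_importance is optional on the result
    ranked = sorted(
        feature_importance.scores, key=lambda s: abs(s.score), reverse=True
    )
    return [(s.feature, s.score) for s in ranked[:limit]]


# Stand-in object; with the SDK you would pass result.feature_importance.
demo_importance = SimpleNamespace(
    scores=[
        SimpleNamespace(feature="age", score=0.12),
        SimpleNamespace(feature="rooms", score=-0.40),
        SimpleNamespace(feature="area", score=0.31),
    ]
)

print(top_features(demo_importance, limit=2))
```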

Configuration

The client automatically uses the production API endpoint. For testing or custom deployments, you can override the URL via the DISCOVERY_API_URL environment variable:

export DISCOVERY_API_URL="https://custom-api.example.com"
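
The same override can be set from Python, for instance in a test fixture. This sketch assumes the client reads DISCOVERY_API_URL when it is constructed, so the variable should be set before creating the Client; the source does not spell out when the lookup happens. resolve_api_url and the default URL are hypothetical and only illustrate an environment lookup with a fallback, not the SDK's own logic.

```python
import os


def resolve_api_url(default_url):
    """Return DISCOVERY_API_URL if it is set and non-empty, else `default_url`."""
    return os.environ.get("DISCOVERY_API_URL") or default_url


# Set the override before constructing the client (assumed lookup time).
os.environ["DISCOVERY_API_URL"] = "https://custom-api.example.com"
print(resolve_api_url("https://default.example.com"))
```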

Configuration Options

All dashboard options are supported:

Option	Type	Default	Description
`file`	`str`, `Path`, or `DataFrame`	-	Dataset file or pandas DataFrame
`target_column`	`str`	-	Name of column to predict
`mode`	`"fast"` / `"deep"`	`"fast"`	Analysis depth
`visibility`	`"public"` / `"private"`	`"public"`	Dataset visibility
`task`	`str`	auto	`"regression"`, `"binary_classification"`, or `"multiclass_classification"`
`description`	`str`	-	Dataset description
`column_descriptions`	`Dict[str, str]`	-	Column name -> description mapping
`timeseries_groups`	`List[Dict]`	-	Timeseries column groups
`auto_train_num_trials`	`int`	1	Number of training trials
`auto_train_max_epochs`	`int`	10	Maximum training epochs
`auto_report_use_llm_evals`	`bool`	`True`	Use LLM for descriptions
`wait`	`bool`	`False`	Wait for completion
`wait_timeout`	`float`	`None`	Max seconds to wait
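
Because several options accept only a fixed set of strings, it can help to check them locally before uploading anything. The following is a sketch under that assumption: build_analyze_kwargs is a hypothetical helper, not part of the SDK, and the allowed values are copied from the table above.

```python
ALLOWED_MODES = {"fast", "deep"}
ALLOWED_VISIBILITY = {"public", "private"}
ALLOWED_TASKS = {"regression", "binary_classification", "multiclass_classification"}


def build_analyze_kwargs(target_column, mode="fast", visibility="public",
                         task=None, **extra):
    """Validate the enumerated options and return keyword arguments for analyze()."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {sorted(ALLOWED_MODES)}, got {mode!r}")
    if visibility not in ALLOWED_VISIBILITY:
        raise ValueError(f"visibility must be one of {sorted(ALLOWED_VISIBILITY)}")
    kwargs = {"target_column": target_column, "mode": mode, "visibility": visibility}
    if task is not None:  # leave task out so the service can infer it
        if task not in ALLOWED_TASKS:
            raise ValueError(f"task must be one of {sorted(ALLOWED_TASKS)}")
        kwargs["task"] = task
    kwargs.update(extra)  # pass through options such as wait or description
    return kwargs


options = build_analyze_kwargs("price", mode="deep", wait=True)
print(options)
```

The result can then be unpacked into the call, for example client.analyze(file="data.csv", **options).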

Async Usage

import asyncio

import pandas as pd
from discovery import Client

async def main():
    # Any DataFrame works here; "data.csv" is a placeholder path
    df = pd.read_csv("data.csv")

    async with Client(api_key="...") as client:
        # Start analysis without waiting
        result = await client.analyze_async(
            file=df,
            target_column="target"
        )
        print(f"Started run: {result.run_id}")

        # Later, get results
        result = await client.get_results(result.run_id)

        # Or wait for completion
        result = await client.wait_for_completion(result.run_id, timeout=600)

asyncio.run(main())
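
For custom progress reporting, a polling loop can be written by hand. The sketch below shows the general technique only and is not how wait_for_completion is implemented: poll_until_done and make_fake_fetch are hypothetical, the terminal statuses are taken from the AnalysisResult status values documented under Data Types, and a fake fetch function replaces the API so the snippet runs offline.

```python
import asyncio
from types import SimpleNamespace

TERMINAL_STATUSES = {"completed", "failed"}


async def poll_until_done(fetch_result, interval=0.01, timeout=1.0):
    """Await `fetch_result()` repeatedly until the status is terminal or time runs out."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        result = await fetch_result()
        if result.status in TERMINAL_STATUSES:
            return result
        if loop.time() >= deadline:
            raise TimeoutError(f"run still {result.status!r} after {timeout}s")
        await asyncio.sleep(interval)


def make_fake_fetch(statuses):
    """Build an async function that yields one canned status per call."""
    remaining = iter(statuses)

    async def fetch():
        return SimpleNamespace(status=next(remaining))

    return fetch


final = asyncio.run(
    poll_until_done(make_fake_fetch(["pending", "processing", "completed"]))
)
print(final.status)
```

With a real client, fetch_result could be lambda: client.get_results(run_id), with a much longer interval than the one used here.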

Step-by-Step API

For more control, use the individual methods. Each call is awaited, so run these steps inside an async function:

# 1. Upload file
file_info = await client.upload_file("data.csv")

# 2. Create dataset
dataset = await client.create_dataset(
    title="My Dataset",
    description="...",
    total_rows=1000
)

# 3. Link file to dataset
await client.create_file_record(dataset["id"], file_info)

# 4. Define columns
columns = await client.create_columns(dataset["id"], [
    {"name": "age", "display_name": "Age", "type": "continuous", ...},
    {"name": "price", "display_name": "Price", "type": "continuous", ...},
])

# 5. Start run
run = await client.create_run(
    dataset["id"],
    target_column_id=columns[1]["id"],
    task="regression",
    mode="fast"
)

# 6. Get results
result = await client.get_results(run["id"])
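
The six steps can be wrapped in one coroutine so they are easy to reuse and test. This is a sketch under several assumptions: run_step_by_step and FakeClient are hypothetical, the fake's return values are invented purely so the example runs offline, and the column definitions are passed in by the caller because their full schema is not shown here.

```python
import asyncio


async def run_step_by_step(client, path, title, description, total_rows,
                           column_specs, target_index, task, mode="fast"):
    """Run the upload -> dataset -> columns -> run -> results sequence in order."""
    file_info = await client.upload_file(path)
    dataset = await client.create_dataset(
        title=title, description=description, total_rows=total_rows
    )
    await client.create_file_record(dataset["id"], file_info)
    columns = await client.create_columns(dataset["id"], column_specs)
    run = await client.create_run(
        dataset["id"],
        target_column_id=columns[target_index]["id"],
        task=task,
        mode=mode,
    )
    return await client.get_results(run["id"])


class FakeClient:
    """Hypothetical stand-in that records calls instead of contacting the API."""

    def __init__(self):
        self.calls = []

    async def upload_file(self, path):
        self.calls.append("upload_file")
        return {"path": path}

    async def create_dataset(self, **kwargs):
        self.calls.append("create_dataset")
        return {"id": "ds-1"}

    async def create_file_record(self, dataset_id, file_info):
        self.calls.append("create_file_record")

    async def create_columns(self, dataset_id, specs):
        self.calls.append("create_columns")
        return [{"id": f"col-{i}", **spec} for i, spec in enumerate(specs)]

    async def create_run(self, dataset_id, target_column_id, task, mode):
        self.calls.append("create_run")
        return {"id": "run-1", "target_column_id": target_column_id}

    async def get_results(self, run_id):
        self.calls.append("get_results")
        return {"run_id": run_id}


fake = FakeClient()
outcome = asyncio.run(run_step_by_step(
    fake, "data.csv", "My Dataset", "demo", 1000,
    [{"name": "age"}, {"name": "price"}], target_index=1, task="regression",
))
print(fake.calls, outcome)
```

Against the real API, a run that was just created is unlikely to be finished when get_results is first called, so the final step would normally be followed by polling or wait_for_completion.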

Data Types

AnalysisResult

@dataclass
class AnalysisResult:
    run_id: str
    report_id: Optional[str]
    status: str  # "pending", "processing", "completed", "failed"
    
    # Dataset metadata
    dataset_title: Optional[str]
    dataset_description: Optional[str]
    total_rows: Optional[int]
    target_column: Optional[str]
    task: Optional[str]
    
    # Results
    summary: Optional[Summary]
    patterns: List[Pattern]
    columns: List[Column]
    correlation_matrix: List[CorrelationEntry]
    feature_importance: Optional[FeatureImportance]
    
    # Job tracking
    job_id: Optional[str]
    job_status: Optional[str]
    error_message: Optional[str]
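
Since status can take several values and most result fields are optional, callers will usually branch on the status before reading anything else. A minimal sketch, assuming only the fields listed above; describe_outcome is a hypothetical helper and SimpleNamespace stands in for a real AnalysisResult.

```python
from types import SimpleNamespace


def describe_outcome(result):
    """Summarize a result in one line, branching on its status."""
    if result.status == "completed":
        return f"run {result.run_id} completed with {len(result.patterns)} patterns"
    if result.status == "failed":
        reason = result.error_message or "no error message"
        return f"run {result.run_id} failed: {reason}"
    return f"run {result.run_id} is still {result.status}"


demo_failed = SimpleNamespace(
    run_id="r1", status="failed", patterns=[], error_message="bad target column"
)
print(describe_outcome(demo_failed))
```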

Pattern

@dataclass
class Pattern:
    id: str
    task: str
    target_column: str
    direction: str  # "min" or "max"
    p_value: float
    conditions: List[Dict]  # Continuous, categorical, or datetime conditions
    lift_value: float
    support_count: int
    support_percentage: float
    pattern_type: str  # "validated" or "speculative"
    novelty_type: str  # "novel" or "confirmatory"
    target_score: float
    description: str
    novelty_explanation: str
    target_class: Optional[str]
    target_mean: Optional[float]
    target_std: Optional[float]
    citations: List[Dict]
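
The pattern_type and novelty_type fields make it straightforward to see how findings are distributed. The sketch below tallies patterns by both fields; count_by_kind is a hypothetical helper, and the stand-in objects carry only the two attributes it reads.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_kind(patterns):
    """Count patterns per (pattern_type, novelty_type) combination."""
    return Counter((p.pattern_type, p.novelty_type) for p in patterns)


# Stand-in objects; with the SDK you would pass result.patterns.
demo_kinds = [
    SimpleNamespace(pattern_type="validated", novelty_type="novel"),
    SimpleNamespace(pattern_type="validated", novelty_type="novel"),
    SimpleNamespace(pattern_type="speculative", novelty_type="confirmatory"),
]

for (pattern_type, novelty_type), count in count_by_kind(demo_kinds).items():
    print(f"{pattern_type} / {novelty_type}: {count}")
```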

Column

@dataclass
class Column:
    id: str
    name: str
    display_name: str
    type: str  # "continuous" or "categorical"
    data_type: str  # "int", "float", "string", "boolean", "datetime"
    enabled: bool
    description: Optional[str]
    
    # Statistics
    mean: Optional[float]
    median: Optional[float]
    std: Optional[float]
    min: Optional[float]
    max: Optional[float]
    iqr_min: Optional[float]
    iqr_max: Optional[float]
    mode: Optional[str]
    approx_unique: Optional[int]
    null_percentage: Optional[float]
    
    # Feature importance
    feature_importance_score: Optional[float]
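
Most Column statistics are optional, so code that reads them should handle None explicitly. As a sketch, the hypothetical helper below lists columns whose null_percentage exceeds a caller-chosen threshold. Whether null_percentage is expressed on a 0 to 1 or a 0 to 100 scale is not stated here; the demo values assume 0 to 100.

```python
from types import SimpleNamespace


def sparse_columns(columns, threshold):
    """Return names of columns whose null_percentage is known and above `threshold`."""
    return [
        c.name
        for c in columns
        if c.null_percentage is not None and c.null_percentage > threshold
    ]


# Stand-in objects; with the SDK you would pass result.columns.
demo_columns = [
    SimpleNamespace(name="age", null_percentage=35.0),
    SimpleNamespace(name="price", null_percentage=0.0),
    SimpleNamespace(name="notes", null_percentage=None),
]

print(sparse_columns(demo_columns, threshold=20.0))
```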


This version

0.1.0

Jan 21, 2026

Download files

Source Distribution

leap_discovery_client-0.1.0.tar.gz (17.0 kB view details)

Uploaded Jan 21, 2026 Source

Built Distribution

leap_discovery_client-0.1.0-py3-none-any.whl (12.7 kB view details)

Uploaded Jan 21, 2026 Python 3

File details

Details for the file leap_discovery_client-0.1.0.tar.gz.

File metadata

Download URL: leap_discovery_client-0.1.0.tar.gz
Upload date: Jan 21, 2026
Size: 17.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for leap_discovery_client-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`49676435463885ff003e36de9129400d6d198a3955608f236010c5edfd1a5876`
MD5	`1e0c32925f3e3064b81653a6a7454715`
BLAKE2b-256	`3cd989a9a4935a16e7dd9c0a232217d5bb21791c1439d47918c3ba7d152c2153`

File details

Details for the file leap_discovery_client-0.1.0-py3-none-any.whl.

File metadata

Download URL: leap_discovery_client-0.1.0-py3-none-any.whl
Upload date: Jan 21, 2026
Size: 12.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for leap_discovery_client-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b5673df494ed74a7d7c78e6fa70fbf72ca97ba3a1d8f679d012c32fc3dbef440`
MD5	`6339371b6a9258b17fbf00d0f2458c25`
BLAKE2b-256	`d435b89b9ea90b9160178d9f575a3bc974ffc102ec2d9ecd30524b70ab7f662d`
