Python SDK for Lightning Rod AI-powered forecasting dataset generation

Lightning Rod Python SDK

AI-powered forecasting dataset generation platform.

Introduction

Lightning Rod helps you generate high-quality datasets by automating seed collection, question generation, and answer labeling. Whether you're building forecasting models or running supervised fine-tuning (SFT) over unstructured file sets, Lightning Rod transforms raw information into structured, ML-ready datasets.

Core Concepts

Lightning Rod's data model has five parts: Sample, Seed, Question, Label, and Dataset.

Sample

A Sample is the fundamental unit of data in Lightning Rod. Each sample contains:

  • sample_id: Unique identifier for the sample
  • seed: Optional starting point (raw data)
  • question: Optional forecasting question
  • label: Optional ground truth answer
  • meta: Dictionary for additional metadata

Seed

A Seed is your starting point: raw data that will be transformed into questions. It holds:

  • seed_text: The raw text content (e.g., news articles, reports, tweets)

Question

A Question is a forecasting question generated from seeds:

  • question_text: The forecasting question (e.g., "Will Arsenal finish above Tottenham in the 2025-26 season?")

Label

A Label represents the ground truth answer to a question:

  • label: The answer (e.g., "Yes", "No", or a numeric value)
  • label_confidence: Confidence score (0.0 to 1.0)
  • resolution_date: When the question can be resolved
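To make the relationships between these concepts concrete, here is a minimal sketch of the data model using plain Python dataclasses. The field names come from the lists above; the class definitions, the field types (for example, resolution_date as a date), and the range check are assumptions for illustration and are not the SDK's own classes.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Dict, Optional, Union


@dataclass
class Seed:
    seed_text: str  # raw text content, e.g. a news article


@dataclass
class Question:
    question_text: str  # the forecasting question


@dataclass
class Label:
    label: Union[str, float]  # "Yes", "No", or a numeric value
    label_confidence: float  # confidence score, 0.0 to 1.0
    resolution_date: Optional[date] = None  # type is an assumption

    def __post_init__(self) -> None:
        # Illustrative check only: reject confidences outside 0.0-1.0.
        if not 0.0 <= self.label_confidence <= 1.0:
            raise ValueError("label_confidence must be between 0.0 and 1.0")


@dataclass
class Sample:
    sample_id: str
    seed: Optional[Seed] = None
    question: Optional[Question] = None
    label: Optional[Label] = None
    meta: Dict[str, Any] = field(default_factory=dict)


# A sample that has a seed and a question but has not been labeled yet.
sample = Sample(
    sample_id="sample-001",
    seed=Seed(seed_text="Arsenal beat Tottenham in the north London derby."),
    question=Question(
        question_text="Will Arsenal finish above Tottenham in the 2025-26 season?"
    ),
)
```

Because seed, question, and label are all optional, the same Sample shape can represent data at any stage: seed only, seed plus question, or fully labeled.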

Dataset

A Dataset is a collection of samples stored efficiently as Parquet files. Datasets can be:

  • Downloaded for local analysis
  • Used as input to pipelines
  • Exported for model training
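For local analysis, a common first step is to keep only samples whose labels are reasonably confident. The sketch below assumes a downloaded dataset has already been read into a list of row dictionaries (for instance with pandas.read_parquet(path).to_dict("records")) and that label_confidence appears as a top-level column; the actual column layout of the Parquet files is not documented here, and confident_rows is a hypothetical helper, not part of the SDK.

```python
from typing import Any, Dict, Iterable, List


def confident_rows(
    rows: Iterable[Dict[str, Any]], min_confidence: float = 0.5
) -> List[Dict[str, Any]]:
    """Keep rows whose label_confidence is present and >= min_confidence."""
    kept = []
    for row in rows:
        confidence = row.get("label_confidence")
        if confidence is not None and confidence >= min_confidence:
            kept.append(row)
    return kept


# Hypothetical rows standing in for the contents of a downloaded dataset.
rows = [
    {"sample_id": "a", "label": "Yes", "label_confidence": 0.92},
    {"sample_id": "b", "label": "No", "label_confidence": 0.31},
    {"sample_id": "c", "label": None, "label_confidence": None},
]

training_rows = confident_rows(rows, min_confidence=0.5)
```

Unlabeled rows are dropped along with low-confidence ones, which is usually what you want before exporting for model training.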

Installation

pip install lightningrod-ai

Quick Start

Authentication

First, get your API key from lightningrod.ai and initialize the client:

from lightningrod import LightningRodClient

client = LightningRodClient(api_key="your-api-key-here")
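In practice you may prefer not to hardcode the key in source files. The sketch below reads it from an environment variable instead; the variable name LIGHTNINGROD_API_KEY and the load_api_key helper are assumptions for illustration, not something the SDK defines.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    var: str = "LIGHTNINGROD_API_KEY",  # hypothetical variable name
) -> str:
    """Return the API key from the environment, or fail with a clear message."""
    env = os.environ if env is None else env
    key = env.get(var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var} environment variable to your Lightning Rod API key."
        )
    return key


# Usage with the client constructor shown above:
#   from lightningrod import LightningRodClient
#   client = LightningRodClient(api_key=load_api_key())
```

Failing early with an explicit message makes a missing key easier to diagnose than an authentication error later in a pipeline run.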

Question Generation Pipeline

Create a complete question generation pipeline that takes seeds, generates questions, and labels them:

from datetime import datetime, timedelta
from lightningrod import LightningRodClient
from lightningrod.pipelines import QuestionGenerationPipeline
from lightningrod.transforms import (
    NewsSeedGenerator,
    AIQuestionGenerator,
    QuestionFilter,
    WebSearchLabeler
)

client = LightningRodClient(api_key="your-api-key-here")

# Use a single reference time so both ends of the 90-day window agree.
now = datetime.now()

pipeline = QuestionGenerationPipeline(
    # Step 1: collect seeds from news published in the last 90 days.
    seed_generator=NewsSeedGenerator(
        start_date=now - timedelta(days=90),
        end_date=now,
        search_query="Premier League Soccer",
    ),
    # Step 2: generate forecasting questions from each seed.
    question_generator=AIQuestionGenerator(
        instructions="Write forward-looking, self-contained questions with explicit dates/entities.",
        examples=[
            "Who will win Manchester City vs Liverpool on Dec 18, 2025?",
            "Will Arsenal finish above Tottenham in the 2025-26 season?",
        ],
        bad_examples=["Who won the match?"],
        filter=QuestionFilter(threshold_score=5),
    ),
    # Step 3: label the questions using web search.
    labeler=WebSearchLabeler(
        confidence_threshold=0.5,
    ),
)

# Run the pipeline; no existing dataset is passed in as input.
dataset = client.run(pipeline, dataset=None)

This pipeline will:

  1. Collect Seeds: Search for recent news about Premier League Soccer
  2. Generate Questions: Use AI to create forecasting questions from the news
  3. Label Questions: Automatically find answers using web search
  4. Return Dataset: Get a dataset with all samples ready for download
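The flow of the steps above can be sketched in plain Python. This is a conceptual illustration only, not the SDK's implementation: run_stages, the stand-in make_questions and find_label functions, and the choice to drop labels below the confidence threshold are all assumptions.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Answer = Optional[Tuple[str, float]]  # (label, confidence), or None if unresolved


def run_stages(
    seeds: Iterable[str],
    make_questions: Callable[[str], List[str]],
    find_label: Callable[[str], Answer],
    min_confidence: float = 0.5,
) -> List[Dict[str, object]]:
    """Turn seeds into questions, label them, and keep confident results."""
    samples = []
    for seed_text in seeds:
        for question_text in make_questions(seed_text):
            answer = find_label(question_text)
            if answer is None:
                continue  # no answer found for this question
            label, confidence = answer
            if confidence < min_confidence:
                continue  # assumed behavior: low-confidence labels are dropped
            samples.append(
                {
                    "seed_text": seed_text,
                    "question_text": question_text,
                    "label": label,
                    "label_confidence": confidence,
                }
            )
    return samples


# Toy stand-ins for the question generator and the labeler.
def make_questions(seed_text: str) -> List[str]:
    return [f"Will the following hold by season end: {seed_text}?"]


def find_label(question_text: str) -> Answer:
    return ("Yes", 0.8) if "Arsenal" in question_text else ("No", 0.2)


samples = run_stages(
    ["Arsenal top the table", "Everton avoid relegation"],
    make_questions,
    find_label,
)
```

With these stand-ins, only the question built from the first seed clears the 0.5 threshold, so the resulting list holds a single labeled sample.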

License

MIT License - see LICENSE file for details
