Open LangGraph-based framework for multi-agent research hypothesis generation, adapted from Google Research's AI Co-Scientist.
Open Coscientist
AI-powered research hypothesis generation using LangGraph
Open Coscientist is an open adaptation of the system described in Google Research's AI Co-Scientist paper. It implements the paper's multi-agent architecture to generate, review, rank, and evolve research hypotheses. A LangGraph workflow orchestrates 8-10 specialized AI agents, with the aim of producing novel hypotheses grounded in scientific literature.
Demo
In this demo we use Open Coscientist to generate hypotheses for novel approaches to early detection of Alzheimer's disease. Click to watch the full demo on YouTube.
Standalone operation
The engine works with any LiteLLM-supported LLM and can run without external data sources.
For higher-quality hypothesis generation, the system integrates with an MCP server to perform literature-aware reasoning over published research. See MCP Integration for setup and configuration details, including how to run the basic reference MCP server.
Quick Start
Installation
pip install open-coscientist
Set your API key (any LiteLLM-supported provider):
export GEMINI_API_KEY="your-key-here"
# or: export ANTHROPIC_API_KEY="your-key-here"
# or: export OPENAI_API_KEY="your-key-here"
For development, see CONTRIBUTING.md.
Note: for any literature review to run, you must provide an MCP server with literature review tools. You can use the provided reference MCP server implementation; otherwise, no published research will be used.
Model Support: Uses LiteLLM, which supports 100+ LLM providers (OpenAI, Anthropic, Google, Azure, AWS Bedrock, Cohere, etc.). With less powerful models, you may need to adjust token limits and other parameters in constants.py, such as the initial hypotheses count.
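Because the model is selected by a LiteLLM-style `provider/model` name, it can help to check that the matching API key is set before starting a run. The sketch below is a minimal preflight check, not part of open_coscientist: the `PROVIDER_ENV_VARS` mapping and both helper functions are hypothetical and cover only the three environment variables shown in the installation step.

```python
import os
from typing import Mapping, Optional

# Hypothetical mapping (not part of open_coscientist): provider prefix of a
# "provider/model" name -> the environment variable shown in the install step.
PROVIDER_ENV_VARS = {
    "gemini": "GEMINI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def required_env_var(model_name: str) -> Optional[str]:
    """Return the env var expected for a 'provider/model' name, if known."""
    provider, sep, _ = model_name.partition("/")
    if not sep:
        return None  # no provider prefix, so nothing can be inferred
    return PROVIDER_ENV_VARS.get(provider)


def check_api_key(model_name: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """True if the provider's key is set, or if the provider is unknown."""
    env = os.environ if env is None else env
    var = required_env_var(model_name)
    return var is None or bool(env.get(var))
```

Calling `check_api_key("gemini/gemini-2.5-flash")` before constructing the generator gives an early, readable failure instead of an authentication error partway through a workflow. Model names without a provider prefix are treated as unknown and pass the check.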
Basic Usage
import asyncio

from open_coscientist import HypothesisGenerator


async def main():
    generator = HypothesisGenerator(
        model_name="gemini/gemini-2.5-flash",  # default model if not provided
        max_iterations=1,
        initial_hypotheses_count=5,
        evolution_max_count=3,
    )

    async for node_name, state in generator.generate_hypotheses(
        research_goal="Your research question",
        stream=True,
    ):
        print(f"Completed: {node_name}")
        if node_name == "generate":
            print(f"Generated {len(state['hypotheses'])} hypotheses")


if __name__ == "__main__":
    asyncio.run(main())
See examples/run.py for a full example CLI script with a built-in Console Reporter. Remember that the literature review MCP server must be running for any literature review to be included in hypothesis generation.
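When streaming, it is often convenient to keep the most recent state emitted by each node so it can be inspected after the run. The sketch below assumes only what the basic usage shows, namely that the stream yields `(node_name, state)` pairs and that the state is dict-like with a `hypotheses` key. `collect_node_states` and `fake_stream` are hypothetical helpers; the fake stream stands in for `generator.generate_hypotheses(..., stream=True)` so the example runs without an API key.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, Tuple


async def collect_node_states(
    stream: AsyncIterator[Tuple[str, Dict[str, Any]]],
) -> Dict[str, Dict[str, Any]]:
    """Consume a (node_name, state) stream and keep the latest state per node."""
    latest: Dict[str, Dict[str, Any]] = {}
    async for node_name, state in stream:
        latest[node_name] = state
    return latest


async def fake_stream():
    """Stand-in for the real stream; node names other than 'generate' are assumed."""
    yield "supervisor", {"hypotheses": []}
    yield "generate", {"hypotheses": ["h1", "h2"]}


async def demo() -> int:
    states = await collect_node_states(fake_stream())
    return len(states["generate"]["hypotheses"])


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With the real generator, the same helper would be called as `await collect_node_states(generator.generate_hypotheses(research_goal=..., stream=True))`, assuming the stream behaves as in the basic usage example.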
Features
- Multi-agent workflow: Supervisor, Generator, Reviewer, Ranker, Tournament Judge, Meta-Reviewer, Evolution, Proximity Deduplication
- Literature review integration: Optional MCP server provides access to real published research
- Real-time streaming: Stream results as they're generated
- Intelligent caching: Faster development iteration with LLM response caching
- Elo-based tournament: Pairwise hypothesis comparison with Elo ratings
- Iterative refinement: Evolves top hypotheses while preserving diversity
The workflow automatically detects MCP availability and adjusts accordingly.
A functional reference MCP server is included in the mcp_server/ directory.
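To make the Elo-based tournament feature concrete, here is a minimal sketch of the standard Elo update applied to random pairwise matchups. It illustrates the general technique and is not the project's implementation: the K-factor of 32, the starting rating of 1200, and the `judge` callback (which would be an LLM comparison in practice) are all assumptions.

```python
import random
from itertools import combinations

K_FACTOR = 32           # assumed update step; the project's constant may differ
INITIAL_RATING = 1200.0  # assumed starting rating


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, a_won, k=K_FACTOR):
    """Return the new (A, B) ratings after one match."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


def run_tournament(ids, judge, rounds, seed=0):
    """Play `rounds` random pairwise matches; judge(a, b) is True if a wins."""
    rng = random.Random(seed)
    ratings = {h: INITIAL_RATING for h in ids}
    pairs = list(combinations(ids, 2))
    if not pairs:
        return ratings  # fewer than two hypotheses: nothing to compare
    for _ in range(rounds):
        a, b = rng.choice(pairs)
        ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], judge(a, b))
    return ratings
```

Each update moves the two ratings by equal and opposite amounts, so the total rating in the pool stays constant and only the relative ordering changes.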
Documentation
- Architecture - Workflow diagram, node descriptions, state management
- MCP Integration - Literature review setup and configuration
- Generation Modes - Three generate node modes explained, and parameters to enable them
- Configuration - All parameters, caching, performance tuning
- Logging - File logging, rotating logs, log levels
- Development - Contributing, node structure, testing
Node Descriptions
| Node | Purpose | Key Operations |
|---|---|---|
| Supervisor | Research planning | Analyzes research goal, identifies key areas, creates workflow strategy |
| Literature Review (Recommended) | Academic literature search | Queries databases (PubMed, Google Scholar), retrieves and analyzes real published papers (requires MCP server; without it, uses only LLM's latent knowledge) |
| Generate | Hypothesis creation | Generates N initial hypotheses using LLM with high temperature for diversity |
| Reflection (Recommended) | Literature comparison | Analyzes hypotheses against literature review findings, identifies novel contributions and validates against real research (requires literature review) |
| Review | Adaptive evaluation | Reviews hypotheses across 6 criteria using adaptive strategy (comparative batch for ≤5, parallel for >5) |
| Rank | Holistic ranking | LLM ranks all hypotheses considering composite scores and review feedback |
| Tournament | Pairwise comparison | Runs Elo tournament with random pairwise matchups, updates ratings |
| Meta-Review | Insight synthesis | Analyzes all reviews to identify common strengths, weaknesses, and strategic directions |
| Evolve | Hypothesis refinement | Refines top-k hypotheses with context awareness to preserve diversity |
| Proximity | Deduplication | Clusters similar hypotheses and removes high-similarity duplicates |
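As an illustration of the Proximity step, the sketch below removes near-duplicate hypotheses with a greedy pass over a similarity threshold. It is a stand-in, not the project's method: token-level Jaccard similarity replaces whatever similarity measure the project actually uses, the 0.8 threshold is an assumption, and the input is assumed to be ordered best-first so that the higher-ranked hypothesis of each duplicate pair survives.

```python
import re


def tokens(text):
    """Lowercased alphanumeric word set of a hypothesis."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two texts' token sets, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty texts are treated as identical
    return len(ta & tb) / len(ta | tb)


def deduplicate(hypotheses, threshold=0.8):
    """Keep a hypothesis only if it is below `threshold` against all kept ones."""
    kept = []
    for candidate in hypotheses:
        if all(jaccard(candidate, k) < threshold for k in kept):
            kept.append(candidate)
    return kept
```

The greedy pass is quadratic in the number of hypotheses, which is acceptable for the small pools a single run produces (for example, the 5 initial hypotheses in the basic usage).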
Literature Review
Our MCP server reference implementation is meant to provide a template for how to integrate literature review with Open Coscientist. It is by no means extensive and currently only supports PubMed. See MCP Integration for more on how to extend this reference implementation to meet your needs.
Attribution
Open Coscientist is a source-available implementation inspired by Google Research's AI Co-Scientist. While Google's original system is closed-source, this project adapts their multi-agent hypothesis generation architecture from their published research paper.
Reference:
- Blog: Accelerating scientific breakthroughs with an AI Co-Scientist
- Paper: Towards an AI co-scientist
This version provides a LangGraph-based implementation. It includes some optimizations for parallel execution, streaming support, and caching.
Citation
If you use this work, please cite both this implementation and the original Google Research paper:
@article{coscientist2025,
  title={Towards an AI co-scientist},
  author={Google Research Team},
  journal={arXiv preprint arXiv:2502.18864},
  year={2025},
  url={https://arxiv.org/abs/2502.18864}
}
File details
Details for the file open_coscientist-0.1.1.tar.gz.
File metadata
- Download URL: open_coscientist-0.1.1.tar.gz
- Upload date:
- Size: 97.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 12fb296bc4e7a7f6ff51d563046695b0fcb9464d9cd36dc600bd00cedbceeb71 |
| MD5 | f2671c83cf23a7d389dc8e295ddf6302 |
| BLAKE2b-256 | 40d4f702b6a57a78c02d905b81c8657d743b65403e094369855c9069ece234e7 |
File details
Details for the file open_coscientist-0.1.1-py3-none-any.whl.
File metadata
- Download URL: open_coscientist-0.1.1-py3-none-any.whl
- Upload date:
- Size: 122.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0cf3ac4d93682045b7c2afbe5b3a9c7237f4bf24887927d9437bd8c0e7049fac |
| MD5 | c2ecef339e8c09858475b852cc859e4a |
| BLAKE2b-256 | 5766e3ed9ed68b4662215b2282cea3680f0bb7574c20ebf43d1ad9803c242319 |