Autonomous AI research and development platform powered by Claude
AI-AtlasForge
An autonomous AI research and development platform with multi-provider LLM support (Claude, Codex, Gemini). Run long-duration missions, accumulate cross-session knowledge, and build software autonomously.
What is AI-AtlasForge?
AI-AtlasForge is not a chatbot wrapper. It's an autonomous research engine that:
- Runs multi-day missions without human intervention
- Maintains mission continuity across context windows
- Accumulates knowledge that persists across sessions
- Self-corrects when drifting from objectives
- Adversarially tests its own outputs
- Supports Claude, OpenAI Codex, and Google Gemini as LLM backends
Quick Start
Prerequisites
- Python 3.10+
- Anthropic API key (get one at https://console.anthropic.com/)
- Linux environment (tested on Ubuntu 22.04+, Debian 12+)
Platform Notes:
- Windows: Use WSL2 (Windows Subsystem for Linux)
- macOS: Should work but is untested. Please report issues.
Option 1: Standard Installation
# Clone the repository
git clone https://github.com/DragonShadows1978/AI-AtlasForge.git
cd AI-AtlasForge
# Run the installer
./install.sh
# Configure your API key
export ANTHROPIC_API_KEY='your-key-here'
# Or edit config.yaml / .env
# Verify installation
./verify.sh
Option 2: One-Liner Install
curl -sSL https://raw.githubusercontent.com/DragonShadows1978/AI-AtlasForge/main/quick_install.sh | bash
Option 3: Docker Installation
git clone https://github.com/DragonShadows1978/AI-AtlasForge.git
cd AI-AtlasForge
docker compose up -d
# Dashboard at http://localhost:5050
For detailed installation options, see INSTALL.md or QUICKSTART.md.
Running Your First Mission
- Start the Dashboard (optional, for monitoring):
    make dashboard
    # Or: python3 dashboard_v2.py
    # Access at http://localhost:5050
- Create a Mission:
  - Via Dashboard: Click "Create Mission" and enter your objectives
  - Via Sample: Run make sample-mission to load a hello-world mission
  - Via JSON: Create state/mission.json manually
- Start the Engine:
    make run
    # Or: python3 atlasforge_conductor.py --mode=rd
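For the JSON route, this README does not document the mission file schema, so the following is only a sketch. The key `problem_statement` is borrowed from the changelog; `cycle_budget` and `current_stage` are hypothetical names chosen for illustration and should be checked against the file that make sample-mission produces.

```python
import json
from pathlib import Path


def write_mission(path: Path, problem_statement: str, cycle_budget: int = 3) -> dict:
    """Write a minimal mission file and return the dict that was written.

    Every key below is an assumption for illustration, not a documented schema.
    """
    mission = {
        "problem_statement": problem_statement,  # name appears in the changelog
        "cycle_budget": cycle_budget,            # hypothetical key
        "current_stage": "PLANNING",             # hypothetical key; first lifecycle stage
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(mission, indent=2), encoding="utf-8")
    return mission
```

Calling `write_mission(Path("state/mission.json"), "Print hello world and add a test for it")` from the repository root would create the file at the location the engine is described as reading.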
Development Commands
Run make help to see all available commands:
make install # Full installation
make verify # Verify installation
make dashboard # Start dashboard
make run # Start autonomous agent
make docker # Start with Docker
make sample-mission # Load sample mission
What's New in v1.8.4
- Handoff System Overhaul - Complete rework of the conductor handoff system for improved reliability across mission cycles
- Widget Visibility Toggles - Dashboard widgets can now be hidden/shown without disabling backend services
- Dashboard Drag & Drop - Drag-and-drop widget reordering with layout presets, undo/redo, and touch support
- Context Watcher Improvements - Enhanced token tracking and handoff logic
- Systemd Auto-Start - Fixed the graphical-session.target dependency on Linux Mint; Dashboard and Tray services now auto-start on boot via default.target
What's New in v1.8.3
- Test Harness Improvements - Refactored subprocess mocking in conductor timeout tests, improved phase-aware drift validation, provider-aware ground rules caching
- Stability Fixes - Enhanced test coverage for timeout scenarios, improved error handling in stage handlers, Gemini provider integration tests
What's New in v1.8.2
- Bug Fixes - Fixed null handling in suggestion analyzer, improved storage fallback in dashboard similarity analysis
What's New in v1.8.1
- Dashboard Services Config - Added Atlas Lab service configuration to services registry
What's New in v1.8.0
- Google Gemini Support - Full provider integration with subscription-based API access. Gemini missions validated on complex codebases (custom autograd implementations). Code generation, testing, and iteration loops proven functional
- Provider-Agnostic Architecture - Three LLM backends (Claude, Codex, Gemini) running through unified orchestration with provider-specific hardening
- Enhanced Gemini Integration - Defensive API invocation, clear error parsing, subscription auth support (API key or OAuth)
- Mission Validation - Tested Gemini on Project Tensor (custom autograd) - improved code robustness and performance through multi-cycle iteration
What's New in v1.7.0
- OpenAI Codex Support - Full multi-provider support: run missions and investigations with Claude or Codex as the LLM backend. Provider-aware ground rules, prompt templates, and transcript handling
- Ground Rules Loader - Provider-aware ground rules system with overlay support for Claude/Codex/investigation modes
- Enhanced Context Watcher - Major overhaul with improved token tracking, time-based handoff, and Haiku-powered summaries
- Experiment Framework - Expanded scientific experiment orchestration with multi-hypothesis testing
- Investigation Engine - Enhanced multi-subagent investigation system with provider selection
- Dashboard Improvements - New widgets system, improved chat interface, better WebSocket handling
- Transcript Archival - New integration for automatic transcript archival
- 110 files changed, 3500+ lines added across the platform
Architecture
                +-------------------+
                |   Mission State   |
                |  (mission.json)   |
                +--------+----------+
                         |
          +--------------+--------------+
          |                             |
+---------v---------+          +--------v--------+
|    AtlasForge     |          |    Dashboard    |
| (Execution Engine)|          |  (Monitoring)   |
+---------+---------+          +-----------------+
          |
+---------v---------+         +-------------------+
|  Modular Engine   |<------->|  Context Watcher  |
|(StageOrchestrator)|         |  (Token + Time)   |
+---------+---------+         +-------------------+
          |
+---------v-------------------+
|       Stage Handlers        |
|                             |
|  PLANNING -> BUILDING ->    |
|  TESTING -> ANALYZING ->    |
|  CYCLE_END -> COMPLETE      |
+-----------------------------+
          |
+---------v-------------------+
|     Integration Manager     |
|    (Event-Driven Hooks)     |
+-----------------------------+
Mission Lifecycle
- PLANNING - Understand objectives, research codebase, create implementation plan
- BUILDING - Implement the solution
- TESTING - Validate implementation
- ANALYZING - Evaluate results, identify issues
- CYCLE_END - Generate reports, prepare continuation
- COMPLETE - Mission finished
Missions can iterate through multiple cycles until success criteria are met.
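The lifecycle above can be read as a small state machine. The sketch below is not the project's code: it assumes that the loop back to PLANNING happens at CYCLE_END, and the parameters `criteria_met` and `cycles_left` are hypothetical names for "success criteria are met" and the remaining cycle budget.

```python
from enum import Enum


class Stage(Enum):
    PLANNING = "PLANNING"
    BUILDING = "BUILDING"
    TESTING = "TESTING"
    ANALYZING = "ANALYZING"
    CYCLE_END = "CYCLE_END"
    COMPLETE = "COMPLETE"


# Linear order inside one cycle, in the order the stages are listed above.
_ORDER = [Stage.PLANNING, Stage.BUILDING, Stage.TESTING,
          Stage.ANALYZING, Stage.CYCLE_END]


def next_stage(current: Stage, criteria_met: bool, cycles_left: int) -> Stage:
    """Return the stage that follows `current`.

    Assumption: at CYCLE_END the mission either finishes (criteria met or
    no cycles left) or starts another cycle at PLANNING.
    """
    if current is Stage.COMPLETE:
        return Stage.COMPLETE
    if current is Stage.CYCLE_END:
        if criteria_met or cycles_left <= 0:
            return Stage.COMPLETE
        return Stage.PLANNING
    return _ORDER[_ORDER.index(current) + 1]
```

Keeping the transition rule in one pure function makes the cycle logic easy to test separately from whatever runs the stages.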
Core Components
atlasforge_conductor.py
Main execution loop. Spawns Claude instances, manages state, handles graceful shutdown.
af_engine/ (Modular Engine)
Plugin-based mission execution system:
- StageOrchestrator - Core workflow orchestrator (~300 lines)
- Stage Handlers - Pluggable handlers for each stage (Planning, Building, Testing, Analyzing, CycleEnd, Complete)
- IntegrationManager - Event-driven integration coordination
- PromptFactory - Template-based prompt generation
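To illustrate the plug-in idea behind these components, here is a toy orchestrator with pluggable stage handlers and transition hooks. It is a sketch under assumptions: `MiniOrchestrator`, `register`, `on_transition`, and the handler signature are hypothetical and are not the af_engine API.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[dict], str]  # receives mission state, returns next stage name


class MiniOrchestrator:
    """Toy orchestrator; all names here are hypothetical, not the af_engine API."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}
        self._hooks: List[Callable[[str, str], None]] = []

    def register(self, stage: str, handler: Handler) -> None:
        """Plug in the handler responsible for one stage."""
        self._handlers[stage] = handler

    def on_transition(self, hook: Callable[[str, str], None]) -> None:
        """Subscribe an integration hook to stage transitions."""
        self._hooks.append(hook)

    def run(self, state: dict, max_steps: int = 20) -> dict:
        """Dispatch to handlers until COMPLETE or the step limit is reached."""
        for _ in range(max_steps):
            stage = state["stage"]
            if stage == "COMPLETE":
                break
            following = self._handlers[stage](state)
            for hook in self._hooks:
                hook(stage, following)
            state["stage"] = following
        return state


def demo() -> List[Tuple[str, str]]:
    """Run a two-handler mission and return the transitions the hook saw."""
    seen: List[Tuple[str, str]] = []
    orch = MiniOrchestrator()
    orch.register("PLANNING", lambda state: "BUILDING")
    orch.register("BUILDING", lambda state: "COMPLETE")
    orch.on_transition(lambda old, new: seen.append((old, new)))
    orch.run({"stage": "PLANNING"})
    return seen
```

The orchestrator only knows stage names, so adding a stage means registering one more handler, and integrations observe transitions without the handlers knowing about them.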
Mission Queue
Queue multiple missions to run sequentially:
- Auto-start next mission when current completes
- Set cycle budgets per mission
- Priority ordering
- Dashboard integration for queue management
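A queue with priority ordering and per-mission cycle budgets could be modelled with the standard library's `heapq`. This is a minimal sketch, not the project's queue: the class and field names are hypothetical, and it assumes that a lower priority number runs first, with ties broken by insertion order.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(order=True)
class QueuedMission:
    priority: int                      # assumption: lower number runs first
    seq: int                           # tie-breaker: insertion order
    name: str = field(compare=False)
    cycle_budget: int = field(compare=False, default=3)


class MissionQueue:
    def __init__(self) -> None:
        self._heap: List[QueuedMission] = []
        self._counter = itertools.count()

    def add(self, name: str, priority: int = 10, cycle_budget: int = 3) -> None:
        """Enqueue a mission with its priority and cycle budget."""
        item = QueuedMission(priority, next(self._counter), name, cycle_budget)
        heapq.heappush(self._heap, item)

    def pop_next(self) -> Optional[QueuedMission]:
        """Return the next mission to start, or None when the queue is empty."""
        return heapq.heappop(self._heap) if self._heap else None
```

An engine loop would call `pop_next()` each time the current mission reaches COMPLETE, which gives the auto-start behaviour described above.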
Context Watcher
Real-time context monitoring to prevent timeout waste:
- Token-based detection: Monitors JSONL transcripts for context exhaustion (130K/140K thresholds)
- Time-based detection: Proactive handoff at the 55-minute mark, ahead of the 1-hour timeout
- Haiku-powered summaries: Generates intelligent HANDOFF.md via Claude Haiku
- Automatic recovery: Sessions continue from HANDOFF.md on restart
See context_watcher/README.md for detailed documentation.
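The two detection rules can be combined into a single decision function. The sketch below is hypothetical: the README gives the 130K/140K figures without saying what each one triggers, so treating 130K as a warning level and 140K as the handoff level is an assumption, as are all names used here.

```python
from enum import Enum

WARN_TOKENS = 130_000      # assumed role of the lower threshold
HANDOFF_TOKENS = 140_000   # assumed role of the upper threshold
HANDOFF_SECONDS = 55 * 60  # hand off at 55 minutes, before the 1-hour timeout


class Action(Enum):
    CONTINUE = "continue"
    WARN = "warn"
    HANDOFF = "handoff"


def decide(tokens_used: int, elapsed_seconds: float) -> Action:
    """Pick an action from token usage and session age."""
    if tokens_used >= HANDOFF_TOKENS or elapsed_seconds >= HANDOFF_SECONDS:
        return Action.HANDOFF
    if tokens_used >= WARN_TOKENS:
        return Action.WARN
    return Action.CONTINUE
```

Either limit alone is enough to trigger a handoff, so a session that is light on tokens but close to the timeout still gets its HANDOFF.md written in time.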
dashboard_v2.py
Web-based monitoring interface showing mission status, knowledge base, and analytics.
Knowledge Base
SQLite database accumulating learnings across all missions:
- Techniques discovered
- Insights gained
- Gotchas encountered
- Reusable code patterns
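As an illustration of how such a store might look, here is a small sketch using Python's built-in `sqlite3` module. The table name, columns, and the four `kind` values are assumptions derived from the list above, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per learning, tagged with one of four kinds.
SCHEMA = """
CREATE TABLE IF NOT EXISTS learnings (
    id      INTEGER PRIMARY KEY,
    mission TEXT NOT NULL,
    kind    TEXT NOT NULL
            CHECK (kind IN ('technique', 'insight', 'gotcha', 'pattern')),
    body    TEXT NOT NULL
)
"""


def open_kb(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the knowledge base; ':memory:' keeps it in RAM."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_learning(conn: sqlite3.Connection, mission: str, kind: str, body: str) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO learnings (mission, kind, body) VALUES (?, ?, ?)",
            (mission, kind, body),
        )


def search(conn: sqlite3.Connection, term: str) -> list:
    """Return (kind, body) rows whose body contains `term`, oldest first."""
    rows = conn.execute(
        "SELECT kind, body FROM learnings WHERE body LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

Passing a file path instead of `:memory:` is what makes the learnings persist across missions and sessions.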
Adversarial Testing
Separate Claude instances that test implementations:
- RedTeam agents with no implementation knowledge
- Mutation testing
- Property-based testing
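Mutation testing, one of the techniques listed above, can be shown in miniature with the standard `ast` module: change an operator in the code under test and check that the test suite notices. This is a generic sketch of the technique, not the adversarial_testing implementation; the function under test and the single mutation operator are invented for the example.

```python
import ast

SOURCE = "def add(a, b):\n    return a + b\n"


class AddToSub(ast.NodeTransformer):
    """Mutation operator: replace every `+` with `-`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


def load(tree: ast.Module):
    """Compile a module AST and return the `add` function it defines."""
    namespace = {}
    exec(compile(tree, "<sketch>", "exec"), namespace)
    return namespace["add"]


def suite_passes(add) -> bool:
    """A tiny test suite; True when every check passes."""
    return add(2, 3) == 5 and add(0, 0) == 0


def mutant_killed() -> bool:
    """True when the suite passes on the original and fails on the mutant."""
    original = load(ast.parse(SOURCE))
    mutant_tree = ast.fix_missing_locations(AddToSub().visit(ast.parse(SOURCE)))
    mutant = load(mutant_tree)
    return suite_passes(original) and not suite_passes(mutant)
```

A mutant that survives points at a gap in the tests. Here the check `add(0, 0) == 0` alone would let the mutant through, and it is `add(2, 3) == 5` that kills it.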
GlassBox
Post-mission introspection system:
- Transcript parsing
- Agent hierarchy reconstruction
- Stage timeline visualization
Key Features
Display Layer (Windows)
Visual environment for graphical application testing:
- Screenshot capture from virtual display
- Web-accessible display via noVNC (localhost:6080)
- Web terminal via ttyd (localhost:7681)
- Browser support for OAuth flows and web testing
- Automatic GPU detection with software fallback
See docs/DISPLAY_LAYER.md for the user guide.
Mission Continuity
Missions survive context window limits through:
- Persistent mission.json state
- Cycle-based iteration
- Continuation prompts that preserve context
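One common way to make persisted state survive a crash mid-write is to write to a temporary file and rename it over the target. The sketch below shows that pattern together with a continuation prompt built from the saved state. It is an illustration only: the keys `cycle` and `stage` and the prompt wording are hypothetical, and the README does not say how AI-AtlasForge writes mission.json.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Write state atomically: temp file in the same directory, then rename."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(state, fh, indent=2)
    os.replace(tmp, path)  # readers see the old file or the new one, never half


def load_state(path: Path) -> dict:
    return json.loads(path.read_text(encoding="utf-8"))


def continuation_prompt(state: dict) -> str:
    """Build the text a fresh session would start from (hypothetical wording)."""
    return (
        f"Resume mission at cycle {state['cycle']}, stage {state['stage']}. "
        f"Objective: {state['problem_statement']}"
    )
```

Because the state lives on disk, a new session with an empty context window can reload it and pick up at the recorded cycle and stage.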
Knowledge Accumulation
Every mission adds to the knowledge base. The system improves over time as it learns patterns, gotchas, and techniques.
Autonomous Operation
Designed for unattended execution:
- Graceful crash recovery
- Stage checkpointing
- Automatic cycle progression
Directory Structure
AI-AtlasForge/
+-- atlasforge_conductor.py      # Main orchestrator
+-- af_engine/                   # Modular engine package
|   +-- orchestrator.py          # StageOrchestrator
|   +-- stages/                  # Stage handlers
|   +-- integrations/            # Event-driven integrations
+-- af_engine_legacy.py          # Legacy engine (fallback)
+-- context_watcher/             # Context monitoring module
|   +-- context_watcher.py       # Token + time-based handoff
|   +-- tests/                   # Context watcher tests
+-- dashboard_v2.py              # Web dashboard
+-- adversarial_testing/         # Testing framework
+-- atlasforge_enhancements/     # Enhancement modules
+-- workspace/                   # Active workspace
|   +-- glassbox/                # Introspection tools
|   +-- artifacts/               # Plans, reports
|   +-- research/                # Notes, findings
|   +-- tests/                   # Test scripts
+-- state/                       # Runtime state
|   +-- mission.json             # Current mission
|   +-- claude_state.json        # Execution state
+-- missions/                    # Mission workspaces
+-- atlasforge_data/
|   +-- knowledge_base/          # Accumulated learnings
+-- logs/                        # Execution logs
Configuration
AI-AtlasForge uses environment variables for configuration:
| Variable | Default | Description |
|---|---|---|
| ATLASFORGE_PORT | 5050 | Dashboard port |
| ATLASFORGE_ROOT | (script directory) | Base directory |
| ATLASFORGE_DEBUG | false | Enable debug logging |
| USE_MODULAR_ENGINE | true | Use new modular engine (set to false for legacy) |
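Reading these variables from Python might look like the sketch below. The variable names and defaults come from the table; the `Settings` class, the `load_settings` helper, and the set of strings accepted as "true" are assumptions, and the current working directory stands in for the script directory.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    # Assumption: these spellings mean "true"; the project's parser may differ.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    port: int
    root: Path
    debug: bool
    use_modular_engine: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the four documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        port=int(env.get("ATLASFORGE_PORT", "5050")),
        # The documented default is the script directory; the current
        # working directory stands in for it in this sketch.
        root=Path(env.get("ATLASFORGE_ROOT", str(Path.cwd()))),
        debug=_as_bool(env.get("ATLASFORGE_DEBUG", "false")),
        use_modular_engine=_as_bool(env.get("USE_MODULAR_ENGINE", "true")),
    )
```

Accepting a mapping argument keeps the function testable without touching the real environment. For example, `load_settings({"USE_MODULAR_ENGINE": "false"})` selects the legacy engine in this sketch.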
Dashboard Features
The web dashboard provides real-time monitoring:
- Mission Status - Current stage, progress, timing
- Activity Feed - Live log of agent actions
- Knowledge Base - Search and browse learnings
- Analytics - Token usage, cost tracking
- Mission Queue - Queue and schedule missions
- GlassBox - Post-mission analysis
Philosophy
First principles only. No frameworks hiding integration failures. Every component built from scratch for full visibility.
Speed of machine, not human. Designed for autonomous operation. Check in when convenient, not when required.
Knowledge accumulates. Every mission adds to the knowledge base. The system gets better over time.
Trust but verify. Adversarial testing catches what regular testing misses. The same agent that writes code doesn't validate it.
Requirements
- Python 3.10+
- Node.js 18+ (optional, for dashboard JS modifications)
- Anthropic API key
- Linux environment (Ubuntu 22.04+, Debian 12+)
Python Dependencies
See requirements.txt or pyproject.toml for full list.
Documentation
- QUICKSTART.md - Get started in 5 minutes
- INSTALL.md - Detailed installation guide
- USAGE.md - How to use AI-AtlasForge
- ARCHITECTURE.md - System architecture
- DISPLAY_LAYER.md - Display Layer user guide (Windows)
- TROUBLESHOOTING.md - Display Layer troubleshooting
Recent Changes
v1.7.0 (2026-02-06)
- OpenAI Codex Support - Multi-provider LLM backend: run missions and investigations with Claude or Codex. Provider-aware ground rules, prompts, and transcript handling
- Ground Rules Loader - Provider-aware ground rules system with overlay support for Claude/Codex/investigation modes
- Enhanced Context Watcher - Major overhaul with improved token tracking, time-based handoff, and Haiku-powered summaries
- Experiment Framework - Expanded scientific experiment orchestration with multi-hypothesis testing
- Investigation Engine - Enhanced multi-subagent investigation system with provider selection
- Dashboard Improvements - New widgets system, improved chat interface, better WebSocket handling
- PromptFactory Enhancements - Provider-aware caching, AfterImage integration with fallback paths
- Conductor Hardening - Improved session management, singleton protocol, crash recovery
- Transcript Archival - New integration for automatic transcript archival
- Research Agent - Improved web researcher and knowledge synthesizer
- 110 files changed, 3500+ lines added across the platform
v1.6.9 (2026-02-02)
- Fixed GlassBox visualization issues
v1.6.8 (2026-02-01)
- Fixed zombie timer bug - stale session cleanup now stops timer threads
- Fixed continuation prompt bug - cycle progression now updates problem_statement
- Added conductor singleton with takeover protocol (prevents multiple instances)
v1.6.7 (2026-02-01)
- Fixed JSON response parsing bug in conductor (handles markdown code blocks)
- ContextWatcher stability improvements
v1.6.5 (2026-01-31)
- Build checkpoint improvements
- Mission state persistence fixes
License
MIT License - see LICENSE for details.
Contributing
Contributions are welcome! Please feel free to submit issues and pull requests.
Related Projects
- AI-AfterImage - Episodic memory for AI coding agents. Gives Claude Code persistent memory of code it has written across sessions. Works great with AtlasForge for cross-mission code recall.
Acknowledgments
Built on Claude by Anthropic. Special thanks to the Claude Code team for making autonomous AI development possible.