Skip to main content

Personal AGI Operating Core

Project description

๐ŸŒŒ MONAD

Personal AGI Operating Core

๐Ÿ‡จ๐Ÿ‡ณ ็ฎ€ไฝ“ไธญๆ–‡ (Chinese) โ€ข How It Works โ€ข Installation โ€ข Architecture

Python OpenAI API ReAct Agent


MONAD is not a chatbot or a simple tool-matcher. It is a self-learning, objective-driven autonomous rational agent core.

Unlike traditional agents that rely on a predefined, hardcoded set of tools, MONAD acts like a rational entity with basic "instincts". It has no memory and no pre-loaded knowledge of how to perform specific tasks (like checking the weather or searching the web).

Instead, it autonomously learns how to complete your tasks by writing and executing Python code on the fly, and then saving those successful experiences as reusable skills.


๐Ÿง  Core Philosophy

  • File System as Database: The system itself has no memory of past sessions. It persists all learned information (axioms, environment knowledge, learned skills, user context, and experiences) directly to local Markdown files. No vector databases, no RAG, zero external dependencies.
  • Absolute Rationality: MONAD follows a strict reasoning loop (Analyze โ†’ Self-check โ†’ Learn โ†’ Execute โ†’ Reflect) to accomplish goals logically.
  • Self-Learning & Self-Evolving: Instead of shipping with 100 tools, MONAD ships with only 5 basic instincts (hands ๐Ÿคฒ, voice ๐Ÿ—ฃ๏ธ, eyes ๐Ÿ‘๏ธ, dialogue ๐Ÿ’ฌ, screen ๐Ÿ–ฅ๏ธ) plus a growing library of built-in skills. It learns everything else by generating code.
  • LLM as a Command Executor: The LLM's own training data is disregarded. All factual information must be retrieved from the real world via code execution or web perception.
  • Stateless Message Management: Every user request starts with a fresh, clean message context. MONAD doesn't rely on LLM Chat History; instead, it persists vital information via reflection loops. This ensures reasoning purity and prevents hallucination buildup from long conversations.
  • Search First, Ask Later: When stuck during execution (errors, missing packages, unfamiliar tools), MONAD's first instinct is to search the web via web_fetch, never to guess. But if the user's intent is unclear, MONAD asks the user first. In short: unclear query โ†’ ask user; execution problem โ†’ search first.
  • URL-First Principle: When the user provides a specific URL or domain (e.g., "Analyze kexue.fm"), MONAD must directly access that URL first, not detour through a search engine. Search engines are a fallback, not the default when a target is already known.
  • Experience Staging & Hygiene: Experiences don't go directly into long-term memory. New experiences first land in a staging area (pending.jsonl). Only when the same tag pattern recurs โ‰ฅ3 times is the best example promoted to the permanent experience file โ€” a frequency-based deduplication inspired by how humans consolidate short-term memory into long-term memory. Failed experiences are tagged [FAILED] and never promoted, preventing "experience pollution."
  • Tag-Based Experience Retrieval: Experiences are tagged during reflection. At reasoning time, MONAD scores each experience by relevance ร— 2 + recency (not semantic embeddings, just keyword overlap + timestamp), picks the top entries, and always includes the 3 most recent as fallback. Simple, fast, zero infrastructure.
  • Anti-Hallucination Verification: LLMs sometimes claim "I created the skill" without actually writing any files. MONAD defends against this at two levels: (1) Post-Action Verification โ€” after actions that should create files, the system checks the filesystem and appends verification results to the LLM's observation; (2) Hollow Answer Guard โ€” if the LLM tries to deliver a final answer claiming creation/saving but never executed a write action, the answer is rejected and the LLM is forced to actually do the work.
  • Skill Deduplication (Reuse First): Before creating a new skill, the system prompts the LLM to check existing skills and prefer modifying them. The SkillBuilder module independently evaluates all existing skills and supports three actions: skip, update (preferred), or create โ€” preventing the skill library from growing duplicate entries.

โšก Basic Capabilities ("Instincts")

MONAD comes with five built-in capabilities:

Capability Metaphor Description
๐Ÿ python_exec Hands ๐Ÿคฒ Evaluate arbitrary Python code. Process data, call APIs, read/write files, install librariesโ€”learn to do anything.
๐Ÿ’ป shell Voice ๐Ÿ—ฃ๏ธ Execute shell commands on the host operating system.
๐Ÿ‘๏ธ web_fetch Eyes ๐Ÿ‘๏ธ Perceive the internet directly. Fetch web pages with 3 modes: fast (HTTP), stealth (anti-bot), browser (JS render). Powered by Scrapling.
๐Ÿ™‹ ask_user Dialogue ๐Ÿ’ฌ Ask the user for clarification when it truly cannot proceed independently.
๐Ÿ–ฅ๏ธ desktop_control Screen ๐Ÿ–ฅ๏ธ Control any desktop application via screenshot + OCR + keyboard/mouse. Cross-platform (macOS/Windows/Linux).

Note: desktop_control requires optional dependencies: pip install monad-core[desktop]


๐Ÿ“‚ Knowledge Architecture

MONAD uses Categorized Memory instead of semantic retrieval (RAG).

knowledge/
โ”œโ”€โ”€ axioms/          # System axioms & core behavioral principles
โ”œโ”€โ”€ environment/     # World knowledge (e.g., search engine URLs, API endpoints)
โ”œโ”€โ”€ user/            # Categorized user context (No RAG used here)
โ”‚   โ”œโ”€โ”€ facts.md     #   Objective facts & preferences (e.g., prefers Python)
โ”‚   โ”œโ”€โ”€ mood.md      #   Current state & mood
โ”‚   โ””โ”€โ”€ goals.md     #   Long-term goals & ongoing projects
โ”œโ”€โ”€ skills/          # Reusable Python skills (built-in + auto-generated)
โ”‚   โ””โ”€โ”€ <skill>/
โ”‚       โ”œโ”€โ”€ skill.yaml   # Metadata: name, goal, inputs, steps, triggers
โ”‚       โ””โ”€โ”€ executor.py  # Python implementation with run(**kwargs)
โ”œโ”€โ”€ experiences/     # Two-tier experience memory
โ”‚   โ”œโ”€โ”€ pending.jsonl            # Short-term: all recent experiences (staging area)
โ”‚   โ””โ”€โ”€ accumulated_experiences.md  # Long-term: promoted high-frequency patterns
โ”œโ”€โ”€ protocols/       # Error handling protocols
โ””โ”€โ”€ tools/           # Documentation for the 5 basic capabilities + built-in skills

๐Ÿ› ๏ธ Built-in Skills

Beyond the 5 core instincts, MONAD ships with a set of ready-to-use skills:

Skill Description
start_recording Start background screen recording (MKV format via ffmpeg). Non-blocking โ€” returns immediately so other tasks can run in parallel.
stop_recording Stop recording, transcode MKV โ†’ MP4 (guaranteed valid moov atom), return file path + http://localhost:8000/output/ download link.
publish_to_xhs Publish posts/articles to Xiaohongshu (RED). Supports text + image.
fetch_topic_news Fetch and summarize latest news on any topic from the web.
parse_document Parse and extract structured content from documents (PDF, Word, etc.).
web_to_markdown Convert any web page to clean Markdown.
markdown_to_knowledge_map Convert Markdown/text/URL into a visual knowledge graph (SVG/PNG) via Mermaid.

Skills are Python modules (executor.py + skill.yaml). MONAD can also auto-generate new skills from any successful task.


โš™๏ธ How It Works

When you give MONAD an objective (e.g., "What is the weather in Hangzhou today?"):

  1. Analyze & Self-Check: Understand intent and check the local knowledge base for existing skills.
  2. Learn & Research (The "Search First" Principle): If the task is unknown or an error occurs, MONAD uses web_fetch to research documentation, API usage, or solutions. This is the "Learning" phase where it acquires the "how-to" knowledge before acting.
  3. Execute & Observe: MONAD writes and executes Python code or shell commands via python_exec. It treats the output as "Observations" to verify success or identify new obstacles.
  4. Reflect & Persist: After a successful execution, the Reflection module summarizes the experience with tags. The SkillBuilder evaluates if the logic should be abstracted into a permanent skill โ€” checking existing skills first to avoid duplication.
  5. Verify & Answer: Before delivering the final answer, the system verifies that claimed actions actually happened (files exist, skills were written). The answer is based on real-world data verified through execution.

๐Ÿ’ก Deep Dive: Why Stateless?

MONAD intentionally discards traditional "Chat History" in favor of a Stateless Design, where every task starts with a clean context and persists only vital information via the file system.

  • Mitigating Hallucination: Long-running chat histories eventually lead to context pollution and attention decay. By resetting the context per task, we force the LLM to reason in a pure, noise-free environment.
  • Physical Memory: Unlike black-box model caches, MONAD's memory consists of human-readable Markdown files. This is a deliberate step towards Personal Data Sovereignty.
  • Task Atomicity: Every objective becomes an independent, reproducible unit of execution.
  • The Future of Agents: We believe the evolution of Agents will shift from "simulating conversation" to "simulating rational execution." Maintaining a living "State Whiteboard" via reflection loops is far more aligned with the essence of AGI than endlessly stacking chat logs.

๐Ÿš€ Installation

1. Install via pip (Recommended)

pip install monad-core

Optional extras:

pip install monad-core[desktop]   # Desktop control (screenshot + OCR + keyboard/mouse)
pip install monad-core[feishu]    # Feishu (Lark) bot integration
pip install monad-core[all]       # Everything

Or install from source:

git clone https://github.com/hscspring/Monad.git
cd Monad
pip install -e .            # core only
pip install -e ".[all]"     # with all extras

2. Configure your LLM On your first run, MONAD will initialize its workspace in ~/.monad/. Update ~/.monad/.env with your LLM Base URL, API Key, and Model name.

Note: If you don't configure this manually, MONAD will guide you through an interactive setup with connectivity validation on your first launch.


๐Ÿ’ป Usage

Once installed, you can start the MONAD agent from any directory in your terminal.

Start Web UI (Default)

Launch the modern browser-based interface:

monad

Interactive Terminal Mode (Classic)

Start the continuous ReAct agent loop in the CLI:

monad --cli

Feishu (Lark) Bot Mode

  1. Follow the first two steps in the Feishu Bot Guide to create a bot and obtain your APP_ID and APP_SECRET.
  2. Connect MONAD to your Feishu bot via WebSocket:
APP_ID=xxx APP_SECRET=yyy monad --feishu

Note: Requires pip install monad-core[feishu] for the lark-oapi dependency.

Self-Test

Verify all modules load correctly and the LLM connection is functioning:

monad --test

Unit Tests

Run the test suite for all tools:

python -m pytest tests/ -v

Built with pure rational reasoning ๐Ÿ’ก

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

monad_core-0.4.1.tar.gz (149.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

monad_core-0.4.1-py3-none-any.whl (155.0 kB view details)

Uploaded Python 3

File details

Details for the file monad_core-0.4.1.tar.gz.

File metadata

  • Download URL: monad_core-0.4.1.tar.gz
  • Upload date:
  • Size: 149.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.2

File hashes

Hashes for monad_core-0.4.1.tar.gz
Algorithm Hash digest
SHA256 9870f0ef1b3024d43bbc7a6381590330abc1856f9893fd717713cc0c87d87920
MD5 a820fc79420cde46a77d1036706e899d
BLAKE2b-256 1bc3817a1b9a36fd3af23934b3d9c0d8f53114addef8ea6bc856b488fb1f5c3f

See more details on using hashes here.

File details

Details for the file monad_core-0.4.1-py3-none-any.whl.

File metadata

  • Download URL: monad_core-0.4.1-py3-none-any.whl
  • Upload date:
  • Size: 155.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.2

File hashes

Hashes for monad_core-0.4.1-py3-none-any.whl
Algorithm Hash digest
SHA256 d4e0d0b4f39ef562f537bd7914d67e81387e5ef6e06a597ce598354f390b46d5
MD5 e664616a92153d276716920b0dbacb78
BLAKE2b-256 428fc995d2a0642e56ba9d55cb7bec1989bdc1c2a9694419a4f78165ecf5cc56

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page