Advanced AI Agent Framework with Local LLM Integration
Project description
NeuralNode 2.0.0 - AI Agent Framework
A powerful, modular AI agent framework for building intelligent desktop applications with local LLM integration, multi-platform messaging support, and autonomous task execution. Now featuring Advanced RAG, LM Studio integration, WhatsApp support, and Production-Ready Architecture.
๐ What's New in NeuralNode 2.0.0
๐ NeuralNode Package (New!)
- Modular Architecture: Installable Python package with clean separation of concerns
- Core Framework:
BaseTool,Message,ReActAgentclasses for building custom agents - Chain System: Advanced chain operations beyond LangChain capabilities
- Local LLM: Unified interface for Ollama and LM Studio
- Easy Imports:
from neuralnode import ReActAgent, BaseTool, Message
๐ LM Studio Integration (New!)
- OpenAI-Compatible API: Direct integration with LM Studio's local inference server
- Streaming Support: Real-time token streaming for responsive interactions
- Model Management: Auto-detection, switching, and unloading capabilities
- Backend Manager: Seamlessly switch between Ollama and LM Studio
๐ WhatsApp Integration (New!)
- WhatsApp Business API: Production-ready Meta API integration
- WhatsApp Web: Development mode with pywhatkit
- AI Agent Bot: Complete WhatsApp bot with agent integration
- Media Support: Text, images, audio, and documents
๐ Advanced RAG System (New!)
Superior to LangChain:
- Semantic Chunking: Intelligent document splitting with overlap
- Multi-Format Support: PDF, DOCX, TXT, MD, JSON, CSV, HTML
- Hybrid Search: Vector + Keyword search with configurable weights
- Query Expansion: LLM-powered query enhancement
- Re-ranking: Cross-encoder relevance scoring
- Caching Layer: Persistent embeddings for faster retrieval
๐ Advanced Chains System (New!)
Beyond LangChain:
- Sequential Chain: Step-by-step processing
- Parallel Chain: Concurrent execution
- Conditional Chain: Branching logic
- Map-Reduce Chain: Distributed processing
- Fallback Chain: Error recovery
- Conversation Chain: Context-aware dialogues
- Extraction Chain: Structured data extraction
- Fluent Builder API: ChainBuilder for easy composition
๐ Direct TTS Playback (New!)
- No File Saving: Play audio directly without saving MP3 files
- Pygame Integration: Real-time audio streaming
- Voice Selection: Multiple voices and languages
- Async Support: Non-blocking audio playback
๐ Customizable Prompts via .env (New!)
- Environment-Based: Customize prompts without editing code
- Personality Templates: Professional, Coding, Creative assistants
- Dynamic Loading: Change prompts without restart
- Multi-User Support: Different prompts for different users
Core Framework
- ReAct Agent Architecture: Advanced reasoning and action loop with tool integration
- Local LLM First: Native Ollama integration with auto-start capability
- Memory System: Long-term semantic memory using ChromaDB with sentence transformers
- Multi-Modal: Vision capabilities through local Ollama vision models (llava)
Platform Integrations
- Telegram Bot: Full-featured bot with voice notes, screenshots, and image analysis
- WhatsApp: Business API and Web integration
- Flutter Desktop App: Modern cross-platform UI with real-time WebSocket updates
- FastAPI Backend: RESTful API with async support and automatic documentation
Agent Tools
- System Control: Lock, unlock, shutdown, restart PC
- App Launcher: Open any Windows application (Spotify, VS Code, Edge, etc.)
- Browser Automation: Microsoft Edge control and web search
- Input Simulation: Keyboard shortcuts, mouse clicks, text typing
- Screen Analysis: Screenshot capture with AI vision analysis
- Voice Synthesis: Text-to-speech with edge-tts integration (file + direct playback)
- Window Management: List, focus, minimize, maximize, close windows
- Command Execution: PowerShell/CMD command execution
- Task Planning: Autonomous multi-step task creation and execution
- Python REPL: Execute Python code dynamically
๐ ๏ธ Installation
Prerequisites
- Python 3.10+
- Flutter SDK 3.0+ (optional, for desktop UI)
- Ollama OR LM Studio (local LLM server)
- Windows 10/11 (for system control features)
1. Backend Setup
cd backend
pip install -r requirements.txt
# Install neuralnode package
pip install -e .
Updated requirements.txt:
fastapi==0.104.1
uvicorn[standard]==0.24.0
pydantic==2.5.0
python-multipart==0.0.6
aiohttp==3.9.1
websockets==12.0
python-dotenv==1.0.0
pillow==10.1.0
pyautogui==0.9.54
python-telegram-bot==20.7
requests==2.31.0
chromadb==0.4.18
sentence-transformers==2.2.2
edge-tts==6.1.9
pytz==2023.3
pygetwindow==0.0.9
pygame==2.5.2
pypdf==3.17.1
python-docx==1.1.0
2. Environment Configuration
Create .env file in project root:
# =============================================================================
# AI Agent Personality & Prompts (Customize these!)
# =============================================================================
# Main Agent System Prompt - Define your AI's personality
AGENT_SYSTEM_PROMPT="""You are Vision, a powerful AI agent with FULL CONTROL over the user's Windows PC.
You have access to many tools. If the user asks you to do something (open app, screenshot, click, type), USE THE TOOLS.
Format your reasoning:
Thought: reasoning...
Action: tool_name Input: {...}
If no tool needed, respond directly with 'Final Answer: [your response]'"""
# Fast Mode Prompt
FAST_SYSTEM_PROMPT="""You are Vision, an AI assistant. Respond directly and concisely."""
# =============================================================================
# LLM Backend Configuration
# =============================================================================
# Ollama (Option 1)
OLLAMA_URL=http://localhost:11434
TEXT_MODEL=qwen3.5:4b
VISION_MODEL=llava
# LM Studio (Option 2) - Uncomment to use instead of Ollama
# LM_STUDIO_URL=http://localhost:1234
# LM_STUDIO_MODEL=your-model-name
# =============================================================================
# Platform Integrations
# =============================================================================
# Telegram Bot (from @BotFather)
TELEGRAM_BOT_TOKEN=your_token_here
# WhatsApp Business API (from Meta Developer Dashboard)
WHATSAPP_PHONE_NUMBER_ID=your_phone_number_id
WHATSAPP_ACCESS_TOKEN=your_access_token
# =============================================================================
# Web Search Configuration
# =============================================================================
GOOGLE_API_KEY=your_google_api_key
GOOGLE_SEARCH_ENGINE_ID=your_search_engine_id
# OR
SERPAPI_KEY=your_serpapi_key
# =============================================================================
# Agent Behavior Settings
# =============================================================================
SHOW_THINKING=false
VOICE_RESPONSE_ENABLED=true
AUTO_START_TELEGRAM=true
AUTO_START_OLLAMA=true
3. Flutter Setup
flutter pub get
flutter config --enable-windows-desktop
๐ Quick Start
Start the AI Agent
# Start with auto-configuration
python main.py
Auto-start features:
- โ Ollama server (if not running)
- โ Telegram bot (if token configured)
- โ Memory service (ChromaDB)
Start with LM Studio
from neuralnode.integrations.lm_studio import LMStudioLLM
# Initialize LM Studio connection
llm = LMStudioLLM(base_url="http://localhost:1234")
# Check connection
status = await llm.check_connection()
print(f"LM Studio connected: {status['connected']}")
Start the Backend API
cd backend
python api_server.py
Server runs at: http://localhost:8000
- Swagger UI: http://localhost:8000/docs
- WebSocket: ws://localhost:8000/ws
Run Flutter Desktop App
flutter run -d windows
๐ค Agent Modes
1. Fast Mode (SHOW_THINKING = False)
- Quick responses without reasoning steps
- Direct tool execution
- Best for: Simple commands, quick queries
2. ReAct Mode (SHOW_THINKING = True)
- Full reasoning loop with Thought/Action/Observation
- Multi-step problem solving
- Best for: Complex tasks, planning
Configuration
Edit backend/core/config.py:
SHOW_THINKING = False # Toggle between Fast and ReAct modes
VOICE_RESPONSE_ENABLED = True # Send voice notes by default
AUTO_START_TELEGRAM = True
AUTO_START_OLLAMA = True
๐ง Available Tools
| Tool | Description | Example Usage |
|---|---|---|
launch_app |
Open any application | "Open Spotify" |
system_control |
Lock/unlock/shutdown PC | "Lock my PC" |
keyboard_shortcut |
Press key combinations | "Press Ctrl+C" |
type_text |
Type text into active window | "Type 'Hello World'" |
mouse_click |
Click at coordinates | "Click at (100, 200)" |
window_manager |
Manage open windows | "List all windows" |
take_screenshot |
Capture and analyze screen | "Take a screenshot" |
check_screen |
Silent screen analysis | "What's on my screen?" |
send_voice_note |
Convert text to speech | "Read this aloud" |
edge_browser |
Control Microsoft Edge | "Search Google for Python" |
spotify_control |
Control Spotify playback | "Play song Believer" |
execute_command |
Run PowerShell/CMD | "Run 'dir' command" |
python_repl |
Execute Python code | "Calculate 2+2 in Python" |
๐ฌ WhatsApp Integration
WhatsApp Business API
from neuralnode.integrations.whatsapp import WhatsAppAgentBot
# Initialize bot
bot = WhatsAppAgentBot(
method="business_api",
phone_number_id="123456789",
access_token="your_access_token"
)
# Set AI agent
bot.set_agent(your_agent)
# Authorize users
bot.authorize_number("+201234567890")
# Send AI response
await bot.send_ai_response("+201234567890", "What's the weather?")
WhatsApp Web (Development)
from neuralnode.integrations.whatsapp import WhatsAppWebIntegration
wa = WhatsAppWebIntegration()
# Start session (shows QR code)
await wa.start_session()
# Send message
await wa.send_message("+201234567890", "Hello from NeuralNode!")
๐๏ธ TTS (Text-to-Speech)
Direct Playback (No File Saving)
import asyncio
from backend.tools.tts_tools import DirectTTSTool
async def play_direct():
tts = DirectTTSTool()
# Play directly without saving
await tts.execute(
text="Hello from NeuralNode!",
voice="en-US-EmmaNeural",
speed="+0%"
)
asyncio.run(play_direct())
Save to File
from backend.core.utils import generate_voice_file
async def save_tts():
await generate_voice_file(
text="Hello, this is NeuralNode!",
output_path="output.mp3"
)
asyncio.run(save_tts())
๐ง Advanced RAG System
Basic Usage
from backend.rag.advanced_rag import AdvancedRAG
# Initialize RAG
rag = AdvancedRAG(persist_dir="./rag_store")
# Add documents
await rag.add_document("document.pdf", doc_id="doc_1")
await rag.add_document("document.txt", doc_id="doc_2")
# Search
results = await rag.search("What is machine learning?", k=5)
for result in results:
print(f"Score: {result.score}")
print(f"Content: {result.content[:200]}...")
With Query Expansion
# Enable query expansion
results = await rag.search(
"What is machine learning?",
k=5,
expand_query=True, # LLM will expand query
rerank=True # Re-rank results
)
Using with LLM
# Get context for prompt
context = rag.get_context_for_prompt(
query="Explain neural networks",
max_tokens=2000
)
# Use in LLM prompt
messages = [
Message.system("You are a helpful assistant."),
Message.user(f"Context: {context}\n\nQuestion: Explain neural networks")
]
response = await llm.achat(messages)
๐ Advanced Chains
Sequential Chain
from neuralnode.chains import SequentialChain, PromptTemplate
chain = SequentialChain([
PromptTemplate("Summarize: {text}"),
llm,
PromptTemplate("Translate to French: {output}"),
llm
])
result = await chain.execute({"text": "Long article here..."})
Parallel Chain
from neuralnode.chains import ParallelChain
chain = ParallelChain([
("summary", [PromptTemplate("Summarize: {text}"), llm]),
("keywords", [PromptTemplate("Extract keywords: {text}"), llm])
])
results = await chain.execute({"text": "Article here..."})
# Returns: {"summary": "...", "keywords": "..."}
Conditional Chain
from neuralnode.chains import ConditionalChain
chain = ConditionalChain(
condition=lambda x: "code" in x["text"].lower(),
true_chain=[PromptTemplate("Review this code: {text}"), llm],
false_chain=[PromptTemplate("Summarize: {text}"), llm]
)
result = await chain.execute({"text": "def hello(): print('world')"})
Map-Reduce Chain
from neuralnode.chains import MapReduceChain
# Process large documents in chunks
chain = MapReduceChain(
map_chain=[PromptTemplate("Summarize chunk: {chunk}"), llm],
reduce_chain=[PromptTemplate("Combine summaries: {outputs}"), llm]
)
result = await chain.execute({
"chunks": ["chunk1...", "chunk2...", "chunk3..."]
})
๐พ Memory System
Long-Term Memory
from backend.memory_service import LongTermMemory
memory = LongTermMemory()
# Store memory
await memory.add_memory(
content="User prefers dark mode",
type="preference",
importance=0.8,
tags=["ui", "preference"]
)
# Search
results = await memory.search_memories("What are user preferences?")
# Get context for prompt
context = await memory.get_context_for_prompt("Tell me about the user")
Memory-Enhanced Agent
from backend.memory_service import MemoryEnhancedAgent
agent = MemoryEnhancedAgent()
# Process with automatic memory
result = await agent.process_with_memory("I like dark mode")
# Automatically stores and retrieves context
๐ฌ Telegram Bot Commands
The Telegram bot supports natural language commands:
/voice on - Enable voice responses
/voice off - Disable voice responses (text mode)
Screenshot commands:
- "Screenshot" / "Screen shot" / "ุตูุฑุฉ" / "ุงุณูุฑูู"
- "What do you see?" / "What's on screen?"
Image Analysis:
- Send any photo for AI vision analysis
Voice Notes:
- "Read this aloud" / "Send voice note"
๐ API Endpoints
| Endpoint | Method | Description |
|---|---|---|
/health |
GET | Health check |
/chat/message |
POST | Send message to AI |
/search/web |
POST | Web search |
/monitor/screenshot |
POST | Take screenshot |
/monitor/toggle |
POST | Toggle auto-monitor |
/monitor/status |
GET | Get monitor status |
/ws |
WebSocket | Real-time updates |
Example API Call
curl -X POST http://localhost:8000/chat/message \
-H "Content-Type: application/json" \
-d '{"message": "Open Spotify", "include_screenshot": false}'
๐ Integration Status
| Platform | Status | Features |
|---|---|---|
| Ollama | โ Ready | Auto-start, vision models, local inference |
| LM Studio | โ Ready | OpenAI-compatible API, streaming, model management |
| Telegram | โ Ready | Text, voice, images, screenshots |
| โ Ready | Business API, Web integration, media support | |
| Flutter Desktop | โ Ready | Windows UI, WebSocket, API client |
| Discord | ๐ Planned | Coming in v2.1 |
| Slack | ๐ Planned | Coming in v2.2 |
โ๏ธ Supported Models
Ollama Models
| Model | Size | Best For |
|---|---|---|
| qwen3.5:4b | 4B | Fast responses, tool use |
| llama3.2:latest | 3.2B | General purpose |
| gemma2:9b | 9B | Complex reasoning |
| mistral:latest | 7B | Balanced performance |
| codellama:latest | 7B | Code generation |
| llava | 7B | Vision analysis |
LM Studio
- Any model loaded in LM Studio
- Auto-detection of available models
- Support for GGUF, GPTQ, AWQ formats
๐ ๏ธ Customization
Add Custom Tools
from neuralnode.core import BaseTool, ToolResult, ParameterSpec
class MyCustomTool(BaseTool):
name = "my_tool"
description = "Description of what it does"
parameters = {
"param": ParameterSpec(type="string", description="Param description", required=True)
}
def execute(self, param: str) -> ToolResult:
# Your logic here
return ToolResult.ok(content=f"Result: {param}")
# Add to main.py TOOLS list
TOOLS.append(MyCustomTool())
๐ Troubleshooting
Ollama Not Starting
# Check if Ollama is installed
ollama --version
# Start manually
ollama serve
Telegram Bot Not Responding
- Verify bot token in
.env - Check bot is started with
/start - Ensure bot has message permissions
LM Studio Connection Failed
# Check LM Studio is running on correct port
from neuralnode.integrations.lm_studio import LMStudioLLM
llm = LMStudioLLM()
status = await llm.check_connection()
print(status) # Should show connected: True
Import Errors
# Reinstall neuralnode package
cd /path/to/neuralnode
pip install -e .
# Reinstall dependencies
pip install -r backend/requirements.txt --force-reinstall
ChromaDB/Memory Errors
# Clear memory store
rm -rf memory_store/
rm -rf rag_store/
# Reinitialize
python -c "from backend.memory_service import LongTermMemory; LongTermMemory()"
Flutter Build Issues
flutter clean
flutter pub get
flutter build windows
๐ฆ Building for Production
Windows Executable
# Build Flutter app
flutter build windows --release
# Output: build/windows/x64/runner/Release/vision.exe
๐ค Contributing
NeuralNode is an open framework. Contributions welcome!
Roadmap v2.1
- Discord integration
- Slack integration
- macOS/Linux support
- WebRTC voice chat
- Plugin system
- Docker deployment
๐ License
MIT License - See LICENSE file for details
๐ก Key Features Summary
- โ 100% Local - No cloud dependencies, full privacy
- โ Multi-Modal - Text, voice, images, vision
- โ Advanced RAG - Superior to LangChain
- โ Advanced Chains - Beyond LangChain capabilities
- โ Multiple Backends - Ollama + LM Studio
- โ Multi-Platform - Telegram + WhatsApp
- โ Autonomous - Self-planning task execution
- โ Extensible - Easy tool and chain creation
- โ Memory - Long-term semantic memory
- โ Fast - GPU acceleration
- โ Free - No API costs, open source
NeuralNode 2.0.0 - Built for AI agents that actually do things.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file neuralnode-2.0.0.tar.gz.
File metadata
- Download URL: neuralnode-2.0.0.tar.gz
- Upload date:
- Size: 26.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f44ab52ebff0cd73bed820dce1e818d506b01e0119e04d0a6d9368c53e1ad5b0
|
|
| MD5 |
7da5173c8522c7bb7aead5fec4782782
|
|
| BLAKE2b-256 |
3abb65afe5548f269ac48a8ca7baab8b8e16775408c080c3ca279ef2d110bcb3
|
File details
Details for the file neuralnode-2.0.0-py3-none-any.whl.
File metadata
- Download URL: neuralnode-2.0.0-py3-none-any.whl
- Upload date:
- Size: 17.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3bbd97d29c28157460932ed2cb2c6b1f0d81c45b3dc56d46b556f1f1c53f260a
|
|
| MD5 |
ebe2d5bd9411464b8272fca305a5cf59
|
|
| BLAKE2b-256 |
3060d73e8fef656fde8f3701d9bb5ffde2f50bacc94543bf8e52254fb7373628
|