A comprehensive Python framework for building AI agents with support for both cloud-based LLM APIs and local models
NeuralNode Documentation
Table of Contents
- Introduction
- Features
- Architecture Overview
- Installation Guide
- Quick Start
- Core Concepts
- API Reference
- Model System
- Training Guide
- Inference System
- Configuration & Settings
- Plugins / Extensions System
- Performance Optimization
- Use Cases
- Comparison
- Troubleshooting
- Roadmap
- Contributing Guide
- License
- Credits & Author
1. Introduction
NeuralNode is a comprehensive Python framework for building AI agents with support for both cloud-based LLM APIs and local models. It provides a unified interface for creating intelligent applications with advanced features like multi-modal processing, autonomous agent capabilities, and enterprise-grade security.
Key Differentiators
- Dual Mode Operation: Works with both cloud APIs (OpenAI, Anthropic, Google) and local LLMs (Llama, Mistral, etc.)
- Security First: Built-in sandboxing, human-in-the-loop approval, and privacy mode
- Production Ready: Observability, monitoring, and distributed inference capabilities
- Extensible: Plugin system for custom providers, tools, and integrations
- Performance Optimized: FAISS vector search, model quantization, and caching layers
Who Should Use NeuralNode?
- Developers building AI-powered applications
- Data scientists creating LLM pipelines
- Enterprises requiring on-premise AI solutions
- Researchers experimenting with agent architectures
- Startups needing scalable AI infrastructure
2. Features
2.1 Core Features
Unified LLM Interface
import neuralnode as nn
# Works with any provider
ai = nn.NeuralNode(provider="openai", model="gpt-4")
ai = nn.NeuralNode(provider="anthropic", model="claude-3-sonnet")
ai = nn.NeuralNode(provider="ollama", model="llama3")
# Same interface for all
response = ai.chat("Hello, world!")
Advanced Memory System
from neuralnode.memory import AdvancedMemorySystem
memory = AdvancedMemorySystem(
short_term_limit=10,
long_term_db_path="./memory.db",
enable_semantic=True
)
# Store information
memory.add_long_term("User prefers Python over JavaScript")
# Semantic search
results = memory.search("What programming languages does the user like?", k=3)
Intelligent Agents
from neuralnode import Agent
from neuralnode.tools import WebSearch, FileManager
agent = Agent(
llm=ai,
tools=[WebSearch(), FileManager()],
system_prompt="You are a helpful assistant."
)
# Agent automatically decides which tools to use
result = agent.run("Find the latest AI news and save to file")
Multi-Modal Processing
from neuralnode.chains import MultiModalChain
mm_chain = MultiModalChain(llm=ai)
# Process text + image + audio
result = mm_chain.chat_with_multimodal(
text="What's in this image?",
image_path="photo.jpg",
audio_path="question.mp3"
)
2.2 Security Features
Human-in-the-Loop
from neuralnode.security import HumanInTheLoop
hitl = HumanInTheLoop(require_confirmation=["HIGH", "CRITICAL"])
# Will prompt user before executing dangerous operations
approved, reason = hitl.check_operation(
"delete_file",
{"path": "/important/data.txt"}
)
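The check above can be pictured as a gate keyed on risk level. The following is a minimal standalone sketch, not NeuralNode's implementation: the `RISK_LEVELS` table, the `approver` callback, and the choice to treat unknown operations as `HIGH` are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical risk table; the real library may classify operations differently.
RISK_LEVELS: Dict[str, str] = {
    "read_file": "LOW",
    "write_file": "MEDIUM",
    "delete_file": "HIGH",
    "format_disk": "CRITICAL",
}


def check_operation(
    operation: str,
    params: dict,
    require_confirmation: Iterable[str],
    approver: Callable[[str, dict], bool],
) -> Tuple[bool, str]:
    """Return (approved, reason); ask the approver only for gated risk levels."""
    # Unknown operations are treated as HIGH so they are never silently allowed.
    level = RISK_LEVELS.get(operation, "HIGH")
    if level not in set(require_confirmation):
        return True, f"{level} risk: no confirmation required"
    if approver(operation, params):
        return True, f"{level} risk: approved by human"
    return False, f"{level} risk: rejected by human"


# Usage with a stand-in approver that rejects everything.
deny_all = lambda op, params: False
print(check_operation("delete_file", {"path": "/important/data.txt"},
                      ["HIGH", "CRITICAL"], deny_all))
```

Passing the approver as a callback keeps the gate testable; an interactive deployment would presumably replace it with a prompt to the user.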
Sandboxed Code Execution
from neuralnode.tools.secure_code_interpreter import safe_execute
# Runs in isolated environment
result = safe_execute("""
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(10))
""")
Privacy Mode
from neuralnode.security import PrivacyMode
privacy = PrivacyMode()
privacy.enable(password="secure_key")
# All data encrypted automatically
encrypted = privacy.encrypt_message("sensitive data")
2.3 Performance Features
FAISS Vector Search
from neuralnode.rag import FAISSVectorStore
store = FAISSVectorStore(embedding_dim=384, index_type="flat")
store.add_batch(documents)
# Flat index: exact brute-force search, so query time grows linearly with the number of vectors
results = store.search(query_embedding, k=5)
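A flat index compares the query against every stored vector and keeps the best `k`. The sketch below shows that idea in plain Python. It is an illustration only: the function names are hypothetical, cosine similarity is an assumed metric, and FAISS itself does this in optimized native code.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def flat_search(
    vectors: List[Sequence[float]], query: Sequence[float], k: int
) -> List[Tuple[int, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = ((i, cosine(v, query)) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(flat_search(vectors, [1.0, 0.1], k=2))
```

Because every vector is scored, results are exact, but the cost per query scales with the size of the collection.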
Distributed Inference
from neuralnode.distributed import DistributedInferenceEngine
engine = DistributedInferenceEngine()
engine.add_node("gpu-1", "192.168.1.10", 8000)
engine.add_node("gpu-2", "192.168.1.11", 8000)
# Automatically distributes across GPUs
results = engine.parallel_generate(prompts)
Model Quantization
from neuralnode.local import ModelQuantizer
quantizer = ModelQuantizer()
quantizer.quantize(
"meta-llama/Llama-2-7b",
output_path="./llama-7b-q4.gguf",
method="Q4_K_M"
)
2.4 Training Features
RLHF Pipeline
from neuralnode.training import CompleteRLHFPipeline
rlhf = CompleteRLHFPipeline(model, tokenizer)
rlhf.run_full_pipeline(
prompts=training_prompts,
test_prompts=eval_prompts,
output_dir="./rlhf_output"
)
Fine-tuning with LoRA
from neuralnode.training import FineTuner
tuner = FineTuner(model="meta-llama/Llama-2-7b")
tuner.finetune(
dataset="./training_data.json",
method="lora",
epochs=3,
batch_size=4
)
Federated Learning
from neuralnode.training import FederatedLearningServer
server = FederatedLearningServer(model, config)
server.register_client("client_1", data_size=1000)
server.register_client("client_2", data_size=1500)
# Train without sharing raw data
final_model = server.train(num_rounds=10)
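One common way a server combines client updates without seeing raw data is federated averaging, where each client's parameters are weighted by its `data_size`. Whether `FederatedLearningServer` uses exactly this rule is an assumption; the sketch below only illustrates the weighting with the two client sizes registered above, and the parameter vectors are made up.

```python
from typing import Dict, List


def federated_average(
    client_weights: Dict[str, List[float]], data_sizes: Dict[str, int]
) -> List[float]:
    """Average client parameter vectors, weighted by each client's data size."""
    total = sum(data_sizes[name] for name in client_weights)
    length = len(next(iter(client_weights.values())))
    averaged = [0.0] * length
    for name, weights in client_weights.items():
        share = data_sizes[name] / total
        for i, w in enumerate(weights):
            averaged[i] += share * w
    return averaged


sizes = {"client_1": 1000, "client_2": 1500}
updates = {"client_1": [1.0, 0.0], "client_2": [0.0, 1.0]}  # toy parameters
print(federated_average(updates, sizes))
```

With these sizes, client_2 contributes 60% of the average and client_1 contributes 40%.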
3. Architecture Overview
3.1 System Architecture
┌─────────────────────────────────────────────────────────────┐
│                    NeuralNode Framework                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │  LLM Layer   │  │ Agent Layer  │  │  Tool Layer  │       │
│  │              │  │              │  │              │       │
│  │ - OpenAI     │  │ - ReAct      │  │ - Web Search │       │
│  │ - Anthropic  │  │ - Auto-Agent │  │ - File Sys   │       │
│  │ - Google     │  │ - Planning   │  │ - Browser    │       │
│  │ - Local      │  │ - Multi-Agent│  │ - Code Exec  │       │
│  │ - Ollama     │  │              │  │              │       │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘       │
│         │                 │                 │               │
│         └─────────────────┼─────────────────┘               │
│                           │                                 │
│    ┌──────────────────────┴──────────────────────┐          │
│    │           Memory & Context Layer            │          │
│    │                                             │          │
│    │  ┌──────────┐  ┌──────────┐  ┌──────────┐   │          │
│    │  │  Short   │  │   Long   │  │ Semantic │   │          │
│    │  │   Term   │  │   Term   │  │  Memory  │   │          │
│    │  └──────────┘  └──────────┘  └──────────┘   │          │
│    └─────────────────────────────────────────────┘          │
│                           │                                 │
│    ┌──────────────────────┴──────────────────────┐          │
│    │            Infrastructure Layer             │          │
│    │                                             │          │
│    │  ┌──────────┐  ┌──────────┐  ┌──────────┐   │          │
│    │  │  Vector  │  │  Cache   │  │ Security │   │          │
│    │  │  Store   │  │  Layer   │  │  Layer   │   │          │
│    │  │ (FAISS)  │  │ (Redis)  │  │  (HITL)  │   │          │
│    │  └──────────┘  └──────────┘  └──────────┘   │          │
│    └─────────────────────────────────────────────┘          │
│                                                             │
└─────────────────────────────────────────────────────────────┘
3.2 Data Flow
  User Input
       │
       ▼
┌─────────────┐
│    Input    │ ──► Preprocessing ──► Safety Check
│ Processing  │
└─────────────┘
       │
       ▼
┌─────────────┐
│   Memory    │ ──► Retrieve Context ──► Add to Prompt
│   Lookup    │
└─────────────┘
       │
       ▼
┌─────────────┐
│     LLM     │ ──► Generate Response
│    Core     │
└─────────────┘
       │
       ▼
┌─────────────┐
│    Tool     │ ──► Parse Tool Calls ──► Execute
│   Parser    │
└─────────────┘
       │
       ▼
┌─────────────┐
│  Response   │ ──► Postprocessing ──► Store in Memory
│   Builder   │
└─────────────┘
       │
       ▼
  User Output
3.3 Component Diagram
               NeuralNode Core
                      │
     ┌────────────────┼────────────────┐
     │                │                │
     ▼                ▼                ▼
┌─────────┐      ┌─────────┐      ┌─────────┐
│Providers│      │  Tools  │      │ Memory  │
│         │      │         │      │         │
│- OpenAI │      │- Search │      │- SQLite │
│- Claude │      │- Files  │      │- FAISS  │
│- Local  │      │- Browser│      │- Cache  │
│- Ollama │      │- Code   │      │         │
└─────────┘      └─────────┘      └─────────┘
     │                │                │
     └────────────────┼────────────────┘
                      │
                      ▼
             ┌─────────────────┐
             │  Agent System   │
             │                 │
             │ - ReAct Pattern │
             │ - Planning      │
             │ - Multi-Agent   │
             └─────────────────┘
                      │
                      ▼
             ┌─────────────────┐
             │   Extensions    │
             │                 │
             │ - Plugins       │
             │ - Custom Tools  │
             │ - Integrations  │
             └─────────────────┘
4. Installation Guide
4.1 Prerequisites
- Python 3.8 or higher
- pip or conda package manager
- For local models: 8GB+ RAM (16GB+ recommended)
- For GPU acceleration: CUDA-capable GPU (optional)
4.2 Basic Installation
pip install neuralnode
4.3 Development Installation
git clone https://github.com/yourusername/neuralnode.git
cd neuralnode
pip install -e ".[dev]"
4.4 Optional Dependencies
# Vector search
pip install neuralnode[vectors] # FAISS + ChromaDB
# Local models
pip install neuralnode[local] # Ollama + llama.cpp
# Training
pip install neuralnode[training] # PyTorch + Transformers
# All features
pip install neuralnode[all]
4.5 Docker Installation
FROM python:3.11-slim
WORKDIR /app
RUN pip install neuralnode[all]
COPY . .
CMD ["python", "app.py"]
4.6 Verification
import neuralnode as nn
print(nn.__version__)
# Check available features
from neuralnode.utils.graceful_degradation import FeatureAvailability
FeatureAvailability.print_feature_matrix()
5. Quick Start
5.1 First Steps
import neuralnode as nn
# Initialize with OpenAI
ai = nn.NeuralNode(
provider="openai",
model="gpt-4",
api_key="your-api-key" # Or set OPENAI_API_KEY env var
)
# Simple chat
response = ai.chat("What is machine learning?")
print(response.text)
5.2 Building Your First Agent
from neuralnode import Agent
from neuralnode.tools import WebSearch, Calculator
agent = Agent(
llm=ai,
tools=[WebSearch(), Calculator()],
system_prompt="You are a helpful research assistant."
)
# Agent automatically searches and calculates
result = agent.run("""
What is the population of Tokyo?
Calculate what percentage it is of Japan's total population.
""")
print(result)
5.3 Adding Memory
from neuralnode.memory import AdvancedMemorySystem
memory = AdvancedMemorySystem()
memory.add_long_term("User is a Python developer")
agent = Agent(llm=ai, memory=memory)
# Agent remembers previous context
response = agent.run("What programming language should I recommend?")
# Will consider that user knows Python
5.4 Using Local Models
# With Ollama (must be installed separately)
ai = nn.NeuralNode(provider="ollama", model="llama3")
# Or with local GGUF file
from neuralnode.local import LocalLLM
ai = LocalLLM(model_path="./models/llama-3-8b.gguf")
6. Core Concepts
6.1 NeuralNode
The core class that provides a unified interface to LLMs.
from neuralnode import NeuralNode
# Configuration
ai = NeuralNode(
provider="openai", # Provider name
model="gpt-4", # Model identifier
temperature=0.7, # Sampling temperature
max_tokens=1000, # Maximum response length
timeout=30, # Request timeout
retries=3, # Retry attempts
cache=True, # Enable caching
)
6.2 Agent
An intelligent entity that can use tools and make decisions.
from neuralnode import Agent
agent = Agent(
llm=ai, # LLM instance
tools=[tool1, tool2], # Available tools
memory=memory, # Memory system
max_steps=10, # Maximum reasoning steps
system_prompt="...", # System instructions
)
Agent types:
- SimpleAgent: Basic question-answering
- ToolAgent: Can use tools
- ReActAgent: Reasoning and acting with self-correction
- MultiAgent: Coordinates multiple specialized agents
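To make the reason-then-act idea behind a ReAct-style agent and the `max_steps` limit concrete, here is a small standalone sketch. It does not use NeuralNode: the `policy` callable stands in for the LLM, and the tuple protocol, the `run_agent` name, and the toy calculator are all assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A "policy" stands in for the LLM: given the transcript so far it returns
# either ("call", tool_name, argument) or ("final", answer, "").
Policy = Callable[[List[str]], Tuple[str, str, str]]


def run_agent(
    task: str,
    policy: Policy,
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> str:
    """Alternate between reasoning (policy) and acting (tools) until done."""
    transcript = [f"Task: {task}"]
    for _ in range(max_steps):
        kind, name, arg = policy(transcript)
        if kind == "final":
            return name
        if name not in tools:
            # Feed the error back so the policy can correct itself.
            transcript.append(f"Observation: unknown tool {name!r}")
            continue
        transcript.append(f"Observation: {tools[name](arg)}")
    return "Stopped: max_steps reached"


def scripted_policy(transcript: List[str]) -> Tuple[str, str, str]:
    """Toy policy: call the calculator once, then answer with its result."""
    if len(transcript) == 1:
        return ("call", "calculator", "6*7")
    return ("final", transcript[-1].replace("Observation: ", ""), "")


def calculator(expression: str) -> str:
    """Multiply two integers written as 'a*b' (deliberately tiny)."""
    left, right = expression.split("*")
    return str(int(left) * int(right))


print(run_agent("What is 6*7?", scripted_policy, {"calculator": calculator}))
```

The step limit is what keeps a confused policy from looping forever; here a policy that keeps requesting a missing tool ends with the "Stopped" message.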
6.3 Tools
Functions that agents can use to interact with the world.
from neuralnode.tools import Tool
# Creating a custom tool
class WeatherTool(Tool):
    def __init__(self):
        super().__init__(
            name="get_weather",
            description="Get current weather for a location",
            parameters={
                "location": {"type": "string", "description": "City name"}
            }
        )

    def execute(self, location: str) -> str:
        # Implementation
        return f"Weather in {location}: 25C, Sunny"
6.4 Memory Systems
Different types of memory for different time horizons.
Short-term Memory: Recent conversation context
from neuralnode.memory import ConversationMemory
memory = ConversationMemory(max_messages=10)
Long-term Memory: Persistent storage
from neuralnode.memory import SQLiteMemory
memory = SQLiteMemory(db_path="./memory.db")
Semantic Memory: Vector-based retrieval
from neuralnode.memory import SemanticMemory
memory = SemanticMemory(embedding_model="sentence-transformers/all-MiniLM-L6-v2")
6.5 RAG (Retrieval-Augmented Generation)
Combining LLMs with document retrieval.
from neuralnode.rag import RAG, Document
# Load documents
docs = [
Document(content="NeuralNode is a framework...", metadata={"source": "docs"}),
Document(content="Installation requires Python 3.8...", metadata={"source": "docs"})
]
# Create RAG system
rag = RAG(llm=ai, documents=docs)
# Query with context
answer = rag.query("What are the system requirements?")
6.6 Chains
Sequential processing pipelines.
from neuralnode.chains import Chain
# Define a chain
chain = Chain()
chain.add_step(lambda x: x.upper())
chain.add_step(lambda x: x + "!!!")
chain.add_step(lambda x: ai.chat(x))
# Execute
result = chain.run("hello")
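Conceptually, a chain of this kind feeds each step's output into the next step. A minimal standalone sketch of that mechanism follows; the `MiniChain` class is hypothetical and is not the library's `Chain`, and the LLM step is left out so the example runs without a provider.

```python
from typing import Any, Callable, List


class MiniChain:
    """Run a list of single-argument callables in order."""

    def __init__(self) -> None:
        self._steps: List[Callable[[Any], Any]] = []

    def add_step(self, step: Callable[[Any], Any]) -> "MiniChain":
        self._steps.append(step)
        return self  # returning self allows chained add_step calls

    def run(self, value: Any) -> Any:
        for step in self._steps:
            value = step(value)
        return value


chain = MiniChain().add_step(str.upper).add_step(lambda s: s + "!!!")
print(chain.run("hello"))
```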
7. API Reference
7.1 NeuralNode Class
class NeuralNode:
    """
    Unified interface for LLM providers.

    Args:
        provider: LLM provider name ("openai", "anthropic", "google", "ollama", etc.)
        model: Model identifier
        api_key: API key (or set via environment variable)
        temperature: Sampling temperature (0.0 to 2.0)
        max_tokens: Maximum tokens to generate
        timeout: Request timeout in seconds
        retries: Number of retry attempts
        cache: Enable response caching
        **kwargs: Provider-specific options
    """

    def __init__(
        self,
        provider: str,
        model: Optional[str] = None,
        api_key: Optional[str] = None,
        temperature: float = 0.7,
        max_tokens: Optional[int] = None,
        timeout: int = 30,
        retries: int = 3,
        cache: bool = True,
        **kwargs
    ):
        ...

    def chat(
        self,
        message: str,
        system: Optional[str] = None,
        history: Optional[List[Dict]] = None,
        **kwargs
    ) -> ChatResponse:
        """Send a chat message."""
        ...

    def stream(
        self,
        message: str,
        **kwargs
    ) -> Iterator[str]:
        """Stream response tokens."""
        ...

    def embed(
        self,
        text: Union[str, List[str]]
    ) -> Union[List[float], List[List[float]]]:
        """Generate embeddings."""
        ...
7.2 Agent Class
class Agent:
    """
    Intelligent agent with tool use capabilities.

    Args:
        llm: NeuralNode instance
        tools: List of available tools
        memory: Memory system
        max_steps: Maximum reasoning steps
        system_prompt: System instructions
    """

    def __init__(
        self,
        llm: NeuralNode,
        tools: Optional[List[Tool]] = None,
        memory: Optional[Any] = None,
        max_steps: int = 10,
        system_prompt: Optional[str] = None
    ):
        ...

    def run(
        self,
        task: str,
        context: Optional[Dict] = None
    ) -> str:
        """Execute a task."""
        ...

    def add_tool(self, tool: Tool) -> None:
        """Add a tool to the agent."""
        ...
7.3 Tool Class
class Tool:
    """
    Base class for tools.

    Args:
        name: Tool identifier
        description: What the tool does
        parameters: JSON Schema for parameters
        func: Function to execute
    """

    def __init__(
        self,
        name: str,
        description: str,
        parameters: Dict[str, Any],
        func: Optional[Callable] = None
    ):
        ...

    def execute(self, **kwargs) -> Any:
        """Execute the tool."""
        ...
7.4 Memory Classes
class ConversationMemory:
    """Short-term conversation memory."""

    def __init__(self, max_messages: int = 10):
        ...

    def add(self, role: str, content: str) -> None:
        ...

    def get_messages(self) -> List[Dict]:
        ...

    def clear(self) -> None:
        ...


class AdvancedMemorySystem:
    """Multi-tier memory system."""

    def __init__(
        self,
        short_term_limit: int = 10,
        long_term_db_path: Optional[str] = None,
        enable_semantic: bool = False,
        embedding_model: Optional[str] = None
    ):
        ...

    def add_short_term(self, content: str) -> None:
        ...

    def add_long_term(self, content: str, metadata: Optional[Dict] = None) -> None:
        ...

    def search(
        self,
        query: str,
        k: int = 5,
        search_type: str = "semantic"
    ) -> List[Dict]:
        ...
8. Model System
8.1 Supported Providers
| Provider | Cloud/Local | Models | Authentication |
|---|---|---|---|
| OpenAI | Cloud | GPT-4, GPT-3.5 | API Key |
| Anthropic | Cloud | Claude 3, Claude 2 | API Key |
| Google | Cloud | Gemini Pro | API Key |
| Azure | Cloud | GPT-4, GPT-3.5 | Azure Credentials |
| Ollama | Local | Llama, Mistral, etc. | None |
| HuggingFace | Local | Any HF model | Token (optional) |
| llama.cpp | Local | GGUF models | None |
8.2 Provider Configuration
# OpenAI
ai = nn.NeuralNode(
provider="openai",
model="gpt-4",
api_key="sk-...",
organization="org-..." # Optional
)
# Anthropic
ai = nn.NeuralNode(
provider="anthropic",
model="claude-3-sonnet-20240229",
api_key="sk-ant-..."
)
# Google
ai = nn.NeuralNode(
provider="google",
model="gemini-pro",
api_key="..."
)
# Azure OpenAI
ai = nn.NeuralNode(
provider="azure",
model="gpt-4",
api_key="...",
api_base="https://your-resource.openai.azure.com/",
api_version="2024-02-01"
)
# Ollama
ai = nn.NeuralNode(
provider="ollama",
model="llama3",
base_url="http://localhost:11434"
)
8.3 Local Model Management
from neuralnode.local import LocalLLMHub
# Create model hub
hub = LocalLLMHub()
# Scan for models
models = hub.scan_directory("./models")
# Add model from HuggingFace
hub.import_from_hf("meta-llama/Llama-2-7b-chat-hf", quantization="Q4_K_M")
# Launch model server
hub.launch("llama-2-7b", port=8000)
# Use in NeuralNode
ai = nn.NeuralNode(provider="local", base_url="http://localhost:8000")
8.4 Model Quantization
from neuralnode.local import ModelQuantizer
quantizer = ModelQuantizer()
# Convert to GGUF
quantizer.convert(
model_path="meta-llama/Llama-2-7b-hf",
output_path="./llama-7b-f16.gguf",
format="f16"
)
# Quantize
quantizer.quantize(
model_path="./llama-7b-f16.gguf",
output_path="./llama-7b-q4.gguf",
method="Q4_K_M"
)
9. Training Guide
9.1 Fine-tuning with LoRA
from neuralnode.training import FineTuner
# Initialize
tuner = FineTuner(model="meta-llama/Llama-2-7b")
# Prepare dataset
dataset = tuner.prepare_dataset(
data_path="./training_data.jsonl",
format="instruction", # or "conversation"
max_length=2048
)
# Configure LoRA
tuner.configure_lora(
r=16,
lora_alpha=32,
target_modules=["q_proj", "v_proj"],
lora_dropout=0.05
)
# Train
tuner.finetune(
dataset=dataset,
output_dir="./lora_output",
num_train_epochs=3,
per_device_train_batch_size=4,
gradient_accumulation_steps=4,
learning_rate=2e-4,
warmup_steps=100,
logging_steps=10,
save_steps=500,
fp16=True
)
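The effect of `r` and `lora_alpha` can be checked with a little arithmetic. In standard LoRA, a weight matrix of shape d_out x d_in gains two low-rank factors with r * (d_out + d_in) trainable parameters, and the update is scaled by lora_alpha / r. The sketch below applies this to an assumed 4096 x 4096 projection; the function names are hypothetical and the shape is an illustrative assumption, not something read from the model.

```python
def lora_trainable_params(d_out: int, d_in: int, r: int) -> int:
    """Parameters in the two low-rank factors B (d_out x r) and A (r x d_in)."""
    return r * (d_out + d_in)


def lora_scaling(lora_alpha: float, r: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / r


# Assumed shape: a 4096 x 4096 projection.
full = 4096 * 4096
added = lora_trainable_params(4096, 4096, r=16)
print(added, f"{added / full:.2%}", lora_scaling(32, 16))
```

Under these assumptions each adapted matrix trains 131,072 parameters, under 1% of the 16,777,216 in the full matrix, with a scaling factor of 2.0.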
9.2 RLHF Training
from neuralnode.training import RLHFTrainer
# Initialize trainer
trainer = RLHFTrainer(
model="meta-llama/Llama-2-7b",
reward_model="your-reward-model"
)
# Stage 1: Collect preferences
preferences = trainer.collect_preferences(
prompts=eval_prompts,
num_responses_per_prompt=4
)
# Stage 2: Train reward model
trainer.train_reward_model(preferences, epochs=3)
# Stage 3: Train policy with PPO
trainer.train_policy(
prompts=training_prompts,
num_epochs=4,
batch_size=8
)
# Full pipeline
from neuralnode.training import CompleteRLHFPipeline
pipeline = CompleteRLHFPipeline(model, tokenizer)
results = pipeline.run_full_pipeline(
prompts=training_prompts,
test_prompts=eval_prompts,
output_dir="./rlhf_output"
)
9.3 Model Compression
from neuralnode.training import ModelCompressor
compressor = ModelCompressor()
# Pruning
pruned_model = compressor.prune(
model=model,
method="structured", # or "unstructured", "iterative"
amount=0.3 # Remove 30% of weights
)
# Knowledge Distillation
distilled_model = compressor.distill(
teacher_model=large_model,
student_model=small_model,
train_data=training_data,
temperature=4.0,
alpha=0.5
)
# Full compression pipeline
compressed_model, stats = compressor.compress(
model=model,
methods=["pruning", "quantization", "distillation"],
target_size_mb=500
)
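The `temperature` and `alpha` arguments passed to `distill` above correspond to the usual knowledge-distillation loss, which mixes a soft-target term (teacher versus student distributions at temperature T) with a hard-label cross-entropy term. The sketch below is a plain-Python illustration for a single example. Which term `alpha` weights is an assumption here (it weights the soft-target term), and the function names are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    label: int,
    temperature: float = 4.0,
    alpha: float = 0.5,
) -> float:
    """alpha * soft-target KL (scaled by T^2) + (1 - alpha) * hard-label CE."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    cross_entropy = -math.log(softmax(student_logits)[label])
    return alpha * temperature**2 * kl + (1 - alpha) * cross_entropy


student = [2.0, 1.0, 0.1]
teacher = [0.1, 1.0, 2.0]
print(distillation_loss(student, teacher, label=0))
```

A higher temperature flattens both distributions, so the student also learns from the teacher's relative preferences among wrong classes; the T^2 factor keeps the soft-target gradient on a comparable scale.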
10. Inference System
10.1 Basic Inference
# Single request
response = ai.chat("Hello, how are you?")
# Streaming
for token in ai.stream("Tell me a story"):
    print(token, end="")
# Batch processing
responses = ai.batch_chat([
"Question 1",
"Question 2",
"Question 3"
])
10.2 Distributed Inference
from neuralnode.distributed import DistributedInferenceEngine
# Create engine
engine = DistributedInferenceEngine()
# Add compute nodes
engine.add_node("node-1", "192.168.1.10", 8000, devices=["cuda:0"])
engine.add_node("node-2", "192.168.1.11", 8000, devices=["cuda:0", "cuda:1"])
# Shard model across nodes
engine.shard_model("meta-llama/Llama-2-70b", shards=4)
# Parallel generation
results = engine.parallel_generate(
prompts=["Prompt 1", "Prompt 2", "Prompt 3"],
max_tokens=100
)
# Pipeline parallelism (every node listed here must first be registered with add_node)
pipeline = engine.create_pipeline([
    "node-1", # Layers 0-15
    "node-2", # Layers 16-31
    "node-3", # Layers 32-47
    "node-4", # Layers 48-79
])
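Pipeline parallelism assigns each node a contiguous range of layers. One simple way to derive such ranges is an even split, sketched below. The `partition_layers` helper is hypothetical and not part of the library, and a real deployment might prefer an uneven split to balance memory or compute.

```python
from typing import List, Tuple


def partition_layers(num_layers: int, num_stages: int) -> List[Tuple[int, int]]:
    """Split layers 0..num_layers-1 into contiguous, near-equal stages."""
    if not 0 < num_stages <= num_layers:
        raise ValueError("need 1 <= num_stages <= num_layers")
    base, extra = divmod(num_layers, num_stages)
    stages, start = [], 0
    for stage in range(num_stages):
        # The first `extra` stages take one additional layer each.
        size = base + (1 if stage < extra else 0)
        stages.append((start, start + size - 1))
        start += size
    return stages


print(partition_layers(80, 4))
```

For 80 layers over 4 stages this yields inclusive ranges of 20 layers each, starting at layer 0 and ending at layer 79.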
10.3 Quantized Inference
from neuralnode.local import QuantizedLLM
# Load quantized model
model = QuantizedLLM(
model_path="./llama-7b-q4.gguf",
n_ctx=4096,
n_threads=8
)
# Inference
response = model.generate(
prompt="What is AI?",
max_tokens=100,
temperature=0.7
)
10.4 Caching
from neuralnode.utils import SmartCache
# Initialize cache
cache = SmartCache(
backend="redis", # or "disk", "memory"
ttl=3600,
max_size=10000
)
# Use with NeuralNode
ai = nn.NeuralNode(
provider="openai",
cache=cache,
cache_similarity_threshold=0.95
)
# Cache hit for similar queries
response1 = ai.chat("What is Python?")
response2 = ai.chat("Tell me about Python programming") # May use cache
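Whether a reworded query hits the cache depends on how similarity is measured and on the threshold. The sketch below illustrates the mechanism with a deliberately crude bag-of-words embedding. Everything in it (`SimilarityCache`, `embed`, the linear scan) is a hypothetical stand-in, not the `SmartCache` implementation, which presumably uses a real embedding model.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (a real system would use a model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SimilarityCache:
    """Return a stored response when a new query is close enough to an old one."""

    def __init__(self, threshold: float = 0.95) -> None:
        self.threshold = threshold
        self._entries: List[Tuple[Counter, str]] = []

    def get(self, query: str) -> Optional[str]:
        vector = embed(query)
        best = max(self._entries, key=lambda e: cosine(e[0], vector), default=None)
        if best is not None and cosine(best[0], vector) >= self.threshold:
            return best[1]
        return None

    def put(self, query: str, response: str) -> None:
        self._entries.append((embed(query), response))


cache = SimilarityCache(threshold=0.95)
cache.put("What is Python?", "Python is a programming language.")
print(cache.get("what is python?"))                   # same words, different case
print(cache.get("Tell me about Python programming"))  # no shared tokens here
```

With this toy embedding the second query misses, which shows why a high threshold such as 0.95 only "may" reuse an answer for a paraphrase: the outcome depends entirely on the embedding.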
11. Configuration & Settings
11.1 Configuration Files
YAML Configuration
# config.yaml
llm:
  provider: openai
  model: gpt-4
  temperature: 0.7
  max_tokens: 1000
agent:
  max_steps: 10
  system_prompt: "You are a helpful assistant."
  tools:
    - name: web_search
      enabled: true
    - name: file_manager
      enabled: true
memory:
  type: advanced
  short_term_limit: 10
  long_term_db_path: ./memory.db
  enable_semantic: true
security:
  human_in_the_loop: true
  require_confirmation:
    - HIGH
    - CRITICAL
JSON Configuration
{
"llm": {
"provider": "openai",
"model": "gpt-4"
},
"cache": {
"enabled": true,
"backend": "redis",
"ttl": 3600
}
}
Loading Configuration
import neuralnode as nn
# From file
ai = nn.NeuralNode.from_config("config.yaml")
# From dict
config = {
"provider": "openai",
"model": "gpt-4"
}
ai = nn.NeuralNode.from_config(config)
11.2 Environment Variables
# API Keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
# Azure
export AZURE_OPENAI_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://..."
# Local Models
export OLLAMA_BASE_URL="http://localhost:11434"
# Cache
export NEURALNODE_CACHE_BACKEND="redis"
export REDIS_URL="redis://localhost:6379"
# Security
export NEURALNODE_REQUIRE_CONFIRMATION="true"
export NEURALNODE_SAFE_MODE="true"
# Logging
export NEURALNODE_LOG_LEVEL="INFO"
export NEURALNODE_LOG_FILE="./neuralnode.log"
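A typical pattern for consuming the API-key variables listed above is to prefer an explicitly passed key and fall back to the environment. The sketch below illustrates that lookup. The variable names come from the list above, but the `resolve_api_key` helper and the precedence order are assumptions, not documented library behavior.

```python
import os
from typing import Mapping, Optional

# Variable names taken from the list above; the lookup order is an assumption.
ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
    "azure": "AZURE_OPENAI_KEY",
}


def resolve_api_key(
    provider: str,
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Prefer an explicitly passed key, then fall back to the environment."""
    if explicit:
        return explicit
    variable = ENV_KEYS.get(provider)
    return env.get(variable) if variable else None


print(resolve_api_key("openai", env={"OPENAI_API_KEY": "sk-test"}))
```

Passing `env` as a parameter keeps the function easy to test without touching the real process environment.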
11.3 Performance Tuning
# Memory optimization
ai = nn.NeuralNode(
provider="local",
context_length=4096,
batch_size=1, # Reduce for low memory
gpu_layers=35 # Offload to GPU
)
# CPU optimization
import os
os.environ["OMP_NUM_THREADS"] = "8"
os.environ["OPENBLAS_NUM_THREADS"] = "8"
# Async for high throughput
import asyncio
async def batch_process(prompts):
    tasks = [ai.achat(p) for p in prompts]
    return await asyncio.gather(*tasks)
12. Plugins / Extensions System
12.1 Creating a Custom Provider
from neuralnode.providers import BaseProvider, register_provider
@register_provider("my_provider")
class MyProvider(BaseProvider):
    """Custom LLM provider."""

    def __init__(self, config):
        self.api_key = config.get("api_key")
        self.base_url = config.get("base_url")

    def chat(self, message, **kwargs):
        # Implementation
        return ChatResponse(text="...")

    def embed(self, text):
        # Implementation
        return [0.1, 0.2, 0.3]

    def is_available(self):
        return True
# Usage
ai = nn.NeuralNode(provider="my_provider", api_key="...")
12.2 Creating Custom Tools
import sqlite3

from neuralnode.tools import Tool

class DatabaseTool(Tool):
    """Custom database tool."""

    def __init__(self, connection_string):
        super().__init__(
            name="query_database",
            description="Query SQL database",
            parameters={
                "query": {
                    "type": "string",
                    "description": "SQL query"
                }
            }
        )
        # Assumes SQLite; swap in your own driver's connect() for other databases
        self.db = sqlite3.connect(connection_string)

    def execute(self, query: str) -> str:
        result = self.db.execute(query)
        return str(result.fetchall())

# Register
def register_plugin():
    return {
        "tools": [DatabaseTool],
        "providers": [],
        "hooks": {}
    }
12.3 Plugin API
# neuralnode/plugins/my_plugin/__init__.py
from neuralnode.plugins import Plugin
class MyPlugin(Plugin):
    name = "my_plugin"
    version = "1.0.0"

    def setup(self, app):
        """Called when plugin is loaded."""
        app.add_tool(MyTool())
        app.add_hook("pre_chat", self.preprocess)

    def teardown(self):
        """Called when plugin is unloaded."""
        pass
# Loading plugins
import neuralnode as nn
nn.load_plugin("my_plugin")
nn.load_plugins_from_dir("./plugins")
12.4 Integration Examples
Slack Integration
from neuralnode.integrations import SlackBot
bot = SlackBot(agent=agent, token="xoxb-...")
bot.start()
Discord Integration
from neuralnode.integrations import DiscordBot
bot = DiscordBot(agent=agent, token="...")
bot.start()
13. Performance Optimization
13.1 Quantization
# 4-bit quantization (about 75% smaller than FP16 weights)
from neuralnode.local import ModelQuantizer
quantizer = ModelQuantizer()
quantizer.quantize(
"meta-llama/Llama-2-7b",
method="Q4_K_M", # 4-bit with medium quality
output_path="./llama-7b-q4.gguf"
)
# 8-bit quantization (about 50% smaller than FP16 weights, better quality)
quantizer.quantize(
"meta-llama/Llama-2-7b",
method="Q8_0",
output_path="./llama-7b-q8.gguf"
)
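The size reductions quoted above follow from simple arithmetic on bits per weight. The sketch below works this out for a nominal 7B-parameter model. The helper names are hypothetical, and the figures are approximations: GGUF formats such as Q4_K_M store scales and mix precisions, so they use somewhat more than their nominal bit width and real files come out larger.

```python
def model_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def reduction_vs(baseline_bits: float, bits_per_weight: float) -> float:
    """Fractional size reduction relative to a baseline precision."""
    return 1 - bits_per_weight / baseline_bits


params = 7e9  # nominal 7B parameters
for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {model_size_gb(params, bits):.1f} GB, "
          f"{reduction_vs(16, bits):.0%} smaller than FP16")
```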
13.2 Model Caching
from neuralnode.utils import DiskCache, RedisCache
# Disk cache
cache = DiskCache(
cache_dir="./cache",
max_size_gb=10,
ttl=86400 # 24 hours
)
# Redis cache (for distributed systems)
cache = RedisCache(
host="localhost",
port=6379,
db=0
)
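# Use with LLM
import hashlib
ai = nn.NeuralNode(
    provider="openai",
    cache=cache,
    # Built-in hash() of a str is salted per process, so use a stable digest for persistent caches
    cache_key_generator=lambda msg: hashlib.sha256(msg.encode("utf-8")).hexdigest()
)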
13.3 Memory Optimization
# Gradient checkpointing for training
from neuralnode.training import FineTuner
tuner = FineTuner(
model="meta-llama/Llama-2-7b",
gradient_checkpointing=True,
fp16=True
)
# Model sharding for inference
from neuralnode.distributed import ModelSharding
sharder = ModelSharding()
sharder.distribute(
    model="meta-llama/Llama-2-70b",
devices=["cuda:0", "cuda:1", "cuda:2", "cuda:3"]
)
13.4 Multi-threading
import concurrent.futures

# Parallel tool execution
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    futures = [
        executor.submit(tool.execute, query)
        for tool, query in tasks
    ]
    results = [f.result() for f in futures]

# Async I/O
import asyncio

async def batch_inference(prompts):
    semaphore = asyncio.Semaphore(10) # Limit concurrent requests

    async def bounded_chat(prompt):
        async with semaphore:
            return await ai.achat(prompt)

    return await asyncio.gather(*[
        bounded_chat(p) for p in prompts
    ])
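The semaphore pattern above can be tried without any provider by swapping the LLM call for a stand-in coroutine. In the sketch below, `fake_achat` and the `tracker` dictionary are hypothetical test scaffolding that records how many calls are in flight at once.

```python
import asyncio
from typing import List


async def fake_achat(prompt: str, tracker: dict) -> str:
    """Stand-in for an async LLM call; records how many run at once."""
    tracker["active"] += 1
    tracker["peak"] = max(tracker["peak"], tracker["active"])
    await asyncio.sleep(0.01)  # simulate network latency
    tracker["active"] -= 1
    return prompt.upper()


async def bounded_batch(prompts: List[str], limit: int, tracker: dict) -> List[str]:
    """Run all prompts, allowing at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def one(prompt: str) -> str:
        async with semaphore:
            return await fake_achat(prompt, tracker)

    # gather returns results in input order, regardless of completion order.
    return await asyncio.gather(*(one(p) for p in prompts))


tracker = {"active": 0, "peak": 0}
results = asyncio.run(bounded_batch([f"p{i}" for i in range(10)], 3, tracker))
print(results[:3], "peak concurrency:", tracker["peak"])
```

The limit is a throughput and rate-limit trade-off: a larger value finishes sooner but sends more simultaneous requests to the provider.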
13.5 GPU Acceleration
# GPU layers for local models
ai = nn.NeuralNode(
provider="local",
model_path="./llama-7b.gguf",
gpu_layers=35, # Offload 35 layers to GPU
n_gpu_layers=35,
main_gpu=0
)
# Multi-GPU training
from neuralnode.training import FineTuner
tuner = FineTuner(
model="meta-llama/Llama-2-7b",
device_map="auto", # Automatically distribute across GPUs
torch_dtype="float16"
)
14. Use Cases
14.1 Chatbots
# Customer support chatbot
from neuralnode import Agent
from neuralnode.rag import RAG
# Load knowledge base
rag = RAG.from_directory("./knowledge_base")
# Create chatbot
chatbot = Agent(
llm=ai,
tools=[rag.as_tool()],
system_prompt="You are a helpful customer support agent."
)
# Deploy
from neuralnode.integrations import TelegramAgent
telegram_bot = TelegramAgent(agent=chatbot, token="...")
telegram_bot.start()
14.2 Voice Assistants
from neuralnode.tools import SpeechToText, TextToSpeech
from neuralnode.chains import MultiModalChain
# Voice pipeline
stt = SpeechToText()
tts = TextToSpeech()
# Process voice command
audio_input = "command.wav"
text = stt.transcribe(audio_input)
# Get response
response = agent.run(text)
# Speak response
tts.speak(response, output="response.wav")
14.3 Computer Vision
from neuralnode.tools import VisionProcessor
vision = VisionProcessor(llm=ai)
# Analyze image
description = vision.describe_image("photo.jpg")
objects = vision.detect_objects("photo.jpg")
text = vision.extract_text("document.jpg")
# Multi-modal query
result = agent.run("""
Look at this image and tell me:
1. What objects do you see?
2. Is there any text?
3. What is the mood/atmosphere?
""", context={"image": "photo.jpg"})
14.4 NLP Systems
# Text classification
from neuralnode.nlp import Classifier
classifier = Classifier(llm=ai, labels=["positive", "negative", "neutral"])
sentiment = classifier.predict("This product is amazing!")
# Named Entity Recognition
from neuralnode.nlp import NER
ner = NER(llm=ai)
entities = ner.extract("Apple Inc. is located in Cupertino, California.")
# Result: [{"text": "Apple Inc.", "type": "ORG"}, ...]
# Text Summarization
from neuralnode.nlp import Summarizer
summarizer = Summarizer(llm=ai)
summary = summarizer.summarize(long_document, max_length=100)
14.5 Local AI Agents
# Fully offline agent
ai = nn.NeuralNode(
provider="ollama",
model="llama3:70b"
)
agent = Agent(
llm=ai,
tools=[
FileManager(),
ProcessManager(),
CodeRunner()
],
memory=AdvancedMemorySystem()
)
# Agent can control your computer completely offline
agent.run("Find all Python files in my project and list their dependencies")
14.6 Enterprise Solutions
# Multi-agent system for enterprise
from neuralnode.agents import AgentOrchestrator
orchestrator = AgentOrchestrator()
# Create specialized agents
researcher = orchestrator.create_agent(
name="Researcher",
role="research",
tools=[WebSearch(), DocumentLoader()]
)
analyst = orchestrator.create_agent(
name="Analyst",
role="analysis",
tools=[Calculator(), DataProcessor()]
)
writer = orchestrator.create_agent(
name="Writer",
role="content",
tools=[]
)
# Execute workflow
result = orchestrator.execute_workflow(
[researcher, analyst, writer],
task="Create a market analysis report"
)
15. Comparison
15.1 NeuralNode vs TensorFlow
| Feature | NeuralNode | TensorFlow |
|---|---|---|
| Type | LLM Framework | Deep Learning Framework |
| Focus | AI Agents & LLMs | General ML & Neural Networks |
| Ease of Use | High-level API | Low to mid-level |
| Local LLMs | Native support | Via conversion |
| Agent System | Built-in | Not available |
| Tool Integration | Native | Manual implementation |
| Use Case | Conversational AI, Agents | Research, Custom models |
15.2 NeuralNode vs PyTorch
| Feature | NeuralNode | PyTorch |
|---|---|---|
| Abstraction | High-level | Low-level |
| LLM Focus | Yes | No (general purpose) |
| Training | Simplified RLHF, LoRA | Full control |
| Deployment | Built-in tools | Manual setup |
| Learning Curve | Gentle | Steep |
| Flexibility | Moderate | Very High |
15.3 NeuralNode vs ONNX Runtime
| Feature | NeuralNode | ONNX Runtime |
|---|---|---|
| Purpose | AI Agent Framework | Model Inference Engine |
| Scope | End-to-end solutions | Inference optimization |
| LLM Support | Native | Via conversion |
| Tools | Built-in ecosystem | None |
| Memory | Advanced systems | Basic management |
| Use Case | Production agents | Optimized inference |
15.4 NeuralNode vs LangChain
| Feature | NeuralNode | LangChain |
|---|---|---|
| Design | Monolithic, integrated | Modular, composable |
| Complexity | Lower | Higher |
| Local LLMs | First-class | Via extensions |
| Security | Built-in sandbox | Manual implementation |
| Performance | Optimized defaults | Requires tuning |
| Documentation | Comprehensive | Extensive |
| Community | Growing | Larger |
16. Troubleshooting
16.1 Installation Errors
Problem: ModuleNotFoundError: No module named 'neuralnode'
Solution:
pip install --upgrade neuralnode
# Or for development
pip install -e .
Problem: Cannot install package
Solution:
# Check Python version (requires 3.8+)
python --version
# Use virtual environment
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
pip install neuralnode
16.2 GPU Not Detected
Problem: CUDA not available
Solution:
import torch
print(torch.cuda.is_available())
print(torch.version.cuda)
# Install CUDA-enabled PyTorch (run this in a shell, not in Python)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
Problem: Out of memory on GPU
Solution:
# Reduce GPU layers
ai = nn.NeuralNode(
provider="local",
gpu_layers=20, # Reduce from 35
n_batch=512 # Smaller batch size
)
# Or use CPU
ai = nn.NeuralNode(provider="local", gpu_layers=0)
16.3 Model Loading Issues
Problem: Model not found
Solution:
# Check model path
import os
print(os.path.exists("./model.gguf"))
# Use absolute path
ai = nn.NeuralNode(
provider="local",
model_path=os.path.abspath("./model.gguf")
)
Problem: GGUF format not recognized
Solution:
# Update llama-cpp-python
pip install --upgrade llama-cpp-python
# Or reinstall with specific version
pip install llama-cpp-python==0.2.20
16.4 Memory Crashes
Problem: Segmentation fault or Killed
Solution:
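# Limit memory usage (Unix only; the resource module is not available on Windows)
import resource
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (8 * 1024**3, hard))  # 8 GiB soft limit, hard limit unchanged
# Use quantized model
ai = nn.NeuralNode(
    provider="local",
    model_path="./model-q4.gguf",  # 4-bit instead of 16-bit
    n_ctx=2048,  # Reduce context
)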
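A rough calculation shows why quantization helps. The sketch below estimates memory for the weights alone as parameters × bits per weight ÷ 8; it ignores the KV cache, activations, and the per-block overhead that real quantization formats add. The 7-billion-parameter figure is an illustrative assumption, not a specific model.

```python
def estimate_weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# A hypothetical 7-billion-parameter model:
fp16 = estimate_weight_memory_gb(7e9, 16)  # 14.0
q4 = estimate_weight_memory_gb(7e9, 4)     # 3.5
print(f"16-bit: {fp16:.1f} GB, 4-bit: {q4:.1f} GB")
```

Actual usage will be higher than these figures, so leave headroom when comparing them with available RAM or VRAM.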
Problem: Context length exceeded
Solution:
# Truncate input
from neuralnode.utils import truncate_text
truncated = truncate_text(long_text, max_tokens=3000)
response = ai.chat(truncated)
# Or use context compression
from neuralnode.memory import SlidingWindowMemory
memory = SlidingWindowMemory(max_messages=10)
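The idea behind a sliding window can be sketched without NeuralNode: keep only the most recent messages and let older ones fall off. This is an illustration of the technique, not the implementation of `SlidingWindowMemory`; the `WindowBuffer` class and its methods are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List


class WindowBuffer:
    """Keep only the most recent `max_messages` chat messages."""

    def __init__(self, max_messages: int = 10) -> None:
        # A deque with maxlen drops the oldest entry automatically when full.
        self._messages: Deque[Dict[str, str]] = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        self._messages.append({"role": role, "content": content})

    def history(self) -> List[Dict[str, str]]:
        return list(self._messages)
```

Because the window counts messages rather than tokens, a few very long messages can still exceed the context limit, so truncation may be needed as well.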
16.5 API Errors
Problem: Rate limit exceeded
Solution:
# Enable caching
ai = nn.NeuralNode(
    provider="openai",
    cache=True,
    cache_similarity_threshold=0.95,
)
# Add retry with exponential backoff
from neuralnode.utils import with_retry
@with_retry(max_attempts=5, backoff=2)
def chat_with_retry(prompt):
    return ai.chat(prompt)
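For readers who want to see what exponential backoff does, here is a generic standard-library sketch of the technique. It is not the implementation of `with_retry`; the name `retry_with_backoff` and all of its parameters are hypothetical.

```python
import functools
import time
from typing import Callable, Tuple, Type


def retry_with_backoff(
    max_attempts: int = 5,
    base_delay: float = 1.0,
    factor: float = 2.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
):
    """Retry the wrapped function, multiplying the delay by `factor` each time."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    sleep(delay)
                    delay *= factor

        return wrapper

    return decorator
```

With the defaults, the waits between attempts are 1, 2, 4, and 8 seconds. In practice it is better to pass the provider's specific rate-limit exception in `exceptions` than to retry on every error.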
Problem: API key invalid
Solution:
# Shell: set the environment variable
export OPENAI_API_KEY="sk-..."
# Python: or pass the key directly
ai = nn.NeuralNode(
    provider="openai",
    api_key="sk-...",
)
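An "invalid key" error is sometimes caused by an unset variable, or by stray quotes and whitespace picked up from a shell profile or `.env` file. The sketch below checks for both before the key reaches the provider; `get_api_key` is a hypothetical helper name.

```python
import os


def get_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Read an API key from the environment, stripping stray whitespace and quotes."""
    raw = os.environ.get(env_var, "")
    key = raw.strip().strip("\"'")
    if not key:
        raise RuntimeError(f"{env_var} is not set or is empty")
    return key
```

The result can then be passed explicitly, for example `api_key=get_api_key()`, so a missing key fails early with a clear message.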
16.6 Debug Mode
# Enable detailed logging
import logging
logging.basicConfig(level=logging.DEBUG)
# Check tool execution
agent = Agent(llm=ai, tools=[...], debug=True)
# Monitor memory
from neuralnode.utils import MemoryMonitor
monitor = MemoryMonitor()
monitor.start()
result = agent.run("task")
monitor.stop()
print(monitor.report())
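If `MemoryMonitor` is unavailable, Python's built-in `tracemalloc` module gives a comparable view of Python-level allocations. The sketch below is a standard-library alternative, not a description of how `MemoryMonitor` works; `track_python_memory` is a hypothetical helper name. Note that `tracemalloc` does not see memory allocated by native code such as llama.cpp.

```python
import tracemalloc
from contextlib import contextmanager


@contextmanager
def track_python_memory(stats: dict):
    """Record current and peak Python-level allocations (in bytes) into `stats`."""
    tracemalloc.start()
    try:
        yield stats
    finally:
        stats["current"], stats["peak"] = tracemalloc.get_traced_memory()
        tracemalloc.stop()


stats = {}
with track_python_memory(stats):
    data = [bytes(1024) for _ in range(1000)]  # stand-in for agent.run("task")
print(f"peak: {stats['peak'] / 1024:.0f} KiB")
```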
17. Roadmap
17.1 Version 1.0 (Current)
Core Features:
- Unified LLM interface
- Basic agent system
- Tool integration
- Memory systems
- RAG support
- Local model support
Security:
- Basic sandboxing
- Human-in-the-loop
- Privacy mode
17.2 Version 1.1 (Q2 2024)
- Advanced ReAct pattern
- Multi-agent orchestration
- Workflow builder
- Telegram/Discord integrations
- Enhanced caching
17.3 Version 1.2 (Q3 2024)
- Distributed inference
- Federated learning
- RLHF pipeline
- Model compression
- Quantization GUI
17.4 Version 1.3 (Q4 2024)
- Mobile app builder
- Desktop GUI
- Cloud deployment tools
- Auto-agent capabilities
- Self-healing system
17.5 Version 2.0 (2025)
- Multi-modal chain improvements
- Local LLM hub enhancements
- Advanced observability
- Plugin marketplace
- Enterprise features
17.6 Future Plans
Research Directions:
- Constitutional AI integration
- Advanced reasoning architectures
- Multi-modal foundation models
- Neuro-symbolic AI
Infrastructure:
- Kubernetes operator
- Serverless deployment
- Edge computing support
- Real-time streaming
Integrations:
- Slack, Teams, Discord
- Notion, Confluence
- Jira, Trello
- AWS, GCP, Azure services
18. Contributing Guide
18.1 Getting Started
- Fork the repository
- Clone your fork:
git clone https://github.com/yourusername/neuralnode.git
cd neuralnode
- Create virtual environment:
python -m venv venv
source venv/bin/activate
- Install dependencies:
pip install -e ".[dev]"
18.2 Project Structure
neuralnode/
├── src/
│   └── neuralnode/
│       ├── __init__.py
│       ├── core.py              # Core NeuralNode class
│       ├── agent/
│       │   ├── __init__.py
│       │   ├── base.py          # Base agent
│       │   ├── react.py         # ReAct agent
│       │   └── orchestrator.py  # Multi-agent
│       ├── providers/           # LLM providers
│       ├── tools/               # Built-in tools
│       ├── memory/              # Memory systems
│       ├── rag/                 # RAG components
│       ├── training/            # Training modules
│       ├── distributed/         # Distributed inference
│       └── utils/               # Utilities
├── tests/
├── docs/
├── examples/
└── scripts/
18.3 Coding Standards
Style Guide:
- Follow PEP 8
- Use type hints
- Document with docstrings
- Maximum line length: 100
Example:
from typing import Any, Dict, List
def process_data(
    input_data: List[Dict[str, Any]],
    threshold: float = 0.5,
) -> List[Dict[str, Any]]:
    """
    Process input data with threshold filtering.
    Args:
        input_data: List of data dictionaries
        threshold: Minimum confidence threshold
    Returns:
        Filtered list of data
    Raises:
        ValueError: If threshold is not between 0 and 1
    """
    if not 0 <= threshold <= 1:
        raise ValueError("Threshold must be between 0 and 1")
    return [item for item in input_data if item["confidence"] >= threshold]
18.4 Testing
# Run all tests
pytest
# Run specific test
pytest tests/test_agents.py
# With coverage
pytest --cov=neuralnode --cov-report=html
# Run linting
ruff check src/
black --check src/
mypy src/
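As an illustration of what a test file might contain, here is a sketch that exercises the `process_data` example from the coding standards. The function is repeated so the file runs on its own, and the test names are hypothetical. The tests use plain assertions so they work under `pytest` without extra imports; with pytest available, `pytest.raises(ValueError)` is the more idiomatic way to write the second test.

```python
from typing import Any, Dict, List


def process_data(
    input_data: List[Dict[str, Any]],
    threshold: float = 0.5,
) -> List[Dict[str, Any]]:
    """Keep only items whose confidence is at least `threshold`."""
    if not 0 <= threshold <= 1:
        raise ValueError("Threshold must be between 0 and 1")
    return [item for item in input_data if item["confidence"] >= threshold]


def test_filters_below_threshold():
    data = [{"id": 1, "confidence": 0.9}, {"id": 2, "confidence": 0.2}]
    assert process_data(data, threshold=0.5) == [{"id": 1, "confidence": 0.9}]


def test_rejects_invalid_threshold():
    try:
        process_data([], threshold=1.5)
    except ValueError:
        return
    raise AssertionError("expected ValueError for threshold > 1")
```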
18.5 Pull Request Process
- Create feature branch:
git checkout -b feature/my-feature
- Make changes and commit:
git add .
git commit -m "feat(scope): description"
- Push to fork:
git push origin feature/my-feature
- Create Pull Request:
- Fill PR template
- Link related issues
- Request review
- Ensure CI passes
18.6 Commit Message Format
type(scope): description
[optional body]
[optional footer]
Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation
- test: Tests
- refactor: Code refactoring
- perf: Performance
- chore: Maintenance
Example:
feat(agent): add ReAct pattern support
Implements reasoning and acting cycle with
self-correction capabilities.
Closes #123
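The header line of this format can be checked mechanically, for example in a local commit hook. The sketch below uses a regular expression built from the types listed above; `is_valid_header` is a hypothetical helper, and treating the scope as optional is an assumption rather than a stated project rule.

```python
import re

COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore")
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"  # assumption: the scope is optional
    r": (?P<description>\S.*)$"
)


def is_valid_header(line: str) -> bool:
    """Return True if `line` matches `type(scope): description`."""
    return HEADER_RE.match(line) is not None


if __name__ == "__main__":
    print(is_valid_header("feat(agent): add ReAct pattern support"))
```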
19. License
NeuralNode is released under the MIT License.
MIT License
Copyright (c) 2024 NeuralNode Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
20. Credits & Author
20.1 Author
Assem Sabry
- Creator and Lead Developer
- GitHub: @assemsabry
20.2 Contributors
Thank you to all contributors who have helped make NeuralNode better:
- [List of contributors will be maintained here]
20.3 Acknowledgments
Special thanks to:
- OpenAI, Anthropic, Google - For their groundbreaking LLM research
- Meta AI - For open-sourcing Llama models
- Ollama team - For making local LLMs accessible
- HuggingFace - For the transformers ecosystem
- LangChain - For inspiring agent architectures
- FAISS team - For efficient vector search
- The Python community - For excellent libraries and tools
20.4 References
Papers:
- ReAct: Synergizing Reasoning and Acting in Language Models (Yao et al., 2022)
- LoRA: Low-Rank Adaptation of Large Language Models (Hu et al., 2021)
- RLHF: Training language models to follow instructions with human feedback (Ouyang et al., 2022)
Libraries:
- llama.cpp (ggerganov)
- sentence-transformers (UKPLab)
- chromadb (Chroma)
- faiss (Facebook AI)
Appendix A: Missing Integrations
The following integrations are planned for future releases:
Communication:
- Slack (in progress)
- Discord (in progress)
- WhatsApp Business API
- Microsoft Teams
- Twilio (SMS/Voice)
Storage:
- AWS S3
- Google Cloud Storage
- Azure Blob Storage
- MongoDB
- PostgreSQL
- Redis
Productivity:
- Notion API
- Confluence
- Google Workspace
- Microsoft 365
Project Management:
- Jira
- Trello
- Asana
- Monday.com
DevOps:
- GitHub Actions
- GitLab CI
- Docker
- Kubernetes
Monitoring:
- Prometheus
- Grafana
- Datadog
- New Relic
- Sentry
Appendix B: Changelog
See CHANGELOG.md for detailed version history.
Documentation Version: 1.0.0
Last Updated: 2024
For support: Open an issue on GitHub or contact the maintainers.