Skip to main content

AI-Native Distributed Filesystem Architecture

Project description

Nexus: AI-Native Distributed Filesystem

Test Lint PyPI version License Python 3.11+

Version 0.1.0 | AI Agent Infrastructure Platform

Nexus is a complete AI agent infrastructure platform that combines distributed unified filesystem, self-evolving agent memory, intelligent document processing, and seamless deployment from local development to hosted production—all from a single codebase.

Features

Foundation

  • Distributed Unified Filesystem: Multi-backend abstraction (S3, GDrive, SharePoint, LocalFS)
  • Tiered Storage: Hot/Warm/Cold tiers with automatic lineage tracking
  • Content-Addressable Storage: 30-50% storage savings via deduplication
  • "Everything as a File" Paradigm: Configuration, memory, jobs, and commands as files

Agent Intelligence

  • Self-Evolving Memory: Agent memory with automatic consolidation
  • Memory Versioning: Track knowledge evolution over time
  • Multi-Agent Sharing: Shared memory spaces within tenants
  • Memory Analytics: Effectiveness tracking and insights
  • Prompt Version Control: Track prompt evolution with lineage
  • Training Data Management: Version-controlled datasets with deduplication
  • Prompt Optimization: Multi-candidate testing, execution traces, tradeoff analysis
  • Experiment Tracking: Organize optimization runs, per-example results, regression detection

Content Processing

  • Rich Format Parsing: Extensible parsers (PDF, Excel, CSV, JSON, images)
  • LLM KV Cache Management: 50-90% cost savings on AI queries
  • Semantic Chunking: Better search via intelligent document segmentation
  • MCP Integration: Native Model Context Protocol server
  • Document Type Detection: Automatic routing to appropriate parsers

Operations

  • Resumable Jobs: Checkpointing system survives restarts
  • OAuth Token Management: Auto-refreshing credentials
  • Backend Auto-Mount: Automatic recognition and mounting
  • Resource Management: CPU throttling and rate limiting
  • Work Queue Detection: SQL views for efficient task scheduling and dependency resolution

Deployment Modes

Nexus supports two deployment modes from a single codebase:

Mode Use Case Setup Time Scaling
Local Individual developers, CLI tools, prototyping 60 seconds Single machine (~10GB)
Hosted Teams and production (auto-scales) Sign up Automatic (GB to Petabytes)

Note: Hosted mode automatically scales infrastructure under the hood—you don't choose between "monolithic" or "distributed". Nexus handles that for you based on your usage.

Quick Start: Local Mode

import nexus

# Zero-deployment filesystem with AI features
# Config auto-discovered from nexus.yaml or environment
nx = nexus.connect()

async with nx:
    # Write and read files
    await nx.write("/workspace/data.txt", b"Hello World")
    content = await nx.read("/workspace/data.txt")

    # Semantic search across documents
    results = await nx.semantic_search(
        "/docs/**/*.pdf",
        query="authentication implementation"
    )

    # LLM-powered document reading with KV cache
    answer = await nx.llm_read(
        "/reports/q4.pdf",
        prompt="Summarize key findings",
        model="claude-sonnet-4"
    )

Config file (nexus.yaml):

mode: local
data_dir: ./nexus-data
cache_size_mb: 100
enable_vector_search: true

Quick Start: Hosted Mode

Coming Soon! Sign up for early access at nexus.ai

import nexus

# Connect to Nexus hosted instance
# Infrastructure scales automatically based on your usage
nx = nexus.connect(
    api_key="your-api-key",
    endpoint="https://api.nexus.ai"
)

async with nx:
    # Same API as local mode!
    await nx.write("/workspace/data.txt", b"Hello World")
    content = await nx.read("/workspace/data.txt")

For self-hosted deployments, see the S3-Compatible HTTP Server section below for deployment instructions.

Storage Backends

Nexus supports multiple storage backends through a unified API. All backends use Content-Addressable Storage (CAS) for automatic deduplication.

Local Backend (Default)

Store files on local filesystem:

import nexus

# Auto-detected from config or uses default
nx = nexus.connect()

# Or explicitly configure
nx = nexus.connect(config={
    "backend": "local",
    "data_dir": "./nexus-data"
})

Google Cloud Storage (GCS) Backend

Store files in Google Cloud Storage with local metadata:

import nexus

# Connect with GCS backend
nx = nexus.connect(config={
    "backend": "gcs",
    "gcs_bucket_name": "my-nexus-bucket",
    "gcs_project_id": "my-gcp-project",  # Optional
    "gcs_credentials_path": "/path/to/credentials.json",  # Optional
})

Authentication Methods:

  1. Service Account Key: Provide gcs_credentials_path
  2. Application Default Credentials (if not provided):
    • GOOGLE_APPLICATION_CREDENTIALS environment variable
    • gcloud auth application-default login credentials
    • GCE/Cloud Run service account (when running on GCP)

Using Config File (nexus.yaml):

backend: gcs
gcs_bucket_name: my-nexus-bucket
gcs_project_id: my-gcp-project  # Optional
# gcs_credentials_path: /path/to/credentials.json  # Optional

Using Environment Variables:

export NEXUS_BACKEND=gcs
export NEXUS_GCS_BUCKET_NAME=my-nexus-bucket
export NEXUS_GCS_PROJECT_ID=my-gcp-project  # Optional
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json  # Optional

CLI Usage with GCS:

# Write file to GCS
nexus write /workspace/data.txt "Hello GCS!" \
  --backend=gcs \
  --gcs-bucket=my-nexus-bucket

# Or use config file (simpler!)
nexus write /workspace/data.txt "Hello GCS!" --config=nexus.yaml

Advanced: Direct Backend API

For advanced use cases, instantiate backends directly:

from nexus import NexusFS, LocalBackend, GCSBackend

# Local backend
nx_local = NexusFS(
    backend=LocalBackend("/path/to/data"),
    db_path="./metadata.db"
)

# GCS backend
nx_gcs = NexusFS(
    backend=GCSBackend(
        bucket_name="my-bucket",
        project_id="my-project",
        credentials_path="/path/to/creds.json"
    ),
    db_path="./gcs-metadata.db"
)

# Same API for both!
nx_local.write("/file.txt", b"data")
nx_gcs.write("/file.txt", b"data")

Backend Comparison

Feature Local Backend GCS Backend
Content Storage Local filesystem Google Cloud Storage
Metadata Storage Local SQLite Local SQLite
Deduplication ✅ CAS (30-50% savings) ✅ CAS (30-50% savings)
Multi-machine Access ❌ Single machine ✅ Shared across machines
Durability Single disk 99.999999999% (11 nines)
Latency <1ms (local) 10-50ms (network)
Cost Free (local disk) GCS storage pricing
Use Case Development, single machine Teams, production, backup

Coming Soon

  • Amazon S3 Backend (v0.7.0)
  • Azure Blob Storage (v0.7.0)
  • Google Drive (v0.7.0)
  • SharePoint (v0.7.0)

Installation

Using pip (Recommended)

# Install core Nexus
pip install nexus-ai-fs

# Install with FUSE support
pip install nexus-ai-fs[fuse]

# Install with PostgreSQL support
pip install nexus-ai-fs[postgres]

# Install everything
pip install nexus-ai-fs[all]  # All features (FUSE + PostgreSQL + future plugins)

# Verify installation
nexus --version

Installing First-Party Plugins (Local Development)

First-party plugins are in development and not yet published to PyPI. Install from source:

# Clone repository
git clone https://github.com/nexi-lab/nexus.git
cd nexus

# Install Nexus
pip install -e .

# Install plugins from local source
pip install -e ./nexus-plugin-anthropic      # Claude Skills API
pip install -e ./nexus-plugin-skill-seekers  # Doc scraper

# Verify plugins
nexus plugins list

See PLUGIN_INSTALLATION.md for detailed instructions.

From Source (Development)

# Clone the repository
git clone https://github.com/nexi-lab/nexus.git
cd nexus

# Install using uv (recommended for faster installs)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"

# Or using pip
pip install -e ".[dev]"

Development Setup

# Install development dependencies
uv pip install -e ".[dev,test]"

# Run tests
pytest

# Run type checking
mypy src/nexus

# Format code
ruff format .

# Lint
ruff check .

CLI Usage

Nexus provides a beautiful command-line interface for all file operations. After installation, the nexus command will be available.

Quick Start

# Initialize a new workspace
nexus init ./my-workspace

# Write a file
nexus write /workspace/hello.txt "Hello, Nexus!"

# Read a file
nexus cat /workspace/hello.txt

# List files
nexus ls /workspace
nexus ls /workspace --recursive
nexus ls /workspace --long  # Detailed view with metadata

Available Commands

File Operations

# Write content to a file
nexus write /path/to/file.txt "content"
echo "content" | nexus write /path/to/file.txt --input -

# Display file contents (with syntax highlighting)
nexus cat /workspace/code.py

# Copy files
nexus cp /source.txt /dest.txt

# Delete files
nexus rm /workspace/old-file.txt
nexus rm /workspace/old-file.txt --force  # Skip confirmation

# Show file information
nexus info /workspace/data.txt

Directory Operations

# Create directory
nexus mkdir /workspace/data
nexus mkdir /workspace/deep/nested/dir --parents

# Remove directory
nexus rmdir /workspace/data
nexus rmdir /workspace/data --recursive --force

File Discovery

# List files
nexus ls /workspace
nexus ls /workspace --recursive
nexus ls /workspace --long  # Show size, modified time, etag

# Find files by pattern (glob)
nexus glob "**/*.py"  # All Python files recursively
nexus glob "*.txt" --path /workspace  # Text files in workspace
nexus glob "test_*.py"  # Test files

# Search file contents (grep)
nexus grep "TODO"  # Find all TODO comments
nexus grep "def \w+" --file-pattern "**/*.py"  # Find function definitions
nexus grep "error" --ignore-case  # Case-insensitive search
nexus grep "TODO" --max-results 50  # Limit results

# Search modes (v0.2.0+)
nexus grep "revenue" --file-pattern "**/*.pdf"  # Auto mode: tries parsed first
nexus grep "revenue" --file-pattern "**/*.pdf" --search-mode=parsed  # Only parsed content
nexus grep "TODO" --search-mode=raw  # Only raw text (skip parsing)

# Result shows source type
# Match: TODO (parsed) ← from parsed PDF
# Match: TODO (raw) ← from source code

File Permissions (v0.3.0)

# Change file permissions
nexus chmod 755 /workspace/script.sh
nexus chmod rw-r--r-- /workspace/data.txt

# Change file owner and group
nexus chown alice /workspace/file.txt
nexus chgrp developers /workspace/code/

# View ACL entries
nexus getfacl /workspace/file.txt

# Manage ACL entries
nexus setfacl user:alice:rw- /workspace/file.txt
nexus setfacl group:developers:r-x /workspace/code/
nexus setfacl deny:user:bob /workspace/secret.txt
nexus setfacl user:alice:rwx /workspace/file.txt --remove

Supported Formats:

  • Octal: 755, 0o644, 0755
  • Symbolic: rwxr-xr-x, rw-r--r--
  • ACL Entries: user:<name>:rwx, group:<name>:r-x, deny:user:<name>

ReBAC - Relationship-Based Access Control (v0.3.0)

Nexus implements Zanzibar-style relationship-based authorization for team-based permissions, hierarchical access, and dynamic permission inheritance.

# Create relationship tuples
nexus rebac create agent alice member-of group eng-team
nexus rebac create group eng-team owner-of file project-docs
nexus rebac create file folder-parent parent-of file folder-child

# Check permissions (with graph traversal)
nexus rebac check agent alice member-of group eng-team  # Direct check
nexus rebac check agent alice owner-of file project-docs  # Inherited via group

# Find all subjects with a permission
nexus rebac expand owner-of file project-docs  # Returns: alice (via eng-team)
nexus rebac expand member-of group eng-team    # Returns: alice, bob, ...

# Delete relationships
nexus rebac delete <tuple-id>

# Create temporary access (expires automatically)
nexus rebac create agent alice viewer-of file temp-report \
  --expires "2025-12-31T23:59:59"

ReBAC Features:

  • Relationship Types: member-of, owner-of, viewer-of, editor-of, parent-of
  • Graph Traversal: Recursive permission checking through relationship chains
  • Permission Inheritance: Team ownership, hierarchical folders, group membership
  • Caching: 5-minute TTL with automatic invalidation on changes
  • Expiring Access: Temporary permissions with automatic cleanup
  • Cycle Detection: Prevents infinite loops in relationship graphs

Example Use Cases:

# Team-based file access
nexus rebac create agent alice member-of group engineering
nexus rebac create group engineering owner-of file /projects/backend
# alice now has owner permission on /projects/backend

# Hierarchical folder permissions
nexus rebac create agent bob owner-of file /workspace/parent-folder
nexus rebac create file /workspace/parent-folder parent-of file /workspace/parent-folder/child
# bob automatically has owner permission on child folder

# Temporary collaborator access
nexus rebac create agent charlie viewer-of file /reports/q4.pdf \
  --expires "2025-01-31T23:59:59"
# charlie's access expires automatically on Jan 31, 2025

Work Queue Operations

# Query work items by status
nexus work ready --limit 10  # Get ready work items (high priority first)
nexus work pending  # Get pending work items
nexus work blocked  # Get blocked work items (with dependency info)
nexus work in-progress  # Get currently processing items

# View aggregate statistics
nexus work status  # Show counts for all work queues

# Output as JSON (for scripting)
nexus work ready --json
nexus work status --json

Note: Work items are files with special metadata (status, priority, depends_on, worker_id). See docs/SQL_VIEWS_FOR_WORK_DETECTION.md for details on setting up work queues.

Version Tracking & History (v0.3.5)

Nexus provides CAS-backed version tracking for all files and skills. Every write operation automatically preserves the previous version with zero storage overhead through content deduplication.

# View version history for a file
nexus versions history /workspace/document.txt
# Output:
# ┌─────────┬────────┬─────────────────────┬────────────┬──────────┬───────────────┐
# │ Version │ Size   │ Created At          │ Created By │ Source   │ Change Reason │
# ├─────────┼────────┼─────────────────────┼────────────┼──────────┼───────────────┤
# │ 3       │ 1.2 KB │ 2025-01-20 14:30:00 │ alice      │ original │ -             │
# │ 2       │ 1.0 KB │ 2025-01-20 10:15:00 │ alice      │ original │ -             │
# │ 1       │ 0.8 KB │ 2025-01-19 09:00:00 │ alice      │ original │ -             │
# └─────────┴────────┴─────────────────────┴────────────┴──────────┴───────────────┘

# Retrieve specific version content
nexus versions get /workspace/document.txt --version 1
# Output: <content of version 1>

# Save to file
nexus versions get /workspace/document.txt --version 2 -o old_version.txt
# ✓ Wrote version 2 to: old_version.txt

# Compare two versions (metadata)
nexus versions diff /workspace/document.txt --v1 1 --v2 3 --mode metadata
# Output:
# Size: 819 bytes → 1,234 bytes
# Content changed: True
# Hash v1: abc123...
# Hash v3: def456...

# Compare two versions (content)
nexus versions diff /workspace/document.txt --v1 1 --v2 3
# Output (unified diff format):
# --- /workspace/document.txt (v1)
# +++ /workspace/document.txt (v3)
# @@ -1,3 +1,5 @@
#  First line
# -Second line
# +Second line modified
# +Third line added
#  Last line

# Rollback to previous version
nexus versions rollback /workspace/document.txt --version 2
# ✓ Rolled back /workspace/document.txt to version 2
# New version: 4 (rollback creates new version, no data loss!)

# Skip confirmation
nexus versions rollback /workspace/document.txt --version 1 --yes

Version Tracking Features:

  • Automatic Version Creation: Every write operation preserves the previous version
  • Zero Storage Overhead: CAS deduplication means identical content is stored only once
  • Complete History: Never lose data - every version is preserved forever
  • Rollback Support: Revert to any previous version (creates new version, no destructive changes)
  • Content Diff: Compare versions with unified diff output
  • Skills Versioning: Track changes to SKILL.md files over time
  • Time Travel: Retrieve exact state from any point in time

Python SDK:

import nexus

nx = nexus.connect()

# Version history is automatic on every write
nx.write("/workspace/doc.txt", b"Version 1")
nx.write("/workspace/doc.txt", b"Version 2")
nx.write("/workspace/doc.txt", b"Version 3")

# List all versions
versions = nx.list_versions("/workspace/doc.txt")
print(f"Total versions: {len(versions)}")  # 3

# Get specific version
v1_content = nx.get_version("/workspace/doc.txt", version=1)
print(f"V1: {v1_content}")  # b"Version 1"

# Compare versions
diff = nx.diff_versions("/workspace/doc.txt", v1=1, v2=3, mode="content")
print(diff)  # Unified diff output

# Rollback to v2
nx.rollback("/workspace/doc.txt", version=2)
# Current version is now 4, pointing to v2's content

Examples

Initialize and populate a workspace:

# Create workspace
nexus init ./my-project

# Create structure
nexus mkdir /workspace/src --data-dir ./my-project/nexus-data
nexus mkdir /workspace/tests --data-dir ./my-project/nexus-data

# Add files
echo "print('Hello World')" | nexus write /workspace/src/main.py --input - \
  --data-dir ./my-project/nexus-data

# List everything
nexus ls / --recursive --long --data-dir ./my-project/nexus-data

Find and analyze code:

# Find all Python files
nexus glob "**/*.py"

# Search for TODO comments
nexus grep "TODO|FIXME" --file-pattern "**/*.py"

# Find all test files
nexus glob "**/test_*.py"

# Search for function definitions
nexus grep "^def \w+\(" --file-pattern "**/*.py"

Work with data:

# Write JSON data
echo '{"name": "test", "value": 42}' | nexus write /data/config.json --input -

# Display with syntax highlighting
nexus cat /data/config.json

# Get file information
nexus info /data/config.json

Global Options

All commands support these global options:

# Use custom config file
nexus ls /workspace --config /path/to/config.yaml

# Override data directory
nexus ls /workspace --data-dir /path/to/nexus-data

# Combine both (config takes precedence)
nexus ls /workspace --config ./my-config.yaml --data-dir ./data

Plugin Management

Nexus has a modular plugin system for external integrations:

# List installed plugins
nexus plugins list

# Get detailed plugin information
nexus plugins info anthropic
nexus plugins info skill-seekers

# Install a plugin
nexus plugins install anthropic
nexus plugins install skill-seekers

# Enable/disable plugins
nexus plugins enable anthropic
nexus plugins disable anthropic

# Uninstall a plugin
nexus plugins uninstall skill-seekers

First-party plugins (local development only - not yet on PyPI):

  • anthropic - Claude Skills API integration (upload/download/manage skills)
  • skill-seekers - Generate skills from documentation websites

Installation:

# Install from local source
pip install -e ./nexus-plugin-anthropic
pip install -e ./nexus-plugin-skill-seekers

Using plugin commands:

# Anthropic plugin commands
nexus anthropic upload-skill my-skill
nexus anthropic list-skills
nexus anthropic import-github canvas-design

# Skill Seekers plugin commands
nexus skill-seekers generate https://react.dev/ --name react-basics
nexus skill-seekers import /path/to/SKILL.md
nexus skill-seekers list

See detailed documentation:

Try plugin examples:

# CLI demo - plugin management commands
./examples/plugin_cli_demo.sh

# SDK demo - programmatic plugin usage
python examples/plugin_sdk_demo.py

Help

Get help for any command:

nexus --help  # Show all commands
nexus ls --help  # Show help for ls command
nexus grep --help  # Show help for grep command
nexus plugins --help  # Show plugin management commands

Remote Nexus Server

Nexus includes a JSON-RPC server that exposes the full NexusFileSystem interface over HTTP, enabling remote filesystem access and FUSE mounts to remote servers.

Quick Start

Method 1: Using the Startup Script (Recommended)

# Navigate to nexus directory
cd /path/to/nexus

# Start with defaults (host: 0.0.0.0, port: 8080, no auth)
./start-server.sh

# Or with custom options
./start-server.sh --host localhost --port 8080 --api-key mysecret

Method 2: Direct Command

# Start the server (optional API key authentication)
nexus serve --host 0.0.0.0 --port 8080 --api-key mysecret

# Use remote filesystem from Python
from nexus import RemoteNexusFS

nx = RemoteNexusFS(
    server_url="http://localhost:8080",
    api_key="mysecret"  # Optional
)

# Same API as local NexusFS!
nx.write("/workspace/hello.txt", b"Hello Remote!")
content = nx.read("/workspace/hello.txt")
files = nx.list("/workspace", recursive=True)

Features

  • Full NFS Interface: All filesystem operations exposed over RPC (read, write, list, glob, grep, mkdir, etc.)
  • JSON-RPC 2.0 Protocol: Standard RPC protocol with proper error handling
  • API Key Authentication: Optional Bearer token authentication for security
  • Backend Agnostic: Works with local and GCS backends
  • FUSE Compatible: Mount remote Nexus servers as local filesystems

Remote Client Usage

from nexus import RemoteNexusFS

# Connect to remote server
nx = RemoteNexusFS(
    server_url="http://your-server:8080",
    api_key="your-api-key"  # Optional
)

# All standard operations work
nx.write("/workspace/data.txt", b"content")
content = nx.read("/workspace/data.txt")
files = nx.list("/workspace", recursive=True)
results = nx.glob("**/*.py")
matches = nx.grep("TODO", file_pattern="*.py")

Server Options

# Start with custom host/port
nexus serve --host 0.0.0.0 --port 8080

# Start with API key authentication
nexus serve --api-key mysecret

# Start with GCS backend
nexus serve --backend=gcs --gcs-bucket=my-bucket --api-key mysecret

# Custom data directory
nexus serve --data-dir /path/to/data

Testing the Server

Once the server is running, verify it's working:

# Health check
curl http://localhost:8080/health
# Expected: {"status": "healthy", "service": "nexus-rpc"}

# Check available methods
curl http://localhost:8080/api/nfs/status
# Expected: {"status": "running", "service": "nexus-rpc", "version": "1.0", "methods": [...]}

# List files (JSON-RPC)
curl -X POST http://localhost:8080/api/nfs/list \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "list",
    "params": {"path": "/", "recursive": false, "details": true},
    "id": 1
  }'

# With API key
curl -X POST http://localhost:8080/api/nfs/list \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer mysecretkey" \
  -d '{"jsonrpc": "2.0", "method": "list", "params": {"path": "/"}, "id": 1}'

Troubleshooting

Port Already in Use:

# Find and kill process using port 8080
lsof -ti:8080 | xargs kill -9

# Or use a different port
nexus serve --port 8081

Module Not Found:

# Activate virtual environment and install
source .venv/bin/activate
pip install -e .

Permission Denied:

# Use a directory you have write access to
nexus serve --data-dir ~/nexus-data

Deploying Nexus Server

Google Cloud Platform (Recommended)

Deploy to GCP with a single command using the automated deployment script:

# Quick start
./deploy-gcp.sh --project-id YOUR-PROJECT-ID --api-key mysecret

# With GCS backend
./deploy-gcp.sh \
  --project-id YOUR-PROJECT-ID \
  --gcs-bucket your-nexus-bucket \
  --api-key mysecret \
  --machine-type e2-standard-2

Features:

  • ✅ Automated VM provisioning (Ubuntu 22.04)
  • ✅ Systemd service with auto-restart
  • ✅ Firewall configuration
  • ✅ GCS backend support
  • ✅ Production-ready setup

See GCP Deployment Guide for complete instructions.

Docker Deployment

Deploy using Docker for consistent environments and easy management:

# Quick start with Docker Compose
cp .env.docker.example .env
# Edit .env with your configuration
docker-compose up -d

# Or run directly
docker build -t nexus-server:latest .
docker run -d \
  --name nexus-server \
  --restart unless-stopped \
  -p 8080:8080 \
  -v nexus-data:/app/data \
  -e NEXUS_API_KEY="your-api-key" \
  nexus-server:latest

# Deploy to GCP with Docker (automated)
./deploy-gcp-docker.sh \
  --project-id your-project-id \
  --api-key mysecret \
  --build-local

Features:

  • ✅ Multi-stage build for optimized image size (~300MB)
  • ✅ Non-root user for security
  • ✅ Health checks and auto-restart
  • ✅ GCS backend support
  • ✅ Docker Compose for easy orchestration

See Docker Deployment Guide for complete instructions.

Deployment Features:

  • Persistent Metadata: SQLite database stored on VM disk at /var/lib/nexus/
  • Content Storage: All file content stored in configured backend (GCS, local, etc.)
  • Content Deduplication: CAS-based storage with 30-50% savings
  • Full NFS API: All operations available remotely

FUSE Mount: Use Standard Unix Tools (v0.2.0)

Mount Nexus to a local path and use any standard Unix tool seamlessly - ls, cat, grep, vim, and more!

Installation

First, install FUSE support:

# Install Nexus with FUSE support
pip install nexus-ai-fs[fuse]

# Platform-specific FUSE library:
# macOS: Install macFUSE from https://osxfuse.github.io/
# Linux: sudo apt-get install fuse3  # or equivalent for your distro

Quick Start

# Mount Nexus to local path (smart mode by default)
nexus mount /mnt/nexus

# Now use ANY standard Unix tools!
ls -la /mnt/nexus/workspace/
cat /mnt/nexus/workspace/notes.txt
grep -r "TODO" /mnt/nexus/workspace/
find /mnt/nexus -name "*.py"
vim /mnt/nexus/workspace/code.py
git clone /some/repo /mnt/nexus/repos/myproject

# Unmount when done
nexus unmount /mnt/nexus

Quick Start Examples

Example 1: Default (Explicit Views) - Best for Mixed Workflows

# Mount normally
nexus mount /mnt/nexus

# Binary tools work directly
evince /mnt/nexus/docs/report.pdf     # PDF viewer works ✓

# Add .txt for text operations
cat /mnt/nexus/docs/report.pdf.txt    # Read as text
grep "results" /mnt/nexus/docs/*.pdf.txt

# Virtual views auto-generated
ls /mnt/nexus/docs/
# → report.pdf
# → report.pdf.txt  (virtual)
# → report.pdf.md   (virtual)

Example 2: Auto-Parse - Best for Search-Heavy Workflows

# Mount with auto-parse
nexus mount /mnt/nexus --auto-parse

# grep works directly on PDFs!
grep "results" /mnt/nexus/docs/*.pdf      # No .txt needed! ✓
cat /mnt/nexus/docs/report.pdf            # Returns text ✓

# Search across everything
grep -r "TODO" /mnt/nexus/workspace/      # Searches PDFs, Excel, etc.

# Binary via .raw/ when needed
evince /mnt/nexus/.raw/docs/report.pdf   # For PDF viewer

Example 3: Real-World Script

#!/bin/bash
# Find all PDFs mentioning "invoice"

# Mount in background - command returns immediately!
nexus mount /mnt/nexus --auto-parse --daemon
# (No blocking - script continues immediately)

# Mount is ready - grep works on PDFs!
grep -l "invoice" /mnt/nexus/documents/*.pdf

# Process results
for pdf in $(grep -l "invoice" /mnt/nexus/documents/*.pdf); do
    echo "Found in: $pdf"
    grep -n "invoice" "$pdf" | head -5
done

# Clean up
nexus unmount /mnt/nexus

Remote server example:

#!/bin/bash
# Search PDFs on remote Nexus server

# Mount remote server in background
nexus mount /mnt/nexus \
  --remote-url http://nexus-server:8080 \
  --auto-parse \
  --daemon

# Command returns immediately - daemon process runs in background
# You can now use standard Unix tools on remote filesystem!

# Search across remote PDFs
grep -r "TODO" /mnt/nexus/workspace/ | head -20

# Find large files
find /mnt/nexus -type f -size +10M

# Clean up when done
nexus unmount /mnt/nexus

File Access: Two Modes

Nexus supports two ways to access files - choose what fits your workflow:

1. Explicit Views (Default) - Best for Compatibility

Binary files return binary, use .txt/.md suffixes for parsed content:

nexus mount /mnt/nexus

# Binary files work with native tools
evince /mnt/nexus/docs/report.pdf      # PDF viewer gets binary ✓
libreoffice /mnt/nexus/data/sheet.xlsx # Excel app gets binary ✓

# Add .txt to search/read as text
cat /mnt/nexus/docs/report.pdf.txt     # Returns parsed text
grep "pattern" /mnt/nexus/docs/*.pdf.txt

# Virtual views appear automatically
ls /mnt/nexus/docs/
# → report.pdf
# → report.pdf.txt  (virtual view)
# → report.pdf.md   (virtual view)

When to use: You want both binary tools AND text search to work

2. Auto-Parse Mode - Best for Search/Grep

Binary files return parsed text directly, use .raw/ for binary:

nexus mount /mnt/nexus --auto-parse

# Binary files return text directly - perfect for grep!
cat /mnt/nexus/docs/report.pdf         # Returns parsed text ✓
grep "pattern" /mnt/nexus/docs/*.pdf   # Works directly! ✓
less /mnt/nexus/docs/report.pdf        # Page through text ✓

# Access binary via .raw/ when needed
evince /mnt/nexus/.raw/docs/report.pdf # PDF viewer gets binary

# No .txt/.md suffixes - files return text by default
ls /mnt/nexus/docs/
# → report.pdf  (returns text when read)

When to use: Text search is your primary use case, binary tools are secondary

Mount Modes (Content Parsing)

Control what gets parsed:

# Smart mode (default) - Auto-detect file types
nexus mount /mnt/nexus --mode=smart
# ✅ PDFs, Excel, Word → parsed
# ✅ .py, .txt, .md → pass-through
# ✅ Best for mixed content

# Text mode - Parse everything aggressively
nexus mount /mnt/nexus --mode=text
# ✅ All files parsed to text
# ⚠️  Slower (always parses)

# Binary mode - No parsing at all
nexus mount /mnt/nexus --mode=binary
# ✅ All files return binary
# ❌ grep won't work on PDFs

Comparison Table

Feature Explicit Views (default) Auto-Parse Mode (--auto-parse)
PDF viewers work evince file.pdf ⚠️ evince .raw/file.pdf
grep on PDFs ⚠️ grep *.pdf.txt grep *.pdf
Excel apps work libreoffice file.xlsx ⚠️ libreoffice .raw/file.xlsx
Best for Binary tools + search Text search primary use case
Virtual views .txt, .md suffixes No suffixes needed
Binary access Direct (file.pdf) Via .raw/ directory

Background (Daemon) Mode

Run the mount in the background and return to your shell immediately:

# Mount in background - command returns immediately
nexus mount /mnt/nexus --daemon
# ✓ Mounted Nexus to /mnt/nexus
#
# To unmount:
#   nexus unmount /mnt/nexus
#
# (Shell prompt returns immediately, mount runs in background)

# Mount is active - you can use it immediately
ls /mnt/nexus
cat /mnt/nexus/workspace/file.txt

# Check daemon status
ps aux | grep "nexus mount" | grep -v grep
# jinjingzhou  43097  ... nexus mount /mnt/nexus --daemon

# Later, unmount when done
nexus unmount /mnt/nexus

How it works:

  • Command returns to shell immediately (using double-fork technique)
  • Background daemon process keeps mount active
  • Daemon survives terminal close and persists until unmount
  • Safe to close your terminal - mount stays active

Local Mount:

# Mount local Nexus data in background
nexus mount /mnt/nexus --daemon

Remote Mount:

# Mount remote Nexus server in background
nexus mount /mnt/nexus --remote-url http://your-server:8080 --daemon

# With API key authentication
nexus mount /mnt/nexus \
  --remote-url http://your-server:8080 \
  --api-key your-secret-key \
  --daemon

Performance & Caching (v0.2.0)

FUSE mounts include automatic caching for improved performance. Caching is enabled by default with sensible defaults - no configuration needed for most users.

Default Performance:

  • ✅ Attribute caching (1024 entries, 60s TTL) - Makes ls and stat operations faster
  • ✅ Content caching (100 files) - Speeds up repeated file reads
  • ✅ Parsed content caching (50 files) - Accelerates PDF/Excel text extraction
  • ✅ Automatic cache invalidation on writes/deletes - Always consistent

Advanced: Custom Cache Configuration

For power users with specific performance requirements:

from nexus import connect
from nexus.fuse import mount_nexus

nx = connect(config={"data_dir": "./nexus-data"})

# Custom cache configuration
cache_config = {
    "attr_cache_size": 2048,      # Double the attribute cache (default: 1024)
    "attr_cache_ttl": 120,         # Cache attributes for 2 minutes (default: 60s)
    "content_cache_size": 200,     # Cache 200 files (default: 100)
    "parsed_cache_size": 100,      # Cache 100 parsed files (default: 50)
    "enable_metrics": True         # Track cache hit/miss rates (default: False)
}

fuse = mount_nexus(
    nx,
    "/mnt/nexus",
    mode="smart",
    cache_config=cache_config,
    foreground=False
)

# View cache performance (if metrics enabled)
# Note: Access via fuse.fuse.operations.cache

Cache Configuration Options:

Option Default Description
attr_cache_size 1024 Max number of cached file attribute entries
attr_cache_ttl 60 Time-to-live for attributes in seconds
content_cache_size 100 Max number of cached file contents
parsed_cache_size 50 Max number of cached parsed contents (PDFs, etc.)
enable_metrics False Enable cache hit/miss tracking

When to Tune Cache Settings:

  • Large directory listings: Increase attr_cache_size to 2048+ and attr_cache_ttl to 120+
  • Many small files: Increase content_cache_size to 500+
  • Heavy PDF/Excel use: Increase parsed_cache_size to 200+
  • Performance analysis: Enable enable_metrics to measure cache effectiveness
  • Memory-constrained: Decrease all cache sizes (e.g., 512 / 50 / 25)

Notes:

  • Caches are thread-safe - safe for concurrent access
  • Caches are automatically invalidated on file writes, deletes, and renames
  • Default settings work well for most use cases - tune only if needed

Troubleshooting FUSE Mounts

Check Mount Status

# Check if daemon process is running
ps aux | grep "nexus mount" | grep -v grep

# Check mount points
mount | grep nexus

# List files in mount point (should show files, not empty)
ls -la /mnt/nexus/

Common Issues

Mount appears empty or shows "Transport endpoint is not connected":

# Unmount the stale mount point
nexus unmount /mnt/nexus

# Or force unmount (macOS)
umount -f /mnt/nexus

# Or force unmount (Linux)
fusermount -u /mnt/nexus

# Then remount
nexus mount /mnt/nexus --daemon

Process won't die (stuck in 'D' or 'U' state):

# Find stuck processes
ps aux | grep nexus | grep -E "D|U"

# Force kill
kill -9 <PID>

# If process is still stuck (uninterruptible I/O), try:
# macOS: umount -f /mnt/nexus
# Linux: fusermount -uz /mnt/nexus

# Note: Stuck processes in 'D' state typically resolve after unmount
# If they persist, they'll be cleaned up on system reboot

"Directory not empty" error when mounting:

# Unmount first
nexus unmount /mnt/nexus

# Or remove and recreate the mount point
rm -rf /mnt/nexus && mkdir /mnt/nexus

# Then mount
nexus mount /mnt/nexus --daemon

Permission denied errors:

# Ensure FUSE is installed
# macOS: Install macFUSE from https://osxfuse.github.io/
# Linux: sudo apt-get install fuse3

# Check mount point permissions
ls -ld /mnt/nexus
# Should be owned by your user

# Create mount point with correct permissions
mkdir -p /mnt/nexus
chmod 755 /mnt/nexus

Connection refused (remote mounts):

# Check server is running
curl http://your-server:8080/health

# Test connectivity
ping your-server

# Verify API key (if required)
nexus mount /mnt/nexus \
  --remote-url http://your-server:8080 \
  --api-key your-key \
  --daemon

Multiple mounts to same mount point:

# Check for existing mounts
mount | grep /mnt/nexus

# Unmount all instances
nexus unmount /mnt/nexus

# Kill any lingering processes
pkill -f "nexus mount /mnt/nexus"

# Clean mount and remount
rm -rf /mnt/nexus && mkdir /mnt/nexus
nexus mount /mnt/nexus --daemon

Debug Mode

For detailed debugging output:

# Run in foreground with debug output
nexus mount /mnt/nexus --debug

# This will show all FUSE operations in real-time
# Press Ctrl+C to stop

rclone-style CLI Commands (v0.2.0)

Nexus provides efficient file operations inspired by rclone, with automatic deduplication and progress tracking:

Sync Command

One-way synchronization with hash-based change detection:

# Sync local directory to Nexus (only copies changed files)
nexus sync ./local/dataset/ /workspace/training/

# Preview changes before syncing (dry-run)
nexus sync ./data/ /workspace/backup/ --dry-run

# Mirror sync - delete extra files in destination
nexus sync /workspace/source/ /workspace/dest/ --delete

# Disable hash comparison (force copy all files)
nexus sync ./data/ /workspace/ --no-checksum

Copy Command

Smart copy with automatic deduplication:

# Copy directory recursively (skips identical files)
nexus copy ./local/data/ /workspace/project/ --recursive

# Copy within Nexus (leverages CAS deduplication)
nexus copy /workspace/source/ /workspace/dest/ --recursive

# Copy Nexus to local
nexus copy /workspace/data/ ./backup/ --recursive

# Copy single file
nexus copy /workspace/file.txt /workspace/copy.txt

# Disable checksum verification
nexus copy ./data/ /workspace/ --recursive --no-checksum

Move Command

Efficient file/directory moves with confirmation prompts:

# Move file (rename if possible, copy+delete otherwise)
nexus move /workspace/old.txt /workspace/new.txt

# Move directory without confirmation
nexus move /workspace/old_dir/ /archives/2024/ --force

Tree Command

Visualize directory structure as ASCII tree:

# Show full directory tree
nexus tree /workspace/

# Limit depth to 2 levels
nexus tree /workspace/ -L 2

# Show file sizes
nexus tree /workspace/ --show-size

Size Command

Calculate directory sizes with human-readable output:

# Calculate total size
nexus size /workspace/project/

# Human-readable output (KB, MB, GB)
nexus size /workspace/ --human

# Show top 10 largest files
nexus size /workspace/ --human --details

Features:

  • Hash-based deduplication - Only copies changed files
  • Progress bars - Visual feedback for long operations
  • Dry-run mode - Preview changes before execution
  • Cross-platform paths - Works with local filesystem and Nexus paths
  • Automatic deduplication - Leverages Content-Addressable Storage (CAS)

Performance Comparison

Method Speed Content-Aware Use Case
grep -r /mnt/nexus/ Medium ✅ Yes (via mount) Interactive use
nexus grep "pattern" Fast (DB-backed) ✅ Yes Large-scale search
Standard tools Familiar ✅ Yes (via mount) Day-to-day work

Use Cases

Interactive Development:

# Mount for interactive work
nexus mount /mnt/nexus
vim /mnt/nexus/workspace/code.py
git clone /mnt/nexus/repos/myproject

Bulk Operations:

# Use rclone-style commands for efficiency
nexus sync /local/dataset/ /workspace/training-data/
nexus tree /workspace/ > structure.txt

Automated Workflows:

# Standard Unix tools in scripts
find /mnt/nexus -name "*.pdf" -exec grep -l "invoice" {} \;
rsync -av /mnt/nexus/workspace/ /backup/

Architecture

Agent Workspace Structure

Every agent gets a structured workspace at /workspace/{tenant}/{agent}/:

/workspace/acme-corp/research-agent/
├── .nexus/                          # Nexus metadata (Git-trackable)
│   ├── agent.yaml                   # Agent configuration
│   ├── commands/                    # Custom commands (markdown files)
│   │   ├── analyze-codebase.md
│   │   └── summarize-docs.md
│   ├── jobs/                        # Background job definitions
│   │   └── daily-summary.yaml
│   ├── memory/                      # File-based memory
│   │   ├── project-knowledge.md
│   │   └── recent-tasks.jsonl
│   └── secrets.encrypted            # KMS-encrypted credentials
├── data/                            # Agent's working data
│   ├── inputs/
│   └── outputs/
└── INSTRUCTIONS.md                  # Agent instructions (auto-loaded)

Path Namespace

/
├── workspace/        # Agent scratch space (hot tier, ephemeral)
├── shared/           # Shared tenant data (warm tier, persistent)
├── external/         # Pass-through backends (no content storage)
├── system/           # System metadata (admin-only)
└── archives/         # Cold storage (read-only)

Core Components

File System Operations

import nexus

# Works in both local and hosted modes
# Mode determined by config file or environment
nx = nexus.connect()

async with nx:
    # Basic operations
    await nx.write("/workspace/data.txt", b"content")
    content = await nx.read("/workspace/data.txt")
    await nx.delete("/workspace/data.txt")

    # Batch operations
    files = await nx.list("/workspace/", recursive=True)
    results = await nx.copy_batch(sources, destinations)

    # File discovery
    python_files = await nx.glob("**/*.py")
    todos = await nx.grep(r"TODO:|FIXME:", file_pattern="*.py")

Semantic Search

# Search across documents with vector embeddings
async with nexus.connect() as nx:
    results = await nx.semantic_search(
        path="/docs/",
        query="How does authentication work?",
        limit=10,
        filters={"file_type": "markdown"}
    )

    for result in results:
        print(f"{result.path}:{result.line} - {result.text}")

LLM-Powered Reading

# Read documents with AI, with automatic KV cache
async with nexus.connect() as nx:
    answer = await nx.llm_read(
        path="/reports/q4-2024.pdf",
        prompt="What were the top 3 challenges?",
        model="claude-sonnet-4",
        max_tokens=1000
    )

Agent Memory

# Store and retrieve agent memories
async with nexus.connect() as nx:
    await nx.store_memory(
        content="User prefers TypeScript over JavaScript",
        memory_type="preference",
        tags=["coding", "languages"]
    )

    memories = await nx.search_memories(
        query="programming language preferences",
        limit=5
    )

Prompt Optimization (Coming in v0.9.5)

# Track multiple prompt candidates during optimization
async with nexus.connect() as nx:
    # Start optimization run
    run_id = await nx.start_optimization_run(
        module_name="SearchModule",
        objectives=["accuracy", "latency", "cost"]
    )

    # Store prompt candidates with detailed traces
    for candidate in prompt_variants:
        version_id = await nx.store_prompt_version(
            module_name="SearchModule",
            prompt_template=candidate.template,
            metrics={"accuracy": 0.85, "latency_ms": 450},
            run_id=run_id
        )

        # Store execution traces for debugging
        await nx.store_execution_trace(
            prompt_version_id=version_id,
            inputs=test_inputs,
            outputs=predictions,
            intermediate_steps=reasoning_chain
        )

    # Analyze tradeoffs across candidates
    analysis = await nx.analyze_prompt_tradeoffs(
        run_id=run_id,
        objectives=["accuracy", "latency_ms", "cost_per_query"]
    )

    # Get per-example results to find failure patterns
    failures = await nx.get_failing_examples(
        prompt_version_id=version_id,
        limit=20
    )

Custom Commands

Create /workspace/{tenant}/{agent}/.nexus/commands/semantic-search.md:

---
name: semantic-search
description: Search codebase semantically
allowed-tools: [semantic_read, glob, grep]
required-scopes: [read]
model: sonnet
---

## Your task

Given query: {{query}}

1. Use `glob` to find relevant files by pattern
2. Use `semantic_read` to extract relevant sections
3. Summarize findings with file:line citations

Execute via API:

async with nexus.connect() as nx:
    result = await nx.execute_command(
        "semantic-search",
        context={"query": "authentication implementation"}
    )

Skills System (v0.3.0)

Manage reusable AI agent skills with SKILL.md format, progressive disclosure, lifecycle management, and dependency resolution:

from nexus.skills import SkillRegistry, SkillManager, SkillExporter

# Initialize filesystem
nx = nexus.connect()

# Create skill registry
registry = SkillRegistry(nx)

# Discover skills from three tiers (agent > tenant > system)
# Loads metadata only - lightweight and fast
await registry.discover()

# List available skills
skills = registry.list_skills()
# ['analyze-code', 'data-processing', 'report-generation']

# Get skill metadata (no content loading)
metadata = registry.get_metadata("analyze-code")
print(f"{metadata.name}: {metadata.description}")
# analyze-code: Analyzes code quality and structure

# Load full skill content (lazy loading + caching)
skill = await registry.get_skill("analyze-code")
print(skill.content)  # Full markdown content

# Resolve dependencies automatically (DAG with cycle detection)
deps = await registry.resolve_dependencies("complex-skill")
# ['base-skill', 'helper-skill', 'complex-skill']

# Create skill manager for lifecycle operations
manager = SkillManager(nx, registry)

# Create new skill from template
await manager.create_skill(
    "my-analyzer",
    description="Analyzes code quality and structure",
    template="code-generation",  # basic, data-analysis, code-generation, document-processing, api-integration
    author="Alice",
    tier="agent"
)

# Fork existing skill with lineage tracking
await manager.fork_skill(
    "analyze-code",
    "my-custom-analyzer",
    tier="agent",
    author="Bob"
)

# Publish skill to tenant library
await manager.publish_skill(
    "my-analyzer",
    source_tier="agent",
    target_tier="tenant"
)

# Export skills to .zip (vendor-neutral)
exporter = SkillExporter(registry)

# Export with dependencies
await exporter.export_skill(
    "analyze-code",
    output_path="analyze-code.zip",
    format="claude",  # Enforces 8MB limit
    include_dependencies=True
)

# Validate before export
valid, msg, size = await exporter.validate_export("large-skill", format="claude")
if not valid:
    print(f"Cannot export: {msg}")

# Enterprise Features (NEW in v0.3.0)
from nexus.skills import (
    SkillAnalyticsTracker,
    SkillGovernance,
    SkillAuditLogger,
    AuditAction
)

# Track skill usage and analytics
tracker = SkillAnalyticsTracker(db_connection)
await tracker.track_usage(
    "analyze-code",
    agent_id="alice",
    execution_time=1.5,
    success=True
)

# Get analytics for a skill
analytics = await tracker.get_skill_analytics("analyze-code")
print(f"Success rate: {analytics.success_rate:.1%}")
print(f"Avg execution time: {analytics.avg_execution_time:.2f}s")

# Get dashboard metrics
dashboard = await tracker.get_dashboard_metrics()
print(f"Total skills: {dashboard.total_skills}")
print(f"Most used: {dashboard.most_used_skills[:5]}")

# Governance - approval workflow for org-wide skills
gov = SkillGovernance(db_connection)

# Submit for approval
approval_id = await gov.submit_for_approval(
    "my-analyzer",
    submitted_by="alice",
    reviewers=["bob", "charlie"],
    comments="Ready for team-wide use"
)

# Approve skill
await gov.approve_skill(approval_id, reviewed_by="bob", comments="Excellent work!")
is_approved = await gov.is_approved("my-analyzer")

# Audit logging for compliance
audit = SkillAuditLogger(db_connection)

# Log skill operations
await audit.log(
    "analyze-code",
    AuditAction.EXECUTED,
    agent_id="alice",
    details={"execution_time": 1.5, "success": True}
)

# Query audit logs
logs = await audit.query_logs(skill_name="analyze-code", action=AuditAction.EXECUTED)

# Generate compliance report
report = await audit.generate_compliance_report(tenant_id="tenant1")
print(f"Total operations: {report['total_operations']}")
print(f"Top skills: {report['top_skills'][:5]}")

# Search skills by description
results = await manager.search_skills("code analysis", limit=5)
for skill_name, score in results:
    print(f"{skill_name}: {score:.1f}")

Skills CLI Commands (v0.3.0)

Nexus provides comprehensive CLI commands for skill management:

# List all skills
nexus skills list
nexus skills list --tenant  # Show tenant skills
nexus skills list --system  # Show system skills
nexus skills list --tier agent  # Filter by tier

# Create new skill from template
nexus skills create my-skill --description "My custom skill"
nexus skills create data-viz --description "Data visualization" --template data-analysis
nexus skills create analyzer --description "Code analyzer" --author Alice

# Fork existing skill
nexus skills fork analyze-code my-analyzer
nexus skills fork data-analysis custom-analysis --author Bob

# Publish skill to tenant library
nexus skills publish my-skill
nexus skills publish shared-skill --from-tier tenant --to-tier system

# Search skills by description
nexus skills search "data analysis"
nexus skills search "code" --tier tenant --limit 5

# Show detailed skill information
nexus skills info analyze-code
nexus skills info data-analysis

# Export skill to .zip package (vendor-neutral)
nexus skills export my-skill --output ./my-skill.zip
nexus skills export analyze-code --output ./export.zip --format claude
nexus skills export my-skill --output ./export.zip --no-deps  # Exclude dependencies

# Validate skill format and size limits
nexus skills validate my-skill
nexus skills validate analyze-code --format claude

# Calculate skill size
nexus skills size my-skill
nexus skills size analyze-code --human

Available Templates:

  • basic - Simple skill template
  • data-analysis - Data processing and analysis
  • code-generation - Code generation and modification
  • document-processing - Document parsing and analysis
  • api-integration - API integration and data fetching

Export Formats:

  • generic - Vendor-neutral .zip format (no size limit)
  • claude - Anthropic Claude format (8MB limit enforced)
  • openai - OpenAI format (validation only, ready for future plugins)

Note: External API integrations (uploading to Claude API, OpenAI, etc.) will be implemented as plugins in v0.3.5+ to maintain vendor neutrality. The core CLI provides generic export functionality.

SKILL.md Format:

---
name: analyze-code
description: Analyzes code quality and structure
version: 1.0.0
author: Your Name
requires:
  - base-parser
  - ast-analyzer
---

# Code Analysis Skill

This skill analyzes code for quality metrics...

## Usage

1. Parse the code files
2. Run static analysis
3. Generate report

Features:

  • Progressive Disclosure: Load metadata during discovery, full content on-demand
  • Lazy Loading: Skills cached only when accessed
  • Three-Tier Hierarchy: Agent skills override tenant/system skills
  • Dependency Resolution: Automatic DAG resolution with cycle detection
  • Skill Lifecycle: Create, fork, and publish skills with lineage tracking
  • Template System: 5 pre-built templates (basic, data-analysis, code-generation, document-processing, api-integration)
  • Vendor-Neutral Export: Generic .zip format with Claude/OpenAI validation
  • Usage Analytics: Track performance, success rates, dashboard metrics (NEW in v0.3.0)
  • Governance: Approval workflows for team-wide skill publication (NEW in v0.3.0)
  • Audit Logging: Complete compliance tracking and reporting (NEW in v0.3.0)
  • Skill Search: Find skills by description with relevance scoring (NEW in v0.3.0)
  • Comprehensive Tests: 156 passing tests (31%+ overall coverage, 65-91% skills module)

Skill Tiers:

  • Agent (/workspace/.nexus/skills/) - Personal skills (highest priority)
  • Tenant (/shared/skills/) - Team-shared skills
  • System (/system/skills/) - Built-in skills (lowest priority)

Technology Stack

Core

  • Language: Python 3.11+
  • API Framework: FastAPI
  • Database: PostgreSQL / SQLite (configurable via environment variable)
  • Cache: Redis (prod) / In-memory (dev)
  • Vector DB: Qdrant
  • Object Storage: S3-compatible, GCS, Azure Blob

AI/ML

  • LLM Providers: Anthropic Claude, OpenAI, Google Gemini
  • Embeddings: text-embedding-3-large, voyage-ai
  • Parsing: PyPDF2, pandas, openpyxl, Pillow

Infrastructure

  • Orchestration: Kubernetes (distributed mode)
  • Monitoring: Prometheus + Grafana
  • Logging: Structlog + Loki
  • Admin UI: Simple HTML/JS (jobs, memories, files, operations)

Performance Targets

Metric Target Impact
Write Throughput 500-1000 MB/s 10-50× vs direct backend
Read Latency <10ms 10-50× vs remote storage
Memory Search <100ms Vector search across memories
Storage Savings 30-50% CAS deduplication
Job Resumability 100% Survives all restarts
LLM Cache Hit Rate 50-90% Major cost savings
Prompt Versioning Full lineage Track optimization history
Training Data Dedup 30-50% CAS-based deduplication
Prompt Optimization Multi-candidate Test multiple strategies in parallel
Trace Storage Full execution logs Debug failures, analyze patterns

Configuration

Local Mode

import nexus

# Config via Python (useful for programmatic configuration)
nx = nexus.connect(config={
    "mode": "local",
    "data_dir": "./nexus-data",
    "cache_size_mb": 100,
    "enable_vector_search": True
})

# Or let it auto-discover from nexus.yaml
nx = nexus.connect()

Self-Hosted Deployment

For organizations that want to run their own Nexus instance, create config.yaml:

mode: server  # local or server

database:
  url: postgresql://user:pass@localhost/nexus
  # or for SQLite: sqlite:///./nexus.db
  # Can also use NEXUS_DATABASE_URL or POSTGRES_URL environment variable

cache:
  type: redis  # memory, redis
  url: redis://localhost:6379

vector_db:
  type: qdrant
  url: http://localhost:6333

backends:
  - type: s3
    bucket: my-company-files
    region: us-east-1

  - type: gdrive
    credentials_path: ./gdrive-creds.json

auth:
  jwt_secret: your-secret-key
  token_expiry_hours: 24

rate_limits:
  default: "100/minute"
  semantic_search: "10/minute"
  llm_read: "50/hour"

Run server:

nexus server --config config.yaml

Security

Multi-Layer Security Model

  1. API Key Authentication: Tenant and agent identification
  2. Row-Level Security (RLS): Database-level tenant isolation
  3. Type-Level Validation: Fail-fast validation before database operations
  4. UNIX-Style Permissions: Owner, group, and mode bits (v0.3.0)
  5. ACL Permissions: Fine-grained access control lists (v0.3.0)
  6. ReBAC (Relationship-Based Access Control): Zanzibar-style authorization (v0.3.0)

Type-Level Validation (NEW in v0.1.0)

All domain types have validation methods that are called automatically before database operations. This provides:

  • Fail Fast: Catch invalid data before expensive database operations
  • Clear Error Messages: Actionable feedback for developers and API consumers
  • Data Integrity: Prevent invalid data from entering the database
  • Consistent Validation: Same rules across all code paths
from nexus.core.metadata import FileMetadata
from nexus.core.exceptions import ValidationError

# Validation happens automatically on put()
try:
    metadata = FileMetadata(
        path="/data/file.txt",  # Must start with /
        backend_name="local",
        physical_path="/storage/file.txt",
        size=1024,  # Must be >= 0
    )
    store.put(metadata)  # Validates before DB operation
except ValidationError as e:
    print(f"Validation failed: {e}")
    # Example: "size cannot be negative, got -1"

Validation Rules:

  • Paths must start with / and not contain null bytes
  • File sizes and ref counts must be non-negative
  • Required fields (path, backend_name, physical_path, etc.) must not be empty
  • Content hashes must be valid 64-character SHA-256 hex strings
  • Metadata keys must be ≤ 255 characters

Permission Enforcement (v0.3.0 - NEW)

Permission enforcement is enabled by default to provide secure-by-default behavior with multi-layer access control:

import nexus
from nexus.core.permissions import OperationContext

# Enable permission enforcement
nx = nexus.connect(config={
    "data_dir": "./nexus-data",
    "agent_id": "alice",
    "tenant_id": "acme-corp",
    "enforce_permissions": True  # Enable permission checks
})

# Operations now check permissions using multi-layer security
nx.write("/workspace/file.txt", b"data")  # Uses default context (alice)
nx.read("/workspace/file.txt")  # Permission check: alice can read?

# Or provide explicit context for each operation
ctx = OperationContext(user="bob", groups=["developers"])
nx.read("/workspace/file.txt", context=ctx)  # Permission check: bob can read?

Permission Evaluation Order:

  1. Admin/System Bypass - Admin users and system operations bypass all checks
  2. ReBAC - Check relationship graph (team membership, hierarchical permissions)
  3. ACL - Check explicit allow/deny entries (highest priority)
  4. UNIX Permissions - Check owner/group/other mode bits (fallback)
  5. Default Deny - Deny if no permissions are set (when enforcement is enabled)

Enabling Permission Enforcement:

# Enable via environment variable (CLI)
export NEXUS_ENFORCE_PERMISSIONS=true
export USER=alice  # User context for operations
nexus write /workspace/file.txt "data"  # Permission checked!

# Or via config file
cat > nexus.yaml <<EOF
enforce_permissions: true
agent_id: alice
tenant_id: acme-corp
EOF
nexus write /workspace/file.txt "data"

CLI Operations: The CLI automatically uses the current system user ($USER environment variable) as the operation context:

# CLI operations use $USER for context
export USER=alice
export NEXUS_ENFORCE_PERMISSIONS=true  # Enable enforcement
nexus write /workspace/file.txt "data"  # Context: user=alice
nexus ls /workspace/  # Results filtered by alice's permissions

Configuration:

  • Permission enforcement is enabled by default (enforce_permissions=True)
  • Provides secure-by-default behavior with multi-layer security
  • Disable if needed via NEXUS_ENFORCE_PERMISSIONS=false or config
  • CLI operations automatically use $USER as the operation context
  • When disabled, all operations succeed (no permission checks)

Example: Multi-Tenancy Isolation

-- RLS automatically filters queries by tenant
SET LOCAL app.current_tenant_id = '<tenant_uuid>';

-- All queries auto-filtered, even with bugs
SELECT * FROM file_paths WHERE path = '/data';
-- Returns only rows for current tenant

Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=nexus --cov-report=html

# Run specific test file
pytest tests/test_filesystem.py

# Run integration tests
pytest tests/integration/ -v

# Run performance tests
pytest tests/performance/ --benchmark-only

Documentation

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details.

# Fork the repo and clone
git clone https://github.com/yourusername/nexus.git
cd nexus

# Create a feature branch
git checkout -b feature/your-feature

# Make changes and test
uv pip install -e ".[dev,test]"
pytest

# Format and lint
ruff format .
ruff check .

# Commit and push
git commit -am "Add your feature"
git push origin feature/your-feature

License

Apache 2.0 License - see LICENSE for details.

Roadmap

v0.1.0 - Local Mode Foundation (Current)

  • Core embedded filesystem (read/write/delete)
  • SQLite metadata store
  • Local filesystem backend
  • Basic file operations (list, glob, grep)
  • Virtual path routing
  • Directory operations (mkdir, rmdir, is_directory)
  • Basic CLI interface with Click and Rich
  • Metadata export/import (JSONL format)
  • SQL views for ready work detection
  • In-memory caching
  • Batch operations (avoid N+1 queries)
  • Type-level validation

v0.2.0 - FUSE Mount & Content-Aware Operations (Current)

  • FUSE filesystem mount - Mount Nexus to local path (e.g., /mnt/nexus)
  • Smart read mode - Return parsed text for binary files (PDFs, Excel, etc.)
  • Virtual file views - Auto-generate .txt and .md views for binary files
  • Content parser framework - Extensible parser system for document types (MarkItDown)
  • PDF parser - Extract text and markdown from PDFs
  • Excel/CSV parser - Parse spreadsheets to structured data
  • Content-aware file access - Access parsed content via virtual views
  • Document type detection - Auto-detect MIME types and route to parsers
  • Mount CLI commands - nexus mount, nexus unmount
  • Mount modes - Binary, text, and smart modes
  • .raw directory - Access original binary files
  • Background daemon mode - Run mount in background with --daemon
  • All FUSE operations - read, write, create, delete, mkdir, rmdir, rename, truncate
  • Unit tests - Comprehensive test coverage for FUSE operations
  • rclone-style CLI commands - sync, copy, move, tree, size with progress bars
  • Background parsing - Async content parsing on write
  • FUSE performance optimizations - Caching (TTL/LRU), cache invalidation, metrics
  • Image OCR parser - Extract text from images (PNG, JPEG)

v0.3.0 - File Permissions & Skills System

Permissions (Complete):

  • UNIX-style file permissions (owner, group, mode)
  • Permission operations (chmod, chown, chgrp)
  • ACL (Access Control List) support
  • CLI commands (getfacl, setfacl)
  • Database schema for permissions and ACL entries
  • Comprehensive tests (91 passing tests)
  • ReBAC (Relationship-Based Access Control) - Zanzibar-style authorization
  • Relationship types - member-of, owner-of, viewer-of, editor-of, parent-of
  • Permission inheritance via relationships - Team ownership, group membership
  • Relationship graph queries - Graph traversal with cycle detection
  • Namespaced tuples - (subject, relation, object) authorization model
  • Check API - Fast permission checks with 5-minute TTL caching
  • Expand API - Discover all subjects with specific permissions
  • Relationship management - Create, delete, query relationships via CLI
  • Expiring tuples - Temporary permissions with automatic cleanup
  • Comprehensive ReBAC tests (14 passing tests, 100% pass rate)

Permissions (Phase 2 - Permission Enforcement):

  • OperationContext dataclass - Carry user/agent auth context through operations
  • PermissionEnforcer class - Multi-layer security (ReBAC → ACL → UNIX)
  • Permission check helper - _check_permission() method in NexusFS
  • Default context creation - Auto-create from agent_id/tenant_id
  • Comprehensive tests - 22 new tests for enforcement
  • Usage guide - Complete documentation with examples
  • Example demos - permission_enforcement_demo.py

Permissions (Phase 3 - Full Integration - Complete):

  • Integrate permission checks into all file operations (read, write, delete, list, mkdir)
  • CLI context creation - Create OperationContext from system user
  • Optional context parameter - All operations accept optional context parameter with sensible defaults
  • Permission enforcement enabled by default - Secure-by-default with enforce_permissions=True
  • Automatic user detection - CLI operations use $USER environment variable for context
  • Default permission policies per namespace
  • Permission migration for existing files
  • Full ACL and ReBAC integration - Multi-layer security fully functional

Skills System (Core - Vendor Neutral):

  • SKILL.md parser - Parse Anthropic-compatible SKILL.md with frontmatter
  • Skill registry - Progressive disclosure, lazy loading, three-tier hierarchy
  • Skill discovery - Scan /workspace/.nexus/skills/, /shared/skills/, /system/skills/
  • Dependency resolution - Automatic DAG resolution with cycle detection
  • Skill export - Export to generic formats (validate, pack, size check)
  • Skill templates - 5 pre-built templates (basic, data-analysis, code-generation, document-processing, api-integration)
  • Skill lifecycle - Create, fork, publish workflows with lineage tracking
  • Comprehensive tests - 156 passing tests (31%+ overall coverage, 65-91% skills module)
  • Skill analytics - Usage tracking, success rates, execution time, dashboard metrics
  • Skill search - Text-based search across skill descriptions with relevance scoring
  • Skill governance - Approval workflow for org-wide skills (submit, approve, reject)
  • Audit trails - Log all skill operations, compliance reporting, query by filters
  • Skill versioning - CAS-backed version control with history tracking
  • CLI commands - list, create, fork, publish, search, info, export, validate, size (see issue #88)

Note: External integrations (Claude API upload/download, OpenAI, etc.) will be implemented as plugins in v0.3.5+ to maintain vendor neutrality. Core Nexus provides generic skill export (nexus skills export --format claude), while nexus-plugin-anthropic handles API-specific operations.

v0.3.5 - Plugin System & External Integrations

  • Plugin discovery - Entry point-based plugin discovery
  • Plugin registry - Register and manage installed plugins
  • Plugin CLI namespace - nexus <plugin-name> <command> pattern
  • Plugin hooks - Lifecycle hooks (before_write, after_read, etc.)
  • Plugin configuration - Per-plugin config in ~/.nexus/plugins/<name>/
  • Plugin manager - nexus plugins list/install/uninstall/info
  • First-party plugins:
    • nexus-plugin-anthropic - Claude API integration (upload/download skills)
    • nexus-plugin-skill-seekers - Integration with Skill_Seekers scraper

v0.3.9 - Architecture & Versioning Improvements

  • Library/CLI Separation - Clean SDK interface
    • Pure business logic in nexus/core/
    • CLI-only concerns in nexus/cli/
    • Standalone SDK in nexus/sdk/
    • Enable third-party GUIs, TUIs, web interfaces
  • Workspace Versioning - Time-travel for agent workspaces
    • Workspace as versioned entity (CAS-backed snapshots)
    • nexus workspace log <agent> - Show version history
    • nexus workspace restore <agent> <version> - Rollback workspace
    • nexus workspace diff <agent> <v1> <v2> - Compare versions
    • Workspace branching for parallel experimentation
    • Agent debugging: "What workspace state caused this failure?"
  • Lock-Free Concurrency - Optimistic concurrency control
    • Remove file locks, use optimistic concurrency
    • Conflict detection after operations complete
    • Conflict resolution API (keep_latest, merge, manual)
    • Multi-agent safe operations without lock contention
    • Concurrent filesystem modifications (Dropbox/rsync safe)
  • Operation Log - Undo & audit trail
    • operation_log table (parent_id, timestamp, agent, type, affected_paths)
    • Log all operations (write, delete, mkdir, rename)
    • nexus ops log - Show operation history
    • nexus undo - Undo last operation
    • nexus op restore <op_id> - Restore to specific operation
    • Operation query: list by agent, time range, operation type
    • CAS-backed operation snapshots
  • Time-Travel Debugging - Read files at historical points
    • --at-operation flag for time-travel reads
    • nexus cat /file.txt --at-operation op_123456
    • nexus ls /workspace/ --at-operation op_yesterday
    • Operation diff - Show changes between operations
    • Agent operation debugging: "What did agent do 10 steps ago?"
    • Concurrent operation visualization - Show divergent operations

v0.4.0 - AI Integration

  • LLM provider abstraction
  • Anthropic Claude integration
  • OpenAI integration
  • Basic KV cache for prompts
  • Semantic search (vector embeddings)
  • LLM-powered document reading

v0.5.0 - Agent Workspaces

  • Agent workspace structure
  • File-based configuration (.nexus/)
  • Custom command system (markdown)
  • Basic agent memory storage
  • Memory consolidation
  • Memory reflection phase - Extract insights from execution trajectories
  • Strategy/playbook organization - Organize memories as reusable strategies
  • Semantic skill search - Vector-based search across skill descriptions

v0.6.0 - Server Mode (Self-Hosted & Managed)

  • FastAPI REST API
  • API key authentication
  • Multi-tenancy support
  • PostgreSQL support
  • Redis caching
  • Docker deployment
  • Batch/transaction APIs (atomic multi-operation updates)
  • Auto-scaling configuration (for hosted deployments)

v0.7.0 - Extended Features & Event System

  • S3 backend support
  • Google Drive backend
  • Job system with checkpointing
  • OAuth token management
  • MCP server implementation
  • Webhook/event system (file changes, memory updates, job events)
  • Watch API for real-time updates (streaming changes to clients)
  • Server-Sent Events (SSE) support for live monitoring
  • Simple admin UI (jobs, memories, files, operations)

v0.8.0 - Advanced AI Features & Rich Query

  • Advanced KV cache with context tracking
  • Memory versioning and lineage
  • Multi-agent memory sharing
  • Enhanced semantic search
  • Importance-based memory preservation - Prevent brevity bias in consolidation
  • Context-aware memory retrieval - Include execution context in search
  • Automated strategy extraction - LLM-powered extraction from successful trajectories
  • Rich File Query Language - Advanced querying beyond glob/grep
    • nexus query "files where size > 1MB and created_after='2025-10-01'"
    • nexus query "files modified_by='agent1' in last_7_days"
    • nexus query "ancestors(file='/workspace/final.md')" - File lineage
    • Query operations log with filters
    • Query file dependencies and relationships
  • Rich memory query language - Filter by metadata, importance, task type, date ranges
  • Memory query builder API - Fluent interface for complex queries
  • Combined vector + metadata search - Hybrid search

v0.9.0 - Production Readiness

  • Monitoring and observability
  • Performance optimization
  • Comprehensive testing
  • Security hardening
  • Documentation completion
  • Optional OpenTelemetry export (for framework integration)

v0.9.5 - Prompt Engineering & Optimization

  • Prompt version control with lineage tracking
  • Training dataset storage with CAS deduplication
  • Evaluation metrics time series (performance tracking)
  • Frozen inference snapshots (immutable program state)
  • Experiment tracking export (MLflow, W&B integration)
  • Prompt diff viewer (compare versions)
  • Regression detection alerts (performance drops)
  • Multi-candidate pool management (concurrent prompt testing)
  • Execution trace storage (detailed run logs for debugging)
  • Per-example evaluation results (granular performance tracking)
  • Optimization run grouping (experiment management)
  • Multi-objective tradeoff analysis (accuracy vs latency vs cost)

v0.10.0 - Production Infrastructure & Auto-Scaling

  • Automatic infrastructure scaling
  • Redis distributed locks (for large deployments)
  • PostgreSQL replication (for high availability)
  • Kubernetes deployment templates
  • Multi-region load balancing
  • Automatic migration from single-node to distributed

v1.0.0 - Production Release

  • Complete feature set
  • Production-tested
  • Comprehensive documentation
  • Migration tools
  • Enterprise support

Support


Built with ❤️ by the Nexus team

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nexus_ai_fs-0.3.0.tar.gz (830.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nexus_ai_fs-0.3.0-py3-none-any.whl (229.7 kB view details)

Uploaded Python 3

File details

Details for the file nexus_ai_fs-0.3.0.tar.gz.

File metadata

  • Download URL: nexus_ai_fs-0.3.0.tar.gz
  • Upload date:
  • Size: 830.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nexus_ai_fs-0.3.0.tar.gz
Algorithm Hash digest
SHA256 6cc92d7b31b92499d720ba7ea77ef0c3cdb5dc47ca8d87109b3abd43ad4017be
MD5 317be7195335e3d1add23a877eff3734
BLAKE2b-256 dbc47801a8e082d080380235ae127339efa19c4dbfcca1424b70f72b32adf722

See more details on using hashes here.

File details

Details for the file nexus_ai_fs-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: nexus_ai_fs-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 229.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for nexus_ai_fs-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6aa1ccd775d7ec187936f83e47862f6cacacf589003c5db002dfe08b33425679
MD5 de72b9fcce5fbc6135695964f499f345
BLAKE2b-256 43f910b5764359f6510e79945f7ecd6cb8303e1e08fd4ef46959607aaf088f80

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page