AI-Native Distributed Filesystem Architecture

Nexus: AI-Native Distributed Filesystem

Version 0.1.0 | AI Agent Infrastructure Platform

Nexus is a complete AI agent infrastructure platform that combines a distributed unified filesystem, self-evolving agent memory, intelligent document processing, and seamless deployment from local development to hosted production, all from a single codebase.

Features

Foundation

Distributed Unified Filesystem: Multi-backend abstraction (LocalFS and GCS available; S3, Azure Blob Storage, Google Drive, and SharePoint planned)
Tiered Storage: Hot/Warm/Cold tiers with automatic lineage tracking
Content-Addressable Storage: 30-50% storage savings via deduplication
"Everything as a File" Paradigm: Configuration, memory, jobs, and commands as files

Agent Intelligence

Self-Evolving Memory: Agent memory with automatic consolidation
Memory Versioning: Track knowledge evolution over time
Multi-Agent Sharing: Shared memory spaces within tenants
Memory Analytics: Effectiveness tracking and insights
Prompt Version Control: Track prompt evolution with lineage
Training Data Management: Version-controlled datasets with deduplication
Prompt Optimization: Multi-candidate testing, execution traces, tradeoff analysis
Experiment Tracking: Organize optimization runs, per-example results, regression detection

Content Processing

Rich Format Parsing: Extensible parsers (PDF, Excel, CSV, JSON, images)
LLM KV Cache Management: 50-90% cost savings on AI queries
Semantic Chunking: Better search via intelligent document segmentation
MCP Integration: Native Model Context Protocol server
Document Type Detection: Automatic routing to appropriate parsers

Operations

Resumable Jobs: Checkpointing system survives restarts
OAuth Token Management: Auto-refreshing credentials
Backend Auto-Mount: Automatic recognition and mounting
Resource Management: CPU throttling and rate limiting
Work Queue Detection: SQL views for efficient task scheduling and dependency resolution

Deployment Modes

Nexus supports two deployment modes from a single codebase:

Mode	Use Case	Setup Time	Scaling
Local	Individual developers, CLI tools, prototyping	60 seconds	Single machine (~10GB)
Hosted	Teams and production (auto-scales)	Sign up	Automatic (GB to Petabytes)

Note: Hosted mode automatically scales infrastructure under the hood—you don't choose between "monolithic" or "distributed". Nexus handles that for you based on your usage.

Quick Start: Local Mode

import asyncio

import nexus


async def main():
    # Zero-deployment filesystem with AI features
    # Config auto-discovered from nexus.yaml or environment
    nx = nexus.connect()

    async with nx:
        # Write and read files
        await nx.write("/workspace/data.txt", b"Hello World")
        content = await nx.read("/workspace/data.txt")

        # Semantic search across documents
        results = await nx.semantic_search(
            "/docs/**/*.pdf",
            query="authentication implementation"
        )

        # LLM-powered document reading with KV cache
        answer = await nx.llm_read(
            "/reports/q4.pdf",
            prompt="Summarize key findings",
            model="claude-sonnet-4"
        )


# `async with` and `await` are only valid inside a coroutine
asyncio.run(main())

Config file (nexus.yaml):

mode: local
data_dir: ./nexus-data
cache_size_mb: 100
enable_vector_search: true

Quick Start: Hosted Mode

Coming Soon! Sign up for early access at nexus.ai

import asyncio

import nexus


async def main():
    # Connect to Nexus hosted instance
    # Infrastructure scales automatically based on your usage
    nx = nexus.connect(
        api_key="your-api-key",
        endpoint="https://api.nexus.ai"
    )

    async with nx:
        # Same API as local mode!
        await nx.write("/workspace/data.txt", b"Hello World")
        content = await nx.read("/workspace/data.txt")


# `async with` and `await` are only valid inside a coroutine
asyncio.run(main())

For self-hosted deployments, see the S3-Compatible HTTP Server section below for deployment instructions.

Storage Backends

Nexus supports multiple storage backends through a unified API. All backends use Content-Addressable Storage (CAS) for automatic deduplication.
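
To make the deduplication idea concrete, here is a minimal sketch of content-addressable storage. It is not the Nexus implementation: the ContentStore class, its in-memory dictionaries, and the choice of SHA-256 are assumptions made for illustration only.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (hypothetical, for illustration)."""

    def __init__(self):
        self.blobs = {}  # content digest -> bytes (stored once per unique content)
        self.paths = {}  # virtual path -> content digest

    def write(self, path, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # identical content is not stored twice
        self.paths[path] = digest
        return digest

    def read(self, path):
        return self.blobs[self.paths[path]]


store = ContentStore()
store.write("/a.txt", b"same bytes")
store.write("/b.txt", b"same bytes")   # duplicate content, new path
store.write("/c.txt", b"other bytes")
print(len(store.paths), "paths share", len(store.blobs), "stored blobs")
```

Because paths only point at digests, writing the same bytes under a second path adds a metadata entry but no new blob.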

Local Backend (Default)

Store files on local filesystem:

import nexus

# Auto-detected from config or uses default
nx = nexus.connect()

# Or explicitly configure
nx = nexus.connect(config={
    "backend": "local",
    "data_dir": "./nexus-data"
})

Google Cloud Storage (GCS) Backend

Store files in Google Cloud Storage with local metadata:

import nexus

# Connect with GCS backend
nx = nexus.connect(config={
    "backend": "gcs",
    "gcs_bucket_name": "my-nexus-bucket",
    "gcs_project_id": "my-gcp-project",  # Optional
    "gcs_credentials_path": "/path/to/credentials.json",  # Optional
})

Authentication Methods:

Service Account Key: Provide gcs_credentials_path
Application Default Credentials (if not provided):
- GOOGLE_APPLICATION_CREDENTIALS environment variable
- gcloud auth application-default login credentials
- GCE/Cloud Run service account (when running on GCP)

Using Config File (nexus.yaml):

backend: gcs
gcs_bucket_name: my-nexus-bucket
gcs_project_id: my-gcp-project  # Optional
# gcs_credentials_path: /path/to/credentials.json  # Optional

Using Environment Variables:

export NEXUS_BACKEND=gcs
export NEXUS_GCS_BUCKET_NAME=my-nexus-bucket
export NEXUS_GCS_PROJECT_ID=my-gcp-project  # Optional
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json  # Optional

CLI Usage with GCS:

# Write file to GCS
nexus write /workspace/data.txt "Hello GCS!" \
  --backend=gcs \
  --gcs-bucket=my-nexus-bucket

# Or use config file (simpler!)
nexus write /workspace/data.txt "Hello GCS!" --config=nexus.yaml

Advanced: Direct Backend API

For advanced use cases, instantiate backends directly:

from nexus import NexusFS, LocalBackend, GCSBackend

# Local backend
nx_local = NexusFS(
    backend=LocalBackend("/path/to/data"),
    db_path="./metadata.db"
)

# GCS backend
nx_gcs = NexusFS(
    backend=GCSBackend(
        bucket_name="my-bucket",
        project_id="my-project",
        credentials_path="/path/to/creds.json"
    ),
    db_path="./gcs-metadata.db"
)

# Same API for both!
nx_local.write("/file.txt", b"data")
nx_gcs.write("/file.txt", b"data")

Backend Comparison

Feature	Local Backend	GCS Backend
Content Storage	Local filesystem	Google Cloud Storage
Metadata Storage	Local SQLite	Local SQLite
Deduplication	✅ CAS (30-50% savings)	✅ CAS (30-50% savings)
Multi-machine Access	❌ Single machine	✅ Shared across machines
Durability	Single disk	99.999999999% (11 nines)
Latency	<1ms (local)	10-50ms (network)
Cost	Free (local disk)	GCS storage pricing
Use Case	Development, single machine	Teams, production, backup

Coming Soon

Amazon S3 Backend (v0.7.0)
Azure Blob Storage (v0.7.0)
Google Drive (v0.7.0)
SharePoint (v0.7.0)

Installation

Using pip (Recommended)

# Install from PyPI
pip install nexus-ai-fs

# Verify installation
nexus --version

From Source (Development)

# Clone the repository
git clone https://github.com/nexi-lab/nexus.git
cd nexus

# Install using uv (recommended for faster installs)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"

# Or using pip
pip install -e ".[dev]"

Development Setup

# Install development dependencies
uv pip install -e ".[dev,test]"

# Run tests
pytest

# Run type checking
mypy src/nexus

# Format code
ruff format .

# Lint
ruff check .

CLI Usage

Nexus provides a command-line interface for all file operations. After installation, the nexus command is available.

Quick Start

# Initialize a new workspace
nexus init ./my-workspace

# Write a file
nexus write /workspace/hello.txt "Hello, Nexus!"

# Read a file
nexus cat /workspace/hello.txt

# List files
nexus ls /workspace
nexus ls /workspace --recursive
nexus ls /workspace --long  # Detailed view with metadata

Available Commands

File Operations

# Write content to a file
nexus write /path/to/file.txt "content"
echo "content" | nexus write /path/to/file.txt --input -

# Display file contents (with syntax highlighting)
nexus cat /workspace/code.py

# Copy files
nexus cp /source.txt /dest.txt

# Delete files
nexus rm /workspace/old-file.txt
nexus rm /workspace/old-file.txt --force  # Skip confirmation

# Show file information
nexus info /workspace/data.txt

Directory Operations

# Create directory
nexus mkdir /workspace/data
nexus mkdir /workspace/deep/nested/dir --parents

# Remove directory
nexus rmdir /workspace/data
nexus rmdir /workspace/data --recursive --force

File Discovery

# List files
nexus ls /workspace
nexus ls /workspace --recursive
nexus ls /workspace --long  # Show size, modified time, etag

# Find files by pattern (glob)
nexus glob "**/*.py"  # All Python files recursively
nexus glob "*.txt" --path /workspace  # Text files in workspace
nexus glob "test_*.py"  # Test files

# Search file contents (grep)
nexus grep "TODO"  # Find all TODO comments
nexus grep "def \w+" --file-pattern "**/*.py"  # Find function definitions
nexus grep "error" --ignore-case  # Case-insensitive search
nexus grep "TODO" --max-results 50  # Limit results

# Search modes (v0.2.0+)
nexus grep "revenue" --file-pattern "**/*.pdf"  # Auto mode: tries parsed first
nexus grep "revenue" --file-pattern "**/*.pdf" --search-mode=parsed  # Only parsed content
nexus grep "TODO" --search-mode=raw  # Only raw text (skip parsing)

# Result shows source type
# Match: TODO (parsed) ← from parsed PDF
# Match: TODO (raw) ← from source code
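
The search modes above can be pictured with a small in-memory sketch. This is an assumption-laden illustration, not the Nexus code: the files dictionary layout, the grep function, and the rule that auto mode falls back to raw text when no parsed text exists are all hypothetical.

```python
import re


def grep(files, pattern, search_mode="auto"):
    """Search files shaped as path -> {"raw": str, "parsed": str or None}."""
    regex = re.compile(pattern)
    matches = []
    for path, content in sorted(files.items()):
        parsed = content.get("parsed")
        if search_mode == "parsed":
            source = "parsed" if parsed is not None else None
        elif search_mode == "raw":
            source = "raw"
        else:  # auto: prefer parsed text, otherwise fall back to raw (assumed)
            source = "parsed" if parsed is not None else "raw"
        if source is None:
            continue  # parsed-only search skips files without parsed text
        for lineno, line in enumerate(content[source].splitlines(), start=1):
            if regex.search(line):
                matches.append((path, lineno, source))
    return matches


files = {
    "/docs/q4.pdf": {"raw": "%PDF-1.7 binary...", "parsed": "Q4 revenue grew\nCosts flat"},
    "/src/app.py": {"raw": "# TODO: handle errors\nprint('revenue')", "parsed": None},
}

for path, lineno, source in grep(files, "revenue"):
    print(f"{path}:{lineno} ({source})")
```

Each match carries its source type, which mirrors the "(parsed)" and "(raw)" labels in the output shown above.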

File Permissions (v0.3.0)

# Change file permissions
nexus chmod 755 /workspace/script.sh
nexus chmod rw-r--r-- /workspace/data.txt

# Change file owner and group
nexus chown alice /workspace/file.txt
nexus chgrp developers /workspace/code/

# View ACL entries
nexus getfacl /workspace/file.txt

# Manage ACL entries
nexus setfacl user:alice:rw- /workspace/file.txt
nexus setfacl group:developers:r-x /workspace/code/
nexus setfacl deny:user:bob /workspace/secret.txt
nexus setfacl user:alice:rwx /workspace/file.txt --remove

Supported Formats:

Octal: 755, 0o644, 0755
Symbolic: rwxr-xr-x, rw-r--r--
ACL Entries: user:<name>:rwx, group:<name>:r-x, deny:user:<name>
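
The octal and symbolic formats describe the same nine permission bits. The following sketch shows one way to convert between them; parse_mode and format_mode are hypothetical helper names, not part of the Nexus API.

```python
def parse_mode(spec: str) -> int:
    """Convert '755', '0o644', '0755', or 'rwxr-xr-x' into permission bits."""
    s = spec.strip()
    if len(s) == 9 and set(s) <= set("rwx-"):
        bits = 0
        for i, ch in enumerate(s):
            if ch == "rwx"[i % 3]:
                bits |= 1 << (8 - i)  # leftmost character is the highest bit
            elif ch != "-":
                raise ValueError(f"unexpected {ch!r} at position {i} in {spec!r}")
        return bits
    if s.lower().startswith("0o"):
        s = s[2:]
    return int(s, 8)  # leading zeros are accepted when the base is explicit


def format_mode(mode: int) -> str:
    """Render permission bits as a symbolic string such as 'rw-r--r--'."""
    return "".join(
        "rwx"[i % 3] if mode & (1 << (8 - i)) else "-" for i in range(9)
    )


for spec in ("755", "0o644", "0755", "rwxr-xr-x", "rw-r--r--"):
    mode = parse_mode(spec)
    print(f"{spec:>10} -> {oct(mode)} -> {format_mode(mode)}")
```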

ReBAC - Relationship-Based Access Control (v0.3.0)

Nexus implements Zanzibar-style relationship-based authorization for team-based permissions, hierarchical access, and dynamic permission inheritance.

# Create relationship tuples
nexus rebac create agent alice member-of group eng-team
nexus rebac create group eng-team owner-of file project-docs
nexus rebac create file folder-parent parent-of file folder-child

# Check permissions (with graph traversal)
nexus rebac check agent alice member-of group eng-team  # Direct check
nexus rebac check agent alice owner-of file project-docs  # Inherited via group

# Find all subjects with a permission
nexus rebac expand owner-of file project-docs  # Returns: alice (via eng-team)
nexus rebac expand member-of group eng-team    # Returns: alice, bob, ...

# Delete relationships
nexus rebac delete <tuple-id>

# Create temporary access (expires automatically)
nexus rebac create agent alice viewer-of file temp-report \
  --expires "2025-12-31T23:59:59"

ReBAC Features:

Relationship Types: member-of, owner-of, viewer-of, editor-of, parent-of
Graph Traversal: Recursive permission checking through relationship chains
Permission Inheritance: Team ownership, hierarchical folders, group membership
Caching: 5-minute TTL with automatic invalidation on changes
Expiring Access: Temporary permissions with automatic cleanup
Cycle Detection: Prevents infinite loops in relationship graphs
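
As a rough illustration of graph traversal with cycle detection, the sketch below checks a permission by expanding group membership and folder ancestry over a set of relationship tuples. It is a simplified model under stated assumptions: the tuple layout, the check function, and the inheritance rules (member-of expands the subject, parent-of expands the object) are hypothetical and may differ from what Nexus does internally.

```python
from collections import deque

# (subject, relation, object) tuples; each entity is a (type, id) pair
TUPLES = {
    (("agent", "alice"), "member-of", ("group", "eng-team")),
    (("group", "eng-team"), "member-of", ("group", "all-staff")),
    (("group", "all-staff"), "member-of", ("group", "eng-team")),  # deliberate cycle
    (("group", "eng-team"), "owner-of", ("file", "project-docs")),
    (("file", "project-docs"), "parent-of", ("file", "project-docs/spec")),
}


def _closure(start, step):
    """Breadth-first expansion; the seen set stops cycles from looping forever."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in step(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def check(subject, relation, obj, tuples=TUPLES):
    # The subject plus every group it belongs to, directly or transitively
    principals = _closure(
        subject, lambda e: [o for s, r, o in tuples if s == e and r == "member-of"]
    )
    # The object plus every ancestor folder that contains it
    targets = _closure(
        obj, lambda e: [s for s, r, o in tuples if o == e and r == "parent-of"]
    )
    return any((p, relation, t) in tuples for p in principals for t in targets)


print(check(("agent", "alice"), "owner-of", ("file", "project-docs/spec")))
print(check(("agent", "bob"), "owner-of", ("file", "project-docs")))
```

The membership cycle between eng-team and all-staff is visited once and then ignored, so the check terminates.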

Example Use Cases:

# Team-based file access
nexus rebac create agent alice member-of group engineering
nexus rebac create group engineering owner-of file /projects/backend
# alice now has owner permission on /projects/backend

# Hierarchical folder permissions
nexus rebac create agent bob owner-of file /workspace/parent-folder
nexus rebac create file /workspace/parent-folder parent-of file /workspace/parent-folder/child
# bob automatically has owner permission on child folder

# Temporary collaborator access
nexus rebac create agent charlie viewer-of file /reports/q4.pdf \
  --expires "2025-01-31T23:59:59"
# charlie's access expires automatically on Jan 31, 2025

Work Queue Operations

# Query work items by status
nexus work ready --limit 10  # Get ready work items (high priority first)
nexus work pending  # Get pending work items
nexus work blocked  # Get blocked work items (with dependency info)
nexus work in-progress  # Get currently processing items

# View aggregate statistics
nexus work status  # Show counts for all work queues

# Output as JSON (for scripting)
nexus work ready --json
nexus work status --json

Note: Work items are files with special metadata (status, priority, depends_on, worker_id). See docs/SQL_VIEWS_FOR_WORK_DETECTION.md for details on setting up work queues.
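
To illustrate how the metadata fields could drive scheduling, here is a small sketch that splits pending items into ready and blocked sets. The status values ("pending", "completed"), the paths, and the function names are assumptions for the example; the actual SQL views are described in the document referenced above.

```python
ITEMS = {
    "/jobs/extract": {"status": "completed", "priority": 1, "depends_on": []},
    "/jobs/transform": {"status": "pending", "priority": 5, "depends_on": ["/jobs/extract"]},
    "/jobs/load": {"status": "pending", "priority": 9, "depends_on": ["/jobs/transform"]},
    "/jobs/report": {"status": "pending", "priority": 7, "depends_on": []},
}


def ready_items(items):
    """Pending items whose dependencies are all completed, highest priority first."""
    done = {path for path, meta in items.items() if meta["status"] == "completed"}
    ready = [
        path
        for path, meta in items.items()
        if meta["status"] == "pending" and all(dep in done for dep in meta["depends_on"])
    ]
    return sorted(ready, key=lambda path: -items[path]["priority"])


def blocked_items(items):
    """Pending items mapped to the dependencies that are not yet completed."""
    done = {path for path, meta in items.items() if meta["status"] == "completed"}
    blocked = {}
    for path, meta in items.items():
        waiting = [dep for dep in meta["depends_on"] if dep not in done]
        if meta["status"] == "pending" and waiting:
            blocked[path] = waiting
    return blocked


print("ready:", ready_items(ITEMS))
print("blocked:", blocked_items(ITEMS))
```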

Examples

Initialize and populate a workspace:

# Create workspace
nexus init ./my-project

# Create structure
nexus mkdir /workspace/src --data-dir ./my-project/nexus-data
nexus mkdir /workspace/tests --data-dir ./my-project/nexus-data

# Add files
echo "print('Hello World')" | nexus write /workspace/src/main.py --input - \
  --data-dir ./my-project/nexus-data

# List everything
nexus ls / --recursive --long --data-dir ./my-project/nexus-data

Find and analyze code:

# Find all Python files
nexus glob "**/*.py"

# Search for TODO comments
nexus grep "TODO|FIXME" --file-pattern "**/*.py"

# Find all test files
nexus glob "**/test_*.py"

# Search for function definitions
nexus grep "^def \w+\(" --file-pattern "**/*.py"

Work with data:

# Write JSON data
echo '{"name": "test", "value": 42}' | nexus write /data/config.json --input -

# Display with syntax highlighting
nexus cat /data/config.json

# Get file information
nexus info /data/config.json

Global Options

All commands support these global options:

# Use custom config file
nexus ls /workspace --config /path/to/config.yaml

# Override data directory
nexus ls /workspace --data-dir /path/to/nexus-data

# Combine both (config takes precedence)
nexus ls /workspace --config ./my-config.yaml --data-dir ./data

Help

Get help for any command:

nexus --help  # Show all commands
nexus ls --help  # Show help for ls command
nexus grep --help  # Show help for grep command

Remote Nexus Server

Nexus includes a JSON-RPC server that exposes the full NexusFileSystem interface over HTTP, enabling remote filesystem access and FUSE mounts to remote servers.

Quick Start

Method 1: Using the Startup Script (Recommended)

# Navigate to nexus directory
cd /path/to/nexus

# Start with defaults (host: 0.0.0.0, port: 8080, no auth)
./start-server.sh

# Or with custom options
./start-server.sh --host localhost --port 8080 --api-key mysecret

Method 2: Direct Command

# Start the server (optional API key authentication)
nexus serve --host 0.0.0.0 --port 8080 --api-key mysecret

# Use remote filesystem from Python
from nexus import RemoteNexusFS

nx = RemoteNexusFS(
    server_url="http://localhost:8080",
    api_key="mysecret"  # Optional
)

# Same API as local NexusFS!
nx.write("/workspace/hello.txt", b"Hello Remote!")
content = nx.read("/workspace/hello.txt")
files = nx.list("/workspace", recursive=True)

Features

Full NFS Interface: All filesystem operations exposed over RPC (read, write, list, glob, grep, mkdir, etc.). NFS here refers to the NexusFileSystem API, not the Network File System protocol.
JSON-RPC 2.0 Protocol: Standard RPC protocol with proper error handling
API Key Authentication: Optional Bearer token authentication for security
Backend Agnostic: Works with local and GCS backends
FUSE Compatible: Mount remote Nexus servers as local filesystems

Remote Client Usage

from nexus import RemoteNexusFS

# Connect to remote server
nx = RemoteNexusFS(
    server_url="http://your-server:8080",
    api_key="your-api-key"  # Optional
)

# All standard operations work
nx.write("/workspace/data.txt", b"content")
content = nx.read("/workspace/data.txt")
files = nx.list("/workspace", recursive=True)
results = nx.glob("**/*.py")
matches = nx.grep("TODO", file_pattern="*.py")

Server Options

# Start with custom host/port
nexus serve --host 0.0.0.0 --port 8080

# Start with API key authentication
nexus serve --api-key mysecret

# Start with GCS backend
nexus serve --backend=gcs --gcs-bucket=my-bucket --api-key mysecret

# Custom data directory
nexus serve --data-dir /path/to/data

Testing the Server

Once the server is running, verify it's working:

# Health check
curl http://localhost:8080/health
# Expected: {"status": "healthy", "service": "nexus-rpc"}

# Check available methods
curl http://localhost:8080/api/nfs/status
# Expected: {"status": "running", "service": "nexus-rpc", "version": "1.0", "methods": [...]}

# List files (JSON-RPC)
curl -X POST http://localhost:8080/api/nfs/list \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "method": "list",
    "params": {"path": "/", "recursive": false, "details": true},
    "id": 1
  }'

# With API key
curl -X POST http://localhost:8080/api/nfs/list \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer mysecret" \
  -d '{"jsonrpc": "2.0", "method": "list", "params": {"path": "/"}, "id": 1}'
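
The same JSON-RPC call can be prepared from Python with only the standard library. This is a sketch: build_rpc_request is a hypothetical helper, and the /api/nfs/<method> URL pattern is an assumption generalized from the list endpoint shown above.

```python
import json
import urllib.request


def build_rpc_request(server_url, method, params, api_key=None, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 POST request."""
    payload = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{server_url}/api/nfs/{method}",  # assumed URL pattern
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_rpc_request(
    "http://localhost:8080",
    "list",
    {"path": "/", "recursive": False},
    api_key="mysecret",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

With a server running, passing req to urllib.request.urlopen sends the request and returns the JSON-RPC response body.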

Troubleshooting

Port Already in Use:

# Find and kill process using port 8080
lsof -ti:8080 | xargs kill -9

# Or use a different port
nexus serve --port 8081

Module Not Found:

# Activate virtual environment and install
source .venv/bin/activate
pip install -e .

Permission Denied:

# Use a directory you have write access to
nexus serve --data-dir ~/nexus-data

Deploying Nexus Server

Google Cloud Platform (Recommended)

Deploy to GCP with a single command using the automated deployment script:

# Quick start
./deploy-gcp.sh --project-id YOUR-PROJECT-ID --api-key mysecret

# With GCS backend
./deploy-gcp.sh \
  --project-id YOUR-PROJECT-ID \
  --gcs-bucket your-nexus-bucket \
  --api-key mysecret \
  --machine-type e2-standard-2

Features:

✅ Automated VM provisioning (Ubuntu 22.04)
✅ Systemd service with auto-restart
✅ Firewall configuration
✅ GCS backend support
✅ Production-ready setup

See GCP Deployment Guide for complete instructions.

Docker Deployment

Deploy using Docker for consistent environments and easy management:

# Quick start with Docker Compose
cp .env.docker.example .env
# Edit .env with your configuration
docker-compose up -d

# Or run directly
docker build -t nexus-server:latest .
docker run -d \
  --name nexus-server \
  --restart unless-stopped \
  -p 8080:8080 \
  -v nexus-data:/app/data \
  -e NEXUS_API_KEY="your-api-key" \
  nexus-server:latest

# Deploy to GCP with Docker (automated)
./deploy-gcp-docker.sh \
  --project-id your-project-id \
  --api-key mysecret \
  --build-local

Features:

✅ Multi-stage build for optimized image size (~300MB)
✅ Non-root user for security
✅ Health checks and auto-restart
✅ GCS backend support
✅ Docker Compose for easy orchestration

See Docker Deployment Guide for complete instructions.

Deployment Features:

Persistent Metadata: SQLite database stored on VM disk at /var/lib/nexus/
Content Storage: All file content stored in configured backend (GCS, local, etc.)
Content Deduplication: CAS-based storage with 30-50% savings
Full NFS API: All operations available remotely

FUSE Mount: Use Standard Unix Tools (v0.2.0)

Mount Nexus to a local path and use any standard Unix tool seamlessly - ls, cat, grep, vim, and more!

Installation

First, install FUSE support:

# Install Nexus with FUSE support
pip install "nexus-ai-fs[fuse]"  # Quotes keep shells such as zsh from expanding the brackets

# Platform-specific FUSE library:
# macOS: Install macFUSE from https://osxfuse.github.io/
# Linux: sudo apt-get install fuse3  # or equivalent for your distro

Quick Start

# Mount Nexus to local path (smart mode by default)
nexus mount /mnt/nexus

# Now use ANY standard Unix tools!
ls -la /mnt/nexus/workspace/
cat /mnt/nexus/workspace/notes.txt
grep -r "TODO" /mnt/nexus/workspace/
find /mnt/nexus -name "*.py"
vim /mnt/nexus/workspace/code.py
git clone /some/repo /mnt/nexus/repos/myproject

# Unmount when done
nexus unmount /mnt/nexus

Quick Start Examples

Example 1: Default (Explicit Views) - Best for Mixed Workflows

# Mount normally
nexus mount /mnt/nexus

# Binary tools work directly
evince /mnt/nexus/docs/report.pdf     # PDF viewer works ✓

# Add .txt for text operations
cat /mnt/nexus/docs/report.pdf.txt    # Read as text
grep "results" /mnt/nexus/docs/*.pdf.txt

# Virtual views auto-generated
ls /mnt/nexus/docs/
# → report.pdf
# → report.pdf.txt  (virtual)
# → report.pdf.md   (virtual)

Example 2: Auto-Parse - Best for Search-Heavy Workflows

# Mount with auto-parse
nexus mount /mnt/nexus --auto-parse

# grep works directly on PDFs!
grep "results" /mnt/nexus/docs/*.pdf      # No .txt needed! ✓
cat /mnt/nexus/docs/report.pdf            # Returns text ✓

# Search across everything
grep -r "TODO" /mnt/nexus/workspace/      # Searches PDFs, Excel, etc.

# Binary via .raw/ when needed
evince /mnt/nexus/.raw/docs/report.pdf   # For PDF viewer

Example 3: Real-World Script

#!/bin/bash
# Find all PDFs mentioning "invoice"

# Mount in background - command returns immediately!
nexus mount /mnt/nexus --auto-parse --daemon
# (No blocking - script continues immediately)

# Mount is ready - grep works on PDFs!
grep -l "invoice" /mnt/nexus/documents/*.pdf

# Process results
# Read one filename per line so paths containing spaces are handled correctly
grep -l "invoice" /mnt/nexus/documents/*.pdf | while IFS= read -r pdf; do
    echo "Found in: $pdf"
    grep -n "invoice" "$pdf" | head -5
done

# Clean up
nexus unmount /mnt/nexus

Remote server example:

#!/bin/bash
# Search PDFs on remote Nexus server

# Mount remote server in background
nexus mount /mnt/nexus \
  --remote-url http://nexus-server:8080 \
  --auto-parse \
  --daemon

# Command returns immediately - daemon process runs in background
# You can now use standard Unix tools on remote filesystem!

# Search across remote PDFs
grep -r "TODO" /mnt/nexus/workspace/ | head -20

# Find large files
find /mnt/nexus -type f -size +10M

# Clean up when done
nexus unmount /mnt/nexus

File Access: Two Modes

Nexus supports two ways to access files - choose what fits your workflow:

1. Explicit Views (Default) - Best for Compatibility

Binary files return binary, use .txt/.md suffixes for parsed content:

nexus mount /mnt/nexus

# Binary files work with native tools
evince /mnt/nexus/docs/report.pdf      # PDF viewer gets binary ✓
libreoffice /mnt/nexus/data/sheet.xlsx # Excel app gets binary ✓

# Add .txt to search/read as text
cat /mnt/nexus/docs/report.pdf.txt     # Returns parsed text
grep "pattern" /mnt/nexus/docs/*.pdf.txt

# Virtual views appear automatically
ls /mnt/nexus/docs/
# → report.pdf
# → report.pdf.txt  (virtual view)
# → report.pdf.md   (virtual view)

When to use: You want both binary tools AND text search to work

2. Auto-Parse Mode - Best for Search/Grep

Binary files return parsed text directly, use .raw/ for binary:

nexus mount /mnt/nexus --auto-parse

# Binary files return text directly - perfect for grep!
cat /mnt/nexus/docs/report.pdf         # Returns parsed text ✓
grep "pattern" /mnt/nexus/docs/*.pdf   # Works directly! ✓
less /mnt/nexus/docs/report.pdf        # Page through text ✓

# Access binary via .raw/ when needed
evince /mnt/nexus/.raw/docs/report.pdf # PDF viewer gets binary

# No .txt/.md suffixes - files return text by default
ls /mnt/nexus/docs/
# → report.pdf  (returns text when read)

When to use: Text search is your primary use case, binary tools are secondary

Mount Modes (Content Parsing)

Control what gets parsed:

# Smart mode (default) - Auto-detect file types
nexus mount /mnt/nexus --mode=smart
# ✅ PDFs, Excel, Word → parsed
# ✅ .py, .txt, .md → pass-through
# ✅ Best for mixed content

# Text mode - Parse everything aggressively
nexus mount /mnt/nexus --mode=text
# ✅ All files parsed to text
# ⚠️  Slower (always parses)

# Binary mode - No parsing at all
nexus mount /mnt/nexus --mode=binary
# ✅ All files return binary
# ❌ grep won't work on PDFs

Comparison Table

Feature	Explicit Views (default)	Auto-Parse Mode (`--auto-parse`)
PDF viewers work	✅ `evince file.pdf`	⚠️ `evince .raw/file.pdf`
grep on PDFs	⚠️ `grep *.pdf.txt`	✅ `grep *.pdf`
Excel apps work	✅ `libreoffice file.xlsx`	⚠️ `libreoffice .raw/file.xlsx`
Best for	Binary tools + search	Text search primary use case
Virtual views	`.txt`, `.md` suffixes	No suffixes needed
Binary access	Direct (`file.pdf`)	Via `.raw/` directory
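
The two access modes amount to different rules for mapping a mounted path to a backing file and a content view. The sketch below models that mapping; the resolve function, the PARSEABLE extension set, and the view labels are hypothetical and only illustrate the behavior described in the table.

```python
from pathlib import PurePosixPath

PARSEABLE = {".pdf", ".xlsx", ".docx"}             # assumed set of parsed formats
VIEW_SUFFIXES = {".txt": "text", ".md": "markdown"}


def resolve(path, auto_parse=False):
    """Return (backing_path, view) for a path inside the mount."""
    p = PurePosixPath(path.lstrip("/"))
    if auto_parse:
        if p.parts and p.parts[0] == ".raw":
            # .raw/ exposes the untouched bytes of the same file
            return str(PurePosixPath(*p.parts[1:])), "raw"
        view = "text" if p.suffix.lower() in PARSEABLE else "raw"
        return str(p), view
    view = VIEW_SUFFIXES.get(p.suffix.lower())
    base = p.with_suffix("")  # strip the trailing .txt or .md
    if view and base.suffix.lower() in PARSEABLE:
        return str(base), view  # virtual view of a parsed document
    return str(p), "raw"


for path, auto in [
    ("docs/report.pdf", False),
    ("docs/report.pdf.txt", False),
    ("docs/notes.txt", False),
    ("docs/report.pdf", True),
    (".raw/docs/report.pdf", True),
]:
    print(f"auto_parse={auto!s:5} {path:24} -> {resolve(path, auto)}")
```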

Background (Daemon) Mode

Run the mount in the background and return to your shell immediately:

# Mount in background - command returns immediately
nexus mount /mnt/nexus --daemon
# ✓ Mounted Nexus to /mnt/nexus
#
# To unmount:
#   nexus unmount /mnt/nexus
#
# (Shell prompt returns immediately, mount runs in background)

# Mount is active - you can use it immediately
ls /mnt/nexus
cat /mnt/nexus/workspace/file.txt

# Check daemon status
ps aux | grep "nexus mount" | grep -v grep
# <user>  <pid>  ... nexus mount /mnt/nexus --daemon

# Later, unmount when done
nexus unmount /mnt/nexus

How it works:

Command returns to shell immediately (using double-fork technique)
Background daemon process keeps mount active
Daemon survives terminal close and persists until unmount
Safe to close your terminal - mount stays active

Local Mount:

# Mount local Nexus data in background
nexus mount /mnt/nexus --daemon

Remote Mount:

# Mount remote Nexus server in background
nexus mount /mnt/nexus --remote-url http://your-server:8080 --daemon

# With API key authentication
nexus mount /mnt/nexus \
  --remote-url http://your-server:8080 \
  --api-key your-secret-key \
  --daemon

Performance & Caching (v0.2.0)

FUSE mounts include automatic caching for improved performance. Caching is enabled by default with sensible defaults - no configuration needed for most users.

Default Performance:

✅ Attribute caching (1024 entries, 60s TTL) - Makes ls and stat operations faster
✅ Content caching (100 files) - Speeds up repeated file reads
✅ Parsed content caching (50 files) - Accelerates PDF/Excel text extraction
✅ Automatic cache invalidation on writes/deletes - Always consistent

Advanced: Custom Cache Configuration

For power users with specific performance requirements:

from nexus import connect
from nexus.fuse import mount_nexus

nx = connect(config={"data_dir": "./nexus-data"})

# Custom cache configuration
cache_config = {
    "attr_cache_size": 2048,      # Double the attribute cache (default: 1024)
    "attr_cache_ttl": 120,         # Cache attributes for 2 minutes (default: 60s)
    "content_cache_size": 200,     # Cache 200 files (default: 100)
    "parsed_cache_size": 100,      # Cache 100 parsed files (default: 50)
    "enable_metrics": True         # Track cache hit/miss rates (default: False)
}

fuse = mount_nexus(
    nx,
    "/mnt/nexus",
    mode="smart",
    cache_config=cache_config,
    foreground=False
)

# View cache performance (if metrics enabled)
# Note: Access via fuse.fuse.operations.cache

Cache Configuration Options:

Option	Default	Description
`attr_cache_size`	1024	Max number of cached file attribute entries
`attr_cache_ttl`	60	Time-to-live for attributes in seconds
`content_cache_size`	100	Max number of cached file contents
`parsed_cache_size`	50	Max number of cached parsed contents (PDFs, etc.)
`enable_metrics`	False	Enable cache hit/miss tracking

When to Tune Cache Settings:

Large directory listings: Increase attr_cache_size to 2048+ and attr_cache_ttl to 120+
Many small files: Increase content_cache_size to 500+
Heavy PDF/Excel use: Increase parsed_cache_size to 200+
Performance analysis: Enable enable_metrics to measure cache effectiveness
Memory-constrained: Decrease all cache sizes (e.g., 512 / 50 / 25)

Notes:

Caches are thread-safe - safe for concurrent access
Caches are automatically invalidated on file writes, deletes, and renames
Default settings work well for most use cases - tune only if needed
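
For readers who want a mental model of the attribute cache settings, here is a minimal sketch of a size-bounded cache with a time-to-live and explicit invalidation. The TTLCache class is hypothetical and is not the cache Nexus ships; it is single-threaded and uses an injectable clock so the example is deterministic.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Bounded cache whose entries expire after `ttl` seconds (illustrative only)."""

    def __init__(self, max_size=1024, ttl=60.0, clock=time.monotonic):
        self._data = OrderedDict()  # key -> (value, expiry time)
        self._max_size = max_size
        self._ttl = ttl
        self._clock = clock
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._data.get(key)
        if entry is not None and self._clock() < entry[1]:
            self._data.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return entry[0]
        self._data.pop(key, None)  # drop the expired entry, if any
        self.misses += 1
        return None

    def put(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl)
        self._data.move_to_end(key)
        while len(self._data) > self._max_size:
            self._data.popitem(last=False)  # evict the least recently used entry

    def invalidate(self, key):
        self._data.pop(key, None)


now = [0.0]  # fake clock controlled by the example
cache = TTLCache(max_size=2, ttl=60.0, clock=lambda: now[0])
cache.put("/workspace/a.txt", {"size": 11})
fresh = cache.get("/workspace/a.txt")     # within the TTL
now[0] = 61.0
expired = cache.get("/workspace/a.txt")   # TTL has elapsed
cache.put("/workspace/b.txt", {"size": 1})
cache.invalidate("/workspace/b.txt")      # e.g. after a write or delete
print(fresh, expired, cache.hits, cache.misses)
```

A larger max_size trades memory for fewer misses, and a longer ttl trades freshness for fewer backend lookups, which is the tradeoff the tuning guidance above describes.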

Troubleshooting FUSE Mounts

Check Mount Status

# Check if daemon process is running
ps aux | grep "nexus mount" | grep -v grep

# Check mount points
mount | grep nexus

# List files in mount point (should show files, not empty)
ls -la /mnt/nexus/

Common Issues

Mount appears empty or shows "Transport endpoint is not connected":

# Unmount the stale mount point
nexus unmount /mnt/nexus

# Or force unmount (macOS)
umount -f /mnt/nexus

# Or unmount with fusermount (Linux); use -uz for a lazy unmount if the mount is busy
fusermount -u /mnt/nexus

# Then remount
nexus mount /mnt/nexus --daemon

Process won't die (stuck in 'D' or 'U' state):

# Find stuck processes (STAT column starting with D or U)
ps -Ao pid,stat,command | awk '/nexus/ && $2 ~ /^[DU]/'

# Force kill
kill -9 <PID>

# If process is still stuck (uninterruptible I/O), try:
# macOS: umount -f /mnt/nexus
# Linux: fusermount -uz /mnt/nexus

# Note: Stuck processes in 'D' state typically resolve after unmount
# If they persist, they'll be cleaned up on system reboot

"Directory not empty" error when mounting:

# Unmount first
nexus unmount /mnt/nexus

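# Or remove and recreate the mount point, but only after `mount | grep /mnt/nexus`
# shows nothing mounted there (rm -rf on a live mount would delete files through it)
rm -rf /mnt/nexus && mkdir /mnt/nexus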

# Then mount
nexus mount /mnt/nexus --daemon

Permission denied errors:

# Ensure FUSE is installed
# macOS: Install macFUSE from https://osxfuse.github.io/
# Linux: sudo apt-get install fuse3

# Check mount point permissions
ls -ld /mnt/nexus
# Should be owned by your user

# Create mount point with correct permissions
mkdir -p /mnt/nexus
chmod 755 /mnt/nexus

Connection refused (remote mounts):

# Check server is running
curl http://your-server:8080/health

# Test connectivity
ping your-server

# Verify API key (if required)
nexus mount /mnt/nexus \
  --remote-url http://your-server:8080 \
  --api-key your-key \
  --daemon

Multiple mounts to same mount point:

# Check for existing mounts
mount | grep /mnt/nexus

# Unmount all instances
nexus unmount /mnt/nexus

# Kill any lingering processes
pkill -f "nexus mount /mnt/nexus"

# Clean mount and remount
rm -rf /mnt/nexus && mkdir /mnt/nexus
nexus mount /mnt/nexus --daemon

Debug Mode

For detailed debugging output:

# Run in foreground with debug output
nexus mount /mnt/nexus --debug

# This will show all FUSE operations in real-time
# Press Ctrl+C to stop

rclone-style CLI Commands (v0.2.0)

Nexus provides efficient file operations inspired by rclone, with automatic deduplication and progress tracking:

Sync Command

One-way synchronization with hash-based change detection:

# Sync local directory to Nexus (only copies changed files)
nexus sync ./local/dataset/ /workspace/training/

# Preview changes before syncing (dry-run)
nexus sync ./data/ /workspace/backup/ --dry-run

# Mirror sync - delete extra files in destination
nexus sync /workspace/source/ /workspace/dest/ --delete

# Disable hash comparison (force copy all files)
nexus sync ./data/ /workspace/ --no-checksum

Copy Command

Smart copy with automatic deduplication:

# Copy directory recursively (skips identical files)
nexus copy ./local/data/ /workspace/project/ --recursive

# Copy within Nexus (leverages CAS deduplication)
nexus copy /workspace/source/ /workspace/dest/ --recursive

# Copy Nexus to local
nexus copy /workspace/data/ ./backup/ --recursive

# Copy single file
nexus copy /workspace/file.txt /workspace/copy.txt

# Disable checksum verification
nexus copy ./data/ /workspace/ --recursive --no-checksum

Move Command

Efficient file/directory moves with confirmation prompts:

# Move file (rename if possible, copy+delete otherwise)
nexus move /workspace/old.txt /workspace/new.txt

# Move directory without confirmation
nexus move /workspace/old_dir/ /archives/2024/ --force

Tree Command

Visualize directory structure as ASCII tree:

# Show full directory tree
nexus tree /workspace/

# Limit depth to 2 levels
nexus tree /workspace/ -L 2

# Show file sizes
nexus tree /workspace/ --show-size

Size Command

Calculate directory sizes with human-readable output:

# Calculate total size
nexus size /workspace/project/

# Human-readable output (KB, MB, GB)
nexus size /workspace/ --human

# Show top 10 largest files
nexus size /workspace/ --human --details

Features:

Hash-based deduplication - Only copies changed files
Progress bars - Visual feedback for long operations
Dry-run mode - Preview changes before execution
Cross-platform paths - Works with local filesystem and Nexus paths
Automatic deduplication - Leverages Content-Addressable Storage (CAS)
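
As a rough picture of how Content-Addressable Storage yields deduplication, the sketch below keeps one blob per unique SHA-256 digest and counts how many paths reference it. `ContentStore` is a hypothetical in-memory class written for this example; the real Nexus storage layer is not shown in this document and may differ.

```python
import hashlib
from typing import Dict


class ContentStore:
    """In-memory content-addressable store: one blob per unique SHA-256 digest."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}     # digest -> content
        self.ref_counts: Dict[str, int] = {}  # digest -> number of paths using it
        self.paths: Dict[str, str] = {}       # virtual path -> digest

    def write(self, path: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        previous = self.paths.get(path)
        if previous == digest:
            return digest  # same content is already stored at this path
        # Add the new reference first, then release the one being replaced.
        self.blobs.setdefault(digest, data)
        self.ref_counts[digest] = self.ref_counts.get(digest, 0) + 1
        self.paths[path] = digest
        if previous is not None:
            self._release(previous)
        return digest

    def copy(self, src: str, dst: str) -> str:
        """Copy within the store; identical content adds a reference, not a blob."""
        return self.write(dst, self.blobs[self.paths[src]])

    def _release(self, digest: str) -> None:
        self.ref_counts[digest] -= 1
        if self.ref_counts[digest] == 0:
            del self.ref_counts[digest]
            del self.blobs[digest]
```

Writing the same bytes to two paths leaves a single entry in `blobs`, which is where the storage savings of a CAS design come from.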

Performance Comparison

Method	Speed	Content-Aware	Use Case
`grep -r /mnt/nexus/`	Medium	✅ Yes (via mount)	Interactive use
`nexus grep "pattern"`	Fast (DB-backed)	✅ Yes	Large-scale search
Standard tools	Familiar	✅ Yes (via mount)	Day-to-day work

Use Cases

Interactive Development:

# Mount for interactive work
nexus mount /mnt/nexus
vim /mnt/nexus/workspace/code.py
git clone /mnt/nexus/repos/myproject

Bulk Operations:

# Use rclone-style commands for efficiency
nexus sync /local/dataset/ /workspace/training-data/
nexus tree /workspace/ > structure.txt

Automated Workflows:

# Standard Unix tools in scripts
find /mnt/nexus -name "*.pdf" -exec grep -l "invoice" {} \;
rsync -av /mnt/nexus/workspace/ /backup/

Architecture

Agent Workspace Structure

Every agent gets a structured workspace at /workspace/{tenant}/{agent}/:

/workspace/acme-corp/research-agent/
├── .nexus/                          # Nexus metadata (Git-trackable)
│   ├── agent.yaml                   # Agent configuration
│   ├── commands/                    # Custom commands (markdown files)
│   │   ├── analyze-codebase.md
│   │   └── summarize-docs.md
│   ├── jobs/                        # Background job definitions
│   │   └── daily-summary.yaml
│   ├── memory/                      # File-based memory
│   │   ├── project-knowledge.md
│   │   └── recent-tasks.jsonl
│   └── secrets.encrypted            # KMS-encrypted credentials
├── data/                            # Agent's working data
│   ├── inputs/
│   └── outputs/
└── INSTRUCTIONS.md                  # Agent instructions (auto-loaded)

Path Namespace

/
├── workspace/        # Agent scratch space (hot tier, ephemeral)
├── shared/           # Shared tenant data (warm tier, persistent)
├── external/         # Pass-through backends (no content storage)
├── system/           # System metadata (admin-only)
└── archives/         # Cold storage (read-only)
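
One way to read the namespace table above is as a lookup from the first path segment to a storage policy. The sketch below encodes exactly the attributes listed in the diagram; the `Namespace` tuple, the `NAMESPACES` table, and `resolve_namespace` are hypothetical names, and a tier of `None` simply means the diagram does not state one.

```python
from typing import NamedTuple, Optional


class Namespace(NamedTuple):
    tier: Optional[str]  # storage tier, or None where the diagram names none
    read_only: bool
    admin_only: bool


NAMESPACES = {
    "workspace": Namespace("hot", False, False),   # agent scratch space
    "shared": Namespace("warm", False, False),     # shared tenant data
    "external": Namespace(None, False, False),     # pass-through backends
    "system": Namespace(None, False, True),        # system metadata
    "archives": Namespace("cold", True, False),    # cold storage
}


def resolve_namespace(path: str) -> Namespace:
    """Map an absolute virtual path to the policy of its top-level namespace."""
    if not path.startswith("/"):
        raise ValueError(f"path must be absolute, got {path!r}")
    top = path.strip("/").split("/", 1)[0]
    if top not in NAMESPACES:
        raise ValueError(f"unknown namespace in path {path!r}")
    return NAMESPACES[top]
```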

Core Components

File System Operations

import nexus

# Works in both local and hosted modes
# Mode determined by config file or environment
nx = nexus.connect()

async with nx:
    # Basic operations
    await nx.write("/workspace/data.txt", b"content")
    content = await nx.read("/workspace/data.txt")
    await nx.delete("/workspace/data.txt")

    # Batch operations
    files = await nx.list("/workspace/", recursive=True)
    results = await nx.copy_batch(sources, destinations)

    # File discovery
    python_files = await nx.glob("**/*.py")
    todos = await nx.grep(r"TODO:|FIXME:", file_pattern="*.py")

Semantic Search

# Search across documents with vector embeddings
async with nexus.connect() as nx:
    results = await nx.semantic_search(
        path="/docs/",
        query="How does authentication work?",
        limit=10,
        filters={"file_type": "markdown"}
    )

    for result in results:
        print(f"{result.path}:{result.line} - {result.text}")
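
Vector search of this kind usually comes down to ranking stored chunk embeddings by similarity to the query embedding. The sketch below shows that ranking step with cosine similarity on plain Python lists. It assumes the embeddings have already been computed by some model; `cosine_similarity` and `rank_chunks` are hypothetical helpers and say nothing about how Nexus itself scores results.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(
    query_vec: Sequence[float],
    chunk_vecs: Dict[str, Sequence[float]],
    limit: int = 10,
) -> List[Tuple[str, float]]:
    """Return up to `limit` (chunk_id, score) pairs, most similar first."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]
```

A production system would typically delegate this step to a vector database rather than scanning every chunk in Python.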

LLM-Powered Reading

# Read documents with AI, with automatic KV cache
async with nexus.connect() as nx:
    answer = await nx.llm_read(
        path="/reports/q4-2024.pdf",
        prompt="What were the top 3 challenges?",
        model="claude-sonnet-4",
        max_tokens=1000
    )

Agent Memory

# Store and retrieve agent memories
async with nexus.connect() as nx:
    await nx.store_memory(
        content="User prefers TypeScript over JavaScript",
        memory_type="preference",
        tags=["coding", "languages"]
    )

    memories = await nx.search_memories(
        query="programming language preferences",
        limit=5
    )

Prompt Optimization (Coming in v0.9.5)

# Track multiple prompt candidates during optimization
async with nexus.connect() as nx:
    # Start optimization run
    run_id = await nx.start_optimization_run(
        module_name="SearchModule",
        objectives=["accuracy", "latency", "cost"]
    )

    # Store prompt candidates with detailed traces
    for candidate in prompt_variants:
        version_id = await nx.store_prompt_version(
            module_name="SearchModule",
            prompt_template=candidate.template,
            metrics={"accuracy": 0.85, "latency_ms": 450},
            run_id=run_id
        )

        # Store execution traces for debugging
        await nx.store_execution_trace(
            prompt_version_id=version_id,
            inputs=test_inputs,
            outputs=predictions,
            intermediate_steps=reasoning_chain
        )

    # Analyze tradeoffs across candidates
    analysis = await nx.analyze_prompt_tradeoffs(
        run_id=run_id,
        objectives=["accuracy", "latency_ms", "cost_per_query"]
    )

    # Get per-example results to find failure patterns
    failures = await nx.get_failing_examples(
        prompt_version_id=version_id,
        limit=20
    )

Custom Commands

Create /workspace/{tenant}/{agent}/.nexus/commands/semantic-search.md:

---
name: semantic-search
description: Search codebase semantically
allowed-tools: [semantic_read, glob, grep]
required-scopes: [read]
model: sonnet
---

## Your task

Given query: {{query}}

1. Use `glob` to find relevant files by pattern
2. Use `semantic_read` to extract relevant sections
3. Summarize findings with file:line citations

Execute via API:

async with nexus.connect() as nx:
    result = await nx.execute_command(
        "semantic-search",
        context={"query": "authentication implementation"}
    )
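
The `{{query}}` placeholder in the command file suggests that values from `context` are substituted into the markdown body before it is sent to the model. A minimal sketch of such a substitution, assuming simple `{{name}}` placeholders and a hypothetical `render_command` helper, could look like this:

```python
import re
from typing import Mapping

# Matches {{name}} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_command(body: str, context: Mapping[str, str]) -> str:
    """Replace every {{name}} placeholder in body with context[name]."""

    def substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        if key not in context:
            raise KeyError("missing context value: " + key)
        return str(context[key])

    return PLACEHOLDER.sub(substitute, body)
```

Raising on a missing key, rather than leaving the placeholder in place, makes a misnamed context entry visible before any model call is made.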

Skills System (v0.3.0)

Manage reusable AI agent skills with SKILL.md format, progressive disclosure, lifecycle management, and dependency resolution:

import nexus
from nexus.skills import SkillRegistry, SkillManager, SkillExporter

# Initialize filesystem
nx = nexus.connect()

# Create skill registry
registry = SkillRegistry(nx)

# Discover skills from three tiers (agent > tenant > system)
# Loads metadata only - lightweight and fast
await registry.discover()

# List available skills
skills = registry.list_skills()
# ['analyze-code', 'data-processing', 'report-generation']

# Get skill metadata (no content loading)
metadata = registry.get_metadata("analyze-code")
print(f"{metadata.name}: {metadata.description}")
# analyze-code: Analyzes code quality and structure

# Load full skill content (lazy loading + caching)
skill = await registry.get_skill("analyze-code")
print(skill.content)  # Full markdown content

# Resolve dependencies automatically (DAG with cycle detection)
deps = await registry.resolve_dependencies("complex-skill")
# ['base-skill', 'helper-skill', 'complex-skill']

# Create skill manager for lifecycle operations
manager = SkillManager(nx, registry)

# Create new skill from template
await manager.create_skill(
    "my-analyzer",
    description="Analyzes code quality and structure",
    template="code-generation",  # basic, data-analysis, code-generation, document-processing, api-integration
    author="Alice",
    tier="agent"
)

# Fork existing skill with lineage tracking
await manager.fork_skill(
    "analyze-code",
    "my-custom-analyzer",
    tier="agent",
    author="Bob"
)

# Publish skill to tenant library
await manager.publish_skill(
    "my-analyzer",
    source_tier="agent",
    target_tier="tenant"
)

# Export skills to .zip (vendor-neutral)
exporter = SkillExporter(registry)

# Export with dependencies
await exporter.export_skill(
    "analyze-code",
    output_path="analyze-code.zip",
    format="claude",  # Enforces 8MB limit
    include_dependencies=True
)

# Validate before export
valid, msg, size = await exporter.validate_export("large-skill", format="claude")
if not valid:
    print(f"Cannot export: {msg}")

# Enterprise Features (NEW in v0.3.0)
from nexus.skills import (
    SkillAnalyticsTracker,
    SkillGovernance,
    SkillAuditLogger,
    AuditAction
)

# Track skill usage and analytics
tracker = SkillAnalyticsTracker(db_connection)
await tracker.track_usage(
    "analyze-code",
    agent_id="alice",
    execution_time=1.5,
    success=True
)

# Get analytics for a skill
analytics = await tracker.get_skill_analytics("analyze-code")
print(f"Success rate: {analytics.success_rate:.1%}")
print(f"Avg execution time: {analytics.avg_execution_time:.2f}s")

# Get dashboard metrics
dashboard = await tracker.get_dashboard_metrics()
print(f"Total skills: {dashboard.total_skills}")
print(f"Most used: {dashboard.most_used_skills[:5]}")

# Governance - approval workflow for org-wide skills
gov = SkillGovernance(db_connection)

# Submit for approval
approval_id = await gov.submit_for_approval(
    "my-analyzer",
    submitted_by="alice",
    reviewers=["bob", "charlie"],
    comments="Ready for team-wide use"
)

# Approve skill
await gov.approve_skill(approval_id, reviewed_by="bob", comments="Excellent work!")
is_approved = await gov.is_approved("my-analyzer")

# Audit logging for compliance
audit = SkillAuditLogger(db_connection)

# Log skill operations
await audit.log(
    "analyze-code",
    AuditAction.EXECUTED,
    agent_id="alice",
    details={"execution_time": 1.5, "success": True}
)

# Query audit logs
logs = await audit.query_logs(skill_name="analyze-code", action=AuditAction.EXECUTED)

# Generate compliance report
report = await audit.generate_compliance_report(tenant_id="tenant1")
print(f"Total operations: {report['total_operations']}")
print(f"Top skills: {report['top_skills'][:5]}")

# Search skills by description
results = await manager.search_skills("code analysis", limit=5)
for skill_name, score in results:
    print(f"{skill_name}: {score:.1f}")

Skills CLI Commands (v0.3.0)

Nexus provides comprehensive CLI commands for skill management:

# List all skills
nexus skills list
nexus skills list --tenant  # Show tenant skills
nexus skills list --system  # Show system skills
nexus skills list --tier agent  # Filter by tier

# Create new skill from template
nexus skills create my-skill --description "My custom skill"
nexus skills create data-viz --description "Data visualization" --template data-analysis
nexus skills create analyzer --description "Code analyzer" --author Alice

# Fork existing skill
nexus skills fork analyze-code my-analyzer
nexus skills fork data-analysis custom-analysis --author Bob

# Publish skill to tenant library
nexus skills publish my-skill
nexus skills publish shared-skill --from-tier tenant --to-tier system

# Search skills by description
nexus skills search "data analysis"
nexus skills search "code" --tier tenant --limit 5

# Show detailed skill information
nexus skills info analyze-code
nexus skills info data-analysis

# Export skill to .zip package (vendor-neutral)
nexus skills export my-skill --output ./my-skill.zip
nexus skills export analyze-code --output ./export.zip --format claude
nexus skills export my-skill --output ./export.zip --no-deps  # Exclude dependencies

# Validate skill format and size limits
nexus skills validate my-skill
nexus skills validate analyze-code --format claude

# Calculate skill size
nexus skills size my-skill
nexus skills size analyze-code --human

Available Templates:

basic - Simple skill template
data-analysis - Data processing and analysis
code-generation - Code generation and modification
document-processing - Document parsing and analysis
api-integration - API integration and data fetching

Export Formats:

generic - Vendor-neutral .zip format (no size limit)
claude - Anthropic Claude format (8MB limit enforced)
openai - OpenAI format (validation only, ready for future plugins)

Note: External API integrations (uploading to Claude API, OpenAI, etc.) will be implemented as plugins in v0.3.5+ to maintain vendor neutrality. The core CLI provides generic export functionality.
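
The size check behind a format-specific export can be sketched with the standard `zipfile` module. Several things here are assumptions: that the 8MB limit means 8 MiB, that it applies to the packed archive rather than the unpacked files, and that formats without a stated limit are unrestricted. `pack_skill`, `validate_export`, and `SIZE_LIMITS` are hypothetical names; the return shape mirrors the `valid, msg, size` tuple used in the Python example earlier.

```python
import io
import zipfile
from typing import Dict, Optional, Tuple

# Assumed limits in bytes; None means no limit is enforced in this sketch.
SIZE_LIMITS: Dict[str, Optional[int]] = {
    "generic": None,
    "claude": 8 * 1024 * 1024,
    "openai": None,
}


def pack_skill(files: Dict[str, bytes]) -> bytes:
    """Pack a mapping of archive names to contents into an in-memory .zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in sorted(files):
            archive.writestr(name, files[name])
    return buffer.getvalue()


def validate_export(package: bytes, export_format: str) -> Tuple[bool, str, int]:
    """Return (valid, message, size_in_bytes) for a packed skill."""
    if export_format not in SIZE_LIMITS:
        raise ValueError(f"unknown export format: {export_format}")
    limit = SIZE_LIMITS[export_format]
    size = len(package)
    if limit is not None and size > limit:
        return False, f"package is {size} bytes, over the {limit}-byte limit", size
    return True, "ok", size
```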

SKILL.md Format:

---
name: analyze-code
description: Analyzes code quality and structure
version: 1.0.0
author: Your Name
requires:
  - base-parser
  - ast-analyzer
---

# Code Analysis Skill

This skill analyzes code for quality metrics...

## Usage

1. Parse the code files
2. Run static analysis
3. Generate report
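
To show how the metadata and body of a SKILL.md file can be separated, here is a deliberately small parser for the frontmatter subset used above: `key: value` pairs and `- item` lists. It is a sketch, not the Nexus parser. `parse_skill_md` is a hypothetical name, and a real implementation would more likely use a full YAML library.

```python
from typing import Dict, List, Optional, Tuple, Union

Value = Union[str, List[str]]


def parse_skill_md(text: str) -> Tuple[Dict[str, Value], str]:
    """Split SKILL.md text into (frontmatter metadata, markdown body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter line")
    closing = [i for i in range(1, len(lines)) if lines[i].strip() == "---"]
    if not closing:
        raise ValueError("frontmatter is not closed with '---'")
    end = closing[0]

    metadata: Dict[str, Value] = {}
    current_key: Optional[str] = None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- "):
            # A list item belongs to the most recent key that had no inline value.
            if current_key is None or not isinstance(metadata[current_key], list):
                raise ValueError(f"unexpected list item: {stripped!r}")
            metadata[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        current_key = key.strip()
        metadata[current_key] = value.strip() if value.strip() else []
    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body
```

Because only the frontmatter is needed to list or search skills, a registry can call a function like this on the header alone and defer reading the body, which is the progressive-disclosure idea.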

Features:

Progressive Disclosure: Load metadata during discovery, full content on-demand
Lazy Loading: Skills cached only when accessed
Three-Tier Hierarchy: Agent skills override tenant/system skills
Dependency Resolution: Automatic DAG resolution with cycle detection
Skill Lifecycle: Create, fork, and publish skills with lineage tracking
Template System: 5 pre-built templates (basic, data-analysis, code-generation, document-processing, api-integration)
Vendor-Neutral Export: Generic .zip format with Claude/OpenAI validation
Usage Analytics: Track performance, success rates, dashboard metrics (NEW in v0.3.0)
Governance: Approval workflows for team-wide skill publication (NEW in v0.3.0)
Audit Logging: Complete compliance tracking and reporting (NEW in v0.3.0)
Skill Search: Find skills by description with relevance scoring (NEW in v0.3.0)
Comprehensive Tests: 156 passing tests (31%+ overall coverage, 65-91% skills module)
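
Dependency resolution with cycle detection is commonly done with a depth-first topological sort. The sketch below is one such approach, assuming each skill's `requires` list is already available as a dictionary; `resolve_dependencies` here is a standalone hypothetical function, not the `SkillRegistry` method.

```python
from typing import Dict, List


def resolve_dependencies(skill: str, requires: Dict[str, List[str]]) -> List[str]:
    """Return skill and its transitive dependencies, dependencies first."""
    order: List[str] = []
    state: Dict[str, str] = {}  # name -> "visiting" or "done"

    def visit(name: str, trail: List[str]) -> None:
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            # Reaching a node that is still on the current path means a cycle.
            raise ValueError("dependency cycle: " + " -> ".join(trail + [name]))
        if name not in requires:
            raise KeyError(f"unknown skill: {name}")
        state[name] = "visiting"
        for dep in requires[name]:
            visit(dep, trail + [name])
        state[name] = "done"
        order.append(name)

    visit(skill, [])
    return order
```

Each skill appears once in the result and always after everything it requires, so loading skills in that order never hits a missing dependency.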

Skill Tiers:

Agent (/workspace/.nexus/skills/) - Personal skills (highest priority)
Tenant (/shared/skills/) - Team-shared skills
System (/system/skills/) - Built-in skills (lowest priority)
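
The override rule implied by these priorities can be sketched as a lookup that walks the tiers from highest to lowest priority and returns the first match. `TIER_PRIORITY` and `resolve_skill` are hypothetical names, and the dictionary of discovered skills stands in for whatever the registry builds during discovery.

```python
from typing import Dict, Tuple

# Highest priority first, following the tier list above.
TIER_PRIORITY = ["agent", "tenant", "system"]


def resolve_skill(name: str, tiers: Dict[str, Dict[str, str]]) -> Tuple[str, str]:
    """Return (tier, skill_path) for the highest-priority tier defining name."""
    for tier in TIER_PRIORITY:
        skills = tiers.get(tier, {})
        if name in skills:
            return tier, skills[name]
    raise KeyError(f"skill not found in any tier: {name}")
```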

Technology Stack

Core

Language: Python 3.11+
API Framework: FastAPI
Database: PostgreSQL (prod) / SQLite (dev)
Cache: Redis (prod) / In-memory (dev)
Vector DB: Qdrant
Object Storage: S3-compatible, GCS, Azure Blob

AI/ML

LLM Providers: Anthropic Claude, OpenAI, Google Gemini
Embeddings: text-embedding-3-large, voyage-ai
Parsing: PyPDF2, pandas, openpyxl, Pillow

Infrastructure

Orchestration: Kubernetes (distributed mode)
Monitoring: Prometheus + Grafana
Logging: Structlog + Loki
Admin UI: Simple HTML/JS (jobs, memories, files, operations)

Performance Targets

Metric	Target	Impact
Write Throughput	500-1000 MB/s	10-50× faster than writing directly to the backend
Read Latency	<10ms	10-50× faster than reading from remote storage
Memory Search	<100ms	Vector search across memories
Storage Savings	30-50%	CAS deduplication
Job Resumability	100%	Survives all restarts
LLM Cache Hit Rate	50-90%	Major cost savings
Prompt Versioning	Full lineage	Track optimization history
Training Data Dedup	30-50%	CAS-based deduplication
Prompt Optimization	Multi-candidate	Test multiple strategies in parallel
Trace Storage	Full execution logs	Debug failures, analyze patterns

Configuration

Local Mode

import nexus

# Config via Python (useful for programmatic configuration)
nx = nexus.connect(config={
    "mode": "local",
    "data_dir": "./nexus-data",
    "cache_size_mb": 100,
    "enable_vector_search": True
})

# Or let it auto-discover from nexus.yaml
nx = nexus.connect()

Self-Hosted Deployment

For organizations that want to run their own Nexus instance, create config.yaml:

mode: server  # local or server

database:
  url: postgresql://user:pass@localhost/nexus
  # or for SQLite: sqlite:///./nexus.db

cache:
  type: redis  # memory, redis
  url: redis://localhost:6379

vector_db:
  type: qdrant
  url: http://localhost:6333

backends:
  - type: s3
    bucket: my-company-files
    region: us-east-1

  - type: gdrive
    credentials_path: ./gdrive-creds.json

auth:
  jwt_secret: your-secret-key
  token_expiry_hours: 24

rate_limits:
  default: "100/minute"
  semantic_search: "10/minute"
  llm_read: "50/hour"

Run server:

nexus server --config config.yaml
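
The `rate_limits` entries in the configuration use a `count/period` string such as `"100/minute"`. How the server enforces them is not described here. As an illustration only, the sketch below parses that notation and applies a simple fixed-window counter; `parse_rate_limit` and `FixedWindowLimiter` are hypothetical names, and the set of accepted period words is an assumption.

```python
import re
from typing import Dict, Tuple

PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rate_limit(spec: str) -> Tuple[int, int]:
    """Parse "100/minute" into (max_requests, window_seconds)."""
    match = re.fullmatch(r"(\d+)/(second|minute|hour|day)", spec.strip())
    if not match:
        raise ValueError(f"invalid rate limit: {spec!r}")
    return int(match.group(1)), PERIOD_SECONDS[match.group(2)]


class FixedWindowLimiter:
    """Allow at most `limit` calls per window; counters reset at window borders."""

    def __init__(self, spec: str) -> None:
        self.limit, self.window = parse_rate_limit(spec)
        self.counts: Dict[int, int] = {}

    def allow(self, now: float) -> bool:
        bucket = int(now // self.window)
        used = self.counts.get(bucket, 0)
        if used >= self.limit:
            return False
        self.counts = {bucket: used + 1}  # keep only the current window
        return True
```

Passing the timestamp in explicitly, for example `limiter.allow(time.time())`, keeps the limiter easy to test.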

Security

Multi-Layer Security Model

API Key Authentication: Tenant and agent identification
Row-Level Security (RLS): Database-level tenant isolation
Type-Level Validation: Fail-fast validation before database operations
UNIX-Style Permissions: Owner, group, and mode bits (v0.3.0)
ACL Permissions: Fine-grained access control lists (v0.3.0)
ReBAC (Relationship-Based Access Control): Zanzibar-style authorization (v0.3.0)
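
Of the layers listed above, the UNIX-style one is the simplest to picture: the owner bits apply to the owner, the group bits to group members, and the "other" bits to everyone else. The sketch below shows only that check, using the standard `stat` constants. `is_allowed` is a hypothetical function, and the ACL and ReBAC layers are not modelled.

```python
import stat
from typing import Collection

# Owner, group, and other bit for each access type.
_BITS = {
    "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
    "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
    "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
}


def is_allowed(
    mode: int,
    owner: str,
    group: str,
    user: str,
    user_groups: Collection[str],
    access: str,
) -> bool:
    """Check access ("r", "w" or "x") against owner, group, and other mode bits."""
    owner_bit, group_bit, other_bit = _BITS[access]
    if user == owner:
        return bool(mode & owner_bit)
    if group in user_groups:
        return bool(mode & group_bit)
    return bool(mode & other_bit)
```

With mode `0o640`, the owner can read and write, group members can only read, and all other users are denied.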

Type-Level Validation (NEW in v0.1.0)

All domain types have validation methods that are called automatically before database operations. This provides:

Fail Fast: Catch invalid data before expensive database operations
Clear Error Messages: Actionable feedback for developers and API consumers
Data Integrity: Prevent invalid data from entering the database
Consistent Validation: Same rules across all code paths

from nexus.core.metadata import FileMetadata
from nexus.core.exceptions import ValidationError

# Validation happens automatically on put()
try:
    metadata = FileMetadata(
        path="/data/file.txt",  # Must start with /
        backend_name="local",
        physical_path="/storage/file.txt",
        size=1024,  # Must be >= 0
    )
    store.put(metadata)  # Validates before DB operation
except ValidationError as e:
    print(f"Validation failed: {e}")
    # Example: "size cannot be negative, got -1"

Validation Rules:

Paths must start with / and not contain null bytes
File sizes and ref counts must be non-negative
Required fields (path, backend_name, physical_path, etc.) must not be empty
Content hashes must be valid 64-character SHA-256 hex strings
Metadata keys must be ≤ 255 characters
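
The rules above translate naturally into a `validate()` method on a dataclass. The following sketch is not the actual `FileMetadata` class: `FileRecord` and this local `ValidationError` are hypothetical stand-ins so the example is self-contained, and accepting uppercase hex digits in the hash is an assumption.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional


class ValidationError(ValueError):
    """Raised when a record breaks one of the validation rules."""


SHA256_HEX = re.compile(r"[0-9a-fA-F]{64}")


@dataclass
class FileRecord:
    path: str
    backend_name: str
    physical_path: str
    size: int
    content_hash: Optional[str] = None
    metadata: Dict[str, str] = field(default_factory=dict)

    def validate(self) -> None:
        """Fail fast, before any database work, on the first broken rule."""
        for name in ("path", "backend_name", "physical_path"):
            if not getattr(self, name):
                raise ValidationError(f"{name} must not be empty")
        if not self.path.startswith("/"):
            raise ValidationError(f"path must start with '/', got {self.path!r}")
        if "\x00" in self.path:
            raise ValidationError("path must not contain null bytes")
        if self.size < 0:
            raise ValidationError(f"size cannot be negative, got {self.size}")
        if self.content_hash is not None and not SHA256_HEX.fullmatch(self.content_hash):
            raise ValidationError("content_hash must be a 64-character SHA-256 hex string")
        for key in self.metadata:
            if len(key) > 255:
                raise ValidationError("metadata keys must be at most 255 characters")
```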

Example: Multi-Tenancy Isolation

-- RLS automatically filters queries by tenant
SET LOCAL app.current_tenant_id = '<tenant_uuid>';

-- All queries auto-filtered, even with bugs
SELECT * FROM file_paths WHERE path = '/data';
-- Returns only rows for current tenant

Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=nexus --cov-report=html

# Run specific test file
pytest tests/test_filesystem.py

# Run integration tests
pytest tests/integration/ -v

# Run performance tests
pytest tests/performance/ --benchmark-only

Documentation

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details.

# Fork the repo and clone
git clone https://github.com/yourusername/nexus.git
cd nexus

# Create a feature branch
git checkout -b feature/your-feature

# Make changes and test
uv pip install -e ".[dev,test]"
pytest

# Format and lint
ruff format .
ruff check .

# Commit and push
git commit -am "Add your feature"
git push origin feature/your-feature

License

Apache 2.0 License - see LICENSE for details.

Roadmap

v0.1.0 - Local Mode Foundation (Complete)

Core embedded filesystem (read/write/delete)
SQLite metadata store
Local filesystem backend
Basic file operations (list, glob, grep)
Virtual path routing
Directory operations (mkdir, rmdir, is_directory)
Basic CLI interface with Click and Rich
Metadata export/import (JSONL format)
SQL views for ready work detection
In-memory caching
Batch operations (avoid N+1 queries)
Type-level validation

v0.2.0 - FUSE Mount & Content-Aware Operations (Current)

FUSE filesystem mount - Mount Nexus to local path (e.g., /mnt/nexus)
Smart read mode - Return parsed text for binary files (PDFs, Excel, etc.)
Virtual file views - Auto-generate .txt and .md views for binary files
Content parser framework - Extensible parser system for document types (MarkItDown)
PDF parser - Extract text and markdown from PDFs
Excel/CSV parser - Parse spreadsheets to structured data
Content-aware file access - Access parsed content via virtual views
Document type detection - Auto-detect MIME types and route to parsers
Mount CLI commands - nexus mount, nexus unmount
Mount modes - Binary, text, and smart modes
.raw directory - Access original binary files
Background daemon mode - Run mount in background with --daemon
All FUSE operations - read, write, create, delete, mkdir, rmdir, rename, truncate
Unit tests - Comprehensive test coverage for FUSE operations
rclone-style CLI commands - sync, copy, move, tree, size with progress bars
Background parsing - Async content parsing on write
FUSE performance optimizations - Caching (TTL/LRU), cache invalidation, metrics
Image OCR parser - Extract text from images (PNG, JPEG)

v0.3.0 - File Permissions & Skills System

Permissions (Complete):

UNIX-style file permissions (owner, group, mode)
Permission operations (chmod, chown, chgrp)
ACL (Access Control List) support
CLI commands (getfacl, setfacl)
Database schema for permissions and ACL entries
Comprehensive tests (91 passing tests)
ReBAC (Relationship-Based Access Control) - Zanzibar-style authorization
Relationship types - member-of, owner-of, viewer-of, editor-of, parent-of
Permission inheritance via relationships - Team ownership, group membership
Relationship graph queries - Graph traversal with cycle detection
Namespaced tuples - (subject, relation, object) authorization model
Check API - Fast permission checks with 5-minute TTL caching
Expand API - Discover all subjects with specific permissions
Relationship management - Create, delete, query relationships via CLI
Expiring tuples - Temporary permissions with automatic cleanup
Comprehensive ReBAC tests (14 passing tests, 100% pass rate)

Permissions (Remaining):

Default permission policies per namespace
Permission inheritance for new files
Permission checking in all file operations
Permission migration for existing files

Skills System (Core - Vendor Neutral):

SKILL.md parser - Parse Anthropic-compatible SKILL.md with frontmatter
Skill registry - Progressive disclosure, lazy loading, three-tier hierarchy
Skill discovery - Scan /workspace/.nexus/skills/, /shared/skills/, /system/skills/
Dependency resolution - Automatic DAG resolution with cycle detection
Skill export - Export to generic formats (validate, pack, size check)
Skill templates - 5 pre-built templates (basic, data-analysis, code-generation, document-processing, api-integration)
Skill lifecycle - Create, fork, publish workflows with lineage tracking
Comprehensive tests - 156 passing tests (31%+ overall coverage, 65-91% skills module)
Skill analytics - Usage tracking, success rates, execution time, dashboard metrics
Skill search - Text-based search across skill descriptions with relevance scoring
Skill governance - Approval workflow for org-wide skills (submit, approve, reject)
Audit trails - Log all skill operations, compliance reporting, query by filters
Skill versioning - CAS-backed version control with history tracking
Semantic skill search - Vector-based search across skill descriptions
CLI commands - list, create, fork, publish, search, info, export, validate, size (see issue #88)

Note: External integrations (Claude API upload/download, OpenAI, etc.) will be implemented as plugins in v0.3.5+ to maintain vendor neutrality. Core Nexus provides generic skill export (nexus skills export --format claude), while nexus-plugin-anthropic handles API-specific operations.

v0.3.5 - Plugin System & External Integrations

Plugin discovery - Entry point-based plugin discovery
Plugin registry - Register and manage installed plugins
Plugin CLI namespace - nexus <plugin-name> <command> pattern
Plugin hooks - Lifecycle hooks (before_write, after_read, etc.)
Plugin configuration - Per-plugin config in ~/.nexus/plugins/<name>/
Plugin manager - nexus plugins list/install/uninstall/info
First-party plugins:
- nexus-plugin-anthropic - Claude API integration (upload/download skills)
- nexus-plugin-openai - OpenAI API integration
- nexus-plugin-skill-seekers - Integration with Skill_Seekers scraper

v0.4.0 - AI Integration

LLM provider abstraction
Anthropic Claude integration
OpenAI integration
Basic KV cache for prompts
Semantic search (vector embeddings)
LLM-powered document reading

v0.5.0 - Agent Workspaces

Agent workspace structure
File-based configuration (.nexus/)
Custom command system (markdown)
Basic agent memory storage
Memory consolidation
Memory reflection phase (ACE-inspired: extract insights from execution trajectories)
Strategy/playbook organization (ACE-inspired: organize memories as reusable strategies)

v0.6.0 - Server Mode (Self-Hosted & Managed)

FastAPI REST API
API key authentication
Multi-tenancy support
PostgreSQL support
Redis caching
Docker deployment
Batch/transaction APIs (atomic multi-operation updates)
Optimistic locking for concurrent writes
Auto-scaling configuration (for hosted deployments)

v0.7.0 - Extended Features & Event System

S3 backend support
Google Drive backend
Job system with checkpointing
OAuth token management
MCP server implementation
Webhook/event system (file changes, memory updates, job events)
Watch API for real-time updates (streaming changes to clients)
Server-Sent Events (SSE) support for live monitoring
Simple admin UI (jobs, memories, files, operation logs)
Operation logs table (track storage operations for debugging)

v0.8.0 - Advanced AI Features & Rich Query

Advanced KV cache with context tracking
Memory versioning and lineage
Multi-agent memory sharing
Enhanced semantic search
Importance-based memory preservation (ACE-inspired: prevent brevity bias in consolidation)
Context-aware memory retrieval (include execution context in search)
Automated strategy extraction (LLM-powered extraction from successful trajectories)
Rich memory query language (filter by metadata, importance, task type, date ranges, etc.)
Memory query builder API (fluent interface for complex queries)
Combined vector + metadata search (hybrid search)

v0.9.0 - Production Readiness

Monitoring and observability
Performance optimization
Comprehensive testing
Security hardening
Documentation completion
Optional OpenTelemetry export (for framework integration)

v0.9.5 - Prompt Engineering & Optimization

Prompt version control with lineage tracking
Training dataset storage with CAS deduplication
Evaluation metrics time series (performance tracking)
Frozen inference snapshots (immutable program state)
Experiment tracking export (MLflow, W&B integration)
Prompt diff viewer (compare versions)
Regression detection alerts (performance drops)
Multi-candidate pool management (concurrent prompt testing)
Execution trace storage (detailed run logs for debugging)
Per-example evaluation results (granular performance tracking)
Optimization run grouping (experiment management)
Multi-objective tradeoff analysis (accuracy vs latency vs cost)

v0.10.0 - Production Infrastructure & Auto-Scaling

Automatic infrastructure scaling
Redis distributed locks (for large deployments)
PostgreSQL replication (for high availability)
Kubernetes deployment templates
Multi-region load balancing
Automatic migration from single-node to distributed

v1.0.0 - Production Release

Complete feature set
Production-tested
Comprehensive documentation
Migration tools
Enterprise support

Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Email: support@nexus.example.com
Slack: Join our community

Built with ❤️ by the Nexus team

Download files

Source Distribution

nexus_ai_fs-0.2.4.tar.gz (641.8 kB)

Uploaded Oct 20, 2025 Source

Built Distribution

nexus_ai_fs-0.2.4-py3-none-any.whl (187.2 kB)

Uploaded Oct 20, 2025 Python 3

File details

Details for the file nexus_ai_fs-0.2.4.tar.gz.

File metadata

Download URL: nexus_ai_fs-0.2.4.tar.gz
Upload date: Oct 20, 2025
Size: 641.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.11

File hashes

Hashes for nexus_ai_fs-0.2.4.tar.gz
Algorithm	Hash digest
SHA256	`eb916b45be7801237cd3cbe51037a585a8293749fad0bd3dde5c7035ec91f463`
MD5	`7f5fc19103eb90410ff1ffadf7e78050`
BLAKE2b-256	`ee5507fa66a6b87451431d46ee6e42a8c7caf163f57c6bda74602fdfd5fb7dfa`

File details

Details for the file nexus_ai_fs-0.2.4-py3-none-any.whl.

File metadata

Download URL: nexus_ai_fs-0.2.4-py3-none-any.whl
Upload date: Oct 20, 2025
Size: 187.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.11

File hashes

Hashes for nexus_ai_fs-0.2.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`05115a32fe932340d97b4f9e93bc55d44dbe750e76249ab437ec9d9a07692090`
MD5	`5147dff69f8d69d9e53867a5b1987dcb`
BLAKE2b-256	`50727f6de5e4df564a16f50f6026fc47d7794304e9b992fd59919d3b18b07724`
