Durable, persistent knowledge graph for AI agents - local-first, schema-driven, cryptographically verified

Reason this release was yanked:

Deprecated: use novyx instead

Novyx Core

Code that endures. Intelligence that persists.

A production-ready persistent knowledge graph for AI agents with deep semantic understanding and cryptographic verification. Built for agents that think across time horizons (days, weeks, or months) without losing context.


🎯 Mission

Novyx Core is a persistent knowledge graph system designed for AI-first workflows. Unlike ephemeral chatbots or session-based systems, Novyx maintains cryptographically verified state across time, enabling:

  • Long-horizon intelligence: Projects that span weeks without context loss
  • Semantic memory: 384-dimensional embeddings for every artifact
  • AI discoverability: Machine-readable metadata for 2026+ AI search engines
  • Verifiable integrity: SHA-256 hashing and automated auditing
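
As a rough illustration of the SHA-256 integrity idea, the sketch below hashes a canonical JSON form of an artifact and checks it later. The canonicalization rule (sorted keys, compact separators) and the function names are assumptions made for this example; only the `novyx:integrityHash` field name comes from the Version schema shown later in this document.

```python
import hashlib
import json


def integrity_hash(artifact: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON form of the artifact."""
    # Sorting keys and fixing separators makes the digest independent of
    # key order and whitespace. The stored hash field itself is excluded.
    body = {k: v for k, v in artifact.items() if k != "novyx:integrityHash"}
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(artifact: dict) -> bool:
    """Compare the stored hash with a freshly computed one."""
    return artifact.get("novyx:integrityHash") == integrity_hash(artifact)
```

Any edit to a hashed field changes the digest, which is what lets an auditing tool flag tampering or accidental corruption.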

⚙️ Core Tools

Tool | Purpose | Key Features
🧠 Semantic Pulse | Ingest raw text and generate embeddings | all-MiniLM-L6-v2 model, auto-linking, external authority support
🛡️ Sentinel | Verify data integrity and schema compliance | Backward-compatible v1.0/v1.1.0, hash validation, link verification
🔍 Query Engine | Semantic search and knowledge graph traversal | Vector similarity search, authority filtering, graph statistics
🌐 AI Visibility Manager | Generate Schema.org entities for discoverability | Organization profile, external identity linking, trust signals
🎨 Interactive Dashboard | Web UI for all core tools | File upload, semantic search, integrity checks, graph visualization

🚀 Quick Start

Prerequisites

  • Python 3.10+ required
  • No cloud dependencies - runs entirely on your machine

Installation

Option 1: Install from PyPI (Recommended)

# Install core package
pip install novyx-core

# With semantic search support
pip install "novyx-core[semantic]"

# With API + GraphQL support
pip install "novyx-core[api]"

# Full installation (all features)
pip install "novyx-core[semantic,api,dashboard]"

Option 2: Install from source

# Clone the repository
git clone https://github.com/novyxlabs/novyx-core.git
cd novyx-core

# Install in development mode
pip install -e ".[dev]"

Option 3: Docker

# Using docker-compose (see docker-compose.yml)
docker-compose up -d

# Or run directly
docker run -p 8000:8000 -v "$(pwd)/memory:/app/memory" novyxlabs/novyx-core

Basic Usage

1. Ingest a new artifact

# Simple ingestion
python tools/ingestor.py --file inbox/note.txt --category research

# With external authority linking
python tools/ingestor.py \
  --file inbox/insight.txt \
  --category decisions \
  --link https://github.com/novyxlabs

2. Search the knowledge graph

# Semantic search (finds by meaning, not keywords)
python tools/query.py --search "deep learning and cognitive science"

# Authority search (filter by external links)
python tools/query.py --authority "github"

# Show statistics
python tools/query.py --stats

3. Verify integrity

# Run before every commit
python tools/sentinel.py

# Verbose mode
python tools/sentinel.py --verbose

4. Generate AI visibility report

python tools/entity_generator.py generate-report

5. Launch interactive dashboard

streamlit run tools/dashboard.py

Open http://localhost:8501 in a browser to use the web UI.


🛡️ Enforcement Automation (Layer 5)

Enable Commit Gate in 10 Seconds

Novyx Core includes automated integrity checks that can run before every commit:

# Install git hooks (one-time setup)
bash scripts/install_git_hooks.sh

# Or use the CLI
python3 -m tools.cli install-hooks

This creates a .git/hooks/pre-commit hook that:

  • ✅ Runs Sentinel integrity validation
  • ✅ Blocks commits if validation fails
  • ✅ (Optional) Enforces graph integrity with NOVYX_ENFORCE_GRAPH=1

Enable graph enforcement:

export NOVYX_ENFORCE_GRAPH=1  # Add to ~/.bashrc for persistence
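
To make the commit gate concrete, here is a minimal sketch of what a pre-commit check could do in Python. It is not the hook that `install_git_hooks.sh` installs; `run_gate` is a hypothetical name, and the assumption that Sentinel reads NOVYX_ENFORCE_GRAPH from its environment is ours.

```python
import subprocess
import sys


def run_gate(env, runner=subprocess.call):
    """Run Sentinel and return its exit code; non-zero means 'block the commit'."""
    command = [sys.executable, "tools/sentinel.py"]
    # The environment is forwarded so a flag such as NOVYX_ENFORCE_GRAPH=1
    # reaches the checker (assumed here to read the variable itself).
    code = runner(command, env=env)
    if code != 0:
        print("Sentinel validation failed; commit blocked.", file=sys.stderr)
    return code
```

A hook script would finish with `sys.exit(run_gate(dict(os.environ)))`; Git aborts the commit whenever a pre-commit hook exits with a non-zero status.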

Operator Reports

Generate daily snapshot reports for monitoring:

# Via CLI (recommended)
python3 -m tools.cli report --directory memory --out reports/

# Or directly
python3 -m tools.operator_report --directory memory --out reports/

Reports include:

  • 📊 Sentinel integrity summary (passed/failed counts)
  • 🔗 Graph statistics (artifacts, links, orphans)
  • 📅 New artifacts since last report
  • ⚠️ Risk notes (orphaned links, parse errors)

CLI Commands

Graph Analysis:

# Show graph statistics
python3 -m tools.cli graph --directory memory

# Explain an artifact
python3 -m tools.cli explain --id urn:uuid:... --directory memory
python3 -m tools.cli explain --file memory/decisions/my_decision.jsonld

# Repair orphaned links
python3 -m tools.cli repair --directory memory --dry-run
python3 -m tools.cli repair --directory memory  # Actually fix
python3 -m tools.cli repair --directory memory --create-stubs  # Create stub artifacts

Semantic Search (Phase 8):

# Basic semantic search (requires: pip install -e ".[semantic]")
python3 -m tools.cli query --directory memory --search "database selection"

# With custom similarity threshold (0.0-1.0)
python3 -m tools.cli query --directory memory --search "API authentication" --min-score 0.5

# Limit number of results
python3 -m tools.cli query --directory memory --search "GraphQL vs REST" --top-k 5

# JSON output for programmatic use
python3 -m tools.cli query --directory memory --search "microservices" --json

# Statistics only (no search)
python3 -m tools.cli query --directory memory

Direct Query Engine:

# Using query.py directly
python3 -m tools.query --search "machine learning embeddings"
python3 -m tools.query --search "vector embeddings" --min-score 0.7 --top-k 3
python3 -m tools.query --search "semantic similarity" --json
python3 -m tools.query --stats  # Show graph statistics
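
The `--min-score` and `--top-k` options can be pictured with a small ranking sketch. It assumes cosine similarity over embedding vectors, which the documentation does not state explicitly; `cosine_similarity` and `rank` are illustrative names, not part of the package.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, candidates, min_score=0.0, top_k=5):
    """Score every candidate, drop those below min_score, keep the best top_k.

    candidates maps an artifact id to its embedding vector.
    """
    scored = [(cosine_similarity(query_vec, vec), artifact_id)
              for artifact_id, vec in candidates.items()]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(reverse=True)
    return [(artifact_id, score) for score, artifact_id in kept[:top_k]]
```

Raising `min_score` trades recall for precision, while `top_k` only caps how many of the surviving matches are returned.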

CI/CD Integration

GitHub Actions workflow included at .github/workflows/ci.yml:

  • ✅ Compile checks
  • ✅ Unit tests
  • ✅ Sentinel validation
  • ✅ Graph enforcement

🌐 API Deployment (Phase 6)

RESTful API Server

Novyx Core exposes a production-ready FastAPI server for programmatic access, with 8 authenticated endpoints plus an unauthenticated health check:

Start API Server:

# Development (with auto-reload)
uvicorn tools.api:app --reload --host 0.0.0.0 --port 8000

# Production
uvicorn tools.api:app --host 0.0.0.0 --port 8000 --workers 4

Environment Variables:

export NOVYX_API_KEY="your-secret-key-here"
export NOVYX_RATE_LIMIT="100/minute"
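
The rate limit value follows a "count/period" shape. A minimal parsing sketch follows, assuming the period names shown in `PERIOD_SECONDS`; only "minute" appears in this document, and the other units and the `parse_rate_limit` helper are hypothetical.

```python
# Period names other than "minute" are assumptions for this sketch.
PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rate_limit(value: str):
    """Turn a string such as "100/minute" into (max_requests, window_seconds)."""
    count_text, _, period = value.partition("/")
    count = int(count_text)  # raises ValueError on non-numeric input
    if count <= 0 or period not in PERIOD_SECONDS:
        raise ValueError(f"unsupported rate limit: {value!r}")
    return count, PERIOD_SECONDS[period]
```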

Production Security

Novyx Core includes a startup security check that validates configuration before the API starts.

Required Environment Variables for Production:

Variable | Description | Requirement
NOVYX_API_KEY | API authentication key | Min 16 chars, no default markers
NOVYX_JWT_SECRET | JWT signing secret | Min 32 chars, no default markers

Optional Security Variables:

Variable | Description | Default
NOVYX_ENVIRONMENT | Set to development to suppress warnings | (none)
NOVYX_STRICT_SECURITY | Set to 1 to exit on insecure config | 0

Behavior:

  • Development (NOVYX_ENVIRONMENT=development): Prints gentle warnings to stdout
  • Non-development: Prints loud warnings to stderr
  • Strict mode (NOVYX_STRICT_SECURITY=1): Exits with code 1 if insecure (non-development only)
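
The three behaviors above can be sketched as one pure function. This is an approximation under stated assumptions: the minimum lengths come from the table of required variables, while the `DEFAULT_MARKERS` list and both function names are hypothetical.

```python
MIN_LENGTHS = {"NOVYX_API_KEY": 16, "NOVYX_JWT_SECRET": 32}
# Hypothetical placeholder fragments; the real "default markers" are not documented here.
DEFAULT_MARKERS = ("change-in-production", "your-")


def find_problems(env):
    """List human-readable problems with the secrets found in env."""
    problems = []
    for name, min_len in MIN_LENGTHS.items():
        value = env.get(name, "")
        if len(value) < min_len:
            problems.append(f"{name} must be at least {min_len} characters")
        elif any(marker in value for marker in DEFAULT_MARKERS):
            problems.append(f"{name} looks like a placeholder value")
    return problems


def startup_check(env):
    """Return (exit_code, stream, problems) for the given environment mapping."""
    problems = find_problems(env)
    development = env.get("NOVYX_ENVIRONMENT") == "development"
    # Development warnings go to stdout, everything else to stderr.
    stream = "stdout" if development else "stderr"
    strict = env.get("NOVYX_STRICT_SECURITY", "0") == "1"
    # Strict mode only aborts outside development.
    exit_code = 1 if problems and strict and not development else 0
    return exit_code, stream, problems
```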

CORS Configuration (Phase 12):

Variable | Description | Default
NOVYX_CORS_ORIGINS | Comma-separated list of allowed origins | http://localhost:3000,http://localhost:8080

# Production CORS - only allow your domains
export NOVYX_CORS_ORIGINS="https://app.example.com,https://admin.example.com"
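
For reference, a comma-separated origin list like this is usually split and trimmed before being handed to the CORS middleware. The helper below is a hypothetical sketch of that step, using the documented default as the fallback.

```python
DEFAULT_ORIGINS = "http://localhost:3000,http://localhost:8080"


def parse_cors_origins(env):
    """Split NOVYX_CORS_ORIGINS into a clean list, ignoring blanks and stray spaces."""
    raw = env.get("NOVYX_CORS_ORIGINS", DEFAULT_ORIGINS)
    return [origin.strip() for origin in raw.split(",") if origin.strip()]
```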

Input Validation (Phase 12):

All API endpoints validate:

  • artifact_id: Must be valid UUID format (urn:uuid:... or bare UUID)
  • tenant_id: Must be 3-64 lowercase alphanumeric with hyphens/underscores
  • Path traversal attacks (../, ..\) are blocked with HTTP 400
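
The validation rules above could be approximated as follows. The regular expressions are our reading of the listed constraints, not the package's actual validators, and all three function names are hypothetical.

```python
import re

UUID_PATTERN = re.compile(
    r"(urn:uuid:)?[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)
TENANT_PATTERN = re.compile(r"[a-z0-9_-]{3,64}")


def is_valid_artifact_id(value: str) -> bool:
    """Accept 'urn:uuid:<uuid>' or a bare hyphenated UUID."""
    return UUID_PATTERN.fullmatch(value) is not None


def is_valid_tenant_id(value: str) -> bool:
    """Accept 3-64 lowercase alphanumerics, hyphens, or underscores."""
    return TENANT_PATTERN.fullmatch(value) is not None


def has_path_traversal(value: str) -> bool:
    """Detect parent-directory hops in either slash style."""
    return "../" in value or "..\\" in value
```

An API handler would typically answer HTTP 400 as soon as any of these checks fails, before touching the filesystem.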

Example Production Configuration:

export NOVYX_API_KEY="$(openssl rand -hex 32)"
export NOVYX_JWT_SECRET="$(openssl rand -hex 32)"
export NOVYX_CORS_ORIGINS="https://app.example.com"
export NOVYX_STRICT_SECURITY="1"

API Documentation:

Endpoints

Method | Endpoint | Description | Auth Required
GET | /health | Health check | ❌
POST | /api/v1/create/{type} | Create artifact | ✅
POST | /api/v1/ingest/text | Ingest raw text | ✅
POST | /api/v1/ingest/decision | Create Decision | ✅
GET | /api/v1/query/search | Semantic search | ✅
GET | /api/v1/query/stats | Graph statistics | ✅
GET | /api/v1/graph | Graph analysis | ✅
POST | /api/v1/validate | Run Sentinel checks | ✅
POST | /api/v1/repair | Repair orphaned links | ✅

cURL Examples

1. Health Check:

curl http://localhost:8000/health

2. Create a Decision Artifact:

curl -X POST http://localhost:8000/api/v1/ingest/decision \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Choose Database",
    "context": "Need to select a database for new service",
    "options_considered": ["PostgreSQL", "MongoDB", "SQLite"],
    "chosen_option": "PostgreSQL",
    "reasoning": "Relational data with strong ACID guarantees",
    "constraints": ["Budget under $100/month", "Must scale to 10k users"],
    "assumptions": ["Data is relational", "Team knows SQL"],
    "confidence": 0.85,
    "expected_outcome": "Reliable data storage with good query performance",
    "category": "architecture"
  }'

3. Search the Knowledge Graph:

curl "http://localhost:8000/api/v1/query/search?search_term=database&limit=5" \
  -H "X-API-Key: novyx-dev-key-change-in-production"

4. Get Graph Statistics:

curl "http://localhost:8000/api/v1/query/stats?directory=memory" \
  -H "X-API-Key: novyx-dev-key-change-in-production"

5. Analyze Graph (with orphan detection):

curl "http://localhost:8000/api/v1/graph?directory=memory&include_orphans=true" \
  -H "X-API-Key: novyx-dev-key-change-in-production"

6. Validate Artifacts:

curl -X POST http://localhost:8000/api/v1/validate \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "directory": "memory",
    "enforce_graph": false,
    "verbose": true
  }'

7. Repair Graph (Dry Run):

curl -X POST http://localhost:8000/api/v1/repair \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "directory": "memory",
    "dry_run": true,
    "create_stubs": false
  }'

8. Create Any Artifact Type:

curl -X POST http://localhost:8000/api/v1/create/apiendpoint \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "payload": {
      "title": "New Endpoint",
      "method": "GET",
      "path": "/api/v1/example",
      "description": "Example endpoint",
      "parameters": [],
      "response_schema": "{\"status\": \"success\"}"
    },
    "external_links": ["https://github.com/novyxlabs"]
  }'
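
The same calls can be issued from Python with only the standard library. The sketch below builds requests for two of the endpoints listed above; the helper names and the `BASE_URL` constant are illustrative, and response handling is left out.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # assumes the local development server


def search_request(term, limit, api_key):
    """Build a GET request for /api/v1/query/search."""
    query = urllib.parse.urlencode({"search_term": term, "limit": limit})
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/query/search?{query}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


def decision_request(payload, api_key):
    """Build a POST request for /api/v1/ingest/decision with a JSON body."""
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/ingest/decision",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing either request to `urllib.request.urlopen` sends it; building the request separately keeps URL encoding and headers easy to inspect.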

Docker Deployment

Build & Run:

# Build image
docker build -t novyx-core-api .

# Run container
docker run -d \
  -p 8000:8000 \
  -e NOVYX_API_KEY="your-production-key" \
  -e NOVYX_RATE_LIMIT="200/minute" \
  -v "$(pwd)/memory:/app/memory" \
  --name novyx-api \
  novyx-core-api

# View logs
docker logs -f novyx-api

# Stop container
docker stop novyx-api

Vercel Deployment

Novyx Core API can be deployed to Vercel with minimal configuration:

1. Create vercel.json:

{
  "builds": [
    {
      "src": "tools/api.py",
      "use": "@vercel/python"
    }
  ],
  "routes": [
    {
      "src": "/(.*)",
      "dest": "tools/api.py"
    }
  ],
  "env": {
    "NOVYX_API_KEY": "@novyx-api-key",
    "NOVYX_RATE_LIMIT": "100/minute"
  }
}

2. Deploy:

# Install Vercel CLI
npm install -g vercel

# Deploy
vercel --prod

# Set environment variable
vercel env add NOVYX_API_KEY production

3. Access:

https://your-project.vercel.app/health
https://your-project.vercel.app/api/docs

Note: For Vercel deployment, the memory/ directory is ephemeral. Consider using:

  • External storage (S3, Google Cloud Storage)
  • Database backend (PostgreSQL with JSON columns)
  • Git-based persistence (commit artifacts on change)

Security Best Practices

  1. Change Default API Key:

    export NOVYX_API_KEY="$(openssl rand -hex 32)"
    
  2. Use HTTPS in Production:

    • Deploy behind reverse proxy (nginx, Caddy)
    • Or use platform SSL (Vercel, Railway, Fly.io)
  3. Configure Rate Limiting:

    export NOVYX_RATE_LIMIT="50/minute"  # Adjust per environment
    
  4. Monitor API Usage:

    python3 -m tools.cli report --directory memory --out reports/
    # Check API usage metrics in report
    

🔷 GraphQL API (Phase 7)

New in v1.3.0: Novyx Core now provides a GraphQL interface alongside the REST API, offering flexible querying with a single endpoint.

GraphQL Playground

Access the interactive GraphQL Playground at:

http://localhost:8000/graphql

Features:

  • 🎯 Single endpoint for all operations
  • 📖 Auto-generated documentation & schema introspection
  • 🔍 5 Query operations (read data)
  • ✏️ 4 Mutation operations (modify data)
  • 🔐 API key authentication (same as REST API)
  • 📊 Real-time query execution with GraphiQL interface

GraphQL Operations

Queries (Read Operations):

  1. hello - Health check
  2. artifactTypes - List all registered artifact types
  3. searchArtifacts - Text-based search across artifacts
  4. graphStats - Knowledge graph statistics
  5. validateGraph - Run Sentinel validation

Mutations (Write Operations):

  1. createDecision - Create Decision artifact
  2. ingestText - Ingest raw text as DigitalDocument
  3. createArtifact - Create generic artifact
  4. repairGraph - Repair orphaned links

GraphQL Examples

1. Hello Query (Health Check):

query {
  hello
}

cURL equivalent:

curl -X POST http://localhost:8000/graphql \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{"query": "{ hello }"}'

2. List Artifact Types:

query {
  artifactTypes
}

Response:

{
  "data": {
    "artifactTypes": ["decision", "apiendpoint", "graphqloperation"]
  }
}

3. Search Artifacts:

query SearchArtifacts($searchTerm: String!, $threshold: Float) {
  searchArtifacts(searchTerm: $searchTerm, threshold: $threshold, directory: "memory") {
    artifactId
    title
    similarityScore
    category
    createdAt
    snippet
  }
}

Variables:

{
  "searchTerm": "database decision",
  "threshold": 0.3
}

cURL:

curl -X POST http://localhost:8000/graphql \
  -H "X-API-Key: novyx-dev-key-change-in-production" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "query SearchArtifacts($searchTerm: String!) { searchArtifacts(searchTerm: $searchTerm) { artifactId title similarityScore } }",
    "variables": {"searchTerm": "database"}
  }'
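
Hand-escaping a GraphQL query inside a shell string is error-prone, so it can help to build the JSON body programmatically. A small sketch, where `graphql_body` is a hypothetical helper and the query text mirrors the search example above:

```python
import json

SEARCH_QUERY = """
query SearchArtifacts($searchTerm: String!, $threshold: Float) {
  searchArtifacts(searchTerm: $searchTerm, threshold: $threshold) {
    artifactId
    title
    similarityScore
  }
}
"""


def graphql_body(query, variables=None):
    """Serialize a GraphQL operation into the JSON body expected over HTTP POST."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    return json.dumps(payload).encode("utf-8")
```

The resulting bytes go in the POST body sent to /graphql with the same `X-API-Key` and `Content-Type: application/json` headers as the cURL call.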

4. Get Graph Statistics:

query {
  graphStats(directory: "memory") {
    totalArtifacts
    totalIds
    totalLinks
    orphanedLinks
    artifactsByType
    artifactsByCategory
    latestTimestamp
  }
}

5. Validate Graph:

query {
  validateGraph(directory: "memory", enforceGraph: true) {
    totalArtifacts
    passed
    failed
    uniqueIds
    success
    errors
  }
}

6. Create Decision (Mutation):

mutation CreateDecision($input: CreateDecisionInput!) {
  createDecision(input: $input) {
    success
    artifactId
    filePath
    message
  }
}

Variables:

{
  "input": {
    "title": "Choose GraphQL vs REST",
    "context": "Evaluating API design approach",
    "optionsConsidered": ["GraphQL only", "REST only", "Hybrid GraphQL + REST"],
    "chosenOption": "Hybrid GraphQL + REST",
    "reasoning": "GraphQL provides flexible querying, REST offers simplicity. Hybrid gives best of both.",
    "confidence": 0.9,
    "expectedOutcome": "Flexible API with wide compatibility",
    "category": "architecture",
    "constraints": [],
    "assumptions": [],
    "externalLinks": []
  }
}

7. Ingest Text (Mutation):

mutation IngestText($input: IngestTextInput!) {
  ingestText(input: $input) {
    success
    artifactId
    message
  }
}

Variables:

{
  "input": {
    "text": "GraphQL provides a strongly-typed schema and efficient data fetching with a single endpoint.",
    "category": "research",
    "externalLinks": ["https://graphql.org"]
  }
}

8. Create Generic Artifact (Mutation):

mutation CreateArtifact($input: CreateArtifactInput!) {
  createArtifact(input: $input) {
    success
    artifactId
    filePath
  }
}

Variables:

{
  "input": {
    "artifactType": "apiendpoint",
    "payload": "{\"title\":\"New Endpoint\",\"method\":\"GET\",\"path\":\"/api/v2/data\",\"description\":\"Fetch data\",\"parameters\":[],\"response_schema\":\"{}\"}",
    "externalLinks": []
  }
}

9. Repair Graph (Mutation):

mutation RepairGraph($input: RepairInput!) {
  repairGraph(input: $input) {
    orphansFound
    linksRemoved
    stubsCreated
    filesModified
    success
    message
  }
}

Variables:

{
  "input": {
    "directory": "memory",
    "dryRun": true,
    "createStubs": false
  }
}

GraphQL vs REST: When to Use Each

Feature | GraphQL | REST
Best For | Complex queries, flexible data fetching | Simple CRUD, caching, standard HTTP
Endpoint | Single (/graphql) | Multiple (/api/v1/*)
Data Fetching | Request exactly what you need | Fixed response structure
Documentation | Auto-generated schema introspection | Swagger/OpenAPI
Learning Curve | Steeper (query language) | Easier (HTTP verbs)

Recommendation: Use both! GraphQL for complex querying, REST for simple operations.

Testing GraphQL

Run the GraphQL test suite:

python3 -m pytest tests/test_graphql.py -v
# 22 tests covering all operations

🏢 Multi-Tenant Architecture (Phase 9)

Novyx Core supports multi-tenant deployments with isolated data, JWT authentication, and role-based access control.

Key Features

  • 🔐 JWT Authentication: Secure token-based authentication with role hierarchy
  • 🏢 Data Isolation: Tenant-specific subdirectories ensure complete data separation
  • 👥 Role-Based Access: Admin, Editor, and Viewer roles with hierarchical permissions
  • 🔍 Tenant Filtering: CLI and API support for tenant-scoped operations
  • 📊 Tenant Metrics: Per-tenant analytics and usage tracking

Tenant Structure

Each tenant is defined by a Tenant artifact in memory/tenants/:

{
  "tenant_id": "acme-corp",
  "name": "Acme Corporation",
  "admin_email": "admin@acme-corp.com",
  "plan_type": "enterprise",
  "status": "active",
  "max_artifacts": 10000,
  "features": ["semantic_search", "graphql", "api_access", "multi_user"],
  "users": [
    {
      "email": "admin@acme.com",
      "role": "admin",
      "name": "Admin User"
    },
    {
      "email": "editor@acme.com",
      "role": "editor",
      "name": "Editor User"
    }
  ]
}

Authentication Flow

1. Login to get JWT token:

curl -X POST http://localhost:8000/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{
    "tenant_id": "acme-corp",
    "email": "admin@acme.com",
    "password": "your-password"
  }'

Response:

{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "bearer",
  "tenant_id": "acme-corp",
  "email": "admin@acme.com",
  "role": "admin",
  "expires_in": 3600
}

2. Use JWT token for authenticated requests:

curl "http://localhost:8000/api/v1/query/search?search_term=database" \
  -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."

3. Check current user context:

curl http://localhost:8000/api/v1/auth/me \
  -H "Authorization: Bearer your-jwt-token"
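
The token prefix in the login response decodes to an HS256 JWT header. To show what such a token contains, here is a standard-library sketch of HS256 signing and verification. The claim names (`sub`, `tenant_id`, `role`) are assumptions; this is not the project's auth code, and a maintained JWT library is the safer choice in production.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    mac = hmac.new(secret.encode("utf-8"), signing_input.encode("utf-8"), hashlib.sha256)
    return _b64url(mac.digest())


def encode_token(claims: dict, secret: str) -> str:
    """Build header.payload.signature using HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def decode_token(token: str, secret: str, now=None) -> dict:
    """Verify the signature and expiry, then return the claims."""
    signing_input, _, signature = token.rpartition(".")
    expected = _sign(signing_input, secret)
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if claims.get("exp", float("inf")) < (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims
```

The constant-time comparison in `decode_token` avoids leaking how much of a forged signature matched.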

Role-Based Permissions

Role | Permissions
Viewer | Read artifacts, search, query statistics
Editor | All viewer permissions + create/modify artifacts
Admin | All editor permissions + user management, tenant configuration

Permission hierarchy: Admin > Editor > Viewer
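
A hierarchy like this is commonly implemented by ranking the roles and comparing ranks. A minimal sketch with hypothetical names:

```python
# Higher rank includes every permission of the lower ranks.
ROLE_RANK = {"viewer": 1, "editor": 2, "admin": 3}


def has_role(user_role: str, required_role: str) -> bool:
    """True when user_role is at or above required_role; unknown roles get no access."""
    return ROLE_RANK.get(user_role, 0) >= ROLE_RANK[required_role]
```

For example, an endpoint that creates artifacts would require "editor", which admits editors and admins but rejects viewers.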

CLI Tenant Operations

Query tenant-specific artifacts:

python3 -m tools.cli query --directory memory/decisions --tenant acme-corp
# Shows only artifacts for acme-corp tenant

Validate tenant artifacts:

python3 -m tools.cli validate --directory memory/decisions --tenant acme-corp
# Validates only acme-corp's artifacts

Repair tenant graph:

python3 -m tools.cli repair --directory memory/decisions --tenant beta-testing --dry-run
# Repairs graph for beta-testing tenant only

Data Isolation

Tenant artifacts are automatically stored in isolated subdirectories:

memory/
├── decisions/
│   ├── acme-corp/          # Isolated from other tenants
│   │   ├── decision_1.jsonld
│   │   └── decision_2.jsonld
│   └── beta-testing/       # Completely separate
│       ├── decision_1.jsonld
│       └── decision_2.jsonld
├── tenants/
│   ├── acme-corp/
│   │   └── acme_corporation.jsonld
│   └── beta-testing/
│       └── beta_testing.jsonld

Security guarantee: Filesystem isolation prevents tenants from accessing each other's data.
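
Directory-based isolation only holds if a tenant identifier can never resolve outside its category directory. The sketch below shows one way to enforce that with `pathlib`; `tenant_dir` is a hypothetical helper, not the package's implementation.

```python
from pathlib import Path


def tenant_dir(base: Path, tenant_id: str) -> Path:
    """Resolve base/tenant_id and refuse anything that lands outside base."""
    base = base.resolve()
    candidate = (base / tenant_id).resolve()
    # After resolution, base must be a strict ancestor of the candidate path.
    if candidate == base or base not in candidate.parents:
        raise ValueError(f"tenant path escapes {base}")
    return candidate
```

Resolving before comparing is what defeats inputs such as "../tenants" or an absolute path.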

Creating Tenant-Scoped Artifacts

Using the factory with tenant_id:

from tools.factory import create_artifact, persist_artifact
from pathlib import Path

# Create artifact for specific tenant
artifact = create_artifact(
    "decision",
    {
        "title": "Technical Decision",
        "context": "Choosing database",
        "options_considered": ["PostgreSQL", "MongoDB"],
        "chosen_option": "PostgreSQL",
        "reasoning": "Better consistency guarantees",
        "constraints": ["Budget", "Timeline"],
        "assumptions": ["Relational data model"],
        "confidence": 0.9,
        "expected_outcome": "Stable database layer",
        "category": "architecture"
    },
    tenant_id="acme-corp"  # Tenant isolation
)

# Automatically persists to memory/decisions/acme-corp/
path = persist_artifact(artifact, Path("memory/decisions"))

Environment Variables

Configure JWT authentication:

# JWT secret key (REQUIRED in production)
export NOVYX_JWT_SECRET="your-secret-key-min-32-chars"

# Token expiration (default: 60 minutes)
export NOVYX_JWT_EXPIRE_MINUTES="120"

# API key (backward compatibility)
export NOVYX_API_KEY="your-api-key"

Testing Multi-Tenant Features

Run the multi-tenant test suite:

python3 -m pytest tests/test_multi_tenant.py -v
# Tests: JWT auth, roles, isolation, cross-tenant security

Migration from Single-Tenant

Existing artifacts without tenant_id continue to work:

  • Stored in base directory (e.g., memory/decisions/)
  • Accessible via global API key authentication
  • Can be migrated by adding tenant_id field and moving to tenant subdirectory

📦 Artifact Versioning & History (Phase 10)

Novyx Core provides comprehensive version control for all artifacts, enabling full audit trails, diff-based storage, and rollback capabilities.

Key Features

  • 📜 Complete History: Every artifact update creates a version record with diff
  • 🔄 Rollback: Restore artifacts to any previous version
  • 💾 Diff-Based Storage: Space-efficient versioning using unified diffs
  • 🔐 Integrity Verified: Each version has its own cryptographic hash
  • 👥 Tenant-Aware: Version histories are isolated per tenant
  • 🎯 Automated: Versioning happens automatically on artifact updates

How It Works

When an artifact is updated via persist_artifact(), Novyx automatically:

  1. Loads the previous version of the artifact
  2. Computes a unified diff between old and new content
  3. Creates a Version artifact with the diff, timestamp, and change metadata
  4. Stores the version in memory/versions/{artifact_id}/
  5. Updates the main artifact with new content and hash

CLI Version Management

List all versions for an artifact:

python3 -m tools.cli version --list --artifact-id urn:uuid:abc123

Output:

📋 Version History for urn:uuid:abc123
   Total versions: 5

   v1: 2026-01-10T14:30:00Z
      UUID: urn:uuid:version-001
   v2: 2026-01-11T09:15:00Z
      UUID: urn:uuid:version-002
   v3: 2026-01-12T16:45:00Z
      UUID: urn:uuid:version-003

Rollback to a specific version:

python3 -m tools.cli version --rollback --artifact-id urn:uuid:abc123 --version 2

Output:

🔄 Rolling back artifact urn:uuid:abc123 to version 2...
✅ Rollback successful!
   Updated: memory/decisions/my_decision_a1b2c3d4.jsonld

Tenant-scoped version operations:

python3 -m tools.cli version --list --artifact-id urn:uuid:xyz789 --tenant acme-corp
python3 -m tools.cli version --rollback --artifact-id urn:uuid:xyz789 --version 1 --tenant acme-corp

API Version Endpoints

GET /api/v1/version/{artifact_id}/list - List versions

curl -X GET "http://localhost:8000/api/v1/version/abc123/list" \
  -H "X-API-Key: your-api-key"

Response:

{
  "status": "success",
  "message": "Found 5 version(s) for artifact urn:uuid:abc123",
  "data": {
    "artifact_id": "urn:uuid:abc123",
    "version_count": 5,
    "versions": [
      {
        "version_number": 1,
        "uuid": "urn:uuid:version-001",
        "timestamp": "2026-01-10T14:30:00+00:00",
        "change_summary": "Initial decision created",
        "changed_by": "user@example.com"
      },
      {
        "version_number": 2,
        "uuid": "urn:uuid:version-002",
        "timestamp": "2026-01-11T09:15:00+00:00",
        "change_summary": "Enhanced reasoning",
        "changed_by": "user@example.com"
      }
    ]
  }
}

POST /api/v1/version/{artifact_id}/rollback - Rollback to version

curl -X POST "http://localhost:8000/api/v1/version/abc123/rollback" \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"version_number": 2}'

Response:

{
  "status": "success",
  "message": "Artifact rolled back to version 2",
  "data": {
    "artifact_id": "urn:uuid:abc123",
    "version_number": 2,
    "updated_path": "memory/decisions/my_decision_a1b2c3d4.jsonld",
    "updated_at": "2026-01-13T10:30:00+00:00"
  }
}

Version Schema

Each Version artifact follows the novyx:Version schema:

{
  "@type": ["novyx:Version", "schema:CreativeWork"],
  "uuid": "urn:uuid:version-002",
  "artifact_id": "urn:uuid:abc123",
  "version_number": 2,
  "parent_version": "urn:uuid:version-001",
  "diff": "--- old\n+++ new\n@@ -1,3 +1,3 @@\n...",
  "change_summary": "Enhanced reasoning with more details",
  "changed_by": "user@example.com",
  "change_type": "update",
  "artifact_snapshot": null,
  "createdAt": "2026-01-11T09:15:00+00:00",
  "novyx:integrityHash": "7f83b1657ff1fc53b92dc18148a1d65dfc2d4b1fa3d677284addd200126d9069"
}

Note: Version 1 includes artifact_snapshot with the full initial state. Subsequent versions store only diffs.

Programmatic Usage

from tools.versioning import list_versions, rollback_to_version
from tools.factory import persist_artifact
from pathlib import Path

# List versions
versions = list_versions("urn:uuid:abc123")
print(f"Found {len(versions)} versions")

# Rollback to version 2
restored_artifact = rollback_to_version("urn:uuid:abc123", 2)

# Persist the restored artifact (skip_versioning to avoid circular versioning)
output_dir = Path("memory/decisions")
path = persist_artifact(restored_artifact, output_dir, skip_versioning=True)

Testing Versioning

Run the versioning test suite (16 tests):

python3 -m pytest tests/test_versioning.py -v
# Tests: version creation, diffs, listing, rollback, multi-tenant, integrity

Version Storage Structure

memory/
├── versions/
│   ├── abc123-def4-5678-9012-345678901234/  # Artifact-specific directory
│   │   ├── version_001.jsonld  # v1 with full snapshot
│   │   ├── version_002.jsonld  # v2 with diff
│   │   └── version_003.jsonld  # v3 with diff
│   └── acme-corp/  # Tenant-specific versions
│       └── xyz789-abc1-2345-6789-012345678901/
│           ├── version_001.jsonld
│           └── version_002.jsonld

Use Cases

  • Audit Trails: Track who changed what and when
  • Compliance: Maintain immutable history for regulatory requirements
  • Experimentation: Try changes and rollback if needed
  • Collaboration: Review version history to understand decision evolution
  • Recovery: Restore from accidental modifications

📦 Streaming & Batch Operations (Phase 13)

Novyx Core supports memory-efficient streaming and batch operations for large artifact collections, enabling scalable deployments.

Key Features

  • 🌊 Streaming Iteration: Generator-based artifact loading with O(1) memory per item
  • 📦 Batch Create: Create up to 100 artifacts in a single operation
  • ✅ Batch Validation: Stream-validate artifacts without loading entire dataset
  • 🏢 Multi-Tenant Aware: All streaming operations support tenant filtering
  • 🔍 Streaming Semantic Search: Search large datasets with bounded memory

CLI Batch Commands

Count artifacts (fast, no content loading):

python3 -m tools.cli batch --count --directory memory
python3 -m tools.cli batch --count --directory memory --tenant acme-corp

Test streaming performance:

python3 -m tools.cli batch --stream-test --directory memory --batch-size 50

Batch create from JSON file:

# Create artifacts.json with array of artifact payloads
python3 -m tools.cli batch --create artifacts.json --directory memory

# With tenant isolation
python3 -m tools.cli batch --create artifacts.json --directory memory --tenant acme-corp

# Skip versioning for bulk imports
python3 -m tools.cli batch --create artifacts.json --directory memory --skip-versioning

Stream-validate artifacts:

python3 -m tools.cli batch --validate --directory memory --batch-size 100

API Batch Endpoints

POST /api/v1/batch/create - Create multiple artifacts

curl -X POST http://localhost:8000/api/v1/batch/create \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "artifacts": [
      {
        "@type": "novyx:Decision",
        "title": "Decision 1",
        "context": "Context",
        "options_considered": ["A", "B"],
        "chosen_option": "A",
        "reasoning": "Reasoning",
        "constraints": [],
        "assumptions": [],
        "confidence": 0.9,
        "expected_outcome": "Success",
        "category": "architecture"
      },
      {
        "@type": "novyx:Decision",
        "title": "Decision 2",
        "context": "Another context",
        "options_considered": ["X", "Y"],
        "chosen_option": "X",
        "reasoning": "Another reasoning",
        "constraints": [],
        "assumptions": [],
        "confidence": 0.85,
        "expected_outcome": "Success",
        "category": "architecture"
      }
    ],
    "tenant_id": "acme-corp",
    "skip_versioning": false
  }'

Response:

{
  "status": "success",
  "message": "Batch create completed: 2 created, 0 failed",
  "data": {
    "created": 2,
    "failed": 0,
    "results": [
      {"index": 0, "artifact_id": "urn:uuid:...", "path": "...", "type": "decision"},
      {"index": 1, "artifact_id": "urn:uuid:...", "path": "...", "type": "decision"}
    ],
    "errors": []
  }
}

GET /api/v1/batch/count - Count artifacts

curl "http://localhost:8000/api/v1/batch/count?directory=memory&tenant_id=acme-corp" \
  -H "X-API-Key: your-api-key"

POST /api/v1/batch/validate - Stream-validate artifacts

curl -X POST http://localhost:8000/api/v1/batch/validate \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"directory": "memory", "batch_size": 50}'

Programmatic Usage

from tools.batch import (
    stream_artifacts,
    batch_create,
    batch_validate,
    count_artifacts
)
from pathlib import Path

# Stream artifacts in memory-efficient batches
for batch in stream_artifacts(Path("memory"), batch_size=100):
    for artifact in batch:
        process(artifact)  # process() stands in for your own handler

# Stream with tenant filter
for batch in stream_artifacts(Path("memory"), tenant_id="acme-corp"):
    print(f"Processing {len(batch)} artifacts")

# Batch create artifacts
artifacts = [
    {"@type": "novyx:Decision", "title": "Decision 1", ...},
    {"@type": "novyx:Decision", "title": "Decision 2", ...},
]
result = batch_create(artifacts, tenant_id="acme-corp")
print(f"Created: {result['created']}, Failed: {result['failed']}")

# Stream-validate with progress tracking
for validation_result in batch_validate(Path("memory"), batch_size=50):
    print(f"Batch: {validation_result['passed']}/{validation_result['batch_size']} passed")

# Fast artifact count (no content loading)
count = count_artifacts(Path("memory"), tenant_id="acme-corp")

Streaming Semantic Search

For large datasets, use streaming semantic search to keep memory usage bounded:

from tools.query import QueryEngine

engine = QueryEngine()

# Streaming search across large datasets
results = engine.search_semantic_streaming(
    "deployment strategy",
    top_k=10,
    min_score=0.3,
    batch_size=100,
    tenant_id="acme-corp"
)
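The idea behind a bounded-memory search can be illustrated without the library. The sketch below is not Novyx's implementation: it assumes each batch has already been scored into `(score, artifact_id)` pairs and keeps only the best `top_k` hits in a min-heap, so memory stays proportional to `top_k` rather than to the dataset. The name `top_k_streaming` is hypothetical.

```python
import heapq
from typing import Iterable


def top_k_streaming(
    scored_batches: Iterable[list[tuple[float, str]]],
    top_k: int = 10,
    min_score: float = 0.3,
) -> list[tuple[float, str]]:
    """Keep only the best top_k (score, artifact_id) pairs while consuming batches."""
    heap: list[tuple[float, str]] = []  # min-heap: the weakest kept hit sits at heap[0]
    for batch in scored_batches:
        for score, artifact_id in batch:
            if score < min_score:
                continue  # below the relevance threshold
            if len(heap) < top_k:
                heapq.heappush(heap, (score, artifact_id))
            elif score > heap[0][0]:
                # Evict the weakest hit in favor of the better one.
                heapq.heapreplace(heap, (score, artifact_id))
    return sorted(heap, reverse=True)


# Three small pre-scored batches (made-up scores and IDs).
batches = [
    [(0.91, "a"), (0.12, "b")],
    [(0.55, "c"), (0.78, "d")],
    [(0.31, "e")],
]
best = top_k_streaming(batches, top_k=3)
print(best)
```

Because only one batch and the heap are held at a time, the `batch_size` parameter trades throughput against peak memory.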

Batch Limits

| Operation | Limit | Notes |
|---|---|---|
| Batch Create | 100 | Maximum artifacts per request |
| Stream Batch | 100 | Maximum batch size (configurable) |
| Validation | 100 | Maximum validation batch size |
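When a job has more than 100 artifacts to create, the list has to be split on the client side before it is submitted. A minimal sketch, assuming only the per-request limit from the Batch Limits table; `chunked` and `MAX_BATCH_CREATE` are hypothetical names, and each chunk would be passed to `batch_create` or posted to the batch endpoint in turn.

```python
from typing import Iterator, Sequence

MAX_BATCH_CREATE = 100  # per-request limit from the Batch Limits table


def chunked(items: Sequence[dict], size: int = MAX_BATCH_CREATE) -> Iterator[Sequence[dict]]:
    """Yield consecutive slices of items, each at most size long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 250 placeholder artifacts split into request-sized chunks.
artifacts = [{"title": f"Decision {i}"} for i in range(250)]
sizes = [len(chunk) for chunk in chunked(artifacts)]
print(sizes)  # [100, 100, 50]
```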

Testing Batch Operations

python3 -m pytest tests/test_batch.py -v
# 19 tests covering streaming, batch create, validation, multi-tenant

📤 Export/Import Formats (Phase 11)

Novyx Core can export to and import from several formats, for migrations, integrations, and interoperability with external systems.

Supported Formats

| Format | Export | Import | Description |
|---|---|---|---|
| JSON-LD | ✅ | ✅ | Native format, consolidated graph export |
| RDF/Turtle | ✅ | ✅ | Semantic web standard (requires rdflib) |
| Neo4j Cypher | ✅ | ❌ | Graph database import scripts |

CLI Export Commands

Export to JSON-LD:

python3 -m tools.cli export --format jsonld --output artifacts.jsonld --directory memory

Export to RDF/Turtle:

python3 -m tools.cli export --format rdf --output graph.ttl --directory memory

Export to Neo4j Cypher:

python3 -m tools.cli export --format neo4j --output import.cypher --directory memory

Export specific tenant:

python3 -m tools.cli export --format jsonld --output tenant_data.jsonld --directory memory --tenant acme-corp

CLI Import Commands

Import from JSON-LD:

python3 -m tools.cli import --input artifacts.jsonld --format jsonld --output memory

Import from RDF/Turtle:

python3 -m tools.cli import --input graph.ttl --format rdf --output memory

Import with tenant assignment:

python3 -m tools.cli import --input external.jsonld --format jsonld --output memory --tenant imported-org

Preserve original hashes (no regeneration):

python3 -m tools.cli import --input backup.jsonld --format jsonld --output memory --no-regenerate-hashes

API Export/Import Endpoints

POST /api/v1/export - Export artifacts

curl -X POST http://localhost:8000/api/v1/export \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "directory": "memory",
    "format": "jsonld",
    "tenant_id": "acme-corp",
    "include_embeddings": false
  }'

Response:

{
  "status": "success",
  "message": "Exported 42 artifact(s) to JSONLD",
  "data": {
    "total_artifacts": 42,
    "exported": 42,
    "failed": 0,
    "format": "jsonld",
    "content_base64": "eyJAY29udGV4dCI6Li4u..."
  }
}
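The exported document comes back base64-encoded in `data.content_base64`. The sketch below assumes that field decodes to the exported file's bytes; `save_export` is a hypothetical helper, and the sample response is a stand-in because the real payload is elided above.

```python
import base64
import json
import tempfile
from pathlib import Path


def save_export(response: dict, destination: Path) -> int:
    """Decode data.content_base64 and write it to destination; return bytes written."""
    payload = base64.b64decode(response["data"]["content_base64"], validate=True)
    destination.write_bytes(payload)
    return len(payload)


# Stand-in response built locally; a real one would come from POST /api/v1/export.
document = json.dumps({"@graph": []}).encode("utf-8")
fake_response = {
    "status": "success",
    "data": {
        "format": "jsonld",
        "content_base64": base64.b64encode(document).decode("ascii"),
    },
}

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "export.jsonld"
    written = save_export(fake_response, out)
    restored = json.loads(out.read_text(encoding="utf-8"))

print(written, restored)
```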

POST /api/v1/import - Import artifacts (multipart/form-data)

curl -X POST http://localhost:8000/api/v1/import \
  -H "X-API-Key: your-api-key" \
  -F "file=@artifacts.jsonld" \
  -F "format=jsonld" \
  -F "output_dir=memory" \
  -F "tenant_id=imported-org"

Programmatic Usage

from tools.export import export_artifacts, ExportFormat
from tools.import_data import import_artifacts, ImportFormat
from pathlib import Path

# Export to JSON-LD
stats = export_artifacts(
    Path("memory"),
    ExportFormat.JSONLD,
    Path("export.jsonld"),
    tenant_id="acme-corp"
)
print(f"Exported {stats['exported']} artifacts")

# Export to RDF/Turtle (requires rdflib)
stats = export_artifacts(
    Path("memory"),
    ExportFormat.RDF,
    Path("graph.ttl"),
    rdf_format="turtle"
)
print(f"Created {stats['triples']} RDF triples")

# Export to Neo4j Cypher
stats = export_artifacts(
    Path("memory"),
    ExportFormat.NEO4J,
    Path("import.cypher")
)
print(f"Generated {stats['nodes']} Neo4j nodes")

# Import from JSON-LD
stats = import_artifacts(
    Path("export.jsonld"),
    ImportFormat.JSONLD,
    Path("memory/restored"),
    tenant_id="restored-tenant",
    regenerate_hashes=True
)
print(f"Imported {stats['imported']} artifacts")

Neo4j Import Example

After exporting to Cypher, import into Neo4j:

# Export from Novyx
python3 -m tools.cli export --format neo4j --output novyx_import.cypher --directory memory

# Import to Neo4j (using cypher-shell)
cypher-shell -u neo4j -p password < novyx_import.cypher

The generated Cypher includes:

  • Uniqueness constraints on uuid
  • CREATE statements for each artifact
  • Labels based on artifact type (:Artifact, :Decision, :APIEndpoint, etc.)
  • Relationship creation for sameAs links

Round-Trip Integrity

An export followed by a re-import preserves all core data:

# Export
export_artifacts(Path("memory"), ExportFormat.JSONLD, Path("backup.jsonld"))

# Import (with hash regeneration)
import_artifacts(Path("backup.jsonld"), ImportFormat.JSONLD, Path("memory/restored"))

# Validate restored artifacts
from tools.sentinel import NovyxSentinel
sentinel = NovyxSentinel()
results = sentinel.audit(Path("memory/restored"))
print(f"Validation: {results['passed']}/{results['total']} passed")

Testing Export/Import

python3 -m pytest tests/test_export_import.py -v
# 17 tests covering round-trip, integrity, multi-tenant

Federation: Distributed Sync (Phase 14)

Novyx Core supports federation: synchronizing artifacts between multiple Novyx instances.

RemoteRef: Tracking Remote Artifacts

from tools.factory import create_remote_ref, persist_artifact
from pathlib import Path

# Create a reference to a remote artifact
ref = create_remote_ref(
    remote_url="https://partner.novyx.ai/api/artifacts/urn:uuid:dec-123",
    remote_artifact_id="urn:uuid:dec-123",
    remote_instance="https://partner.novyx.ai",
    sync_status="pending",
    direction="pull",
    metadata={"name": "Partner Decision", "type": "Decision"}
)

# Persist the reference
persist_artifact(ref, Path("memory/federation"))

CLI: Federation Commands

# Check federation status
novyx sync status

# List all remote references
novyx sync list

# List pending refs only
novyx sync list --status pending

# Sync all pending refs
novyx sync run

# Sync with conflict resolution override
novyx sync run --conflict-resolution remote_wins

# Pull specific artifact from remote (requires requests)
novyx sync pull --remote https://partner.novyx.ai --artifact-id urn:uuid:dec-123

API: Federation Endpoints

# Get federation status
curl -X GET "http://localhost:8000/api/v1/federation/status" \
  -H "Authorization: Bearer YOUR_API_KEY"

# List remote references
curl -X GET "http://localhost:8000/api/v1/federation/refs?status=pending" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Pull artifact from remote
curl -X POST "http://localhost:8000/api/v1/federation/pull?remote_instance=https://partner.novyx.ai&artifact_id=urn:uuid:dec-123" \
  -H "Authorization: Bearer YOUR_API_KEY"

# Sync pending refs
curl -X POST "http://localhost:8000/api/v1/federation/sync" \
  -H "Authorization: Bearer YOUR_API_KEY"
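The pull endpoint carries a URL and a URN in its query string, so when the request is built in code it is safer to percent-encode both values. A minimal sketch using only the standard library; `build_pull_url` is a hypothetical helper, and the base URL and parameter names are taken from the curl example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://localhost:8000/api/v1/federation/pull"


def build_pull_url(remote_instance: str, artifact_id: str) -> str:
    """Return the pull URL with both query values percent-encoded."""
    query = urlencode({"remote_instance": remote_instance, "artifact_id": artifact_id})
    return f"{BASE_URL}?{query}"


url = build_pull_url("https://partner.novyx.ai", "urn:uuid:dec-123")
print(url)

# Decoding the query string recovers the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded)
```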

Conflict Resolution Strategies

| Strategy | Behavior |
|---|---|
| remote_wins | Remote changes overwrite local (default) |
| local_wins | Local changes preserved, remote ignored |
| manual | Mark as conflict, require manual resolution |
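The three strategies can be pictured as a small decision function. This is an illustrative sketch only, not Novyx's sync code: `ConflictStrategy`, `resolve`, and the `"synced"` / `"conflict"` status strings are hypothetical names chosen for the example.

```python
from enum import Enum
from typing import Any, Optional


class ConflictStrategy(str, Enum):
    REMOTE_WINS = "remote_wins"
    LOCAL_WINS = "local_wins"
    MANUAL = "manual"


def resolve(
    local: dict[str, Any],
    remote: dict[str, Any],
    strategy: ConflictStrategy = ConflictStrategy.REMOTE_WINS,
) -> tuple[Optional[dict[str, Any]], str]:
    """Return (artifact_to_keep, status) for one local/remote pair."""
    if local == remote:
        return local, "synced"  # nothing to resolve
    if strategy is ConflictStrategy.REMOTE_WINS:
        return remote, "synced"  # remote overwrites local
    if strategy is ConflictStrategy.LOCAL_WINS:
        return local, "synced"  # remote change is ignored
    return None, "conflict"  # manual: leave both untouched and flag it


local_copy = {"title": "Use Postgres", "confidence": 0.8}
remote_copy = {"title": "Use Postgres", "confidence": 0.9}

print(resolve(local_copy, remote_copy))
print(resolve(local_copy, remote_copy, ConflictStrategy("local_wins")))
print(resolve(local_copy, remote_copy, ConflictStrategy.MANUAL))
```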

Testing Federation

python3 -m pytest tests/test_federation.py -v
# 10+ tests covering sync, conflicts, remote refs

🤖 AI Visibility: 100/100 Discoverability Score

Novyx Core is designed to be machine-discoverable by next-generation AI search agents. Our organization.jsonld file provides:

  • ✅ Schema.org compliance (Organization, SoftwareApplication)
  • ✅ External identity verification (sameAs links to GitHub, LinkedIn)
  • ✅ Semantic metadata (product descriptions, trust signals)
  • ✅ Verifiable credentials (cryptographic integrity hashing)

For AI agents: Start at organization.jsonld to understand what Novyx Labs builds.


๐Ÿ—๏ธ Architecture

Data Model: JSON-LD 1.1

All artifacts are stored as JSON-LD (Linked Data) documents, ensuring:

  • Interoperability: Standard RDF vocabularies (Schema.org, custom novyx: namespace)
  • Semantic richness: Type information, relationships, and context preserved
  • Future-proof: Data remains readable even as tools evolve

Schema Versions

| Version | ID Format | Hash Field | Timestamp Field | Status |
|---|---|---|---|---|
| v1.0 | Persistent_ID | Integrity_Hash | Temporal_Marker | ✅ Supported |
| v1.1.0 | uuid | novyx:integrityHash | novyx:ingestedAt | ✅ Current |

Backward compatibility: All tools support both schemas seamlessly.
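Code that reads artifacts directly can handle both schemas by trying the v1.1.0 name first and falling back to the v1.0 name. The sketch below is a hypothetical helper, not the library's loader, and it treats the entries in the ID column (`uuid`, `Persistent_ID`) as field names, which is an assumption.

```python
from typing import Any, Optional

# v1.1.0 name first, then the v1.0 name, per the Schema Versions table.
FIELD_NAMES: dict[str, tuple[str, str]] = {
    "id": ("uuid", "Persistent_ID"),  # assumption: the ID column lists field names
    "hash": ("novyx:integrityHash", "Integrity_Hash"),
    "timestamp": ("novyx:ingestedAt", "Temporal_Marker"),
}


def read_core_fields(artifact: dict[str, Any]) -> dict[str, Optional[Any]]:
    """Return id, hash, and timestamp regardless of schema version (None if absent)."""
    core: dict[str, Optional[Any]] = {}
    for name, candidates in FIELD_NAMES.items():
        core[name] = next((artifact[key] for key in candidates if key in artifact), None)
    return core


# Placeholder artifacts, one per schema version.
v10 = {"Persistent_ID": "urn:uuid:old", "Integrity_Hash": "abc", "Temporal_Marker": "2025-01-01T00:00:00Z"}
v11 = {"uuid": "urn:uuid:new", "novyx:integrityHash": "def", "novyx:ingestedAt": "2025-06-01T00:00:00Z"}

print(read_core_fields(v10))
print(read_core_fields(v11))
```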

Local-First Philosophy

Novyx Core runs entirely on your machine:

  • โŒ No cloud dependencies
  • โŒ No API keys required
  • โŒ No telemetry or tracking
  • โœ… Full control over your data
  • โœ… Works offline
  • โœ… Git-friendly (plain JSON files)

📊 Example: Knowledge Graph in Action

$ python tools/query.py --stats

📊 Knowledge Graph Statistics
------------------------------
Total Artifacts: 6
  • research: 1
  • health: 3
  • decisions: 2
------------------------------
🧠 Semantic Connections: 3
🌐 Authority Links:      1
🛡️  Integrity Status:     Active

$ python tools/query.py --search "AI agents"

🔍 Semantic Results for: 'AI agents'
--------------------------------------------------
📄 Vision (Score: 0.8432)
   📂 Category: decisions
   💡 Excerpt: Novyx Core is designed to be discoverable by AI agents in 2026.
   🌐 Authority: https://github.com/novyxlabs

🔒 Integrity Guarantees

Every artifact includes:

  1. Cryptographic Hash: SHA-256 of content (excluding the hash field itself)
  2. Temporal Marker: ISO 8601 timestamp
  3. Persistent ID: URN:UUID that never changes
  4. Context Links: Traceable relationships to other artifacts
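The first guarantee, hashing the content while excluding the hash field itself, can be sketched as follows. This is not Sentinel's actual algorithm: the canonical form used here (JSON with sorted keys and compact separators) is an assumption, so hashes produced this way may not match the ones Novyx stores. The helper names are hypothetical.

```python
import hashlib
import json
from typing import Any

HASH_FIELD = "novyx:integrityHash"  # v1.1.0 hash field name


def compute_integrity_hash(artifact: dict[str, Any], hash_field: str = HASH_FIELD) -> str:
    """SHA-256 over the artifact with the hash field removed."""
    content = {k: v for k, v in artifact.items() if k != hash_field}
    # Assumed canonical form: sorted keys, no extra whitespace, UTF-8.
    canonical = json.dumps(content, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(artifact: dict[str, Any], hash_field: str = HASH_FIELD) -> bool:
    """True if the stored hash matches a fresh hash of the content."""
    return artifact.get(hash_field) == compute_integrity_hash(artifact, hash_field)


# Placeholder artifact: stamp it, then tamper with a copy.
artifact = {"@type": "novyx:Decision", "title": "Decision 1"}
artifact[HASH_FIELD] = compute_integrity_hash(artifact)
tampered = {**artifact, "title": "Decision 1 (edited)"}

print(verify(artifact), verify(tampered))
```

Excluding the hash field before hashing is what lets the digest be stored inside the same document it protects.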

Pre-commit enforcement: .cursorrules mandates running sentinel.py before every commit, ensuring:

  • ✅ All hashes are valid
  • ✅ No orphaned links
  • ✅ Schema compliance
  • ✅ No data corruption

📚 Documentation


๐Ÿ› ๏ธ Development

Running Tests

# Validate all artifacts
python tools/sentinel.py --verbose

# Generate AI visibility report
python tools/entity_generator.py generate-report

Adding New Artifacts

# 1. Create a text file in inbox/
echo "Your insight here" > inbox/new_idea.txt

# 2. Ingest with semantic pulse
python tools/ingestor.py --file inbox/new_idea.txt --category research

# 3. Verify integrity
python tools/sentinel.py

# 4. Commit if passed
git add memory/
git commit -m "Add new research artifact"

🌟 Why Novyx Core?

| Problem | Traditional AI | Novyx Core |
|---|---|---|
| Context Loss | Forgets between sessions | Persistent memory with embeddings |
| Data Integrity | No verification | SHA-256 + automated auditing |
| AI Visibility | Invisible to search | Schema.org + external authority links |
| Vendor Lock-in | Cloud-dependent | Local-first, open standards |
| Ephemeral | Session-based | Designed for weeks/months |

๐Ÿค Contributing

Novyx Core follows strict durability and legibility standards. Before contributing:

  1. Read agents/MANIFESTO.md
  2. Follow .cursorrules protocols
  3. Ensure python tools/sentinel.py passes
  4. Write clear commit messages explaining "why," not "what"

📜 License

MIT License - See LICENSE for details.


🔗 Links


Novyx Labs: Building AI that remembers, learns, and persists.

"The best way to predict the future is to build systems that endure."
