An agentic tensor database with unified SDK, agent orchestration, and intelligent workflows for ML/AI applications.
Tensorus: Agentic Tensor Database/Data Lake
New in v0.0.5: Unified Python SDK with intuitive API, Agent Orchestrator for multi-agent workflows, and comprehensive examples. See What's New for details.
Tensorus is a production-ready, specialized data platform focused on the management and agent-driven manipulation of tensor data. It offers a streamlined environment for storing, retrieving, and operating on tensors at scale, providing the foundation for advanced AI and machine learning workflows.
What Makes Tensorus Special
Tensorus bridges the gap between traditional databases and AI/ML requirements by providing:
- Intelligent Agent Framework - Built-in agents for data ingestion, reinforcement learning, AutoML, and embedding generation
- High-Performance Tensor Operations - 40+ optimized operations with 10-100x performance improvements
- Natural Language Queries - Intuitive NQL interface for tensor discovery and analysis
- Complete Observability - Full computational lineage and operation history tracking
- Production-Grade Architecture - Enterprise security, scaling, and deployment capabilities
The core purpose of Tensorus is to simplify and accelerate how developers and AI agents interact with tensor datasets, enabling faster development of automated data ingestion, reinforcement learning from stored experiences, AutoML processes, and intelligent data utilization in AI projects.
Quick Start (3 Minutes)
Installation
# Install from PyPI
pip install tensorus
# Or install from source for development
git clone https://github.com/tensorus/tensorus.git
cd tensorus
pip install -e .
Basic Usage with Python SDK
from tensorus import Tensorus
import torch
# Initialize Tensorus SDK (minimal dependencies)
ts = Tensorus(
enable_nql=False, # Disable if transformers not installed
enable_embeddings=False, # Disable if sentence-transformers not installed
enable_vector_search=False
)
# Create a dataset
ts.create_dataset("my_dataset")
# Create and store tensors
tensor_a = ts.create_tensor(
[[1, 2], [3, 4]],
name="matrix_a",
dataset="my_dataset"
)
tensor_b = ts.create_tensor(
[[5, 6], [7, 8]],
name="matrix_b",
dataset="my_dataset"
)
# Perform operations
result = ts.matmul(tensor_a.to_tensor(), tensor_b.to_tensor())
print(f"Result shape: {result.shape}") # (2, 2)
# List all tensors
tensors = ts.list_tensors("my_dataset")
print(f"Stored {len(tensors)} tensors")
Start the API Server
# Start development server
python -m uvicorn tensorus.api:app --reload --port 8000
# Access interactive API docs at:
# - Swagger UI: http://localhost:8000/docs
# - ReDoc: http://localhost:8000/redoc
Python SDK Features
The Tensorus SDK provides a unified interface for all tensor operations, agent coordination, and data management.
Core SDK Operations
from tensorus import Tensorus
# Full initialization with all features
ts = Tensorus(
enable_nql=True, # Natural Query Language
enable_embeddings=True, # Embedding generation
enable_vector_search=True, # Vector similarity search
enable_orchestrator=True, # Multi-agent workflows
embedding_model="all-MiniLM-L6-v2"
)
# Dataset management
ts.create_dataset("research_data")
ts.list_datasets()
ts.delete_dataset("old_data")
# Tensor operations
a = ts.create_tensor([[1, 2], [3, 4]], name="matrix_a", dataset="research_data")
b = ts.create_tensor([[5, 6], [7, 8]], name="matrix_b", dataset="research_data")
# Mathematical operations
result = ts.matmul(a.to_tensor(), b.to_tensor())
transposed = ts.transpose(a.to_tensor())
eigenvals = ts.eigenvalues(a.to_tensor())
# Natural language queries (requires enable_nql=True)
results = ts.query("find tensors in research_data where shape is (2, 2)")
# Vector operations (requires enable_embeddings=True)
ts.create_index("docs", dimensions=384, metric="cosine")
ts.embed_and_index(
texts=["Machine learning paper", "Deep learning tutorial"],
index_name="docs",
dataset="research_data"
)
search_results = ts.search("neural networks", index_name="docs", top_k=5)
# Multi-agent workflows (requires enable_orchestrator=True)
workflow = ts.create_workflow("data_pipeline")
ts.orchestrator.add_task(workflow, "embed", "embedding", "generate", {...})
ts.orchestrator.add_task(workflow, "index", "vector", "index", {...}, deps=["embed"])
results = ts.execute_workflow(workflow)
SDK Benefits
- Unified Interface - Single entry point for all Tensorus capabilities
- Lazy Loading - Agents load only when enabled, reducing dependencies
- Type Safety - Full type hints for IDE autocomplete and validation
- Error Handling - Comprehensive exception handling with helpful messages
- Performance - Optimized for both single-node and distributed workloads
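For example, lazy loading and the error-handling behavior go hand in hand: a minimally configured client fails fast when a disabled feature is called, instead of importing heavy dependencies up front. A small sketch (the exact exception type raised for a disabled feature is SDK-specific, so a broad except is used here):

```python
from tensorus import Tensorus

# The NQL agent is never loaded because the feature is disabled
ts = Tensorus(enable_nql=False)

try:
    ts.query("find tensors where shape is (2, 2)")
except Exception as exc:  # exact exception type depends on the SDK version
    print(f"NQL is disabled: {exc}")
```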
Documentation
For comprehensive documentation, including user guides and examples, please visit our documentation site.
Interactive API Documentation
Access the interactive API documentation when the server is running:
- Swagger UI: `http://localhost:8000/docs` - Interactive API exploration with "Try it out" functionality
- ReDoc: `http://localhost:8000/redoc` - Clean, responsive API documentation
Quick Links
- Getting Started Guide - Learn the basics of Tensorus
- Examples - Practical code examples including `basic_usage.py` and `complete_workflow_example.py`
- Deployment Guide - Production deployment instructions
Comprehensive Documentation
Learning Resources
- Documentation Hub - Central portal with guided learning paths for all skill levels
- Getting Started Guide - Complete 15-minute tutorial with real examples
- Use Case Examples - Real-world implementations and practical guides
Technical References
- Complete API Reference - Full REST API documentation with code samples
- Production Deployment - Enterprise deployment strategies and operations
- Performance & Scaling - Benchmarks, optimization, and capacity planning
Business & Strategy
- Executive Overview - Product positioning, market analysis, and business value
- Architecture Guide - System design and technical architecture
What's New in v0.0.5
Major Release - Unified SDK and Agent Orchestration
New Features
- Unified Tensorus SDK - Single `Tensorus` class with an intuitive API for all operations
- Agent Orchestrator - Multi-agent workflow coordination with DAG-based execution
- Updated Examples - All examples now use the new SDK (`examples/basic_usage.py`, `examples/complete_workflow_example.py`)
- Benchmarking Suite - Comprehensive performance testing framework (`benchmarks/benchmark_suite.py`)
- Lazy Agent Loading - Agents only load when enabled, reducing startup dependencies
- Enhanced Documentation - Complete SDK reference and implementation guides
Breaking Changes
- SDK Interface - New unified API replaces direct component access (migration is straightforward - see Quick Start)
- Optional Dependencies - NQL, embeddings, and vector search now require explicit enabling
Improvements
- Better error messages for missing dependencies
- Cleaner separation of concerns
- Improved performance through optimized initialization
- More intuitive API naming
See QUICKSTART.md for the migration guide and `examples/` for updated code samples.
Table of Contents
- What's New in v0.0.5
- Python SDK Features
- Key Features
- Project Structure
- Demos
- Architecture
- Getting Started
- Docker Deployment
- Environment Configuration
- Production Deployment
- Testing
- Using Tensorus
- Metadata System
- Streamlit UI
- Natural Query Language (NQL)
- Agent Details
- Tensorus Models
- Basic Tensor Operations
- Tensor Decomposition Operations
- Vector Database Features
- Completed Features
- Future Implementation
- Contributing
- License
Core Capabilities
Advanced Tensor Storage System
- High-Performance Storage - Efficiently store and retrieve PyTorch tensors with rich metadata support
- Intelligent Compression - Multiple algorithms (LZ4, GZIP, quantization) with up to 4x space savings
- Schema Validation - Optional per-dataset schemas enforce metadata fields and tensor shape/dtype constraints
- Chunked Processing - Handle tensors larger than available memory through intelligent chunking
- Multi-Backend Support - Local filesystem, PostgreSQL, S3, and cloud storage backends
Intelligent Agent Ecosystem
- Data Ingestion Agent - Automatically monitors directories and ingests files as tensors with preprocessing
- Reinforcement Learning Agent - Deep Q-Network (DQN) agent that learns from experiences stored in tensor datasets
- AutoML Agent - Hyperparameter optimization and model selection using advanced search algorithms
- Embedding Agent - Multi-provider embedding generation with intelligent caching and vector indexing
- Extensible Framework - Build custom agents that interact intelligently with your tensor data
Advanced Query & Search Engine
- Natural Query Language (NQL) - Query tensor data using intuitive, natural language-like syntax
- Vector Database Integration - Advanced similarity search with multi-provider embedding generation
- Hybrid Search - Combine semantic similarity with computational tensor properties
- Geometric Partitioning - Efficient vector indexing with automatic clustering and freshness layers
Production-Grade Operations
- 40+ Tensor Operations - Comprehensive library covering arithmetic, linear algebra, decompositions, and advanced operations
- Computational Lineage - Complete tracking of tensor transformations for reproducible scientific workflows
- Operation History - Full audit trail with performance metrics and error tracking
- Asynchronous Processing - Background operations and job queuing for long-running computations
Developer-Friendly Interface
- RESTful API - FastAPI backend with comprehensive OpenAPI documentation and authentication
- Interactive Web UI - Streamlit-based dashboard for data exploration and agent control
- Python SDK - Rich client library with intuitive APIs and comprehensive error handling
- Model Context Protocol - Standardized integration for AI agents and LLMs via tensorus/mcp
Enterprise Features
- Rich Metadata System - Pydantic schemas for semantic, lineage, computational, quality, and usage metadata
- Security & Authentication - API key management, role-based access control, and audit logging
- Monitoring & Observability - Health checks, performance metrics, and comprehensive logging
- Scalable Architecture - Horizontal scaling, load balancing, and distributed processing capabilities
Project Structure
- `app.py`: The main Streamlit frontend application (located at the project root).
- `pages/`: Directory containing individual Streamlit page scripts and shared UI utilities for the dashboard.
  - `pages/ui_utils.py`: Utility functions specifically for the Streamlit UI.
  - (Other page scripts like `01_dashboard.py`, `02_control_panel.py`, etc., define the different views of the dashboard.)
- `tensorus/`: Directory containing the core `tensorus` library modules (this is the main installable package).
  - `tensorus/__init__.py`: Makes `tensorus` a Python package.
  - `tensorus/api.py`: The FastAPI application providing the backend API for Tensorus.
  - `tensorus/tensor_storage.py`: Core TensorStorage implementation for managing tensor data.
  - `tensorus/tensor_ops.py`: Library of functions for tensor manipulations.
  - `tensorus/vector_database.py`: Advanced vector indexing with geometric partitioning and freshness layers.
  - `tensorus/embedding_agent.py`: Multi-provider embedding generation and vector database integration.
  - `tensorus/hybrid_search.py`: Hybrid search engine combining semantic similarity with computational tensor properties.
  - `tensorus/nql_agent.py`: Agent for processing Natural Query Language queries.
  - `tensorus/ingestion_agent.py`: Agent for ingesting data from various sources.
  - `tensorus/rl_agent.py`: Agent for Reinforcement Learning tasks.
  - `tensorus/automl_agent.py`: Agent for AutoML processes.
  - `tensorus/dummy_env.py`: A simple environment for the RL agent demonstration.
  - (Other Python files within `tensorus/` are part of the core library.)
- `requirements.txt`: Lists the project's Python dependencies for development and local execution.
- `pyproject.toml`: Project metadata, dependencies for distribution, and build system configuration (e.g., for PyPI).
- `README.md`: This file.
- `LICENSE`: Project license file.
- `.gitignore`: Specifies intentionally untracked files that Git should ignore.
Live Demos & Integrations
Try Tensorus Online (No Installation Required)
Experience Tensorus directly in your browser via Hugging Face Spaces:
- Interactive API Documentation - Full Swagger UI with live examples and real-time testing
- Alternative API Docs - Clean ReDoc interface with detailed schemas
- Web Dashboard Demo - Complete Streamlit UI for data exploration and agent control
AI Agent Integration
Model Context Protocol (MCP) Support - Standardized integration for AI agents and LLMs:
- Repository: tensorus/mcp - Complete MCP server implementation
- Features: Standardized protocol access to all Tensorus capabilities
- Use Cases: LLM-driven tensor analysis, automated data workflows, intelligent agent interactions
Architecture
Tensorus Execution Cycle
graph TD
%% User Interface Layer
subgraph UI_Layer ["User Interaction"]
UI[Streamlit UI]
end
%% API Gateway Layer
subgraph API_Layer ["Backend Services"]
API[FastAPI Backend]
end
%% Core Storage with Method Interface
subgraph Storage_Layer ["Core Storage - TensorStorage"]
TS[TensorStorage Core]
subgraph Storage_Methods ["Storage Interface"]
TS_insert[insert data metadata]
TS_query[query query_fn]
TS_get[get_by_id id]
TS_sample[sample n]
TS_update[update_metadata]
end
TS --- Storage_Methods
end
%% Agent Processing Layer
subgraph Agent_Layer ["Processing Agents"]
IA[Ingestion Agent]
NQLA[NQL Agent]
RLA[RL Agent]
AutoMLA[AutoML Agent]
EA[Embedding Agent]
end
%% Vector Database Layer
subgraph Vector_Layer ["Vector Database"]
VDB[Vector Index Manager]
HSE[Hybrid Search Engine]
end
%% Model System
subgraph Model_Layer ["Model System"]
Registry[Model Registry]
ModelsPkg[Models Package]
end
%% Tensor Operations Library
subgraph Ops_Layer ["Tensor Operations"]
TOps[TensorOps Library]
end
%% Primary UI Flow
UI -->|HTTP Requests| API
%% API Orchestration
API -->|Command Dispatch| IA
API -->|Command Dispatch| NQLA
API -->|Command Dispatch| RLA
API -->|Command Dispatch| AutoMLA
API -->|Vector Operations| EA
API -->|Model Training| Registry
API -->|Direct Query| TS_query
%% Vector Database Integration
EA -->|Vector Indexing| VDB
HSE -->|Hybrid Search| VDB
API -->|Search Requests| HSE
%% Model System Interactions
Registry -->|Uses Models| ModelsPkg
Registry -->|Load/Save| TS
ModelsPkg -->|Tensor Ops| TOps
%% Agent Storage Interactions
IA -->|Data Ingestion| TS_insert
NQLA -->|Query Execution| TS_query
NQLA -->|Record Retrieval| TS_get
RLA -->|State Persistence| TS_insert
RLA -->|Experience Sampling| TS_sample
RLA -->|State Retrieval| TS_get
AutoMLA -->|Trial Storage| TS_insert
AutoMLA -->|Data Retrieval| TS_query
EA -->|Embedding Storage| TS_insert
EA -->|Vector Retrieval| TS_query
%% Computational Operations
NQLA -->|Vector Operations| TOps
RLA -->|Policy Evaluation| TOps
AutoMLA -->|Model Optimization| TOps
HSE -->|Tensor Analysis| TOps
%% Indirect Storage Write-back
TOps -.->|Intermediate Results| TS_insert
Installation & Setup
System Requirements
Minimum Requirements
- Python: 3.10+ (3.11+ recommended for best performance)
- Memory: 4 GB RAM (8+ GB recommended)
- Storage: 10 GB available disk space
- OS: Linux, macOS, Windows with WSL2
Production Requirements
- CPU: 8+ cores with 16+ threads
- Memory: 32+ GB RAM (64+ GB for large tensor workloads)
- Storage: 1 TB+ NVMe SSD for optimal I/O performance
- Network: 10+ Gbps for distributed deployments
- See: Production Deployment Guide for detailed specifications
Installation Options
Option 1: Quick Install (Recommended for New Users)
# Install latest stable version from PyPI
pip install tensorus
# Start development server
tensorus start --dev
# Access web interface at http://localhost:8000
# API documentation at http://localhost:8000/docs
Option 2: Feature-Specific Installation
# Install with GPU acceleration support
pip install tensorus[gpu]
# Install with advanced compression algorithms
pip install tensorus[compression]
# Install with monitoring and metrics
pip install tensorus[monitoring]
# Install everything (enterprise features)
pip install tensorus[all]
Option 3: Development Installation
# Clone repository for development and contributions
git clone https://github.com/tensorus/tensorus.git
cd tensorus
# Create isolated virtual environment
python3 -m venv venv
source venv/bin/activate # Linux/macOS
# venv\Scripts\activate # Windows
# Install in development mode with all dependencies
./setup.sh
Development Installation Notes:
- Uses `requirements.txt` and `requirements-test.txt` for full dependency management
- Installs CPU-optimized PyTorch wheels (modify `setup.sh` for GPU versions)
- Includes testing frameworks and development tools
- Heavy ML libraries (`xgboost`, `lightgbm`, etc.) available via `pip install tensorus[models]`
- Audit logging to `tensorus_audit.log` (configurable via `TENSORUS_AUDIT_LOG_PATH`)
Option 4: Container Deployment
# Production deployment with PostgreSQL backend
docker compose up --build
# Quick testing with in-memory storage
docker run -p 8000:8000 tensorus/tensorus:latest
# Custom configuration with environment variables
docker run -p 8000:8000 \
-e TENSORUS_STORAGE_BACKEND=postgres \
-e TENSORUS_API_KEYS=your-api-key \
tensorus/tensorus:latest
Performance & Scalability
Benchmark Results
Tensorus delivers 10-100x performance improvements over traditional file-based tensor storage:
| Operation Type | Traditional Files | Tensorus | Improvement |
|---|---|---|---|
| Tensor Retrieval | 280 ops/sec | 15,000 ops/sec | 53.6x faster |
| Query Processing | 850ms | 45ms | 18.9x faster |
| Storage Efficiency | 1.0x baseline | 4.0x compressed | 75% space saved |
| Vector Search | 15,000ms | 125ms | 120x faster |
| Concurrent Operations | 450 ops/sec | 12,000 ops/sec | 26.7x higher throughput |
Scaling Characteristics
- Linear scaling up to 32+ nodes in distributed deployments
- Sub-200ms response times at enterprise scale (millions of tensors)
- 99.9% availability with proper redundancy configuration
- Automatic load balancing and intelligent request routing
Use Cases & Applications
AI/ML Development & Production
- Model Training Pipelines - Store training data, model checkpoints, and experiment results
- Real-time Inference - Fast retrieval of model weights and feature tensors for serving
- Experiment Tracking - Complete lineage of model development with reproducible workflows
- AutoML Platforms - Automated hyperparameter optimization and model architecture search
Scientific Computing & Research
- Numerical Simulations - Large-scale scientific computing with computational provenance
- Climate & Weather Modeling - Multi-dimensional data analysis with temporal tracking
- Genomics & Bioinformatics - DNA sequence analysis, protein folding, and molecular dynamics
- Materials Science - Quantum chemistry simulations and materials property prediction
Computer Vision & Autonomous Systems
- Image/Video Processing - Efficient storage and retrieval of visual data tensors
- Object Detection & Recognition - Real-time inference with cached model components
- Autonomous Vehicles - Sensor fusion, path planning, and decision-making algorithms
- Medical Imaging - DICOM processing, radiological analysis, and diagnostic AI
Financial Services & Trading
- Risk Management - Real-time portfolio optimization and risk assessment models
- Algorithmic Trading - High-frequency trading with microsecond-latency model execution
- Fraud Detection - Anomaly detection in transaction patterns and behavioral analysis
- Credit Scoring - ML-driven creditworthiness assessment with regulatory compliance
Running the API Server
1. Navigate to the project root directory (the directory containing the `tensorus` folder and `pyproject.toml`).
2. Ensure your virtual environment is activated if you are using one.
3. Start the FastAPI backend server:

   uvicorn tensorus.api:app --reload --host 127.0.0.1 --port 7860
   # For external access (e.g., Docker/WSL/other machines), use:
   # uvicorn tensorus.api:app --host 0.0.0.0 --port 7860

   - This command launches Uvicorn with the `app` instance defined in `tensorus/api.py`.
   - Access the API documentation at `http://localhost:7860/docs` or `http://localhost:7860/redoc`.
   - All dataset and agent endpoints are available once the server is running.

To use S3 for tensor dataset persistence instead of local disk, set:

   export TENSORUS_TENSOR_STORAGE_PATH="s3://your-bucket/optional/prefix"
   # Ensure AWS credentials are available (env vars, profile, or instance role)
   uvicorn tensorus.api:app --host 0.0.0.0 --port 7860
Running the Streamlit UI
1. In a separate terminal (with the virtual environment activated), navigate to the project root.
2. Start the Streamlit frontend:

   streamlit run app.py

3. Access the UI in your browser at the URL provided by Streamlit (usually `http://localhost:8501`).
Model Context Protocol Integration
For AI agents and LLMs that need standardized protocol access to Tensorus capabilities, see the separate Tensorus MCP package which provides a complete MCP server implementation.
Running the Agents (Examples)
You can run the example agents directly from their respective files:
- RL Agent: `python tensorus/rl_agent.py`
- AutoML Agent: `python tensorus/automl_agent.py`
- Ingestion Agent: `python tensorus/ingestion_agent.py`
  - Note: The Ingestion Agent will monitor the `temp_ingestion_source` directory (created automatically if it doesn't exist in the project root) for new files.
Docker Deployment
Docker Quickstart
Run Tensorus with Docker in two ways: a single container (in-memory storage) or a full stack with PostgreSQL via Docker Compose.
Option A: Full stack with PostgreSQL (recommended)
1. Install Docker Desktop (or Docker Engine) and Docker Compose v2.
2. Generate an API key:

   python generate_api_key.py --format env
   # Copy the value printed after TENSORUS_API_KEYS=

3. Open `docker-compose.yml` and set your key. Either:
   - Replace the placeholder in `TENSORUS_VALID_API_KEYS` with your key, or
   - Add `TENSORUS_API_KEYS: "tsr_..."` alongside it. Both are supported; `TENSORUS_API_KEYS` is preferred.
4. Start the stack from the project root:

   docker compose up --build

   - The API starts on `http://localhost:7860`
   - PostgreSQL is exposed on host port `5433` (container `5432`)
   - Audit logs are persisted to `./tensorus_audit.log` via a bind mount
5. Test authentication (Bearer token is recommended):

   # Replace tsr_your_key with the key you generated
   curl -H "Authorization: Bearer tsr_your_key" http://localhost:7860/datasets

Notes
- The compose file waits for Postgres to become healthy before starting the app.
- Legacy header `X-API-KEY: tsr_your_key` is still accepted for backward compatibility.
Useful commands
# View logs
docker compose logs -f app
# Rebuild after code changes
docker compose up --build --force-recreate
# Stop stack
docker compose down
Option B: Single container (in-memory storage)
Use this for quick, ephemeral testing without Postgres.
docker build -t tensorus .
docker run --rm -p 7860:7860 \
-e TENSORUS_AUTH_ENABLED=true \
-e TENSORUS_API_KEYS=tsr_your_key \
-e TENSORUS_STORAGE_BACKEND=in_memory \
-v "$(pwd)/tensorus_audit.log:/app/tensorus_audit.log" \
tensorus
Then open http://localhost:7860/docs.
WSL2 tip: If you run Docker Desktop on Windows with WSL2, localhost:7860 works from both Windows and the WSL distro. Keep volumes on the Linux side (/home/...) for best performance.
GPU acceleration (optional)
The default image uses CPU wheels. For GPUs, install the NVIDIA Container Toolkit and switch to CUDA-enabled PyTorch wheels in your build (e.g., modify setup.sh or your Dockerfile). Pass `--gpus all` to `docker run`.
Environment Configuration
Environment configuration (reference)
Tensorus reads configuration from environment variables (prefix TENSORUS_). Common settings:
- Authentication
  - `TENSORUS_AUTH_ENABLED` (default: `true`)
  - `TENSORUS_API_KEYS`: Comma-separated list of keys (recommended)
  - `TENSORUS_VALID_API_KEYS`: Legacy alternative; comma list or JSON array
  - Usage: Prefer `Authorization: Bearer tsr_...`; legacy `X-API-KEY` also accepted
- Storage backend
  - `TENSORUS_STORAGE_BACKEND`: `in_memory` | `postgres` (default: `in_memory`)
  - When `postgres`: `TENSORUS_POSTGRES_HOST`, `TENSORUS_POSTGRES_PORT` (default `5432`), `TENSORUS_POSTGRES_USER`, `TENSORUS_POSTGRES_PASSWORD`, `TENSORUS_POSTGRES_DB`
    - or `TENSORUS_POSTGRES_DSN` (overrides individual fields)
  - Optional tensor persistence path: `TENSORUS_TENSOR_STORAGE_PATH` (e.g., a local path or URI)
- Security headers
  - `TENSORUS_X_FRAME_OPTIONS` (default `SAMEORIGIN`; set to `NONE` to omit)
  - `TENSORUS_CONTENT_SECURITY_POLICY` (default `default-src 'self'`; set to `NONE` to omit)
- Misc
  - `TENSORUS_AUDIT_LOG_PATH` (default `tensorus_audit.log`)
  - `TENSORUS_MINIMAL_IMPORT=1` to skip optional model package imports
  - NQL with LLM: `NQL_USE_LLM=true`, `GOOGLE_API_KEY`, optional `NQL_LLM_MODEL`
Example .env (for local runs or compose env_file):
TENSORUS_AUTH_ENABLED=true
TENSORUS_API_KEYS=tsr_your_key
TENSORUS_STORAGE_BACKEND=postgres
TENSORUS_POSTGRES_HOST=db
TENSORUS_POSTGRES_PORT=5432
TENSORUS_POSTGRES_USER=tensorus_user
TENSORUS_POSTGRES_PASSWORD=change_me
TENSORUS_POSTGRES_DB=tensorus_db
TENSORUS_AUDIT_LOG_PATH=/app/tensorus_audit.log
Production Deployment
Production deployment with Docker (step-by-step)
This example uses Docker Compose with PostgreSQL. Adjust for your infra as needed.
1. Generate and store your API key securely
   - python generate_api_key.py --format env
   - Prefer secret management (Docker/Swarm/K8s/Vault). For Compose, you can use a file-based secret:

     # secrets/api_key.txt contains only your key value (no quotes)
     echo "tsr_prod_key_..." > secrets/api_key.txt

2. Configure Compose for production
   - Edit `docker-compose.yml` and set:
     - `TENSORUS_AUTH_ENABLED: "true"`
     - `TENSORUS_API_KEYS: ${TENSORUS_API_KEYS:-}` or point to a secret/file
     - `TENSORUS_STORAGE_BACKEND: postgres` and your Postgres credentials
   - Optionally add `env_file: .env` and put non-secret config there.
3. Harden runtime
   - Put Tensorus behind a reverse proxy (Nginx/Traefik) with TLS
   - Restrict CORS/hosts at the proxy; the app currently allows all by default
   - Set security headers via env vars (see below)
4. Start and verify

   docker compose up -d --build
   docker compose ps
   curl -f -H "Authorization: Bearer tsr_prod_key_..." http://localhost:7860/ || echo "API not ready"

5. Health and logs
   - Postgres health is checked automatically; the app waits until healthy
   - View logs: docker compose logs -f app
Security headers
- Override defaults to match your CSP and embedding needs. If set to `NONE` or empty, the header is omitted.
# Example: allow Swagger/ReDoc CDNs and a trusted frame host
TENSORUS_X_FRAME_OPTIONS="ALLOW-FROM https://example.com"
TENSORUS_CONTENT_SECURITY_POLICY="default-src 'self'; script-src 'self' https://cdn.jsdelivr.net; style-src 'self' https://fonts.googleapis.com 'unsafe-inline'; font-src 'self' https://fonts.gstatic.com; img-src 'self' https://fastapi.tiangolo.com"
Troubleshooting
- 401 Unauthorized: ensure you send `Authorization: Bearer tsr_...` and that the key is configured (`TENSORUS_API_KEYS` or `TENSORUS_VALID_API_KEYS`).
- 503 auth not configured: set an API key when auth is enabled.
- DB connection errors: verify the Postgres env vars, port conflicts (host 5433 vs local 5432), and that the DB user/database exist.
- Windows/WSL2 volume performance: keep bind-mounted files on the Linux filesystem for best performance.
Testing
Preparing the Test Environment
The tests expect all dependencies from both requirements.txt and
requirements-test.txt to be installed. A simple setup script is provided
to handle this automatically:
./setup.sh
Run this after creating and activating a Python virtual environment. The script
installs the Tensorus runtime requirements and the additional packages needed
for pytest. Once completed, executing pytest from the repository root will
automatically discover and run the entire suite; no manual package discovery is
required.
Test Suite Dependencies
The Python tests rely on packages from both requirements.txt and
requirements-test.txt. The latter includes httpx and other packages
used by the test suite. Always run ./setup.sh before executing
pytest to install these requirements:
./setup.sh
Running Tests
Tensorus includes Python unit tests. To set up the environment and run them:
1. Install all dependencies using the setup script before running any tests:

   ./setup.sh

   This script installs packages from `requirements.txt` and `requirements-test.txt` (which pins `fastapi>=0.110` for Pydantic v2 support).

2. Run the Python test suite:

   pytest
All tests should pass without errors when dependencies are properly installed.
Using Tensorus
API Basics
Base URL: http://localhost:7860
Authentication:
- Preferred: send `Authorization: Bearer tsr_<your_key>`
- Legacy (still supported): `X-API-KEY: tsr_<your_key>`
PowerShell notes:
- Use double quotes for JSON and escape inner quotes, or run in WSL/Git Bash for copy/paste fidelity.
- Alternatively, use `--%` to stop PowerShell from interpreting special characters.
API Endpoints
The API provides the following main endpoints:
- Datasets:
  - `POST /datasets/create`: Create a new dataset.
  - `POST /datasets/{name}/ingest`: Ingest a tensor into a dataset.
  - `GET /datasets/{name}/fetch`: Retrieve all records from a dataset.
  - `GET /datasets/{name}/records`: Retrieve a page of records. Supports `offset` (start index, default `0`) and `limit` (max results, default `100`).
  - `GET /datasets`: List all available datasets.
  - `GET /datasets/{name}/count`: Count records in a dataset.
  - `GET /datasets/{dataset_name}/tensors/{record_id}`: Retrieve a tensor by record ID.
  - `DELETE /datasets/{dataset_name}`: Delete a dataset.
  - `DELETE /datasets/{dataset_name}/tensors/{record_id}`: Delete a tensor by record ID.
  - `PUT /datasets/{dataset_name}/tensors/{record_id}/metadata`: Update tensor metadata.
- Querying:
  - `POST /api/v1/query`: Execute an NQL query.
- Vector Database:
  - `POST /api/v1/vector/embed`: Generate and store embeddings from text.
  - `POST /api/v1/vector/search`: Perform vector similarity search.
  - `POST /api/v1/vector/hybrid-search`: Execute hybrid semantic-computational search.
  - `POST /api/v1/vector/tensor-workflow`: Run tensor workflow with lineage tracking.
  - `POST /api/v1/vector/index/build`: Build vector indexes with geometric partitioning.
  - `GET /api/v1/vector/models`: List available embedding models.
  - `GET /api/v1/vector/stats/{dataset_name}`: Get embedding statistics for a dataset.
  - `GET /api/v1/vector/metrics`: Get performance metrics.
  - `DELETE /api/v1/vector/vectors/{dataset_name}`: Delete vectors from a dataset.
- Operation History & Lineage:
  - `GET /api/v1/operations/recent`: Get recent operations with optional filtering by type/status.
  - `GET /api/v1/operations/tensor/{tensor_id}`: Get all operations that involved a specific tensor.
  - `GET /api/v1/operations/statistics`: Get aggregate operation statistics.
  - `GET /api/v1/operations/types`: List available operation types.
  - `GET /api/v1/operations/statuses`: List available operation statuses.
  - `GET /api/v1/lineage/tensor/{tensor_id}`: Get computational lineage for a tensor.
  - `GET /api/v1/lineage/tensor/{tensor_id}/dot`: DOT graph for lineage visualization.
  - `GET /api/v1/lineage/tensor/{source_tensor_id}/path/{target_tensor_id}`: Operation path between two tensors.
- Agents:
  - `GET /agents`: List all registered agents.
  - `GET /agents/{agent_id}/status`: Get the status of a specific agent.
  - `POST /agents/{agent_id}/start`: Start an agent.
  - `POST /agents/{agent_id}/stop`: Stop an agent.
  - `GET /agents/{agent_id}/logs`: Get recent logs for an agent.
  - `GET /agents/{agent_id}/config`: Get stored configuration for an agent.
  - `POST /agents/{agent_id}/configure`: Update agent configuration.
- Metrics & Monitoring:
  - `GET /metrics/dashboard`: Get aggregated dashboard metrics.
Vector Database Examples
Note on path parameter names across endpoints:
- Datasets CRUD often uses `name` in the path: e.g., `/datasets/{name}/ingest`, `/datasets/{name}/records`
- Tensor CRUD uses `dataset_name` + `record_id`: e.g., `/datasets/{dataset_name}/tensors/{record_id}`
- Vector API consistently uses `dataset_name` in path and body
For an end-to-end quickstart with PowerShell-friendly curl commands and authentication setup, see DEMO.md under "Vector & Embedding API Quickstart".
Generate & Store Embeddings
curl -X POST "http://localhost:7860/api/v1/vector/embed" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{
"texts": ["Machine learning algorithms", "Deep neural networks"],
"dataset_name": "ai_research",
"model_name": "all-mpnet-base-v2",
"namespace": "research",
"tenant_id": "team_alpha"
}'
Semantic Similarity Search
curl -X POST "http://localhost:7860/api/v1/vector/search" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{
"query": "artificial intelligence models",
"dataset_name": "ai_research",
"k": 5,
"similarity_threshold": 0.7,
"namespace": "research"
}'
Example response:
{
"success": true,
"query": "artificial intelligence models",
"total_results": 2,
"search_time_ms": 8.42,
"results": [
{
"record_id": "rec_123",
"similarity_score": 0.9153,
"rank": 1,
"source_text": "Deep learning models for AI",
"metadata": {"source": "paper_db", "year": 2024},
"namespace": "research",
"tenant_id": "team_alpha"
},
{
"record_id": "rec_456",
"similarity_score": 0.8831,
"rank": 2,
"source_text": "AI model architectures",
"metadata": {"source": "notes"},
"namespace": "research",
"tenant_id": "team_alpha"
}
]
}
Hybrid Computational Search
curl -X POST "http://localhost:7860/api/v1/vector/hybrid-search" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{
"text_query": "neural network weights",
"dataset_name": "model_tensors",
"tensor_operations": [
{
"operation_name": "svd",
"description": "Singular value decomposition",
"parameters": {}
}
],
"similarity_weight": 0.7,
"computation_weight": 0.3,
"filters": {
"preferred_shape": [512, 512],
"sparsity_preference": 0.1
}
}'
Agents API Examples
List Agents
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/agents"
Start Agent
curl -X POST -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/agents/ingestion/start"
Agent Status & Logs
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/agents/ingestion/status"
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/agents/ingestion/logs?lines=50"
Operation History & Lineage Examples
Recent Operations
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/api/v1/operations/recent?limit=50"
Tensor Lineage (JSON)
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/api/v1/lineage/tensor/{tensor_id}"
Tensor Lineage (DOT Graph)
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/api/v1/lineage/tensor/{tensor_id}/dot"
Operation Path Between Two Tensors
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/api/v1/lineage/tensor/{source_tensor_id}/path/{target_tensor_id}"
Authentication Examples
Recommended (Bearer):
curl -s \
-H "Authorization: Bearer tsr_your_api_key" \
"http://localhost:7860/datasets"
Legacy header (still supported):
curl -s \
-H "X-API-KEY: tsr_your_api_key" \
"http://localhost:7860/datasets"
NQL Query Example
curl -X POST "http://localhost:7860/api/v1/query" \
-H "Authorization: Bearer tsr_your_api_key" \
-H "Content-Type: application/json" \
-d '{
"query": "find tensors from 'my_dataset' where metadata.source = 'api_ingest' limit 10"
}'
Request/Response Schemas
Below are the primary Pydantic models used by the API. See `tensorus/api.py` and `tensorus/api/models.py` for full details.
- ApiResponse (`tensorus/api.py`)
  - `success: bool`
  - `message: str`
  - `data: Any | null`
- DatasetCreateRequest (`tensorus/api.py`)
  - `name: str`
- TensorInput (`tensorus/api.py`)
  - `shape: List[int]`
  - `dtype: str`
  - `data: List[Any] | int | float`
  - `metadata: Dict[str, Any] | null`
- TensorOutput (`tensorus/api/models.py`)
  - `record_id: str`
  - `shape: List[int]`
  - `dtype: str`
  - `data: List[Any] | int | float`
  - `metadata: Dict[str, Any]`
- NQLQueryRequest / NQLResponse (`tensorus/api/models.py`)
  - Request: `query: str`
  - Response: `success: bool`, `message: str`, `count?: int`, `results?: List[TensorOutput]`
- VectorSearchQuery (`tensorus/api/models.py`)
  - `query: str`, `dataset_name: str`, `k: int = 5`, `namespace?: str`, `tenant_id?: str`, `similarity_threshold?: float`, `include_vectors: bool = false`
- OperationHistoryRequest (`tensorus/api/models.py`)
  - Filters: `tensor_id?`, `operation_type?`, `status?`, `user_id?`, `session_id?`, `start_time?`, `end_time?`, `limit: int = 100`
- LineageResponse (`tensorus/api/models.py`)
  - Key fields: `tensor_id`, `root_tensor_ids`, `max_depth`, `total_operations`, `lineage_nodes[]`, `operations[]`, timestamps
Examples
Explore our collection of examples to get started with Tensorus.
Response Shapes
Success (ApiResponse):
{
  "success": true,
  "message": "Tensor ingested successfully.",
  "data": { "record_id": "abc123" }
}
Not Found (FastAPI error shape):
{
  "detail": "Not Found"
}
Unauthorized:
{
  "detail": "Not authenticated"
}
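To exercise these shapes programmatically, here is a hedged Python sketch that posts a TensorInput payload to the ingest endpoint and unpacks the ApiResponse fields; the base URL and API key are placeholders for your own deployment.

```python
import requests

BASE_URL = "http://localhost:7860"
HEADERS = {"Authorization": "Bearer tsr_your_api_key"}

# TensorInput: shape, dtype, data, optional metadata
payload = {
    "shape": [2, 3],
    "dtype": "float32",
    "data": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    "metadata": {"source": "api_ingest"},
}
resp = requests.post(f"{BASE_URL}/datasets/my_dataset/ingest",
                     json=payload, headers=HEADERS)

body = resp.json()  # ApiResponse: success, message, data
if body["success"]:
    print("record_id:", body["data"]["record_id"])
```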
Dataset API Examples
All requests require authentication by default: -H "Authorization: Bearer your_api_key" (legacy X-API-KEY also supported).
Create Dataset
curl -X POST "http://localhost:7860/datasets/create" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{"name": "my_dataset"}'
Ingest Tensor
curl -X POST "http://localhost:7860/datasets/my_dataset/ingest" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{
"shape": [2, 3],
"dtype": "float32",
"data": [[1.0,2.0,3.0],[4.0,5.0,6.0]],
"metadata": {"source": "api_ingest", "label": "row_batch_1"}
}'
Fetch Entire Dataset
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/datasets/my_dataset/fetch"
Fetch Records (Pagination)
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/datasets/my_dataset/records?offset=0&limit=50"
Count Records
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/datasets/my_dataset/count"
Get Tensor By ID
curl -s -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/datasets/my_dataset/tensors/{record_id}"
Update Tensor Metadata (replace entire metadata)
curl -X PUT "http://localhost:7860/datasets/my_dataset/tensors/{record_id}/metadata" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{"new_metadata": {"source": "sensor_A", "priority": "high"}}'
Delete Tensor
curl -X DELETE -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/datasets/my_dataset/tensors/{record_id}"
Delete Dataset
curl -X DELETE -H "Authorization: Bearer your_api_key" \
"http://localhost:7860/datasets/my_dataset"
Dataset Schemas
Datasets can optionally include a schema when created. The schema defines
required metadata fields and expected tensor shape and dtype. Inserts that
violate the schema will raise a validation error.
Example:
schema = {
"shape": [3, 10],
"dtype": "float32",
"metadata": {"source": "str", "value": "int"}
}
storage.create_dataset("my_ds", schema=schema)
storage.insert("my_ds", torch.rand(3, 10), {"source": "sensor", "value": 5})
Metadata System
Tensorus includes a detailed metadata subsystem for describing tensors beyond their raw data. Each tensor has a TensorDescriptor and can be associated with optional semantic, lineage, computational, quality, relational, and usage metadata. The metadata storage backend is pluggable, supporting in-memory storage for quick testing or PostgreSQL for persistence. Search and aggregation utilities allow querying across these metadata fields. See metadata_schemas.md for schema details.
Streamlit UI
The Streamlit UI provides a user-friendly interface for:
- Dashboard: View basic system metrics and agent status.
- Agent Control: Start, stop, and view logs for agents.
- NQL Chat: Enter natural language queries and view results.
- Data Explorer: Browse datasets, preview data, and perform tensor operations.
Natural Query Language (NQL)
Tensorus ships with a simple regex-based Natural Query Language for retrieving tensors by metadata. You can issue NQL queries via the API or from the "NQL Chat" page in the Streamlit UI.
See also: NQL Query Example for a minimal API request.
Enabling LLM rewriting
Set `NQL_USE_LLM=true` to enable parsing of free-form queries with Google's Gemini model. Provide your API key in the `GOOGLE_API_KEY` environment variable and optionally set `NQL_LLM_MODEL` (defaults to `gemini-2.0-flash`) to choose the model version. The agent sends the current dataset schema and your query to Gemini via `langchain-google`. If the model or key is unavailable, the agent silently falls back to the regex-based parser.
Example query using the LLM parser:
show me all images containing a dog from dataset animals where source is "mobile"
This phrasing is more natural than the regex format and will be rewritten into a structured NQL query by Gemini.
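The same endpoint can be called from Python; a minimal sketch using requests (URL and key are placeholders), reading the NQLResponse fields listed under Request/Response Schemas:

```python
import requests

resp = requests.post(
    "http://localhost:7860/api/v1/query",
    headers={"Authorization": "Bearer tsr_your_api_key"},
    json={"query": "find tensors from 'my_dataset' where metadata.source = 'api_ingest' limit 10"},
)
nql = resp.json()  # NQLResponse: success, message, optional count/results
print(nql["message"], "| matches:", nql.get("count"))
```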
Agent Details
Data Ingestion Agent
- Functionality: Monitors a source directory for new files, preprocesses them into tensors, and inserts them into TensorStorage.
- Supported File Types: CSV, PNG, JPG, JPEG, TIF, TIFF (can be extended).
- Preprocessing: Uses default functions for CSV and images (resize, normalize).
- Configuration:
  - `source_directory`: The directory to monitor.
  - `polling_interval_sec`: How often to check for new files.
  - `preprocessing_rules`: A dictionary mapping file extensions to custom preprocessing functions.
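As an illustration of `preprocessing_rules`, the sketch below builds a configuration dictionary with a custom CSV preprocessor; the function body is a hypothetical example, and the agent's exact constructor signature may differ, so treat this as the shape of the config rather than a verbatim API.

```python
import pandas as pd
import torch

def csv_to_tensor(path: str) -> torch.Tensor:
    """Hypothetical custom rule: load a numeric CSV and return a float32 tensor."""
    return torch.tensor(pd.read_csv(path).values, dtype=torch.float32)

# Configuration keys as documented above
ingestion_config = {
    "source_directory": "temp_ingestion_source",
    "polling_interval_sec": 10,
    "preprocessing_rules": {".csv": csv_to_tensor},
}
```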
RL Agent
- Functionality: A Deep Q-Network (DQN) agent that learns from experiences stored in TensorStorage.
- Environment: Uses a `DummyEnv` for demonstration.
- Experience Storage: Stores experiences (state, action, reward, next_state, done) in TensorStorage.
- Training: Implements epsilon-greedy exploration and target network updates.
- Configuration:
  - `state_dim`: Dimensionality of the environment state.
  - `action_dim`: Number of discrete actions.
  - `hidden_size`: Hidden layer size for the DQN.
  - `lr`: Learning rate.
  - `gamma`: Discount factor.
  - `epsilon_*`: Epsilon-greedy parameters.
  - `target_update_freq`: Target network update frequency.
  - `batch_size`: Experience batch size.
  - `experience_dataset`: Dataset name for experiences.
  - `state_dataset`: Dataset name for state tensors.
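To show how an experience tuple maps onto tensor storage, here is a minimal sketch using the `TensorStorage` insert API from the Dataset Schemas section; the flat record layout and the no-argument constructor are assumptions, and the real agent's storage format may differ.

```python
import torch
from tensorus.tensor_storage import TensorStorage

storage = TensorStorage()  # constructor arguments omitted; see tensor_storage.py
storage.create_dataset("rl_experiences")

# Flatten (state, action, reward, next_state, done) into one tensor --
# an illustrative layout, not necessarily the agent's internal format.
state, next_state = torch.rand(4), torch.rand(4)
record = torch.cat([state, torch.tensor([1.0, 0.5]), next_state, torch.tensor([0.0])])
storage.insert("rl_experiences", record, {"episode": 1, "step": 12})
```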
AutoML Agent
- Functionality: Performs hyperparameter optimization using random search.
- Model: Trains a simple `DummyMLP` model.
- Search Space: Configurable hyperparameter search space (learning rate, hidden size, activation).
- Evaluation: Trains and evaluates models on synthetic data.
- Results: Stores trial results (parameters, score) in TensorStorage.
- Configuration:
  - `search_space`: Dictionary defining the hyperparameter search space.
  - `input_dim`: Input dimension for the model.
  - `output_dim`: Output dimension for the model.
  - `task_type`: Type of task ('regression' or 'classification').
  - `results_dataset`: Dataset name for storing results.
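A random-search trial amounts to drawing one value per key from `search_space`; a minimal sketch (the key names and value lists below are illustrative, not the agent's defaults):

```python
import random

search_space = {
    "lr": [1e-4, 1e-3, 1e-2],
    "hidden_size": [32, 64, 128],
    "activation": ["relu", "tanh"],
}

def sample_trial(space: dict) -> dict:
    """Draw one random hyperparameter configuration from the search space."""
    return {name: random.choice(choices) for name, choices in space.items()}

print(sample_trial(search_space))  # e.g. {'lr': 0.001, 'hidden_size': 64, 'activation': 'relu'}
```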
Embedding Agent
- Functionality: Multi-provider embedding generation with intelligent caching and vector database integration.
- Providers: Supports Sentence Transformers, OpenAI, and extensible architecture for additional providers.
- Features: Automatic batching, embedding caching, vector indexing, and performance monitoring.
- Configuration:
  - `default_provider`: Default embedding provider to use.
  - `default_model`: Default model for embedding generation.
  - `batch_size`: Batch size for embedding generation.
  - `cache_ttl`: Time-to-live for embedding cache entries.
Tensorus Models
The collection of example models previously bundled with Tensorus now lives in a separate repository: tensorus/models. Install it with:
pip install tensorus-models
When the package is installed, Tensorus will automatically import it. Set the
environment variable TENSORUS_MINIMAL_IMPORT=1 before importing Tensorus to
skip this optional dependency and keep startup lightweight.
Basic Tensor Operations
This section details the core tensor manipulation functionalities provided by tensor_ops.py. These operations are designed to be robust, with built-in type and shape checking where appropriate.
Arithmetic Operations
- `add(t1, t2)`: Element-wise addition of two tensors, or a tensor and a scalar.
- `subtract(t1, t2)`: Element-wise subtraction of two tensors, or a tensor and a scalar.
- `multiply(t1, t2)`: Element-wise multiplication of two tensors, or a tensor and a scalar.
- `divide(t1, t2)`: Element-wise division of two tensors, or a tensor and a scalar. Includes checks for division by zero.
- `power(t1, t2)`: Raises each element in `t1` to the power of `t2`. Supports tensor or scalar exponents.
- `log(tensor)`: Element-wise natural logarithm with warnings for non-positive values.
Matrix and Dot Operations
- `matmul(t1, t2)`: Matrix multiplication of two tensors, supporting various dimensionalities (e.g., 2D matrices, batched matrix multiplication).
- `dot(t1, t2)`: Computes the dot product of two 1D tensors.
- `outer(t1, t2)`: Computes the outer product of two 1-D tensors.
- `cross(t1, t2, dim=-1)`: Computes the cross product along the specified dimension (size must be 3).
- `matrix_eigendecomposition(matrix_A)`: Returns eigenvalues and eigenvectors of a square matrix.
- `matrix_trace(matrix_A)`: Computes the trace of a 2-D matrix.
- `tensor_trace(tensor_A, axis1=0, axis2=1)`: Trace of a tensor along two axes.
- `svd(matrix)`: Singular value decomposition of a matrix; returns `U`, `S`, and `Vh`.
- `qr_decomposition(matrix)`: QR decomposition returning `Q` and `R`.
- `lu_decomposition(matrix)`: LU decomposition returning permutation `P`, lower `L`, and upper `U` matrices.
- `cholesky_decomposition(matrix)`: Cholesky factor of a symmetric positive-definite matrix.
- `matrix_inverse(matrix)`: Inverse of a square matrix.
- `matrix_determinant(matrix)`: Determinant of a square matrix.
- `matrix_rank(matrix)`: Rank of a matrix.
Reduction Operations
- `sum(tensor, dim=None, keepdim=False)`: Computes the sum of tensor elements over specified dimensions.
- `mean(tensor, dim=None, keepdim=False)`: Computes the mean of tensor elements over specified dimensions. The tensor is cast to float for calculation.
- `min(tensor, dim=None, keepdim=False)`: Finds the minimum value in a tensor, optionally along a dimension. Returns values and indices if `dim` is specified.
- `max(tensor, dim=None, keepdim=False)`: Finds the maximum value in a tensor, optionally along a dimension. Returns values and indices if `dim` is specified.
- `variance(tensor, dim=None, unbiased=False, keepdim=False)`: Variance of tensor elements.
- `covariance(matrix_X, matrix_Y=None, rowvar=True, bias=False, ddof=None)`: Covariance matrix estimation.
- `correlation(matrix_X, matrix_Y=None, rowvar=True)`: Correlation coefficient matrix.
Reshaping and Slicing
- `reshape(tensor, shape)`: Changes the shape of a tensor without changing its data.
- `transpose(tensor, dim0, dim1)`: Swaps two dimensions of a tensor.
- `permute(tensor, dims)`: Permutes the dimensions of a tensor according to the specified order.
- `flatten(tensor, start_dim=0, end_dim=-1)`: Flattens a range of dimensions into a single dimension.
- `squeeze(tensor, dim=None)`: Removes dimensions of size 1, or a specific dimension if provided.
- `unsqueeze(tensor, dim)`: Inserts a dimension of size 1 at the given position.
Concatenation and Splitting
- `concatenate(tensors, dim=0)`: Joins a sequence of tensors along an existing dimension.
- `stack(tensors, dim=0)`: Joins a sequence of tensors along a new dimension.
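The two differ only in whether a new dimension is created; the TensorOps functions mirror the PyTorch semantics shown in this sketch:

```python
import torch

a, b = torch.zeros(2, 3), torch.ones(2, 3)

cat = torch.cat([a, b], dim=0)    # existing dim: (2, 3) + (2, 3) -> (4, 3)
stk = torch.stack([a, b], dim=0)  # new dim: two (2, 3) tensors -> (2, 2, 3)

print(cat.shape, stk.shape)  # torch.Size([4, 3]) torch.Size([2, 2, 3])
```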
Advanced Operations
- `einsum(equation, *tensors)`: Applies Einstein summation convention to the input tensors based on the provided equation string.
- `compute_gradient(func, tensor)`: Returns the gradient of a scalar `func` with respect to `tensor`.
- `compute_jacobian(func, tensor)`: Computes the Jacobian matrix of a vector function.
- `convolve_1d(signal_x, kernel_w, mode='valid')`: 1-D convolution using `torch.nn.functional.conv1d`.
- `convolve_2d(image_I, kernel_K, mode='valid')`: 2-D convolution using `torch.nn.functional.conv2d`.
- `frobenius_norm(tensor)`: Calculates the Frobenius norm.
- `l1_norm(tensor)`: Calculates the L1 norm (sum of absolute values).
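For intuition, the sketch below shows the plain PyTorch mechanics that `einsum` and `compute_gradient` build on (direct torch calls, not the TensorOps wrappers themselves):

```python
import torch

# einsum: batched matrix multiplication as an index equation
x = torch.rand(4, 2, 3)
y = torch.rand(4, 3, 5)
z = torch.einsum("bij,bjk->bik", x, y)  # shape (4, 2, 5)

# gradient of a scalar function via autograd
t = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
(t ** 2).sum().backward()
print(z.shape, t.grad)  # grad of sum(t^2) is 2*t -> tensor([2., 4., 6.])
```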
Tensor Decomposition Operations
Tensorus includes a library of higher-order tensor factorizations in `tensor_decompositions.py`. These operations mirror the algorithms available in TensorLy and related libraries.
- CP Decomposition - Canonical Polyadic factorization returning weights and factor matrices.
- NTF-CP Decomposition - Non-negative CP using `non_negative_parafac`.
- Tucker Decomposition - Standard Tucker factorization for specified ranks.
- Non-negative Tucker / Partial Tucker - Variants with HOOI and non-negative constraints.
- HOSVD - Higher-order SVD (Tucker with full ranks).
- Tensor Train (TT) - Sequence of TT cores representing the tensor.
- TT-SVD - TT factorization via SVD initialization.
- Tensor Ring (TR) - Circular variant of TT.
- Hierarchical Tucker (HT) - Decomposition using a dimension tree.
- Block Term Decomposition (BTD) - Sum of Tucker-1 terms for 3-way tensors.
- t-SVD - Tensor singular value decomposition based on the t-product.
Examples of how to call these methods are provided in `tensorus/tensor_decompositions.py`.
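Since these wrappers mirror TensorLy, the equivalent direct TensorLy call looks like this (a sketch using `parafac` for CP decomposition; the Tensorus wrapper names may differ slightly):

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Rank-2 CP decomposition of a random 3-way tensor
X = tl.tensor(np.random.rand(4, 5, 6))
weights, factors = parafac(X, rank=2)
print([f.shape for f in factors])  # [(4, 2), (5, 2), (6, 2)]
```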
Vector Database Features
Embedding Generation
Tensorus supports multiple embedding providers for generating high-quality vector representations of text:
- Sentence Transformers: Local models including all-MiniLM-L6-v2, all-mpnet-base-v2, and specialized models
- OpenAI: Cloud-based models like text-embedding-3-small and text-embedding-3-large
- Extensible Architecture: Easy integration of additional embedding providers
Vector Indexing
Advanced vector indexing capabilities for efficient similarity search:
- Geometric Partitioning: Automatic distribution of vectors across partitions using k-means clustering
- Freshness Layers: Real-time updates without requiring full index rebuilds
- FAISS Integration: High-performance similarity search with multiple distance metrics
- Multi-tenancy: Namespace and tenant isolation for secure multi-user deployments
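Conceptually, geometric partitioning clusters vectors so a query only probes the nearest partitions. This is an illustrative sketch with scikit-learn k-means, not the Tensorus index internals:

```python
import numpy as np
from sklearn.cluster import KMeans

vectors = np.random.rand(1000, 384).astype(np.float32)
kmeans = KMeans(n_clusters=8, n_init=10).fit(vectors)

query = np.random.rand(384).astype(np.float32)
# Probe only the partition whose centroid is closest to the query
nearest = np.argmin(np.linalg.norm(kmeans.cluster_centers_ - query, axis=1))
candidates = vectors[kmeans.labels_ == nearest]
print(f"searching {len(candidates)} of {len(vectors)} vectors")
```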
Hybrid Search
Unique hybrid search capabilities that combine semantic similarity with computational tensor properties:
- Semantic Scoring: Traditional vector similarity search based on text embeddings
- Computational Scoring: Mathematical property evaluation including shape compatibility, sparsity, rank analysis
- Operation Compatibility: Scoring tensors based on suitability for specific mathematical operations
- Combined Ranking: Weighted combination of semantic and computational relevance scores
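The combined ranking reduces to a weighted sum of the two scores, using the same `similarity_weight`/`computation_weight` fields as the hybrid-search request example above; a minimal sketch:

```python
def hybrid_score(semantic: float, computational: float,
                 similarity_weight: float = 0.7,
                 computation_weight: float = 0.3) -> float:
    """Weighted combination of semantic and computational relevance."""
    return similarity_weight * semantic + computation_weight * computational

# Strong semantic match, weaker shape/sparsity compatibility
print(hybrid_score(semantic=0.92, computational=0.40))  # 0.764
```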
Tensor Workflows
Execute complex mathematical workflows with full computational lineage tracking:
- Workflow Execution: Chain multiple tensor operations with intermediate result storage
- Lineage Tracking: Complete provenance tracking of tensor transformations
- Scientific Reproducibility: Full audit trail of computational steps for research applications
- Intermediate Storage: Optional preservation of intermediate results for analysis
Completed Features
The current codebase implements all of the items listed in Key Features. Tensorus already provides efficient tensor storage with optional file persistence, a natural query language, a flexible agent framework, a RESTful API, a Streamlit UI, robust tensor operations, and advanced vector database capabilities. The modular architecture makes future extensions straightforward.
Future Implementation
- Enhanced NQL: Integrate a local or remote LLM for more robust natural language understanding.
- Advanced Agents: Develop more sophisticated agents for specific tasks (e.g., anomaly detection, forecasting).
- Persistent Storage Backend: Replace/augment current file-based persistence with more robust database or cloud storage solutions (e.g., PostgreSQL, S3, MinIO).
- Advanced Vector Indexing: Implement HNSW and IVF-PQ algorithms for even more efficient similarity search.
- Scalability & Performance:
- Implement tensor chunking for very large tensors.
- Optimize query performance with indexing.
- Asynchronous operations for agents and API calls.
- Security: Implement authentication and authorization mechanisms for the API and UI.
- Real-World Integration:
- Connect Ingestion Agent to more data sources (e.g., cloud storage, databases, APIs).
- Integrate RL Agent with real-world environments or more complex simulations.
- Advanced AutoML:
- Implement sophisticated search algorithms (e.g., Bayesian Optimization, Hyperband).
- Support for diverse model architectures and custom models.
- Model Management: Add capabilities for saving, loading, versioning, and deploying trained models (from RL/AutoML).
- Streaming Data Support: Enhance Ingestion Agent to handle real-time streaming data.
- Resource Management: Add tools and controls for monitoring and managing the resource consumption (CPU, memory) of agents.
- Improved UI/UX: Continuously refine the Streamlit UI for better usability and richer visualizations.
- Comprehensive Testing: Expand unit, integration, and end-to-end tests.
- Multi-modal Embeddings: Support for image, audio, and video embeddings alongside text.
- Distributed Architecture: Multi-node deployments for large-scale vector search workloads.
Community & Contributing
Get Help & Support
Community Resources:
- Documentation Hub - Comprehensive guides and tutorials
- GitHub Discussions - Ask questions and share ideas
- Issue Tracker - Bug reports and feature requests
- Stack Overflow - Technical Q&A with the community
Enterprise Support:
- Technical Support: support@tensorus.com
- Sales & Partnerships: sales@tensorus.com
- Security Issues: security@tensorus.com
Contributing to Tensorus
We welcome contributions from the community! Here's how to get involved:
Report Issues
- Use our issue templates for bug reports
- Include system information, reproduction steps, and expected behavior
- Search existing issues before creating new ones
Code Contributions
- Fork the repository and create a feature branch
- Develop with proper tests and documentation
- Test your changes locally using `pytest`
- Submit a pull request with a clear description and examples
Documentation Improvements
- Fix typos, improve clarity, and add examples
- Translate documentation to other languages
- Create tutorials and use case guides
- Update API documentation and code comments
Feature Requests & Ideas
- Propose new features via GitHub Discussions
- Provide detailed use cases and implementation suggestions
- Participate in design discussions and RFC processes
Development Resources:
- Contributing Guide - Detailed contribution guidelines
- Code of Conduct - Community standards and expectations
- Development Setup - Local development environment
License & Legal
MIT License - See LICENSE file for complete terms.
Copyright (c) 2024 Tensorus Contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
Third-Party Licenses: This project includes dependencies with their own licenses. See requirements.txt and individual package documentation for details.