Shared Tensor
A high-performance library for sharing GPU memory objects across processes using IPC mechanisms with JSON-RPC 2.0 protocol, enabling model and inference engine separation architecture.
Project Overview
Shared Tensor is a cross-process communication library designed specifically for deep learning and AI applications, utilizing IPC mechanisms and JSON-RPC protocol to achieve:
- Efficient GPU Memory Sharing: Cross-process sharing of PyTorch tensors and models
- Remote Function Execution: Easy remote function calls through decorators
- Async/Sync Support: Flexible execution modes for different scenarios
- Model Serving: Deploy machine learning models as independent services
- Distributed Inference: Support for distributed computing in multi-GPU environments
Core Features
Cross-Process Communication
- JSON-RPC 2.0 Protocol: Standardized remote procedure calls
- HTTP Transport: Reliable HTTP-based communication mechanism
- Serialization Optimization: Efficient PyTorch object serialization/deserialization
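As a rough illustration of the protocol layer listed above, the following sketch builds a JSON-RPC 2.0 request, parses a response, and shows one way binary data can travel inside JSON. The helper names (`build_request`, `parse_response`, `encode_payload`, `decode_payload`) and the pickle-plus-base64 encoding are assumptions made for this example; the library's actual wire format is not documented here.

```python
import base64
import json
import pickle


def build_request(method, params, request_id):
    """Build a JSON-RPC 2.0 request object as a JSON string."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": request_id,
    })


def parse_response(raw):
    """Return the result of a JSON-RPC 2.0 response, or raise on an error member."""
    message = json.loads(raw)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"RPC error {err['code']}: {err['message']}")
    return message["result"]


def encode_payload(obj):
    """Pickle an object and base64-encode it so it fits in a JSON string (assumed format)."""
    return base64.b64encode(pickle.dumps(obj)).decode("ascii")


def decode_payload(text):
    """Reverse encode_payload. Only unpickle data that comes from a trusted peer."""
    return pickle.loads(base64.b64decode(text))


# "add_numbers" mirrors the Quick Start example; the params layout is hypothetical.
request = build_request("add_numbers", {"a": 1, "b": 2}, request_id=1)
response = json.dumps({"jsonrpc": "2.0", "result": 3, "id": 1})
print(request)
print(parse_response(response))
```

JSON has no binary type, so any tensor or model bytes need a text-safe encoding of some kind; base64 is one common choice and is used here only for illustration.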
Function Sharing
- Decorator Pattern: Easy function sharing using @provider.share
- Auto Discovery: Smart function path resolution and import
- Parameter Passing: Support for complex data type parameters
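The decorator and path-resolution ideas above can be sketched with the standard library alone. `FunctionRegistry` and `resolve` are hypothetical stand-ins written for this example, not classes from shared_tensor, and the `module:qualname` path format is an assumption.

```python
import importlib
import json
import math


class FunctionRegistry:
    """Hypothetical stand-in for a provider; not the library's own class."""

    def __init__(self):
        self._functions = {}

    def share(self, name=None):
        """Return a decorator that records a function under a public name."""
        def decorator(func):
            key = name or func.__name__
            # Store an importable "module:qualname" path instead of the object,
            # so another process can locate the same function by importing it.
            self._functions[key] = f"{func.__module__}:{func.__qualname__}"
            return func
        return decorator

    def paths(self):
        """Return a copy of the name-to-path mapping."""
        return dict(self._functions)


def resolve(path):
    """Import the module, then walk the qualified name to reach the callable."""
    module_name, _, qualname = path.partition(":")
    target = importlib.import_module(module_name)
    for part in qualname.split("."):
        target = getattr(target, part)
    return target


registry = FunctionRegistry()
# Calling the decorator directly is equivalent to writing @registry.share(...)
# above a function definition; standard-library functions keep the demo importable.
registry.share(name="sqrt")(math.sqrt)
registry.share()(json.dumps)

print(registry.paths())
print(resolve(registry.paths()["sqrt"])(16.0))
```

Storing a path rather than the function object is what lets a separate server process import and run the same code, provided the module is installed on both sides.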
Async Support
- Async Execution: AsyncSharedTensorProvider supports non-blocking calls
- Task Management: Complete async task status tracking
- Concurrent Processing: Efficient concurrent request handling
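To make the task-tracking idea above concrete, here is a minimal sketch built on `concurrent.futures`. `TaskManager`, its method names, and the status strings are hypothetical and chosen for this example; they are not the interface of `async_task.py`.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class TaskManager:
    """Hypothetical task tracker: submit work, poll its status, fetch the result."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}

    def submit(self, func, *args, **kwargs):
        """Schedule func in the pool and return an opaque task id."""
        task_id = uuid.uuid4().hex
        self._futures[task_id] = self._pool.submit(func, *args, **kwargs)
        return task_id

    def status(self, task_id):
        """Map the state of the underlying future to a status string."""
        future = self._futures[task_id]
        if future.cancelled():
            return "cancelled"
        if future.running():
            return "running"
        if not future.done():
            return "pending"
        return "failed" if future.exception() is not None else "completed"

    def result(self, task_id, timeout=None):
        """Block until the task finishes, re-raising any exception it raised."""
        return self._futures[task_id].result(timeout=timeout)

    def shutdown(self):
        """Wait for outstanding tasks and release the worker threads."""
        self._pool.shutdown(wait=True)


manager = TaskManager(max_workers=2)
task_id = manager.submit(pow, 2, 10)
value = manager.result(task_id, timeout=5)
print(manager.status(task_id), value)
manager.shutdown()
```

A client would typically receive the task id immediately and poll `status` until the task completes, which keeps the calling process free while a long-running job executes.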
GPU Compatibility
- CUDA Support: Native CUDA tensor sharing support
- Device Management: Smart data migration between devices
- Memory Optimization: Efficient GPU memory usage
Installation Guide
Requirements
- Python: 3.8+
- Operating System: Linux (recommended)
- PyTorch: 1.12.0+
- CUDA: Optional, for GPU support
Installation Methods
Install from PyPI
pip install shared-tensor
Install from Source
# Clone the repository
git clone https://github.com/world-sim-dev/shared-tensor.git
cd shared-tensor
# Install dependencies
pip install -r requirements.txt
# Install the package
pip install -e .
Development Installation
# Install with development dependencies
pip install -e ".[dev]"
# Install with test dependencies
pip install -e ".[test]"
Verify Installation
# Check core functionality
python -c "import shared_tensor; print('Shared Tensor installed successfully')"
Quick Start
1. Basic Function Sharing
from shared_tensor.async_provider import AsyncSharedTensorProvider

# Create provider
provider = AsyncSharedTensorProvider()

# Share simple function
@provider.share()
def add_numbers(a, b):
    return a + b

# Share PyTorch function
@provider.share()
def create_tensor(shape):
    import torch
    return torch.zeros(shape)

# Load PyTorch model
@provider.share()
def load_model():
    ...
2. Start Server
# Method 1: Use command line tool, single server
shared-tensor-server
# Method 2: Use torchrun
torchrun --nproc_per_node=4 --no-python shared-tensor-server
# Method 3: Custom configuration
python shared_tensor/server.py
Detailed Usage
Model Sharing Example
import torch
import torch.nn as nn
from shared_tensor.async_provider import AsyncSharedTensorProvider

# Create provider
provider = AsyncSharedTensorProvider()

# Define model
class SimpleNet(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Share model creation function
@provider.share(name="create_model")
def create_model(input_size=784, hidden_size=128, output_size=10):
    model = SimpleNet(input_size, hidden_size, output_size)
    return model

# Run inference with the shared model
model = create_model()
# Example batch (an assumption): one sample matching the default input_size
input_data = torch.randn(1, 784)
with torch.no_grad():
    output = model(input_data)
Configuration Options
Server Configuration
from shared_tensor.server import SharedTensorServer

server = SharedTensorServer(
    host="0.0.0.0",      # Listen address
    port=2537,           # Port number
    timeout=30,          # Request timeout
    max_workers=4,       # Maximum worker threads
    enable_cache=True,   # Enable result caching
    debug=False          # Debug mode
)
Testing
Run Test Suite
# Run all tests
python tests/run_tests.py
# Run specific category tests
python tests/run_tests.py --category unit
python tests/run_tests.py --category integration
python tests/run_tests.py --category pytorch
# Run only PyTorch related tests
python tests/run_tests.py --torch-only
# Verbose output
python tests/run_tests.py --verbose
Test Environment Info
# Check test environment
python tests/run_tests.py --env-info
Individual Test Files
# Test tensor serialization
python tests/pytorch_tests/test_tensor_serialization.py
# Test async system
python tests/integration/test_async_system.py
# Test client
python tests/integration/test_client.py
Architecture Design
Core Components
shared-tensor/
├── shared_tensor/          # Core modules
│   ├── server.py           # JSON-RPC server
│   ├── client.py           # Sync client
│   ├── provider.py         # Sync provider
│   ├── async_client.py     # Async client
│   ├── async_provider.py   # Async provider
│   ├── async_task.py       # Async task management
│   ├── jsonrpc.py          # JSON-RPC protocol implementation
│   ├── utils.py            # Utility functions
│   └── errors.py           # Exception definitions
├── examples/               # Usage examples
└── tests/                  # Test suite
Communication Flow
sequenceDiagram
participant CA as Client App
participant SC as SharedTensorClient
participant SS as SharedTensorServer
participant FE as Function Executor
Note over CA, FE: Client-Server Communication Flow
CA->>SC: call_function("model_inference", args)
SC->>SC: Serialize parameters
SC->>SS: HTTP POST /jsonrpc<br/>JSON-RPC Request
Note over SS: Server Processing
SS->>SS: Parse JSON-RPC request
SS->>SS: Resolve function path
SS->>FE: Import & execute function
FE->>FE: Deserialize parameters
FE->>FE: Execute function logic
FE->>SS: Return execution result
Note over SS: Response Preparation
SS->>SS: Serialize result
SS->>SS: Create JSON-RPC response
SS->>SC: HTTP Response<br/>JSON-RPC Result
Note over SC: Client Processing
SC->>SC: Parse response
SC->>SC: Deserialize result
SC->>CA: Return final result
Note over CA, FE: End-to-End Process Complete
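The flow in the diagram can be approximated end to end with the standard library. This is a sketch under stated assumptions, not the library's server or client: `FUNCTIONS` and `JsonRpcHandler` are hypothetical, the handler accepts a POST on any path (the example posts to `/jsonrpc` as in the diagram), and a plain dictionary lookup stands in for import-path resolution.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical function table; a real server would resolve import paths instead.
FUNCTIONS = {"add_numbers": lambda a, b: a + b}


class JsonRpcHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Parse the JSON-RPC request from the HTTP body.
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length))
        reply = {"jsonrpc": "2.0", "id": request.get("id")}
        func = FUNCTIONS.get(request.get("method"))
        if func is None:
            # -32601 is the JSON-RPC 2.0 error code for "Method not found".
            reply["error"] = {"code": -32601, "message": "Method not found"}
        else:
            reply["result"] = func(**request.get("params", {}))
        # Serialize the reply and send it back as the HTTP response.
        body = json.dumps(reply).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Silence per-request logging for the example.


def call_function(url, method, params, request_id=1):
    """Serialize the call, POST it, and return the deserialized result."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        message = json.loads(response.read())
    if "error" in message:
        raise RuntimeError(message["error"]["message"])
    return message["result"]


server = HTTPServer(("127.0.0.1", 0), JsonRpcHandler)  # Port 0 picks a free port.
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/jsonrpc"
result = call_function(url, "add_numbers", {"a": 2, "b": 3})
print(result)
server.shutdown()
server.server_close()
```

Each step maps to an arrow in the diagram: `call_function` covers serialization and the HTTP POST, `do_POST` covers parsing, execution, and response preparation, and the final `json.loads` on the client side is the deserialization step.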
Debug Tips
- Enable verbose logging:
import logging
logging.basicConfig(level=logging.DEBUG)
- Use debug mode:
provider = SharedTensorProvider(verbose_debug=True)
- Check function paths:
provider = SharedTensorProvider()
print(provider._registered_functions)
Contributing
We welcome community contributions! Please follow these steps:
Development Environment Setup
# Clone repository
git clone https://github.com/world-sim-dev/shared-tensor.git
cd shared-tensor
# Create virtual environment
python -m venv venv
source venv/bin/activate
# Install development dependencies
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install
# Package & Publish
python -m pip install build
python -m build --sdist
python -m twine upload --repository testpypi dist/*
python -m twine upload dist/*
Code Standards
# Code formatting
black shared_tensor/ tests/ examples/
# Import sorting
isort shared_tensor/ tests/ examples/
# Static checking
flake8 shared_tensor/
mypy shared_tensor/
Submission Process
- Fork the project and create a feature branch
- Write code and tests
- Run the complete test suite
- Submit a Pull Request
Test Requirements
- New features must include tests
- Maintain test coverage > 90%
- All tests must pass
License
This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
Acknowledgments
- PyTorch - Deep learning framework
- JSON-RPC 2.0 - Remote procedure call protocol
Contact Us
- Issues: GitHub Issues
- Documentation: Shared Tensor Documentation
- Source: GitHub Repository
Shared Tensor - Making GPU memory sharing simple and efficient