
Project description


JetStream is a throughput and memory optimized engine for LLM inference on XLA devices.

About

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

JetStream Engine Implementation

Currently, there are two reference engine implementations available -- one for JAX models and another for PyTorch models.

JAX

PyTorch

Documentation

JetStream Standalone Local Setup

Getting Started

Setup

pip install -r requirements.txt

Run local server & testing

Use the following commands to run a server locally:

# Start a server
python -m jetstream.core.implementations.mock.server

# Test local mock server
python -m jetstream.tools.requester

# Load test local mock server
python -m jetstream.tools.load_tester
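Conceptually, a load tester fires many concurrent requests at the server and reports the achieved throughput. A minimal, self-contained sketch of that pattern is below; it uses a stubbed request function rather than JetStream's actual client, and all names in it are illustrative, not part of the jetstream.tools.load_tester API.

```python
"""Sketch of a concurrent load test; the stub below stands in for real
requests to the local mock server and is NOT JetStream's API."""
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(prompt: str) -> str:
    # Stand-in for a round trip to the mock server.
    time.sleep(0.01)
    return prompt[::-1]


def run_load_test(num_requests: int = 100, concurrency: int = 10) -> float:
    """Send num_requests in parallel and return achieved requests/second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(
            pool.map(fake_request, (f"prompt-{i}" for i in range(num_requests)))
        )
    elapsed = time.perf_counter() - start
    assert len(results) == num_requests
    return num_requests / elapsed
```

With a 10 ms stubbed request, raising `concurrency` should raise the reported requests/second until the stub's latency dominates.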

Test core modules

# Test JetStream core orchestrator
python -m jetstream.tests.core.test_orchestrator

# Test JetStream core server library
python -m jetstream.tests.core.test_server

# Test mock JetStream engine implementation
python -m jetstream.tests.engine.test_mock_engine

# Test mock JetStream token utils
python -m jetstream.tests.engine.test_utils
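Each command above runs a test module as a script. For readers unfamiliar with that pattern, here is a minimal stdlib unittest sketch of the same run-a-suite flow; the TestCase and its check are illustrative placeholders, not one of JetStream's tests.

```python
"""Minimal run-a-test-module pattern with stdlib unittest; the test case
here is a placeholder, NOT a JetStream test."""
import unittest


class TokenUtilsSketch(unittest.TestCase):
    # Illustrative round-trip check standing in for a real test.
    def test_round_trip(self):
        tokens = [1, 2, 3]
        self.assertEqual(list(reversed(list(reversed(tokens)))), tokens)


def run_suite() -> bool:
    """Load and run the suite, returning True if all tests passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TokenUtilsSketch)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```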

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

google_jetstream-0.2.1.tar.gz (37.7 kB)

Built Distribution

google_jetstream-0.2.1-py3-none-any.whl (57.6 kB)
