Multimodal AI services & pipelines with cloud-native stack: gRPC, Kubernetes, Docker, OpenTelemetry, Prometheus, Jaeger, etc.

Project description

Jina-Serve

Jina-serve is a framework for building and deploying AI services that communicate via gRPC, HTTP and WebSockets. Scale your services from local development to production while focusing on your core logic.

Key Features

  • Native support for all major ML frameworks and data types
  • High-performance service design with scaling, streaming, and dynamic batching
  • LLM serving with streaming output
  • Built-in Docker integration and Executor Hub
  • One-click deployment to Jina AI Cloud
  • Enterprise-ready with Kubernetes and Docker Compose support

Comparison with FastAPI

Key advantages over FastAPI:

  • DocArray-based data handling with native gRPC support
  • Built-in containerization and service orchestration
  • Seamless scaling of microservices
  • One-command cloud deployment

Install

pip install jina

See guides for Apple Silicon and Windows.

Core Concepts

Three main layers:

  • Data: BaseDoc and DocList for input/output
  • Serving: Executors process Documents, Gateway connects services
  • Orchestration: Deployments serve Executors, Flows create pipelines
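
To make the division of responsibilities concrete, here is a minimal standard-library sketch of the three layers. It does not use Jina; `Doc`, `ToyExecutor`, and `ToyDeployment` are hypothetical stand-ins invented for illustration, and the real classes do considerably more (typing, networking, serialization).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Doc:
    """Data layer stand-in: one typed record (plays the role of a BaseDoc)."""
    text: str


Handler = Callable[[List[Doc]], List[Doc]]


class ToyExecutor:
    """Serving layer stand-in: maps endpoint names to handler functions."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Handler] = {}

    def register(self, endpoint: str, fn: Handler) -> None:
        self.handlers[endpoint] = fn


class ToyDeployment:
    """Orchestration layer stand-in: owns an executor and routes requests to it."""

    def __init__(self, executor: ToyExecutor) -> None:
        self.executor = executor

    def post(self, endpoint: str, docs: List[Doc]) -> List[Doc]:
        # Look up the handler bound to this endpoint and apply it to the docs
        return self.executor.handlers[endpoint](docs)


def upper(docs: List[Doc]) -> List[Doc]:
    return [Doc(text=d.text.upper()) for d in docs]


executor = ToyExecutor()
executor.register('/upper', upper)
dep = ToyDeployment(executor)
result = dep.post('/upper', [Doc('hello'), Doc('jina')])
print([d.text for d in result])
```

The point of the split is that the handler only ever sees a list of documents; where it runs and how requests reach it are decided one layer up.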

Build AI Services

Let's create a gRPC-based AI service using StableLM:

from jina import Executor, requests
from docarray import DocList, BaseDoc
from transformers import pipeline


class Prompt(BaseDoc):
    text: str


class Generation(BaseDoc):
    prompt: str
    text: str


class StableLM(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.generator = pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )

    @requests
    def generate(self, docs: DocList[Prompt], **kwargs) -> DocList[Generation]:
        generations = DocList[Generation]()
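        prompts = docs.text
        # For a list of prompts, the pipeline returns one list of candidates per prompt
        llm_outputs = self.generator(prompts)
        for prompt, output in zip(prompts, llm_outputs):
            generations.append(
                Generation(prompt=prompt, text=output[0]['generated_text'])
            )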
        return generations

Deploy with Python or YAML:

from jina import Deployment
from executor import StableLM

dep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)

with dep:
    dep.block()

Or, equivalently, in YAML:

jtype: Deployment
with:
 uses: StableLM
 py_modules:
   - executor.py
 timeout_ready: -1
 port: 12345

Use the client:

from jina import Client
from docarray import DocList
from executor import Prompt, Generation

prompt = Prompt(text='suggest an interesting image generation prompt')
client = Client(port=12345)
response = client.post('/', inputs=[prompt], return_type=DocList[Generation])

Build Pipelines

Chain services into a Flow:

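from jina import Flow
from executor import StableLM
from text_to_image import TextToImage  # assumes TextToImage is defined in text_to_image.py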

flow = Flow(port=12345).add(uses=StableLM).add(uses=TextToImage)

with flow:
    flow.block()
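
The chaining idea can be sketched without Jina: each stage receives the previous stage's output. `ToyFlow` and the two stage functions below are hypothetical stand-ins for illustration only; a real Flow also handles networking, typing, and scheduling between Executors.

```python
from typing import Callable, List

Stage = Callable[[List[str]], List[str]]


class ToyFlow:
    """Hypothetical stand-in for a pipeline of services."""

    def __init__(self) -> None:
        self.stages: List[Stage] = []

    def add(self, stage: Stage) -> 'ToyFlow':
        self.stages.append(stage)
        return self  # returning self is what allows .add(...).add(...)

    def run(self, docs: List[str]) -> List[str]:
        # Feed the output of each stage into the next one, in insertion order
        for stage in self.stages:
            docs = stage(docs)
        return docs


def fake_llm(prompts: List[str]) -> List[str]:
    # Stands in for a text-generation service
    return [p + ' in watercolor style' for p in prompts]


def fake_text_to_image(prompts: List[str]) -> List[str]:
    # Stands in for an image-generation service
    return ['<image of ' + p + '>' for p in prompts]


toy_flow = ToyFlow().add(fake_llm).add(fake_text_to_image)
images = toy_flow.run(['a lighthouse'])
print(images)
```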

Scaling and Deployment

Local Scaling

Boost throughput with built-in features:

  • Replicas for parallel processing
  • Shards for data partitioning
  • Dynamic batching for efficient model inference
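
As a rough illustration of the difference between the first two options: replicas are interchangeable copies, so any of them can take a request, while shards each own part of the data. The sketch below uses round-robin dispatch and hash partitioning as assumed, simplified schemes; the function names are hypothetical and Jina's actual scheduling and polling strategies may differ.

```python
import itertools
import zlib
from typing import Dict, List


def round_robin(requests: List[str], num_replicas: int) -> Dict[int, List[str]]:
    """Spread requests evenly over identical replicas, one after another."""
    assignment: Dict[int, List[str]] = {r: [] for r in range(num_replicas)}
    for request, replica in zip(requests, itertools.cycle(range(num_replicas))):
        assignment[replica].append(request)
    return assignment


def shard_for(key: str, num_shards: int) -> int:
    """Pick the shard that owns a key, using a stable hash of the key."""
    return zlib.crc32(key.encode('utf-8')) % num_shards


print(round_robin(['a', 'b', 'c', 'd', 'e'], num_replicas=2))
print(shard_for('doc-1', num_shards=3))
```

A stable hash (here CRC32) matters for sharding because the same key has to land on the same shard every time, including across process restarts.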

Example scaling a Stable Diffusion deployment:

jtype: Deployment
with:
 uses: TextToImage
 timeout_ready: -1
 py_modules:
   - text_to_image.py
 env:
  CUDA_VISIBLE_DEVICES: RR
 replicas: 2
 uses_dynamic_batching:
   /default:
     preferred_batch_size: 10
     timeout: 200
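
The two dynamic batching settings interact as follows: a batch is sent to the model once it reaches the preferred size, or once the oldest waiting request has waited for the timeout, whichever comes first. The sketch below simulates this on a list of arrival times. It assumes the timeout is expressed in milliseconds, and `plan_batches` is a hypothetical helper, not part of Jina.

```python
from typing import List


def plan_batches(
    arrivals_ms: List[int], preferred_batch_size: int, timeout_ms: int
) -> List[List[int]]:
    """Group request arrival times (in ms) into the batches a batcher would form."""
    batches: List[List[int]] = []
    current: List[int] = []
    for t in sorted(arrivals_ms):
        # A pending batch whose oldest request has waited long enough is flushed
        if current and t - current[0] >= timeout_ms:
            batches.append(current)
            current = []
        current.append(t)
        # A full batch is flushed immediately
        if len(current) == preferred_batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)  # whatever is left is flushed at the end
    return batches


# Three quick requests fill a batch; the two late ones form their own batch
print(plan_batches([0, 10, 20, 500, 510], preferred_batch_size=3, timeout_ms=200))
# With a large preferred size, the timeout is what closes the first batch
print(plan_batches([0, 50, 300], preferred_batch_size=10, timeout_ms=200))
```

A larger preferred batch size favors throughput, while a smaller timeout bounds the extra latency a lone request can incur.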

Cloud Deployment

Containerize Services

  1. Structure your Executor:
TextToImage/
├── executor.py
├── config.yml
└── requirements.txt
  2. Configure:
# config.yml
jtype: TextToImage
py_modules:
 - executor.py
metas:
 name: TextToImage
 description: Text to Image generation Executor
  3. Push to Hub:
jina hub push TextToImage

Deploy to Kubernetes

jina export kubernetes flow.yml ./my-k8s
kubectl apply -R -f my-k8s

Use Docker Compose

jina export docker-compose flow.yml docker-compose.yml
docker-compose up

JCloud Deployment

Deploy with a single command:

jina cloud deploy jcloud-flow.yml

LLM Streaming

Enable token-by-token streaming for responsive LLM applications:

  1. Define schemas:
from docarray import BaseDoc


class PromptDocument(BaseDoc):
    prompt: str
    max_tokens: int


class ModelOutputDocument(BaseDoc):
    token_id: int
    generated_text: str
  2. Initialize service:
import torch
from jina import Executor, requests
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Module-level tokenizer shared by the Executor methods
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')


class TokenStreamingExecutor(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.model = GPT2LMHeadModel.from_pretrained('gpt2')
  3. Implement streaming (add this method to TokenStreamingExecutor):
@requests(on='/stream')
async def task(self, doc: PromptDocument, **kwargs) -> ModelOutputDocument:
    input = tokenizer(doc.prompt, return_tensors='pt')
    input_len = input['input_ids'].shape[1]
    for _ in range(doc.max_tokens):
        # Generate one token per iteration so it can be streamed immediately
        output = self.model.generate(**input, max_new_tokens=1)
        if output[0][-1] == tokenizer.eos_token_id:
            break
        yield ModelOutputDocument(
            token_id=int(output[0][-1]),
            generated_text=tokenizer.decode(
                output[0][input_len:], skip_special_tokens=True
            ),
        )
        # Feed the extended sequence back in as the next input
        input = {
            'input_ids': output,
            'attention_mask': torch.ones(1, len(output[0])),
        }
  4. Serve and use:
# Server
from jina import Deployment

with Deployment(uses=TokenStreamingExecutor, port=12345, protocol='grpc') as dep:
    dep.block()


# Client (run in a separate process while the server is up)
import asyncio

from jina import Client


async def main():
    client = Client(port=12345, protocol='grpc', asyncio=True)
    async for doc in client.stream_doc(
        on='/stream',
        inputs=PromptDocument(prompt='what is the capital of France ?', max_tokens=10),
        return_type=ModelOutputDocument,
    ):
        print(doc.generated_text)


asyncio.run(main())
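
The streaming pattern itself is an async generator on the producing side and an `async for` loop on the consuming side. The sketch below shows that shape with the standard library only: `stream_tokens` is a hypothetical stand-in that yields canned words instead of calling a model, and no network is involved.

```python
import asyncio
from typing import AsyncIterator, List


async def stream_tokens(prompt: str, max_tokens: int) -> AsyncIterator[str]:
    """Yield the text generated so far, one (canned) token at a time."""
    canned = ['Paris', 'is', 'the', 'capital', 'of', 'France', '.']
    generated: List[str] = []
    for word in canned[:max_tokens]:
        # Hand control back to the event loop, as a real model call would
        await asyncio.sleep(0)
        generated.append(word)
        yield ' '.join(generated)


async def collect(prompt: str, max_tokens: int) -> List[str]:
    chunks: List[str] = []
    # The consumer sees each partial result as soon as it is yielded
    async for partial in stream_tokens(prompt, max_tokens):
        chunks.append(partial)
    return chunks


chunks = asyncio.run(collect('what is the capital of France ?', 3))
print(chunks)  # ['Paris', 'Paris is', 'Paris is the']
```

Yielding the accumulated text rather than only the newest token mirrors the `generated_text` field above, so a client can simply redraw the latest value.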

Support

Jina-serve is backed by Jina AI and licensed under Apache-2.0.

