Build multimodal AI services with cloud native technologies

Jina lets you build multimodal AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production. You can focus on your logic and algorithms, without worrying about the infrastructure complexity.

Jina provides a smooth Pythonic experience transitioning from local deployment to advanced orchestration frameworks like Docker-Compose, Kubernetes, or Jina AI Cloud. Jina makes advanced solution engineering and cloud-native technologies accessible to every developer.

  • Build applications for any data type, any mainstream deep learning framework, and any protocol.
  • Design high-performance microservices, with easy scaling, duplex client-server streaming, and async/non-blocking data processing over dynamic flows.
  • Docker container integration via Executor Hub, OpenTelemetry/Prometheus observability, and fast Kubernetes/Docker-Compose deployment (a tracing sketch follows this list).
  • CPU/GPU hosting via Jina AI Cloud.
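
For the observability bullet above, here is a hedged sketch of enabling OpenTelemetry tracing. The tracing and traces_exporter_* parameter names are recalled from Jina's observability docs and should be treated as assumptions rather than verified API:

from jina import Flow

# export traces to a local OpenTelemetry collector (assumed parameter names)
flow = Flow(
    tracing=True,
    traces_exporter_host='localhost',
    traces_exporter_port=4317,
)

with flow:
    flow.block()
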
Wait, how is Jina different from FastAPI? Jina's value proposition may seem quite similar to that of FastAPI. However, there are several fundamental differences:

Data structure and communication protocols

  • FastAPI communication relies on Pydantic, while Jina relies on DocArray, which lets Jina expose its services over multiple protocols (gRPC, HTTP, and WebSockets), as the sketch below shows.
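
For example, switching the exposed protocol is a one-parameter change. A minimal sketch, with MyExecutor as a placeholder:

from jina import Deployment, Executor, requests


class MyExecutor(Executor):  # placeholder Executor
    @requests
    def foo(self, docs, **kwargs):
        ...


# the same service exposed over HTTP instead of the default gRPC
dep = Deployment(uses=MyExecutor, protocol='http', port=12345)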

Advanced orchestration and scaling capabilities

  • Jina lets you deploy applications composed of multiple microservices that can be containerized and scaled independently.
  • Jina makes it easy to containerize and orchestrate your services, providing concurrency and scalability; see the sketch after this list.
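
A minimal sketch of what independent scaling looks like in the Python API (ExecutorA and ExecutorB are placeholders):

from jina import Executor, Flow


class ExecutorA(Executor):  # placeholder
    ...


class ExecutorB(Executor):  # placeholder
    ...


# each microservice in the pipeline is scaled on its own terms
flow = (
    Flow()
    .add(uses=ExecutorA, replicas=2)  # two replicas of the first service
    .add(uses=ExecutorB, shards=3)  # three shards of the second
)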

Journey to the cloud

  • Jina provides a smooth transition from local development (using DocArray), to local serving (using Jina's orchestration layer), to production-ready services using Kubernetes' ability to orchestrate the lifetime of containers.
  • With Jina AI Cloud you get scalable and serverless deployment of your applications with one command.

Documentation

Install

pip install jina

Find more install options for Apple Silicon and Windows.

Get Started

Basic Concepts

Jina has four fundamental concepts:

  • A Document (from DocArray) is the input/output format in Jina.
  • An Executor is a Python class that transforms and processes Documents.
  • A Deployment serves a single Executor, while a Flow serves Executors chained into a pipeline.

The full glossary is explained here.
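
To make the first of these concepts concrete, here is a minimal sketch of creating Documents with DocArray:

from docarray import Document, DocumentArray

# a Document wraps one piece of (possibly multimodal) data
doc = Document(text='hello, world')

# a DocumentArray is a list-like container of Documents
docs = DocumentArray([doc])
print(docs.texts)  # ['hello, world']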

Build AI Services

Let's build a fast, reliable and scalable gRPC-based AI service. In Jina we call this an Executor. Our simple Executor will wrap the StableLM LLM from Stability AI. We'll then use a Deployment to serve it.

Note A Deployment serves just one Executor. To combine multiple Executors into a pipeline and serve that, use a Flow.

Let's implement the service's logic:

executor.py
from jina import Executor, requests
from docarray import DocumentArray

from transformers import pipeline


class StableLM(Executor):

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # load the StableLM text-generation model from the Hugging Face Hub;
        # the full model id is assumed here (the original snippet used 'stablelm-3b')
        self.generator = pipeline(
            'text-generation', model='stabilityai/stablelm-base-alpha-3b'
        )

    @requests
    def generate(self, docs: DocumentArray, **kwargs):
        # generate a completion for each Document's text and write it back
        generated_text = self.generator(docs.texts)
        docs.texts = [gen[0]['generated_text'] for gen in generated_text]

Then we deploy it with either the Python API or YAML:

Python API (deployment.py):

from jina import Deployment
from executor import StableLM

dep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)

with dep:
    dep.block()

YAML (deployment.yml):

jtype: Deployment
with:
  uses: StableLM
  py_modules:
    - executor.py
  timeout_ready: -1
  port: 12345

And run the YAML Deployment with the CLI: jina deployment --uses deployment.yml

Use Jina Client to make requests to the service:

from docarray import Document
from jina import Client

prompt = Document(
    text='suggest an interesting image generation prompt for a mona lisa variant'
)

client = Client(port=12345)  # use the port from the Deployment above
response = client.post(on='/', inputs=[prompt])

print(response[0].text)
a steampunk version of the Mona Lisa, incorporating mechanical gears, brass elements, and Victorian era clothing details

Note In a notebook, you can't use deployment.block() and then make requests to the client. Please refer to the Colab link above for reproducible Jupyter Notebook code snippets.
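
A notebook-friendly alternative (a sketch, not taken from the original docs) is to make requests from inside the with block instead of calling dep.block():

from docarray import Document
from jina import Client, Deployment
from executor import StableLM

dep = Deployment(uses=StableLM, timeout_ready=-1, port=12345)

with dep:
    # the service stays up for the duration of the with block
    client = Client(port=12345)
    response = client.post(on='/', inputs=[Document(text='a test prompt')])
    print(response[0].text)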

Build a pipeline

Sometimes you want to chain microservices together into a pipeline. That's where a Flow comes in.

A Flow is a DAG pipeline composed of a set of steps. It orchestrates a set of Executors and a Gateway to offer an end-to-end service.

Note If you just want to serve a single Executor, you can use a Deployment.

For instance, let's combine our StableLM language model with a Stable Diffusion image generation service from Jina AI's Executor Hub. Chaining these services together into a Flow will give us a service that will generate images based on a prompt generated by the LLM.

Build the Flow with either Python or YAML:

Python API (flow.py):

from jina import Flow
from executor import StableLM

flow = (
    Flow(port=12345)
    .add(uses=StableLM, timeout_ready=-1)
    .add(
        uses='jinaai://jina-ai/TextToImage',  # use an Executor from Jina's Executor Hub
        timeout_ready=-1,
        install_requirements=True,
    )
)

with flow:
    flow.block()

YAML (flow.yml):

jtype: Flow
with:
  port: 12345
executors:
  - uses: StableLM
    timeout_ready: -1
    py_modules:
      - executor.py
  - uses: jinaai://jina-ai/TextToImage
    timeout_ready: -1
    install_requirements: true

Then run the YAML Flow with the CLI: jina flow --uses flow.yml

Then, use Jina Client to make requests to the Flow:

from jina import Client, Document

client = Client(port=12345)

prompt = Document(
    text='suggest an interesting image generation prompt for a mona lisa variant'
)

response = client.post(on='/', inputs=[prompt])

response[0].display()

Deploy to the cloud

You can also deploy a Flow to JCloud.

First, turn the flow.yml file into a JCloud-compatible YAML by specifying resource requirements and using containerized Hub Executors.

Then, use the jina cloud deploy command to deploy to the cloud:

wget https://raw.githubusercontent.com/jina-ai/jina/master/.github/getting-started/jcloud-flow.yml
jina cloud deploy jcloud-flow.yml

Warning

Make sure to delete/clean up the Flow once you are done with this tutorial to save resources and credits.

Read more about deploying Flows to JCloud.

Check the getting-started project source code.

Easy scalability and concurrency

Why not just use standard Python to build that microservice and pipeline? Jina accelerates your application's time to market by making it more scalable and cloud-native. Jina also handles the infrastructure complexity in production and other Day-2 operations so that you can focus on the data application itself.

Increase your application's throughput with scalability features out of the box, like replicas, shards and dynamic batching.

Let's scale a Stable Diffusion Executor deployment with replicas and dynamic batching:

  • Create two replicas, with a GPU assigned for each.
  • Enable dynamic batching so that concurrent incoming requests are processed together in a single model inference.
Normal Deployment:

jtype: Deployment
with:
  timeout_ready: -1
  uses: jinaai://jina-ai/TextToImage
  install_requirements: true

Scaled Deployment:

jtype: Deployment
with:
  timeout_ready: -1
  uses: jinaai://jina-ai/TextToImage
  install_requirements: true
  env:
    CUDA_VISIBLE_DEVICES: RR  # round-robin GPU assignment across replicas
  replicas: 2
  uses_dynamic_batching:  # configure dynamic batching
    /default:
      preferred_batch_size: 10
      timeout: 200

Assuming your machine has two GPUs, using the scaled deployment YAML will give better throughput compared to the normal deployment.

These features apply to both Deployment YAML and Flow YAML. Thanks to the YAML syntax, you can inject deployment configurations regardless of Executor code.
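
The same options are also available from the Python API. A sketch mirroring the scaled YAML above, assuming the parameter names match the YAML keys:

from jina import Deployment

dep = Deployment(
    uses='jinaai://jina-ai/TextToImage',
    timeout_ready=-1,
    install_requirements=True,
    env={'CUDA_VISIBLE_DEVICES': 'RR'},  # round-robin GPU assignment
    replicas=2,
    uses_dynamic_batching={
        '/default': {'preferred_batch_size': 10, 'timeout': 200}
    },
)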

Get on the fast lane to cloud-native

Using Kubernetes with Jina is easy:

jina export kubernetes flow.yml ./my-k8s
kubectl apply -R -f my-k8s

And so is Docker Compose:

jina export docker-compose flow.yml docker-compose.yml
docker-compose up

Note You can also export Deployment YAML to Kubernetes and Docker Compose.
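
Assuming the export command accepts a Deployment YAML the same way it accepts a Flow YAML, that would look like this (a sketch, not verified against the CLI reference):

jina export kubernetes deployment.yml ./my-k8s
jina export docker-compose deployment.yml docker-compose.yml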

That's not all. We also support OpenTelemetry, Prometheus, and Jaeger.

What cloud-native technology is still challenging for you? Tell us and we'll handle the complexity and make it easy for you.

Support

Join Us

Jina is backed by Jina AI and licensed under Apache-2.0.

Project details



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • jina-3.18.0.tar.gz (353.7 kB)

Built Distributions

  • jina-3.18.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB): CPython 3.11, manylinux (glibc 2.5+/2.17+) x86-64
  • jina-3.18.0-cp311-cp311-macosx_11_0_arm64.whl (8.2 MB): CPython 3.11, macOS 11.0+ ARM64
  • jina-3.18.0-cp311-cp311-macosx_10_9_x86_64.whl (8.7 MB): CPython 3.11, macOS 10.9+ x86-64
  • jina-3.18.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB): CPython 3.10, manylinux (glibc 2.5+/2.17+) x86-64
  • jina-3.18.0-cp310-cp310-macosx_11_0_arm64.whl (8.2 MB): CPython 3.10, macOS 11.0+ ARM64
  • jina-3.18.0-cp310-cp310-macosx_10_9_x86_64.whl (8.7 MB): CPython 3.10, macOS 10.9+ x86-64
  • jina-3.18.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB): CPython 3.9, manylinux (glibc 2.5+/2.17+) x86-64
  • jina-3.18.0-cp39-cp39-macosx_11_0_arm64.whl (8.2 MB): CPython 3.9, macOS 11.0+ ARM64
  • jina-3.18.0-cp39-cp39-macosx_10_9_x86_64.whl (8.7 MB): CPython 3.9, macOS 10.9+ x86-64
  • jina-3.18.0-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB): CPython 3.8, manylinux (glibc 2.5+/2.17+) x86-64
  • jina-3.18.0-cp38-cp38-macosx_11_0_arm64.whl (8.2 MB): CPython 3.8, macOS 11.0+ ARM64
  • jina-3.18.0-cp38-cp38-macosx_10_9_x86_64.whl (8.7 MB): CPython 3.8, macOS 10.9+ x86-64
  • jina-3.18.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.7 MB): CPython 3.7m, manylinux (glibc 2.5+/2.17+) x86-64
