No project description provided
Project description
AI21 Labs Python SDK
Table of Contents
- Examples 🗂️
- Migration from v1.3.4 and below
- AI21 Official Documentation
- Installation 💿
- Usage - Chat Completions
- Conversational RAG (Beta)
- Older Models Support Usage
- More Models
- Token Counting
- Environment Variables
- Error Handling
- Cloud Providers ☁️
Examples (tl;dr)
If you want to quickly get a glance how to use the AI21 Python SDK and jump straight to business, you can check out the examples. Take a look at our models and see them in action! Several examples and demonstrations have been put together to show our models' functionality and capabilities.
Check out the Examples
Feel free to dive in, experiment, and adapt these examples to suit your needs. We believe they'll help you get up and running quickly.
Migration from v1.3.4 and below
In v2.0.0
we introduced a new SDK that is not backwards compatible with the previous version.
This version allows for non-static instances of the client, defined parameters to each resource, modelized responses and
more.
Migration Examples
Instance creation (not available in v1.3.4 and below)
from ai21 import AI21Client
client = AI21Client(api_key='my_api_key')
# or set api_key in environment variable - AI21_API_KEY and then
client = AI21Client()
We No longer support static methods for each resource, instead we have a client instance that has a method for each allowing for more flexibility and better control.
Completion before/after
prompt = "some prompt"
- import ai21
- response = ai21.Completion.execute(model="j2-light", prompt=prompt, maxTokens=2)
+ from ai21 import AI21Client
+ client = ai21.AI21Client()
+ response = client.completion.create(model="j2-light", prompt=prompt, max_tokens=2)
This applies to all resources. You would now need to create a client instance and use it to call the resource method.
Tokenization and Token counting before/after
- response = ai21.Tokenization.execute(text=prompt)
- print(len(response)) # number of tokens
+ from ai21 import AI21Client
+ client = AI21Client()
+ token_count = client.count_tokens(text=prompt)
Key Access in Response Objects before/after
It is no longer possible to access the response object as a dictionary. Instead, you can access the response object as an object with attributes.
- import ai21
- response = ai21.Summarize.execute(source="some text", sourceType="TEXT")
- response["summary"]
+ from ai21 import AI21Client
+ from ai21.models import DocumentType
+ client = AI21Client()
+ response = client.summarize.create(source="some text", source_type=DocumentType.TEXT)
+ response.summary
AWS Client Creations
Bedrock Client creation before/after
- import ai21
- destination = ai21.BedrockDestination(model_id=ai21.BedrockModelID.J2_MID_V1)
- response = ai21.Completion.execute(prompt=prompt, maxTokens=1000, destination=destination)
+ from ai21 import AI21BedrockClient, BedrockModelID
+ client = AI21BedrockClient()
+ response = client.completion.create(prompt=prompt, max_tokens=1000, model_id=BedrockModelID.J2_MID_V1)
SageMaker Client creation before/after
- import ai21
- destination = ai21.SageMakerDestination("j2-mid-test-endpoint")
- response = ai21.Completion.execute(prompt=prompt, maxTokens=1000, destination=destination)
+ from ai21 import AI21SageMakerClient
+ client = AI21SageMakerClient(endpoint_name="j2-mid-test-endpoint")
+ response = client.completion.create(prompt=prompt, max_tokens=1000)
Documentation
The full documentation for the REST API can be found on docs.ai21.com.
Installation
pip install ai21
Usage
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
client = AI21Client(
# defaults to os.enviorn.get('AI21_API_KEY')
api_key='my_api_key',
)
system = "You're a support engineer in a SaaS company"
messages = [
ChatMessage(content=system, role="system"),
ChatMessage(content="Hello, I need help with a signup process.", role="user"),
]
chat_completions = client.chat.completions.create(
messages=messages,
model="jamba-1.5-mini",
)
Async Usage
You can use the AsyncAI21Client
to make asynchronous requests.
There is no difference between the sync and the async client in terms of usage.
import asyncio
from ai21 import AsyncAI21Client
from ai21.models.chat import ChatMessage
system = "You're a support engineer in a SaaS company"
messages = [
ChatMessage(content=system, role="system"),
ChatMessage(content="Hello, I need help with a signup process.", role="user"),
]
client = AsyncAI21Client(
# defaults to os.enviorn.get('AI21_API_KEY')
api_key='my_api_key',
)
async def main():
response = await client.chat.completions.create(
messages=messages,
model="jamba-1.5-mini",
)
print(response)
asyncio.run(main())
A more detailed example can be found here.
Older Models Support Usage
Examples
Supported Models:
- j2-light
- j2-ultra
- j2-mid
- jamba-instruct
you can read more about the models here.
Chat
from ai21 import AI21Client
from ai21.models import RoleType
from ai21.models import ChatMessage
system = "You're a support engineer in a SaaS company"
messages = [
ChatMessage(text="Hello, I need help with a signup process.", role=RoleType.USER),
ChatMessage(text="Hi Alice, I can help you with that. What seems to be the problem?", role=RoleType.ASSISTANT),
ChatMessage(text="I am having trouble signing up for your product with my Google account.", role=RoleType.USER),
]
client = AI21Client()
chat_response = client.chat.create(
system=system,
messages=messages,
model="j2-ultra",
)
For a more detailed example, see the chat examples.
Completion
from ai21 import AI21Client
client = AI21Client()
completion_response = client.completion.create(
prompt="This is a test prompt",
model="j2-mid",
)
Chat Completion
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
system = "You're a support engineer in a SaaS company"
messages = [
ChatMessage(content=system, role="system"),
ChatMessage(content="Hello, I need help with a signup process.", role="user"),
ChatMessage(content="Hi Alice, I can help you with that. What seems to be the problem?", role="assistant"),
ChatMessage(content="I am having trouble signing up for your product with my Google account.", role="user"),
]
client = AI21Client()
response = client.chat.completions.create(
messages=messages,
model="jamba-instruct",
max_tokens=100,
temperature=0.7,
top_p=1.0,
stop=["\n"],
)
print(response)
Note that jamba-instruct supports async and streaming as well.
For a more detailed example, see the completion examples.
Streaming
We currently support streaming for the Chat Completions API in Jamba.
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
messages = [ChatMessage(content="What is the meaning of life?", role="user")]
client = AI21Client()
response = client.chat.completions.create(
messages=messages,
model="jamba-instruct",
stream=True,
)
for chunk in response:
print(chunk.choices[0].delta.content, end="")
Async Streaming
import asyncio
from ai21 import AsyncAI21Client
from ai21.models.chat import ChatMessage
messages = [ChatMessage(content="What is the meaning of life?", role="user")]
client = AsyncAI21Client()
async def main():
response = await client.chat.completions.create(
messages=messages,
model="jamba-1.5-mini",
stream=True,
)
async for chunk in response:
print(chunk.choices[0].delta.content, end="")
asyncio.run(main())
Conversational RAG (Beta)
Like chat, but with the ability to retrieve information from your Studio library.
from ai21 import AI21Client
from ai21.models.chat import ChatMessage
messages = [
ChatMessage(content="Ask a question about your files", role="user"),
]
client = AI21Client()
client.library.files.create(
file_path="path/to/file",
path="path/to/file/in/library",
labels=["my_file_label"],
)
chat_response = client.beta.conversational_rag.create(
messages=messages,
labels=["my_file_label"],
)
For a more detailed example, see the chat sync and async examples.
More Models
TSMs
AI21 Studio's Task-Specific Models offer a range of powerful tools. These models have been specifically designed for their respective tasks and provide high-quality results while optimizing efficiency. The full documentation and guides can be found here.
Contextual Answers
The answer
API allows you to access our high-quality question answering model.
from ai21 import AI21Client
client = AI21Client()
response = client.answer.create(
context="This is a text is for testing purposes",
question="Question about context",
)
A detailed explanation on Contextual Answers, can be found here
File Upload
from ai21 import AI21Client
client = AI21Client()
file_id = client.library.files.create(
file_path="path/to/file",
path="path/to/file/in/library",
labels=["label1", "label2"],
public_url="www.example.com",
)
uploaded_file = client.library.files.get(file_id)
For more information on more Task Specific Models, see the documentation.
Token Counting
By using the count_tokens
method, you can estimate the billing for a given request.
from ai21.tokenizers import get_tokenizer
tokenizer = get_tokenizer(name="jamba-tokenizer")
total_tokens = tokenizer.count_tokens(text="some text") # returns int
print(total_tokens)
Async Usage
from ai21.tokenizers import get_async_tokenizer
## Your async function code
#...
tokenizer = await get_async_tokenizer(name="jamba-tokenizer")
total_tokens = await tokenizer.count_tokens(text="some text") # returns int
print(total_tokens)
Available tokenizers are:
jamba-tokenizer
j2-tokenizer
For more information on AI21 Tokenizers, see the documentation.
Environment Variables
You can set several environment variables to configure the client.
Logging
We use the standard library logging
module.
To enable logging, set the AI21_LOG_LEVEL
environment variable.
$ export AI21_LOG_LEVEL=debug
Other Important Environment Variables
AI21_API_KEY
- Your API key. If not set, you must pass it to the client constructor.AI21_API_VERSION
- The API version. Defaults tov1
.AI21_API_HOST
- The API host. Defaults tohttps://api.ai21.com/v1/
.AI21_TIMEOUT_SEC
- The timeout for API requests.AI21_NUM_RETRIES
- The maximum number of retries for API requests. Defaults to3
retries.AI21_AWS_REGION
- The AWS region to use for AWS clients. Defaults tous-east-1
.
Error Handling
from ai21 import errors as ai21_errors
from ai21 import AI21Client, AI21APIError
from ai21.models import ChatMessage
client = AI21Client()
system = "You're a support engineer in a SaaS company"
messages = [
# Notice the given role does not exist and will be the reason for the raised error
ChatMessage(text="Hello, I need help with a signup process.", role="Non-Existent-Role"),
]
try:
chat_completion = client.chat.create(
messages=messages,
model="j2-ultra",
system=system
)
except ai21_errors.AI21ServerError as e:
print("Server error and could not be reached")
print(e.details)
except ai21_errors.TooManyRequestsError as e:
print("A 429 status code was returned. Slow down on the requests")
except AI21APIError as e:
print("A non 200 status code error. For more error types see ai21.errors")
Cloud Providers
AWS
AI21 Library provides convenient ways to interact with two AWS clients for use with AWS Bedrock and AWS SageMaker.
Installation
pip install -U "ai21[AWS]"
This will make sure you have the required dependencies installed, including boto3 >= 1.28.82
.
Usage
Bedrock
from ai21 import AI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
client = AI21BedrockClient(region='us-east-1') # region is optional, as you can use the env variable instead
messages = [
ChatMessage(content="You are a helpful assistant", role="system"),
ChatMessage(content="What is the meaning of life?", role="user")
]
response = client.chat.completions.create(
messages=messages,
model_id=BedrockModelID.JAMBA_1_5_LARGE,
)
Stream
from ai21 import AI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
system = "You're a support engineer in a SaaS company"
messages = [
ChatMessage(content=system, role="system"),
ChatMessage(content="Hello, I need help with a signup process.", role="user"),
ChatMessage(content="Hi Alice, I can help you with that. What seems to be the problem?", role="assistant"),
ChatMessage(content="I am having trouble signing up for your product with my Google account.", role="user"),
]
client = AI21BedrockClient()
response = client.chat.completions.create(
messages=messages,
model=BedrockModelID.JAMBA_1_5_LARGE,
stream=True,
)
for chunk in response:
print(chunk.choices[0].message.content, end="")
Async
import asyncio
from ai21 import AsyncAI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
client = AsyncAI21BedrockClient(region='us-east-1') # region is optional, as you can use the env variable instead
messages = [
ChatMessage(content="You are a helpful assistant", role="system"),
ChatMessage(content="What is the meaning of life?", role="user")
]
async def main():
response = await client.chat.completions.create(
messages=messages,
model_id=BedrockModelID.JAMBA_1_5_LARGE,
)
asyncio.run(main())
With Boto3 Session
import boto3
from ai21 import AI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
boto_session = boto3.Session(region_name="us-east-1")
client = AI21BedrockClient(session=boto_session)
messages = [
ChatMessage(content="You are a helpful assistant", role="system"),
ChatMessage(content="What is the meaning of life?", role="user")
]
response = client.chat.completions.create(
messages=messages,
model_id=BedrockModelID.JAMBA_1_5_LARGE,
)
Async
import boto3
import asyncio
from ai21 import AsyncAI21BedrockClient, BedrockModelID
from ai21.models.chat import ChatMessage
boto_session = boto3.Session(region_name="us-east-1")
client = AsyncAI21BedrockClient(session=boto_session)
messages = [
ChatMessage(content="You are a helpful assistant", role="system"),
ChatMessage(content="What is the meaning of life?", role="user")
]
async def main():
response = await client.chat.completions.create(
messages=messages,
model_id=BedrockModelID.JAMBA_1_5_LARGE,
)
asyncio.run(main())
SageMaker
from ai21 import AI21SageMakerClient
client = AI21SageMakerClient(endpoint_name="j2-endpoint-name")
response = client.summarize.create(
source="Text to summarize",
source_type="TEXT",
)
print(response.summary)
Async
import asyncio
from ai21 import AsyncAI21SageMakerClient
client = AsyncAI21SageMakerClient(endpoint_name="j2-endpoint-name")
async def main():
response = await client.summarize.create(
source="Text to summarize",
source_type="TEXT",
)
print(response.summary)
asyncio.run(main())
With Boto3 Session
from ai21 import AI21SageMakerClient
import boto3
boto_session = boto3.Session(region_name="us-east-1")
client = AI21SageMakerClient(
session=boto_session,
endpoint_name="j2-endpoint-name",
)
Azure
If you wish to interact with your Azure endpoint on Azure AI Studio, use the AI21AzureClient
and AsyncAI21AzureClient
clients.
The following models are supported on Azure:
jamba-instruct
from ai21 import AI21AzureClient
from ai21.models.chat import ChatMessage
client = AI21AzureClient(
base_url="https://<YOUR-ENDPOINT>.inference.ai.azure.com",
api_key="<your Azure api key>",
)
messages = [
ChatMessage(content="You are a helpful assistant", role="system"),
ChatMessage(content="What is the meaning of life?", role="user")
]
response = client.chat.completions.create(
model="jamba-1.5-mini",
messages=messages,
)
Async
import asyncio
from ai21 import AsyncAI21AzureClient
from ai21.models.chat import ChatMessage
client = AsyncAI21AzureClient(
base_url="https://<YOUR-ENDPOINT>.inference.ai.azure.com/v1/chat/completions",
api_key="<your Azure api key>",
)
messages = [
ChatMessage(content="You are a helpful assistant", role="system"),
ChatMessage(content="What is the meaning of life?", role="user")
]
async def main():
response = await client.chat.completions.create(
model="jamba-instruct",
messages=messages,
)
asyncio.run(main())
Vertex
If you wish to interact with your Vertex AI endpoint on GCP, use the AI21VertexClient
and AsyncAI21VertexClient
clients.
The following models are supported on Vertex:
jamba-1.5-mini
jamba-1.5-large
from ai21 import AI21VertexClient
from ai21.models.chat import ChatMessage
# You can also set the project_id, region, access_token and Google credentials in the constructor
client = AI21VertexClient()
messages = ChatMessage(content="What is the meaning of life?", role="user")
response = client.chat.completions.create(
model="jamba-1.5-mini",
messages=[messages],
)
Async
import asyncio
from ai21 import AsyncAI21VertexClient
from ai21.models.chat import ChatMessage
# You can also set the project_id, region, access_token and Google credentials in the constructor
client = AsyncAI21VertexClient()
async def main():
messages = ChatMessage(content="What is the meaning of life?", role="user")
response = await client.chat.completions.create(
model="jamba-1.5-mini",
messages=[messages],
)
asyncio.run(main())
Happy prompting! 🚀
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.