Skip to main content

Add your description here

Project description

Sarvam Conv AI SDK

The Sarvam Conversational AI SDK is a Python package that helps developers build and extend conversational agents. It provides core components to manage conversation flow, language preferences, and messaging, making it easier to develop interactive and context-aware AI experiences.


Overview

The Sarvam Conv AI SDK enables developers to create tools that can:

  • Facilitate agentic capabilities like API calling in the middle of a conversation.
  • Manage agent-specific variables
  • Control and modify the language used during conversations
  • Send dynamic messages to both the user and the underlying language model (LLM)

Installation

Install the SDK via pip:

pip install sarvam-conv-ai-sdk

Example Usage

import httpx
from pydantic import Field

from sarvam_conv_ai_sdk import (
    SarvamInteractionTurnRole,
    SarvamOnEndTool,
    SarvamOnEndToolContext,
    SarvamOnStartTool,
    SarvamOnStartToolContext,
    SarvamTool,
    SarvamToolContext,
    SarvamToolLanguageName,
    SarvamToolOutput,
)

class OnStart(SarvamOnStartTool): #Name of the class has to be OnStart
    async def run(self, context: SarvamOnStartToolContext):
        user_id = context.get_user_identifier()
        async with httpx.AsyncClient() as client:
            response = await client.get(f"https://sarvam-flights.com/users/{user_id}")
            response.raise_for_status()
            user_data = response.json()

        source_destination = user_data.get("home_city")
        context.set_agent_variable("source_destination", source_destination)
        context.set_agent_variable("passenger_name", user_data.get("name"))
        
        # Store telephony call SID if available (for telephony channels)
        if context.provider_ref_id:
            context.set_agent_variable("call_sid", context.provider_ref_id)
        
        context.set_initial_language_name(SarvamToolLanguageName.ENGLISH)
        context.set_initial_bot_message(
            f"Hello! Would you like to book a flight from {source_destination}? Where would you like to go?",
        )
        return context


class BookFlight(SarvamTool):
    """Book a flight based on the user's travel preferences."""

    destination: str = Field(description="City of destination")
    travel_date: str = Field(description="Date of travel (YYYY-MM-DD)")

    async def run(self, context: SarvamToolContext) -> SarvamToolOutput:
        source_destination = context.get_agent_variable("source_destination")
        booking_data = {
            "source": source_destination,
            "destination": self.destination,
            "travel_date": self.travel_date,
            "passenger_name": context.get_agent_variable("passenger_name"),
        }

        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://sarvam-flights.com/book", json=booking_data
            )
            response.raise_for_status()
            booking_result = response.json()

        if booking_result.get("status") == "confirmed":
            context.set_agent_variable("booking_id", booking_result.get("booking_id"))
            context.set_end_conversation()
            return SarvamToolOutput(
                message_to_user=f"Flight booked successfully to {self.destination}!",
                context=context,
            )
        else:
            context.change_state("recommend_destinations")
            return SarvamToolOutput(
                message_to_llm="Booking failed. Please suggest similar destinations.",
                context=context,
            )


class OnEnd(SarvamOnEndTool):  #Name of the class has to be OnEnd
    async def run(self, context: SarvamOnEndToolContext):
        feedback = context.get_agent_variable("feedback")
        negative_words = ["bad", "poor", "disappointed", "unhappy", "problem"]
        interaction_transcript = context.get_interaction_transcript()
        if interaction_transcript.interaction_transcript:
            for turn in interaction_transcript.interaction_transcript:
                if turn.role == SarvamInteractionTurnRole.USER:
                    is_negative = any(word in feedback.lower() for word in negative_words)
            context.set_agent_variable("feedback_sentiment", is_negative)
        
        # Log call details if telephony SID is available
        if context.provider_ref_id:
            async with httpx.AsyncClient() as client:
                await client.post(
                    "https://sarvam-flights.com/analytics/call-logs",
                    json={
                        "call_sid": context.provider_ref_id,
                        "user_id": context.get_user_identifier(),
                        "sentiment": is_negative,
                        "duration": (
                            interaction_transcript.interaction_end_time 
                            - interaction_transcript.interaction_start_time
                        ).total_seconds()
                    }
                )

        return context

Base Classes

The SDK exposes three base classes for tool development:

1. SarvamTool

Primary base class for all operational tools invoked during conversation flow.

Example:

class MyCustomTool(SarvamTool):
    """Brief description of the tool's purpose."""

    tool_variable: type = Field(description="Description of this input parameter")

    async def run(self, context: SarvamToolContext) -> SarvamToolOutput:
        # Custom tool logic
        return SarvamToolOutput(
            message_to_user="Response to user",
            message_to_llm="Context for LLM",
            context=context
        )

2. SarvamOnStartTool

Executed at the beginning of a conversation, typically for initialization. The class must be named OnStart.

3. SarvamOnEndTool

Executed at the end of a conversation, typically for cleanup or post-processing. The class must be named OnEnd.


Context Classes and Methods

SarvamToolContext

The context object passed to SarvamTool.run() methods.

Variable Management

  • get_agent_variable(variable_name: str) -> Any Retrieve the value of a variable.

  • set_agent_variable(variable_name: str, value: Any) -> None Update a variable's value.

Language Control

  • get_current_language() -> SarvamToolLanguageName Returns the current language of the agent.

  • change_language(language: SarvamToolLanguageName) -> None Update the language preference.

Conversation Flow

  • set_end_conversation() -> None Explicitly end the conversation.

State Management

  • get_current_state() -> str Returns the current state of the conversation.

  • change_state(state: str) -> None Transition to a new state. Note: The new state must be one of the next valid states defined in the agent configuration.

Engagement Metadata

  • get_engagement_metadata() -> EngagementMetadata Retrieve the engagement metadata containing information about the current interaction.

SarvamOnStartToolContext

The context object passed to SarvamOnStartTool.run() methods.

Variable Management

  • get_agent_variable(variable_name: str) -> Any Retrieve the value of a variable.

  • set_agent_variable(variable_name: str, value: Any) -> None Update a variable's value.

User Information

  • get_user_identifier() -> str Get the user identifier.

Telephony Information

  • provider_ref_id: Optional[str] The reference ID from the channel provider. For telephony providers, this would contain the Call SID (Session ID) which uniquely identifies a specific phone call. For other channel providers, this would contain their respective reference IDs. Defaults to None for channels that don't provide reference IDs.

Initialization Methods

  • set_initial_bot_message(message: str) -> None Set the first message sent by the agent when the conversation starts.

  • set_initial_state_name(state_name: str) -> None Set the initial state from which the agent should start.

  • set_initial_language_name(language: SarvamToolLanguageName) -> None Define the initial language preference for the user.

Engagement Metadata

  • get_engagement_metadata() -> EngagementMetadata Retrieve the engagement metadata containing information about the current interaction.

SarvamOnEndToolContext

The context object passed to SarvamOnEndTool.run() methods.

Variable Management

  • get_agent_variable(variable_name: str) -> Any Retrieve the value of a variable.

  • set_agent_variable(variable_name: str, value: Any) -> None Update a variable's value.

User Information

  • get_user_identifier() -> str Get the user identifier.

Telephony Information

  • provider_ref_id: Optional[str] The reference ID from the channel provider. For telephony providers, this would contain the Call SID (Session ID) which uniquely identifies a specific phone call. For other channel providers, this would contain their respective reference IDs. Defaults to None for channels that don't provide reference IDs.

Engagement Metadata

  • get_engagement_metadata() -> EngagementMetadata Retrieve the engagement metadata containing information about the current interaction.

Interaction Reattempt

  • set_retry_interaction The user will be reattempted with the same agent. Useful when any business goal has not been met.

Interaction Transcript

  • get_interaction_transcript() -> SarvamInteractionTranscript Retrieve the conversation history containing user and agent messages in English and the timestamp when the conversation began and ended. Format: yyyy-mm-dd hh:mm:ss

Example transcript:

[
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.AGENT: 'agent'>, en_text='Hello! How can I help you today?'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.USER: 'user'>, en_text='I need to book a flight'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.AGENT: 'agent'>, en_text='I can help you with that. Where would you like to go?'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.USER: 'user'>, en_text='I want to go to Mumbai'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.AGENT: 'agent'>, en_text='Great! When would you like to travel?')
]

Return Types

SarvamToolOutput

The return type for SarvamTool.run() methods. Contains:

  • message_to_user: Optional[str] - Message that is sent directly to the user
  • message_to_llm: Optional[str] - Message that is sent to the LLM, which then responds
  • context: SarvamToolContext - The updated context object

Note: At least one of message_to_llm or message_to_user must be set.

Important: When both message_to_user and message_to_llm are set, only the message_to_user is actually sent to the user, but the message_to_llm overrides the message_to_user when adding to the chat thread for the LLM's context.

EngagementMetadata

The engagement metadata object that can be retrieved from context objects using get_engagement_metadata(). Contains:

  • interaction_id: str - Unique identifier for each conversation between user & agent.
  • attempt_id: Optional[str] - Unique identifier for each attempt created on the platform
  • campaign_id: Optional[str] - Campaign ID for the interaction
  • interaction_language: SarvamToolLanguageName - The language used for the interaction (defaults to English)
  • app_id: str - Application identifier of the agent for the interaction
  • app_version: int - Version number of the agent
  • agent_phone_number: Optional[str] - Phone number associated with the conversational agent application

Supported Languages

The SDK supports multilingual conversations using the SarvamToolLanguageName enum. Available languages include:

  • Bengali
  • Gujarati
  • Kannada
  • Malayalam
  • Tamil
  • Telugu
  • Punjabi
  • Odia
  • Marathi
  • Hindi
  • English

Note: The allowed languages are actually a subset that is preselected while defining the agent configurations.


Best Practices

  1. Always implement run(): The run() method is the entry point for tool execution logic.
  2. Use Field() for parameters: Ensures type safety and adds descriptive metadata necessary for LLM to use in the prompt.
  3. Gracefully handle errors: Avoid accessing unset variables or using invalid types.
  4. Return the appropriate type: SarvamTool.run() must return SarvamToolOutput, while SarvamOnStartTool.run() and SarvamOnEndTool.run() return their respective context objects.
  5. Write meaningful docstrings: Clearly describe what each tool is intended to do as this directly impacts the performance of tool calling capabilities of the agent.
  6. Use async operations for I/O: For the best performance, use async/await for external API calls to avoid blocking.
  7. Use context methods: Use the provided context methods for variable management, language control, and messaging instead of directly accessing context attributes.

Error Handling

The SDK includes built-in error handling for common scenarios:

  • Variable not found: Raises ValueError when accessing undefined variables
  • Variable not defined: Raises ValueError when setting variables that haven't been initialized
  • Non-serializable values: Raises ValueError when variable values cannot be JSON serialized
  • Invalid output: Raises ValueError when SarvamToolOutput is created without at least one message

Testing Your Tools

After creating a tool, you can test it locally to ensure it works as expected. Here's how to test your tools:

Testing Steps

  1. Create the ToolContext: Initialize the appropriate context object with test data
  2. Instantiate the tool class: Use tool.model_validate(tool_args) to create a tool instance
  3. Run the tool: Call the tool's run() method with the context
  4. Observe the returned object: Check if the necessary changes have been made to the context

Example Test: SarvamTool

# Test the BookFlight tool
async def test_book_flight():
    # 1. Create the ToolContext
    context = SarvamToolContext(
        language=SarvamToolLanguageName.ENGLISH,
        allowed_languages=[SarvamToolLanguageName.ENGLISH],
        state="booking",
        next_valid_states=["recommend_destinations", "end"],
        agent_variables={
            "source_destination": "Mumbai",
            "passenger_name": "John Doe",
            "booking_id": "123"
        },
        engagement_metadata=EngagementMetadata(
            interaction_id="123",
            attempt_id="456",
            campaign_id="789",
            interaction_language=SarvamToolLanguageName.ENGLISH,
            app_id="101",
            app_version=1,
            agent_phone_number="+1234567890",
        ),
    )
    
    # 2. Instantiate the tool class
    tool_args = {
        "destination": "Delhi",
        "travel_date": "2024-03-15"
    }
    tool_instance = BookFlight.model_validate(tool_args)
    
    # 3. Run the tool
    result = await tool_instance.run(context)
    
    # 4. Observe the returned object
    print(f"Message to user: {result.message_to_user}")
    print(f"Message to LLM: {result.message_to_llm}")
    print(f"End conversation: {result.context.end_conversation}")
    print(f"Current state: {result.context.get_current_state()}")
    print(f"Agent variables: {result.context.agent_variables}")
    print(f"Current Language: {result.context.get_current_language()}")

# Run the test
asyncio.run(test_book_flight())

Example Test: OnStart Tool

For SarvamOnStartTool, the testing approach is similar but it returns the context object directly:

# Testing OnStart tool
async def test_on_start():
    context = SarvamOnStartToolContext(
        user_identifier="user123",
        agent_variables={"source_destination": "Mumbai", "passenger_name": "John Doe"},
        engagement_metadata=EngagementMetadata(
            interaction_id="123",
            attempt_id="456",
            campaign_id="789",
            interaction_language=SarvamToolLanguageName.ENGLISH,
            app_id="101",
            app_version=1,
            agent_phone_number="+1234567890",
        ),
        initial_bot_message=None,
        initial_state_name="start",
        initial_language_name=SarvamToolLanguageName.ENGLISH,
        provider_ref_id="CA1234567890abcdef1234567890abcdef",  # Optional: for telephony channels
    )
    
    tool_instance = OnStart()
    result = await tool_instance.run(context)
    
    print(f"Initial bot message: {result.initial_bot_message}")
    print(f"Initial state: {result.initial_state_name}")
    print(f"Initial Language Name: {result.initial_language_name}")
    print(f"Agent variables: {result.agent_variables}")
    print(f"Telephony Call SID: {result.provider_ref_id}")

# Run the test
asyncio.run(test_on_start())

Example Test: OnEnd Tool

# Testing OnEnd tool
async def test_on_end():
    context = SarvamOnEndToolContext(
        user_identifier="user123",
        agent_variables={"feedback": "I had a bad experience", "feedback_sentiment": False},
        engagement_metadata=EngagementMetadata(
            interaction_id="123",
            attempt_id="456",
            campaign_id="789",
            interaction_language=SarvamToolLanguageName.ENGLISH,
            app_id="101",
            app_version=1,
            agent_phone_number="+1234567890",
        ),
        interaction_transcript=SarvamInteractionTranscript(
            interaction_transcript=[
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.AGENT, en_text='Hello! How can I help you today?'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.USER, en_text='I need to book a flight'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.AGENT, en_text='I can help you with that. Where would you like to go?'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.USER, en_text='I want to go to Mumbai'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.AGENT, en_text='Great! When would you like to travel?')
            ],
            interaction_start_time=datetime.now() - timedelta(minutes=2),
            interaction_end_time=datetime.now(),
        ),
        retry_interaction=False,
        provider_ref_id="CA1234567890abcdef1234567890abcdef",  # Optional: for telephony channels
    )
    
    tool_instance = OnEnd()
    result = await tool_instance.run(context)
    
    print(f"Agent variables: {result.agent_variables}")
    print(f"Interaction Retry: {result.retry_interaction}")
    print(f"Telephony Call SID: {result.provider_ref_id}")

# Run the test
asyncio.run(test_on_end())

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sarvam_conv_ai_sdk-1.0.7.tar.gz (23.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sarvam_conv_ai_sdk-1.0.7-py3-none-any.whl (9.3 kB view details)

Uploaded Python 3

File details

Details for the file sarvam_conv_ai_sdk-1.0.7.tar.gz.

File metadata

  • Download URL: sarvam_conv_ai_sdk-1.0.7.tar.gz
  • Upload date:
  • Size: 23.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.5

File hashes

Hashes for sarvam_conv_ai_sdk-1.0.7.tar.gz
Algorithm Hash digest
SHA256 84ae05db1b650594940c5be2d446280135bbc941ddacf3461536498d5913a157
MD5 fc980e982b3a7db9ade23d6c850596a0
BLAKE2b-256 0bfa6f3cae4e6b32c1577431581a3656c6b1676bf060d6682abbca13a922f1c9

See more details on using hashes here.

File details

Details for the file sarvam_conv_ai_sdk-1.0.7-py3-none-any.whl.

File metadata

File hashes

Hashes for sarvam_conv_ai_sdk-1.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 76528aa097636b58301416dfb843a220dadf691a1f6a261da1f297dfc0f5aebc
MD5 c1086b4ed0b87b00c8c0d33731b183a3
BLAKE2b-256 527981637bb73b86e00386e93c176c0666a54aed66c4c9f7e6ddcbe10089ccf9

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page