A data quality guardrail SDK for AI agents

Datagusto SDK for Python

A data quality guardrail SDK for LangGraph agents that provides tools for ensuring data quality and governance in AI agent workflows.

Features

  • Data Quality Guardrails: Automatically filter out incomplete records with null/None values
  • LangGraph Integration: Seamlessly integrates with LangGraph agents as a tool
  • API-Based Configuration: Fetches guardrail rules from your Datagusto backend
  • Easy to Use: Simple function calls that can be added to any LangGraph workflow
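To make the first feature concrete, here is a minimal sketch of dropping records that contain None values. The function name `drop_incomplete_records` and the sample data are hypothetical and are not part of the SDK; the real guardrail fetches its rules from the Datagusto backend, and its implementation may differ.

```python
from typing import Any, Dict, List


def drop_incomplete_records(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the records in which no field value is None."""
    return [
        record
        for record in records
        if all(value is not None for value in record.values())
    ]


# Hypothetical tool output: the second record is missing its title.
search_results = [
    {"title": "LangGraph docs", "url": "https://example.com/a"},
    {"title": None, "url": "https://example.com/b"},
]
clean_results = drop_incomplete_records(search_results)
print(clean_results)
```

Only the first record survives, because the second has a None title.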

Installation

pip install datagusto-sdk

Quick Start

1. Set Environment Variables

export DATAGUSTO_API_KEY="your_api_key_here"
export DATAGUSTO_API_URL="http://localhost:8000"  # Optional; defaults to http://localhost:8000

2. Basic Usage with LangGraph

Simply add datagusto_guardrail to your existing tool set - no complex integration required!

from datagusto_sdk import datagusto_guardrail
from langchain_tavily import TavilySearch
from langgraph.graph import StateGraph, START, END
from langchain.chat_models import init_chat_model

# Initialize your tools including the datagusto guardrail
tool = TavilySearch(max_results=2)
tools = [tool, datagusto_guardrail]  # Just add datagusto_guardrail to your existing tools

# Set up your LLM with tools
llm = init_chat_model("anthropic:claude-3-5-sonnet-latest")
llm_with_tools = llm.bind_tools(tools)

# Once bound, the guardrail is available for the LLM to call on other tools' results to ensure data quality

Complete Example

Here's a complete example of a LangGraph agent with Datagusto guardrails, based on the LangGraph Add Tools tutorial:

import json
from typing import Annotated
from typing_extensions import TypedDict

from langchain.chat_models import init_chat_model
from langchain_core.messages import ToolMessage
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages
from langchain_tavily import TavilySearch

from datagusto_sdk.langgraph.toolset import datagusto_guardrail

# Initialize tools
tool = TavilySearch(max_results=2)
tools = [tool, datagusto_guardrail]

# Set up LLM
llm = init_chat_model("anthropic:claude-3-5-sonnet-latest")
llm_with_tools = llm.bind_tools(tools)

class State(TypedDict):
    messages: Annotated[list, add_messages]

def chatbot(state: State):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

class BasicToolNode:
    """A node that runs the tools requested in the last AIMessage."""
    
    def __init__(self, tools: list) -> None:
        self.tools_by_name = {tool.name: tool for tool in tools}
    
    def __call__(self, inputs: dict):
        if messages := inputs.get("messages", []):
            message = messages[-1]
        else:
            raise ValueError("No message found in input")
        outputs = []
        for tool_call in message.tool_calls:
            tool_result = self.tools_by_name[tool_call["name"]].invoke(
                tool_call["args"]
            )
            outputs.append(
                ToolMessage(
                    content=json.dumps(tool_result),
                    name=tool_call["name"],
                    tool_call_id=tool_call["id"],
                )
            )
        return {"messages": outputs}

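# Register every tool bound to the LLM, including the guardrail,
# so that guardrail tool calls can be dispatched by name
tool_node = BasicToolNode(tools=tools)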

def route_tools(state: State):
    """Route to the ToolNode if the last message has tool calls."""
    if isinstance(state, list):
        ai_message = state[-1]
    elif messages := state.get("messages", []):
        ai_message = messages[-1]
    else:
        raise ValueError(f"No messages found in input state to tool_edge: {state}")
    if hasattr(ai_message, "tool_calls") and len(ai_message.tool_calls) > 0:
        return "tools"
    return END

# Build the graph
graph_builder = StateGraph(State)
graph_builder.add_node("chatbot", chatbot)
graph_builder.add_node("tools", tool_node)
graph_builder.add_conditional_edges(
    "chatbot",
    route_tools,
    {"tools": "tools", END: END},
)
graph_builder.add_edge("tools", "chatbot")
graph_builder.add_edge(START, "chatbot")
graph = graph_builder.compile()

# Run the agent
def stream_graph_updates(user_input: str):
    for event in graph.stream({"messages": [{"role": "user", "content": user_input}]}):
        for value in event.values():
            print("Assistant:", value["messages"][-1].content)

if __name__ == "__main__":
    stream_graph_updates("What do you know about LangGraph?")
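The name-based dispatch that BasicToolNode performs can be tried without an LLM or API keys. The sketch below uses plain functions as stand-ins: `fake_search`, `fake_guardrail`, `dispatch`, and the argument shapes are all hypothetical and do not reflect the real tool signatures. It shows that every tool the LLM may call has to be present in the lookup table, since an unregistered name cannot be dispatched.

```python
from typing import Any, Callable, Dict, List

ToolFn = Callable[[Dict[str, Any]], Any]


def fake_search(args: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Stand-in for a search tool: one complete and one incomplete record."""
    return [
        {"title": f"Result for {args['query']}", "url": "https://example.com"},
        {"title": None, "url": "https://example.com/missing"},
    ]


def fake_guardrail(args: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Stand-in for the guardrail: drop records that contain None values."""
    return [
        record
        for record in args["records"]
        if all(value is not None for value in record.values())
    ]


def dispatch(
    tools_by_name: Dict[str, ToolFn], tool_calls: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Run each requested tool call, looking the tool up by name."""
    outputs = []
    for call in tool_calls:
        if call["name"] not in tools_by_name:
            raise KeyError(f"tool {call['name']!r} is not registered")
        result = tools_by_name[call["name"]](call["args"])
        outputs.append(
            {"name": call["name"], "tool_call_id": call["id"], "content": result}
        )
    return outputs


tools_by_name = {"search": fake_search, "guardrail": fake_guardrail}

# First the search call, then a guardrail call on the search output.
search_output = dispatch(
    tools_by_name,
    [{"name": "search", "id": "call_1", "args": {"query": "LangGraph"}}],
)
guardrail_output = dispatch(
    tools_by_name,
    [
        {
            "name": "guardrail",
            "id": "call_2",
            "args": {"records": search_output[0]["content"]},
        }
    ],
)
print(guardrail_output[0]["content"])
```

If `"guardrail"` were left out of `tools_by_name`, the second `dispatch` call would raise a `KeyError`.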

Configuration

The SDK uses environment variables for configuration:

  • DATAGUSTO_API_KEY: Your API key for authentication (required)
  • DATAGUSTO_API_URL: The base URL of your Datagusto API (optional, defaults to http://localhost:8000)
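As an illustration of how these two variables might be resolved, here is a small sketch. The helper `load_datagusto_config` is hypothetical and not part of the SDK, and raising an error on a missing key is an assumption; the SDK may handle that case differently.

```python
import os
from typing import Dict, Mapping, Optional

DEFAULT_API_URL = "http://localhost:8000"  # documented default


def load_datagusto_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the API key (required) and base URL (optional) from the environment."""
    env = os.environ if env is None else env
    api_key = env.get("DATAGUSTO_API_KEY")
    if not api_key:
        # Assumption: fail fast when the required key is missing.
        raise RuntimeError("DATAGUSTO_API_KEY is required but not set")
    api_url = env.get("DATAGUSTO_API_URL", DEFAULT_API_URL)
    return {"api_key": api_key, "api_url": api_url}


# Passing a mapping explicitly keeps the example independent of the real environment.
config = load_datagusto_config({"DATAGUSTO_API_KEY": "your_api_key_here"})
print(config["api_url"])
```

Calling `load_datagusto_config()` with no argument reads from `os.environ` instead.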

Requirements

  • Python 3.8+
  • langchain-core >= 0.1.0
  • requests >= 2.25.0

License

This project is licensed under the MIT License - see the LICENSE file for details.
