portazgo
Pluggable agent SDK to talk to Llama Stack using different agentic frameworks. Use the default backend (Llama Stack Responses API) or a lang-graph backend (stub for future implementation).
Installation
From source with uv (recommended):
cd portazgo
uv sync --extra dev
With optional LangGraph extra (when that backend is implemented):
uv sync --extra dev --extra langgraph
With pip (from source):
pip install -e .
From PyPI:
pip install portazgo
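To check that the install worked, importing the top-level API is enough, e.g.:

python -c "from portazgo import Agent; print(Agent)"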
Usage
Agent with a backend type
from portazgo import Agent
# Default: Llama Stack Responses API (same as ragas_pipeline / ragas_dataset_generator)
agent = Agent(type="default")
ragas_dataset = agent.generate_ragas_dataset(
    base_dataset=base_dataset,
    client=llama_stack_client,
    model_id="my-model",
    vector_store_id=vs_id,
    mcp_tools=mcp_tools,
    instructions="Optional system prompt",
)
Single query: invoke (normal agent call)
invoke takes the same parameters as generate_ragas_dataset, but for a single input. The name follows the LangChain/LangGraph convention (agent.invoke(input)):
from portazgo import Agent
agent = Agent(type="default")
result = agent.invoke(
    "What is the capital of France?",
    client=llama_stack_client,
    model_id="my-model",
    vector_store_id=vs_id,
    mcp_tools=[],  # or a list of MCP tool configs
    instructions="You are a helpful assistant.",
)
# result["answer"] -> str
# result["contexts"] -> list[str] (retrieved chunks + non–file_search tool responses)
# result["tool_calls"] -> list[dict]
Chat with history (e.g. chatbots)
Pass messages so the model sees previous turns. Each message is {"role": "user"|"assistant"|"system", "content": str}:
history = [
    {"role": "user", "content": "My name is Alice."},
    {"role": "assistant", "content": "Nice to meet you, Alice!"},
]

result = agent.invoke(
    "What's my name?",
    client=client,
    model_id=model_id,
    vector_store_id=vs_id,
    mcp_tools=[],
    messages=history,
)
# result["answer"] can refer to the conversation (e.g. "Your name is Alice.")
Streaming: invoke_stream
For real-time display (e.g. in Streamlit), use invoke_stream. It yields content_delta events (chunks of text) followed by a final done event (answer + contexts + tool_calls). If the backend does not support token-level streaming, the full answer is sent as a single delta followed by done.
for event in agent.invoke_stream(
    "Explain RAG in one sentence.",
    client=client,
    model_id=model_id,
    vector_store_id=vs_id,
    mcp_tools=[],
    messages=st.session_state.messages,  # optional history
):
    if event["type"] == "content_delta":
        print(event["delta"], end="", flush=True)
    elif event["type"] == "done":
        answer, contexts, tool_calls = event["answer"], event["contexts"], event["tool_calls"]
Streamlit chat example (with history + streaming)
import streamlit as st
from portazgo import Agent

# Init session state
if "messages" not in st.session_state:
    st.session_state.messages = []

agent = Agent(type="default")
# client, model_id, vector_store_id from your config (e.g. sidebar)

# Display history
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

if prompt := st.chat_input("Your message"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)
    with st.chat_message("assistant"):
        placeholder = st.empty()
        full = ""
        for event in agent.invoke_stream(
            prompt,
            client=client,
            model_id=model_id,
            vector_store_id=vector_store_id,
            mcp_tools=[],
            messages=st.session_state.messages[:-1],  # history (exclude current turn)
        ):
            if event["type"] == "content_delta":
                full += event["delta"]
                placeholder.markdown(full + "▌")
        placeholder.markdown(full)
    st.session_state.messages.append({"role": "assistant", "content": full})
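Save the snippet above as a script (e.g. app.py, with client, model_id and vector_store_id filled in from your own config) and launch it with streamlit run app.py.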
# LangGraph backend (not yet implemented; will raise NotImplementedError)
agent = Agent(type="lang-graph")
# agent.invoke(...) # NotImplementedError
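Until that backend is implemented, a caller that wants to prefer lang-graph can catch the NotImplementedError and fall back to the default backend. A minimal sketch (client, model_id and vs_id come from your own config, as in the examples above):

from portazgo import Agent

question = "What is the capital of France?"
agent = Agent(type="lang-graph")
try:
    result = agent.invoke(question, client=client, model_id=model_id,
                          vector_store_id=vs_id, mcp_tools=[])
except NotImplementedError:
    # The lang-graph backend is still a stub; retry with the default
    # (Llama Stack Responses API) backend.
    agent = Agent(type="default")
    result = agent.invoke(question, client=client, model_id=model_id,
                          vector_store_id=vs_id, mcp_tools=[])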
Utilities
The library also exposes helpers used by the default backend, useful for custom pipelines:
from portazgo import strip_think_blocks, serialize_for_json, extract_tool_calls
- strip_think_blocks(text) – removes <think>...</think> blocks from model output.
- serialize_for_json(val) – converts objects to a JSON-serializable form.
- extract_tool_calls(response) – extracts tool calls from a Llama Stack response.
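A quick sketch of the first two helpers (the sample strings here are illustrative; extract_tool_calls needs a real Llama Stack response object):

import json
from portazgo import strip_think_blocks, serialize_for_json

raw = "<think>reasoning goes here</think>Paris is the capital of France."
clean = strip_think_blocks(raw)
print(clean)  # the <think>...</think> block is removed

# serialize_for_json makes values safe for json.dumps, e.g. before logging
print(json.dumps(serialize_for_json({"answer": clean, "tool_calls": []})))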
Testing a simple invoke
Option 1: Unit tests (no Llama Stack server)
Runs invoke against a mock client so you can confirm the API shape:
cd portazgo
uv run pytest tests/test_agent.py -v -k invoke
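If you want to write a similar smoke test of your own, here is a rough sketch. The mock wiring is elided and must be configured to match whatever Responses API calls the backend makes; see tests/test_agent.py for the real setup:

from unittest.mock import MagicMock
from portazgo import Agent

def test_invoke_returns_expected_shape():
    # NOTE: in practice the MagicMock must be wired so the backend's
    # Responses API calls return realistic objects; this sketch only
    # illustrates checking the result shape.
    client = MagicMock()
    agent = Agent(type="default")
    result = agent.invoke(
        "What is 2+2?",
        client=client,
        model_id="test-model",
        vector_store_id="test-vs",
        mcp_tools=[],
    )
    assert {"answer", "contexts", "tool_calls"} <= set(result)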
Option 2: Real invoke against Llama Stack
Use the example script (requires a running Llama Stack and a vector store):
cd portazgo
export LLAMA_STACK_HOST=localhost
export LLAMA_STACK_PORT=8080
# optional: AGENT_VECTOR_STORE_NAME=rag-store, AGENT_MODEL_ID="your/model"
uv run python examples/simple_invoke.py "What is 2+2?"
You can pass any question as arguments; the default is "What is 2+2?".
Option 3: OpenShift (oc)
If Llama Stack is exposed on OpenShift, use the helper script to get APPS_DOMAIN and run the example:
cd portazgo
./scripts/run_invoke_oc.sh "What is 2+2?"
The script sources .env (for PROJECT, etc.), runs oc get ingresses.config.openshift.io cluster to discover the apps domain, sets LLAMA_STACK_HOST to llama-stack-demo-route-${PROJECT}.${APPS_DOMAIN}, then runs the example with any arguments you pass.
Development
The project uses uv for the virtual environment and for running tools. From the portazgo directory:
- Create venv and install deps: make install-dev (or uv sync --extra dev)
- Lock dependencies: make lock (or uv lock)
- Lint: make lint (ruff via uv run)
- Format: make format
- Tests: make test (or uv run pytest tests)
- Coverage: make coverage
- Build: make build (or uv run python -m build)
License
Apache-2.0. See LICENSE.