Build reliable Gen AI solutions without overhead
Written in Python. Designed for speed. A no-fluff GenAI framework that gets your agents from dev to prod, fast.
Homepage • Quick Start • Documentation • Examples • Community
Why Datapizza AI?
A framework that keeps your agents predictable, your debugging fast, and your code trusted in production. Built by Engineers, trusted by Engineers.
Less abstraction, more control | API-first design | Observable by design
How to install
pip install datapizza-ai
Client invoke
from datapizza.clients.openai import OpenAIClient
client = OpenAIClient(api_key="YOUR_API_KEY")
result = client.invoke("Hi, how are u?")
print(result.text)
Key Features
- API-first
- Composable
- Observable
- Vendor-Agnostic
Quick Start
Installation
# Core framework
pip install datapizza-ai
# With specific providers (optional)
pip install datapizza-ai-clients-openai
pip install datapizza-ai-clients-google
pip install datapizza-ai-clients-anthropic
Start with Agent
from datapizza.agents import Agent
from datapizza.clients.openai import OpenAIClient
from datapizza.tools import tool
@tool
def get_weather(city: str) -> str:
    """Return a canned weather report for the given city."""
    return f"The weather in {city} is sunny"
client = OpenAIClient(api_key="YOUR_API_KEY")
agent = Agent(name="assistant", client=client, tools=[get_weather])
response = agent.run("What is the weather in Rome?")
print(response.text)
# output: The weather in Rome is sunny
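The function name and type hints on `get_weather` are what make it usable as a tool. As a rough illustration of the idea, the sketch below turns a plain function into a description that a model could choose from. It uses only the standard library and is not Datapizza AI's implementation; `describe_tool` and the layout of the returned dictionary are hypothetical.

```python
import inspect
from typing import Callable, get_type_hints


def describe_tool(func: Callable) -> dict:
    """Build a simple, hypothetical tool description from a function signature."""
    hints = get_type_hints(func)
    # Map each parameter name to the name of its annotated type (default: str).
    params = {
        name: hints.get(name, str).__name__
        for name in inspect.signature(func).parameters
    }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": params,
        "returns": hints.get("return", str).__name__,
    }


def get_weather(city: str) -> str:
    """Return the weather for a city."""
    return f"The weather in {city} is sunny"


spec = describe_tool(get_weather)
print(spec)
```

A description like this is why a clear function name, annotations, and a docstring tend to pay off when writing tools.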
Detailed Tracing
A key requirement for principled development of LLM applications over your data (RAG systems, agents) is being able to observe and debug.
Datapizza-ai provides built-in observability with OpenTelemetry tracing to help you monitor performance and understand execution flow.
pip install datapizza-ai-tools-duckduckgo
from datapizza.agents import Agent
from datapizza.clients.openai import OpenAIClient
from datapizza.tools.duckduckgo import DuckDuckGoSearchTool
from datapizza.tracing import ContextTracing
client = OpenAIClient(api_key="YOUR_API_KEY")
agent = Agent(name="assistant", client=client, tools=[DuckDuckGoSearchTool()])
# Run the agent inside a named trace
with ContextTracing().trace("my_ai_operation"):
    response = agent.run("Tell me some news about Bitcoin")
# Output shows:
# ╭─ Trace Summary of my_ai_operation ──────────────────────────────────╮
# │ Total Spans: 3                                                      │
# │ Duration: 2.45s                                                     │
# │ ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ │
# │ ┃ Model       ┃ Prompt Tokens ┃ Completion Tokens ┃ Cached Tokens ┃ │
# │ ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │
# │ │ gpt-4o-mini │ 31            │ 27                │ 0             │ │
# │ └─────────────┴───────────────┴───────────────────┴───────────────┘ │
# ╰─────────────────────────────────────────────────────────────────────╯
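The summary above reports a span count and a total duration. To make those two numbers concrete, here is a minimal standard-library sketch of a context manager that records nested spans and how long each one took. It is neither OpenTelemetry nor Datapizza AI's tracer; `SimpleTracer` and the span names `llm_call` and `tool_call` are hypothetical.

```python
import time
from contextlib import contextmanager


class SimpleTracer:
    """Hypothetical stand-in for a tracer: stores (name, depth, seconds) per span."""

    def __init__(self):
        self.spans = []
        self._depth = 0

    @contextmanager
    def trace(self, name: str):
        start = time.perf_counter()
        depth = self._depth
        self._depth += 1
        try:
            yield
        finally:
            # A span is recorded when it finishes, so inner spans appear first.
            self._depth -= 1
            self.spans.append((name, depth, time.perf_counter() - start))


tracer = SimpleTracer()
with tracer.trace("my_ai_operation"):
    with tracer.trace("llm_call"):
        time.sleep(0.01)  # placeholder for a model request
    with tracer.trace("tool_call"):
        time.sleep(0.01)  # placeholder for a tool invocation

print(f"Total Spans: {len(tracer.spans)}")
for name, depth, seconds in tracer.spans:
    print(f"{'  ' * depth}{name}: {seconds:.3f}s")
```

The outer span's duration always covers its children, which is why a single trace name is enough to see where the time went.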
Examples
Multi-Agent System
Build sophisticated AI systems where multiple specialized agents collaborate to solve complex tasks. This example shows how to create a trip planning system with dedicated agents for weather information, web search, and planning coordination.
# Install DuckDuckGo tool
pip install datapizza-ai-tools-duckduckgo
from datapizza.agents.agent import Agent
from datapizza.clients.openai import OpenAIClient
from datapizza.tools import tool
from datapizza.tools.duckduckgo import DuckDuckGoSearchTool
client = OpenAIClient(api_key="YOUR_API_KEY", model="gpt-4.1")
@tool
def get_weather(city: str) -> str:
    """Return a canned weather report for the given city."""
    return f"It's sunny all week in {city}"
weather_agent = Agent(
    name="weather_expert",
    client=client,
    system_prompt="You are a weather expert. Provide detailed weather information and forecasts.",
    tools=[get_weather],
)
web_search_agent = Agent(
    name="web_search_expert",
    client=client,
    system_prompt="You are a web search expert. You can search the web for information.",
    tools=[DuckDuckGoSearchTool()],
)
planner_agent = Agent(
    name="planner",
    client=client,
    system_prompt="You are a trip planner. You should provide a plan for the user. Make sure to provide a detailed plan with the best places to visit and the best time to visit them.",
)
# Let the planner delegate to the two specialist agents
planner_agent.can_call([weather_agent, web_search_agent])
response = planner_agent.run(
    "I need to plan a hiking trip in Seattle next week. I want to see some waterfalls and a forest."
)
print(response.text)
Document Ingestion
Process and index documents for retrieval-augmented generation (RAG). This pipeline automatically parses PDFs, splits them into chunks, generates embeddings, and stores them in a vector database for efficient similarity search.
pip install datapizza-ai-parsers-docling
from datapizza.core.vectorstore import VectorConfig
from datapizza.embedders import ChunkEmbedder
from datapizza.embedders.openai import OpenAIEmbedder
from datapizza.modules.parsers.docling import DoclingParser
from datapizza.modules.splitters import NodeSplitter
from datapizza.pipeline import IngestionPipeline
from datapizza.vectorstores.qdrant import QdrantVectorstore
vectorstore = QdrantVectorstore(location=":memory:")
embedder = ChunkEmbedder(
    client=OpenAIEmbedder(api_key="YOUR_API_KEY", model_name="text-embedding-3-small")
)
# 1536 is the default output dimension of text-embedding-3-small
vectorstore.create_collection(
    "my_documents",
    vector_config=[VectorConfig(name="embedding", dimensions=1536)],
)
pipeline = IngestionPipeline(
    modules=[
        DoclingParser(),
        NodeSplitter(max_char=1024),
        embedder,
    ],
    vector_store=vectorstore,
    collection_name="my_documents",
)
pipeline.run("sample.pdf")
# Placeholder zero vector; pass a real query embedding for meaningful results
results = vectorstore.search(query_vector=[0.0] * 1536, collection_name="my_documents", k=5)
print(results)
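The stages chained by the ingestion pipeline (split, embed, store, search) can be illustrated without any external service. The sketch below is a toy version using only the standard library. The "embedding" is a letter-frequency vector, which is an assumption made purely for illustration and no substitute for a real embedding model; `split_text`, `toy_embed`, `cosine`, and `search` are hypothetical helpers, not Datapizza AI APIs.

```python
import math


def split_text(text: str, max_char: int) -> list[str]:
    """Split text into consecutive chunks of at most max_char characters."""
    return [text[i:i + max_char] for i in range(0, len(text), max_char)]


def toy_embed(text: str) -> list[float]:
    """Toy embedding (assumption): letter frequencies, NOT a real model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(store, query_vector, k):
    """Return the k stored chunks most similar to query_vector."""
    ranked = sorted(store, key=lambda item: cosine(item[1], query_vector), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


document = "Pizza dough needs flour and water. Vector stores index embeddings."
chunks = split_text(document, max_char=35)
store = [(chunk, toy_embed(chunk)) for chunk in chunks]  # in-memory "vector store"
top = search(store, toy_embed("flour water dough"), k=1)
print(top)
```

A real pipeline differs in every component (layout-aware parsing, semantic splitting, learned embeddings, an indexed store), but the data flow is the same.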
RAG (Retrieval-Augmented Generation)
Create a complete RAG pipeline that enhances AI responses with relevant document context. This example demonstrates query rewriting, embedding generation, document retrieval, and response generation in a connected workflow.
from datapizza.clients.openai import OpenAIClient
from datapizza.embedders.openai import OpenAIEmbedder
from datapizza.modules.prompt import ChatPromptTemplate
from datapizza.modules.rewriters import ToolRewriter
from datapizza.pipeline import DagPipeline
from datapizza.vectorstores.qdrant import QdrantVectorstore
openai_client = OpenAIClient(
    model="gpt-4o-mini",
    api_key="YOUR_API_KEY",
)
# Register each stage of the pipeline under a name
dag_pipeline = DagPipeline()
dag_pipeline.add_module("rewriter", ToolRewriter(client=openai_client, system_prompt="Rewrite user queries to improve retrieval accuracy."))
dag_pipeline.add_module("embedder", OpenAIEmbedder(api_key="YOUR_API_KEY", model_name="text-embedding-3-small"))
dag_pipeline.add_module("retriever", QdrantVectorstore(host="localhost", port=6333).as_retriever(collection_name="my_documents", k=5))
dag_pipeline.add_module("prompt", ChatPromptTemplate(
    user_prompt_template="User question: {{user_prompt}}\n:",
    retrieval_prompt_template="Retrieved content:\n{% for chunk in chunks %}{{ chunk.text }}\n{% endfor %}",
))
dag_pipeline.add_module("generator", openai_client)
# Route each module's output into the named input of the next module
dag_pipeline.connect("rewriter", "embedder", target_key="text")
dag_pipeline.connect("embedder", "retriever", target_key="query_vector")
dag_pipeline.connect("retriever", "prompt", target_key="chunks")
dag_pipeline.connect("prompt", "generator", target_key="memory")
query = "tell me something about this document"
result = dag_pipeline.run({
    "rewriter": {"user_prompt": query},
    "prompt": {"user_prompt": query},
    "retriever": {"collection_name": "my_documents", "k": 3},
    "generator": {"input": query},
})
print(f"Generated response: {result['generator']}")
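In the example above, `connect(source, target, target_key=...)` feeds one module's output into a named input of the next, while `run` supplies the remaining inputs per module. The sketch below imitates that wiring idea with plain callables and the standard library's `graphlib`. `TinyDag` is hypothetical and makes no claim about how `DagPipeline` works internally.

```python
from graphlib import TopologicalSorter


class TinyDag:
    """Hypothetical miniature of a DAG pipeline: nodes are plain callables."""

    def __init__(self):
        self.nodes = {}
        self.edges = []  # (source, target, target_key)

    def add_module(self, name, func):
        self.nodes[name] = func

    def connect(self, source, target, target_key):
        self.edges.append((source, target, target_key))

    def run(self, initial):
        # Map each node to its predecessors so that they execute first.
        deps = {name: set() for name in self.nodes}
        for source, target, _ in self.edges:
            deps[target].add(source)
        results = {}
        for name in TopologicalSorter(deps).static_order():
            # Start from caller-supplied inputs, then add upstream outputs.
            kwargs = dict(initial.get(name, {}))
            for source, target, key in self.edges:
                if target == name:
                    kwargs[key] = results[source]
            results[name] = self.nodes[name](**kwargs)
        return results


dag = TinyDag()
dag.add_module("rewriter", lambda user_prompt: user_prompt.strip().lower())
dag.add_module("retriever", lambda text, k: [f"chunk about {text}"] * k)
dag.add_module("prompt", lambda user_prompt, chunks: f"{user_prompt}\n" + "\n".join(chunks))
dag.connect("rewriter", "retriever", target_key="text")
dag.connect("retriever", "prompt", target_key="chunks")

result = dag.run({
    "rewriter": {"user_prompt": "  Pizza History "},
    "retriever": {"k": 2},
    "prompt": {"user_prompt": "Pizza History"},
})
print(result["prompt"])
```

Keeping the wiring explicit like this means every intermediate result stays inspectable by node name, which helps when debugging a retrieval step.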
Ecosystem
Supported AI Providers
- OpenAI
- Google Gemini
- Anthropic
- Mistral
- Azure OpenAI
Tools & Integrations
| Category | Components |
|---|---|
| Document Parsers | Azure AI Document Intelligence, Docling |
| Vector Stores | Qdrant |
| Rerankers | Cohere, Together AI |
| Tools | DuckDuckGo Search, Custom Tools |
| Caching | Redis integration for performance optimization |
| Embedders | OpenAI, Google, Cohere, FastEmbed |
Learning Resources
- Complete Documentation - Comprehensive guides and API reference
- RAG Tutorial - Build production RAG systems
- Agent Examples - Real-world agent implementations
Community
- Discord Community
- Documentation
- GitHub Issues
- Twitter
Contributing
We love contributions! Whether it's:
- Bug Reports - Help us improve
- Feature Requests - Share your ideas
- Documentation - Make it better for everyone
- Code Contributions - Build the future together
Check out our Contributing Guide to get started.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Built by Datapizza, the AI native company
A framework made to be easy to learn, easy to maintain and ready for production
Star us on GitHub • Get Started • Join Discord