Build reliable Gen AI solutions without overhead

Written in Python. Designed for speed. A no-fluff GenAI framework that gets your agents from dev to prod, fast.

License: MIT | Python 3.10+

Why Datapizza AI?

A framework that keeps your agents predictable, your debugging fast, and your code trusted in production. Built by Engineers, trusted by Engineers.

Less abstraction, more control | API-first design | Observable by design

How to install

pip install datapizza-ai

Client invoke

from datapizza.clients.openai import OpenAIClient

client = OpenAIClient(api_key="YOUR_API_KEY")
result = client.invoke("Hi, how are you?")
print(result.text)

Key Features

API-first

  • Multi-Provider Support: OpenAI, Google Gemini, Anthropic, Mistral, Azure
  • Tool Integration: Built-in web search, document processing, custom tools
  • Memory Management: Persistent conversations and context awareness

Composable

  • Reusable blocks: Declarative configuration, easy overrides
  • Document Processing: PDF, DOCX, images with Azure AI & Docling
  • Smart Chunking: Context-aware text splitting and embedding
  • Built-in reranking: Add a reranker (e.g., Cohere) to boost relevance
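
To make the chunking idea concrete, here is a minimal sketch in plain Python. It is not Datapizza AI's splitter: `split_text`, `max_char`, and `overlap` are hypothetical names chosen for this illustration, and cutting at the last space inside a fixed-size window is an assumed strategy, far simpler than a context-aware splitter.

```python
def split_text(text, max_char=1024, overlap=0):
    """Split text into chunks of at most max_char characters, preferring to cut at spaces."""
    if max_char <= 0:
        raise ValueError("max_char must be positive")
    if not 0 <= overlap < max_char:
        raise ValueError("overlap must satisfy 0 <= overlap < max_char")

    chunks = []
    n = len(text)
    start = 0
    while True:
        # Skip whitespace so a chunk never starts with a blank.
        while start < n and text[start].isspace():
            start += 1
        if start >= n:
            break
        end = min(start + max_char, n)
        if end < n:
            # Look for the last space that still leaves room for the overlap.
            cut = text.rfind(" ", start + overlap + 1, end + 1)
            if cut != -1:
                end = cut
        chunks.append(text[start:end].strip())
        if end >= n:
            break
        # Step back by `overlap` characters so neighbouring chunks share context.
        start = end - overlap
    return chunks


print(split_text("alpha beta gamma delta", max_char=11))
# ['alpha beta', 'gamma delta']
```

A non-zero `overlap` repeats the tail of one chunk at the head of the next, which can help retrieval when an answer straddles a chunk boundary, at the cost of storing more text.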

Observable

  • OpenTelemetry tracing: Standards-based instrumentation
  • Client I/O tracing: Optional toggle to log inputs, outputs, and in-memory context
  • Custom spans: Trace fine-grained phases and sub-steps to pinpoint bottlenecks
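
The idea behind custom spans can be sketched with the standard library alone. The example below is an illustration, not the `datapizza.tracing` API and not OpenTelemetry: `SpanRecorder` and `Span` are hypothetical names, and the `time.sleep` calls stand in for real retrieval and generation steps.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    name: str
    parent: Optional[str]
    duration_s: float


class SpanRecorder:
    """Hypothetical recorder that times nested phases of an operation."""

    def __init__(self):
        self.spans = []    # finished spans, in completion order
        self._stack = []   # names of the spans that are currently open

    @contextmanager
    def span(self, name):
        parent = self._stack[-1] if self._stack else None
        self._stack.append(name)
        start = time.perf_counter()
        try:
            yield
        finally:
            duration = time.perf_counter() - start
            self._stack.pop()
            self.spans.append(Span(name, parent, duration))


recorder = SpanRecorder()
with recorder.span("answer_question"):
    with recorder.span("retrieve"):
        time.sleep(0.01)   # stand-in for a vector search
    with recorder.span("generate"):
        time.sleep(0.02)   # stand-in for an LLM call

for s in recorder.spans:
    print(f"{s.name:<16} parent={s.parent} {s.duration_s * 1000:.1f} ms")
```

Inner spans finish first, so they appear before their parent in `recorder.spans`; comparing a parent's duration with its children shows where the time goes.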

Vendor-Agnostic

  • Swap models: Change providers without rewiring business logic
  • Clear Interfaces: Predictable APIs across all components
  • Rich Ecosystem: Modular design with optional components
  • Migration-friendly: Quick migration from other frameworks
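
The "swap models" point can be illustrated with a small interface sketch. Everything here is hypothetical (`ChatClient`, `Reply`, `EchoProvider`, `UppercaseProvider`, `summarize_ticket`); the only detail borrowed from the examples in this README is the call shape, an `invoke(...)` method whose result exposes `.text`.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Reply:
    text: str


class ChatClient(Protocol):
    """Anything with an `invoke(prompt)` method that returns an object with `.text`."""

    def invoke(self, prompt: str) -> Reply: ...


class EchoProvider:
    """Hypothetical stand-in for one provider's client."""

    def invoke(self, prompt: str) -> Reply:
        return Reply(text=f"echo: {prompt}")


class UppercaseProvider:
    """Hypothetical stand-in for a second provider's client."""

    def invoke(self, prompt: str) -> Reply:
        return Reply(text=prompt.upper())


def summarize_ticket(client: ChatClient, ticket: str) -> str:
    # Business logic sees only the shared interface, never a concrete provider.
    return client.invoke(f"Summarize: {ticket}").text


for provider in (EchoProvider(), UppercaseProvider()):
    print(summarize_ticket(provider, "login page is slow"))
```

Because `summarize_ticket` depends only on the protocol, changing providers means changing the object passed in, not the function.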

Quick Start

Installation

# Core framework
pip install datapizza-ai

# With specific providers (optional)
pip install datapizza-ai-clients-openai
pip install datapizza-ai-clients-google
pip install datapizza-ai-clients-anthropic

Start with Agent

from datapizza.agents import Agent
from datapizza.clients.openai import OpenAIClient
from datapizza.tools import tool

@tool
def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny"

client = OpenAIClient(api_key="YOUR_API_KEY")
agent = Agent(name="assistant", client=client, tools=[get_weather])

response = agent.run("What is the weather in Rome?")
print(response.text)
# Example output (exact wording depends on the model): The weather in Rome is sunny
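
One plausible way a decorator like `@tool` can expose a function to a model is to derive a description from the function's signature and docstring. The sketch below shows that idea with `inspect`; it is not Datapizza AI's implementation, and `describe_tool` and the shape of the returned dictionary are assumptions made for illustration.

```python
import inspect
from typing import get_type_hints

# Assumed mapping from Python annotations to JSON-schema-style type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func):
    """Build a schema-like description of a function from its signature."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def get_weather(city: str, days: int = 1) -> str:
    """Return a canned forecast for a city."""
    return f"The weather in {city} is sunny for the next {days} day(s)"


schema = describe_tool(get_weather)
print(schema["name"], schema["parameters"]["required"])
```

Type hints and a docstring on each tool function give such a description something to work with, which is one reason the examples here annotate `city: str`.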

Detailed Tracing

A key requirement for principled development of LLM applications over your data (RAG systems, agents) is the ability to observe and debug them.

Datapizza-ai provides built-in observability with OpenTelemetry tracing to help you monitor performance and understand execution flow.

Trace Your AI Operations

pip install datapizza-ai-tools-duckduckgo

from datapizza.agents import Agent
from datapizza.clients.openai import OpenAIClient
from datapizza.tools.duckduckgo import DuckDuckGoSearchTool
from datapizza.tracing import ContextTracing

client = OpenAIClient(api_key="OPENAI_API_KEY")
agent = Agent(name="assistant", client=client, tools=[DuckDuckGoSearchTool()])

with ContextTracing().trace("my_ai_operation"):
    response = agent.run("Tell me some news about Bitcoin")

# Output shows:
# ╭─ Trace Summary of my_ai_operation ──────────────────────────────────╮
# │ Total Spans: 3                                                      │
# │ Duration: 2.45s                                                     │
# │ ┏━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ │
# │ ┃ Model       ┃ Prompt Tokens ┃ Completion Tokens ┃ Cached Tokens ┃ │
# │ ┡━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │
# │ │ gpt-4o-mini │ 31            │ 27                │ 0             │ │
# │ └─────────────┴───────────────┴───────────────────┴───────────────┘ │
# ╰─────────────────────────────────────────────────────────────────────╯

Examples

Multi-Agent System

Build sophisticated AI systems where multiple specialized agents collaborate to solve complex tasks. This example shows how to create a trip planning system with dedicated agents for weather information, web search, and planning coordination.

# Install DuckDuckGo tool
pip install datapizza-ai-tools-duckduckgo

from datapizza.agents.agent import Agent
from datapizza.clients.openai import OpenAIClient
from datapizza.tools import tool
from datapizza.tools.duckduckgo import DuckDuckGoSearchTool

client = OpenAIClient(api_key="YOUR_API_KEY", model="gpt-4.1")

@tool
def get_weather(city: str) -> str:
    return f"It's sunny all week in {city}"

weather_agent = Agent(
    name="weather_expert",
    client=client,
    system_prompt="You are a weather expert. Provide detailed weather information and forecasts.",
    tools=[get_weather]
)

web_search_agent = Agent(
    name="web_search_expert",
    client=client,
    system_prompt="You are a web search expert. You can search the web for information.",
    tools=[DuckDuckGoSearchTool()]
)

planner_agent = Agent(
    name="planner",
    client=client,
    system_prompt="You are a trip planner. You should provide a plan for the user. Make sure to provide a detailed plan with the best places to visit and the best time to visit them."
)

planner_agent.can_call([weather_agent, web_search_agent])

response = planner_agent.run(
    "I need to plan a hiking trip in Seattle next week. I want to see some waterfalls and a forest."
)
print(response.text)

Document Ingestion

Process and index documents for retrieval-augmented generation (RAG). This pipeline automatically parses PDFs, splits them into chunks, generates embeddings, and stores them in a vector database for efficient similarity search.

pip install datapizza-ai-parsers-docling

from datapizza.core.vectorstore import VectorConfig
from datapizza.embedders import ChunkEmbedder
from datapizza.embedders.openai import OpenAIEmbedder
from datapizza.modules.parsers.docling import DoclingParser
from datapizza.modules.splitters import NodeSplitter
from datapizza.pipeline import IngestionPipeline
from datapizza.vectorstores.qdrant import QdrantVectorstore

# In-memory Qdrant instance; nothing is persisted between runs
vectorstore = QdrantVectorstore(location=":memory:")
embedder = ChunkEmbedder(client=OpenAIEmbedder(api_key="YOUR_API_KEY", model_name="text-embedding-3-small"))
# text-embedding-3-small produces 1536-dimensional vectors by default
vectorstore.create_collection("my_documents", vector_config=[VectorConfig(name="embedding", dimensions=1536)])

pipeline = IngestionPipeline(
    modules=[
        DoclingParser(),
        NodeSplitter(max_char=1024),
        embedder,
    ],
    vector_store=vectorstore,
    collection_name="my_documents"
)

pipeline.run("sample.pdf")

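# Placeholder query: a zero vector only checks that the collection is searchable.
# For real retrieval, embed the query text and pass that vector instead.
results = vectorstore.search(query_vector=[0.0] * 1536, collection_name="my_documents", k=5)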
print(results)
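
To show what "similarity search" means once chunks are embedded, here is a minimal sketch using cosine similarity over a tiny in-memory index. It is an illustration only: `cosine_similarity` and `top_k` are hypothetical helpers, the three-dimensional vectors are made up, and a real vector store may use a different distance metric and an approximate index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query, indexed, k=5):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query, vector), text) for text, vector in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Made-up 3-dimensional "embeddings" for illustration.
indexed = [
    ("waterfalls near Seattle", [0.9, 0.1, 0.0]),
    ("pizza recipes", [0.0, 0.2, 0.9]),
    ("forest trails", [0.7, 0.6, 0.1]),
]

for score, text in top_k([1.0, 0.2, 0.0], indexed, k=2):
    print(f"{score:.3f}  {text}")
```

In this sketch an all-zero query scores 0.0 against every chunk, so it carries no ranking information; meaningful retrieval needs a query vector produced by the same embedding model as the stored chunks.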

RAG (Retrieval-Augmented Generation)

Create a complete RAG pipeline that enhances AI responses with relevant document context. This example demonstrates query rewriting, embedding generation, document retrieval, and response generation in a connected workflow.

from datapizza.clients.openai import OpenAIClient
from datapizza.embedders.openai import OpenAIEmbedder
from datapizza.modules.prompt import ChatPromptTemplate
from datapizza.modules.rewriters import ToolRewriter
from datapizza.pipeline import DagPipeline
from datapizza.vectorstores.qdrant import QdrantVectorstore

openai_client = OpenAIClient(
    model="gpt-4o-mini",
    api_key="YOUR_API_KEY"
)

dag_pipeline = DagPipeline()
dag_pipeline.add_module("rewriter", ToolRewriter(client=openai_client, system_prompt="Rewrite user queries to improve retrieval accuracy."))
dag_pipeline.add_module("embedder", OpenAIEmbedder(api_key="YOUR_API_KEY", model_name="text-embedding-3-small"))
dag_pipeline.add_module("retriever", QdrantVectorstore(host="localhost", port=6333).as_retriever(collection_name="my_documents", k=5))
dag_pipeline.add_module("prompt", ChatPromptTemplate(user_prompt_template="User question: {{user_prompt}}\n:", retrieval_prompt_template="Retrieved content:\n{% for chunk in chunks %}{{ chunk.text }}\n{% endfor %}"))
dag_pipeline.add_module("generator", openai_client)

dag_pipeline.connect("rewriter", "embedder", target_key="text")
dag_pipeline.connect("embedder", "retriever", target_key="query_vector")
dag_pipeline.connect("retriever", "prompt", target_key="chunks")
dag_pipeline.connect("prompt", "generator", target_key="memory")

query = "tell me something about this document"
result = dag_pipeline.run({
    "rewriter": {"user_prompt": query},
    "prompt": {"user_prompt": query},
    "retriever": {"collection_name": "my_documents", "k": 3},
    "generator": {"input": query},
})

print(f"Generated response: {result['generator']}")
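
The wiring above, where named modules are connected so that one module's output arrives as a keyword argument of the next, can be sketched in a few lines of plain Python. `MiniDag` is a hypothetical stand-in that mirrors the `add_module` / `connect` / `run` call shape used in this example; it is not how `DagPipeline` is implemented, and the lambda modules are toy placeholders.

```python
from graphlib import TopologicalSorter


class MiniDag:
    """Hypothetical toy DAG runner: modules are callables that take keyword arguments."""

    def __init__(self):
        self.modules = {}   # name -> callable
        self.edges = []     # (source, target, target_key)

    def add_module(self, name, func):
        self.modules[name] = func

    def connect(self, source, target, target_key):
        self.edges.append((source, target, target_key))

    def run(self, inputs):
        # Map each module to the set of modules it depends on.
        deps = {name: set() for name in self.modules}
        for source, target, _ in self.edges:
            deps[target].add(source)

        results = {}
        # static_order() yields every module after all of its dependencies.
        for name in TopologicalSorter(deps).static_order():
            kwargs = dict(inputs.get(name, {}))
            for source, target, key in self.edges:
                if target == name:
                    kwargs[key] = results[source]
            results[name] = self.modules[name](**kwargs)
        return results


dag = MiniDag()
dag.add_module("rewriter", lambda user_prompt: user_prompt.strip().lower())
dag.add_module("retriever", lambda query, k: [f"doc about {query}"] * k)
dag.add_module("prompt", lambda user_prompt, chunks: user_prompt + "\n" + "\n".join(chunks))
dag.connect("rewriter", "retriever", target_key="query")
dag.connect("retriever", "prompt", target_key="chunks")

result = dag.run({
    "rewriter": {"user_prompt": "  What Is RAG? "},
    "retriever": {"k": 2},
    "prompt": {"user_prompt": "What is RAG?"},
})
print(result["prompt"])
```

Per-module entries in the `run` input supply the arguments that no upstream module provides, which is the role the dictionary passed to `dag_pipeline.run` appears to play above.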

Ecosystem

Supported AI Providers

  • OpenAI
  • Google Gemini
  • Anthropic
  • Mistral
  • Azure OpenAI

Tools & Integrations

Category | Components
Document Parsers | Azure AI Document Intelligence, Docling
Vector Stores | Qdrant
Rerankers | Cohere, Together AI
Tools | DuckDuckGo Search, Custom Tools
Caching | Redis integration for performance optimization
Embedders | OpenAI, Google, Cohere, FastEmbed

Learning Resources

Community

Contributing

We love contributions! Whether it's:

  • Bug Reports - Help us improve
  • Feature Requests - Share your ideas
  • Documentation - Make it better for everyone
  • Code Contributions - Build the future together

Check out our Contributing Guide to get started.

License

This project is licensed under the MIT License - see the LICENSE file for details.


Built by Datapizza, the AI native company

A framework made to be easy to learn, easy to maintain, and ready for production.

Download files

Source Distribution

datapizza_ai-0.1.0.tar.gz (136.3 kB)

Uploaded Source

Built Distribution

datapizza_ai-0.1.0-py3-none-any.whl (207.6 kB)

Uploaded Python 3

File details

Details for the file datapizza_ai-0.1.0.tar.gz.

File metadata

  • Download URL: datapizza_ai-0.1.0.tar.gz
  • Upload date:
  • Size: 136.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-httpx/0.28.1

File hashes

Hashes for datapizza_ai-0.1.0.tar.gz
Algorithm | Hash digest
SHA256 | b42285db7546dc53c183efa8f75f0eb347c5925924b405a333583a9903750896
MD5 | 626d4cdc22e73381b3d52f0ab5251142
BLAKE2b-256 | f9addde238cf3217e84208da55ed15cfe449b52a00b7ae16f4e28f96665133e0

File details

Details for the file datapizza_ai-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: datapizza_ai-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 207.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: python-httpx/0.28.1

File hashes

Hashes for datapizza_ai-0.1.0-py3-none-any.whl
Algorithm | Hash digest
SHA256 | 3d94289c9d88cb2582cb959faa972b45481523c3c24d7bbf69d6d7af125f7ff9
MD5 | 9ef91577725b4c10349de4435fd9b81f
BLAKE2b-256 | 9a93fe5ae5f29045ae8f13cd9e5145c36afba7d77bc83deea3d7aab5ca855273
