Build reliable Gen AI solutions without overhead

Written in Python. Designed for speed. A no-fluff GenAI framework that gets your agents from dev to prod, fast.

License: MIT | Python 3.10+


🌟 Why Datapizza AI?

A framework that keeps your agents predictable, your debugging fast, and your code trusted in production. Built by Engineers, trusted by Engineers.

⚡ Less abstraction, more control | 🚀 API-first design | 🔧 Observable by design

How to install

pip install datapizza-ai

Client invoke

from datapizza.clients.openai import OpenAIClient

client = OpenAIClient(api_key="YOUR_API_KEY")
result = client.invoke("Hi, how are you?")
print(result.text)

✨ Key Features

🎯 API-first

  • Multi-Provider Support: OpenAI, Google Gemini, Anthropic, Mistral, Azure
  • Tool Integration: Built-in web search, document processing, custom tools
  • Memory Management: Persistent conversations and context awareness

๐Ÿ” Composable

  • Reusable blocks: Declarative configuration, easy overrides
  • Document Processing: PDF, DOCX, images with Azure AI & Docling
  • Smart Chunking: Context-aware text splitting and embedding
  • Built-in reranking: Add a reranker (e.g., Cohere) to boost relevance
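
To make the reranking idea concrete, here is a minimal sketch in plain Python. It does not use Datapizza AI's reranker classes or Cohere: `overlap_score` and `rerank` are hypothetical names, and the term-overlap score only stands in for a real relevance model.

```python
def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query terms that appear in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def rerank(query: str, chunks: list[str], top_n: int = 3) -> list[str]:
    """Re-order retrieved chunks by score, highest first, and keep top_n."""
    ranked = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)
    return ranked[:top_n]


# Hypothetical retrieval results, in the order a vector store returned them.
retrieved = [
    "Pizza dough needs time to rest",
    "Qdrant stores vectors for similarity search",
    "Vectors enable similarity search over documents",
]
best = rerank("similarity search over documents", retrieved, top_n=2)
print(best)
```

The point of the pattern is that retrieval can stay cheap and broad, while a second, more careful scoring pass decides what actually reaches the prompt.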

🔧 Observable

  • OpenTelemetry tracing: Standards-based instrumentation
  • Client I/O tracing: Optional toggle to log inputs, outputs, and in-memory context
  • Custom spans: Trace fine-grained phases and sub-steps to pinpoint bottlenecks
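
As a rough illustration of what a custom span buys you, the sketch below times nested phases with the standard library only. It is not the Datapizza AI tracing API and not OpenTelemetry: `span` and `recorded_spans` are hypothetical names, and a real setup would export spans instead of keeping them in a list.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory span log; a real setup would export spans through
# an OpenTelemetry exporter instead.
recorded_spans: list[dict] = []


@contextmanager
def span(name: str):
    """Time the enclosed block and record its name and duration."""
    start = time.perf_counter()
    try:
        yield
    finally:
        recorded_spans.append(
            {"name": name, "duration_s": time.perf_counter() - start}
        )


with span("pipeline"):
    with span("retrieve"):
        time.sleep(0.01)  # stand-in for a vector store lookup
    with span("generate"):
        time.sleep(0.01)  # stand-in for an LLM call

# Inner spans finish first, so they are recorded before the outer one.
for entry in recorded_spans:
    print(f"{entry['name']}: {entry['duration_s']:.3f}s")
```

Wrapping each sub-step in its own span is what lets you see which phase dominates the total duration.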

🚀 Vendor-Agnostic

  • Swap models: Change providers without rewiring business logic
  • Clear Interfaces: Predictable APIs across all components
  • Rich Ecosystem: Modular design with optional components
  • Migration-friendly: Quick migration from other frameworks
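
The "swap models" idea can be sketched with a small interface. Everything below is hypothetical and independent of the Datapizza AI classes: `TextClient`, `FakeProviderA`, and `FakeProviderB` are stand-ins, and `invoke` returns a plain string here for brevity.

```python
from typing import Protocol


class TextClient(Protocol):
    """Hypothetical minimal interface that every provider client satisfies."""

    def invoke(self, prompt: str) -> str: ...


class FakeProviderA:
    def invoke(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"


class FakeProviderB:
    def invoke(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"


def summarize(client: TextClient, document: str) -> str:
    """Business logic depends only on the interface, not on a vendor."""
    return client.invoke(f"Summarize: {document}")


# Swapping providers changes one argument; summarize() stays untouched.
answer_a = summarize(FakeProviderA(), "quarterly report")
answer_b = summarize(FakeProviderB(), "quarterly report")
print(answer_a)
print(answer_b)
```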

🚀 Quick Start

Installation

# Core framework
pip install datapizza-ai

# With specific providers (optional)
pip install datapizza-ai-clients-openai
pip install datapizza-ai-clients-google
pip install datapizza-ai-clients-anthropic

Start with Agent

from datapizza.agents import Agent
from datapizza.clients.openai import OpenAIClient
from datapizza.tools import tool

@tool
def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny"

client = OpenAIClient(api_key="YOUR_API_KEY")
agent = Agent(name="assistant", client=client, tools=[get_weather])

response = agent.run("What is the weather in Rome?")
print(response.text)
# Example output: The weather in Rome is sunny
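
For readers curious about what a tool decorator generally does, here is a simplified sketch. It is an assumption-laden illustration, not how `datapizza.tools.tool` is implemented: `simple_tool`, `TOOL_REGISTRY`, `describe`, and `dispatch` are all hypothetical names.

```python
import inspect
from typing import Callable

# Hypothetical registry; not how datapizza.tools.tool is implemented.
TOOL_REGISTRY: dict[str, Callable[..., str]] = {}


def simple_tool(func: Callable[..., str]) -> Callable[..., str]:
    """Register a function under its own name so it can be looked up later."""
    TOOL_REGISTRY[func.__name__] = func
    return func


@simple_tool
def get_weather(city: str) -> str:
    return f"The weather in {city} is sunny"


def describe(name: str) -> str:
    """Build the kind of signature summary a model could be shown."""
    return f"{name}{inspect.signature(TOOL_REGISTRY[name])}"


def dispatch(name: str, arguments: dict) -> str:
    """Run the tool a model asked for, with the arguments it supplied."""
    return TOOL_REGISTRY[name](**arguments)


print(describe("get_weather"))
result = dispatch("get_weather", {"city": "Rome"})
print(result)
```

In this picture, the agent shows the model the tool descriptions, the model replies with a tool name and arguments, and the agent dispatches the call and feeds the result back.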

📊 Detailed Tracing

A key requirement for principled development of LLM applications over your data (RAG systems, agents) is the ability to observe and debug them.

Datapizza AI provides built-in observability with OpenTelemetry tracing to help you monitor performance and understand execution flow.

๐Ÿ” Trace Your AI Operations
from datapizza.tracing import ContextTracing
from datapizza.agents import Agent
from datapizza.clients.openai import OpenAIClient

client = OpenAIClient(api_key=os.getenv("OPENAI_API_KEY"))
agent = Agent(name="assistant", client=client, tools = [DuckDuckGoSearchTool()])

with ContextTracing().trace("my_ai_operation"):
    response = agent.run("Tell me some news about Bitcoin")

# Output shows:
# โ•ญโ”€ Trace Summary of my_ai_operation โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฎ
# โ”‚ Total Spans: 3                                                      โ”‚
# โ”‚ Duration: 2.45s                                                     โ”‚
# โ”‚ โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ณโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“ |
# โ”‚ โ”ƒ Model       โ”ƒ Prompt Tokens โ”ƒ Completion Tokens โ”ƒ Cached Tokens โ”ƒ |
# โ”‚ โ”กโ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ•‡โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”ฉ |
# โ”‚ โ”‚ gpt-4o-mini โ”‚ 31            โ”‚ 27                โ”‚ 0             โ”‚ |
# โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ |
# โ•ฐโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ•ฏ

🎯 Examples

🌐 Multi-Agent System

Build sophisticated AI systems where multiple specialized agents collaborate to solve complex tasks. This example shows how to create a trip planning system with dedicated agents for weather information, web search, and planning coordination.

# Install DuckDuckGo tool
pip install datapizza-ai-tools-duckduckgo

from datapizza.agents.agent import Agent
from datapizza.clients.openai import OpenAIClient
from datapizza.tools import tool
from datapizza.tools.duckduckgo import DuckDuckGoSearchTool

client = OpenAIClient(api_key="YOUR_API_KEY", model="gpt-4.1")

@tool
def get_weather(city: str) -> str:
    return f"It's sunny all week in {city}"

weather_agent = Agent(
    name="weather_expert",
    client=client,
    system_prompt="You are a weather expert. Provide detailed weather information and forecasts.",
    tools=[get_weather]
)

web_search_agent = Agent(
    name="web_search_expert",
    client=client,
    system_prompt="You are a web search expert. You can search the web for information.",
    tools=[DuckDuckGoSearchTool()]
)

planner_agent = Agent(
    name="planner",
    client=client, 
    system_prompt="You are a trip planner. You should provide a plan for the user. Make sure to provide a detailed plan with the best places to visit and the best time to visit them."
)

planner_agent.can_call([weather_agent, web_search_agent])

response = planner_agent.run(
    "I need to plan a hiking trip in Seattle next week. I want to see some waterfalls and a forest."
)
print(response.text)

📊 Document Ingestion

Process and index documents for retrieval-augmented generation (RAG). This pipeline automatically parses PDFs, splits them into chunks, generates embeddings, and stores them in a vector database for efficient similarity search.

from datapizza.core.vectorstore import VectorConfig
from datapizza.embedders import ChunkEmbedder
from datapizza.embedders.openai import OpenAIEmbedder
from datapizza.modules.parsers.docling import DoclingParser
from datapizza.modules.splitters import NodeSplitter
from datapizza.pipeline import IngestionPipeline
from datapizza.vectorstores.qdrant import QdrantVectorstore

vectorstore = QdrantVectorstore(location=":memory:")
embedder = ChunkEmbedder(client=OpenAIEmbedder(api_key="YOUR_API_KEY", model_name="text-embedding-3-small"))
vectorstore.create_collection("my_documents", vector_config=[VectorConfig(name="embedding", dimensions=1536)])

pipeline = IngestionPipeline(
    modules=[
        DoclingParser(),
        NodeSplitter(max_char=1024),
        embedder,
    ],
    vector_store=vectorstore,
    collection_name="my_documents"
)

pipeline.run("sample.pdf")

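# Placeholder query: a zero vector with the embedding's 1536 dimensions.
# In practice, embed the query text and pass the resulting vector.
results = vectorstore.search(query_vector=[0.0] * 1536, collection_name="my_documents", k=5)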
print(results)
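
The splitting step in the pipeline above can be pictured with a small stand-alone function. This is a sketch under stated assumptions, not the algorithm `NodeSplitter` uses: `split_text` is a hypothetical helper that packs whole paragraphs greedily and only borrows the `max_char` parameter name from the example.

```python
def split_text(text: str, max_char: int = 1024) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most max_char characters.

    A paragraph longer than max_char is hard-split as a last resort.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_char:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # Hard-split paragraphs that cannot fit in a single chunk.
        while len(paragraph) > max_char:
            chunks.append(paragraph[:max_char])
            paragraph = paragraph[max_char:]
        current = paragraph
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph.\n\n" + "x" * 50
chunks = split_text(sample, max_char=40)
print([len(c) for c in chunks])
```

Keeping paragraphs intact where possible is one simple way to make chunks "context-aware"; each resulting chunk is then embedded and stored as its own vector.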

📊 RAG (Retrieval-Augmented Generation)

Create a complete RAG pipeline that enhances AI responses with relevant document context. This example demonstrates query rewriting, embedding generation, document retrieval, and response generation in a connected workflow.

from datapizza.clients.openai import OpenAIClient
from datapizza.embedders.openai import OpenAIEmbedder
from datapizza.modules.prompt import ChatPromptTemplate
from datapizza.modules.rewriters import ToolRewriter
from datapizza.pipeline import DagPipeline
from datapizza.vectorstores.qdrant import QdrantVectorstore

openai_client = OpenAIClient(
    model="gpt-4o-mini",
    api_key="YOUR_API_KEY"
)

dag_pipeline = DagPipeline()
dag_pipeline.add_module(
    "rewriter",
    ToolRewriter(
        client=openai_client,
        system_prompt="Rewrite user queries to improve retrieval accuracy.",
    ),
)
dag_pipeline.add_module(
    "embedder",
    OpenAIEmbedder(api_key="YOUR_API_KEY", model_name="text-embedding-3-small"),
)
dag_pipeline.add_module(
    "retriever",
    QdrantVectorstore(host="localhost", port=6333).as_retriever(
        collection_name="my_documents", k=5
    ),
)
dag_pipeline.add_module(
    "prompt",
    ChatPromptTemplate(
        user_prompt_template="User question: {{user_prompt}}\n:",
        retrieval_prompt_template="Retrieved content:\n{% for chunk in chunks %}{{ chunk.text }}\n{% endfor %}",
    ),
)
dag_pipeline.add_module("generator", openai_client)

dag_pipeline.connect("rewriter", "embedder", target_key="text")
dag_pipeline.connect("embedder", "retriever", target_key="query_vector")
dag_pipeline.connect("retriever", "prompt", target_key="chunks")
dag_pipeline.connect("prompt", "generator", target_key="memory")

query = "tell me something about this document"
result = dag_pipeline.run({
    "rewriter": {"user_prompt": query},
    "prompt": {"user_prompt": query},
    "retriever": {"collection_name": "my_documents", "k": 3},
    "generator": {"input": query}
})

print(f"Generated response: {result['generator']}")
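
To show what "connecting" modules amounts to, here is a minimal sketch of a DAG runner built on the standard library's `graphlib`. It is not `DagPipeline`'s implementation: `run_dag`, `modules`, and `edges` are hypothetical, and the lambdas are toy stand-ins for the rewriter, embedder, and retriever.

```python
from graphlib import TopologicalSorter
from typing import Callable

# Hypothetical stand-ins for the rewriter, embedder, and retriever modules.
modules: dict[str, Callable[[dict], object]] = {
    "rewriter": lambda inputs: inputs["user_prompt"].strip().lower(),
    "embedder": lambda inputs: [float(len(inputs["text"]))],
    "retriever": lambda inputs: f"chunks near {inputs['query_vector']}",
}

# Each edge is (source, target, target_key), echoing connect() above.
edges = [
    ("rewriter", "embedder", "text"),
    ("embedder", "retriever", "query_vector"),
]


def run_dag(initial_inputs: dict[str, dict]) -> dict[str, object]:
    """Run every module after its upstream modules, wiring outputs to inputs."""
    sorter = TopologicalSorter({name: set() for name in modules})
    for source, target, _ in edges:
        sorter.add(target, source)  # target depends on source
    results: dict[str, object] = {}
    for name in sorter.static_order():
        inputs = dict(initial_inputs.get(name, {}))
        for source, target, key in edges:
            if target == name:
                inputs[key] = results[source]
        results[name] = modules[name](inputs)
    return results


results = run_dag({"rewriter": {"user_prompt": "  What Is RAG?  "}})
print(results)
```

The topological order guarantees that a module only runs once everything it depends on has produced output, which is why each upstream result can be passed in under its `target_key`.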

๐ŸŒ Ecosystem

๐Ÿค– Supported AI Providers


  • OpenAI
  • Google Gemini
  • Anthropic
  • Mistral
  • Azure OpenAI

🔧 Tools & Integrations

Category            | Components
📄 Document Parsers | Azure AI Document Intelligence, Docling
🔍 Vector Stores    | Qdrant
🎯 Rerankers        | Cohere, Together AI
🌐 Tools            | DuckDuckGo Search, Custom Tools
💾 Caching          | Redis integration for performance optimization
📊 Embedders        | OpenAI, Google, Cohere, FastEmbed

🎓 Learning Resources

🤝 Community

🌟 Contributing

We love contributions! Whether it's:

  • ๐Ÿ› Bug Reports - Help us improve
  • ๐Ÿ’ก Feature Requests - Share your ideas
  • ๐Ÿ“ Documentation - Make it better for everyone
  • ๐Ÿ”ง Code Contributions - Build the future together

Check out our Contributing Guide to get started.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Built by Datapizza, the AI native company

A framework made to be easy to learn, easy to maintain and ready for production 🍕

โญ Star us on GitHub โ€ข ๐Ÿš€ Get Started โ€ข ๐Ÿ’ฌ Join Discord
