MCP server for datapizza-ai documentation and examples

DataPizza MCP Server 🍕

A Model Context Protocol (MCP) server that provides intelligent access to datapizza-ai documentation through vector similarity search and retrieval-augmented generation.

Overview

This MCP server lets AI assistants and applications query the datapizza-ai documentation in natural language. It indexes documentation from the datapizza-ai repository and returns contextual, relevant responses through a Retrieval-Augmented Generation (RAG) pipeline.

Features

  • Intelligent Documentation Search: Natural language queries across datapizza-ai documentation
  • Vector-Based Retrieval: Uses OpenAI embeddings and Qdrant vector database for semantic search
  • MCP Protocol Compliance: Standard Model Context Protocol implementation for broad compatibility
  • Automatic Indexing: Downloads and indexes documentation from GitHub automatically
  • Cloud-Ready: Supports Qdrant Cloud for scalable vector storage
  • Configurable: Environment-based configuration for flexible deployment

Architecture

The server consists of four main components:

  • MCP Server: FastMCP-based server exposing the query_datapizza tool
  • Indexer: Downloads and processes datapizza-ai documentation into searchable chunks
  • Retriever: RAG engine for semantic search and response generation
  • Configuration: Environment-based settings management with validation
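
The Retriever's search step can be sketched independently of the real stack: embed the query, rank stored chunks by cosine similarity, and return the top-k. The toy vectors and in-memory list below stand in for OpenAI embeddings and Qdrant, so this is an illustration of the flow, not the server's actual code.

```python
# Sketch of the Retriever's semantic-search step: rank stored chunks by
# cosine similarity to the query embedding and keep the top-k matches.
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    """chunks: (text, embedding) pairs; returns the k most similar texts."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy corpus with 2-dimensional "embeddings" in place of real ones.
chunks = [
    ("Agents overview", [1.0, 0.0]),
    ("Qdrant setup",    [0.0, 1.0]),
    ("OpenAI clients",  [0.9, 0.1]),
]
print(top_k([1.0, 0.0], chunks, k=2))
```

In the real server, the query embedding comes from the OpenAI embeddings API and the nearest-neighbor search runs inside Qdrant rather than in memory.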

Prerequisites

  • Python 3.10 or higher
  • OpenAI API key
  • Qdrant Cloud account and API key
  • Internet connection for documentation indexing

Installation

  1. Clone the repository and enter it:
git clone https://github.com/datapizza-labs/mcp_server_datapizza.git
cd mcp_server_datapizza
  2. Navigate to the package directory:
cd datapizza-mcp-server
  3. Install the package with development dependencies:
pip install -e ".[dev]"

Configuration

Create a .env file in the datapizza-mcp-server directory with the following variables:

# Required Configuration
OPENAI_API_KEY=your_openai_api_key_here
QDRANT_URL=your_qdrant_cloud_url
QDRANT_API_KEY=your_qdrant_api_key

# Optional Configuration
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSIONS=1536
COLLECTION_NAME=datapizza_docs
MAX_RESULTS=5
CHUNK_SIZE=1024
CHUNK_OVERLAP=200
LOG_LEVEL=INFO

Required Environment Variables

Variable         Description
OPENAI_API_KEY   OpenAI API key used to generate embeddings
QDRANT_URL       Qdrant Cloud instance URL
QDRANT_API_KEY   Qdrant Cloud API key

Optional Environment Variables

Variable              Default                  Description
EMBEDDING_MODEL       text-embedding-3-small   OpenAI embedding model
EMBEDDING_DIMENSIONS  1536                     Embedding vector dimensions
COLLECTION_NAME       datapizza_docs           Qdrant collection name
MAX_RESULTS           5                        Maximum number of search results returned
CHUNK_SIZE            1024                     Document chunk size for indexing
CHUNK_OVERLAP         200                      Overlap between consecutive chunks
LOG_LEVEL             INFO                     Logging level (DEBUG, INFO, WARNING, ERROR)
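
Loading and validating these variables might look like the following sketch. The server's actual config.py may structure this differently; the `Settings` dataclass and `load_settings` helper here are illustrative names, not the package's API.

```python
# Hedged sketch of environment-based configuration with validation of
# the three required variables; defaults mirror the table above.
import os
from dataclasses import dataclass

@dataclass
class Settings:
    openai_api_key: str
    qdrant_url: str
    qdrant_api_key: str
    embedding_model: str = "text-embedding-3-small"
    max_results: int = 5
    chunk_size: int = 1024
    chunk_overlap: int = 200

def load_settings() -> Settings:
    # Fail fast if any required variable is unset or empty.
    missing = [v for v in ("OPENAI_API_KEY", "QDRANT_URL", "QDRANT_API_KEY")
               if not os.environ.get(v)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")
    return Settings(
        openai_api_key=os.environ["OPENAI_API_KEY"],
        qdrant_url=os.environ["QDRANT_URL"],
        qdrant_api_key=os.environ["QDRANT_API_KEY"],
        embedding_model=os.environ.get("EMBEDDING_MODEL", "text-embedding-3-small"),
        max_results=int(os.environ.get("MAX_RESULTS", "5")),
        chunk_size=int(os.environ.get("CHUNK_SIZE", "1024")),
        chunk_overlap=int(os.environ.get("CHUNK_OVERLAP", "200")),
    )
```

In practice the server loads the .env file first (e.g. via python-dotenv) so these variables are present in the process environment.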

Usage

1. Index Documentation

Before using the server, index the datapizza-ai documentation:

python -m datapizza_mcp.indexer

To force re-indexing (clears existing data):

python -m datapizza_mcp.indexer --force
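
The CHUNK_SIZE and CHUNK_OVERLAP settings suggest a sliding-window split of each document. A minimal character-based sketch is shown below; the real indexer may split on document structure (headings, paragraphs) rather than raw characters.

```python
# Sliding-window chunker: consecutive chunks share `overlap` characters,
# so context spanning a chunk boundary is not lost at retrieval time.
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 200) -> list[str]:
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# A 2500-character document with the default settings yields 4 chunks.
doc = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(doc, chunk_size=1024, overlap=200)
```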

2. Start the MCP Server

python -m datapizza_mcp.server

Or use the provided Windows batch script:

../run_datapizza.bat

3. Query the Documentation

The server exposes a query_datapizza tool that can be called by MCP clients:

# Example query (assumes `client` is an initialized MCP client session)
result = await client.call_tool("query_datapizza", {
    "query": "how to create an agent with OpenAI",
    "max_results": 5,
})

MCP Tools and Resources

Tools

  • query_datapizza: Search datapizza-ai documentation
    • query (string): Natural language search query
    • max_results (int, optional): Maximum number of results (default: 5)

Resources

  • datapizza://status: System status and configuration information
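
Under the hood, an MCP client invokes a tool with a JSON-RPC tools/call request. The snippet below builds the message shape defined by the MCP specification for the query_datapizza tool; client libraries construct this for you, so it is shown only to illustrate the protocol.

```python
# The JSON-RPC 2.0 message an MCP client sends to call query_datapizza.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_datapizza",
        "arguments": {
            "query": "how to create an agent with OpenAI",
            "max_results": 5,
        },
    },
}
print(json.dumps(request, indent=2))
```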

Development

Code Quality Tools

# Format code
black src/

# Lint code
ruff check src/
ruff check src/ --fix  # Auto-fix issues

# Type checking
mypy src/

# Run tests
pytest

Project Structure

datapizza-mcp-server/
├── src/datapizza_mcp/
│   ├── __init__.py          # Package exports
│   ├── config.py            # Configuration management
│   ├── server.py            # MCP server implementation
│   ├── indexer.py           # Documentation indexing
│   └── retriever.py         # RAG retrieval engine
├── pyproject.toml           # Package configuration
├── .env                     # Environment variables
└── README.md               # This file

Dependencies

Core Dependencies

  • mcp: Model Context Protocol framework
  • datapizza-ai-core: Core datapizza-ai functionality
  • datapizza-ai-embedders-openai: OpenAI embedding integration
  • datapizza-ai-vectorstores-qdrant: Qdrant vector store integration
  • openai: OpenAI API client
  • qdrant-client: Qdrant database client
  • requests: HTTP client for GitHub API
  • python-dotenv: Environment variable management

Development Dependencies

  • pytest: Testing framework
  • black: Code formatter
  • ruff: Linter and code style checker
  • mypy: Static type checker

Troubleshooting

Common Issues

  1. Authentication Errors

    • Verify OPENAI_API_KEY is set correctly
    • Check Qdrant Cloud credentials (QDRANT_URL and QDRANT_API_KEY)
  2. Empty Search Results

    • Ensure documentation is indexed: python -m datapizza_mcp.indexer
    • Check system status: query the datapizza://status resource
  3. Connection Issues

    • Verify internet connectivity for GitHub and Qdrant Cloud access
    • Check firewall settings for outbound HTTPS connections

Debugging

Enable debug logging by setting LOG_LEVEL=DEBUG in your .env file.
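
Assuming the server maps LOG_LEVEL onto Python's standard logging module (a sketch; the actual setup in the package may differ), the wiring looks like:

```python
# Translate the LOG_LEVEL environment variable into a logging level,
# falling back to INFO for unknown values.
import logging
import os

level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
logging.basicConfig(level=getattr(logging, level_name, logging.INFO))
logging.getLogger("datapizza_mcp").debug("visible only when LOG_LEVEL=DEBUG")
```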

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes following the code style guidelines
  4. Run the full test suite and code quality checks
  5. Submit a pull request

License

This project is licensed under the MIT License. See the LICENSE file for details.

Support

For issues and questions, please open an issue on the GitHub repository.
