Skip to main content

Add your description here

Project description

Generic Data Analytics MCP Server

A MCP (Model Context Protocol) server that transforms any structured dataset (JSON/CSV) into intelligent, AI-guided analytics workflows. This server demonstrates advanced modular architecture with dataset-agnostic design - it automatically adapts to ANY data without hardcoded schemas.

๐Ÿš€ Quick Setup

  1. Configure for your MCP client:

    cp .mcp.json.sample .mcp.json
    # Edit .mcp.json and update paths to your system
    
  2. Find your UV path and update configuration:

    which uv
    # Example output: /Users/yourusername/.local/bin/uv
    
    pwd  
    # Example output: /Users/yourusername/path/to/quick-data-mcp
    
  3. Test the server:

    uv run python main.py
    

๐Ÿš€ Getting Started in Claude Code

Once your MCP server is configured and running, start with this slash command in Claude Code to get oriented:

/quick-data:list_mcp_assets_prompt

This will show you all available tools, resources, and prompts with descriptions - your complete toolkit for data analytics!

๐Ÿš€ What Makes This Special

Universal Data Analytics

  • Works with ANY JSON/CSV dataset - no schema definition required
  • Automatic column type detection - numerical, categorical, temporal, identifier
  • AI-powered analysis suggestions - recommends analyses based on your data characteristics
  • Adaptive conversation prompts - guides users through analytics workflows using actual column names

Tested Architecture

  • 32 Analytics Tools (20 analytics + 12 resource mirrors) for comprehensive data analysis
  • 12 Dynamic Resources providing real-time data context
  • 7 Adaptive Prompts for AI-guided exploration
  • 100% Test Coverage (103 tests passing)
  • Universal MCP Client Compatibility (supports tool-only clients)
  • Memory optimization with usage monitoring

๐Ÿ“Š Complete Capabilities

๐Ÿ”ง Analytics Tools (32 total)

Data Loading & Management

  • load_dataset(file_path, dataset_name, sample_size?) - Load any JSON/CSV with automatic schema discovery
  • list_loaded_datasets() - Show all datasets currently in memory with statistics
  • clear_dataset(dataset_name) - Remove specific dataset from memory
  • clear_all_datasets() - Clear all datasets from memory
  • get_dataset_info(dataset_name) - Get comprehensive dataset information

Core Analytics

  • segment_by_column(dataset_name, column_name, method?, top_n?) - Generic segmentation on any categorical column
  • find_correlations(dataset_name, columns?, threshold?) - Correlation analysis with configurable thresholds
  • analyze_distributions(dataset_name, column_name) - Statistical distribution analysis for any column
  • detect_outliers(dataset_name, columns?, method) - Outlier detection (IQR, Z-score methods)
  • time_series_analysis(dataset_name, date_column, value_column, frequency?) - Temporal analysis with trend detection

Advanced Analytics

  • validate_data_quality(dataset_name) - Comprehensive data quality assessment (0-100 scoring)
  • compare_datasets(dataset_a, dataset_b, common_columns?) - Multi-dataset comparison analysis
  • merge_datasets(dataset_configs, join_strategy?) - Join datasets with flexible strategies
  • calculate_feature_importance(dataset_name, target_column, feature_columns?) - ML feature importance
  • memory_optimization_report(dataset_name) - Performance analysis and optimization suggestions

Visualization & Export

  • create_chart(dataset_name, chart_type, x_column, y_column?, groupby_column?, title?, save_path?) - Generate charts (bar, scatter, histogram, line, box)
  • generate_dashboard(dataset_name, chart_configs) - Multi-chart interactive dashboards
  • export_insights(dataset_name, format?, include_charts?) - Export in JSON, CSV, HTML formats

AI-Powered Assistance

  • suggest_analysis(dataset_name) - AI recommendations based on data characteristics
  • execute_custom_analytics_code(dataset_name, python_code) - Execute custom Python code against datasets with full pandas/numpy/plotly support

๐Ÿ”„ Resource Mirror Tools (Tool-Only Client Support)

For MCP clients that don't support resources, all resource functionality is available through mirror tools:

Dataset Context Tools (4)

  • resource_datasets_loaded() - List all loaded datasets (mirrors datasets://loaded)
  • resource_datasets_schema(dataset_name) - Get dataset schema (mirrors datasets://{name}/schema)
  • resource_datasets_summary(dataset_name) - Statistical summary (mirrors datasets://{name}/summary)
  • resource_datasets_sample(dataset_name) - Sample data rows (mirrors datasets://{name}/sample)

Analytics Intelligence Tools (5)

  • resource_analytics_current_dataset() - Currently active dataset (mirrors analytics://current_dataset)
  • resource_analytics_available_analyses() - Applicable analysis types (mirrors analytics://available_analyses)
  • resource_analytics_column_types() - Column classifications (mirrors analytics://column_types)
  • resource_analytics_suggested_insights() - AI recommendations (mirrors analytics://suggested_insights)
  • resource_analytics_memory_usage() - Memory monitoring (mirrors analytics://memory_usage)

System Tools (3)

  • resource_config_server() - Server configuration (mirrors config://server)
  • resource_users_profile(user_id) - User profile access (mirrors users://{user_id}/profile)
  • resource_system_status() - System health info (mirrors system://status)

๐Ÿ“š Dynamic Resources (12 total)

Dataset Context Resources

  • datasets://loaded - Real-time inventory of all loaded datasets
  • datasets://{dataset_name}/schema - Dynamic schema with column classification
  • datasets://{dataset_name}/summary - Statistical summary (pandas.describe() equivalent)
  • datasets://{dataset_name}/sample - Sample rows for data preview

Analytics Intelligence Resources

  • analytics://current_dataset - Currently active dataset context
  • analytics://available_analyses - Applicable analysis types for current data
  • analytics://column_types - Column role classification (numerical, categorical, temporal, identifier)
  • analytics://suggested_insights - AI-generated analysis recommendations
  • analytics://memory_usage - Real-time memory monitoring

System Resources (Legacy Compatibility)

  • config://server - Server configuration information
  • users://{user_id}/profile - User profile access by ID
  • system://status - System health and status information

๐Ÿ’ฌ Adaptive Prompts (7 total)

Data Exploration Prompts

  • dataset_first_look(dataset_name) - Personalized initial exploration guide based on actual data structure
  • segmentation_workshop(dataset_name) - Interactive segmentation strategy using real column names
  • data_quality_assessment(dataset_name) - Systematic quality review with specific recommendations

Analysis Workflow Prompts

  • correlation_investigation(dataset_name) - Guided correlation analysis workflow
  • pattern_discovery_session(dataset_name) - Open-ended pattern mining conversation

Business Intelligence Prompts

  • insight_generation_workshop(dataset_name, business_context?) - Business insight generation with domain context
  • dashboard_design_consultation(dataset_name, audience?) - Audience-specific dashboard planning

๐Ÿ—๏ธ Project Structure

quick-data-mcp/
โ”œโ”€โ”€ .mcp.json                      # Ready-to-use MCP client configuration
โ”œโ”€โ”€ data/                          # Sample datasets
โ”‚   โ”œโ”€โ”€ ecommerce_orders.json      # E-commerce transaction data
โ”‚   โ”œโ”€โ”€ employee_survey.csv        # HR analytics dataset
โ”‚   โ”œโ”€โ”€ product_performance.csv    # Product metrics dataset
โ”‚   โ””โ”€โ”€ README.md                  # Data documentation
โ”œโ”€โ”€ src/mcp_server/               # Core server implementation
โ”‚   โ”œโ”€โ”€ server.py                 # Main server with 31 tools, 12 resources, 7 prompts
โ”‚   โ”œโ”€โ”€ tools/                    # Tool implementations
โ”‚   โ”‚   โ”œโ”€โ”€ pandas_tools.py       # Pandas-based tools grouped module
โ”‚   โ”‚   โ”œโ”€โ”€ __init__.py           # All tools (32 total)
โ”‚   โ”‚   โ””โ”€โ”€ [individual_tool_files.py]  # Individual tool implementations
โ”‚   โ”œโ”€โ”€ resources/                # Resource handlers
โ”‚   โ”‚   โ””โ”€โ”€ data_resources.py     # Dynamic data access (12 resources)
โ”‚   โ”œโ”€โ”€ prompts/                  # Conversation starters
โ”‚   โ”‚   โ”œโ”€โ”€ __init__.py           # All prompts (9 total)
โ”‚   โ”‚   โ””โ”€โ”€ [individual_prompt_files.py]  # Individual prompt implementations
โ”‚   โ”œโ”€โ”€ models/                   # Data models and schemas
โ”‚   โ”‚   โ””โ”€โ”€ schemas.py            # DatasetManager, ColumnInfo, DatasetSchema
โ”‚   โ””โ”€โ”€ config/                   # Configuration
โ”‚       โ””โ”€โ”€ settings.py           # Server settings
โ”œโ”€โ”€ tests/                            # Comprehensive test suite (130 tests)
โ”‚   โ”œโ”€โ”€ test_pandas_tools.py              # Pandas tools tests
โ”‚   โ”œโ”€โ”€ test_analytics_tools.py           # Advanced tools tests
โ”‚   โ”œโ”€โ”€ test_analytics_prompts.py         # Prompts functionality tests
โ”‚   โ”œโ”€โ”€ test_data_resources.py            # Resource access tests
โ”‚   โ”œโ”€โ”€ test_resource_mirror_tools.py     # Resource mirror tool tests
โ”‚   โ””โ”€โ”€ test_custom_analytics_code.py     # Custom code execution tests
โ”œโ”€โ”€ outputs/                      # Generated files (excluded from git)
โ”‚   โ”œโ”€โ”€ charts/                   # Generated HTML charts and dashboards
โ”‚   โ””โ”€โ”€ reports/                  # Exported insights and reports
โ””โ”€โ”€ main.py                       # Entry point

๐Ÿ“ฆ Dependencies

Core Analytics Stack

  • mcp[cli]>=1.9.2 - Official MCP Python SDK
  • pandas>=2.2.3 - Data manipulation and analysis
  • plotly>=6.1.2 - Interactive visualizations

Testing & Development

  • pytest>=8.3.5 - Testing framework
  • pytest-asyncio>=1.0.0 - Async testing support

๐Ÿš€ Usage

MCP Client Integration

Once configured, your MCP client can access all 32 tools, 12 resources, and 9 prompts for comprehensive data analytics.

Example Analytics Workflow

# 1. Load any dataset
await load_dataset("data/ecommerce_orders.json", "sales")

# 2. Get AI-powered first look guidance
await dataset_first_look("sales")
# โ†’ Returns personalized exploration guide with actual column names

# 3. Automatic analysis suggestions
await suggest_analysis("sales")
# โ†’ AI recommends: correlation_analysis, segmentation_analysis based on detected columns

# 4. Perform suggested analyses
await find_correlations("sales")
# โ†’ Finds relationships between numerical columns

await segment_by_column("sales", "customer_segment")
# โ†’ Groups data and calculates statistics automatically

# 5. Create adaptive visualizations
await create_chart("sales", "bar", "region", "order_value")
# โ†’ Generates interactive plotly charts

# 6. Comprehensive data quality assessment
await validate_data_quality("sales")
# โ†’ Returns 0-100 quality score with detailed recommendations

Advanced Multi-Dataset Analysis

# Load multiple datasets
await load_dataset("data/employee_survey.csv", "hr")
await load_dataset("data/product_performance.csv", "products")

# Compare datasets
await compare_datasets("sales", "products", ["category"])

# Generate business insights
await insight_generation_workshop("sales", "e-commerce")

# Create executive dashboard
await dashboard_design_consultation("hr", "executive")

๐Ÿ”ฅ Custom Analytics Code Execution

Execute any Python code against your datasets with full pandas/numpy/plotly support:

# Custom analysis that goes beyond predefined tools
output = await execute_custom_analytics_code("sales", """
print("=== Custom Customer Segmentation ===")

# Advanced customer scoring algorithm
customer_scores = df.groupby('customer_id').agg({
    'order_value': ['sum', 'mean', 'count'],
    'date': ['min', 'max']
}).round(2)

# Flatten column names
customer_scores.columns = ['total_spent', 'avg_order', 'order_count', 'first_order', 'last_order']

# Calculate customer lifetime (days)
customer_scores['lifetime_days'] = (
    pd.to_datetime(customer_scores['last_order']) - 
    pd.to_datetime(customer_scores['first_order'])
).dt.days

# Custom scoring formula
customer_scores['loyalty_score'] = (
    customer_scores['total_spent'] * 0.4 + 
    customer_scores['order_count'] * 50 + 
    customer_scores['lifetime_days'] * 0.1
).round(1)

# Segment customers
def segment_customer(score):
    if score >= 1000: return 'VIP'
    elif score >= 500: return 'Gold'
    elif score >= 200: return 'Silver'
    else: return 'Bronze'

customer_scores['segment'] = customer_scores['loyalty_score'].apply(segment_customer)

print("Customer Segments:")
print(customer_scores['segment'].value_counts())

print("\\nTop 5 Customers:")
top_customers = customer_scores.sort_values('loyalty_score', ascending=False).head()
for idx, (customer_id, data) in enumerate(top_customers.iterrows(), 1):
    print(f"{idx}. {customer_id}: {data['segment']} (Score: {data['loyalty_score']})")
""")

# Agents can iterate on code based on output
if "ERROR:" in output:
    # Fix the code and try again
    pass
else:
    print("Analysis completed successfully!")

๐Ÿ”„ Resource Mirror Tools Usage (Tool-Only Clients)

For MCP clients that don't support resources, use the resource mirror tools for identical functionality:

# Instead of accessing resource: datasets://loaded
datasets = await resource_datasets_loaded()
# โ†’ Returns: {"datasets": [...], "total_datasets": 2, "status": "loaded"}

# Instead of accessing resource: datasets://sales/schema  
schema = await resource_datasets_schema("sales")
# โ†’ Returns: {"dataset_name": "sales", "columns_by_type": {...}}

# Instead of accessing resource: analytics://memory_usage
memory = await resource_analytics_memory_usage()
# โ†’ Returns: {"datasets": [...], "total_memory_mb": 15.2}

# Instead of accessing resource: config://server
config = await resource_config_server()
# โ†’ Returns: {"name": "Generic Data Analytics MCP", "features": [...]}

# All 12 resource mirror tools provide identical data to their resource counterparts
# Perfect for tool-only MCP clients or when resource support is unavailable

๐Ÿงช Testing

# Run all 130 tests
uv run python -m pytest tests/ -v

# Test specific functionality
uv run python -m pytest tests/test_pandas_tools.py -v              # Pandas tools
uv run python -m pytest tests/test_analytics_tools.py -v           # Advanced tools
uv run python -m pytest tests/test_analytics_prompts.py -v         # Prompts functionality
uv run python -m pytest tests/test_resource_mirror_tools.py -v     # Resource mirror tools
uv run python -m pytest tests/test_custom_analytics_code.py -v     # Custom code execution

# Quick test run
uv run python -m pytest tests/ -q
# Expected: 130 passed

๐Ÿ”ง MCP Client Configuration

Quick Setup (Recommended)

This project includes a sample configuration that you can customize:

  1. Copy the sample configuration:

    cp .mcp.json.sample .mcp.json
    
  2. Update paths in .mcp.json to match your system:

    {
      "mcpServers": {
        "quick-data": {
          "command": "/path/to/uv",
          "args": [
            "--directory",
            "/path/to/your/quick-data-mcp",
            "run",
            "python",
            "main.py"
          ],
          "env": {
            "LOG_LEVEL": "INFO"
          }
        }
      }
    }
    
  3. Find your UV path:

    which uv
    # Example output: /Users/yourusername/.local/bin/uv
    
  4. Get absolute path to this directory:

    pwd
    # Example output: /Users/yourusername/path/to/quick-data-mcp
    
  5. Update .mcp.json with your actual paths:

    • Replace /path/to/uv with your UV path
    • Replace /path/to/your/quick-data-mcp with your absolute directory path
  6. Copy to your MCP client or reference directly if supported

Option 2: Manual Configuration

If you prefer to configure manually, add to your MCP client configuration:

{
  "mcpServers": {
    "quick-data": {
      "command": "/path/to/uv",
      "args": [
        "--directory", 
        "/absolute/path/to/quick-data-mcp",
        "run", 
        "python", 
        "main.py"
      ],
      "env": {
        "LOG_LEVEL": "INFO"
      }
    }
  }
}

Important: Replace the placeholder paths with your actual system paths.

Configuration Notes

  • Use absolute paths for reliability across different working directories
  • --directory flag ensures UV operates in the correct project directory
  • .mcp.json is gitignored - each user needs their own copy with local paths
  • Use .mcp.json.sample as a template to avoid path conflicts
  • Environment variables can be customized per deployment

Environment Variables

  • LOG_LEVEL - Logging level (default: INFO)
  • SERVER_NAME - Server name (default: "Generic Data Analytics MCP")

๐Ÿš€ Getting Started in Claude Code

Once your MCP server is configured and running, start with this slash command in Claude Code to get oriented:

/quick-data:list_mcp_assets_prompt

This will show you all available tools, resources, and prompts with descriptions - your complete toolkit for data analytics!

๐Ÿ’ก Sample Datasets Included

E-commerce Orders (data/ecommerce_orders.json)

  • 15 orders with customer segments, regions, product categories
  • Use cases: Revenue analysis, customer segmentation, regional performance

Employee Survey (data/employee_survey.csv)

  • 25 employees with satisfaction scores, departments, tenure
  • Use cases: HR analytics, satisfaction analysis, department comparisons

Product Performance (data/product_performance.csv)

  • 20 products with sales, suppliers, ratings, launch dates
  • Use cases: Product analysis, supplier performance, market trends

๐ŸŽฏ Architecture Benefits

Dataset Agnosticism

  • Works with ANY structured data - no hardcoded schemas required
  • Intelligent column detection - automatically classifies data types
  • Zero configuration - drop in data files and start analyzing immediately

Modular Excellence

  • Clean separation - tools, resources, prompts, and models organized logically
  • Independent testing - each component tested in isolation
  • Easy extension - add new analytics without affecting existing functionality

Production Ready

  • Comprehensive error handling - graceful failures with actionable messages
  • Memory optimization - efficient pandas operations with usage monitoring
  • Performance monitoring - built-in analytics for large datasets

AI Integration

  • Smart recommendations - analysis suggestions based on data characteristics
  • Context-aware prompts - conversations that reference real column names
  • Adaptive workflows - tools that adjust behavior based on data types

๐Ÿ”ฎ Extension Examples

Adding Custom Analytics

# Add to tools/__init__.py or individual tool file
@staticmethod
async def custom_analysis(dataset_name: str, parameters: dict) -> dict:
    """Your custom analysis function."""
    df = DatasetManager.get_dataset(dataset_name)
    # Your analysis logic here
    return {"analysis": "results"}

# Register in server.py
@mcp.tool()
async def custom_analysis(dataset_name: str, parameters: dict) -> dict:
    return await tools.custom_analysis(dataset_name, parameters)

Adding Domain-Specific Prompts

# Add to prompts/__init__.py
@staticmethod
async def financial_analysis_workshop(dataset_name: str) -> str:
    """Guide financial analysis workflows."""
    # Custom financial analysis guidance
    return prompt_text

# Register in server.py  
@mcp.prompt()
async def financial_analysis_workshop(dataset_name: str) -> str:
    return await prompts.financial_analysis_workshop(dataset_name)

๐Ÿ† Success Metrics

  • โœ… Comprehensive Test Coverage - 130 tests passing
  • โœ… Universal Data Compatibility - Works with any JSON/CSV structure
  • โœ… Universal MCP Client Compatibility - Supports both resource-enabled and tool-only clients
  • โœ… Custom Code Execution - Full Python analytics capabilities with pandas/numpy/plotly
  • โœ… AI Integration - Smart recommendations and adaptive conversations
  • โœ… Performance Optimized - Memory-efficient operations with monitoring

This MCP server transforms the concept of data analytics from rigid, schema-dependent tools into a flexible, AI-guided platform that adapts to any dataset while providing expert-level guidance through conversational interfaces.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

iflow_mcp_disler_quick_data_mcp-0.1.0.tar.gz (54.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

iflow_mcp_disler_quick_data_mcp-0.1.0-py3-none-any.whl (68.6 kB view details)

Uploaded Python 3

File details

Details for the file iflow_mcp_disler_quick_data_mcp-0.1.0.tar.gz.

File metadata

  • Download URL: iflow_mcp_disler_quick_data_mcp-0.1.0.tar.gz
  • Upload date:
  • Size: 54.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.2 {"installer":{"name":"uv","version":"0.10.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for iflow_mcp_disler_quick_data_mcp-0.1.0.tar.gz
Algorithm Hash digest
SHA256 a18584fc515f7fd3f919aa0933f64a0d39810843b4817fcca74f9457db866658
MD5 c47f4a6791aee84a00628937244bd0fd
BLAKE2b-256 07a8b4751016fd487f62be3e8ba772c6eda56116a2c3b925de5c42539b2afb1c

See more details on using hashes here.

File details

Details for the file iflow_mcp_disler_quick_data_mcp-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: iflow_mcp_disler_quick_data_mcp-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 68.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.2 {"installer":{"name":"uv","version":"0.10.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Debian GNU/Linux","version":"13","id":"trixie","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for iflow_mcp_disler_quick_data_mcp-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3cd3ccef85d69654d9d2b0737d81eeff6e51200782bdfb5709597451ba8344db
MD5 b99437323a70a7ac892ae8b4152d295a
BLAKE2b-256 1266a7d8aefa78c1abcfd87b4a7dc6a792175a572f3c53a2408af4f3416ea01c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page