Skip to main content

MCP server for interacting with Hugging Face dataset viewer API, providing dataset browsing, filtering, and statistics capabilities

Project description

Dataset Viewer MCP Server

An MCP server for interacting with the Hugging Face Dataset Viewer API, providing capabilities to browse and analyze datasets hosted on the Hugging Face Hub.

Features

Resources

  • Uses dataset:// URI scheme for accessing Hugging Face datasets
  • Supports dataset configurations and splits
  • Provides paginated access to dataset contents
  • Handles authentication for private datasets
  • Supports searching and filtering dataset contents
  • Provides dataset statistics and analysis

Tools

The server provides the following tools:

  1. validate

    • Check if a dataset exists and is accessible
    • Parameters:
      • dataset: Dataset identifier (e.g. 'stanfordnlp/imdb')
      • auth_token (optional): For private datasets
  2. get_info

    • Get detailed information about a dataset
    • Parameters:
      • dataset: Dataset identifier
      • auth_token (optional): For private datasets
  3. get_rows

    • Get paginated contents of a dataset
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • page (optional): Page number (0-based)
      • auth_token (optional): For private datasets
  4. get_first_rows

    • Get first rows from a dataset split
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • auth_token (optional): For private datasets
  5. get_statistics

    • Get statistics about a dataset split
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • auth_token (optional): For private datasets
  6. search_dataset

    • Search for text within a dataset
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • query: Text to search for
      • auth_token (optional): For private datasets
  7. filter

    • Filter rows using SQL-like conditions
    • Parameters:
      • dataset: Dataset identifier
      • config: Configuration name
      • split: Split name
      • where: SQL WHERE clause (e.g. "score > 0.5")
      • orderby (optional): SQL ORDER BY clause
      • page (optional): Page number (0-based)
      • auth_token (optional): For private datasets
  8. get_parquet

    • Download entire dataset in Parquet format
    • Parameters:
      • dataset: Dataset identifier
      • auth_token (optional): For private datasets

Installation

Prerequisites

  • Python 3.12 or higher
  • uv - Fast Python package installer and resolver

Setup

  1. Clone the repository:
git clone https://github.com/privetin/dataset-viewer.git
cd dataset-viewer
  1. Create a virtual environment and install:
# Create virtual environment
uv venv

# Activate virtual environment
# On Unix:
source .venv/bin/activate
# On Windows:
.venv\Scripts\activate

# Install in development mode
uv add -e .

Configuration

Environment Variables

  • HUGGINGFACE_TOKEN: Your Hugging Face API token for accessing private datasets

Claude Desktop Integration

Add the following to your Claude Desktop config file:

On Windows: %APPDATA%\Claude\claude_desktop_config.json

On MacOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "dataset-viewer": {
      "command": "uv",
      "args": [
        "--directory",
        "parent_to_repo/dataset-viewer",
        "run",
        "dataset-viewer"
      ]
    }
  }
}

License

MIT License - see LICENSE for details

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

iflow_mcp_dataset_viewer-0.1.0.tar.gz (13.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

iflow_mcp_dataset_viewer-0.1.0-py3-none-any.whl (8.2 kB view details)

Uploaded Python 3

File details

Details for the file iflow_mcp_dataset_viewer-0.1.0.tar.gz.

File metadata

File hashes

Hashes for iflow_mcp_dataset_viewer-0.1.0.tar.gz
Algorithm Hash digest
SHA256 4abfd00903b739e0f5833c521fe689e0995369f663f1d6710c18eb7857dbcf50
MD5 ad318632065ff5c9f5f520ffe8f125ca
BLAKE2b-256 9fc5f2d559802b20b07b13955eea63f8462fcf523aff457dfc54dfb1b13f0853

See more details on using hashes here.

File details

Details for the file iflow_mcp_dataset_viewer-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for iflow_mcp_dataset_viewer-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6ca4b44701a14e84940db74aaccd7578a68dbe74febb1c671bc562f37c4599ed
MD5 93f81e2bbaf15aefbf8af56c6c6b9c82
BLAKE2b-256 fb8ecf2cada108013bbe4b0a981ce80deee68a94a2ef8890393b8a1c96e9f715

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page