
Crawl4AI MCP Server

A Model Context Protocol server for web crawling using the Crawl4ai library.

📋 Overview

Crawl4AI MCP Server provides a set of tools and prompts for web crawling through the Model Context Protocol (MCP). It allows AI assistants to autonomously crawl websites, extract content, and save information as Markdown files.

✨ Features

  • 🕸️ Single Page Crawling: Extract content from a single webpage in Markdown format
  • 🌐 Deep Website Crawling: Crawl multiple pages of a website with configurable depth and limits
  • 🔍 Structured Data Extraction: Use CSS selectors to extract specific structured data from webpages
  • 💾 Markdown Export: Save crawled content directly as Markdown files

🚀 Installation

```
pip install crawl4ai-mcp-server
```

🛠️ Usage

Command Line

Run the server directly from the command line:

```
crawl4ai-mcp
```

Python API

```python
import asyncio
from crawl4ai_mcp import serve

# Run the server
asyncio.run(serve())
```

📝 Available Tools

crawl_webpage

Crawls a single webpage and returns its content as Markdown.

Parameters:

  • url (string, required): URL to crawl
  • include_images (boolean, optional): Whether to include images in the result (default: true)
  • bypass_cache (boolean, optional): Whether to bypass cache (default: false)
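As a concrete illustration, a tool call for crawl_webpage carries these parameters as its arguments object. The parameter names below match the list above; the surrounding dict is only a sketch of an MCP tools/call payload, not a fixed wire format:

```python
# Illustrative MCP tools/call payload for crawl_webpage.
# Argument names come from the parameter list above; the wrapper
# dict shape is a sketch.
crawl_webpage_call = {
    "name": "crawl_webpage",
    "arguments": {
        "url": "https://example.com",  # required
        "include_images": True,        # optional, default: true
        "bypass_cache": False,         # optional, default: false
    },
}
```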

crawl_website

Crawls a website starting from the given URL, with specified depth and page limit.

Parameters:

  • url (string, required): Starting URL
  • max_depth (integer, optional): Maximum crawl depth (default: 1)
  • max_pages (integer, optional): Maximum number of pages to crawl (default: 5)
  • include_images (boolean, optional): Whether to include images (default: true)
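To make the two limits concrete: a deep crawl stops as soon as either bound is reached, whichever comes first. The breadth-first sketch below shows how max_depth and max_pages typically interact; it is a generic illustration, not the library's actual traversal code:

```python
from collections import deque

def simulate_crawl(start, links, max_depth=1, max_pages=5):
    """Generic BFS sketch of how max_depth and max_pages bound a crawl.
    Pages beyond max_depth are never enqueued; the loop stops once
    max_pages pages have been visited."""
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        visited.append(url)
        if depth < max_depth:
            for nxt in links.get(url, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return visited

# Tiny link graph: the start page links to three pages; one links further.
links = {"/": ["/a", "/b", "/c"], "/a": ["/a1", "/a2"]}
print(simulate_crawl("/", links, max_depth=1, max_pages=5))
# → ['/', '/a', '/b', '/c']  (depth-2 pages /a1, /a2 are never reached)
```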

extract_structured_data

Extracts structured data from a webpage using CSS selectors.

Parameters:

  • url (string, required): URL to extract data from
  • schema (object, optional): Schema defining what to extract
  • css_selector (string, optional): CSS selector to locate specific parts of the page (default: "body")
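A schema pairs field names with CSS selectors. The example below follows the style of Crawl4ai's CSS extraction schemas; treat the exact keys ("baseSelector", "fields", and so on) as an assumption about the underlying library and verify them against the version you run:

```python
# Illustrative extraction schema in the style of Crawl4ai's CSS
# extraction (key names are an assumption; check your library version).
schema = {
    "name": "articles",
    "baseSelector": "article.post",   # repeated element to extract from
    "fields": [
        {"name": "title", "selector": "h2", "type": "text"},
        {"name": "link", "selector": "a", "type": "attribute", "attribute": "href"},
    ],
}

# Tool-call arguments using the parameter names documented above.
extract_call = {
    "name": "extract_structured_data",
    "arguments": {
        "url": "https://example.com/blog",
        "schema": schema,
        "css_selector": "main",  # optional, default: "body"
    },
}
```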

save_as_markdown

Crawls a webpage and saves the content as a Markdown file.

Parameters:

  • url (string, required): URL to crawl
  • filename (string, required): Filename to save the Markdown
  • include_images (boolean, optional): Whether to include images (default: true)
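The server takes filename verbatim, so the caller decides what to name the file. One reasonable approach is to derive a slug from the URL; the helper below is hypothetical, not part of this package:

```python
import re
from urllib.parse import urlparse

def suggest_filename(url: str) -> str:
    """Hypothetical helper: derive a Markdown filename from a URL.
    save_as_markdown itself accepts any filename; this just produces
    a sensible default."""
    parsed = urlparse(url)
    slug = (parsed.netloc + parsed.path).strip("/").replace("/", "-")
    slug = re.sub(r"[^A-Za-z0-9._-]", "_", slug) or "page"
    return f"{slug}.md"

print(suggest_filename("https://example.com/docs/intro"))
# → example.com-docs-intro.md
```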

🔌 Available Prompts

crawl

Crawls a webpage and retrieves its content.

Arguments:

  • url (required): URL to crawl

save_page

Crawls a webpage and saves it as a Markdown file.

Arguments:

  • url (required): URL to crawl
  • filename (required): Filename to save the Markdown
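Prompts are invoked the same way as tools, just through MCP's prompts/get request instead of tools/call. A minimal sketch of the payload for save_page (argument names match the list above; the wrapper shape is illustrative):

```python
# Illustrative MCP prompts/get payload for save_page.
save_page_prompt = {
    "name": "save_page",
    "arguments": {
        "url": "https://example.com",  # required
        "filename": "example.md",      # required
    },
}
```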

🧩 Requirements

  • Python 3.8+
  • mcp>=1.0.0
  • crawl4ai
  • pydantic

📄 License

MIT License - see the LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request
