MCP server that extracts content from web pages for LLMs
MCP Web Extractor (by sadasiba)
mcp-web-extractor-sadasiba is an MCP (Model Context Protocol) server that extracts clean text content from web pages for use by LLMs such as Claude.
It fetches HTML from a given URL, parses it with BeautifulSoup, and returns only the readable text.
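The fetch, parse, and return-text flow described above can be sketched roughly as follows. This is not the package's own code: to stay dependency-free it swaps BeautifulSoup for the standard library's `html.parser`, and the names `fetch_html`, `TextExtractor`, and `extract_text` are hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and decode it (hypothetical helper, not the package's API)."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script, style, and similar elements."""

    SKIPPED_TAGS = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self) -> str:
        return "\n".join(self._chunks)


def extract_text(html: str) -> str:
    """Return the readable text of an HTML document, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p {color: red}</style></head><body>"
    "<h1>Title</h1><script>var x = 1;</script>"
    "<p>Hello &amp; welcome.</p></body></html>"
)
print(extract_text(sample))
```

For a live page, `extract_text(fetch_html(url))` would chain the two steps. With BeautifulSoup, the equivalent is typically to remove the `script` and `style` elements and then call `get_text()`, but how the package actually does this is not stated here.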
Features
- 🌐 Extracts readable text from a web page, given its URL
- 🧹 Strips away HTML tags, scripts, and styling
- ⚡ Simple command-line interface
- 🤝 Works seamlessly with Claude's MCP integration
Installation
You can install directly from PyPI:
pip install mcp-web-extractor-sadasiba
File details
Details for the file mcp_web_extractor_sadasiba-0.1.3.tar.gz.
File metadata
- Download URL: mcp_web_extractor_sadasiba-0.1.3.tar.gz
- Upload date:
- Size: 2.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8afdc99a42b26aea1336351b5a68aceab53ffd9c3a6e627da74e3b6d408cb5b9 |
| MD5 | 8fa4273c2bd1176897327212fa12691d |
| BLAKE2b-256 | 8bc0f413089ca02a83e774b9689cbd870e9ba0ea5e36c08d23f7d6ade0c9aeec |
File details
Details for the file mcp_web_extractor_sadasiba-0.1.3-py3-none-any.whl.
File metadata
- Download URL: mcp_web_extractor_sadasiba-0.1.3-py3-none-any.whl
- Upload date:
- Size: 2.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6c1e603c22284c0d976ad6da85610281c97490faf81090ff5975960bcbd8be94 |
| MD5 | f86a477206853394726b030f6d5a00f0 |
| BLAKE2b-256 | 9ce4a86a6154df89f4d1b388e0b659a2d714fb9c490bec5c9dee38f1d17b301d |
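If you download one of these files manually, you can check it against the SHA256 digest listed for it. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local path in the commented usage assumes the file was saved to the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example usage (assumes the wheel is in the current directory):
# matches_published_hash(
#     "mcp_web_extractor_sadasiba-0.1.3-py3-none-any.whl",
#     "6c1e603c22284c0d976ad6da85610281c97490faf81090ff5975960bcbd8be94",
# )
```

A `False` result means the local file differs from the one the digest was published for and should not be installed.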