Open-source MCP Server for B2B lead intelligence extraction. Works with Claude Desktop, Cursor, and any MCP client.

LeadsClean MCP Server

An open-source MCP server that extracts structured B2B lead intelligence from company websites. Point it at any URL — get back a clean JSON object with company summary, buying signals, inferred needs, and personalised icebreaker lines.

Built as a reference implementation for MCP tool development. Demonstrates multi-provider LLM routing, dual-transport MCP serving, GDPR compliance patterns, and API key management — patterns you can reuse in your own MCP servers.

Works with Claude Desktop, Cursor, and any MCP-compatible client.

Tools

Tool	Description
`extract_lead_intelligence`	Analyse a single company URL and return structured lead intel
`batch_extract_leads`	Analyse up to 20 URLs in parallel — designed for agent list-processing
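
Because batch_extract_leads accepts at most 20 URLs per call, a longer prospect list has to be split on the client side before it is sent. The following is a minimal sketch of that step; `chunk_urls` and `MAX_BATCH` are hypothetical names used for illustration and are not part of the package.

```python
from typing import Iterator, Sequence

MAX_BATCH = 20  # per-call limit stated in the Tools table


def chunk_urls(urls: Sequence[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of `urls`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])


if __name__ == "__main__":
    urls = [f"https://example.com/company-{i}" for i in range(45)]
    # 45 URLs split into batches of 20, 20 and 5
    print([len(batch) for batch in chunk_urls(urls)])
```

Each yielded batch can then be passed to one batch_extract_leads call.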

Output schema

{
  "company_name": "Acme Hotels Group",
  "core_business_summary": "Boutique hotel chain with 12 properties across Europe.",
  "product_category_match": "Strong match — hotel groups purchase furniture in bulk for room refits.",
  "recent_company_trigger": "Announced expansion to 3 new cities in Q1 2026, adding 400+ rooms.",
  "inferred_business_need": "Bulk furnishing for new hotel rooms on tight fit-out timelines.",
  "icebreaker_hook_business": "Running 12 properties across Europe is impressive — furnishing them at scale is where we help.",
  "icebreaker_hook_news": "Saw the Q1 expansion news — we help hotel groups source wholesale beds and sofas fast.",
  "data_provenance": {
    "source_url": "https://acmehotels.com",
    "source_type": "public_website",
    "collection_method": "jina_reader_public_fetch",
    "contains_pii": false,
    "gdpr_basis": "legitimate_interest",
    "gdpr_notes": "Extracted solely from publicly available company web pages. No personal data collected. Compliant with GDPR Art. 6(1)(f)."
  }
}

Every response includes data_provenance — a machine-readable GDPR metadata block indicating data source, PII status, and legal basis.
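
An agent consuming these responses may want to check the shape before acting on it. The sketch below assumes that every field in the example above is always present, which the example suggests but does not guarantee; `missing_fields` and `is_safe_to_store` are hypothetical helpers, not part of the package.

```python
# Field names copied from the example response above.
TOP_LEVEL_FIELDS = (
    "company_name",
    "core_business_summary",
    "product_category_match",
    "recent_company_trigger",
    "inferred_business_need",
    "icebreaker_hook_business",
    "icebreaker_hook_news",
)
PROVENANCE_FIELDS = (
    "source_url",
    "source_type",
    "collection_method",
    "contains_pii",
    "gdpr_basis",
    "gdpr_notes",
)


def missing_fields(payload: dict) -> list[str]:
    """Return the names of expected fields that are absent from `payload`."""
    missing = [name for name in TOP_LEVEL_FIELDS if name not in payload]
    provenance = payload.get("data_provenance")
    if not isinstance(provenance, dict):
        missing.append("data_provenance")
    else:
        missing += [
            f"data_provenance.{name}"
            for name in PROVENANCE_FIELDS
            if name not in provenance
        ]
    return missing


def is_safe_to_store(payload: dict) -> bool:
    """True only if the payload is complete and its provenance reports no PII."""
    if missing_fields(payload):
        return False
    return payload["data_provenance"]["contains_pii"] is False
```

Checking contains_pii explicitly keeps the storage decision tied to the provenance block instead of to an assumption about the source page.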

Quick start

Prerequisites

Python 3.11+
An API key for at least one supported LLM provider (see Environment variables below)

Install

pip install mcp-leadsclean

Or clone and install from source:

git clone https://github.com/edition/leadsclean
cd leadsclean
pip install -e .

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "leadsclean": {
      "command": "mcp-leadsclean",
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Set the key for whichever provider(s) you use (see Environment variables).

Cursor

Add to your Cursor MCP config (~/.cursor/mcp.json):

{
  "mcpServers": {
    "leadsclean": {
      "command": "mcp-leadsclean",
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

HTTP transport (production agent pipelines)

For remote agents or multi-tenant deployments, run with Streamable HTTP transport:

OPENAI_API_KEY=sk-... mcp-leadsclean --transport http --port 8001

The server exposes a single MCP endpoint at http://localhost:8001/mcp.
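
Over this transport, a tool invocation is a JSON-RPC 2.0 message with the method `tools/call`, as defined by the MCP specification. The sketch below only builds that message. A real client must first complete the MCP `initialize` handshake, which an MCP client library normally handles for you. The argument names `target_url` and `seller_context` are borrowed from the REST example in this README and are an assumption for the MCP tool; `tool_call_message` is a hypothetical helper.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per session


def tool_call_message(tool: str, arguments: dict) -> dict:
    """Build the JSON-RPC body for an MCP `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


if __name__ == "__main__":
    message = tool_call_message(
        "extract_lead_intelligence",
        {
            # Assumed argument names, taken from the REST example
            "target_url": "https://acmehotels.com",
            "seller_context": "We sell wholesale hotel furniture.",
        },
    )
    print(json.dumps(message, indent=2))
```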

Demo mode

Try the server without an API key — useful for testing your agent pipeline or reviewing the output schema:

LEADSCLEAN_DEMO=1 mcp-leadsclean

All tool calls return a sanitised fixture response when LEADSCLEAN_DEMO=1 is set. The response includes "_demo": true so agents can detect and discard it.
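
A pipeline that runs against both demo and live servers can use that flag to keep fixture data out of downstream systems. A minimal sketch, with `is_demo` and `live_only` as hypothetical helper names:

```python
def is_demo(response: dict) -> bool:
    """True when the response carries the `_demo` marker set in demo mode."""
    return response.get("_demo") is True


def live_only(responses: list[dict]) -> list[dict]:
    """Drop fixture responses so only real extractions are kept."""
    return [response for response in responses if not is_demo(response)]
```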

Environment variables

The model parameter of a tool call selects the provider, which is inferred from the model-name prefix. Set the API key for each provider you use:

Variable	Required when	Model prefix	Description
`OPENAI_API_KEY`	Using OpenAI (default)	`gpt-*`, `o1-*`, `o3-*`	OpenAI API key
`ANTHROPIC_API_KEY`	Using Claude	`claude-*`	Anthropic API key
`DASHSCOPE_API_KEY`	Using Alibaba Qwen	`qwen-*`	Alibaba DashScope API key
`MINIMAX_API_KEY`	Using MiniMax	`abab*`, `minimax-*`	MiniMax API key
`LEADSCLEAN_DEMO`	—	—	Set to `1` to return fixture data without any LLM call

The default model is gpt-4o-mini (OpenAI). To switch provider, pass the desired model ID in the tool call — e.g. claude-3-5-haiku-20241022 for Anthropic, qwen-turbo for Alibaba.
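
The prefix rule in the table can be expressed as a small lookup, which is handy for checking your environment before starting the server. This is a sketch of the rule as documented here, not the routing code in core.py; `required_key` and `key_is_configured` are hypothetical names, and exact, case-sensitive prefix matching is an assumption.

```python
import os
from typing import Mapping

# (model-name prefixes, environment variable), as listed in the table above
PROVIDER_PREFIXES = (
    (("gpt-", "o1-", "o3-"), "OPENAI_API_KEY"),
    (("claude-",), "ANTHROPIC_API_KEY"),
    (("qwen-",), "DASHSCOPE_API_KEY"),
    (("abab", "minimax-"), "MINIMAX_API_KEY"),
)
DEFAULT_MODEL = "gpt-4o-mini"


def required_key(model: str = DEFAULT_MODEL) -> str:
    """Return the name of the environment variable that `model` needs."""
    for prefixes, env_var in PROVIDER_PREFIXES:
        if model.startswith(prefixes):  # str.startswith accepts a tuple
            return env_var
    raise ValueError(f"no provider known for model {model!r}")


def key_is_configured(
    model: str = DEFAULT_MODEL, env: Mapping[str, str] = os.environ
) -> bool:
    """True when the key required by `model` is set and non-empty in `env`."""
    return bool(env.get(required_key(model)))
```

For example, `required_key("qwen-turbo")` returns `"DASHSCOPE_API_KEY"`.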

REST API

A standard FastAPI REST endpoint is also available for non-MCP integrations:

uvicorn main:app --reload

curl -X POST http://localhost:8000/extract-leads \
  -H "Content-Type: application/json" \
  -d '{
    "target_url": "https://acmecorp.com",
    "seller_context": "We provide cloud HR software to mid-size logistics companies."
  }'
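
The same request can be made from Python using only the standard library. This sketch mirrors the curl call above and sends no authentication header because the curl example shows none; `build_request` and `extract_leads` are hypothetical helper names, and the response is assumed to be a JSON object.

```python
import json
import urllib.request

ENDPOINT = "http://localhost:8000/extract-leads"  # from the curl example


def build_request(
    target_url: str, seller_context: str, endpoint: str = ENDPOINT
) -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = json.dumps(
        {"target_url": target_url, "seller_context": seller_context}
    ).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_leads(target_url: str, seller_context: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response."""
    request = build_request(target_url, seller_context)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the REST API running locally, `extract_leads("https://acmecorp.com", "We provide cloud HR software to mid-size logistics companies.")` performs the call shown in the curl example.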

Reusable patterns

This project demonstrates several patterns worth extracting for your own MCP servers:

Pattern	Where	What it does
Multi-provider LLM routing	`core.py`	Dispatches to OpenAI / Anthropic / Qwen / MiniMax based on model name prefix
Dual-transport MCP serving	`mcp_server.py`	Same tool logic served over stdio (local) and HTTP (remote)
SSRF protection	`core.py`	Validates URLs against private IP ranges before external fetch
Prompt injection mitigation	`core.py`	XML boundary tags around user-controlled content in LLM prompts
API key hashing	`db.py`	SHA-256 hashing with prefix display — keys are never stored in plain text
Usage metering	`db.py` + `auth.py`	Per-key monthly quotas with auto-reset and atomic increment
GDPR provenance	`core.py`	Machine-readable compliance metadata on every response
Demo mode	`core.py` + `auth.py`	Full bypass of external services for pipeline testing
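
Three of these patterns are small enough to sketch with the standard library. The code below illustrates the general techniques and is not the code in core.py or db.py: all function names, the `page_content` tag, and the 8-character display prefix are assumptions made for the example.

```python
import hashlib
import ipaddress
import socket
from urllib.parse import urlsplit
from xml.sax.saxutils import escape


def is_public_url(url: str) -> bool:
    """SSRF check: accept only http(s) URLs whose host resolves to global IPs."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parts.hostname, None)
    except socket.gaierror:
        return False
    # Reject if any resolved address is private, loopback, link-local, etc.
    return all(ipaddress.ip_address(info[4][0]).is_global for info in infos)


def wrap_untrusted(content: str, tag: str = "page_content") -> str:
    """Prompt-injection mitigation: fence untrusted text in boundary tags.

    Escaping <, > and & stops the content from closing the tag early.
    """
    return f"<{tag}>\n{escape(content)}\n</{tag}>"


def hash_api_key(key: str, prefix_len: int = 8) -> tuple[str, str]:
    """Return (display prefix, SHA-256 hex digest); only these are stored."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return key[:prefix_len], digest
```

A check like `is_public_url` only validates the address seen at lookup time, so a production guard would also need to account for redirects and for DNS answers that change between the check and the fetch.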

Development

# Install dependencies
pip install -r requirements.txt

# Run MCP server (stdio)
python mcp_server.py

# Run MCP server (HTTP, port 8001)
python mcp_server.py --transport http

# Run REST API
uvicorn main:app --reload

How it works

1. Fetch: retrieves clean Markdown from the target URL via Jina Reader
2. Extract: passes the content to your chosen LLM (OpenAI, Anthropic Claude, Alibaba Qwen, or MiniMax) with a structured prompt
3. Return: outputs a JSON object matching the schema above

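LeadsClean does not store page content or extraction results. Content passes only through Jina Reader and the selected LLM provider before the result is returned to the caller.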
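
The three steps can be sketched as a single function with the network and LLM calls injected, which keeps the flow testable without any API key. This is an illustration under stated assumptions, not the project's implementation: the prompt wording, the `extract` signature, and the `fetch` and `complete` callables are hypothetical. The `https://r.jina.ai/` prefix is Jina Reader's public URL scheme.

```python
import json
from typing import Callable

JINA_READER_PREFIX = "https://r.jina.ai/"


def build_prompt(markdown: str, seller_context: str) -> str:
    """Assemble an illustrative prompt; the real prompt is not documented here."""
    return (
        "Return a JSON object describing this company as a sales lead.\n"
        f"Seller context: {seller_context}\n"
        f"<page_content>\n{markdown}\n</page_content>"
    )


def extract(
    target_url: str,
    seller_context: str,
    fetch: Callable[[str], str],
    complete: Callable[[str], str],
) -> dict:
    """Run fetch, extract and return; nothing is written to disk."""
    markdown = fetch(JINA_READER_PREFIX + target_url)        # 1. Fetch
    raw = complete(build_prompt(markdown, seller_context))   # 2. Extract
    result = json.loads(raw)                                 # 3. Return
    result["data_provenance"] = {
        "source_url": target_url,
        "collection_method": "jina_reader_public_fetch",
    }
    return result
```

Passing stub functions for `fetch` and `complete` exercises the whole flow offline, which is similar in spirit to what demo mode provides.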

Built with Claude

This project was developed with the assistance of Claude by Anthropic — an AI assistant used for code generation, architecture design, and documentation.

License

MIT

Release history

0.2.0 (this version), Mar 3, 2026
0.1.0, Mar 3, 2026

Download files

Source Distribution

mcp_leadsclean-0.2.0.tar.gz (24.5 kB), uploaded Mar 3, 2026, Source

Built Distribution

mcp_leadsclean-0.2.0-py3-none-any.whl (14.6 kB), uploaded Mar 3, 2026, Python 3

File details

Details for the file mcp_leadsclean-0.2.0.tar.gz.

File metadata

Download URL: mcp_leadsclean-0.2.0.tar.gz
Upload date: Mar 3, 2026
Size: 24.5 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.6

File hashes

Hashes for mcp_leadsclean-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`f331b961b09bda68a5fb5c3e77baef4ae4bd095f2aad8c9c5ceff7c0468174ec`
MD5	`d88965d4de7d3149f88f780acbb7bafa`
BLAKE2b-256	`098b6b8e48e963371881c94ece2409e17e7361aaa719f16ed2ef6ad19098102a`

File details

Details for the file mcp_leadsclean-0.2.0-py3-none-any.whl.

File metadata

Download URL: mcp_leadsclean-0.2.0-py3-none-any.whl
Upload date: Mar 3, 2026
Size: 14.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.6

File hashes

Hashes for mcp_leadsclean-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5f4c4ba95603dc759f20446311237f9a9533bccccfe8c59c27f55a16c1609c97`
MD5	`1d146c70df33c25ef539a3b61b7fd679`
BLAKE2b-256	`e2e073ea0c31655d204b7c07ba659ed8708d7b363a998a6e339bcdaf6c1d145d`
