A modern LinkedIn scraping library with CLI and MCP server
linkedin-spider
A modern LinkedIn scraping library with built-in anti-detection, available as a Python library, CLI tool, and MCP server.
MCP Server · Python Library · CLI · Docker
Features
- Profile search with advanced filters (location, industry, company, connection degree)
- Post search by keywords with engagement metrics and comments
- Profile scraping with experience, education, skills, and contact details
- Company scraping with industry, size, specialties, and more
- Connection management — retrieve and send connection requests
- Conversations — list threads, read message history, and send messages
- Proxy support — route traffic through HTTP or SOCKS5 proxies
- Anti-detection — human-like behavior simulation, stealth mode, session persistence
Installation
pip install linkedin-spider # library only
pip install linkedin-spider[cli] # library + CLI
pip install linkedin-spider[mcp] # library + MCP server
pip install linkedin-spider[all] # everything
Authentication
linkedin-spider supports two authentication methods. Sessions are persisted in a Chrome profile so you typically only authenticate once.
LinkedIn Cookie (Recommended)
- Log in to LinkedIn in your browser
- Open DevTools (F12) → Application → Cookies → linkedin.com
- Copy the li_at cookie value
scraper = LinkedinSpider(li_at_cookie="your_cookie_value")
Email & Password
scraper = LinkedinSpider(email="you@example.com", password="your_password")
For CLI and MCP, pass --cookie (or --email/--password) flags, or set the LINKEDIN_COOKIE (or LINKEDIN_EMAIL/LINKEDIN_PASSWORD) environment variables.
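When using the Python library directly, the same environment variables can be read by hand. The sketch below is one possible way to do that; the helper name `auth_kwargs_from_env` is hypothetical and not part of linkedin-spider, and preferring the cookie over email/password is an assumption based on the recommendation above.

```python
import os
from typing import Mapping, Optional


def auth_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build LinkedinSpider keyword arguments from environment variables.

    Hypothetical helper: the cookie is preferred when both methods are set.
    """
    env = os.environ if env is None else env

    cookie = env.get("LINKEDIN_COOKIE")
    if cookie:
        return {"li_at_cookie": cookie}

    email = env.get("LINKEDIN_EMAIL")
    password = env.get("LINKEDIN_PASSWORD")
    if email and password:
        return {"email": email, "password": password}

    raise RuntimeError(
        "Set LINKEDIN_COOKIE, or both LINKEDIN_EMAIL and LINKEDIN_PASSWORD"
    )


# Usage (requires linkedin-spider to be installed):
# from linkedin_spider import LinkedinSpider
# scraper = LinkedinSpider(**auth_kwargs_from_env())
```

Keeping credentials in the environment avoids hard-coding the cookie value in scripts.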
MCP Server
The MCP server exposes LinkedIn data to AI assistants like Claude.
Example prompts you can give Claude once connected:
Research the background of this candidate https://www.linkedin.com/in/johndoe
Find 10 product managers in San Francisco and summarize their experience
What has OpenAI been posting about recently? https://www.linkedin.com/company/openai
Show me my pending connection requests and summarize who they are
Send a message to John Doe saying "Thanks for connecting!"
The server provides 12 tools:
| Tool | Description |
|---|---|
| search_profiles | Search profiles with filters (location, industry, company, connections) |
| scrape_profile | Extract complete profile data from a URL |
| search_posts | Search posts by keywords with date filters |
| scrape_company | Get company details from a URL |
| scrape_incoming_connections | List pending connection requests received |
| scrape_outgoing_connections | List connection requests you've sent |
| send_connection_request | Send a connection request with optional note |
| scrape_conversations_list | List messaging conversations |
| scrape_conversation | Read messages from a conversation |
| send_message | Send a message in an existing or new conversation |
| get_session_status | Check if the browser session is active |
| reset_session | Close and reset the browser session |
Start the server
# stdio (default — for Claude Desktop, Claude Code)
linkedin-spider-mcp serve --cookie your_li_at_cookie_value
# SSE or HTTP (for remote clients)
linkedin-spider-mcp serve --transport sse --host 127.0.0.1 --port 8000
linkedin-spider-mcp serve --transport http --host 0.0.0.0 --port 9000
Or configure via environment variables in a .env file:
LINKEDIN_COOKIE=your_li_at_cookie_value
HEADLESS=true
PROXY_URL=http://host:port # optional
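The HEADLESS and PROXY_URL variables correspond to the headless and proxy options of ScraperConfig. If you want to reuse the same variables from your own Python code, a minimal sketch could look like the following. The helper name `config_kwargs_from_env` and the set of accepted truthy strings are assumptions, not the library's own parsing rules.

```python
import os
from typing import Mapping, Optional

# Assumption: these strings count as "true"; the server may accept others.
TRUTHY = {"1", "true", "yes", "on"}


def config_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Translate HEADLESS / PROXY_URL into ScraperConfig keyword arguments."""
    env = os.environ if env is None else env

    kwargs = {"headless": env.get("HEADLESS", "").strip().lower() in TRUTHY}

    proxy = env.get("PROXY_URL", "").strip()
    if proxy:
        # Only pass a proxy when one is actually configured.
        kwargs["proxy"] = proxy
    return kwargs


# Usage (requires linkedin-spider to be installed):
# from linkedin_spider import ScraperConfig
# config = ScraperConfig(**config_kwargs_from_env())
```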
Claude Desktop
Add to your Claude Desktop config file:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
With Docker (Recommended)
No local dependencies required — the Docker image includes Python, Chrome, and everything needed.
{
"mcpServers": {
"linkedin-spider": {
"command": "docker",
"args": [
"run",
"--rm",
"-i",
"-e",
"LINKEDIN_COOKIE=your_li_at_cookie_value",
"-e",
"HEADLESS=true",
"-e",
"TRANSPORT=stdio",
"vertexcoverlabs/linkedin-mcp"
]
}
}
}
Without Docker
Requires pip install linkedin-spider[mcp] and Chrome installed locally.
{
"mcpServers": {
"linkedin-spider": {
"command": "linkedin-spider-mcp",
"args": ["serve", "--cookie", "your_li_at_cookie_value"]
}
}
}
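If the config file already lists other MCP servers, the linkedin-spider entry has to be merged in rather than pasted over the file. The sketch below shows one way to script that with the standard library. The function names `add_mcp_server` and `update_config_file` are hypothetical, and the file path is whichever of the locations above applies to your platform.

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, command: str, args: list) -> dict:
    """Return a copy of config with one entry added under "mcpServers".

    Existing servers are preserved; an entry with the same name is replaced.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, command: str, args: list) -> None:
    """Read the JSON config at path (if present), merge the entry, write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    updated = add_mcp_server(existing, name, command, args)
    path.write_text(json.dumps(updated, indent=2))


# Usage:
# update_config_file(
#     Path.home() / ".config/Claude/claude_desktop_config.json",
#     "linkedin-spider",
#     "linkedin-spider-mcp",
#     ["serve", "--cookie", "your_li_at_cookie_value"],
# )
```

Restart Claude Desktop after changing the file so the new server is picked up.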
Claude Code
With Docker
claude mcp add linkedin-spider -- \
docker run --rm -i \
-e LINKEDIN_COOKIE=your_li_at_cookie_value \
-e HEADLESS=true \
-e TRANSPORT=stdio \
vertexcoverlabs/linkedin-mcp
Without Docker
claude mcp add linkedin-spider -- \
linkedin-spider-mcp serve --cookie your_li_at_cookie_value
Or connect to a running SSE/HTTP server:
claude mcp add linkedin-spider --transport sse http://localhost:8000/sse
Troubleshooting
Authentication issues
- Cookie expired: LinkedIn cookies expire periodically. Grab a fresh li_at value from your browser.
- Email/password not working: LinkedIn may trigger a CAPTCHA or verification. Try cookie auth instead.
- Session reuse: Sessions are saved in a Chrome profile. If things break, delete the profile directory and re-authenticate.
Docker issues
- First-time pull is slow: The image is ~1.4GB. Pre-pull with docker pull vertexcoverlabs/linkedin-mcp before configuring Claude Desktop to avoid timeout.
- Port conflicts: If port 8080 is in use, map to a different host port: -p 9090:8080.
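To find out whether a host port is already taken before choosing a mapping, you can try to bind it. This is a generic standard-library sketch, not something shipped with linkedin-spider; the helper name `is_port_free` is hypothetical.

```python
import socket


def is_port_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" (or a permission error).
            return False
    return True


# Usage: pick the first free host port from a few candidates.
# host_port = next(p for p in (8080, 9090, 9091) if is_port_free(p))
# print(f"docker run -p {host_port}:8080 ...")
```

The check is only a snapshot: another process can still claim the port between the check and the docker run call.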
Browser / Chrome issues
- Chrome not found (non-Docker): Set CHROMEDRIVER_PATH in your .env or pass chromedriver_path in ScraperConfig.
- Page load timeouts: Increase page_load_timeout in ScraperConfig or use a faster proxy.
- Headless mode issues: Some LinkedIn pages behave differently in headless mode. Try headless=False for debugging.
Python Library
from linkedin_spider import LinkedinSpider, ScraperConfig
config = ScraperConfig(headless=True, page_load_timeout=30)
scraper = LinkedinSpider(
li_at_cookie="your_cookie_value",
config=config,
)
# Search profiles
results = scraper.search_profiles("software engineer", max_results=10)
# Search posts
posts = scraper.search_posts("artificial intelligence", max_results=10)
# Scrape a single profile
profile = scraper.scrape_profile("https://linkedin.com/in/someone")
# Scrape a company page
company = scraper.scrape_company("https://linkedin.com/company/openai")
# Connection requests
incoming = scraper.scrape_incoming_connections(max_results=20)
scraper.send_connection_request("https://linkedin.com/in/someone", note="Hi!")
# Conversations
threads = scraper.scrape_conversations_list(max_results=10)
messages = scraper.scrape_conversation_messages("John Doe")
scraper.send_message("Thanks for connecting!", participant_name="John Doe")
# Always clean up
scraper.close()
See the examples/ directory for more detailed usage.
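Because the scraper drives a real browser, close() should run even when a scrape raises. One way to guarantee that is `contextlib.closing` from the standard library, which calls close() on exit. The wrapper name `run_with_cleanup` below is hypothetical; the sketch only assumes the scraper object has a close() method, as shown above.

```python
from contextlib import closing


def run_with_cleanup(scraper, task):
    """Run task(scraper) and call scraper.close() even if task raises."""
    with closing(scraper):
        return task(scraper)


# Usage (requires linkedin-spider to be installed):
# from linkedin_spider import LinkedinSpider
# profile = run_with_cleanup(
#     LinkedinSpider(li_at_cookie="your_cookie_value"),
#     lambda s: s.scrape_profile("https://linkedin.com/in/someone"),
# )
```

A plain try/finally block around the calls achieves the same result.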
Configuration
ScraperConfig accepts the following options:
| Option | Default | Description |
|---|---|---|
| headless | False | Run browser without a visible window |
| stealth_mode | True | Inject anti-detection scripts |
| page_load_timeout | 30 | Page load timeout in seconds |
| implicit_wait | 10 | Implicit wait timeout in seconds |
| human_delay_range | (0.5, 2.0) | Random delay range between actions |
| proxy | None | Proxy URL (http:// or socks5://) |
| custom_user_agent | None | Override the default user agent |
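To give a feel for what a delay range such as (0.5, 2.0) means in practice, the sketch below draws a uniform random pause between the two bounds. It illustrates the idea only: how linkedin-spider actually samples and applies its delays is not documented here, and the name `human_delay` is hypothetical.

```python
import random
import time


def human_delay(delay_range=(0.5, 2.0), rng=random, sleep=time.sleep) -> float:
    """Sleep for a random duration within delay_range and return it.

    Assumption: a uniform draw in seconds; the library may sample differently.
    """
    low, high = delay_range
    delay = rng.uniform(low, high)
    sleep(delay)
    return delay


# Usage: pause between two of your own actions.
# waited = human_delay((0.5, 2.0))
# print(f"waited {waited:.2f}s")
```

Wider ranges slow scraping down but make the request timing less regular.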
Command Line Interface
# Search profiles
linkedin-spider-cli search -q "product manager" -n 10 -o results.json
# Search posts
linkedin-spider-cli search-posts -k "artificial intelligence" -n 10 -o posts.json
# Scrape a profile
linkedin-spider-cli profile -u "https://linkedin.com/in/johndoe" -o profile.json
# Scrape a company
linkedin-spider-cli company -u "https://linkedin.com/company/openai" -o company.json
# List connection requests
linkedin-spider-cli connections -n 20 -o connections.json
Pass --cookie on first use (or set LINKEDIN_COOKIE env var). You can also use --email/--password instead. The session is saved and reused for subsequent commands.
Output defaults to stdout. Use -o to write JSON or CSV files.
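The CLI can also be driven from a Python script. The sketch below builds the argument list for the search command shown above, using only the flags that appear in these examples (-q, -n, -o). The function name `build_search_command` is hypothetical, and reading -n as the maximum number of results is an assumption.

```python
from typing import List, Optional


def build_search_command(
    query: str, max_results: int, output: Optional[str] = None
) -> List[str]:
    """Build the argv list for `linkedin-spider-cli search`.

    Passing a list (not a shell string) avoids quoting problems with
    multi-word queries.
    """
    cmd = ["linkedin-spider-cli", "search", "-q", query, "-n", str(max_results)]
    if output:
        cmd += ["-o", output]
    return cmd


# Usage (requires linkedin-spider[cli] to be installed):
# import subprocess
# subprocess.run(build_search_command("product manager", 10, "results.json"), check=True)
```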
Docker
A pre-built Docker image is published on Docker Hub:
docker pull vertexcoverlabs/linkedin-mcp
Or build locally:
docker build -t linkedin-spider .
Run with different transports:
# stdio
docker run --rm -i -e TRANSPORT=stdio --env-file .env vertexcoverlabs/linkedin-mcp
# SSE
docker run -p 8080:8080 -e TRANSPORT=sse --env-file .env vertexcoverlabs/linkedin-mcp
# HTTP
docker run -p 8080:8080 -e TRANSPORT=http --env-file .env vertexcoverlabs/linkedin-mcp
Running in the Cloud
LinkedIn blocks requests from known datacenter IP ranges. To run linkedin-spider on a cloud server (AWS, GCP, Azure, etc.), route browser traffic through a residential or mobile proxy using the PROXY_URL environment variable.
LINKEDIN_COOKIE=your_li_at_cookie_value
HEADLESS=true
PROXY_URL=http://user:pass@proxy-host:port
Both HTTP and SOCKS5 proxies are supported:
PROXY_URL=http://user:pass@proxy-host:port
PROXY_URL=socks5://user:pass@proxy-host:port
When using Docker, pass it as an environment variable:
docker run --rm -i \
-e LINKEDIN_COOKIE=your_li_at_cookie_value \
-e PROXY_URL=http://user:pass@proxy-host:port \
-e HEADLESS=true \
-e TRANSPORT=stdio \
vertexcoverlabs/linkedin-mcp
When using the Python library, pass it via ScraperConfig:
config = ScraperConfig(headless=True, proxy="http://user:pass@proxy-host:port")
Tip: Residential proxies are recommended over datacenter proxies to avoid detection.
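A malformed proxy URL is easy to miss until the browser fails to load a page, so it can help to check it up front. The sketch below uses `urllib.parse.urlsplit` to verify the scheme and host. The helper name `validate_proxy_url` is hypothetical, and the accepted schemes are taken from the two listed above; the library may perform its own, different validation.

```python
from urllib.parse import urlsplit

# Schemes listed in this README as supported.
SUPPORTED_SCHEMES = {"http", "socks5"}


def validate_proxy_url(url: str) -> str:
    """Return url unchanged if it looks like a usable proxy URL, else raise."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"Unsupported proxy scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("Proxy URL has no host")
    return url


# Usage (requires linkedin-spider to be installed):
# from linkedin_spider import ScraperConfig
# config = ScraperConfig(
#     headless=True,
#     proxy=validate_proxy_url("socks5://user:pass@proxy.example.com:1080"),
# )
```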
Development
git clone https://github.com/vertexcover-io/linkedin-spider
cd linkedin-spider
uv sync
cp .env.example .env # add your credentials
make check # lint + typecheck
make test # run tests
make build # build wheel
Disclaimer
This tool is for personal and educational use. Please respect LinkedIn's Terms of Service, use reasonable rate limits, and handle collected data responsibly.