linkedin-spider

A modern LinkedIn scraping library with built-in anti-detection, available as a Python library, CLI tool, and MCP server.

Features

  • Profile search with advanced filters (location, industry, company, connection degree)
  • Post search by keywords with engagement metrics and comments
  • Profile scraping with experience, education, skills, and contact details
  • Company scraping with industry, size, specialties, and more
  • Connection management — retrieve and send connection requests
  • Conversations — list threads, read message history, and send messages
  • Proxy support — route traffic through HTTP or SOCKS5 proxies
  • Anti-detection — human-like behavior simulation, stealth mode, session persistence

Installation

pip install linkedin-spider          # library only
pip install "linkedin-spider[cli]"   # library + CLI
pip install "linkedin-spider[mcp]"   # library + MCP server
pip install "linkedin-spider[all]"   # everything

Authentication

linkedin-spider supports two authentication methods. Sessions are persisted in a Chrome profile so you typically only authenticate once.

LinkedIn Cookie (Recommended)

  1. Log in to LinkedIn in your browser
  2. Open DevTools (F12) → Application → Cookies → linkedin.com
  3. Copy the li_at cookie value

scraper = LinkedinSpider(li_at_cookie="your_cookie_value")

Email & Password

scraper = LinkedinSpider(email="you@example.com", password="your_password")

For CLI and MCP, pass --cookie (or --email/--password) flags, or set the LINKEDIN_COOKIE (or LINKEDIN_EMAIL/LINKEDIN_PASSWORD) environment variables.
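
For library scripts, the same variable names can be reused so credentials stay out of source code. The sketch below is an assumption-laden helper, not part of linkedin-spider: `auth_kwargs_from_env` is a hypothetical name, and the source only states that the CLI and MCP server read these variables. It maps them onto the two constructor signatures shown above.

```python
import os
from typing import Mapping


def auth_kwargs_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Build LinkedinSpider keyword arguments from environment variables.

    Hypothetical helper: prefers LINKEDIN_COOKIE and falls back to
    LINKEDIN_EMAIL plus LINKEDIN_PASSWORD.
    """
    cookie = env.get("LINKEDIN_COOKIE", "").strip()
    if cookie:
        return {"li_at_cookie": cookie}

    email = env.get("LINKEDIN_EMAIL", "").strip()
    password = env.get("LINKEDIN_PASSWORD", "")
    if email and password:
        return {"email": email, "password": password}

    raise RuntimeError(
        "Set LINKEDIN_COOKIE, or both LINKEDIN_EMAIL and LINKEDIN_PASSWORD"
    )


# Usage (requires linkedin-spider to be installed):
#   from linkedin_spider import LinkedinSpider
#   scraper = LinkedinSpider(**auth_kwargs_from_env())
```

Preferring the cookie mirrors the recommendation above; swap the order if password login suits your setup better.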

MCP Server

The MCP server exposes LinkedIn data to AI assistants like Claude.

Example prompts you can give Claude once connected:

Research the background of this candidate https://www.linkedin.com/in/johndoe
Find 10 product managers in San Francisco and summarize their experience
What has OpenAI been posting about recently? https://www.linkedin.com/company/openai
Show me my pending connection requests and summarize who they are
Send a message to John Doe saying "Thanks for connecting!"

It provides 12 tools:

Tool                          Description
search_profiles               Search profiles with filters (location, industry, company, connections)
scrape_profile                Extract complete profile data from a URL
search_posts                  Search posts by keywords with date filters
scrape_company                Get company details from a URL
scrape_incoming_connections   List pending connection requests received
scrape_outgoing_connections   List connection requests you've sent
send_connection_request       Send a connection request with optional note
scrape_conversations_list     List messaging conversations
scrape_conversation           Read messages from a conversation
send_message                  Send a message in an existing or new conversation
get_session_status            Check if the browser session is active
reset_session                 Close and reset the browser session

Start the server

# stdio (default — for Claude Desktop, Claude Code)
linkedin-spider-mcp serve --cookie your_li_at_cookie_value

# SSE or HTTP (for remote clients)
linkedin-spider-mcp serve --transport sse --host 127.0.0.1 --port 8000
linkedin-spider-mcp serve --transport http --host 0.0.0.0 --port 9000

Or configure via environment variables in a .env file:

LINKEDIN_COOKIE=your_li_at_cookie_value
HEADLESS=true
PROXY_URL=http://host:port          # optional
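
How the server loads this file is not shown here. If you want to read the same .env file from your own Python scripts without extra dependencies, a minimal sketch could look like the following. `parse_env` is a hypothetical helper and handles only plain KEY=VALUE lines with optional trailing comments, as in the example above.

```python
import re


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines; skip blanks and full-line comments.

    Hypothetical helper. A trailing comment introduced by whitespace and '#'
    is dropped, so values that contain ' #' are not supported.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = re.split(r"\s+#", value, maxsplit=1)[0].strip()
        values[key.strip()] = value
    return values


# Usage:
#   from pathlib import Path
#   settings = parse_env(Path(".env").read_text(encoding="utf-8"))
#   headless = settings.get("HEADLESS", "false").lower() == "true"
```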

Claude Desktop

Add to your Claude Desktop config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

With Docker (Recommended)

No local dependencies required — the Docker image includes Python, Chrome, and everything needed.

{
  "mcpServers": {
    "linkedin-spider": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "-e",
        "LINKEDIN_COOKIE=your_li_at_cookie_value",
        "-e",
        "HEADLESS=true",
        "-e",
        "TRANSPORT=stdio",
        "vertexcoverlabs/linkedin-mcp"
      ]
    }
  }
}

Without Docker

Requires pip install linkedin-spider[mcp] and Chrome installed locally.

{
  "mcpServers": {
    "linkedin-spider": {
      "command": "linkedin-spider-mcp",
      "args": ["serve", "--cookie", "your_li_at_cookie_value"]
    }
  }
}
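
If the config file already lists other MCP servers, editing it by hand risks clobbering them. The sketch below merges the non-Docker entry shown above into an existing config using only the standard library. The function names are hypothetical, and the path in the usage comment is the Linux location from the list above.

```python
import json
from pathlib import Path


def add_linkedin_spider(config: dict, cookie: str) -> dict:
    """Return a copy of a Claude Desktop config with the linkedin-spider entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["linkedin-spider"] = {
        "command": "linkedin-spider-mcp",
        "args": ["serve", "--cookie", cookie],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, cookie: str) -> None:
    """Merge the entry into the JSON file at `path`, creating the file if needed."""
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    updated = add_linkedin_spider(existing, cookie)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")


# Usage (Linux path; pick the path for your OS):
#   update_config_file(
#       Path.home() / ".config/Claude/claude_desktop_config.json",
#       "your_li_at_cookie_value",
#   )
```

Restart Claude Desktop after changing the file so it picks up the new server entry.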

Claude Code

With Docker

claude mcp add linkedin-spider -- \
  docker run --rm -i \
  -e LINKEDIN_COOKIE=your_li_at_cookie_value \
  -e HEADLESS=true \
  -e TRANSPORT=stdio \
  vertexcoverlabs/linkedin-mcp

Without Docker

claude mcp add linkedin-spider -- \
  linkedin-spider-mcp serve --cookie your_li_at_cookie_value

Or connect to a running SSE/HTTP server:

claude mcp add linkedin-spider --transport sse http://localhost:8000/sse

Troubleshooting

Authentication issues

  • Cookie expired: LinkedIn cookies expire periodically. Grab a fresh li_at value from your browser.
  • Email/password not working: LinkedIn may trigger a CAPTCHA or verification. Try cookie auth instead.
  • Session reuse: Sessions are saved in a Chrome profile. If things break, delete the profile directory and re-authenticate.

Docker issues

  • First-time pull is slow: The image is ~1.4GB. Pre-pull with docker pull vertexcoverlabs/linkedin-mcp before configuring Claude Desktop to avoid a timeout.
  • Port conflicts: If port 8080 is in use, map to a different host port: -p 9090:8080.

Browser / Chrome issues

  • Chrome not found (non-Docker): Set CHROMEDRIVER_PATH in your .env or pass chromedriver_path in ScraperConfig.
  • Page load timeouts: Increase page_load_timeout in ScraperConfig or use a faster proxy.
  • Headless mode issues: Some LinkedIn pages behave differently in headless mode. Try headless=False for debugging.

Python Library

from linkedin_spider import LinkedinSpider, ScraperConfig

config = ScraperConfig(headless=True, page_load_timeout=30)
scraper = LinkedinSpider(
    li_at_cookie="your_cookie_value",
    config=config,
)

# Search profiles
results = scraper.search_profiles("software engineer", max_results=10)

# Search posts
posts = scraper.search_posts("artificial intelligence", max_results=10)

# Scrape a single profile
profile = scraper.scrape_profile("https://linkedin.com/in/someone")

# Scrape a company page
company = scraper.scrape_company("https://linkedin.com/company/openai")

# Connection requests
incoming = scraper.scrape_incoming_connections(max_results=20)
scraper.send_connection_request("https://linkedin.com/in/someone", note="Hi!")

# Conversations
threads = scraper.scrape_conversations_list(max_results=10)
messages = scraper.scrape_conversation_messages("John Doe")
scraper.send_message("Thanks for connecting!", participant_name="John Doe")

# Always clean up
scraper.close()

See the examples/ directory for more detailed usage.
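
The "always clean up" step in the example above is skipped if an earlier call raises. One way to guarantee it runs is `contextlib.closing`, which only needs the object to have a `close()` method. Whether LinkedinSpider can be used directly in a `with` statement is not stated here, so this sketch does not assume it. `scrape_profiles` and `make_scraper` are hypothetical names.

```python
from contextlib import closing
from typing import Callable, Iterable


def scrape_profiles(make_scraper: Callable[[], object], urls: Iterable[str]) -> list:
    """Scrape each URL and close the scraper even if one call raises."""
    with closing(make_scraper()) as scraper:
        return [scraper.scrape_profile(url) for url in urls]


# Usage (requires linkedin-spider to be installed):
#   from linkedin_spider import LinkedinSpider
#   profiles = scrape_profiles(
#       lambda: LinkedinSpider(li_at_cookie="your_cookie_value"),
#       ["https://linkedin.com/in/someone"],
#   )
```

Passing a factory rather than a live scraper keeps construction inside the guarded block, so the browser session never outlives the function.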

Configuration

ScraperConfig accepts the following options:

Option             Default      Description
headless           False        Run browser without a visible window
stealth_mode       True         Inject anti-detection scripts
page_load_timeout  30           Page load timeout in seconds
implicit_wait      10           Implicit wait timeout in seconds
human_delay_range  (0.5, 2.0)   Random delay range between actions
proxy              None         Proxy URL (http:// or socks5://)
custom_user_agent  None         Override the default user agent
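
To make human_delay_range concrete, here is an illustrative sketch of what a random delay between actions can look like. It is not the library's implementation: the uniform distribution, the unit (seconds), and the `human_delay` function are assumptions made for the example.

```python
import random
import time


def human_delay(delay_range=(0.5, 2.0), rng=random, sleep=time.sleep) -> float:
    """Sleep for a random duration drawn uniformly from delay_range; return it."""
    low, high = delay_range
    if low < 0 or high < low:
        raise ValueError(f"invalid delay range: {delay_range!r}")
    delay = rng.uniform(low, high)  # assumed distribution, for illustration only
    sleep(delay)
    return delay


# Usage: pause for somewhere between 0.5 and 2.0 seconds
#   human_delay((0.5, 2.0))
```

A wider range makes the timing less regular at the cost of slower runs.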

Command Line Interface

# Search profiles
linkedin-spider-cli search -q "product manager" -n 10 -o results.json

# Search posts
linkedin-spider-cli search-posts -k "artificial intelligence" -n 10 -o posts.json

# Scrape a profile
linkedin-spider-cli profile -u "https://linkedin.com/in/johndoe" -o profile.json

# Scrape a company
linkedin-spider-cli company -u "https://linkedin.com/company/openai" -o company.json

# List connection requests
linkedin-spider-cli connections -n 20 -o connections.json

Pass --cookie on first use (or set the LINKEDIN_COOKIE environment variable). You can also use --email/--password instead. The session is saved and reused for subsequent commands.

Output defaults to stdout. Use -o to write JSON or CSV files.
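
The CLI can also be driven from a Python script. This sketch builds the search command shown above and reads back the file it writes. The helper names are hypothetical, and it assumes that an output path ending in .json yields a JSON file; how the CLI chooses between JSON and CSV is not stated here.

```python
import json
import subprocess
from pathlib import Path


def search_command(query: str, limit: int, output: Path) -> list[str]:
    """Build the argument list for `linkedin-spider-cli search`."""
    return [
        "linkedin-spider-cli", "search",
        "-q", query,
        "-n", str(limit),
        "-o", str(output),
    ]


def run_search(query: str, limit: int, output: Path):
    """Run the search and return the parsed contents of the output file."""
    subprocess.run(search_command(query, limit, output), check=True)
    return json.loads(output.read_text(encoding="utf-8"))


# Usage (requires the CLI to be installed and authenticated):
#   results = run_search("product manager", 10, Path("results.json"))
```

Passing the arguments as a list avoids shell quoting problems with multi-word queries.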

Docker

A pre-built Docker image is published on Docker Hub:

docker pull vertexcoverlabs/linkedin-mcp

Or build locally:

docker build -t linkedin-spider .

Run with different transports:

# stdio
docker run --rm -i -e TRANSPORT=stdio --env-file .env vertexcoverlabs/linkedin-mcp

# SSE
docker run -p 8080:8080 -e TRANSPORT=sse --env-file .env vertexcoverlabs/linkedin-mcp

# HTTP
docker run -p 8080:8080 -e TRANSPORT=http --env-file .env vertexcoverlabs/linkedin-mcp

Running in the Cloud

LinkedIn blocks requests from known datacenter IP ranges. To run linkedin-spider on a cloud server (AWS, GCP, Azure, etc.), route browser traffic through a residential or mobile proxy using the PROXY_URL environment variable.

LINKEDIN_COOKIE=your_li_at_cookie_value
HEADLESS=true
PROXY_URL=http://user:pass@proxy-host:port

Both HTTP and SOCKS5 proxies are supported:

PROXY_URL=http://user:pass@proxy-host:port
PROXY_URL=socks5://user:pass@proxy-host:port
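
A malformed proxy URL tends to surface only as a page load timeout, so it can help to check it up front. The sketch below validates the scheme, host, and port with the standard library and returns a copy with the password masked for logging. `check_proxy_url` is a hypothetical helper, not part of linkedin-spider.

```python
from urllib.parse import urlsplit

SUPPORTED_SCHEMES = {"http", "socks5"}  # the two schemes listed above


def check_proxy_url(url: str) -> str:
    """Validate a proxy URL and return a copy safe to log (password masked)."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    # parts.port itself raises ValueError if the port is not a valid number
    if not parts.hostname or parts.port is None:
        raise ValueError("proxy URL needs both a host and a port")
    credentials = f"{parts.username}:***@" if parts.username else ""
    return f"{parts.scheme}://{credentials}{parts.hostname}:{parts.port}"


# Usage:
#   import os
#   print("Using proxy", check_proxy_url(os.environ["PROXY_URL"]))
```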

When using Docker, pass it as an environment variable:

docker run --rm -i \
  -e LINKEDIN_COOKIE=your_li_at_cookie_value \
  -e PROXY_URL=http://user:pass@proxy-host:port \
  -e HEADLESS=true \
  -e TRANSPORT=stdio \
  vertexcoverlabs/linkedin-mcp

When using the Python library, pass it via ScraperConfig:

config = ScraperConfig(headless=True, proxy="http://user:pass@proxy-host:port")

Tip: Residential proxies are recommended over datacenter proxies to avoid detection.

Development

git clone https://github.com/vertexcover-io/linkedin-spider
cd linkedin-spider
uv sync
cp .env.example .env   # add your credentials
make check    # lint + typecheck
make test     # run tests
make build    # build wheel

Disclaimer

This tool is for personal and educational use. Please respect LinkedIn's Terms of Service, use reasonable rate limits, and handle collected data responsibly.
