Skip to main content

A modern LinkedIn scraping library with CLI and MCP server

Project description

linkedin-spider

A modern LinkedIn scraping library with built-in anti-detection, available as a Python library, CLI tool, and MCP server.

PyPI Python License Docker

MCP Server · Python Library · CLI · Docker

Features

  • Profile search with advanced filters (location, industry, company, connection degree)
  • Post search by keywords with engagement metrics and comments
  • Profile scraping with experience, education, skills, and contact details
  • Company scraping with industry, size, specialties, and more
  • Connection management — retrieve and send connection requests
  • Conversations — list threads and read message history
  • Proxy support — route traffic through HTTP or SOCKS5 proxies
  • Anti-detection — human-like behavior simulation, stealth mode, session persistence

Installation

pip install linkedin-spider        # library only
pip install linkedin-spider[cli]   # library + CLI
pip install linkedin-spider[mcp]   # library + MCP server
pip install linkedin-spider[all]   # everything

Authentication

linkedin-spider supports two authentication methods. Sessions are persisted in a Chrome profile so you typically only authenticate once.

LinkedIn Cookie (Recommended)

  1. Log in to LinkedIn in your browser
  2. Open DevTools (F12) → Application → Cookies → linkedin.com
  3. Copy the li_at cookie value
scraper = LinkedinSpider(li_at_cookie="your_cookie_value")

Email & Password

scraper = LinkedinSpider(email="you@example.com", password="your_password")

For CLI and MCP, pass --cookie (or --email/--password) flags, or set the LINKEDIN_COOKIE (or LINKEDIN_EMAIL/LINKEDIN_PASSWORD) environment variables.

MCP Server

The MCP server exposes LinkedIn data to AI assistants like Claude.

Example prompts you can give Claude once connected:

Research the background of this candidate https://www.linkedin.com/in/johndoe
Find 10 product managers in San Francisco and summarize their experience
What has OpenAI been posting about recently? https://www.linkedin.com/company/openai
Show me my pending connection requests and summarize who they are

It provides 11 tools:

Tool Description
search_profiles Search profiles with filters (location, industry, company, connections)
scrape_profile Extract complete profile data from a URL
search_posts Search posts by keywords with date filters
scrape_company Get company details from a URL
scrape_incoming_connections List pending connection requests received
scrape_outgoing_connections List connection requests you've sent
send_connection_request Send a connection request with optional note
scrape_conversations_list List messaging conversations
scrape_conversation Read messages from a conversation
get_session_status Check if the browser session is active
reset_session Close and reset the browser session

Start the server

# stdio (default — for Claude Desktop, Claude Code)
linkedin-spider-mcp serve --cookie your_li_at_cookie_value

# SSE or HTTP (for remote clients)
linkedin-spider-mcp serve --transport sse --host 127.0.0.1 --port 8000
linkedin-spider-mcp serve --transport http --host 0.0.0.0 --port 9000

Or configure via environment variables in a .env file:

LINKEDIN_COOKIE=your_li_at_cookie_value
HEADLESS=true
PROXY_URL=http://host:port          # optional

Claude Desktop

Add to your Claude Desktop config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

With Docker (Recommended)

No local dependencies required — the Docker image includes Python, Chrome, and everything needed.

{
  "mcpServers": {
    "linkedin-spider": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "-i",
        "-e",
        "LINKEDIN_COOKIE=your_li_at_cookie_value",
        "-e",
        "HEADLESS=true",
        "-e",
        "TRANSPORT=stdio",
        "vertexcoverlabs/linkedin-spider"
      ]
    }
  }
}

Without Docker

Requires pip install linkedin-spider[mcp] and Chrome installed locally.

{
  "mcpServers": {
    "linkedin-spider": {
      "command": "linkedin-spider-mcp",
      "args": ["serve", "--cookie", "your_li_at_cookie_value"]
    }
  }
}

Claude Code

With Docker

claude mcp add linkedin-spider -- \
  docker run --rm -i \
  -e LINKEDIN_COOKIE=your_li_at_cookie_value \
  -e HEADLESS=true \
  -e TRANSPORT=stdio \
  vertexcoverlabs/linkedin-spider

Without Docker

claude mcp add linkedin-spider -- \
  linkedin-spider-mcp serve --cookie your_li_at_cookie_value

Or connect to a running SSE/HTTP server:

claude mcp add linkedin-spider --transport sse http://localhost:8000/sse

Troubleshooting

Authentication issues
  • Cookie expired: LinkedIn cookies expire periodically. Grab a fresh li_at value from your browser.
  • Email/password not working: LinkedIn may trigger a CAPTCHA or verification. Try cookie auth instead.
  • Session reuse: Sessions are saved in a Chrome profile. If things break, delete the profile directory and re-authenticate.
Docker issues
  • First-time pull is slow: The image is ~1.4GB. Pre-pull with docker pull vertexcoverlabs/linkedin-spider before configuring Claude Desktop to avoid timeout.
  • Port conflicts: If port 8080 is in use, map to a different host port: -p 9090:8080.
Browser / Chrome issues
  • Chrome not found (non-Docker): Set CHROMEDRIVER_PATH in your .env or pass chromedriver_path in ScraperConfig.
  • Page load timeouts: Increase page_load_timeout in ScraperConfig or use a faster proxy.
  • Headless mode issues: Some LinkedIn pages behave differently in headless mode. Try headless=False for debugging.

Python Library

from linkedin_spider import LinkedinSpider, ScraperConfig

config = ScraperConfig(headless=True, page_load_timeout=30)
scraper = LinkedinSpider(
    li_at_cookie="your_cookie_value",
    config=config,
)

# Search profiles
results = scraper.search_profiles("software engineer", max_results=10)

# Search posts
posts = scraper.search_posts("artificial intelligence", max_results=10)

# Scrape a single profile
profile = scraper.scrape_profile("https://linkedin.com/in/someone")

# Scrape a company page
company = scraper.scrape_company("https://linkedin.com/company/openai")

# Connection requests
incoming = scraper.scrape_incoming_connections(max_results=20)
scraper.send_connection_request("https://linkedin.com/in/someone", note="Hi!")

# Conversations
threads = scraper.scrape_conversations_list(max_results=10)
messages = scraper.scrape_conversation_messages("John Doe")

# Always clean up
scraper.close()

See the examples/ directory for more detailed usage.

Configuration

ScraperConfig accepts the following options:

Option Default Description
headless False Run browser without a visible window
stealth_mode True Inject anti-detection scripts
page_load_timeout 30 Page load timeout in seconds
implicit_wait 10 Implicit wait timeout in seconds
human_delay_range (0.5, 2.0) Random delay range between actions
proxy None Proxy URL (http:// or socks5://)
custom_user_agent None Override the default user agent

Command Line Interface

# Search profiles
linkedin-spider-cli search -q "product manager" -n 10 -o results.json

# Search posts
linkedin-spider-cli search-posts -k "artificial intelligence" -n 10 -o posts.json

# Scrape a profile
linkedin-spider-cli profile -u "https://linkedin.com/in/johndoe" -o profile.json

# Scrape a company
linkedin-spider-cli company -u "https://linkedin.com/company/openai" -o company.json

# List connection requests
linkedin-spider-cli connections -n 20 -o connections.json

Pass --cookie on first use (or set LINKEDIN_COOKIE env var). You can also use --email/--password instead. The session is saved and reused for subsequent commands.

Output defaults to stdout. Use -o to write JSON or CSV files.

Docker

A pre-built Docker image is published on Docker Hub:

docker pull vertexcoverlabs/linkedin-spider

Or build locally:

docker build -t linkedin-spider .

Run with different transports:

# stdio
docker run --rm -i -e TRANSPORT=stdio --env-file .env vertexcoverlabs/linkedin-spider

# SSE
docker run -p 8080:8080 -e TRANSPORT=sse --env-file .env vertexcoverlabs/linkedin-spider

# HTTP
docker run -p 8080:8080 -e TRANSPORT=http --env-file .env vertexcoverlabs/linkedin-spider

Running in the Cloud

LinkedIn blocks requests from known datacenter IP ranges. To run linkedin-spider on a cloud server (AWS, GCP, Azure, etc.), route browser traffic through a residential or mobile proxy using the PROXY_URL environment variable.

LINKEDIN_COOKIE=your_li_at_cookie_value
HEADLESS=true
PROXY_URL=http://user:pass@proxy-host:port

Both HTTP and SOCKS5 proxies are supported:

PROXY_URL=http://user:pass@proxy-host:port
PROXY_URL=socks5://user:pass@proxy-host:port

When using Docker, pass it as an environment variable:

docker run --rm -i \
  -e LINKEDIN_COOKIE=your_li_at_cookie_value \
  -e PROXY_URL=http://user:pass@proxy-host:port \
  -e HEADLESS=true \
  -e TRANSPORT=stdio \
  vertexcoverlabs/linkedin-spider

When using the Python library, pass it via ScraperConfig:

config = ScraperConfig(headless=True, proxy="http://user:pass@proxy-host:port")

[!TIP] Residential proxies are recommended over datacenter proxies to avoid detection.

Development

git clone https://github.com/vertexcover-io/linkedin-spider
cd linkedin-spider
uv sync
cp .env.example .env   # add your credentials
make check    # lint + typecheck
make test     # run tests
make build    # build wheel

Disclaimer

This tool is for personal and educational use. Please respect LinkedIn's Terms of Service, use reasonable rate limits, and handle collected data responsibly.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

linkedin_spider-0.2.1.tar.gz (153.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

linkedin_spider-0.2.1-py3-none-any.whl (65.0 kB view details)

Uploaded Python 3

File details

Details for the file linkedin_spider-0.2.1.tar.gz.

File metadata

  • Download URL: linkedin_spider-0.2.1.tar.gz
  • Upload date:
  • Size: 153.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for linkedin_spider-0.2.1.tar.gz
Algorithm Hash digest
SHA256 42500be4e62bd39e0d5439ae16918308b26d6f5d3c0364b9dac66d9562e41f08
MD5 35456bae53eb302cff4526d4646958af
BLAKE2b-256 3ec3fe8144860abdef2c15dde476b2a60530758eab8b89c9d1aee251fd1821e2

See more details on using hashes here.

File details

Details for the file linkedin_spider-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: linkedin_spider-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 65.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.10.9 {"installer":{"name":"uv","version":"0.10.9","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for linkedin_spider-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 4f5685daf363ba7fa4ff733e78940f34e19c444719772ba51c811e0bcd4f943c
MD5 e1c909f5e7521c6c85614f4d8bea3272
BLAKE2b-256 f26aca6ae2726fe1c352a2a7a34ed53752433bfc016374d922fcae31a5ad567c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page