Skip to main content

A modern LinkedIn scraping library with CLI and MCP server

Project description

linkedin-spider

Effortless Linkedin scraping with zero detection. Extract, export, and automate your Linkedin data.

Features

  • Search Linkedin profiles with advanced filters (location, connection type, current company, position)
  • Extract complete profile information (experience, education, skills, contact details)
  • Get company details and information
  • Retrieve incoming and outgoing connection requests
  • Send connection requests to profiles
  • Get conversations list and detailed conversation history
  • Built-in anti-detection and session management

Quick Start

Installation

# Clone the repo
github.com/vertexcover-io/linkedin-spider
cd linkedin-spider
# Install with uv
uv sync

[!NOTE] Authentication Update: Linkedin has enhanced their anti-bot mechanisms, temporarily affecting cookie-based authentication. We recommend using the email/password authentication method for reliable access. We are actively working on restoring full cookie authentication support.

Different ways to use it

1. Python Library

Perfect for integration into your existing Python applications:

from linkedin_scraper import LinkedinSpider, ScraperConfig

config = ScraperConfig(headless=True, page_load_timeout=30)
# Authenticate (use either email/password or cookie).
# Authentication is mostly done once and the session is saved in the chrome profile
scraper = LinkedinSpider(
    email="your_email@example.com",
    password="your_password",
    config=config
)
# Search for profiles
results = scraper.search_profiles("software engineer", max_results=10)

Output sample:

[
  {
    "name": "John Doe",
    "title": "Senior Software Engineer at Google",
    "location": "San Francisco, CA",
    "profile_url": "https://linkedin.com/in/johndoe",
    "connections": "500+"
  },
  {
    "name": "Jane Smith",
    "title": "Software Engineer at Microsoft",
    "location": "Seattle, WA",
    "profile_url": "https://linkedin.com/in/janesmith",
    "connections": "200+"
  }
]
# Scrape individual profile
profile = scraper.scrape_profile("https://linkedin.com/in/someone")

Output sample:

{
  "name": "John Doe",
  "title": "Senior Software Engineer",
  "location": "San Francisco, CA",
  "about": "Passionate software engineer with 8+ years of experience...",
  "experience": [
    {
      "title": "Senior Software Engineer",
      "company": "Google",
      "duration": "2021 - Present",
      "description": "Leading backend development for search infrastructure..."
    }
  ],
  "education": [
    {
      "school": "Stanford University",
      "degree": "BS Computer Science",
      "years": "2013 - 2017"
    }
  ],
  "skills": ["Python", "Java", "Kubernetes", "AWS"]
}
# Scrape company information
company = scraper.scrape_company("https://linkedin.com/company/tech-corp")

Output sample:

{
  "name": "TechCorp Inc",
  "industry": "Software Development",
  "company_size": "1,001-5,000 employees",
  "headquarters": "San Francisco, CA",
  "founded": "2010",
  "specialties": ["Cloud Computing", "AI/ML", "Data Analytics"],
  "description": "Leading technology company focused on enterprise solutions...",
  "website": "https://techcorp.com",
  "follower_count": "45,230"
}
# Don't forget to clean up
scraper.close()

For more examples : examples

2. Command Line Interface

Great for quick data extraction and scripting:

uv run linkedin-spider-cli search -q "product manager" -n 10 -o results.json
uv run linkedin-spider-cli profile -u "https://linkedin.com/in/johndoe" -o profile.json
uv run linkedin-spider-cli company -u "https://linkedin.com/company/openai" -o company.json

3. MCP Server

Set up environment variables in .env file:

# Authentication (choose one method)
LINKEDIN_EMAIL=your_email@example.com
LINKEDIN_PASSWORD=your_password
# OR
LINKEDIN_COOKIE=your_li_at_cookie_value

# Configuration
HEADLESS=true

# Transport (optional, defaults to stdio)
TRANSPORT=sse
HOST=127.0.0.1
PORT=8000

Start the MCP server:

# Show available transport options
uv run linkedin_mcp

# Start with specific transport
uv run linkedin_mcp sse
uv run linkedin_mcp http --host 0.0.0.0 --port 9000
uv run linkedin_mcp stdio

# Or use environment variables
TRANSPORT=sse uv run linkedin_mcp

Claude Code Integration

# Add to Claude Code
claude mcp add linkedin-spider --transport sse <server-url> 
# Example server URL format: http://localhost:8080/sse

Claude Desktop Integration

Add to your Claude Desktop configuration file:

Windows: %APPDATA%\Claude\claude_desktop_config.json macOS: ~/Library/Application Support/Claude/claude_desktop_config.json Linux: ~/.config/Claude/claude_desktop_config.json

Option 1: Docker (Recommended)

The Docker approach provides reliable, isolated execution with all dependencies included.

First, build the Docker image:

# Build the stdio server image
docker build -f Dockerfile.stdio -t linkedin-mcp-stdio .

Then add this to your Claude Desktop configuration:

{
  "mcpServers": {
    "linkedin-spider": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-e", "LINKEDIN_EMAIL=your_email@example.com",
        "-e", "LINKEDIN_PASSWORD=your_password",
        "-e", "HEADLESS=true",
        "-e", "TRANSPORT=stdio",
        "linkedin-mcp-stdio"
      ]
    }
  }
}

Docker Development & Testing

For development and testing with Docker, you can use a single image with different transport configurations:

Build the Docker Image

# Build once for all transport types
docker build -t linkedin-mcp .

Run with Different Transports

SSE Server

docker run -p 8000:8000 -e TRANSPORT=sse --env-file .env linkedin-mcp

HTTP Server

docker run -p 8000:8000 -e TRANSPORT=http --env-file .env linkedin-mcp

STDIO Server

docker run --rm -i -e TRANSPORT=stdio --env-file .env linkedin-mcp

Authentication Methods

Method 1: Linkedin Cookie

  1. Login to Linkedin in your browser
  2. Open Developer Tools (F12)
  3. Go to Application/Storage → Cookies → linkedin.com
  4. Copy the li_at cookie value
  5. Use it in your code:
scraper = LinkedinSpider(li_at_cookie="your_cookie_value")

Method 2: Email & Password (Recommended)

scraper = LinkedinSpider(
    email="your_email@example.com",
    password="your_password"
)

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Disclaimer

This tool is for personal use only. Please:

  • Respect Linkedin's Terms of Service
  • Use reasonable rate limits
  • Don't spam or harass users
  • Be responsible with the data you collect

Ready to extract Linkedin data like a pro? Star this repo and start scraping!

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

linkedin_spider-0.1.0.tar.gz (136.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

linkedin_spider-0.1.0-py3-none-any.whl (51.2 kB view details)

Uploaded Python 3

File details

Details for the file linkedin_spider-0.1.0.tar.gz.

File metadata

  • Download URL: linkedin_spider-0.1.0.tar.gz
  • Upload date:
  • Size: 136.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.22

File hashes

Hashes for linkedin_spider-0.1.0.tar.gz
Algorithm Hash digest
SHA256 20d5b5d82f19b1b0f66210ded418ab437e2af1af8c654f1a773c9ce1e412c650
MD5 2475785b05259496e26458dd3356640b
BLAKE2b-256 4ea8d94cf876d36547754d234f873081695cc6c21c68472287e27ad5e3ba6bb0

See more details on using hashes here.

File details

Details for the file linkedin_spider-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for linkedin_spider-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 6e96a2ada88adfd95fedad985f2e65eef05ba76e0905c1a07be00a083f6484ac
MD5 e25242f66d627212b8292b92a54f3632
BLAKE2b-256 bda6a08b383c9769491952fbf6b7550d051060fe303d89dec38852370a6411c8

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page