A professional CLI tool for scraping LinkedIn profiles via Google Search (linkedin-spider was taken)

🕷️ LinkedIn Spider

A professional CLI tool for scraping LinkedIn profiles via Google Search

📦 PyPI Package Name: This project is available on PyPI as linkedin-tarantula because linkedin-spider was already taken by another project. The GitHub repository remains linkedin-spider.


📖 Overview

LinkedIn Spider is a command-line tool for collecting and analyzing LinkedIn profiles at scale. It finds profiles through Google Search rather than LinkedIn's own search, which reduces the risk of account restrictions while still returning detailed profile data.

✨ Features

  • 🔍 Smart Search - Find profiles via Google Search to avoid LinkedIn rate limits
  • 🎨 Beautiful CLI - Interactive arrow-key menu navigation with ASCII art
  • 📊 Data Export - Export to CSV, JSON, or Excel formats
  • 🔐 Secure - Environment-based configuration for credentials
  • 🌐 VPN Support - Optional IP rotation for enhanced privacy
  • ⚡ Fast & Efficient - Progress tracking and batch processing
  • 🛡️ Anti-Detection - Random delays, user agents, and human-like behavior
  • 🤖 CAPTCHA Handler - Automatic CAPTCHA detection with auto-resume
  • 🎮 Interactive Menu - Navigate with arrow keys (↑↓) and Enter
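
The README does not show the query that Smart Search sends to Google. A minimal sketch, assuming a `site:linkedin.com/in` restriction and exact-match keyword phrases, might look like the following. The function name `build_search_url` and the page size are illustrative assumptions, not taken from the tool.

```python
from urllib.parse import urlencode


def build_search_url(keywords, page=0, results_per_page=10):
    """Build a Google Search URL restricted to public LinkedIn profile pages.

    Hypothetical helper: the tool's real query format is not documented here.
    """
    # Quote each phrase so Google treats it as an exact match.
    phrases = " ".join(f'"{keyword}"' for keyword in keywords)
    query = f"site:linkedin.com/in {phrases}"
    # Google paginates with a zero-based result offset in the `start` parameter.
    params = {"q": query, "start": page * results_per_page}
    return "https://www.google.com/search?" + urlencode(params)


url = build_search_url(["Python Developer", "San Francisco"], page=1)
print(url)
```

Restricting the query to `/in/` paths keeps company and job pages out of the results, so only profile URLs need to be collected.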

📦 Installation

Option 1: PyPI Installation (Recommended)

Note: The PyPI package is named linkedin-tarantula (not linkedin-spider) because the latter name was already taken.

# Install from PyPI
pip install linkedin-tarantula

# Or with Excel export support (the quotes stop shells such as zsh from expanding the brackets)
pip install "linkedin-tarantula[excel]"

This installs the linkedin-spider command globally.

Option 2: Quick Install from Source

# Clone the repository
git clone https://github.com/alexcolls/linkedin-spider.git
cd linkedin-spider

# Run the installation script
./install.sh

The installation script provides three options:

  1. System Installation - Installs globally as linkedin-spider command
  2. Development Installation - Installs locally with Poetry for testing
  3. Both - Installs both system and development modes

Option 3: Development Installation

# Install from source with Poetry
poetry install

# Optional: Install with Excel support
poetry install -E excel

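# Activate the virtual environment
# (Poetry 2.x no longer bundles the shell command; install poetry-plugin-shell
# or run: eval $(poetry env activate))
poetry shell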

Option 4: Install from GitHub (Direct)

# Install directly from GitHub
pip install git+https://github.com/alexcolls/linkedin-spider.git

# Or with Excel support (the requirement name must match the package name, linkedin-tarantula)
pip install "linkedin-tarantula[excel] @ git+https://github.com/alexcolls/linkedin-spider.git"

⚙️ Configuration

1. Environment Variables

cp .env.sample .env
# Edit .env with your LinkedIn credentials
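
As a rough illustration of environment-based configuration, the sketch below parses a `.env`-style file and reads credentials from it. The variable names `LINKEDIN_EMAIL` and `LINKEDIN_PASSWORD` are assumptions; check `.env.sample` for the names the tool actually expects.

```python
import os
import tempfile


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            # Skip blank lines, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_credentials(env):
    """Return (email, password). The key names are hypothetical."""
    required = ("LINKEDIN_EMAIL", "LINKEDIN_PASSWORD")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return env["LINKEDIN_EMAIL"], env["LINKEDIN_PASSWORD"]


# Demo with a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write('# sample\nLINKEDIN_EMAIL=me@example.com\nLINKEDIN_PASSWORD="s3cret"\n')
demo_env = load_env_file(tmp.name)
os.unlink(tmp.name)
print(get_credentials(demo_env)[0])
```

Keeping credentials in `.env` rather than in `config.yaml` or on the command line makes it easier to exclude them from version control.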

2. Configuration File

Edit config.yaml for advanced settings (delays, VPN, export format, etc.)

🎯 Usage

Quick Start

# If installed from PyPI (pip install linkedin-tarantula) or from source in system mode
linkedin-spider

# If installed with development mode
./run.sh

# Or with Poetry directly
poetry run python -m linkedin_spider

Interactive Mode

The CLI provides an interactive menu with ASCII art and arrow-key navigation:

linkedin-spider  # or ./run.sh for development

Navigation:

  • Use ↑↓ arrow keys to navigate
  • Press Enter to select
  • Or type the number directly

Menu options:

  1. 🔍 Search & Collect Profile URLs
  2. 📊 Scrape Profile Data
  3. 🤝 Auto-Connect to Profiles
  4. 📁 View/Export Results
  5. ⚙️ Configure Settings
  6. ❓ Help
  7. 🚪 Exit

Command-Line Mode

# Search for profiles
linkedin-spider search "Python Developer" "San Francisco" --max-pages 10

# Scrape profiles from file
linkedin-spider scrape --urls data/profile_urls.txt --output results --format csv

# Show version
linkedin-spider version
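
The `scrape` command reads profile URLs from a text file, but the file format is not specified above. Assuming one URL per line, a sketch of filtering and de-duplicating such a list could look like this. The function names are illustrative and are not part of the tool's API.

```python
from urllib.parse import urlparse


def is_profile_url(url):
    """True for URLs that look like public LinkedIn profile pages (/in/<slug>)."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    on_linkedin = host == "linkedin.com" or host.endswith(".linkedin.com")
    has_slug = parsed.path.startswith("/in/") and len(parsed.path) > len("/in/")
    return parsed.scheme in ("http", "https") and on_linkedin and has_slug


def clean_urls(lines):
    """Drop non-profile lines, strip query strings, and remove duplicates in order."""
    seen = set()
    result = []
    for line in lines:
        url = line.strip()
        if not is_profile_url(url):
            continue
        parsed = urlparse(url)
        normalized = f"https://{parsed.netloc.lower()}{parsed.path.rstrip('/')}"
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


sample = [
    "https://www.linkedin.com/in/jane-doe/",
    "https://www.linkedin.com/in/jane-doe?trk=abc",
    "https://www.linkedin.com/company/acme",
    "",
]
print(clean_urls(sample))
```

In practice the input would be the lines of `data/profile_urls.txt`; normalizing before de-duplicating avoids scraping the same profile twice when Google returns it with different tracking parameters.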

🗑️ Uninstallation

To remove LinkedIn Spider from your system:

./uninstall.sh

This will:

  • Remove the system command (if installed)
  • Clean up Poetry virtual environments
  • Optionally remove .env and data files

🔧 Key Features Explained

CAPTCHA Handling

LinkedIn Spider detects Google CAPTCHA challenges, pauses while you solve them, and then resumes on its own:

  • Automatic Detection: Instantly detects when CAPTCHA appears
  • Clear Instructions: Shows what to do in the terminal
  • Auto-Resume: Automatically continues when CAPTCHA is solved (no manual Enter press needed!)
  • Progress Updates: Shows elapsed time every 10 seconds
  • Smart Polling: Checks every 2 seconds for resolution
  • Timeout Protection: 5-minute maximum wait with fallback
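
The polling behavior described above can be sketched as a small loop. This is not the tool's implementation: `is_solved` stands in for whatever check the tool runs against the browser page, and the clock, sleep, and report hooks are assumptions added so the loop can be exercised without real waiting.

```python
import time


def wait_for_captcha(is_solved, poll_interval=2.0, progress_every=10.0,
                     timeout=300.0, clock=time.monotonic, sleep=time.sleep,
                     report=print):
    """Poll until is_solved() returns True or the timeout elapses.

    Returns True when the CAPTCHA was solved, False on timeout.
    """
    start = clock()
    next_report = progress_every
    while True:
        if is_solved():
            return True
        elapsed = clock() - start
        if elapsed >= timeout:
            return False
        if elapsed >= next_report:
            report(f"Still waiting for CAPTCHA... {int(elapsed)}s elapsed")
            next_report += progress_every
        sleep(poll_interval)


class FakeClock:
    """Simulated clock so the demo finishes instantly."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


fake = FakeClock()
messages = []
solved = wait_for_captcha(lambda: fake.now >= 25, clock=fake.time,
                          sleep=fake.sleep, report=messages.append)
print(solved, len(messages))
```

The defaults mirror the figures listed above: a check every 2 seconds, a progress line every 10 seconds, and a 5-minute limit.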

Data Directory

Output is written relative to your current working directory. Data goes in the data/ folder and logs go in logs/:

  • Profile URLs: data/profile_urls.txt
  • Exported profiles: data/profiles_YYYYMMDD_HHMMSS.csv, .json, or .xlsx
  • Logs: logs/linkedin-spider.log
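
To make the file naming concrete, here is a minimal sketch of a timestamped export using only the standard library. The function name `export_profiles` and the field names in the demo are hypothetical, and Excel output is left out because it needs the optional extra dependency.

```python
import csv
import json
import tempfile
from datetime import datetime
from pathlib import Path


def export_profiles(profiles, fmt="csv", data_dir="data", now=None):
    """Write a list of dicts to data_dir/profiles_YYYYMMDD_HHMMSS.<fmt>."""
    if fmt not in ("csv", "json"):
        raise ValueError(f"unsupported format: {fmt}")
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"profiles_{stamp}.{fmt}"
    if fmt == "json":
        text = json.dumps(profiles, indent=2, ensure_ascii=False)
        path.write_text(text, encoding="utf-8")
    else:
        # Use the union of keys so rows with missing fields still export.
        fields = sorted({key for row in profiles for key in row})
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fields)
            writer.writeheader()
            writer.writerows(profiles)
    return path


rows = [{"name": "Jane Doe", "url": "https://www.linkedin.com/in/jane-doe"}]
with tempfile.TemporaryDirectory() as tmp_dir:
    out = export_profiles(rows, fmt="csv", data_dir=tmp_dir,
                          now=datetime(2024, 1, 2, 3, 4, 5))
    print(out.name)
```

A timestamp in the file name means repeated runs never overwrite earlier exports.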

⚠️ Legal & Ethical Considerations

  • Terms of Service: This tool is for educational purposes. Always comply with LinkedIn's Terms of Service.
  • Rate Limiting: Use appropriate delays to avoid overwhelming servers.
  • Privacy: Respect privacy. Only collect publicly available information.
  • Usage: Use this tool responsibly and ethically.
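
One way to apply the rate-limiting advice is to pause for a random interval between requests. The sketch below is illustrative only: the 2 to 5 second range is an assumed default, not the tool's configured value (see config.yaml for the real settings), and `fetch` is a placeholder for whatever function performs the request.

```python
import random
import time


def polite_delay(min_seconds=2.0, max_seconds=5.0, rng=random, sleep=time.sleep):
    """Sleep for a random duration in [min_seconds, max_seconds] and return it."""
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("expected 0 <= min_seconds <= max_seconds")
    duration = rng.uniform(min_seconds, max_seconds)
    sleep(duration)
    return duration


def fetch_all(urls, fetch, **delay_options):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            polite_delay(**delay_options)
        results.append(fetch(url))
    return results


# Demo: record the waits instead of actually sleeping.
waits = []
results = fetch_all(["a", "b", "c"], str.upper, sleep=waits.append)
print(results, len(waits))
```

Randomizing the interval spreads requests out more evenly than a fixed delay and avoids hitting a server at a regular cadence.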

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Built with Selenium, Typer, Rich, and Poetry.


📞 Support


โญ Show Your Support

If this project helped you, please consider:

  • โญ Starring the repository
  • ๐Ÿ› Reporting bugs
  • ๐Ÿ’ก Suggesting features
  • ๐Ÿค Contributing code
  • ๐Ÿ“ข Sharing with others

                                                |
                                                |
                                                |
                                                |
                                                |
                                                |
                                                |
                                    ____        |              ,
                                   /---.'.__    |        ____//
                                        '--.\   |       /.---'
                                   _______  \\  |      //
                                 /.------.\  \| |    .'/  ______
                                //  ___  \ \ ||/|\  //  _/_----.\__
                               |/  /.-.\  \ \:|< >|// _/.'..\   '--'
                                  //   \'. | \'.|.'/ /_/ /  \\
                                 //     \ \_\/" ' ~\-'.-'    \\
                                //       '-._| :H: |'-.__     \\
                               //           (/'==='\)'-._\     ||
                               ||                        \\    \|
                               ||                         \\    '
                               |/                          \\
                                                            ||
                                                            ||
                                                            \\
                                                             '
               ╔════════════════════════════════════════════════════╪═╪═╪══════════╗
               ║                                                                   ║
               ║    ██╗     ██╗███╗   ██╗██╗  ██╗███████╗██████╗ ██╗███╗   ██╗     ║
               ║    ██║     ██║████╗  ██║██║ ██╔╝██╔════╝██╔══██╗██║████╗  ██║     ║
               ║    ██║     ██║██╔██╗ ██║█████╔╝ █████╗  ██║  ██║██║██╔██╗ ██║     ║
               ║    ██║     ██║██║╚██╗██║██╔═██╗ ██╔══╝  ██║  ██║██║██║╚██╗██║     ║
               ║    ███████╗██║██║ ╚████║██║  ██╗███████╗██████╔╝██║██║ ╚████║     ║
               ║    ╚══════╝╚═╝╚═╝  ╚═══╝╚═╝  ╚═╝╚══════╝╚═════╝ ╚═╝╚═╝  ╚═══╝     ║
               ║                                                                   ║
               ║    ███████╗██████╗ ██╗██████╗ ███████╗██████╗      ^.-.^          ║
               ║    ██╔════╝██╔══██╗██║██╔══██╗██╔════╝██╔══██╗    '^\+/^`         ║
               ║    ███████╗██████╔╝██║██║  ██║█████╗  ██████╔╝    '/`"'\`         ║
               ║    ╚════██║██╔═══╝ ██║██║  ██║██╔══╝  ██╔══██╗                    ║
               ║    ███████║██║     ██║██████╔╝███████╗██║  ██║                    ║
               ║    ╚══════╝╚═╝     ╚═╝╚═════╝ ╚══════╝╚═╝  ╚═╝                    ║
               ║    ═══╪══╪═╪═╪═══╪════════════════════════════════════════════    ║
               ║                                                                   ║
               ║               Professional Network Profile Scraper                ║
               ║                ━━━ Weaving Through Networks ━━━                   ║
               ║                                                                   ║
               ╚═══════════════════════════════════════════════════════════════════╝

Made with ❤️ and 🐍 Python
Get LinkedIn profiles at scale

© 2022 LinkedIn Spider | MIT License
