A Python package for detecting and solving various captchas in Selenium-based web automation, supporting reCAPTCHA, Cloudflare Turnstile, and more.

Selenium Captcha Processing

A Python package for detecting and solving captchas in Selenium-based web automation. It supports identifying various captcha types, including reCAPTCHA, Cloudflare Turnstile, GeeTest, KeyCaptcha, Lemin Captcha, MTCaptcha, and unknown image-based captchas. Currently, solvers are implemented for reCAPTCHA and Cloudflare Turnstile, with community contributions welcome to expand solver support.

Features

  • Captcha Detection: Identifies multiple captcha types using detectors.
  • Captcha Solving: Includes solvers for reCAPTCHA and Cloudflare Turnstile.
  • Speech Recognition: Uses Google Speech API for audio-based captchas (requires FFmpeg).
  • Extensible Design: Easily extendable with new detectors and solvers via a modular architecture.
  • Community-Driven: Actively seeking contributions for additional captcha solvers.

Installation

Install the package via pip:

pip install selenium-captcha-processing

Prerequisites

  • Python: Version 3.10 or higher.
  • FFmpeg: Required for audio processing with pydub. Ensure FFmpeg is installed and added to your system's PATH. For installation instructions, see FFmpeg's official site. Example for Ubuntu:
    sudo apt update
    sudo apt install ffmpeg
    
    For Windows or macOS, download FFmpeg and add it to your PATH as described in your OS documentation.
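
A quick way to confirm the FFmpeg prerequisite before running any audio-based solving is to look the executable up on PATH from Python. The sketch below uses only the standard library; the helper names `find_ffmpeg` and `ffmpeg_version_line` are illustrative and not part of this package.

```python
import shutil
import subprocess
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable found on PATH, or None."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line(path: str) -> str:
    """Run `ffmpeg -version` and return the first line of its output (or "")."""
    completed = subprocess.run([path, "-version"], capture_output=True, text=True)
    lines = completed.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    ffmpeg_path = find_ffmpeg()
    if ffmpeg_path is None:
        print("FFmpeg was not found on PATH; audio captcha solving will not work.")
    else:
        print(f"Found FFmpeg at {ffmpeg_path}: {ffmpeg_version_line(ffmpeg_path)}")
```

If the script reports that FFmpeg is missing right after installation, opening a new terminal usually helps, since PATH changes are not picked up by shells that are already running.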

Dependencies

The package requires the following Python libraries, which are automatically installed:

  • speechrecognition>=3.14.3,<4.0.0
  • requests>=2.32.4,<3.0.0
  • selenium>=4.33.0,<5.0.0
  • pydub>=0.25.1,<0.26.0
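
To see which versions of these dependencies pip actually resolved in your environment, you can query the installed distribution metadata. This is a minimal sketch using the standard library's `importlib.metadata`; the `installed_versions` helper is an illustrative name, and it only reports versions without checking them against the ranges above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the dependency list above.
REQUIRED = ["speechrecognition", "requests", "selenium", "pydub"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, installed in installed_versions(REQUIRED).items():
        print(f"{name}: {installed or 'not installed'}")
```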

Usage

The example below opens a series of demo pages (plus one page without a captcha) in a Selenium WebDriver and calls BypassCaptcha.bypass() on each one to detect and attempt to solve any captcha present.

from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.webdriver import WebDriver
from selenium.webdriver.chrome.options import Options
from time import sleep
from selenium_captcha_processing import BypassCaptcha

# Initialize Selenium WebDriver
options = Options()
options.add_argument("--headless")  # Run in headless mode (optional)
driver = WebDriver(service=Service(), options=options)

# List of URLs to test (including demo captchas and a non-captcha page)
urls = [
    "https://www.google.com/recaptcha/api2/demo",
    "https://2captcha.com/demo/cloudflare-turnstile",
    "https://2captcha.com/demo/cloudflare-turnstile-challenge",
    "https://2captcha.com/demo/normal",
    "https://2captcha.com/demo/keycaptcha",
    "https://2captcha.com/demo/lemin",
    "https://2captcha.com/demo/text",
    "https://2captcha.com/demo/mtcaptcha",
    "https://en.wikipedia.org/wiki/Entropy_(information_theory)",
]

# Initialize captcha bypasser
bypassing = BypassCaptcha(driver)

try:
    for url in urls:
        print(f"Navigating to: {url}")
        driver.get(url)

        # Wait for page to load
        sleep(10)

        try:
            # Attempt to bypass captcha
            result = bypassing.bypass()
            print(f"Captcha bypassed: {result}")
        except Exception as e:
            # Selenium and speech recognition can fail for many reasons; report and continue
            print(f"Error processing captcha at {url}: {e}")

        # Wait before moving to the next URL
        sleep(5)

        print("-------------------------------------------")

finally:
    # Clean up: always close the browser, even if an error occurred
    driver.quit()
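
The fixed `sleep(10)` in the example above always waits the full ten seconds. One possible alternative is to poll until the browser reports that the document has finished loading. The sketch below is not part of this package: `wait_until` and `page_is_loaded` are illustrative helpers, and the only Selenium call used is `driver.execute_script`. Selenium's own `WebDriverWait` offers the same polling pattern if you prefer a built-in.

```python
import time


def wait_until(predicate, timeout=10.0, poll_interval=0.25):
    """Poll predicate() until it returns a truthy value or the timeout elapses.

    Returns True if the predicate succeeded, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


def page_is_loaded(driver):
    """Return True once the browser reports document.readyState == 'complete'."""
    return driver.execute_script("return document.readyState") == "complete"


# Usage inside the loop, in place of sleep(10):
#     driver.get(url)
#     wait_until(lambda: page_is_loaded(driver), timeout=10)
```

Captcha widgets are often injected by scripts after the document itself has loaded, so a short additional pause may still be needed before calling `bypass()`.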

Notes

  • Supported Captchas: The package can detect reCAPTCHA, Cloudflare Turnstile, GeeTest, KeyCaptcha, Lemin Captcha, MTCaptcha, and unknown image captchas. Solvers are currently available for reCAPTCHA and Cloudflare Turnstile only.
  • Exceptions: Be prepared to handle exceptions, as Selenium or utility functions (e.g., speech recognition) may fail due to network issues, missing elements, or invalid API responses. Wrap calls in try-except blocks as shown above.
  • FFmpeg: Audio captcha solving requires FFmpeg for pydub to process audio files. Ensure it's installed and in your PATH.
  • Google API: Audio captcha solving uses the Google Speech API, which requires a valid API key configured in Config.
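
Because failures from the network or the speech service are often transient, retrying a failed attempt a few times can be worthwhile. The helper below is a generic sketch, not something this package provides: `call_with_retries` is an illustrative name, and it assumes that the wrapped call (for example `bypassing.bypass`) returns a truthy value on success, which this document does not state explicitly.

```python
import time


def call_with_retries(action, attempts=3, delay=2.0):
    """Call action() up to `attempts` times and return its first truthy result.

    Exceptions are caught and reported. Returns None if every attempt either
    raised an exception or returned a falsy value.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = action()
            if result:
                return result
        except Exception as error:
            print(f"Attempt {attempt} of {attempts} failed: {error}")
        if attempt < attempts:
            time.sleep(delay)  # Pause before the next attempt
    return None


# Usage, assuming `bypassing` is the BypassCaptcha instance from the example:
#     result = call_with_retries(bypassing.bypass, attempts=3, delay=2.0)
```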

Community Contributions

This project supports a limited set of captcha solvers, and community contributions are crucial for expanding support to other captcha types (e.g., GeeTest, KeyCaptcha, Lemin Captcha, MTCaptcha). We warmly welcome pull requests and contributions! To contribute:

  1. Fork the repository: https://github.com/FINWAX/selenium-captcha-processing.py
  2. Create a new branch for your feature or bug fix.
  3. Implement a new detector or solver in the detectors or solvers directory, following the interfaces in detectors/interfaces/detector.py or solvers/interfaces/solver.py.
  4. Submit a pull request with a clear description of your changes.
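
As a rough idea of what step 3 might involve, here is a sketch of a detector for a captcha type the package does not list (hCaptcha). Everything about its shape is an assumption: the real interface lives in detectors/interfaces/detector.py and is not reproduced in this document, so the `detect(driver)` method, the `HypotheticalDetector` protocol, and the class name are hypothetical, and the CSS selector is a guess at how the widget's iframe can be recognized. Check the actual interface in the repository before writing a contribution.

```python
from typing import Protocol


class HypotheticalDetector(Protocol):
    """Hypothetical stand-in for the project's detector interface."""

    def detect(self, driver) -> bool:
        ...


class HCaptchaDetectorSketch:
    """Illustrative detector: reports whether an hCaptcha iframe is on the page."""

    # Assumption: the widget is embedded in an iframe served from hcaptcha.com.
    SELECTOR = "iframe[src*='hcaptcha.com']"

    def detect(self, driver) -> bool:
        # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
        frames = driver.find_elements("css selector", self.SELECTOR)
        return len(frames) > 0
```

Keeping detection to a single, cheap element lookup means a detector like this can be run on every page without noticeably slowing the automation down.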

Acknowledgments

This project builds upon ideas and code from the following repositories:

Issues and Support

Encounter a bug or have a feature request? Please open an issue at https://github.com/FINWAX/selenium-captcha-processing.py/issues.

License

This project is licensed under the MIT License. See the LICENSE file for details.
