Skip to main content

A Python library for automating interaction with websites

Project description

MechanicalSoup. A Python library for automating website interaction.

Home page

https://mechanicalsoup.readthedocs.io/

Overview

A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It doesn’t do JavaScript.

MechanicalSoup was created by M Hickford, who was a fond user of the Mechanize library. Unfortunately, Mechanize was incompatible with Python 3 until 2019 and its development stalled for several years. MechanicalSoup provides a similar API, built on Python giants Requests (for HTTP sessions) and BeautifulSoup (for document navigation). Since 2017 it is a project actively maintained by a small team including @hemberger and @moy.

Gitter Chat

Installation

Latest Version Supported Versions

PyPy3 is also supported (and tested against).

Download and install the latest released version from PyPI:

pip install MechanicalSoup

Download and install the development version from GitHub:

pip install git+https://github.com/MechanicalSoup/MechanicalSoup

Installing from source (installs the version in the current working directory):

python setup.py install

(In all cases, add --user to the install command to install in the current user’s home directory.)

Documentation

The full documentation is available on https://mechanicalsoup.readthedocs.io/. You may want to jump directly to the automatically generated API documentation.

Example

From examples/expl_qwant.py, code to get the results from a Qwant search:

"""Example usage of MechanicalSoup to get the results from the Qwant
search engine.
"""

import re
import mechanicalsoup
import html
import urllib.parse

# Connect to Qwant
browser = mechanicalsoup.StatefulBrowser(user_agent='MechanicalSoup')
browser.open("https://lite.qwant.com/")

# Fill-in the search form
browser.select_form('#search-form')
browser["q"] = "MechanicalSoup"
browser.submit_selected()

# Display the results
for link in browser.page.select('.result a'):
    # Qwant shows redirection links, not the actual URL, so extract
    # the actual URL from the redirect link:
    href = link.attrs['href']
    m = re.match(r"^/redirect/[^/]*/(.*)$", href)
    if m:
        href = urllib.parse.unquote(m.group(1))
    print(link.text, '->', href)

More examples are available in examples/.

For an example with a more complex form (checkboxes, radio buttons and textareas), read tests/test_browser.py and tests/test_form.py.

Development

Build Status Coverage Status Documentation Status CII Best Practices

Instructions for building, testing and contributing to MechanicalSoup: see CONTRIBUTING.rst.

Common problems

Read the FAQ.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

mechanicalsoup-1.4.0.tar.gz (51.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

mechanicalsoup-1.4.0-py3-none-any.whl (20.1 kB view details)

Uploaded Python 3

File details

Details for the file mechanicalsoup-1.4.0.tar.gz.

File metadata

  • Download URL: mechanicalsoup-1.4.0.tar.gz
  • Upload date:
  • Size: 51.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for mechanicalsoup-1.4.0.tar.gz
Algorithm Hash digest
SHA256 dce8a797635a748c4724471b5ebee5b1be42531a8464ed4a955e9a46cf612a4c
MD5 3448abb2e797e68bbb05bbb584ffe3f0
BLAKE2b-256 926cc7cdd19798444f0239a7b9e3ff6afea28bb73d5a040f8f9d369717d45077

See more details on using hashes here.

File details

Details for the file mechanicalsoup-1.4.0-py3-none-any.whl.

File metadata

  • Download URL: mechanicalsoup-1.4.0-py3-none-any.whl
  • Upload date:
  • Size: 20.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for mechanicalsoup-1.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 f460febdffabe24cc36b96dec420def2b30297c6d4663ab33c74e5b67787592d
MD5 e620d3a8e3c3ee13f636a4da754f5f54
BLAKE2b-256 4e61969eee3b34e39aa7810608e3d1932b555fa535cffa743cba6498b054453c

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page