Python package to detect bots/crawlers/spiders via user-agent

is-bot

A Python package that detects bots, crawlers, and spiders from the user-agent string. It is a port of the isbot JavaScript module.

Requirements

  • Python >= 3.7
  • regex >= 2022.8.17

Installation

pip install is-bot

Usage

Simple usage

from is_bot import Bots

bots = Bots()

ua = 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/104.0.5112.79 Safari/537.36'
assert bots.is_bot(ua)

ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'
assert not bots.is_bot(ua)
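
A typical use of the check above is to separate bot traffic from browser traffic in a batch of user agents. The following is a minimal sketch, not part of the package: `partition_user_agents` and `looks_like_bot` are hypothetical names, and the predicate is passed in as a parameter so the example runs without is-bot installed. In real use you would pass `Bots().is_bot` instead of the stand-in.

```python
from typing import Callable, Iterable, List, Tuple


def partition_user_agents(
    user_agents: Iterable[str],
    is_bot: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split user agents into (bots, humans) using the given predicate."""
    bots: List[str] = []
    humans: List[str] = []
    for ua in user_agents:
        (bots if is_bot(ua) else humans).append(ua)
    return bots, humans


def looks_like_bot(ua: str) -> bool:
    # Hypothetical stand-in for Bots().is_bot; far cruder than the real rules.
    return "bot" in ua.lower()


agents = [
    'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/104.0.5112.79 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36',
]

bot_agents, human_agents = partition_user_agents(agents, looks_like_bot)
print(len(bot_agents), len(human_agents))  # prints: 1 1
```

Taking the predicate as an argument keeps the helper independent of any one detector, so swapping in a customized `Bots` instance needs no code change.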

Add/remove parsing rules

from is_bot import Bots

bots = Bots()

# Exclude Chrome-Lighthouse from default bot list
ua = 'Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4695.0 Mobile Safari/537.36 Chrome-Lighthouse'
assert bots.is_bot(ua)
bots.exclude(['chrome-lighthouse'])
assert not bots.is_bot(ua)

# Add a custom browser to the default bot list
ua = 'SomeAwesomeBrowser/10.0 (Linux; Android 7.0)'
assert not bots.is_bot(ua)
bots.extend(['SomeAwesomeBrowser'])
assert bots.is_bot(ua)
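
To make the effect of `exclude` and `extend` concrete, here is a toy model of a mutable rule list built on the standard-library `re` module. It is a sketch under assumptions, not the package's implementation: the class name `TinyBots`, the two starting rules, exact-string removal in `exclude`, and case-insensitive matching are all choices made for illustration.

```python
import re
from typing import Iterable, List


class TinyBots:
    """Toy rule list for illustration only; NOT the is-bot implementation."""

    def __init__(self, rules: Iterable[str]) -> None:
        self._rules: List[str] = list(rules)
        self._compile()

    def _compile(self) -> None:
        # Join every rule into one case-insensitive alternation.
        # With no rules left, use a pattern that can never match.
        if self._rules:
            source = "|".join("(?:%s)" % rule for rule in self._rules)
        else:
            source = r"(?!)"
        self._pattern = re.compile(source, re.IGNORECASE)

    def extend(self, rules: Iterable[str]) -> None:
        # Append rules that are not already present, then recompile.
        for rule in rules:
            if rule not in self._rules:
                self._rules.append(rule)
        self._compile()

    def exclude(self, rules: Iterable[str]) -> None:
        # Assumption: rules are removed by exact string comparison.
        unwanted = set(rules)
        self._rules = [rule for rule in self._rules if rule not in unwanted]
        self._compile()

    def is_bot(self, ua: str) -> bool:
        return self._pattern.search(ua) is not None


tiny = TinyBots(["chrome-lighthouse", r"(?<! cu)bot"])

lighthouse_ua = 'Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4695.0 Mobile Safari/537.36 Chrome-Lighthouse'
assert tiny.is_bot(lighthouse_ua)
tiny.exclude(["chrome-lighthouse"])
assert not tiny.is_bot(lighthouse_ua)

browser_ua = 'SomeAwesomeBrowser/10.0 (Linux; Android 7.0)'
assert not tiny.is_bot(browser_ua)
tiny.extend(["SomeAwesomeBrowser"])
assert tiny.is_bot(browser_ua)
```

Recompiling after every change keeps lookups to a single regular-expression search, at the cost of a slightly slower `extend` or `exclude`.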

Get additional parsing information

from is_bot import Bots

bots = Bots()

ua = 'Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0'

# show the part of the user agent string that matched a bot rule
print(bots.find(ua))
#> Search

# list all patterns that match the user agent string
print(bots.matches(ua))
#> ['(?<! (ya|yandex))search', '(?<! cu)bot']
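
Both patterns printed above use a negative lookbehind: `(?<! cu)bot` matches "bot" unless it is directly preceded by " cu". The first pattern puts alternatives of different lengths inside the lookbehind, which the standard-library `re` module rejects; that is presumably why the package depends on `regex`. The sketch below reproduces the behaviour with `re` alone by rewriting that rule as two stacked fixed-width lookbehinds. The names `RULES`, `find_match`, and `matching_rules` are hypothetical, and case-insensitive matching is inferred from the `Search` output above rather than confirmed from the package source.

```python
import re
from typing import List, Optional

RULES = [
    # Fixed-width rewrite of '(?<! (ya|yandex))search' for the stdlib re module.
    r"(?<! ya)(?<! yandex)search",
    r"(?<! cu)bot",
]

# One combined pattern for locating the leftmost match in a user agent.
COMBINED = re.compile("|".join("(?:%s)" % rule for rule in RULES), re.IGNORECASE)


def find_match(ua: str) -> Optional[str]:
    """Return the leftmost substring matched by any rule, or None."""
    match = COMBINED.search(ua)
    return match.group(0) if match else None


def matching_rules(ua: str) -> List[str]:
    """Return every rule that matches somewhere in the user agent."""
    return [rule for rule in RULES if re.search(rule, ua, re.IGNORECASE)]


ua = 'Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0'
print(find_match(ua))           # prints: Search
print(len(matching_rules(ua)))  # prints: 2

# The lookbehinds suppress matches after the excluded prefixes.
print(find_match('Example/1.0 yandexsearch'))  # prints: None
print(find_match('Example/1.0 cubot'))         # prints: None
```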
