is-bot

A Python package that detects bots, crawlers, and spiders from the user-agent string. It is a port of the isbot JavaScript module.

Requirements

  • Python >= 3.8
  • regex >= 2022.8.17

Installation

pip install is-bot

Usage

Simple usage

from is_bot import Bots

bots = Bots()

ua = 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/104.0.5112.79 Safari/537.36'
assert bots.is_bot(ua)

ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'
assert not bots.is_bot(ua)
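
In practice the check usually sits in front of some per-request logic. The sketch below partitions a batch of user-agent strings with any callable shaped like `bots.is_bot`. The `partition_user_agents` helper and the `looks_like_bot` stand-in are hypothetical names, not part of is-bot; the stand-in exists only so the snippet runs without the package installed.

```python
from typing import Callable, Iterable, List, Tuple


def partition_user_agents(
    user_agents: Iterable[str],
    is_bot: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split user agents into (bots, humans) using the supplied predicate."""
    bots: List[str] = []
    humans: List[str] = []
    for ua in user_agents:
        (bots if is_bot(ua) else humans).append(ua)
    return bots, humans


def looks_like_bot(ua: str) -> bool:
    # Hypothetical stand-in predicate, far cruder than the package's rule list.
    return "bot" in ua.lower()


sample = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/104.0.0.0 Safari/537.36",
]
bot_uas, human_uas = partition_user_agents(sample, looks_like_bot)
```

With the package installed, passing `Bots().is_bot` in place of `looks_like_bot` should give the same split for these two strings.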

Add/remove parsing rules

from is_bot import Bots

bots = Bots()

# Exclude Chrome-Lighthouse from the default bot list
ua = 'Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4695.0 Mobile Safari/537.36 Chrome-Lighthouse'
assert bots.is_bot(ua)
bots.exclude(['chrome-lighthouse'])
assert not bots.is_bot(ua)

# Add a custom browser to the default bot list
ua = 'SomeAwesomeBrowser/10.0 (Linux; Android 7.0)'
assert not bots.is_bot(ua)
bots.extend(['SomeAwesomeBrowser'])
assert bots.is_bot(ua)
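
To make the effect of `exclude` and `extend` easier to picture, here is a minimal sketch of a detector built on a mutable list of regex rules. It is an illustration only, not the package's implementation: the `ToyBots` class, its two starting rules, and the use of the standard-library `re` module (rather than `regex`) are all assumptions made for this example.

```python
import re
from typing import Iterable, List


class ToyBots:
    """Illustrative pattern-list detector; not is-bot's actual code."""

    def __init__(self, patterns: Iterable[str]) -> None:
        self._patterns: List[str] = list(patterns)
        self._compile()

    def _compile(self) -> None:
        # Join every rule into one case-insensitive alternation.
        # An empty rule list is treated as "never matches".
        if self._patterns:
            joined = "|".join(f"(?:{p})" for p in self._patterns)
            self._regex = re.compile(joined, re.IGNORECASE)
        else:
            self._regex = None

    def is_bot(self, ua: str) -> bool:
        return bool(self._regex and self._regex.search(ua))

    def extend(self, rules: Iterable[str]) -> None:
        # Add new rules, skipping ones already present, then recompile.
        for rule in rules:
            if rule not in self._patterns:
                self._patterns.append(rule)
        self._compile()

    def exclude(self, rules: Iterable[str]) -> None:
        # Drop rules by case-insensitive equality, then recompile.
        drop = {rule.lower() for rule in rules}
        self._patterns = [p for p in self._patterns if p.lower() not in drop]
        self._compile()


toy = ToyBots(["chrome-lighthouse", r"(?<! cu)bot"])

lighthouse = "Mozilla/5.0 (Linux; Android 7.0) Chrome/98.0.4695.0 Mobile Safari/537.36 Chrome-Lighthouse"
before_exclude = toy.is_bot(lighthouse)
toy.exclude(["chrome-lighthouse"])
after_exclude = toy.is_bot(lighthouse)

custom = "SomeAwesomeBrowser/10.0 (Linux; Android 7.0)"
before_extend = toy.is_bot(custom)
toy.extend(["SomeAwesomeBrowser"])
after_extend = toy.is_bot(custom)
```

Recompiling after every change keeps `is_bot` to a single regex search, at the cost of a slightly more expensive `extend`/`exclude`.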

Get additional parsing information

from is_bot import Bots

bots = Bots()

ua = 'Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0'

# show the part of the user agent that matched a bot rule
print(bots.find(ua))
#> Search

# list all patterns that match the user agent string
print(bots.matches(ua))
#> ['(?<! (ya|yandex))search', '(?<! cu)bot']
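
The difference between the two calls can be sketched with plain regular expressions: one returns the first matched substring, the other lists every rule that matches. The functions `find_match` and `matching_rules` and the `RULES` list below are hypothetical, written for this example rather than taken from the package. Note that the `(?<! (ya|yandex))` lookbehind in the output above is variable-width, which the standard-library `re` module rejects; that is presumably one reason the package depends on `regex`. The sketch uses fixed-width lookbehinds so it runs on `re` alone.

```python
import re
from typing import List, Optional, Sequence

# Hypothetical rule list with fixed-width lookbehinds only.
RULES: List[str] = [r"(?<! yandex)search", r"(?<! cu)bot"]


def find_match(ua: str, rules: Sequence[str] = RULES) -> Optional[str]:
    """Return the first substring of `ua` matched by any rule, or None."""
    combined = re.compile("|".join(f"(?:{r})" for r in rules), re.IGNORECASE)
    match = combined.search(ua)
    return match.group(0) if match else None


def matching_rules(ua: str, rules: Sequence[str] = RULES) -> List[str]:
    """Return every rule whose pattern is found somewhere in `ua`."""
    return [r for r in rules if re.search(r, ua, re.IGNORECASE)]


ua = "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0"
first_hit = find_match(ua)       # the leftmost matched text
all_rules = matching_rules(ua)   # every rule that fires on this string
```

A listing like `all_rules` is mainly useful when deciding which rule to pass to `exclude` for a user agent that is being misclassified.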

Download files

Source Distribution

is_bot-0.3.4.tar.gz (14.2 kB view details)

Uploaded Source

Built Distribution

is_bot-0.3.4-py3-none-any.whl (12.5 kB view details)

Uploaded Python 3

File details

Details for the file is_bot-0.3.4.tar.gz.

File metadata

  • Download URL: is_bot-0.3.4.tar.gz
  • Size: 14.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for is_bot-0.3.4.tar.gz
Algorithm Hash digest
SHA256 ee7486b6a56d39adf16aa96fdcefce3576a7f4aee50acf71d166be3ecb86f3ea
MD5 29524345f43fd3d0267fd7eed75596e8
BLAKE2b-256 3350f42e60a4173e9c2a5a59d9757724e8323584e8623dc5aeee2dcc073fd2db

File details

Details for the file is_bot-0.3.4-py3-none-any.whl.

File metadata

  • Download URL: is_bot-0.3.4-py3-none-any.whl
  • Size: 12.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.5

File hashes

Hashes for is_bot-0.3.4-py3-none-any.whl
Algorithm Hash digest
SHA256 4a84c5b5ab57de5fd23f9281e5f9d508fabe65a2bab866950299f6d490d1b577
MD5 5cb2cd7ef6488acb8bea98878744f2d9
BLAKE2b-256 20bcf865c0b6cdecdcf7fa965e5c2f7e307d723aea0d72923e5e90ace541cc25
