is-bot

A Python package that detects bots, crawlers, and spiders from the user-agent string. It is a port of the isbot JavaScript module.

Requirements

  • Python >= 3.8
  • regex >= 2022.8.17

Installation

pip install is-bot

Usage

Simple usage

from is_bot import Bots

bots = Bots()

ua = 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/104.0.5112.79 Safari/537.36'
assert bots.is_bot(ua)

ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'
assert not bots.is_bot(ua)
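
In practice the check usually runs over many requests, some of which carry no user-agent header at all. The following is a minimal sketch of that pattern, not part of the package: `classify_requests` and `fake_is_bot` are hypothetical names, and the predicate is passed in so the example runs without is-bot installed. In real use, create one `Bots()` instance and pass its `is_bot` method.

```python
from typing import Callable, Dict, Iterable, Optional


def classify_requests(
    user_agents: Iterable[Optional[str]],
    is_bot: Callable[[str], bool],
) -> Dict[str, int]:
    """Count bot, human, and unknown requests (hypothetical helper)."""
    counts = {"bot": 0, "human": 0, "unknown": 0}
    for ua in user_agents:
        if not ua:
            # Missing or empty header: do not guess either way.
            counts["unknown"] += 1
        elif is_bot(ua):
            counts["bot"] += 1
        else:
            counts["human"] += 1
    return counts


def fake_is_bot(ua: str) -> bool:
    """Stand-in predicate for this sketch; replace with Bots().is_bot."""
    return "bot" in ua.lower()


sample = [
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
    "Googlebot/2.1; +http://www.google.com/bot.html) "
    "Chrome/104.0.5112.79 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36",
    None,
]
result = classify_requests(sample, fake_is_bot)
print(result)
```

Treating a missing header as its own category is a design choice of this sketch; whether such requests count as bots depends on the application.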

Add/remove parsing rules

from is_bot import Bots

bots = Bots()

# Exclude Chrome-Lighthouse from default bot list
ua = 'Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4695.0 Mobile Safari/537.36 Chrome-Lighthouse'
assert bots.is_bot(ua)
bots.exclude(['chrome-lighthouse'])
assert not bots.is_bot(ua)

# Add a custom browser to the default bot list
ua = 'SomeAwesomeBrowser/10.0 (Linux; Android 7.0)'
assert not bots.is_bot(ua)
bots.extend(['SomeAwesomeBrowser'])
assert bots.is_bot(ua)
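
To make the effect of `exclude` and `extend` concrete, here is a toy model of a mutable rule list. It is an assumption-laden sketch and not the is_bot implementation: `ToyBots`, its two-entry rule list, and the exact-string removal in `exclude` are all invented for illustration, and the real package may store and compile its rules differently.

```python
import re


class ToyBots:
    """Toy stand-in for a rule-based detector (not the is_bot class)."""

    def __init__(self, patterns):
        self._patterns = list(patterns)
        self._compile()

    def _compile(self):
        # Join the current rules into one case-insensitive alternation.
        joined = "|".join(self._patterns)
        self._regex = re.compile(joined, re.IGNORECASE) if joined else None

    def is_bot(self, ua):
        return bool(self._regex and self._regex.search(ua))

    def exclude(self, patterns):
        # Drop the given rules, then rebuild the combined regex.
        drop = set(patterns)
        self._patterns = [p for p in self._patterns if p not in drop]
        self._compile()

    def extend(self, patterns):
        # Append rules that are not already present, then rebuild.
        self._patterns += [p for p in patterns if p not in self._patterns]
        self._compile()


toy = ToyBots(["chrome-lighthouse", "bot"])

lighthouse_ua = (
    "Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/98.0.4695.0 Mobile Safari/537.36 "
    "Chrome-Lighthouse"
)
before_exclude = toy.is_bot(lighthouse_ua)
toy.exclude(["chrome-lighthouse"])
after_exclude = toy.is_bot(lighthouse_ua)

browser_ua = "SomeAwesomeBrowser/10.0 (Linux; Android 7.0)"
before_extend = toy.is_bot(browser_ua)
toy.extend(["SomeAwesomeBrowser"])
after_extend = toy.is_bot(browser_ua)
```

In this model both calls change the instance they are called on, which matches the assertions in the example above: the same `bots` object gives a different answer before and after each call.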

Get additional parsing information

from is_bot import Bots

bots = Bots()

ua = 'Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0'

# show the part of the user agent that matched a bot rule
print(bots.find(ua))
#> Search

# list all patterns that match the user agent string
print(bots.matches(ua))
#> ['(?<! (ya|yandex))search', '(?<! cu)bot']
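
The two patterns listed above use negative lookbehind: `search` counts unless it is preceded by ` ya` or ` yandex`, and `bot` counts unless it is preceded by ` cu`. The sketch below reproduces that behaviour with the standard-library `re` module, assuming case-insensitive matching (which the `Search` result above suggests). Because `re` rejects variable-width lookbehind such as `(?<! (ya|yandex))`, the first pattern is rewritten as two fixed-width lookbehinds; this is probably one reason the package depends on `regex`. `PATTERNS` and `matching_patterns` are illustrative names, not part of is_bot.

```python
import re

# Stdlib-compatible rewrites of the two patterns shown above.
PATTERNS = [
    r"(?<! ya)(?<! yandex)search",  # same intent as (?<! (ya|yandex))search
    r"(?<! cu)bot",
]


def matching_patterns(ua):
    """Return every pattern in PATTERNS that matches the user agent."""
    return [p for p in PATTERNS if re.search(p, ua, re.IGNORECASE)]


ua = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 "
    "Firefox/52.0 SearchRobot/1.0"
)
both = matching_patterns(ua)

# " ya" directly before "search" triggers the lookbehind, so nothing matches.
blocked = matching_patterns("App yasearch/1.0")

# The substring matched by the first pattern in the original user agent.
first_match = re.search(PATTERNS[0], ua, re.IGNORECASE).group(0)
print(first_match)
```

The lookbehinds keep ordinary words from triggering a rule: a plain substring test for `search` would flag the `App yasearch/1.0` string, while the guarded pattern does not.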
