Python package to detect bots/crawlers/spiders via user-agent
is-bot
Python package to detect bots/crawlers/spiders via user-agent string. This is a port of the isbot JavaScript module.
Requirements
- Python >= 3.8
- regex >= 2022.8.17
Installation
pip install is-bot
Usage
Simple usage
from is_bot import Bots
bots = Bots()
ua = 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/104.0.5112.79 Safari/537.36'
assert bots.is_bot(ua)
ua = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36'
assert not bots.is_bot(ua)
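When many requests are processed at once, the check above is usually applied as a predicate. The sketch below shows one way to partition a batch of user-agent strings. `split_traffic` and `looks_like_bot` are hypothetical names that are not part of is-bot, and `looks_like_bot` is only a crude stand-in so the example runs without the package installed.

```python
from typing import Callable, Iterable, List, Tuple


def split_traffic(
    user_agents: Iterable[str],
    is_bot: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Partition user-agent strings into (bots, humans) using the given predicate."""
    bots: List[str] = []
    humans: List[str] = []
    for ua in user_agents:
        (bots if is_bot(ua) else humans).append(ua)
    return bots, humans


def looks_like_bot(ua: str) -> bool:
    # Hypothetical stand-in for illustration only; it is not the package's rule list.
    return "bot" in ua.lower()


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
chrome = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36"
)

bots_seen, humans_seen = split_traffic([googlebot, chrome], looks_like_bot)
```

With the package installed, passing `Bots().is_bot` in place of `looks_like_bot` should work the same way, since it takes a user-agent string and its result is used as a truth value in the examples above.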
Add/remove parsing rules
from is_bot import Bots
bots = Bots()
# Exclude Chrome-Lighthouse from the default bot list
ua = 'Mozilla/5.0 (Linux; Android 7.0; Moto G (4)) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4695.0 Mobile Safari/537.36 Chrome-Lighthouse'
assert bots.is_bot(ua)
bots.exclude(['chrome-lighthouse'])
assert not bots.is_bot(ua)
# Add a custom browser to the default bot list
ua = 'SomeAwesomeBrowser/10.0 (Linux; Android 7.0)'
assert not bots.is_bot(ua)
bots.extend(['SomeAwesomeBrowser'])
assert bots.is_bot(ua)
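To make the effect of `exclude` and `extend` concrete, here is a minimal sketch of a mutable rule list built on the standard-library `re` module. `PatternList` is a hypothetical class written for illustration. It assumes rules are case-insensitive regular expressions joined into one alternation, which may differ from how is-bot stores and matches its rules.

```python
import re
from typing import Iterable, List, Optional


class PatternList:
    """Hypothetical stand-in for a bot rule list; not the is-bot implementation."""

    def __init__(self, patterns: Iterable[str]) -> None:
        self._patterns: List[str] = list(patterns)
        self._regex: Optional[re.Pattern] = None
        self._compile()

    def _compile(self) -> None:
        # An empty alternation would match everything, so keep None instead.
        if self._patterns:
            self._regex = re.compile("|".join(self._patterns), re.IGNORECASE)
        else:
            self._regex = None

    def extend(self, patterns: Iterable[str]) -> None:
        # Add rules that are not already present, then rebuild the regex.
        for pattern in patterns:
            if pattern not in self._patterns:
                self._patterns.append(pattern)
        self._compile()

    def exclude(self, patterns: Iterable[str]) -> None:
        # Drop rules by case-insensitive exact match on the rule text.
        drop = {pattern.lower() for pattern in patterns}
        self._patterns = [p for p in self._patterns if p.lower() not in drop]
        self._compile()

    def is_bot(self, user_agent: str) -> bool:
        return bool(self._regex and self._regex.search(user_agent))


rules = PatternList(["chrome-lighthouse", "bot"])
lighthouse = "Mozilla/5.0 (Linux; Android 7.0) Mobile Safari/537.36 Chrome-Lighthouse"
before = rules.is_bot(lighthouse)   # rule present, so it matches
rules.exclude(["chrome-lighthouse"])
after = rules.is_bot(lighthouse)    # rule removed, nothing else matches
```

Recompiling after every change keeps `is_bot` down to a single regex search, at the cost of slightly slower `extend` and `exclude` calls.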
Get additional parsing information
from is_bot import Bots
bots = Bots()
ua = 'Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0'
# show the part of the user agent string that matched a bot rule
print(bots.find(ua))
#> Search
# list all patterns that match the user agent string
print(bots.matches(ua))
#> ['(?<! (ya|yandex))search', '(?<! cu)bot']
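The two patterns in the output above rely on negative lookbehinds, so that `bot` and `search` count only when they are not preceded by certain prefixes. The sketch below reproduces that behaviour with the standard-library `re` module. `first_match` and `matching_patterns` are hypothetical helpers, not the package's functions. Because `re` rejects a lookbehind whose alternatives differ in length, the first pattern is rewritten here as two fixed-width lookbehinds, and case-insensitive matching is an assumption based on the `Search` result above.

```python
import re
from typing import List, Optional

# Rewritten from '(?<! (ya|yandex))search' so each lookbehind has a fixed width.
PATTERNS = [r"(?<! ya)(?<! yandex)search", r"(?<! cu)bot"]


def matching_patterns(user_agent: str, patterns: List[str] = PATTERNS) -> List[str]:
    """Return every pattern that matches somewhere in the user agent."""
    return [p for p in patterns if re.search(p, user_agent, re.IGNORECASE)]


def first_match(user_agent: str, patterns: List[str] = PATTERNS) -> Optional[str]:
    """Return the leftmost substring matched by any pattern, or None."""
    combined = re.compile("|".join(patterns), re.IGNORECASE)
    match = combined.search(user_agent)
    return match.group(0) if match else None


ua = "Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 SearchRobot/1.0"
found = first_match(ua)          # leftmost hit is the 'Search' in 'SearchRobot'
hits = matching_patterns(ua)     # both rules match this user agent

# ' CU' directly before 'BOT' triggers the lookbehind, so this is not flagged.
cubot_phone = "Mozilla/5.0 (Linux; Android 9; CUBOT_X19)"
cubot_hits = matching_patterns(cubot_phone)
```

Splitting one variable-width lookbehind into consecutive fixed-width ones is equivalent, because the match is rejected if either prefix appears immediately before the keyword.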
Download files
Source Distribution
is_bot-0.3.4.tar.gz (14.2 kB)
Built Distribution
is_bot-0.3.4-py3-none-any.whl (12.5 kB)
File details
Details for the file is_bot-0.3.4.tar.gz.
File metadata
- Download URL: is_bot-0.3.4.tar.gz
- Upload date:
- Size: 14.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ee7486b6a56d39adf16aa96fdcefce3576a7f4aee50acf71d166be3ecb86f3ea |
| MD5 | 29524345f43fd3d0267fd7eed75596e8 |
| BLAKE2b-256 | 3350f42e60a4173e9c2a5a59d9757724e8323584e8623dc5aeee2dcc073fd2db |
File details
Details for the file is_bot-0.3.4-py3-none-any.whl.
File metadata
- Download URL: is_bot-0.3.4-py3-none-any.whl
- Upload date:
- Size: 12.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4a84c5b5ab57de5fd23f9281e5f9d508fabe65a2bab866950299f6d490d1b577 |
| MD5 | 5cb2cd7ef6488acb8bea98878744f2d9 |
| BLAKE2b-256 | 20bcf865c0b6cdecdcf7fa965e5c2f7e307d723aea0d72923e5e90ace541cc25 |