
Provides Python access to Google's parser for robots.txt files, as used by their GoogleBot web scraper.

Project description


jwm.robotstxt

Python Wrapper for Google's Robotstxt Parser


Websites may provide an optional robots.txt file in their domain's root to govern the access and behavior of web scrapers. GoogleBot, one of the most famous web scrapers, is responsible for promoting this standard, and sites interested in SEO will closely conform to GoogleBot's behavior.
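Because the file sits at the root of the host, its URL can be derived from any page URL on the same site. The following is a minimal sketch using only the standard library; the helper name robots_txt_url is made up for illustration and is not part of jwm.robotstxt.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url.

    Hypothetical helper for illustration; not part of jwm.robotstxt.
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any port); drop path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/post?id=1"))
```

The body downloaded from that URL is the text that would be handed to the parser.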

All credit for the parser goes to the Google team who created, open sourced and promoted it.

SEO (Search Engine Optimization): The process of modifying a website's content or metadata to boost its ranking in search engines' page indexes. Higher rankings lead to higher positions in user searches, which in turn leads to more visitors. For further details, see the SEO Wikipedia page.

Usage

Basic usage with the RobotsMatcher class provided by Google:

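import jwm.robotstxt.googlebot

# robots.txt body to evaluate; "allow" is the standard directive name.
robotstxt = """
user-agent: GoodBot
allow: /path
"""

# Check whether the GoodBot user agent may fetch /path.
matcher = jwm.robotstxt.googlebot.RobotsMatcher()
assert matcher.AllowedByRobots(robotstxt, ("GoodBot",), "/path")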

Check out the documentation for further details. For more use cases, see the test cases for jwm.robotstxt and robotstxt.
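The same call can also be used to test a path that a rule blocks. The sketch below relies only on the RobotsMatcher and AllowedByRobots names shown above; the is_allowed helper, the ROBOTS_TXT body and the import guard are assumptions added for illustration, so the snippet still loads when the package is not installed.

```python
try:
    import jwm.robotstxt.googlebot as googlebot
except ImportError:
    # Assumption for this sketch: fall back gracefully if the package is missing.
    googlebot = None

# Example robots.txt body (made up for illustration).
ROBOTS_TXT = """
user-agent: GoodBot
disallow: /private
"""


def is_allowed(user_agent: str, url: str):
    """Return the matcher's verdict, or None when jwm.robotstxt is unavailable.

    Hypothetical convenience wrapper; not part of jwm.robotstxt.
    """
    if googlebot is None:
        return None
    matcher = googlebot.RobotsMatcher()
    return matcher.AllowedByRobots(ROBOTS_TXT, (user_agent,), url)


print(is_allowed("GoodBot", "/private/page"))
print(is_allowed("GoodBot", "/public"))
```

Under standard robots.txt semantics the first path should be refused and the second permitted, but the package's own test cases remain the authority on edge cases.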

Installation

Install from PyPI under the jwm.robotstxt distribution name.

pip install jwm.robotstxt

Import into your program through the jwm.robotstxt.googlebot package.

import jwm.robotstxt.googlebot
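To confirm that the distribution was installed into the interpreter you are running, the standard library's importlib.metadata can be queried. This is a small sketch; the installed_version helper is a hypothetical name used only here.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if the install succeeded, otherwise None.
print(installed_version("jwm.robotstxt"))
```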

Virtual Environment

It is highly recommended to install Python projects into a virtual environment; see PEP 405 for the motivation.

Create a virtual environment in the .venv directory.

python3 -m venv ./.venv

Activate with the correct command for your system.

# Linux/macOS
. ./.venv/bin/activate
# Windows
.\.venv\Scripts\activate
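After activating, it can be useful to check that Python really is running from the environment. A minimal sketch, assuming a venv-style environment as described in PEP 405; the function name is illustrative:

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside such an environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print(in_virtual_environment())
```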

Installing from source

Make sure you have cloned the repository and its submodules.

git clone --recurse-submodules https://github.com/jwmorley73/jwm.robotstxt.git

Install the project using pip. This will build the required robotstxt static library files and link them into the produced Python package.

pip install .

If you want to include the developer tooling, add the dev optional dependencies.

pip install ".[dev]"

Known Issues

  • 32-bit Windows is not supported.
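A program that depends on this package could detect the unsupported case up front. The sketch below is one way to do that with the standard library; is_32bit_windows is a hypothetical helper. It reports on the running interpreter, so a 32-bit Python on 64-bit Windows is also flagged.

```python
import struct
import sys


def is_32bit_windows() -> bool:
    """Return True when running a 32-bit Python interpreter on Windows."""
    # The size of a C pointer tells us whether the interpreter is 32- or 64-bit.
    pointer_bits = struct.calcsize("P") * 8
    # sys.platform is "win32" on Windows regardless of bitness.
    return sys.platform == "win32" and pointer_bits == 32


if is_32bit_windows():
    print("jwm.robotstxt does not support 32-bit Windows")
```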

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • jwm_robotstxt-1.0.7.tar.gz (43.2 kB): Source

Built Distributions

  • jwm.robotstxt-1.0.7-pp310-pypy310_pp73-win_amd64.whl (146.9 kB): PyPy, Windows x86-64
  • jwm.robotstxt-1.0.7-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (221.6 kB): PyPy, manylinux: glibc 2.17+ x86-64
  • jwm.robotstxt-1.0.7-pp310-pypy310_pp73-manylinux_2_17_i686.manylinux2014_i686.whl (231.2 kB): PyPy, manylinux: glibc 2.17+ i686
  • jwm.robotstxt-1.0.7-pp310-pypy310_pp73-macosx_11_0_arm64.whl (182.8 kB): PyPy, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.7-pp310-pypy310_pp73-macosx_10_15_x86_64.whl (185.6 kB): PyPy, macOS 10.15+ x86-64
  • jwm.robotstxt-1.0.7-cp312-cp312-win_amd64.whl (144.6 kB): CPython 3.12, Windows x86-64
  • jwm.robotstxt-1.0.7-cp312-cp312-musllinux_1_2_x86_64.whl (1.2 MB): CPython 3.12, musllinux: musl 1.2+ x86-64
  • jwm.robotstxt-1.0.7-cp312-cp312-musllinux_1_2_i686.whl (1.3 MB): CPython 3.12, musllinux: musl 1.2+ i686
  • jwm.robotstxt-1.0.7-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (226.0 kB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • jwm.robotstxt-1.0.7-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl (233.8 kB): CPython 3.12, manylinux: glibc 2.17+ i686
  • jwm.robotstxt-1.0.7-cp312-cp312-macosx_11_0_arm64.whl (180.0 kB): CPython 3.12, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.7-cp312-cp312-macosx_10_9_x86_64.whl (180.8 kB): CPython 3.12, macOS 10.9+ x86-64
  • jwm.robotstxt-1.0.7-cp311-cp311-win_amd64.whl (140.6 kB): CPython 3.11, Windows x86-64
  • jwm.robotstxt-1.0.7-cp311-cp311-musllinux_1_2_x86_64.whl (1.2 MB): CPython 3.11, musllinux: musl 1.2+ x86-64
  • jwm.robotstxt-1.0.7-cp311-cp311-musllinux_1_2_i686.whl (1.3 MB): CPython 3.11, musllinux: musl 1.2+ i686
  • jwm.robotstxt-1.0.7-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (222.5 kB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • jwm.robotstxt-1.0.7-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl (229.9 kB): CPython 3.11, manylinux: glibc 2.17+ i686
  • jwm.robotstxt-1.0.7-cp311-cp311-macosx_11_0_arm64.whl (176.4 kB): CPython 3.11, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.7-cp311-cp311-macosx_10_9_x86_64.whl (176.9 kB): CPython 3.11, macOS 10.9+ x86-64
  • jwm.robotstxt-1.0.7-cp310-cp310-win_amd64.whl (136.7 kB): CPython 3.10, Windows x86-64
  • jwm.robotstxt-1.0.7-cp310-cp310-musllinux_1_2_x86_64.whl (1.2 MB): CPython 3.10, musllinux: musl 1.2+ x86-64
  • jwm.robotstxt-1.0.7-cp310-cp310-musllinux_1_2_i686.whl (1.3 MB): CPython 3.10, musllinux: musl 1.2+ i686
  • jwm.robotstxt-1.0.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (218.1 kB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • jwm.robotstxt-1.0.7-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl (225.8 kB): CPython 3.10, manylinux: glibc 2.17+ i686
  • jwm.robotstxt-1.0.7-cp310-cp310-macosx_11_0_arm64.whl (172.2 kB): CPython 3.10, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.7-cp310-cp310-macosx_10_9_x86_64.whl (172.7 kB): CPython 3.10, macOS 10.9+ x86-64
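The wheel filenames above encode which interpreter and platform each file targets, in the order distribution, version, Python tag, ABI tag, platform tag. As a rough sketch for reading them, the hypothetical parse_wheel_filename helper below splits a name into those fields. It assumes no optional build tag is present, which holds for the files listed here; pip does this selection automatically, so the code is only meant to explain the naming.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its five dash-separated fields.

    Hypothetical helper; assumes the filename has no build tag.
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) != 5:
        raise ValueError(f"expected five dash-separated fields: {filename!r}")
    return WheelTags(*parts)


tags = parse_wheel_filename("jwm.robotstxt-1.0.7-cp312-cp312-win_amd64.whl")
print(tags.python_tag, tags.platform_tag)
```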
