Provides Python access to Google's parser for robots.txt files, as used by their GoogleBot web crawler.

jwm.robotstxt

Python Wrapper for Google's Robotstxt Parser

Websites may provide an optional robots.txt file in their domain's root to govern the access and behavior of web scrapers. GoogleBot, one of the most famous web scrapers, is responsible for promoting this standard, and sites interested in SEO will closely conform to GoogleBot's behavior.

All credit for the parser goes to the Google team who created, open-sourced, and promoted it.

SEO (Search Engine Optimization): The process of modifying a website's content or metadata to boost its ranking in search engines' page indexes. Higher rankings lead to higher positions in user searches, which leads to more visitors. For further details, see the SEO Wikipedia page.

Usage

Basic usage using the RobotsMatcher class provided by Google.

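import jwm.robotstxt.googlebot

# A minimal robots.txt: one group for GoodBot with a single allow rule.
robotstxt = """
user-agent: GoodBot
allow: /path
"""

matcher = jwm.robotstxt.googlebot.RobotsMatcher()
# Arguments: the robots.txt body, the user agents to match, and the URL to test.
assert matcher.AllowedByRobots(robotstxt, ("GoodBot",), "/path")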

Check out the documentation for further details. For more use cases, see the test cases for jwm.robotstxt and robotstxt.
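The same call can be reused to screen a batch of URLs. The following is a minimal sketch rather than part of the package: filter_allowed is a hypothetical helper name, and it assumes only the AllowedByRobots(robotstxt, user_agents, url) call shown in the usage example, so any object exposing that method will work.

```python
from typing import Iterable, List


def filter_allowed(matcher, robotstxt: str, user_agent: str, urls: Iterable[str]) -> List[str]:
    """Return the URLs that the matcher reports as allowed for user_agent.

    Assumption: `matcher` exposes AllowedByRobots(robotstxt, user_agents, url),
    as jwm.robotstxt.googlebot.RobotsMatcher does in the usage example.
    """
    return [
        url
        for url in urls
        # The user agents are passed as a tuple, mirroring the usage example.
        if matcher.AllowedByRobots(robotstxt, (user_agent,), url)
    ]
```

With the package installed, pass jwm.robotstxt.googlebot.RobotsMatcher() as the matcher argument. Taking the matcher as a parameter keeps the helper easy to exercise with a stand-in object in tests that do not have the native extension available.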

Installation

TODO: Set up the release pipeline to PyPI. The steps below describe the expected workflow.

Install from PyPI under the jwm.robotstxt distribution name.

pip install jwm.robotstxt

Import into your program through the jwm.robotstxt.googlebot package.

import jwm.robotstxt.googlebot
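To confirm that the installation worked, the standard library can report whether the module is importable and which version of the distribution is installed. This is a small sketch using only importlib; the helper names is_importable and installed_version are hypothetical, not part of jwm.robotstxt.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def is_importable(module_name: str) -> bool:
    """Return True if module_name can be located by the import system."""
    try:
        # For dotted names find_spec imports the parent package first and
        # raises ModuleNotFoundError (an ImportError) when it is missing.
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        return False


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print("importable:", is_importable("jwm.robotstxt.googlebot"))
print("version:", installed_version("jwm.robotstxt"))
```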

Virtual Environment

It is highly recommended to install Python projects into a virtual environment; see PEP 405 for the motivation.

Create a virtual environment in the .venv directory.

python3 -m venv ./.venv

Activate with the correct command for your system.

# Linux/macOS
. ./.venv/bin/activate
# Windows
.\.venv\Scripts\activate
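After activating, it can be useful to check from Python that the interpreter really is running inside the virtual environment. A minimal sketch, assuming an environment created with the venv module as above (in_virtual_environment is a hypothetical helper name):

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a virtual environment sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
```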

Installing from source

Make sure you have cloned the repository and its submodules.

git clone --recurse-submodules https://github.com/jwmorley73/jwm.robotstxt.git

Install the project using pip. This will build the required robotstxt static library files and link them into the produced Python package.

pip install .

If you want to include the developer tooling, add the dev optional dependencies.

pip install ".[dev]"

Known Issues

  • 32-bit Windows is not supported.
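A program that depends on this package may want to detect the unsupported platform up front and fail with a clear message. The sketch below is one way to do that with the standard library; is_32bit_windows is a hypothetical helper, not something the package provides.

```python
import struct
import sys


def is_32bit_windows(
    platform: str = sys.platform,
    pointer_bits: int = struct.calcsize("P") * 8,
) -> bool:
    """Return True for a 32-bit Python interpreter running on Windows."""
    # sys.platform is "win32" on every Windows build, 32-bit or 64-bit,
    # so the pointer size is what distinguishes the two.
    return platform == "win32" and pointer_bits == 32


if is_32bit_windows():
    print("jwm.robotstxt does not support 32-bit Windows.")
```

The parameters default to the running interpreter's values, which keeps the check easy to test for other platforms.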

Download files

Source Distribution

  • jwm_robotstxt-1.0.5.tar.gz (11.4 kB): source

Built Distributions

  • jwm.robotstxt-1.0.5-pp310-pypy310_pp73-win_amd64.whl (95.9 kB): PyPy, Windows x86-64
  • jwm.robotstxt-1.0.5-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (170.1 kB): PyPy, manylinux (glibc 2.17+) x86-64
  • jwm.robotstxt-1.0.5-pp310-pypy310_pp73-manylinux_2_17_i686.manylinux2014_i686.whl (179.8 kB): PyPy, manylinux (glibc 2.17+) i686
  • jwm.robotstxt-1.0.5-pp310-pypy310_pp73-macosx_11_0_arm64.whl (131.3 kB): PyPy, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.5-pp310-pypy310_pp73-macosx_10_15_x86_64.whl (134.1 kB): PyPy, macOS 10.15+ x86-64
  • jwm.robotstxt-1.0.5-cp312-cp312-win_amd64.whl (97.4 kB): CPython 3.12, Windows x86-64
  • jwm.robotstxt-1.0.5-cp312-cp312-musllinux_1_2_x86_64.whl (1.1 MB): CPython 3.12, musllinux (musl 1.2+) x86-64
  • jwm.robotstxt-1.0.5-cp312-cp312-musllinux_1_2_i686.whl (1.2 MB): CPython 3.12, musllinux (musl 1.2+) i686
  • jwm.robotstxt-1.0.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (178.4 kB): CPython 3.12, manylinux (glibc 2.17+) x86-64
  • jwm.robotstxt-1.0.5-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl (186.2 kB): CPython 3.12, manylinux (glibc 2.17+) i686
  • jwm.robotstxt-1.0.5-cp312-cp312-macosx_11_0_arm64.whl (132.3 kB): CPython 3.12, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.5-cp312-cp312-macosx_10_9_x86_64.whl (133.1 kB): CPython 3.12, macOS 10.9+ x86-64
  • jwm.robotstxt-1.0.5-cp311-cp311-win_amd64.whl (97.5 kB): CPython 3.11, Windows x86-64
  • jwm.robotstxt-1.0.5-cp311-cp311-musllinux_1_2_x86_64.whl (1.1 MB): CPython 3.11, musllinux (musl 1.2+) x86-64
  • jwm.robotstxt-1.0.5-cp311-cp311-musllinux_1_2_i686.whl (1.2 MB): CPython 3.11, musllinux (musl 1.2+) i686
  • jwm.robotstxt-1.0.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (178.9 kB): CPython 3.11, manylinux (glibc 2.17+) x86-64
  • jwm.robotstxt-1.0.5-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl (186.4 kB): CPython 3.11, manylinux (glibc 2.17+) i686
  • jwm.robotstxt-1.0.5-cp311-cp311-macosx_11_0_arm64.whl (132.8 kB): CPython 3.11, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.5-cp311-cp311-macosx_10_9_x86_64.whl (133.3 kB): CPython 3.11, macOS 10.9+ x86-64
  • jwm.robotstxt-1.0.5-cp310-cp310-win_amd64.whl (96.5 kB): CPython 3.10, Windows x86-64
  • jwm.robotstxt-1.0.5-cp310-cp310-musllinux_1_2_x86_64.whl (1.1 MB): CPython 3.10, musllinux (musl 1.2+) x86-64
  • jwm.robotstxt-1.0.5-cp310-cp310-musllinux_1_2_i686.whl (1.2 MB): CPython 3.10, musllinux (musl 1.2+) i686
  • jwm.robotstxt-1.0.5-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (177.4 kB): CPython 3.10, manylinux (glibc 2.17+) x86-64
  • jwm.robotstxt-1.0.5-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl (185.1 kB): CPython 3.10, manylinux (glibc 2.17+) i686
  • jwm.robotstxt-1.0.5-cp310-cp310-macosx_11_0_arm64.whl (131.5 kB): CPython 3.10, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.5-cp310-cp310-macosx_10_9_x86_64.whl (132.1 kB): CPython 3.10, macOS 10.9+ x86-64
