Provides Python access to Google's parser for robots.txt files, as used by their GoogleBot web scraper.

Project description

jwm.robotstxt

Python Wrapper for Google's Robotstxt Parser

Websites may provide an optional robots.txt file in their domain's root to govern the access and behavior of web scrapers. GoogleBot, one of the best-known web scrapers, is responsible for promoting this standard, and sites interested in SEO closely conform to GoogleBot's behavior.
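As a rough illustration of how a robots.txt file governs access, the sketch below uses Python's standard-library urllib.robotparser rather than this package, so it runs without installing anything. The robots.txt body and the helper name can_fetch are made up for the example, and Google's parser may resolve edge cases differently from the standard-library one.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body: GoodBot is kept out of /private,
# and every other crawler is kept out of the whole site.
ROBOTS_TXT = """\
User-agent: GoodBot
Disallow: /private

User-agent: *
Disallow: /
"""


def can_fetch(user_agent, path, robots_txt=ROBOTS_TXT):
    """Parse robots_txt and report whether user_agent may fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


print(can_fetch("GoodBot", "/public/page"))   # True
print(can_fetch("GoodBot", "/private/data"))  # False
print(can_fetch("OtherBot", "/public/page"))  # False
```

A crawler that honors the file would run a check like this before requesting each URL.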

All credit for the parser goes to the Google team who created, open-sourced, and promoted it.

SEO (Search Engine Optimization): The process of modifying a website's content or metadata to boost its ranking in search engines' page indexes. Higher rankings lead to higher positions in user searches and therefore more visitors. For further details, see the SEO Wikipedia page.

Usage

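Basic usage with the RobotsMatcher class provided by Google:

import jwm.robotstxt.googlebot

# A minimal robots.txt body: GoodBot may fetch anything under /path.
robotstxt = """
user-agent: GoodBot
allow: /path
"""

# Arguments: the robots.txt body, the user agents to match, and the URL to check.
matcher = jwm.robotstxt.googlebot.RobotsMatcher()
assert matcher.AllowedByRobots(robotstxt, ("GoodBot",), "/path")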

Check out the documentation for further details. For more use cases, see the test cases for jwm.robotstxt and robotstxt.
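The same call can be applied to a batch of URLs. The following is a minimal sketch, not part of the package: filter_allowed is a hypothetical helper that only assumes the matcher exposes AllowedByRobots with the argument order shown above. The import is guarded so the snippet still loads when jwm.robotstxt is not installed.

```python
def filter_allowed(matcher, robots_body, user_agent, urls):
    """Return the URLs that `matcher` reports as allowed for `user_agent`."""
    return [
        url
        for url in urls
        if matcher.AllowedByRobots(robots_body, (user_agent,), url)
    ]


try:
    # Real matcher from this package, when it is installed.
    import jwm.robotstxt.googlebot as googlebot
except ImportError:
    googlebot = None

if googlebot is not None:
    # Illustrative robots.txt body and URL list.
    robots_body = "user-agent: GoodBot\ndisallow: /private\n"
    urls = ["/path", "/private/data"]
    print(filter_allowed(googlebot.RobotsMatcher(), robots_body, "GoodBot", urls))
```

Passing the matcher in as an argument keeps the helper easy to test with a stand-in object.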

Installation

//Todo: Set up pipeline into PyPI. These steps are for the expected workflow.

Install from PyPI under the jwm.robotstxt distribution.

pip install jwm.robotstxt

Import into your program through the jwm.robotstxt.googlebot package.

import jwm.robotstxt.googlebot

Virtual Environment

Installing Python projects into a virtual environment is highly recommended; see PEP 405 for the motivation.

Create a virtual environment in the .venv directory.

python3 -m venv ./.venv

Activate with the correct command for your system.

# Linux/macOS
. ./.venv/bin/activate
# Windows
.\.venv\Scripts\activate
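To confirm that activation worked, one option is to compare sys.prefix with sys.base_prefix, which differ when the interpreter runs inside an environment created by venv. The helper name in_virtualenv is illustrative only.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```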

Installing from source

Make sure you have cloned the repository and its submodules.

git clone --recurse-submodules https://github.com/jwmorley73/jwm.robotstxt.git

Install the project using pip. This builds the required robotstxt static library files and links them into the resulting Python package.

pip install .

If you want to include the developer tooling, add the dev optional dependencies.

pip install ".[dev]"

Known Issues

  • Windows 32 bit is not supported.
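Before attempting an install, a script could detect the unsupported configuration up front. This is a small sketch under the assumption that a 32-bit interpreter on Windows is what "Windows 32 bit" refers to; is_32bit_windows is a hypothetical helper, not something the package provides.

```python
import sys


def is_32bit_windows(platform=sys.platform, maxsize=sys.maxsize):
    """Return True for a 32-bit Python interpreter running on Windows."""
    # sys.platform is "win32" on every Windows build, so the pointer
    # width is inferred from sys.maxsize (2**31 - 1 on 32-bit builds).
    return platform == "win32" and maxsize <= 2**32


if is_32bit_windows():
    print("jwm.robotstxt does not support 32-bit Windows.")
```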

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • jwm_robotstxt-1.0.6.tar.gz (11.4 kB): Source

Built Distributions

  • jwm.robotstxt-1.0.6-pp310-pypy310_pp73-win_amd64.whl (95.9 kB): PyPy, Windows x86-64
  • jwm.robotstxt-1.0.6-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (170.1 kB): PyPy, manylinux: glibc 2.17+ x86-64
  • jwm.robotstxt-1.0.6-pp310-pypy310_pp73-manylinux_2_17_i686.manylinux2014_i686.whl (179.8 kB): PyPy, manylinux: glibc 2.17+ i686
  • jwm.robotstxt-1.0.6-pp310-pypy310_pp73-macosx_11_0_arm64.whl (131.3 kB): PyPy, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.6-pp310-pypy310_pp73-macosx_10_15_x86_64.whl (134.1 kB): PyPy, macOS 10.15+ x86-64
  • jwm.robotstxt-1.0.6-cp312-cp312-win_amd64.whl (97.4 kB): CPython 3.12, Windows x86-64
  • jwm.robotstxt-1.0.6-cp312-cp312-musllinux_1_2_x86_64.whl (1.1 MB): CPython 3.12, musllinux: musl 1.2+ x86-64
  • jwm.robotstxt-1.0.6-cp312-cp312-musllinux_1_2_i686.whl (1.2 MB): CPython 3.12, musllinux: musl 1.2+ i686
  • jwm.robotstxt-1.0.6-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (178.4 kB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • jwm.robotstxt-1.0.6-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl (186.2 kB): CPython 3.12, manylinux: glibc 2.17+ i686
  • jwm.robotstxt-1.0.6-cp312-cp312-macosx_11_0_arm64.whl (132.3 kB): CPython 3.12, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.6-cp312-cp312-macosx_10_9_x86_64.whl (133.1 kB): CPython 3.12, macOS 10.9+ x86-64
  • jwm.robotstxt-1.0.6-cp311-cp311-win_amd64.whl (97.5 kB): CPython 3.11, Windows x86-64
  • jwm.robotstxt-1.0.6-cp311-cp311-musllinux_1_2_x86_64.whl (1.1 MB): CPython 3.11, musllinux: musl 1.2+ x86-64
  • jwm.robotstxt-1.0.6-cp311-cp311-musllinux_1_2_i686.whl (1.2 MB): CPython 3.11, musllinux: musl 1.2+ i686
  • jwm.robotstxt-1.0.6-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (178.9 kB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • jwm.robotstxt-1.0.6-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl (186.4 kB): CPython 3.11, manylinux: glibc 2.17+ i686
  • jwm.robotstxt-1.0.6-cp311-cp311-macosx_11_0_arm64.whl (132.8 kB): CPython 3.11, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.6-cp311-cp311-macosx_10_9_x86_64.whl (133.3 kB): CPython 3.11, macOS 10.9+ x86-64
  • jwm.robotstxt-1.0.6-cp310-cp310-win_amd64.whl (96.5 kB): CPython 3.10, Windows x86-64
  • jwm.robotstxt-1.0.6-cp310-cp310-musllinux_1_2_x86_64.whl (1.1 MB): CPython 3.10, musllinux: musl 1.2+ x86-64
  • jwm.robotstxt-1.0.6-cp310-cp310-musllinux_1_2_i686.whl (1.2 MB): CPython 3.10, musllinux: musl 1.2+ i686
  • jwm.robotstxt-1.0.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (177.4 kB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • jwm.robotstxt-1.0.6-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl (185.1 kB): CPython 3.10, manylinux: glibc 2.17+ i686
  • jwm.robotstxt-1.0.6-cp310-cp310-macosx_11_0_arm64.whl (131.5 kB): CPython 3.10, macOS 11.0+ ARM64
  • jwm.robotstxt-1.0.6-cp310-cp310-macosx_10_9_x86_64.whl (132.1 kB): CPython 3.10, macOS 10.9+ x86-64

File details

Details for the file jwm_robotstxt-1.0.6.tar.gz.

File metadata

  • Download URL: jwm_robotstxt-1.0.6.tar.gz
  • Upload date:
  • Size: 11.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/5.1.0 CPython/3.12.4

File hashes

Hashes for jwm_robotstxt-1.0.6.tar.gz
Algorithm Hash digest
SHA256 ee95aa2950c11001f5e7bb2cafe194aab27e2997a5e8a8750f5839dd97fa5026
MD5 7850932062c2425179ff492d611ad487
BLAKE2b-256 6c80f7319b697b6610240827f61224ec8d2136154fdc945ac31a3ea3536ca6ae

