A package to bulk-match URLs against robots.txt files

Project description

DeepCrawl Robots.txt live checker

> cat urls.txt

https://www.ebay.at/
https://www.ebay.at/adchoice
https://www.ebay.at/sl/sell
https://www.ebay.at/mye/myebay/watchlist
https://www.ebay.at/sch/ebayadvsearch
https://www.ebay.at/sch/Kleidung-Accessoires-/11450/i.html
https://www.ebay.at/sch/Auto-Tuning-Styling-/107059/i.html
https://www.ebay.at/sch/Modeschmuck-/10968/i.html
https://www.ebay.at/sch/Damenschuhe-/3034/i.html
> cat robots.txt

User-agent: *
Disallow: /sch/
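
To illustrate how a prefix rule such as `Disallow: /sch/` applies to individual URLs, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It is not this package's implementation, and the two may differ in how they match rules; the `is_allowed` helper and the `ExampleBot` user agent are hypothetical names chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /sch/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the stdlib parser lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a placeholder; it falls under the "User-agent: *" group.
for url in (
    "https://www.ebay.at/adchoice",
    "https://www.ebay.at/sch/Damenschuhe-/3034/i.html",
):
    print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

The standard-library parser checks one URL at a time, which is workable for a handful of URLs but is the kind of loop a bulk matcher is meant to replace.
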
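> pip install deepcrawl_robots

from deepcrawl_robots import Processor

# Input files: the list of URLs to check and the robots.txt to match against.
urls_path = "Path to urls file"
robots_txt_path = "Path to robots.txt file"

# The user agent determines which robots.txt group is applied to the URLs.
processor = Processor(
    user_agent="User agent",
    urls_file_path=urls_path,
    robots_file_path=robots_txt_path
)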
> cat result.txt

https://www.ebay.at/mye/myebay/watchlist,true
https://www.ebay.at/adchoice,true
https://www.ebay.at/,true
https://www.ebay.at/sl/sell,true
https://www.ebay.at/sch/Kleidung-Accessoires-/11450/i.html,false
https://www.ebay.at/sch/ebayadvsearch,true
https://www.ebay.at/sch/Modeschmuck-/10968/i.html,false
https://www.ebay.at/sch/Auto-Tuning-Styling-/107059/i.html,false
https://www.ebay.at/sch/Damenschuhe-/3034/i.html,false
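
Each row of the result file pairs a URL with `true` or `false`. The sketch below shows one way such rows could be post-processed. It assumes that `true` means the URL is allowed for the given user agent, which the listing above suggests but does not state. The `parse_results` helper is a hypothetical name and not part of `deepcrawl_robots`.

```python
def parse_results(lines):
    """Split 'url,true|false' rows into (allowed, blocked) URL lists."""
    allowed, blocked = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last comma only, since a URL may itself contain commas.
        url, flag = line.rsplit(",", 1)
        # Assumption: "true" marks a URL that robots.txt allows.
        (allowed if flag == "true" else blocked).append(url)
    return allowed, blocked


# Two rows copied from the result listing above.
sample = [
    "https://www.ebay.at/adchoice,true",
    "https://www.ebay.at/sch/Damenschuhe-/3034/i.html,false",
]
allowed, blocked = parse_results(sample)
print(len(allowed), "allowed;", len(blocked), "blocked")
```

To process a real file, the same function can be fed an open file handle, for example `parse_results(open("result.txt"))`.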

Download files

Source Distribution

deepcrawl_robots-0.0.5.tar.gz (1.3 MB)

Uploaded Source

Built Distribution

deepcrawl_robots-0.0.5-py3-none-any.whl (1.3 MB)

Uploaded Python 3

File details

Details for the file deepcrawl_robots-0.0.5.tar.gz.

File metadata

  • Download URL: deepcrawl_robots-0.0.5.tar.gz
  • Size: 1.3 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.1 setuptools/51.0.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.7.5

File hashes

Hashes for deepcrawl_robots-0.0.5.tar.gz
Algorithm Hash digest
SHA256 82480e1c42d0dca6aa5557dceb443ff32a521b5c07d41604edaaf48cda24f228
MD5 d649370c0cdf922698b98c1e0f3153aa
BLAKE2b-256 0471fd97748d4cbfdfa749747fd2024e62b1e6ee82cc92c50204ced9986e616b
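
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard-library `hashlib`. The `sha256_of` and `matches` helpers are hypothetical names, and the local file path in the usage note is an assumption about where the archive was saved.

```python
import hashlib

# SHA256 digest listed above for deepcrawl_robots-0.0.5.tar.gz.
EXPECTED_SHA256 = "82480e1c42d0dca6aa5557dceb443ff32a521b5c07d41604edaaf48cda24f228"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Return True if the file's SHA256 digest equals expected_hex."""
    return sha256_of(path) == expected_hex.lower()
```

Assuming the archive was saved to the current directory, `matches("deepcrawl_robots-0.0.5.tar.gz", EXPECTED_SHA256)` should return `True` for an unmodified download.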

File details

Details for the file deepcrawl_robots-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: deepcrawl_robots-0.0.5-py3-none-any.whl
  • Size: 1.3 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.25.1 setuptools/51.0.0 requests-toolbelt/0.9.1 tqdm/4.54.1 CPython/3.7.5

File hashes

Hashes for deepcrawl_robots-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 a40dfb7d24230d5fd8121284d41f6b8d8f625613e348a55274e32895eaadbbe0
MD5 ae5a9392a6e8cb9b7e554bd65b9280cd
BLAKE2b-256 a1d2596eac60e310048a99b4ef1fc9f3a23abf0bc969df7f2d011749dadc54ac
