A package to bulk-match URLs against robots.txt files
DeepCrawl Robots.txt live checker
> cat urls.txt
https://www.ebay.at/
https://www.ebay.at/adchoice
https://www.ebay.at/sl/sell
https://www.ebay.at/mye/myebay/watchlist
https://www.ebay.at/sch/ebayadvsearch
https://www.ebay.at/sch/Kleidung-Accessoires-/11450/i.html
https://www.ebay.at/sch/Auto-Tuning-Styling-/107059/i.html
https://www.ebay.at/sch/Modeschmuck-/10968/i.html
https://www.ebay.at/sch/Damenschuhe-/3034/i.html
> cat robots.txt
User-agent: *
Disallow: /sch/
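Here Disallow: /sch/ blocks every path beginning with /sch/ for all user agents. As a quick sanity check of that rule using Python's built-in urllib.robotparser (an illustration only, not this package's API):

from urllib.robotparser import RobotFileParser

# Stdlib illustration of the rule above, independent of deepcrawl_robots.
parser = RobotFileParser()
parser.parse(["User-agent: *", "Disallow: /sch/"])

print(parser.can_fetch("*", "https://www.ebay.at/sl/sell"))                       # True
print(parser.can_fetch("*", "https://www.ebay.at/sch/Damenschuhe-/3034/i.html"))  # False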
pip install deepcrawl_robots
from deepcrawl_robots import Processor

urls_path = "Path to urls file"
robots_txt_path = "Path to robots.txt file"

processor = Processor(
    user_agent="User agent",
    urls_file_path=urls_path,
    robots_file_path=robots_txt_path
)
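The snippet above only constructs the Processor; the call that actually produces result.txt is not shown in this README. For orientation, here is a minimal stand-in for the same bulk workflow using only the standard library (file names taken from the examples above; this is a sketch, not the package's implementation):

from urllib.robotparser import RobotFileParser

# Read robots.txt once, then test every URL and emit "url,true/false" lines.
parser = RobotFileParser()
with open("robots.txt") as f:
    parser.parse(f.read().splitlines())

with open("urls.txt") as urls, open("result.txt", "w") as out:
    for line in urls:
        url = line.strip()
        if url:
            # "*" matches the catch-all group; substitute your crawler's user agent.
            allowed = parser.can_fetch("*", url)
            out.write(f"{url},{str(allowed).lower()}\n")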
> cat result.txt
https://www.ebay.at/mye/myebay/watchlist,true
https://www.ebay.at/adchoice,true
https://www.ebay.at/,true
https://www.ebay.at/sl/sell,true
https://www.ebay.at/sch/Kleidung-Accessoires-/11450/i.html,false
https://www.ebay.at/sch/ebayadvsearch,true
https://www.ebay.at/sch/Modeschmuck-/10968/i.html,false
https://www.ebay.at/sch/Auto-Tuning-Styling-/107059/i.html,false
https://www.ebay.at/sch/Damenschuhe-/3034/i.html,false
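Each output line pairs a URL with a lowercase boolean: true means the URL is fetchable for the given user agent, false means it is disallowed. A small sketch for consuming the file, assuming the comma-separated format shown above:

# Split result.txt into allowed and disallowed URL lists.
allowed, disallowed = [], []
with open("result.txt") as f:
    for line in f:
        url, verdict = line.strip().rsplit(",", 1)
        (allowed if verdict == "true" else disallowed).append(url)

print(len(allowed), "allowed;", len(disallowed), "disallowed")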
Download files
Source Distribution

deepcrawl_robots-0.0.5.tar.gz (1.3 MB)
Built Distribution

deepcrawl_robots-0.0.5-py3-none-any.whl
Hashes for deepcrawl_robots-0.0.5-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | a40dfb7d24230d5fd8121284d41f6b8d8f625613e348a55274e32895eaadbbe0
MD5 | ae5a9392a6e8cb9b7e554bd65b9280cd
BLAKE2-256 | a1d2596eac60e310048a99b4ef1fc9f3a23abf0bc969df7f2d011749dadc54ac