A package to bulk-match URLs against robots.txt files
DeepCrawl Robots.txt live checker
> cat urls.txt
https://www.ebay.at/
https://www.ebay.at/adchoice
https://www.ebay.at/sl/sell
https://www.ebay.at/mye/myebay/watchlist
https://www.ebay.at/sch/ebayadvsearch
https://www.ebay.at/sch/Kleidung-Accessoires-/11450/i.html
https://www.ebay.at/sch/Auto-Tuning-Styling-/107059/i.html
https://www.ebay.at/sch/Modeschmuck-/10968/i.html
https://www.ebay.at/sch/Damenschuhe-/3034/i.html
> cat robots.txt
User-agent: *
Disallow: /sch/
pip install deepcrawl_robots
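from deepcrawl_robots import Processor

# Input files: one URL per line, and the robots.txt rules to check against
urls_path = "Path to urls file"
robots_txt_path = "Path to robots.txt file"

# Configure the processor with the crawler's user agent and both input paths
processor = Processor(
    user_agent="User agent",
    urls_file_path=urls_path,
    robots_file_path=robots_txt_path,
)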
> cat result.txt
https://www.ebay.at/mye/myebay/watchlist,true
https://www.ebay.at/adchoice,true
https://www.ebay.at/,true
https://www.ebay.at/sl/sell,true
https://www.ebay.at/sch/Kleidung-Accessoires-/11450/i.html,false
https://www.ebay.at/sch/ebayadvsearch,true
https://www.ebay.at/sch/Modeschmuck-/10968/i.html,false
https://www.ebay.at/sch/Auto-Tuning-Styling-/107059/i.html,false
https://www.ebay.at/sch/Damenschuhe-/3034/i.html,false
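To make the `url,true` / `url,false` output format concrete, here is a minimal sketch of the same kind of check using only Python's standard-library `urllib.robotparser`. It is not how deepcrawl_robots works internally, and the helper name `check_urls` is hypothetical; it only illustrates matching a few of the URLs above against the two-line robots.txt and printing one result line per URL.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def check_urls(robots_txt: str, urls: Iterable[str], user_agent: str = "*") -> List[str]:
    """Return one 'url,true' or 'url,false' line per URL (hypothetical helper)."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines; no network access
    parser.parse(robots_txt.splitlines())
    lines = []
    for url in urls:
        allowed = parser.can_fetch(user_agent, url)
        lines.append(f"{url},{str(allowed).lower()}")
    return lines


ROBOTS_TXT = """User-agent: *
Disallow: /sch/
"""

URLS = [
    "https://www.ebay.at/",
    "https://www.ebay.at/adchoice",
    "https://www.ebay.at/sch/Damenschuhe-/3034/i.html",
]

if __name__ == "__main__":
    for line in check_urls(ROBOTS_TXT, URLS):
        print(line)
```

The standard-library parser handles one URL at a time in pure Python, so this sketch is meant for understanding the output, not for bulk workloads.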