A set of data tools in Python
Project description
Find-Sitemap
Find Sitemap is a simple SEO tool that helps you locate a website's sitemap.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.crawl()
...
...
check 13801/13804: https://google.com/xmap.php
check 13802/13804: https://google.com/xmap.jsp
check 13803/13804: https://google.com/xmap.asp
check 13804/13804: https://google.com/xmap.html
--------------------
Find sitemap urls len: 1
Find sitemap urls list: ['https://www.google.com/sitemap.xml']
Getting Started
Install Find-Sitemap from PyPI:
$ pip install Find-Sitemap
Prerequisites
Usage
- Show the subdomains, slugs_L1, slugs_L2, and filetypes parameters.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.subdomains
{'www.'}
>>> main.slugs_L1
{'/default', '/sitemap', '/feeds', '/api', '/contents' ...}
>>> main.slugs_L2
{'/sitemap', '/stock', '/sitemap1', '/sitemap0', ...}
>>> main.filetypes
{'txt', 'xml', 'xml.gz', 'jsp', 'html', ...}
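The crawl output ("check 13801/13804: https://google.com/xmap.php") suggests that the candidate URLs are combinations of these four parameters. The following is a minimal sketch of that idea, not the package's actual implementation: `candidate_urls` is a hypothetical helper, and the way the two slug levels are joined into a path is an assumption.

```python
from itertools import product


def candidate_urls(domain, subdomains, slugs_l1, slugs_l2, filetypes):
    """Hypothetical helper (not part of Find-Sitemap): list candidate sitemap URLs."""
    # The bare domain plus every subdomain prefix, e.g. "www." + "google.com".
    hosts = [domain] + [sub + domain for sub in sorted(subdomains)]
    # Assumption: one-level paths ("/sitemap") and two-level paths ("/feeds/sitemap1").
    level1 = sorted(slugs_l1)
    level2 = [first + second for first, second in product(level1, sorted(slugs_l2))]
    paths = level1 + level2
    # Every host is combined with every path and every file extension.
    return [
        f"https://{host}{path}.{ext}"
        for host, path, ext in product(hosts, paths, sorted(filetypes))
    ]


urls = candidate_urls(
    "google.com",
    subdomains={"www."},
    slugs_l1={"/sitemap", "/xmap"},
    slugs_l2={"/sitemap1"},
    filetypes={"xml", "php"},
)
print(len(urls))
```

Under these assumptions the number of checks grows as hosts × paths × filetypes, so every entry added to one of the sets multiplies the number of requests a crawl would make.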
- Add entries to the subdomains, slugs_L1, slugs_L2, and filetypes parameters.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.subdomains.add("shop.")
>>> main.slugs_L1.add("/node")
>>> main.slugs_L2.add("/site")
>>> main.filetypes.add("xml")
- Remove entries from the subdomains, slugs_L1, slugs_L2, and filetypes parameters.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.subdomains.remove("shop.")
>>> main.slugs_L1.remove("/node")
>>> main.slugs_L2.remove("/site")
>>> main.filetypes.remove("xml")
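The `{'www.'}` output and the `add`/`remove` calls suggest that these parameters are built-in Python sets. Assuming that is the case, the sketch below uses a plain set as a stand-in for `main.subdomains` to show how `add`, `discard`, and `remove` behave, in particular that `remove` raises `KeyError` when the entry is not present.

```python
# A plain set standing in for main.subdomains (assumption: the attribute is a set).
subdomains = {"www."}

subdomains.add("shop.")
subdomains.add("shop.")        # Adding an existing entry has no effect.

subdomains.discard("blog.")    # discard() ignores entries that are missing.

try:
    subdomains.remove("blog.")  # remove() raises KeyError for a missing entry.
except KeyError:
    removed = False
else:
    removed = True

subdomains.remove("shop.")     # Present, so this succeeds.
print(subdomains, removed)
```

If an entry may or may not have been added earlier, `discard` is the safer call.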
- Run the crawler.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.crawl()
...
...
check 13801/13804: https://google.com/xmap.php
check 13802/13804: https://google.com/xmap.jsp
check 13803/13804: https://google.com/xmap.asp
check 13804/13804: https://google.com/xmap.html
--------------------
Find sitemap urls len: 1
Find sitemap urls list: ['https://www.google.com/sitemap.xml']
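The README does not say how the crawler decides that a checked URL is a sitemap. One possible check for XML sitemaps, sketched here with the standard library only, is to parse the response body and test whether the root element is `urlset` or `sitemapindex`, the two root elements defined by the sitemaps.org protocol. `looks_like_sitemap` is a hypothetical helper and is not part of Find-Sitemap.

```python
import xml.etree.ElementTree as ET

# Root elements defined by the sitemaps.org protocol.
SITEMAP_ROOTS = {"urlset", "sitemapindex"}


def looks_like_sitemap(body: bytes) -> bool:
    """Hypothetical check: is this response body an XML sitemap?"""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False  # Not well-formed XML at all.
    # Tags are namespaced as "{namespace}name"; keep only the local name.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name in SITEMAP_ROOTS


sitemap = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://example.com/</loc></url>"
    b"</urlset>"
)
error_page = b"<html><body>Not found</body></html>"

print(looks_like_sitemap(sitemap), looks_like_sitemap(error_page))
```

A check like this only covers XML responses; plain-text or gzipped sitemaps (the `txt` and `xml.gz` file types) would need separate handling.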
Contributing
- See Contributing
Authors
- Email: a0025071@gmail.com
- Website: Max 行銷誌
Download files
Source Distribution
Find_Sitemap-0.1.4.tar.gz (8.8 kB)
Built Distribution
Hashes for Find_Sitemap-0.1.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 1cc45c9eb9a395cb840a6224dd0d06b3ca339e1b26a39386aa1d66ba66bd2db8
MD5 | 4085642d53cd01842f57c21617525f2d
BLAKE2b-256 | 1039f403ef5f0e6adaed8ee7ff1327200e029e012e631525007cd07626195286