A set of data tools in Python
Project description
Find-Sitemap
Find-Sitemap is a simple SEO tool that helps you find a website's sitemap by checking a list of candidate URLs.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.crawl()
...
...
check 13801/13804: https://google.com/xmap.php
check 13802/13804: https://google.com/xmap.jsp
check 13803/13804: https://google.com/xmap.asp
check 13804/13804: https://google.com/xmap.html
--------------------
Find sitemap urls len: 1
Find sitemap urls list: ['https://www.google.com/sitemap.xml']
Getting Started
Install Find-Sitemap from PyPI:
$ pip install Find-Sitemap
Prerequisites
Usage
- Show the subdomains, slugs_L1, slugs_L2, and filetypes parameters.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.subdomains
{'www.'}
>>> main.slugs_L1
{'/default', '/sitemap', '/feeds', '/api', '/contents' ...}
>>> main.slugs_L2
{'/sitemap', '/stock', '/sitemap1', '/sitemap0', ...}
>>> main.filetypes
{'txt', 'xml', 'xml.gz', 'jsp', 'html', ...}
- Add values to the subdomains, slugs_L1, slugs_L2, and filetypes parameters.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.subdomains.add("shop.")
>>> main.slugs_L1.add("/node")
>>> main.slugs_L2.add("/site")
>>> main.filetypes.add("xml")
- Remove values from the subdomains, slugs_L1, slugs_L2, and filetypes parameters. Each value must already be in its set, because set.remove raises KeyError for a missing element.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.subdomains.remove("shop.")
>>> main.slugs_L1.remove("/node")
>>> main.slugs_L2.remove("/site")
>>> main.filetypes.remove("xml")
- Run the crawler.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.crawl()
...
...
check 13801/13804: https://google.com/xmap.php
check 13802/13804: https://google.com/xmap.jsp
check 13803/13804: https://google.com/xmap.asp
check 13804/13804: https://google.com/xmap.html
--------------------
Find sitemap urls len: 1
Find sitemap urls list: ['https://www.google.com/sitemap.xml']
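The crawler output suggests that each check targets a URL built from a host, a slug, and a file type. The package's internal logic is not documented here, so the following is only a minimal sketch of how such a candidate list might be built. The helper name candidate_urls, the rule for combining level-1 and level-2 slugs, and the use of example.com are all assumptions for illustration, not the package's actual API.

```python
from itertools import product


def candidate_urls(domain, subdomains, slugs_l1, slugs_l2, filetypes):
    """Hypothetical helper: build candidate sitemap URLs from the four sets.

    Assumptions (not confirmed by the package): the bare domain is tried
    alongside each subdomain prefix, and a path is either a level-1 slug
    or a level-1 slug followed by a level-2 slug.
    """
    # Sort the sets so the resulting order is deterministic.
    hosts = [domain] + [prefix + domain for prefix in sorted(subdomains)]
    level1 = sorted(slugs_l1)
    paths = level1 + [a + b for a, b in product(level1, sorted(slugs_l2))]
    return [
        f"https://{host}{path}.{ext}"
        for host, path, ext in product(hosts, paths, sorted(filetypes))
    ]


urls = candidate_urls(
    "example.com",
    subdomains={"www."},
    slugs_l1={"/sitemap", "/xmap"},
    slugs_l2={"/sitemap1"},
    filetypes={"xml", "php"},
)
print(len(urls))
print(urls[0])
```

Under these assumptions the number of checks grows multiplicatively, so every value added to one of the sets increases the length of the crawl.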
Contributing
- See Contributing
Authors
- Email: a0025071@gmail.com
- Website: Max 行銷誌
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Find_Sitemap-0.1.3.tar.gz (8.8 kB)
Built Distribution
Hashes for Find_Sitemap-0.1.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 14625c32277b1ee012483801df358d5435a832818d80c46aae39240a379358d7
MD5 | e2298993e2f79cfeb5f44d1f532a705a
BLAKE2b-256 | ea97ccb7b19893b3edb4a0ea9933bcfcc71e725e15cabb1a6384305cb3acbff6