A set of data tools in Python
Find-Sitemap
Find Sitemap is a simple SEO tool that helps you locate a website's sitemap by checking a list of candidate URLs.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.crawl()
...
...
check 13801/13804: https://google.com/xmap.php
check 13802/13804: https://google.com/xmap.jsp
check 13803/13804: https://google.com/xmap.asp
check 13804/13804: https://google.com/xmap.html
--------------------
Find sitemap urls len: 1
Find sitemap urls list: ['https://www.google.com/sitemap.xml']
Getting Started
Install Find-Sitemap from PyPI:
$ pip install Find-Sitemap
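To confirm the install worked, you can ask Python for the installed version. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is made up for illustration and is not part of Find-Sitemap.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# "Find-Sitemap" is the distribution name used in the pip command above.
found = installed_version("Find-Sitemap")
print(found if found else "Find-Sitemap is not installed")
```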
Prerequisites
Usage
- Show the subdomains, slugs_L1, slugs_L2, and filetypes parameters.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.subdomains
{'www.'}
>>> main.slugs_L1
{'/default', '/sitemap', '/feeds', '/api', '/contents' ...}
>>> main.slugs_L2
{'/sitemap', '/stock', '/sitemap1', '/sitemap0', ...}
>>> main.filetypes
{'txt', 'xml', 'xml.gz', 'jsp', 'html', ...}
- Add values to the subdomains, slugs_L1, slugs_L2, and filetypes parameters.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.subdomains.add("shop.")
>>> main.slugs_L1.add("/node")
>>> main.slugs_L2.add("/site")
>>> main.filetypes.add("xml")
- Remove values from the subdomains, slugs_L1, slugs_L2, and filetypes parameters. Assuming these are plain Python sets, remove() raises KeyError for a value that is not in the set, so remove only values that are present (for example, ones you added earlier).
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.subdomains.remove("shop.")
>>> main.slugs_L1.remove("/node")
>>> main.slugs_L2.remove("/site")
>>> main.filetypes.remove("xml")
- Run the crawler.
>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.crawl()
...
...
check 13801/13804: https://google.com/xmap.php
check 13802/13804: https://google.com/xmap.jsp
check 13803/13804: https://google.com/xmap.asp
check 13804/13804: https://google.com/xmap.html
--------------------
Find sitemap urls len: 1
Find sitemap urls list: ['https://www.google.com/sitemap.xml']
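The crawler output suggests that the four parameter sets are combined into a long list of candidate URLs, which is why adding or removing a single value changes the number of checks. The sketch below illustrates one way such a combination could work. It is an assumption for illustration only: `candidate_urls` is a hypothetical helper, and the package's actual combination logic is not shown in this README.

```python
from itertools import product


def candidate_urls(domain, subdomains, slugs_l1, slugs_l2, filetypes):
    """Build candidate sitemap URLs (hypothetical scheme, not the package's code)."""
    hosts = [""] + sorted(subdomains)   # bare domain plus each subdomain prefix
    second = [""] + sorted(slugs_l2)    # the second path level is treated as optional
    urls = []
    for sub, l1, l2, ext in product(hosts, sorted(slugs_l1), second, sorted(filetypes)):
        urls.append(f"https://{sub}{domain}{l1}{l2}.{ext}")
    return urls


urls = candidate_urls(
    "example.com",
    subdomains={"www."},
    slugs_l1={"/sitemap", "/xmap"},
    slugs_l2={"/sitemap1"},
    filetypes={"xml", "php"},
)
print(len(urls))  # 2 hosts * 2 first-level slugs * 2 second-level options * 2 filetypes = 16
for url in urls[:4]:
    print(url)
```

Under this scheme the candidate count is a product of the set sizes, so trimming unneeded filetypes or slugs before calling crawl() is the most direct way to shorten a run.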
Contributing
- See Contributing
Authors
- Email: a0025071@gmail.com
- Website: Max 行銷誌
Download files
File details
Details for the file Find_Sitemap-0.1.4.tar.gz.
File metadata
- Download URL: Find_Sitemap-0.1.4.tar.gz
- Upload date:
- Size: 8.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1439978fa36e85f9fabafcf65654ba0cbc53824e09de96ecd3b7b0ee9d3c8514 |
| MD5 | 0b8a1d18cfae86170a76214735d2a774 |
| BLAKE2b-256 | c4c627cb7cfcb5e86753477d77b3c6d0490e5274f1d6ace6d01d7136128d18fc |
File details
Details for the file Find_Sitemap-0.1.4-py3-none-any.whl.
File metadata
- Download URL: Find_Sitemap-0.1.4-py3-none-any.whl
- Upload date:
- Size: 10.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1cc45c9eb9a395cb840a6224dd0d06b3ca339e1b26a39386aa1d66ba66bd2db8 |
| MD5 | 4085642d53cd01842f57c21617525f2d |
| BLAKE2b-256 | 1039f403ef5f0e6adaed8ee7ff1327200e029e012e631525007cd07626195286 |
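If you download one of the files manually, you can compare it against the SHA256 digest listed above. This is a minimal sketch using the standard library's `hashlib`; the helper names and the local file path in the usage note are assumptions, not part of Find-Sitemap.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For example, assuming the source distribution was saved to the current directory, `matches_published_hash("Find_Sitemap-0.1.4.tar.gz", "1439978fa36e85f9fabafcf65654ba0cbc53824e09de96ecd3b7b0ee9d3c8514")` should return True for an intact download.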