A set of data tools in Python

PRs Welcome | License: MIT | PyPI: Find-Sitemap | Code style: black

Find-Sitemap

Find-Sitemap is a simple SEO tool that helps you find a website's sitemap by checking a list of candidate URLs on the domain.

>>> from Find_Sitemap import FindSitemap
>>> main = FindSitemap('google.com')
>>> main.crawl()
...
...
check 13801/13804: https://google.com/xmap.php
check 13802/13804: https://google.com/xmap.jsp
check 13803/13804: https://google.com/xmap.asp
check 13804/13804: https://google.com/xmap.html
--------------------
Find sitemap urls len: 1
Find sitemap urls list: ['https://www.google.com/sitemap.xml']
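Once the crawler reports a URL, it can be useful to confirm that the response body really is a sitemap. The following is a minimal sketch of one way to do that, not something Find-Sitemap is documented to do: it parses the XML with the standard library and accepts only the `urlset` and `sitemapindex` root elements defined by the sitemaps.org protocol. The function name `sitemap_locations` and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_locations(xml_text):
    """Return the <loc> values of a sitemap or sitemap index.

    Hypothetical helper (not part of Find-Sitemap). Returns [] when the
    text is not well-formed XML or the root element is not a sitemap root.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return []
    valid_roots = (f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex")
    if root.tag not in valid_roots:
        return []
    # Collect every <loc> element, skipping empty ones.
    return [loc.text.strip() for loc in root.iter(f"{{{SITEMAP_NS}}}loc") if loc.text]


# Illustrative sample; a real check would use the body fetched from the found URL.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

print(sitemap_locations(sample))
# ['https://example.com/', 'https://example.com/about']
```

An empty list means the document did not look like a sitemap, which helps filter out pages that merely return HTTP 200 for any path.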

Getting Started

Install Find-Sitemap from PyPI:

$ pip install Find-Sitemap

Prerequisites

Usage

  1. Show the subdomains, slugs_L1, slugs_L2, and filetypes parameters.

    >>> from Find_Sitemap import FindSitemap
    >>> main = FindSitemap('google.com')
    >>> main.subdomains
    {'www.'}
    
    >>> main.slugs_L1
    {'/default', '/sitemap', '/feeds', '/api', '/contents' ...}
    
    >>> main.slugs_L2
    {'/sitemap', '/stock', '/sitemap1', '/sitemap0', ...}
    
    >>> main.filetypes
    {'txt', 'xml', 'xml.gz', 'jsp', 'html', ...}
    
  2. Add values to the subdomains, slugs_L1, slugs_L2, and filetypes parameters.

    >>> from Find_Sitemap import FindSitemap
    >>> main = FindSitemap('google.com')
    >>> main.subdomains.add("shop.")
    >>> main.slugs_L1.add("/node")
    >>> main.slugs_L2.add("/site")
    >>> main.filetypes.add("xml")
    
  3. Remove values from the subdomains, slugs_L1, slugs_L2, and filetypes parameters.

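    >>> from Find_Sitemap import FindSitemap
    >>> main = FindSitemap('google.com')
    >>> # Assuming these parameters are plain Python sets (as the output in
    >>> # step 1 suggests), remove() raises KeyError for a missing value.
    >>> # discard() is used here so the example also runs on a fresh instance.
    >>> main.subdomains.discard("shop.")
    >>> main.slugs_L1.discard("/node")
    >>> main.slugs_L2.discard("/site")
    >>> main.filetypes.discard("xml")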
    
  4. Run the crawler.

    >>> from Find_Sitemap import FindSitemap
    >>> main = FindSitemap('google.com')
    >>> main.crawl()
    ...
    ...
    check 13801/13804: https://google.com/xmap.php
    check 13802/13804: https://google.com/xmap.jsp
    check 13803/13804: https://google.com/xmap.asp
    check 13804/13804: https://google.com/xmap.html
    --------------------
    Find sitemap urls len: 1
    Find sitemap urls list: ['https://www.google.com/sitemap.xml']
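The check counter in the output (for example `13804/13804`) suggests that the crawler walks through combinations of the parameters shown above. How Find-Sitemap actually combines them, in particular `slugs_L1` with `slugs_L2`, is not documented here, so the following is only a sketch of the general idea using a Cartesian product. The helper `candidate_urls`, its single `slugs` argument, and the `scheme` default are assumptions made for illustration.

```python
from itertools import product


def candidate_urls(domain, subdomains, slugs, filetypes, scheme="https"):
    """Yield one URL per (subdomain, slug, filetype) combination.

    Hypothetical helper: not part of Find-Sitemap.
    """
    # The empty string stands for the bare domain without a subdomain prefix.
    prefixes = [""] + sorted(subdomains)
    # Sorting makes the order deterministic, since sets are unordered.
    for prefix, slug, filetype in product(prefixes, sorted(slugs), sorted(filetypes)):
        yield f"{scheme}://{prefix}{domain}{slug}.{filetype}"


urls = list(
    candidate_urls("example.com", {"www."}, {"/sitemap", "/xmap"}, {"xml", "html"})
)
print(len(urls))  # 2 prefixes * 2 slugs * 2 filetypes = 8
print(urls[0])    # https://example.com/sitemap.html
```

Because the number of candidates is the product of the set sizes, every value added to a parameter multiplies the number of requests, so trimming unneeded values can shorten a crawl considerably.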
    

Contributing

Authors

