Skip to main content

Ultimate Sitemap Parser

Project description

Build Status Documentation Status Coverage Status PyPI package

Website sitemap parser for Python 3.5+.

Features

Installation

pip install ultimate_sitemap_parser

Usage

from usp.tree import sitemap_tree_for_homepage

tree = sitemap_tree_for_homepage('https://www.nytimes.com/')
print(tree)

sitemap_tree_for_homepage() will return a tree of AbstractSitemap subclass objects that represent the sitemap hierarchy found on the website; see a reference of AbstractSitemap subclasses.

If you’d like to just list all the pages found in all of the sitemaps within the website, consider using all_pages() method:

# all_pages() returns an Iterator
for page in tree.all_pages():
    print(page)

all_pages() method will return an iterator yielding SitemapPage objects; see a reference of SitemapPage.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ultimate_sitemap_parser-0.5.tar.gz (20.2 kB view hashes)

Uploaded source

Built Distribution

ultimate_sitemap_parser-0.5-py2.py3-none-any.whl (23.2 kB view hashes)

Uploaded py2 py3

Supported by

AWS AWS Cloud computing Datadog Datadog Monitoring Facebook / Instagram Facebook / Instagram PSF Sponsor Fastly Fastly CDN Google Google Object Storage and Download Analytics Huawei Huawei PSF Sponsor Microsoft Microsoft PSF Sponsor NVIDIA NVIDIA PSF Sponsor Pingdom Pingdom Monitoring Salesforce Salesforce PSF Sponsor Sentry Sentry Error logging StatusPage StatusPage Status page