
Ultimate Sitemap Parser



Website sitemap parser for Python 3.5+.

Features

Installation

pip install ultimate_sitemap_parser

Usage

from usp.tree import sitemap_tree_for_homepage

tree = sitemap_tree_for_homepage('https://www.nytimes.com/')
print(tree)

sitemap_tree_for_homepage() returns a tree of AbstractSitemap subclass objects that represents the sitemap hierarchy found on the website; see the reference documentation for the AbstractSitemap subclasses.
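To get a feel for the hierarchy, the returned tree can be walked recursively. The helper below is a minimal sketch, not part of the library: sitemap_outline is a hypothetical name, and it assumes that every sitemap object has a url attribute and that index sitemaps list their children in a sub_sitemaps attribute. Check both against the AbstractSitemap reference before relying on them.

```python
def sitemap_outline(sitemap, depth=0):
    """Return indented lines, one per sitemap, starting at the given node."""
    # Assumption: every sitemap object exposes `url`, and index sitemaps
    # additionally expose their children as `sub_sitemaps`. Nodes without
    # `sub_sitemaps` are treated as leaves.
    lines = ['  ' * depth + type(sitemap).__name__ + ': ' + sitemap.url]
    for child in getattr(sitemap, 'sub_sitemaps', []):
        lines.extend(sitemap_outline(child, depth + 1))
    return lines
```

With the tree from the usage example, print('\n'.join(sitemap_outline(tree))) would print one line per sitemap, indented by nesting level.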

To list every page found across all of the website's sitemaps, use the all_pages() method:

# all_pages() returns an Iterator
for page in tree.all_pages():
    print(page)

The all_pages() method returns an iterator that yields SitemapPage objects; see the SitemapPage reference.
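Because all_pages() yields pages lazily, the results can be summarised in a single pass without building a full list first. The following is a rough sketch under one assumption: each yielded page exposes its address as a url attribute. The function name count_pages_by_section is hypothetical and not part of the library.

```python
from collections import Counter
from urllib.parse import urlparse


def count_pages_by_section(pages):
    """Count pages by the first segment of their URL path."""
    counts = Counter()
    for page in pages:
        # Assumption: each page object exposes its address as `page.url`.
        path = urlparse(page.url).path
        segments = [segment for segment in path.split('/') if segment]
        # Pages at the site root have no path segment; file them under '/'.
        counts[segments[0] if segments else '/'] += 1
    return counts
```

Calling count_pages_by_section(tree.all_pages()) would then give a per-section page count for the site, and counts.most_common(10) lists the largest sections.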

Download files


Files for ultimate-sitemap-parser, version 0.5:

ultimate_sitemap_parser-0.5-py2.py3-none-any.whl (23.2 kB), wheel, Python py2.py3
ultimate_sitemap_parser-0.5.tar.gz (20.2 kB), source
