A performant library for parsing and crawling sitemaps

Project description

Ultimate Sitemap Parser (USP) is a performant and robust Python library for parsing and crawling sitemaps.

Features

Installation

pip install ultimate-sitemap-parser

or using Anaconda:

conda install -c conda-forge ultimate-sitemap-parser

Usage

from usp.tree import sitemap_tree_for_homepage

tree = sitemap_tree_for_homepage('https://www.example.org/')

for page in tree.all_pages():
    print(page.url)

sitemap_tree_for_homepage() returns a tree of AbstractSitemap subclass objects that represents the sitemap hierarchy found on the website; the documentation includes a reference of the AbstractSitemap subclasses. AbstractSitemap.all_pages() returns a generator, so pages can be iterated efficiently without loading the entire tree into memory.
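Because all_pages() is a generator, it can be consumed lazily. The following is a minimal sketch of that pattern, not part of USP itself: `unique_urls`, `first_urls`, and `fake_pages` are hypothetical helpers introduced here for illustration, and the only assumption about page objects is the `.url` attribute used in the example above. `fake_pages()` stands in for `tree.all_pages()` so the snippet runs without network access.

```python
from itertools import islice
from types import SimpleNamespace
from typing import Iterable, Iterator, List


def unique_urls(pages: Iterable) -> Iterator[str]:
    """Lazily yield each distinct page.url in first-seen order."""
    seen = set()
    for page in pages:
        if page.url not in seen:
            seen.add(page.url)
            yield page.url


def first_urls(pages: Iterable, limit: int) -> List[str]:
    """Return at most `limit` distinct URLs, pulling from `pages` only as needed."""
    return list(islice(unique_urls(pages), limit))


def fake_pages():
    """Hypothetical stand-in for tree.all_pages(): objects exposing only .url."""
    for path in ["/", "/about", "/", "/contact", "/blog"]:
        yield SimpleNamespace(url="https://www.example.org" + path)


sample = first_urls(fake_pages(), 3)
print(sample)
```

With a real tree, the same helper would presumably be called as `first_urls(tree.all_pages(), 3)`; since `islice` stops pulling from the generator once the limit is reached, the remaining pages are never iterated.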

For more examples and details, see the documentation.

Download files

Source Distribution

ultimate_sitemap_parser-1.3.1.tar.gz (37.9 kB)

Uploaded Source

Built Distribution

ultimate_sitemap_parser-1.3.1-py3-none-any.whl (42.2 kB)

Uploaded Python 3

File details

Details for the file ultimate_sitemap_parser-1.3.1.tar.gz.

File hashes

Hashes for ultimate_sitemap_parser-1.3.1.tar.gz:

Algorithm    Hash digest
SHA256       c227dd10da5c5d6ec67c4add04170d0ee4008d966bf8c489a002bf3e85b35983
MD5          63ab9221df2a6d73a8c4dafe76ba4fa0
BLAKE2b-256  90de656c4c15be1ac512b3435d8052a0ade2fd334bf6eeda6760bcc11023f61e

Provenance

The following attestation bundles were made for ultimate_sitemap_parser-1.3.1.tar.gz:

Publisher: publish.yml on GateNLP/ultimate-sitemap-parser

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ultimate_sitemap_parser-1.3.1-py3-none-any.whl.

File hashes

Hashes for ultimate_sitemap_parser-1.3.1-py3-none-any.whl:

Algorithm    Hash digest
SHA256       fa88d3a390dd6c9dcdf1bf5c743267399a95ff596514b43562dcb8a3a8c2b5ac
MD5          c849c25b30fb40185caf1e16fae2ae70
BLAKE2b-256  15943c8487ae1c6ec996916df8ca0200793508aa81263c87d2ca0fec3cafeef9

Provenance

The following attestation bundles were made for ultimate_sitemap_parser-1.3.1-py3-none-any.whl:

Publisher: publish.yml on GateNLP/ultimate-sitemap-parser
