A performant library for parsing and crawling sitemaps

Ultimate Sitemap Parser (USP) is a performant and robust Python library for parsing and crawling sitemaps.

Features

Installation

pip install ultimate-sitemap-parser

or using Anaconda:

conda install -c conda-forge ultimate-sitemap-parser
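To confirm that either install command worked, the installed version can be read from package metadata. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of USP.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "ultimate-sitemap-parser") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None


# Prints a version string when the package is installed, otherwise None.
print(installed_version())
```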

Usage

from usp.tree import sitemap_tree_for_homepage

tree = sitemap_tree_for_homepage('https://www.example.org/')

for page in tree.all_pages():
    print(page.url)

sitemap_tree_for_homepage() returns a tree of AbstractSitemap subclass objects representing the sitemap hierarchy found on the website; the documentation includes a reference of the AbstractSitemap subclasses. AbstractSitemap.all_pages() returns a generator, so pages can be iterated without loading the entire tree into memory.
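Because all_pages() is a generator, it combines well with lazy tools such as itertools.islice. The sketch below is illustrative only: the helpers `first_urls`, `urls_under_path`, and `sample_site` are hypothetical names, and the demo at the end uses stand-in objects with a `url` attribute rather than a live crawl.

```python
from itertools import islice
from types import SimpleNamespace
from typing import Iterable, Iterator, List
from urllib.parse import urlparse


def first_urls(pages: Iterable, limit: int) -> List[str]:
    """Collect at most `limit` URLs, consuming the iterable lazily."""
    return [page.url for page in islice(pages, limit)]


def urls_under_path(pages: Iterable, prefix: str) -> Iterator[str]:
    """Yield URLs whose path starts with `prefix`, one page at a time."""
    for page in pages:
        if urlparse(page.url).path.startswith(prefix):
            yield page.url


def sample_site(homepage_url: str, limit: int = 10) -> List[str]:
    """Fetch a site's sitemap tree and return its first `limit` page URLs."""
    # Imported here so the helpers above work without the library installed.
    from usp.tree import sitemap_tree_for_homepage

    tree = sitemap_tree_for_homepage(homepage_url)
    return first_urls(tree.all_pages(), limit)


# Offline demo with stand-in page objects (assumption: only `.url` is needed).
demo_pages = [
    SimpleNamespace(url="https://www.example.org/blog/first-post"),
    SimpleNamespace(url="https://www.example.org/about"),
    SimpleNamespace(url="https://www.example.org/blog/second-post"),
]
for url in urls_under_path(demo_pages, "/blog/"):
    print(url)
```

Calling `sample_site('https://www.example.org/')` performs real network requests, so it is left out of the demo.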

For more examples and details, see the documentation.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ultimate_sitemap_parser-1.4.0.tar.gz (38.0 kB)

Uploaded Source

Built Distribution

ultimate_sitemap_parser-1.4.0-py3-none-any.whl (42.4 kB)

Uploaded Python 3

File details

Details for the file ultimate_sitemap_parser-1.4.0.tar.gz.

File metadata

  • Download URL: ultimate_sitemap_parser-1.4.0.tar.gz
  • Upload date:
  • Size: 38.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ultimate_sitemap_parser-1.4.0.tar.gz
Algorithm    Hash digest
SHA256       a997aa63568c8ccb0fc38bbbde29d022f0c61a0500298f86087128f6184c40f1
MD5          747f80e0e9b48bc646eff1cfab2fc1ea
BLAKE2b-256  01a42c80977724bc5ae172bd137179d8691afeb90fb79701bb7d89d0c6a1b2c9

See more details on using hashes here.
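A downloaded file can be checked against a published digest with the standard library. The following is a sketch, not an official verification procedure: the helper names `file_digest` and `matches_published` are hypothetical, and the local path in the usage note is an assumption about where the file was saved. The SHA256 value is the one listed above for the source distribution.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for ultimate_sitemap_parser-1.4.0.tar.gz.
EXPECTED_SDIST_SHA256 = "a997aa63568c8ccb0fc38bbbde29d022f0c61a0500298f86087128f6184c40f1"


def file_digest(path, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads are not read into memory at once."""
    hasher = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(path, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare a file's digest with a published hex digest, ignoring case."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

For example, `matches_published("ultimate_sitemap_parser-1.4.0.tar.gz", EXPECTED_SDIST_SHA256)` should be true for an intact download saved in the current directory.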

Provenance

The following attestation bundles were made for ultimate_sitemap_parser-1.4.0.tar.gz:

Publisher: publish.yml on GateNLP/ultimate-sitemap-parser

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ultimate_sitemap_parser-1.4.0-py3-none-any.whl.

File metadata

File hashes

Hashes for ultimate_sitemap_parser-1.4.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       01bd776b793b96d3b0c9dd638dcf38df8f36ea39db849f0d44da9c91bb09164f
MD5          9e93aaeda7c855e5bb7d37fec487dcd1
BLAKE2b-256  3985a75cc6f8d203ec136ebd18b9ddd7fc04c6b5f2b70c66d39bb060577ee79c

Provenance

The following attestation bundles were made for ultimate_sitemap_parser-1.4.0-py3-none-any.whl:

Publisher: publish.yml on GateNLP/ultimate-sitemap-parser
