A tool to convert XML sitemaps to Atom feeds

Project description

sitemap2atom

A simple tool that converts an XML sitemap into an Atom feed. It is especially useful for sites that have no CMS, or whose CMS doesn't produce a feed. Each URL in the sitemap is fetched, and its Open Graph and Twitter Card metadata (title, description, image, author, dates) is used to build a rich Atom entry.
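To make the enrichment step concrete, here is a minimal sketch of pulling Open Graph and Twitter Card tags out of a page using only the standard library. It is an illustration of the general technique, not sitemap2atom's actual implementation; the names `MetaTagParser` and `extract_metadata` are hypothetical, and the sample HTML is made up.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect Open Graph and Twitter Card <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Open Graph uses property="og:..."; Twitter Cards use name="twitter:...".
        key = attr.get("property") or attr.get("name")
        content = attr.get("content")
        if key and content and key.startswith(("og:", "twitter:", "article:")):
            # Keep the first value seen for each key.
            self.meta.setdefault(key, content)


def extract_metadata(html):
    """Return a dict of the social metadata tags found in an HTML string."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.meta


# Made-up sample page; a real run would use the HTML fetched from each URL.
sample = """
<html><head>
<meta property="og:title" content="Hello World">
<meta property="og:description" content="A first post">
<meta name="twitter:image" content="https://example.com/hello.png">
<meta name="viewport" content="width=device-width">
</head><body></body></html>
"""

metadata = extract_metadata(sample)
print(metadata)
```

Unrelated tags such as `viewport` are ignored, so only the social metadata ends up in the result.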

Installation

Run without installing (uvx)

sitemap2atom is published on PyPI, so you can run it directly with uvx without installing it first:

uvx sitemap2atom https://example.com/sitemap.xml -o feed.atom

To run the latest code straight from GitHub (before a release, or to try main):

uvx --from git+https://github.com/darkflib/sitemap2atom sitemap2atom https://example.com/sitemap.xml

Install as a tool / library

uv tool install sitemap2atom      # installs the `sitemap2atom` command
# or
pip install sitemap2atom

Usage

sitemap2atom SITEMAP_URL [OPTIONS]

By default the feed is written to standard output; redirect it or use -o to save it to a file:

# Print to stdout
sitemap2atom https://example.com/sitemap.xml

# Write to a file, limiting to the first 20 URLs
sitemap2atom https://example.com/sitemap.xml -o feed.atom --limit 20

Options

  • -o, --output PATH — write the Atom feed to this file (default: stdout).
  • --limit N — maximum number of sitemap URLs to process (default: all).
  • --feed-title TEXT — title for the generated feed (default: Enriched URL Feed).
  • --timeout SECONDS — per-request timeout in seconds (default: 10).
  • -v, --verbose — enable info-level logging on stderr.
  • --version — show the version and exit.
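If you want to drive the command from a Python script, the options above map naturally onto an argument list. The sketch below builds such a list using only the flags documented here; `build_command` is a hypothetical helper for illustration and is not part of the package.

```python
import shlex


def build_command(sitemap_url, output=None, limit=None, feed_title=None,
                  timeout=None, verbose=False):
    """Build an argument list for the sitemap2atom CLI (hypothetical helper)."""
    cmd = ["sitemap2atom", sitemap_url]
    if output is not None:
        cmd += ["--output", str(output)]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if feed_title is not None:
        cmd += ["--feed-title", feed_title]
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]
    if verbose:
        cmd.append("--verbose")
    return cmd


cmd = build_command(
    "https://example.com/sitemap.xml",
    output="feed.atom",
    limit=20,
    feed_title="My Feed",
    timeout=5,
    verbose=True,
)
# shlex.join renders the list as a shell-safe command line for logging.
print(shlex.join(cmd))
```

With the tool installed, the list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting problems with feed titles that contain spaces.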

As a library

from sitemap2atom import fetch_sitemap_urls, enrich_url_list_to_atom, feed_to_pretty_xml

urls = fetch_sitemap_urls("https://example.com/sitemap.xml")
feed = enrich_url_list_to_atom(urls[:10], feed_title="My Feed")
print(feed_to_pretty_xml(feed))

Example output

See this gist for a sample of the kind of enriched Atom feed produced: https://gist.github.com/Darkflib/989b8f3a5a1ea995e8e294669d5e282a
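For readers who want a feel for the structure without opening the gist, the following sketch builds one generic Atom entry from a metadata dict with the standard library. It shows the usual Atom elements (id, title, updated, link, summary) and is not a reproduction of what sitemap2atom emits; `build_entry`, the metadata values, and the timestamp are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Serialise Atom elements with a default namespace instead of an ns0: prefix.
ET.register_namespace("", ATOM_NS)


def build_entry(url, metadata, updated):
    """Build a single Atom <entry> element (hypothetical helper)."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = url
    # Fall back to the URL when the page has no og:title.
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = metadata.get("og:title", url)
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {"rel": "alternate", "href": url})
    description = metadata.get("og:description")
    if description:
        ET.SubElement(entry, f"{{{ATOM_NS}}}summary").text = description
    return entry


entry = build_entry(
    "https://example.com/hello",
    {"og:title": "Hello World", "og:description": "A first post"},
    "2024-01-01T00:00:00Z",  # placeholder timestamp for the example
)
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```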

Limitations

This is a simple tool aimed at basic use cases. It does not support authentication, sitemap index files / pagination, or dynamic sitemaps, and may not handle every sitemap or page format. Treat the sitemap and the pages it references as untrusted input, and run the tool only against sources you trust.
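Because sitemap index files are not supported, it can help to check what kind of document a URL serves before passing it to the tool. The sketch below tells a regular sitemap (`urlset`) from an index (`sitemapindex`) by its root element and lists the `loc` values. `read_sitemap` is a hypothetical helper and the sample documents are made up; it is not how sitemap2atom parses sitemaps.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def read_sitemap(xml_text):
    """Return (kind, locations) for a sitemap document (hypothetical helper).

    kind is "urlset" for a regular sitemap or "sitemapindex" for an index.
    """
    root = ET.fromstring(xml_text)
    # Strip the "{namespace}" part from the root tag.
    kind = root.tag.rsplit("}", 1)[-1]
    child = "sitemap" if kind == "sitemapindex" else "url"
    locations = [
        loc.text.strip()
        for loc in root.iterfind(f"{SITEMAP_NS}{child}/{SITEMAP_NS}loc")
        if loc.text
    ]
    return kind, locations


# Made-up sample documents.
URLSET = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/about</loc></url>"
    "</urlset>"
)
INDEX = (
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<sitemap><loc>https://example.com/sitemap-posts.xml</loc></sitemap>"
    "</sitemapindex>"
)

print(read_sitemap(URLSET))
print(read_sitemap(INDEX))
```

When the kind is `sitemapindex`, each returned location is itself a sitemap, so one workaround is to run sitemap2atom once per child sitemap.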

Development

This project uses uv.

git clone https://github.com/darkflib/sitemap2atom.git
cd sitemap2atom
uv sync
uv run pytest

See CONTRIBUTING.md for more, and CHANGELOG.md for release notes.

License

This project is licensed under the MIT License — see the LICENSE file for details.

PS. If you do anything interesting with this code, please let me know! I'd love to hear about it.

Download files

Source Distribution

sitemap2atom-0.1.0.tar.gz (51.4 kB view details)

Uploaded Source

Built Distribution

sitemap2atom-0.1.0-py3-none-any.whl (9.1 kB view details)

Uploaded Python 3

File details

Details for the file sitemap2atom-0.1.0.tar.gz.

File metadata

  • Download URL: sitemap2atom-0.1.0.tar.gz
  • Upload date:
  • Size: 51.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sitemap2atom-0.1.0.tar.gz
Algorithm Hash digest
SHA256 830371ab01d42f534e2fa372cac19d56681cee66f0628a274edb532b7a0290d5
MD5 9ba2664fbefebace004ad0b3d722c913
BLAKE2b-256 ef2e9d878ff3f1b6a9236b2ae39795e476961f9e6a22b19ed618468ecf57eba7

Provenance

The following attestation bundles were made for sitemap2atom-0.1.0.tar.gz:

Publisher: workflow.yml on Darkflib/sitemap2atom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sitemap2atom-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: sitemap2atom-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 9.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sitemap2atom-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e263ffe853ae60b14fb8affbbe8cbd4473102b71741678ad1a2ff4045345bc9a
MD5 816522c9f205a6eb55c04e9df730674b
BLAKE2b-256 0ab6c324931df0bd40dfc0c1585422f6aad78d9162a1fc6e30d3b6ea150f819d

Provenance

The following attestation bundles were made for sitemap2atom-0.1.0-py3-none-any.whl:

Publisher: workflow.yml on Darkflib/sitemap2atom

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
