asgi-sitemaps

Sitemap generation for ASGI applications. Inspired by Django's sitemap framework.

Features

  • Build and compose sitemap sections into a single dynamic ASGI endpoint.
  • Draw sitemap items from a variety of sources (static lists, sync or async ORM queries, etc.).
  • Compatible with any ASGI framework.
  • Fully type annotated.
  • 100% test coverage.

Installation

Install with pip:

$ pip install 'asgi-sitemaps==1.*'

asgi-sitemaps requires Python 3.7+.

Quickstart

Let's build a static sitemap for a "Hello, world!" application. The sitemap will contain a single URL entry for the home / endpoint.

Here is the project file structure:

.
└── server
    ├── __init__.py
    ├── app.py
    └── sitemap.py

First, declare a sitemap section by subclassing Sitemap, then wrap it in a SitemapApp:

# server/sitemap.py
import asgi_sitemaps

class Sitemap(asgi_sitemaps.Sitemap):
    def items(self):
        return ["/"]

    def location(self, item: str):
        return item

    def changefreq(self, item: str):
        return "monthly"

sitemap = asgi_sitemaps.SitemapApp(Sitemap(), domain="example.io")

Now, register the sitemap endpoint as a route in your ASGI app. For example, if using Starlette:

# server/app.py
from starlette.applications import Starlette
from starlette.responses import PlainTextResponse
from starlette.routing import Route
from .sitemap import sitemap

async def home(request):
    return PlainTextResponse("Hello, world!")

routes = [
    Route("/", home),
    Route("/sitemap.xml", sitemap),
]

app = Starlette(routes=routes)

Serve the app using $ uvicorn server.app:app, then request the sitemap:

$ curl http://localhost:8000/sitemap.xml
<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.io/</loc>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>

Tada!
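If you want to check the response in a test rather than by eye, a minimal sketch using only the standard library could look like the following. The `parse_sitemap` helper and the `SAMPLE` constant are names made up for this illustration (they are not part of asgi-sitemaps); the sample body is the quickstart output shown above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

# Namespace used by the sitemaps.org 0.9 schema.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# The quickstart response body, as bytes (what an HTTP client would receive).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.io/</loc>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
"""


def parse_sitemap(body: bytes) -> List[Dict[str, Optional[str]]]:
    """Return one dict per <url> entry, with the raw text of each field."""
    root = ET.fromstring(body)
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        entries.append(
            {
                "loc": url.findtext("sm:loc", namespaces=SITEMAP_NS),
                "changefreq": url.findtext("sm:changefreq", namespaces=SITEMAP_NS),
                "priority": url.findtext("sm:priority", namespaces=SITEMAP_NS),
            }
        )
    return entries


entries = parse_sitemap(SAMPLE)
```

Fields that are absent from an entry (for example `changefreq` when the sitemap returns `None`) come back as `None` in this sketch.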

To learn more:

  • See How-To for more advanced usage, including splitting the sitemap into multiple sections, and dynamically generating entries from database queries.
  • See the Sitemap API reference for all supported sitemap options.

How-To

Sitemap sections

You can combine multiple sitemap classes into a single sitemap endpoint. This is useful to split the sitemap into multiple sections that may have different items() and/or sitemap attributes. Such sections could be static pages, blog posts, recent articles, etc.

To do so, declare multiple sitemap classes, then pass them as a list to SitemapApp:

# server/sitemap.py
import asgi_sitemaps

class StaticSitemap(asgi_sitemaps.Sitemap):
    ...

class BlogSitemap(asgi_sitemaps.Sitemap):
    ...

sitemap = asgi_sitemaps.SitemapApp([StaticSitemap(), BlogSitemap()], domain="example.io")

Entries from each sitemap will be concatenated when building the final sitemap.xml.

Dynamic generation from database queries

Sitemap.items() can be an async function, and it can return any async iterable. This means you can easily integrate with an async database client or ORM so that Sitemap.items() fetches and returns relevant rows for generating your sitemap.

Here's an example using Databases, assuming you have a Database instance in server/resources.py:

# server/sitemap.py
import asgi_sitemaps
from .resources import database

class Sitemap(asgi_sitemaps.Sitemap):
    async def items(self):
        query = "SELECT permalink, updated_at FROM articles;"
        return await database.fetch_all(query)

    def location(self, row: dict):
        return row["permalink"]

Advanced web framework integration

While asgi-sitemaps is framework-agnostic, you can use the .scope attribute available on Sitemap instances to feed the ASGI scope into your framework-specific APIs for inspecting and manipulating request information.

Here is an example with Starlette where we build a sitemap of static pages. To decouple from the raw URL paths, pages are referred to by view name. We reverse-lookup their URLs by building a Request instance from the ASGI .scope, and using .url_for():

# server/sitemap.py
import asgi_sitemaps
from starlette.datastructures import URL
from starlette.requests import Request

class StaticSitemap(asgi_sitemaps.Sitemap):
    def items(self):
        return ["home", "about", "blog:home"]

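    def location(self, name: str):
        request = Request(scope=self.scope)
        # url_for() returns a str or a URL object depending on the Starlette
        # version, so normalize to str before extracting the path.
        url = request.url_for(name)
        return URL(str(url)).path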

The corresponding Starlette routing table could look something like this:

# server/routes.py
from starlette.routing import Route
from . import views
from .sitemap import sitemap

routes = [
    Route("/", views.home, name="home"),
    Route("/about", views.about, name="about"),
    Route("/blog/", views.blog_home, name="blog:home"),
    Route("/sitemap.xml", sitemap),
]

API Reference

class Sitemap

Represents a source of sitemap entries.

You can specify the type T of sitemap items for extra type safety:

import asgi_sitemaps

class MySitemap(asgi_sitemaps.Sitemap[str]):
    ...

async items

Signature: async def () -> Union[Iterable[T], AsyncIterable[T]]

(Required) Return an iterable or an asynchronous iterable of items of the same type. Each item will be passed as-is to .location(), .lastmod(), .changefreq(), and .priority().

Examples:

# Simplest usage: return a list
def items(self) -> List[str]:
    return ["/", "/contact"]

# Async operations are also supported
async def items(self) -> List[dict]:
    query = "SELECT permalink, updated_at FROM pages;"
    return await database.fetch_all(query)

# Sync and async generators are also supported
async def items(self) -> AsyncIterator[dict]:
    query = "SELECT permalink, updated_at FROM pages;"
    # Assumes `database` is a Databases instance, which streams rows via iterate()
    async for row in database.iterate(query):
        yield row
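To make the three shapes above concrete, here is a small standard-library sketch of how a caller could normalize them into a plain list. It is only an illustration of the documented contract, not the library's actual implementation; `collect_items` and the three sample functions are hypothetical names.

```python
import asyncio
import inspect
from typing import Any, AsyncIterator, List


async def collect_items(result: Any) -> List[Any]:
    """Normalize whatever an items() call returned into a plain list."""
    # `async def items()` without `yield` returns a coroutine: await it first.
    if inspect.isawaitable(result):
        result = await result
    # Async generators and other async iterables are drained with `async for`.
    if hasattr(result, "__aiter__"):
        return [item async for item in result]
    # Lists, tuples and sync generators are regular iterables.
    return list(result)


def sync_items() -> List[str]:
    return ["/", "/contact"]


async def coroutine_items() -> List[str]:
    return ["/", "/contact"]


async def generator_items() -> AsyncIterator[str]:
    for path in ["/", "/contact"]:
        yield path


async def main() -> List[List[str]]:
    return [
        await collect_items(sync_items()),
        await collect_items(coroutine_items()),
        await collect_items(generator_items()),
    ]


results = asyncio.run(main())
```

All three calls yield the same list, which is why the choice between them can be driven purely by what your data source offers.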

location

Signature: def (item: T) -> str

(Required) Return the absolute path of a sitemap item.

"Absolute path" means a URL path without a protocol or domain. For example: /blog/my-article. (So https://mydomain.com/blog/my-article is not a valid location, nor is mydomain.com/blog/my-article.)

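If locations come from user-provided data, it may be worth checking them before they reach the sitemap. The helper below is a hypothetical sketch (asgi-sitemaps does not ship it) that applies the rule for .location() values using the standard library:

```python
from urllib.parse import urlsplit


def is_absolute_path(location: str) -> bool:
    """Return True if `location` is a URL path with no protocol or domain."""
    parts = urlsplit(location)
    # A valid location has no scheme ("https"), no network location
    # ("mydomain.com"), and a path that starts at the root.
    return not parts.scheme and not parts.netloc and parts.path.startswith("/")


checks = {
    "/blog/my-article": is_absolute_path("/blog/my-article"),
    "https://mydomain.com/blog/my-article": is_absolute_path(
        "https://mydomain.com/blog/my-article"
    ),
    "mydomain.com/blog/my-article": is_absolute_path("mydomain.com/blog/my-article"),
}
```

Only the first of the three values passes the check.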
lastmod

Signature: def (item: T) -> Optional[datetime.datetime]

(Optional) Return the date of last modification of a sitemap item as a datetime object, or None (the default) for no lastmod field.
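As a sketch, here is what a lastmod() method could look like for rows that carry an `updated_at` column, as in the queries used elsewhere in this document. To keep the snippet runnable without the package installed, `ArticleSitemapSketch` is a plain class with a hypothetical name; in a real project it would subclass asgi_sitemaps.Sitemap. Accepting ISO 8601 strings is an assumption about how some drivers return timestamps.

```python
import datetime
from typing import Optional


class ArticleSitemapSketch:
    # In a real project: class ArticleSitemap(asgi_sitemaps.Sitemap): ...

    def lastmod(self, row: dict) -> Optional[datetime.datetime]:
        value = row["updated_at"]
        if value is None:
            # No known modification date: omit the lastmod field.
            return None
        if isinstance(value, datetime.datetime):
            return value
        # Assumption: some drivers return timestamps as ISO 8601 strings.
        return datetime.datetime.fromisoformat(value)


sketch = ArticleSitemapSketch()
parsed = sketch.lastmod({"permalink": "/blog/a", "updated_at": "2020-07-05T12:00:00"})
missing = sketch.lastmod({"permalink": "/blog/b", "updated_at": None})
```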

changefreq

Signature: def (item: T) -> Optional[str]

(Optional) Return the change frequency of a sitemap item.

Possible values are:

  • None - No changefreq field (the default).
  • "always"
  • "hourly"
  • "daily"
  • "weekly"
  • "monthly"
  • "yearly"
  • "never"
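One possible way to pick a value is to derive it from how recently an item changed. The sketch below is purely illustrative: `changefreq_for` is a hypothetical helper and the age thresholds are arbitrary choices, not recommendations from asgi-sitemaps or from the sitemaps protocol.

```python
import datetime
from typing import Optional

# The values accepted for changefreq, as listed above (besides None).
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def changefreq_for(
    updated_at: Optional[datetime.datetime], now: datetime.datetime
) -> Optional[str]:
    """Guess a change frequency from the age of the last modification."""
    if updated_at is None:
        return None  # No changefreq field.
    age = now - updated_at
    if age < datetime.timedelta(days=1):
        return "hourly"
    if age < datetime.timedelta(days=7):
        return "daily"
    if age < datetime.timedelta(days=30):
        return "weekly"
    return "monthly"


now = datetime.datetime(2022, 2, 13)
recent = changefreq_for(datetime.datetime(2022, 2, 12, 12, 0), now)
old = changefreq_for(datetime.datetime(2021, 1, 1), now)
```

A changefreq() method could then call such a helper with the item's modification date and the current time.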

priority

Signature: def (item: T) -> float

(Optional) Return the priority of a sitemap item. Must be between 0 and 1. Defaults to 0.5.
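For instance, a priority could be derived from how deep a page sits in the URL hierarchy. This is only a sketch: `priority_for_path` is a hypothetical helper, and the 0.2 step per path segment is an arbitrary assumption. The clamp keeps the result between 0 and 1, as required.

```python
def priority_for_path(path: str) -> float:
    """Give shallower pages a higher priority, clamped to [0, 1]."""
    # Count non-empty path segments: "/" has 0, "/blog/my-article" has 2.
    depth = len([segment for segment in path.split("/") if segment])
    raw = round(1.0 - 0.2 * depth, 1)
    return max(0.0, min(1.0, raw))


home = priority_for_path("/")
article = priority_for_path("/blog/my-article")
deep = priority_for_path("/a/b/c/d/e/f")
```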

protocol

Type: str

(Optional) This attribute defines the protocol used to build URLs of the sitemap.

Possible values are:

  • "auto" - The protocol with which the sitemap was requested (the default).
  • "http"
  • "https"
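The options above can be read as follows, sketched here with the standard library only. This is one interpretation of the documented behavior, not the library's code: `resolve_protocol` and `build_url` are hypothetical names, and the sketch relies on the ASGI HTTP scope exposing the request scheme under the "scheme" key (defaulting to "http" when absent).

```python
def resolve_protocol(protocol: str, scope: dict) -> str:
    """Pick the protocol used in sitemap URLs for the current request."""
    if protocol == "auto":
        # Use the scheme the sitemap was requested with.
        return scope.get("scheme", "http")
    if protocol not in ("http", "https"):
        raise ValueError(f"Unsupported protocol: {protocol!r}")
    return protocol


def build_url(protocol: str, domain: str, path: str, scope: dict) -> str:
    return f"{resolve_protocol(protocol, scope)}://{domain}{path}"


http_scope = {"type": "http", "scheme": "http", "path": "/sitemap.xml"}
auto_url = build_url("auto", "example.io", "/", http_scope)
forced_url = build_url("https", "example.io", "/", http_scope)
```

If the app sits behind a proxy that terminates TLS, the scope may report "http" even for HTTPS visitors, in which case setting the protocol to "https" explicitly is probably what you want.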

scope

This property returns the ASGI scope of the current HTTP request.
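Without a framework, the scope can also be inspected directly. The sketch below reads a request header from a scope dict; `get_header` is a hypothetical helper, and the sample scope is hand-written for illustration. It relies on the ASGI convention that headers are stored as pairs of byte strings.

```python
from typing import Optional


def get_header(scope: dict, name: str) -> Optional[str]:
    """Return the first value of a request header, or None if absent."""
    target = name.lower().encode("latin-1")
    # ASGI stores headers as an iterable of (name, value) byte-string pairs.
    for key, value in scope.get("headers", []):
        if key.lower() == target:
            return value.decode("latin-1")
    return None


sample_scope = {
    "type": "http",
    "path": "/sitemap.xml",
    "headers": [(b"host", b"example.io"), (b"accept", b"application/xml")],
}
host = get_header(sample_scope, "Host")
```

Inside a Sitemap method, the same helper could be called with self.scope.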

class SitemapApp

An ASGI application that responds to HTTP requests with the sitemap.xml contents of the sitemap.

Parameters:

  • (Required) sitemaps - A Sitemap object or a list of Sitemap objects, used to generate sitemap entries.
  • (Required) domain - The domain to use when generating sitemap URLs.

Examples:

sitemap = SitemapApp(Sitemap(), domain="mydomain.com")
sitemap = SitemapApp([StaticSitemap(), BlogSitemap()], domain="mydomain.com")
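Since a SitemapApp is an ASGI application, it can in principle be mounted without any framework. The sketch below shows a minimal path-based dispatcher in plain ASGI. To keep it runnable without the package installed, `fake_sitemap` stands in for a real SitemapApp instance; `make_app` and `request` are likewise hypothetical helpers written for this illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Tuple

ASGIApp = Callable[..., Awaitable[None]]


def make_app(sitemap_app: ASGIApp) -> ASGIApp:
    """Wrap a sitemap ASGI app in a tiny router that serves /sitemap.xml."""

    async def app(scope: dict, receive: Any, send: Any) -> None:
        if scope["type"] != "http":
            return  # Lifespan and WebSocket handling are omitted in this sketch.
        if scope["path"] == "/sitemap.xml":
            # Delegate the whole request to the sitemap application.
            await sitemap_app(scope, receive, send)
            return
        await send(
            {
                "type": "http.response.start",
                "status": 404,
                "headers": [(b"content-type", b"text/plain")],
            }
        )
        await send({"type": "http.response.body", "body": b"Not Found"})

    return app


async def fake_sitemap(scope: dict, receive: Any, send: Any) -> None:
    # Stand-in for SitemapApp(...): responds with a fixed XML body.
    await send(
        {
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"application/xml")],
        }
    )
    await send({"type": "http.response.body", "body": b"<urlset/>"})


def request(app: ASGIApp, path: str) -> Tuple[int, bytes]:
    """Drive an ASGI app with a single GET request and collect the response."""
    messages: List[dict] = []

    async def receive() -> dict:
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message: dict) -> None:
        messages.append(message)

    scope = {"type": "http", "method": "GET", "path": path, "headers": []}
    asyncio.run(app(scope, receive, send))
    body = b"".join(m.get("body", b"") for m in messages[1:])
    return messages[0]["status"], body


app = make_app(fake_sitemap)
found = request(app, "/sitemap.xml")
not_found = request(app, "/nope")
```

In a real project you would pass your SitemapApp instance to `make_app` instead of the stand-in, or simply use your framework's routing as in the quickstart.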

License

MIT

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog.

1.0 - 2022-02-13

Added

  • Now marked as Production/Stable software. (Pull #14)
  • Add official support for Python 3.9 and Python 3.10. (Pull #13)

0.3.2 - 2020-07-07

Fixed

  • Fix support for async items. (Pull #9)

0.3.1 - 2020-07-05

Fixed

  • Fix Scope type hint: values are now Any.

0.3.0 - 2020-07-05

This release changes the approach from "scrape the ASGI app to gather URLs" to a programmatic class-based API inspired by Django's sitemap framework.

As such, the command line application does not exist anymore. Users are expected to define Sitemap classes, compose them into a SitemapApp endpoint, and add that to their ASGI app routing table.

See the new README.md documentation for more information.

Changed

  • Switch to a class-based dynamic endpoint API. (Pull #4)

0.2.0 - 2020-06-01

Changed

  • Project was renamed from sitemaps to asgi-sitemaps - sitemap generation for ASGI apps. (Pull #2)
  • Change options of CLI and programmatic API to fit new "ASGI-only" project scope. (Pull #2)
  • CLI now reads from stdin (for --check mode) and outputs sitemap to stdout. (Pull #2)

Removed

  • Drop support for crawling arbitrary remote servers. (Pull #2)

Fixed

  • Don't include non-200 or non-HTML URLs in sitemap. (Pull #2)

0.1.0 - 2020-05-31

Added

  • Initial implementation: CLI and programmatic async API.
