
Generate sitemap.xml for Datasette sites


datasette-sitemap



Installation

Install this plugin in the same environment as Datasette.

datasette install datasette-sitemap

Demo

This plugin is used for the sitemap on til.simonwillison.net.

Here's the configuration used for that sitemap.

Usage

Once configured, this plugin adds a sitemap at /sitemap.xml with a list of URLs.

This list is defined using a SQL query in metadata.json (or .yml) that looks like this:

{
  "plugins": {
    "datasette-sitemap": {
      "query": "select '/' || id as path from my_table"
    }
  }
}

Using metadata.yml allows for multi-line SQL queries which can be easier to maintain:

plugins:
  datasette-sitemap:
    query: |
      select
        '/' || id as path
      from
        my_table

The SQL query must return a column called path. The values in this column must begin with a /. They will be used to generate a sitemap that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/1</loc></url>
  <url><loc>https://example.com/2</loc></url>
</urlset>
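To make the mapping from paths to sitemap entries concrete, here is a minimal Python sketch that produces the same XML shape from a list of paths. The build_sitemap function is hypothetical and is not the plugin's own code; it only assumes that each URL is the base URL followed by the path.

```python
from xml.sax.saxutils import escape


def build_sitemap(base_url, paths):
    """Return sitemap XML for the given paths (illustrative, not the plugin's code)."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for path in paths:
        # Mirror the rule above: every path must begin with a /
        if not path.startswith("/"):
            raise ValueError(f"path must begin with /: {path!r}")
        # Escape &, < and > so the URL is valid inside an XML element
        lines.append(f"  <url><loc>{escape(base_url + path)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


sitemap_xml = build_sitemap("https://example.com", ["/1", "/2"])
print(sitemap_xml)
```

Because the base URL and the path are joined directly, a base URL ending in a slash would produce a double slash in every entry.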

You can use UNION in your SQL query to combine results from multiple tables, or to add literal paths that you want to appear in the sitemap:

select
  '/data/table1/' || id as path
  from table1
union
select
  '/data/table2/' || id as path
  from table2
union
select
  '/about' as path

If your Datasette instance has multiple databases you can configure the database to query using the database configuration property.

By default the domain name for the generated URLs in the sitemap will be detected from the incoming request.

You can set base_url instead to override this. This should not include a trailing slash.

This example shows both of those settings, running the query against the content database and setting a custom base URL:

plugins:
  datasette-sitemap:
    query: |
      select '/plugins/' || name as path from plugins
      union
      select '/tools/' || name as path from tools
      union
      select '/news' as path
    database: content
    base_url: https://datasette.io


robots.txt

This plugin adds a robots.txt file pointing to the sitemap:

Sitemap: http://example.com/sitemap.xml

You can take full control of the robots.txt file by installing and configuring the datasette-block-robots plugin.

This plugin will add the Sitemap: line even if you are using datasette-block-robots for the rest of your robots.txt file.

Adding paths to the sitemap from other plugins

This plugin adds a new plugin hook to Datasette called sitemap_extra_paths() which can be used by other plugins to add their own additional paths to the sitemap.xml file.

The hook accepts these optional parameters:

  • datasette: The current Datasette instance. You can use this to execute SQL queries or read plugin configuration settings.
  • request: The Request object representing the incoming request to /sitemap.xml.

The hook should return a list of strings, each representing a path to be added to the sitemap. Each path must begin with a /.

It can also return an async def function, which will be awaited and used to generate the list of paths. Use this option if you need to make await calls inside your hook implementation.

This example uses the hook to add two extra paths, one of which came from a SQL query:

from datasette import hookimpl

@hookimpl
def sitemap_extra_paths(datasette):
    async def inner():
        db = datasette.get_database()
        path_from_db = (await db.execute("select '/example'")).single_value()
        return ["/about", path_from_db]
    return inner
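Since every path returned by the hook must begin with a /, a hook implementation that gathers paths from several places may want to filter them before returning. The helper below is a hypothetical sketch of that check in plain Python; clean_paths is not part of this plugin or of Datasette.

```python
def clean_paths(candidates):
    """Keep only string values that begin with "/", preserving their order."""
    return [
        path
        for path in candidates
        if isinstance(path, str) and path.startswith("/")
    ]


extra_paths = clean_paths(["/about", "contact", None, "/news"])
print(extra_paths)
```

A hook could then end with return clean_paths(...) rather than returning the raw list. Silently dropping bad values is a design choice; raising an error instead would surface mistakes earlier.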

Development

To set up this plugin locally, first check out the code. Then create a new virtual environment:

cd datasette-sitemap
python3 -m venv venv
source venv/bin/activate

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
