
A sitemap generator suitable for applications with greater than 50,000 URLs.

Project Description

This module was based on the big_sitemap ruby gem.

From the gem description:

BigSitemap is a Sitemap generator suitable for applications with greater than 50,000 URLs. It splits large Sitemaps into multiple files, gzips the files to minimize bandwidth usage…



import bigsitemap

options = {
    'gzip': True,
    'ping': True,
    'base_url': '',
    'site_url': '',
    'base_path': '/var/www/cdn/sitemaps'
}

sections = ['/','/boats','/cars','/gadgets','/travel']
places   = ['/parents-house.html','/grocery-store.html']

generator = bigsitemap.Generator(options)

for section in sections:
    pass  # add each section URL to the generator here

for place in places:
    pass  # add each place URL to the generator here

generator.files()  # Returns ['sitemap.xml.gz','sections.gz','places.gz']

If your sitemaps grow beyond 50,000 URLs, the sitemap files will be partitioned into multiple files (places_1.xml.gz, places_2.xml.gz, …).
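The partitioning described above can be illustrated with a small standalone sketch. It does not use bigsitemap itself: the helper names partition and partitioned_names are hypothetical, and the sketch assumes that every chunk is numbered from 1 in the places_1.xml.gz style shown above.

```python
from typing import Iterator, List

MAX_URLS_PER_FILE = 50_000  # per-file URL limit mentioned above


def partition(urls: List[str], limit: int = MAX_URLS_PER_FILE) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `limit` URLs (hypothetical helper)."""
    for start in range(0, len(urls), limit):
        yield urls[start:start + limit]


def partitioned_names(name: str, url_count: int,
                      limit: int = MAX_URLS_PER_FILE) -> List[str]:
    """Return numbered file names for a section holding `url_count` URLs.

    Assumption: chunks are numbered from 1, e.g. places_1.xml.gz.
    """
    parts = -(-url_count // limit)  # ceiling division
    return [f"{name}_{i}.xml.gz" for i in range(1, parts + 1)]


urls = [f"/place-{n}.html" for n in range(120_000)]
chunk_sizes = [len(chunk) for chunk in partition(urls)]
names = partitioned_names("places", len(urls))
```

With 120,000 URLs the sketch produces three chunks (two full ones and a remainder of 20,000), one per numbered file.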

Initialization Options

  • gzip: Compress the sitemap files with gzip? Default: False.
  • ping: Ping Google and Bing when generation finishes? Default: False.
  • base_path: Directory in which to store the sitemap files. Required.
  • site_url: The URL of your website. Required.
  • base_url: If the XML files are served from another host, supply its URL here. Default: site_url.
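To make the defaults concrete, here is a minimal sketch of how such an options dictionary could be resolved. It is not the library's own code: resolve_options is a hypothetical helper, the example.com URL is a placeholder, and treating an empty string as "not supplied" is an assumption.

```python
from typing import Any, Dict

REQUIRED = ("base_path", "site_url")        # options documented as required
DEFAULTS = {"gzip": False, "ping": False}   # documented defaults


def resolve_options(options: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the documented defaults and check the required keys (hypothetical helper)."""
    # Assumption: an empty value counts as missing.
    missing = [key for key in REQUIRED if not options.get(key)]
    if missing:
        raise ValueError("missing required option(s): " + ", ".join(missing))
    resolved = {**DEFAULTS, **options}
    # base_url falls back to site_url when it is not supplied.
    resolved["base_url"] = options.get("base_url") or resolved["site_url"]
    return resolved


resolved = resolve_options({
    "base_path": "/var/www/cdn/sitemaps",
    "site_url": "http://example.com",  # placeholder URL
})
```

Here resolved carries gzip and ping set to False, and base_url equal to site_url.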

Change Frequency, Priority and Last Modified

You can control changefreq, priority and lastmod values for each record individually by passing them as optional arguments when adding URLs.
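For reference, these three values map to optional child elements of a url entry in the sitemap XML. The sketch below builds one such entry with the standard library only. It does not show bigsitemap's own call signature: url_element is a hypothetical helper and the example.com URL and the values passed to it are placeholders.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def url_element(loc: str,
                lastmod: Optional[str] = None,
                changefreq: Optional[str] = None,
                priority: Optional[float] = None) -> ET.Element:
    """Build one <url> entry; optional values are emitted only when given."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if lastmod is not None:
        ET.SubElement(url, "lastmod").text = lastmod
    if changefreq is not None:
        ET.SubElement(url, "changefreq").text = changefreq
    if priority is not None:
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


entry = url_element("http://example.com/boats",  # placeholder URL
                    lastmod="2013-01-15",
                    changefreq="weekly",
                    priority=0.8)
xml_text = ET.tostring(entry, encoding="unicode")
```

A record added without any of the optional values would contain only the loc element.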




TODO

  • Writer class for dependency injection
  • Automated tests


Many thanks to Stateless Systems for releasing the big_sitemap ruby gem.
