A sitemap generator suitable for applications with greater than 50,000 URLs.

Project description

This module is based on the big_sitemap Ruby gem.

From the gem description:

BigSitemap is a Sitemap generator suitable for applications with greater than 50,000 URLs. It splits large Sitemaps into multiple files, gzips the files to minimize bandwidth usage…

Usage

Example:

from datetime import datetime

import bigsitemap

options = {
    'gzip': True,
    'ping': True,
    'base_url': 'http://cdn.mywebsite.com/sitemaps/',
    'site_url': 'http://www.mywebsite.com/',
    'base_path': '/var/www/cdn/sitemaps',
}

sections = ['/', '/boats', '/cars', '/gadgets', '/travel']
places = ['/parents-house.html', '/grocery-store.html']

generator = bigsitemap.Generator(options)

# The first argument is the group name, the second is the URL path.
for section in sections:
    generator.add_url('sections', section, {'last_modified': datetime.now(), 'change_frequency': 'daily', 'priority': 0.6})

for place in places:
    generator.add_url('places', place, {'last_modified': datetime.now(), 'change_frequency': 'daily', 'priority': 0.6})

# Finish generation; with 'ping' enabled this also pings Google and Bing.
generator.finish()
generator.files()  # Returns ['sitemap.xml.gz','sections.gz','places.gz']

If a sitemap grows beyond 50,000 URLs, it is split into multiple numbered files (places_1.xml.gz, places_2.xml.gz, …).
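
The splitting described above can be sketched as follows. This is an illustration only, not the module's actual code: the partition function, its parameters, and the choice to always number files from 1 are assumptions made for the example.

```python
from typing import Iterator, List, Tuple

MAX_URLS_PER_FILE = 50_000  # per-file URL limit of the sitemap protocol


def partition(name: str, urls: List[str], gzip: bool = True,
              limit: int = MAX_URLS_PER_FILE) -> Iterator[Tuple[str, List[str]]]:
    """Yield (filename, chunk) pairs; hypothetical helper, numbering from 1."""
    suffix = ".xml.gz" if gzip else ".xml"
    for index, start in enumerate(range(0, len(urls), limit), start=1):
        # Each chunk holds at most `limit` URLs; the last one may be shorter.
        yield f"{name}_{index}{suffix}", urls[start:start + limit]


urls = [f"/place-{i}.html" for i in range(120_000)]
parts = list(partition("places", urls))
print([(filename, len(chunk)) for filename, chunk in parts])
# [('places_1.xml.gz', 50000), ('places_2.xml.gz', 50000), ('places_3.xml.gz', 20000)]
```

Unlike this sketch, the module may keep an unnumbered file name when a group stays under the limit.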

Initialization Options

  • gzip: Compress the sitemap files with gzip? Default False.
  • ping: Ping Google and Bing on finish? Default False.
  • base_path: Directory where the sitemap files are stored. Required.
  • site_url: The URL of your website. Required.
  • base_url: The URL the sitemap files are served from, if they are stored on another host. Default site_url.
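
To make the defaults concrete, here is a minimal sketch of how the options listed above could be resolved. The normalize_options helper is hypothetical and is not part of bigsitemap; it only mirrors the documented defaults and required keys.

```python
from typing import Any, Dict

REQUIRED = ("base_path", "site_url")


def normalize_options(options: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: check required keys and fill documented defaults."""
    missing = [key for key in REQUIRED if not options.get(key)]
    if missing:
        raise ValueError("missing required option(s): " + ", ".join(missing))
    normalized = {"gzip": False, "ping": False}
    normalized.update(options)
    # base_url falls back to site_url when the files live on the same host.
    normalized.setdefault("base_url", normalized["site_url"])
    return normalized


resolved = normalize_options({
    "site_url": "http://www.mywebsite.com/",
    "base_path": "/var/www/cdn/sitemaps",
})
print(resolved)
```

Only site_url and base_path are supplied here, so gzip and ping resolve to False and base_url resolves to the site URL.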

Change Frequency, Priority and Last Modified

You can control changefreq, priority and lastmod values for each record individually by passing them as optional arguments when adding URLs:

generator.add_url('sections', section, {
    'last_modified': datetime.now(),
    'change_frequency': 'daily',
    'priority': 0.6,
})
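
For reference, these three values correspond to the lastmod, changefreq and priority elements of the sitemap protocol. The sketch below renders a single url entry with the standard library. The url_entry function and its key-to-tag mapping are assumptions for illustration, and the module's own output may be formatted differently.

```python
from datetime import datetime
from urllib.parse import urljoin
from xml.etree import ElementTree as ET


def url_entry(site_url: str, path: str, meta: dict) -> str:
    """Render one <url> element; the key-to-tag mapping is an assumption."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = urljoin(site_url, path)
    if "last_modified" in meta:
        # The protocol expects W3C Datetime; a plain date is a valid form.
        ET.SubElement(url, "lastmod").text = meta["last_modified"].strftime("%Y-%m-%d")
    if "change_frequency" in meta:
        ET.SubElement(url, "changefreq").text = meta["change_frequency"]
    if "priority" in meta:
        ET.SubElement(url, "priority").text = f"{meta['priority']:.1f}"
    return ET.tostring(url, encoding="unicode")


entry = url_entry("http://www.mywebsite.com/", "/boats", {
    "last_modified": datetime(2024, 1, 15),
    "change_frequency": "daily",
    "priority": 0.6,
})
print(entry)
# <url><loc>http://www.mywebsite.com/boats</loc><lastmod>2024-01-15</lastmod><changefreq>daily</changefreq><priority>0.6</priority></url>
```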

TODO

  • Writer class for dependency injection
  • Automated tests

Credits

Many thanks to Stateless Systems (statelesssystems.com) for releasing the big_sitemap Ruby gem.
