
Library to generate XML sitemaps for websites and images. Boost SEO by indexing image URLs for better visibility on search engines.


🗺️ image_sitemap



Image & Website Sitemap Generator - SEO Tool for Better Visibility

image_sitemap is a Python tool that generates specialized XML sitemap files, allowing you to submit image URLs to search engines such as Google, Bing, and Yahoo. Indexing image URLs improves image-search visibility, which can drive more traffic to your website and increase engagement. To ensure search engines can discover your sitemap, add the following line to your robots.txt file:

Sitemap: https://example.com/sitemap-images.xml

By including image links in your sitemap and referencing it in your robots.txt file, you can enhance your website's SEO and make it easier for users to find your content.
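To check programmatically that the Sitemap line is picked up, the standard library's urllib.robotparser can parse a robots.txt for you; this is a minimal sketch, independent of this package:

```python
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Allow: /

Sitemap: https://example.com/sitemap-images.xml
"""

# Parse the robots.txt text and list every declared sitemap URL.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.site_maps())  # ['https://example.com/sitemap-images.xml']
```

`site_maps()` (Python 3.8+) returns every `Sitemap:` entry, so you can confirm your image sitemap is referenced before submitting the site to a search console.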

The image sitemap standard is described in Google's image sitemaps documentation.

📦 Features

  • Supports both website and image sitemap generation
  • Easy integration with existing Python projects
  • Helps improve visibility in search engine results
  • Boosts image search performance
  • Subdomain filtering with exclusion support
  • Configurable crawling depth and query parameters

✍️ Examples

  1. Configure the start URL and crawling depth, then run the script:
    import asyncio
    
    from image_sitemap import Sitemap
    from image_sitemap.instruments.config import Config
      
    images_config = Config(
        max_depth=3,
        accept_subdomains=True,
        excluded_subdomains={"blog", "api", "staging"},  # Exclude specific subdomains
        is_query_enabled=False,
        file_name="sitemap_images.xml",
        header={
            "User-Agent": "ImageSitemap Crawler",
            "Accept": "text/html",
        },
    )
    sitemap_config = Config(
        max_depth=3,
        accept_subdomains=True,
        excluded_subdomains={"blog", "api", "staging"},  # Exclude specific subdomains
        is_query_enabled=False,
        file_name="sitemap.xml",
        header={
            "User-Agent": "ImageSitemap Crawler",
            "Accept": "text/html",
        },
    )
    
    asyncio.run(Sitemap(config=images_config).run_images_sitemap(url="https://rucaptcha.com/"))
    asyncio.run(Sitemap(config=sitemap_config).run_sitemap(url="https://rucaptcha.com/"))
    
  2. Inspect the generated image sitemap file:
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset
        xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
        <url>
            <loc>https://rucaptcha.com/proxy/residential-proxies</loc>
            <image:image>
                <image:loc>https://rucaptcha.com/dist/web/assets/rotating-residential-proxies-NEVfEVLW.svg</image:loc>
            </image:image>
        </url>
    </urlset>
    
    Or the plain website sitemap file:
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset
       xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
       <url>
           <loc>https://rucaptcha.com/</loc>
       </url>
       <url>
           <loc>https://rucaptcha.com/h</loc>
       </url>
    </urlset>
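For reference, a file in exactly this shape can be assembled with the standard library alone; the sketch below is independent of how image_sitemap writes its output, and the example URLs are placeholders:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Register prefixes so the serializer emits the familiar xmlns declarations.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)

urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = "https://example.com/page"
image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = "https://example.com/img/logo.svg"

# Serialize with an XML declaration, matching the files shown above.
xml_bytes = ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

ElementTree puts all namespace declarations on the root `urlset` element, which is exactly what the image sitemap schema expects.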
    

🔧 Configuration Options

The Config class provides various options to customize sitemap generation:

Subdomain Control

  • accept_subdomains (bool): Enable/disable subdomain crawling (default: True)
  • excluded_subdomains (Set[str]): Set of subdomain names to exclude from parsing (default: set())
# Example: Include all subdomains except blog and api
config = Config(
    accept_subdomains=True,
    excluded_subdomains={"blog", "api", "staging", "dev"}
)

# This will include:
# - example.com
# - www.example.com  
# - shop.example.com
# But exclude:
# - blog.example.com
# - api.example.com
# - staging.example.com
# - dev.example.com
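The rules above can be approximated with a short standalone function; `is_allowed` and its logic are an illustration, not the library's actual implementation:

```python
from urllib.parse import urlsplit

def is_allowed(url: str, root_domain: str,
               accept_subdomains: bool = True,
               excluded_subdomains: frozenset = frozenset()) -> bool:
    """Decide whether a URL's host should be crawled under the rules above."""
    host = urlsplit(url).hostname or ""
    if host == root_domain:
        return True                       # the root domain itself
    if not host.endswith("." + root_domain):
        return False                      # an unrelated domain
    if not accept_subdomains:
        return False
    subdomain = host[: -(len(root_domain) + 1)]  # "blog" from blog.example.com
    return subdomain not in excluded_subdomains

excluded = frozenset({"blog", "api", "staging", "dev"})
print(is_allowed("https://shop.example.com/cart", "example.com", True, excluded))  # True
print(is_allowed("https://blog.example.com/post", "example.com", True, excluded))  # False
```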

Other Options

  • max_depth (int): Maximum crawling depth (default: 1)
  • is_query_enabled (bool): Include URLs with query parameters (default: True)
  • file_name (str): Output sitemap filename (default: "sitemap_images.xml")
  • exclude_file_links (bool): Filter out file links from sitemap (default: True)
  • header (dict): Custom HTTP headers for requests
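How `is_query_enabled` and `exclude_file_links` might combine during filtering, as a standalone sketch (the parameter names mirror the Config fields, but the logic and extension list are assumptions, not the library's code):

```python
from urllib.parse import urlsplit

# Hypothetical set of extensions treated as "file links".
FILE_EXTENSIONS = {".pdf", ".zip", ".jpg", ".png", ".svg", ".css", ".js"}

def keep_url(url: str, is_query_enabled: bool = True,
             exclude_file_links: bool = True) -> bool:
    """Return True if the URL should appear in the sitemap."""
    parts = urlsplit(url)
    if not is_query_enabled and parts.query:
        return False          # drop URLs like /search?page=2
    if exclude_file_links and any(parts.path.lower().endswith(ext)
                                  for ext in FILE_EXTENSIONS):
        return False          # drop direct links to files
    return True

print(keep_url("https://example.com/about"))                             # True
print(keep_url("https://example.com/a?page=2", is_query_enabled=False))  # False
print(keep_url("https://example.com/brochure.pdf"))                      # False
```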

More usage examples are available in the repository's examples file.

