Library to generate XML sitemaps for websites and images. Boost SEO by indexing image URLs for better visibility on search engines.
🗺️ image_sitemap
Image & Website Sitemap Generator - SEO Tool for Better Visibility
Sitemap Images is a Python tool that generates a specialized XML sitemap file, allowing you to submit image URLs to search engines like Google, Bing, and Yahoo. This tool helps improve image search visibility, driving more traffic to your website and increasing engagement. To ensure search engines can discover your sitemap, simply add the following line to your robots.txt file:
Sitemap: https://example.com/sitemap-images.xml
By including image links in your sitemap and referencing it in your robots.txt file, you can enhance your website's SEO and make it easier for users to find your content.
The image sitemap standard is described in Google's image sitemaps documentation.
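The robots.txt step described above can be scripted. The following is a minimal sketch using only the Python standard library; the helper name `add_sitemap_line` is hypothetical and is not part of image_sitemap. It assumes you have write access to the robots.txt file that your site serves.

```python
from pathlib import Path


def add_sitemap_line(robots_path: str, sitemap_url: str) -> bool:
    """Append a `Sitemap:` directive to robots.txt unless it is already there.

    Hypothetical helper (not part of image_sitemap).
    Returns True if the file was changed, False otherwise.
    """
    path = Path(robots_path)
    directive = f"Sitemap: {sitemap_url}"
    existing = path.read_text(encoding="utf-8") if path.exists() else ""

    # Skip the write when the exact directive is already present.
    if directive in (line.strip() for line in existing.splitlines()):
        return False

    # Make sure the directive starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    path.write_text(existing + separator + directive + "\n", encoding="utf-8")
    return True
```

Calling `add_sitemap_line("robots.txt", "https://example.com/sitemap-images.xml")` a second time leaves the file unchanged, so the helper is safe to run on every deploy.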
📦 Features
- Supports both website and image sitemap generation
- Easy integration with existing Python projects
- Helps improve visibility in search engine results
- Boosts image search performance
- Subdomain filtering with exclusion support
- Configurable crawling depth and query parameters
✍️ Examples
- Set the website URL and crawling depth, then run the script:
import asyncio

from image_sitemap import Sitemap
from image_sitemap.instruments.config import Config

images_config = Config(
    max_depth=3,
    accept_subdomains=True,
    excluded_subdomains={"blog", "api", "staging"},  # Exclude specific subdomains
    is_query_enabled=False,
    file_name="sitemap_images.xml",
    header={
        "User-Agent": "ImageSitemap Crawler",
        "Accept": "text/html",
    },
)
sitemap_config = Config(
    max_depth=3,
    accept_subdomains=True,
    excluded_subdomains={"blog", "api", "staging"},  # Exclude specific subdomains
    is_query_enabled=False,
    file_name="sitemap.xml",
    header={
        "User-Agent": "ImageSitemap Crawler",
        "Accept": "text/html",
    },
)

# Generate the image sitemap, then the regular page sitemap
asyncio.run(Sitemap(config=images_config).run_images_sitemap(url="https://rucaptcha.com/"))
asyncio.run(Sitemap(config=sitemap_config).run_sitemap(url="https://rucaptcha.com/"))
- Get the image sitemap data in the output file:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://rucaptcha.com/proxy/residential-proxies</loc>
    <image:image>
      <image:loc>https://rucaptcha.com/dist/web/assets/rotating-residential-proxies-NEVfEVLW.svg</image:loc>
    </image:image>
  </url>
</urlset>
Or just a plain sitemap file:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://rucaptcha.com/</loc>
  </url>
  <url>
    <loc>https://rucaptcha.com/h</loc>
  </url>
</urlset>
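To sanity-check a generated image sitemap before submitting it, the file can be read back with the standard library. The sketch below is not part of image_sitemap; the function name `images_by_page` is hypothetical, and the sample document simply mirrors the output shown above.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the sitemap and image-sitemap schemas shown above.
NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://rucaptcha.com/proxy/residential-proxies</loc>
    <image:image>
      <image:loc>https://rucaptcha.com/dist/web/assets/rotating-residential-proxies-NEVfEVLW.svg</image:loc>
    </image:image>
  </url>
</urlset>
"""


def images_by_page(xml_data: bytes) -> dict[str, list[str]]:
    """Map each page URL in an image sitemap to the image URLs listed under it."""
    root = ET.fromstring(xml_data)
    result: dict[str, list[str]] = {}
    for url in root.findall("sm:url", NS):
        page = (url.findtext("sm:loc", default="", namespaces=NS) or "").strip()
        images = [
            loc.text.strip()
            for loc in url.findall("image:image/image:loc", NS)
            if loc.text
        ]
        result[page] = images
    return result


if __name__ == "__main__":
    for page, images in images_by_page(SAMPLE).items():
        print(page, len(images))
```

To check a real file, read it in binary mode (for example `Path("sitemap_images.xml").read_bytes()`) and pass the result to `images_by_page`.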
🔧 Configuration Options
The Config class provides various options to customize sitemap generation:
Subdomain Control
- accept_subdomains (bool): Enable/disable subdomain crawling (default: True)
- excluded_subdomains (Set[str]): Set of subdomain names to exclude from parsing (default: set())
from image_sitemap.instruments.config import Config

# Example: Include all subdomains except blog, api, staging, and dev
config = Config(
    accept_subdomains=True,
    excluded_subdomains={"blog", "api", "staging", "dev"},
)
# This will include:
# - example.com
# - www.example.com
# - shop.example.com
# But exclude:
# - blog.example.com
# - api.example.com
# - staging.example.com
# - dev.example.com
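The include/exclude behaviour listed above can be modelled with a small standalone function. This is only an illustrative sketch under stated assumptions, not the library's actual implementation: the name `is_host_allowed` is hypothetical, the root domain is passed in explicitly, and a host is assumed to be excluded when any of its subdomain labels appears in the exclusion set.

```python
from urllib.parse import urlsplit


def is_host_allowed(url, root_domain, accept_subdomains=True,
                    excluded_subdomains=frozenset()):
    """Decide whether a URL's host should be crawled (hypothetical model)."""
    host = (urlsplit(url).hostname or "").lower()

    # The root domain itself is always in scope.
    if host == root_domain:
        return True

    # Anything that is not a subdomain of the root belongs to another site.
    if not host.endswith("." + root_domain):
        return False

    # Subdomains are only in scope when subdomain crawling is enabled.
    if not accept_subdomains:
        return False

    # Assumption: reject the host if any subdomain label is excluded.
    labels = host[: -len(root_domain) - 1].split(".")
    return not any(label in excluded_subdomains for label in labels)


excluded = {"blog", "api", "staging", "dev"}
for candidate in ("https://example.com/", "https://shop.example.com/",
                  "https://blog.example.com/"):
    print(candidate, is_host_allowed(candidate, "example.com", True, excluded))
```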
Other Options
- max_depth (int): Maximum crawling depth (default: 1)
- is_query_enabled (bool): Include URLs with query parameters (default: True)
- file_name (str): Output sitemap filename (default: "sitemap_images.xml")
- exclude_file_links (bool): Filter out file links from sitemap (default: True)
- header (dict): Custom HTTP headers for requests
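To make the effect of max_depth and is_query_enabled concrete, here is a toy depth-limited crawl over an in-memory link graph. It is a sketch of the general technique (breadth-first traversal with a depth cap), not the library's crawler: the `crawl` function and the `links` dictionary are hypothetical, and it assumes the start page counts as depth 0.

```python
from collections import deque
from urllib.parse import urlsplit


def crawl(start, links, max_depth=1, is_query_enabled=True):
    """Breadth-first traversal of `links`, stopping at `max_depth` hops."""
    seen = {start}
    queue = deque([(start, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # Do not follow links beyond the depth limit.
        for nxt in links.get(url, []):
            # Optionally drop URLs that carry a query string.
            if not is_query_enabled and urlsplit(nxt).query:
                continue
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return visited


# Hypothetical site structure used only for illustration.
links = {
    "https://example.com/": ["https://example.com/a", "https://example.com/a?page=2"],
    "https://example.com/a": ["https://example.com/b"],
}
print(crawl("https://example.com/", links, max_depth=2, is_query_enabled=False))
```

Raising the depth limit widens coverage at the cost of more requests, which is why a small default is a reasonable starting point.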
More usage examples are available in the project's examples file.
File details
Details for the file image_sitemap-2.1.0.tar.gz.
File metadata
- Download URL: image_sitemap-2.1.0.tar.gz
- Upload date:
- Size: 18.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5c268ab9a4f0dc0a1d897918282c1b3b7c2a2fd805b5aefb4829688e891f79dd |
| MD5 | dd2efe6dc6ad95bce7a98fd42331e174 |
| BLAKE2b-256 | 24ccf039a28f9f64a8248c90025a3532e2a1b4b83665f776b9dd3954c8f6bd97 |
File details
Details for the file image_sitemap-2.1.0-py3-none-any.whl.
File metadata
- Download URL: image_sitemap-2.1.0-py3-none-any.whl
- Upload date:
- Size: 17.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b4cdbd9626496c9e8ed07f0a1349b3d8ac33246cfc989978acb3f88807e14254 |
| MD5 | d0bda8cc1cccf889d2bd3f26fb84f2d4 |
| BLAKE2b-256 | 21596ed761d7b77479ed0dc4ac81b36600a3e811385bfcd2701ccdd90f280c2f |