Script/Library to read and parse sitemap.xml data
Site Map Parser
A script and library that reads the URLs in a sitemap, converts them to objects, and can export them as CSV or JSON.
Sitemaps are handled according to the protocol at https://www.sitemaps.org/protocol.html
Installation
pip install site-map-parser
Usage
Script usage
smapper $url > /tmp/data.csv
Logs are written to ~/sitemap_run.log
Arguments
| Argument | Options | Default | Information |
|---|---|---|---|
| -h | N/A | N/A | Outputs argument data |
| url | e.g. http://www.example.com - http://www.example.com/other_sitemap.xml | N/A | Required - sitemap data to retrieve |
| -l, --log | CRITICAL or ERROR or WARNING or INFO or DEBUG | INFO | logs to sitemapper_run.log in install folder |
| -e, --exporter | csv or json | csv | Export format of the data |
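As an illustration of the argument table, here is a minimal sketch of how such a command line could be declared with the standard library's argparse. It is not the project's actual source: `build_parser` is a hypothetical helper, and only the argument names, options, and defaults are taken from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments (not the project's code)."""
    parser = argparse.ArgumentParser(prog="smapper")
    # Required positional argument: the site or sitemap URL to retrieve.
    parser.add_argument("url", help="Required - sitemap data to retrieve")
    # Log level, restricted to the options listed in the table, default INFO.
    parser.add_argument(
        "-l", "--log",
        choices=["CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG"],
        default="INFO",
        help="Log level",
    )
    # Export format, csv by default.
    parser.add_argument(
        "-e", "--exporter",
        choices=["csv", "json"],
        default="csv",
        help="Export format of the data",
    )
    return parser


# Parse a sample command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["http://www.example.com", "-e", "json"])
print(args.url, args.log, args.exporter)
```

Declaring the options with `choices` means an unsupported log level or exporter is rejected before any network request is made.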
Library Usage
from sitemapparser import SiteMapParser
sm = SiteMapParser('http://www.example.com') # reads /sitemap.xml
if sm.has_sitemaps():
    sitemaps = sm.get_sitemaps() # returns iterator of sitemapper.Sitemap instances
else:
    urls = sm.get_urls() # returns iterator of sitemapper.Url instances
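The split between sitemaps and URLs in the example above follows the sitemaps.org protocol, where a document is either a `sitemapindex` (a list of further sitemaps) or a `urlset` (a list of pages). The sketch below shows that distinction using only the standard library. It is not how SiteMapParser is implemented: `classify_sitemap` and the sample documents are hypothetical, and the code assumes namespaced XML as the protocol specifies.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def classify_sitemap(xml_text):
    """Return (root_kind, locations) for a sitemap index or a URL set."""
    root = ET.fromstring(xml_text)
    # ElementTree reports tags as "{namespace}name"; keep only the local name.
    kind = root.tag.rsplit("}", 1)[-1]
    if kind == "sitemapindex":
        path = "sm:sitemap/sm:loc"   # entries point at other sitemaps
    elif kind == "urlset":
        path = "sm:url/sm:loc"       # entries point at pages
    else:
        raise ValueError(f"not a sitemap document: <{kind}>")
    locations = [el.text.strip() for el in root.findall(path, NS) if el.text]
    return kind, locations


# Hypothetical sample documents for illustration only.
INDEX_XML = """<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.example.com/other_sitemap.xml</loc></sitemap>
</sitemapindex>"""

URLSET_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about</loc></url>
</urlset>"""

print(classify_sitemap(INDEX_XML))
print(classify_sitemap(URLSET_XML))
```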
Exporting
Two exporters are available: csv and json
CSV Exporter
from sitemapparser.exporters import CSVExporter
# sm set as per earlier library usage example
csv_exporter = CSVExporter(sm)
if sm.has_sitemaps():
    print(csv_exporter.export_sitemaps())
elif sm.has_urls():
    print(csv_exporter.export_urls())
JSON Exporter
from sitemapparser.exporters import JSONExporter
# sm set as per earlier library usage example
json_exporter = JSONExporter(sm)
if sm.has_sitemaps():
    print(json_exporter.export_sitemaps())
elif sm.has_urls():
    print(json_exporter.export_urls())
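To give a feel for what the two export formats look like, here is a minimal standard-library sketch that serializes a list of URL records as CSV and as JSON. It does not reproduce the output of CSVExporter or JSONExporter: `to_csv`, `to_json`, and the sample rows are hypothetical, and the `loc` and `lastmod` field names are taken from the sitemaps.org protocol rather than from this library.

```python
import csv
import io
import json


def to_csv(rows, fields):
    """Serialize a list of dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_json(rows):
    """Serialize a list of dicts as a JSON array."""
    return json.dumps(rows, indent=2)


# Hypothetical sample data for illustration only.
rows = [
    {"loc": "http://www.example.com/", "lastmod": "2024-01-01"},
    {"loc": "http://www.example.com/about", "lastmod": "2024-02-15"},
]

print(to_csv(rows, ["loc", "lastmod"]))
print(to_json(rows))
```

CSV suits spreadsheets and quick inspection, while JSON keeps the records easy to load back into another program.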
Download files
File details
Details for the file site-map-parser-0.3.9.tar.gz.
File metadata
- Download URL: site-map-parser-0.3.9.tar.gz
- Size: 7.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 306f92df5c4d17e73f4a132ebf4925741695ba3a23139fc2c30d015938d67a95 |
| MD5 | 74015ec33ed309ef4e51e1ef37e8114b |
| BLAKE2b-256 | 64837cf82e87bb0fe2f893086f9ccf16cd041aa5ee331468ac51b6f899a4f903 |
File details
Details for the file site_map_parser-0.3.9-py3-none-any.whl.
File metadata
- Download URL: site_map_parser-0.3.9-py3-none-any.whl
- Size: 11.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.9.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8c6bba0e36189a8e6787ab4d41e769538cc4e94f1e179e09ed6c1e0af5fc3000
|
|
| MD5 |
184ad547e0b7df9a5f36350341c618b8
|
|
| BLAKE2b-256 |
1cc0efef8242d4cab1263aea30e4d0ac985278ae32ee62bb815c6e2c92b0edcf
|
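The SHA256 digests listed for each file can be used to check a download before installing it. The following is a minimal sketch using the standard library's hashlib; `sha256_of` and `matches_published_digest` are hypothetical helper names, and the file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, with the source distribution saved in the current directory, `matches_published_digest("site-map-parser-0.3.9.tar.gz", "306f92df5c4d17e73f4a132ebf4925741695ba3a23139fc2c30d015938d67a95")` should return `True` for an intact download.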