Extract metadata from HTML pages using Open Graph metadata, HTML metadata, and a series of fallbacks
HTMLmetadata
Inspired by https://metascraper.js.org
Install
pip install htmlmetadata
Use
You can use it by calling the module directly.
python -m htmlmetadata http://schema.org/docs/about.html
{
"request": {
"url": "http://schema.org/docs/about.html"
},
"summary": {
"description": "Schema.org is a set of extensible schemas that enables webmasters to embed\n structured data on their web pages for use by search engines and other applications.",
"title": "about page - schema.org",
"language": "en"
}
}
Or use it directly in your code.
from htmlmetadata import extract_metadata
data = extract_metadata("http://schema.org/docs/about.html")
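The "series of fallbacks" idea in the description can be sketched with the standard library alone. This is a minimal illustration, not htmlmetadata's actual implementation: the names `MetaCollector`, `first_of`, and `summarize` are hypothetical, and the fallback order (Open Graph first, then plain HTML tags) is an assumption.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> tags, the <title> text, and the <html lang> attribute."""

    def __init__(self):
        super().__init__()
        self.meta = {}          # maps a meta property/name to its content
        self.title = None
        self.lang = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            self.lang = attrs.get("lang")
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            # Open Graph uses "property"; classic HTML metadata uses "name".
            key = attrs.get("property") or attrs.get("name")
            if key and attrs.get("content"):
                # Keep the first value seen for each key.
                self.meta.setdefault(key.lower(), attrs["content"].strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.title = (self.title or "") + data.strip()


def first_of(*candidates):
    """Return the first non-empty candidate, or None if all are empty."""
    return next((c for c in candidates if c), None)


def summarize(html):
    """Build a summary dict, assuming Open Graph wins over plain HTML tags."""
    parser = MetaCollector()
    parser.feed(html)
    meta = parser.meta
    return {
        "title": first_of(meta.get("og:title"), parser.title),
        "description": first_of(meta.get("og:description"), meta.get("description")),
        "language": first_of(meta.get("og:locale"), parser.lang),
    }
```

Calling `summarize(page_source)` on a page with no Open Graph tags falls through to the `<title>` element, the `description` meta tag, and the `lang` attribute, which matches the kind of summary shown in the command-line output above.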
Download files
Filename | Size | File type | Python version |
---|---|---|---|
htmlmetadata-1.1-py2.py3-none-any.whl | 5.4 kB | Wheel | py2.py3 |
htmlmetadata-1.1.zip | 8.4 kB | Source | None |