Quickly extract metadata from URLs
RapidUnfurl
RapidUnfurl is a Python library that quickly fetches and processes metadata, unfurling a URL's contents into a JSON object that other programs can then use to display that data, similar to how link expansion works in apps like Slack.
This library was originally forked from Loftie Ellis' pyunfurl library, which is an awesome project. I just wanted to speed up the process and drop the HTML rendering, which I didn't need.
Features
- Supports all oEmbed providers from https://oembed.com/ and https://noembed.com/ by default.
- Supports the autodiscovery part of the oEmbed spec.
- Supports the Open Graph protocol.
- Supports Twitter Cards.
- Falls back to Meta tags and the site favicon/title if all else fails.
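To make the fallback order above more concrete, here is a minimal sketch of reading Open Graph, Twitter Card, and plain meta tags from a page using only the standard library. This is an illustration, not RapidUnfurl's actual implementation: the `MetaTagCollector` class, the `pick_metadata` function, and the exact precedence of keys are assumptions made for the example.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Hypothetical helper: collects <meta> tags and the <title> text."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # Open Graph tags use property="og:..."; Twitter Cards and
            # plain meta tags usually use name="...".
            key = attrs.get("property") or attrs.get("name")
            content = attrs.get("content")
            if key and content is not None:
                # Keep the first value seen for each key.
                self.meta.setdefault(key.lower(), content)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def pick_metadata(html):
    """Prefer Open Graph, then Twitter Cards, then plain meta tags / <title>."""
    parser = MetaTagCollector()
    parser.feed(html)
    parser.close()
    meta = parser.meta

    def first(*keys):
        # Return the first non-empty value among the given keys.
        for key in keys:
            if meta.get(key):
                return meta[key]
        return None

    return {
        "title": first("og:title", "twitter:title") or parser.title.strip() or None,
        "description": first("og:description", "twitter:description", "description"),
        "image": first("og:image", "twitter:image"),
    }


sample_html = """
<html><head>
<title>Fallback title</title>
<meta property="og:title" content="Open Graph title">
<meta name="twitter:description" content="Card description">
<meta name="description" content="Plain meta description">
</head><body></body></html>
"""

preview = pick_metadata(sample_html)
print(preview)
```

In this sample the Open Graph title wins over the `<title>` element, the Twitter Card description wins over the plain meta description, and `image` stays `None` because no tag supplies one.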
Installation
Use the package manager pip to install rapidunfurl.
pip install rapidunfurl
Usage
import rapidunfurl
rapidunfurl.unfurl('https://davintaddeo.com')
This will return a dict similar to the oEmbed spec:
{
"type": "website",
"url": "https://davintaddeo.com",
"title": "Davin Taddeo | DevOps Advocate",
"site_name": "@tdarwin",
"description": "Homepage of Davin Taddeo, DevOps Advocate, Senior Customer Architect for Chef",
"image": "https://davintaddeo.com/assets/images/round_headshot.png",
"card": "summary",
"favicon": "https://davintaddeo.com/favicon.ico"
}
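As a rough sketch of how another program might consume this dict, the snippet below turns it into a short plain-text link preview. The `format_preview` helper is hypothetical and not part of RapidUnfurl, and it assumes that any key may be missing depending on what the site exposes, so every lookup goes through `dict.get`. The sample dict is copied from the output above so the example runs without network access.

```python
def format_preview(meta):
    """Hypothetical helper: build a plain-text preview from an unfurl-style dict."""
    # Fall back to the URL when the site provides no title.
    title = meta.get("title") or meta.get("url", "")
    lines = [title]
    if meta.get("description"):
        lines.append(meta["description"])
    # Avoid repeating the URL if it is already used as the title.
    if meta.get("url") and meta.get("url") != title:
        lines.append(meta["url"])
    return "\n".join(lines)


# In real use this would come from: meta = rapidunfurl.unfurl("https://davintaddeo.com")
meta = {
    "type": "website",
    "url": "https://davintaddeo.com",
    "title": "Davin Taddeo | DevOps Advocate",
    "site_name": "@tdarwin",
    "description": "Homepage of Davin Taddeo, DevOps Advocate, Senior Customer Architect for Chef",
    "image": "https://davintaddeo.com/assets/images/round_headshot.png",
    "card": "summary",
    "favicon": "https://davintaddeo.com/favicon.ico",
}

text = format_preview(meta)
print(text)
```

Reading keys defensively keeps the preview usable for pages that only yield the favicon/title fallback.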
Contributing
Pull requests are welcome. RapidUnfurl supports some custom integrations for sites that don't return any meta tags. If you want to improve the integration for a specific site, you can look at the hackernews example.
License
File details
Details for the file rapidunfurl-1.0.0.tar.gz.
File metadata
- Download URL: rapidunfurl-1.0.0.tar.gz
- Upload date:
- Size: 16.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.11.0 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8d1057f07284c13dda39350d5f4cbd99916fb19e657ebd7af3a90ed346e6db33
MD5 | 6026888cbd4aab79d4d5dc7a17342e95
BLAKE2b-256 | f5f7c8e241a7074f2b3b9e9dd2d00af65a4562709279ad999cfd1e3b0fc6c41c
File details
Details for the file rapidunfurl-1.0.0-py3-none-any.whl.
File metadata
- Download URL: rapidunfurl-1.0.0-py3-none-any.whl
- Upload date:
- Size: 16.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.11.0 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.10
File hashes
Algorithm | Hash digest
---|---
SHA256 | 125bea705e90c8c6024c0caa44de8cce6e399cea4f5fdb8c2c1b730802f539d8
MD5 | 1948a0d2ad457ff651b7df3f56f21c8c
BLAKE2b-256 | b208c8b4886075f5e889869034d7d29db80731c4798b542ad452da9ba94ef59e