Quickly extract metadata from URLs

RapidUnfurl

RapidUnfurl is a Python library that quickly pulls and processes metadata to unfurl a URL's contents into a JSON object, which other programs can then use to display that data, similar to how link expansion works in apps like Slack.

This library was originally forked from Loftie Ellis' pyunfurl library, which is an awesome project. I wanted to speed up the process and drop the HTML rendering, which I didn't need.

Features

Installation

Use the package manager pip to install rapidunfurl.

pip install rapidunfurl

Usage

import rapidunfurl
rapidunfurl.unfurl('https://davintaddeo.com') 

This will return a dict similar to the oEmbed spec:

{
  "type": "website",
  "url": "https://davintaddeo.com",
  "title": "Davin Taddeo | DevOps Advocate",
  "site_name": "@tdarwin",
  "description": "Homepage of Davin Taddeo, DevOps Advocate, Senior Customer Architect for Chef",
  "image": "https://davintaddeo.com/assets/images/round_headshot.png",
  "card": "summary",
  "favicon": "https://davintaddeo.com/favicon.ico"
}
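
As a rough sketch of how another program might portray this data, the snippet below turns a dict shaped like the one above into a short plain-text link preview. The helper name format_preview is hypothetical and not part of rapidunfurl. It treats every key as optional, on the assumption that not every page provides all of them. The sample dict is copied from the example output, so no network call is needed.

```python
from typing import Mapping


def format_preview(meta: Mapping[str, str]) -> str:
    """Build a short plain-text link preview from an unfurl-style dict.

    Hypothetical helper, not part of rapidunfurl. Every key is treated as
    optional, since a page may not provide all of them.
    """
    # Fall back to the URL when the page has no title.
    title = meta.get("title") or meta.get("url", "")
    lines = [title]

    description = meta.get("description")
    if description:
        lines.append(description)

    site_name = meta.get("site_name")
    url = meta.get("url")
    if site_name and url:
        lines.append(f"{site_name} ({url})")
    elif url and url != title:
        # Avoid printing the URL twice when it already stands in for the title.
        lines.append(url)

    return "\n".join(lines)


# Sample data copied from the example output above.
sample = {
    "type": "website",
    "url": "https://davintaddeo.com",
    "title": "Davin Taddeo | DevOps Advocate",
    "site_name": "@tdarwin",
    "description": "Homepage of Davin Taddeo, DevOps Advocate, Senior Customer Architect for Chef",
    "image": "https://davintaddeo.com/assets/images/round_headshot.png",
    "card": "summary",
    "favicon": "https://davintaddeo.com/favicon.ico",
}

print(format_preview(sample))
```

In a real program, the dict returned by rapidunfurl.unfurl(...) would take the place of sample.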

Contributing

Pull requests are welcome. RapidUnfurl supports some custom integrations for sites that don't return any meta tags. If you want to improve the integration for a specific site, you can look at the hackernews example.

License

MIT
