A Python module to generate link previews.
SneakPeek
A Python module and a minimal server to generate link previews.
What is supported
- Any page that supports the Open Graph protocol (which most sane websites do)
- Special handling for sites like
Installation
Run the following to install:
pip install sneakpeek
Usage as a Python Module
From a URL
>>> import sneakpeek
>>> from pprint import pprint
>>> link = sneakpeek.SneakPeek("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
>>> link.fetch()
>>> link.is_valid()
True
>>> pprint(link)
{'description': 'The official video for “Never Gonna Give You Up” by Rick '
'AstleyTaken from the album ‘Whenever You Need Somebody’ – '
'deluxe 2CD and digital deluxe out 6th May ...',
'domain': 'www.youtube.com',
'image': 'https://i.ytimg.com/vi/dQw4w9WgXcQ/maxresdefault.jpg',
'image:height': '720',
'image:width': '1280',
'scrape': False,
'site_name': 'YouTube',
'title': 'Rick Astley - Never Gonna Give You Up (Official Music Video)',
'type': 'video.other',
'url': 'https://www.youtube.com/watch?v=dQw4w9WgXcQ',
'video:height': '720',
'video:secure_url': 'https://www.youtube.com/embed/dQw4w9WgXcQ',
'video:tag': 'never gonna give you up karaoke',
'video:type': 'text/html',
'video:url': 'https://www.youtube.com/embed/dQw4w9WgXcQ',
'video:width': '1280'}
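The Open Graph protocol marks four properties as required for every page: `og:title`, `og:type`, `og:image` and `og:url`. A check in the spirit of `is_valid()` can be sketched on a plain dictionary as below. This is an assumption about what validity means here, not the module's actual implementation, and `REQUIRED_KEYS` and `has_required_keys` are hypothetical names.

```python
# Hypothetical helper, not part of sneakpeek's API.
# The four keys mirror the properties the Open Graph protocol lists as required.
REQUIRED_KEYS = ("title", "type", "image", "url")


def has_required_keys(preview):
    """Return True when every required key is present with a non-empty value."""
    return all(preview.get(key) for key in REQUIRED_KEYS)


# Sample data made up for illustration.
complete = {
    "title": "The Rock",
    "type": "movie",
    "image": "https://example.com/rock.jpg",
    "url": "http://www.imdb.com/title/tt0117500/",
}
incomplete = {"title": "The Rock", "type": "movie"}

print(has_required_keys(complete))    # True
print(has_required_keys(incomplete))  # False
```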
>>> link = sneakpeek.SneakPeek(url="https://codingcoffee.dev")
>>> link.fetch()
>>> pprint(link)
{'description': 'A generalist with multi faceted interests and extensive '
'experience with DevOps, System Design and Full Stack '
'Development. I like blogging about things which interest me, '
'have a niche for optimizing and customizing things to the '
'very last detail, this includes my text editor and operating '
'system alike.',
'domain': 'codingcoffee.dev',
'image': 'https://www.gravatar.com/avatar/7ecdc5e1441ecd501faaf42a6ab9d6c0?s=200',
'scrape': False,
'title': 'Ameya Shenoy',
'type': 'website',
'url': 'https://codingcoffee.dev'}
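Since the fetched result prints like a dictionary, turning it into a preview card is mostly string formatting. The sketch below works on a plain dictionary with the same keys; `render_card` is a hypothetical helper, not something the module provides, and the sample description is shortened for illustration.

```python
from html import escape


def render_card(preview):
    """Build a small HTML snippet from a preview-like dictionary (hypothetical helper)."""
    title = escape(preview.get("title") or "")
    description = escape(preview.get("description") or "")
    url = escape(preview.get("url") or "", quote=True)
    image = preview.get("image")

    parts = [f'<a class="preview" href="{url}">']
    if image:
        # Only emit an <img> tag when the preview carries an image URL.
        parts.append(f'<img src="{escape(image, quote=True)}" alt="">')
    parts.append(f"<strong>{title}</strong>")
    if description:
        parts.append(f"<p>{description}</p>")
    parts.append("</a>")
    return "\n".join(parts)


example = {
    "title": "Ameya Shenoy",
    "description": "A generalist with multi faceted interests.",
    "url": "https://codingcoffee.dev",
    "image": None,
}
print(render_card(example))
```

Escaping every value before interpolation matters here because titles and descriptions come from third-party pages.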
Use scrape=True to fetch data by scraping the page instead of relying on Open Graph tags.
>>> link = sneakpeek.SneakPeek(url="https://news.ycombinator.com/item?id=23812063", scrape=True)
>>> link.fetch()
>>> pprint(link)
{'description': '',
'domain': 'news.ycombinator.com',
'image': 'y18.gif',
'scrape': True,
'title': 'WireGuard as VPN Server on Kubernetes with AdBlocking | Hacker News',
'type': 'other',
'url': 'https://news.ycombinator.com/item?id=23812063'}
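The output above is consistent with a fallback that takes the title from the `<title>` element and the image from the first `<img>` on the page. A minimal sketch of that idea using only the standard library follows; it is an assumption about the general approach, not the module's actual scraping code, and `FallbackScraper` and `scrape` are hypothetical names.

```python
from html.parser import HTMLParser


class FallbackScraper(HTMLParser):
    """Hypothetical sketch: collect the <title> text and the first <img> src."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.image = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "img" and self.image is None:
            # Keep only the first image encountered in document order.
            self.image = dict(attrs).get("src")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def scrape(html):
    parser = FallbackScraper()
    parser.feed(html)
    parser.close()
    return {"title": parser.title.strip(), "image": parser.image}


# Sample page made up for illustration.
page = """
<html>
  <head><title>Example | Hacker News</title></head>
  <body><img src="y18.gif" width="18" height="18"></body>
</html>
"""
print(scrape(page))
```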
From HTML
>>> HTML = """
... <html xmlns:og="http://ogp.me/ns">
... <head>
... <title>The Rock (1996)</title>
... <meta property="og:title" content="The Rock" />
... <meta property="og:description" content="The Rock: Directed by Michael Bay. With Sean Connery, Nicolas Cage, Ed Harris, John Spencer. A mild-mannered chemist and an ex-con must lead the counterstrike when a rogue group of military men, led by a renegade general, threaten a nerve gas attack from Alcatraz against San Francisco.">
... <meta property="og:type" content="movie" />
... <meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
... <meta property="og:image" content="https://m.media-amazon.com/images/M/MV5BZDJjOTE0N2EtMmRlZS00NzU0LWE0ZWQtM2Q3MWMxNjcwZjBhXkEyXkFqcGdeQXVyNDk3NzU2MTQ@._V1_FMjpg_UX1000_.jpg">
... </head>
... </html>
... """
>>> movie = sneakpeek.SneakPeek(html=HTML)
>>> movie.is_valid()
True
>>> pprint(movie)
{'description': 'The Rock: Directed by Michael Bay. With Sean Connery, Nicolas '
'Cage, Ed Harris, John Spencer. A mild-mannered chemist and an '
'ex-con must lead the counterstrike when a rogue group of '
'military men, led by a renegade general, threaten a nerve gas '
'attack from Alcatraz against San Francisco.',
'domain': None,
'image': 'https://m.media-amazon.com/images/M/MV5BZDJjOTE0N2EtMmRlZS00NzU0LWE0ZWQtM2Q3MWMxNjcwZjBhXkEyXkFqcGdeQXVyNDk3NzU2MTQ@._V1_FMjpg_UX1000_.jpg',
'scrape': False,
'title': 'The Rock',
'type': 'movie',
'url': 'http://www.imdb.com/title/tt0117500/'}
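The keys in the result correspond to the `og:` properties in the markup with the prefix dropped. That mapping can be sketched with the standard library as below; this is an illustration of the idea rather than the module's own parser, and `OpenGraphParser` and `parse_open_graph` are hypothetical names.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Hypothetical sketch: gather og:* meta properties into a dictionary."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("property") or ""
        content = attributes.get("content")
        if name.startswith("og:") and content is not None:
            # "og:title" becomes "title", "og:image:width" becomes "image:width".
            self.properties[name[len("og:"):]] = content


def parse_open_graph(html):
    parser = OpenGraphParser()
    parser.feed(html)
    parser.close()
    return parser.properties


sample = """
<html>
<head>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
</head>
</html>
"""
print(parse_open_graph(sample))
```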
Usage as a Server
A simple Django server is used to serve the requests. Check out the server folder for more details.
sneakpeek serve
Development
pip install -U poetry
git clone https://github.com/codingcoffee/sneakpeek
cd sneakpeek
poetry install
Running Tests
poetry run pytest
- Tested Websites
TODO
- Twitter (requires a twitter API key)
- Instagram (using instagram-scraper)
- CI/CD for tests
Contribution
Have better suggestions to optimize the server image? Found some typos? Need special handling for a new website? Found a bug? Want to work on a TODO? Go ahead and send in a Pull Request or create an Issue! Contributions of any kind are welcome!
License
The code in this repository has been released under the MIT License
Attributions
- Python opengraph
File details
Details for the file sneakpeek-0.3.0.tar.gz.
File metadata
- Download URL: sneakpeek-0.3.0.tar.gz
- Upload date:
- Size: 6.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.14 CPython/3.10.5 Linux/5.18.14-arch1-1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 48199ea7928d383107c9af4e4a6d7eb6276a064efe7226a8f1836e363a730877 |
| MD5 | 93f6542dadb1f2492527868cff621ec7 |
| BLAKE2b-256 | d86f20d37915b6cafa2a2e28b56915587f48c4aaf99a5d07d10e51260810af26 |
File details
Details for the file sneakpeek-0.3.0-py3-none-any.whl.
File metadata
- Download URL: sneakpeek-0.3.0-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.14 CPython/3.10.5 Linux/5.18.14-arch1-1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5e847caa74d532d775fde375df9563d2309e621c742c74cd9370f4d3a466169e |
| MD5 | 7bcba98c8ef355ba219e601a86eb0bab |
| BLAKE2b-256 | 386227ad132f1dd5c06bd808cb51f587f4a9edd15e69db984ec4cb25a2147f11 |