Making site preview

[![Build Status](https://travis-ci.org/TigorC/aiounfurl.svg?branch=master)](https://travis-ci.org/TigorC/aiounfurl)
[![Coverage Status](https://coveralls.io/repos/github/TigorC/aiounfurl/badge.svg?branch=master)](https://coveralls.io/github/TigorC/aiounfurl?branch=master)

## aiounfurl
This library extracts meta information from web pages so that you can build site previews.
The library uses four sources of information:

1. [oEmbed](http://oembed.com)
2. [Open Graph](http://ogp.me)
3. [Twitter Cards](https://dev.twitter.com/cards/overview)
4. HTML meta tags
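
To make the last three sources concrete, here is a minimal sketch of how Open Graph, Twitter Card, and plain meta tags can be told apart in a page's HTML. It uses only the standard library's `html.parser` and is not aiounfurl's own implementation; `MetaCollector`, `collect_meta`, and `SAMPLE_HTML` are hypothetical names introduced for illustration. oEmbed is not covered, because it requires a request to the provider's endpoint.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect <meta> tags into Open Graph, Twitter Card, and plain groups."""

    def __init__(self):
        super().__init__()
        self.open_graph = {}
        self.twitter = {}
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'meta':
            return
        attrs = dict(attrs)
        # Open Graph tags use the `property` attribute; most others use `name`.
        key = attrs.get('property') or attrs.get('name')
        content = attrs.get('content')
        if not key or content is None:
            return
        if key.startswith('og:'):
            self.open_graph[key[len('og:'):]] = content
        elif key.startswith('twitter:'):
            self.twitter[key[len('twitter:'):]] = content
        else:
            self.meta[key] = content


def collect_meta(html):
    """Return the three groups of meta information found in `html`."""
    parser = MetaCollector()
    parser.feed(html)
    return {
        'open_graph': parser.open_graph,
        'twitter_cards': parser.twitter,
        'meta': parser.meta,
    }


# Hypothetical page used only for this illustration.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example page">
<meta name="twitter:card" content="summary">
<meta name="description" content="A short description.">
</head><body></body></html>
"""

preview = collect_meta(SAMPLE_HTML)
print(preview)
```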

## Requirements
* Python 3.5
* aiohttp
* beautifulsoup4
* html5lib

## Installation
```bash
pip install aiounfurl
```

## Usage example

To extract all site data:

```python
import asyncio
import aiohttp
from pprint import pprint
from aiounfurl.views import fetch_all


async def get_links_data(links, loop):
    # Share one HTTP session across all requests.
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_all(session, link, loop) for link in links]
        # return_exceptions=True keeps one failed link from aborting the rest.
        # The `loop` argument of gather() was removed in Python 3.10,
        # so it is not passed here.
        results = await asyncio.gather(*tasks, return_exceptions=True)
    return [{'link': link, 'data': data} for link, data in zip(links, results)]


links = [
    'https://habrahabr.ru/post/314606/',
    'https://www.youtube.com/watch?v=9EftQMnuhvU',
    'https://medium.freecodecamp.com/million-requests-per-second-with-python-95c137af319'
]
loop = asyncio.get_event_loop()
result = loop.run_until_complete(get_links_data(links, loop))
loop.close()
pprint(result)
```
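
Because the example passes `return_exceptions=True` to `asyncio.gather`, a link that fails shows up in the result list as an exception object rather than as preview data. The sketch below shows one way to separate the two cases. `split_results` and the sample data are hypothetical, and the `{'title': ...}` shape of a successful result is an assumption made only for illustration.

```python
def split_results(items):
    """Separate successful previews from failed ones."""
    succeeded, failed = [], []
    for item in items:
        # gather(..., return_exceptions=True) returns the exception object
        # in place of the result for every task that raised.
        if isinstance(item['data'], BaseException):
            failed.append({'link': item['link'], 'error': repr(item['data'])})
        else:
            succeeded.append(item)
    return succeeded, failed


# Hypothetical list in the shape returned by get_links_data().
sample = [
    {'link': 'https://example.com/ok', 'data': {'title': 'Example'}},
    {'link': 'https://example.com/broken', 'data': ValueError('bad response')},
]

succeeded, failed = split_results(sample)
print(len(succeeded), 'succeeded,', len(failed), 'failed')
```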

## Server example
The full example is available [here](https://github.com/TigorC/aiounfurl/blob/master/example/srv.py).

Install the packages required to run the example:

```bash
pip install -r example/requirements.txt
```
Run `python srv.py runserver`, then open http://127.0.0.1:8080/

## Running the example in Docker

A Docker image with the example is published on [Docker Hub](http://hub.docker.com/), so the sample can run as a standalone service.

Running in the background:

```bash
docker run --name aiounfurl -p 8080:8080 -d tigorc/aiounfurl
```

Then open the example at [http://127.0.0.1:8080/](http://127.0.0.1:8080/).

To use a list of oEmbed providers, first create a JSON file with the providers at `/path_to_file/providers.json`, then mount it into the container:

```bash
docker run --name aiounfurl -p 8080:8080 -e "OEMBED_PROVIDERS_FILE=/srv/app/providers.json" -v /path_to_file/providers.json:/srv/app/providers.json -d tigorc/aiounfurl
```
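
The format of `providers.json` is not described here. The sketch below assumes the service accepts the provider list format published at oembed.com (a list of providers, each with `endpoints` holding URL `schemes` and an endpoint `url`). Treat that as an assumption and check it against the example's source before relying on it. `write_providers` and the output path are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# One provider entry in the format used by the oembed.com provider list.
# Whether the example service expects exactly this format is an assumption.
PROVIDERS = [
    {
        "provider_name": "YouTube",
        "provider_url": "https://www.youtube.com/",
        "endpoints": [
            {
                "schemes": ["https://*.youtube.com/watch*"],
                "url": "https://www.youtube.com/oembed",
            }
        ],
    }
]


def write_providers(path, providers):
    """Write the provider list to `path` as JSON and return the path."""
    path = Path(path)
    path.write_text(json.dumps(providers, indent=2), encoding="utf-8")
    return path


# A temporary directory is used here; in practice this would be the
# /path_to_file/providers.json that gets mounted into the container.
target = write_providers(Path(tempfile.mkdtemp()) / "providers.json", PROVIDERS)
print("wrote", target)
```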

## Tests
Install the `tox` package and run:

```bash
tox
```
