phantom-snap
============
Render HTML to an image using PhantomJS with this library designed to scale for high volume continuous operation.
Features
--------
- Provides full timing control around the rendering process.
- Maintains a long-lived PhantomJS process instead of starting a new one per request, which is what many wrappers do and which is slow.
- Renders content from a URL, or accepts HTML content passed directly to the renderer.
Roadmap
-------
- Decorator to manage rendering under specified timezones.
- Decorator to manage rendering under specified proxies.
Examples
--------
The examples assume you have `PhantomJS <http://phantomjs.org/>`_ installed.
This first example renders a URL and saves the resulting image to ``/tmp/google-render.jpg``.
::

    from phantom_snap.settings import PHANTOMJS
    from phantom_snap.phantom import PhantomJSRenderer
    from phantom_snap.imagetools import save_image

    config = {
        'executable': '/usr/local/bin/phantomjs',
        'args': PHANTOMJS['args'] + ['--disk-cache=false', '--load-images=true'],
        'env': {'TZ': 'America/Los_Angeles'},
        'timeouts': {
            'page_load': 3
        }
    }

    r = PhantomJSRenderer(config)
    url = 'http://www.google.com'

    try:
        page = r.render(url, img_format='JPEG')
        save_image('/tmp/google-render', page)
    finally:
        # Always stop the PhantomJS process, even if rendering fails
        r.shutdown(15)

A sample response from ``r.render(url)`` looks like this:
::

    {
        "status": "success",
        "format": "PNG",
        "url": "http://www.google.com",
        "paint_time": 141,
        "base64": "iVBORw0KGgo <SNIP> RK5CYII=",
        "error": null,
        "load_time": 342
    }

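If you want the raw image bytes rather than a file on disk, the ``base64`` field of the response can be decoded with the standard library. The following is a minimal sketch, not part of phantom-snap: ``decode_render_response`` is a hypothetical helper, and it assumes that any ``status`` other than ``"success"`` means the render failed and that ``error`` then describes the problem. The ``sample`` dict is hand-built so the snippet runs without PhantomJS.

```python
import base64


def decode_render_response(page):
    """Return the raw image bytes from a render response dict.

    Hypothetical helper: assumes the "status", "error", and "base64"
    keys shown in the sample response, and treats every status other
    than "success" as a failure.
    """
    if page.get("status") != "success":
        raise RuntimeError("render failed: %s" % page.get("error"))
    return base64.b64decode(page["base64"])


# Hand-built stand-in for a real response, so this runs without PhantomJS.
sample = {
    "status": "success",
    "format": "PNG",
    "base64": base64.b64encode(b"\x89PNG fake image bytes").decode("ascii"),
    "error": None,
}
image_bytes = decode_render_response(sample)
```

In real use you would pass the value returned by ``r.render(...)`` in place of ``sample``.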
This example shows how to provide HTML content directly to the rendering process, instead of requesting it.
::

    from phantom_snap.settings import PHANTOMJS
    from phantom_snap.phantom import PhantomJSRenderer
    from phantom_snap.imagetools import save_image

    config = {
        'executable': '/usr/local/bin/phantomjs',
        'args': PHANTOMJS['args'] + ['--disk-cache=false', '--load-images=true']
    }

    r = PhantomJSRenderer(config)
    url = 'http://www.a-url.com'
    html = '<html><body>Boo ya!</body></html>'

    try:
        page = r.render(url=url, html=html, img_format='PNG')
        save_image('/tmp/html-render', page)
    finally:
        r.shutdown(15)

To offload PhantomJS rendering to `AWS Lambda <https://aws.amazon.com/lambda/>`_, use the ``LambdaRenderer`` class as follows:
::

    from phantom_snap.lambda_renderer import LambdaRenderer
    from phantom_snap.imagetools import save_image

    config = {
        'url': 'http://url-to-my-lambda-func',
    }

    r = LambdaRenderer(config)
    url = 'http://www.youtube.com'

    page = r.render(url, img_format='JPEG')
    save_image('/tmp/youtube-render', page)
    r.shutdown()

To learn more about offloading renders into AWS Lambda, please see the ``serverless`` folder.
Decorators
----------
**Lifetime**
If you plan on running a ``PhantomJSRenderer`` instance for an extended period of time with high volume, it's recommended that you wrap the instance with a ``Lifetime`` decorator as shown below.
The ``Lifetime`` decorator transparently shuts down the underlying PhantomJS process when the renderer has been idle for too long or has reached its maximum lifetime, releasing any accumulated resources. This matters when PhantomJS is configured to use a memory-based browser cache, because restarting the process keeps the cache from growing too large. After the ``Lifetime`` decorator shuts down the renderer (due to idle time or maximum lifetime), the next render request automatically creates a new PhantomJS process.
::

    from phantom_snap.settings import PHANTOMJS
    from phantom_snap.phantom import PhantomJSRenderer
    from phantom_snap.decorators import Lifetime

    config = {
        'executable': '/usr/local/bin/phantomjs',
        'args': PHANTOMJS['args'] + ['--disk-cache=false', '--load-images=true'],
        'env': {'TZ': 'America/Los_Angeles'},

        # Properties for the Lifetime decorator
        'idle_shutdown_sec': 900,   # 15 minutes: shut down PhantomJS if it has been idle this long
        'max_lifetime_sec': 43200   # 12 hours: restart PhantomJS every 12 hours
    }

    r = Lifetime(PhantomJSRenderer(config))

    try:
        urls = []  # Some endless source of URL targets

        for url in urls:
            page = r.render(url=url, img_format='JPEG')
            # Store the image somewhere
    finally:
        r.shutdown()

You can view the default configuration values in ``phantom_snap/settings.py``.
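To make the idle and maximum-lifetime rules of the ``Lifetime`` decorator concrete, here is a rough sketch of how such a wrapper could be written. It is an illustration only and not phantom-snap's actual implementation: ``LifetimeSketch``, ``FakeRenderer``, and the injectable ``clock`` argument are all hypothetical names. As a simplification, it checks the limits lazily on each ``render`` call instead of in the background, and it assumes the wrapped renderer starts a fresh process on the first ``render`` after a ``shutdown``.

```python
import time


class LifetimeSketch(object):
    """Illustrative wrapper; NOT phantom-snap's actual Lifetime class."""

    def __init__(self, renderer, idle_shutdown_sec, max_lifetime_sec,
                 clock=time.time):
        self._renderer = renderer
        self._idle = idle_shutdown_sec
        self._max = max_lifetime_sec
        self._clock = clock
        self._started = None    # when the current process was first used
        self._last_used = None  # when the most recent render finished

    def _expired(self, now):
        if self._started is None:
            return False
        return (now - self._last_used >= self._idle
                or now - self._started >= self._max)

    def render(self, *args, **kwargs):
        now = self._clock()
        if self._expired(now):
            # Release the old process; the wrapped renderer is assumed
            # to start a new one on its next render() call.
            self._renderer.shutdown()
            self._started = None
        if self._started is None:
            self._started = now
        try:
            return self._renderer.render(*args, **kwargs)
        finally:
            self._last_used = self._clock()

    def shutdown(self, *args, **kwargs):
        self._started = None
        return self._renderer.shutdown(*args, **kwargs)


class FakeRenderer(object):
    """Stand-in that counts shutdowns so the sketch runs without PhantomJS."""

    def __init__(self):
        self.shutdowns = 0

    def render(self, url, **kwargs):
        return {"status": "success", "url": url}

    def shutdown(self, timeout=None):
        self.shutdowns += 1


now = [0.0]  # fake clock, advanced by hand
fake = FakeRenderer()
r = LifetimeSketch(fake, idle_shutdown_sec=900, max_lifetime_sec=43200,
                   clock=lambda: now[0])

r.render("http://example.com/a")
now[0] = 100.0    # within the idle window: the process is reused
r.render("http://example.com/b")
now[0] = 1100.0   # idle for 1000 s, past the 900 s limit: restart first
r.render("http://example.com/c")
```

Passing the clock in as an argument keeps the sketch testable without real waiting; the real decorator is configured through the ``idle_shutdown_sec`` and ``max_lifetime_sec`` keys shown earlier.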