
Allows you to store large files in the cloud

Project description


netsight.cloudstorage

Support for (securely) offloading Plone file data to the cloud.

This package provides:

  • Offloading large files to the cloud

  • Transcoding of video to web-compatible format

  • Doing so in a secure manner that doesn’t bypass Plone’s security model

At the moment this is done using Amazon Web Services (S3 for cloudstorage, Elastic Transcoder for transcoding), but could potentially be expanded to support other cloud storage services.

File data is first stored in Plone and then synced to the cloud. Subsequent requests for the file data are redirected to a unique, auto-expiring cloud URL, which protects the data from unauthorised access.
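For illustration, an auto-expiring URL of this kind can be generated with boto3 roughly as follows (a sketch only; the bucket name and key are placeholders, and the package's own implementation may differ):

import boto3

s3 = boto3.client('s3')

# Generate a signed GET URL that stops working after 60 seconds.
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'netsight-cloudstorage-mybucket',  # placeholder bucket
            'Key': 'my-file-uid'},                       # placeholder key
    ExpiresIn=60,
)
print(url)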

Requirements

Uploads are handled asynchronously by Celery, for which you need to configure a supported broker.

Buildout configuration

You will need to add the following to your buildout:

  • The netsight.cloudstorage egg in your ‘eggs’ list

  • A part to build celery (e.g. using collective.recipe.celery)

  • broker_url and plone_url product-config variables for your Zope instance (see zope-conf-additional below)

Example buildout config

[buildout]
...

[celery]
recipe = collective.recipe.celery
eggs =
     ${instance:eggs}
     netsight.cloudstorage
broker-transport = redis
broker-host = redis://localhost:6379/0
result-backend = redis
result-dburi = redis://localhost:6379/0
imports = netsight.cloudstorage.tasks
celeryd-logfile = ${buildout:directory}/var/log/celeryd.log
celeryd-log-level = info
celeryd-concurrency = 2

[instance]
...
zope-conf-additional =
     <product-config netsight.cloudstorage>
             broker_url ${celery:broker-host}
             plone_url http://localhost:8080
     </product-config>

Please note that plone_url is used by the Celery worker to read from and send events to Plone. If you are using Virtual Hosting, you will need to include your VH config in the variable, e.g.:

plone_url http://localhost:8080/VirtualHostBase/http/www.example.com:80/Plone/VirtualHostRoot/

An example buildout configuration for Redis is provided in case you want to configure it using buildout and run it under supervisor; see the files redis.cfg and redis.conf.tpl for more information.

AWS Configuration

Installing the netsight.cloudstorage add-on in the control panel will give you a ‘CloudStorage Settings’ option. You will need to provide:

  • Your AWS Access Key

  • Your AWS Secret Access Key

  • S3 bucket name: the name of the bucket where files will be uploaded. If it does not exist, it will be created for you when the first file is uploaded (a boto3 sketch of this check follows the list).

  • Minimum file size: any files uploaded above this size will automatically be sent to the cloud. Smaller files can still be uploaded manually.
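As a rough illustration of the create-if-missing behaviour mentioned above, here is a boto3 sketch (the bucket name is a placeholder; the package performs this check itself, so you do not normally need to run it):

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')
bucket = 'netsight-cloudstorage-mybucket'  # placeholder name

try:
    # head_bucket raises ClientError if the bucket is missing (or forbidden)
    s3.head_bucket(Bucket=bucket)
except ClientError:
    # Outside us-east-1 you would also pass CreateBucketConfiguration
    # with a LocationConstraint for your region.
    s3.create_bucket(Bucket=bucket)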

Example AWS Policy

Here is an example policy you can use to grant a specific user access to a specific S3 bucket:

{
  "Version": "2012-10-17",
  "Statement": [
      {
          "Effect": "Allow",
          "Action": "s3:*",
          "Resource": [
              "arn:aws:s3:::netsight-cloudstorage-mybucket",
              "arn:aws:s3:::netsight-cloudstorage-mybucket/*"
          ]
      }
  ]
}

For more details on AWS users and policies, see http://docs.aws.amazon.com/IAM/latest/UserGuide/access.html

How it works

The package registers an event subscriber that watches for new file field uploads. If the size of the file data exceeds the ‘minimum file size’ set above, it registers a celery task that asynchronously uploads the data to the cloud.
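A hypothetical sketch of that flow is shown below. The names (app, upload_to_cloud, MIN_SIZE) and the Archetypes field lookup are illustrative, not this package's actual API; the real handler is registered as a Zope event subscriber and reads the threshold from the control panel.

from celery import Celery

app = Celery('cloudstorage', broker='redis://localhost:6379/0')

MIN_SIZE = 10 * 1024 * 1024  # illustrative 10 MB threshold


@app.task
def upload_to_cloud(path):
    # In the real package the blob data is streamed to the cloud in chunks.
    print('uploading %s' % path)


def queue_cloud_upload(obj, event):
    # Called for new/changed file content; queue an upload if it is big enough.
    field = obj.getField('file')
    if field is not None and field.get_size(obj) > MIN_SIZE:
        upload_to_cloud.delay('/'.join(obj.getPhysicalPath()))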

Once the upload is complete, celery will notify Plone, which generates an email to the content creator.

Once the cloud copy is available, the package patches the ‘download’ methods so that any request for the file data is redirected to the cloud copy. Each request generates an auto-expiring one-time URL, ensuring the security of the cloud data.

Transcoding

Files with a ‘video’ mimetype are also sent through a transcoding pipeline if this option is enabled in the control panel.

This transcoded version is stored separately, and must be manually requested by passing ‘transcoded=true’ on the file download request e.g.

http://myplonesite/folder/myfile/at_download/file?transcoded=true

Files are currently transcoded using the ‘Generic 480p 16:9’ preset (1351620000001-000020)

To enable transcoding, you first need to create a dedicated S3 bucket for the transcoded files. This bucket must have the same name as the one used to store the files, with a “-transcoded” suffix. For example, if your S3 bucket is called “netsight-cloudstorage-plone-storage”, you need to create a new bucket called “netsight-cloudstorage-plone-storage-transcoded”.

Then you need to create a transcoding pipeline. To do that, log in to your AWS account, go to Application Services -> Elastic Transcoder and create a new pipeline. Choose a name for the pipeline (you will have to set this name in the Control Panel), set the default S3 bucket (“netsight-cloudstorage-plone-storage”) as the input bucket, and set the new one (“netsight-cloudstorage-plone-storage-transcoded”) as the output bucket both for files and playlists, and also for thumbnails.
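If you prefer to script this step, the same pipeline can be created with boto3 (a sketch only; the pipeline name, region and IAM role ARN are placeholders you must replace with your own values):

import boto3

transcoder = boto3.client('elastictranscoder', region_name='eu-west-1')  # pick your region

pipeline = transcoder.create_pipeline(
    Name='plone-transcoding',  # set this same name in the control panel
    InputBucket='netsight-cloudstorage-plone-storage',
    # One output bucket is used for files, playlists and thumbnails:
    OutputBucket='netsight-cloudstorage-plone-storage-transcoded',
    Role='arn:aws:iam::123456789012:role/Elastic_Transcoder_Default_Role',  # placeholder ARN
)
print(pipeline['Pipeline']['Id'])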

TODO

  • Support for other transcoding presets

  • Support other cloud backends

Contributors

  • Ben Cole (Architecture and initial implementation)

  • Matthew Sital-Singh (Implementation and documentation)

  • Mikel Larreategi (Improved Dexterity support and optional transcoding)

Changelog

1.8.1 (2016-05-06)

  • Remove dependency on plone.namedfile [mattss]

1.8 (2016-02-09)

  • Provide an example of how to install and configure a redis server with buildout [erral]

  • Add a control panel option to disable transcoding [erral]

  • Better support of dexterity content-types using plone.namedfile. Now dexterity types’ blobs are uploaded automatically to cloud storage [erral]

  • Allow generating differing expiry URLs [benc]

  • Remove files from cloud when removed from Plone [mattss]

1.7.1 (2014-12-11)

  • Fixed issue with a log line [benc]

1.7 (2014-12-09)

  • Handling of content with multiple fields where at least one is below file size threshold [benc]

1.6.9 (2014-12-09)

  • Added more verbose logging throughout [benc]

1.6.8 (2014-12-09)

  • Added more verbose error logging to callback task [benc]

  • Added more logging to callback view [benc]

  • Updated requests required version [benc]

1.6.7 (2014-12-08)

  • Added more logging to upload_callback to aid debugging [benc]

1.6.6 (2014-11-27)

  • Removed bucket creation in transcoding - no longer needed as not creating pipeline [benc]

  • Fixed email notifications configuration [benc]

1.6.5 (2014-11-27)

  • Removed pipeline creation [benc]

  • Made pipeline name optional in control panel [benc]

1.6.1 (2014-11-21)

  • Added workaround for “connection reset by peer” [benc]

1.6 (2014-11-17)

  • Added ability to disable email notifications [benc]

1.5 (2014-11-06)

  • Added transcoding for video files [benc]

  • Added customisable pipeline name [benc]

  • Added fleshed out README [mattss]

  • Added travis config [mattss]

1.4 (2014-10-23)

  • AWS transcoding support! [benc]

  • Improved support for virtual hosts [benc, mattss]

1.3 (2014-10-22)

  • Half-baked release [names removed to protect the innocent]

1.2 (2014-09-26)

  • General help text updates [mattss]

  • Clear cloud storage setting when re-queued [mattss]

1.1 (2014-09-25)

  • Switch to chunked uploads [benc]

  • Fix bug with download patch [mattss]

  • Add correct filename and mimetype to url generator [mattss]

  • Add manual upload trigger view [benc]

1.0 (2014-09-23)

  • Initial release [benc]


