S3 Storage Plugin
The S3 storage plugin allows Indico to store materials and other files on Amazon S3 or an S3-compatible object storage service (Ceph S3, MinIO, etc.) instead of the local file system.
Warning
This plugin has only been tested with Ceph S3 so far. So if you encounter any problems using e.g. the real Amazon S3, please let us know!
It is currently used in production on multiple Indico instances, so we believe it is stable, but please be advised that we do not provide a way to move files back from S3 to local storage (but it would of course be possible to write a script for this).
Changelog
3.3
- Support (and require) Python 3.12
- Fix incorrect download filename formatting when using signed URLs or nginx proxying
3.2.2
- Support Python 3.11
3.2.1
- Stop using deprecated URL utils from werkzeug
3.2
- Update translations
3.1.2
- No technical changes, just fixing a mistake in the README change from 3.1.1
3.1.1
- No technical changes, just adding the missing README to PyPI and updating the nginx config snippet to correctly work with the changes from 3.1 (avoiding an nginx bug)
3.1
- Fix "invalid signature" S3 error in some cases when using
proxy=nginx
for downloads
3.0
- Initial release for Indico 3.0
Configuration
Configuration is done using the STORAGE_BACKENDS entry of indico.conf; add a new key with a name of your choice (e.g. s3) and specify the details of the S3 storage in the value.
For a single bucket, all you need to specify is the bucket name:
STORAGE_BACKENDS = {
# ...
's3': 's3:bucket=indico-test'
}
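
The plugin reads the standard AWS configuration files (see below), so you can sanity-check your credentials and bucket with a few lines of plain boto3 before wiring it into Indico. This snippet is just a diagnostic aid and not part of the plugin; the bucket name matches the example above:

import boto3

# picks up ~/.aws/credentials and ~/.aws/config automatically
s3 = boto3.client('s3')
# raises botocore.exceptions.ClientError if the bucket is missing or inaccessible
s3.head_bucket(Bucket='indico-test')
print('bucket is reachable')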
If you want to dynamically create buckets for each year, month or week, you can do this as well. A task will automatically create each new bucket a while before it becomes active.
STORAGE_BACKENDS = {
# ...
's3': 's3-dynamic:bucket_template=indico-test-<year>,bucket_secret=somethingrandom'
}
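
The plugin combines the expanded template with the bucket_secret so that the resulting names cannot be guessed. The exact naming scheme is internal to the plugin; the sketch below only illustrates the general idea of salting date-based names with a secret (the hashing details and token length are illustrative assumptions, not the plugin's actual implementation):

import hashlib
from datetime import date

def bucket_name(template: str, secret: str, d: date) -> str:
    # expand the supported placeholders from the given date
    name = (template
            .replace('<year>', d.strftime('%Y'))
            .replace('<month>', d.strftime('%m'))
            .replace('<week>', d.strftime('%W')))
    # append a secret-derived token so the name is unguessable
    token = hashlib.sha256(f'{secret}:{name}'.encode()).hexdigest()[:16]
    return f'{name}-{token}'

print(bucket_name('indico-test-<year>', 'somethingrandom', date(2024, 5, 1)))
# -> indico-test-2024-<16 hex characters>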
For authentication and general S3 config (e.g. to use subdomains for bucket names), the preferred way is to use the standard files, i.e. ~/.aws/credentials and ~/.aws/config, but you can also specify all settings in the storage backend entry like this:
STORAGE_BACKENDS = {
# ...
's3': 's3:bucket=my-indico-test-bucket,access_key=12345,secret_key=topsecret'
}
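
If you go with the standard AWS files instead, a minimal setup could look like this (the values are placeholders; the nested s3 section in ~/.aws/config is standard boto3/AWS CLI configuration):

# ~/.aws/credentials
[default]
aws_access_key_id = 12345
aws_secret_access_key = topsecret

# ~/.aws/config
[default]
s3 =
    addressing_style = virtual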
Available config options
Multiple options can be specified by separating them with commas. These options are available:
- host -- the host where S3 is running, in case you use a custom S3-compatible storage
- profile -- the name of a specific S3 profile (used in the ~/.aws/ config files)
- access_key -- the S3 access key; should not be used in favor of ~/.aws/credentials
- secret_key -- the S3 secret key; should not be used in favor of ~/.aws/credentials
- addressing_style -- the S3 addressing style (virtual or path); should not be used in favor of ~/.aws/config
- bucket_policy_file -- the path to a file containing an S3 bucket policy; this only applies to new buckets created by this plugin
- bucket_versioning -- whether to enable S3 versioning on the bucket; this only applies to new buckets created by this plugin
- proxy -- whether to proxy downloads. If set to true, all files will be downloaded to memory and then sent to the client by Indico. This may have performance implications if you have large files. A better option is setting it to nginx, which requires some extra configuration (see below), but lets nginx handle proxying downloads transparently. If you do not use proxying at all, downloading a file redirects the user to a temporary S3 URL valid for a few minutes. Generally this works fine, but it may result in people accidentally copying (and forwarding) temporary links that expire quickly.
- meta -- a custom string that is included in the bucket info API of the plugin. You generally do not need this unless you are using custom scripts accessing that API and want to include some extra data there.
When using the s3 backend (single static bucket), the following extra option is available:
- bucket (required) -- the name of the bucket
When using the s3-dynamic backend, the following extra options are available:
- bucket_template (required) -- a template specifying how the bucket names should be generated. Needs to contain at least one of <year>, <month> or <week>
- bucket_secret (required unless set in aws config) -- a random secret used to make bucket names unguessable (as bucket names need to be globally unique on S3); may also be specified as indico_bucket_secret in ~/.aws/credentials
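
Putting it together, a backend entry combining several of these options might look like this (the hostname is a placeholder for your own S3-compatible endpoint):

STORAGE_BACKENDS = {
    # ...
    's3': 's3:host=s3.example.com,bucket=indico-test,proxy=nginx'
}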
Proxying downloads through nginx
If you want to use the proxy=nginx option to avoid redirecting users to the actual S3 URL for file downloads, without the extra load and memory usage of first downloading a (possibly large) attachment to memory that comes with proxy=on, you need to add the following to the server block in your nginx config that is responsible for Indico.
location ~ ^/\.xsf/s3/(?<download_protocol>https?)/(?<download_host>[^/]+)/(?<download_path>.+)$ {
    # only reachable through internal redirects (e.g. X-Accel-Redirect from Indico)
    internal;
    set $download_url $download_protocol://$download_host/$download_path;
    resolver YOUR_RESOLVER;
    # do not leak Indico credentials or cookies to S3
    proxy_set_header Host $download_host;
    proxy_set_header Authorization '';
    proxy_set_header Cookie '';
    # hide S3-specific headers from the client
    proxy_hide_header X-Amz-Request-Id;
    proxy_hide_header Bucket;
    # stream the response instead of buffering it in a temp file
    proxy_max_temp_file_size 0;
    # follow S3 redirects internally instead of exposing them to the client
    proxy_intercept_errors on;
    error_page 301 302 307 = @s3_redirect;
    proxy_pass $download_url$is_args$args;
}

location @s3_redirect {
    internal;
    resolver YOUR_RESOLVER;
    set $saved_redirect_location '$upstream_http_location';
    proxy_pass $saved_redirect_location;
}
Replace YOUR_RESOLVER
with the hostname or IP address of a nameserver nginx can use to
resolve the S3 hosts. You may find a suitable IP in your /etc/resolv.conf
or by asking
someone from your IT department. If you are running a local caching nameserver, localhost
would work as well.
If you are interested in how this works, check the blog post this config is based on.
Migration of existing data
The plugin comes with a migration tool, accessible through the indico s3 migrate CLI. It can be used without downtime of your service as it consists of two steps: first copying the files, and then updating the references in your database. Please have a look at its --help output if you want to use it; we did not have time to write detailed documentation for it yet.
The step that updates the database can be reversed in case you want to switch back from S3 to local storage for whatever reason, but it will only affect migrated files: any file stored directly on S3 later (and thus not present on the local file system) will not be reverted. You would need to write your own script to download those files from S3.
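
For instance, to see what the tool can do (the individual flags are documented in the help output itself):

indico s3 migrate --help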