A microservice to move files from S3 APIs (Swift or Ceph) to other S3 APIs.

s3s3 0.1.10 A microservice to move files from S3 APIs (Swift or Ceph) to other S3 APIs.

To install, use pip install s3s3


s3s3

s3s3 is a microservice to move files from one S3 compliant service to another (Swift, Ceph, AWS).

s3s3 has two services. s3s3.scripts.listen subscribes to a Redis pubsub channel and waits for S3 object names. When notified, it calls s3s3.api.upload and uploads that S3 object from the source to the destination(s). s3s3.scripts.bucket uploads all keys in one S3 bucket to another.

s3s3.scripts.listen is meant to be a real-time daemon.

s3s3.scripts.bucket is meant to run periodically as a cron job. Anything missed by s3s3.scripts.listen will be handled by this service.
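As a rough illustration of how a producer might hand object names to s3s3.scripts.listen, the sketch below publishes a key name to a Redis pubsub channel. The helper notify_backup and the RecordingClient stand-in are hypothetical and not part of s3s3. The client argument is assumed to be any object with a redis-py style publish(channel, message) method, and the default channel name 'backup' is the one the listen client is described as using.

```python
def notify_backup(client, key_name, channel="backup"):
    """Publish an S3 key name so a listening s3s3 daemon can copy it.

    `client` is assumed to expose publish(channel, message), as
    redis-py's redis.Redis does. This helper is hypothetical and is
    not part of s3s3 itself.
    """
    if not key_name:
        raise ValueError("key_name must be a non-empty string")
    # redis-py's publish returns the number of subscribers reached.
    return client.publish(channel, key_name)


class RecordingClient:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self.published = []

    def publish(self, channel, message):
        # Record the call instead of talking to Redis.
        self.published.append((channel, message))
        return 0  # no real subscribers in this stand-in


demo_client = RecordingClient()
notify_backup(demo_client, "path/to/object.dat")
print(demo_client.published)
```

With the redis-py package installed and a server running, a redis.Redis() instance could be passed in place of RecordingClient.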

Summary

s3s3 is required because Ceph and Swift do not implement the full AWS S3 feature set. Many available libraries that work well with AWS S3 do not work with Ceph and Swift.

Some examples: key.size and key.md5 do not work with Ceph S3 without fetching the contents of the key (or s3 object). Multipart uploads are not reliable with Ceph S3. V4 signatures are not supported by Ceph S3.

Configuration

s3s3 must be configured. Configuration can be found in the s3s3.config module. A source section, a destination section and a pubsub section are required. Multiple destination connections are supported but have had minimal testing. To mark a section as a destination connection, its name must start with 'dest'.

Example template: https://github.com/jmatt/s3s3/blob/master/s3s3/s3s3.ini.dist

[source]
     aws_access_key_id = {YOUR_AWS_ACCESS_KEY_ID}
     aws_secret_access_key = {YOUR_AWS_SECRET_ACCESS_KEY}
     bucket_name = {YOUR_S3_BUCKET}
     host = {YOUR_CEPH_S3_ENDPOINT}
     verify_md5 = False # Verify md5 during s3 operations.
     is_secure = True # Optional
     calling_format = OrdinaryCallingFormat # Optional
[destination]
     aws_access_key_id = {YOUR_AWS_ACCESS_KEY_ID}
     aws_secret_access_key = {YOUR_AWS_SECRET_ACCESS_KEY}
     bucket_name = {YOUR_S3_BUCKET}
     verify_md5 = False # Verify md5 during s3 operations.
     is_secure = True # Optional
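The 'dest' prefix rule can be illustrated with the standard library's configparser. This is a minimal sketch and not how s3s3.config is actually implemented. The function split_sections, the section name dest-secondary, the bucket names and the host are all made up for illustration, and inline_comment_prefixes is an assumption used to strip trailing "# ..." comments like those in the template.

```python
import configparser

# Hypothetical configuration text; all values are placeholders.
EXAMPLE_INI = """\
[source]
bucket_name = source-bucket
host = ceph.example.org
verify_md5 = False # Verify md5 during s3 operations.

[destination]
bucket_name = backup-bucket
verify_md5 = False

[dest-secondary]
bucket_name = second-backup-bucket
"""


def split_sections(ini_text):
    """Return (source, destinations) dictionaries parsed from INI text."""
    # Assumption: trailing "# ..." comments should be stripped from values.
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(ini_text)
    source = dict(parser["source"])
    # Every section whose name starts with 'dest' is a destination.
    destinations = {
        name: dict(parser[name])
        for name in parser.sections()
        if name.startswith("dest")
    }
    return source, destinations


source, destinations = split_sections(EXAMPLE_INI)
print(sorted(destinations))
```

Values come back as strings, so a boolean option such as verify_md5 would still need converting, for example with the parser's getboolean method.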

Install

s3s3 requires python3 and redis. It was tested with python3.4 and python3.5, and with redis 3.x.

pip install s3s3

Client

There are two clients, one for each service. s3s3.client.ListenClient listens to the 'backup' redis pubsub channel and calls s3s3.client.on_notify. s3s3.client.BucketClient uses the configuration to provide access to the duplicate_bucket API function.

Command Line

Both clients are available as command line scripts.

s3s3listen --config /path/to/s3s3.ini

This uses the configuration to build source and destination boto connections, connects to redis and starts listening on the backup channel. Any message pushed to that channel is treated as a source s3 key name, and s3s3 attempts to upload that key to the destination connection(s).
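The message handling described above might look roughly like the following. This is a sketch under assumptions and not the code in s3s3.scripts.listen. It assumes messages shaped like the dictionaries that redis-py's PubSub.listen() yields (with 'type', 'channel' and 'data' keys), and the names iter_key_names and _as_text are hypothetical.

```python
def _as_text(value):
    """Decode bytes payloads (redis-py returns bytes by default)."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value


def iter_key_names(messages, channel="backup"):
    """Yield S3 key names from redis-py style pubsub message dicts."""
    for message in messages:
        # Skip subscribe/unsubscribe confirmations; only 'message'
        # entries carry a published payload.
        if message.get("type") != "message":
            continue
        # Ignore traffic from channels other than the one we care about.
        if _as_text(message.get("channel")) != channel:
            continue
        key_name = _as_text(message.get("data"))
        if key_name:
            yield key_name


# Hand-written sample messages standing in for a live subscription.
sample = [
    {"type": "subscribe", "channel": b"backup", "data": 1},
    {"type": "message", "channel": b"backup", "data": b"logs/2016/app.log"},
    {"type": "message", "channel": b"other", "data": b"ignored.txt"},
]
for name in iter_key_names(sample):
    print("would upload:", name)
```

Keeping the filtering separate from the Redis connection makes it possible to exercise the logic without a running server.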

s3s3bucket --config /path/to/s3s3.ini

This will use the configuration to build source and destination boto connections and duplicate the source bucket in the destination bucket.

API

The API can be found in the s3s3.api module.

def create_connection(connection_args):

Creates a boto connection from the connection_args dictionary.

def upload(source_key, dest_key, verify_md5=False):

Upload the source key (S3 object) to the destination key. If verify_md5 is true then verify md5s match.

def duplicate_bucket(source_bucket, dest_bucket, verify_md5=False):

Duplicate the source bucket to the destination bucket. If verify_md5 is true then verify md5s match. If the md5 is not available, compute it and verify it matches.
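To make the "compute it and verify it matches" step concrete, here is a minimal sketch of computing an md5 from fetched contents and comparing two objects. It works on generic file-like objects and is not taken from s3s3.api. The names md5_hexdigest and md5s_match are hypothetical.

```python
import hashlib
import io


def md5_hexdigest(fileobj, chunk_size=1024 * 1024):
    """Return the hex md5 of a binary file-like object, read in chunks."""
    digest = hashlib.md5()
    # Read fixed-size chunks so large objects are never held in memory whole.
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def md5s_match(source_fileobj, dest_fileobj):
    """Return True when both file-like objects have identical md5s."""
    return md5_hexdigest(source_fileobj) == md5_hexdigest(dest_fileobj)


# In-memory buffers stand in for downloaded S3 object contents.
original = io.BytesIO(b"example object contents")
copy = io.BytesIO(b"example object contents")
print(md5s_match(original, copy))
```

Reading in chunks matters here because, as noted in the Summary, the md5 may only be obtainable by fetching the whole object.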

Deploy

s3s3 requires redis, python3 and supervisord.

mkdir -p /opt/env
cd /opt/env
virtualenv -p python3 s3s3
. /opt/env/s3s3/bin/activate
pip install s3s3
echo_s3s3_supervisord_conf > /etc/supervisor/conf.d/s3s3.conf
echo_s3s3_ini_template > /usr/local/etc/s3s3.ini
# Update ini file with your source and destination s3 information.
service supervisor restart # or... start if it's not running.

LICENSE

See the LICENSE file.


For more information, please see: https://github.com/lsst-sqre/s3s3
