s3tk
A security toolkit for Amazon S3
Another day, another leaky Amazon S3 bucket
— The Register, 12 Jul 2017
Don’t be the... next... big... data... leak
:tangerine: Battle-tested at Instacart
Installation
Run:
pip install s3tk
You can use the AWS CLI or AWS Vault to set up your AWS credentials:
pip install awscli
aws configure
See IAM policies needed for each command.
Commands
Scan
Scan your buckets for:
- ACL open to public
- policy open to public
- public access blocked
- logging enabled
- versioning enabled
- default encryption enabled
s3tk scan
Only run on specific buckets
s3tk scan my-bucket my-bucket-2
Also works with wildcards
s3tk scan "my-bucket*"
Confirm correct log bucket(s) and prefix
s3tk scan --log-bucket my-s3-logs --log-bucket other-region-logs --log-prefix "{bucket}/"
Check CloudTrail object-level logging [experimental]
s3tk scan --object-level-logging
Skip logging, versioning, or default encryption
s3tk scan --skip-logging --skip-versioning --skip-default-encryption
Get email notifications of failures (via SNS)
s3tk scan --sns-topic arn:aws:sns:...
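Each check corresponds to a standard S3 API call. A minimal boto3 sketch of a few of the checks (the bucket name is a placeholder, and s3tk's own implementation covers more cases):

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def scan_bucket(bucket):
    # versioning: "Status" is absent until versioning has been configured
    versioning = s3.get_bucket_versioning(Bucket=bucket)
    print("versioning enabled:", versioning.get("Status") == "Enabled")

    # public access block: the call fails if no configuration exists
    try:
        config = s3.get_public_access_block(Bucket=bucket)["PublicAccessBlockConfiguration"]
        print("public access blocked:", all(config.values()))
    except ClientError:
        print("public access blocked: False")

    # default encryption: the call fails if no configuration exists
    try:
        s3.get_bucket_encryption(Bucket=bucket)
        print("default encryption enabled: True")
    except ClientError:
        print("default encryption enabled: False")

scan_bucket("my-bucket")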
List Policy
List bucket policies
s3tk list-policy
Only run on specific buckets
s3tk list-policy my-bucket my-bucket-2
Show named statements
s3tk list-policy --named
Set Policy
Note: This replaces the previous policy
Only private uploads
s3tk set-policy my-bucket --no-object-acl
Delete Policy
Delete policy
s3tk delete-policy my-bucket
Block Public Access
Block public access on specific buckets
s3tk block-public-access my-bucket my-bucket-2
Use the --dry-run flag to test
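Under the hood this comes down to a single PutPublicAccessBlock call per bucket; a rough boto3 equivalent (the bucket name is a placeholder):

import boto3

s3 = boto3.client("s3")

# enable all four public access block settings for the bucket
s3.put_public_access_block(
    Bucket="my-bucket",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)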
Enable Logging
Enable logging on all buckets
s3tk enable-logging --log-bucket my-s3-logs
Only on specific buckets
s3tk enable-logging my-bucket my-bucket-2 --log-bucket my-s3-logs
Set log prefix ({bucket}/ by default)
s3tk enable-logging --log-bucket my-s3-logs --log-prefix "logs/{bucket}/"
Use the --dry-run flag to test
A few notes about logging:
- buckets with logging already enabled are not updated at all
- the log bucket must be in the same region as the source bucket - run this command multiple times for different regions
- it can take over an hour for logs to show up
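For reference, enabling logging is a single PutBucketLogging call per bucket; a rough boto3 equivalent (bucket names are placeholders, and the log bucket must already accept log delivery):

import boto3

s3 = boto3.client("s3")

# point server access logs at the log bucket with a per-bucket prefix
s3.put_bucket_logging(
    Bucket="my-bucket",
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-s3-logs",
            "TargetPrefix": "my-bucket/",
        }
    },
)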
Enable Versioning
Enable versioning on all buckets
s3tk enable-versioning
Only on specific buckets
s3tk enable-versioning my-bucket my-bucket-2
Use the --dry-run flag to test
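A rough boto3 equivalent of what this toggles (the bucket name is a placeholder):

import boto3

# versioning can be enabled or suspended, but never fully removed once set
boto3.client("s3").put_bucket_versioning(
    Bucket="my-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)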
Enable Default Encryption
Enable default encryption on all buckets
s3tk enable-default-encryption
Only on specific buckets
s3tk enable-default-encryption my-bucket my-bucket-2
This does not encrypt existing objects - use the encrypt command for this
Use the --dry-run flag to test
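A rough boto3 equivalent using S3-managed keys (SSE-S3); swap in a KMS configuration if that's what you use (the bucket name is a placeholder):

import boto3

# default encryption applies to new objects only - see the note above
boto3.client("s3").put_bucket_encryption(
    Bucket="my-bucket",
    ServerSideEncryptionConfiguration={
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
        ]
    },
)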
Scan Object ACL
Scan ACL on all objects in a bucket
s3tk scan-object-acl my-bucket
Only certain objects
s3tk scan-object-acl my-bucket --only "*.pdf"
Except certain objects
s3tk scan-object-acl my-bucket --except "*.jpg"
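The scan works by reading each object's ACL and looking for grants to the public groups; a simplified boto3 sketch (the bucket name is a placeholder):

import boto3

s3 = boto3.client("s3")
public_groups = (
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
)

# list every object and flag ACL grants to public groups
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-bucket"):
    for obj in page.get("Contents", []):
        acl = s3.get_object_acl(Bucket="my-bucket", Key=obj["Key"])
        for grant in acl["Grants"]:
            if grant["Grantee"].get("URI") in public_groups:
                print("open:", obj["Key"])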
Reset Object ACL
Reset ACL on all objects in a bucket
s3tk reset-object-acl my-bucket
This makes all objects private. See bucket policies for how to enforce going forward.
Use the --dry-run flag to test
Specify certain objects the same way as scan-object-acl
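Resetting comes down to a PutObjectAcl call per object; a simplified boto3 sketch (the bucket name is a placeholder):

import boto3

s3 = boto3.client("s3")

# overwrite each object's ACL with the private canned ACL
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-bucket"):
    for obj in page.get("Contents", []):
        s3.put_object_acl(Bucket="my-bucket", Key=obj["Key"], ACL="private")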
Encrypt
Encrypt all objects in a bucket with server-side encryption
s3tk encrypt my-bucket
Uses S3-managed keys by default. For KMS-managed keys, use:
s3tk encrypt my-bucket --kms-key-id arn:aws:kms:...
For customer-provided keys, use:
s3tk encrypt my-bucket --customer-key secret-key
Use the --dry-run flag to test
Specify certain objects the same way as scan-object-acl
Note: Objects will lose any custom ACL
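Server-side encrypting an existing object means rewriting it, which is why custom ACLs are lost; a simplified boto3 sketch of the in-place copy (bucket and key are placeholders):

import boto3

s3 = boto3.client("s3")

# copying an object onto itself with new encryption settings rewrites it encrypted
s3.copy_object(
    Bucket="my-bucket",
    Key="some-key",
    CopySource={"Bucket": "my-bucket", "Key": "some-key"},
    ServerSideEncryption="AES256",
)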
Delete Unencrypted Versions
Delete all unencrypted versions of objects in a bucket
s3tk delete-unencrypted-versions my-bucket
For safety, this will not delete any current versions of objects
Use the --dry-run flag to test
Specify certain objects the same way as scan-object-acl
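A simplified boto3 sketch of the idea: walk old versions, check each for encryption, and delete the unencrypted ones, never touching current versions (the bucket name is a placeholder):

import boto3

s3 = boto3.client("s3")

paginator = s3.get_paginator("list_object_versions")
for page in paginator.paginate(Bucket="my-bucket"):
    for version in page.get("Versions", []):
        if version["IsLatest"]:
            continue  # for safety, keep current versions
        head = s3.head_object(
            Bucket="my-bucket", Key=version["Key"], VersionId=version["VersionId"]
        )
        # head_object only reports ServerSideEncryption when the version is encrypted
        if "ServerSideEncryption" not in head:
            s3.delete_object(
                Bucket="my-bucket", Key=version["Key"], VersionId=version["VersionId"]
            )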
Scan DNS
Scan Route 53 for buckets to make sure you own them
s3tk scan-dns
Otherwise, you may be susceptible to subdomain takeover
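The check compares Route 53 records that point at S3 with the buckets you actually own; a simplified boto3 sketch (pagination and alias records are glossed over):

import boto3

s3 = boto3.client("s3")
route53 = boto3.client("route53")

owned = {bucket["Name"] for bucket in s3.list_buckets()["Buckets"]}

for zone in route53.list_hosted_zones()["HostedZones"]:
    records = route53.list_resource_record_sets(HostedZoneId=zone["Id"])
    for record in records["ResourceRecordSets"]:
        values = [r["Value"] for r in record.get("ResourceRecords", [])]
        # a record pointing at an S3 website endpoint implies a bucket named after the record
        if any("s3-website" in value for value in values):
            bucket = record["Name"].rstrip(".")
            if bucket not in owned:
                print("possible takeover:", record["Name"])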
Credentials
Credentials can be specified in ~/.aws/credentials or with environment variables. See this guide for an explanation of environment variables.
You can specify a profile to use with:
AWS_PROFILE=your-profile s3tk
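If you script against S3 in Python instead, the equivalent of AWS_PROFILE is a named boto3 session (the profile name is a placeholder):

import boto3

# equivalent of AWS_PROFILE=your-profile for scripted use
session = boto3.Session(profile_name="your-profile")
s3 = session.client("s3")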
IAM Policies
Here are the permissions needed for each command. Only include statements you need.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Scan",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:GetBucketAcl",
        "s3:GetBucketPolicy",
        "s3:GetBucketPublicAccessBlock",
        "s3:GetBucketLogging",
        "s3:GetBucketVersioning",
        "s3:GetEncryptionConfiguration"
      ],
      "Resource": "*"
    },
    {
      "Sid": "ScanObjectLevelLogging",
      "Effect": "Allow",
      "Action": [
        "cloudtrail:ListTrails",
        "cloudtrail:GetTrail",
        "cloudtrail:GetEventSelectors",
        "s3:GetBucketLocation"
      ],
      "Resource": "*"
    },
    {
      "Sid": "ScanDNS",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "route53:ListHostedZones",
        "route53:ListResourceRecordSets"
      ],
      "Resource": "*"
    },
    {
      "Sid": "ListPolicy",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:GetBucketPolicy"
      ],
      "Resource": "*"
    },
    {
      "Sid": "SetPolicy",
      "Effect": "Allow",
      "Action": [
        "s3:PutBucketPolicy"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DeletePolicy",
      "Effect": "Allow",
      "Action": [
        "s3:DeleteBucketPolicy"
      ],
      "Resource": "*"
    },
    {
      "Sid": "BlockPublicAccess",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:PutBucketPublicAccessBlock"
      ],
      "Resource": "*"
    },
    {
      "Sid": "EnableLogging",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:PutBucketLogging"
      ],
      "Resource": "*"
    },
    {
      "Sid": "EnableVersioning",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:PutBucketVersioning"
      ],
      "Resource": "*"
    },
    {
      "Sid": "EnableDefaultEncryption",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:PutEncryptionConfiguration"
      ],
      "Resource": "*"
    },
    {
      "Sid": "ResetObjectAcl",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObjectAcl",
        "s3:PutObjectAcl"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ]
    },
    {
      "Sid": "Encrypt",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ]
    },
    {
      "Sid": "DeleteUnencryptedVersions",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucketVersions",
        "s3:GetObjectVersion",
        "s3:DeleteObjectVersion"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ]
    }
  ]
}
Access Logs
Amazon Athena is great for querying S3 logs. Create a table (thanks to this post for the table structure) with:
CREATE EXTERNAL TABLE my_bucket (
  bucket_owner string,
  bucket string,
  time string,
  remote_ip string,
  requester string,
  request_id string,
  operation string,
  key string,
  request_verb string,
  request_url string,
  request_proto string,
  status_code string,
  error_code string,
  bytes_sent string,
  object_size string,
  total_time string,
  turn_around_time string,
  referrer string,
  user_agent string,
  version_id string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'serialization.format' = '1',
  'input.regex' = '([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" (-|[0-9]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\") ([^ ]*)$'
) LOCATION 's3://my-s3-logs/my-bucket/';
Change the last line to point to your log bucket (and prefix) and query away
SELECT
date_parse(time, '%d/%b/%Y:%H:%i:%S +0000') AS time,
request_url,
remote_ip,
user_agent
FROM
my_bucket
WHERE
requester = '-'
AND status_code LIKE '2%'
AND request_url LIKE '/some-keys%'
ORDER BY 1
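To run queries like this from code rather than the Athena console, a minimal boto3 sketch (the database, table, and results bucket are placeholders):

import boto3

athena = boto3.client("athena")

# kick off the query; results land in the output location as CSV
response = athena.start_query_execution(
    QueryString="SELECT remote_ip, request_url FROM my_bucket WHERE requester = '-'",
    QueryExecutionContext={"Database": "default"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print(response["QueryExecutionId"])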
CloudTrail Logs
Amazon Athena is also great for querying CloudTrail logs. Create a table (thanks to this post for the table structure) with:
CREATE EXTERNAL TABLE cloudtrail_logs (
  eventversion STRING,
  userIdentity STRUCT<
    type:STRING,
    principalid:STRING,
    arn:STRING,
    accountid:STRING,
    invokedby:STRING,
    accesskeyid:STRING,
    userName:String,
    sessioncontext:STRUCT<
      attributes:STRUCT<
        mfaauthenticated:STRING,
        creationdate:STRING>,
      sessionIssuer:STRUCT<
        type:STRING,
        principalId:STRING,
        arn:STRING,
        accountId:STRING,
        userName:STRING>>>,
  eventTime STRING,
  eventSource STRING,
  eventName STRING,
  awsRegion STRING,
  sourceIpAddress STRING,
  userAgent STRING,
  errorCode STRING,
  errorMessage STRING,
  requestId STRING,
  eventId STRING,
  resources ARRAY<STRUCT<
    ARN:STRING,
    accountId:STRING,
    type:STRING>>,
  eventType STRING,
  apiVersion STRING,
  readOnly BOOLEAN,
  recipientAccountId STRING,
  sharedEventID STRING,
  vpcEndpointId STRING,
  requestParameters STRING,
  responseElements STRING,
  additionalEventData STRING,
  serviceEventDetails STRING
)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://my-cloudtrail-logs/'
Change the last line to point to your CloudTrail log bucket and query away
SELECT
eventTime,
eventName,
userIdentity.userName,
requestParameters
FROM
cloudtrail_logs
WHERE
eventName LIKE '%Bucket%'
ORDER BY 1
Best Practices
Keep things simple and follow the principle of least privilege to reduce the chance of mistakes.
- Strictly limit who can perform bucket-related operations
- Avoid mixing objects with different permissions in the same bucket (use a bucket policy to enforce this)
- Don’t specify public read permissions on a bucket level (no GetObject in bucket policy)
- Monitor configuration frequently for changes
Bucket Policies
Only private uploads
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObjectAcl",
      "Resource": "arn:aws:s3:::my-bucket/*"
    }
  ]
}
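To apply this policy outside of s3tk, a minimal boto3 sketch (the bucket name is a placeholder):

import json

import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObjectAcl",
            "Resource": "arn:aws:s3:::my-bucket/*",
        }
    ],
}

# note: this replaces any existing bucket policy
boto3.client("s3").put_bucket_policy(Bucket="my-bucket", Policy=json.dumps(policy))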
Performance
For commands that iterate over bucket objects (scan-object-acl, reset-object-acl, encrypt, and delete-unencrypted-versions), run s3tk on an EC2 server for minimum latency.
Notes
The set-policy, block-public-access, enable-logging, enable-versioning, and enable-default-encryption commands are provided for convenience. We recommend Terraform for managing your buckets.
resource "aws_s3_bucket" "my_bucket" {
bucket = "my-bucket"
acl = "private"
logging {
target_bucket = "my-s3-logs"
target_prefix = "my-bucket/"
}
versioning {
enabled = true
}
}
resource "aws_s3_bucket_public_access_block" "my_bucket" {
bucket = "${aws_s3_bucket.my_bucket.id}"
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
Upgrading
Run:
pip install s3tk --upgrade
To use master, run:
pip install git+https://github.com/ankane/s3tk.git --upgrade
Docker
Run:
docker run -it ankane/s3tk aws configure
Commit your credentials:
docker commit $(docker ps -l -q) my-s3tk
And run:
docker run -it my-s3tk s3tk scan
History
View the changelog
Contributing
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone https://github.com/ankane/s3tk.git
cd s3tk
pip install -r requirements.txt