S3Lib
A Python library and collection of command-line programs for interfacing with AWS S3. It uses buffering and fixed memory where possible, so operations on large buckets and objects are safe and easy.
Features
- Memory-efficient streaming for large objects
- Batch operations for large buckets
- Support for custom S3-compatible endpoints
- Simple credential management
- Both library and CLI interfaces
Installation
pip install s3lib
Configuration
S3Lib supports multiple authentication methods (in order of precedence):
- Command-line argument: use --creds <path> to specify a credentials file
- Environment variables: set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
- Credentials file: create ~/.s3 with your credentials (default)
Credentials File Format
Create a file at ~/.s3 (or any path you specify) with:
<AWS_ACCESS_KEY_ID>
<AWS_SECRET_ACCESS_KEY>
Example:
AKIAIOSFODNN7EXAMPLE
wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Command Line Utilities
s3ls - List buckets or objects
List all buckets:
s3ls
List objects in a bucket:
s3ls mybucket
List with prefix filter:
s3ls mybucket --prefix logs/2024/
List with custom fields:
s3ls mybucket --fields Key Size LastModified
Available fields: Key, LastModified, ETag, Size, StorageClass
Options:
- --host HOST - Custom S3 endpoint hostname
- --port PORT - Custom port
- --output FILE - Write output to file
- --creds FILE - Path to credentials file
- --mark MARKER - Start listing from this key
- --prefix PREFIX - Filter by prefix
- --batch SIZE - Batch size for API calls (default: 1000)
s3get - Download objects
Download an object:
s3get mybucket myfile.txt --output local-file.txt
Download to stdout:
s3get mybucket logs/app.log | grep ERROR
Download multiple objects:
s3get mybucket file1.txt file2.txt --output combined.txt
Options:
- --host HOST - Custom S3 endpoint hostname
- --port PORT - Custom port
- --output FILE - Write output to file (default: stdout)
- --creds FILE - Path to credentials file
- --range START-END - Fetch only a byte range (e.g. 0-499, 500-, -999)
s3put - Upload objects
Upload a file:
s3put mybucket remote-file.txt local-file.txt
Upload from stdin:
echo "Hello World" | s3put mybucket hello.txt
Upload with custom headers:
s3put mybucket file.txt local.txt --header "Content-Type:text/plain" --header "Cache-Control:max-age=3600"
Options:
- --host HOST - Custom S3 endpoint hostname
- --port PORT - Custom port
- --creds FILE - Path to credentials file
- --header KEY:VALUE - Add custom HTTP headers (repeatable)
s3head - Get object metadata
Get metadata for objects:
s3head mybucket file1.txt file2.txt
Get metadata in JSON format:
s3head mybucket file.txt --json
Options:
- --host HOST - Custom S3 endpoint hostname
- --port PORT - Custom port
- --creds FILE - Path to credentials file
- --json - Output in JSON format
s3cp - Copy objects
Copy object within or between buckets:
s3cp source-bucket source-key dest-bucket dest-key
Copy with custom metadata:
s3cp mybucket old.txt mybucket new.txt --header "Content-Type:application/json"
Options:
- --host HOST - Custom S3 endpoint hostname
- --port PORT - Custom port
- --creds FILE - Path to credentials file
- --header KEY:VALUE - Set metadata headers (repeatable)
s3rm - Delete objects
Delete objects:
s3rm mybucket file1.txt file2.txt
Delete with verbose output:
s3rm mybucket file.txt --verbose
Batch delete with custom batch size:
s3rm mybucket file*.txt --batch 100
Options:
- --host HOST - Custom S3 endpoint hostname
- --port PORT - Custom port
- --creds FILE - Path to credentials file
- -v, --verbose - Show files as they are deleted
- --batch SIZE - Batch size for delete operations (default: 500)
s3sign - Sign S3 forms
Sign a policy document for browser-based uploads:
s3sign policy.json
This outputs the base64-encoded policy and signature.
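The policy file is an S3 POST policy document in JSON. A minimal sketch with placeholder values (the exact conditions depend on your upload form):

{
    "expiration": "2025-01-01T00:00:00Z",
    "conditions": [
        {"bucket": "mybucket"},
        ["starts-with", "$key", "uploads/"]
    ]
}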
Options:
--creds FILE- Path to credentials file
Python Library API
Connection Lifecycle
Connection must be used as a context manager. Calling methods outside of a with block raises ConnectionLifecycleError. The connection is established lazily on first use and closed when the with block exits.
from s3lib import Connection
access_id = "AKIAIOSFODNN7EXAMPLE"
secret = b"wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
with Connection(access_id, secret) as s3:
    for bucket in s3.list_buckets():
        print(bucket)
Unconsumed responses: If get_object2 returns a stream and the Connection context exits before that stream is consumed, ConnectionLifecycleError is raised. Always close the stream before letting the connection exit:
with Connection(access_id, secret) as s3:
    stream, headers = s3.get_object2("mybucket", "file.txt")
    with stream:  # stream must be closed inside the connection block
        data = stream.read()
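For contrast, the failure mode described above looks like this (a sketch; it assumes ConnectionLifecycleError is importable from s3lib in the same way PreconditionFailed is, shown later):

from s3lib import Connection, ConnectionLifecycleError

try:
    with Connection(access_id, secret) as s3:
        stream, headers = s3.get_object2("mybucket", "file.txt")
        # `stream` is still open and unconsumed here...
except ConnectionLifecycleError:
    pass  # ...so the error is raised when the connection context exits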
Downloading Objects
get_object2 (recommended)
get_object2 returns (S3ByteStream, headers) on success, or (None, headers) when a conditional request produces no body (304 Not Modified or 412 Precondition Failed).
with Connection(access_id, secret) as s3:
    # Simple download
    stream, headers = s3.get_object2("mybucket", "file.txt")
    with stream:
        data = stream.read()

    # Conditional download — skip if unchanged (caching)
    stream, headers = s3.get_object2("mybucket", "file.txt", if_none_match=cached_etag)
    if stream is None:
        pass  # 304 Not Modified — use cached copy
    else:
        with stream:
            data = stream.read()

    # Conditional download — only if ETag still matches
    stream, headers = s3.get_object2("mybucket", "file.txt", if_match=expected_etag)
    if stream is None:
        pass  # 412 Precondition Failed — object has changed
    else:
        with stream:
            data = stream.read()
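Putting the if_none_match branch to work, here is a minimal sketch of an ETag-based cache. The fetch_cached helper and cache dict are illustrative, not part of the library, and converting headers to a dict with an "ETag" entry is an assumption to verify against your version.

cache = {}  # key -> (etag, data)

def fetch_cached(s3, bucket, key):
    cached = cache.get(key)
    if cached is None:
        stream, headers = s3.get_object2(bucket, key)
    else:
        stream, headers = s3.get_object2(bucket, key, if_none_match=cached[0])
        if stream is None:
            return cached[1]  # 304 Not Modified: cached copy is current
    with stream:
        data = stream.read()
    cache[key] = (dict(headers).get("ETag"), data)  # assumes dict(headers) works
    return data

with Connection(access_id, secret) as s3:
    body = fetch_cached(s3, "mybucket", "file.txt")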
get_object (low-level)
get_object returns the raw HTTPResponse. Conditional responses (304, 412) are returned as status codes — no exception is raised.
with Connection(access_id, secret) as s3:
    # Conditional download — check status to detect unchanged object
    response = s3.get_object("mybucket", "file.txt", if_none_match=cached_etag)
    if response.status == 304:
        pass  # Not Modified — use cached copy
    else:
        data = response.read()

    # Conditional download — check status to detect changed object
    response = s3.get_object("mybucket", "file.txt", if_match=expected_etag)
    if response.status == 412:
        pass  # Precondition Failed — object has changed
    else:
        data = response.read()
S3ByteStream
get_object2 returns an S3ByteStream context manager. It must always be used with with:
- Full consumption: when .read() returns b"" (EOF), the underlying HTTP connection is kept alive and returned to a healthy state for reuse.
- Early exit: when the with block exits before the stream is exhausted, the underlying socket is closed.
# Incremental read — stream a large object to disk
with Connection(access_id, secret) as s3:
    stream, headers = s3.get_object2("mybucket", "largefile.bin")
    with stream, open("local-large.bin", "wb") as f:
        while chunk := stream.read(65536):
            f.write(chunk)
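Early exit is safe but deliberately sacrifices the socket. In this sketch only the first few bytes are read, say to sniff a file signature, and leaving the with block early closes the connection as described above (a ranged request, next section, would avoid transferring the rest at all):

with Connection(access_id, secret) as s3:
    stream, headers = s3.get_object2("mybucket", "largefile.bin")
    with stream:
        magic = stream.read(8)  # only the first 8 bytes are needed
    # the stream was not read to EOF, so the underlying socket was
    # closed rather than kept alive for reuse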
Byte Range Fetching
Request only a portion of an object using byte_range=(start, end). Both positions are inclusive, 0-based byte offsets. Either can be None:
with Connection(access_id, secret) as s3:
    # First 500 bytes
    stream, headers = s3.get_object2("mybucket", "file.bin", byte_range=(0, 499))
    with stream:
        data = stream.read()

    # From byte 4096 to end of object
    stream, headers = s3.get_object2("mybucket", "file.bin", byte_range=(4096, None))
    with stream:
        tail = stream.read()
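An open-ended range is handy for resuming an interrupted download. This sketch assumes partial.bin is a clean prefix of the object and shorter than it (a range starting at or past the object's end will fail); the file names are illustrative:

import os

with Connection(access_id, secret) as s3:
    have = os.path.getsize("partial.bin")  # bytes already on disk
    stream, headers = s3.get_object2("mybucket", "file.bin", byte_range=(have, None))
    with stream, open("partial.bin", "ab") as f:
        while chunk := stream.read(65536):
            f.write(chunk)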
Uploading Objects
put_object2 (recommended)
put_object2 returns a PutResult TypedDict on success, or None when a conditional check fails — no exception to catch.
| Field | Type | Description |
|---|---|---|
| etag | str | ETag of the stored object; use with if_match for future consistency checks |
| version_id | str \| None | Version ID if bucket versioning is enabled |
| checksum | str \| None | Server-confirmed checksum if one was requested |
with Connection(access_id, secret) as s3:
    # Upload bytes
    result = s3.put_object2("mybucket", "file.txt", b"Hello World")
    print(result['etag'])

    # Upload from an open file or BytesIO
    with open("local.bin", "rb") as f:
        result = s3.put_object2("mybucket", "remote.bin", f)

    # Create-only — None means the object already existed, upload was skipped
    result = s3.put_object2("mybucket", "file.txt", b"data", if_none_match=True)
    if result is None:
        pass  # object already exists

    # Optimistic locking — None means a concurrent write changed the object
    result = s3.put_object2("mybucket", "file.txt", b"updated", if_match=old_etag)
    if result is None:
        pass  # ETag changed, retry with a fresh read
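The if_match branch generalizes to a read-modify-write retry loop. A sketch, where update_object and transform are illustrative names and dict(headers) yielding an "ETag" entry is an assumption to verify:

def update_object(s3, bucket, key, transform, attempts=5):
    # Optimistic locking: re-read and retry while concurrent writers race us.
    for _ in range(attempts):
        stream, headers = s3.get_object2(bucket, key)
        with stream:
            current = stream.read()
        etag = dict(headers).get("ETag")  # assumption: headers converts to a dict
        result = s3.put_object2(bucket, key, transform(current), if_match=etag)
        if result is not None:
            return result  # the write landed on the version we read
    raise RuntimeError("gave up after repeated concurrent modifications")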
put_object (low-level)
put_object returns a raw (status, headers) tuple. Conditional failures raise PreconditionFailed.
from s3lib import Connection, PreconditionFailed

with Connection(access_id, secret) as s3:
    # Create-only upload
    try:
        s3.put_object("mybucket", "file.txt", b"data", if_none_match=True)
    except PreconditionFailed:
        pass  # object already exists

    # Optimistic locking
    try:
        s3.put_object("mybucket", "file.txt", b"updated", if_match=old_etag)
    except PreconditionFailed:
        pass  # ETag changed, retry with a fresh read
Other Operations
with Connection(access_id, secret) as s3:
    # List buckets
    for bucket in s3.list_buckets():
        print(bucket)

    # List objects (keys only)
    for key in s3.list_bucket("mybucket"):
        print(key)

    # List objects with metadata
    for obj in s3.list_bucket2("mybucket"):
        print(obj['Key'], obj['Size'], obj['LastModified'])

    # Object metadata
    headers = s3.head_object("mybucket", "file.txt")

    # Copy object within or between buckets
    s3.copy_object("bucket1", "src.txt", "bucket2", "dst.txt")

    # Delete one object
    s3.delete_object("mybucket", "file.txt")

    # Bulk delete
    for key, ok in s3.delete_objects("mybucket", ["a.txt", "b.txt"]):
        print(key, ok)
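These calls compose. A prefix delete, for example, can be built from list_bucket and delete_objects; this sketch filters keys client-side rather than assuming list_bucket takes a prefix argument:

with Connection(access_id, secret) as s3:
    stale = [k for k in s3.list_bucket("mybucket") if k.startswith("logs/2023/")]
    for key, ok in s3.delete_objects("mybucket", stale):
        if not ok:
            print("failed to delete", key)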
Connection Options
# Custom endpoint (e.g. MinIO or a specific AWS region)
with Connection(access_id, secret, host="s3.us-west-2.amazonaws.com") as s3:
    pass

# Custom port
with Connection(access_id, secret, port=9000) as s3:
    pass

# Connection timeout (seconds)
with Connection(access_id, secret, conn_timeout=60) as s3:
    pass
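The options combine freely; for a local MinIO instance you might pass host, port, and a timeout together (values are illustrative):

with Connection(access_id, secret, host="localhost", port=9000, conn_timeout=30) as s3:
    for bucket in s3.list_buckets():
        print(bucket)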
Development
See MAINTAINING.md for development and maintenance instructions.
Running Tests
# Install development dependencies
make dev
# Run tests, type checking, and linting
make check
# Run tests with coverage report
make test
# Type check only
make typecheck
# Lint only
make lint
License
MIT License - See setup.py for details.
Author
Andrew Thomson (athomsonguy@gmail.com)