A SageMaker-compatible BundledScriptProcessor for running tar-bundled source dirs.
Bundled Script Processor
An extension of the Amazon SageMaker ScriptProcessor
that adds support for bundling a local source_dir (and optional dependencies) into a tarball, uploading it to S3, and
running it inside SageMaker Processing jobs. This makes it easier to organize your code into directories and
run it in SageMaker without manually managing uploads.
✨ Features
- Extends ScriptProcessor with source_dir support
- Accepts a source directory instead of just a single script
- Supports bundling dependencies / local folders
- Automatically generates a lightweight entrypoint script, i.e. runproc.sh
- Cleans up temporary artifacts after execution
🔍 How it works under the hood
BundledScriptProcessor extends the normal ScriptProcessor flow by injecting an extra packaging step before execution.
- Bundle creation: It takes your source_dir (and any extra dependencies) and compresses them into a sourcedir.tar.gz.
- Upload to S3: This tarball is uploaded to your SageMaker default bucket and made available in the container as a ProcessingInput named "code".
- Custom entrypoint: A small runproc.sh script is generated and uploaded as a second ProcessingInput named "entrypoint". This script:
  • Unpacks sourcedir.tar.gz
  • Cleans up the archive
  • Executes your Python entrypoint (main.py by default) with the specified command (e.g. ["python3"]) and any additional arguments.
- Entrypoint override: Finally, it overrides the default ScriptProcessor entrypoint to point to this generated shell script, so SageMaker runs it automatically when the job starts.
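The bundle-creation step can be approximated locally with the standard library. The sketch below is not the package's actual implementation: the helper name bundle_source_dir and the archive layout (files from source_dir at the archive root, each dependency under its own folder name) are assumptions inferred from the usage comments in this README.

```python
import tarfile
import tempfile
from pathlib import Path


def bundle_source_dir(source_dir, dependencies=(), out_dir=None):
    """Pack source_dir contents and dependency folders into sourcedir.tar.gz.

    Hypothetical helper: the layout is an assumption, not the package's code.
    """
    source_dir = Path(source_dir)
    out_dir = Path(out_dir or tempfile.mkdtemp())
    archive = out_dir / "sourcedir.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # Files from source_dir land at the archive root.
        for item in sorted(source_dir.iterdir()):
            tar.add(item, arcname=item.name)
        # Each dependency keeps its own folder name inside the archive.
        for dep in dependencies:
            dep = Path(dep)
            tar.add(dep, arcname=dep.name)
    return archive


# Build a throwaway copy of the example layout and bundle it.
workspace = Path(tempfile.mkdtemp())
(workspace / "task").mkdir()
(workspace / "common").mkdir()
(workspace / "task" / "callable.py").write_text("print('hi')\n")
(workspace / "task" / "helper.py").write_text("def helloworld():\n    return 'Hello World!'\n")
(workspace / "common" / "lib.py").write_text("def common_helloworld():\n    return 'Common Hello World!'\n")

archive_path = bundle_source_dir(workspace / "task", [workspace / "common"])
with tarfile.open(archive_path) as tar:
    names = sorted(tar.getnames())
print(names)
```

Listing the archive members before submitting a job is a cheap way to confirm that the entrypoint file and every dependency folder were actually packed.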
This design keeps the upload/extract/execute logic transparent to you, while still relying on SageMaker’s standard ProcessingJob mechanics. Additionally, it builds on the existing SageMaker ScriptProcessor API for tasks like compressing and uploading code to S3.
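To make the entrypoint step concrete, here is a minimal sketch of how a script like runproc.sh could be rendered. The function render_runproc, the constant CODE_DIR, and the exact shell lines are assumptions for illustration; the script the package really generates may differ.

```python
import shlex

# Container path quoted in the usage comments of this README.
CODE_DIR = "/opt/ml/processing/input/code"


def render_runproc(command, code, arguments=()):
    """Return the text of a shell entrypoint (hypothetical rendering)."""
    # Quote every token so arguments with spaces survive the shell.
    run_line = " ".join(shlex.quote(part) for part in [*command, code, *arguments])
    return "\n".join([
        "#!/bin/bash",
        "set -e",                      # stop at the first failing step
        f"cd {shlex.quote(CODE_DIR)}",
        "tar -xzf sourcedir.tar.gz",   # unpack the bundle
        "rm sourcedir.tar.gz",         # clean up the archive
        run_line,                      # run the Python entrypoint
        "",
    ])


script = render_runproc(["python3"], "callable.py", ["--hello", "world"])
print(script)
```

Quoting each token with shlex.quote keeps the generated command safe when an argument contains spaces or shell metacharacters.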
📦 Installation
pip install bundled-script-processor
🚀 Usage
Example directory layout
demo_bundled_script_processor/
├─ main.py
├─ task/
│  ├─ callable.py
│  └─ helper.py
└─ common/
   └─ lib.py
main.py
from bundled_script_processor import BundledScriptProcessor
from sagemaker import Session, get_execution_role
sm_session = Session()
role = get_execution_role(sagemaker_session=sm_session)
script = 'callable.py'
source_dir = '/home/pmaslov/demo_bundled_script_processor/task'
dep1 = '/home/pmaslov/demo_bundled_script_processor/common'
processor = BundledScriptProcessor(
    role=role,
    image_uri="123456789012.dkr.ecr.eu-central-1.amazonaws.com/my-image:latest",
    instance_type="ml.m4.xlarge",
)
# Run with a full source directory
processor.run(
    source_dir=source_dir,          # must contain callable.py (contents are copied into /opt/ml/processing/input/code/)
    code=script,                    # name of the Python file to execute inside the processing job
    dependencies=[dep1],            # optional dependency (folder is copied into /opt/ml/processing/input/code/)
    arguments=["--hello", "world"], # optional CLI args
)
task/callable.py
from helper import helloworld
from common.lib import common_helloworld
if __name__ == '__main__':
    print(helloworld())
    print(common_helloworld())
task/helper.py
def helloworld():
    return 'Hello World!'
common/lib.py
def common_helloworld():
    return 'Common Hello World!'
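The imports in task/callable.py only resolve if helper.py and the common folder sit next to callable.py at run time. The sketch below checks that locally, assuming the container's code directory has that flattened layout (as the usage comments suggest); the helper run_flattened_layout is hypothetical and not part of the package.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# The three example files, placed as they would sit after extraction (assumed layout).
FILES = {
    "callable.py": (
        "from helper import helloworld\n"
        "from common.lib import common_helloworld\n"
        "if __name__ == '__main__':\n"
        "    print(helloworld())\n"
        "    print(common_helloworld())\n"
    ),
    "helper.py": "def helloworld():\n    return 'Hello World!'\n",
    "common/lib.py": "def common_helloworld():\n    return 'Common Hello World!'\n",
}


def run_flattened_layout():
    """Write the files into a temp dir and run callable.py from there."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for rel, text in FILES.items():
            path = root / rel
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(text)
        result = subprocess.run(
            [sys.executable, "callable.py"],
            cwd=root,
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout.splitlines()


output = run_flattened_layout()
print(output)  # ['Hello World!', 'Common Hello World!']
```

Because Python puts the script's own directory first on sys.path, both `helper` and `common.lib` import without any path manipulation, which is why the dependency folder does not need an `__init__.py` here.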