Unofficial AWS Glue Ray.io packaging tool. Pure python or cross platform.
raypack
Raypack creates a package for AWS Glue Ray.io tasks. It automates the steps on this documentation page, which describes the shell commands only loosely. AWS Lambda calls for a similar type of packaging but, as far as I know, there is no standard for it.
Use this if you have private dependencies, native dependencies, or you want to package your own code as a Python package. AWS Glue cannot install anything that lacks a binary wheel, and it cannot reach private package repositories. gcc and other build tools are not in the Glue runtime images.
See below for build options.
raypack is not supported by Amazon, AWS, or Anyscale, Inc., the makers of ray.io. Some code was generated with ChatGPT (OpenAI).
Installation
You are encouraged to install with pipx so that the CLI tool's dependencies do not conflict with your project's dependencies.
pipx install raypack
Usage
raypack [--verbose]
python -m raypack [--verbose]
Configuration goes in pyproject.toml. If nothing is specified, the defaults are as below.
[tool.raypack]
exclude_packaging_cruft = true
outer_folder_name = "venv"
source_venv = ".venv"
venv_tool = "poetry"
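How the defaults combine with a project's own settings can be sketched in Python. This is a minimal illustration, not raypack's actual code: `load_raypack_config` and `DEFAULTS` are hypothetical names, and rejecting unknown keys is an assumption about reasonable behavior.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: assumes the tomli backport is installed
    import tomli as tomllib

# Defaults copied from the [tool.raypack] block above.
DEFAULTS = {
    "exclude_packaging_cruft": True,
    "outer_folder_name": "venv",
    "source_venv": ".venv",
    "venv_tool": "poetry",
}


def load_raypack_config(pyproject_text: str) -> dict:
    """Return the [tool.raypack] table merged over the defaults (hypothetical helper)."""
    data = tomllib.loads(pyproject_text)
    overrides = data.get("tool", {}).get("raypack", {})
    # Assumption: unknown keys are treated as mistakes rather than silently ignored.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown [tool.raypack] keys: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}
```

A project that sets only `outer_folder_name` would keep the other three defaults.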
Build Options
If your dependencies are all pure python, the packaging will work on any machine. However, if your dependencies have any native code:
Arm64 built on an Arm64 machine
- An actual arm64 build runner with gcc - Best option, will allow compiling native code correctly.
- Docker e.g.
FROM public.ecr.aws/lambda/python:3.9-arm64
- Second best option. I haven't tried it, and I am not sure it works on build runners that are not actually arm64 CPUs.
Mac Binaries
- An arm64 machine, e.g. a Mac - Next best option, not sure if it will work.
Precompiled Binaries
- Any machine, arm64 or not - Works in limited situations, namely when every dependency either has a precompiled binary (wheel) or is pure python.
The last option works by telling pip to just download and unzip the arm64 wheels. If there aren't wheels or if the wheels weren't compiled for arm64, then you have to consider finding a different machine or convincing package maintainers to support wheels and more kinds of wheels.
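Whether a given wheel fits this last option can be read off its filename, because the platform tag is always the final dash-separated field. The sketch below is an illustration under assumptions: the function names are hypothetical, the target is assumed to be glibc-based Linux on arm64, and the numpy filenames in the usage note are only examples of the naming scheme.

```python
def wheel_platform_tags(wheel_filename: str) -> list:
    """Return the platform tags encoded in a wheel filename."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {wheel_filename}")
    # Layout: name-version[-build]-python-abi-platform.whl, platform is last.
    platform_part = wheel_filename[: -len(".whl")].split("-")[-1]
    # Several platform tags can be compressed into one field, separated by dots.
    return platform_part.split(".")


def usable_on_linux_arm64(wheel_filename: str) -> bool:
    """True for pure python wheels ("any") or Linux aarch64 wheels.

    Assumption: the target is glibc Linux on arm64, so macOS arm64 and
    musllinux wheels are treated as not usable.
    """
    return any(
        tag == "any"
        or (tag.startswith(("manylinux", "linux_")) and tag.endswith("_aarch64"))
        for tag in wheel_platform_tags(wheel_filename)
    )
```

For example, `usable_on_linux_arm64("raypack-0.4.0-py3-none-any.whl")` is true because the wheel is pure python, while a name ending in `macosx_11_0_arm64.whl` is rejected even though it is an arm64 build.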
Capabilities
- TODO: Warn if not python 3.9 or other glue compatible version
- TODO: create a venv for 3.9 if current venv is not 3.9
- Calls poetry to create a virtualenv without dev dependencies
- TODO: support pip, pipenv to create virtualenv.
- Finds site-packages
- Zips virtualenv and zips own package
- TODO: support single file modules, eg. mymodule.py
- Skips cruft
- Run as few subprocesses as possible
- config using pyproject.toml or CLI args
- TODO: Uploads to s3
- pipx installable
- works on any OS as well as is possible (can't handle linux binaries on windows for example)
- Remove packages AWS includes
- AWS's documentation on packaging ray jobs
- [ray's documentation on dependencies](https://docs.ray.io/en/latest/ray-core/handling-dependencies.html)
- AWS's documentation on packaging spark jobs
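Two items in the list above, finding site-packages and skipping cruft, can be sketched as follows. This is not taken from raypack's source: the helper names are hypothetical, and the cruft list is an assumption, since the README does not say exactly what is skipped.

```python
from pathlib import Path

# Assumed cruft list; the README does not spell out what raypack skips.
CRUFT_DIRS = {"__pycache__"}
CRUFT_SUFFIXES = {".pyc", ".pyo"}


def find_site_packages(venv: Path) -> Path:
    """Locate site-packages inside a virtualenv (POSIX or Windows layout)."""
    candidates = list(venv.glob("lib/python*/site-packages"))  # POSIX layout
    candidates += list(venv.glob("Lib/site-packages"))  # Windows layout
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(f"no site-packages under {venv}")


def is_cruft(path: Path) -> bool:
    """True for bytecode caches that do not need to ship in the package."""
    return path.suffix in CRUFT_SUFFIXES or any(
        part in CRUFT_DIRS for part in path.parts
    )
```

Calling `find_site_packages(Path(".venv"))` matches the default `source_venv` setting.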
How it works
On native arm64 machine
- Gather info from pyproject.toml or CLI args, but not both.
- Create a local .venv and .whl using poetry.
- Create a new zip file with an extra top level folder.
- Find the site-packages folder and copy to a new zip
- Find the module contents in .whl and copy to a new zip
- Upload to s3
- Use s3 py modules
"--s3-py-modules", "s3://s3bucket/pythonPackage.zip"
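The zip step with an extra top level folder can be sketched with the standard library. The function name is hypothetical, and skipping `__pycache__` is an assumption about what counts as cruft. The S3 upload is left out, and `zip_path` should live outside `source_dir` so the archive does not try to include itself.

```python
import zipfile
from pathlib import Path


def zip_with_outer_folder(
    source_dir: Path, zip_path: Path, outer_folder_name: str = "venv"
) -> list:
    """Zip every file under source_dir beneath one extra top level folder."""
    written = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file in sorted(source_dir.rglob("*")):
            # Skip directories and (by assumption) bytecode caches.
            if not file.is_file() or "__pycache__" in file.parts:
                continue
            relative = file.relative_to(source_dir).as_posix()
            arcname = f"{outer_folder_name}/{relative}"
            archive.write(file, arcname)
            written.append(arcname)
    return written
```

The resulting zip is the file that would be uploaded to S3 and passed as the value of `--s3-py-modules`.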
On non-arm64 machine
- Use poetry lock file to generate requirements.txt
- Use pip's download target with specified platform (arm64) to simulate creating a venv
- Combine with own code as above
- Upload to s3 as above
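The first two steps on a non-arm64 machine map to two command lines, built here as argument lists. This is one reading of the steps, not raypack's actual invocation: the helper names are hypothetical, the `manylinux2014_aarch64` platform tag and Python 3.9 are assumptions about the Glue target, and `poetry export` may require the export plugin depending on the Poetry version.

```python
import sys


def poetry_export_command(output: str = "requirements.txt") -> list:
    """Command that turns the poetry lock file into a requirements file."""
    return [
        "poetry", "export",
        "--format", "requirements.txt",
        "--output", output,
        "--without-hashes",
    ]


def pip_cross_install_command(
    requirements: str,
    target: str,
    platform: str = "manylinux2014_aarch64",  # assumed tag for Linux arm64
    python_version: str = "3.9",
) -> list:
    """Command that unpacks wheels for another platform into a target folder."""
    return [
        sys.executable, "-m", "pip", "install",
        "--requirement", requirements,
        "--target", target,
        "--platform", platform,
        "--implementation", "cp",
        "--python-version", python_version,
        # pip only accepts --platform when source builds are ruled out.
        "--only-binary=:all:",
    ]
```

Each list can be passed to `subprocess.run(command, check=True)`. The pip step fails if any dependency has no matching wheel, which is the limitation described under Precompiled Binaries.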
Contributing
To install and run tests and linting tools.
poetry install --with dev
make check
To see if the app can package up other apps
poetry build
# exit poetry shell so that pipx can install with the right base python
exit
pipx install /e/github/raypack/dist/raypack-0.1.0-py3-none-any.whl
And then in a different project with a pyproject.toml file, run
raypack
Prior Art
Similar to PEX and other venv zip tools. As far as I know, those are not AWS aware, do not include all the dependencies, or are more interested in making the archive file executable or self-extracting.
AWS Lambdas also have to go through a similar ad hoc zip process.
Documentation
Change Log
- 0.1.0 - Idea and reserve package name.