a command executor with caching for data processing pipelines
Razel
A command executor with caching. It is:
- Fast: commands are executed multithreaded and local caching speeds up trial and error development (avoids repeated execution of commands which have been processed before)
- Scalable: optional remote caching allows sharing results between CI jobs
- Reliable: commands are executed in a sandbox to detect missing dependencies
- Easy to use: commands are specified using a high-level TypeScript or Python API and convenience functions/tasks are built-in
- Made for: data processing pipelines with executables working on files and many dependencies between those
Razel is not the best choice for building software. In particular, there is no built-in support for compiler setup and header dependencies.
Getting Started
The native input format for Razel is a razel.jsonl file, see the example examples/razel.jsonl.
It can be run with razel exec -f examples/razel.jsonl.
The preferred way is to use one of the high-level APIs. Both allow specifying the commands in an object-oriented style
and provide a run() function which creates the razel.jsonl file, downloads the native razel binary
and uses it to execute the commands.
Paths of input files are relative to the workspace (directory of razel.jsonl). Output files are created
in <cwd>/razel-out. Additional metadata is written to <cwd>/razel-out/razel-metadata.
TypeScript API
Install Deno to use the TypeScript API. Run the example Deno script:
deno run -A --check examples/deno.ts -- -v
Python API
The Python API requires Python >= 3.8. Install the package and run the example Python script:
pip install --upgrade razel
python examples/python.py -v
Batch file (experimental)
In addition to razel.jsonl, Razel can directly execute a batch file containing commands.
Input and output files need to be specified, which is WIP.
Execute the example examples/batch.sh with Razel:
razel exec -f examples/batch.sh
Running in Docker/Podman container
The workspace directory can be mounted into a container:
podman run -t -v $PWD:$PWD -w $PWD denoland/deno deno run -A examples/deno.ts
Building Razel from source
Use rustup to install Rust. Install protobuf-compiler. Then run cargo install --locked razel.
Project Status
Razel is in active development and used in production.
The CLI and the format of razel.jsonl will likely change, as will the output in razel-out/razel-metadata.
While Linux is the main development platform, Razel is also tested on Mac and Windows.
Features
Measurements
Razel parses the stdout of executed commands to capture runtime measurements and writes them
to razel-out/razel-metadata/log.json and razel-out/razel-metadata/measurements.csv.
Currently, the <CTestMeasurement> and <DartMeasurement> tags as used by CTest/CDash are supported:
<CTestMeasurement type="numeric/double" name="score">12.3</CTestMeasurement>
<CTestMeasurement type="text/string" name="result">ok</CTestMeasurement>
Supporting custom formats is planned.
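To illustrate the tag format, the following is a minimal Python sketch of how such measurement tags could be extracted from captured stdout. It is not Razel's implementation; the function name parse_measurements and the decision to convert every numeric/* type to float are assumptions made for this example.

```python
import re

# Matches <CTestMeasurement ...>value</CTestMeasurement> and the DartMeasurement variant.
MEASUREMENT_RE = re.compile(
    r'<(?P<tag>CTestMeasurement|DartMeasurement)\s+(?P<attrs>[^>]*)>'
    r'(?P<value>.*?)</(?P=tag)>',
    re.DOTALL,
)
# Matches key="value" attribute pairs inside the opening tag.
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_measurements(stdout: str) -> dict:
    """Return a name -> value mapping for all measurement tags found in stdout.

    Hypothetical helper: numeric/* types become float, everything else stays str.
    """
    measurements = {}
    for match in MEASUREMENT_RE.finditer(stdout):
        attrs = dict(ATTR_RE.findall(match.group("attrs")))
        name = attrs.get("name")
        if name is None:
            continue  # assumption: tags without a name are ignored
        value = match.group("value").strip()
        if attrs.get("type", "").startswith("numeric/"):
            measurements[name] = float(value)
        else:
            measurements[name] = value
    return measurements
```

Applied to the two example lines above, the sketch yields one numeric and one string measurement, keyed by the name attribute.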
Tags
Tags can be set on commands. Any custom string can be used as a tag; a colon should be used for grouping.
The tags are added to razel-out/razel-metadata/execution_times.json.
Using tags for filtering commands and creating reports is planned.
Tags with razel: prefix are reserved and have special meaning:
- razel:quiet: don't be verbose if command succeeded
- razel:verbose: always show verbose output
- razel:condition: keep running and don't be verbose if command failed
- razel:timeout:<seconds>: kill command after the specified number of seconds
- razel:no-cache: always execute a command without caching
- razel:no-remote-cache: don't use remote cache
- razel:no-sandbox: disable sandbox and also cache - for commands with unspecified input/output files
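The reserved tags can be read as a small set of per-command settings. The Python sketch below shows one way to interpret a tag list. It is only an illustration and not Razel's code: the TagSettings class, the parse_tags function, and the choice to reject unknown razel: tags are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Reserved flag tags from the list above, mapped to hypothetical setting names.
FLAG_TAGS = {
    "razel:quiet": "quiet",
    "razel:verbose": "verbose",
    "razel:condition": "condition",
    "razel:no-cache": "no_cache",
    "razel:no-remote-cache": "no_remote_cache",
    "razel:no-sandbox": "no_sandbox",
}
TIMEOUT_PREFIX = "razel:timeout:"


@dataclass
class TagSettings:
    quiet: bool = False
    verbose: bool = False
    condition: bool = False
    no_cache: bool = False
    no_remote_cache: bool = False
    no_sandbox: bool = False
    timeout_seconds: Optional[float] = None
    custom: List[str] = field(default_factory=list)


def parse_tags(tags: List[str]) -> TagSettings:
    """Split a tag list into reserved settings and custom tags."""
    settings = TagSettings()
    for tag in tags:
        if tag in FLAG_TAGS:
            setattr(settings, FLAG_TAGS[tag], True)
        elif tag.startswith(TIMEOUT_PREFIX):
            settings.timeout_seconds = float(tag[len(TIMEOUT_PREFIX):])
        elif tag.startswith("razel:"):
            # Assumption: unknown reserved tags are treated as an error.
            raise ValueError(f"unknown reserved tag: {tag}")
        else:
            settings.custom.append(tag)
    if settings.no_sandbox:
        # razel:no-sandbox is documented to disable the cache as well.
        settings.no_cache = True
    return settings
```

For example, parse_tags(["razel:timeout:30", "stage:train"]) gives a timeout of 30 seconds and keeps stage:train as a custom tag, where the colon groups it under stage.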
Conditional execution / Skipping commands
Commands can be skipped based on the execution result of another command. Set the razel:condition tag on a command
and use that one as dependency for other commands.
WebAssembly
Razel has a WebAssembly runtime integrated and can directly execute WASM modules using WebAssembly System Interface (WASI).
WebAssembly is a perfect fit to create portable data processing pipelines with Razel. Just a single WebAssembly module is needed to run - and create bit-exact output - on all platforms. WebAssembly execution is slower than native binaries, but startup time might be faster (no process overhead).
Param/Response files
Commands with a huge number of arguments might result in command lines which are too long to be executed by the OS. Razel detects those cases and replaces the arguments with a response file. The filename starts with @.
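The general response-file technique can be sketched in a few lines of Python. This is not how Razel decides or formats its files: the length limit, the one-argument-per-line layout, the .params suffix, and the function name are all assumptions, and the called executable has to understand @file arguments for this to work.

```python
import os
import tempfile
from typing import List

# Hypothetical limit; the real maximum depends on the operating system.
MAX_COMMAND_LINE_LENGTH = 32_000


def maybe_use_response_file(
    executable: str,
    args: List[str],
    directory: str,
    max_length: int = MAX_COMMAND_LINE_LENGTH,
) -> List[str]:
    """Return argv, moving the arguments into a response file if argv is too long."""
    # Rough length estimate: each argument plus one separating space.
    length = len(executable) + sum(len(arg) + 1 for arg in args)
    if length <= max_length:
        return [executable, *args]
    # Assumption: one argument per line in the response file.
    fd, path = tempfile.mkstemp(suffix=".params", dir=directory, text=True)
    with os.fdopen(fd, "w") as response_file:
        response_file.write("\n".join(args))
    return [executable, "@" + path]
```

Short command lines are returned unchanged, so the response file only appears when it is actually needed.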
Out of memory (OOM) handling
If a process is killed by the OS, the command and similar ones will be retried with less concurrency to reduce the total memory usage. (Doesn't work in K8s because the whole pod is killed.)
Sandbox
Commands are executed in a temporary directory which contains symlinks to the input files specific to one command. This allows detecting unspecified dependencies which would break caching.
The sandbox is not meant for executing untrusted code.
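A minimal Python sketch of the symlink approach described above, assuming input paths are relative to a workspace directory. The function name run_in_sandbox is hypothetical and the sketch omits output handling and caching, so it should not be read as Razel's implementation. On Windows, creating symlinks may require additional privileges.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import List


def run_in_sandbox(
    argv: List[str], inputs: List[str], workspace: Path
) -> subprocess.CompletedProcess:
    """Run argv in a temporary directory that only contains the declared inputs."""
    with tempfile.TemporaryDirectory(prefix="sandbox-") as sandbox:
        for relative_path in inputs:
            link = Path(sandbox) / relative_path
            link.parent.mkdir(parents=True, exist_ok=True)
            # Symlink instead of copying, so large inputs stay cheap.
            os.symlink((workspace / relative_path).resolve(), link)
        # The command sees only the symlinked inputs in its working directory.
        return subprocess.run(argv, cwd=sandbox, capture_output=True, text=True)


if __name__ == "__main__":
    workspace = Path(tempfile.mkdtemp())
    (workspace / "declared.txt").write_text("visible")
    (workspace / "undeclared.txt").write_text("hidden")
    listing = [sys.executable, "-c", "import os; print(os.listdir('.'))"]
    print(run_in_sandbox(listing, ["declared.txt"], workspace).stdout)
```

A command that reads a file it did not declare fails inside the sandbox, because that file is simply not present in the temporary directory.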
Local Caching
The local cache is enabled by default and stores information about previously executed commands and output files.
The output directory razel-out contains symlinks to files stored in the local cache.
Use razel exec --info to get the default cache directory and --cache-dir (env: RAZEL_CACHE_DIR) to move it.
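Caches of this kind usually look up results by a key derived from the command and its inputs. The Python sketch below shows that general idea only. How Razel actually computes its keys is not described here, so the hashing scheme and the function name cache_key are assumptions.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List


def cache_key(argv: List[str], input_files: Iterable[Path]) -> str:
    """Hash the command line and the contents of all input files.

    Hypothetical scheme: any change to an argument or to an input file
    produces a different key, so a stale result is never looked up.
    """
    hasher = hashlib.sha256()
    for arg in argv:
        hasher.update(arg.encode("utf-8"))
        hasher.update(b"\0")  # separator so ["ab", "c"] differs from ["a", "bc"]
    # Sort the paths so the key does not depend on the order they were given in.
    for path in sorted(str(p) for p in input_files):
        hasher.update(path.encode("utf-8"))
        hasher.update(b"\0")
        hasher.update(hashlib.sha256(Path(path).read_bytes()).digest())
    return hasher.hexdigest()
```

If the key of a command matches an entry in the cache, its outputs can be linked into the output directory instead of running the command again.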
Remote Caching
Razel supports remote caching compatible with the Bazel Remote Execution API. Remote execution is not yet implemented.
Use --remote-cache (env: RAZEL_REMOTE_CACHE) to specify a comma-separated list of remote cache URLs.
The first available one will be used.
Optionally --remote-cache-threshold (REMOTE_CACHE_THRESHOLD) can be set to only cache commands with
outputSize / execTime < threshold [kilobyte / s]. If your remote cache doesn't have unlimited storage capacity,
this can drastically speed up execution because quick commands with large output files will no longer be cached,
providing more storage for expensive commands.
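The threshold rule can be written down as a small Python function. This is a sketch of the stated criterion, not Razel's code: the function name, the use of 1000 bytes per kilobyte, and the handling of a zero execution time are assumptions.

```python
from typing import Optional


def should_cache_remotely(
    output_size_bytes: int,
    exec_time_seconds: float,
    threshold_kb_per_s: Optional[float] = None,
) -> bool:
    """Apply the rule outputSize / execTime < threshold [kilobyte / s]."""
    if threshold_kb_per_s is None:
        return True  # no threshold configured: cache everything
    if exec_time_seconds <= 0:
        # Assumption: a command with no measurable runtime is cheap to rerun.
        return False
    output_kb = output_size_bytes / 1000  # assumption: 1 kilobyte = 1000 bytes
    return output_kb / exec_time_seconds < threshold_kb_per_s
```

With a threshold of 100, a command that writes 10 MB in one second (10000 kilobyte / s) is not cached remotely, while one that writes 10 kB in ten seconds (1 kilobyte / s) is.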
The following remote cache implementations are tested with Razel:
- bazel-remote-cache
  - run with podman run -p 9092:9092 buchgr/bazel-remote-cache --max_size 10
  - call razel with RAZEL_REMOTE_CACHE=grpc://localhost:9092
- nativelink
  - run with instance_name main on port 50051:
      mkdir -p nativelink-config
      curl https://raw.githubusercontent.com/TraceMachina/nativelink/main/nativelink-config/examples/basic_cas.json --output nativelink-config/basic_cas.json
      podman run -p 50051:50051 -v $PWD/nativelink-config:/nativelink-config:ro ghcr.io/tracemachina/nativelink:v0.2.0 /nativelink-config/basic_cas.json
  - call razel with RAZEL_REMOTE_CACHE=grpc://localhost:50051/main
Configuration
Use razel exec -h to list the configuration options for execution.
Some options can also be set as environment variables and those are loaded from .env files.
The following sources are used in order, overwriting previous values:
- .env file in current directory or its parents
- .env.local file in current directory or its parents
- environment variable
- command line option
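The precedence order above amounts to merging the sources from lowest to highest priority. A minimal Python sketch follows, assuming each source has already been parsed into a mapping with a common key per option. The function name and that common keying are assumptions made for this example.

```python
from typing import Dict, Mapping, Optional


def resolve_options(
    dotenv: Mapping[str, Optional[str]],
    dotenv_local: Mapping[str, Optional[str]],
    environment: Mapping[str, Optional[str]],
    command_line: Mapping[str, Optional[str]],
) -> Dict[str, str]:
    """Merge option sources; later sources overwrite earlier ones."""
    resolved: Dict[str, str] = {}
    for source in (dotenv, dotenv_local, environment, command_line):
        # Unset values (None) do not overwrite a value from an earlier source.
        resolved.update({k: v for k, v in source.items() if v is not None})
    return resolved
```

For example, a RAZEL_CACHE_DIR value from .env is replaced by one from .env.local, which in turn is replaced by a value given on the command line, while options set in only one source pass through unchanged.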
Acknowledgements
The idea to build fast and correct is based on Bazel.