a command executor with caching for data processing pipelines

Razel

A command executor with caching. It is:

  • Fast: commands are executed in parallel on multiple threads, and local caching speeds up trial-and-error development by avoiding repeated execution of commands that have been processed before
  • Scalable: optional remote caching allows sharing results between CI jobs
  • Reliable: commands are executed in a sandbox to detect missing dependencies
  • Easy to use: commands are specified using a high-level TypeScript or Python API and convenience functions/tasks are built-in
  • Made for: data processing pipelines with executables working on files and many dependencies between those

Razel is not the best choice for building software; in particular, there is no built-in support for compiler setup and header dependencies.

Getting Started

The native input format for Razel is a razel.jsonl file, see the example examples/razel.jsonl. It can be run with razel exec -f examples/razel.jsonl.

The preferred way is to use one of the high-level APIs. Both allow specifying the commands in an object-oriented style and provide a run() function which creates the razel.jsonl file, downloads the native razel binary and uses it to execute the commands.

Paths of input files are relative to the workspace (directory of razel.jsonl). Output files are created in <cwd>/razel-out. Additional metadata is written to <cwd>/razel-out/razel-metadata.

TypeScript API

Install Deno to use the TypeScript API. Run the example Deno script:

deno run -A --check examples/deno.ts -- -v

Python API

The Python API requires Python >= 3.8. Install the package and run the example Python script:

pip install --upgrade razel
python examples/python.py -v

Batch file (experimental)

In addition to razel.jsonl, Razel can directly execute a batch file containing commands. Input and output files need to be specified; support for this is still a work in progress.

Execute the example examples/batch.sh with Razel:

razel exec -f examples/batch.sh

Running in Docker/Podman container

The workspace directory can be mounted into a container:

podman run -t -v $PWD:$PWD -w $PWD denoland/deno deno run -A examples/deno.ts

Building Razel from source

Use rustup to install Rust. Install protobuf-compiler. Then run cargo install --locked razel.

Project Status

Razel is in active development and used in production.

The CLI and the format of razel.jsonl will likely change, as will the output in razel-out/razel-metadata. While Linux is the main development platform, Razel is also tested on Mac and Windows.

Features

Measurements

Razel parses the stdout of executed commands to capture runtime measurements and writes them to razel-out/razel-metadata/log.json and razel-out/razel-metadata/measurements.csv. Currently, the <CTestMeasurement> and <DartMeasurement> tags as used by CTest/CDash are supported:

<CTestMeasurement type="numeric/double" name="score">12.3</CTestMeasurement>
<CTestMeasurement type="text/string" name="result">ok</CTestMeasurement>

Supporting custom formats is planned.
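
As an illustration of the measurement format above, here is a minimal Python sketch that extracts such tags from captured stdout. It is not Razel's implementation: the helper name parse_measurements is hypothetical, and the sketch assumes the type attribute comes before the name attribute, as in the example lines.

```python
import re

# Matches <CTestMeasurement ...>value</CTestMeasurement> and the equivalent
# <DartMeasurement> tag. Assumption: attributes appear as type, then name.
MEASUREMENT_RE = re.compile(
    r'<(CTestMeasurement|DartMeasurement)\s+type="([^"]+)"\s+name="([^"]+)">'
    r"(.*?)</\1>"
)


def parse_measurements(stdout: str) -> dict:
    """Return a name -> value mapping for all measurement tags in stdout."""
    measurements = {}
    for _tag, mtype, name, value in MEASUREMENT_RE.findall(stdout):
        # Numeric types are converted to float, everything else stays text.
        measurements[name] = float(value) if mtype.startswith("numeric/") else value
    return measurements


stdout = """processing input...
<CTestMeasurement type="numeric/double" name="score">12.3</CTestMeasurement>
<CTestMeasurement type="text/string" name="result">ok</CTestMeasurement>
"""
print(parse_measurements(stdout))  # {'score': 12.3, 'result': 'ok'}
```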

Tags

Tags can be set on commands. Any custom string can be used as a tag; a colon should be used for grouping. The tags are added to razel-out/razel-metadata/execution_times.json. Using tags for filtering commands and creating reports is planned.

Tags with the razel: prefix are reserved and have special meaning:

  • razel:quiet: don't be verbose if command succeeded
  • razel:verbose: always show verbose output
  • razel:condition: keep running and don't be verbose if command failed
  • razel:timeout:<seconds>: kill command after the specified number of seconds
  • razel:no-cache: always execute a command without caching
  • razel:no-remote-cache: don't use remote cache
  • razel:no-sandbox: disable sandbox and also cache - for commands with unspecified input/output files
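
To make the tag convention concrete, the following Python sketch separates reserved razel: tags from custom ones and extracts the timeout value. The function name parse_reserved_tags and the returned structure are hypothetical and only illustrate the naming scheme listed above; Razel's own handling may differ.

```python
RESERVED_PREFIX = "razel:"
TIMEOUT_PREFIX = "timeout:"


def parse_reserved_tags(tags):
    """Split tags into reserved razel: options and custom tags."""
    options = {"timeout": None, "flags": set()}
    custom = []
    for tag in tags:
        if not tag.startswith(RESERVED_PREFIX):
            custom.append(tag)
            continue
        rest = tag[len(RESERVED_PREFIX):]
        if rest.startswith(TIMEOUT_PREFIX):
            # razel:timeout:<seconds>
            options["timeout"] = float(rest[len(TIMEOUT_PREFIX):])
        else:
            # e.g. quiet, verbose, condition, no-cache, no-remote-cache, no-sandbox
            options["flags"].add(rest)
    return options, custom


options, custom = parse_reserved_tags(
    ["razel:timeout:30", "razel:quiet", "report:nightly"]
)
print(options["timeout"], sorted(options["flags"]), custom)
```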

Conditional execution / Skipping commands

Commands can be skipped based on the execution result of another command. Set the razel:condition tag on a command and use that one as dependency for other commands.

WebAssembly

Razel has a WebAssembly runtime integrated and can directly execute WASM modules using WebAssembly System Interface (WASI).

WebAssembly is a perfect fit to create portable data processing pipelines with Razel. Just a single WebAssembly module is needed to run - and create bit-exact output - on all platforms. WebAssembly execution is slower than native binaries, but startup time might be faster (no process overhead).

Param/Response files

Commands with a huge number of arguments might result in command lines which are too long to be executed by the OS. Razel detects those cases and replaces the arguments with a response file. The filename starts with @.
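
The general response-file technique can be sketched in Python as follows. This is an illustration under stated assumptions, not Razel's code: the length limit MAX_CMDLINE_LEN, the one-argument-per-line file layout, the .params suffix and the function name are all hypothetical, and the executed tool must itself understand @file arguments.

```python
import tempfile

# Assumed limit for illustration; real limits depend on the OS.
MAX_CMDLINE_LEN = 8000


def maybe_use_response_file(executable, args, max_len=MAX_CMDLINE_LEN):
    """Return the argv to execute, moving args into a response file if too long."""
    cmdline_len = len(executable) + sum(len(a) + 1 for a in args)
    if cmdline_len <= max_len:
        return [executable, *args]
    # Write one argument per line (assumed layout) and pass the file as @<path>.
    with tempfile.NamedTemporaryFile("w", suffix=".params", delete=False) as f:
        f.write("\n".join(args))
    return [executable, "@" + f.name]


short_cmd = maybe_use_response_file("tool", ["a", "b"])
long_args = [f"file{i}.csv" for i in range(2000)]
long_cmd = maybe_use_response_file("tool", long_args)
print(short_cmd)
print(len(long_cmd), long_cmd[1].startswith("@"))
```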

Out of memory (OOM) handling

If a process is killed by the OS, the command and similar ones will be retried with less concurrency to reduce the total memory usage. (Doesn't work in K8s because the whole pod is killed.)

Sandbox

Commands are executed in a temporary directory which contains symlinks to the input files specific to one command. This allows detecting unspecified dependencies which would break caching.

The sandbox is not meant for executing untrusted code.
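
The symlink-based sandbox described above can be approximated with a few lines of Python. This is a minimal sketch of the idea, not Razel's implementation; run_in_sandbox and its parameters are hypothetical names, and creating symlinks may require extra privileges on Windows.

```python
import os
import subprocess
import sys
import tempfile


def run_in_sandbox(argv, inputs, workspace):
    """Run argv in a temporary directory that only contains the declared inputs."""
    with tempfile.TemporaryDirectory() as sandbox:
        for rel in inputs:
            link = os.path.join(sandbox, rel)
            os.makedirs(os.path.dirname(link), exist_ok=True)
            # Symlink each declared input from the workspace into the sandbox.
            os.symlink(os.path.abspath(os.path.join(workspace, rel)), link)
        return subprocess.run(argv, cwd=sandbox, capture_output=True, text=True)


# Demo: the workspace has two files, but only one is declared as input.
with tempfile.TemporaryDirectory() as workspace:
    for name in ("declared.txt", "undeclared.txt"):
        with open(os.path.join(workspace, name), "w") as f:
            f.write("data")
    check = (
        "import os; "
        "print(os.path.exists('declared.txt'), os.path.exists('undeclared.txt'))"
    )
    result = run_in_sandbox([sys.executable, "-c", check], ["declared.txt"], workspace)

print(result.stdout.strip())
```

In this demo the command should see declared.txt but not undeclared.txt, which is how a missing dependency declaration surfaces as a failure instead of silently breaking caching.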

Local Caching

The local cache is enabled by default and stores information about previously executed commands and output files. The output directory razel-out contains symlinks to files stored in the local cache.

Use razel exec --info to get the default cache directory and --cache-dir (env: RAZEL_CACHE_DIR) to move it.

Remote Caching

Razel supports remote caching compatible with the Bazel Remote Execution API. Remote execution is not yet implemented.

Use --remote-cache (env: RAZEL_REMOTE_CACHE) to specify a comma-separated list of remote cache URLs. The first available one will be used. Optionally --remote-cache-threshold (REMOTE_CACHE_THRESHOLD) can be set to only cache commands with outputSize / execTime < threshold [kilobyte / s]. If your remote cache doesn't have unlimited storage capacity, this can drastically speed up execution because quick commands with large output files will no longer be cached, providing more storage for expensive commands.
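
The threshold rule can be written out as a small Python predicate. This is only a sketch of the formula stated above; the function name is hypothetical, and it assumes 1 kilobyte = 1000 bytes and that commands with zero measured execution time are not cached remotely.

```python
def should_cache_remotely(output_size_bytes, exec_time_s, threshold_kb_per_s):
    """Apply outputSize / execTime < threshold [kilobyte / s]."""
    if exec_time_s <= 0:
        # Assumption: treat an unmeasurably quick command as too cheap to cache.
        return False
    output_size_kb = output_size_bytes / 1000  # assumption: 1 kB = 1000 bytes
    return output_size_kb / exec_time_s < threshold_kb_per_s


# Quick command with large output: 500 kB in 0.1 s = 5000 kB/s, not cached.
print(should_cache_remotely(500_000, 0.1, threshold_kb_per_s=100))  # False
# Expensive command with small output: 10 kB in 60 s, cached.
print(should_cache_remotely(10_000, 60, threshold_kb_per_s=100))  # True
```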

The following remote cache implementations are tested with Razel:

  • bazel-remote-cache
    • run with podman run -p 9092:9092 buchgr/bazel-remote-cache --max_size 10
    • call razel with RAZEL_REMOTE_CACHE=grpc://localhost:9092
  • nativelink
    • run with instance_name main on port 50051:
      mkdir -p nativelink-config
      curl https://raw.githubusercontent.com/TraceMachina/nativelink/main/nativelink-config/examples/basic_cas.json --output nativelink-config/basic_cas.json
      podman run -p 50051:50051 -v $PWD/nativelink-config:/nativelink-config:ro ghcr.io/tracemachina/nativelink:v0.2.0 /nativelink-config/basic_cas.json
      
    • call razel with RAZEL_REMOTE_CACHE=grpc://localhost:50051/main

Configuration

Use razel exec -h to list the configuration options for execution. Some options can also be set as environment variables and those are loaded from .env files.

The following sources are used in order, overwriting previous values:

  • .env file in current directory or its parents
  • .env.local file in current directory or its parents
  • environment variable
  • command line option
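
The precedence order listed above amounts to merging the sources from lowest to highest priority. The Python sketch below illustrates that order only; resolve_config is a hypothetical helper, the sources are plain dictionaries rather than parsed files, and the values are made up for the example.

```python
def resolve_config(dotenv, dotenv_local, environ, cli):
    """Merge config sources; later sources overwrite earlier ones."""
    resolved = {}
    for source in (dotenv, dotenv_local, environ, cli):
        # Skip unset values so they do not overwrite earlier sources.
        resolved.update({k: v for k, v in source.items() if v is not None})
    return resolved


config = resolve_config(
    dotenv={"RAZEL_CACHE_DIR": "/data/cache", "RAZEL_REMOTE_CACHE": "grpc://ci:9092"},
    dotenv_local={"RAZEL_CACHE_DIR": "/tmp/my-cache"},
    environ={"RAZEL_REMOTE_CACHE": "grpc://localhost:9092"},
    cli={"RAZEL_CACHE_DIR": None},  # option not given on the command line
)
print(config)
```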

Acknowledgements

The idea to build fast and correct is based on Bazel.
