Run R scripts with pytask.

These details have not been verified by PyPI

Project links

Development Status
- 4 - Beta
License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language

Project description

pytask-r

Run R scripts with pytask.

Installation

pytask-r is available on PyPI and Anaconda.org. Install it with

$ pip install pytask-r

# or

$ conda install -c conda-forge pytask-r

You also need to have R installed and Rscript available on your PATH. Test it by typing the following on the command line

Rscript --help

If an error is shown instead of a help page, you can install R with conda.

conda install -c conda-forge r-base

Or install R from the official R Project website.

Usage

To create a task that runs an R script, define a task function with the @mark.r decorator. The script keyword provides an absolute path or a path relative to the task module.

from pathlib import Path
from pytask import mark


@mark.r(script=Path("script.r"))
def task_run_r_script(produces: Path = Path("out.rds")):
    pass

If you are wondering why the function body is empty, know that pytask-r replaces the body with a predefined internal function. See the section on implementation details for more information.

Dependencies and Products

Dependencies and products can be added as usual. See this tutorial for some help.

Accessing dependencies and products in the script

To access the paths of dependencies and products in the script, pytask-r stores the information by default in a .json file. The path to this file is passed as a positional argument to the script. Inside the script, you can read the information.

library(jsonlite)

args <- commandArgs(trailingOnly=TRUE)

path_to_json <- args[length(args)]

config <- read_json(path_to_json)

config$produces  # Is the path to the output file (out.rds in the task above).

The .json file is stored in the same folder as the task in a .pytask directory.

To parse the JSON file, you need to install jsonlite.

You can also pass any other information to your script by using the @task decorator.

from pathlib import Path
from pytask import mark, task


@task(kwargs={"number": 1})
@mark.r(script=Path("script.r"))
def task_run_r_script(produces: Path = Path("out.rds")):
    pass

and inside the script use

config$number  # Is 1.
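The handoff described above can be pictured with a small Python sketch. It is not pytask-r's actual implementation: the helper name write_task_config and the file name task_example.json are hypothetical, and the sketch only assumes what the text states, namely that the task's arguments end up in a .json file inside a .pytask directory whose path is the last argument passed to the script.

```python
import json
import tempfile
from pathlib import Path


def write_task_config(directory: Path, name: str, kwargs: dict) -> Path:
    """Write kwargs to <directory>/.pytask/<name>.json (hypothetical helper)."""
    hidden = directory / ".pytask"
    hidden.mkdir(parents=True, exist_ok=True)
    path = hidden / f"{name}.json"
    # Path objects are not JSON-serializable, so fall back to str() for them.
    path.write_text(json.dumps(kwargs, default=str))
    return path


with tempfile.TemporaryDirectory() as tmp:
    config_path = write_task_config(
        Path(tmp), "task_example", {"produces": Path("out.rds"), "number": 1}
    )
    # The R script would receive config_path as its last argument and
    # read the same content with jsonlite's read_json().
    config = json.loads(config_path.read_text())

print(config)
```

Reading the file back in Python like this is also a quick way to check what a script will see before running it.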

Debugging

In case a task throws an error, you might want to execute the script independently from pytask. After a failed execution, you see the command that executed the R script in the report of the task. It looks roughly like this

Rscript <options> script.r <path-to>/.pytask/task_py_task_example.json
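If you prefer to rerun the reported command from Python, for example to capture the error output of R, a minimal sketch could look like the following. The helper names build_command and rerun are hypothetical, and the path to the .json file is a placeholder; copy the real one from the task report.

```python
import shlex
import subprocess
from typing import Sequence


def build_command(script: str, config_path: str, options: Sequence[str] = ()) -> list[str]:
    """Assemble the arguments in the order shown in the task report."""
    return ["Rscript", *options, script, config_path]


def rerun(command: Sequence[str]) -> subprocess.CompletedProcess:
    """Run the command and capture stdout and stderr as text."""
    return subprocess.run(list(command), capture_output=True, text=True, check=False)


# Placeholder path: replace it with the one printed in the report.
command = build_command("script.r", ".pytask/task_py_task_example.json", ["--vanilla"])
print(shlex.join(command))

# Uncomment to execute the script; this requires Rscript on your PATH.
# result = rerun(command)
# print(result.returncode)
# print(result.stderr)
```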

Command Line Arguments

The decorator can be used to pass command line arguments to Rscript. See the following example.

@mark.r(script=Path("script.r"), options="--vanilla")
def task_run_r_script(produces: Path = Path("out.rds")):
    pass

Repeating tasks with different scripts or inputs

You can also repeat the execution of tasks, meaning executing multiple R scripts or passing different command line arguments to the same R script.

The following task executes two R scripts, script_0.r and script_1.r, which produce different outputs.

from pathlib import Path
from pytask import mark, task


for i in range(2):

    @task
    @mark.r(script=Path(f"script_{i}.r"))
    def task_execute_r_script(produces: Path = Path(f"out_{i}.csv")):
        pass

If you want to pass different inputs to the same R script, pass these arguments with the kwargs keyword of the @task decorator.

from pathlib import Path
from pytask import mark, task


for i in range(2):

    @task(kwargs={"i": i})
    @mark.r(script=Path("script.r"))
    def task_execute_r_script(produces: Path = Path(f"output_{i}.csv")):
        pass

and inside the script access the argument i with

library(jsonlite)

args <- commandArgs(trailingOnly=TRUE)

path_to_json <- args[length(args)]

config <- read_json(path_to_json)

config$produces  # Is the path to the output file "../output_{i}.csv".

config$i  # Is the number.
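The loops in this section work because Python evaluates default arguments when a function is defined, so each iteration's function keeps its own value of produces even though the function name is reused. The following sketch shows this with plain functions and without pytask; the list tasks is only there for illustration.

```python
import inspect
from pathlib import Path

tasks = []

for i in range(2):

    # The default is computed now, while i has its current value.
    def task_execute_r_script(produces: Path = Path(f"out_{i}.csv")):
        pass

    tasks.append(task_execute_r_script)

# Collect the file name stored as the default of each function.
names = [
    inspect.signature(function).parameters["produces"].default.name
    for function in tasks
]
print(names)
```

Values that are not default arguments, such as i itself, are not captured this way, which is one reason to pass them explicitly with the kwargs keyword of the @task decorator.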

Serializers

You can also serialize your data with any other tool you like. By default, pytask-r also supports YAML (if PyYAML is installed).

Use the serializer keyword argument of the @mark.r decorator with

@mark.r(script=Path("script.r"), serializer="yaml")
def task_example(): ...

And, in your R script use

library(yaml)
args <- commandArgs(trailingOnly=TRUE)
config <- read_yaml(args[length(args)])

Note that the yaml package for R needs to be installed.

If you need a custom serializer, you can also provide any callable serializer which transforms data into a string. Use suffix to set the correct file ending.

Here is a replication of the JSON example.

import json
from pathlib import Path
from pytask import mark


@mark.r(script=Path("script.r"), serializer=json.dumps, suffix=".json")
def task_example(): ...
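A custom serializer only has to be a callable that returns a string, so it can also wrap an existing one. The sketch below is an illustration with a hypothetical function name, serialize_pretty_json. Whether pytask-r hands paths to the serializer as strings or as Path objects is not stated here, so the sketch uses default=str, which handles both cases.

```python
import json
from pathlib import Path
from typing import Any


def serialize_pretty_json(data: Any) -> str:
    """Return data as indented JSON with sorted keys (hypothetical helper)."""
    # default=str converts objects json cannot encode, such as Path, to strings.
    return json.dumps(data, indent=2, sort_keys=True, default=str)


example = serialize_pretty_json({"produces": Path("out.rds"), "number": 1})
print(example)

# Possible use with the decorator from the examples above:
# @mark.r(script=Path("script.r"), serializer=serialize_pretty_json, suffix=".json")
# def task_example(): ...
```

Sorted keys and indentation make the file in the .pytask directory easier to read when you debug a script by hand.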

Configuration

You can influence the default behavior of pytask-r with configuration values.

r_serializer

Use this option to change the default serializer.

[tool.pytask.ini_options]
r_serializer = "json"

r_suffix

Use this option to set the default suffix of the file which contains serialized paths to dependencies and products and more.

[tool.pytask.ini_options]
r_suffix = ".json"

r_options

Use this option to set default options for each task. In pyproject.toml, the options are given as a list of strings, as in the following example.

[tool.pytask.ini_options]
r_options = ["--vanilla"]

Changes

Consult the release notes to find out about what is new.

Release history

- 0.4.1 (Apr 20, 2024), this version
- 0.4.0 (Oct 7, 2023)
- 0.3.0 (Jan 23, 2023)
- 0.2.0 (Apr 16, 2022)
- 0.1.1 (Feb 7, 2022)
- 0.1.0 (Jul 22, 2021)
- 0.0.9 (Mar 5, 2021)
- 0.0.8 (Mar 3, 2021)
- 0.0.7 (Feb 25, 2021)

Download files

Source Distribution

pytask_r-0.4.1.tar.gz (8.6 kB)

Uploaded Apr 20, 2024 Source

Built Distribution

pytask_r-0.4.1-py3-none-any.whl (11.0 kB)

Uploaded Apr 20, 2024 Python 3

File details

Details for the file pytask_r-0.4.1.tar.gz.

File metadata

Download URL: pytask_r-0.4.1.tar.gz
Upload date: Apr 20, 2024
Size: 8.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for pytask_r-0.4.1.tar.gz
Algorithm	Hash digest
SHA256	`a792ee89616d12fe388c916e2b6e15cb8f4933a6dc55062d5cfff37729f0c8e5`
MD5	`4ad3c04af564f30bcce29d4281970c44`
BLAKE2b-256	`1b7215c46313d5637b8444c617ff0ab2794367881ecd37d3741816002e480b1d`

File details

Details for the file pytask_r-0.4.1-py3-none-any.whl.

File metadata

Download URL: pytask_r-0.4.1-py3-none-any.whl
Upload date: Apr 20, 2024
Size: 11.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for pytask_r-0.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c0f09ebfd573dbdefb76828eaedf8d2b5e93031b5f7768437fc94cb45018ebc3`
MD5	`86e801b337f0c5a51aead940cee12003`
BLAKE2b-256	`7914c37001727d7bac514c274e6a5b869151546117b73ac6dd10c15bd7ad1bb9`
