pytask-stata

Execute do-files with Stata and pytask.

Development Status
- 3 - Alpha
Operating System
- OS Independent

Project description

pytask-stata

Run Stata’s do-files with pytask.

Installation

pytask-stata is available on PyPI and Anaconda.org. Install it with

$ pip install pytask-stata

# or

$ conda config --add channels conda-forge --add channels pytask
$ conda install pytask-stata

You also need Stata installed on your system, with its executable available on your system’s PATH.
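To check whether a Stata executable can be found on your PATH, a minimal sketch like the following may help. The candidate names are assumptions for illustration; the actual executable name depends on your Stata edition and operating system, and this is not necessarily how pytask-stata searches for Stata.

```python
import shutil

# Hypothetical candidate names; adjust them to your Stata edition and OS.
CANDIDATE_NAMES = ["stata", "stata-mp", "stata-se", "StataMP-64", "StataSE-64"]


def find_stata_executable(candidates=CANDIDATE_NAMES):
    """Return the full path of the first candidate found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    print(find_stata_executable() or "No Stata executable found on PATH.")
```

If the function returns None, the directory containing your Stata executable is probably missing from PATH.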

Usage

As with normal task functions that execute Python code, you define tasks that execute Stata scripts as Python functions. The difference is that the function body contains no logic; instead, the decorator tells pytask how to handle the task.

Here is an example where you want to run script.do.

import pytask


@pytask.mark.stata
@pytask.mark.depends_on("script.do")
@pytask.mark.produces("auto.dta")
def task_run_do_file():
    pass

When executing a do-file, the current working directory changes to the directory of the script which is executed.
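The effect of this directory switch can be illustrated with a small sketch. The context manager below is a hypothetical helper, not part of pytask-stata; it only mimics the described behavior of running code from the script's directory and restoring the previous directory afterwards.

```python
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_directory(script):
    """Temporarily switch to the directory that contains ``script``."""
    previous = Path.cwd()
    os.chdir(Path(script).resolve().parent)
    try:
        yield
    finally:
        # Always restore the previous directory, even if an error occurred.
        os.chdir(previous)
```

Inside `with working_directory("path/to/script.do"):`, a relative path such as `auto.dta` resolves next to the do-file rather than relative to the directory from which pytask was started.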

Multiple dependencies and products

What happens if a task has more than one dependency? If you pass the dependencies as a list, the do-file to execute must be in the first position of the list.

@pytask.mark.stata
@pytask.mark.depends_on(["script.do", "input.dta"])
@pytask.mark.produces("output.dta")
def task_run_do_file():
    pass

If you use a dictionary to pass dependencies to the task, pytask-stata first looks for a "source" key in the dictionary and, if there is none, for the key 0.

@pytask.mark.stata
@pytask.mark.depends_on({"source": "script.do", "input": "input.dta"})
def task_run_do_file():
    pass


# or


@pytask.mark.stata
@pytask.mark.depends_on({0: "script.do", "input": "input.dta"})
def task_run_do_file():
    pass


# or two decorators for the function, if you do not assign a name to the input.


@pytask.mark.stata
@pytask.mark.depends_on({"source": "script.do"})
@pytask.mark.depends_on("input.dta")
def task_run_do_file():
    pass
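The lookup order described above can be sketched as follows. The function `find_do_file` and its `source_key` parameter are hypothetical names chosen for illustration; this is not pytask-stata's internal code, only a sketch of the documented rule.

```python
def find_do_file(depends_on, source_key="source"):
    """Pick the do-file from a single path, a list, or a dictionary."""
    if isinstance(depends_on, (list, tuple)):
        # For a list, the do-file is expected in the first position.
        return depends_on[0]
    if isinstance(depends_on, dict):
        # First look under the source key, then under the key 0.
        if source_key in depends_on:
            return depends_on[source_key]
        if 0 in depends_on:
            return depends_on[0]
        raise KeyError(f"No do-file found under {source_key!r} or 0.")
    # A single dependency is taken to be the do-file itself.
    return depends_on
```

Checking the source key before the key 0 matters whenever both keys are present, because the named entry then takes precedence over the positional one.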

Command Line Arguments

The @pytask.mark.stata decorator can be used to pass command line arguments to your Stata executable. For example, pass the path of the product with

@pytask.mark.stata("auto.dta")
@pytask.mark.depends_on("script.do")
@pytask.mark.produces("auto.dta")
def task_run_do_file():
    pass

In your script.do, you can read the value with

* Intercept command line argument and save to macro named 'produces'.
args produces

sysuse auto, clear
save "`produces'"

The relative path inside the do-file works only because pytask-stata switches the current working directory to the directory of the do-file before the task is executed. This is a necessary precaution.

To make the task independent of the current working directory, pass the full path as a command line argument. Here is an example.

# Absolute path to the build directory.
from src.config import BLD


@pytask.mark.stata(BLD / "auto.dta")
@pytask.mark.depends_on("script.do")
@pytask.mark.produces(BLD / "auto.dta")
def task_run_do_file():
    pass
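The example imports BLD from a src.config module that is not shown. A minimal sketch of how such an absolute build path could be defined is given below; the helper `build_directory` and the folder name `bld` are assumptions for illustration, not something prescribed by pytask-stata.

```python
from pathlib import Path


def build_directory(project_root):
    """Return the absolute path of a ``bld`` folder below ``project_root``."""
    return Path(project_root).resolve() / "bld"


# In a hypothetical src/config.py, the project root could instead be derived
# from the module location, for example Path(__file__).parent.parent.
BLD = build_directory(Path.cwd())
```

Because BLD is absolute, BLD / "auto.dta" refers to the same file no matter which directory is current when the do-file runs.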

Parametrization

You can also parametrize the execution of scripts, that is, execute multiple do-files or pass different command line arguments to the same do-file.

The following task executes two do-files which produce different outputs.

@pytask.mark.stata
@pytask.mark.parametrize(
    "depends_on, produces", [("script_1.do", "1.dta"), ("script_2.do", "2.dta")]
)
def task_execute_do_file():
    pass

If you want to pass different command line arguments to the same do-file, you have to include the @pytask.mark.stata decorator in the parametrization just like with @pytask.mark.depends_on and @pytask.mark.produces.

@pytask.mark.depends_on("script.do")
@pytask.mark.parametrize(
    "produces, stata",
    [("output_1.dta", ("1",)), ("output_2.dta", ("2",))],
)
def task_execute_do_file():
    pass
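For more than a handful of cases, the list of parameter tuples can be generated instead of written by hand. The following is a small sketch; the helper `make_parametrization` is a hypothetical name, and the output naming scheme simply mirrors the example above.

```python
def make_parametrization(ids):
    """Build (produces, stata) pairs, one per id.

    Each pair holds the product path and a tuple with the command line
    arguments passed to the do-file.
    """
    return [(f"output_{i}.dta", (str(i),)) for i in ids]


PARAMETRIZATION = make_parametrization([1, 2])
```

The resulting list can be passed as the second argument of @pytask.mark.parametrize("produces, stata", PARAMETRIZATION).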

Configuration

pytask-stata can be configured with the following options.

stata_keep_log

Use this option to keep the .log files which are produced for every task. This option is useful to debug Stata tasks. Set the option via the configuration file with

stata_keep_log = (True|true|1|False|false|0)

The option is also available in the command line interface via the --stata-keep-log flag.
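The accepted values listed above map onto a boolean. A minimal sketch of such a conversion follows; `parse_bool` is a hypothetical helper for illustration, and the plugin's actual parsing may differ.

```python
TRUTHY = {"True", "true", "1"}
FALSY = {"False", "false", "0"}


def parse_bool(value):
    """Convert one of the documented option values to a boolean."""
    text = str(value).strip()
    if text in TRUTHY:
        return True
    if text in FALSY:
        return False
    raise ValueError(f"Expected one of {sorted(TRUTHY | FALSY)}, got {value!r}.")
```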

stata_check_log_lines

Use this option to vary the number of lines in the log file which are checked for error codes. It also controls the number of lines displayed on errors. Use any integer greater than zero. Here is the entry in the configuration file

stata_check_log_lines = 10

and here via the command line interface

$ pytask build --stata-check-log-lines 10
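To illustrate what checking the last lines of a log for error codes might look like, here is a hedged sketch. It assumes that Stata reports a failure with a return code line such as `r(111);`. The function name and the regular expression are illustrative assumptions and may differ from what pytask-stata actually does.

```python
import re

# Matches a Stata return code line such as "r(111);".
ERROR_CODE = re.compile(r"^r\(([0-9]+)\);?$")


def find_error_code(log_text, n_lines=10):
    """Return the first return code found in the last ``n_lines``, else None."""
    if n_lines < 1:
        raise ValueError("n_lines must be an integer greater than zero.")
    tail = log_text.splitlines()[-n_lines:]
    for line in tail:
        match = ERROR_CODE.match(line.strip())
        if match:
            return int(match.group(1))
    return None
```

With a small value, an error code followed by many trailing lines can be missed, which is one reason to increase the number of checked lines.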

stata_source_key

If you want to change the name of the key which identifies the do-file, change the following default configuration in your pytask configuration file.

stata_source_key = source

Changes

Consult the release notes to find out about what is new.

Release history

- 0.3.0: Jan 23, 2023
- 0.2.0: Apr 16, 2022
- 0.1.2: Feb 7, 2022
- 0.1.1: Feb 7, 2022
- 0.1.0: Jul 21, 2021
- 0.0.6: Mar 4, 2021
- 0.0.5: Mar 4, 2021
- 0.0.4 (this version): Feb 25, 2021
