Jupyter kernel for Stata based on pystata
Project description
nbstata
What is Jupyter?
JupyterLab is a browser-based editor that allows you to combine interactive code and results with Markdown in a single document (called a Jupyter notebook). It is open source and widely used. Though it is named after the three core programming languages it supports (Julia, Python, and R), it can be used with with a wide variety of languages.
nbstata
allows you to create Stata notebooks (as opposed to using
Stata within a Python notebook, which is a nice way to embed Stata
commands within Python
code but
is needlessly clunky if you are working primarily with Stata).
nbstata
features
- Works with Stata 17 (only).
- Can display output from Stata code without an “echo” of multi-line commands
- Autocompletion for variables, macros, matrices, and file paths.
- DataGrid widget with
browse
-like capabilities (e.g., interactive filtering). - Variable and data properties (
describe
ande
/return list
) available in a “contextual help” side panel. - Interactive help files available within notebook.
-
#delimit ;
interactive support (along with all types of comments). - Mata interactive support.
What do Stata notebooks allow that the official Stata IDE doesn’t?
- Exploratory analysis that is both:
- interactive
- preserved for future reference/editing
- Presenting results in a way that interweaves:
- code
- results (including graphs)
- rich text:
- lists
- Headings
- links
- math: $y_{it}=\beta_0+\varepsilon_{it}$
Install
Because it uses pystata under
the hood, nbstata
requires Stata 17 to be installed locally. (If you
have an older version of Stata, consider
stata_kernel instead.)
To install nbstata
, run:
pip install nbstata
python -m nbstata.install [--sys-prefix] [--prefix PREFIX] [--conf-file]
Include --sys-prefix
to install to sys.prefix
(e.g. a virtualenv or
conda env), or --prefix PREFIX
if you want to specify the install path
yourself.
Configuration file
The --conf-file
option creates a configuration file for you. (Note: If
the installer cannot find the location of your Stata installation, a
configuration file will be created even if you do not include the
--conf-file
option to allow you to manually specify the Stata
location.) The location of the configuration file will be:
[prefix]/etc/nbstata.conf
if--sys-prefix
or--prefix
is specified.~/.nbstata.conf
otherwise.
(Note: If a configuration file exists in both locations at kernel runtime, the user version takes precedence.)
Updating
To update from a previous version of nbstata
, run:
pip install nbstata --upgrade
When updating, you don’t have to run python -m nbstata.install
again.
Syntax highlighting
Stata syntax highlighting can be installed for Jupyter Lab:
pip install jupyterlab_stata_highlight2
(If you prefer the standard Jupyter color scheme, the original jupyterlab-stata-highlight also works.)
Configuration
The following settings are permitted inside the configuration file:
stata_dir
: Stata installation directory.edition
: Stata edition. Acceptable values are ‘be’, ‘se’ and ‘mp’. Default is ‘be’.graph_format
: Acceptable values are ‘png’ (the default), ‘pdf’, ‘svg’ and ‘pystata’. Specify the last option if you want to usepystata
’s default setting.echo
: controls the echo of commands, with the default being ‘None’:- ‘True’: the kernel will echo all commands.
- ‘False’: the kernel will not echo single-line commands.
- ‘None’: the kernel will not echo any command.
splash
: controls display of the splash message during Stata startup. Default is ‘False’.missing
: What should be displayed in the output of the*%head
and*%tail
magics for a missing value. Default is ‘.’, following Stata. To defer to pandas’ format forNA
, specify ‘pandas’.
Settings must be under the title [nbstata]
. Example:
[nbstata]
stata_dir = /opt/stata
edition = mp
graph_format = svg
echo = False
splash = True
missing = NA
Default Graph Format
Both pystata
and stata_kernel
default to the SVG image format.
nbstata
defaults to the PNG image format instead for several reasons:
- Jupyter does not show SVG images from untrusted notebooks (link 1).
- Notebooks with empty cells are untrusted (link 2).
- SVG images cannot be copied and pasted directly into Word or PowerPoint.
Magics
Magics are commands that only work in nbstata
and are not part of
Stata’s syntax. Magics normally start with %
, but this will cause
errors when the notebook is exported and run as a Stata script. As an
alternative, you may prefix the magic name with *%
, which will then be
treated by Stata as a single-line comment.
nbstata
currently supports the following magics:
Magic | Description | Full Syntax |
---|---|---|
*%browse | Interactively view dataset | *%browse [-h] [varlist] [if] [in] [, nolabel noformat] |
*%head | View first 5 (or N) rows | *%head [-h] [N] [varlist] [if] [, nolabel noformat] |
*%tail | View last 5 (or N) rows | *%tail [-h] [N] [varlist] [if] [, nolabel noformat] |
*%locals | List locals with their values | *%locals |
*%delimit | Print the current delimiter | *%delimit |
*%help | Display Stata help | *%help [-h] command_or_topic_name |
*%echo | Ensure echo from cell | *%echo |
*%noecho | Suppress echo from cell | *%noecho |
*%quietly | Suppress all output from cell | *%quietly |
The %browse
magic requires JupyterLab with the
@finos/perspective-jupyterlab
extension correctly
installed.
By default, the %browse
, %head
, and %tail
magics convert numeric
Stata values to strings using their Stata format (or value labels). To
prevent this behavior, specify the noformat
and/or nolabel
options.
Stata Implementation Details
#delimit
behavior
A #delimit;
command in
one cell will persist into other cells, until #delimit cr
is called.
For example, see delimit
tests.ipynb.
echo = None
: potential for unanticipated errors
The default echo = None
configuration does some complicated things
under the hood to emulate functionality that pystata
does not directly
support: running multi-line Stata code without echoing the commands.
While extensive automatic tests are in place to help ensure its
reliability, unanticipated issues may arise. If, while using this mode,
a particular code cell is not working as expected, try placing the
%noecho
magic at the top of it to see if that resolves the issue. (If
so, please report that
here.) You
can also avoid such potential issues by setting the config
echo = False
, which will at least not echo single-line Stata commands
though it will echo multiple commands.
Contributing
nbstata
is being developed using
nbdev.
The /nbs
directory is where edits to the source code should be made.
(The python code is then exported to the /nbdev
library folder.) The
one exception is install.py
.
The @patch_to decorator is occasionally used to break up class definitions into separate cells.
For more, see CONTRIBUTING.md.
Acknowledgements
Kyle Barron authored the original stata_kernel
and Vinci Chow carried
that work forward for Stata 17, converting the backend to use
pystata. nbstata
is directly
derived from his
pystata-kernel.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.