Automation framework for the scientific method in AI research

TL;DR

What is SIERRA? See What is SIERRA?

Why should you use SIERRA? See Why SIERRA?

To install SIERRA (requires Python 3.8+):

pip3 install sierra-research

To get started using SIERRA, see getting started.

Want to cite SIERRA? See Citing.

Have an issue using SIERRA? See Troubleshooting.

What is SIERRA?

https://raw.githubusercontent.com/jharwell/sierra/master/docs/figures/architecture.png

SIERRA architecture, organized by pipeline stage. Stages are listed left to right, and an approximate joint architectural/functional stack is top to bottom for each stage. “…” indicates areas where SIERRA is designed via plugins to be easily extensible. “Host machine” indicates the machine SIERRA was invoked on.

SIERRA is a command line tool and plugin framework for:

  • Automating scientific research, providing facilities for seamless experiment generation, execution, and results processing.

  • Accelerating research cycles by allowing researchers to focus on the “science” aspects: developing new things and designing experiments to test them.

  • Improving the reproducibility of scientific research, particularly in AI.

Why SIERRA?

  • SIERRA changes the paradigm of the engineering tasks researchers must perform from manual and procedural to declarative and automated. That is, from:

    "I need to perform these steps to run the experiment, process the data and
    generate the graphs I want."

    to:

    "OK SIERRA: Here is the environment and simulator/robot platform I want to
    use, the deliverables I want to generate, and the data I want to appear on
    them for my research query--GO!"

    Essentially, SIERRA handles the “engineering” parts of research on the backend, such as generating experiments, configuring execution environments or platforms, running the generated experiments, and processing experimental results to generate statistics and/or visualizations. It also handles random seeds, algorithm stochasticity, and other low-level details.

  • It eliminates manual reconfiguration of experiments across simulator/robot platforms by decoupling the concepts of execution environment and platform; any supported pair can be selected in a mix-and-match fashion (see SIERRA Support Matrix). Thus, it removes the need for throw-away scripts for data processing and deliverable generation.

  • SIERRA can be used with code written in any language; only bindings must be written in Python.

  • SIERRA has a rich model framework allowing you to run arbitrary models, generate data, and plot it on the same figure as empirical results, automatically.

  • Its deeply modular architecture makes it easy to customize for the needs of a specific research project.

Not sure if SIERRA makes sense for your research? Consider the following use cases:

If aspects of either use case sound familiar, then there is a strong chance SIERRA could help you! SIERRA is well documented–see the SIERRA docs to get started.

Use Case #1: Alice The Robotics Researcher

Alice is a researcher at a large university who has developed a new distributed task allocation algorithm $\alpha$ for use in a foraging task where robots must coordinate to find objects of interest in an unknown environment and bring them to a central location. Alice wants to implement her algorithm so she can investigate:

  • How well it scales with the number of robots, specifically if it remains efficient with up to 1000 robots in several different scenarios.

  • How robust it is with respect to sensor and actuator noise.

  • How it compares to other similar state-of-the-art algorithms on a foraging task: $\beta,\gamma$.

Alice is faced with the following heterogeneity matrix which she has to deal with to answer her research queries, in addition to the technical challenges of the AI elements themselves:

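=========  =======================  ================
Algorithm  Contains stochasticity?  Outputs data in?
=========  =======================  ================
$\alpha$   Yes                      CSV, rosbag
$\beta$    Yes                      CSV, rosbag
$\gamma$   No                       rosbag
=========  =======================  ================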

Alice is familiar with ROS, and wants to use it for large-scale simulated and small-scale real-robot experiments with TurtleBots. However, for real robots she is unsure what data she will ultimately need, and wants to capture all ROS messages to avoid having to redo experiments later. She has access to a large SLURM-managed cluster, and prefers to develop code on her laptop.

Use Case #2: Alice The Contagion Modeler

Alice has teamed with Bob, a biologist, to model the spread of contagion among agents in a population, and how that affects their individual and collective abilities to do tasks. She believes her $\alpha$ algorithm can be reused in this context. However, Bob is not convinced and has selected several multi-agent models from recent papers: $\delta,\epsilon$, and wants Alice to compare $\alpha$ to them. $\delta$ was originally developed in NetLogo, for modeling disease transmission in animals. $\epsilon$ was originally developed for ARGoS to model the effects of radiation on robots.

Alice is faced with the following heterogeneity matrix which she must deal with to answer her research query, in addition to the technical challenges of the AI elements themselves:

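==========  ===========  ===================
Algorithm   Can Run On?  Input Requirements?
==========  ===========  ===================
$\alpha$    ROS/Gazebo   XML
$\delta$    NetLogo      NetLogo
$\epsilon$  ARGoS        XML
==========  ===========  ===================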

Bob is interested in how the rate of contagion spread varies with agent velocity and population size. Bob needs to prepare succinct, comprehensive visual representations of the results of these research queries for a presentation, including visual comparisons of the multi-agent model as it runs for each algorithm. He will give Alice a range of parameter values to test for each algorithm based on his ecological knowledge, and rely on Alice to perform the experiments. For this project, Alice does not have access to HPC resources, but does have a handful of servers in her lab which she can use.

SIERRA Support Matrix

SIERRA supports multiple platforms which researchers can write code to target. In SIERRA terminology, a platform is a “thing” (usually a simulator or robot) that you want to write code to run on. Note that platform != OS in SIERRA terminology. If a SIERRA platform runs on a given OS, then SIERRA supports doing so; if it does not, then SIERRA does not. For example, SIERRA does not support running ARGoS on Windows, because ARGoS does not support Windows.

SIERRA supports multiple execution environments for execution of experiments, such as High Performance Computing (HPC) environments and real robots. Which execution environment experiments targeting a given platform run in is (somewhat) independent of the platform itself (see below).

SIERRA also supports multiple output formats for experimental outputs, as shown below. SIERRA currently only supports XML experimental inputs.

SIERRA supports (mostly) mix-and-match between platforms, execution environments, and experiment input/output formats, as shown in its support matrix below. This is one of the most powerful features of SIERRA! If your desired platform/execution environment is not listed, see the plugin tutorials for how to add it via a plugin.

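+-----------------------------------------------+-------------------------+---------------------------+----------------------------+
| Execution Environment                         | Platform                | Experimental Input Format | Experimental Output Format |
+===============================================+=========================+===========================+============================+
| SLURM: An HPC cluster managed by the SLURM    | ARGoS, ROS1+Gazebo      | XML                       | CSV, PNG                   |
| scheduler.                                    |                         |                           |                            |
+-----------------------------------------------+-------------------------+---------------------------+----------------------------+
| Torque/MOAB: An HPC cluster managed by the    | ARGoS, ROS1+Gazebo      | XML                       | CSV, PNG                   |
| Torque/MOAB scheduler.                        |                         |                           |                            |
+-----------------------------------------------+-------------------------+---------------------------+----------------------------+
| ADHOC: A miscellaneous collection of          | ARGoS, ROS1+Gazebo      | XML                       | CSV, PNG                   |
| networked HPC compute nodes or random         |                         |                           |                            |
| servers; not managed by a scheduler.          |                         |                           |                            |
+-----------------------------------------------+-------------------------+---------------------------+----------------------------+
| Local: The SIERRA host machine, e.g., a       | ARGoS, ROS1+Gazebo      | XML                       | CSV, PNG                   |
| researcher’s laptop.                          |                         |                           |                            |
+-----------------------------------------------+-------------------------+---------------------------+----------------------------+
| ROS1+Turtlebot3: Turtlebot3 robots with ROS1. | ROS1+Gazebo, ROS1+Robot | XML                       | CSV, PNG                   |
+-----------------------------------------------+-------------------------+---------------------------+----------------------------+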
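As a rough illustration of the mix-and-match idea, the matrix above can be written down as a lookup table and queried for valid (execution environment, platform) pairs. This is only a sketch: `SUPPORT_MATRIX`, `is_supported` and `supported_pairs` are hypothetical names invented for this example, and the string labels are copied from the table rather than taken from SIERRA's actual plugin names or command line options.

```python
# Hypothetical encoding of the support matrix above. The keys and values are
# the labels used in the table, NOT SIERRA's real plugin identifiers.
SUPPORT_MATRIX = {
    "SLURM": {"ARGoS", "ROS1+Gazebo"},
    "Torque/MOAB": {"ARGoS", "ROS1+Gazebo"},
    "ADHOC": {"ARGoS", "ROS1+Gazebo"},
    "Local": {"ARGoS", "ROS1+Gazebo"},
    "ROS1+Turtlebot3": {"ROS1+Gazebo", "ROS1+Robot"},
}


def is_supported(exec_env: str, platform: str) -> bool:
    """Return True if the table lists this environment/platform pair."""
    return platform in SUPPORT_MATRIX.get(exec_env, set())


def supported_pairs() -> list:
    """Return every (execution environment, platform) pair, sorted."""
    return sorted(
        (env, plat) for env, plats in SUPPORT_MATRIX.items() for plat in plats
    )


if __name__ == "__main__":
    for env, plat in supported_pairs():
        print(f"{env} + {plat}")
```

Every pair in this sketch uses XML inputs and CSV/PNG outputs, matching the last two columns of the table, so those columns are left out of the lookup.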

For more details about the platforms and experimental output formats, see below.

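===========  =======================================================================================
Platform     Description
===========  =======================================================================================
ARGoS        Simulator for fast simulation of large swarms. Requires ARGoS >= 3.0.0-beta59.
ROS1+Gazebo  Using ROS1 with the Gazebo simulator. Requires Gazebo >= 11.9.0, ROS1 Noetic or later.
ROS1+Robot   Using ROS1 with a real robot platform of your choice. ROS1 Noetic or later is required.
===========  =======================================================================================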

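==========================  ===========================================================
Experimental Output Format  Scope
==========================  ===========================================================
CSV file                    Raw experimental outputs, transforming into heatmap images.
PNG file                    Stitching images together into videos.
==========================  ===========================================================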

Requirements To Use SIERRA

The basic requirements are:

  • Recent OSX (tested with 12+) or Linux (tested with Ubuntu 20.04+).

  • Python >= 3.8.

For more details, including the requirements for researcher code, see the SIERRA requirements.
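A quick way to check the two basic requirements before installing is sketched below. It uses only the standard library; the function name `meets_basic_requirements` is made up for this example and is not part of SIERRA. It checks only the Python version and the OS family, not the specific OS releases listed above.

```python
import platform
import sys

MIN_PYTHON = (3, 8)
# platform.system() reports "Darwin" on OSX and "Linux" on Linux.
SUPPORTED_SYSTEMS = {"Linux", "Darwin"}


def meets_basic_requirements(python_version=None, system=None) -> bool:
    """Check the Python version and OS family against the basic requirements.

    Both arguments default to the values of the running interpreter; they
    can be passed explicitly to check a different machine's configuration.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    if system is None:
        system = platform.system()
    return tuple(python_version) >= MIN_PYTHON and system in SUPPORTED_SYSTEMS


if __name__ == "__main__":
    print("Basic requirements met:", meets_basic_requirements())
```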

Citing

If you use SIERRA and have found it helpful, please cite the following paper:

@inproceedings{Harwell2022a-SIERRA,
author = {Harwell, John and Lowmanstone, London and Gini, Maria},
title = {SIERRA: A Modular Framework for Research Automation},
year = {2022},
isbn = {9781450392136},
publisher = {International Foundation for Autonomous Agents and Multiagent Systems},
booktitle = {Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems},
pages = {1905–1907}
}

You can also cite the specific version of SIERRA used with the DOI at the top of this page, to help facilitate reproducibility.

Troubleshooting

If you have problems using SIERRA, please open an issue or post in the GitHub forum and I’ll be happy to help you work through it.

Contributing

I welcome all types of contributions, no matter how large or how small, and if you have an idea, I’m happy to talk about it at any point :-). See here for the general procedure.
