A library for choreographing your machine learning research.

AI2 Tango replaces messy directories and spreadsheets full of file versions by organizing experiments into discrete steps that can be cached and reused throughout the lifetime of a research project.


Installation

ai2-tango requires Python 3.8 or later.

Installing with pip

ai2-tango is available on PyPI. Just run

pip install ai2-tango

To install with a specific integration, such as torch, run

pip install 'ai2-tango[torch]'

To install with all integrations, run

pip install 'ai2-tango[all]'

Installing with conda

ai2-tango is available on conda-forge. You can install just the base package with

conda install tango -c conda-forge

You can pick and choose from the integrations with one of these:

conda install tango-datasets -c conda-forge
conda install tango-pytorch_lightning -c conda-forge
conda install tango-torch -c conda-forge
conda install tango-wandb -c conda-forge

You can also install everything:

conda install tango-all -c conda-forge

Even though ai2-tango itself is quite small, installing everything will pull in a lot of dependencies. Don't be surprised if this takes a while!

Installing from source

To install ai2-tango from source, first clone the repository:

git clone https://github.com/allenai/tango.git
cd tango

Then run

pip install -e '.[all]'

To install with only a specific integration, such as torch, run

pip install -e '.[torch]'

Or to install just the base tango library, you can run

pip install -e .

Checking your installation

Run

tango info

to check your installation.
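
If you would rather check from inside Python (for example in a notebook), the following is a minimal sketch using only the standard library. It assumes the installed distribution is registered under the PyPI name ai2-tango; the helper name installed_version is hypothetical and not part of tango.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it is not part of the tango API.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the distribution name matches the PyPI project name.
version = installed_version("ai2-tango")
if version is None:
    print("ai2-tango is not installed in this environment")
else:
    print(f"ai2-tango {version} is installed")
```

This only confirms that the package metadata is present. The tango info command remains the way to check the installation itself.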

Docker image

You can build a Docker image suitable for tango projects by using the official Dockerfile as a starting point for your own Dockerfile, or you can simply use one of our prebuilt images as a base image in your Dockerfile. For example:

# Start from a prebuilt tango base image.
# You can choose the right tag from the available options here:
# https://github.com/allenai/tango/pkgs/container/tango/versions
FROM ghcr.io/allenai/tango:cuda11.3

# Install your project's additional requirements.
COPY requirements.txt .
RUN /opt/conda/bin/pip install --no-cache-dir -r requirements.txt

# Install source code.
# This instruction copies EVERYTHING in the current directory (build context),
# which may not be what you want. Consider using a ".dockerignore" file to
# exclude files and directories that you don't want on the image.
COPY . .

Make sure to choose the right base image for your use case depending on the version of tango you're using and the CUDA version that your host machine supports. You can see a list of all available image tags on GitHub.

FAQ

Why is the library named Tango?

The motivation behind this library is that we can make research easier by composing it into well-defined steps. What happens when you choreograph a number of steps together? Well, you get a dance. And since our team's leader is part of a tango band, "AI2 Tango" was an obvious choice!

How can I debug my steps through the Tango CLI?

You can run the tango command through pdb. For example:

python -m pdb -m tango run config.jsonnet

How is Tango different from Metaflow, Airflow, or redun?

We've found that existing DAG execution engines like these are great for production workflows but not as well suited to messy, collaborative research projects where the code changes constantly. AI2 Tango was built specifically for these kinds of research projects.

How does Tango's caching mechanism work?

AI2 Tango caches the results of steps based on the unique_id of the step. The unique_id is essentially a hash of all of the inputs to the step along with:

  1. the step class's fully qualified name, and
  2. the step class's VERSION class variable (an arbitrary string).

Unlike other workflow engines such as redun, Tango does not take the source code of the class itself into account (other than its fully qualified name), because we've found that hashing the source code bytes is far too sensitive and less transparent for users. When you change the source code of your step in a meaningful way, you can manually change the VERSION class variable to tell Tango that the step has been updated.
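
To make the idea concrete, here is a rough sketch of a cache key built from the three ingredients described above: the step's inputs, its fully qualified class name, and its VERSION string. This is not Tango's actual implementation. The function name sketch_unique_id, the JSON serialization, the SHA-256 hash, the key format, and the example class my_project.steps.Tokenize are all assumptions made for illustration.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def sketch_unique_id(
    qualified_name: str,
    version: Optional[str],
    inputs: Dict[str, Any],
) -> str:
    """Illustrative only: derive a cache key from a step's identity and inputs.

    The step's source code is deliberately NOT part of the key.
    """
    payload = json.dumps(
        {"class": qualified_name, "version": version, "inputs": inputs},
        sort_keys=True,  # make the key independent of argument order
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    short_name = qualified_name.rsplit(".", 1)[-1]
    return f"{short_name}-{digest[:16]}"


# Hypothetical step class name and inputs.
base = sketch_unique_id("my_project.steps.Tokenize", "001", {"lowercase": True})
same = sketch_unique_id("my_project.steps.Tokenize", "001", {"lowercase": True})
bumped = sketch_unique_id("my_project.steps.Tokenize", "002", {"lowercase": True})
changed = sketch_unique_id("my_project.steps.Tokenize", "001", {"lowercase": False})

print(base == same)     # True: same identity and inputs, so the cached result is reused
print(base == bumped)   # False: VERSION changed, so the step runs again
print(base == changed)  # False: an input changed, so the step runs again
```

Because the source code is left out of the key, editing a step's body without touching VERSION keeps the old key, and the previously cached result is reused. That is why bumping VERSION is the explicit signal that a step has changed.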

Team

ai2-tango is developed and maintained by the AllenNLP team, backed by the Allen Institute for Artificial Intelligence (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering. To learn more about who specifically contributed to this codebase, see our contributors page.

License

ai2-tango is licensed under Apache 2.0. A full copy of the license can be found on GitHub.


Source Distribution

ai2-tango-1.0.1.tar.gz (193.6 kB)

Uploaded Source

Built Distribution

ai2_tango-1.0.1-py3-none-any.whl (241.2 kB)

Uploaded Python 3

File details

Details for the file ai2-tango-1.0.1.tar.gz.

File metadata

  • Download URL: ai2-tango-1.0.1.tar.gz
  • Upload date:
  • Size: 193.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.14

File hashes

Hashes for ai2-tango-1.0.1.tar.gz
Algorithm Hash digest
SHA256 e10f3a61183e62940053141283d08af2d1f765d1a06a0bc8091116080c868026
MD5 47940468a5707eef7934cbee1a92db3d
BLAKE2b-256 aafd4a7e0f3aafda543eb6019e7c5be449a18073a24bc036375faf1daff54cfc

See more details on using hashes here.

File details

Details for the file ai2_tango-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: ai2_tango-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 241.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.8.14

File hashes

Hashes for ai2_tango-1.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 66e65ba39af9920a6a64d9176098433a02857e429114eae716e4972496441db8
MD5 65b5bc3d9e2403be4e22d51b7c5d11fb
BLAKE2b-256 42bd06f03c8c97195de7f6c8f7c2798707f277ad402daaab91d916adc5f453a2

