
Hangar is version control for tensor data. Commit, branch, merge, revert, and collaborate in the data-defined software era.



  • Free software: Apache 2.0 license

What is Hangar?

Hangar is based on the belief that too much time is spent collecting, managing, and creating home-brewed version control systems for data. At its core, Hangar is designed to solve many of the same problems faced by traditional code version control systems (e.g., Git), just adapted for numerical data:

  • Time travel through the historical evolution of a dataset.

  • Zero-cost Branching to enable exploratory analysis and collaboration

  • Cheap Merging to build datasets over time (with multiple collaborators)

  • Completely abstracted organization and management of data files on disk

  • Ability to retrieve only a small portion of the data (as needed) while still maintaining a complete historical record

  • Ability to push and pull changes directly to collaborators or a central server (i.e., a truly distributed version control system)

The ability of version control systems to perform these tasks for codebases is largely taken for granted by almost every developer today. However, we are in fact standing on the shoulders of giants: decades of engineering have produced these phenomenally useful tools. Now that a new era of “Data-Defined software” is taking hold, we find there is a strong need for analogous version control systems which are designed to handle numerical data at large scale… Welcome to Hangar!

The Hangar Workflow:

   Checkout Branch
          |
          ▼
 Create/Access Data
          |
          ▼
Add/Remove/Update Samples
          |
          ▼
       Commit
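
The four steps above can be sketched with a toy, in-memory model. This is not Hangar's API: `ToyRepo`, its methods, and the sample names are hypothetical and exist only to illustrate the checkout, modify, commit cycle and why earlier commits stay readable afterwards. See the documentation linked below for the real interface.

```python
import hashlib
import json


class ToyRepo:
    """Toy in-memory stand-in for a versioned data store (NOT Hangar's API)."""

    def __init__(self):
        self.commits = {}                  # commit id -> parent, message, samples
        self.branches = {"master": None}   # branch name -> head commit id

    def checkout(self, branch):
        """Return the branch name and a mutable copy of the samples at its head."""
        head = self.branches[branch]
        samples = dict(self.commits[head]["samples"]) if head else {}
        return branch, samples

    def commit(self, branch, samples, message):
        """Record a snapshot of the samples and advance the branch head to it."""
        parent = self.branches[branch]
        payload = json.dumps([parent, message, samples], sort_keys=True)
        commit_id = hashlib.sha1(payload.encode()).hexdigest()[:6]
        self.commits[commit_id] = {
            "parent": parent,
            "message": message,
            "samples": dict(samples),
        }
        self.branches[branch] = commit_id
        return commit_id


repo = ToyRepo()

# Checkout Branch
branch, samples = repo.checkout("master")
# Create/Access Data, then Add Samples
samples["img_0"] = [0.0, 0.5, 1.0]
samples["img_1"] = [1.0, 0.5, 0.0]
# Commit
first = repo.commit(branch, samples, "Initial commit adding training images")

# Repeat the cycle: update one sample, remove another, commit again.
branch, samples = repo.checkout("master")
samples["img_0"] = [0.1, 0.5, 0.9]
del samples["img_1"]
second = repo.commit(branch, samples, "Update img_0 and drop img_1")

# The first snapshot is still intact and reachable through the parent link.
print(repo.commits[repo.commits[second]["parent"]]["samples"])
```

Each commit here stores a full copy of the samples, which keeps the sketch short; a real system would need to store data far more economically to make branching and merging cheap.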

Log Style Output:

*   5254ec (master) : merge commit combining training updates and new validation samples
|\
| * 650361 (add-validation-data) : Add validation labels and image data in isolated branch
* | 5f15b4 : Add some metadata for later reference and add new training samples received after initial import
|/
*   baddba : Initial commit adding training images and labels

Learn more about what Hangar is all about at https://hangar-py.readthedocs.io/

Installation

Hangar is in early alpha development release!

pip install hangar

Documentation

https://hangar-py.readthedocs.io/

Development

To run all the tests, run:

tox

Note: to combine the coverage data from all the tox environments, run:

Windows

set PYTEST_ADDOPTS=--cov-append
tox

Other

PYTEST_ADDOPTS=--cov-append tox

