nshsnap

nshsnap is a Python library for creating and managing snapshots of Python projects and environments. It's particularly useful for scenarios where you need to preserve the exact state of your code and dependencies, such as when running machine learning training jobs on cluster systems like SLURM.

Motivation

Long-running jobs and experiments, especially in distributed environments, depend on a consistent codebase. Even a small edit to your project files can crash an ongoing run or make its results inconsistent. nshsnap addresses this by taking a full snapshot of your main development package, preserving the state of the code at that moment.

This tool was originally created to solve issues encountered when running ML training jobs on SLURM. It enables you to:

  1. Capture the current state of your project
  2. Ensure reproducibility across multiple runs
  3. Isolate changes in your development environment from active jobs

Installation

You can install nshsnap using pip:

pip install nshsnap

Usage

Programmatic Usage

Here's a basic example of how to use nshsnap in your Python code:

from nshsnap import snapshot

# Create a snapshot with default settings
snapshot_info = snapshot()

# Or, create a snapshot with custom configuration
snapshot_info = snapshot(
    modules=["my_project", "my_other_module"],
    editable_modules=True
)

# Create a snapshot with specific git references
snapshot_info = snapshot(
    modules=["my_project", "my_other_module"],
    git_references={
        "my_project": "v1.2.3",        # Use tag v1.2.3
        "my_other_module": "main"      # Use main branch
    },
    editable_modules=False
)

print(f"Snapshot created at: {snapshot_info.snapshot_dir}")
print(f"Modules included: {', '.join(snapshot_info.modules)}")

Command Line Usage

nshsnap also provides a command-line interface:

# Snapshot all editable packages in the current environment
nshsnap --editables

# Snapshot specific modules
nshsnap --modules my_project my_other_module

# Specify a custom snapshot directory
nshsnap --editables --dir /path/to/snapshot/directory

# Snapshot modules at specific git references
nshsnap --modules my_project --git-ref my_project:v1.2.3
nshsnap --editables --git-ref my_package:main --git-ref another_package:develop

# Get help
nshsnap --help
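
The `module:ref` form accepted by `--git-ref` corresponds to the `git_references` dictionary shown in the programmatic example. As a minimal sketch of that correspondence, the hypothetical helper below (`parse_git_refs` is not part of nshsnap, and nshsnap's own parsing may differ) converts a list of `module:ref` strings into such a dictionary:

```python
def parse_git_refs(specs: list[str]) -> dict[str, str]:
    """Convert ["module:ref", ...] strings into a {module: ref} mapping.

    Hypothetical helper for illustration; not part of nshsnap's API.
    """
    refs: dict[str, str] = {}
    for spec in specs:
        # Split on the first colon: left side is the module, right side the ref.
        module, sep, ref = spec.partition(":")
        if not sep or not module or not ref:
            raise ValueError(f"expected 'module:ref', got {spec!r}")
        refs[module] = ref
    return refs


git_references = parse_git_refs(["my_package:main", "another_package:develop"])
print(git_references)
```

The resulting dictionary has the same shape as the `git_references` argument passed to `snapshot()` in the programmatic example.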

Activating and Using Snapshots

After creating a snapshot, prepend the snapshot directory to your PYTHONPATH to activate the snapshot environment:

export PYTHONPATH=/path/to/snapshot:$PYTHONPATH

Alternatively, use the helper scripts in the snapshot directory to activate the environment or to execute a command within it:

# Activate the snapshot environment
source /path/to/snapshot/.bin/activate

# Execute a command within the snapshot environment
/path/to/snapshot/.bin/execute python my_script.py
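
When jobs are launched from Python rather than from a shell, the PYTHONPATH approach can be applied per subprocess instead of globally. The following is a minimal sketch using only the standard library; `env_with_snapshot` and `run_in_snapshot` are hypothetical helper names, not nshsnap functions, and the snapshot path is whatever directory your snapshot was written to:

```python
import os
import subprocess
import sys
from pathlib import Path


def env_with_snapshot(snapshot_dir: Path) -> dict[str, str]:
    """Return a copy of os.environ with snapshot_dir first on PYTHONPATH."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    parts = [str(snapshot_dir)] + ([existing] if existing else [])
    # os.pathsep is ":" on POSIX and ";" on Windows.
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env


def run_in_snapshot(snapshot_dir: Path, args: list[str]) -> subprocess.CompletedProcess:
    """Run the current Python interpreter with the snapshot taking import precedence."""
    return subprocess.run(
        [sys.executable, *args],
        env=env_with_snapshot(snapshot_dir),
        check=True,
        capture_output=True,
        text=True,
    )
```

Because the snapshot directory comes first on PYTHONPATH, modules found there take precedence over other copies listed later on the path, and the parent process's own environment is left untouched.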

Features

  • Snapshot editable packages and specified modules
  • Git reference support: Snapshot modules at specific git branches, tags, or commit hashes
  • Preserve exact state of code and dependencies
  • Easy activation and execution within snapshot environments
  • Integration with version control systems (respects .gitignore)
  • Metadata storage for snapshot information
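
To illustrate what respecting .gitignore means in practice, the sketch below lists the files in a repository that git does not ignore, using the standard `git ls-files` command. This is an illustration under stated assumptions, not a description of how nshsnap selects files internally; `unignored_files` is a hypothetical helper name.

```python
import subprocess
from pathlib import Path


def unignored_files(repo: Path) -> list[str]:
    """List tracked and untracked files in repo, skipping anything git ignores."""
    result = subprocess.run(
        [
            "git", "-C", str(repo), "ls-files",
            "--cached",            # files already tracked by git
            "--others",            # untracked files as well
            "--exclude-standard",  # apply .gitignore and the other standard excludes
            "-z",                  # NUL-separated output, safe for unusual file names
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return sorted(name for name in result.stdout.split("\0") if name)
```

A file matching a .gitignore pattern, such as a log file or build artifact, would be absent from the returned list.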

Git References

nshsnap supports snapshotting modules at specific git references (branches, tags, or commit hashes). This is particularly useful when you want to ensure your snapshot uses a specific version of a dependency. Keep the following in mind:

  • Modules must be git repositories: The git reference feature only works for modules that are located in git repositories
  • Automatic restoration: After snapshotting, the original git reference is automatically restored
  • Error handling: If a git reference cannot be checked out, nshsnap will log an error and skip that module
  • Multiple references: You can specify different git references for different modules

# Example: Snapshot with git references
nshsnap --modules my_project other_module \
        --git-ref my_project:v2.1.0 \
        --git-ref other_module:feature-branch

Requirements

  • Python 3.9+
  • git
  • rsync

Contributing

Contributions to nshsnap are welcome! Please feel free to submit a Pull Request.

License

MIT License
