nshsnap

nshsnap is a Python library for creating and managing snapshots of Python projects and environments. It's particularly useful for scenarios where you need to preserve the exact state of your code and dependencies, such as when running machine learning training jobs on cluster systems like SLURM.

Motivation

When running long jobs or experiments, especially in distributed environments, it's critical to keep your codebase consistent. Even small changes to project files can crash ongoing runs or make their results inconsistent. nshsnap addresses this by letting you take a full snapshot of your main development package, preserving the code state at that moment.

This tool was originally created to solve issues encountered when running ML training jobs on SLURM. It enables you to:

  1. Capture the current state of your project
  2. Ensure reproducibility across multiple runs
  3. Isolate changes in your development environment from active jobs

Installation

You can install nshsnap using pip:

pip install nshsnap

Usage

Programmatic Usage

Here's a basic example of how to use nshsnap in your Python code:

from nshsnap import snapshot

# Create a snapshot with default settings
snapshot_info = snapshot()

# Or, create a snapshot with custom configuration
snapshot_info = snapshot(
    modules=["my_project", "my_other_module"],
    editable_modules=True
)

# Create a snapshot with specific git references
snapshot_info = snapshot(
    modules=["my_project", "my_other_module"],
    git_references={
        "my_project": "v1.2.3",        # Use tag v1.2.3
        "my_other_module": "main"      # Use main branch
    },
    editable_modules=False
)

print(f"Snapshot created at: {snapshot_info.snapshot_dir}")
print(f"Modules included: {', '.join(snapshot_info.modules)}")

Command Line Usage

nshsnap also provides a command-line interface:

# Snapshot all editable packages in the current environment
nshsnap --editables

# Snapshot specific modules
nshsnap --modules my_project my_other_module

# Specify a custom snapshot directory
nshsnap --editables --dir /path/to/snapshot/directory

# Snapshot modules at specific git references
nshsnap --modules my_project --git-ref my_project:v1.2.3
nshsnap --editables --git-ref my_package:main --git-ref another_package:develop

# Get help
nshsnap --help
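The `--git-ref` values above follow a `MODULE:REF` shape, while the programmatic API takes a `git_references` mapping. As a minimal sketch of bridging the two, the helper below converts a list of such strings into a dictionary. The name `parse_git_refs` is hypothetical and not part of nshsnap; this is not how the library itself parses its arguments.

```python
def parse_git_refs(specs):
    """Turn ["module:ref", ...] into {"module": "ref", ...}.

    Hypothetical helper for illustration; not part of nshsnap.
    """
    refs = {}
    for spec in specs:
        # Split on the first colon; git ref names cannot contain colons.
        module, sep, ref = spec.partition(":")
        if not sep or not module or not ref:
            raise ValueError(f"expected MODULE:REF, got {spec!r}")
        refs[module] = ref
    return refs


refs = parse_git_refs(["my_project:v1.2.3", "my_other_module:main"])
print(refs)
```

The resulting dictionary has the same shape as the `git_references` argument shown in the programmatic example, so it could be passed as `snapshot(modules=list(refs), git_references=refs)`.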

Activating and Using Snapshots

After creating a snapshot, all you need to do is prepend the snapshot directory to your PYTHONPATH to activate the snapshot environment:

export PYTHONPATH=/path/to/snapshot:$PYTHONPATH

You can also activate or execute commands within the snapshot environment using our helper scripts:

# Activate the snapshot environment
source /path/to/snapshot/.bin/activate

# Execute a command within the snapshot environment
/path/to/snapshot/.bin/execute python my_script.py
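When launching jobs from Python rather than a shell, the same `PYTHONPATH` approach can be applied to a child process. The sketch below uses only the standard library; `env_with_snapshot` and `run_in_snapshot` are hypothetical helper names, and the snapshot path is a placeholder.

```python
import os
import subprocess
import sys


def env_with_snapshot(snapshot_dir, base_env=None):
    """Return a copy of the environment with snapshot_dir first on PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    parts = [os.fspath(snapshot_dir)]
    existing = env.get("PYTHONPATH", "")
    if existing:
        # Keep any previous entries, but after the snapshot directory.
        parts.append(existing)
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env


def run_in_snapshot(snapshot_dir, argv):
    """Run argv in a child process whose imports resolve to the snapshot first."""
    return subprocess.run(argv, env=env_with_snapshot(snapshot_dir), check=True)
```

For example, `run_in_snapshot("/path/to/snapshot", [sys.executable, "my_script.py"])` starts the script with the snapshot directory ahead of any existing `PYTHONPATH` entries. The parent process is left unchanged, so continued development does not affect the running job.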

Features

  • Snapshot editable packages and specified modules
  • Git reference support: Snapshot modules at specific git branches, tags, or commit hashes
  • Preserve exact state of code and dependencies
  • Easy activation and execution within snapshot environments
  • Integration with version control systems (respects .gitignore)
  • Metadata storage for snapshot information

Git References

nshsnap supports snapshotting modules at specific git references (branches, tags, or commit hashes). This is particularly useful when you want to ensure your snapshot uses a specific version of a dependency. The feature behaves as follows:

  • Modules must be git repositories: The git reference feature only works for modules that are located in git repositories
  • Automatic restoration: After snapshotting, the original git reference is automatically restored
  • Error handling: If a git reference cannot be checked out, nshsnap will log an error and skip that module
  • Multiple references: You can specify different git references for different modules

# Example: Snapshot with git references
nshsnap --modules my_project other_module \
        --git-ref my_project:v2.1.0 \
        --git-ref other_module:feature-branch
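To make the "automatic restoration" behaviour concrete, here is a rough sketch of one way a checkout-then-restore step could be written with a context manager. It is an illustration under assumptions, not nshsnap's actual implementation: the names `checked_out` and `_git` are hypothetical, and the sketch assumes a clean working tree so that `git checkout` succeeds.

```python
import subprocess
from contextlib import contextmanager


def _git(repo, *args):
    """Run a git command inside repo and return its trimmed stdout."""
    result = subprocess.run(
        ["git", "-C", str(repo), *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


@contextmanager
def checked_out(repo, ref, git=_git):
    """Temporarily check out ref in repo, then restore the original reference."""
    original = git(repo, "rev-parse", "--abbrev-ref", "HEAD")
    if original == "HEAD":
        # Detached HEAD: remember the commit hash instead of a branch name.
        original = git(repo, "rev-parse", "HEAD")
    git(repo, "checkout", "--quiet", ref)
    try:
        yield
    finally:
        # Restore even if copying the files raised an exception.
        git(repo, "checkout", "--quiet", original)
```

Code that copies the module's files would run inside `with checked_out(repo, "v2.1.0"):`. The `git` parameter exists only so the git calls can be swapped out, for instance in tests.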

Requirements

  • Python 3.9+
  • git
  • rsync
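A job launcher can verify these requirements before attempting a snapshot. The following is a small preflight sketch using only the standard library; `missing_requirements` is a hypothetical helper, not something nshsnap provides.

```python
import shutil
import sys


def missing_requirements(min_python=(3, 9), tools=("git", "rsync")):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems


for problem in missing_requirements():
    print(f"warning: {problem}")
```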

Contributing

Contributions to nshsnap are welcome! Please feel free to submit a Pull Request.

License

MIT License
