nshsnap

nshsnap is a Python library for creating and managing snapshots of Python projects and environments. It's particularly useful for scenarios where you need to preserve the exact state of your code and dependencies, such as when running machine learning training jobs on cluster systems like SLURM.

Motivation

When you run long jobs or experiments, especially in distributed environments, the codebase must stay consistent for the duration of each run. Even small edits to project files can crash a run in progress or produce inconsistent results. nshsnap addresses this by letting you take a full snapshot of your main development package, preserving the code as it was at that moment.

This tool was originally created to solve issues encountered when running ML training jobs on SLURM. It enables you to:

  1. Capture the current state of your project
  2. Ensure reproducibility across multiple runs
  3. Isolate changes in your development environment from active jobs

Installation

You can install nshsnap using pip:

pip install nshsnap

Usage

Programmatic Usage

Here's a basic example of how to use nshsnap in your Python code:

from nshsnap import snapshot

# Create a snapshot with default settings
snapshot_info = snapshot()

# Or, create a snapshot with custom configuration
snapshot_info = snapshot(
    modules=["my_project", "my_other_module"],
    editable_modules=True
)

# Create a snapshot with specific git references
snapshot_info = snapshot(
    modules=["my_project", "my_other_module"],
    git_references={
        "my_project": "v1.2.3",        # Use tag v1.2.3
        "my_other_module": "main"      # Use main branch
    },
    editable_modules=False
)

print(f"Snapshot created at: {snapshot_info.snapshot_dir}")
print(f"Modules included: {', '.join(snapshot_info.modules)}")

Command Line Usage

nshsnap also provides a command-line interface:

# Snapshot all editable packages in the current environment
nshsnap --editables

# Snapshot specific modules
nshsnap --modules my_project my_other_module

# Specify a custom snapshot directory
nshsnap --editables --dir /path/to/snapshot/directory

# Snapshot modules at specific git references
nshsnap --modules my_project --git-ref my_project:v1.2.3
nshsnap --editables --git-ref my_package:main --git-ref another_package:develop

# Get help
nshsnap --help
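
The --git-ref values above use a module:ref form, while the programmatic API takes a git_references dictionary. As a minimal sketch of how the two relate (this is not nshsnap's own parser, and parse_git_refs is a hypothetical helper name used only for illustration):

```python
from collections.abc import Iterable


def parse_git_refs(specs: Iterable[str]) -> dict[str, str]:
    """Turn 'module:ref' strings into a {module: ref} dictionary.

    Hypothetical helper for illustration; not part of nshsnap.
    """
    refs: dict[str, str] = {}
    for spec in specs:
        # A git ref name cannot contain ':', so the first colon
        # separates the module name from the ref.
        module, sep, ref = spec.partition(":")
        if not sep or not module or not ref:
            raise ValueError(f"expected 'module:ref', got {spec!r}")
        refs[module] = ref
    return refs


if __name__ == "__main__":
    print(parse_git_refs(["my_package:main", "another_package:develop"]))
```

The resulting dictionary has the same shape as the git_references argument shown under Programmatic Usage.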

Snapshot and Run Commands

For convenience, nshsnap provides the nshsnap-run command that creates a snapshot and immediately runs a command within that environment:

# Run a Python module within a snapshot
nshsnap-run --modules my_project python -m my_project.main

# Run a script with all editable packages
nshsnap-run --editables python my_script.py

# Run with specific git references
nshsnap-run --modules my_project --git-ref my_project:v1.2.3 python -m my_project.main

# Run with custom snapshot directory
nshsnap-run --modules my_project --dir /tmp/my_snapshot python train.py

# Get help
nshsnap-run --help

Activating and Using Snapshots

After creating a snapshot, prepend the snapshot directory to your PYTHONPATH to activate the snapshot environment:

export PYTHONPATH=/path/to/snapshot:$PYTHONPATH

You can also activate the snapshot environment, or execute commands within it, using the helper scripts that ship inside the snapshot:

# Activate the snapshot environment
source /path/to/snapshot/.bin/activate

# Execute a command within the snapshot environment
/path/to/snapshot/.bin/execute python my_script.py
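
The PYTHONPATH approach above can also be applied from Python when launching a child process. The following is a minimal sketch using only the standard library. snapshot_env and run_in_snapshot are hypothetical helper names, not part of nshsnap, and /path/to/snapshot is a placeholder.

```python
import os
import subprocess
import sys
from typing import Optional


def snapshot_env(snapshot_dir: str, base_env: Optional[dict] = None) -> dict:
    """Return a copy of the environment with snapshot_dir first on PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    # os.pathsep is ':' on POSIX, matching the export line above.
    env["PYTHONPATH"] = snapshot_dir + (os.pathsep + existing if existing else "")
    return env


def run_in_snapshot(snapshot_dir: str, args: list) -> int:
    """Run a command with snapshot_dir prepended to PYTHONPATH; return its exit code."""
    completed = subprocess.run(args, env=snapshot_env(snapshot_dir), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Placeholder path; substitute the directory of a real snapshot.
    code = run_in_snapshot(
        "/path/to/snapshot",
        [sys.executable, "-c", "import sys; print(sys.path)"],
    )
    print(f"exit code: {code}")
```

Unlike the export line, this sketch leaves off the trailing separator when PYTHONPATH is not already set.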

Features

  • Snapshot editable packages and specified modules
  • Git reference support: Snapshot modules at specific git branches, tags, or commit hashes
  • Snapshot and run: Create snapshots and immediately execute commands within them using nshsnap-run
  • Preserve exact state of code and dependencies
  • Easy activation and execution within snapshot environments
  • Integration with version control systems (respects .gitignore)
  • Metadata storage for snapshot information

Git References

nshsnap supports snapshotting modules at specific git references (branches, tags, or commit hashes). This is particularly useful when you want to ensure your snapshot uses a specific version of a dependency. Keep the following in mind:

  • Modules must be git repositories: The git reference feature only works for modules that are located in git repositories
  • Automatic restoration: After snapshotting, the original git reference is automatically restored
  • Error handling: If a git reference cannot be checked out, nshsnap will log an error and skip that module
  • Multiple references: You can specify different git references for different modules

# Example: Snapshot with git references
nshsnap --modules my_project other_module \
        --git-ref my_project:v2.1.0 \
        --git-ref other_module:feature-branch

Requirements

  • Python 3.9+
  • git
  • rsync
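
A quick way to confirm these prerequisites before taking a snapshot is to check them from Python. The sketch below uses only the standard library. missing_requirements is a hypothetical helper, not something nshsnap provides.

```python
import shutil
import sys
from typing import Callable, Optional


def missing_requirements(
    which: Callable[[str], Optional[str]] = shutil.which,
    version: tuple = sys.version_info[:2],
) -> list:
    """Return the names of unmet requirements (empty list if all are met)."""
    missing = []
    if version < (3, 9):
        missing.append("Python 3.9+")
    # git and rsync must be discoverable on PATH.
    for tool in ("git", "rsync"):
        if which(tool) is None:
            missing.append(tool)
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("All requirements met" if not problems else f"Missing: {', '.join(problems)}")
```

The which and version parameters exist only so the check can be exercised without depending on the local machine.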

Contributing

Contributions to nshsnap are welcome! Please feel free to submit a Pull Request.

License

MIT License
