Skip to main content

No project description provided

Project description

nshutils

nshutils is a collection of utility functions and classes that I've found useful in my day-to-day work as an ML researcher. This library includes utilities for typechecking, logging, and saving/loading activations from neural networks.

Installation

To install nshutils, simply run:

pip install nshutils

Configuration

nshutils features a unified configuration system that allows you to control various aspects of the library through environment variables, programmatic settings, or JSON configuration.

Environment Variables

You can configure nshutils using environment variables:

# Enable debug mode
export NSHUTILS_DEBUG=1

# Enable typecheck (or disable with 0)
export NSHUTILS_TYPECHECK=1

# Enable ActSave with default temp directory
export NSHUTILS_ACTSAVE=1

# Enable ActSave with specific directory
export NSHUTILS_ACTSAVE="/path/to/activations"

# Set ActSave filters (comma-separated patterns)
export NSHUTILS_ACTSAVE_FILTERS="layer*,attention*,encoder.*"

# JSON configuration (overrides individual variables)
export NSHUTILS_CONFIG='{"debug": {"enabled": true}, "typecheck": {"enabled": false}, "actsave": {"enabled": true, "save_dir": "/path/to/activations", "filters": ["layer*"]}}'

# Comma-separated configuration
export NSHUTILS_CONFIG="debug=true,typecheck=false,actsave=true"

Programmatic Configuration

You can also configure nshutils programmatically:

from nshutils import config

# Enable debug mode
config.set(True, "debug")

# Configure typecheck with explicit settings
config.set({"enabled": True}, "typecheck")

# Configure ActSave
config.set({
    "enabled": True,
    "save_dir": "/path/to/activations",
    "filters": ["layer*", "attention*"]
}, "actsave")

# Check current settings
print(f"Debug enabled: {config.debug_enabled()}")
print(f"Typecheck enabled: {config.typecheck_enabled()}")
print(f"ActSave enabled: {config.actsave_enabled()}")

# Temporary overrides using context managers
with config.debug_override(False):
    # Debug is temporarily disabled
    pass

with config.actsave_override({"enabled": True, "filters": ["encoder.*"]}):
    # ActSave temporarily uses different filters
    pass

Configuration Hierarchy

The configuration system follows a hierarchical approach:

  1. Debug → Typecheck: When debug is enabled, typecheck is automatically enabled unless explicitly overridden
  2. Environment variables take precedence over programmatic settings
  3. Context managers provide temporary overrides within their scope

Features

Typechecking

nshutils provides a simple way to typecheck your code using the jaxtyping library. The typecheck system is now integrated with the unified configuration system.

Enable typechecking globally:

export NSHUTILS_TYPECHECK=1

Or programmatically:

from nshutils import config
config.set(True, "typecheck")

# Now use typecheck decorators
from nshutils.typecheck import typecheck

@typecheck
def my_function(x: Float[torch.Tensor, "batch seq"]) -> Float[torch.Tensor, "batch seq"]:
    return x * 2

You can also use the tassert function to assert that a value is of a certain type:

import nshutils.typecheck as tc

def my_function(x: tc.Float[torch.Tensor, "bsz seq len"]) -> tc.Float[torch.Tensor, "bsz seq len"]:
    tc.tassert(tc.Float[torch.Tensor, "bsz seq len"], x)
    ...

Logging

nshutils provides a simple way to configure logging for your project. Simply call one of the logging setup functions:

from nshutils.logging import init_python_logging

init_python_logging()

This will configure logging to use pretty formatting for PyTorch tensors and numpy arrays (inspired by and/or utilizing lovely-numpy and lovely-tensors), and will also enable rich logging if the rich library is installed.

Activation Saving/Loading

nshutils provides a simple way to save and load activations from neural networks. ActSave is now integrated with the unified configuration system.

Basic Usage

Enable ActSave via environment variables:

# Enable with default temp directory
export NSHUTILS_ACTSAVE=1

# Enable with specific directory
export NSHUTILS_ACTSAVE="/path/to/activations"

# Set filters (comma-separated patterns)
export NSHUTILS_ACTSAVE_FILTERS="layer*,attention*"

Or programmatically:

from nshutils import config
from nshutils import ActSave

# Enable ActSave with configuration
config.set({
    "enabled": True,
    "save_dir": "/path/to/activations",
    "filters": ["layer*", "attention*"]
}, "actsave")

def my_model_forward(x):
    ...
    # Save activations - automatically filtered
    ActSave({"encoder.activations": x})

    # Equivalent to the above
    with ActSave.context("encoder"):
        ActSave(activations=x)
    ...

x = torch.randn(...)
my_model_forward(x)
# Activations are saved to disk under the configured directory

You can also use the traditional explicit enable/disable pattern:

from nshutils import ActSave

# Explicit enable with filters
ActSave.enable(save_dir="path/to/activations", filters=["layer*", "attention*"])

def my_model_forward(x):
    ActSave({"encoder.activations": x})

x = torch.randn(...)
my_model_forward(x)
ActSave.disable()

Activation Filtering

ActSave supports filtering to selectively save only certain activations based on fnmatch patterns. This is useful for reducing storage space and focusing on specific model components.

Configure filters via environment variables:

# Only save activations matching "layer*" or "attention*" patterns
export NSHUTILS_ACTSAVE_FILTERS="layer*,attention*"
export NSHUTILS_ACTSAVE=1

Or programmatically:

from nshutils import config

# Configure via unified config system
config.set({
    "enabled": True,
    "save_dir": "/path/to/activations",
    "filters": ["layer*", "attention*"]
}, "actsave")

# These will be saved (match filters)
ActSave(
    layer1_output=x1,
    layer2_hidden=x2,
    attention_weights=x3
)

# These will NOT be saved (don't match filters)
ActSave(
    decoder_output=x4,
    embedding_vector=x5
)

Traditional explicit filtering is also supported:

from nshutils import ActSave

# Only save activations matching "layer*" or "attention*" patterns
filters = ["layer*", "attention*"]

with ActSave.enabled(save_dir="path/to/activations", filters=filters):
    # These will be saved (match filters)
    ActSave(
        layer1_output=x1,
        layer2_hidden=x2,
        attention_weights=x3
    )

    # These will NOT be saved (don't match filters)
    ActSave(
        decoder_output=x4,
        embedding_vector=x5
    )

The filtering patterns support standard Unix shell-style wildcards:

  • * matches everything
  • ? matches any single character
  • [seq] matches any character in seq
  • [!seq] matches any character not in seq
Contextual Filtering

Filters work with context prefixes, allowing you to save activations from specific model components:

from nshutils import config, ActSave

# Only save activations from encoder layers
config.set({
    "enabled": True,
    "save_dir": "/path/to/activations",
    "filters": ["encoder.*"]
}, "actsave")

# Decoder context - these won't be saved
with ActSave.context("decoder"):
    ActSave(layer1_output=x1, attention=x2)

# Encoder context - these will be saved
with ActSave.context("encoder"):
    ActSave(layer1_output=x3, attention=x4)  # Saved as "encoder.layer1_output", "encoder.attention"
Dynamic Filter Updates

You can update filters during runtime:

from nshutils import config

# Set initial filters
config.set({"enabled": True, "filters": ["layer*"]}, "actsave")

# Initially only layer outputs saved
ActSave(layer1_output=x1, attention_weights=x2)

# Update filters
config.set({"enabled": True, "filters": ["attention*"]}, "actsave")
ActSave(layer2_output=x3, attention_weights=x4)  # Only attention_weights saved

# Check current filters
current_filters = config.actsave_filters()  # Returns ["attention*"]

# Clear filters (save all)
config.set({"enabled": True, "filters": None}, "actsave")

Traditional explicit filter management is also available:

ActSave.enable(save_dir="path/to/activations")

# Initially no filters - all activations saved
ActSave(layer1_output=x1, attention_weights=x2)

# Update to only save layer outputs
ActSave.set_filters(["layer*"])
ActSave(layer2_output=x3, decoder_output=x4)  # Only layer2_output saved

# Check current filters
current_filters = ActSave.filters  # Returns ["layer*"]

# Clear filters
ActSave.set_filters(None)
Environment Variable Configuration

You can configure ActSave and filtering through environment variables (now part of the unified configuration system):

# Enable ActSave with default temp directory
export NSHUTILS_ACTSAVE=1

# Enable ActSave with specific directory
export NSHUTILS_ACTSAVE="/path/to/activations"

# Set filters (comma-separated patterns)
export NSHUTILS_ACTSAVE_FILTERS="layer*,attention*,encoder.*"

# Combine both
export NSHUTILS_ACTSAVE="/path/to/activations"
export NSHUTILS_ACTSAVE_FILTERS="layer*,attention*"

# Or use the unified config format
export NSHUTILS_CONFIG='{"actsave": {"enabled": true, "save_dir": "/path/to/activations", "filters": ["layer*", "attention*"]}}'

The NSHUTILS_ACTSAVE_FILTERS environment variable supports:

  • Comma-separated patterns: "layer*,attention*,decoder.*"
  • Whitespace handling: Extra spaces around commas are automatically trimmed
  • Empty values: Empty string or only commas/spaces result in no filtering

To load activations, use the ActLoad class:

from nshutils import ActLoad

act_load = ActLoad.from_latest_version("path/to/activations")
encoder_acts = act_load["encoder"]

for act in encoder_acts:
    print(act.shape)

This will load all of the activations saved under the encoder prefix.

Other Utilities

nshutils also provides a few other utility functions/classes:

  • snoop: A simple way to debug your code using the pysnooper library, based on the torchsnooper library.
  • apply_to_collection: Recursively apply a function to all elements of a collection that match a certain type, taken from the pytorch-lightning library.

Contributing

Contributions to nshutils are welcome! Please open an issue or submit a pull request on the GitHub repository.

License

nshutils is released under the MIT License. See the LICENSE file for more details.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nshutils-0.38.1.tar.gz (36.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nshutils-0.38.1-py3-none-any.whl (43.8 kB view details)

Uploaded Python 3

File details

Details for the file nshutils-0.38.1.tar.gz.

File metadata

  • Download URL: nshutils-0.38.1.tar.gz
  • Upload date:
  • Size: 36.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.2.1 CPython/3.12.3 Linux/6.6.87.2-microsoft-standard-WSL2

File hashes

Hashes for nshutils-0.38.1.tar.gz
Algorithm Hash digest
SHA256 6d59fb2c9cf7b5ff7fa620d9d4312afa4ae9feee5ff96a67c26db2ecb34f06a2
MD5 3a90637ea54ac74d6357b151c6d55647
BLAKE2b-256 b864fabd70493226461a26a5d84a1480e8670239c75bad6194b81815735c4ee5

See more details on using hashes here.

File details

Details for the file nshutils-0.38.1-py3-none-any.whl.

File metadata

  • Download URL: nshutils-0.38.1-py3-none-any.whl
  • Upload date:
  • Size: 43.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.2.1 CPython/3.12.3 Linux/6.6.87.2-microsoft-standard-WSL2

File hashes

Hashes for nshutils-0.38.1-py3-none-any.whl
Algorithm Hash digest
SHA256 0941768dd19954f7b4c639cd09defd1a3aa7e221ace7110f7ca9e9a421707b27
MD5 e277f1d903f94a93219029266cbc6418
BLAKE2b-256 2799acd6ec6068f021c89fb4dec126eb32aeff259359f116e24c0ede2349e64a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page