Skip to main content

No project description provided

Project description

nshutils

nshutils is a collection of utility functions and classes that I've found useful in my day-to-day work as an ML researcher. This library includes utilities for typechecking, logging, and saving/loading activations from neural networks.

Installation

To install nshutils, simply run:

pip install nshutils

Features

Typechecking

nshutils provides a simple way to typecheck your code using the jaxtyping library. Simply call typecheck_this_module() at the top of your module (i.e., in the root __init__.py file) to enable typechecking for the entire module:

from nshutils.typecheck import typecheck_this_module

typecheck_this_module()

You can also use the tassert function to assert that a value is of a certain type:

import nshutils.typecheck as tc

def my_function(x: tc.Float[torch.Tensor, "bsz seq len"]) -> tc.Float[torch.Tensor, "bsz seq len"]:
    tc.tassert(tc.Float[torch.Tensor, "bsz seq len"], x)
    ...

Logging

nshutils provides a simple way to configure logging for your project. Simply call one of the logging setup functions:

from nshutils.logging import init_python_logging

init_python_logging()

This will configure logging to use pretty formatting for PyTorch tensors and numpy arrays (inspired by and/or utilizing lovely-numpy and lovely-tensors), and will also enable rich logging if the rich library is installed.

Activation Saving/Loading

nshutils provides a simple way to save and load activations from neural networks. To save activations, use the ActSave object:

from nshutils import ActSave

def my_model_forward(x):
    ...
    # Save activations to "{save_dir}/encoder.activations/{idx}.npy"
    ActSave({"encoder.activations": x})

    # Equivalent to the above
    with ActSave.context("encoder"):
        ActSave(activations=x)
    ...

ActSave.enable(save_dir="path/to/activations")
x = torch.randn(...)
my_model_forward(x)
# Activations are saved to disk under the "path/to/activations" directory

This will save the x tensor to disk under the encoder prefix.

Activation Filtering

ActSave supports filtering to selectively save only certain activations based on fnmatch patterns. This is useful for reducing storage space and focusing on specific model components:

from nshutils import ActSave

# Only save activations matching "layer*" or "attention*" patterns
filters = ["layer*", "attention*"]

with ActSave.enabled(save_dir="path/to/activations", filters=filters):
    # These will be saved (match filters)
    ActSave(
        layer1_output=x1,
        layer2_hidden=x2,
        attention_weights=x3
    )

    # These will NOT be saved (don't match filters)
    ActSave(
        decoder_output=x4,
        embedding_vector=x5
    )

The filtering patterns support standard Unix shell-style wildcards:

  • * matches everything
  • ? matches any single character
  • [seq] matches any character in seq
  • [!seq] matches any character not in seq
Contextual Filtering

Filters work with context prefixes, allowing you to save activations from specific model components:

# Only save activations from encoder layers
filters = ["encoder.*"]

with ActSave.enabled(save_dir="path/to/activations", filters=filters):
    # Decoder context - these won't be saved
    with ActSave.context("decoder"):
        ActSave(layer1_output=x1, attention=x2)

    # Encoder context - these will be saved
    with ActSave.context("encoder"):
        ActSave(layer1_output=x3, attention=x4)  # Saved as "encoder.layer1_output", "encoder.attention"
Dynamic Filter Updates

You can update filters during runtime:

ActSave.enable(save_dir="path/to/activations")

# Initially no filters - all activations saved
ActSave(layer1_output=x1, attention_weights=x2)

# Update to only save layer outputs
ActSave.set_filters(["layer*"])
ActSave(layer2_output=x3, decoder_output=x4)  # Only layer2_output saved

# Check current filters
current_filters = ActSave.filters  # Returns ["layer*"]

# Clear filters
ActSave.set_filters(None)
Environment Variable Configuration

You can configure ActSave and filtering through environment variables:

# Enable ActSave with default temp directory
export ACTSAVE=1

# Enable ActSave with specific directory
export ACTSAVE="/path/to/activations"

# Set filters (comma-separated patterns)
export ACTSAVE_FILTERS="layer*,attention*,encoder.*"

# Combine both
export ACTSAVE="/path/to/activations"
export ACTSAVE_FILTERS="layer*,attention*"

The ACTSAVE_FILTERS environment variable supports:

  • Comma-separated patterns: "layer*,attention*,decoder.*"
  • Whitespace handling: Extra spaces around commas are automatically trimmed
  • Empty values: Empty string or only commas/spaces result in no filtering

To load activations, use the ActLoad class:

from nshutils import ActLoad

act_load = ActLoad.from_latest_version("path/to/activations")
encoder_acts = act_load["encoder"]

for act in encoder_acts:
    print(act.shape)

This will load all of the activations saved under the encoder prefix.

Other Utilities

nshutils also provides a few other utility functions/classes:

  • snoop: A simple way to debug your code using the pysnooper library, based on the torchsnooper library.
  • apply_to_collection: Recursively apply a function to all elements of a collection that match a certain type, taken from the pytorch-lightning library.

Contributing

Contributions to nshutils are welcome! Please open an issue or submit a pull request on the GitHub repository.

License

nshutils is released under the MIT License. See the LICENSE file for more details.

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

nshutils-0.38.0b0.tar.gz (61.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

nshutils-0.38.0b0-py3-none-any.whl (73.8 kB view details)

Uploaded Python 3

File details

Details for the file nshutils-0.38.0b0.tar.gz.

File metadata

  • Download URL: nshutils-0.38.0b0.tar.gz
  • Upload date:
  • Size: 61.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.12.3 Linux/6.6.87.2-microsoft-standard-WSL2

File hashes

Hashes for nshutils-0.38.0b0.tar.gz
Algorithm Hash digest
SHA256 49c26ea1de9978296e6017c5c125966204be66f679854c1a251ad367cabc215d
MD5 80774efc236146e92ee719050ccf639d
BLAKE2b-256 3315514b981127e4e75202e2170bbdbff01369594f563e9534da12a08ebe3d95

See more details on using hashes here.

File details

Details for the file nshutils-0.38.0b0-py3-none-any.whl.

File metadata

  • Download URL: nshutils-0.38.0b0-py3-none-any.whl
  • Upload date:
  • Size: 73.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.12.3 Linux/6.6.87.2-microsoft-standard-WSL2

File hashes

Hashes for nshutils-0.38.0b0-py3-none-any.whl
Algorithm Hash digest
SHA256 9e723ef4a3e653fafab8df4f3fd0827667bc174f42524239decc11b70816230b
MD5 86e050989123f1712622b6db8d14849f
BLAKE2b-256 93fc3d6b3333dcf0c88990d23a854bfa0ac6cf934aa5d0c9f98b58a46be66d35

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page