nshutils

nshutils is a collection of utility functions and classes that I've found useful in my day-to-day work as an ML researcher. This library includes utilities for typechecking, logging, and saving/loading activations from neural networks.

Installation

To install nshutils, simply run:

pip install nshutils

Features

Typechecking

nshutils provides a simple way to typecheck your code using the jaxtyping library. Call typecheck_this_module() at the top of your package (i.e., in the root __init__.py file) to enable typechecking for the entire package:

from nshutils.typecheck import typecheck_this_module

typecheck_this_module()

You can also use the tassert function to assert that a value is of a certain type:

import torch

import nshutils.typecheck as tc

def my_function(x: tc.Float[torch.Tensor, "bsz seq len"]) -> tc.Float[torch.Tensor, "bsz seq len"]:
    # Assert at runtime that x is a float tensor with three named axes
    tc.tassert(tc.Float[torch.Tensor, "bsz seq len"], x)
    ...

Logging

nshutils provides a simple way to configure logging for your project. Simply call one of the logging setup functions:

from nshutils.logging import init_python_logging

init_python_logging()

This configures logging with pretty formatting for PyTorch tensors and NumPy arrays (inspired by and/or using lovely-numpy and lovely-tensors), and enables rich logging if the rich library is installed.
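The "enable rich logging if rich is installed" behavior is a common optional-dependency pattern. The sketch below shows one way it can be done with the standard library and rich's RichHandler. It is an illustration under assumptions, not nshutils's actual implementation; make_handler and the logger name are hypothetical.

```python
import logging


def make_handler() -> logging.Handler:
    """Return a rich handler when rich is importable, else a plain stream handler."""
    try:
        from rich.logging import RichHandler  # optional dependency
    except ImportError:
        return logging.StreamHandler()
    return RichHandler()


# Hypothetical logger name, used only for this sketch
logger = logging.getLogger("nshutils_logging_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(make_handler())
logger.info("logging configured")
```

Importing inside the function keeps rich a soft dependency: the module still imports cleanly when rich is missing.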

Activation Saving/Loading

nshutils provides a simple way to save and load activations from neural networks. To save activations, use the ActSave object:

import torch

from nshutils import ActSave

def my_model_forward(x):
    ...
    # Save activations to "{save_dir}/encoder.activations/{idx}.npy"
    ActSave({"encoder.activations": x})

    # Equivalent to the above
    with ActSave.context("encoder"):
        ActSave(activations=x)
    ...

ActSave.enable(save_dir="path/to/activations")
x = torch.randn(...)
my_model_forward(x)
# Activations are saved to disk under the "path/to/activations" directory

This will save the x tensor to disk under the encoder prefix.
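To see why the two calls in the example are equivalent, it may help to picture how a context prefix and an activation name combine into the on-disk path quoted in the comment. The sketch below only mirrors that path template; qualified_name and activation_path are hypothetical helpers, and the real library may add further directory levels.

```python
from pathlib import Path


def qualified_name(contexts: list[str], name: str) -> str:
    """Join context prefixes and the activation name with dots (assumed convention)."""
    return ".".join([*contexts, name])


def activation_path(save_dir: str, name: str, idx: int) -> Path:
    """Build "{save_dir}/{name}/{idx}.npy", following the comment in the example."""
    return Path(save_dir) / name / f"{idx}.npy"


# ActSave({"encoder.activations": x}) uses the full name directly
direct = qualified_name([], "encoder.activations")
# ActSave(activations=x) inside ActSave.context("encoder") gets the prefix from the context
nested = qualified_name(["encoder"], "activations")

print(direct == nested)
print(activation_path("path/to/activations", nested, 0))
```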

Activation Filtering

ActSave supports filtering to selectively save only certain activations based on fnmatch patterns. This is useful for reducing storage space and focusing on specific model components:

from nshutils import ActSave

# Only save activations matching "layer*" or "attention*" patterns
filters = ["layer*", "attention*"]

with ActSave.enabled(save_dir="path/to/activations", filters=filters):
    # These will be saved (match filters)
    ActSave(
        layer1_output=x1,
        layer2_hidden=x2,
        attention_weights=x3
    )

    # These will NOT be saved (don't match filters)
    ActSave(
        decoder_output=x4,
        embedding_vector=x5
    )

The filtering patterns support standard Unix shell-style wildcards:

  • * matches everything
  • ? matches any single character
  • [seq] matches any character in seq
  • [!seq] matches any character not in seq
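The wildcard rules above can be tried out directly with Python's fnmatch module. This is a standalone sketch: the activation names are made up, and using fnmatchcase (case-sensitive on every platform) rather than fnmatch is an assumption about how ActSave matches.

```python
from fnmatch import fnmatchcase

# Hypothetical activation names, for illustration only
names = [
    "layer1_output",
    "layer2_hidden",
    "layerA_output",
    "attention_weights",
    "decoder_output",
]


def select(pattern: str) -> list[str]:
    """Return the names that match a single shell-style pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]


print(select("layer*"))         # * matches any run of characters
print(select("layer?_output"))  # ? matches exactly one character
print(select("layer[12]_*"))    # [seq] matches one character from the set
print(select("layer[!12]_*"))   # [!seq] matches one character outside the set
```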

Contextual Filtering

Filters work with context prefixes, allowing you to save activations from specific model components:

# Only save activations from encoder layers
filters = ["encoder.*"]

with ActSave.enabled(save_dir="path/to/activations", filters=filters):
    # Decoder context - these won't be saved
    with ActSave.context("decoder"):
        ActSave(layer1_output=x1, attention=x2)

    # Encoder context - these will be saved
    with ActSave.context("encoder"):
        ActSave(layer1_output=x3, attention=x4)  # Saved as "encoder.layer1_output", "encoder.attention"
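The interaction between context prefixes and filters can be sketched with fnmatch alone. The sketch assumes that patterns are matched against the full dotted name (context, dot, activation name); is_saved is a hypothetical helper, not part of nshutils.

```python
from fnmatch import fnmatchcase

filters = ["encoder.*"]


def is_saved(context: str, name: str) -> bool:
    """Match the dotted name "context.name" against every filter pattern."""
    qualified = f"{context}.{name}"
    return any(fnmatchcase(qualified, pattern) for pattern in filters)


print(is_saved("decoder", "layer1_output"))  # False
print(is_saved("encoder", "layer1_output"))  # True

# Under this assumption, a pattern without the prefix does not match a prefixed name
print(fnmatchcase("encoder.layer1_output", "layer*"))   # False
print(fnmatchcase("encoder.layer1_output", "*layer*"))  # True
```

If matching does work on the full name, patterns written for un-prefixed activations need a leading wildcard or an explicit prefix once contexts are in use.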

Dynamic Filter Updates

You can update filters during runtime:

ActSave.enable(save_dir="path/to/activations")

# Initially no filters - all activations saved
ActSave(layer1_output=x1, attention_weights=x2)

# Update to only save layer outputs
ActSave.set_filters(["layer*"])
ActSave(layer2_output=x3, decoder_output=x4)  # Only layer2_output saved

# Check current filters
current_filters = ActSave.filters  # Returns ["layer*"]

# Clear filters
ActSave.set_filters(None)

Environment Variable Configuration

You can configure ActSave and filtering through environment variables:

# Enable ActSave with default temp directory
export ACTSAVE=1

# Enable ActSave with specific directory
export ACTSAVE="/path/to/activations"

# Set filters (comma-separated patterns)
export ACTSAVE_FILTERS="layer*,attention*,encoder.*"

# Combine both
export ACTSAVE="/path/to/activations"
export ACTSAVE_FILTERS="layer*,attention*"

The ACTSAVE_FILTERS environment variable supports:

  • Comma-separated patterns: "layer*,attention*,decoder.*"
  • Whitespace handling: Extra spaces around commas are automatically trimmed
  • Empty values: Empty string or only commas/spaces result in no filtering

To load activations, use the ActLoad class:

from nshutils import ActLoad

act_load = ActLoad.from_latest_version("path/to/activations")
encoder_acts = act_load["encoder"]

for act in encoder_acts:
    print(act.shape)

This will load all of the activations saved under the encoder prefix.

Other Utilities

nshutils also provides a few other utility functions/classes:

  • snoop: A simple way to debug your code using the pysnooper library, based on the torchsnooper library.
  • apply_to_collection: Recursively apply a function to all elements of a collection that match a certain type, taken from the pytorch-lightning library.
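To give a feel for what "recursively apply a function to all elements that match a certain type" means, here is a simplified standalone sketch. It is not the nshutils or pytorch-lightning implementation: apply_to_matching is a hypothetical name, and it only handles dicts, lists, and tuples.

```python
from typing import Any, Callable


def apply_to_matching(data: Any, dtype: type, fn: Callable[[Any], Any]) -> Any:
    """Recursively apply fn to every element of data that is an instance of dtype."""
    if isinstance(data, dtype):
        return fn(data)
    if isinstance(data, dict):
        return {key: apply_to_matching(value, dtype, fn) for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        # Rebuild the same container type from the transformed elements
        return type(data)(apply_to_matching(value, dtype, fn) for value in data)
    # Leave anything else untouched
    return data


batch = {"a": 1, "b": [2, "x", (3.0, 4)]}
scaled = apply_to_matching(batch, int, lambda value: value * 10)
print(scaled)
```

In ML code the same idea is typically used with dtype set to a tensor class, for example to move every tensor in a nested batch to another device.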

Contributing

Contributions to nshutils are welcome! Please open an issue or submit a pull request on the GitHub repository.

License

nshutils is released under the MIT License. See the LICENSE file for more details.
