nshutils

nshutils is a collection of utility functions and classes that I've found useful in my day-to-day work as an ML researcher. This library includes utilities for typechecking, logging, and saving/loading activations from neural networks.

Installation

To install nshutils, simply run:

pip install nshutils

Features

Typechecking

nshutils provides a simple way to typecheck your code using the jaxtyping library. Simply call typecheck_this_module() at the top of your module (i.e., in the root __init__.py file) to enable typechecking for the entire module:

from nshutils.typecheck import typecheck_this_module

typecheck_this_module()

You can also use the tassert function to assert that a value is of a certain type:

import torch
import nshutils.typecheck as tc

def my_function(x: tc.Float[torch.Tensor, "bsz seq len"]) -> tc.Float[torch.Tensor, "bsz seq len"]:
    tc.tassert(tc.Float[torch.Tensor, "bsz seq len"], x)
    ...
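To illustrate what a shape annotation such as "bsz seq len" checks, here is a minimal, standard-library-only sketch. It is not how jaxtyping or nshutils implement the check; `bind_dims` and the `bindings` dictionary are hypothetical names, and only plain dimension names are modeled (jaxtyping's annotation syntax is much richer).

```python
def bind_dims(spec: str, shape: tuple, bindings: dict) -> None:
    """Check `shape` against a space-separated `spec`, reusing earlier bindings."""
    names = spec.split()
    if len(names) != len(shape):
        raise TypeError(f"expected {len(names)} dims for {spec!r}, got shape {shape}")
    for name, size in zip(names, shape):
        # The first sighting of a name fixes its size; later sightings must agree.
        if bindings.setdefault(name, size) != size:
            raise TypeError(f"dim {name!r} was {bindings[name]}, now {size}")


bindings = {}
bind_dims("bsz seq len", (2, 5, 8), bindings)  # binds bsz=2, seq=5, len=8
bind_dims("bsz seq", (2, 5), bindings)         # agrees with the bindings above
```

The point of the sketch is that a dimension name acts as a variable: once "bsz" is bound to 2, any later shape that uses "bsz" with a different size is rejected.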

Logging

nshutils provides a simple way to configure logging for your project. Simply call one of the logging setup functions:

from nshutils.logging import init_python_logging

init_python_logging()

This will configure logging to use pretty formatting for PyTorch tensors and numpy arrays (inspired by and/or utilizing lovely-numpy and lovely-tensors), and will also enable rich logging if the rich library is installed.
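The "use rich if it is installed" behavior can be pictured with the following sketch. It is an assumption-laden illustration, not the body of init_python_logging: the function name `init_logging_sketch` and the fallback format string are made up here, and the tensor/array pretty-printing is not modeled at all.

```python
import logging


def init_logging_sketch(level: int = logging.INFO) -> logging.Handler:
    """Configure the root logger, preferring rich's handler when it is importable."""
    try:
        from rich.logging import RichHandler  # optional dependency

        handler: logging.Handler = RichHandler()
    except ImportError:
        # Fall back to a plain stream handler with a conventional format.
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
    # force=True replaces any handlers that were installed earlier.
    logging.basicConfig(level=level, handlers=[handler], force=True)
    return handler


handler = init_logging_sketch()
logging.getLogger(__name__).info("logging configured with %s", type(handler).__name__)
```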

Activation Saving/Loading

nshutils provides a simple way to save and load activations from neural networks. To save activations, use the ActSave object:

import torch
from nshutils import ActSave

def my_model_forward(x):
    ...
    # Save activations to "{save_dir}/encoder.activations/{idx}.npy"
    ActSave({"encoder.activations": x})

    # Equivalent to the above
    with ActSave.context("encoder"):
        ActSave(activations=x)
    ...

ActSave.enable(save_dir="path/to/activations")
x = torch.randn(...)
my_model_forward(x)
# Activations are saved to disk under the "path/to/activations" directory

This will save the x tensor to disk under the encoder prefix.

Activation Filtering

ActSave supports filtering to selectively save only certain activations based on fnmatch patterns. This is useful for reducing storage space and focusing on specific model components:

from nshutils import ActSave

# Only save activations matching "layer*" or "attention*" patterns
filters = ["layer*", "attention*"]

with ActSave.enabled(save_dir="path/to/activations", filters=filters):
    # These will be saved (match filters)
    ActSave(
        layer1_output=x1,
        layer2_hidden=x2,
        attention_weights=x3
    )

    # These will NOT be saved (don't match filters)
    ActSave(
        decoder_output=x4,
        embedding_vector=x5
    )

The filtering patterns support standard Unix shell-style wildcards:

  • * matches everything
  • ? matches any single character
  • [seq] matches any character in seq
  • [!seq] matches any character not in seq
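The four wildcard forms above can be tried directly with Python's standard fnmatch module. This sketch uses fnmatchcase (case-sensitive on every platform); whether ActSave calls fnmatch or fnmatchcase internally is an assumption, and the activation names are taken from the example earlier in this section.

```python
from fnmatch import fnmatchcase

names = ["layer1_output", "layer2_hidden", "attention_weights", "decoder_output"]


def matching(pattern: str) -> list:
    """Return the names that match a shell-style pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]


star = matching("layer*")             # * matches any run of characters
question = matching("layer?_output")  # ? matches exactly one character
in_seq = matching("layer[12]_*")      # [12] matches "1" or "2"
not_in_seq = matching("[!l]*")        # [!l] matches any first character except "l"
```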

Contextual Filtering

Filters work with context prefixes, allowing you to save activations from specific model components:

# Only save activations from encoder layers
filters = ["encoder.*"]

with ActSave.enabled(save_dir="path/to/activations", filters=filters):
    # Decoder context - these won't be saved
    with ActSave.context("decoder"):
        ActSave(layer1_output=x1, attention=x2)

    # Encoder context - these will be saved
    with ActSave.context("encoder"):
        ActSave(layer1_output=x3, attention=x4)  # Saved as "encoder.layer1_output", "encoder.attention"
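One way to read the example above is that each pattern is tested against the fully prefixed name. The sketch below models that reading with a hypothetical `should_save` helper; it is an illustration under that assumption, not the nshutils implementation.

```python
from fnmatch import fnmatchcase
from typing import List, Optional


def should_save(qualified_name: str, filters: Optional[List[str]]) -> bool:
    """Return True when no filters are set or any pattern matches the full name."""
    if not filters:
        return True
    return any(fnmatchcase(qualified_name, pattern) for pattern in filters)


filters = ["encoder.*"]
candidates = [
    "decoder.layer1_output",
    "decoder.attention",
    "encoder.layer1_output",
    "encoder.attention",
]
saved = [name for name in candidates if should_save(name, filters)]
```

Under this assumption, a bare pattern such as "layer*" would not match "encoder.layer1_output", because the match covers the whole name. It is worth checking how nshutils behaves before relying on that.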

Dynamic Filter Updates

You can update filters during runtime:

ActSave.enable(save_dir="path/to/activations")

# Initially no filters - all activations saved
ActSave(layer1_output=x1, attention_weights=x2)

# Update to only save layer outputs
ActSave.set_filters(["layer*"])
ActSave(layer2_output=x3, decoder_output=x4)  # Only layer2_output saved

# Check current filters
current_filters = ActSave.filters  # Returns ["layer*"]

# Clear filters
ActSave.set_filters(None)

Environment Variable Configuration

You can configure ActSave and filtering through environment variables:

# Enable ActSave with default temp directory
export ACTSAVE=1

# Enable ActSave with specific directory
export ACTSAVE="/path/to/activations"

# Set filters (comma-separated patterns)
export ACTSAVE_FILTERS="layer*,attention*,encoder.*"

# Combine both
export ACTSAVE="/path/to/activations"
export ACTSAVE_FILTERS="layer*,attention*"

The ACTSAVE_FILTERS environment variable supports:

  • Comma-separated patterns: "layer*,attention*,decoder.*"
  • Whitespace handling: Extra spaces around commas are automatically trimmed
  • Empty values: Empty string or only commas/spaces result in no filtering
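The three parsing rules above can be sketched in a few lines. `parse_filters` is a hypothetical helper written for illustration, and returning None to mean "no filtering" mirrors ActSave.set_filters(None) but is an assumption about the internal representation.

```python
import os
from typing import List, Optional


def parse_filters(raw: Optional[str]) -> Optional[List[str]]:
    """Split a comma-separated pattern string; return None when nothing remains."""
    if raw is None:
        return None
    # Trim whitespace around each pattern and drop empty entries.
    patterns = [part.strip() for part in raw.split(",")]
    patterns = [p for p in patterns if p]
    return patterns or None


# Read the variable if it is set; None means "save everything" in this sketch.
filters = parse_filters(os.environ.get("ACTSAVE_FILTERS"))
```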

To load activations, use the ActLoad class:

from nshutils import ActLoad

act_load = ActLoad.from_latest_version("path/to/activations")
encoder_acts = act_load["encoder"]

for act in encoder_acts:
    print(act.shape)

This will load all of the activations saved under the encoder prefix.

Other Utilities

nshutils also provides a few other utility functions/classes:

  • snoop: A debugging helper built on the pysnooper library and modeled on the torchsnooper library.
  • apply_to_collection: Recursively apply a function to all elements of a collection that match a certain type, taken from the pytorch-lightning library.
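To show what "recursively apply a function to all elements that match a certain type" means in practice, here is a reduced sketch. It is not the nshutils or pytorch-lightning implementation: the name `apply_to_collection_sketch` is made up, and only dicts, lists, and plain tuples are handled (namedtuples and other containers are ignored).

```python
from typing import Any, Callable


def apply_to_collection_sketch(
    data: Any, dtype: type, function: Callable[[Any], Any]
) -> Any:
    """Apply `function` to every element of type `dtype`, keeping the container layout."""
    if isinstance(data, dtype):
        return function(data)
    if isinstance(data, dict):
        return {
            key: apply_to_collection_sketch(value, dtype, function)
            for key, value in data.items()
        }
    if isinstance(data, (list, tuple)):
        return type(data)(
            apply_to_collection_sketch(value, dtype, function) for value in data
        )
    return data  # anything else is returned unchanged


batch = {"ids": [1, 2, 3], "meta": ("a", 4), "scale": 0.5}
doubled = apply_to_collection_sketch(batch, int, lambda value: value * 2)
```

In the example, only the integers are doubled; the string and the float pass through untouched, and the dict/list/tuple nesting is preserved.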

Contributing

Contributions to nshutils are welcome! Please open an issue or submit a pull request on the GitHub repository.

License

nshutils is released under the MIT License. See the LICENSE file for more details.
