Skip to main content

Hashed Random Projection layer for PyTorch

Project description

PyPI version PyPi downloads

torch-hrp

Hashed Random Projection layer for PyTorch.

Usage

Hashed Random Projections (HRP), binary representations, encoding/decoding for storage (notebook)

Generate a HRP layer with a new hyperplane

The random projection or hyperplane is randomly initialized. The initial state of the PRNG (random_state) is required (Default: 42) to ensure reproducibility.

import torch_hrp as thrp
import torch

BATCH_SIZE = 32
NUM_FEATURES = 64
OUTPUT_SIZE = 1024

# demo inputs
inputs = torch.randn(size=(BATCH_SIZE, NUM_FEATURES))

# instantiate layer 
layer = thrp.HashedRandomProjection(
    output_size=OUTPUT_SIZE,
    input_size=NUM_FEATURES,
    random_state=42   # Default: 42
)

# run it
outputs = layer(inputs)
assert outputs.shape == (BATCH_SIZE, OUTPUT_SIZE)

Instiantiate HRP layer with given hyperplane

import torch_hrp as thrp
import torch

BATCH_SIZE = 32
NUM_FEATURES = 64
OUTPUT_SIZE = 1024

# demo inputs
inputs = torch.randn(size=(BATCH_SIZE, NUM_FEATURES))

# use an existing hyperplane
myhyperplane = torch.randn(size=(NUM_FEATURES, OUTPUT_SIZE))

# init layer
layer = thrp.HashedRandomProjection(hyperplane=myhyperplane)

# run it
outputs = layer(inputs)

Appendix

Installation

The torch-hrp git repo is available as PyPi package

pip install torch-hrp
pip install git+ssh://git@github.com/satzbeleg/torch-hrp.git

Install a virtual environment

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt --no-cache-dir
pip install -r requirements-dev.txt --no-cache-dir
pip install -r requirements-demo.txt --no-cache-dir

(If your git repo is stored in a folder with whitespaces, then don't use the subfolder .venv. Use an absolute path without whitespaces.)

Python commands

  • Jupyter for the examples: jupyter lab
  • Check syntax: flake8 --ignore=F401 --exclude=$(grep -v '^#' .gitignore | xargs | sed -e 's/ /,/g')
  • Run Unit Tests: PYTHONPATH=. pytest

Publish

# pandoc README.md --from markdown --to rst -s -o README.rst
python setup.py sdist 
twine upload -r pypi dist/*

Clean up

find . -type f -name "*.pyc" | xargs rm
find . -type d -name "__pycache__" | xargs rm -r
rm -r .pytest_cache
rm -r .venv

Support

Please open an issue for support.

Contributing

Please contribute using Github Flow. Create a branch, add commits, and open a pull request.

Acknowledgements

The "Evidence" project was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - 433249742 (GU 798/27-1; GE 1119/11-1).

Citation

Please cite the arXiv Preprint when using this software for any purpose.

@misc{hamster2023rediscovering,
      title={Rediscovering Hashed Random Projections for Efficient Quantization of Contextualized Sentence Embeddings}, 
      author={Ulf A. Hamster and Ji-Ung Lee and Alexander Geyken and Iryna Gurevych},
      year={2023},
      eprint={2304.02481},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

torch-hrp-0.1.0.tar.gz (8.3 kB view hashes)

Uploaded Source

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page