Equilibrated Input Embedding Initialization (EIEI)
numpy-eiei : Equilibrated Input Embedding Initialization (EIEI)
EIEI is a procedure to initialize the weights of an input embedding.
Usage
# Load some data
corpus = """Lorem ipsum dolor sit amet, ..."""
# Build a token list
import kshingle as ks
tokens = [c for c in corpus]
TOKENLIST = list(set(tokens))
TOKENLIST.append("[UNK]")
TOKENLIST.append("[MASK]")
tokenlist_size = len(TOKENLIST)
encoded = ks.encode_with_vocab(tokens, TOKENLIST, tokenlist_size - 2)
# Initialize the Embedding with the EIEI algorithm
from numpy_eiei import eiei
emb = eiei(
    encoded,
    tokenlist_size,
    embed_dim=300,
    max_context_size=14,
    max_patience=6,
    pct_add=0.1,
    fill=False
)
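The token list step in the usage example maps every character of the corpus to an integer ID, with the second-to-last position reserved for "[UNK]". The following is a minimal pure-Python sketch of that mapping, assuming that ks.encode_with_vocab replaces each token by its index in the token list and falls back to the given ID for unknown tokens. The helper encode_tokens is hypothetical and is not part of kshingle or numpy-eiei.

```python
def encode_tokens(tokens, vocab, unk_id):
    """Map each token to its index in `vocab`; unseen tokens get `unk_id`.

    Hypothetical stand-in for the assumed behavior of ks.encode_with_vocab.
    """
    lookup = {tok: idx for idx, tok in enumerate(vocab)}
    return [lookup.get(tok, unk_id) for tok in tokens]


corpus = "Lorem ipsum dolor sit amet"
tokens = list(corpus)

# sorted() makes the ID assignment reproducible across runs;
# a plain list(set(...)) has no guaranteed order.
vocab = sorted(set(tokens))
vocab.extend(["[UNK]", "[MASK]"])
unk_id = len(vocab) - 2  # position of "[UNK]"

encoded = encode_tokens(tokens, vocab, unk_id)
unseen = encode_tokens(list("xyz"), vocab, unk_id)  # characters not in the corpus
```

Sorting the token set is a design choice of this sketch: it keeps the IDs stable between runs, which matters if an initialized embedding is saved and later reused with a rebuilt token list.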
Appendix
Installation
The numpy-eiei package is available on PyPI:
pip install numpy-eiei
Alternatively, install it directly from the GitHub repository:
pip install git+ssh://git@github.com/ulf1/numpy-eiei.git
Install a virtual environment
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt --no-cache-dir
pip install -r requirements-dev.txt --no-cache-dir
pip install -r requirements-demo.txt --no-cache-dir
(If your git repo is stored in a folder whose path contains whitespace, don’t use the subfolder .venv. Use an absolute path without whitespace instead.)
Python commands
Jupyter for the examples: jupyter lab
Check syntax: flake8 --ignore=F401 --exclude=$(grep -v '^#' .gitignore | xargs | sed -e 's/ /,/g')
Run Unit Tests: PYTHONPATH=. pytest
Publish
pandoc README.md --from markdown --to rst -s -o README.rst
python setup.py sdist
twine upload -r pypi dist/*
Clean up
find . -type f -name "*.pyc" | xargs rm
find . -type d -name "__pycache__" | xargs rm -r
rm -r .pytest_cache
rm -r .venv
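On systems without find and xargs, the same clean-up can be approximated in Python. This is a minimal sketch using only the standard library; the function name clean_project is hypothetical and not part of the repository.

```python
import shutil
from pathlib import Path


def clean_project(root="."):
    """Delete *.pyc files, __pycache__ folders, .pytest_cache and .venv under `root`.

    Returns the list of paths that were removed.
    """
    root = Path(root)
    removed = []
    # Collect matches first so that deleting does not disturb the traversal.
    for pyc in list(root.rglob("*.pyc")):
        if pyc.is_file():
            pyc.unlink()
            removed.append(pyc)
    for cache in list(root.rglob("__pycache__")):
        if cache.is_dir():
            shutil.rmtree(cache)
            removed.append(cache)
    # These two folders only exist at the top level of the project.
    for name in (".pytest_cache", ".venv"):
        target = root / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed
```

Calling clean_project() from the repository root removes the same files and folders as the shell commands; missing folders are skipped instead of raising an error.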
Support
Please open an issue for support.
Contributing
Please contribute using GitHub Flow. Create a branch, add commits, and open a pull request.