TextSlinger: Fast and accurate text predictions in Python

TextSlinger is a Python library for making text predictions with several types of language models. Current features:

  • Predict the distribution over the next character given the previous text.
  • Predict the most likely next words given the previous text and the prefix of the current word.
  • Supports:
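To make the first two features concrete, here is a toy sketch of what "a distribution over the next character" and "the most likely next words given a prefix" mean. It uses simple bigram counts over a tiny made-up corpus. The names `CORPUS`, `next_char_distribution`, and `predict_words` are hypothetical and are not part of the TextSlinger API; the library's own models and interface will differ.

```python
from collections import Counter, defaultdict

# Toy training text (an assumption for illustration only).
CORPUS = "the cat sat on the mat and the cow ate the corn"


def next_char_distribution(context: str, corpus: str = CORPUS) -> dict[str, float]:
    """Estimate P(next char | last char of context) from character bigram counts."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    # With no context, fall back to plain character frequencies.
    followers = counts[context[-1]] if context else Counter(corpus)
    total = sum(followers.values())
    return {ch: n / total for ch, n in followers.items()} if total else {}


def predict_words(left_context: str, prefix: str, corpus: str = CORPUS, k: int = 3) -> list[str]:
    """Rank words that start with `prefix`, preferring ones seen after the previous word."""
    words = corpus.split()
    previous = left_context.split()[-1] if left_context.split() else None
    after_previous = Counter(
        b for a, b in zip(words, words[1:]) if a == previous and b.startswith(prefix)
    )
    # Back off to overall word frequency if the previous word gives no candidates.
    candidates = after_previous or Counter(w for w in words if w.startswith(prefix))
    return [word for word, _ in candidates.most_common(k)]


if __name__ == "__main__":
    print(next_char_distribution("th"))
    print(predict_words("the cat sat on the", "m"))
```

A real language model conditions on much more context than one character or one word, but the shape of the output is the same: a probability for each candidate character, or a ranked list of candidate words.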

Developer setup

Our code style is whatever the Black formatter says it should be. You should configure your IDE to format using Black when you save.

Setting up a Python environment

If Miniforge is not already installed in your user account, install it first.

To install Miniforge on macOS on Apple Silicon:

curl -LO https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
zsh Miniforge3-MacOSX-arm64.sh
~/miniforge3/bin/conda init zsh

To install Miniforge on Linux:

curl -LO https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh

After installing Miniforge, close your terminal and open a new one. Then create an environment as follows:

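# Reset the channel list (this may report an error if no channels are configured yet; that is safe to ignore)
conda config --remove-key channels
# Use conda-forge as the only channel, with strict priority
conda config --add channels conda-forge
conda config --set channel_priority strict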
conda create -n textslinger python=3.10 -y
conda activate textslinger
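
Before installing anything else, it can help to confirm that the new environment is the one actually in use. The sketch below checks the `CONDA_DEFAULT_ENV` variable, which `conda activate` sets, and the interpreter version. The function name `check_environment` and its parameters are hypothetical helpers for illustration, not part of TextSlinger.

```python
import os
import sys


def check_environment(expected_env="textslinger", expected_python=(3, 10),
                      environ=None, version_info=None):
    """Return a list of problems; an empty list means the setup looks right."""
    environ = os.environ if environ is None else environ
    version_info = sys.version_info if version_info is None else version_info
    problems = []

    # conda activate exports the name of the active environment here.
    active = environ.get("CONDA_DEFAULT_ENV")
    if active != expected_env:
        problems.append(f"active conda environment is {active!r}, expected {expected_env!r}")

    # Compare only major and minor, since the patch release does not matter.
    found = tuple(version_info[:2])
    if found != tuple(expected_python):
        problems.append(f"Python is {found[0]}.{found[1]}, expected "
                        f"{expected_python[0]}.{expected_python[1]}")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment looks good.")
```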

Installation of PyTorch

macOS on Apple Silicon:

pip install torch torchvision torchaudio

Linux with CUDA support (the GPU driver must support the CUDA version of the library you install, or a newer one; run nvidia-smi to check which CUDA version your driver supports):

# CUDA 11.8 
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Linux without CUDA support:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
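
The choice among the three Linux commands above can be sketched in code. This example assumes that the header printed by nvidia-smi contains a field of the form `CUDA Version: X.Y`, and it picks the newest of the index URLs listed above that the driver supports. The helper names `driver_cuda_version` and `pick_index_url` are hypothetical.

```python
import re
import subprocess

# Index URLs copied from the install commands above, keyed by CUDA build.
INDEX_URLS = {
    (12, 1): "https://download.pytorch.org/whl/cu121",
    (11, 8): "https://download.pytorch.org/whl/cu118",
}
CPU_INDEX_URL = "https://download.pytorch.org/whl/cpu"


def driver_cuda_version(nvidia_smi_output: str):
    """Extract (major, minor) from a 'CUDA Version: X.Y' field, or None if absent."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def pick_index_url(nvidia_smi_output: str) -> str:
    """Pick the newest listed CUDA build that the driver supports, else the CPU build."""
    version = driver_cuda_version(nvidia_smi_output)
    if version is None:
        return CPU_INDEX_URL
    for build in sorted(INDEX_URLS, reverse=True):
        if version >= build:  # the driver must support the build's version or newer
            return INDEX_URLS[build]
    return CPU_INDEX_URL


if __name__ == "__main__":
    try:
        output = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
    except OSError:
        output = ""  # nvidia-smi not found: treat as a machine without CUDA
    print("pip install torch torchvision torchaudio --index-url", pick_index_url(output))
```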

Test if the PyTorch installation worked:

python - <<'EOF'
import torch
print("Torch version:", torch.__version__)
print("MPS available:", torch.backends.mps.is_available())
print("MPS built:", torch.backends.mps.is_built())
print("CUDA available:", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)
print("CUDA device count:", torch.cuda.device_count())

if torch.backends.mps.is_available():
    x = torch.randn(2, 2, device="mps")
    print("Tensor device:", x.device)
elif torch.cuda.is_available():
    x = torch.randn(2, 2, device="cuda")
    print("Tensor device:", x.device)
else:
    x = torch.randn(2, 2)
    print("Tensor device:", x.device)
EOF

Installation of libraries

Install transformers (5.2.0 or greater required).

pip install transformers

Check transformers version and model support:

python - <<'EOF'
import transformers

print("Version:", transformers.__version__)
print("File:", transformers.__file__)

# Check for BLT symbols that do NOT exist in stable 5.0.0
try:
    from transformers.models.blt.modeling_blt import BltModel
    print("BLT model available ✅")
except Exception as e:
    print("BLT model missing ❌", e)
EOF
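
The script above prints the installed version but does not compare it against the required minimum. A minimal sketch of that comparison, using only the standard library, might look like the following. The helpers `parse_release` and `meets_minimum` are hypothetical, and the parsing is deliberately simple: it ignores pre-release and dev suffixes, so a version such as 5.2.0rc1 is treated the same as 5.2.0.

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Turn '5.2.0.dev0' into (5, 2, 0), padding short versions like '5.2' with zeros."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "5.2.0") -> bool:
    """Return True if the installed release is at least the minimum release."""
    return parse_release(installed) >= parse_release(minimum)


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        verdict = "OK" if meets_minimum(installed) else "too old, upgrade with pip"
        print(f"transformers {installed}: {verdict}")
```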

Install other dependencies:

pip install pytest scipy peft psutil datasets

# NOTE: increase MAX_ORDER if you plan to load n-gram models with longer context 
MAX_ORDER=12 pip install https://github.com/kpu/kenlm/archive/master.zip
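
To decide whether MAX_ORDER=12 is enough, you can read the order of an n-gram model from its file header. The sketch below assumes the model is a plain-text ARPA file, whose header lists one `ngram N=count` line per order after a `\data\` marker; it does not handle KenLM binary files. The function name `arpa_order` and the file name in the usage note are hypothetical.

```python
import re


def arpa_order(lines) -> int:
    """Return the highest n-gram order declared in an ARPA header, or 0 if none is found."""
    order = 0
    in_header = False
    for line in lines:
        line = line.strip()
        if line == "\\data\\":
            in_header = True
        elif in_header:
            match = re.fullmatch(r"ngram\s+(\d+)\s*=\s*\d+", line)
            if match:
                order = max(order, int(match.group(1)))
            elif line:
                break  # first non-blank line that is not an "ngram" entry ends the header
    return order
```

For example, `with open("model.arpa") as f: print(arpa_order(f))` prints the order of that model; if the result is larger than 12, build KenLM with a correspondingly larger MAX_ORDER.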

To silence a harmless warning message, reinstall setuptools:

pip install --upgrade --force-reinstall setuptools

Testing installation

Download assets needed by the test suite and then run it:

cd textslinger/assets
./download.sh
cd ..
pytest -v -rs

This material is based upon work supported by the NSF under Grant Nos. IIS-1909089 and IIS-2402876.
