
TextSlinger: Fast and accurate text predictions in Python

TextSlinger is a Python library for making text predictions using different types of language models. Current features:

  • Predict the distribution over the next character given the previous text.
  • Predict the most likely next words given the previous text and the prefix of the current word.
  • Supports:
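
As a rough illustration of the two prediction tasks listed above, here is a minimal toy sketch in plain Python. It does not use TextSlinger and does not reflect its API: the function names (`train_char_bigrams`, `next_char_distribution`, `complete_word`) and the tiny corpus are hypothetical, and the simple counting models stand in for the real language models. Unlike the library feature described above, the word completion here uses only the prefix and ignores the previous text.

```python
from collections import Counter, defaultdict


def train_char_bigrams(corpus):
    """Count how often each character follows each other character."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    return counts


def next_char_distribution(counts, context):
    """Return P(next char | last char of context) as a dict."""
    followers = counts.get(context[-1:], Counter())
    total = sum(followers.values())
    return {ch: n / total for ch, n in followers.items()} if total else {}


def complete_word(vocab_counts, prefix, n=3):
    """Return up to n most frequent vocabulary words starting with prefix."""
    matches = [(w, c) for w, c in vocab_counts.items() if w.startswith(prefix)]
    matches.sort(key=lambda wc: (-wc[1], wc[0]))  # by count, then alphabetically
    return [w for w, _ in matches[:n]]


# Hypothetical toy corpus, used only for this illustration.
corpus = "the cat sat on the mat then the hat"
char_counts = train_char_bigrams(corpus)
word_counts = Counter(corpus.split())

dist = next_char_distribution(char_counts, "th")
words = complete_word(word_counts, "th")
print(dist)
print(words)
```

A real language model conditions on much more than the last character, but the shape of the two outputs is the same idea: a probability for each candidate next character, and a ranked list of candidate words.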

Developer setup

Our code style is whatever the Black formatter says it should be. You should configure your IDE to format using Black when you save.

Setting up a Python environment

If Miniforge is not installed in your user account, install it first.

To install Miniforge on macOS with Apple Silicon:

curl -LO https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
zsh Miniforge3-MacOSX-arm64.sh
~/miniforge3/bin/conda init zsh

To install Miniforge on Linux:

curl -LO https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh

After installing Miniforge, be sure to close your terminal and start a new one. Create an environment as follows:

conda config --remove-key channels
conda config --add channels conda-forge
conda config --set channel_priority strict
conda create -n textslinger python=3.10 -y
conda activate textslinger
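
To confirm that the new environment is the one actually in use, a small check like the following may help. It is a sketch, not part of TextSlinger: the function name `check_environment` is hypothetical, and it assumes that `conda activate` has set the `CONDA_DEFAULT_ENV` environment variable, which conda normally does.

```python
import os
import sys


def check_environment(version_info=None, env_name=None, expected_env="textslinger"):
    """Return a list of human-readable problems; an empty list means all good."""
    if version_info is None:
        version_info = sys.version_info
    if env_name is None:
        # conda sets this variable to the name of the active environment
        env_name = os.environ.get("CONDA_DEFAULT_ENV", "")
    problems = []
    if tuple(version_info[:2]) != (3, 10):
        problems.append(
            f"expected Python 3.10, found {version_info[0]}.{version_info[1]}"
        )
    if env_name != expected_env:
        problems.append(
            f"expected conda environment {expected_env!r}, found {env_name!r}"
        )
    return problems


# Report any mismatch between the running interpreter and the setup above.
for problem in check_environment():
    print("WARNING:", problem)
```

No output means the interpreter is Python 3.10 and the active environment is named textslinger.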

Installation of PyTorch

macOS with Apple Silicon:

pip install torch torchvision torchaudio

Linux with CUDA support (your GPU driver must support the CUDA version of the library you install, or a newer one; run nvidia-smi to check which CUDA version the driver supports):

# CUDA 11.8 
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# CUDA 12.1
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
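
Choosing between the two CUDA builds above can be sketched in Python as follows. This is an illustration under stated assumptions, not an official tool: the helper names are hypothetical, the index URLs are copied from the commands above, and the parser relies on the "CUDA Version: X.Y" field that nvidia-smi prints in its header.

```python
import re
import shutil
import subprocess

# Wheel index URLs copied from the commands above, keyed by CUDA version.
WHEEL_INDEXES = {
    (11, 8): "https://download.pytorch.org/whl/cu118",
    (12, 1): "https://download.pytorch.org/whl/cu121",
}


def parse_driver_cuda_version(smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' field, or None."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def choose_wheel_index(driver_version):
    """Pick the newest listed CUDA build that the driver supports, or None."""
    if driver_version is None:
        return None
    usable = [v for v in WHEEL_INDEXES if v <= driver_version]
    return WHEEL_INDEXES[max(usable)] if usable else None


def read_nvidia_smi():
    """Run nvidia-smi and return its output, or '' if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return ""
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return result.stdout


version = parse_driver_cuda_version(read_nvidia_smi())
print("Driver CUDA version:", version)
print("Suggested index URL:", choose_wheel_index(version))
```

If the suggested URL is None, either nvidia-smi is not available or the driver is older than both listed builds, in which case the CPU-only install is the safe fallback.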

Linux without CUDA support:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

Test if the PyTorch installation worked:

python - <<'EOF'
import torch
print("Torch version:", torch.__version__)
print("MPS available:", torch.backends.mps.is_available())
print("MPS built:", torch.backends.mps.is_built())
print("CUDA available:", torch.cuda.is_available())
print("CUDA version:", torch.version.cuda)
print("CUDA device count:", torch.cuda.device_count())

if torch.backends.mps.is_available():
    x = torch.randn(2, 2, device="mps")
    print("Tensor device:", x.device)
elif torch.cuda.is_available():
    x = torch.randn(2, 2, device="cuda")
    print("Tensor device:", x.device)
else:
    x = torch.randn(2, 2)
    print("Tensor device:", x.device)
EOF

Installation of libraries

Install transformers (version 5.2.0 or greater is required):

pip install "transformers>=5.2.0"

Check transformers version and model support:

python - <<'EOF'
import transformers
from transformers import __version__

print("Version:", __version__)
print("File:", transformers.__file__)

# Check for BLT symbols that do NOT exist in stable 5.0.0
try:
    from transformers.models.blt.modeling_blt import BltModel
    print("BLT model available ✅")
except Exception as e:
    print("BLT model missing ❌", e)
EOF
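
The version requirement itself can also be checked without importing transformers. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `meets_minimum` are hypothetical, and the parser is deliberately simple: it reads only the leading integers of each component, so a pre-release such as 5.2.0rc1 is treated the same as 5.2.0.

```python
from importlib import metadata

MINIMUM = (5, 2, 0)  # minimum transformers version stated above


def parse_version(text):
    """Turn '5.2.0' or '5.3.0.dev0' into a 3-tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)  # pad so that '5.2' compares equal to '5.2.0'
    return tuple(parts[:3])


def meets_minimum(installed, minimum=MINIMUM):
    """Return True if the installed version string is at least `minimum`."""
    return parse_version(installed) >= minimum


try:
    installed = metadata.version("transformers")
except metadata.PackageNotFoundError:
    print("transformers is not installed")
else:
    status = "OK" if meets_minimum(installed) else "too old"
    print(f"transformers {installed}: {status}")
```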

Install other dependencies:

pip install pytest scipy peft psutil datasets

# NOTE: increase MAX_ORDER if you plan to load n-gram models with longer context 
MAX_ORDER=12 pip install https://github.com/kpu/kenlm/archive/master.zip
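
To decide whether MAX_ORDER needs to be raised, it can help to look at the order of the n-gram model you plan to load. The sketch below reads the header of a model in ARPA text format, which lists one "ngram N=count" line per order after the \data\ marker. The helper names are hypothetical, the sample header is made up for illustration, and binary or compressed models are not handled.

```python
import re


def arpa_order(lines):
    """Return the highest n-gram order declared in an ARPA header."""
    order = 0
    in_header = False
    for line in lines:
        line = line.strip()
        if line == "\\data\\":
            in_header = True
        elif in_header:
            match = re.match(r"ngram\s+(\d+)\s*=\s*\d+$", line)
            if match:
                order = max(order, int(match.group(1)))
            elif line:
                break  # first non-blank, non-count line ends the header
    return order


def fits_build(order, max_order=12):
    """Return True if a model of this order fits a build with MAX_ORDER=max_order."""
    return 0 < order <= max_order


# Illustrative header only; the counts are invented.
sample_header = [
    "\\data\\",
    "ngram 1=5",
    "ngram 2=9",
    "ngram 3=4",
    "",
    "\\1-grams:",
]
order = arpa_order(sample_header)
print("Model order:", order)
print("Fits MAX_ORDER=12:", fits_build(order))
```

For a real model, pass an open file instead of the sample list, for example `arpa_order(open(path, encoding="utf-8"))`, where `path` is the location of your own ARPA file.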

Reinstall setuptools to silence a harmless warning message:

pip install --upgrade --force-reinstall setuptools

Testing installation

Download assets needed by the test suite and then run it:

cd textslinger/assets
./download.sh
cd ..
pytest -v -rs

This material is based upon work supported by the NSF under Grant Nos. IIS-1909089 and IIS-2402876.
