
Project description

EOLE

Documentation

Open language modeling toolkit based on PyTorch.

👷‍♂️🚧 Work in Progress

EOLE is a spin-off of the OpenNMT-py project. We aim to maintain the research-friendly approach of the original project while modernizing its structure and expanding it to cover large language models (LLMs) and other recent techniques. Our goal is to provide a comprehensive yet compact and modular codebase for experimenting with various types of language models (encoder, decoder, seq2seq).


Current State

We have made significant progress in several areas:

  • Configuration Management: Streamlined through pydantic models, which validate settings when a configuration is loaded (see the sketch after this list).
  • Command Line Entry Points: Improved using structured subparsers for better organization.
  • Reproducible Recipes: Provided for widely used models and tasks, ensuring consistency and reliability.
  • Core API Simplification: Refined around the new configuration objects for ease of use.
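
As a rough illustration of this configuration style (the class and field names below are hypothetical, not eole's actual schema), a pydantic model declares typed, validated settings in plain Python, assuming pydantic v2:

from pydantic import BaseModel, Field

# Hypothetical config for illustration only; see eole's documentation
# for the real configuration objects and field names.
class TrainConfig(BaseModel):
    model_path: str
    batch_size: int = Field(default=32, gt=0)        # must be positive
    learning_rate: float = Field(default=2e-4, gt=0)

cfg = TrainConfig(model_path="models/my-model", batch_size=64)
print(cfg.model_dump())  # validated, serializable settings

Invalid values (e.g., a negative batch_size) raise a ValidationError when the configuration is built, rather than failing mid-training.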

Future Directions

There are still several exciting avenues to explore:

  • Further Simplification and Refactoring: Continue enhancing the codebase for clarity and efficiency.
  • Inference Server: Develop a robust solution for model inference.
  • Additional Recipes: Expand the library of reproducible recipes.
  • Documentation: Enhance and expand the documentation for better user guidance.
  • Test Coverage: Improve testing to ensure code reliability and performance.
  • Logging Enhancements: Implement more sophisticated logging mechanisms.
  • Broader Model Support: Extend support to include a wider range of open models, potentially multi-modal.

Key Features

  • Versatile Training and Inference: Train from scratch, finetune, and run inference with models of various architectures, including Transformer Encoder/Decoder/EncoderDecoder and RNN EncoderDecoder.
  • Dynamic Data Transforms: Apply on-the-fly transformations in the dataloading logic for both training and inference.
  • Comprehensive LLM Support: Includes converters for Llama, Mistral, Phi, OpenLlama, Redpajama, MPT-7B, and Falcon models.
  • Advanced Quantization: Support for 8-bit and 4-bit quantization, along with LoRA adapters (illustrated after this list), with or without checkpointing, as well as mixed precision (FP16).
  • Efficient Finetuning: Finetune 7B and 13B models on a single RTX GPU with 24 GB of VRAM using 4-bit quantization.
  • Flexible Inference: Perform inference in 4-bit or 8-bit using the same layer quantization methods as in finetuning.
  • Tensor Parallelism: Enable tensor parallelism for both training and inference when models exceed the memory capacity of a single GPU.
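
To make the LoRA feature concrete, here is a minimal sketch of the idea (an illustrative PyTorch module, not eole's implementation): a low-rank trainable delta is added on top of a frozen pretrained linear layer, so only the small adapter matrices are updated during finetuning.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA adapter around a frozen linear layer (illustration only)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # adapter starts as a no-op
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen path + scaled low-rank update
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))

Because only the rank-r adapter matrices are trained, the gradients and optimizer state are a small fraction of the full model's, which is what makes finetuning a 7B or 13B model on a single 24 GB GPU feasible when the frozen weights are also quantized to 4-bit.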

Setup

Using Docker

To facilitate setup and reproducibility, we provide Docker images via the GitHub Container Registry: EOLE Docker Images.

You can customize the workflow and build your own images based on specific needs using build.sh and Dockerfile in the docker directory of the repository.

Two images are prebuilt, one for CUDA 11.8 and one for CUDA 12.1; change the -cudaXX.X suffix to your desired version when pulling a Docker image.

To pull the Docker image:

docker pull ghcr.io/eole-nlp/eole:0.0.2-torch2.3.0-ubuntu22.04-cuda12.1

Example one-liner to run a container and open a bash shell within it:

docker run --rm -it --runtime=nvidia ghcr.io/eole-nlp/eole:0.0.2-torch2.3.0-ubuntu22.04-cuda12.1

Note: Ensure you have the NVIDIA Container Toolkit (formerly nvidia-docker) installed to take advantage of CUDA/GPU features.

Depending on your needs, you can add various flags (combined in the example after this list):

  • -p 5000:5000: Forward an exposed port from your container to your host.
  • -v /some/local/directory:/some/container/directory: Mount a local directory to a container directory.
  • --entrypoint some_command: Run a specific command as the container entry point (instead of the default bash shell).
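
For instance, a hypothetical invocation that forwards port 5000 and mounts a local data directory (the host path is illustrative):

docker run --rm -it --runtime=nvidia -p 5000:5000 -v $HOME/data:/data ghcr.io/eole-nlp/eole:0.0.2-torch2.3.0-ubuntu22.04-cuda12.1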

Installing Locally

Requirements

  • Python >= 3.10
  • PyTorch >= 2.3, < 2.4
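
To check the versions already in your environment:

python -c "import sys, torch; print(sys.version.split()[0], torch.__version__)"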

Installation from Source

To install from source:

git clone https://github.com/eole-nlp/eole
cd eole
pip install -e .

Installation from PyPI

Installation from PyPI will be available soon.

Notes

If you encounter a MemoryError during installation, try using pip with the --no-cache-dir option.
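
For example:

pip install --no-cache-dir -e .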

(Optional) Some advanced features (e.g., pretrained models or specific transforms) require extra packages. Install them with:

pip install -r requirements.opt.txt

Manual Installation of Some Dependencies

Apex

Apex is recommended for improved performance, especially for the legacy fusedadam optimizer and FusedRMSNorm.

git clone https://github.com/NVIDIA/apex
cd apex
pip3 install -v --no-build-isolation --config-settings --build-option="--cpp_ext --cuda_ext --deprecated_fused_adam --xentropy --fast_multihead_attn" ./
cd ..

Flash Attention

To use Flash Attention, install it manually:

pip install flash-attn --no-build-isolation

AWQ

To run inference with or quantize an AWQ model, AutoAWQ is required. Install it with:

pip install autoawq

For more details, refer to AutoAWQ.


Contributing

We love contributions! Please look at issues marked with the contributions welcome tag.

Before raising an issue, make sure you read the requirements and the Full Documentation. You can also check if a Recipe fits your use case.

Unless there is a bug, please use the Discussions tab to ask questions or propose new topics/features.

Download files

Download the file for your platform.

Source Distribution

eole-0.0.2.tar.gz (207.4 kB)

Built Distribution

eole-0.0.2-py3-none-any.whl (260.7 kB)

File details

Details for the file eole-0.0.2.tar.gz.

File metadata

  • Download URL: eole-0.0.2.tar.gz
  • Size: 207.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.6

File hashes

Hashes for eole-0.0.2.tar.gz:

  • SHA256: 8a88f60b380e01c4a9fd6974eda536fd7d2f3135889246605e839160752cc716
  • MD5: 98828b6f094e68e10e55faa6a21b5a7d
  • BLAKE2b-256: dfcbebe378ba4aff938913be100f34806a27c3a554136b68a4f04a9726f37305


File details

Details for the file eole-0.0.2-py3-none-any.whl.

File metadata

  • Download URL: eole-0.0.2-py3-none-any.whl
  • Size: 260.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.12.6

File hashes

Hashes for eole-0.0.2-py3-none-any.whl:

  • SHA256: 7bcd2e2e25581a9f36783aa2f2ffda67bd86ed53f5f0d73a818cfb2ba4453a29
  • MD5: ba63bd7236006785cae24d45e69fd324
  • BLAKE2b-256: b9b34c6d7001a62ebca46d2183a84b8f15bcb14b233f65c1efebaeaf344cc659

