BiLLM
Tool for converting LLMs from uni-directional to bi-directional for tasks like classification and sentence embeddings. Compatible with 🤗 transformers.
Usage
- Install the package:
  python -m pip install -U billm
- Specify the start index for bi-directional layers via
  export BiLLM_START_INDEX={layer_index}
  If not specified, the default is 0, i.e., all layers are bi-directional. If set to -1, BiLLM is disabled.
- Import LLMs from BiLLM and initialize them as usual with transformers.
- from transformers import (
-     LlamaModel,
-     LlamaForSequenceClassification,
-     MistralModel,
-     MistralForSequenceClassification,
- )
+ from billm import (
+     LlamaModel,
+     LlamaForSequenceClassification,
+     LlamaForTokenClassification,
+     MistralModel,
+     MistralForSequenceClassification,
+     MistralForTokenClassification,
+ )
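The steps above can be sketched in Python as follows. This is a minimal sketch, not BiLLM's own code: it assumes that billm reads BiLLM_START_INDEX at import time and that layers at or after the start index are the bi-directional ones. `bidirectional_layers` and `load_classifier` are hypothetical helpers introduced only for illustration.

```python
import os

# Assumption: billm reads BiLLM_START_INDEX when it is imported,
# so the variable is set before any billm import happens.
os.environ["BiLLM_START_INDEX"] = "0"  # 0 = all layers bi-directional, -1 = disabled


def bidirectional_layers(num_layers: int) -> list[int]:
    """Hypothetical helper (not part of billm): list the layer indices that
    would be bi-directional under the current BiLLM_START_INDEX, assuming
    every layer at or after the start index is bi-directional."""
    start = int(os.environ.get("BiLLM_START_INDEX", "0"))
    if start < 0:  # -1 disables BiLLM entirely
        return []
    return list(range(start, num_layers))


def load_classifier(model_name: str, num_labels: int):
    """Load a sequence classifier through billm instead of transformers."""
    # Deferred import so the environment variable above is already in place.
    from billm import MistralForSequenceClassification

    # Same call as with transformers: from_pretrained plus config overrides.
    return MistralForSequenceClassification.from_pretrained(
        model_name, num_labels=num_labels
    )
```

Calling `load_classifier("mistralai/Mistral-7B-v0.1", num_labels=2)` would download the checkpoint, so it is left out of the sketch. The import is deferred into the function only to make the ordering explicit; setting the variable with `export` in the shell before starting Python has the same effect.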
Examples
NER
$ cd examples
$ WANDB_MODE=disabled BiLLM_START_INDEX=0 CUDA_VISIBLE_DEVICES=3 python billm_ner.py \
--model_name_or_path mistralai/Mistral-7B-v0.1 \
--dataset_name_or_path conll2003 \
--push_to_hub 0
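For orientation, loading a token-classification model for this dataset might look like the sketch below. It is not the content of billm_ner.py, which is not reproduced here. `load_ner_model` is a hypothetical helper, and the label order is assumed to match the Hugging Face conll2003 dataset.

```python
# CoNLL-2003 NER tag set, in the order assumed for the Hugging Face
# `conll2003` dataset (ner_tags feature).
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}


def load_ner_model(model_name: str = "mistralai/Mistral-7B-v0.1"):
    """Hypothetical helper: build a bi-directional token classifier via billm."""
    # Deferred import; BiLLM_START_INDEX should be set before this runs.
    from billm import MistralForTokenClassification

    # The label mappings are passed through to the model config,
    # exactly as one would do with transformers.
    return MistralForTokenClassification.from_pretrained(
        model_name,
        num_labels=len(LABELS),
        id2label=id2label,
        label2id=label2id,
    )
```

Passing `id2label` and `label2id` keeps predictions readable as tag names rather than bare integers when the model is saved or shared.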
Supported Models
- LLaMA
- Mistral
Citation
If you use this toolkit in your work, please cite the following papers:
- For sentence embedding modeling:
@inproceedings{li2024bellm,
  title = "BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings",
  author = "Li, Xianming and Li, Jing",
  booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics",
  year = "2024",
  publisher = "Association for Computational Linguistics"
}
- For other tasks:
@article{li2023label,
  title={Label supervised llama finetuning},
  author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
  journal={arXiv preprint arXiv:2310.01208},
  year={2023}
}