BiLLM
Tool for converting LLMs from uni-directional to bi-directional for tasks like classification and sentence embeddings. Compatible with 🤗 transformers.
Supported Models
- LLaMA
- Mistral
- Qwen2
- OpenELM
Usage
- Install BiLLM:
python -m pip install -U billm
- Specify the start index for bi-directional layers via
export BiLLM_START_INDEX={layer_index}
If not specified, the default is 0, i.e., all layers are bi-directional. If set to -1, BiLLM is disabled.
- Import LLMs from BiLLM and initialize them as usual with transformers.
- from transformers import (
-     LlamaModel,
-     LlamaForCausalLM,
-     LlamaForSequenceClassification,
-     MistralModel,
-     MistralForCausalLM,
-     MistralForSequenceClassification,
-     Qwen2Model,
-     Qwen2ForCausalLM,
-     Qwen2ForSequenceClassification
- )
+ from billm import (
+     LlamaModel,
+     LlamaForCausalLM,
+     LlamaForSequenceClassification,
+     LlamaForTokenClassification,
+     MistralModel,
+     MistralForCausalLM,
+     MistralForSequenceClassification,
+     MistralForTokenClassification,
+     Qwen2Model,
+     Qwen2ForCausalLM,
+     Qwen2ForSequenceClassification,
+     Qwen2ForTokenClassification,
+     OpenELMModel,
+     OpenELMForCausalLM,
+     OpenELMForSequenceClassification,
+     OpenELMForTokenClassification
+ )
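To make the BiLLM_START_INDEX semantics described above concrete, here is a minimal sketch of how the setting could map onto layers. The helper bidirectional_layer_mask is hypothetical and not part of billm. It assumes that layers at or after the start index are bi-directional, which is one reading of "start index for bi-directional layers", and it may not match the library's internals exactly.

```python
import os


def bidirectional_layer_mask(num_layers, env=None):
    """Return one bool per layer: True if that layer would be bi-directional.

    Hypothetical helper for illustration only; not part of billm.
    """
    env = os.environ if env is None else env
    # Default of 0 follows the usage notes: every layer is bi-directional.
    start = int(env.get("BiLLM_START_INDEX", 0))
    if start < 0:
        # -1 disables BiLLM, so every layer stays uni-directional.
        return [False] * num_layers
    # Assumption: layers at or after the start index are bi-directional.
    return [idx >= start for idx in range(num_layers)]


print(bidirectional_layer_mask(4, {}))                           # [True, True, True, True]
print(bidirectional_layer_mask(4, {"BiLLM_START_INDEX": "2"}))   # [False, False, True, True]
print(bidirectional_layer_mask(4, {"BiLLM_START_INDEX": "-1"}))  # [False, False, False, False]
```

When setting the variable from Python rather than the shell, it is probably safest to assign os.environ["BiLLM_START_INDEX"] before importing billm, on the assumption that the value is read at import time.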
Examples
NER
training:
$ cd examples
$ WANDB_MODE=disabled BiLLM_START_INDEX=0 CUDA_VISIBLE_DEVICES=3 python billm_ner.py \
--model_name_or_path mistralai/Mistral-7B-v0.1 \
--dataset_name_or_path conll2003 \
--push_to_hub 0
inference:
from transformers import AutoTokenizer, pipeline
from peft import PeftModel, PeftConfig
from billm import MistralForTokenClassification
label2id = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
id2label = {v: k for k, v in label2id.items()}
model_id = 'WhereIsAI/billm-mistral-7b-conll03-ner'
tokenizer = AutoTokenizer.from_pretrained(model_id)
peft_config = PeftConfig.from_pretrained(model_id)
model = MistralForTokenClassification.from_pretrained(
peft_config.base_model_name_or_path,
num_labels=len(label2id), id2label=id2label, label2id=label2id
)
model = PeftModel.from_pretrained(model, model_id)
# merge and unload is necessary for inference
model = model.merge_and_unload()
token_classifier = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
sentence = "I live in Hong Kong. I am a student at Hong Kong PolyU."
tokens = token_classifier(sentence)
print(tokens)
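With aggregation_strategy="simple", the transformers token-classification pipeline returns a list of dicts with the keys entity_group, score, word, start, and end. The sketch below shows one way to group such predictions by label. The helper group_entities and the sample list are hypothetical: the sample is hand-written to match the output format and is not real model output.

```python
from collections import defaultdict


def group_entities(predictions, min_score=0.0):
    """Group aggregated pipeline predictions by entity label.

    Hypothetical helper for illustration only; not part of billm.
    """
    grouped = defaultdict(list)
    for pred in predictions:
        # Skip low-confidence spans if a threshold is given.
        if pred["score"] >= min_score:
            grouped[pred["entity_group"]].append(pred["word"])
    return dict(grouped)


# Hand-written sample in the pipeline's output format (scores are made up).
sample = [
    {"entity_group": "LOC", "score": 0.99, "word": "Hong Kong", "start": 10, "end": 19},
    {"entity_group": "ORG", "score": 0.95, "word": "Hong Kong PolyU", "start": 39, "end": 54},
]

print(group_entities(sample))
print(group_entities(sample, min_score=0.97))
```

In practice, the list returned by token_classifier(sentence) above could be passed in place of sample.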
Sentence Embeddings
For sentence embeddings, refer to AnglE: https://github.com/SeanLee97/AnglE
Citation
If you use this toolkit in your work, please cite the following papers:
- For sentence embeddings modeling:
@inproceedings{li2024bellm,
title = "BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings",
author = "Li, Xianming and Li, Jing",
booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics",
year = "2024",
publisher = "Association for Computational Linguistics"
}
- For other tasks:
@article{li2023label,
title={Label supervised llama finetuning},
author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
journal={arXiv preprint arXiv:2310.01208},
year={2023}
}