
Project description

BiLLM

Tool for converting LLMs from uni-directional to bi-directional for tasks like classification and sentence embeddings. Compatible with 🤗 transformers.

Papers: https://arxiv.org/abs/2310.01208 and https://arxiv.org/abs/2311.05296

Supported Models

  • LLaMA
  • Mistral
  • Qwen2
  • OpenELM

Usage

  1. python -m pip install -U billm

  2. Specify the start index of the bi-directional layers via export BiLLM_START_INDEX={layer_index}. If not specified, the default is 0, i.e., all layers are bi-directional. If set to -1, BiLLM is disabled.

  3. Import the models from billm instead of transformers and initialize them as usual; a minimal end-to-end sketch follows the diff below.

- from transformers import (
-    LlamaModel,
-    LlamaForCausalLM,
-    LlamaForSequenceClassification,
-    MistralModel,
-    MistralForCausalLM,
-    MistralForSequenceClassification,
-    Qwen2Model,
-    Qwen2ForCausalLM,
-    Qwen2ForSequenceClassification
- )

+ from billm import (
+    LlamaModel,
+    LlamaForCausalLM,
+    LlamaForSequenceClassification,
+    LlamaForTokenClassification,
+    MistralModel,
+    MistralForCausalLM,
+    MistralForSequenceClassification,
+    MistralForTokenClassification,
+    Qwen2Model,
+    Qwen2ForCausalLM,
+    Qwen2ForSequenceClassification,
+    Qwen2ForTokenClassification,
+    OpenELMModel,
+    OpenELMForCausalLM,
+    OpenELMForSequenceClassification,
+    OpenELMForTokenClassification
+ )
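
Putting the steps together, here is a minimal sketch; the checkpoint name, label count, and start index are illustrative, not prescribed by BiLLM:

import os

# Set the start index before importing billm; this assumes the variable
# is read at import time. Here layers 16 and above become bi-directional.
os.environ['BiLLM_START_INDEX'] = '16'

from billm import LlamaForSequenceClassification

# Initialize exactly as you would with transformers.
model = LlamaForSequenceClassification.from_pretrained(
    'meta-llama/Llama-2-7b-hf',  # illustrative checkpoint
    num_labels=2,
)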

Examples

NER

training:

$ cd examples
$ WANDB_MODE=disabled BiLLM_START_INDEX=0 CUDA_VISIBLE_DEVICES=3 python billm_ner.py \
--model_name_or_path mistralai/Mistral-7B-v0.1 \
--dataset_name_or_path conll2003 \
--push_to_hub 0
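
Since the inference snippet below loads a PEFT adapter, billm_ner.py fine-tunes with LoRA. A rough sketch of that setup, with illustrative hyperparameters rather than the script's actual values:

from peft import LoraConfig, TaskType, get_peft_model
from billm import MistralForTokenClassification

# CoNLL-2003 uses 9 labels: O plus B-/I- tags for PER, ORG, LOC, MISC.
model = MistralForTokenClassification.from_pretrained(
    'mistralai/Mistral-7B-v0.1', num_labels=9
)

# Wrap the backbone in a LoRA adapter so only the adapter weights
# (and the classification head) are trained.
peft_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS, r=8, lora_alpha=16, lora_dropout=0.1
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()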

inference:

from transformers import AutoTokenizer, pipeline
from peft import PeftModel, PeftConfig
from billm import MistralForTokenClassification


label2id = {'O': 0, 'B-PER': 1, 'I-PER': 2, 'B-ORG': 3, 'I-ORG': 4, 'B-LOC': 5, 'I-LOC': 6, 'B-MISC': 7, 'I-MISC': 8}
id2label = {v: k for k, v in label2id.items()}
model_id = 'WhereIsAI/billm-mistral-7b-conll03-ner'
tokenizer = AutoTokenizer.from_pretrained(model_id)
peft_config = PeftConfig.from_pretrained(model_id)
model = MistralForTokenClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    num_labels=len(label2id), id2label=id2label, label2id=label2id
)
model = PeftModel.from_pretrained(model, model_id)
# merge and unload is necessary for inference
model = model.merge_and_unload()

token_classifier = pipeline("token-classification", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
sentence = "I live in Hong Kong. I am a student at Hong Kong PolyU."
tokens = token_classifier(sentence)
print(tokens)
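
With aggregation_strategy="simple", the pipeline should return one dict per entity span, each with fields like entity_group, score, word, start, and end; for the sentence above, expect location/organization spans around the "Hong Kong" mentions (exact scores will vary by checkpoint).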

Sentence Embeddings

Refer to AnglE: https://github.com/SeanLee97/AnglE
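
For context, AnglE's basic encoding API looks roughly like the sketch below (the checkpoint comes from AnglE's documentation; BiLLM-specific training options are covered in that repository):

from angle_emb import AnglE

# Load a pretrained sentence-embedding model and encode a batch of texts.
angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls')
vecs = angle.encode(['I live in Hong Kong.'], to_numpy=True)
print(vecs.shape)  # (1, embedding_dim)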

Citation

If you use this toolkit in your work, please cite one of the following papers:

  1. For sentence embeddings modeling:
@inproceedings{li2024bellm,
    title = "BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings",
    author = "Li, Xianming and Li, Jing",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics",
    year = "2024",
    publisher = "Association for Computational Linguistics"
}
  2. For other tasks:
@article{li2023label,
  title={Label Supervised LLaMA Finetuning},
  author={Li, Zongxi and Li, Xianming and Liu, Yuzhang and Xie, Haoran and Li, Jing and Wang, Fu-lee and Li, Qing and Zhong, Xiaoqin},
  journal={arXiv preprint arXiv:2310.01208},
  year={2023}
}


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

billm-0.1.6.tar.gz (28.6 kB)

Uploaded Source

Built Distribution

billm-0.1.6-py3-none-any.whl (35.2 kB)

Uploaded Python 3

File details

Details for the file billm-0.1.6.tar.gz.

File metadata

  • Download URL: billm-0.1.6.tar.gz
  • Upload date:
  • Size: 28.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for billm-0.1.6.tar.gz

  • SHA256: e79f543c322de9750cbd4dcfdbbf89e79ab6f13ea4b0bdbda8dcfa753acb6266
  • MD5: fcd4c4cf352b77668d291b13f84325ad
  • BLAKE2b-256: e63676a8fe630ebe0a2f1def45ff0031330c4d755adc4c23291c70840d2dc3e1

See more details on using hashes here.

File details

Details for the file billm-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: billm-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 35.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.10

File hashes

Hashes for billm-0.1.6-py3-none-any.whl

  • SHA256: 4b4fd197913e36b681e8470dbbec4d56494cdc0090691a9d2a13ba6a047cee63
  • MD5: f62641d97d3ff8fe44ab086a45c621d5
  • BLAKE2b-256: 2434025291bf2a5c3b5479b5d630cd9eaf30f2bb7ce66ba67c62e7fa58251308

See more details on using hashes here.
