representation engineering / control vectors
Project description
repeng
A Python library for generating control vectors with representation engineering. Train a vector in less than sixty seconds!
For a full example, see the notebooks folder or the blog post.
...
import torch
from transformers import AutoModelForCausalLM

from repeng import ControlVector, ControlModel, DatasetEntry

# load and wrap Mistral-7B
model_name = "mistralai/Mistral-7B-Instruct-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
# the second argument expands to the 13 indices -5, -6, ..., -17
model = ControlModel(model, list(range(-5, -18, -1)))
...
# generate a dataset with closely-opposite paired statements
# (make_dataset and truncated_output_suffixes are not defined in this excerpt;
# see the notebooks folder for the full example)
trippy_dataset = make_dataset(
    "Act as if you're extremely {persona}.",
    ["high on psychedelic drugs"],
    ["sober from psychedelic drugs"],
    truncated_output_suffixes,
)

# train the vector (takes less than a minute!)
trippy_vector = ControlVector.train(model, tokenizer, trippy_dataset)

# set the control strength and let inference rip!
for strength in (-2.2, 1, 2.2):
    print(f"strength={strength}")
    model.set_control(trippy_vector, strength)
    out = model.generate(
        **tokenizer(
            "[INST] Give me a one-sentence pitch for a TV show. [/INST]",
            return_tensors="pt"
        ),
        do_sample=False,
        ...
    )
    print(tokenizer.decode(out.squeeze()).strip())
    print()
strength=-2.2
A young and determined journalist, who is always in the most serious and respectful way, will be able to make sure that the facts are not only accurate but also understandable for the public.
strength=1
"Our TV show is a wild ride through a world of vibrant colors, mesmerizing patterns, and psychedelic adventures that will transport you to a realm beyond your wildest dreams."
strength=2.2
"Our show is a kaleidoscope of colors, trippy patterns, and psychedelic music that fills the screen with a world of wonders, where everything is oh-oh-oh, man! ��psy����������oodle����psy��oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
For a more detailed explanation of how the library works and what it can do, see the blog post.
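The example above calls `make_dataset` without showing its definition. As a rough idea of what such a helper might do, here is a minimal sketch that builds one positive/negative prompt pair per suffix. Everything in it is an assumption made for illustration: `PairedEntry` is a hypothetical stand-in for `DatasetEntry`, `make_paired_dataset` is a hypothetical name, the `[INST] ... [/INST]` wrapping of the training prompts is a guess based on the generation prompt above, and the example suffixes are made up. The real helper in the notebooks may differ.

```python
from dataclasses import dataclass


@dataclass
class PairedEntry:
    """Hypothetical stand-in for a dataset entry: one contrastive prompt pair."""
    positive: str
    negative: str


def make_paired_dataset(
    template: str,
    positive_personas: list[str],
    negative_personas: list[str],
    suffixes: list[str],
) -> list[PairedEntry]:
    """Build one positive/negative prompt pair per (suffix, persona pair)."""
    dataset = []
    for suffix in suffixes:
        # Personas are matched up positionally, so the two lists should be the same length.
        for pos, neg in zip(positive_personas, negative_personas):
            dataset.append(
                PairedEntry(
                    positive=f"[INST] {template.format(persona=pos)} [/INST] {suffix}",
                    negative=f"[INST] {template.format(persona=neg)} [/INST] {suffix}",
                )
            )
    return dataset


# Made-up suffixes; the contents of `truncated_output_suffixes` are not shown above.
example_suffixes = ["The", "I think"]

pairs = make_paired_dataset(
    "Act as if you're extremely {persona}.",
    ["high on psychedelic drugs"],
    ["sober from psychedelic drugs"],
    example_suffixes,
)
print(len(pairs))
print(pairs[0].positive)
print(pairs[0].negative)
```

The two prompts in each pair differ only in the persona, which is what "closely-opposite paired statements" suggests: the rest of the text is held fixed so that the contrast between the two is the only thing that varies.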
Notice
Some of the code in this repository derives from andyzoujm/representation-engineering (MIT license).
Citation
If this repository is useful in your work, please remember to cite the representation-engineering paper that it's based on.
You can additionally cite this repository:
@misc{vogel2024repeng,
title = {repeng},
author = {Theia Vogel},
year = {2024},
url = {https://github.com/vgel/repeng/}
}
File details
Details for the file repeng-0.1.0.tar.gz.
File metadata
- Download URL: repeng-0.1.0.tar.gz
- Upload date:
- Size: 6.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.0 CPython/3.11.0 Linux/5.18.10-76051810-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | 35323c89d441726ef02b5e89dffa4110aa22f68abfa0295984b34b616fb598dd
MD5 | eac92757f7275ba47af94f906c2e1c48
BLAKE2b-256 | d20160198178cb1dccdc56b9c3ddf5fc0c3f830b8b7a22853647b6c2e61cb1cf
File details
Details for the file repeng-0.1.0-py3-none-any.whl.
File metadata
- Download URL: repeng-0.1.0-py3-none-any.whl
- Upload date:
- Size: 7.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.7.0 CPython/3.11.0 Linux/5.18.10-76051810-generic
File hashes
Algorithm | Hash digest
---|---
SHA256 | 2212606bea2337b9e7f785ed3aa2a1dd17c9485669bf71641a0eb4748c474a3b
MD5 | edfff999faebe99b1e6f2824b1cb28b1
BLAKE2b-256 | 1c4651fa4fe330cfdfdb9e2cb212e0eea6551d5ded3dc2377a446570897e8af1