
Library for extracting and analyzing persona vectors


Persona Vectors


Extract persona-aligned activation vectors from language models and analyze how persona prompts move hidden states.

This project is experimental.

Install

uv sync
cp .env.example .env

Python >=3.12 is required. Set NDIF_API_KEY in .env to run extraction remotely on NDIF.
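Before starting a remote run, it can help to confirm that the key is actually visible. The following is a minimal sketch using only the standard library; `read_env_file` and `has_ndif_key` are hypothetical helpers written for illustration, not part of this package, and the parser only handles simple KEY=VALUE lines.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_ndif_key(path=".env"):
    """Return True if NDIF_API_KEY is set in the environment or in the .env file."""
    from_env = os.environ.get("NDIF_API_KEY")
    from_file = read_env_file(path).get("NDIF_API_KEY")
    return bool(from_env or from_file)


if __name__ == "__main__":
    print("NDIF_API_KEY found" if has_ndif_key() else "NDIF_API_KEY missing")
```

How the project itself loads .env is not described here, so treat this only as a pre-flight check.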

Dataset loading is provided by the sibling persona-data package. For local development, uncomment the persona-data path source in pyproject.toml and keep that repository checked out next to this one.

The Streamlit UI lives in the sibling persona-ui repo.

Quickstart

# Extract activations
uv run python main.py extract --model google/gemma-2-9b-it --backend remote

# Analyze saved activations
uv run python main.py analyze --model google/gemma-2-9b-it --variant biography --mask-strategy answer_mean

# Compute an experimental steering vector
uv run python main.py steer --model google/gemma-2-9b-it --persona-id <UUID> --layer 20
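This README does not say how the steer command derives its vector. One common construction, shown here purely as an assumption, is the difference between a persona's activations and a baseline's activations at the chosen layer. The function name `steering_vector`, the `scale` parameter, and the use of nested lists in place of tensors are all hypothetical choices made for this sketch.

```python
def steering_vector(persona, baseline, layer, scale=1.0):
    """Return scale * (persona[layer] - baseline[layer]).

    `persona` and `baseline` are (num_layers, hidden_size) activations
    given as nested lists. This difference-based recipe is an assumption,
    not the method confirmed by the project.
    """
    p_row = persona[layer]
    b_row = baseline[layer]
    if len(p_row) != len(b_row):
        raise ValueError("hidden sizes differ between persona and baseline")
    return [scale * (p - b) for p, b in zip(p_row, b_row)]


if __name__ == "__main__":
    persona = [[1.0, 2.0], [3.0, 5.0]]
    baseline = [[0.0, 0.0], [1.0, 1.0]]
    print(steering_vector(persona, baseline, layer=1))
```

In practice the rows would be tensors loaded from the saved activations, and the layer index would match the --layer argument.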

The notebooks are useful for exploratory runs:

uv run python -m notebooks.notebook_extract
uv run python -m notebooks.notebook_manifold
uv run python -m notebooks.notebook_similarity
uv run python -m notebooks.notebook_steer

Extraction Scripts

# Extract persona vectors for steering: cap the train split at 50 questions, then push to the Hub
MODEL=google/gemma-2-9b-it scripts/extraction_train50_push.sh

# All-questions workflow (explicit only): extract the first 100 personas and save under artifacts/persona-vectors
MODEL=google/gemma-2-9b-it scripts/extraction_all_questions.sh

What Gets Saved

Extraction writes one (num_layers, hidden_size) tensor per persona, prompt variant, model, and mask strategy:

artifacts/activations/<model_dir>/<mask_strategy>/<prompt_variant>/
├── manifest.json
└── <persona_id>.safetensors

<model_dir> is the model name with / replaced by __. Each safetensors file contains one activations tensor. The manifest stores tensor shape, persona names, and contributing QA sample ids.
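The directory convention above can be sketched with a few path helpers. The function names and the `root` default are assumptions made for illustration; only the layout and the / to __ replacement come from the description above.

```python
from pathlib import Path


def model_dir_name(model: str) -> str:
    """Model name with / replaced by __, as described above."""
    return model.replace("/", "__")


def activation_dir(model, mask_strategy, prompt_variant, root="artifacts/activations"):
    """Directory holding manifest.json and the per-persona safetensors files."""
    return Path(root) / model_dir_name(model) / mask_strategy / prompt_variant


def activation_file(model, mask_strategy, prompt_variant, persona_id,
                    root="artifacts/activations"):
    """Path of one persona's safetensors file."""
    folder = activation_dir(model, mask_strategy, prompt_variant, root)
    return folder / f"{persona_id}.safetensors"


def local_persona_ids(model, mask_strategy, prompt_variant, root="artifacts/activations"):
    """Persona ids that already have a saved file under the given directory."""
    folder = activation_dir(model, mask_strategy, prompt_variant, root)
    if not folder.is_dir():
        return []
    return sorted(p.stem for p in folder.glob("*.safetensors"))


if __name__ == "__main__":
    print(activation_file("google/gemma-2-9b-it", "answer_mean", "biography", "<persona_id>"))
```

A file located this way could then be opened with the safetensors library, for example `safetensors.torch.load_file(path)`. The tensor's key inside the file is not stated here, so inspect the returned dictionary rather than assuming a name.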

CLI

# Extract all personas and both prompt variants
uv run python main.py extract --model google/gemma-2-9b-it

# Extract specific personas with a train split cap
uv run python main.py extract --model google/gemma-2-9b-it --persona-id <UUID> baseline_assistant --n-train 50

# Extract the first N personas from the dataset
uv run python main.py extract --model google/gemma-2-9b-it --sample-size 100

# Re-run personas already present locally
uv run python main.py extract --model google/gemma-2-9b-it --persona-id <UUID> --force

# Push local activations to the Hub
uv run python main.py push --model google/gemma-2-9b-it --repo implicit-personalization/synth-persona-vectors

See the docs for API details.

Layout

src/persona_vectors/
├── activations.py   # low-level hidden-state extraction
├── extraction.py    # prompt formatting, masks, persona extraction flow
├── artifacts.py     # local and Hub activation stores
├── analysis.py      # loading, PCA, cosine similarity, clustering
├── plots.py         # Plotly figures
├── steering.py      # experimental steering vectors
└── parser.py        # CLI parser
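As an illustration of the kind of comparison analysis.py is described as supporting, the sketch below computes cosine similarity layer by layer between two (num_layers, hidden_size) activation arrays. It is dependency-free and uses nested lists, while real code would more likely operate on tensors. The function names are hypothetical and do not reflect the package's API.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def layerwise_cosine(acts_a, acts_b):
    """Compare two personas layer by layer, returning one similarity per layer."""
    if len(acts_a) != len(acts_b):
        raise ValueError("activation arrays must have the same number of layers")
    return [cosine(row_a, row_b) for row_a, row_b in zip(acts_a, acts_b)]


if __name__ == "__main__":
    persona_a = [[1.0, 0.0], [1.0, 1.0]]
    persona_b = [[0.0, 1.0], [2.0, 2.0]]
    # Layer 0 rows are orthogonal; layer 1 rows point in the same direction.
    print(layerwise_cosine(persona_a, persona_b))
```

Comparing per layer rather than on flattened tensors keeps it visible at which depth two personas diverge.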

Download files

Source Distribution: persona_vectors-0.8.0.tar.gz (31.9 kB)

Built Distribution: persona_vectors-0.8.0-py3-none-any.whl (36.8 kB, Python 3)
