
A tool for citation using transformer models.

Reason this release was yanked:

Missing documentation

Project description

Citention

This package generates LLM responses with citations derived from three sources: the generated text itself ("[4] [6]"), attention scores and retriever scores. It is published alongside the paper [Citation Failure: Definition, Analysis and Efficient Mitigation](TODO add link) (Buchmann et al., 2025). The code builds on AT2 (Cohen-Wang et al., 2025) and Hugging Face Transformers.


Contact: Jan Buchmann

UKP Lab | TU Darmstadt

Installation

pip install citention

PermissionError from NLTK: If you get a PermissionError from NLTK, set the NLTK_DATA environment variable to a writable directory.
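The variable can also be set from Python. The snippet below is a minimal sketch, assuming NLTK reads NLTK_DATA when it is first imported, so the assignment has to happen before citention is imported. The directory under the system temp folder is only a placeholder; any writable location should do.

```python
import os
import tempfile
from pathlib import Path

# Placeholder location (an assumption): any directory the current user can write to.
nltk_data_dir = Path(tempfile.gettempdir()) / "nltk_data"
nltk_data_dir.mkdir(parents=True, exist_ok=True)

# Set the variable before nltk (and therefore citention) is imported for the first time.
os.environ["NLTK_DATA"] = str(nltk_data_dir)

# import citention  # import the package only after the variable is set
```

Setting the variable in the shell before starting Python has the same effect.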

Repository Structure

├── assets/
│   └── test_instance.json # Example input for testing
├── src/
│   └── citention/
│       ├── citation_scorers/ # Classes for obtaining citation scores from generation, attention or retrieval
│       ├── model/
│       │   ├── attention_attributor.py # Computes attention scores to source documents
│       │   ├── citention_model.py # Contains central CitentionModel class
│       │   ├── score_combinator.py # Implements score combination
│       │   ├── score_estimator.py # Obtains raw attention scores and weighs them
│       │   └── task.py # Handles generation
│       ├── retrievers/ # Classes for obtaining retrieval scores
│       ├── trainers/ # Trainer classes
│       └── util/ # Utility functions and response processing
└── tests/ # Basic tests

Usage

Generation and Citation

The main functionality of this package is provided in the CitentionModel class. It enables generating text with citations, where the citations can be obtained from generation, attention, retrieval and their combination.
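To make the "generation" route concrete: citations of this kind appear in the response as bracketed integer ids such as "[4] [6]". The sketch below shows one way such ids could be read out of a generated sentence with a regular expression. The helper names `extract_citation_ids` and `strip_citations` are hypothetical and not part of the citention API; the package's own response processing may work differently.

```python
import re

# Matches a bracketed integer id such as "[4]".
CITATION_PATTERN = re.compile(r"\[(\d+)\]")


def extract_citation_ids(statement: str) -> list[int]:
    """Return the integer ids of all "[i]" markers, in order of appearance."""
    return [int(match) for match in CITATION_PATTERN.findall(statement)]


def strip_citations(statement: str) -> str:
    """Remove "[i]" markers together with the whitespace in front of them."""
    return re.sub(r"\s*\[\d+\]", "", statement).strip()


example = "Special relativity was proposed in 1905 [4] [6]."
print(extract_citation_ids(example))  # [4, 6]
print(strip_citations(example))       # Special relativity was proposed in 1905.
```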

The code below shows how to initialize a model, create a small prompt and get a response and citations from generation and attention. For more usage examples see the main repo of our paper.

from transformers import AutoModelForCausalLM, AutoTokenizer

from citention.model.citention_model import CitentionModel

# Load base LLM and tokenizer
llm_hf_id = 'Qwen/Qwen3-1.7B'
llm = AutoModelForCausalLM.from_pretrained(llm_hf_id)
tokenizer = AutoTokenizer.from_pretrained(llm_hf_id)

# Define citation methods
# Currently available methods: "generation", "attention", "bm25", "sbert_dual"
citation_scorer_class_names = ['generation', 'attention']
# Define paths to pre-trained parameters (not needed for all methods)
citation_scorer_model_names_or_paths = [None, None]

# Initialize model
model = CitentionModel(
    model_name=llm_hf_id,
    llm=llm,
    tokenizer=tokenizer,
    citation_scorer_class_names=citation_scorer_class_names,
    citation_scorer_model_names_or_paths=citation_scorer_model_names_or_paths,
    calibrate_attention_scores=False,
    seed=12345,
    citation_prediction_method='top_k'
)

# Dummy source documents that can be used and cited
source_candidates = [
    'Albert Einstein: <some_content>',
    'Nikola Tesla: <some_content>',
    'Marie Curie: <some_content>'
]
# Format source candidates by adding integer identifiers in square brackets ("[i]")
formatted_source_candidates = '\n'.join([
    f'[{i}] {source_candidates[i]}'
    for i in range(len(source_candidates))
])
instruction = 'You are given a list of source documents and a question. Answer the question only using information from the source documents. Add the ids of the relevant source documents after each response sentence.'
question = 'Who came up with special relativity theory?'

# Create prompt
messages = [{
    'role': 'user',
    'content': f'{instruction}\n{formatted_source_candidates}\n{question}'
}]
# tokenize=False returns the prompt as a string, which the character spans below rely on
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Append an empty thinking block, since thinking output currently can't be handled
prompt += '<think>\n</think>'

# Get character spans of source candidates and question
char_spans = {
    'source_candidates': [],
    'question': (0, 0),
}
offset = 0
for i, source_candidate in enumerate(source_candidates):
    formatted_source_candidate = f'[{i}] {source_candidate}'
    # Search from the end of the previous match so repeated text is not matched twice
    start = prompt.find(formatted_source_candidate, offset)
    id_span = (start, start + len(f'[{i}]'))
    # Content starts one character (the space) after the "[i]" identifier
    content_span = (id_span[1] + 1, id_span[1] + 1 + len(source_candidate))
    char_spans['source_candidates'].append({
        'id': id_span,
        'content': content_span
    })
    offset = start + len(formatted_source_candidate)

question_start = prompt.find(question)
char_spans['question'] = (question_start, question_start + len(question))

(
    raw_generation,
    predicted_statement,
    predicted_citations,
    combined_scores,
    citation_scores
) = model.generate_and_cite(
    prompt=prompt,
    max_new_tokens=100,
    multiple_statements=True,
    char_spans=char_spans,
    include_source_candidate_ids_in_ranges=False,
    attention_query='statement',
    retriever_query='question_and_statement',
    citation_k=2
)
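Because the character spans tell the model where each source and the question sit in the prompt, it can be worth checking them before calling generate_and_cite. The sketch below repeats the span computation on a hand-written stand-in prompt, so it runs without loading a model. `find_char_spans` is a hypothetical helper, not part of citention, and the stand-in prompt format is an assumption rather than the output of a real chat template.

```python
def find_char_spans(prompt, source_candidates, question):
    """Locate each "[i] <content>" block and the question inside the prompt."""
    spans = {"source_candidates": [], "question": (0, 0)}
    offset = 0
    for i, content in enumerate(source_candidates):
        marker = f"[{i}]"
        start = prompt.find(f"{marker} {content}", offset)
        if start == -1:
            raise ValueError(f"source candidate {i} not found in prompt")
        id_span = (start, start + len(marker))
        # Content begins one character (the space) after the identifier.
        content_span = (id_span[1] + 1, id_span[1] + 1 + len(content))
        spans["source_candidates"].append({"id": id_span, "content": content_span})
        offset = content_span[1]
    question_start = prompt.find(question, offset)
    if question_start == -1:
        raise ValueError("question not found in prompt")
    spans["question"] = (question_start, question_start + len(question))
    return spans


# Stand-in prompt; a real one would come from tokenizer.apply_chat_template.
sources = ["Albert Einstein: <some_content>", "Marie Curie: <some_content>"]
question = "Who came up with special relativity theory?"
listing = "\n".join(f"[{i}] {s}" for i, s in enumerate(sources))
prompt = f"<user>\nAnswer using the sources.\n{listing}\n{question}\n<assistant>\n"

spans = find_char_spans(prompt, sources, question)
# Slicing the prompt with each span should give back exactly the original text.
for i, span in enumerate(spans["source_candidates"]):
    assert prompt[slice(*span["id"])] == f"[{i}]"
    assert prompt[slice(*span["content"])] == sources[i]
assert prompt[slice(*spans["question"])] == question
```

Raising an error on a missing match avoids silently passing spans built from a `find` result of -1.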

Training

Training QRHEAD

The QRHEAD paper (Zhang et al., 2025) describes the selection of Query-Focused Retrieval Heads to improve attention-based reranking. We provide a simple QRHeadTrainer class for this purpose. See tests/test_qr_head.py for an example of training and passing the path of the saved attention head parameters to CitentionModel.
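As a rough intuition for head selection, the toy sketch below ranks attention heads by how much attention mass they place on the relevant document, averaged over a few examples, and keeps the best ones. This is an illustration under that assumption only: the function names, the data layout and all numbers are made up, and it is not the code of QRHeadTrainer or the exact procedure of the paper.

```python
def gold_attention_mass(attention, gold_span):
    """Sum the attention weights that fall on the gold document's token span."""
    start, end = gold_span
    return sum(attention[start:end])


def select_top_heads(examples, num_heads):
    """Rank (layer, head) pairs by average gold attention mass and keep the best."""
    totals = {}
    for example in examples:
        for head, attention in example["attention"].items():
            mass = gold_attention_mass(attention, example["gold_span"])
            totals[head] = totals.get(head, 0.0) + mass
    averages = {head: total / len(examples) for head, total in totals.items()}
    ranked = sorted(averages, key=averages.get, reverse=True)
    return ranked[:num_heads], averages


# Toy data: per-head attention over five context tokens (made-up values).
examples = [
    {"gold_span": (2, 4), "attention": {(0, 0): [0.4, 0.3, 0.1, 0.1, 0.1],
                                        (0, 1): [0.1, 0.1, 0.4, 0.3, 0.1]}},
    {"gold_span": (0, 2), "attention": {(0, 0): [0.2, 0.1, 0.3, 0.2, 0.2],
                                        (0, 1): [0.5, 0.3, 0.1, 0.05, 0.05]}},
]
selected, averages = select_top_heads(examples, num_heads=1)
print(selected)  # [(0, 1)]
```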

Training AT2

The AT2 paper (Cohen-Wang et al., 2025) describes the training of attention head parameters through context perturbation. Parameters trained with the trainer class from the AT2 package can be directly used in CitentionModel.

Training BM25

Usage of BM25 retrieval requires pre-computation of token frequency statistics. The BM25Trainer class fulfills this purpose. See tests/test_bm25.py for an example of training and passing the path of the saved BM25 parameters to CitentionModel.
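To illustrate what "token frequency statistics" means here, the sketch below collects document frequencies and the average document length from a small corpus and uses them in the standard Okapi BM25 formula. It is a generic illustration with whitespace tokenization and common default values for `k1` and `b`, not the implementation or storage format of BM25Trainer; all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a simplification)."""
    return text.lower().split()


def fit_bm25_statistics(corpus):
    """Pre-compute IDF values and the average document length."""
    doc_freq = Counter()
    total_len = 0
    for doc in corpus:
        tokens = tokenize(doc)
        total_len += len(tokens)
        doc_freq.update(set(tokens))  # count each token once per document
    n_docs = len(corpus)
    idf = {
        token: math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for token, df in doc_freq.items()
    }
    return {"idf": idf, "avg_doc_len": total_len / n_docs}


def bm25_score(query, doc, stats, k1=1.5, b=0.75):
    """Score one document against a query with the pre-computed statistics."""
    tokens = tokenize(doc)
    tf = Counter(tokens)
    norm = k1 * (1 - b + b * len(tokens) / stats["avg_doc_len"])
    return sum(
        stats["idf"].get(token, 0.0) * tf[token] * (k1 + 1) / (tf[token] + norm)
        for token in tokenize(query)
        if token in tf
    )


corpus = [
    "albert einstein developed special relativity",
    "marie curie studied radioactivity",
    "nikola tesla worked on alternating current",
]
stats = fit_bm25_statistics(corpus)
scores = [bm25_score("special relativity", doc, stats) for doc in corpus]
```

Only the first document contains the query terms, so it is the only one with a non-zero score.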

Training a Score Combinator

By default, the CitentionModel will compute a uniform average of scores from all citation scorers. To change this to a weighted average, the weights need to be optimized using the ScoreCombinatorTrainer class. See tests/test_score_combinator.py for an example of training and passing the path of the saved score combinator parameters to CitentionModel.
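The difference between the two averaging modes can be shown with plain Python. The sketch below is an illustration only: the scorer names mirror the ones used above, but the scores and weights are made up, `combine_scores` and `top_k` are hypothetical helpers, and the actual score combinator may normalize or parameterize scores differently.

```python
def combine_scores(scores_per_scorer, weights=None):
    """Average per-candidate scores over scorers, uniformly unless weights are given."""
    names = list(scores_per_scorer)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    n_candidates = len(scores_per_scorer[names[0]])
    return [
        sum(weights[name] * scores_per_scorer[name][i] for name in names) / total_weight
        for i in range(n_candidates)
    ]


def top_k(scores, k):
    """Return the indices of the k highest-scoring source candidates."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Made-up scores for three source candidates.
scores = {"generation": [1.0, 0.0, 0.0], "attention": [0.2, 0.6, 0.2]}

uniform = combine_scores(scores)
weighted = combine_scores(scores, weights={"generation": 1.0, "attention": 3.0})

print(top_k(uniform, 1))   # [0]
print(top_k(weighted, 1))  # [1]
```

With uniform weights the generated citation dominates; once attention is weighted three times as heavily, the second candidate ranks first.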

Citation

TODO add Citation Failure paper

References

Cohen-Wang, B., Chuang, Y., & Madry, A. (2025). Learning to Attribute with Attention. ArXiv, abs/2504.13752.

Zhang, W., Yin, F., Yen, H., Chen, D., & Ye, X. (2025). Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking. ArXiv, abs/2506.09944.

Disclaimer

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
