A tool for citation using transformer models.
Reason this release was yanked:
Missing documentation
Citention
This package generates LLM responses with citations derived from three sources: citation markers produced during generation (e.g. "[4] [6]"), attention scores, and retriever scores. It is published alongside the paper [Citation Failure: Definition, Analysis and Efficient Mitigation](TODO add link) (Buchmann et al., 2025). The code builds on AT2 (Cohen-Wang et al., 2025) and Hugging Face Transformers.
Contact: Jan Buchmann
Installation
pip install citention
PermissionError from NLTK: If you get a PermissionError from NLTK, set the NLTK_DATA environment variable to a writable directory.
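One way to apply this workaround from Python is to set the variable before NLTK is first imported. The sketch below assumes that a folder under the system temp directory is acceptable; the folder name `nltk_data` is only an illustration, and any writable path should work.

```python
import os
import tempfile

# Assumption: a temp-dir subfolder is used purely for illustration;
# replace it with any directory you can write to.
nltk_data_dir = os.path.join(tempfile.gettempdir(), "nltk_data")
os.makedirs(nltk_data_dir, exist_ok=True)

# Set the variable before importing nltk (or citention),
# because NLTK builds its data search path at import time.
os.environ["NLTK_DATA"] = nltk_data_dir
```

Setting the variable in the shell before starting Python has the same effect.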
Repository Structure
├── assets/
│   └── test_instance.json               # Example input for testing
├── src/
│   └── citention/
│       ├── citation_scorers/            # Classes for obtaining citation scores from generation, attention or retrieval
│       ├── model/
│       │   ├── attention_attributor.py  # Computes attention scores to source documents
│       │   ├── citention_model.py       # Contains central CitentionModel class
│       │   ├── score_combinator.py      # Implements score combination
│       │   ├── score_estimator.py       # Obtains raw attention scores and weighs them
│       │   └── task.py                  # Handles generation
│       ├── retrievers/                  # Classes for obtaining retrieval scores
│       ├── trainers/                    # Trainer classes
│       └── util/                        # Utility functions and response processing
└── tests/                               # Basic tests
Usage
Generation and Citation
The main functionality of this package is provided by the CitentionModel class. It generates text with citations, which can be obtained from generation, attention, retrieval, or a combination of these.
The code below shows how to initialize a model, build a small prompt, and obtain a response with citations from generation and attention. For more usage examples, see the main repo of our paper.
from transformers import AutoModelForCausalLM, AutoTokenizer
from citention.model.citention_model import CitentionModel
# Load base LLM and tokenizer
llm_hf_id = 'Qwen/Qwen3-1.7B'
llm = AutoModelForCausalLM.from_pretrained(llm_hf_id)
tokenizer = AutoTokenizer.from_pretrained(llm_hf_id)
# Define citation methods
# Currently available methods: "generation", "attention", "bm25", "sbert_dual"
citation_scorer_class_names = ['generation', 'attention']
# Define paths to pre-trained parameters (not needed for all methods)
citation_scorer_model_names_or_paths = [None, None]
# Initialize model
model = CitentionModel(
model_name=llm_hf_id,
llm=llm,
tokenizer=tokenizer,
citation_scorer_class_names=citation_scorer_class_names,
citation_scorer_model_names_or_paths=citation_scorer_model_names_or_paths,
calibrate_attention_scores=False,
seed=12345,
citation_prediction_method='top_k'
)
# Dummy source documents that can be used and cited
source_candidates = [
'Albert Einstein: <some_content>',
'Nikola Tesla: <some_content>',
'Marie Curie: <some_content>'
]
# Format source candidates by adding integer identifiers in square brackets ("[i]")
formatted_source_candidates = '\n'.join([
f'[{i}] {source_candidates[i]}'
for i in range(len(source_candidates))
])
instruction = 'You are given a list of source documents and a question. Answer the question only using information from the source documents. Add the ids of the relevant source documents after each response sentence.'
question = 'Who came up with special relativity theory?'
# Create prompt
messages = [{
'role': 'user',
'content': f'{instruction}\n{formatted_source_candidates}\n{question}'
}]
prompt = tokenizer.apply_chat_template(
messages,
tokenize=False,  # return a string so that text can be appended to the prompt
add_generation_prompt=True
)
# Append an empty thinking block, since thinking output currently can't be handled
prompt += '<think>\n</think>'
# Get character spans of source candidates and question
char_spans = {
'source_candidates': [],
'question': (0, 0),
}
offset = 0
for i, source_candidate in enumerate(source_candidates):
    formatted_source_candidate = f'[{i}] {source_candidate}'
    # Search from the end of the previous match
    start = prompt.find(formatted_source_candidate, offset)
    id_span = (start, start + len(f'[{i}]'))
    content_span = (id_span[1] + 1, id_span[1] + 1 + len(source_candidate))
    char_spans['source_candidates'].append({
        'id': id_span,
        'content': content_span
    })
    offset = start + len(formatted_source_candidate)
question_start = prompt.find(question)
char_spans['question'] = (question_start, question_start + len(question))
(
raw_generation,
predicted_statement,
predicted_citations,
combined_scores,
citation_scores
) = model.generate_and_cite(
prompt=prompt,
max_new_tokens=100,
multiple_statements=True,
char_spans=char_spans,
include_source_candidate_ids_in_ranges=False,
attention_query='statement',
retriever_query='question_and_statement',
citation_k=2
)
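Since `char_spans` is passed to `generate_and_cite`, it can be worth checking that each span slices back to the expected text before running the model. The sketch below is self-contained and does not use the package: `find_char_spans` is a hypothetical helper that mirrors the span logic above, and a plain string stands in for the chat-templated prompt.

```python
def find_char_spans(prompt, source_candidates, question):
    """Hypothetical helper: locate '[i] <content>' entries and the question in a prompt."""
    spans = {'source_candidates': [], 'question': (0, 0)}
    offset = 0
    for i, content in enumerate(source_candidates):
        formatted = f'[{i}] {content}'
        start = prompt.find(formatted, offset)
        if start == -1:
            raise ValueError(f'Source candidate {i} not found in prompt')
        id_end = start + len(f'[{i}]')
        spans['source_candidates'].append({
            'id': (start, id_end),
            # +1 skips the space between the id and the content
            'content': (id_end + 1, id_end + 1 + len(content)),
        })
        offset = start + len(formatted)
    question_start = prompt.find(question)
    if question_start == -1:
        raise ValueError('Question not found in prompt')
    spans['question'] = (question_start, question_start + len(question))
    return spans


# Plain string standing in for the chat-templated prompt
demo_sources = ['Albert Einstein: <some_content>', 'Marie Curie: <some_content>']
demo_question = 'Who came up with special relativity theory?'
demo_prompt = (
    'Instruction\n'
    + '\n'.join(f'[{i}] {s}' for i, s in enumerate(demo_sources))
    + f'\n{demo_question}'
)

demo_spans = find_char_spans(demo_prompt, demo_sources, demo_question)
for i, span in enumerate(demo_spans['source_candidates']):
    # Each span should slice back to exactly the id or the content
    assert demo_prompt[slice(*span['id'])] == f'[{i}]'
    assert demo_prompt[slice(*span['content'])] == demo_sources[i]
assert demo_prompt[slice(*demo_spans['question'])] == demo_question
```

Raising on a missing match makes a formatting mismatch between the prompt and the source list visible immediately, rather than silently producing spans that start at -1.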
Training
Training QRHEAD
The QRHEAD paper (Zhang et al., 2025) describes the selection of Query-Focused Retrieval Heads to improve attention-based reranking. We provide a simple QRHeadTrainer class for this purpose. See tests/test_qr_head.py for an example of training and passing the path of the saved attention head parameters to CitentionModel.
Training AT2
The AT2 paper (Cohen-Wang et al., 2025) describes the training of attention head parameters through context perturbation. Parameters trained with the trainer class from the AT2 package can be used directly in CitentionModel.
Training BM25
Usage of BM25 retrieval requires pre-computation of token frequency statistics. The BM25Trainer class fulfills this purpose. See tests/test_bm25.py for an example of training and passing the path of the saved BM25 parameters to CitentionModel.
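To illustrate what token frequency statistics typically mean for BM25, the sketch below computes document frequencies, inverse document frequencies and the average document length for a tiny corpus. It is not the BM25Trainer implementation: the whitespace tokenizer, the function name `compute_bm25_statistics` and the choice of a smoothed IDF variant are assumptions made for illustration.

```python
import math
from collections import Counter


def compute_bm25_statistics(documents):
    """Illustrative only: collect corpus statistics that BM25 scoring relies on."""
    # Assumption: lowercase whitespace tokenization
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token at least once
    doc_freq = Counter(token for tokens in tokenized for token in set(tokens))
    # Smoothed IDF variant that stays positive for every token
    idf = {
        token: math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        for token, df in doc_freq.items()
    }
    avg_doc_len = sum(len(tokens) for tokens in tokenized) / n_docs
    return {'doc_freq': dict(doc_freq), 'idf': idf, 'avg_doc_len': avg_doc_len}


corpus = [
    'einstein developed special relativity',
    'curie studied radioactivity',
    'tesla and einstein were contemporaries',
]
stats = compute_bm25_statistics(corpus)
print(stats['doc_freq']['einstein'])  # 2
print(stats['avg_doc_len'])           # 4.0
```

Rare tokens such as "curie" receive a higher IDF than tokens that occur in several documents, which is why these statistics have to be computed over a corpus before retrieval scores can be obtained.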
Training a Score Combinator
By default, the CitentionModel will compute a uniform average of scores from all citation scorers. To change this to a weighted average, the weights need to be optimized using the ScoreCombinatorTrainer class. See tests/test_score_combinator.py for an example of training and passing the path of the saved score combinator parameters to CitentionModel.
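The difference between the default uniform average and a weighted average can be sketched in a few lines. This is an illustration, not the package's score combinator: the functions `combine_scores` and `top_k_sources`, the example scores and the weights are hypothetical, and the top-k step only mirrors the idea behind `citation_prediction_method='top_k'` and `citation_k` from the usage example.

```python
def combine_scores(scores_per_scorer, weights=None):
    """Illustrative only: average per-source scores across citation scorers."""
    n_scorers = len(scores_per_scorer)
    if weights is None:
        weights = [1.0 / n_scorers] * n_scorers  # uniform average
    total = sum(weights)
    n_sources = len(scores_per_scorer[0])
    return [
        sum(w * scores[j] for w, scores in zip(weights, scores_per_scorer)) / total
        for j in range(n_sources)
    ]


def top_k_sources(combined, k):
    """Return the indices of the k highest-scoring source candidates."""
    ranked = sorted(range(len(combined)), key=lambda j: combined[j], reverse=True)
    return ranked[:k]


# Hypothetical scores for three source candidates from two scorers
generation_scores = [1.0, 0.0, 0.0]
attention_scores = [0.2, 0.7, 0.1]

uniform = combine_scores([generation_scores, attention_scores])
weighted = combine_scores(
    [generation_scores, attention_scores],
    weights=[0.25, 0.75],  # hypothetical weights favoring attention
)

print(top_k_sources(uniform, k=1))   # [0]
print(top_k_sources(weighted, k=1))  # [1]
```

With uniform weights the generation scorer decides the top citation in this toy example, while the weighted version lets the attention scorer override it, which is the kind of trade-off that learned weights are meant to settle.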
Citation
TODO add Citation Failure paper
References
Cohen-Wang, B., Chuang, Y., & Madry, A. (2025). Learning to Attribute with Attention. ArXiv, abs/2504.13752.
Zhang, W., Yin, F., Yen, H., Chen, D., & Ye, X. (2025). Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking. ArXiv, abs/2506.09944.
Disclaimer
This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.