
SpaRTA adaptation wrapper. Invocation code to load and run SpaRTA adapters for inference


PEFT-SpaRTA

SpaRTA (Sparse Random parameTer Adaptation) is a Parameter-Efficient Fine-Tuning (PEFT) alternative to traditional LoRA that reduces the number of trainable parameters by randomly selecting a very small proportion of the model parameters to train on.
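The core idea can be sketched in a few lines. This is a conceptual illustration only (using NumPy on a flat array, not the package's actual implementation): sample a small random subset of parameter positions once, keep that choice fixed, and train only additive deltas at those positions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are the flattened parameters of a pre-trained model.
weights = rng.standard_normal(1_000_000)

sparsity = 0.99  # 99% of parameters stay frozen
n_trainable = int((1 - sparsity) * weights.size)

# Randomly pick which positions are trainable; the indices themselves
# are sampled once and are not trained.
idx = rng.choice(weights.size, size=n_trainable, replace=False)

# The deltas at those positions are the only trainable parameters.
deltas = np.zeros(n_trainable)

# The adapted model uses the original weights plus the sparse deltas.
adapted = weights.copy()
adapted[idx] += deltas

print(n_trainable)  # 10000 trainable parameters instead of 1000000
```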

This Python package provides the invocation code necessary to load and run SpaRTA-adapted models for inference. In particular, it includes the classes

  • SpaRTAforSequenceClassification
  • SpaRTAforCausalLM

to load a SpaRTA adapter along with its pre-trained base (transformer) model, architected respectively for sequence classification tasks and autoregressive text generation tasks.

We also include the class

  • SpaRTA

to facilitate sparse random parameter adaptation of a model and train your own SpaRTA adapters. This implementation is compatible with some of the most popular trainers, as shown here.

For more details on how SpaRTA works, see our paper. The original implementation of SpaRTA can be found at https://github.com/IBM/sparta.

Installation

pip install peft-sparta

How to use it for inference

Download a SpaRTA adapter from a Hugging Face repository

Let's download a SpaRTA adapter that specializes the google/gemma-2b model to do sentiment classification of English sentences.

ADAPTER_DIR='/my_sparta_adapters/sparta-gemma_2b/'

mkdir -p $ADAPTER_DIR

hf download jesusriosal/sparta-gemma_2b-sst2 --local-dir $ADAPTER_DIR

Load the SpaRTA adapter and create the adapted model

from peft_sparta import SpaRTAforSequenceClassification

adapter_dir = '/my_sparta_adapters/sparta-gemma_2b/'

model = SpaRTAforSequenceClassification(
   adapter = adapter_dir,
   device = 'cuda')

print(model)
(SpaRTA)ModelForSeqClassification(
	adapter = '/my_sparta_adapters/sparta-gemma_2b/'
	model = 'google/gemma-2b'
	id2label = {0: 'negative', 1: 'positive'}
)

Inputs

Let's use our adapted model to classify a few sentences. For this adapter, the model consumes the sentences directly; no formatting is needed.

sentences = ["I enjoyed very much the movie.", 
             "It was painful to watch.", 
             "I couldn't enjoy more the movie.",
             "It was a bad movie."]

Inference

Probabilistic classification

The adapted model can give us its estimated probabilities that each sentence (row) has negative (first column) or positive (second column) sentiment.

class_probs = model.classify(sentences) 

print(class_probs)
tensor([[0.1152, 0.8848],
        [0.9497, 0.0503],
        [0.1689, 0.8311],
        [0.9720, 0.0280]], device='cuda:0')

To identify which column corresponds to each class, use:

print(model.id2label)
{'0': 'negative', '1': 'positive'}

Here are the model's estimated probabilities of positive sentiment for each sentence

for sentence, pos_prob in zip(sentences, class_probs[:,1]):
    print(f"{pos_prob.item()*100:>4.0f}%\t{sentence}")
 Prob   Sentence
 ----   -----------------------------
  88%	I enjoyed very much the movie.
   5%	It was painful to watch.
  83%	I couldn't enjoy more the movie.
   3%	It was a bad movie.

Deciding the sentiment class of each sentence (deterministic classification)

We have seen how the model makes probabilistic assessments of the sentiment of each sentence. If we want the model to make a definitive decision on whether a sentence has positive or negative sentiment, we can use:

classes = model.decide_class(sentences) 

to obtain the model's predicted class for each sentence. The model simply takes the most likely class as its sentiment prediction.

for sentence, sentence_class in zip(sentences, classes):
    print(f"'{sentence_class}':  {sentence}")
 Sentiment   Sentence
-----------  -------------------------------
'positive':  I enjoyed very much the movie.
'negative':  It was painful to watch.
'positive':  I couldn't enjoy more the movie.
'negative':  It was a bad movie.
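Conceptually, deciding a class amounts to taking the argmax of the class probabilities and mapping it through id2label. A minimal sketch of that logic (the package's actual decide_class method may differ), using the probabilities from the example above:

```python
# Class probabilities as returned by model.classify in the example above.
class_probs = [[0.1152, 0.8848],
               [0.9497, 0.0503],
               [0.1689, 0.8311],
               [0.9720, 0.0280]]

id2label = {0: 'negative', 1: 'positive'}

# Pick the most likely class (argmax) for each row and map it to its label.
classes = [id2label[max(range(len(row)), key=row.__getitem__)]
           for row in class_probs]

print(classes)  # ['positive', 'negative', 'positive', 'negative']
```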

Input templates

Sometimes the input may need to be formatted before our adapted model can process it. This is typically the case with instruction-following models, for which wrapping the input in an instruction, formatted with the model's chat template, can be advantageous. In these cases, the input_template argument specifies the formatting that was applied to the raw inputs during training, and that must also be applied during inference.

To see this, let's use another SpaRTA adapter for sentiment classification based on the google/gemma-2b-it model.

hf download jesusriosal/sparta-gemma_2b-sst2 --local-dir '/my_sparta_adapters/sparta-gemma_2b_it/'

from peft_sparta import SpaRTAforSequenceClassification

adapter_dir = '/my_sparta_adapters/sparta-gemma_2b_it/'

model = SpaRTAforSequenceClassification(
    adapter=adapter_dir,
    device='cuda',
    input_template = ("<start_of_turn>user\n"
                      "Determine the sentiment of the following sentence about a movie. "
                      "The sentiment can only be classified as positive or negative.\n"
                      "Sentence: {sentence}" 
                      "<end_of_turn>\n<start_of_turn>model\n"
                      "The sentiment of the sentence is")
    )

print(model)
(SpaRTA)ModelForSeqClassification(
	adapter = '/my_sparta_adapters/sparta-gemma_2b_it/'
	model = 'google/gemma-2b-it'
	id2label = {0: 'negative', 1: 'positive'}
)

This SpaRTA adapter was trained by formatting the input sentences to be classified with the input_template (see the model.template printout below), which includes a task instruction. Passing the same input_template at load time ensures that the same formatting is applied to the inputs during inference.

print(model.template)
<start_of_turn>user
Determine the sentiment of the following sentence about a movie. The sentiment can only be classified as positive or negative.
Sentence: {sentence}<end_of_turn>
<start_of_turn>model
The sentiment of the sentence is

For example, the sentence

I enjoyed very much the movie.

is converted to

<start_of_turn>user
Determine the sentiment of the following sentence about a movie. The sentiment can only be classified as positive or negative.
Sentence: I enjoyed very much the movie.<end_of_turn>
<start_of_turn>model
The sentiment of the sentence is

before passing it to the model for classification.
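Note that the template is an ordinary Python format string with a {sentence} placeholder, so the substitution itself is just str.format. Shown here purely for illustration:

```python
# The same template string passed as input_template above.
template = ("<start_of_turn>user\n"
            "Determine the sentiment of the following sentence about a movie. "
            "The sentiment can only be classified as positive or negative.\n"
            "Sentence: {sentence}"
            "<end_of_turn>\n<start_of_turn>model\n"
            "The sentiment of the sentence is")

# Substitute the raw sentence into the template.
formatted = template.format(sentence="I enjoyed very much the movie.")
print(formatted)
```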

Thus, to classify the (raw, unformatted) sentences above, we proceed as follows:

sentences = [{'sentence': sent} for sent in sentences]

class_probs = model.classify(sentences)

# prob of positive sentiment for each sentence 
for sentence, pos_prob in zip(sentences, class_probs[:,1]):
    print(f"{pos_prob.item()*100:>4.0f}%\t{sentence['sentence']}")
 100%	I enjoyed very much the movie.
   0%	It was painful to watch.
 100%	I couldn't enjoy more the movie.
   0%	It was a bad movie.
classes = model.decide_class(sentences)

for sentence, sentence_class in zip(sentences, classes):
    print(f"'{sentence_class}':  {sentence['sentence']}")
 Sentiment   Sentence
-----------  -------------------------------
'positive':  I enjoyed very much the movie.
'negative':  It was painful to watch.
'positive':  I couldn't enjoy more the movie.
'negative':  It was a bad movie.

Out-of-Distribution performance evaluations

If you have a labeled dataset with English sentences and their sentiment labels, like the one below, you can evaluate the performance of these models on that dataset as follows.

Given the following dataset of new, unseen sentences and their sentiment labels:

test_sentences = ["it's a charming journey. ",
                  "bleak and desperate",
                  "nolan is poised to embark a major career as a commercial yet inventive filmmaker.",
                  "the acting, costumes, music, cinematography and sound are all astounding. ",
                  "it's slow -- very, very slow. ",
                  "the film is a refreshingly serious look at young women.",
                  "a sometimes tedious film.",
                  "like doing last year's taxes with your ex-wife.",
                  "you don't have to know about music to appreciate the film. ",
                  "in exactly 89 minutes, most of which passed as slowly as if i'd been sitting naked on an igloo, the movie sank from quirky to jerky to utter turkey."]

test_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]

where a label of 0 represents negative sentiment and a label of 1 positive.

We evaluate the model on this labeled dataset as follows. Since this model was loaded with an input_template, we first need to wrap each sentence in a dictionary with a 'sentence' key so the template can be applied to it.

test_sentences = [{'sentence': sent} for sent in test_sentences] # for the model with input_template

model.evaluate(test_sentences, test_labels, batch_size=64)
loss: 0.002
accuracy: 100%
confusion matrix: [5, 0
                   0, 5]
balanced accuracy: 100% 
MCC: 1.0
F1-score: 1.0
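For reference, all of the reported metrics can be derived from the confusion matrix. A self-contained sketch of how they relate (not the package's evaluate code), using the perfect 5/0/0/5 matrix from the run above:

```python
import math

# Confusion matrix from the evaluation above:
# rows = true class, columns = predicted class.
tn, fp, fn, tp = 5, 0, 0, 5

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall_pos = tp / (tp + fn)            # true positive rate
recall_neg = tn / (tn + fp)            # true negative rate
balanced_accuracy = (recall_pos + recall_neg) / 2
precision = tp / (tp + fp)
f1 = 2 * precision * recall_pos / (precision + recall_pos)
# Matthews correlation coefficient.
mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

print(accuracy, balanced_accuracy, f1, mcc)  # 1.0 1.0 1.0 1.0
```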

How to train a SpaRTA adapter

Given a pre-trained model, we prepare it for fine-tuning with SpaRTA by

from peft_sparta import SpaRTA

model = SpaRTA(model, sparsity=0.99)

This adds the adapter to the pre-trained model. The adapter consists of non-trainable, randomly sampled indices and trainable deltas, representing the changes to the original model parameters at those indices. Note that in this case we have chosen a sparsity level of 99%, meaning that only about 1% of the model parameters remain trainable.
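To make the sparsity setting concrete, here is the back-of-the-envelope arithmetic for a roughly 2.5B-parameter base model (the parameter count is an illustrative round figure, not an exact count for google/gemma-2b; the actual trainable count also depends on frozen_modules and the classification head):

```python
# Rough parameter count for a ~2.5B-parameter base model (illustrative figure).
n_params = 2_500_000_000

sparsity = 0.99  # 99% of parameters stay frozen
n_trainable = round((1 - sparsity) * n_params)

print(f"{n_trainable:,} trainable deltas")  # 25,000,000 trainable deltas
```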

Our SpaRTA wrapper class supports the following arguments:

  • model (nn.Module) Pre-trained model to be adapted.

  • sparsity (float) Target fraction of the total number of model parameters to make non-trainable. Must be 0 < sparsity < 1.

  • frozen_modules (list[str], optional) List of layer-name substrings whose layers are frozen entirely (non-trainable). Classification heads ("score") are always fully trainable by default. Defaults to ["embed_tokens", "self_attn.q", "self_attn.k", "mlp", "norm"].

  • trainable_tokens (list[int], optional) List of (unique) token ids whose embeddings should be fully trainable. Useful for (special) tokens newly added to the vocabulary. Defaults to None.

  • dropout (float, optional) Dropout probability applied to the trainable parameters during training. Must be 0 <= dropout < 1. Defaults to 0.

The following notebooks illustrate examples of how to train a SpaRTA adapter with several popular trainers.

  1. Linear regression
  2. Sequence classification
  3. Text generation

Citation

@article{rios2025sparsity,
  title={Sparsity may be all you need: Sparse random parameter adaptation},
  author={Rios, Jesus and Dognin, Pierre and Luss, Ronny and Ramamurthy, Karthikeyan N},
  journal={arXiv preprint arXiv:2502.15975},
  year={2025}
}
@software{rios2025sparta,
  title   = {{PEFT-SpaRTA}},
  author  = {Rios, Jesus},
  url     = {https://github.com/jmriosal/peft-sparta}
}
