swe-pruner

Self-Adaptive Context Pruning for Coding Agents

These details have not been verified by PyPI

Project description

Installation

Basic Installation

Install from PyPI:

pip install swe-pruner

Flash Attention Setup

SwePruner requires flash-attn, which must be installed separately because the right build depends on your system configuration. Install it using one of the following methods:

Pre-built wheel (recommended): Download the wheel that matches your Python, PyTorch, and CUDA versions from the flash-attention releases and install it:

pip install flash_attn-<version>-<platform>.whl

From source: If no pre-built wheel is available, you can build from source:

pip install flash-attn --no-build-isolation

Note: Flash attention requires CUDA and specific PyTorch versions. Make sure your environment is compatible.
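
Before installing flash-attn, it can help to check what the current environment already provides. The following is a minimal sketch, not part of swe-pruner: the helper name check_environment is hypothetical, and it only reports whether the torch and flash_attn modules can be found and whether PyTorch sees a CUDA device. It does not verify version compatibility.

```python
import importlib.util


def check_environment() -> dict[str, bool]:
    """Report whether torch, flash_attn, and a CUDA device appear usable."""
    report = {
        "torch": importlib.util.find_spec("torch") is not None,
        "flash_attn": importlib.util.find_spec("flash_attn") is not None,
        "cuda": False,
    }
    if report["torch"]:
        try:
            # Imported lazily so the check still runs when torch is absent.
            import torch

            report["cuda"] = bool(torch.cuda.is_available())
        except (ImportError, OSError):
            # A broken torch install is reported as "no CUDA" rather than crashing.
            report["cuda"] = False
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If cuda is reported as missing, the source build of flash-attn is unlikely to succeed on that machine.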

Model Download

The pre-trained model files are not included in the PyPI package. You need to download them separately:

From Hugging Face Hub (if available):

# Requires the huggingface_hub package, which provides the huggingface-cli command
huggingface-cli download <model-repo-id> --local-dir ./model

Manual download: Download the model files and place them in a directory (e.g., ./model) with the following structure:

model/
├── config.json
├── model.safetensors
├── tokenizer.json
├── tokenizer_config.json
└── ... (other tokenizer files)
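
A quick way to confirm a manual download is complete is to check the directory against the listing above. This is a sketch under assumptions: missing_model_files is a hypothetical helper, and it checks only the four file names shown explicitly, not the unspecified "other tokenizer files".

```python
from pathlib import Path

# File names taken from the directory listing above. The listing also mentions
# other tokenizer files, which this check does not try to enumerate.
EXPECTED_FILES = (
    "config.json",
    "model.safetensors",
    "tokenizer.json",
    "tokenizer_config.json",
)


def missing_model_files(model_dir: str) -> list[str]:
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files("./model")
    print("model directory looks complete" if not missing else f"missing: {missing}")
```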

Usage

Command Line Interface

Start the FastAPI server using the CLI:

swe-pruner serve --model-path ./model --port 8000

Options:

--host / -h: Host to bind the server to (default: 0.0.0.0)
--port / -p: Port to run the server on (default: 8000)
--model-path / -m: Path to model directory (overrides SWEPRUNER_MODEL_PATH environment variable)

You can also set the model path using an environment variable:

export SWEPRUNER_MODEL_PATH=./model
swe-pruner serve

Python API

Basic Usage

from hf.prune_wrapper import SwePrunerForCodePruning, PruneRequest

# Load the model
model = SwePrunerForCodePruning.from_pretrained("./model")

# Create a prune request
request = PruneRequest(
    query="Find functions that handle user authentication",
    code="""
def login(username, password):
    # Authentication logic
    if verify_credentials(username, password):
        return create_session(username)
    return None

def logout(session_id):
    # Logout logic
    invalidate_session(session_id)
    """,
    threshold=0.5,
    always_keep_first_frags=False,
    chunk_overlap_tokens=50
)

# Prune the code
response = model.prune(request)

print(f"Relevance score: {response.score}")
print(f"Pruned code:\n{response.pruned_code}")
print(f"Token count: {response.origin_token_cnt} -> {response.left_token_cnt}")

API Response

The PruneResponse object contains:

score: Document-level relevance score (float)
pruned_code: Pruned code string with filtered sections marked
token_scores: List of [token, score] pairs
kept_frags: List of kept line numbers
origin_token_cnt: Original token count
left_token_cnt: Remaining token count after pruning
model_input_token_cnt: Total tokens sent to the model
error_msg: Error message if any (optional)

FastAPI Server

Once the server is running, you can interact with it via HTTP:

Health Check

curl http://localhost:8000/health

Prune Code

curl -X POST http://localhost:8000/prune \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Find authentication functions",
    "code": "def login(): ...",
    "threshold": 0.5
  }'
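
The same request can be sent from Python using only the standard library. This is a sketch: the function names are hypothetical, and it assumes the server replies with a JSON body, which the curl example above does not show.

```python
import json
import urllib.request


def build_prune_request(
    base_url: str, query: str, code: str, threshold: float = 0.5
) -> urllib.request.Request:
    """Build a POST request for the /prune endpoint with the same JSON body as the curl call."""
    payload = {"query": query, "code": code, "threshold": threshold}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/prune",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def prune_over_http(
    base_url: str, query: str, code: str, threshold: float = 0.5, timeout: float = 30.0
) -> dict:
    """Send the request and decode the reply (assumed to be JSON)."""
    request = build_prune_request(base_url, query, code, threshold)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Needs a server started with: swe-pruner serve --model-path ./model --port 8000
    print(prune_over_http("http://localhost:8000", "Find authentication functions", "def login(): ..."))
```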

Configuration

Model Parameters

The model supports various configuration options through SwePrunerConfig:

backbone_model_name_or_path: Backbone model identifier
bottleneck: Bottleneck dimension (default: 256)
dropout: Dropout rate (default: 0.4)
num_fusion_layers: Number of fusion layers (default: 1)
num_heads: Number of attention heads (default: 8)
use_multi_layer_fusion: Whether to use multi-layer fusion (default: True)
compression_head_type: Type of compression head ("ffn", "simple", or "crf")
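
To make the defaults above easier to scan, here is a hypothetical dataclass that mirrors them. It is not the real SwePrunerConfig: the class name PrunerSettings and the validation rules are assumptions, and the two options without a documented default are left as required fields.

```python
from dataclasses import dataclass

# The three head types named in the list above.
ALLOWED_HEAD_TYPES = ("ffn", "simple", "crf")


@dataclass
class PrunerSettings:
    """Hypothetical mirror of the documented options; not the library's config class."""

    backbone_model_name_or_path: str  # no default documented
    compression_head_type: str  # no default documented
    bottleneck: int = 256
    dropout: float = 0.4
    num_fusion_layers: int = 1
    num_heads: int = 8
    use_multi_layer_fusion: bool = True

    def __post_init__(self) -> None:
        # Assumed sanity checks; the real config may validate differently or not at all.
        if self.compression_head_type not in ALLOWED_HEAD_TYPES:
            raise ValueError(f"compression_head_type must be one of {ALLOWED_HEAD_TYPES}")
        if not 0.0 <= self.dropout < 1.0:
            raise ValueError("dropout must be in [0, 1)")
```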

Pruning Parameters

threshold: Score threshold for keeping tokens (default: 0.5)
always_keep_first_frags: Always keep the first N fragments (default: False)
chunk_overlap_tokens: Overlap tokens between chunks for long code (default: 50)
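
The two numeric parameters can be illustrated with plain Python. This is a sketch of the general techniques (score thresholding and overlapping windows), not the library's implementation: both function names and the chunk_size parameter are hypothetical, and treating a score equal to the threshold as "keep" is an assumption.

```python
def keep_above_threshold(token_scores, threshold=0.5):
    """Return the tokens whose score reaches the threshold (>= is an assumption)."""
    return [token for token, score in token_scores if score >= threshold]


def chunk_with_overlap(tokens, chunk_size, overlap=50):
    """Split tokens into windows of chunk_size that share `overlap` tokens with the previous window."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


if __name__ == "__main__":
    print(keep_above_threshold([["def", 0.9], ["#", 0.1], ["login", 0.5]]))
    print(chunk_with_overlap(list(range(10)), chunk_size=4, overlap=2))
```

The overlap means tokens near a chunk boundary are seen with context from both sides, at the cost of scoring them more than once.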

Requirements

Python >= 3.12
PyTorch >= 2.8.0
Transformers >= 4.57.1
CUDA (for GPU acceleration)
Flash Attention 2

See pyproject.toml for the complete list of dependencies.

License

MIT License

Release history

0.1.1 (Jan 28, 2026)
0.1.0 (Jan 22, 2026), this version

Download files

Download the file for your platform.

Source Distribution

swe_pruner-0.1.0.tar.gz (54.6 kB)

Uploaded Jan 22, 2026 Source

Built Distribution

swe_pruner-0.1.0-py3-none-any.whl (18.1 kB)

Uploaded Jan 22, 2026 Python 3

File details

Details for the file swe_pruner-0.1.0.tar.gz.

File metadata

Download URL: swe_pruner-0.1.0.tar.gz
Upload date: Jan 22, 2026
Size: 54.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.22

File hashes

Hashes for swe_pruner-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`3b9a941ca24fd71db4582db04619fe825aa2d3448d4b562199eab500a50def75`
MD5	`2dea96618113ea5d5c5a56fb27b0b074`
BLAKE2b-256	`22432033f29fe3180a705afc791ddc21d29f0f676974f101ea1e6a05433cfe97`

File details

Details for the file swe_pruner-0.1.0-py3-none-any.whl.

File metadata

Download URL: swe_pruner-0.1.0-py3-none-any.whl
Upload date: Jan 22, 2026
Size: 18.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.22

File hashes

Hashes for swe_pruner-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`078044bd116c33cc4590c7ca4fc5904d6eb7fba9cfdc8baa5f13bce205d9c47c`
MD5	`63be45c555fe093eb34f3e0fcdf90f80`
BLAKE2b-256	`a314cfbb32c24047ff51def970ad931c6d6b9433a3814eb06027a8000d9fbea8`
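
A downloaded file can be checked against the SHA256 digests listed above with the standard library. This is a minimal sketch; sha256_matches is a hypothetical helper name, and the file path in the usage lines is assumed to be wherever the archive was saved.

```python
import hashlib
import hmac


def sha256_matches(path: str, expected_hex: str) -> bool:
    """Hash the file in 1 MiB blocks and compare it with the expected hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return hmac.compare_digest(digest.hexdigest(), expected_hex.strip().lower())


if __name__ == "__main__":
    # Digest copied from the table for swe_pruner-0.1.0-py3-none-any.whl.
    expected = "078044bd116c33cc4590c7ca4fc5904d6eb7fba9cfdc8baa5f13bce205d9c47c"
    print(sha256_matches("swe_pruner-0.1.0-py3-none-any.whl", expected))
```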
