
Sample-Importance-Aware Selection (SIAS) - Continual learning and coreset selection algorithms

SAGE SIAS (Sample-Importance-Aware Selection)

Independent package for sample-importance-aware selection, continual learning, and coreset algorithms

Requires Python 3.10+ · License: MIT

🎯 Overview

sage-sias (installed from PyPI as isage-sias, imported as sage_sias) provides Sample-Importance-Aware Selection algorithms for:

  • Continual Learning: Efficient sample selection for continual/lifelong learning scenarios
  • Coreset Selection: Select representative subsets from large datasets
  • Active Learning: Importance-based data selection strategies
  • Tool/Trajectory Curation: Select important samples for agent training

📦 Installation

# Basic installation
pip install isage-sias

# With PyTorch support (the quotes stop shells such as zsh from expanding the brackets)
pip install "isage-sias[torch]"

# Development installation
pip install "isage-sias[dev]"

🚀 Quick Start

Continual Learning

from sage_sias import ContinualLearner

# Create continual learner
learner = ContinualLearner(
    buffer_size=1000,
    selection_strategy="importance"
)

# Add samples; `stream` is any iterable of (data, label) pairs that you supply
for data, label in stream:
    learner.add_sample(data, label)

# Get selected samples
important_samples = learner.get_buffer()

Coreset Selection

from sage_sias import CoresetSelector

# Create coreset selector
selector = CoresetSelector(
    target_size=100,
    method="kmeans++"
)

# Select representative samples; `dataset` and `features` are supplied by the caller
coreset = selector.select(dataset, features)

📚 Key Components

1. Continual Learner (continual_learner.py)

Manages sample selection for continual learning:

  • Buffer management with importance-based eviction
  • Multiple selection strategies (random, importance, diversity)
  • Support for experience replay
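
To make "importance-based eviction" concrete, here is a minimal standalone sketch of a fixed-size buffer that drops its least important sample when full. It does not use sage_sias and is not the package's implementation; the class name `ImportanceBuffer` and its methods are hypothetical, and the importance score is assumed to be supplied by the caller.

```python
import heapq
import itertools
from typing import Any, List, Tuple


class ImportanceBuffer:
    """Hypothetical fixed-capacity buffer that evicts the least important sample."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # Min-heap of (importance, insertion_index, sample). The unique index
        # breaks ties so heapq never has to compare the samples themselves.
        self._heap: List[Tuple[float, int, Any]] = []
        self._counter = itertools.count()

    def add(self, sample: Any, importance: float) -> None:
        entry = (importance, next(self._counter), sample)
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, entry)
        elif importance > self._heap[0][0]:
            # Buffer is full: replace the current least important sample.
            heapq.heapreplace(self._heap, entry)
        # Otherwise the new sample is no more important than anything stored
        # and is dropped.

    def samples(self) -> List[Any]:
        """Return the stored samples, most important first."""
        return [sample for _, _, sample in sorted(self._heap, reverse=True)]


buffer = ImportanceBuffer(capacity=2)
for name, score in [("a", 0.1), ("b", 0.9), ("c", 0.5)]:
    buffer.add(name, score)
kept = buffer.samples()  # ["b", "c"]: "a" had the lowest score and was evicted
```

A min-heap keeps the eviction candidate at the root, so each insertion costs O(log n) in the buffer size rather than a full scan.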

2. Coreset Selector (coreset_selector.py)

Selects representative subsets:

  • K-means++ based selection
  • Diversity-aware sampling
  • Importance scoring
  • Support for large-scale datasets
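
As an illustration of what k-means++ based selection can look like, the sketch below applies the standard k-means++ seeding rule: the first point is picked uniformly at random, and each later point is picked with probability proportional to its squared distance from the nearest point already chosen. This is a pure-Python sketch under that assumption, not the code in coreset_selector.py; the function name `kmeanspp_select` and its parameters are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_select(
    features: Sequence[Sequence[float]],
    target_size: int,
    seed: Optional[int] = None,
) -> List[int]:
    """Return indices of `target_size` points chosen by k-means++ seeding."""
    n = len(features)
    if not 0 < target_size <= n:
        raise ValueError("target_size must be between 1 and len(features)")
    rng = random.Random(seed)

    # The first point is drawn uniformly at random.
    chosen = [rng.randrange(n)]
    # Squared distance from each point to its nearest chosen point so far.
    nearest = [squared_distance(f, features[chosen[0]]) for f in features]

    while len(chosen) < target_size:
        if sum(nearest) == 0.0:
            # Every remaining point duplicates a chosen one; take any unchosen index.
            taken = set(chosen)
            nxt = next(i for i in range(n) if i not in taken)
        else:
            # Sample proportionally to squared distance (the D^2 weighting).
            nxt = rng.choices(range(n), weights=nearest, k=1)[0]
        chosen.append(nxt)
        for i, f in enumerate(features):
            nearest[i] = min(nearest[i], squared_distance(f, features[nxt]))
    return chosen


points = [[0.0, 0.0], [0.0, 0.1], [10.0, 10.0], [10.0, 10.1]]
coreset_indices = kmeanspp_select(points, target_size=2, seed=0)
```

Because already-chosen points have zero weight, the selection favors points far from the current coreset, which is what makes the subset spread across the dataset.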

3. Types (types.py)

Common data types and protocols:

  • Sample representation
  • Importance scoring interfaces
  • Selection strategies
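
The README does not show the contents of types.py, so the following is only a guess at the general shape such types might take: a dataclass for the sample record and `typing.Protocol` classes for scoring and selection. Every name here (`Sample`, `ImportanceScorer`, `SelectionStrategy`, `LengthScorer`, `TopKStrategy`, `score_and_select`) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Protocol, Sequence


@dataclass
class Sample:
    """Hypothetical sample record: payload, optional label, importance score."""
    data: Any
    label: Any = None
    importance: float = 0.0
    metadata: Dict[str, Any] = field(default_factory=dict)


class ImportanceScorer(Protocol):
    """Any object with a matching `score` method satisfies this protocol."""
    def score(self, sample: Sample) -> float: ...


class SelectionStrategy(Protocol):
    """Any object with a matching `select` method satisfies this protocol."""
    def select(self, samples: Sequence[Sample], k: int) -> List[Sample]: ...


class LengthScorer:
    """Toy scorer: longer payloads count as more important."""
    def score(self, sample: Sample) -> float:
        return float(len(sample.data))


class TopKStrategy:
    """Keep the k samples with the highest importance."""
    def select(self, samples: Sequence[Sample], k: int) -> List[Sample]:
        return sorted(samples, key=lambda s: s.importance, reverse=True)[:k]


def score_and_select(
    samples: Sequence[Sample],
    scorer: ImportanceScorer,
    strategy: SelectionStrategy,
    k: int,
) -> List[Sample]:
    """Score every sample in place, then let the strategy pick k of them."""
    for sample in samples:
        sample.importance = scorer.score(sample)
    return strategy.select(samples, k)


picked = score_and_select(
    [Sample("aa"), Sample("a"), Sample("aaaa")], LengthScorer(), TopKStrategy(), k=2
)
picked_data = [s.data for s in picked]  # ["aaaa", "aa"]
```

Protocols rely on structural typing, so scorers and strategies can be swapped without inheriting from a shared base class.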

🔧 Architecture

sage_sias/
├── continual_learner.py     # Continual learning with buffer management
├── coreset_selector.py      # Coreset selection algorithms
├── types.py                 # Common types and protocols
└── __init__.py              # Public API exports

🎓 Use Cases

  1. Agent Training: Select important trajectories for fine-tuning
  2. Data Pruning: Reduce dataset size while maintaining performance
  3. Active Learning: Query most informative samples
  4. Memory Management: Maintain representative samples in limited buffers
  5. Transfer Learning: Select relevant samples for adaptation
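
For the active learning use case, one common way to define "most informative" is predictive entropy: query the samples whose predicted class distribution is closest to uniform. The sketch below assumes that definition; it is independent of sage_sias, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def most_uncertain(predictions: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return the indices of the k predictions with the highest entropy."""
    ranked = sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )
    return ranked[:k]


# Predicted class probabilities for three unlabeled samples.
predictions = [[0.99, 0.01], [0.5, 0.5], [0.8, 0.2]]
to_label = most_uncertain(predictions, k=2)  # [1, 2]
```

The confident prediction at index 0 has the lowest entropy and is left unqueried.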

🔗 Integration with SAGE

This package is part of the SAGE ecosystem but can be used independently:

# Standalone usage
from sage_sias import ContinualLearner, CoresetSelector

# With SAGE agentic (optional)
from sage_agentic import AgentTrainer
from sage_sias import CoresetSelector

trainer = AgentTrainer()
selector = CoresetSelector(target_size=100)
# `all_trajectories` is the collection of agent trajectories that you supply
important_trajectories = selector.select(all_trajectories)
trainer.train(important_trajectories)

📖 Documentation

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

MIT License. See the LICENSE file for details.

🙏 Acknowledgments

Originally part of the SAGE framework, now maintained as an independent package for broader community use.

📧 Contact
