Feature selection using a genetic algorithm
Feature Selection via Genetic Algorithm (FSGA)
A university project implementing feature selection using Genetic Algorithms, with evaluation and visualization tools.
Quick Start
# Installation
git clone <repository-url>
cd feature-selection-via-genetic-algorithm
uv venv && source .venv/bin/activate
uv pip install -e .
# Run example
python experiments/run_comparison.py
Basic Usage
from fsga.core.genetic_algorithm import GeneticAlgorithm
from fsga.datasets.loader import load_dataset
from fsga.evaluators.accuracy_evaluator import AccuracyEvaluator
from fsga.ml.models import ModelWrapper
# Load data and setup
X_train, X_test, y_train, y_test, _ = load_dataset('iris', split=True)
model = ModelWrapper('rf', n_estimators=50, random_state=42)
evaluator = AccuracyEvaluator(X_train, y_train, X_test, y_test, model)
# Run GA
from fsga.selectors.tournament_selector import TournamentSelector
from fsga.operators.uniform_crossover import UniformCrossover
from fsga.mutations.bitflip_mutation import BitFlipMutation
ga = GeneticAlgorithm(
    num_features=X_train.shape[1],
    evaluator=evaluator,
    selector=TournamentSelector(evaluator, tournament_size=3),
    crossover_operator=UniformCrossover(),
    mutation_operator=BitFlipMutation(probability=0.01),
    population_size=50,
    num_generations=100,
    early_stopping_patience=10,
)
results = ga.evolve()
print(f"Accuracy: {results['best_fitness']:.2%}")
print(f"Features: {results['best_chromosome'].sum()}/{X_train.shape[1]}")
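The `best_chromosome` printed above is a binary mask with one bit per feature. As a minimal sketch of how such a mask maps back to dataset columns (the helpers `selected_indices` and `apply_mask` are hypothetical and not part of fsga; plain lists stand in for the arrays the library works with):

```python
from typing import Sequence


def selected_indices(chromosome: Sequence[int]) -> list[int]:
    """Return the positions of the bits set to 1 (the selected features)."""
    return [i for i, bit in enumerate(chromosome) if bit]


def apply_mask(rows: Sequence[Sequence[float]], chromosome: Sequence[int]) -> list[list[float]]:
    """Keep only the columns whose chromosome bit is 1."""
    keep = selected_indices(chromosome)
    return [[row[i] for i in keep] for row in rows]


# Toy data: 2 samples x 4 features (values are illustrative only).
X = [[5.1, 3.5, 1.4, 0.2],
     [6.2, 2.9, 4.3, 1.3]]
chromosome = [0, 0, 1, 1]  # keep the last two features

print(selected_indices(chromosome))
print(apply_mask(X, chromosome))
```

With NumPy arrays the same selection is usually written with boolean indexing, for example `X[:, chromosome.astype(bool)]`.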
Key Features
- Modular Design: Swappable operators, selectors, and evaluators
- Multiple Operators: 5 crossover types, 5 selection strategies, 3 fitness functions
- Baseline Comparisons: Built-in RFE, LASSO, Mutual Information, Chi², ANOVA
- Statistical Testing: Wilcoxon, Mann-Whitney, Cohen's d, Jaccard stability
- Visualization: 9 plot functions for analysis and comparison
- Experiment Framework: ExperimentRunner for reproducible experiments
- Configuration: YAML-based configuration system
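Two of the statistics listed above are simple enough to sketch directly. The following is an illustrative, standard-library version of Jaccard stability (mean pairwise overlap of the feature subsets chosen in different runs) and Cohen's d with a pooled standard deviation. The function names are assumptions for this example and may not match fsga's own API. The two rank tests are available in SciPy as `scipy.stats.wilcoxon` and `scipy.stats.mannwhitneyu`.

```python
from itertools import combinations
from statistics import mean, stdev


def jaccard(a: set[int], b: set[int]) -> float:
    """Intersection size divided by union size; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def jaccard_stability(subsets: list[set[int]]) -> float:
    """Mean pairwise Jaccard similarity; needs at least two subsets."""
    return mean(jaccard(a, b) for a, b in combinations(subsets, 2))


def cohens_d(x: list[float], y: list[float]) -> float:
    """Cohen's d using the pooled sample standard deviation."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / pooled_var ** 0.5


# Feature subsets selected in three hypothetical runs.
runs = [{2, 3}, {2, 3}, {1, 2, 3}]
print(f"stability = {jaccard_stability(runs):.3f}")

# Illustrative accuracy samples for two methods.
print(f"d = {cohens_d([2.0, 4.0, 6.0], [1.0, 3.0, 5.0]):.2f}")
```

A stability of 1.0 means every run selected the same subset; values near 0 mean the runs barely overlap.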
Architecture
fsga/
├── core/ # GA engine (genetic_algorithm, population)
├── operators/ # Crossover: uniform, single-point, two-point, multi-point
├── mutations/ # Mutation: bitflip
├── selectors/ # Selection: tournament, roulette, ranking, elitism
├── evaluators/ # Fitness: accuracy, F1, balanced accuracy
├── ml/ # Model wrappers (sklearn integration)
├── datasets/ # Dataset loaders (iris, wine, breast_cancer, digits)
├── analysis/ # Baselines + ExperimentRunner
├── visualization/ # 9 plot functions
└── utils/ # Config, metrics, serialization, logging
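To make the roles of the selector, crossover, and mutation packages concrete, here is a minimal, standard-library sketch of a GA loop with tournament selection, uniform crossover, bit-flip mutation, and patience-based early stopping. It is not fsga's implementation: the function names, the toy fitness function, and the loop structure are all assumptions chosen for illustration.

```python
import random


def tournament_select(population, fitnesses, k, rng):
    """Pick k distinct random individuals and return the fittest of them."""
    contenders = rng.sample(range(len(population)), k)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]


def uniform_crossover(parent_a, parent_b, rng):
    """Take each gene from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def bitflip_mutation(chromosome, probability, rng):
    """Flip each bit independently with the given probability."""
    return [1 - bit if rng.random() < probability else bit for bit in chromosome]


def toy_fitness(chromosome):
    """Stand-in for a model-based evaluator: reward features 2 and 3, penalise size."""
    return chromosome[2] + chromosome[3] - 0.1 * sum(chromosome)


def evolve(num_features=8, population_size=20, num_generations=50, patience=10, seed=42):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(num_features)]
                  for _ in range(population_size)]
    best, best_fitness, stale = None, float("-inf"), 0

    for _ in range(num_generations):
        fitnesses = [toy_fitness(c) for c in population]
        gen_best = max(range(population_size), key=lambda i: fitnesses[i])
        if fitnesses[gen_best] > best_fitness:
            best, best_fitness, stale = population[gen_best][:], fitnesses[gen_best], 0
        else:
            stale += 1
            if stale >= patience:  # no improvement for `patience` generations
                break
        # Build the next generation: select two parents, cross them, mutate the child.
        population = [
            bitflip_mutation(
                uniform_crossover(
                    tournament_select(population, fitnesses, 3, rng),
                    tournament_select(population, fitnesses, 3, rng),
                    rng,
                ),
                0.01,
                rng,
            )
            for _ in range(population_size)
        ]
    return best, best_fitness


best, fitness = evolve()
print(best, round(fitness, 2))
```

Because each operator is a plain function with the same inputs and outputs, swapping one for another (say, single-point for uniform crossover) does not touch the loop, which is the idea behind the modular layout above.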
Documentation
- Getting Started - Installation and basic usage
- Tutorial - Step-by-step guide with examples
- Architecture - System design and extension points
- Project Plan - Status and roadmap
- Module READMEs - See fsga/*/README.md for component details
Example Results
Breast Cancer Dataset (30 features → 12 features):
- GA Accuracy: 98.3% with 40% of features
- All Features: 95.7% with 100% of features
- Net effect: +2.6 percentage points of accuracy with a 60% reduction in dimensionality
Iris Dataset (4 features → 2 features):
- GA Accuracy: 98.3% with 50% of features
- Selected: petal length, petal width
Wine Dataset (13 features → 6.5 features, presumably a mean over runs):
- GA Accuracy: 100% with 50% of features
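The percentages quoted for the Breast Cancer dataset follow directly from the feature counts and accuracies. A quick check of that arithmetic, using only the numbers stated above:

```python
# Breast Cancer figures quoted in the results above.
total_features = 30
selected_features = 12
ga_accuracy = 98.3        # percent, GA-selected subset
baseline_accuracy = 95.7  # percent, all features

kept_fraction = selected_features / total_features
reduction = 1 - kept_fraction
accuracy_gain = ga_accuracy - baseline_accuracy  # difference in percentage points

print(f"Features kept: {kept_fraction:.0%}")                      # Features kept: 40%
print(f"Dimensionality reduction: {reduction:.0%}")               # Dimensionality reduction: 60%
print(f"Accuracy gain: {accuracy_gain:+.1f} percentage points")   # Accuracy gain: +2.6 percentage points
```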
Running Experiments
# Full analysis (all datasets, all visualizations)
python experiments/run_experiment.py
# Quick test (single dataset, fewer runs)
python experiments/run_experiment.py --quick
# Specific datasets only
python experiments/run_experiment.py --datasets iris wine
# Without visualizations (faster)
python experiments/run_experiment.py --no-plots
# Results saved to: results/{mode}/{dataset}/
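For readers who want to add similar flags to their own scripts, the options shown above can be modeled with `argparse` roughly as follows. This is a sketch only: the actual parser in `experiments/run_experiment.py` may be structured differently, and the help strings and the `build_parser` name are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the documented command-line flags."""
    parser = argparse.ArgumentParser(description="Run feature-selection experiments (sketch).")
    parser.add_argument("--quick", action="store_true",
                        help="single dataset, fewer runs")
    parser.add_argument("--datasets", nargs="+", default=None, metavar="NAME",
                        help="restrict the run to these datasets")
    parser.add_argument("--no-plots", action="store_true",
                        help="skip visualizations")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is reproducible.
args = build_parser().parse_args(["--datasets", "iris", "wine", "--no-plots"])
print(args.datasets, args.quick, args.no_plots)
```

Note that `argparse` exposes `--no-plots` as the attribute `args.no_plots`, with the hyphen converted to an underscore.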
Tests
# Run all tests
uv run pytest tests/ -v
# With coverage
uv run pytest tests/ --cov=fsga --cov-report=html
# Current: 280+ tests, 82% coverage
Configuration
Example config (configs/default.yaml):
population_size: 50
num_generations: 100
mutation_rate: 0.01
crossover_rate: 0.8
early_stopping_patience: 10
dataset:
  name: iris
  split_ratio: 0.7
Load with:
from fsga.utils.config import Config
config = Config.from_file('configs/default.yaml')
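The internals of `Config` are not shown here. As a hedged illustration of how the keys in the example file could be typed and validated once the YAML has been parsed into a dictionary, the sketch below uses standard-library dataclasses. `GAConfig`, `DatasetConfig`, and `from_dict` are hypothetical names for this example, not fsga's API.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetConfig:
    name: str = "iris"
    split_ratio: float = 0.7


@dataclass
class GAConfig:
    population_size: int = 50
    num_generations: int = 100
    mutation_rate: float = 0.01
    crossover_rate: float = 0.8
    early_stopping_patience: int = 10
    dataset: DatasetConfig = field(default_factory=DatasetConfig)

    def __post_init__(self) -> None:
        # Rates are probabilities, so they must lie in [0, 1].
        for name in ("mutation_rate", "crossover_rate"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")
        if self.population_size < 2:
            raise ValueError("population_size must be at least 2")

    @classmethod
    def from_dict(cls, raw: dict) -> "GAConfig":
        """Build a config from a parsed mapping; the nested `dataset` key is optional."""
        raw = dict(raw)  # shallow copy so the caller's dict is not mutated
        dataset = DatasetConfig(**raw.pop("dataset", {}))
        return cls(dataset=dataset, **raw)


# The same values as configs/default.yaml, written as the dict a YAML parser would return.
raw = {
    "population_size": 50,
    "num_generations": 100,
    "mutation_rate": 0.01,
    "crossover_rate": 0.8,
    "early_stopping_patience": 10,
    "dataset": {"name": "iris", "split_ratio": 0.7},
}
config = GAConfig.from_dict(raw)
print(config.dataset.name, config.population_size)
```

Validating in `__post_init__` means a bad value such as `mutation_rate: 1.5` fails when the config is loaded rather than partway through a long run.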
License
MIT License - see LICENSE file for details.
Contributing
Contributions welcome! See module READMEs for extension points:
- New operators: fsga/operators/README.md
- New selectors: fsga/selectors/README.md
- New evaluators: fsga/evaluators/README.md