A tool for creating heuristic and ML-based importance scores.
ImportanceScore
ImportanceScore is a configurable, ML-based tool suite designed to create meaningful importance scores by applying either supervised machine learning or explicit, rule-based logic. ImportanceScore externalizes all configuration so that your scoring process is automated, repeatable, and scalable.
Note: For a detailed guide to the GUI, see the Usage Guide.
Key Benefits
- Reproducible Pipeline: A config-driven system and a tune -> train -> predict workflow ensure every run is repeatable. The system is designed for version control (e.g., Git), allowing you to archive configurations and model artifacts together for long-term reproducibility.
- Prescriptive Directories and File Names: To ensure clarity and reproducibility, ImportanceScore uses a standardized directory structure and file naming convention. This design allows you to define the logic for a category once and apply it to any number of different data segments.
- Transparent & Tunable: The system is built for the iterative loop of "score -> explain -> tune." Detailed logging, feature contribution reports (--explain), and a fully configuration-driven design allow you to build trust in your model and refine its logic with precision. A GUI is provided to make this process quick and easy.
- Drop-in Models: Because of its structured nature, you can easily switch between models. This includes the ability to start with the rule-based Weighted Linear Model, use it to create training data, and then switch to the more powerful Random Forest Regressor.
- Scoring-Specific Features: The tool includes a powerful preprocessing pipeline with features specifically tailored for creating importance scores:
  - text_weight_scoring: Assign bonus points based on keywords.
  - feature_interactions: Combine related features (e.g., historic, heritage) to prevent double-counting.
  - clip_outliers: Cap feature values at absolute thresholds based on domain knowledge.
- Regional Context Scoring: The system allows you to score distinct geographic regions independently while using a shared logic configuration. This is critical for highlighting locally significant features that would otherwise be overshadowed in a global ranking.
  - Example: Mt. Mitchell (2,037 m) is the towering giant of the Appalachians and a major landmark. However, if scored directly against the 4,000 m peaks of the Rockies, it would appear insignificant. By scoring regions separately (e.g., East_Peaks, West_Peaks), the system correctly identifies Mt. Mitchell as a "Tier 1" feature within its context, ensuring it appears prominently on the map.
  - Requirement: To use this feature, simply provide separate input files for each region (e.g., peaks_east.csv, peaks_west.csv) and run the scoring pipeline for each file individually.
Directory Structure
This system uses two key organizing concepts: category and segment.
- category: A reusable blueprint for a type of data (e.g., peaks, poi).
- segment: A specific subset of data being processed (e.g., uswest, yellowstone).
The project layout separates reusable configurations from segment-specific data:
- config/: (Category-centric) Contains all reusable YAML configuration files. These are named by category (e.g., peaks_model.yml).
- models/: (Category-centric) Stores the final trained .joblib model artifacts, which are also named by category.
- data/: (Segment-centric) Holds all data files, which are almost always specific to a segment.
- data/raw/: Input feature and target files.
- data/interim/: Intermediate outputs, such as scored files.
- logs/: (Segment-centric) Contains detailed output and explanation files from specific runs.
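Put together, a project with a peaks category and a uswest segment might look like the tree below. The specific file names beyond those mentioned above are illustrative.

```
config/peaks_model.yml
models/peaks_model.joblib
data/raw/uswest_peaks_features.csv
data/interim/uswest_peaks_score.csv
logs/uswest_peaks_explain.csv
```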
File Naming Convention
File names are designed to be self-describing:
- Configuration Files: Are always named for the category they configure.
  - config/peaks_model.yml
  - config/poi_classification.yml
- Data and Log Files: Must be prefixed with their segment and category.
  - data/raw/uswest_peaks_features.csv
  - data/interim/yellowstone_poi_score.csv
  - logs/yellowstone_poi_explain.csv
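The convention above is mechanical enough to express as a tiny helper. This helper is hypothetical (not part of the package); it simply makes the segment-then-category prefix rule concrete.

```python
from pathlib import Path

def config_path(category: str, kind: str = "model") -> Path:
    # Configuration files are named for the category alone.
    return Path("config") / f"{category}_{kind}.yml"

def data_path(segment: str, category: str, suffix: str,
              root: str = "data/raw") -> Path:
    # Data and log files are prefixed with segment, then category.
    return Path(root) / f"{segment}_{category}_{suffix}.csv"

print(config_path("peaks"))                       # config/peaks_model.yml
print(data_path("uswest", "peaks", "features"))   # data/raw/uswest_peaks_features.csv
```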
Weighted Linear Model (WLM)
This suite provides a WeightedLinearModel, a scikit-learn-compatible, rule-based model. The final score is calculated as: score = intercept + Σ(contribution_of_each_feature).
The contribution from each feature is determined by its configured mode:
- presence: If the feature is present, add the coefficient value.
- value: Multiply the feature's value by the coefficient.
- base_multiplier: If the feature is present, multiply the base_score_column's value by the coefficient.
For a detailed guide, see the Weighted Linear Model Readme.
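The three contribution modes can be sketched in a few lines of Python. The config shape below is an assumption for illustration, not the suite's actual schema; only the mode names and the score formula come from the text above.

```python
def wlm_score(row: dict, config: dict) -> float:
    # score = intercept + sum of per-feature contributions
    score = config.get("intercept", 0.0)
    for feature, spec in config["features"].items():
        value = row.get(feature, 0)
        coef = spec["coefficient"]
        if spec["mode"] == "presence":
            # If the feature is present, add the coefficient value.
            if value:
                score += coef
        elif spec["mode"] == "value":
            # Multiply the feature's value by the coefficient.
            score += value * coef
        elif spec["mode"] == "base_multiplier":
            # If present, multiply the base score column by the coefficient.
            if value:
                score += row[spec["base_score_column"]] * coef
    return score

# Hypothetical config and data row for a peaks category.
config = {
    "intercept": 10.0,
    "features": {
        "historic": {"mode": "presence", "coefficient": 5.0},
        "elevation_m": {"mode": "value", "coefficient": 0.01},
        "is_summit": {"mode": "base_multiplier", "coefficient": 0.5,
                      "base_score_column": "prominence_m"},
    },
}
row = {"historic": 1, "elevation_m": 2037, "is_summit": 1, "prominence_m": 1856}
print(wlm_score(row, config))  # 10 + 5 + 20.37 + 928.0 ≈ 963.37
```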
Advanced Workflow: Bootstrapping a Model
The suite is uniquely designed to solve the "cold start" problem where no labeled data exists. You can bootstrap a powerful supervised model from your own expertise.
- Encode Expertise: Manually define your heuristic rules in the configuration for the Weighted Linear Model (WLM).
- Generate Weak Labels: Run the WLM to produce an initial ranked list.
- Curate a Training Set: Hand-pick a small, diverse subset of these scored items and adjust their scores to create a high-quality "gold" training set.
- Switch to Supervised Learning: Change a single line in the model configuration (model: WLM -> model: RFR) and run the train and tune steps to create a RandomForestRegressor that learns the nuanced patterns from your curated labels. All data extraction and cleanup used for the WLM continues to be used for the RFR.
This process combines the best of both worlds: it starts with your domain knowledge and uses machine learning to scale and refine it.
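Step 4 essentially reduces to swapping the estimator. The scikit-learn sketch below assumes the same feature matrix the WLM consumed and synthetic "gold" labels standing in for your curated scores; it is an illustration of the handoff, not the suite's internals.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Features produced by the shared extraction/cleanup pipeline (illustrative:
# e.g., elevation, prominence, keyword bonus), scaled to [0, 1].
X = rng.random((200, 3))
# Curated "gold" scores: hand-adjusted WLM output with some labeling noise.
y = 10 * X[:, 0] + 5 * X[:, 1] + rng.normal(0, 0.1, 200)

# The WLM's role ends here; the forest takes over on the curated labels.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# The forest can pick up nonlinear patterns the linear rules cannot express.
pred = model.predict(X[:5])
```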
Project details
Download files
Source Distribution
Built Distribution
File details
Details for the file importancescore-1.1.2.tar.gz.
File metadata
- Download URL: importancescore-1.1.2.tar.gz
- Upload date:
- Size: 42.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | de57241db4ccffb161ea7032002fef53f12a3bb5e940ae719d2d6510faf42ac8 |
| MD5 | 7f8992cce06f98170c2c35738f2828bf |
| BLAKE2b-256 | 327450438edfb7ee4334af225d9ab9f47882cf6f71e1442f47ba47a7df3b6cb6 |
File details
Details for the file importancescore-1.1.2-py3-none-any.whl.
File metadata
- Download URL: importancescore-1.1.2-py3-none-any.whl
- Upload date:
- Size: 50.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.0.1 CPython/3.13.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4e7614dbc68efe2e73168f65028dcc236d01ec13a63673fa0390d53dfb0bf97d |
| MD5 | 0804b08d02a498effd9388cf5ba60267 |
| BLAKE2b-256 | 9f65df1c02f63595d98bf92ddc97a60a40fcd335ae593b30c8b01c18d07383ba |