SpectraSherpa — local-first spectroscopy platform for chemometricians
SpectraSherpa by Spectra Scientific LLC
The open chemometrics workbench — visual, reproducible, and local-first.
PCA, PLS, MCR-ALS, SIMCA, PLS-DA, variable selection, and calibration transfer in a drag-and-drop workflow builder — with full version history, provenance you can defend in an audit, and one-click export for supported workflows. Open source and free to run, entirely on your own machine.
📖 Documentation · 🚀 Local onboarding · 🧩 Node Library · 🔬 For developers
Want to try it before installing? Visit the hosted demo at demo.spectrascientific.ai and register with access code welcome_to_spectra_sherpa.
Why SpectraSherpa
A single open workbench for the chemometrics you actually run, built around five strengths:
- A first-class chemometric toolkit. Preprocessing, decomposition, calibration, classification, variable selection, and calibration transfer are core features, purpose-built for spectroscopy. (Full list below.)
- Numbers you can defend. PCA and PLS workflows are checked against their underlying SpectroChemPy execution paths, with sklearn parity tests for exported artifact helpers where sklearn is the reference implementation. MCR-ALS is checked against synthetic ground truth and reference workflows.
- Reproducible by construction. Every workflow is versioned on save and every run is an immutable, provenance-tracked record, so a result always traces back to its exact recipe and data.
- From exploration to production. Export supported workflows to standalone Python or a Jupyter notebook; trained models are first-class artifacts you can batch-apply and deploy.
- Open and local-first. AGPL-3.0, reads your instrument files directly (.spg, .spa, .jdx, .opus, .mat, …), and runs entirely on your machine with network egress denied by default in local mode.
Built to be an open foundation that labs and instrument makers can standardize on, extend, and embed. Bundled public benchmark datasets (corn, diesel NIR, NIR shootout, and SpectroChemPy examples) let you reproduce familiar results on day one.
Install & run
pip install "spectra-sherpa[scp,hitran]"
spectra-sherpa
Opens http://localhost:8000 in your browser — no login required. The first launch takes 30–90 s to initialize a local database and caches; later launches are fast.
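Because the first launch can take a while, a script that depends on the local app may want to wait until the port answers. The following is a minimal sketch using only the Python standard library; `wait_for_server` is a hypothetical helper, not part of the spectra-sherpa CLI, and the only detail taken from this README is the default address http://localhost:8000.

```python
import time
import urllib.error
import urllib.request

# Bypass any system proxy settings: the target is a local address.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_for_server(url: str, timeout_s: float = 120.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it returns any HTTP response or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with _OPENER.open(url, timeout=interval_s):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status, so it is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or timed out: the app may still be initializing.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_server("http://localhost:8000")` after starting `spectra-sherpa` in another process returns `True` once the page responds, or `False` if nothing answers within the timeout.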
Other install options — minimal, from source, extras
# Minimal core (no SpectroChemPy examples/readers, no HITRAN downloads)
pip install spectra-sherpa && spectra-sherpa
# From source (contributors)
git clone https://github.com/Spectra-Scientific-LLC/Spectra-Sherpa.git
cd Spectra-Sherpa
pip install poetry
poetry env use python3.11 # supported: 3.11 or 3.12
poetry install --with dev --extras "scp hitran"
poetry run spectra-sherpa
| Extra | Adds |
|---|---|
| scp | SpectroChemPy algorithms + instrument file readers (.spc, .spa, .spg, .jdx, .opus, …) |
| hitran | HITRAN/HAPI clients for Data → Synthesis live line-table downloads |
Requires Python 3.11 or 3.12. Full requirements, CLI flags, and troubleshooting are in 30 Minutes to Local Compute.
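If you wrap the install in your own setup script, the interpreter requirement can be checked up front. This is only a sketch: `is_supported` and `SUPPORTED_MINORS` are hypothetical names, and the two version pairs are taken from the requirement stated above.

```python
import sys

# Supported interpreter versions, as stated in this README.
SUPPORTED_MINORS = {(3, 11), (3, 12)}


def is_supported(version: tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True when (major, minor) is one of the supported Python versions."""
    return tuple(version) in SUPPORTED_MINORS


if not is_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is outside the supported range")
```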
HITRAN live downloads require your own HITRAN API key, saved in Settings > API Keys, plus Settings > Integrations > HITRAN/HAPI Queries enabled.
Hosted Demo, Pro, Hybrid, and Organization deployments are operated separately by Spectra Scientific. This README covers the local workstation install.
The chemometric toolkit
Over 60 nodes across the workflow you actually run:
- Preprocessing — Savitzky-Golay smoothing/derivatives, baseline correction, MSC, SNV, OSC, normalization, scaling.
- Exploratory — PCA, MCR-ALS, SIMPLISMA, EFA, hierarchical clustering.
- Calibration & classification — PLS regression, PLS-DA, SIMCA, KNN.
- Variable selection — iPLS, CARS, SPA, UVE, VIP.
- Calibration transfer — PDS and direct standardization for instrument-to-instrument transfer.
- Validation & deployment — cross-validation, nested CV, selection stability, model comparison, batch prediction, deploy-readiness checks.
See the Node Library for node parameters and ports, and Current Capabilities for the supported production scope.
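To make two of the listed steps concrete, here is a rough NumPy sketch of SNV preprocessing followed by PCA computed through an SVD. It illustrates the underlying math only; it is not SpectraSherpa's node implementation or its SpectroChemPy execution path, and the function names and synthetic data are assumptions made for the example.

```python
import numpy as np


def snv(X: np.ndarray) -> np.ndarray:
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std


def pca_svd(X: np.ndarray, n_components: int):
    """PCA of column-mean-centered data via SVD.

    Returns scores, loadings, and the explained-variance ratio per component.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Synthetic stand-in for real spectra: 20 samples x 200 channels, two Gaussian bands.
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 200)
peak_a = np.exp(-((axis - 0.3) ** 2) / 0.002)
peak_b = np.exp(-((axis - 0.7) ** 2) / 0.005)
conc = rng.uniform(0.1, 1.0, size=(20, 2))
X = conc @ np.vstack([peak_a, peak_b]) + 0.01 * rng.normal(size=(20, 200))

scores, loadings, explained = pca_svd(snv(X), n_components=2)
print(scores.shape, loadings.shape)
```

SNV works row-wise (per spectrum) while PCA centering works column-wise (per channel), which is why the two steps use different axes.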
Core concepts
Everything lives inside a Project. Five objects make up a complete analysis — each explained in depth in the Projects, Datasets, and Runs guide.
| Object | What it is |
|---|---|
| Project | The durable container grouping your datasets, workflows, runs, reports, scripts, and trained models — with versioned snapshots and provenance. |
| Data | The spectra or feature tables you work on. Import instrument files into My Dataset, pull from reference/example datasets, or synthesize FTIR time series. |
| Workflow | The analysis recipe: a drag-and-drop graph (DAG) of nodes, versioned on every save so any result traces back to the exact recipe that produced it. |
| Run | One execution of a workflow — an immutable record of parameters, node status, diagnostics, and any model Artifacts (frozen PCA/PLS/MCR/PLS-DA/KNN/SIMCA models) it produced. |
| Report | A shareable summary assembled from a workflow and its runs. Toggle sections, then export to PDF, HTML, Markdown, or JSON for publication, hand-off, or validation packages. |
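One way to picture how a run can trace back to its exact recipe and data is a content hash over the workflow definition plus an immutable run record. The sketch below uses only the standard library; the `RunRecord` fields, the workflow dictionary layout, and the hashing scheme are illustrative assumptions, not SpectraSherpa's actual storage format.

```python
import hashlib
import json
from dataclasses import FrozenInstanceError, dataclass, field
from datetime import datetime, timezone


def content_hash(obj) -> str:
    """SHA-256 of a canonical JSON encoding, so equal recipes hash equally."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_workflow(n_components: int) -> dict:
    """Hypothetical two-node workflow graph: SNV feeding PCA."""
    return {
        "nodes": [
            {"id": "snv", "type": "SNV"},
            {"id": "pca", "type": "PCA", "params": {"n_components": n_components}},
        ],
        "edges": [["snv", "pca"]],
    }


@dataclass(frozen=True)
class RunRecord:
    """Immutable record tying one execution to its recipe and input data."""
    workflow_hash: str
    dataset_hash: str
    started_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


workflow_v1 = make_workflow(n_components=3)
workflow_v2 = make_workflow(n_components=4)  # a parameter edit yields a new version

run = RunRecord(
    workflow_hash=content_hash(workflow_v1),
    dataset_hash=hashlib.sha256(b"raw spectra bytes").hexdigest(),
)

try:
    run.workflow_hash = "tampered"
except FrozenInstanceError:
    print("run records cannot be modified after creation")
```

Sorting keys before hashing means that two saves of the same recipe produce the same hash regardless of key order, while any parameter change produces a different one.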
For Python analysts & chemometricians
SpectraSherpa works with your existing methods rather than replacing them. The internal container is a thin wrapper over a (n_samples, n_features) NumPy array with labeled wavelength and sample axes, so your scikit-learn and pandas code works directly on dataset.data. Bring a working notebook function, and running make node-scaffold turns it into a toolbar node in minutes.
Start here: Writing a Plugin Node — notebook to node to pull request, no web development required.
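The container described above can be pictured roughly as follows. `SpectralDataset` is a hypothetical stand-in written for this example, not SpectraSherpa's actual class; only the (n_samples, n_features) shape, the labeled axes, and the `.data` attribute come from the description above. `band_mean` stands in for the kind of notebook function you might bring along.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SpectralDataset:  # hypothetical stand-in, not the real SpectraSherpa class
    data: np.ndarray         # shape (n_samples, n_features)
    wavelengths: np.ndarray  # shape (n_features,), labels the feature axis
    sample_ids: list[str]    # length n_samples, labels the sample axis

    def __post_init__(self):
        n_samples, n_features = self.data.shape
        if self.wavelengths.shape != (n_features,):
            raise ValueError("wavelengths must have one entry per feature")
        if len(self.sample_ids) != n_samples:
            raise ValueError("sample_ids must have one entry per sample")


def band_mean(dataset: SpectralDataset, lo: float, hi: float) -> np.ndarray:
    """Example notebook function: mean intensity of each spectrum within [lo, hi]."""
    mask = (dataset.wavelengths >= lo) & (dataset.wavelengths <= hi)
    return dataset.data[:, mask].mean(axis=1)


ds = SpectralDataset(
    data=np.arange(3 * 16, dtype=float).reshape(3, 16),
    wavelengths=np.arange(1000.0, 2600.0, 100.0),
    sample_ids=["s1", "s2", "s3"],
)

# Plain NumPy code operates directly on the underlying array.
centered = ds.data - ds.data.mean(axis=0)
print(band_mean(ds, 1200.0, 1400.0))
```

Keeping the array plain and the axis labels alongside it is what lets existing array-based code run unchanged while the labels stay available for band selection and reporting.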
Because every step is a typed, provenance-tracked artifact, SpectraSherpa is also a clean foundation for AI assistance — the commercial Sherpa Advisor and Guidance layers build LLM-assisted analysis on top of this deterministic core, which remains fully usable on its own.
Built on the work of others
SpectraSherpa stands on established open science, and keeps citation guidance close to generated outputs:
- SpectroChemPy — spectroscopic algorithms and instrument-file readers, by Arnaud Travert and Christian Fernandez at the Laboratoire Catalyse et Spectrochimie (LCS), ENSICAEN / Université de Caen / CNRS. Licensed CeCILL-B (BSD-compatible).
- HITRAN / HAPI — the high-resolution molecular spectroscopic database used by Data → Synthesis to build physically grounded FTIR line tables.
- Eigenvector Research data sets — recommended NIR/OES chemometrics teaching and validation datasets. SpectraSherpa catalogs these datasets and can download them at runtime when egress is enabled; it does not redistribute the raw Eigenvector data in the wheel.
- NIST Chemistry WebBook (SRD 69) and the NIST Quantitative Infrared Database (SRD 79) — reference IR spectra for synthesis.
These databases are not owned by Spectra Scientific. Cite NIST, HITRAN, and HAPI in any report, publication, or validation package that uses synthetic datasets — Reference Libraries and Synthesis and the Attributions page list the recommended attributions.
Documentation
Full docs at docs.spectrascientific.ai.
- Get started: Cloud vs Local OSS · 30 Minutes to Local Compute · Import Your First Dataset
- Workflows: Data Import · Projects, Datasets, and Runs · Reports and Exports
- Reference: Supported File Types · Node Library · Templates
- Develop: Architecture · Plugins and Extension Points · Developer Setup
Contributing
We welcome contributions — see CONTRIBUTING.md.
[!IMPORTANT] This project requires a signed Contributor License Agreement (CLA). When you open a PR, a bot comments with instructions; sign by replying:
I have read the CLA Document and I hereby sign the CLA
License
Copyright (C) 2026 Spectra Scientific LLC. Licensed under AGPL-3.0 — see LICENSE. If you distribute a modified version (including as a network service), you must release your modifications under the same license. SpectroChemPy is CeCILL-B; see NOTICE.md for full third-party terms. Enterprise features and commercial licensing are available from Spectra Scientific.
[!WARNING] Provided "AS IS" without warranty of any kind. Spectra Scientific LLC disclaims all liability for damages arising from use, including reliance on analytical results. See DISCLAIMER.