SpectraSherpa — local-first spectroscopy platform for chemometricians

SpectraSherpa by Spectra Scientific LLC

Open-source, local-first chemometrics platform, AI-ready.

SpectraSherpa brings transparent, reproducible multivariate analysis to spectroscopists and analytical chemists. Build visual analysis pipelines, train and deploy calibration models, and extend with custom Python — all without your data leaving your machine.

Why SpectraSherpa?

  • Transparent algorithms — Open source means every preprocessing step, decomposition, and calibration model is auditable. No black boxes.
  • Data stays on your machine — Built for IP-sensitive labs in pharma, semiconductor, food science, and materials. Network egress is denied by default.
  • No coding required — Visual drag-and-drop workflow builder with over 60 processing nodes. Go from raw spectra to a deployed PLS model without writing Python.
  • Extensible when you need it — Export any workflow to standalone Python or Jupyter notebooks. Add custom nodes via plugins or drop-in scripts.
  • Modern metadata management — Versioned projects, experiments, workflows, and model artifacts with full provenance tracking and audit trails.
  • AI-ready — Built-in chat assistant connects to any OpenAI-compatible endpoint you choose (bring your own key and URL). Full AI advisor with agentic tools, peak identification, code generation, and contextual workflow analysis available via subscription.

For Python data analysts and chemometricians

If you already analyze spectra in Python — whether using scikit-learn, pandas, or your own scripts — SpectraSherpa is built to match your methods, not replace them.

The math matches what you already know. PCA, PLS, MCR-ALS, and classification nodes produce results validated side-by-side against scikit-learn reference outputs. The PCA reproduction study shows the exact numerical comparison on a standard dataset — same parameters, same results, verified to five decimal places.
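
As a rough illustration of what a side-by-side check to five decimal places can look like, the sketch below compares two sets of component vectors while allowing for the sign ambiguity that PCA and PLS components carry. The function name `components_match` and the list-of-lists input format are assumptions made for this example; this is not the code used in the reproduction study.

```python
import math


def components_match(reference, candidate, decimals=5):
    """Return True if every candidate vector equals its reference to `decimals` places.

    Component vectors are only defined up to sign, so a candidate is accepted
    when it matches either the reference vector or its negation.
    """
    tol = 0.5 * 10 ** (-decimals)  # half a unit in the last compared decimal
    if len(reference) != len(candidate):
        return False
    for ref_vec, cand_vec in zip(reference, candidate):
        if len(ref_vec) != len(cand_vec):
            return False
        same = all(math.isclose(r, c, abs_tol=tol) for r, c in zip(ref_vec, cand_vec))
        flipped = all(math.isclose(r, -c, abs_tol=tol) for r, c in zip(ref_vec, cand_vec))
        if not (same or flipped):
            return False
    return True
```

Passing loadings from a reference implementation as `reference` and loadings from another tool as `candidate` gives a single pass/fail answer that does not depend on which sign convention each tool happens to pick.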

Your NumPy arrays work without conversion. The internal data container is a thin wrapper over NumPy: a (n_samples, n_features) array with labeled wavelength and sample axes. Your existing code works directly:

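import numpy as np

from spectra_sherpa.app.lib.sherpa_dataset import SherpaDataset, SpectralAxis, SampleAxis

# Placeholder inputs for illustration only; substitute your own spectra and labels
your_array = np.random.default_rng(0).random((10, 500))   # shape: (n_samples, n_features)
wavenumbers = np.linspace(4000, 400, 500)                 # one value per feature column
sample_ids = [f"sample_{i}" for i in range(10)]           # one ID per row

dataset = SherpaDataset(
    X=your_array,
    feature_axis=SpectralAxis(values=wavenumbers, units="cm-1"),
    sample_axis=SampleAxis(values=sample_ids),
)
X = dataset.data      # get the NumPy array back at any time
y = dataset.target    # labels, if any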

Export any workflow to a Python script or Jupyter notebook. The visual builder is for exploration and reproducibility. The notebook is the artifact you publish, share, or hand off — it requires only pip install spectra-sherpa and standard scientific libraries (NumPy, SciPy, scikit-learn).

Add your own algorithm as a processing step. If you have a working function in a notebook, one command generates the wrapper and registers it in the toolbar:

make node-scaffold

See the Scientist Contributor Guide — notebook to node to pull request, with no web development knowledge required.
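
The wrapper that `make node-scaffold` generates is not shown here, so the sketch below only illustrates the kind of plain, self-contained function you might start from in a notebook. It implements standard normal variate (SNV) correction using only the standard library; the function name `snv` and the list-of-rows input are assumptions for this example, and a real node would more likely operate on NumPy arrays.

```python
import statistics


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) independently.

    `spectra` is a list of rows, one row per sample. The sample standard
    deviation (n - 1) is used; some implementations use the population form.
    Rows with zero spread are centered but left unscaled to avoid dividing by zero.
    """
    corrected = []
    for row in spectra:
        mean = statistics.fmean(row)
        std = statistics.stdev(row) if len(row) > 1 else 0.0
        if std == 0.0:
            corrected.append([value - mean for value in row])
        else:
            corrected.append([(value - mean) / std for value in row])
    return corrected
```

A function with a single array-like input and a single array-like output, no global state, and explicit handling of edge cases such as flat spectra is usually the easiest starting point for wrapping as a processing step.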


Try It

Free online demo — Register and explore SpectraSherpa at demo.spectrascientific.ai with all features including the AI advisor enabled. (Note: For a limited time, use the access code welcome_to_spectra_sherpa to create an account. No upload of proprietary data to the demo server is allowed. Accounts inactive for more than a week will be automatically deleted.)

Install locally:

pip install spectra-sherpa
spectra-sherpa

Opens http://localhost:8000 in your browser. No login required. Install spectra-sherpa[scp] as well if you want the SpectroChemPy-backed example datasets and workflows.

Supported Techniques

SpectraSherpa's core math applies broadly to multivariate spectral and sensor data, but the template-guided onboarding path is narrower than that general claim. The table below reflects what is actually supported in the product today.

Supported Today

| Support Level | Techniques | Notes |
| --- | --- | --- |
| Template-guided example workflows | FTIR, NIR, Raman, OES | Shipped templates with bundled example datasets from Eigenvector Research and SpectroChemPy, instantiated directly from Projects. Some example workflows require the optional spectra-sherpa[scp] install. |
| Template-guided (user-supplied data) | UV-Vis | Templates exist for PCA, MCR-ALS, clustering, and preprocessing. Users bind their own compatible data. |
| User-data workflows | FTIR, Raman, NIR, UV-Vis, OES | All techniques accepted by the node library and template contracts when the user supplies compatible data. |

Future Plan

Many other measurement domains are good fits for SpectraSherpa's architecture and chemometric approach, including vibrational, elemental, diffraction, mass spectrometry, imaging, and broader semiconductor virtual metrology workflows. These are aspirational targets, not finished product claims. We are actively looking for developers and scientist-contributors who want to help expand template coverage, validation datasets, and technique-specific UX.

See the Applications Guide for the current support split between shipped templates, partial support, and future plan.

Features

  • Workflow Builder — Visually design reproducible analysis pipelines by connecting processing steps (nodes) in a drag-and-drop canvas. 11 categories: Data, Synthesis, Preprocessing, Exploratory, Regression, Classification, Clustering, Validation, Custom, Output, and Deployment
  • Model Artifacts — Train, persist, and reload models (PCA, PLS, MCR, PLSDA, KNN, SIMCA) with a generic Load & Apply node
  • Type System — Node connections are validated automatically; incompatible connections (e.g. feeding a model into a raw-data input) are blocked before execution
  • Python & Notebook Export — Generate standalone .py scripts or Jupyter notebooks from any workflow
  • Project Management — Organize experiments, workflows, scripts, and models with versioned snapshots
  • Experiment Tracking — DOE support with 96-well plate layouts, samples, mixtures, and factor definitions
  • Deploy — Batch prediction, folder watching, and execution run tracking with model provenance
  • AI Chat — Connect any OpenAI-compatible chat endpoint for AI-assisted analysis and workflow guidance. See Configuration for setup.
  • Plugin System — Add your own processing nodes by dropping a Python file into a folder or installing a package
  • Privacy Controls — Fine-grained egress permissions; "deny all" network policy by default; local-first architecture for IP-sensitive labs
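
To make the "deny all by default" idea from the Privacy Controls item concrete, here is a minimal sketch of an allow-list check. The class name `EgressPolicy`, its `allowed_hosts` field, and the `permits` method are hypothetical names chosen for this example; they do not describe SpectraSherpa's actual configuration or API.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class EgressPolicy:
    """Hypothetical deny-by-default policy: only allow-listed hosts may be contacted."""

    allowed_hosts: set[str] = field(default_factory=set)

    def __post_init__(self):
        # Host names are case-insensitive, so store them in lowercase.
        self.allowed_hosts = {host.lower() for host in self.allowed_hosts}

    def permits(self, url: str) -> bool:
        """Return True only when the URL's host was explicitly allowed."""
        host = urlsplit(url).hostname  # already lowercased; None if no host is present
        return host is not None and host in self.allowed_hosts
```

With an empty policy every request is refused, so an endpoint such as a self-chosen chat server has to be added deliberately before any traffic can leave the machine.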

| Mode | Login required? | Use Case |
| --- | --- | --- |
| local | No (single user, opens straight to the app) | Desktop analysis, privacy-first |
| hybrid | Optional external service integration | Local GUI with remote services |
| enterprise | Extension-defined | Shared lab environments, multi-user operation |
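
The Type System feature listed above blocks incompatible connections before execution. One simple way to picture that check is sketched below. The names `PortType`, `IncompatibleConnectionError`, and `validate_edges` are hypothetical and chosen for this example; the real port types are documented in the Node Reference.

```python
from enum import Enum


class PortType(Enum):
    """Hypothetical port kinds; the real type names are not given in this document."""

    DATASET = "dataset"
    MODEL = "model"


class IncompatibleConnectionError(TypeError):
    """Raised when an output port is wired to an input of a different type."""


def validate_edges(edges, output_types, input_types):
    """Check every (source, target) edge before anything runs.

    output_types maps a source node to the type it emits; input_types maps a
    target node to the type it accepts. The first mismatch raises an error.
    """
    for source, target in edges:
        produced = output_types[source]
        accepted = input_types[target]
        if produced is not accepted:
            raise IncompatibleConnectionError(
                f"{source} emits {produced.value}, but {target} expects {accepted.value}"
            )
```

Running the check over the whole graph up front means a mistake such as feeding a trained model into a raw-data input surfaces immediately, rather than partway through a long pipeline run.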

Algorithm Library

Over 60 processing nodes across preprocessing, exploratory analysis, regression, classification, clustering, validation, synthesis, and deployment. Optionally install SpectroChemPy-powered algorithms with pip install spectra-sherpa[scp].

  • Node Reference — Full catalog of every node with parameters and port definitions
  • Applications Guide — Algorithm-to-technique mapping for analytical chemistry and semiconductor metrology
  • Workflow Builder Guide — How to build, connect, and execute processing pipelines

Core Concepts

SpectraSherpa organizes work into Projects — containers that group related experiments, workflows, scripts, and trained models:

Project
├── Experiments        — Raw spectral data files with version history
│   └── Files          — .csv, .jdx, .spc, .spa, .spg, .opus, .mat, ...
├── Workflows          — DAG-based analysis pipelines
│   ├── Nodes + Edges  — Processing graph definition
│   ├── Versions       — Immutable snapshots on each save
│   └── Execution Runs — Saved results with diagnostics
│       └── Batch Predictions — Per-file results for deploy
├── Scripts            — Python exports (auto-generated or manual)
└── Models             — Trained model artifacts (PCA, PLS, MCR, ...)
    ├── manifest.json  — Metadata, metrics, feature axis
    └── arrays.npz     — Numpy arrays (loadings, scores, etc.)
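
Workflows are described above as DAG-based pipelines made of nodes and edges. As a minimal sketch of how such a graph can be turned into a valid execution order, the example below uses the standard library's `graphlib`. The function name `execution_order` and the node names in the usage note are hypothetical; this is not SpectraSherpa's scheduler.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(edges):
    """Return node ids in an order where every node runs after its inputs.

    `edges` is an iterable of (upstream, downstream) pairs. A cycle makes the
    graph invalid as a DAG and raises graphlib.CycleError.
    """
    sorter = TopologicalSorter()
    for upstream, downstream in edges:
        sorter.add(downstream, upstream)  # downstream depends on upstream
    return list(sorter.static_order())
```

For a linear chain such as `[("load_csv", "snv"), ("snv", "pca"), ("pca", "scores_plot")]` the result is the chain itself, and a graph containing a loop is rejected before any node executes.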

Installation

Requirements: Python 3.11 or 3.12 (3.13 may work but the scientific stack — numpy, scipy, scikit-learn, SpectroChemPy — does not yet ship full wheels for 3.13+, so installs may try to compile from source and fail; 3.14 is not recommended). Node.js is only needed if you want to modify the browser interface itself.
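
If you want to confirm the interpreter before installing, a small check like the following mirrors the requirement stated above. The helper name `python_is_supported` and the `SUPPORTED_VERSIONS` constant are assumptions made for this sketch and are not part of the package.

```python
import sys

# Versions named as supported in the requirements above; 3.13 "may work" and is
# treated as unsupported here.
SUPPORTED_VERSIONS = {(3, 11), (3, 12)}


def python_is_supported(version_info=sys.version_info):
    """Return True when the (major, minor) pair is one of the supported versions."""
    return (version_info[0], version_info[1]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    status = "supported" if python_is_supported() else "not in the supported list"
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status}")
```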

# Install and run (all you need as a user)
pip install spectra-sherpa
spectra-sherpa

# From source (for contributors — see CONTRIBUTING.md for a full walkthrough)
git clone https://github.com/Spectra-Scientific-LLC/Spectra-Sherpa.git
cd Spectra-Sherpa
pip install poetry                              # Poetry manages Python dependencies
poetry env use python3.11                       # pin the venv to a supported Python (3.11 or 3.12)
poetry install --with dev --extras "scp"
poetry run spectra-sherpa                       # launches the app from the source checkout

# Only needed to change the browser interface
cd frontend && npm install && npm run dev       # npm is the JavaScript package manager

# Run the Python test suite
poetry run pytest tests/ -v --no-cov

| Extra | Install | Description |
| --- | --- | --- |
| scp | pip install spectra-sherpa[scp] | SpectroChemPy algorithms and file readers |

First-run notes

The very first launch initializes a local SQLite database, runs Alembic migrations, and (when the [scp] extra is installed) lets SpectroChemPy populate its font and stylesheet cache. Allow 30–90 seconds the first time before opening your browser. The server is ready when you see this line in the terminal:

INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)

Subsequent launches start in a few seconds because the SCP cache is now populated.
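
If you launch the app from a script and want to wait for readiness without watching the terminal, one option is to poll the port until it accepts TCP connections. The sketch below assumes the default address 127.0.0.1:8000 shown in the log line above; `wait_for_server` is a hypothetical helper, and an accepted connection is only an approximation of "the Uvicorn running line has been printed".

```python
import socket
import time


def wait_for_server(host="127.0.0.1", port=8000, timeout=120.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or `timeout` seconds pass.

    Returns True once the port accepts a connection, False if the deadline is
    reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not listening yet.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

The default timeout of 120 seconds leaves some margin over the 30 to 90 seconds quoted for a first launch.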

Troubleshooting

  • ValueError: the greenlet library is required to use this function. No module named 'greenlet': greenlet is a base dependency, so a clean pip install spectra-sherpa (or poetry install) pulls it in. If you see this in an existing venv, re-run the install or run pip install greenlet directly.
  • pyproject.toml changed significantly since poetry.lock was last generated — run poetry lock (the --no-update flag was removed in Poetry 2.x; bare poetry lock is the equivalent), then re-run poetry install.
  • ERR_CONNECTION_REFUSED when opening http://127.0.0.1:8000 immediately after launch — the server is still in lifespan startup. Wait for the Uvicorn running on http://127.0.0.1:8000 log line before opening the browser.
  • Banner reads an old version (e.g. v0.3.0) after upgrading: this is caused by stale state in an existing venv. Run poetry env remove --all, then re-run poetry install to rebuild from the new lock. The banner now reads the version live from package metadata, so it cannot drift.
  • Port 8000 already in use — relaunch with spectra-sherpa --port 9000, or set KILL_PORT_ON_START=true in .env to free the port automatically.
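
For the port-conflict case in the list above, a quick way to pick an alternative port is to try binding candidate ports locally. The helpers `port_is_free` and `first_free_port` are hypothetical names for this sketch. The check is inherently racy, since another process can claim the port between the check and the launch.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=8000, attempts=20, host="127.0.0.1"):
    """Return the first free port in [start, start + attempts), or None if none is free."""
    for port in range(start, min(start + attempts, 65536)):
        if port_is_free(port, host):
            return port
    return None
```

The returned number can then be passed to the `--port` option mentioned above, for example `spectra-sherpa --port 8001` if 8001 is the first free port.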

Documentation

Full documentation is available at docs.spectrascientific.ai.

Third-Party Notices

SpectraSherpa optionally integrates with SpectroChemPy, a Python library for advanced spectroscopic data analysis developed by Arnaud Travert and Christian Fernandez at the Laboratoire Catalyse et Spectrochimie (LCS), ENSICAEN / Université de Caen / CNRS. SpectroChemPy is licensed under CeCILL-B (BSD-compatible); SpectraSherpa is AGPL-3.0.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for details.

[!IMPORTANT] This project requires contributors to sign a Contributor License Agreement (CLA). When you open a Pull Request, a bot will comment with instructions. You can sign by commenting: I have read the CLA Document and I hereby sign the CLA

License

Copyright (C) 2026 Spectra Scientific LLC.

SpectraSherpa is licensed under the AGPL-3.0. See LICENSE for details.

You are free to use, modify, and distribute SpectraSherpa. If you distribute a modified version — including as a network service — you must make your modifications available under the same license.

[!WARNING] This software is provided "AS IS" without warranty of any kind. Spectra Scientific LLC disclaims all liability for damages arising from use of this software, including reliance on analytical results. See DISCLAIMER for full terms.

Enterprise features and commercial licensing are available from Spectra Scientific LLC.
