Physics-certified motion data toolkit for Physical AI training
S2S — Physics-Certified Sensor Data
Physics-certified motion data for prosthetics, robotics, and Physical AI.
IMU sensor data is silently corrupted more often than people realize. S2S catches it using physics laws, not statistics. Proven on 5 real datasets. One line to install.
Live Demos
- 📊 Interactive Data Explorer — 104,160 real certified records, hover to explore
- 📱 Phone IMU Demo — real-time physics certification on your phone
- 🎥 Pose Camera Demo — 17-joint live certification
No install needed. All processing runs on your device. No data sent anywhere.
The Problem
Physical AI (robots, prosthetics, exoskeletons) is trained on motion data. But most datasets contain synthetic data that violates physics, corrupted recordings, and mislabeled actions — with no way to verify the data came from a real human moving in physically valid ways.
A robot trained on bad data learns bad motion. A prosthetic hand trained on uncertified data fails its user.
Four Proven Levels + Experimental AI
S2S improves model performance at every stage of the training pipeline. Results are validated on five independent datasets spanning five sampling rates (20Hz to 2000Hz).
Level 1 — Quality Floor ✅ PROVEN on 3 datasets
| Dataset | Hz | Corruption | S2S Recovery | Net vs Clean |
|---|---|---|---|---|
| WISDM 2019 | 20Hz | 35% corrupted | 154% recovered | +1.74% F1 |
| PAMAP2 | 100Hz | 35% corrupted | confirmed | +0.95% F1 |
| UCI HAR | 50Hz | 35% corrupted | 135% recovered | +2.51% F1 |
Physics floor removes bad data and beats the clean baseline across three independent datasets at three different sampling rates.
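The table does not define "recovery". One reading that is consistent with values above 100% appearing next to a positive net-vs-clean gain is the share of the corruption-induced F1 drop that filtering wins back. The sketch below assumes that definition; the function name and the F1 values are hypothetical and are not figures from the experiments.

```python
def recovery_ratio(f1_clean, f1_corrupted, f1_filtered):
    """Fraction of the corruption-induced F1 drop regained after filtering.

    Assumed definition (not confirmed by the S2S docs):
        (filtered - corrupted) / (clean - corrupted)
    Values above 1.0 mean the filtered model beats the clean baseline.
    """
    drop = f1_clean - f1_corrupted
    if drop <= 0:
        raise ValueError("corruption did not reduce F1; recovery is undefined")
    return (f1_filtered - f1_corrupted) / drop


# Hypothetical F1 scores, for illustration only.
example = recovery_ratio(f1_clean=0.80, f1_corrupted=0.70, f1_filtered=0.85)
print(f"{example:.0%} recovered")
```

Under this reading, any recovery above 100% implies the filtered model ends up ahead of the model trained on clean data.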
Level 2 — Physics Quality Floor Generalises ✅ PROVEN on 3 datasets
| Dataset | Hz | Data used | vs All data |
|---|---|---|---|
| WISDM 2019 | 20Hz | 41% of windows | +1.74% F1 |
| PAMAP2 | 100Hz | 88% of windows | +0.95% F1 |
| UCI HAR | 50Hz | 49% of windows | +2.51% F1 |
Also proven: kinematic chain consistency on PAMAP2 (hand + chest + ankle IMU):
| Condition | F1 | Δ |
|---|---|---|
| Single chest IMU | 0.7969 | baseline |
| 3 IMUs naive concat | 0.8308 | +3.39% |
| 3 IMUs + chain filter | 0.8399 | +0.91% over naive |
| Net vs single sensor | | +4.23% F1 (headline) |
Less data, higher quality, better model. The physics score is a reliable proxy for training value — confirmed across devices, sampling rates, and activity types.
Level 3 — Biological Signal Certification ✅ PROVEN
Tested on PhysioNet PTT-PPG — 4 real subjects, 1164 windows, 500Hz wrist device, walk/sit/run.
| Signal | Result |
|---|---|
| PPG pass rate | 96.3% on real human subjects |
| Heart rate | mean 106 BPM (physiologically correct for activity) |
| HRV RMSSD | mean 21ms (real human variability) |
| Skin temperature | 33.6°C (confirmed real human range) |
Real pulse, real HRV, real temperature — verified simultaneously. Synthetic data cannot fake all three.
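For reference, heart rate and HRV RMSSD follow from the intervals between detected pulse peaks using standard formulas. The sketch below is not the S2S implementation; the function names and the sample intervals are hypothetical, and peak detection on the PPG signal is assumed to have happened already.

```python
import math


def heart_rate_bpm(ibi_ms):
    """Mean heart rate from inter-beat intervals given in milliseconds."""
    return 60000.0 / (sum(ibi_ms) / len(ibi_ms))


def rmssd_ms(ibi_ms):
    """Root mean square of successive differences between adjacent beats."""
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical beat intervals (ms), not taken from PTT-PPG.
ibis = [560, 580, 555, 575, 565]
hr = heart_rate_bpm(ibis)
hrv = rmssd_ms(ibis)
print(f"HR {hr:.1f} BPM, RMSSD {hrv:.1f} ms")
```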
Level 4 — Multi-Sensor Fusion Coherence ✅ PROVEN
| Dataset | HIL Score | Pass Rate | Tiers |
|---|---|---|---|
| PTT-PPG 500Hz wrist | 68.7/100 | 100% | 438 SILVER + 726 BRONZE |
| PAMAP2 100Hz (auto-Hz) | 65.3/100 | 100% | 87 SILVER + 13 BRONZE |
Real sensors: PPG infrared + PPG red + IMU accel+gyro + skin temperature — all from the same wrist hardware.
If HR rose with activity, skin temperature stayed in human range, and IMU timing matched PPG — simultaneously — a human was there.
Level 5 — Physics-Informed Hybrid AI EXPERIMENTAL
Hybrid approach beats both baselines but physics feature extraction needs improvement.
| Dataset | Model | Accuracy | Features | Status |
|---|---|---|---|---|
| PTT-PPG | Raw IMU | 79.55% | 768 features | Baseline |
| PTT-PPG | Physics Only | 70.48% | 19 features | Baseline |
| PTT-PPG | Hybrid | 83.68% | 787 features | +4.13% vs Raw, +13.20% vs Physics |
Key Finding: Physics features add complementary signal but extraction is early-stage.
Feature Importance Analysis (Honest Assessment):
- Only 2/19 physics features contribute: `rigid_rms_measured` and `resonance_peak_energy`
- 17/19 physics features have zero importance in hybrid model
- Physics efficiency: 0.0032 per feature vs 0.0012 for raw (promising but needs work)
- Raw IMU dominates: 94% of predictive power from 768 features
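The per-feature efficiency figures can be reproduced from the numbers in this list. This is a worked check, assuming feature importances sum to 1 so that the physics share is whatever the raw features do not account for; the variable names are illustrative.

```python
raw_share = 0.94                  # share of importance from raw IMU features
n_raw, n_physics = 768, 19        # feature counts from the results table
physics_share = 1.0 - raw_share   # assumption: importances sum to 1

raw_per_feature = raw_share / n_raw
physics_per_feature = physics_share / n_physics

print(round(raw_per_feature, 4), round(physics_per_feature, 4))  # 0.0012 0.0032
```

On these numbers an average physics feature carries roughly 2.6 times the importance of an average raw feature, even though only two of them contribute.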
What Works:
- `rigid_rms_measured` (0.0569 importance): RMS acceleration magnitude
- `resonance_peak_energy` (0.0011 importance): frequency-domain energy
What Needs Improvement:
- 17 of the 19 physics-derived features are non-predictive for activity classification
- Most tier indicators (is_gold, is_silver, etc.) have zero importance
- Confidence scores and detailed law outputs not useful for ML
Status:
Physics feature extraction is early-stage but promising. The hybrid approach proves physics features add unique signal, but most physics laws need better feature engineering for ML tasks.
Next Phase: Improve physics feature extraction to make more laws ML-relevant.
Active Learning Pipeline ✅ PROVEN
Self-improving data quality system that learns from corruption patterns and generates training curriculum automatically.
Module 1 — Corruption Fingerprinter
- Purpose: Detect and classify data corruption types
- Results: Identified resonance_frequency as most vulnerable (77% of corruptions break it first)
- Status: ✅ Proven on PTT-PPG data
Module 2 — Frankenstein Mixer
- Purpose: Find exact contamination boundaries for each physics law
- Results: IMU consistency breaks at 30.6% contamination, resonance at 29.2%, jerk at 53.7%
- Status: ✅ Proven on PTT-PPG data
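A contamination boundary of the kind Module 2 reports can be located by mixing clean and corrupt samples at a chosen fraction and bisecting on that fraction. The sketch below is one possible approach, not the Frankenstein Mixer itself: `mix`, `breaking_fraction`, and the toy `passes` check are all hypothetical, and the search assumes the check flips from pass to fail only once as contamination grows.

```python
import random


def mix(clean, corrupt, fraction, seed=0):
    """Swap a random `fraction` of clean samples for corrupt ones at the same index."""
    rng = random.Random(seed)
    n = len(clean)
    swapped = set(rng.sample(range(n), round(fraction * n)))
    return [corrupt[i] if i in swapped else clean[i] for i in range(n)]


def breaking_fraction(clean, corrupt, passes, tol=0.001):
    """Bisect for the contamination fraction at which `passes` first fails.

    Assumes the check passes on pure clean data, fails on pure corrupt data,
    and flips only once in between.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if passes(mix(clean, corrupt, mid)):
            lo = mid
        else:
            hi = mid
    return hi


# Toy check standing in for a physics law: the window mean must stay below 0.3.
clean = [0.0] * 1000
corrupt = [1.0] * 1000
boundary = breaking_fraction(clean, corrupt, lambda w: sum(w) / len(w) < 0.3)
print(f"check breaks at about {boundary:.1%} contamination")
```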
Module 3 — Curriculum Generator
- Purpose: Generate training data at every quality level automatically
- Results: 2,000 samples with balanced tiers (GOLD 5.7%, SILVER 58.7%, BRONZE 28.1%, REJECTED 6.7%)
- Auto-discovery: Found NinaPro DB5, EMG Amputee, HuGaDB, PTT-PPG automatically
- Status: ✅ Proven on Mac with auto-discovery
Module 4 — Cloud Trainer
- Purpose: Train quality prediction models on curriculum data
- Results: 85.5% accuracy (+27.7 percentage points over the 57.8% majority baseline)
- Best model: GradientBoosting with 93% precision on SILVER tier
- Status: ✅ Proven on Mac with sklearn baseline
Pipeline Impact:
- Automatic curriculum generation from any local dataset
- Quality predictor that significantly outperforms naive baseline
- Self-improving system that learns corruption patterns
- Ready for deployment with trained models and prediction API
Auto-Hz Device Detection
S2S automatically detects device profile from two numbers already in the data — sampling Hz (from median timestamp intervals) and signal amplitude range (from first window). No user configuration needed.
| Hz range | Signal range | Profile | Example |
|---|---|---|---|
| ≥400Hz | <1.0 normalized | normalized_500hz | PTT-PPG |
| ≤150Hz | >10 raw ADC | raw_adc_100hz | PAMAP2 |
| other | other | default | fallback |
Before auto-Hz: PAMAP2 Level 4 HIL = 38.4. After: 65.3. Same data, correct profile.
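The detection rule in the table can be sketched in a few lines. This is an illustration under assumptions, not the library's code: the function names are hypothetical, timestamps are taken to be in seconds, and "signal range" is read as the largest absolute value in the first window.

```python
from statistics import median


def estimate_hz(timestamps_s):
    """Sampling rate from the median interval between consecutive timestamps."""
    intervals = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    return 1.0 / median(intervals)


def detect_profile(timestamps_s, first_window):
    """Pick a device profile from sampling rate and first-window amplitude."""
    hz = estimate_hz(timestamps_s)
    # Assumption: amplitude range means the largest absolute sample value.
    amplitude = max(abs(v) for sample in first_window for v in sample)
    if hz >= 400 and amplitude < 1.0:
        return "normalized_500hz"
    if hz <= 150 and amplitude > 10:
        return "raw_adc_100hz"
    return "default"


# Hypothetical inputs: 500Hz normalized data and 100Hz raw data.
fast = detect_profile([i / 500 for i in range(100)], [(0.1, -0.2, 0.9)])
slow = detect_profile([i / 100 for i in range(100)], [(0.5, 9.8, 12.0)])
print(fast, slow)
```

Using the median interval rather than the mean keeps a few dropped samples or timestamp gaps from skewing the estimated rate.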
Validated on Real Human Data
WISDM 2019 (51 subjects, 20Hz, wrist accel, 18 activities):
| Level | Result |
|---|---|
| Level 1 | +1.74% F1 vs corrupted, 154% recovery |
| Level 2 | +1.74% F1 vs all data, 41% of windows used |
PAMAP2 (9 subjects, 100Hz, hand+chest+ankle IMU, 12 activities):
| Level | Result |
|---|---|
| Level 1 | +0.95% F1 vs corrupted |
| Level 2 | +4.23% F1 kinematic chain vs single sensor |
| Level 4 | HIL 65.3/100, 100% pass, 87 SILVER |
UCI HAR (30 subjects, 50Hz, body accel+gyro, 6 activities):
| Level | Result |
|---|---|
| Level 1 | +2.51% F1 vs corrupted, 135% recovery |
| Level 2 | +2.51% F1 vs all data, 49% of windows used |
PhysioNet PTT-PPG (4 subjects, 500Hz, wrist PPG+IMU+thermal, walk/sit/run):
| Level | Result |
|---|---|
| Level 2 IMU | 61.7% pass rate, avg score 37.2/100 |
| Level 3 PPG | 96.3% pass rate, HR 106 BPM, HRV 21ms |
| Level 4 Fusion | HIL 68.7/100, 100% pass, 438 SILVER |
NinaPro DB5 (10 subjects, 2000Hz, 16-channel EMG + 3-axis accelerometer, hand gestures):
| Level | Result |
|---|---|
| Law 1 Newton | EMG→accel lag 117.5ms mean, 81.6% in 50–200ms range, 10/10 subjects |
Muscle fires → limb accelerates 117.5ms later. Consistent with published neuromuscular literature. Shuffled baseline: 88.5ms — real causal lag is distinct. Synthetic data cannot reproduce without full rigid-body muscle simulation.
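A lag like this is commonly estimated by sliding the acceleration signal against the EMG envelope and keeping the shift with the highest correlation. The text does not say which estimator S2S uses, so the sketch below is an assumption: the function names are hypothetical and the signals are synthetic, with a 120ms delay chosen only for illustration.

```python
def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def emg_to_accel_lag_ms(emg_envelope, accel_mag, sample_rate_hz, max_lag_ms=300):
    """Delay (ms) at which acceleration best correlates with earlier EMG."""
    n = len(emg_envelope)
    max_lag = int(max_lag_ms * sample_rate_hz / 1000)
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        r = pearson(emg_envelope[: n - lag], accel_mag[lag:])
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag * 1000.0 / sample_rate_hz


# Synthetic burst: acceleration repeats the EMG burst 120 samples later at 1000Hz.
rate = 1000
emg = [1.0 if 100 <= i < 150 else 0.0 for i in range(800)]
acc = [1.0 if 220 <= i < 270 else 0.0 for i in range(800)]
lag = emg_to_accel_lag_ms(emg, acc, rate)
in_range = 50 <= lag <= 200   # the 50-200ms window quoted in the table
print(lag, in_range)
```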
11 Physics Laws
Single-Sensor Laws (Levels 1–3)
| # | Law | What It Catches |
|---|---|---|
| 1 | Newton's Second Law (F=ma, 117.5ms EMG→accel lag) | Synthetic data missing lagged EMG-accel correlation |
| 2 | Segment Resonance (ω=√(K/I)) | Tremor at impossible frequency for body segment |
| 3 | Rigid Body Kinematics (a=α×r+ω×(ω×r)) | Gyro and accel generated independently |
| 4 | Ballistocardiography (F=ρQv) | IMU missing cardiac recoil |
| 5 | Joule Heating (Q=0.75×P×t) | Sustained EMG without thermal elevation |
| 6 | Motor Control Jerk (∂³x/∂t³ ≤ 5000 m/s³) | Robotic or keyframe animation artefacts |
| 7 | IMU Consistency (Var(accel) ~ f(Var(gyro))) | Accel and gyro from independent generators |
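
As an illustration of how a single-sensor law can be checked, the jerk bound in law 6 reduces to a finite difference, since jerk is the time derivative of acceleration. The sketch below is a minimal version under assumptions: the function names are hypothetical, acceleration is in m/s², and a simple first difference stands in for whatever filtering the library applies.

```python
import math

MAX_JERK = 5000.0  # m/s^3, the limit quoted in the law table


def peak_jerk(accel_xyz, sample_rate_hz):
    """Largest jerk magnitude, using a first difference of acceleration."""
    return max(
        math.dist(cur, prev) * sample_rate_hz
        for prev, cur in zip(accel_xyz, accel_xyz[1:])
    )


def passes_jerk_law(accel_xyz, sample_rate_hz, limit=MAX_JERK):
    """True when no sample-to-sample change exceeds the jerk limit."""
    return peak_jerk(accel_xyz, sample_rate_hz) <= limit


# Hypothetical windows at 50Hz: a smooth motion and an abrupt 200 m/s^2 step.
smooth = [(0.0, 0.0, 9.8), (0.0, 0.5, 9.8), (0.0, 1.0, 9.8)]
jump = [(0.0, 0.0, 9.8), (0.0, 0.0, 209.8)]
print(passes_jerk_law(smooth, 50), passes_jerk_law(jump, 50))
```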
Multi-Sensor Chain Laws (Level 4)
| # | Law | What It Catches |
|---|---|---|
| 8 | Locomotion Coherence (freq spread <2.5Hz) | Sensors recording different activities |
| 9 | Segment Coupling (chest-ankle r >0.3) | Independent synthetic channels |
| 10 | Gyro-Accel Coupling (per IMU) | Rotation without corresponding acceleration |
| 11 | Cross-Sensor Jerk Timing (ankle leads chest 0–200ms) | Reversed or zero lag — not real heel-strike |
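
Law 8 can be illustrated by comparing the dominant frequency of each sensor and requiring the spread to stay under 2.5Hz. The sketch below is an assumption about how such a check might look, not the S2S code: the function names are hypothetical, and a naive discrete Fourier transform is used to stay dependency-free, which is only practical for short windows.

```python
import cmath
import math


def dominant_frequency_hz(signal, sample_rate_hz):
    """Frequency of the strongest non-DC component, via a naive DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]
    best_bin, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        if abs(coeff) > best_power:
            best_bin, best_power = k, abs(coeff)
    return best_bin * sample_rate_hz / n


def locomotion_coherent(signals, sample_rate_hz, max_spread_hz=2.5):
    """True when all sensors share a dominant frequency within the spread."""
    freqs = [dominant_frequency_hz(s, sample_rate_hz) for s in signals]
    return max(freqs) - min(freqs) < max_spread_hz


# Synthetic 2-second windows at 100Hz: chest and ankle at 2Hz, a third at 6Hz.
rate, n = 100, 200
chest = [math.sin(2 * math.pi * 2 * t / rate) for t in range(n)]
ankle = [math.sin(2 * math.pi * 2 * t / rate + 0.8) for t in range(n)]
other = [math.sin(2 * math.pi * 6 * t / rate) for t in range(n)]
print(locomotion_coherent([chest, ankle], rate))
print(locomotion_coherent([chest, other], rate))
```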
Tier System
| Tier | Score | Meaning |
|---|---|---|
| GOLD | ≥87 | All physics laws passed. Pristine. |
| SILVER | 75–86 | Trusted. Minor deviations within noise. |
| BRONZE | 60–74 | Marginal. Candidate for reconstruction at ≤50Hz. |
| RECONSTRUCTED | — | Repaired, re-scored ≥75, spectral sim ≥0.8. Weight 0.5. |
| REJECTED | <floor | Removed from pipeline. |
Floor = p25 of clean score distribution per dataset (adaptive).
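The tier table and the adaptive floor can be combined as in the sketch below. It is illustrative only: the function names are hypothetical, the floor uses Python's default quantile method (one of several percentile conventions), and scores at or above the floor but below 60 are treated as BRONZE, which is an assumption the table does not settle. The RECONSTRUCTED tier is left out because it depends on a repair step.

```python
from statistics import quantiles


def adaptive_floor(clean_scores):
    """25th percentile of the clean-score distribution for one dataset."""
    return quantiles(clean_scores, n=4)[0]


def assign_tier(score, floor):
    """Map a 0-100 physics score to a tier using the thresholds in the table."""
    if score < floor:
        return "REJECTED"
    if score >= 87:
        return "GOLD"
    if score >= 75:
        return "SILVER"
    # Assumption: anything at or above the floor but below 75 counts as BRONZE.
    return "BRONZE"


# Hypothetical clean-data scores used to set the floor.
clean_scores = [40, 50, 60, 70, 80, 90, 95]
floor = adaptive_floor(clean_scores)
tiers = [assign_tier(s, floor) for s in (92, 80, 65, 45)]
print(floor, tiers)
```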
Live API
No install needed:
curl -X POST https://s2s-65sy.onrender.com/certify -H "Content-Type: application/json" -d '{"accel": [[ax,ay,az],...], "sample_rate_hz": 50}'
import requests
cert = requests.post("https://s2s-65sy.onrender.com/certify", json={"accel": data, "sample_rate_hz": 50})
print(cert.json()["tier"]) # GOLD / SILVER / BRONZE / REJECTED
Or install locally:
Install
pip install s2s-certify
Zero dependencies. Pure Python 3.9+. Works on any platform.
Quick Start
from s2s_certify import certify
result = certify(accel_window, sample_rate_hz=20)
print(result['tier']) # GOLD / SILVER / BRONZE / REJECTED
print(result['score']) # 0–100
print(result['laws_passed']) # which physics laws passed
From the command line:
s2s-certify your_imu_data.csv
s2s-certify your_imu_data.csv --output report.json
Datasets Validated
| Dataset | Hz | Sensors | Windows | Used for |
|---|---|---|---|---|
| WISDM 2019 | 20Hz | Wrist accel | 46,946 | Levels 1, 2 |
| PAMAP2 | 100Hz | Hand+Chest+Ankle IMU | 13,094 | Levels 1, 2, 4 |
| UCI HAR | 50Hz | Body accel+gyro | 10,299 | Levels 1, 2 |
| PhysioNet PTT-PPG | 500Hz | Wrist PPG+IMU+Thermal | 1,164 | Levels 2, 3, 4, 5 (experimental) |
| NinaPro DB5 | 2000Hz | Forearm EMG+Accelerometer | 500 | Law 1 |
Paper
S2S: Physics-Certified Sensor Data — Four Proven Levels, Eleven Laws, Five Independent Datasets
→ Read paper (PDF) | → DOI: 10.5281/zenodo.18878307
Project Structure
s2s_standard_v1_3/ # Physics engine (zero dependencies)
experiments/ # All experiments + results JSON
tests/ # 110 tests, all passing
docs/paper/ # S2S_Paper_v5.pdf
License
BSL-1.1 — free for research and non-commercial use. Contact for commercial licensing.