Wi-Fi Sensing Data Processing - A Python library for downloading, processing, analyzing and training on Wi-Fi CSI data

SDP: Sensing Data Protocol for Scalable Wireless Sensing

📖 Citation

If you use SDP in your research, please cite:

@misc{zhang2026sdpunifiedprotocolbenchmarking,
      title={SDP: A Unified Protocol and Benchmarking Framework for Reproducible Wireless Sensing}, 
      author={Di Zhang and Jiawei Huang and Yuanhao Cui and Xiaowen Cao and Tony Xiao Han and Xiaojun Jing and Christos Masouros},
      year={2026},
      eprint={2601.08463},
      archivePrefix={arXiv},
      primaryClass={eess.SP},
      url={https://arxiv.org/abs/2601.08463}, 
}

🇬🇧 English

🎯 What is SDP?

SDP is a protocol-level abstraction and unified benchmark for reproducible wireless sensing.

⚠️ SDP is not a new neural network, but a standardized protocol that unifies CSI representations for fair comparison.

The Problem

Wireless sensing research often suffers from:

❌ Hardware-specific CSI formats
❌ Inconsistent preprocessing pipelines
❌ Unstable training results
❌ Large performance variance across random seeds

Result: Models cannot be fairly compared.

The Solution

SDP solves this at the protocol level, not the model level:

Feature	Raw CSI	Other Tools	SDP
Standardized Format	❌ Hardware-specific	⚠️ Partial	✅ Unified CSIFrame
Multi-Dataset Support	❌ Manual parsing	⚠️ 2-3 datasets	✅ 5 datasets built-in
Preprocessing	❌ DIY	⚠️ Basic only	✅ Wavelet + Phase Calib
Reproducibility	❌ Random	⚠️ Varies	✅ 5-seed standard
Deep Learning	❌ From scratch	⚠️ Limited	✅ CNN+Transformer
CLI Interface	❌ None	⚠️ Partial	✅ Full CLI support

SDP projects raw CSI onto a fixed canonical frequency grid (K=30), ensuring cross-hardware comparability.
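As a rough illustration of what projecting onto a fixed grid can look like, the sketch below linearly interpolates a complex CSI vector with any number of subcarriers onto K=30 evenly spaced points. The function name `project_to_grid` and the choice of linear interpolation are assumptions made for this example; the resampling method SDP actually uses is not described here.

```python
def project_to_grid(csi, k=30):
    """Linearly resample a list of complex CSI values onto k evenly spaced points."""
    n = len(csi)
    if n < 2 or k < 2:
        raise ValueError("need at least two source and two target points")
    out = []
    for i in range(k):
        # Position of the i-th target point on the source index axis [0, n - 1].
        pos = i * (n - 1) / (k - 1)
        lo = min(int(pos), n - 2)
        frac = pos - lo
        out.append(csi[lo] * (1 - frac) + csi[lo + 1] * frac)
    return out


# Example: a 512-subcarrier frame (the ZTE subcarrier count) reduced to 30 points.
frame = [complex(i, -i) for i in range(512)]
canonical = project_to_grid(frame)
print(len(canonical))
```

Interpolating the real and imaginary parts together keeps the example short; a production implementation might instead interpolate amplitude and unwrapped phase separately.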

Performance Highlights

Metric	Result
Accuracy	SOTA on 5 datasets
Reproducibility	5-seed evaluation standard
Stability	Low variance across runs
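The 5-seed standard amounts to repeating the same experiment under five fixed seeds and reporting the mean and spread. The sketch below shows only that bookkeeping: `run_once` is a hypothetical stand-in that returns a simulated number, not a real training run or a measured SDP result.

```python
import random
import statistics


def run_once(seed):
    """Stand-in for one training run; returns a simulated accuracy, not a real result."""
    rng = random.Random(seed)
    return 0.90 + rng.uniform(-0.02, 0.02)


def evaluate_over_seeds(seeds=(0, 1, 2, 3, 4)):
    """Run the experiment once per seed and summarize the spread."""
    scores = [run_once(s) for s in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


mean_acc, std_acc = evaluate_over_seeds()
print(f"accuracy: {mean_acc:.4f} +/- {std_acc:.4f} over 5 seeds")
```

In practice `run_once` would be replaced by a call that trains and evaluates a model with the given seed.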

Figure 1: Accuracy comparison across datasets

Figure 2: Reproducibility and stability analysis

Figure 3: Ablation study results

🚀 Quick Start (3 Steps, 5 Minutes)

Step 1: Install (30 seconds)

pip install wsdp

Verify installation:

wsdp --version

Step 2: Download Dataset (2 minutes)

Option A: From CLI (Recommended for testing)

# elderAL = smallest dataset, fastest for testing
wsdp download elderAL ./data

# Or download larger datasets:
# wsdp download widar ./data
# wsdp download gait ./data
# wsdp download xrf55 ./data
# wsdp download zte ./data

Option B: From SDP Website

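If the CLI download fails, for example because of network issues, download the dataset manually from the SDP website and place it in the structure shown below.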

Required Dataset Structure:

data/
├── elderAL/                    # Dataset name
│   ├── action0_static_new/     # Activity folder
│   │   ├── user0_position1_activity0/  # Sample folder
│   │   │   ├── sample1.csv
│   │   │   └── ...
│   │   └── ...
│   ├── action1_walk_new/
│   └── ...
├── widar/
├── gait/
├── xrf55/
└── zte/
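Before training, it can help to confirm that a downloaded dataset matches this layout. The following is a minimal sketch using only the standard library; `list_samples` is a hypothetical helper and not part of the wsdp API.

```python
from pathlib import Path


def list_samples(dataset_root):
    """Return {activity_folder: [csv paths]} for a dataset laid out as shown above."""
    root = Path(dataset_root)
    samples = {}
    for activity_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Sample folders sit one level below each activity folder.
        samples[activity_dir.name] = sorted(activity_dir.glob("*/*.csv"))
    return samples


if Path("./data/elderAL").is_dir():
    for activity, files in list_samples("./data/elderAL").items():
        print(f"{activity}: {len(files)} CSV files")
```

An activity folder that reports zero CSV files usually points to an incomplete download or an extra nesting level.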

Step 3: Train & Evaluate (2 minutes)

🐍 Python API (Recommended for research):

Create train.py:

from wsdp import pipeline

# Minimal call - uses default hyperparameters
pipeline("./data/elderAL", "./output", "elderAL")

# Or with custom hyperparameters
pipeline(
    input_path="./data/elderAL",
    output_folder="./output",
    dataset="elderAL",
    learning_rate=1e-3,
    num_epochs=50,
    batch_size=64,
)

Run:

python train.py

💻 CLI (Quick & Simple):

# Basic training
wsdp run ./data/elderAL ./output elderAL

# With hyperparameter override
wsdp run ./data/elderAL ./output elderAL --lr 0.001 --epochs 50 --batch-size 64

# With config file
wsdp run ./data/elderAL ./output elderAL --config my_config.yaml
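For small hyperparameter sweeps, the same commands can be generated from a script. This sketch only assembles argument lists from the flags shown above (`--lr`, `--epochs`, `--batch-size`); `build_run_command` and the per-run output folder naming are hypothetical conveniences, not part of wsdp.

```python
import shlex


def build_run_command(input_path, output_folder, dataset, lr=None, epochs=None, batch_size=None):
    """Assemble a `wsdp run` argument list using only the flags shown above."""
    cmd = ["wsdp", "run", input_path, output_folder, dataset]
    if lr is not None:
        cmd += ["--lr", str(lr)]
    if epochs is not None:
        cmd += ["--epochs", str(epochs)]
    if batch_size is not None:
        cmd += ["--batch-size", str(batch_size)]
    return cmd


for lr in (0.01, 0.001):
    cmd = build_run_command("./data/elderAL", f"./output/lr_{lr}", "elderAL", lr=lr, epochs=50)
    # Printing keeps this example side-effect free; pass `cmd` to
    # subprocess.run(cmd, check=True) to actually launch the run.
    print(shlex.join(cmd))
```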

📊 What You Get:

After training, check ./output/:

output/
├── best_model.pth              # Best model checkpoint
├── confusion_matrix.png        # Evaluation visualization
├── training_curves.png         # Loss & accuracy curves
└── output.log                  # Detailed training logs

✅ If you see these files, SDP is working correctly!
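The check can also be scripted, which is handy in CI or batch jobs. The file names below are taken from the listing above; `missing_outputs` is a hypothetical helper written for this example.

```python
from pathlib import Path

EXPECTED_FILES = (
    "best_model.pth",
    "confusion_matrix.png",
    "training_curves.png",
    "output.log",
)


def missing_outputs(output_folder):
    """Return the expected output files that are absent from output_folder."""
    folder = Path(output_folder)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


missing = missing_outputs("./output")
print("all outputs present" if not missing else f"missing: {missing}")
```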

📊 Supported Datasets

Dataset	File format	Subcarriers	Complex-valued	Scenario	Approx. size
Widar	.dat (bfee)	30	✅	Gesture recognition	~2 GB
Gait	.dat (bfee)	30	✅	Gait recognition	~1 GB
XRF55	.npy	30	✅	Human activity	~3 GB
ElderAL	.csv	varies	❌	Elderly activity	~500 MB
ZTE	.csv	512	✅	CSI with I/Q	~4 GB
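For datasets that store CSI as separate I/Q values, each complex sample is simply I + jQ, from which amplitude and phase follow. The sketch below shows that conversion on hypothetical in-memory lists; how the I and Q columns are arranged in the actual CSV files is not specified here.

```python
import cmath


def iq_to_complex(i_values, q_values):
    """Pair in-phase and quadrature samples into complex CSI values."""
    if len(i_values) != len(q_values):
        raise ValueError("I and Q must have the same length")
    return [complex(i, q) for i, q in zip(i_values, q_values)]


# Hypothetical two-subcarrier example.
csi = iq_to_complex([3.0, 0.0], [4.0, 1.0])
amplitude = [abs(h) for h in csi]
phase = [cmath.phase(h) for h in csi]
print(amplitude)
```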

More datasets coming soon! See Roadmap.

🔬 Research & Customization

🧠 Plug in Your Own Model

Step 1: Create custom_model.py:

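import torch
import torch.nn as nn

class YourCustomModel(nn.Module):
    def __init__(self, num_classes=6):
        super().__init__()
        # Placeholder architecture; replace with your own layers.
        # Input shape: (Batch, Timestamp, Frequency, Antenna)
        # This placeholder assumes a real-valued float tensor.
        self.pool = nn.AdaptiveAvgPool2d((8, 8))
        self.head = nn.Linear(64, num_classes)

    def forward(self, x):
        # Average over antennas, pool time/frequency to 8x8, then classify.
        pooled = self.pool(x.mean(dim=3).unsqueeze(1))
        return self.head(pooled.flatten(start_dim=1))

# Required: expose model class
model = YourCustomModel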

Step 2: Run with your model:

wsdp run ./data/elderAL ./output elderAL -m custom_model.py

📁 Use Your Own Dataset

Organize your data:

data/
└── my_dataset/
    ├── user0_pos0_action0/
    │   ├── sample1.csv
    │   └── ...
    └── user0_pos0_action1/
        └── ...

Run:

wsdp run ./data/my_dataset ./output my_dataset
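The folder names in the layout above encode user, position, and action indices. The sketch below shows one way to read them with a regular expression; it mirrors the naming shown in this README and is not necessarily how wsdp itself derives labels.

```python
import re

FOLDER_PATTERN = re.compile(r"^user(\d+)_pos(\d+)_action(\d+)$")


def parse_sample_folder(name):
    """Split a folder name like 'user0_pos0_action1' into integer fields."""
    match = FOLDER_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected folder name: {name!r}")
    user, position, action = (int(g) for g in match.groups())
    return {"user": user, "position": position, "action": action}


print(parse_sample_folder("user0_pos0_action1"))
```

Running a check like this over your folders before training catches misnamed directories early.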

🗺️ Codebase Map

Want to go deeper? Here's where to modify:

Directory	Purpose	What to Modify
`models/`	Architectures	Define or compare model architectures
`algorithms/`	Signal Processing	Denoising, calibration, etc.
`datasets/`	Dataset Wrappers	Add new dataset loaders
`readers/`	File Readers	Add new format parsers
`structure/`	Data Structures	Modify CSIFrame format
`processors/`	Protocol Logic	Adjust canonical projection

🧪 Understanding SDP (10-Min Deep Dive)

The SDP Pipeline

Raw CSI
  ↓
[Deterministic Sanitization]
  - Phase calibration
  - Wavelet denoising
  ↓
[Canonical Tensor Construction]
  - K=30 frequency grid
  - Standardized shape
  ↓
[Deep Learning Model]
  ↓
Prediction
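To make the phase calibration step more concrete, here is a sketch of one widely used CSI sanitization technique: unwrap the phase across subcarriers, fit a straight line by least squares, and subtract it. This removes a constant offset and a linear slope. It is an illustrative assumption; the exact calibration SDP applies is not specified in this README.

```python
import cmath
import math


def unwrap(phases):
    """Remove 2*pi jumps between neighbouring subcarriers."""
    out = [phases[0]]
    for p in phases[1:]:
        delta = p - out[-1]
        delta -= 2 * math.pi * round(delta / (2 * math.pi))
        out.append(out[-1] + delta)
    return out


def calibrate_phase(csi):
    """Subtract the least-squares linear phase trend across subcarriers."""
    n = len(csi)
    phase = unwrap([cmath.phase(h) for h in csi])
    idx = list(range(n))
    mean_k = sum(idx) / n
    mean_p = sum(phase) / n
    cov = sum((k - mean_k) * (p - mean_p) for k, p in zip(idx, phase))
    var = sum((k - mean_k) ** 2 for k in idx)
    slope = cov / var
    intercept = mean_p - slope * mean_k
    # Keep each amplitude, replace the phase with the detrended residual.
    return [
        abs(h) * cmath.exp(1j * (p - slope * k - intercept))
        for h, k, p in zip(csi, idx, phase)
    ]


# Synthetic frame whose phase is a pure linear ramp across 30 subcarriers.
tilted = [cmath.exp(1j * (0.3 * k + 0.5)) for k in range(30)]
flat = calibrate_phase(tilted)
print(max(abs(cmath.phase(h)) for h in flat))  # close to zero
```

The function has no random component, so the same input always yields the same output.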

Canonical Tensor Format

After sanitization, SDP constructs a Canonical CSI Tensor:

$$X \in \mathbb{C}^{A \times K \times T}$$

Where:

$A$ = Number of antennas
$K$ = 30 (fixed frequency grid)
$T$ = Time samples

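A fixed $K$ gives every dataset the same frequency dimension regardless of the capture hardware, which ensures cross-hardware comparability. Note that custom models receive batches laid out as (Batch, Timestamp, Frequency, Antenna), that is, with these axes in reverse order.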
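The canonical tensor is written here as $A \times K \times T$, while the custom-model template documents its input as (Batch, Timestamp, Frequency, Antenna). The sketch below shows the axis reordering between the two for a single sample, using plain nested lists. Where and how wsdp performs this step internally is an assumption; `to_model_layout` is a hypothetical name.

```python
def to_model_layout(x):
    """Reorder a nested list shaped (A, K, T) into (T, K, A)."""
    n_ant, n_freq, n_time = len(x), len(x[0]), len(x[0][0])
    return [
        [[x[a][k][t] for a in range(n_ant)] for k in range(n_freq)]
        for t in range(n_time)
    ]


# Toy tensor with A=3 antennas, K=30 subcarriers, T=4 time samples.
x = [[[complex(a, k + t) for t in range(4)] for k in range(30)] for a in range(3)]
y = to_model_layout(x)
print(len(y), len(y[0]), len(y[0][0]))
```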

Why Deterministic?

Raw CSI contains hardware distortions:

Phase offsets
Sampling time offsets
Noise fluctuations

SDP enforces deterministic calibration and denoising, guaranteeing:

✅ Same raw CSI → Same cleaned tensor
✅ Reproducibility is enforced, not optional
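The determinism property is easy to check on a toy denoiser. The sketch below applies a one-level Haar wavelet transform with soft thresholding and confirms that two runs on the same input agree exactly. The Haar wavelet, the single decomposition level, and the threshold value are assumptions for illustration; the wavelet settings SDP uses are not given here.

```python
import math


def haar_denoise(signal, threshold):
    """One-level Haar transform, soft-threshold the detail coefficients, invert."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Soft thresholding shrinks small (noise-dominated) details toward zero.
    detail = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / s, (a - d) / s]
    return out


noisy = [1.0, 1.1, 0.9, 1.0, 5.0, 5.2, 4.9, 5.0]
first = haar_denoise(noisy, threshold=0.2)
second = haar_denoise(noisy, threshold=0.2)
print(first == second)
```

Because every step is a fixed arithmetic operation with no random state, repeated calls produce identical tensors.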

📚 Documentation & Resources

🗺️ Roadmap

v0.1 - Initial protocol design
v0.2 - 5 datasets support, CLI tool
v0.3 - More datasets (WiFi-HAR, CSI-HAR, etc.)
v0.4 - Online demo platform
v0.5 - PyPI official release
v1.0 - Full protocol standardization

Want a specific dataset? Open an issue and let us know!

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for:

Development setup
Coding guidelines
Pull request process

📄 License

MIT License - see LICENSE file.

Made with ❤️ by the WSDP Team
