Unofficial parser for NETZSCH STA (Simultaneous Thermal Analysis) NGB instrument binary files. Not affiliated with, endorsed by, or approved by NETZSCH-Gerätebau GmbH.

Project description

pyNGB - NETZSCH STA File Parser

PyPI version · Python 3.9+ · License: MIT · Tests

A comprehensive Python library for parsing and analyzing NETZSCH STA (Simultaneous Thermal Analysis) NGB files with high performance, extensive metadata extraction, and robust batch processing capabilities.

🚨 Disclaimer

This package and its author are not affiliated with, endorsed by, or approved by NETZSCH-Gerätebau GmbH. This is an independent, open-source project created to provide Python support for parsing NGB (NETZSCH binary) file formats. NETZSCH is a trademark of NETZSCH-Gerätebau GmbH.

✨ Features

Core Capabilities

  • 🚀 High-Performance Parsing: Optimized binary parsing with NumPy and PyArrow
  • 📊 Rich Metadata Extraction: Complete instrument settings, sample information, and measurement parameters
  • 🧮 Baseline Subtraction: Automatic baseline correction with temperature program validation
  • 🔧 Flexible Data Access: Multiple APIs for different use cases
  • 📦 Modern Data Formats: PyArrow tables with embedded metadata
  • 🔍 Data Validation: Built-in quality checking and validation tools
  • ⚡ Batch Processing: Parallel processing of multiple files
  • 🛠️ Command Line Interface: Production-ready CLI for automation

Advanced Features

  • 🏗️ Modular Architecture: Extensible and maintainable design
  • 🔒 Type Safety: Full type hints and static analysis support
  • 🧪 Comprehensive Testing: 300+ tests including integration and stress tests
  • 🔄 Format Conversion: Export to Parquet, CSV, and JSON
  • 📈 Dataset Management: Tools for managing collections of NGB files
  • 🔀 Concurrent Processing: Thread-safe operations and parallel execution
  • 📊 DTG Analysis: Simple, powerful derivative thermogravimetry calculation
  • 📝 Rich Documentation: Complete API documentation with examples

🚀 Quick Start

Installation

pip install pyngb

Basic Usage

from pyngb import read_ngb

# Quick data loading (recommended for most users)
data = read_ngb("sample.ngb-ss3")
print(f"Loaded {data.num_rows} rows with {data.num_columns} columns")
print(f"Columns: {data.column_names}")

# Access embedded metadata
import json
metadata = json.loads(data.schema.metadata[b'file_metadata'])
print(f"Sample: {metadata.get('sample_name', 'Unknown')}")
print(f"Instrument: {metadata.get('instrument', 'Unknown')}")

# Separate metadata and data (for advanced analysis)
metadata, data = read_ngb("sample.ngb-ss3", return_metadata=True)

Data Analysis

import polars as pl

# Convert to DataFrame for analysis
df = pl.from_arrow(data)  # "data" is the table returned by read_ngb above

# Basic exploration
print(df.describe())
print(f"Temperature range: {df['sample_temperature'].min():.1f} to {df['sample_temperature'].max():.1f} °C")

# Simple plotting
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 8))
plt.subplot(2, 1, 1)
plt.plot(df['time'], df['sample_temperature'])
plt.ylabel('Temperature (°C)')
plt.title('Temperature Program')

plt.subplot(2, 1, 2)
plt.plot(df['time'], df['mass'])
plt.xlabel('Time (s)')
plt.ylabel('Mass (mg)')
plt.title('Mass Loss')
plt.show()

DTG (Derivative Thermogravimetry) Analysis

from pyngb import read_ngb
from pyngb.api.analysis import add_dtg
import polars as pl

table = read_ngb("sample.ngb-ss3")

# Method 1: Add DTG column directly to your table (recommended)
table_with_dtg = add_dtg(table, method="savgol", smooth="medium")
df = pl.from_arrow(table_with_dtg)

print(f"Added DTG column. Shape: {table_with_dtg.num_rows} x {table_with_dtg.num_columns}")
print(f"Columns: {table_with_dtg.column_names}")

# Method 2: Compare different smoothing levels in one workflow
dtg_strict = add_dtg(table, smooth="strict", column_name="dtg_strict")
dtg_medium = add_dtg(table, smooth="medium", column_name="dtg_medium")
dtg_loose = add_dtg(table, smooth="loose", column_name="dtg_loose")

# Combine all DTG columns for comparison
df_comparison = pl.from_arrow(dtg_strict).with_columns(
    pl.from_arrow(dtg_medium)["dtg_medium"],
    pl.from_arrow(dtg_loose)["dtg_loose"],
)

# Plot DTG results
plt.figure(figsize=(12, 6))
plt.plot(df_comparison['sample_temperature'], df_comparison['dtg_strict'], label='Strict smoothing', alpha=0.7)
plt.plot(df_comparison['sample_temperature'], df_comparison['dtg_medium'], label='Medium smoothing', linewidth=2)
plt.plot(df_comparison['sample_temperature'], df_comparison['dtg_loose'], label='Loose smoothing', alpha=0.7)
plt.xlabel('Temperature (°C)')
plt.ylabel('DTG (mg/min)')
plt.title('Derivative Thermogravimetry')
plt.legend()
plt.show()
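Conceptually, DTG is the time derivative of the mass signal. The sketch below illustrates the idea with a plain finite difference; it is not pyngb's actual algorithm (which uses Savitzky-Golay smoothing, per the `method="savgol"` option), and the mg/min sign convention here is an assumption.

```python
# Illustrative DTG: central-difference derivative of mass (mg) vs. time (s),
# scaled to mg/min. A concept sketch only, not pyngb's savgol implementation.
def dtg_finite_difference(time_s: list[float], mass_mg: list[float]) -> list[float]:
    n = len(time_s)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)  # one-sided at the ends
        dm = mass_mg[hi] - mass_mg[lo]
        dt = time_s[hi] - time_s[lo]
        out.append(dm / dt * 60.0)  # mg/s -> mg/min
    return out

# A constant mass loss of 0.01 mg/s gives a flat DTG of -0.6 mg/min
t = [float(i) for i in range(10)]
m = [100.0 - 0.01 * ti for ti in t]
print(dtg_finite_difference(t, m)[:3])
```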

Baseline Subtraction

from pyngb import read_ngb, subtract_baseline

# Method 1: Integrated baseline subtraction (recommended)
corrected_data = read_ngb(
    "sample.ngb-ss3",
    baseline_file="baseline.ngb-bs3"
)

# Method 2: Standalone baseline subtraction
corrected_df = subtract_baseline("sample.ngb-ss3", "baseline.ngb-bs3")

# Advanced: Custom dynamic axis (default is sample_temperature)
corrected_data = read_ngb(
    "sample.ngb-ss3",
    baseline_file="baseline.ngb-bs3",
    dynamic_axis="time"  # or "furnace_temperature"
)

📋 Complete Usage Guide

1. Single File Processing

import json
import polars as pl

from pyngb import read_ngb

# Method 1: Unified data and metadata (recommended)
table = read_ngb("experiment.ngb-ss3")
# Access data
df = pl.from_arrow(table)
# Access metadata
metadata = json.loads(table.schema.metadata[b'file_metadata'])

# Method 2: Separate metadata and data
metadata, data = read_ngb("experiment.ngb-ss3", return_metadata=True)

2. Batch Processing

from pyngb import BatchProcessor

# Initialize batch processor
processor = BatchProcessor(max_workers=4, verbose=True)

# Process multiple files
results = processor.process_files(
    ["file1.ngb-ss3", "file2.ngb-ss3", "file3.ngb-ss3"],
    output_format="both",  # Parquet and CSV
    output_dir="./processed_data/"
)

# Check results
successful = [r for r in results if r["status"] == "success"]
print(f"Successfully processed {len(successful)} files")

3. Dataset Management

from pyngb import NGBDataset

# Create dataset from directory
dataset = NGBDataset.from_directory("./sta_experiments/")

# Get overview
summary = dataset.summary()
print(f"Dataset contains {summary['file_count']} files")
print(f"Unique instruments: {summary['unique_instruments']}")

# Export metadata for analysis
dataset.export_metadata("dataset_metadata.csv", format="csv")

# Filter dataset (e.g., keep samples heavier than 10 mg)
heavy_samples = dataset.filter_by_metadata(
    lambda meta: meta.get('sample_mass', 0) > 10.0
)

4. Data Validation

from pyngb.validation import QualityChecker, validate_sta_data

# Quick validation
issues = validate_sta_data(df)
print(f"Found {len(issues)} data quality issues")

# Comprehensive validation
checker = QualityChecker(df)
result = checker.full_validation()

print(f"Validation passed: {result.is_valid}")
print(f"Errors: {result.summary()['error_count']}")
print(f"Warnings: {result.summary()['warning_count']}")

5. Advanced Parser Configuration

from pyngb import NGBParser, PatternConfig

# Custom configuration
config = PatternConfig()
config.column_map["custom_id"] = "custom_column"
config.metadata_patterns["custom_field"] = (b"\x99\x99", b"\x88\x88")

# Use custom parser
parser = NGBParser(config)
metadata, data = parser.parse("sample.ngb-ss3")

🖥️ Command Line Interface

Basic Commands

# Convert single file to Parquet
python -m pyngb sample.ngb-ss3

# Convert to CSV with verbose output
python -m pyngb sample.ngb-ss3 -f csv -v

# Convert to all formats (Parquet, CSV)
python -m pyngb sample.ngb-ss3 -f all -o ./output/

Baseline Subtraction via CLI

# Basic baseline subtraction
python -m pyngb sample.ngb-ss3 -b baseline.ngb-bs3

# Baseline subtraction with custom dynamic axis
python -m pyngb sample.ngb-ss3 -b baseline.ngb-bs3 --dynamic-axis time

# Baseline subtraction with all output formats
python -m pyngb sample.ngb-ss3 -b baseline.ngb-bs3 -f all -o ./corrected/

Batch Processing

# Process all files in directory
python -m pyngb *.ngb-ss3 -f parquet -o ./processed/

# Process with specific output formats
python -m pyngb experiments/*.ngb-ss3 -f both -o ./results/

# Get help
python -m pyngb --help

Advanced CLI Usage

# Process directory with pattern matching
find ./data -name "*.ngb-ss3" | xargs python -m pyngb -f parquet -o ./output/

# Automated processing pipeline
python -m pyngb $(find ./incoming -name "*.ngb-ss3" -mtime -1) -f all -o ./daily_processing/

🏗️ Architecture

pyngb uses a modular, extensible architecture designed for performance and maintainability:

pyngb/
├── api/                    # High-level user interface
│   ├── loaders.py         # Main loading functions
│   └── analysis.py        # DTG analysis API
├── analysis/              # Simplified analysis tools
│   └── dtg.py             # Clean DTG calculation
├── binary/                # Low-level binary parsing
│   ├── parser.py          # Binary structure parsing
│   └── handlers.py        # Data type handlers
├── core/                  # Core orchestration
│   └── parser.py          # Main parser coordination
├── extractors/            # Data extraction modules
│   ├── metadata.py        # Metadata extraction
│   └── streams.py         # Data stream processing
├── batch.py               # Batch processing tools
├── validation.py          # Data quality validation
├── constants.py           # Configuration and constants
├── exceptions.py          # Custom exception hierarchy
└── util.py               # Utility functions

Design Principles

  • Performance First: Optimized for speed and memory efficiency
  • Extensibility: Easy to add new data types and extraction patterns
  • Reliability: Comprehensive error handling and validation
  • Usability: Multiple APIs for different user needs
  • Maintainability: Clean separation of concerns and thorough testing

📊 Data Output

Supported Columns

Common data columns extracted from NGB files:

Column               Description               Units
time                 Measurement time          seconds
sample_temperature   Sample temperature        °C
furnace_temperature  Furnace temperature       °C
mass                 Sample mass               mg
dsc_signal           DSC heat flow             µV/mg
purge_flow_1         Primary purge gas flow    mL/min
purge_flow_2         Secondary purge gas flow  mL/min
protective_flow      Protective gas flow       mL/min

Derived Columns (via DTG analysis):

Column   Description                                Units
dtg      Derivative thermogravimetry (time-based)   mg/min
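For quick sanity checks, the `mass` column (mg) converts directly to percent mass loss relative to the initial mass. This is a generic TGA helper, not a pyngb API; in practice the values would come from `df["mass"].to_list()`.

```python
# Percent mass loss relative to the first (initial) mass value, in %.
def mass_loss_percent(mass_mg: list[float]) -> list[float]:
    m0 = mass_mg[0]
    return [(m0 - m) / m0 * 100.0 for m in mass_mg]

print(mass_loss_percent([10.0, 9.5, 8.0]))  # approximately [0.0, 5.0, 20.0]
```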

Metadata Fields

Comprehensive metadata extraction including:

  • Instrument Information: Model, version, calibration data
  • Sample Details: Name, mass, material, crucible type
  • Experimental Conditions: Operator, date, lab, project
  • Temperature Program: Complete heating/cooling profiles
  • Gas Flows: MFC settings and gas types
  • System Parameters: PID settings, acquisition rates
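Because the metadata travels as JSON in the table schema (see Basic Usage), fields can be read defensively with `dict.get`, which tolerates files where a field is absent. The payload below is simulated, and key names other than `sample_name` and `instrument` are assumptions for illustration.

```python
import json

# Simulated payload; in practice: table.schema.metadata[b'file_metadata']
raw = json.dumps({"sample_name": "CaCO3", "instrument": "STA 449", "sample_mass": 12.3})
metadata = json.loads(raw)

# .get with a default avoids KeyError when a field is missing
sample = metadata.get("sample_name", "Unknown")
mass = metadata.get("sample_mass", 0.0)
operator = metadata.get("operator", "Unknown")  # absent here -> "Unknown"
print(sample, mass, operator)
```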

🔧 Advanced Features

Performance Optimization

# Memory-efficient processing of large files
table = read_ngb("large_file.ngb-ss3")
# Process in chunks to manage memory
chunk_size = 10000
for i in range(0, table.num_rows, chunk_size):
    chunk = table.slice(i, chunk_size)
    # Process chunk...

Custom Data Types

import struct

from pyngb.binary.handlers import DataTypeHandler, DataTypeRegistry

class CustomHandler(DataTypeHandler):
    def can_handle(self, data_type: bytes) -> bool:
        return data_type == b'\x99'

    def parse(self, data: bytes) -> list:
        # Custom parsing logic
        return [struct.unpack('<f', data[i:i+4])[0] for i in range(0, len(data), 4)]

# Register custom handler
registry = DataTypeRegistry()
registry.register(CustomHandler())

Validation Customization

from pyngb.validation import QualityChecker

class CustomQualityChecker(QualityChecker):
    def custom_check(self):
        """Add custom validation logic."""
        if "custom_column" in self.data.columns:
            values = self.data["custom_column"]
            if values.min() < 0:
                self.result.add_error("Custom column has negative values")

🧪 Testing and Quality

pyngb includes a comprehensive test suite ensuring reliability:

  • 300+ Tests: Unit, integration, and end-to-end tests
  • Real Data Testing: Tests using actual NGB files
  • Stress Testing: Memory management and concurrent processing
  • Edge Case Coverage: Corrupted files, extreme data values
  • Performance Testing: Large file processing benchmarks

Run tests locally:

# Install development dependencies
uv sync --extra dev

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html

# Run only fast tests
pytest -m "not slow"

🤝 Contributing

We welcome contributions! Here's how to get started:

Development Setup

# Clone repository
git clone https://github.com/GraysonBellamy/pyngb.git
cd pyngb

# Install with development dependencies
uv sync --extra dev

# Install pre-commit hooks
pre-commit install

# Run tests
pytest

Contributing Guidelines

  1. Fork the repository and create a feature branch
  2. Write tests for new functionality
  3. Follow code style (ruff + mypy)
  4. Update documentation for new features
  5. Submit a pull request with clear description

See CONTRIBUTING.md for detailed guidelines.

📚 Documentation

🚀 Performance

pyngb is optimized for performance:

  • Fast Parsing: Typical files parse in 0.1-2 seconds
  • Memory Efficient: Uses PyArrow for optimal memory usage
  • Parallel Processing: Multi-core batch processing
  • Scalable: Handles files from KB to GB sizes

Benchmarks

Operation                  Performance
Parse 10 MB file           ~0.5 seconds
Extract metadata           ~0.1 seconds
Batch process 100 files    ~30 seconds (4 cores)
Memory usage               ~2x file size
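To check these numbers on your own hardware, a small `time.perf_counter` harness suffices. The `sum` workload below is a placeholder so the sketch runs anywhere; swap in `benchmark(read_ngb, "your_file.ngb-ss3")` for a real measurement.

```python
import time

def benchmark(fn, *args, repeats: int = 3) -> float:
    """Return the best-of-N wall-clock time in seconds for fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Placeholder workload; replace with: benchmark(read_ngb, "large_file.ngb-ss3")
elapsed = benchmark(sum, range(100_000))
print(f"best of 3: {elapsed:.4f} s")
```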

🔗 Integration

pyngb integrates well with the scientific Python ecosystem:

# With Pandas
import pandas as pd
df_pandas = pl.from_arrow(table).to_pandas()

# With NumPy
import numpy as np
temperature_array = table['sample_temperature'].to_numpy()

# With Matplotlib/Seaborn
import matplotlib.pyplot as plt
import seaborn as sns

# With Jupyter notebooks
from IPython.display import display
display(df.head())

📄 License

This project is licensed under the MIT License - see the LICENSE.txt file for details.

🙏 Acknowledgments

  • NETZSCH-Gerätebau GmbH for creating the STA instruments (no affiliation)
  • The PyArrow and Polars teams for excellent data processing libraries
  • The scientific Python community for the foundational tools

📞 Support


Made with ❤️ for the scientific community
