CPU-optimized statistical computing library with SIMD vectorization and OpenMP parallelization for massive datasets (10M+ records)

HPCSeries Core

High-Performance Statistical Computing for Large-Scale Data Analysis


Overview

HPCSeries Core is a CPU-optimized statistical computing library for massive datasets (10M+ records). It provides 2-100x speedups over NumPy/Pandas through SIMD vectorization (AVX2/AVX-512/NEON), OpenMP parallelization, and cache-optimized algorithms.

Built with Fortran, C, and C++ for maximum performance, with zero-copy Python bindings via Cython.

Key Features

  • SIMD-Accelerated Operations: sum, mean, std, min, max, median, MAD, quantile
  • Fast Rolling Windows: roughly 50x to over 100x faster than Pandas for rolling operations in the benchmarks below
  • Anomaly Detection: Statistical and robust outlier detection
  • Axis/Masked Operations: Efficient 2D array and missing data handling
  • Auto-Tuning: One-time calibration for optimal hardware performance
  • Architecture-Aware: Automatic optimization for x86 (Intel/AMD) and ARM (Graviton)
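To make "robust outlier detection" concrete, here is a minimal pure-Python sketch of a robust z-score built on the median and the median absolute deviation (MAD). It is not the HPCSeries implementation: the function names, the 1.4826 scale factor (the usual normal-consistency constant), and the threshold of 3.0 are assumptions chosen for illustration.

```python
import statistics


def robust_zscores(values, scale=1.4826):
    """Return (x - median) / (scale * MAD) for each x in values.

    Illustrative only; the scale factor is an assumed convention.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; robust z-score is undefined")
    return [(v - med) / (scale * mad) for v in values]


def flag_outliers(values, threshold=3.0):
    """Mark points whose absolute robust z-score exceeds the threshold."""
    return [abs(z) > threshold for z in robust_zscores(values)]


# Six readings near 10 and one obvious spike.
data = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 25.0]
flags = flag_outliers(data)
print(flags)
```

Because the median and MAD barely move when a single extreme value is added, the spike is flagged without inflating the scale estimate the way a mean and standard deviation would.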

Performance Highlights

Operation        Array Size     NumPy/Pandas   HPCSeries   Speedup
sum              1M             0.45 ms        0.12 ms     3.8x
rolling_mean     100K (w=50)    45 ms          0.8 ms      56x
rolling_median   100K (w=50)    850 ms         7.2 ms      118x
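The speedup column is simply the baseline time divided by the HPCSeries time. The short check below recomputes it from the timings in the table; the variable and function names are illustrative and not part of the library.

```python
# (operation, NumPy/Pandas time in ms, HPCSeries time in ms), copied from the table.
rows = [
    ("sum", 0.45, 0.12),
    ("rolling_mean", 45.0, 0.8),
    ("rolling_median", 850.0, 7.2),
]


def speedup(baseline_ms, hpcs_ms):
    """Ratio of baseline time to HPCSeries time."""
    return baseline_ms / hpcs_ms


speedups = {name: speedup(base, fast) for name, base, fast in rows}
for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

The ratios come out at about 3.75, 56.25, and 118.06, consistent with the rounded values listed in the table.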

Target use cases: 10M-1B records, time-series analysis, sensor data, financial analytics.


Installation

Quick Install

pip install hpcs

Verify:

import hpcs
print(hpcs.__version__)  # 0.7.0

Build from Source

git clone https://github.com/hpcseries/HPCSeriesCore.git
cd HPCSeriesCore

# Build C/Fortran library
mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j$(nproc)
cd ..

# Install Python bindings
pip install -e .

Requirements: Python 3.8+, NumPy 1.20+, GCC/gfortran 7+, CMake 3.18+

See Build Guide for details.


Quick Start

import hpcs
import numpy as np

x = np.random.randn(1_000_000)

# Reductions (2-5x faster than NumPy)
total, mean, std = hpcs.sum(x), hpcs.mean(x), hpcs.std(x)

# Rolling operations (50-100x faster than Pandas)
rolling_median = hpcs.rolling_median(x, window=100)

# Anomaly detection
anomalies = hpcs.detect_anomalies_robust(x, threshold=3.0)

# Auto-tuning (run once)
hpcs.calibrate()
hpcs.save_calibration_config()
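When adopting a faster rolling routine, it helps to have a slow but obviously correct reference to compare against on small inputs. The sketch below is a pure-Python rolling mean that uses a running sum; it is not HPCSeries code. The function name is hypothetical, and returning only full windows (length n - window + 1) is an assumption, so check how the library pads the leading elements before comparing outputs.

```python
def rolling_mean_reference(values, window):
    """Rolling mean over full windows only, computed with a running sum.

    Each step adds the element entering the window and subtracts the one
    leaving it, so the whole pass is O(n) regardless of window size.
    """
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    out = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        out.append(running / window)
    return out


print(rolling_mean_reference([1, 2, 3, 4, 5], window=2))
```

On a small slice of real data, the reference result can be compared element by element (within floating-point tolerance) against the output of the accelerated function.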

Performance Configuration

Optimal OpenMP Settings

export OMP_DYNAMIC=false
export OMP_PROC_BIND=true
export OMP_PLACES=cores
export OMP_NUM_THREADS=2

Why 2 threads? Empirical testing on AMD EPYC Genoa, Intel Ice Lake, and ARM Graviton3 shows HPCSeries Core saturates memory bandwidth at 2 threads. Using 4+ threads degrades performance by 5-18% due to cache contention.

See Performance Methodology for full analysis.
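The same settings can be applied from Python when exporting shell variables is inconvenient, for example in a notebook. OpenMP runtimes typically read these variables once at initialization, so the sketch below sets them before the library is imported. The helper name and the `override` flag are hypothetical and not part of HPCSeries.

```python
import os

# Values taken from the recommended shell exports above.
OPENMP_SETTINGS = {
    "OMP_DYNAMIC": "false",
    "OMP_PROC_BIND": "true",
    "OMP_PLACES": "cores",
    "OMP_NUM_THREADS": "2",
}


def apply_openmp_settings(settings=OPENMP_SETTINGS, override=False):
    """Set OpenMP variables in this process and return the effective values.

    Existing values are kept unless override=True, so a value exported
    in the shell still wins by default.
    """
    effective = {}
    for name, value in settings.items():
        if override or name not in os.environ:
            os.environ[name] = value
        effective[name] = os.environ[name]
    return effective


effective = apply_openmp_settings()
print(effective)

# Import the library only after the environment is configured:
# import hpcs
```

Keeping shell-provided values by default makes it easy to experiment with other thread counts from the command line without editing the script.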

Additional Tips

  • Ensure C-contiguous arrays: np.ascontiguousarray(x)
  • Use robust functions (median, robust_zscore) for data with outliers
  • Run calibration once: hpcs.calibrate() and hpcs.save_calibration_config()

Documentation

Examples & Tutorials

  • Jupyter Notebooks - 12 comprehensive tutorials covering:
    • Getting started and basic usage
    • Rolling operations and anomaly detection
    • Climate data, IoT sensors, financial analytics
    • NumPy/Pandas migration guide
    • Kaggle competition examples

See Notebooks README for full list.


Version History

v0.7.0 (Current - 2025-12-17)

  • Architecture-aware compilation (x86 and ARM)
  • AWS deployment infrastructure
  • Comprehensive performance validation
  • Thread scaling optimization (OMP_NUM_THREADS=2 recommended on all tested architectures)

See CHANGELOG.md for complete history.


Project Structure

HPCSeriesCore/
├── src/                      # C/Fortran/C++ source
│   ├── fortran/              # HPC kernels (OpenMP)
│   └── hpcs_*.c              # SIMD implementations
├── include/                  # C API headers
├── python/hpcs/              # Python bindings (Cython)
├── cmake/                    # CMake modules (architecture detection)
├── notebooks/                # Jupyter tutorials
├── docs/                     # Documentation
├── tests/                    # Test suites
└── bench/                    # Benchmarks


License

MIT License - See LICENSE for details.


Citation

If you use HPCSeries Core in your research, please cite:

@software{hpcseries_core_2025,
  title = {HPCSeries Core: High-Performance Statistical Computing for Large-Scale Data Analysis},
  author = {HPCSeries Core Contributors},
  year = {2025},
  month = {12},
  version = {0.7.0},
  url = {https://github.com/hpcseries/HPCSeriesCore},
  license = {MIT}
}

Or use GitHub's "Cite this repository" button (auto-generated from CITATION.cff).


⭐ Star us on GitHub if HPCSeries Core accelerates your data analysis!
