Skip to main content

Entropy-driven adaptive sorting that beats C-level np.sort() on real-world data distributions.

Project description

beyond-sort

Entropy-driven adaptive sorting that beats C-level np.sort() on real-world data.

One line. Automatic. Faster than C.

import beyond_sort

result = beyond_sort.sort(data)         # auto-selects optimal strategy
strategy = beyond_sort.probe(data)      # inspect which strategy would be used
detail = beyond_sort.probe_detail(data) # full entropy analysis

Why is this faster than np.sort()?

np.sort() uses a single C-level introsort for all data. beyond-sort analyzes your data's structure in a single pass, then routes to the mathematically optimal algorithm compiled to native machine code:

Data Pattern np.sort() strategy beyond-sort strategy Why it's faster
Random Introsort (generic) Parallel radix (value-range partitioned) Multi-core + zero-merge
Nearly sorted Introsort (generic) Parallel radix (exploits narrow value range) Fewer passes needed
Reversed Introsort (O(n log n)) O(n) reverse copy Detects structure, skips sorting
Few unique values Introsort (O(n log n)) Parallel counting sort O(n+k) with segmented histogram
Pipe organ (↑↓) Introsort (generic) Parallel radix (value-range partitioned) Partitions align with structure

Benchmarks

Tested on 32-core CPU, N = 100,000,000 (100 million integers):

Scenario np.sort() beyond-sort Speedup
random 1731 ms 1205 ms 1.44×
nearly_sorted 1765 ms 1090 ms 1.62×
reversed 1726 ms 240 ms 7.19×
few_unique 491 ms 459 ms 1.07×
pipe_organ 1911 ms 1008 ms 1.90×
sawtooth 1584 ms 932 ms 1.70×

vs Python sorted(): up to 16× faster.

The advantage grows with data size — at 10M elements: 1.1× faster; at 100M: 1.4-7.2× faster.

Architecture: 4-Layer Optimization

┌─────────────────────────────────────────────────┐
│  Layer 1: Symbol   Entropy probe (τ, runs,      │
│                    value range) → strategy route │
├─────────────────────────────────────────────────┤
│  Layer 2: Syntax   Numba @njit → LLVM native    │
│                    machine code, zero interpreter│
├─────────────────────────────────────────────────┤
│  Layer 3: Instr.   NumPy SIMD vectorization,    │
│                    cache-friendly memory access  │
├─────────────────────────────────────────────────┤
│  Layer 4: Parallel Value-range partitioning →   │
│                    prange multi-core, zero-merge │
└─────────────────────────────────────────────────┘

Key insight: Value-range partitioning

Instead of splitting the array by position (which requires merging), we split by value range. Each partition contains a non-overlapping range of values. After sorting each partition independently on separate cores, the results concatenate directly — zero merge step.

Install

pip install beyond-sort

Requirements: Python 3.10+, NumPy, Numba.

When to use beyond-sort

  • Sorting large integer arrays (100K+ elements)
  • Data with predictable structure (logs, time-series, sensor data, IDs)
  • Batch processing where sorting is a bottleneck
  • Any scenario where np.sort() isn't fast enough

API

beyond_sort.sort(data) → np.ndarray

Sort an integer array. Accepts list or numpy array. Returns sorted numpy array.

beyond_sort.probe(data) → str

Returns the strategy name without sorting. Useful for debugging.

beyond_sort.probe_detail(data) → dict

Full entropy analysis: tau (order degree), run count, value range, selected strategy.

beyond_sort.warmup()

Pre-compile all JIT paths. Call once at startup for consistent benchmark results.

License

Apache-2.0 — free to use commercially, patent rights reserved.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

beyond_sort-0.1.0.tar.gz (8.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

beyond_sort-0.1.0-py3-none-any.whl (6.0 kB view details)

Uploaded Python 3

File details

Details for the file beyond_sort-0.1.0.tar.gz.

File metadata

  • Download URL: beyond_sort-0.1.0.tar.gz
  • Upload date:
  • Size: 8.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for beyond_sort-0.1.0.tar.gz
Algorithm Hash digest
SHA256 0b659439cf46cbe6e4dc46fcc609e7c978e3dfb154af7d9b5bc1649e9f5b3103
MD5 05e84e380f9f252d6dc4f01d30c896b9
BLAKE2b-256 e5c95ce72aa62bc8925527085311a5a52d069a5afb6310acd7abf82d6ece3b82

See more details on using hashes here.

File details

Details for the file beyond_sort-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: beyond_sort-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 6.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for beyond_sort-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 9bfd91dc235be2dc585b1648a5b24aa7c3fa9acc6ad362d21f84ef5e74ff5c69
MD5 d21581706d18457903d7d91899223e7c
BLAKE2b-256 0dfab21c6c38094f4d3fba11150005245b152b96c95c33ebb490b645de0dc4ae

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page