Entropy-driven adaptive sorting that beats C-level np.sort() on real-world data distributions.
Project description
beyond-sort
Entropy-driven adaptive sorting that beats C-level np.sort() on real-world data.
One line. Automatic. Faster than C.
import beyond_sort
result = beyond_sort.sort(data) # auto-selects optimal strategy
strategy = beyond_sort.probe(data) # inspect which strategy would be used
detail = beyond_sort.probe_detail(data) # full entropy analysis
Why is this faster than np.sort()?
np.sort() uses a single C-level introsort for all data. beyond-sort analyzes your data's structure in a single pass, then routes to the mathematically optimal algorithm compiled to native machine code:
| Data Pattern | np.sort() strategy | beyond-sort strategy | Why it's faster |
|---|---|---|---|
| Random | Introsort (generic) | Parallel radix (value-range partitioned) | Multi-core + zero-merge |
| Nearly sorted | Introsort (generic) | Parallel radix (exploits narrow value range) | Fewer passes needed |
| Reversed | Introsort (O(n log n)) | O(n) reverse copy | Detects structure, skips sorting |
| Few unique values | Introsort (O(n log n)) | Parallel counting sort | O(n+k) with segmented histogram |
| Pipe organ (↑↓) | Introsort (generic) | Parallel radix (value-range partitioned) | Partitions align with structure |
Benchmarks
Tested on 32-core CPU, N = 100,000,000 (100 million integers):
| Scenario | np.sort() | beyond-sort | Speedup |
|---|---|---|---|
| random | 1731 ms | 1205 ms | 1.44× |
| nearly_sorted | 1765 ms | 1090 ms | 1.62× |
| reversed | 1726 ms | 240 ms | 7.19× |
| few_unique | 491 ms | 459 ms | 1.07× |
| pipe_organ | 1911 ms | 1008 ms | 1.90× |
| sawtooth | 1584 ms | 932 ms | 1.70× |
vs Python sorted(): up to 16× faster.
The advantage grows with data size — at 10M elements: 1.1× faster; at 100M: 1.4-7.2× faster.
Architecture: 4-Layer Optimization
┌─────────────────────────────────────────────────┐
│ Layer 1: Symbol Entropy probe (τ, runs, │
│ value range) → strategy route │
├─────────────────────────────────────────────────┤
│ Layer 2: Syntax Numba @njit → LLVM native │
│ machine code, zero interpreter│
├─────────────────────────────────────────────────┤
│ Layer 3: Instr. NumPy SIMD vectorization, │
│ cache-friendly memory access │
├─────────────────────────────────────────────────┤
│ Layer 4: Parallel Value-range partitioning → │
│ prange multi-core, zero-merge │
└─────────────────────────────────────────────────┘
Key insight: Value-range partitioning
Instead of splitting the array by position (which requires merging), we split by value range. Each partition contains a non-overlapping range of values. After sorting each partition independently on separate cores, the results concatenate directly — zero merge step.
Install
pip install beyond-sort
Requirements: Python 3.10+, NumPy, Numba.
When to use beyond-sort
- Sorting large integer arrays (100K+ elements)
- Data with predictable structure (logs, time-series, sensor data, IDs)
- Batch processing where sorting is a bottleneck
- Any scenario where
np.sort()isn't fast enough
API
beyond_sort.sort(data) → np.ndarray
Sort an integer array. Accepts list or numpy array. Returns sorted numpy array.
beyond_sort.probe(data) → str
Returns the strategy name without sorting. Useful for debugging.
beyond_sort.probe_detail(data) → dict
Full entropy analysis: tau (order degree), run count, value range, selected strategy.
beyond_sort.warmup()
Pre-compile all JIT paths. Call once at startup for consistent benchmark results.
License
Apache-2.0 — free to use commercially, patent rights reserved.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file beyond_sort-0.1.0.tar.gz.
File metadata
- Download URL: beyond_sort-0.1.0.tar.gz
- Upload date:
- Size: 8.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0b659439cf46cbe6e4dc46fcc609e7c978e3dfb154af7d9b5bc1649e9f5b3103
|
|
| MD5 |
05e84e380f9f252d6dc4f01d30c896b9
|
|
| BLAKE2b-256 |
e5c95ce72aa62bc8925527085311a5a52d069a5afb6310acd7abf82d6ece3b82
|
File details
Details for the file beyond_sort-0.1.0-py3-none-any.whl.
File metadata
- Download URL: beyond_sort-0.1.0-py3-none-any.whl
- Upload date:
- Size: 6.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9bfd91dc235be2dc585b1648a5b24aa7c3fa9acc6ad362d21f84ef5e74ff5c69
|
|
| MD5 |
d21581706d18457903d7d91899223e7c
|
|
| BLAKE2b-256 |
0dfab21c6c38094f4d3fba11150005245b152b96c95c33ebb490b645de0dc4ae
|