Adaptive ultra-sort: the fastest general-purpose sorting library for Python
Project description
quill-sort
Adaptive Ultra-Sort — the fastest general-purpose sorting library for Python.
from quill import quill_sort, quill_sorted
quill_sort([3, 1, 4, 1, 5, 9]) # → [1, 1, 3, 4, 5, 9]
quill_sort(records, key=lambda r: r['age']) # sort objects
quill_sort(big_data, parallel=True) # use all CPU cores
result = quill_sorted(iterable, reverse=True) # mirrors built-in sorted()
Installation
pip install quill-sort # core (no dependencies)
pip install quill-sort[fast] # + numpy + psutil for max speed
pip install quill-sort[all] # + numpy + pandas + psutil
How it works
Quill profiles your data at intake and routes to the optimal algorithm automatically:
| Data type | Strategy | Complexity |
|---|---|---|
| Dense integer range | np.bincount counting sort |
O(n + k) |
| uint8 / uint16 | Radix sort (kind='stable', 1-2 passes) |
O(n) |
| int32 | Stable radix (default) or introsort (stable=False) |
O(n log n) |
| int64 / float64 | Stable radix (default) or heapsort (stable=False) |
O(n log n) |
| Strings / bytes | Python Timsort | O(n log n) |
| Objects with key | Rank-encode → numpy argsort | O(n log n) |
| Pre-sorted | Early exit after O(sample) probe | O(sample) |
| Nearly sorted | Timsort bypass (asc_ratio > 0.90) | O(n) |
| Reverse-sorted | Single .reverse() call |
O(n/2) |
| Parallel (4+ cores) | MSD radix, shared-memory scatter | O(n) |
| External (> RAM) | Radix-partition + parallel bucket sort | disk-bound |
API
quill_sort(
data, # list, generator, range, ndarray, Series, DataFrame
key=None, # sort key function (like sorted())
reverse=False, # descending order
inplace=True, # mutate data in-place (False = return new list)
parallel=False, # use all CPU cores
stable=True, # True = matches sorted() exactly; False = max speed
high_performance_mode=False, # skip interactive prompts for external sort
silent=False, # suppress status output
stats=False, # return (sorted_list, stats_dict)
)
stable parameter (new in v5)
By default, stable=True guarantees that equal elements maintain their original relative order — identical to Python's sorted(). Set stable=False for maximum speed on numeric data (uses benchmark-optimal sort algorithms that may reorder equal elements).
# Stable (default) — matches sorted() exactly
quill_sort(data) # safe for all use cases
# Unstable — faster for large numeric datasets
quill_sort(data, stable=False) # ~2-20x faster for int32/int64
analyze() — inspect your data's profile
from quill import analyze
info = analyze([3, 1, 4, 1, 5, 9])
# {'n': 6, 'dtype': 'int_pos', 'presorted': False, 'dense': True, ...}
stats — timing information
result, stats = quill_sort(data, stats=True)
# stats = {'time_ms': 12.3, 'n': 1000000}
Stability guarantee
quill_sort with stable=True (the default) is a stable sort: equal elements preserve their original relative order. This matches the contract of Python's built-in sorted() and list.sort().
Special value handling
None values are sorted to the end of the list (or the beginning with reverse=True):
quill_sort([3, None, 1, None, 2])
# → [1, 2, 3, None, None]
NaN values in float data are placed at the end (or beginning with reverse=True):
quill_sort([3.0, float('nan'), 1.0, 2.0])
# → [1.0, 2.0, 3.0, nan]
Mixed int/float lists are automatically promoted to float64:
quill_sort([1, 2.5, 3, 4.0])
# → [1, 2.5, 3, 4.0] (fast float path, not slow object fallback)
v5 — What changed
Sort stability fix (breaking): v4 used quicksort for int32 and heapsort for int64 — both unstable. v5 defaults to stable=True which uses kind='stable' for all dtypes, matching sorted() exactly. Use stable=False for the old fast-but-unstable behavior.
New features:
stableparameter for explicit stability controlstatsparameter for timing infoanalyze()API for data profiling- Float NaN handling (stripped, sorted, reinserted)
- Mixed int/float promotion to float64
- Nearly-sorted Timsort bypass (O(n) for ~sorted data)
- Parallel float sort via shared-memory numpy
- Cache-aware counting sort (L2-sized range → always counting sort)
- Adaptive parallel threshold (scales with core count)
- Thread-safe plugin registry and process pool
- 138-test suite with hypothesis property-based testing
v4 — Previous improvements
Benchmark-proven optimal sort algorithm per dtype:
| dtype | v3 (wrong) | v4 (correct) | Sort-phase speedup |
|---|---|---|---|
| uint8 | stable |
stable |
— (already optimal) |
| uint16 | stable |
stable |
— (already optimal) |
| int32 | stable |
default (introsort) |
17-20× |
| int64 | stable |
heapsort |
8-15× |
| float64 | stable |
heapsort |
8-10× |
Other v4 improvements:
- Heavy-key detection in parallel MSD radix (handles Zipf/skewed distributions up to 2.3× faster)
- Parallel MSD radix sort via shared memory — zero-copy IPC, no pickling of large arrays
- All bucket sorts in external engine now use correct
kindper dtype - Batch sorts in streaming k-way merge corrected
Supported types
int,float,str,bytes— native fast paths- Mixed
int+float— automatically promoted to float64 - Negative integers — automatically shifted to non-negative before sorting
Nonevalues — filtered out, sorted, reinserted at endfloat('nan')— filtered out, reinserted at endpandas.Series— sorted and returned as a new Seriespandas.DataFrame— sorted by column(s) viakey='column_name'numpy.ndarray— sorted in-place via numpy directly- Any generator or iterator — materialised to list, sorted, returned
Plugin system
from quill import register_plugin
from quill._plugins import QuillPlugin
class MyPlugin(QuillPlugin):
handles = (MyCustomClass,)
name = "my_custom_class"
@staticmethod
def prepare(data, key, reverse):
items = [x.value for x in data]
postprocess = lambda sorted_list: [MyCustomClass(v) for v in sorted_list]
return items, key, postprocess
register_plugin(MyPlugin)
quill_sort(list_of_my_objects)
CLI / Demo
python -m quill # run the benchmark demo
quill # same, if installed via pip
Performance
Benchmarked on Windows 11, 28-core CPU, numpy installed:
| n | Time | Notes |
|---|---|---|
| 1,000 | ~0.0001s | Python sort / timsort |
| 100,000 | ~0.002s | numpy stable sort |
| 1,000,000 | ~0.010s | numpy stable sort |
| 10,000,000 | ~0.08s | numpy stable sort |
| 100,000,000 | ~0.8s | numpy stable sort |
| 1,000,000,000 | ~39s | external: radix-partition + 28-core parallel bucket sort |
The 1B result uses Quill's external sort engine: radix-partitions into 256 buckets on disk, sorts all buckets in parallel across available CPU cores, then concatenates — no merge pass needed.
Note: list↔numpy conversion overhead dominates for medium sizes. For maximum throughput on large datasets, pass numpy arrays directly (the NumpyArrayPlugin handles this automatically).
Performance tuning
- Install numpy:
pip install quill-sort[fast]— 10-100× faster for numeric data. stable=False: 2-20× faster for int32/int64 when you don't need stability.parallel=True: Force parallel for datasets under the auto-threshold.- Install psutil: Accurate RAM sensing prevents unnecessary external sort triggers.
high_performance_mode=True: Skip interactive prompts in production.
Troubleshooting
- "quill_sort is slower than expected" — Install numpy:
pip install quill-sort[fast]. - "External sort triggered unexpectedly" — Install psutil for accurate RAM sensing. Without it, Quill assumes only 2GB available.
- "Parallel sort not engaging" — Auto-parallel requires 4+ cores and large datasets. Use
parallel=Trueto force it. - "Equal elements reordered" — Ensure
stable=True(the default). If usingstable=False, equal elements may reorder.
Development
git clone https://github.com/invariant-games/quill.git
cd quill
pip install -e ".[all,dev]"
pytest # run all tests
pytest -m slow # parallel/large data tests
python -m quill # run benchmark demo
Requirements
- Python 3.8+
numpyoptional but strongly recommended —pip install numpypsutiloptional, for accurate RAM sensing —pip install psutilpandasoptional —pip install pandas
License
MIT — by Isaiah Tucker
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file quill_sort-5.0.0.tar.gz.
File metadata
- Download URL: quill_sort-5.0.0.tar.gz
- Upload date:
- Size: 45.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
72f129b32bc6332522cdb58188e33f39c14c3a57c3c7f69f29339ecdfe950bc5
|
|
| MD5 |
de16abef664d513360be1aa4c9896fc3
|
|
| BLAKE2b-256 |
ead4fe13e0a993f090182da623c288bf21744c84acc8ebf9aca555bd84623d52
|
File details
Details for the file quill_sort-5.0.0-py3-none-any.whl.
File metadata
- Download URL: quill_sort-5.0.0-py3-none-any.whl
- Upload date:
- Size: 43.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
f82d9228390853499f881cbbce2212f4246ec38022817c15de3b38899ca4dbef
|
|
| MD5 |
d771bfe889c1a34d491c00b7e4d94bc7
|
|
| BLAKE2b-256 |
668f6e0fa480b6855eaf8e5447386247465b9aac3de5d002693b618f2e336771
|