Ultra-Low Latency C++ HFT Engine for Python by falcon7
Project description
SHAURYA: Scalable High-frequency Architecture for Ultra-low Response Yield Access
Shaurya is a high-frequency trading (HFT) market data feed handler engineered for sub-microsecond latency. By leveraging Zero-Copy parsing, Lock-Free concurrency, and Stack-based memory management, it bypasses the performance bottlenecks of standard software architectures to process financial data with deterministic speed.
⚡ Performance Impact & Comparison
Shaurya was benchmarked using high-resolution hardware timers (QueryPerformanceCounter).
| Implementation Approach | Average Latency | Min Latency | Why it's Slow/Fast? |
|---|---|---|---|
| Python Script | ~45.0 µs | ~30.0 µs | Interpreter overhead & Garbage Collection pauses. |
| Standard C++ (std::string) | ~5.0 µs | ~3.5 µs | Frequent Heap Allocations (malloc) & deep memory copying. |
| SHAURYA (Zero-Copy) | 1.88 µs* | 0.3 µs | Zero-Copy pointer arithmetic & Lock-Free queues. |
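The Result: Shaurya achieves a minimum internal reaction time of 300 nanoseconds. Against the Python script in the table above, that is roughly 100x lower minimum latency (0.3 µs vs. ~30 µs) and roughly 24x lower average latency (1.88 µs vs. ~45 µs).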
*Measured in Pure Mock Environment
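As a rough sketch of how average and minimum figures like those in the table can be collected, each message can be timed with a monotonic clock and the samples reduced afterwards. Shaurya is stated to use QueryPerformanceCounter on Windows; the portable std::chrono::steady_clock is substituted here as an assumption, and the names LatencyStats, summarize, and time_ns are hypothetical, not taken from the project.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Summary of per-message latencies, in nanoseconds.
struct LatencyStats {
    double avg_ns;
    std::int64_t min_ns;
    std::int64_t max_ns;
};

// Reduce raw samples to average, minimum, and maximum.
// Precondition: `samples` is non-empty.
inline LatencyStats summarize(const std::vector<std::int64_t>& samples) {
    assert(!samples.empty());
    std::int64_t total = 0;
    std::int64_t lo = samples.front();
    std::int64_t hi = samples.front();
    for (std::int64_t s : samples) {
        total += s;
        lo = std::min(lo, s);
        hi = std::max(hi, s);
    }
    return LatencyStats{static_cast<double>(total) / samples.size(), lo, hi};
}

// Time a single call of `work` with a monotonic clock.
// Assumption: steady_clock stands in for QueryPerformanceCounter.
template <typename Fn>
std::int64_t time_ns(Fn&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_ns once per message and passing the collected samples to summarize gives both columns of the table. Reporting the maximum alongside the average shows how much scheduler noise is hidden inside the mean.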
🌍 Real-World Validation: The "Fragmented Liquidity" Test
Shaurya was subjected to a 30-minute stress test aggregating live ticks from Binance, Coinbase, and Bitstamp simultaneously.
- Test Duration: 30 Minutes
- Total Messages: 21,862 (Live Volatility Bursts)
- Outcome: The engine normalized the fragmented liquidity streams in real time. Average latency increased under OS scheduler load (the cores were not isolated), but the minimum latency remained at 0.3 µs, which suggests the core engine stays efficient even during crypto market volatility.
🏗 Key Technical Innovations
1. Zero-Copy Architecture
Instead of copying network packets into new std::string objects (which typically triggers a heap allocation and a byte-for-byte copy per message), Shaurya uses a custom StringViewLite class. This creates a lightweight "view", a pointer plus a length, over the raw socket buffer, allowing the engine to parse prices in place without copying the payload.
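The idea can be illustrated with a minimal sketch. Only the class name StringViewLite comes from the project; its members, the helper functions field and parse_price_cents, and the comma-separated "symbol,price,quantity" message layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical non-owning view: a pointer and a length, nothing else.
class StringViewLite {
public:
    StringViewLite() : data_(nullptr), size_(0) {}
    StringViewLite(const char* data, std::size_t size) : data_(data), size_(size) {}

    const char* data() const { return data_; }
    std::size_t size() const { return size_; }
    bool empty() const { return size_ == 0; }

    // Sub-view into the same buffer; no bytes are copied.
    StringViewLite substr(std::size_t pos, std::size_t len) const {
        assert(pos <= size_);
        if (len > size_ - pos) len = size_ - pos;
        return StringViewLite(data_ + pos, len);
    }

    // Index of the first `c` at or after `from`, or size() if absent.
    std::size_t find(char c, std::size_t from = 0) const {
        if (from >= size_) return size_;
        const void* hit = std::memchr(data_ + from, c, size_ - from);
        return hit ? static_cast<std::size_t>(static_cast<const char*>(hit) - data_) : size_;
    }

private:
    const char* data_;
    std::size_t size_;
};

// Return the index-th comma-separated field as a view into `msg`.
// An empty view is returned when the message has too few fields.
inline StringViewLite field(StringViewLite msg, std::size_t index) {
    std::size_t start = 0;
    for (std::size_t i = 0; i < index; ++i) {
        const std::size_t comma = msg.find(',', start);
        if (comma == msg.size()) return StringViewLite();
        start = comma + 1;
    }
    const std::size_t end = msg.find(',', start);
    return msg.substr(start, end - start);
}

// Parse an unsigned decimal such as "42150.25" into integer hundredths
// (4215025) straight from the view. Digits beyond two decimal places are
// truncated; malformed or empty input yields -1.
inline long long parse_price_cents(StringViewLite text) {
    if (text.empty()) return -1;
    long long value = 0;
    int frac_digits = -1;  // -1 until the decimal point is seen
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text.data()[i];
        if (c == '.') {
            if (frac_digits >= 0) return -1;  // second decimal point
            frac_digits = 0;
            continue;
        }
        if (c < '0' || c > '9') return -1;
        if (frac_digits >= 2) continue;  // truncate extra precision
        value = value * 10 + (c - '0');
        if (frac_digits >= 0) ++frac_digits;
    }
    if (frac_digits < 0) frac_digits = 0;
    for (; frac_digits < 2; ++frac_digits) value *= 10;
    return value;
}
```

Given a receive buffer holding "BTCUSD,42150.25,3", field(msg, 1) returns a view that points into that same buffer, and parse_price_cents reads the digits in place. The view is only valid while the underlying buffer is alive and unmodified, which is the main trade-off of this approach.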
2. Lock-Free Concurrency (SPSC)
Traditional systems use mutex locks (std::mutex) to share data between threads; under contention a blocked thread is descheduled, and the resulting context switch is expensive. Shaurya implements a Single-Producer Single-Consumer (SPSC) ring buffer using std::atomic operations. This allows the Network Thread to push data and the Strategy Thread to read data concurrently without either thread blocking on a lock.
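A minimal sketch of such a ring buffer is shown here. It is not Shaurya's actual implementation: the class name SpscRing, the power-of-two capacity requirement, and the choice of acquire/release ordering are assumptions chosen to keep the example short and correct for exactly one producer thread and one consumer thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical single-producer/single-consumer ring buffer.
// Capacity must be a power of two so wrap-around is a bit mask.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    SpscRing() : head_(0), tail_(0) {}

    // Producer thread only. Returns false when the ring is full.
    bool try_push(const T& item) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        assert(tail - head <= Capacity);  // invariant: never over-full
        if (tail - head == Capacity) return false;
        slots_[tail & (Capacity - 1)] = item;
        // Release: the slot write becomes visible before the new tail.
        tail_.store(tail + 1, std::memory_order_release);
        return true;
    }

    // Consumer thread only. Returns false when the ring is empty.
    bool try_pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head == tail) return false;
        out = slots_[head & (Capacity - 1)];
        // Release: the slot read completes before the slot is handed back.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

private:
    T slots_[Capacity];
    // Monotonic counters; each index sits on its own 64-byte line.
    alignas(64) std::atomic<std::size_t> head_;  // next slot to read
    alignas(64) std::atomic<std::size_t> tail_;  // next slot to write
};
```

In use, one thread would call try_push in its receive loop while another calls try_pop in its strategy loop. Neither call ever waits on the other: a full or empty ring is reported by the return value, and the caller decides whether to spin, drop, or back off.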
3. CPU Cache Optimization
Critical data structures are aligned to 64-byte cache lines (alignas(64)). This prevents False Sharing, where threads on different cores write to separate variables that happen to share a cache line, so each write invalidates the other core's copy of the line and drastically reduces performance on multi-core systems.
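The effect of alignas(64) on layout can be checked at compile time. The sketch below is illustrative only: the struct names, the counter names, and the helper counter_gap are hypothetical, and 64 bytes is assumed as the cache-line size (common on x86-64, but not universal).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumed cache-line size

// Unpadded: both counters sit next to each other, so they will usually
// share one cache line and a write to either invalidates both.
struct PackedCounters {
    std::atomic<std::uint64_t> produced;
    std::atomic<std::uint64_t> consumed;
};

// Padded: each counter starts on its own 64-byte boundary.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<std::uint64_t> produced;
    alignas(kCacheLine) std::atomic<std::uint64_t> consumed;
};

static_assert(alignof(PaddedCounters) == kCacheLine,
              "struct must start on a cache-line boundary");
static_assert(sizeof(PaddedCounters) == 2 * kCacheLine,
              "each counter must occupy its own cache line");

// Byte distance between the two counters of one instance.
template <typename Counters>
std::size_t counter_gap(const Counters& c) {
    const char* a = reinterpret_cast<const char*>(&c.produced);
    const char* b = reinterpret_cast<const char*>(&c.consumed);
    assert(b > a);  // members are laid out in declaration order
    return static_cast<std::size_t>(b - a);
}
```

The padded layout trades memory (128 bytes instead of 16 for two counters) for the guarantee that the thread updating produced and the thread updating consumed never contend for the same line.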
🚀 Quick Start
Prerequisites
- OS: Windows (Required for winsock2 and QueryPerformanceCounter)
- Compiler: G++ (MinGW) supporting C++11 or higher.
Execution Guide
- Build the System: build.bat
- Start Data Source: python bridge.py
- Start Shaurya Engine: bin\Shaurya.exe
Upon completion, the engine generates a Shaurya_Metrics.txt report detailing the nanosecond-level performance of the run.
Resources
If you are new to High-Frequency Trading systems, these concepts explain the "Why" behind Shaurya's architecture:
- Latency vs. Jitter: Understand why "Average Speed" is useless in HFT.
- Zero-Copy Networking: How avoiding memory copies saves microseconds.
- Lock-Free Programming: An introduction to Atomics and Ring Buffers.
- False Sharing: The hidden killer of multi-threaded performance.
Developed by yours truly 🛩️!
File details
Details for the file hft_shaurya-0.1.0.tar.gz.
File metadata
- Download URL: hft_shaurya-0.1.0.tar.gz
- Upload date:
- Size: 3.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 30a2520812a4d29744794df130f2abe4b85aebe26036c89ed62b1996ce2a8737 |
| MD5 | 272b868974a3226dcb2a5fdf06415005 |
| BLAKE2b-256 | 5bee600c5634f1863b1d614a7d94e95ef0901422e7a9a76c4e194671d0fb6f36 |
File details
Details for the file hft_shaurya-0.1.0-py3-none-any.whl.
File metadata
- Download URL: hft_shaurya-0.1.0-py3-none-any.whl
- Upload date:
- Size: 3.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 530ff1ffa892a4f8cd2d3b63f6021f18baad1541bc878a96adbb1bf6e55a899a |
| MD5 | bcc1499385ce76af12e45efc8e8c2498 |
| BLAKE2b-256 | 2c7647de6c03959c9cf4b72672eaedac7cdace332ca4534dbb3a497d629aced3 |