High-performance exchange feed parser and orderflow analytics engine with Rust and Python bindings

OrderPulse / fastreader

OrderPulse packages a Rust parser and order book engine as a Python extension module named fastreader. The library is built for binary exchange feed files containing:

  • order messages: N (new), M (modify), X (cancel)
  • trade messages: T

The Python API exposed from src/lib.rs centers on three classes:

  • ReadMsgFromBinary: load and query parsed messages
  • MessageBatch: hold a selected subset of messages
  • OrderBookBuilder: replay messages into a top-of-book snapshot stream

Install and Import

Build the extension and install it into the active virtual environment with maturin:

maturin develop

Then import it in Python:

from fastreader import ReadMsgFromBinary, OrderBookBuilder

Quick Start

from fastreader import ReadMsgFromBinary, OrderBookBuilder

reader = ReadMsgFromBinary("/path/to/feed.bin")

print(reader.total_messages())
print(reader.total_orders())
print(reader.total_trades())

first_ten = reader.get_all_messages(limit=10)
for line in first_ten:
    print(line)

orders = reader.select_order_messages()
print(orders.len())
print(orders.to_list(limit=5))

builder = OrderBookBuilder()
rows_printed = builder.create_orderbook_all_messages(reader)
print(rows_printed)

To keep only one instrument token while loading:

reader = ReadMsgFromBinary("/path/to/feed.bin", token=26000)

Public API

ReadMsgFromBinary(path, token=None)

Loads the full binary file once, parses every recognized packet, and stores the result in memory.

Parameters:

  • path: path to the binary feed file
  • token: optional instrument token filter applied after parsing

Behavior:

  • opens the file and memory-maps it for fast sequential access
  • scans packet-by-packet and recognizes T, N, M, and X
  • converts little-endian numeric fields into native Rust values
  • stores parsed packets in an internal Vec<Message>
  • if token is provided, keeps only messages whose token matches
  • initializes an internal cursor used by incremental read helpers
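
The loading steps above can be approximated in plain Python with the mmap and struct modules. This is only a sketch: the fixed-size record layout (u32 sequence number, u32 token, i32 price) and the load helper are invented for illustration and do not reflect the real wire format or the Rust implementation.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record, NOT the real feed layout:
# u32 seq_no, u32 token, i32 price ("<" means little-endian, no padding).
RECORD = struct.Struct("<IIi")

def load(path, token=None):
    """Memory-map path, decode every record, then apply the token filter."""
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Decode everything first; the filter runs after parsing.
            parsed = list(RECORD.iter_unpack(mm))
    if token is None:
        return parsed
    return [rec for rec in parsed if rec[1] == token]

# Write a tiny demo file with three records.
fd, path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as fh:
    for rec in [(1, 26000, 10050), (2, 26001, 10055), (3, 26000, 10045)]:
        fh.write(RECORD.pack(*rec))

everything = load(path)
only_26000 = load(path, token=26000)
os.remove(path)
print(len(everything), only_26000)
```

Because the filter runs on the already decoded list, passing token reduces what is kept in memory afterwards but not the cost of the initial scan.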

total_messages()

Returns the total number of parsed messages currently stored.

total_orders()

Counts messages whose variant is Message::Order.

This includes all order-side packet types represented by the parser: new, modify, and cancel.

total_trades()

Counts messages whose variant is Message::Trade.

summary()

Prints three lines to standard output:

  • total messages
  • total order messages
  • total trade messages

Use this when you want a simple console summary instead of a returned Python value.

reset_cursor()

Resets the internal cursor back to the first stored message.

This matters for the incremental helpers:

  • get_next_msg()
  • select_next_messages(limit)

get_all_messages(limit=None)

Returns formatted strings for all stored messages, optionally capped by limit.

Use this when you want a Python list immediately and do not need to keep a reusable batch object.

get_order_messages(limit=None)

Returns formatted strings for order messages only.

Internally it filters the in-memory message list for Message::Order, then formats each item.

get_trade_messages(limit=None)

Returns formatted strings for trade messages only.

Internally it filters for Message::Trade, then formats each item.

get_next_msg()

Returns the next formatted message string and advances the internal cursor by one.

When the cursor reaches the end, it returns the literal string "END".

Use this for sequential consumption from Python without slicing the entire data set each time.
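
Since the end of the stream is signalled by a sentinel string rather than an exception, the two-argument form of Python's built-in iter() fits this method well. The CursorReader class below is a stand-in written for this example so the snippet runs without the extension; it only mimics the cursor behaviour described above.

```python
class CursorReader:
    """Stand-in for the documented cursor behaviour (not the real class)."""

    def __init__(self, lines):
        self._lines = list(lines)
        self._pos = 0

    def get_next_msg(self):
        # Past the last message the sentinel string "END" is returned.
        if self._pos >= len(self._lines):
            return "END"
        line = self._lines[self._pos]
        self._pos += 1
        return line

    def reset_cursor(self):
        self._pos = 0

reader = CursorReader(["msg-1", "msg-2", "msg-3"])

# iter(callable, sentinel) keeps calling get_next_msg() until it returns "END".
seen = list(iter(reader.get_next_msg, "END"))
print(seen)

# The cursor is now at the end; reset_cursor() makes the data readable again.
after_end = reader.get_next_msg()
reader.reset_cursor()
first_again = reader.get_next_msg()
print(after_end, first_again)
```

With the real reader the same idiom would read for msg in iter(reader.get_next_msg, "END").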

select_all_messages()

Returns a MessageBatch containing every stored message.

Use this when you want to keep a reusable selection and convert it later with to_list().

select_order_messages()

Returns a MessageBatch containing only order messages.

select_trade_messages()

Returns a MessageBatch containing only trade messages.

select_next_messages(limit)

Returns the next limit messages as a MessageBatch and advances the internal cursor.

This is the batch-oriented counterpart to get_next_msg().
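
A paging loop built on this method might look like the sketch below. Batch and PagedReader are stand-ins written for this example. The end-of-data behaviour (a shorter final batch, then an empty one) is an assumption, not something this document states, so it is worth checking against the real extension.

```python
class Batch:
    """Stand-in exposing the MessageBatch methods listed in this document."""

    def __init__(self, items):
        self._items = list(items)

    def len(self):
        return len(self._items)

    def is_empty(self):
        return not self._items

    def to_list(self, limit=None):
        return self._items[:limit]

class PagedReader:
    """Stand-in reader with an internal cursor."""

    def __init__(self, items):
        self._items = list(items)
        self._pos = 0

    def select_next_messages(self, limit):
        # Assumption: a short or empty batch is returned near the end.
        chunk = self._items[self._pos:self._pos + limit]
        self._pos += len(chunk)
        return Batch(chunk)

reader = PagedReader([f"msg-{i}" for i in range(7)])
sizes = []
while True:
    batch = reader.select_next_messages(3)
    if batch.is_empty():
        break
    sizes.append(batch.len())
print(sizes)
```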

MessageBatch

MessageBatch is a lightweight container around a selected slice of parsed messages.

It is useful when you want to:

  • keep a filtered subset
  • page through data in chunks
  • pass only a subset into OrderBookBuilder

len()

Returns the number of messages held by the batch.

is_empty()

Returns True when the batch contains no messages.

to_list(limit=None)

Formats the stored messages into Python strings, optionally capped by limit.

This does not re-read the file. It only formats the messages already stored in the batch.

OrderBookBuilder()

Creates a builder that replays parsed messages through the in-memory order book engine.

The builder itself is stateless. Each build call creates a fresh OrderBookManager internally, so book state does not carry over from one call to the next.

create_orderbook_all_messages(reader)

Consumes all messages stored in a ReadMsgFromBinary instance.

Behavior:

  • replays order messages into the order book
  • applies trade messages as quantity reductions against buy and sell order ids
  • after each message, asks for the top 5 bid and ask levels of the affected token
  • prints one CSV row only when both sides have at least 5 populated levels
  • returns the number of printed rows

create_orderbook(batch)

Same behavior as create_orderbook_all_messages, but the input is a MessageBatch.

Use this when you want to build the book from:

  • only order messages
  • only trade messages
  • a cursor window from select_next_messages()
  • any user-selected subset

Formatted Message Output

Most reader methods return strings produced by the internal format_message function in src/lib.rs.

Order messages are formatted like:

Order Message: SeqNo..., msg_len..., Msg_Type'...', Exch_ts..., local_ts..., order_id..., Token..., order_Type'...', Price..., Quantity..., missed...

Trade messages are formatted like:

Trade Message: SeqNo..., msg_len..., Msg_Type'...', Exch_ts..., local_ts..., order_id_buy..., order_id_sell..., Token..., Price..., Quantity..., missed...

Field meanings:

  • SeqNo: stream sequence number from the packet header
  • msg_len: encoded packet byte length
  • Msg_Type: raw feed message code such as N, M, X, or T
  • Exch_ts: exchange timestamp inside the payload
  • local_ts: local timestamp attached to the packet
  • order_id: order identifier for order messages
  • order_id_buy: buy order identifier for trade messages
  • order_id_sell: sell order identifier for trade messages
  • Token: instrument token
  • order_Type: side code for order messages, typically buy or sell
  • Price: order price or trade price
  • Quantity: order quantity or traded quantity
  • missed: gap flag rendered as 0 or 1

Architecture

1. Binary parser

The parser lives in src/read_trd_ord_only.rs.

Flow:

  • the file is opened and memory-mapped using memmap2
  • the buffer is scanned sequentially from the first byte to the last
  • spaces are skipped
  • a small PeekStructure is read first to inspect msg_type
  • based on the type, the parser reads either an OrderPacket or TradePacket
  • little-endian fields are converted to host-endian values
  • the parsed packet is wrapped in the Message enum from src/structure.rs
  • unknown bytes trigger one-byte resynchronization and the parser keeps scanning
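
The scan loop described above (skip spaces, peek at the type, dispatch, resynchronize on unknown bytes) can be sketched in Python. The two packet layouts are hypothetical and much smaller than real packets, and the type byte is placed first purely to keep the example short.

```python
import struct

# Hypothetical layouts for illustration only; the real packets differ.
ORDER = struct.Struct("<cIQ")   # msg_type, seq_no, order_id
TRADE = struct.Struct("<cIQQ")  # msg_type, seq_no, buy_id, sell_id

def scan(buf):
    """Return (messages, skipped_bytes) for a bytes-like buffer."""
    view = memoryview(buf)
    pos, messages, skipped = 0, [], 0
    while pos < len(view):
        code = bytes(view[pos:pos + 1])   # peek at the type byte
        if code == b" ":
            pos += 1                      # spaces are skipped
        elif code in (b"N", b"M", b"X") and pos + ORDER.size <= len(view):
            _, seq_no, order_id = ORDER.unpack_from(view, pos)
            messages.append(("order", code.decode(), seq_no, order_id))
            pos += ORDER.size
        elif code == b"T" and pos + TRADE.size <= len(view):
            _, seq_no, buy_id, sell_id = TRADE.unpack_from(view, pos)
            messages.append(("trade", "T", seq_no, buy_id, sell_id))
            pos += TRADE.size
        else:
            skipped += 1                  # unknown byte: resync one byte ahead
            pos += 1
    return messages, skipped

data = (ORDER.pack(b"N", 1, 501) + b"  " + b"\xff"
        + TRADE.pack(b"T", 2, 501, 777) + ORDER.pack(b"X", 3, 501))
messages, skipped = scan(data)
print(messages, skipped)
```

One-byte resynchronization keeps the scan alive on corrupt input, at the cost that a stray byte matching a type code could be decoded as a packet. The sketch shares that property.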

Recognized message types:

  • T: trade packet
  • N: new order
  • M: modify order
  • X: cancel order

Debugging support:

  • ORDERPULSE_DEBUG=1 enables parser debug logs
  • ORDERPULSE_DEBUG_LIMIT=<n> limits how many debug lines are emitted
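
The variables can also be set from Python rather than from the shell. The sketch below assumes they are read when the file is parsed, so it sets them before constructing the reader. The value 20 and the feed.bin path are placeholders, and the import is guarded so the snippet still runs where the extension is not built.

```python
import os

# Assumption: the parser reads these when a file is loaded, so set them
# before constructing the reader (and, to be safe, before the import).
os.environ["ORDERPULSE_DEBUG"] = "1"
os.environ["ORDERPULSE_DEBUG_LIMIT"] = "20"   # placeholder limit

try:
    from fastreader import ReadMsgFromBinary
except ImportError:
    ReadMsgFromBinary = None  # extension not built in this environment

# "feed.bin" is a placeholder path; nothing happens if it is missing.
if ReadMsgFromBinary is not None and os.path.exists("feed.bin"):
    reader = ReadMsgFromBinary("feed.bin")
    reader.summary()
```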

2. In-memory message model

The packet structures are defined in src/structure.rs.

Key types:

  • StreamHeader: packet header containing length, stream id, and sequence number
  • OrderMessage: payload for order events
  • TradeMessage: payload for trade events
  • OrderPacket and TradePacket: payload plus header, local timestamp, and flags
  • Message: enum wrapping either packet type

ReadMsgFromBinary stores Vec<Message>, and every Python-facing query method works from this in-memory vector.
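
As a rough Python analogue of this model, the enum can be pictured as a union of two dataclasses, with the query methods as filters over one list. The field names follow the formatted output described earlier. The integer types and the two helper functions are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

@dataclass
class Order:
    seq_no: int
    msg_type: str      # "N", "M" or "X"
    order_id: int
    token: int
    price: int
    quantity: int

@dataclass
class Trade:
    seq_no: int
    order_id_buy: int
    order_id_sell: int
    token: int
    price: int
    quantity: int

# Plays the role of the Rust Message enum.
Message = Union[Order, Trade]

def total_orders(messages: List[Message]) -> int:
    """Count messages of the Order variant."""
    return sum(isinstance(m, Order) for m in messages)

def select_trades(messages: List[Message],
                  limit: Optional[int] = None) -> List[Trade]:
    """Filter for the Trade variant, optionally capped by limit."""
    trades = [m for m in messages if isinstance(m, Trade)]
    return trades[:limit]

store: List[Message] = [
    Order(1, "N", 501, 26000, 10050, 75),
    Order(2, "N", 502, 26000, 10060, 40),
    Trade(3, 501, 502, 26000, 10060, 40),
    Order(4, "X", 501, 26000, 10050, 0),
]
print(total_orders(store), len(select_trades(store)))
```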

3. Order book engine

The order book engine lives in src/orderbook.rs.

Core ideas:

  • one OrderBook is maintained per token
  • active orders are indexed by order_id
  • price levels are aggregated into bid and ask arrays
  • the dynamic price range expands when a new price falls outside the current window
  • order messages add, modify, or cancel orders
  • trade messages reduce quantity on both the buy and sell order ids
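
A toy single-token version of these ideas is sketched below. It deliberately swaps the price-indexed arrays and dynamic price window for plain dictionaries, and the side codes "B" and "S" are placeholders, so treat it as an illustration of the bookkeeping rather than of the Rust engine.

```python
from collections import defaultdict

class Book:
    """Toy book: orders indexed by id plus aggregated price levels."""

    def __init__(self):
        self.orders = {}                       # order_id -> [side, price, qty]
        self.levels = {"B": defaultdict(int),  # price -> total quantity
                       "S": defaultdict(int)}

    def _adjust(self, side, price, delta):
        level = self.levels[side]
        level[price] += delta
        if level[price] <= 0:
            del level[price]                   # drop empty levels

    def on_order(self, msg_type, order_id, side, price, qty):
        old = self.orders.pop(order_id, None)
        if old is not None:                    # M and X first remove the old state
            self._adjust(old[0], old[1], -old[2])
        if msg_type in ("N", "M"):
            self.orders[order_id] = [side, price, qty]
            self._adjust(side, price, qty)

    def on_trade(self, buy_id, sell_id, qty):
        for order_id in (buy_id, sell_id):
            order = self.orders.get(order_id)
            if order is None:
                continue                       # unknown id: nothing to reduce
            filled = min(qty, order[2])
            order[2] -= filled
            self._adjust(order[0], order[1], -filled)
            if order[2] == 0:
                del self.orders[order_id]

book = Book()
book.on_order("N", 1, "B", 100, 10)
book.on_order("N", 2, "S", 101, 8)
book.on_order("M", 1, "B", 99, 10)   # re-price the bid
book.on_trade(1, 2, 3)               # reduces both orders by 3
book.on_order("X", 2, "S", 101, 0)   # cancel the ask
print(dict(book.levels["B"]), dict(book.levels["S"]))
```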

Top-of-book extraction:

  • bids are scanned from highest price downward
  • asks are scanned from lowest price upward
  • the midpoint is computed from the best bid and best ask
  • the builder prints a row only if at least 5 bid levels and 5 ask levels exist
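
The extraction and the five-level gate can be sketched as follows. The arithmetic-mean midpoint and the column order of the row are assumptions, since this document does not specify either, and the helper names are made up for the example.

```python
def top_levels(levels, depth, highest_first):
    """Return up to depth (price, qty) pairs, best price first."""
    prices = sorted(levels, reverse=highest_first)[:depth]
    return [(p, levels[p]) for p in prices]

def snapshot_row(bids, asks, depth=5):
    """Build one CSV row, or None when either side has fewer than depth levels."""
    top_bids = top_levels(bids, depth, highest_first=True)
    top_asks = top_levels(asks, depth, highest_first=False)
    if len(top_bids) < depth or len(top_asks) < depth:
        return None
    # Assumed midpoint: arithmetic mean of best bid and best ask.
    mid = (top_bids[0][0] + top_asks[0][0]) / 2
    cells = [mid] + [v for pair in top_bids + top_asks for v in pair]
    return ",".join(str(c) for c in cells)

bids = {100: 5, 99: 7, 98: 2, 97: 9, 96: 1}
asks = {101: 4, 102: 6, 103: 3, 104: 8}
first = snapshot_row(bids, asks)     # only 4 ask levels so far
asks[105] = 2
second = snapshot_row(bids, asks)    # both sides now have 5 levels
print(first)
print(second)
```

A driver that counts only the non-None rows would reproduce the documented return value of the builder methods.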

4. Python binding layer

The PyO3 bindings are declared in src/lib.rs.

The exported Python module is named fastreader, not OrderPulse.

That means Python code should use:

from fastreader import ReadMsgFromBinary, MessageBatch, OrderBookBuilder

Typical Usage Patterns

Inspect a file quickly

from fastreader import ReadMsgFromBinary

reader = ReadMsgFromBinary("feed.bin")
reader.summary()
print(reader.get_all_messages(limit=3))

Work on a single token

reader = ReadMsgFromBinary("feed.bin", token=12345)
print(reader.total_orders())
print(reader.get_trade_messages(limit=10))

Stream through messages incrementally

reader = ReadMsgFromBinary("feed.bin")

while True:
    msg = reader.get_next_msg()
    if msg == "END":
        break
    print(msg)

Build an order book from a subset

from fastreader import ReadMsgFromBinary, OrderBookBuilder

reader = ReadMsgFromBinary("feed.bin")
batch = reader.select_next_messages(50_000)

builder = OrderBookBuilder()
rows = builder.create_orderbook(batch)
print(rows)

Notes and Limitations

  • ReadMsgFromBinary loads the full parsed result into memory
  • methods returning strings are convenience views, not raw packet objects
  • summary() and order book builders print to standard output
  • order book row generation is currently hard-coded to 5 bid and 5 ask levels
  • rows are skipped until both sides have at least 5 populated levels
  • trade token is stored as i32 in the wire struct and converted to u32 for filtering and book lookup
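
Because summary() and the builders print rather than return their text, collecting that output in Python means capturing standard output. If the extension writes at the operating-system level (as Rust's println! does), contextlib.redirect_stdout will not see it, but a file-descriptor redirect will. The sketch below shows the idea with a stand-in function. capture_fd_stdout is a helper written for this example, not part of fastreader.

```python
import os
import sys
import tempfile

def capture_fd_stdout(func):
    """Run func() and return (result, text written to file descriptor 1)."""
    sys.stdout.flush()
    saved_fd = os.dup(1)                  # remember the real stdout
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 1)          # point fd 1 at the temp file
        try:
            result = func()
            sys.stdout.flush()
        finally:
            os.dup2(saved_fd, 1)          # restore the real stdout
            os.close(saved_fd)
        tmp.seek(0)
        text = tmp.read().decode("utf-8", errors="replace")
    return result, text

def fake_builder_call():
    # Stand-in for builder.create_orderbook_all_messages(reader):
    # writes two rows straight to fd 1 and returns the row count.
    os.write(1, b"row-1\nrow-2\n")
    return 2

rows_printed, text = capture_fd_stdout(fake_builder_call)
rows = text.splitlines()
print(rows_printed, rows)
```

With the real extension, func would be a lambda wrapping the builder call. For large files, redirecting the process output to a file from the shell is likely simpler.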

Internal Modules

Besides the main parser and order book classes, src/lib.rs also exposes orderbook_processing as a Rust module. Its cycle-count benchmarking helpers live in src/orderbook_processing.rs, and the low-level timing utilities live in src/tsc.rs.

Those helpers are separate from the main Python fastreader message-reading API.
