
Nanobind/C++ parsers for massive, bulk S3, and websocket market data.


massive-speedup

Native C++/nanobind readers for Polygon/Massive flat-file market data.

See INSTALL.md for installation details and DEVELOPMENT.md for release and PyPI publishing notes.

CSV Gzip Files

Install/build the native extension:

pip3 install -e .

Iterate parsed records directly from a .csv.gz file:

import massive_speedup

for trade in massive_speedup.FlatFiles.Stock.Trade.parse("trades.csv.gz"):
    print(trade.ticker, trade.sip_timestamp, trade.price)

for quote in massive_speedup.FlatFiles.Stock.Quote.parse("quotes.csv.gz"):
    print(quote.ticker, quote.bid_price, quote.ask_price)

for quote in massive_speedup.FlatFiles.currency.Quote.parse("currency_quotes.csv.gz"):
    print(quote.ticker, quote.participant_timestamp)

You can also iterate raw CSV fields as bytes tuples:

for row in massive_speedup.FlatFiles.Stock.Trade.parse_raw("trades.csv.gz"):
    print(row[0], row[8])
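
As a rough illustration of what "raw CSV fields as bytes tuples" means, the sketch below reads a gzip-compressed CSV with the standard library and yields each row as a tuple of bytes. It is not the library's implementation: iter_raw_rows, the sample columns, and the choice to skip the header are assumptions made for this example, and the naive split does not handle quoted fields that contain commas.

```python
import gzip
import os
import tempfile


def iter_raw_rows(path):
    """Yield each data row of a .csv.gz file as a tuple of bytes fields.

    Hypothetical helper: it skips the header line and splits on commas
    without any CSV quoting rules.
    """
    with gzip.open(path, "rb") as handle:
        next(handle, None)  # assumption: the first line is a header
        for line in handle:
            yield tuple(line.rstrip(b"\r\n").split(b","))


# Tiny stand-in file with made-up columns, written to a temporary directory.
sample = b"ticker,price,size\nA,101.5,200\nA,101.6,50\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv.gz")
    with gzip.open(path, "wb") as handle:
        handle.write(sample)
    rows = list(iter_raw_rows(path))

# Fields stay as bytes until the caller decodes or converts them.
first = rows[0]
ticker = first[0].decode("ascii")
price = float(first[1])
print(ticker, price, first)
```

Keeping fields as bytes defers decoding and number parsing to the caller, which only pays for the columns it actually reads.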

Record Access

Parsed records expose read-only attributes and are iterable in CSV field order:

trade = next(massive_speedup.FlatFiles.Stock.Trade.parse("trades.csv.gz"))

print(trade.ticker)
print(trade.conditions)
print(trade.sip_timestamp)
print(trade.pack())
print(list(trade))

Packed records do not include the ticker. To reconstruct a record, pass the ticker separately (for database files, it is the file name):

packed = trade.pack()
trade2 = massive_speedup.StockTrade.from_packed(packed, trade.ticker)
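
The round trip above can be pictured with a small pure-Python stand-in. The field set and byte layout below are invented for illustration (the real binary format is not documented here), and SketchTrade is a hypothetical class, not part of massive_speedup. The point is only that the packed bytes carry the numeric fields while the ticker travels separately.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: little-endian int64 timestamp, float64 price, uint32 size.
_LAYOUT = struct.Struct("<qdI")


@dataclass(frozen=True)
class SketchTrade:
    ticker: str
    sip_timestamp: int
    price: float
    size: int

    def pack(self) -> bytes:
        """Serialize every field except the ticker into a fixed-length blob."""
        return _LAYOUT.pack(self.sip_timestamp, self.price, self.size)

    @classmethod
    def from_packed(cls, packed: bytes, ticker: str) -> "SketchTrade":
        """Rebuild a record from the blob plus a ticker supplied by the caller."""
        sip_timestamp, price, size = _LAYOUT.unpack(packed)
        return cls(ticker, sip_timestamp, price, size)


trade = SketchTrade("A", 1769161728012983416, 101.5, 200)
packed = trade.pack()
restored = SketchTrade.from_packed(packed, trade.ticker)
print(len(packed), restored == trade)
```

Leaving the ticker out keeps every record the same length, which is what makes one-file-per-ticker storage with constant-time indexing possible.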

Window Aggregation

The native aggregators consume iterables of parsed records and yield C++ result objects exposed through nanobind. Result attributes are read-only and lazily converted to Python objects on first access. The aggregation interval and offset are expressed in seconds; the returned window_start is still nanoseconds since epoch.

import massive_speedup

trades = massive_speedup.FlatFiles.Stock.Trade.parse("trades.csv.gz")

for bar in massive_speedup.FlatFiles.Stock.Trade.Aggregator(
    trades,
    interval_seconds=60,
):
    print(
        bar.ticker,
        bar.window_start,
        bar.open,
        bar.close,
        bar.high,
        bar.low,
        bar.avg,
        bar.volume_weighted_avg,
        bar.volume,
        bar.transactions,
        bar.stddev,
    )

Available aggregators:

  • massive_speedup.StockTradeAggregator / FlatFiles.Stock.Trade.Aggregator
  • massive_speedup.StockQuoteAggregator / FlatFiles.Stock.Quote.Aggregator
  • massive_speedup.CurrencyQuoteAggregator / FlatFiles.currency.Quote.Aggregator

Stock trades aggregate price and use size for volume and volume_weighted_avg. Stock quotes aggregate ask and bid prices separately and use ask/bid sizes for ask/bid volume-weighted averages. Currency quotes aggregate ask and bid prices separately and omit volume and volume-weighted averages because the source rows have no size field.

quotes = massive_speedup.StockQuoteDatabase("/data/massive-db", "2026-01-23", "A")

for quote_bar in massive_speedup.StockQuoteAggregator(
    quotes,
    interval_seconds=1,
    offset_seconds=0,
):
    print(quote_bar.ask_open, quote_bar.ask_close, quote_bar.bid_avg)

Aggregators stream consecutive (ticker, window_start) groups. Use input ordered by ticker and timestamp, such as the native database iterators or default Massive/Polygon flat-file order. stddev is population standard deviation.
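
The streaming behaviour described above can be approximated in plain Python. This is a sketch under stated assumptions, not the native code: the Trade tuple, window_start, and aggregate names are hypothetical, the offset is assumed to shift window boundaries, and the volume-weighted average is assumed to fall back to the plain average when volume is zero. It groups consecutive records by (ticker, window_start) and computes the same statistics listed for the trade aggregator, including population standard deviation.

```python
import math
from itertools import groupby
from typing import NamedTuple

NS_PER_SECOND = 1_000_000_000


class Trade(NamedTuple):
    ticker: str
    sip_timestamp: int  # nanoseconds since epoch
    price: float
    size: int


def window_start(timestamp_ns, interval_seconds, offset_seconds=0):
    """Floor a nanosecond timestamp to its window start (assumed offset semantics)."""
    interval = interval_seconds * NS_PER_SECOND
    offset = offset_seconds * NS_PER_SECOND
    return (timestamp_ns - offset) // interval * interval + offset


def aggregate(trades, interval_seconds, offset_seconds=0):
    """Yield one dict per run of consecutive records sharing (ticker, window_start)."""
    def key(trade):
        start = window_start(trade.sip_timestamp, interval_seconds, offset_seconds)
        return trade.ticker, start

    # groupby only merges adjacent records, hence the ordering requirement.
    for (ticker, start), group in groupby(trades, key=key):
        group = list(group)
        prices = [t.price for t in group]
        volume = sum(t.size for t in group)
        avg = sum(prices) / len(prices)
        variance = sum((p - avg) ** 2 for p in prices) / len(prices)  # population
        weighted = sum(t.price * t.size for t in group)
        yield {
            "ticker": ticker,
            "window_start": start,
            "open": prices[0],
            "close": prices[-1],
            "high": max(prices),
            "low": min(prices),
            "avg": avg,
            "volume_weighted_avg": weighted / volume if volume else avg,
            "volume": volume,
            "transactions": len(group),
            "stddev": math.sqrt(variance),
        }


trades = [
    Trade("A", 0, 10.0, 1),
    Trade("A", 30 * NS_PER_SECOND, 12.0, 3),
    Trade("A", 60 * NS_PER_SECOND, 11.0, 2),
    Trade("B", 0, 5.0, 1),
]
bars = list(aggregate(trades, interval_seconds=60))
for bar in bars:
    print(bar)
```

Because only adjacent records are merged, input that is not ordered by ticker and timestamp would produce several partial bars for the same window rather than an error.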

Build Database Files

Build fixed-length binary database files from one or more input .csv.gz files:

massive-speedup-build-database --database /data/massive-db 2026-01-23.csv.gz

The input type is inferred from the CSV header. Output layout is:

{database}/{stock_trade|stock_quote|currency_quote}/{YYYY-MM-DD}/{ticker}

Existing ticker files are not overwritten by default: when a ticker's file already exists, the builder skips ahead in the input to the next ticker, so only missing ticker files are written. Use --force to rebuild existing ticker files, which is useful after a binary record format change:

massive-speedup-build-database --force --database /data/massive-db 2026-01-23.csv.gz

Date-level idempotency uses an .incomplete marker in {database}/{type}/{YYYY-MM-DD}. If the date directory exists without .incomplete, the input file is skipped. If the directory is new, .incomplete is created before processing and removed only after successful completion. Use --force to process a date even when .incomplete is absent.
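
The marker protocol above can be sketched with pathlib. This is an illustration of the described rules, not the builder's code: process_date, write_ticker, and failing_build are hypothetical names, and the sketch assumes that a date directory which still contains .incomplete is simply processed again.

```python
import tempfile
from pathlib import Path


def process_date(database, record_type, date, build, force=False):
    """Run build(date_dir) unless the date already completed. Returns True if it ran."""
    date_dir = Path(database) / record_type / date
    marker = date_dir / ".incomplete"
    if date_dir.exists() and not marker.exists() and not force:
        return False  # directory without marker: an earlier run finished
    date_dir.mkdir(parents=True, exist_ok=True)
    marker.touch()    # created before any work starts
    build(date_dir)   # an exception here leaves the marker in place
    marker.unlink()   # removed only after successful completion
    return True


def write_ticker(date_dir):
    (date_dir / "A").write_bytes(b"")  # stand-in for writing ticker files


def failing_build(date_dir):
    raise RuntimeError("simulated crash")


with tempfile.TemporaryDirectory() as database:
    args = (database, "stock_trade", "2026-01-23")
    first_run = process_date(*args, write_ticker)
    second_run = process_date(*args, write_ticker)
    forced_run = process_date(*args, write_ticker, force=True)

    crashed = (database, "stock_trade", "2026-01-26")
    try:
        process_date(*crashed, failing_build)
    except RuntimeError:
        pass
    marker_left = (Path(database, *crashed[1:]) / ".incomplete").exists()
    retry_run = process_date(*crashed, write_ticker)

print(first_run, second_run, forced_run, marker_left, retry_run)
```

Creating the marker first means an interrupted run is always distinguishable from a finished one, at the cost of one extra file operation per date.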

Use --benchmark to print throughput:

massive-speedup-build-database --benchmark --database /data/massive-db *.csv.gz

Database Files

Open a fixed-length binary file through mmap and iterate records:

records = massive_speedup.StockTradeDatabase(
    "/data/massive-db",
    "2026-01-23",
    "A",
)

for trade in records:
    print(trade.sip_timestamp, trade.price)
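
Why fixed-length records pair well with mmap can be shown with the standard library: record i lives at byte offset i times the record size, so indexing and len() need no parsing. The class below is a hypothetical sketch with an invented 16-byte layout, not the package's reader.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: little-endian int64 timestamp, float64 price.
RECORD = struct.Struct("<qd")


class FixedRecordFile:
    """Read-only view over a file of back-to-back fixed-length records."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self):
        return len(self._map) // RECORD.size

    def __getitem__(self, index):
        if index < 0:
            index += len(self)  # support records[-1]
        if not 0 <= index < len(self):
            raise IndexError(index)
        return RECORD.unpack_from(self._map, index * RECORD.size)

    def close(self):
        self._map.close()
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "A")
    with open(path, "wb") as handle:
        for timestamp, price in [(1, 10.0), (2, 11.0), (3, 12.0)]:
            handle.write(RECORD.pack(timestamp, price))

    reader = FixedRecordFile(path)
    count = len(reader)
    first, last = reader[0], reader[-1]
    everything = list(reader)  # iteration works through __getitem__
    reader.close()

print(count, first, last)
```

The operating system pages data in on demand, so opening a large file is cheap and only the records actually touched are read from disk.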

Merge stock trades and quotes for one date and ticker in SIP timestamp order:

for trade, quote in massive_speedup.stock_trade_quote_timeline(
    "/data/massive-db",
    "2026-01-23",
    "A",
):
    if trade:
        print("trade", trade.sip_timestamp, trade.price, quote)
    else:
        print("quote", quote.sip_timestamp, quote.bid_price, quote.ask_price)

Quote rows yield (None, current_quote). Trade rows yield (trade, last_quote), where last_quote is None until the first quote has appeared. When a trade and quote have the same SIP timestamp, the quote is yielded first.
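
The ordering rules above amount to a two-way merge that remembers the most recent quote. The generator below is a pure-Python sketch of those rules with hypothetical Trade and Quote tuples; it makes no claim about how the native function is implemented.

```python
from typing import NamedTuple


class Trade(NamedTuple):
    sip_timestamp: int
    price: float


class Quote(NamedTuple):
    sip_timestamp: int
    bid_price: float
    ask_price: float


def timeline(trades, quotes):
    """Merge two SIP-ordered streams; quotes win ties; trades carry the last quote."""
    trades, quotes = iter(trades), iter(quotes)
    trade, quote = next(trades, None), next(quotes, None)
    last_quote = None
    while trade is not None or quote is not None:
        quote_first = quote is not None and (
            trade is None or quote.sip_timestamp <= trade.sip_timestamp
        )
        if quote_first:
            last_quote = quote
            yield None, quote
            quote = next(quotes, None)
        else:
            yield trade, last_quote
            trade = next(trades, None)


t1, t3 = Trade(1, 100.0), Trade(3, 100.5)
q2, q3 = Quote(2, 99.9, 100.1), Quote(3, 100.4, 100.6)
events = list(timeline([t1, t3], [q2, q3]))
for event in events:
    print(event)
```

In this example the first trade arrives before any quote and is paired with None, and the trade at timestamp 3 sees the quote that shares its timestamp because the quote is emitted first.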

Database files support indexing and timestamp search:

first = records[0]
last = records[-1]

index = records.index_before_timestamp(1769161728012983416)
near_open = records.index_before_timestamp(1769161728012983416, galloping=0)
next_index = records.index_after_timestamp(1769161728012983416, galloping=index + 1)
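
The galloping argument is not defined in this README. The calls above pass an index, which suggests a start hint for a galloping (exponential) search. The sketch below shows that general technique on a sorted list, assuming index_before means "index of the last element less than or equal to the target, or -1". The function name and return convention are assumptions for illustration only.

```python
from bisect import bisect_right


def index_before(timestamps, target, start=0):
    """Index of the last element <= target (or -1), galloping outward from start."""
    n = len(timestamps)
    if n == 0:
        return -1
    start = min(max(start, 0), n - 1)
    if timestamps[start] <= target:
        # Gallop right with doubling steps until we overshoot the target.
        lo, step = start, 1
        while lo + step < n and timestamps[lo + step] <= target:
            lo += step
            step *= 2
        hi = min(lo + step, n)
    else:
        # Gallop left with doubling steps until we reach an element <= target.
        hi, step = start, 1
        while hi - step >= 0 and timestamps[hi - step] > target:
            hi -= step
            step *= 2
        lo = max(hi - step, 0)
    # Finish with a binary search inside the bracket [lo, hi).
    return bisect_right(timestamps, target, lo, hi) - 1


ts = [10, 20, 20, 30, 40, 50]
print(index_before(ts, 20), index_before(ts, 5), index_before(ts, 100, start=4))
```

When the hint is close to the answer, as in a sequence of lookups that move forward in time, galloping touches only a handful of elements instead of bisecting the whole file.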

Timestamp arguments are nanoseconds since epoch. Database readers also accept datetime.time values, which are resolved using the reader's date:

import datetime as dt

index = records.index_before_timestamp(dt.time(9, 30))
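
Resolving a datetime.time against a date comes down to combining the two and counting nanoseconds from the epoch. The helper below is a sketch: time_to_ns is a hypothetical name, and the time zone the library applies is not stated here, so the zone is a parameter that defaults to UTC purely for the example.

```python
import datetime as dt

EPOCH = dt.datetime(1970, 1, 1, tzinfo=dt.timezone.utc)


def time_to_ns(date, time_of_day, tzinfo=dt.timezone.utc):
    """Combine a date and a time of day into integer nanoseconds since epoch."""
    moment = dt.datetime.combine(date, time_of_day, tzinfo=tzinfo)
    delta = moment - EPOCH
    # Integer arithmetic avoids float rounding at nanosecond scale.
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 1_000_000_000 + delta.microseconds * 1_000


day = dt.date(2026, 1, 23)
utc_ns = time_to_ns(day, dt.time(9, 30))

# A fixed UTC-5 offset, used here only to show the effect of the zone choice;
# it does not follow daylight saving changes.
fixed_offset = dt.timezone(dt.timedelta(hours=-5))
offset_ns = time_to_ns(day, dt.time(9, 30), fixed_offset)

print(utc_ns, offset_ns)
```

The two results differ by exactly five hours, which is why it matters which zone a reader uses when it interprets dt.time(9, 30).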

Find the closest record before or after a participant timestamp:

before = records.find_before_participant_timestamp(
    1769161728012624580,
)
after = records.find_after_participant_timestamp(
    1769161728012624580,
    fuzz=250_000_000,
    galloping=True,
)
strict_before = records.find_before_participant_timestamp(
    1769161728012624580,
    on=False,
)

find_before_participant_timestamp returns the record with the highest participant timestamp less than or equal to the target. find_after_participant_timestamp returns the record with the lowest participant timestamp greater than or equal to the target. Set on=False for strict < or > comparisons. fuzz is a nanosecond scan window around the searched timestamp and defaults to one second (1_000_000_000). Both methods return records, not indexes.
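
One way to read the fuzz parameter is as follows. Records are stored in SIP timestamp order, participant timestamps can be slightly out of order relative to that, and the search therefore scans a bounded SIP-time window around the target instead of the whole file. That reading is an assumption, and the functions below are a hypothetical sketch of it rather than the library's algorithm.

```python
from bisect import bisect_left, bisect_right
from typing import NamedTuple


class Record(NamedTuple):
    sip_timestamp: int
    participant_timestamp: int


def _window(records, target, fuzz):
    """Slice of records whose SIP timestamp lies within fuzz of the target."""
    sips = [r.sip_timestamp for r in records]
    return records[bisect_left(sips, target - fuzz):bisect_right(sips, target + fuzz)]


def find_before(records, target, fuzz=1_000_000_000, on=True):
    """Record with the highest participant timestamp <= target (< if on=False)."""
    best = None
    for record in _window(records, target, fuzz):
        ts = record.participant_timestamp
        if (ts <= target if on else ts < target) and (
            best is None or ts > best.participant_timestamp
        ):
            best = record
    return best


def find_after(records, target, fuzz=1_000_000_000, on=True):
    """Record with the lowest participant timestamp >= target (> if on=False)."""
    best = None
    for record in _window(records, target, fuzz):
        ts = record.participant_timestamp
        if (ts >= target if on else ts > target) and (
            best is None or ts < best.participant_timestamp
        ):
            best = record
    return best


# SIP order is sorted; participant order is not.
records = [Record(100, 90), Record(110, 105), Record(120, 95), Record(130, 125)]
before = find_before(records, 100, fuzz=50)
after = find_after(records, 100, fuzz=50)
strict = find_before(records, 95, fuzz=50, on=False)
narrow = find_before(records, 100, fuzz=5)
print(before, after, strict, narrow)
```

In this sketch a fuzz that is too small can miss the best match (the last call returns the record with participant timestamp 90 even though 95 exists), which would be the trade-off of a bounded scan.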

Stock database readers also expose NYSE market session timestamps in nanoseconds:

print(records.market_open)
print(records.market_close)

Download files

Source distribution:

  • massive_speedup-0.1.4.tar.gz (58.6 MB)

Built distributions (version 0.1.4, x86-64 Linux wheels):

  • CPython 3.9, 3.10, 3.11, 3.12, 3.13, 3.13t, 3.14, 3.14t
  • musllinux (musl 1.2+): 1.5 MB per wheel
  • manylinux (glibc 2.27+ / 2.28+): 1.0 MB per wheel
