Parser for IEX data files

iex_parser

Parser for IEX pcap DEEP (1.0) and TOPS (1.5, 1.6) files.

Overview

At the time of writing, IEX provides two file downloads for historical data: DEEP and TOPS. Each download is a pcap file, which is a dump of the feed's network activity.

This package provides an API for extracting the data from these files.

Installation

Install from PyPI.

pip install iex_parser

Example

The following code processes the TOPS sample file downloaded from IEX. The file does not have to be unzipped. For version 1.6 use TOPS_1_6; for version 1.5 use TOPS_1_5.

from iex_parser import Parser, TOPS_1_6

TOPS_SAMPLE_DATA_FILE = 'data_feeds_20180127_20180127_IEXTP1_TOPS1.6.pcap.gz'

with Parser(TOPS_SAMPLE_DATA_FILE, TOPS_1_6) as reader:
    for message in reader:
        print(message)

The result looks like this:

{'type': 'trading_status', 'status': b'T', 'timestamp': datetime.datetime(2018, 1, 27, 15, 23, 40, 490473, tzinfo=datetime.timezone.utc), 'symbol': b'SPEM', 'reason': b''}
{'type': 'trading_status', 'status': b'H', 'timestamp': datetime.datetime(2018, 1, 27, 15, 23, 42, 95642, tzinfo=datetime.timezone.utc), 'symbol': b'INCO', 'reason': b'NA'}
{'type': 'trading_status', 'status': b'H', 'timestamp': datetime.datetime(2018, 1, 27, 15, 23, 42, 852349, tzinfo=datetime.timezone.utc), 'symbol': b'CHSCN', 'reason': b'NA'}
{'type': 'price_level_update', 'side': b'S', 'flags': 1, 'timestamp': datetime.datetime(2018, 1, 27, 15, 23, 44, 856983, tzinfo=datetime.timezone.utc), 'symbol': b'ATLO', 'size': 8755, 'price': Decimal('38.95')}
{'type': 'price_level_update', 'side': b'S', 'flags': 0, 'timestamp': datetime.datetime(2018, 1, 27, 15, 23, 44, 856983, tzinfo=datetime.timezone.utc), 'symbol': b'ATLO', 'size': 37222, 'price': Decimal('48')}
{'type': 'price_level_update', 'side': b'S', 'flags': 1, 'timestamp': datetime.datetime(2018, 1, 27, 15, 23, 44, 856987, tzinfo=datetime.timezone.utc), 'symbol': b'ATLO', 'size': 8958, 'price': Decimal('38.95')}
{'type': 'price_level_update', 'side': b'S', 'flags': 0, 'timestamp': datetime.datetime(2018, 1, 27, 15, 23, 44, 856987, tzinfo=datetime.timezone.utc), 'symbol': b'ATLO', 'size': 37019, 'price': Decimal('48')}

The following code processes the DEEP sample file downloaded from IEX.

from iex_parser import Parser, DEEP_1_0

DEEP_SAMPLE_DATA_FILE = 'data_feeds_20180127_20180127_IEXTP1_DEEP1.0.pcap.gz'

with Parser(DEEP_SAMPLE_DATA_FILE, DEEP_1_0) as reader:
    for message in reader:
        print(message)
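
Since the reader yields plain dictionaries, filtering can be done with ordinary Python. The following is a minimal sketch, not part of the package: select_messages is a hypothetical helper name, and it assumes only the 'type' and 'symbol' keys shown in the sample output above.

```python
from typing import Iterable, Iterator, Optional, Set


def select_messages(
    messages: Iterable[dict],
    types: Optional[Set[str]] = None,
    symbols: Optional[Set[bytes]] = None,
) -> Iterator[dict]:
    """Yield only the messages matching the given types and symbols.

    Hypothetical helper: a filter left as None matches everything.
    """
    for message in messages:
        if types is not None and message['type'] not in types:
            continue
        # Use .get() in case a message does not carry a symbol.
        if symbols is not None and message.get('symbol') not in symbols:
            continue
        yield message
```

With the examples above it could be called as select_messages(reader, types={'price_level_update'}, symbols={b'ATLO'}). Note that symbols are bytes, as in the sample output.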

Speed

Because the data is distributed as a dump of network packets, there are a lot of "empty" packets. These take time to read and slow the delivery of the real data. To handle this, the packets are read on a separate Python thread and queued. The size of the queue is an optional parameter to the Parser; its value of 25000 was chosen by experimentation.
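
The read-ahead idea can be illustrated with the standard library. This is a sketch of the general producer/consumer pattern, not the package's actual code; read_ahead and queue_size are names made up for the illustration.

```python
import queue
import threading
from typing import Iterable, Iterator

_SENTINEL = object()  # marks the end of the stream


def read_ahead(source: Iterable, queue_size: int = 25000) -> Iterator:
    """Iterate over `source` while a background thread reads it ahead."""
    buffer: queue.Queue = queue.Queue(maxsize=queue_size)

    def producer() -> None:
        try:
            for item in source:
                buffer.put(item)  # blocks while the queue is full
        finally:
            # Always signal the end, even if reading the source failed.
            buffer.put(_SENTINEL)

    # A daemon thread does not keep the process alive if the consumer stops early.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()  # blocks while the queue is empty
        if item is _SENTINEL:
            return
        yield item
```

A bounded queue keeps memory use in check: the reading thread pauses when it gets too far ahead of the consumer.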

The main question I get is: can it go any faster?

The short answer is no. The slowness comes from the time spent reading and skipping network data in the pcap file.

The solution is to convert the downloaded pcap files into CSV or JSON and work from the converted files.

Command line tools

There are command line tools that take a downloaded file and convert it to CSV files or a JSON file.

iex-to-csv

$ iex-to-csv -i <input-file> -o <output-folder> [-s] [-t <ticker> ...] [-c]

The input file must be as downloaded from IEX. The -s flag can be used to suppress the progress printing. The -t flag can be used to select specific tickers. The -c flag causes the ordinal to be reset when the timestamp changes, rather than monotonically increasing. A file for every message type is produced.

For example:

$ iex-to-csv -i ~/data/raw/data_feeds_20200305_20200305_IEXTP1_DEEP1.0.pcap.gz -o ~/data/csv
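
The two ordinal modes selected by the -c flag can be pictured as follows. This is an illustration only, not the tool's code: assign_ordinals is a hypothetical name, and starting the count at zero is an assumption.

```python
from typing import Hashable, Iterable, Iterator, Tuple


def assign_ordinals(
    timestamps: Iterable[Hashable], reset_on_change: bool = False
) -> Iterator[Tuple[Hashable, int]]:
    """Pair each timestamp with an ordinal.

    With reset_on_change=False the ordinal increases across the whole stream.
    With reset_on_change=True it restarts whenever the timestamp changes.
    """
    ordinal = 0  # assumption: ordinals start at zero
    previous = None
    for index, timestamp in enumerate(timestamps):
        if reset_on_change and index > 0 and timestamp != previous:
            ordinal = 0
        yield timestamp, ordinal
        ordinal += 1
        previous = timestamp
```

For the timestamps t1, t1, t2, t2, t2, t3 the first mode gives 0, 1, 2, 3, 4, 5 and the second gives 0, 1, 0, 1, 2, 0.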

iex-to-json

$ iex-to-json -i <input-file> -o <output-path> [-s] [-t <ticker> ...]

The input file must be as downloaded from IEX. The -s flag can be used to suppress the progress printing. The -t flag can be used to select specific tickers. A single file is produced containing a JSON message per line.

For example:

$ iex-to-json -i ~/data/raw/data_feeds_20200305_20200305_IEXTP1_DEEP1.0.pcap.gz -o ~/data/json/

There is a helper function to load this data:

from pathlib import Path
from iex_parser.iex_to_json import load_json

INPUT_FILENAME = Path('data_feeds_20200305_20200305_IEXTP1_DEEP1.0.json.gz')

for obj in load_json(INPUT_FILENAME):
    if obj['type'] == 'trade_report':
        print(obj)
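
Because the output holds one JSON message per line, it can also be read with the standard library alone. The sketch below assumes the file is gzip-compressed when its name ends in .gz, as the filename above suggests. read_json_lines is a hypothetical name, and load_json may do more than this (for example, converting values back to richer types).

```python
import gzip
import json
from pathlib import Path
from typing import Iterator


def read_json_lines(path: Path) -> Iterator[dict]:
    """Yield one decoded JSON object per non-empty line of the file."""
    # Assumption: a .gz suffix means the file is gzip-compressed.
    opener = gzip.open if path.suffix == '.gz' else open
    with opener(path, 'rt', encoding='utf-8') as lines:
        for line in lines:
            line = line.strip()
            if line:
                yield json.loads(line)
```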

Messages

The messages are returned as dictionaries.
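
The values in these dictionaries are bytes, decimal.Decimal and datetime.datetime objects, none of which the json module can serialise directly. One possible conversion is sketched below. to_json_line is a hypothetical helper, the ASCII decoding of bytes fields is an assumption, and this is not necessarily the encoding iex-to-json uses.

```python
import datetime
import decimal
import json


def to_json_line(message: dict) -> str:
    """Serialise one message dictionary to a single line of JSON."""

    def convert(value):
        if isinstance(value, bytes):
            return value.decode('ascii')  # assumption: fields are ASCII
        if isinstance(value, decimal.Decimal):
            return str(value)  # keep the exact decimal digits
        if isinstance(value, datetime.datetime):
            return value.isoformat()
        return value

    return json.dumps({key: convert(value) for key, value in message.items()})
```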

Security Directive

{
    'type': 'security_directive',
    'flags': int,
    'timestamp': datetime.datetime,
    'symbol': bytes,
    'round_lot_size': int,
    'adjusted_poc_close': decimal.Decimal,
    'luld_tier': int
}

Trading Status

{
    'type': 'trading_status',
    'status': bytes,
    'timestamp': datetime.datetime,
    'symbol': bytes,
    'reason': bytes
}

Operational Halt

{
    'type': 'operational_halt',
    'halt_status': bytes,
    'timestamp': datetime.datetime,
    'symbol': bytes
}

Short Sale Price Test Status

{
    'type': 'short_sale_price_test_status',
    'status': int,
    'timestamp': datetime.datetime,
    'symbol': bytes,
    'detail': bytes
}

Quote Update

{
    'type': 'quote_update',
    'flags': int,
    'timestamp': datetime.datetime,
    'symbol': bytes,
    'bid_size': int,
    'bid_price': decimal.Decimal,
    'ask_size': int,
    'ask_price': decimal.Decimal
}

Trade Report

{
    'type': 'trade_report',
    'flags': int,
    'timestamp': datetime.datetime,
    'symbol': bytes,
    'size': int,
    'price': decimal.Decimal,
    'trade_id': int
}
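
As an example of working with these fields, the sketch below computes a volume-weighted average price per symbol from trade reports. vwap_by_symbol is a hypothetical helper. It uses only the 'symbol', 'size' and 'price' keys and, as a simplification, ignores the flags and any later trade breaks.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable


def vwap_by_symbol(messages: Iterable[dict]) -> Dict[bytes, Decimal]:
    """Return the volume-weighted average trade price for each symbol."""
    notional = defaultdict(Decimal)  # running sum of price * size
    volume = defaultdict(int)        # running sum of size
    for message in messages:
        if message['type'] != 'trade_report':
            continue
        symbol = message['symbol']
        notional[symbol] += message['price'] * message['size']
        volume[symbol] += message['size']
    # Skip symbols with zero volume to avoid dividing by zero.
    return {s: notional[s] / volume[s] for s in notional if volume[s] > 0}
```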

Official Price

{
    'type': 'official_price',
    'price_type': bytes,
    'timestamp': datetime.datetime,
    'symbol': bytes,
    'price': decimal.Decimal
}

Trade Break

{
    'type': 'trade_break',
    'flags': int,
    'timestamp': datetime.datetime,
    'symbol': bytes,
    'size': int,
    'price': decimal.Decimal,
    'trade_id': int
}

Auction Information

{
    'type': 'auction_information',
    'auction_type': bytes,
    'timestamp': datetime.datetime,
    'symbol': bytes,
    'paired_shares': int,
    'reference_price': decimal.Decimal,
    'indicative_clearing_price': decimal.Decimal,
    'imbalance_shares': int,
    'imbalance_side': bytes,
    'extension_number': int,
    'scheduled_auction_time': datetime.datetime,
    'auction_book_clearing_price': decimal.Decimal,
    'collar_reference_price': decimal.Decimal,
    'lower_auction_collar_price': decimal.Decimal,
    'upper_auction_collar_price': decimal.Decimal
}

Price Level Update

{
    'type': 'price_level_update',
    'side': bytes,
    'flags': int,
    'timestamp': datetime.datetime,
    'symbol': bytes,
    'size': int,
    'price': decimal.Decimal
}
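
Price level updates can be folded into a simple order book. The sketch below is an illustration, not part of the package. It assumes that 'size' is the total size at the price level and that a size of zero removes the level, which is my reading of the IEX DEEP specification and should be checked against it. apply_price_level_update is a hypothetical name.

```python
from collections import defaultdict
from decimal import Decimal
from typing import DefaultDict, Dict, Tuple

# Maps (symbol, side) to a {price: size} dictionary of live price levels.
Book = DefaultDict[Tuple[bytes, bytes], Dict[Decimal, int]]


def apply_price_level_update(book: Book, message: dict) -> None:
    """Apply one price_level_update message to the book in place."""
    levels = book[(message['symbol'], message['side'])]
    if message['size'] == 0:
        # Assumption: zero size means the price level no longer exists.
        levels.pop(message['price'], None)
    else:
        # Assumption: size replaces, rather than adds to, the level's size.
        levels[message['price']] = message['size']
```

A book is created with defaultdict(dict), and each message whose 'type' is 'price_level_update' is passed to the function in order.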

Security Event

{
    'type': 'security_event',
    'security_event': bytes,
    'timestamp': datetime.datetime,
    'symbol': bytes
}

