Cryptocurrency Order Book Analysis Tool
Table of Contents
- About the Project
- Getting Started
About The Project
This project is an extension of my thesis, A Study of CUSUM Statistics on Bitcoin Transactions, where I was tasked with implementing CUSUM statistic processes to identify price-action periods in bitcoin markets. After developing a tool for market orders, the natural extension was to look for relationships in the activity of the limit order book. I started developing this tool to record instances of the limit order book in order to capture limit order insertions (LO), cancellations (CO), and market orders (MO).
As the project grew, I wanted to make a tool that could be used by academics looking to apply and develop market microstructure models in live markets. As a result, the styles in which the limit order book and order book events are recorded are being developed in accordance with the conventions presented in the following recent market microstructure papers:
The last paper, 5, presents a working model by Ed Silyantev that applies Order Flow Imbalance (OFI) and Trade Flow Imbalance (TFI) to BTC-USD trades. He developed a tool to assess the OFI and TFI of the XBT-USD pair.
To get a local copy up and running follow these simple steps.
You can use requirements.txt to see what is necessary, but the dependencies are also listed below:
- From the Python standard library: asyncio, time, datetime, sys, bisect
- Required third-party modules: copra, pandas, numpy
Given that this is still very much a work in progress, it may make more sense to fork the project, or download the project as a compressed folder, and build CSV_out_test.py with your preferred settings.
Note: depending on the popularity of the asset and the computational power of your PC, you may run into errors arising from the computer not being able to keep up with the market (especially BTC-USD). I would suggest experimenting with a less popular pair (e.g., XRP-USD) or a crypto-crypto pair (e.g., XRP-BTC), and timing your queries outside of NYSE and London Stock Exchange trading hours, when there tends to be less activity.
However, if you want an easy installation:
pip3 install crobat
Since this is an order book recorder, my use until now has been to record the order book. However, there are accessors in the LOB_funcs.py file, in the history class. In the /test folder there is a small use case if you would like to see it, but documentation is pending.
For now we only have the full orderbook, with no regard for ticksize, and we call that
We change the settings variable in the CSV_out_test.py file, which has arguments for:
|Parameter||Function Arg||Type||Description|
|Recording Duration||duration||int||recording time in seconds|
|Position Range||position_range||int||ordinal distance from the best bid (ask)|
|Currency Pair||currency_pair||str||one of the currency pairs supported by Coinbase|
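As a rough illustration of the three arguments in the table, here is a minimal sketch of a settings mapping with a sanity check. The dictionary layout and the validate_settings helper are hypothetical and only mirror the argument names and types listed above; the actual settings variable in CSV_out_test.py may be shaped differently.

```python
# Hypothetical layout: only the three argument names come from the table above.
settings = {
    "duration": 600,             # recording time in seconds
    "position_range": 5,         # ordinal distance from the best bid (ask)
    "currency_pair": "XRP-USD",  # a pair supported by Coinbase
}


def validate_settings(s):
    """Check the types from the table; raise ValueError on the first problem."""
    for key in ("duration", "position_range"):
        value = s[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive int, got {value!r}")
    pair = s["currency_pair"]
    # Assumes the BASE-QUOTE spelling used in this README (e.g., XRP-USD).
    if not isinstance(pair, str) or "-" not in pair:
        raise ValueError(f"currency_pair must look like 'XRP-USD', got {pair!r}")
    return s


validate_settings(settings)
```

Checking the values before opening the connection means a typo is caught before a long recording starts, not part of the way through it.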
When you are ready, you can start the build. When it finishes you should get a message
CoPrA. The files for the limit order book on each side should then be created, each with a timestamp:
|Filename||side||description|
|L2_orderbook_events_askYYYY-MM-DDTHH:MM:SS.ffffff||ask||Time series of order book events on the ask side|
|L2_orderbook_events_bidYYYY-MM-DDTHH:MM:SS.ffffff||bid||Time series of order book events on the bid side|
|L2_orderbook_events_signedYYYY-MM-DDTHH:MM:SS.ffffff||both||Time series of order book events on both sides, - sign for bid, + sign for ask|
|L2_orderbook_ask_volmYYYY-MM-DDTHH:MM:SS.ffffff||ask||Time series of the volume snapshots of order book on the ask side|
|L2_orderbook_bid_volmYYYY-MM-DDTHH:MM:SS.ffffff||bid||Time series of the volume snapshots of order book on the bid side|
|L2_orderbook_signed_volmYYYY-MM-DDTHH:MM:SS.ffffff||both||Time series of the volume snapshots of the signed order book, - for bid, + for ask|
|L2_orderbook_ask_volmYYYY-MM-DDTHH:MM:SS.ffffff||ask||Time series of the price snapshots of order book on the ask side|
|L2_orderbook_bid_volmYYYY-MM-DDTHH:MM:SS.ffffff||bid||Time series of the price snapshots of order book on the bid side|
|L2_orderbook_signed_volmYYYY-MM-DDTHH:MM:SS.ffffff||both||Time series of the price snapshots of the signed order book, - for bid, + for ask|
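The timestamp suffix in these filenames follows the YYYY-MM-DDTHH:MM:SS.ffffff pattern. A minimal sketch of how such a name could be built is below; the timestamped_name helper is hypothetical and is not necessarily how crobat names its files.

```python
from datetime import datetime


def timestamped_name(prefix, when):
    """Append a YYYY-MM-DDTHH:MM:SS.ffffff stamp to a filename prefix."""
    # %f is always zero-padded to six digits, unlike datetime.isoformat(),
    # which drops the fractional part when microseconds are zero.
    return prefix + when.strftime("%Y-%m-%dT%H:%M:%S.%f")


name = timestamped_name("L2_orderbook_events_ask", datetime(2020, 4, 1, 12, 30, 5, 123))
print(name)
```

Note that colons are not allowed in filenames on Windows, so this naming pattern assumes a Unix-like file system.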
Understanding The Raw Order Book Data
The Coinbase exchange operates using the double auction model. The Coinbase Pro API, and by extension the CoPrA API, makes it relatively easy to get still images of an instance of the order book as snapshots, and it sends real-time updates of the volume at a particular price level as l2_update messages. If you would like to know more, the cited papers do a great job of introducing the double auction model for the purposes of defining the types of orders, and of explaining how they record events and make sense of them.
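To make the snapshot-plus-updates idea concrete, here is a minimal sketch of keeping a local book in sync. It assumes the message shapes of the Coinbase Pro level2 channel as I understand them: a snapshot carrying bids and asks as [price, size] string pairs, and an update carrying changes as [side, price, new size] triples, where a size of 0 removes the level. The function names are hypothetical and this is not crobat's internal code.

```python
from decimal import Decimal


def book_from_snapshot(snapshot):
    """Build {"bids": {price: size}, "asks": {price: size}} from a snapshot."""
    return {
        side: {Decimal(price): Decimal(size) for price, size in snapshot[side]}
        for side in ("bids", "asks")
    }


def apply_l2_update(book, update):
    """Apply each [side, price, size] change; size is the new total at that level."""
    for side, price, size in update["changes"]:
        levels = book["bids"] if side == "buy" else book["asks"]
        price, size = Decimal(price), Decimal(size)
        if size == 0:
            levels.pop(price, None)  # the level is now empty
        else:
            levels[price] = size
    return book


snapshot = {"bids": [["7085.92", "1.5"]], "asks": [["7085.94", "2.0"]]}
book = book_from_snapshot(snapshot)
apply_l2_update(book, {"changes": [["buy", "7085.92", "0"], ["sell", "7085.95", "0.25"]]})
```

Decimal is used for the keys so that price levels parsed from strings compare exactly, which plain floats do not guarantee.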
Below there is a graph of the snapshot where bids (green) show open limit orders to buy 1 unit of the cryptocurrency below $7085.930, and asks (red) show open limit orders to sell 1 unit above $7085.930. The x-axis shows the price points, and the y-axis is the aggregate size at the price level. Note that the signed order book treats volume on the bid side as negative.
Early and current works relied on exchanges and private data providers (e.g., NASDAQ - BookViewer, LOBSTER) to provide reconstructions of order books. Earlier works were limited to taking snapshots and inferring the possible sequence of order book events between states. Coinbase, and by extension crobat, updates the levels each time an update message arrives from the exchange, so there is no guessing as to what happened between states of the order book. The current format of the order book snapshot is not aggregated. The format of the order book snapshot for a single side is shown below:
|YYYY-MM-DDTHH:MM:SS.ffffff||total BTC at position 1||total BTC at position 2||total BTC at position 3||...||total BTC at position range|
The associated price quote snapshot (e.g., USD per XTC) is also generated, to make the generation of market depth feasible.
|YYYY-MM-DDTHH:MM:SS.ffffff||price quote at position 1||price quote at position 2||price quote at position 3||...||price quote at position range|
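Given one volume row and its matching price row, market depth can be sketched as a running total over positions. The market_depth helper below is a hypothetical illustration, not a crobat function; it assumes the two rows are aligned by position and already stripped of the timestamp column.

```python
from itertools import accumulate


def market_depth(volumes, prices):
    """Return cumulative depth per position in base and in quote currency."""
    base_depth = list(accumulate(volumes))
    quote_depth = list(accumulate(v * p for v, p in zip(volumes, prices)))
    return base_depth, quote_depth


# Example row: sizes in XTC and quotes in USD per XTC at positions 1 to 3.
base, quote = market_depth([1.0, 2.0, 0.5], [100.0, 101.0, 102.0])
```

Here base is [1.0, 3.0, 3.5] and quote is [100.0, 302.0, 353.0], that is, the size and the cost of reaching each position from the best quote outward.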
The signed order book takes a different approach to position labelling, so please keep that in mind. (Note: I should shift the position index to start at 1 for the single-side order book snapshot time series.) The signed order book snapshot is generated in a similar fashion, with a volume and a price at each position. However, it uses the established convention for the signed order book, where positions on the bid side are negative, with negative volume (XTC). I'll show the default setting, which displays the 5 best bids and asks on each side.
|YYYY-MM-DDTHH:MM:SS.ffffff||total XTC at the 5th best bid||total XTC at the 4th best bid||total XTC at the 3rd best bid||total XTC at the 2nd best bid||total XTC at the best bid||total XTC at the best ask||total XTC at the 2nd best ask||total XTC at the 3rd best ask||total XTC at the 4th best ask||total XTC at the 5th best ask|
Similar to the single side implementation, there is an associated price quote (e.g., USD per XTC) snapshot generated at each timepoint. The default format is given below:
|YYYY-MM-DDTHH:MM:SS.ffffff||price quote at the 5th best bid||price quote at the 4th best bid||price quote at the 3rd best bid||price quote at the 2nd best bid||price quote at the best bid||price quote at the best ask||price quote at the 2nd best ask||price quote at the 3rd best ask||price quote at the 4th best ask||price quote at the 5th best ask|
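The column ordering of the signed snapshot, from the deepest bid through the best bid and best ask to the deepest ask, can be sketched as follows. The signed_snapshot helper and its price-to-size dictionaries are hypothetical, and the choice of -1 and +1 as the labels of the best bid and best ask is an assumption on my part.

```python
def signed_snapshot(bids, asks, position_range=5):
    """Return (positions, volumes, prices), ordered from deepest bid to deepest ask.

    bids and asks map price -> size. Bid positions and bid volumes are negative.
    """
    bid_prices = sorted(bids, reverse=True)[:position_range][::-1]  # deepest bid first
    ask_prices = sorted(asks)[:position_range]                      # best ask first
    positions = list(range(-len(bid_prices), 0)) + list(range(1, len(ask_prices) + 1))
    volumes = [-bids[p] for p in bid_prices] + [asks[p] for p in ask_prices]
    prices = bid_prices + ask_prices
    return positions, volumes, prices


positions, volumes, prices = signed_snapshot(
    {99: 1, 98: 2, 97: 3}, {101: 4, 102: 5}, position_range=2
)
```

With these toy levels the result is positions [-2, -1, 1, 2], volumes [-2, -1, 4, 5], and prices [98, 99, 101, 102].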
Event recordings are a time series of MOs, LOs, and COs, as inferred from the l2_update messages, which are used to update the size at each price level. The format of the event recorder is as follows:
|Timestamp||order type||price level||event size||position||mid price||bid-ask spread|
|YYYY-MM-DDTHH:MM:SS.ffffff||MO, LO, CO||price level in quote currency||event size in base currency||position||(best-ask + best-bid)/2||best-ask - best-bid range|
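The last two columns are derived from the best quotes at the time of the event. A minimal sketch of assembling one row in the column order above is below; the event_row helper is hypothetical and only illustrates the mid price and spread formulas from the table.

```python
def event_row(timestamp, order_type, price, size, position, best_bid, best_ask):
    """Build one event record in the column order of the table above."""
    if order_type not in {"MO", "LO", "CO"}:
        raise ValueError(f"unknown order type: {order_type!r}")
    mid_price = (best_ask + best_bid) / 2
    spread = best_ask - best_bid
    return [timestamp, order_type, price, size, position, mid_price, spread]


row = event_row("2020-04-01T12:30:05.000123", "LO", 7085.0, 0.5, 1, 7085.0, 7086.0)
```

For this example the mid price is 7085.5 and the bid-ask spread is 1.0.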
Signed event recordings follow the convention from The Price Impact of Order Book Events, where positive order flow is due to MOs on the buy side, COs on the sell side, and LOs on the buy side. Conversely, negative order flow is due to MOs on the sell side, COs on the buy side, and LOs on the sell side. The format is similar to the single-side order book events time series, but the order volume is signed based on the aforementioned construction.
|Timestamp||order type||price level||event size||position||side||mid price||bid-ask spread|
|YYYY-MM-DDTHH:MM:SS.ffffff||MO, LO, CO||price level in quote currency||event size in base currency||signed position(- for bids, + for asks)||buy/sell||(best-ask + best-bid)/2||best-ask - best-bid range|
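The sign convention can be sketched as a small lookup on the (order type, side) pair. In this sketch, buy-side MOs, sell-side COs, and buy-side LOs count as positive flow, and their mirror images (sell-side MOs, buy-side COs, sell-side LOs) count as negative flow. The names below are hypothetical and are not part of crobat.

```python
POSITIVE_FLOW = {("MO", "buy"), ("CO", "sell"), ("LO", "buy")}
NEGATIVE_FLOW = {("MO", "sell"), ("CO", "buy"), ("LO", "sell")}


def signed_size(order_type, side, size):
    """Return the event size with the sign of its order flow contribution."""
    key = (order_type, side)
    if key in POSITIVE_FLOW:
        return size
    if key in NEGATIVE_FLOW:
        return -size
    raise ValueError(f"unknown event: {key!r}")


# Net order flow over a few (order type, side, size) events.
events = [("LO", "buy", 2.0), ("CO", "buy", 0.5), ("MO", "sell", 1.0)]
net_flow = sum(signed_size(*event) for event in events)
```

Summing the signed sizes over a time window gives a net order flow figure of the kind used in order flow imbalance studies; here it is 2.0 - 0.5 - 1.0 = 0.5.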
Features that need to be developed, in order of priority:
- fixed tick orderbook snapshots and event recording
- market depth recording in both base and quote currencies
- accessor functions (documentation pending)
- modernizing/optimizing iteration and classes (replaced sort instances with insert, and a little bit of logic)
- Finding a way to call the classes outside of the AsyncIO or WebSocket Loop (help me figure this one out!)
See the open issues for a list of proposed features (and known issues).
Any contributions you make are greatly appreciated. I am not much of a computer scientist so suggestions and feedback will improve this project for everyone and make me a more capable developer for future projects.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the GNU GPLv3 License. See LICENSE for more information.
Project Link: https://github.com/orderbooktools/crobat