Classify trades using trade classification algorithms 🐍
Trade Classification With Python
Documentation ✒️: https://karelze.github.io/tclf/
Source Code 🐍: https://github.com/KarelZe/tclf
tclf is a scikit-learn-compatible implementation of trade classification algorithms that classify financial market transactions as buyer- or seller-initiated trades.
The key features are:
- Easy: Easy to use and learn.
- Sklearn-compatible: Compatible with the sklearn API. Use sklearn metrics and visualizations.
- Feature complete: Wide range of supported algorithms. Use the algorithms individually or stack them like LEGO blocks.
Installation
python -m pip install tclf
Supported Algorithms
- (Rev.) CLNV rule[^1]
- (Rev.) EMO rule[^2]
- (Rev.) LR algorithm[^6]
- (Rev.) Tick test[^5]
- Depth rule[^3]
- Quote rule[^4]
- Tradesize rule[^3]
For a primer on trade classification rules, visit the rules section 🆕 in our docs.
Minimal Example
Let's start simple: classify all trades with the quote rule, and randomly classify any trades that the quote rule cannot classify.
Create a main.py with:
import numpy as np
import pandas as pd

from tclf.classical_classifier import ClassicalClassifier

# One row per trade; np.nan marks a missing quote.
X = pd.DataFrame(
    [
        [1.5, 1, 3],
        [2.5, 1, 3],
        [1.5, 3, 1],
        [2.5, 3, 1],
        [1, np.nan, 1],
        [3, np.nan, np.nan],
    ],
    columns=["trade_price", "bid_ex", "ask_ex"],
)

# Quote rule on exchange-level quotes; unclassified trades are assigned randomly.
clf = ClassicalClassifier(layers=[("quote", "ex")], strategy="random")
clf.fit(X)
probs = clf.predict_proba(X)
Run your script with
$ python main.py
In this example, input data is available as a pd.DataFrame with columns conforming to our naming conventions.
The parameter layers=[("quote", "ex")] sets the quote rule at the exchange level and strategy="random" specifies the fallback strategy for unclassified trades.
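For intuition, the quote rule itself can be sketched in a few lines of plain Python. This is a minimal illustration and not tclf's implementation: the helper name quote_rule is hypothetical, labels follow the 1 (buy) / -1 (sell) convention used in this README, and trades at the midpoint or with missing quotes are returned as None so that a fallback strategy can handle them.

```python
import math


def quote_rule(trade_price, bid, ask):
    """Hypothetical helper: return 1 (buy), -1 (sell), or None (unclassified)."""
    # Missing quotes: the rule cannot be applied.
    if math.isnan(bid) or math.isnan(ask):
        return None
    mid = 0.5 * (bid + ask)
    if trade_price > mid:
        return 1  # above the midpoint: buyer-initiated
    if trade_price < mid:
        return -1  # below the midpoint: seller-initiated
    return None  # exactly at the midpoint: left to the fallback


# Same rows as in the example above: (trade_price, bid_ex, ask_ex)
rows = [
    (1.5, 1, 3),
    (2.5, 1, 3),
    (1.5, 3, 1),
    (2.5, 3, 1),
    (1, math.nan, 1),
    (3, math.nan, math.nan),
]
labels = [quote_rule(*row) for row in rows]
print(labels)  # [-1, 1, -1, 1, None, None]
```

The last two rows stay unclassified in this sketch; these are the trades that a fallback such as strategy="random" would label.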
Advanced Example
Often it is desirable to classify trades using both exchange-level data and NBBO data. Also, data might only be available as a numpy array. So let's extend the previous example: classify with the quote rule at the exchange level first, then at the NBBO level, and classify all remaining trades randomly.
import numpy as np
from sklearn.metrics import accuracy_score

from tclf.classical_classifier import ClassicalClassifier

# Columns follow the order given in `features` below.
X = np.array(
    [
        [1.5, 1, 3, 2, 2.5],
        [2.5, 1, 3, 1, 3],
        [1.5, 3, 1, 1, 3],
        [2.5, 3, 1, 1, 3],
        [1, np.nan, 1, 1, 3],
        [3, np.nan, np.nan, 1, 3],
    ]
)
y_true = np.array([-1, 1, 1, -1, -1, 1])
features = ["trade_price", "bid_ex", "ask_ex", "bid_best", "ask_best"]

# Quote rule on exchange quotes first, then on NBBO quotes, then random.
clf = ClassicalClassifier(
    layers=[("quote", "ex"), ("quote", "best")], strategy="random", features=features
)
clf.fit(X)
acc = accuracy_score(y_true, clf.predict(X))
In this example, input data is available as an np.array with both exchange ("ex") and NBBO ("best") data. We set the layers parameter to layers=[("quote", "ex"), ("quote", "best")] to classify trades first on the subset "ex" and the remaining trades on the subset "best". Additionally, we have to set ClassicalClassifier(..., features=features) to pass column information to the classifier.
As before, column/feature names must follow our naming conventions.
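The layering idea can be sketched without the library as follows. This is an illustrative assumption about how stacked rules behave, not tclf's actual code: quote_rule and classify_layered are hypothetical helpers, each layer is given as a (bid column, ask column) pair, and the first layer that yields a label wins before a random fallback is applied.

```python
import math
import random


def quote_rule(trade_price, bid, ask):
    """Hypothetical helper: return 1 (buy), -1 (sell), or None (unclassified)."""
    if math.isnan(bid) or math.isnan(ask):
        return None
    mid = 0.5 * (bid + ask)
    if trade_price > mid:
        return 1
    if trade_price < mid:
        return -1
    return None


def classify_layered(row, layers, rng):
    """Hypothetical helper: try each quote pair in order, then label randomly."""
    for bid_key, ask_key in layers:
        label = quote_rule(row["trade_price"], row[bid_key], row[ask_key])
        if label is not None:
            return label  # first layer that can classify the trade wins
    return rng.choice([-1, 1])  # random fallback for unclassified trades


rng = random.Random(0)  # seeded so the fallback is reproducible
layers = [("bid_ex", "ask_ex"), ("bid_best", "ask_best")]

# Fifth row of the example above: the exchange bid is missing,
# so the exchange layer is skipped and the NBBO layer decides.
row = {
    "trade_price": 1.0,
    "bid_ex": math.nan,
    "ask_ex": 1.0,
    "bid_best": 1.0,
    "ask_best": 3.0,
}
label = classify_layered(row, layers, rng)
print(label)  # -1
```

Ordering the layers from the most specific quote source to the most general mirrors the order of the layers list in the example above.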
Other Examples
For more practical examples, see our examples section.
Development
We are using pixi as a dependency management and workflow tool.
pixi install
pixi run postinstall
pixi run test
Citation
If you are using the package in publications, please cite as:
@software{bilz_tclf_2023,
    author = {Bilz, Markus},
    license = {BSD 3},
    month = nov,
    title = {{tclf} -- trade classification with python},
    url = {https://github.com/KarelZe/tclf},
    version = {0.0.1},
    year = {2023}
}
Footnotes
[^1]: