MLfin.py is an advanced machine learning toolbox for financial applications.

Project description

MLfin.py is an advanced machine learning toolbox for financial applications. The main idea is to take the work and code snippets of Dr. Marcos López de Prado and build a modern, Pythonic package on top of current libraries such as NumPy, pandas, Numba, and scikit-learn. The work is inspired by the MlFinLab library by Hudson and Thames. Unfortunately, that library is closed-source; because I believe in the power of open source projects, I was motivated to build this package from the ground up.

By leveraging best practices in Python packaging, a modern documentation style, and comprehensive examples, MLfin.py aims to be a useful tool for quant researchers, algorithmic traders, data scientists, and finance students who want to reproduce complex data transformation, labeling, sampling, and feature engineering techniques with ease.

Installation

Installation can be done via pip:

pip install mlfinpy

As a matter of best practice, it is good to do this with a dependency manager. I suggest you set yourself up with Poetry, then within a new Poetry project run:

poetry add mlfinpy
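
Whichever route you take, a quick way to confirm that the install landed in the active environment is to ask the standard library for the installed version. This is a minimal sketch: `installed_version` is a hypothetical helper written for this example, not part of mlfinpy, and it only assumes the distribution name `mlfinpy` used in the commands above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("mlfinpy")
    if found is None:
        print("mlfinpy is not installed in this environment")
    else:
        print(f"mlfinpy {found} is installed")
```

If the script reports that the package is missing, the usual cause is that pip or Poetry installed into a different virtual environment than the one running the script.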

For developers

If you are planning on using Mlfinpy as a starting template for significant modifications, it probably makes sense to clone the repository and to just use the source code:

git clone https://github.com/baobach/mlfinpy

Alternatively, if you still want the convenience of a global from mlfinpy import x, you should try:

pip install -e "git+https://github.com/baobach/mlfinpy.git#egg=mlfinpy"

Work with HFT Data

In practice, testing the code snippets from the first three chapters of the book is challenging because they rely on HFT data to create the new financial data structures. HFT data is very difficult to source; TickData LLC provides the full history of S&P 500 E-mini futures tick data, which is available for purchase.

I am not affiliated with TickData in any way but would like to recommend their service. The full history costs $750 and is worth every penny. They have done a great job of cleaning the data and providing it in a user-friendly manner.

Download Sources

TickData offers about 20 days' worth of raw tick data, which can be downloaded from their website. For those interested in working with two years of sample tick, volume, and dollar bars, these are provided in the research repo. This set should be enough to work through a few implementations of the code.
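
To make the idea of a dollar bar concrete, here is a minimal, standard-library sketch of how raw ticks could be grouped into bars once a fixed dollar amount has traded. It is an illustration only, not the mlfinpy implementation: the `Bar` class, the `dollar_bars` function, and the `(price, size)` tick layout are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float
    dollar_value: float
    n_ticks: int


def dollar_bars(ticks: Iterable[Tuple[float, float]], threshold: float) -> List[Bar]:
    """Group (price, size) ticks into bars, closing a bar once the
    accumulated traded value (price * size) reaches `threshold`."""
    bars: List[Bar] = []
    prices: List[float] = []
    volume = 0.0
    dollars = 0.0
    for price, size in ticks:
        prices.append(price)
        volume += size
        dollars += price * size
        if dollars >= threshold:
            bars.append(
                Bar(prices[0], max(prices), min(prices), prices[-1],
                    volume, dollars, len(prices))
            )
            # Reset the accumulators for the next bar.
            prices, volume, dollars = [], 0.0, 0.0
    # Ticks left over after the last completed bar are dropped.
    return bars


if __name__ == "__main__":
    sample = [(100.0, 5), (101.0, 5), (99.0, 10), (100.0, 10)]
    for bar in dollar_bars(sample, threshold=1000.0):
        print(bar)
```

Accumulating `size` instead of `price * size` would give volume bars, and counting ticks would give tick bars; the sampling rule is the only part that changes.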

Searching for free tick data can be a challenging task. The following three sources may help:

  1. Dukascopy. Offers free historical tick data for some futures, though you do have to register.

  2. Most crypto exchanges offer tick data but not historical data (see the Binance API), so you would have to run a script for a few days to collect it.

  3. Blog Post: How and why I got 75Gb of free foreign exchange “Tick” data.
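
For the second option above, the collection script can be as simple as a polling loop that appends unseen trades to a CSV file. The sketch below is deliberately exchange-agnostic: `collect_trades`, the `fetch_recent` callable, and the `id`, `time`, `price`, and `qty` field names are hypothetical placeholders, so you would need to supply a function that wraps your exchange's actual recent-trades endpoint and map its fields accordingly.

```python
import csv
import time
from typing import Callable, Dict, Iterable, Set


def collect_trades(
    fetch_recent: Callable[[], Iterable[Dict[str, object]]],
    path: str,
    polls: int,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll `fetch_recent` `polls` times and append unseen trades to a CSV file.

    Returns the number of rows written. Trades are de-duplicated by their
    "id" field, since consecutive polls usually overlap.
    """
    seen: Set[object] = set()
    written = 0
    with open(path, "a", newline="") as fh:
        writer = csv.writer(fh)
        for i in range(polls):
            for trade in fetch_recent():
                if trade["id"] in seen:
                    continue
                seen.add(trade["id"])
                writer.writerow(
                    [trade["id"], trade["time"], trade["price"], trade["qty"]]
                )
                written += 1
            fh.flush()  # keep the file usable if the script is interrupted
            if i < polls - 1:
                sleep(pause_seconds)
    return written
```

The de-duplication set lives in memory, so a run lasting several days would eventually want to prune old ids or rely on a monotonically increasing trade id instead.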

Project principles and design decisions

  • It should be easy to swap out individual components of each module with the user’s proprietary improvements.

  • Usability is everything: it is better to be self-explanatory than consistent.

  • The goal is to create a framework for building a robust and functional library for machine learning applications.

  • Everything that has been implemented should be tested and formatted to the latest requirements.

  • Inline documentation is good: dedicated (separate) documentation is better. The two are not mutually exclusive.

  • Formatting should never get in the way of good code: because of this, I have deferred all formatting decisions to Black, Flake8, and Isort.

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
