
A Quantitative-research Platform


:newspaper: What's NEW!   :sparkling_heart:

Recently released features

| Feature | Status |
| -- | -- |
| Meta-Learning-based framework & DDG-DA | Released on Jan 10, 2022 |
| Planning-based portfolio optimization | Released on Dec 28, 2021 |
| Release Qlib v0.8.0 | Released on Dec 8, 2021 |
| ADD model | Released on Nov 22, 2021 |
| ADARNN model | Released on Nov 14, 2021 |
| TCN model | Released on Nov 4, 2021 |
| Nested Decision Framework | Released on Oct 1, 2021. Example and Doc |
| Temporal Routing Adaptor (TRA) | Released on July 30, 2021 |
| Transformer & Localformer | Released on July 22, 2021 |
| Release Qlib v0.7.0 | Released on July 12, 2021 |
| TCTS Model | Released on July 1, 2021 |
| Online serving and automatic model rolling | :star: Released on May 17, 2021 |
| DoubleEnsemble Model | Released on Mar 2, 2021 |
| High-frequency data processing example | Released on Feb 5, 2021 |
| High-frequency trading example | Part of code released on Jan 28, 2021 |
| High-frequency data(1min) | Released on Jan 27, 2021 |
| Tabnet Model | Released on Jan 22, 2021 |

Features released before 2021 are not listed here.

Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment.

It contains the full ML pipeline of data processing, model training, and back-testing, and covers the entire chain of quantitative investment: alpha seeking, risk modeling, portfolio optimization, and order execution.

With Qlib, users can easily try ideas to create better Quant investment strategies.

For more details, please refer to our paper "Qlib: An AI-oriented Quantitative Investment Platform".

Plans

New features under development (ordered by estimated release time). Your feedback on these features is very important.

| Feature | Status |
| -- | -- |
| Point-in-Time database | Under review: https://github.com/microsoft/qlib/pull/343 |
| Orderbook database | Under review: https://github.com/microsoft/qlib/pull/744 |

Framework of Qlib

At the module level, Qlib is a platform that consists of the above components. The components are designed as loosely coupled modules, and each component can be used stand-alone.

| Name | Description |
| -- | -- |
| Infrastructure layer | The infrastructure layer provides underlying support for Quant research. DataServer provides a high-performance infrastructure for users to manage and retrieve raw data. Trainer provides a flexible interface to control the training process of models, which enables algorithms to control the training process. |
| Workflow layer | The workflow layer covers the whole workflow of quantitative investment. Information Extractor extracts data for models. Forecast Model focuses on producing all kinds of forecast signals (e.g. alpha, risk) for other modules. With these signals, Decision Generator generates the target trading decisions (i.e. portfolio, orders) to be executed by Execution Env (i.e. the trading market). There may be multiple levels of Trading Agent and Execution Env (e.g. an order executor trading agent and an intraday order execution environment could together behave like an interday trading environment and be nested in a daily portfolio management trading agent and an interday trading environment). |
| Interface layer | The interface layer tries to present a user-friendly interface for the underlying system. The Analyser module provides users with detailed analysis reports of forecasting signals, portfolios, and execution results. |
  • The modules with hand-drawn style are under development and will be released in the future.
  • The modules with dashed borders are highly user-customizable and extendible.

Quick Start

This quick start guide tries to demonstrate that:

  1. It's very easy to build a complete Quant research workflow and try your ideas with Qlib.
  2. Even with public data and simple models, machine learning technologies work very well in practical Quant investment.

Here is a quick demo that shows how to install Qlib and run LightGBM with qrun. Please make sure you have already prepared the data following the instruction.

Installation

The table below lists the Python versions supported by Qlib:

| | install with pip | install from source | plot |
| -- | -- | -- | -- |
| Python 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Python 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Python 3.9 | :x: | :heavy_check_mark: | :x: |

Note:

  1. Conda is suggested for managing your Python environment.
  2. Note that installing cython in Python 3.6 raises errors when installing Qlib from source. Users on Python 3.6 are advised to upgrade to Python 3.7 or to use conda's Python to install Qlib from source.
  3. For Python 3.9, Qlib supports running workflows such as training models, running backtests, and plotting most of the related figures (those included in the notebook). However, plotting model performance is not supported for now; this will be fixed when the dependent packages are upgraded.
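
The support table and notes above can be encoded as a small lookup, which may be handy in a setup script that wants to fail early. This is only a minimal sketch: the names `SUPPORT` and `is_supported` are hypothetical and not part of Qlib, and the table covers only the versions listed in this README.

```python
import sys

# Support matrix copied from the table above (True = supported).
SUPPORT = {
    (3, 7): {"pip": True, "source": True, "plot": True},
    (3, 8): {"pip": True, "source": True, "plot": True},
    (3, 9): {"pip": False, "source": True, "plot": False},
}


def is_supported(version, feature):
    """Return True if `feature` is listed as supported for (major, minor)."""
    row = SUPPORT.get(tuple(version[:2]))
    # Versions missing from the table are treated as unsupported here.
    return bool(row and row.get(feature, False))


if __name__ == "__main__":
    current = sys.version_info[:2]
    for feature in ("pip", "source", "plot"):
        print(feature, is_supported(current, feature))
```

Treating unlisted versions as unsupported is a conservative choice made for this sketch, not a statement about Qlib itself.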

Install with pip

Users can install Qlib with pip using the following command.

  pip install pyqlib

Note: pip will install the latest stable qlib. However, the main branch of qlib is in active development. If you want to test the latest scripts or functions in the main branch, please install qlib with the methods below.

Install from source

Users can also install the latest dev version of Qlib from the source code with the following steps:

  • Before installing Qlib from source, users need to install some dependencies:

    pip install numpy
    pip install --upgrade  cython
    
  • Clone the repository and install Qlib as follows.

    • If you haven't installed qlib by the command pip install pyqlib before:
      git clone https://github.com/microsoft/qlib.git && cd qlib
      python setup.py install
      
    • If you have already installed the stable version by the command pip install pyqlib:
      git clone https://github.com/microsoft/qlib.git && cd qlib
      pip install .
      

    Note: Only the command pip install . can overwrite the stable version installed by pip install pyqlib, while the command python setup.py install can't.

Tip: If you fail to install Qlib or run the examples in your environment, comparing your steps with the CI workflow may help you find the problem.

Data Preparation

Load and prepare data by running the following code:

# get 1d data
python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn

# get 1min data
python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data_1min --region cn --interval 1min

This dataset is created from public data collected by crawler scripts, which have been released in the same repository. Users can create the same dataset with them. See the description of the dataset for details.

Please pay ATTENTION: the data is collected from Yahoo Finance and might not be perfect. We recommend that users prepare their own data if they have a high-quality dataset. For more information, users can refer to the related document.
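
When both datasets are fetched from a script rather than typed by hand, the two commands above can be assembled programmatically. The following is a minimal sketch; `build_get_data_command` is a hypothetical helper, not part of Qlib, and it uses only the flags shown above (`--target_dir`, `--region`, `--interval`). It assumes the script is run from the repository root.

```python
import sys
from pathlib import Path


def build_get_data_command(target_dir, region="cn", interval=None):
    """Build the argument list for scripts/get_data.py (run from the repo root)."""
    cmd = [
        sys.executable, "scripts/get_data.py", "qlib_data",
        # Expand "~" here because no shell is involved when using subprocess.
        "--target_dir", str(Path(target_dir).expanduser()),
        "--region", region,
    ]
    if interval is not None:
        cmd += ["--interval", interval]
    return cmd


daily = build_get_data_command("~/.qlib/qlib_data/cn_data")
minute = build_get_data_command("~/.qlib/qlib_data/cn_data_1min", interval="1min")
print(" ".join(daily))
print(" ".join(minute))
# To actually download, pass either list to subprocess.run(cmd, check=True).
```

Building the command as a list avoids shell quoting issues; the download itself is left out so the sketch has no side effects.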

Automatic update of daily frequency data (from Yahoo Finance)

This step is optional if users only want to try their models and strategies on historical data.

It is recommended that users update the data manually once (--trading_date 2021-05-25) and then set it to update automatically.

For more information, please refer to: yahoo collector

  • Automatic update of data to the "qlib" directory each trading day (Linux)

    • use crontab: crontab -e

    • set up timed tasks:

      * * * * 1-5 python <script path> update_data_to_bin --qlib_data_1d_dir <user data dir>
      
      • script path: scripts/data_collector/yahoo/collector.py
  • Manual update of data

    python scripts/data_collector/yahoo/collector.py update_data_to_bin --qlib_data_1d_dir <user data dir> --trading_date <start date> --end_date <end date>
    
    • trading_date: start of the trading-day range to update
    • end_date: end of the trading-day range (not included)
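
It helps to read the five cron fields in the timed task above as minute, hour, day of month, month, and day of week. With `*` in both the minute and hour fields, the entry matches every minute from Monday to Friday, so a fixed time of day is probably what is wanted in practice. The sketch below is a simplified matcher written for illustration only (it is not cron's implementation and handles just `*`, numbers, ranges, and comma lists); the `0 18 * * 1-5` schedule is a hypothetical alternative, not a value from the Qlib documentation.

```python
from datetime import datetime


def field_matches(field, value):
    """Match one cron field: '*', a number, a range 'a-b', or a comma list."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = map(int, part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr, when):
    """Return True if the five-field expression fires at the minute `when`.

    Simplification: day-of-month and day-of-week are combined with AND,
    which is equivalent to cron's behaviour when day-of-month is '*'.
    """
    minute, hour, day, month, weekday = expr.split()
    cron_weekday = when.isoweekday() % 7  # cron: Sunday = 0, Monday = 1
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )


every_minute = "* * * * 1-5"   # schedule from the text above
once_a_day = "0 18 * * 1-5"    # hypothetical: 18:00 on weekdays

tuesday = datetime(2021, 5, 25, 12, 34)
saturday = datetime(2021, 5, 29, 18, 0)

print(cron_matches(every_minute, tuesday))                        # True
print(cron_matches(once_a_day, tuesday))                          # False
print(cron_matches(once_a_day, datetime(2021, 5, 25, 18, 0)))     # True
print(cron_matches(every_minute, saturday))                       # False
```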

Auto Quant Research Workflow

Qlib provides a tool named qrun to run the whole workflow automatically (including building the dataset, training models, backtesting, and evaluation). You can start an auto quant research workflow and get a graphical report analysis with the following steps:

  1. Quant Research Workflow: Run qrun with the LightGBM workflow config (workflow_config_lightgbm_Alpha158.yaml) as follows.

      cd examples  # Avoid running the program from a directory that contains `qlib`
      qrun benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
    

    If users want to use qrun under debug mode, please use the following command:

    python -m pdb qlib/workflow/cli.py examples/benchmarks/LightGBM/workflow_config_lightgbm_Alpha158.yaml
    

    The result of qrun is as follows. Please refer to Intraday Trading for more details about the result.

    'The following are analysis results of the excess return without cost.'
                           risk
    mean               0.000708
    std                0.005626
    annualized_return  0.178316
    information_ratio  1.996555
    max_drawdown      -0.081806
    'The following are analysis results of the excess return with cost.'
                           risk
    mean               0.000512
    std                0.005626
    annualized_return  0.128982
    information_ratio  1.444287
    max_drawdown      -0.091078
    

    Here are detailed documents for qrun and workflow.

  2. Graphical Reports Analysis: Run examples/workflow_by_code.ipynb with jupyter notebook to get graphical reports

    • Forecasting signal (model prediction) analysis

      • Cumulative Return of groups
      • Return distribution
      • Information Coefficient (IC)
      • Auto Correlation of forecasting signal (model prediction)
    • Portfolio analysis

      • Backtest return
    • Explanation of above results
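
For reference, the risk figures printed by qrun in step 1 are related by simple formulas. The sketch below is not Qlib's code: `risk_summary` is a hypothetical helper, and it assumes 252 periods per year and a drawdown measured on the cumulative sum of daily returns. The first assumption is at least consistent with the printed numbers, since 0.000708 × 252 ≈ 0.178.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization factor


def risk_summary(daily_returns, periods_per_year=TRADING_DAYS):
    """Summarize a series of per-period returns with common risk metrics."""
    mean = statistics.mean(daily_returns)
    std = statistics.stdev(daily_returns)  # sample standard deviation

    # Max drawdown on the cumulative (summed) return curve: the most
    # negative gap between the curve and its running maximum.
    cumulative = 0.0
    peak = float("-inf")
    max_drawdown = 0.0
    for r in daily_returns:
        cumulative += r
        peak = max(peak, cumulative)
        max_drawdown = min(max_drawdown, cumulative - peak)

    return {
        "mean": mean,
        "std": std,
        "annualized_return": mean * periods_per_year,
        "information_ratio": mean / std * math.sqrt(periods_per_year),
        "max_drawdown": max_drawdown,
    }


summary = risk_summary([0.01, -0.02, 0.005, 0.015])
for name, value in summary.items():
    print(f"{name:<18} {value: .6f}")
```

The four-element return series is made up purely to exercise the function; it is not taken from any backtest.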

Building Customized Quant Research Workflow by Code

The automatic workflow may not suit the research workflow of all Quant researchers. To support a flexible Quant research workflow, Qlib also provides a modularized interface to allow researchers to build their own workflow by code. Here is a demo for customized Quant research workflow by code.

Main Challenges & Solutions in Quant Research

Quant investment is a unique scenario with many key challenges to be solved. Currently, Qlib provides solutions for several of them.

Forecasting: Finding Valuable Signals/Patterns

Accurate forecasting of the stock price trend is a very important part of constructing profitable portfolios. However, the huge amount of data in various formats in the financial market makes it challenging to build forecasting models.

An increasing number of SOTA Quant research works/papers, which focus on building forecasting models to mine valuable signals/patterns in complex financial data, are released in Qlib.

Quant Model (Paper) Zoo

Here is a list of models built on Qlib.

PRs for new Quant models are highly welcome.

The performance of each model on the Alpha158 and Alpha360 dataset can be found here.

Run a single model

All the models listed above are runnable with Qlib. Users can find the config files we provide and some details about the model through the benchmarks folder. More information can be retrieved at the model files listed above.

Qlib provides three different ways to run a single model. Users can pick the one that fits their case best:

  • Users can use the tool qrun mentioned above to run a model's workflow based on a config file.

  • Users can create a workflow_by_code python script based on the one listed in the examples folder.

  • Users can use the script run_all_model.py listed in the examples folder to run a model. Here is an example of the specific shell command to be used: python run_all_model.py run --models=lightgbm, where the --models argument can take any number of models listed above (the available models can be found in benchmarks). For more use cases, please refer to the file's docstrings.

    • NOTE: Each baseline has different environment dependencies. Please make sure that your Python version aligns with the requirements (e.g. TFT only supports Python 3.6~3.7 due to the limitation of tensorflow==1.15.0).

Run multiple models

Qlib also provides a script run_all_model.py which can run multiple models for several iterations. (Note: the script only supports Linux for now. Other OSes will be supported in the future. It also doesn't support running the same model multiple times in parallel; this will be fixed in future development.)

The script will create a unique virtual environment for each model, and delete the environments after training. Thus, only experiment results such as IC and backtest results will be generated and stored.

Here is an example of running all the models for 10 iterations:

python run_all_model.py run 10

It also provides the API to run specific models at once. For more use cases, please refer to the file's docstrings.

Adapting to Market Dynamics

Due to the non-stationary nature of the financial market environment, the data distribution may change across periods, which makes the performance of models built on training data decay on future test data. So adapting the forecasting models/strategies to market dynamics is very important to their performance.

Here is a list of solutions built on Qlib.

Quant Dataset Zoo

Datasets play a very important role in Quant. Here is a list of the datasets built on Qlib:

Dataset US Market China Market
Alpha360
Alpha158

Here is a tutorial on building a dataset with Qlib. PRs that build new Quant datasets are highly welcome.

More About Qlib

The detailed documents are organized in docs. Sphinx and the readthedocs theme are required to build the documentation in HTML format.

cd docs/
conda install sphinx sphinx_rtd_theme -y
# Otherwise, you can install them with pip
# pip install sphinx sphinx_rtd_theme
make html

You can also view the latest documentation online directly.

Qlib is in active and continuing development. Our plan is in the roadmap, which is managed as a GitHub project.

Offline Mode and Online Mode

The data server of Qlib can be deployed in either Offline mode or Online mode. The default mode is Offline mode.

Under Offline mode, the data will be deployed locally.

Under Online mode, the data will be deployed as a shared data service. The data and their cache will be shared by all the clients. The data retrieval performance is expected to be improved due to a higher rate of cache hits. It will consume less disk space, too. The documents of the online mode can be found in Qlib-Server. The online mode can be deployed automatically with Azure CLI based scripts. The source code of online data server can be found in Qlib-Server repository.
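
The claim about cache hits can be illustrated with a toy model. This is only a sketch of the general idea and says nothing about how Qlib-Server is implemented: `count_hits` and the request lists are hypothetical, and each "request" is just a key that is computed on first use and served from cache afterwards.

```python
def count_hits(requests_per_client, shared):
    """Count cache hits when clients use one shared cache or one cache each."""
    hits = 0
    shared_cache = set()
    for requests in requests_per_client:
        # In shared mode every client reuses the same cache object.
        cache = shared_cache if shared else set()
        for key in requests:
            if key in cache:
                hits += 1
            else:
                cache.add(key)  # first request computes and stores the result
    return hits


clients = [
    ["close", "volume", "close"],
    ["close", "volume", "open"],
]
print(count_hits(clients, shared=False))  # 1
print(count_hits(clients, shared=True))   # 3
```

With separate caches, the second client recomputes everything the first client already requested; with a shared cache, those requests become hits, and each result is stored only once.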

Performance of Qlib Data Server

The performance of data processing is important to data-driven methods like AI technologies. As an AI-oriented platform, Qlib provides a solution for data storage and data processing. To demonstrate the performance of Qlib data server, we compare it with several other data storage solutions.

We evaluate the performance of several storage solutions by finishing the same task, which creates a dataset (14 features/factors) from the basic OHLCV daily data of a stock market (800 stocks each day from 2007 to 2020). The task involves data queries and processing.

| | HDF5 | MySQL | MongoDB | InfluxDB | Qlib -E -D | Qlib +E -D | Qlib +E +D |
| -- | -- | -- | -- | -- | -- | -- | -- |
| Total (1CPU) (seconds) | 184.4±3.7 | 365.3±7.5 | 253.6±6.7 | 368.2±3.6 | 147.0±8.8 | 47.6±1.0 | 7.4±0.3 |
| Total (64CPU) (seconds) | | | | | | 8.8±0.6 | 4.2±0.2 |
  • +(-)E indicates with (out) ExpressionCache
  • +(-)D indicates with (out) DatasetCache

Most general-purpose databases take too much time to load data. After looking into the underlying implementation, we find that data go through too many layers of interfaces and unnecessary format transformations in general-purpose database solutions. Such overheads greatly slow down the data loading process. Qlib data are stored in a compact format, which can be combined efficiently into arrays for scientific computation.
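
To make the comparison concrete, the single-CPU timings can be turned into speedup ratios relative to Qlib with both caches enabled. The snippet below is a small worked calculation using only the mean values from the table (the ± spreads are ignored); the variable names are illustrative.

```python
# Single-CPU timings in seconds, copied from the table above (means only).
timings = {
    "HDF5": 184.4,
    "MySQL": 365.3,
    "MongoDB": 253.6,
    "InfluxDB": 368.2,
    "Qlib -E -D": 147.0,
    "Qlib +E -D": 47.6,
    "Qlib +E +D": 7.4,
}

baseline = timings["Qlib +E +D"]
# Ratio > 1 means the solution is that many times slower than the baseline.
speedups = {name: seconds / baseline for name, seconds in timings.items()}

for name, ratio in speedups.items():
    print(f"{name:<12} {ratio:5.1f}x")
```

On these numbers, HDF5 takes roughly 25 times as long as Qlib with ExpressionCache and DatasetCache, and Qlib without any cache takes roughly 20 times as long.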

Related Reports

Contact Us

  • If you have any issues, please create an issue here or send messages in gitter.
  • If you want to make contributions to Qlib, please create pull requests.
  • For other reasons, you are welcome to contact us by email (qlib@microsoft.com).
    • We are recruiting new members (both FTEs and interns). Your resumes are welcome!

Join IM discussion groups:

Gitter

Contributing

We appreciate all contributions and thank all the contributors!

Before we released Qlib as an open-source project on GitHub in Sep 2020, Qlib was an internal project in our group. Unfortunately, the internal commit history was not kept. Many members of our group have also contributed a lot to Qlib, including Ruihua Wang, Yinda Zhang, Haisu Yu, Shuyu Wang, Bochen Pang, and Dong Zhou. Special thanks to Dong Zhou for his initial version of Qlib.

Guidance

This project welcomes contributions and suggestions.
Here are some code standards for submitting a pull request.

Making contributions is not hard. Solving an issue (maybe just answering a question raised in the issues list or gitter), reporting or fixing a bug, improving the documents, and even fixing a typo are important contributions to Qlib.

For example, if you want to contribute to Qlib's document/code, you can follow the steps in the figure below.

If you don't know how to start to contribute, you can refer to the following examples.

| Type | Examples |
| -- | -- |
| Solving issues | Answer a question; report or fix a bug |
| Docs | Improve docs quality; fix a typo |
| Feature | Implement a requested feature like this; refactor interfaces |
| Dataset | Add a dataset |
| Models | Implement a new model |

If you would like to become one of Qlib's maintainers to contribute more (e.g. help merge PRs, triage issues), please contact us by email (qlib@microsoft.com). We are glad to help you set the right permissions.

Licence

Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the right to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
