init commit
20
docs/Makefile
Normal file
@@ -0,0 +1,20 @@
|
||||
# Minimal makefile for Sphinx documentation
|
||||
#
|
||||
|
||||
# You can set these variables from the command line.
|
||||
SPHINXOPTS =
|
||||
SPHINXBUILD = python3 -msphinx
|
||||
SPHINXPROJ = Quantlab
|
||||
SOURCEDIR = .
|
||||
BUILDDIR = _build
|
||||
|
||||
# Put it first so that "make" without argument is like "make help".
|
||||
help:
|
||||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
|
||||
|
||||
.PHONY: help Makefile
|
||||
|
||||
# Catch-all target: route all unknown targets to Sphinx using the new
|
||||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
|
||||
%: Makefile
|
||||
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
|
||||
BIN
docs/_static/img/analysis/analysis_model_IC.png
vendored
Normal file
|
After Width: | Height: | Size: 37 KiB |
BIN
docs/_static/img/analysis/analysis_model_NDQ.png
vendored
Normal file
|
After Width: | Height: | Size: 24 KiB |
BIN
docs/_static/img/analysis/analysis_model_auto_correlation.png
vendored
Normal file
|
After Width: | Height: | Size: 48 KiB |
BIN
docs/_static/img/analysis/analysis_model_cumulative_return.png
vendored
Normal file
|
After Width: | Height: | Size: 64 KiB |
BIN
docs/_static/img/analysis/analysis_model_long_short.png
vendored
Normal file
|
After Width: | Height: | Size: 16 KiB |
BIN
docs/_static/img/analysis/analysis_model_monthly_IC.png
vendored
Normal file
|
After Width: | Height: | Size: 17 KiB |
BIN
docs/_static/img/analysis/analysis_model_top_bottom_turnover.png
vendored
Normal file
|
After Width: | Height: | Size: 59 KiB |
BIN
docs/_static/img/analysis/cumulative_return_buy.png
vendored
Normal file
|
After Width: | Height: | Size: 41 KiB |
BIN
docs/_static/img/analysis/cumulative_return_buy_minus_sell.png
vendored
Normal file
|
After Width: | Height: | Size: 44 KiB |
BIN
docs/_static/img/analysis/cumulative_return_hold.png
vendored
Normal file
|
After Width: | Height: | Size: 42 KiB |
BIN
docs/_static/img/analysis/cumulative_return_sell.png
vendored
Normal file
|
After Width: | Height: | Size: 52 KiB |
BIN
docs/_static/img/analysis/rank_label_buy.png
vendored
Normal file
|
After Width: | Height: | Size: 92 KiB |
BIN
docs/_static/img/analysis/rank_label_hold.png
vendored
Normal file
|
After Width: | Height: | Size: 70 KiB |
BIN
docs/_static/img/analysis/rank_label_sell.png
vendored
Normal file
|
After Width: | Height: | Size: 100 KiB |
BIN
docs/_static/img/analysis/report.png
vendored
Normal file
|
After Width: | Height: | Size: 148 KiB |
BIN
docs/_static/img/analysis/risk_analysis_annual.png
vendored
Normal file
|
After Width: | Height: | Size: 50 KiB |
BIN
docs/_static/img/analysis/risk_analysis_bar.png
vendored
Normal file
|
After Width: | Height: | Size: 12 KiB |
BIN
docs/_static/img/analysis/risk_analysis_mdd.png
vendored
Normal file
|
After Width: | Height: | Size: 54 KiB |
BIN
docs/_static/img/analysis/risk_analysis_sharpe.png
vendored
Normal file
|
After Width: | Height: | Size: 53 KiB |
BIN
docs/_static/img/analysis/risk_analysis_std.png
vendored
Normal file
|
After Width: | Height: | Size: 51 KiB |
BIN
docs/_static/img/analysis/score_ic.png
vendored
Normal file
|
After Width: | Height: | Size: 99 KiB |
BIN
docs/_static/img/framework.png
vendored
Normal file
|
After Width: | Height: | Size: 205 KiB |
BIN
docs/_static/img/topk_drop.png
vendored
Normal file
|
After Width: | Height: | Size: 50 KiB |
104
docs/advanced/alpha.rst
Normal file
@@ -0,0 +1,104 @@
|
||||
.. _alpha:
|
||||
===========================
|
||||
Building Formulaic Alphas
|
||||
===========================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
===================
|
||||
|
||||
In quantitative trading practice, designing novel factors that can explain and predict future asset returns is of vital importance to the profitability of a strategy. Such factors are usually called alpha factors, or alphas for short.
|
||||
|
||||
|
||||
A formulaic alpha, as the name suggests, is a kind of alpha that can be presented as a formula or a mathematical expression.
|
||||
|
||||
|
||||
Building Formulaic Alphas in ``Qlib``
|
||||
======================================
|
||||
|
||||
In ``Qlib``, users can easily build formulaic alphas.
|
||||
|
||||
Example
|
||||
-----------------
|
||||
|
||||
`MACD`, short for moving average convergence/divergence, is a formulaic alpha used in technical analysis of stock prices. It is designed to reveal changes in the strength, direction, momentum, and duration of a trend in a stock's price.
|
||||
|
||||
`MACD` can be presented as the following formula:
|
||||
|
||||
.. math::
|
||||
|
||||
MACD = 2\times (DIF-DEA)
|
||||
|
||||
.. note::
|
||||
|
||||
`DIF` is the differential value: the 12-period EMA minus the 26-period EMA, normalized by the close price.
|
||||
|
||||
.. math::
|
||||
|
||||
DIF = \frac{EMA(CLOSE, 12) - EMA(CLOSE, 26)}{CLOSE}
|
||||
|
||||
`DEA` is the 9-period EMA of `DIF`, normalized by the close price.
|
||||
|
||||
.. math::
|
||||
|
||||
DEA = \frac{EMA(DIF, 9)}{CLOSE}
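The formulas above can be checked numerically outside of ``Qlib``. The following is a minimal pure-Python sketch, not ``Qlib``'s implementation: the helper names ``ema`` and ``macd`` are hypothetical, and the EMA is assumed to use the common smoothing factor 2/(N+1), seeded with the first observation.

```python
def ema(values, span):
    """Exponential moving average (assumed smoothing factor: 2 / (span + 1))."""
    alpha = 2.0 / (span + 1)
    out, prev = [], None
    for v in values:
        # Seed with the first observation, then blend each new value in.
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out


def macd(close, fast=12, slow=26, signal=9):
    """Return (DIF, DEA, MACD) lists following the formulas in this section."""
    ema_fast, ema_slow = ema(close, fast), ema(close, slow)
    # DIF = (EMA(CLOSE, 12) - EMA(CLOSE, 26)) / CLOSE
    dif = [(f - s) / c for f, s, c in zip(ema_fast, ema_slow, close)]
    # DEA = EMA(DIF, 9) / CLOSE
    dea = [e / c for e, c in zip(ema(dif, signal), close)]
    # MACD = 2 * (DIF - DEA)
    macd_values = [2 * (d - e) for d, e in zip(dif, dea)]
    return dif, dea, macd_values


close = [10.0, 10.5, 11.0, 10.8, 11.2, 11.5]  # made-up prices for illustration
dif, dea, macd_values = macd(close)
print(macd_values[-1])
```

On a constant price series all three quantities stay at zero, which is a quick sanity check for the sketch.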
|
||||
|
||||
Users can use ``Data Handler`` to build the formulaic alpha `MACD` in ``Qlib``:
|
||||
|
||||
.. note:: Users need to initialize ``Qlib`` with `qlib.init` first. Please refer to `initialization <initialization.rst>`_.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> from qlib.contrib.estimator.handler import QLibDataHandler
|
||||
>>> fields = ['(EMA($close, 12) - EMA($close, 26))/$close - EMA((EMA($close, 12) - EMA($close, 26))/$close, 9)/$close'] # DIF - DEA, i.e. MACD without the constant factor 2
|
||||
>>> names = ['MACD']
|
||||
>>> labels = ['Ref($vwap, -2)/Ref($vwap, -1) - 1'] # label
|
||||
>>> label_names = ['LABEL']
|
||||
>>> data_handler = QLibDataHandler(start_date='2010-01-01', end_date='2017-12-31', fields=fields, names=names, labels=labels, label_names=label_names)
|
||||
>>> TRAINER_CONFIG = {
|
||||
... "train_start_date": "2007-01-01",
|
||||
... "train_end_date": "2014-12-31",
|
||||
... "validate_start_date": "2015-01-01",
|
||||
... "validate_end_date": "2016-12-31",
|
||||
... "test_start_date": "2017-01-01",
|
||||
... "test_end_date": "2020-08-01",
|
||||
... }
|
||||
>>> feature_train, label_train, feature_validate, label_validate, feature_test, label_test = data_handler.get_split_data(**TRAINER_CONFIG)
|
||||
>>> print(feature_train, label_train)
|
||||
MACD
|
||||
instrument datetime
|
||||
SH600004 2012-01-04 -0.030853
|
||||
2012-01-05 -0.030452
|
||||
2012-01-06 -0.028252
|
||||
2012-01-09 -0.024507
|
||||
2012-01-10 -0.019744
|
||||
... ...
|
||||
SZ300273 2014-12-25 0.031339
|
||||
2014-12-26 0.029695
|
||||
2014-12-29 0.025577
|
||||
2014-12-30 0.020493
|
||||
2014-12-31 0.017089
|
||||
|
||||
[605882 rows x 1 columns]
|
||||
label
|
||||
instrument datetime
|
||||
SH600004 2012-01-04 0.003021
|
||||
2012-01-05 0.017434
|
||||
2012-01-06 0.015490
|
||||
2012-01-09 0.002324
|
||||
2012-01-10 -0.002542
|
||||
... ...
|
||||
SZ300273 2014-12-25 -0.032454
|
||||
2014-12-26 -0.016638
|
||||
2014-12-29 0.008263
|
||||
2014-12-30 -0.011985
|
||||
2014-12-31 0.047797
|
||||
|
||||
[605882 rows x 1 columns]
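The label expression used above, ``Ref($vwap, -2)/Ref($vwap, -1) - 1``, relies on ``Ref`` with a negative offset referring to future data. As a plain-Python sketch of that reading, the label on day *t* is the return from day *t+1* to day *t+2*. The helper names ``ref`` and ``label`` are hypothetical, and this is not how ``Qlib`` evaluates expressions internally.

```python
def ref(series, n):
    """Shift a list: n > 0 looks n steps back, n < 0 looks |n| steps ahead.

    Positions without data are filled with None.
    """
    size = len(series)
    return [series[i - n] if 0 <= i - n < size else None for i in range(size)]


def label(vwap):
    """Ref(vwap, -2) / Ref(vwap, -1) - 1, with None where the future is unknown."""
    ahead2, ahead1 = ref(vwap, -2), ref(vwap, -1)
    return [
        None if a2 is None or a1 is None else a2 / a1 - 1
        for a2, a1 in zip(ahead2, ahead1)
    ]


vwap = [10.0, 10.0, 11.0, 12.1]  # made-up prices for illustration
print(label(vwap))
```

The last two positions have no label because the required future prices do not exist yet, which is why the tail of a dataset usually has to be dropped before training.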
|
||||
|
||||
Reference
|
||||
===========
|
||||
|
||||
To know more about ``Data Handler``, please refer to `Data Handler <../component/data.html>`_
|
||||
|
||||
To know more about ``Data Api``, please refer to `Data Api <../component/data.html>`_
|
||||
2
docs/changelog/changelog.rst
Normal file
@@ -0,0 +1,2 @@
|
||||
.. include:: ../../CHANGES.rst
|
||||
|
||||
106
docs/component/backtest.rst
Normal file
@@ -0,0 +1,106 @@
|
||||
.. _backtest:
|
||||
============================================
|
||||
Intraday Trading: Model&Strategy Testing
|
||||
============================================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
===================
|
||||
|
||||
``Intraday Trading`` is designed to test models and strategies, helping users check the performance of a custom model or strategy.
|
||||
|
||||
|
||||
.. note::
|
||||
|
||||
``Intraday Trading`` uses ``Order Executor`` to trade and execute the orders output by ``Interday Strategy``. ``Order Executor`` is a component in `Qlib Framework <../introduction/introduction.html#framework>`_ that executes orders. ``Vwap Executor`` and ``Close Executor`` are currently supported by ``Qlib``. In the future, ``Qlib`` will also support ``HighFreq Executor``.
|
||||
|
||||
|
||||
|
||||
Example
|
||||
===========================
|
||||
|
||||
Users need to generate a prediction score (a pandas DataFrame) with MultiIndex<instrument, datetime> and a `score` column, and to assign a strategy for the backtest. If no strategy is assigned,
a `TopkDropoutStrategy` with `(topk=50, n_drop=5, risk_degree=0.95, limit_threshold=0.0095)` will be used.
If the ``Strategy`` module is not the part users are interested in, `TopkDropoutStrategy` is enough.
|
||||
|
||||
A simple example with the default strategy is as follows.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from qlib.contrib.evaluate import backtest
|
||||
# pred_score is the prediction score
|
||||
report, positions = backtest(pred_score, topk=50, n_drop=5, verbose=False, limit_threshold=0.0095)
|
||||
|
||||
To know more about backtesting with a specific strategy, please refer to `Strategy <strategy.html>`_.
|
||||
|
||||
To know more about the prediction score `pred_score` output by ``Model``, please refer to `Interday Model: Model Training & Prediction <model.html>`_.
|
||||
|
||||
Prediction Score
|
||||
-----------------
|
||||
|
||||
The prediction score is a pandas DataFrame. Its index is <instrument(str), datetime(pd.Timestamp)> and it must
contain a `score` column.
|
||||
|
||||
A prediction sample is shown as follows.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
instrument datetime score
|
||||
SH600000 2019-01-04 -0.505488
|
||||
SZ002531 2019-01-04 -0.320391
|
||||
SZ000999 2019-01-04 0.583808
|
||||
SZ300569 2019-01-04 0.819628
|
||||
SZ001696 2019-01-04 -0.137140
|
||||
... ...
|
||||
SZ000996 2019-04-30 -1.027618
|
||||
SH603127 2019-04-30 0.225677
|
||||
SH603126 2019-04-30 0.462443
|
||||
SH603133 2019-04-30 -0.302460
|
||||
SZ300760 2019-04-30 -0.126383
|
||||
|
||||
The ``Model`` module can make predictions. Please refer to `Model <model.html>`_.
|
||||
|
||||
Backtest Result
|
||||
------------------
|
||||
|
||||
The backtest results are in the following form:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
sub_bench mean 0.000662
|
||||
std 0.004487
|
||||
annual 0.166720
|
||||
sharpe 2.340526
|
||||
mdd -0.080516
|
||||
sub_cost mean 0.000577
|
||||
std 0.004482
|
||||
annual 0.145392
|
||||
sharpe 2.043494
|
||||
mdd -0.083584
|
||||
|
||||
- `sub_bench`
|
||||
Returns of the portfolio without deduction of fees
|
||||
|
||||
- `sub_cost`
|
||||
Returns of the portfolio with deduction of fees
|
||||
|
||||
- `mean`
|
||||
Mean value of the returns sequence (difference sequence of assets).
|
||||
|
||||
- `std`
|
||||
Standard deviation of the returns sequence (difference sequence of assets).
|
||||
|
||||
- `annual`
|
||||
Average annualized returns of the portfolio.
|
||||
|
||||
- `ir`
|
||||
Information Ratio, please refer to `Information Ratio – IR <https://www.investopedia.com/terms/i/informationratio.asp>`_.
|
||||
|
||||
- `mdd`
|
||||
Maximum Drawdown, please refer to `Maximum Drawdown (MDD) <https://www.investopedia.com/terms/m/maximum-drawdown-mdd.asp>`_.
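To make the metrics above concrete, here is a minimal sketch that derives them from a list of daily returns. It is not ``Qlib``'s implementation: the function name ``summarize_returns``, the 252-day annualization factor, the population standard deviation, and computing the drawdown on the cumulative sum of returns are all assumptions made for illustration.

```python
import math


def summarize_returns(returns, periods_per_year=252):
    """Summarize a sequence of per-period returns (assumed to be daily)."""
    n = len(returns)
    mean = sum(returns) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in returns) / n)

    # Maximum drawdown: largest drop of the cumulative return below its running peak.
    cumulative, peak, mdd = 0.0, float("-inf"), 0.0
    for r in returns:
        cumulative += r
        peak = max(peak, cumulative)
        mdd = min(mdd, cumulative - peak)

    return {
        "mean": mean,
        "std": std,
        "annual": mean * periods_per_year,
        # Ratio of mean to volatility, annualized (assumed definition).
        "ir": mean / std * math.sqrt(periods_per_year) if std else float("nan"),
        "mdd": mdd,
    }


stats = summarize_returns([0.01, -0.02, 0.03, -0.01])  # made-up returns
print(stats)
```

Applying the same function to the returns before and after fees would give the two groups of numbers reported as `sub_bench` and `sub_cost`.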
|
||||
|
||||
|
||||
Reference
|
||||
==============
|
||||
|
||||
To know more about ``Intraday Trading``, please refer to `Backtest API <../reference/api.html>`_.
|
||||
333
docs/component/data.rst
Normal file
@@ -0,0 +1,333 @@
|
||||
.. _data:
|
||||
================================
|
||||
Data Layer: Data Framework&Usage
|
||||
================================
|
||||
|
||||
Introduction
|
||||
============================
|
||||
|
||||
``Data Layer`` is designed to download raw data, retrieve data, construct datasets and get frequently-used data.
|
||||
|
||||
Also, users can easily build formulaic alphas with ``Data Layer``. If users are interested in formulaic alphas, please refer to `Building Formulaic Alphas <../advanced/alpha.html>`_.
|
||||
|
||||
The ``Data Layer`` framework includes four components as follows.
|
||||
|
||||
- Raw Data
|
||||
- Data API
|
||||
- Data Handler
|
||||
- Cache
|
||||
|
||||
|
||||
|
||||
Raw Data
|
||||
============================
|
||||
|
||||
``Qlib`` provides the script ``scripts/get_data.py`` to download the raw data used to initialize the qlib package. Please refer to `Initialization <../start/initialization.rst>`_.
|
||||
|
||||
When ``Qlib`` is initialized, users can choose china-stock mode or US-stock mode. Please refer to `Initialization <../start/initialization.rst>`_.
|
||||
|
||||
China-Stock Market Mode
|
||||
--------------------------------
|
||||
|
||||
If users use ``Qlib`` in china-stock mode, china-stock data is required. The script ``scripts/get_data.py`` can be used to download china-stock data. To use ``Qlib`` in china-stock mode, users need to take the following steps.
|
||||
|
||||
- Download data in qlib format
|
||||
Run the following command to download china-stock data in qlib format.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
python scripts/get_data.py qlib_data_cn --target_dir ~/.qlib/qlib_data/cn_data
|
||||
|
||||
Users can find china-stock data in qlib format in the '~/.qlib/qlib_data/cn_data' directory.
|
||||
|
||||
- Initialize ``Qlib`` in china-stock mode
|
||||
Users only need to initialize ``Qlib`` as follows.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from qlib.config import REG_CN
|
||||
qlib.init(provider_uri='~/.qlib/qlib_data/cn_data', region=REG_CN)
|
||||
|
||||
|
||||
US-Stock Market Mode
|
||||
-------------------------
|
||||
If users use ``Qlib`` in US-stock mode, US-stock data is required. ``Qlib`` does not provide a script to download US-stock data. To use ``Qlib`` in US-stock mode, users need to take the following steps.
|
||||
|
||||
- Prepare data in csv format
|
||||
Users need to prepare US-stock data in csv format by themselves, using the same format as the china-stock data in csv format. For a reference of the format, download the china-stock data in csv format as follows.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
python scripts/get_data.py csv_data_cn --target_dir ~/.qlib/csv_data/cn_data
|
||||
|
||||
|
||||
- Convert data from csv format to ``Qlib`` format
|
||||
``Qlib`` provides the script ``scripts/dump_bin.py`` to convert data from csv format to qlib format.
|
||||
Assuming that the users store the US-stock data in csv format in path '~/.qlib/csv_data/us_data', they need to execute the following command to convert the data from csv format to ``Qlib`` format:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
python scripts/dump_bin.py dump --csv_path ~/.qlib/csv_data/us_data --qlib_dir ~/.qlib/qlib_data/us_data --include_fields open,close,high,low,volume,factor
|
||||
|
||||
- Initialize ``Qlib`` in US-stock mode
|
||||
Users only need to initialize ``Qlib`` as follows.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from qlib.config import REG_US
|
||||
qlib.init(provider_uri='~/.qlib/qlib_data/us_data', region=REG_US)
|
||||
|
||||
|
||||
Please refer to `Script API <../reference/api.html>`_ for more details.
|
||||
|
||||
Data API
|
||||
========================
|
||||
|
||||
Data Retrieval
|
||||
---------------
|
||||
Users can use the APIs in ``qlib.data`` to retrieve data. Please refer to `Data Retrieval <../start/getdata.html>`_.
|
||||
|
||||
Feature
|
||||
------------------
|
||||
|
||||
``Qlib`` provides `Feature` and `ExpressionOps` to fetch features according to users' needs.
|
||||
|
||||
- `Feature`
|
||||
Load data from data provider.
|
||||
|
||||
- `ExpressionOps`
|
||||
`ExpressionOps` uses operators for feature construction.
|
||||
To know more about ``Operator``, please refer to `Operator API <../reference/api.html>`_.
|
||||
|
||||
To know more about ``Feature``, please refer to `Feature API <../reference/api.html>`_.
|
||||
|
||||
Filter
|
||||
-------------------
|
||||
``Qlib`` provides `NameDFilter` and `ExpressionDFilter` to filter instruments according to users' needs.
|
||||
|
||||
- `NameDFilter`
|
||||
Name dynamic instrument filter. Filter the instruments based on a regulated name format. A name rule regular expression is required.
|
||||
|
||||
- `ExpressionDFilter`
|
||||
Expression dynamic instrument filter. Filter the instruments based on a certain expression. An expression rule indicating a certain feature field is required.
|
||||
|
||||
- `basic features filter`: rule_expression = '$close/$open>5'
|
||||
- `cross-sectional features filter` : rule_expression = '$rank($close)<10'
|
||||
- `time-sequence features filter`: rule_expression = 'Ref($close, 3)>100'
|
||||
|
||||
To know more about ``Filter``, please refer to `Filter API <../reference/api.html>`_.
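As a rough illustration of the two filter ideas, the sketch below applies a name rule (a regular expression) and an expression-style rule (a close/open ratio threshold) to plain Python data. The function names and the data layout are hypothetical; `NameDFilter` and `ExpressionDFilter` themselves operate on ``Qlib`` instruments and expressions, and their real interfaces are documented in the API reference.

```python
import re


def filter_by_name(instruments, name_rule_re):
    """Keep instruments whose code fully matches the regular expression."""
    pattern = re.compile(name_rule_re)
    return [code for code in instruments if pattern.fullmatch(code)]


def filter_by_ratio(bars, threshold):
    """Keep instruments whose close/open ratio exceeds the threshold.

    `bars` maps an instrument code to an (open, close) pair.
    """
    return [code for code, (open_, close) in bars.items() if close / open_ > threshold]


instruments = ["SH600000", "SZ000001", "SH600004"]
print(filter_by_name(instruments, r"SH[0-9]{6}"))

bars = {"SH600000": (10.0, 10.5), "SZ000001": (20.0, 19.0)}  # made-up prices
print(filter_by_ratio(bars, 1.0))
```

Requiring a full match rather than a prefix match is a design choice of this sketch; a looser rule would only need ``pattern.match`` instead.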
|
||||
|
||||
|
||||
API
|
||||
-------------
|
||||
|
||||
To know more about ``Data Api``, please refer to `Data Api <../reference/api.html>`_.
|
||||
|
||||
Data Handler
|
||||
=================
|
||||
|
||||
``Data Handler`` is a part of ``estimator`` and can also be used as a single module.
|
||||
|
||||
``Data Handler`` can be used to load raw data, prepare feature and label columns, preprocess data (standardization, NaN removal, etc.), and split the data into training, validation, and test sets. It is a subclass of ``qlib.contrib.estimator.handler.BaseDataHandler``, whose interfaces are described below.
|
||||
|
||||
Base Class & Interface
|
||||
----------------------
|
||||
|
||||
Qlib provides a base class `qlib.contrib.estimator.BaseDataHandler <../reference/api.html#class-qlib.contrib.estimator.BaseDataHandler>`_, which provides the following interfaces:
|
||||
|
||||
- `setup_feature`
|
||||
Implement the interface to load the data features.
|
||||
|
||||
- `setup_label`
|
||||
Implement the interface to load the data labels and calculate the user's labels.
|
||||
|
||||
- `setup_processed_data`
|
||||
Implement the interface for data preprocessing, such as preparing feature columns, discarding blank lines, and so on.
|
||||
|
||||
Qlib also provides two functions to help users initialize the data handler. Users can override them as needed.
|
||||
|
||||
- `_init_kwargs`
|
||||
Users can initialize the kwargs of the data handler in this function. Some kwargs may be used when initializing the raw df.
Kwargs are the other attributes in data.args, such as dropna_label and dropna_feature.
|
||||
|
||||
- `_init_raw_df`
|
||||
Users can initialize the raw df, feature names, and label names of the data handler in this function.
If the indexes of the feature df and the label df are not the same, users need to override this method to merge them (e.g. inner, left, or right merge).
|
||||
|
||||
If users want to load features and labels by config, they can inherit ``qlib.contrib.estimator.handler.ConfigDataHandler``. ``Qlib`` also provides some preprocessing methods in this subclass.
|
||||
If users want to use qlib data, `QLibDataHandler` is recommended. Users can inherit their custom class from `QLibDataHandler`, which is also a subclass of `ConfigDataHandler`.
|
||||
|
||||
|
||||
Usage
|
||||
--------------
|
||||
'Data Handler' can be used as a single module, which provides the following methods:
|
||||
|
||||
- `get_split_data`
|
||||
- According to the start and end dates, return features and labels of the pandas DataFrame type used for the 'Model'
|
||||
|
||||
- `get_rolling_data`
|
||||
- According to the start and end dates, and `rolling_period`, an iterator is returned, which can be used to traverse the features and labels used for rolling.
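The date-based split performed by `get_split_data` can be pictured with the following sketch. It only illustrates the idea on a list of ``(date, value)`` pairs: the function ``split_by_date`` is hypothetical, the boundary dates are assumed to be inclusive, and the real method returns pandas DataFrames of features and labels.

```python
from datetime import date


def split_by_date(rows, config):
    """Split (date, value) rows into train/validate/test lists.

    `config` uses the same six date keys as the TRAINER_CONFIG dictionaries
    in this document; both boundaries of each segment are treated as inclusive.
    """
    def segment(prefix):
        start = date.fromisoformat(config[prefix + "_start_date"])
        end = date.fromisoformat(config[prefix + "_end_date"])
        return [(d, v) for d, v in rows if start <= d <= end]

    return segment("train"), segment("validate"), segment("test")


config = {
    "train_start_date": "2007-01-01",
    "train_end_date": "2014-12-31",
    "validate_start_date": "2015-01-01",
    "validate_end_date": "2016-12-31",
    "test_start_date": "2017-01-01",
    "test_end_date": "2020-08-01",
}
rows = [(date(2010, 6, 1), 1.0), (date(2015, 6, 1), 2.0), (date(2018, 6, 1), 3.0)]
train, validate, test = split_by_date(rows, config)
print(len(train), len(validate), len(test))
```

Because the three date ranges do not overlap, every row lands in at most one segment, which keeps validation and test data out of training.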
|
||||
|
||||
|
||||
|
||||
|
||||
Example
|
||||
--------------
|
||||
|
||||
``Data Handler`` can be run with ``estimator`` by modifying the configuration file, and can also be used as a single module.
|
||||
|
||||
To know more about how to run ``Data Handler`` with ``estimator``, please refer to `Estimator <estimator.html#about-data>`_.
|
||||
|
||||
Qlib provides an implemented data handler, `QLibDataHandlerV1`. The following example shows how to run `QLibDataHandlerV1` as a single module.
|
||||
|
||||
.. note:: Users need to initialize ``Qlib`` with `qlib.init` first. Please refer to `initialization <initialization.rst>`_.
|
||||
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
from qlib.contrib.estimator.handler import QLibDataHandlerV1
|
||||
from qlib.contrib.model.gbdt import LGBModel
|
||||
|
||||
DATA_HANDLER_CONFIG = {
|
||||
"dropna_label": True,
|
||||
"start_date": "2007-01-01",
|
||||
"end_date": "2020-08-01",
|
||||
"market": "csi500",
|
||||
}
|
||||
|
||||
TRAINER_CONFIG = {
|
||||
"train_start_date": "2007-01-01",
|
||||
"train_end_date": "2014-12-31",
|
||||
"validate_start_date": "2015-01-01",
|
||||
"validate_end_date": "2016-12-31",
|
||||
"test_start_date": "2017-01-01",
|
||||
"test_end_date": "2020-08-01",
|
||||
}
|
||||
|
||||
exampleDataHandler = QLibDataHandlerV1(**DATA_HANDLER_CONFIG)
|
||||
|
||||
# example of 'get_split_data'
|
||||
x_train, y_train, x_validate, y_validate, x_test, y_test = exampleDataHandler.get_split_data(**TRAINER_CONFIG)
|
||||
|
||||
# example of 'get_rolling_data'
|
||||
|
||||
for (x_train, y_train, x_validate, y_validate, x_test, y_test) in exampleDataHandler.get_rolling_data(**TRAINER_CONFIG):
    print(x_train, y_train, x_validate, y_validate, x_test, y_test)
|
||||
|
||||
|
||||
.. note:: (x_train, y_train, x_validate, y_validate, x_test, y_test) can be used as arguments for the ``fit``, ``predict``, and ``score`` methods of the 'Model'. Please refer to `Model <model.html#Interface>`_.
|
||||
|
||||
Also, the above example has been given in ``examples.estimator.train_backtest_analyze.ipynb``.
|
||||
|
||||
API
|
||||
---------
|
||||
|
||||
To know more about ``Data Handler``, please refer to `Data Handler API <../reference/api.html#handler>`_.
|
||||
|
||||
Cache
|
||||
==========
|
||||
|
||||
``Cache`` is an optional module that speeds up data access by saving frequently used data as cache files.
|
||||
|
||||
Memory Cache
|
||||
--------------
|
||||
|
||||
Base Class & Interface
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
``Qlib`` provides a `MemCache` class to cache the most frequently used data in memory, an inheritable `ExpressionCache` class, and an inheritable `DatasetCache` class.
|
||||
|
||||
`MemCache` is a memory cache mechanism composed of three `MemCacheUnit` instances that cache **Calendar**, **Instruments**, and **Features**. The `MemCache` instance is defined globally in `cache.py` as `H`. Users can use `H['c'], H['i'], H['f']` to get or set the cached data.
|
||||
|
||||
.. autoclass:: qlib.data.cache.MemCacheUnit
|
||||
:members:
|
||||
|
||||
.. autoclass:: qlib.data.cache.MemCache
|
||||
:members:
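The `H['c']`, `H['i']`, `H['f']` access pattern described in this section can be mimicked with a small stand-alone sketch. The classes below are illustrative stand-ins, not the ``Qlib`` classes: the names ``CacheUnit`` and ``Cache``, the ``size_limit`` parameter, and the least-recently-used eviction policy are assumptions chosen to show how three independent units can sit behind one global object.

```python
from collections import OrderedDict


class CacheUnit:
    """Size-limited mapping that evicts the least recently used entry."""

    def __init__(self, size_limit=0):
        self.size_limit = size_limit  # 0 means "no limit"
        self._data = OrderedDict()

    def __contains__(self, key):
        return key in self._data

    def __getitem__(self, key):
        value = self._data[key]
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if self.size_limit and len(self._data) > self.size_limit:
            self._data.popitem(last=False)  # drop the oldest entry


class Cache:
    """Three units addressed as 'c' (calendar), 'i' (instruments), 'f' (features)."""

    def __init__(self, size_limit=0):
        self._units = {name: CacheUnit(size_limit) for name in ("c", "i", "f")}

    def __getitem__(self, name):
        return self._units[name]


H = Cache(size_limit=2)
H["f"]["close"] = [10.0, 10.5]
H["f"]["open"] = [9.8, 10.1]
_ = H["f"]["close"]  # touch "close" so that "open" becomes the oldest entry
H["f"]["volume"] = [1000, 1200]
print("open" in H["f"], "close" in H["f"])
```

Keeping the three units separate means that a burst of feature lookups cannot evict the much smaller calendar and instrument entries.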
|
||||
|
||||
|
||||
Disk Cache
|
||||
--------------
|
||||
|
||||
Base Class & Interface
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
`ExpressionCache` is a disk cache mechanism that saves expressions such as **Mean($close, 5)**. Users can inherit this base class to define their own cache mechanism. Users need to override `self._uri` method to define how their cache file path is generated, `self._expression` method to define what data they want to cache and how to cache it.
|
||||
|
||||
`DatasetCache` is a disk cache mechanism that saves datasets. A certain dataset is regulated by a stockpool configuration (or a series of instruments, though not recommended), a list of expressions or static feature fields, the start time and end time for the collected features and the frequency. Users need to override `self._uri` method to define how their cache file path is generated, `self._expression` method to define what data they want to cache and how to cache it.
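One way to picture how a cache file path could be derived is to hash the values that identify the cached item, in the spirit of the ``hash(instrument, field_expression, freq)`` entries in the cache file structure of this document. The sketch below is an assumption for illustration only: the function name ``cache_file_name``, the use of MD5, and the ``|`` separator are not taken from ``Qlib``.

```python
import hashlib


def cache_file_name(instrument, field_expression, freq):
    """Derive a stable file name from the triple that identifies a cached expression."""
    key = "|".join([instrument, field_expression, freq])
    return hashlib.md5(key.encode("utf-8")).hexdigest()


name = cache_file_name("sh600000", "Mean($close, 5)", "day")
print(name)
```

The same inputs always map to the same name, so a later request for the same expression can find the existing file, while changing any component (for example the frequency) yields a different file.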
|
||||
|
||||
`ExpressionCache` and `DatasetCache` provide the same interfaces as `ExpressionProvider` and `DatasetProvider`, so the disk cache layer is transparent to users and only needs their attention if they want to define their own cache mechanism. Users can plug a cache mechanism into the server system by assigning the cache class they want to use in `config.py`:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
'ExpressionCache': 'ServerExpressionCache',
|
||||
'DatasetCache': 'ServerDatasetCache',
|
||||
|
||||
Users can find the cache interface here.
|
||||
|
||||
ExpressionCache
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
.. autoclass:: qlib.data.cache.ExpressionCache
|
||||
:members:
|
||||
|
||||
DatasetCache
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
.. autoclass:: qlib.data.cache.DatasetCache
|
||||
:members:
|
||||
|
||||
|
||||
Implemented Disk Cache
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. note::
|
||||
|
||||
If the user does not use QlibServer, please ignore the content of this section.
|
||||
|
||||
Qlib currently provides the `ServerExpressionCache` and `ServerDatasetCache` classes as the cache mechanisms used for QlibServer. The class interfaces and the file structure designed for the server cache mechanism are listed below.
|
||||
|
||||
DiskExpressionCache
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
.. autoclass:: qlib.data.cache.ServerExpressionCache
|
||||
|
||||
DiskDatasetCache
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
.. autoclass:: qlib.data.cache.ServerDatasetCache
|
||||
|
||||
|
||||
Data and Cache File Structure
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. code-block:: json
|
||||
|
||||
    - data/
        [raw data] updated by data providers
        - calendars/
            - day.txt
        - instruments/
            - all.txt
            - csi500.txt
            - ...
        - features/
            - sh600000/
                - open.day.bin
                - close.day.bin
                - ...
            - ...
        [cached data] updated by server when raw data is updated
        - calculated features/
            - sh600000/
                - [hash(instrument, field_expression, freq)]
                    - all-time expression-cache data file
                    - .meta : an assorted meta file recording the instrument name, field name, freq, and visit times
            - ...
        - cache/
            - [hash(stockpool_config, field_expression_list, freq)]
                - all-time Dataset-cache data file
                - .meta : an assorted meta file recording the stockpool config, field names and visit times
                - .index : an assorted index file recording the line index of all calendars
            - ...
|
||||
|
||||
674
docs/component/estimator.rst
Normal file
@@ -0,0 +1,674 @@
|
||||
.. _estimator:
|
||||
=================================
|
||||
Estimator: Workflow Management
|
||||
=================================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
===================
|
||||
|
||||
The components in `Qlib Framework <../introduction/introduction.html#framework>`_ are designed in a loosely coupled way. Users can build their own quant research workflow with these components, as in this `Example <http://TODO_URL>`_.
|
||||
|
||||
|
||||
Besides, ``Qlib`` provides a more user-friendly interface named ``Estimator`` to automatically run the whole workflow defined by a config. A concrete execution of the whole workflow is called an `experiment`.
|
||||
With ``Estimator``, users can easily run an `experiment`, which includes the following steps:
|
||||
|
||||
- Data
|
||||
- Loading
|
||||
- Processing
|
||||
- Slicing
|
||||
- Model
|
||||
- Training and inference(static or rolling)
|
||||
- Saving & loading
|
||||
- Evaluation(Back-testing)
|
||||
|
||||
For each `experiment`, ``Qlib`` will capture the details of model training, the performance evaluation results, and basic information (e.g. names, ids). The captured data will be stored in backend storage (disk or database).
|
||||
|
||||
Example
|
||||
===================
|
||||
|
||||
The following is an example:
|
||||
|
||||
.. note:: Make sure to install the latest version of `qlib`. Please refer to `Qlib installation <../start/installation.html>`_.
|
||||
|
||||
If users want to use the models and data provided by `Qlib`, they only need to do as follows.
|
||||
|
||||
First, write a simple configuration file as follows:
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
    experiment:
        name: estimator_example
        observer_type: file_storage
        mode: train
    model:
        class: LGBModel
        module_path: qlib.contrib.model.gbdt
        args:
            loss: mse
            colsample_bytree: 0.8879
            learning_rate: 0.0421
            subsample: 0.8789
            lambda_l1: 205.6999
            lambda_l2: 580.9768
            max_depth: 8
            num_leaves: 210
            num_threads: 20
    data:
        class: QLibDataHandlerClose
        args:
            dropna_label: True
        filter:
            market: csi500
    trainer:
        class: StaticTrainer
        args:
            rolling_period: 360
            train_start_date: 2007-01-01
            train_end_date: 2014-12-31
            validate_start_date: 2015-01-01
            validate_end_date: 2016-12-31
            test_start_date: 2017-01-01
            test_end_date: 2020-08-01
    strategy:
        class: TopkDropoutStrategy
        args:
            topk: 50
            n_drop: 5
    backtest:
        normal_backtest_args:
            verbose: False
            limit_threshold: 0.095
            account: 100000000
            benchmark: SH000905
            deal_price: close
            open_cost: 0.0005
            close_cost: 0.0015
            min_cost: 5
    qlib_data:
        # when testing, please modify the following parameters according to the specific environment
        provider_uri: "~/.qlib/qlib_data/cn_data"
        region: "cn"
|
||||
|
||||
|
||||
Then run the following command:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
estimator -c configuration.yaml
|
||||
|
||||
.. note:: 'estimator' is a built-in command of our program.
|
||||
|
||||
|
||||
|
||||
Configuration File
|
||||
===================
|
||||
|
||||
Before using ``estimator``, users need to prepare a configuration file. The following shows how to prepare each part of the configuration file.
|
||||
|
||||
Experiment Field
|
||||
--------------------
|
||||
|
||||
First, the configuration file needs to have a field about the experiment, whose key is `experiment`. This field and its contents determine how `estimator` tracks and persists this `experiment`. ``Qlib`` uses `sacred`, a lightweight open-source tool designed to configure, organize, log, and manage experiment results. The `experiment` field determines part of the behavior of `sacred`.
|
||||
|
||||
Usually, while `estimator` is running, the following will be managed by `sacred`:
|
||||
|
||||
- `model.bin`, model binary file
|
||||
- `pred.pkl`, model prediction result file
|
||||
- `analysis.pkl`, backtest performance analysis file
|
||||
- `positions.pkl`, backtest position record file
|
||||
- `run`, the experiment information object, usually contains some meta information such as the experiment name, experiment date, etc.
|
||||
|
||||
Usually, the `experiment` field should contain the following:
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
    experiment:
        name: test_experiment
        observer_type: mongo
        mongo_url: mongodb://MONGO_URL
        db_name: public
        finetune: false
        exp_info_path: /home/test_user/exp_info.json
        mode: test
        loader:
            id: 677
|
||||
|
||||
|
||||
The meaning of each field is as follows:
|
||||
|
||||
- `name`
|
||||
The experiment name, str type. `sacred` uses this experiment name as an identifier for some important internal processes. Users can usually see this field through the `run` object in `sacred`. The default value is `test_experiment`.
|
||||
|
||||
- `observer_type`
|
||||
Observer type, str type. There are two possible values, `file_storage` and `mongo`. If it is `file_storage`, all the managed contents mentioned above will be stored in the `dir` directory, with one subfolder per experiment run. If it is `mongo`, the contents will be stored in the database. The default is `file_storage`.
|
||||
|
||||
- For `file_storage` observer.
|
||||
- `dir`
|
||||
Directory path, str type. Files captured and managed by `sacred` with the `file_storage` observer type will be saved to this directory. The default is the directory of `config.json`.
|
||||
|
||||
- For `mongo` observer.
|
||||
- `mongo_url`
|
||||
Database URL, str type, required if the observer type is `mongo`.
|
||||
|
||||
- `db_name`
|
||||
Database name, str type, required if the observer type is `mongo`.
|
||||
|
||||
- `finetune`
|
||||
``Estimator`` will produce a model according to this flag.
|
||||
|
||||
The following table is the processing logic for different situations.
|
||||
|
||||
.. list-table:: Train
    :header-rows: 1

    * - Static, Finetune=True
      - Static, Finetune=False
      - Rolling, Finetune=True
      - Rolling, Finetune=False
    * - - Need to provide model (Static or Rolling)
        - The args in model section will be used for finetuning
        - Update based on the provided model and parameters
      - - No need to provide model
        - The args in model section will be used for training
        - Train model from scratch
      - - Need to provide model (Static or Rolling)
        - The args in model section will be used for finetuning
        - Update based on the provided model and parameters
        - **Each rolling time slice is based on a model updated from the previous time**
      - - Need to provide model (Static or Rolling)
        - The args in model section will be used for finetuning
        - Based on the provided model update
        - Train model from scratch
        - **Train each rolling time slice separately**

Test (all four cases):

- Model must exist, otherwise an exception will be raised.
- For `StaticTrainer`, users need to train a model and record 'exp_info' for 'Test'.
- For `RollingTrainer`, users need to train a set of models until the latest time, and record 'exp_info' for 'Test'.
|
||||
|
||||
.. note::
|
||||
|
||||
1. finetune parameters: share model.args parameters.
|
||||
|
||||
2. provide model: the model to load is selected by `loader.model_index`, the index of the model (starting from 0).
|
||||
|
||||
3. If `loader.model_index` is None:
|
||||
- In 'Static Finetune=True', if a 'Rolling' model is provided, the last model is used for the update.
|
||||
|
||||
- For RollingTrainer with Finetune=True.
|
||||
|
||||
- If StaticTrainer is used in loader, the model will be used for initialization for finetuning.
|
||||
|
||||
- If RollingTrainer is used in loader, the existing models will be used without any modification, and the new models will be initialized with the model of the last period and finetuned one by one.
|
||||
|
||||
|
||||
- `exp_info_path`
|
||||
Experiment info save path, str type. The experiment info and the model prediction scores are saved here after the experiment finishes. This parameter is optional; the default value is `config_file_dir/ex_name/exp_info.json`.
|
||||
|
||||
- `mode`
|
||||
`train` or `test`, str type. If `mode` is `test`, the model will be loaded according to the parameters of `loader`. The default value is `train`.
|
||||
Also note that if loading the model fails, the model will be trained with `fit` instead.
|
||||
.. note::
|
||||
|
||||
If users choose `mode` `test`, they need to make sure:
|
||||
- The `test_start_date` of the loaded model must be less than or equal to the current `test_start_date`.
|
||||
- If other model args of the loaded model differ from the current ones, a warning will be issued.
|
||||
|
||||
|
||||
- `loader`
|
||||
This field is used when `mode` is `test` or `finetune` is `true`.
|
||||
|
||||
- `model_index`
|
||||
Model index, int type. The index of the loaded model in loader_models (starting at 0) for the first `finetune`. The default value is None.
|
||||
|
||||
- `exp_info_path`
|
||||
Loader model experiment info path, str type. If this field exists, the parameters below are parsed from `exp_info_path`, and any values given for them in the config are ignored. Either this field or `id` must be provided.
|
||||
|
||||
- `id`
|
||||
The experiment id of the model that needs to be loaded, int type. If the `mode` is `test`, this value is required. Either this field or `exp_info_path` must be provided.
|
||||
|
||||
- `name`
|
||||
The experiment name of the model that needs to be loaded, str type. The default value is the current experiment `name`.
|
||||
|
||||
- `observer_type`
|
||||
The experiment observer type of the model that needs to be loaded, str type. The default value is the current experiment `observer_type`.
|
||||
.. note:: The observer type is a concept of the `sacred` module. It determines how the files and the standard input and output managed by `sacred` are stored.
|
||||
|
||||
|
||||
- `file_storage`
|
||||
If `observer_type` is `file_storage`, the config may be as follows.
|
||||
|
||||
.. code-block:: YAML

    experiment:
        name: test_experiment
        dir: <path to a directory> # default is dir of `config.yml`
        observer_type: file_storage
|
||||
- `mongo`
|
||||
If `observer_type` is `mongo`, the config may be as follows.
|
||||
|
||||
.. code-block:: YAML

    experiment:
        name: test_experiment
        observer_type: mongo
        mongo_url: mongodb://MONGO_URL
        db_name: public
|
||||
|
||||
Users need to indicate `mongo_url` and `db_name` for a mongo observer.
|
||||
|
||||
.. note::
|
||||
|
||||
If users choose mongo observer, they need to make sure:
|
||||
- An environment with MongoDB installed and a mongo database dedicated to storing the experiment results are available.
|
||||
- The Python environment (the versions of Python and of the packages) used to run the experiments is consistent with the one used to fetch the results.
|
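The constraints on the `experiment` field described above can be checked before an experiment is started. The following is a minimal sketch, not part of ``Qlib``: the helper `validate_experiment_config` is hypothetical, and it only encodes the rules stated in this section (allowed observer types and modes, the fields required by the `mongo` observer, and the `loader` requirement).

.. code-block:: python

    def validate_experiment_config(experiment):
        """Return a list of problems found in an `experiment` mapping.

        Hypothetical helper, written only from the rules in this section.
        """
        problems = []

        # `observer_type` defaults to `file_storage`.
        observer_type = experiment.get("observer_type", "file_storage")
        if observer_type not in ("file_storage", "mongo"):
            problems.append("observer_type must be 'file_storage' or 'mongo'")
        if observer_type == "mongo":
            for key in ("mongo_url", "db_name"):
                if not experiment.get(key):
                    problems.append(key + " is required for the mongo observer")

        # `mode` defaults to `train`.
        mode = experiment.get("mode", "train")
        if mode not in ("train", "test"):
            problems.append("mode must be 'train' or 'test'")

        # The loader is used when testing or finetuning.
        loader = experiment.get("loader") or {}
        if mode == "test" or experiment.get("finetune", False):
            if "id" not in loader and "exp_info_path" not in loader:
                problems.append("loader needs either id or exp_info_path")
        return problems


    config = {
        "name": "test_experiment",
        "observer_type": "mongo",
        "mongo_url": "mongodb://MONGO_URL",
        "db_name": "public",
        "finetune": False,
        "mode": "test",
        "loader": {"id": 677},
    }
    problems = validate_experiment_config(config)

An empty list means the mapping satisfies the rules listed here; it does not guarantee that ``Estimator`` accepts the configuration.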
||||
|
||||
Model Field
|
||||
-----------------
|
||||
|
||||
Users can specify the model and its hyperparameters through the configuration.
|
||||
|
||||
Custom Models
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
``Qlib`` supports custom models, but a custom model must be a subclass of `qlib.contrib.model.Model`. The config for a custom model may look as follows.
|
||||
|
||||
.. code-block:: YAML

    model:
        class: SomeModel
        module_path: /tmp/my_experment/custom_model.py
        args:
            loss: binary
|
||||
|
||||
|
||||
The class `SomeModel` should be in the module `custom_model`; ``Qlib`` parses the `module_path` to load the class.
|
||||
|
||||
To know more about ``Model``, please refer to `Model <model.html>`_.
|
||||
|
||||
Data Field
|
||||
-----------------
|
||||
|
||||
``Data Handler`` can be used to load raw data, prepare the feature and label columns, preprocess data (standardization, removing NaN, etc.), and split the training, validation, and test sets. It is a subclass of `qlib.contrib.estimator.handler.BaseDataHandler`.
|
||||
|
||||
Users can use the specified data handler by config as follows.
|
||||
|
||||
.. code-block:: YAML

    data:
        class: QLibDataHandlerClose
        args:
            start_date: 2005-01-01
            end_date: 2018-04-30
            dropna_label: True
        filter:
            market: csi500
            filter_pipeline:
              -
                class: NameDFilter
                module_path: qlib.filter
                args:
                  name_rule_re: S(?!Z3)
                  fstart_time: 2018-01-01
                  fend_time: 2018-12-11
              -
                class: ExpressionDFilter
                module_path: qlib.filter
                args:
                  rule_expression: $open/$factor<=45
                  fstart_time: 2018-01-01
                  fend_time: 2018-12-11
|
||||
|
||||
- `class`
|
||||
Data handler class, str type. It should be a subclass of `qlib.contrib.estimator.handler.BaseDataHandler` and implement 5 important interfaces for loading features, loading raw data, preprocessing raw data, and slicing the train, validation, and test data. The default value is `ALPHA360`. If users want to write a data handler to retrieve the data in qlib, `QLibDataHandler` is suggested.
|
||||
|
||||
- `module_path`
|
||||
The module path, str type; an absolute path is also supported. It indicates where the data handler `class` is implemented. The default value is `qlib.contrib.estimator.handler`.
|
||||
|
||||
- `args`
|
||||
Parameters used for ``Data Handler`` initialization.
|
||||
|
||||
- `train_start_date`
|
||||
Training start time, str type, default value is `2005-01-01`.
|
||||
|
||||
- `start_date`
|
||||
Data start date, str type.
|
||||
|
||||
- `end_date`
|
||||
Data end date, str type. The range from `start_date` to `end_date` decides which part of the data is loaded by the data handler; only this data can be used in the following parts.
|
||||
|
||||
- `dropna_feature` (Optional in args)
|
||||
Whether to drop NaN features, bool type. The default value is False.
|
||||
|
||||
- `dropna_label` (Optional in args)
|
||||
Whether to drop NaN labels, bool type. The default value is True. Some multi-label tasks will use this.
|
||||
|
||||
- `normalize_method` (Optional in args)
|
||||
Normalize data with the given method, str type. ``Qlib`` provides two normalization methods, `MinMax` and `Std`.
|
||||
If users want to build their own method, please override `_process_normalize_feature`.
|
||||
|
||||
- `filter`
|
||||
Dynamically filters the stocks based on the filter pipeline.
|
||||
|
||||
- `market`
|
||||
index name, str type, the default value is `csi500`.
|
||||
|
||||
- `filter_pipeline`
|
||||
Filter rule list, list type, the default value is []. Can be customized according to users' needs.
|
||||
|
||||
- `class`
|
||||
Filter class name, str type.
|
||||
|
||||
- `module_path`
|
||||
The module path, str type.
|
||||
|
||||
- `args`
|
||||
The filter class parameters. These parameters depend on the `class`, and all of them are passed as kwargs to `class`.
|
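To make the two `normalize_method` values more concrete, here is a minimal sketch of what min-max and standard-score normalization usually mean. It is an assumption that ``Qlib`` applies them column by column in this way; the function `normalize` is hypothetical and is not the body of `_process_normalize_feature`.

.. code-block:: python

    import pandas as pd


    def normalize(features, method):
        """Normalize each feature column independently (illustrative only)."""
        if method == "MinMax":
            # Rescale every column to the range [0, 1].
            span = features.max() - features.min()
            return (features - features.min()) / span
        if method == "Std":
            # Shift every column to zero mean and scale it to unit standard deviation.
            return (features - features.mean()) / features.std()
        raise ValueError("unknown normalize_method: " + str(method))


    features = pd.DataFrame({"KMID": [1.0, 2.0, 3.0], "KLEN": [10.0, 20.0, 40.0]})
    minmax = normalize(features, "MinMax")
    std = normalize(features, "Std")

Constant columns would divide by zero in both branches, so a real handler would need to treat them separately.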
||||
|
||||
Custom Data Handler
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
``Qlib`` supports custom data handlers, but a custom data handler must be a subclass of ``qlib.contrib.estimator.handler.BaseDataHandler``. The config for a custom data handler may look as follows.
|
||||
|
||||
.. code-block:: YAML

    data:
        class: SomeDataHandler
        module_path: /tmp/my_experment/custom_data_handler.py
        args:
            start_date: 2005-01-01
            end_date: 2018-04-30
|
||||
|
||||
The class `SomeDataHandler` should be in the module `custom_data_handler`; ``Qlib`` parses the `module_path` to load the class.
|
||||
|
||||
If users want to load features and labels by config, they can inherit from ``qlib.contrib.estimator.handler.ConfigDataHandler``, which also provides some preprocessing methods.
|
||||
If users want to use qlib data, `QLibDataHandler` is recommended, and users can derive their custom class from it. `QLibDataHandler` is also a subclass of `ConfigDataHandler`.
|
||||
|
||||
To know more about ``Data Handler``, please refer to `Data Framework&Usage <data.html>`_.
|
||||
|
||||
Trainer Field
|
||||
-----------------
|
||||
|
||||
Users can specify the ``Trainer`` in the config file. It is a subclass of ``qlib.contrib.estimator.trainer.BaseTrainer`` and implements three important interfaces for training the model, restoring the model, and getting the model predictions, as follows.
|
||||
|
||||
- `train`
|
||||
Implement this interface to train the model.
|
||||
|
||||
- `load`
|
||||
Implement this interface to recover the model from disk.
|
||||
|
||||
- `get_pred`
|
||||
Implement this interface to get model prediction results.
|
||||
|
||||
``Qlib`` provides two implemented trainers:
|
||||
|
||||
- `StaticTrainer`
|
||||
The static trainer trains on the training, validation, and test data produced by the static slicing of the data handler.
|
||||
|
||||
- `RollingTrainer`
|
||||
The rolling trainer will use the rolling iterator of the data processor to split data for rolling training.
|
||||
|
||||
|
||||
Users can specify `trainer` with the configuration file:
|
||||
|
||||
.. code-block:: YAML

    trainer:
        class: StaticTrainer # or RollingTrainer
        args:
            rolling_period: 360
            train_start_date: 2005-01-01
            train_end_date: 2014-12-31
            validate_start_date: 2015-01-01
            validate_end_date: 2016-06-30
            test_start_date: 2016-07-01
            test_end_date: 2017-07-31
|
||||
|
||||
- `class`
|
||||
Trainer class. It should be a subclass of `qlib.contrib.estimator.trainer.BaseTrainer` and needs to implement the three important interfaces listed above. The default value is `StaticTrainer`.
|
||||
|
||||
- `module_path`
|
||||
The module path, str type; an absolute path is also supported. It indicates where the trainer class is implemented.
|
||||
|
||||
- `args`
|
||||
Parameters used for ``Trainer`` initialization.
|
||||
|
||||
- `rolling_period`
|
||||
The rolling period, integer type. It indicates how many time steps to roll forward each time the data is rolled. The default value is `60`. Only used in `RollingTrainer`.
|
||||
|
||||
- `train_start_date`
|
||||
Training start time, str type.
|
||||
|
||||
- `train_end_date`
|
||||
Training end time, str type.
|
||||
|
||||
- `validate_start_date`
|
||||
Validation start time, str type.
|
||||
|
||||
- `validate_end_date`
|
||||
Validation end time, str type.
|
||||
|
||||
- `test_start_date`
|
||||
Test start time, str type.
|
||||
|
||||
- `test_end_date`
|
||||
Test end time, str type. If `test_end_date` is `-1` or greater than the last date of the data, the last date of the data will be used as `test_end_date`.
|
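The handling of `test_end_date` and the idea of a rolling period can be illustrated with a small sketch. The helpers below are hypothetical and do not reproduce `RollingTrainer`; in particular, splitting the test range into consecutive slices of `rolling_period` trading days is only an assumed reading of "rolling", and the business-day calendar stands in for the real trading calendar.

.. code-block:: python

    import pandas as pd


    def resolve_test_end_date(test_end_date, calendar):
        """Clamp `test_end_date` to the last date of the data, as described above."""
        last = calendar[-1]
        if str(test_end_date) == "-1":
            return last
        return min(pd.Timestamp(test_end_date), last)


    def rolling_slices(calendar, test_start_date, test_end_date, rolling_period):
        """Yield (start, end) pairs, each covering at most `rolling_period` steps."""
        end = resolve_test_end_date(test_end_date, calendar)
        mask = (calendar >= pd.Timestamp(test_start_date)) & (calendar <= end)
        days = calendar[mask]
        for i in range(0, len(days), rolling_period):
            chunk = days[i:i + rolling_period]
            yield chunk[0], chunk[-1]


    # Assumed calendar: business days of July 2016.
    calendar = pd.bdate_range("2016-07-01", "2016-07-29")
    slices = list(rolling_slices(calendar, "2016-07-01", -1, rolling_period=10))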
||||
|
||||
Custom Trainer
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
``Qlib`` supports custom trainers, but a custom trainer must be a subclass of `qlib.contrib.estimator.trainer.BaseTrainer`. The config for a custom trainer may look as follows.
|
||||
|
||||
.. code-block:: YAML

    trainer:
        class: SomeTrainer
        module_path: /tmp/my_experment/custom_trainer.py
        args:
            train_start_date: 2005-01-01
            train_end_date: 2014-12-31
            validate_start_date: 2015-01-01
            validate_end_date: 2016-06-30
            test_start_date: 2016-07-01
            test_end_date: 2017-07-31
|
||||
|
||||
|
||||
The class `SomeTrainer` should be in the module `custom_trainer`; ``Qlib`` parses the `module_path` to load the class.
|
||||
|
||||
Strategy Field
|
||||
-----------------
|
||||
|
||||
Users can specify strategy through a config file, for example:
|
||||
|
||||
.. code-block:: YAML

    strategy :
        class: TopkDropoutStrategy
        args:
            topk: 50
            n_drop: 5
|
||||
|
||||
- `class`
|
||||
The strategy class, str type, should be a subclass of `qlib.contrib.strategy.strategy.BaseStrategy`. The default value is `TopkDropoutStrategy`.
|
||||
|
||||
- `module_path`
|
||||
The module location, str type; an absolute path is also supported. It indicates where the strategy class is implemented.
|
||||
|
||||
- `args`
|
||||
Parameters used for ``Strategy`` initialization.
|
||||
|
||||
- `topk`
|
||||
The number of stocks in the portfolio
|
||||
|
||||
- `n_drop`
|
||||
Number of stocks to be replaced in each trading date
|
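The roles of `topk` and `n_drop` can be illustrated with a simplified selection rule. This is a sketch under stated assumptions, not the implementation of `TopkDropoutStrategy`: the function `topk_dropout` is hypothetical, and it assumes that on each trading date the lowest-scored holdings are swapped for the best-scored stocks outside the portfolio, with at most `n_drop` swaps.

.. code-block:: python

    def topk_dropout(scores, holdings, topk, n_drop):
        """Return (sell, buy) lists for one trading date.

        scores:   dict mapping stock -> prediction score
        holdings: collection of stocks currently held
        """
        ranked = sorted(scores, key=scores.get, reverse=True)
        held = [s for s in ranked if s in holdings]            # best first
        candidates = [s for s in ranked if s not in holdings]  # best first

        sell, buy = [], []
        # Pair the worst holdings with the best candidates, at most n_drop swaps.
        for worst, best in zip(reversed(held), candidates[:n_drop]):
            if scores[best] > scores[worst]:
                sell.append(worst)
                buy.append(best)

        # Fill any remaining free slots so that the portfolio holds `topk` stocks.
        free = topk - (len(held) - len(sell)) - len(buy)
        extra = [s for s in candidates if s not in buy][:max(free, 0)]
        return sell, buy + extra


    scores = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.7, "E": 0.05}
    sell, buy = topk_dropout(scores, holdings={"A", "C", "E"}, topk=3, n_drop=1)

Limiting the number of replacements per day keeps turnover, and therefore transaction cost, bounded.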
||||
|
||||
Custom Strategy
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
``Qlib`` supports custom strategies, but a custom strategy must be a subclass of ``qlib.contrib.strategy.strategy.BaseStrategy``. The config for a custom strategy may look as follows.
|
||||
|
||||
|
||||
.. code-block:: YAML

    strategy :
        class: SomeStrategy
        module_path: /tmp/my_experment/custom_strategy.py
|
||||
|
||||
The class `SomeStrategy` should be in the module `custom_strategy`; ``Qlib`` parses the `module_path` to load the class.
|
||||
|
||||
To know more about ``Strategy``, please refer to `Strategy <strategy.html>`_.
|
||||
|
||||
Backtest Field
|
||||
-----------------
|
||||
|
||||
Users can specify `backtest` through a config file, for example:
|
||||
|
||||
.. code-block:: YAML

    backtest :
        normal_backtest_args:
            topk: 50
            benchmark: SH000905
            account: 500000
            deal_price: close
            min_cost: 5
            subscribe_fields:
              - $close
              - $change
              - $factor
|
||||
|
||||
- `normal_backtest_args`
|
||||
Normal backtest parameters. All the parameters in this section will be passed to the ``qlib.contrib.evaluate.backtest`` function in the form of `**kwargs`.
|
||||
|
||||
- `benchmark`
|
||||
Stock index symbol, str or list type, the default value is `None`.
|
||||
|
||||
.. note::
|
||||
|
||||
* If `benchmark` is None, it will use the daily average change of all stocks in 'pred' as the 'bench'.
|
||||
|
||||
* If `benchmark` is list, it will use the daily average change of the stock pool in the list as the 'bench'.
|
||||
|
||||
* If `benchmark` is str, it will use the daily change of that index as the 'bench'.
|
||||
|
||||
|
||||
- `account`
|
||||
Backtest initial cash, integer type. The `account` in the `strategy` section is deprecated: it only takes effect when `account` is not set in the `backtest` section, and it is overridden by `account` in the `backtest` section otherwise. The default value is 1e9.
|
||||
|
||||
- `deal_price`
|
||||
Order transaction price field, str type, the default value is `vwap`.
|
||||
|
||||
- `min_cost`
|
||||
Min transaction cost, float type, the default value is 5.
|
||||
|
||||
- `subscribe_fields`
|
||||
Subscribe quote fields, array type, the default value is [`deal_price`, $close, $change, $factor].
|
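The three `benchmark` cases from the note above can be sketched with pandas. The function `bench_daily_change` and the layout of the `change` table (one row per trading day, one column per instrument) are assumptions made for illustration; they are not the internals of ``qlib.contrib.evaluate.backtest``.

.. code-block:: python

    import pandas as pd


    def bench_daily_change(change, pred_instruments, benchmark=None):
        """Return the daily 'bench' series for the three `benchmark` cases."""
        if benchmark is None:
            # Daily average change of all stocks in 'pred'.
            return change[list(pred_instruments)].mean(axis=1)
        if isinstance(benchmark, str):
            # Daily change of a single index.
            return change[benchmark]
        # Daily average change of the stock pool given as a list.
        return change[list(benchmark)].mean(axis=1)


    change = pd.DataFrame(
        {"A": [0.01, 0.02], "B": [0.03, 0.00], "SH000905": [0.005, -0.005]},
        index=pd.to_datetime(["2017-01-03", "2017-01-04"]),
    )
    bench_none = bench_daily_change(change, ["A", "B"])
    bench_index = bench_daily_change(change, ["A", "B"], benchmark="SH000905")
    bench_pool = bench_daily_change(change, ["A", "B"], benchmark=["A"])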
||||
|
||||
|
||||
Qlib Data Field
|
||||
--------------------
|
||||
|
||||
The `qlib_data` field describes the parameters of qlib initialization.
|
||||
|
||||
.. code-block:: YAML

    qlib_data:
        # when testing, please modify the following parameters according to the specific environment
        provider_uri: "~/.qlib/qlib_data/cn_data"
        region: "cn"
|
||||
|
||||
- `provider_uri`
|
||||
The local directory where the data loaded by 'get_data.py' is stored.
|
||||
- `region`
|
||||
- If region == ``qlib.config.REG_CN``, 'qlib' will be initialized in china-stock mode.
|
||||
- If region == ``qlib.config.REG_US``, 'qlib' will be initialized in US-stock mode.
|
||||
|
||||
Please refer to `Initialization <../start/initialization.rst>`_.
|
||||
|
||||
Experiment Result
|
||||
===================
|
||||
|
||||
Form of Experimental Result
|
||||
----------------------------
|
||||
The result of the experiment is the result of the backtest, please refer to `Backtest <backtest.html>`_.
|
||||
|
||||
|
||||
Get Experiment Result
|
||||
----------------------------
|
||||
|
||||
Users can check the experiment results directly in the file storage or in the database, or get them through the two APIs of the `fetcher` module provided by ``Qlib``.
|
||||
|
||||
- `get_experiments()`
|
||||
The API takes two parameters: the experiment name (all experiments by default) and the observer type. It returns a dictionary that maps each experiment name to a list of ids and test end dates, as follows.
|
||||
|
||||
.. code-block:: JSON

    {
        "ex_a": [
            {
                "id": 1,
                "test_end_date": "2017-01-01"
            }
        ],
        "ex_b": [
            ...
        ]
    }
|
||||
|
||||
|
||||
- `get_experiment(exp_name, exp_id, fields=None)`
|
||||
The API takes three parameters: the experiment name, the experiment id, and the list of fields.
|
||||
If `fields` is None, all fields are returned.
|
||||
|
||||
.. note::
|
||||
Currently supported fields:
|
||||
['model', 'analysis', 'positions', 'report_normal', 'pred', 'task_config', 'label']
|
||||
|
||||
.. code-block:: python

    {
        'analysis': analysis_df,
        'pred': pred_df,
        'positions': positions_dic,
        'report_normal': report_normal_df,
    }
|
||||
|
||||
|
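The dictionary returned by `get_experiments()` can be post-processed with plain Python. As a sketch, the hypothetical helper below picks the id of the run with the latest `test_end_date` for one experiment name; it assumes the structure shown above, with dates stored as `YYYY-MM-DD` strings, and the sample values are made up for illustration.

.. code-block:: python

    def latest_experiment_id(experiments, exp_name):
        """Return the id of the run with the latest `test_end_date`."""
        runs = experiments[exp_name]
        # ISO-formatted date strings sort chronologically.
        return max(runs, key=lambda run: run["test_end_date"])["id"]


    # Sample data in the shape shown above (values are illustrative).
    experiments = {
        "ex_a": [
            {"id": 1, "test_end_date": "2017-01-01"},
            {"id": 2, "test_end_date": "2017-07-31"},
        ],
    }
    latest_id = latest_experiment_id(experiments, "ex_a")

The resulting id could then be passed as `exp_id` to `get_experiment`.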
||||
Here is a simple example of `FileFetcher`, which could fetch files from `file_storage` observer.
|
||||
|
||||
.. code-block:: python

    >>> from qlib.contrib.estimator.fetcher import FileFetcher
    >>> f = FileFetcher(experiments_dir=r'./')
    >>> print(f.get_experiments())

    {
        'test_experiment': [
            {
                'id': '1',
                'config': ...
            },
            {
                'id': '2',
                'config': ...
            },
            {
                'id': '3',
                'config': ...
            }
        ]
    }

    >>> print(f.get_experiment('test_experiment', '1'))

                          risk
    sub_bench mean    0.000662
              std     0.004487
              annual  0.166720
              sharpe  2.340526
              mdd    -0.080516
    sub_cost  mean    0.000577
              std     0.004482
              annual  0.145392
              sharpe  2.043494
              mdd    -0.083584
||||
|
||||
If users use the mongo observer when training, they should initialize their fetcher with `mongo_url`:
|
||||
|
||||
.. code-block:: python

    >>> from qlib.contrib.estimator.fetcher import MongoFetcher
    >>> f = MongoFetcher(mongo_url=..., db_name=...)
|
||||
|
||||
179
docs/component/model.rst
Normal file
@@ -0,0 +1,179 @@
|
||||
.. _model:
|
||||
============================================
|
||||
Interday Model: Model Training & Prediction
|
||||
============================================
|
||||
|
||||
Introduction
|
||||
===================
|
||||
|
||||
``Interday Model`` is designed to produce prediction scores for stocks. Users can use the ``Interday Model`` in an automatic workflow by ``Estimator``, please refer to `Estimator <estimator.html>`_.
|
||||
|
||||
Because the components in ``Qlib`` are designed in a loosely-coupled way, ``Interday Model`` can also be used as an independent module.
|
||||
|
||||
Base Class & Interface
|
||||
======================
|
||||
|
||||
``Qlib`` provides a base class `qlib.contrib.model.base.Model <../reference/api.html#module-qlib.contrib.model.base>`_, which all models should inherit from.
|
||||
|
||||
The base class provides the following interfaces:
|
||||
|
||||
- `__init__(**kwargs)`
|
||||
- Initialization.
|
||||
- If users use ``Estimator`` to start an `experiment`, the parameters of the `__init__` method should be consistent with the hyperparameters in the configuration file.
|
||||
|
||||
- `fit(self, x_train, y_train, x_valid, y_valid, w_train=None, w_valid=None, **kwargs)`
|
||||
- Train model.
|
||||
- Parameter:
|
||||
- `x_train`, pd.DataFrame type, train feature
|
||||
The following example explains the value of `x_train`:
|
||||
|
||||
.. code-block:: YAML

                                KMID      KLEN     KMID2       KUP      KUP2
    instrument  datetime
    SH600004    2012-01-04  0.000000  0.017685  0.000000  0.012862  0.727275
                2012-01-05 -0.006473  0.025890 -0.250001  0.012945  0.499998
                2012-01-06  0.008117  0.019481  0.416666  0.008117  0.416666
                2012-01-09  0.016051  0.025682  0.624998  0.006421  0.250001
                2012-01-10  0.017323  0.026772  0.647057  0.003150  0.117648
    ...                          ...       ...       ...       ...       ...
    SZ300273    2014-12-25 -0.005295  0.038697 -0.136843  0.016293  0.421052
                2014-12-26 -0.022486  0.041701 -0.539215  0.002453  0.058824
                2014-12-29 -0.031526  0.039092 -0.806451  0.000000  0.000000
                2014-12-30 -0.010000  0.032174 -0.310811  0.013913  0.432433
                2014-12-31  0.010917  0.020087  0.543479  0.001310  0.065216
|
||||
|
||||
|
||||
`x_train` is a pandas DataFrame, whose index is MultiIndex <instrument(str), datetime(pd.Timestamp)>. Each column of `x_train` corresponds to a feature, and the column name is the feature name.
|
||||
|
||||
.. note::
|
||||
|
||||
The number and names of the columns are determined by the data handler, please refer to `Data Handler <data.html#data-handler>`_ and `Estimator Data <estimator.html#about-data>`_.
|
||||
|
||||
- `y_train`, pd.DataFrame type, train label
|
||||
The following example explains the value of `y_train`:
|
||||
|
||||
.. code-block:: YAML

                               LABEL
    instrument  datetime
    SH600004    2012-01-04 -0.798456
                2012-01-05 -1.366716
                2012-01-06 -0.491026
                2012-01-09  0.296900
                2012-01-10  0.501426
    ...                          ...
    SZ300273    2014-12-25 -0.465540
                2014-12-26  0.233864
                2014-12-29  0.471368
                2014-12-30  0.411914
                2014-12-31  1.342723
|
||||
|
||||
`y_train` is a pandas DataFrame, whose index is MultiIndex <instrument(str), datetime(pd.Timestamp)>. The `LABEL` column represents the value of train label.
|
||||
|
||||
.. note::
|
||||
|
||||
The number and names of the columns are determined by the ``Data Handler``, please refer to `Data Handler <data.html#data-handler>`_.
|
||||
|
||||
- `x_valid`, pd.DataFrame type, validation feature
|
||||
The format of `x_valid` is same as `x_train`
|
||||
|
||||
|
||||
- `y_valid`, pd.DataFrame type, validation label
|
||||
The format of `y_valid` is same as `y_train`
|
||||
|
||||
- `w_train`(Optional args, default is None), pd.DataFrame type, train weight
|
||||
`w_train` is a pandas DataFrame, whose shape and index are the same as those of `x_train`. The float value in `w_train` represents the weight of the feature at the same position in `x_train`.
|
||||
|
||||
- `w_valid`(Optional args, default is None), pd.DataFrame type, validation weight
|
||||
`w_valid` is a pandas DataFrame, whose shape and index are the same as those of `x_valid`. The float value in `w_valid` represents the weight of the feature at the same position in `x_valid`.
|
||||
|
||||
- `predict(self, x_test, **kwargs)`
|
||||
- Predict test data 'x_test'
|
||||
- Parameter:
|
||||
- `x_test`, pd.DataFrame type, test features
|
||||
The form of `x_test` is same as `x_train` in 'fit' method.
|
||||
- Return:
|
||||
- `label`, np.ndarray type, test label
|
||||
The label of `x_test` predicted by the model.
|
||||
|
||||
- `score(self, x_test, y_test, w_test=None, **kwargs)`
|
||||
- Evaluate model with test feature/label
|
||||
- Parameter:
|
||||
- `x_test`, pd.DataFrame type, test feature
|
||||
The format of `x_test` is same as `x_train` in `fit` method.
|
||||
|
||||
- `y_test`, pd.DataFrame type, test label
|
||||
The format of `y_test` is same as `y_train` in `fit` method.
|
||||
|
||||
- `w_test`, pd.DataFrame type, test weight
|
||||
The format of `w_test` is same as `w_train` in `fit` method.
|
||||
- Return: float type, evaluation score
|
||||
|
||||
For other interfaces such as `save`, `load`, `finetune`, please refer to `Model API <../reference/api.html#module-qlib.contrib.model.base>`_.
|
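To make the interface above concrete, here is a minimal stand-alone sketch of a class with the same `fit`, `predict`, and `score` signatures. It is an illustration only: `LinearSketchModel` is hypothetical, it deliberately does not inherit from the ``Qlib`` base class so that it runs without ``Qlib``, it ignores the weight arguments, and using the mean squared error as the evaluation score is an assumption.

.. code-block:: python

    import numpy as np
    import pandas as pd


    class LinearSketchModel:
        """Least-squares model exposing the interface described above."""

        def __init__(self, **kwargs):
            self.fit_intercept = kwargs.get("fit_intercept", True)
            self.coef_ = None

        def _design(self, x):
            values = x.to_numpy(dtype=float)
            if self.fit_intercept:
                values = np.hstack([values, np.ones((len(values), 1))])
            return values

        def fit(self, x_train, y_train, x_valid, y_valid,
                w_train=None, w_valid=None, **kwargs):
            # Validation data and weights are accepted but unused in this sketch.
            target = y_train.to_numpy(dtype=float)
            self.coef_, *_ = np.linalg.lstsq(self._design(x_train), target, rcond=None)
            return self

        def predict(self, x_test, **kwargs):
            return (self._design(x_test) @ self.coef_).ravel()

        def score(self, x_test, y_test, w_test=None, **kwargs):
            error = self.predict(x_test) - y_test.to_numpy(dtype=float).ravel()
            return float(np.mean(error ** 2))


    # Features and labels indexed by <instrument, datetime>, as described above.
    index = pd.MultiIndex.from_product(
        [["SH600004", "SZ300273"],
         pd.to_datetime(["2012-01-04", "2012-01-05", "2012-01-06"])],
        names=["instrument", "datetime"],
    )
    x = pd.DataFrame(
        {"KMID": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0], "KLEN": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]},
        index=index,
    )
    y = pd.DataFrame({"LABEL": 2.0 * x["KMID"] - x["KLEN"] + 0.5})

    model = LinearSketchModel().fit(x, y, x, y)
    pred = model.predict(x)
    mse = model.score(x, y)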
||||
|
||||
Example
|
||||
==================
|
||||
|
||||
``Qlib`` provides ``LightGBM`` and ``DNN`` models as baselines. The following steps show how to run ``LightGBM`` as an independent module.
|
||||
|
||||
- Initialize ``Qlib`` with `qlib.init` first, please refer to `initialization <initialization.rst>`_.
|
||||
- Run the following code to get the prediction score `pred_score`
|
||||
.. code-block:: Python

    import pandas as pd

    from qlib.contrib.estimator.handler import QLibDataHandlerClose
    from qlib.contrib.model.gbdt import LGBModel

    # Stock pool to load; `csi500` is used here as an example, adjust it as needed.
    MARKET = "csi500"

    DATA_HANDLER_CONFIG = {
        "dropna_label": True,
        "start_date": "2007-01-01",
        "end_date": "2020-08-01",
        "market": MARKET,
    }

    TRAINER_CONFIG = {
        "train_start_date": "2007-01-01",
        "train_end_date": "2014-12-31",
        "validate_start_date": "2015-01-01",
        "validate_end_date": "2016-12-31",
        "test_start_date": "2017-01-01",
        "test_end_date": "2020-08-01",
    }

    # Load the data and split it into train, validation, and test sets.
    x_train, y_train, x_validate, y_validate, x_test, y_test = QLibDataHandlerClose(
        **DATA_HANDLER_CONFIG
    ).get_split_data(**TRAINER_CONFIG)

    MODEL_CONFIG = {
        "loss": "mse",
        "colsample_bytree": 0.8879,
        "learning_rate": 0.0421,
        "subsample": 0.8789,
        "lambda_l1": 205.6999,
        "lambda_l2": 580.9768,
        "max_depth": 8,
        "num_leaves": 210,
        "num_threads": 20,
    }

    # Use the default LightGBM model; for a custom model, refer to the Model API.
    model = LGBModel(**MODEL_CONFIG)
    model.fit(x_train, y_train, x_validate, y_validate)
    _pred = model.predict(x_test)

    # Keep the first prediction column as the score.
    pred_score = pd.DataFrame(index=_pred.index)
    pred_score["score"] = _pred.iloc(axis=1)[0]
|
||||
|
||||
.. note:: `QLibDataHandlerClose` is the data handler provided by ``Qlib``, please refer to `Data Handler <data.html#data-handler>`_.
|
||||
|
||||
Also, the above example has been given in ``examples/train_backtest_analyze.ipynb``.
|
||||
|
||||
Custom Model
|
||||
===================
|
||||
|
||||
Qlib supports custom models. If users are interested in customizing their own models and integrating the models into ``Qlib``, please refer to `Custom Model Integration <../start/integration.html>`_.
|
||||
|
||||
|
||||
API
|
||||
===================
|
||||
Please refer to `Model API <../reference/api.html#module-qlib.contrib.model.base>`_.
|
||||
197
docs/component/report.rst
Normal file
@@ -0,0 +1,197 @@
|
||||
.. _report:
|
||||
==========================================
|
||||
Analysis: Evaluation & Results Analysis
|
||||
==========================================
|
||||
|
||||
Introduction
|
||||
===================
|
||||
|
||||
``Analysis`` is designed to show the graphical reports of ``Intraday Trading``, which help users evaluate and analyse investment portfolios visually. The following graphics are available:
|
||||
|
||||
- analysis_position
|
||||
- report_graph
|
||||
- score_ic_graph
|
||||
- cumulative_return_graph
|
||||
- risk_analysis_graph
|
||||
- rank_label_graph
|
||||
|
||||
- analysis_model
|
||||
- model_performance_graph
|
||||
|
||||
|
||||
Graphical Reports
|
||||
===================
|
||||
|
||||
Users can run the following code to get all supported reports.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib.contrib.report as qcr
|
||||
>>> print(qcr.GRAPH_NAME_LIST)
|
||||
['analysis_position.report_graph', 'analysis_position.score_ic_graph', 'analysis_position.cumulative_return_graph', 'analysis_position.risk_analysis_graph', 'analysis_position.rank_label_graph', 'analysis_model.model_performance_graph']
|
||||
|
||||
.. note::
|
||||
|
||||
For more details, please refer to the function documentation, for example ``help(qcr.analysis_position.report_graph)``.
|
||||
|
||||
|
||||
|
||||
Usage&Example
|
||||
===================
|
||||
|
||||
Usage of `analysis_position.report`
|
||||
-----------------------------------
|
||||
|
||||
API
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.report
|
||||
:members:
|
||||
|
||||
Graphical Result
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
.. note::
|
||||
|
||||
- Axis X: Trading day
|
||||
- Axis Y: Accumulated value
|
||||
- The shaded part above: Maximum drawdown corresponding to `cum return`
|
||||
- The shaded part below: Maximum drawdown corresponding to `cum ex return wo cost`
|
||||
|
||||
.. image:: ../_static/img/analysis/report.png
|
||||
|
||||
|
||||
Usage of `analysis_position.score_ic`
|
||||
-------------------------------------
|
||||
|
||||
API
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.score_ic
|
||||
:members:
|
||||
|
||||
|
||||
Graphical Result
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. note::
|
||||
|
||||
- Axis X: Trading day
|
||||
- Axis Y: IC between `Ref($close, -1)/$close - 1` and `score`
|
||||
|
||||
.. image:: ../_static/img/analysis/score_ic.png
|
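The quantity on the Y axis can be approximated with pandas. The sketch below assumes that IC means the per-day Pearson correlation between the prediction `score` and the label `Ref($close, -1)/$close - 1`; the function `daily_ic` is hypothetical and is not the code behind `score_ic_graph`.

.. code-block:: python

    import pandas as pd


    def daily_ic(score, label):
        """Correlation between score and label, computed separately for each day."""
        frame = pd.concat({"score": score, "label": label}, axis=1)
        return frame.groupby(level="datetime").apply(
            lambda day: day["score"].corr(day["label"])
        )


    index = pd.MultiIndex.from_product(
        [["SH600004", "SH600005", "SZ300273"],
         pd.to_datetime(["2017-01-03", "2017-01-04"])],
        names=["instrument", "datetime"],
    )
    score = pd.Series([1.0, 1.0, 2.0, 2.0, 3.0, 3.0], index=index)
    label = pd.Series([0.1, 0.3, 0.2, 0.2, 0.3, 0.1], index=index)
    ic = daily_ic(score, label)

In this toy data the scores rank the labels perfectly on the first day and in reverse on the second day.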
||||
|
||||
|
||||
Usage of `analysis_position.cumulative_return`
|
||||
----------------------------------------------
|
||||
|
||||
API
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.cumulative_return
|
||||
:members:
|
||||
|
||||
Graphical Result
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. note::
|
||||
|
||||
- Cumulative return graphics.
|
||||
- Axis X: Trading day
|
||||
- Axis Y:
|
||||
- Above axis Y: `(((Ref($close, -1)/$close - 1) * weight).sum() / weight.sum()).cumsum()`
|
||||
- Below axis Y: Daily weight sum
|
||||
- In the **sell** graph, `y < 0` stands for profit; in other cases, `y > 0` stands for profit.
|
||||
- In the **buy_minus_sell** graph, the **y** value of the **weight** graph at the bottom is `buy_weight + sell_weight`.
|
||||
- In each graph, the **red line** in the histogram on the right represents the average.
|
||||
|
||||
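The formula for the upper Y axis can be read as "weighted average label per day, then a running sum". The following sketch spells that out in plain Python under that reading; ``weighted_cumulative_return`` is a hypothetical helper and the numbers are made up.

```python
from itertools import accumulate


def weighted_cumulative_return(days):
    """Cumulative sum of the daily weighted-average label.

    days: one list per trading day of (label, weight) pairs, where label
    stands for Ref($close, -1)/$close - 1 of a traded instrument.
    """
    daily = []
    for rows in days:
        total_weight = sum(w for _, w in rows)
        if total_weight == 0:
            daily.append(0.0)  # assumption: a day without weight contributes nothing
        else:
            daily.append(sum(r * w for r, w in rows) / total_weight)
    return list(accumulate(daily))


# Day 1: (0.02*1 + 0.00*1) / 2 = 0.01; day 2: (0.01*3 + 0.05*1) / 4 = 0.02.
curve = weighted_cumulative_return([
    [(0.02, 1.0), (0.00, 1.0)],
    [(0.01, 3.0), (0.05, 1.0)],
])
```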
.. image:: ../_static/img/analysis/cumulative_return_buy.png
|
||||
|
||||
.. image:: ../_static/img/analysis/cumulative_return_sell.png
|
||||
|
||||
.. image:: ../_static/img/analysis/cumulative_return_buy_minus_sell.png
|
||||
|
||||
.. image:: ../_static/img/analysis/cumulative_return_hold.png
|
||||
|
||||
|
||||
Usage of `analysis_position.risk_analysis`
|
||||
----------------------------------------------
|
||||
|
||||
API
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.risk_analysis
|
||||
:members:
|
||||
|
||||
|
||||
.. note::
|
||||
|
||||
- annual/mdd/sharpe/std graphics
|
||||
- Axis X: Trading days are grouped by month
|
||||
- Axis Y: monthly(trading date) value
|
||||
|
||||
Graphical Result
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. image:: ../_static/img/analysis/risk_analysis_bar.png
|
||||
|
||||
.. image:: ../_static/img/analysis/risk_analysis_annual.png
|
||||
|
||||
.. image:: ../_static/img/analysis/risk_analysis_mdd.png
|
||||
|
||||
.. image:: ../_static/img/analysis/risk_analysis_sharpe.png
|
||||
|
||||
.. image:: ../_static/img/analysis/risk_analysis_std.png
|
||||
|
||||
|
||||
Usage of `analysis_position.rank_label`
|
||||
----------------------------------------------
|
||||
|
||||
API
|
||||
~~~~~
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.rank_label
|
||||
:members:
|
||||
|
||||
|
||||
Graphical Result
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. note::
|
||||
|
||||
- hold/sell/buy graphics:
|
||||
- Axis X: Trading day
|
||||
- Axis Y: The percentage `'Ref($close, -1)/$close - 1'.rank(ascending=False) / (number of lines on the day) * 100` on each trading day (`ascending=False`: the higher the value, the higher the ranking).
|
||||
|
||||
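As an illustration of the Y-axis expression, the sketch below ranks the labels of one trading day in descending order and converts each rank to a percentage. It is a simplification: ties are broken by sort order here, whereas a pandas ``rank`` call would average tied ranks by default. ``rank_percent`` is a hypothetical helper and the labels are made up.

```python
def rank_percent(labels):
    """Map each instrument to rank / number_of_instruments * 100.

    labels: {instrument: Ref($close, -1)/$close - 1} for a single trading day.
    The highest label gets rank 1, which mirrors ascending=False.
    """
    ordered = sorted(labels, key=labels.get, reverse=True)
    n = len(ordered)
    return {inst: (pos + 1) / n * 100 for pos, inst in enumerate(ordered)}


pct = rank_percent({"A": 0.03, "B": -0.01, "C": 0.01, "D": 0.00})
```

With four instruments, the best label maps to 25 and the worst to 100, so a low value on the graph means the traded stocks had comparatively high next-day returns.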
.. image:: ../_static/img/analysis/rank_label_hold.png
|
||||
|
||||
.. image:: ../_static/img/analysis/rank_label_buy.png
|
||||
|
||||
.. image:: ../_static/img/analysis/rank_label_sell.png
|
||||
|
||||
|
||||
|
||||
Usage of `analysis_model.analysis_model_performance`
|
||||
-----------------------------------------------------
|
||||
|
||||
API
|
||||
~~~~~
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_model.analysis_model_performance
|
||||
:members:
|
||||
|
||||
|
||||
Graphical Result
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. image:: ../_static/img/analysis/analysis_model_cumulative_return.png
|
||||
|
||||
.. image:: ../_static/img/analysis/analysis_model_long_short.png
|
||||
|
||||
.. image:: ../_static/img/analysis/analysis_model_IC.png
|
||||
|
||||
.. image:: ../_static/img/analysis/analysis_model_monthly_IC.png
|
||||
|
||||
.. image:: ../_static/img/analysis/analysis_model_NDQ.png
|
||||
|
||||
.. image:: ../_static/img/analysis/analysis_model_auto_correlation.png
|
||||
119
docs/component/strategy.rst
Normal file
@@ -0,0 +1,119 @@
|
||||
.. _strategy:
|
||||
========================================
|
||||
Interday Strategy: Portfolio Management
|
||||
========================================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
===================
|
||||
|
||||
``Interday Strategy`` lets users adopt different trading strategies, that is, different algorithms that generate investment portfolios from the prediction scores of the ``Interday Model``. It can be used in an automatic workflow through ``Estimator``; please refer to `Estimator <estimator.html>`_.
|
||||
|
||||
Because the components in ``Qlib`` are designed in a loosely-coupled way, ``Interday Strategy`` can also be used as an independent module.
|
||||
|
||||
``Qlib`` provides several implemented trading strategies. ``Qlib`` also supports custom strategies, so users can customize strategies according to their own needs.
|
||||
|
||||
Base Class & Interface
|
||||
======================
|
||||
|
||||
BaseStrategy
|
||||
------------------
|
||||
|
||||
Qlib provides a base class ``qlib.contrib.strategy.BaseStrategy``. All strategy classes need to inherit the base class and implement its interface.
|
||||
|
||||
- `get_risk_degree`
|
||||
Return the proportion of the total value that will be used for investment. A dynamic `risk_degree` results in market timing.
|
||||
|
||||
- `generate_order_list`
|
||||
Return the order list.
|
||||
|
||||
Users can inherit `BaseStrategy` to customize their strategy class.
|
||||
|
||||
WeightStrategyBase
|
||||
--------------------
|
||||
|
||||
Qlib also provides a class ``qlib.contrib.strategy.WeightStrategyBase`` that is a subclass of `BaseStrategy`.
|
||||
|
||||
`WeightStrategyBase` only focuses on the target positions, and automatically generates an order list based on positions. It provides the `generate_target_weight_position` interface.
|
||||
|
||||
- `generate_target_weight_position`
|
||||
- Generate the target position from the current position and the trading date. Cash is not considered.
|
||||
- Return the target position.
|
||||
|
||||
.. note::
|
||||
Here the `target position` means the target percentage of total assets.
|
||||
|
||||
`WeightStrategyBase` implements the interface `generate_order_list`, whose process is as follows.
|
||||
|
||||
- Call `generate_target_weight_position` method to generate the target position.
|
||||
- Generate the target amount of stocks from the target position.
|
||||
- Generate the order list from the target amount
|
||||
|
||||
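The three steps above can be sketched in plain Python. This is only an illustration of the idea, not the ``WeightStrategyBase`` implementation: the helper names, the ``(stock, direction, amount)`` order format and the data are assumptions, and real-world details such as round lots, trading costs, price limits and the risk degree are ignored.

```python
def target_amounts(target_weight, total_value, prices):
    """Convert target weights (fractions of total assets) into share amounts."""
    return {
        stock: int(total_value * weight / prices[stock])
        for stock, weight in target_weight.items()
    }


def order_list(current_amounts, target):
    """Emit (stock, direction, amount) tuples that move the holdings to the target."""
    orders = []
    for stock in sorted(set(current_amounts) | set(target)):
        diff = target.get(stock, 0) - current_amounts.get(stock, 0)
        if diff > 0:
            orders.append((stock, "buy", diff))
        elif diff < 0:
            orders.append((stock, "sell", -diff))
    return orders


# Step 1 (the target position) is given here as a fixed, made-up dictionary.
weights = {"A": 0.5, "B": 0.25}
amounts = target_amounts(weights, total_value=1000, prices={"A": 10, "B": 5, "C": 20})
orders = order_list({"A": 30, "C": 10}, amounts)
```

Stock ``C`` is not part of the target position, so its whole holding is sold.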
Users can inherit `WeightStrategyBase` and implement the interface `generate_target_weight_position` to customize their strategy class, which only focuses on the target positions.
|
||||
|
||||
Implemented Strategy
|
||||
====================
|
||||
|
||||
Qlib provides implemented strategy classes such as `TopkDropoutStrategy`.
|
||||
|
||||
|
||||
TopkDropoutStrategy
|
||||
-------------------
|
||||
`TopkDropoutStrategy` is a subclass of `BaseStrategy` and implements the interface `generate_order_list`, whose process is as follows.
|
||||
|
||||
- Adopt the ``Topk-Drop`` algorithm to calculate the target amount of each stock
|
||||
|
||||
.. note::
|
||||
``Topk-Drop`` algorithm:
|
||||
|
||||
- `Topk`: The number of stocks held
|
||||
- `Drop`: The number of stocks sold on each trading day
|
||||
|
||||
Currently, the number of held stocks is `Topk`.
|
||||
On each trading day, the `Drop` held stocks with the worst prediction scores are sold, and the same number of unheld stocks with the best prediction scores are bought.
|
||||
|
||||
.. image:: ../_static/img/topk_drop.png
|
||||
:alt: Topk-Drop
|
||||
|
||||
The ``Topk-Drop`` algorithm sells `Drop` stocks every trading day, which guarantees a fixed turnover rate.
|
||||
|
||||
- Generate the order list from the target amount
|
||||
|
||||
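The ``Topk-Drop`` rule described above can be sketched as a single trading-day step. This is a minimal illustration of the rule as stated in this document, not the code of `TopkDropoutStrategy`; ``topk_drop_step`` and the scores are hypothetical, and the sketch assumes that `Topk` stocks are already held.

```python
def topk_drop_step(held, scores, n_drop):
    """One Topk-Drop step: sell the worst-scored holdings, buy the best unheld stocks.

    held:   list of currently held stocks (its length plays the role of Topk)
    scores: {stock: prediction score} for the trading day
    n_drop: number of stocks to replace (Drop)
    """
    # Held stocks without a score are treated as the worst candidates.
    worst_first = sorted(held, key=lambda s: scores.get(s, float("-inf")))
    sell = worst_first[:n_drop]
    candidates = sorted(
        (s for s in scores if s not in held), key=scores.get, reverse=True
    )
    buy = candidates[:len(sell)]
    new_held = [s for s in held if s not in sell] + buy
    return sell, buy, new_held


scores = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.7, "E": 0.95, "F": 0.2}
sell, buy, new_held = topk_drop_step(["A", "B", "C"], scores, n_drop=1)
```

Because exactly ``n_drop`` of the held stocks are replaced at each step, the share of the portfolio that changes per day is ``n_drop / Topk``, which is where the fixed turnover rate comes from.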
Usage & Example
|
||||
====================
|
||||
``Interday Strategy`` can be specified in ``Intraday Trading(Backtest)``; an example is as follows.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from qlib.contrib.strategy.strategy import TopkDropoutStrategy
|
||||
from qlib.contrib.evaluate import backtest
|
||||
STRATEGY_CONFIG = {
|
||||
"topk": 50,
|
||||
"n_drop": 5,
|
||||
}
|
||||
BACKTEST_CONFIG = {
|
||||
"verbose": False,
|
||||
"limit_threshold": 0.095,
|
||||
"account": 100000000,
|
||||
"benchmark": BENCHMARK,
|
||||
"deal_price": "vwap",
|
||||
}
|
||||
|
||||
# use default strategy
|
||||
# custom Strategy, refer to: TODO: Strategy API url
|
||||
strategy = TopkDropoutStrategy(**STRATEGY_CONFIG)
|
||||
|
||||
# pred_score is the prediction score output by Model
|
||||
report_normal, positions_normal = backtest(
|
||||
pred_score, strategy=strategy, **BACKTEST_CONFIG
|
||||
)
|
||||
|
||||
The example above is also available in ``examples/train_backtest_analyze.ipynb``.
|
||||
|
||||
To learn more about the prediction score `pred_score` output by ``Interday Model``, please refer to `Interday Model: Model Training & Prediction <model.html>`_.
|
||||
|
||||
To learn more about ``Intraday Trading``, please refer to `Intraday Trading: Model&Strategy Testing <backtest.html>`_.
|
||||
|
||||
Reference
|
||||
===================
|
||||
To learn more about ``Interday Strategy``, please refer to `Strategy API <../reference/api.html>`_.
|
||||
224
docs/conf.py
Normal file
@@ -0,0 +1,224 @@
|
||||
# Copyright (c) Microsoft Corporation.
|
||||
# Licensed under the MIT License.
|
||||
|
||||
|
||||
# QLib documentation build configuration file, created by
|
||||
# sphinx-quickstart on Wed Sep 27 15:16:05 2017.
|
||||
#
|
||||
# This file is execfile()d with the current directory set to its
|
||||
# containing dir.
|
||||
#
|
||||
# Note that not all possible configuration values are present in this
|
||||
# autogenerated file.
|
||||
#
|
||||
# All configuration values have a default; values that are commented out
|
||||
# serve to show the default.
|
||||
|
||||
# If extensions (or modules to document with autodoc) are in another directory,
|
||||
# add these directories to sys.path here. If the directory is relative to the
|
||||
# documentation root, use os.path.abspath to make it absolute, like shown here.
|
||||
#
|
||||
import os
|
||||
import sys
|
||||
|
||||
import pkg_resources
|
||||
|
||||
|
||||
# -- General configuration ------------------------------------------------
|
||||
|
||||
# If your documentation needs a minimal Sphinx version, state it here.
|
||||
#
|
||||
# needs_sphinx = '1.0'
|
||||
|
||||
# Add any Sphinx extension module names here, as strings. They can be
|
||||
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
|
||||
# ones.
|
||||
extensions = [
|
||||
'sphinx.ext.autodoc',
|
||||
'sphinx.ext.todo',
|
||||
'sphinx.ext.mathjax',
|
||||
'sphinx.ext.napoleon',
|
||||
]
|
||||
|
||||
# Add any paths that contain templates here, relative to this directory.
|
||||
templates_path = ['_templates']
|
||||
|
||||
# The suffix(es) of source filenames.
|
||||
# You can specify multiple suffix as a list of string:
|
||||
#
|
||||
# source_suffix = ['.rst', '.md']
|
||||
source_suffix = '.rst'
|
||||
|
||||
# The master toctree document.
|
||||
master_doc = 'index'
|
||||
|
||||
# General information about the project.
|
||||
project = u"QLib"
|
||||
copyright = u"Microsoft"
|
||||
author = u"Microsoft"
|
||||
|
||||
# The version info for the project you're documenting, acts as replacement for
|
||||
# |version| and |release|, also used in various other places throughout the
|
||||
# built documents.
|
||||
#
|
||||
# The short X.Y version.
|
||||
version = pkg_resources.get_distribution("qlib").version
|
||||
# The full version, including alpha/beta/rc tags.
|
||||
release = pkg_resources.get_distribution("qlib").version
|
||||
|
||||
# The language for content autogenerated by Sphinx. Refer to documentation
|
||||
# for a list of supported languages.
|
||||
#
|
||||
# This is also used if you do content translation via gettext catalogs.
|
||||
# Usually you set "language" from the command line for these cases.
|
||||
language = 'en_US'
|
||||
|
||||
# List of patterns, relative to source directory, that match files and
|
||||
# directories to ignore when looking for source files.
|
||||
# This patterns also effect to html_static_path and html_extra_path
|
||||
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
|
||||
|
||||
# The name of the Pygments (syntax highlighting) style to use.
|
||||
pygments_style = 'sphinx'
|
||||
|
||||
# If true, `todo` and `todoList` produce output, else they produce nothing.
|
||||
todo_include_todos = False
|
||||
|
||||
# If true, '()' will be appended to :func: etc. cross-reference text.
|
||||
add_function_parentheses = False
|
||||
|
||||
# If true, the current module name will be prepended to all description
|
||||
# unit titles (such as .. function::).
|
||||
add_module_names = True
|
||||
|
||||
# If true, `todo` and `todoList` produce output, else they produce nothing.
|
||||
todo_include_todos = True
|
||||
|
||||
|
||||
# -- Options for HTML output ----------------------------------------------
|
||||
|
||||
# The theme to use for HTML and HTML Help pages. See the documentation for
|
||||
# a list of builtin themes.
|
||||
#
|
||||
html_theme = "sphinx_rtd_theme"
|
||||
|
||||
# Theme options are theme-specific and customize the look and feel of a theme
|
||||
# further. For a list of options available for each theme, see the
|
||||
# documentation.
|
||||
# html_context = {
|
||||
# "display_github": False,
|
||||
# "last_updated": True,
|
||||
# "commit": True,
|
||||
# "github_user": "Microsoft",
|
||||
# "github_repo": "QLib",
|
||||
# 'github_version': 'master',
|
||||
# 'conf_py_path': '/docs/',
|
||||
|
||||
# }
|
||||
#
|
||||
html_theme_options = {
|
||||
'collapse_navigation': False,
|
||||
'display_version': False,
|
||||
'navigation_depth': 3,
|
||||
}
|
||||
|
||||
# Add any paths that contain custom static files (such as style sheets) here,
|
||||
# relative to this directory. They are copied after the builtin static files,
|
||||
# so a file named "default.css" will overwrite the builtin "default.css".
|
||||
#html_static_path = ['_static']
|
||||
|
||||
# Custom sidebar templates, must be a dictionary that maps document names
|
||||
# to template names.
|
||||
#
|
||||
# This is required for the alabaster theme
|
||||
# refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars
|
||||
html_sidebars = {
|
||||
'**': [
|
||||
'about.html',
|
||||
'navigation.html',
|
||||
'relations.html', # needs 'show_related': True theme option to display
|
||||
'searchbox.html',
|
||||
]
|
||||
}
|
||||
|
||||
|
||||
# -- Options for HTMLHelp output ------------------------------------------
|
||||
|
||||
# Output file base name for HTML help builder.
|
||||
htmlhelp_basename = 'qlibdoc'
|
||||
|
||||
|
||||
# -- Options for LaTeX output ---------------------------------------------
|
||||
|
||||
latex_elements = {
|
||||
# The paper size ('letterpaper' or 'a4paper').
|
||||
#
|
||||
# 'papersize': 'letterpaper',
|
||||
|
||||
# The font size ('10pt', '11pt' or '12pt').
|
||||
#
|
||||
# 'pointsize': '10pt',
|
||||
|
||||
# Additional stuff for the LaTeX preamble.
|
||||
#
|
||||
# 'preamble': '',
|
||||
|
||||
# Latex figure (float) alignment
|
||||
#
|
||||
# 'figure_align': 'htbp',
|
||||
}
|
||||
|
||||
# Grouping the document tree into LaTeX files. List of tuples
|
||||
# (source start file, target name, title,
|
||||
# author, documentclass [howto, manual, or own class]).
|
||||
latex_documents = [
|
||||
(master_doc, "qlib.tex", u"QLib Documentation", u"Microsoft", "manual"),
|
||||
]
|
||||
|
||||
|
||||
# -- Options for manual page output ---------------------------------------
|
||||
|
||||
# One entry per manual page. List of tuples
|
||||
# (source start file, name, description, authors, manual section).
|
||||
man_pages = [
|
||||
(master_doc, 'qlib', u'QLib Documentation',
|
||||
[author], 1)
|
||||
]
|
||||
|
||||
|
||||
# -- Options for Texinfo output -------------------------------------------
|
||||
|
||||
# Grouping the document tree into Texinfo files. List of tuples
|
||||
# (source start file, target name, title, author,
|
||||
# dir menu entry, description, category)
|
||||
texinfo_documents = [
|
||||
(master_doc, 'QLib', u'QLib Documentation',
|
||||
author, 'QLib', 'One line description of project.',
|
||||
'Miscellaneous'),
|
||||
]
|
||||
|
||||
|
||||
|
||||
# -- Options for Epub output ----------------------------------------------
|
||||
|
||||
# Bibliographic Dublin Core info.
|
||||
epub_title = project
|
||||
epub_author = author
|
||||
epub_publisher = author
|
||||
epub_copyright = copyright
|
||||
|
||||
# The unique identifier of the text. This can be a ISBN number
|
||||
# or the project homepage.
|
||||
#
|
||||
# epub_identifier = ''
|
||||
|
||||
# A unique identification for the text.
|
||||
#
|
||||
# epub_uid = ''
|
||||
|
||||
# A list of files that should not be packed into the epub file.
|
||||
epub_exclude_files = ['search.html']
|
||||
|
||||
|
||||
autodoc_member_order = 'bysource'
|
||||
autodoc_default_flags = ['members']
|
||||
171
docs/hidden/client.rst
Normal file
@@ -0,0 +1,171 @@
|
||||
.. _client:
|
||||
|
||||
Qlib Client-Server Framework
|
||||
============================
|
||||
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
------------
|
||||
The client-server framework is designed to solve the following problems:
|
||||
|
||||
- Manage the data in a centralized way. Users don't have to manage data of different versions.
|
||||
- Reduce the amount of cache to be generated.
|
||||
- Make the data accessible remotely.
|
||||
|
||||
Therefore, we designed the client-server framework to solve these problems.
|
||||
We maintain a server that provides the data.
|
||||
|
||||
You have to initialize qlib with a specific config to use the client-server framework.
|
||||
Here is a typical initialization process.
|
||||
|
||||
The commonly used parameters of qlib ``init`` are listed below. ``nfs-common`` must be installed on the machine where the client runs; to install it, execute ``sudo apt install nfs-common``:
|
||||
- ``provider_uri``: NFS server path in the format ``host:data_dir``, for example ``172.23.233.89:/data2/gaochao/sync_qlib/qlib``. For offline use, it can be a local data directory
|
||||
- ``mount_path``: local data directory, ``provider_uri`` will be mounted to this directory
|
||||
- ``auto_mount``: whether to automatically mount ``provider_uri`` to ``mount_path`` during qlib ``init``. You can also mount it manually: sudo mount.nfs ``provider_uri`` ``mount_path``. On PAI, it is recommended to set ``auto_mount=True``
|
||||
- ``flask_server``: data service host; if you are on the intranet, you can use the default host: 172.23.233.89
|
||||
- ``flask_port``: data service port
|
||||
|
||||
|
||||
If running on the 10.150.144.153 or 10.150.144.154 server, it is recommended to use the following code to ``init`` qlib:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.init(auto_mount=False, mount_path='/data/csdesign/qlib')
|
||||
>>> from qlib.data import D
|
||||
>>> D.features(['SH600000'], ['$close'], start_time='20080101', end_time='20090101').head()
|
||||
[39336:MainThread](2019-05-28 21:35:42,800) INFO - Initialization - [__init__.py:16] - default_conf: client.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:56] - provider_uri=172.23.233.89:/data2/gaochao/sync_qlib/qlib
|
||||
[39336:Thread-68](2019-05-28 21:35:42,809) INFO - Client - [client.py:28] - Connect to server ws://172.23.233.89:9710
|
||||
[39336:Thread-72](2019-05-28 21:35:43,489) INFO - Client - [client.py:31] - Disconnect from server!
|
||||
Opening /data/csdesign/qlib/cache/d239a3b191daa9a5b1b19a59beb47b33 in read-only mode
|
||||
Out[5]:
|
||||
$close
|
||||
instrument datetime
|
||||
SH600000 2008-01-02 119.079704
|
||||
2008-01-03 113.120125
|
||||
2008-01-04 117.878860
|
||||
2008-01-07 124.505539
|
||||
2008-01-08 125.395004
|
||||
|
||||
|
||||
If running on PAI, it's recommended to use the following code to ``init`` qlib:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.init(auto_mount=True, mount_path='/data/csdesign/qlib', provider_uri='172.23.233.89:/data2/gaochao/sync_qlib/qlib')
|
||||
>>> from qlib.data import D
|
||||
>>> D.features(['SH600000'], ['$close'], start_time='20080101', end_time='20090101').head()
|
||||
[39336:MainThread](2019-05-28 21:35:42,800) INFO - Initialization - [__init__.py:16] - default_conf: client.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:56] - provider_uri=172.23.233.89:/data2/gaochao/sync_qlib/qlib
|
||||
[39336:Thread-68](2019-05-28 21:35:42,809) INFO - Client - [client.py:28] - Connect to server ws://172.23.233.89:9710
|
||||
[39336:Thread-72](2019-05-28 21:35:43,489) INFO - Client - [client.py:31] - Disconnect from server!
|
||||
Opening /data/csdesign/qlib/cache/d239a3b191daa9a5b1b19a59beb47b33 in read-only mode
|
||||
Out[5]:
|
||||
$close
|
||||
instrument datetime
|
||||
SH600000 2008-01-02 119.079704
|
||||
2008-01-03 113.120125
|
||||
2008-01-04 117.878860
|
||||
2008-01-07 124.505539
|
||||
2008-01-08 125.395004
|
||||
|
||||
|
||||
If running on Windows, enable the **NFS** features and set a correct **mount_path**; it is then recommended to use the following code to ``init`` qlib:
|
||||
|
||||
1. Enable the NFS features on Windows
|
||||
* Open ``Programs and Features``.
|
||||
* Click ``Turn Windows features on or off``.
|
||||
* Scroll down and check the option ``Services for NFS``, then click OK
|
||||
Reference address: https://graspingtech.com/mount-nfs-share-windows-10/
|
||||
2. Configure a correct mount_path
|
||||
* On Windows, the mount path must be a root path that does not exist yet.
|
||||
* Examples of correctly formatted paths: `H`, `i`...
|
||||
* Examples of incorrectly formatted paths: `C`, `C:/user/name`, `qlib_data`...
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.init(auto_mount=True, mount_path='H', provider_uri='172.23.233.89:/data2/gaochao/sync_qlib/qlib')
|
||||
>>> from qlib.data import D
|
||||
>>> D.features(['SH600000'], ['$close'], start_time='20080101', end_time='20090101').head()
|
||||
[39336:MainThread](2019-05-28 21:35:42,800) INFO - Initialization - [__init__.py:16] - default_conf: client.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:56] - provider_uri=172.23.233.89:/data2/gaochao/sync_qlib/qlib
|
||||
[39336:Thread-68](2019-05-28 21:35:42,809) INFO - Client - [client.py:28] - Connect to server ws://172.23.233.89:9710
|
||||
[39336:Thread-72](2019-05-28 21:35:43,489) INFO - Client - [client.py:31] - Disconnect from server!
|
||||
Opening /data/csdesign/qlib/cache/d239a3b191daa9a5b1b19a59beb47b33 in read-only mode
|
||||
Out[5]:
|
||||
$close
|
||||
instrument datetime
|
||||
SH600000 2008-01-02 119.079704
|
||||
2008-01-03 113.120125
|
||||
2008-01-04 117.878860
|
||||
2008-01-07 124.505539
|
||||
2008-01-08 125.395004
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
The client mounts the data in `provider_uri` on `mount_path`. The server and client then communicate through flask and transport data over this NFS mount.
|
||||
|
||||
|
||||
If you have local qlib data files and want to use the data offline instead of online through the client-server framework,
this is also possible with a specific config.
You can create such a config, for example `client_config_local.yml`:
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
provider_uri: /data/csdesign/qlib
|
||||
calendar_provider: 'LocalCalendarProvider'
|
||||
instrument_provider: 'LocalInstrumentProvider'
|
||||
feature_provider: 'LocalFeatureProvider'
|
||||
expression_provider: 'LocalExpressionProvider'
|
||||
dataset_provider: 'LocalDatasetProvider'
|
||||
provider: 'LocalProvider'
|
||||
dataset_cache: 'SimpleDatasetCache'
|
||||
local_cache_path: '~/.cache/qlib/'
|
||||
|
||||
`provider_uri` is the directory of your local data.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.init_from_yaml_conf('client_config_local.yml')
|
||||
>>> from qlib.data import D
|
||||
>>> D.features(['SH600001'], ['$close'], start_time='20180101', end_time='20190101').head()
|
||||
[21232:MainThread](2019-05-29 10:16:05,066) INFO - Initialization - [__init__.py:16] - default_conf: client.
|
||||
[21232:MainThread](2019-05-29 10:16:05,066) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
|
||||
[21232:MainThread](2019-05-29 10:16:05,067) INFO - Initialization - [__init__.py:56] - provider_uri=/data/csdesign/qlib
|
||||
Out[9]:
|
||||
$close
|
||||
instrument datetime
|
||||
SH600001 2008-01-02 21.082111
|
||||
2008-01-03 23.195362
|
||||
2008-01-04 23.874615
|
||||
2008-01-07 24.880930
|
||||
2008-01-08 24.277143
|
||||
|
||||
Limitations
|
||||
-----------
|
||||
1. The following APIs under the client-server module may not be as fast as the older offline APIs.
|
||||
- Cal.calendar
|
||||
- Inst.list_instruments
|
||||
2. The rolling operation expression with parameter `0` cannot be updated correctly under the mechanism of the client-server framework.
|
||||
|
||||
API
|
||||
********************
|
||||
|
||||
The client is based on `python-socketio <https://python-socketio.readthedocs.io>`_, a framework that provides a WebSocket client for Python. The client only sends requests and receives results; it does not perform any calculation itself.
|
||||
|
||||
Class
|
||||
--------------------
|
||||
|
||||
.. automodule:: qlib.data.client
|
||||
|
||||
|
||||
285
docs/hidden/online.rst
Normal file
@@ -0,0 +1,285 @@
|
||||
.. _online:
|
||||
|
||||
Online
|
||||
===================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
-------------------
|
||||
|
||||
Welcome to Online. This module simulates what real trading with our model and strategy would look like.
|
||||
|
||||
Just like Estimator and other modules in Qlib, you need to set parameters through a configuration file.
In this module, you also need to add an account in a folder to run the simulation. On each following day,
the module uses the newest information to trade for your account,
and the performance can be viewed at any time using the API we defined.
|
||||
|
||||
Each account goes through the following processes. Here 'pred_date' is the date on which you predict the target
positions after trading, and 'trade_date' is the date on which you do the trading.
|
||||
|
||||
- Generate the order list (pred_date)
|
||||
- Execute the order list (trade_date)
|
||||
- Update account (trade_date)
|
||||
|
||||
Alternatively, you can simply create an account and use this module to test its performance over a period.
|
||||
|
||||
- Simulate (start_date, end_date)
|
||||
|
||||
This module needs to save your account in a folder. The model and strategy are saved as pickle files,
and the position and report are saved as Excel files.
The file structure can be viewed at fileStruct_.
|
||||
|
||||
|
||||
Example
|
||||
-------------------
|
||||
|
||||
Let's take an example.
|
||||
|
||||
.. note:: Make sure you have the latest version of `qlib` installed.
|
||||
|
||||
If you want to use the models and data provided by `qlib`, you only need to do as follows.
|
||||
|
||||
First, write a simple configuration file as follows.
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
strategy:
|
||||
class: TopkAmountStrategy
|
||||
module_path: qlib.contrib.strategy
|
||||
args:
|
||||
market: csi500
|
||||
trade_freq: 5
|
||||
|
||||
model:
|
||||
class: ScoreFileModel
|
||||
module_path: qlib.contrib.online.online_model
|
||||
args:
|
||||
loss: mse
|
||||
model_path: ./model.bin
|
||||
|
||||
init_cash: 1000000000
|
||||
|
||||
We can then use the following command to create a folder and trade from 2017-01-01 to 2018-08-01.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
online simulate -id v-test -config ./config/config.yaml -exchange_config ./config/exchange.yaml -start 2017-01-01 -end 2018-08-01 -path ./user_data/
|
||||
|
||||
The start date (2017-01-01) is the date on which the user is added, which is also the first predict date,
and the end date (2018-08-01) is the last trade date. You can use the "`online generate -date 2018-08-02...`"
command to continue generating the order list on the next trading date.
|
||||
|
||||
If your account was saved in "./user_data/", you can view its performance compared to a benchmark with:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online show -id v-test -path ./user_data/ -bench SH000905
|
||||
|
||||
...
|
||||
Result of porfolio:
|
||||
sub_bench:
|
||||
risk
|
||||
mean 0.001157
|
||||
std 0.003039
|
||||
annual 0.289131
|
||||
sharpe 6.017635
|
||||
mdd -0.013185
|
||||
sub_cost:
|
||||
risk
|
||||
mean 0.000800
|
||||
std 0.003043
|
||||
annual 0.199944
|
||||
sharpe 4.155963
|
||||
mdd -0.015517
|
||||
|
||||
Here 'SH000905' represents csi500 and 'SH000300' represents csi300.
|
||||
|
||||
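The figures printed above appear to be related by a simple annualisation. The sketch below checks that reading against the reported numbers; the 250 trading days per year and the formulas ``annual = mean * 250`` and ``sharpe = annual / (std * sqrt(250))`` are assumptions inferred from the output, not taken from the ``online`` source code.

```python
from math import sqrt

TRADING_DAYS = 250  # assumption: number of trading days used for annualisation


def annualise(mean, std, periods=TRADING_DAYS):
    """Return (annual return, Sharpe ratio) from a daily mean and standard deviation."""
    annual = mean * periods
    sharpe = annual / (std * sqrt(periods))
    return annual, sharpe


# Daily mean and std copied from the sub_bench and sub_cost tables above.
bench_annual, bench_sharpe = annualise(0.001157, 0.003039)
cost_annual, cost_sharpe = annualise(0.000800, 0.003043)
```

Both results match the ``annual`` and ``sharpe`` rows above to within the rounding of the printed ``mean`` and ``std``.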
Manage your account
|
||||
--------------------
|
||||
|
||||
Any account processed by `online` should be saved in a folder. You can use the commands
defined below to manage your accounts.
|
||||
|
||||
- add a new account
|
||||
This will add a new account with user_id='v-test', add_date='2019-10-15' in ./user_data.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online add_user -id {user_id} -config {config_file} -path {folder_path} -date {add_date}
|
||||
>> online add_user -id v-test -config config.yaml -path ./user_data/ -date 2019-10-15
|
||||
|
||||
- remove an account
|
||||
.. code-block:: bash
|
||||
|
||||
>> online remove_user -id {user_id} -path {folder_path}
|
||||
>> online remove_user -id v-test -path ./user_data/
|
||||
|
||||
- show the performance
|
||||
Here the benchmark is the baseline that your account is compared with.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online show -id {user_id} -path {folder_path} -bench {benchmark}
|
||||
>> online show -id v-test -path ./user_data/ -bench SH000905
|
||||
|
||||
The default value of every 'date' parameter below is the trade date
(today, if today is a trading date and the information has been updated in `qlib`).
|
||||
|
||||
The 'generate' and 'update' commands check whether the input date is valid. The following 3 processes should
be called on each trading date.
|
||||
|
||||
- generate the order list
|
||||
Generate the order list at the trade date and save it in {folder_path}/{user_id}/temp/ as a JSON file.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online generate -date {date} -path {folder_path}
|
||||
>> online generate -date 2019-10-16 -path ./user_data/
|
||||
|
||||
- execute the order list
|
||||
Execute the order list and generate the transaction results in {folder_path}/{user_id}/temp/ at the trade date.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online execute -date {date} -exchange_config {exchange_config_path} -path {folder_path}
|
||||
>> online execute -date 2019-10-16 -exchange_config ./config/exchange.yaml -path ./user_data/
|
||||
|
||||
A simple exchange config file can look as follows:
|
||||
|
||||
.. code-block:: yaml
|
||||
|
||||
open_cost: 0.003
|
||||
close_cost: 0.003
|
||||
limit_threshold: 0.095
|
||||
deal_price: vwap
|
||||
|
||||
|
||||
- update accounts
|
||||
Update the accounts in "{folder_path}/" at the trade date.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online update -date {date} -path {folder_path}
|
||||
>> online update -date 2019-10-16 -path ./user_data/
|
||||
|
||||
API
|
||||
------------------
|
||||
|
||||
All of these operations are defined in `qlib.contrib.online.operator`.
|
||||
|
||||
.. automodule:: qlib.contrib.online.operator
|
||||
|
||||
.. _fileStruct:
|
||||
|
||||
File structure
|
||||
------------------
|
||||
|
||||
'user_data' indicates the root folder.
Names in bold indicate folders; all other names are files.
|
||||
|
||||
.. code-block:: yaml
|
||||
|
||||
{user_folder}
|
||||
│ users.csv: (Init date for each user)
|
||||
│
|
||||
└───{user_id1}: (users' sub-folder to save their data)
|
||||
│ │ position.xlsx
|
||||
│ │ report.csv
|
||||
│ │ model_{user_id1}.pickle
|
||||
│ │ strategy_{user_id1}.pickle
|
||||
│ │
|
||||
│ └───score
|
||||
│ │ └───{YYYY}
|
||||
│ │ └───{MM}
|
||||
│ │ │ score_{YYYY-MM-DD}.csv
|
||||
│ │
|
||||
│ └───trade
|
||||
│ └───{YYYY}
|
||||
│ └───{MM}
|
||||
│ │ orderlist_{YYYY-MM-DD}.json
|
||||
│ │ transaction_{YYYY-MM-DD}.csv
|
||||
│
|
||||
└───{user_id2}
|
||||
│ │ position.xlsx
|
||||
│ │ report.csv
|
||||
│ │ model_{user_id2}.pickle
|
||||
│ │ strategy_{user_id2}.pickle
|
||||
│ │
|
||||
│ └───score
|
||||
│ └───trade
|
||||
....
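As an illustration of the layout above, the dated score and trade file paths can be derived from a user id and a date. This is a minimal sketch using only the standard library; the function names `score_path` and `trade_paths` are hypothetical and simply mirror the tree shown here.

```python
from datetime import date
from pathlib import PurePosixPath


def score_path(user_folder, user_id, day):
    """Path of the score file for `day`, following the tree above."""
    return (PurePosixPath(user_folder) / user_id / "score"
            / f"{day:%Y}" / f"{day:%m}" / f"score_{day:%Y-%m-%d}.csv")


def trade_paths(user_folder, user_id, day):
    """Paths of the order list and the transaction file for `day`."""
    base = PurePosixPath(user_folder) / user_id / "trade" / f"{day:%Y}" / f"{day:%m}"
    return (base / f"orderlist_{day:%Y-%m-%d}.json",
            base / f"transaction_{day:%Y-%m-%d}.csv")


day = date(2019, 10, 16)
print(score_path("user_data", "v-test", day))
print(*trade_paths("user_data", "v-test", day), sep="\n")
```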
|
||||
|
||||
|
||||
Configuration file
|
||||
------------------
|
||||
|
||||
The configure file used in `online` should contain the model and strategy information.
|
||||
|
||||
About the model
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
First, your configuration file needs to have a field about the model,
|
||||
this field and its contents determine the model we used when generating score at predict date.
|
||||
|
||||
The following two examples cover ScoreFileModel and a model that reads a score file and returns the score at the trade date.
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
model:
|
||||
class: ScoreFileModel
|
||||
module_path: qlib.contrib.online.OnlineModel
|
||||
args:
|
||||
loss: mse
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
model:
|
||||
class: ScoreFileModel
|
||||
module_path: qlib.contrib.online.OnlineModel
|
||||
args:
|
||||
score_path: <your score path>
|
||||
|
||||
If your model is not one of the models above, you need to implement it manually. Your model should be a subclass of the models defined in 'qlib.contrib.model', and it must contain the 2 methods used in the `online` module.
|
||||
|
||||
|
||||
About the strategy
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You need to define the strategy used to generate the order list at the predict date.
|
||||
|
||||
The following is an example for a TopkDropoutStrategy:
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
strategy:
|
||||
class: TopkDropoutStrategy
|
||||
module_path: qlib.contrib.strategy.strategy
|
||||
args:
|
||||
topk: 100
|
||||
n_drop: 10
|
||||
|
||||
Generated files
|
||||
------------------
|
||||
|
||||
The 'online_generate' command creates the order list in {folder_path}/{user_id}/temp/. The file is named orderlist_{YYYY-MM-DD}.json, where YYYY-MM-DD is the date on which those orders are to be executed.
|
||||
|
||||
The json file has the following format:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
{
|
||||
'sell': {
|
||||
{'$stock_id1': '$amount1'},
|
||||
{'$stock_id2': '$amount2'}, ...
|
||||
},
|
||||
'buy': {
|
||||
{'$stock_id1': '$amount1'},
|
||||
{'$stock_id2': '$amount2'}, ...
|
||||
}
|
||||
}
|
||||
|
||||
Then, after the order list is executed (either by 'online_execute' or by another executor), a transaction file will also be created in {folder_path}/{user_id}/temp/.
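An order list file of this kind can be inspected with the standard `json` module. The sketch below is only illustrative: the stock ids and amounts are made up, and it assumes that 'sell' and 'buy' each map a stock id directly to an amount, which may differ from the exact nesting that `online` writes.

```python
import json

# Hypothetical content of orderlist_2019-10-16.json. The flat
# {stock_id: amount} mapping per side is an assumption, not a spec.
order_list_text = """
{
    "sell": {"SH600000": 300, "SH600655": 100},
    "buy": {"SZ000001": 500}
}
"""

orders = json.loads(order_list_text)


def total_amount(orders, side):
    """Sum the order amounts on one side ("sell" or "buy")."""
    return sum(orders.get(side, {}).values())


for side in ("sell", "buy"):
    print(side, len(orders[side]), "orders, total amount", total_amount(orders, side))
```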
|
||||
327
docs/hidden/tuner.rst
Normal file
@@ -0,0 +1,327 @@
|
||||
.. _tuner:
|
||||
|
||||
Tuner
|
||||
===================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
-------------------
|
||||
|
||||
Welcome to Tuner. This document assumes that you can already use Estimator proficiently and correctly.
|
||||
|
||||
You can find the optimal hyper-parameters and combinations of models, trainers, strategies and data labels.
|
||||
|
||||
The `tuner` program is used in much the same way as `estimator`: you need to provide the URL of the configuration file.
|
||||
The `tuner` will do the following things:
|
||||
|
||||
- Construct tuner pipeline
|
||||
- Search and save best hyper-parameters of one tuner
|
||||
- Search next tuner in pipeline
|
||||
- Save the global best hyper-parameters and combination
|
||||
|
||||
Each tuner consists of one combination of modules, and its goal is to search for the optimal hyper-parameters of that combination.
The pipeline consists of different tuners and aims to find the optimal combination of modules.
|
||||
|
||||
The results are printed on screen and saved to file; you can check them in the files saved for your experiment.
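The flow described above can be summarized in a short, self-contained sketch. It is not the `tuner` implementation: the function names, the candidate lists, and the scores are all invented for illustration, and a plain `max` stands in for the real hyper-parameter search.

```python
def search_local_best(candidates):
    """Return the (params, score) pair with the highest score for one tuner."""
    return max(candidates, key=lambda pair: pair[1])


def run_pipeline(pipeline):
    """Search every tuner in turn, then pick the global best.

    `pipeline` maps a tuner id to the (params, score) pairs it evaluated.
    Returns (best_tuner_id, local_bests).
    """
    local_bests = {}
    for tuner_id, candidates in pipeline.items():
        local_bests[tuner_id] = search_local_best(candidates)
    best_tuner_id = max(local_bests, key=lambda tid: local_bests[tid][1])
    return best_tuner_id, local_bests


# Made-up evaluations: two tuners with two trials each.
pipeline = {
    0: [({"topk": 40}, 0.031), ({"topk": 30}, 0.047)],
    1: [({"topk": 35, "lr": 0.001}, 0.012), ({"topk": 35, "lr": 0.1}, 0.029)],
}
best_tuner_id, local_bests = run_pipeline(pipeline)
print("Best Tuner id:", best_tuner_id)
print("Global best params:", local_bests[best_tuner_id][0])
```

The local best of each tuner is kept even when it does not win globally, which matches the separate local and global result files that `tuner` saves.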
|
||||
|
||||
Example
|
||||
~~~~~~~
|
||||
|
||||
Let's look at an example.
|
||||
|
||||
First make sure you have the latest version of `qlib` installed.
|
||||
|
||||
Then, you need to provide a configuration to set up the experiment. A simple configuration example follows:
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
experiment:
|
||||
name: tuner_experiment
|
||||
tuner_class: QLibTuner
|
||||
qlib_client:
|
||||
auto_mount: False
|
||||
logging_level: INFO
|
||||
optimization_criteria:
|
||||
report_type: model
|
||||
report_factor: model_score
|
||||
optim_type: max
|
||||
tuner_pipeline:
|
||||
-
|
||||
model:
|
||||
class: SomeModel
|
||||
space: SomeModelSpace
|
||||
trainer:
|
||||
class: RollingTrainer
|
||||
strategy:
|
||||
class: TopkAmountStrategy
|
||||
space: TopkAmountStrategySpace
|
||||
max_evals: 2
|
||||
|
||||
time_period:
|
||||
rolling_period: 360
|
||||
train_start_date: 2005-01-01
|
||||
train_end_date: 2014-12-31
|
||||
validate_start_date: 2015-01-01
|
||||
validate_end_date: 2016-06-30
|
||||
test_start_date: 2016-07-01
|
||||
test_end_date: 2018-04-30
|
||||
data:
|
||||
class: ALPHA360
|
||||
provider_uri: /data/qlib
|
||||
args:
|
||||
start_date: 2005-01-01
|
||||
end_date: 2018-04-30
|
||||
dropna_label: True
|
||||
dropna_feature: True
|
||||
filter:
|
||||
market: csi500
|
||||
filter_pipeline:
|
||||
-
|
||||
class: NameDFilter
|
||||
module_path: qlib.data.filter
|
||||
args:
|
||||
name_rule_re: S(?!Z3)
|
||||
fstart_time: 2018-01-01
|
||||
fend_time: 2018-12-11
|
||||
-
|
||||
class: ExpressionDFilter
|
||||
module_path: qlib.data.filter
|
||||
args:
|
||||
rule_expression: $open/$factor<=45
|
||||
fstart_time: 2018-01-01
|
||||
fend_time: 2018-12-11
|
||||
backtest:
|
||||
normal_backtest_args:
|
||||
verbose: False
|
||||
limit_threshold: 0.095
|
||||
account: 500000
|
||||
benchmark: SH000905
|
||||
deal_price: vwap
|
||||
long_short_backtest_args:
|
||||
topk: 50
|
||||
|
||||
Next, run the following command. You will see output like this:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
~/v-yindzh/Qlib/cfg$ tuner -c tuner_config.yaml
|
||||
|
||||
Searching params: {'model_space': {'colsample_bytree': 0.8870905643607678, 'lambda_l1': 472.3188735122233, 'lambda_l2': 92.75390994877243, 'learning_rate': 0.09741751430635413, 'loss': 'mse', 'max_depth': 8, 'num_leaves': 160, 'num_threads': 20, 'subsample': 0.7536051584789751}, 'strategy_space': {'buffer_margin': 250, 'topk': 40}}
|
||||
...
|
||||
(Estimator experiment screen log)
|
||||
...
|
||||
Searching params: {'model_space': {'colsample_bytree': 0.6667379039007301, 'lambda_l1': 382.10698024977904, 'lambda_l2': 117.02506488151757, 'learning_rate': 0.18514539615228137, 'loss': 'mse', 'max_depth': 6, 'num_leaves': 200, 'num_threads': 12, 'subsample': 0.9449255686969292}, 'strategy_space': {'buffer_margin': 200, 'topk': 30}}
|
||||
...
|
||||
(Estimator experiment screen log)
|
||||
...
|
||||
Local best params: {'model_space': {'colsample_bytree': 0.6667379039007301, 'lambda_l1': 382.10698024977904, 'lambda_l2': 117.02506488151757, 'learning_rate': 0.18514539615228137, 'loss': 'mse', 'max_depth': 6, 'num_leaves': 200, 'num_threads': 12, 'subsample': 0.9449255686969292}, 'strategy_space': {'buffer_margin': 200, 'topk': 30}}
|
||||
Time cost: 489.87220 | Finished searching best parameters in Tuner 0.
|
||||
Time cost: 0.00069 | Finished saving local best tuner parameters to: tuner_experiment/estimator_experiment/estimator_experiment_0/local_best_params.json .
|
||||
Searching params: {'data_label_space': {'labels': ('Ref($vwap, -2)/Ref($vwap, -1) - 2',)}, 'model_space': {'input_dim': 158, 'lr': 0.001, 'lr_decay': 0.9100529502185579, 'lr_decay_steps': 162.48901403763966, 'optimizer': 'gd', 'output_dim': 1}, 'strategy_space': {'buffer_margin': 300, 'topk': 35}}
|
||||
...
|
||||
(Estimator experiment screen log)
|
||||
...
|
||||
Searching params: {'data_label_space': {'labels': ('Ref($vwap, -2)/Ref($vwap, -1) - 1',)}, 'model_space': {'input_dim': 158, 'lr': 0.1, 'lr_decay': 0.9882802970847494, 'lr_decay_steps': 164.76742865207729, 'optimizer': 'adam', 'output_dim': 1}, 'strategy_space': {'buffer_margin': 250, 'topk': 35}}
|
||||
...
|
||||
(Estimator experiment screen log)
|
||||
...
|
||||
Local best params: {'data_label_space': {'labels': ('Ref($vwap, -2)/Ref($vwap, -1) - 1',)}, 'model_space': {'input_dim': 158, 'lr': 0.1, 'lr_decay': 0.9882802970847494, 'lr_decay_steps': 164.76742865207729, 'optimizer': 'adam', 'output_dim': 1}, 'strategy_space': {'buffer_margin': 250, 'topk': 35}}
|
||||
Time cost: 550.74039 | Finished searching best parameters in Tuner 1.
|
||||
Time cost: 0.00023 | Finished saving local best tuner parameters to: tuner_experiment/estimator_experiment/estimator_experiment_1/local_best_params.json .
|
||||
Time cost: 1784.14691 | Finished tuner pipeline.
|
||||
Time cost: 0.00014 | Finished save global best tuner parameters.
|
||||
Best Tuner id: 0.
|
||||
You can check the best parameters at tuner_experiment/global_best_params.json.
|
||||
|
||||
|
||||
Finally, you can check the results of your experiment in the given path.
|
||||
|
||||
Configuration file
|
||||
------------------
|
||||
|
||||
Before using `tuner`, you need to prepare a configuration file. Next we will show you how to prepare each part of the configuration file.
|
||||
|
||||
About the experiment
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
First, your configuration file needs a field about the experiment, whose key is `experiment`. This field and its contents determine the saving path and the tuner class.
|
||||
|
||||
Usually it should contain the following content:
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
experiment:
|
||||
name: tuner_experiment
|
||||
tuner_class: QLibTuner
|
||||
|
||||
Also, there are some optional fields. The meaning of each field is as follows:
|
||||
|
||||
- `name`
|
||||
The experiment name, str type. The program uses this name to construct a directory in which the whole experiment process and the results are saved. The default value is `tuner_experiment`.
|
||||
|
||||
- `dir`
|
||||
The saving path, str type. The program constructs the experiment directory in this path. The default value is the path where the configuration file is located.
|
||||
|
||||
- `tuner_class`
|
||||
The class of the tuner, str type. It must be an already implemented tuner, such as `QLibTuner` in `qlib`, or a custom tuner that is a subclass of `qlib.contrib.tuner.Tuner`. The default value is `QLibTuner`.
|
||||
|
||||
- `tuner_module_path`
|
||||
The module path, str type; an absolute URL is also supported. It indicates the path of the tuner implementation. The default value is `qlib.contrib.tuner.tuner`.
|
||||
|
||||
About the optimization criteria
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You need to designate a factor to optimize, because the tuner needs a factor to decide which case is better than the others.
Usually, we use the results of `estimator`, such as the backtest results and the score of the model.
|
||||
|
||||
This part needs to contain these fields:
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
optimization_criteria:
|
||||
report_type: model
|
||||
report_factor: model_pearsonr
|
||||
optim_type: max
|
||||
|
||||
- `report_type`
|
||||
The type of the report, str type, determines which kind of report you want to use. If you want to use the backtest result type, you can choose `pred_long`, `pred_long_short`, `pred_short`, `sub_bench` and `sub_cost`. If you want to use the model result type, you can only choose `model`.
|
||||
|
||||
- `report_factor`
|
||||
The factor you want to use in the report, str type, determines which factor you want to optimize. If your `report_type` is backtest result type, you can choose `annual`, `sharpe`, `mdd`, `mean` and `std`. If your `report_type` is model result type, you can choose `model_score` and `model_pearsonr`.
|
||||
|
||||
- `optim_type`
|
||||
The optimization type, str type, determines what kind of optimization you want to do. You can maximize the factor, minimize it, or drive it toward 1, so you can choose `max`, `min` or `correlation` for this field.
|
||||
Note: `correlation` means the factor's best value is 1, such as `model_pearsonr` (a correlation coefficient).
|
||||
|
||||
If you want to process the factor or fetch other kinds of factors, you can override the `objective` method in your own tuner.
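To make the three fields concrete, the sketch below checks that a `report_type`, `report_factor` and `optim_type` combination is one of those listed above, and converts a factor value into a loss. It is an illustration under stated assumptions, not the `qlib` code: the names `check_criteria` and `to_loss` are hypothetical, and it assumes the search minimizes a loss, so `max` negates the value and `correlation` uses the distance to 1.

```python
BACKTEST_TYPES = {"pred_long", "pred_long_short", "pred_short", "sub_bench", "sub_cost"}
BACKTEST_FACTORS = {"annual", "sharpe", "mdd", "mean", "std"}
MODEL_FACTORS = {"model_score", "model_pearsonr"}


def check_criteria(report_type, report_factor, optim_type):
    """Raise ValueError if the three fields do not fit together."""
    if report_type == "model":
        allowed = MODEL_FACTORS
    elif report_type in BACKTEST_TYPES:
        allowed = BACKTEST_FACTORS
    else:
        raise ValueError(f"unknown report_type: {report_type}")
    if report_factor not in allowed:
        raise ValueError(f"{report_factor} is not valid for {report_type}")
    if optim_type not in {"max", "min", "correlation"}:
        raise ValueError(f"unknown optim_type: {optim_type}")


def to_loss(value, optim_type):
    """Turn a factor value into a loss, where smaller is better."""
    if optim_type == "min":
        return value
    if optim_type == "max":
        return -value
    return abs(value - 1)  # correlation: the best value is 1


check_criteria("model", "model_pearsonr", "correlation")
print(to_loss(0.9, "correlation"))
print(to_loss(1.5, "max"))
```

A custom `objective` could follow the same pattern when a different factor or transformation is needed.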
|
||||
|
||||
About the tuner pipeline
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The tuner pipeline contains different tuners, and the `tuner` program processes each tuner in the pipeline. Each tuner finds the optimal hyper-parameters for its specific combination of modules. The pipeline compares the results of the tuners and picks the best combination and its optimal hyper-parameters. So, you need to configure the pipeline and each tuner. Here is an example:
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
tuner_pipeline:
|
||||
-
|
||||
model:
|
||||
class: SomeModel
|
||||
space: SomeModelSpace
|
||||
trainer:
|
||||
class: RollingTrainer
|
||||
strategy:
|
||||
class: TopkAmountStrategy
|
||||
space: TopkAmountStrategySpace
|
||||
max_evals: 2
|
||||
|
||||
Each part represents a tuner and the modules it tunes. The space in each part is the hyper-parameter space of a certain module; you need to create your search space and modify it in `/qlib/contrib/tuner/space.py`. We use the `hyperopt` package to construct the space; for details on how to use it, see https://github.com/hyperopt/hyperopt/wiki/FMin .
|
||||
|
||||
- model
|
||||
You need to provide the `class` and the `space` of the model. If the model is the user's own implementation, you need to provide the `module_path`.
|
||||
|
||||
- trainer
|
||||
You need to provide the `class` of the trainer. If the trainer is the user's own implementation, you need to provide the `module_path`.
|
||||
|
||||
- strategy
|
||||
You need to provide the `class` and the `space` of the strategy. If the strategy is the user's own implementation, you need to provide the `module_path`.
|
||||
|
||||
- data_label
|
||||
The label of the data. You can search for the kinds of labels that lead to a better result. This part is optional, and you only need to provide `space`.
|
||||
|
||||
- max_evals
|
||||
The maximum number of function evaluations allowed in this tuner. The default value is 10.
|
||||
|
||||
If you don't want to search some modules, you can fix their spaces in `space.py`. No default module is provided.
|
||||
|
||||
About the time period
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You need to use the same dataset to evaluate the different `estimator` experiments in a `tuner` experiment, because two experiments that use different datasets are not comparable. You can specify `time_period` in the configuration file:
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
time_period:
|
||||
rolling_period: 360
|
||||
train_start_date: 2005-01-01
|
||||
train_end_date: 2014-12-31
|
||||
validate_start_date: 2015-01-01
|
||||
validate_end_date: 2016-06-30
|
||||
test_start_date: 2016-07-01
|
||||
test_end_date: 2018-04-30
|
||||
|
||||
- `rolling_period`
|
||||
The rolling period, integer type, indicates how many time steps to roll forward when rolling the data. The default value is `60`. This setting is used only with `RollingTrainer`; otherwise it is ignored.
|
||||
|
||||
- `train_start_date`
|
||||
Training start time, str type.
|
||||
|
||||
- `train_end_date`
|
||||
Training end time, str type.
|
||||
|
||||
- `validate_start_date`
|
||||
Validation start time, str type.
|
||||
|
||||
- `validate_end_date`
|
||||
Validation end time, str type.
|
||||
|
||||
- `test_start_date`
|
||||
Test start time, str type.
|
||||
|
||||
- `test_end_date`
|
||||
Test end time, str type. If `test_end_date` is `-1` or greater than the last date of the data, the last date of the data will be used as `test_end_date`.
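The rule for `test_end_date` can be written out as a small function. This is a sketch of the behavior described above, not the `qlib` implementation; the name `resolve_test_end_date` and the `last_data_date` argument are hypothetical.

```python
from datetime import date


def resolve_test_end_date(test_end_date, last_data_date):
    """Apply the rule above for `test_end_date`.

    `test_end_date` is an ISO date string or -1; `last_data_date` is the
    last date available in the data. Returns a `datetime.date`.
    """
    if str(test_end_date) == "-1":
        return last_data_date
    return min(date.fromisoformat(test_end_date), last_data_date)


last_date = date(2018, 4, 30)
print(resolve_test_end_date("2017-12-29", last_date))
print(resolve_test_end_date("2019-01-01", last_date))
print(resolve_test_end_date(-1, last_date))
```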
|
||||
|
||||
About the data and backtest
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
`data` and `backtest` are the same throughout the whole `tuner` experiment, because different `estimator` experiments must use the same data and backtest method. These two parts of the config are therefore the same as in the `estimator` configuration. You can find the precise definition of these parts in the `estimator` introduction. We only provide an example here.
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
data:
|
||||
class: ALPHA360
|
||||
provider_uri: /data/qlib
|
||||
args:
|
||||
start_date: 2005-01-01
|
||||
end_date: 2018-04-30
|
||||
dropna_label: True
|
||||
dropna_feature: True
|
||||
feature_label_config: /home/v-yindzh/v-yindzh/QLib/cfg/feature_config.yaml
|
||||
filter:
|
||||
market: csi500
|
||||
filter_pipeline:
|
||||
-
|
||||
class: NameDFilter
|
||||
module_path: qlib.data.filter
|
||||
args:
|
||||
name_rule_re: S(?!Z3)
|
||||
fstart_time: 2018-01-01
|
||||
fend_time: 2018-12-11
|
||||
-
|
||||
class: ExpressionDFilter
|
||||
module_path: qlib.data.filter
|
||||
args:
|
||||
rule_expression: $open/$factor<=45
|
||||
fstart_time: 2018-01-01
|
||||
fend_time: 2018-12-11
|
||||
backtest:
|
||||
normal_backtest_args:
|
||||
verbose: False
|
||||
limit_threshold: 0.095
|
||||
account: 500000
|
||||
benchmark: SH000905
|
||||
deal_price: vwap
|
||||
long_short_backtest_args:
|
||||
topk: 50
|
||||
|
||||
Experiment Result
|
||||
-----------------
|
||||
|
||||
All the results are stored in the experiment files, where you can check them directly.
The following are saved:
|
||||
|
||||
- Global optimal parameters
|
||||
- Local optimal parameters of each tuner
|
||||
- Config file of this `tuner` experiment
|
||||
- The result of every `estimator` experiment in the process
|
||||
|
||||
60
docs/index.rst
Normal file
@@ -0,0 +1,60 @@
|
||||
============================================================
|
||||
``Qlib`` Documentation
|
||||
============================================================
|
||||
|
||||
``Qlib`` is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment.
|
||||
|
||||
.. _user_guide:
|
||||
|
||||
Document Structure
|
||||
====================
|
||||
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
Home <self>
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
:caption: INTRODUCTION:
|
||||
|
||||
Qlib <introduction/introduction.rst>
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
:caption: GETTING STARTED:
|
||||
|
||||
Installation <start/installation.rst>
|
||||
Initialization <start/initialization.rst>
|
||||
Data Retrieval <start/getdata.rst>
|
||||
Custom Model Integration <start/integration.rst>
|
||||
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
:caption: COMPONENTS:
|
||||
|
||||
Estimator: Workflow Management <component/estimator.rst>
|
||||
Data Layer: Data Framework&Usage <component/data.rst>
|
||||
Interday Model: Model Training & Prediction <component/model.rst>
|
||||
Interday Strategy: Portfolio Management <component/strategy.rst>
|
||||
Intraday Trading: Model&Strategy Testing <component/backtest.rst>
|
||||
Analysis: Evaluation & Results Analysis <component/report.rst>
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
:caption: ADVANCED TOPICS:
|
||||
|
||||
Building Formulaic Alphas <advanced/alpha.rst>
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
:caption: REFERENCE:
|
||||
|
||||
API <reference/api.rst>
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
:caption: Change Log:
|
||||
|
||||
Change Log <changelog/changelog.rst>
|
||||
45
docs/introduction/introduction.rst
Normal file
@@ -0,0 +1,45 @@
|
||||
===============================
|
||||
``Qlib``: Quantitative Library
|
||||
===============================
|
||||
|
||||
Introduction
|
||||
===================
|
||||
|
||||
``Qlib`` is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment.
|
||||
|
||||
With ``Qlib``, users can easily apply their favorite model to create better Quant investment strategies.
|
||||
|
||||
|
||||
Framework
|
||||
==================
|
||||
|
||||
.. image:: ../_static/img/framework.png
|
||||
:alt: Framework
|
||||
|
||||
|
||||
At the module level, ``Qlib`` is a platform that consists of the above components. Each component is loosely coupled and can be used stand-alone.
|
||||
|
||||
======================  ========================================================================
Name                    Description
======================  ========================================================================
`Data layer`            `DataServer` focuses on providing high-performance infrastructure for
                        users to retrieve raw data. `DataEnhancement` preprocesses the data
                        and provides the best dataset to be fed into the models.

`Interday Model`        `Interday model` focuses on producing forecasting signals (aka `alpha`).
                        Models are trained by `Model Creator` and managed by `Model Manager`.
                        Users can choose one or multiple models for forecasting. Multiple
                        models can be combined with the `Ensemble` module.

`Interday Strategy`     `Portfolio Generator` takes forecasting signals as input and outputs
                        the orders, based on the current position, to achieve the target
                        portfolio.

`Intraday Trading`      `Order Executor` is responsible for executing the orders output by
                        `Interday Strategy` and returning the executed results.

`Analysis`              Users can get a detailed analysis report of the forecasting signal
                        and the portfolio in this part.
======================  ========================================================================
|
||||
|
||||
- The modules drawn in hand-drawn style are under development and will be released in the future.
|
||||
- The modules with a dashed border are highly user-customizable and extendible.
|
||||
117
docs/reference/api.rst
Normal file
@@ -0,0 +1,117 @@
|
||||
================================
|
||||
API Reference
|
||||
================================
|
||||
|
||||
|
||||
|
||||
Here you can find all ``Qlib`` interfaces.
|
||||
|
||||
|
||||
Data
|
||||
====================
|
||||
|
||||
Provider
|
||||
--------------------
|
||||
|
||||
.. automodule:: qlib.data.data
|
||||
:members:
|
||||
|
||||
Filter
|
||||
--------------------
|
||||
|
||||
.. automodule:: qlib.data.filter
|
||||
:members:
|
||||
|
||||
Feature
|
||||
--------------------
|
||||
|
||||
Class
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
.. automodule:: qlib.data.base
|
||||
:members:
|
||||
|
||||
Operator
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
.. automodule:: qlib.data.ops
|
||||
:members:
|
||||
|
||||
Cache
|
||||
----------------
|
||||
.. autoclass:: qlib.data.cache.MemCacheUnit
|
||||
:members:
|
||||
|
||||
.. autoclass:: qlib.data.cache.MemCache
|
||||
:members:
|
||||
|
||||
.. autoclass:: qlib.data.cache.ExpressionCache
|
||||
:members:
|
||||
|
||||
.. autoclass:: qlib.data.cache.DatasetCache
|
||||
:members:
|
||||
|
||||
.. autoclass:: qlib.data.cache.ServerExpressionCache
|
||||
:members:
|
||||
|
||||
.. autoclass:: qlib.data.cache.ServerDatasetCache
|
||||
:members:
|
||||
|
||||
|
||||
Contrib
|
||||
====================
|
||||
|
||||
|
||||
Data Handler
|
||||
---------------
|
||||
.. automodule:: qlib.contrib.estimator.handler
|
||||
:members:
|
||||
|
||||
Model
|
||||
--------------------
|
||||
.. automodule:: qlib.contrib.model.base
|
||||
:members:
|
||||
|
||||
Strategy
|
||||
-------------------
|
||||
|
||||
.. automodule:: qlib.contrib.strategy.strategy
|
||||
:members:
|
||||
|
||||
Evaluate
|
||||
-----------------
|
||||
|
||||
.. automodule:: qlib.contrib.evaluate
|
||||
:members:
|
||||
|
||||
|
||||
Report
|
||||
-----------------
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.report
|
||||
:members:
|
||||
|
||||
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.score_ic
|
||||
:members:
|
||||
|
||||
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.cumulative_return
|
||||
:members:
|
||||
|
||||
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.risk_analysis
|
||||
:members:
|
||||
|
||||
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_position.rank_label
|
||||
:members:
|
||||
|
||||
|
||||
|
||||
.. automodule:: qlib.contrib.report.analysis_model.analysis_model_performance
|
||||
:members:
|
||||
|
||||
|
||||
1
docs/requirements.txt
Normal file
@@ -0,0 +1 @@
|
||||
Cython==0.29.21
|
||||
137
docs/start/getdata.rst
Normal file
@@ -0,0 +1,137 @@
|
||||
.. _getdata:
|
||||
=============================
|
||||
Data Retrieval
|
||||
=============================
|
||||
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
====================
|
||||
|
||||
Users can get stock data with ``Qlib``. The following examples demonstrate the basic user interface.
|
||||
|
||||
Examples
|
||||
====================
|
||||
|
||||
|
||||
``Qlib`` Initialization:
|
||||
|
||||
.. note:: In order to get the data, users need to initialize ``Qlib`` with `qlib.init` first. Please refer to `initialization <initialization.rst>`_.
|
||||
|
||||
It is recommended to use the following code to initialize qlib:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.init(provider_uri='~/.qlib/qlib_data/cn_data')
|
||||
|
||||
|
||||
Load trading calendar with the given time range and frequency:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> from qlib.data import D
|
||||
>>> D.calendar(start_time='2010-01-01', end_time='2017-12-31', freq='day')[:2]
|
||||
[Timestamp('2010-01-04 00:00:00'), Timestamp('2010-01-05 00:00:00')]
|
||||
|
||||
Parse a given market name into a stockpool config:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> from qlib.data import D
|
||||
>>> D.instruments(market='all')
|
||||
{'market': 'all', 'filter_pipe': []}
|
||||
|
||||
Load instruments of certain stockpool in the given time range:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> from qlib.data import D
|
||||
>>> instruments = D.instruments(market='csi300')
|
||||
>>> D.list_instruments(instruments=instruments, start_time='2010-01-01', end_time='2017-12-31', as_list=True)[:6]
|
||||
|
||||
|
||||
Load dynamic instruments from a base market according to a name filter:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> from qlib.data import D
|
||||
>>> from qlib.data.filter import NameDFilter
|
||||
>>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
|
||||
>>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter])
|
||||
>>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
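To see which instrument codes a rule such as `SH[0-9]{4}55` selects, the regular expression can be tried on its own. The sketch below uses only Python's `re` module; the sample codes are illustrative, and matching from the start of the name with `re.match` is an assumption about how the filter applies the rule.

```python
import re

# Same pattern as the name_rule_re argument in the example above.
name_rule = re.compile(r"SH[0-9]{4}55")

# Illustrative instrument codes, not taken from any real stockpool.
sample_codes = ["SH600655", "SH600000", "SZ000055", "SH601155"]
selected = [code for code in sample_codes if name_rule.match(code)]
print(selected)
```

Only codes that start with `SH`, continue with four digits, and then `55` pass the rule.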
|
||||
|
||||
Load dynamic instruments from a base market according to an expression filter:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> from qlib.data import D
|
||||
>>> from qlib.data.filter import ExpressionDFilter
|
||||
>>> expressionDFilter = ExpressionDFilter(rule_expression='$close>100')
|
||||
>>> instruments = D.instruments(market='csi300', filter_pipe=[expressionDFilter])
|
||||
>>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
|
||||
|
||||
To know more about how to use the filter or how to build one's own filter, go to API Reference: `filter API <../reference/api.html#filter>`_
|
||||
|
||||
Load features of certain instruments in given time range:
|
||||
|
||||
.. note:: This is not a recommended way to get features.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> from qlib.data import D
|
||||
>>> instruments = ['SH600000']
|
||||
>>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
|
||||
>>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
|
||||
$close $volume Ref($close,1) Mean($close,3) \
|
||||
instrument datetime
|
||||
SH600000 2010-01-04 81.809998 17144536.0 NaN 81.809998
|
||||
2010-01-05 82.419998 29827816.0 81.809998 82.114998
|
||||
2010-01-06 80.800003 25070040.0 82.419998 81.676666
|
||||
2010-01-07 78.989998 22077858.0 80.800003 80.736666
|
||||
2010-01-08 79.879997 17019168.0 78.989998 79.889999
|
||||
|
||||
Sub($high,$low)
|
||||
instrument datetime
|
||||
SH600000 2010-01-04 2.741158
|
||||
2010-01-05 3.049736
|
||||
2010-01-06 1.621399
|
||||
2010-01-07 2.856926
|
||||
2010-01-08 1.930397
|
||||
|
||||
|
||||
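The `Ref($close, 1)` and `Mean($close, 3)` columns in the output above can be reproduced by hand from the `$close` column. The following plain-Python sketch is inferred from that output for positive window sizes only; the helper names `ref` and `mean` are hypothetical, and this is not how ``Qlib`` evaluates expressions internally.

```python
# $close values for SH600000 copied from the output above.
close = [81.809998, 82.419998, 80.800003, 78.989998, 79.879997]


def ref(values, n):
    """Value from n rows earlier; None where no earlier row exists."""
    return [values[i - n] if i >= n else None for i in range(len(values))]


def mean(values, n):
    """Mean over the current row and up to n - 1 earlier rows."""
    result = []
    for i in range(len(values)):
        window = values[max(0, i - n + 1): i + 1]
        result.append(round(sum(window) / len(window), 6))
    return result


print(ref(close, 1))
print(mean(close, 3))
```

The first rows use a shorter window because fewer than three earlier values exist, which is why the first `Mean($close, 3)` value equals `$close` itself.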
Load features of certain stockpool in given time range:
|
||||
|
||||
.. note:: Since the server needs to cache all-time data for the requested stockpool and fields, the first request may take longer to process. From the second request on, the response is almost immediate, even if you change the timespan.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> from qlib.data import D
|
||||
>>> from qlib.data.filter import NameDFilter, ExpressionDFilter
|
||||
>>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
|
||||
>>> expressionDFilter = ExpressionDFilter(rule_expression='($close/$factor)>100')
|
||||
>>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter, expressionDFilter])
|
||||
>>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
|
||||
>>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
|
||||
|
||||
$close $volume Ref($close, 1) \
|
||||
instrument datetime
|
||||
SH600655 2015-06-15 4342.160156 258706.359375 4530.459961
|
||||
2015-06-16 4409.270020 257349.718750 4342.160156
|
||||
2015-06-17 4312.330078 235214.890625 4409.270020
|
||||
2015-06-18 4086.729980 196772.859375 4312.330078
|
||||
2015-06-19 3678.250000 182916.453125 4086.729980
|
||||
Mean($close, 3) $high-$low
|
||||
instrument datetime
|
||||
SH600655 2015-06-15 4480.743327 285.251465
|
||||
2015-06-16 4427.296712 298.301270
|
||||
2015-06-17 4354.586751 356.098145
|
||||
2015-06-18 4269.443359 363.554932
|
||||
2015-06-19 4025.770020 368.954346
|
||||
|
||||
|
||||
.. note:: When calling D.features() on the client, use the parameter 'disk_cache=0' to skip the dataset cache, or 'disk_cache=1' to generate and use the dataset cache. In addition, when calling on the server, you can use 'disk_cache=2' to update the dataset cache.
|
||||
|
||||
API
|
||||
====================
|
||||
To know more about how to use the Data, go to API Reference: `Data API <../reference/api.html#Data>`_
|
||||
60
docs/start/initialization.rst
Normal file
@@ -0,0 +1,60 @@
|
||||
.. _initialization:
|
||||
====================
|
||||
Qlib Initialization
|
||||
====================
|
||||
|
||||
.. currentmodule:: qlib
|
||||
|
||||
|
||||
Initialization
|
||||
=========================
|
||||
|
||||
Please execute the following process to initialize ``Qlib``.
|
||||
|
||||
- Download and prepare the Data: execute the following command to download the stock data.
|
||||
.. code-block:: bash
|
||||
|
||||
python scripts/get_data.py qlib_data_cn --target_dir ~/.qlib/qlib_data/cn_data
|
||||
|
||||
To learn more about how to use ``get_data.py``, refer to `Raw Data <../advanced/data.html#raw-data>`_.
|
||||
|
||||
|
||||
- Run the initialization code: run the following code in python:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
import qlib
|
||||
# region in [REG_CN, REG_US]
|
||||
from qlib.config import REG_CN
|
||||
provider_uri = "~/.qlib/qlib_data/cn_data" # target_dir
|
||||
qlib.init(provider_uri=provider_uri, region=REG_CN)
|
||||
|
||||
|
||||
|
||||
Parameters
|
||||
-------------------
|
||||
|
||||
Besides `provider_uri` and `region`, `qlib.init` accepts other parameters. The full list of `qlib.init` parameters is:
|
||||
|
||||
- `provider_uri`
|
||||
Type: str. The local directory where the data downloaded by ``get_data.py`` is stored.
|
||||
- `region`
|
||||
Type: str, optional parameter(default: ``qlib.config.REG_CN``).
|
||||
Currently, ``qlib.config.REG_US`` ('us') and ``qlib.config.REG_CN`` ('cn') are supported. Different values of ``region`` will
|
||||
result in different stock market modes.
|
||||
|
||||
- ``qlib.config.REG_US``: US stock market.
|
||||
- ``qlib.config.REG_CN``: China stock market.
|
||||
- `redis_host`
|
||||
Type: str, optional parameter(default: "127.0.0.1"), host of `redis`
|
||||
The lock and cache mechanism relies on redis.
|
||||
- `redis_port`
|
||||
Type: int, optional parameter(default: 6379), port of `redis`
|
||||
|
||||
.. note::
|
||||
|
||||
The value of `region` should be aligned with the data stored in `provider_uri`. Currently, ``scripts/get_data.py`` only provides China stock market data. If users want to use the US stock market data, they should prepare their own US-stock data in `provider_uri` and switch to US-stock mode.
|
||||
|
||||
.. note::
|
||||
|
||||
If the connection to redis at `redis_host` and `redis_port` fails, the cache will not be used. For details, please refer to `Cache <../advanced/cache.rst>`_.
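Since a failed redis connection silently disables the cache, a quick reachability check before initialization may be useful. The sketch below only tests whether a TCP connection can be opened; ``redis_reachable`` is a hypothetical helper, and a successful connection does not prove that the service on that port is actually redis.

```python
import socket


def redis_reachable(host="127.0.0.1", port=6379, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if not redis_reachable():
    print("redis is not reachable on 127.0.0.1:6379; the cache would not be used")
```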
|
||||
43
docs/start/installation.rst
Normal file
@@ -0,0 +1,43 @@
|
||||
.. _installation:
|
||||
====================
|
||||
Installation
|
||||
====================
|
||||
|
||||
.. currentmodule:: qlib
|
||||
|
||||
|
||||
How to Install ``Qlib``
|
||||
=========================
|
||||
|
||||
``Qlib`` supports Python 3 only, up to Python 3.8.
|
||||
|
||||
Follow these steps to install ``Qlib``:
|
||||
|
||||
- Change the directory to ``Qlib``, in which the file ``setup.py`` exists.
|
||||
- Then, please execute the following command:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
$ pip install numpy
|
||||
$ pip install --upgrade cython
|
||||
$ python setup.py install
|
||||
|
||||
|
||||
.. note::
|
||||
It's recommended to use anaconda/miniconda to set up the environment.
|
||||
``Qlib`` needs the lightgbm and tensorflow packages; use pip to install them.
|
||||
|
||||
.. note::
|
||||
Do not import qlib in the repository folder which contains ``qlib``, otherwise errors may occur.
|
||||
|
||||
|
||||
|
||||
Use the following code to confirm that the installation was successful:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.__version__
|
||||
<LATEST VERSION>
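The two constraints mentioned on this page (the supported Python range, and not importing from a folder that contains ``qlib``) can also be checked programmatically. The following is a rough sketch with hypothetical helper names; none of these functions are provided by ``Qlib``.

```python
import importlib.util
import sys
from pathlib import Path


def python_supported(version=sys.version_info):
    """True for Python 3.x up to and including 3.8, the range stated above."""
    major, minor = version[0], version[1]
    return major == 3 and minor <= 8


def shadowed_by_cwd(package="qlib", cwd=None):
    """True if the working directory holds a folder or file named like the package.

    In that case `import qlib` may pick up the local source tree instead of
    the installed copy.
    """
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    return (cwd / package).is_dir() or (cwd / f"{package}.py").is_file()


def is_installed(package="qlib"):
    """True if the import system can locate the top-level package."""
    return importlib.util.find_spec(package) is not None


print("Python version supported:", python_supported())
print("qlib shadowed by current directory:", shadowed_by_cwd())
```

Running ``shadowed_by_cwd()`` before ``is_installed()`` matters, because a local ``qlib`` folder would also make the package appear importable.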
|
||||
|
||||
|
||||
146
docs/start/integration.rst
Normal file
@@ -0,0 +1,146 @@
|
||||
=========================================
|
||||
Custom Model Integration
|
||||
=========================================
|
||||
|
||||
Introduction
|
||||
===================
|
||||
|
||||
``Qlib`` provides ``lightGBM`` and ``Dnn`` models as the baselines of ``Interday Model``. In addition to these default models, users can integrate their own custom models into ``Qlib``.
|
||||
|
||||
Users can integrate their own custom models according to the following steps.
|
||||
|
||||
- Define a custom model class, which should be a subclass of the `qlib.contrib.model.base.Model <../reference/api.html#module-qlib.contrib.model.base>`_
|
||||
- Write a configuration file that describes the path and parameters of the custom model
|
||||
- Test the custom model
|
||||
|
||||
Custom Model Class
|
||||
===========================
|
||||
Custom models need to inherit from `qlib.contrib.model.base.Model <../reference/api.html#module-qlib.contrib.model.base>`_ and override its methods.
|
||||
|
||||
- Override the `__init__` method
|
||||
- ``Qlib`` passes the initialized parameters to the `__init__` method.
|
||||
- The parameters must be consistent with the hyperparameters in the configuration file.
|
||||
- Code Example: In the following example, the hyperparameter field of the configuration file should contain parameters such as ``loss: mse``.
|
||||
.. code-block:: Python
|
||||
|
||||
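    from sklearn.metrics import mean_squared_error, roc_auc_score

    def __init__(self, loss='mse', **kwargs):
        # Only regression ('mse') and binary classification ('binary') are supported.
        if loss not in {'mse', 'binary'}:
            raise NotImplementedError
        # Pick the metric that `score` reports for the chosen loss.
        self._scorer = mean_squared_error if loss == 'mse' else roc_auc_score
        # `self._params` is assumed to be a dict already defined on the class.
        self._params.update(objective=loss, **kwargs)
        self._model = None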
|
||||
|
||||
- Override the `fit` method
|
||||
- ``Qlib`` calls the fit method to train the model
|
||||
- The parameters must include at least the training features ``x_train``, the training labels ``y_train``, the validation features ``x_valid`` and the validation labels ``y_valid``.
|
||||
- The parameters may also include optional parameters with default values, such as the training weights ``w_train``, the validation weights ``w_valid`` and ``num_boost_round = 1000``.
|
||||
- Code Example: In the following example, 'num_boost_round = 1000' is an optional parameter.
|
||||
.. code-block:: Python
|
||||
|
||||
    import lightgbm as lgb
    import numpy as np
    import pandas as pd

    def fit(self, x_train: pd.DataFrame, y_train: pd.DataFrame, x_valid: pd.DataFrame, y_valid: pd.DataFrame,
            w_train: pd.DataFrame = None, w_valid: pd.DataFrame = None, num_boost_round=1000, **kwargs):

        # LightGBM needs a 1D array as its label
        if y_train.values.ndim == 2 and y_train.values.shape[1] == 1:
            y_train_1d, y_valid_1d = np.squeeze(y_train.values), np.squeeze(y_valid.values)
        else:
            raise ValueError('LightGBM doesn\'t support multi-label training')

        w_train_weight = None if w_train is None else w_train.values
        w_valid_weight = None if w_valid is None else w_valid.values

        dtrain = lgb.Dataset(x_train.values, label=y_train_1d, weight=w_train_weight)
        dvalid = lgb.Dataset(x_valid.values, label=y_valid_1d, weight=w_valid_weight)
        self._model = lgb.train(
            self._params,
            dtrain,
            num_boost_round=num_boost_round,
            valid_sets=[dtrain, dvalid],
            valid_names=['train', 'valid'],
            **kwargs
        )
|
||||
|
||||
- Override the `predict` method
|
||||
- The parameters include the test features
|
||||
- Return the prediction score
|
||||
- Please refer to `qlib.contrib.model.base.Model <../reference/api.html#module-qlib.contrib.model.base>`_ for the parameter types of the `predict` method
|
||||
- Code Example: In the following example, the user needs to use the trained model to predict the labels (such as ``preds``) of the test data ``x_test`` and return them.
|
||||
.. code-block:: Python
|
||||
|
||||
    def predict(self, x_test: pd.DataFrame, **kwargs) -> np.ndarray:
        if self._model is None:
            raise ValueError('model is not fitted yet!')
        return self._model.predict(x_test.values)
|
||||
|
||||
- Override the `score` method
|
||||
- The parameters include the test features and test labels
|
||||
- Return the evaluation score of the model. It's recommended to use the loss between the labels and the prediction scores.
|
||||
- Code Example: In the following example, the user needs to calculate the weighted loss from the test data ``x_test``, the test labels ``y_test`` and the weights ``w_test``.
|
||||
.. code-block:: Python
|
||||
|
||||
    def score(self, x_test: pd.DataFrame, y_test: pd.DataFrame, w_test: pd.DataFrame = None) -> float:
        # Remove the rows of x, y and w whose label in y_test contains NaN in any column.
        x_test, y_test, w_test = drop_nan_by_y_index(x_test, y_test, w_test)
        preds = self.predict(x_test)
        w_test_weight = None if w_test is None else w_test.values
        # Reuse the metric chosen in __init__ (mean_squared_error or roc_auc_score).
        scorer = self._scorer
        return scorer(y_test.values, preds, sample_weight=w_test_weight)
|
||||
|
||||
- Override the `save` method & `load` method
|
||||
- The `save` method takes a `filename` parameter that represents an absolute path; the user needs to save the model to this path.
|
||||
- The `load` method takes a `buffer` parameter, which is read from the `filename` passed to the `save` method; the user needs to load the model from this `buffer`.
|
||||
- Code Example:
|
||||
.. code-block:: Python
|
||||
|
||||
    def save(self, filename):
        if self._model is None:
            raise ValueError('model is not fitted yet!')
        self._model.save_model(filename)

    def load(self, buffer):
        # `buffer` holds the bytes of the file written by `save`.
        self._model = lgb.Booster(model_str=buffer.decode('utf-8'))
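The `score` example above depends on two helpers that are not shown on this page: ``drop_nan_by_y_index`` and scikit-learn's ``mean_squared_error``. To make the computation concrete, here is a plain-Python sketch of a weighted mean squared error that ignores rows with a NaN label. ``weighted_mse`` is an illustrative name and operates on simple lists rather than DataFrames; it skips the NaN rows after prediction instead of before, which gives the same result.

```python
import math


def weighted_mse(y_true, y_pred, weights=None):
    """Weighted mean squared error over the rows whose label is not NaN."""
    if weights is None:
        weights = [1.0] * len(y_true)
    # Keep only rows with a valid label, mirroring the NaN-dropping step.
    rows = [(y, p, w) for y, p, w in zip(y_true, y_pred, weights) if not math.isnan(y)]
    if not rows:
        raise ValueError("no rows with a valid label")
    total_weight = sum(w for _, _, w in rows)
    return sum(w * (y - p) ** 2 for y, p, w in rows) / total_weight


labels = [1.0, 2.0, float("nan")]
preds = [1.0, 4.0, 0.0]
print(weighted_mse(labels, preds))                     # unweighted
print(weighted_mse(labels, preds, weights=[3, 1, 5]))  # first row counts three times
```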
|
||||
|
||||
|
||||
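Note the asymmetry in the `save` and `load` contract described above: `save` receives a path, while `load` receives the bytes that were read back from that path. The round trip can be sketched with a toy model as follows. ``TinyModel``, ``fit_constant`` and ``round_trip`` are hypothetical names, and ``pickle`` stands in for whatever serialization the real model uses.

```python
import os
import pickle
import tempfile


class TinyModel:
    """Hypothetical stand-in for a custom model; it only stores a constant."""

    def __init__(self):
        self._model = None

    def fit_constant(self, value):
        self._model = {"constant": value}

    def save(self, filename):
        if self._model is None:
            raise ValueError("model is not fitted yet!")
        with open(filename, "wb") as f:
            pickle.dump(self._model, f)

    def load(self, buffer):
        # `buffer` is the raw content of the saved file, not a path.
        self._model = pickle.loads(buffer)


def round_trip(value):
    """Save a fitted model to a file, then restore it from the file's bytes."""
    trained = TinyModel()
    trained.fit_constant(value)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.bin")
        trained.save(path)
        with open(path, "rb") as f:
            buffer = f.read()
    restored = TinyModel()
    restored.load(buffer)
    return restored._model


print(round_trip(0.5))
```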
Configuration File
|
||||
=======================
|
||||
|
||||
The configuration file is described in detail in the `estimator <../advanced/estimator.html#Example>`_ document. In order to integrate the custom model into ``Qlib``, you need to modify the "model" field in the configuration file.
|
||||
|
||||
- Example: The following example describes the ``model`` field of the configuration file for the custom LightGBM model mentioned above, where ``module_path`` is the module path, ``class`` is the class name, and ``args`` holds the hyperparameters passed into the `__init__` method. All parameters in this field except ``loss: mse`` are passed to ``self._params`` through ``**kwargs`` in `__init__`.
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
    model:
        class: LGBModel
        module_path: qlib.contrib.model.gbdt
        args:
            loss: mse
            colsample_bytree: 0.8879
            learning_rate: 0.0421
            subsample: 0.8789
            lambda_l1: 205.6999
            lambda_l2: 580.9768
            max_depth: 8
            num_leaves: 210
            num_threads: 20
|
||||
|
||||
Users can find the configuration files of the baseline ``Model`` in ``qlib/examples/estimator/estimator_config.yaml`` and ``qlib/examples/estimator/estimator_config_dnn.yaml``.
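To illustrate how the ``class``, ``module_path`` and ``args`` entries fit together, the sketch below turns such a ``model`` mapping into an object with ``importlib``. This is an assumption about the general mechanism, not the actual ``estimator`` implementation; ``init_from_config`` is a hypothetical helper, and the standard-library ``fractions.Fraction`` stands in for a model class so that the example runs without ``Qlib``.

```python
import importlib


def init_from_config(model_config):
    """Import `module_path`, look up `class`, and call it with `args` as keywords."""
    module = importlib.import_module(model_config["module_path"])
    cls = getattr(module, model_config["class"])
    return cls(**model_config.get("args", {}))


# Stand-in configuration: same three keys as the YAML `model` field.
demo_config = {
    "class": "Fraction",
    "module_path": "fractions",
    "args": {"numerator": 3, "denominator": 4},
}
demo = init_from_config(demo_config)
print(demo)
```

Because ``args`` is expanded into keyword arguments, every key in it must be accepted by the class's `__init__`, either by name (such as ``loss``) or through ``**kwargs``.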
|
||||
|
||||
Model Testing
|
||||
=====================
|
||||
Assuming that the configuration file is ``examples/estimator/estimator_config.yaml``, users can run the following command to test the custom model:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
cd examples  # Avoid running the program from the directory that contains `qlib`
|
||||
estimator -c estimator/estimator_config.yaml
|
||||
|
||||
.. note:: ``estimator`` is a built-in command of ``Qlib``.
|
||||
|
||||
``Model`` can also be tested as a single module. An example is given in ``examples.estimator.train_backtest_analyze.ipynb``.
|
||||
|
||||
|
||||
Reference
|
||||
=====================
|
||||
|
||||
To learn more about ``Model``, please refer to `Interday Model: Model Training & Prediction <../advanced/model.rst>`_ and `Model API <../reference/api.html#module-qlib.contrib.model.base>`_.
|
||||