mirror of
https://github.com/microsoft/qlib.git
synced 2026-07-04 11:30:57 +08:00
init commit
171
docs/hidden/client.rst
Normal file
@@ -0,0 +1,171 @@
|
||||
.. _client:
|
||||
|
||||
Qlib Client-Server Framework
|
||||
============================
|
||||
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
------------
|
||||
The client-server framework is designed to solve the following problems:
|
||||
|
||||
- Manage the data in a centralized way. Users don't have to manage data of different versions.
|
||||
- Reduce the amount of cache to be generated.
|
||||
- Make the data accessible remotely.
|
||||
|
||||
Therefore, we designed the client-server framework: we maintain a server and provide the data through it.
|
||||
|
||||
You have to initialize qlib with a specific configuration to use the client-server framework.
|
||||
Here is a typical initialization process.
|
||||
|
||||
The commonly used parameters of qlib ``init`` are listed below. ``nfs-common`` must be installed on the machine where the client runs; to install it, execute ``sudo apt install nfs-common``.
|
||||
- ``provider_uri``: NFS server path in the format ``host:data_dir``, for example ``172.23.233.89:/data2/gaochao/sync_qlib/qlib``. For offline use, it can be a local data directory
|
||||
- ``mount_path``: local data directory, ``provider_uri`` will be mounted to this directory
|
||||
- ``auto_mount``: whether to automatically mount ``provider_uri`` to ``mount_path`` during qlib ``init``. You can also mount it manually with ``sudo mount.nfs <provider_uri> <mount_path>``. On PAI, it is recommended to set ``auto_mount=True``
|
||||
- ``flask_server``: data service host; if you are on the intranet, you can use the default host: 172.23.233.89
|
||||
- ``flask_port``: data service port
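The ``host:data_dir`` format and the manual mount command described in the parameter list can be illustrated with a small sketch. The helper names ``split_provider_uri`` and ``manual_mount_command`` are hypothetical and are not part of qlib; the sketch only builds the command string and does not run it.

```python
import shlex


def split_provider_uri(provider_uri):
    """Split ``host:data_dir`` into (host, data_dir).

    A plain local directory (offline use) yields (None, provider_uri).
    Hypothetical helper for illustration only, not a qlib function.
    """
    host, sep, data_dir = provider_uri.partition(":")
    # Treat the value as an NFS path only if a host and an absolute
    # remote directory are both present.
    if sep and host and data_dir.startswith("/"):
        return host, data_dir
    return None, provider_uri


def manual_mount_command(provider_uri, mount_path):
    """Build the manual mount command mentioned for ``auto_mount``."""
    return "sudo mount.nfs {} {}".format(
        shlex.quote(provider_uri), shlex.quote(mount_path)
    )


if __name__ == "__main__":
    uri = "172.23.233.89:/data2/gaochao/sync_qlib/qlib"
    print(split_provider_uri(uri))
    print(manual_mount_command(uri, "/data/csdesign/qlib"))
```

Keeping the parsing separate from the mount step makes it easy to tell an NFS path from a local offline directory before deciding whether mounting is needed at all.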
|
||||
|
||||
|
||||
If running on 10.150.144.153 or 10.150.144.154 server, it's recommended to use the following code to ``init`` qlib:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.init(auto_mount=False, mount_path='/data/csdesign/qlib')
|
||||
>>> from qlib.data import D
|
||||
>>> D.features(['SH600000'], ['$close'], start_time='20080101', end_time='20090101').head()
|
||||
[39336:MainThread](2019-05-28 21:35:42,800) INFO - Initialization - [__init__.py:16] - default_conf: client.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:56] - provider_uri=172.23.233.89:/data2/gaochao/sync_qlib/qlib
|
||||
[39336:Thread-68](2019-05-28 21:35:42,809) INFO - Client - [client.py:28] - Connect to server ws://172.23.233.89:9710
|
||||
[39336:Thread-72](2019-05-28 21:35:43,489) INFO - Client - [client.py:31] - Disconnect from server!
|
||||
Opening /data/csdesign/qlib/cache/d239a3b191daa9a5b1b19a59beb47b33 in read-only mode
|
||||
Out[5]:
|
||||
$close
|
||||
instrument datetime
|
||||
SH600000 2008-01-02 119.079704
|
||||
2008-01-03 113.120125
|
||||
2008-01-04 117.878860
|
||||
2008-01-07 124.505539
|
||||
2008-01-08 125.395004
|
||||
|
||||
|
||||
If running on PAI, it's recommended to use the following code to ``init`` qlib:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.init(auto_mount=True, mount_path='/data/csdesign/qlib', provider_uri='172.23.233.89:/data2/gaochao/sync_qlib/qlib')
|
||||
>>> from qlib.data import D
|
||||
>>> D.features(['SH600000'], ['$close'], start_time='20080101', end_time='20090101').head()
|
||||
[39336:MainThread](2019-05-28 21:35:42,800) INFO - Initialization - [__init__.py:16] - default_conf: client.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:56] - provider_uri=172.23.233.89:/data2/gaochao/sync_qlib/qlib
|
||||
[39336:Thread-68](2019-05-28 21:35:42,809) INFO - Client - [client.py:28] - Connect to server ws://172.23.233.89:9710
|
||||
[39336:Thread-72](2019-05-28 21:35:43,489) INFO - Client - [client.py:31] - Disconnect from server!
|
||||
Opening /data/csdesign/qlib/cache/d239a3b191daa9a5b1b19a59beb47b33 in read-only mode
|
||||
Out[5]:
|
||||
$close
|
||||
instrument datetime
|
||||
SH600000 2008-01-02 119.079704
|
||||
2008-01-03 113.120125
|
||||
2008-01-04 117.878860
|
||||
2008-01-07 124.505539
|
||||
2008-01-08 125.395004
|
||||
|
||||
|
||||
If running on Windows, enable the **NFS** features and set a correct **mount_path** (steps 1 and 2 below), then ``init`` qlib with the code that follows:

1. Enable the NFS features in Windows

   * Open ``Programs and Features``.
   * Click ``Turn Windows features on or off``.
   * Scroll down, check the option ``Services for NFS``, then click OK.

   Reference: https://graspingtech.com/mount-nfs-share-windows-10/

2. Configure a correct ``mount_path``

   * On Windows, the mount path must be a root (drive) path that does not exist yet.
   * Correctly formatted paths, e.g.: `H`, `i`...
   * Incorrectly formatted paths, e.g.: `C`, `C:/user/name`, `qlib_data`...
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.init(auto_mount=True, mount_path='H', provider_uri='172.23.233.89:/data2/gaochao/sync_qlib/qlib')
|
||||
>>> from qlib.data import D
|
||||
>>> D.features(['SH600000'], ['$close'], start_time='20080101', end_time='20090101').head()
|
||||
[39336:MainThread](2019-05-28 21:35:42,800) INFO - Initialization - [__init__.py:16] - default_conf: client.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
|
||||
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:56] - provider_uri=172.23.233.89:/data2/gaochao/sync_qlib/qlib
|
||||
[39336:Thread-68](2019-05-28 21:35:42,809) INFO - Client - [client.py:28] - Connect to server ws://172.23.233.89:9710
|
||||
[39336:Thread-72](2019-05-28 21:35:43,489) INFO - Client - [client.py:31] - Disconnect from server!
|
||||
Opening /data/csdesign/qlib/cache/d239a3b191daa9a5b1b19a59beb47b33 in read-only mode
|
||||
Out[5]:
|
||||
$close
|
||||
instrument datetime
|
||||
SH600000 2008-01-02 119.079704
|
||||
2008-01-03 113.120125
|
||||
2008-01-04 117.878860
|
||||
2008-01-07 124.505539
|
||||
2008-01-08 125.395004
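The Windows ``mount_path`` rule from step 2 (a drive root that does not exist yet) can be sketched as a small check. The function name ``is_valid_windows_mount_path`` and the ``drives_in_use`` argument are assumptions made for illustration; qlib may validate the path differently.

```python
import string


def is_valid_windows_mount_path(mount_path, drives_in_use):
    """Return True if mount_path is a single, unused drive letter.

    Hypothetical check for illustration: the caller passes the drive
    letters that already exist (for example ["C"]).
    """
    # Must be exactly one ASCII letter such as "H" or "i".
    if len(mount_path) != 1 or mount_path not in string.ascii_letters:
        return False
    in_use = {drive.upper() for drive in drives_in_use}
    return mount_path.upper() not in in_use


if __name__ == "__main__":
    for candidate in ["H", "i", "C", "C:/user/name", "qlib_data"]:
        print(candidate, is_valid_windows_mount_path(candidate, ["C"]))
```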
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
The client mounts the data in `provider_uri` on `mount_path`. The server and client then communicate through flask and transfer data over this NFS mount.
|
||||
|
||||
|
||||
If you have local qlib data files and want to use the data offline instead of online through the client-server framework, this is also possible with a specific configuration.
You can create a configuration file such as `client_config_local.yml`:
|
||||
|
||||
.. code-block:: YAML

    provider_uri: /data/csdesign/qlib
    calendar_provider: 'LocalCalendarProvider'
    instrument_provider: 'LocalInstrumentProvider'
    feature_provider: 'LocalFeatureProvider'
    expression_provider: 'LocalExpressionProvider'
    dataset_provider: 'LocalDatasetProvider'
    provider: 'LocalProvider'
    dataset_cache: 'SimpleDatasetCache'
    local_cache_path: '~/.cache/qlib/'
|
||||
|
||||
`provider_uri` is the directory of your local data.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
>>> import qlib
|
||||
>>> qlib.init_from_yaml_conf('client_config_local.yml')
|
||||
>>> from qlib.data import D
|
||||
>>> D.features(['SH600001'], ['$close'], start_time='20180101', end_time='20190101').head()
|
||||
[21232:MainThread](2019-05-29 10:16:05,066) INFO - Initialization - [__init__.py:16] - default_conf: client.
|
||||
[21232:MainThread](2019-05-29 10:16:05,066) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
|
||||
[21232:MainThread](2019-05-29 10:16:05,067) INFO - Initialization - [__init__.py:56] - provider_uri=/data/csdesign/qlib
|
||||
Out[9]:
|
||||
$close
|
||||
instrument datetime
|
||||
SH600001 2008-01-02 21.082111
|
||||
2008-01-03 23.195362
|
||||
2008-01-04 23.874615
|
||||
2008-01-07 24.880930
|
||||
2008-01-08 24.277143
|
||||
|
||||
Limitations
|
||||
-----------
|
||||
1. The following APIs under the client-server module may not be as fast as the older offline APIs:
|
||||
- Cal.calendar
|
||||
- Inst.list_instruments
|
||||
2. Rolling operation expressions with parameter `0` cannot be updated correctly under the mechanism of the client-server framework.
|
||||
|
||||
API
|
||||
********************
|
||||
|
||||
The client is based on `python-socketio <https://python-socketio.readthedocs.io>`_, a framework that provides a WebSocket client for Python. The client only sends requests and receives results; it does not perform any computation itself.
|
||||
|
||||
Class
|
||||
--------------------
|
||||
|
||||
.. automodule:: qlib.data.client
|
||||
|
||||
|
||||
285
docs/hidden/online.rst
Normal file
@@ -0,0 +1,285 @@
|
||||
.. _online:
|
||||
|
||||
Online
|
||||
===================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
-------------------
|
||||
|
||||
Welcome to Online. This module simulates what real trading with your model and strategy would look like.

As with Estimator and other modules in Qlib, you set the parameters through a configuration file.
In this module, you also need to add an account in a folder to run the simulation. On each following trading day,
the module uses the newest information to trade for your account,
and the performance can be viewed at any time through the API we define.
|
||||
|
||||
Each account goes through the following processes, where 'pred_date' is the date on which you predict the target
positions after trading, and 'trade_date' is the date on which you do the trading.

- Generate the order list (pred_date)
|
||||
- Execute the order list (trade_date)
|
||||
- Update account (trade_date)
|
||||
|
||||
Alternatively, you can create an account and use this module to test its performance over a period.
|
||||
|
||||
- Simulate (start_date, end_date)
|
||||
|
||||
This module needs to save your account in a folder. The model and strategy are saved as pickle files,
and the position and report are saved as Excel files.
The file structure is described in fileStruct_.
|
||||
|
||||
|
||||
Example
|
||||
-------------------
|
||||
|
||||
Let's walk through an example.
|
||||
|
||||
.. note:: Make sure you have the latest version of `qlib` installed.
|
||||
|
||||
If you want to use the models and data provided by `qlib`, you only need to do the following.

First, write a simple configuration file like the one below:
|
||||
|
||||
.. code-block:: YAML

    strategy:
        class: TopkAmountStrategy
        module_path: qlib.contrib.strategy
        args:
            market: csi500
            trade_freq: 5

    model:
        class: ScoreFileModel
        module_path: qlib.contrib.online.online_model
        args:
            loss: mse
            model_path: ./model.bin

    init_cash: 1000000000
|
||||
|
||||
We then can use this command to create a folder and do trading from 2017-01-01 to 2018-08-01.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
online simulate -id v-test -config ./config/config.yaml -exchange_config ./config/exchange.yaml -start 2017-01-01 -end 2018-08-01 -path ./user_data/
|
||||
|
||||
The start date (2017-01-01) is the date on which the user is added, which is also the first predict date,
and the end date (2018-08-01) is the last trade date. You can use the "`online generate -date 2018-08-02...`"
command to continue generating the order list for the next trading date.
|
||||
|
||||
If your account was saved in "./user_data/", you can see the performance of your account compared to a benchmark with:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online show -id v-test -path ./user_data/ -bench SH000905
|
||||
|
||||
...
|
||||
Result of porfolio:
|
||||
sub_bench:
|
||||
risk
|
||||
mean 0.001157
|
||||
std 0.003039
|
||||
annual 0.289131
|
||||
sharpe 6.017635
|
||||
mdd -0.013185
|
||||
sub_cost:
|
||||
risk
|
||||
mean 0.000800
|
||||
std 0.003043
|
||||
annual 0.199944
|
||||
sharpe 4.155963
|
||||
mdd -0.015517
|
||||
|
||||
Here 'SH000905' represents csi500 and 'SH000300' represents csi300
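The risk columns in the report above (``mean``, ``std``, ``annual``, ``sharpe``, ``mdd``) can be related to one another with a short sketch. The formulas here are assumptions inferred from the sample numbers, which are consistent up to rounding with ``annual = mean * 250`` and ``sharpe = mean / std * sqrt(250)``; the drawdown definition (largest drop of the cumulative sum from its running peak) is likewise an assumption, and ``risk_summary`` is a hypothetical helper rather than a qlib function.

```python
import math
import statistics

# Assumption: 250 trading days per year, inferred from the sample output.
TRADING_DAYS_PER_YEAR = 250


def risk_summary(daily_returns):
    """Summarize a list of daily (excess) returns. Illustrative only."""
    mean = statistics.mean(daily_returns)
    std = statistics.stdev(daily_returns)  # sample standard deviation
    annual = mean * TRADING_DAYS_PER_YEAR
    sharpe = mean / std * math.sqrt(TRADING_DAYS_PER_YEAR)

    # Maximum drawdown on the cumulative sum of returns (assumed definition).
    cumulative, peak, mdd = 0.0, float("-inf"), 0.0
    for daily_return in daily_returns:
        cumulative += daily_return
        peak = max(peak, cumulative)
        mdd = min(mdd, cumulative - peak)

    return {"mean": mean, "std": std, "annual": annual, "sharpe": sharpe, "mdd": mdd}


if __name__ == "__main__":
    print(risk_summary([0.01, -0.02, 0.03]))
```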
|
||||
|
||||
Manage your account
|
||||
--------------------
|
||||
|
||||
Any account processed by `online` should be saved in a folder. You can use the following
commands to manage your accounts.
|
||||
|
||||
- add a new account
This adds a new account with user_id='v-test', add_date='2019-10-15' in ./user_data.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online add_user -id {user_id} -config {config_file} -path {folder_path} -date {add_date}
|
||||
>> online add_user -id v-test -config config.yaml -path ./user_data/ -date 2019-10-15
|
||||
|
||||
- remove an account
|
||||
.. code-block:: bash
|
||||
|
||||
>> online remove_user -id {user_id} -path {folder_path}
|
||||
>> online remove_user -id v-test -path ./user_data/
|
||||
|
||||
- show the performance
|
||||
Here `benchmark` is the baseline that your account is compared with.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online show -id {user_id} -path {folder_path} -bench {benchmark}
|
||||
>> online show -id v-test -path ./user_data/ -bench SH000905
|
||||
|
||||
The default value of the 'date' parameter in all commands below is the trade date
(today, if today is a trading date and its information has been updated in `qlib`).

The 'generate' and 'update' commands check whether the input date is valid. The following three processes should
be called on each trading date.
|
||||
|
||||
- generate the order list
|
||||
Generate the order list at the trade date and save it in {folder_path}/{user_id}/temp/ as a JSON file.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online generate -date {date} -path {folder_path}
|
||||
>> online generate -date 2019-10-16 -path ./user_data/
|
||||
|
||||
- execute the order list
|
||||
Execute the order list and generate the transaction results in {folder_path}/{user_id}/temp/ at the trade date.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online execute -date {date} -exchange_config {exchange_config_path} -path {folder_path}
|
||||
>> online execute -date 2019-10-16 -exchange_config ./config/exchange.yaml -path ./user_data/
|
||||
|
||||
A simple exchange config file can look like this:

.. code-block:: yaml

    open_cost: 0.003
    close_cost: 0.003
    limit_threshold: 0.095
    deal_price: vwap
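To make the exchange settings above more concrete, here is a minimal sketch of how such values might be applied. The interpretation is an assumption: ``open_cost`` and ``close_cost`` are treated as proportional fee rates for buying and selling, and ``limit_threshold`` as the absolute daily price change at which a stock is considered to have hit its price limit. The helpers ``trade_cost`` and ``is_tradable`` are hypothetical and do not reproduce qlib's exchange implementation.

```python
# Values copied from the example exchange config above.
EXCHANGE_CONFIG = {
    "open_cost": 0.003,
    "close_cost": 0.003,
    "limit_threshold": 0.095,
    "deal_price": "vwap",
}


def trade_cost(amount, price, side, config):
    """Proportional fee for a trade (assumed: buy uses open_cost, sell uses close_cost)."""
    rate = config["open_cost"] if side == "buy" else config["close_cost"]
    return amount * price * rate


def is_tradable(daily_change, config):
    """Assumed rule: a stock whose price moved by the threshold or more is not traded."""
    return abs(daily_change) < config["limit_threshold"]


if __name__ == "__main__":
    print(trade_cost(100, 10.0, "buy", EXCHANGE_CONFIG))
    print(is_tradable(0.05, EXCHANGE_CONFIG), is_tradable(0.10, EXCHANGE_CONFIG))
```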
|
||||
|
||||
|
||||
- update accounts
|
||||
Update the accounts in "{folder_path}/" at the trade date.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
>> online update -date {date} -path {folder_path}
|
||||
>> online update -date 2019-10-16 -path ./user_data/
|
||||
|
||||
API
|
||||
------------------
|
||||
|
||||
All of these operations are defined in `qlib.contrib.online.operator`.
|
||||
|
||||
.. automodule:: qlib.contrib.online.operator
|
||||
|
||||
.. _fileStruct:
|
||||
|
||||
File structure
|
||||
------------------
|
||||
|
||||
'user_data' indicates the root folder.
In the tree below, names with nested entries are folders; all other names are files.
|
||||
|
||||
.. code-block:: yaml

    {user_folder}
    │   users.csv: (Init date for each users)
    │
    └───{user_id1}: (users' sub-folder to save their data)
    │   │   position.xlsx
    │   │   report.csv
    │   │   model_{user_id1}.pickle
    │   │   strategy_{user_id1}.pickle
    │   │
    │   └───score
    │   │   └───{YYYY}
    │   │       └───{MM}
    │   │           │   score_{YYYY-MM-DD}.csv
    │   │
    │   └───trade
    │       └───{YYYY}
    │           └───{MM}
    │               │   orderlist_{YYYY-MM-DD}.json
    │               │   transaction_{YYYY-MM-DD}.csv
    │
    └───{user_id2}
    │   │   position.xlsx
    │   │   report.csv
    │   │   model_{user_id2}.pickle
    │   │   strategy_{user_id2}.pickle
    │   │
    │   └───score
    │   └───trade
    ....
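The naming pattern in the tree above can be expressed as a small path-building sketch. The helpers ``score_file`` and ``trade_files`` are hypothetical, and the sketch assumes that ``{MM}`` is a zero-padded two-digit month; it only constructs paths and does not touch the file system.

```python
from datetime import date
from pathlib import PurePosixPath


def score_file(user_folder, user_id, day):
    """Path of the score file for one day (layout taken from the tree above)."""
    return PurePosixPath(
        user_folder, user_id, "score",
        f"{day:%Y}", f"{day:%m}",  # assumed: four-digit year, two-digit month
        f"score_{day.isoformat()}.csv",
    )


def trade_files(user_folder, user_id, day):
    """Paths of the order list and transaction files for one day."""
    base = PurePosixPath(user_folder, user_id, "trade", f"{day:%Y}", f"{day:%m}")
    return (
        base / f"orderlist_{day.isoformat()}.json",
        base / f"transaction_{day.isoformat()}.csv",
    )


if __name__ == "__main__":
    day = date(2019, 10, 16)
    print(score_file("user_data", "v-test", day))
    for path in trade_files("user_data", "v-test", day):
        print(path)
```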
|
||||
|
||||
|
||||
Configuration file
|
||||
------------------
|
||||
|
||||
The configuration file used in `online` should contain the model and strategy information.
|
||||
|
||||
About the model
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
First, your configuration file needs a field for the model.
This field and its contents determine the model used when generating scores at the predict date.

The following are two examples: a ScoreFileModel, and a model that reads a score file and returns the score at the trade date.
|
||||
|
||||
.. code-block:: YAML

    model:
        class: ScoreFileModel
        module_path: qlib.contrib.online.OnlineModel
        args:
            loss: mse

.. code-block:: YAML

    model:
        class: ScoreFileModel
        module_path: qlib.contrib.online.OnlineModel
        args:
            score_path: <your score path>
|
||||
|
||||
If your model is not one of the models above, you need to implement it yourself.
Your model should be a subclass of the models defined in 'qlib.contrib.model', and it must
implement the two methods used by the `online` module.
|
||||
|
||||
|
||||
About the strategy
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You need to define the strategy used to generate the order list at the predict date.

The following is an example of a strategy configuration:
|
||||
|
||||
.. code-block:: YAML

    strategy:
        class: TopkDropoutStrategy
        module_path: qlib.contrib.strategy.strategy
        args:
            topk: 100
            n_drop: 10
|
||||
|
||||
Generated files
|
||||
------------------
|
||||
|
||||
The 'online_generate' command creates the order list in {folder_path}/{user_id}/temp/.
The file is named orderlist_{YYYY-MM-DD}.json, where YYYY-MM-DD is the date on which those orders are to be executed.

The format of the JSON file looks like this:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
{
|
||||
'sell': {
|
||||
{'$stock_id1': '$amount1'},
|
||||
{'$stock_id2': '$amount2'}, ...
|
||||
},
|
||||
'buy': {
|
||||
{'$stock_id1': '$amount1'},
|
||||
{'$stock_id2': '$amount2'}, ...
|
||||
}
|
||||
}
|
||||
|
||||
After the order list has been executed (either by 'online_execute' or by another executor), a transaction file
is also created in {folder_path}/{user_id}/temp/.
|
||||
327
docs/hidden/tuner.rst
Normal file
@@ -0,0 +1,327 @@
|
||||
.. _tuner:
|
||||
|
||||
Tuner
|
||||
===================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
-------------------
|
||||
|
||||
Welcome to Tuner. This document assumes that you can already use Estimator proficiently and correctly.

With Tuner, you can find the optimal hyper-parameters and combinations of models, trainers, strategies and data labels.

The `tuner` program is used in much the same way as `estimator`: you need to provide the URL of the configuration file.
|
||||
The `tuner` will do the following things:
|
||||
|
||||
- Construct tuner pipeline
|
||||
- Search and save best hyper-parameters of one tuner
|
||||
- Search next tuner in pipeline
|
||||
- Save the global best hyper-parameters and combination
|
||||
|
||||
Each tuner consists of one combination of modules, and its goal is to search for the optimal hyper-parameters of that combination.
The pipeline consists of different tuners and aims to find the optimal combination of modules.
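The relationship between tuners and the pipeline can be sketched in a few lines: each tuner finds a local best within its own space, and the pipeline then picks the global best across tuners. This sketch is a stand-in that uses an exhaustive grid and toy objectives instead of the real search and `estimator` runs; the names ``search_tuner`` and ``run_pipeline`` are hypothetical and are not part of qlib.

```python
from itertools import product


def search_tuner(space, objective):
    """Evaluate every point of one tuner's space; return (best_params, best_loss)."""
    names = sorted(space)
    best_params, best_loss = None, float("inf")
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        loss = objective(params)  # lower is better
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


def run_pipeline(tuners):
    """Run each (space, objective) tuner in order; return (best_tuner_id, local_bests)."""
    local_bests = [search_tuner(space, objective) for space, objective in tuners]
    best_id = min(range(len(local_bests)), key=lambda i: local_bests[i][1])
    return best_id, local_bests


if __name__ == "__main__":
    # Toy tuners: the objectives are made up purely for illustration.
    tuners = [
        ({"topk": [30, 40]}, lambda p: abs(p["topk"] - 40)),
        ({"topk": [35]}, lambda p: 1.0),
    ]
    best_id, local_bests = run_pipeline(tuners)
    print("local bests:", local_bests)
    print("best tuner id:", best_id)
```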
|
||||
|
||||
The results are printed on screen and saved to files; you can check them in the files saved by your experiment.
|
||||
|
||||
Example
|
||||
~~~~~~~
|
||||
|
||||
Let's look at an example.

First, make sure you have the latest version of `qlib` installed.

Then, you need to provide a configuration to set up the experiment.
Here is a simple configuration example:
|
||||
|
||||
.. code-block:: YAML

    experiment:
        name: tuner_experiment
        tuner_class: QLibTuner
    qlib_client:
        auto_mount: False
        logging_level: INFO
    optimization_criteria:
        report_type: model
        report_factor: model_score
        optim_type: max
    tuner_pipeline:
      -
        model:
            class: SomeModel
            space: SomeModelSpace
        trainer:
            class: RollingTrainer
        strategy:
            class: TopkAmountStrategy
            space: TopkAmountStrategySpace
        max_evals: 2

    time_period:
        rolling_period: 360
        train_start_date: 2005-01-01
        train_end_date: 2014-12-31
        validate_start_date: 2015-01-01
        validate_end_date: 2016-06-30
        test_start_date: 2016-07-01
        test_end_date: 2018-04-30
    data:
        class: ALPHA360
        provider_uri: /data/qlib
        args:
            start_date: 2005-01-01
            end_date: 2018-04-30
            dropna_label: True
            dropna_feature: True
        filter:
            market: csi500
            filter_pipeline:
              -
                class: NameDFilter
                module_path: qlib.data.filter
                args:
                    name_rule_re: S(?!Z3)
                    fstart_time: 2018-01-01
                    fend_time: 2018-12-11
              -
                class: ExpressionDFilter
                module_path: qlib.data.filter
                args:
                    rule_expression: $open/$factor<=45
                    fstart_time: 2018-01-01
                    fend_time: 2018-12-11
    backtest:
        normal_backtest_args:
            verbose: False
            limit_threshold: 0.095
            account: 500000
            benchmark: SH000905
            deal_price: vwap
        long_short_backtest_args:
            topk: 50
|
||||
|
||||
Next, run the following command. The output looks like this:
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
~/v-yindzh/Qlib/cfg$ tuner -c tuner_config.yaml
|
||||
|
||||
Searching params: {'model_space': {'colsample_bytree': 0.8870905643607678, 'lambda_l1': 472.3188735122233, 'lambda_l2': 92.75390994877243, 'learning_rate': 0.09741751430635413, 'loss': 'mse', 'max_depth': 8, 'num_leaves': 160, 'num_threads': 20, 'subsample': 0.7536051584789751}, 'strategy_space': {'buffer_margin': 250, 'topk': 40}}
|
||||
...
|
||||
(Estimator experiment screen log)
|
||||
...
|
||||
Searching params: {'model_space': {'colsample_bytree': 0.6667379039007301, 'lambda_l1': 382.10698024977904, 'lambda_l2': 117.02506488151757, 'learning_rate': 0.18514539615228137, 'loss': 'mse', 'max_depth': 6, 'num_leaves': 200, 'num_threads': 12, 'subsample': 0.9449255686969292}, 'strategy_space': {'buffer_margin': 200, 'topk': 30}}
|
||||
...
|
||||
(Estimator experiment screen log)
|
||||
...
|
||||
Local best params: {'model_space': {'colsample_bytree': 0.6667379039007301, 'lambda_l1': 382.10698024977904, 'lambda_l2': 117.02506488151757, 'learning_rate': 0.18514539615228137, 'loss': 'mse', 'max_depth': 6, 'num_leaves': 200, 'num_threads': 12, 'subsample': 0.9449255686969292}, 'strategy_space': {'buffer_margin': 200, 'topk': 30}}
|
||||
Time cost: 489.87220 | Finished searching best parameters in Tuner 0.
|
||||
Time cost: 0.00069 | Finished saving local best tuner parameters to: tuner_experiment/estimator_experiment/estimator_experiment_0/local_best_params.json .
|
||||
Searching params: {'data_label_space': {'labels': ('Ref($vwap, -2)/Ref($vwap, -1) - 2',)}, 'model_space': {'input_dim': 158, 'lr': 0.001, 'lr_decay': 0.9100529502185579, 'lr_decay_steps': 162.48901403763966, 'optimizer': 'gd', 'output_dim': 1}, 'strategy_space': {'buffer_margin': 300, 'topk': 35}}
|
||||
...
|
||||
(Estimator experiment screen log)
|
||||
...
|
||||
Searching params: {'data_label_space': {'labels': ('Ref($vwap, -2)/Ref($vwap, -1) - 1',)}, 'model_space': {'input_dim': 158, 'lr': 0.1, 'lr_decay': 0.9882802970847494, 'lr_decay_steps': 164.76742865207729, 'optimizer': 'adam', 'output_dim': 1}, 'strategy_space': {'buffer_margin': 250, 'topk': 35}}
|
||||
...
|
||||
(Estimator experiment screen log)
|
||||
...
|
||||
Local best params: {'data_label_space': {'labels': ('Ref($vwap, -2)/Ref($vwap, -1) - 1',)}, 'model_space': {'input_dim': 158, 'lr': 0.1, 'lr_decay': 0.9882802970847494, 'lr_decay_steps': 164.76742865207729, 'optimizer': 'adam', 'output_dim': 1}, 'strategy_space': {'buffer_margin': 250, 'topk': 35}}
|
||||
Time cost: 550.74039 | Finished searching best parameters in Tuner 1.
|
||||
Time cost: 0.00023 | Finished saving local best tuner parameters to: tuner_experiment/estimator_experiment/estimator_experiment_1/local_best_params.json .
|
||||
Time cost: 1784.14691 | Finished tuner pipeline.
|
||||
Time cost: 0.00014 | Finished save global best tuner parameters.
|
||||
Best Tuner id: 0.
|
||||
You can check the best parameters at tuner_experiment/global_best_params.json.
|
||||
|
||||
|
||||
Finally, you can check the results of your experiment in the given path.
|
||||
|
||||
Configuration file
|
||||
------------------
|
||||
|
||||
Before using `tuner`, you need to prepare a configuration file. Next we will show you how to prepare each part of the configuration file.
|
||||
|
||||
About the experiment
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
First, your configuration file needs a field for the experiment, whose key is `experiment`. This field and its contents determine the saving path and the tuner class.
|
||||
|
||||
Usually it should contain the following content:
|
||||
|
||||
.. code-block:: YAML

    experiment:
        name: tuner_experiment
        tuner_class: QLibTuner
|
||||
|
||||
Also, there are some optional fields. The meaning of each field is as follows:
|
||||
|
||||
- `name`
    The experiment name, str type. The program uses this name to construct a directory in which the process and results of the whole experiment are saved. The default value is `tuner_experiment`.

- `dir`
    The saving path, str type. The program constructs the experiment directory in this path. The default value is the path where the configuration file is located.

- `tuner_class`
    The class of the tuner, str type. It must be an already implemented tuner, such as `QLibTuner` in `qlib`, or a custom tuner that is a subclass of `qlib.contrib.tuner.Tuner`. The default value is `QLibTuner`.

- `tuner_module_path`
    The module path, str type. It indicates the path of the tuner implementation; an absolute URL is also supported. The default value is `qlib.contrib.tuner.tuner`.
|
||||
|
||||
About the optimization criteria
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You need to designate a factor to optimize, because the tuner needs a factor to decide which case is better than the others.
Usually, we use a result of `estimator`, such as the backtest results or the score of the model.

This part needs to contain these fields:
|
||||
|
||||
.. code-block:: YAML

    optimization_criteria:
        report_type: model
        report_factor: model_pearsonr
        optim_type: max
|
||||
|
||||
- `report_type`
    The type of the report, str type. It determines which kind of report you want to use. For the backtest result type, you can choose `pred_long`, `pred_long_short`, `pred_short`, `sub_bench` or `sub_cost`. For the model result type, you can only choose `model`.

- `report_factor`
    The factor you want to use in the report, str type. It determines which factor you want to optimize. If your `report_type` is a backtest result type, you can choose `annual`, `sharpe`, `mdd`, `mean` or `std`. If your `report_type` is the model result type, you can choose `model_score` or `model_pearsonr`.

- `optim_type`
    The optimization type, str type. It determines what kind of optimization you want to do: you can choose `max`, `min` or `correlation` for this field.
    Note: `correlation` means the factor's best value is 1, such as `model_pearsonr` (a correlation coefficient).

If you want to process the factor, or fetch other kinds of factors, you can override the `objective` method in your own tuner.
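One way to picture the three `optim_type` values is as a conversion from the reported factor to a loss that a minimizer can work with. The sketch below is an assumption about how such a conversion could look, in particular the use of the distance from 1 for `correlation`; ``factor_to_loss`` is a hypothetical helper and does not reproduce the tuner's actual `objective` method.

```python
def factor_to_loss(value, optim_type):
    """Map a report factor to a loss where lower is better (illustrative only)."""
    if optim_type == "min":
        return value
    if optim_type == "max":
        # Maximizing the factor is the same as minimizing its negative.
        return -value
    if optim_type == "correlation":
        # Assumed: the best value is 1, so penalize the distance from 1.
        return abs(1.0 - value)
    raise ValueError("optim_type must be 'max', 'min' or 'correlation'")


if __name__ == "__main__":
    print(factor_to_loss(0.3, "max"))
    print(factor_to_loss(0.3, "min"))
    print(factor_to_loss(0.9, "correlation"))
```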
|
||||
|
||||
About the tuner pipeline
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The tuner pipeline contains different tuners, and the `tuner` program processes each tuner in the pipeline. Each tuner finds the optimal hyper-parameters of its specific combination of modules. The pipeline then compares the results of the tuners to get the best combination and its optimal hyper-parameters. You therefore need to configure the pipeline and each tuner. Here is an example:
|
||||
|
||||
.. code-block:: YAML

    tuner_pipeline:
      -
        model:
            class: SomeModel
            space: SomeModelSpace
        trainer:
            class: RollingTrainer
        strategy:
            class: TopkAmountStrategy
            space: TopkAmountStrategySpace
        max_evals: 2
|
||||
|
||||
Each part represents a tuner and the modules that it tunes. The `space` in each part is the hyper-parameter space of a certain module; you need to create your search space and modify it in `/qlib/contrib/tuner/space.py`. We use the `hyperopt` package to construct the space; see https://github.com/hyperopt/hyperopt/wiki/FMin for details on how to use it.
|
||||
|
||||
- model
    You need to provide the `class` and the `space` of the model. If the model is the user's own implementation, you also need to provide the `module_path`.

- trainer
    You need to provide the `class` of the trainer. If the trainer is the user's own implementation, you also need to provide the `module_path`.

- strategy
    You need to provide the `class` and the `space` of the strategy. If the strategy is the user's own implementation, you also need to provide the `module_path`.

- data_label
    The label of the data. You can search for the kinds of labels that lead to a better result. This part is optional, and you only need to provide `space`.

- max_evals
    Allow up to this many function evaluations in this tuner. The default value is 10.

If you don't want to search some modules, you can fix their spaces in `space.py`. No default module is provided.
|
||||
|
||||
About the time period
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You need to use the same dataset to evaluate the different `estimator` experiments within one `tuner` experiment, because two experiments that use different datasets are not comparable. You can specify `time_period` in the configuration file:
|
||||
|
||||
.. code-block:: YAML

    time_period:
        rolling_period: 360
        train_start_date: 2005-01-01
        train_end_date: 2014-12-31
        validate_start_date: 2015-01-01
        validate_end_date: 2016-06-30
        test_start_date: 2016-07-01
        test_end_date: 2018-04-30
|
||||
|
||||
- `rolling_period`
    The rolling period, integer type. It indicates how many time steps to move forward each time the data is rolled. The default value is `60`. This setting is used only if you use `RollingTrainer`; otherwise it is ignored.
|
||||
|
||||
- `train_start_date`
|
||||
Training start time, str type.
|
||||
|
||||
- `train_end_date`
|
||||
Training end time, str type.
|
||||
|
||||
- `validate_start_date`
|
||||
Validation start time, str type.
|
||||
|
||||
- `validate_end_date`
|
||||
Validation end time, str type.
|
||||
|
||||
- `test_start_date`
|
||||
Test start time, str type.
|
||||
|
||||
- `test_end_date`
|
||||
Test end time, str type. If `test_end_date` is `-1` or greater than the last date of the data, the last date of the data will be used as `test_end_date`.
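The rule for `test_end_date` described above (use the last date of the data when the value is `-1` or lies beyond the data) can be sketched as follows. ``resolve_test_end_date`` is a hypothetical helper, and passing the last available date in as an argument is an assumption made to keep the example self-contained.

```python
from datetime import date


def resolve_test_end_date(test_end_date, last_data_date):
    """Clamp test_end_date to the last date available in the data (illustrative only)."""
    # -1 (as an int or a string) means "use the last date of the data".
    if test_end_date == -1 or test_end_date == "-1":
        return last_data_date
    if isinstance(test_end_date, str):
        test_end_date = date.fromisoformat(test_end_date)
    # A date beyond the data is replaced by the last available date.
    return min(test_end_date, last_data_date)


if __name__ == "__main__":
    last = date(2018, 4, 30)
    print(resolve_test_end_date(-1, last))
    print(resolve_test_end_date("2019-01-01", last))
    print(resolve_test_end_date("2017-12-29", last))
```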
|
||||
|
||||
About the data and backtest
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
`data` and `backtest` are the same throughout the whole `tuner` experiment: different `estimator` experiments must use the same data and backtest method. These two parts of the config are therefore the same as in the `estimator` configuration. You can find the precise definition of these parts in the `estimator` introduction. We only provide an example here.
|
||||
|
||||
.. code-block:: YAML

    data:
        class: ALPHA360
        provider_uri: /data/qlib
        args:
            start_date: 2005-01-01
            end_date: 2018-04-30
            dropna_label: True
            dropna_feature: True
            feature_label_config: /home/v-yindzh/v-yindzh/QLib/cfg/feature_config.yaml
        filter:
            market: csi500
            filter_pipeline:
              -
                class: NameDFilter
                module_path: qlib.data.filter
                args:
                    name_rule_re: S(?!Z3)
                    fstart_time: 2018-01-01
                    fend_time: 2018-12-11
              -
                class: ExpressionDFilter
                module_path: qlib.data.filter
                args:
                    rule_expression: $open/$factor<=45
                    fstart_time: 2018-01-01
                    fend_time: 2018-12-11
    backtest:
        normal_backtest_args:
            verbose: False
            limit_threshold: 0.095
            account: 500000
            benchmark: SH000905
            deal_price: vwap
        long_short_backtest_args:
            topk: 50
|
||||
|
||||
Experiment Result
|
||||
-----------------
|
||||
|
||||
All results are stored directly in the experiment directory, and you can check them in the corresponding files.
The following items are saved:
|
||||
|
||||
- Global optimal parameters
|
||||
- Local optimal parameters of each tuner
|
||||
- Config file of this `tuner` experiment
|
||||
- The result of every `estimator` experiment run in the process
|
||||
|
||||