mirror of https://github.com/microsoft/qlib.git synced 2026-07-04 11:30:57 +08:00

init commit

This commit is contained in:
Young
2020-09-22 01:43:21 +00:00
parent aa51e5aad3
commit 99ebd87cba
131 changed files with 20218 additions and 0 deletions

docs/hidden/client.rst Normal file

@@ -0,0 +1,171 @@
.. _client:
Qlib Client-Server Framework
============================
.. currentmodule:: qlib
Introduction
------------
The client-server framework is designed to solve the following problems:
- Manage the data in a centralized way, so users don't have to manage different versions of the data.
- Reduce the amount of cache to be generated.
- Make the data accessible remotely.
To address these problems, we maintain a server that provides the data.
To use the client-server framework, you have to initialize qlib with a specific config.
Here is a typical initialization process.
Commonly used parameters of qlib ``init`` are listed below. ``nfs-common`` must be installed on the machine where the client runs; to install it, execute ``sudo apt install nfs-common``:
- ``provider_uri``: NFS server path in the format ``host:data_dir``, for example ``172.23.233.89:/data2/gaochao/sync_qlib/qlib``. When working offline, it can be a local data directory.
- ``mount_path``: local data directory, ``provider_uri`` will be mounted to this directory
- ``auto_mount``: whether to automatically mount ``provider_uri`` to ``mount_path`` during qlib ``init``. You can also mount it manually with ``sudo mount.nfs <provider_uri> <mount_path>``. On PAI, it is recommended to set ``auto_mount=True``.
- ``flask_server``: data service host; if you are on the intranet, you can use the default host: 172.23.233.89
- ``flask_port``: data service port
If running on the 10.150.144.153 or 10.150.144.154 server, it's recommended to use the following code to ``init`` qlib:
.. code-block:: python
>>> import qlib
>>> qlib.init(auto_mount=False, mount_path='/data/csdesign/qlib')
>>> from qlib.data import D
>>> D.features(['SH600000'], ['$close'], start_time='20080101', end_time='20090101').head()
[39336:MainThread](2019-05-28 21:35:42,800) INFO - Initialization - [__init__.py:16] - default_conf: client.
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:56] - provider_uri=172.23.233.89:/data2/gaochao/sync_qlib/qlib
[39336:Thread-68](2019-05-28 21:35:42,809) INFO - Client - [client.py:28] - Connect to server ws://172.23.233.89:9710
[39336:Thread-72](2019-05-28 21:35:43,489) INFO - Client - [client.py:31] - Disconnect from server!
Opening /data/csdesign/qlib/cache/d239a3b191daa9a5b1b19a59beb47b33 in read-only mode
Out[5]:
$close
instrument datetime
SH600000 2008-01-02 119.079704
2008-01-03 113.120125
2008-01-04 117.878860
2008-01-07 124.505539
2008-01-08 125.395004
If running on PAI, it's recommended to use the following code to ``init`` qlib:
.. code-block:: python
>>> import qlib
>>> qlib.init(auto_mount=True, mount_path='/data/csdesign/qlib', provider_uri='172.23.233.89:/data2/gaochao/sync_qlib/qlib')
>>> from qlib.data import D
>>> D.features(['SH600000'], ['$close'], start_time='20080101', end_time='20090101').head()
[39336:MainThread](2019-05-28 21:35:42,800) INFO - Initialization - [__init__.py:16] - default_conf: client.
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:56] - provider_uri=172.23.233.89:/data2/gaochao/sync_qlib/qlib
[39336:Thread-68](2019-05-28 21:35:42,809) INFO - Client - [client.py:28] - Connect to server ws://172.23.233.89:9710
[39336:Thread-72](2019-05-28 21:35:43,489) INFO - Client - [client.py:31] - Disconnect from server!
Opening /data/csdesign/qlib/cache/d239a3b191daa9a5b1b19a59beb47b33 in read-only mode
Out[5]:
$close
instrument datetime
SH600000 2008-01-02 119.079704
2008-01-03 113.120125
2008-01-04 117.878860
2008-01-07 124.505539
2008-01-08 125.395004
If running on Windows, enable the **NFS** features and set a correct **mount_path** first; then it's recommended to use the following code to ``init`` qlib:
1. Enable the NFS features on Windows
* Open ``Programs and Features``.
* Click ``Turn Windows features on or off``.
* Scroll down and check the option ``Services for NFS``, then click OK.
Reference: https://graspingtech.com/mount-nfs-share-windows-10/
2. Configure a correct ``mount_path``
* On Windows, the mount path must be a root path (a drive letter) that does not exist yet.
* Correctly formatted paths, e.g.: `H`, `i`...
* Incorrectly formatted paths, e.g.: `C`, `C:/user/name`, `qlib_data`...
.. code-block:: python
>>> import qlib
>>> qlib.init(auto_mount=True, mount_path='H', provider_uri='172.23.233.89:/data2/gaochao/sync_qlib/qlib')
>>> from qlib.data import D
>>> D.features(['SH600000'], ['$close'], start_time='20080101', end_time='20090101').head()
[39336:MainThread](2019-05-28 21:35:42,800) INFO - Initialization - [__init__.py:16] - default_conf: client.
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
[39336:MainThread](2019-05-28 21:35:42,801) INFO - Initialization - [__init__.py:56] - provider_uri=172.23.233.89:/data2/gaochao/sync_qlib/qlib
[39336:Thread-68](2019-05-28 21:35:42,809) INFO - Client - [client.py:28] - Connect to server ws://172.23.233.89:9710
[39336:Thread-72](2019-05-28 21:35:43,489) INFO - Client - [client.py:31] - Disconnect from server!
Opening /data/csdesign/qlib/cache/d239a3b191daa9a5b1b19a59beb47b33 in read-only mode
Out[5]:
$close
instrument datetime
SH600000 2008-01-02 119.079704
2008-01-03 113.120125
2008-01-04 117.878860
2008-01-07 124.505539
2008-01-08 125.395004
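
The Windows ``mount_path`` rule described above can be sketched as a small check. This is a hypothetical helper, not part of qlib; it assumes that a valid value is a single drive letter that is not yet in use, and the set of used letters is passed in explicitly so that the sketch runs on any platform.

.. code-block:: python

    import string


    def is_valid_windows_mount_path(mount_path, used_letters):
        """Return True if mount_path is a single, currently unused drive letter.

        Hypothetical helper for illustration; not part of qlib.
        """
        # Must be exactly one ASCII letter such as "H" or "i" (no colon, no sub-path).
        if len(mount_path) != 1 or mount_path not in string.ascii_letters:
            return False
        # The drive letter must not already exist on the machine.
        return mount_path.upper() not in {letter.upper() for letter in used_letters}


    used = {"C", "D"}  # drive letters assumed to be in use already
    print(is_valid_windows_mount_path("H", used))             # True
    print(is_valid_windows_mount_path("C", used))             # False
    print(is_valid_windows_mount_path("C:/user/name", used))  # False
    print(is_valid_windows_mount_path("qlib_data", used))     # False
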
The client mounts the data in `provider_uri` on `mount_path`. The server and client then communicate through the Flask service, and the data is transferred over this NFS mount.
If you have local qlib data files and want to use the data offline instead of online through the client-server framework, this is also possible with a specific config.
You can create such a config, for example `client_config_local.yml`:
.. code-block:: YAML
provider_uri: /data/csdesign/qlib
calendar_provider: 'LocalCalendarProvider'
instrument_provider: 'LocalInstrumentProvider'
feature_provider: 'LocalFeatureProvider'
expression_provider: 'LocalExpressionProvider'
dataset_provider: 'LocalDatasetProvider'
provider: 'LocalProvider'
dataset_cache: 'SimpleDatasetCache'
local_cache_path: '~/.cache/qlib/'
`provider_uri` is the directory of your local data.
.. code-block:: python
>>> import qlib
>>> qlib.init_from_yaml_conf('client_config_local.yml')
>>> from qlib.data import D
>>> D.features(['SH600001'], ['$close'], start_time='20080101', end_time='20090101').head()
[21232:MainThread](2019-05-29 10:16:05,066) INFO - Initialization - [__init__.py:16] - default_conf: client.
[21232:MainThread](2019-05-29 10:16:05,066) INFO - Initialization - [__init__.py:54] - qlib successfully initialized based on client settings.
[21232:MainThread](2019-05-29 10:16:05,067) INFO - Initialization - [__init__.py:56] - provider_uri=/data/csdesign/qlib
Out[9]:
$close
instrument datetime
SH600001 2008-01-02 21.082111
2008-01-03 23.195362
2008-01-04 23.874615
2008-01-07 24.880930
2008-01-08 24.277143
Limitations
-----------
1. The following APIs under the client-server module may not be as fast as the older offline APIs.
- Cal.calendar
- Inst.list_instruments
2. Rolling operation expressions with parameter `0` cannot be updated correctly under the mechanism of the client-server framework.
API
********************
The client is based on `python-socketio <https://python-socketio.readthedocs.io>`_, a framework that provides a WebSocket client for Python. The client only sends requests and receives results; it does not perform any computation.
Class
--------------------
.. automodule:: qlib.data.client

docs/hidden/online.rst Normal file

@@ -0,0 +1,285 @@
.. _online:
Online
===================
.. currentmodule:: qlib
Introduction
-------------------
Welcome to Online. This module simulates what real trading with our model and strategy would look like.
Just like Estimator and other modules in Qlib, you need to determine parameters through the configuration file,
and in this module, you need to add an account in a folder to do the simulation. Then, on each following trading day,
this module will use the newest information to trade for your account,
and the performance can be viewed at any time using the API we defined.
Each account goes through the following processes. Here pred_date is the date on which you predict the target
positions after trading, and trade_date is the date on which you trade.
- Generate the order list (pred_date)
- Execute the order list (trade_date)
- Update account (trade_date)
Alternatively, you can just create an account and use this module to test its performance over a period.
- Simulate (start_date, end_date)
This module needs to save your account in a folder. The model and strategy are saved as pickle files,
and the position and report are saved as Excel files.
The file structure can be viewed at fileStruct_.
Example
-------------------
Let's take an example.
.. note:: Make sure you have the latest version of `qlib` installed.
If you want to use the models and data provided by `qlib`, you only need to do the following.
First, write a simple configuration file as follows:
.. code-block:: YAML
strategy:
class: TopkAmountStrategy
module_path: qlib.contrib.strategy
args:
market: csi500
trade_freq: 5
model:
class: ScoreFileModel
module_path: qlib.contrib.online.online_model
args:
loss: mse
model_path: ./model.bin
init_cash: 1000000000
We can then use this command to create a folder and trade from 2017-01-01 to 2018-08-01.
.. code-block:: bash
online simulate -id v-test -config ./config/config.yaml -exchange_config ./config/exchange.yaml -start 2017-01-01 -end 2018-08-01 -path ./user_data/
The start date (2017-01-01) is the date on which the user is added, which is also the first prediction date,
and the end date (2018-08-01) is the last trade date. You can use the "`online generate -date 2018-08-02...`"
command to continue generating the order list on the next trading date.
If your account was saved in "./user_data/", you can see the performance of your account compared to a benchmark with:
.. code-block:: bash
>> online show -id v-test -path ./user_data/ -bench SH000905
...
Result of porfolio:
sub_bench:
risk
mean 0.001157
std 0.003039
annual 0.289131
sharpe 6.017635
mdd -0.013185
sub_cost:
risk
mean 0.000800
std 0.003043
annual 0.199944
sharpe 4.155963
mdd -0.015517
Here 'SH000905' represents csi500 and 'SH000300' represents csi300
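
The figures in each table above are consistent with the usual definitions computed from daily returns: ``annual`` is approximately ``mean`` times 250, and ``sharpe`` is approximately ``annual`` divided by ``std`` times the square root of 250. The sketch below reproduces these quantities under that assumption. The 250-day year and the drawdown on summed returns are inferred from the numbers, not confirmed by this document, and ``risk_summary`` is a hypothetical name.

.. code-block:: python

    import math
    import statistics


    def risk_summary(daily_returns, periods_per_year=250):
        """Summarize a series of daily returns (assumed definitions, see above)."""
        mean = statistics.mean(daily_returns)
        std = statistics.stdev(daily_returns)  # sample standard deviation
        annual = mean * periods_per_year
        sharpe = annual / (std * math.sqrt(periods_per_year))
        # Maximum drawdown on the cumulative (summed) return curve.
        cumulative, peak, mdd = 0.0, float("-inf"), 0.0
        for r in daily_returns:
            cumulative += r
            peak = max(peak, cumulative)
            mdd = min(mdd, cumulative - peak)
        return {"mean": mean, "std": std, "annual": annual, "sharpe": sharpe, "mdd": mdd}


    returns = [0.01, -0.02, 0.005, 0.015, -0.005]  # made-up daily returns
    for name, value in risk_summary(returns).items():
        print(f"{name:7s}{value: .6f}")
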
Manage your account
--------------------
Any account processed by `online` should be saved in a folder. You can use the commands
defined below to manage your accounts.
- add a new account
This will add a new account with user_id='v-test', add_date='2019-10-15' in ./user_data.
.. code-block:: bash
>> online add_user -id {user_id} -config {config_file} -path {folder_path} -date {add_date}
>> online add_user -id v-test -config config.yaml -path ./user_data/ -date 2019-10-15
- remove an account
.. code-block:: bash
>> online remove_user -id {user_id} -path {folder_path}
>> online remove_user -id v-test -path ./user_data/
- show the performance
Here benchmark indicates the baseline to compare your account against.
.. code-block:: bash
>> online show -id {user_id} -path {folder_path} -bench {benchmark}
>> online show -id v-test -path ./user_data/ -bench SH000905
The default value of every 'date' parameter below is the trade date
(today, if today is a trading date and the information has been updated in `qlib`).
'generate' and 'update' check whether the input date is valid. The following 3 processes should
be called on each trading date.
- generate the order list
Generate the order list at the trade date and save it in {folder_path}/{user_id}/temp/ as a JSON file.
.. code-block:: bash
>> online generate -date {date} -path {folder_path}
>> online generate -date 2019-10-16 -path ./user_data/
- execute the order list
Execute the order list and generate the transaction results in {folder_path}/{user_id}/temp/ at the trade date.
.. code-block:: bash
>> online execute -date {date} -exchange_config {exchange_config_path} -path {folder_path}
>> online execute -date 2019-10-16 -exchange_config ./config/exchange.yaml -path ./user_data/
A simple exchange config file can look like this:
.. code-block:: yaml
open_cost: 0.003
close_cost: 0.003
limit_threshold: 0.095
deal_price: vwap
- update accounts
Update the accounts in "{folder_path}/" at the trade date.
.. code-block:: bash
>> online update -date {date} -path {folder_path}
>> online update -date 2019-10-16 -path ./user_data/
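
The three per-day steps (generate, execute, update) can be chained from a script. The sketch below only builds the command lines, using the flags shown in the examples above. The helper name ``daily_commands`` is hypothetical, and actually running the commands assumes the `online` program is installed.

.. code-block:: python

    import shlex


    def daily_commands(date, folder_path, exchange_config):
        """Build the three `online` command lines for one trading date.

        Hypothetical helper; it only uses the flags shown in the examples above.
        """
        return [
            ["online", "generate", "-date", date, "-path", folder_path],
            ["online", "execute", "-date", date,
             "-exchange_config", exchange_config, "-path", folder_path],
            ["online", "update", "-date", date, "-path", folder_path],
        ]


    for cmd in daily_commands("2019-10-16", "./user_data/", "./config/exchange.yaml"):
        # Print instead of executing; pass each list to subprocess.run(cmd, check=True)
        # on a machine where the `online` command is installed.
        print(shlex.join(cmd))
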
API
------------------
All these operations are based on the functions defined in `qlib.contrib.online.operator`.
.. automodule:: qlib.contrib.online.operator
.. _fileStruct:
File structure
------------------
'user_data' indicates the root folder.
A name in bold indicates a folder; otherwise it is a file.
.. code-block:: yaml
{user_folder}
│ users.csv: (Init date for each users)
└───{user_id1}: (users' sub-folder to save their data)
│ │ position.xlsx
│ │ report.csv
│ │ model_{user_id1}.pickle
│ │ strategy_{user_id1}.pickle
│ │
│ └───score
│ │ └───{YYYY}
│ │ └───{MM}
│ │ │ score_{YYYY-MM-DD}.csv
│ │
│ └───trade
│ └───{YYYY}
│ └───{MM}
│ │ orderlist_{YYYY-MM-DD}.json
│ │ transaction_{YYYY-MM-DD}.csv
└───{user_id2}
│ │ position.xlsx
│ │ report.csv
│ │ model_{user_id2}.pickle
│ │ strategy_{user_id2}.pickle
│ │
│ └───score
│ └───trade
....
Configuration file
------------------
The configuration file used in `online` should contain the model and strategy information.
About the model
~~~~~~~~~~~~~~~~~~~~
First, your configuration file needs a field for the model.
This field and its contents determine the model used when generating scores at the prediction date.
The following are two examples for ScoreFileModel, a model that reads a score file and returns the score at the trade date.
.. code-block:: YAML
model:
class: ScoreFileModel
module_path: qlib.contrib.online.online_model
args:
loss: mse
.. code-block:: YAML
model:
class: ScoreFileModel
module_path: qlib.contrib.online.online_model
args:
score_path: <your score path>
If your model is not one of the models above, you need to implement it yourself.
Your model should be a subclass of the models defined in 'qlib.contrib.model', and it must
implement 2 methods used in the `online` module.
About the strategy
~~~~~~~~~~~~~~~~~~~~
You need to define the strategy used to generate the order list at the prediction date.
The following is an example for a TopkDropoutStrategy:
.. code-block:: YAML
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
args:
topk: 100
n_drop: 10
Generated files
------------------
The 'online_generate' command creates the order list at {folder_path}/{user_id}/temp/.
The file is named orderlist_{YYYY-MM-DD}.json, where YYYY-MM-DD is the date on which those orders are to be executed.
The format of the JSON file is as follows:
.. code-block:: python
{
"sell": {
"$stock_id1": "$amount1",
"$stock_id2": "$amount2", ...
},
"buy": {
"$stock_id1": "$amount1",
"$stock_id2": "$amount2", ...
}
}
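
A consumer of the order list might read the file as sketched below. This is only an illustration: it assumes that ``sell`` and ``buy`` each map a stock id to an amount, and the stock ids, amounts and the helper name ``summarize_orders`` are made up.

.. code-block:: python

    import json

    # Example content, assuming "sell" and "buy" each map a stock id to an amount.
    order_list_text = '{"sell": {"SH600000": 100}, "buy": {"SH600001": 200, "SZ000001": 300}}'


    def summarize_orders(text):
        """Return (direction, stock_id, amount) tuples from an order-list JSON string."""
        orders = json.loads(text)
        return [
            (direction, stock_id, amount)
            for direction in ("sell", "buy")
            for stock_id, amount in orders.get(direction, {}).items()
        ]


    for direction, stock_id, amount in summarize_orders(order_list_text):
        print(direction, stock_id, amount)
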
Then, after executing the order list (either by 'online_execute' or other executors), a transaction file
is also created at {folder_path}/{user_id}/temp/.

docs/hidden/tuner.rst Normal file

@@ -0,0 +1,327 @@
.. _tuner:
Tuner
===================
.. currentmodule:: qlib
Introduction
-------------------
Welcome to Tuner. This document assumes that you can already use Estimator proficiently and correctly.
With Tuner, you can find the optimal hyper-parameters and combinations of models, trainers, strategies and data labels.
The usage of the `tuner` program is similar to that of `estimator`: you need to provide the path of the configuration file.
`tuner` will do the following things:
- Construct tuner pipeline
- Search and save best hyper-parameters of one tuner
- Search next tuner in pipeline
- Save the global best hyper-parameters and combination
Each tuner consists of one combination of modules, and its goal is to search for the optimal hyper-parameters of this combination.
The pipeline consists of different tuners and aims at finding the optimal combination of modules.
The result is printed on screen and saved to file; you can check it in the files saved for your experiment.
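
The control flow described above (search each tuner, keep its local best, then pick the global best) can be sketched as follows. This is a simplified illustration and not the `tuner` implementation: plain random search stands in for `hyperopt`, and ``run_pipeline``, ``sample_params`` and ``evaluate`` are hypothetical names.

.. code-block:: python

    import random


    def run_pipeline(tuners, max_evals=10, seed=0):
        """Search each tuner's space, then pick the best tuner overall.

        `tuners` maps a tuner id to (sample_params, evaluate); both callables are
        hypothetical stand-ins for the search space and an estimator experiment.
        """
        rng = random.Random(seed)
        local_best = {}
        for tuner_id, (sample_params, evaluate) in tuners.items():
            trials = [sample_params(rng) for _ in range(max_evals)]
            # Keep the parameters with the lowest loss for this tuner.
            best = min(trials, key=evaluate)
            local_best[tuner_id] = (evaluate(best), best)
        # The global best is the tuner whose local best has the lowest loss.
        best_id = min(local_best, key=lambda tid: local_best[tid][0])
        return best_id, local_best


    tuners = {
        0: (lambda rng: {"topk": rng.choice([30, 35, 40])},
            lambda p: abs(p["topk"] - 35) / 100),
        1: (lambda rng: {"lr": rng.uniform(0.001, 0.1)},
            lambda p: 1.0 + p["lr"]),
    }
    best_id, local_best = run_pipeline(tuners, max_evals=5)
    print("Best Tuner id:", best_id)  # Best Tuner id: 0
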
Example
~~~~~~~
Let's see an example.
First make sure you have the latest version of `qlib` installed.
Then, you need to provide a configuration to set up the experiment.
We write a simple configuration example as follows:
.. code-block:: YAML
experiment:
name: tuner_experiment
tuner_class: QLibTuner
qlib_client:
auto_mount: False
logging_level: INFO
optimization_criteria:
report_type: model
report_factor: model_score
optim_type: max
tuner_pipeline:
-
model:
class: SomeModel
space: SomeModelSpace
trainer:
class: RollingTrainer
strategy:
class: TopkAmountStrategy
space: TopkAmountStrategySpace
max_evals: 2
time_period:
rolling_period: 360
train_start_date: 2005-01-01
train_end_date: 2014-12-31
validate_start_date: 2015-01-01
validate_end_date: 2016-06-30
test_start_date: 2016-07-01
test_end_date: 2018-04-30
data:
class: ALPHA360
provider_uri: /data/qlib
args:
start_date: 2005-01-01
end_date: 2018-04-30
dropna_label: True
dropna_feature: True
filter:
market: csi500
filter_pipeline:
-
class: NameDFilter
module_path: qlib.data.filter
args:
name_rule_re: S(?!Z3)
fstart_time: 2018-01-01
fend_time: 2018-12-11
-
class: ExpressionDFilter
module_path: qlib.data.filter
args:
rule_expression: $open/$factor<=45
fstart_time: 2018-01-01
fend_time: 2018-12-11
backtest:
normal_backtest_args:
verbose: False
limit_threshold: 0.095
account: 500000
benchmark: SH000905
deal_price: vwap
long_short_backtest_args:
topk: 50
Next, we run the following command, and you can see output like this:
.. code-block:: bash
~/v-yindzh/Qlib/cfg$ tuner -c tuner_config.yaml
Searching params: {'model_space': {'colsample_bytree': 0.8870905643607678, 'lambda_l1': 472.3188735122233, 'lambda_l2': 92.75390994877243, 'learning_rate': 0.09741751430635413, 'loss': 'mse', 'max_depth': 8, 'num_leaves': 160, 'num_threads': 20, 'subsample': 0.7536051584789751}, 'strategy_space': {'buffer_margin': 250, 'topk': 40}}
...
(Estimator experiment screen log)
...
Searching params: {'model_space': {'colsample_bytree': 0.6667379039007301, 'lambda_l1': 382.10698024977904, 'lambda_l2': 117.02506488151757, 'learning_rate': 0.18514539615228137, 'loss': 'mse', 'max_depth': 6, 'num_leaves': 200, 'num_threads': 12, 'subsample': 0.9449255686969292}, 'strategy_space': {'buffer_margin': 200, 'topk': 30}}
...
(Estimator experiment screen log)
...
Local best params: {'model_space': {'colsample_bytree': 0.6667379039007301, 'lambda_l1': 382.10698024977904, 'lambda_l2': 117.02506488151757, 'learning_rate': 0.18514539615228137, 'loss': 'mse', 'max_depth': 6, 'num_leaves': 200, 'num_threads': 12, 'subsample': 0.9449255686969292}, 'strategy_space': {'buffer_margin': 200, 'topk': 30}}
Time cost: 489.87220 | Finished searching best parameters in Tuner 0.
Time cost: 0.00069 | Finished saving local best tuner parameters to: tuner_experiment/estimator_experiment/estimator_experiment_0/local_best_params.json .
Searching params: {'data_label_space': {'labels': ('Ref($vwap, -2)/Ref($vwap, -1) - 2',)}, 'model_space': {'input_dim': 158, 'lr': 0.001, 'lr_decay': 0.9100529502185579, 'lr_decay_steps': 162.48901403763966, 'optimizer': 'gd', 'output_dim': 1}, 'strategy_space': {'buffer_margin': 300, 'topk': 35}}
...
(Estimator experiment screen log)
...
Searching params: {'data_label_space': {'labels': ('Ref($vwap, -2)/Ref($vwap, -1) - 1',)}, 'model_space': {'input_dim': 158, 'lr': 0.1, 'lr_decay': 0.9882802970847494, 'lr_decay_steps': 164.76742865207729, 'optimizer': 'adam', 'output_dim': 1}, 'strategy_space': {'buffer_margin': 250, 'topk': 35}}
...
(Estimator experiment screen log)
...
Local best params: {'data_label_space': {'labels': ('Ref($vwap, -2)/Ref($vwap, -1) - 1',)}, 'model_space': {'input_dim': 158, 'lr': 0.1, 'lr_decay': 0.9882802970847494, 'lr_decay_steps': 164.76742865207729, 'optimizer': 'adam', 'output_dim': 1}, 'strategy_space': {'buffer_margin': 250, 'topk': 35}}
Time cost: 550.74039 | Finished searching best parameters in Tuner 1.
Time cost: 0.00023 | Finished saving local best tuner parameters to: tuner_experiment/estimator_experiment/estimator_experiment_1/local_best_params.json .
Time cost: 1784.14691 | Finished tuner pipeline.
Time cost: 0.00014 | Finished save global best tuner parameters.
Best Tuner id: 0.
You can check the best parameters at tuner_experiment/global_best_params.json.
Finally, you can check the results of your experiment in the given path.
Configuration file
------------------
Before using `tuner`, you need to prepare a configuration file. Next we will show you how to prepare each part of the configuration file.
About the experiment
~~~~~~~~~~~~~~~~~~~~
First, your configuration file needs a field for the experiment, whose key is `experiment`. This field and its contents determine the saving path and the tuner class.
Usually it should contain the following content:
.. code-block:: YAML
experiment:
name: tuner_experiment
tuner_class: QLibTuner
Also, there are some optional fields. The meaning of each field is as follows:
- `name`
The experiment name, str type. The program uses this name to construct a directory that stores the process and results of the whole experiment. The default value is `tuner_experiment`.
- `dir`
The saving path, str type. The program constructs the experiment directory in this path. The default value is the path where the configuration file is located.
- `tuner_class`
The class of the tuner, str type. It must be an already implemented tuner, such as `QLibTuner` in `qlib`, or a custom tuner that is a subclass of `qlib.contrib.tuner.Tuner`. The default value is `QLibTuner`.
- `tuner_module_path`
The module path, str type; an absolute path is also supported. It indicates where the tuner is implemented. The default value is `qlib.contrib.tuner.tuner`.
About the optimization criteria
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You need to designate a factor to optimize, because the tuner needs a factor to decide which case is better than the others.
Usually, we use a result of `estimator`, such as the backtest results or the score of the model.
This part needs to contain these fields:
.. code-block:: YAML
optimization_criteria:
report_type: model
report_factor: model_pearsonr
optim_type: max
- `report_type`
The type of the report, str type, determines which kind of report you want to use. If you want to use a backtest result, you can choose `pred_long`, `pred_long_short`, `pred_short`, `sub_bench` or `sub_cost`. If you want to use a model result, you can only choose `model`.
- `report_factor`
The factor you want to use in the report, str type, determines which factor you want to optimize. If your `report_type` is a backtest result, you can choose `annual`, `sharpe`, `mdd`, `mean` or `std`. If your `report_type` is a model result, you can choose `model_score` or `model_pearsonr`.
- `optim_type`
The optimization type, str type, determines what kind of optimization you want to do. You can minimize the factor, maximize it, or drive it towards 1, so you can choose `max`, `min` or `correlation` in this field.
Note: `correlation` means the factor's best value is 1, such as `model_pearsonr` (a correlation coefficient).
If you want to process the factor or fetch other kinds of factors, you can override the `objective` method in your own tuner.
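
One way to read the three `optim_type` values is as a conversion from the report factor to a loss that a minimizer can work with. The mapping below is an assumption for illustration (in particular, treating `correlation` as the distance from 1), and ``to_loss`` is a hypothetical name, not the `objective` method itself.

.. code-block:: python

    def to_loss(value, optim_type):
        """Convert a report factor into a loss to minimize (assumed mapping)."""
        if optim_type == "min":
            return value
        if optim_type == "max":
            return -value
        if optim_type == "correlation":
            # Best value is 1, so the loss is the distance from 1.
            return abs(value - 1)
        raise ValueError(f"unknown optim_type: {optim_type!r}")


    print(to_loss(0.29, "max"))
    print(to_loss(0.9, "correlation"))
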
About the tuner pipeline
~~~~~~~~~~~~~~~~~~~~~~~~
The tuner pipeline contains different tuners, and the `tuner` program processes each tuner in the pipeline. Each tuner finds the optimal hyper-parameters of its specific combination of modules. The pipeline compares the results of the tuners and returns the best combination and its optimal hyper-parameters. So you need to configure the pipeline and each tuner. Here is an example:
.. code-block:: YAML
tuner_pipeline:
-
model:
class: SomeModel
space: SomeModelSpace
trainer:
class: RollingTrainer
strategy:
class: TopkAmountStrategy
space: TopkAmountStrategySpace
max_evals: 2
Each part represents a tuner and the modules to be tuned. The space in each part is the hyper-parameter space of a certain module; you need to create your search space and modify it in `/qlib/contrib/tuner/space.py`. We use the `hyperopt` package to construct the space; see https://github.com/hyperopt/hyperopt/wiki/FMin for details on how to use it.
- model
You need to provide the `class` and the `space` of the model. If the model is the user's own implementation, you need to provide the `module_path`.
- trainer
You need to provide the `class` of the trainer. If the trainer is the user's own implementation, you need to provide the `module_path`.
- strategy
You need to provide the `class` and the `space` of the strategy. If the strategy is the user's own implementation, you need to provide the `module_path`.
- data_label
The label of the data. You can search which kinds of labels lead to a better result. This part is optional, and you only need to provide `space`.
- max_evals
Allow up to this many function evaluations in this tuner. The default value is 10.
If you don't want to search some modules, you can fix their spaces in `space.py`. No default module is provided.
About the time period
~~~~~~~~~~~~~~~~~~~~~
You need to use the same dataset to evaluate the different `estimator` experiments in a `tuner` experiment; two experiments using different datasets are not comparable. You can specify `time_period` through the configuration file:
.. code-block:: YAML
time_period:
rolling_period: 360
train_start_date: 2005-01-01
train_end_date: 2014-12-31
validate_start_date: 2015-01-01
validate_end_date: 2016-06-30
test_start_date: 2016-07-01
test_end_date: 2018-04-30
- `rolling_period`
The rolling period, integer type, indicates how many time steps to roll forward when rolling the data. The default value is `60`. This setting is used only if you use `RollingTrainer`; otherwise it is ignored.
- `train_start_date`
Training start time, str type.
- `train_end_date`
Training end time, str type.
- `validate_start_date`
Validation start time, str type.
- `validate_end_date`
Validation end time, str type.
- `test_start_date`
Test start time, str type.
- `test_end_date`
Test end time, str type. If `test_end_date` is `-1` or greater than the last date of the data, the last date of the data will be used as `test_end_date`.
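
The `test_end_date` rule can be sketched as a small function. The name ``resolve_test_end_date`` is hypothetical, and dates are represented as ``datetime.date`` objects here purely for illustration.

.. code-block:: python

    from datetime import date


    def resolve_test_end_date(test_end_date, last_data_date):
        """Clamp test_end_date to the last available date (hypothetical helper)."""
        # -1 means "use the last date of the data"; later dates are clamped as well.
        if test_end_date == -1 or test_end_date > last_data_date:
            return last_data_date
        return test_end_date


    last = date(2018, 4, 30)
    print(resolve_test_end_date(-1, last))                 # 2018-04-30
    print(resolve_test_end_date(date(2019, 1, 1), last))   # 2018-04-30
    print(resolve_test_end_date(date(2017, 12, 29), last)) # 2017-12-29
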
About the data and backtest
~~~~~~~~~~~~~~~~~~~~~~~~~~~
`data` and `backtest` are the same throughout the whole `tuner` experiment: different `estimator` experiments must use the same data and backtest method. So these two parts of the config are the same as in the `estimator` configuration. You can find the precise definition of these parts in the `estimator` introduction. We only provide an example here.
.. code-block:: YAML
data:
class: ALPHA360
provider_uri: /data/qlib
args:
start_date: 2005-01-01
end_date: 2018-04-30
dropna_label: True
dropna_feature: True
feature_label_config: /home/v-yindzh/v-yindzh/QLib/cfg/feature_config.yaml
filter:
market: csi500
filter_pipeline:
-
class: NameDFilter
module_path: qlib.data.filter
args:
name_rule_re: S(?!Z3)
fstart_time: 2018-01-01
fend_time: 2018-12-11
-
class: ExpressionDFilter
module_path: qlib.data.filter
args:
rule_expression: $open/$factor<=45
fstart_time: 2018-01-01
fend_time: 2018-12-11
backtest:
normal_backtest_args:
verbose: False
limit_threshold: 0.095
account: 500000
benchmark: SH000905
deal_price: vwap
long_short_backtest_args:
topk: 50
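
The `name_rule_re` pattern in the example, ``S(?!Z3)``, uses a negative lookahead. The sketch below shows which instrument codes it accepts, assuming the rule is matched from the start of the code; how the filter applies the pattern internally is not stated in this document.

.. code-block:: python

    import re

    name_rule = re.compile(r"S(?!Z3)")  # pattern taken from the example config

    for code in ["SH600000", "SZ000001", "SZ300001"]:
        # Assumption: the rule is matched from the start of the instrument code.
        kept = name_rule.match(code) is not None
        print(code, kept)
    # SH600000 True
    # SZ000001 True
    # SZ300001 False

Under this reading, the rule keeps codes starting with ``S`` except those starting with ``SZ3``.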
Experiment Result
-----------------
All the results are stored in the experiment directory; you can check them directly in the corresponding files.
The following are saved:
- Global optimal parameters
- Local optimal parameters of each tuner
- Config file of this `tuner` experiment
- The result of every `estimator` experiment in the process