mirror of https://github.com/microsoft/qlib.git synced 2026-06-11 08:21:45 +08:00

Compare commits

271 Commits

Author SHA1 Message Date
Dong Zhou
5ac9dd7221 temporarily fix create exp conflicts for remote mlflow 2021-11-12 05:16:17 +00:00
you-n-g
7efec6bbc4 Fix private import 2021-11-08 09:52:55 +08:00
Young
3fa48d7017 simplify record tmp 2021-11-05 12:57:14 +00:00
Young
4f2d6b0d84 fix pytorch memory amount error 2021-11-02 20:41:39 +08:00
Young
3943b7001f fix CI bug for AyncCaller 2021-11-02 14:32:09 +08:00
Young
2593185721 Simplify TSDataset and async recorder 2021-11-02 11:07:40 +08:00
Young
7a884fa9f2 remove redundant file only when remote artifact 2021-11-01 18:55:44 +08:00
Dong Zhou
d929d4bb21 rm recorder temp file 2021-11-01 09:29:44 +00:00
Young
e54b019ee2 solve init kwargs conflictions 2021-11-01 06:22:25 +00:00
Young
426b98a3bc make the logic of online manager cleaner 2021-11-01 02:40:54 +00:00
Young
82f8ff9066 Update seperate dataframe 2021-11-01 00:51:21 +08:00
Young
31e9d529de Add multi horizon task generator 2021-10-28 00:01:19 +08:00
Young
5fa56703ae add handler pickle attr, enhance init_instance_by_config 2021-10-26 23:32:33 +08:00
Dong Zhou
c6bb11fe56 avoid trade without enough cash 2021-10-25 05:46:19 +00:00
Dong Zhou
3d7ebd1fe0 add back trade_val 2021-10-22 10:13:15 +00:00
Dong Zhou
7313b4dad0 fix impact cost 2021-10-22 08:58:37 +00:00
Dong Zhou
b70caff522 add doc 2021-10-22 08:49:20 +00:00
Dong Zhou
96b422a906 support market impact cost 2021-10-22 08:44:47 +00:00
Young
64130d9407 Fix the aggregation function of IndexData 2021-10-22 15:20:45 +08:00
Young
a58bc03a8e add sepdf(make mini project only rely on qlib) 2021-10-21 13:15:02 +00:00
Young
f537222ce3 make handler seperable 2021-10-21 12:38:24 +00:00
Dong Zhou
c427c64845 fix calendar 2021-10-19 06:17:53 +00:00
Young
22ff8fdc44 simple change log 2021-10-16 17:14:37 +00:00
Young
4efb0a75c1 Being compatible with previous Qlib version 2021-10-16 16:43:38 +00:00
Young
052aad7982 simplify signal parameter 2021-10-15 14:48:31 +00:00
Young
12f05c7182 Merge branch 'backtest_improve' of github.com:microsoft/qlib into backtest_improve 2021-10-15 11:27:33 +00:00
Young
ac08468330 Make static prediction easier 2021-10-15 11:21:03 +00:00
Dong Zhou
df9745f134 support empty order 2021-10-15 09:07:03 +00:00
Dong Zhou
2e49a5f7c0 fix order generator 2021-10-15 07:04:47 +00:00
you-n-g
3ab5721448 Fix OrderGenerator's return value 2021-10-15 14:28:08 +08:00
you-n-g
6a94b45503 Update order_generator.py 2021-10-15 13:52:55 +08:00
you-n-g
7c31012b50 Auto injecting model and dataset for Recorder (#645)
* Auto injecting model and dataset for Recorder

* Support using Feature in expression
2021-10-15 13:50:24 +08:00
you-n-g
334b92ace7 Checking dataset empty (#647)
* Checking dataset empty

* add dataset checker
2021-10-14 23:35:12 +08:00
you-n-g
9a175d7507 improve the doc of auto init (#541)
* improve the doc of auto init

* Update setup.py

* Update setup.py

* change cvxpy version

Co-authored-by: Wangwuyi123 <51237097+Wangwuyi123@users.noreply.github.com>
2021-10-12 11:58:27 +08:00
Lewen Wang
17ea44e0cf Update TCTS. (#643)
* Update TCTS.

* Update TCTS README.

* Update TCTS README.

* Update TCTS.

Co-authored-by: lewwang <lwwang@microsoft.com>
2021-10-12 10:08:48 +08:00
you-n-g
c0ce712be9 more detailed docs for workflow (#639)
* more detailed docs for workflow

* add more detailed docs for workflow
2021-10-11 15:38:18 +08:00
demon143
8e81a017c1 Update manage.py (#628)
* Update manage.py

* Update manage.py

* Update manage.py

* Create manage.py

* Update manage.py

* Update qlib/workflow/task/manage.py

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
2021-10-11 15:37:50 +08:00
you-n-g
706727988c Update README.md 2021-10-09 23:37:07 +08:00
you-n-g
e99224e5c2 Update benchmark based on new backtest (#634)
* free random seed

* update model baselines

* more robust for parameters
2021-10-07 22:57:19 +08:00
Pengrong Zhu
8c8d1336de fix workflow_config_lightgbm_multi_freq.yaml (#635) 2021-10-06 17:18:27 +08:00
Pengrong Zhu
d01de411a8 add support for macos-11 (#630)
Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
2021-10-03 16:49:17 +08:00
Young
28fe4d4bb4 update file strategy test 2021-10-03 14:58:37 +08:00
Young
873129aa9b update fix CI tests bugs 2021-10-03 14:58:37 +08:00
Young
3a152f9b8b fix CI 2021-10-03 14:58:37 +08:00
Young
2b75b41a08 remove 3.6 2021-10-03 14:58:37 +08:00
you-n-g
00d17f0a52 Update python-publish.yml 2021-10-01 03:03:26 +08:00
you-n-g
6bec33e854 Merge pull request #438 from microsoft/nested_decision_exe
Support Highfreq Backtest with the Model/Rule/RL Strategy
2021-10-01 02:47:53 +08:00
Young
48a860c8b7 fix backtest yaml 2021-09-30 18:43:36 +00:00
Young
4099050935 Merge remote-tracking branch 'origin/main' into nested_decision_exe 2021-09-30 18:41:15 +00:00
wangwenxi-handsome
3760a18a8d Merge nested main (#597)
* MVP for Indian Stocks in qlib using yahooquery

* cleaned with black

* cleaned with black

* add YahooNormalizeIN and YahooNormalizeIN1d

* cleaned the code

* added 1min for IN and also updated readme

* update comments

* fix comments

* recorder support upload both raw file and directory

* fix comments

* Update README.md

* Fix docs of QlibRecorder

* sort index after loader (#538)

make sure the fetch method is based on a index-sorted pd.DataFrame

* refactor online serving rolling api

* refactor TRA

* format by black

* fix horizon

* fix TRA when use single head

* clean up

* improve pretrain

* update README

* fix tra when logdir is None

* fix tra when logdir is None

* Update strategy.py

* Update README.md

* Update README.md

* Conda Suggestion

* code standard docs

* Update ensemble.py (#560)

* Fix CI  Bug (#575)


Co-authored-by: yuxwang <anduinnn@foxmail.com>

* Update gen.py (#576)

* Fix multi-process loop calls (#574)

* check lexsort in the 'lazy_sort_index' function (#566)

* check lexsort

* check lexsort

* lexsort comment

* lexsort comment

* Delete .DS_Store

* Update README.md

* bug fix & use oracle transport pretrain

* mend

* Add `backend_freq_config` parameter, support multi-freq uri

* Add sample_config to QlibDataLoader, support multi-freq

* add multi-freq example

* get_cls_kwargs renamed get_callable_kwargs

* support multi-freq uri

* Add inst_processors to D.features

* Fix typo

* Fix the index type of the multi-freq example

* Fix duplicate mlflow directories in tests

* Add DataPathManager to QlibConfig && modify inst_processors to supports list only

* Modify the default value in the multi_freq example

* Modify client-server mode and dataset-cache to disable inst_processor

* Add wheel package to github CI

* fix comment

* Update FAQ.rst

* Update README.md

Fix wrong link

* Update the docs of TaskManager (#586)

* Update manage.py

* update yaml

* update run_all_model

* Modify the Feature to be case sensitive (#589)

* update README

* remove verbose

* fix spell bug

* fix typos (#592)

* Update Release Note

* fix portfolio bug

* Add calendar support for resample

* add freq kwargs

* test.yml: Remove redundant code (#595)

* Supporting shared processor (#596)

* Supporting shared processor

* fix readonly reverse bug

* remove pytests dependency

* with fit bug

* fix parameter error

* fix comments

* Fix undefined names in Python code (#599)

* Update pytorch_tabnet.py

$ `flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics`
```
./qlib/qlib/contrib/model/pytorch_tabnet.py:567:38: F821 undefined name 'inp'
            self.independ.append(GLU(inp, out_dim, vbs=vbs))
                                     ^
./qlib/examples/model_rolling/task_manager_rolling.py:75:18: F821 undefined name 'task_train'
        run_task(task_train, self.task_pool, experiment_name=self.experiment_name)
                 ^
2     F821 undefined name 'task_train'
2
```

* Fix undefined names in Python code

* from qlib.model.trainer import task_train

* update seed

* fix some docstring

* add comments

* Fix SimpleDatasetCache

* Update setup.py

updated classifiers

* Update setup.py

change to matplotlib==3.3

* Update python-publish.yml

added python 3.9

* updategrade version number

* Update model list

* fix the type of filter_pipe

* fix comment

* fix record_temp

* update cvxpy version

* Update code_standard.rst (#587)

* Update code_standard.rst

* Update docs/developer/code_standard.rst

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>

* Add file lock for MLflowExpManager (#619)

* fix torch version

* Share version number (#620)

* Update initialization.rst (#622)

* Update initialization.rst

* Update docs/start/initialization.rst

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>

* Update docs/start/initialization.rst

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>

* fix bugs for running previous exmaple

* fix deal amount bug

* update change doc (#623)

* Add files via upload

* Update README.md

* Update README.md

* Update README.md

* Delete change doc.gif

* Add files via upload

* Update README.md

* Delete change doc.gif

* Add files via upload

* Delete change doc.gif

* Add files via upload

* Update README.md

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>

Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>

* update doc

* simplify run all model

* fix run all model bug

* Fix Models (#483)

* fix gat dataset

* fix tft model

* Update tft.py

* Fix tft.py

Co-authored-by: Pengrong Zhu <zhu.pengrong@foxmail.com>

* type and skip empty exp

* fix model yaml config

* fix tft import bug

* skip empty result

* fix model and yaml bug

* fix wrong generate parameter

* Modify multi-freq example (#626)

* modify the example of multi-freq

* add Copyright

* add a comment to average_ops.py

* modify the example of multi-freq

* add comment to multi_freq_handler.py

* add the Ref expression description to multi_freq_handler.py

* add expression description to multi_freq_handler.py

* update images

* fix workflow and update framework

Co-authored-by: Gaurav <2796gaurav@gmail.com>
Co-authored-by: 2796gaurav <17353992+2796gaurav@users.noreply.github.com>
Co-authored-by: bxdd <bxd98@126.com>
Co-authored-by: Young <afe.young@gmail.com>
Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
Co-authored-by: Dong Zhou <Zhou.Dong@microsoft.com>
Co-authored-by: ZhangTP1996 <ztp18@mails.tsinghua.edu.cn>
Co-authored-by: demon143 <59681577+demon143@users.noreply.github.com>
Co-authored-by: Wangwuyi123 <51237097+Wangwuyi123@users.noreply.github.com>
Co-authored-by: yuxwang <anduinnn@foxmail.com>
Co-authored-by: Pengrong Zhu <zhu.pengrong@foxmail.com>
Co-authored-by: Mark Zhao <50850474+markzhao98@users.noreply.github.com>
Co-authored-by: cslwqxx <cslwqxx@users.noreply.github.com>
Co-authored-by: Dong Zhou <evanzd@users.noreply.github.com>
Co-authored-by: SaintMalik <37118134+saintmalik@users.noreply.github.com>
Co-authored-by: Christian Clauss <cclauss@me.com>
Co-authored-by: Anurag Kumar <mailanu98@gmail.com>
Co-authored-by: demon143 <785696300@qq.com>
2021-10-01 02:15:30 +08:00
you-n-g
8cf6ed3564 Update VERSION.txt 2021-09-30 22:59:05 +08:00
you-n-g
163e3c6266 replace multi processing with joblib (#477)
* replace multi processing with joblib

* update class Parallel and data.py

* update class Parallel and data.py

* update class Parallel and data.py

* update class Parallel and data.py

* update class Parallel and data.py

* update class Parallel and data.py

* update class Parallel and data.py

* update class Parallel and data.py

* Fix Parallel support for maxtasksperchild

Co-authored-by: wangw <1666490690@qq.com>
Co-authored-by: zhupr <zhu.pengrong@foxmail.com>
2021-09-14 01:16:03 +08:00
you-n-g
6203e4c09e Update the docs of Report 2021-09-13 17:53:34 +08:00
Young
88d2f9263e fix sum index data bug 2021-09-02 01:57:44 +00:00
wangwenxi.handsome
f71b0c1189 250s 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
919380597b close and reindex 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
4da3f3b104 broadcast_to and get single data 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
9446116642 redundant references 2021-09-02 09:56:38 +08:00
Young
5003e49197 fix metric calculation error 2021-09-02 09:56:38 +08:00
Young
5f0ee6ce68 fix bugs 2021-09-02 09:56:38 +08:00
Young
9a74471ab6 Pass basic tests 2021-09-02 09:56:38 +08:00
Young
d39c8de800 draft design 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
43a8f502ed fix bug 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
7ee4a207bc add lru 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
25f54ddaeb new high freq struc 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
d9ad8ff791 index_data 2021-09-02 09:56:38 +08:00
Young
13a9b7cea0 type error bug 2021-09-02 09:56:38 +08:00
Young
9c326fd398 add import order 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
f111e34bd2 align interface 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
be0d9e6a22 update freq 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
e134c358fd fix index data bug 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
16b954866f get_base_info 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
f7d7f1a223 fix nanmean 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
8eb7a1fddc numpy_order_indicator 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
222c2fd21a fix exchange bug 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
f67b99a30e update exchange 2021-09-02 09:56:38 +08:00
wangwenxi.handsome
2da6a8c770 fix Path re 2021-08-31 11:57:14 +00:00
Young
309dfa36cc Add a example to collecting all the decisions 2021-08-15 15:22:48 +00:00
wangwenxi-handsome
735153a50d Cash Update (#559)
* fix negative cash

* update order test

* fix bug

* update file_order_test
2021-08-12 23:44:22 +08:00
you-n-g
05b9fb5a47 Fix bug when Account.benchmark_config is None 2021-08-09 19:23:17 +08:00
wangwenxi.handsome
7c858803f0 add position test 2021-08-08 14:32:33 +00:00
wangwenxi.handsome
74e1ee6921 update position and negative cash 2021-08-06 04:34:30 +00:00
Young
8e87950292 Print volume limitation log 2021-08-04 11:04:28 +00:00
wangwenxi.handsome
3ff1d91d61 add __init__ 2021-08-02 07:45:03 +00:00
wangwenxi.handsome
f5db0e1b05 fix vol limit bug 2021-08-02 03:49:03 +00:00
wangwenxi.handsome
0f2d85d098 volume limit update 2021-08-01 16:03:08 +00:00
wangwenxi.handsome
5c2ddac7f0 volume limit 2021-07-31 09:31:01 +00:00
Young
73f5cc0a2b add suspend check in twap 2021-07-29 04:11:18 +00:00
Young
ab3c4a2c05 new twap (more even) 2021-07-28 03:11:56 +00:00
you-n-g
c1992b1bb1 Merge pull request #456 from ultmaster/rl-dummy
Dummy RL example on nested decision framework
2021-07-27 22:58:15 +08:00
v-mingzhehan
e817413769 Restore examples 2021-07-27 14:52:29 +00:00
v-mingzhehan
0b607da690 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-27 14:32:36 +00:00
Young
0d41ca26ab fix data format bug & twap peeking strategy 2021-07-27 14:17:59 +00:00
wangwenxi.handsome
ba1c575aa9 doc and black for indicator 2021-07-27 12:14:43 +00:00
wangwenxi.handsome
66971d5f0d fix indicator 2021-07-27 09:06:13 +00:00
Young
fcca242807 add cash settlement mechanism 2021-07-26 17:14:41 +00:00
wangwenxi.handsome
4924717276 fix black 2021-07-26 11:25:14 +00:00
wangwenxi.handsome
c202a4b1e6 fix _get_base_vol_pri clip_time_range 2021-07-26 11:21:05 +00:00
Young
bdebe12cf2 support empty benchmark
Empty benchmark could accelerate the learning process
2021-07-26 06:14:57 +00:00
wangwenxi.handsome
e88c45e13c update position 2021-07-25 12:38:54 +00:00
wangwenxi.handsome
103d3034bf Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into nested_decision_exe 2021-07-25 12:37:04 +00:00
wangwenxi-handsome
4ffb05ae59 Update Action 2021-07-24 22:08:15 +08:00
wangwenxi.handsome
6dcbf51298 update action 2021-07-24 11:36:28 +00:00
wangwenxi-handsome
9d732e9646 Update Action 2021-07-24 10:12:59 +00:00
wangwenxi.handsome
a8ea66b83e black 2021-07-23 09:33:04 +00:00
you-n-g
9e6f4ec578 Merge pull request #520 from wangwenxi-handsome/nested_decision_exe
abstract Quote class from Exchange
2021-07-23 14:36:36 +08:00
wangwenxi.handsome
301e0477ec Merge branch 'nested_decision_exe' of https://github.com/wangwenxi-handsome/qlib into nested_decision_exe 2021-07-23 05:52:09 +00:00
wangwenxi.handsome
0ec6b87d39 fix little bug 2021-07-23 05:50:41 +00:00
you-n-g
d445f28e5f Merge branch 'main' into nested_decision_exe 2021-07-23 12:38:20 +08:00
you-n-g
bbba9600a1 Merge branch 'nested_decision_exe' into nested_decision_exe 2021-07-23 12:15:45 +08:00
wangwenxi.handsome
2c8a3ded08 high_performance_data_structure 2021-07-22 15:20:03 +00:00
wangwenxi.handsome
10c182e2b0 add order_indicator doc 2021-07-21 14:09:12 +00:00
wangwenxi.handsome
83d4387e9f pandas_order_indicator 2021-07-21 12:47:31 +00:00
v-mingzhehan
9bf8c999e6 type checking update 2021-07-20 06:14:40 +00:00
Young
4e862f7d1f add print cash in verbose mode and code format 2021-07-20 05:13:05 +00:00
v-mingzhehan
62583ea6ec Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-19 04:18:17 +00:00
Young
92f2891664 fix order factor setting issue
Move the factor setting from init phase to dealing phase.
2021-07-19 02:37:44 +00:00
v-mingzhehan
25ff62f542 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-18 09:00:47 +00:00
Young
4a62e02fca add get_data_cal_avail_range method 2021-07-18 07:12:14 +00:00
v-mingzhehan
572181ef5d Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-18 03:55:39 +00:00
Young
ed12c7fca3 add common_infra warning and fix time bug 2021-07-18 03:13:15 +00:00
v-mingzhehan
5f50614dbc Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-17 16:31:31 +00:00
Young
7738f39546 filter zero base price 2021-07-17 06:54:44 +00:00
wangwenxi.handsome
2b8d4dc3c2 callable 2021-07-16 14:09:36 +00:00
wangwenxi.handsome
6ad52e8cf5 black and doc 2021-07-16 13:55:49 +00:00
wangwenxi.handsome
567841e1c6 get qlib data in exchange 2021-07-16 12:56:49 +00:00
wangwenxi.handsome
110141ddac add doc 2021-07-16 09:17:29 +00:00
wangwenxi.handsome
65b44349cd add PandasQuote 2021-07-16 08:29:32 +00:00
Young
5241b2f918 Merge branch 'nested_decision_exe' of github.com:microsoft/qlib into nested_decision_exe 2021-07-16 03:17:54 +00:00
Young
344f4f69d2 add data calendar API and refine order cal api 2021-07-16 03:11:07 +00:00
wangwenxi.handsome
f295497e2c Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into nested_decision_exe 2021-07-15 13:38:38 +00:00
wangwenxi.handsome
aae4b02ab8 *tuple 2021-07-15 13:34:39 +00:00
Young
d907817ce9 unify variable names 2021-07-15 13:17:26 +00:00
v-mingzhehan
870f834577 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-15 08:31:39 +00:00
Young
94b456714d refactor index_range to trade_range 2021-07-15 08:02:09 +00:00
Young
571d27cba7 exchange support expression buy sell limit 2021-07-14 13:07:14 +00:00
v-mingzhehan
831773a0d6 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-14 09:12:54 +00:00
wangwenxi.handsome
7b9e338a0d add docs 2021-07-14 09:45:09 +08:00
wangwenxi.handsome
0646e53d24 fix spell error 2021-07-14 09:45:09 +08:00
wangwenxi.handsome
ca14e36f7a initial account by position 2021-07-14 09:45:09 +08:00
Young
9b38e62f21 Add more friendly index range by timing 2021-07-13 14:46:53 +00:00
wangwenxi.handsome
4c4b30ebec fix base price and volumn 2021-07-13 16:15:52 +08:00
v-mingzhehan
c29e5b2621 Fix circular import 2021-07-12 13:50:13 +00:00
Young
45bde7527e move the pa sign from last step to first 2021-07-11 01:53:21 +00:00
Young
155019ba35 move the pa sign from last step to first 2021-07-09 10:34:18 +00:00
v-mingzhehan
ece7b662e2 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-09 09:32:15 +00:00
Young
80f5426693 update docsting 2021-07-09 08:29:19 +00:00
Young
cbd52b7905 align range limit 2021-07-09 08:17:10 +00:00
Young
17d8b8a7cc fix calculating base_price 2021-07-09 08:16:01 +00:00
Young
eada8640b9 align range limit 2021-07-09 08:12:13 +00:00
Young
32ae6e4259 fix calculating base_price 2021-07-08 05:54:36 +00:00
v-mingzhehan
5c5379e09d Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-07 12:26:43 +00:00
Young
e8f5a1e491 black format 2021-07-07 10:52:52 +00:00
Young
0c946cffd6 add supporting setting trade unit in exchange 2021-07-07 10:47:54 +00:00
you-n-g
1fb50d521b Merge branch 'main' into nested_decision_exe 2021-07-07 17:30:31 +08:00
wangwenxi.handsome
8c743a46c7 use init_instance_by_config 2021-07-07 17:27:29 +08:00
wangwenxi.handsome
93796bdcef add exchange kwargs 2021-07-07 17:27:29 +08:00
wangwenxi.handsome
267ee3555d fix all example 2021-07-07 17:27:29 +08:00
wangwenxi.handsome
8b28575dad fill placehorder dict and list 2021-07-07 17:27:29 +08:00
wangwenxi.handsome
4488c3b625 code optimization 2021-07-07 17:27:29 +08:00
wangwenxi.handsome
bd6080b8f5 yaml update 2021-07-07 17:27:29 +08:00
wangwenxi.handsome
cbe7c5285a high_fre_yaml 2021-07-07 17:27:29 +08:00
Wenxi Wang (FA Talent)
85c75a6639 config_extend 2021-07-07 17:27:29 +08:00
xixi
d1b8ed9613 fix qrun 2021-07-07 17:27:29 +08:00
xixi
d6984a3f2d fill_placehorder 2021-07-07 17:27:29 +08:00
Young
e42aa67f52 Supporting skip empty decisions 2021-07-06 12:27:07 +00:00
Young
4e41e9c8f2 simplify the portfolio-based report 2021-07-06 12:27:01 +00:00
Young
6fd50a5bfa Supporting skip empty decisions 2021-07-06 12:08:53 +00:00
Young
dd8231edeb simplify the portfolio-based report 2021-07-06 11:10:13 +00:00
Young
03d6facbd2 fix TWAP strategy 2021-07-06 10:02:20 +00:00
v-mingzhehan
354f7e68c2 Constrain TWAP trade step 2021-07-06 08:47:55 +00:00
v-mingzhehan
e214557e3a Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-06 06:43:34 +00:00
Young
bdac9f4dda supporting seperated buy and sell price 2021-07-06 06:35:10 +00:00
Young
cb72857710 fix annotation recursive error 2021-07-06 05:23:13 +00:00
v-mingzhehan
82645233e7 Support order dataframe 2021-07-06 03:50:34 +00:00
v-mingzhehan
e063d3536c Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-07-05 09:37:22 +00:00
Young
7048bef7c6 fix ffr and order amount 2021-07-04 08:11:17 +00:00
Young
50c0e99f98 fix ffr and order amount 2021-07-04 08:08:03 +00:00
bxdd
9b74a19b14 Merge pull request #493 from bxdd/optimize_resam_data
optimize performance of resam data in rule_strategy & exchange
2021-07-04 02:44:53 +08:00
bxdd
ecf2f24d59 fix comments 2021-07-03 18:42:40 +00:00
Young
ef7fe8aa75 support parallel HF trading 2021-07-03 09:22:23 +00:00
bxdd
8dd5788bac fix comments & update resam ts_last method 2021-07-01 16:31:58 +00:00
bxdd
8b85b9eee7 optimize performance of resam data in rule_strategy & exchange 2021-07-01 14:35:49 +00:00
v-mingzhehan
2b4a493617 Order patch 2021-07-01 09:41:08 +00:00
Young
a401f1eafe improve the docstring 2021-06-30 08:50:03 +00:00
v-mingzhehan
24d5a3127b Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-06-30 08:30:33 +00:00
Young
bbf5d1bbbb add file order strategy 2021-06-30 07:34:23 +00:00
bxdd
b242d6e1e1 delMiniTimer in haandler storage test 2021-06-30 11:34:08 +08:00
bxdd
8d1b1979d9 update handler_storage test 2021-06-30 11:34:08 +08:00
bxdd
9985befe69 update HashingStockStorage 2021-06-30 11:34:08 +08:00
you-n-g
90bbf2b7c6 Fix account update bar_count bug 2021-06-30 08:29:47 +08:00
bxdd
e1b6f310c9 add Handler Storage 2021-06-28 20:06:15 +00:00
Yuge Zhang
20d566ceee Merge branch 'rl-dummy' of github.com:ultmaster/qlib into rl-dummy 2021-06-28 18:01:41 +08:00
Yuge Zhang
8e8bba1a96 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-06-28 18:01:02 +08:00
Young
27f0db669f black format & add comments & add randStrategy direction 2021-06-28 17:47:30 +08:00
Young
72c9593aa7 adapting strategies to latest interfaces. 2021-06-28 17:47:30 +08:00
Young
c907d8deb4 fix bugs of random strategy 2021-06-28 17:47:30 +08:00
Young
e78cdd4a08 return the detailed order indicator 2021-06-28 17:47:30 +08:00
Young
9b91758aed performance optimization for cal_sam_minute 2021-06-28 17:47:30 +08:00
Young
b41267fa59 successful run random order gen in day script 2021-06-28 17:47:30 +08:00
Young
b68294da93 add InfPosition 2021-06-28 17:47:30 +08:00
Young
4f384d37ce API enhancement 2021-06-28 17:47:30 +08:00
bxdd
284d96761b fix bug in resam feature 2021-06-27 17:49:49 +00:00
bxdd
b6564cd760 support trade decision update 2021-06-24 19:09:36 +00:00
bxdd
1517a9eb91 add default executor config & update bug in indicator 2021-06-24 13:59:10 +00:00
v-mingzhehan
583fbbef3c Resolve init conflict 2021-06-22 07:07:19 +00:00
v-mingzhehan
d226ac8c32 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-06-22 07:07:07 +00:00
bxdd
ab97e82484 fix bug in Exchange 2021-06-22 15:03:05 +08:00
v-mingzhehan
7525854bed Add shortcut in init 2021-06-22 03:47:39 +00:00
v-mingzhehan
56cf43da44 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-06-22 03:27:34 +00:00
bxdd
4ac6e6e246 fix account bug & update indicator_analysis & fix some comments 2021-06-22 02:42:09 +08:00
bxdd
9e45528165 update backtest time range 2021-06-14 22:31:31 +08:00
bxdd
f78e90171b fix comments & add VAStrategy & add trade indicator 2021-06-14 21:32:18 +08:00
Yuge Zhang
76be5d50e5 Refine example 2021-06-07 10:56:12 +08:00
Yuge Zhang
a06fa2bc44 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-06-04 15:06:00 +08:00
bxdd
46d253b457 update Exchange.deal_order 2021-06-04 14:41:38 +08:00
Yuge Zhang
1581ef12ac Update impl for robustness 2021-06-04 13:01:49 +08:00
Yuge Zhang
c43805eff6 Update end-to-end example and requirements 2021-06-04 12:20:27 +08:00
bxdd
8aee853a11 update Exchange 2021-06-04 00:55:10 +08:00
Yuge Zhang
bf02fc23f8 Add RL strategy demo 2021-06-02 23:20:27 +08:00
Yuge Zhang
f5ac6230e1 Refactor for strategy 2021-06-02 22:04:54 +08:00
Yuge Zhang
2314405613 Rename files 2021-06-02 16:53:39 +08:00
Yuge Zhang
cc8339acd9 Add a few comments 2021-06-02 16:49:52 +08:00
Yuge Zhang
d515efb46e Finish RL dummy example 2021-06-02 16:41:18 +08:00
Yuge Zhang
3200bb88c8 Update an initial version of RL 2021-06-02 15:11:38 +08:00
bxdd
4d48c96d30 fix CI 2021-06-01 18:50:50 +08:00
Yuge Zhang
83535bff6a Playground checkpoint 2021-06-01 18:08:11 +08:00
Yuge Zhang
a8e96e59f8 Merge branch 'nested_decision_exe' of https://github.com/microsoft/qlib into rl-dummy 2021-06-01 17:51:39 +08:00
Yuge Zhang
449e3f40c8 Update init in backtest 2021-06-01 17:51:29 +08:00
bxdd
04fff8ca36 solve conflict 2021-06-01 17:46:47 +08:00
bxdd
a183d8a631 update workflow_by_code & update executor 2021-06-01 17:44:22 +08:00
bxdd
a46d99a2be black format 2021-06-01 16:20:21 +08:00
bxdd
bf16e1ab47 update Order with dataclass 2021-06-01 16:19:01 +08:00
Yuge Zhang
cdc59a78f0 Merge branch 'nested_decision_exe' into rl-dummy 2021-06-01 11:34:45 +08:00
Yuge Zhang
d3dac068df Update simple playground 2021-06-01 11:33:44 +08:00
bxdd
60e082e446 add infra interface & fix no KeyboardInterpret bug 2021-05-31 20:40:11 +08:00
bxdd
bf3b757294 fix bugs 2021-05-29 00:31:40 +08:00
bxdd
96e393b599 del DEBUG log 2021-05-28 22:32:33 +08:00
bxdd
029b63c9dd fix bugs & add highfreq backtest example 2021-05-28 22:29:21 +08:00
Yuge Zhang
c26bee126b Support loading for backtest 2021-05-28 17:31:08 +08:00
bxdd
6a636546c4 Merge github.com:microsoft/qlib into bxdd-qlib_highfreq_backtest 2021-05-27 21:16:35 +08:00
bxdd
4085b447aa move backtest to core, fix calendar bugs, add some docstring 2021-05-27 21:14:39 +08:00
bxdd
2ad61f12b3 rename var in backtest 2021-05-27 17:03:53 +08:00
bxdd
ee74489c37 solve the conflict 2021-05-25 02:53:44 +08:00
bxdd
75fcb3800d Merge branch 'qlib_highfreq_backtest' of github.com:bxdd/qlib into bxdd-qlib_highfreq_backtest 2021-05-25 02:40:34 +08:00
bxdd
0c6e505455 fix comments 2021-05-25 02:38:34 +08:00
you-n-g
26d75b71b0 Update sample.py 2021-05-19 15:06:47 +08:00
you-n-g
dda509da0b Update record_temp.py 2021-05-19 15:02:04 +08:00
bxdd
eaa719df17 optimize rule_strategy performance 2021-05-14 15:50:27 +08:00
bxdd
ea60e608ba update rule_startegy & add README, notebook for multi-level trading 2021-05-14 01:51:43 +08:00
bxdd
de2658a8db fix rule_strategy bug 2021-05-13 22:39:19 +08:00
bxdd
c703dabcc7 fix rule_strategy reset method 2021-05-13 00:46:17 +08:00
bxdd
07eaada31e fix comments 2021-05-13 00:33:57 +08:00
bxdd
621cb243c2 fix some comments and add docstring 2021-05-12 02:17:39 +08:00
bxdd
f7d30960c1 update the internal bar strategy 2021-05-07 00:10:44 +08:00
bxdd
bc3eada02d black format 2021-05-06 21:34:31 +08:00
bxdd
7540ecde11 fix trade time bug 2021-05-06 21:33:33 +08:00
bxdd
ae339506b3 del old strategy 2021-04-30 23:35:28 +08:00
bxdd
e30df11a0b solve the conflict 2021-04-30 23:23:56 +08:00
bxdd
d297a493b8 fix bugs 2021-04-30 22:56:21 +08:00
bxdd
a109df3f46 fix bug in recorder 2021-04-30 01:06:05 +08:00
bxdd
f404a031f3 black format 2021-04-29 02:29:29 +08:00
bxdd
49cdaf8f5d update port_ana_record 2021-04-29 02:28:22 +08:00
bxdd
86a6f565e8 trade_account support multi bar report 2021-04-29 02:15:34 +08:00
bxdd
8920c1967f del outdate file 2021-04-26 20:54:10 +08:00
bxdd
af0053eb17 fix bug 2021-04-24 22:37:36 +08:00
bxdd
b14efa1129 update trade calendar & backtest workflow 2021-04-24 02:29:42 +08:00
bxdd
39deb7d27f update env & strategy, add workflow 2021-04-22 22:28:01 +08:00
bxdd
8979d786a9 update report & account 2021-04-22 02:04:40 +08:00
bxdd
971d6a2847 update strategy 2021-04-21 16:42:16 +08:00
bxdd
d3a1e03a11 add sample & base class 2021-03-20 00:11:19 +08:00
186 changed files with 11593 additions and 3250 deletions

View File

@@ -12,8 +12,9 @@ jobs:
 runs-on: ${{ matrix.os }}
 strategy:
 matrix:
-os: [windows-latest, macos-latest]
-python-version: [3.6, 3.7, 3.8, 3.9]
+os: [windows-latest, macos-latest, macos-11]
+# not supporting 3.6 due to annotations is not supported https://stackoverflow.com/a/52890129
+python-version: [3.7, 3.8]
 steps:
 - uses: actions/checkout@v2
@@ -44,7 +45,8 @@ jobs:
 - name: Build wheel on Linux
 uses: RalfG/python-wheels-manylinux-build@v0.3.1-manylinux2010_x86_64
 with:
-python-versions: 'cp36-cp36m cp37-cp37m cp38-cp38'
+# not supporting 3.6 due to annotations is not supported https://stackoverflow.com/a/52890129
+python-versions: 'cp37-cp37m cp38-cp38'
 build-requirements: 'numpy cython'
 - name: Set up Python
 uses: actions/setup-python@v2
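
The workflow comment above drops Python 3.6 because "annotations" are not supported there. Assuming this refers to postponed evaluation of annotations (`from __future__ import annotations`, PEP 563), which interpreters accept only from Python 3.7 onward, the following sketch shows what that import enables and how an older interpreter rejects it. The class `Node` and the helper `load_demo` are hypothetical and do not come from Qlib.

```python
import sys

# Source of a tiny module that relies on postponed annotation evaluation.
# `Node` is a hypothetical class that refers to itself in its own signatures.
SOURCE = '''
from __future__ import annotations

class Node:
    def __init__(self, value: int, parent: Node = None) -> None:
        self.value = value
        self.parent = parent

    def root(self) -> Node:
        node = self
        while node.parent is not None:
            node = node.parent
        return node
'''


def load_demo():
    """Compile SOURCE; return its namespace, or None if the interpreter is too old."""
    try:
        code = compile(SOURCE, "<annotations-demo>", "exec")
    except SyntaxError:
        # Python 3.6 rejects the import ("future feature annotations is not defined").
        return None
    namespace = {}
    exec(code, namespace)
    return namespace


demo = load_demo()
if demo is None:
    print("annotations future import unsupported on", sys.version_info[:2])
else:
    Node = demo["Node"]
    child = Node(2, parent=Node(1))
    # Annotations are stored as strings and are not evaluated at definition time,
    # so the self-reference to `Node` inside the class body is legal.
    print(child.root().value, Node.root.__annotations__)
```

Without the `__future__` import, the `parent: Node` annotation would be evaluated while the class body is still executing and would raise `NameError`. This is why code written in this style cannot simply be run on 3.6.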

View File

@@ -13,7 +13,8 @@ jobs:
 strategy:
 matrix:
 os: [windows-latest, ubuntu-18.04, ubuntu-20.04]
-python-version: [3.6, 3.7, 3.8]
+# not supporting 3.6 due to annotations is not supported https://stackoverflow.com/a/52890129
+python-version: [3.7, 3.8]
 steps:
 - uses: actions/checkout@v2

View File

@@ -10,10 +10,12 @@ on:
 jobs:
 build:
-runs-on: macos-latest
+runs-on: ${{ matrix.os }}
 strategy:
 matrix:
-python-version: [3.6, 3.7, 3.8]
+os: [macos-11, macos-latest]
+# not supporting 3.6 due to annotations is not supported https://stackoverflow.com/a/52890129
+python-version: [3.7, 3.8]
 steps:
 - uses: actions/checkout@v2
@@ -31,6 +33,7 @@ jobs:
python -m pip install black
python -m black qlib -l 120 --check --diff
# Test Qlib installed with pip
- name: Install Qlib with pip
run: |
python -m pip install numpy==1.19.5

1
.gitignore vendored
View File

@@ -20,6 +20,7 @@ dist/
.nvimrc
.vscode
qlib/VERSION.txt
qlib/data/_libs/expanding.cpp
qlib/data/_libs/rolling.cpp
examples/estimator/estimator_example/

View File

@@ -159,6 +159,21 @@ Version 0.5.0
- Add baselines
- public data crawler
Version greater than Version 0.5.0
Version 0.8.0
--------------------
- The backtest is greatly refactored.
- Nested decision execution framework is supported
- There are many changes to daily trading, and it is hard to list all of them. A few important ones are worth noting:
- The trading limitation is more accurate:
- In the `previous version <https://github.com/microsoft/qlib/blob/v0.7.2/qlib/contrib/backtest/exchange.py#L160>`__, long and short actions share the same trading limitation.
- In the `current version <https://github.com/microsoft/qlib/blob/7c31012b507a3823117bddcc693fc64899460b2a/qlib/backtest/exchange.py#L304>`__, the trading limitation differs between long and short actions.
- The constant used to calculate annualized metrics is different:
- The `current version <https://github.com/microsoft/qlib/blob/7c31012b507a3823117bddcc693fc64899460b2a/qlib/contrib/evaluate.py#L42>`__ uses a more accurate constant than the `previous version <https://github.com/microsoft/qlib/blob/v0.7.2/qlib/contrib/evaluate.py#L22>`__.
- `A new version <https://github.com/microsoft/qlib/blob/7c31012b507a3823117bddcc693fc64899460b2a/qlib/tests/data.py#L17>`__ of the data is released. Due to the instability of the Yahoo data source, the data may differ after it is downloaded again.
- Users can compare the backtesting results of the `current version <https://github.com/microsoft/qlib/tree/7c31012b507a3823117bddcc693fc64899460b2a/examples/benchmarks>`__ and the `previous version <https://github.com/microsoft/qlib/tree/v0.7.2/examples/benchmarks>`__.
Other Versions
----------------------------------
Please refer to `Github release Notes <https://github.com/microsoft/qlib/releases>`_

View File

@@ -25,7 +25,7 @@ Recent released features
Features released before 2021 are not listed here.
<p align="center">
<img src="http://fintech.msra.cn/images_v060/logo/1.png" />
<img src="http://fintech.msra.cn/images_v070/logo/1.png" />
</p>
@@ -70,7 +70,7 @@ Your feedbacks about the features are very important.
# Framework of Qlib
<div style="align: center">
<img src="http://fintech.msra.cn/images_v060/framework.png?v=0.2" />
<img src="docs/_static/img/framework.svg" />
</div>
@@ -100,7 +100,6 @@ Here is a quick **[demo](https://terminalizer.com/view/3f24561a4470)** shows how
This table demonstrates the supported Python version of `Qlib`:
| | install with pip | install from source | plot |
| ------------- |:---------------------:|:--------------------:|:----:|
| Python 3.6 | :heavy_check_mark: | :heavy_check_mark: (only with `Anaconda`) | :heavy_check_mark: |
| Python 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Python 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Python 3.9 | :x: | :heavy_check_mark: | :x: |
@@ -247,19 +246,19 @@ Qlib provides a tool named `qrun` to run the whole workflow automatically (inclu
2. Graphical Reports Analysis: Run `examples/workflow_by_code.ipynb` with `jupyter notebook` to get graphical reports
- Forecasting signal (model prediction) analysis
- Cumulative Return of groups
![Cumulative Return](http://fintech.msra.cn/images_v060/analysis/analysis_model_cumulative_return.png?v=0.1)
![Cumulative Return](http://fintech.msra.cn/images_v070/analysis/analysis_model_cumulative_return.png?v=0.1)
- Return distribution
![long_short](http://fintech.msra.cn/images_v060/analysis/analysis_model_long_short.png?v=0.1)
![long_short](http://fintech.msra.cn/images_v070/analysis/analysis_model_long_short.png?v=0.1)
- Information Coefficient (IC)
![Information Coefficient](http://fintech.msra.cn/images_v060/analysis/analysis_model_IC.png?v=0.1)
![Monthly IC](http://fintech.msra.cn/images_v060/analysis/analysis_model_monthly_IC.png?v=0.1)
![IC](http://fintech.msra.cn/images_v060/analysis/analysis_model_NDQ.png?v=0.1)
![Information Coefficient](http://fintech.msra.cn/images_v070/analysis/analysis_model_IC.png?v=0.1)
![Monthly IC](http://fintech.msra.cn/images_v070/analysis/analysis_model_monthly_IC.png?v=0.1)
![IC](http://fintech.msra.cn/images_v070/analysis/analysis_model_NDQ.png?v=0.1)
- Auto Correlation of forecasting signal (model prediction)
![Auto Correlation](http://fintech.msra.cn/images_v060/analysis/analysis_model_auto_correlation.png?v=0.1)
![Auto Correlation](http://fintech.msra.cn/images_v070/analysis/analysis_model_auto_correlation.png?v=0.1)
- Portfolio analysis
- Backtest return
![Report](http://fintech.msra.cn/images_v060/analysis/report.png?v=0.1)
![Report](http://fintech.msra.cn/images_v070/analysis/report.png?v=0.1)
<!--
- Score IC
![Score IC](docs/_static/img/score_ic.png)
@@ -307,7 +306,7 @@ All the models listed above are runnable with ``Qlib``. Users can find the confi
- Users can use the tool `qrun` mentioned above to run a model's workflow based from a config file.
- Users can create a `workflow_by_code` python script based on the [one](examples/workflow_by_code.py) listed in the `examples` folder.
- Users can use the script [`run_all_model.py`](examples/run_all_model.py) listed in the `examples` folder to run a model. Here is an example of the specific shell command to be used: `python run_all_model.py --models=lightgbm`, where the `--models` arguments can take any number of models listed above(the available models can be found in [benchmarks](examples/benchmarks/)). For more use cases, please refer to the file's [docstrings](examples/run_all_model.py).
- Users can use the script [`run_all_model.py`](examples/run_all_model.py) listed in the `examples` folder to run a model. Here is an example of the specific shell command to be used: `python run_all_model.py run --models=lightgbm`, where the `--models` arguments can take any number of models listed above(the available models can be found in [benchmarks](examples/benchmarks/)). For more use cases, please refer to the file's [docstrings](examples/run_all_model.py).
- **NOTE**: Each baseline has different environment dependencies, please make sure that your python version aligns with the requirements(e.g. TFT only supports Python 3.6~3.7 due to the limitation of `tensorflow==1.15.0`)
## Run multiple models
@@ -317,7 +316,7 @@ The script will create a unique virtual environment for each model, and delete t
Here is an example of running all the models for 10 iterations:
```python
python run_all_model.py 10
python run_all_model.py run 10
```
It also provides the API to run specific models at once. For more use cases, please refer to the file's [docstrings](examples/run_all_model.py).
@@ -389,7 +388,7 @@ Qlib data are stored in a compact format, which is efficient to be combined into
Join IM discussion groups:
|[Gitter](https://gitter.im/Microsoft/qlib)|
|----|
|![image](http://fintech.msra.cn/images_v060/qrcode/gitter_qr.png)|
|![image](http://fintech.msra.cn/images_v070/qrcode/gitter_qr.png)|
# Contributing

View File

@@ -1 +1 @@
0.7.2
0.7.2.99

Binary file not shown. Before: 33 KiB. After: 37 KiB.

Binary file not shown. Before: 23 KiB. After: 23 KiB.

Binary file not shown. Before: 47 KiB. After: 44 KiB.

Binary file not shown. Before: 63 KiB. After: 53 KiB.

Binary file not shown. Before: 16 KiB. After: 16 KiB.

Binary file not shown. Before: 16 KiB. After: 15 KiB.

Binary file not shown. Before: 160 KiB. After: 144 KiB.

Binary file not shown. Before: 46 KiB. After: 45 KiB.

Binary file not shown. Before: 13 KiB. After: 10 KiB.

Binary file not shown. Before: 54 KiB. After: 52 KiB.

Binary file not shown. Before: 53 KiB. After: 48 KiB.

Binary file not shown. Before: 47 KiB. After: 44 KiB.

Binary file not shown. Before: 102 KiB. After: 93 KiB.

4
docs/_static/img/framework.svg vendored Normal file

File diff suppressed because one or more lines are too long. After: 98 KiB.

View File

@@ -30,7 +30,7 @@ The simple example of the default strategy is as follows.
from qlib.contrib.evaluate import backtest
# pred_score is the prediction score
report, positions = backtest(pred_score, topk=50, n_drop=0.5, verbose=False, limit_threshold=0.0095)
report, positions = backtest(pred_score, topk=50, n_drop=0.5, limit_threshold=0.0095)
To know more about backtesting with a specific ``Strategy``, please refer to `Portfolio Strategy <strategy.html>`_.

120
docs/component/highfreq.rst Normal file
View File

@@ -0,0 +1,120 @@
.. _highfreq:
================================================
Design of hierarchical order execution framework
================================================
.. currentmodule:: qlib
Introduction
===================
To support reinforcement learning algorithms for high-frequency trading, a corresponding framework is required. None of the publicly available high-frequency trading frameworks currently considers multi-layer trading mechanisms, so the algorithms designed so far cannot use existing frameworks directly.
In addition to supporting the basic intraday multi-layer trading, the linkage with the day-ahead strategy is also a factor that affects the performance evaluation of the strategy. Different day strategies generate different order distributions and different patterns on different stocks. To verify that high-frequency trading strategies perform well on real trading orders, it is necessary to support day-frequency and high-frequency multi-level linkage trading. In addition to more accurate backtesting of high-frequency trading algorithms, if the distribution of day-frequency orders is considered when training a high-frequency trading model, the algorithm can also be optimized more for product-specific day-frequency orders.
Therefore, innovation in the high-frequency trading framework is necessary to solve the various problems mentioned above, for which we designed a hierarchical order execution framework that can link daily-frequency and intra-day trading at different granularities.
.. image:: ../_static/img/framework.svg
The design of the framework is shown in the figure above. Each layer consists of a Trading Agent and an Execution Env. The Trading Agent has its own data processing module (Information Extractor), forecasting module (Forecast Model) and decision generator (Decision Generator). The trading algorithm uses the Decision Generator to turn the forecast signals output by the Forecast Model into decisions, which are then passed to the Execution Env, and the Execution Env returns the execution results. The frequency of the trading algorithm, the decision content and the execution environment can all be customized by users (e.g. intra-day trading, daily-frequency trading, weekly-frequency trading). An execution environment can also nest a finer-grained trading algorithm and execution environment inside it (i.e. the sub-workflow in the figure; e.g. daily-frequency orders can be turned into finer-grained decisions by splitting orders within the day). Because the hierarchy division and the decision frequency are user-defined, users can easily explore the effects of combining trading algorithms at different levels, which breaks down the barriers between optimizing trading algorithms at different levels.
In addition to the innovation in the framework, the hierarchical order execution framework also takes into account various details of the real backtesting environment, minimizing the differences with the final real environment as much as possible. At the same time, the framework is designed to unify the interface between online and offline (e.g. data pre-processing level supports using the same set of code to process both offline and online data) to reduce the cost of strategy go-live as much as possible.
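The nesting described above can be illustrated with a small, self-contained sketch. It is not Qlib's implementation: `Order`, `LeafExecutor`, `NestedExecutorSketch` and `split_in_four` are hypothetical names invented for this illustration, and the leaf layer simply fills every order it receives.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Order:
    stock: str
    amount: float


class LeafExecutor:
    """Innermost layer (illustrative): every order is filled as-is."""

    def execute(self, order: Order) -> List[Order]:
        return [order]


class NestedExecutorSketch:
    """Outer layer: an inner strategy splits the order, an inner executor runs the pieces."""

    def __init__(self, inner_strategy: Callable[[Order], List[Order]], inner_executor):
        self.inner_strategy = inner_strategy
        self.inner_executor = inner_executor

    def execute(self, order: Order) -> List[Order]:
        fills: List[Order] = []
        for sub_order in self.inner_strategy(order):
            fills.extend(self.inner_executor.execute(sub_order))
        return fills


def split_in_four(order: Order) -> List[Order]:
    # Toy inner strategy: split a daily order into four equal intra-day orders.
    return [Order(order.stock, order.amount / 4) for _ in range(4)]


daily_executor = NestedExecutorSketch(split_in_four, LeafExecutor())
fills = daily_executor.execute(Order("SH600000", 1000.0))
print(len(fills), sum(f.amount for f in fills))
```

Because the inner executor only needs an `execute` method, a `NestedExecutorSketch` can itself be passed as the inner executor of another layer, which mirrors the idea of stacking levels with different decision frequencies.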
Prepare Data
===================
Please refer to ``examples/highfreq/README.md`` for how to prepare the data.
Example
===========================
Here is an example of highfreq execution.
.. code-block:: python

    import qlib
    from qlib.data import D

    # init qlib with both the daily and the 1-minute data
    provider_uri_day = "~/.qlib/qlib_data/cn_data"
    provider_uri_1min = "~/.qlib/qlib_data/cn_data_1min"
    provider_uri_map = {"1min": provider_uri_1min, "day": provider_uri_day}
    qlib.init(provider_uri=provider_uri_map, expression_cache=None, dataset_cache=None)

    # data freq and backtest time
    freq = "1min"
    inst_list = D.list_instruments(D.instruments("all"), as_list=True)
    market = "all"  # assumption: the instrument universe used by the configs below
    start_time = "2020-01-01"
    end_time = "2020-01-31"

When initializing qlib, if the default data is used, then both daily and minute frequency data need to be passed in.
.. code-block:: python

    from qlib.backtest.decision import TradeRangeByTime

    # random order strategy config
    strategy_config = {
        "class": "RandomOrderStrategy",
        "module_path": "qlib.contrib.strategy.rule_strategy",
        "kwargs": {
            "trade_range": TradeRangeByTime("9:30", "15:00"),
            "sample_ratio": 1.0,
            "volume_ratio": 0.01,
            "market": market,
        },
    }

.. code-block:: python

    # backtest config
    backtest_config = {
        "start_time": start_time,
        "end_time": end_time,
        "account": 100000000,
        "benchmark": None,
        "exchange_kwargs": {
            "freq": freq,
            "limit_threshold": 0.095,
            "deal_price": "close",
            "open_cost": 0.0005,
            "close_cost": 0.0015,
            "min_cost": 5,
            "codes": market,
        },
        "pos_type": "InfPosition",  # position with infinite holdings
    }

For the meaning of these parameters, please refer to the ``qlib/backtest`` module.
.. code-block:: python

    # executor config
    executor_config = {
        "class": "NestedExecutor",
        "module_path": "qlib.backtest.executor",
        "kwargs": {
            "time_per_step": "day",
            "inner_executor": {
                "class": "SimulatorExecutor",
                "module_path": "qlib.backtest.executor",
                "kwargs": {
                    "time_per_step": freq,
                    "generate_portfolio_metrics": True,
                    "verbose": False,
                    # "verbose": True,
                    "indicator_config": {
                        "show_indicator": False,
                    },
                },
            },
            "inner_strategy": {
                "class": "TWAPStrategy",
                "module_path": "qlib.contrib.strategy.rule_strategy",
            },
            "track_data": True,
            "generate_portfolio_metrics": True,
            "indicator_config": {
                "show_indicator": True,
            },
        },
    }

``NestedExecutor`` represents a layer that is not the innermost one, so its initialization parameters should contain ``inner_executor`` and ``inner_strategy``. ``SimulatorExecutor`` indicates that the current executor is the innermost layer. The innermost strategy used here is the TWAP strategy; the framework currently also supports the VWAP strategy.
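To make the role of the inner TWAP strategy more concrete, here is a minimal sketch of the idea behind time-weighted splitting: a parent order is divided as evenly as possible over the inner steps. The function `twap_split` is a hypothetical helper written for this illustration; the actual ``TWAPStrategy`` in Qlib may handle trade units, limits and remaining amounts differently.

```python
from typing import List


def twap_split(total_amount: int, n_steps: int) -> List[int]:
    """Split `total_amount` shares as evenly as possible over `n_steps` steps."""
    if n_steps <= 0:
        raise ValueError("n_steps must be positive")
    base, remainder = divmod(total_amount, n_steps)
    # The first `remainder` steps get one extra share so the pieces sum to the total.
    return [base + (1 if i < remainder else 0) for i in range(n_steps)]


# A daily order of 1000 shares executed over 240 one-minute steps.
pieces = twap_split(1000, 240)
print(pieces[:3], pieces[-3:], sum(pieces))
```

In this sketch the remainder is front-loaded; spreading it differently is an equally valid design choice as long as the pieces add up to the parent order.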
.. code-block:: python

    from qlib.backtest import backtest

    # backtest
    portfolio_metrics_dict, indicator_dict = backtest(executor=executor_config, strategy=strategy_config, **backtest_config)

The backtest metrics are stored in ``portfolio_metrics_dict`` and ``indicator_dict``.

View File

@@ -123,7 +123,6 @@ Here is a simple exampke of what is done in ``PortAnaRecord``, which users can r
"n_drop": 5,
}
BACKTEST_CONFIG = {
"verbose": False,
"limit_threshold": 0.095,
"account": 100000000,
"benchmark": BENCHMARK,

View File

@@ -93,7 +93,6 @@ Usage & Example
"n_drop": 5,
}
BACKTEST_CONFIG = {
"verbose": False,
"limit_threshold": 0.095,
"account": 100000000,
"benchmark": BENCHMARK,

View File

@@ -53,8 +53,10 @@ Below is a typical config file of ``qrun``.
kwargs:
topk: 50
n_drop: 5
signal:
- <MODEL>
- <DATASET>
backtest:
verbose: False
limit_threshold: 0.095
account: 100000000
benchmark: *benchmark
@@ -241,8 +243,10 @@ The following script is the configuration of `backtest` and the `strategy` used
kwargs:
topk: 50
n_drop: 5
signal:
- <MODEL>
- <DATASET>
backtest:
verbose: False
limit_threshold: 0.095
account: 100000000
benchmark: *benchmark

View File

@@ -93,7 +93,6 @@ We write a simple configuration example as following,
fend_time: 2018-12-11
backtest:
normal_backtest_args:
verbose: False
limit_threshold: 0.095
account: 500000
benchmark: SH000905
@@ -306,7 +305,6 @@ About the data and backtest
fend_time: 2018-12-11
backtest:
normal_backtest_args:
verbose: False
limit_threshold: 0.095
account: 500000
benchmark: SH000905

View File

@@ -15,7 +15,7 @@ With ``Qlib``, users can easily try their ideas to create better Quant investmen
Framework
===================
.. image:: ../_static/img/framework.png
.. image:: ../_static/img/framework.svg
:align: center

View File

@@ -34,19 +34,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: ALSTM
@@ -81,7 +86,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: ALSTM
@@ -71,7 +76,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
@@ -80,4 +87,4 @@ task:
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config
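The benchmark configs in this diff move the exchange-related keys of the `backtest` section under a new `exchange_kwargs` key and drop `verbose`. The sketch below shows that restructuring on a plain dictionary. `migrate_backtest_config` and `EXCHANGE_KEYS` are hypothetical names used only for this illustration; they are not part of Qlib, and the key list is taken from the YAML hunks above.

```python
# Keys that the diff moves from `backtest` into `backtest.exchange_kwargs`.
EXCHANGE_KEYS = ("limit_threshold", "deal_price", "open_cost", "close_cost", "min_cost")


def migrate_backtest_config(old: dict) -> dict:
    """Return a new-style backtest config built from an old flat one."""
    # Keep everything that is neither an exchange key nor the removed `verbose` flag.
    new = {k: v for k, v in old.items() if k not in EXCHANGE_KEYS and k != "verbose"}
    exchange = {k: old[k] for k in EXCHANGE_KEYS if k in old}
    if exchange:
        new["exchange_kwargs"] = exchange
    return new


old_config = {
    "verbose": False,
    "limit_threshold": 0.095,
    "account": 100000000,
    "deal_price": "close",
    "open_cost": 0.0005,
    "close_cost": 0.0015,
    "min_cost": 5,
}
new_config = migrate_backtest_config(old_config)
print(new_config)
```

The helper leaves the input dictionary untouched, so an old config can be converted and compared side by side with its new form.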

View File

@@ -12,19 +12,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: CatBoostModel
@@ -53,7 +58,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -19,19 +19,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: CatBoostModel
@@ -60,7 +65,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -12,19 +12,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: DEnsembleModel
@@ -75,16 +80,18 @@ task:
train: [2008-01-01, 2014-12-31]
valid: [2015-01-01, 2016-12-31]
test: [2017-01-01, 2020-08-01]
record:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
kwargs:
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
kwargs:
config: *port_analysis_config

View File

@@ -19,19 +19,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: DEnsembleModel
@@ -82,10 +87,12 @@ task:
train: [2008-01-01, 2014-12-31]
valid: [2015-01-01, 2016-12-31]
test: [2017-01-01, 2020-08-01]
record:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
@@ -93,5 +100,5 @@ task:
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
kwargs:
config: *port_analysis_config

View File

@@ -33,19 +33,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: GATs
@@ -79,7 +84,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
@@ -88,4 +95,4 @@ task:
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config

View File

@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: GATs
@@ -71,7 +76,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -34,19 +34,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: GRU
@@ -80,7 +85,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: GRU
@@ -70,7 +75,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
@@ -79,4 +86,4 @@ task:
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config

View File

@@ -34,19 +34,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: LSTM
@@ -80,7 +85,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: LSTM
@@ -70,7 +75,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
@@ -79,4 +86,4 @@ task:
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config

View File

@@ -0,0 +1,18 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
import pandas as pd
from qlib.data.inst_processor import InstProcessor
from qlib.utils.resam import resam_calendar
class ResampleNProcessor(InstProcessor):
def __init__(self, target_frq: str, **kwargs):
self.target_frq = target_frq
def __call__(self, df: pd.DataFrame, *args, **kwargs):
df.index = pd.to_datetime(df.index)
res_index = resam_calendar(df.index, "1min", self.target_frq)
df = df.resample(self.target_frq).last().reindex(res_index)
return df
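The processor above keeps the last observation of each resampled bar. A dependency-free sketch of that "last value per bucket" step is shown below; `resample_last` is a hypothetical helper, it assumes the rows are sorted by time and that `minutes` divides 60, and it does not reproduce the calendar re-indexing done by `resam_calendar`.

```python
from datetime import datetime, timedelta


def resample_last(rows, minutes):
    """Keep the last value of every `minutes`-wide bucket, keyed by the bucket start."""
    out = {}
    for ts, value in rows:
        bucket = ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)
        out[bucket] = value  # later rows overwrite earlier ones in the same bucket
    return out


start = datetime(2020, 1, 2, 9, 30)
# Thirty 1-minute rows, 09:30 to 09:59, whose value is the minute offset.
rows = [(start + timedelta(minutes=i), float(i)) for i in range(30)]
bars = resample_last(rows, 15)
print(bars)
```

Each bucket is labelled by its left edge here, which is a choice made for this sketch.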

View File

@@ -0,0 +1,135 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
import pandas as pd
from qlib.data.dataset.loader import QlibDataLoader
from qlib.contrib.data.handler import DataHandlerLP, _DEFAULT_LEARN_PROCESSORS, check_transform_proc
class Avg15minLoader(QlibDataLoader):
def load(self, instruments=None, start_time=None, end_time=None) -> pd.DataFrame:
df = super(Avg15minLoader, self).load(instruments, start_time, end_time)
if self.is_group:
# feature_day(day freq) and feature_15min(1min freq, Average every 15 minutes) renamed feature
df.columns = df.columns.map(lambda x: ("feature", x[1]) if x[0].startswith("feature") else x)
return df
class Avg15minHandler(DataHandlerLP):
def __init__(
self,
instruments="csi500",
start_time=None,
end_time=None,
freq="day",
infer_processors=[],
learn_processors=_DEFAULT_LEARN_PROCESSORS,
fit_start_time=None,
fit_end_time=None,
process_type=DataHandlerLP.PTYPE_A,
filter_pipe=None,
inst_processor=None,
**kwargs,
):
infer_processors = check_transform_proc(infer_processors, fit_start_time, fit_end_time)
learn_processors = check_transform_proc(learn_processors, fit_start_time, fit_end_time)
data_loader = Avg15minLoader(
config=self.loader_config(), filter_pipe=filter_pipe, freq=freq, inst_processor=inst_processor
)
super().__init__(
instruments=instruments,
start_time=start_time,
end_time=end_time,
data_loader=data_loader,
infer_processors=infer_processors,
learn_processors=learn_processors,
process_type=process_type,
)
def loader_config(self):
# Results for dataset: df: pd.DataFrame
# len(df.columns) == 6 + 6 * 16, len(df.index.get_level_values(level="datetime").unique()) == T
# df.columns: close0, close1, ..., close16, open0, ..., open16, ..., vwap16
# freq == day:
# close0, open0, low0, high0, volume0, vwap0
# freq == 1min:
# close1, ..., close16, ..., vwap1, ..., vwap16
# df.index.name == ["datetime", "instrument"]: pd.MultiIndex
# Example:
# feature ... label
# close0 open0 low0 ... vwap1 vwap16 LABEL0
# datetime instrument ...
# 2020-10-09 SH600000 11.794546 11.819587 11.769505 ... NaN NaN -0.005214
# 2020-10-15 SH600000 12.044961 11.944795 11.932274 ... NaN NaN -0.007202
# ... ... ... ... ... ... ... ...
# 2021-05-28 SZ300676 6.369684 6.495406 6.306568 ... NaN NaN -0.001321
# 2021-05-31 SZ300676 6.601626 6.465643 6.465130 ... NaN NaN -0.023428
# features day: len(columns) == 6, freq = day
# $close is the closing price of the current trading day:
# if the user needs to get the `close` before the last T days, use Ref($close, T-1), for example:
# $close Ref($close, 1) Ref($close, 2) Ref($close, 3) Ref($close, 4)
# instrument datetime
# SH600519 2021-06-01 244.271530
# 2021-06-02 242.205917 244.271530
# 2021-06-03 242.229889 242.205917 244.271530
# 2021-06-04 245.421524 242.229889 242.205917 244.271530
# 2021-06-07 247.547089 245.421524 242.229889 242.205917 244.271530
# WARNING: Ref($close, N), if N == 0, Ref($close, N) ==> $close
fields = ["$close", "$open", "$low", "$high", "$volume", "$vwap"]
# names: close0, open0, ..., vwap0
names = list(map(lambda x: x.strip("$") + "0", fields))
config = {"feature_day": (fields, names)}
# features 15min: len(columns) == 6 * 16, freq = 1min
# $close is the closing price of the current trading day:
# if the user gets 'close' for the i-th 15min of the last T days, use `Ref(Mean($close, 15), (T-1) * 240 + i * 15)`, for example:
# Ref(Mean($close, 15), 225) Ref(Mean($close, 15), 465) Ref(Mean($close, 15), 705)
# instrument datetime
# SH600519 2021-05-31 241.769897 243.077942 244.712997
# 2021-06-01 244.271530 241.769897 243.077942
# 2021-06-02 242.205917 244.271530 241.769897
# WARNING: Ref(Mean($close, 15), N), if N == 0, Ref(Mean($close, 15), N) ==> Mean($close, 15)
# Results of the current script:
# time: 09:00 --> 09:14, ..., 14:45 --> 14:59
# fields: Ref(Mean($close, 15), 225), ..., Mean($close, 15)
# name: close1, ..., close16
#
# Expression description: take close as an example
# Mean($close, 15) ==> df["$close"].rolling(15, min_periods=1).mean()
# Ref(Mean($close, 15), 15) ==> df["$close"].rolling(15, min_periods=1).mean().shift(15)
# NOTE: The last data of each trading day, which is the average of the i-th 15 minutes
# Average:
# Average of the i-th 15-minute period of each trading day: 1 <= i <= 240 // 15
# Avg(15minutes): Ref(Mean($close, 15), 240 - i * 15)
#
# Average of the first 15 minutes of each trading day; i = 1
# Avg(09:00 --> 09:14), df.index.loc["09:14"]: Ref(Mean($close, 15), 240- 1 * 15) ==> Ref(Mean($close, 15), 225)
# Average of the last 15 minutes of each trading day; i = 16
# Avg(14:45 --> 14:59), df.index.loc["14:59"]: Ref(Mean($close, 15), 240 - 16 * 15) ==> Ref(Mean($close, 15), 0) ==> Mean($close, 15)
# 15min resample to day
# df.resample("1d").last()
tmp_fields = []
tmp_names = []
for i, _f in enumerate(fields):
    _fields = [f"Ref(Mean({_f}, 15), {j * 15})" for j in range(1, 240 // 15)]
    _names = [f"{names[i][:-1]}{int(names[i][-1])+j}" for j in range(240 // 15 - 1, 0, -1)]
    _fields.append(f"Mean({_f}, 15)")
    _names.append(f"{names[i][:-1]}{int(names[i][-1])+240 // 15}")
    tmp_fields += _fields
    tmp_names += _names
config["feature_15min"] = (tmp_fields, tmp_names)
# label
config["label"] = (["Ref($close, -2)/Ref($close, -1) - 1"], ["LABEL0"])
return config
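The expression comments above map `Mean` and `Ref` to pandas rolling and shift operations. A minimal sketch of that mapping on synthetic 1min data follows; the helper name `mean_then_ref`, the timestamps, and the 240-bar series are assumptions made for illustration and are not part of Qlib.

```python
import numpy as np
import pandas as pd


def mean_then_ref(close: pd.Series, window: int = 15, lag: int = 0) -> pd.Series:
    """pandas analogue of Ref(Mean($close, window), lag), following the comments above."""
    rolled = close.rolling(window, min_periods=1).mean()
    # Ref(x, 0) is x itself, so only shift for a positive lag.
    return rolled.shift(lag) if lag > 0 else rolled


# Synthetic day of 240 one-minute bars with values 0..239; the timestamps are made up.
index = pd.date_range("2021-06-01 09:00", periods=240, freq="min")
close = pd.Series(np.arange(240, dtype=float), index=index)

# Read at the last bar of the day: lag 225 gives the first 15-minute average (close1),
# lag 0 gives the last 15-minute average (close16).
first_avg = mean_then_ref(close, 15, 225).iloc[-1]
last_avg = mean_then_ref(close, 15, 0).iloc[-1]

# "15min resample to day": keep the value at the last bar of each day.
daily_last = mean_then_ref(close, 15, 0).resample("1D").last()
```

Reading every lagged series at the final bar of the day is what lets a single daily row carry all 16 intraday averages.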


@@ -12,19 +12,23 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
model: <MODEL>
dataset: <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: LGBModel
@@ -54,7 +58,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:


@@ -33,6 +33,9 @@ port_analysis_config: &port_analysis_config
kwargs:
topk: 50
n_drop: 5
signal:
- <MODEL>
- <DATASET>
backtest:
verbose: False
limit_threshold: 0.095
@@ -80,4 +83,4 @@ task:
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config
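The `<MODEL>` and `<DATASET>` entries in these configs act as placeholders that stand in for the trained model and the dataset object. The sketch below shows one way such a substitution could work on a nested config; `fill_placeholders` is a hypothetical helper written for this illustration and is not Qlib's own implementation.

```python
from typing import Any, Dict


def fill_placeholders(config: Any, values: Dict[str, Any]) -> Any:
    """Return a copy of a nested dict/list config with placeholder strings replaced."""
    if isinstance(config, dict):
        return {key: fill_placeholders(item, values) for key, item in config.items()}
    if isinstance(config, list):
        return [fill_placeholders(item, values) for item in config]
    if isinstance(config, str) and config in values:
        return values[config]
    return config


# Strategy kwargs in the new layout, with the signal given as [<MODEL>, <DATASET>].
strategy_kwargs = {"signal": ["<MODEL>", "<DATASET>"], "topk": 50, "n_drop": 5}
model, dataset = object(), object()  # stand-ins for real model and dataset objects
filled = fill_placeholders(strategy_kwargs, {"<MODEL>": model, "<DATASET>": dataset})
```

The helper builds a new structure instead of mutating its input, so the original config can be reused for another run.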


@@ -19,19 +19,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: LGBModel
@@ -61,7 +66,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
@@ -70,4 +77,4 @@ task:
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config


@@ -27,19 +27,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: LGBModel
@@ -69,7 +74,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:


@@ -0,0 +1,90 @@
qlib_init:
provider_uri:
day: "~/.qlib/qlib_data/cn_data"
1min: "~/.qlib/qlib_data/cn_data_1min"
region: cn
dataset_cache: null
maxtasksperchild: null
market: &market csi300
benchmark: &benchmark SH000300
data_handler_config: &data_handler_config
start_time: 2008-01-01
# 1min closing time is 15:00:00
end_time: "2020-08-01 15:00:00"
fit_start_time: 2008-01-01
fit_end_time: 2014-12-31
instruments: *market
freq:
label: day
feature_15min: 1min
feature_day: day
# with label as reference
inst_processor:
feature_15min:
- class: ResampleNProcessor
module_path: features_resample_N.py
kwargs:
target_frq: 1d
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: LGBModel
module_path: qlib.contrib.model.gbdt
kwargs:
loss: mse
colsample_bytree: 0.8879
learning_rate: 0.2
subsample: 0.8789
lambda_l1: 205.6999
lambda_l2: 580.9768
max_depth: 8
num_leaves: 210
num_threads: 20
dataset:
class: DatasetH
module_path: qlib.data.dataset
kwargs:
handler:
class: Avg15minHandler
module_path: multi_freq_handler.py
kwargs: *data_handler_config
segments:
train: [2008-01-01, 2014-12-31]
valid: [2015-01-01, 2016-12-31]
test: [2017-01-01, 2020-08-01]
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
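The configs in this diff move the exchange-related keys of `backtest` under a nested `exchange_kwargs` mapping. A small sketch of that migration for a config already loaded as a Python dict follows; `migrate_backtest_config` and `EXCHANGE_KEYS` are hypothetical names, and dropping `verbose` is an assumption based on that key being absent from the new-style config shown here.

```python
from copy import deepcopy
from typing import Any, Dict, Iterable

# Keys that appear under `exchange_kwargs` in the new-style configs of this diff.
EXCHANGE_KEYS = ("limit_threshold", "deal_price", "open_cost", "close_cost", "min_cost")


def migrate_backtest_config(old: Dict[str, Any], drop: Iterable[str] = ("verbose",)) -> Dict[str, Any]:
    """Nest exchange settings under `exchange_kwargs`; leave every other key as it is."""
    dropped = set(drop)
    new = {k: deepcopy(v) for k, v in old.items() if k not in EXCHANGE_KEYS and k not in dropped}
    exchange = dict(new.get("exchange_kwargs", {}))
    exchange.update({k: old[k] for k in EXCHANGE_KEYS if k in old})
    if exchange:
        new["exchange_kwargs"] = exchange
    return new


old_backtest = {
    "verbose": False,
    "limit_threshold": 0.095,
    "start_time": "2017-01-01",
    "end_time": "2020-08-01",
    "account": 100000000,
    "benchmark": "SH000300",
    "deal_price": "close",
    "open_cost": 0.0005,
    "close_cost": 0.0015,
    "min_cost": 5,
}
new_backtest = migrate_backtest_config(old_backtest)
```

Pass `drop=()` to keep `verbose` if the target Qlib version still accepts it.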


@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: LinearModel
@@ -57,16 +62,18 @@ task:
train: [2008-01-01, 2014-12-31]
valid: [2015-01-01, 2016-12-31]
test: [2017-01-01, 2020-08-01]
record:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
kwargs:
ana_long_short: True
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
kwargs:
config: *port_analysis_config


@@ -34,19 +34,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: LocalformerModel
@@ -70,13 +75,15 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
ana_long_short: False
ann_scaler: 252
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config


@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: LocalformerModel
@@ -59,15 +64,17 @@ task:
valid: [2015-01-01, 2016-12-31]
test: [2017-01-01, 2020-08-01]
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config


@@ -39,19 +39,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: DNNModelPytorch
@@ -83,7 +88,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
@@ -92,4 +99,4 @@ task:
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config


@@ -27,19 +27,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: DNNModelPytorch
@@ -70,7 +75,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
@@ -79,4 +86,4 @@ task:
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config


@@ -3,49 +3,58 @@
Here are the results of each benchmark model running on Qlib's `Alpha360` and `Alpha158` datasets with China's A-share stock & CSI300 data, respectively. The values of each metric are the mean and std calculated over 20 runs with different random seeds.
The numbers shown below demonstrate the performance of the entire `workflow` of each model. We will update the `workflow` as well as models in the near future for better results.
<!--
> If you need to reproduce the results below, please use the **v1** dataset: `python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/qlib_cn_1d --region cn --version v1`
>
> In the new version of qlib, the default dataset is **v2**. Since the data is collected from the YahooFinance API (which is not very stable), the results of *v2* and *v1* may differ
> In the new version of qlib, the default dataset is **v2**. Since the data is collected from the YahooFinance API (which is not very stable), the results of *v2* and *v1* may differ -->
> NOTE:
> Starting from version 0.8.0, the backtest differs substantially from previous versions. Please check the changelog for the differences.
## Alpha360 dataset
| Model Name | Dataset | IC | ICIR | Rank IC | Rank ICIR | Annualized Return | Information Ratio | Max Drawdown |
|---|---|---|---|---|---|---|---|---|
| Linear | Alpha360 | 0.0150±0.00 | 0.1049±0.00| 0.0284±0.00 | 0.1970±0.00 | -0.0659±0.00 | -0.7072±0.00| -0.2955±0.00 |
| CatBoost (Liudmila Prokhorenkova, et al.) | Alpha360 | 0.0397±0.00 | 0.2878±0.00| 0.0470±0.00 | 0.3703±0.00 | 0.0342±0.00 | 0.4092±0.00| -0.1057±0.00 |
| XGBoost (Tianqi Chen, et al.) | Alpha360 | 0.0400±0.00 | 0.3031±0.00| 0.0461±0.00 | 0.3862±0.00 | 0.0528±0.00 | 0.6307±0.00| -0.1113±0.00 |
| LightGBM (Guolin Ke, et al.) | Alpha360 | 0.0399±0.00 | 0.3075±0.00| 0.0492±0.00 | 0.4019±0.00 | 0.0323±0.00 | 0.4370±0.00| -0.0917±0.00 |
| MLP | Alpha360 | 0.0285±0.00 | 0.1981±0.02| 0.0402±0.00 | 0.2993±0.02 | 0.0073±0.02 | 0.0880±0.22| -0.1446±0.03 |
| GRU (Kyunghyun Cho, et al.) | Alpha360 | 0.0490±0.01 | 0.3787±0.05| 0.0581±0.00 | 0.4664±0.04 | 0.0726±0.02 | 0.9817±0.34| -0.0902±0.03 |
| LSTM (Sepp Hochreiter, et al.) | Alpha360 | 0.0443±0.01 | 0.3401±0.05| 0.0536±0.01 | 0.4248±0.05 | 0.0627±0.03 | 0.8441±0.48| -0.0882±0.03 |
| ALSTM (Yao Qin, et al.) | Alpha360 | 0.0493±0.01 | 0.3778±0.06| 0.0585±0.00 | 0.4606±0.04 | 0.0513±0.03 | 0.6727±0.38| -0.1085±0.02 |
| GATs (Petar Velickovic, et al.) | Alpha360 | 0.0475±0.00 | 0.3515±0.02| 0.0592±0.00 | 0.4585±0.01 | 0.0876±0.02 | 1.1513±0.27| -0.0795±0.02 |
| DoubleEnsemble (Chuheng Zhang, et al.) | Alpha360 | 0.0407±0.00| 0.3053±0.00 | 0.0490±0.00 | 0.3840±0.00 | 0.0380±0.02 | 0.5000±0.21 | -0.0984±0.02 |
| TabNet (Sercan O. Arik, et al.)| Alpha360 | 0.0192±0.00 | 0.1401±0.00| 0.0291±0.00 | 0.2163±0.00 | -0.0258±0.00 | -0.2961±0.00| -0.1429±0.00 |
| TCTS (Xueqing Wu, et al.)| Alpha360 | 0.0485±0.00 | 0.3689±0.04| 0.0586±0.00 | 0.4669±0.02 | 0.0816±0.02 | 1.1572±0.30| -0.0689±0.02 |
| Transformer (Ashish Vaswani, et al.)| Alpha360 | 0.0141±0.00 | 0.0917±0.02| 0.0331±0.00 | 0.2357±0.03 | -0.0259±0.03 | -0.3323±0.43| -0.1763±0.07 |
| Localformer (Juyong Jiang, et al.)| Alpha360 | 0.0408±0.00 | 0.2988±0.03| 0.0538±0.00 | 0.4105±0.02 | 0.0275±0.03 | 0.3464±0.37| -0.1182±0.03 |
| TRA (Hengxu Lin, et al.)| Alpha360 | 0.0491±0.01 | 0.3868±0.06 | 0.0589±0.00 | 0.4802±0.04 | 0.0898±0.02 | 1.2490±0.32 | -0.0778±0.02 |
## Alpha158 dataset
| Model Name | Dataset | IC | ICIR | Rank IC | Rank ICIR | Annualized Return | Information Ratio | Max Drawdown |
|---|---|---|---|---|---|---|---|---|
| Linear | Alpha158 | 0.0393±0.00 | 0.2980±0.00| 0.0475±0.00 | 0.3546±0.00 | 0.0795±0.00 | 1.0712±0.00| -0.1449±0.00 |
| CatBoost (Liudmila Prokhorenkova, et al.) | Alpha158 | 0.0503±0.00 | 0.3586±0.00| 0.0483±0.00 | 0.3667±0.00 | 0.1080±0.00 | 1.1561±0.00| -0.0787±0.00 |
| XGBoost (Tianqi Chen, et al.) | Alpha158 | 0.0481±0.00 | 0.3659±0.00| 0.0495±0.00 | 0.4033±0.00 | 0.1111±0.00 | 1.2915±0.00| -0.0893±0.00 |
| LightGBM (Guolin Ke, et al.) | Alpha158 | 0.0475±0.00 | 0.3979±0.00| 0.0485±0.00 | 0.4123±0.00 | 0.1143±0.00 | 1.2744±0.00| -0.0800±0.00 |
| MLP | Alpha158 | 0.0358±0.00 | 0.2738±0.03| 0.0425±0.00 | 0.3221±0.01 | 0.0836±0.02 | 1.0323±0.25| -0.1127±0.02 |
| TFT (Bryan Lim, et al.) | Alpha158 (with selected 20 features) | 0.0343±0.00 | 0.2071±0.02| 0.0107±0.00 | 0.0660±0.02 | 0.0623±0.02 | 0.5818±0.20| -0.1762±0.01 |
| GRU (Kyunghyun Cho, et al.) | Alpha158 (with selected 20 features) | 0.0311±0.00 | 0.2418±0.04| 0.0425±0.00 | 0.3434±0.02 | 0.0330±0.02 | 0.4805±0.30| -0.1021±0.02 |
| LSTM (Sepp Hochreiter, et al.) | Alpha158 (with selected 20 features) | 0.0312±0.00 | 0.2394±0.04| 0.0418±0.00 | 0.3324±0.03 | 0.0298±0.02 | 0.4198±0.33| -0.1348±0.03 |
| ALSTM (Yao Qin, et al.) | Alpha158 (with selected 20 features) | 0.0385±0.01 | 0.3022±0.06| 0.0478±0.00 | 0.3874±0.04 | 0.0486±0.03 | 0.7141±0.45| -0.1088±0.03 |
| GATs (Petar Velickovic, et al.) | Alpha158 (with selected 20 features) | 0.0349±0.00 | 0.2511±0.01| 0.0457±0.00 | 0.3537±0.01 | 0.0578±0.02 | 0.8221±0.25| -0.0824±0.02 |
| DoubleEnsemble (Chuheng Zhang, et al.) | Alpha158 | 0.0544±0.00 | 0.4338±0.01 | 0.0523±0.00 | 0.4257±0.01 | 0.1253±0.01 | 1.4105±0.14 | -0.0902±0.01 |
| TabNet (Sercan O. Arik, et al.)| Alpha158 | 0.0383±0.00 | 0.3414±0.00| 0.0388±0.00 | 0.3460±0.00 | 0.0226±0.00 | 0.2652±0.00| -0.1072±0.00 |
| Transformer (Ashish Vaswani, et al.)| Alpha158 | 0.0274±0.00 | 0.2166±0.04| 0.0409±0.00 | 0.3342±0.04 | 0.0204±0.03 | 0.2888±0.40| -0.1216±0.04 |
| Localformer (Juyong Jiang, et al.)| Alpha158 | 0.0355±0.00 | 0.2747±0.04| 0.0466±0.00 | 0.3762±0.03 | 0.0506±0.02 | 0.7447±0.34| -0.0875±0.02 |
| TRA (Hengxu Lin, et al.)| Alpha158 (with selected 20 features)| 0.0409±0.00 | 0.3253±0.04 | 0.0488±0.00 | 0.4045±0.02 | 0.0673±0.02 | 1.0389±0.39 | -0.0830±0.02 |
| TRA (Hengxu Lin, et al.)| Alpha158 | 0.0442±0.00 | 0.3426±0.03 | 0.0555±0.00 | 0.4395±0.03 | 0.0833±0.03 | 1.2064±0.36 | -0.0849±0.02 |
| Model Name | Dataset | IC | ICIR | Rank IC | Rank ICIR | Annualized Return | Information Ratio | Max Drawdown |
|------------------------------------------|-------------------------------------|-------------|-------------|-------------|-------------|-------------------|-------------------|--------------|
| TabNet(Sercan O. Arik, et al.) | Alpha158 | 0.0204±0.01 | 0.1554±0.07 | 0.0333±0.00 | 0.2552±0.05 | 0.0227±0.04 | 0.3676±0.54 | -0.1089±0.08 |
| Transformer(Ashish Vaswani, et al.) | Alpha158 | 0.0264±0.00 | 0.2053±0.02 | 0.0407±0.00 | 0.3273±0.02 | 0.0273±0.02 | 0.3970±0.26 | -0.1101±0.02 |
| GRU(Kyunghyun Cho, et al.) | Alpha158(with selected 20 features) | 0.0315±0.00 | 0.2450±0.04 | 0.0428±0.00 | 0.3440±0.03 | 0.0344±0.02 | 0.5160±0.25 | -0.1017±0.02 |
| LSTM(Sepp Hochreiter, et al.) | Alpha158(with selected 20 features) | 0.0318±0.00 | 0.2367±0.04 | 0.0435±0.00 | 0.3389±0.03 | 0.0381±0.03 | 0.5561±0.46 | -0.1207±0.04 |
| Localformer(Juyong Jiang, et al.) | Alpha158 | 0.0356±0.00 | 0.2756±0.03 | 0.0468±0.00 | 0.3784±0.03 | 0.0438±0.02 | 0.6600±0.33 | -0.0952±0.02 |
| SFM(Liheng Zhang, et al.) | Alpha158 | 0.0379±0.00 | 0.2959±0.04 | 0.0464±0.00 | 0.3825±0.04 | 0.0465±0.02 | 0.5672±0.29 | -0.1282±0.03 |
| ALSTM (Yao Qin, et al.) | Alpha158(with selected 20 features) | 0.0362±0.01 | 0.2789±0.06 | 0.0463±0.01 | 0.3661±0.05 | 0.0470±0.03 | 0.6992±0.47 | -0.1072±0.03 |
| GATs (Petar Velickovic, et al.) | Alpha158(with selected 20 features) | 0.0349±0.00 | 0.2511±0.01 | 0.0462±0.00 | 0.3564±0.01 | 0.0497±0.01 | 0.7338±0.19 | -0.0777±0.02 |
| TRA(Hengxu Lin, et al.) | Alpha158(with selected 20 features) | 0.0404±0.00 | 0.3197±0.05 | 0.0490±0.00 | 0.4047±0.04 | 0.0649±0.02 | 1.0091±0.30 | -0.0860±0.02 |
| Linear | Alpha158 | 0.0397±0.00 | 0.3000±0.00 | 0.0472±0.00 | 0.3531±0.00 | 0.0692±0.00 | 0.9209±0.00 | -0.1509±0.00 |
| TRA(Hengxu Lin, et al.) | Alpha158 | 0.0440±0.00 | 0.3535±0.05 | 0.0540±0.00 | 0.4451±0.03 | 0.0718±0.02 | 1.0835±0.35 | -0.0760±0.02 |
| CatBoost(Liudmila Prokhorenkova, et al.) | Alpha158 | 0.0481±0.00 | 0.3366±0.00 | 0.0454±0.00 | 0.3311±0.00 | 0.0765±0.00 | 0.8032±0.01 | -0.1092±0.00 |
| XGBoost(Tianqi Chen, et al.) | Alpha158 | 0.0498±0.00 | 0.3779±0.00 | 0.0505±0.00 | 0.4131±0.00 | 0.0780±0.00 | 0.9070±0.00 | -0.1168±0.00 |
| TFT (Bryan Lim, et al.) | Alpha158(with selected 20 features) | 0.0358±0.00 | 0.2160±0.03 | 0.0116±0.01 | 0.0720±0.03 | 0.0847±0.02 | 0.8131±0.19 | -0.1824±0.03 |
| MLP | Alpha158 | 0.0376±0.00 | 0.2846±0.02 | 0.0429±0.00 | 0.3220±0.01 | 0.0895±0.02 | 1.1408±0.23 | -0.1103±0.02 |
| LightGBM(Guolin Ke, et al.) | Alpha158 | 0.0448±0.00 | 0.3660±0.00 | 0.0469±0.00 | 0.3877±0.00 | 0.0901±0.00 | 1.0164±0.00 | -0.1038±0.00 |
| DoubleEnsemble(Chuheng Zhang, et al.) | Alpha158 | 0.0544±0.00 | 0.4340±0.00 | 0.0523±0.00 | 0.4284±0.01 | 0.1168±0.01 | 1.3384±0.12 | -0.1036±0.01 |
## Alpha360 dataset
| Model Name | Dataset | IC | ICIR | Rank IC | Rank ICIR | Annualized Return | Information Ratio | Max Drawdown |
|-------------------------------------------|----------|-------------|-------------|-------------|-------------|-------------------|-------------------|--------------|
| Transformer(Ashish Vaswani, et al.) | Alpha360 | 0.0114±0.00 | 0.0716±0.03 | 0.0327±0.00 | 0.2248±0.02 | -0.0270±0.03 | -0.3378±0.37 | -0.1653±0.05 |
| TabNet(Sercan O. Arik, et al.) | Alpha360 | 0.0099±0.00 | 0.0593±0.00 | 0.0290±0.00 | 0.1887±0.00 | -0.0369±0.00 | -0.3892±0.00 | -0.2145±0.00 |
| MLP | Alpha360 | 0.0273±0.00 | 0.1870±0.02 | 0.0396±0.00 | 0.2910±0.02 | 0.0029±0.02 | 0.0274±0.23 | -0.1385±0.03 |
| Localformer(Juyong Jiang, et al.) | Alpha360 | 0.0404±0.00 | 0.2932±0.04 | 0.0542±0.00 | 0.4110±0.03 | 0.0246±0.02 | 0.3211±0.21 | -0.1095±0.02 |
| CatBoost(Liudmila Prokhorenkova, et al.) | Alpha360 | 0.0378±0.00 | 0.2714±0.00 | 0.0467±0.00 | 0.3659±0.00 | 0.0292±0.00 | 0.3781±0.00 | -0.0862±0.00 |
| XGBoost(Tianqi Chen, et al.) | Alpha360 | 0.0394±0.00 | 0.2909±0.00 | 0.0448±0.00 | 0.3679±0.00 | 0.0344±0.00 | 0.4527±0.02 | -0.1004±0.00 |
| DoubleEnsemble(Chuheng Zhang, et al.) | Alpha360 | 0.0404±0.00 | 0.3023±0.00 | 0.0495±0.00 | 0.3898±0.00 | 0.0468±0.01 | 0.6302±0.20 | -0.0860±0.01 |
| LightGBM(Guolin Ke, et al.) | Alpha360 | 0.0400±0.00 | 0.3037±0.00 | 0.0499±0.00 | 0.4042±0.00 | 0.0558±0.00 | 0.7632±0.00 | -0.0659±0.00 |
| ALSTM (Yao Qin, et al.) | Alpha360 | 0.0497±0.00 | 0.3829±0.04 | 0.0599±0.00 | 0.4736±0.03 | 0.0626±0.02 | 0.8651±0.31 | -0.0994±0.03 |
| LSTM(Sepp Hochreiter, et al.) | Alpha360 | 0.0448±0.00 | 0.3474±0.04 | 0.0549±0.00 | 0.4366±0.03 | 0.0647±0.03 | 0.8963±0.39 | -0.0875±0.02 |
| GRU(Kyunghyun Cho, et al.) | Alpha360 | 0.0493±0.00 | 0.3772±0.04 | 0.0584±0.00 | 0.4638±0.03 | 0.0720±0.02 | 0.9730±0.33 | -0.0821±0.02 |
| GATs (Petar Velickovic, et al.) | Alpha360 | 0.0476±0.00 | 0.3508±0.02 | 0.0598±0.00 | 0.4604±0.01 | 0.0824±0.02 | 1.1079±0.26 | -0.0894±0.03 |
| TCTS(Xueqing Wu, et al.) | Alpha360 | 0.0508±0.00 | 0.3931±0.04 | 0.0599±0.00 | 0.4756±0.03 | 0.0893±0.03 | 1.2256±0.36 | -0.0857±0.02 |
| TRA(Hengxu Lin, et al.) | Alpha360 | 0.0485±0.00 | 0.3787±0.03 | 0.0587±0.00 | 0.4756±0.03 | 0.0920±0.03 | 1.2789±0.42 | -0.0834±0.02 |
- The selected 20 features are based on the feature importance of a lightgbm-based model.
- The base model of DoubleEnsemble is LGBM.
- The base model of TCTS is GRU.
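Each cell in the tables above is a mean and standard deviation over repeated runs. A minimal sketch of how such a cell could be produced from per-seed metric values follows; the helper `format_metric`, the use of the population standard deviation, and the sample values are assumptions, since the source does not say which estimator was used.

```python
import statistics
from typing import Sequence


def format_metric(values: Sequence[float]) -> str:
    """Format per-run metric values as 'mean±std' in the style of the tables above."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population std; the source does not specify
    return f"{mean:.4f}±{std:.2f}"


# Made-up IC values from three hypothetical seeds.
ic_runs = [0.0480, 0.0500, 0.0520]
cell = format_metric(ic_runs)
```

Because the standard deviation is printed with two decimals, models whose run-to-run variation is small show `±0.00` even when the runs are not identical.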


@@ -1,4 +1,4 @@
pandas==1.1.2
numpy==1.17.4
scikit_learn==0.23.2
torch==1.7.0
torch==1.7.0


@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: SFM
@@ -73,7 +78,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:


@@ -1,52 +1,38 @@
# Temporally Correlated Task Scheduling for Sequence Learning
We provide the [code](https://github.com/microsoft/qlib/blob/main/qlib/contrib/model/pytorch_tcts.py) for reproducing the stock trend forecasting experiments.
### Background
Sequence learning has attracted much research attention from the machine learning community in recent years. In many applications, a sequence learning task is associated with multiple temporally correlated auxiliary tasks, which differ in how much input information they use or which future step they predict. In stock trend forecasting, as demonstrated in Figure 1, one can predict the price of a stock on different future days (e.g., tomorrow, the day after tomorrow). In this paper, we propose a framework to make use of those temporally correlated tasks to help each other.
<p align="center">
<img src="task_description.png" width="600" height="200"/>
</p>
### Method
Given that there are usually multiple temporally correlated tasks, the key challenge lies in which tasks to use and when to use them in the training process. In this work, we introduce a learnable task scheduler for sequence learning, which adaptively selects temporally correlated tasks during the training process. The scheduler accesses the model status and the current training data (e.g., in current minibatch), and selects the best auxiliary task to help the training of the main task. The scheduler and the model for the main task are jointly trained through bi-level optimization: the scheduler is trained to maximize the validation performance of the model, and the model is trained to minimize the training loss guided by the scheduler. The process is demonstrated in Figure2.
Given that there are usually multiple temporally correlated tasks, the key challenge lies in which tasks to use and when to use them in the training process. This work introduces a learnable task scheduler for sequence learning, which adaptively selects temporally correlated tasks during the training process. The scheduler accesses the model status and the current training data (e.g., in the current minibatch) and selects the best auxiliary task to help the training of the main task. The scheduler and the model for the main task are jointly trained through bi-level optimization: the scheduler is trained to maximize the validation performance of the model, and the model is trained to minimize the training loss guided by the scheduler. The process is demonstrated in Figure 2.
<p align="center">
<img src="workflow.png"/>
</p>
At step <img src="https://render.githubusercontent.com/render/math?math=s">, with training data <img src="https://render.githubusercontent.com/render/math?math=x_s,y_s">, the scheduler <img src="https://render.githubusercontent.com/render/math?math=\varphi"> chooses a suitable task <img src="https://render.githubusercontent.com/render/math?math=T_{i_s}"> (green solid lines) to update the model <img src="https://render.githubusercontent.com/render/math?math=f"> (blue solid lines). After <img src="https://render.githubusercontent.com/render/math?math=S"> steps, we evaluate the model <img src="https://render.githubusercontent.com/render/math?math=f"> on the validation set and update the scheduler <img src="https://render.githubusercontent.com/render/math?math=\varphi"> (green dashed lines).
### DataSet
* We use the historical transaction data for 300 stocks on [CSI300](http://www.csindex.com.cn/en/indices/index-detail/000300) from 01/01/2008 to 08/01/2020.
* We split the data into training (01/01/2008-12/31/2013), validation (01/01/2014-12/31/2015), and test sets (01/01/2016-08/01/2020) based on the transaction time.
At step <img src="https://latex.codecogs.com/png.latex?s" title="s" />, with training data <img src="https://latex.codecogs.com/png.latex?x_s,y_s" title="x_s,y_s" />, the scheduler <img src="https://latex.codecogs.com/png.latex?\varphi" title="\varphi" /> chooses a suitable task <img src="https://latex.codecogs.com/png.latex?T_{i_s}" title="T_{i_s}" /> (green solid lines) to update the model <img src="https://latex.codecogs.com/png.latex?f" title="f" /> (blue solid lines). After <img src="https://latex.codecogs.com/png.latex?S" title="S" /> steps, we evaluate the model <img src="https://latex.codecogs.com/png.latex?f" title="f" /> on the validation set and update the scheduler <img src="https://latex.codecogs.com/png.latex?\varphi" title="\varphi" /> (green dashed lines).
### Experiments
#### Task Description
* The main tasks <img src="https://render.githubusercontent.com/render/math?math=T_k"> (<img src="https://render.githubusercontent.com/render/math?math=task_k"> in Figure1) refers to forecasting return of stock <img src="https://render.githubusercontent.com/render/math?math=i"> as following,
Because both the data version and the Qlib version have changed, the raw data and the preprocessing used for the experiments in the paper differ from those in the current Qlib version. We therefore provide two versions of the code, one per setting: 1) the [code](https://github.com/lwwang1995/tcts) that can be used to reproduce the experimental results and 2) the [code](https://github.com/microsoft/qlib/blob/main/qlib/contrib/model/pytorch_tcts.py) in the current Qlib baseline.
#### Setting1
* Dataset: We use the historical transaction data for 300 stocks on [CSI300](http://www.csindex.com.cn/en/indices/index-detail/000300) from 01/01/2008 to 08/01/2020. We split the data into training (01/01/2008-12/31/2013), validation (01/01/2014-12/31/2015), and test sets (01/01/2016-08/01/2020) based on the transaction time.
* The main task <img src="https://latex.codecogs.com/png.latex?T_k" title="T_k" /> refers to forecasting the return of stock <img src="https://latex.codecogs.com/png.latex?i" title="i" /> as follows:
<div align=center>
<img src="https://render.githubusercontent.com/render/math?math=r_{i}^k = \frac{\price_i^{t+k}}{\price_i^{t+k-1}} - 1">
<img src="https://latex.codecogs.com/png.image?\dpi{110}&space;r_{i}^{t,k}&space;=&space;\frac{price_i^{t&plus;k}}{price_i^{t&plus;k-1}}-1" title="r_{i}^{t,k} = \frac{price_i^{t+k}}{price_i^{t+k-1}}-1" />
</div>
* Temporally correlated task sets <img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_k = \{T_1, T_2, ... , T_k\}">, in this paper, <img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_3">, <img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_5"> and <img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_10"> are used.
#### Baselines
* GRU/MLP/LightGBM (LGB)/Graph Attention Networks (GAT)
* Multi-task learning (MTL): In multi-task learning, multiple tasks are jointly trained and mutually boosted. Each task is treated equally, while in our setting, we focus on the main task.
* Curriculum transfer learning (CL): Transfer learning also leverages auxiliary tasks to boost the main task. [Curriculum transfer learning](https://arxiv.org/pdf/1804.00810.pdf) is one kind of transfer learning which schedules auxiliary tasks according to certain rules. Our problem can also be regarded as a special kind of transfer learning, where the auxiliary tasks are temporally correlated with the main task. Our learning process is dynamically controlled by a scheduler rather than some pre-defined rules. In the CL baseline, we start from the task <img src="https://render.githubusercontent.com/render/math?math=T_1" >, then <img src="https://render.githubusercontent.com/render/math?math=T_2" >, and gradually move to the last one.
#### Result
| Methods | <img src="https://render.githubusercontent.com/render/math?math=T_1" > | <img src="https://render.githubusercontent.com/render/math?math=T_2"> | <img src="https://render.githubusercontent.com/render/math?math=T_3"> |
| :----: | :----: | :----: | :----: |
| GRU | 0.049 / 1.903 | 0.018 / 1.972 | 0.014 / 1.989 |
| MLP | 0.023 / 1.961 | 0.022 / 1.962 | 0.015 / 1.978 |
| LGB | 0.038 / 1.883 | 0.023 / 1.952 | 0.007 / 1.987 |
| GAT | 0.052 / 1.898 | 0.024 / 1.954 | 0.015 / 1.973 |
| MTL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_3">) | 0.061 / 1.862 | 0.023 / 1.942 | 0.012 / 1.956 |
| CL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_3">) | 0.051 / 1.880 | 0.028 / 1.941 | 0.016 / 1.962 |
| Ours(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_3">) | 0.071 / 1.851 | 0.030 / 1.939 | 0.017 / 1.963 |
| MTL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_5">) | 0.057 / 1.875 | 0.021 / 1.939 | 0.017 / 1.959 |
| CL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_5">) | 0.056 / 1.877 | 0.028 / 1.942 | 0.015 / 1.962 |
| Ours(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_5">) | 0.075 / 1.849 | 0.032 / 1.939 | 0.021 / 1.955 |
| MTL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_{10}">) | 0.052 / 1.882 | 0.020 / 1.947 | 0.019 / 1.952 |
| CL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_{10}">) | 0.051 / 1.882 | 0.028 / 1.950 | 0.016 / 1.961 |
| Ours(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_{10}">) | 0.067 / 1.867 | 0.030 / 1.960 | 0.022 / 1.942 |
* Temporally correlated task sets <img src="https://latex.codecogs.com/png.latex?\mathcal{T}_k&space;=&space;\{T_1,&space;T_2,&space;...&space;,&space;T_k\}" title="\mathcal{T}_k = \{T_1, T_2, ... , T_k\}" />. In this paper, <img src="https://latex.codecogs.com/png.latex?\mathcal{T}_3" title="\mathcal{T}_3" />, <img src="https://latex.codecogs.com/png.latex?\mathcal{T}_5" title="\mathcal{T}_5" /> and <img src="https://latex.codecogs.com/png.latex?\mathcal{T}_{10}" title="\mathcal{T}_{10}" /> are used for <img src="https://latex.codecogs.com/png.latex?T_1" title="T_1" />, <img src="https://latex.codecogs.com/png.latex?T_2" title="T_2" />, and <img src="https://latex.codecogs.com/png.latex?T_3" title="T_3" />.
#### Setting2
* Dataset: We use the historical transaction data for 300 stocks on [CSI300](http://www.csindex.com.cn/en/indices/index-detail/000300) from 01/01/2008 to 08/01/2020. We split the data into training (01/01/2008-12/31/2014), validation (01/01/2015-12/31/2016), and test sets (01/01/2017-08/01/2020) based on the transaction time.
* The main task <img src="https://latex.codecogs.com/png.latex?T_k" title="T_k" /> refers to forecasting the return of stock <img src="https://latex.codecogs.com/png.latex?i" title="i" />, defined as follows:
<div align=center>
<img src="https://latex.codecogs.com/png.image?\dpi{110}&space;r_{i}^{t,k}&space;=&space;\frac{price_i^{t&plus;1&plus;k}}{price_i^{t&plus;1}}-1" title="r_{i}^{t,k} = \frac{price_i^{t+1+k}}{price_i^{t+1}}-1" />
</div>
* In the Qlib baseline, <img src="https://latex.codecogs.com/png.latex?\mathcal{T}_3" title="\mathcal{T}_3" /> is used for <img src="https://latex.codecogs.com/png.latex?T_1" title="T_1" />.
### Experimental Result
You can find the experimental results of Setting1 in the [paper](http://proceedings.mlr.press/v139/wu21e/wu21e.pdf) and the experimental results of Setting2 on this [page](https://github.com/microsoft/qlib/tree/main/examples/benchmarks).
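
As a rough illustration of the Setting2 label definition above, the sketch below computes the k-step return for a single stock from a plain list of closing prices. The function names `forward_return` and `task_set_labels` and the sample prices are made up for this example and are not part of Qlib; it also assumes that `price` means the daily close.

```python
from typing import List, Optional


def forward_return(close: List[float], t: int, k: int) -> Optional[float]:
    """Return close[t+1+k] / close[t+1] - 1, or None if the window leaves the data."""
    start, end = t + 1, t + 1 + k
    if t < 0 or k < 1 or end >= len(close):
        return None
    return close[end] / close[start] - 1.0


def task_set_labels(close: List[float], t: int, horizon: int = 3) -> List[Optional[float]]:
    """Labels for the task set {T_1, ..., T_horizon} at time t."""
    return [forward_return(close, t, k) for k in range(1, horizon + 1)]


# Hypothetical closing prices for one stock, indexed by trading day.
close = [10.0, 10.0, 11.0, 12.1, 12.1, 13.31]
labels = task_set_labels(close, t=0, horizon=3)
print(labels)
```

With k = 1, 2, 3 this appears to mirror the three label expressions of the form `Ref($close, -(k+1)) / Ref($close, -1) - 1` used in the TCTS workflow config.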

View File

@@ -0,0 +1,4 @@
pandas==1.1.2
numpy==1.17.4
scikit_learn==0.23.2
torch==1.7.0

Binary file not shown.


View File

@@ -22,25 +22,30 @@ data_handler_config: &data_handler_config
- class: CSRankNorm
kwargs:
fields_group: label
label: ["Ref($close, -1) / $close - 1",
"Ref($close, -2) / Ref($close, -1) - 1",
"Ref($close, -3) / Ref($close, -2) - 1"]
label: ["Ref($close, -2) / Ref($close, -1) - 1",
"Ref($close, -3) / Ref($close, -1) - 1",
"Ref($close, -4) / Ref($close, -1) - 1"]
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: TCTS
@@ -49,9 +54,8 @@ task:
d_feat: 6
hidden_size: 64
num_layers: 2
dropout: 0.0
dropout: 0.3
n_epochs: 200
lr: 1e-3
early_stop: 20
batch_size: 800
metric: loss
@@ -60,10 +64,10 @@ task:
fore_optimizer: adam
weight_optimizer: adam
output_dim: 3
fore_lr: 5e-4
weight_lr: 5e-4
fore_lr: 2e-3
weight_lr: 2e-3
steps: 3
target_label: 1
target_label: 0
lowest_valid_performance: 0.993
dataset:
class: DatasetH
@@ -80,13 +84,14 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
ana_long_short: False
ann_scaler: 252
label_col: 1
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -195,7 +195,8 @@ class Alpha158Formatter(GenericDataFormatter):
for col in column_names:
if col not in {"forecast_time", "identifier"}:
output[col] = self._target_scaler.inverse_transform(predictions[col])
# Using [col] is for aligning with the format when fitting
output[col] = self._target_scaler.inverse_transform(predictions[[col]])
return output

View File

@@ -304,11 +304,18 @@ class TFTModel(ModelFT):
path : Union[Path, str]
the target path to be dumped
"""
# FIXME: implementing saving tensorflow models
# save tensorflow model
# path = Path(path)
# path.mkdir(parents=True)
# self.model.save(path)
# save qlib model wrapper
self.model = None
super(TFTModel, self).to_pickle(path / "qlib_model")
drop_attrs = ["model", "tf_graph", "sess", "data_formatter"]
orig_attr = {}
for attr in drop_attrs:
orig_attr[attr] = getattr(self, attr)
setattr(self, attr, None)
super(TFTModel, self).to_pickle(path)
for attr in drop_attrs:
setattr(self, attr, orig_attr[attr])
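
The drop-and-restore pattern in the new `to_pickle` can also be written with `try`/`finally`, so that the attributes come back even if serialization raises. The following is only a sketch: `without_attrs`, `Wrapper`, and the use of a `threading.Lock` as a stand-in for an unpicklable TensorFlow session are assumptions for illustration, and it pickles the attribute dict rather than calling Qlib's `to_pickle`.

```python
import pickle
import threading
from contextlib import contextmanager


@contextmanager
def without_attrs(obj, names):
    """Temporarily set the given attributes to None, restoring them afterwards."""
    saved = {name: getattr(obj, name) for name in names}
    try:
        for name in names:
            setattr(obj, name, None)
        yield obj
    finally:
        # Runs even if the body raises, so the object is never left stripped.
        for name, value in saved.items():
            setattr(obj, name, value)


class Wrapper:
    """Hypothetical model wrapper holding one unpicklable handle."""

    def __init__(self):
        self.params = {"lr": 1e-3}
        self.sess = threading.Lock()  # stands in for a session/graph object

    def dumps(self) -> bytes:
        with without_attrs(self, ["sess"]):
            # Serialize the plain attribute dict while `sess` is None.
            return pickle.dumps(vars(self))


w = Wrapper()
blob = w.dumps()
print(pickle.loads(blob))
```

The trade-off is the same as in the diff above: whatever is dropped is not saved, so the loaded object still needs its model and session rebuilt separately.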

View File

@@ -14,19 +14,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: TFTModel
@@ -46,7 +51,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -69,6 +69,7 @@ After running the scripts, you can find result files in path `./output`:
* `pred.pkl` - the prediction scores and output for inference.
Evaluation metrics reported in the paper:
This result is generated by qlib==0.7.1.
| Methods | MSE| MAE| IC | ICIR | AR | AV | SR | MDD |
|-------|-------|------|-----|-----|-----|-----|-----|-----|

View File

@@ -0,0 +1,5 @@
pandas==1.1.2
numpy==1.17.4
scikit_learn==0.23.2
torch==1.7.0
seaborn

View File

@@ -38,7 +38,7 @@ class TRAModel(Model):
model_init_state=None,
lamb=0.0,
rho=0.99,
seed=0,
seed=None,
logdir=None,
eval_train=True,
eval_test=False,

View File

@@ -53,21 +53,26 @@ model_config: &model_config
dropout: 0.0
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
kwargs:
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
@@ -117,13 +122,15 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
kwargs:
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
kwargs:
config: *port_analysis_config

View File

@@ -47,21 +47,26 @@ model_config: &model_config
dropout: 0.2
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
kwargs:
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
@@ -111,10 +116,12 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
kwargs:
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord

View File

@@ -47,21 +47,26 @@ model_config: &model_config
dropout: 0.0
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
kwargs:
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
@@ -111,10 +116,12 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
kwargs:
ana_long_short: False
ann_scaler: 252
- class: PortAnaRecord

View File

@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: TabnetModel
@@ -46,6 +51,7 @@ task:
kwargs:
d_feat: 158
pretrain: True
seed: 993
dataset:
class: DatasetH
module_path: qlib.data.dataset
@@ -63,7 +69,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: TabnetModel
@@ -46,6 +51,7 @@ task:
kwargs:
d_feat: 360
pretrain: True
seed: 993
dataset:
class: DatasetH
module_path: qlib.data.dataset
@@ -63,7 +69,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -34,19 +34,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: TransformerModel
@@ -70,7 +75,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -26,19 +26,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: TransformerModel
@@ -61,7 +66,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
@@ -70,4 +77,4 @@ task:
- class: PortAnaRecord
module_path: qlib.workflow.record_temp
kwargs:
config: *port_analysis_config
config: *port_analysis_config

View File

@@ -12,19 +12,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: XGBModel
@@ -52,7 +57,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -19,19 +19,24 @@ data_handler_config: &data_handler_config
port_analysis_config: &port_analysis_config
strategy:
class: TopkDropoutStrategy
module_path: qlib.contrib.strategy.strategy
module_path: qlib.contrib.strategy
kwargs:
signal:
- <MODEL>
- <DATASET>
topk: 50
n_drop: 5
backtest:
verbose: False
limit_threshold: 0.095
start_time: 2017-01-01
end_time: 2020-08-01
account: 100000000
benchmark: *benchmark
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
exchange_kwargs:
limit_threshold: 0.095
deal_price: close
open_cost: 0.0005
close_cost: 0.0015
min_cost: 5
task:
model:
class: XGBModel
@@ -59,7 +64,9 @@ task:
record:
- class: SignalRecord
module_path: qlib.workflow.record_temp
kwargs: {}
kwargs:
model: <MODEL>
dataset: <DATASET>
- class: SigAnaRecord
module_path: qlib.workflow.record_temp
kwargs:

View File

@@ -5,30 +5,7 @@ from qlib.data.ops import ElemOperator, PairOperator
from qlib.config import C
from qlib.data.cache import H
from qlib.data.data import Cal
def get_calendar_day(freq="day", future=False):
"""Load High-Freq Calendar Date Using Memcache.
Parameters
----------
freq : str
frequency of read calendar file.
future : bool
whether including future trading day.
Returns
-------
_calendar:
array of date.
"""
flag = f"{freq}_future_{future}_day"
if flag in H["c"]:
_calendar = H["c"][flag]
else:
_calendar = np.array(list(map(lambda x: pd.Timestamp(x.date()), Cal.load_calendar(freq, future))))
H["c"][flag] = _calendar
return _calendar
from qlib.contrib.ops.high_freq import get_calendar_day
class DayLast(ElemOperator):

View File

@@ -59,7 +59,7 @@ task:
record:
- class: "SignalRecord"
module_path: "qlib.workflow.record_temp"
kwargs: {}
kwargs:
- class: "HFSignalRecord"
module_path: "qlib.workflow.record_temp"
kwargs: {}

View File

@@ -0,0 +1,30 @@
# Nested Decision Execution
This workflow is an example of nested decision execution in backtesting. Qlib supports nested decision execution, which means that users can apply different strategies to make trading decisions at different frequencies.
## Weekly Portfolio Generation and Daily Order Execution
This workflow provides an example that uses a TopkDropoutStrategy (a strategy based on the daily-frequency LightGBM model) at weekly frequency for portfolio generation and uses SBBStrategyEMA (a rule-based strategy that uses EMA for decision-making) to execute orders at daily frequency.
### Usage
Start backtesting by running the following command:
```bash
python workflow.py backtest
```
Start collecting data by running the following command:
```bash
python workflow.py collect_data
```
## Daily Portfolio Generation and Minutely Order Execution
This workflow also provides a high-frequency example that uses a TopkDropoutStrategy for portfolio generation at daily frequency and uses SBBStrategyEMA to execute orders at minutely frequency.
### Usage
Start backtesting by running the following command:
```bash
python workflow.py backtest_highfreq
```
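
To make the nesting idea concrete, here is a toy sketch that is independent of Qlib: an outer decision (a target amount for one period) is split evenly across inner steps, and each inner chunk is split again one level down. The helper names `twap_split` and `nested_schedule` and the step counts (8 and 6) are illustrative assumptions, not Qlib APIs or defaults.

```python
from typing import List


def twap_split(total_amount: float, n_steps: int) -> List[float]:
    """Split an order evenly over n_steps (a time-weighted average price style split)."""
    if n_steps <= 0:
        raise ValueError("n_steps must be positive")
    return [total_amount / n_steps] * n_steps


def nested_schedule(outer_amount: float, outer_steps: int, inner_steps: int) -> List[List[float]]:
    """Apply the even split twice: outer period -> middle steps -> innermost steps."""
    return [twap_split(chunk, inner_steps) for chunk in twap_split(outer_amount, outer_steps)]


# One outer decision of 4800 shares, executed over 8 middle steps of 6 inner steps each.
schedule = nested_schedule(4800.0, outer_steps=8, inner_steps=6)
print(len(schedule), len(schedule[0]), schedule[0][0])
```

In the real workflow each level is a separate strategy and executor pair, so the inner levels can react to market data instead of splitting evenly.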

View File

@@ -0,0 +1,204 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
import qlib
import fire
from qlib.config import REG_CN, HIGH_FREQ_CONFIG
from qlib.data import D
from qlib.utils import exists_qlib_data, init_instance_by_config, flatten_dict
from qlib.workflow import R
from qlib.workflow.record_temp import SignalRecord, PortAnaRecord
from qlib.tests.data import GetData
from qlib.backtest import collect_data
class NestedDecisionExecutionWorkflow:
market = "csi300"
benchmark = "SH000300"
data_handler_config = {
"start_time": "2008-01-01",
"end_time": "2021-05-31",
"fit_start_time": "2008-01-01",
"fit_end_time": "2014-12-31",
"instruments": market,
}
task = {
"model": {
"class": "LGBModel",
"module_path": "qlib.contrib.model.gbdt",
"kwargs": {
"loss": "mse",
"colsample_bytree": 0.8879,
"learning_rate": 0.0421,
"subsample": 0.8789,
"lambda_l1": 205.6999,
"lambda_l2": 580.9768,
"max_depth": 8,
"num_leaves": 210,
"num_threads": 20,
},
},
"dataset": {
"class": "DatasetH",
"module_path": "qlib.data.dataset",
"kwargs": {
"handler": {
"class": "Alpha158",
"module_path": "qlib.contrib.data.handler",
"kwargs": data_handler_config,
},
"segments": {
"train": ("2007-01-01", "2014-12-31"),
"valid": ("2015-01-01", "2016-12-31"),
"test": ("2020-01-01", "2021-05-31"),
},
},
},
}
port_analysis_config = {
"executor": {
"class": "NestedExecutor",
"module_path": "qlib.backtest.executor",
"kwargs": {
"time_per_step": "day",
"inner_executor": {
"class": "NestedExecutor",
"module_path": "qlib.backtest.executor",
"kwargs": {
"time_per_step": "30min",
"inner_executor": {
"class": "SimulatorExecutor",
"module_path": "qlib.backtest.executor",
"kwargs": {
"time_per_step": "5min",
"generate_portfolio_metrics": True,
"verbose": True,
"indicator_config": {
"show_indicator": True,
},
},
},
"inner_strategy": {
"class": "TWAPStrategy",
"module_path": "qlib.contrib.strategy.rule_strategy",
},
"generate_portfolio_metrics": True,
"indicator_config": {
"show_indicator": True,
},
},
},
"inner_strategy": {
"class": "SBBStrategyEMA",
"module_path": "qlib.contrib.strategy.rule_strategy",
"kwargs": {
"instruments": market,
"freq": "1min",
},
},
"track_data": True,
"generate_portfolio_metrics": True,
"indicator_config": {
"show_indicator": True,
},
},
},
"backtest": {
"start_time": "2020-09-20",
"end_time": "2021-05-20",
"account": 100000000,
"exchange_kwargs": {
"freq": "1min",
"limit_threshold": 0.095,
"deal_price": "close",
"open_cost": 0.0005,
"close_cost": 0.0015,
"min_cost": 5,
},
},
}
def _init_qlib(self):
"""initialize qlib"""
provider_uri_day = "~/.qlib/qlib_data/cn_data" # target_dir
GetData().qlib_data(target_dir=provider_uri_day, region=REG_CN, version="v2", exists_skip=True)
provider_uri_1min = HIGH_FREQ_CONFIG.get("provider_uri")
GetData().qlib_data(
target_dir=provider_uri_1min, interval="1min", region=REG_CN, version="v2", exists_skip=True
)
provider_uri_map = {"1min": provider_uri_1min, "day": provider_uri_day}
qlib.init(provider_uri=provider_uri_map, dataset_cache=None, expression_cache=None)
def _train_model(self, model, dataset):
with R.start(experiment_name="train"):
R.log_params(**flatten_dict(self.task))
model.fit(dataset)
R.save_objects(**{"params.pkl": model})
# prediction
recorder = R.get_recorder()
sr = SignalRecord(model, dataset, recorder)
sr.generate()
def backtest(self):
self._init_qlib()
model = init_instance_by_config(self.task["model"])
dataset = init_instance_by_config(self.task["dataset"])
self._train_model(model, dataset)
strategy_config = {
"class": "TopkDropoutStrategy",
"module_path": "qlib.contrib.strategy.signal_strategy",
"kwargs": {
"signal": (model, dataset),
"topk": 50,
"n_drop": 5,
},
}
self.port_analysis_config["strategy"] = strategy_config
self.port_analysis_config["backtest"]["benchmark"] = self.benchmark
with R.start(experiment_name="backtest"):
recorder = R.get_recorder()
par = PortAnaRecord(
recorder,
self.port_analysis_config,
risk_analysis_freq=["day", "30min", "5min"],
indicator_analysis_freq=["day", "30min", "5min"],
indicator_analysis_method="value_weighted",
)
par.generate()
# users can analyze the positions with the following methods
# report_normal_df = recorder.load_object("portfolio_analysis/report_normal_1day.pkl")
# from qlib.contrib.report import analysis_position
# analysis_position.report_graph(report_normal_df)
def collect_data(self):
self._init_qlib()
model = init_instance_by_config(self.task["model"])
dataset = init_instance_by_config(self.task["dataset"])
self._train_model(model, dataset)
executor_config = self.port_analysis_config["executor"]
backtest_config = self.port_analysis_config["backtest"]
backtest_config["benchmark"] = self.benchmark
strategy_config = {
"class": "TopkDropoutStrategy",
"module_path": "qlib.contrib.strategy.signal_strategy",
"kwargs": {
"signal": (model, dataset),
"topk": 50,
"n_drop": 5,
},
}
data_generator = collect_data(executor=executor_config, strategy=strategy_config, **backtest_config)
for trade_decision in data_generator:
print(trade_decision)
if __name__ == "__main__":
fire.Fire(NestedDecisionExecutionWorkflow)

View File

@@ -21,7 +21,6 @@ class RollingDataWorkflow:
def _init_qlib(self):
"""initialize qlib"""
# use yahoo_cn_1min data
provider_uri = "~/.qlib/qlib_data/cn_data" # target_dir
GetData().qlib_data(target_dir=provider_uri, region=REG_CN, exists_skip=True)
qlib.init(provider_uri=provider_uri, region=REG_CN)

View File

@@ -6,6 +6,7 @@ import sys
import fire
import time
import glob
import yaml
import shutil
import signal
import inspect
@@ -23,22 +24,6 @@ from qlib.config import REG_CN
from qlib.workflow import R
from qlib.tests.data import GetData
# init qlib
provider_uri = "~/.qlib/qlib_data/cn_data"
exp_folder_name = "run_all_model_records"
exp_path = str(Path(os.getcwd()).resolve() / exp_folder_name)
exp_manager = {
"class": "MLflowExpManager",
"module_path": "qlib.workflow.expm",
"kwargs": {
"uri": "file:" + exp_path,
"default_exp_name": "Experiment",
},
}
GetData().qlib_data(target_dir=provider_uri, region=REG_CN, exists_skip=True)
qlib.init(provider_uri=provider_uri, region=REG_CN, exp_manager=exp_manager)
# decorator to check the arguments
def only_allow_defined_args(function_to_decorate):
@@ -88,11 +73,11 @@ def create_env():
sys.stderr.write("\n")
# get anaconda activate path
conda_activate = Path(os.environ["CONDA_PREFIX"]) / "bin" / "activate" # TODO: FIX ME!
return env_path, python_path, conda_activate
return temp_dir, env_path, python_path, conda_activate
# function to execute the cmd
def execute(cmd, wait_when_err=False):
def execute(cmd, wait_when_err=False, raise_err=True):
print("Running CMD:", cmd)
with subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=1, universal_newlines=True, shell=True) as p:
for line in p.stdout:
@@ -105,6 +90,8 @@ def execute(cmd, wait_when_err=False):
if p.returncode != 0:
if wait_when_err:
input("Press Enter to Continue")
if raise_err:
raise RuntimeError(f"Error when executing command: {cmd}")
return p.stderr
else:
return None
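
The `raise_err` behaviour added to `execute` can be sketched in isolation as follows. This is a simplified, hypothetical variant (`run_cmd` is not part of the script): it uses `subprocess.run` with an argument list instead of `shell=True`, and it captures stderr explicitly, which the `Popen` call above does not.

```python
import subprocess
import sys
from typing import List, Optional


def run_cmd(cmd: List[str], raise_err: bool = True) -> Optional[str]:
    """Run a command; on failure either raise or return the captured stderr."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    if proc.returncode != 0:
        if raise_err:
            raise RuntimeError(f"Error when executing command: {cmd}")
        return proc.stderr
    return None


# Two hypothetical commands: one succeeds, one exits with a non-zero status.
ok = [sys.executable, "-c", "print('hello')"]
bad = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]

print(run_cmd(ok))
print(run_cmd(bad, raise_err=False))
```

Raising by default makes a failed install stop the run early rather than surfacing later as a confusing import error.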
@@ -134,14 +121,23 @@ def get_all_folders(models, exclude) -> dict:
def get_all_files(folder_path, dataset) -> (str, str):
yaml_path = str(Path(f"{folder_path}") / f"*{dataset}*.yaml")
req_path = str(Path(f"{folder_path}") / f"*.txt")
return glob.glob(yaml_path)[0], glob.glob(req_path)[0]
yaml_file = glob.glob(yaml_path)
req_file = glob.glob(req_path)
if len(yaml_file) == 0:
return None, None
else:
return yaml_file[0], req_file[0]
# function to retrieve all the results
def get_all_results(folders) -> dict:
results = dict()
for fn in folders:
exp = R.get_exp(experiment_name=fn, create=False)
try:
exp = R.get_exp(experiment_name=fn, create=False)
except ValueError:
# No experiment results
continue
recorders = exp.list_recorders()
result = dict()
result["annualized_return_with_cost"] = list()
@@ -155,9 +151,12 @@ def get_all_results(folders) -> dict:
if recorders[recorder_id].status == "FINISHED":
recorder = R.get_recorder(recorder_id=recorder_id, experiment_name=fn)
metrics = recorder.list_metrics()
result["annualized_return_with_cost"].append(metrics["excess_return_with_cost.annualized_return"])
result["information_ratio_with_cost"].append(metrics["excess_return_with_cost.information_ratio"])
result["max_drawdown_with_cost"].append(metrics["excess_return_with_cost.max_drawdown"])
if "1day.excess_return_with_cost.annualized_return" not in metrics:
print(f"{recorder_id} is skipped due to incomplete result")
continue
result["annualized_return_with_cost"].append(metrics["1day.excess_return_with_cost.annualized_return"])
result["information_ratio_with_cost"].append(metrics["1day.excess_return_with_cost.information_ratio"])
result["max_drawdown_with_cost"].append(metrics["1day.excess_return_with_cost.max_drawdown"])
result["ic"].append(metrics["IC"])
result["icir"].append(metrics["ICIR"])
result["rank_ic"].append(metrics["Rank IC"])
@@ -185,138 +184,202 @@ def gen_and_save_md_table(metrics, dataset):
return table
# function to run all the models
@only_allow_defined_args
def run(
times=1,
models=None,
dataset="Alpha360",
exclude=False,
qlib_uri: str = "git+https://github.com/microsoft/qlib#egg=pyqlib",
wait_before_rm_env: bool = False,
wait_when_err: bool = False,
):
"""
Please be aware that this function can only work under Linux. MacOS and Windows will be supported in the future.
Any PR to enhance this method is highly welcomed. Besides, this script doesn't support parrallel running the same model
for multiple times, and this will be fixed in the future development.
# read yaml, remove seed kwargs of model, and then save file in the temp_dir
def gen_yaml_file_without_seed_kwargs(yaml_path, temp_dir):
with open(yaml_path, "r") as fp:
config = yaml.safe_load(fp)  # yaml.load without an explicit Loader fails on PyYAML >= 6
try:
del config["task"]["model"]["kwargs"]["seed"]
except KeyError:
# If the key does not exist, use the original yaml
# NOTE: this is very important if the model must run in its original path (when sys.rel_path is used)
return yaml_path
else:
# otherwise, generating a new yaml without random seed
file_name = yaml_path.split("/")[-1]
temp_path = os.path.join(temp_dir, file_name)
with open(temp_path, "w") as fp:
yaml.dump(config, fp)
return temp_path
Parameters:
-----------
times : int
determines how many times the model should be running.
models : str or list
determines the specific model or list of models to run or exclude.
exclude : boolean
determines whether the model being used is excluded or included.
dataset : str
determines the dataset to be used for each model.
qlib_uri : str
the uri to install qlib with pip
it could be a url on the web or a local path
wait_before_rm_env : bool
wait before remove environment.
wait_when_err : bool
wait when errors raised when executing commands
Usage:
-------
Here are some use cases of the function in the bash:
class ModelRunner:
def _init_qlib(self, exp_folder_name):
# init qlib
GetData().qlib_data(exists_skip=True)
qlib.init(
exp_manager={
"class": "MLflowExpManager",
"module_path": "qlib.workflow.expm",
"kwargs": {
"uri": "file:" + str(Path(os.getcwd()).resolve() / exp_folder_name),
"default_exp_name": "Experiment",
},
}
)
.. code-block:: bash
# function to run all the models
@only_allow_defined_args
def run(
self,
times=1,
models=None,
dataset="Alpha360",
exclude=False,
qlib_uri: str = "git+https://github.com/microsoft/qlib#egg=pyqlib",
exp_folder_name: str = "run_all_model_records",
wait_before_rm_env: bool = False,
wait_when_err: bool = False,
):
"""
Please be aware that this function can only work under Linux. MacOS and Windows will be supported in the future.
Any PR to enhance this method is highly welcomed. Besides, this script doesn't support parallel running the same model
for multiple times, and this will be fixed in the future development.
# Case 1 - run all models multiple times
python run_all_model.py 3
Parameters:
-----------
times : int
determines how many times the model should be running.
models : str or list
determines the specific model or list of models to run or exclude.
exclude : boolean
determines whether the model being used is excluded or included.
dataset : str
determines the dataset to be used for each model.
qlib_uri : str
the uri to install qlib with pip
it could be a url on the web or a local path
exp_folder_name: str
the name of the experiment folder
wait_before_rm_env : bool
wait before remove environment.
wait_when_err : bool
wait when errors raised when executing commands
# Case 2 - run specific models multiple times
python run_all_model.py 3 mlp
Usage:
-------
Here are some use cases of the function in the bash:
# Case 3 - run specific models multiple times with specific dataset
python run_all_model.py 3 mlp Alpha158
.. code-block:: bash
# Case 4 - run other models except those are given as arguments for multiple times
python run_all_model.py 3 [mlp,tft,lstm] --exclude=True
# Case 1 - run all models multiple times
python run_all_model.py run 3
# Case 5 - run specific models for one time
python run_all_model.py --models=[mlp,lightgbm]
# Case 2 - run specific models multiple times
python run_all_model.py run 3 mlp
# Case 6 - run other models except those are given as aruments for one time
python run_all_model.py --models=[mlp,tft,sfm] --exclude=True
# Case 3 - run specific models multiple times with specific dataset
python run_all_model.py run 3 mlp Alpha158
"""
# get all folders
folders = get_all_folders(models, exclude)
# init error messages:
errors = dict()
# run all the model for iterations
for fn in folders:
# create env by anaconda
env_path, python_path, conda_activate = create_env()
# get all files
sys.stderr.write("Retrieving files...\n")
yaml_path, req_path = get_all_files(folders[fn], dataset)
sys.stderr.write("\n")
# install requirements.txt
sys.stderr.write("Installing requirements.txt...\n")
execute(f"{python_path} -m pip install -r {req_path}", wait_when_err=wait_when_err)
sys.stderr.write("\n")
# setup gpu for tft
if fn == "TFT":
execute(
f"conda install -y --prefix {env_path} anaconda cudatoolkit=10.0 && conda install -y --prefix {env_path} cudnn",
wait_when_err=wait_when_err,
)
# Case 4 - run other models except those are given as arguments for multiple times
python run_all_model.py run 3 [mlp,tft,lstm] --exclude=True
# Case 5 - run specific models for one time
python run_all_model.py run --models=[mlp,lightgbm]
# Case 6 - run other models except those are given as arguments for one time
python run_all_model.py run --models=[mlp,tft,sfm] --exclude=True
"""
self._init_qlib(exp_folder_name)
# get all folders
folders = get_all_folders(models, exclude)
# init error messages:
errors = dict()
# run all the model for iterations
for fn in folders:
# get all files
sys.stderr.write("Retrieving files...\n")
yaml_path, req_path = get_all_files(folders[fn], dataset)
if yaml_path is None:
sys.stderr.write(f"There is no {dataset}.yaml file in {folders[fn]}")
continue
sys.stderr.write("\n")
# install qlib
sys.stderr.write("Installing qlib...\n")
execute(f"{python_path} -m pip install --upgrade pip", wait_when_err=wait_when_err) # TODO: FIX ME!
execute(f"{python_path} -m pip install --upgrade cython", wait_when_err=wait_when_err) # TODO: FIX ME!
if fn == "TFT":
execute(
f"cd {env_path} && {python_path} -m pip install --upgrade --force-reinstall --ignore-installed PyYAML -e {qlib_uri}",
wait_when_err=wait_when_err,
) # TODO: FIX ME!
else:
execute(
f"cd {env_path} && {python_path} -m pip install --upgrade --force-reinstall -e {qlib_uri}",
wait_when_err=wait_when_err,
) # TODO: FIX ME!
sys.stderr.write("\n")
# run workflow_by_config for multiple times
for i in range(times):
sys.stderr.write(f"Running the model: {fn} for iteration {i+1}...\n")
errs = execute(
f"{python_path} {env_path / 'bin' / 'qrun'} {yaml_path} {fn} {exp_folder_name}",
wait_when_err=wait_when_err,
)
if errs is not None:
_errs = errors.get(fn, {})
_errs.update({i: errs})
errors[fn] = _errs
# create env by anaconda
temp_dir, env_path, python_path, conda_activate = create_env()
# install requirements.txt
sys.stderr.write("Installing requirements.txt...\n")
with open(req_path) as f:
content = f.read()
if "torch" in content:
# automatically install pytorch according to nvidia's version
execute(
f"{python_path} -m pip install light-the-torch", wait_when_err=wait_when_err
) # for automatically installing torch according to the nvidia driver
execute(
f"{env_path / 'bin' / 'ltt'} install --install-cmd '{python_path} -m pip install {{packages}}' -- -r {req_path}",
wait_when_err=wait_when_err,
)
else:
execute(f"{python_path} -m pip install -r {req_path}", wait_when_err=wait_when_err)
sys.stderr.write("\n")
# remove env
sys.stderr.write(f"Deleting the environment: {env_path}...\n")
if wait_before_rm_env:
input("Press Enter to Continue")
shutil.rmtree(env_path)
# getting all results
sys.stderr.write(f"Retrieving results...\n")
results = get_all_results(folders)
# calculating the mean and std
sys.stderr.write(f"Calculating the mean and std of results...\n")
results = cal_mean_std(results)
# generating md table
sys.stderr.write(f"Generating markdown table...\n")
gen_and_save_md_table(results, dataset)
sys.stderr.write("\n")
# print erros
sys.stderr.write(f"Here are some of the errors of the models...\n")
pprint(errors)
sys.stderr.write("\n")
# move results folder
shutil.move(exp_path, exp_path + f"_{dataset}_{datetime.now().strftime('%Y-%m-%d_%H:%M:%S')}")
shutil.move("table.md", f"table_{dataset}_{datetime.now().strftime('%Y-%m-%d_%H:%M:%S')}.md")
# read yaml, remove seed kwargs of model, and then save file in the temp_dir
yaml_path = gen_yaml_file_without_seed_kwargs(yaml_path, temp_dir)
# setup gpu for tft
if fn == "TFT":
execute(
f"conda install -y --prefix {env_path} anaconda cudatoolkit=10.0 && conda install -y --prefix {env_path} cudnn",
wait_when_err=wait_when_err,
)
sys.stderr.write("\n")
# install qlib
sys.stderr.write("Installing qlib...\n")
execute(f"{python_path} -m pip install --upgrade pip", wait_when_err=wait_when_err) # TODO: FIX ME!
execute(f"{python_path} -m pip install --upgrade cython", wait_when_err=wait_when_err) # TODO: FIX ME!
if fn == "TFT":
execute(
f"cd {env_path} && {python_path} -m pip install --upgrade --force-reinstall --ignore-installed PyYAML -e {qlib_uri}",
wait_when_err=wait_when_err,
) # TODO: FIX ME!
else:
execute(
f"cd {env_path} && {python_path} -m pip install --upgrade --force-reinstall -e {qlib_uri}",
wait_when_err=wait_when_err,
) # TODO: FIX ME!
sys.stderr.write("\n")
# run workflow_by_config for multiple times
for i in range(times):
sys.stderr.write(f"Running the model: {fn} for iteration {i+1}...\n")
errs = execute(
f"{python_path} {env_path / 'bin' / 'qrun'} {yaml_path} {fn} {exp_folder_name}",
wait_when_err=wait_when_err,
)
if errs is not None:
_errs = errors.get(fn, {})
_errs.update({i: errs})
errors[fn] = _errs
sys.stderr.write("\n")
# remove env
sys.stderr.write(f"Deleting the environment: {env_path}...\n")
if wait_before_rm_env:
input("Press Enter to Continue")
shutil.rmtree(env_path)
# print errors
sys.stderr.write(f"Here are some of the errors of the models...\n")
pprint(errors)
self._collect_results(exp_folder_name, dataset)
def _collect_results(self, exp_folder_name, dataset):
folders = get_all_folders(exp_folder_name, dataset)
# getting all results
sys.stderr.write(f"Retrieving results...\n")
results = get_all_results(folders)
if len(results) > 0:
# calculating the mean and std
sys.stderr.write(f"Calculating the mean and std of results...\n")
results = cal_mean_std(results)
# generating md table
sys.stderr.write(f"Generating markdown table...\n")
gen_and_save_md_table(results, dataset)
sys.stderr.write("\n")
sys.stderr.write("\n")
# move results folder
shutil.move(exp_folder_name, exp_folder_name + f"_{dataset}_{datetime.now().strftime('%Y-%m-%d_%H:%M:%S')}")
shutil.move("table.md", f"table_{dataset}_{datetime.now().strftime('%Y-%m-%d_%H:%M:%S')}.md")
if __name__ == "__main__":
fire.Fire(run) # run all the model
fire.Fire(ModelRunner) # run all the model
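
Reviewer note: the per-model, per-iteration error bookkeeping in the loop above can be sketched in isolation. This is a minimal illustration rather than the script's code: `collect_errors` and `fake_execute` are hypothetical names, and it assumes (as the diff suggests) that `execute` returns `None` on success and an error payload otherwise.

```python
from typing import Callable, Dict, Iterable, Optional


def collect_errors(
    models: Iterable[str],
    times: int,
    run_once: Callable[[str, int], Optional[str]],
) -> Dict[str, Dict[int, str]]:
    """Run every model `times` times and keep errors keyed by model, then by iteration."""
    errors: Dict[str, Dict[int, str]] = {}
    for name in models:
        for i in range(times):
            err = run_once(name, i)
            if err is not None:
                # setdefault is equivalent to the get/update/assign sequence in the diff
                errors.setdefault(name, {})[i] = err
    return errors


def fake_execute(name: str, iteration: int) -> Optional[str]:
    """Hypothetical stand-in for `execute`: only the second TFT run fails."""
    if name == "TFT" and iteration == 1:
        return "CUDA out of memory"
    return None


errors = collect_errors(["MLP", "TFT"], times=3, run_once=fake_execute)
print(errors)
```

Models that never fail do not appear in the result, so an empty dict means every run succeeded.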


@@ -20,9 +20,7 @@
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true
},
"metadata": {},
"outputs": [],
"source": [
"import sys, site\n",
@@ -66,7 +64,6 @@
"from qlib.config import REG_CN\n",
"from qlib.contrib.model.gbdt import LGBModel\n",
"from qlib.contrib.data.handler import Alpha158\n",
"from qlib.contrib.strategy.strategy import TopkDropoutStrategy\n",
"from qlib.contrib.evaluate import (\n",
" backtest as normal_backtest,\n",
" risk_analysis,\n",
@@ -197,27 +194,40 @@
"# prediction, backtest & analysis\n",
"###################################\n",
"port_analysis_config = {\n",
" \"executor\": {\n",
" \"class\": \"SimulatorExecutor\",\n",
" \"module_path\": \"qlib.backtest.executor\",\n",
" \"kwargs\": {\n",
" \"time_per_step\": \"day\",\n",
" \"generate_portfolio_metrics\": True,\n",
" },\n",
" },\n",
" \"strategy\": {\n",
" \"class\": \"TopkDropoutStrategy\",\n",
" \"module_path\": \"qlib.contrib.strategy.strategy\",\n",
" \"module_path\": \"qlib.contrib.strategy.model_strategy\",\n",
" \"kwargs\": {\n",
" \"model\": model,\n",
" \"dataset\": dataset,\n",
" \"topk\": 50,\n",
" \"n_drop\": 5,\n",
" },\n",
" },\n",
" \"backtest\": {\n",
" \"verbose\": False,\n",
" \"limit_threshold\": 0.095,\n",
" \"start_time\": \"2017-01-01\",\n",
" \"end_time\": \"2020-08-01\",\n",
" \"account\": 100000000,\n",
" \"benchmark\": benchmark,\n",
" \"deal_price\": \"close\",\n",
" \"open_cost\": 0.0005,\n",
" \"close_cost\": 0.0015,\n",
" \"min_cost\": 5,\n",
" \"exchange_kwargs\": {\n",
" \"freq\": \"day\",\n",
" \"limit_threshold\": 0.095,\n",
" \"deal_price\": \"close\",\n",
" \"open_cost\": 0.0005,\n",
" \"close_cost\": 0.0015,\n",
" \"min_cost\": 5,\n",
" },\n",
" },\n",
"}\n",
"\n",
"\n",
"# backtest and analysis\n",
"with R.start(experiment_name=\"backtest_analysis\"):\n",
" recorder = R.get_recorder(recorder_id=rid, experiment_name=\"train_model\")\n",
@@ -230,7 +240,7 @@
" sr.generate()\n",
"\n",
" # backtest & analysis\n",
" par = PortAnaRecord(recorder, port_analysis_config)\n",
" par = PortAnaRecord(recorder, port_analysis_config, \"day\")\n",
" par.generate()\n"
]
},
@@ -250,11 +260,12 @@
"from qlib.contrib.report import analysis_model, analysis_position\n",
"from qlib.data import D\n",
"recorder = R.get_recorder(recorder_id=ba_rid, experiment_name=\"backtest_analysis\")\n",
"print(recorder)\n",
"pred_df = recorder.load_object(\"pred.pkl\")\n",
"pred_df_dates = pred_df.index.get_level_values(level='datetime')\n",
"report_normal_df = recorder.load_object(\"portfolio_analysis/report_normal.pkl\")\n",
"positions = recorder.load_object(\"portfolio_analysis/positions_normal.pkl\")\n",
"analysis_df = recorder.load_object(\"portfolio_analysis/port_analysis.pkl\")"
"report_normal_df = recorder.load_object(\"portfolio_analysis/report_normal_1day.pkl\")\n",
"positions = recorder.load_object(\"portfolio_analysis/positions_normal_1day.pkl\")\n",
"analysis_df = recorder.load_object(\"portfolio_analysis/port_analysis_1day.pkl\")"
]
},
{
@@ -349,7 +360,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -362,8 +373,7 @@
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
"pygments_lexer": "ipython3"
},
"toc": {
"base_numbering": 1,
@@ -381,4 +391,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
}
}


@@ -17,32 +17,43 @@ if __name__ == "__main__":
GetData().qlib_data(target_dir=provider_uri, region=REG_CN, exists_skip=True)
qlib.init(provider_uri=provider_uri, region=REG_CN)
model = init_instance_by_config(CSI300_GBDT_TASK["model"])
dataset = init_instance_by_config(CSI300_GBDT_TASK["dataset"])
port_analysis_config = {
"executor": {
"class": "SimulatorExecutor",
"module_path": "qlib.backtest.executor",
"kwargs": {
"time_per_step": "day",
"generate_portfolio_metrics": True,
},
},
"strategy": {
"class": "TopkDropoutStrategy",
"module_path": "qlib.contrib.strategy.strategy",
"module_path": "qlib.contrib.strategy.signal_strategy",
"kwargs": {
"signal": (model, dataset),
"topk": 50,
"n_drop": 5,
},
},
"backtest": {
"verbose": False,
"limit_threshold": 0.095,
"start_time": "2017-01-01",
"end_time": "2020-08-01",
"account": 100000000,
"benchmark": CSI300_BENCH,
"deal_price": "close",
"open_cost": 0.0005,
"close_cost": 0.0015,
"min_cost": 5,
"return_order": True,
"exchange_kwargs": {
"freq": "day",
"limit_threshold": 0.095,
"deal_price": "close",
"open_cost": 0.0005,
"close_cost": 0.0015,
"min_cost": 5,
},
},
}
# model initialization
model = init_instance_by_config(CSI300_GBDT_TASK["model"])
dataset = init_instance_by_config(CSI300_GBDT_TASK["dataset"])
# NOTE: This line is optional
# It demonstrates that the dataset can be used standalone.
example_df = dataset.prepare("train")
@@ -61,5 +72,5 @@ if __name__ == "__main__":
# backtest. If users want to use backtest based on their own prediction,
# please refer to https://qlib.readthedocs.io/en/latest/component/recorder.html#record-template.
par = PortAnaRecord(recorder, port_analysis_config)
par = PortAnaRecord(recorder, port_analysis_config, "day")
par.generate()
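
Reviewer note: both config diffs above move the exchange-related keys out of the flat `"backtest"` section into a nested `"exchange_kwargs"` dict. A minimal sketch of that migration follows, assuming the moved keys are exactly the five shown in the diff; `migrate_backtest_section` and `EXCHANGE_KEYS` are hypothetical names, not part of Qlib.

```python
import copy

# Keys that the diff moves from the flat "backtest" section into "exchange_kwargs".
EXCHANGE_KEYS = ("limit_threshold", "deal_price", "open_cost", "close_cost", "min_cost")


def migrate_backtest_section(backtest: dict, freq: str = "day") -> dict:
    """Return a new "backtest" dict in the nested layout; the input is left untouched."""
    new_cfg = copy.deepcopy(backtest)
    exchange_kwargs = new_cfg.setdefault("exchange_kwargs", {})
    exchange_kwargs.setdefault("freq", freq)
    for key in EXCHANGE_KEYS:
        if key in new_cfg:
            exchange_kwargs[key] = new_cfg.pop(key)
    return new_cfg


old = {
    "start_time": "2017-01-01",
    "end_time": "2020-08-01",
    "account": 100000000,
    "deal_price": "close",
    "open_cost": 0.0005,
    "close_cost": 0.0015,
    "min_cost": 5,
    "limit_threshold": 0.095,
}
new = migrate_backtest_section(old)
print(new)
```

Keys that are not exchange-related (`start_time`, `end_time`, `account`) stay at the top level of the section.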


@@ -6,6 +6,7 @@ _version_path = Path(__file__).absolute().parent / "VERSION.txt" # This file is
__version__ = _version_path.read_text(encoding="utf-8").strip()
__version__bak = __version__ # This version is backup for QlibConfig.reset_qlib_version
import os
from typing import Union
import yaml
import logging
import platform
@@ -54,14 +55,15 @@ def init(default_conf="client", **kwargs):
if "flask_server" in C:
logger.info(f"flask_server={C['flask_server']}, flask_port={C['flask_port']}")
logger.info("qlib successfully initialized based on %s settings." % default_conf)
data_path = {_freq: C.dpm.get_data_path(_freq) for _freq in C.dpm.provider_uri.keys()}
data_path = {_freq: C.dpm.get_data_uri(_freq) for _freq in C.dpm.provider_uri.keys()}
logger.info(f"data_path={data_path}")
def _mount_nfs_uri(provider_uri, mount_path, auto_mount: bool = False):
LOG = get_module_logger("mount nfs", level=logging.INFO)
if mount_path is None:
raise ValueError(f"Invalid mount path: {mount_path}!")
# FIXME: the C["provider_uri"] is modified in this function
# If it is not modified, we can pass only provider_uri or mount_path instead of C
mount_command = "sudo mount.nfs %s %s" % (provider_uri, mount_path)
@@ -150,14 +152,17 @@ def init_from_yaml_conf(conf_path, **kwargs):
:param conf_path: A path to the qlib config in yml format
"""
with open(conf_path) as f:
config = yaml.safe_load(f)
if conf_path is None:
config = {}
else:
with open(conf_path) as f:
config = yaml.safe_load(f)
config.update(kwargs)
default_conf = config.pop("default_conf", "client")
init(default_conf, **config)
def get_project_path(config_name="config.yaml", cur_path=None) -> Path:
def get_project_path(config_name="config.yaml", cur_path: Union[Path, str, None] = None) -> Path:
"""
If users are building a project that follows the pattern below.
- Qlib is a sub folder in project path
@@ -186,6 +191,7 @@ def get_project_path(config_name="config.yaml", cur_path=None) -> Path:
"""
if cur_path is None:
cur_path = Path(__file__).absolute().resolve()
cur_path = Path(cur_path)
while True:
if (cur_path / config_name).exists():
return cur_path
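
Reviewer note: the hunk above coerces `cur_path` with `Path(cur_path)` so that a plain string is accepted. The upward search it feeds can be sketched as below. This is an illustration under assumptions: `find_project_root` is a hypothetical name, and the termination rule (raise `FileNotFoundError` at the filesystem root) is inferred from the `except FileNotFoundError` in `auto_init`, since the rest of the loop is not part of this hunk.

```python
import tempfile
from pathlib import Path
from typing import Union


def find_project_root(config_name: str, start: Union[Path, str]) -> Path:
    """Walk from `start` towards the filesystem root until `config_name` is found."""
    cur = Path(start).absolute().resolve()
    while True:
        if (cur / config_name).exists():
            return cur
        if cur == cur.parent:  # reached the filesystem root without a match
            raise FileNotFoundError(f"no {config_name} above {start}")
        cur = cur.parent


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "config.yaml").write_text("conf_type: origin\n")
    nested = root / "a" / "b"
    nested.mkdir(parents=True)
    # A plain string works too, which is the point of the added `Path(cur_path)` line.
    found = find_project_root("config.yaml", str(nested))
    print(found == root)
```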
@@ -201,6 +207,40 @@ def auto_init(**kwargs):
- The parsing process will be affected by the `conf_type` of the configuration file
- Init qlib with default config
- Skip initialization if already initialized
:**kwargs: it may contain following parameters
cur_path: the start path to find the project path
Here are two examples of the configuration
Example 1)
If you want to create a new project-specific config based on a shared configuration, you can use `conf_type: ref`
.. code-block:: yaml
conf_type: ref
qlib_cfg: '<shared_yaml_config_path>' # this could be null to reference no config from other files
# the following configs in `qlib_cfg_update` are project-specific
qlib_cfg_update:
exp_manager:
class: "MLflowExpManager"
module_path: "qlib.workflow.expm"
kwargs:
uri: "file://<your mlflow experiment path>"
default_exp_name: "Experiment"
Example 2)
If you want to create a simple standalone config, you can use the following config (a.k.a. `conf_type: origin`)
.. code-block:: yaml
exp_manager:
class: "MLflowExpManager"
module_path: "qlib.workflow.expm"
kwargs:
uri: "file://<your mlflow experiment path>"
default_exp_name: "Experiment"
"""
kwargs["skip_if_reg"] = kwargs.get("skip_if_reg", True)
@@ -209,6 +249,7 @@ def auto_init(**kwargs):
except FileNotFoundError:
init(**kwargs)
else:
logger = get_module_logger("Initialization")
conf_pp = pp / "config.yaml"
with conf_pp.open() as f:
conf = yaml.safe_load(f)
@@ -222,8 +263,14 @@ def auto_init(**kwargs):
# - There is a shared configuration file and you don't want to edit it in place.
# - The shared configure may be updated later and you don't want to copy it.
# - You have some customized config.
qlib_conf_path = conf["qlib_cfg"]
qlib_conf_update = conf.get("qlib_cfg_update")
init_from_yaml_conf(qlib_conf_path, **qlib_conf_update, **kwargs)
logger = get_module_logger("Initialization")
qlib_conf_path = conf.get("qlib_cfg", None)
# merge the arguments
qlib_conf_update = conf.get("qlib_cfg_update", {})
for k, v in kwargs.items():
if k in qlib_conf_update:
logger.warning(f"`qlib_conf_update` from conf_pp is overridden by `kwargs` on key '{k}'")
qlib_conf_update.update(kwargs)
init_from_yaml_conf(qlib_conf_path, **qlib_conf_update)
logger.info(f"Auto load project config: {conf_pp}")
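
Reviewer note: the merge introduced in this hunk lets call-time `kwargs` win over `qlib_cfg_update` and warns on every overridden key. In plain Python, a call of the form `f(**a, **b)` raises `TypeError` when a key appears in both dicts, which is what the explicit merge avoids. A minimal sketch of the same precedence rule, with the hypothetical helper name `merge_conf_update` and illustrative values:

```python
import logging
from typing import Optional

logger = logging.getLogger("Initialization")


def merge_conf_update(conf_update: Optional[dict], kwargs: dict) -> dict:
    """Merge project-level overrides with call-time kwargs; kwargs win on conflicts."""
    merged = dict(conf_update or {})  # copy so the parsed YAML dict is not mutated
    for key in kwargs:
        if key in merged:
            logger.warning("config key '%s' is overridden by kwargs", key)
    merged.update(kwargs)
    return merged


project_update = {"region": "cn", "provider_uri": "~/.qlib/qlib_data/cn_data"}
merged = merge_conf_update(project_update, {"region": "us", "skip_if_reg": True})
print(merged)
```

Unlike the diff, this sketch copies the project-level dict before updating it; that is a design choice of the illustration, not a statement about Qlib's behavior.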

322
qlib/backtest/__init__.py Normal file

@@ -0,0 +1,322 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from __future__ import annotations
import copy
from typing import List, Tuple, Union, TYPE_CHECKING
from .account import Account
if TYPE_CHECKING:
from ..strategy.base import BaseStrategy
from .executor import BaseExecutor
from .decision import BaseTradeDecision
from .position import Position
from .exchange import Exchange
from .backtest import backtest_loop
from .backtest import collect_data_loop
from .utils import CommonInfrastructure
from .decision import Order
from ..utils import init_instance_by_config
from ..log import get_module_logger
from ..config import C
# make import more user-friendly by adding `from qlib.backtest import STH`
logger = get_module_logger("backtest caller")
def get_exchange(
exchange=None,
freq="day",
start_time=None,
end_time=None,
codes="all",
subscribe_fields=[],
open_cost=0.0015,
close_cost=0.0025,
min_cost=5.0,
limit_threshold=None,
deal_price: Union[str, Tuple[str], List[str]] = None,
**kwargs,
):
"""get_exchange
Parameters
----------
# exchange related arguments
exchange: Exchange().
subscribe_fields: list
subscribe fields.
open_cost : float
open transaction cost.
close_cost : float
close transaction cost.
min_cost : float
min transaction cost.
trade_unit : int
Included in kwargs. Please refer to the docs of `__init__` of `Exchange`
deal_price: Union[str, Tuple[str], List[str]]
The `deal_price` supports following two types of input
- <deal_price> : str
- (<buy_price>, <sell_price>): Tuple[str] or List[str]
<deal_price>, <buy_price> or <sell_price> := <price>
<price> := str
- for example '$close', '$open', '$vwap' ("close" is OK. `Exchange` will help to prepend
"$" to the expression)
limit_threshold : float
limit on price moves, e.g. 0.1 (10%); the same limit applies to long and short.
Returns
-------
:class: Exchange
an initialized Exchange object
"""
if limit_threshold is None:
limit_threshold = C.limit_threshold
if exchange is None:
logger.info("Create new exchange")
exchange = Exchange(
freq=freq,
start_time=start_time,
end_time=end_time,
codes=codes,
deal_price=deal_price,
subscribe_fields=subscribe_fields,
limit_threshold=limit_threshold,
open_cost=open_cost,
close_cost=close_cost,
min_cost=min_cost,
**kwargs,
)
return exchange
else:
return init_instance_by_config(exchange, accept_types=Exchange)
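
Reviewer note: the `deal_price` grammar in the docstring (a single expression, or a buy/sell pair, with `$` prepended when missing) can be sketched as a small normalizer. This is a hedged illustration of the documented input forms only; `normalize_deal_price` is a hypothetical name and does not claim to mirror the actual `Exchange` implementation.

```python
from typing import List, Tuple, Union


def normalize_deal_price(deal_price: Union[str, Tuple[str, str], List[str]]) -> Tuple[str, str]:
    """Return (buy_price, sell_price) expressions, each prefixed with "$"."""
    if isinstance(deal_price, str):
        buy, sell = deal_price, deal_price
    elif isinstance(deal_price, (tuple, list)) and len(deal_price) == 2:
        buy, sell = deal_price
    else:
        raise ValueError(f"unsupported deal_price: {deal_price!r}")

    def with_dollar(expr: str) -> str:
        # "close" and "$close" are both accepted; the result always carries the "$"
        return expr if expr.startswith("$") else "$" + expr

    return with_dollar(buy), with_dollar(sell)


print(normalize_deal_price("close"))
print(normalize_deal_price(["$open", "vwap"]))
```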
def create_account_instance(
start_time, end_time, benchmark: str, account: Union[float, int, dict], pos_type: str = "Position"
) -> Account:
"""
# TODO: it is very strange to pass benchmark_config in the account (maybe for report)
# There should be a post-step to process the report.
Parameters
----------
start_time
start time of the benchmark
end_time
end time of the benchmark
benchmark : str
the benchmark for reporting
account : Union[
float,
{
"cash": float,
"stock1": Union[
int, # it is equal to {"amount": int}
{"amount": int, "price"(optional): float},
]
},
]
information describing how to create the account
For `float`:
Using Account with only initial cash
For `dict`:
key "cash" means initial cash.
key "stock1" means the information of first stock with amount and price(optional).
...
"""
if isinstance(account, (int, float)):
pos_kwargs = {"init_cash": account}
elif isinstance(account, dict):
account = copy.copy(account)  # shallow copy, so the caller's dict keeps its "cash" key
init_cash = account.pop("cash")
pos_kwargs = {
"init_cash": init_cash,
"position_dict": account,
}
else:
raise ValueError("account must be in (int, float, dict)")
kwargs = {
"init_cash": account,
"benchmark_config": {
"benchmark": benchmark,
"start_time": start_time,
"end_time": end_time,
},
"pos_type": pos_type,
}
kwargs.update(pos_kwargs)
return Account(**kwargs)
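
Reviewer note: the `account` argument accepts either a number (initial cash only) or a dict with a `"cash"` key plus per-stock holdings. A minimal sketch of that dispatch follows; `split_account` is a hypothetical name and the stock codes and amounts are illustrative. Working on a copy keeps the caller's dict reusable across several backtests.

```python
from typing import Union


def split_account(account: Union[int, float, dict]) -> dict:
    """Translate the `account` argument into keyword arguments for a position."""
    if isinstance(account, (int, float)):
        return {"init_cash": account}
    if isinstance(account, dict):
        holdings = dict(account)  # shallow copy: the caller's dict keeps its "cash" key
        init_cash = holdings.pop("cash")
        return {"init_cash": init_cash, "position_dict": holdings}
    raise ValueError("account must be an int, a float or a dict")


spec = {"cash": 1_000_000, "SH600000": 300, "SZ000001": {"amount": 200, "price": 12.5}}
pos_kwargs = split_account(spec)
print(pos_kwargs)
```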
def get_strategy_executor(
start_time,
end_time,
strategy: BaseStrategy,
executor: BaseExecutor,
benchmark: str = "SH000300",
account: Union[float, int, Position] = 1e9,
exchange_kwargs: dict = {},
pos_type: str = "Position",
):
# NOTE:
# - for avoiding recursive import
# - typing annotations is not reliable
from ..strategy.base import BaseStrategy
from .executor import BaseExecutor
trade_account = create_account_instance(
start_time=start_time, end_time=end_time, benchmark=benchmark, account=account, pos_type=pos_type
)
exchange_kwargs = copy.copy(exchange_kwargs)
if "start_time" not in exchange_kwargs:
exchange_kwargs["start_time"] = start_time
if "end_time" not in exchange_kwargs:
exchange_kwargs["end_time"] = end_time
trade_exchange = get_exchange(**exchange_kwargs)
common_infra = CommonInfrastructure(trade_account=trade_account, trade_exchange=trade_exchange)
trade_strategy = init_instance_by_config(strategy, accept_types=BaseStrategy, common_infra=common_infra)
trade_executor = init_instance_by_config(executor, accept_types=BaseExecutor, common_infra=common_infra)
return trade_strategy, trade_executor
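
Reviewer note: `get_strategy_executor` copies `exchange_kwargs` before filling in the time window, which matters because the parameter defaults to a shared `{}`. A minimal sketch of that defaulting step, with the hypothetical helper name `fill_exchange_window`:

```python
import copy


def fill_exchange_window(exchange_kwargs: dict, start_time: str, end_time: str) -> dict:
    """Return a copy of `exchange_kwargs` whose time window defaults to the backtest's."""
    filled = copy.copy(exchange_kwargs)  # shallow copy, as in the diff
    filled.setdefault("start_time", start_time)
    filled.setdefault("end_time", end_time)
    return filled


shared = {"freq": "day", "end_time": "2019-12-31"}
filled = fill_exchange_window(shared, "2017-01-01", "2020-08-01")
print(filled)
```

An explicitly configured `end_time` is preserved, and the input dict is not modified.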
def backtest(
start_time,
end_time,
strategy,
executor,
benchmark="SH000300",
account=1e9,
exchange_kwargs={},
pos_type: str = "Position",
):
"""initialize the strategy and executor, then backtest function for the interaction of the outermost strategy and executor in the nested decision execution
Parameters
----------
start_time : pd.Timestamp|str
closed start time for backtest
**NOTE**: This will be applied to the outermost executor's calendar.
end_time : pd.Timestamp|str
closed end time for backtest
**NOTE**: This will be applied to the outermost executor's calendar.
E.g. Executor[day](Executor[1min]), setting `end_time == 20XX0301` will include all the minutes on 20XX0301
strategy : Union[str, dict, BaseStrategy]
for initializing outermost portfolio strategy. Please refer to the docs of init_instance_by_config for more information.
executor : Union[str, dict, BaseExecutor]
for initializing the outermost executor.
benchmark: str
the benchmark for reporting.
account : Union[float, int, Position]
information describing how to create the account
For `float` or `int`:
Using Account with only initial cash
For `Position`:
Using Account with a Position
exchange_kwargs : dict
the kwargs for initializing Exchange
pos_type : str
the type of Position.
Returns
-------
portfolio_metrics_dict: Dict[PortfolioMetrics]
it records the trading portfolio_metrics information
indicator_dict: Dict[Indicator]
it computes the trading indicator
It is organized in a dict format
"""
trade_strategy, trade_executor = get_strategy_executor(
start_time,
end_time,
strategy,
executor,
benchmark,
account,
exchange_kwargs,
pos_type=pos_type,
)
portfolio_metrics, indicator = backtest_loop(start_time, end_time, trade_strategy, trade_executor)
return portfolio_metrics, indicator
def collect_data(
start_time,
end_time,
strategy,
executor,
benchmark="SH000300",
account=1e9,
exchange_kwargs={},
pos_type: str = "Position",
return_value: dict = None,
):
"""initialize the strategy and executor, then collect the trade decision data for rl training
please refer to the docs of the backtest for the explanation of the parameters
Yields
-------
object
trade decision
"""
trade_strategy, trade_executor = get_strategy_executor(
start_time,
end_time,
strategy,
executor,
benchmark,
account,
exchange_kwargs,
pos_type=pos_type,
)
yield from collect_data_loop(start_time, end_time, trade_strategy, trade_executor, return_value=return_value)
def format_decisions(
decisions: List[BaseTradeDecision],
) -> Tuple[str, List[Tuple[BaseTradeDecision, Union[Tuple, None]]]]:
"""
format the decisions collected by `qlib.backtest.collect_data`
The decisions will be organized into a tree-like structure.
Parameters
----------
decisions : List[BaseTradeDecision]
decisions collected by `qlib.backtest.collect_data`
Returns
-------
Tuple[str, List[Tuple[BaseTradeDecision, Union[Tuple, None]]]]:
reformat the list of decisions into a more user-friendly format
<decisions> := Tuple[<freq>, List[Tuple[<decision>, <sub decisions>]]]
- <sub decisions> := `<decisions> in lower level` | None
- <freq> := "day" | "30min" | "1min" | ...
- <decision> := <instance of BaseTradeDecision>
"""
if len(decisions) == 0:
return None
cur_freq = decisions[0].strategy.trade_calendar.get_freq()
res = (cur_freq, [])
last_dec_idx = 0
for i, dec in enumerate(decisions[1:], 1):
if dec.strategy.trade_calendar.get_freq() == cur_freq:
res[1].append((decisions[last_dec_idx], format_decisions(decisions[last_dec_idx + 1 : i])))
last_dec_idx = i
res[1].append((decisions[last_dec_idx], format_decisions(decisions[last_dec_idx + 1 :])))
return res
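
Reviewer note: `format_decisions` turns a flat, depth-first list of decisions into a tree by grouping every run of lower-frequency decisions under the preceding decision of the current frequency. The sketch below replays the same grouping on plain `(freq, label)` tuples instead of `BaseTradeDecision` objects; the tuple representation and the name `format_flat` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Decision = Tuple[str, str]  # (freq, label), standing in for BaseTradeDecision


def format_flat(decisions: List[Decision]) -> Optional[tuple]:
    """Group a flat decision list into (freq, [(decision, sub_tree_or_None), ...])."""
    if len(decisions) == 0:
        return None
    cur_freq = decisions[0][0]
    res = (cur_freq, [])
    last = 0
    for i, dec in enumerate(decisions[1:], 1):
        if dec[0] == cur_freq:
            # everything between two same-level decisions belongs to the earlier one
            res[1].append((decisions[last], format_flat(decisions[last + 1 : i])))
            last = i
    res[1].append((decisions[last], format_flat(decisions[last + 1 :])))
    return res


flat = [("day", "d1"), ("30min", "m1"), ("30min", "m2"), ("day", "d2"), ("30min", "m3")]
tree = format_flat(flat)
print(tree)
```

Here `d1` owns `m1` and `m2`, and `d2` owns `m3`; leaf decisions carry `None` as their sub-tree.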

377
qlib/backtest/account.py Normal file

@@ -0,0 +1,377 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from __future__ import annotations
import copy
from typing import Dict, List, Tuple, TYPE_CHECKING
from qlib.utils import init_instance_by_config
import pandas as pd
from .position import BasePosition, InfPosition, Position
from .report import PortfolioMetrics, Indicator
from .decision import BaseTradeDecision, Order
from .exchange import Exchange
"""
rtn & earning in the Account
rtn:
from order's view
1.change if any order is executed, sell order or buy order
2.change at the end of today, (today_close - stock_price) * amount
earning
from value of current position
earning will be updated at the end of trade date
earning = today_value - pre_value
**whether cost is considered**
earning is the difference between two position values, so it includes cost; it is the true return
the implementation of rtn does not include cost; in other words, rtn - cost = earning
"""
class AccumulatedInfo:
"""accumulated trading info, including accumulated return/cost/turnover"""
def __init__(self):
self.reset()
def reset(self):
self.rtn = 0 # accumulated return, do not consider cost
self.cost = 0 # accumulated cost
self.to = 0 # accumulated turnover
def add_return_value(self, value):
self.rtn += value
def add_cost(self, value):
self.cost += value
def add_turnover(self, value):
self.to += value
@property
def get_return(self):
return self.rtn
@property
def get_cost(self):
return self.cost
@property
def get_turnover(self):
return self.to
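
Reviewer note: the identity `rtn - cost = earning` from the module docstring can be checked with a one-bar worked example. The numbers are illustrative, and the sketch assumes that a freshly bought stock is marked at its trade price until the bar close (the `Position` class is not part of this excerpt).

```python
init_cash = 10_000
amount, buy_price, close_price, cost = 100, 10, 11, 5

trade_val = amount * buy_price        # 1000 paid for the shares
cash = init_cash - trade_val - cost   # 8995 left after paying the fee

# rtn, from the order's view (cost is ignored):
rtn_from_order = buy_price * amount - trade_val      # 0: marked at the trade price
rtn_from_close = (close_price - buy_price) * amount  # 100: re-marked at the bar close
rtn = rtn_from_order + rtn_from_close

# earning, from the position's view (cost is included):
account_value = cash + close_price * amount          # 8995 + 1100
earning = account_value - init_cash

print(rtn, cost, earning)  # 100 5 95
```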
class Account:
def __init__(
self,
init_cash: float = 1e9,
position_dict: dict = {},
freq: str = "day",
benchmark_config: dict = {},
pos_type: str = "Position",
port_metr_enabled: bool = True,
):
"""the trade account of backtest.
Parameters
----------
init_cash : float, optional
initial cash, by default 1e9
position_dict : Dict[
stock_id,
Union[
int, # it is equal to {"amount": int}
{"amount": int, "price"(optional): float},
]
]
initial stocks with parameters amount and price,
if there is no price key in the dict of stocks, it will be filled by _fill_stock_value.
by default {}.
"""
self._pos_type = pos_type
self._port_metr_enabled = port_metr_enabled
self.benchmark_config = None # avoid no attribute error
self.init_vars(init_cash, position_dict, freq, benchmark_config)
def init_vars(self, init_cash, position_dict, freq: str, benchmark_config: dict):
self.init_cash = init_cash
self.current_position: BasePosition = init_instance_by_config(
{
"class": self._pos_type,
"kwargs": {
"cash": init_cash,
"position_dict": position_dict,
},
"module_path": "qlib.backtest.position",
}
)
self.portfolio_metrics = None
self.hist_positions = {}
self.reset(freq=freq, benchmark_config=benchmark_config)
def is_port_metr_enabled(self):
"""
Whether portfolio-based metrics are enabled.
"""
return self._port_metr_enabled and not self.current_position.skip_update()
def reset_report(self, freq, benchmark_config):
# portfolio related metrics
if self.is_port_metr_enabled():
self.accum_info = AccumulatedInfo()
self.portfolio_metrics = PortfolioMetrics(freq, benchmark_config)
self.hist_positions = {}
# fill stock value
# The frequency of account may not align with the trading frequency.
# This may result in obscure bugs when data quality is low.
if isinstance(self.benchmark_config, dict) and self.benchmark_config.get("start_time") is not None:
self.current_position.fill_stock_value(self.benchmark_config["start_time"], self.freq)
# trading related metrics(e.g. high-frequency trading)
self.indicator = Indicator()
def reset(self, freq=None, benchmark_config=None, port_metr_enabled: bool = None):
"""reset freq and report of account
Parameters
----------
freq : str, optional
frequency of account & report, by default None
benchmark_config : {}, optional
benchmark config of report, by default None
"""
if freq is not None:
self.freq = freq
if benchmark_config is not None:
self.benchmark_config = benchmark_config
if port_metr_enabled is not None:
self._port_metr_enabled = port_metr_enabled
self.reset_report(self.freq, self.benchmark_config)
def get_hist_positions(self):
return self.hist_positions
def get_cash(self):
return self.current_position.get_cash()
def _update_state_from_order(self, order, trade_val, cost, trade_price):
if self.is_port_metr_enabled():
# update turnover
self.accum_info.add_turnover(trade_val)
# update cost
self.accum_info.add_cost(cost)
# update return from order
trade_amount = trade_val / trade_price
if order.direction == Order.SELL: # 0 for sell
# when sell stock, get profit from price change
profit = trade_val - self.current_position.get_stock_price(order.stock_id) * trade_amount
self.accum_info.add_return_value(profit) # note here do not consider cost
elif order.direction == Order.BUY: # 1 for buy
# when buy stock, we get return for the rtn computing method
# profit in buy order is to make rtn is consistent with earning at the end of bar
profit = self.current_position.get_stock_price(order.stock_id) * trade_amount - trade_val
self.accum_info.add_return_value(profit) # note here do not consider cost
def update_order(self, order, trade_val, cost, trade_price):
if self.current_position.skip_update():
# TODO: supporting polymorphism for account
# updating order for infinite position is meaningless
return
# if the stock is sold out, there is no stock price information left in Position, so we update the account first, then the current position
# if the stock is bought and is not yet in the current position, we update the current position first, then the account
# The cost will be subtracted from the cash at last, so the trading logic can ignore the cost calculation
if order.direction == Order.SELL:
# sell stock
self._update_state_from_order(order, trade_val, cost, trade_price)
# update current position
# because all of stock_id may be sold
self.current_position.update_order(order, trade_val, cost, trade_price)
else:
# buy stock
# deal order, then update state
self.current_position.update_order(order, trade_val, cost, trade_price)
self._update_state_from_order(order, trade_val, cost, trade_price)
def update_current_position(self, trade_start_time, trade_end_time, trade_exchange):
"""update current to make rtn consistent with earning at the end of bar, and update holding bar count of stock"""
# update price for stock in the position and the profit from changed_price
# NOTE: updating position does not only serve portfolio metrics, it also serves the strategy
if not self.current_position.skip_update():
stock_list = self.current_position.get_stock_list()
for code in stock_list:
# if suspend, no new price to be updated, profit is 0
if trade_exchange.check_stock_suspended(code, trade_start_time, trade_end_time):
continue
bar_close = trade_exchange.get_close(code, trade_start_time, trade_end_time)
self.current_position.update_stock_price(stock_id=code, price=bar_close)
# update holding day count
# NOTE: updating bar_count does not only serve portfolio metrics, it also serves the strategy
self.current_position.add_count_all(bar=self.freq)
def update_portfolio_metrics(self, trade_start_time, trade_end_time):
"""update portfolio_metrics"""
# calculate earning
# account_value - last_account_value
# for the first trade date, account_value - init_cash
# self.portfolio_metrics.is_empty() to judge is_first_trade_date
# get last_account_value, last_total_cost, last_total_turnover
if self.portfolio_metrics.is_empty():
last_account_value = self.init_cash
last_total_cost = 0
last_total_turnover = 0
else:
last_account_value = self.portfolio_metrics.get_latest_account_value()
last_total_cost = self.portfolio_metrics.get_latest_total_cost()
last_total_turnover = self.portfolio_metrics.get_latest_total_turnover()
# get now_account_value, now_stock_value, now_earning, now_cost, now_turnover
now_account_value = self.current_position.calculate_value()
now_stock_value = self.current_position.calculate_stock_value()
now_earning = now_account_value - last_account_value
now_cost = self.accum_info.get_cost - last_total_cost
now_turnover = self.accum_info.get_turnover - last_total_turnover
# update portfolio_metrics for today
# judge whether trading has begun,
# and don't add the initial account state to portfolio_metrics, because there is no excess return on those days.
self.portfolio_metrics.update_portfolio_metrics_record(
trade_start_time=trade_start_time,
trade_end_time=trade_end_time,
account_value=now_account_value,
cash=self.current_position.position["cash"],
return_rate=(now_earning + now_cost) / last_account_value,
# here use earning to calculate return, position's view, earning consider cost, true return
# in order to make same definition with original backtest in evaluate.py
total_turnover=self.accum_info.get_turnover,
turnover_rate=now_turnover / last_account_value,
total_cost=self.accum_info.get_cost,
cost_rate=now_cost / last_account_value,
stock_value=now_stock_value,
)
def update_hist_positions(self, trade_start_time):
"""update history position"""
now_account_value = self.current_position.calculate_value()
# set now_account_value to position
self.current_position.position["now_account_value"] = now_account_value
self.current_position.update_weight_all()
# update hist_positions
# note use deepcopy
self.hist_positions[trade_start_time] = copy.deepcopy(self.current_position)
def update_indicator(
self,
trade_start_time: pd.Timestamp,
trade_exchange: Exchange,
atomic: bool,
outer_trade_decision: BaseTradeDecision,
trade_info: list = None,
inner_order_indicators: List[Dict[str, pd.Series]] = None,
decision_list: List[Tuple[BaseTradeDecision, pd.Timestamp, pd.Timestamp]] = None,
indicator_config: dict = {},
):
"""update trade indicators and order indicators in each bar end"""
# TODO: will skip empty decisions make it faster? `outer_trade_decision.empty():`
# indicator is trading (e.g. high-frequency order execution) related analysis
self.indicator.reset()
# aggregate the information for each order
if atomic:
self.indicator.update_order_indicators(trade_info)
else:
self.indicator.agg_order_indicators(
inner_order_indicators,
decision_list=decision_list,
outer_trade_decision=outer_trade_decision,
trade_exchange=trade_exchange,
indicator_config=indicator_config,
)
# aggregate all the order metrics into a single step
self.indicator.cal_trade_indicators(trade_start_time, self.freq, indicator_config)
# record the metrics
self.indicator.record(trade_start_time)
def update_bar_end(
self,
trade_start_time: pd.Timestamp,
trade_end_time: pd.Timestamp,
trade_exchange: Exchange,
atomic: bool,
outer_trade_decision: BaseTradeDecision,
trade_info: list = None,
inner_order_indicators: List[Dict[str, pd.Series]] = None,
decision_list: List[Tuple[BaseTradeDecision, pd.Timestamp, pd.Timestamp]] = None,
indicator_config: dict = {},
):
"""update account at each trading bar step
Parameters
----------
trade_start_time : pd.Timestamp
closed start time of step
trade_end_time : pd.Timestamp
closed end time of step
trade_exchange : Exchange
trading exchange, used to update current
atomic : bool
whether the trading executor is atomic, which means there is no higher-frequency trading executor inside it
- if atomic is True, calculate the indicators with trade_info
- else, aggregate indicators with inner indicators
trade_info : List[(Order, float, float, float)], optional
trading information, by default None
- necessary if atomic is True
- list of tuple(order, trade_val, trade_cost, trade_price)
inner_order_indicators : Indicator, optional
indicators of inner executor, by default None
- necessary if atomic is False
- used to aggregate outer indicators
decision_list: List[Tuple[BaseTradeDecision, pd.Timestamp, pd.Timestamp]] = None,
The decision list of the inner level: List[Tuple[<decision>, <start_time>, <end_time>]]
The inner level
indicator_config : dict, optional
config of calculating indicators, by default {}
"""
if atomic is True and trade_info is None:
raise ValueError("trade_info is necessary in atomic executor")
elif atomic is False and inner_order_indicators is None:
raise ValueError("inner_order_indicators is necessary in un-atomic executor")
# update current position and hold bar count in each bar end
self.update_current_position(trade_start_time, trade_end_time, trade_exchange)
if self.is_port_metr_enabled():
# portfolio_metrics is portfolio related analysis
self.update_portfolio_metrics(trade_start_time, trade_end_time)
self.update_hist_positions(trade_start_time)
# update indicator in each bar end
self.update_indicator(
trade_start_time=trade_start_time,
trade_exchange=trade_exchange,
atomic=atomic,
outer_trade_decision=outer_trade_decision,
trade_info=trade_info,
inner_order_indicators=inner_order_indicators,
decision_list=decision_list,
indicator_config=indicator_config,
)
def get_portfolio_metrics(self):
"""get the history portfolio_metrics and positions instance"""
if self.is_port_metr_enabled():
_portfolio_metrics = self.portfolio_metrics.generate_portfolio_metrics_dataframe()
_positions = self.get_hist_positions()
return _portfolio_metrics, _positions
else:
raise ValueError("generate_portfolio_metrics should be True if you want to generate portfolio_metrics")
def get_trade_indicator(self) -> Indicator:
"""get the trade indicator instance, which has pa/pos/ffr info."""
return self.indicator
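# Illustrative sketch (not part of account.py): the per-bar arithmetic used in `update_portfolio_metrics`
# above, reproduced with plain numbers. `bar_metrics` and its argument names are hypothetical and exist only
# for this example; the formulas mirror the `return_rate`, `cost_rate` and `turnover_rate` arguments above.

```python
def bar_metrics(last_value, now_value, last_total_cost, total_cost, last_total_turnover, total_turnover):
    """Recompute the per-bar rates from cumulative totals (hypothetical helper)."""
    now_earning = now_value - last_value              # the account value already has the cost deducted
    now_cost = total_cost - last_total_cost           # cost paid in this bar only
    now_turnover = total_turnover - last_total_turnover
    return {
        # the cost is added back, so this rate is the return before cost
        "return_rate": (now_earning + now_cost) / last_value,
        "cost_rate": now_cost / last_value,
        "turnover_rate": now_turnover / last_value,
    }


metrics = bar_metrics(
    last_value=1_000_000.0,
    now_value=1_009_000.0,
    last_total_cost=0.0,
    total_cost=1_000.0,
    last_total_turnover=0.0,
    total_turnover=200_000.0,
)
```

# Under this reading, the return after cost for the bar is `return_rate - cost_rate`.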

81
qlib/backtest/backtest.py Normal file

@@ -0,0 +1,81 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from __future__ import annotations
from qlib.backtest.decision import BaseTradeDecision
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from qlib.strategy.base import BaseStrategy
from qlib.backtest.executor import BaseExecutor
from ..utils.time import Freq
from tqdm.auto import tqdm
def backtest_loop(start_time, end_time, trade_strategy: BaseStrategy, trade_executor: BaseExecutor):
"""backtest function for the interaction of the outermost strategy and executor in the nested decision execution
please refer to the docs of `collect_data_loop`
Returns
-------
portfolio_metrics: PortfolioMetrics
it records the trading portfolio_metrics information
indicator: Indicator
it computes the trading indicator
"""
return_value = {}
for _decision in collect_data_loop(start_time, end_time, trade_strategy, trade_executor, return_value):
pass
return return_value.get("portfolio_metrics"), return_value.get("indicator")
def collect_data_loop(
start_time, end_time, trade_strategy: BaseStrategy, trade_executor: BaseExecutor, return_value: dict = None
):
"""Generator for collecting the trade decision data for rl training
Parameters
----------
start_time : pd.Timestamp|str
closed start time for backtest
**NOTE**: This will be applied to the outermost executor's calendar.
end_time : pd.Timestamp|str
closed end time for backtest
**NOTE**: This will be applied to the outermost executor's calendar.
E.g. Executor[day](Executor[1min]), setting `end_time == 20XX0301` will include all the minutes on 20XX0301
trade_strategy : BaseStrategy
the outermost portfolio strategy
trade_executor : BaseExecutor
the outermost executor
return_value : dict
used for backtest_loop
Yields
-------
object
trade decision
"""
trade_executor.reset(start_time=start_time, end_time=end_time)
trade_strategy.reset(level_infra=trade_executor.get_level_infra())
with tqdm(total=trade_executor.trade_calendar.get_trade_len(), desc="backtest loop") as bar:
_execute_result = None
while not trade_executor.finished():
_trade_decision: BaseTradeDecision = trade_strategy.generate_trade_decision(_execute_result)
_execute_result = yield from trade_executor.collect_data(_trade_decision, level=0)
bar.update(1)
if return_value is not None:
all_executors = trade_executor.get_all_executors()
all_portfolio_metrics = {
"{}{}".format(*Freq.parse(_executor.time_per_step)): _executor.trade_account.get_portfolio_metrics()
for _executor in all_executors
if _executor.trade_account.is_port_metr_enabled()
}
all_indicators = {}
for _executor in all_executors:
key = "{}{}".format(*Freq.parse(_executor.time_per_step))
all_indicators[key] = _executor.trade_account.get_trade_indicator().generate_trade_indicators_dataframe()
all_indicators[key + "_obj"] = _executor.trade_account.get_trade_indicator()
return_value.update({"portfolio_metrics": all_portfolio_metrics, "indicator": all_indicators})
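# Illustrative sketch (not part of backtest.py): the control flow shared by `backtest_loop` and
# `collect_data_loop` above, reduced to plain Python. A generator yields every decision, receives each
# step's result through `yield from`, and hands the final results back through a caller-supplied dict.
# All names below (`inner_steps`, `collect`, `run`) are hypothetical stand-ins, not qlib APIs.

```python
def inner_steps(n):
    """Yield one decision per inner step, then return a result (stand-in for `collect_data`)."""
    for i in range(n):
        yield f"decision-{i}"
    return n  # becomes the value of the `yield from` expression in the caller


def collect(n_bars, return_value=None):
    """Yield all decisions; write the final result into `return_value` if one is given."""
    executed = 0
    for _ in range(n_bars):
        executed += yield from inner_steps(2)
    if return_value is not None:
        return_value.update({"steps": executed})


def run(n_bars):
    """Drain the generator and read the result from the dict (stand-in for `backtest_loop`)."""
    result = {}
    for _decision in collect(n_bars, result):
        pass
    return result.get("steps")


decisions = list(collect(3))
total_steps = run(3)
```

# A generator's own `return` value is awkward to read from a `for` loop, which is presumably why the
# results are passed out through the `return_value` dict.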

548
qlib/backtest/decision.py Normal file

@@ -0,0 +1,548 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from __future__ import annotations
from enum import IntEnum
from qlib.data.data import Cal
from qlib.utils.time import concat_date_time, epsilon_change
from qlib.log import get_module_logger
# try to fix circular imports when enabling type hints
from typing import Callable, TYPE_CHECKING
if TYPE_CHECKING:
from qlib.strategy.base import BaseStrategy
from qlib.backtest.exchange import Exchange
from qlib.backtest.utils import TradeCalendarManager
import warnings
import numpy as np
import pandas as pd
from dataclasses import dataclass, field
from typing import ClassVar, Optional, Union, List, Set, Tuple
class OrderDir(IntEnum):
# Order direction
SELL = 0
BUY = 1
@dataclass
class Order:
"""
stock_id : str
amount : float
start_time : pd.Timestamp
closed start time for order trading
end_time : pd.Timestamp
closed end time for order trading
direction : int
Order.SELL for sell; Order.BUY for buy
factor : float
presents the weight factor assigned in Exchange()
"""
# 1) time invariant values
# - they are set by users and are time-invariant.
stock_id: str
amount: float # `amount` is a non-negative and adjusted value
direction: int
# 2) time variant values:
# - Users may want to set these values when using lower level APIs
# - If users don't, TradeDecisionWO will help users to set them
# The interval to which the order belongs (NOTE: this is not the expected time range for dealing the order)
start_time: pd.Timestamp
end_time: pd.Timestamp
# 3) results
# - users should not care about these values
# - they are set by the backtest system after finishing the results.
# The expected values in each case
# - not tradable: the deal_amount == 0 , factor is None
# - the stock is suspended and the entire order fails. No cost for this order
# - dealt or partially dealt: deal_amount >= 0 and factor is not None
deal_amount: Optional[float] = None # `deal_amount` is a non-negative value
factor: Optional[float] = None
# TODO:
# a status field to indicate the dealing result of the order
# FIXME:
# kept for compatibility for now.
# Please remove them in the future
SELL: ClassVar[OrderDir] = OrderDir.SELL
BUY: ClassVar[OrderDir] = OrderDir.BUY
def __post_init__(self):
if self.direction not in {Order.SELL, Order.BUY}:
raise NotImplementedError("direction not supported, `Order.SELL` for sell, `Order.BUY` for buy")
self.deal_amount = 0
self.factor = None
@property
def amount_delta(self) -> float:
"""
return the delta of amount.
- Positive value indicates buying `amount` of share
- Negative value indicates selling `amount` of share
"""
return self.amount * self.sign
@property
def deal_amount_delta(self) -> float:
"""
return the delta of deal_amount.
- Positive value indicates buying `deal_amount` of share
- Negative value indicates selling `deal_amount` of share
"""
return self.deal_amount * self.sign
@property
def sign(self) -> float:
"""
return the sign of trading
- `+1` indicates buying
- `-1` value indicates selling
"""
return self.direction * 2 - 1
@staticmethod
def parse_dir(direction: Union[str, int, np.integer, OrderDir, np.ndarray]) -> Union[OrderDir, np.ndarray]:
if isinstance(direction, OrderDir):
return direction
elif isinstance(direction, (int, float, np.integer, np.floating)):
if direction > 0:
return Order.BUY
else:
return Order.SELL
elif isinstance(direction, str):
dl = direction.lower()
if dl.strip() == "sell":
return OrderDir.SELL
elif dl.strip() == "buy":
return OrderDir.BUY
else:
raise NotImplementedError(f"This type of input is not supported")
elif isinstance(direction, np.ndarray):
direction_array = direction.copy()
direction_array[direction_array > 0] = Order.BUY
direction_array[direction_array <= 0] = Order.SELL
return direction_array
else:
raise NotImplementedError(f"This type of input is not supported")
class OrderHelper:
"""
Motivation
- Make generating order easier
- User may have no knowledge about the adjust-factor information about the system.
- It involves too much interaction with the exchange when generating orders.
"""
def __init__(self, exchange: Exchange):
self.exchange = exchange
def create(
self,
code: str,
amount: float,
direction: OrderDir,
start_time: Union[str, pd.Timestamp] = None,
end_time: Union[str, pd.Timestamp] = None,
) -> Order:
"""
help to create an order
# TODO: create order for unadjusted amount order
Parameters
----------
code : str
the id of the instrument
amount : float
**adjusted trading amount**
direction : OrderDir
trading direction
start_time : Union[str, pd.Timestamp] (optional)
The start of the interval to which the order belongs
end_time : Union[str, pd.Timestamp] (optional)
The end of the interval to which the order belongs
Returns
-------
Order:
The created order
"""
if start_time is not None:
start_time = pd.Timestamp(start_time)
if end_time is not None:
end_time = pd.Timestamp(end_time)
# NOTE: factor is a value that belongs to the results section. Users don't have to care about it when creating orders
return Order(
stock_id=code,
amount=amount,
start_time=start_time,
end_time=end_time,
direction=direction,
)
class TradeRange:
def __call__(self, trade_calendar: TradeCalendarManager) -> Tuple[int, int]:
"""
This method will be called in the following way:
The outer strategy gives a decision with a `TradeRange`.
The decision will be checked by the inner decision.
The inner decision will pass its trade_calendar as a parameter when getting the trading range
- The framework's step is integer-index based.
Parameters
----------
trade_calendar : TradeCalendarManager
the trade_calendar is from inner strategy
Returns
-------
Tuple[int, int]:
the start index and end index which are tradable
Raises
------
NotImplementedError:
Exceptions are raised when no range limitation
"""
raise NotImplementedError(f"Please implement the `__call__` method")
def clip_time_range(self, start_time: pd.Timestamp, end_time: pd.Timestamp) -> Tuple[pd.Timestamp, pd.Timestamp]:
"""
Parameters
----------
start_time : pd.Timestamp
end_time : pd.Timestamp
Both sides (start_time, end_time) are closed
Returns
-------
Tuple[pd.Timestamp, pd.Timestamp]:
The tradable time range.
- It is intersection of [start_time, end_time] and the rule of TradeRange itself
"""
raise NotImplementedError(f"Please implement the `clip_time_range` method")
class IdxTradeRange(TradeRange):
def __init__(self, start_idx: int, end_idx: int):
self._start_idx = start_idx
self._end_idx = end_idx
def __call__(self, trade_calendar: TradeCalendarManager = None) -> Tuple[int, int]:
return self._start_idx, self._end_idx
class TradeRangeByTime(TradeRange):
"""This is a helper class for making decisions"""
def __init__(self, start_time: str, end_time: str):
"""
This is a callable class.
**NOTE**:
- It is designed for minute-bar for intraday trading!!!!!
- Both start_time and end_time are **closed** in the range
Parameters
----------
start_time : str
e.g. "9:30"
end_time : str
e.g. "14:30"
"""
self.start_time = pd.Timestamp(start_time).time()
self.end_time = pd.Timestamp(end_time).time()
assert self.start_time < self.end_time
def __call__(self, trade_calendar: TradeCalendarManager = None) -> Tuple[int, int]:
if trade_calendar is None:
raise NotImplementedError("trade_calendar is necessary for getting TradeRangeByTime.")
start = trade_calendar.start_time
val_start, val_end = concat_date_time(start.date(), self.start_time), concat_date_time(
start.date(), self.end_time
)
return trade_calendar.get_range_idx(val_start, val_end)
def clip_time_range(self, start_time: pd.Timestamp, end_time: pd.Timestamp) -> Tuple[pd.Timestamp, pd.Timestamp]:
start_date = start_time.date()
val_start, val_end = concat_date_time(start_date, self.start_time), concat_date_time(start_date, self.end_time)
# NOTE: `end_date` should not be used. Because the `end_date` is for slicing. It may be in the next day
# Assumption: start_time and end_time is for intraday trading. So it is OK for only using start_date
return max(val_start, start_time), min(val_end, end_time)
class BaseTradeDecision:
"""
Trade decisions are made by the strategy and executed by the executor
Motivation:
Here are several typical scenarios for `BaseTradeDecision`
Case 1:
1. Outer strategy makes a decision. The decision is not available at the start of current interval
2. After a period of time, the decision is updated and becomes available
3. The inner strategy tries to get the decision and starts to execute it according to `get_range_limit`
Case 2:
1. The outer strategy's decision is available at the start of the interval
2. Same as `case 1.3`
"""
def __init__(self, strategy: BaseStrategy, trade_range: Union[Tuple[int, int], TradeRange] = None):
"""
Parameters
----------
strategy : BaseStrategy
The strategy that makes the decision
trade_range: Union[Tuple[int, int], TradeRange] (optional)
The index range for underlying strategy.
Here are two examples of trade_range for each type
1) Tuple[int, int]
start_index and end_index of the underlying strategy(both sides are closed)
2) TradeRange
"""
self.strategy = strategy
self.start_time, self.end_time = strategy.trade_calendar.get_step_time()
self.total_step = None # upper strategy has no knowledge about the sub executor before `_init_sub_trading`
if isinstance(trade_range, tuple):
# for Tuple[int, int]
trade_range = IdxTradeRange(*trade_range)
self.trade_range: TradeRange = trade_range
def get_decision(self) -> List[object]:
"""
get the **concrete decision** (e.g. execution orders)
This will be called by the inner strategy
Returns
-------
List[object]:
The decision result. Typically it is some orders
Example:
[]:
Decision not available
[concrete_decision]:
available
"""
raise NotImplementedError("Please implement the `get_decision` method")
def update(self, trade_calendar: TradeCalendarManager) -> Union["BaseTradeDecision", None]:
"""
Be called at the **start** of each step.
This function is designed for the following purposes
1) Leave a hook for the strategy that made `self` decision to update the decision itself
2) Update some information from the inner executor calendar
Parameters
----------
trade_calendar : TradeCalendarManager
The calendar of the **inner strategy**!!!!!
Returns
-------
None:
No update, use previous decision(or unavailable)
BaseTradeDecision:
New update, use new decision
"""
# purpose 2)
self.total_step = trade_calendar.get_trade_len()
# purpose 1)
return self.strategy.update_trade_decision(self, trade_calendar)
def _get_range_limit(self, **kwargs) -> Tuple[int, int]:
if self.trade_range is not None:
return self.trade_range(trade_calendar=kwargs.get("inner_calendar"))
else:
raise NotImplementedError("The decision didn't provide an index range")
def get_range_limit(self, **kwargs) -> Tuple[int, int]:
"""
return the expected step range for limiting the decision execution time
Both left and right are **closed**
if no available trade_range, `default_value` will be returned
It is only used in `NestedExecutor`
- The outermost strategy will not follow any range limit (but it may give range_limit)
- The innermost strategy's range_limit will be useless because atomic executors don't have such
features.
**NOTE**:
1) This function must be called after `self.update` in following cases(ensured by NestedExecutor):
- user relies on the auto-clip feature of `self.update`
2) This function will be called after _init_sub_trading in NestedExecutor.
Parameters
----------
**kwargs:
{
"default_value": <default_value>, # a dict is used to distinguish between no value provided and None provided
"inner_calendar": <trade calendar of inner strategy>
# because the range limit will control the step range of inner strategy, inner calendar will be an
# important parameter when trade_range is callable
}
Returns
-------
Tuple[int, int]:
Raises
------
NotImplementedError:
If the following criteria are met
1) the decision can't provide a unified start and end
2) default_value is not provided
"""
try:
_start_idx, _end_idx = self._get_range_limit(**kwargs)
except NotImplementedError:
if "default_value" in kwargs:
return kwargs["default_value"]
else:
# Default to get full index
raise NotImplementedError(f"The decision didn't provide an index range")
# clip index
if getattr(self, "total_step", None) is not None:
# if `self.update` is called.
# Then the _start_idx, _end_idx should be clipped
if _start_idx < 0 or _end_idx >= self.total_step:
logger = get_module_logger("decision")
logger.warning(
f"[{_start_idx},{_end_idx}] goes beyond the total_step({self.total_step}), it will be clipped"
)
_start_idx, _end_idx = max(0, _start_idx), min(self.total_step - 1, _end_idx)
return _start_idx, _end_idx
def get_data_cal_range_limit(self, rtype: str = "full", raise_error: bool = False) -> Tuple[int, int]:
"""
get the range limit based on data calendar
NOTE: it is **total** range limit instead of a single step
The following assumptions are made
1) The frequency of the exchange in common_infra is the same as the data calendar
2) Users want the index mod by **day** (i.e. 240 min)
Parameters
----------
rtype: str
- "full": return the full limitation of the decision in the day
- "step": return the limitation of current step
raise_error: bool
True: raise error if no trade_range is set
False: return full trade calendar.
It is useful in following cases
- users want to follow the order-specific trading time range when the decision-level trade range is not
available. NotImplementedError is raised to indicate that the range limit is not available
Returns
-------
Tuple[int, int]:
the range limit in data calendar
Raises
------
NotImplementedError:
If the following criteria are met
1) the decision can't provide a unified start and end
2) raise_error is True
"""
# potential performance issue
day_start = pd.Timestamp(self.start_time.date())
day_end = epsilon_change(day_start + pd.Timedelta(days=1))
freq = self.strategy.trade_exchange.freq
_, _, day_start_idx, day_end_idx = Cal.locate_index(day_start, day_end, freq=freq)
if self.trade_range is None:
if raise_error:
raise NotImplementedError(f"There is no trade_range in this case")
else:
return 0, day_end_idx - day_start_idx
else:
if rtype == "full":
val_start, val_end = self.trade_range.clip_time_range(day_start, day_end)
elif rtype == "step":
val_start, val_end = self.trade_range.clip_time_range(self.start_time, self.end_time)
else:
raise ValueError(f"This type of input {rtype} is not supported")
_, _, start_idx, end_index = Cal.locate_index(val_start, val_end, freq=freq)
return start_idx - day_start_idx, end_index - day_start_idx
def empty(self) -> bool:
for obj in self.get_decision():
if isinstance(obj, Order):
# Zero amount order will be treated as empty
if obj.amount > 1e-6:
return False
else:
return True
return True
def mod_inner_decision(self, inner_trade_decision: BaseTradeDecision):
"""
This method will be called on the inner_trade_decision after it is generated.
`inner_trade_decision` will be changed **in place**.
Motivation of the `mod_inner_decision`
- Leave a hook for outer decision to affect the decision generated by the inner strategy
- e.g. the outermost strategy generates a time range for trading. But the upper layer can only affect the
nearest layer in the original design. With `mod_inner_decision`, the decision can be passed through multiple
layers
Parameters
----------
inner_trade_decision : BaseTradeDecision
"""
# base class provide a default behaviour to modify inner_trade_decision
# trade_range should be propagated when inner trade_range is not set
if inner_trade_decision.trade_range is None:
inner_trade_decision.trade_range = self.trade_range
class EmptyTradeDecision(BaseTradeDecision):
def empty(self) -> bool:
return True
class TradeDecisionWO(BaseTradeDecision):
"""
Trade Decision (W)ith (O)rder.
Besides, the time_range is also included.
"""
def __init__(self, order_list: List[Order], strategy: BaseStrategy, trade_range: Tuple[int, int] = None):
super().__init__(strategy, trade_range=trade_range)
self.order_list = order_list
start, end = strategy.trade_calendar.get_step_time()
for o in order_list:
if o.start_time is None:
o.start_time = start
if o.end_time is None:
o.end_time = end
def get_decision(self) -> List[object]:
return self.order_list
def __repr__(self) -> str:
return f"class: {self.__class__.__name__}; strategy: {self.strategy}; trade_range: {self.trade_range}; order_list[{len(self.order_list)}]"
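# Illustrative sketch (not part of decision.py): the index clipping done at the end of
# `BaseTradeDecision.get_range_limit` above, isolated as a pure function. `clip_range` is a hypothetical
# name for this example; both ends of the range are closed, as in the docstring above.

```python
from typing import Optional, Tuple


def clip_range(start_idx: int, end_idx: int, total_step: Optional[int]) -> Tuple[int, int]:
    """Clip a closed index range to [0, total_step - 1]; leave it untouched if total_step is unknown."""
    if total_step is None:
        # mirrors the case where `update` has not been called yet, so no clipping is possible
        return start_idx, end_idx
    return max(0, start_idx), min(total_step - 1, end_idx)


clipped = clip_range(-2, 10, total_step=8)
unchanged = clip_range(1, 3, total_step=8)
unknown = clip_range(-2, 10, total_step=None)
```

# With 8 inner steps the valid indices are 0..7, so a requested range of [-2, 10] shrinks to [0, 7].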

802
qlib/backtest/exchange.py Normal file

@@ -0,0 +1,802 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from __future__ import annotations
from collections import defaultdict
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from .account import Account
from qlib.backtest.position import BasePosition, Position
import random
from typing import List, Tuple, Union
import numpy as np
import pandas as pd
from ..data.data import D
from ..config import C, REG_CN
from ..log import get_module_logger
from .decision import Order, OrderDir, OrderHelper
from .high_performance_ds import BaseQuote, PandasQuote, NumpyQuote
class Exchange:
def __init__(
self,
freq="day",
start_time=None,
end_time=None,
codes="all",
deal_price: Union[str, Tuple[str], List[str]] = None,
subscribe_fields=[],
limit_threshold: Union[Tuple[str, str], float, None] = None,
volume_threshold=None,
open_cost=0.0015,
close_cost=0.0025,
min_cost=5,
impact_cost=0.0,
extra_quote=None,
quote_cls=NumpyQuote,
**kwargs,
):
"""__init__
:param freq: frequency of data
:param start_time: closed start time for backtest
:param end_time: closed end time for backtest
:param codes: a list of stock_id or a string naming an instrument pool (e.g. all, csi500, sse50)
:param deal_price: Union[str, Tuple[str, str], List[str]]
The `deal_price` supports following two types of input
- <deal_price> : str
- (<buy_price>, <sell_price>): Tuple[str] or List[str]
<deal_price>, <buy_price> or <sell_price> := <price>
<price> := str
- for example '$close', '$open', '$vwap' ("close" is OK. `Exchange` will help to prepend
"$" to the expression)
:param subscribe_fields: list, subscribe fields. These expressions will be added to the query and `self.quote`.
It is useful when users want more fields to be queried
:param limit_threshold: Union[Tuple[str, str], float, None]
1) `None`: no limitation
2) float, 0.1 for example, default None
3) Tuple[str, str]: (<the expression for buying stock limitation>,
<the expression for sell stock limitation>)
`False` value indicates the stock is tradable
`True` value indicates the stock is limited and not tradable
:param volume_threshold: Union[
Dict[
"all": ("cum" or "current", limit_str),
"buy": ("cum" or "current", limit_str),
"sell":("cum" or "current", limit_str),
],
("cum" or "current", limit_str),
]
1) ("cum" or "current", limit_str) denotes a single volume limit.
- limit_str is a qlib data expression, which may use your own custom Operator.
Please refer to qlib/contrib/ops/high_freq.py, which contains custom operators for high frequency,
such as DayCumsum. !!!NOTE: if you want to use a custom operator, you need to
register it in qlib_init.
- "cum" means that this is a cumulative value over time, such as cumulative market volume.
So when it is used as a volume limit, it is necessary to subtract the dealt amount.
- "current" means that this is a real-time value and will not accumulate over time,
so it can be directly used as a capacity limit.
e.g. ("cum", "0.2 * DayCumsum($volume, '9:45', '14:45')"), ("current", "$bidV1")
2) "all" means the volume limits apply to both buying and selling.
"buy" means the volume limits of buying. "sell" means the volume limits of selling.
Different volume limits will be aggregated with min(). If volume_threshold is only
("cum" or "current", limit_str) instead of a dict, the volume limits are for
both by default. In other words, it is the same as {"all": ("cum" or "current", limit_str)}.
3) e.g. "volume_threshold": {
"all": ("cum", "0.2 * DayCumsum($volume, '9:45', '14:45')"),
"buy": ("current", "$askV1"),
"sell": ("current", "$bidV1"),
}
:param open_cost: cost rate for open, default 0.0015
:param close_cost: cost rate for close, default 0.0025
:param trade_unit: trade unit, 100 for China A market.
None for disable trade unit.
**NOTE**: `trade_unit` is included in the `kwargs`. It is necessary because we must
distinguish `not set` and `disable trade_unit`
:param min_cost: min cost, default 5
:param impact_cost: market impact cost rate (a.k.a. slippage). A recommended value is 0.1.
:param extra_quote: pandas, dataframe consists of
columns: like ['$vwap', '$close', '$volume', '$factor', 'limit_sell', 'limit_buy'].
The limit indicates that the etf is tradable on a specific day.
Necessary fields:
$close is for calculating the total value at end of each day.
Optional fields:
$volume is only necessary when we limit the trade amount or calculate PA(vwap) indicator
$vwap is only necessary when we use the $vwap price as the deal price
$factor is for rounding to the trading unit
limit_sell will be set to False by default(False indicates we can sell this
target on this day).
limit_buy will be set to False by default(False indicates we can buy this
target on this day).
index: MultiIndex(instrument, pd.Datetime)
"""
self.freq = freq
self.start_time = start_time
self.end_time = end_time
self.trade_unit = kwargs.pop("trade_unit", C.trade_unit)
if len(kwargs) > 0:
raise ValueError(f"Get Unexpected arguments {kwargs}")
if limit_threshold is None:
limit_threshold = C.limit_threshold
if deal_price is None:
deal_price = C.deal_price
# we have some verbose information here, so logging is enabled
self.logger = get_module_logger("online operator")
# TODO: the quote, trade_dates, codes are not necessary.
# It is just for performance consideration.
self.limit_type = self._get_limit_type(limit_threshold)
if limit_threshold is None:
if C.region == REG_CN:
self.logger.warning("limit_threshold not set. Stocks that hit the price limit may still be bought/sold")
elif self.limit_type == self.LT_FLT and abs(limit_threshold) > 0.1:
if C.region == REG_CN:
self.logger.warning(f"limit_threshold may not be set to a reasonable value")
if isinstance(deal_price, str):
if deal_price[0] != "$":
deal_price = "$" + deal_price
self.buy_price = self.sell_price = deal_price
elif isinstance(deal_price, (tuple, list)):
self.buy_price, self.sell_price = deal_price
else:
raise NotImplementedError(f"This type of input is not supported")
if isinstance(codes, str):
codes = D.instruments(codes)
self.codes = codes
# Necessary fields
# $close is for calculating the total value at end of each day.
# $factor is for rounding to the trading unit
# $change is for calculating the limit of the stock
# get volume limit from volume_threshold
self.buy_vol_limit, self.sell_vol_limit, vol_lt_fields = self._get_vol_limit(volume_threshold)
necessary_fields = {self.buy_price, self.sell_price, "$close", "$change", "$factor", "$volume"}
if self.limit_type == self.LT_TP_EXP:
for exp in limit_threshold:
necessary_fields.add(exp)
all_fields = necessary_fields | vol_lt_fields
all_fields = list(all_fields | set(subscribe_fields))
self.all_fields = all_fields
self.open_cost = open_cost
self.close_cost = close_cost
self.min_cost = min_cost
self.impact_cost = impact_cost
self.limit_threshold: Union[Tuple[str, str], float, None] = limit_threshold
self.volume_threshold = volume_threshold
self.extra_quote = extra_quote
self.get_quote_from_qlib()
# init quote by quote_df
self.quote_cls = quote_cls
self.quote: BaseQuote = self.quote_cls(self.quote_df, freq)
def get_quote_from_qlib(self):
# get stock data from qlib
if len(self.codes) == 0:
self.codes = D.instruments()
self.quote_df = D.features(
self.codes, self.all_fields, self.start_time, self.end_time, freq=self.freq, disk_cache=True
).dropna(subset=["$close"])
self.quote_df.columns = self.all_fields
# check buy_price data and sell_price data
for attr in "buy_price", "sell_price":
pstr = getattr(self, attr) # price string
if self.quote_df[pstr].isna().any():
self.logger.warning("{} field data contains nan.".format(pstr))
# update trade_w_adj_price
if self.quote_df["$factor"].isna().any():
# The `factor.day.bin` file does not exist, or the `factor` field contains `nan`
# Use adjusted price
self.trade_w_adj_price = True
self.logger.warning("factor.day.bin file does not exist or factor contains `nan`. Orders will use adjusted_price.")
if self.trade_unit is not None:
self.logger.warning(f"trade unit {self.trade_unit} is not supported in adjusted_price mode.")
else:
# The `factor.day.bin` file exists and all data `close` and `factor` are not `nan`
# Use normal price
self.trade_w_adj_price = False
# update limit
self._update_limit(self.limit_threshold)
# concat extra_quote
if self.extra_quote is not None:
# process extra_quote
if "$close" not in self.extra_quote:
raise ValueError("$close is necessary in extra_quote")
for attr in "buy_price", "sell_price":
pstr = getattr(self, attr) # price string
if pstr not in self.extra_quote.columns:
self.extra_quote[pstr] = self.extra_quote["$close"]
self.logger.warning(f"No {pstr} set for extra_quote. Use $close as {pstr}.")
if "$factor" not in self.extra_quote.columns:
self.extra_quote["$factor"] = 1.0
self.logger.warning("No $factor set for extra_quote. Use 1.0 as $factor.")
if "limit_sell" not in self.extra_quote.columns:
self.extra_quote["limit_sell"] = False
self.logger.warning("No limit_sell set for extra_quote. All stock will be able to be sold.")
if "limit_buy" not in self.extra_quote.columns:
self.extra_quote["limit_buy"] = False
self.logger.warning("No limit_buy set for extra_quote. All stock will be able to be bought.")
assert set(self.extra_quote.columns) == set(self.quote_df.columns) - {"$change"}
self.quote_df = pd.concat([self.quote_df, self.extra_quote], sort=False, axis=0)
LT_TP_EXP = "(exp)" # Tuple[str, str]
LT_FLT = "float" # float
LT_NONE = "none" # none
def _get_limit_type(self, limit_threshold):
"""get limit type"""
if isinstance(limit_threshold, tuple):
return self.LT_TP_EXP
elif isinstance(limit_threshold, float):
return self.LT_FLT
elif limit_threshold is None:
return self.LT_NONE
else:
raise NotImplementedError(f"`limit_threshold` of type {type(limit_threshold)} is not supported")
def _update_limit(self, limit_threshold):
# check limit_threshold
limit_type = self._get_limit_type(limit_threshold)
if limit_type == self.LT_NONE:
self.quote_df["limit_buy"] = False
self.quote_df["limit_sell"] = False
elif limit_type == self.LT_TP_EXP:
# set limit
self.quote_df["limit_buy"] = self.quote_df[limit_threshold[0]]
self.quote_df["limit_sell"] = self.quote_df[limit_threshold[1]]
elif limit_type == self.LT_FLT:
self.quote_df["limit_buy"] = self.quote_df["$change"].ge(limit_threshold)
self.quote_df["limit_sell"] = self.quote_df["$change"].le(-limit_threshold) # pylint: disable=E1130
def _get_vol_limit(self, volume_threshold):
"""
Preprocess the volume limit.
Collect the fields that need to be loaded from qlib.
Build the lists of volume limits for buying and selling, each composed of all applicable limits.
Parameters
----------
volume_threshold :
please refer to the doc of exchange.
Returns
-------
buy_vol_limit: List[Tuple[str]]
all volume limits of buying, or None if `volume_threshold` is None.
sell_vol_limit: List[Tuple[str]]
all volume limits of selling, or None if `volume_threshold` is None.
fields: set
the fields that need to be loaded from qlib.
Raises
------
ValueError
the format of volume_threshold is not supported.
"""
if volume_threshold is None:
return None, None, set()
fields = set()
buy_vol_limit = []
sell_vol_limit = []
if isinstance(volume_threshold, tuple):
volume_threshold = {"all": volume_threshold}
assert isinstance(volume_threshold, dict)
for key in volume_threshold:
vol_limit = volume_threshold[key]
assert isinstance(vol_limit, tuple)
fields.add(vol_limit[1])
if key in ("buy", "all"):
buy_vol_limit.append(vol_limit)
if key in ("sell", "all"):
sell_vol_limit.append(vol_limit)
return buy_vol_limit, sell_vol_limit, fields
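The normalization performed by `_get_vol_limit` can be sketched as a standalone function. This is a minimal illustration, not the method itself: the name `split_volume_threshold` and the expression strings in the example are assumptions chosen for demonstration.

```python
def split_volume_threshold(volume_threshold):
    """Return (buy_limits, sell_limits, fields) for a volume_threshold config.

    Hypothetical standalone version of the logic in `_get_vol_limit`.
    """
    if volume_threshold is None:
        return None, None, set()
    if isinstance(volume_threshold, tuple):
        # a bare tuple applies to both buying and selling
        volume_threshold = {"all": volume_threshold}
    buy_limits, sell_limits, fields = [], [], set()
    for key, limit in volume_threshold.items():
        # limit is (limit_type, expression); the expression must be loaded as a field
        fields.add(limit[1])
        if key in ("buy", "all"):
            buy_limits.append(limit)
        if key in ("sell", "all"):
            sell_limits.append(limit)
    return buy_limits, sell_limits, fields


# Illustrative expressions only; real configs may use other fields.
buy, sell, fields = split_volume_threshold(
    {"all": ("cum", "0.2 * $volume"), "buy": ("current", "$volume")}
)
print(buy, sell, sorted(fields))
```

A limit under the `"all"` key ends up in both lists, so an order is later clipped by the minimum over every limit that applies to its direction.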
def check_stock_limit(self, stock_id, start_time, end_time, direction=None):
"""
Parameters
----------
direction : int, optional
trade direction, by default None
- if direction is None, check whether the stock hits the buy limit or the sell limit.
- if direction == Order.BUY, check the buy limit.
- if direction == Order.SELL, check the sell limit.
"""
if direction is None:
buy_limit = self.quote.get_data(stock_id, start_time, end_time, field="limit_buy", method="all")
sell_limit = self.quote.get_data(stock_id, start_time, end_time, field="limit_sell", method="all")
return buy_limit or sell_limit
elif direction == Order.BUY:
return self.quote.get_data(stock_id, start_time, end_time, field="limit_buy", method="all")
elif direction == Order.SELL:
return self.quote.get_data(stock_id, start_time, end_time, field="limit_sell", method="all")
else:
raise ValueError(f"direction {direction} is not supported!")
def check_stock_suspended(self, stock_id, start_time, end_time):
# is suspended
if stock_id in self.quote.get_all_stock():
return self.quote.get_data(stock_id, start_time, end_time, "$close") is None
else:
return True
def is_stock_tradable(self, stock_id, start_time, end_time, direction=None):
# check if stock can be traded
# same as check in check_order
if self.check_stock_suspended(stock_id, start_time, end_time) or self.check_stock_limit(
stock_id, start_time, end_time, direction
):
return False
else:
return True
def check_order(self, order):
# check limit and suspended
if self.check_stock_suspended(order.stock_id, order.start_time, order.end_time) or self.check_stock_limit(
order.stock_id, order.start_time, order.end_time, order.direction
):
return False
else:
return True
def deal_order(
self,
order,
trade_account: Account = None,
position: BasePosition = None,
dealt_order_amount: defaultdict = None,
):
"""
Deal the order when the actual transaction happens.
The results section in `Order` will be changed.
:param order: the order to deal.
:param trade_account: trade account to be updated after dealing the order.
:param position: position to be updated after dealing the order.
:param dealt_order_amount: the dealt order amount dict with the format of {stock_id: float}
:return: trade_val, trade_cost, trade_price
"""
# create the dict per call; a mutable default argument would be shared across calls
if dealt_order_amount is None:
dealt_order_amount = defaultdict(float)
# check order first.
if self.check_order(order) is False:
order.deal_amount = 0.0
# use np.nan instead of None so that the value can be shown conveniently in format strings
self.logger.debug(f"Order failed due to trading limitation: {order}")
return 0.0, 0.0, np.nan
if trade_account is not None and position is not None:
raise ValueError("only one of `trade_account` and `position` can be given")
# NOTE: order will be changed in this function
trade_price, trade_val, trade_cost = self._calc_trade_info_by_order(
order, trade_account.current_position if trade_account else position, dealt_order_amount
)
if trade_val > 1e-5:
# If the order can only be dealt with 0 value, nothing needs to be updated.
# Otherwise, it will result in
# 1) some stock with 0 value in the position
# 2) `min_cost` of trade_cost will be lost in user account
if trade_account:
trade_account.update_order(order=order, trade_val=trade_val, cost=trade_cost, trade_price=trade_price)
elif position:
position.update_order(order=order, trade_val=trade_val, cost=trade_cost, trade_price=trade_price)
return trade_val, trade_cost, trade_price
def get_quote_info(self, stock_id, start_time, end_time, method="ts_data_last"):
return self.quote.get_data(stock_id, start_time, end_time, method=method)
def get_close(self, stock_id, start_time, end_time, method="ts_data_last"):
return self.quote.get_data(stock_id, start_time, end_time, field="$close", method=method)
def get_volume(self, stock_id, start_time, end_time):
"""get the total deal volume of stock with `stock_id` between the time interval [start_time, end_time)"""
return self.quote.get_data(stock_id, start_time, end_time, field="$volume", method="sum")
def get_deal_price(self, stock_id, start_time, end_time, direction: OrderDir, method="ts_data_last"):
if direction == OrderDir.SELL:
pstr = self.sell_price
elif direction == OrderDir.BUY:
pstr = self.buy_price
else:
raise NotImplementedError(f"direction {direction} is not supported")
deal_price = self.quote.get_data(stock_id, start_time, end_time, field=pstr, method=method)
if method is not None and (deal_price is None or np.isnan(deal_price) or deal_price <= 1e-08):
self.logger.warning(f"(stock_id:{stock_id}, trade_time:{(start_time, end_time)}, {pstr}): {deal_price}!!!")
self.logger.warning("setting deal_price to close price")
deal_price = self.get_close(stock_id, start_time, end_time, method)
return deal_price
def get_factor(self, stock_id, start_time, end_time) -> Union[float, None]:
"""
Returns
-------
Union[float, None]:
`None`: if the stock is suspended `None` may be returned
`float`: return factor if the factor exists
"""
assert start_time is not None and end_time is not None, "the time range must be given"
if stock_id not in self.quote.get_all_stock():
return None
return self.quote.get_data(stock_id, start_time, end_time, field="$factor", method="ts_data_last")
def generate_amount_position_from_weight_position(
self, weight_position, cash, start_time, end_time, direction=OrderDir.BUY
):
"""
Generate the target position according to the weight and the cash.
NOTE: All the cash will be assigned to the tradable stocks.
Parameter:
weight_position : dict {stock_id : weight}; allocate cash by weight_position
each weight must be in the range 0 <= weight <= 1
cash : cash
start_time : the start time point of the step
end_time : the end time point of the step
direction : the direction of the deal price for estimating the amount
# NOTE: this function is used for calculating target position. So the default direction is buy
"""
# calculate the total weight of tradable value
tradable_weight = 0.0
for stock_id in weight_position:
if self.is_stock_tradable(stock_id=stock_id, start_time=start_time, end_time=end_time):
# each weight must lie in the range [0, 1]
if weight_position[stock_id] < 0 or weight_position[stock_id] > 1:
raise ValueError(
"weight_position is {}, "
"weight_position is not in the range of [0, 1].".format(weight_position[stock_id])
)
tradable_weight += weight_position[stock_id]
if tradable_weight - 1.0 >= 1e-5:
raise ValueError("tradable_weight is {}, can not be greater than 1.".format(tradable_weight))
amount_dict = {}
for stock_id in weight_position:
if weight_position[stock_id] > 0.0 and self.is_stock_tradable(
stock_id=stock_id, start_time=start_time, end_time=end_time
):
amount_dict[stock_id] = (
cash
* weight_position[stock_id]
/ tradable_weight
// self.get_deal_price(
stock_id=stock_id, start_time=start_time, end_time=end_time, direction=direction
)
)
return amount_dict
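The allocation in `generate_amount_position_from_weight_position` renormalizes the weights over the tradable stocks only, so the whole cash is spread across them. A minimal standalone sketch follows; the `prices` dict and `tradable` set are assumptions that stand in for `get_deal_price` and `is_stock_tradable`.

```python
def amounts_from_weights(weight_position, cash, prices, tradable):
    """Hypothetical standalone version: convert target weights to share amounts."""
    # total weight of the stocks that can actually be traded
    tradable_weight = sum(w for s, w in weight_position.items() if s in tradable)
    amounts = {}
    for stock_id, weight in weight_position.items():
        if weight > 0.0 and stock_id in tradable:
            # cash share of this stock, floor-divided by its price
            amounts[stock_id] = cash * weight / tradable_weight // prices[stock_id]
    return amounts


# "C" is assumed to be suspended, so its 25% weight is redistributed to "A" and "B".
result = amounts_from_weights(
    {"A": 0.5, "B": 0.25, "C": 0.25},
    cash=10000,
    prices={"A": 10.0, "B": 20.0, "C": 5.0},
    tradable={"A", "B"},
)
print(result)  # {'A': 666.0, 'B': 166.0}
```

Because of the floor division, a small amount of cash is typically left unallocated.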
def get_real_deal_amount(self, current_amount, target_amount, factor):
"""
Calculate the real adjust deal amount when considering the trading unit
:param current_amount:
:param target_amount:
:param factor:
:return real_deal_amount; Positive deal_amount indicates buying more stock.
"""
if current_amount == target_amount:
return 0
elif current_amount < target_amount:
deal_amount = target_amount - current_amount
deal_amount = self.round_amount_by_trade_unit(deal_amount, factor)
return deal_amount
else:
if target_amount == 0:
return -current_amount
else:
deal_amount = current_amount - target_amount
deal_amount = self.round_amount_by_trade_unit(deal_amount, factor)
return -deal_amount
def generate_order_for_target_amount_position(self, target_position, current_position, start_time, end_time):
"""
Note: some future information is used in this function
Parameter:
target_position : dict { stock_id : amount }
current_position : dict { stock_id : amount }
trade_unit : trade_unit
down sample : for amount 321 and trade_unit 100, deal_amount is 300
deal order on trade_date
"""
# split buy and sell for further use
buy_order_list = []
sell_order_list = []
# three parts: kept stock_id, dropped stock_id, new stock_id
# handle kept stock_id
# The iteration order of a set is not fixed, so the trading order of the stocks could differ between runs,
# and the backtest results of the same parameters could differ as well.
# So we sort stock_id first and then shuffle it with a fixed random seed,
# which makes the final stock_id order deterministic.
sorted_ids = sorted(set(list(current_position.keys()) + list(target_position.keys())))
random.seed(0)
random.shuffle(sorted_ids)
for stock_id in sorted_ids:
# Do not generate order for the nontradable stocks
if not self.is_stock_tradable(stock_id=stock_id, start_time=start_time, end_time=end_time):
continue
target_amount = target_position.get(stock_id, 0)
current_amount = current_position.get(stock_id, 0)
factor = self.get_factor(stock_id, start_time=start_time, end_time=end_time)
deal_amount = self.get_real_deal_amount(current_amount, target_amount, factor)
if deal_amount == 0:
continue
elif deal_amount > 0:
# buy stock
buy_order_list.append(
Order(
stock_id=stock_id,
amount=deal_amount,
direction=Order.BUY,
start_time=start_time,
end_time=end_time,
factor=factor,
)
)
else:
# sell stock
sell_order_list.append(
Order(
stock_id=stock_id,
amount=abs(deal_amount),
direction=Order.SELL,
start_time=start_time,
end_time=end_time,
factor=factor,
)
)
# return order_list : sell orders first, then buy orders
return sell_order_list + buy_order_list
def calculate_amount_position_value(
self, amount_dict, start_time, end_time, only_tradable=False, direction=OrderDir.SELL
):
"""Parameter
only_tradable : bool, only count the stocks that are neither suspended nor limited
amount_dict : {stock_id : amount}
direction : the direction of the deal price for estimating the amount
# NOTE:
This function is used for calculating current position value.
So the default direction is sell.
"""
value = 0
for stock_id in amount_dict:
if (
only_tradable is True
and self.check_stock_suspended(stock_id=stock_id, start_time=start_time, end_time=end_time) is False
and self.check_stock_limit(stock_id=stock_id, start_time=start_time, end_time=end_time) is False
or only_tradable is False
):
value += (
self.get_deal_price(
stock_id=stock_id, start_time=start_time, end_time=end_time, direction=direction
)
* amount_dict[stock_id]
)
return value
def _get_factor_or_raise_error(self, factor: float = None, stock_id: str = None, start_time=None, end_time=None):
"""Please refer to the docs of get_amount_of_trade_unit"""
if factor is None:
if stock_id is not None and start_time is not None and end_time is not None:
factor = self.get_factor(stock_id=stock_id, start_time=start_time, end_time=end_time)
else:
raise ValueError(f"`factor` and (`stock_id`, `start_time`, `end_time`) can't both be None")
return factor
def get_amount_of_trade_unit(self, factor: float = None, stock_id: str = None, start_time=None, end_time=None):
"""
get the trade unit of amount based on **factor**
the factor can be given directly or calculated in given time range and stock id.
`factor` has higher priority than `stock_id`, `start_time` and `end_time`
Parameters
----------
factor : float
the adjusted factor
stock_id : str
the id of the stock
start_time :
the start time of trading range
end_time :
the end time of trading range
"""
if not self.trade_w_adj_price and self.trade_unit is not None:
factor = self._get_factor_or_raise_error(
factor=factor, stock_id=stock_id, start_time=start_time, end_time=end_time
)
return self.trade_unit / factor
else:
return None
def round_amount_by_trade_unit(
self, deal_amount, factor: float = None, stock_id: str = None, start_time=None, end_time=None
):
"""Parameter
Please refer to the docs of get_amount_of_trade_unit
deal_amount : float, adjusted amount
factor : float, adjusted factor
return : float, real amount
"""
if not self.trade_w_adj_price and self.trade_unit is not None:
# the minimal amount is 1. Add 0.1 for solving precision problem.
factor = self._get_factor_or_raise_error(
factor=factor, stock_id=stock_id, start_time=start_time, end_time=end_time
)
return (deal_amount * factor + 0.1) // self.trade_unit * self.trade_unit / factor
return deal_amount
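The rounding formula in `round_amount_by_trade_unit` converts the adjusted amount into real shares with `factor`, floors it to a multiple of the trade unit, and converts back. A standalone sketch, assuming a hypothetical default `trade_unit` of 100:

```python
def round_by_trade_unit(deal_amount, factor, trade_unit=100):
    """Hypothetical standalone version of the rounding formula.

    deal_amount is an adjusted amount; deal_amount * factor is the real share count.
    """
    if trade_unit is None:
        # no trading unit configured: nothing to round
        return deal_amount
    # + 0.1 guards against floating-point error just below a unit boundary
    return (deal_amount * factor + 0.1) // trade_unit * trade_unit / factor


print(round_by_trade_unit(321, 1.0))    # 300.0
print(round_by_trade_unit(160.5, 2.0))  # 150.0 (321 real shares -> 300 real shares)
```

The result is still expressed in adjusted amount, which is why the last step divides by `factor` again.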
def _clip_amount_by_volume(self, order: Order, dealt_order_amount: dict) -> Union[float, None]:
"""parse the capacity limit string and clip the order amount to the amount that can actually be executed.
NOTE:
this function will change the order.deal_amount **inplace**
- This will make the order info more accurate
Parameters
----------
order : Order
the order to be executed.
dealt_order_amount : dict
the dealt order amount dict with the format of {stock_id: float}
Returns
-------
Union[float, None]
the unchanged `order.deal_amount` if no volume limit is configured, otherwise None.
"""
if order.direction == Order.BUY:
vol_limit = self.buy_vol_limit
elif order.direction == Order.SELL:
vol_limit = self.sell_vol_limit
else:
raise ValueError(f"direction {order.direction} is not supported!")
if vol_limit is None:
return order.deal_amount
vol_limit_num = []
for limit in vol_limit:
assert isinstance(limit, tuple)
if limit[0] == "current":
limit_value = self.quote.get_data(
order.stock_id,
order.start_time,
order.end_time,
field=limit[1],
method="sum",
)
vol_limit_num.append(limit_value)
elif limit[0] == "cum":
limit_value = self.quote.get_data(
order.stock_id,
order.start_time,
order.end_time,
field=limit[1],
method="ts_data_last",
)
vol_limit_num.append(limit_value - dealt_order_amount[order.stock_id])
else:
raise ValueError(f"{limit[0]} is not supported")
vol_limit_min = min(vol_limit_num)
orig_deal_amount = order.deal_amount
order.deal_amount = max(min(vol_limit_min, orig_deal_amount), 0)
if vol_limit_min < orig_deal_amount:
self.logger.debug(
f"Order clipped due to volume limitation: {order}, {[(vol, rule) for vol, rule in zip(vol_limit_num, vol_limit)]}"
)
def _get_buy_amount_by_cash_limit(self, trade_price, cash, cost_ratio):
"""return the real order amount after cash limit for buying.
Parameters
----------
trade_price : float
cash : float
cost_ratio : float
Returns
-------
float
the real order amount after cash limit for buying.
"""
max_trade_amount = 0
if cash >= self.min_cost:
# critical_price means the stock transaction price when the service fee is equal to min_cost.
critical_price = self.min_cost / cost_ratio + self.min_cost
if cash >= critical_price:
# the service fee is equal to cost_ratio * trade_amount
max_trade_amount = cash / (1 + cost_ratio) / trade_price
else:
# the service fee is equal to min_cost
max_trade_amount = (cash - self.min_cost) / trade_price
return max_trade_amount
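The two branches of `_get_buy_amount_by_cash_limit` can be checked with a small standalone sketch. Here `min_cost` is passed as an argument instead of being read from the exchange, and the numbers are illustrative assumptions.

```python
import math


def max_buy_amount(trade_price, cash, cost_ratio, min_cost):
    """Hypothetical standalone version: largest amount affordable including fees."""
    max_trade_amount = 0
    if cash >= min_cost:
        # cash level at which the proportional fee equals min_cost
        critical_price = min_cost / cost_ratio + min_cost
        if cash >= critical_price:
            # fee is proportional: amount * price * (1 + cost_ratio) == cash
            max_trade_amount = cash / (1 + cost_ratio) / trade_price
        else:
            # fee is the flat minimum: amount * price + min_cost == cash
            max_trade_amount = (cash - min_cost) / trade_price
    return max_trade_amount


large = max_buy_amount(trade_price=10.0, cash=100000.0, cost_ratio=0.0005, min_cost=5.0)
small = max_buy_amount(trade_price=10.0, cash=1000.0, cost_ratio=0.0005, min_cost=5.0)
none = max_buy_amount(trade_price=10.0, cash=3.0, cost_ratio=0.0005, min_cost=5.0)
print(large, small, none)
# in the proportional branch, value plus fee uses up the cash
print(math.isclose(large * 10.0 * 1.0005, 100000.0))
```

With less cash than `min_cost`, nothing can be bought because the fee alone cannot be covered.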
def _calc_trade_info_by_order(self, order, position: Position, dealt_order_amount):
"""
Calculation of trade info
**NOTE**: Order will be changed in this function
:param order:
:param position: Position
:param dealt_order_amount: the dealt order amount dict with the format of {stock_id: float}
:return: trade_price, trade_val, trade_cost
"""
trade_price = self.get_deal_price(order.stock_id, order.start_time, order.end_time, direction=order.direction)
total_trade_val = self.get_volume(order.stock_id, order.start_time, order.end_time) * trade_price
order.factor = self.get_factor(order.stock_id, order.start_time, order.end_time)
order.deal_amount = order.amount # set to full amount and clip it step by step
# Clipping amount first
# - It simulates that the order is rejected directly by the exchange due to large order
# Another choice is placing it after rounding the order
# - It simulates that the large order is submitted, but partial is dealt regardless of rounding by trading unit.
self._clip_amount_by_volume(order, dealt_order_amount)
# TODO: the adjusted cost ratio can be overestimated as deal_amount will be clipped in the next steps
trade_val = order.deal_amount * trade_price
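if not total_trade_val or np.isnan(total_trade_val):
# no market volume is available; fall back to the full impact cost to avoid dividing by zero
adj_cost_ratio = self.impact_cost
else:
adj_cost_ratio = self.impact_cost * (trade_val / total_trade_val) ** 2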
if order.direction == Order.SELL:
cost_ratio = self.close_cost + adj_cost_ratio
# sell
# if we don't know current position, we choose to sell all
# Otherwise, we clip the amount based on current position
if position is not None:
current_amount = (
position.get_stock_amount(order.stock_id) if position.check_stock(order.stock_id) else 0
)
if not np.isclose(order.deal_amount, current_amount):
# when not selling last stock. rounding is necessary
order.deal_amount = self.round_amount_by_trade_unit(
min(current_amount, order.deal_amount), order.factor
)
# in case of negative value of cash
if position.get_cash() + order.deal_amount * trade_price < max(
order.deal_amount * trade_price * cost_ratio,
self.min_cost,
):
order.deal_amount = 0
self.logger.debug(f"Order clipped due to cash limitation: {order}")
elif order.direction == Order.BUY:
cost_ratio = self.open_cost + adj_cost_ratio
# buy
if position is not None:
cash = position.get_cash()
trade_val = order.deal_amount * trade_price
if cash < max(trade_val * cost_ratio, self.min_cost):
# cash cannot cover cost
order.deal_amount = 0
self.logger.debug(f"Order clipped due to cost higher than cash: {order}")
elif cash < trade_val + max(trade_val * cost_ratio, self.min_cost):
# The money is not enough
max_buy_amount = self._get_buy_amount_by_cash_limit(trade_price, cash, cost_ratio)
order.deal_amount = self.round_amount_by_trade_unit(
min(max_buy_amount, order.deal_amount), order.factor
)
self.logger.debug(f"Order clipped due to cash limitation: {order}")
else:
# The money is enough
order.deal_amount = self.round_amount_by_trade_unit(order.deal_amount, order.factor)
else:
# Unknown amount of money. Just round the amount
order.deal_amount = self.round_amount_by_trade_unit(order.deal_amount, order.factor)
else:
raise NotImplementedError("order direction {} error".format(order.direction))
trade_val = order.deal_amount * trade_price
trade_cost = max(trade_val * cost_ratio, self.min_cost)
if trade_val <= 1e-5:
# if dealing is not successful, the trade_cost should be zero.
trade_cost = 0
return trade_price, trade_val, trade_cost
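The cost model in `_calc_trade_info_by_order` adds a quadratic market-impact term to the base cost ratio and then applies the minimum fee. A standalone sketch under stated assumptions: `base_cost` stands for `open_cost` or `close_cost`, `market_volume` is assumed to be positive, and all numbers are illustrative.

```python
import math


def estimate_trade_cost(deal_amount, trade_price, market_volume, base_cost, impact_cost, min_cost):
    """Hypothetical standalone version of the cost calculation."""
    trade_val = deal_amount * trade_price
    total_trade_val = market_volume * trade_price
    # the impact grows with the square of the order's share of the traded value
    adj_cost_ratio = impact_cost * (trade_val / total_trade_val) ** 2
    cost_ratio = base_cost + adj_cost_ratio
    if trade_val <= 1e-5:
        # nothing was dealt, so no fee is charged
        return 0.0
    return max(trade_val * cost_ratio, min_cost)


# 1% of the market volume: adj_cost_ratio = 0.1 * 0.01 ** 2 = 1e-5
big = estimate_trade_cost(1000, 10.0, 100000, base_cost=0.0015, impact_cost=0.1, min_cost=5.0)
tiny = estimate_trade_cost(10, 10.0, 100000, base_cost=0.0015, impact_cost=0.1, min_cost=5.0)
print(big, tiny)
```

For the small order the proportional fee is below `min_cost`, so the flat minimum applies.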
def get_order_helper(self) -> OrderHelper:
if not hasattr(self, "_order_helper"):
# cache to avoid recreate the same instance
self._order_helper = OrderHelper(self)
return self._order_helper

qlib/backtest/executor.py Normal file

@@ -0,0 +1,541 @@
from abc import abstractclassmethod, abstractmethod
import copy
from qlib.backtest.position import BasePosition
from qlib.log import get_module_logger
from types import GeneratorType
from qlib.backtest.account import Account
import warnings
import pandas as pd
from typing import List, Tuple, Union
from collections import defaultdict
from qlib.backtest.report import Indicator
from .decision import EmptyTradeDecision, Order, BaseTradeDecision
from .exchange import Exchange
from .utils import TradeCalendarManager, CommonInfrastructure, LevelInfrastructure, get_start_end_idx
from ..utils import init_instance_by_config
from ..utils.time import Freq
from ..strategy.base import BaseStrategy
class BaseExecutor:
"""Base executor for trading"""
def __init__(
self,
time_per_step: str,
start_time: Union[str, pd.Timestamp] = None,
end_time: Union[str, pd.Timestamp] = None,
indicator_config: dict = {},
generate_portfolio_metrics: bool = False,
verbose: bool = False,
track_data: bool = False,
trade_exchange: Exchange = None,
common_infra: CommonInfrastructure = None,
settle_type=BasePosition.ST_NO,
**kwargs,
):
"""
Parameters
----------
time_per_step : str
trade time per trading step, used to generate the trade calendar
indicator_config: dict, optional
config for calculating trade indicator, including the following fields:
- 'show_indicator': whether to show indicators, optional, defaults to False. The indicators include
- 'pa', the price advantage
- 'pos', the positive rate
- 'ffr', the fulfill rate
- 'pa_config': config for calculating price advantage(pa), optional
- 'base_price': the base price against which the trading price is compared, optional, defaults to 'twap'
- If 'base_price' is 'twap', the base price is the time weighted average price
- If 'base_price' is 'vwap', the base price is the volume weighted average price
- 'weight_method': the weighting method used to aggregate the pa of different orders in each step, optional, defaults to 'mean'
- If 'weight_method' is 'mean', calculate the mean of different orders' pa
- If 'weight_method' is 'amount_weighted', calculate the amount weighted average of different orders' pa
- If 'weight_method' is 'value_weighted', calculate the value weighted average of different orders' pa
- 'ffr_config': config for calculating fulfill rate(ffr), optional
- 'weight_method': the weighting method used to aggregate the ffr of different orders in each step, optional, defaults to 'mean'
- If 'weight_method' is 'mean', calculate the mean of different orders' ffr
- If 'weight_method' is 'amount_weighted', calculate the amount weighted average of different orders' ffr
- If 'weight_method' is 'value_weighted', calculate the value weighted average of different orders' ffr
Example:
{
'show_indicator': True,
'pa_config': {
"agg": "twap", # "vwap"
"price": "$close", # default to use deal price of the exchange
},
'ffr_config':{
'weight_method': 'value_weighted',
}
}
generate_portfolio_metrics : bool, optional
whether to generate portfolio_metrics, by default False
verbose : bool, optional
whether to print trading info, by default False
track_data : bool, optional
whether to generate trade_decision, will be used when training rl agent
- If `self.track_data` is true, when making data for training, the input `trade_decision` of `execute` will be generated by `collect_data`
- Else, `trade_decision` will not be generated
trade_exchange : Exchange
exchange that provides market info, used to generate portfolio_metrics
- If generate_portfolio_metrics is None, trade_exchange will be ignored
- Else If `trade_exchange` is None, self.trade_exchange will be set with common_infra
common_infra : CommonInfrastructure, optional
common infrastructure for backtesting, may include:
- trade_account : Account, optional
trade account for trading
- trade_exchange : Exchange, optional
exchange that provides market info
settle_type : str
Please refer to the docs of BasePosition.settle_start
"""
self.time_per_step = time_per_step
self.indicator_config = indicator_config
self.generate_portfolio_metrics = generate_portfolio_metrics
self.verbose = verbose
self.track_data = track_data
self._trade_exchange = trade_exchange
self.level_infra = LevelInfrastructure()
self.level_infra.reset_infra(common_infra=common_infra)
self._settle_type = settle_type
self.reset(start_time=start_time, end_time=end_time, common_infra=common_infra)
if common_infra is None:
get_module_logger("BaseExecutor").warning(f"`common_infra` is not set for {self}")
# record deal order amount in one day
self.dealt_order_amount = defaultdict(float)
self.deal_day = None
def reset_common_infra(self, common_infra):
"""
reset infrastructure for trading
- reset trade_account
"""
if not hasattr(self, "common_infra"):
self.common_infra = common_infra
else:
self.common_infra.update(common_infra)
if common_infra.has("trade_account"):
# NOTE: there is a trick in the code.
# copy is used instead of deepcopy. So positions are shared
self.trade_account: Account = copy.copy(common_infra.get("trade_account"))
self.trade_account.reset(freq=self.time_per_step, port_metr_enabled=self.generate_portfolio_metrics)
@property
def trade_exchange(self) -> Exchange:
"""get trade exchange in a prioritized order"""
return getattr(self, "_trade_exchange", None) or self.common_infra.get("trade_exchange")
@property
def trade_calendar(self) -> TradeCalendarManager:
"""
Though the trade calendar can be accessed from multiple sources, managing it in a centralized way will make the
code easier to maintain
"""
return self.level_infra.get("trade_calendar")
def reset(self, common_infra: CommonInfrastructure = None, **kwargs):
"""
- reset `start_time` and `end_time`, used in trade calendar
- reset `common_infra`, used to reset `trade_account`, `trade_exchange`, etc.
"""
if "start_time" in kwargs or "end_time" in kwargs:
start_time = kwargs.get("start_time")
end_time = kwargs.get("end_time")
self.level_infra.reset_cal(freq=self.time_per_step, start_time=start_time, end_time=end_time)
if common_infra is not None:
self.reset_common_infra(common_infra)
def get_level_infra(self):
return self.level_infra
def finished(self):
return self.trade_calendar.finished()
def execute(self, trade_decision: BaseTradeDecision, level: int = 0):
"""execute the trade decision and return the executed result
NOTE: this function is never used directly in the framework. Should we delete it?
Parameters
----------
trade_decision : BaseTradeDecision
level : int
the level of current executor
Returns
----------
execute_result : List[object]
the executed result for trade decision
"""
return_value = {}
for _decision in self.collect_data(trade_decision, return_value=return_value, level=level):
pass
return return_value.get("execute_result")
@abstractmethod
def _collect_data(self, trade_decision: BaseTradeDecision, level: int = 0) -> Tuple[List[object], dict]:
"""
Please refer to the doc of collect_data
The only difference between `_collect_data` and `collect_data` is that some common steps are moved into
collect_data
Parameters
----------
Please refer to the doc of collect_data
Returns
-------
Tuple[List[object], dict]:
(<the executed result for trade decision>, <the extra kwargs for `self.trade_account.update_bar_end`>)
"""
def collect_data(
self, trade_decision: BaseTradeDecision, return_value: dict = None, level: int = 0
) -> List[object]:
"""Generator for collecting the trade decision data for rl training
This function will make a step forward
Parameters
----------
trade_decision : BaseTradeDecision
level : int
the level of current executor. 0 indicates the top level
return_value : dict
the dict used to pass the result back to the caller
e.g. {"execute_result": <the executed result>}
Returns
----------
execute_result : List[object]
the executed result for trade decision.
** NOTE!!!! **:
1) This is necessary. The return value of the generator will be used in NestedExecutor
2) Please note the executed results are not merged.
Yields
-------
object
trade decision
"""
if self.track_data:
yield trade_decision
atomic = not issubclass(self.__class__, NestedExecutor) # issubclass(A, A) is True
if atomic and trade_decision.get_range_limit(default_value=None) is not None:
raise ValueError("atomic executor doesn't support specifying `range_limit`")
if self._settle_type != BasePosition.ST_NO:
self.trade_account.current_position.settle_start(self._settle_type)
obj = self._collect_data(trade_decision=trade_decision, level=level)
if isinstance(obj, GeneratorType):
res, kwargs = yield from obj
else:
# Some concrete executors don't have inner decisions
res, kwargs = obj
trade_start_time, trade_end_time = self.trade_calendar.get_step_time()
# Account will not be changed in this function
self.trade_account.update_bar_end(
trade_start_time,
trade_end_time,
self.trade_exchange,
atomic=atomic,
outer_trade_decision=trade_decision,
indicator_config=self.indicator_config,
**kwargs,
)
self.trade_calendar.step()
if self._settle_type != BasePosition.ST_NO:
self.trade_account.current_position.settle_commit()
if return_value is not None:
return_value.update({"execute_result": res})
return res
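`collect_data` is a generator that both yields decisions and returns a result, and `execute` recovers that result through the `return_value` dict. The pattern can be sketched in isolation; the function names and string "decisions" below are assumptions for illustration and are not the executor API.

```python
def collect(decision, return_value=None, track_data=True):
    """Hypothetical generator: yields the decision, then returns the result."""
    if track_data:
        yield decision
    res = [f"executed:{decision}"]
    if return_value is not None:
        # a plain for-loop discards the generator's return value,
        # so the result is also written into a caller-provided dict
        return_value.update({"execute_result": res})
    return res


def execute(decision):
    """Drain the generator and read the result back from the dict."""
    return_value = {}
    for _decision in collect(decision, return_value=return_value):
        pass
    return return_value.get("execute_result")


def outer(decision):
    """A nesting generator can capture the return value with `yield from`."""
    res = yield from collect(decision)
    return [r.upper() for r in res]


print(execute("buy"))      # ['executed:buy']
print(list(outer("buy")))  # ['buy'] (the decisions yielded along the way)
```

The dict is only needed by callers that iterate with a plain loop; a nesting generator gets the same result directly from `yield from`.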
def get_all_executors(self):
"""get all executors"""
return [self]
class NestedExecutor(BaseExecutor):
"""
Nested Executor with inner strategy and executor
- Each time `execute` is called, it will call the inner strategy and executor to execute the `trade_decision` in a higher frequency env.
"""
def __init__(
self,
time_per_step: str,
inner_executor: Union[BaseExecutor, dict],
inner_strategy: Union[BaseStrategy, dict],
start_time: Union[str, pd.Timestamp] = None,
end_time: Union[str, pd.Timestamp] = None,
indicator_config: dict = {},
generate_portfolio_metrics: bool = False,
verbose: bool = False,
track_data: bool = False,
skip_empty_decision: bool = True,
align_range_limit: bool = True,
common_infra: CommonInfrastructure = None,
**kwargs,
):
"""
Parameters
----------
inner_executor : BaseExecutor
trading env in each trading bar.
inner_strategy : BaseStrategy
trading strategy in each trading bar
skip_empty_decision: bool
Whether the executor skips calling the inner loop when the decision is empty.
It should be False in the following cases
- The decisions may be updated by steps
- The inner executor may not follow the decisions from the outer strategy
align_range_limit: bool
force to align the trade_range decision
It is only for nested executor, because range_limit is given by outer strategy
"""
self.inner_executor: BaseExecutor = init_instance_by_config(
inner_executor, common_infra=common_infra, accept_types=BaseExecutor
)
self.inner_strategy: BaseStrategy = init_instance_by_config(
inner_strategy, common_infra=common_infra, accept_types=BaseStrategy
)
self._skip_empty_decision = skip_empty_decision
self._align_range_limit = align_range_limit
super(NestedExecutor, self).__init__(
time_per_step=time_per_step,
start_time=start_time,
end_time=end_time,
indicator_config=indicator_config,
generate_portfolio_metrics=generate_portfolio_metrics,
verbose=verbose,
track_data=track_data,
common_infra=common_infra,
**kwargs,
)
def reset_common_infra(self, common_infra):
"""
reset infrastructure for trading
- reset inner_strategy and inner_executor common infra
"""
super(NestedExecutor, self).reset_common_infra(common_infra)
self.inner_executor.reset_common_infra(common_infra)
self.inner_strategy.reset_common_infra(common_infra)
def _init_sub_trading(self, trade_decision):
trade_start_time, trade_end_time = self.trade_calendar.get_step_time()
self.inner_executor.reset(start_time=trade_start_time, end_time=trade_end_time)
sub_level_infra = self.inner_executor.get_level_infra()
self.level_infra.set_sub_level_infra(sub_level_infra)
self.inner_strategy.reset(level_infra=sub_level_infra, outer_trade_decision=trade_decision)
def _update_trade_decision(self, trade_decision: BaseTradeDecision) -> BaseTradeDecision:
# the outer strategy has a chance to update the decision at each iteration
updated_trade_decision = trade_decision.update(self.inner_executor.trade_calendar)
if updated_trade_decision is not None:
trade_decision = updated_trade_decision
# NEW UPDATE
# create a hook for the inner strategy to update the outer decision
self.inner_strategy.alter_outer_trade_decision(trade_decision)
return trade_decision
def _collect_data(self, trade_decision: BaseTradeDecision, level: int = 0):
execute_result = []
inner_order_indicators = []
decision_list = []
# NOTE:
# - this is necessary for calculating the steps in the sub level
# - more detailed information will be set into trade decision
self._init_sub_trading(trade_decision)
_inner_execute_result = None
while not self.inner_executor.finished():
trade_decision = self._update_trade_decision(trade_decision)
if trade_decision.empty() and self._skip_empty_decision:
# give one chance for outer strategy to update the strategy
# - For updating some information in the sub executor(the strategy have no knowledge of the inner
# executor when generating the decision)
break
sub_cal: TradeCalendarManager = self.inner_executor.trade_calendar
# NOTE: make sure get_start_end_idx is after `self._update_trade_decision`
start_idx, end_idx = get_start_end_idx(sub_cal, trade_decision)
if not self._align_range_limit or start_idx <= sub_cal.get_trade_step() <= end_idx:
# if force align the range limit, skip the steps outside the decision range limit
_inner_trade_decision: BaseTradeDecision = self.inner_strategy.generate_trade_decision(
_inner_execute_result
)
trade_decision.mod_inner_decision(_inner_trade_decision) # propagate part of decision information
# NOTE sub_cal.get_step_time() must be called before collect_data in case of step shifting
decision_list.append((_inner_trade_decision, *sub_cal.get_step_time()))
# NOTE: Trade Calendar will step forward in the following line
_inner_execute_result = yield from self.inner_executor.collect_data(
trade_decision=_inner_trade_decision, level=level + 1
)
execute_result.extend(_inner_execute_result)
inner_order_indicators.append(
self.inner_executor.trade_account.get_trade_indicator().get_order_indicator(raw=True)
)
else:
# do nothing and just step forward
sub_cal.step()
return execute_result, {"inner_order_indicators": inner_order_indicators, "decision_list": decision_list}
def get_all_executors(self):
"""get all executors, including self and inner_executor.get_all_executors()"""
return [self, *self.inner_executor.get_all_executors()]
class SimulatorExecutor(BaseExecutor):
"""Executor that simulate the true market"""
# TODO: TT_SERIAL & TT_PARAL will be replaced by feature fix_pos now.
# Please remove them in the future.
# available trade_types
TT_SERIAL = "serial"
## The orders will be executed serially in a sequence
# In each trading step, it is possible that users sell instruments first and use the money to buy new instruments
TT_PARAL = "parallel"
## The orders will be executed in parallel
# In each trading step, if users try to sell instruments first and buy new instruments with money, failure will
# occur
def __init__(
self,
time_per_step: str,
start_time: Union[str, pd.Timestamp] = None,
end_time: Union[str, pd.Timestamp] = None,
indicator_config: dict = {},
generate_portfolio_metrics: bool = False,
verbose: bool = False,
track_data: bool = False,
common_infra: CommonInfrastructure = None,
trade_type: str = TT_SERIAL,
**kwargs,
):
"""
Parameters
----------
trade_type: str
please refer to the doc of `TT_SERIAL` & `TT_PARAL`
"""
super(SimulatorExecutor, self).__init__(
time_per_step=time_per_step,
start_time=start_time,
end_time=end_time,
indicator_config=indicator_config,
generate_portfolio_metrics=generate_portfolio_metrics,
verbose=verbose,
track_data=track_data,
common_infra=common_infra,
**kwargs,
)
self.trade_type = trade_type
def _get_order_iterator(self, trade_decision: BaseTradeDecision) -> List[Order]:
"""
Parameters
----------
trade_decision : BaseTradeDecision
the trade decision given by the strategy
Returns
-------
List[Order]:
get a list orders according to `self.trade_type`
"""
orders = trade_decision.get_decision()
if self.trade_type == self.TT_SERIAL:
# Orders will be traded in a serial way, in the order given by the decision
order_it = orders
elif self.trade_type == self.TT_PARAL:
# NOTE: !!!!!!!
# Assumption: there will not be orders in different trading direction in a single step of a strategy !!!!
# The parallel trading failure will be caused only by a conflict over money
# Therefore, making the buying go first will make sure the conflict happens.
# It is equivalent to parallel trading after sorting the orders by direction
order_it = sorted(orders, key=lambda order: -order.direction)
else:
raise NotImplementedError(f"This type of input is not supported")
return order_it
def _update_dealt_order_amount(self, order):
"""update date and dealt order amount in the day."""
now_deal_day = self.trade_calendar.get_step_time()[0].floor(freq="D")
if self.deal_day is None or now_deal_day > self.deal_day:
self.dealt_order_amount = defaultdict(float)
self.deal_day = now_deal_day
self.dealt_order_amount[order.stock_id] += order.deal_amount
def _collect_data(self, trade_decision: BaseTradeDecision, level: int = 0):
trade_start_time, _ = self.trade_calendar.get_step_time()
execute_result = []
for order in self._get_order_iterator(trade_decision):
# execute the order.
# NOTE: The trade_account will be changed in this function
trade_val, trade_cost, trade_price = self.trade_exchange.deal_order(
order,
trade_account=self.trade_account,
dealt_order_amount=self.dealt_order_amount,
)
execute_result.append((order, trade_val, trade_cost, trade_price))
self._update_dealt_order_amount(order)
if self.verbose:
print(
"[I {:%Y-%m-%d %H:%M:%S}]: {} {}, price {:.2f}, amount {}, deal_amount {}, factor {}, value {:.2f}, cash {:.2f}.".format(
trade_start_time,
"sell" if order.direction == Order.SELL else "buy",
order.stock_id,
trade_price,
order.amount,
order.deal_amount,
order.factor,
trade_val,
self.trade_account.get_cash(),
)
)
return execute_result, {"trade_info": execute_result}
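The `collect_data` / `_collect_data` pair above relies on a generator that both yields intermediate values (the trade decisions) and returns a final value (the execution result); `yield from` forwards the inner yields and evaluates to the inner return value. The following is a minimal sketch of that pattern only, not qlib code: `inner_collect`, `outer_collect`, and `run` are hypothetical names, and plain strings stand in for decisions and results.

```python
from typing import Generator, List, Tuple

Step = Tuple[int, str]  # (level, decision); a toy stand-in for a trade decision


def inner_collect(decision: str, level: int) -> Generator[Step, None, List[str]]:
    """Hypothetical atomic executor: yield the decision, then return the result."""
    yield (level, decision)
    return [f"executed:{decision}"]


def outer_collect(decision: str, steps: int, level: int = 0) -> Generator[Step, None, List[str]]:
    """Hypothetical nested executor that splits one decision into sub-steps."""
    yield (level, decision)
    results: List[str] = []
    for i in range(steps):
        # `yield from` re-yields everything the inner generator yields and
        # evaluates to the inner generator's return value.
        inner_result = yield from inner_collect(f"{decision}/{i}", level + 1)
        results.extend(inner_result)
    return results


def run(gen):
    """Drive a generator to completion; collect its yields and its return value."""
    yielded = []
    while True:
        try:
            yielded.append(next(gen))
        except StopIteration as stop:
            # The generator's `return` value travels on StopIteration.value.
            return yielded, stop.value


yielded, result = run(outer_collect("buy", steps=2))
print(yielded)
print(result)
```

A caller that only iterates with a `for` loop would see the yielded decisions but lose the returned result, which is why the docstring above stresses that the generator's return value is required by the nested executor.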


@@ -0,0 +1,634 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from functools import lru_cache
import logging
from typing import List, Text, Union, Callable, Iterable, Dict
from collections import OrderedDict
import inspect
import pandas as pd
import numpy as np
from ..utils.index_data import IndexData, SingleData
from ..utils.resam import resam_ts_data, ts_data_last
from ..log import get_module_logger
from ..utils.time import is_single_value, Freq
import qlib.utils.index_data as idd
class BaseQuote:
def __init__(self, quote_df: pd.DataFrame, freq):
self.logger = get_module_logger("online operator", level=logging.INFO)
def get_all_stock(self) -> Iterable:
"""return all stock codes
Return
------
Iterable
all stock codes
"""
raise NotImplementedError(f"Please implement the `get_all_stock` method")
def get_data(
self,
stock_id: str,
start_time: Union[pd.Timestamp, str],
end_time: Union[pd.Timestamp, str],
field: Union[str],
method: Union[str, None] = None,
) -> Union[None, int, float, bool, IndexData]:
"""get the specific field of stock data during start time and end_time,
and apply method to the data.
Example:
.. code-block::
$close $volume
instrument datetime
SH600000 2010-01-04 86.778313 16162960.0
2010-01-05 87.433578 28117442.0
2010-01-06 85.713585 23632884.0
2010-01-07 83.788803 20813402.0
2010-01-08 84.730675 16044853.0
SH600655 2010-01-04 2699.567383 158193.328125
2010-01-08 2612.359619 77501.406250
2010-01-11 2712.982422 160852.390625
2010-01-12 2788.688232 164587.937500
2010-01-13 2790.604004 145460.453125
this function is used for two cases:
1. method is not None. It returns int/float/bool/None.
- It will return None in one case: the method returns None
print(get_data(stock_id="SH600000", start_time="2010-01-04", end_time="2010-01-06", field="$close", method="last"))
85.713585
2. method is None. It returns IndexData.
print(get_data(stock_id="SH600000", start_time="2010-01-04", end_time="2010-01-06", field="$close", method=None))
IndexData([86.778313, 87.433578, 85.713585], [2010-01-04, 2010-01-05, 2010-01-06])
Parameters
----------
stock_id: str
start_time : Union[pd.Timestamp, str]
closed start time for backtest
end_time : Union[pd.Timestamp, str]
closed end time for backtest
field : str
the columns of data to fetch
method : Union[str, None]
the method applied to the data.
e.g. [None, "last", "all", "sum", "mean", "ts_data_last"]
Return
----------
Union[None, int, float, bool, IndexData]
it will return None in following cases
- There is no stock data which meet the query criterion from data source.
- The `method` returns None
"""
raise NotImplementedError(f"Please implement the `get_data` method")
class PandasQuote(BaseQuote):
def __init__(self, quote_df: pd.DataFrame, freq):
super().__init__(quote_df=quote_df, freq=freq)
quote_dict = {}
for stock_id, stock_val in quote_df.groupby(level="instrument"):
quote_dict[stock_id] = stock_val.droplevel(level="instrument")
self.data = quote_dict
def get_all_stock(self):
return self.data.keys()
def get_data(self, stock_id, start_time, end_time, field, method=None):
if method == "ts_data_last":
method = ts_data_last
stock_data = resam_ts_data(self.data[stock_id][field], start_time, end_time, method=method)
if stock_data is None:
return None
elif isinstance(stock_data, (bool, np.bool_, int, float, np.number)):
return stock_data
elif isinstance(stock_data, pd.Series):
return idd.SingleData(stock_data)
else:
raise ValueError(f"stock data from resam_ts_data must be a number, pd.Series or pd.DataFrame")
class NumpyQuote(BaseQuote):
def __init__(self, quote_df: pd.DataFrame, freq, region="cn"):
"""NumpyQuote
Parameters
----------
quote_df : pd.DataFrame
the init dataframe from qlib.
self.data : Dict(stock_id, IndexData.DataFrame)
"""
super().__init__(quote_df=quote_df, freq=freq)
quote_dict = {}
for stock_id, stock_val in quote_df.groupby(level="instrument"):
quote_dict[stock_id] = idd.MultiData(stock_val.droplevel(level="instrument"))
quote_dict[stock_id].sort_index() # To support more flexible slicing, we must sort data first
self.data = quote_dict
n, unit = Freq.parse(freq)
if unit in Freq.SUPPORT_CAL_LIST:
self.freq = Freq.get_timedelta(1, unit)
else:
raise ValueError(f"{freq} is not supported in NumpyQuote")
self.region = region
def get_all_stock(self):
return self.data.keys()
@lru_cache(maxsize=512)
def get_data(self, stock_id, start_time, end_time, field, method=None):
# check stock id
if stock_id not in self.get_all_stock():
return None
# single data
# If the single-value case is not handled separately, the query will consume a lot of time.
if is_single_value(start_time, end_time, self.freq, self.region):
# this is a very special case.
# skip aggregating function to speed-up the query calculation
# FIXME:
# it will go to the else logic when it comes to the
# 1) the day before holiday when daily trading
# 2) the last minute of the day when intraday trading
try:
return self.data[stock_id].loc[start_time, field]
except KeyError:
return None
else:
data = self.data[stock_id].loc[start_time:end_time, field]
if data.empty:
return None
if method is not None:
data = self._agg_data(data, method)
return data
def _agg_data(self, data: IndexData, method):
"""Agg data by specific method."""
# FIXME: why not call the method of data directly?
if method == "sum":
return np.nansum(data)
elif method == "mean":
return np.nanmean(data)
elif method == "last":
# FIXME: I've never seen that this method was called.
# Please merge it with "ts_data_last"
return data[-1]
elif method == "all":
return data.all()
elif method == "ts_data_last":
valid_data = data.loc[~data.isna().data.astype(bool)]
if len(valid_data) == 0:
return None
else:
return valid_data.iloc[-1]
else:
raise ValueError(f"{method} is not supported")
class BaseSingleMetric:
"""
The data structure of the single metric.
The following methods are used for computing metrics in one indicator.
"""
def __init__(self, metric: Union[dict, pd.Series]):
"""Single data structure for each metric.
Parameters
----------
metric : Union[dict, pd.Series]
keys/index is stock_id, value is the metric value.
for example:
SH600068 NaN
SH600079 1.0
SH600266 NaN
...
SZ300692 NaN
SZ300719 NaN,
"""
raise NotImplementedError(f"Please implement the `__init__` method")
def __add__(self, other: Union["BaseSingleMetric", int, float]) -> "BaseSingleMetric":
raise NotImplementedError(f"Please implement the `__add__` method")
def __radd__(self, other: Union["BaseSingleMetric", int, float]) -> "BaseSingleMetric":
return self + other
def __sub__(self, other: Union["BaseSingleMetric", int, float]) -> "BaseSingleMetric":
raise NotImplementedError(f"Please implement the `__sub__` method")
def __rsub__(self, other: Union["BaseSingleMetric", int, float]) -> "BaseSingleMetric":
raise NotImplementedError(f"Please implement the `__rsub__` method")
def __mul__(self, other: Union["BaseSingleMetric", int, float]) -> "BaseSingleMetric":
raise NotImplementedError(f"Please implement the `__mul__` method")
def __truediv__(self, other: Union["BaseSingleMetric", int, float]) -> "BaseSingleMetric":
raise NotImplementedError(f"Please implement the `__truediv__` method")
def __eq__(self, other: Union["BaseSingleMetric", int, float]) -> "BaseSingleMetric":
raise NotImplementedError(f"Please implement the `__eq__` method")
def __gt__(self, other: Union["BaseSingleMetric", int, float]) -> "BaseSingleMetric":
raise NotImplementedError(f"Please implement the `__gt__` method")
def __lt__(self, other: Union["BaseSingleMetric", int, float]) -> "BaseSingleMetric":
raise NotImplementedError(f"Please implement the `__lt__` method")
def __len__(self) -> int:
raise NotImplementedError(f"Please implement the `__len__` method")
def sum(self) -> float:
raise NotImplementedError(f"Please implement the `sum` method")
def mean(self) -> float:
raise NotImplementedError(f"Please implement the `mean` method")
def count(self) -> int:
"""Return the count of the single metric, NaN is not included."""
raise NotImplementedError(f"Please implement the `count` method")
def abs(self) -> "BaseSingleMetric":
raise NotImplementedError(f"Please implement the `abs` method")
@property
def empty(self) -> bool:
"""If metric is empty, return True."""
raise NotImplementedError(f"Please implement the `empty` method")
def add(self, other: "BaseSingleMetric", fill_value: float = None) -> "BaseSingleMetric":
"""Replace np.NaN with fill_value in two metrics and add them."""
raise NotImplementedError(f"Please implement the `add` method")
def replace(self, replace_dict: dict) -> "BaseSingleMetric":
"""Replace the value of metric according to replace_dict."""
raise NotImplementedError(f"Please implement the `replace` method")
def apply(self, func: Callable) -> "BaseSingleMetric":
"""Replace the value of metric with func(metric).
Currently, the func is only qlib/backtest/order/Order.parse_dir.
"""
raise NotImplementedError(f"Please implement the 'apply' method")
class BaseOrderIndicator:
"""
The data structure of order indicator.
!!!NOTE: There are two ways to organize the data structure. Please choose a better way.
1. One way is using BaseSingleMetric to represent each metric. For example, the data
structure of PandasOrderIndicator is Dict[str, PandasSingleMetric]. It uses
PandasSingleMetric based on pd.Series to represent each metric.
2. The other way doesn't use BaseSingleMetric to represent each metric. The data
structure of PandasOrderIndicator is a whole matrix. It means you do not need
to inherit the BaseSingleMetric.
"""
def __init__(self, data):
self.data = data
self.logger = get_module_logger("online operator")
def assign(self, col: str, metric: Union[dict, pd.Series]):
"""assign one metric.
Parameters
----------
col : str
the metric name of one metric.
metric : Union[dict, pd.Series]
one metric with stock_id index, such as deal_amount, ffr, etc.
for example:
SH600068 NaN
SH600079 1.0
SH600266 NaN
...
SZ300692 NaN
SZ300719 NaN,
"""
raise NotImplementedError(f"Please implement the 'assign' method")
def transfer(self, func: Callable, new_col: str = None) -> Union[None, BaseSingleMetric]:
"""compute new metric with existing metrics.
Parameters
----------
func : Callable
the func of computing new metric.
the kwargs of func will be replaced with metric data by name in this function.
e.g.
def func(pa):
return (pa > 0).sum() / pa.count()
new_col : str, optional
New metric will be assigned in the data if new_col is not None, by default None.
Return
----------
BaseSingleMetric
new metric.
"""
func_sig = inspect.signature(func).parameters.keys()
func_kwargs = {sig: self.data[sig] for sig in func_sig}
tmp_metric = func(**func_kwargs)
if new_col is not None:
self.data[new_col] = tmp_metric
else:
return tmp_metric
def get_metric_series(self, metric: str) -> pd.Series:
"""return the single metric with pd.Series format.
Parameters
----------
metric : str
the metric name.
Return
----------
pd.Series
the single metric.
If there is no metric name in the data, return pd.Series().
"""
raise NotImplementedError(f"Please implement the 'get_metric_series' method")
def get_index_data(self, metric) -> SingleData:
"""get one metric with the format of SingleData
Parameters
----------
metric : str
the metric name.
Return
------
IndexData.Series
one metric with the format of SingleData
"""
raise NotImplementedError(f"Please implement the 'get_index_data' method")
@staticmethod
def sum_all_indicators(order_indicator, indicators: list, metrics: Union[str, List[str]], fill_value: float = None):
"""sum indicators with the same metrics.
and assign to the order_indicator(BaseOrderIndicator).
NOTE: indicators could be an empty list when orders in lower level all fail.
Parameters
----------
order_indicator : BaseOrderIndicator
the order indicator to assign.
indicators : List[BaseOrderIndicator]
the list of all inner indicators.
metrics : Union[str, List[str]]
all metrics that need to be summed.
fill_value : float, optional
fill np.NaN with value. By default None.
"""
raise NotImplementedError(f"Please implement the 'sum_all_indicators' method")
def to_series(self) -> Dict[Text, pd.Series]:
"""return the metrics as pandas series
for example: { "ffr":
SH600068 NaN
SH600079 1.0
SH600266 NaN
...
SZ300692 NaN
SZ300719 NaN,
...
}
"""
raise NotImplementedError(f"Please implement the `to_series` method")
class SingleMetric(BaseSingleMetric):
def __init__(self, metric):
self.metric = metric
def __add__(self, other):
if isinstance(other, (int, float)):
return self.__class__(self.metric + other)
elif isinstance(other, self.__class__):
return self.__class__(self.metric + other.metric)
else:
return NotImplemented
def __sub__(self, other):
if isinstance(other, (int, float)):
return self.__class__(self.metric - other)
elif isinstance(other, self.__class__):
return self.__class__(self.metric - other.metric)
else:
return NotImplemented
def __rsub__(self, other):
if isinstance(other, (int, float)):
return self.__class__(other - self.metric)
elif isinstance(other, self.__class__):
return self.__class__(other.metric - self.metric)
else:
return NotImplemented
def __mul__(self, other):
if isinstance(other, (int, float)):
return self.__class__(self.metric * other)
elif isinstance(other, self.__class__):
return self.__class__(self.metric * other.metric)
else:
return NotImplemented
def __truediv__(self, other):
if isinstance(other, (int, float)):
return self.__class__(self.metric / other)
elif isinstance(other, self.__class__):
return self.__class__(self.metric / other.metric)
else:
return NotImplemented
def __eq__(self, other):
if isinstance(other, (int, float)):
return self.__class__(self.metric == other)
elif isinstance(other, self.__class__):
return self.__class__(self.metric == other.metric)
else:
return NotImplemented
def __gt__(self, other):
if isinstance(other, (int, float)):
return self.__class__(self.metric > other)
elif isinstance(other, self.__class__):
return self.__class__(self.metric > other.metric)
else:
return NotImplemented
def __lt__(self, other):
if isinstance(other, (int, float)):
return self.__class__(self.metric < other)
elif isinstance(other, self.__class__):
return self.__class__(self.metric < other.metric)
else:
return NotImplemented
def __len__(self):
return len(self.metric)
class PandasSingleMetric(SingleMetric):
"""Each SingleMetric is based on pd.Series."""
def __init__(self, metric: Union[dict, pd.Series] = {}):
if isinstance(metric, dict):
self.metric = pd.Series(metric)
elif isinstance(metric, pd.Series):
self.metric = metric
else:
raise ValueError(f"metric must be dict or pd.Series")
def sum(self):
return self.metric.sum()
def mean(self):
return self.metric.mean()
def count(self):
return self.metric.count()
def abs(self):
return self.__class__(self.metric.abs())
@property
def empty(self):
return self.metric.empty
@property
def index(self):
return list(self.metric.index)
def add(self, other, fill_value=None):
return self.__class__(self.metric.add(other.metric, fill_value=fill_value))
def replace(self, replace_dict: dict):
return self.__class__(self.metric.replace(replace_dict))
def apply(self, func: Callable):
return self.__class__(self.metric.apply(func))
def reindex(self, index, fill_value):
return self.__class__(self.metric.reindex(index, fill_value=fill_value))
def __repr__(self):
return repr(self.metric)
class PandasOrderIndicator(BaseOrderIndicator):
"""
The data structure is OrderedDict(str: PandasSingleMetric).
Each PandasSingleMetric based on pd.Series is one metric.
Str is the name of metric.
"""
def __init__(self):
self.data: Dict[str, PandasSingleMetric] = OrderedDict()
def assign(self, col: str, metric: Union[dict, pd.Series]):
self.data[col] = PandasSingleMetric(metric)
def get_index_data(self, metric):
if metric in self.data:
return idd.SingleData(self.data[metric].metric)
else:
return idd.SingleData()
def get_metric_series(self, metric: str) -> Union[pd.Series]:
if metric in self.data:
return self.data[metric].metric
else:
return pd.Series()
def to_series(self):
return {k: v.metric for k, v in self.data.items()}
@staticmethod
def sum_all_indicators(order_indicator, indicators: list, metrics: Union[str, List[str]], fill_value=0):
if isinstance(metrics, str):
metrics = [metrics]
for metric in metrics:
tmp_metric = PandasSingleMetric({})
for indicator in indicators:
tmp_metric = tmp_metric.add(indicator.data[metric], fill_value)
order_indicator.assign(metric, tmp_metric.metric)
def __repr__(self):
return repr(self.data)
class NumpyOrderIndicator(BaseOrderIndicator):
"""
The data structure is OrderedDict(str: SingleData).
Each idd.SingleData is one metric.
Str is the name of metric.
"""
def __init__(self):
self.data: Dict[str, SingleData] = OrderedDict()
def assign(self, col: str, metric: dict):
self.data[col] = idd.SingleData(metric)
def get_index_data(self, metric):
if metric in self.data:
return self.data[metric]
else:
return idd.SingleData()
def get_metric_series(self, metric: str) -> Union[pd.Series]:
return self.data[metric].to_series()
def to_series(self) -> Dict[str, pd.Series]:
tmp_metric_dict = {}
for metric in self.data:
tmp_metric_dict[metric] = self.get_metric_series(metric)
return tmp_metric_dict
@staticmethod
def sum_all_indicators(order_indicator, indicators: list, metrics: Union[str, List[str]], fill_value=0):
# normalize `metrics` to a list first, so that `metrics[0]` below is a metric name
# rather than the first character of a string
if isinstance(metrics, str):
metrics = [metrics]
# get all index(stock_id)
stocks = set()
for indicator in indicators:
# set(np.ndarray.tolist()) is faster than set(np.ndarray)
stocks = stocks | set(indicator.data[metrics[0]].index.tolist())
stocks = list(stocks)
stocks.sort()
# add metric by index
for metric in metrics:
order_indicator.data[metric] = idd.sum_by_index(
[indicator.data[metric] for indicator in indicators], stocks, fill_value
)
def __repr__(self):
return repr(self.data)
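`BaseOrderIndicator.transfer` above passes metrics to `func` by matching the parameter names of `func` against the keys of `self.data`. The following is a minimal sketch of that name-based injection, assuming plain lists instead of `PandasSingleMetric` or `SingleData`; the standalone `transfer` function, `positive_rate`, and the sample values are hypothetical and only illustrate the mechanism.

```python
import inspect
from typing import Callable, Dict, List, Optional


def transfer(data: Dict[str, List[float]], func: Callable, new_col: Optional[str] = None):
    """Call `func` with the stored metrics whose names match its parameter names."""
    param_names = inspect.signature(func).parameters.keys()
    kwargs = {name: data[name] for name in param_names}  # KeyError if a metric is missing
    value = func(**kwargs)
    if new_col is not None:
        # Store the new metric instead of returning it.
        data[new_col] = value
        return None
    return value


def positive_rate(pa: List[float]) -> float:
    """Share of entries in `pa` that are strictly positive."""
    return sum(1 for x in pa if x > 0) / len(pa)


# Made-up sample values, keyed by metric name.
metrics = {"pa": [0.5, -0.2, 0.1, 0.0], "ffr": [1.0, 1.0, 0.5, 0.0]}

rate = transfer(metrics, positive_rate)  # only "pa" is injected
transfer(metrics, lambda pa, ffr: [p * f for p, f in zip(pa, ffr)], new_col="weighted_pa")
print(rate)
print(metrics["weighted_pa"])
```

One consequence of this design is that the parameter names of `func` are part of its contract: renaming a parameter changes which metric it receives.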

548
qlib/backtest/position.py Normal file

@@ -0,0 +1,548 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
import copy
import pathlib
from typing import Dict, List, Union
import pandas as pd
from datetime import timedelta
import numpy as np
from .decision import Order
from ..data.data import D
class BasePosition:
"""
The Position maintains the position like a dictionary
Please refer to the `Position` class for the position
"""
def __init__(self, cash=0.0, *args, **kwargs):
self._settle_type = self.ST_NO
def skip_update(self) -> bool:
"""
Should we skip updating operation for this position
For example, updating is meaningless for InfPosition
Returns
-------
bool:
should we skip the updating operator
"""
return False
def check_stock(self, stock_id: str) -> bool:
"""
check whether the stock is in the position
Parameters
----------
stock_id : str
the id of the stock
Returns
-------
bool:
whether the stock is in the position
"""
raise NotImplementedError(f"Please implement the `check_stock` method")
def update_order(self, order: Order, trade_val: float, cost: float, trade_price: float):
"""
Parameters
----------
order : Order
the order to update the position
trade_val : float
the trade value(money) of dealing results
cost : float
the trade cost of the dealing results
trade_price : float
the trade price of the dealing results
"""
raise NotImplementedError(f"Please implement the `update_order` method")
def update_stock_price(self, stock_id, price: float):
"""
Update the latest price of the stock
It is useful when clearing the balance at each bar end
Parameters
----------
stock_id :
the id of the stock
price : float
the price to be updated
"""
raise NotImplementedError(f"Please implement the `update_stock_price` method")
def calculate_stock_value(self) -> float:
"""
calculate the value of the all assets except cash in the position
Returns
-------
float:
the value(money) of all the stock
"""
raise NotImplementedError(f"Please implement the `calculate_stock_value` method")
def get_stock_list(self) -> List:
"""
Get the list of stocks in the position.
"""
raise NotImplementedError(f"Please implement the `get_stock_list` method")
def get_stock_price(self, code) -> float:
"""
get the latest price of the stock
Parameters
----------
code :
the code of the stock
"""
raise NotImplementedError(f"Please implement the `get_stock_price` method")
def get_stock_amount(self, code) -> float:
"""
get the amount of the stock
Parameters
----------
code :
the code of the stock
Returns
-------
float:
the amount of the stock
"""
raise NotImplementedError(f"Please implement the `get_stock_amount` method")
def get_cash(self, include_settle: bool = False) -> float:
"""
Returns
-------
float:
the available(tradable) cash in position
include_settle:
whether the unsettled (delayed) cash is included
Default: the unavailable cash is not included
"""
raise NotImplementedError(f"Please implement the `get_cash` method")
def get_stock_amount_dict(self) -> Dict:
"""
generate stock amount dict {stock_id : amount of stock}
Returns
-------
Dict:
{stock_id : amount of stock}
"""
raise NotImplementedError(f"Please implement the `get_stock_amount_dict` method")
def get_stock_weight_dict(self, only_stock: bool = False) -> Dict:
"""
generate stock weight dict {stock_id : value weight of stock in the position}
it is meaningful in the beginning or the end of each trade step
- During execution of each trading step, the weight may not be consistent with the portfolio value
Parameters
----------
only_stock : bool
If only_stock=True, the weight of each stock in total stock will be returned
If only_stock=False, the weight of each stock in total assets(stock + cash) will be returned
Returns
-------
Dict:
{stock_id : value weight of stock in the position}
"""
raise NotImplementedError(f"Please implement the `get_stock_weight_dict` method")
def add_count_all(self, bar):
"""
Will be called at the end of each bar on each level
Parameters
----------
bar :
The level to be updated
"""
raise NotImplementedError(f"Please implement the `add_count_all` method")
def update_weight_all(self):
"""
Updating the position weight;
# TODO: this function is a little weird. The weight data in the position is in a wrong state after dealing order
# and before updating weight.
Parameters
----------
bar :
The level to be updated
"""
raise NotImplementedError(f"Please implement the `update_weight_all` method")
ST_CASH = "cash"
ST_NO = None
def settle_start(self, settle_type: str):
"""
settlement start
It will act like start and commit a transaction
Parameters
----------
settle_type : str
Whether to delay the settlement in each execution (each execution will make the executor a step forward)
- "cash": make the cash settlement delayed.
- The cash you get can't be used in current step (e.g. you can't sell a stock to get cash to buy another
stock)
- None: no settlement mechanism
- TODO: other assets will be supported in the future.
"""
raise NotImplementedError(f"Please implement the `settle_start` method")
def settle_commit(self):
"""
settlement commit
Parameters
----------
settle_type : str
please refer to the documents of Executor
"""
raise NotImplementedError(f"Please implement the `settle_commit` method")
class Position(BasePosition):
"""Position
current state of position
a typical example is :{
<instrument_id>: {
'count': <how many days the security has been held>,
'amount': <the amount of the security>,
'price': <the close price of security in the last trading day>,
'weight': <the security weight of total position value>,
},
}
"""
def __init__(self, cash: float = 0, position_dict: Dict[str, Dict[str, float]] = {}):
"""Init position by cash and position_dict.
Parameters
----------
cash : float, optional
initial cash in account, by default 0
position_dict : Dict[
stock_id,
Union[
int, # it is equal to {"amount": int}
{"amount": int, "price"(optional): float},
]
]
initial stocks with parameters amount and price,
if there is no price key in the dict of stocks, it will be filled by `fill_stock_value`.
by default {}.
"""
super().__init__()
# NOTE: The position dict must be copied!!!
# Otherwise the dict passed in by the caller (or the shared default `{}`) would be modified in place
self.init_cash = cash
self.position = position_dict.copy()
for stock in self.position:
if isinstance(self.position[stock], int):
self.position[stock] = {"amount": self.position[stock]}
self.position["cash"] = cash
# If the stock price information is missing, the account value will not be calculated temporarily
try:
self.position["now_account_value"] = self.calculate_value()
except KeyError:
pass
def fill_stock_value(self, start_time: Union[str, pd.Timestamp], freq: str, last_days: int = 30):
"""fill the stock value by the close price of latest last_days from qlib.
Parameters
----------
start_time :
the start time of backtest.
last_days : int, optional
the days to get the latest close price, by default 30.
"""
stock_list = []
for stock in self.position:
if not isinstance(self.position[stock], dict):
continue
if ("price" not in self.position[stock]) or (self.position[stock]["price"] is None):
stock_list.append(stock)
if len(stock_list) == 0:
return
start_time = pd.Timestamp(start_time)
# note that start time is 2020-01-01 00:00:00 if raw start time is "2020-01-01"
price_end_time = start_time
price_start_time = start_time - timedelta(days=last_days)
price_df = D.features(
stock_list, ["$close"], price_start_time, price_end_time, freq=freq, disk_cache=True
).dropna()
price_dict = price_df.groupby(["instrument"]).tail(1).reset_index(level=1, drop=True)["$close"].to_dict()
if len(price_dict) < len(stock_list):
lack_stock = set(stock_list) - set(price_dict)
raise ValueError(f"{lack_stock} doesn't have close price in qlib in the latest {last_days} days")
for stock in stock_list:
self.position[stock]["price"] = price_dict[stock]
self.position["now_account_value"] = self.calculate_value()
def _init_stock(self, stock_id, amount, price=None):
"""
initialize the stock in the current position
Parameters
----------
stock_id :
the id of the stock
amount : float
the amount of the stock
price :
the price when buying the init stock
"""
self.position[stock_id] = {}
self.position[stock_id]["amount"] = amount
self.position[stock_id]["price"] = price
self.position[stock_id]["weight"] = 0 # update the weight in the end of the trade date
def _buy_stock(self, stock_id, trade_val, cost, trade_price):
trade_amount = trade_val / trade_price
if stock_id not in self.position:
self._init_stock(stock_id=stock_id, amount=trade_amount, price=trade_price)
else:
# exist, add amount
self.position[stock_id]["amount"] += trade_amount
self.position["cash"] -= trade_val + cost
def _sell_stock(self, stock_id, trade_val, cost, trade_price):
trade_amount = trade_val / trade_price
if stock_id not in self.position:
raise KeyError("{} not in current position".format(stock_id))
else:
if np.isclose(self.position[stock_id]["amount"], trade_amount):
# Selling all the stocks
# we use np.isclose instead of abs(<the final amount>) <= 1e-5 because `np.isclose` considers both the relative and the absolute difference
# Using abs(<the final amount>) <= 1e-5 will result in an error when the amount is large
self._del_stock(stock_id)
else:
# decrease the amount of stock
self.position[stock_id]["amount"] -= trade_amount
# check if to delete
if self.position[stock_id]["amount"] < -1e-5:
raise ValueError(
"only have {} {}, require {}".format(self.position[stock_id]["amount"], stock_id, trade_amount)
)
new_cash = trade_val - cost
if self._settle_type == self.ST_CASH:
self.position["cash_delay"] += new_cash
elif self._settle_type == self.ST_NO:
self.position["cash"] += new_cash
else:
raise NotImplementedError(f"This type of input is not supported")
def _del_stock(self, stock_id):
del self.position[stock_id]
def check_stock(self, stock_id):
return stock_id in self.position
def update_order(self, order, trade_val, cost, trade_price):
# handle the order; `order` is an `Order` instance
if order.direction == Order.BUY:
# BUY
self._buy_stock(order.stock_id, trade_val, cost, trade_price)
elif order.direction == Order.SELL:
# SELL
self._sell_stock(order.stock_id, trade_val, cost, trade_price)
else:
raise NotImplementedError("do not support order direction {}".format(order.direction))
def update_stock_price(self, stock_id, price):
self.position[stock_id]["price"] = price
def update_stock_count(self, stock_id, bar, count):
self.position[stock_id][f"count_{bar}"] = count
def update_stock_weight(self, stock_id, weight):
self.position[stock_id]["weight"] = weight
def calculate_stock_value(self):
stock_list = self.get_stock_list()
value = 0
for stock_id in stock_list:
value += self.position[stock_id]["amount"] * self.position[stock_id]["price"]
return value
def calculate_value(self):
value = self.calculate_stock_value()
value += self.position["cash"] + self.position.get("cash_delay", 0.0)
return value
def get_stock_list(self):
stock_list = list(set(self.position.keys()) - {"cash", "now_account_value", "cash_delay"})
return stock_list
def get_stock_price(self, code):
return self.position[code]["price"]
def get_stock_amount(self, code):
return self.position[code]["amount"] if code in self.position else 0
def get_stock_count(self, code, bar):
"""the number of bars the stock has been held, it may be used in some special strategies"""
if f"count_{bar}" in self.position[code]:
return self.position[code][f"count_{bar}"]
else:
return 0
def get_stock_weight(self, code):
return self.position[code]["weight"]
def get_cash(self, include_settle=False):
cash = self.position["cash"]
if include_settle:
cash += self.position.get("cash_delay", 0.0)
return cash
def get_stock_amount_dict(self):
"""generate stock amount dict {stock_id : amount of stock}"""
d = {}
stock_list = self.get_stock_list()
for stock_code in stock_list:
d[stock_code] = self.get_stock_amount(code=stock_code)
return d
def get_stock_weight_dict(self, only_stock=False):
"""get_stock_weight_dict
generate stock weight dict {stock_id : value weight of stock in the position}
it is meaningful in the beginning or the end of each trade date
:param only_stock: If only_stock=True, the weight of each stock in total stock will be returned
If only_stock=False, the weight of each stock in total assets(stock + cash) will be returned
"""
if only_stock:
position_value = self.calculate_stock_value()
else:
position_value = self.calculate_value()
d = {}
stock_list = self.get_stock_list()
for stock_code in stock_list:
d[stock_code] = self.position[stock_code]["amount"] * self.position[stock_code]["price"] / position_value
return d
def add_count_all(self, bar):
stock_list = self.get_stock_list()
for code in stock_list:
if f"count_{bar}" in self.position[code]:
self.position[code][f"count_{bar}"] += 1
else:
self.position[code][f"count_{bar}"] = 1
def update_weight_all(self):
weight_dict = self.get_stock_weight_dict()
for stock_code, weight in weight_dict.items():
self.update_stock_weight(stock_code, weight)
def settle_start(self, settle_type):
assert self._settle_type == self.ST_NO, "Currently, settlement can't be nested!!!!!"
self._settle_type = settle_type
if settle_type == self.ST_CASH:
self.position["cash_delay"] = 0.0
def settle_commit(self):
if self._settle_type != self.ST_NO:
if self._settle_type == self.ST_CASH:
self.position["cash"] += self.position["cash_delay"]
del self.position["cash_delay"]
else:
raise NotImplementedError(f"This type of input is not supported")
self._settle_type = self.ST_NO
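
The delayed cash settlement implemented by `settle_start` / `settle_commit` above can be illustrated with a minimal standalone sketch. `MiniPosition` and its `sell` method are hypothetical stand-ins written for this note, not qlib classes; only the handling of `cash` and `cash_delay` mirrors the code above.

```python
class MiniPosition:
    """Hypothetical, simplified stand-in for Position (not part of qlib)."""

    ST_CASH = "cash"
    ST_NO = None

    def __init__(self, cash=0.0):
        self.position = {"cash": cash}
        self._settle_type = self.ST_NO

    def settle_start(self, settle_type):
        # Nested settlements are not supported, as in the original.
        assert self._settle_type == self.ST_NO
        self._settle_type = settle_type
        if settle_type == self.ST_CASH:
            self.position["cash_delay"] = 0.0

    def sell(self, trade_val, cost):
        # Proceeds go to "cash_delay" while a cash settlement is open.
        new_cash = trade_val - cost
        if self._settle_type == self.ST_CASH:
            self.position["cash_delay"] += new_cash
        else:
            self.position["cash"] += new_cash

    def get_cash(self, include_settle=False):
        cash = self.position["cash"]
        if include_settle:
            cash += self.position.get("cash_delay", 0.0)
        return cash

    def settle_commit(self):
        # Move the delayed cash into usable cash and close the settlement.
        if self._settle_type == self.ST_CASH:
            self.position["cash"] += self.position.pop("cash_delay")
        self._settle_type = self.ST_NO


pos = MiniPosition(cash=100.0)
pos.settle_start(MiniPosition.ST_CASH)
pos.sell(trade_val=50.0, cost=1.0)
cash_during = pos.get_cash()                         # proceeds not usable yet
cash_with_pending = pos.get_cash(include_settle=True)
pos.settle_commit()
cash_after = pos.get_cash()                          # proceeds now usable
```

While the settlement is open, the sale proceeds are visible only with `include_settle=True`; after the commit they become ordinary cash.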
class InfPosition(BasePosition):
"""
Position with infinite cash and amount.
This is useful for generating random orders.
"""
def skip_update(self) -> bool:
"""Updating state is meaningless for InfPosition"""
return True
def check_stock(self, stock_id: str) -> bool:
# InfPosition always holds every stock
return True
def update_order(self, order: Order, trade_val: float, cost: float, trade_price: float):
pass
def update_stock_price(self, stock_id, price: float):
pass
def calculate_stock_value(self) -> float:
"""
Returns
-------
float:
infinity stock value
"""
return np.inf
def get_stock_list(self) -> List:
raise NotImplementedError(f"InfPosition doesn't support stock list position")
def get_stock_price(self, code) -> float:
"""the price of the inf position is meaningless"""
return np.nan
def get_stock_amount(self, code) -> float:
return np.inf
def get_cash(self, include_settle=False) -> float:
return np.inf
def get_stock_amount_dict(self) -> Dict:
raise NotImplementedError(f"InfPosition doesn't support get_stock_amount_dict")
def get_stock_weight_dict(self, only_stock: bool) -> Dict:
raise NotImplementedError(f"InfPosition doesn't support get_stock_weight_dict")
def add_count_all(self, bar):
raise NotImplementedError(f"InfPosition doesn't support add_count_all")
def update_weight_all(self):
raise NotImplementedError(f"InfPosition doesn't support update_weight_all")
def settle_start(self, settle_type: str):
pass
def settle_commit(self):
pass


@@ -1,12 +1,14 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
"""
This module is not well maintained.
"""
import numpy as np
import pandas as pd
from .position import Position
-from ...data import D
-from ...config import C
+from ..data import D
+from ..config import C
import datetime
from pathlib import Path
@@ -35,7 +37,7 @@ def get_benchmark_weight(
"""
if not path:
-path = Path(C.dpm.get_data_path(freq)).expanduser() / "raw" / "AIndexMembers" / "weights.csv"
+path = Path(C.dpm.get_data_uri(freq)).expanduser() / "raw" / "AIndexMembers" / "weights.csv"
# TODO: the storage of weights should be implemented in a more elegant way
# TODO: The benchmark is not consistent with the filename in instruments.
bench_weight_df = pd.read_csv(path, usecols=["code", "date", "index", "weight"])

617
qlib/backtest/report.py Normal file

@@ -0,0 +1,617 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from collections import OrderedDict
import pathlib
from typing import Dict, List, Tuple, Union
import numpy as np
import pandas as pd
from qlib.backtest.exchange import Exchange
from .decision import IdxTradeRange
from qlib.backtest.decision import BaseTradeDecision, Order, OrderDir
from qlib.backtest.utils import TradeCalendarManager
from .high_performance_ds import BaseOrderIndicator, PandasOrderIndicator, NumpyOrderIndicator, SingleMetric
from ..data import D
from ..tests.config import CSI300_BENCH
from ..utils.resam import get_higher_eq_freq_feature, resam_ts_data
import qlib.utils.index_data as idd
class PortfolioMetrics:
"""
Motivation:
PortfolioMetrics is for supporting portfolio related metrics.
Implementation:
daily portfolio metrics of the account
contain those followings: return, cost, turnover, account, cash, bench, value
For each step(bar/day/minute), each column represents
- return: the return of the portfolio generated by strategy **without transaction fee**.
- cost: the transaction fee and slippage.
- account: the total value of assets(cash and securities are both included) in user account based on the close price of each step.
- cash: the amount of cash in user's account.
- bench: the return of the benchmark
- value: the total value of securities/stocks/instruments (cash is excluded).
update report
"""
def __init__(self, freq: str = "day", benchmark_config: dict = {}):
"""
Parameters
----------
freq : str
frequency of trading bar, used for updating hold count of trading bar
benchmark_config : dict
config of benchmark, may include the following arguments:
- benchmark : Union[str, list, pd.Series]
- If `benchmark` is pd.Series, `index` is trading date; the value T is the change from T-1 to T.
example:
print(D.features(D.instruments('csi500'), ['$close/Ref($close, 1)-1'])['$close/Ref($close, 1)-1'].head())
2017-01-04 0.011693
2017-01-05 0.000721
2017-01-06 -0.004322
2017-01-09 0.006874
2017-01-10 -0.003350
- If `benchmark` is list, will use the daily average change of the stock pool in the list as the 'bench'.
- If `benchmark` is str, will use the daily change as the 'bench'.
benchmark code, default is SH000300 CSI300
- start_time : Union[str, pd.Timestamp], optional
- If `benchmark` is pd.Series, it will be ignored
- Else, it represents the start time of benchmark, by default None
- end_time : Union[str, pd.Timestamp], optional
- If `benchmark` is pd.Series, it will be ignored
- Else, it represents the end time of benchmark, by default None
"""
self.init_vars()
self.init_bench(freq=freq, benchmark_config=benchmark_config)
def init_vars(self):
self.accounts = OrderedDict() # account position value for each trade time
self.returns = OrderedDict() # daily return rate for each trade time
self.total_turnovers = OrderedDict() # total turnover for each trade time
self.turnovers = OrderedDict() # turnover for each trade time
self.total_costs = OrderedDict() # total trade cost for each trade time
self.costs = OrderedDict() # trade cost rate for each trade time
self.values = OrderedDict() # value for each trade time
self.cashes = OrderedDict()
self.benches = OrderedDict()
self.latest_pm_time = None # pd.Timestamp
def init_bench(self, freq=None, benchmark_config=None):
if freq is not None:
self.freq = freq
self.benchmark_config = benchmark_config
self.bench = self._cal_benchmark(self.benchmark_config, self.freq)
def _cal_benchmark(self, benchmark_config, freq):
if benchmark_config is None:
return None
benchmark = benchmark_config.get("benchmark", CSI300_BENCH)
if benchmark is None:
return None
if isinstance(benchmark, pd.Series):
return benchmark
else:
start_time = benchmark_config.get("start_time", None)
end_time = benchmark_config.get("end_time", None)
if freq is None:
raise ValueError("benchmark freq can't be None!")
_codes = benchmark if isinstance(benchmark, (list, dict)) else [benchmark]
fields = ["$close/Ref($close,1)-1"]
_temp_result, _ = get_higher_eq_freq_feature(_codes, fields, start_time, end_time, freq=freq)
if len(_temp_result) == 0:
raise ValueError(f"The benchmark {_codes} does not exist. Please provide the right benchmark")
return _temp_result.groupby(level="datetime")[_temp_result.columns.tolist()[0]].mean().fillna(0)
def _sample_benchmark(self, bench, trade_start_time, trade_end_time):
if self.bench is None:
return None
def cal_change(x):
return (x + 1).prod()
_ret = resam_ts_data(bench, trade_start_time, trade_end_time, method=cal_change)
return 0.0 if _ret is None else _ret - 1
def is_empty(self):
return len(self.accounts) == 0
def get_latest_date(self):
return self.latest_pm_time
def get_latest_account_value(self):
return self.accounts[self.latest_pm_time]
def get_latest_total_cost(self):
return self.total_costs[self.latest_pm_time]
def get_latest_total_turnover(self):
return self.total_turnovers[self.latest_pm_time]
def update_portfolio_metrics_record(
self,
trade_start_time=None,
trade_end_time=None,
account_value=None,
cash=None,
return_rate=None,
total_turnover=None,
turnover_rate=None,
total_cost=None,
cost_rate=None,
stock_value=None,
bench_value=None,
):
# check data
if None in [
trade_start_time,
account_value,
cash,
return_rate,
total_turnover,
turnover_rate,
total_cost,
cost_rate,
stock_value,
]:
raise ValueError(
"None in [trade_start_time, account_value, cash, return_rate, total_turnover, turnover_rate, total_cost, cost_rate, stock_value]"
)
if trade_end_time is None and bench_value is None:
raise ValueError("Both trade_end_time and bench_value are None, benchmark is not usable.")
elif bench_value is None:
bench_value = self._sample_benchmark(self.bench, trade_start_time, trade_end_time)
# update pm data
self.accounts[trade_start_time] = account_value
self.returns[trade_start_time] = return_rate
self.total_turnovers[trade_start_time] = total_turnover
self.turnovers[trade_start_time] = turnover_rate
self.total_costs[trade_start_time] = total_cost
self.costs[trade_start_time] = cost_rate
self.values[trade_start_time] = stock_value
self.cashes[trade_start_time] = cash
self.benches[trade_start_time] = bench_value
# update pm
self.latest_pm_time = trade_start_time
# finish pm update in each step
def generate_portfolio_metrics_dataframe(self):
pm = pd.DataFrame()
pm["account"] = pd.Series(self.accounts)
pm["return"] = pd.Series(self.returns)
pm["total_turnover"] = pd.Series(self.total_turnovers)
pm["turnover"] = pd.Series(self.turnovers)
pm["total_cost"] = pd.Series(self.total_costs)
pm["cost"] = pd.Series(self.costs)
pm["value"] = pd.Series(self.values)
pm["cash"] = pd.Series(self.cashes)
pm["bench"] = pd.Series(self.benches)
pm.index.name = "datetime"
return pm
def save_portfolio_metrics(self, path):
r = self.generate_portfolio_metrics_dataframe()
r.to_csv(path)
def load_portfolio_metrics(self, path):
"""load pm from a file
should have format like
columns = ['account', 'return', 'total_turnover', 'turnover', 'cost', 'total_cost', 'value', 'cash', 'bench']
:param
path: str/ pathlib.Path()
"""
path = pathlib.Path(path)
# pass the path directly so that pandas opens and closes the file itself (no leaked file handle)
r = pd.read_csv(path, index_col=0)
r.index = pd.DatetimeIndex(r.index)
index = r.index
self.init_vars()
for trade_start_time in index:
self.update_portfolio_metrics_record(
trade_start_time=trade_start_time,
account_value=r.loc[trade_start_time]["account"],
cash=r.loc[trade_start_time]["cash"],
return_rate=r.loc[trade_start_time]["return"],
total_turnover=r.loc[trade_start_time]["total_turnover"],
turnover_rate=r.loc[trade_start_time]["turnover"],
total_cost=r.loc[trade_start_time]["total_cost"],
cost_rate=r.loc[trade_start_time]["cost"],
stock_value=r.loc[trade_start_time]["value"],
bench_value=r.loc[trade_start_time]["bench"],
)
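
The benchmark return of one trading step is obtained in `_sample_benchmark` by compounding the per-bar returns, `(x + 1).prod() - 1`. A minimal pandas-only sketch of that arithmetic follows; the numbers are illustrative, `compound_return` is a hypothetical helper, and plain label slicing is used here in place of `resam_ts_data`.

```python
import pandas as pd

# Per-bar benchmark returns (illustrative numbers, not real market data).
bench = pd.Series(
    [0.01, -0.02, 0.005],
    index=pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"]),
)


def compound_return(returns: pd.Series) -> float:
    """Compound simple returns: prod(1 + r) - 1; an empty window yields 0.0."""
    if returns.empty:
        return 0.0
    return float((returns + 1).prod() - 1)


# Return over a step that covers the first two bars: 1.01 * 0.98 - 1
step_return = compound_return(bench.loc["2021-01-04":"2021-01-05"])
empty_return = compound_return(bench.iloc[0:0])
```

Compounding, rather than summing, keeps the step return consistent with the product of the bar-level growth factors.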
class Indicator:
"""
`Indicator` is implemented in an aggregate way.
All the metrics are calculated by aggregation.
All the metrics are calculated for each separate stock, in a specific step on a specific level.
| indicator    | desc.                                                        |
|--------------+--------------------------------------------------------------|
| amount       | the *target* amount given by the outer strategy              |
| deal_amount  | the real deal amount                                         |
| inner_amount | the total *target* amount of inner strategy                  |
| trade_price  | the average deal price                                       |
| trade_value  | the total trade value                                        |
| trade_cost   | the total trade cost (base price need direction)             |
| trade_dir    | the trading direction                                        |
| ffr          | full fill rate                                               |
| pa           | price advantage                                              |
| pos          | win rate                                                     |
| base_price   | the price of baseline                                        |
| base_volume  | the volume of baseline (for weighted aggregating base_price) |
**NOTE**:
The `base_price` and `base_volume` can't be NaN when there is no trading on that step. Otherwise
the aggregation gets wrong results.
So `base_price` will not be calculated in an aggregate way!!
"""
def __init__(self, order_indicator_cls=NumpyOrderIndicator):
self.order_indicator_cls = order_indicator_cls
# order indicator is metrics for a single order for a specific step
self.order_indicator_his = OrderedDict()
self.order_indicator: BaseOrderIndicator = self.order_indicator_cls()
# trade indicator is metrics for all orders for a specific step
self.trade_indicator_his = OrderedDict()
self.trade_indicator: Dict[str, float] = OrderedDict()
self._trade_calendar = None
# def reset(self, trade_calendar: TradeCalendarManager):
def reset(self):
self.order_indicator: BaseOrderIndicator = self.order_indicator_cls()
self.trade_indicator = OrderedDict()
# self._trade_calendar = trade_calendar
def record(self, trade_start_time):
self.order_indicator_his[trade_start_time] = self.get_order_indicator()
self.trade_indicator_his[trade_start_time] = self.get_trade_indicator()
def _update_order_trade_info(self, trade_info: list):
amount = dict()
deal_amount = dict()
trade_price = dict()
trade_value = dict()
trade_cost = dict()
trade_dir = dict()
pa = dict()
for order, _trade_val, _trade_cost, _trade_price in trade_info:
amount[order.stock_id] = order.amount_delta
deal_amount[order.stock_id] = order.deal_amount_delta
trade_price[order.stock_id] = _trade_price
trade_value[order.stock_id] = _trade_val * order.sign
trade_cost[order.stock_id] = _trade_cost
trade_dir[order.stock_id] = order.direction
# The PA in the innermost layer is meaningless
pa[order.stock_id] = 0
self.order_indicator.assign("amount", amount)
self.order_indicator.assign("inner_amount", amount)
self.order_indicator.assign("deal_amount", deal_amount)
# NOTE: trade_price and baseline price will be same on the lowest-level
self.order_indicator.assign("trade_price", trade_price)
self.order_indicator.assign("trade_value", trade_value)
self.order_indicator.assign("trade_cost", trade_cost)
self.order_indicator.assign("trade_dir", trade_dir)
self.order_indicator.assign("pa", pa)
def _update_order_fulfill_rate(self):
def func(deal_amount, amount):
# deal_amount is np.nan or None when there is no inner decision. So full fill rate is 0.
tmp_deal_amount = deal_amount.reindex(amount.index, 0)
tmp_deal_amount = tmp_deal_amount.replace({np.nan: 0})
return tmp_deal_amount / amount
self.order_indicator.transfer(func, "ffr")
def update_order_indicators(self, trade_info: list):
self._update_order_trade_info(trade_info=trade_info)
self._update_order_fulfill_rate()
def _agg_order_trade_info(self, inner_order_indicators: List[Dict[str, pd.Series]]):
# calculate total trade amount with each inner order indicator.
def trade_amount_func(deal_amount, trade_price):
return deal_amount * trade_price
for indicator in inner_order_indicators:
indicator.transfer(trade_amount_func, "trade_price")
# sum inner order indicators with same metric.
all_metric = ["inner_amount", "deal_amount", "trade_price", "trade_value", "trade_cost", "trade_dir"]
self.order_indicator_cls.sum_all_indicators(
self.order_indicator, inner_order_indicators, all_metric, fill_value=0
)
def func(trade_price, deal_amount):
# trade_price is np.nan instead of inf when deal_amount is zero.
tmp_deal_amount = deal_amount.replace({0: np.nan})
return trade_price / tmp_deal_amount
self.order_indicator.transfer(func, "trade_price")
def func_apply(trade_dir):
return trade_dir.apply(Order.parse_dir)
self.order_indicator.transfer(func_apply, "trade_dir")
def _update_trade_amount(self, outer_trade_decision: BaseTradeDecision):
# NOTE: these indicator is designed for order execution, so the
decision: List[Order] = outer_trade_decision.get_decision()
if len(decision) == 0:
self.order_indicator.assign("amount", {})
else:
self.order_indicator.assign("amount", {order.stock_id: order.amount_delta for order in decision})
def _get_base_vol_pri(
self,
inst: str,
trade_start_time: pd.Timestamp,
trade_end_time: pd.Timestamp,
direction: OrderDir,
decision: BaseTradeDecision,
trade_exchange: Exchange,
pa_config: dict = {},
):
"""
Get the base volume and price information
All the base price values are rooted from this function
"""
agg = pa_config.get("agg", "twap").lower()
price = pa_config.get("price", "deal_price").lower()
if decision.trade_range is not None:
trade_start_time, trade_end_time = decision.trade_range.clip_time_range(
start_time=trade_start_time, end_time=trade_end_time
)
if price == "deal_price":
price_s = trade_exchange.get_deal_price(
inst, trade_start_time, trade_end_time, direction=direction, method=None
)
else:
raise NotImplementedError(f"This type of input is not supported")
# if there is no stock data during the time period
if price_s is None:
return None, None
if isinstance(price_s, (int, float, np.number)):
price_s = idd.SingleData(price_s, [trade_start_time])
elif isinstance(price_s, idd.SingleData):
pass
else:
raise NotImplementedError(f"This type of input is not supported")
# NOTE: there are some zeros in the trading price. These cases are known meaningless
# for aligning the previous logic, remove it.
# remove zero and negative values.
price_s = price_s.loc[(price_s > 1e-08).data.astype(bool)]
# NOTE ~(price_s < 1e-08) is different from price_s >= 1e-8
# ~(np.nan < 1e-8) -> ~(False) -> True
if agg == "vwap":
volume_s = trade_exchange.get_volume(inst, trade_start_time, trade_end_time, method=None)
if isinstance(volume_s, (int, float, np.number)):
volume_s = idd.SingleData(volume_s, [trade_start_time])
volume_s = volume_s.reindex(price_s.index)
elif agg == "twap":
volume_s = idd.SingleData(1, price_s.index)
else:
raise NotImplementedError(f"This type of input is not supported")
base_volume = volume_s.sum()
base_price = (price_s * volume_s).sum() / base_volume
return base_price, base_volume
def _agg_base_price(
self,
inner_order_indicators: List[Dict[str, Union[SingleMetric, idd.SingleData]]],
decision_list: List[Tuple[BaseTradeDecision, pd.Timestamp, pd.Timestamp]],
trade_exchange: Exchange,
pa_config: dict = {},
):
"""
# NOTE:!!!!
# Strong assumption!!!!!!
# the correctness of the base_price relies on that the **same** exchange is used
Parameters
----------
inner_order_indicators : List[Dict[str, pd.Series]]
the indicators of account of inner executor
decision_list: List[Tuple[BaseTradeDecision, pd.Timestamp, pd.Timestamp]],
a list of decisions according to inner_order_indicators
trade_exchange : Exchange
for retrieving trading price
pa_config : dict
For example
{
"agg": "twap", # "vwap"
"price": "$close", # TODO: this is not supported now!!!!!
# default to use deal price of the exchange
}
"""
# TODO: I think there are potentials to be optimized
trade_dir = self.order_indicator.get_index_data("trade_dir")
if len(trade_dir) > 0:
bp_all, bv_all = [], []
# <step, inst, (base_volume | base_price)>
for oi, (dec, start, end) in zip(inner_order_indicators, decision_list):
bp_s = oi.get_index_data("base_price").reindex(trade_dir.index)
bv_s = oi.get_index_data("base_volume").reindex(trade_dir.index)
bp_new, bv_new = {}, {}
for pr, v, (inst, direction) in zip(bp_s.data, bv_s.data, zip(trade_dir.index, trade_dir.data)):
if np.isnan(pr):
bp_tmp, bv_tmp = self._get_base_vol_pri(
inst,
start,
end,
decision=dec,
direction=direction,
trade_exchange=trade_exchange,
pa_config=pa_config,
)
if (bp_tmp is not None) and (bv_tmp is not None):
bp_new[inst], bv_new[inst] = bp_tmp, bv_tmp
else:
bp_new[inst], bv_new[inst] = pr, v
bp_new = idd.SingleData(bp_new)
bv_new = idd.SingleData(bv_new)
bp_all.append(bp_new)
bv_all.append(bv_new)
bp_all = idd.concat(bp_all, axis=1)
bv_all = idd.concat(bv_all, axis=1)
base_volume = bv_all.sum(axis=1)
self.order_indicator.assign("base_volume", base_volume.to_dict())
self.order_indicator.assign("base_price", ((bp_all * bv_all).sum(axis=1) / base_volume).to_dict())
def _agg_order_price_advantage(self):
def if_empty_func(trade_price):
return trade_price.empty
if_empty = self.order_indicator.transfer(if_empty_func)
if not if_empty:
def func(trade_dir, trade_price, base_price):
sign = 1 - trade_dir * 2
return sign * (trade_price / base_price - 1)
self.order_indicator.transfer(func, "pa")
else:
self.order_indicator.assign("pa", {})
def agg_order_indicators(
self,
inner_order_indicators: List[Dict[str, pd.Series]],
decision_list: List[Tuple[BaseTradeDecision, pd.Timestamp, pd.Timestamp]],
outer_trade_decision: BaseTradeDecision,
trade_exchange: Exchange,
indicator_config={},
):
self._agg_order_trade_info(inner_order_indicators)
self._update_trade_amount(outer_trade_decision)
self._update_order_fulfill_rate()
pa_config = indicator_config.get("pa_config", {})
self._agg_base_price(inner_order_indicators, decision_list, trade_exchange, pa_config=pa_config) # TODO
self._agg_order_price_advantage()
def _cal_trade_fulfill_rate(self, method="mean"):
if method == "mean":
def func(ffr):
return ffr.mean()
elif method == "amount_weighted":
def func(ffr, deal_amount):
return (ffr * deal_amount.abs()).sum() / (deal_amount.abs().sum())
elif method == "value_weighted":
def func(ffr, trade_value):
return (ffr * trade_value.abs()).sum() / (trade_value.abs().sum())
else:
raise ValueError(f"method {method} is not supported!")
return self.order_indicator.transfer(func)
def _cal_trade_price_advantage(self, method="mean"):
if method == "mean":
def func(pa):
return pa.mean()
elif method == "amount_weighted":
def func(pa, deal_amount):
return (pa * deal_amount.abs()).sum() / (deal_amount.abs().sum())
elif method == "value_weighted":
def func(pa, trade_value):
return (pa * trade_value.abs()).sum() / (trade_value.abs().sum())
else:
raise ValueError(f"method {method} is not supported!")
return self.order_indicator.transfer(func)
def _cal_trade_positive_rate(self):
def func(pa):
return (pa > 0).sum() / pa.count()
return self.order_indicator.transfer(func)
def _cal_deal_amount(self):
def func(deal_amount):
return deal_amount.abs().sum()
return self.order_indicator.transfer(func)
def _cal_trade_value(self):
def func(trade_value):
return trade_value.abs().sum()
return self.order_indicator.transfer(func)
def _cal_trade_order_count(self):
def func(amount):
return amount.count()
return self.order_indicator.transfer(func)
def cal_trade_indicators(self, trade_start_time, freq, indicator_config={}):
show_indicator = indicator_config.get("show_indicator", False)
ffr_config = indicator_config.get("ffr_config", {})
pa_config = indicator_config.get("pa_config", {})
fulfill_rate = self._cal_trade_fulfill_rate(method=ffr_config.get("weight_method", "mean"))
price_advantage = self._cal_trade_price_advantage(method=pa_config.get("weight_method", "mean"))
positive_rate = self._cal_trade_positive_rate()
deal_amount = self._cal_deal_amount()
trade_value = self._cal_trade_value()
order_count = self._cal_trade_order_count()
self.trade_indicator["ffr"] = fulfill_rate
self.trade_indicator["pa"] = price_advantage
self.trade_indicator["pos"] = positive_rate
self.trade_indicator["deal_amount"] = deal_amount
self.trade_indicator["value"] = trade_value
self.trade_indicator["count"] = order_count
if show_indicator:
print(
"[Indicator({}) {:%Y-%m-%d %H:%M:%S}]: FFR: {}, PA: {}, POS: {}".format(
freq, trade_start_time, fulfill_rate, price_advantage, positive_rate
)
)
def get_order_indicator(self, raw: bool = True):
if raw:
return self.order_indicator
return self.order_indicator.to_series()
def get_trade_indicator(self):
return self.trade_indicator
def generate_trade_indicators_dataframe(self):
return pd.DataFrame.from_dict(self.trade_indicator_his, orient="index")
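
The base price and price advantage (PA) computed by `_get_base_vol_pri` and `_agg_order_price_advantage` reduce to a weighted mean and a signed relative difference. The sketch below uses plain NumPy; `base_price` and `price_advantage` are hypothetical helper names, and it assumes the direction is encoded as 0 for sell and 1 for buy, which is what the expression `1 - trade_dir * 2` suggests but is not shown in this file.

```python
import numpy as np


def base_price(prices, volumes=None):
    """Weighted mean price; with volumes=None every bar weighs 1 (TWAP), else VWAP."""
    prices = np.asarray(prices, dtype=float)
    weights = np.ones_like(prices) if volumes is None else np.asarray(volumes, dtype=float)
    keep = prices > 1e-08  # drops zero, negative and NaN prices, like the original filter
    return float((prices[keep] * weights[keep]).sum() / weights[keep].sum())


def price_advantage(trade_dir, trade_price, base):
    """Signed relative difference between the deal price and the base price."""
    sign = 1 - trade_dir * 2  # assumed encoding: sell (0) -> +1, buy (1) -> -1
    return sign * (trade_price / base - 1)


prices = [10.0, 0.0, 12.0]       # the zero price is treated as meaningless
volumes = [100.0, 50.0, 300.0]
twap = base_price(prices)                   # (10 + 12) / 2
vwap = base_price(prices, volumes)          # (10 * 100 + 12 * 300) / 400
pa_buy = price_advantage(1, 10.45, twap)    # buying below the base price
pa_sell = price_advantage(0, 10.45, twap)   # selling below the base price
```

Under the assumed encoding, a positive PA means the order was executed at a better price than the baseline: lower for a buy, higher for a sell.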

102
qlib/backtest/signal.py Normal file

@@ -0,0 +1,102 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from qlib.utils import init_instance_by_config
from typing import Dict, List, Text, Tuple, Union
from ..model.base import BaseModel
from ..data.dataset import Dataset
from ..data.dataset.utils import convert_index_format
from ..utils.resam import resam_ts_data
import pandas as pd
import abc
class Signal(metaclass=abc.ABCMeta):
"""
Some trading strategies make decisions based on other prediction signals
The signals may come from different sources (e.g. prepared data, online prediction from model and dataset)
This interface tries to provide a unified interface for those different sources
"""
@abc.abstractmethod
def get_signal(self, start_time, end_time) -> Union[pd.Series, pd.DataFrame, None]:
"""
get the signal at the end of the decision step(from `start_time` to `end_time`)
Returns
-------
Union[pd.Series, pd.DataFrame, None]:
returns None if no signal in the specific day
"""
...
class SignalWCache(Signal):
"""
Signal with a pandas-based cache
SignalWCache will store the prepared signal as an attribute and return the corresponding signal based on the input query
"""
def __init__(self, signal: Union[pd.Series, pd.DataFrame]):
"""
Parameters
----------
signal : Union[pd.Series, pd.DataFrame]
The expected format of the signal is like the data below (the order of index is not important and can be automatically adjusted)
instrument datetime
SH600000 2008-01-02 0.079704
2008-01-03 0.120125
2008-01-04 0.878860
2008-01-07 0.505539
2008-01-08 0.395004
"""
self.signal_cache = convert_index_format(signal, level="datetime")
def get_signal(self, start_time, end_time) -> Union[pd.Series, pd.DataFrame]:
# the frequency of the signal may not align with the decision frequency of strategy
# so resampling from the data is necessary
# the latest signal leverages more recent data and is therefore used in trading.
signal = resam_ts_data(self.signal_cache, start_time=start_time, end_time=end_time, method="last")
return signal
class ModelSignal(SignalWCache):
def __init__(self, model: BaseModel, dataset: Dataset):
self.model = model
self.dataset = dataset
pred_scores = self.model.predict(dataset)
if isinstance(pred_scores, pd.DataFrame):
pred_scores = pred_scores.iloc[:, 0]
super().__init__(pred_scores)
def _update_model(self):
"""
When using online data, update model in each bar as the following steps:
- update dataset with online data, the dataset should support online update
- make the latest prediction scores of the new bar
- update the pred score into the latest prediction
"""
# TODO: this method is not included in the framework and could be refactored later
raise NotImplementedError("_update_model is not implemented!")
def create_signal_from(
obj: Union[Signal, Tuple[BaseModel, Dataset], List, Dict, Text, pd.Series, pd.DataFrame]
) -> Signal:
"""
create signal from diverse information
This method will choose the right method to create a signal based on `obj`
Please refer to the code below.
"""
if isinstance(obj, Signal):
return obj
elif isinstance(obj, (tuple, list)):
return ModelSignal(*obj)
elif isinstance(obj, (dict, str)):
return init_instance_by_config(obj)
elif isinstance(obj, (pd.DataFrame, pd.Series)):
return SignalWCache(signal=obj)
else:
raise NotImplementedError(f"This type of signal is not supported")
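The comments in `SignalWCache.get_signal` say that the cached signal is resampled with `method="last"`, so that the most recent value inside the decision step is used. The block below is a minimal stdlib-only sketch of that idea. It is not `resam_ts_data`; the function name `last_signal_in_range` and the list-of-tuples cache are assumptions made for illustration, and the sample values are taken from the docstring above.

```python
import bisect
from datetime import datetime
from typing import List, Optional, Tuple


def last_signal_in_range(
    cache: List[Tuple[datetime, float]], start_time: datetime, end_time: datetime
) -> Optional[float]:
    """Return the latest cached value whose timestamp lies in [start_time, end_time].

    `cache` is a hypothetical stand-in for the signal cache and must be
    sorted by timestamp.
    """
    times = [t for t, _ in cache]
    # Index of the last timestamp that is <= end_time.
    idx = bisect.bisect_right(times, end_time) - 1
    if idx < 0 or times[idx] < start_time:
        return None  # no signal inside this decision step
    return cache[idx][1]


cache = [
    (datetime(2008, 1, 2), 0.079704),
    (datetime(2008, 1, 3), 0.120125),
    (datetime(2008, 1, 4), 0.878860),
]

# A two-day step picks the later of the two available values.
two_day = last_signal_in_range(cache, datetime(2008, 1, 2), datetime(2008, 1, 3, 23, 59))
# A step with no cached data yields None, matching the `Signal.get_signal` contract.
empty = last_signal_in_range(cache, datetime(2008, 1, 5), datetime(2008, 1, 6))
```

Both interval ends are treated as closed here, in line with the closed-interval convention described in `TradeCalendarManager.get_step_time`.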

269
qlib/backtest/utils.py Normal file

@@ -0,0 +1,269 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from __future__ import annotations
import bisect
from qlib.utils.time import epsilon_change
from typing import TYPE_CHECKING, Tuple, Union
if TYPE_CHECKING:
from qlib.backtest.decision import BaseTradeDecision
import pandas as pd
import warnings
from ..data.data import Cal
class TradeCalendarManager:
"""
Manager for trading calendar
- BaseStrategy and BaseExecutor will use it
"""
def __init__(
self,
freq: str,
start_time: Union[str, pd.Timestamp] = None,
end_time: Union[str, pd.Timestamp] = None,
level_infra: "LevelInfrastructure" = None,
):
"""
Parameters
----------
freq : str
frequency of trading calendar, also trade time per trading step
start_time : Union[str, pd.Timestamp], optional
closed start of the trading calendar, by default None
If `start_time` is None, it must be reset before trading.
end_time : Union[str, pd.Timestamp], optional
closed end of the trade time range, by default None
If `end_time` is None, it must be reset before trading.
"""
self.level_infra = level_infra
self.reset(freq=freq, start_time=start_time, end_time=end_time)
def reset(self, freq, start_time, end_time):
"""
Please refer to the docs of `__init__`
Reset the trade calendar
- self.trade_len : The total count for trading step
- self.trade_step : The number of trading step finished, self.trade_step can be [0, 1, 2, ..., self.trade_len - 1]
"""
self.freq = freq
self.start_time = pd.Timestamp(start_time) if start_time else None
self.end_time = pd.Timestamp(end_time) if end_time else None
_calendar = Cal.calendar(freq=freq)
self._calendar = _calendar
_, _, _start_index, _end_index = Cal.locate_index(start_time, end_time, freq=freq)
self.start_index = _start_index
self.end_index = _end_index
self.trade_len = _end_index - _start_index + 1
self.trade_step = 0
def finished(self):
"""
Check if the trading finished
- Should check before calling strategy.generate_decisions and executor.execute
- If self.trade_step >= self.trade_len, it means the trading is finished
- If self.trade_step < self.trade_len, it means the number of trading steps finished is self.trade_step
"""
return self.trade_step >= self.trade_len - 1
def step(self):
if self.finished():
raise RuntimeError(f"The calendar is finished, please reset it if you want to call it!")
self.trade_step = self.trade_step + 1
def get_freq(self):
return self.freq
def get_trade_len(self):
"""get the total step length"""
return self.trade_len
def get_trade_step(self):
return self.trade_step
def get_step_time(self, trade_step=None, shift=0):
"""
Get the left and right endpoints of the trade_step'th trading interval
About the endpoints:
- Qlib uses closed intervals in time-series data selection, which behaves the same as pandas.Series.loc
# - The returned right endpoint should be reduced by 1 second because of the closed interval representation in Qlib.
# Note: Qlib supports up to minutely decision execution, so 1 second is less than any trading time interval.
Parameters
----------
trade_step : int, optional
the number of trading step finished, by default None to indicate current step
shift : int, optional
shift bars, by default 0
Returns
-------
Tuple[pd.Timestamp, pd.Timestamp]
- If shift == 0, return the trading time range
- If shift > 0, return the trading time range of the earlier shift bars
- If shift < 0, return the trading time range of the later shift bars
"""
if trade_step is None:
trade_step = self.get_trade_step()
trade_step = trade_step - shift
calendar_index = self.start_index + trade_step
return self._calendar[calendar_index], epsilon_change(self._calendar[calendar_index + 1])
def get_data_cal_range(self, rtype: str = "full") -> Tuple[int, int]:
"""
get the calendar range
The following assumptions are made
1) The frequency of the exchange in common_infra is the same as the data calendar
2) Users want the **data index** mod by **day** (i.e. 240 min)
Parameters
----------
rtype: str
- "full": return the full limitation of the decision in the day
- "step": return the limitation of current step
Returns
-------
Tuple[int, int]:
"""
# potential performance issue
day_start = pd.Timestamp(self.start_time.date())
day_end = epsilon_change(day_start + pd.Timedelta(days=1))
freq = self.level_infra.get("common_infra").get("trade_exchange").freq
_, _, day_start_idx, _ = Cal.locate_index(day_start, day_end, freq=freq)
if rtype == "full":
_, _, start_idx, end_index = Cal.locate_index(self.start_time, self.end_time, freq=freq)
elif rtype == "step":
_, _, start_idx, end_index = Cal.locate_index(*self.get_step_time(), freq=freq)
else:
raise ValueError(f"This type of input {rtype} is not supported")
return start_idx - day_start_idx, end_index - day_start_idx
def get_all_time(self):
"""Get the start_time and end_time for trading"""
return self.start_time, self.end_time
# helper functions
def get_range_idx(self, start_time: pd.Timestamp, end_time: pd.Timestamp) -> Tuple[int, int]:
"""
get the range index which involves start_time~end_time (both sides are closed)
Parameters
----------
start_time : pd.Timestamp
end_time : pd.Timestamp
Returns
-------
Tuple[int, int]:
the index of the range. **the left and right are closed**
"""
left, right = (
bisect.bisect_right(self._calendar, start_time) - 1,
bisect.bisect_right(self._calendar, end_time) - 1,
)
left -= self.start_index
right -= self.start_index
def clip(idx):
return min(max(0, idx), self.trade_len - 1)
return clip(left), clip(right)
def __repr__(self) -> str:
return f"class: {self.__class__.__name__}; {self.start_time}[{self.start_index}]~{self.end_time}[{self.end_index}]: [{self.trade_step}/{self.trade_len}]"
class BaseInfrastructure:
def __init__(self, **kwargs):
self.reset_infra(**kwargs)
def get_support_infra(self):
raise NotImplementedError("`get_support_infra` is not implemented!")
def reset_infra(self, **kwargs):
support_infra = self.get_support_infra()
for k, v in kwargs.items():
if k in support_infra:
setattr(self, k, v)
else:
warnings.warn(f"{k} is ignored in `reset_infra`!")
def get(self, infra_name):
if hasattr(self, infra_name):
return getattr(self, infra_name)
else:
warnings.warn(f"infra {infra_name} is not found!")
def has(self, infra_name):
if infra_name in self.get_support_infra() and hasattr(self, infra_name):
return True
else:
return False
def update(self, other):
support_infra = other.get_support_infra()
infra_dict = {_infra: getattr(other, _infra) for _infra in support_infra if hasattr(other, _infra)}
self.reset_infra(**infra_dict)
class CommonInfrastructure(BaseInfrastructure):
def get_support_infra(self):
return ["trade_account", "trade_exchange"]
class LevelInfrastructure(BaseInfrastructure):
"""level infrastructure is created by the executor, and then shared with strategies on the same level"""
def get_support_infra(self):
"""
Descriptions about the infrastructure
sub_level_infra:
- **NOTE**: this will only work after _init_sub_trading !!!
"""
return ["trade_calendar", "sub_level_infra", "common_infra"]
def reset_cal(self, freq, start_time, end_time):
"""reset trade calendar manager"""
if self.has("trade_calendar"):
self.get("trade_calendar").reset(freq, start_time=start_time, end_time=end_time)
else:
self.reset_infra(
trade_calendar=TradeCalendarManager(freq, start_time=start_time, end_time=end_time, level_infra=self)
)
def set_sub_level_infra(self, sub_level_infra: LevelInfrastructure):
"""this will make the calendar access easier when working across multiple levels"""
self.reset_infra(sub_level_infra=sub_level_infra)
def get_start_end_idx(trade_calendar: TradeCalendarManager, outer_trade_decision: BaseTradeDecision) -> Tuple[int, int]:
"""
A helper function for getting the decision-level index range limitation for inner strategy
- NOTE: this function is not applicable to order-level
Parameters
----------
trade_calendar : TradeCalendarManager
outer_trade_decision : BaseTradeDecision
the trade decision made by outer strategy
Returns
-------
Tuple[int, int]:
start index and end index
"""
try:
return outer_trade_decision.get_range_limit(inner_calendar=trade_calendar)
except NotImplementedError:
return 0, trade_calendar.get_trade_len() - 1
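`TradeCalendarManager.get_range_idx` maps a closed time range onto step indices with `bisect.bisect_right` and then clips the result into `[0, trade_len - 1]`. The following is a minimal sketch of that lookup, assuming integer minute offsets instead of `pd.Timestamp` values. The free function `range_idx` and the sample calendar are hypothetical and exist only for illustration.

```python
import bisect
from typing import List, Tuple


def range_idx(
    calendar: List[int], start_index: int, trade_len: int, start_time: int, end_time: int
) -> Tuple[int, int]:
    """Return the closed step-index range covering start_time~end_time."""
    # bisect_right(...) - 1 is the index of the last bar that starts at or before the time.
    left = bisect.bisect_right(calendar, start_time) - 1
    right = bisect.bisect_right(calendar, end_time) - 1
    # Convert calendar indices into step indices relative to the trading start.
    left -= start_index
    right -= start_index

    def clip(idx: int) -> int:
        return min(max(0, idx), trade_len - 1)

    return clip(left), clip(right)


# Bars start at minutes 0, 10, ..., 50; trading covers the 4 bars starting at 10, 20, 30, 40.
calendar = [0, 10, 20, 30, 40, 50]
inside = range_idx(calendar, start_index=1, trade_len=4, start_time=12, end_time=35)
outside = range_idx(calendar, start_index=1, trade_len=4, start_time=0, end_time=100)
```

A query that extends beyond the trading range is clipped rather than rejected, which is why `outside` still yields a valid index pair.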


@@ -109,6 +109,8 @@ _default_config = {
"kernels": NUM_USABLE_CPU,
# How many tasks belong to one process. Recommend 1 for high-frequency data and None for daily data.
"maxtasksperchild": None,
# If joblib_backend is None, use loky
"joblib_backend": "multiprocessing",
"default_disk_cache": 1, # 0:skip/1:use
"mem_cache_size_limit": 500,
# memory cache expire second, only used in 'DatasetURICache' and 'client D.calendar'
@@ -165,6 +167,10 @@ _default_config = {
"task_url": "mongodb://localhost:27017/",
"task_db_name": "default_task_db",
},
# Shift minute for highfreq minute data, used in backtest
# if min_data_shift == 0, use default market time [9:30, 11:29, 1:00, 2:59]
# if min_data_shift != 0, use shifted market time [9:30, 11:29, 1:00, 2:59] - shift*minute
"min_data_shift": 0,
}
MODE_CONF = {
@@ -271,7 +277,7 @@ class QlibConfig(Config):
else:
return QlibConfig.LOCAL_URI
def get_data_path(self, freq: str = None) -> Path:
def get_data_uri(self, freq: str = None) -> Path:
if freq is None or freq not in self.provider_uri:
freq = QlibConfig.DEFAULT_FREQ
_provider_uri = self.provider_uri[freq]
@@ -328,11 +334,41 @@ class QlibConfig(Config):
if _mount_path[_freq] is None
else str(Path(_mount_path[_freq]).expanduser().resolve())
)
self["provider_uri"] = _provider_uri
self["mount_path"] = _mount_path
def set(self, default_conf="client", **kwargs):
def get_uri_type(self):
path = self["provider_uri"]
if isinstance(path, Path):
path = str(path)
is_win = re.match("^[a-zA-Z]:.*", path) is not None # such as 'C:\\data', 'D:'
is_nfs_or_win = (
re.match("^[^/]+:.+", path) is not None
) # such as 'host:/data/' (User may define short hostname by themselves or use localhost)
if is_nfs_or_win and not is_win:
return QlibConfig.NFS_URI
else:
return QlibConfig.LOCAL_URI
def set(self, default_conf: str = "client", **kwargs):
"""
configure qlib based on the input parameters
The configuration acts like a dictionary.
Normally, it literally replaces the values according to the keys.
However, sometimes it is hard for users to set the config when the configuration is nested and complicated
So this API provides some special parameters for users to set the keys in a more convenient way.
- region: REG_CN, REG_US
- several region-related configs will be changed
Parameters
----------
default_conf : str
the default config template chosen by user: "server", "client"
"""
from .utils import set_log_with_config, get_module_logger, can_use_cache
self.reset()
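The `get_uri_type` hunk above separates Windows drive paths from NFS-style `host:/path` URIs with two regular expressions. Below is a minimal standalone sketch of that classification using the same two patterns. The free function `uri_type` and the constant values `"local"` and `"nfs"` are assumptions for illustration and are not confirmed by this diff.

```python
import re

# Placeholder values; the real constants live on QlibConfig and are not shown in this diff.
LOCAL_URI = "local"
NFS_URI = "nfs"


def uri_type(path: str) -> str:
    """Classify a provider URI as local or NFS using the two patterns from the hunk."""
    # A single drive letter followed by a colon, such as 'C:\\data' or 'D:'.
    is_win = re.match("^[a-zA-Z]:.*", path) is not None
    # Anything of the form '<no slash>:<something>', such as 'host:/data/'.
    is_nfs_or_win = re.match("^[^/]+:.+", path) is not None
    if is_nfs_or_win and not is_win:
        return NFS_URI
    return LOCAL_URI


examples = {p: uri_type(p) for p in ["C:\\data", "D:", "host:/data/", "~/.qlib/qlib_data"]}
```

One consequence of this pattern order is that a single-letter host name such as `h:/data` is classified as a Windows path and therefore as local.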


@@ -1,324 +0,0 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from .order import Order
from .account import Account
from .position import Position
from .exchange import Exchange
from .report import Report
from .backtest import backtest as backtest_func, get_date_range
import numpy as np
import inspect
from ...utils import init_instance_by_config
from ...log import get_module_logger
from ...config import C
logger = get_module_logger("backtest caller")
def get_strategy(
strategy=None,
topk=50,
margin=0.5,
n_drop=5,
risk_degree=0.95,
str_type="dropout",
adjust_dates=None,
):
"""get_strategy
There will be 3 ways to return a strategy. Please follow the code.
Parameters
----------
strategy : Strategy()
strategy used in backtest.
topk : int (Default value: 50)
top-N stocks to buy.
margin : int or float(Default value: 0.5)
- if isinstance(margin, int):
sell_limit = margin
- else:
sell_limit = pred_in_a_day.count() * margin
buffer margin, in single score_mode, continue holding stock if it is in nlargest(sell_limit).
sell_limit should be no less than topk.
n_drop : int
number of stocks to be replaced in each trading date.
risk_degree: float
0-1, 0.95 for example, use 95% money to trade.
str_type: 'amount', 'weight' or 'dropout'
strategy type: TopkAmountStrategy, TopkWeightStrategy or TopkDropoutStrategy.
Returns
-------
:class: Strategy
an initialized strategy object
"""
# There will be 3 ways to return a strategy.
if strategy is None:
# 1) create strategy with param `strategy`
str_cls_dict = {
"amount": "TopkAmountStrategy",
"weight": "TopkWeightStrategy",
"dropout": "TopkDropoutStrategy",
}
logger.info("Create new strategy ")
from .. import strategy as strategy_pool
str_cls = getattr(strategy_pool, str_cls_dict.get(str_type))
strategy = str_cls(
topk=topk,
buffer_margin=margin,
n_drop=n_drop,
risk_degree=risk_degree,
adjust_dates=adjust_dates,
)
elif isinstance(strategy, (dict, str)):
# 2) create strategy with init_instance_by_config
logger.info("Create new strategy ")
strategy = init_instance_by_config(strategy)
from ..strategy.strategy import BaseStrategy
# else: nothing happens. 3) Use the strategy directly
if not isinstance(strategy, BaseStrategy):
raise TypeError("Strategy not supported")
return strategy
def get_exchange(
pred,
exchange=None,
subscribe_fields=[],
open_cost=0.0015,
close_cost=0.0025,
min_cost=5.0,
trade_unit=None,
limit_threshold=None,
deal_price=None,
extract_codes=False,
shift=1,
):
"""get_exchange
Parameters
----------
# exchange related arguments
exchange: Exchange().
subscribe_fields: list
subscribe fields.
open_cost : float
open transaction cost.
close_cost : float
close transaction cost.
min_cost : float
min transaction cost.
trade_unit : int
100 for China A.
deal_price: str
dealing price type: 'close', 'open', 'vwap'.
limit_threshold : float
limit move 0.1 (10%) for example, long and short with same limit.
extract_codes: bool
will we pass the codes extracted from the pred to the exchange.
NOTE: This will be faster with offline qlib.
Returns
-------
:class: Exchange
an initialized Exchange object
"""
if trade_unit is None:
trade_unit = C.trade_unit
if limit_threshold is None:
limit_threshold = C.limit_threshold
if deal_price is None:
deal_price = C.deal_price
if exchange is None:
logger.info("Create new exchange")
# handle exception for deal_price
if deal_price[0] != "$":
deal_price = "$" + deal_price
if extract_codes:
codes = sorted(pred.index.get_level_values("instrument").unique())
else:
codes = "all" # TODO: We must ensure that 'all.txt' includes all the stocks
dates = sorted(pred.index.get_level_values("datetime").unique())
dates = np.append(dates, get_date_range(dates[-1], left_shift=1, right_shift=shift))
exchange = Exchange(
trade_dates=dates,
codes=codes,
deal_price=deal_price,
subscribe_fields=subscribe_fields,
limit_threshold=limit_threshold,
open_cost=open_cost,
close_cost=close_cost,
min_cost=min_cost,
trade_unit=trade_unit,
)
return exchange
def get_executor(
executor=None,
trade_exchange=None,
verbose=True,
):
"""get_executor
There will be 3 ways to return an executor. Please follow the code.
Parameters
----------
executor : BaseExecutor
executor used in backtest.
trade_exchange : Exchange
exchange used in executor
verbose : bool
whether to print log.
Returns
-------
:class: BaseExecutor
an initialized BaseExecutor object
"""
# There will be 3 ways to return an executor.
if executor is None:
# 1) create executor with param `executor`
logger.info("Create new executor ")
from ..online.executor import SimulatorExecutor
executor = SimulatorExecutor(trade_exchange=trade_exchange, verbose=verbose)
elif isinstance(executor, (dict, str)):
# 2) create executor with config
logger.info("Create new executor ")
executor = init_instance_by_config(executor)
from ..online.executor import BaseExecutor
# 3) Use the executor directly
if not isinstance(executor, BaseExecutor):
raise TypeError("Executor not supported")
return executor
# This is the API for compatibility for legacy code
def backtest(pred, account=1e9, shift=1, benchmark="SH000905", verbose=True, return_order=False, **kwargs):
"""This function will help you set a reasonable Exchange and provide default value for strategy
Parameters
----------
- **backtest workflow related or common arguments**
pred : pandas.DataFrame
the prediction should have a <datetime, instrument> index and one `score` column.
account : float
init account value.
shift : int
whether to shift prediction by one day.
benchmark : str
benchmark code, default is SH000905 CSI 500.
verbose : bool
whether to print log.
return_order : bool
whether to return order list
- **strategy related arguments**
strategy : Strategy()
strategy used in backtest.
topk : int (Default value: 50)
top-N stocks to buy.
margin : int or float(Default value: 0.5)
- if isinstance(margin, int):
sell_limit = margin
- else:
sell_limit = pred_in_a_day.count() * margin
buffer margin, in single score_mode, continue holding stock if it is in nlargest(sell_limit).
sell_limit should be no less than topk.
n_drop : int
number of stocks to be replaced in each trading date.
risk_degree: float
0-1, 0.95 for example, use 95% money to trade.
str_type: 'amount', 'weight' or 'dropout'
strategy type: TopkAmountStrategy ,TopkWeightStrategy or TopkDropoutStrategy.
- **exchange related arguments**
exchange: Exchange()
pass the exchange for speeding up.
subscribe_fields: list
subscribe fields.
open_cost : float
open transaction cost. The default value is 0.0015 (0.15%).
close_cost : float
close transaction cost. The default value is 0.0025 (0.25%).
min_cost : float
min transaction cost.
trade_unit : int
100 for China A.
deal_price: str
dealing price type: 'close', 'open', 'vwap'.
limit_threshold : float
limit move 0.1 (10%) for example, long and short with same limit.
extract_codes: bool
will we pass the codes extracted from the pred to the exchange.
.. note:: This will be faster with offline qlib.
- **executor related arguments**
executor : BaseExecutor()
executor used in backtest.
verbose : bool
whether to print log.
"""
# check strategy:
spec = inspect.getfullargspec(get_strategy)
str_args = {k: v for k, v in kwargs.items() if k in spec.args}
strategy = get_strategy(**str_args)
# init exchange:
spec = inspect.getfullargspec(get_exchange)
ex_args = {k: v for k, v in kwargs.items() if k in spec.args}
trade_exchange = get_exchange(pred, **ex_args)
# init executor:
executor = get_executor(executor=kwargs.get("executor"), trade_exchange=trade_exchange, verbose=verbose)
# run backtest
report_dict = backtest_func(
pred=pred,
strategy=strategy,
executor=executor,
trade_exchange=trade_exchange,
shift=shift,
verbose=verbose,
account=account,
benchmark=benchmark,
return_order=return_order,
)
# for compatibility of the old API. return the dict positions
positions = report_dict.get("positions")
report_dict.update({"positions": {k: p.position for k, p in positions.items()}})
return report_dict
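The legacy `backtest` wrapper above routes one flat `**kwargs` dict to `get_strategy` and `get_exchange` by filtering on each function's argument names with `inspect.getfullargspec`. Here is a minimal sketch of that routing pattern. The helper `select_kwargs` and the simplified stand-in functions are hypothetical and only mirror a few of the original parameters.

```python
import inspect
from typing import Any, Callable, Dict


def get_strategy(strategy=None, topk=50, n_drop=5):
    """Simplified stand-in that only echoes its arguments."""
    return {"strategy": strategy, "topk": topk, "n_drop": n_drop}


def get_exchange(pred, exchange=None, open_cost=0.0015, close_cost=0.0025):
    """Simplified stand-in that only echoes its arguments."""
    return {"pred": pred, "exchange": exchange, "open_cost": open_cost, "close_cost": close_cost}


def select_kwargs(func: Callable, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the entries whose key is a positional-or-keyword argument of `func`."""
    spec = inspect.getfullargspec(func)
    return {k: v for k, v in kwargs.items() if k in spec.args}


kwargs = {"topk": 30, "open_cost": 0.001, "unknown_option": 1}
strategy = get_strategy(**select_kwargs(get_strategy, kwargs))
exchange = get_exchange("pred-placeholder", **select_kwargs(get_exchange, kwargs))
```

Keys that match neither function, such as `unknown_option` here, are dropped silently, so a misspelled argument name falls back to its default without any error. Keyword-only arguments would also be skipped, because `spec.args` does not include them.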


@@ -1,169 +0,0 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
import copy
from .position import Position
from .report import Report
from .order import Order
"""
rtn & earning in the Account
rtn:
from order's view
1.change if any order is executed, sell order or buy order
2.change at the end of today, (today_close - stock_price) * amount
earning
from value of current position
earning will be updated at the end of trade date
earning = today_value - pre_value
**does it consider cost?**
earning is the difference between two position values, so it considers cost; it is the true return
in the specific implementation of rtn, cost is not considered; in other words, rtn - cost = earning
"""
class Account:
def __init__(self, init_cash, last_trade_date=None):
self.init_vars(init_cash, last_trade_date)
def init_vars(self, init_cash, last_trade_date=None):
# init cash
self.init_cash = init_cash
self.current = Position(cash=init_cash)
self.positions = {}
self.rtn = 0
self.ct = 0
self.to = 0
self.val = 0
self.report = Report()
self.earning = 0
self.last_trade_date = last_trade_date
def get_positions(self):
return self.positions
def get_cash(self):
return self.current.position["cash"]
def update_state_from_order(self, order, trade_val, cost, trade_price):
# update turnover
self.to += trade_val
# update cost
self.ct += cost
# update return
# update self.rtn from order
trade_amount = trade_val / trade_price
if order.direction == Order.SELL: # 0 for sell
# when sell stock, get profit from price change
profit = trade_val - self.current.get_stock_price(order.stock_id) * trade_amount
self.rtn += profit # note here do not consider cost
elif order.direction == Order.BUY: # 1 for buy
# when buy stock, we get return for the rtn computing method
# profit in buy order is to make self.rtn is consistent with self.earning at the end of date
profit = self.current.get_stock_price(order.stock_id) * trade_amount - trade_val
self.rtn += profit
def update_order(self, order, trade_val, cost, trade_price):
# if stock is sold out, no stock price information in Position, then we should update account first, then update current position
# if stock is bought, there is no stock in current position, update current, then update account
# The cost will be subtracted from the cash at last. So the trading logic can ignore the cost calculation
trade_amount = trade_val / trade_price
if order.direction == Order.SELL:
# sell stock
self.update_state_from_order(order, trade_val, cost, trade_price)
# update current position
# because the order may sell all of stock_id
self.current.update_order(order, trade_val, cost, trade_price)
else:
# buy stock
# deal order, then update state
self.current.update_order(order, trade_val, cost, trade_price)
self.update_state_from_order(order, trade_val, cost, trade_price)
def update_daily_end(self, today, trader):
"""
today: pd.Timestamp
quote: pd.DataFrame (code, date), columns
at the end of the trade date:
- update rtn
- update price for each asset
- update value for this account
- update earning (2nd view of return )
- update holding day, count of stock
- update position history
- update report
:return: None
"""
# update price for stock in the position and the profit from changed_price
stock_list = self.current.get_stock_list()
profit = 0
for code in stock_list:
# if suspend, no new price to be updated, profit is 0
if trader.check_stock_suspended(code, today):
continue
today_close = trader.get_close(code, today)
profit += (today_close - self.current.position[code]["price"]) * self.current.position[code]["amount"]
self.current.update_stock_price(stock_id=code, price=today_close)
self.rtn += profit
# update holding day count
self.current.add_count_all()
# update value
self.val = self.current.calculate_value()
# update earning (2nd view of return)
# account_value - last_account_value
# for the first trade date, account_value - init_cash
# self.report.is_empty() to judge is_first_trade_date
# get last_account_value, today_account_value, today_stock_value
if self.report.is_empty():
last_account_value = self.init_cash
else:
last_account_value = self.report.get_latest_account_value()
today_account_value = self.current.calculate_value()
today_stock_value = self.current.calculate_stock_value()
self.earning = today_account_value - last_account_value
# update report for today
# judge whether the trading has begun.
# and don't add the init account state into the report, because we don't have excess return in those days.
self.report.update_report_record(
trade_date=today,
account_value=today_account_value,
cash=self.current.position["cash"],
return_rate=(self.earning + self.ct) / last_account_value,
# here use earning to calculate return, position's view, earning consider cost, true return
# in order to make same definition with original backtest in evaluate.py
turnover_rate=self.to / last_account_value,
cost_rate=self.ct / last_account_value,
stock_value=today_stock_value,
)
# set today_account_value to position
self.current.position["today_account_value"] = today_account_value
self.current.update_weight_all()
# update positions
# note use deepcopy
self.positions[today] = copy.deepcopy(self.current)
# finish today's update
# reset the daily variables
self.rtn = 0
self.ct = 0
self.to = 0
self.last_trade_date = today
def load_account(self, account_path):
report = Report()
position = Position()
last_trade_date = position.load_position(account_path / "position.xlsx")
report.load_report(account_path / "report.csv")
# assign values
self.init_vars(position.init_cash)
self.current = position
self.report = report
self.last_trade_date = last_trade_date if last_trade_date else None
def save_account(self, account_path):
self.current.save_position(account_path / "position.xlsx", self.last_trade_date)
self.report.save_report(account_path / "report.csv")
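The module docstring of this removed `Account` states that `rtn - cost = earning`, and `update_daily_end` reports `return_rate` as `(earning + ct) / last_account_value`. The numbers below are a hypothetical one-day, single-buy example that works through this arithmetic in plain Python. It assumes that the holding price right after a buy equals the deal price and that the cost is deducted from cash, as the comments in `update_order` describe.

```python
# Hypothetical one-day example in integer currency units.
init_cash = 1000
trade_price, amount, cost = 10, 10, 1
trade_val = trade_price * amount  # 100

# Buy order: cash pays for the shares and for the transaction cost.
cash = init_cash - trade_val - cost  # 899
# Assumption: the holding price equals the deal price, so the buy itself adds no rtn.
rtn = trade_price * amount - trade_val  # 0
ct = cost  # accumulated cost of the day
to = trade_val  # accumulated turnover of the day

# End of day: mark the holding to the close price.
today_close = 11
rtn += (today_close - trade_price) * amount  # 10, cost not considered
account_value = cash + today_close * amount  # 1009
earning = account_value - init_cash  # 9, cost considered

# Rates as reported by update_daily_end; on the first day the base is init_cash.
return_rate = (earning + ct) / init_cash
turnover_rate = to / init_cash
cost_rate = ct / init_cash
```

In this example `earning + ct` equals `rtn`, so the reported `return_rate` is the return before cost, while `cost_rate` is tracked separately.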


@@ -1,146 +0,0 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
import numpy as np
import pandas as pd
from ...utils import get_date_by_shift, get_date_range
from ...data import D
from .account import Account
from ...config import C
from ...log import get_module_logger
from ...data.dataset.utils import get_level_index
LOG = get_module_logger("backtest")
def backtest(pred, strategy, executor, trade_exchange, shift, verbose, account, benchmark, return_order):
"""
Parameters
----------
pred : pandas.DataFrame
the prediction should have a <datetime, instrument> index and one `score` column
Qlib wants to support multi-signal strategies in the future, so pd.Series is not used.
strategy : Strategy()
strategy part for backtest
trade_exchange : Exchange()
exchange for backtest
shift : int
whether to shift prediction by one day
verbose : bool
whether to print log
account : float
init account value
benchmark : str/list/pd.Series
`benchmark` is pd.Series, `index` is trading date; the value T is the change from T-1 to T.
example:
print(D.features(D.instruments('csi500'), ['$close/Ref($close, 1)-1'])['$close/Ref($close, 1)-1'].head())
2017-01-04 0.011693
2017-01-05 0.000721
2017-01-06 -0.004322
2017-01-09 0.006874
2017-01-10 -0.003350
`benchmark` is list, will use the daily average change of the stock pool in the list as the 'bench'.
`benchmark` is str, will use the daily change as the 'bench'.
benchmark code, default is SH000905 CSI500
"""
# Convert format if the input format is not expected
if get_level_index(pred, level="datetime") == 1:
pred = pred.swaplevel().sort_index()
if isinstance(pred, pd.Series):
pred = pred.to_frame("score")
trade_account = Account(init_cash=account)
_pred_dates = pred.index.get_level_values(level="datetime")
predict_dates = D.calendar(start_time=_pred_dates.min(), end_time=_pred_dates.max())
if isinstance(benchmark, pd.Series):
bench = benchmark
else:
_codes = benchmark if isinstance(benchmark, list) else [benchmark]
_temp_result = D.features(
_codes,
["$close/Ref($close,1)-1"],
predict_dates[0],
get_date_by_shift(predict_dates[-1], shift=shift),
disk_cache=1,
)
if len(_temp_result) == 0:
raise ValueError(f"The benchmark {_codes} does not exist. Please provide the right benchmark")
bench = _temp_result.groupby(level="datetime")[_temp_result.columns.tolist()[0]].mean()
trade_dates = np.append(predict_dates[shift:], get_date_range(predict_dates[-1], left_shift=1, right_shift=shift))
if return_order:
multi_order_list = []
# trading part
for pred_date, trade_date in zip(predict_dates, trade_dates):
# for loop predict date and trading date
# print
if verbose:
LOG.info("[I {:%Y-%m-%d}]: trade begin.".format(trade_date))
# 1. Load the score_series at pred_date
try:
score = pred.loc(axis=0)[pred_date, :] # (trade_date, stock_id) multi_index, score in pdate
score_series = score.reset_index(level="datetime", drop=True)[
"score"
] # pd.Series(index:stock_id, data: score)
except KeyError:
LOG.warning("No score found on predict date[{:%Y-%m-%d}]".format(pred_date))
score_series = None
if score_series is not None and score_series.count() > 0: # in case of the scores are all None
# 2. Update your strategy (and model)
strategy.update(score_series, pred_date, trade_date)
# 3. Generate order list
order_list = strategy.generate_order_list(
score_series=score_series,
current=trade_account.current,
trade_exchange=trade_exchange,
pred_date=pred_date,
trade_date=trade_date,
)
else:
order_list = []
if return_order:
multi_order_list.append((trade_account, order_list, trade_date))
# 4. Get result after executing order list
# NOTE: The following operation will modify order.amount.
# NOTE: If it is buy and the cash is insufficient, the tradable amount will be recalculated
trade_info = executor.execute(trade_account, order_list, trade_date)
# 5. Update account information according to transaction
update_account(trade_account, trade_info, trade_exchange, trade_date)
# generate backtest report
report_df = trade_account.report.generate_report_dataframe()
report_df["bench"] = bench
positions = trade_account.get_positions()
report_dict = {"report_df": report_df, "positions": positions}
if return_order:
report_dict.update({"order_list": multi_order_list})
return report_dict
def update_account(trade_account, trade_info, trade_exchange, trade_date):
"""
Update the account and strategy
Parameters
----------
trade_account : Account()
trade_info : list of [Order(), float, float, float]
(order, trade_val, trade_cost, trade_price), trade_info without factor
trade_exchange : Exchange()
used to get the $close_price at trade_date to update account
trade_date : pd.Timestamp
"""
# update account
for [order, trade_val, trade_cost, trade_price] in trade_info:
if order.deal_amount == 0:
continue
trade_account.update_order(order=order, trade_val=trade_val, cost=trade_cost, trade_price=trade_price)
# at the end of the trade date, update the account based on the $close_price of stocks.
trade_account.update_daily_end(today=trade_date, trader=trade_exchange)
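The removed `backtest` loop pairs every prediction date with the trading date that lies `shift` calendar steps later, by zipping `predict_dates` with `trade_dates`. Below is a minimal sketch of that pairing over a plain list of date strings. The function `pair_dates` and the sample calendar are assumptions for illustration and stand in for `D.calendar` and `get_date_range`.

```python
from typing import List, Tuple


def pair_dates(calendar: List[str], first: str, last: str, shift: int = 1) -> List[Tuple[str, str]]:
    """Pair each prediction date in [first, last] with the date `shift` steps later."""
    lo, hi = calendar.index(first), calendar.index(last)
    predict_dates = calendar[lo : hi + 1]
    # Same window moved forward by `shift` trading days.
    trade_dates = calendar[lo + shift : hi + 1 + shift]
    # zip stops at the shorter list if the calendar ends too early.
    return list(zip(predict_dates, trade_dates))


calendar = ["2017-01-04", "2017-01-05", "2017-01-06", "2017-01-09", "2017-01-10"]
pairs = pair_dates(calendar, "2017-01-04", "2017-01-06", shift=1)
```

With `shift=1`, the score computed on Friday 2017-01-06 is traded on Monday 2017-01-09, because the pairing follows the trading calendar rather than natural days.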


@@ -1,425 +0,0 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
import random
import logging
import numpy as np
import pandas as pd
from ...data import D
from .order import Order
from ...config import C, REG_CN
from ...log import get_module_logger
class Exchange:
def __init__(
self,
trade_dates=None,
codes="all",
deal_price=None,
subscribe_fields=[],
limit_threshold=None,
open_cost=0.0015,
close_cost=0.0025,
trade_unit=None,
min_cost=5,
extra_quote=None,
):
"""__init__
:param trade_dates: list of pd.Timestamp
:param codes: list stock_id list or a string of instruments(i.e. all, csi500, sse50)
:param deal_price: str, 'close', 'open', 'vwap'
:param subscribe_fields: list, subscribe fields
:param limit_threshold: float, 0.1 for example, default None
:param open_cost: cost rate for open, default 0.0015
:param close_cost: cost rate for close, default 0.0025
:param trade_unit: trade unit, 100 for China A market
:param min_cost: min cost, default 5
:param extra_quote: pandas, dataframe consists of
columns: like ['$vwap', '$close', '$factor', 'limit'].
The limit column flags whether the ETF is blocked from trading on a specific day (True means not tradable).
Necessary fields:
$close is for calculating the total value at end of each day.
Optional fields:
$vwap is only necessary when we use the $vwap price as the deal price
$factor is for rounding to the trading unit
limit will be set to False by default (False indicates we can buy this
target on this day).
index: MultiIndex(instrument, datetime)
"""
if trade_unit is None:
trade_unit = C.trade_unit
if limit_threshold is None:
limit_threshold = C.limit_threshold
if deal_price is None:
deal_price = C.deal_price
self.logger = get_module_logger("online operator", level=logging.INFO)
self.trade_unit = trade_unit
# TODO: the quote, trade_dates, codes are not necessary.
# It is just for performance consideration.
if limit_threshold is None:
if C.region == REG_CN:
self.logger.warning("limit_threshold not set. Stocks that hit the limit may be bought/sold")
elif abs(limit_threshold) > 0.1:
if C.region == REG_CN:
self.logger.warning("limit_threshold may not be set to a reasonable value")
if deal_price[0] != "$":
self.deal_price = "$" + deal_price
else:
self.deal_price = deal_price
if isinstance(codes, str):
codes = D.instruments(codes)
self.codes = codes
# Necessary fields
# $close is for calculating the total value at end of each day.
# $factor is for rounding to the trading unit
# $change is for calculating the limit of the stock
necessary_fields = {self.deal_price, "$close", "$change", "$factor"}
subscribe_fields = list(necessary_fields | set(subscribe_fields))
all_fields = list(necessary_fields | set(subscribe_fields))
self.all_fields = all_fields
self.open_cost = open_cost
self.close_cost = close_cost
self.min_cost = min_cost
self.limit_threshold = limit_threshold
# TODO: the quote, trade_dates, codes are not necessary.
# It is just for performance consideration.
if trade_dates is not None and len(trade_dates):
start_date, end_date = trade_dates[0], trade_dates[-1]
else:
self.logger.warning("trade_dates have not been assigned, all dates will be loaded")
start_date, end_date = None, None
self.extra_quote = extra_quote
self.set_quote(codes, start_date, end_date)
def set_quote(self, codes, start_date, end_date):
if len(codes) == 0:
codes = D.instruments()
self.quote = D.features(codes, self.all_fields, start_date, end_date, disk_cache=True).dropna(subset=["$close"])
self.quote.columns = self.all_fields
if self.quote[self.deal_price].isna().any():
self.logger.warning("{} field data contains nan.".format(self.deal_price))
if self.quote["$factor"].isna().any():
# The `factor.day.bin` file does not exist, or the `factor` field contains `nan`
# Use adjusted price
self.trade_w_adj_price = True
self.logger.warning("factor.day.bin file does not exist or factor contains `nan`. Order using adjusted_price.")
else:
# The `factor.day.bin` file exists and all data `close` and `factor` are not `nan`
# Use normal price
self.trade_w_adj_price = False
# update limit
# check limit_threshold
if self.limit_threshold is None:
self.quote["limit"] = False
else:
# set limit
self._update_limit(buy_limit=self.limit_threshold, sell_limit=self.limit_threshold)
quote_df = self.quote
if self.extra_quote is not None:
# process extra_quote
if "$close" not in self.extra_quote:
raise ValueError("$close is necessary in extra_quote")
if self.deal_price not in self.extra_quote.columns:
self.extra_quote[self.deal_price] = self.extra_quote["$close"]
self.logger.warning("No deal_price set for extra_quote. Use $close as deal_price.")
if "$factor" not in self.extra_quote.columns:
self.extra_quote["$factor"] = 1.0
self.logger.warning("No $factor set for extra_quote. Use 1.0 as $factor.")
if "limit" not in self.extra_quote.columns:
self.extra_quote["limit"] = False
self.logger.warning("No limit set for extra_quote. All stocks will be tradable.")
assert set(self.extra_quote.columns) == set(quote_df.columns) - {"$change"}
quote_df = pd.concat([quote_df, self.extra_quote], sort=False, axis=0)
# update quote: pd.DataFrame to dict, for search use
self.quote = quote_df.to_dict("index")
def _update_limit(self, buy_limit, sell_limit):
# pandas >= 1.3 expects a string for `inclusive`; the boolean form was removed in pandas 2.0
self.quote["limit"] = ~self.quote["$change"].between(-sell_limit, buy_limit, inclusive="neither")
def check_stock_limit(self, stock_id, trade_date):
"""Parameter
stock_id
trade_date
Return: whether the stock hits the price limit on trade_date
"""
return self.quote[(stock_id, trade_date)]["limit"]
def check_stock_suspended(self, stock_id, trade_date):
# is suspended
return (stock_id, trade_date) not in self.quote
def is_stock_tradable(self, stock_id, trade_date):
# check if stock can be traded
# same as check in check_order
if self.check_stock_suspended(stock_id, trade_date) or self.check_stock_limit(stock_id, trade_date):
return False
else:
return True
def check_order(self, order):
# check limit and suspended
if self.check_stock_suspended(order.stock_id, order.trade_date) or self.check_stock_limit(
order.stock_id, order.trade_date
):
return False
else:
return True
def deal_order(self, order, trade_account=None, position=None):
"""
Deal the order when the actual transaction happens
:param order: the order to be dealt.
:param trade_account: Trade account to be updated after dealing the order.
:param position: position to be updated after dealing the order.
:return: trade_val, trade_cost, trade_price
"""
# need to check order first
# TODO: check the order unit limit in the exchange!!!!
# The order limit is related to the adj factor and the cur_amount.
# factor = self.quote[(order.stock_id, order.trade_date)]['$factor']
# cur_amount = trade_account.current.get_stock_amount(order.stock_id)
if self.check_order(order) is False:
raise AttributeError("need to check order first")
if trade_account is not None and position is not None:
raise ValueError("trade_account and position can only choose one")
trade_price = self.get_deal_price(order.stock_id, order.trade_date)
trade_val, trade_cost = self._calc_trade_info_by_order(
order, trade_account.current if trade_account else position
)
# update account
if trade_val > 0:
# If the order can only be dealt with 0 trade_val, there is nothing to update.
# Otherwise, it would leave some stock with 0 amount in the position
if trade_account:
trade_account.update_order(order=order, trade_val=trade_val, cost=trade_cost, trade_price=trade_price)
elif position:
position.update_order(order=order, trade_val=trade_val, cost=trade_cost, trade_price=trade_price)
return trade_val, trade_cost, trade_price
def get_quote_info(self, stock_id, trade_date):
return self.quote[(stock_id, trade_date)]
def get_close(self, stock_id, trade_date):
return self.quote[(stock_id, trade_date)]["$close"]
def get_deal_price(self, stock_id, trade_date):
deal_price = self.quote[(stock_id, trade_date)][self.deal_price]
if np.isclose(deal_price, 0.0) or np.isnan(deal_price):
self.logger.warning(f"(stock_id:{stock_id}, trade_date:{trade_date}, {self.deal_price}): {deal_price}!!!")
self.logger.warning(f"setting deal_price to close price")
deal_price = self.get_close(stock_id, trade_date)
return deal_price
def get_factor(self, stock_id, trade_date):
return self.quote[(stock_id, trade_date)]["$factor"]
def generate_amount_position_from_weight_position(self, weight_position, cash, trade_date):
"""
Generate the target position according to the weight and the cash.
NOTE: All the cash will be assigned to the tradable stocks.
Parameter:
weight_position : dict {stock_id : weight}; allocate cash by weight_position
each weight must be in this range: 0 <= weight <= 1
cash : cash
trade_date : trade date
"""
# calculate the total weight of tradable value
tradable_weight = 0.0
for stock_id in weight_position:
if self.is_stock_tradable(stock_id=stock_id, trade_date=trade_date):
# each weight must be within [0, 1]
if weight_position[stock_id] < 0 or weight_position[stock_id] > 1:
raise ValueError(
"weight_position is {}, "
"weight_position is not in the range of [0, 1].".format(weight_position[stock_id])
)
tradable_weight += weight_position[stock_id]
if tradable_weight - 1.0 >= 1e-5:
raise ValueError("tradable_weight is {}, cannot be greater than 1.".format(tradable_weight))
amount_dict = {}
for stock_id in weight_position:
if weight_position[stock_id] > 0.0 and self.is_stock_tradable(stock_id=stock_id, trade_date=trade_date):
amount_dict[stock_id] = (
cash
* weight_position[stock_id]
/ tradable_weight
// self.get_deal_price(stock_id=stock_id, trade_date=trade_date)
)
return amount_dict
def get_real_deal_amount(self, current_amount, target_amount, factor):
"""
Calculate the real adjust deal amount when considering the trading unit
:param current_amount:
:param target_amount:
:param factor:
:return real_deal_amount; Positive deal_amount indicates buying more stock.
"""
if current_amount == target_amount:
return 0
elif current_amount < target_amount:
deal_amount = target_amount - current_amount
deal_amount = self.round_amount_by_trade_unit(deal_amount, factor)
return deal_amount
else:
if target_amount == 0:
return -current_amount
else:
deal_amount = current_amount - target_amount
deal_amount = self.round_amount_by_trade_unit(deal_amount, factor)
return -deal_amount
def generate_order_for_target_amount_position(self, target_position, current_position, trade_date):
"""Parameter:
target_position : dict { stock_id : amount }
current_position : dict { stock_id : amount }
trade_date : the date on which the orders are dealt
Amounts are rounded down to the trade unit:
for amount 321 and trade_unit 100, deal_amount is 300
"""
# split buy and sell for further use
buy_order_list = []
sell_order_list = []
# three parts: kept stock_id, dropped stock_id, new stock_id
# handle kept stock_id
# Set iteration order is not fixed, so the trading order of the stocks could differ between runs
# and make backtest results with identical parameters differ.
# We therefore sort stock_id first and then shuffle it with a fixed random seed,
# so the final stock_id order is deterministic.
sorted_ids = sorted(set(list(current_position.keys()) + list(target_position.keys())))
random.seed(0)
random.shuffle(sorted_ids)
for stock_id in sorted_ids:
# Do not generate order for the nontradable stocks
if not self.is_stock_tradable(stock_id=stock_id, trade_date=trade_date):
continue
target_amount = target_position.get(stock_id, 0)
current_amount = current_position.get(stock_id, 0)
factor = self.quote[(stock_id, trade_date)]["$factor"]
deal_amount = self.get_real_deal_amount(current_amount, target_amount, factor)
if deal_amount == 0:
continue
elif deal_amount > 0:
# buy stock
buy_order_list.append(
Order(
stock_id=stock_id,
amount=deal_amount,
direction=Order.BUY,
trade_date=trade_date,
factor=factor,
)
)
else:
# sell stock
sell_order_list.append(
Order(
stock_id=stock_id,
amount=abs(deal_amount),
direction=Order.SELL,
trade_date=trade_date,
factor=factor,
)
)
# return order_list : sell orders first, then buy orders
return sell_order_list + buy_order_list
def calculate_amount_position_value(self, amount_dict, trade_date, only_tradable=False):
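"""Parameter
amount_dict : {stock_id : amount}
trade_date : the date used to price each stock
only_tradable : currently unused; suspended or limit-hit stocks are always skipped
"""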
value = 0
for stock_id in amount_dict:
if (
self.check_stock_suspended(stock_id=stock_id, trade_date=trade_date) is False
and self.check_stock_limit(stock_id=stock_id, trade_date=trade_date) is False
):
value += self.get_deal_price(stock_id=stock_id, trade_date=trade_date) * amount_dict[stock_id]
return value
def round_amount_by_trade_unit(self, deal_amount, factor):
"""Parameter
deal_amount : float, adjusted amount
factor : float, adjusted factor
return : float, real amount
"""
if not self.trade_w_adj_price:
# the minimal amount is 1. Add 0.1 for solving precision problem.
return (deal_amount * factor + 0.1) // self.trade_unit * self.trade_unit / factor
return deal_amount
def _calc_trade_info_by_order(self, order, position):
"""
Calculation of trade info
:param order:
:param position: Position
:return: trade_val, trade_cost
"""
trade_price = self.get_deal_price(order.stock_id, order.trade_date)
if order.direction == Order.SELL:
# sell
if position is not None:
if np.isclose(order.amount, position.get_stock_amount(order.stock_id)):
# when selling the last of a stock, the amount doesn't need rounding
order.deal_amount = order.amount
else:
order.deal_amount = self.round_amount_by_trade_unit(order.amount, order.factor)
else:
# TODO: We don't know current position.
# We choose to sell all
order.deal_amount = order.amount
trade_val = order.deal_amount * trade_price
trade_cost = max(trade_val * self.close_cost, self.min_cost)
elif order.direction == Order.BUY:
# buy
if position is not None:
cash = position.get_cash()
trade_val = order.amount * trade_price
if cash < trade_val * (1 + self.open_cost):
# The money is not enough
order.deal_amount = self.round_amount_by_trade_unit(
cash / (1 + self.open_cost) / trade_price, order.factor
)
else:
# The money is enough
order.deal_amount = self.round_amount_by_trade_unit(order.amount, order.factor)
else:
# Unknown amount of money. Just round the amount
order.deal_amount = self.round_amount_by_trade_unit(order.amount, order.factor)
trade_val = order.deal_amount * trade_price
trade_cost = trade_val * self.open_cost
else:
raise NotImplementedError("order direction {} error".format(order.direction))
return trade_val, trade_cost
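The rounding and cost rules used by `round_amount_by_trade_unit` and `_calc_trade_info_by_order` can be tried in isolation. The sketch below is a simplified restatement, not the class itself: the function names `round_to_trade_unit`, `buy_amount_with_cash` and `sell_cost` are hypothetical, and the defaults (`trade_unit=100`, `open_cost=0.0015`, `close_cost=0.0025`, `min_cost=5`) are taken from the docstring and signature above. It assumes unadjusted prices are in use (`trade_w_adj_price` is False).

```python
def round_to_trade_unit(deal_amount, factor, trade_unit=100):
    # Convert the adjusted amount to real shares, floor to a multiple of
    # trade_unit (0.1 guards against float precision), convert back.
    return (deal_amount * factor + 0.1) // trade_unit * trade_unit / factor


def buy_amount_with_cash(order_amount, cash, price, factor, open_cost=0.0015, trade_unit=100):
    # If cash cannot cover value plus open cost, shrink the order to what
    # the cash can afford before rounding to the trade unit.
    if cash < order_amount * price * (1 + open_cost):
        order_amount = cash / (1 + open_cost) / price
    return round_to_trade_unit(order_amount, factor, trade_unit)


def sell_cost(trade_val, close_cost=0.0025, min_cost=5):
    # Sell-side cost is proportional, with a fixed floor.
    return max(trade_val * close_cost, min_cost)


rounded = round_to_trade_unit(321, factor=1.0)          # 321 shares -> 300
rounded_adj = round_to_trade_unit(700, factor=0.5)      # 350 real shares -> 300 real -> 600 adjusted
affordable = buy_amount_with_cash(1000, cash=5000, price=10, factor=1.0)
small_sale_cost = sell_cost(1000)                       # proportional cost 2.5 is below the floor
```

Note that in the code above only the sell branch applies `min_cost`; the buy branch charges `trade_val * open_cost` with no floor.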

Some files were not shown because too many files have changed in this diff