modify get_data method for CI

fix pip install CI
Update __init__.py
2026-06-29 09:01:18 +08:00 · 2023-07-05 13:59:38 +08:00 · 2023-06-26 09:19:40 +08:00 · 2023-06-26 00:00:46 +08:00 · 2023-06-25 23:48:37 +08:00 · 2023-06-25 23:39:11 +08:00
114 changed files with 3271 additions and 1024 deletions
--- a/.github/release-drafter.yml
+++ b/.github/release-drafter.yml
@@ -14,6 +14,9 @@ categories:
    label: 
      - 'doc'
      - 'documentation'
+  - title: '🧹 Maintenance'
+    label: 
+      - 'maintenance'
 change-template: '- $TITLE @$AUTHOR (#$NUMBER)'
 change-title-escapes: '\<*_&' # You can add # and @ to disable mentions, and add ` to disable code blocks.
 version-resolver:
@@ -30,4 +33,4 @@ version-resolver:
 template: |
  ## Changes

-  $CHANGES
+  $CHANGES
--- a/.github/workflows/test_qlib_from_pip.yml
+++ b/.github/workflows/test_qlib_from_pip.yml
@@ -13,16 +13,26 @@ jobs:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
-        os: [windows-latest, ubuntu-18.04, ubuntu-20.04, macos-11, macos-latest]
+        os: [windows-latest, ubuntu-20.04, ubuntu-22.04, macos-11, macos-latest]
        # not supporting 3.6 due to annotations is not supported https://stackoverflow.com/a/52890129
        python-version: [3.7, 3.8]

    steps:
    - name: Test qlib from pip
-      uses: actions/checkout@v2
+      uses: actions/checkout@v3
+
+    # The Python 3.7 release that CI installs on macOS is 3.7.17, and this version causes the "_bz not found error".
+    # So we pin Python 3.7 on macOS to a more specific version number.
+    # refs: https://github.com/actions/setup-python/issues/682
+    - name: Set up Python ${{ matrix.python-version }}
+      if: (matrix.os == 'macos-latest' && matrix.python-version == '3.7') || (matrix.os == 'macos-11' && matrix.python-version == '3.7')
+      uses: actions/setup-python@v4
+      with:
+        python-version: "3.7.16"

    - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v2
+      if: (matrix.os != 'macos-latest' || matrix.python-version != '3.7') && (matrix.os != 'macos-11' || matrix.python-version != '3.7')
+      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}

@@ -50,7 +60,7 @@ jobs:

    - name: Downloads dependencies data
      run: |
-        python scripts/get_data.py qlib_data --name qlib_data_simple --target_dir ~/.qlib/qlib_data/cn_data --interval 1d --region cn
+        python -m qlib.run.get_data qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn

    - name: Test workflow by config
      run: |
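Review note on the two "Set up Python" steps above: the second `if:` expression is meant to be the exact negation of the first, so that every matrix job runs exactly one of them. The sketch below is not part of the change; it mirrors the two expressions in plain Python and checks that property over the matrix. The function names are hypothetical, and the Python versions are modelled as strings for simplicity.

```python
from itertools import product

# Matrix values copied from the workflow above.
OSES = ["windows-latest", "ubuntu-20.04", "ubuntu-22.04", "macos-11", "macos-latest"]
PY_VERSIONS = ["3.7", "3.8"]


def pinned_step_runs(os_name, py):
    # Mirrors the `if:` of the step that pins Python 3.7.16 on macOS.
    return (os_name == "macos-latest" and py == "3.7") or (os_name == "macos-11" and py == "3.7")


def default_step_runs(os_name, py):
    # Mirrors the `if:` of the regular setup step (De Morgan negation of the above).
    return (os_name != "macos-latest" or py != "3.7") and (os_name != "macos-11" or py != "3.7")


def steps_per_job():
    # Map each matrix job to (pinned step runs?, default step runs?).
    return {
        (o, p): (pinned_step_runs(o, p), default_step_runs(o, p))
        for o, p in product(OSES, PY_VERSIONS)
    }


for job, (pinned, default) in steps_per_job().items():
    print(job, "pinned" if pinned else "default")
```

If a new macOS runner label is added to the matrix later, both expressions need the same extra clause, otherwise the new job silently falls through to the unpinned step.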
--- a/.github/workflows/test_qlib_from_source.yml
+++ b/.github/workflows/test_qlib_from_source.yml
@@ -14,22 +14,34 @@ jobs:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
-        os: [windows-latest, ubuntu-18.04, ubuntu-20.04, macos-11, macos-latest]
+        os: [windows-latest, ubuntu-20.04, ubuntu-22.04, macos-11, macos-latest]
        # not supporting 3.6 due to annotations is not supported https://stackoverflow.com/a/52890129
        python-version: [3.7, 3.8]

    steps:
    - name: Test qlib from source
-      uses: actions/checkout@v2
+      uses: actions/checkout@v3
+
+    # The Python 3.7 release that CI installs on macOS is 3.7.17, and this version causes the "_bz not found error".
+    # So we pin Python 3.7 on macOS to a more specific version number.
+    # refs: https://github.com/actions/setup-python/issues/682
+    - name: Set up Python ${{ matrix.python-version }}
+      if: (matrix.os == 'macos-latest' && matrix.python-version == '3.7') || (matrix.os == 'macos-11' && matrix.python-version == '3.7')
+      uses: actions/setup-python@v4
+      with:
+        python-version: "3.7.16"

    - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v2
+      if: (matrix.os != 'macos-latest' || matrix.python-version != '3.7') && (matrix.os != 'macos-11' || matrix.python-version != '3.7')
+      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}

    - name: Update pip to the latest version
+      # pip released version 23.1 on Apr. 15, 2023, and CI then failed to run. Please refer to #1495 for detailed logs.
+      # The pip version has been temporarily pinned to 23.0.
      run: |
-        python -m pip install --upgrade pip
+        python -m pip install pip==23.0

    - name: Installing pytorch for macos
      if: ${{ matrix.os == 'macos-11' || matrix.os == 'macos-latest' }}
@@ -37,15 +49,13 @@ jobs:
        python -m pip install torch torchvision torchaudio

    - name: Installing pytorch for ubuntu
-      if: ${{ matrix.os == 'ubuntu-18.04' || matrix.os == 'ubuntu-20.04' }}
+      if: ${{ matrix.os == 'ubuntu-20.04' || matrix.os == 'ubuntu-22.04' }}
      run: |
-        python -m pip install --upgrade pip
        python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu

    - name: Installing pytorch for windows
      if: ${{ matrix.os == 'windows-latest' }}
      run: |
-        python -m pip install --upgrade pip
        python -m pip install torch torchvision torchaudio

    - name: Set up Python tools
@@ -120,12 +130,16 @@ jobs:
      run: |
        mypy qlib --install-types --non-interactive || true
        mypy qlib --verbose
+    
+    - name: Check Qlib ipynb with nbqa
+      run: |
+        nbqa black . -l 120 --check --diff
+        nbqa pylint . --disable=C0104,C0114,C0115,C0116,C0301,C0302,C0411,C0413,C1802,R0401,R0801,R0902,R0903,R0911,R0912,R0913,R0914,R0915,R1720,W0105,W0123,W0201,W0511,W0613,W1113,W1514,E0401,E1121,C0103,C0209,R0402,R1705,R1710,R1725,R1735,W0102,W0212,W0221,W0223,W0231,W0237,W0612,W0621,W0622,W0703,W1309,E1102,E1136,W0719,W0104,W0404,C0412,W0611,C0410 --const-rgx='[a-z_][a-z0-9_]{2,30}$'

    - name: Test data downloads
      run: |
        python scripts/get_data.py qlib_data --name qlib_data_simple --target_dir ~/.qlib/qlib_data/cn_data --interval 1d --region cn
-        azcopy copy https://qlibpublic.blob.core.windows.net/data/rl /tmp/qlibpublic/data --recursive
-        mv /tmp/qlibpublic/data tests/.data
+        python scripts/get_data.py download_data --file_name rl_data.zip --target_dir tests/.data/rl

    - name: Install Lightgbm for MacOS
      if: ${{ matrix.os == 'macos-11' || matrix.os == 'macos-latest' }}
@@ -138,6 +152,12 @@ jobs:
        brew unlink libomp
        brew install libomp.rb

+    # Run after data downloads
+    - name: Check Qlib ipynb with nbconvert
+      run: |
+        # add more ipynb files in future
+        jupyter nbconvert --to notebook --execute examples/workflow_by_code.ipynb
+
    - name: Test workflow by config (install from source)
      run: |
        python -m pip install numba
--- a/.github/workflows/test_qlib_from_source_slow.yml
+++ b/.github/workflows/test_qlib_from_source_slow.yml
@@ -14,23 +14,34 @@ jobs:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
-        os: [windows-latest, ubuntu-18.04, ubuntu-20.04, macos-11, macos-latest]
+        os: [windows-latest, ubuntu-20.04, ubuntu-22.04, macos-11, macos-latest]
        # not supporting 3.6 due to annotations is not supported https://stackoverflow.com/a/52890129
        python-version: [3.7, 3.8]

    steps:
    - name: Test qlib from source slow
-      uses: actions/checkout@v2
+      uses: actions/checkout@v3
+
+    # The Python 3.7 release that CI installs on macOS is 3.7.17, and this version causes the "_bz not found error".
+    # So we pin Python 3.7 on macOS to a more specific version number.
+    # refs: https://github.com/actions/setup-python/issues/682
+    - name: Set up Python ${{ matrix.python-version }}
+      if: (matrix.os == 'macos-latest' && matrix.python-version == '3.7') || (matrix.os == 'macos-11' && matrix.python-version == '3.7')
+      uses: actions/setup-python@v4
+      with:
+        python-version: "3.7.16"

    - name: Set up Python ${{ matrix.python-version }}
-      uses: actions/setup-python@v2
+      if: (matrix.os != 'macos-latest' || matrix.python-version != '3.7') && (matrix.os != 'macos-11' || matrix.python-version != '3.7')
+      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}

    - name: Set up Python tools
+      # pip released version 23.1 on Apr. 15, 2023, and CI then failed to run. Please refer to #1495 for detailed logs.
+      # The pip version has been temporarily pinned to 23.0.
      run: |
-        python -m pip install --upgrade pip
-        # python -m pip is necessary to upgrade pip.
+        python -m pip install pip==23.0
        pip install --upgrade cython numpy
        pip install -e .[dev]

--- a/.gitignore
+++ b/.gitignore
@@ -10,7 +10,6 @@ _build
 build/
 dist/

-
 *.pkl
 *.hd5
 *.csv
@@ -27,6 +26,8 @@ examples/estimator/estimator_example/
 examples/rl/data/
 examples/rl/checkpoints/
 examples/rl/outputs/
+examples/rl_order_execution/data/
+examples/rl_order_execution/outputs/

 *.egg-info/

--- a/README.md
+++ b/README.md
@@ -11,6 +11,7 @@
 Recent released features
 | Feature | Status |
 | --                      | ------    |
+| KRNN and Sandwich models | :chart_with_upwards_trend: [Released](https://github.com/microsoft/qlib/pull/1414/) on May 26, 2023 |
 | Release Qlib v0.9.0 | :octocat: [Released](https://github.com/microsoft/qlib/releases/tag/v0.9.0) on Dec 9, 2022 |
 | RL Learning Framework | :hammer: :chart_with_upwards_trend: Released on Nov 10, 2022. [#1332](https://github.com/microsoft/qlib/pull/1332), [#1322](https://github.com/microsoft/qlib/pull/1322), [#1316](https://github.com/microsoft/qlib/pull/1316),[#1299](https://github.com/microsoft/qlib/pull/1299),[#1263](https://github.com/microsoft/qlib/pull/1263), [#1244](https://github.com/microsoft/qlib/pull/1244), [#1169](https://github.com/microsoft/qlib/pull/1169), [#1125](https://github.com/microsoft/qlib/pull/1125), [#1076](https://github.com/microsoft/qlib/pull/1076)|
 | HIST and IGMTF models | :chart_with_upwards_trend: [Released](https://github.com/microsoft/qlib/pull/1040) on Apr 10, 2022 |
@@ -42,13 +43,11 @@ Features released before 2021 are not listed here.
  <img src="http://fintech.msra.cn/images_v070/logo/1.png" />
 </p>

+Qlib is an open-source, AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas to implementing productions. Qlib supports diverse machine learning modeling paradigms, including supervised learning, market dynamics modeling, and reinforcement learning.

-Qlib is an AI-oriented quantitative investment platform, which aims to realize the potential, empower the research, and create the value of AI technologies in quantitative investment.
+An increasing number of SOTA Quant research works/papers in diverse paradigms are being released in Qlib to collaboratively solve key challenges in quantitative investment. For example, 1) using supervised learning to mine the market's complex non-linear patterns from rich and heterogeneous financial data, 2) modeling the dynamic nature of the financial market using adaptive concept drift technology, and 3) using reinforcement learning to model continuous investment decisions and assist investors in optimizing their trading strategies.

 It contains the full ML pipeline of data processing, model training, back-testing; and covers the entire chain of quantitative investment: alpha seeking, risk modeling, portfolio optimization, and order execution. 
-
-With Qlib, users can easily try ideas to create better Quant investment strategies.
-
 For more details, please refer to our paper ["Qlib: An AI-oriented Quantitative Investment Platform"](https://arxiv.org/abs/2009.11189).


@@ -355,6 +354,8 @@ Here is a list of models built on `Qlib`.
 - [ADD based on pytorch (Hongshun Tang, et al.2020)](examples/benchmarks/ADD/)
 - [IGMTF based on pytorch (Wentao Xu, et al.2021)](examples/benchmarks/IGMTF/)
 - [HIST based on pytorch (Wentao Xu, et al.2021)](examples/benchmarks/HIST/)
+- [KRNN based on pytorch](examples/benchmarks/KRNN/)
+- [Sandwich based on pytorch](examples/benchmarks/Sandwich/)

 Your PR of new Quant models is highly welcomed.

--- a/docs/component/data.rst
+++ b/docs/component/data.rst
@@ -119,7 +119,7 @@ Here are some example:
 for daily data:
  .. code-block:: bash

-    python scripts/get_data.py csv_data_cn --target_dir ~/.qlib/csv_data/cn_data
+    python scripts/get_data.py download_data --file_name csv_data_cn.zip --target_dir ~/.qlib/csv_data/cn_data

 for 1min data:
  .. code-block:: bash
--- a/examples/benchmarks/KRNN/README.md
+++ b/examples/benchmarks/KRNN/README.md
@@ -0,0 +1,8 @@
+# KRNN
+* Code: [https://github.com/microsoft/FOST/blob/main/fostool/model/krnn.py](https://github.com/microsoft/FOST/blob/main/fostool/model/krnn.py)
+
+
+# Introductions about the settings/configs.
+* Torch_geometric is used in the original model in FOST, but we didn't use it.
+* Make sure your CUDA version matches the torch version so that the GPU can be used. We use CUDA==10.2 and torch.__version__==1.12.1.
+
--- a/examples/benchmarks/KRNN/requirements.txt
+++ b/examples/benchmarks/KRNN/requirements.txt
@@ -0,0 +1,2 @@
+numpy==1.23.4
+pandas==1.5.2
--- a/examples/benchmarks/KRNN/workflow_config_krnn_Alpha360.yaml
+++ b/examples/benchmarks/KRNN/workflow_config_krnn_Alpha360.yaml
@@ -0,0 +1,91 @@
+qlib_init:
+    provider_uri: "~/.qlib/qlib_data/cn_data"
+    region: cn
+market: &market csi300
+benchmark: &benchmark SH000300
+data_handler_config: &data_handler_config
+    start_time: 2008-01-01
+    end_time: 2020-08-01
+    fit_start_time: 2008-01-01
+    fit_end_time: 2014-12-31
+    instruments: *market
+    infer_processors:
+        - class: RobustZScoreNorm
+          kwargs:
+              fields_group: feature
+              clip_outlier: true
+        - class: Fillna
+          kwargs:
+              fields_group: feature
+    learn_processors:
+        - class: DropnaLabel
+        - class: CSRankNorm
+          kwargs:
+              fields_group: label
+    label: ["Ref($close, -2) / Ref($close, -1) - 1"]
+port_analysis_config: &port_analysis_config
+    strategy:
+        class: TopkDropoutStrategy
+        module_path: qlib.contrib.strategy
+        kwargs:
+            signal:
+                - <MODEL> 
+                - <DATASET>
+            topk: 50
+            n_drop: 5
+    backtest:
+        start_time: 2017-01-01
+        end_time: 2020-08-01
+        account: 100000000
+        benchmark: *benchmark
+        exchange_kwargs:
+            limit_threshold: 0.095
+            deal_price: close
+            open_cost: 0.0005
+            close_cost: 0.0015
+            min_cost: 5
+task:
+    model:
+        class: KRNN
+        module_path: qlib.contrib.model.pytorch_krnn
+        kwargs:
+            fea_dim: 6
+            cnn_dim: 8
+            cnn_kernel_size: 3
+            rnn_dim: 8
+            rnn_dups: 2
+            rnn_layers: 2
+            n_epochs: 200
+            lr: 0.001
+            early_stop: 20
+            batch_size: 2000
+            metric: loss
+            GPU: 0
+    dataset:
+        class: DatasetH
+        module_path: qlib.data.dataset
+        kwargs:
+            handler:
+                class: Alpha360
+                module_path: qlib.contrib.data.handler
+                kwargs: *data_handler_config
+            segments:
+                train: [2008-01-01, 2014-12-31]
+                valid: [2015-01-01, 2016-12-31]
+                test: [2017-01-01, 2020-08-01]
+    record: 
+        - class: SignalRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            model: <MODEL>
+            dataset: <DATASET>
+        - class: SigAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            ana_long_short: False
+            ann_scaler: 252
+        - class: PortAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            config: *port_analysis_config
+
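Review note on the `label` expression in the config above: as far as I understand Qlib's expression operators, `Ref` with a negative offset refers to future values, so `Ref($close, -2) / Ref($close, -1) - 1` is the return from the close at t+1 to the close at t+2. The sketch below illustrates that reading on a plain list of closing prices. It does not use Qlib, and `forward_return_label` is a hypothetical helper name.

```python
def forward_return_label(close):
    """Return label[t] = close[t+2] / close[t+1] - 1 for each position t.

    Positions whose future prices are not available get None,
    analogous to the NaN labels that DropnaLabel removes.
    """
    n = len(close)
    return [close[t + 2] / close[t + 1] - 1 if t + 2 < n else None for t in range(n)]


# Toy closing prices for one instrument (illustrative values only).
closes = [10.0, 11.0, 12.1, 12.1]
labels = forward_return_label(closes)
print(labels)
```

Under this reading, the label for day t never uses the close of day t itself, which matches a setting where a signal computed after the close of day t can only be traded on day t+1.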
--- a/examples/benchmarks/LightGBM/multi_freq_handler.py
+++ b/examples/benchmarks/LightGBM/multi_freq_handler.py
@@ -29,13 +29,13 @@ class Avg15minHandler(DataHandlerLP):
        fit_end_time=None,
        process_type=DataHandlerLP.PTYPE_A,
        filter_pipe=None,
-        inst_processor=None,
+        inst_processors=None,
        **kwargs,
    ):
        infer_processors = check_transform_proc(infer_processors, fit_start_time, fit_end_time)
        learn_processors = check_transform_proc(learn_processors, fit_start_time, fit_end_time)
        data_loader = Avg15minLoader(
-            config=self.loader_config(), filter_pipe=filter_pipe, freq=freq, inst_processor=inst_processor
+            config=self.loader_config(), filter_pipe=filter_pipe, freq=freq, inst_processors=inst_processors
        )
        super().__init__(
            instruments=instruments,
--- a/examples/benchmarks/LightGBM/workflow_config_lightgbm_Alpha158_multi_freq.yaml
+++ b/examples/benchmarks/LightGBM/workflow_config_lightgbm_Alpha158_multi_freq.yaml
@@ -18,7 +18,7 @@ data_handler_config: &data_handler_config
        label: day
        feature: 1min
    # with label as reference
-    inst_processor:
+    inst_processors:
        feature:
            - class: Resample1minProcessor
              module_path: features_sample.py
--- a/examples/benchmarks/LightGBM/workflow_config_lightgbm_multi_freq.yaml
+++ b/examples/benchmarks/LightGBM/workflow_config_lightgbm_multi_freq.yaml
@@ -19,7 +19,7 @@ data_handler_config: &data_handler_config
        feature_15min: 1min
        feature_day: day
    # with label as reference
-    inst_processor:
+    inst_processors:
        feature_15min:
            - class: ResampleNProcessor
              module_path: features_resample_N.py
--- a/examples/benchmarks/MLP/workflow_config_mlp_Alpha158.yaml
+++ b/examples/benchmarks/MLP/workflow_config_mlp_Alpha158.yaml
@@ -64,8 +64,6 @@ task:
        kwargs:
            loss: mse
            lr: 0.002
-            lr_decay: 0.96
-            lr_decay_steps: 100
            optimizer: adam
            max_steps: 8000
            batch_size: 8192
--- a/examples/benchmarks/MLP/workflow_config_mlp_Alpha158_csi500.yaml
+++ b/examples/benchmarks/MLP/workflow_config_mlp_Alpha158_csi500.yaml
@@ -64,8 +64,6 @@ task:
        kwargs:
            loss: mse
            lr: 0.002
-            lr_decay: 0.96
-            lr_decay_steps: 100
            optimizer: adam
            max_steps: 8000
            batch_size: 8192
--- a/examples/benchmarks/MLP/workflow_config_mlp_Alpha360.yaml
+++ b/examples/benchmarks/MLP/workflow_config_mlp_Alpha360.yaml
@@ -52,8 +52,6 @@ task:
        kwargs:
            loss: mse
            lr: 0.002
-            lr_decay: 0.96
-            lr_decay_steps: 100
            optimizer: adam
            max_steps: 8000
            batch_size: 4096
--- a/examples/benchmarks/MLP/workflow_config_mlp_Alpha360_csi500.yaml
+++ b/examples/benchmarks/MLP/workflow_config_mlp_Alpha360_csi500.yaml
@@ -52,8 +52,6 @@ task:
        kwargs:
            loss: mse
            lr: 0.002
-            lr_decay: 0.96
-            lr_decay_steps: 100
            optimizer: adam
            max_steps: 8000
            batch_size: 4096
--- a/examples/benchmarks/README.md
+++ b/examples/benchmarks/README.md
@@ -26,7 +26,7 @@ The numbers shown below demonstrate the performance of the entire `workflow` of

 | Model Name                               | Dataset                             | IC          | ICIR        | Rank IC     | Rank ICIR   | Annualized Return | Information Ratio | Max Drawdown |
 |------------------------------------------|-------------------------------------|-------------|-------------|-------------|-------------|-------------------|-------------------|--------------|
-| TCN(Shaojie Bai, et al.)                 | Alpha158                            | 0.0275±0.00 | 0.2157±0.01 | 0.0411±0.00 | 0.3379±0.01 | 0.0190±0.02       | 0.2887±0.27       | -0.1202±0.03 |
+| TCN(Shaojie Bai, et al.)                 | Alpha158                            | 0.0279±0.00 | 0.2181±0.01 | 0.0421±0.00 | 0.3429±0.01 | 0.0262±0.02       | 0.4133±0.25       | -0.1090±0.03 |
 | TabNet(Sercan O. Arik, et al.)           | Alpha158                            | 0.0204±0.01 | 0.1554±0.07 | 0.0333±0.00 | 0.2552±0.05 | 0.0227±0.04       | 0.3676±0.54       | -0.1089±0.08 |
 | Transformer(Ashish Vaswani, et al.)      | Alpha158                            | 0.0264±0.00 | 0.2053±0.02 | 0.0407±0.00 | 0.3273±0.02 | 0.0273±0.02       | 0.3970±0.26       | -0.1101±0.02 |
 | GRU(Kyunghyun Cho, et al.)               | Alpha158(with selected 20 features) | 0.0315±0.00 | 0.2450±0.04 | 0.0428±0.00 | 0.3440±0.03 | 0.0344±0.02       | 0.5160±0.25       | -0.1017±0.02 |
@@ -68,6 +68,8 @@ The numbers shown below demonstrate the performance of the entire `workflow` of
 | TRA(Hengxu Lin, et al.)                   | Alpha360 | 0.0485±0.00 | 0.3787±0.03 | 0.0587±0.00 | 0.4756±0.03 | 0.0920±0.03       | 1.2789±0.42       | -0.0834±0.02 |
 | IGMTF(Wentao Xu, et al.)                  | Alpha360 | 0.0480±0.00 | 0.3589±0.02 | 0.0606±0.00 | 0.4773±0.01 | 0.0946±0.02       | 1.3509±0.25       | -0.0716±0.02 |
 | HIST(Wentao Xu, et al.)                   | Alpha360 | 0.0522±0.00 | 0.3530±0.01 | 0.0667±0.00 | 0.4576±0.01 | 0.0987±0.02       | 1.3726±0.27       | -0.0681±0.01 |
+| KRNN                                      | Alpha360 | 0.0173±0.01 | 0.1210±0.06 | 0.0270±0.01 | 0.2018±0.04 | -0.0465±0.05      | -0.5415±0.62      | -0.2919±0.13 |
+| Sandwich                                  | Alpha360 | 0.0258±0.00 | 0.1924±0.04 | 0.0337±0.00 | 0.2624±0.03 | 0.0005±0.03       | 0.0001±0.33       | -0.1752±0.05 |


 - The selected 20 features are based on the feature importance of a lightgbm-based model.
@@ -134,7 +136,7 @@ If you want to contribute your new models, you can follow the steps below.
    - `README.md`: a brief introduction to your models
    - `workflow_config_<model name>_<dataset>.yaml`: a configuration which can read by `qrun`. You are encouraged to run your model in all datasets.
 3. You can integrate your model as a module [in this folder](https://github.com/microsoft/qlib/tree/main/qlib/contrib/model).
-4. Please updated your results in the benchmark tables, e.g. [Alpha360](#alpha158-dataset), [Alpha158](#alpha158-dataset)(the values of each metric are the mean and std calculated based on 20 runs with different random seeds, if you don't have enough computational resource, you can ask for help in the PR).
+4. Please update your results in the above **Benchmark Tables**, e.g. [Alpha360](#alpha360-dataset), [Alpha158](#alpha158-dataset) (the values of each metric are the mean and std calculated based on **20 Runs** with different random seeds. You can accomplish the above operations through the automated [script](https://github.com/microsoft/qlib/blob/main/examples/run_all_model.py#LL286C22-L286C22) provided by Qlib, and get the final result in the .md file. If you don't have enough computational resources, you can ask for help in the PR).
 5. Update the info in the index page in the [news list](https://github.com/microsoft/qlib#newspaper-whats-new----sparkling_heart) and [model list](https://github.com/microsoft/qlib#quant-model-paper-zoo).

 Finally, you can send PR for review. ([here is an example](https://github.com/microsoft/qlib/pull/1040))
--- a/examples/benchmarks/Sandwich/README.md
+++ b/examples/benchmarks/Sandwich/README.md
@@ -0,0 +1,8 @@
+# Sandwich
+* Code: [https://github.com/microsoft/FOST/blob/main/fostool/model/sandwich.py](https://github.com/microsoft/FOST/blob/main/fostool/model/sandwich.py)
+
+
+# Introductions about the settings/configs.
+* Torch_geometric is used in the original model in FOST, but we didn't use it.
+* Make sure your CUDA version matches the torch version so that the GPU can be used. We use CUDA==10.2 and torch.__version__==1.12.1.
+
--- a/examples/benchmarks/Sandwich/requirements.txt
+++ b/examples/benchmarks/Sandwich/requirements.txt
@@ -0,0 +1,2 @@
+numpy==1.23.4
+pandas==1.5.2
--- a/examples/benchmarks/Sandwich/workflow_config_sandwich_Alpha360.yaml
+++ b/examples/benchmarks/Sandwich/workflow_config_sandwich_Alpha360.yaml
@@ -0,0 +1,93 @@
+qlib_init:
+    provider_uri: "~/.qlib/qlib_data/cn_data"
+    region: cn
+market: &market csi300
+benchmark: &benchmark SH000300
+data_handler_config: &data_handler_config
+    start_time: 2008-01-01
+    end_time: 2020-08-01
+    fit_start_time: 2008-01-01
+    fit_end_time: 2014-12-31
+    instruments: *market
+    infer_processors:
+        - class: RobustZScoreNorm
+          kwargs:
+              fields_group: feature
+              clip_outlier: true
+        - class: Fillna
+          kwargs:
+              fields_group: feature
+    learn_processors:
+        - class: DropnaLabel
+        - class: CSRankNorm
+          kwargs:
+              fields_group: label
+    label: ["Ref($close, -2) / Ref($close, -1) - 1"]
+port_analysis_config: &port_analysis_config
+    strategy:
+        class: TopkDropoutStrategy
+        module_path: qlib.contrib.strategy
+        kwargs:
+            signal:
+                - <MODEL> 
+                - <DATASET>
+            topk: 50
+            n_drop: 5
+    backtest:
+        start_time: 2017-01-01
+        end_time: 2020-08-01
+        account: 100000000
+        benchmark: *benchmark
+        exchange_kwargs:
+            limit_threshold: 0.095
+            deal_price: close
+            open_cost: 0.0005
+            close_cost: 0.0015
+            min_cost: 5
+task:
+    model:
+        class: Sandwich
+        module_path: qlib.contrib.model.pytorch_sandwich
+        kwargs:
+            fea_dim: 6
+            cnn_dim_1: 16
+            cnn_dim_2: 16
+            cnn_kernel_size: 3
+            rnn_dim_1: 8
+            rnn_dim_2: 8
+            rnn_dups: 2
+            rnn_layers: 2
+            n_epochs: 200
+            lr: 0.001
+            early_stop: 20
+            batch_size: 2000
+            metric: loss
+            GPU: 0
+    dataset:
+        class: DatasetH
+        module_path: qlib.data.dataset
+        kwargs:
+            handler:
+                class: Alpha360
+                module_path: qlib.contrib.data.handler
+                kwargs: *data_handler_config
+            segments:
+                train: [2008-01-01, 2014-12-31]
+                valid: [2015-01-01, 2016-12-31]
+                test: [2017-01-01, 2020-08-01]
+    record: 
+        - class: SignalRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            model: <MODEL>
+            dataset: <DATASET>
+        - class: SigAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            ana_long_short: False
+            ann_scaler: 252
+        - class: PortAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            config: *port_analysis_config
+
--- a/examples/benchmarks/TRA/Reports.ipynb
+++ b/examples/benchmarks/TRA/Reports.ipynb
@@ -25,59 +25,65 @@
    "import seaborn as sns\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib\n",
-    "sns.set(style='white')\n",
-    "matplotlib.rcParams['pdf.fonttype'] = 42\n",
-    "matplotlib.rcParams['ps.fonttype'] = 42\n",
+    "\n",
+    "sns.set(style=\"white\")\n",
+    "matplotlib.rcParams[\"pdf.fonttype\"] = 42\n",
+    "matplotlib.rcParams[\"ps.fonttype\"] = 42\n",
    "\n",
    "from tqdm.auto import tqdm\n",
    "from joblib import Parallel, delayed\n",
    "\n",
+    "\n",
    "def func(x, N=80):\n",
    "    ret = x.ret.copy()\n",
    "    x = x.rank(pct=True)\n",
-    "    x['ret'] = ret\n",
+    "    x[\"ret\"] = ret\n",
    "    diff = x.score.sub(x.label)\n",
-    "    r = x.nlargest(N, columns='score').ret.mean()\n",
-    "    r -= x.nsmallest(N, columns='score').ret.mean()\n",
-    "    return pd.Series({\n",
-    "        'MSE': diff.pow(2).mean(), \n",
-    "        'MAE': diff.abs().mean(), \n",
-    "        'IC': x.score.corr(x.label),\n",
-    "        'R': r\n",
-    "    })\n",
-    "    \n",
+    "    r = x.nlargest(N, columns=\"score\").ret.mean()\n",
+    "    r -= x.nsmallest(N, columns=\"score\").ret.mean()\n",
+    "    return pd.Series(\n",
+    "        {\n",
+    "            \"MSE\": diff.pow(2).mean(),\n",
+    "            \"MAE\": diff.abs().mean(),\n",
+    "            \"IC\": x.score.corr(x.label),\n",
+    "            \"R\": r,\n",
+    "        }\n",
+    "    )\n",
+    "\n",
+    "\n",
    "ret = pd.read_pickle(\"data/ret.pkl\").clip(-0.1, 0.1)\n",
+    "\n",
+    "\n",
    "def backtest(fname, **kwargs):\n",
-    "    pred = pd.read_pickle(fname).loc['2018-09-21':'2020-06-30']  # test period\n",
-    "    pred['ret'] = ret\n",
+    "    pred = pd.read_pickle(fname).loc[\"2018-09-21\":\"2020-06-30\"]  # test period\n",
+    "    pred[\"ret\"] = ret\n",
    "    dates = pred.index.unique(level=0)\n",
    "    res = Parallel(n_jobs=-1)(delayed(func)(pred.loc[d], **kwargs) for d in dates)\n",
-    "    res = {\n",
-    "       dates[i]: res[i]\n",
-    "       for i in range(len(dates))\n",
-    "    }\n",
+    "    res = {dates[i]: res[i] for i in range(len(dates))}\n",
    "    res = pd.DataFrame(res).T\n",
-    "    r = res['R'].copy()\n",
+    "    r = res[\"R\"].copy()\n",
    "    r.index = pd.to_datetime(r.index)\n",
    "    r = r.reindex(pd.date_range(r.index[0], r.index[-1])).fillna(0)  # paper use 365 days\n",
    "    return {\n",
-    "        'MSE': res['MSE'].mean(),\n",
-    "        'MAE': res['MAE'].mean(),\n",
-    "        'IC': res['IC'].mean(),\n",
-    "        'ICIR': res['IC'].mean()/res['IC'].std(),\n",
-    "        'AR': r.mean()*365,\n",
-    "        'AV': r.std()*365**0.5,\n",
-    "        'SR': r.mean()/r.std()*365**0.5,\n",
-    "        'MDD': (r.cumsum().cummax() - r.cumsum()).max()\n",
+    "        \"MSE\": res[\"MSE\"].mean(),\n",
+    "        \"MAE\": res[\"MAE\"].mean(),\n",
+    "        \"IC\": res[\"IC\"].mean(),\n",
+    "        \"ICIR\": res[\"IC\"].mean() / res[\"IC\"].std(),\n",
+    "        \"AR\": r.mean() * 365,\n",
+    "        \"AV\": r.std() * 365**0.5,\n",
+    "        \"SR\": r.mean() / r.std() * 365**0.5,\n",
+    "        \"MDD\": (r.cumsum().cummax() - r.cumsum()).max(),\n",
    "    }, r\n",
    "\n",
+    "\n",
    "def fmt(x, p=3, scale=1, std=False):\n",
-    "    _fmt = '{:.%df}'%p\n",
+    "    _fmt = \"{:.%df}\" % p\n",
    "    string = _fmt.format((x.mean() if not isinstance(x, (float, np.floating)) else x) * scale)\n",
    "    if std and len(x) > 1:\n",
-    "        string += ' ('+_fmt.format(x.std()*scale)+')'\n",
+    "        string += \" (\" + _fmt.format(x.std() * scale) + \")\"\n",
    "    return string\n",
    "\n",
+    "\n",
    "def backtest_multi(files, **kwargs):\n",
    "    res = []\n",
    "    pnl = []\n",
@@ -88,14 +94,14 @@
    "    res = pd.DataFrame(res)\n",
    "    pnl = pd.concat(pnl, axis=1)\n",
    "    return {\n",
-    "        'MSE': fmt(res['MSE'], std=True),\n",
-    "        'MAE': fmt(res['MAE'], std=True),\n",
-    "        'IC': fmt(res['IC']),\n",
-    "        'ICIR': fmt(res['ICIR']),\n",
-    "        'AR': fmt(res['AR'], scale=100, p=1)+'%',\n",
-    "        'VR': fmt(res['AV'], scale=100, p=1)+'%',\n",
-    "        'SR': fmt(res['SR']),\n",
-    "        'MDD': fmt(res['MDD'], scale=100, p=1)+'%'\n",
+    "        \"MSE\": fmt(res[\"MSE\"], std=True),\n",
+    "        \"MAE\": fmt(res[\"MAE\"], std=True),\n",
+    "        \"IC\": fmt(res[\"IC\"]),\n",
+    "        \"ICIR\": fmt(res[\"ICIR\"]),\n",
+    "        \"AR\": fmt(res[\"AR\"], scale=100, p=1) + \"%\",\n",
+    "        \"VR\": fmt(res[\"AV\"], scale=100, p=1) + \"%\",\n",
+    "        \"SR\": fmt(res[\"SR\"]),\n",
+    "        \"MDD\": fmt(res[\"MDD\"], scale=100, p=1) + \"%\",\n",
    "    }, pnl"
   ]
  },
@@ -124,16 +130,20 @@
   "outputs": [],
   "source": [
    "exps = {\n",
-    "    'Linear': ['output/Linear/pred.pkl'],\n",
-    "    'LightGBM': ['output/GBDT/lr0.05_leaves128/pred.pkl'],\n",
-    "    'MLP': glob.glob('output/search/MLP/hs128_bs512_do0.3_lr0.001_seed*/pred.pkl'),\n",
-    "    'SFM': glob.glob('output/search/SFM/hs32_bs512_do0.5_lr0.001_seed*/pred.pkl'),\n",
-    "    'ALSTM': glob.glob('output/search/LSTM_Attn/hs256_bs1024_do0.1_lr0.0002_seed*/pred.pkl'),\n",
-    "    'Trans.': glob.glob('output/search/Transformer/head4_hs64_bs1024_do0.1_lr0.0002_seed*/pred.pkl'),\n",
-    "    'ALSTM+TS':glob.glob('output/LSTM_Attn_TS/hs256_bs1024_do0.1_lr0.0002_seed*/pred.pkl'),\n",
-    "    'Trans.+TS':glob.glob('output/Transformer_TS/head4_hs64_bs1024_do0.1_lr0.0002_seed*/pred.pkl'),\n",
-    "    'ALSTM+TRA(Ours)': glob.glob('output/search/finetune/LSTM_Attn_tra/K10_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl'),\n",
-    "    'Trans.+TRA(Ours)': glob.glob('output/search/finetune/Transformer_tra/K3_traHs16_traSrcLR_TPE_traLamb1.0_head4_hs64_bs512_do0.1_lr0.0005_seed*/pred.pkl')\n",
+    "    \"Linear\": [\"output/Linear/pred.pkl\"],\n",
+    "    \"LightGBM\": [\"output/GBDT/lr0.05_leaves128/pred.pkl\"],\n",
+    "    \"MLP\": glob.glob(\"output/search/MLP/hs128_bs512_do0.3_lr0.001_seed*/pred.pkl\"),\n",
+    "    \"SFM\": glob.glob(\"output/search/SFM/hs32_bs512_do0.5_lr0.001_seed*/pred.pkl\"),\n",
+    "    \"ALSTM\": glob.glob(\"output/search/LSTM_Attn/hs256_bs1024_do0.1_lr0.0002_seed*/pred.pkl\"),\n",
+    "    \"Trans.\": glob.glob(\"output/search/Transformer/head4_hs64_bs1024_do0.1_lr0.0002_seed*/pred.pkl\"),\n",
+    "    \"ALSTM+TS\": glob.glob(\"output/LSTM_Attn_TS/hs256_bs1024_do0.1_lr0.0002_seed*/pred.pkl\"),\n",
+    "    \"Trans.+TS\": glob.glob(\"output/Transformer_TS/head4_hs64_bs1024_do0.1_lr0.0002_seed*/pred.pkl\"),\n",
+    "    \"ALSTM+TRA(Ours)\": glob.glob(\n",
+    "        \"output/search/finetune/LSTM_Attn_tra/K10_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl\"\n",
+    "    ),\n",
+    "    \"Trans.+TRA(Ours)\": glob.glob(\n",
+    "        \"output/search/finetune/Transformer_tra/K3_traHs16_traSrcLR_TPE_traLamb1.0_head4_hs64_bs512_do0.1_lr0.0005_seed*/pred.pkl\"\n",
+    "    ),\n",
    "}"
   ]
  },
@@ -160,14 +170,8 @@
    }
   ],
   "source": [
-    "res = {\n",
-    "    name: backtest_multi(exps[name])\n",
-    "    for name in tqdm(exps)\n",
-    "}\n",
-    "report = pd.DataFrame({\n",
-    "    k: v[0]\n",
-    "    for k, v in res.items()\n",
-    "}).T"
+    "res = {name: backtest_multi(exps[name]) for name in tqdm(exps)}\n",
+    "report = pd.DataFrame({k: v[0] for k, v in res.items()}).T"
   ]
  },
  {
@@ -385,24 +389,40 @@
    }
   ],
   "source": [
-    "df = pd.read_pickle('output/search/finetune/Transformer_tra/K3_traHs16_traSrcLR_TPE_traLamb0.0_head4_hs64_bs512_do0.1_lr0.0005_seed1000/pred.pkl')\n",
-    "code = 'SH600157'\n",
-    "date = '2018-09-28'\n",
+    "df = pd.read_pickle(\n",
+    "    \"output/search/finetune/Transformer_tra/K3_traHs16_traSrcLR_TPE_traLamb0.0_head4_hs64_bs512_do0.1_lr0.0005_seed1000/pred.pkl\"\n",
+    ")\n",
+    "code = \"SH600157\"\n",
+    "date = \"2018-09-28\"\n",
    "lookbackperiod = 50\n",
    "\n",
    "prob = df.iloc[:, -3:].loc(axis=0)[:, code].reset_index(level=1, drop=True).loc[date:].iloc[:lookbackperiod]\n",
-    "pred = df.loc[:,[\"score_0\",\"score_1\",\"score_2\",\"label\"]].loc(axis=0)[:, code].reset_index(level=1, drop=True).loc[date:].iloc[:lookbackperiod]\n",
-    "e_all = pred.iloc[:,:-1].sub(pred.iloc[:,-1], axis=0).pow(2)\n",
+    "pred = (\n",
+    "    df.loc[:, [\"score_0\", \"score_1\", \"score_2\", \"label\"]]\n",
+    "    .loc(axis=0)[:, code]\n",
+    "    .reset_index(level=1, drop=True)\n",
+    "    .loc[date:]\n",
+    "    .iloc[:lookbackperiod]\n",
+    ")\n",
+    "e_all = pred.iloc[:, :-1].sub(pred.iloc[:, -1], axis=0).pow(2)\n",
    "e_all = e_all.sub(e_all.min(axis=1), axis=0)\n",
-    "e_all.columns = [r'$\\theta_%d$'%d for d in range(1, 4)]\n",
+    "e_all.columns = [r\"$\\theta_%d$\" % d for d in range(1, 4)]\n",
    "prob = pd.Series(np.argmax(prob.values, axis=1), index=prob.index).rolling(7).mean().round()\n",
    "\n",
    "fig, axes = plt.subplots(1, 2, figsize=(7, 3))\n",
-    "e_all.plot(ax=axes[0], xlabel='', rot=30)\n",
-    "prob.plot(ax=axes[1], xlabel='', rot=30, color='red', linestyle='None', marker='^', markersize=5)\n",
+    "e_all.plot(ax=axes[0], xlabel=\"\", rot=30)\n",
+    "prob.plot(\n",
+    "    ax=axes[1],\n",
+    "    xlabel=\"\",\n",
+    "    rot=30,\n",
+    "    color=\"red\",\n",
+    "    linestyle=\"None\",\n",
+    "    marker=\"^\",\n",
+    "    markersize=5,\n",
+    ")\n",
    "plt.yticks(np.array([0, 1, 2]), e_all.columns.values)\n",
-    "axes[0].set_ylabel('Predictor Loss')\n",
-    "axes[1].set_ylabel('Router Selection')\n",
+    "axes[0].set_ylabel(\"Predictor Loss\")\n",
+    "axes[1].set_ylabel(\"Router Selection\")\n",
    "plt.tight_layout()\n",
    "# plt.savefig('select.pdf', bbox_inches='tight')\n",
    "plt.show()"
@@ -428,10 +448,18 @@
   "outputs": [],
   "source": [
    "exps = {\n",
-    "    'Random': glob.glob('output/search/LSTM_Attn_tra/K10_traHs16_traSrcNONE_traLamb1.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl'),\n",
-    "    'LR': glob.glob('output/search/LSTM_Attn_tra/K10_traHs16_traSrcLR_traLamb1.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl'),\n",
-    "    'TPE': glob.glob('output/search/LSTM_Attn_tra/K10_traHs16_traSrcTPE_traLamb1.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl'),\n",
-    "    'LR+TPE': glob.glob('output/search/finetune/LSTM_Attn_tra/K10_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl')\n",
+    "    \"Random\": glob.glob(\n",
+    "        \"output/search/LSTM_Attn_tra/K10_traHs16_traSrcNONE_traLamb1.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl\"\n",
+    "    ),\n",
+    "    \"LR\": glob.glob(\n",
+    "        \"output/search/LSTM_Attn_tra/K10_traHs16_traSrcLR_traLamb1.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl\"\n",
+    "    ),\n",
+    "    \"TPE\": glob.glob(\n",
+    "        \"output/search/LSTM_Attn_tra/K10_traHs16_traSrcTPE_traLamb1.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl\"\n",
+    "    ),\n",
+    "    \"LR+TPE\": glob.glob(\n",
+    "        \"output/search/finetune/LSTM_Attn_tra/K10_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/pred.pkl\"\n",
+    "    ),\n",
    "}"
   ]
  },
@@ -456,14 +484,8 @@
    }
   ],
   "source": [
-    "res = {\n",
-    "    name: backtest_multi(exps[name])\n",
-    "    for name in tqdm(exps)\n",
-    "}\n",
-    "report = pd.DataFrame({\n",
-    "    k: v[0]\n",
-    "    for k, v in res.items()\n",
-    "}).T"
+    "res = {name: backtest_multi(exps[name]) for name in tqdm(exps)}\n",
+    "report = pd.DataFrame({k: v[0] for k, v in res.items()}).T"
   ]
  },
  {
@@ -597,18 +619,22 @@
    }
   ],
   "source": [
-    "a = pd.read_pickle('output/search/finetune/Transformer_tra/K3_traHs16_traSrcLR_TPE_traLamb0.0_head4_hs64_bs512_do0.1_lr0.0005_seed3000/pred.pkl')\n",
-    "b = pd.read_pickle('output/search/finetune/Transformer_tra/K3_traHs16_traSrcLR_TPE_traLamb2.0_head4_hs64_bs512_do0.1_lr0.0005_seed3000/pred.pkl')\n",
+    "a = pd.read_pickle(\n",
+    "    \"output/search/finetune/Transformer_tra/K3_traHs16_traSrcLR_TPE_traLamb0.0_head4_hs64_bs512_do0.1_lr0.0005_seed3000/pred.pkl\"\n",
+    ")\n",
+    "b = pd.read_pickle(\n",
+    "    \"output/search/finetune/Transformer_tra/K3_traHs16_traSrcLR_TPE_traLamb2.0_head4_hs64_bs512_do0.1_lr0.0005_seed3000/pred.pkl\"\n",
+    ")\n",
    "a = a.iloc[:, -3:]\n",
    "b = b.iloc[:, -3:]\n",
    "b = np.eye(3)[b.values.argmax(axis=1)]\n",
    "a = np.eye(3)[a.values.argmax(axis=1)]\n",
    "\n",
-    "res = pd.DataFrame({\n",
-    "    'with OT': b.sum(axis=0) / b.sum(),\n",
-    "    'without OT': a.sum(axis=0)/ a.sum()  \n",
-    "},index=[r'$\\theta_1$',r'$\\theta_2$',r'$\\theta_3$'])\n",
-    "res.plot.bar(rot=30, figsize=(5, 4), color=['b', 'g'])\n",
+    "res = pd.DataFrame(\n",
+    "    {\"with OT\": b.sum(axis=0) / b.sum(), \"without OT\": a.sum(axis=0) / a.sum()},\n",
+    "    index=[r\"$\\theta_1$\", r\"$\\theta_2$\", r\"$\\theta_3$\"],\n",
+    ")\n",
+    "res.plot.bar(rot=30, figsize=(5, 4), color=[\"b\", \"g\"])\n",
    "del a, b"
   ]
  },
@@ -633,11 +659,19 @@
   "outputs": [],
   "source": [
    "exps = {\n",
-    "    'K=1': glob.glob('output/search/LSTM_Attn/hs256_bs1024_do0.1_lr0.0002_seed*/info.json'),\n",
-    "    'K=3': glob.glob('output/search/finetune/LSTM_Attn_tra/K3_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/info.json'),\n",
-    "    'K=5': glob.glob('output/search/finetune/LSTM_Attn_tra/K5_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/info.json'),\n",
-    "    'K=10': glob.glob('output/search/finetune/LSTM_Attn_tra/K10_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/info.json'),\n",
-    "    'K=20': glob.glob('output/search/finetune/LSTM_Attn_tra/K20_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/info.json')\n",
+    "    \"K=1\": glob.glob(\"output/search/LSTM_Attn/hs256_bs1024_do0.1_lr0.0002_seed*/info.json\"),\n",
+    "    \"K=3\": glob.glob(\n",
+    "        \"output/search/finetune/LSTM_Attn_tra/K3_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/info.json\"\n",
+    "    ),\n",
+    "    \"K=5\": glob.glob(\n",
+    "        \"output/search/finetune/LSTM_Attn_tra/K5_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/info.json\"\n",
+    "    ),\n",
+    "    \"K=10\": glob.glob(\n",
+    "        \"output/search/finetune/LSTM_Attn_tra/K10_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/info.json\"\n",
+    "    ),\n",
+    "    \"K=20\": glob.glob(\n",
+    "        \"output/search/finetune/LSTM_Attn_tra/K20_traHs16_traSrcLR_TPE_traLamb2.0_hs256_bs1024_do0.1_lr0.0001_seed*/info.json\"\n",
+    "    ),\n",
    "}"
   ]
  },
@@ -649,16 +683,11 @@
   "source": [
    "report = dict()\n",
    "for k, v in exps.items():\n",
-    "    \n",
    "    tmp = dict()\n",
    "    for fname in v:\n",
    "        with open(fname) as f:\n",
    "            info = json.load(f)\n",
-    "        tmp[fname] = (\n",
-    "        {\n",
-    "            \"IC\":info[\"metric\"][\"IC\"],\n",
-    "            \"MSE\":info[\"metric\"][\"MSE\"]\n",
-    "        })\n",
+    "        tmp[fname] = {\"IC\": info[\"metric\"][\"IC\"], \"MSE\": info[\"metric\"][\"MSE\"]}\n",
    "    tmp = pd.DataFrame(tmp).T\n",
    "    report[k] = tmp.mean()\n",
    "report = pd.DataFrame(report).T"
@@ -681,13 +710,14 @@
    }
   ],
   "source": [
-    "fig, axes = plt.subplots(1, 2, figsize=(6,3)); axes = axes.flatten()\n",
-    "report['IC'].plot.bar(rot=30, ax=axes[0])\n",
+    "fig, axes = plt.subplots(1, 2, figsize=(6, 3))\n",
+    "axes = axes.flatten()\n",
+    "report[\"IC\"].plot.bar(rot=30, ax=axes[0])\n",
    "axes[0].set_ylim(0.045, 0.062)\n",
-    "axes[0].set_title('IC performance')\n",
-    "report['MSE'].astype(float).plot.bar(rot=30, ax=axes[1], color='green')\n",
+    "axes[0].set_title(\"IC performance\")\n",
+    "report[\"MSE\"].astype(float).plot.bar(rot=30, ax=axes[1], color=\"green\")\n",
    "axes[1].set_ylim(0.155, 0.1585)\n",
-    "axes[1].set_title('MSE performance')\n",
+    "axes[1].set_title(\"MSE performance\")\n",
    "plt.tight_layout()\n",
    "# plt.savefig('sensitivity.pdf')"
   ]
--- a/examples/benchmarks_dynamic/DDG-DA/Makefile
+++ b/examples/benchmarks_dynamic/DDG-DA/Makefile
@@ -0,0 +1,4 @@
+.PHONY: clean
+
+clean:
+	-rm -r *.pkl mlruns || true
--- a/examples/benchmarks_dynamic/DDG-DA/vis_data.py
+++ b/examples/benchmarks_dynamic/DDG-DA/vis_data.py
@@ -0,0 +1,107 @@
+import pickle
+import numpy as np
+import pandas as pd
+import matplotlib.pyplot as plt
+import seaborn as sns
+
+sns.set(color_codes=True)
+plt.rcParams["font.sans-serif"] = "SimHei"
+plt.rcParams["axes.unicode_minus"] = False
+from tqdm.auto import tqdm
+
+# tqdm.pandas()  # for progress_apply
+# %matplotlib inline
+# %load_ext autoreload
+
+
+# # Meta Input
+
+# +
+with open("./internal_data_s20.pkl", "rb") as f:
+    data = pickle.load(f)
+
+data.data_ic_df.columns.names = ["start_date", "end_date"]
+
+data_sim = data.data_ic_df.droplevel(axis=1, level="end_date")
+
+data_sim.index.name = "test datetime"
+# -
+
+plt.figure(figsize=(40, 20))
+sns.heatmap(data_sim)
+
+plt.figure(figsize=(40, 20))
+sns.heatmap(data_sim.rolling(20).mean())
+
+# # Meta Model
+
+from qlib import auto_init
+
+auto_init()
+from qlib.workflow import R
+
+exp = R.get_exp(experiment_name="DDG-DA")
+meta_rec = exp.list_recorders(rtype="list", max_results=1)[0]
+meta_m = meta_rec.load_object("model")
+
+pd.DataFrame(meta_m.tn.twm.linear.weight.detach().numpy()).T[0].plot()
+
+pd.DataFrame(meta_m.tn.twm.linear.weight.detach().numpy()).T[0].rolling(5).mean().plot()
+
+# # Meta Output
+
+# +
+with open("./tasks_s20.pkl", "rb") as f:
+    tasks = pickle.load(f)
+
+task_df = {}
+for t in tasks:
+    test_seg = t["dataset"]["kwargs"]["segments"]["test"]
+    if None not in test_seg:
+        # The last rolling is skipped.
+        task_df[test_seg] = t["reweighter"].time_weight
+task_df = pd.concat(task_df)
+
+task_df.index.names = ["OS_start", "OS_end", "IS_start", "IS_end"]
+task_df = task_df.droplevel(["OS_end", "IS_end"])
+task_df = task_df.unstack("OS_start")
+# -
+
+plt.figure(figsize=(40, 20))
+sns.heatmap(task_df.T)
+
+plt.figure(figsize=(40, 20))
+sns.heatmap(task_df.rolling(10).mean().T)
+
+# # Sub Models
+#
+# NOTE:
+# - this section assumes that the model is a linear model!
+# - Other models do not support this analysis
+
+exp = R.get_exp(experiment_name="rolling_ds")
+
+
+def show_linear_weight(exp):
+    coef_df = {}
+    for r in exp.list_recorders("list"):
+        t = r.load_object("task")
+        if None in t["dataset"]["kwargs"]["segments"]["test"]:
+            continue
+        m = r.load_object("params.pkl")
+        coef_df[t["dataset"]["kwargs"]["segments"]["test"]] = pd.Series(m.coef_)
+
+    coef_df = pd.concat(coef_df)
+
+    coef_df.index.names = ["test_start", "test_end", "coef_idx"]
+
+    coef_df = coef_df.droplevel("test_end").unstack("coef_idx").T
+
+    plt.figure(figsize=(40, 20))
+    sns.heatmap(coef_df)
+    plt.show()
+
+
+show_linear_weight(R.get_exp(experiment_name="rolling_ds"))
+
+show_linear_weight(R.get_exp(experiment_name="rolling_models"))
--- a/examples/benchmarks_dynamic/DDG-DA/workflow.py
+++ b/examples/benchmarks_dynamic/DDG-DA/workflow.py
@@ -10,8 +10,10 @@ import pandas as pd
 import fire
 import sys
 import pickle
+from typing import Optional
 from qlib import auto_init
 from qlib.model.trainer import TrainerR
+from qlib.typehint import Literal
 from qlib.utils import init_instance_by_config
 from qlib.workflow import R
 from qlib.tests.data import GetData
@@ -30,7 +32,33 @@ class DDGDA:
    - `rm -r mlruns`
    """

-    def __init__(self, sim_task_model="linear", forecast_model="linear"):
+    def __init__(
+        self,
+        sim_task_model: Literal["linear", "gbdt"] = "gbdt",
+        forecast_model: Literal["linear", "gbdt"] = "linear",
+        h_path: Optional[str] = None,
+        test_end: Optional[str] = None,
+        train_start: Optional[str] = None,
+        meta_1st_train_end: Optional[str] = None,
+        task_ext_conf: Optional[dict] = None,
+        alpha: float = 0.01,
+        proxy_hd: str = "handler_proxy.pkl",
+    ):
+        """
+
+        Parameters
+        ----------
+
+        train_start: Optional[str]
+            the start datetime for data.  It is used as the training start time (for both tasks & meta learning)
+        test_end: Optional[str]
+            the end datetime for data. It is used as the test end time
+        meta_1st_train_end: Optional[str]
+            the training end datetime of the first meta_task
+        alpha: float
+            The L2 regularization strength for ridge.
+            `alpha` is only passed to MetaModelDS (it is not currently passed to sim_task_model)
+        """
        self.step = 20
        # NOTE:
        # the horizon must match the meaning in the base task template
@@ -38,10 +66,19 @@ class DDGDA:
        self.meta_exp_name = "DDG-DA"
        self.sim_task_model = sim_task_model  # The model to capture the distribution of data.
        self.forecast_model = forecast_model  # downstream forecasting models' type
+        self.rb_kwargs = {
+            "h_path": h_path,
+            "test_end": test_end,
+            "train_start": train_start,
+            "task_ext_conf": task_ext_conf,
+        }
+        self.alpha = alpha
+        self.meta_1st_train_end = meta_1st_train_end
+        self.proxy_hd = proxy_hd

    def get_feature_importance(self):
        # this must be lightGBM, because it needs to get the feature importance
-        rb = RollingBenchmark(model_type="gbdt")
+        rb = RollingBenchmark(model_type="gbdt", **self.rb_kwargs)
        task = rb.basic_task()

        with R.start(experiment_name="feature_importance"):
@@ -69,7 +106,7 @@ class DDGDA:
        fi = self.get_feature_importance()
        col_selected = fi.nlargest(topk)

-        rb = RollingBenchmark(model_type=self.sim_task_model)
+        rb = RollingBenchmark(model_type=self.sim_task_model, **self.rb_kwargs)
        task = rb.basic_task()
        dataset = init_instance_by_config(task["dataset"])
        prep_ds = dataset.prepare(slice(None), col_set=["feature", "label"], data_key=DataHandlerLP.DK_L)
@@ -79,7 +116,9 @@ class DDGDA:

        feature_selected = feature_df.loc[:, col_selected.index]

-        feature_selected = feature_selected.groupby("datetime").apply(lambda df: (df - df.mean()).div(df.std()))
+        feature_selected = feature_selected.groupby("datetime", group_keys=False).apply(
+            lambda df: (df - df.mean()).div(df.std())
+        )
        feature_selected = feature_selected.fillna(0.0)

        df_all = {
@@ -96,7 +135,7 @@ class DDGDA:
                "kwargs": {"config": DIRNAME / "fea_label_df.pkl"},
            }
        )
-        handler.to_pickle(DIRNAME / "handler_proxy.pkl", dump_all=True)
+        handler.to_pickle(DIRNAME / self.proxy_hd, dump_all=True)

    @property
    def _internal_data_path(self):
@@ -108,7 +147,7 @@ class DDGDA:
        This function will dump the input data for meta model
        """
        # According to the experiments, the choice of the model type is very important for achieving good results
-        rb = RollingBenchmark(model_type=self.sim_task_model)
+        rb = RollingBenchmark(model_type=self.sim_task_model, **self.rb_kwargs)
        sim_task = rb.basic_task()

        if self.sim_task_model == "gbdt":
@@ -122,24 +161,28 @@ class DDGDA:
        with self._internal_data_path.open("wb") as f:
            pickle.dump(internal_data, f)

-    def train_meta_model(self):
+    def train_meta_model(self, fill_method="max"):
        """
        training a meta model based on a simplified linear proxy model;
        """

        # 1) leverage the simplified proxy forecasting model to train meta model.
        # - Only the dataset part is important, in current version of meta model will integrate the
-        rb = RollingBenchmark(model_type=self.sim_task_model)
+        rb = RollingBenchmark(model_type=self.sim_task_model, **self.rb_kwargs)
        sim_task = rb.basic_task()
+        # the train_start for training meta model does not necessarily align with final rolling
+        train_start = "2008-01-01" if self.rb_kwargs.get("train_start") is None else self.rb_kwargs.get("train_start")
+        train_end = "2010-12-31" if self.meta_1st_train_end is None else self.meta_1st_train_end
+        test_start = (pd.Timestamp(train_end) + pd.Timedelta(days=1)).strftime("%Y-%m-%d")
        proxy_forecast_model_task = {
            # "model": "qlib.contrib.model.linear.LinearModel",
            "dataset": {
                "class": "qlib.data.dataset.DatasetH",
                "kwargs": {
-                    "handler": f"file://{(DIRNAME / 'handler_proxy.pkl').absolute()}",
+                    "handler": f"file://{(DIRNAME / self.proxy_hd).absolute()}",
                    "segments": {
-                        "train": ("2008-01-01", "2010-12-31"),
-                        "test": ("2011-01-01", sim_task["dataset"]["kwargs"]["segments"]["test"][1]),
+                        "train": (train_start, train_end),
+                        "test": (test_start, sim_task["dataset"]["kwargs"]["segments"]["test"][1]),
                    },
                },
            },
@@ -156,7 +199,7 @@ class DDGDA:
            segments=0.62,  # keep test period consistent with the dataset yaml
            trunc_days=1 + self.horizon,
            hist_step_n=30,
-            fill_method="max",
+            fill_method=fill_method,
            rolling_ext_days=0,
        )
        # NOTE:
@@ -165,12 +208,15 @@ class DDGDA:
        # So the misalignment will not affect the effectiveness of the method.
        with self._internal_data_path.open("rb") as f:
            internal_data = pickle.load(f)
+
        md = MetaDatasetDS(exp_name=internal_data, **kwargs)

        # 3) train and logging meta model
        with R.start(experiment_name=self.meta_exp_name):
            R.log_params(**kwargs)
-            mm = MetaModelDS(step=self.step, hist_step_n=kwargs["hist_step_n"], lr=0.001, max_epoch=100, seed=43)
+            mm = MetaModelDS(
+                step=self.step, hist_step_n=kwargs["hist_step_n"], lr=0.001, max_epoch=30, seed=43, alpha=self.alpha
+            )
            mm.fit(md)
            R.save_objects(model=mm)

@@ -203,7 +249,7 @@ class DDGDA:
        hist_step_n = int(param["hist_step_n"])
        fill_method = param.get("fill_method", "max")

-        rb = RollingBenchmark(model_type=self.forecast_model)
+        rb = RollingBenchmark(model_type=self.forecast_model, **self.rb_kwargs)
        task_l = rb.create_rolling_tasks()

        # 2.2) create meta dataset for final dataset
@@ -233,13 +279,13 @@ class DDGDA:
        """
        with self._task_path.open("rb") as f:
            tasks = pickle.load(f)
-        rb = RollingBenchmark(rolling_exp="rolling_ds", model_type=self.forecast_model)
+        rb = RollingBenchmark(rolling_exp="rolling_ds", model_type=self.forecast_model, **self.rb_kwargs)
        rb.train_rolling_tasks(tasks)
        rb.ens_rolling()
        rb.update_rolling_rec()

    def run_all(self):
-        # 1) file: handler_proxy.pkl
+        # 1) file: handler_proxy.pkl (self.proxy_hd)
        self.dump_data_for_proxy_model()
        # 2)
        # file: internal_data_s20.pkl
--- a/examples/benchmarks_dynamic/README.md
+++ b/examples/benchmarks_dynamic/README.md
@@ -8,15 +8,17 @@ The table below shows the performances of different solutions on different forec
 Here is the [crowd sourced version of qlib data](data_collector/crowd_source/README.md): https://github.com/chenditc/investment_data/releases
 ```bash
 wget https://github.com/chenditc/investment_data/releases/download/20220720/qlib_bin.tar.gz
+mkdir -p ~/.qlib/qlib_data/cn_data
 tar -zxvf qlib_bin.tar.gz -C ~/.qlib/qlib_data/cn_data --strip-components=2
+rm -f qlib_bin.tar.gz
 ```

 | Model Name       | Dataset | IC | ICIR | Rank IC | Rank ICIR | Annualized Return | Information Ratio | Max Drawdown |
-|------------------|---------|----|------|---------|-----------|-------------------|-------------------|--------------|
-| RR[Linear]       |Alpha158 |0.089|0.577|0.102    |0.627      |0.093              |1.458              |-0.073        |
-| DDG-DA[Linear]   |Alpha158 |0.096|0.636|0.107    |0.677      |0.067              |0.996              |-0.091        |
-| RR[LightGBM]     |Alpha158 |0.082|0.589|0.091    |0.626      |0.077              |1.320              |-0.091        |
-| DDG-DA[LightGBM] |Alpha158 |0.085|0.658|0.094    |0.686      |0.115              |1.792              |-0.068        |
+|------------------|---------|------|------|---------|-----------|-------------------|-------------------|--------------|
+| RR[Linear]       |Alpha158 |0.0945|0.5989|0.1069   |0.6495     |0.0857             |1.3682             |-0.0986       |
+| DDG-DA[Linear]   |Alpha158 |0.0983|0.6157|0.1108   |0.6646     |0.0764             |1.1904             |-0.0769       |
+| RR[LightGBM]     |Alpha158 |0.0816|0.5887|0.0912   |0.6263     |0.0771             |1.3196             |-0.0909       |
+| DDG-DA[LightGBM] |Alpha158 |0.0878|0.6185|0.0975   |0.6524     |0.1261             |2.0096             |-0.0744       |

 - The label horizon of the `Alpha158` dataset is set to 20.
 - The rolling time intervals are set to 20 trading days.
--- a/examples/benchmarks_dynamic/baseline/rolling_benchmark.py
+++ b/examples/benchmarks_dynamic/baseline/rolling_benchmark.py
@@ -1,13 +1,17 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.
+from typing import Optional
 from qlib.model.ens.ensemble import RollingEnsemble
 from qlib.utils import init_instance_by_config
 import fire
 import yaml
+import pandas as pd
 from qlib import auto_init
 from pathlib import Path
 from tqdm.auto import tqdm
 from qlib.model.trainer import TrainerR
+from qlib.log import get_module_logger
+from qlib.utils.data import update_config
 from qlib.workflow import R
 from qlib.tests.data import GetData

@@ -25,23 +29,57 @@ class RollingBenchmark:

    """

-    def __init__(self, rolling_exp="rolling_models", model_type="linear") -> None:
+    def __init__(
+        self,
+        rolling_exp: str = "rolling_models",
+        model_type: str = "linear",
+        h_path: Optional[str] = None,
+        train_start: Optional[str] = None,
+        test_end: Optional[str] = None,
+        task_ext_conf: Optional[dict] = None,
+    ) -> None:
+        """
+        Parameters
+        ----------
+        rolling_exp : str
+            The name for the experiments for rolling
+        model_type : str
+            The model to be boosted.
+        h_path : Optional[str]
+            the path of the dumped data handler
+        test_end : Optional[str]
+            the test end for the data. It is typically used together with the handler
+        train_start : Optional[str]
+            the train start for the data.  It is typically used together with the handler.
+        task_ext_conf : Optional[dict]
+            some options used to update the task config
+        """
        self.step = 20
        self.horizon = 20
        self.rolling_exp = rolling_exp
        self.model_type = model_type
+        self.h_path = h_path
+        self.train_start = train_start
+        self.test_end = test_end
+        self.logger = get_module_logger("RollingBenchmark")
+        self.task_ext_conf = task_ext_conf

    def basic_task(self):
        """For fast training rolling"""
        if self.model_type == "gbdt":
-            conf_path = DIRNAME.parent.parent / "benchmarks" / "LightGBM" / "workflow_config_lightgbm_Alpha158.yaml"
+            conf_path = DIRNAME / "workflow_config_lightgbm_Alpha158.yaml"
            # dump the processed data on to disk for later loading to speed up the processing
            h_path = DIRNAME / "lightgbm_alpha158_handler_horizon{}.pkl".format(self.horizon)
        elif self.model_type == "linear":
-            conf_path = DIRNAME.parent.parent / "benchmarks" / "Linear" / "workflow_config_linear_Alpha158.yaml"
+            # We use ridge regression to stabilize the performance
+            conf_path = DIRNAME / "workflow_config_linear_Alpha158.yaml"
            h_path = DIRNAME / "linear_alpha158_handler_horizon{}.pkl".format(self.horizon)
        else:
            raise AssertionError("Model type is not supported!")
+
+        if self.h_path is not None:
+            h_path = Path(self.h_path)
+
        with conf_path.open("r") as f:
            conf = yaml.safe_load(f)

@@ -52,6 +90,9 @@ class RollingBenchmark:

        task = conf["task"]

+        if self.task_ext_conf is not None:
+            task = update_config(task, self.task_ext_conf)
+
        if not h_path.exists():
            h_conf = task["dataset"]["kwargs"]["handler"]
            h = init_instance_by_config(h_conf)
@@ -59,6 +100,15 @@ class RollingBenchmark:

        task["dataset"]["kwargs"]["handler"] = f"file://{h_path}"
        task["record"] = ["qlib.workflow.record_temp.SignalRecord"]
+
+        if self.train_start is not None:
+            seg = task["dataset"]["kwargs"]["segments"]["train"]
+            task["dataset"]["kwargs"]["segments"]["train"] = pd.Timestamp(self.train_start), seg[1]
+
+        if self.test_end is not None:
+            seg = task["dataset"]["kwargs"]["segments"]["test"]
+            task["dataset"]["kwargs"]["segments"]["test"] = seg[0], pd.Timestamp(self.test_end)
+        self.logger.info(task)
        return task

    def create_rolling_tasks(self):
@@ -93,7 +143,7 @@ class RollingBenchmark:
        """
        Evaluate the combined rolling results
        """
-        for rid, rec in R.list_recorders(experiment_name=self.COMB_EXP).items():
+        for _, rec in R.list_recorders(experiment_name=self.COMB_EXP).items():
            for rt_cls in SigAnaRecord, PortAnaRecord:
                rt = rt_cls(recorder=rec, skip_existing=True)
                rt.generate()
--- a/examples/benchmarks_dynamic/baseline/workflow_config_lightgbm_Alpha158.yaml
+++ b/examples/benchmarks_dynamic/baseline/workflow_config_lightgbm_Alpha158.yaml
@@ -0,0 +1,72 @@
+qlib_init:
+    provider_uri: "~/.qlib/qlib_data/cn_data"
+    region: cn
+market: &market csi300
+benchmark: &benchmark SH000300
+data_handler_config: &data_handler_config
+    start_time: 2008-01-01
+    end_time: 2020-08-01
+    fit_start_time: 2008-01-01
+    fit_end_time: 2014-12-31
+    instruments: *market
+port_analysis_config: &port_analysis_config
+    strategy:
+        class: TopkDropoutStrategy
+        module_path: qlib.contrib.strategy
+        kwargs:
+            model: <MODEL> 
+            dataset: <DATASET>
+            topk: 50
+            n_drop: 5
+    backtest:
+        start_time: 2017-01-01
+        end_time: 2020-08-01
+        account: 100000000
+        benchmark: *benchmark
+        exchange_kwargs:
+            limit_threshold: 0.095
+            deal_price: close
+            open_cost: 0.0005
+            close_cost: 0.0015
+            min_cost: 5
+task:
+    model:
+        class: LGBModel
+        module_path: qlib.contrib.model.gbdt
+        kwargs:
+            loss: mse
+            colsample_bytree: 0.8879
+            learning_rate: 0.2
+            subsample: 0.8789
+            lambda_l1: 205.6999
+            lambda_l2: 580.9768
+            max_depth: 8
+            num_leaves: 210
+            num_threads: 20
+    dataset:
+        class: DatasetH
+        module_path: qlib.data.dataset
+        kwargs:
+            handler:
+                class: Alpha158
+                module_path: qlib.contrib.data.handler
+                kwargs: *data_handler_config
+            segments:
+                train: [2008-01-01, 2014-12-31]
+                valid: [2015-01-01, 2016-12-31]
+                test: [2017-01-01, 2020-08-01]
+    record: 
+        - class: SignalRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            model: <MODEL>
+            dataset: <DATASET>
+        - class: SigAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            ana_long_short: False
+            ann_scaler: 252
+        - class: PortAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            config: *port_analysis_config
--- a/examples/benchmarks_dynamic/baseline/workflow_config_linear_Alpha158.yaml
+++ b/examples/benchmarks_dynamic/baseline/workflow_config_linear_Alpha158.yaml
@@ -0,0 +1,79 @@
+qlib_init:
+    provider_uri: "~/.qlib/qlib_data/cn_data"
+    region: cn
+market: &market csi300
+benchmark: &benchmark SH000300
+data_handler_config: &data_handler_config
+    start_time: 2008-01-01
+    end_time: 2020-08-01
+    fit_start_time: 2008-01-01
+    fit_end_time: 2014-12-31
+    instruments: *market
+    infer_processors:
+        - class: RobustZScoreNorm
+          kwargs:
+              fields_group: feature
+              clip_outlier: true
+        - class: Fillna
+          kwargs:
+              fields_group: feature
+    learn_processors:
+        - class: DropnaLabel
+        - class: CSRankNorm
+          kwargs:
+              fields_group: label
+port_analysis_config: &port_analysis_config
+    strategy:
+        class: TopkDropoutStrategy
+        module_path: qlib.contrib.strategy
+        kwargs:
+            signal:
+                - <MODEL> 
+                - <DATASET>
+            topk: 50
+            n_drop: 5
+    backtest:
+        start_time: 2017-01-01
+        end_time: 2020-08-01
+        account: 100000000
+        benchmark: *benchmark
+        exchange_kwargs:
+            limit_threshold: 0.095
+            deal_price: close
+            open_cost: 0.0005
+            close_cost: 0.0015
+            min_cost: 5
+task:
+    model:
+        class: LinearModel
+        module_path: qlib.contrib.model.linear
+        kwargs:
+            estimator: ridge
+            alpha: 0.05
+    dataset:
+        class: DatasetH
+        module_path: qlib.data.dataset
+        kwargs:
+            handler:
+                class: Alpha158
+                module_path: qlib.contrib.data.handler
+                kwargs: *data_handler_config
+            segments:
+                train: [2008-01-01, 2014-12-31]
+                valid: [2015-01-01, 2016-12-31]
+                test: [2017-01-01, 2020-08-01]
+    record: 
+        - class: SignalRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            model: <MODEL>
+            dataset: <DATASET>
+        - class: SigAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            ana_long_short: True
+            ann_scaler: 252
+        - class: PortAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs: 
+            config: *port_analysis_config
--- a/examples/rl/README.md
+++ b/examples/rl/README.md
@@ -1,60 +0,0 @@
-This folder contains a simple example of how to run Qlib RL. It contains:
-
-```
-.
-├── experiment_config
-│   ├── backtest       # Backtest config
-│   └── training       # Training config
-├── README.md          # Readme (the current file)
-└── scripts            # Scripts for data pre-processing
-```
-
-## Data preparation
-
-Use [AzCopy](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10) to download data:
-
-```
-azcopy copy https://qlibpublic.blob.core.windows.net/data/rl/qlib_rl_example_data ./ --recursive
-mv qlib_rl_example_data data
-```
-
-The downloaded data will be placed at `./data`. The original data are in `data/csv`. To create all data needed by the case, run:
-
-```
-bash scripts/data_pipeline.sh
-```
-
-After the execution finishes, the `data/` directory should be like:
-
-```
-data
-├── backtest_orders.csv
-├── bin
-├── csv
-├── pickle
-├── pickle_dataframe
-└── training_order_split
-```
-
-## Run training
-
-Run:
-
-```
-python -m qlib.rl.contrib.train_onpolicy --config_path ./experiment_config/training/config.yml
-```
-
-After training, checkpoints will be stored under `checkpoints/`.
-
-## Run backtest
-
-```
-python -m qlib.rl.contrib.backtest --config_path ./experiment_config/backtest/config.yml
-```
-
-The backtest workflow will use the trained model in `checkpoints/`. The backtest summary can be found in `outputs/`.
-
-## Others
-The RL module is designed in a loosely-coupled way. Currently, RL examples are integrated with concrete business logic.
-But the core part of RL is much simpler than what you see.
-To demonstrate the simple core of RL, [a dedicated notebook](./simple_example.ipynb) for RL without business loss is created.
--- a/examples/rl/experiment_config/backtest/config.yml
+++ b/examples/rl/experiment_config/backtest/config.yml
@@ -1,57 +0,0 @@
-order_file: ./data/backtest_orders.csv
-start_time: "9:45"
-end_time: "14:44"
-qlib:
-  provider_uri_1min: ./data/bin
-  feature_root_dir: ./data/pickle
-  feature_columns_today: [
-    "$open", "$high", "$low", "$close", "$vwap", "$volume",
-  ]
-  feature_columns_yesterday: [
-    "$open_v1", "$high_v1", "$low_v1", "$close_v1", "$vwap_v1", "$volume_v1",
-  ]
-exchange:
-  limit_threshold: ['$close == 0', '$close == 0']
-  deal_price: ["If($close == 0, $vwap, $close)", "If($close == 0, $vwap, $close)"]
-  volume_threshold:
-    all: ["cum", "0.2 * DayCumsum($volume, '9:45', '14:44')"]
-    buy: ["current", "$close"]
-    sell: ["current", "$close"]
-strategies: 
-  30min: 
-    class: TWAPStrategy
-    module_path: qlib.contrib.strategy.rule_strategy
-    kwargs: {}
-  1day: 
-    class: SAOEIntStrategy
-    module_path: qlib.rl.order_execution.strategy
-    kwargs:
-      state_interpreter:
-        class: FullHistoryStateInterpreter
-        module_path: qlib.rl.order_execution.interpreter
-        kwargs:
-          max_step: 8
-          data_ticks: 240
-          data_dim: 6
-          processed_data_provider:
-            class: PickleProcessedDataProvider
-            module_path: qlib.rl.data.pickle_styled
-            kwargs:
-              data_dir: ./data/pickle_dataframe/feature
-      action_interpreter: 
-        class: CategoricalActionInterpreter
-        module_path: qlib.rl.order_execution.interpreter
-        kwargs: 
-          values: 14
-          max_step: 8
-      network: 
-          class: Recurrent
-          module_path: qlib.rl.order_execution.network
-          kwargs: {}
-      policy: 
-          class: PPO
-          module_path: qlib.rl.order_execution.policy
-          kwargs: 
-            lr: 1.0e-4
-            weight_file: ./checkpoints/latest.pth
-concurrency: 5
--- a/examples/rl/experiment_config/training/config.yml
+++ b/examples/rl/experiment_config/training/config.yml
@@ -1,59 +0,0 @@
-simulator:
-  time_per_step: 30
-  vol_limit: null
-env:
-  concurrency: 1
-  parallel_mode: dummy
-action_interpreter:
-  class: CategoricalActionInterpreter
-  kwargs:
-    values: 14
-    max_step: 8
-  module_path: qlib.rl.order_execution.interpreter
-state_interpreter:
-  class: FullHistoryStateInterpreter
-  kwargs:
-    data_dim: 6
-    data_ticks: 240
-    max_step: 8
-    processed_data_provider:
-      class: PickleProcessedDataProvider
-      module_path: qlib.rl.data.pickle_styled
-      kwargs:
-        data_dir: ./data/pickle_dataframe/feature
-  module_path: qlib.rl.order_execution.interpreter
-reward:
-  class: PAPenaltyReward
-  kwargs:
-    penalty: 100.0
-  module_path: qlib.rl.order_execution.reward
-data:
-  source:
-    order_dir: ./data/training_order_split
-    data_dir: ./data/pickle_dataframe/backtest
-    total_time: 240
-    default_start_time: 0
-    default_end_time: 240
-    proc_data_dim: 6
-  num_workers: 0
-  queue_size: 20
-network:
-  class: Recurrent
-  module_path: qlib.rl.order_execution.network
-policy:
-  class: PPO
-  kwargs:
-    lr: 0.0001
-  module_path: qlib.rl.order_execution.policy
-runtime:
-  seed: 42
-  use_cuda: false
-trainer:
-  max_epoch: 2
-  repeat_per_collect: 5
-  earlystop_patience: 2
-  episode_per_collect: 20
-  batch_size: 16
-  val_every_n_epoch: 1
-  checkpoint_path: ./checkpoints
-  checkpoint_every_n_iters: 1
--- a/examples/rl/scripts/collect_pickle_dataframe.py
+++ b/examples/rl/scripts/collect_pickle_dataframe.py
@@ -1,21 +0,0 @@
-# Copyright (c) Microsoft Corporation.
-# Licensed under the MIT License.
-
-import os
-import pickle
-import pandas as pd
-from tqdm import tqdm
-
-os.makedirs(os.path.join("data", "pickle_dataframe"), exist_ok=True)
-
-for tag in ("backtest", "feature"):
-    df = pickle.load(open(os.path.join("data", "pickle", f"{tag}.pkl"), "rb"))
-    df = pd.concat(list(df.values())).reset_index()
-    df["date"] = df["datetime"].dt.date.astype("datetime64")
-    instruments = sorted(set(df["instrument"]))
-
-    os.makedirs(os.path.join("data", "pickle_dataframe", tag), exist_ok=True)
-    for instrument in tqdm(instruments):
-        cur = df[df["instrument"] == instrument].sort_values(by=["datetime"])
-        cur = cur.set_index(["instrument", "datetime", "date"])
-        pickle.dump(cur, open(os.path.join("data", "pickle_dataframe", tag, f"{instrument}.pkl"), "wb"))
--- a/examples/rl/scripts/data_pipeline.sh
+++ b/examples/rl/scripts/data_pipeline.sh
@@ -1,14 +0,0 @@
-# Generate `bin` format data
-set -e
-python ../../scripts/dump_bin.py dump_all --csv_path ./data/csv --qlib_dir ./data/bin --include_fields open,close,high,low,vwap,volume --symbol_field_name symbol --date_field_name date --freq 1min
-
-# Generate pickle format data
-python scripts/gen_pickle_data.py -c scripts/pickle_data_config.yml
-if [ -e stat/ ]; then
-    rm -r stat/
-fi
-python scripts/collect_pickle_dataframe.py
-
-# Sample orders
-python scripts/gen_training_orders.py
-python scripts/gen_backtest_orders.py
--- a/examples/rl/scripts/gen_backtest_orders.py
+++ b/examples/rl/scripts/gen_backtest_orders.py
@@ -1,55 +0,0 @@
-# Copyright (c) Microsoft Corporation.
-# Licensed under the MIT License.
-
-import argparse
-import os
-import pandas as pd
-import numpy as np
-import pickle
-
-parser = argparse.ArgumentParser()
-parser.add_argument("--seed", type=int, default=20220926)
-parser.add_argument("--num_order", type=int, default=10)
-args = parser.parse_args()
-
-np.random.seed(args.seed)
-
-path = os.path.join("data", "pickle", "backtesttest.pkl")
-df = pickle.load(open(path, "rb")).reset_index()
-df["date"] = df["datetime"].dt.date.astype("datetime64")
-
-instruments = sorted(set(df["instrument"]))
-
-# TODO: The example is expected to be able to handle data containing missing values.
-# TODO: Currently, we just simply skip dates that contain missing data. We will add
-# TODO: this feature in the future.
-skip_dates = {}
-for instrument in instruments:
-    csv_df = pd.read_csv(os.path.join("data", "csv", f"{instrument}.csv"))
-    csv_df = csv_df[csv_df["close"].isna()]
-    dates = set([str(d).split(" ")[0] for d in csv_df["date"]])
-    skip_dates[instrument] = dates
-
-df_list = []
-for instrument in instruments:
-    print(instrument)
-
-    cur_df = df[df["instrument"] == instrument]
-
-    dates = sorted(set([str(d).split(" ")[0] for d in cur_df["date"]]))
-    dates = [date for date in dates if date not in skip_dates[instrument]]
-
-    n = args.num_order
-    df_list.append(
-        pd.DataFrame(
-            {
-                "date": sorted(np.random.choice(dates, size=n, replace=False)),
-                "instrument": [instrument] * n,
-                "amount": np.random.randint(low=3, high=11, size=n) * 100.0,
-                "order_type": np.random.randint(low=0, high=2, size=n),
-            }
-        ).set_index(["date", "instrument"]),
-    )
-
-total_df = pd.concat(df_list)
-total_df.to_csv("data/backtest_orders.csv")
--- a/examples/rl/scripts/gen_training_orders.py
+++ b/examples/rl/scripts/gen_training_orders.py
@@ -1,39 +0,0 @@
-# Copyright (c) Microsoft Corporation.
-# Licensed under the MIT License.
-
-import argparse
-import os
-import pandas as pd
-import numpy as np
-import pickle
-
-parser = argparse.ArgumentParser()
-parser.add_argument("--seed", type=int, default=20220926)
-parser.add_argument("--stock", type=str, default="AAPL")
-parser.add_argument("--train_size", type=int, default=10)
-parser.add_argument("--valid_size", type=int, default=2)
-parser.add_argument("--test_size", type=int, default=2)
-args = parser.parse_args()
-
-np.random.seed(args.seed)
-
-os.makedirs(os.path.join("data", "training_order_split"), exist_ok=True)
-
-for group, n in zip(("train", "valid", "test"), (args.train_size, args.valid_size, args.test_size)):
-    path = os.path.join("data", "pickle", f"backtest{group}.pkl")
-    df = pickle.load(open(path, "rb")).reset_index()
-    df["date"] = df["datetime"].dt.date.astype("datetime64")
-
-    dates = sorted(set([str(d).split(" ")[0] for d in df["date"]]))
-
-    data_df = pd.DataFrame(
-        {
-            "date": sorted(np.random.choice(dates, size=n, replace=False)),
-            "instrument": [args.stock] * n,
-            "amount": np.random.randint(low=3, high=11, size=n) * 100.0,
-            "order_type": [0] * n,
-        }
-    ).set_index(["date", "instrument"])
-
-    os.makedirs(os.path.join("data", "training_order_split", group), exist_ok=True)
-    pickle.dump(data_df, open(os.path.join("data", "training_order_split", group, f"{args.stock}.pkl"), "wb"))
--- a/examples/rl/simple_example.ipynb
+++ b/examples/rl/simple_example.ipynb
@@ -41,6 +41,7 @@
    "\n",
    "State = namedtuple(\"State\", [\"value\", \"last_action\"])\n",
    "\n",
+    "\n",
    "class SimpleSimulator(Simulator[float, State, float]):\n",
    "    def __init__(self, initial: float, nsteps: int, **kwargs: Any) -> None:\n",
    "        super().__init__(initial)\n",
@@ -92,6 +93,7 @@
    "from gym import spaces\n",
    "from qlib.rl.interpreter import StateInterpreter\n",
    "\n",
+    "\n",
    "class SimpleStateInterpreter(StateInterpreter[Tuple[float, float], np.ndarray]):\n",
    "    def interpret(self, state: State) -> np.ndarray:\n",
    "        # Convert state.value to a 1D Numpy array\n",
@@ -101,7 +103,8 @@
    "    @property\n",
    "    def observation_space(self) -> spaces.Box:\n",
    "        return spaces.Box(0, np.inf, shape=(1,), dtype=np.float32)\n",
-    "    \n",
+    "\n",
+    "\n",
    "state_interpreter = SimpleStateInterpreter()"
   ]
  },
@@ -120,6 +123,7 @@
   "source": [
    "from qlib.rl.interpreter import ActionInterpreter\n",
    "\n",
+    "\n",
    "class SimpleActionInterpreter(ActionInterpreter[State, int, float]):\n",
    "    def __init__(self, n_value: int) -> None:\n",
    "        self.n_value = n_value\n",
@@ -132,7 +136,8 @@
    "        assert 0 <= action <= self.n_value\n",
    "        # simulator_state.value is used as the denominator\n",
    "        return simulator_state.value * (action / self.n_value)\n",
-    "    \n",
+    "\n",
+    "\n",
    "action_interpreter = SimpleActionInterpreter(n_value=10)"
   ]
  },
@@ -151,12 +156,14 @@
   "source": [
    "from qlib.rl.reward import Reward\n",
    "\n",
+    "\n",
    "class SimpleReward(Reward[State]):\n",
    "    def reward(self, simulator_state: State) -> float:\n",
    "        # Use last_action to calculate reward. This is why it should be in the state.\n",
    "        rew = simulator_state.last_action / simulator_state.value\n",
    "        return rew\n",
-    "    \n",
+    "\n",
+    "\n",
    "reward = SimpleReward()"
   ]
  },
@@ -180,6 +187,7 @@
    "from torch import nn\n",
    "from qlib.rl.order_execution import PPO\n",
    "\n",
+    "\n",
    "class SimpleFullyConnect(nn.Module):\n",
    "    def __init__(self, dims: List[int]) -> None:\n",
    "        super().__init__()\n",
@@ -195,7 +203,8 @@
    "\n",
    "    def forward(self, x: torch.Tensor) -> torch.Tensor:\n",
    "        return self.fc(x)\n",
-    "    \n",
+    "\n",
+    "\n",
    "policy = PPO(\n",
    "    network=SimpleFullyConnect(dims=[16, 8]),\n",
    "    obs_space=state_interpreter.observation_space,\n",
@@ -221,6 +230,7 @@
   "source": [
    "from torch.utils.data import Dataset\n",
    "\n",
+    "\n",
    "class SimpleDataset(Dataset):\n",
    "    def __init__(self, positions: List[float]) -> None:\n",
    "        self.positions = positions\n",
@@ -230,7 +240,8 @@
    "\n",
    "    def __getitem__(self, index: int) -> float:\n",
    "        return self.positions[index]\n",
-    "    \n",
+    "\n",
+    "\n",
    "dataset = SimpleDataset(positions=[10.0, 50.0, 100.0])"
   ]
  },
@@ -265,11 +276,13 @@
    "trainer_kwargs = {\n",
    "    \"max_iters\": 10,\n",
    "    \"finite_env_type\": \"dummy\",\n",
-    "    \"callbacks\": [Checkpoint(\n",
-    "        dirpath=Path(\"./checkpoints\"),\n",
-    "        every_n_iters=1,\n",
-    "        save_latest=\"copy\",\n",
-    "    )],\n",
+    "    \"callbacks\": [\n",
+    "        Checkpoint(\n",
+    "            dirpath=Path(\"./checkpoints\"),\n",
+    "            every_n_iters=1,\n",
+    "            save_latest=\"copy\",\n",
+    "        )\n",
+    "    ],\n",
    "}\n",
    "vessel_kwargs = {\n",
    "    \"update_kwargs\": {\"batch_size\": 16, \"repeat\": 5},\n",
--- a/examples/rl_order_execution/README.md
+++ b/examples/rl_order_execution/README.md
@@ -0,0 +1,100 @@
+# RL Example for Order Execution
+
+This folder contains an example of Reinforcement Learning (RL) workflows for the order execution scenario, including both training and backtest workflows.
+
+## Data Processing
+
+### Get Data
+
+```
+python -m qlib.run.get_data qlib_data qlib_data --target_dir ./data/bin --region hs300 --interval 5min
+```
+
+### Generate Pickle-Style Data
+
+To run the code in this example, we need data in pickle format. To generate it, run the following commands (this may take a few minutes):
+
+[//]: # (TODO: Instead of dumping dataframe with different format &#40;like `_gen_dataset` and `_gen_day_dataset` in `qlib/contrib/data/highfreq_provider.py`&#41;, we encourage to implement different subclass of `Dataset` and `DataHandler`. This will keep the workflow cleaner and interfaces more consistent, and move all the complexity to the subclass.)
+
+```
+python scripts/gen_pickle_data.py -c scripts/pickle_data_config.yml
+python scripts/gen_training_orders.py
+python scripts/merge_orders.py
+```
+
+When finished, the structure under `data/` should be:
+
+```
+data
+├── bin
+├── orders
+└── pickle
+```
+
+## Training
+
+Each training task is specified by a config file. The config file for task `TASKNAME` is `exp_configs/train_TASKNAME.yml`. This example provides two training tasks:
+
+- **PPO**: The method proposed in the IJCAI 2020 paper "[An End-to-End Optimal Trade Execution Framework based on Proximal Policy Optimization](https://www.ijcai.org/proceedings/2020/0627.pdf)".
+- **OPDS**: The method proposed in the AAAI 2021 paper "[Universal Trading for Order Execution with Oracle Policy Distillation](https://arxiv.org/abs/2103.10860)".
+
+The main difference between these two methods is their reward functions. Please see their config files for details.
+
+Taking OPDS as an example, run the training workflow with:
+
+```
+python -m qlib.rl.contrib.train_onpolicy --config_path exp_configs/train_opds.yml --run_backtest
+```
+
+Metrics, logs, and checkpoints will be stored under `outputs/opds` (configured by `exp_configs/train_opds.yml`). 
+
+## Backtest
+
+Once the training workflow has completed, the trained model can be used in the backtest workflow. Still taking OPDS as an example, the latest checkpoint of the model can be found at `outputs/opds/checkpoints/latest.pth` after training finishes. To run the backtest workflow:
+
+1. Uncomment the `weight_file` parameter in `exp_configs/backtest_opds.yml` (it is commented out by default). It is possible to run the backtest workflow without setting a checkpoint, but the model would then be randomly initialized, making the results meaningless.
+2. Run `python -m qlib.rl.contrib.backtest --config_path exp_configs/backtest_opds.yml`.
+
+The backtest result is stored in `outputs/checkpoints/backtest_result.csv`.
+
+In addition to OPDS and PPO, we also provide TWAP ([Time-weighted average price](https://en.wikipedia.org/wiki/Time-weighted_average_price)) as a weak baseline. The config file for TWAP is `exp_configs/backtest_twap.yml`.
+
+### Gap between backtest and training pipeline's testing
+
+Note that the results of the backtest workflow may differ from the results of the testing step run during training.
+This is because different simulators are used to simulate market conditions during training and backtesting.
+In the training pipeline, a simplified simulator called `SingleAssetOrderExecutionSimple` is used for efficiency.
+`SingleAssetOrderExecutionSimple` places no restriction on trading amounts.
+No matter how large an order is, it can be executed completely.
+During backtesting, however, a more realistic simulator called `SingleAssetOrderExecution` is used.
+It takes into account practical constraints found in real-world scenarios (for example, the trading volume must be a multiple of the smallest trading unit).
+As a result, the amount of an order that is actually executed during backtesting may differ from the amount expected to be executed.
+
+If you would like to obtain results that are exactly the same as those obtained during testing in the training pipeline, you can run the training pipeline with only the backtest phase.
+To do this:
+- Modify the training config. Add the path of the checkpoint you want to use (see the example below).
+- Run `python -m qlib.rl.contrib.train_onpolicy --config_path PATH/TO/CONFIG --run_backtest --no_training`
+
+```yaml
+...
+policy:
+  class: PPO  # PPO, DQN
+  kwargs:
+    lr: 0.0001
+    weight_file: PATH/TO/CHECKPOINT
+  module_path: qlib.rl.order_execution.policy
+...
+```
+
+## Benchmarks (TBD)
+
+To accurately evaluate the performance of models using Reinforcement Learning algorithms, it's best to run experiments multiple times and compute the average performance across all trials. However, given the time-consuming nature of model training, this is not always feasible. An alternative approach is to run each training task only once, selecting the 10 checkpoints with the highest validation performance to simulate multiple trials. In this example, we use "Price Advantage (PA)" as the metric for selecting these checkpoints. The average performance of these 10 checkpoints on the testing set is as follows:
+
+| **Model**                   | **PA mean with std.** |
+|-----------------------------|-----------------------|
+| OPDS (with PPO policy)      |  0.4785 ± 0.7815      |
+| OPDS (with DQN policy)      | -0.0114 ± 0.5780      |
+| PPO                         | -1.0935 ± 0.0922      |
+| TWAP                        |   ≈ 0.0 ± 0.0         |
+
+The table above also includes TWAP as a rule-based baseline. Ideally, the PA of TWAP is 0.0. In this example, however, order execution is divided into two steps: the order is first split equally across the half-hour intervals, and then across the five-minute intervals within each half hour. Since trading is forbidden during the last five minutes of the day, this approach may differ slightly from traditional TWAP over a full day (the last "half hour" is missing 5 minutes). The PA of TWAP can therefore be regarded as approximately 0.0. To verify this, you may run a TWAP backtest and check the results.
--- a/examples/rl_order_execution/exp_configs/backtest_opds.yml
+++ b/examples/rl_order_execution/exp_configs/backtest_opds.yml
@@ -0,0 +1,53 @@
+order_file: ./data/orders/test_orders.pkl
+start_time: "9:30"
+end_time: "14:54"
+data_granularity: "5min"
+qlib:
+  provider_uri_5min: ./data/bin/
+exchange:
+  limit_threshold: null
+  deal_price: ["$close", "$close"]
+  volume_threshold: null
+strategies:
+  1day:
+    class: SAOEIntStrategy
+    kwargs:
+      data_granularity: 5
+      action_interpreter:
+        class: CategoricalActionInterpreter
+        kwargs:
+          max_step: 8
+          values: 4
+        module_path: qlib.rl.order_execution.interpreter
+      network:
+        class: Recurrent
+        kwargs: {}
+        module_path: qlib.rl.order_execution.network
+      policy:
+        class: PPO  # PPO, DQN
+        kwargs:
+          lr: 0.0001
+          # Restore `weight_file` once the training workflow finishes. You can change the checkpoint file you want to use.
+          # weight_file: outputs/opds/checkpoints/latest.pth
+        module_path: qlib.rl.order_execution.policy
+      state_interpreter:
+        class: FullHistoryStateInterpreter
+        kwargs:
+          data_dim: 5
+          data_ticks: 48
+          max_step: 8
+          processed_data_provider:
+            class: HandlerProcessedDataProvider
+            kwargs:
+              data_dir: ./data/pickle/
+              feature_columns_today: ["$high", "$low", "$open", "$close", "$volume"]
+              feature_columns_yesterday: ["$high_1", "$low_1", "$open_1", "$close_1", "$volume_1"]
+            module_path: qlib.rl.data.native
+        module_path: qlib.rl.order_execution.interpreter
+    module_path: qlib.rl.order_execution.strategy
+  30min:
+    class: TWAPStrategy
+    kwargs: {}
+    module_path: qlib.contrib.strategy.rule_strategy
+concurrency: 16
+output_dir: outputs/opds/
--- a/examples/rl_order_execution/exp_configs/backtest_ppo.yml
+++ b/examples/rl_order_execution/exp_configs/backtest_ppo.yml
@@ -0,0 +1,53 @@
+order_file: ./data/orders/test_orders.pkl
+start_time: "9:30"
+end_time: "14:54"
+data_granularity: "5min"
+qlib:
+  provider_uri_5min: ./data/bin/
+exchange:
+  limit_threshold: null
+  deal_price: ["$close", "$close"]
+  volume_threshold: null
+strategies:
+  1day:
+    class: SAOEIntStrategy
+    kwargs:
+      data_granularity: 5
+      action_interpreter:
+        class: CategoricalActionInterpreter
+        kwargs:
+          max_step: 8
+          values: 4
+        module_path: qlib.rl.order_execution.interpreter
+      network:
+        class: Recurrent
+        kwargs: {}
+        module_path: qlib.rl.order_execution.network
+      policy:
+        class: PPO  # PPO, DQN
+        kwargs:
+          lr: 0.0001
+          # Restore `weight_file` once the training workflow finishes. You can change the checkpoint file you want to use.
+          # weight_file: outputs/ppo/checkpoints/latest.pth
+        module_path: qlib.rl.order_execution.policy
+      state_interpreter:
+        class: FullHistoryStateInterpreter
+        kwargs:
+          data_dim: 5
+          data_ticks: 48
+          max_step: 8
+          processed_data_provider:
+            class: HandlerProcessedDataProvider
+            kwargs:
+              data_dir: ./data/pickle/
+              feature_columns_today: ["$high", "$low", "$open", "$close", "$volume"]
+              feature_columns_yesterday: ["$high_1", "$low_1", "$open_1", "$close_1", "$volume_1"]
+            module_path: qlib.rl.data.native
+        module_path: qlib.rl.order_execution.interpreter
+    module_path: qlib.rl.order_execution.strategy
+  30min:
+    class: TWAPStrategy
+    kwargs: {}
+    module_path: qlib.contrib.strategy.rule_strategy
+concurrency: 16
+output_dir: outputs/ppo/
--- a/examples/rl_order_execution/exp_configs/backtest_twap.yml
+++ b/examples/rl_order_execution/exp_configs/backtest_twap.yml
@@ -0,0 +1,21 @@
+order_file: ./data/orders/test_orders.pkl
+start_time: "9:30"
+end_time: "14:54"
+data_granularity: "5min"
+qlib:
+  provider_uri_5min: ./data/bin/
+exchange:
+  limit_threshold: null
+  deal_price: ["$close", "$close"]
+  volume_threshold: null
+strategies:
+  1day:
+    class: TWAPStrategy
+    kwargs: {}
+    module_path: qlib.contrib.strategy.rule_strategy
+  30min:
+    class: TWAPStrategy
+    kwargs: {}
+    module_path: qlib.contrib.strategy.rule_strategy
+concurrency: 16
+output_dir: outputs/twap/
--- a/examples/rl_order_execution/exp_configs/train_opds.yml
+++ b/examples/rl_order_execution/exp_configs/train_opds.yml
@@ -0,0 +1,66 @@
+simulator:
+  data_granularity: 5
+  time_per_step: 30
+  vol_limit: null
+env:
+  concurrency: 32
+  parallel_mode: dummy
+action_interpreter:
+  class: CategoricalActionInterpreter
+  kwargs:
+    values: 4
+    max_step: 8
+  module_path: qlib.rl.order_execution.interpreter
+state_interpreter:
+  class: FullHistoryStateInterpreter
+  kwargs:
+    data_dim: 5
+    data_ticks: 48  # 48 = 240 min / 5 min
+    max_step: 8
+    processed_data_provider:
+      class: HandlerProcessedDataProvider
+      kwargs:
+        data_dir: ./data/pickle/
+        feature_columns_today: ["$high", "$low", "$open", "$close", "$volume"]
+        feature_columns_yesterday: ["$high_1", "$low_1", "$open_1", "$close_1", "$volume_1"]
+        backtest: false
+      module_path: qlib.rl.data.native
+  module_path: qlib.rl.order_execution.interpreter
+reward:
+  class: PAPenaltyReward
+  kwargs:
+    penalty: 4.0
+    scale: 0.01
+  module_path: qlib.rl.order_execution.reward
+data:
+  source:
+    order_dir: ./data/orders
+    feature_root_dir: ./data/pickle/
+    feature_columns_today: ["$close0", "$volume0"]
+    feature_columns_yesterday: []
+    total_time: 240
+    default_start_time_index: 0
+    default_end_time_index: 235
+    proc_data_dim: 5
+  num_workers: 0
+  queue_size: 20
+network:
+  class: Recurrent
+  module_path: qlib.rl.order_execution.network
+policy:
+  class: PPO  # PPO, DQN
+  kwargs:
+    lr: 0.0001
+  module_path: qlib.rl.order_execution.policy
+runtime:
+  seed: 42
+  use_cuda: false
+trainer:
+  max_epoch: 500
+  repeat_per_collect: 25
+  earlystop_patience: 50
+  episode_per_collect: 10000
+  batch_size: 1024
+  val_every_n_epoch: 4
+  checkpoint_path: ./outputs/opds
+  checkpoint_every_n_iters: 1
--- a/examples/rl_order_execution/exp_configs/train_ppo.yml
+++ b/examples/rl_order_execution/exp_configs/train_ppo.yml
@@ -0,0 +1,67 @@
+simulator:
+  data_granularity: 5
+  time_per_step: 30
+  vol_limit: null
+env:
+  concurrency: 32
+  parallel_mode: dummy
+action_interpreter:
+  class: CategoricalActionInterpreter
+  kwargs:
+    values: 4
+    max_step: 8
+  module_path: qlib.rl.order_execution.interpreter
+state_interpreter:
+  class: FullHistoryStateInterpreter
+  kwargs:
+    data_dim: 5
+    data_ticks: 48  # 48 = 240 min / 5 min
+    max_step: 8
+    processed_data_provider:
+      class: HandlerProcessedDataProvider
+      kwargs:
+        data_dir: ./data/pickle/
+        feature_columns_today: ["$high", "$low", "$open", "$close", "$volume"]
+        feature_columns_yesterday: ["$high_1", "$low_1", "$open_1", "$close_1", "$volume_1"]
+        backtest: false
+      module_path: qlib.rl.data.native
+  module_path: qlib.rl.order_execution.interpreter
+reward:
+  class: PPOReward
+  kwargs:
+    max_step: 8
+    start_time_index: 0
+    end_time_index: 46  # 46 = (240 - 5) min / 5 min - 1
+  module_path: qlib.rl.order_execution.reward
+data:
+  source:
+    order_dir: ./data/orders
+    feature_root_dir: ./data/pickle/
+    feature_columns_today: ["$close0", "$volume0"]
+    feature_columns_yesterday: []
+    total_time: 240
+    default_start_time_index: 0
+    default_end_time_index: 235
+    proc_data_dim: 5
+  num_workers: 0
+  queue_size: 20
+network:
+  class: Recurrent
+  module_path: qlib.rl.order_execution.network
+policy:
+  class: PPO  # PPO, DQN
+  kwargs:
+    lr: 0.0001
+  module_path: qlib.rl.order_execution.policy
+runtime:
+  seed: 42
+  use_cuda: false
+trainer:
+  max_epoch: 500
+  repeat_per_collect: 25
+  earlystop_patience: 50
+  episode_per_collect: 10000
+  batch_size: 1024
+  val_every_n_epoch: 4
+  checkpoint_path: ./outputs/ppo
+  checkpoint_every_n_iters: 1
--- a/examples/rl_order_execution/scripts/gen_pickle_data.py
+++ b/examples/rl_order_execution/scripts/gen_pickle_data.py
@@ -4,6 +4,7 @@
 import yaml
 import argparse
 import os
+import shutil
 from copy import deepcopy

 from qlib.contrib.data.highfreq_provider import HighFreqProvider
@@ -41,3 +42,5 @@ if __name__ == "__main__":
    if args.split == "stock" or args.split == "both":
        provider._gen_stock_dataset(deepcopy(provider.feature_conf), "feature")
        provider._gen_stock_dataset(deepcopy(provider.backtest_conf), "backtest")
+
+    shutil.rmtree("stat/", ignore_errors=True)
--- a/examples/rl_order_execution/scripts/gen_training_orders.py
+++ b/examples/rl_order_execution/scripts/gen_training_orders.py
@@ -0,0 +1,53 @@
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT License.
+
+import os
+import numpy as np
+import pandas as pd
+
+from pathlib import Path
+
+DATA_PATH = Path(os.path.join("data", "pickle", "backtest"))
+OUTPUT_PATH = Path(os.path.join("data", "orders"))
+
+
+def generate_order(stock: str, start_idx: int, end_idx: int) -> bool:
+    dataset = pd.read_pickle(DATA_PATH / f"{stock}.pkl")
+    df = dataset.handler.fetch(level=None).reset_index()
+    if len(df) == 0 or df.isnull().values.any() or min(df["$volume0"]) < 1e-5:
+        return False
+
+    df["date"] = df["datetime"].dt.date.astype("datetime64[ns]")
+    df = df.set_index(["instrument", "datetime", "date"])
+    df = df.groupby("date").take(range(start_idx, end_idx)).droplevel(level=0)
+
+    order_all = pd.DataFrame(df.groupby(level=(2, 0)).mean().dropna())
+    order_all["amount"] = np.random.lognormal(-3.28, 1.14) * order_all["$volume0"]
+    order_all = order_all[order_all["amount"] > 0.0]
+    order_all["order_type"] = 0
+    order_all = order_all.drop(columns=["$volume0"])
+
+    order_train = order_all[order_all.index.get_level_values(0) <= pd.Timestamp("2021-06-30")]
+    order_test = order_all[order_all.index.get_level_values(0) > pd.Timestamp("2021-06-30")]
+    order_valid = order_test[order_test.index.get_level_values(0) <= pd.Timestamp("2021-09-30")]
+    order_test = order_test[order_test.index.get_level_values(0) > pd.Timestamp("2021-09-30")]
+
+    for order, tag in zip((order_train, order_valid, order_test, order_all), ("train", "valid", "test", "all")):
+        path = OUTPUT_PATH / tag
+        os.makedirs(path, exist_ok=True)
+        if len(order) > 0:
+            order.to_pickle(path / f"{stock}.pkl.target")
+    return True
+
+
+np.random.seed(1234)
+file_list = sorted(os.listdir(DATA_PATH))
+stocks = [f.replace(".pkl", "") for f in file_list]
+np.random.shuffle(stocks)
+
+cnt = 0
+for stock in stocks:
+    if generate_order(stock, 0, 240 // 5 - 1):
+        cnt += 1
+        if cnt == 100:
+            break
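
Note on the split logic in `generate_order`: the four boolean masks amount to a three-way partition by date, with cut-offs at 2021-06-30 and 2021-09-30 (the same boundaries as `train_end_time` and `valid_end_time` in `pickle_data_config.yml`). A minimal standard-library sketch of that partition follows. The function `split_tag` is hypothetical and only illustrates the boundary handling; the script itself operates on a pandas index.

```python
from datetime import date

# Cut-off dates copied from the script above.
TRAIN_END = date(2021, 6, 30)
VALID_END = date(2021, 9, 30)


def split_tag(day: date) -> str:
    """Return the split a trading day falls into (both cut-offs are inclusive)."""
    if day <= TRAIN_END:
        return "train"
    if day <= VALID_END:
        return "valid"
    return "test"


# Probe the days on either side of each boundary.
for day in (date(2021, 6, 30), date(2021, 7, 1), date(2021, 9, 30), date(2021, 10, 1)):
    print(day.isoformat(), split_tag(day))
```

Each order lands in exactly one of `train`, `valid` or `test`; the script additionally writes the unsplit frame under `all`.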
--- a/examples/rl_order_execution/scripts/merge_orders.py
+++ b/examples/rl_order_execution/scripts/merge_orders.py
@@ -0,0 +1,15 @@
+import pickle
+import os
+import pandas as pd
+from tqdm import tqdm
+
+for tag in ["test", "valid"]:
+    files = os.listdir(os.path.join("data/orders/", tag))
+    dfs = []
+    for f in tqdm(files):
+        df = pickle.load(open(os.path.join("data/orders/", tag, f), "rb"))
+        df = df.drop(["$close0"], axis=1)
+        dfs.append(df)
+
+    total_df = pd.concat(dfs)
+    pickle.dump(total_df, open(os.path.join("data", "orders", f"{tag}_orders.pkl"), "wb"))
--- a/examples/rl_order_execution/scripts/pickle_data_config.yml
+++ b/examples/rl_order_execution/scripts/pickle_data_config.yml
@@ -1,15 +1,16 @@
 # start & end time for training/validation/test datasets
 start_time: !!str &start 2020-01-01
-end_time: !!str &end 2020-07-31
-train_end_time: !!str &tend 2020-03-31
-valid_start_time: !!str &vstart 2020-04-01
-valid_end_time: !!str &vend 2020-05-31
-test_start_time: !!str &tstart 2020-06-01
+end_time: !!str &end 2021-12-31
+train_end_time: !!str &tend 2021-06-30
+valid_start_time: !!str &vstart 2021-07-01
+valid_end_time: !!str &vend 2021-09-30
+test_start_time: !!str &tstart 2021-10-01
 # the instrument set
-instruments: &ins all
+instruments: &ins csi300s19_22
 # qlib related configuration
 qlib_conf:
-    provider_uri: ./data/bin # path to generated qlib bin
+    provider_uri: 
+        5min: ./data/bin # path to generated qlib bin
    redis_port: 233
 feature_conf:
    path: ./data/pickle/feature.pkl # output path of feature
@@ -26,14 +27,23 @@ feature_conf:
                fit_end_time: *tend
                instruments: *ins
                day_length: 240 # how many minutes in one trading day
+                freq: 5min
+                columns: ["$open", "$high", "$low", "$close"]
                infer_processors:
                - class: HighFreqNorm
                  module_path: qlib.contrib.data.highfreq_processor
                  kwargs:
                    feature_save_dir: ./stat/  #  output path of statistics of features (for feature normalization)
                    norm_groups: 
-                        price: 10
+                        price: 8
                        volume: 2
+                inst_processors:
+                - class: TimeRangeFlt
+                  module_path: qlib.data.dataset.processor
+                  kwargs:
+                    start_time: "2020-01-01"
+                    end_time: "2021-12-31"
+                    freq: 5min
        segments:
            train: !!python/tuple [*start, *tend]
            valid: !!python/tuple [*vstart, *vend]
@@ -51,7 +61,17 @@ backtest_conf:
                end_time: *end
                instruments: *ins
                day_length: 240
+                freq: 5min
+                columns: ["$close", "$volume"]
+                inst_processors:
+                - class: TimeRangeFlt
+                  module_path: qlib.data.dataset.processor
+                  kwargs:
+                    start_time: "2020-01-01"
+                    end_time: "2021-12-31"
+                    freq: 5min
        segments:
            train: !!python/tuple [*start, *tend]
            valid: !!python/tuple [*vstart, *vend]
            test: !!python/tuple [*tstart, *end]
+freq: 5min
--- a/examples/tutorial/detailed_workflow.ipynb
+++ b/examples/tutorial/detailed_workflow.ipynb
@@ -88,6 +88,7 @@
   "outputs": [],
   "source": [
    "from qlib.tests.data import GetData\n",
+    "\n",
    "GetData().qlib_data(exists_skip=True)"
   ]
  },
@@ -99,6 +100,7 @@
   "outputs": [],
   "source": [
    "import qlib\n",
+    "\n",
    "qlib.init()"
   ]
  },
@@ -134,7 +136,8 @@
   "outputs": [],
   "source": [
    "from qlib.data import D\n",
-    "D.calendar(start_time='2010-01-01', end_time='2017-12-31', freq='day')[:2]  # calendar data"
+    "\n",
+    "print(D.calendar(start_time=\"2010-01-01\", end_time=\"2017-12-31\", freq=\"day\")[:2])  # calendar data"
   ]
  },
  {
@@ -152,7 +155,12 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "df = D.features(['SH601216'], ['$open', '$high', '$low', '$close', '$factor'], start_time='2020-05-01', end_time='2020-05-31')   "
+    "df = D.features(\n",
+    "    [\"SH601216\"],\n",
+    "    [\"$open\", \"$high\", \"$low\", \"$close\", \"$factor\"],\n",
+    "    start_time=\"2020-05-01\",\n",
+    "    end_time=\"2020-05-31\",\n",
+    ")"
   ]
  },
  {
@@ -163,11 +171,18 @@
   "outputs": [],
   "source": [
    "import plotly.graph_objects as go\n",
-    "fig = go.Figure(data=[go.Candlestick(x=df.index.get_level_values(\"datetime\"),\n",
-    "                open=df['$open'],\n",
-    "                high=df['$high'],\n",
-    "                low=df['$low'],\n",
-    "                close=df['$close'])])\n",
+    "\n",
+    "fig = go.Figure(\n",
+    "    data=[\n",
+    "        go.Candlestick(\n",
+    "            x=df.index.get_level_values(\"datetime\"),\n",
+    "            open=df[\"$open\"],\n",
+    "            high=df[\"$high\"],\n",
+    "            low=df[\"$low\"],\n",
+    "            close=df[\"$close\"],\n",
+    "        )\n",
+    "    ]\n",
+    ")\n",
    "fig.show()"
   ]
  },
@@ -197,11 +212,18 @@
   "outputs": [],
   "source": [
    "import plotly.graph_objects as go\n",
-    "fig = go.Figure(data=[go.Candlestick(x=df.index.get_level_values(\"datetime\"),\n",
-    "                open=df['$open'] / df['$factor'],\n",
-    "                high=df['$high'] / df['$factor'],\n",
-    "                low=df['$low'] / df['$factor'],\n",
-    "                close=df['$close'] / df['$factor'])])\n",
+    "\n",
+    "fig = go.Figure(\n",
+    "    data=[\n",
+    "        go.Candlestick(\n",
+    "            x=df.index.get_level_values(\"datetime\"),\n",
+    "            open=df[\"$open\"] / df[\"$factor\"],\n",
+    "            high=df[\"$high\"] / df[\"$factor\"],\n",
+    "            low=df[\"$low\"] / df[\"$factor\"],\n",
+    "            close=df[\"$close\"] / df[\"$factor\"],\n",
+    "        )\n",
+    "    ]\n",
+    ")\n",
    "fig.show()"
   ]
  },
@@ -240,7 +262,7 @@
   "outputs": [],
   "source": [
    "# dynamic universe\n",
-    "universe = D.list_instruments(D.instruments('csi100'),  start_time='2010-01-01', end_time='2020-12-31')\n",
+    "universe = D.list_instruments(D.instruments(\"csi100\"), start_time=\"2010-01-01\", end_time=\"2020-12-31\")\n",
    "pprint(universe)"
   ]
  },
@@ -271,8 +293,8 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "df = D.features(D.instruments('csi100'), ['$close'], start_time='2010-01-01', end_time='2020-12-31')   \n",
-    "df.groupby('datetime').size().plot()"
+    "df = D.features(D.instruments(\"csi100\"), [\"$close\"], start_time=\"2010-01-01\", end_time=\"2020-12-31\")\n",
+    "df.groupby(\"datetime\").size().plot()"
   ]
  },
  {
@@ -313,8 +335,7 @@
    "    !cd ../../scripts/data_collector/pit/ && pip install -r requirements.txt\n",
    "    !cd ../../scripts/data_collector/pit/ && python collector.py download_data --source_dir ~/.qlib/stock_data/source/pit --start 2000-01-01 --end 2020-01-01 --interval quarterly --symbol_regex \"^(600519|000725).*\"\n",
    "    !cd ../../scripts/data_collector/pit/ && python collector.py normalize_data --interval quarterly --source_dir ~/.qlib/stock_data/source/pit --normalize_dir ~/.qlib/stock_data/source/pit_normalized\n",
-    "    !cd ../../scripts/ && python dump_pit.py dump --csv_path ~/.qlib/stock_data/source/pit_normalized --qlib_dir ~/.qlib/qlib_data/cn_data --interval quarterly\n",
-    "    pass"
+    "    !cd ../../scripts/ && python dump_pit.py dump --csv_path ~/.qlib/stock_data/source/pit_normalized --qlib_dir ~/.qlib/qlib_data/cn_data --interval quarterly"
   ]
  },
  {
@@ -338,7 +359,13 @@
   "outputs": [],
   "source": [
    "instruments = [\"sh600519\"]\n",
-    "data = D.features(instruments, ['P($$roewa_q)'], start_time=\"2019-01-01\", end_time=\"2019-07-19\", freq=\"day\")"
+    "data = D.features(\n",
+    "    instruments,\n",
+    "    [\"P($$roewa_q)\"],\n",
+    "    start_time=\"2019-01-01\",\n",
+    "    end_time=\"2019-07-19\",\n",
+    "    freq=\"day\",\n",
+    ")"
   ]
  },
  {
@@ -366,7 +393,10 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "D.features([\"sh600519\"], ['(EMA($close, 12) - EMA($close, 26))/$close - EMA((EMA($close, 12) - EMA($close, 26))/$close, 9)/$close'])"
+    "D.features(\n",
+    "    [\"sh600519\"],\n",
+    "    [\"(EMA($close, 12) - EMA($close, 26))/$close - EMA((EMA($close, 12) - EMA($close, 26))/$close, 9)/$close\"],\n",
+    ")"
   ]
  },
  {
@@ -418,7 +448,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "qdl = QlibDataLoader(config=(['$close / Ref($close, 10)'], ['RET10']))"
+    "qdl = QlibDataLoader(config=([\"$close / Ref($close, 10)\"], [\"RET10\"]))"
   ]
  },
  {
@@ -428,7 +458,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "qdl.load(instruments=['sh600519'], start_time='20190101', end_time='20191231')"
+    "qdl.load(instruments=[\"sh600519\"], start_time=\"20190101\", end_time=\"20191231\")"
   ]
  },
  {
@@ -456,7 +486,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "df = qdl.load(instruments=['sh600519'], start_time='20190101', end_time='20191231')"
+    "df = qdl.load(instruments=[\"sh600519\"], start_time=\"20190101\", end_time=\"20191231\")"
   ]
  },
  {
@@ -476,7 +506,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "df.plot(kind='hist')"
+    "df.plot(kind=\"hist\")"
   ]
  },
  {
@@ -508,9 +538,16 @@
   "source": [
    "# NOTE: normally, the training & validation time range will be  `fit_start_time` ， `fit_end_time`\n",
    "# however，all the components are decomposed, so the training & validation time range is unknown when preprocessing.\n",
-    "dh = DataHandlerLP(instruments=['sh600519'], start_time='20170101', end_time='20191231',\n",
-    "             infer_processors=[ZScoreNorm(fit_start_time='20170101', fit_end_time='20181231'), Fillna()],\n",
-    "             data_loader=qdl)"
+    "dh = DataHandlerLP(\n",
+    "    instruments=[\"sh600519\"],\n",
+    "    start_time=\"20170101\",\n",
+    "    end_time=\"20191231\",\n",
+    "    infer_processors=[\n",
+    "        ZScoreNorm(fit_start_time=\"20170101\", fit_end_time=\"20181231\"),\n",
+    "        Fillna(),\n",
+    "    ],\n",
+    "    data_loader=qdl,\n",
+    ")"
   ]
  },
  {
@@ -550,7 +587,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "df.plot(kind='hist')"
+    "df.plot(kind=\"hist\")"
   ]
  },
  {
@@ -586,7 +623,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "ds = DatasetH(dh, segments={\"train\": ('20180101', '20181231'), \"valid\": ('20190101', '20191231')})"
+    "ds = DatasetH(dh, segments={\"train\": (\"20180101\", \"20181231\"), \"valid\": (\"20190101\", \"20191231\")})"
   ]
  },
  {
@@ -596,7 +633,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "ds.prepare('train')"
+    "ds.prepare(\"train\")"
   ]
  },
  {
@@ -606,7 +643,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "ds.prepare('valid')"
+    "ds.prepare(\"valid\")"
   ]
  },
  {
@@ -628,8 +665,12 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "ds = TSDatasetH(step_len=10, handler=dh, segments={\"train\": ('20180101', '20181231'), \"valid\": ('20190101', '20191231')})\n",
-    "train_sampler = ds.prepare('train')"
+    "ds = TSDatasetH(\n",
+    "    step_len=10,\n",
+    "    handler=dh,\n",
+    "    segments={\"train\": (\"20180101\", \"20181231\"), \"valid\": (\"20190101\", \"20191231\")},\n",
+    ")\n",
+    "train_sampler = ds.prepare(\"train\")"
   ]
  },
  {
@@ -649,7 +690,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "train_sampler[0] # Retrieving the first example"
+    "train_sampler[0]  # Retrieving the first example"
   ]
  },
  {
@@ -659,7 +700,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "train_sampler['2018-01-08', 'sh600519']  # get the time series by <'timestamp', 'instrument_id'> index"
+    "train_sampler[\"2018-01-08\", \"sh600519\"]  # get the time series by <'timestamp', 'instrument_id'> index"
   ]
  },
  {
@@ -682,11 +723,11 @@
   "outputs": [],
   "source": [
    "handler_kwargs = {\n",
-    "        \"start_time\": \"2008-01-01\",\n",
-    "        \"end_time\": \"2020-08-01\",\n",
-    "        \"fit_start_time\": \"2008-01-01\",\n",
-    "        \"fit_end_time\": \"2014-12-31\",\n",
-    "        \"instruments\": MARKET,\n",
+    "    \"start_time\": \"2008-01-01\",\n",
+    "    \"end_time\": \"2020-08-01\",\n",
+    "    \"fit_start_time\": \"2008-01-01\",\n",
+    "    \"fit_end_time\": \"2014-12-31\",\n",
+    "    \"instruments\": MARKET,\n",
    "}\n",
    "handler_conf = {\n",
    "    \"class\": \"Alpha158\",\n",
@@ -735,6 +776,7 @@
   "outputs": [],
   "source": [
    "from qlib.contrib.data.handler import Alpha158\n",
+    "\n",
    "hd = Alpha158(**handler_kwargs)"
   ]
  },
@@ -826,7 +868,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "hd.process_type # appending type"
+    "hd.process_type  # appending type"
   ]
  },
  {
@@ -857,16 +899,16 @@
   "outputs": [],
   "source": [
    "dataset_conf = {\n",
-    "        \"class\": \"DatasetH\",\n",
-    "        \"module_path\": \"qlib.data.dataset\",\n",
-    "        \"kwargs\": {\n",
-    "            \"handler\": hd,\n",
-    "            \"segments\": {\n",
-    "                \"train\": (\"2008-01-01\", \"2014-12-31\"),\n",
-    "                \"valid\": (\"2015-01-01\", \"2016-12-31\"),\n",
-    "                \"test\": (\"2017-01-01\", \"2020-08-01\"),\n",
-    "            },\n",
+    "    \"class\": \"DatasetH\",\n",
+    "    \"module_path\": \"qlib.data.dataset\",\n",
+    "    \"kwargs\": {\n",
+    "        \"handler\": hd,\n",
+    "        \"segments\": {\n",
+    "            \"train\": (\"2008-01-01\", \"2014-12-31\"),\n",
+    "            \"valid\": (\"2015-01-01\", \"2016-12-31\"),\n",
+    "            \"test\": (\"2017-01-01\", \"2020-08-01\"),\n",
    "        },\n",
+    "    },\n",
    "}"
   ]
  },
@@ -908,7 +950,8 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "model = init_instance_by_config({\n",
+    "model = init_instance_by_config(\n",
+    "    {\n",
    "        \"class\": \"LGBModel\",\n",
    "        \"module_path\": \"qlib.contrib.model.gbdt\",\n",
    "        \"kwargs\": {\n",
@@ -922,7 +965,8 @@
    "            \"num_leaves\": 210,\n",
    "            \"num_threads\": 20,\n",
    "        },\n",
-    "})"
+    "    }\n",
+    ")"
   ]
  },
  {
@@ -938,7 +982,7 @@
    "    R.save_objects(trained_model=model)\n",
    "\n",
    "    rec = R.get_recorder()\n",
-    "    rid = rec.id # save the record id\n",
+    "    rid = rec.id  # save the record id\n",
    "\n",
    "    # Inference and saving signal\n",
    "    sr = SignalRecord(model, dataset, rec)\n",
@@ -1001,12 +1045,11 @@
    "\n",
    "# backtest and analysis\n",
    "with R.start(experiment_name=EXP_NAME, recorder_id=rid, resume=True):\n",
-    "\n",
    "    # signal-based analysis\n",
    "    rec = R.get_recorder()\n",
    "    sar = SigAnaRecord(rec)\n",
    "    sar.generate()\n",
-    "    \n",
+    "\n",
    "    #  portfolio-based analysis: backtest\n",
    "    par = PortAnaRecord(rec, port_analysis_config, \"day\")\n",
    "    par.generate()"
@@ -1137,7 +1180,7 @@
   "outputs": [],
   "source": [
    "label_df = dataset.prepare(\"test\", col_set=\"label\")\n",
-    "label_df.columns = ['label']"
+    "label_df.columns = [\"label\"]"
   ]
  },
  {
--- a/examples/workflow_by_code.ipynb
+++ b/examples/workflow_by_code.ipynb
@@ -38,7 +38,7 @@
    "    # install qlib\n",
    "    ! pip install --upgrade numpy\n",
    "    ! pip install pyqlib\n",
-    "    if 'google.colab' in sys.modules:\n",
+    "    if \"google.colab\" in sys.modules:\n",
    "        # The Google colab environment is a little outdated. We have to downgrade the pyyaml to make it compatible with other packages\n",
    "        ! pip install pyyaml==5.4.1\n",
    "    # reload\n",
@@ -50,7 +50,8 @@
    "    scripts_dir = Path(\"~/tmp/qlib_code/scripts\").expanduser().resolve()\n",
    "    scripts_dir.mkdir(parents=True, exist_ok=True)\n",
    "    import requests\n",
-    "    with requests.get(\"https://raw.githubusercontent.com/microsoft/qlib/main/scripts/get_data.py\") as resp:\n",
+    "\n",
+    "    with requests.get(\"https://raw.githubusercontent.com/microsoft/qlib/main/scripts/get_data.py\", timeout=10) as resp:\n",
    "        with open(scripts_dir.joinpath(\"get_data.py\"), \"wb\") as fp:\n",
    "            fp.write(resp.content)"
   ]
@@ -61,14 +62,13 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "\n",
    "import qlib\n",
    "import pandas as pd\n",
    "from qlib.constant import REG_CN\n",
    "from qlib.utils import exists_qlib_data, init_instance_by_config\n",
    "from qlib.workflow import R\n",
    "from qlib.workflow.record_temp import SignalRecord, PortAnaRecord\n",
-    "from qlib.utils import flatten_dict\n"
+    "from qlib.utils import flatten_dict"
   ]
  },
  {
@@ -86,6 +86,7 @@
    "    print(f\"Qlib data is not found in {provider_uri}\")\n",
    "    sys.path.append(str(scripts_dir))\n",
    "    from get_data import GetData\n",
+    "\n",
    "    GetData().qlib_data(target_dir=provider_uri, region=REG_CN)\n",
    "qlib.init(provider_uri=provider_uri, region=REG_CN)"
   ]
@@ -169,7 +170,7 @@
    "    R.log_params(**flatten_dict(task))\n",
    "    model.fit(dataset)\n",
    "    R.save_objects(trained_model=model)\n",
-    "    rid = R.get_recorder().id\n"
+    "    rid = R.get_recorder().id"
   ]
  },
  {
@@ -238,7 +239,7 @@
    "\n",
    "    # backtest & analysis\n",
    "    par = PortAnaRecord(recorder, port_analysis_config, \"day\")\n",
-    "    par.generate()\n"
+    "    par.generate()"
   ]
  },
  {
@@ -256,6 +257,7 @@
   "source": [
    "from qlib.contrib.report import analysis_model, analysis_position\n",
    "from qlib.data import D\n",
+    "\n",
    "recorder = R.get_recorder(recorder_id=ba_rid, experiment_name=\"backtest_analysis\")\n",
    "print(recorder)\n",
    "pred_df = recorder.load_object(\"pred.pkl\")\n",
@@ -317,7 +319,7 @@
   "outputs": [],
   "source": [
    "label_df = dataset.prepare(\"test\", col_set=\"label\")\n",
-    "label_df.columns = ['label']"
+    "label_df.columns = [\"label\"]"
   ]
  },
  {
--- a/qlib/__init__.py
+++ b/qlib/__init__.py
@@ -2,7 +2,7 @@
 # Licensed under the MIT License.
 from pathlib import Path

-__version__ = "0.9.1"
+__version__ = "0.9.2"
 __version__bak = __version__  # This version is backup for QlibConfig.reset_qlib_version
 import os
 from typing import Union
--- a/qlib/backtest/__init__.py
+++ b/qlib/backtest/__init__.py
@@ -40,8 +40,8 @@ def get_exchange(
    open_cost: float = 0.0015,
    close_cost: float = 0.0025,
    min_cost: float = 5.0,
-    limit_threshold: Union[Tuple[str, str], float, None] = None,
-    deal_price: Union[str, Tuple[str, str], List[str]] = None,
+    limit_threshold: Union[Tuple[str, str], float, None] | None = None,
+    deal_price: Union[str, Tuple[str, str], List[str]] | None = None,
    **kwargs: Any,
 ) -> Exchange:
    """get_exchange
@@ -284,7 +284,7 @@ def collect_data(
    account: Union[float, int, dict] = 1e9,
    exchange_kwargs: dict = {},
    pos_type: str = "Position",
-    return_value: dict = None,
+    return_value: dict | None = None,
 ) -> Generator[object, None, None]:
    """initialize the strategy and executor, then collect the trade decision data for rl training

--- a/qlib/backtest/account.py
+++ b/qlib/backtest/account.py
@@ -152,7 +152,9 @@ class Account:
        # trading related metrics(e.g. high-frequency trading)
        self.indicator = Indicator()

-    def reset(self, freq: str = None, benchmark_config: dict = None, port_metr_enabled: bool = None) -> None:
+    def reset(
+        self, freq: str | None = None, benchmark_config: dict | None = None, port_metr_enabled: bool | None = None
+    ) -> None:
        """reset freq and report of account

        Parameters
--- a/qlib/backtest/backtest.py
+++ b/qlib/backtest/backtest.py
@@ -55,7 +55,7 @@ def collect_data_loop(
    end_time: Union[pd.Timestamp, str],
    trade_strategy: BaseStrategy,
    trade_executor: BaseExecutor,
-    return_value: dict = None,
+    return_value: dict | None = None,
 ) -> Generator[BaseTradeDecision, Optional[BaseTradeDecision], None]:
    """Generator for collecting the trade decision data for rl training

--- a/qlib/backtest/decision.py
+++ b/qlib/backtest/decision.py
@@ -254,7 +254,7 @@ class IdxTradeRange(TradeRange):
        self._start_idx = start_idx
        self._end_idx = end_idx

-    def __call__(self, trade_calendar: TradeCalendarManager = None) -> Tuple[int, int]:
+    def __call__(self, trade_calendar: TradeCalendarManager | None = None) -> Tuple[int, int]:
        return self._start_idx, self._end_idx

    def clip_time_range(self, start_time: pd.Timestamp, end_time: pd.Timestamp) -> Tuple[pd.Timestamp, pd.Timestamp]:
@@ -315,7 +315,7 @@ class BaseTradeDecision(Generic[DecisionType]):
        2. Same as `case 1.3`
    """

-    def __init__(self, strategy: BaseStrategy, trade_range: Union[Tuple[int, int], TradeRange] = None) -> None:
+    def __init__(self, strategy: BaseStrategy, trade_range: Union[Tuple[int, int], TradeRange, None] = None) -> None:
        """
        Parameters
        ----------
@@ -554,7 +554,7 @@ class TradeDecisionWO(BaseTradeDecision[Order]):
        self,
        order_list: List[Order],
        strategy: BaseStrategy,
-        trade_range: Union[Tuple[int, int], TradeRange] = None,
+        trade_range: Union[Tuple[int, int], TradeRange, None] = None,
    ) -> None:
        super().__init__(strategy, trade_range=trade_range)
        self.order_list = cast(List[Order], order_list)
--- a/qlib/backtest/exchange.py
+++ b/qlib/backtest/exchange.py
@@ -41,10 +41,10 @@ class Exchange:
        start_time: Union[pd.Timestamp, str] = None,
        end_time: Union[pd.Timestamp, str] = None,
        codes: Union[list, str] = "all",
-        deal_price: Union[str, Tuple[str, str], List[str]] = None,
+        deal_price: Union[str, Tuple[str, str], List[str], None] = None,
        subscribe_fields: list = [],
        limit_threshold: Union[Tuple[str, str], float, None] = None,
-        volume_threshold: Union[tuple, dict] = None,
+        volume_threshold: Union[tuple, dict, None] = None,
        open_cost: float = 0.0015,
        close_cost: float = 0.0025,
        min_cost: float = 5.0,
@@ -340,7 +340,7 @@ class Exchange:
        stock_id: str,
        start_time: pd.Timestamp,
        end_time: pd.Timestamp,
-        direction: int = None,
+        direction: int | None = None,
    ) -> bool:
        """
        Parameters
@@ -406,7 +406,7 @@ class Exchange:
        stock_id: str,
        start_time: pd.Timestamp,
        end_time: pd.Timestamp,
-        direction: int = None,
+        direction: int | None = None,
    ) -> bool:
        # check if stock can be traded
        return not (
@@ -421,8 +421,8 @@ class Exchange:
    def deal_order(
        self,
        order: Order,
-        trade_account: Account = None,
-        position: BasePosition = None,
+        trade_account: Account | None = None,
+        position: BasePosition | None = None,
        dealt_order_amount: Dict[str, float] = defaultdict(float),
    ) -> Tuple[float, float, float]:
        """
@@ -586,7 +586,7 @@ class Exchange:
                )
        return amount_dict

-    def get_real_deal_amount(self, current_amount: float, target_amount: float, factor: float = None) -> float:
+    def get_real_deal_amount(self, current_amount: float, target_amount: float, factor: float | None = None) -> float:
        """
        Calculate the real adjust deal amount when considering the trading unit
        :param current_amount:
@@ -712,8 +712,8 @@ class Exchange:

    def _get_factor_or_raise_error(
        self,
-        factor: float = None,
-        stock_id: str = None,
+        factor: float | None = None,
+        stock_id: str | None = None,
        start_time: pd.Timestamp = None,
        end_time: pd.Timestamp = None,
    ) -> float:
@@ -728,8 +728,8 @@ class Exchange:

    def get_amount_of_trade_unit(
        self,
-        factor: float = None,
-        stock_id: str = None,
+        factor: float | None = None,
+        stock_id: str | None = None,
        start_time: pd.Timestamp = None,
        end_time: pd.Timestamp = None,
    ) -> Optional[float]:
@@ -762,8 +762,8 @@ class Exchange:
    def round_amount_by_trade_unit(
        self,
        deal_amount: float,
-        factor: float = None,
-        stock_id: str = None,
+        factor: float | None = None,
+        stock_id: str | None = None,
        start_time: pd.Timestamp = None,
        end_time: pd.Timestamp = None,
    ) -> float:
--- a/qlib/backtest/executor.py
+++ b/qlib/backtest/executor.py
@@ -31,8 +31,8 @@ class BaseExecutor:
        generate_portfolio_metrics: bool = False,
        verbose: bool = False,
        track_data: bool = False,
-        trade_exchange: Exchange = None,
-        common_infra: CommonInfrastructure = None,
+        trade_exchange: Exchange | None = None,
+        common_infra: CommonInfrastructure | None = None,
        settle_type: str = BasePosition.ST_NO,
        **kwargs: Any,
    ) -> None:
@@ -161,7 +161,7 @@ class BaseExecutor:
        """
        return self.level_infra.get("trade_calendar")

-    def reset(self, common_infra: CommonInfrastructure = None, **kwargs: Any) -> None:
+    def reset(self, common_infra: CommonInfrastructure | None = None, **kwargs: Any) -> None:
        """
        - reset `start_time` and `end_time`, used in trade calendar
        - reset `common_infra`, used to reset `trade_account`, `trade_exchange`, .etc
@@ -227,7 +227,7 @@ class BaseExecutor:
    def collect_data(
        self,
        trade_decision: BaseTradeDecision,
-        return_value: dict = None,
+        return_value: dict | None = None,
        level: int = 0,
    ) -> Generator[Any, Any, List[object]]:
        """Generator for collecting the trade decision data for rl training
@@ -327,7 +327,7 @@ class NestedExecutor(BaseExecutor):
        track_data: bool = False,
        skip_empty_decision: bool = True,
        align_range_limit: bool = True,
-        common_infra: CommonInfrastructure = None,
+        common_infra: CommonInfrastructure | None = None,
        **kwargs: Any,
    ) -> None:
        """
@@ -534,7 +534,7 @@ class SimulatorExecutor(BaseExecutor):
        generate_portfolio_metrics: bool = False,
        verbose: bool = False,
        track_data: bool = False,
-        common_infra: CommonInfrastructure = None,
+        common_infra: CommonInfrastructure | None = None,
        trade_type: str = TT_SERIAL,
        **kwargs: Any,
    ) -> None:
--- a/qlib/backtest/position.py
+++ b/qlib/backtest/position.py
@@ -1,6 +1,7 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.

+from __future__ import annotations

 from datetime import timedelta
 from typing import Any, Dict, List, Union
@@ -320,7 +321,7 @@ class Position(BasePosition):
            self.position[stock]["price"] = price_dict[stock]
        self.position["now_account_value"] = self.calculate_value()

-    def _init_stock(self, stock_id: str, amount: float, price: float = None) -> None:
+    def _init_stock(self, stock_id: str, amount: float, price: float | None = None) -> None:
        """
        initialization the stock in current position

--- a/qlib/backtest/report.py
+++ b/qlib/backtest/report.py
@@ -1,6 +1,7 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.

+from __future__ import annotations

 import pathlib
 from collections import OrderedDict
@@ -86,7 +87,7 @@ class PortfolioMetrics:
        self.benches: dict = OrderedDict()
        self.latest_pm_time: Optional[pd.TimeStamp] = None

-    def init_bench(self, freq: str = None, benchmark_config: dict = None) -> None:
+    def init_bench(self, freq: str | None = None, benchmark_config: dict | None = None) -> None:
        if freq is not None:
            self.freq = freq
        self.benchmark_config = benchmark_config
@@ -149,15 +150,15 @@ class PortfolioMetrics:
        self,
        trade_start_time: Union[str, pd.Timestamp] = None,
        trade_end_time: Union[str, pd.Timestamp] = None,
-        account_value: float = None,
-        cash: float = None,
-        return_rate: float = None,
-        total_turnover: float = None,
-        turnover_rate: float = None,
-        total_cost: float = None,
-        cost_rate: float = None,
-        stock_value: float = None,
-        bench_value: float = None,
+        account_value: float | None = None,
+        cash: float | None = None,
+        return_rate: float | None = None,
+        total_turnover: float | None = None,
+        turnover_rate: float | None = None,
+        total_cost: float | None = None,
+        cost_rate: float | None = None,
+        stock_value: float | None = None,
+        bench_value: float | None = None,
    ) -> None:
        # check data
        if None in [
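
Note on the `X | None` annotations introduced in these files: the CI matrix still targets Python 3.7 and 3.8, where evaluating `float | None` raises a `TypeError`. The syntax is only safe there because `from __future__ import annotations` (added to `position.py` and `report.py` in this diff) stores annotations as strings instead of evaluating them. The sketch below shows this. The function `init_stock` is a toy stand-in, and it is an assumption that every other module touched here already carries the same import.

```python
from __future__ import annotations  # PEP 563: annotations are stored as strings


def init_stock(stock_id: str, amount: float, price: float | None = None) -> None:
    """Toy stand-in for a method annotated with `float | None`."""


# The annotation is kept as source text and is not evaluated at definition time,
# so `float | None` does not raise on interpreters older than Python 3.10.
print(init_stock.__annotations__["price"])
```

Code that resolves the annotations at runtime, for example through `typing.get_type_hints`, would still evaluate the expression and fail on interpreters older than 3.10.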
--- a/qlib/backtest/utils.py
+++ b/qlib/backtest/utils.py
@@ -31,7 +31,7 @@ class TradeCalendarManager:
        freq: str,
        start_time: Union[str, pd.Timestamp] = None,
        end_time: Union[str, pd.Timestamp] = None,
-        level_infra: LevelInfrastructure = None,
+        level_infra: LevelInfrastructure | None = None,
    ) -> None:
        """
        Parameters
@@ -99,7 +99,7 @@ class TradeCalendarManager:
    def get_trade_step(self) -> int:
        return self.trade_step

-    def get_step_time(self, trade_step: int = None, shift: int = 0) -> Tuple[pd.Timestamp, pd.Timestamp]:
+    def get_step_time(self, trade_step: int | None = None, shift: int = 0) -> Tuple[pd.Timestamp, pd.Timestamp]:
        """
        Get the left and right endpoints of the trade_step'th trading interval

--- a/qlib/config.py
+++ b/qlib/config.py
@@ -147,6 +147,7 @@ _default_config = {
    "redis_host": "127.0.0.1",
    "redis_port": 6379,
    "redis_task_db": 1,
+    "redis_password": None,
    # This value can be reset via qlib.init
    "logging_level": logging.INFO,
    # Global configuration of qlib log
--- a/qlib/contrib/data/handler.py
+++ b/qlib/contrib/data/handler.py
@@ -56,7 +56,7 @@ class Alpha360(DataHandlerLP):
        fit_start_time=None,
        fit_end_time=None,
        filter_pipe=None,
-        inst_processor=None,
+        inst_processors=None,
        **kwargs
    ):
        infer_processors = check_transform_proc(infer_processors, fit_start_time, fit_end_time)
@@ -71,7 +71,7 @@ class Alpha360(DataHandlerLP):
                },
                "filter_pipe": filter_pipe,
                "freq": freq,
-                "inst_processor": inst_processor,
+                "inst_processors": inst_processors,
            },
        }

@@ -152,7 +152,7 @@ class Alpha158(DataHandlerLP):
        fit_end_time=None,
        process_type=DataHandlerLP.PTYPE_A,
        filter_pipe=None,
-        inst_processor=None,
+        inst_processors=None,
        **kwargs
    ):
        infer_processors = check_transform_proc(infer_processors, fit_start_time, fit_end_time)
@@ -167,7 +167,7 @@ class Alpha158(DataHandlerLP):
                },
                "filter_pipe": filter_pipe,
                "freq": freq,
-                "inst_processor": inst_processor,
+                "inst_processors": inst_processors,
            },
        }
        super().__init__(
--- a/qlib/contrib/data/highfreq_handler.py
+++ b/qlib/contrib/data/highfreq_handler.py
@@ -44,7 +44,7 @@ class HighFreqHandler(DataHandlerLP):
        names = []

        template_if = "If(IsNull({1}), {0}, {1})"
-        template_paused = "Select(Gt($hx_paused_num, 1.001), {0})"
+        template_paused = "Select(Gt($paused_num, 1.001), {0})"

        def get_normalized_price_feature(price_field, shift=0):
            # norm with the close price of 237th minute of yesterday.
@@ -115,6 +115,7 @@ class HighFreqGeneralHandler(DataHandlerLP):
        day_length=240,
        freq="1min",
        columns=["$open", "$high", "$low", "$close", "$vwap"],
+        inst_processors=None,
    ):
        self.day_length = day_length
        self.columns = columns
@@ -128,6 +129,7 @@ class HighFreqGeneralHandler(DataHandlerLP):
                "config": self.get_feature_config(),
                "swap_level": False,
                "freq": freq,
+                "inst_processors": inst_processors,
            },
        }
        super().__init__(
@@ -257,6 +259,7 @@ class HighFreqGeneralBacktestHandler(DataHandler):
        day_length=240,
        freq="1min",
        columns=["$close", "$vwap", "$volume"],
+        inst_processors=None,
    ):
        self.day_length = day_length
        self.columns = set(columns)
@@ -266,6 +269,7 @@ class HighFreqGeneralBacktestHandler(DataHandler):
                "config": self.get_feature_config(),
                "swap_level": False,
                "freq": freq,
+                "inst_processors": inst_processors,
            },
        }
        super().__init__(
@@ -311,6 +315,7 @@ class HighFreqOrderHandler(DataHandlerLP):
        learn_processors=[],
        fit_start_time=None,
        fit_end_time=None,
+        inst_processors=None,
        drop_raw=True,
    ):

@@ -323,6 +328,7 @@ class HighFreqOrderHandler(DataHandlerLP):
                "config": self.get_feature_config(),
                "swap_level": False,
                "freq": "1min",
+                "inst_processors": inst_processors,
            },
        }
        super().__init__(
@@ -482,7 +488,7 @@ class HighFreqBacktestOrderHandler(DataHandler):
        names = []

        template_if = "If(IsNull({1}), {0}, {1})"
-        template_paused = "Select(Gt($hx_paused_num, 1.001), {0})"
+        template_paused = "Select(Gt($paused_num, 1.001), {0})"
        template_fillnan = "FFillNan({0})"
        fields += [
            template_fillnan.format(template_paused.format("$close")),
--- a/qlib/contrib/data/highfreq_provider.py
+++ b/qlib/contrib/data/highfreq_provider.py
@@ -128,7 +128,7 @@ class HighFreqProvider:
            raise ValueError("Must specify the path to save the dataset.") from e
        if os.path.isfile(path):
            start = time.time()
-            self.logger.info("Dataset exists, load from disk.", __name__)
+            self.logger.info(f"[{__name__}]Dataset exists, load from disk.")

            # res = dataset.prepare(['train', 'valid', 'test'])
            with open(path, "rb") as f:
@@ -137,11 +137,11 @@ class HighFreqProvider:
                res = [data[i] for i in datasets]
            else:
                res = data.prepare(datasets)
-            self.logger.info(f"Data loaded, time cost: {time.time() - start:.2f}", __name__)
+            self.logger.info(f"[{__name__}]Data loaded, time cost: {time.time() - start:.2f}")
        else:
            if not os.path.exists(os.path.dirname(path)):
                os.makedirs(os.path.dirname(path))
-            self.logger.info("Generating dataset", __name__)
+            self.logger.info(f"[{__name__}]Generating dataset")
            start_time = time.time()
            self._prepare_calender_cache()
            dataset = init_instance_by_config(config)
@@ -160,7 +160,7 @@ class HighFreqProvider:
            with open(path[:-4] + "test.pkl", "wb") as f:
                pkl.dump(testset, f)
            res = [data[i] for i in datasets]
-            self.logger.info(f"Data generated, time cost: {(time.time() - start_time):.2f}", __name__)
+            self.logger.info(f"[{__name__}]Data generated, time cost: {(time.time() - start_time):.2f}")
        return res

    def _gen_data(self, config, datasets=["train", "valid", "test"]):
@@ -170,7 +170,7 @@ class HighFreqProvider:
            raise ValueError("Must specify the path to save the dataset.") from e
        if os.path.isfile(path):
            start = time.time()
-            self.logger.info("Dataset exists, load from disk.", __name__)
+            self.logger.info(f"[{__name__}]Dataset exists, load from disk.")

            # res = dataset.prepare(['train', 'valid', 'test'])
            with open(path, "rb") as f:
@@ -179,18 +179,18 @@ class HighFreqProvider:
                res = [data[i] for i in datasets]
            else:
                res = data.prepare(datasets)
-            self.logger.info(f"Data loaded, time cost: {time.time() - start:.2f}", __name__)
+            self.logger.info(f"[{__name__}]Data loaded, time cost: {time.time() - start:.2f}")
        else:
            if not os.path.exists(os.path.dirname(path)):
                os.makedirs(os.path.dirname(path))
-            self.logger.info("Generating dataset", __name__)
+            self.logger.info(f"[{__name__}]Generating dataset")
            start_time = time.time()
            self._prepare_calender_cache()
            dataset = init_instance_by_config(config)
            dataset.config(dump_all=True, recursive=True)
            dataset.to_pickle(path)
            res = dataset.prepare(datasets)
-            self.logger.info(f"Data generated, time cost: {(time.time() - start_time):.2f}", __name__)
+            self.logger.info(f"[{__name__}]Data generated, time cost: {(time.time() - start_time):.2f}")
        return res

    def _gen_dataset(self, config):
@@ -200,21 +200,21 @@ class HighFreqProvider:
            raise ValueError("Must specify the path to save the dataset.") from e
        if os.path.isfile(path):
            start = time.time()
-            self.logger.info("Dataset exists, load from disk.", __name__)
+            self.logger.info(f"[{__name__}]Dataset exists, load from disk.")

            with open(path, "rb") as f:
                dataset = pkl.load(f)
-            self.logger.info(f"Data loaded, time cost: {time.time() - start:.2f}", __name__)
+            self.logger.info(f"[{__name__}]Data loaded, time cost: {time.time() - start:.2f}")
        else:
            start = time.time()
            if not os.path.exists(os.path.dirname(path)):
                os.makedirs(os.path.dirname(path))
-            self.logger.info("Generating dataset", __name__)
+            self.logger.info(f"[{__name__}]Generating dataset")
            self._prepare_calender_cache()
            dataset = init_instance_by_config(config)
-            self.logger.info(f"Dataset init, time cost: {time.time() - start:.2f}", __name__)
+            self.logger.info(f"[{__name__}]Dataset init, time cost: {time.time() - start:.2f}")
            dataset.prepare(["train", "valid", "test"])
-            self.logger.info(f"Dataset prepared, time cost: {time.time() - start:.2f}", __name__)
+            self.logger.info(f"[{__name__}]Dataset prepared, time cost: {time.time() - start:.2f}")
            dataset.config(dump_all=True, recursive=True)
            dataset.to_pickle(path)
        return dataset
@@ -227,15 +227,15 @@ class HighFreqProvider:

        if os.path.isfile(path + "tmp_dataset.pkl"):
            start = time.time()
-            self.logger.info("Dataset exists, load from disk.", __name__)
+            self.logger.info(f"[{__name__}]Dataset exists, load from disk.")
        else:
            start = time.time()
            if not os.path.exists(os.path.dirname(path)):
                os.makedirs(os.path.dirname(path))
-            self.logger.info("Generating dataset", __name__)
+            self.logger.info(f"[{__name__}]Generating dataset")
            self._prepare_calender_cache()
            dataset = init_instance_by_config(config)
-            self.logger.info(f"Dataset init, time cost: {time.time() - start:.2f}", __name__)
+            self.logger.info(f"[{__name__}]Dataset init, time cost: {time.time() - start:.2f}")
            dataset.config(dump_all=False, recursive=True)
            dataset.to_pickle(path + "tmp_dataset.pkl")

@@ -268,15 +268,15 @@ class HighFreqProvider:

        if os.path.isfile(path + "tmp_dataset.pkl"):
            start = time.time()
-            self.logger.info("Dataset exists, load from disk.", __name__)
+            self.logger.info(f"[{__name__}]Dataset exists, load from disk.")
        else:
            start = time.time()
            if not os.path.exists(os.path.dirname(path)):
                os.makedirs(os.path.dirname(path))
-            self.logger.info("Generating dataset", __name__)
+            self.logger.info(f"[{__name__}]Generating dataset")
            self._prepare_calender_cache()
            dataset = init_instance_by_config(config)
-            self.logger.info(f"Dataset init, time cost: {time.time() - start:.2f}", __name__)
+            self.logger.info(f"[{__name__}]Dataset init, time cost: {time.time() - start:.2f}")
            dataset.config(dump_all=False, recursive=True)
            dataset.to_pickle(path + "tmp_dataset.pkl")

--- a/qlib/contrib/meta/data_selection/dataset.py
+++ b/qlib/contrib/meta/data_selection/dataset.py
@@ -55,8 +55,10 @@ class InternalData:
        # The handler is initialized for only once.
        if not trainer.has_worker():
            self.dh = init_task_handler(perf_task_tpl)
+            self.dh.config(dump_all=False)  # in some cases, the data handler is saved to disk with `dump_all=True`
        else:
            self.dh = init_instance_by_config(perf_task_tpl["dataset"]["kwargs"]["handler"])
+        assert self.dh.dump_all is False  # otherwise, it will save all the detailed data

        seg = perf_task_tpl["dataset"]["kwargs"]["segments"]

@@ -77,7 +79,7 @@ class InternalData:
            get_module_logger("Internal Data").info("the data has been initialized")
        else:
            # train new models
-            assert 0 == len(recorders), "An empty experiment is required for setup `InternalData``"
+            assert 0 == len(recorders), "An empty experiment is required for setup `InternalData`"
            trainer.train(gen_task)

        # 2) extract the similarity matrix
@@ -119,6 +121,7 @@ class MetaTaskDS(MetaTask):

    def __init__(self, task: dict, meta_info: pd.DataFrame, mode: str = MetaTask.PROC_MODE_FULL, fill_method="max"):
        """
+
        The description of the processed data

            time_perf: A array with shape  <hist_step_n * step, data pieces>  ->  data piece performance
@@ -132,6 +135,10 @@ class MetaTaskDS(MetaTask):
                   [0., 0., 0., ..., 0., 0., 1.],
                   [0., 0., 0., ..., 0., 0., 1.]])

+        Parameters
+        ----------
+        meta_info: pd.DataFrame
+            please refer to the docs of _prepare_meta_ipt for detailed explanation.
        """
        super().__init__(task, meta_info)
        self.fill_method = fill_method
@@ -180,12 +187,41 @@ class MetaTaskDS(MetaTask):
        self.processed_meta_input = data_to_tensor(self.processed_meta_input)

    def _get_processed_meta_info(self):
-        meta_info_norm = self.meta_info.sub(self.meta_info.mean(axis=1), axis=0)  # .fillna(0.)
-        if self.fill_method == "max":
-            meta_info_norm = meta_info_norm.T.fillna(
-                meta_info_norm.max(axis=1)
-            ).T  # fill it with row max to align with previous implementation
+        meta_info_norm = self.meta_info.sub(self.meta_info.mean(axis=1), axis=0)
+        if self.fill_method.startswith("max"):
+            suffix = self.fill_method[len("max") :]
+            if suffix == "seg":
+                fill_value = {}
+                for col in meta_info_norm.columns:
+                    fill_value[col] = meta_info_norm.loc[meta_info_norm[col].isna(), :].dropna(axis=1).mean().max()
+                fill_value = pd.Series(fill_value).sort_index()
+                # The NaN values are filled segment-wise. Below is an example of fill_value
+                # 2009-01-05  2009-02-06    0.145809
+                # 2009-02-09  2009-03-06    0.148005
+                # 2009-03-09  2009-04-03    0.090385
+                # 2009-04-07  2009-05-05    0.114318
+                # 2009-05-06  2009-06-04    0.119328
+                # ...
+                meta_info_norm = meta_info_norm.fillna(fill_value)
+            else:
+                if len(suffix) > 0:
+                    get_module_logger("MetaTaskDS").warning(
+                        f"fill_method={self.fill_method}; the info after can't be correctly parsed. Please check your parameters."
+                    )
+                fill_value = meta_info_norm.max(axis=1)
+                # fill it with row max to align with previous implementation
+                # This will magnify the data similarity when data is in daily freq
+
+                # the fill value corresponds to data like this
+                # It gets a performance value for each day.
+                # The performance values are obtained from other models on this day
+                # 2009-01-16    0.276320
+                # 2009-01-19    0.280603
+                #                 ...
+                # 2011-06-27    0.203773
+                meta_info_norm = meta_info_norm.T.fillna(fill_value).T
        elif self.fill_method == "zero":
+            # It will fillna(0.0) at the end.
            pass
        else:
            raise NotImplementedError(f"This type of input is not supported")
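Reviewer note on parsing the `fill_method` suffix: when the `max` prefix is split off, keep in mind that `str.lstrip` removes any leading characters from a set rather than a literal prefix. It happens to give `seg` for `maxseg`, but it can swallow the start of other suffixes. A small sketch of the difference, with `split_fill_method` as a hypothetical helper:

```python
def split_fill_method(fill_method: str, prefix: str = "max") -> str:
    """Return the text after `prefix`, or raise if the prefix is absent."""
    if not fill_method.startswith(prefix):
        raise ValueError(f"{fill_method!r} does not start with {prefix!r}")
    return fill_method[len(prefix):]


# str.lstrip treats its argument as a set of characters, not a literal prefix.
print("maxseg".lstrip("max"))       # seg
print("maxamp".lstrip("max"))       # p
print(split_fill_method("maxamp"))  # amp
```

Slicing by the prefix length (or `str.removeprefix` on Python 3.9+) keeps the suffix intact for every input.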
@@ -286,7 +322,33 @@ class MetaDatasetDS(MetaTaskDataset):
                logger.warning(f"ValueError: {e}")
        assert len(self.meta_task_l) > 0, "No meta tasks found. Please check the data and setting"

-    def _prepare_meta_ipt(self, task):
+    def _prepare_meta_ipt(self, task) -> pd.DataFrame:
+        """
+        Please refer to `self.internal_data.setup` for detailed information about `self.internal_data.data_ic_df`
+
+        Indices with the format below can be sliced by `ic_df.loc[:end, pd.IndexSlice[:, :end]]`
+
+               2021-06-21 2021-06-04 .. 2021-03-22 2021-03-08
+               2021-07-02 2021-06-18 .. 2021-04-02 None
+
+        Returns
+        -------
+            a pd.DataFrame with content similar to the example below.
+            - each column corresponds to a trained model named by the training data range
+            - each row corresponds to a day of data tested by the models of the columns
+            - cells in rows that overlap with the data used by the column's model are masked
+
+
+                       2009-01-05 2009-02-09 ... 2011-04-27 2011-05-26
+                       2009-02-06 2009-03-06 ... 2011-05-25 2011-06-23
+            datetime                         ...
+            2009-01-13        NaN   0.310639 ...  -0.169057   0.137792
+            2009-01-14        NaN   0.261086 ...  -0.143567   0.082581
+            ...               ...        ... ...        ...        ...
+            2011-06-30  -0.054907  -0.020219 ...  -0.023226        NaN
+            2011-07-01  -0.075762  -0.026626 ...  -0.003167        NaN
+
+        """
        ic_df = self.internal_data.data_ic_df

        segs = task["dataset"]["kwargs"]["segments"]
@@ -294,15 +356,19 @@ class MetaDatasetDS(MetaTaskDataset):
        ic_df_avail = ic_df.loc[:end, pd.IndexSlice[:, :end]]

        # meta data set focus on the **information** instead of preprocess
-        # 1) filter the future info
-        def mask_future(s):
-            """mask future information"""
-            # from qlib.utils import get_date_by_shift
+        # 1) filter the overlap info
+        def mask_overlap(s):
+            """
+            Mask overlapping information.
+            Data within `self.trunc_days` after the segment end (`s.name[1]`) contains future info
+            and is also treated as overlap.
+
+            Approximately the diagonal plus the horizon length of data is masked.
+            """
            start, end = s.name
            end = get_date_by_shift(trading_date=end, shift=self.trunc_days - 1, future=True)
            return s.mask((s.index >= start) & (s.index <= end))

-        ic_df_avail = ic_df_avail.apply(mask_future)  # apply to each col
+        ic_df_avail = ic_df_avail.apply(mask_overlap)  # apply to each col

        # 2) filter the info with too long periods
        total_len = self.step * self.hist_step_n
--- a/qlib/contrib/meta/data_selection/model.py
+++ b/qlib/contrib/meta/data_selection/model.py
@@ -52,6 +52,7 @@ class MetaModelDS(MetaTaskModel):
        lr=0.0001,
        max_epoch=100,
        seed=43,
+        alpha=0.0,
    ):
        self.step = step
        self.hist_step_n = hist_step_n
@@ -61,6 +62,7 @@ class MetaModelDS(MetaTaskModel):
        self.lr = lr
        self.max_epoch = max_epoch
        self.fitted = False
+        self.alpha = alpha
        torch.manual_seed(seed)

    def run_epoch(self, phase, task_list, epoch, opt, loss_l, ignore_weight=False):
@@ -144,7 +146,11 @@ class MetaModelDS(MetaTaskModel):
            )  # debug: record when the test phase starts

        self.tn = PredNet(
-            step=self.step, hist_step_n=self.hist_step_n, clip_weight=self.clip_weight, clip_method=self.clip_method
+            step=self.step,
+            hist_step_n=self.hist_step_n,
+            clip_weight=self.clip_weight,
+            clip_method=self.clip_method,
+            alpha=self.alpha,
        )

        opt = optim.Adam(self.tn.parameters(), lr=self.lr)
--- a/qlib/contrib/meta/data_selection/net.py
+++ b/qlib/contrib/meta/data_selection/net.py
@@ -41,11 +41,18 @@ class TimeWeightMeta(SingleMetaBase):


 class PredNet(nn.Module):
-    def __init__(self, step, hist_step_n, clip_weight=None, clip_method="tanh"):
+    def __init__(self, step, hist_step_n, clip_weight=None, clip_method="tanh", alpha: float = 0.0):
+        """
+        Parameters
+        ----------
+        alpha : float
+            the regularization strength for the sub-model (useful when aligning the meta model with a linear sub-model)
+        """
        super().__init__()
        self.step = step
        self.twm = TimeWeightMeta(hist_step_n=hist_step_n, clip_weight=clip_weight, clip_method=clip_method)
        self.init_paramters(hist_step_n)
+        self.alpha = alpha

    def get_sample_weights(self, X, time_perf, time_belong, ignore_weight=False):
        weights = torch.from_numpy(np.ones(X.shape[0])).float().to(X.device)
@@ -59,7 +66,7 @@ class PredNet(nn.Module):
        """Please refer to the docs of MetaTaskDS for the description of the variables"""
        weights = self.get_sample_weights(X, time_perf, time_belong, ignore_weight=ignore_weight)
        X_w = X.T * weights.view(1, -1)
-        theta = torch.inverse(X_w @ X) @ X_w @ y
+        theta = torch.inverse(X_w @ X + self.alpha * torch.eye(X_w.shape[0], device=X.device)) @ X_w @ y
        return X_test @ theta, weights

    def init_paramters(self, hist_step_n):
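Reviewer note on the `alpha` term: the closed-form solution in `PredNet.forward` becomes a weighted ridge regression, theta = (X^T W X + alpha * I)^-1 X^T W y, where W is the diagonal matrix of sample weights. The NumPy sketch below restates that formula outside of torch; `weighted_ridge` is a hypothetical name, and it calls `np.linalg.solve` in place of an explicit inverse, which is a substitution and not what the diff does.

```python
import numpy as np


def weighted_ridge(X, y, weights, alpha=0.0):
    """Solve (X^T W X + alpha * I) theta = X^T W y for theta."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X_w = X.T * np.asarray(weights, dtype=float).reshape(1, -1)  # X^T W
    gram = X_w @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X_w @ y)


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 5.0]
print(weighted_ridge(X, y, weights=[1.0, 1.0, 1.0], alpha=0.0))
print(weighted_ridge(X, y, weights=[1.0, 1.0, 1.0], alpha=1.0))
```

With `alpha=0.0` this reduces to weighted least squares, matching the previous behaviour; a positive `alpha` shrinks the coefficients and keeps the system solvable when `X^T W X` is close to singular.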
--- a/qlib/contrib/meta/data_selection/utils.py
+++ b/qlib/contrib/meta/data_selection/utils.py
@@ -5,6 +5,9 @@ import numpy as np
 import torch
 from torch import nn

+from qlib.constant import EPS
+from qlib.log import get_module_logger
+

 class ICLoss(nn.Module):
    def forward(self, pred, y, idx, skip_size=50):
@@ -24,6 +27,7 @@ class ICLoss(nn.Module):
                diff_point.append(i)
            prev = date
        diff_point.append(None)
+        # diff_point has one more element than the number of days (the trailing None)

        ic_all = 0.0
        skip_n = 0
@@ -34,13 +38,23 @@ class ICLoss(nn.Module):
                skip_n += 1
                continue
            y_focus = y[start_i:end_i]
+            if pred_focus.std() < EPS or y_focus.std() < EPS:
+                # These cases often happen at the end of the test data.
+                # They are usually caused by fillna(0.)
+                skip_n += 1
+                continue
+
            ic_day = torch.dot(
                (pred_focus - pred_focus.mean()) / np.sqrt(pred_focus.shape[0]) / pred_focus.std(),
                (y_focus - y_focus.mean()) / np.sqrt(y_focus.shape[0]) / y_focus.std(),
            )
            ic_all += ic_day
        if len(diff_point) - 1 - skip_n <= 0:
-            raise ValueError("No enough data for calculating iC")
+            raise ValueError("Not enough data for calculating IC")
+        if skip_n > 0:
+            get_module_logger("ICLoss").info(
+                f"{skip_n} days are skipped due to zero std or small scale of valid samples."
+            )
        ic_mean = ic_all / (len(diff_point) - 1 - skip_n)
        return -ic_mean  # ic loss
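Reviewer note on the zero-std guard: a per-day correlation is undefined when either the predictions or the labels are constant on that day, so those days are skipped and excluded from the average. The pure-Python sketch below shows the same idea. It is an illustration under assumptions: `mean_daily_ic` and its `eps` default are hypothetical, it uses the population standard deviation instead of copying the diff's formula, and it omits the `skip_size` check.

```python
import math
from itertools import groupby


def mean_daily_ic(pred, y, dates, eps=1e-12):
    """Average per-day Pearson correlation, skipping days with ~zero std.

    Rows are assumed to be sorted so that equal dates are contiguous.
    Returns (mean_ic, number_of_skipped_days).
    """
    ic_sum, used, skipped = 0.0, 0, 0
    rows = zip(dates, pred, y)
    for _, group in groupby(rows, key=lambda r: r[0]):
        _, p, t = zip(*group)
        n = len(p)
        p_mean, t_mean = sum(p) / n, sum(t) / n
        p_std = math.sqrt(sum((v - p_mean) ** 2 for v in p) / n)
        t_std = math.sqrt(sum((v - t_mean) ** 2 for v in t) / n)
        if p_std < eps or t_std < eps:
            skipped += 1  # correlation is undefined for a constant series
            continue
        cov = sum((a - p_mean) * (b - t_mean) for a, b in zip(p, t)) / n
        ic_sum += cov / (p_std * t_std)
        used += 1
    if used == 0:
        raise ValueError("Not enough data for calculating IC")
    return ic_sum / used, skipped
```

Dividing by the number of days actually used, and not by the total number of days, keeps the skipped days from biasing the mean toward zero.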

--- a/qlib/contrib/model/linear.py
+++ b/qlib/contrib/model/linear.py
@@ -4,6 +4,7 @@
 import numpy as np
 import pandas as pd
 from typing import Text, Union
+from qlib.log import get_module_logger
 from qlib.data.dataset.weight import Reweighter
 from scipy.optimize import nnls
 from sklearn.linear_model import LinearRegression, Ridge, Lasso
@@ -29,7 +30,7 @@ class LinearModel(Model):
    RIDGE = "ridge"
    LASSO = "lasso"

-    def __init__(self, estimator="ols", alpha=0.0, fit_intercept=False):
+    def __init__(self, estimator="ols", alpha=0.0, fit_intercept=False, include_valid: bool = False):
        """
        Parameters
        ----------
@@ -39,6 +40,9 @@ class LinearModel(Model):
            l1 or l2 regularization parameter
        fit_intercept : bool
            whether fit intercept
+        include_valid: bool
+            Whether to include the validation data in training.
+            If True, the `valid` segment is concatenated to the `train` segment before fitting.
        """
        assert estimator in [self.OLS, self.NNLS, self.RIDGE, self.LASSO], f"unsupported estimator `{estimator}`"
        self.estimator = estimator
@@ -49,9 +53,16 @@ class LinearModel(Model):
        self.fit_intercept = fit_intercept

        self.coef_ = None
+        self.include_valid = include_valid

    def fit(self, dataset: DatasetH, reweighter: Reweighter = None):
        df_train = dataset.prepare("train", col_set=["feature", "label"], data_key=DataHandlerLP.DK_L)
+        if self.include_valid:
+            try:
+                df_valid = dataset.prepare("valid", col_set=["feature", "label"], data_key=DataHandlerLP.DK_L)
+                df_train = pd.concat([df_train, df_valid])
+            except KeyError:
+                get_module_logger("LinearModel").info("include_valid=True, but valid does not exist")
        if df_train.empty:
            raise ValueError("Empty data from dataset, please check your dataset config.")
        if reweighter is not None:
--- a/qlib/contrib/model/pytorch_krnn.py
+++ b/qlib/contrib/model/pytorch_krnn.py
@@ -0,0 +1,511 @@
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT License.
+
+
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import pandas as pd
+from typing import Text, Union
+import copy
+from ...utils import get_or_create_path
+from ...log import get_module_logger
+
+import torch
+import torch.nn as nn
+import torch.optim as optim
+
+from ...model.base import Model
+from ...data.dataset import DatasetH
+from ...data.dataset.handler import DataHandlerLP
+
+########################################################################
+########################################################################
+########################################################################
+
+
+class CNNEncoderBase(nn.Module):
+    def __init__(self, input_dim, output_dim, kernel_size, device):
+        """Build a basic CNN encoder
+
+        Parameters
+        ----------
+        input_dim : int
+            The input dimension
+        output_dim : int
+            The output dimension
+        kernel_size : int
+            The size of convolutional kernels
+        """
+        super().__init__()
+
+        self.input_dim = input_dim
+        self.output_dim = output_dim
+        self.kernel_size = kernel_size
+        self.device = device
+
+        # set padding to ensure the same length
+        # it is correct only when kernel_size is odd, dilation is 1, stride is 1
+        self.conv = nn.Conv1d(input_dim, output_dim, kernel_size, padding=(kernel_size - 1) // 2)
+
+    def forward(self, x):
+        """
+        Parameters
+        ----------
+        x : torch.Tensor
+            input data
+
+        Returns
+        -------
+        torch.Tensor
+            Updated representations
+        """
+
+        # input shape: [batch_size, seq_len*input_dim]
+        # output shape: [batch_size, seq_len, output_dim]
+        x = x.view(x.shape[0], -1, self.input_dim).permute(0, 2, 1).to(self.device)
+        y = self.conv(x)  # [batch_size, output_dim, conved_seq_len]
+        y = y.permute(0, 2, 1)  # [batch_size, conved_seq_len, output_dim]
+
+        return y
+
+
+class KRNNEncoderBase(nn.Module):
+    def __init__(self, input_dim, output_dim, dup_num, rnn_layers, dropout, device):
+        """Build K parallel RNNs
+
+        Parameters
+        ----------
+        input_dim : int
+            The input dimension
+        output_dim : int
+            The output dimension
+        dup_num : int
+            The number of parallel RNNs
+        rnn_layers: int
+            The number of RNN layers
+        """
+        super().__init__()
+
+        self.input_dim = input_dim
+        self.output_dim = output_dim
+        self.dup_num = dup_num
+        self.rnn_layers = rnn_layers
+        self.dropout = dropout
+        self.device = device
+
+        self.rnn_modules = nn.ModuleList()
+        for _ in range(dup_num):
+            self.rnn_modules.append(nn.GRU(input_dim, output_dim, num_layers=self.rnn_layers, dropout=dropout))
+
+    def forward(self, x):
+        """
+        Parameters
+        ----------
+        x : torch.Tensor
+            Input data
+
+        Returns
+        -------
+        torch.Tensor
+            Updated representations
+        """
+
+        # input shape: [batch_size, seq_len, input_dim]
+        # output shape: [batch_size, seq_len, output_dim]
+        # [seq_len, batch_size, input_dim]
+        batch_size, seq_len, input_dim = x.shape
+        x = x.permute(1, 0, 2).to(self.device)
+
+        hids = []
+        for rnn in self.rnn_modules:
+            h, _ = rnn(x)  # [seq_len, batch_size, output_dim]
+            hids.append(h)
+        # [seq_len, batch_size, output_dim, num_dups]
+        hids = torch.stack(hids, dim=-1)
+        hids = hids.view(seq_len, batch_size, self.output_dim, self.dup_num)
+        hids = hids.mean(dim=3)
+        hids = hids.permute(1, 0, 2)
+
+        return hids
+
+
+class CNNKRNNEncoder(nn.Module):
+    def __init__(
+        self, cnn_input_dim, cnn_output_dim, cnn_kernel_size, rnn_output_dim, rnn_dup_num, rnn_layers, dropout, device
+    ):
+        """Build an encoder composed of CNN and KRNN
+
+        Parameters
+        ----------
+        cnn_input_dim : int
+            The input dimension of CNN
+        cnn_output_dim : int
+            The output dimension of CNN
+        cnn_kernel_size : int
+            The size of convolutional kernels
+        rnn_output_dim : int
+            The output dimension of KRNN
+        rnn_dup_num : int
+            The number of parallel duplicates for KRNN
+        rnn_layers : int
+            The number of RNN layers
+        """
+        super().__init__()
+
+        self.cnn_encoder = CNNEncoderBase(cnn_input_dim, cnn_output_dim, cnn_kernel_size, device)
+        self.krnn_encoder = KRNNEncoderBase(cnn_output_dim, rnn_output_dim, rnn_dup_num, rnn_layers, dropout, device)
+
+    def forward(self, x):
+        """
+        Parameters
+        ----------
+        x : torch.Tensor
+            Input data
+
+        Returns
+        -------
+        torch.Tensor
+            Updated representations
+        """
+        cnn_out = self.cnn_encoder(x)
+        krnn_out = self.krnn_encoder(cnn_out)
+
+        return krnn_out
+
+
+class KRNNModel(nn.Module):
+    def __init__(self, fea_dim, cnn_dim, cnn_kernel_size, rnn_dim, rnn_dups, rnn_layers, dropout, device, **params):
+        """Build a KRNN model
+
+        Parameters
+        ----------
+        fea_dim : int
+            The feature dimension
+        cnn_dim : int
+            The hidden dimension of CNN
+        cnn_kernel_size : int
+            The size of convolutional kernels
+        rnn_dim : int
+            The hidden dimension of KRNN
+        rnn_dups : int
+            The number of parallel duplicates
+        rnn_layers: int
+            The number of RNN layers
+        """
+        super().__init__()
+
+        self.encoder = CNNKRNNEncoder(
+            cnn_input_dim=fea_dim,
+            cnn_output_dim=cnn_dim,
+            cnn_kernel_size=cnn_kernel_size,
+            rnn_output_dim=rnn_dim,
+            rnn_dup_num=rnn_dups,
+            rnn_layers=rnn_layers,
+            dropout=dropout,
+            device=device,
+        )
+
+        self.out_fc = nn.Linear(rnn_dim, 1)
+        self.device = device
+
+    def forward(self, x):
+        # x: [batch_size, seq_len * fea_dim]
+        encode = self.encoder(x)
+        out = self.out_fc(encode[:, -1, :]).squeeze().to(self.device)
+
+        return out
+
+
+class KRNN(Model):
+    """KRNN Model
+
+    Parameters
+    ----------
+    fea_dim : int
+        input dimension for each time step
+    metric: str
+        the evaluation metric used in early stop
+    optimizer : str
+        optimizer name
+    GPU : int
+        the GPU ID used for training; a negative value selects the CPU
+    """
+
+    def __init__(
+        self,
+        fea_dim=6,
+        cnn_dim=64,
+        cnn_kernel_size=3,
+        rnn_dim=64,
+        rnn_dups=3,
+        rnn_layers=2,
+        dropout=0,
+        n_epochs=200,
+        lr=0.001,
+        metric="",
+        batch_size=2000,
+        early_stop=20,
+        loss="mse",
+        optimizer="adam",
+        GPU=0,
+        seed=None,
+        **kwargs
+    ):
+        # Set logger.
+        self.logger = get_module_logger("KRNN")
+        self.logger.info("KRNN pytorch version...")
+
+        # set hyper-parameters.
+        self.fea_dim = fea_dim
+        self.cnn_dim = cnn_dim
+        self.cnn_kernel_size = cnn_kernel_size
+        self.rnn_dim = rnn_dim
+        self.rnn_dups = rnn_dups
+        self.rnn_layers = rnn_layers
+        self.dropout = dropout
+        self.n_epochs = n_epochs
+        self.lr = lr
+        self.metric = metric
+        self.batch_size = batch_size
+        self.early_stop = early_stop
+        self.optimizer = optimizer.lower()
+        self.loss = loss
+        self.device = torch.device("cuda:%d" % (GPU) if torch.cuda.is_available() and GPU >= 0 else "cpu")
+        self.seed = seed
+
+        self.logger.info(
+            "KRNN parameters setting:"
+            "\nfea_dim : {}"
+            "\ncnn_dim : {}"
+            "\ncnn_kernel_size : {}"
+            "\nrnn_dim : {}"
+            "\nrnn_dups : {}"
+            "\nrnn_layers : {}"
+            "\ndropout : {}"
+            "\nn_epochs : {}"
+            "\nlr : {}"
+            "\nmetric : {}"
+            "\nbatch_size: {}"
+            "\nearly_stop : {}"
+            "\noptimizer : {}"
+            "\nloss_type : {}"
+            "\nvisible_GPU : {}"
+            "\nuse_GPU : {}"
+            "\nseed : {}".format(
+                fea_dim,
+                cnn_dim,
+                cnn_kernel_size,
+                rnn_dim,
+                rnn_dups,
+                rnn_layers,
+                dropout,
+                n_epochs,
+                lr,
+                metric,
+                batch_size,
+                early_stop,
+                optimizer.lower(),
+                loss,
+                GPU,
+                self.use_gpu,
+                seed,
+            )
+        )
+
+        if self.seed is not None:
+            np.random.seed(self.seed)
+            torch.manual_seed(self.seed)
+
+        self.krnn_model = KRNNModel(
+            fea_dim=self.fea_dim,
+            cnn_dim=self.cnn_dim,
+            cnn_kernel_size=self.cnn_kernel_size,
+            rnn_dim=self.rnn_dim,
+            rnn_dups=self.rnn_dups,
+            rnn_layers=self.rnn_layers,
+            dropout=self.dropout,
+            device=self.device,
+        )
+        if optimizer.lower() == "adam":
+            self.train_optimizer = optim.Adam(self.krnn_model.parameters(), lr=self.lr)
+        elif optimizer.lower() == "gd":
+            self.train_optimizer = optim.SGD(self.krnn_model.parameters(), lr=self.lr)
+        else:
+            raise NotImplementedError("optimizer {} is not supported!".format(optimizer))
+
+        self.fitted = False
+        self.krnn_model.to(self.device)
+
+    @property
+    def use_gpu(self):
+        return self.device != torch.device("cpu")
+
+    def mse(self, pred, label):
+        loss = (pred - label) ** 2
+        return torch.mean(loss)
+
+    def loss_fn(self, pred, label):
+        mask = ~torch.isnan(label)
+
+        if self.loss == "mse":
+            return self.mse(pred[mask], label[mask])
+
+        raise ValueError("unknown loss `%s`" % self.loss)
+
+    def metric_fn(self, pred, label):
+        mask = torch.isfinite(label)
+
+        if self.metric in ("", "loss"):
+            return -self.loss_fn(pred[mask], label[mask])
+
+        raise ValueError("unknown metric `%s`" % self.metric)
+
+    def get_daily_inter(self, df, shuffle=False):
+        # organize the train data into daily batches
+        daily_count = df.groupby(level=0).size().values
+        daily_index = np.roll(np.cumsum(daily_count), 1)
+        daily_index[0] = 0
+        if shuffle:
+            # shuffle data
+            daily_shuffle = list(zip(daily_index, daily_count))
+            np.random.shuffle(daily_shuffle)
+            daily_index, daily_count = zip(*daily_shuffle)
+        return daily_index, daily_count
+
+    def train_epoch(self, x_train, y_train):
+        x_train_values = x_train.values
+        y_train_values = np.squeeze(y_train.values)
+        self.krnn_model.train()
+
+        indices = np.arange(len(x_train_values))
+        np.random.shuffle(indices)
+
+        for i in range(len(indices))[:: self.batch_size]:
+            if len(indices) - i < self.batch_size:
+                break
+
+            feature = torch.from_numpy(x_train_values[indices[i : i + self.batch_size]]).float().to(self.device)
+            label = torch.from_numpy(y_train_values[indices[i : i + self.batch_size]]).float().to(self.device)
+
+            pred = self.krnn_model(feature)
+            loss = self.loss_fn(pred, label)
+
+            self.train_optimizer.zero_grad()
+            loss.backward()
+            torch.nn.utils.clip_grad_value_(self.krnn_model.parameters(), 3.0)
+            self.train_optimizer.step()
+
+    def test_epoch(self, data_x, data_y):
+        # prepare evaluation data
+        x_values = data_x.values
+        y_values = np.squeeze(data_y.values)
+
+        self.krnn_model.eval()
+
+        scores = []
+        losses = []
+
+        indices = np.arange(len(x_values))
+
+        for i in range(len(indices))[:: self.batch_size]:
+            if len(indices) - i < self.batch_size:
+                break
+
+            feature = torch.from_numpy(x_values[indices[i : i + self.batch_size]]).float().to(self.device)
+            label = torch.from_numpy(y_values[indices[i : i + self.batch_size]]).float().to(self.device)
+
+            pred = self.krnn_model(feature)
+            loss = self.loss_fn(pred, label)
+            losses.append(loss.item())
+
+            score = self.metric_fn(pred, label)
+            scores.append(score.item())
+
+        return np.mean(losses), np.mean(scores)
+
+    def fit(
+        self,
+        dataset: DatasetH,
+        evals_result=dict(),
+        save_path=None,
+    ):
+        df_train, df_valid, df_test = dataset.prepare(
+            ["train", "valid", "test"],
+            col_set=["feature", "label"],
+            data_key=DataHandlerLP.DK_L,
+        )
+        if df_train.empty or df_valid.empty:
+            raise ValueError("Empty data from dataset, please check your dataset config.")
+
+        x_train, y_train = df_train["feature"], df_train["label"]
+        x_valid, y_valid = df_valid["feature"], df_valid["label"]
+
+        save_path = get_or_create_path(save_path)
+        stop_steps = 0
+        train_loss = 0
+        best_score = -np.inf
+        best_epoch = 0
+        evals_result["train"] = []
+        evals_result["valid"] = []
+
+        # train
+        self.logger.info("training...")
+        self.fitted = True
+
+        for step in range(self.n_epochs):
+            self.logger.info("Epoch%d:", step)
+            self.logger.info("training...")
+            self.train_epoch(x_train, y_train)
+            self.logger.info("evaluating...")
+            train_loss, train_score = self.test_epoch(x_train, y_train)
+            val_loss, val_score = self.test_epoch(x_valid, y_valid)
+            self.logger.info("train %.6f, valid %.6f" % (train_score, val_score))
+            evals_result["train"].append(train_score)
+            evals_result["valid"].append(val_score)
+
+            if val_score > best_score:
+                best_score = val_score
+                stop_steps = 0
+                best_epoch = step
+                best_param = copy.deepcopy(self.krnn_model.state_dict())
+            else:
+                stop_steps += 1
+                if stop_steps >= self.early_stop:
+                    self.logger.info("early stop")
+                    break
+
+        self.logger.info("best score: %.6lf @ %d" % (best_score, best_epoch))
+        self.krnn_model.load_state_dict(best_param)
+        torch.save(best_param, save_path)
+
+        if self.use_gpu:
+            torch.cuda.empty_cache()
+
+    def predict(self, dataset: DatasetH, segment: Union[Text, slice] = "test"):
+        if not self.fitted:
+            raise ValueError("model is not fitted yet!")
+
+        x_test = dataset.prepare(segment, col_set="feature", data_key=DataHandlerLP.DK_I)
+        index = x_test.index
+        self.krnn_model.eval()
+        x_values = x_test.values
+        sample_num = x_values.shape[0]
+        preds = []
+
+        for begin in range(sample_num)[:: self.batch_size]:
+            if sample_num - begin < self.batch_size:
+                end = sample_num
+            else:
+                end = begin + self.batch_size
+            x_batch = torch.from_numpy(x_values[begin:end]).float().to(self.device)
+            with torch.no_grad():
+                pred = self.krnn_model(x_batch).detach().cpu().numpy()
+            preds.append(pred)
+
+        return pd.Series(np.concatenate(preds), index=index)
--- a/qlib/contrib/model/pytorch_nn.py
+++ b/qlib/contrib/model/pytorch_nn.py
@@ -47,10 +47,6 @@ class DNNModelPytorch(Model):
        layer sizes
    lr : float
        learning rate
-    lr_decay : float
-        learning rate decay
-    lr_decay_steps : int
-        learning rate decay steps
    optimizer : str
        optimizer name
    GPU : int
@@ -64,8 +60,6 @@ class DNNModelPytorch(Model):
        batch_size=2000,
        early_stop_rounds=50,
        eval_steps=20,
-        lr_decay=0.96,
-        lr_decay_steps=100,
        optimizer="gd",
        loss="mse",
        GPU=0,
@@ -93,8 +87,6 @@ class DNNModelPytorch(Model):
        self.batch_size = batch_size
        self.early_stop_rounds = early_stop_rounds
        self.eval_steps = eval_steps
-        self.lr_decay = lr_decay
-        self.lr_decay_steps = lr_decay_steps
        self.optimizer = optimizer.lower()
        self.loss_type = loss
        if isinstance(GPU, str):
@@ -116,8 +108,6 @@ class DNNModelPytorch(Model):
            f"\nbatch_size : {batch_size}"
            f"\nearly_stop_rounds : {early_stop_rounds}"
            f"\neval_steps : {eval_steps}"
-            f"\nlr_decay : {lr_decay}"
-            f"\nlr_decay_steps : {lr_decay_steps}"
            f"\noptimizer : {optimizer}"
            f"\nloss_type : {loss}"
            f"\nseed : {seed}"
--- a/qlib/contrib/model/pytorch_sandwich.py
+++ b/qlib/contrib/model/pytorch_sandwich.py
@@ -0,0 +1,381 @@
+# Copyright (c) Microsoft Corporation.
+# Licensed under the MIT License.
+
+
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import pandas as pd
+from typing import Text, Union
+import copy
+from ...utils import get_or_create_path
+from ...log import get_module_logger
+
+import torch
+import torch.nn as nn
+import torch.optim as optim
+
+from ...model.base import Model
+from ...data.dataset import DatasetH
+from ...data.dataset.handler import DataHandlerLP
+from .pytorch_krnn import CNNKRNNEncoder
+
+
+class SandwichModel(nn.Module):
+    def __init__(
+        self,
+        fea_dim,
+        cnn_dim_1,
+        cnn_dim_2,
+        cnn_kernel_size,
+        rnn_dim_1,
+        rnn_dim_2,
+        rnn_dups,
+        rnn_layers,
+        dropout,
+        device,
+        **params
+    ):
+        """Build a Sandwich model
+
+        Parameters
+        ----------
+        fea_dim : int
+            The feature dimension
+        cnn_dim_1 : int
+            The hidden dimension of the first CNN
+        cnn_dim_2 : int
+            The hidden dimension of the second CNN
+        cnn_kernel_size : int
+            The size of convolutional kernels
+        rnn_dim_1 : int
+            The hidden dimension of the first KRNN
+        rnn_dim_2 : int
+            The hidden dimension of the second KRNN
+        rnn_dups : int
+            The number of parallel duplicates
+        rnn_layers : int
+            The number of RNN layers
+        """
+        super().__init__()
+
+        self.first_encoder = CNNKRNNEncoder(
+            cnn_input_dim=fea_dim,
+            cnn_output_dim=cnn_dim_1,
+            cnn_kernel_size=cnn_kernel_size,
+            rnn_output_dim=rnn_dim_1,
+            rnn_dup_num=rnn_dups,
+            rnn_layers=rnn_layers,
+            dropout=dropout,
+            device=device,
+        )
+
+        self.second_encoder = CNNKRNNEncoder(
+            cnn_input_dim=rnn_dim_1,
+            cnn_output_dim=cnn_dim_2,
+            cnn_kernel_size=cnn_kernel_size,
+            rnn_output_dim=rnn_dim_2,
+            rnn_dup_num=rnn_dups,
+            rnn_layers=rnn_layers,
+            dropout=dropout,
+            device=device,
+        )
+
+        self.out_fc = nn.Linear(rnn_dim_2, 1)
+        self.device = device
+
+    def forward(self, x):
+        # x: [batch_size, node_num, seq_len, input_dim]
+        encode = self.first_encoder(x)
+        encode = self.second_encoder(encode)
+        out = self.out_fc(encode[:, -1, :]).squeeze(-1).to(self.device)
+
+        return out
+
+
+class Sandwich(Model):
+    """Sandwich Model
+
+    Parameters
+    ----------
+    fea_dim : int
+        input dimension for each time step
+    metric : str
+        the evaluation metric used for early stopping
+    optimizer : str
+        optimizer name ("adam" or "gd")
+    GPU : int
+        the GPU ID used for training; a negative value selects the CPU
+    """
+
+    def __init__(
+        self,
+        fea_dim=6,
+        cnn_dim_1=64,
+        cnn_dim_2=32,
+        cnn_kernel_size=3,
+        rnn_dim_1=16,
+        rnn_dim_2=8,
+        rnn_dups=3,
+        rnn_layers=2,
+        dropout=0,
+        n_epochs=200,
+        lr=0.001,
+        metric="",
+        batch_size=2000,
+        early_stop=20,
+        loss="mse",
+        optimizer="adam",
+        GPU=0,
+        seed=None,
+        **kwargs
+    ):
+        # Set logger.
+        self.logger = get_module_logger("Sandwich")
+        self.logger.info("Sandwich pytorch version...")
+
+        # set hyper-parameters.
+        self.fea_dim = fea_dim
+        self.cnn_dim_1 = cnn_dim_1
+        self.cnn_dim_2 = cnn_dim_2
+        self.cnn_kernel_size = cnn_kernel_size
+        self.rnn_dim_1 = rnn_dim_1
+        self.rnn_dim_2 = rnn_dim_2
+        self.rnn_dups = rnn_dups
+        self.rnn_layers = rnn_layers
+        self.dropout = dropout
+        self.n_epochs = n_epochs
+        self.lr = lr
+        self.metric = metric
+        self.batch_size = batch_size
+        self.early_stop = early_stop
+        self.optimizer = optimizer.lower()
+        self.loss = loss
+        self.device = torch.device("cuda:%d" % (GPU) if torch.cuda.is_available() and GPU >= 0 else "cpu")
+        self.seed = seed
+
+        self.logger.info(
+            "Sandwich parameters setting:"
+            "\nfea_dim : {}"
+            "\ncnn_dim_1 : {}"
+            "\ncnn_dim_2 : {}"
+            "\ncnn_kernel_size : {}"
+            "\nrnn_dim_1 : {}"
+            "\nrnn_dim_2 : {}"
+            "\nrnn_dups : {}"
+            "\nrnn_layers : {}"
+            "\ndropout : {}"
+            "\nn_epochs : {}"
+            "\nlr : {}"
+            "\nmetric : {}"
+            "\nbatch_size: {}"
+            "\nearly_stop : {}"
+            "\noptimizer : {}"
+            "\nloss_type : {}"
+            "\nvisible_GPU : {}"
+            "\nuse_GPU : {}"
+            "\nseed : {}".format(
+                fea_dim,
+                cnn_dim_1,
+                cnn_dim_2,
+                cnn_kernel_size,
+                rnn_dim_1,
+                rnn_dim_2,
+                rnn_dups,
+                rnn_layers,
+                dropout,
+                n_epochs,
+                lr,
+                metric,
+                batch_size,
+                early_stop,
+                optimizer.lower(),
+                loss,
+                GPU,
+                self.use_gpu,
+                seed,
+            )
+        )
+
+        if self.seed is not None:
+            np.random.seed(self.seed)
+            torch.manual_seed(self.seed)
+
+        self.sandwich_model = SandwichModel(
+            fea_dim=self.fea_dim,
+            cnn_dim_1=self.cnn_dim_1,
+            cnn_dim_2=self.cnn_dim_2,
+            cnn_kernel_size=self.cnn_kernel_size,
+            rnn_dim_1=self.rnn_dim_1,
+            rnn_dim_2=self.rnn_dim_2,
+            rnn_dups=self.rnn_dups,
+            rnn_layers=self.rnn_layers,
+            dropout=self.dropout,
+            device=self.device,
+        )
+        if optimizer.lower() == "adam":
+            self.train_optimizer = optim.Adam(self.sandwich_model.parameters(), lr=self.lr)
+        elif optimizer.lower() == "gd":
+            self.train_optimizer = optim.SGD(self.sandwich_model.parameters(), lr=self.lr)
+        else:
+            raise NotImplementedError("optimizer {} is not supported!".format(optimizer))
+
+        self.fitted = False
+        self.sandwich_model.to(self.device)
+
+    @property
+    def use_gpu(self):
+        return self.device != torch.device("cpu")
+
+    def mse(self, pred, label):
+        loss = (pred - label) ** 2
+        return torch.mean(loss)
+
+    def loss_fn(self, pred, label):
+        mask = ~torch.isnan(label)
+
+        if self.loss == "mse":
+            return self.mse(pred[mask], label[mask])
+
+        raise ValueError("unknown loss `%s`" % self.loss)
+
+    def metric_fn(self, pred, label):
+        mask = torch.isfinite(label)
+
+        if self.metric in ("", "loss"):
+            return -self.loss_fn(pred[mask], label[mask])
+
+        raise ValueError("unknown metric `%s`" % self.metric)
+
+    def train_epoch(self, x_train, y_train):
+        x_train_values = x_train.values
+        y_train_values = np.squeeze(y_train.values)
+        self.sandwich_model.train()
+
+        indices = np.arange(len(x_train_values))
+        np.random.shuffle(indices)
+
+        for i in range(len(indices))[:: self.batch_size]:
+            if len(indices) - i < self.batch_size:
+                break
+
+            feature = torch.from_numpy(x_train_values[indices[i : i + self.batch_size]]).float().to(self.device)
+            label = torch.from_numpy(y_train_values[indices[i : i + self.batch_size]]).float().to(self.device)
+
+            pred = self.sandwich_model(feature)
+            loss = self.loss_fn(pred, label)
+
+            self.train_optimizer.zero_grad()
+            loss.backward()
+            torch.nn.utils.clip_grad_value_(self.sandwich_model.parameters(), 3.0)
+            self.train_optimizer.step()
+
+    def test_epoch(self, data_x, data_y):
+        # prepare evaluation data
+        x_values = data_x.values
+        y_values = np.squeeze(data_y.values)
+
+        self.sandwich_model.eval()
+
+        scores = []
+        losses = []
+
+        indices = np.arange(len(x_values))
+
+        for i in range(len(indices))[:: self.batch_size]:
+            if len(indices) - i < self.batch_size:
+                break
+
+            feature = torch.from_numpy(x_values[indices[i : i + self.batch_size]]).float().to(self.device)
+            label = torch.from_numpy(y_values[indices[i : i + self.batch_size]]).float().to(self.device)
+
+            pred = self.sandwich_model(feature)
+            loss = self.loss_fn(pred, label)
+            losses.append(loss.item())
+
+            score = self.metric_fn(pred, label)
+            scores.append(score.item())
+
+        return np.mean(losses), np.mean(scores)
+
+    def fit(
+        self,
+        dataset: DatasetH,
+        evals_result=dict(),
+        save_path=None,
+    ):
+        df_train, df_valid, df_test = dataset.prepare(
+            ["train", "valid", "test"],
+            col_set=["feature", "label"],
+            data_key=DataHandlerLP.DK_L,
+        )
+        if df_train.empty or df_valid.empty:
+            raise ValueError("Empty data from dataset, please check your dataset config.")
+
+        x_train, y_train = df_train["feature"], df_train["label"]
+        x_valid, y_valid = df_valid["feature"], df_valid["label"]
+
+        save_path = get_or_create_path(save_path)
+        stop_steps = 0
+        train_loss = 0
+        best_score = -np.inf
+        best_epoch = 0
+        evals_result["train"] = []
+        evals_result["valid"] = []
+
+        # train
+        self.logger.info("training...")
+        self.fitted = True
+
+        for step in range(self.n_epochs):
+            self.logger.info("Epoch%d:", step)
+            self.logger.info("training...")
+            self.train_epoch(x_train, y_train)
+            self.logger.info("evaluating...")
+            train_loss, train_score = self.test_epoch(x_train, y_train)
+            val_loss, val_score = self.test_epoch(x_valid, y_valid)
+            self.logger.info("train %.6f, valid %.6f" % (train_score, val_score))
+            evals_result["train"].append(train_score)
+            evals_result["valid"].append(val_score)
+
+            if val_score > best_score:
+                best_score = val_score
+                stop_steps = 0
+                best_epoch = step
+                best_param = copy.deepcopy(self.sandwich_model.state_dict())
+            else:
+                stop_steps += 1
+                if stop_steps >= self.early_stop:
+                    self.logger.info("early stop")
+                    break
+
+        self.logger.info("best score: %.6lf @ %d" % (best_score, best_epoch))
+        self.sandwich_model.load_state_dict(best_param)
+        torch.save(best_param, save_path)
+
+        if self.use_gpu:
+            torch.cuda.empty_cache()
+
+    def predict(self, dataset: DatasetH, segment: Union[Text, slice] = "test"):
+        if not self.fitted:
+            raise ValueError("model is not fitted yet!")
+
+        x_test = dataset.prepare(segment, col_set="feature", data_key=DataHandlerLP.DK_I)
+        index = x_test.index
+        self.sandwich_model.eval()
+        x_values = x_test.values
+        sample_num = x_values.shape[0]
+        preds = []
+
+        for begin in range(sample_num)[:: self.batch_size]:
+            if sample_num - begin < self.batch_size:
+                end = sample_num
+            else:
+                end = begin + self.batch_size
+            x_batch = torch.from_numpy(x_values[begin:end]).float().to(self.device)
+            with torch.no_grad():
+                pred = self.sandwich_model(x_batch).detach().cpu().numpy()
+            preds.append(pred)
+
+        return pd.Series(np.concatenate(preds), index=index)
--- a/qlib/contrib/model/pytorch_tcn_ts.py
+++ b/qlib/contrib/model/pytorch_tcn_ts.py
@@ -168,7 +168,8 @@ class TCN(Model):
        self.TCN_model.train()

        for data in data_loader:
-            feature = data[:, :, 0:-1].to(self.device)
+            data = torch.transpose(data, 1, 2)
+            feature = data[:, 0:-1, :].to(self.device)
            label = data[:, -1, -1].to(self.device)

            pred = self.TCN_model(feature.float())
@@ -187,8 +188,8 @@ class TCN(Model):
        losses = []

        for data in data_loader:
-
-            feature = data[:, :, 0:-1].to(self.device)
+            data = torch.transpose(data, 1, 2)
+            feature = data[:, 0:-1, :].to(self.device)
            # feature[torch.isnan(feature)] = 0
            label = data[:, -1, -1].to(self.device)

--- a/qlib/contrib/ops/high_freq.py
+++ b/qlib/contrib/ops/high_freq.py
@@ -70,7 +70,7 @@ class DayCumsum(ElemOperator):
        Otherwise, the value is zero.
    """

-    def __init__(self, feature, start: str = "9:30", end: str = "14:59"):
+    def __init__(self, feature, start: str = "9:30", end: str = "14:59", data_granularity: int = 1):
        self.feature = feature
        self.start = datetime.strptime(start, "%H:%M")
        self.end = datetime.strptime(end, "%H:%M")
@@ -80,15 +80,17 @@ class DayCumsum(ElemOperator):
        self.noon_open = datetime.strptime("13:00", "%H:%M")
        self.noon_close = datetime.strptime("15:00", "%H:%M")

-        self.start_id = time_to_day_index(self.start)
-        self.end_id = time_to_day_index(self.end)
+        self.data_granularity = data_granularity
+        self.start_id = time_to_day_index(self.start) // self.data_granularity
+        self.end_id = time_to_day_index(self.end) // self.data_granularity
+        assert 240 % self.data_granularity == 0

    def period_cusum(self, df):
        df = df.copy()
-        assert len(df) == 240
+        assert len(df) == 240 // self.data_granularity
        df.iloc[0 : self.start_id] = 0
        df = df.cumsum()
-        df.iloc[self.end_id + 1 : 240] = 0
+        df.iloc[self.end_id + 1 : 240 // self.data_granularity] = 0
        return df

    def _load_internal(self, instrument, start_index, end_index, freq):
--- a/qlib/contrib/strategy/rule_strategy.py
+++ b/qlib/contrib/strategy/rule_strategy.py
@@ -635,7 +635,7 @@ class FileOrderStrategy(BaseStrategy):
            self.order_df = file
        else:
            with get_io_object(file) as f:
-                self.order_df = pd.read_csv(f, dtype={"datetime": np.str})
+                self.order_df = pd.read_csv(f, dtype={"datetime": str})

        self.order_df["datetime"] = self.order_df["datetime"].apply(pd.Timestamp)
        self.order_df = self.order_df.set_index(["datetime", "instrument"])
--- a/qlib/data/data.py
+++ b/qlib/data/data.py
@@ -783,7 +783,7 @@ class LocalPITProvider(PITProvider):
        index_path = C.dpm.get_data_uri() / "financial" / instrument.lower() / f"{field}.index"
        data_path = C.dpm.get_data_uri() / "financial" / instrument.lower() / f"{field}.data"
        if not (index_path.exists() and data_path.exists()):
-            raise FileNotFoundError("No file is found. Raise exception and  ")
+            raise FileNotFoundError("No file is found.")
        # NOTE: The most significant performance loss is here.
        # Does the acceleration that makes the program complicated really matters?
        # - It makes parameters of the interface complicate
@@ -797,14 +797,14 @@ class LocalPITProvider(PITProvider):
        cur_time_int = int(cur_time.year) * 10000 + int(cur_time.month) * 100 + int(cur_time.day)
        loc = np.searchsorted(data["date"], cur_time_int, side="right")
        if loc <= 0:
-            return pd.Series()
+            return pd.Series(dtype=C.pit_record_type["value"])
        last_period = data["period"][:loc].max()  # return the latest quarter
        first_period = data["period"][:loc].min()
        period_list = get_period_list(first_period, last_period, quarterly)
        if period is not None:
            # NOTE: `period` has higher priority than `start_index` & `end_index`
            if period not in period_list:
-                return pd.Series()
+                return pd.Series(dtype=C.pit_record_type["value"])
            else:
                period_list = [period]
        else:
@@ -868,7 +868,7 @@ class LocalExpressionProvider(ExpressionProvider):
        # Ensure that each column type is consistent
        # FIXME:
        # 1) The stock data is currently float. If there is other types of data, this part needs to be re-implemented.
-        # 2) The the precision should be configurable
+        # 2) The precision should be configurable
        try:
            series = series.astype(np.float32)
        except ValueError:
--- a/qlib/data/dataset/init.py
+++ b/qlib/data/dataset/init.py
@@ -417,7 +417,7 @@ class TSDataSampler:
            # NOTE: bool(np.nan) is True !!!!!!!!
            # make sure reindex comes first. Otherwise extra NaN may appear.
            flt_data = flt_data.swaplevel()
-            flt_data = flt_data.reindex(self.data_index).fillna(False).astype(np.bool)
+            flt_data = flt_data.reindex(self.data_index).fillna(False).astype(bool)
            self.flt_data = flt_data.values
            self.idx_map = self.flt_idx_map(self.flt_data, self.idx_map)
            self.data_index = self.data_index[np.where(self.flt_data)[0]]
--- a/qlib/data/dataset/handler.py
+++ b/qlib/data/dataset/handler.py
@@ -7,6 +7,7 @@ from typing import Callable, Union, Tuple, List, Iterator, Optional

 import pandas as pd

+from qlib.typehint import Literal
 from ...log import get_module_logger, TimeInspector
 from ...utils import init_instance_by_config
 from ...utils.serial import Serializable
@@ -49,6 +50,8 @@ class DataHandler(Serializable):
    - Fetching data with `col_set=CS_RAW` will return the raw data and may avoid pandas from copying the data when calling `loc`
    """

+    _data: pd.DataFrame  # underlying data.
+
    def __init__(
        self,
        instruments=None,
@@ -155,6 +158,11 @@ class DataHandler(Serializable):
        """
        fetch data from underlying data source

+        Design motivation:
+        - Provide a unified interface for the underlying data.
+        - Leave room to make the interface more user-friendly.
+        - Allow users to improve data-fetching performance in this extra layer.
+
        Parameters
        ----------
        selector : Union[pd.Timestamp, slice, str]
@@ -328,6 +336,9 @@ class DataHandler(Serializable):
            yield cur_date, self.fetch(selector, **kwargs)


+DATA_KEY_TYPE = Literal["raw", "infer", "learn"]
+
+
 class DataHandlerLP(DataHandler):
    """
    DataHandler with **(L)earnable (P)rocessor**
@@ -346,17 +357,28 @@ class DataHandlerLP(DataHandler):

        - These processors only apply to the learning phase.

-    Tips to improve the performance of data handler
+    Tips for data handler

    - To reduce the memory cost

        - `drop_raw=True`: this will modify the data inplace on raw data;
+
+    - Please note that processed data such as `self._infer` or `self._learn` is a different concept from the `segments` in Qlib's `Dataset`, such as "train" and "test".
+
+        - Processed data such as `self._infer` or `self._learn` is the underlying data processed with different processors.
+        - `segments` in Qlib's `Dataset`, such as "train" and "test", are simply time ranges used when querying data ("train" usually precedes "test" in a time series).
+        - For example, you can query `data._infer` processed by `infer_processors` in the "train" time segment.
    """

+    # based on `self._data`, `_infer` and `_learn` are generated after applying processors
+    _infer: pd.DataFrame  # data for inference
+    _learn: pd.DataFrame  # data for learning models
+
    # data key
-    DK_R = "raw"
-    DK_I = "infer"
-    DK_L = "learn"
+    DK_R: DATA_KEY_TYPE = "raw"
+    DK_I: DATA_KEY_TYPE = "infer"
+    DK_L: DATA_KEY_TYPE = "learn"
+    # map data_key to attribute name
    ATTR_MAP = {DK_R: "_data", DK_I: "_infer", DK_L: "_learn"}

    # process type
@@ -600,7 +622,7 @@ class DataHandlerLP(DataHandler):

        # TODO: Be able to cache handler data. Save the memory for data processing

-    def _get_df_by_key(self, data_key: str = DK_I) -> pd.DataFrame:
+    def _get_df_by_key(self, data_key: DATA_KEY_TYPE = DK_I) -> pd.DataFrame:
        if data_key == self.DK_R and self.drop_raw:
            raise AttributeError(
                "DataHandlerLP has not attribute _data, please set drop_raw = False if you want to use raw data"
@@ -613,7 +635,7 @@ class DataHandlerLP(DataHandler):
        selector: Union[pd.Timestamp, slice, str] = slice(None, None),
        level: Union[str, int] = "datetime",
        col_set=DataHandler.CS_ALL,
-        data_key: str = DK_I,
+        data_key: DATA_KEY_TYPE = DK_I,
        squeeze: bool = False,
        proc_func: Callable = None,
    ) -> pd.DataFrame:
@@ -647,7 +669,7 @@ class DataHandlerLP(DataHandler):
            proc_func=proc_func,
        )

-    def get_cols(self, col_set=DataHandler.CS_ALL, data_key: str = DK_I) -> list:
+    def get_cols(self, col_set=DataHandler.CS_ALL, data_key: DATA_KEY_TYPE = DK_I) -> list:
        """
        get the column names

@@ -655,7 +677,7 @@ class DataHandlerLP(DataHandler):
        ----------
        col_set : str
            select a set of meaningful columns.(e.g. features, columns).
-        data_key : str
+        data_key : DATA_KEY_TYPE
            the data to fetch:  DK_*.

        Returns
@@ -698,3 +720,26 @@ class DataHandlerLP(DataHandler):
        ]:
            setattr(new_hd, key, getattr(handler, key, None))
        return new_hd
+
+    @classmethod
+    def from_df(cls, df: pd.DataFrame) -> "DataHandlerLP":
+        """
+        Motivation:
+        - When the user wants to get a quick data handler.
+
+        The created data handler will have only one shared DataFrame and no processors.
+        After creating the handler, the user may often want to dump it for reuse.
+        Here is a typical use case:
+
+        .. code-block:: python
+
+            from qlib.data.dataset import DataHandlerLP
+            dh = DataHandlerLP.from_df(df)
+            dh.to_pickle(fname, dump_all=True)
+
+        TODO:
+        - The StaticDataLoader is quite slow. It doesn't have to copy the data again...
+
+        """
+        loader = data_loader_module.StaticDataLoader(df)
+        return cls(data_loader=loader)
--- a/qlib/data/dataset/loader.py
+++ b/qlib/data/dataset/loader.py
@@ -153,7 +153,7 @@ class QlibDataLoader(DLWParser):
        filter_pipe: List = None,
        swap_level: bool = True,
        freq: Union[str, dict] = "day",
-        inst_processor: dict = None,
+        inst_processors: Union[dict, list] = None,
    ):
        """
        Parameters
@@ -167,16 +167,19 @@ class QlibDataLoader(DLWParser):
        freq:  dict or str
            If type(config) == dict and type(freq) == str, load config data using freq.
            If type(config) == dict and type(freq) == dict, load config[<group_name>] data using freq[<group_name>]
-        inst_processor: dict
-            If inst_processor is not None and type(config) == dict; load config[<group_name>] data using inst_processor[<group_name>]
+        inst_processors: dict | list
+            If inst_processors is not None and type(config) == dict; load config[<group_name>] data using inst_processors[<group_name>]
+            If inst_processors is a list, then it will be applied to all groups.
        """
        self.filter_pipe = filter_pipe
        self.swap_level = swap_level
        self.freq = freq

        # sample
-        self.inst_processor = inst_processor if inst_processor is not None else {}
-        assert isinstance(self.inst_processor, dict), f"inst_processor(={self.inst_processor}) must be dict"
+        self.inst_processors = inst_processors if inst_processors is not None else {}
+        assert isinstance(
+            self.inst_processors, (dict, list)
+        ), f"inst_processors(={self.inst_processors}) must be dict or list"

        super().__init__(config)

@@ -187,8 +190,8 @@ class QlibDataLoader(DLWParser):
                    if _gp not in freq:
                        raise ValueError(f"freq(={freq}) missing group(={_gp})")
                assert (
-                    self.inst_processor
-                ), f"freq(={self.freq}), inst_processor(={self.inst_processor}) cannot be None/empty"
+                    self.inst_processors
+                ), f"freq(={self.freq}), inst_processors(={self.inst_processors}) cannot be None/empty"

    def load_group_df(
        self,
@@ -208,9 +211,10 @@ class QlibDataLoader(DLWParser):
            warnings.warn("`filter_pipe` is not None, but it will not be used with `instruments` as list")

        freq = self.freq[gp_name] if isinstance(self.freq, dict) else self.freq
-        df = D.features(
-            instruments, exprs, start_time, end_time, freq=freq, inst_processors=self.inst_processor.get(gp_name, [])
+        inst_processors = (
+            self.inst_processors if isinstance(self.inst_processors, list) else self.inst_processors.get(gp_name, [])
        )
+        df = D.features(instruments, exprs, start_time, end_time, freq=freq, inst_processors=inst_processors)
        df.columns = names
        if self.swap_level:
            df = df.swaplevel().sort_index()  # NOTE: if swaplevel, return <datetime, instrument>
--- a/qlib/data/dataset/processor.py
+++ b/qlib/data/dataset/processor.py
@@ -2,7 +2,7 @@
 # Licensed under the MIT License.

 import abc
-from typing import Union, Text
+from typing import Union, Text, Optional
 import numpy as np
 import pandas as pd

@@ -11,6 +11,8 @@ from ...constant import EPS
 from .utils import fetch_df_by_index
 from ...utils.serial import Serializable
 from ...utils.paral import datetime_groupby_apply
+from qlib.data.inst_processor import InstProcessor
+from qlib.data import D


 def get_group_columns(df: pd.DataFrame, group: Union[Text, None]):
@@ -378,3 +380,42 @@ class HashStockFormat(Processor):
        from .storage import HashingStockStorage  # pylint: disable=C0415

        return HashingStockStorage.from_df(df)
+
+
+class TimeRangeFlt(InstProcessor):
+    """
+    This is a filter that drops instruments based on the time range of their data.
+    Only keep instruments whose data spans start_time to end_time (gaps in the middle are not checked).
+    WARNING: It may induce data leakage!
+    """
+
+    def __init__(
+        self,
+        start_time: Optional[Union[pd.Timestamp, str]] = None,
+        end_time: Optional[Union[pd.Timestamp, str]] = None,
+        freq: str = "day",
+    ):
+        """
+        Parameters
+        ----------
+        start_time : Optional[Union[pd.Timestamp, str]]
+            The data must start no later than `start_time`.
+            None indicates that data will not be filtered based on `start_time`.
+        end_time : Optional[Union[pd.Timestamp, str]]
+            Similar to `start_time`: the data must end no earlier than `end_time`.
+        freq : str
+            The frequency of the calendar
+        """
+        # Align to calendar before filtering
+        cal = D.calendar(start_time=start_time, end_time=end_time, freq=freq)
+        self.start_time = None if start_time is None else cal[0]
+        self.end_time = None if end_time is None else cal[-1]
+
+    def __call__(self, df: pd.DataFrame, instrument, *args, **kwargs):
+        if (
+            df.empty
+            or (self.start_time is None or df.index.min() <= self.start_time)
+            and (self.end_time is None or df.index.max() >= self.end_time)
+        ):
+            return df
+        return df.head(0)
--- a/qlib/data/dataset/utils.py
+++ b/qlib/data/dataset/utils.py
@@ -2,9 +2,8 @@
 # Licensed under the MIT License.
 from __future__ import annotations
 import pandas as pd
-from typing import Union, List
+from typing import Union, List, TYPE_CHECKING
 from qlib.utils import init_instance_by_config
-from typing import TYPE_CHECKING

 if TYPE_CHECKING:
    from qlib.data.dataset import DataHandler
@@ -121,7 +120,7 @@ def convert_index_format(df: Union[pd.DataFrame, pd.Series], level: str = "datet
    return df


-def init_task_handler(task: dict) -> Union[DataHandler, None]:
+def init_task_handler(task: dict) -> DataHandler:
    """
    initialize the handler part of the task **inplace**

@@ -142,5 +141,6 @@ def init_task_handler(task: dict) -> Union[DataHandler, None]:
    if h_conf is not None:
        handler = init_instance_by_config(h_conf, accept_types=DataHandler)
        task["dataset"]["kwargs"]["handler"] = handler
-
        return handler
+    else:
+        raise ValueError("The task does not contain a handler part.")
--- a/qlib/rl/contrib/backtest.py
+++ b/qlib/rl/contrib/backtest.py
@@ -28,14 +28,15 @@ from qlib.typehint import Literal

 def _get_multi_level_executor_config(
    strategy_config: dict,
-    cash_limit: float = None,
+    cash_limit: float | None = None,
    generate_report: bool = False,
+    data_granularity: str = "1min",
 ) -> dict:
    executor_config = {
        "class": "SimulatorExecutor",
        "module_path": "qlib.backtest.executor",
        "kwargs": {
-            "time_per_step": "1min",
+            "time_per_step": data_granularity,
            "verbose": False,
            "trade_type": SimulatorExecutor.TT_PARAL if cash_limit is not None else SimulatorExecutor.TT_SERIAL,
            "generate_report": generate_report,
@@ -127,7 +128,7 @@ def single_with_simulator(
    backtest_config: dict,
    orders: pd.DataFrame,
    split: Literal["stock", "day"] = "stock",
-    cash_limit: float = None,
+    cash_limit: float | None = None,
    generate_report: bool = False,
 ) -> Union[Tuple[pd.DataFrame, dict], pd.DataFrame]:
    """Run backtest in a single thread with SingleAssetOrderExecution simulator. The orders will be executed day by day.
@@ -154,12 +155,7 @@ def single_with_simulator(
    -------
        If generate_report is True, return execution records and the generated report. Otherwise, return only records.
    """
-    if split == "stock":
-        stock_id = orders.iloc[0].instrument
-        init_qlib(backtest_config["qlib"], part=stock_id)
-    else:
-        day = orders.iloc[0].datetime
-        init_qlib(backtest_config["qlib"], part=day)
+    init_qlib(backtest_config["qlib"])

    stocks = orders.instrument.unique().tolist()

@@ -181,13 +177,14 @@ def single_with_simulator(
            strategy_config=backtest_config["strategies"],
            cash_limit=cash_limit,
            generate_report=generate_report,
+            data_granularity=backtest_config["data_granularity"],
        )

        exchange_config = copy.deepcopy(backtest_config["exchange"])
        exchange_config.update(
            {
                "codes": stocks,
-                "freq": "1min",
+                "freq": backtest_config["data_granularity"],
            }
        )

@@ -202,7 +199,7 @@ def single_with_simulator(
        reports.append(simulator.report_dict)
        decisions += simulator.decisions

-    indicator_1day_objs = [report["indicator"]["1day"][1] for report in reports]
+    indicator_1day_objs = [report["indicator_dict"]["1day"][1] for report in reports]
    indicator_info = {k: v for obj in indicator_1day_objs for k, v in obj.order_indicator_his.items()}
    records = _convert_indicator_to_dataframe(indicator_info)
    assert records is None or not np.isnan(records["ffr"]).any()
@@ -226,7 +223,7 @@ def single_with_collect_data_loop(
    backtest_config: dict,
    orders: pd.DataFrame,
    split: Literal["stock", "day"] = "stock",
-    cash_limit: float = None,
+    cash_limit: float | None = None,
    generate_report: bool = False,
 ) -> Union[Tuple[pd.DataFrame, dict], pd.DataFrame]:
    """Run backtest in a single thread with collect_data_loop.
@@ -253,12 +250,7 @@ def single_with_collect_data_loop(
        If generate_report is True, return execution records and the generated report. Otherwise, return only records.
    """

-    if split == "stock":
-        stock_id = orders.iloc[0].instrument
-        init_qlib(backtest_config["qlib"], part=stock_id)
-    else:
-        day = orders.iloc[0].datetime
-        init_qlib(backtest_config["qlib"], part=day)
+    init_qlib(backtest_config["qlib"])

    trade_start_time = orders["datetime"].min()
    trade_end_time = orders["datetime"].max()
@@ -280,13 +272,14 @@ def single_with_collect_data_loop(
        strategy_config=backtest_config["strategies"],
        cash_limit=cash_limit,
        generate_report=generate_report,
+        data_granularity=backtest_config["data_granularity"],
    )

    exchange_config = copy.deepcopy(backtest_config["exchange"])
    exchange_config.update(
        {
            "codes": stocks,
-            "freq": "1min",
+            "freq": backtest_config["data_granularity"],
        }
    )

@@ -357,7 +350,10 @@ def backtest(backtest_config: dict, with_simulator: bool = False) -> pd.DataFram

    if not output_path.exists():
        os.makedirs(output_path)
-    res.to_csv(output_path / "summary.csv")
+
+    if "pa" in res.columns:
+        res["pa"] = res["pa"] * 10000.0  # align with training metrics
+    res.to_csv(output_path / "backtest_result.csv")
    return res


--- a/qlib/rl/contrib/naive_config_parser.py
+++ b/qlib/rl/contrib/naive_config_parser.py
@@ -98,8 +98,9 @@ def get_backtest_config_fromfile(path: str) -> dict:
        "debug_single_day": None,
        "concurrency": -1,
        "multiplier": 1.0,
-        "output_dir": "outputs/",
+        "output_dir": "outputs_backtest/",
        "generate_report": False,
+        "data_granularity": "1min",
    }
    backtest_config = merge_a_into_b(a=backtest_config, b=backtest_config_default)

--- a/qlib/rl/contrib/train_onpolicy.py
+++ b/qlib/rl/contrib/train_onpolicy.py
@@ -1,20 +1,23 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.
+from __future__ import annotations
+
 import argparse
 import os
 import random
+import sys
+import warnings
 from pathlib import Path
 from typing import cast, List, Optional

 import numpy as np
 import pandas as pd
-import qlib
 import torch
 import yaml
 from qlib.backtest import Order
 from qlib.backtest.decision import OrderDir
 from qlib.constant import ONE_MIN
-from qlib.rl.data.pickle_styled import load_simple_intraday_backtest_data
+from qlib.rl.data.native import load_handler_intraday_processed_data
 from qlib.rl.interpreter import ActionInterpreter, StateInterpreter
 from qlib.rl.order_execution import SingleAssetOrderExecutionSimple
 from qlib.rl.reward import Reward
@@ -23,7 +26,6 @@ from qlib.rl.trainer.callbacks import Callback, EarlyStopping, MetricsWriter
 from qlib.rl.utils.log import CsvWriter
 from qlib.utils import init_instance_by_config
 from tianshou.policy import BasePolicy
-from torch import nn
 from torch.utils.data import Dataset


@@ -49,19 +51,17 @@ def _read_orders(order_dir: Path) -> pd.DataFrame:
 class LazyLoadDataset(Dataset):
    def __init__(
        self,
+        data_dir: str,
        order_file_path: Path,
-        data_dir: Path,
        default_start_time_index: int,
        default_end_time_index: int,
    ) -> None:
        self._default_start_time_index = default_start_time_index
        self._default_end_time_index = default_end_time_index

-        self._order_file_path = order_file_path
        self._order_df = _read_orders(order_file_path).reset_index()
-
-        self._data_dir = data_dir
        self._ticks_index: Optional[pd.DatetimeIndex] = None
+        self._data_dir = Path(data_dir)

    def __len__(self) -> int:
        return len(self._order_df)
@@ -74,12 +74,17 @@ class LazyLoadDataset(Dataset):
            # TODO: We only load ticks index once based on the assumption that ticks index of different dates
            # TODO: in one experiment are all the same. If that assumption is not hold, we need to load ticks index
            # TODO: of all dates.
-            backtest_data = load_simple_intraday_backtest_data(
+
+            data = load_handler_intraday_processed_data(
                data_dir=self._data_dir,
                stock_id=row["instrument"],
                date=date,
+                feature_columns_today=[],
+                feature_columns_yesterday=[],
+                backtest=True,
+                index_only=True,
            )
-            self._ticks_index = [t - date for t in backtest_data.get_time_index()]
+            self._ticks_index = [t - date for t in data.today.index]

        order = Order(
            stock_id=row["instrument"],
@@ -101,10 +106,9 @@ def train_and_test(
    action_interpreter: ActionInterpreter,
    policy: BasePolicy,
    reward: Reward,
+    run_training: bool,
    run_backtest: bool,
 ) -> None:
-    qlib.init()
-
    order_root_path = Path(data_config["source"]["order_dir"])

    data_granularity = simulator_config.get("data_granularity", 1)
@@ -112,72 +116,78 @@ def train_and_test(
    def _simulator_factory_simple(order: Order) -> SingleAssetOrderExecutionSimple:
        return SingleAssetOrderExecutionSimple(
            order=order,
-            data_dir=Path(data_config["source"]["data_dir"]),
-            ticks_per_step=simulator_config["time_per_step"],
+            data_dir=data_config["source"]["feature_root_dir"],
+            feature_columns_today=data_config["source"]["feature_columns_today"],
+            feature_columns_yesterday=data_config["source"]["feature_columns_yesterday"],
            data_granularity=data_granularity,
-            deal_price_type=data_config["source"].get("deal_price_column", "close"),
+            ticks_per_step=simulator_config["time_per_step"],
            vol_threshold=simulator_config["vol_limit"],
        )

    assert data_config["source"]["default_start_time_index"] % data_granularity == 0
    assert data_config["source"]["default_end_time_index"] % data_granularity == 0

-    train_dataset, valid_dataset, test_dataset = [
-        LazyLoadDataset(
-            order_file_path=order_root_path / tag,
-            data_dir=Path(data_config["source"]["data_dir"]),
+    if run_training:
+        train_dataset, valid_dataset = [
+            LazyLoadDataset(
+                data_dir=data_config["source"]["feature_root_dir"],
+                order_file_path=order_root_path / tag,
+                default_start_time_index=data_config["source"]["default_start_time_index"] // data_granularity,
+                default_end_time_index=data_config["source"]["default_end_time_index"] // data_granularity,
+            )
+            for tag in ("train", "valid")
+        ]
+
+        callbacks: List[Callback] = []
+        if "checkpoint_path" in trainer_config:
+            callbacks.append(MetricsWriter(dirpath=Path(trainer_config["checkpoint_path"])))
+            callbacks.append(
+                Checkpoint(
+                    dirpath=Path(trainer_config["checkpoint_path"]) / "checkpoints",
+                    every_n_iters=trainer_config.get("checkpoint_every_n_iters", 1),
+                    save_latest="copy",
+                ),
+            )
+        if "earlystop_patience" in trainer_config:
+            callbacks.append(
+                EarlyStopping(
+                    patience=trainer_config["earlystop_patience"],
+                    monitor="val/pa",
+                )
+            )
+
+        train(
+            simulator_fn=_simulator_factory_simple,
+            state_interpreter=state_interpreter,
+            action_interpreter=action_interpreter,
+            policy=policy,
+            reward=reward,
+            initial_states=cast(List[Order], train_dataset),
+            trainer_kwargs={
+                "max_iters": trainer_config["max_epoch"],
+                "finite_env_type": env_config["parallel_mode"],
+                "concurrency": env_config["concurrency"],
+                "val_every_n_iters": trainer_config.get("val_every_n_epoch", None),
+                "callbacks": callbacks,
+            },
+            vessel_kwargs={
+                "episode_per_iter": trainer_config["episode_per_collect"],
+                "update_kwargs": {
+                    "batch_size": trainer_config["batch_size"],
+                    "repeat": trainer_config["repeat_per_collect"],
+                },
+                "val_initial_states": valid_dataset,
+            },
+        )
+
+    if run_backtest:
+        test_dataset = LazyLoadDataset(
+            data_dir=data_config["source"]["feature_root_dir"],
+            order_file_path=order_root_path / "test",
            default_start_time_index=data_config["source"]["default_start_time_index"] // data_granularity,
            default_end_time_index=data_config["source"]["default_end_time_index"] // data_granularity,
        )
-        for tag in ("train", "valid", "test")
-    ]

-    if "checkpoint_path" in trainer_config:
-        callbacks: List[Callback] = []
-        callbacks.append(MetricsWriter(dirpath=Path(trainer_config["checkpoint_path"])))
-        callbacks.append(
-            Checkpoint(
-                dirpath=Path(trainer_config["checkpoint_path"]) / "checkpoints",
-                every_n_iters=trainer_config.get("checkpoint_every_n_iters", 1),
-                save_latest="copy",
-            ),
-        )
-    if "earlystop_patience" in trainer_config:
-        callbacks.append(
-            EarlyStopping(
-                patience=trainer_config["earlystop_patience"],
-                monitor="val/pa",
-            )
-        )
-
-    trainer_kwargs = {
-        "max_iters": trainer_config["max_epoch"],
-        "finite_env_type": env_config["parallel_mode"],
-        "concurrency": env_config["concurrency"],
-        "val_every_n_iters": trainer_config.get("val_every_n_epoch", None),
-        "callbacks": callbacks,
-    }
-    vessel_kwargs = {
-        "episode_per_iter": trainer_config["episode_per_collect"],
-        "update_kwargs": {
-            "batch_size": trainer_config["batch_size"],
-            "repeat": trainer_config["repeat_per_collect"],
-        },
-        "val_initial_states": valid_dataset,
-    }
-
-    train(
-        simulator_fn=_simulator_factory_simple,
-        state_interpreter=state_interpreter,
-        action_interpreter=action_interpreter,
-        policy=policy,
-        reward=reward,
-        initial_states=cast(List[Order], train_dataset),
-        trainer_kwargs=trainer_kwargs,
-        vessel_kwargs=vessel_kwargs,
-    )
-
-    if run_backtest:
        backtest(
            simulator_fn=_simulator_factory_simple,
            state_interpreter=state_interpreter,
@@ -186,35 +196,42 @@ def train_and_test(
            policy=policy,
            logger=CsvWriter(Path(trainer_config["checkpoint_path"])),
            reward=reward,
-            finite_env_type=trainer_kwargs["finite_env_type"],
-            concurrency=trainer_kwargs["concurrency"],
+            finite_env_type=env_config["parallel_mode"],
+            concurrency=env_config["concurrency"],
        )


-def main(config: dict, run_backtest: bool) -> None:
+def main(config: dict, run_training: bool, run_backtest: bool) -> None:
+    if not run_training and not run_backtest:
+        warnings.warn("Skip the entire job since training and backtest are both skipped.")
+        return
+
    if "seed" in config["runtime"]:
        seed_everything(config["runtime"]["seed"])

-    state_config = config["state_interpreter"]
-    state_interpreter: StateInterpreter = init_instance_by_config(state_config)
+    for extra_module_path in config["env"].get("extra_module_paths", []):
+        sys.path.append(extra_module_path)

+    state_interpreter: StateInterpreter = init_instance_by_config(config["state_interpreter"])
    action_interpreter: ActionInterpreter = init_instance_by_config(config["action_interpreter"])
    reward: Reward = init_instance_by_config(config["reward"])

+    additional_policy_kwargs = {
+        "obs_space": state_interpreter.observation_space,
+        "action_space": action_interpreter.action_space,
+    }
+
    # Create torch network
-    if "kwargs" not in config["network"]:
-        config["network"]["kwargs"] = {}
-    config["network"]["kwargs"].update({"obs_space": state_interpreter.observation_space})
-    network: nn.Module = init_instance_by_config(config["network"])
+    if "network" in config:
+        if "kwargs" not in config["network"]:
+            config["network"]["kwargs"] = {}
+        config["network"]["kwargs"].update({"obs_space": state_interpreter.observation_space})
+        additional_policy_kwargs["network"] = init_instance_by_config(config["network"])

    # Create policy
-    config["policy"]["kwargs"].update(
-        {
-            "network": network,
-            "obs_space": state_interpreter.observation_space,
-            "action_space": action_interpreter.action_space,
-        }
-    )
+    if "kwargs" not in config["policy"]:
+        config["policy"]["kwargs"] = {}
+    config["policy"]["kwargs"].update(additional_policy_kwargs)
    policy: BasePolicy = init_instance_by_config(config["policy"])

    use_cuda = config["runtime"].get("use_cuda", False)
@@ -230,22 +247,22 @@ def main(config: dict, run_backtest: bool) -> None:
        state_interpreter=state_interpreter,
        policy=policy,
        reward=reward,
+        run_training=run_training,
        run_backtest=run_backtest,
    )


 if __name__ == "__main__":
-    import warnings
-
    warnings.filterwarnings("ignore", category=DeprecationWarning)
    warnings.filterwarnings("ignore", category=RuntimeWarning)

    parser = argparse.ArgumentParser()
    parser.add_argument("--config_path", type=str, required=True, help="Path to the config file")
-    parser.add_argument("--run_backtest", action="store_true", help="Run backtest workflow after training is finished")
+    parser.add_argument("--no_training", action="store_true", help="Skip training workflow.")
+    parser.add_argument("--run_backtest", action="store_true", help="Run backtest workflow.")
    args = parser.parse_args()

    with open(args.config_path, "r") as input_stream:
        config = yaml.safe_load(input_stream)

-    main(config, run_backtest=args.run_backtest)
+    main(config, run_training=not args.no_training, run_backtest=args.run_backtest)
--- a/qlib/rl/data/integration.py
+++ b/qlib/rl/data/integration.py
@@ -8,48 +8,14 @@ TODO: The implementation here is kind of adhoc. It is better to design a more un

 from __future__ import annotations

-import pickle
 from pathlib import Path
-from typing import List

-import cachetools
-import numpy as np
-import pandas as pd
 import qlib
 from qlib.constant import REG_CN
 from qlib.contrib.ops.high_freq import BFillNan, Cut, Date, DayCumsum, DayLast, FFillNan, IsInf, IsNull, Select
-from qlib.data.dataset import DatasetH
-
-dataset = None


-class DataWrapper:
-    def __init__(
-        self,
-        feature_dataset: DatasetH,
-        backtest_dataset: DatasetH,
-        columns_today: List[str],
-        columns_yesterday: List[str],
-        _internal: bool = False,
-    ):
-        assert _internal, "Init function of data wrapper is for internal use only."
-
-        self.feature_dataset = feature_dataset
-        self.backtest_dataset = backtest_dataset
-        self.columns_today = columns_today
-        self.columns_yesterday = columns_yesterday
-
-    @cachetools.cached(  # type: ignore
-        cache=cachetools.LRUCache(100),
-        key=lambda _, stock_id, date, backtest: (stock_id, date.replace(hour=0, minute=0, second=0), backtest),
-    )
-    def get(self, stock_id: str, date: pd.Timestamp, backtest: bool = False) -> pd.DataFrame:
-        start_time, end_time = date.replace(hour=0, minute=0, second=0), date.replace(hour=23, minute=59, second=59)
-        dataset = self.backtest_dataset if backtest else self.feature_dataset
-        return dataset.handler.fetch(pd.IndexSlice[stock_id, start_time:end_time], level=None)
-
-
-def init_qlib(qlib_config: dict, part: str = None) -> None:
+def init_qlib(qlib_config: dict) -> None:
    """Initialize necessary resource to launch the workflow, including data direction, feature columns, etc..

    Parameters
@@ -72,20 +38,15 @@ def init_qlib(qlib_config: dict, part: str = None) -> None:
                    "$bidV_1", "$bidV1_1", "$bidV3_1", "$bidV5_1", "$askV_1", "$askV1_1", "$askV3_1", "$askV5_1",
                ],
            }
-    part
-        Identifying which part (stock / date) to load.
    """

-    global dataset  # pylint: disable=W0603
-
    def _convert_to_path(path: str | Path) -> Path:
        return path if isinstance(path, Path) else Path(path)

    provider_uri_map = {}
-    if "provider_uri_day" in qlib_config:
-        provider_uri_map["day"] = _convert_to_path(qlib_config["provider_uri_day"]).as_posix()
-    if "provider_uri_1min" in qlib_config:
-        provider_uri_map["1min"] = _convert_to_path(qlib_config["provider_uri_1min"]).as_posix()
+    for granularity in ["1min", "5min", "day"]:
+        if f"provider_uri_{granularity}" in qlib_config:
+            provider_uri_map[f"{granularity}"] = _convert_to_path(qlib_config[f"provider_uri_{granularity}"]).as_posix()

    qlib.init(
        region=REG_CN,
@@ -119,47 +80,3 @@ def init_qlib(qlib_config: dict, part: str = None) -> None:
        redis_port=-1,
        clear_mem_cache=False,  # init_qlib will be called for multiple times. Keep the cache for improving performance
    )
-
-    if part == "skip":
-        return
-
-    # this won't work if it's put outside in case of multiprocessing
-    from qlib.data import D  # noqa pylint: disable=C0415,W0611
-
-    if part is None:
-        feature_path = Path(qlib_config["feature_root_dir"]) / "feature.pkl"
-        backtest_path = Path(qlib_config["feature_root_dir"]) / "backtest.pkl"
-    else:
-        feature_path = Path(qlib_config["feature_root_dir"]) / "feature" / (part + ".pkl")
-        backtest_path = Path(qlib_config["feature_root_dir"]) / "backtest" / (part + ".pkl")
-
-    with feature_path.open("rb") as f:
-        feature_dataset = pickle.load(f)
-    with backtest_path.open("rb") as f:
-        backtest_dataset = pickle.load(f)
-
-    dataset = DataWrapper(
-        feature_dataset,
-        backtest_dataset,
-        qlib_config["feature_columns_today"],
-        qlib_config["feature_columns_yesterday"],
-        _internal=True,
-    )
-
-
-def fetch_features(stock_id: str, date: pd.Timestamp, yesterday: bool = False, backtest: bool = False) -> pd.DataFrame:
-    assert dataset is not None, "You must call init_qlib() before doing this."
-
-    if backtest:
-        fields = ["$close", "$volume"]
-    else:
-        fields = dataset.columns_yesterday if yesterday else dataset.columns_today
-
-    data = dataset.get(stock_id, date, backtest)
-    if data is None or len(data) == 0:
-        # create a fake index, but RL doesn't care about index
-        data = pd.DataFrame(0.0, index=np.arange(240), columns=fields, dtype=np.float32)  # FIXME: hardcode here
-    else:
-        data = data.rename(columns={c: c.rstrip("0") for c in data.columns})
-        data = data[fields]
-    return data
--- a/qlib/rl/data/native.py
+++ b/qlib/rl/data/native.py
@@ -2,17 +2,29 @@
 # Licensed under the MIT License.
 from __future__ import annotations

-from typing import cast
+from pathlib import Path
+from typing import cast, List

 import cachetools
 import pandas as pd
+import pickle
+import os

 from qlib.backtest import Exchange, Order
 from qlib.backtest.decision import TradeRange, TradeRangeByTime
-from qlib.rl.order_execution.utils import get_ticks_slice
-
+from qlib.constant import EPS_T
 from .base import BaseIntradayBacktestData, BaseIntradayProcessedData, ProcessedDataProvider
-from .integration import fetch_features
+
+
+def get_ticks_slice(
+    ticks_index: pd.DatetimeIndex,
+    start: pd.Timestamp,
+    end: pd.Timestamp,
+    include_end: bool = False,
+) -> pd.DatetimeIndex:
+    if not include_end:
+        end = end - EPS_T
+    return ticks_index[ticks_index.slice_indexer(start, end)]


 class IntradayBacktestData(BaseIntradayBacktestData):
@@ -71,6 +83,31 @@ class IntradayBacktestData(BaseIntradayBacktestData):
        return pd.DatetimeIndex([e[1] for e in list(self._exchange.quote_df.index)])


+class DataframeIntradayBacktestData(BaseIntradayBacktestData):
+    """Backtest data backed by a pandas DataFrame."""
+
+    def __init__(self, df: pd.DataFrame, price_column: str = "$close0", volume_column: str = "$volume0") -> None:
+        self.df = df
+        self.price_column = price_column
+        self.volume_column = volume_column
+
+    def __repr__(self) -> str:
+        with pd.option_context("memory_usage", False, "display.max_info_columns", 1, "display.large_repr", "info"):
+            return f"{self.__class__.__name__}({self.df})"
+
+    def __len__(self) -> int:
+        return len(self.df)
+
+    def get_deal_price(self) -> pd.Series:
+        return self.df[self.price_column]
+
+    def get_volume(self) -> pd.Series:
+        return self.df[self.volume_column]
+
+    def get_time_index(self) -> pd.DatetimeIndex:
+        return cast(pd.DatetimeIndex, self.df.index)
+
+
@cachetools.cached(  # type: ignore
    cache=cachetools.LRUCache(100),
    key=lambda order, _, __: order.key_by_day,
@@ -103,13 +140,18 @@ def load_backtest_data(
    return backtest_data


-class NTIntradayProcessedData(BaseIntradayProcessedData):
-    """Subclass of IntradayProcessedData. Used to handle NT style data."""
+class HandlerIntradayProcessedData(BaseIntradayProcessedData):
+    """Subclass of IntradayProcessedData. Used to handle handler (bin format) style data."""

    def __init__(
        self,
+        data_dir: Path,
        stock_id: str,
        date: pd.Timestamp,
+        feature_columns_today: List[str],
+        feature_columns_yesterday: List[str],
+        backtest: bool = False,
+        index_only: bool = False,
    ) -> None:
        def _drop_stock_id(df: pd.DataFrame) -> pd.DataFrame:
            df = df.reset_index()
@@ -117,8 +159,18 @@ class NTIntradayProcessedData(BaseIntradayProcessedData):
                df = df.drop(columns=["instrument"])
            return df.set_index(["datetime"])

-        self.today = _drop_stock_id(fetch_features(stock_id, date))
-        self.yesterday = _drop_stock_id(fetch_features(stock_id, date, yesterday=True))
+        path = os.path.join(data_dir, "backtest" if backtest else "feature", f"{stock_id}.pkl")
+        start_time, end_time = date.replace(hour=0, minute=0, second=0), date.replace(hour=23, minute=59, second=59)
+        with open(path, "rb") as fstream:
+            dataset = pickle.load(fstream)
+        data = dataset.handler.fetch(pd.IndexSlice[stock_id, start_time:end_time], level=None)
+
+        if index_only:
+            self.today = _drop_stock_id(data[[]])
+            self.yesterday = _drop_stock_id(data[[]])
+        else:
+            self.today = _drop_stock_id(data[feature_columns_today])
+            self.yesterday = _drop_stock_id(data[feature_columns_yesterday])

    def __repr__(self) -> str:
        with pd.option_context("memory_usage", False, "display.max_info_columns", 1, "display.large_repr", "info"):
@@ -127,12 +179,42 @@ class NTIntradayProcessedData(BaseIntradayProcessedData):

@cachetools.cached(  # type: ignore
    cache=cachetools.LRUCache(100),  # 100 * 50K = 5MB
+    key=lambda data_dir, stock_id, date, feature_columns_today, feature_columns_yesterday, backtest, index_only: (
+        stock_id,
+        date,
+        backtest,
+        index_only,
+    ),
 )
-def load_nt_intraday_processed_data(stock_id: str, date: pd.Timestamp) -> NTIntradayProcessedData:
-    return NTIntradayProcessedData(stock_id, date)
+def load_handler_intraday_processed_data(
+    data_dir: Path,
+    stock_id: str,
+    date: pd.Timestamp,
+    feature_columns_today: List[str],
+    feature_columns_yesterday: List[str],
+    backtest: bool = False,
+    index_only: bool = False,
+) -> HandlerIntradayProcessedData:
+    return HandlerIntradayProcessedData(
+        data_dir, stock_id, date, feature_columns_today, feature_columns_yesterday, backtest, index_only
+    )


-class NTProcessedDataProvider(ProcessedDataProvider):
+class HandlerProcessedDataProvider(ProcessedDataProvider):
+    def __init__(
+        self,
+        data_dir: str,
+        feature_columns_today: List[str],
+        feature_columns_yesterday: List[str],
+        backtest: bool = False,
+    ) -> None:
+        super().__init__()
+
+        self.data_dir = Path(data_dir)
+        self.feature_columns_today = feature_columns_today
+        self.feature_columns_yesterday = feature_columns_yesterday
+        self.backtest = backtest
+
    def get_data(
        self,
        stock_id: str,
@@ -140,4 +222,12 @@ class NTProcessedDataProvider(ProcessedDataProvider):
        feature_dim: int,
        time_index: pd.Index,
    ) -> BaseIntradayProcessedData:
-        return load_nt_intraday_processed_data(stock_id, date)
+        return load_handler_intraday_processed_data(
+            self.data_dir,
+            stock_id,
+            date,
+            self.feature_columns_today,
+            self.feature_columns_yesterday,
+            backtest=self.backtest,
+            index_only=False,
+        )
--- a/qlib/rl/data/pickle_styled.py
+++ b/qlib/rl/data/pickle_styled.py
@@ -104,7 +104,7 @@ class SimpleIntradayBacktestData(BaseIntradayBacktestData):
        stock_id: str,
        date: pd.Timestamp,
        deal_price: DealPriceType = "close",
-        order_dir: int = None,
+        order_dir: int | None = None,
    ) -> None:
        super(SimpleIntradayBacktestData, self).__init__()

@@ -158,8 +158,8 @@ class SimpleIntradayBacktestData(BaseIntradayBacktestData):
        return cast(pd.DatetimeIndex, self.data.index)


-class IntradayProcessedData(BaseIntradayProcessedData):
-    """Subclass of IntradayProcessedData. Used to handle Dataset Handler style data."""
+class PickleIntradayProcessedData(BaseIntradayProcessedData):
+    """Subclass of IntradayProcessedData. Used to handle pickle-styled data."""

    def __init__(
        self,
@@ -208,7 +208,7 @@ def load_simple_intraday_backtest_data(
    stock_id: str,
    date: pd.Timestamp,
    deal_price: DealPriceType = "close",
-    order_dir: int = None,
+    order_dir: int | None = None,
 ) -> SimpleIntradayBacktestData:
    return SimpleIntradayBacktestData(data_dir, stock_id, date, deal_price, order_dir)

@@ -217,14 +217,14 @@ def load_simple_intraday_backtest_data(
    cache=cachetools.LRUCache(100),  # 100 * 50K = 5MB
    key=lambda data_dir, stock_id, date, feature_dim, time_index: hashkey(data_dir, stock_id, date),
 )
-def load_pickled_intraday_processed_data(
+def load_pickle_intraday_processed_data(
    data_dir: Path,
    stock_id: str,
    date: pd.Timestamp,
    feature_dim: int,
    time_index: pd.Index,
 ) -> BaseIntradayProcessedData:
-    return IntradayProcessedData(data_dir, stock_id, date, feature_dim, time_index)
+    return PickleIntradayProcessedData(data_dir, stock_id, date, feature_dim, time_index)


 class PickleProcessedDataProvider(ProcessedDataProvider):
@@ -240,7 +240,7 @@ class PickleProcessedDataProvider(ProcessedDataProvider):
        feature_dim: int,
        time_index: pd.Index,
    ) -> BaseIntradayProcessedData:
-        return load_pickled_intraday_processed_data(
+        return load_pickle_intraday_processed_data(
            data_dir=self._data_dir,
            stock_id=stock_id,
            date=date,
--- a/qlib/rl/order_execution/interpreter.py
+++ b/qlib/rl/order_execution/interpreter.py
@@ -53,6 +53,18 @@ class FullHistoryObs(TypedDict):
    position_history: Any


+class DummyStateInterpreter(StateInterpreter[SAOEState, dict]):
+    """Dummy interpreter for policies that do not need inputs (for example, AllOne)."""
+
+    def interpret(self, state: SAOEState) -> dict:
+        # TODO: A fake state, used to pass `check_nan_observation`. Find a better way in the future.
+        return {"DUMMY": _to_int32(1)}
+
+    @property
+    def observation_space(self) -> spaces.Dict:
+        return spaces.Dict({"DUMMY": spaces.Box(-np.inf, np.inf, shape=(), dtype=np.int32)})
+
+
 class FullHistoryStateInterpreter(StateInterpreter[SAOEState, FullHistoryObs]):
    """The observation of all the history, including today (until this moment), and yesterday.

--- a/qlib/rl/order_execution/policy.py
+++ b/qlib/rl/order_execution/policy.py
@@ -12,11 +12,11 @@ import torch
 import torch.nn as nn
 from gym.spaces import Discrete
 from tianshou.data import Batch, ReplayBuffer, to_torch
-from tianshou.policy import BasePolicy, PPOPolicy
+from tianshou.policy import BasePolicy, PPOPolicy, DQNPolicy

 from qlib.rl.trainer.trainer import Trainer

-__all__ = ["AllOne", "PPO"]
+__all__ = ["AllOne", "PPO", "DQN"]


 # baselines #
@@ -32,7 +32,7 @@ class NonLearnablePolicy(BasePolicy):
        super().__init__()

    def learn(self, batch: Batch, **kwargs: Any) -> Dict[str, Any]:
-        pass
+        return {}

    def process_fn(
        self,
@@ -40,7 +40,7 @@ class NonLearnablePolicy(BasePolicy):
        buffer: ReplayBuffer,
        indices: np.ndarray,
    ) -> Batch:
-        pass
+        return Batch({})


 class AllOne(NonLearnablePolicy):
@@ -49,13 +49,18 @@ class AllOne(NonLearnablePolicy):
    Useful when implementing some baselines (e.g., TWAP).
    """

+    def __init__(self, obs_space: gym.Space, action_space: gym.Space, fill_value: float | int = 1.0) -> None:
+        super().__init__(obs_space, action_space)
+
+        self.fill_value = fill_value
+
    def forward(
        self,
        batch: Batch,
        state: dict | Batch | np.ndarray = None,
        **kwargs: Any,
    ) -> Batch:
-        return Batch(act=np.full(len(batch), 1.0), state=state)
+        return Batch(act=np.full(len(batch), self.fill_value), state=state)


 # ppo #
@@ -153,6 +158,56 @@ class PPO(PPOPolicy):
            set_weight(self, Trainer.get_policy_state_dict(weight_file))


+DQNModel = PPOActor  # Reuse PPOActor.
+
+
+class DQN(DQNPolicy):
+    """A wrapper of tianshou DQNPolicy.
+
+    Differences:
+
+    - Auto-create model network. Supports discrete action space only.
+    - Supports a ``weight_file`` argument for loading a checkpoint.
+    """
+
+    def __init__(
+        self,
+        network: nn.Module,
+        obs_space: gym.Space,
+        action_space: gym.Space,
+        lr: float,
+        weight_decay: float = 0.0,
+        discount_factor: float = 0.99,
+        estimation_step: int = 1,
+        target_update_freq: int = 0,
+        reward_normalization: bool = False,
+        is_double: bool = True,
+        clip_loss_grad: bool = False,
+        weight_file: Optional[Path] = None,
+    ) -> None:
+        assert isinstance(action_space, Discrete)
+
+        model = DQNModel(network, action_space.n)
+        optimizer = torch.optim.Adam(
+            model.parameters(),
+            lr=lr,
+            weight_decay=weight_decay,
+        )
+
+        super().__init__(
+            model,
+            optimizer,
+            discount_factor=discount_factor,
+            estimation_step=estimation_step,
+            target_update_freq=target_update_freq,
+            reward_normalization=reward_normalization,
+            is_double=is_double,
+            clip_loss_grad=clip_loss_grad,
+        )
+        if weight_file is not None:
+            set_weight(self, Trainer.get_policy_state_dict(weight_file))
+
+
 # utilities: these should be put in a separate (common) file. #


--- a/qlib/rl/order_execution/reward.py
+++ b/qlib/rl/order_execution/reward.py
@@ -7,6 +7,7 @@ from typing import cast

 import numpy as np

+from qlib.backtest.decision import OrderDir
 from qlib.rl.order_execution.state import SAOEMetrics, SAOEState
 from qlib.rl.reward import Reward

@@ -47,3 +48,52 @@ class PAPenaltyReward(Reward[SAOEState]):
        self.log("reward/pa", pa)
        self.log("reward/penalty", penalty)
        return reward * self.scale
+
+
+class PPOReward(Reward[SAOEState]):
+    """Reward proposed by paper "An End-to-End Optimal Trade Execution Framework based on Proximal Policy Optimization".
+
+    Parameters
+    ----------
+    max_step
+        Maximum number of steps.
+    start_time_index
+        First time index that is allowed to trade.
+    end_time_index
+        Last time index that is allowed to trade.
+    """
+
+    def __init__(self, max_step: int, start_time_index: int = 0, end_time_index: int = 239) -> None:
+        self.max_step = max_step
+        self.start_time_index = start_time_index
+        self.end_time_index = end_time_index
+
+    def reward(self, simulator_state: SAOEState) -> float:
+        if simulator_state.cur_step == self.max_step - 1 or simulator_state.position < 1e-6:
+            if simulator_state.history_exec["deal_amount"].sum() == 0.0:
+                vwap_price = cast(
+                    float,
+                    np.average(simulator_state.history_exec["market_price"]),
+                )
+            else:
+                vwap_price = cast(
+                    float,
+                    np.average(
+                        simulator_state.history_exec["market_price"],
+                        weights=simulator_state.history_exec["deal_amount"],
+                    ),
+                )
+            twap_price = simulator_state.backtest_data.get_deal_price().mean()
+
+            if simulator_state.order.direction == OrderDir.SELL:
+                ratio = vwap_price / twap_price if twap_price != 0 else 1.0
+            else:
+                ratio = twap_price / vwap_price if vwap_price != 0 else 1.0
+            if ratio < 1.0:
+                return -1.0
+            elif ratio < 1.1:
+                return 0.0
+            else:
+                return 1.0
+        else:
+            return 0.0
--- a/qlib/rl/order_execution/simulator_qlib.py
+++ b/qlib/rl/order_execution/simulator_qlib.py
@@ -38,8 +38,8 @@ class SingleAssetOrderExecution(Simulator[Order, SAOEState, float]):
        order: Order,
        executor_config: dict,
        exchange_config: dict,
-        qlib_config: dict = None,
-        cash_limit: Optional[float] = None,
+        qlib_config: dict | None = None,
+        cash_limit: float | None = None,
    ) -> None:
        super().__init__(initial=order)

@@ -63,11 +63,11 @@ class SingleAssetOrderExecution(Simulator[Order, SAOEState, float]):
        strategy_config: dict,
        executor_config: dict,
        exchange_config: dict,
-        qlib_config: dict = None,
+        qlib_config: dict | None = None,
        cash_limit: Optional[float] = None,
    ) -> None:
        if qlib_config is not None:
-            init_qlib(qlib_config, part="skip")
+            init_qlib(qlib_config)

        strategy, self._executor = get_strategy_executor(
            start_time=order.date,
--- a/qlib/rl/order_execution/simulator_simple.py
+++ b/qlib/rl/order_execution/simulator_simple.py
@@ -3,17 +3,19 @@

 from __future__ import annotations

-from pathlib import Path
-from typing import Any, cast, Optional
+from typing import Any, cast, List, Optional

 import numpy as np
 import pandas as pd
+
+from pathlib import Path
 from qlib.backtest.decision import Order, OrderDir
 from qlib.constant import EPS, EPS_T, float_or_ndarray
-from qlib.rl.data.pickle_styled import DealPriceType, load_simple_intraday_backtest_data
+from qlib.rl.data.base import BaseIntradayBacktestData
+from qlib.rl.data.native import DataframeIntradayBacktestData, load_handler_intraday_processed_data
+from qlib.rl.data.pickle_styled import load_simple_intraday_backtest_data
 from qlib.rl.simulator import Simulator
 from qlib.rl.utils import LogLevel
-
 from .state import SAOEMetrics, SAOEState

 __all__ = ["SingleAssetOrderExecutionSimple"]
@@ -36,12 +38,16 @@ class SingleAssetOrderExecutionSimple(Simulator[Order, SAOEState, float]):
    ----------
    order
        The seed to start an SAOE simulator is an order.
+    data_dir
+        Path to load backtest data.
+    feature_columns_today
+        Columns of today's feature.
+    feature_columns_yesterday
+        Columns of yesterday's feature.
    data_granularity
        Number of ticks between consecutive data entries.
    ticks_per_step
        How many ticks per step.
-    data_dir
-        Path to load backtest data
    vol_threshold
        Maximum execution volume (divided by market execution volume).
    """
@@ -73,9 +79,10 @@ class SingleAssetOrderExecutionSimple(Simulator[Order, SAOEState, float]):
        self,
        order: Order,
        data_dir: Path,
+        feature_columns_today: List[str] = [],
+        feature_columns_yesterday: List[str] = [],
        data_granularity: int = 1,
        ticks_per_step: int = 30,
-        deal_price_type: DealPriceType = "close",
        vol_threshold: Optional[float] = None,
    ) -> None:
        super().__init__(initial=order)
@@ -83,18 +90,13 @@ class SingleAssetOrderExecutionSimple(Simulator[Order, SAOEState, float]):
        assert ticks_per_step % data_granularity == 0

        self.order = order
-        self.ticks_per_step: int = ticks_per_step // data_granularity
-        self.deal_price_type = deal_price_type
-        self.vol_threshold = vol_threshold
        self.data_dir = data_dir
-        self.backtest_data = load_simple_intraday_backtest_data(
-            self.data_dir,
-            order.stock_id,
-            pd.Timestamp(order.start_time.date()),
-            self.deal_price_type,
-            order.direction,
-        )
+        self.feature_columns_today = feature_columns_today
+        self.feature_columns_yesterday = feature_columns_yesterday
+        self.ticks_per_step: int = ticks_per_step // data_granularity
+        self.vol_threshold = vol_threshold

+        self.backtest_data = self.get_backtest_data()
        self.ticks_index = self.backtest_data.get_time_index()

        # Get time index available for trading
@@ -118,6 +120,30 @@ class SingleAssetOrderExecutionSimple(Simulator[Order, SAOEState, float]):
        self.market_vol: Optional[np.ndarray] = None
        self.market_vol_limit: Optional[np.ndarray] = None

+    def get_backtest_data(self) -> BaseIntradayBacktestData:
+        try:
+            data = load_handler_intraday_processed_data(
+                data_dir=self.data_dir,
+                stock_id=self.order.stock_id,
+                date=pd.Timestamp(self.order.start_time.date()),
+                feature_columns_today=self.feature_columns_today,
+                feature_columns_yesterday=self.feature_columns_yesterday,
+                backtest=True,
+                index_only=False,
+            )
+            return DataframeIntradayBacktestData(data.today)
+        except (AttributeError, FileNotFoundError):
+            # TODO: For compatibility with older versions of test scripts (tests/rl/test_saoe_simple.py)
+            # TODO: In the future, we should modify the data format used by the test script,
+            # TODO: and then delete this branch.
+            return load_simple_intraday_backtest_data(
+                self.data_dir / "backtest",
+                self.order.stock_id,
+                pd.Timestamp(self.order.start_time.date()),
+                "close",
+                self.order.direction,
+            )
+
    def step(self, amount: float) -> None:
        """Execute one step or SAOE.

--- a/qlib/rl/order_execution/strategy.py
+++ b/qlib/rl/order_execution/strategy.py
@@ -7,6 +7,7 @@ import collections
 from types import GeneratorType
 from typing import Any, Callable, cast, Dict, Generator, List, Optional, Tuple, Union

+import warnings
 import numpy as np
 import pandas as pd
 import torch
@@ -89,6 +90,7 @@ class SAOEStateAdapter:
        exchange: Exchange,
        ticks_per_step: int,
        backtest_data: IntradayBacktestData,
+        data_granularity: int = 1,
    ) -> None:
        self.position = order.amount
        self.order = order
@@ -106,11 +108,13 @@ class SAOEStateAdapter:

        self.cur_time = max(backtest_data.ticks_for_order[0], order.start_time)
        self.ticks_per_step = ticks_per_step
+        self.data_granularity = data_granularity
+        assert self.ticks_per_step % self.data_granularity == 0

    def _next_time(self) -> pd.Timestamp:
        current_loc = self.backtest_data.ticks_index.get_loc(self.cur_time)
-        next_loc = current_loc + self.ticks_per_step
-        next_loc = next_loc - next_loc % self.ticks_per_step
+        next_loc = current_loc + (self.ticks_per_step // self.data_granularity)
+        next_loc = next_loc - next_loc % (self.ticks_per_step // self.data_granularity)
        if (
            next_loc < len(self.backtest_data.ticks_index)
            and self.backtest_data.ticks_index[next_loc] < self.order.end_time
@@ -130,11 +134,16 @@ class SAOEStateAdapter:

        exec_vol = np.zeros(last_step_size)
        for order, _, __, ___ in execute_result:
-            idx, _ = get_day_min_idx_range(order.start_time, order.end_time, "1min", REG_CN)
+            idx, _ = get_day_min_idx_range(order.start_time, order.end_time, f"{self.data_granularity}min", REG_CN)
            exec_vol[idx - last_step_range[0]] = order.deal_amount

        if exec_vol.sum() > self.position and exec_vol.sum() > 0.0:
-            assert exec_vol.sum() < self.position + 1, f"{exec_vol} too large"
+            if exec_vol.sum() > self.position + 1.0:
+                warnings.warn(
+                    f"Sum of execution volume is {exec_vol.sum()} which is larger than "
+                    f"position + 1.0 = {self.position} + 1.0 = {self.position + 1.0}. "
+                    f"All execution volume is scaled down linearly to ensure that their sum does not exceed position."
+                )
            exec_vol *= self.position / (exec_vol.sum())

        market_volume = cast(
@@ -168,7 +177,9 @@ class SAOEStateAdapter:
            self.history_exec,
            self._collect_multi_order_metric(
                order=self.order,
-                datetime=_get_all_timestamps(start_time, end_time, include_end=True),
+                datetime=_get_all_timestamps(
+                    start_time, end_time, include_end=True, granularity=ONE_MIN * self.data_granularity
+                ),
                market_vol=market_volume,
                market_price=market_price,
                exec_vol=exec_vol,
@@ -293,9 +304,10 @@ class SAOEStrategy(RLStrategy):
    def __init__(
        self,
        policy: BasePolicy,
-        outer_trade_decision: BaseTradeDecision = None,
-        level_infra: LevelInfrastructure = None,
-        common_infra: CommonInfrastructure = None,
+        outer_trade_decision: BaseTradeDecision | None = None,
+        level_infra: LevelInfrastructure | None = None,
+        common_infra: CommonInfrastructure | None = None,
+        data_granularity: int = 1,
        **kwargs: Any,
    ) -> None:
        super(SAOEStrategy, self).__init__(
@@ -306,6 +318,7 @@ class SAOEStrategy(RLStrategy):
            **kwargs,
        )

+        self._data_granularity = data_granularity
        self.adapter_dict: Dict[tuple, SAOEStateAdapter] = {}
        self._last_step_range = (0, 0)

@@ -324,9 +337,10 @@ class SAOEStrategy(RLStrategy):
            exchange=self.trade_exchange,
            ticks_per_step=int(pd.Timedelta(self.trade_calendar.get_freq()) / ONE_MIN),
            backtest_data=backtest_data,
+            data_granularity=self._data_granularity,
        )

-    def reset(self, outer_trade_decision: BaseTradeDecision = None, **kwargs: Any) -> None:
+    def reset(self, outer_trade_decision: BaseTradeDecision | None = None, **kwargs: Any) -> None:
        super(SAOEStrategy, self).reset(outer_trade_decision=outer_trade_decision, **kwargs)

        self.adapter_dict = {}
@@ -366,7 +380,7 @@ class SAOEStrategy(RLStrategy):

    def generate_trade_decision(
        self,
-        execute_result: list = None,
+        execute_result: list | None = None,
    ) -> Union[BaseTradeDecision, Generator[Any, Any, BaseTradeDecision]]:
        """
        For SAOEStrategy, we need to update the `self._last_step_range` every time a decision is generated.
@@ -385,7 +399,7 @@ class SAOEStrategy(RLStrategy):

    def _generate_trade_decision(
        self,
-        execute_result: list = None,
+        execute_result: list | None = None,
    ) -> Union[BaseTradeDecision, Generator[Any, Any, BaseTradeDecision]]:
        raise NotImplementedError

@@ -399,14 +413,14 @@ class ProxySAOEStrategy(SAOEStrategy):

    def __init__(
        self,
-        outer_trade_decision: BaseTradeDecision = None,
-        level_infra: LevelInfrastructure = None,
-        common_infra: CommonInfrastructure = None,
+        outer_trade_decision: BaseTradeDecision | None = None,
+        level_infra: LevelInfrastructure | None = None,
+        common_infra: CommonInfrastructure | None = None,
        **kwargs: Any,
    ) -> None:
        super().__init__(None, outer_trade_decision, level_infra, common_infra, **kwargs)

-    def _generate_trade_decision(self, execute_result: list = None) -> Generator[Any, Any, BaseTradeDecision]:
+    def _generate_trade_decision(self, execute_result: list | None = None) -> Generator[Any, Any, BaseTradeDecision]:
        # Once the following line is executed, this ProxySAOEStrategy (self) will be yielded to the outside
        # of the entire executor, and the execution will be suspended. When the execution is resumed by `send()`,
        # the item will be captured by `exec_vol`. The outside policy could communicate with the inner
@@ -418,7 +432,7 @@ class ProxySAOEStrategy(SAOEStrategy):

        return TradeDecisionWO([order], self)

-    def reset(self, outer_trade_decision: BaseTradeDecision = None, **kwargs: Any) -> None:
+    def reset(self, outer_trade_decision: BaseTradeDecision | None = None, **kwargs: Any) -> None:
        super().reset(outer_trade_decision=outer_trade_decision, **kwargs)

        assert isinstance(outer_trade_decision, TradeDecisionWO)
@@ -437,9 +451,9 @@ class SAOEIntStrategy(SAOEStrategy):
        state_interpreter: dict | StateInterpreter,
        action_interpreter: dict | ActionInterpreter,
        network: dict | torch.nn.Module | None = None,
-        outer_trade_decision: BaseTradeDecision = None,
-        level_infra: LevelInfrastructure = None,
-        common_infra: CommonInfrastructure = None,
+        outer_trade_decision: BaseTradeDecision | None = None,
+        level_infra: LevelInfrastructure | None = None,
+        common_infra: CommonInfrastructure | None = None,
        **kwargs: Any,
    ) -> None:
        super(SAOEIntStrategy, self).__init__(
@@ -488,7 +502,7 @@ class SAOEIntStrategy(SAOEStrategy):
        if self._policy is not None:
            self._policy.eval()

-    def reset(self, outer_trade_decision: BaseTradeDecision = None, **kwargs: Any) -> None:
+    def reset(self, outer_trade_decision: BaseTradeDecision | None = None, **kwargs: Any) -> None:
        super().reset(outer_trade_decision=outer_trade_decision, **kwargs)

    def _generate_trade_details(self, act: np.ndarray, exec_vols: List[float]) -> pd.DataFrame:
@@ -508,7 +522,7 @@ class SAOEIntStrategy(SAOEStrategy):
                trade_details[-1]["rl_action"] = a
        return pd.DataFrame.from_records(trade_details)

-    def _generate_trade_decision(self, execute_result: list = None) -> BaseTradeDecision:
+    def _generate_trade_decision(self, execute_result: list | None = None) -> BaseTradeDecision:
        states = []
        obs_batch = []
        for decision in self.outer_trade_decision.get_decision():
--- a/qlib/rl/order_execution/utils.py
+++ b/qlib/rl/order_execution/utils.py
@@ -10,18 +10,7 @@ import pandas as pd

 from qlib.backtest.decision import OrderDir
 from qlib.backtest.executor import BaseExecutor, NestedExecutor, SimulatorExecutor
-from qlib.constant import EPS_T, float_or_ndarray
-
-
-def get_ticks_slice(
-    ticks_index: pd.DatetimeIndex,
-    start: pd.Timestamp,
-    end: pd.Timestamp,
-    include_end: bool = False,
-) -> pd.DatetimeIndex:
-    if not include_end:
-        end = end - EPS_T
-    return ticks_index[ticks_index.slice_indexer(start, end)]
+from qlib.constant import float_or_ndarray


 def dataframe_append(df: pd.DataFrame, other: Any) -> pd.DataFrame:
--- a/qlib/rl/strategy/single_order.py
+++ b/qlib/rl/strategy/single_order.py
@@ -1,6 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.

+from __future__ import annotations
+
 from qlib.backtest import Order
 from qlib.backtest.decision import OrderHelper, TradeDecisionWO, TradeRange
 from qlib.strategy.base import BaseStrategy
@@ -12,14 +14,14 @@ class SingleOrderStrategy(BaseStrategy):
    def __init__(
        self,
        order: Order,
-        trade_range: TradeRange = None,
+        trade_range: TradeRange | None = None,
    ) -> None:
        super().__init__()

        self._order = order
        self._trade_range = trade_range

-    def generate_trade_decision(self, execute_result: list = None) -> TradeDecisionWO:
+    def generate_trade_decision(self, execute_result: list | None = None) -> TradeDecisionWO:
        oh: OrderHelper = self.common_infra.get("trade_exchange").get_order_helper()
        order_list = [
            oh.create(
--- a/qlib/rl/utils/data_queue.py
+++ b/qlib/rl/utils/data_queue.py
@@ -4,6 +4,7 @@
 from __future__ import annotations

 import multiprocessing
+from multiprocessing.sharedctypes import Synchronized
 import os
 import threading
 import time
@@ -78,7 +79,9 @@ class DataQueue(Generic[T]):

        self._activated: bool = False
        self._queue: multiprocessing.Queue = multiprocessing.Queue(maxsize=queue_maxsize)
-        self._done = multiprocessing.Value("i", 0)
+        # Mypy 0.981 brought '"SynchronizedBase[Any]" has no attribute "value"  [attr-defined]' bug.
+        # Therefore, add this type casting to pass Mypy checking.
+        self._done = cast(Synchronized, multiprocessing.Value("i", 0))

    def __enter__(self) -> DataQueue:
        self.activate()
@@ -122,7 +125,7 @@ class DataQueue(Generic[T]):
                if self._done.value:
                    raise StopIteration  # pylint: disable=raise-missing-from

-    def put(self, obj: Any, block: bool = True, timeout: int = None) -> None:
+    def put(self, obj: Any, block: bool = True, timeout: int | None = None) -> None:
        self._queue.put(obj, block=block, timeout=timeout)

    def mark_as_done(self) -> None:
--- a/qlib/rl/utils/env_wrapper.py
+++ b/qlib/rl/utils/env_wrapper.py
@@ -99,9 +99,9 @@ class EnvWrapper(
        state_interpreter: StateInterpreter[StateType, ObsType],
        action_interpreter: ActionInterpreter[StateType, PolicyActType, ActType],
        seed_iterator: Optional[Iterable[InitialStateType]],
-        reward_fn: Reward = None,
-        aux_info_collector: AuxiliaryInfoCollector[StateType, Any] = None,
-        logger: LogCollector = None,
+        reward_fn: Reward | None = None,
+        aux_info_collector: AuxiliaryInfoCollector[StateType, Any] | None = None,
+        logger: LogCollector | None = None,
    ) -> None:
        # Assign weak reference to wrapper.
        #
--- a/qlib/rl/utils/log.py
+++ b/qlib/rl/utils/log.py
@@ -397,7 +397,7 @@ class ConsoleWriter(LogWriter):
    def __init__(
        self,
        log_every_n_episode: int = 20,
-        total_episodes: int = None,
+        total_episodes: int | None = None,
        float_format: str = ":.4f",
        counter_format: str = ":4d",
        loglevel: int | LogLevel = LogLevel.PERIODIC,
--- a/qlib/tests/data.py
+++ b/qlib/tests/data.py
@@ -1,6 +1,7 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.

+import os
 import re
 import sys
 import qlib
@@ -11,13 +12,15 @@ import datetime
 from tqdm import tqdm
 from pathlib import Path
 from loguru import logger
+from cryptography.fernet import Fernet
 from qlib.utils import exists_qlib_data


 class GetData:
-    DATASET_VERSION = "v2"
    REMOTE_URL = "https://qlibpublic.blob.core.windows.net/data/default/stock_data"
-    QLIB_DATA_NAME = "{dataset_name}_{region}_{interval}_{qlib_version}.zip"
+    # "?" is not included in the token.
+    TOKEN = "gAAAAABkmDhojHc0VSCDdNK1MqmRzNLeDFXe5hy8obHpa6SDQh4de6nW5gtzuD-fa6O_WZb0yyqYOL7ndOfJX_751W3xN5YB4-n-P22jK-t6ucoZqhT70KPD0Lf0_P328QPJVZ1gDnjIdjhi2YLOcP4BFTHLNYO0mvzszR8TKm9iT5AKRvuysWnpi8bbYwGU9zAcJK3x9EPL43hOGtxliFHcPNGMBoJW4g_ercdhi0-Qgv5_JLsV-29_MV-_AhuaYvJuN2dEywBy"
+    KEY = "EYcA8cgorA8X9OhyMwVfuFxn_1W3jGk6jCbs3L2oPoA="

    def __init__(self, delete_zip_file=False):
        """
@@ -29,24 +32,44 @@ class GetData:
        """
        self.delete_zip_file = delete_zip_file

-    def normalize_dataset_version(self, dataset_version: str = None):
-        if dataset_version is None:
-            dataset_version = self.DATASET_VERSION
-        return dataset_version
+    def merge_remote_url(self, file_name: str):
+        fernet = Fernet(self.KEY)
+        token = fernet.decrypt(self.TOKEN).decode()
+        return f"{self.REMOTE_URL}/{file_name}?{token}"

-    def merge_remote_url(self, file_name: str, dataset_version: str = None):
-        return f"{self.REMOTE_URL}/{self.normalize_dataset_version(dataset_version)}/{file_name}"
+    def download_data(self, file_name: str, target_dir: [Path, str], delete_old: bool = True):
+        """
+        Download the specified file to the target folder.

-    def _download_data(
-        self, file_name: str, target_dir: [Path, str], delete_old: bool = True, dataset_version: str = None
-    ):
+        Parameters
+        ----------
+        target_dir: str
+            data save directory
+        file_name: str
+            dataset name, needs to end with .zip, value from [rl_data.zip, csv_data_cn.zip, ...]
+            may contain folder names, for example: v2/qlib_data_simple_cn_1d_latest.zip
+        delete_old: bool
+            delete an existing directory, by default True
+
+        Examples
+        ---------
+        # get rl data
+        python get_data.py download_data --file_name rl_data.zip --target_dir ~/.qlib/qlib_data/rl_data
+        When this command is run, the data will be downloaded from this link: https://qlibpublic.blob.core.windows.net/data/default/stock_data/rl_data.zip?{token}
+
+        # get cn csv data
+        python get_data.py download_data --file_name csv_data_cn.zip --target_dir ~/.qlib/csv_data/cn_data
+        When this command is run, the data will be downloaded from this link: https://qlibpublic.blob.core.windows.net/data/default/stock_data/csv_data_cn.zip?{token}
+        -------
+
+        """
        target_dir = Path(target_dir).expanduser()
        target_dir.mkdir(exist_ok=True, parents=True)
        # saved file name
-        _target_file_name = datetime.datetime.now().strftime("%Y%m%d%H%M%S") + "_" + file_name
+        _target_file_name = datetime.datetime.now().strftime("%Y%m%d%H%M%S") + "_" + os.path.basename(file_name)
        target_path = target_dir.joinpath(_target_file_name)

-        url = self.merge_remote_url(file_name, dataset_version)
+        url = self.merge_remote_url(file_name)
        resp = requests.get(url, stream=True, timeout=60)
        resp.raise_for_status()
        if resp.status_code != 200:
@@ -56,7 +79,7 @@ class GetData:
        logger.warning(
            f"The data for the example is collected from Yahoo Finance. Please be aware that the quality of the data might not be perfect. (You can refer to the original data source: https://finance.yahoo.com/lookup.)"
        )
-        logger.info(f"{file_name} downloading......")
+        logger.info(f"{os.path.basename(file_name)} downloading......")
        with tqdm(total=int(resp.headers.get("Content-Length", 0))) as p_bar:
            with target_path.open("wb") as fp:
                for chunk in resp.iter_content(chunk_size=chunk_size):
@@ -67,8 +90,8 @@ class GetData:
        if self.delete_zip_file:
            target_path.unlink()

-    def check_dataset(self, file_name: str, dataset_version: str = None):
-        url = self.merge_remote_url(file_name, dataset_version)
+    def check_dataset(self, file_name: str):
+        url = self.merge_remote_url(file_name)
        resp = requests.get(url, stream=True, timeout=60)
        status = True
        if resp.status_code == 404:
@@ -140,9 +163,11 @@ class GetData:
        ---------
        # get 1d data
        python get_data.py qlib_data --name qlib_data --target_dir ~/.qlib/qlib_data/cn_data --interval 1d --region cn
+        When this command is run, the data will be downloaded from this link: https://qlibpublic.blob.core.windows.net/data/default/stock_data/v2/qlib_data_cn_1d_latest.zip?{token}

        # get 1min data
        python get_data.py qlib_data --name qlib_data --target_dir ~/.qlib/qlib_data/cn_data_1min --interval 1min --region cn
+        When this command is run, the data will be downloaded from this link: https://qlibpublic.blob.core.windows.net/data/default/stock_data/v2/qlib_data_cn_1min_latest.zip?{token}
        -------

        """
@@ -155,29 +180,12 @@ class GetData:

        qlib_version = ".".join(re.findall(r"(\d+)\.+", qlib.__version__))

-        def _get_file_name(v):
-            return self.QLIB_DATA_NAME.format(
-                dataset_name=name, region=region.lower(), interval=interval.lower(), qlib_version=v
-            )
+        def _get_file_name_with_version(qlib_version, dataset_version):
+            dataset_version = "v2" if dataset_version is None else dataset_version
+            file_name_with_version = f"{dataset_version}/{name}_{region.lower()}_{interval.lower()}_{qlib_version}.zip"
+            return file_name_with_version

-        file_name = _get_file_name(qlib_version)
-        if not self.check_dataset(file_name, version):
-            file_name = _get_file_name("latest")
-        self._download_data(file_name.lower(), target_dir, delete_old, dataset_version=version)
-
-    def csv_data_cn(self, target_dir="~/.qlib/csv_data/cn_data"):
-        """download cn csv data from remote
-
-        Parameters
-        ----------
-        target_dir: str
-            data save directory
-
-        Examples
-        ---------
-        python get_data.py csv_data_cn --target_dir ~/.qlib/csv_data/cn_data
-        -------
-
-        """
-        file_name = "csv_data_cn.zip"
-        self._download_data(file_name, target_dir)
+        file_name = _get_file_name_with_version(qlib_version, dataset_version=version)
+        if not self.check_dataset(file_name):
+            file_name = _get_file_name_with_version("latest", dataset_version=version)
+        self.download_data(file_name.lower(), target_dir, delete_old)
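The naming scheme introduced in this file can be illustrated with a minimal sketch. It assumes the three pieces behave as the diff shows: a version folder (default "v2") is prefixed to the file name, the remote URL is the base URL plus file name plus "?" plus token, and the saved zip drops the folder part via os.path.basename. The standalone function names, the `timestamp` argument, and the `PLACEHOLDER_TOKEN` value are hypothetical and only for illustration; in the diff the real token comes from decrypting `TOKEN` with Fernet.

```python
import os

REMOTE_URL = "https://qlibpublic.blob.core.windows.net/data/default/stock_data"


def file_name_with_version(name, region, interval, qlib_version, dataset_version=None):
    # Mirrors _get_file_name_with_version: the version folder defaults to "v2".
    dataset_version = "v2" if dataset_version is None else dataset_version
    return f"{dataset_version}/{name}_{region.lower()}_{interval.lower()}_{qlib_version}.zip"


def merge_remote_url(file_name, token):
    # Mirrors GetData.merge_remote_url, with the token passed in explicitly
    # (hypothetical signature) instead of being decrypted.
    return f"{REMOTE_URL}/{file_name}?{token}"


def local_zip_name(file_name, timestamp):
    # The saved file keeps only the last path component, so "v2/" is dropped.
    return f"{timestamp}_{os.path.basename(file_name)}"


file_name = file_name_with_version("qlib_data", "CN", "1D", "latest")
print(file_name)
print(merge_remote_url(file_name, "PLACEHOLDER_TOKEN"))
print(local_zip_name(file_name, "20230705135938"))
```

Because the version folder is now part of `file_name`, callers such as `download_data` can also be given names without any folder (for example `rl_data.zip`) and the same URL construction still applies.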
--- a/qlib/utils/init.py
+++ b/qlib/utils/init.py
@@ -1,6 +1,7 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.

+# TODO: this utils module covers too many utilities; please separate it into submodules

 from __future__ import division
 from __future__ import print_function
@@ -43,7 +44,7 @@ is_deprecated_lexsorted_pandas = version.parse(pd.__version__) > version.parse("
 #################### Server ####################
 def get_redis_connection():
    """get redis connection instance."""
-    return redis.StrictRedis(host=C.redis_host, port=C.redis_port, db=C.redis_task_db)
+    return redis.StrictRedis(host=C.redis_host, port=C.redis_port, db=C.redis_task_db, password=C.redis_password)


 #################### Data ####################
@@ -224,7 +225,7 @@ def requests_with_retry(url, retry=5, **kwargs):
        except Exception as e:
            log.warning("exception encountered {}".format(e))
            continue
-    raise Exception("ERROR: requests failed!")
+    raise TimeoutError("ERROR: requests failed!")


 #################### Parse ####################
@@ -426,7 +427,8 @@ def init_instance_by_config(
            # path like 'file:///<path to pickle file>/obj.pkl'
            pr = urlparse(config)
            if pr.scheme == "file":
-                with open(os.path.join(pr.netloc, pr.path), "rb") as f:
+                pr_path = os.path.join(pr.netloc, pr.path) if bool(pr.path) else pr.netloc
+                with open(os.path.normpath(pr_path), "rb") as f:
                    return pickle.load(f)
        else:
            with config.open("rb") as f:
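The `file://` handling changed in the last hunk can be sketched in isolation. This is a minimal illustration, assuming only the standard library behavior of `urlparse` and `os.path`; the helper name `resolve_file_url` is hypothetical and is not part of qlib. It shows why the empty-path case matters: for a URL such as `file://obj.pkl`, the file name lands in `netloc` and `path` is empty, so joining the two would append a trailing separator.

```python
import os
from urllib.parse import urlparse


def resolve_file_url(url: str) -> str:
    """Turn a file:// URL into a local path, following the logic in the hunk above."""
    pr = urlparse(url)
    if pr.scheme != "file":
        raise ValueError(f"not a file URL: {url}")
    # When pr.path is empty, os.path.join(netloc, "") would add a trailing
    # separator, so the netloc is used on its own in that case.
    pr_path = os.path.join(pr.netloc, pr.path) if pr.path else pr.netloc
    # normpath collapses duplicate separators and normalizes them per platform.
    return os.path.normpath(pr_path)


print(resolve_file_url("file:///tmp/obj.pkl"))  # netloc is empty, path is absolute
print(resolve_file_url("file://obj.pkl"))       # netloc holds the name, path is empty
print(repr(os.path.join("obj.pkl", "")))        # the trailing separator the fix avoids
```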
--- a/qlib/utils/data.py
+++ b/qlib/utils/data.py
@@ -1,6 +1,10 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.
-from typing import Union
+"""
+This module covers some utility functions that operate on data or basic objects
+"""
+from copy import deepcopy
+from typing import List, Union
 import pandas as pd
 import numpy as np

@@ -54,3 +58,48 @@ def deepcopy_basic_type(obj: object) -> object:
        return {k: deepcopy_basic_type(v) for k, v in obj.items()}
    else:
        return obj
+
+
+S_DROP = "__DROP__"  # sentinel value indicating that the key should be dropped
+
+
+def update_config(base_config: dict, ext_config: Union[dict, List[dict]]):
+    """
+    Return a copy of base_config updated with ext_config: keys can be added, replaced, or dropped (via S_DROP)
+
+    >>> bc = {"a": "xixi"}
+    >>> ec = {"b": "haha"}
+    >>> new_bc = update_config(bc, ec)
+    >>> print(new_bc)
+    {'a': 'xixi', 'b': 'haha'}
+    >>> print(bc)  # base config should not be changed
+    {'a': 'xixi'}
+    >>> print(update_config(bc, {"b": S_DROP}))
+    {'a': 'xixi'}
+    >>> print(update_config(new_bc, {"b": S_DROP}))
+    {'a': 'xixi'}
+    """
+
+    base_config = deepcopy(base_config)  # in case of modifying base config
+
+    for ec in ext_config if isinstance(ext_config, (list, tuple)) else [ext_config]:
+        for key in ec:
+            if key not in base_config:
+                # the key is not in the base config:
+                # ADD it unless it is marked as drop
+                if ec[key] != S_DROP:
+                    base_config[key] = ec[key]
+
+            else:
+                if isinstance(base_config[key], dict) and isinstance(ec[key], dict):
+                    # Recursive
+                    # Both of them are dict, then update it nested
+                    base_config[key] = update_config(base_config[key], ec[key])
+                elif ec[key] == S_DROP:
+                    # DROP
+                    del base_config[key]
+                else:
+                    # REPLACE
+                    # at least one of them is not a dict, so replace
+                    base_config[key] = ec[key]
+    return base_config
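The doctest above only covers flat dictionaries. The sketch below exercises the nested and list-of-configs paths as well. To stay self-contained it repeats the function from the diff in slightly condensed form (same branches, `elif` chain flattened); the example configuration keys (`model`, `kwargs`, `seed`, `tag`) are made up for illustration.

```python
from copy import deepcopy

S_DROP = "__DROP__"  # sentinel value, as defined in the diff


def update_config(base_config, ext_config):
    # Condensed copy of the function added in this diff.
    base_config = deepcopy(base_config)
    for ec in ext_config if isinstance(ext_config, (list, tuple)) else [ext_config]:
        for key in ec:
            if key not in base_config:
                if ec[key] != S_DROP:  # ADD unless marked as drop
                    base_config[key] = ec[key]
            elif isinstance(base_config[key], dict) and isinstance(ec[key], dict):
                base_config[key] = update_config(base_config[key], ec[key])  # recurse
            elif ec[key] == S_DROP:
                del base_config[key]  # DROP
            else:
                base_config[key] = ec[key]  # REPLACE
    return base_config


base = {"model": {"class": "LGBModel", "kwargs": {"lr": 0.1, "depth": 8}}, "seed": 1}
ext = [
    {"model": {"kwargs": {"lr": 0.05, "depth": S_DROP}}},  # nested replace and drop
    {"seed": S_DROP, "tag": "exp1"},                       # top-level drop and add
]
merged = update_config(base, ext)
print(merged)
print(base)  # the base config is left unchanged
```

Extension configs in a list are applied in order, so a later entry can override or drop what an earlier one set.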
Author	SHA1	Message	Date
Linlang	99a60ac836	modify get_data method for CI	2023-07-05 13:59:38 +08:00
Linlang	fd1e24c308	fix pip install CI	2023-06-26 09:19:40 +08:00
you-n-g	27f476b311	Update __init__.py	2023-06-26 00:00:46 +08:00
you-n-g	0e61cac6a8	Update release-drafter.yml (#1569 ) * Update release-drafter.yml * Update release-drafter.yml	2023-06-25 23:48:37 +08:00
Linlang	21f0b394e7	change get_data url (#1558 ) * change_url * fix_CI * fix_CI_2 * fix_CI_3 * fix_CI_4 * fix_CI_5 * fix_CI_6 * fix_CI_7 * fix_CI_8 * fix_CI_9 * fix_CI_10 * fix_CI_11 * fix_CI_12 * fix_CI_13 * fix_CI_13 * fix_CI_14 * fix_CI_15 * fix_CI_16 * fix_CI_17 * fix_CI_18 * fix_CI_19 * fix_CI_20 * fix_CI_21 * fix_CI_22 * fix_CI_23 * fix_CI_24 * fix_CI_25 * fix_CI_26 * fix_CI_27 * fix_get_data_error * fix_get_data_error2 * modify_get_data * modify_get_data2 * modify_get_data3 * modify_get_data4 * fix_CI_28 * fix_CI_29 * fix_CI_30 --------- Co-authored-by: Linlang <v-linlanglv@microsoft.com>	2023-06-25 23:39:11 +08:00
Wendi Li	cd4ab998fb	Update on Dynamic Benchmark (#1539 ) * move config file to benchmark_dynamic & switch default sim task model to GBDT * Update benchmark_dynamic results * Change the default value of alpha of DDG-DA	2023-06-03 08:42:24 +08:00
you-n-g	0e9ac9dce7	Fix CI (#1529 )	2023-05-31 08:39:52 +08:00
yaxuan999	efffb2819a	added KRNN and Sandwich models and their example results based on Alpha360 (#1414 ) * Update README.md updated the result of KRNN and Sandwich models based on Alpha360 * Update README.md * Update README.md * Add files via upload * Update README.md * Update README.md * Update README.md * Add files via upload * Delete pytorch_krnn.py * Delete pytorch_sandwich.py * Add files via upload * Update pytorch_sandwich.py * Update pytorch_krnn.py * Update pytorch_sandwich.py * Update pytorch_krnn.py * Update README.md * Update README.md * Update requirements.txt * Update requirements.txt * Update README.md * Update README.md * Update pytorch_sandwich.py * Update link on index --------- Co-authored-by: Young <afe.young@gmail.com>	2023-05-26 18:42:58 +08:00
Fivele-Li	19a0eb78bc	Fix TCN model input dimension mismatch (#1520 ) * transpose dimension 1 and 2 to match nn.Conv1d input * 1.update TCN benchmarks; 2.Emphasize updating the benchmark table; * replace specific version with main --------- Co-authored-by: lijinhui <362237642@qq.com>	2023-05-26 14:44:34 +08:00
Fivele-Li	370477288d	fix_DDG-DA_workflow_bug (#1516 ) * 1.specify group_keys=False to avoid FutureWarning; 2.fix get train_start from dict unexpected problem; * fix black * Add comments * Add make file --------- Co-authored-by: Young <afe.young@gmail.com>	2023-05-24 15:49:58 +08:00
you-n-g	94268619c4	Update README.md	2023-05-23 09:50:00 +08:00
Huoran Li	8d60a6a02b	Resolve RL FIXMES (#1503 ) * Solve several small FIXMEs left in RL * Add TODO in example * Minor bugfix * black	2023-05-17 16:57:08 +08:00
Fivele-Li	7234308651	Add base config in yml (#1500 ) * path on Windows contains double '/' which may cause open file failed. * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * add baseConfig in yml,user can add new keys or update/drop keys in baseConfig; * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * pip release version 23.1 on Apr.15 2023, CI failed to run, Please refer to #1495 ofr detailed logs. The pip version has been temporarily fixed to 23.0.1. * 1.Search for baseConfig in multiple directories; 2.Add user instructions in qrun; * fix format with black * 1.modify baseConfig key to BASE_CONFIG_PATH; 2.only find config file in absolute path and relative path; * load BASE_CONFIG_PATH on absolute path & relative path; * fix Lint with black --------- Co-authored-by: lijinhui <362237642@qq.com>	2023-05-12 17:35:37 +08:00
Chaoying	acf5df27ce	Add support for redis password (#1508 )	2023-05-08 16:17:15 +08:00
Chaoying	37a59f28d3	Fix deprecated syntax in numpy (#1507 ) * Fix deprecated syntax in numpy * Replace np.bool with bool	2023-05-08 16:17:02 +08:00
YQ Tsui	b084c352f5	provide dtype to empty series to surpress warning; fix type (#1449 )	2023-05-05 17:47:44 +08:00
Maksim Zayakin	9e22e5168b	Remove unused `DNNModelPytorch` params (#1470 ) * Remove lr_decay and lr_decay_steps params More flexible way to pass a scheduler (via callable function) is already supported * remove lr_decay and lr_decay_steps from mlp workflow configs	2023-04-28 17:48:40 +08:00
Fivele-Li	dceff7b471	Specify the tianshou version to match the dev environment to avoid the error in issue #1477 . (#1502 )	2023-04-28 13:50:25 +08:00
Huoran Li	7f1e8c5206	Refine Qlib RL data format (#1480 ) * wip * wip * wip * Fix naming errors * Backtest test passed * Why training stuck? * Minor * Refine train configs * Use dummy in training * Remove pickle_dataframe * CI * CI * Add more strict condition to filter orders * Pass test * Add TODO in example --------- Co-authored-by: Young <afe.young@gmail.com>	2023-04-26 21:14:30 +08:00
Fivele-Li	46264dfec9	normpath for Windows (#1495 ) * path on Windows contains double '/' which may cause open file failed. * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * locate import numpy error * pip release version 23.1 on Apr.15 2023, CI failed to run, Please refer to #1495 ofr detailed logs. The pip version has been temporarily fixed to 23.0.1. --------- Co-authored-by: lijinhui <362237642@qq.com>	2023-04-26 16:26:12 +08:00
Fivele-Li	754799ab05	update ubuntu CI version; (#1488 ) * update ubuntu CI version; (End of standard support for 18.04 LTS - 31 May 2023) * update ubuntu CI version; --------- Co-authored-by: lijinhui <362237642@qq.com>	2023-04-10 17:06:48 +08:00
you-n-g	32c3070b73	Refine DDG-DA (#1472 ) * Run ddg-da successfully * Support include valid; More parameters * Support L2 reg & visualization * Blackformat * Enable fill_method * Support specify handler & optim dataset * Fix Pylint	2023-04-07 15:00:21 +08:00
you-n-g	40de67265a	Update Docs about some concepts in DataHandler (#1485 )	2023-04-07 10:02:16 +08:00
saurabh dave	e6f9a94fc5	fix: removed extra blank link between sections (#1451 )	2023-04-03 17:32:01 +08:00
Fivele-Li	73937863f1	Merge pull request #1475 from qianyun210603/bugfix [BUGFIX] potential file// url parsing error	2023-03-24 11:22:57 +08:00
BookSword	d010219ba6	Merge branch 'main' into bugfix	2023-03-23 16:11:19 +08:00
BookSword	4fc8a5f25f	merge	2023-03-23 16:05:09 +08:00
Linlang	0e8bfcb5d3	fix_pylint_w0719 (#1463 ) * fix_pylint_w0719 * remove_fixme	2023-03-17 19:25:49 +08:00
you-n-g	e457ca8511	Improve annotation & documentation for handler (#1312 ) * Improve annotation & documentation for handler * Add type	2023-03-15 21:15:40 +08:00
Huoran Li	4dbb8ecb86	Remove (#1464 )	2023-03-15 15:26:44 +08:00
Huoran Li	653c082e7a	Order execution open source (#1447 ) * Waiting for bin data * Complete readme * CI * Add inst filter by time * Update qlib/data/dataset/processor.py * typo * Fix time filter bug * Add Filter and set Universe * Complete data pipeline * Fix Provider Logger Info Args * Add DQN; a minor bugfix in ppo reward. * update readme. modify assertion logic in strategy check. * Fix Doc issues and fix black * Fix pylint Error --------- Co-authored-by: Young <afe.young@gmail.com> Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>	2023-03-13 12:06:28 +08:00
you-n-g	f98e04ca9d	Fix Field Name Error	2023-03-03 16:28:47 +08:00
Cadenza-Li	76f2fb1a1a	Add ipynb format check (#1439 ) * Update test_qlib_from_source.yml * add ipynb format check to workflow * test ipynb CI * modify nbqa check path * add pylint flake8 mypy check to ipynb * check ipynb with black and pylint * reformat .ipynb files * format line length nbqa black . -l 120 * update nbqa .ipynb format CI * format old ipynb files * add nbconvert check to CI * adjust CI order to avoid repeating download data	2023-02-21 09:23:22 +08:00
Huoran Li	5eb5ac1f1f	RL backtest pipeline on 5-min data (#1417 ) * Workflow runnable * CI * Slight changes to make the workflow runnable. The changes of handler/provider should be reverted before merging. * Train experiment successful * Refine handler & provider * test passed * Ready to test on server * Minor * Test passed * TWAP training * Add PPOReward * Add a FIXME * Refine PPO reward according to PR comments * Minor * Resolve PR comments * CI issues * CI issues * CI issues	2023-02-13 12:43:22 +08:00
Young	6295939346	Update to Dev Version	2023-01-29 18:55:23 +08:00