diff --git a/README.md b/README.md
index 4f9b68cec..21bc0219e 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,7 @@ Recent released features
 | High-frequency trading example | [Part of code released](https://github.com/microsoft/qlib/pull/227) on Jan 28, 2021  | 
 | High-frequency data(1min) | [Released](https://github.com/microsoft/qlib/pull/221) on Jan 27, 2021 |
 | Tabnet Model | [Released](https://github.com/microsoft/qlib/pull/205) on Jan 22, 2021 | 
+| TCTS Model | [Released](https://github.com/microsoft/qlib/pull/491) on Jul 1, 2021 |
 
 Features released before 2021 are not listed here.
 
@@ -288,6 +289,7 @@ Here is a list of models built on `Qlib`.
 - [TFT based on tensorflow (Bryan Lim, et al. 2019)](examples/benchmarks/TFT/tft.py)
 - [TabNet based on pytorch (Sercan O. Arik, et al. 2019)](qlib/contrib/model/pytorch_tabnet.py)
 - [DoubleEnsemble based on LightGBM (Chuheng Zhang, et al. 2020)](qlib/contrib/model/double_ensemble.py)
+- [TCTS based on pytorch (Xueqing Wu, et al. 2021)](qlib/contrib/model/pytorch_tcts.py)
 
 Your PR of new Quant models is highly welcomed.
 
diff --git a/examples/benchmarks/GATs/workflow_config_gats_Alpha158.yaml b/examples/benchmarks/GATs/workflow_config_gats_Alpha158.yaml
index 71454e7f9..5fb7d5cc1 100644
--- a/examples/benchmarks/GATs/workflow_config_gats_Alpha158.yaml
+++ b/examples/benchmarks/GATs/workflow_config_gats_Alpha158.yaml
@@ -61,7 +61,6 @@ task:
             metric: loss
             loss: mse
             base_model: LSTM
-            with_pretrain: True
             model_path: "benchmarks/LSTM/csi300_lstm_ts.pkl"
             GPU: 0
     dataset:
diff --git a/examples/benchmarks/GATs/workflow_config_gats_Alpha360.yaml b/examples/benchmarks/GATs/workflow_config_gats_Alpha360.yaml
index d778c9b1b..86ce51018 100644
--- a/examples/benchmarks/GATs/workflow_config_gats_Alpha360.yaml
+++ b/examples/benchmarks/GATs/workflow_config_gats_Alpha360.yaml
@@ -54,7 +54,6 @@ task:
             metric: loss
             loss: mse
             base_model: LSTM
-            with_pretrain: True
             model_path: "benchmarks/LSTM/model_lstm_csi300.pkl"
             GPU: 0
     dataset:
@@ -81,4 +80,4 @@ task:
         - class: PortAnaRecord
           module_path: qlib.workflow.record_temp
           kwargs: 
-            config: *port_analysis_config
\ No newline at end of file
+            config: *port_analysis_config
diff --git a/examples/benchmarks/README.md b/examples/benchmarks/README.md
index 133380fe0..1920a6a3b 100644
--- a/examples/benchmarks/README.md
+++ b/examples/benchmarks/README.md
@@ -22,6 +22,7 @@ The numbers shown below demonstrate the performance of the entire `workflow` of
 | GATs (Petar Velickovic, et al.) | Alpha360 | 0.0475±0.00 | 0.3515±0.02| 0.0592±0.00 | 0.4585±0.01 | 0.0876±0.02 | 1.1513±0.27| -0.0795±0.02 |
 | DoubleEnsemble (Chuheng Zhang, et al.) | Alpha360 | 0.0407±0.00| 0.3053±0.00 | 0.0490±0.00 | 0.3840±0.00 | 0.0380±0.02 | 0.5000±0.21 | -0.0984±0.02 |
 | TabNet (Sercan O. Arik, et al.)| Alpha360 | 0.0192±0.00 | 0.1401±0.00| 0.0291±0.00 | 0.2163±0.00 | -0.0258±0.00 | -0.2961±0.00| -0.1429±0.00 |
+| TCTS (Xueqing Wu, et al.)| Alpha360 | 0.0485±0.00 | 0.3689±0.04| 0.0586±0.00 | 0.4669±0.02 | 0.0816±0.02 | 1.1572±0.30| -0.0689±0.02 |
 
 ## Alpha158 dataset
 | Model Name | Dataset | IC | ICIR | Rank IC | Rank ICIR | Annualized Return | Information Ratio | Max Drawdown |
diff --git a/examples/benchmarks/TCTS/TCTS.md b/examples/benchmarks/TCTS/TCTS.md
deleted file mode 100644
index ee67ffbeb..000000000
--- a/examples/benchmarks/TCTS/TCTS.md
+++ /dev/null
@@ -1,52 +0,0 @@
-# Temporally Correlated Task Scheduling for Sequence Learning
-We provide the [code](https://github.com/microsoft/qlib/blob/main/qlib/contrib/model/pytorch_tcts.py) for reproducing the stock trend forecasting experiments.
-
-### Background
-Sequence learning has attracted much research attention from the machine learning community in recent years. In many applications, a sequence learning task is usually associated with multiple temporally correlated auxiliary tasks, which are different in terms of how much input information to use or which future step to predict. In stock trend forecasting, as demonstrated in Figure1, one can predict the price of a stock in different future days (e.g., tomorrow, the day after tomorrow). In this paper, we propose a framework to make use of those temporally correlated tasks to help each other. 
-
-<p align="center"> 
-<img src="task_description.png" width="600" height="200"/>
-</p>
-
-
-### Method
-Given that there are usually multiple temporally correlated tasks, the key challenge lies in which tasks to use and when to use them in the training process. In this work, we introduce a learnable task scheduler for sequence learning, which adaptively selects temporally correlated tasks during the training process. The scheduler accesses the model status and the current training data (e.g., in current minibatch), and selects the best auxiliary task to help the training of the main task. The scheduler and the model for the main task are jointly trained through bi-level optimization: the scheduler is trained to maximize the validation performance of the model, and the model is trained to minimize the training loss guided by the scheduler. The process is demonstrated in Figure2.
-
-<p align="center"> 
-<img src="workflow.png"/>
-</p>
-
-At step <img src="https://render.githubusercontent.com/render/math?math=s">, with training data <img src="https://render.githubusercontent.com/render/math?math=x_s,y_s">, the scheduler <img src="https://render.githubusercontent.com/render/math?math=\varphi"> chooses a suitable task <img src="https://render.githubusercontent.com/render/math?math=T_{i_s}"> (green solid lines) to update the model <img src="https://render.githubusercontent.com/render/math?math=f"> (blue solid lines). After <img src="https://render.githubusercontent.com/render/math?math=S"> steps, we evaluate the model <img src="https://render.githubusercontent.com/render/math?math=f"> on the validation set and update the scheduler <img src="https://render.githubusercontent.com/render/math?math=\varphi"> (green dashed lines).
-
-### DataSet
-* We use the historical transaction data for 300 stocks on [CSI300](http://www.csindex.com.cn/en/indices/index-detail/000300) from 01/01/2008 to 08/01/2020. 
-* We split the data into training (01/01/2008-12/31/2013), validation (01/01/2014-12/31/2015), and test sets (01/01/2016-08/01/2020) based on the transaction time. 
-
-### Experiments
-#### Task Description
-* The main tasks <img src="https://render.githubusercontent.com/render/math?math=T_k"> (<img src="https://render.githubusercontent.com/render/math?math=task_k"> in Figure1) refers to forecasting return of stock <img src="https://render.githubusercontent.com/render/math?math=i"> as following,
-<div align=center>
-<img src="https://render.githubusercontent.com/render/math?math=r_{i}^k = \frac{\price_i^{t+k}}{\price_i^{t+k-1}} - 1">
-</div>
-
-* Temporally correlated task sets <img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_k = \{T_1, T_2, ... , T_k\}">, in this paper, <img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_3">, <img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_5"> and <img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_10"> are used.
-#### Baselines
-* GRU/MLP/LightGBM (LGB)/Graph Attention Networks (GAT)
-* Multi-task learning (MTL): In multi-task learning, multiple tasks are jointly trained and mutually boosted. Each task is treated equally, while in our setting, we focus on the main task.
-* Curriculum transfer learning (CL): Transfer learning also leverages auxiliary tasks to boost the main task. [Curriculum transfer learning](https://arxiv.org/pdf/1804.00810.pdf) is one kind of transfer learning which schedules auxiliary tasks according to certain rules. Our problem can also be regarded as a special kind of transfer learning, where the auxiliary tasks are temporally correlated with the main task. Our learning process is dynamically controlled by a scheduler rather than some pre-defined rules. In the CL baseline, we start from the task <img src="https://render.githubusercontent.com/render/math?math=T_1" >, then <img src="https://render.githubusercontent.com/render/math?math=T_2" >, and gradually move to the last one.
-#### Result
-| Methods | <img src="https://render.githubusercontent.com/render/math?math=T_1" > | <img src="https://render.githubusercontent.com/render/math?math=T_2"> | <img src="https://render.githubusercontent.com/render/math?math=T_3"> |
-| :----: | :----: | :----: | :----: |
-| GRU | 0.049 / 1.903 | 0.018 / 1.972 | 0.014 / 1.989 |
-| MLP | 0.023 / 1.961 | 0.022 / 1.962 | 0.015 / 1.978 |
-| LGB | 0.038 / 1.883 | 0.023 / 1.952 | 0.007 / 1.987 |
-| GAT | 0.052 / 1.898 | 0.024 / 1.954 | 0.015 / 1.973 |
-| MTL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_3">)  | 0.061 / 1.862  | 0.023 / 1.942  | 0.012 / 1.956 |
-| CL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_3">)  | 0.051 / 1.880  | 0.028 / 1.941  | 0.016 / 1.962 |
-| Ours(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_3">)  | 0.071 / 1.851  | 0.030 / 1.939  | 0.017 / 1.963 |
-| MTL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_5">)  | 0.057 / 1.875  | 0.021 / 1.939  | 0.017 / 1.959 |
-| CL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_5">)  | 0.056 / 1.877  | 0.028 / 1.942  | 0.015 / 1.962 |
-| Ours(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_5">)  | 0.075 / 1.849  | 0.032 /1.939  | 0.021 / 1.955  | 
-| MTL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_{10}">)  | 0.052 / 1.882  | 0.020 / 1.947  | 0.019 / 1.952 |
-| CL(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_{10}">)  | 0.051 / 1.882  | 0.028 / 1.950  | 0.016 / 1.961 |
-| Ours(<img src="https://render.githubusercontent.com/render/math?math=\mathcal{T}_{10}">)  | 0.067 /  1.867  | 0.030 / 1.960  | 0.022 / 1.942|
\ No newline at end of file
diff --git a/examples/benchmarks/TCTS/workflow_config_tcts_Alpha360.yaml b/examples/benchmarks/TCTS/workflow_config_tcts_Alpha360.yaml
index 589f4b43e..89c66f992 100644
--- a/examples/benchmarks/TCTS/workflow_config_tcts_Alpha360.yaml
+++ b/examples/benchmarks/TCTS/workflow_config_tcts_Alpha360.yaml
@@ -22,11 +22,9 @@ data_handler_config: &data_handler_config
         - class: CSRankNorm
           kwargs:
               fields_group: label
-    label: ["Ref($close, -2) / Ref($close, -1) - 1",
-            "Ref($close, -3) / Ref($close, -1) - 1",
-            "Ref($close, -4) / Ref($close, -1) - 1",
-            "Ref($close, -5) / Ref($close, -1) - 1",
-            "Ref($close, -6) / Ref($close, -1) - 1"]
+    label: ["Ref($close, -1) / $close - 1",
+            "Ref($close, -2) / Ref($close, -1) - 1",
+            "Ref($close, -3) / Ref($close, -2) - 1"]
 port_analysis_config: &port_analysis_config
     strategy:
         class: TopkDropoutStrategy
@@ -61,11 +59,12 @@ task:
             GPU: 0
             fore_optimizer: adam
             weight_optimizer: adam
-            output_dim: 5
-            fore_lr: 5e-7
-            weight_lr: 5e-7
+            output_dim: 3
+            fore_lr: 5e-4
+            weight_lr: 5e-4
             steps: 3
-            target_label: 0
+            target_label: 1
+            lowest_valid_performance: 0.993
     dataset:
         class: DatasetH
         module_path: qlib.data.dataset
@@ -87,7 +86,8 @@ task:
           kwargs: 
             ana_long_short: False
             ann_scaler: 252
+            label_col: 1
         - class: PortAnaRecord
           module_path: qlib.workflow.record_temp
           kwargs: 
-            config: *port_analysis_config
\ No newline at end of file
+            config: *port_analysis_config
diff --git a/qlib/contrib/model/pytorch_gats.py b/qlib/contrib/model/pytorch_gats.py
index 493bf120f..7e5bb78ee 100644
--- a/qlib/contrib/model/pytorch_gats.py
+++ b/qlib/contrib/model/pytorch_gats.py
@@ -53,7 +53,6 @@ class GATs(Model):
         early_stop=20,
         loss="mse",
         base_model="GRU",
-        with_pretrain=True,
         model_path=None,
         optimizer="adam",
         GPU=0,
@@ -76,7 +75,6 @@ class GATs(Model):
         self.optimizer = optimizer.lower()
         self.loss = loss
         self.base_model = base_model
-        self.with_pretrain = with_pretrain
         self.model_path = model_path
         self.device = torch.device("cuda:%d" % (GPU) if torch.cuda.is_available() and GPU >= 0 else "cpu")
         self.seed = seed
@@ -94,7 +92,6 @@ class GATs(Model):
             "\noptimizer : {}"
             "\nloss_type : {}"
             "\nbase_model : {}"
-            "\nwith_pretrain : {}"
             "\nmodel_path : {}"
             "\ndevice : {}"
             "\nuse_GPU : {}"
@@ -110,7 +107,6 @@ class GATs(Model):
                 optimizer.lower(),
                 loss,
                 base_model,
-                with_pretrain,
                 model_path,
                 self.device,
                 self.use_gpu,
@@ -253,24 +249,22 @@ class GATs(Model):
         evals_result["valid"] = []
 
         # load pretrained base_model
-        if self.with_pretrain:
-            if self.model_path == None:
-                raise ValueError("the path of the pretrained model should be given first!")
-            self.logger.info("Loading pretrained model...")
-            if self.base_model == "LSTM":
-                pretrained_model = LSTMModel()
-                pretrained_model.load_state_dict(torch.load(self.model_path))
-            elif self.base_model == "GRU":
-                pretrained_model = GRUModel()
-                pretrained_model.load_state_dict(torch.load(self.model_path))
-            else:
-                raise ValueError("unknown base model name `%s`" % self.base_model)
+        if self.base_model == "LSTM":
+            pretrained_model = LSTMModel()
+        elif self.base_model == "GRU":
+            pretrained_model = GRUModel()
+        else:
+            raise ValueError("unknown base model name `%s`" % self.base_model)
 
-            model_dict = self.GAT_model.state_dict()
-            pretrained_dict = {k: v for k, v in pretrained_model.state_dict().items() if k in model_dict}
-            model_dict.update(pretrained_dict)
-            self.GAT_model.load_state_dict(model_dict)
-            self.logger.info("Loading pretrained model Done...")
+        if self.model_path is not None:
+            self.logger.info("Loading pretrained model...")
+            pretrained_model.load_state_dict(torch.load(self.model_path))
+
+        model_dict = self.GAT_model.state_dict()
+        pretrained_dict = {k: v for k, v in pretrained_model.state_dict().items() if k in model_dict}
+        model_dict.update(pretrained_dict)
+        self.GAT_model.load_state_dict(model_dict)
+        self.logger.info("Copied base model weights into GAT_model.")
 
         # train
         self.logger.info("training...")
diff --git a/qlib/contrib/model/pytorch_gats_ts.py b/qlib/contrib/model/pytorch_gats_ts.py
index 5f9961b0b..09123cc5c 100644
--- a/qlib/contrib/model/pytorch_gats_ts.py
+++ b/qlib/contrib/model/pytorch_gats_ts.py
@@ -29,8 +29,8 @@ class DailyBatchSampler(Sampler):
     def __init__(self, data_source):
 
         self.data_source = data_source
-        self.data = self.data_source.data.loc[self.data_source.get_index()]
-        self.daily_count = self.data.groupby(level=0).size().values  # calculate number of samples in each batch
+        # calculate number of samples in each batch
+        self.daily_count = pd.Series(index=self.data_source.get_index()).groupby("datetime").size().values
         self.daily_index = np.roll(np.cumsum(self.daily_count), 1)  # calculate begin index of each batch
         self.daily_index[0] = 0
 
@@ -72,7 +72,6 @@ class GATs(Model):
         early_stop=20,
         loss="mse",
         base_model="GRU",
-        with_pretrain=True,
         model_path=None,
         optimizer="adam",
         GPU="0",
@@ -96,7 +95,6 @@ class GATs(Model):
         self.optimizer = optimizer.lower()
         self.loss = loss
         self.base_model = base_model
-        self.with_pretrain = with_pretrain
         self.model_path = model_path
         self.device = torch.device("cuda:%d" % (GPU) if torch.cuda.is_available() and GPU >= 0 else "cpu")
         self.n_jobs = n_jobs
@@ -115,7 +113,6 @@ class GATs(Model):
             "\noptimizer : {}"
             "\nloss_type : {}"
             "\nbase_model : {}"
-            "\nwith_pretrain : {}"
             "\nmodel_path : {}"
             "\nvisible_GPU : {}"
             "\nuse_GPU : {}"
@@ -131,7 +128,6 @@ class GATs(Model):
                 optimizer.lower(),
                 loss,
                 base_model,
-                with_pretrain,
                 model_path,
                 GPU,
                 self.use_gpu,
@@ -270,28 +266,22 @@ class GATs(Model):
         evals_result["valid"] = []
 
         # load pretrained base_model
-        if self.with_pretrain:
-            if self.model_path == None:
-                raise ValueError("the path of the pretrained model should be given first!")
-            self.logger.info("Loading pretrained model...")
-            if self.base_model == "LSTM":
-                pretrained_model = LSTMModel(
-                    d_feat=self.d_feat, hidden_size=self.hidden_size, num_layers=self.num_layers
-                )
-                pretrained_model.load_state_dict(torch.load(self.model_path))
-            elif self.base_model == "GRU":
-                pretrained_model = GRUModel(
-                    d_feat=self.d_feat, hidden_size=self.hidden_size, num_layers=self.num_layers
-                )
-                pretrained_model.load_state_dict(torch.load(self.model_path))
-            else:
-                raise ValueError("unknown base model name `%s`" % self.base_model)
+        if self.base_model == "LSTM":
+            pretrained_model = LSTMModel(d_feat=self.d_feat, hidden_size=self.hidden_size, num_layers=self.num_layers)
+        elif self.base_model == "GRU":
+            pretrained_model = GRUModel(d_feat=self.d_feat, hidden_size=self.hidden_size, num_layers=self.num_layers)
+        else:
+            raise ValueError("unknown base model name `%s`" % self.base_model)
 
-            model_dict = self.GAT_model.state_dict()
-            pretrained_dict = {k: v for k, v in pretrained_model.state_dict().items() if k in model_dict}
-            model_dict.update(pretrained_dict)
-            self.GAT_model.load_state_dict(model_dict)
-            self.logger.info("Loading pretrained model Done...")
+        if self.model_path is not None:
+            self.logger.info("Loading pretrained model...")
+            pretrained_model.load_state_dict(torch.load(self.model_path))
+
+        model_dict = self.GAT_model.state_dict()
+        pretrained_dict = {k: v for k, v in pretrained_model.state_dict().items() if k in model_dict}
+        model_dict.update(pretrained_dict)
+        self.GAT_model.load_state_dict(model_dict)
+        self.logger.info("Copied base model weights into GAT_model.")
 
         # train
         self.logger.info("training...")
diff --git a/qlib/contrib/model/pytorch_tcts.py b/qlib/contrib/model/pytorch_tcts.py
index 9f44ba31c..bf46660ea 100644
--- a/qlib/contrib/model/pytorch_tcts.py
+++ b/qlib/contrib/model/pytorch_tcts.py
@@ -9,12 +9,13 @@ import os
 import numpy as np
 import pandas as pd
 import copy
+import random
 from sklearn.metrics import roc_auc_score, mean_squared_error
 import logging
 from ...utils import (
     unpack_archive_with_buffer,
     save_multiple_parts_file,
-    create_save_path,
+    get_or_create_path,
     drop_nan_by_y_index,
 )
 from ...log import get_module_logger, TimeInspector
@@ -60,8 +61,9 @@ class TCTS(Model):
         weight_lr=5e-7,
         steps=3,
         GPU=0,
-        seed=None,
+        seed=0,
         target_label=0,
+        lowest_valid_performance=0.993,
         **kwargs
     ):
         # Set logger.
@@ -85,6 +87,9 @@ class TCTS(Model):
         self.weight_lr = weight_lr
         self.steps = steps
         self.target_label = target_label
+        self.lowest_valid_performance = lowest_valid_performance
+        self._fore_optimizer = fore_optimizer
+        self._weight_optimizer = weight_optimizer
 
         self.logger.info(
             "TCTS parameters setting:"
@@ -113,40 +118,6 @@ class TCTS(Model):
             )
         )
 
-        if self.seed is not None:
-            np.random.seed(self.seed)
-            torch.manual_seed(self.seed)
-
-        self.fore_model = GRUModel(
-            d_feat=self.d_feat,
-            hidden_size=self.hidden_size,
-            num_layers=self.num_layers,
-            dropout=self.dropout,
-        )
-        self.weight_model = MLPModel(
-            d_feat=360 + 2 * self.output_dim + 1,
-            hidden_size=self.hidden_size,
-            num_layers=self.num_layers,
-            dropout=self.dropout,
-            output_dim=self.output_dim,
-        )
-        if fore_optimizer.lower() == "adam":
-            self.fore_optimizer = optim.Adam(self.fore_model.parameters(), lr=self.fore_lr)
-        elif fore_optimizer.lower() == "gd":
-            self.fore_optimizer = optim.SGD(self.fore_model.parameters(), lr=self.fore_lr)
-        else:
-            raise NotImplementedError("optimizer {} is not supported!".format(fore_optimizer))
-        if weight_optimizer.lower() == "adam":
-            self.weight_optimizer = optim.Adam(self.weight_model.parameters(), lr=self.weight_lr)
-        elif weight_optimizer.lower() == "gd":
-            self.weight_optimizer = optim.SGD(self.weight_model.parameters(), lr=self.weight_lr)
-        else:
-            raise NotImplementedError("optimizer {} is not supported!".format(weight_optimizer))
-
-        self.fitted = False
-        self.fore_model.to(self.device)
-        self.weight_model.to(self.device)
-
     def loss_fn(self, pred, label, weight):
 
         loc = torch.argmax(weight, 1)
@@ -258,11 +229,9 @@ class TCTS(Model):
     def fit(
         self,
         dataset: DatasetH,
-        evals_result=dict(),
         verbose=True,
         save_path=None,
     ):
-
         df_train, df_valid, df_test = dataset.prepare(
             ["train", "valid", "test"],
             col_set=["feature", "label"],
@@ -274,7 +243,62 @@ class TCTS(Model):
         x_test, y_test = df_test["feature"], df_test["label"]
 
         if save_path == None:
-            save_path = create_save_path(save_path)
+            save_path = get_or_create_path(save_path)
+        best_loss = np.inf
+        while best_loss > self.lowest_valid_performance:
+            if best_loss < np.inf:
+                self.logger.info("Validation loss above lowest_valid_performance; retraining with a new seed.")
+                self.seed = random.randint(0, 1000)  # reset random seed
+
+            if self.seed is not None:
+                np.random.seed(self.seed)
+                torch.manual_seed(self.seed)
+
+            best_loss = self.training(
+                x_train, y_train, x_valid, y_valid, x_test, y_test, verbose=verbose, save_path=save_path
+            )
+
+    def training(
+        self,
+        x_train,
+        y_train,
+        x_valid,
+        y_valid,
+        x_test,
+        y_test,
+        verbose=True,
+        save_path=None,
+    ):
+
+        self.fore_model = GRUModel(
+            d_feat=self.d_feat,
+            hidden_size=self.hidden_size,
+            num_layers=self.num_layers,
+            dropout=self.dropout,
+        )
+        self.weight_model = MLPModel(
+            d_feat=360 + 2 * self.output_dim + 1,
+            hidden_size=self.hidden_size,
+            num_layers=self.num_layers,
+            dropout=self.dropout,
+            output_dim=self.output_dim,
+        )
+        if self._fore_optimizer.lower() == "adam":
+            self.fore_optimizer = optim.Adam(self.fore_model.parameters(), lr=self.fore_lr)
+        elif self._fore_optimizer.lower() == "gd":
+            self.fore_optimizer = optim.SGD(self.fore_model.parameters(), lr=self.fore_lr)
+        else:
+            raise NotImplementedError("optimizer {} is not supported!".format(self._fore_optimizer))
+        if self._weight_optimizer.lower() == "adam":
+            self.weight_optimizer = optim.Adam(self.weight_model.parameters(), lr=self.weight_lr)
+        elif self._weight_optimizer.lower() == "gd":
+            self.weight_optimizer = optim.SGD(self.weight_model.parameters(), lr=self.weight_lr)
+        else:
+            raise NotImplementedError("optimizer {} is not supported!".format(self._weight_optimizer))
+
+        self.fitted = False
+        self.fore_model.to(self.device)
+        self.weight_model.to(self.device)
 
         best_loss = np.inf
         best_epoch = 0
@@ -291,7 +315,8 @@ class TCTS(Model):
             val_loss = self.test_epoch(x_valid, y_valid)
             test_loss = self.test_epoch(x_test, y_test)
 
-            print("valid %.6f, test %.6f" % (val_loss, test_loss))
+            if verbose:
+                print("valid %.6f, test %.6f" % (val_loss, test_loss))
 
             if val_loss < best_loss:
                 best_loss = val_loss
@@ -316,6 +341,8 @@ class TCTS(Model):
         if self.use_gpu:
             torch.cuda.empty_cache()
 
+        return best_loss
+
     def predict(self, dataset):
         if not self.fitted:
             raise ValueError("model is not fitted yet!")
diff --git a/qlib/workflow/record_temp.py b/qlib/workflow/record_temp.py
index fc71b3f9a..cf30bfad5 100644
--- a/qlib/workflow/record_temp.py
+++ b/qlib/workflow/record_temp.py
@@ -227,10 +227,11 @@ class SigAnaRecord(SignalRecord):
 
     artifact_path = "sig_analysis"
 
-    def __init__(self, recorder, ana_long_short=False, ann_scaler=252, **kwargs):
+    def __init__(self, recorder, ana_long_short=False, ann_scaler=252, label_col=0, **kwargs):
         super().__init__(recorder=recorder, **kwargs)
         self.ana_long_short = ana_long_short
         self.ann_scaler = ann_scaler
+        self.label_col = label_col
 
     def generate(self, **kwargs):
         try:
@@ -243,7 +244,7 @@ class SigAnaRecord(SignalRecord):
         if label is None or not isinstance(label, pd.DataFrame) or label.empty:
             logger.warn(f"Empty label.")
             return
-        ic, ric = calc_ic(pred.iloc[:, 0], label.iloc[:, 0])
+        ic, ric = calc_ic(pred.iloc[:, 0], label.iloc[:, self.label_col])
         metrics = {
             "IC": ic.mean(),
             "ICIR": ic.mean() / ic.std(),
@@ -252,7 +253,7 @@ class SigAnaRecord(SignalRecord):
         }
         objects = {"ic.pkl": ic, "ric.pkl": ric}
         if self.ana_long_short:
-            long_short_r, long_avg_r = calc_long_short_return(pred.iloc[:, 0], label.iloc[:, 0])
+            long_short_r, long_avg_r = calc_long_short_return(pred.iloc[:, 0], label.iloc[:, self.label_col])
             metrics.update(
                 {
                     "Long-Short Ann Return": long_short_r.mean() * self.ann_scaler,