Mirror of https://github.com/microsoft/qlib.git, synced 2026-06-06 05:51:17 +08:00
@@ -30,7 +30,7 @@ Version 0.2.1
|
||||
--------------------
|
||||
- Support registering user-defined ``Provider``.
|
||||
- Support use operators in string format, e.g. ``['Ref($close, 1)']`` is valid field format.
|
||||
-- Support dynamic fields in ``$some_field`` format. And exising fields like ``Close()`` may be deprecated in the future.
+- Support dynamic fields in ``$some_field`` format. And existing fields like ``Close()`` may be deprecated in the future.
|
||||
|
||||
Version 0.2.2
|
||||
--------------------
|
||||
@@ -78,7 +78,7 @@ Version 0.3.5
|
||||
- Support multi-label training, you can provide multiple label in ``handler``. (But LightGBM doesn't support due to the algorithm itself)
|
||||
- Refactor ``handler`` code, dataset.py is no longer used, and you can deploy your own labels and features in ``feature_label_config``
|
||||
- Handler only offer DataFrame. Also, ``trainer`` and model.py only receive DataFrame
|
||||
-- Change ``split_rolling_data``, we roll the data on market calender now, not on normal date
+- Change ``split_rolling_data``, we roll the data on market calendar now, not on normal date
|
||||
- Move some date config from ``handler`` to ``trainer``
|
||||
|
||||
Version 0.4.0
|
||||
@@ -167,11 +167,11 @@ Version 0.8.0
|
||||
- There are lots of changes for daily trading, it is hard to list all of them. But a few important changes could be noticed
|
||||
- The trading limitation is more accurate;
|
||||
- In `previous version <https://github.com/microsoft/qlib/blob/v0.7.2/qlib/contrib/backtest/exchange.py#L160>`_, longing and shorting actions share the same action.
|
||||
-- In `current verison <https://github.com/microsoft/qlib/blob/7c31012b507a3823117bddcc693fc64899460b2a/qlib/backtest/exchange.py#L304>`_, the trading limitation is different between loging and shorting action.
+- In `current version <https://github.com/microsoft/qlib/blob/7c31012b507a3823117bddcc693fc64899460b2a/qlib/backtest/exchange.py#L304>`_, the trading limitation is different between logging and shorting action.
|
||||
- The constant is different when calculating annualized metrics.
|
||||
- `Current version <https://github.com/microsoft/qlib/blob/7c31012b507a3823117bddcc693fc64899460b2a/qlib/contrib/evaluate.py#L42>`_ uses more accurate constant than `previous version <https://github.com/microsoft/qlib/blob/v0.7.2/qlib/contrib/evaluate.py#L22>`_
|
||||
- `A new version <https://github.com/microsoft/qlib/blob/7c31012b507a3823117bddcc693fc64899460b2a/qlib/tests/data.py#L17>`_ of data is released. Due to the unstability of Yahoo data source, the data may be different after downloading data again.
|
||||
-- Users could chec kout the backtesting results between `Current version <https://github.com/microsoft/qlib/tree/7c31012b507a3823117bddcc693fc64899460b2a/examples/benchmarks>`_ and `previous version <https://github.com/microsoft/qlib/tree/v0.7.2/examples/benchmarks>`_
+- Users could check out the backtesting results between `Current version <https://github.com/microsoft/qlib/tree/7c31012b507a3823117bddcc693fc64899460b2a/examples/benchmarks>`_ and `previous version <https://github.com/microsoft/qlib/tree/v0.7.2/examples/benchmarks>`_
|
||||
|
||||
|
||||
Other Versions
|
||||
|
||||
@@ -14,7 +14,7 @@ To get the join trading performance of daily and intraday trading, they must int
|
||||
In order to support the joint backtest strategies in multiple levels, a corresponding framework is required. None of the publicly available high-frequency trading frameworks considers multi-level joint trading, which make the backtesting aforementioned inaccurate.
|
||||
|
||||
Besides backtesting, the optimization of strategies from different levels is not standalone and can be affected by each other.
|
||||
-For example, the best portfolio management strategy may change with the performance of order executions(e.g. a portfolio with higher turnover may becomes a better choice when we imporve the order execution strategies).
+For example, the best portfolio management strategy may change with the performance of order executions(e.g. a portfolio with higher turnover may becomes a better choice when we improve the order execution strategies).
|
||||
To achieve the overall good performance , it is necessary to consider the interaction of strategies in different level.
|
||||
|
||||
Therefore, building a new framework for trading in multiple levels becomes necessary to solve the various problems mentioned above, for which we designed a nested decision execution framework that consider the interaction of strategies.
|
||||
|
||||
@@ -37,7 +37,7 @@ Here is a general view of the structure of the system:
|
||||
|
||||
This experiment management system defines a set of interface and provided a concrete implementation ``MLflowExpManager``, which is based on the machine learning platform: ``MLFlow`` (`link <https://mlflow.org/>`_).
|
||||
|
||||
-If users set the implementation of ``ExpManager`` to be ``MLflowExpManager``, they can use the command `mlflow ui` to visualize and check the experiment results. For more information, pleaes refer to the related documents `here <https://www.mlflow.org/docs/latest/cli.html#mlflow-ui>`_.
+If users set the implementation of ``ExpManager`` to be ``MLflowExpManager``, they can use the command `mlflow ui` to visualize and check the experiment results. For more information, please refer to the related documents `here <https://www.mlflow.org/docs/latest/cli.html#mlflow-ui>`_.
|
||||
|
||||
Qlib Recorder
|
||||
===================
|
||||
|
||||
@@ -31,7 +31,7 @@ Let's see an example,
|
||||
|
||||
First make sure you have the latest version of `qlib` installed.
|
||||
|
||||
-Then, you need to privide a configuration to setup the experiment.
+Then, you need to provide a configuration to setup the experiment.
|
||||
We write a simple configuration example as following,
|
||||
|
||||
.. code-block:: YAML
|
||||
@@ -217,13 +217,13 @@ The tuner pipeline contains different tuners, and the `tuner` program will proce
|
||||
Each part represents a tuner, and its modules which are to be tuned. Space in each part is the hyper-parameters' space of a certain module, you need to create your searching space and modify it in `/qlib/contrib/tuner/space.py`. We use `hyperopt` package to help us to construct the space, you can see the detail of how to use it in https://github.com/hyperopt/hyperopt/wiki/FMin .
|
||||
|
||||
- model
|
||||
-You need to provide the `class` and the `space` of the model. If the model is user's own implementation, you need to privide the `module_path`.
+You need to provide the `class` and the `space` of the model. If the model is user's own implementation, you need to provide the `module_path`.
|
||||
|
||||
- trainer
|
||||
-You need to proveide the `class` of the trainer. If the trainer is user's own implementation, you need to privide the `module_path`.
+You need to provide the `class` of the trainer. If the trainer is user's own implementation, you need to provide the `module_path`.
|
||||
|
||||
- strategy
|
||||
-You need to provide the `class` and the `space` of the strategy. If the strategy is user's own implementation, you need to privide the `module_path`.
+You need to provide the `class` and the `space` of the strategy. If the strategy is user's own implementation, you need to provide the `module_path`.
|
||||
|
||||
- data_label
|
||||
The label of the data, you can search which kinds of labels will lead to a better result. This part is optional, and you only need to provide `space`.
|
||||
@@ -273,7 +273,7 @@ You need to use the same dataset to evaluate your different `estimator` experime
|
||||
About the data and backtest
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
-`data` and `backtest` are all same in the whole `tuner` experiment. Different `estimator` experiments must use the same data and backtest method. So, these two parts of config are same with that in `estimator` configuration. You can see the precise defination of these parts in `estimator` introduction. We only provide an example here.
+`data` and `backtest` are all same in the whole `tuner` experiment. Different `estimator` experiments must use the same data and backtest method. So, these two parts of config are same with that in `estimator` configuration. You can see the precise definition of these parts in `estimator` introduction. We only provide an example here.
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
|
||||
@@ -31,7 +31,7 @@ Users can easily intsall ``Qlib`` according to the following steps:
|
||||
git clone https://github.com/microsoft/qlib.git && cd qlib
|
||||
python setup.py install
|
||||
|
||||
-To kown more about `installation`, please refer to `Qlib Installation <../start/installation.html>`_.
+To known more about `installation`, please refer to `Qlib Installation <../start/installation.html>`_.
|
||||
|
||||
Prepare Data
|
||||
==============
|
||||
@@ -44,7 +44,7 @@ Load and prepare data by running the following code:
|
||||
|
||||
This dataset is created by public data collected by crawler scripts in ``scripts/data_collector/``, which have been released in the same repository. Users could create the same dataset with it.
|
||||
|
||||
-To kown more about `prepare data`, please refer to `Data Preparation <../component/data.html#data-preparation>`_.
+To known more about `prepare data`, please refer to `Data Preparation <../component/data.html#data-preparation>`_.
|
||||
|
||||
Auto Quant Research Workflow
|
||||
====================================
|
||||
|
||||
@@ -32,7 +32,7 @@ import abc
|
||||
import enum
|
||||
|
||||
|
||||
-# Type defintions
+# Type definitions
|
||||
class DataTypes(enum.IntEnum):
|
||||
"""Defines numerical types of each column."""
|
||||
|
||||
|
||||
@@ -254,9 +254,9 @@ class DistributedHyperparamOptManager(HyperparamOptManager):
|
||||
param_ranges: Discrete hyperparameter range for random search.
|
||||
fixed_params: Fixed model parameters per experiment.
|
||||
root_model_folder: Folder to store optimisation artifacts.
|
||||
-worker_number: Worker index definining which set of hyperparameters to
+worker_number: Worker index defining which set of hyperparameters to
|
||||
test.
|
||||
-search_iterations: Maximum numer of random search iterations.
+search_iterations: Maximum number of random search iterations.
|
||||
num_iterations_per_worker: How many iterations are handled per worker.
|
||||
clear_serialised_params: Whether to regenerate hyperparameter
|
||||
combinations.
|
||||
@@ -330,7 +330,7 @@ class DistributedHyperparamOptManager(HyperparamOptManager):
|
||||
if os.path.exists(self.serialised_ranges_folder):
|
||||
df = pd.read_csv(self.serialised_ranges_path, index_col=0)
|
||||
else:
|
||||
-print("Unable to load - regenerating serach ranges instead")
+print("Unable to load - regenerating search ranges instead")
|
||||
df = self.update_serialised_hyperparam_df()
|
||||
|
||||
return df
|
||||
|
||||
@@ -342,7 +342,7 @@ class TFTDataCache:
|
||||
|
||||
@classmethod
|
||||
def contains(cls, key):
-"""Retuns boolean indicating whether key is present in cache."""
+"""Returns boolean indicating whether key is present in cache."""
|
||||
|
||||
return key in cls._data_cache
|
||||
|
||||
@@ -1120,10 +1120,10 @@ class TemporalFusionTransformer:
|
||||
Args:
|
||||
df: Input dataframe
|
||||
return_targets: Whether to also return outputs aligned with predictions to
|
||||
-faciliate evaluation
+facilitate evaluation
|
||||
|
||||
Returns:
|
||||
-Input dataframe or tuple of (input dataframe, algined output dataframe).
+Input dataframe or tuple of (input dataframe, aligned output dataframe).
|
||||
"""
|
||||
|
||||
data = self._batch_data(df)
|
||||
|
||||
@@ -295,7 +295,7 @@ class TFTModel(ModelFT):
|
||||
def to_pickle(self, path: Union[Path, str]):
|
||||
"""
|
||||
Tensorflow model can't be dumped directly.
|
||||
-So the data should be save seperatedly
+So the data should be save separately
|
||||
|
||||
**TODO**: Please implement the function to load the files
|
||||
|
||||
|
||||
@@ -57,7 +57,7 @@ And here are two ways to run the model:
|
||||
python example.py --config_file configs/config_alstm.yaml
|
||||
```
|
||||
|
||||
-Here we trained TRA on a pretrained backbone model. Therefore we run `*_init.yaml` before TRA's scipts.
+Here we trained TRA on a pretrained backbone model. Therefore we run `*_init.yaml` before TRA's scripts.
|
||||
|
||||
### Results
|
||||
|
||||
|
||||
@@ -124,7 +124,7 @@ class TRAModel(Model):
|
||||
loss = (pred - label).pow(2).mean()
|
||||
|
||||
L = (all_preds.detach() - label[:, None]).pow(2)
|
||||
-L -= L.min(dim=-1, keepdim=True).values # normalize & ensure postive input
+L -= L.min(dim=-1, keepdim=True).values # normalize & ensure positive input
|
||||
|
||||
data_set.assign_data(index, L) # save loss to memory
|
||||
|
||||
@@ -165,7 +165,7 @@ class TRAModel(Model):
|
||||
|
||||
L = (all_preds - label[:, None]).pow(2)
|
||||
|
||||
-L -= L.min(dim=-1, keepdim=True).values # normalize & ensure postive input
+L -= L.min(dim=-1, keepdim=True).values # normalize & ensure positive input
|
||||
|
||||
data_set.assign_data(index, L) # save loss to memory
|
||||
|
||||
@@ -484,7 +484,7 @@ class TRA(nn.Module):
|
||||
|
||||
"""Temporal Routing Adaptor (TRA)
|
||||
|
||||
-TRA takes historical prediction erros & latent representation as inputs,
+TRA takes historical prediction errors & latent representation as inputs,
|
||||
then routes the input sample to a specific predictor for training & inference.
|
||||
|
||||
Args:
|
||||
|
||||
@@ -150,7 +150,7 @@ class Cut(ElemOperator):
|
||||
self.l = l
|
||||
self.r = r
|
||||
if (self.l is not None and self.l <= 0) or (self.r is not None and self.r >= 0):
|
||||
-raise ValueError("Cut operator l shoud > 0 and r should < 0")
+raise ValueError("Cut operator l should > 0 and r should < 0")
|
||||
|
||||
super(Cut, self).__init__(feature)
|
||||
|
||||
|
||||
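The bounds check in this `Cut` hunk can be exercised in isolation. The following is a minimal sketch, assuming only the condition visible in the hunk; `check_cut_bounds` is a hypothetical helper name, not part of Qlib, and it does not reproduce what `Cut` does with the feature afterwards.

```python
from typing import Optional


def check_cut_bounds(l: Optional[int], r: Optional[int]) -> None:
    """Raise ValueError unless l is None or > 0, and r is None or < 0.

    Hypothetical stand-alone copy of the condition shown in the hunk.
    """
    if (l is not None and l <= 0) or (r is not None and r >= 0):
        raise ValueError("Cut operator l should > 0 and r should < 0")


check_cut_bounds(2, -2)     # accepted: l is positive and r is negative
check_cut_bounds(None, -1)  # accepted: None skips the check on that side
try:
    check_cut_bounds(0, -1)  # rejected: l must be strictly positive
except ValueError as err:
    print(err)
```

Passing `None` for either bound is accepted because each comparison is guarded by an `is not None` test.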
@@ -298,7 +298,7 @@ class NestedDecisionExecutionWorkflow:
|
||||
# - Aligning the profit calculation between multiple levels and single levels.
|
||||
# 2) comparing different backtest
|
||||
# - Basic test idea:
|
||||
-# - the daily backtest will be similar as multi-level(the data quality makes this gap samller)
+# - the daily backtest will be similar as multi-level(the data quality makes this gap smaller)
|
||||
|
||||
def check_diff_freq(self):
|
||||
self._init_qlib()
|
||||
|
||||
@@ -241,7 +241,7 @@ def auto_init(**kwargs):
|
||||
default_exp_name: "Experiment"
|
||||
|
||||
Example 2)
|
||||
-If you wan to create simple a stand alone config, you can use following config(a.k.a `conf_type: origin`)
+If you want to create simple a stand alone config, you can use following config(a.k.a `conf_type: origin`)
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
|
||||
@@ -31,7 +31,7 @@ rtn & earning in the Account
|
||||
class AccumulatedInfo:
|
||||
"""
|
||||
accumulated trading info, including accumulated return/cost/turnover
|
||||
-AccumulatedInfo should be shared accross different levels
+AccumulatedInfo should be shared across different levels
|
||||
"""
|
||||
|
||||
def __init__(self):
|
||||
@@ -199,7 +199,7 @@ class Account:
|
||||
|
||||
# if stock is sold out, no stock price information in Position, then we should update account first, then update current position
|
||||
# if stock is bought, there is no stock in current position, update current, then update account
|
||||
-# The cost will be substracted from the cash at last. So the trading logic can ignore the cost calculation
+# The cost will be subtracted from the cash at last. So the trading logic can ignore the cost calculation
|
||||
if order.direction == Order.SELL:
|
||||
# sell stock
|
||||
self._update_state_from_order(order, trade_val, cost, trade_price)
|
||||
@@ -378,7 +378,7 @@ class Account:
|
||||
)
|
||||
|
||||
def get_portfolio_metrics(self):
-"""get the history portfolio_metrics and postions instance"""
+"""get the history portfolio_metrics and positions instance"""
|
||||
if self.is_port_metr_enabled():
|
||||
_portfolio_metrics = self.portfolio_metrics.generate_portfolio_metrics_dataframe()
|
||||
_positions = self.get_hist_positions()
|
||||
|
||||
@@ -13,7 +13,7 @@ from tqdm.auto import tqdm
|
||||
|
||||
|
||||
def backtest_loop(start_time, end_time, trade_strategy: BaseStrategy, trade_executor: BaseExecutor):
-"""backtest funciton for the interaction of the outermost strategy and executor in the nested decision execution
+"""backtest function for the interaction of the outermost strategy and executor in the nested decision execution
|
||||
|
||||
please refer to the docs of `collect_data_loop`
|
||||
|
||||
|
||||
@@ -505,8 +505,8 @@ class BaseTradeDecision:
|
||||
`inner_trade_decision` will be changed **inplaced**.
|
||||
|
||||
Motivation of the `mod_inner_decision`
|
||||
-- Leave a hook for outer decision to affact the decision generated by the inner strategy
-- e.g. the outmost strategy generate a time range for trading. But the upper layer can only affact the
+- Leave a hook for outer decision to affect the decision generated by the inner strategy
+- e.g. the outmost strategy generate a time range for trading. But the upper layer can only affect the
|
||||
nearest layer in the original design. With `mod_inner_decision`, the decision can passed through multiple
|
||||
layers
|
||||
|
||||
|
||||
@@ -103,7 +103,7 @@ class Exchange:
|
||||
Necessary fields:
|
||||
$close is for calculating the total value at end of each day.
|
||||
Optional fields:
|
||||
-$volume is only necessary when we limit the trade amount or caculate PA(vwap) indicator
+$volume is only necessary when we limit the trade amount or calculate PA(vwap) indicator
|
||||
$vwap is only necessary when we use the $vwap price as the deal price
|
||||
$factor is for rounding to the trading unit
|
||||
limit_sell will be set to False by default(False indicates we can sell this
|
||||
@@ -505,7 +505,7 @@ class Exchange:
|
||||
Note: some future information is used in this function
|
||||
Parameter:
|
||||
target_position : dict { stock_id : amount }
|
||||
-current_postion : dict { stock_id : amount}
+current_position : dict { stock_id : amount}
|
||||
trade_unit : trade_unit
|
||||
down sample : for amount 321 and trade_unit 100, deal_amount is 300
|
||||
deal order on trade_date
|
||||
|
||||
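The "down sample" rule in this docstring (amount 321 with trade_unit 100 gives a deal amount of 300) amounts to rounding down to a multiple of the trade unit. The following is a minimal sketch of that arithmetic only; `round_down_to_trade_unit` is a hypothetical name, and the sketch ignores the `$factor` adjustment that the `Exchange` docstring mentions for rounding.

```python
def round_down_to_trade_unit(amount: float, trade_unit: int) -> float:
    """Round amount down to the nearest multiple of trade_unit.

    Hypothetical helper illustrating the docstring's example; not Qlib's API.
    """
    if trade_unit <= 0:
        raise ValueError("trade_unit must be positive")
    # Floor division drops the partial lot, then we scale back up.
    return (amount // trade_unit) * trade_unit


print(round_down_to_trade_unit(321, 100))  # 300
```

An amount smaller than one trade unit rounds down to zero, so no order would be dealt under this rule.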
@@ -41,7 +41,7 @@ class BaseExecutor:
|
||||
Parameters
|
||||
----------
|
||||
time_per_step : str
|
||||
-trade time per trading step, used for genreate the trade calendar
+trade time per trading step, used for generate the trade calendar
|
||||
show_indicator: bool, optional
|
||||
whether to show indicators, :
|
||||
- 'pa', the price advantage
|
||||
@@ -369,12 +369,12 @@ class NestedExecutor(BaseExecutor):
|
||||
self.inner_strategy.reset(level_infra=sub_level_infra, outer_trade_decision=trade_decision)
|
||||
|
||||
def _update_trade_decision(self, trade_decision: BaseTradeDecision) -> BaseTradeDecision:
|
||||
-# outter strategy have chance to update decision each iterator
+# outer strategy have chance to update decision each iterator
|
||||
updated_trade_decision = trade_decision.update(self.inner_executor.trade_calendar)
|
||||
if updated_trade_decision is not None:
|
||||
trade_decision = updated_trade_decision
|
||||
# NEW UPDATE
|
||||
-# create a hook for inner strategy to update outter decision
+# create a hook for inner strategy to update outer decision
|
||||
self.inner_strategy.alter_outer_trade_decision(trade_decision)
|
||||
return trade_decision
|
||||
|
||||
|
||||
@@ -400,7 +400,7 @@ class BaseOrderIndicator:
|
||||
indicators : List[BaseOrderIndicator]
|
||||
the list of all inner indicators.
|
||||
metrics : Union[str, List[str]]
|
||||
-all metrics needs ot be sumed.
+all metrics needs to be sumed.
|
||||
fill_value : float, optional
|
||||
fill np.NaN with value. By default None.
|
||||
"""
|
||||
|
||||
@@ -152,7 +152,7 @@ class BasePosition:
|
||||
"""
|
||||
generate stock weight dict {stock_id : value weight of stock in the position}
|
||||
it is meaningful in the beginning or the end of each trade step
|
||||
-- During execution of each trading step, the weight may be not consistant with the portfolio value
+- During execution of each trading step, the weight may be not consistent with the portfolio value
|
||||
|
||||
Parameters
|
||||
----------
|
||||
|
||||
@@ -39,7 +39,7 @@ def get_benchmark_weight(
|
||||
if not path:
|
||||
path = Path(C.dpm.get_data_uri(freq)).expanduser() / "raw" / "AIndexMembers" / "weights.csv"
|
||||
# TODO: the storage of weights should be implemented in a more elegent way
|
||||
-# TODO: The benchmark is not consistant with the filename in instruments.
+# TODO: The benchmark is not consistent with the filename in instruments.
|
||||
bench_weight_df = pd.read_csv(path, usecols=["code", "date", "index", "weight"])
|
||||
bench_weight_df = bench_weight_df[bench_weight_df["index"] == bench]
|
||||
bench_weight_df["date"] = pd.to_datetime(bench_weight_df["date"])
|
||||
|
||||
@@ -73,7 +73,7 @@ class PortfolioMetrics:
|
||||
self.init_bench(freq=freq, benchmark_config=benchmark_config)
|
||||
|
||||
def init_vars(self):
|
||||
-self.accounts = OrderedDict() # account postion value for each trade time
+self.accounts = OrderedDict() # account position value for each trade time
|
||||
self.returns = OrderedDict() # daily return rate for each trade time
|
||||
self.total_turnovers = OrderedDict() # total turnover for each trade time
|
||||
self.turnovers = OrderedDict() # turnover for each trade time
|
||||
@@ -236,7 +236,7 @@ class Indicator:
|
||||
"""
|
||||
`Indicator` is implemented in a aggregate way.
|
||||
All the metrics are calculated aggregately.
|
||||
-All the metrics are calculated for a seperated stock and in a specific step on a specific level.
+All the metrics are calculated for a separated stock and in a specific step on a specific level.
|
||||
|
||||
| indicator | desc. |
|
||||
|--------------+--------------------------------------------------------------|
|
||||
|
||||
@@ -93,7 +93,7 @@ class TradeCalendarManager:
|
||||
|
||||
About the endpoints:
|
||||
- Qlib uses the closed interval in time-series data selection, which has the same performance as pandas.Series.loc
|
||||
-# - The returned right endpoints should minus 1 seconds becasue of the closed interval representation in Qlib.
+# - The returned right endpoints should minus 1 seconds because of the closed interval representation in Qlib.
|
||||
# Note: Qlib supports up to minutely decision execution, so 1 seconds is less than any trading time interval.
|
||||
|
||||
Parameters
|
||||
|
||||
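The endpoint note in this docstring says the right endpoint is pulled back by one second because intervals are closed. The following is a minimal sketch of that convention using only the standard library; `closed_step_interval` is a hypothetical name, not a Qlib function, and the one-minute step is an assumed example.

```python
from datetime import datetime, timedelta


def closed_step_interval(start: datetime, step: timedelta):
    """Return (left, right) for one step as a closed interval.

    The right endpoint is pulled back by 1 second so that it does not
    coincide with the left endpoint of the following step.
    """
    return start, start + step - timedelta(seconds=1)


left, right = closed_step_interval(datetime(2020, 1, 2, 9, 30), timedelta(minutes=1))
print(left, right)  # 2020-01-02 09:30:00 2020-01-02 09:30:59
```

Since the docstring states that decisions are at most minutely, one second is shorter than any step, so consecutive closed intervals never overlap.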
@@ -18,8 +18,8 @@ class SepDataFrame:
|
||||
"""
|
||||
(Sep)erate DataFrame
|
||||
We usually concat multiple dataframe to be processed together(Such as feature, label, weight, filter).
|
||||
-However, they are usally be used seperately at last.
-This will result in extra cost for concating and spliting data(reshaping and copying data in the memory is very expensive)
+However, they are usually be used separately at last.
+This will result in extra cost for concatenating and splitting data(reshaping and copying data in the memory is very expensive)
|
||||
|
||||
SepDataFrame tries to act like a DataFrame whose column with multiindex
|
||||
"""
|
||||
|
||||
@@ -38,11 +38,11 @@ def _get_position_value_from_df(evaluate_date, position, close_data_df):
|
||||
def get_position_value(evaluate_date, position):
|
||||
"""sum of close*amount
|
||||
|
||||
-get value of postion
+get value of position
|
||||
|
||||
use close price
|
||||
|
||||
-postions:
+positions:
|
||||
{
|
||||
Timestamp('2016-01-05 00:00:00'):
|
||||
{
|
||||
|
||||
@@ -56,7 +56,7 @@ class HFLGBModel(ModelFT, LightGBMFInt):
|
||||
|
||||
def hf_signal_test(self, dataset: DatasetH, threhold=0.2):
|
||||
"""
|
||||
-Test the sigal in high frequency test set
+Test the signal in high frequency test set
|
||||
"""
|
||||
if self.model == None:
|
||||
raise ValueError("Model hasn't been trained yet")
|
||||
|
||||
@@ -446,7 +446,7 @@ class TabNet(nn.Module):
|
||||
Args:
|
||||
n_d: dimension of the features used to calculate the final results
|
||||
n_a: dimension of the features input to the attention transformer of the next step
|
||||
-n_shared: numbr of shared steps in feature transfomer(optional)
+n_shared: numbr of shared steps in feature transformer(optional)
|
||||
n_ind: number of independent steps in feature transformer
|
||||
n_steps: number of steps of pass through tabbet
|
||||
relax coefficient:
|
||||
@@ -479,7 +479,7 @@ class TabNet(nn.Module):
|
||||
out = torch.zeros(x.size(0), self.n_d).to(x.device)
|
||||
for step in self.steps:
|
||||
x_te, l = step(x, x_a, priors)
|
||||
-out += F.relu(x_te[:, : self.n_d]) # split the feautre from feat_transformer
+out += F.relu(x_te[:, : self.n_d]) # split the feature from feat_transformer
|
||||
x_a = x_te[:, self.n_d :]
|
||||
sparse_loss.append(l)
|
||||
return self.fc(out), sum(sparse_loss)
|
||||
|
||||
@@ -232,7 +232,7 @@ class TRAModel(Model):
|
||||
choice_all.append(pd.DataFrame(choice.detach().cpu().numpy(), index=index))
|
||||
decay = self.rho ** (self.global_step // 100) # decay every 100 steps
|
||||
lamb = 0 if is_pretrain else self.lamb * decay
|
||||
-reg = prob.log().mul(P).sum(dim=1).mean() # train router to predict OT assignment
+reg = prob.log().mul(P).sum(dim=1).mean() # train router to predict TO assignment
|
||||
if self._writer is not None and not is_pretrain:
|
||||
self._writer.add_scalar("training/router_loss", -reg.item(), self.global_step)
|
||||
self._writer.add_scalar("training/reg_loss", loss.item(), self.global_step)
|
||||
@@ -663,7 +663,7 @@ class TRA(nn.Module):
|
||||
|
||||
"""Temporal Routing Adaptor (TRA)
|
||||
|
||||
-TRA takes historical prediction erros & latent representation as inputs,
+TRA takes historical prediction errors & latent representation as inputs,
|
||||
then routes the input sample to a specific predictor for training & inference.
|
||||
|
||||
Args:
|
||||
|
||||
@@ -33,5 +33,5 @@ def count_parameters(models_or_parameters, unit="m"):
|
||||
elif unit == "gb" or unit == "g":
|
||||
counts /= 2 ** 30
|
||||
elif unit is not None:
|
||||
-raise ValueError("Unknow unit: {:}".format(unit))
+raise ValueError("Unknown unit: {:}".format(unit))
|
||||
return counts
|
||||
|
||||
@@ -36,7 +36,7 @@ def save_instance(instance, file_path):
|
||||
save(dump) an instance to a pickle file
|
||||
Parameter
|
||||
instance :
|
||||
-data to te dumped
+data to be dumped
|
||||
file_path : string / pathlib.Path()
|
||||
path of file to be dumped
|
||||
"""
|
||||
|
||||
@@ -47,7 +47,7 @@ class SoftTopkStrategy(WeightStrategyBase):
|
||||
Return the proportion of your total value you will used in investment.
|
||||
Dynamically risk_degree will result in Market timing
|
||||
"""
|
||||
-# It will use 95% amoutn of your total value by default
+# It will use 95% amount of your total value by default
|
||||
return self.risk_degree
|
||||
|
||||
def generate_target_weight_position(self, score, current, trade_start_time, trade_end_time):
|
||||
|
||||
@@ -24,7 +24,7 @@ class TWAPStrategy(BaseStrategy):
|
||||
|
||||
NOTE:
|
||||
- This TWAP strategy will celling round when trading. This will make the TWAP trading strategy produce the order
|
||||
-ealier when the total trade unit of amount is less than the trading step
+earlier when the total trade unit of amount is less than the trading step
|
||||
"""
|
||||
|
||||
def reset(self, outer_trade_decision: BaseTradeDecision = None, **kwargs):
|
||||
@@ -43,8 +43,8 @@ class TWAPStrategy(BaseStrategy):
|
||||
def generate_trade_decision(self, execute_result=None):
|
||||
# NOTE: corner cases!!!
|
||||
# - If using upperbound round, please don't sell the amount which should in next step
|
||||
-# - the coordinate of the amount between steps is hard to be dealed between steps in the same level. It
-# is easier to be dealed in upper steps
+# - the coordinate of the amount between steps is hard to be dealt between steps in the same level. It
+# is easier to be dealt in upper steps
|
||||
|
||||
# strategy is not available. Give an empty decision
|
||||
if len(self.outer_trade_decision.get_decision()) == 0:
|
||||
|
||||
@@ -69,7 +69,7 @@ class BaseSignalStrategy(BaseStrategy):
|
||||
Return the proportion of your total value you will used in investment.
|
||||
Dynamically risk_degree will result in Market timing.
|
||||
"""
|
||||
-# It will use 95% amoutn of your total value by default
+# It will use 95% amount of your total value by default
|
||||
return self.risk_degree
|
||||
|
||||
|
||||
|
||||
@@ -90,7 +90,7 @@ class QLibTuner(Tuner):
|
||||
|
||||
def objective(self, params):
|
||||
|
||||
-# 1. Setup an config for a spcific estimator process
+# 1. Setup an config for a specific estimator process
|
||||
estimator_path = self.setup_estimator_config(params)
|
||||
self.logger.info("Searching params: {} ".format(params))
|
||||
|
||||
|
||||
@@ -359,7 +359,7 @@ class ExpressionCache(BaseProviderCache):
|
||||
def update(self, cache_uri: Union[str, Path], freq: str = "day"):
|
||||
"""Update expression cache to latest calendar.
|
||||
|
||||
-Overide this method to define how to update expression cache corresponding to users' own cache mechanism.
+Override this method to define how to update expression cache corresponding to users' own cache mechanism.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
@@ -445,7 +445,7 @@ class DatasetCache(BaseProviderCache):
|
||||
def update(self, cache_uri: Union[str, Path], freq: str = "day"):
|
||||
"""Update dataset cache to latest calendar.
|
||||
|
||||
-Overide this method to define how to update dataset cache corresponding to users' own cache mechanism.
+Override this method to define how to update dataset cache corresponding to users' own cache mechanism.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
@@ -543,7 +543,7 @@ class DiskExpressionCache(ExpressionCache):
|
||||
# instance
|
||||
series = self.provider.expression(instrument, field, _calendar[0], _calendar[-1], freq)
|
||||
if not series.empty:
|
||||
-# This expresion is empty, we don't generate any cache for it.
+# This expression is empty, we don't generate any cache for it.
|
||||
with CacheUtils.writer_lock(self.r, f"{str(C.dpm.get_data_uri(freq))}:expression-{_cache_uri}"):
|
||||
self.gen_expression_cache(
|
||||
expression_data=series,
|
||||
@@ -858,7 +858,7 @@ class DiskDatasetCache(DatasetCache):
|
||||
"""gen_dataset_cache
|
||||
|
||||
.. note:: This function does not consider the cache read write lock. Please
|
||||
-Aquire the lock outside this function
+Acquire the lock outside this function
|
||||
|
||||
The format the cache contains 3 parts(followed by typical filename).
|
||||
|
||||
@@ -1035,7 +1035,7 @@ class DiskDatasetCache(DatasetCache):
|
||||
# FIXME:
|
||||
# Because the feature cache are stored as .bin file.
|
||||
# So the series read from features are all float32.
|
||||
-# However, the first dataset cache is calulated based on the
+# However, the first dataset cache is calculated based on the
|
||||
# raw data. So the data type may be float64.
|
||||
# Different data type will result in failure of appending data
|
||||
if "/{}".format(DatasetCache.HDF_KEY) in store.keys():
|
||||
|
||||
@@ -58,7 +58,7 @@ class Client:
|
||||
msg_proc_func : func
|
||||
the function to process the message when receiving response, should have arg `*args`.
|
||||
msg_queue: Queue
|
||||
The queue to pass the messsage after callback.
|
||||
The queue to pass the message after callback.
|
||||
"""
|
||||
head_info = {"version": qlib.__version__}
|
||||
|
||||
|
||||
@@ -16,7 +16,7 @@ from multiprocessing import Pool
|
||||
from typing import Iterable, Union
|
||||
from typing import List, Union
|
||||
|
||||
# For supporting multiprocessing in outter code, joblib is used
|
||||
# For supporting multiprocessing in outer code, joblib is used
|
||||
from joblib import delayed
|
||||
|
||||
from .cache import H
|
||||
|
||||
@@ -392,7 +392,7 @@ class TSDataSampler:
|
||||
2021-01-14 12441 12442 12443 12444 12445 12446 ...
|
||||
2) the second element: {<original index>: <row, col>}
|
||||
"""
|
||||
# object incase of pandas converting int to flaot
|
||||
# object incase of pandas converting int to float
|
||||
idx_df = pd.Series(range(data.shape[0]), index=data.index, dtype=object)
|
||||
idx_df = lazy_sort_index(idx_df.unstack())
|
||||
# NOTE: the correctness of `__getitem__` depends on columns sorted here
|
||||
|
||||
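The comment in this hunk explains why `idx_df` is built with `dtype=object`. A minimal sketch of the effect, assuming a toy `(datetime, instrument)` MultiIndex rather than Qlib's real data: when `unstack` has to fill a missing cell, an integer series is upcast to float, while an object series keeps its integer values. The index values and variable names below are illustrative only.

```python
import pandas as pd

# Toy index: instrument "B" has no row on the second date, so unstack() must fill a gap.
index = pd.MultiIndex.from_tuples(
    [("2021-01-13", "A"), ("2021-01-13", "B"), ("2021-01-14", "A")],
    names=["datetime", "instrument"],
)

# Default integer dtype: the gap is filled with NaN, which forces a float column.
as_int = pd.Series(range(3), index=index).unstack()

# object dtype: the gap is still NaN, but the existing row positions stay integers.
as_obj = pd.Series(range(3), index=index, dtype=object).unstack()

print(as_int["B"].dtype)  # float column
print(as_obj["B"].dtype)  # object column
```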
@@ -70,7 +70,7 @@ class DataHandler(Serializable):
|
||||
Parameters
|
||||
----------
|
||||
instruments :
|
||||
The stock list to retrive.
|
||||
The stock list to retrieve.
|
||||
start_time :
|
||||
start_time of the original data.
|
||||
end_time :
|
||||
|
||||
@@ -75,7 +75,7 @@ class Processor(Serializable):
|
||||
|
||||
def readonly(self) -> bool:
|
||||
"""
|
||||
Does the processor treat the input data readonly (i.e. does not write the input data) when processsing
|
||||
Does the processor treat the input data readonly (i.e. does not write the input data) when processing
|
||||
|
||||
Knowing the readonly information is helpful to the Handler to avoid unnecessary copy
|
||||
"""
|
||||
|
||||
@@ -63,7 +63,7 @@ class HasingStockStorage(BaseHandlerStorage):
|
||||
"""Hashing data storage for datahandler
|
||||
- The default data storage pandas.DataFrame is too slow when randomly accessing one stock's data
|
||||
- HasingStockStorage hashes the multiple stocks' data(pandas.DataFrame) by the key `stock_id`.
|
||||
- HasingStockStorage hases the pandas.DataFrame into a dict, whose key is the stock_id(str) and value this stock data(panda.DataFrame), it has the following format:
|
||||
- HasingStockStorage hashes the pandas.DataFrame into a dict, whose key is the stock_id(str) and value this stock data(panda.DataFrame), it has the following format:
|
||||
{
|
||||
stock1_id: stock1_data,
|
||||
stock2_id: stock2_data,
|
||||
|
||||
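The docstring above describes splitting one large DataFrame into a dict keyed by `stock_id`, so that a single stock's rows can be fetched without scanning the whole frame. A rough sketch of that idea, not the actual `HasingStockStorage` implementation: the function name `hash_by_stock` and the index level name `instrument` are assumptions made for illustration.

```python
import pandas as pd


def hash_by_stock(df: pd.DataFrame, level: str = "instrument") -> dict:
    """Split a MultiIndex frame into {stock_id: that stock's rows} (hypothetical helper)."""
    return {stock_id: stock_df for stock_id, stock_df in df.groupby(level=level)}


# Toy data: two dates, two stocks, one feature column.
index = pd.MultiIndex.from_product(
    [["2021-01-13", "2021-01-14"], ["A", "B"]],
    names=["datetime", "instrument"],
)
frame = pd.DataFrame({"close": [1.0, 2.0, 3.0, 4.0]}, index=index)

storage = hash_by_stock(frame)
print(storage["A"])  # only the rows of stock "A"
```

After the one-time split, looking up a stock is a plain dict access, which is the random-access case the docstring says is slow on the default DataFrame storage.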
@@ -64,10 +64,10 @@ class QlibIntRLEnv(QlibRLEnv):
|
||||
Parameters
|
||||
----------
|
||||
state_interpreter : Union[dict, StateInterpreter]
|
||||
interpretor that interprets the qlib execute result into rl env state.
|
||||
interpreter that interprets the qlib execute result into rl env state.
|
||||
|
||||
action_interpreter : Union[dict, ActionInterpreter]
|
||||
interpretor that interprets the rl agent action into qlib order list
|
||||
interpreter that interprets the rl agent action into qlib order list
|
||||
"""
|
||||
super(QlibIntRLEnv, self).__init__(executor=executor)
|
||||
self.state_interpreter = init_instance_by_config(state_interpreter, accept_types=StateInterpreter)
|
||||
|
||||
@@ -34,7 +34,7 @@ class BaseStrategy:
|
||||
Parameters
|
||||
----------
|
||||
outer_trade_decision : BaseTradeDecision, optional
|
||||
the trade decision of outer strategy which this startegy relies, and it will be traded in [start_time, end_time], by default None
|
||||
the trade decision of outer strategy which this strategy relies, and it will be traded in [start_time, end_time], by default None
|
||||
- If the strategy is used to split trade decision, it will be used
|
||||
- If the strategy is used for portfolio management, it can be ignored
|
||||
level_infra : LevelInfrastructure, optional
|
||||
@@ -232,9 +232,9 @@ class RLIntStrategy(RLStrategy):
|
||||
Parameters
|
||||
----------
|
||||
state_interpreter : Union[dict, StateInterpreter]
|
||||
interpretor that interprets the qlib execute result into rl env state
|
||||
interpreter that interprets the qlib execute result into rl env state
|
||||
action_interpreter : Union[dict, ActionInterpreter]
|
||||
interpretor that interprets the rl agent action into qlib order list
|
||||
interpreter that interprets the rl agent action into qlib order list
|
||||
start_time : Union[str, pd.Timestamp], optional
|
||||
start time of trading, by default None
|
||||
end_time : Union[str, pd.Timestamp], optional
|
||||
|
||||
@@ -579,7 +579,7 @@ def get_date_range(trading_date, left_shift=0, right_shift=0, future=False):
|
||||
|
||||
|
||||
def get_date_by_shift(trading_date, shift, future=False, clip_shift=True, freq="day", align: Optional[str] = None):
|
||||
"""get trading date with shift bias wil cur_date
|
||||
"""get trading date with shift bias with cur_date
|
||||
e.g. : shift == 1, return next trading date
|
||||
shift == -1, return previous trading date
|
||||
----------
|
||||
|
||||
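The `shift` semantics in this docstring (`1` is the next trading date, `-1` the previous one) can be illustrated on a plain sorted list. This is a sketch under assumptions, not Qlib's `get_date_by_shift`: the calendar is a hypothetical list of ISO date strings, the input date is required to be a trading date, and `clip_shift` is assumed to clamp out-of-range shifts to the calendar's ends.

```python
from bisect import bisect_left


def shift_trading_date(calendar, trading_date, shift, clip_shift=True):
    """Return the date `shift` trading days away from `trading_date` (illustrative helper)."""
    pos = bisect_left(calendar, trading_date)
    if pos == len(calendar) or calendar[pos] != trading_date:
        raise ValueError(f"{trading_date} is not in the calendar")
    target = pos + shift
    if clip_shift:
        # Assumed behavior: clamp to the first/last available trading date.
        target = max(0, min(target, len(calendar) - 1))
    elif not 0 <= target < len(calendar):
        raise IndexError("shift moves outside the calendar")
    return calendar[target]


# Hypothetical three-day calendar; ISO strings sort chronologically.
toy_calendar = ["2021-01-04", "2021-01-05", "2021-01-06"]
print(shift_trading_date(toy_calendar, "2021-01-05", 1))
print(shift_trading_date(toy_calendar, "2021-01-05", -1))
```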
@@ -6,7 +6,7 @@ Motivation of index_data
|
||||
Some users just want a simple numpy dataframe with indices and don't want such a complicated tools.
|
||||
Such users are the target of `index_data`
|
||||
|
||||
`index_data` try to behave like pandas (some API will be different because we try to be simpler and more intuitive) but don't compromize the performance. It provides the basic numpy data and simple indexing feature. If users call APIs which may compromize the performance, index_data will raise Errors.
|
||||
`index_data` try to behave like pandas (some API will be different because we try to be simpler and more intuitive) but don't compromise the performance. It provides the basic numpy data and simple indexing feature. If users call APIs which may compromise the performance, index_data will raise Errors.
|
||||
"""
|
||||
|
||||
from typing import Dict, Tuple, Union, Callable, List
|
||||
|
||||
@@ -203,10 +203,10 @@ def get_valid_value(series, last=True):
|
||||
"""get the first/last not nan value of pd.Series with single level index
|
||||
Parameters
|
||||
----------
|
||||
series : pd.Seires
|
||||
series : pd.Series
|
||||
series should not be empty
|
||||
last : bool, optional
|
||||
wether to get the last valid value, by default True
|
||||
whether to get the last valid value, by default True
|
||||
- if last is True, get the last valid value
|
||||
- else, get the first valid value
|
||||
|
||||
|
||||
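The `get_valid_value` docstring describes picking the first or last non-NaN entry of a series. One way this could look, assuming a single-level index as the docstring requires; the name `get_valid_value_sketch` and the choice to return NaN when every entry is NaN are assumptions, not taken from the Qlib source.

```python
import numpy as np
import pandas as pd


def get_valid_value_sketch(series: pd.Series, last: bool = True):
    """Return the last (or first) non-NaN value of `series`; NaN if none exists."""
    valid = series.dropna()
    if valid.empty:
        return np.nan  # assumed fallback when every entry is NaN
    return valid.iloc[-1] if last else valid.iloc[0]


prices = pd.Series([np.nan, 1.0, 2.0, np.nan])
print(get_valid_value_sketch(prices))              # last valid value
print(get_valid_value_sketch(prices, last=False))  # first valid value
```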
@@ -88,7 +88,7 @@ class Experiment:
|
||||
def search_records(self, **kwargs):
|
||||
"""
|
||||
Get a pandas DataFrame of records that fit the search criteria of the experiment.
|
||||
Inputs are the search critera user want to apply.
|
||||
Inputs are the search criteria user want to apply.
|
||||
|
||||
Returns
|
||||
-------
|
||||
|
||||
@@ -105,7 +105,7 @@ class ExpManager:
|
||||
def search_records(self, experiment_ids=None, **kwargs):
|
||||
"""
|
||||
Get a pandas DataFrame of records that fit the search criteria of the experiment.
|
||||
Inputs are the search critera user want to apply.
|
||||
Inputs are the search criteria user want to apply.
|
||||
|
||||
Returns
|
||||
-------
|
||||
|
||||
@@ -75,7 +75,7 @@ class RecordUpdater(metaclass=ABCMeta):
|
||||
class DSBasedUpdater(RecordUpdater, metaclass=ABCMeta):
|
||||
"""
|
||||
Dataset-Based Updater
|
||||
- Provding updating feature for Updating data based on Qlib Dataset
|
||||
- Providing updating feature for Updating data based on Qlib Dataset
|
||||
|
||||
Assumption
|
||||
- Based on Qlib dataset
|
||||
|
||||
@@ -116,7 +116,7 @@ class RecordTemp:
|
||||
"""
|
||||
Check if the records is properly generated and saved.
|
||||
It is useful in following examples
|
||||
- checking if the depended files complete before genrating new things.
|
||||
- checking if the depended files complete before generating new things.
|
||||
- checking if the final files is completed
|
||||
|
||||
Parameters
|
||||
|
||||
@@ -9,5 +9,5 @@ A typical task workflow
|
||||
|-----------------------+------------------------------------------------|
|
||||
| TaskGen | Generating tasks. |
|
||||
| TaskManager(optional) | Manage generated tasks |
|
||||
| run task | retrive tasks from TaskManager and run tasks. |
|
||||
| run task | retrieve tasks from TaskManager and run tasks. |
|
||||
"""
|
||||
|
||||
@@ -272,7 +272,7 @@ class RollingGen(TaskGen):
|
||||
class MultiHorizonGenBase(TaskGen):
|
||||
def __init__(self, horizon: List[int] = [5], label_leak_n=2):
|
||||
"""
|
||||
This task generator tries to genrate tasks for different horizons based on an existing task
|
||||
This task generator tries to generate tasks for different horizons based on an existing task
|
||||
|
||||
Parameters
|
||||
----------
|
||||
|
||||
@@ -48,7 +48,7 @@ class TaskManager:
|
||||
The tasks manager assumes that you will only update the tasks you fetched.
|
||||
The mongo fetch one and update will make it date updating secure.
|
||||
|
||||
This class can be used as a tool from commandline. Here are serveral examples.
|
||||
This class can be used as a tool from commandline. Here are several examples.
|
||||
You can view the help of manage module with the following commands:
|
||||
python -m qlib.workflow.task.manage -h # show manual of manage module CLI
|
||||
python -m qlib.workflow.task.manage wait -h # show manual of the wait command of manage
|
||||
@@ -368,7 +368,7 @@ class TaskManager:
|
||||
|
||||
def return_task(self, task, status=STATUS_WAITING):
|
||||
"""
|
||||
Return a task to status. Alway using in error handling.
|
||||
Return a task to status. Always using in error handling.
|
||||
|
||||
Args:
|
||||
task ([type]): [description]
|
||||
|
||||
@@ -103,7 +103,7 @@ class FundCollector(BaseCollector):
|
||||
error_msg = f"{symbol}-{interval}-{start}-{end}"
|
||||
|
||||
try:
|
||||
# TODO: numberOfHistoricalDaysToCrawl should be bigger enouhg
|
||||
# TODO: numberOfHistoricalDaysToCrawl should be bigger enough
|
||||
url = INDEX_BENCH_URL.format(
|
||||
index_code=symbol, numberOfHistoricalDaysToCrawl=10000, startDate=start, endDate=end
|
||||
)
|
||||
|
||||
@@ -360,7 +360,7 @@ def get_en_fund_symbols(qlib_data_path: [str, Path] = None) -> list:
|
||||
_symbols = []
|
||||
for sub_data in re.findall(r"[\[](.*?)[\]]", resp.content.decode().split("= [")[-1].replace("];", "")):
|
||||
data = sub_data.replace('"', "").replace("'", "")
|
||||
# TODO: do we need other informations, like fund_name from ['000001', 'HXCZHH', '华夏成长混合', '混合型', 'HUAXIACHENGZHANGHUNHE']
|
||||
# TODO: do we need other information, like fund_name from ['000001', 'HXCZHH', '华夏成长混合', '混合型', 'HUAXIACHENGZHANGHUNHE']
|
||||
_symbols.append(data.split(",")[0])
|
||||
except Exception as e:
|
||||
logger.warning(f"request error: {e}")
|
||||
|
||||