mirror of
https://github.com/microsoft/qlib.git
synced 2026-07-02 02:21:18 +08:00
Update part of the docs
22
README.md
@@ -128,14 +128,14 @@ Users could create the same dataset with it.
|
||||
-->
|
||||
|
||||
## Auto Quant Research Workflow
|
||||
Qlib provides a tool named `Estimator` to run the whole workflow automatically (including building the dataset, training models, backtesting and evaluation). You can start an auto quant research workflow and get a graphical report analysis with the following steps:
|
||||
Qlib provides a tool named `qrun` to run the whole workflow automatically (including building the dataset, training models, backtesting and evaluation). You can start an auto quant research workflow and get a graphical report analysis with the following steps:
|
||||
|
||||
1. Quant Research Workflow: Run `Estimator` with [estimator_config.yaml](examples/estimator/estimator_config.yaml) as follows. (*Please note that this may **not work** under macOS with Python 3.8 because the `sacred` package we use is incompatible with Python 3.8. We will fix this bug in the future.*)
|
||||
1. Quant Research Workflow: Run `qrun` with the LightGBM workflow config ([workflow_config_lightgbm.yaml](examples/benchmarks/LightGBM/workflow_config_lightgbm.yaml)) as follows.
|
||||
```bash
|
||||
cd examples  # Avoid running the program from a directory that contains `qlib`
|
||||
estimator -c estimator/estimator_config.yaml
|
||||
qrun benchmarks/LightGBM/workflow_config_lightgbm.yaml
|
||||
```
|
||||
The result of `Estimator` is as follows. Please refer to [Intraday Trading](https://qlib.readthedocs.io/en/latest/component/backtest.html) for more details about the result.
|
||||
The result of `qrun` is as follows. Please refer to [Intraday Trading](https://qlib.readthedocs.io/en/latest/component/backtest.html) for more details about the result.
|
||||
|
||||
```bash
|
||||
|
||||
@@ -154,9 +154,9 @@ Qlib provides a tool named `Estimator` to run the whole workflow automatically (
|
||||
|
||||
|
||||
```
|
||||
Here are detailed documents for [Estimator](https://qlib.readthedocs.io/en/latest/component/estimator.html).
|
||||
Here are detailed documents for `qrun` and [workflow](https://qlib.readthedocs.io/en/latest/component/workflow.html).
|
||||
|
||||
2. Graphical Reports Analysis: Run `examples/estimator/analyze_from_estimator.ipynb` with `jupyter notebook` to get graphical reports
|
||||
2. Graphical Reports Analysis: Run `examples/workflow_by_code.ipynb` with `jupyter notebook` to get graphical reports
|
||||
- Forecasting signal (model prediction) analysis
|
||||
- Cumulative Return of groups
|
||||

|
||||
@@ -184,14 +184,20 @@ Qlib provides a tool named `Estimator` to run the whole workflow automatically (
|
||||
-->
|
||||
|
||||
## Building Customized Quant Research Workflow by Code
|
||||
The automatic workflow may not suit the research workflow of all Quant researchers. To support a flexible Quant research workflow, Qlib also provides a modularized interface that allows researchers to build their own workflow by code. [Here](examples/train_backtest_analyze.ipynb) is a demo of a customized Quant research workflow by code.
|
||||
The automatic workflow may not suit the research workflow of all Quant researchers. To support a flexible Quant research workflow, Qlib also provides a modularized interface that allows researchers to build their own workflow by code. [Here](examples/workflow_by_code.ipynb) is a demo of a customized Quant research workflow by code.
|
||||
|
||||
|
||||
# Quant Model Zoo
|
||||
|
||||
Here is a list of models built on `Qlib`.
|
||||
- [GBDT based on lightgbm](qlib/contrib/model/gbdt.py)
|
||||
- [GBDT based on LightGBM](qlib/contrib/model/gbdt.py)
|
||||
- [GBDT based on Catboost](qlib/contrib/model/catboost_model.py)
|
||||
- [GBDT based on XGBoost](qlib/contrib/model/xgboost.py)
|
||||
- [MLP based on pytorch](qlib/contrib/model/pytorch_nn.py)
|
||||
- [GRU based on pytorch](qlib/contrib/model/pytorch_gru.py)
|
||||
- [LSTM based on pytorch](qlib/contrib/model/pytorch_lstm.py)
|
||||
- [GATs based on pytorch](qlib/contrib/model/pytorch_gats.py)
|
||||
- [TFT based on tensorflow-1.15.0](examples/benchmarks/TFT/tft.py)
|
||||
|
||||
Your PRs of new Quant models are highly welcome.
|
||||
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _alpha:
|
||||
|
||||
===========================
|
||||
Building Formulaic Alphas
|
||||
===========================
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _server:
|
||||
|
||||
=================================
|
||||
``Online`` & ``Offline`` mode
|
||||
=================================
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _backtest:
|
||||
|
||||
============================================
|
||||
Intraday Trading: Model&Strategy Testing
|
||||
============================================
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _data:
|
||||
|
||||
================================
|
||||
Data Layer: Data Framework&Usage
|
||||
================================
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _model:
|
||||
|
||||
============================================
|
||||
Interday Model: Model Training & Prediction
|
||||
============================================
|
||||
|
||||
409
docs/component/recorder.rst
Normal file
@@ -0,0 +1,409 @@
|
||||
.. _recorder:
|
||||
|
||||
====================================
|
||||
Qlib Recorder: Experiment Management
|
||||
====================================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
===================
|
||||
``Qlib`` contains an experiment management system named ``QlibRecorder``, which is designed to help users handle experiment and analysis results in an efficient way.
|
||||
|
||||
There are three components of the system:
|
||||
|
||||
- `ExperimentManager`
|
||||
a class that manages experiments.
|
||||
|
||||
- `Experiment`
|
||||
a class of experiment, and each instance of it is responsible for a single experiment.
|
||||
|
||||
- `Recorder`
|
||||
a class of recorder, and each instance of it is responsible for a single run.
|
||||
|
||||
Here is a general view of the structure of the system:
|
||||
|
||||
.. code-block::
|
||||
|
||||
ExperimentManager
|
||||
- Experiment 1
|
||||
- Recorder 1
|
||||
- Recorder 2
|
||||
- ...
|
||||
- Experiment 2
|
||||
- Recorder 1
|
||||
- Recorder 2
|
||||
- ...
|
||||
- ...
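The hierarchy above can be sketched in a few lines of plain Python. This is only an illustration of the ownership structure (manager, experiments, recorders); the class names `ExperimentManagerSketch`, `ExperimentSketch` and `RecorderSketch` are hypothetical and are not part of ``Qlib``.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RecorderSketch:
    """Stands in for one run; hypothetical, not Qlib's Recorder."""
    name: str
    status: str = "SCHEDULED"


@dataclass
class ExperimentSketch:
    """Stands in for one experiment that owns many runs."""
    name: str
    recorders: Dict[str, RecorderSketch] = field(default_factory=dict)

    def create_recorder(self, name: str) -> RecorderSketch:
        recorder = RecorderSketch(name)
        self.recorders[name] = recorder
        return recorder


@dataclass
class ExperimentManagerSketch:
    """Top-level container that owns all experiments."""
    experiments: Dict[str, ExperimentSketch] = field(default_factory=dict)

    def get_or_create(self, name: str) -> ExperimentSketch:
        # setdefault returns the existing experiment or stores a new one.
        return self.experiments.setdefault(name, ExperimentSketch(name))


manager = ExperimentManagerSketch()
exp = manager.get_or_create("Experiment 1")
exp.create_recorder("Recorder 1")
exp.create_recorder("Recorder 2")
manager.get_or_create("Experiment 2").create_recorder("Recorder 1")

# Map every experiment name to the sorted names of its recorders.
tree = {name: sorted(e.recorders) for name, e in manager.experiments.items()}
print(tree)
```

Each recorder belongs to exactly one experiment, and the manager is the single entry point for looking experiments up by name.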
|
||||
|
||||
Currently, the components of this experiment management system are implemented using the machine learning platform: ``MLFlow`` (`link <https://mlflow.org/>`_).
|
||||
|
||||
|
||||
Qlib Recorder
|
||||
===================
|
||||
``QlibRecorder`` provides a high level API for users to use the experiment management system. The interfaces are wrapped in the variable ``R`` in ``Qlib``, and users can directly use ``R`` to interact with the system. The following command shows how to import ``R`` in Python:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
from qlib.workflow import R
|
||||
|
||||
``QlibRecorder`` includes several common APIs for managing `experiments` and `recorders` within a workflow. For more available APIs, please refer to the following sections about `Experiment Manager`, `Experiment` and `Recorder`.
|
||||
|
||||
Here are the available interfaces of ``QlibRecorder``:
|
||||
|
||||
- `__init__(exp_manager)`
|
||||
- Initialization.
|
||||
- It takes in an input: `exp_manager`, which is an `ExperimentManager` instance. The instance will be created during ``qlib.init``.
|
||||
|
||||
- `start(experiment_name=None, recorder_name=None)`
|
||||
- High level API to start an experiment. This method can only be called within a Python '`with`' statement.
|
||||
- Parameters:
|
||||
- `experiment_name` : str
|
||||
name of the experiment one wants to start.
|
||||
- `recorder_name` : str
|
||||
name of the recorder under the experiment one wants to start.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
with R.start('test', 'recorder_1'):
|
||||
model.fit(dataset)
|
||||
R.log...
|
||||
... # further operations
|
||||
|
||||
- `start_exp(experiment_name=None, recorder_name=None, uri=None)`
|
||||
- Lower level method for starting an experiment. When using this method, one should end the experiment manually, and the status of the recorder may not be handled properly.
|
||||
- Parameters:
|
||||
- `experiment_name` : str
|
||||
the name of the experiment to be started
|
||||
- `recorder_name` : str
|
||||
name of the recorder under the experiment one wants to start.
|
||||
- `uri` : str
|
||||
the tracking uri of the experiment, where all the artifacts/metrics etc. will be stored.
|
||||
The default uri is set in qlib.config.
|
||||
- Returns:
|
||||
- an experiment instance being started.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
R.start_exp(experiment_name='test', recorder_name='recorder_1')
|
||||
... # further operations
|
||||
R.end_exp('FINISHED')  # or R.end_exp(Recorder.STATUS_FI)
|
||||
|
||||
- `end_exp(recorder_status=Recorder.STATUS_FI)`
|
||||
- Method for ending an experiment manually. It will end the current active experiment, as well as its active recorder, with the specified `recorder_status`.
|
||||
- Parameters:
|
||||
- `recorder_status` : str
|
||||
The status of a recorder, which can be '`SCHEDULED`', '`RUNNING`', '`FINISHED`', '`FAILED`'.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
R.start_exp(experiment_name='test')
|
||||
... # further operations
|
||||
R.end_exp('FINISHED')  # or R.end_exp(Recorder.STATUS_FI)
|
||||
|
||||
- `search_records(experiment_ids, **kwargs)`
|
||||
- Get a pandas DataFrame of all the records that match the given search criteria. This method closely follows MLFlow's ``search_runs`` method (`link <https://www.mlflow.org/docs/latest/python_api/mlflow.html#mlflow.search_runs>`_).
|
||||
- Parameters:
|
||||
- `experiment_ids` : list
|
||||
list of experiment IDs.
|
||||
- `filter_string` : str
|
||||
filter query string, defaults to searching all runs.
|
||||
- `run_view_type` : int
|
||||
one of enum values ACTIVE_ONLY (1), DELETED_ONLY (2), or ALL (3).
|
||||
- `max_results` : int
|
||||
the maximum number of runs to put in the dataframe.
|
||||
- `order_by` : list
|
||||
list of columns to order by (e.g., "metrics.rmse").
|
||||
- Returns:
|
||||
- A pandas.DataFrame of records, where each metric, parameter, and tag is expanded into its own column named metrics.*, params.*, and tags.* respectively. For records that don't have a particular metric, parameter, or tag, the value will be (NumPy) NaN, None, or None respectively.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
R.log_metrics(m=2.50, step=0)
|
||||
records = R.search_records([experiment_id], order_by=["metrics.m DESC"])
|
||||
|
||||
- `list_experiments()`
|
||||
- Method for listing all the existing experiments (excluding deleted ones).
|
||||
- Returns:
|
||||
- A dictionary (name -> experiment) of the stored experiment information.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
exps = R.list_experiments()
|
||||
|
||||
- `list_recorders(experiment_id=None, experiment_name=None)`
|
||||
- Method for listing all the recorders of the experiment with the given id or name. If the user doesn't provide the id or name of the experiment, this method will try to retrieve the default experiment and list all of its recorders. If the default experiment doesn't exist, the method will first create the default experiment, and then create a new recorder under it.
|
||||
- Parameters:
|
||||
- `experiment_id` : str
|
||||
id of the experiment.
|
||||
- `experiment_name` : str
|
||||
name of the experiment.
|
||||
- Returns:
|
||||
- A dictionary (id -> recorder) of the stored recorder information.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
recorders = R.list_recorders(experiment_name='test')
|
||||
|
||||
- `get_exp(experiment_id=None, experiment_name=None, create: bool = True)`
|
||||
- Method for retrieving an experiment with the given id or name. If the '`create`' argument is set to True and no valid experiment is found, this method will create one for the user. Otherwise, it will only retrieve the specified experiment or raise an Error.
|
||||
|
||||
- If '`create`' is True:
|
||||
- If ``R``'s running:
|
||||
- no id or name specified, return the active experiment.
|
||||
- if id or name is specified, return the specified experiment. If no such exp found, create a new experiment with given id or name, and the experiment is set to be running.
|
||||
- If ``R``'s not running:
|
||||
- no id or name specified, create a default experiment, and the experiment is set to be running.
|
||||
- if id or name is specified, return the specified experiment. If no such exp found, create a new experiment with given name or the default experiment, and the experiment is set to be running.
|
||||
- Else If '`create`' is False:
|
||||
- If ``R``'s running:
|
||||
- no id or name specified, return the active experiment.
|
||||
- if id or name is specified, return the specified experiment. If no such exp found, raise Error.
|
||||
- If ``R``'s not running:
|
||||
- no id or name specified. If the default experiment exists, return it, otherwise, raise Error.
|
||||
- if id or name is specified, return the specified experiment. If no such exp found, raise Error.
|
||||
- Parameters:
|
||||
- `experiment_id` : str
|
||||
id of the experiment.
|
||||
- `experiment_name` : str
|
||||
name of the experiment.
|
||||
- `create` : boolean
|
||||
determines whether the method will automatically create a new experiment according to the user's specification if the experiment hasn't been created before.
|
||||
- Returns:
|
||||
- An experiment instance with given id or name.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
# Case 1
|
||||
with R.start('test'):
|
||||
exp = R.get_exp()
|
||||
recorders = exp.list_recorders()
|
||||
|
||||
# Case 2
|
||||
with R.start('test'):
|
||||
exp = R.get_exp('test1')
|
||||
|
||||
# Case 3
|
||||
exp = R.get_exp()  # -> a default experiment.
|
||||
|
||||
# Case 4
|
||||
exp = R.get_exp(experiment_name='test')
|
||||
|
||||
# Case 5
|
||||
exp = R.get_exp(create=False)  # -> the default experiment if it exists.
|
||||
|
||||
- `delete_exp(experiment_id=None, experiment_name=None)`
|
||||
- Method for deleting the experiment with the given id or name. At least one of id or name must be given; otherwise, an error will be raised.
|
||||
- Parameters:
|
||||
- `experiment_id` : str
|
||||
id of the experiment.
|
||||
- `experiment_name` : str
|
||||
name of the experiment.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
R.delete_exp(experiment_name='test')
|
||||
|
||||
- `get_uri()`
|
||||
- Method for retrieving the uri of current experiment manager.
|
||||
- Returns:
|
||||
- The uri of current experiment manager.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
uri = R.get_uri()
|
||||
|
||||
- `get_recorder(recorder_id=None, recorder_name=None, experiment_name=None)`
|
||||
- Method for retrieving a recorder. The recorder can be used for further process such as ``save_objects``, ``load_object``, ``log_params``, ``log_metrics``, etc.
|
||||
|
||||
- If ``R``'s running:
|
||||
- no id or name specified, return the active recorder.
|
||||
- if id or name is specified, return the specified recorder.
|
||||
- If ``R``'s not running:
|
||||
- no id or name specified, raise Error.
|
||||
- if id or name is specified, the corresponding experiment_name must also be given; the specified recorder is then returned. Otherwise, raise Error.
|
||||
- Parameters:
|
||||
- `recorder_id` : str
|
||||
id of the recorder.
|
||||
- `recorder_name` : str
|
||||
name of the recorder.
|
||||
- `experiment_name` : str
|
||||
name of the experiment.
|
||||
- Returns:
|
||||
- A recorder instance.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
# Case 1
|
||||
with R.start('test'):
|
||||
recorder = R.get_recorder()
|
||||
|
||||
# Case 2
|
||||
with R.start('test'):
|
||||
recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')
|
||||
|
||||
# Case 3
|
||||
recorder = R.get_recorder()  # -> Error
|
||||
|
||||
# Case 4
|
||||
recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')  # -> Error
|
||||
|
||||
# Case 5
|
||||
recorder = R.get_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d', experiment_name='test')
|
||||
|
||||
- `delete_recorder(recorder_id=None, recorder_name=None)`
|
||||
- Method for deleting the recorder with the given id or name. At least one of id or name must be given; otherwise, an error will be raised.
|
||||
- Parameters:
|
||||
- `recorder_id` : str
|
||||
id of the recorder.
|
||||
- `recorder_name` : str
|
||||
name of the recorder.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
R.delete_recorder(recorder_id='2e7a4efd66574fa49039e00ffaefa99d')
|
||||
|
||||
- `save_objects(local_path=None, artifact_path=None, **kwargs)`
|
||||
- Method for saving objects as artifacts in the experiment to the uri. It supports either saving from a local file/directory or directly saving objects. Users can use valid Python keyword arguments to specify the object to be saved as well as its name (name: value).
|
||||
|
||||
- If R's running: it will save the objects through the running recorder.
|
||||
- If R's not running: the system will create a default experiment, and a new recorder and save objects under it.
|
||||
|
||||
.. note::
|
||||
|
||||
If one wants to save objects with a specific recorder, it is recommended to first get that recorder through the `get_recorder` API and use it to save the objects. The supported arguments are the same as for this method.
|
||||
|
||||
- Parameters:
|
||||
- `local_path` : str
|
||||
if provided, then save the file or directory to the artifact URI.
|
||||
- `artifact_path` : str
|
||||
the relative path for the artifact to be stored in the URI.
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
# Case 1
|
||||
with R.start('test'):
|
||||
pred = model.predict(dataset)
|
||||
R.save_objects(**{"pred.pkl": pred}, artifact_path='prediction')
|
||||
|
||||
# Case 2
|
||||
with R.start('test'):
|
||||
R.save_objects(local_path='results/pred.pkl')
|
||||
|
||||
- `log_params(**kwargs)`
|
||||
- Method for logging parameters during an experiment. In addition to using ``R``, one can also log to a specific recorder after getting it with `get_recorder` API.
|
||||
|
||||
- If R's running: it will log parameters through the running recorder.
|
||||
- If R's not running: the system will create a default experiment as well as a new recorder, and log parameters under it.
|
||||
- Parameters:
|
||||
- `keyword argument`:
|
||||
name1=value1, name2=value2, ...
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
# Case 1
|
||||
with R.start('test'):
|
||||
R.log_params(learning_rate=0.01)
|
||||
|
||||
# Case 2
|
||||
R.log_params(learning_rate=0.01)
|
||||
|
||||
- `log_metrics(step=None, **kwargs)`
|
||||
- Method for logging metrics during an experiment. In addition to using ``R``, one can also log to a specific recorder after getting it with `get_recorder` API.
|
||||
|
||||
- If R's running: it will log metrics through the running recorder.
|
||||
- If R's not running: the system will create a default experiment as well as a new recorder, and log metrics under it.
|
||||
- Parameters:
|
||||
- `step`: int
|
||||
a single integer step at which to log the specified Metrics. If unspecified, each metric is logged at step zero.
|
||||
- `keyword argument`:
|
||||
name1=value1, name2=value2, ...
|
||||
|
||||
- `set_tags(**kwargs)`
|
||||
- Method for setting tags for a recorder. In addition to using ``R``, one can also set the tag to a specific recorder after getting it with `get_recorder` API.
|
||||
|
||||
- If R's running: it will set tags through the running recorder.
|
||||
- If R's not running: the system will create a default experiment as well as a new recorder, and set the tags under it.
|
||||
- Parameters:
|
||||
- `keyword argument`:
|
||||
name1=value1, name2=value2, ...
|
||||
- Use case:
|
||||
|
||||
.. code-block:: Python
|
||||
|
||||
# Case 1
|
||||
with R.start('test'):
|
||||
R.set_tags(release_version="2.2.0")
|
||||
|
||||
# Case 2
|
||||
R.set_tags(release_version="2.2.0")
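The decision rules listed for `get_exp` are easier to follow as code. The sketch below is a simplified, hypothetical model of those rules: it looks experiments up by name only (no ids), `ManagerSketch` and `DEFAULT_NAME` are invented for illustration, and it is not how ``Qlib`` implements the method.

```python
from typing import Dict, Optional

DEFAULT_NAME = "Experiment"  # hypothetical name of the default experiment


class ManagerSketch:
    """Toy registry mirroring the documented get_exp decision rules."""

    def __init__(self) -> None:
        self.experiments: Dict[str, dict] = {}
        self.active: Optional[str] = None  # name of the running experiment

    def get_exp(self, name: Optional[str] = None, create: bool = True) -> dict:
        if name is None:
            if self.active is not None:
                # Running and nothing specified: return the active experiment.
                return self.experiments[self.active]
            name = DEFAULT_NAME  # not running: fall back to the default
        if name in self.experiments:
            return self.experiments[name]
        if not create:
            raise ValueError(f"experiment {name!r} not found")
        # Not found and create=True: create it and mark it as running.
        self.experiments[name] = {"name": name}
        self.active = name
        return self.experiments[name]


mgr = ManagerSketch()
try:
    mgr.get_exp(create=False)  # nothing exists yet, so this must fail
    outcome = "found"
except ValueError:
    outcome = "error"

default_exp = mgr.get_exp()       # creates the default experiment
named_exp = mgr.get_exp("test")   # creates "test" and makes it active
active_exp = mgr.get_exp()        # now resolves to the active experiment
```

The only difference between `create=True` and `create=False` in this model is the last step: a miss either creates the experiment or raises.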
|
||||
|
||||
|
||||
Experiment Manager
|
||||
===================
|
||||
|
||||
The ``ExpManager`` module in ``Qlib`` is responsible for managing different experiments. Most of the APIs of ``ExpManager`` are similar to those of ``QlibRecorder``, and the most important one is the ``get_exp`` method. Users can refer to the documentation above for details on how to use ``get_exp``.
|
||||
|
||||
For other interfaces such as `create_exp`, `delete_exp`, please refer to `Experiment Manager API <../reference/api.html#experiment-manager>`_.
|
||||
|
||||
Experiment
|
||||
===================
|
||||
|
||||
The ``Experiment`` class is solely responsible for a single experiment, and it handles any operations related to that experiment. Basic methods such as `start` and `end` are included. In addition, methods related to `recorders` are available, including `get_recorder` and `list_recorders`.
|
||||
|
||||
For other interfaces such as `search_records`, `delete_recorder`, please refer to `Experiment API <../reference/api.html#experiment>`_.
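Pairing `start` with `end` by hand is easy to get wrong when an exception interrupts the run. The following sketch shows, under simplified assumptions, how a context manager can guarantee that a run ends with a sensible status. `RunSketch` and this `start` function are hypothetical and only mimic the documented status names; they are not ``Qlib``'s implementation.

```python
from contextlib import contextmanager

STATUS_RUNNING = "RUNNING"
STATUS_FINISHED = "FINISHED"
STATUS_FAILED = "FAILED"


class RunSketch:
    """Minimal stand-in for a recorder that only tracks its status."""

    def __init__(self, name):
        self.name = name
        self.status = "SCHEDULED"


@contextmanager
def start(run):
    """Mark the run as running, then record how the block ended."""
    run.status = STATUS_RUNNING
    try:
        yield run
    except Exception:
        run.status = STATUS_FAILED  # the body raised: record the failure
        raise
    else:
        run.status = STATUS_FINISHED  # the body completed normally


ok_run = RunSketch("recorder_1")
with start(ok_run):
    pass  # training and logging would happen here

bad_run = RunSketch("recorder_2")
try:
    with start(bad_run):
        raise RuntimeError("training crashed")
except RuntimeError:
    pass

print(ok_run.status, bad_run.status)
```

With manual `start`/`end` calls, the failing case would leave the run in the `RUNNING` state unless the caller wrapped everything in its own `try` block.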
|
||||
|
||||
Recorder
|
||||
===================
|
||||
|
||||
The ``Recorder`` class is responsible for a single recorder. It handles detailed operations such as ``log_metrics`` and ``log_params`` for a single run. It is designed to help users easily track the results and artifacts generated during a run.
|
||||
|
||||
Here are some important APIs that are not included in the ``QlibRecorder``:
|
||||
|
||||
- `list_artifacts(artifact_path: str = None)`
|
||||
- List all the artifacts of a recorder.
|
||||
- Parameters:
|
||||
- `artifact_path` : str
|
||||
the relative path for the artifact to be stored in the URI.
|
||||
- Returns:
|
||||
- A list of the stored artifact information (name, path, etc.).
|
||||
|
||||
- `list_metrics()`
|
||||
- List all the metrics of a recorder.
|
||||
- Returns:
|
||||
- A dictionary of the stored metrics.
|
||||
|
||||
- `list_params()`
|
||||
- List all the params of a recorder.
|
||||
- Returns:
|
||||
- A dictionary of the stored params.
|
||||
|
||||
- `list_tags()`
|
||||
- List all the tags of a recorder.
|
||||
- Returns:
|
||||
- A dictionary of the stored tags.
|
||||
|
||||
For other interfaces such as `save_objects`, `load_object`, please refer to `Recorder API <../reference/api.html#recorder>`_.
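To make the relationship between the logging methods and the `list_*` methods concrete, here is a minimal in-memory sketch. `InMemoryRecorder` is a hypothetical stand-in that keeps everything in dictionaries; the real ``Recorder`` stores data through its tracking backend, and its return values may differ in detail.

```python
from collections import defaultdict


class InMemoryRecorder:
    """Hypothetical stand-in that keeps everything in dictionaries."""

    def __init__(self):
        self._params = {}
        self._metrics = defaultdict(list)  # name -> [(step, value), ...]
        self._tags = {}

    def log_params(self, **kwargs):
        self._params.update(kwargs)

    def log_metrics(self, step=None, **kwargs):
        # Mirror the documented default: unspecified steps land at zero.
        step = 0 if step is None else step
        for name, value in kwargs.items():
            self._metrics[name].append((step, value))

    def set_tags(self, **kwargs):
        self._tags.update(kwargs)

    def list_params(self):
        return dict(self._params)

    def list_metrics(self):
        # Report the most recently logged value of each metric.
        return {name: history[-1][1] for name, history in self._metrics.items()}

    def list_tags(self):
        return dict(self._tags)


rec = InMemoryRecorder()
rec.log_params(learning_rate=0.01)
rec.log_metrics(step=0, loss=0.9)
rec.log_metrics(step=1, loss=0.7)
rec.set_tags(release_version="2.2.0")
print(rec.list_params(), rec.list_metrics(), rec.list_tags())
```

Reporting only the latest value per metric is a design choice of this sketch; a fuller version would also expose the per-step history.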
|
||||
|
||||
Record Template
|
||||
===================
|
||||
|
||||
The ``RecordTemp`` class enables generating experiment results such as IC and backtest in a certain format. We provide three different `Record Template` classes:
|
||||
|
||||
- ``SignalRecord``: This class generates the `prediction` of the model.
|
||||
- ``SigAnaRecord``: This class generates the `IC`, `ICIR`, `Rank IC` and `Rank ICIR`.
|
||||
- ``PortAnaRecord``: This class generates the results of `backtest`. For detailed information about `backtest` and the available `strategy`, users can refer to `Strategy <../component/strategy.html>`_ and `Backtest <../component/backtest.html>`_.
|
||||
|
||||
For more information, please refer to `Record Template API <../reference/api.html#module-qlib.workflow.record_temp>`_.
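For readers unfamiliar with the metrics named above, the sketch below computes them from their usual definitions: `IC` as the per-day Pearson correlation between predictions and realized returns, `Rank IC` as the same correlation on ranks, and `ICIR` as the mean of the daily IC divided by its standard deviation. The data is made up and the helper functions are illustrative; this is not the code ``SigAnaRecord`` runs.

```python
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """Rank 0 for the smallest value; ties are not handled in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def rank_ic(pred, label):
    """Spearman-style correlation: Pearson applied to the ranks."""
    return pearson(ranks(pred), ranks(label))


# Made-up data: date -> (predicted scores, realized returns) for four stocks.
days = {
    "2020-01-02": ([0.1, 0.4, 0.2, 0.3], [0.01, 0.03, 0.02, 0.04]),
    "2020-01-03": ([0.3, 0.1, 0.4, 0.2], [0.02, -0.01, 0.05, 0.00]),
    "2020-01-06": ([0.2, 0.3, 0.1, 0.4], [0.03, 0.01, -0.02, 0.02]),
}

daily_ic = [pearson(p, y) for p, y in days.values()]
daily_rank_ic = [rank_ic(p, y) for p, y in days.values()]
icir = mean(daily_ic) / stdev(daily_ic)
rank_icir = mean(daily_rank_ic) / stdev(daily_rank_ic)
```

On the second day the predictions order the stocks exactly as the realized returns do, so the Rank IC for that day is 1 even though the raw IC is slightly lower.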
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _report:
|
||||
|
||||
==========================================
|
||||
Analysis: Evaluation & Results Analysis
|
||||
==========================================
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _strategy:
|
||||
|
||||
========================================
|
||||
Interday Strategy: Portfolio Management
|
||||
========================================
|
||||
|
||||
279
docs/component/workflow.rst
Normal file
@@ -0,0 +1,279 @@
|
||||
.. _workflow:
|
||||
|
||||
=================================
|
||||
Workflow: Workflow Management
|
||||
=================================
|
||||
.. currentmodule:: qlib
|
||||
|
||||
Introduction
|
||||
===================
|
||||
|
||||
The components in `Qlib Framework <../introduction/introduction.html#framework>`_ are designed in a loosely-coupled way. Users could build their own Quant research workflow with these components like `Example <https://github.com/microsoft/qlib/blob/main/examples/workflow_by_code.py>`_.
|
||||
|
||||
|
||||
Besides, ``Qlib`` provides a more user-friendly interface named ``qrun`` to automatically run the whole workflow defined by a configuration. A concrete execution of the whole workflow is called an `experiment`.
|
||||
With ``qrun``, users can easily run an `experiment`, which includes the following steps:
|
||||
|
||||
- Data
|
||||
- Loading
|
||||
- Processing
|
||||
- Slicing
|
||||
- Model
|
||||
- Training and inference (static or rolling)
|
||||
- Saving & loading
|
||||
- Evaluation
|
||||
- Backtest
|
||||
|
||||
For each `experiment`, ``Qlib`` has a complete system to track all the information as well as the artifacts generated during the training, inference and evaluation phases. For more information about how Qlib handles `experiment`, please refer to the related document: `Recorder: Experiment Management <../component/recorder.html>`_.
|
||||
|
||||
Complete Example
|
||||
===================
|
||||
|
||||
Before getting into details, here is a complete example of ``qrun``, which defines the workflow in typical Quant research.
|
||||
Below is a typical config file of ``qrun``.
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
provider_uri: "~/.qlib/qlib_data/cn_data"
|
||||
region: cn
|
||||
market: &market csi300
|
||||
benchmark: &benchmark SH000300
|
||||
data_handler_config: &data_handler_config
|
||||
start_time: 2008-01-01
|
||||
end_time: 2020-08-01
|
||||
fit_start_time: 2008-01-01
|
||||
fit_end_time: 2014-12-31
|
||||
instruments: *market
|
||||
port_analysis_config: &port_analysis_config
|
||||
strategy:
|
||||
class: TopkDropoutStrategy
|
||||
module_path: qlib.contrib.strategy.strategy
|
||||
kwargs:
|
||||
topk: 50
|
||||
n_drop: 5
|
||||
backtest:
|
||||
verbose: False
|
||||
limit_threshold: 0.095
|
||||
account: 100000000
|
||||
benchmark: *benchmark
|
||||
deal_price: close
|
||||
open_cost: 0.0005
|
||||
close_cost: 0.0015
|
||||
min_cost: 5
|
||||
task:
|
||||
model:
|
||||
class: LGBModel
|
||||
module_path: qlib.contrib.model.gbdt
|
||||
kwargs:
|
||||
loss: mse
|
||||
colsample_bytree: 0.8879
|
||||
learning_rate: 0.0421
|
||||
subsample: 0.8789
|
||||
lambda_l1: 205.6999
|
||||
lambda_l2: 580.9768
|
||||
max_depth: 8
|
||||
num_leaves: 210
|
||||
num_threads: 20
|
||||
dataset:
|
||||
class: DatasetH
|
||||
module_path: qlib.data.dataset
|
||||
kwargs:
|
||||
handler:
|
||||
class: Alpha158
|
||||
module_path: qlib.contrib.data.handler
|
||||
kwargs: *data_handler_config
|
||||
segments:
|
||||
train: [2008-01-01, 2014-12-31]
|
||||
valid: [2015-01-01, 2016-12-31]
|
||||
test: [2017-01-01, 2020-08-01]
|
||||
record:
|
||||
- class: SignalRecord
|
||||
module_path: qlib.workflow.record_temp
|
||||
kwargs: {}
|
||||
- class: PortAnaRecord
|
||||
module_path: qlib.workflow.record_temp
|
||||
kwargs:
|
||||
config: *port_analysis_config
|
||||
|
||||
After saving the config into `configuration.yaml`, users could start the workflow and test their ideas with a single command below.
|
||||
|
||||
.. code-block:: bash
|
||||
|
||||
qrun -c configuration.yaml
|
||||
|
||||
.. note::
|
||||
|
||||
`qrun` will be placed in your $PATH directory when installing ``Qlib``.
|
||||
|
||||
|
||||
Configuration File
|
||||
===================
|
||||
|
||||
Let's get into details of ``qrun`` in this section.
|
||||
|
||||
Before using ``qrun``, users need to prepare a configuration file. The following content shows how to prepare each part of the configuration file.
|
||||
|
||||
Qlib Data Section
|
||||
--------------------
|
||||
|
||||
At first, the configuration file needs to contain several basic parameters about the data, which will be used for qlib initialization, data handling and backtest.
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
provider_uri: "~/.qlib/qlib_data/cn_data"
|
||||
region: cn
|
||||
market: &market csi300
|
||||
benchmark: &benchmark SH000300
|
||||
|
||||
The meaning of each field is as follows:
|
||||
|
||||
- `provider_uri`
|
||||
Type: str. The URI of the Qlib data. For example, it could be the location where the data loaded by ``get_data.py`` are stored.
|
||||
|
||||
- `region`
|
||||
- If `region` == "us", ``Qlib`` will be initialized in US-stock mode.
|
||||
- If `region` == "cn", ``Qlib`` will be initialized in China-stock mode.
|
||||
|
||||
.. note::
|
||||
|
||||
The value of `region` should be aligned with the data stored in `provider_uri`.
|
||||
|
||||
- `market`
|
||||
Type: str. Index name, the default value is `csi500`.
|
||||
|
||||
- `benchmark`
|
||||
Type: str, list or pandas.Series. Stock index symbol, the default value is `SH000905`.
|
||||
|
||||
.. note::
|
||||
|
||||
* If `benchmark` is str, it will use the daily change as the 'bench'.
|
||||
|
||||
* If `benchmark` is list, it will use the daily average change of the stock pool in the list as the 'bench'.
|
||||
|
||||
* If `benchmark` is a pandas.Series whose `index` is the trading date and whose value at T is the change from T-1 to T, it will be directly used as the 'bench'. An example is as follows:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from qlib.data import D

# Daily change computed from close prices; the printed output is shown as comments.
print(D.features(D.instruments('csi500'), ['$close/Ref($close, 1)-1'])['$close/Ref($close, 1)-1'].head())
# 2017-01-04    0.011693
# 2017-01-05    0.000721
# 2017-01-06   -0.004322
# 2017-01-09    0.006874
# 2017-01-10   -0.003350
|
||||
.. note::
|
||||
|
||||
The symbol `&` in a `yaml` file stands for an anchor of a field, which is useful when other fields include this parameter as part of their value. Taking the configuration file above as an example, users can directly change the value of `market` and `benchmark` without traversing the entire configuration file.
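The daily-change series described for `benchmark` in this section (the value at T is the change from T-1 to T, matching the expression `$close/Ref($close, 1)-1`) can be reproduced without ``Qlib``. The sketch below uses made-up closing prices, and `daily_change` is a hypothetical helper, not a ``Qlib`` function.

```python
def daily_change(closes):
    """Return {date: close_T / close_{T-1} - 1} for every date after the first."""
    dates = sorted(closes)
    return {
        today: closes[today] / closes[yesterday] - 1
        for yesterday, today in zip(dates, dates[1:])
    }


# Made-up closing prices, keyed by ISO date so string sorting is chronological.
closes = {"2017-01-03": 100.0, "2017-01-04": 102.0, "2017-01-05": 99.96}
bench = daily_change(closes)

for day, change in bench.items():
    print(day, round(change, 6))
```

The first date has no previous close, so it does not appear in the result; a series built this way is one element shorter than the price series.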
|
||||
|
||||
Model Section
|
||||
--------------------
|
||||
|
||||
In the `task` field, the `model` section describes the parameters of the model to be used for training and inference. For more information about the base ``Model`` class, please refer to `Qlib Model <../component/model.html>`_.
|
||||
|
||||
.. code-block:: YAML
|
||||
|
||||
model:
|
||||
class: LGBModel
|
||||
module_path: qlib.contrib.model.gbdt
|
||||
kwargs:
|
||||
loss: mse
|
||||
colsample_bytree: 0.8879
|
||||
learning_rate: 0.0421
|
||||
subsample: 0.8789
|
||||
lambda_l1: 205.6999
|
||||
lambda_l2: 580.9768
|
||||
max_depth: 8
|
||||
num_leaves: 210
|
||||
num_threads: 20
|
||||
|
||||
The meaning of each field is as follows:
|
||||
|
||||
- `class`
|
||||
Type: str. The name for the model class.
|
||||
|
||||
- `module_path`
|
||||
Type: str. The path for the model in qlib.
|
||||
|
||||
- `kwargs`
|
||||
The keyword arguments for the model. Please refer to the specific model implementation for more information: `models <https://github.com/microsoft/qlib/blob/main/qlib/contrib/model>`_.
|
||||
|
||||
.. note::
|
||||
|
||||
``Qlib`` provides a util named ``init_instance_by_config`` to initialize any class inside ``Qlib`` from a configuration that includes the fields `class`, `module_path` and `kwargs`.
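The idea behind building an object from `class`, `module_path` and `kwargs` can be sketched with the standard library alone. `build_from_config` below is a hypothetical re-implementation for illustration, not the ``init_instance_by_config`` util itself, and a standard-library class stands in for a model so that the example runs without ``Qlib``.

```python
import importlib


def build_from_config(config):
    """Import `module_path`, look up `class`, and call it with `kwargs`."""
    module = importlib.import_module(config["module_path"])
    cls = getattr(module, config["class"])
    return cls(**config.get("kwargs", {}))


# A standard-library class stands in for a model so the sketch runs anywhere.
config = {
    "class": "Fraction",
    "module_path": "fractions",
    "kwargs": {"numerator": 3, "denominator": 4},
}
obj = build_from_config(config)
print(obj)  # 3/4
```

Because the class is resolved at run time, swapping one model for another only requires editing the configuration, not the code that consumes it.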
|
||||
|
||||
Dataset Section
|
||||
--------------------
|
||||
|
||||
The `dataset` field describes the parameters for the ``Dataset`` module in ``Qlib`` as well as those for the ``DataHandler`` module. For more information about the ``Dataset`` module, please refer to `Qlib Data <../component/data.html#dataset>`_.
|
||||
|
||||
The keyword arguments configuration of the ``DataHandler`` is as follows:
|
||||
|
||||
.. code-block:: YAML

    data_handler_config: &data_handler_config
        start_time: 2008-01-01
        end_time: 2020-08-01
        fit_start_time: 2008-01-01
        fit_end_time: 2014-12-31
        instruments: *market
|
||||
|
||||
Users can refer to the document of `DataHandler <../component/data.html#datahandler>`_ for more information about the meaning of each field in the configuration.
|
||||
|
||||
Here is the configuration for the ``Dataset`` module, which takes care of data preprocessing and slicing during the training and testing phases.
|
||||
|
||||
.. code-block:: YAML

    dataset:
        class: DatasetH
        module_path: qlib.data.dataset
        kwargs:
            handler:
                class: Alpha158
                module_path: qlib.contrib.data.handler
                kwargs: *data_handler_config
            segments:
                train: [2008-01-01, 2014-12-31]
                valid: [2015-01-01, 2016-12-31]
                test: [2017-01-01, 2020-08-01]
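To make the role of `segments` concrete, the following sketch maps a calendar date to the segment that contains it. It is illustrative only: ``segment_of`` is a hypothetical function, and treating both boundaries of each range as inclusive is an assumption rather than documented ``Qlib`` behaviour.

```python
from datetime import date

# Segment boundaries copied from the configuration above
# (both ends are assumed to be inclusive in this sketch).
segments = {
    "train": (date(2008, 1, 1), date(2014, 12, 31)),
    "valid": (date(2015, 1, 1), date(2016, 12, 31)),
    "test": (date(2017, 1, 1), date(2020, 8, 1)),
}


def segment_of(day):
    """Return the name of the segment containing `day`, or None if no segment does."""
    for name, (start, end) in segments.items():
        if start <= day <= end:
            return name
    return None
```

Because the three ranges do not overlap, each date belongs to at most one segment, which keeps training, validation and test data separate.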
|
||||
|
||||
Record Section
|
||||
--------------------
|
||||
|
||||
The `record` field describes the parameters of the ``Record`` module in ``Qlib``. ``Record`` is responsible for generating analysis and evaluation results such as `prediction`, `information coefficient (IC)` and `backtest`.
|
||||
|
||||
The following script is the configuration of `backtest` and the `strategy` used in `backtest`:
|
||||
|
||||
.. code-block:: YAML

    port_analysis_config: &port_analysis_config
        strategy:
            class: TopkDropoutStrategy
            module_path: qlib.contrib.strategy.strategy
            kwargs:
                topk: 50
                n_drop: 5
        backtest:
            verbose: False
            limit_threshold: 0.095
            account: 100000000
            benchmark: *benchmark
            deal_price: close
            open_cost: 0.0005
            close_cost: 0.0015
            min_cost: 5
|
||||
|
||||
For more information about the meaning of each field in configuration of `strategy` and `backtest`, users can look up the documents: `Strategy <../component/strategy.html>`_ and `Backtest <../component/backtest.html>`_.
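As a worked illustration of the cost fields, the sketch below assumes that a fee is charged as a ratio of the traded value, with `min_cost` acting as a floor per trade. That fee model and the ``trade_cost`` helper are assumptions made for this example; the `Backtest` document remains the reference for the actual behaviour.

```python
def trade_cost(trade_value, cost_ratio, min_cost=5.0):
    """Assumed fee model: proportional cost with a minimum charge per trade."""
    return max(trade_value * cost_ratio, min_cost)


OPEN_COST = 0.0005   # `open_cost` from the configuration above
CLOSE_COST = 0.0015  # `close_cost` from the configuration above

# Buying 1,000,000 of stock costs about 500; selling it costs about 1500.
buy_fee = trade_cost(1_000_000, OPEN_COST)
sell_fee = trade_cost(1_000_000, CLOSE_COST)

# For a small trade the proportional fee (about 1) is below the floor,
# so the minimum cost of 5 applies instead.
small_fee = trade_cost(2_000, OPEN_COST)
```

Under this assumption, selling is three times as expensive as buying for the same traded value, and very small trades are dominated by the minimum charge.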
|
||||
|
||||
Here are the configuration details of the different `Record Template` classes, such as ``SignalRecord`` and ``PortAnaRecord``:
|
||||
|
||||
.. code-block:: YAML

    record:
        - class: SignalRecord
          module_path: qlib.workflow.record_temp
          kwargs: {}
        - class: PortAnaRecord
          module_path: qlib.workflow.record_temp
          kwargs:
              config: *port_analysis_config
|
||||
|
||||
For more information about the ``Record`` module in ``Qlib``, users can refer to the related document: `Record <../component/recorder.html#record-template>`_.
|
||||
@@ -35,11 +35,12 @@ Document Structure
|
||||
:maxdepth: 3
|
||||
:caption: COMPONENTS:
|
||||
|
||||
Estimator: Workflow Management <component/estimator.rst>
|
||||
Workflow: Workflow Management <component/workflow.rst>
|
||||
Data Layer: Data Framework&Usage <component/data.rst>
|
||||
Interday Model: Model Training & Prediction <component/model.rst>
|
||||
Interday Strategy: Portfolio Management <component/strategy.rst>
|
||||
Intraday Trading: Model&Strategy Testing <component/backtest.rst>
|
||||
Qlib Recorder: Experiment Management <component/recorder.rst>
|
||||
Aanalysis: Evaluation & Results Analysis <component/report.rst>
|
||||
|
||||
.. toctree::
|
||||
@@ -48,6 +49,7 @@ Document Structure
|
||||
|
||||
Building Formulaic Alphas <advanced/alpha.rst>
|
||||
Online & Offline mode <advanced/server.rst>
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
:caption: REFERENCE:
|
||||
|
||||
@@ -49,18 +49,19 @@ To kown more about `prepare data`, please refer to `Data Preparation <../compone
|
||||
Auto Quant Research Workflow
|
||||
====================================
|
||||
|
||||
``Qlib`` provides a tool named ``Estimator`` to run the whole workflow automatically (including building dataset, training models, backtest and evaluation). Users can start an auto quant research workflow and have a graphical reports analysis according to the following steps:
|
||||
``Qlib`` provides a tool named ``qrun`` to run the whole workflow automatically (including building dataset, training models, backtest and evaluation). Users can start an auto quant research workflow and have a graphical reports analysis according to the following steps:
|
||||
|
||||
- Quant Research Workflow:
|
||||
- Run ``Estimator`` with `estimator_config.yaml` as following.
|
||||
- Run ``qrun`` with a config file of the LightGBM model `workflow_config_lightgbm.yaml` as following.
|
||||
|
||||
.. code-block::
|
||||
|
||||
cd examples # Avoid running program under the directory contains `qlib`
|
||||
estimator -c estimator/estimator_config.yaml
|
||||
qrun benchmarks/LightGBM/workflow_config_lightgbm.yaml
|
||||
|
||||
|
||||
- Estimator result
|
||||
The result of ``Estimator`` is as follows, which is also the result of ``Intraday Trading``. Please refer to `Intraday Trading <../component/backtest.html>`_. for more details about the result.
|
||||
- Workflow result
|
||||
The result of ``qrun`` is as follows, which is also the result of ``Intraday Trading``. Please refer to `Intraday Trading <../component/backtest.html>`_ for more details about the result.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
@@ -77,11 +78,11 @@ Auto Quant Research Workflow
|
||||
max_drawdown -0.075024
|
||||
|
||||
|
||||
To know more about `Estimator`, please refer to `Estimator: Workflow Management <../component/estimator.html>`_.
|
||||
To know more about `workflow` and `qrun`, please refer to `Workflow: Workflow Management <../component/workflow.html>`_.
|
||||
|
||||
- Graphical Reports Analysis:
|
||||
- Run ``examples/estimator/analyze_from_estimator.ipynb`` with jupyter notebook
|
||||
Users can have portfolio analysis or prediction score (model prediction) analysis by run ``examples/estimator/analyze_from_estimator.ipynb``.
|
||||
- Run ``examples/workflow_by_code.ipynb`` with jupyter notebook
|
||||
Users can have portfolio analysis or prediction score (model prediction) analysis by run ``examples/workflow_by_code.ipynb``.
|
||||
- Graphical Reports
|
||||
Users can get graphical reports about the analysis, please refer to `Aanalysis: Evaluation & Results Analysis <../component/report.html>`_ for more details.
|
||||
|
||||
@@ -90,4 +91,4 @@ Auto Quant Research Workflow
|
||||
Custom Model Integration
|
||||
===============================================
|
||||
|
||||
``Qlib`` provides ``lightGBM`` and ``Dnn`` model as the baseline of ``Interday Model``. In addition to the default model, users can integrate their own custom models into ``Qlib``. If users are interested in the custom model, please refer to `Custom Model Integration <../start/integration.html>`_.
|
||||
``Qlib`` provides several models such as ``lightGBM`` and ``DNN`` model as the baseline of ``Interday Model``. In addition to the default model, users can integrate their own custom models into ``Qlib``. If users are interested in the custom model, please refer to `Custom Model Integration <../start/integration.html>`_.
|
||||
|
||||
@@ -116,3 +116,26 @@ Report
|
||||
:members:
|
||||
|
||||
|
||||
Workflow
|
||||
====================
|
||||
|
||||
|
||||
Experiment Manager
|
||||
--------------------
|
||||
.. autoclass:: qlib.workflow.expm.ExpManager
|
||||
:members:
|
||||
|
||||
Experiment
|
||||
--------------------
|
||||
.. autoclass:: qlib.workflow.exp.Experiment
|
||||
:members:
|
||||
|
||||
Recorder
|
||||
--------------------
|
||||
.. autoclass:: qlib.workflow.recorder.Recorder
|
||||
:members:
|
||||
|
||||
Record Template
|
||||
--------------------
|
||||
.. automodule:: qlib.workflow.record_temp
|
||||
:members:
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _getdata:
|
||||
|
||||
=============================
|
||||
Data Retrieval
|
||||
=============================
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _initialization:
|
||||
|
||||
====================
|
||||
Qlib Initialization
|
||||
====================
|
||||
@@ -59,7 +60,7 @@ Besides `provider_uri` and `region`, `qlib.init` has other parameters. The follo
|
||||
|
||||
If Qlib fails to connect redis via `redis_host` and `redis_port`, cache mechanism will not be used! Please refer to `Cache <../component/data.html#cache>`_ for details.
|
||||
- `exp_manager`
|
||||
Type: dict, optional parameter, the setting of experiment manager to be used in qlib. Users can specify an experiment manager class, as well as the tracking URI for all the experiments. However, please be aware that we only support input of a dictionary in the following style for `exp_manager`.
|
||||
Type: dict, optional parameter, the setting of `experiment manager` to be used in qlib. Users can specify an experiment manager class, as well as the tracking URI for all the experiments. However, please be aware that we only support input of a dictionary in the following style for `exp_manager`. For more information about `exp_manager`, users can refer to `Recorder: Experiment Management <../component/recorder.html>`_.
|
||||
::
|
||||
|
||||
{
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
.. _installation:
|
||||
|
||||
====================
|
||||
Installation
|
||||
====================
|
||||
|
||||
@@ -65,10 +65,14 @@ def get_strategy(
|
||||
topk : int (Default value: 50)
|
||||
top-N stocks to buy.
|
||||
margin : int or float(Default value: 0.5)
|
||||
if isinstance(margin, int):
|
||||
- if isinstance(margin, int):
|
||||
|
||||
sell_limit = margin
|
||||
else:
|
||||
|
||||
- else:
|
||||
|
||||
sell_limit = pred_in_a_day.count() * margin
|
||||
|
||||
buffer margin, in single score_mode, continue holding stock if it is in nlargest(sell_limit)
|
||||
sell_limit should be no less than topk
|
||||
n_drop : int
|
||||
@@ -204,10 +208,14 @@ def backtest(pred, account=1e9, shift=1, benchmark="SH000905", verbose=True, **k
|
||||
topk : int (Default value: 50)
|
||||
top-N stocks to buy.
|
||||
margin : int or float(Default value: 0.5)
|
||||
if isinstance(margin, int):
|
||||
- if isinstance(margin, int):
|
||||
|
||||
sell_limit = margin
|
||||
else:
|
||||
|
||||
- else:
|
||||
|
||||
sell_limit = pred_in_a_day.count() * margin
|
||||
|
||||
buffer margin, in single score_mode, continue holding stock if it is in nlargest(sell_limit)
|
||||
sell_limit should be no less than topk
|
||||
n_drop : int
|
||||
|
||||
@@ -16,7 +16,7 @@ class LGBModel(ModelFT):
|
||||
def __init__(self, loss="mse", **kwargs):
|
||||
if loss not in {"mse", "binary"}:
|
||||
raise NotImplementedError
|
||||
self.params = {"objective": loss, 'verbosity': -1}
|
||||
self.params = {"objective": loss, "verbosity": -1}
|
||||
self.params.update(kwargs)
|
||||
self.model = None
|
||||
|
||||
|
||||
@@ -137,7 +137,9 @@ class WeightStrategyBase(BaseStrategy, AdjustTimer):
|
||||
self.order_generator = order_generator_cls_or_obj
|
||||
|
||||
def generate_target_weight_position(self, score, current, trade_date):
|
||||
"""Parameter:
|
||||
"""
|
||||
Parameters:
|
||||
---------
|
||||
score : pred score for this trade date, pd.Series, index is stock_id, contain 'score' column
|
||||
current : current position, use Position() class
|
||||
trade_exchange : Exchange()
|
||||
@@ -148,7 +150,9 @@ class WeightStrategyBase(BaseStrategy, AdjustTimer):
|
||||
raise NotImplementedError()
|
||||
|
||||
def generate_order_list(self, score_series, current, trade_exchange, pred_date, trade_date):
|
||||
"""Parameter
|
||||
"""
|
||||
Parameters:
|
||||
----------
|
||||
score_series : pd.Seires
|
||||
stock_id , score
|
||||
current : Position()
|
||||
@@ -181,7 +185,9 @@ class WeightStrategyBase(BaseStrategy, AdjustTimer):
|
||||
|
||||
class TopkDropoutStrategy(BaseStrategy, ListAdjustTimer):
|
||||
def __init__(self, topk, n_drop, method="bottom", risk_degree=0.95, thresh=1, hold_thresh=1, **kwargs):
|
||||
"""Parameter
|
||||
"""
|
||||
Parameters:
|
||||
-----------
|
||||
topk : int
|
||||
The number of stocks in the portfolio
|
||||
n_drop : int
|
||||
@@ -218,19 +224,21 @@ class TopkDropoutStrategy(BaseStrategy, ListAdjustTimer):
|
||||
return self.risk_degree
|
||||
|
||||
def generate_order_list(self, score_series, current, trade_exchange, pred_date, trade_date):
|
||||
"""Gnererate order list according to score_series at trade_date.
|
||||
will not change current.
|
||||
Parameter
|
||||
score_series : pd.Seires
|
||||
stock_id , score
|
||||
current : Position()
|
||||
current of account
|
||||
trade_exchange : Exchange()
|
||||
exchange
|
||||
pred_date : pd.Timestamp
|
||||
predict date
|
||||
trade_date : pd.Timestamp
|
||||
trade date
|
||||
"""
|
||||
Generate order list according to score_series at trade_date, will not change current.
|
||||
|
||||
Parameters:
|
||||
----------
|
||||
score_series : pd.Series
|
||||
stock_id , score
|
||||
current : Position()
|
||||
current of account
|
||||
trade_exchange : Exchange()
|
||||
exchange
|
||||
pred_date : pd.Timestamp
|
||||
predict date
|
||||
trade_date : pd.Timestamp
|
||||
trade date
|
||||
"""
|
||||
if not self.is_adjust(trade_date):
|
||||
return []
|
||||
|
||||
@@ -748,7 +748,8 @@ class DiskDatasetCache(DatasetCache):
|
||||
|
||||
The format the cache contains 3 parts(followed by typical filename).
|
||||
|
||||
- index : cache/d41366901e25de3ec47297f12e2ba11d.index
|
||||
- index : cache/d41366901e25de3ec47297f12e2ba11d.index
|
||||
|
||||
- The content of the file may be in following format(pandas.Series)
|
||||
|
||||
.. code-block:: python
|
||||
@@ -765,7 +766,9 @@ class DiskDatasetCache(DatasetCache):
|
||||
- It indicates the `end_index` of the data for `timestamp`
|
||||
|
||||
- meta data: cache/d41366901e25de3ec47297f12e2ba11d.meta
|
||||
|
||||
- data : cache/d41366901e25de3ec47297f12e2ba11d
|
||||
|
||||
- This is a hdf file sorted by datetime
|
||||
|
||||
:param cache_path: The path to store the cache
|
||||
|
||||
@@ -152,16 +152,19 @@ class InstrumentProvider(abc.ABC):
|
||||
{`market`=>base market name, `filter_pipe`=>list of filters}
|
||||
|
||||
example :
|
||||
{'market': 'csi500',
|
||||
'filter_pipe': [{'filter_type': 'ExpressionDFilter',
|
||||
'rule_expression': '$open<40',
|
||||
'filter_start_time': None,
|
||||
'filter_end_time': None,
|
||||
'keep': False},
|
||||
{'filter_type': 'NameDFilter',
|
||||
'name_rule_re': 'SH[0-9]{4}55',
|
||||
'filter_start_time': None,
|
||||
'filter_end_time': None}]}
|
||||
|
||||
.. code-block::
|
||||
|
||||
{'market': 'csi500',
|
||||
'filter_pipe': [{'filter_type': 'ExpressionDFilter',
|
||||
'rule_expression': '$open<40',
|
||||
'filter_start_time': None,
|
||||
'filter_end_time': None,
|
||||
'keep': False},
|
||||
{'filter_type': 'NameDFilter',
|
||||
'name_rule_re': 'SH[0-9]{4}55',
|
||||
'filter_start_time': None,
|
||||
'filter_end_time': None}]}
|
||||
"""
|
||||
if filter_pipe is None:
|
||||
filter_pipe = []
|
||||
@@ -956,6 +959,8 @@ class BaseProvider:
|
||||
disk_cache=None,
|
||||
):
|
||||
"""
|
||||
Parameters:
|
||||
-----------
|
||||
disk_cache : int
|
||||
whether to skip(0)/use(1)/replace(2) disk_cache
|
||||
|
||||
|
||||
@@ -40,12 +40,15 @@ class DataHandler(Serializable):
|
||||
|
||||
Example of the data:
|
||||
The multi-index of the columns is optional.
|
||||
feature label
|
||||
$close $volume Ref($close, 1) Mean($close, 3) $high-$low LABEL0
|
||||
datetime instrument
|
||||
2010-01-04 SH600000 81.807068 17145150.0 83.737389 83.016739 2.741058 0.0032
|
||||
SH600004 13.313329 11800983.0 13.313329 13.317701 0.183632 0.0042
|
||||
SH600005 37.796539 12231662.0 38.258602 37.919757 0.970325 0.0289
|
||||
|
||||
.. code-block::
|
||||
|
||||
feature label
|
||||
$close $volume Ref($close, 1) Mean($close, 3) $high-$low LABEL0
|
||||
datetime instrument
|
||||
2010-01-04 SH600000 81.807068 17145150.0 83.737389 83.016739 2.741058 0.0032
|
||||
SH600004 13.313329 11800983.0 13.313329 13.317701 0.183632 0.0042
|
||||
SH600005 37.796539 12231662.0 38.258602 37.919757 0.970325 0.0289
|
||||
|
||||
"""
|
||||
|
||||
@@ -107,7 +110,8 @@ class DataHandler(Serializable):
|
||||
----------
|
||||
enable_cache : bool
|
||||
default value is false
|
||||
if `enable_cache` == True
|
||||
- if `enable_cache` == True:
|
||||
|
||||
the processed data will be saved on disk, and handler will load the cached data from the disk directly
|
||||
when we call `init` next time
|
||||
"""
|
||||
@@ -145,16 +149,21 @@ class DataHandler(Serializable):
|
||||
level : Union[str, int]
|
||||
which index level to select the data
|
||||
col_set : Union[str, List[str]]
|
||||
if isinstance(col_set, str):
|
||||
|
||||
- if isinstance(col_set, str):
|
||||
|
||||
select a set of meaningful columns.(e.g. features, columns)
|
||||
if isinstance(col_set, List[str]):
|
||||
|
||||
- if isinstance(col_set, List[str]):
|
||||
|
||||
select several sets of meaningful columns, the returned data has multiple levels
|
||||
|
||||
squeeze : bool
|
||||
whether squeeze columns and index
|
||||
|
||||
Returns
|
||||
-------
|
||||
pd.DataFrame:
|
||||
pd.DataFrame.
|
||||
"""
|
||||
# Fetch column first will be more friendly to SepDataFrame
|
||||
df = self._fetch_df_by_col(self._data, col_set)
|
||||
|
||||
@@ -161,7 +161,7 @@ class StaticDataLoader(DataLoader):
|
||||
DataLoader that supports loading data from file or as provided.
|
||||
"""
|
||||
|
||||
def __init__(self, config: dict, join='outer'):
|
||||
def __init__(self, config: dict, join="outer"):
|
||||
"""
|
||||
Parameters
|
||||
----------
|
||||
@@ -187,8 +187,9 @@ class StaticDataLoader(DataLoader):
|
||||
def _maybe_load_raw_data(self):
|
||||
if self._data is not None:
|
||||
return
|
||||
self._data = pd.concat({
|
||||
fields_group: load_dataset(path_or_obj)
|
||||
for fields_group, path_or_obj in self.config.items()
|
||||
}, axis=1, join=self.join)
|
||||
self._data = pd.concat(
|
||||
{fields_group: load_dataset(path_or_obj) for fields_group, path_or_obj in self.config.items()},
|
||||
axis=1,
|
||||
join=self.join,
|
||||
)
|
||||
self._data.sort_index(inplace=True)
|
||||
|
||||
@@ -25,8 +25,10 @@ class Model(BaseModel):
|
||||
"""
|
||||
Learn model from the base model
|
||||
|
||||
** NOTE **: The the attribute names of learned model should **not** start with '_'. So that the model could be
|
||||
dumped to disk.
|
||||
.. note::
|
||||
|
||||
The attribute names of learned model should `not` start with '_'. So that the model could be
|
||||
dumped to disk.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
|
||||
@@ -702,7 +702,7 @@ def load_dataset(path_or_obj):
|
||||
if isinstance(path_or_obj, pd.DataFrame):
|
||||
return path_or_obj
|
||||
if not os.path.exists(path_or_obj):
|
||||
raise ValueError(f'file {path_or_obj} doesn\'t exist')
|
||||
raise ValueError(f"file {path_or_obj} doesn't exist")
|
||||
_, extension = os.path.splitext(path_or_obj)
|
||||
if extension == ".h5":
|
||||
return pd.read_hdf(path_or_obj)
|
||||
|
||||
@@ -162,6 +162,10 @@ class QlibRecorder:
|
||||
"""
|
||||
Method for listing all the recorders of experiment with given id or name.
|
||||
|
||||
If user doesn't provide the id or name of the experiment, this method will try to retrieve the default experiment and
|
||||
list all the recorders of the default experiment. If the default experiment doesn't exist, the method will first
|
||||
create the default experiment, and then create a new recorder under it.
|
||||
|
||||
Use case:
|
||||
---------
|
||||
```
|
||||
@@ -382,7 +386,7 @@ class QlibRecorder:
|
||||
----------
|
||||
local_path : str
|
||||
if provided, then save the file or directory to the artifact URI.
|
||||
artifact_path=None : str
|
||||
artifact_path : str
|
||||
the relative path for the artifact to be stored in the URI.
|
||||
"""
|
||||
self.get_exp().get_recorder().save_objects(local_path, artifact_path, **kwargs)
|
||||
|
||||
@@ -12,7 +12,7 @@ logger = get_module_logger("workflow", "INFO")
|
||||
|
||||
class Experiment:
|
||||
"""
|
||||
Thie is the `Experiment` class for each experiment being run. The API is designed similar to mlflow.
|
||||
This is the `Experiment` class for each experiment being run. The API is designed similar to mlflow.
|
||||
(The link: https://mlflow.org/docs/latest/python_api/mlflow.html)
|
||||
"""
|
||||
|
||||
@@ -111,24 +111,29 @@ class Experiment:
|
||||
active recorder. The `create` argument determines whether the method will automatically create a new recorder
|
||||
according to user's specification if the recorder hasn't been created before
|
||||
|
||||
If `create` is True:
|
||||
If R's running:
|
||||
1) no id or name specified, return the active recorder.
|
||||
2) if id or name is specified, return the specified recorder. If no such exp found,
|
||||
create a new recorder with given id or name, and the recorder shoud be running.
|
||||
If R's not running:
|
||||
1) no id or name specified, create a new recorder.
|
||||
2) if id or name is specified, return the specified experiment. If no such exp found,
|
||||
create a new recorder with given id or name, and the recorder shoud be running.
|
||||
Else If `create` is False:
|
||||
If R's running:
|
||||
1) no id or name specified, return the active recorder.
|
||||
2) if id or name is specified, return the specified recorder. If no such exp found,
|
||||
raise Error.
|
||||
If R's not running:
|
||||
1) no id or name specified, raise Error.
|
||||
2) if id or name is specified, return the specified recorder. If no such exp found,
|
||||
raise Error.
|
||||
* If `create` is True:
|
||||
|
||||
* If R's running:
|
||||
|
||||
* no id or name specified, return the active recorder.
|
||||
* if id or name is specified, return the specified recorder. If no such exp found, create a new recorder with given id or name, and the recorder should be running.
|
||||
|
||||
* If R's not running:
|
||||
|
||||
* no id or name specified, create a new recorder.
|
||||
* if id or name is specified, return the specified experiment. If no such exp found, create a new recorder with given id or name, and the recorder should be running.
|
||||
|
||||
* Else If `create` is False:
|
||||
|
||||
* If R's running:
|
||||
|
||||
* no id or name specified, return the active recorder.
|
||||
* if id or name is specified, return the specified recorder. If no such exp found, raise Error.
|
||||
|
||||
* If R's not running:
|
||||
|
||||
* no id or name specified, raise Error.
|
||||
* if id or name is specified, return the specified recorder. If no such exp found, raise Error.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
@@ -147,7 +152,8 @@ class Experiment:
|
||||
|
||||
def list_recorders(self):
|
||||
"""
|
||||
List all the existing recorders of this experiment.
|
||||
List all the existing recorders of this experiment. Please first get the experiment instance before calling this method.
|
||||
If user want to use the method `R.list_recorders()`, please refer to the related API document in `QlibRecorder`.
|
||||
|
||||
Returns
|
||||
-------
|
||||
|
||||
@@ -94,26 +94,31 @@ class ExpManager:
|
||||
When user specify experiment id and name, the method will try to return the specific experiment.
|
||||
When user does not provide recorder id or name, the method will try to return the current active experiment.
|
||||
The `create` argument determines whether the method will automatically create a new experiment according
|
||||
to user's specification if the experiment hasn't been created before
|
||||
to user's specification if the experiment hasn't been created before.
|
||||
|
||||
If `create` is True:
|
||||
If R's running:
|
||||
1) no id or name specified, return the active experiment.
|
||||
2) if id or name is specified, return the specified experiment. If no such exp found,
|
||||
create a new experiment with given id or name, and the experiment is set to be running.
|
||||
If R's not running:
|
||||
1) no id or name specified, create a default experiment.
|
||||
2) if id or name is specified, return the specified experiment. If no such exp found,
|
||||
create a new experiment with given id or name, and the experiment is set to be running.
|
||||
Else If `create` is False:
|
||||
If R's running:
|
||||
1) no id or name specified, return the active experiment.
|
||||
2) if id or name is specified, return the specified experiment. If no such exp found,
|
||||
raise Error.
|
||||
If R's not running:
|
||||
1) no id or name specified. If the default experiment exists, return it, otherwise, raise Error.
|
||||
2) if id or name is specified, return the specified experiment. If no such exp found,
|
||||
raise Error.
|
||||
* If `create` is True:
|
||||
|
||||
* If R's running:
|
||||
|
||||
* no id or name specified, return the active experiment.
|
||||
* if id or name is specified, return the specified experiment. If no such exp found, create a new experiment with given id or name, and the experiment is set to be running.
|
||||
|
||||
* If R's not running:
|
||||
|
||||
* no id or name specified, create a default experiment.
|
||||
* if id or name is specified, return the specified experiment. If no such exp found, create a new experiment with given id or name, and the experiment is set to be running.
|
||||
|
||||
* Else If `create` is False:
|
||||
|
||||
* If R's running:
|
||||
|
||||
* no id or name specified, return the active experiment.
|
||||
* if id or name is specified, return the specified experiment. If no such exp found, raise Error.
|
||||
|
||||
* If R's not running:
|
||||
|
||||
* no id or name specified. If the default experiment exists, return it, otherwise, raise Error.
|
||||
* if id or name is specified, return the specified experiment. If no such exp found, raise Error.
|
||||
|
||||
Parameters
|
||||
----------
|
||||
|
||||
@@ -56,7 +56,12 @@ class RecordTemp:
|
||||
|
||||
def load(self, name):
|
||||
"""
|
||||
Load the stored records.
|
||||
Load the stored records. Due to the fact that some problems occurred when we tried to balance a clean API
|
||||
with the Python's inheritance. This method has to be used in a rather ugly way, and we will try to fix them
|
||||
in the future::
|
||||
|
||||
sar = SigAnaRecord(recorder)
|
||||
ic = sar.load(sar.get_path("ic.pkl"))
|
||||
|
||||
Parameters
|
||||
----------
|
||||
@@ -102,7 +107,7 @@ class RecordTemp:
|
||||
|
||||
class SignalRecord(RecordTemp):
|
||||
"""
|
||||
This is the Signal Record class that generates the signal prediction.
|
||||
This is the Signal Record class that generates the signal prediction. This class inherits the ``RecordTemp`` class.
|
||||
"""
|
||||
|
||||
def __init__(self, model=None, dataset=None, recorder=None, **kwargs):
|
||||
@@ -145,6 +150,9 @@ class SignalRecord(RecordTemp):
|
||||
|
||||
|
||||
class SigAnaRecord(SignalRecord):
|
||||
"""
|
||||
This is the Signal Analysis Record class that generates the analysis results such as IC and IR. This class inherits the ``RecordTemp`` class.
|
||||
"""
|
||||
|
||||
artifact_path = "sig_analysis"
|
||||
|
||||
@@ -196,7 +204,7 @@ class SigAnaRecord(SignalRecord):
|
||||
|
||||
class PortAnaRecord(SignalRecord):
|
||||
"""
|
||||
This is the Portfolio Analysis Record class that generates the results such as those of backtest.
|
||||
This is the Portfolio Analysis Record class that generates the analysis results such as those of backtest. This class inherits the ``RecordTemp`` class.
|
||||
"""
|
||||
|
||||
artifact_path = "portfolio_analysis"
|
||||
|
||||
@@ -22,4 +22,4 @@ scikit_learn==0.23.2
|
||||
torch==1.6.0
|
||||
tqdm==4.49.0
|
||||
yahooquery==2.2.7
|
||||
mlflow==1.11.0
|
||||
mlflow==1.12.1
|
||||