Update docs

2026-07-03 11:00:57 +08:00 · 2020-11-30 18:54:31 +08:00
parent 1877ad8c39
commit 29f12e857f
22 changed files with 180 additions and 159 deletions
--- a/docs/component/data.rst
+++ b/docs/component/data.rst
@@ -29,7 +29,18 @@ Qlib Format Data
 ------------------

 We've specially designed a data structure to manage financial data; please refer to the `File storage design section in Qlib paper <https://arxiv.org/abs/2009.11189>`_ for detailed information.
-Such data will be stored with filename suffix `.bin` (We'll call them `.bin` file, `.bin` format, or qlib format). `.bin` file is designed for scientific computing on finance data
+Such data is stored with the filename suffix `.bin` (we'll call them `.bin` files, the `.bin` format, or the qlib format). The `.bin` file is designed for scientific computing on financial data.
+
+``Qlib`` provides two different off-the-shelf datasets, which can be accessed through this `link <https://github.com/microsoft/qlib/blob/main/qlib/contrib/data/handler.py>`_:
+
+========================  =================  ================
+Dataset                   US Market          China Market
+========================  =================  ================
+Alpha360                  √                  √
+
+Alpha158                  √                  √
+========================  =================  ================
+

 Qlib Format Dataset
 --------------------
@@ -45,7 +56,7 @@ In addition to China-Stock data, ``Qlib`` also includes a US-Stock dataset, whic

    python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/us_data --region us

-After running the above command, users can find china-stock and us-stock data in Qlib format in the ``~/.qlib/csv_data/cn_data`` directory and ``~/.qlib/csv_data/us_data`` directory respectively.
+After running the above command, users can find china-stock and us-stock data in ``Qlib`` format in the ``~/.qlib/qlib_data/cn_data`` directory and ``~/.qlib/qlib_data/us_data`` directory respectively.

 ``Qlib`` also provides the scripts in ``scripts/data_collector`` to help users crawl the latest data on the Internet and convert it to qlib format.

@@ -54,8 +65,7 @@ When ``Qlib`` is initialized with this dataset, users could build and evaluate t
 Converting CSV Format into Qlib Format
 -------------------------------------------

-``Qlib`` has provided the script ``scripts/dump_bin.py`` to convert data in CSV format into `.bin` files (Qlib format).
-
+``Qlib`` provides the script ``scripts/dump_bin.py`` to convert **any** data in CSV format into `.bin` files (``Qlib`` format), as long as the data is in the correct format.

 Users can download the demo china-stock data in CSV format as follows for reference to the CSV format.

@@ -130,9 +140,21 @@ After conversion, users can find their Qlib format data in the directory `~/.qli

    In the convention of `Qlib` data processing, `open, close, high, low, volume, money and factor` will be set to NaN if the stock is suspended. 

-China-Stock Mode & US-Stock Mode
+Multiple Stock Modes
 --------------------------------

+``Qlib`` now provides two different stock modes for users: China-Stock Mode & US-Stock Mode. Here are some settings that differ between these two modes:
+
+==============  =================  ================
+Region          Trade Unit         Limit Threshold
+==============  =================  ================
+China           100                0.099
+
+US              1                  None
+==============  =================  ================
+
+The `trade unit` defines the unit number of shares that can be used in a trade, and the `limit threshold` defines the bound on the percentage by which a stock's price can rise or fall.
+
 - If users use ``Qlib`` in china-stock mode, china-stock data is required. Users can use ``Qlib`` in china-stock mode according to the following steps:
    - Download china-stock in qlib format, please refer to section `Qlib Format Dataset <#qlib-format-dataset>`_.
    - Initialize ``Qlib`` in china-stock mode
@@ -208,13 +230,19 @@ QlibDataLoader

 The ``QlibDataLoader`` class in ``Qlib`` is such an interface that allows users to load raw data from the ``Qlib`` data source.

+StaticDataLoader
+----------------
+
+The ``StaticDataLoader`` class in ``Qlib`` is an interface that allows users to load raw data from a file or from data provided directly.
+
+
 Interface
 ------------

 Here are some interfaces of the ``QlibDataLoader`` class:

-.. autoclass:: qlib.data.dataset.loader.QlibDataLoader
-    :members: load, load_group_df
+.. autoclass:: qlib.data.dataset.loader.DataLoader
+    :members:

 API
 -----------
--- a/docs/component/model.rst
+++ b/docs/component/model.rst
@@ -18,45 +18,10 @@ Base Class & Interface

 The base class provides the following interfaces:

- `__init__(**kwargs)`
-    - Initialization.
-
- `fit(self, dataset, **kwargs)`
-    - Train model.
-    - Parameter:
-        - `dataset`, ``Qlib``'s ``DatasetH`` type. For more information about ``DatasetH``, users can refer to the related document: `Qlib Dataset <../component/data.html#dataset>`_.
-            The `dataset` is passed into the `model`'s method because there are some unique data preprocessing procedures for each, we want to give each model maximum flexibility to handle the data that is suitable for their own.
-            The following code example shows how to retrieve `x_train`, `y_train` and `w_train` from the `dataset`:
-
-            .. code-block:: Python
-
-                # get features and labels
-                df_train, df_valid = dataset.prepare(
-                    ["train", "valid"], col_set=["feature", "label"], data_key=DataHandlerLP.DK_L
-                )
-                x_train, y_train = df_train["feature"], df_train["label"]
-                x_valid, y_valid = df_valid["feature"], df_valid["label"]
-
-                # get weights
-                try:
-                    wdf_train, wdf_valid = dataset.prepare(["train", "valid"], col_set=["weight"], data_key=DataHandlerLP.DK_L)
-                    w_train, w_valid = wdf_train["weight"], wdf_valid["weight"]
-                except KeyError as e:
-                    w_train = pd.DataFrame(np.ones_like(y_train.values), index=y_train.index)
-                    w_valid = pd.DataFrame(np.ones_like(y_valid.values), index=y_valid.index)
-        
- `predict(self, dataset, **kwargs)`
-    - Predict test data.
-    - Parameter:
-        - `dataset`, ``Qlib``'s ``DatasetH`` type. The usage is similar to the example above.
-    - Returns:
-        - Predic results with type: `pandas.Series`.
-
- `finetune(self, dataset, **kwargs)`
-    - Finetune the model.
-    - Parameter:
-        - `dataset`, ``Qlib``'s ``DatasetH`` type. The usage is similar to the example above.
+.. autoclass:: qlib.model.base.Model
+    :members:

+``Qlib`` also provides a base class `qlib.model.base.ModelFT <../reference/api.html#qlib.model.base.ModelFT>`_, which includes the method for finetuning the model.
    
 For other interfaces such as `finetune`, please refer to `Model API <../reference/api.html#module-qlib.model.base>`_.

--- a/docs/component/recorder.rst
+++ b/docs/component/recorder.rst
@@ -72,6 +72,8 @@ The ``Experiment`` class is solely responsible for a single experiment, and it w

 For other interfaces such as `search_records`, `delete_recorder`, please refer to `Experiment API <../reference/api.html#experiment>`_.

+``Qlib`` also provides a default ``Experiment``, which is created and used in certain situations when users call APIs such as `log_metrics` or `get_exp`. If the default ``Experiment`` is used, related information is logged when running ``Qlib``. The name of the default ``Experiment`` is set to '`Experiment`'; users can change it in the config file of ``Qlib`` or during ``Qlib``'s `initialization <../start/initialization.html#parameters>`_.
+
 Recorder
 ===================

--- a/docs/component/workflow.rst
+++ b/docs/component/workflow.rst
@@ -11,8 +11,8 @@ Introduction
 The components in `Qlib Framework <../introduction/introduction.html#framework>`_ are designed in a loosely-coupled way. Users could build their own Quant research workflow with these components like `Example <https://github.com/microsoft/qlib/blob/main/examples/workflow_by_code.py>`_.


-Besides, ``Qlib`` provides more user-friendly interfaces named ``qrun`` to automatically run the whole workflow defined by configuration.  A concrete execution of the whole workflow is called an `experiment`.
-With ``qrun``, user can easily run an `experiment`, which includes the following steps:
+In addition, ``Qlib`` provides a more user-friendly interface named ``qrun`` to automatically run the whole workflow defined by a configuration. Running the whole workflow is called an `execution`.
+With ``qrun``, users can easily start an `execution`, which includes the following steps:

 - Data
    - Loading
@@ -25,7 +25,7 @@ With ``qrun``, user can easily run an `experiment`, which includes the following
    - Forecast signal analysis
    - Backtest

-For each `experiment`, ``Qlib`` has a complete system to tracking all the information as well as artifacts generated during training, inference and evaluation phase. For more information about how Qlib handles `experiment`, please refer to the related document: `Recorder: Experiment Management <../component/recorder.html>`_.
+For each `execution`, ``Qlib`` has a complete system for tracking all the information and artifacts generated during the training, inference and evaluation phases. For more information about how ``Qlib`` handles this, please refer to the related document: `Recorder: Experiment Management <../component/recorder.html>`_.

 Complete Example
 ===================
@@ -35,8 +35,9 @@ Below is a typical config file of ``qrun``.

 .. code-block:: YAML

-    provider_uri: "~/.qlib/qlib_data/cn_data"
-    region: cn
+    qlib_init:
+        provider_uri: "~/.qlib/qlib_data/cn_data"
+        region: cn
    market: &market csi300
    benchmark: &benchmark SH000300
    data_handler_config: &data_handler_config
@@ -100,12 +101,16 @@ After saving the config into `configuration.yaml`, users could start the workflo

 .. code-block:: bash

-    qrun -c configuration.yaml
+    qrun configuration.yaml

 .. note:: 

    `qrun` will be placed in your $PATH directory when installing ``Qlib``.

+.. note:: 
+        
+    The symbol `&` in a `yaml` file stands for an anchor of a field, which is useful when other fields include this parameter as part of their value. Taking the configuration file above as an example, users can directly change the value of `market` and `benchmark` without traversing the entire configuration file.
+

 Configuration File
 ===================
@@ -114,17 +119,15 @@ Let's get into details of ``qrun`` in this section.

 Before using ``qrun``, users need to prepare a configuration file. The following content shows how to prepare each part of the configuration file.

-Qlib Data Section
+Qlib Init Section
 --------------------

-At first, the configuration file needs to contain several basic parameters about the data, which will be used for qlib initialization, data handling and backtest.
+First, the configuration file needs to contain several basic parameters, which will be used for ``Qlib`` initialization.

 .. code-block:: YAML

    provider_uri: "~/.qlib/qlib_data/cn_data"
    region: cn
-    market: &market csi300
-    benchmark: &benchmark SH000300

 The meaning of each field is as follows:

@@ -139,34 +142,14 @@ The meaning of each field is as follows:
        
        The value of `region` should be aligned with the data stored in `provider_uri`.

- `market`
-    Type: str. Index name, the default value is `csi500`.

- `benchmark`
-    Type: str, list or pandas.Series. Stock index symbol, the default value is `SH000905`.
+Task Section
+--------------------

-    .. note::
-
-        * If `benchmark` is str, it will use the daily change as the 'bench'.
-
-        * If `benchmark` is list, it will use the daily average change of the stock pool in the list as the 'bench'.
-
-        * If `benchmark` is pandas.Series, whose `index` is trading date and the value T is the change from T-1 to T, it will be directly used as the 'bench'. An example is as following:
-        
-            .. code-block:: python
-
-                print(D.features(D.instruments('csi500'), ['$close/Ref($close, 1)-1'])['$close/Ref($close, 1)-1'].head())
-                    2017-01-04    0.011693
-                    2017-01-05    0.000721
-                    2017-01-06   -0.004322
-                    2017-01-09    0.006874
-                    2017-01-10   -0.003350
-.. note:: 
-        
-    The symbol `&` in `yaml` file stands for an anchor of a field, which is useful when another fields include this parameter as part of the value. Taking the configuration file above as an example, users can directly change the value of `market` and `benchmark` without traversing the entire configuration file.
+The `task` field in the configuration corresponds to a `task`, which contains the parameters of three different subsections: `Model`, `Dataset` and `Record`.

 Model Section
--------------------
+~~~~~~~~~~~~~~~~~~~~

 In the `task` field, the `model` section describes the parameters of the model to be used for training and inference. For more information about the base ``Model`` class, please refer to `Qlib Model <../component/model.html>`_.

@@ -202,7 +185,7 @@ The meaning of each field is as follows:
    ``Qlib`` provides a util named ``init_instance_by_config`` to initialize any class inside ``Qlib`` with a configuration that includes the fields `class`, `module_path` and `kwargs`.

 Dataset Section
--------------------
+~~~~~~~~~~~~~~~~~~~~

 The `dataset` field describes the parameters for the ``Dataset`` module in ``Qlib`` as well as those for the module ``DataHandler``. For more information about the ``Dataset`` module, please refer to `Qlib Model <../component/data.html#dataset>`_.

@@ -237,9 +220,9 @@ Here is the configuration for the ``Dataset`` module which will take care of dat
                test: [2017-01-01, 2020-08-01]

 Record Section
--------------------
+~~~~~~~~~~~~~~~~~~~~

-The `record` field is about the parameters the ``Record`` module in ``Qlib``. ``Record`` is responsible for generating certain analysis and evaluation results such as `prediction`, `information Coefficient (IC)` and `backtest`.
+The `record` field is about the parameters of the ``Record`` module in ``Qlib``. ``Record`` is responsible for tracking the training process and results such as `Information Coefficient (IC)` and `backtest` in a standard format.

 The following script is the configuration of `backtest` and the `strategy` used in `backtest`: