mirror of
https://github.com/microsoft/qlib.git
synced 2026-07-04 11:30:57 +08:00
Draft version of refactoring handler
@@ -156,17 +156,17 @@ Data Handler
 Users can use ``Data Handler`` in an automatic workflow by ``Estimator``, refer to `Estimator: Workflow Management <estimator.html>`_ for more details.

-Also, ``Data Handler`` can be used as an independent module, by which users can easily preprocess data(standardization, remove NaN, etc.) and build datasets. It is a subclass of ``qlib.data.dataset.handler.BaseDataHandler``, which provides some interfaces as follows.
+Also, ``Data Handler`` can be used as an independent module, by which users can easily preprocess data(standardization, remove NaN, etc.) and build datasets. It is a subclass of ``qlib.data.dataset.handler.DataHandlerLP``, which provides some interfaces as follows.

 Base Class & Interface
 ----------------------

-Qlib provides a base class `qlib.data.dataset.BaseDataHandler <../reference/api.html#qlib.data.dataset.handler.BaseDataHandler>`_, which provides the following interfaces:
+Qlib provides a base class `qlib.data.dataset.DataHandlerLP <../reference/api.html#qlib.data.dataset.handler.DataHandlerLP>`_, which provides the following interfaces:

-- `setup_feature`
+- `load_feature`
 Implement the interface to load the data features.

-- `setup_label`
+- `load_label`
 Implement the interface to load the data labels and calculate the users' labels.

 - `setup_processed_data`
@@ -174,11 +174,7 @@ Qlib provides a base class `qlib.data.dataset.BaseDataHandler <../reference/api.
 Qlib also provides two functions to help users init the data handler, users can override them for users' needs.

-- `_init_kwargs`
-Users can init the kwargs of the data handler in this function, some kwargs may be used when init the raw df.
-Kwargs are the other attributes in data.args, like dropna_label, dropna_feature
-
-- `_init_raw_df`
+- `_init_raw_data`
 Users can init the raw df, feature names, and label names of data handler in this function.
 If the index of feature df and label df are not the same, users need to override this method to merge them (e.g. inner, left, right merge).
@@ -284,7 +284,7 @@ To know more about ``Interday Model``, please refer to `Interday Model: Training
 Data Section
 -----------------

-``Data Handler`` can be used to load raw data, prepare features and label columns, preprocess data (standardization, remove NaN, etc.), split training, validation, and test sets. It is a subclass of `qlib.data.dataset.handler.BaseDataHandler`.
+``Data Handler`` can be used to load raw data, prepare features and label columns, preprocess data (standardization, remove NaN, etc.), split training, validation, and test sets. It is a subclass of `qlib.data.dataset.handler.DataHandlerLP`.

 Users can use the specified data handler by config as follows.
@@ -315,7 +315,7 @@ Users can use the specified data handler by config as follows.
 fend_time: 2018-12-11

 - `class`
-Data handler class, str type, which should be a subclass of `qlib.data.dataset.handler.BaseDataHandler`, and implements 5 important interfaces for loading features, loading raw data, preprocessing raw data, slicing train, validation, and test data. The default value is `ALPHA360`. If users want to write a data handler to retrieve the data in ``Qlib``, `QlibDataHandler` is suggested.
+Data handler class, str type, which should be a subclass of `qlib.data.dataset.handler.DataHandlerLP`, and implements 5 important interfaces for loading features, loading raw data, preprocessing raw data, slicing train, validation, and test data. The default value is `ALPHA360`. If users want to write a data handler to retrieve the data in ``Qlib``, `QlibDataHandler` is suggested.

 - `module_path`
 The module path, str type, absolute url is also supported, indicates the path of the `class` implementation of the data processor class. The default value is `qlib.data.dataset.handler`.
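Note on the `class` / `module_path` pair in the hunk above: the sketch below shows one common way such a pair can be resolved to a Python class with the standard library. It is not Qlib's actual loader. The helper name ``resolve_class`` is hypothetical, and a standard-library class stands in for a data handler so the snippet runs without Qlib installed. It assumes a dotted module path only; the file-path form mentioned in the documentation is not covered.

```python
import importlib


def resolve_class(class_name: str, module_path: str):
    """Import ``module_path`` and return its attribute named ``class_name``.

    Hypothetical helper for illustration; Qlib's own loading logic may differ.
    """
    module = importlib.import_module(module_path)
    try:
        return getattr(module, class_name)
    except AttributeError as exc:
        raise ImportError(
            f"module {module_path!r} has no attribute {class_name!r}"
        ) from exc


# Stand-in config: a stdlib class plays the role of the handler class.
# In a Qlib config these keys would hold e.g. a handler name and
# `qlib.data.dataset.handler` (the documented default module path).
config = {"class": "OrderedDict", "module_path": "collections"}

handler_cls = resolve_class(config["class"], config["module_path"])
handler = handler_cls(a=1)  # instantiate the resolved class
```

Keeping the class name and the module path as separate strings lets a config file switch to a custom handler without any code change, which is the pattern the `Custom Data Handler` section relies on.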
@@ -363,7 +363,7 @@ Users can use the specified data handler by config as follows.
 Custom Data Handler
 ~~~~~~~~~~~~~~~~~~~~~~

-Qlib support custom data handler, but it must be a subclass of the ``qlib.contrib.estimator.handler.BaseDataHandler``, the config for custom data handler may be as follows.
+Qlib support custom data handler, but it must be a subclass of the ``qlib.data.dataset.handler.DataHandlerLP``, the config for custom data handler may be as follows.

 .. code-block:: YAML