mirror of
https://github.com/microsoft/qlib.git
synced 2026-07-03 11:00:57 +08:00
update docs
42
docs/advanced/serial.rst
Normal file
@@ -0,0 +1,42 @@
.. _serial:

=================================
Serialization
=================================
.. currentmodule:: qlib

Introduction
===================
``Qlib`` supports dumping the state of ``DataHandler``, ``DataSet``, ``Processor``, ``Model``, etc. to disk and reloading it later.

Serializable Class
========================
``Qlib`` provides a base class, ``qlib.utils.serial.Serializable``, whose state can be dumped to or loaded from disk in ``pickle`` format.
When users dump the state of a ``Serializable`` instance, the attributes of the instance whose names **do not** start with ``_`` are saved to disk.

Example
==========================
``Qlib``'s serializable classes include ``DataHandler``, ``DataSet``, ``Processor``, ``Model``, etc., all of which are subclasses of ``qlib.utils.serial.Serializable``.
``qlib.data.dataset.DatasetH`` is one of them. Users can serialize a ``DatasetH`` as follows.

.. code-block:: Python

    import pickle

    ##=============dump dataset=============
    dataset.to_pickle(path="dataset.pkl") # dataset is an instance of qlib.data.dataset.DatasetH

    ##=============reload dataset=============
    with open("dataset.pkl", "rb") as file_dataset:
        dataset = pickle.load(file_dataset)

.. note::
    Only the state of ``DatasetH`` should be saved to disk, for example the ``mean`` and ``variance`` used for data normalization.

    After reloading the ``DatasetH``, users need to reinitialize it. That is, users can reset some states of ``DatasetH`` or ``QlibDataHandler``, such as ``instruments``, ``start_time``, ``end_time`` and ``segments``, and then generate new data according to those states. The data itself is not state and should not be saved to disk.

A more detailed example is available in the `high-frequency example <https://github.com/microsoft/qlib/tree/main/examples/highfreq>`_.


API
===================
Please refer to `Serializable API <../reference/api.html#module-qlib.utils.serial.Serializable>`_.
@@ -153,3 +153,13 @@ Record Template
--------------------
.. automodule:: qlib.workflow.record_temp
    :members:


Utils
====================

Serializable
--------------------

.. automodule:: qlib.utils.serial.Serializable
    :members:
@@ -12,11 +12,11 @@ Get high-frequency data by running the following command:
## Dump & Reload & Reinitialize the Dataset

The High-Frequency Dataset is implemented as `qlib.data.dataset.DatasetH` in `workflow.py`. `DatasetH` is a subclass of [`qlib.utils.serial.Serializable`](https://qlib.readthedocs.io/en/latest/advanced/serial.html), whose state can be dumped to or loaded from disk in `pickle` format.
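
The linked `Serializable` documentation states that attributes whose names start with `_` are not saved when the state is dumped. The sketch below mimics that rule with the standard `pickle` hooks so the effect is easy to see. It is only an illustration, not Qlib's implementation: `UnderscoreFilteredState`, `ToyNormalizer` and the `_cache` attribute are hypothetical names introduced for this example.

```python
import pickle


class UnderscoreFilteredState:
    """Hypothetical toy base class (not Qlib's): pickles only public attributes."""

    def __getstate__(self):
        # Keep only attributes whose names do not start with "_".
        return {k: v for k, v in self.__dict__.items() if not k.startswith("_")}

    def __setstate__(self, state):
        # Restore exactly what was saved; private attributes stay absent.
        self.__dict__.update(state)


class ToyNormalizer(UnderscoreFilteredState):
    def __init__(self, mean, std):
        self.mean = mean            # state: saved
        self.std = std              # state: saved
        self._cache = [0.0] * 1000  # derived data: skipped when pickling


restored = pickle.loads(pickle.dumps(ToyNormalizer(mean=1.5, std=0.5)))
print(sorted(restored.__dict__))  # ['mean', 'std']
```

Because `_cache` never reaches the pickle, the restored object has to rebuild it before use, which is the reason for the reinitialization step.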
### About Reinitialization

After reloading the `Dataset` from disk, `Qlib` also supports reinitializing it. This means that users can reset some states of the `Dataset` or `DataHandler`, such as `instruments`, `start_time`, `end_time` and `segments`, and generate new data according to those states.
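
The reload-then-reinitialize flow can be sketched with a self-contained toy. Everything here is an assumption made for illustration: `ToyDataset` and its `config`, `setup_data` and `fetch` methods are hypothetical stand-ins, and integer time bounds replace real timestamps. The actual Qlib calls are the ones in `workflow.py`.

```python
import pickle


class ToyDataset:
    """Hypothetical stand-in for a dataset; not part of Qlib."""

    def __init__(self, start_time, end_time):
        self.start_time = start_time  # state: saved
        self.end_time = end_time      # state: saved
        self._data = None             # data: rebuilt from state, never saved
        self.setup_data()

    def __getstate__(self):
        # Drop the generated data so that only the state is pickled.
        state = self.__dict__.copy()
        state.pop("_data", None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._data = None  # empty until setup_data() is called again

    def config(self, **kwargs):
        # Reset selected pieces of state after reloading.
        for name, value in kwargs.items():
            if name.startswith("_") or name not in self.__dict__:
                raise AttributeError(f"unknown state: {name}")
            setattr(self, name, value)

    def setup_data(self):
        # Regenerate the data from the current state (here: a range of integers).
        self._data = list(range(self.start_time, self.end_time))

    def fetch(self):
        return self._data


original = ToyDataset(start_time=0, end_time=3)
reloaded = pickle.loads(pickle.dumps(original))  # only the state is transferred

reloaded.config(start_time=10, end_time=13)  # reinitialize the state
reloaded.setup_data()                        # generate new data from it
print(reloaded.fetch())  # [10, 11, 12]
```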
The example is given in `workflow.py`; users can run the code as follows.