mirror of https://github.com/microsoft/qlib.git synced 2026-07-05 20:11:08 +08:00

release-0.5.0 (#1)

* init commit

* change the version number

* rich the docs&fix cache docs

* update index readme

* Modify cache class name

* Modify sharpe to information_ratio

* Modify Group- to Group

* add the description of graphical results & fix the backtest docs

* fix docs in details

* update docs

* Update introduction.rst

* Update README.md

* Update introduction.rst

* Update introduction.rst

* Update introduction.rst

* Update installation.rst

* Update installation.rst

* Update initialization.rst

* Update getdata.rst

* Update integration.rst

* Update initialization.rst

* Update getdata.rst

* Update estimator.rst

Modify some typos.

* Update README.md

Modify the typos.

* Update initialization.rst

* Update data.rst

* Update report.rst

* Update estimator.rst

* Update cumulative_return.py

* Update model.rst

* Update rank_label.py

* Update cumulative_return.py

* Update strategy.rst

* Update getdata.rst

* Update backtest.rst

* Update integration.rst

* Update getdata.rst

* Update introduction.rst

* Update introduction.rst

* Update README.md

* Update report.rst

* Update integration.rst

Fix typos

* Update installation.rst

Fix typos

* Update getdata.rst

* Update initialization.rst

Fix typos.

* add quick start docs&fix detials

* fix estimator docs & fix strategy docs

* fix the cahce in data.rst

* update documents

* Fix Corr && Rsquare

* fix data retrival example to csi300 & fix a data bug

* fix filter bug

* Fix data collector

* Modift model args

* add the log & fix README.md\quick.rst

* add enviroment depend & add intoduction of qlib-server online mode

* fix image center fomat & set log_only of docs is True

* fix README.md format

* update data preparation & readme logo image

* get_data support version

* Modify analysis names

* Modify analysis graph

* update report.rst & data.rst

* commmit estimator for merge

* minimal requirements

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update READEME.md

* Update READEME.md

* update estimator

* Fix doc urls

* fix get_data.py docstring

* update test_get_data.py

* Upate docs

* Upate docs

* Upate docs

Co-authored-by: bxdd <bxddream@gmail.com>
Co-authored-by: zhupr <zhu.pengrong@foxmail.com>
Co-authored-by: Wendi Li <wendili.academic@qq.com>
Co-authored-by: Dingsu Wang <dingsu.wang@gmail.com>
Co-authored-by: bxdd <45119470+bxdd@users.noreply.github.com>
Co-authored-by: cslwqxx <cslwqxx@users.noreply.github.com>
you-n-g
2020-09-23 23:01:39 -05:00
committed by GitHub
parent 99ebd87cba
commit de9e13b171
82 changed files with 1580 additions and 1145 deletions


@@ -8,7 +8,7 @@ Data Retrieval
Introduction
====================
Users can get stock data by ``Qlib``. Following examples will demonstrate the basic user interface.
Users can get stock data with ``Qlib``. The following examples demonstrate the basic user interface.
Examples
====================
@@ -16,122 +16,109 @@ Examples
``QLib`` Initialization:
.. note:: In order to get the data, users need to initialize ``Qlib`` with `qlib.init` first. Please refer to `initialization <initialization.rst>`_.
.. note:: In order to get the data, users need to initialize ``Qlib`` with `qlib.init` first. Please refer to `initialization <initialization.html>`_.
It is recommended to use the following code to initialize qlib:
If users followed the steps in `initialization <initialization.html>`_ and downloaded the data, they should use the following code to initialize qlib:
.. code-block:: python
>>> import qlib
>>> qlib.init(provider_uri='~/.qlib/qlib_data/cn_data')
>> import qlib
>> qlib.init(provider_uri='~/.qlib/qlib_data/cn_data')
Load trading calendar with the given time range and frequency:
Load trading calendar with given time range and frequency:
.. code-block:: python
>>> from qlib.data import D
>>> D.calendar(start_time='2010-01-01', end_time='2017-12-31', freq='day')[:2]
>> from qlib.data import D
>> D.calendar(start_time='2010-01-01', end_time='2017-12-31', freq='day')[:2]
[Timestamp('2010-01-04 00:00:00'), Timestamp('2010-01-05 00:00:00')]
Parse a given market name into a stockpool config:
Parse a given market name into a stock pool config:
.. code-block:: python
>>> from qlib.data import D
>>> D.instruments(market='all')
>> from qlib.data import D
>> D.instruments(market='all')
{'market': 'all', 'filter_pipe': []}
Load instruments of certain stockpool in the given time range:
Load instruments of certain stock pool in the given time range:
.. code-block:: python
>>> from qlib.data import D
>>> instruments = D.instruments(market='csi300')
>>> D.list_instruments(instruments=instruments, start_time='2010-01-01', end_time='2017-12-31', as_list=True)[:6]
>> from qlib.data import D
>> instruments = D.instruments(market='csi300')
>> D.list_instruments(instruments=instruments, start_time='2010-01-01', end_time='2017-12-31', as_list=True)[:6]
['SH600036', 'SH600110', 'SH600087', 'SH600900', 'SH600089', 'SZ000912']
Load dynamic instruments from a base market according to a name filter
.. code-block:: python
>>> from qlib.data import D
>>> from qlib.data.filter import NameDFilter
>>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
>>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter])
>>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
>> from qlib.data import D
>> from qlib.data.filter import NameDFilter
>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter])
>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
['SH600655', 'SH601555']
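The name rule above is an ordinary regular expression. As a minimal sketch of why only those two codes pass, the snippet below applies the same pattern with Python's ``re`` module to a hand-picked list of codes taken from the outputs on this page. It assumes the filter tests the pattern against the instrument code from its start; the ``candidates`` list is illustrative and ``Qlib`` itself is not used here.

.. code-block:: python

    import re

    # Same pattern as name_rule_re above: 'SH', any four digits, then '55'.
    name_rule = re.compile(r'SH[0-9]{4}55')

    # Illustrative codes copied from the example outputs (not fetched from Qlib).
    candidates = ['SH600036', 'SH600110', 'SH600655', 'SH601555', 'SZ000912']

    # Keep only the codes whose beginning matches the rule.
    matched = [code for code in candidates if name_rule.match(code)]
    print(matched)
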
Load dynamic instruments from a base market according to an expression filter
.. code-block:: python
>>> from qlib.data import D
>>> from qlib.data.filter import ExpressionDFilter
>>> expressionDFilter = ExpressionDFilter(rule_expression='$close>100')
>>> instruments = D.instruments(market='csi300', filter_pipe=[expressionDFilter])
>>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
>> from qlib.data import D
>> from qlib.data.filter import ExpressionDFilter
>> expressionDFilter = ExpressionDFilter(rule_expression='$close>2000')
>> instruments = D.instruments(market='csi300', filter_pipe=[expressionDFilter])
>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
['SZ000651', 'SZ000002', 'SH600655', 'SH600570']
To know more about how to use the filter or how to build one's own filter, go to API Reference: `filter API <../reference/api.html#filter>`_
For more details about filters, please refer to `Filter API <../component/data.html>`_.
Load features of certain instruments in given time range:
.. note:: This is not a recommended way to get features.
Load features of certain instruments in a given time range:
.. code-block:: python
>>> from qlib.data import D
>>> instruments = ['SH600000']
>>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
>>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
$close $volume Ref($close,1) Mean($close,3) \
instrument datetime
SH600000 2010-01-04 81.809998 17144536.0 NaN 81.809998
2010-01-05 82.419998 29827816.0 81.809998 82.114998
2010-01-06 80.800003 25070040.0 82.419998 81.676666
2010-01-07 78.989998 22077858.0 80.800003 80.736666
2010-01-08 79.879997 17019168.0 78.989998 79.889999
>> from qlib.data import D
>> instruments = ['SH600000']
>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
$close $volume Ref($close, 1) Mean($close, 3) $high-$low
instrument datetime
SH600000 2010-01-04 86.778313 16162960.0 88.825928 88.061483 2.907631
2010-01-05 87.433578 28117442.0 86.778313 87.679273 3.235252
2010-01-06 85.713585 23632884.0 87.433578 86.641825 1.720009
2010-01-07 83.788803 20813402.0 85.713585 85.645322 3.030487
2010-01-08 84.730675 16044853.0 83.788803 84.744354 2.047623
Sub($high,$low)
instrument datetime
SH600000 2010-01-04 2.741158
2010-01-05 3.049736
2010-01-06 1.621399
2010-01-07 2.856926
2010-01-08 1.930397
2010-01-08 1.930397
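The expression fields can be read as simple time-series operations. The following is a rough sketch in plain Python, assuming ``Ref($close, 1)`` means the previous period's close and ``Mean($close, 3)`` means a trailing mean over up to three periods, which is what the sample output suggests. The helper names ``ref`` and ``rolling_mean`` are hypothetical and are not part of ``Qlib``. Because the sketch only sees these five rows, its first two rows differ from the output above, which is computed with earlier history available.

.. code-block:: python

    # Closing prices copied from the sample output above (2010-01-04 to 2010-01-08).
    close = [86.778313, 87.433578, 85.713585, 83.788803, 84.730675]

    def ref(series, n):
        """Shift the series back by n periods (n >= 1); the first n values are unknown."""
        return [None] * n + series[:-n]

    def rolling_mean(series, window):
        """Trailing mean over at most `window` values ending at each position."""
        result = []
        for i in range(len(series)):
            chunk = series[max(0, i - window + 1):i + 1]
            result.append(sum(chunk) / len(chunk))
        return result

    prev_close = ref(close, 1)
    mean_close_3 = rolling_mean(close, 3)

    # From the third row on, the values agree with the Mean($close, 3) column above.
    print(prev_close[1:], [round(v, 6) for v in mean_close_3[2:]])
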
Load features of certain stock pool in a given time range:
Load features of certain stockpool in given time range:
.. note:: Since the server need to cache all-time data for your request stockpool and fields, it may take longer to process your request than before. But in the second time, your request will be processed and responded in a flash even if you change the timespan.
.. note:: With cache enabled, the qlib data server caches data over the full time range for the requested stock pool and fields, so the first request may take longer than it would without the cache. After that, requests with the same stock pool and fields hit the cache and are processed faster, even if the requested time period changes.
.. code-block:: python
>>> from qlib.data import D
>>> from qlib.data.filter import NameDFilter, ExpressionDFilter
>>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
>>> expressionDFilter = ExpressionDFilter(rule_expression='($close/$factor)>100')
>>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter, expressionDFilter])
>>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
>>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
>> from qlib.data import D
>> from qlib.data.filter import NameDFilter, ExpressionDFilter
>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
>> expressionDFilter = ExpressionDFilter(rule_expression='$close>Ref($close,1)')
>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter, expressionDFilter])
>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
$close $volume Ref($close, 1) \
instrument datetime
SH600655 2015-06-15 4342.160156 258706.359375 4530.459961
2015-06-16 4409.270020 257349.718750 4342.160156
2015-06-17 4312.330078 235214.890625 4409.270020
2015-06-18 4086.729980 196772.859375 4312.330078
2015-06-19 3678.250000 182916.453125 4086.729980
Mean($close, 3) high low
instrument datetime
SH600655 2015-06-15 4480.743327 285.251465
2015-06-16 4427.296712 298.301270
2015-06-16 4354.586751 356.098145
2015-06-16 4269.443359 363.554932
2015-06-16 4025.770020 368.954346
$close $volume Ref($close, 1) Mean($close, 3) $high-$low
instrument datetime
SH600655 2010-01-04 2699.567383 158193.328125 2619.070312 2626.097738 124.580566
2010-01-08 2612.359619 77501.406250 2584.567627 2623.220133 83.373047
2010-01-11 2712.982422 160852.390625 2612.359619 2636.636556 146.621582
2010-01-12 2788.688232 164587.937500 2712.982422 2704.676758 128.413818
2010-01-13 2790.604004 145460.453125 2788.688232 2764.091553 128.413818
.. note:: When calling D.features() at client, use parameter 'disk_cache=0' to skip dataset cache, use 'disk_cache=1' to generate and use dataset cache. In addition, when calling at server, you can use 'disk_cache=2' to update the dataset cache.
For more details about features, please refer to `Feature API <../component/data.html>`_.
.. note:: When calling `D.features()` at the client, use parameter `disk_cache=0` to skip dataset cache, use `disk_cache=1` to generate and use dataset cache. In addition, when calling at the server, users can use `disk_cache=2` to update the dataset cache.
API
====================
To know more about how to use the Data, go to API Reference: `Data API <../reference/api.html#Data>`_
To know more about how to use the Data, go to API Reference: `Data API <../reference/api.html#data>`_


@@ -9,17 +9,16 @@ Qlib Initialization
Initialization
=========================
Please execute the following process to initialize ``Qlib``.
Please follow the steps below to initialize ``Qlib``.
- Download and prepare the Data: execute the following command to download the stock data.
- Download and prepare the Data: execute the following command to download stock data.
.. code-block:: bash
python scripts/get_data.py qlib_data_cn --target_dir ~/.qlib/qlib_data/cn_data
Know more about how to use ``get_data.py``, refer to `Raw Data <../advanced/data.html#raw-data>`_.
Please refer to `Raw Data <../component/data.html>`_ for more information about ``get_data.py``.
- Run the initialization code: run the following code in python:
- Initialize Qlib before calling other APIs: run the following code in Python.
.. code-block:: Python
@@ -34,17 +33,17 @@ Please execute the following process to initialize ``Qlib``.
Parameters
-------------------
In fact, in addition to `provider_uri` and `region`, `qlib.init` has other parameters. The following are all the parameters of `qlib.init`:
Besides `provider_uri` and `region`, `qlib.init` has other parameters. The following are several important parameters of `qlib.init`:
- `provider_uri`
Type: str. The local directory where the data loaded by ``get_data.py`` is stored.
Type: str. The URI of the Qlib data. For example, it could be the location where the data loaded by ``get_data.py`` are stored.
- `region`
Type: str, optional parameter(default: ``qlib.config.REG_CN``).
Currently: ``qlib.config.REG_US``('us') and ``qlib.config.REG_CN``('cn') is supported. Different value of ``region`` will
result in different stock market mode.
Type: str, optional parameter(default: `qlib.config.REG_CN`).
Currently, ``qlib.config.REG_US`` ('us') and ``qlib.config.REG_CN`` ('cn') are supported. Different values of `region` result in different stock market modes.
- ``qlib.config.REG_US``: US stock market.
- ``qlib.config.REG_CN``: China stock market.
Different modse will result in different trading limitations and costs.
- `redis_host`
Type: str, optional parameter(default: "127.0.0.1"), host of `redis`
The lock and cache mechanism relies on redis.
@@ -57,4 +56,4 @@ In fact, in addition to `provider_uri` and `region`, `qlib.init` has other param
.. note::
If redis connection failed with `redis_host` and `redis_port`, cache will not be used! Please refer to `Cache <../advanced/cache.rst>`_.
If Qlib fails to connect to redis via `redis_host` and `redis_port`, the cache mechanism will not be used! Please refer to `Cache <../component/data.html#cache>`_ for details.


@@ -6,33 +6,34 @@ Installation
.. currentmodule:: qlib
How to Install ``Qlib``
====================
``Qlib`` Installation
=====================
.. note::
``Qlib`` only supports Python3, and supports up to Python3.8.
`Qlib` supports both `Windows` and `Linux`. It's recommended to use `Qlib` on `Linux`. ``Qlib`` supports Python 3, up to Python 3.8.
Please execute the following process to install ``Qlib``:
Please follow the steps below to install ``Qlib``:
- Change the directory to ``Qlib``, in which the file ``setup.py`` exists.
- Then, please execute the following command:
- Enter the root directory of ``Qlib``, in which the file ``setup.py`` exists.
- Then, please execute the following command to install the environment dependencies and install ``Qlib``:
.. code-block:: bash
$ pip install numpy
$ pip install --upgrade cython
$ git clone https://github.com/microsoft/qlib.git && cd qlib
$ python setup.py install
.. note::
It's recommended to use anaconda/miniconda to setup environment.
``Qlib`` needs lightgbm and tensorflow packages, use pip to install them.
It's recommended to use anaconda/miniconda to set up the environment. ``Qlib`` needs the lightgbm and pytorch packages; use pip to install them.
.. note::
Do not import qlib in the repository folder which contains ``qlib``, otherwise errors may occur.
Do not import qlib in the root directory of ``Qlib``, otherwise, errors may occur.
Use the following code to confirm installation successful:
Use the following code to make sure the installation is successful:
.. code-block:: python
@@ -41,3 +42,4 @@ Use the following code to confirm installation successful:
<LATEST VERSION>
=====================


@@ -9,18 +9,18 @@ Introduction
Users can integrate their own custom models according to the following steps.
- Define a custom model class, which should be a subclass of the `qlib.contrib.model.base.Model <../reference/api.html#module-qlib.contrib.model.base>`_
- Write a configuration file that describes the path and parameters of the custom model
- Test the custom model
- Define a custom model class, which should be a subclass of the `qlib.contrib.model.base.Model <../reference/api.html#module-qlib.contrib.model.base>`_.
- Write a configuration file that describes the path and parameters of the custom model.
- Test the custom model.
Custom Model Class
===========================
The Custom models need to inherit `qlib.contrib.model.base.Model <../reference/api.html#module-qlib.contrib.model.base>`_ and override the methods in it.
- Override the `__init__` method
- ``Qlib`` passes the initialized parameters to the \_\_init\_\_ method
- ``Qlib`` passes the initialized parameters to the \_\_init\_\_ method.
- The parameter must be consistent with the hyperparameters in the configuration file.
- Code Example: In the following example, the hyperparameter filed of the configuration file should contain parameters such as loss:mse.
- Code Example: In the following example, the hyperparameter field of the configuration file should contain parameters such as `loss:mse`.
.. code-block:: Python
def __init__(self, loss='mse', **kwargs):
@@ -32,9 +32,9 @@ The Custom models need to inherit `qlib.contrib.model.base.Model <../reference/a
- Override the `fit` method
- ``Qlib`` calls the fit method to train the model
- The parameters must include training feature 'x_train', training label 'y_train', test feature 'x_valid', test label 'y_valid'at least.
- The parameters could include some optional parameters with default values, such as train weight 'w_train', test weight 'w_valid' and 'num_boost_round = 1000'.
- Code Example: In the following example, 'num_boost_round = 1000' is an optional parameter.
- The parameters must include training feature `x_train`, training label `y_train`, test feature `x_valid`, test label `y_valid` at least.
- The parameters could include some optional parameters with default values, such as train weight `w_train`, test weight `w_valid` and `num_boost_round = 1000`.
- Code Example: In the following example, `num_boost_round = 1000` is an optional parameter.
.. code-block:: Python
def fit(self, x_train:pd.DataFrame, y_train:pd.DataFrame, x_valid:pd.DataFrame, y_valid:pd.DataFrame,
@@ -61,10 +61,10 @@ The Custom models need to inherit `qlib.contrib.model.base.Model <../reference/a
)
- Override the `predict` method
- The parameters include the test features
- Return the prediction score
- Please refer to `qlib.contrib.model.base.Model <../reference/api.html#module-qlib.contrib.model.base>`_ for the parameter types of the fit method
- Code Example:In the following example, user need to user dnn to predict the label(such as 'preds') of test data 'x_test' and return it.
- The parameters include the test features.
- Return the `prediction score`.
- Please refer to `qlib.contrib.model.base.Model <../reference/api.html#module-qlib.contrib.model.base>`_ for the parameter types of the fit method.
- Code Example: In the following example, users need to use dnn to predict the label(such as `preds`) of test data `x_test` and return it.
.. code-block:: Python
def predict(self, x_test:pd.DataFrame, **kwargs)-> numpy.ndarray:
@@ -73,9 +73,9 @@ The Custom models need to inherit `qlib.contrib.model.base.Model <../reference/a
return self._model.predict(x_test.values)
- Override the `score` method
- The parameters include the test features and test labels
- Return the evaluation score of model. It's recommended to adopt the loss between labels and prediction score.
- Code Example:In the following example, user need to calculate the weighted loss with test data 'x_test', test label 'y_test' and the weight 'w_test'.
- The parameters include the test features and test labels.
- Return the evaluation score of the model. It's recommended to adopt the loss between labels and `prediction score`.
- Code Example: In the following example, users need to calculate the weighted loss with test data `x_test`, test label `y_test` and the weight `w_test`.
.. code-block:: Python
def score(self, x_test:pd.Dataframe, y_test:pd.Dataframe, w_test:pd.DataFrame = None) -> float:
@@ -87,8 +87,8 @@ The Custom models need to inherit `qlib.contrib.model.base.Model <../reference/a
return scorer(y_test.values, preds, sample_weight=w_test_weight)
- Override the `save` method & `load` method
- The `save` method parameter include the a `filename` that represents an absolute path, user need to save model into the path.
- The `load` method parameter include the a `buffer` read from the `filename` passed in `save` method , user need to load model from the `buffer`.
- The `save` method takes a `filename` that represents an absolute path; users need to save the model to that path.
- The `load` method takes a `buffer` read from the `filename` passed to the `save` method; users need to load the model from the `buffer`.
- Code Example:
.. code-block:: Python
@@ -104,9 +104,9 @@ The Custom models need to inherit `qlib.contrib.model.base.Model <../reference/a
Configuration File
=======================
The configuration file is described in detail in the `estimator <../advanced/estimator.html#Example>`_ document. In order to integrate the custom model into ``Qlib``, you need to modify the "model" field in the configuration file.
The configuration file is described in detail in the `estimator <../component/estimator.html#complete-example>`_ document. In order to integrate the custom model into ``Qlib``, users need to modify the "model" field in the configuration file.
- Example: The following example describes the model field of configuration file about the custom lightgbm model mentioned above , where module_path is the module path, class is the class name, and args is the hyperparameter passed into the __init__ method. All parameters in the field is passed to 'self._params' by '\*\*kwargs' in `__init__` except 'loss = mse'.
- Example: The following example describes the `model` field of the configuration file for the custom lightgbm model mentioned above, where `module_path` is the module path, `class` is the class name, and `args` holds the hyperparameters passed into the `__init__` method. All parameters in the field are passed to `self._params` by `\*\*kwargs` in `__init__` except `loss = mse`.
.. code-block:: YAML
@@ -128,7 +128,7 @@ Users could find configuration file of the baseline of the ``Model`` in ``qlib/e
Model Testing
=====================
Assuming that the configuration file is ``examples/estimator/estimator_config.yaml``, user can run the following command to test the custom model:
Assuming that the configuration file is ``examples/estimator/estimator_config.yaml``, users can run the following command to test the custom model:
.. code-block:: bash
@@ -137,10 +137,10 @@ Assuming that the configuration file is ``examples/estimator/estimator_config.ya
.. note:: ``estimator`` is a built-in command of ``Qlib``.
Also, ``Model`` can also be tested as a single module. An example has been given in ``examples.estimator.train_backtest_analyze.ipynb``.
``Model`` can also be tested as a single module. An example has been given in ``examples/train_backtest_analyze.ipynb``.
Reference
=====================
To know more about ``Model``, please refer to `Interday Model: Model Training & Prediction <../advanced/model.rst>`_ and `Model API <../reference/api.html#module-qlib.contrib.model.base>`_.
To learn more about ``Model``, please refer to `Interday Model: Model Training & Prediction <../component/model.html>`_ and `Model API <../reference/api.html#module-qlib.contrib.model.base>`_.
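
The method overrides described in the Custom Model Class section can be put together in one place. The class below is only a toy sketch of the method layout. It is an assumption-laden stand-in: ``MeanBaselineModel`` is a hypothetical name, it does not subclass ``qlib.contrib.model.base.Model``, and it uses plain lists instead of ``pd.DataFrame`` so that it runs without ``Qlib`` installed. A real custom model would inherit from the base class and follow its parameter types.

.. code-block:: python

    import pickle

    class MeanBaselineModel:
        """Toy model that always predicts the mean of the training labels."""

        def __init__(self, loss='mse', **kwargs):
            # 'loss' mirrors the hyperparameter from the configuration file;
            # any remaining hyperparameters are kept in self._params.
            if loss != 'mse':
                raise NotImplementedError('only mse is sketched here')
            self._loss = loss
            self._params = dict(kwargs)
            self._mean = None

        def fit(self, x_train, y_train, x_valid, y_valid,
                w_train=None, w_valid=None, **kwargs):
            # "Training" is just averaging the training labels.
            self._mean = sum(y_train) / len(y_train)

        def predict(self, x_test, **kwargs):
            # One prediction score per test row.
            return [self._mean for _ in x_test]

        def score(self, x_test, y_test, w_test=None):
            # Weighted mean squared error between labels and predictions.
            preds = self.predict(x_test)
            weights = w_test if w_test is not None else [1.0] * len(y_test)
            total = sum(w * (y - p) ** 2 for w, y, p in zip(weights, y_test, preds))
            return total / sum(weights)

        def save(self, filename):
            # Persist the fitted state to the given path.
            with open(filename, 'wb') as f:
                pickle.dump(self._mean, f)

        def load(self, buffer):
            # Restore the fitted state from bytes read from a saved file.
            self._mean = pickle.loads(buffer)

    model = MeanBaselineModel(loss='mse', learning_rate=0.1)
    model.fit([[1], [2], [3]], [1.0, 2.0, 3.0], [[4]], [4.0])
    preds = model.predict([[5], [6]])
    mse = model.score([[4]], [4.0])

Keeping ``loss`` as a named argument and collecting the rest through ``**kwargs`` follows the pattern the configuration file section describes for ``args``.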