mirror of
https://github.com/microsoft/qlib.git
synced 2026-06-06 05:51:17 +08:00
138 lines
5.5 KiB
ReStructuredText
138 lines
5.5 KiB
ReStructuredText
.. _getdata:
|
||
=============================
|
||
Data Retrieval
|
||
=============================
|
||
|
||
.. currentmodule:: qlib
|
||
|
||
Introduction
|
||
====================
|
||
|
||
Users can get stock data by ``Qlib``. Following examples will demonstrate the basic user interface.
|
||
|
||
Examples
|
||
====================
|
||
|
||
|
||
``QLib`` Initialization:
|
||
|
||
.. note:: In order to get the data, users need to initialize ``Qlib`` with `qlib.init` first. Please refer to `initialization <initialization.rst>`_.
|
||
|
||
It is recommended to use the following code to initialize qlib:
|
||
|
||
.. code-block:: python
|
||
|
||
>>> import qlib
|
||
>>> qlib.init(provider_uri='~/.qlib/qlib_data/cn_data')
|
||
|
||
|
||
Load trading calendar with the given time range and frequency:
|
||
|
||
.. code-block:: python
|
||
|
||
>>> from qlib.data import D
|
||
>>> D.calendar(start_time='2010-01-01', end_time='2017-12-31', freq='day')[:2]
|
||
[Timestamp('2010-01-04 00:00:00'), Timestamp('2010-01-05 00:00:00')]
|
||
|
||
Parse a given market name into a stockpool config:
|
||
|
||
.. code-block:: python
|
||
|
||
>>> from qlib.data import D
|
||
>>> D.instruments(market='all')
|
||
{'market': 'all', 'filter_pipe': []}
|
||
|
||
Load instruments of certain stockpool in the given time range:
|
||
|
||
.. code-block:: python
|
||
|
||
>>> from qlib.data import D
|
||
>>> instruments = D.instruments(market='csi300')
|
||
>>> D.list_instruments(instruments=instruments, start_time='2010-01-01', end_time='2017-12-31', as_list=True)[:6]
|
||
|
||
|
||
Load dynamic instruments from a base market according to a name filter
|
||
|
||
.. code-block:: python
|
||
|
||
>>> from qlib.data import D
|
||
>>> from qlib.data.filter import NameDFilter
|
||
>>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
|
||
>>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter])
|
||
>>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
|
||
|
||
Load dynamic instruments from a base market according to an expression filter
|
||
|
||
.. code-block:: python
|
||
|
||
>>> from qlib.data import D
|
||
>>> from qlib.data.filter import ExpressionDFilter
|
||
>>> expressionDFilter = ExpressionDFilter(rule_expression='$close>100')
|
||
>>> instruments = D.instruments(market='csi300', filter_pipe=[expressionDFilter])
|
||
>>> D.list_instruments(instruments=instruments, start_time='2015-01-01', end_time='2016-02-15', as_list=True)
|
||
|
||
To know more about how to use the filter or how to build one's own filter, go to API Reference: `filter API <../reference/api.html#filter>`_
|
||
|
||
Load features of certain instruments in given time range:
|
||
|
||
.. note:: This is not a recommended way to get features.
|
||
|
||
.. code-block:: python
|
||
|
||
>>> from qlib.data import D
|
||
>>> instruments = ['SH600000']
|
||
>>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
|
||
>>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
|
||
$close $volume Ref($close,1) Mean($close,3) \
|
||
instrument datetime
|
||
SH600000 2010-01-04 81.809998 17144536.0 NaN 81.809998
|
||
2010-01-05 82.419998 29827816.0 81.809998 82.114998
|
||
2010-01-06 80.800003 25070040.0 82.419998 81.676666
|
||
2010-01-07 78.989998 22077858.0 80.800003 80.736666
|
||
2010-01-08 79.879997 17019168.0 78.989998 79.889999
|
||
|
||
Sub($high,$low)
|
||
instrument datetime
|
||
SH600000 2010-01-04 2.741158
|
||
2010-01-05 3.049736
|
||
2010-01-06 1.621399
|
||
2010-01-07 2.856926
|
||
2010-01-08 1.930397
|
||
2010-01-08 1.930397
|
||
|
||
Load features of certain stockpool in given time range:
|
||
|
||
.. note:: Since the server need to cache all-time data for your request stockpool and fields, it may take longer to process your request than before. But in the second time, your request will be processed and responded in a flash even if you change the timespan.
|
||
|
||
.. code-block:: python
|
||
|
||
>>> from qlib.data import D
|
||
>>> from qlib.data.filter import NameDFilter, ExpressionDFilter
|
||
>>> nameDFilter = NameDFilter(name_rule_re='SH[0-9]{4}55')
|
||
>>> expressionDFilter = ExpressionDFilter(rule_expression='($close/$factor)>100')
|
||
>>> instruments = D.instruments(market='csi300', filter_pipe=[nameDFilter, expressionDFilter])
|
||
>>> fields = ['$close', '$volume', 'Ref($close, 1)', 'Mean($close, 3)', '$high-$low']
|
||
>>> D.features(instruments, fields, start_time='2010-01-01', end_time='2017-12-31', freq='day').head()
|
||
|
||
$close $volume Ref($close, 1) \
|
||
instrument datetime
|
||
SH600655 2015-06-15 4342.160156 258706.359375 4530.459961
|
||
2015-06-16 4409.270020 257349.718750 4342.160156
|
||
2015-06-17 4312.330078 235214.890625 4409.270020
|
||
2015-06-18 4086.729980 196772.859375 4312.330078
|
||
2015-06-19 3678.250000 182916.453125 4086.729980
|
||
Mean($close, 3) high− low
|
||
instrument datetime
|
||
SH600655 2015-06-15 4480.743327 285.251465
|
||
2015-06-16 4427.296712 298.301270
|
||
2015-06-16 4354.586751 356.098145
|
||
2015-06-16 4269.443359 363.554932
|
||
2015-06-16 4025.770020 368.954346
|
||
|
||
|
||
.. note:: When calling D.features() at client, use parameter 'disk_cache=0' to skip dataset cache, use 'disk_cache=1' to generate and use dataset cache. In addition, when calling at server, you can use 'disk_cache=2' to update the dataset cache.
|
||
|
||
API
|
||
====================
|
||
To know more about how to use the Data, go to API Reference: `Data API <../reference/api.html#Data>`_
|