From f5ec21913553eb2ccb35afc49cc7f5fa630cb76e Mon Sep 17 00:00:00 2001
From: Jactus
Date: Thu, 26 Nov 2020 22:45:40 +0800
Subject: [PATCH] Update docs and README

---
 README.md                     | 23 ++++++++++++--
 docs/component/data.rst       | 56 +++++++++++++++++++++++++++++++----
 docs/start/initialization.rst |  8 +++--
 3 files changed, 76 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index fd20e7c3a..3e0c985c8 100644
--- a/README.md
+++ b/README.md
@@ -27,7 +27,8 @@ For more details, please refer to our paper ["Qlib: An AI-oriented Quantitative
   - [Data Preparation](#data-preparation)
   - [Auto Quant Research Workflow](#auto-quant-research-workflow)
   - [Building Customized Quant Research Workflow by Code](#building-customized-quant-research-workflow-by-code)
-- [Quant Model Zoo](#quant-model-zoo)
+  - [Run a single model](#run-a-single-model)
+  - [Run multiple models](#run-multiple-models)
 - [Quant Dataset Zoo](#quant-dataset-zoo)
 - [More About Qlib](#more-about-qlib)
 - [Offline Mode and Online Mode](#offline-mode-and-online-mode)
@@ -188,7 +189,25 @@ Qlib provides a tool named `qrun` to run the whole workflow automatically (inclu
 The automatic workflow may not suite the research workflow of all Quant researchers. To support a flexible Quant research workflow, Qlib also provides a modularized interface to allow researchers to build their own workflow by code. [Here](examples/workflow_by_code.ipynb) is a demo for customized Quant research workflow by code.


-# Quant Model Zoo
+# [Quant Model Zoo](examples/benchmarks)
+
+## Run a single model
+`Qlib` provides three different ways to run a single model. Users can pick the one that best fits their case:
+- Users can use the tool `qrun` mentioned above to run a model's workflow based on a config file.
+- Users can create a `workflow_by_code` Python script based on the [one](examples/workflow_by_code.py) listed in the `examples` folder.
+- Users can use the script [`run_all_model.py`](examples/run_all_model.py) listed in the `examples` folder to run a model. Here is an example of the shell command to use: `python run_all_model.py --models=lightgbm`. For more use cases, please refer to the file's [docstrings](examples/run_all_model.py).
+
+## Run multiple models
+`Qlib` also provides a script [`run_all_model.py`](examples/run_all_model.py) which can run multiple models for several iterations. (**Note**: the script only supports *Linux* for now. Other operating systems will be supported in the future.)
+
+The script will create a unique virtual environment for each model, and delete the environments after training. Thus, only experiment results such as `IC` and `backtest` results will be generated and stored.
+
+Here is an example of running all the models for 10 iterations:
+```bash
+python run_all_model.py 10
+```
+
+It also provides the API to run specific models at once. For more use cases, please refer to the file's [docstrings](examples/run_all_model.py).

 Here is a list of models built on `Qlib`.
 - [GBDT based on LightGBM](qlib/contrib/model/gbdt.py)
diff --git a/docs/component/data.rst b/docs/component/data.rst
index 3323211d6..aa01fe226 100644
--- a/docs/component/data.rst
+++ b/docs/component/data.rst
@@ -33,13 +33,19 @@ Such data will be stored with filename suffix `.bin` (We'll call them `.bin` fil

 Qlib Format Dataset
 --------------------
-``Qlib`` has provided an off-the-shelf dataset in `.bin` format, users could use the script ``scripts/get_data.py`` to download the dataset as follows.
+``Qlib`` has provided an off-the-shelf dataset in `.bin` format. Users can use the script ``scripts/get_data.py`` to download the China-Stock dataset as follows.

 .. code-block:: bash

     python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn

-After running the above command, users can find china-stock data in Qlib format in the ``~/.qlib/csv_data/cn_data`` directory.
+In addition to China-Stock data, ``Qlib`` also includes a US-Stock dataset, which can be downloaded with the following command:
+
+.. code-block:: bash
+
+    python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/us_data --region us
+
+After running the above commands, users can find the China-Stock and US-Stock data in Qlib format in the ``~/.qlib/qlib_data/cn_data`` and ``~/.qlib/qlib_data/us_data`` directories respectively.

 ``Qlib`` also provides the scripts in ``scripts/data_collector`` to help users crawl the latest data on the Internet and convert it to qlib format.

@@ -51,12 +57,45 @@ Converting CSV Format into Qlib Format

 ``Qlib`` has provided the script ``scripts/dump_bin.py`` to convert data in CSV format into `.bin` files (Qlib format).

-Users can download the china-stock data in CSV format as follows for reference to the CSV format.
+Users can download the demo china-stock data in CSV format as follows for reference to the CSV format.

 .. code-block:: bash

     python scripts/get_data.py csv_data_cn --target_dir ~/.qlib/csv_data/cn_data

+Users can also provide their own data in CSV format. However, the CSV data **must satisfy** the following criteria:
+
+- The CSV file is named after a specific stock *or* the CSV file includes a column for the stock name
+
+    - Name the CSV file after a stock: `SH600000.csv`, `AAPL.csv` (not case sensitive).
+
+    - The CSV file includes a column for the stock name. Users **must** specify the column name when dumping the data. Here is an example:
+
+        .. code-block:: bash
+
+            python scripts/dump_bin.py dump_all ... --symbol_field_name symbol
+
+        where the data are in the following format:
+
+        .. code-block::
+
+            symbol,close
+            SH600000,120
+
+- The CSV file **must** include a column for the date, and users must specify the date column name when dumping the data. Here is an example:
+
+    .. code-block:: bash
+
+        python scripts/dump_bin.py dump_all ... --date_field_name date
+
+    where the data are in the following format:
+
+    .. code-block::
+
+        symbol,date,close,open,volume
+        SH600000,2020-11-01,120,121,12300000
+        SH600000,2020-11-02,123,120,12300000
+

 Supposed that users prepare their CSV format data in the directory ``~/.qlib/csv_data/my_data``, they can run the following command to start the conversion.

@@ -64,6 +103,12 @@ Supposed that users prepare their CSV format data in the directory ``~/.qlib/csv

     python scripts/dump_bin.py dump_all --csv_path ~/.qlib/csv_data/my_data --qlib_dir ~/.qlib/qlib_data/my_data --include_fields open,close,high,low,volume,factor

+For other supported parameters when dumping the data into `.bin` files, users can run the following command:
+
+.. code-block:: bash
+
+    python scripts/dump_bin.py dump_all --help
+
 After conversion, users can find their Qlib format data in the directory `~/.qlib/qlib_data/my_data`.

 .. note::
@@ -99,9 +144,8 @@ China-Stock Mode & US-Stock Mode

         qlib.init(provider_uri='~/.qlib/qlib_data/cn_data', region=REG_CN)

-- If users use ``Qlib`` in US-stock mode, US-stock data is required. ``Qlib`` does not provide a script to download US-stock data. Users can use ``Qlib`` in US-stock mode according to the following steps:
-    - Prepare data in CSV format
-    - Convert data from CSV format to Qlib format, please refer to section `Converting CSV Format into Qlib Format <#converting-csv-format-into-qlib-format>`_.
+- If users use ``Qlib`` in US-stock mode, US-stock data is required. ``Qlib`` also provides a script to download US-stock data. Users can use ``Qlib`` in US-stock mode according to the following steps:
+    - Download US-stock data in Qlib format; please refer to section `Qlib Format Dataset <#qlib-format-dataset>`_.
     - Initialize ``Qlib`` in US-stock mode
         Supposed that users prepare their Qlib format data in the directory ``~/.qlib/csv_data/us_data``. Users only need to initialize ``Qlib`` as follows.
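A minimal sketch of a pre-flight check for the CSV criteria documented in the docs/component/data.rst hunks above, assuming the symbol and date columns are named `symbol` and `date` (the values passed to `--symbol_field_name` and `--date_field_name` in the documented commands). `missing_columns` and `write_demo_csv` are hypothetical helpers written for illustration; they are not part of Qlib and use only the Python standard library.

```python
import csv
import tempfile
from pathlib import Path


def missing_columns(csv_file, required=("symbol", "date")):
    """Return the required column names that are absent from the CSV header.

    Hypothetical helper, not a Qlib API. The default names are assumptions
    that mirror the --symbol_field_name / --date_field_name examples.
    """
    with open(csv_file, newline="") as f:
        header = next(csv.reader(f), [])
    return [name for name in required if name not in header]


def write_demo_csv(directory):
    """Write a two-row CSV in the layout shown in the documentation example."""
    rows = [
        {"symbol": "SH600000", "date": "2020-11-01", "close": 120, "open": 121, "volume": 12300000},
        {"symbol": "SH600000", "date": "2020-11-02", "close": 123, "open": 120, "volume": 12300000},
    ]
    # The file is also named after the stock, which satisfies the first criterion.
    path = Path(directory) / "SH600000.csv"
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = write_demo_csv(tmp)
        problems = missing_columns(demo)
        if problems:
            print("CSV is missing columns:", problems)
        else:
            print("CSV header looks compatible with the documented dump_bin.py flags")
```

Running a check like this before `scripts/dump_bin.py dump_all` may catch a missing date or symbol column early; it does not validate date formats or numeric fields.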
diff --git a/docs/start/initialization.rst b/docs/start/initialization.rst
index af89a098e..423d7edf8 100644
--- a/docs/start/initialization.rst
+++ b/docs/start/initialization.rst
@@ -12,14 +12,16 @@ Initialization

 Please follow the steps below to initialize ``Qlib``.

-- Download and prepare the Data: execute the following command to download stock data. Please pay `attention` that the data is collected from `Yahoo Finance `_ and the data might not be perfect. We recommend users to prepare their own data if they have high-quality datasets. Please refer to `Data <../component/data.html#converting-csv-format-into-qlib-format>` for more information about customized dataset.
+Download and prepare the data: execute the following command to download stock data. Please note that the data is collected from `Yahoo Finance `_ and might not be perfect. We recommend that users prepare their own data if they have a high-quality dataset. Please refer to `Data <../component/data.html#converting-csv-format-into-qlib-format>`_ for more information about customized datasets.
+
     .. code-block:: bash

         python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn

-    Please refer to `Data Preparation <../component/data.html#data-preparation>`_ for more information about `get_data.py`,
+
+Please refer to `Data Preparation <../component/data.html#data-preparation>`_ for more information about `get_data.py`.

-- Initialize Qlib before calling other APIs: run following code in python.
+Initialize Qlib before calling other APIs: run the following code in Python.

     .. code-block:: Python