mirror of https://github.com/microsoft/qlib.git synced 2026-06-06 05:51:17 +08:00
qlib/scripts/data_collector
Pengrong Zhu 2aee9e0145 Add future calendar collector (#795)
* fix Windows mount

* add future_calendar_collector

* update docs

Co-authored-by: Young <afe.young@gmail.com>
Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
2022-01-16 10:14:27 +08:00

Data Collector

Introduction

Scripts for data collection

Custom Data Collection

Specific implementation reference: https://github.com/microsoft/qlib/tree/main/scripts/data_collector/yahoo

  1. Create a dataset code directory in the current directory
  2. Add collector.py
    • add collector class:
      import sys
      from pathlib import Path

      CUR_DIR = Path(__file__).resolve().parent
      # Make the data_collector package importable from this script
      sys.path.append(str(CUR_DIR.parent.parent))
      from data_collector.base import BaseCollector, BaseNormalize, BaseRun

      class UserCollector(BaseCollector):
          ...
      
    • add normalize class:
      class UserNormalize(BaseNormalize):
          ...
      
    • add CLI class:
      class Run(BaseRun):
          ...
      
  3. Add README.md
  4. Add requirements.txt
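
The four steps above can also be scripted. The following is a minimal sketch, not part of qlib: `scaffold_collector` and the example directory name `my_dataset` are hypothetical, and the generated collector.py contains only the class stubs from step 2, which still have to be filled in.

```python
from pathlib import Path

# Stub written to collector.py; it mirrors the class outline in step 2.
COLLECTOR_STUB = '''\
import sys
from pathlib import Path

CUR_DIR = Path(__file__).resolve().parent
sys.path.append(str(CUR_DIR.parent.parent))

from data_collector.base import BaseCollector, BaseNormalize, BaseRun


class UserCollector(BaseCollector):
    ...


class UserNormalize(BaseNormalize):
    ...


class Run(BaseRun):
    ...
'''


def scaffold_collector(parent_dir, name):
    """Create <parent_dir>/<name> with the files from steps 2-4 (hypothetical helper)."""
    target = Path(parent_dir) / name
    target.mkdir(parents=True, exist_ok=True)
    files = {
        "collector.py": COLLECTOR_STUB,
        "README.md": f"# {name}\n",
        "requirements.txt": "",
    }
    for filename, content in files.items():
        path = target / filename
        if not path.exists():  # never overwrite existing work
            path.write_text(content, encoding="utf-8")
    return target
```

Calling `scaffold_collector(".", "my_dataset")` from scripts/data_collector would create the directory next to the existing collectors; existing files are left untouched.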

Description of dataset

Basic data

Features       Price/Volume:
                 - $close/$open/$low/$high/$volume/$change/$factor
Calendar       <freq>.txt:
                 - day.txt
                 - 1min.txt
Instruments    <market>.txt:
                 - required: all.txt
                 - csi300.txt/csi500.txt/sp500.txt
  • Features: the data must be numeric
    • if the prices are not adjusted, set factor=1
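
As an illustration of the two notes above, the sketch below converts one raw record to numeric values and sets the factor to 1 when the prices are not adjusted. It is not qlib code: `to_numeric_record` is a hypothetical helper, and the plain field names (without the `$` prefix) and the string-valued input are assumptions about how the raw data arrives.

```python
FEATURE_FIELDS = ("close", "open", "low", "high", "volume", "change", "factor")


def to_numeric_record(raw, adjusted):
    """Return a dict of floats for the feature fields (hypothetical helper).

    raw: mapping of field name -> value, e.g. a row from csv.DictReader.
    adjusted: whether the prices in `raw` are already adjusted.
    """
    record = {}
    for field in FEATURE_FIELDS:
        if field == "factor" and not adjusted:
            # Unadjusted prices: the factor is fixed at 1.
            record[field] = 1.0
            continue
        if field not in raw:
            raise KeyError(f"missing feature field: {field}")
        # float() raises ValueError for non-numeric text, so bad data surfaces early.
        record[field] = float(raw[field])
    return record
```

With `adjusted=False` the input does not need a factor column; with `adjusted=True` a missing factor is reported as an error instead of being silently defaulted.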

Data-dependent component

Each component needs the following data to run correctly:

Component         Required data
Data retrieval    Features, Calendar, Instruments
Backtest          Features[Price/Volume], Calendar, Instruments
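
The table above can be read as a lookup from component to required data. A minimal sketch, assuming the available data is described by a set of the names used in the table; `REQUIRED_DATA` and `missing_data` are hypothetical names, not part of qlib:

```python
# Required data per component, copied from the table above.
REQUIRED_DATA = {
    "Data retrieval": {"Features", "Calendar", "Instruments"},
    "Backtest": {"Features[Price/Volume]", "Calendar", "Instruments"},
}


def missing_data(component, available):
    """Return the sorted list of required data that `available` lacks."""
    return sorted(REQUIRED_DATA[component] - set(available))
```

For example, `missing_data("Backtest", {"Calendar", "Instruments"})` returns `["Features[Price/Volume]"]`, and an empty list means the component has everything it needs.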