mirror of https://github.com/microsoft/qlib.git synced 2026-07-01 18:11:18 +08:00
qlib/scripts/data_collector/pit/README.md
bxdd faa99f30fa Support Point-in-time Data Operation (#343)

Co-authored-by: Young <afe.young@gmail.com>
2022-03-10 14:27:52 +08:00

Collect Point-in-Time Data

Please note that the data is collected from baostock and might not be perfect. We recommend that users prepare their own data if they have a high-quality dataset. For more information, refer to the related document.

Requirements

pip install -r requirements.txt

Collect Data

Download Quarterly CN Data

cd qlib/scripts/data_collector/pit/
# download from baostock.com
python collector.py download_data --source_dir ./csv_pit --start 2000-01-01 --end 2020-01-01 --interval quarterly

Downloading data for all stocks is very time-consuming. If you just want to run a quick test on a few stocks, you can run the command below:

python collector.py download_data --source_dir ./csv_pit --start 2000-01-01 --end 2020-01-01 --interval quarterly --symbol_flt_regx "^(600519|000725).*"
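To preview which symbols a filter like the one above would keep, the pattern can be tried against a few sample strings. This is only a sketch: the symbol strings below are hypothetical, and it assumes the collector applies the pattern from the start of each symbol, which the leading `^` suggests but this README does not state.

```python
import re

# Pattern copied from the --symbol_flt_regx argument above.
pattern = re.compile(r"^(600519|000725).*")

# Hypothetical symbol strings; the collector's actual symbol format may differ.
candidates = ["600519.ss", "000725.sz", "600000.ss", "300750.sz"]

# Keep only the symbols whose beginning matches the pattern.
selected = [symbol for symbol in candidates if pattern.match(symbol)]
print(selected)
```

Because the pattern is anchored with `^`, a symbol that carries an exchange prefix (for example `sh600519`) would not match; adjust the expression if your symbols look like that.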

Dump Data into PIT Format

cd qlib/scripts
# data_collector/pit/csv_pit is the directory containing the data you just downloaded.
python dump_pit.py dump --csv_path data_collector/pit/csv_pit --qlib_dir ~/.qlib/qlib_data/cn_data --interval quarterly
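For readers new to the term, the idea behind point-in-time data can be sketched in a few lines of plain Python. This is not Qlib's storage format or algorithm; the record layout, the values, and the helper `value_as_of` are all hypothetical and exist only to show why each value is stored together with the date on which it was published.

```python
from datetime import date

# Hypothetical records: (publication date, fiscal period as YYYYQQ, reported value).
records = [
    (date(2019, 4, 25), 201901, 0.10),
    (date(2019, 8, 20), 201902, 0.18),
    (date(2019, 10, 30), 201902, 0.17),  # later revision of the 2019 Q2 figure
    (date(2019, 10, 30), 201903, 0.25),
]


def value_as_of(records, period, as_of):
    """Return the most recently published value for `period` that was
    already public on `as_of`, or None if nothing had been published yet."""
    known = [(pub, val) for pub, p, val in records if p == period and pub <= as_of]
    if not known:
        return None
    # The latest publication date wins, so revisions replace earlier figures.
    return max(known, key=lambda item: item[0])[1]


print(value_as_of(records, 201902, date(2019, 9, 1)))   # before the revision
print(value_as_of(records, 201902, date(2019, 11, 1)))  # after the revision
print(value_as_of(records, 201902, date(2019, 8, 1)))   # before first publication
```

A query dated before a revision sees the original figure, and a query dated before the first publication sees nothing, which is what keeps future information out of a backtest.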