Data Collector
Introduction
Scripts for data collection
- yahoo: get US/CN stock data from Yahoo Finance
- fund: get fund data from http://fund.eastmoney.com
- cn_index: get CN index from http://www.csindex.com.cn, CSI300/CSI100
- us_index: get US index from https://en.wikipedia.org/wiki, SP500/NASDAQ100/DJIA/SP400
- contrib: scripts for some auxiliary functions
Custom Data Collection
Specific implementation reference: https://github.com/microsoft/qlib/tree/main/scripts/data_collector/yahoo
- Create a dataset code directory in the current directory
- Add `collector.py`
  - add collector class:

        import sys
        from pathlib import Path

        # Put the directory two levels up on sys.path so that
        # `data_collector.base` can be imported
        CUR_DIR = Path(__file__).resolve().parent
        sys.path.append(str(CUR_DIR.parent.parent))

        from data_collector.base import BaseCollector, BaseNormalize, BaseRun

        class UserCollector(BaseCollector):
            ...

  - add normalize class:

        class UserNormalize(BaseNormalize):
            ...

  - add `CLI` class:

        class Run(BaseRun):
            ...

- Add `README.md`
- Add `requirements.txt`
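
The steps above can be sketched as a small scaffolding script. This is a minimal sketch, not part of the repository: `scaffold_collector` and the README placeholder text are hypothetical, and only the three file names and the class skeleton come from the steps above.

```python
from pathlib import Path

# Skeleton written to collector.py; mirrors the class stubs described above.
COLLECTOR_TEMPLATE = '''import sys
from pathlib import Path

CUR_DIR = Path(__file__).resolve().parent
sys.path.append(str(CUR_DIR.parent.parent))

from data_collector.base import BaseCollector, BaseNormalize, BaseRun


class UserCollector(BaseCollector):
    ...


class UserNormalize(BaseNormalize):
    ...


class Run(BaseRun):
    ...
'''


def scaffold_collector(parent: Path, name: str) -> Path:
    """Create <parent>/<name>/ with collector.py, README.md and requirements.txt.

    Hypothetical helper: existing files are left untouched.
    """
    target = parent / name
    target.mkdir(parents=True, exist_ok=True)
    files = {
        "collector.py": COLLECTOR_TEMPLATE,
        "README.md": f"# {name} data collector\n",  # placeholder text (assumption)
        "requirements.txt": "",
    }
    for filename, content in files.items():
        path = target / filename
        if not path.exists():  # never overwrite work in progress
            path.write_text(content, encoding="utf-8")
    return target
```

Calling `scaffold_collector(Path.cwd(), "my_source")` from `scripts/data_collector` would create a `my_source` directory next to the existing collectors, so the `sys.path` line in the template resolves `data_collector.base`.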
Description of dataset
| Basic data | |
|---|---|
| Features | Price/Volume: $close/$open/$low/$high/$volume/$change/$factor |
| Calendar | <freq>.txt: day.txt, 1min.txt |
| Instruments | <market>.txt: all.txt (required); csi300.txt/csi500.txt/sp500.txt |
- Features: numeric data
  - if the prices are not adjusted, factor=1
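
The role of `$factor` and `$change` can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions, not the collectors' actual code: it assumes the factor is the ratio of adjusted close to raw close, and that the change is the close-to-close return. The function names are hypothetical.

```python
from typing import Dict, List, Optional


def adjustment_factor(raw_close: float, adjusted_close: float) -> float:
    """Assumed definition: factor = adjusted close / raw close."""
    return adjusted_close / raw_close


def apply_factor(raw_prices: Dict[str, float], factor: float) -> Dict[str, float]:
    """Scale every price field by the factor; factor=1 leaves prices unchanged."""
    return {field: price * factor for field, price in raw_prices.items()}


def daily_change(closes: List[float]) -> List[Optional[float]]:
    """Assumed definition: change[t] = close[t] / close[t-1] - 1.

    The first entry is None because it has no previous close.
    """
    changes: List[Optional[float]] = [None]
    for prev, cur in zip(closes, closes[1:]):
        changes.append(cur / prev - 1)
    return changes
```

With unadjusted data the factor is 1, so `apply_factor(prices, 1.0)` returns the prices as they were.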
Data-dependent component
For each component to run correctly, the data it depends on must be available:
| Component | Required data |
|---|---|
| Data retrieval | Features, Calendar, Instruments |
| Backtest | Features[Price/Volume], Calendar, Instruments |
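
The dependency table can be turned into a quick pre-flight check. The sketch below is an assumption-laden illustration: it assumes the dataset lives in one directory with `features`, `calendars` and `instruments` subdirectories, and the names `REQUIRED_DATA` and `missing_data` are hypothetical.

```python
from pathlib import Path
from typing import List

# Component -> required subdirectories, following the table above.
# The subdirectory names are an assumption about the on-disk layout.
REQUIRED_DATA = {
    "data_retrieval": ("features", "calendars", "instruments"),
    "backtest": ("features", "calendars", "instruments"),
}


def missing_data(data_dir: Path, component: str) -> List[str]:
    """Return the required subdirectories that are absent or empty."""
    missing = []
    for name in REQUIRED_DATA[component]:
        sub = data_dir / name
        if not sub.is_dir() or not any(sub.iterdir()):
            missing.append(name)
    return missing
```

An empty list from `missing_data(Path("path/to/data"), "backtest")` means every directory the backtest depends on exists and contains at least one entry; it does not validate the files themselves.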