iBOVESPA History Companies Collection
Requirements
- Install the libs from the file requirements.txt:
  pip install -r requirements.txt
- The requirements.txt file was generated using python3.8.
For the ibovespa (IBOV) index, we have:
Method get_new_companies
Index start date
- The ibovespa index started on 2 January 1968 (wiki). In order to use this start date in our bench_start_date(self) method, two conditions must be satisfied:
  - The APIs used to download Brazilian stocks (B3) historical prices must keep track of such historical data since 2 January 1968.
  - Some website or API must provide the historical index composition from that date onward, in other words, the companies used to build the index.
  As a consequence, the method bench_start_date(self) inside collector.py was implemented using pd.Timestamp("2003-01-03"), for two reasons:
  - The earliest ibov composition that has been found is from the first quarter of 2003. More information about this composition can be found in the sections below.
  - Yahoo Finance, one of the libraries used to download the symbols' historical prices, keeps track from this date forward.
- Within the get_new_companies method, logic was implemented to get, for each ibovespa component stock, the start date from which Yahoo Finance keeps track of it.
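The per-stock start date lookup can be illustrated with a minimal sketch. It assumes the price history has already been downloaded and is available as (date, close) pairs per symbol. The helper name first_tracked_date, the BENCH_START_DATE constant, and the sample symbols and prices are all hypothetical and are not taken from collector.py.

```python
from datetime import date

# Hypothetical floor, mirroring the pd.Timestamp("2003-01-03") value used
# by bench_start_date(self); an assumption made for illustration only.
BENCH_START_DATE = date(2003, 1, 3)


def first_tracked_date(rows, floor=BENCH_START_DATE):
    """Return the earliest date with a usable price, never earlier than `floor`.

    `rows` is an iterable of (date, close) pairs; a close of None marks a gap.
    Returns None when no usable price exists.
    """
    valid = [day for day, close in rows if close is not None]
    if not valid:
        return None
    return max(min(valid), floor)


# Made-up symbols and prices, not real market data.
history = {
    "AAAA3": [(date(2001, 5, 2), 10.0), (date(2003, 1, 6), 11.0)],
    "BBBB4": [(date(2010, 3, 1), None), (date(2010, 3, 2), 7.5)],
    "CCCC3": [],
}

start_dates = {symbol: first_tracked_date(rows) for symbol, rows in history.items()}
```

Clamping to the benchmark start date keeps a stock with older price history from starting before the earliest known index composition.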
Code Logic
The code scrapes B3's website, which publishes the ibovespa stock composition for the current day.
Other approaches, such as requests and Beautiful Soup, could have been used. However, the website shows the table of stocks with some delay, since it uses a script inside the page to obtain the composition.
To overcome this problem, selenium was used instead to download the stock composition.
Furthermore, the data downloaded by the selenium script was preprocessed so it could be saved in the csv format established by scripts/data_collector/index.py.
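The preprocessing step might look roughly like the sketch below. It assumes a tab-separated layout with three columns (symbol, start date, end date) and no header. That layout is an assumption and should be checked against scripts/data_collector/index.py. The function names, the sample ticker strings, and the placeholder dates are hypothetical.

```python
import csv
import io


def to_instrument_rows(scraped_codes, start_date, end_date):
    """Normalise raw ticker strings and attach a date range to each one.

    Blank entries and duplicates (after normalisation) are dropped.
    """
    seen = set()
    rows = []
    for raw in scraped_codes:
        symbol = raw.strip().upper()
        if not symbol or symbol in seen:
            continue
        seen.add(symbol)
        rows.append((symbol, start_date, end_date))
    return rows


def write_rows(rows, handle):
    """Write rows as tab-separated text without a header (assumed layout)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


# Illustrative input, standing in for cell text read from the scraped table.
scraped = [" petr4 ", "VALE3", "", "vale3"]
rows = to_instrument_rows(scraped, "2003-01-03", "2099-12-31")

buffer = io.StringIO()
write_rows(rows, buffer)
output = buffer.getvalue()
```

Writing to a file handle instead of a fixed path keeps the sketch easy to test. A real run would pass an open file in the instruments directory.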
Method get_changes
No suitable data source that keeps track of ibovespa's historical stock composition has been found, except for this repository, which provides that information and has been used here. However, it only covers the period from the 1st quarter of 2003 to the 3rd quarter of 2021.
With that reference, the index's composition can be compared quarter by quarter and year by year to generate a file that keeps track of which stocks were removed and which were added in each quarter and year.
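The period-by-period comparison can be sketched with set differences. This is a minimal illustration, not the code in collector.py. The function name composition_changes, the period labels, and the symbols are hypothetical, and the labels are assumed to sort chronologically as plain strings.

```python
def composition_changes(compositions):
    """Compare consecutive periods and list the symbols added and removed.

    `compositions` maps a period label to an iterable of symbols. Labels are
    assumed to sort chronologically (for example "2003_1Q" < "2003_2Q").
    Returns a list of (period, "add" | "remove", symbol) tuples.
    """
    changes = []
    periods = sorted(compositions)
    for previous, current in zip(periods, periods[1:]):
        before = set(compositions[previous])
        after = set(compositions[current])
        for symbol in sorted(after - before):
            changes.append((current, "add", symbol))
        for symbol in sorted(before - after):
            changes.append((current, "remove", symbol))
    return changes


# Made-up compositions, not the real index membership.
compositions = {
    "2003_1Q": ["AAAA3", "BBBB4"],
    "2003_2Q": ["AAAA3", "CCCC3"],
    "2003_3Q": ["AAAA3", "CCCC3", "BBBB4"],
}

changes = composition_changes(compositions)
```

Each change is attributed to the later period of the pair, so a stock that leaves and later returns appears once as a removal and once as an addition.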
Collector Data
# parse instruments, used in qlib/instruments.
python collector.py --index_name IBOV --qlib_dir ~/.qlib/qlib_data/br_data --method parse_instruments
# parse new companies
python collector.py --index_name IBOV --qlib_dir ~/.qlib/qlib_data/br_data --method save_new_companies