Major changes:
- Fix FixedFlagMarketInjector to add market_0 and market_1 columns based on instrument codes
- Fix FixedFlagSTInjector to create the IsST column from the ST_S and ST_Y flags
- Update generate_beta_embedding.py to handle IsST creation conditionally
- Add dump_polars_dataset.py for generating raw and processed datasets
- Add debug_data_divergence.py for comparing gold-standard vs polars output
Documentation:
- Update BUG_ANALYSIS_FINAL.md with IsST column issue discovery
- Update README.md with polars dataset generation instructions
Key discovery:
- The FlagSTInjector in the gold-standard qlib code fails silently
- As a result, the VAE was trained without the IsST column (341 features, not 342)
- The polars pipeline correctly skips FlagSTInjector to match the gold-standard behavior
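
A small sketch of the feature-count consequence, assuming the one column excluded from the VAE input is IsST. The helper select_vae_inputs and the placeholder names feature_0 to feature_340 are hypothetical, not the real schema.

```python
def select_vae_inputs(columns, excluded=("IsST",)):
    """Return the columns fed to the VAE, preserving their order.

    Hypothetical helper: the exclusion list is an assumption based on the
    341-vs-342 feature counts reported in this message.
    """
    skip = set(excluded)
    return [name for name in columns if name not in skip]


# Placeholder schema: 341 stand-in feature names plus IsST.
processed_columns = [f"feature_{i}" for i in range(341)] + ["IsST"]
vae_inputs = select_vae_inputs(processed_columns)

# Fail loudly if the width drifts from what the VAE was trained on.
if len(vae_inputs) != 341:
    raise ValueError(f"expected 341 VAE features, got {len(vae_inputs)}")
```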
Generated dataset structure (2026-02-23 to 2026-02-27):
- Raw data: 18,291 rows × 204 columns
- Processed data: 18,291 rows × 342 columns (341 used as VAE input)
- market_0, market_1 columns correctly added to feature_flag group