Alpha Lab

Quantitative research experiments for the qshare library. This repository contains Jupyter notebooks and analysis scripts for exploring trading strategies and machine learning models.

Philosophy

  • Notebook-centric: Experiments are interactive notebooks, not rigid scripts
  • Minimal abstraction: Simple functions over complex class hierarchies
  • Self-contained: Each task directory is independent
  • Ad-hoc friendly: Easy to modify for exploration

Structure

alpha_lab/
├── common/              # Shared utilities (keep minimal!)
│   ├── __init__.py
│   ├── paths.py         # Path management
│   └── plotting.py      # Common plotting functions
│
├── cta_1d/             # CTA 1-day return prediction
│   ├── __init__.py     # Re-exports from src/
│   ├── config.yaml     # Task configuration
│   ├── src/            # Implementation modules
│   │   ├── __init__.py
│   │   ├── loader.py   # CTA1DLoader
│   │   ├── train.py    # Training functions
│   │   ├── backtest.py # Backtest functions
│   │   └── labels.py   # Label blending utilities
│   ├── 01_data_check.ipynb
│   ├── 02_label_analysis.ipynb
│   ├── 03_baseline_xgb.ipynb
│   └── 04_blend_comparison.ipynb
│
├── stock_15m/          # Stock 15-minute return prediction
│   ├── __init__.py     # Re-exports from src/
│   ├── config.yaml     # Task configuration
│   ├── src/            # Implementation modules
│   │   ├── __init__.py
│   │   ├── loader.py   # Stock15mLoader
│   │   └── train.py    # Training functions
│   ├── 01_data_exploration.ipynb
│   └── 02_baseline_model.ipynb
│
└── results/            # Output directory (gitignored)
    ├── cta_1d/
    └── stock_15m/
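
The common/ package is meant to stay minimal. As an illustration, a path helper of the kind paths.py might contain could look like this (task_results_dir is a hypothetical name, not the module's documented API):

```python
from pathlib import Path

def task_results_dir(task: str, repo_root: Path = Path(".")) -> Path:
    """Return (and create) results/<task>/ under repo_root.

    Hypothetical sketch of a common/paths.py-style helper; the real
    module's contents are not shown in this README.
    """
    d = Path(repo_root) / "results" / task
    d.mkdir(parents=True, exist_ok=True)
    return d
```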

Setup

# Install dependencies
pip install -r requirements.txt

# Create environment file
cp .env.template .env
# Edit .env with your settings

Usage

Interactive (Notebooks)

Start Jupyter and run notebooks interactively:

jupyter notebook

Each task directory contains numbered notebooks:

  • 01_*.ipynb - Data loading and exploration
  • 02_*.ipynb - Analysis and baseline models
  • 03_*.ipynb - Advanced experiments
  • 04_*.ipynb - Comparisons and ablations

Command Line

Train models from config files:

# CTA 1D
python -m cta_1d.train --config cta_1d/config.yaml --output results/cta_1d/exp01

# Stock 15m
python -m stock_15m.train --config stock_15m/config.yaml --output results/stock_15m/exp01

# CTA Backtest
python -m cta_1d.backtest \
    --model results/cta_1d/exp01/model.json \
    --dt-range 2023-01-01 2023-12-31 \
    --output results/cta_1d/backtest_01

Python API

# Import from task root (re-exports from src/)
from cta_1d import CTA1DLoader, train_model, TrainConfig
from stock_15m import Stock15mLoader, train_model, TrainConfig
from common import create_experiment_dir

Experiment Tracking

Experiments are tracked manually in results/{task}/README.md:

## 2025-01-15: Baseline XGB
- Notebook: `cta_1d/03_baseline_xgb.ipynb` (cells 1-50)
- Config: eta=0.5, lambda=0.1
- Train IC: 0.042
- Test IC: 0.038
- Notes: Dual normalization, 4 trades/day
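
A tiny helper can keep these entries consistent. log_experiment below is a hypothetical function, not part of common/ — just a sketch that appends an entry in the format above:

```python
from datetime import date
from pathlib import Path

def log_experiment(results_dir: Path, title: str, fields: dict) -> str:
    """Append a manual tracking entry to <results_dir>/README.md.

    Hypothetical helper; the repo itself tracks experiments by hand.
    """
    lines = [f"\n## {date.today().isoformat()}: {title}"]
    lines += [f"- {key}: {value}" for key, value in fields.items()]
    entry = "\n".join(lines) + "\n"
    readme = Path(results_dir) / "README.md"
    readme.parent.mkdir(parents=True, exist_ok=True)
    with readme.open("a") as f:
        f.write(entry)
    return entry
```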

Adding a New Task

  1. Create directory: mkdir my_task
  2. Add src/ subdirectory with:
    • __init__.py - Export public APIs
    • loader.py - Dataset loader class
    • Other modules as needed
  3. Add root __init__.py that re-exports from src/
  4. Create numbered notebooks
  5. Add entry to results/my_task/README.md
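
The re-export in step 3 is what keeps imports short. The sketch below builds a throwaway my_task/ package on disk to demonstrate the pattern; MyTaskLoader is a placeholder name, not an existing class:

```python
import importlib
import pathlib
import sys
import tempfile

# Build a throwaway package that mirrors the task layout above.
root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "my_task"
(pkg / "src").mkdir(parents=True)
(pkg / "src" / "__init__.py").write_text("")
(pkg / "src" / "loader.py").write_text(
    "class MyTaskLoader:\n"
    "    def load(self):\n"
    "        return 'data'\n"
)
# Root __init__.py re-exports the public API from src/.
(pkg / "__init__.py").write_text(
    "from .src.loader import MyTaskLoader\n"
    "__all__ = ['MyTaskLoader']\n"
)

sys.path.insert(0, str(root))
my_task = importlib.import_module("my_task")
print(my_task.MyTaskLoader().load())  # prints: data
```

With this layout, callers write `from my_task import MyTaskLoader` and never need to know about src/.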

Best Practices

  1. Keep it simple: Only move code into common/ after it has been copied 3+ times
  2. Module organization: Place implementation in src/, re-export from root __init__.py
  3. Notebook configs: Define CONFIG dict in first cell for easy modification
  4. Document results: Update results README after significant runs
  5. Git discipline: Don't commit large files, results, or credentials
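
Practice 3 can look like the cell below; every key and value here is illustrative, not the repo's actual config schema:

```python
# First notebook cell: one CONFIG dict, easy to tweak per run.
# All keys and values are illustrative examples only.
CONFIG = {
    "dt_range": ("2023-01-01", "2023-12-31"),  # train/test window
    "features": ["mom_20", "vol_60"],          # feature columns
    "model": {"eta": 0.5, "lambda": 0.1},      # model hyperparameters
    "output_dir": "results/cta_1d/exp01",      # gitignored results path
}
```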