# Alpha Lab

Quantitative research experiments for the qshare library. This repository contains Jupyter notebooks and analysis scripts for exploring trading strategies and machine learning models.

## Philosophy

- **Notebook-centric**: Experiments are interactive notebooks, not rigid scripts
- **Minimal abstraction**: Simple functions over complex class hierarchies
- **Self-contained**: Each task directory is independent
- **Ad-hoc friendly**: Easy to modify for exploration

## Structure

```
alpha_lab/
├── common/                      # Shared utilities (keep minimal!)
│   ├── __init__.py
│   ├── paths.py                 # Path management
│   └── plotting.py              # Common plotting functions
│
├── cta_1d/                      # CTA 1-day return prediction
│   ├── __init__.py              # Re-exports from src/
│   ├── config.yaml              # Task configuration
│   ├── src/                     # Implementation modules
│   │   ├── __init__.py
│   │   ├── loader.py            # CTA1DLoader
│   │   ├── train.py             # Training functions
│   │   ├── backtest.py          # Backtest functions
│   │   └── labels.py            # Label blending utilities
│   ├── 01_data_check.ipynb
│   ├── 02_label_analysis.ipynb
│   ├── 03_baseline_xgb.ipynb
│   └── 04_blend_comparison.ipynb
│
├── stock_15m/                   # Stock 15-minute return prediction
│   ├── __init__.py              # Re-exports from src/
│   ├── config.yaml              # Task configuration
│   ├── src/                     # Implementation modules
│   │   ├── __init__.py
│   │   ├── loader.py            # Stock15mLoader
│   │   └── train.py             # Training functions
│   ├── 01_data_exploration.ipynb
│   └── 02_baseline_model.ipynb
│
└── results/                     # Output directory (gitignored)
    ├── cta_1d/
    └── stock_15m/
```

## Setup

```bash
# Install dependencies
pip install -r requirements.txt

# Create environment file
cp .env.template .env
# Edit .env with your settings
```

## Usage

### Interactive (Notebooks)

Start Jupyter and run notebooks interactively:

```bash
jupyter notebook
```

Each task directory contains numbered notebooks:
- `01_*.ipynb` - Data loading and exploration
- `02_*.ipynb` - Analysis and baseline models
- `03_*.ipynb` - Advanced experiments
- `04_*.ipynb` - Comparisons and ablations

### Command Line

Train models from config files, or backtest a trained model:

```bash
# CTA 1D
python -m cta_1d.train --config cta_1d/config.yaml --output results/cta_1d/exp01

# Stock 15m
python -m stock_15m.train --config stock_15m/config.yaml --output results/stock_15m/exp01

# CTA Backtest
python -m cta_1d.backtest \
    --model results/cta_1d/exp01/model.json \
    --dt-range 2023-01-01 2023-12-31 \
    --output results/cta_1d/backtest_01
```
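
The backtest command above takes a model path, a two-value date range, and an output directory. The following is a minimal sketch of how such flags could be parsed with `argparse`. The function names `build_parser` and `parse_backtest_args`, the help strings, and the start/end check are illustrative assumptions, not the repository's actual code.

```python
import argparse
from datetime import date
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags of the backtest command."""
    parser = argparse.ArgumentParser(prog="cta_1d.backtest")
    parser.add_argument("--model", type=Path, required=True,
                        help="Path to a trained model file")
    # nargs=2 collects START and END into a two-element list;
    # date.fromisoformat converts each YYYY-MM-DD string to a date.
    parser.add_argument("--dt-range", nargs=2, type=date.fromisoformat,
                        metavar=("START", "END"), required=True,
                        help="Date range, e.g. 2023-01-01 2023-12-31")
    parser.add_argument("--output", type=Path, required=True,
                        help="Directory for backtest results")
    return parser


def parse_backtest_args(argv=None) -> argparse.Namespace:
    """Parse argv (defaults to sys.argv[1:]) and reject a reversed date range."""
    args = build_parser().parse_args(argv)
    start, end = args.dt_range
    if start > end:
        raise ValueError("--dt-range START must not be after END")
    return args
```

Converting the dates at parse time means a malformed date fails immediately with a usage message instead of surfacing later inside the backtest.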

### Python API

```python
# Import from task root (re-exports from src/)
from cta_1d import CTA1DLoader, train_model, TrainConfig

# stock_15m exports the same names; alias them when using both tasks in one session
from stock_15m import Stock15mLoader
from stock_15m import train_model as train_stock_model, TrainConfig as StockTrainConfig

from common import create_experiment_dir
```
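
This README does not show what `create_experiment_dir` does. As a rough sketch only, a helper of that name could create `results/<task>/<name>/`, matching the output paths used in the command-line examples. The signature, the `root` parameter, and the refusal to reuse an existing directory are all assumptions.

```python
from pathlib import Path


def create_experiment_dir(task: str, name: str, root: Path = Path("results")) -> Path:
    """Create <root>/<task>/<name>/ and return its path.

    Hypothetical signature: the real helper in common/ may differ.
    """
    exp_dir = Path(root) / task / name
    # exist_ok=False so an earlier experiment is never silently overwritten
    exp_dir.mkdir(parents=True, exist_ok=False)
    return exp_dir
```

Under these assumptions, `create_experiment_dir("cta_1d", "exp01")` would create `results/cta_1d/exp01` and raise `FileExistsError` if that experiment directory already exists.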

## Experiment Tracking

Experiments are tracked manually in `results/{task}/README.md`:

```markdown
## 2025-01-15: Baseline XGB
- Notebook: `cta_1d/03_baseline_xgb.ipynb` (cells 1-50)
- Config: eta=0.5, lambda=0.1
- Train IC: 0.042
- Test IC: 0.038
- Notes: Dual normalization, 4 trades/day
```
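
Tracking is manual, but the entry format above is regular enough to generate. The sketch below appends one entry to a results README. The helper name `append_result_entry` and its parameters are hypothetical and not part of this repository.

```python
from datetime import date
from pathlib import Path


def append_result_entry(readme: Path, day: date, title: str, fields: dict) -> str:
    """Append one entry in the format shown above and return the text written."""
    lines = [f"## {day.isoformat()}: {title}"]
    lines += [f"- {key}: {value}" for key, value in fields.items()]
    entry = "\n".join(lines) + "\n"

    readme.parent.mkdir(parents=True, exist_ok=True)
    # Separate the new entry from any earlier one with a blank line
    prefix = "\n" if readme.exists() and readme.stat().st_size > 0 else ""
    with readme.open("a", encoding="utf-8") as fh:
        fh.write(prefix + entry)
    return entry
```

A call such as `append_result_entry(Path("results/cta_1d/README.md"), date(2025, 1, 15), "Baseline XGB", {"Train IC": 0.042, "Test IC": 0.038})` would add a dated heading followed by one bullet per field.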

## Adding a New Task

1. Create directory: `mkdir my_task`
2. Add `src/` subdirectory with:
   - `__init__.py` - Export public APIs
   - `loader.py` - Dataset loader class
   - Other modules as needed
3. Add root `__init__.py` that re-exports from `src/`
4. Create numbered notebooks
5. Add entry to `results/my_task/README.md`

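
Steps 1 to 3 and 5 above are mechanical, so they can be scripted. The following is a minimal sketch. The function name `scaffold_task`, the placeholder file contents, and the use of `from .src import *` for the re-export are assumptions; the existing tasks may re-export specific names instead.

```python
from pathlib import Path


def scaffold_task(name: str, repo_root: Path = Path(".")) -> Path:
    """Create the skeleton for a new task directory (hypothetical helper)."""
    task_dir = Path(repo_root) / name
    src_dir = task_dir / "src"
    # Fails if the task already exists, so nothing is overwritten
    src_dir.mkdir(parents=True, exist_ok=False)

    # src/__init__.py will export the public API once modules exist
    (src_dir / "__init__.py").write_text('"""Public API for this task."""\n', encoding="utf-8")
    (src_dir / "loader.py").write_text('"""Dataset loader for this task."""\n', encoding="utf-8")
    # The root __init__.py re-exports whatever src/ exposes
    (task_dir / "__init__.py").write_text("from .src import *  # noqa: F401,F403\n", encoding="utf-8")

    # Results README that will hold the experiment entries
    results_readme = Path(repo_root) / "results" / name / "README.md"
    results_readme.parent.mkdir(parents=True, exist_ok=True)
    if not results_readme.exists():
        results_readme.write_text(f"# {name} results\n", encoding="utf-8")
    return task_dir
```

Running `scaffold_task("my_task")` from the repository root would leave only step 4, creating the numbered notebooks, to be done by hand.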

## Git Worktrees

This repository uses git worktrees for parallel experiment development:

| Worktree | Branch | Purpose |
|----------|--------|---------|
| `alpha_lab` | `master` | Main repo (reference) |
| `alpha_lab_cta_1d` | `cta_1d_exp` | CTA 1-day experiments |
| `alpha_lab_stock_1d` | `stock_1d_exp` | Stock 1-day experiments |
| `alpha_lab_stock_15m` | `stock_15m_exp` | Stock 15-min experiments |
| `alpha_lab_data_ops` | `data_ops_exp` | Data ops research |

```bash
# Create a new worktree
git worktree add ../alpha_lab_new_exp -b new_exp

# List all worktrees
git worktree list

# Remove a worktree when done
git worktree remove ../alpha_lab_new_exp
```

## Best Practices

1. **Keep it simple**: Only move code into `common/` once it has been copied three or more times
2. **Module organization**: Place implementation in `src/`, re-export from root `__init__.py`
3. **Notebook configs**: Define a `CONFIG` dict in the first cell so parameters are easy to modify
4. **Document results**: Update the results README after significant runs
5. **Git discipline**: Don't commit large files, results, or credentials
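
To illustrate the notebook config practice, a first cell might look like the sketch below. Every key and value is illustrative (the `eta` and `lambda` values are borrowed from the experiment entry example), and `with_overrides` is a hypothetical helper, not something the repository provides.

```python
# First notebook cell: every tunable value lives in one place.
# All keys and values are illustrative, not the repository's real settings.
CONFIG = {
    "task": "cta_1d",
    "train_range": ("2020-01-01", "2022-12-31"),
    "test_range": ("2023-01-01", "2023-12-31"),
    "xgb_params": {"eta": 0.5, "lambda": 0.1},
    "output_dir": "results/cta_1d/exp01",
}


def with_overrides(config: dict, **overrides) -> dict:
    """Return a shallow copy of config with top-level keys replaced."""
    unknown = set(overrides) - set(config)
    if unknown:
        # Catch typos such as 'ouput_dir' instead of silently adding a key
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    return {**config, **overrides}
```

A variant run can then use `with_overrides(CONFIG, output_dir="results/cta_1d/exp02")` and leave the baseline `CONFIG` untouched for comparison.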