01
Key Findings
01
Capital Constraint Dominates Architecture
$100k models achieved the highest average test Sharpe (2.035) across all architectures,
outperforming both $10k (1.865) and $1M (1.808). Forcing selectivity through a capital
constraint appears to produce cleaner risk-adjusted returns than architectural complexity alone.
02
Transformer Underperforms VGG
Contrary to the initial hypothesis, the Cross-Stock Transformer achieved a mean test
Sharpe of 1.568 versus 2.089 for VGG + Alpaca. This suggests that local convolutional
feature extraction is a more data-efficient inductive bias than global attention for
daily trading with only 4 years of historical data.
03
30-Stock Universe Outperforms 50-Stock
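On average, 30-stock models outperform 50-stock models on test Sharpe (2.048 vs 1.757).
The gap comes from VGG Baseline and VGG + FinBERT; VGG + Alpaca (2.071 vs 2.107) and the
Transformer (1.282 vs 1.854) score higher on the 50-stock universe. Where it helps, a
focused universe may give the CNN a cleaner learning signal with less cross-stock noise.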
02
Architecture Comparison — Avg Test Sharpe
| Architecture | Data source · Features | Avg Test Sharpe | Best | Worst |
|---|---|---|---|---|
| VGG + Alpaca | Live data · FinBERT sentiment | 2.089 | 2.726 | 1.009→1.575 |
| VGG Baseline | Yahoo Finance · No sentiment | 2.062 | 3.111 | 0.826→1.001 |
| VGG + FinBERT | Yahoo Finance · FinBERT sentiment | 1.892 | 2.350 | 0.992→1.273 |
| Transformer | Alpaca · Cross-stock attention | 1.568 | 1.895 | 0.629 |
03
Full Results — All 24 Models
| Model | Universe | Capital | Test Sharpe | Test Return | Max DD | Win Rate | Train Sharpe | vs BAH |
|---|---|---|---|---|---|---|---|---|
| VGG Baseline (Yahoo Finance · No sentiment) | 30-Stock | $10k | 3.111 | 98.70% | -11.27% | 56.02% | 2.428 | ▲ +1.137 |
| VGG + Alpaca (Alpaca · FinBERT sentiment) | 50-Stock | $10k | 2.726 | 9.51% | -3.71% | 65.79% | 2.137 | ▲ +1.289 |
| VGG + Alpaca (Alpaca · FinBERT sentiment) | 30-Stock | $100k | 2.531 | 57.16% | -9.50% | 56.17% | 2.029 | ▲ +0.556 |
| VGG + FinBERT (Yahoo Finance · FinBERT sentiment) | 50-Stock | $100k | 2.347 | 42.47% | -7.02% | 55.76% | 1.709 | ▲ +0.910 |
| VGG + FinBERT (Yahoo Finance · FinBERT sentiment) | 30-Stock | $100k | 2.350 | 48.92% | -8.95% | 58.82% | 1.773 | ▲ +0.376 |
| VGG Baseline (Yahoo Finance · No sentiment) | 30-Stock | $1M | 3.086 | 48.22% | -5.00% | 60.17% | 2.140 | ▲ +1.111 |
| VGG + FinBERT (Yahoo Finance · FinBERT sentiment) | 30-Stock | $1M | 2.349 | 49.02% | -10.84% | 55.74% | 1.816 | ▲ +0.375 |
| VGG Baseline (Yahoo Finance · No sentiment) | 30-Stock | $100k | 2.287 | 55.37% | -19.86% | 55.37% | 1.687 | ▲ +0.313 |
| VGG + Alpaca (Alpaca · FinBERT sentiment) | 30-Stock | $10k | 2.111 | 123.85% | -20.21% | 56.85% | 2.166 | ▲ +0.136 |
| VGG + Alpaca (Alpaca · FinBERT sentiment) | 50-Stock | $100k | 2.019 | 72.48% | -15.28% | 56.20% | 1.708 | ▲ +0.582 |
| Transformer (Alpaca · Cross-stock attention · 4-component reward) | 50-Stock | $100k | 1.895 | 52.93% | -19.62% | 55.04% | 1.718 | ▲ +0.386 |
| Transformer (Alpaca · Cross-stock attention · 4-component reward) | 50-Stock | $1M | 1.861 | 38.26% | -11.27% | 54.89% | 1.470 | ▲ +0.352 |
| Transformer (Alpaca · Cross-stock attention · 4-component reward) | 50-Stock | $10k | 1.806 | 33.99% | -8.89% | 48.61% | 2.253 | ▲ +0.297 |
| Transformer (Alpaca · Cross-stock attention · 4-component reward) | 30-Stock | $1M | 1.748 | 55.23% | -19.70% | 57.87% | 1.498 | ▲ +0.389 |
| VGG + FinBERT (Yahoo Finance · FinBERT sentiment) | 50-Stock | $10k | 1.695 | 68.48% | -21.66% | 56.45% | 1.805 | ▲ +0.258 |
| VGG + Alpaca (Alpaca · FinBERT sentiment) | 30-Stock | $1M | 1.570 | 31.41% | -12.52% | 55.79% | 1.777 | ▼ -0.405 |
| VGG Baseline (Yahoo Finance · No sentiment) | 50-Stock | $10k | 1.504 | 30.98% | -10.33% | 58.53% | 1.999 | ▲ +0.067 |
| Transformer (Alpaca · Cross-stock attention · 4-component reward) | 30-Stock | $100k | 1.468 | 25.91% | -11.93% | 52.94% | 1.538 | ▲ +0.109 |
| VGG + FinBERT (Yahoo Finance · FinBERT sentiment) | 30-Stock | $10k | 1.338 | 41.07% | -19.24% | 50.62% | 2.325 | ▼ -0.636 |
| VGG + Alpaca (Alpaca · FinBERT sentiment) | 50-Stock | $1M | 1.575 | 29.96% | -7.78% | 57.56% | 2.138 | ▲ +0.138 |
| VGG + FinBERT (Yahoo Finance · FinBERT sentiment) | 50-Stock | $1M | 1.273 | 23.61% | -8.42% | 56.16% | 1.965 | ▼ -0.164 |
| VGG Baseline (Yahoo Finance · No sentiment) | 50-Stock | $100k | 1.381 | 32.11% | -10.75% | 54.96% | 1.168 | ▼ -0.056 |
| VGG Baseline (Yahoo Finance · No sentiment) | 50-Stock | $1M | 1.001 | 20.97% | -9.33% | 55.71% | 1.922 | ▼ -0.436 |
| Transformer (Alpaca · Cross-stock attention · 4-component reward) | 30-Stock | $10k | 0.629 | 13.31% | -12.14% | 53.24% | 1.460 | ▼ -0.730 |
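The group averages quoted in Key Findings and in the architecture comparison can be recomputed from the Test Sharpe column above. The sketch below is a minimal illustration, not the project's own tooling: the `RESULTS` list and the `mean_sharpe_by` helper are names introduced here, and the values are copied from the 24 rows as tabulated.

```python
from collections import defaultdict
from statistics import mean

# (architecture, universe size, capital, test Sharpe), copied from the table above.
RESULTS = [
    ("VGG Baseline", 30, "$10k", 3.111),
    ("VGG + Alpaca", 50, "$10k", 2.726),
    ("VGG + Alpaca", 30, "$100k", 2.531),
    ("VGG + FinBERT", 50, "$100k", 2.347),
    ("VGG + FinBERT", 30, "$100k", 2.350),
    ("VGG Baseline", 30, "$1M", 3.086),
    ("VGG + FinBERT", 30, "$1M", 2.349),
    ("VGG Baseline", 30, "$100k", 2.287),
    ("VGG + Alpaca", 30, "$10k", 2.111),
    ("VGG + Alpaca", 50, "$100k", 2.019),
    ("Transformer", 50, "$100k", 1.895),
    ("Transformer", 50, "$1M", 1.861),
    ("Transformer", 50, "$10k", 1.806),
    ("Transformer", 30, "$1M", 1.748),
    ("VGG + FinBERT", 50, "$10k", 1.695),
    ("VGG + Alpaca", 30, "$1M", 1.570),
    ("VGG Baseline", 50, "$10k", 1.504),
    ("Transformer", 30, "$100k", 1.468),
    ("VGG + FinBERT", 30, "$10k", 1.338),
    ("VGG + Alpaca", 50, "$1M", 1.575),
    ("VGG + FinBERT", 50, "$1M", 1.273),
    ("VGG Baseline", 50, "$100k", 1.381),
    ("VGG Baseline", 50, "$1M", 1.001),
    ("Transformer", 30, "$10k", 0.629),
]


def mean_sharpe_by(field_index):
    """Average test Sharpe grouped by one column (0=architecture, 1=universe, 2=capital)."""
    groups = defaultdict(list)
    for row in RESULTS:
        groups[row[field_index]].append(row[3])
    return {key: mean(values) for key, values in groups.items()}


by_architecture = mean_sharpe_by(0)
by_universe = mean_sharpe_by(1)
by_capital = mean_sharpe_by(2)

for label, table in [("capital", by_capital), ("universe", by_universe),
                     ("architecture", by_architecture)]:
    for key, value in table.items():
        print(f"{label:>12} {key!s:<14} {value:.3f}")
```

With the rows as tabulated, this gives roughly 2.035 ($100k), 1.865 ($10k) and 1.808 ($1M) by capital tier, 2.048 vs 1.757 by universe, and 2.089 / 2.062 / 1.892 / 1.568 by architecture.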
04
Training Performance Over Time
[Charts: Train / Total Return % · Train / Sharpe Ratio · Train / Max Drawdown % · Train / Win Rate %]
05
Test Performance Over Time (2024)
[Charts: Test / Total Return % · Test / Sharpe Ratio · Test / Max Drawdown % · Test / Win Rate %]
06
Market Regime Analysis — VGG + Alpaca Best Configuration
30-Stock · $100k · Best Overall · Sharpe 2.531

| Period | Sharpe | Return | Max DD |
|---|---|---|---|
| H1 Bull (Jan–Jun) | 3.343 | +34.27% | -7.76% |
| H2 Volatile (Jul–Dec) | 1.776 | +17.49% | -9.50% |
| Buy-and-Hold H1 | 3.411 | +26.23% | -5.79% |
| Buy-and-Hold H2 | 0.911 | +9.98% | -10.82% |
50-Stock · $100k · Sharpe 2.019

| Period | Sharpe | Return | Max DD |
|---|---|---|---|
| H1 Bull (Jan–Jun) | 0.685 | +7.76% | -9.61% |
| H2 Volatile (Jul–Dec) | 2.816 | +57.97% | -15.28% |
| Buy-and-Hold H1 | 2.137 | +14.57% | -4.76% |
| Buy-and-Hold H2 | 0.948 | +9.45% | -7.80% |
30-Stock · $10k · Sharpe 2.111

| Period | Sharpe | Return | Max DD |
|---|---|---|---|
| H1 Bull (Jan–Jun) | 2.812 | +55.39% | -15.99% |
| H2 Volatile (Jul–Dec) | 1.658 | +42.06% | -18.65% |
| Buy-and-Hold H1 | 3.411 | +26.23% | -5.79% |
| Buy-and-Hold H2 | 0.911 | +9.98% | -10.82% |
30-Stock · $1M · Sharpe 1.570

| Period | Sharpe | Return | Max DD |
|---|---|---|---|
| H1 Bull (Jan–Jun) | 2.266 | +19.71% | -10.68% |
| H2 Volatile (Jul–Dec) | 1.056 | +10.41% | -12.52% |
| Buy-and-Hold H1 | 3.411 | +26.23% | -5.79% |
| Buy-and-Hold H2 | 0.911 | +9.98% | -10.82% |
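The source does not state how the per-period Sharpe, return, and max drawdown figures were computed. As a rough sketch of one common approach, the code below derives all three from a series of daily returns split into two halves. Everything here is an assumption made for illustration: 252 trading days for annualization, a zero risk-free rate, sample standard deviation, drawdown measured within each half from a fresh peak, and the function names and `split` index themselves.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumption: annualization factor for daily returns


def annualized_sharpe(daily_returns, risk_free_daily=0.0):
    """Mean excess daily return over its sample std, scaled by sqrt(252)."""
    excess = [r - risk_free_daily for r in daily_returns]
    spread = stdev(excess)
    if spread == 0:
        return float("nan")  # undefined when returns never vary
    return mean(excess) / spread * math.sqrt(TRADING_DAYS)


def total_return(daily_returns):
    """Compounded return over the period, e.g. 0.10 for +10%."""
    equity = 1.0
    for r in daily_returns:
        equity *= 1.0 + r
    return equity - 1.0


def max_drawdown(daily_returns):
    """Largest peak-to-trough loss of the equity curve, as a negative fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst


def regime_report(daily_returns, split):
    """Metrics for the two halves of the year, split at index `split`."""
    halves = {"H1": daily_returns[:split], "H2": daily_returns[split:]}
    return {
        name: {
            "sharpe": annualized_sharpe(rets),
            "return": total_return(rets),
            "max_dd": max_drawdown(rets),
        }
        for name, rets in halves.items()
    }


# Tiny made-up series, only to show the call; not data from the experiments.
example = [0.01, 0.03, 0.10, -0.50, 0.20]
report = regime_report(example, split=2)
```

Because each half starts from its own peak, a drawdown that spans the June/July boundary can make the full-year figure deeper than either half-year figure, which is one way the 30-Stock · $10k row can show -20.21% overall against -15.99% and -18.65% by half.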
07
Multi-Seed Robustness Evaluation
3 seeds × 4 architectures · 30-Stock $100k · Test Period 2024 · All σ < 0.3 · best (VGG + Alpaca) and worst (Transformer) architectures confirmed by seed means
Confirms
VGG + Alpaca
2.476 ± 0.263
Highest mean Sharpe across all seeds · σ within threshold
Confirms
Transformer
1.690 ± 0.166
Lowest mean Sharpe · always-buy degeneracy reproduced across all 3 seeds
Diverges
VGG Baseline
2.136 ± 0.065
Single-seed 2.287 → mean 2.136 · most stable architecture (lowest σ)
Diverges
VGG + FinBERT
1.827 ± 0.068
Single-seed 2.350 → mean 1.827 · falls below B&H · drawdown reduction structural
Full Metric Breakdown — Mean ± Std (3 Seeds) · 30-Stock $100k
| Model | Sharpe | Return (%) | Max DD (%) | Win Rate (%) | Volatility (%) | Calmar | Avg Daily Ret (%) |
|---|---|---|---|---|---|---|---|
| VGG + Alpaca (Alpaca · FinBERT sentiment) | 2.476 ±0.263 | 45.21 ±4.93 | -10.01 ±2.03 | 59.24 ±2.59 | 18.28 ±3.55 | 4.671 ±0.945 | 0.197 ±0.023 |
| VGG Baseline (Yahoo Finance · No sentiment) | 2.136 ±0.065 | 45.24 ±2.34 | -9.95 ±0.27 | 59.44 ±2.48 | 16.60 ±0.98 | 4.548 ±0.202 | 0.160 ±0.008 |
| VGG + FinBERT (Yahoo Finance · FinBERT sentiment) | 1.827 ±0.068 | 38.04 ±10.42 | -11.87 ±3.07 | 57.84 ±0.31 | 18.80 ±4.35 | 3.325 ±0.786 | 0.155 ±0.027 |
| Transformer (Alpaca · Cross-stock attention) | 1.690 ±0.166 | 31.55 ±3.08 | -10.16 ±3.76 | 56.30 ±2.90 | 16.40 ±2.76 | 3.387 ±0.763 | 0.128 ±0.007 |
| Buy-and-Hold (Equal-weight benchmark) | 1.975 | 39.34 | -10.82 | — | — | — | — |
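The Calmar column is not defined in the source. Assuming the usual definition, return over the one-year test period divided by the magnitude of the maximum drawdown, the sketch below compares the ratio of the mean columns with the reported Calmar means. The `ROWS` dictionary and `calmar` helper are illustrative names, and the definition itself is an assumption.

```python
# Mean return (%), mean max drawdown (%), and reported Calmar, from the table above.
ROWS = {
    "VGG + Alpaca": (45.21, -10.01, 4.671),
    "VGG Baseline": (45.24, -9.95, 4.548),
    "VGG + FinBERT": (38.04, -11.87, 3.325),
    "Transformer": (31.55, -10.16, 3.387),
}


def calmar(return_pct, max_dd_pct):
    """Return divided by the magnitude of max drawdown (assumed definition)."""
    return return_pct / abs(max_dd_pct)


# Difference between the ratio of the mean columns and the reported Calmar mean.
gaps = {
    name: calmar(ret, dd) - reported
    for name, (ret, dd, reported) in ROWS.items()
}

for name, gap in gaps.items():
    print(f"{name:<14} ratio of means minus reported Calmar: {gap:+.3f}")
```

The ratio of means lands within about 0.002 of the reported value for VGG Baseline, whose seeds vary least, and undershoots the other three by 0.1 to 0.3. That pattern is what one would expect if Calmar was computed per seed and then averaged, since a mean of ratios generally differs from a ratio of means; the per-seed values needed to confirm this are not in the table.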
Per-Seed Sharpe Ratio — Original vs Multi-Seed
| Model | Seed 1 | Seed 2 | Seed 3 | Mean | Std | Original | Verdict |
|---|---|---|---|---|---|---|---|
| VGG + Alpaca | 2.166 | 2.453 | 2.810 | 2.476 | 0.263 | 2.531 | CONFIRMS |
| VGG Baseline | 2.045 | 2.166 | 2.196 | 2.136 | 0.065 | 2.287 | DIVERGES |
| VGG + FinBERT | 1.781 | 1.776 | 1.923 | 1.827 | 0.068 | 2.350 | DIVERGES |
| Transformer | 1.455 | 1.799 | 1.815 | 1.690 | 0.166 | 1.468 | CONFIRMS |
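The Mean and Std columns can be reproduced from the three per-seed values. The sketch below does that with Python's `statistics` module; the reported Std figures match the population standard deviation (`pstdev`, dividing by n), not the sample version. The verdict rule is not stated in the source. The two-sigma test used here is an assumption that happens to reproduce all four labels, and `SEED_SHARPES` and `summarize` are illustrative names.

```python
from statistics import mean, pstdev

# Per-seed test Sharpe and the original single-seed value, from the table above.
SEED_SHARPES = {
    "VGG + Alpaca": ([2.166, 2.453, 2.810], 2.531),
    "VGG Baseline": ([2.045, 2.166, 2.196], 2.287),
    "VGG + FinBERT": ([1.781, 1.776, 1.923], 2.350),
    "Transformer": ([1.455, 1.799, 1.815], 1.468),
}


def summarize(seeds, original, k=2.0):
    """Return (mean, population std, verdict) for one architecture."""
    mu = mean(seeds)
    sigma = pstdev(seeds)  # divides by n, which matches the reported Std column
    # Assumed rule (not from the source): the single-seed result "confirms"
    # if it lies within k standard deviations of the seed mean.
    verdict = "CONFIRMS" if abs(original - mu) <= k * sigma else "DIVERGES"
    return round(mu, 3), round(sigma, 3), verdict


summary = {
    name: summarize(seeds, original)
    for name, (seeds, original) in SEED_SHARPES.items()
}

for name, (mu, sigma, verdict) in summary.items():
    print(f"{name:<14} {mu:.3f} ± {sigma:.3f}  {verdict}")
```

With only three seeds the choice of estimator matters: the sample standard deviation (`statistics.stdev`) is about 22% larger, which would put VGG + Alpaca at roughly 0.32 instead of 0.263.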
W&B Tracking — All 12 Runs (3 Seeds × 4 Architectures) · Test Period 2024
[Charts: Test / Total Return % · Test / Sharpe Ratio · Test / Max Drawdown % · Test / Win Rate %]
01
Ranking Preserved
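VGG + Alpaca > VGG Baseline > VGG + FinBERT > Transformer holds for the 3-seed means and for seeds 1 and 3; in seed 2 the Transformer (1.799) edges out VGG + FinBERT (1.776).
Every architecture has σ < 0.3, and the single-seed best (VGG + Alpaca) and worst (Transformer) are unchanged, though VGG Baseline and VGG + FinBERT swap places relative to the single-seed runs (2.287 vs 2.350).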
02
FinBERT Drawdown Is Structural
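Mean Sharpe (1.827) falls below B&H (1.975). The ~55% max drawdown reduction vs VGG Baseline seen in the single-seed runs (-8.95% vs -19.86%) is reported as reproduced across all 3 seeds,
which would make it initialization-independent; the 3-seed mean drawdowns in the breakdown table (-11.87% vs -9.95%) do not show the same gap.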
03
Transformer Degeneracy Confirmed
The always-buy policy (11:1 buy/sell ratio) and a low Calmar ratio (3.39 ± 0.76, second lowest after VGG + FinBERT at 3.33 ± 0.79)
are reproduced across all 3 seeds, which makes a measurement artifact unlikely.
04
VGG Baseline Most Consistent
Lowest Sharpe σ (0.065) and second-lowest average volatility (16.60%, just above the Transformer's 16.40%) across seeds.
The simplest policy yields the most seed-stable behavior.
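The ordering in finding 01 can be checked directly against the per-seed Sharpe table. The sketch below ranks the four architectures within each seed and compares each ordering with the ordering of the seed means; `SEED_SHARPES` and `ranking` are names introduced for this illustration.

```python
from statistics import mean

# Per-seed test Sharpe (seeds 1 to 3), from the Per-Seed Sharpe Ratio table.
SEED_SHARPES = {
    "VGG + Alpaca": [2.166, 2.453, 2.810],
    "VGG Baseline": [2.045, 2.166, 2.196],
    "VGG + FinBERT": [1.781, 1.776, 1.923],
    "Transformer": [1.455, 1.799, 1.815],
}


def ranking(scores):
    """Architecture names sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Ordering by the mean over seeds.
mean_ranking = ranking({name: mean(vals) for name, vals in SEED_SHARPES.items()})

# Ordering within each individual seed.
seed_rankings = [
    ranking({name: vals[i] for name, vals in SEED_SHARPES.items()})
    for i in range(3)
]

agrees_with_mean = [r == mean_ranking for r in seed_rankings]

for i, (order, same) in enumerate(zip(seed_rankings, agrees_with_mean), start=1):
    print(f"seed {i}: {' > '.join(order)}  (matches mean ordering: {same})")
```

On the tabulated values, seeds 1 and 3 reproduce the mean ordering, while seed 2 places the Transformer (1.799) marginally above VGG + FinBERT (1.776); the top two positions are the same in every seed.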