// FinRL · Deep Reinforcement Learning · 2020–2025

DRL Stock Trading Ablation Study

24 models across 2 stock universes, 4 architectures, and 3 capital levels. Training period 2020–2024 · Test period 2024–2025 · Risk-free rate 5% annualised.

PPO + Stable Baselines 3
FinBERT Sentiment
Polygon.io News
Alpaca Live Data
VecNormalize + TrainSharpeCB
Best Test Sharpe
3.111
30-Stock VGG Baseline · $10k
Models Beating Buy-and-Hold
18 / 24
75% of all configurations
Best Test Return (to peak)
123.85%
30-Stock VGG + Alpaca · $10k
Best Capital Level (avg Sharpe)
$100k
Avg Sharpe 2.035 across all architectures
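The Sharpe ratios on this page are reported against a 5% annualised risk-free rate. A minimal sketch of how such a figure could be computed from daily returns follows. The 252-day year, the simple (non-compounded) daily risk-free rate, and the function name are assumptions for illustration, not details taken from the study's code.

```python
import math
import statistics

TRADING_DAYS = 252        # assumed number of trading days per year
RISK_FREE_ANNUAL = 0.05   # 5% annualised, as stated in the study header


def annualised_sharpe(daily_returns, rf_annual=RISK_FREE_ANNUAL, periods=TRADING_DAYS):
    """Annualised Sharpe ratio from a sequence of simple daily returns."""
    rf_daily = rf_annual / periods  # simple daily risk-free rate (assumption)
    excess = [r - rf_daily for r in daily_returns]
    spread = statistics.stdev(excess)  # sample standard deviation
    if spread == 0:
        return 0.0  # flat excess returns: ratio is undefined, report 0
    return statistics.mean(excess) / spread * math.sqrt(periods)


# Hypothetical daily returns, for illustration only (not study data).
example = [0.004, -0.002, 0.006, 0.001, -0.003, 0.005, 0.002, -0.001]
print(round(annualised_sharpe(example), 3))
```

Because the risk-free rate is subtracted before averaging, the same return series scores lower at 5% than it would at 0%, which matters when comparing these figures with results reported under a different convention.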
01

Key Findings

01
Capital Constraint Dominates Architecture
$100k models achieved the highest average test Sharpe (2.035) across all architectures, ahead of both $10k (1.865) and $1M (1.808). Forcing selectivity through capital constraints produces cleaner risk-adjusted returns than architectural complexity alone.
02
Transformer Underperforms VGG
Contrary to the initial hypothesis, the Cross-Stock Transformer achieved a mean test Sharpe of 1.568, versus 2.089 for VGG + Alpaca, and trails all three VGG variants. This suggests that local convolutional feature extraction is a more data-efficient inductive bias than global attention for daily trading with 4 years of historical data.
03
30-Stock Universe Outperforms 50-Stock
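Averaged over all architectures and capital levels, 30-stock models reach a higher test Sharpe than 50-stock models (2.048 vs 1.757). The advantage is not uniform: the Transformer scores higher on 50 stocks at every capital level, and VGG + Alpaca is roughly level across the two universes. For the VGG Baseline and VGG + FinBERT, a focused universe appears to give the CNN a cleaner learning signal with less cross-stock noise to contend with.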
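The averages quoted in these findings are plain means over groups of the 24 test Sharpe values. The sketch below recomputes them by capital level, architecture, and universe from the values in the full results table; the `RESULTS` list and the helper name are illustrative, not part of the study's code.

```python
from collections import defaultdict

# (architecture, universe size, capital, test Sharpe), transcribed from the
# full results table in this report.
RESULTS = [
    ("VGG Baseline", 30, "$10k", 3.111),
    ("VGG + Alpaca", 50, "$10k", 2.726),
    ("VGG + Alpaca", 30, "$100k", 2.531),
    ("VGG + FinBERT", 50, "$100k", 2.347),
    ("VGG + FinBERT", 30, "$100k", 2.350),
    ("VGG Baseline", 30, "$1M", 3.086),
    ("VGG + FinBERT", 30, "$1M", 2.349),
    ("VGG Baseline", 30, "$100k", 2.287),
    ("VGG + Alpaca", 30, "$10k", 2.111),
    ("VGG + Alpaca", 50, "$100k", 2.019),
    ("Transformer", 50, "$100k", 1.895),
    ("Transformer", 50, "$1M", 1.861),
    ("Transformer", 50, "$10k", 1.806),
    ("Transformer", 30, "$1M", 1.748),
    ("VGG + FinBERT", 50, "$10k", 1.695),
    ("VGG + Alpaca", 30, "$1M", 1.570),
    ("VGG Baseline", 50, "$10k", 1.504),
    ("Transformer", 30, "$100k", 1.468),
    ("VGG + FinBERT", 30, "$10k", 1.338),
    ("VGG + Alpaca", 50, "$1M", 1.575),
    ("VGG + FinBERT", 50, "$1M", 1.273),
    ("VGG Baseline", 50, "$100k", 1.381),
    ("VGG Baseline", 50, "$1M", 1.001),
    ("Transformer", 30, "$10k", 0.629),
]


def mean_sharpe_by(field_index):
    """Mean test Sharpe grouped by one column of RESULTS (0, 1 or 2)."""
    groups = defaultdict(list)
    for row in RESULTS:
        groups[row[field_index]].append(row[3])
    return {key: sum(values) / len(values) for key, values in groups.items()}


by_architecture = mean_sharpe_by(0)
by_universe = mean_sharpe_by(1)
by_capital = mean_sharpe_by(2)

for title, table in (("Architecture", by_architecture),
                     ("Universe", by_universe),
                     ("Capital", by_capital)):
    print(title)
    for key, value in sorted(table.items(), key=lambda item: -item[1]):
        print(f"  {key}: {value:.3f}")
```

Each group is small (6, 8, or 12 models), so a single strong or weak run moves the mean noticeably; comparing the per-group values side by side is a quick check on how robust each finding is.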
02

Architecture Comparison — Avg Test Sharpe

| Architecture | Data · Features | Avg Test Sharpe | Best | Worst |
| --- | --- | --- | --- | --- |
| VGG + Alpaca | Live data · FinBERT sentiment | 2.089 | 2.726 | 1.009→1.575 |
| VGG Baseline | Yahoo Finance · No sentiment | 2.062 | 3.111 | 0.826→1.001 |
| VGG + FinBERT | Yahoo Finance · FinBERT sentiment | 1.892 | 2.350 | 0.992→1.273 |
| Transformer | Alpaca · Cross-stock attention | 1.568 | 1.895 | 0.629 |
03

Full Results — All 24 Models

| Model | Data · Features | Universe | Capital | Test Sharpe | Test Return | Max DD | Win Rate | Train Sharpe | Test Sharpe vs BAH |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| VGG Baseline | Yahoo Finance · No sentiment | 30-Stock | $10k | 3.111 | 98.70% | -11.27% | 56.02% | 2.428 | ▲ +1.137 |
| VGG + Alpaca | Alpaca · FinBERT sentiment | 50-Stock | $10k | 2.726 | 9.51% | -3.71% | 65.79% | 2.137 | ▲ +1.289 |
| VGG + Alpaca | Alpaca · FinBERT sentiment | 30-Stock | $100k | 2.531 | 57.16% | -9.50% | 56.17% | 2.029 | ▲ +0.556 |
| VGG + FinBERT | Yahoo Finance · FinBERT sentiment | 50-Stock | $100k | 2.347 | 42.47% | -7.02% | 55.76% | 1.709 | ▲ +0.910 |
| VGG + FinBERT | Yahoo Finance · FinBERT sentiment | 30-Stock | $100k | 2.350 | 48.92% | -8.95% | 58.82% | 1.773 | ▲ +0.376 |
| VGG Baseline | Yahoo Finance · No sentiment | 30-Stock | $1M | 3.086 | 48.22% | -5.00% | 60.17% | 2.140 | ▲ +1.111 |
| VGG + FinBERT | Yahoo Finance · FinBERT sentiment | 30-Stock | $1M | 2.349 | 49.02% | -10.84% | 55.74% | 1.816 | ▲ +0.375 |
| VGG Baseline | Yahoo Finance · No sentiment | 30-Stock | $100k | 2.287 | 55.37% | -19.86% | 55.37% | 1.687 | ▲ +0.313 |
| VGG + Alpaca | Alpaca · FinBERT sentiment | 30-Stock | $10k | 2.111 | 123.85% | -20.21% | 56.85% | 2.166 | ▲ +0.136 |
| VGG + Alpaca | Alpaca · FinBERT sentiment | 50-Stock | $100k | 2.019 | 72.48% | -15.28% | 56.20% | 1.708 | ▲ +0.582 |
| Transformer | Alpaca · Cross-stock attention · 4-component reward | 50-Stock | $100k | 1.895 | 52.93% | -19.62% | 55.04% | 1.718 | ▲ +0.386 |
| Transformer | Alpaca · Cross-stock attention · 4-component reward | 50-Stock | $1M | 1.861 | 38.26% | -11.27% | 54.89% | 1.470 | ▲ +0.352 |
| Transformer | Alpaca · Cross-stock attention · 4-component reward | 50-Stock | $10k | 1.806 | 33.99% | -8.89% | 48.61% | 2.253 | ▲ +0.297 |
| Transformer | Alpaca · Cross-stock attention · 4-component reward | 30-Stock | $1M | 1.748 | 55.23% | -19.70% | 57.87% | 1.498 | ▲ +0.389 |
| VGG + FinBERT | Yahoo Finance · FinBERT sentiment | 50-Stock | $10k | 1.695 | 68.48% | -21.66% | 56.45% | 1.805 | ▲ +0.258 |
| VGG + Alpaca | Alpaca · FinBERT sentiment | 30-Stock | $1M | 1.570 | 31.41% | -12.52% | 55.79% | 1.777 | ▼ -0.405 |
| VGG Baseline | Yahoo Finance · No sentiment | 50-Stock | $10k | 1.504 | 30.98% | -10.33% | 58.53% | 1.999 | ▲ +0.067 |
| Transformer | Alpaca · Cross-stock attention · 4-component reward | 30-Stock | $100k | 1.468 | 25.91% | -11.93% | 52.94% | 1.538 | ▲ +0.109 |
| VGG + FinBERT | Yahoo Finance · FinBERT sentiment | 30-Stock | $10k | 1.338 | 41.07% | -19.24% | 50.62% | 2.325 | ▼ -0.636 |
| VGG + Alpaca | Alpaca · FinBERT sentiment | 50-Stock | $1M | 1.575 | 29.96% | -7.78% | 57.56% | 2.138 | ▲ +0.138 |
| VGG + FinBERT | Yahoo Finance · FinBERT sentiment | 50-Stock | $1M | 1.273 | 23.61% | -8.42% | 56.16% | 1.965 | ▼ -0.164 |
| VGG Baseline | Yahoo Finance · No sentiment | 50-Stock | $100k | 1.381 | 32.11% | -10.75% | 54.96% | 1.168 | ▼ -0.056 |
| VGG Baseline | Yahoo Finance · No sentiment | 50-Stock | $1M | 1.001 | 20.97% | -9.33% | 55.71% | 1.922 | ▼ -0.436 |
| Transformer | Alpaca · Cross-stock attention · 4-component reward | 30-Stock | $10k | 0.629 | 13.31% | -12.14% | 53.24% | 1.460 | ▼ -0.730 |
04

Training Performance Over Time

[Charts not reproduced. Panels: Training Total Return (%), Training Sharpe Ratio, Training Max Drawdown (%), Training Win Rate (%).]
05

Test Performance Over Time (2024)

[Charts not reproduced. Panels: Test Total Return (%), Test Sharpe Ratio, Test Max Drawdown (%), Test Win Rate (%).]
06

Market Regime Analysis — VGG + Alpaca Best Configuration

30-Stock · $100k Best Overall

Sharpe 2.531

| Period | Sharpe | Return | Max DD |
| --- | --- | --- | --- |
| H1 Bull (Jan–Jun) | 3.343 | +34.27% | -7.76% |
| H2 Volatile (Jul–Dec) | 1.776 | +17.49% | -9.50% |
| Buy-and-Hold H1 | 3.411 | +26.23% | -5.79% |
| Buy-and-Hold H2 | 0.911 | +9.98% | -10.82% |

50-Stock · $100k

Sharpe 2.019

| Period | Sharpe | Return | Max DD |
| --- | --- | --- | --- |
| H1 Bull (Jan–Jun) | 0.685 | +7.76% | -9.61% |
| H2 Volatile (Jul–Dec) | 2.816 | +57.97% | -15.28% |
| Buy-and-Hold H1 | 2.137 | +14.57% | -4.76% |
| Buy-and-Hold H2 | 0.948 | +9.45% | -7.80% |

30-Stock · $10k

Sharpe 2.111

| Period | Sharpe | Return | Max DD |
| --- | --- | --- | --- |
| H1 Bull (Jan–Jun) | 2.812 | +55.39% | -15.99% |
| H2 Volatile (Jul–Dec) | 1.658 | +42.06% | -18.65% |
| Buy-and-Hold H1 | 3.411 | +26.23% | -5.79% |
| Buy-and-Hold H2 | 0.911 | +9.98% | -10.82% |

30-Stock · $1M

Sharpe 1.570

| Period | Sharpe | Return | Max DD |
| --- | --- | --- | --- |
| H1 Bull (Jan–Jun) | 2.266 | +19.71% | -10.68% |
| H2 Volatile (Jul–Dec) | 1.056 | +10.41% | -12.52% |
| Buy-and-Hold H1 | 3.411 | +26.23% | -5.79% |
| Buy-and-Hold H2 | 0.911 | +9.98% | -10.82% |
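The per-regime Return and Max DD columns can be derived by splitting an equity curve at mid-year and scoring each half separately. The following is a minimal sketch under that assumption; the function names and the sample curve are hypothetical, and the study's own split and drawdown conventions may differ.

```python
from datetime import date


def total_return(equity):
    """Simple return from the first to the last value of an equity series."""
    return equity[-1] / equity[0] - 1.0


def max_drawdown(equity):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                 # running high-water mark
        worst = min(worst, value / peak - 1.0)  # decline from that mark
    return worst


def split_by_half(curve):
    """Group (date, equity) points into H1 (Jan-Jun) and H2 (Jul-Dec)."""
    halves = {"H1": [], "H2": []}
    for day, value in curve:
        halves["H1" if day.month <= 6 else "H2"].append(value)
    return halves


# Hypothetical equity curve, for illustration only (not study data).
CURVE = [
    (date(2024, 1, 2), 100.0),
    (date(2024, 3, 15), 120.0),
    (date(2024, 6, 28), 108.0),
    (date(2024, 7, 1), 110.0),
    (date(2024, 9, 16), 99.0),
    (date(2024, 12, 31), 121.0),
]

for name, equity in split_by_half(CURVE).items():
    print(f"{name}: return {total_return(equity):+.2%}, "
          f"max DD {max_drawdown(equity):.2%}")
```

In this sketch each half is measured from its own first observation, and the drawdown peak resets at the split. A drawdown that starts in H1 and bottoms in H2 is therefore understated in both halves, which is one reason a full-period Max DD can exceed either half-year figure.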