5. Out-of-Sample (OOS) Validation — What It Means and Why It Matters

The curve-fitting problem

When a strategy is built and tested on the same data, it is easy to find settings that look great historically purely by chance, not because there is a real edge. This is called curve-fitting or overfitting. A curve-fitted strategy appears profitable in backtesting but fails in live trading because its "edge" was just noise in the historical data.
EdgeLab addresses this automatically using out-of-sample (OOS) validation.

How OOS validation works

When EdgeLab runs a backtest, it splits the historical data into two segments:

In-Sample (IS) — The earlier portion of the data, used to develop and measure the strategy
Out-of-Sample (OOS) — A reserved later portion of the data that was not used during strategy development

The strategy is then tested on the OOS segment. If the strategy has real edge, it should perform similarly on data it has never "seen." If the OOS results collapse compared to the in-sample results, that is a strong signal of curve-fitting.
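The split described above can be illustrated with a short Python sketch. This is not EdgeLab's code: the function name split_is_oos and the 25% OOS share are assumptions chosen for illustration, since this article does not state the split ratio EdgeLab uses. The one property taken from the article is that the split is chronological, with the later portion held back.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_is_oos(bars: Sequence[T], oos_fraction: float = 0.25) -> Tuple[List[T], List[T]]:
    """Split time-ordered data into an earlier IS part and a later OOS part.

    `oos_fraction` is a hypothetical parameter; the real ratio is not documented here.
    """
    if not 0.0 < oos_fraction < 1.0:
        raise ValueError("oos_fraction must be strictly between 0 and 1")
    # Number of items held back for OOS, rounded down.
    n_oos = int(len(bars) * oos_fraction)
    cut = len(bars) - n_oos
    # No shuffling: the earlier items are in-sample, the later items are out-of-sample.
    return list(bars[:cut]), list(bars[cut:])


# Usage: eight time-ordered bars, the last two reserved for OOS testing.
in_sample, out_of_sample = split_is_oos(list(range(8)), oos_fraction=0.25)
```

Keeping the data in time order matters: shuffling before splitting would let information from later periods leak into the in-sample segment.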

Reading the OOS verdict

At the top of the Metrics panel, EdgeLab displays an OOS verdict banner:
PASS (green) — The OOS results are reasonably close to the in-sample results. The strategy held up on unseen data and has a credible claim to real edge. It meets the OOS requirement for deployment.
WARNING (amber) — The OOS results show meaningful degradation from in-sample performance but are still positive. The strategy may have some edge but is probably partially curve-fitted. Deploy with reduced position size and monitor closely.
FAIL (red) — The OOS results are significantly worse than in-sample, or are outright negative. The strategy's historical performance was largely an artifact of curve-fitting. This strategy should not be deployed.
The banner also shows the exact IS and OOS expectancy figures and the percentage degradation so you can see the magnitude of any drop-off.
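As a rough illustration of how a degradation percentage and verdict could be derived from the two expectancy figures, consider the Python sketch below. The cut-off values (30% and 60%) and the function names are hypothetical; this article does not publish the thresholds EdgeLab applies, so treat the numbers as placeholders.

```python
def oos_degradation_pct(is_expectancy: float, oos_expectancy: float) -> float:
    """Percentage drop from IS to OOS expectancy (positive means OOS is worse)."""
    if is_expectancy <= 0:
        raise ValueError("in-sample expectancy must be positive to measure degradation")
    return (is_expectancy - oos_expectancy) / is_expectancy * 100.0


def oos_verdict(
    is_expectancy: float,
    oos_expectancy: float,
    pass_max_drop_pct: float = 30.0,  # hypothetical threshold, not from EdgeLab
    warn_max_drop_pct: float = 60.0,  # hypothetical threshold, not from EdgeLab
) -> str:
    """Classify an IS/OOS pair as PASS, WARNING, or FAIL."""
    # Non-positive OOS results fail regardless of the size of the drop.
    if oos_expectancy <= 0:
        return "FAIL"
    drop = oos_degradation_pct(is_expectancy, oos_expectancy)
    if drop <= pass_max_drop_pct:
        return "PASS"
    if drop <= warn_max_drop_pct:
        return "WARNING"
    return "FAIL"
```

For example, under these assumed thresholds an IS expectancy of 0.5 with an OOS expectancy of 0.25 is a 50% drop and would be classed as WARNING, while any non-positive OOS expectancy would be classed as FAIL.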

What the Deploy tab checks

The Deploy tab in the Results panel enforces the following before allowing you to go live:

30 or more trades in the backtest
Positive expectancy
Robustness score of 0.5 or higher
OOS verdict of PASS

If any of these are not met, the Deploy Live button is disabled and the panel explains which criterion failed. This is intentional — it prevents impulse deployment of strategies that have not demonstrated real edge.
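The four checks listed above can be expressed as a simple gate. The sketch below is illustrative only: the BacktestSummary type and its field names are hypothetical and do not reflect EdgeLab's internal data model. The criteria themselves are taken from the list above, with "positive expectancy" read as strictly greater than zero.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BacktestSummary:
    """Hypothetical container for the figures the Deploy tab inspects."""
    trade_count: int
    expectancy: float
    robustness_score: float
    oos_verdict: str  # "PASS", "WARNING", or "FAIL"


def failed_deploy_criteria(summary: BacktestSummary) -> List[str]:
    """Return the unmet criteria; an empty list means deployment is allowed."""
    failures: List[str] = []
    if summary.trade_count < 30:
        failures.append("30 or more trades in the backtest")
    if summary.expectancy <= 0:
        failures.append("Positive expectancy")
    if summary.robustness_score < 0.5:
        failures.append("Robustness score of 0.5 or higher")
    if summary.oos_verdict != "PASS":
        failures.append("OOS verdict of PASS")
    return failures


# Usage: enough trades and positive expectancy, but only a WARNING verdict.
summary = BacktestSummary(trade_count=42, expectancy=0.3,
                          robustness_score=0.7, oos_verdict="WARNING")
blocked_by = failed_deploy_criteria(summary)
```

Returning the list of failed criteria, rather than a single boolean, mirrors the behaviour described above, where the panel explains which criterion failed.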

Updated on: 24/04/2026
