Validation methodology

Bank-grade validation, applied to your algo

AlgoCheck doesn’t just compute metrics — it applies the same overfitting statistics, cross-validation and model-validation discipline that quant funds and bank model-risk teams use, with every method traceable to published research or a recognised standard.

The standards we follow

The frameworks institutional allocators and regulators recognise — the difference between a hobbyist tool and an institutional audit.

Fed SR 11-7 / OCC Bulletin 2011-12

Model Risk Management

The US banking regulators’ “effective challenge” doctrine: independent validation, ongoing monitoring, and outcomes analysis. Our audit is structured around its three pillars — conceptual soundness, monitoring, and back-tested outcomes.

FMSB

Model Risk for Electronic Trading Algorithms (2025)

The Financial Markets Standards Board applies SR 11-7 specifically to trading algorithms. Our validation, drift-monitoring and outcomes vocabulary mirrors it — the same bar investment banks hold their own algos to.

CFA Institute

Investment Model Validation (2024)

The practitioner model-validation playbook the CFA Institute hands to professional asset managers — robustness, regime sensitivity, out-of-sample assessment. The language allocators recognise.

SEC Marketing Rule 206(4)-1

Hypothetical performance

A backtest is “hypothetical performance.” Every audit report frames it with the assumptions, risks and limits the rule requires — the standard under which advisers were sanctioned in 2024.

The science under the score

Every check below feeds the composite AlgoCheck Score and its confidence grade. Methods marked with their authors are implemented directly from the source research.

Overfitting & selection bias

Is the edge real, or just the luckiest of many tries?

Probabilistic & Deflated Sharpe Ratio (Bailey & López de Prado, 2012–2014)
Discounts the Sharpe for how many variants were tried, fat tails and sample length.
Probability of Backtest Overfitting (CSCV) (Bailey, Borwein, López de Prado, Zhu, 2017)
A single 0–100% number: the probability that the best in-sample setting ranks below the median out-of-sample.
Effective number of trials (López de Prado & Lewis, 2019)
Counts genuinely independent tries when many tested variants are near-identical.
Higher significance bar, t > 3 (Harvey, Liu & Zhu, 2016)
Raises the data-mining significance threshold the way top finance journals now require.
Minimum Backtest Length (Bailey, Borwein, López de Prado, Zhu, 2014)
Whether the history is long enough to justify how many variants were tested.
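
As an illustration of how the Sharpe discount works, here is a minimal sketch of the Probabilistic and Deflated Sharpe Ratio formulas from the cited papers. It is not AlgoCheck's implementation: the function names are hypothetical, the Sharpe ratio is assumed to be per-period (not annualised), kurtosis is non-excess (3 for a normal distribution), and the trials are assumed independent with `n_trials` of at least 2.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
_NORMAL = NormalDist()


def probabilistic_sharpe(sr, sr_benchmark, n_obs, skew=0.0, kurt=3.0):
    """Probability that the true per-period Sharpe exceeds sr_benchmark.

    sr and sr_benchmark are per-period (not annualised); kurt is non-excess.
    """
    denom = math.sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr ** 2)
    z = (sr - sr_benchmark) * math.sqrt(n_obs - 1) / denom
    return _NORMAL.cdf(z)


def expected_max_sharpe(n_trials, sr_variance):
    """Expected best Sharpe among n_trials zero-skill trials (n_trials >= 2).

    sr_variance is the variance of the Sharpe estimates across the trials.
    """
    a = _NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    b = _NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    return math.sqrt(sr_variance) * ((1.0 - EULER_GAMMA) * a + EULER_GAMMA * b)


def deflated_sharpe(sr, n_obs, n_trials, sr_variance, skew=0.0, kurt=3.0):
    """PSR measured against the Sharpe that luck alone would likely produce."""
    sr0 = expected_max_sharpe(n_trials, sr_variance)
    return probabilistic_sharpe(sr, sr0, n_obs, skew, kurt)


# Hypothetical inputs: 1,000 daily returns, 100 variants tried.
psr = probabilistic_sharpe(0.1, 0.0, 1000)
dsr = deflated_sharpe(0.1, 1000, 100, sr_variance=0.001)
```

With the same observed Sharpe, `dsr` comes out lower than `psr` because the benchmark rises from zero to the best result that 100 unskilled variants would be expected to produce.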

Combinatorial cross-validation

Thousands of leak-proof time splits instead of one lucky backtest.

Combinatorial Purged Cross-Validation (CPCV) (López de Prado, 2018)
Generates a distribution of out-of-sample paths with purging and embargo, not a single run.
CPCV vs walk-forward benchmark (Arian, Norouzi & Seco, 2024)
Peer-reviewed simulation evidence that CPCV resists “great-in-backtest-fails-live” overfitting better than walk-forward testing.
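
To make the splitting idea concrete, the sketch below enumerates combinatorial train/test splits over contiguous time groups. It is a simplification under stated assumptions, not the audit's own code: the function names are hypothetical, and a fixed-width gap (`purge` observations on each side of a test block, plus `embargo` more after it) stands in for the label-overlap purging described in the source.

```python
from itertools import combinations
from math import comb


def cpcv_splits(n_obs, n_groups=6, n_test_groups=2, purge=0, embargo=0):
    """Yield (train_idx, test_idx) for every choice of test groups.

    Assumption: purging is approximated by a fixed-width gap rather than
    by checking which training labels overlap the test period.
    """
    bounds = [i * n_obs // n_groups for i in range(n_groups + 1)]
    groups = [(bounds[i], bounds[i + 1]) for i in range(n_groups)]
    for chosen in combinations(range(n_groups), n_test_groups):
        test, excluded = [], set()
        for g in chosen:
            start, stop = groups[g]
            test.extend(range(start, stop))
            # Drop the test block itself, the purge gap on both sides,
            # and the embargo period that follows the block.
            lo = max(0, start - purge)
            hi = min(n_obs, stop + purge + embargo)
            excluded.update(range(lo, hi))
        train = [i for i in range(n_obs) if i not in excluded]
        yield train, test


def n_backtest_paths(n_groups, n_test_groups):
    """Number of full out-of-sample paths the splits can be stitched into."""
    return comb(n_groups, n_test_groups) * n_test_groups // n_groups


splits = list(cpcv_splits(120, n_groups=6, n_test_groups=2, purge=2, embargo=3))
```

With 6 groups and 2 test groups this gives 15 splits that recombine into 5 out-of-sample paths; larger group counts grow the number of splits combinatorially.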

Monte Carlo & robustness

Stress the result against luck, costs, and unseen markets.

Monte Carlo resampling (trade reshuffle · bootstrap · block bootstrap · trade skipping)
Drawdown distributions and ruin probabilities, not a single vanity equity curve.
Cost & slippage stress grid (robustness-to-costs practice)
Finds the exact cost multiplier at which the edge dies.
Strategy-decay modelling (Falck, Rej & Thesmar, 2022 · McLean & Pontiff, 2016)
Estimates how much of the backtested edge is likely to evaporate live.
Implementation / engine risk (Yin et al., 2026)
Five backtest engines diverge on the same strategy once costs are modelled — we report the ambiguity.
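
One way to turn a single equity curve into a drawdown distribution is a circular block bootstrap of per-trade P&L, sketched below. The function names, block length, simulation count and ruin threshold are illustrative assumptions, not AlgoCheck's settings.

```python
import random


def max_drawdown(pnl):
    """Largest peak-to-trough drop of the cumulative P&L curve (starting at 0)."""
    equity = peak = worst = 0.0
    for x in pnl:
        equity += x
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def block_bootstrap(pnl, block_len, rng):
    """Resample pnl in contiguous, wrap-around blocks to keep short-range dependence."""
    n = len(pnl)
    out = []
    while len(out) < n:
        start = rng.randrange(n)
        out.extend(pnl[(start + j) % n] for j in range(block_len))
    return out[:n]


def drawdown_distribution(pnl, n_sims=2000, block_len=5, ruin_level=None, seed=0):
    """Summarise max drawdown across resampled histories."""
    rng = random.Random(seed)
    dds = sorted(
        max_drawdown(block_bootstrap(pnl, block_len, rng)) for _ in range(n_sims)
    )
    summary = {
        "median": dds[n_sims // 2],
        "p95": dds[int(0.95 * (n_sims - 1))],
        "worst": dds[-1],
    }
    if ruin_level is not None:
        # Share of simulated histories whose drawdown reaches the ruin level.
        summary["ruin_probability"] = sum(d >= ruin_level for d in dds) / n_sims
    return summary
```

Calling `drawdown_distribution(trade_pnl, ruin_level=5000.0)` on a list of per-trade P&L values (here a hypothetical `trade_pnl`) returns the median, 95th-percentile and worst simulated drawdown, plus the fraction of simulations that hit the ruin level.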

Live monitoring & decay

Catch the regime break before the drawdown confirms it.

Live-vs-backtest drift (Kolmogorov–Smirnov · CUSUM · rolling SQN)
Flags when live results statistically diverge from the backtest baseline.
Bayesian online change-point detection (Adams & MacKay, 2007)
Real-time probability that the market regime your algo was built for has broken.
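
Two of the drift checks named above can be sketched in a few lines: a two-sample Kolmogorov–Smirnov statistic comparing live and backtest returns, and a one-sided CUSUM that accumulates underperformance against the backtest mean. The function names are hypothetical, the critical value uses the standard large-sample approximation, and the CUSUM parameters `k=0.5` and `h=5.0` are conventional textbook defaults assumed here, not AlgoCheck's thresholds.

```python
import math


def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past x so ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value: reject 'same distribution' if D exceeds it."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


def cusum_downside_alarm(live, mu0, sigma0, k=0.5, h=5.0):
    """Index of the first live observation where cumulative shortfall exceeds h.

    mu0 and sigma0 are the backtest mean and standard deviation; k is the
    slack and h the alarm threshold, both in units of sigma0.
    """
    s = 0.0
    for t, x in enumerate(live):
        s = max(0.0, s + (mu0 - x) / sigma0 - k)
        if s > h:
            return t
    return None  # no alarm raised
```

A live sample would be flagged when `ks_statistic(live, backtest)` exceeds `ks_critical(len(live), len(backtest))`, or when `cusum_downside_alarm` returns an index rather than `None`.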

Audit the evidence, not the promise

AlgoCheck scores evidence quality and risk indicators — never future profitability. A backtest is hypothetical performance; our report frames it with the assumptions and limits a professional reviewer expects.


References are provided for transparency and do not imply endorsement by their authors. AlgoCheck.ai does not provide investment advice or guarantees of future performance.