
About

The book

The Validation Crisis is a methodological audit of the major published AGI capability-arrival forecasts (Aschenbrenner, Cotra, Davidson, METR, Grace, Epoch, Karnofsky). It applies the validation discipline that quantitative finance learned between 2014 and 2018, and it makes a constructive proposal: the Deflated Capability Forecast, an adaptation of the Deflated Sharpe Ratio (Bailey & López de Prado, 2014) and the Probability of Backtest Overfitting (Bailey, Borwein, López de Prado, & Zhu, 2014) to capability projection.
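For orientation, the finance statistic that the framework adapts can be sketched in a few lines. The block below follows the Deflated Sharpe Ratio as published by Bailey and López de Prado (2014). It is not the book's Deflated Capability Forecast, and the function and parameter names are illustrative assumptions, not the API of the deflated-capability-forecast package.

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
_STD_NORMAL = NormalDist()        # standard normal, mean 0 and sd 1


def expected_max_sharpe(n_trials: int, sr_variance: float) -> float:
    """Approximate expected maximum Sharpe ratio over n_trials
    independent trials whose true Sharpe ratio is zero.

    sr_variance is the variance of the estimated Sharpe ratios
    across the trials. (Hypothetical helper name.)
    """
    if n_trials < 2:
        raise ValueError("the approximation needs at least two trials")
    z1 = _STD_NORMAL.inv_cdf(1.0 - 1.0 / n_trials)
    z2 = _STD_NORMAL.inv_cdf(1.0 - 1.0 / (n_trials * e))
    return sqrt(sr_variance) * ((1.0 - EULER_GAMMA) * z1 + EULER_GAMMA * z2)


def deflated_sharpe_ratio(
    sr: float,           # observed Sharpe ratio, per period (not annualised)
    n_obs: int,          # number of return observations
    skew: float,         # skewness of returns
    kurt: float,         # kurtosis of returns (3.0 for a normal distribution)
    n_trials: int,       # number of strategies tried before selecting this one
    sr_variance: float,  # variance of Sharpe ratios across those trials
) -> float:
    """Probability that the true Sharpe ratio exceeds the benchmark
    implied by selecting the best of n_trials."""
    sr0 = expected_max_sharpe(n_trials, sr_variance)
    denom = sqrt(1.0 - skew * sr + (kurt - 1.0) / 4.0 * sr**2)
    return _STD_NORMAL.cdf((sr - sr0) * sqrt(n_obs - 1) / denom)
```

Two properties of the sketch are worth noting: an observed Sharpe ratio exactly equal to the expected maximum under the null deflates to 0.5, and holding everything else fixed, raising `n_trials` lowers the result. The second property is the multiple-testing penalty that the book carries over to capability forecasts.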

The framework takes a forecast and widens its interval by the amount the underlying methodology actually warrants, producing an honest distribution over outcomes with explicit treatment of the tails, in place of a point estimate carrying unearned precision. The book applies it to each of the surveyed forecasts and, finally, to itself. A preregistered self-prediction, that the framework would deflate Aschenbrenner’s projection by a factor of at least 2.3, failed: the computed deflation was 1.285×. The failure is reported in full in Chapter 16, under the same labels the audit applied to others.
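The scoring of that self-prediction is simple arithmetic, and a minimal sketch makes the pass/fail rule explicit. The class and field names below are hypothetical, chosen for illustration. Only the two numbers, the preregistered bound of 2.3 and the computed deflation of 1.285, come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreregisteredLowerBound:
    """A prediction of the form 'the deflation factor will be at least X'.

    Hypothetical structure for illustration only.
    """
    claim: str
    minimum_factor: float

    def outcome(self, observed_factor: float) -> str:
        # The prediction is a lower bound, so it holds only if the
        # observed factor meets or exceeds it.
        if observed_factor >= self.minimum_factor:
            return "confirmed"
        return "failed"


prediction = PreregisteredLowerBound(
    claim="deflation of Aschenbrenner's projection is at least 2.3x",
    minimum_factor=2.3,
)
observed = 1.285

print(prediction.outcome(observed))                    # failed
print(round(observed / prediction.minimum_factor, 3))  # 0.559
```

The observed deflation reached roughly 56% of the preregistered bound, so the prediction fails under its own stated rule.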

The author

Written by Kacper Saks, a quantitative developer in Warsaw and an engineer in a regulated European industrial sector.

I write here in a personal capacity. Nothing in this book reflects the position, knowledge, or proprietary work of any employer, past or present.

Reproducibility

Every figure has a corresponding executed notebook in the repository. The full test suite (747 tests across the Deflated Capability Forecast (DCF) package and the figure validations) and the build pipeline are containerised. The companion Python package, deflated-capability-forecast, is the reference implementation of the DCF computations: a 37-symbol public API for computing deflations, probabilities of forecast overfitting, effective-N composites, and the asymmetric distribution machinery of the master equation.

Preregistration

The book preregisters its predictions before computing the framework on each forecast, a discipline imported from clinical-trial registration and from the factor-zoo correction in finance. The preregistered content is the immutable content of the preregistration-v3-locked git tag, at commit c624b3987e75ea41398a47e70003b643fc8ed730, verified by the HASH_PLACEHOLDER-protocol sidecar fingerprint 510c8e8ca334461b42be7d4a3ce6fc1528fb343944880ad0d61fa0e213c83d5c. The full verification recipe, re-runnable with git, sed, and shasum, lives in preregistration/PROTOCOL.md.
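As a rough illustration of the first step only, confirming that the tag resolves to the stated commit, a minimal Python sketch might look like the following. It assumes a local clone of the repository and the standard `git rev-parse` command, and the function names are hypothetical. It does not reproduce the sidecar fingerprint check, whose authoritative steps are in preregistration/PROTOCOL.md.

```python
import subprocess

# Values quoted from the preregistration statement above.
TAG = "preregistration-v3-locked"
EXPECTED_COMMIT = "c624b3987e75ea41398a47e70003b643fc8ed730"


def resolve_tag(tag: str, repo_dir: str = ".") -> str:
    """Return the commit hash that `tag` points to in the clone at repo_dir.

    `^{commit}` peels an annotated tag down to the commit it references.
    Raises subprocess.CalledProcessError if the tag does not exist.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-parse", "--verify", f"{tag}^{{commit}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def commits_match(resolved: str, expected: str = EXPECTED_COMMIT) -> bool:
    """Compare two full commit hashes, ignoring case and surrounding whitespace."""
    return resolved.strip().lower() == expected.strip().lower()


def tag_is_locked(repo_dir: str = ".") -> bool:
    """True if the preregistration tag resolves to the expected commit."""
    return commits_match(resolve_tag(TAG, repo_dir))
```

Calling `tag_is_locked("path/to/clone")` inside a clone would then return True only if the tag still points at the commit quoted above.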

How to cite

The book:

@book{saks2026validation,
  author    = {Saks, Kacper},
  title     = {The Validation Crisis: Why the {AGI} Timeline Debate is Built on Unvalidated Forecasting},
  year      = {2026},
  publisher = {Self-published},
  url       = {https://validationcrisis.ai},
}

The package:

@software{saks2026dcf,
  author  = {Saks, Kacper},
  title   = {{deflated-capability-forecast}: Reference Implementation of the {Deflated Capability Forecast}},
  year    = {2026},
  version = {0.0.1},
  url     = {https://validationcrisis.ai},
}

Licence

This work is published under two licences. The software — source code, reproducibility notebooks, build tooling — is licensed under MIT. The manuscript text and figures — the book itself and the figure sources — are licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). The split lets the package be used in any context with attribution per MIT, while the book and figures travel with the academic-attribution requirement that long-form work of this kind typically carries.
