What this guide is for
If you are a project sponsor, treasury or finance team member, investment committee member or capital approver, you have probably been handed a cost estimate with phrases like "P50 cost", "P80 confidence", "Anticipated Final Cost", and "Optimism Bias Adjustment". These terms come from quantitative risk analysis and the HM Treasury / IPA framework that governs UK capital programme business cases. They are precise. They are also routinely misused — including in business cases that pass governance — and the cost of approving something whose confidence position you do not actually understand is paid years later, in overruns.
This guide is the version of the confidence-level conversation aimed at the people making the approval decision, not at the analysts producing the QRA. It deliberately avoids the methodology depth that practitioner guides cover and instead focuses on what a sponsor needs to know to read a cost estimate properly, ask the right challenging questions, and defend the figure they eventually approve.
For SOMA's practitioner-facing companion guide on the same topic — the deeper coverage of how the percentiles are calculated, what AACE recommended practices apply, and how Monte Carlo simulations are built — see the linked guide on confidence levels at the end of this article.
What "confidence level" actually means on a cost estimate
A confidence level on a capital cost estimate is the probability that the actual outturn cost will be at or below a stated figure. It is the output of a Monte Carlo simulation — a model run thousands of times with varied inputs — that produces not one cost number but a distribution of possible outturn costs. Each percentile of that distribution is one "confidence level": P50 is the cost figure with a 50% probability of not being exceeded, P80 with 80% probability, P95 with 95%.
It is easier to read backwards. If your project P80 is £288m, that means: in 80% of the modelled outturn scenarios the cost is at or below £288m, and in 20% it exceeds that figure. If you fund the project at £288m, you are accepting a one-in-five probability that it will overrun. If you fund at P50 (£262m on the same project) you are accepting roughly a one-in-two probability. If you fund at P95 (£315m) you are accepting a one-in-twenty probability.
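A minimal sketch of how such a distribution and its percentiles might be produced, assuming independent risks with triangular cost impacts. The base cost, the four risk ranges, the seed and all function names are hypothetical and chosen only for illustration; a real QRA would use the programme's own risk register and modelling tool.

```python
import math
import random


def simulate_outturns(base_cost, risks, runs=10_000, seed=1):
    """Return simulated outturn costs (in £m), one per Monte Carlo run.

    Each risk is a (low, most_likely, high) cost impact in £m, sampled from a
    triangular distribution. Risks are treated as independent in this sketch.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    outturns = []
    for _ in range(runs):
        total = base_cost
        for low, most_likely, high in risks:
            total += rng.triangular(low, high, most_likely)
        outturns.append(total)
    return outturns


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def overrun_probability(samples, funded_level):
    """Share of simulated outturns that exceed the funded level."""
    return sum(1 for s in samples if s > funded_level) / len(samples)


# Hypothetical inputs: a £240m base and four illustrative risks (all in £m).
risks = [(0, 5, 30), (-5, 2, 25), (0, 3, 20), (-2, 4, 35)]
outturns = simulate_outturns(240, risks)
p50, p80, p95 = (percentile(outturns, p) for p in (50, 80, 95))

print(f"P50 £{p50:.0f}m, P80 £{p80:.0f}m, P95 £{p95:.0f}m")
print(f"Overrun probability if funded at P80: {overrun_probability(outturns, p80):.0%}")
```

Reading the result backwards is the `overrun_probability` call: funding at the simulated P80 leaves about one run in five above the funded level.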
The distribution is rarely symmetric. On most capital projects, the spread between P50 and P95 is significantly wider on the upside than the spread between P5 and P50 on the downside — what statisticians call right-skewed. A project with a P50 of £100m might have a P95 of £135m but a P5 of only £90m. That asymmetry is what makes the choice of confidence level a commercial decision rather than a statistical one: you are choosing how much risk to underwrite, and the curve does not give you that decision back.
The single most important thing to understand: the deterministic point estimate your team has been working to during planning is not the same as the P50, even though it is often labelled as the "most likely" figure. The most-likely outcome is the modal point of the distribution; the P50 is the median. On a right-skewed distribution the median is higher than the mode, which means a deterministic plan built around the most-likely cost is statistically biased low. This is one of the central reasons capital projects routinely overrun even when the deterministic plan looks defensible.
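The gap between the most-likely figure and the P50 can be illustrated with a single right-skewed triangular distribution, for which the median has a closed form. This is a sketch under stated assumptions: the £90m / £100m / £140m range is hypothetical, and a real cost distribution will not be exactly triangular.

```python
import math


def triangular_median(low, mode, high):
    """Median (P50) of a triangular distribution with the given low, mode and high."""
    midpoint = (low + high) / 2
    if mode >= midpoint:
        return low + math.sqrt((high - low) * (mode - low) / 2)
    return high - math.sqrt((high - low) * (high - mode) / 2)


def probability_at_or_below_mode(low, mode, high):
    """Probability that the outturn is at or below the most-likely value."""
    return (mode - low) / (high - low)


# Hypothetical right-skewed cost range in £m: long tail on the upside.
low, mode, high = 90, 100, 140
median = triangular_median(low, mode, high)
mean = (low + mode + high) / 3
chance_mode_holds = probability_at_or_below_mode(low, mode, high)

print(f"Most likely £{mode}m, P50 £{median:.1f}m, mean £{mean:.1f}m")
print(f"Chance the most-likely figure is not exceeded: {chance_mode_holds:.0%}")
```

In this illustration the most-likely figure sits below both the median and the mean, and a plan funded at the most-likely figure would hold in only about one scenario in five.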
Why your central estimate should be P50, not P80
The IPA Cost Estimating Guidance — the source UK departments treat as the working standard — is unambiguous on this point: the central estimate presented in a business case must be the "Median Scenario / P50 equivalent". This is not a convention you can choose to follow or not; it is the IPA's written requirement.
The reasoning is statistical honesty. The point of presenting a single headline figure to an investment committee is to give them the best-supported estimate of what the project will actually cost. The P50 is, by construction, the figure with equal probability of being too high or too low. Any other percentile chosen as the headline figure is implicitly making a risk-appetite choice on behalf of the committee rather than presenting them with the central estimate they need to make that choice themselves.
Many business cases present the P80 as the headline figure on the (well-intentioned) basis that funding at P80 is more prudent. This is exactly backwards: the P80 is a sensitivity test, not the central estimate. Presenting P80 as the headline conflates the central estimate with the contingency provision, makes it impossible for the committee to see what the project will actually cost as a central position, and routinely hides accumulated optimism bias by labelling the same number as both "the cost" and "the prudent cost". A well-structured business case shows the P50 as the central estimate, the P80 as the upper-bound sensitivity, and the requested funding (often at P80) as a separate explicit decision with its own justification.
When you see a business case with no P50 — only a "budget at P80" — you should be asking what the underlying central estimate is. Funding decisions made against an obscured central estimate routinely turn into disputes about what was actually agreed.
The IPA Cost Estimating Requirements at each stage gate
The IPA framework presents cost confidence as percentage bands around the Anticipated Final Cost (AFC), with the band tightening as the project progresses through stage gates. The published requirements are:
Strategic Outline Case (SOC): a tolerance band of -20% to +50% around the AFC. A project at SOC with an AFC of £100m has a defensible range of £80m to £150m. This is wide because the scope is still being defined and the estimating class (per AACE) is typically Class 5 (rough order of magnitude) to Class 4 (concept screening).
Outline Business Case (OBC): a tolerance band of -15% to +30%. The same project at OBC should be inside £85m to £130m. The narrowing reflects that scope is now more defined, design is at concept-to-developed stage, and estimating is Class 3 (budget authorisation, control and cost-impact assessment) or better.
Final Business Case (FBC): a tolerance band of -10% to +10%. The same project at FBC should be inside £90m to £110m. By this stage scope is locked, design is detailed, and estimating is Class 2 (control or bid / tender estimate).
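The band arithmetic above can be sketched as a small lookup. The percentages are the ones quoted in this guide; the dictionary and function names are hypothetical.

```python
# Tolerance bands as quoted above: (lower %, upper %) around the AFC.
STAGE_GATE_BANDS = {
    "SOC": (-20, 50),
    "OBC": (-15, 30),
    "FBC": (-10, 10),
}


def tolerance_range(afc, stage):
    """Return the (low, high) range in the same units as the AFC."""
    lower_pct, upper_pct = STAGE_GATE_BANDS[stage]
    return afc * (100 + lower_pct) / 100, afc * (100 + upper_pct) / 100


for stage in STAGE_GATE_BANDS:
    low, high = tolerance_range(100, stage)
    print(f"{stage}: £{low:.0f}m to £{high:.0f}m")
```

For a £100m AFC this reproduces the £80m to £150m, £85m to £130m and £90m to £110m ranges given in the text.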
These bands are not the same thing as confidence levels — they are a separate IPA framework for expressing where the AFC sits within an expected range as the project matures. You will often see both presented together: the central estimate at P50, the IPA band giving the expected tolerance at this stage gate, and the P80 figure giving the explicit upper-bound sensitivity used to size contingency. A well-structured business case shows all three and explains how they relate.
When the P80 from the QRA falls outside the IPA stage-gate band, that is a flag worth investigating. It either means the project carries materially more risk than is normal at this stage (which the business case should explicitly explain), or that the QRA is over-stating the risk (which a peer review would identify). Either way it is a sponsor-level question, not an analyst-level question.
When P80 is the right upper-bound sensitivity (and when P95 is)
P80 has become the de facto UK departmental convention for upper-bound sensitivity on capital programmes. It is the percentile most departments use to size contingency, the percentile contractors typically have to commit to under target cost arrangements, and the percentile that boards see in approval papers. But the convention is just that — a working benchmark — and the right percentile for any given decision depends on three things: the consequence of overrun, the appetite of the funding body for that consequence, and the cost of buying additional certainty.
P80 fits comfortably on most UK infrastructure programmes where the funding body is a government department, the consequence of overrun is reputational and budgetary but recoverable, and the marginal contingency between P80 and higher percentiles is significant. On a £200m programme with a P50 of £200m and a P80 of £225m, the £25m contingency at P80 buys a one-in-five-protected position; moving to P95 might cost another £20m for only modest additional protection.
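One way to make the "cost of buying additional certainty" concrete is to express each step up the curve as funding per percentage point of confidence. The sketch below uses the figures from the paragraph above, taking the illustrative "another £20m" to imply a P95 of £245m; the function name is hypothetical.

```python
def cost_per_point(lower_pct, lower_cost, upper_pct, upper_cost):
    """Extra funding (£m) per percentage point of added confidence."""
    return (upper_cost - lower_cost) / (upper_pct - lower_pct)


p50_to_p80 = cost_per_point(50, 200, 80, 225)  # £25m buys 30 points
p80_to_p95 = cost_per_point(80, 225, 95, 245)  # £20m buys 15 points

print(f"P50 to P80: £{p50_to_p80:.2f}m per point")
print(f"P80 to P95: £{p80_to_p95:.2f}m per point")
```

On these numbers each point of confidence above P80 costs more than each point below it, which is the usual pattern on a right-skewed curve.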
P95 is the right percentile where the consequence of overrun is unacceptable rather than uncomfortable. Safety-critical defence programmes (where overrun risks operational capability), nuclear new-build (where overrun risks reputational and political consequences far beyond the budget), and portfolio-level capital safeguards (where the sponsor is underwriting many projects and cannot afford the one-in-five overrun rate that P80 implies) are the typical P95 contexts. The HM Treasury Green Book (2022, paragraphs 6.72-6.84) uses P90 as a worked example of how to express uncertainty around a central estimate. It neither mandates that level nor rules it out, but signals that higher-percentile thinking is appropriate for high-value, high-impact proposals.
P50 alone is appropriate where the funding decision explicitly accepts risk in exchange for lower upfront capital commitment — typically innovation programmes, technology demonstrators, and early-stage R&D where the sponsor has deliberately chosen to accept a 50% probability of overrun. This is a valid choice if it is made explicitly. The failure mode is funding at P50 while claiming P80 confidence, which is rarer than it was but still occurs on programmes where the QRA has not been scrutinised properly.
The practical position for most UK public-sector business cases is: present the P50 as the central estimate (IPA-required), present P80 as the upper-bound sensitivity (departmental convention), use the IPA stage-gate tolerance bands to test that the figures are within normal expectation, and document why the funded position is where it is on the curve. "P80 because the Green Book says so" is not a defensible justification — the Green Book does not say so. "P80 because departmental finance has set that as the risk-appetite point and the IPA cost-band tolerance accommodates it" is defensible.
What to challenge in the QRA presented to you
There are six questions a sponsor should ask of any QRA report before approving the figures. Each one targets a known failure mode that produces unreliable confidence-level outputs.
First — does the risk register reflect this specific programme, or has it been copy-pasted from a template? A QRA built on a generic risk register produces generic numbers. Ask the team to point to the three risks that are most specific to this programme. If they cannot, the analysis is generic.
Second — has correlation been modelled, and how? Zero correlation between risks (the default in many tools) almost always understates the spread because real projects bunch risks around common causes. A serious QRA shows the correlation matrix it has used and explains the dominant correlation pairs.
Third — does the output distribution have a sensible shape? A right-skewed distribution (longer tail on the upside) is what you should see on most capital projects. A distribution that is symmetric and narrow is usually a sign that the inputs were guessed conservatively rather than calibrated against benchmark data. Ask to see the histogram, not just the percentile table.
Fourth — what is in the tornado chart? A defensible QRA produces a tornado chart showing which inputs drive the variance. A small number of dominant drivers (typically three to six) is the normal pattern; a flat tornado where every input contributes equally usually means the model is generic. Ask which three drivers are most consequential and whether the team has a mitigation plan for them.
Fifth — when was the QRA last run, and what has changed since? A QRA that is more than three months old on a live programme is likely to be stale. Scope changes, schedule slippage, supply chain shifts and new risks accumulate. Ask whether the cost-confidence position you are being shown reflects the current state of the project.
Sixth — who has peer-reviewed the QRA? An internally-produced QRA that has not been reviewed by someone independent of the project team carries less weight at gateway review than one that has. For programmes above the IPA materiality threshold, a peer review or independent assurance opinion is typically expected before the figures support a funding decision.
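Two of the questions above, on correlation and on the tornado chart, lend themselves to a short numerical sketch. The first function uses the standard variance-of-a-sum identity, under the simplifying assumption that every risk has the same standard deviation and the same pairwise correlation. The second ranks risks by their share of total variance, which is one simple proxy for a tornado chart and assumes independent triangular risks. The risk names and ranges are hypothetical.

```python
import math


def total_std_dev(risk_std_dev, n_risks, correlation):
    """Std dev of the sum of n equally sized risks with a common pairwise correlation."""
    variance = risk_std_dev ** 2 * (n_risks + n_risks * (n_risks - 1) * correlation)
    return math.sqrt(variance)


def triangular_variance(low, mode, high):
    """Variance of a triangular distribution."""
    return (low**2 + mode**2 + high**2 - low * mode - low * high - mode * high) / 18


def variance_shares(risks):
    """Rank risks by share of total variance (independent risks assumed)."""
    variances = {name: triangular_variance(*params) for name, params in risks.items()}
    total = sum(variances.values())
    ranked = sorted(variances.items(), key=lambda item: item[1], reverse=True)
    return [(name, var / total) for name, var in ranked]


# Correlation: twenty risks of £3m standard deviation each (illustrative).
independent = total_std_dev(3, 20, 0.0)
correlated = total_std_dev(3, 20, 0.3)
print(f"Spread with zero correlation: £{independent:.1f}m")
print(f"Spread with 0.3 correlation:  £{correlated:.1f}m")

# Drivers: hypothetical (low, most likely, high) cost impacts in £m.
risks = {
    "Ground conditions": (0, 5, 40),
    "Utilities diversion": (0, 3, 15),
    "Design change": (-2, 2, 10),
    "Inflation": (0, 4, 20),
}
drivers = variance_shares(risks)
for name, share in drivers:
    print(f"{name}: {share:.0%} of variance")
```

With these inputs a modest common correlation more than doubles the spread, and one risk dominates the variance, which is the kind of pattern a sponsor would expect the team to be able to name and explain.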
A worked example — the numbers in plain English
Consider a £200m hospital construction programme being presented for FBC approval. The estimated cost build-up gives a deterministic figure of £200m. The project team's QRA gives a P50 of £212m, a P80 of £241m, and a P95 of £268m. The IPA stage-gate band at FBC is ±10%; applied to the P50 central estimate of £212m, that gives an expected range of roughly £191m to £233m.
How does a sponsor read this? The deterministic £200m sits below the P50 of £212m, indicating that the deterministic plan is statistically optimistic: the modelled distribution suggests £212m is the more defensible central estimate. The IPA band's upper limit of £233m sits below the P80 of £241m, so the P80 falls outside the IPA band, suggesting the project carries slightly more risk than is normal at FBC stage. The P95 at £268m is the figure that would protect against severe overrun, but at considerable contingency cost.
The decision the sponsor is being asked to make is: at what figure to approve. Approving at the deterministic £200m would carry a higher-than-50% probability of overrun and would be inconsistent with the IPA requirement to present a P50 central estimate. Approving at the P50 of £212m would mean accepting roughly a one-in-two probability of overrun, which is statistically honest but may be politically uncomfortable. Approving at the P80 of £241m would buy a one-in-five protected position and is the conventional UK departmental choice, but it sits slightly outside the IPA stage-gate band, which the business case needs to explain. Approving at the P95 of £268m would buy a one-in-twenty position but at £27m of additional contingency over the P80 (£56m over the P50).
A well-structured business case in this position would present P50 as the central estimate, request funding at P80, explain why the P80 sits slightly outside the IPA band (typically because the programme has identifiable risks that are larger than the comparator dataset assumes), and set out a contingency drawdown protocol that lets the project team use the P50-to-P80 gap as the project progresses without renegotiating funding. The sponsor approves the figure understanding both the central estimate and the risk position they are underwriting.
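The arithmetic behind this worked example can be laid out in a few lines. This is a sketch that assumes the ±10% FBC band is centred on the P50 of £212m, which is the reading that reproduces the £233m upper limit quoted above; the variable names are hypothetical.

```python
deterministic, p50, p80, p95 = 200, 212, 241, 268  # £m, figures from the example

# ±10% FBC band, assumed here to be centred on the P50.
band_low, band_high = p50 * 90 / 100, p50 * 110 / 100

p80_outside_band = p80 > band_high
gap_deterministic_to_p50 = p50 - deterministic
contingency_p50_to_p80 = p80 - p50
extra_p80_to_p95 = p95 - p80
extra_p50_to_p95 = p95 - p50

print(f"FBC band: £{band_low:.1f}m to £{band_high:.1f}m")
print(f"P80 outside band: {p80_outside_band}")
print(f"Deterministic to P50: £{gap_deterministic_to_p50}m")
print(f"P50 to P80: £{contingency_p50_to_p80}m")
print(f"P80 to P95: £{extra_p80_to_p95}m (P50 to P95: £{extra_p50_to_p95}m)")
```

Laid out this way, the P50-to-P80 gap of £29m is the contingency a drawdown protocol would govern, and moving from P80 to P95 adds a further £27m.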
Optimism bias adjustment and the Green Book
Alongside the P50 central estimate, the HM Treasury Green Book requires explicit adjustment for optimism bias on capital business cases. Optimism bias is the documented tendency for project costs to outturn higher than ex ante estimates by a roughly predictable amount, derived from large empirical studies of completed projects. The Green Book publishes default optimism bias uplifts by sector and project type (typically 6% to 50% on capital expenditure depending on the sector and stage), to be applied unless the project can demonstrate why the default does not apply.
The relationship between QRA-derived confidence levels and optimism bias adjustment is a regular point of confusion. They are addressing the same underlying problem (estimates outturn higher than expected) by different methods. The QRA models the project-specific risks and uncertainties bottom-up; the optimism bias adjustment applies a top-down empirical uplift derived from comparable historical projects. Both can be required by the Green Book — the QRA as the project-specific evidence, the optimism bias uplift as the empirical reality check.
In practice the convention is to apply optimism bias to the central estimate (P50) before reading the P80 from the QRA. So if a project has a deterministic estimate of £200m, a sector optimism-bias uplift of 24% gives an optimism-adjusted base of £248m. The QRA then runs on the adjusted base and produces P50/P80/P95 figures around that. This avoids the double-counting that comes from applying both methods to the same baseline, which would produce contingency figures that are unrealistically large.
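The uplift step in that example is simple arithmetic, sketched below. The 24% rate is the illustrative figure from the paragraph above, not a recommendation for any particular sector, and the function name is hypothetical.

```python
def apply_optimism_bias(estimate, uplift_pct):
    """Return the optimism-adjusted base for a given percentage uplift."""
    return estimate * (100 + uplift_pct) / 100


adjusted_base = apply_optimism_bias(200, 24)  # £m
print(f"Optimism-adjusted base: £{adjusted_base:.0f}m")
```

Stating the rate and the point of application this explicitly in the business case is what lets the approving body see whether a headline figure already includes the uplift.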
Sponsors should ensure the business case is explicit about whether optimism bias has been applied, at what rate, and at what point in the calculation chain. A common failure mode is for the optimism bias adjustment to be presented separately from the QRA result with no clear statement of how they combine, leaving the approving body uncertain whether the figure they are seeing already incorporates the empirical uplift or not.
The decision in front of you
When you are presented with a capital project cost estimate and asked to approve it, the questions in front of you are: what is the project's P50 central estimate (and is it the optimism-adjusted figure or not)? What is the P80 the team is recommending as the funded position? Where do those figures sit relative to the IPA stage-gate band at this stage? What are the three risks driving the spread, and what mitigation is planned? When was the QRA last run? Who has peer-reviewed it?
If the business case answers those six questions clearly, the figure is approvable on the evidence presented. If any of them is opaque — if the central estimate is presented at P80 with no underlying P50, if the IPA band is not referenced, if the risk drivers cannot be named, or if peer review is missing — the right response is not to refuse approval but to send the business case back for the missing evidence before approving. Approving against an incomplete confidence-level picture is what produces the disputes years later about what was actually agreed at sanction.
The single biggest cost-control discipline available to a sponsor is to refuse to approve a figure whose confidence position they do not actually understand. The single most common controls failure is to approve one anyway under time pressure. The framework above is what makes the difference.