Guide

Capital Project Estimate Confidence Level — A Sponsor's Guide to P50, P80, P95 and IPA Cost Bands

For project sponsors, treasury teams, investment committees and capital approval boards. What "confidence level" actually means on a cost estimate, what the IPA Cost Estimating Requirements specify at each stage gate, when P80 is the right upper-bound sensitivity, and what to challenge in the QRA your project team has put in front of you.

Adam O'Neill · 22 May 2026 · 12 min read · Part of Quantitative risk analysis (QRA)

What is a capital project estimate confidence level?

A capital project estimate confidence level is the probability that the actual outturn cost will be at or below a stated figure. P50 is the median (50% probability of not being exceeded), P80 is the conventional UK upper-bound sensitivity, and P95 is used for safety-critical programmes. Separately, the IPA Cost Estimating Requirements set percentage tolerance bands around the Anticipated Final Cost (-20% to +50% at SOC, -15% to +30% at OBC, -10% to +10% at FBC).

What this guide is for

If you are a project sponsor, treasury or finance team member, investment committee member or capital approver, you have probably been handed a cost estimate with phrases like "P50 cost", "P80 confidence", "Anticipated Final Cost", and "Optimism Bias Adjustment". These terms come from quantitative risk analysis and the HM Treasury / IPA framework that governs UK capital programme business cases. They are precise. They are also routinely misused — including in business cases that pass governance — and the cost of approving something whose confidence position you do not actually understand is paid years later, in overruns.

This guide is the version of the confidence-level conversation aimed at the people making the approval decision, not at the analysts producing the QRA. It deliberately avoids the methodology depth that practitioner guides cover and instead focuses on what a sponsor needs to know to read a cost estimate properly, to ask the right challenging questions, and to defend the figure you eventually approve.

For SOMA's practitioner-facing companion guide on the same topic — the deeper coverage of how the percentiles are calculated, what AACE recommended practices apply, and how Monte Carlo simulations are built — see the linked guide on confidence levels at the end of this article.

What "confidence level" actually means on a cost estimate

A confidence level on a capital cost estimate is the probability that the actual outturn cost will be at or below a stated figure. It is the output of a Monte Carlo simulation — a model run thousands of times with varied inputs — that produces not one cost number but a distribution of possible outturn costs. Each percentile of that distribution is one "confidence level": P50 is the cost figure with a 50% probability of not being exceeded, P80 with 80% probability, P95 with 95%.

It is easier to read backwards. If your project P80 is £288m, that means: in 80% of the modelled outturn scenarios the cost is at or below £288m, and in 20% it exceeds that figure. If you fund the project at £288m, you are accepting a one-in-five probability that it will overrun. If you fund at P50 (£262m on the same project) you are accepting roughly a one-in-two probability. If you fund at P95 (£315m) you are accepting a one-in-twenty probability.

The distribution is rarely symmetric. On most capital projects, the spread between P50 and P95 is significantly wider on the upside than the spread between P5 and P50 on the downside, which statisticians call right-skewed. A project with a P50 of £100m might have a P95 of £135m but a P5 of only £90m. That asymmetry is what makes the choice of confidence level a commercial decision rather than a statistical one: you are choosing how much risk to underwrite, and the curve cannot make that decision for you.
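To make the percentile reading concrete, here is a minimal Python sketch of how a Monte Carlo run turns ranged inputs into P-values. The three cost items, their ranges and the choice of triangular distributions are illustrative assumptions, not figures from any real QRA, and the toy model ignores correlation for brevity.

```python
import random

def simulate_outturn_costs(n_runs=20_000, seed=1):
    """Return sorted simulated outturn costs (£m) for a toy three-item cost model."""
    rng = random.Random(seed)
    # Hypothetical (low, most likely, high) ranges in £m, each with a long upside tail
    items = [(35, 40, 60), (25, 30, 50), (18, 20, 35)]
    costs = []
    for _ in range(n_runs):
        # random.triangular takes its arguments in the order (low, high, mode)
        total = sum(rng.triangular(low, high, mode) for low, mode, high in items)
        costs.append(total)
    costs.sort()
    return costs

def percentile(sorted_costs, p):
    """Cost at or below which a fraction p of the simulated runs fall."""
    index = min(int(p * len(sorted_costs)), len(sorted_costs) - 1)
    return sorted_costs[index]

costs = simulate_outturn_costs()
p5, p50, p80, p95 = (percentile(costs, p) for p in (0.05, 0.50, 0.80, 0.95))
print(f"P5 £{p5:.0f}m  P50 £{p50:.0f}m  P80 £{p80:.0f}m  P95 £{p95:.0f}m")
print(f"Upside spread (P95 - P50): £{p95 - p50:.1f}m")
print(f"Downside spread (P50 - P5): £{p50 - p5:.1f}m")
```

With right-skewed inputs like these, the gap from P50 to P95 comes out wider than the gap from P5 to P50, which is the asymmetry described above.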

The single most important thing to understand: the deterministic point estimate your team has been working to during planning is not the same as the P50, even though it is often labelled as the "most likely" figure. The most-likely outcome is the modal point of the distribution; the P50 is the median. On a right-skewed distribution the median is higher than the mode, which means a deterministic plan built around the most-likely cost is statistically biased low. This is one of the central reasons capital projects routinely overrun even when the deterministic plan looks defensible.
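The gap between the most-likely figure and the median can be illustrated with a short calculation. The sketch below assumes a lognormal cost distribution purely for illustration (real QRA outputs need not be lognormal), with a hypothetical median of £100m and a shape parameter `sigma` of 0.2.

```python
import math
from statistics import NormalDist

def lognormal_summary(median, sigma):
    """Mode, median, mean and overrun odds for an assumed lognormal cost distribution.

    sigma controls how long the upside tail is; both inputs are illustrative.
    """
    mu = math.log(median)
    mode = math.exp(mu - sigma ** 2)       # most likely single value
    mean = math.exp(mu + sigma ** 2 / 2)   # expected value
    # P(cost > mode) = P(Z > -sigma) = Phi(sigma) for a standard normal Z
    prob_exceed_mode = NormalDist().cdf(sigma)
    return {"mode": mode, "median": median, "mean": mean,
            "prob_exceed_mode": prob_exceed_mode}

summary = lognormal_summary(median=100.0, sigma=0.2)
print(f"Mode £{summary['mode']:.1f}m, median £{summary['median']:.1f}m, "
      f"mean £{summary['mean']:.1f}m")
print(f"Probability of exceeding the most-likely figure: "
      f"{summary['prob_exceed_mode']:.0%}")
```

Under these assumed parameters, a plan pitched at the most-likely cost has roughly a 58% chance of being exceeded rather than 50%, and the gap widens as the upside tail lengthens.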

Why your central estimate should be P50, not P80

The IPA Cost Estimating Guidance — the source UK departments treat as the working standard — is unambiguous on this point: the central estimate presented in a business case must be the "Median Scenario / P50 equivalent". This is not a convention you can choose to follow or not; it is the IPA's written requirement.

The reasoning is statistical honesty. The point of presenting a single headline figure to an investment committee is to give them the best-supported estimate of what the project will actually cost. The P50 is, by construction, the figure with equal probability of being too high or too low. Any other percentile chosen as the headline figure is implicitly making a risk-appetite choice on behalf of the committee rather than presenting them with the central estimate they need to make that choice themselves.

Many business cases present the P80 as the headline figure on the (well-intentioned) basis that funding at P80 is more prudent. This is exactly backwards: the P80 is a sensitivity test, not the central estimate. Presenting P80 as the headline conflates the central estimate with the contingency provision, makes it impossible for the committee to see what the project will actually cost as a central position, and routinely hides accumulated optimism bias by labelling the same number as both "the cost" and "the prudent cost". A well-structured business case shows the P50 as the central estimate, the P80 as the upper-bound sensitivity, and the requested funding (often at P80) as a separate explicit decision with its own justification.

When you see a business case with no P50 — only a "budget at P80" — you should be asking what the underlying central estimate is. Funding decisions made against an obscured central estimate routinely turn into disputes about what was actually agreed.

The IPA Cost Estimating Requirements at each stage gate

The IPA framework presents cost confidence as percentage bands around the Anticipated Final Cost (AFC), with the band tightening as the project progresses through stage gates. The published requirements are:

Strategic Outline Case (SOC): a tolerance band of -20% to +50% around the AFC. A project at SOC with an AFC of £100m has a defensible range of £80m to £150m. This is wide because the scope is still being defined and the estimating class (per AACE) is typically Class 5 (concept screening) to Class 4 (study or feasibility).

Outline Business Case (OBC): a tolerance band of -15% to +30%. The same project at OBC should be inside £85m to £130m. The narrowing reflects that scope is now more defined, design is at concept-to-developed stage, and estimating is Class 3 (budget authorisation or control) or better.

Final Business Case (FBC): a tolerance band of -10% to +10%. The same project at FBC should be inside £90m to £110m. By this stage scope is locked, design is detailed, and estimating is Class 2 (control or bid / tender estimate).

These bands are not the same thing as confidence levels — they are a separate IPA framework for expressing where the AFC sits within an expected range as the project matures. You will often see both presented together: the central estimate at P50, the IPA band giving the expected tolerance at this stage gate, and the P80 figure giving the explicit upper-bound sensitivity used to size contingency. A well-structured business case shows all three and explains how they relate.

When the P80 from the QRA falls outside the IPA stage-gate band, that is a flag worth investigating. It either means the project carries materially more risk than is normal at this stage (which the business case should explicitly explain), or that the QRA is over-stating the risk (which a peer review would identify). Either way it is a sponsor-level question, not an analyst-level question.
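The band arithmetic and the P80 check described above can be sketched in a few lines of Python. The band percentages are the ones quoted in this guide; the function names and the dictionary layout are hypothetical, chosen only for this illustration.

```python
# Tolerance bands as quoted in this guide: (downside, upside) as fractions of the AFC
IPA_BANDS = {
    "SOC": (-0.20, 0.50),
    "OBC": (-0.15, 0.30),
    "FBC": (-0.10, 0.10),
}

def tolerance_range(afc, stage):
    """Return the (low, high) expected range in the same units as the AFC."""
    low_pct, high_pct = IPA_BANDS[stage]
    return afc * (1 + low_pct), afc * (1 + high_pct)

def p80_outside_band(afc, stage, p80):
    """True when the QRA-derived P80 falls outside the stage-gate band."""
    low, high = tolerance_range(afc, stage)
    return not (low <= p80 <= high)

for stage in ("SOC", "OBC", "FBC"):
    low, high = tolerance_range(100.0, stage)
    print(f"{stage}: £{low:.0f}m to £{high:.0f}m")

# Hypothetical check: an AFC of £100m at OBC with a QRA P80 of £138m
print(p80_outside_band(100.0, "OBC", 138.0))
```

A `True` result is not a failure in itself. It is the prompt for the sponsor-level question of whether the project genuinely carries more risk than is normal at this stage or whether the QRA is overstating it.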

When P80 is the right upper-bound sensitivity (and when P95 is)

P80 has become the de facto UK departmental convention for upper-bound sensitivity on capital programmes. It is the percentile most departments use to size contingency, the percentile contractors typically have to commit to under target cost arrangements, and the percentile that boards see in approval papers. But the convention is just that — a working benchmark — and the right percentile for any given decision depends on three things: the consequence of overrun, the appetite of the funding body for that consequence, and the cost of buying additional certainty.

P80 fits comfortably on most UK infrastructure programmes where the funding body is a government department, the consequence of overrun is reputational and budgetary but recoverable, and the marginal contingency between P80 and higher percentiles is significant. On a £200m programme with a P50 of £200m and a P80 of £225m, the £25m contingency at P80 buys a one-in-five-protected position; moving to P95 might cost another £20m for only modest additional protection.

P95 is the right percentile where the consequence of overrun is unacceptable rather than uncomfortable. Safety-critical defence programmes (where overrun risks operational capability), nuclear new-build (where overrun risks reputational and political consequences far beyond the budget), and portfolio-level capital safeguards (where the sponsor is underwriting many projects and cannot afford the one-in-five overrun rate that P80 implies) are the typical P95 contexts. The HM Treasury Green Book (2022, paragraphs 6.72-6.84) uses P90 as a worked example of how to express uncertainty around a central estimate, neither mandating that level nor ruling it out, but signalling that higher-percentile thinking is appropriate for high-value, high-impact proposals.

P50 alone is appropriate where the funding decision explicitly accepts risk in exchange for lower upfront capital commitment — typically innovation programmes, technology demonstrators, and early-stage R&D where the sponsor has deliberately chosen to accept a 50% probability of overrun. This is a valid choice if it is made explicitly. The failure mode is funding at P50 while claiming P80 confidence, which is rarer than it was but still occurs on programmes where the QRA has not been scrutinised properly.

The practical position for most UK public-sector business cases is: present the P50 as the central estimate (IPA-required), present P80 as the upper-bound sensitivity (departmental convention), use the IPA stage-gate tolerance bands to test that the figures are within normal expectation, and document why the funded position is where it is on the curve. "P80 because the Green Book says so" is not a defensible justification — the Green Book does not say so. "P80 because departmental finance has set that as the risk-appetite point and the IPA cost-band tolerance accommodates it" is defensible.

What to challenge in the QRA presented to you

Six questions a sponsor should ask of any QRA report before approving the figures. Each question targets a known failure mode that produces unreliable confidence-level outputs.

First — does the risk register reflect this specific programme, or has it been copy-pasted from a template? A QRA built on a generic risk register produces generic numbers. Ask the team to point to the three risks that are most specific to this programme. If they cannot, the analysis is generic.

Second — has correlation been modelled, and how? Zero correlation between risks (the default in many tools) almost always understates the spread because real projects bunch risks around common causes. A serious QRA shows the correlation matrix it has used and explains the dominant correlation pairs.

Third — does the output distribution have a sensible shape? A right-skewed distribution (longer tail on the upside) is what you should see on most capital projects. A distribution that is symmetric and narrow is usually a sign that the inputs were guessed conservatively rather than calibrated against benchmark data. Ask to see the histogram, not just the percentile table.

Fourth — what is in the tornado chart? A defensible QRA produces a tornado chart showing which inputs drive the variance. A small number of dominant drivers (typically three to six) is the normal pattern; a flat tornado where every input contributes equally usually means the model is generic. Ask which three drivers are most consequential and whether the team has a mitigation plan for them.

Fifth — when was the QRA last run, and what has changed since? A QRA that is more than three months old on a live programme is likely to be stale. Scope changes, schedule slippage, supply chain shifts and new risks accumulate. Ask whether the cost-confidence position you are being shown reflects the current state of the project.

Sixth — who has peer-reviewed the QRA? An internally-produced QRA that has not been reviewed by someone independent of the project team carries less weight at gateway review than one that has. For programmes above the IPA materiality threshold, a peer review or independent assurance opinion is typically expected before the figures support a funding decision.
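The second question, on correlation, is the one whose effect is easiest to demonstrate numerically. The sketch below is a toy model under stated assumptions: ten hypothetical risks, each with a mean of £10m and a standard deviation of £2m, drawn from normal distributions and linked by a single common factor so that every pair of risks has the same correlation `rho`. A real QRA would use skewed distributions and a fuller correlation matrix.

```python
import math
import random

def simulate_total(rho, n_risks=10, n_runs=20_000, seed=7):
    """Sorted simulated totals (£m) for equally correlated risks (one-factor model)."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        common = rng.gauss(0, 1)  # shared driver, e.g. a common cause hitting all risks
        total = 0.0
        for _ in range(n_risks):
            own = rng.gauss(0, 1)  # risk-specific variation
            # Pairwise correlation between any two risks is rho under this construction
            z = math.sqrt(rho) * common + math.sqrt(1 - rho) * own
            total += 10 + 2 * z  # hypothetical mean £10m, standard deviation £2m
        totals.append(total)
    totals.sort()
    return totals

def p50_to_p80_spread(sorted_totals):
    n = len(sorted_totals)
    return sorted_totals[int(0.80 * n)] - sorted_totals[int(0.50 * n)]

spread_independent = p50_to_p80_spread(simulate_total(rho=0.0))
spread_correlated = p50_to_p80_spread(simulate_total(rho=0.5))
print(f"P50-to-P80 spread, zero correlation: £{spread_independent:.1f}m")
print(f"P50-to-P80 spread, correlation 0.5:  £{spread_correlated:.1f}m")
```

The central estimate barely moves between the two runs, but the P50-to-P80 gap (the part that sizes contingency) is materially wider once correlation is included, which is why a zero-correlation default tends to understate the spread.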

A worked example — the numbers in plain English

Consider a £200m hospital construction programme being presented for FBC approval. The estimated cost build-up gives a deterministic figure of £200m. The project team's QRA gives a P50 of £212m, a P80 of £241m, and a P95 of £268m. The IPA stage-gate band at FBC is ±10%, which applied to the P50 central estimate gives an expected range of roughly £191m to £233m.

How does a sponsor read this? The deterministic £200m sits below the P50 of £212m, indicating that the deterministic plan is statistically optimistic: the modelled distribution suggests £212m is the most defensible central estimate. The IPA band's upper limit of £233m sits below the P80 of £241m, so the P80 falls outside the band, suggesting the project carries slightly more risk than is normal at FBC stage. The P95 at £268m is the figure that would protect against severe overrun but at considerable contingency cost.

The decision the sponsor is being asked to make is: at what figure to approve. Approving at the deterministic £200m would carry a higher-than-50% probability of overrun and would be inconsistent with the IPA requirement to present a P50 central estimate. Approving at the P50 of £212m would mean accepting roughly a one-in-two probability of overrun, which is statistically honest but may be politically uncomfortable. Approving at the P80 of £241m would buy a one-in-five protected position and is the conventional UK departmental choice, but it sits slightly outside the IPA stage-gate band, which the business case needs to explain. Approving at the P95 of £268m would buy a one-in-twenty position but at £27m of additional contingency over the P80 (£56m over the P50).

A well-structured business case in this position would present P50 as the central estimate, request funding at P80, explain why the P80 sits slightly outside the IPA band (typically because the programme has identifiable risks that are larger than the comparator dataset assumes), and set out a contingency drawdown protocol that lets the project team use the P50-to-P80 gap as the project progresses without renegotiating funding. The sponsor approves the figure understanding both the central estimate and the risk position they are underwriting.
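The arithmetic behind the worked example can be checked directly. The short Python sketch below uses the figures from the example and assumes the FBC band is taken around the P50 central estimate, which is the reading that reproduces the guide's FBC range.

```python
# Figures from the worked example, in £m
deterministic, p50, p80, p95 = 200.0, 212.0, 241.0, 268.0

# Assumption: the FBC band of -10% / +10% is taken around the P50 central estimate
band_low, band_high = p50 * 0.90, p50 * 1.10

# Each funding option: (approved figure, modelled probability of overrun)
options = {"P50": (p50, 0.50), "P80": (p80, 0.20), "P95": (p95, 0.05)}

print(f"FBC band around P50: £{band_low:.1f}m to £{band_high:.1f}m")
for label, (figure, overrun_probability) in options.items():
    contingency = figure - p50
    position = "inside" if band_low <= figure <= band_high else "outside"
    print(f"{label}: £{figure:.0f}m, contingency over P50 £{contingency:.0f}m, "
          f"overrun probability {overrun_probability:.0%}, {position} the band")
print(f"Step from P80 to P95: £{p95 - p80:.0f}m")
```

Laid out this way, the approval choice is visible as a trade: each step up the curve buys a lower overrun probability at a stated contingency cost, and the P80 and P95 both sit outside the FBC band and need explaining.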

Optimism bias adjustment and the Green Book

Alongside the P50 central estimate, the HM Treasury Green Book requires explicit adjustment for optimism bias on capital business cases. Optimism bias is the documented tendency for project costs to outturn higher than ex ante estimates by a roughly predictable amount, derived from large empirical studies of completed projects. The Green Book publishes default optimism bias uplifts by sector and project type (typically 6% to 50% on capital expenditure depending on the sector and stage), to be applied unless the project can demonstrate why the default does not apply.

The relationship between QRA-derived confidence levels and optimism bias adjustment is a regular point of confusion. They are addressing the same underlying problem (estimates outturn higher than expected) by different methods. The QRA models the project-specific risks and uncertainties bottom-up; the optimism bias adjustment applies a top-down empirical uplift derived from comparable historical projects. Both can be required by the Green Book — the QRA as the project-specific evidence, the optimism bias uplift as the empirical reality check.

In practice the convention is to apply optimism bias to the central estimate (P50) before reading the P80 from the QRA. So if a project has a deterministic estimate of £200m, a sector optimism-bias uplift of 24% gives an optimism-adjusted base of £248m. The QRA then runs on the adjusted base and produces P50/P80/P95 figures around that. Applying the uplift once, at a known point in the calculation chain, avoids the double-counting that would arise if both methods were applied independently to the same baseline, which would produce unrealistically large contingency figures.
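The uplift step in that example is a single multiplication, sketched here in Python. The function name is hypothetical, and the 24% rate is simply the figure used in the example above, not a recommendation for any particular sector.

```python
def optimism_adjusted_base(estimate, uplift_rate):
    """Apply a top-down optimism-bias uplift to a baseline estimate (same units in and out)."""
    return estimate * (1 + uplift_rate)

deterministic_estimate = 200.0  # £m, from the example above
uplift_rate = 0.24              # 24% uplift, as used in the example above

adjusted_base = optimism_adjusted_base(deterministic_estimate, uplift_rate)
uplift_amount = adjusted_base - deterministic_estimate
print(f"Optimism-adjusted base: £{adjusted_base:.0f}m (uplift of £{uplift_amount:.0f}m)")
```

Recording the rate and the figure it was applied to, as the two named inputs do here, is what lets an approving body see at what point in the chain the uplift entered.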

Sponsors should ensure the business case is explicit about whether optimism bias has been applied, at what rate, and at what point in the calculation chain. A common failure mode is for the optimism bias adjustment to be presented separately from the QRA result with no clear statement of how they combine, leaving the approving body uncertain whether the figure they are seeing already incorporates the empirical uplift or not.

The decision in front of you

When you are presented with a capital project cost estimate and asked to approve it, the questions in front of you are: what is the project's P50 central estimate (and is it the optimism-adjusted figure or not)? What is the P80 the team is recommending as the funded position? Where do those figures sit relative to the IPA stage-gate band at this stage? What are the three risks driving the spread, and what mitigation is planned? When was the QRA last run, and who has peer-reviewed it?

If the business case answers those questions clearly, the figure is approvable on the evidence presented. If any of them is opaque (if the central estimate is presented at P80 with no underlying P50, if the IPA band is not referenced, if the risk drivers cannot be named, or if peer review is missing), the right response is not to refuse approval but to send the business case back for the missing evidence before approving. Approving against an incomplete confidence-level picture is what produces the disputes years later about what was actually agreed at sanction.

The single biggest cost-control discipline available to a sponsor is to refuse to approve a figure they do not actually understand the confidence position behind. The single most common controls failure is to approve one anyway under time pressure. The framework above is what makes the difference.

FAQ

Frequently asked

What does "confidence level" mean on a capital project cost estimate?

A confidence level is the probability that the actual outturn cost will be at or below a stated figure. A P50 cost has a 50% probability of not being exceeded; a P80 cost has 80% probability; a P95 cost has 95% probability. The confidence levels come from a Monte Carlo simulation that runs the cost model thousands of times with varied inputs, producing a distribution of possible outturn costs from which any percentile can be read.

Does the IPA require P50 or P80 as the central estimate for a UK business case?

The IPA Cost Estimating Guidance requires the central estimate in a business case to be the "Median Scenario / P50 equivalent". P80 is not the IPA's written requirement for the central estimate; it is a UK departmental convention for upper-bound sensitivity that has emerged separately. Presenting P80 as the headline figure conflates the central estimate with the contingency provision and is inconsistent with the IPA requirement.

What are the IPA cost-estimating tolerance bands at each stage gate?

The IPA tolerance bands around the Anticipated Final Cost are: Strategic Outline Case -20% to +50%; Outline Business Case -15% to +30%; Final Business Case -10% to +10%. These bands reflect the expected range of the AFC as scope, design and estimating maturity improve through stage gates. They are separate from the QRA-derived confidence percentiles and the two are typically presented together: the P50 as the central estimate, the IPA band as the expected range at this stage gate, and the P80 as the upper-bound sensitivity used to size contingency.

When should a project be funded at P95 instead of P80?

P95 is appropriate when the consequence of overrun is unacceptable rather than uncomfortable. The typical contexts are safety-critical defence programmes, nuclear new-build, and portfolio-level capital safeguards where the sponsor is underwriting many projects and cannot afford the one-in-five overrun rate that P80 implies. The HM Treasury Green Book uses P90 as a worked example for high-value high-impact proposals, neither mandating that percentile nor ruling it out, but signalling that higher-percentile thinking is appropriate for some programmes.

How does optimism bias relate to QRA confidence levels?

Optimism bias is the empirically-observed tendency for project costs to outturn higher than ex ante estimates by a predictable amount, derived from large studies of completed projects. The Green Book publishes default uplifts by sector. QRA-derived confidence levels model project-specific risks bottom-up. Both can be required: the QRA as project-specific evidence, the optimism-bias uplift as empirical reality check. Convention is to apply optimism bias to the P50 central estimate before reading the P80 from the QRA, to avoid double-counting risk.

What questions should a sponsor ask before approving a QRA?

Six diagnostic questions cut through quickly. First, does the risk register reflect this specific programme or has it been copy-pasted from a template? Second, has correlation between risks been modelled (zero correlation almost always understates the spread)? Third, does the output distribution have a sensible right-skewed shape? Fourth, what does the tornado chart say about the three dominant variance drivers? Fifth, when was the QRA last run and what has changed since? Sixth, who has peer-reviewed it? If any answer is unclear, the right response is to send the business case back for the missing evidence before approving, not to approve anyway under time pressure.

What is the difference between a deterministic cost estimate and a P50 cost?

A deterministic cost estimate is a single point figure produced by building up quantities and rates: the "most likely" outcome. A P50 cost is the median of the modelled cost distribution from a QRA, the figure with equal probability of being too high or too low. These are not the same number. On a right-skewed distribution (which most capital projects are), the median is higher than the mode, which means a deterministic plan built around the most-likely cost is statistically biased low. This is one of the central reasons capital projects routinely overrun even when the deterministic plan looks defensible.

More guides

Keep reading.

Guide

The Honest Guide to QRA

What Quantitative Risk Analysis actually is, when you need it, how it works, and how to tell a good one from a bad one.

10 min read

Guide

Monte Carlo Simulation Is Not Magic — What QRA Actually Does (and Doesn't Do)

What Monte Carlo simulation actually is in three sentences, what it does well in QRA, garbage-in-garbage-out, merge bias, correlation, and how to read the S-curve output for a board or finance committee.

9 min read

Guide

QSRA vs QCRA: Meaning, Methodology, and When Each Is the Right Answer

Two of the most important tools in quantitative risk analysis, frequently confused. Here is what each acronym means, how the methodologies differ, what each produces, and how to decide which your programme needs — with worked UK rail, water and nuclear examples.

8 min read

Guide

P50, P80, P95 in Cost Estimation: Which Confidence Level Should You Actually Use?

P50 is the IPA-required central estimate for UK capital cost. P80 is the UK departmental sensitivity convention. P95 is for safety-critical programmes and portfolio-level safeguards. How to pick the right confidence level for project sanction — and what HM Treasury Green Book and IPA Cost Estimating Guidance actually say, versus the working conventions departments use in practice.

9 min read
