What it is — in three sentences
Monte Carlo simulation runs your project model thousands of times. Each time, it picks a random value for every uncertain input — each cost estimate, each activity duration, each risk — consistent with the range and probability you specified for that input. The result is not a single answer but a distribution of possible answers, showing how likely each outcome is.
That is it. The rest is detail — important detail, but detail. The reason it has an intimidating name and an air of mathematical sophistication is largely historical: when it was first used for nuclear weapons calculations in the 1940s, computing power was precious and running thousands of iterations was genuinely impressive. Today, a laptop runs 10,000 project scenarios in seconds. The technique is completely ordinary. What is not ordinary is using it well — which requires understanding what it can and cannot do.
The three words most often applied to Monte Carlo by people who have encountered a bad one are: reassuring, sophisticated, and wrong. Reassuring because the S-curve looks authoritative. Sophisticated because the tool is technically advanced. Wrong because the inputs were too narrow, the correlation was ignored, and the output gave a false sense of precision about a fundamentally uncertain project. This guide is about the gap between those three words and what a good Monte Carlo analysis actually provides.
What Monte Carlo does well
Monte Carlo does two things better than any other analytical technique available to project controls practitioners. First, it correctly models the combined effect of many uncertain inputs. A project cost estimate with 200 line items, each uncertain by a range of ±10–30%, does not have an overall uncertainty of ±10–30%. The uncertainties combine — sometimes cancelling out, sometimes compounding — and the only rigorous way to capture their combined effect is to sample them together in a simulation. Analytical formulae can approximate this for simple cases; Monte Carlo does it correctly for any level of complexity.
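The combination effect can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a cost model: the 200 items are identical, each follows a symmetric triangular distribution of ±20% around its base value, and every item is sampled independently. The function names and figures are hypothetical.

```python
import math
import random
import statistics

def percentile(samples, p):
    """Nearest-rank percentile: the value at or below which a fraction p of samples fall."""
    ordered = sorted(samples)
    return ordered[max(0, math.ceil(p * len(ordered)) - 1)]

def simulate_totals(n_items=200, base=1.0, spread=0.20, iterations=2_000, seed=1):
    """Return one simulated project total per iteration.

    Assumption: every line item is sampled independently from a symmetric
    triangular distribution spanning base * (1 - spread) to base * (1 + spread).
    """
    rng = random.Random(seed)
    low, high = base * (1 - spread), base * (1 + spread)
    return [
        sum(rng.triangular(low, high, base) for _ in range(n_items))
        for _ in range(iterations)
    ]

totals = simulate_totals()
mean_total = statistics.fmean(totals)
p5, p95 = percentile(totals, 0.05), percentile(totals, 0.95)

# Half the P5-to-P95 width, expressed as a fraction of the mean total.
relative_half_width = (p95 - p5) / 2 / mean_total
print(f"each item varies by +/-20%; the total varies by about +/-{relative_half_width:.1%}")
```

Under the independence assumption the total's relative range comes out far narrower than the ±20% on any single item, because highs and lows largely cancel. Positive correlation between items would widen it again, which is why independence is flagged as an assumption here and not presented as a property of real projects.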
Second, Monte Carlo correctly handles the statistical phenomenon of merge bias in schedule models. Deterministic schedules assume that where multiple paths merge onto a single successor, the path that finishes last drives the successor's start. What they do not capture is the probability that any one of several near-critical paths could be the one that runs late. The more parallel paths converge on a milestone, the more likely at least one of them is to overrun, and the more the deterministic schedule underestimates the expected duration. Monte Carlo captures this automatically because every iteration samples a duration for every path and lets whichever path finishes last in that iteration drive the successor.
Monte Carlo is also the only widely available technique that produces a full probability distribution of outcomes rather than a single point estimate. That distribution is genuinely useful information. A P50 cost of £100m and a P80 of £125m tells a board something meaningful: "we expect to come in around £100m, but we should budget to £125m to have an 80% chance of being within budget." A single deterministic estimate of £100m tells the board nothing about how confident that figure is. The distribution is the point.
Garbage in, garbage out — and why it happens
The most important thing to understand about Monte Carlo is that it is a processing engine, not a source of truth. Feed it narrow input ranges and it will produce a confidently narrow S-curve. Feed it wide, realistic ranges and it will produce a wide, honest distribution. The simulation cannot tell you which inputs are credible. That is a human judgement problem, and it is where most Monte Carlo analyses go wrong.
The specific failure mode is anchoring on the most likely estimate. When someone who has been working on a project for months is asked for a minimum and maximum duration for a specific activity, they have a strong internal anchor: whatever they have been planning for. Moving away from that anchor feels like admitting uncertainty, which feels like admitting a potential problem. So the minimum comes in at 10% below the most likely, and the maximum at 15% above, and the resulting distribution is much narrower than the real uncertainty about that activity.
This is a cognitive bias, not incompetence. It affects experienced practitioners just as much as inexperienced ones, and it affects senior leaders more than junior ones (because senior leaders are more committed to the plan). The correct response is structured calibration — explicitly challenging estimators with direct questions, presenting historical data from comparable activities, and running the outputs through a sanity check before accepting them. A cost range where the P80-to-P50 ratio is less than 1.15 on a complex project is almost certainly too narrow. A schedule distribution where the P80 date is within two weeks of the deterministic end date on a multi-year programme needs to be challenged. These are not arbitrary rules — they reflect the empirical reality of how projects actually perform.
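The P80-to-P50 sanity check is easy to automate once the simulated outcomes are available as a list. The sketch below assumes a nearest-rank percentile and uses two made-up sets of outcomes in £m; the function names are hypothetical, and the 1.15 threshold is the rule of thumb from the paragraph above.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the value at or below which a fraction p of samples fall."""
    ordered = sorted(samples)
    return ordered[max(0, math.ceil(p * len(ordered)) - 1)]

def p80_to_p50_ratio(samples):
    """Ratio of the P80 outcome to the P50 outcome."""
    return percentile(samples, 0.80) / percentile(samples, 0.50)

def looks_too_narrow(samples, threshold=1.15):
    """Flag a distribution whose P80 sits less than `threshold` times its P50."""
    return p80_to_p50_ratio(samples) < threshold

# Hypothetical simulated outcomes in £m, evenly spread for simplicity.
narrow_outcomes = [100 + 0.1 * i for i in range(101)]  # 100 to 110
wide_outcomes = [80 + i for i in range(101)]           # 80 to 180

for label, outcomes in (("narrow", narrow_outcomes), ("wide", wide_outcomes)):
    ratio = p80_to_p50_ratio(outcomes)
    print(f"{label}: P80/P50 = {ratio:.2f}, challenge the inputs: {looks_too_narrow(outcomes)}")
```

The check flags the first set and passes the second. A flag is a prompt to challenge the estimators, not proof that the inputs are wrong.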
Garbage can also enter through the risk register. A risk model that contains only the risks in the formal register, where those risks have been gamed down in workshops to avoid uncomfortable conversations, will systematically understate exposure. The best check is the reference class: what did similar projects actually cost and how long did they actually take? If the Monte Carlo P80 is materially below the reference class average outturn, the inputs are probably too optimistic, regardless of how rigorous the workshop process appeared.
Merge bias: what your deterministic schedule is hiding
Merge bias is one of the reasons that a properly run Monte Carlo schedule analysis almost always produces a P50 date later than the deterministic critical path end date — even before any discrete risk events are applied. It is worth explaining carefully, because clients and sponsors who see this gap often conclude that the simulation is being pessimistic when it is actually being accurate.
Consider a simple example. Three parallel activity chains each have a 50% chance of finishing on time and a 50% chance of being two weeks late. They all converge on a single commissioning milestone. The deterministic schedule says the milestone is achievable on time, because each chain, taken on its own, is planned to finish on time. But if the chains are independent, the probability that all three finish on time simultaneously is 0.5 × 0.5 × 0.5 = 0.125, or 12.5%. There is an 87.5% probability that at least one chain overruns and delays the milestone. The deterministic schedule is not wrong in any individual fact; it is wrong about the combined probability.
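The three-chain example can be checked with a short simulation. This is a minimal sketch that assumes the chains are independent and that each is simply "on time" or "late"; the function name and defaults are hypothetical.

```python
import random

def milestone_on_time_probability(n_chains=3, p_on_time=0.5, iterations=20_000, seed=7):
    """Estimate the chance that every converging chain finishes on time.

    Assumption: the chains are independent of one another.
    """
    rng = random.Random(seed)
    on_time = 0
    for _ in range(iterations):
        # The milestone is on time only if no chain is late in this iteration.
        if all(rng.random() < p_on_time for _ in range(n_chains)):
            on_time += 1
    return on_time / iterations

simulated = milestone_on_time_probability()
exact = 0.5 ** 3  # product of the individual on-time probabilities

print(f"simulated on-time probability: {simulated:.3f}")
print(f"exact on-time probability:     {exact:.3f}")
```

The simulated figure settles close to the exact 12.5%. The same product rule applies to any number of independent chains: ten chains that are each 75% likely to finish on time give 0.75 to the power of 10, or roughly 5.6%.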
As the number of converging paths increases, the merge bias effect grows. A project with ten near-critical paths all converging on a single completion milestone may have a deterministic end date that has only a 5-10% probability of being achieved. The Monte Carlo P50 — the date with a 50% chance of achievement — might be two or three months later. This is not a modelling artefact. It is an accurate description of the project's expected completion. The deterministic schedule is the artefact — it presents a specific combination of lucky outcomes as the expected outcome.
The practical implication is that projects with many converging parallel paths — typical of major infrastructure commissioning milestones, complex fit-out completions, and system integration milestones — should be planned with explicit schedule contingency beyond the deterministic programme. The Monte Carlo P50 is a better target completion date than the deterministic end date. Clients and funders who insist on P50 (or better) funding for cost but are happy to plan schedules to the deterministic end date are accepting a systematic bias toward late completion that will eventually show up as a delay claim.
Correlation: the most neglected input
Correlation in a Monte Carlo model captures the fact that risks and uncertainties are not independent of each other. If the project experiences bad weather in January, it will experience bad weather across all the activities running in January — not just one of them. If a key subcontractor underperforms on one package, they are likely to underperform on other packages they are delivering simultaneously. If the steel market is inflated, it will be inflated for all the steel-intensive elements across the project, not just one cost line.
When correlation is ignored, and every risk and every activity duration is sampled independently, the simulation implicitly assumes that a bad outcome on one element says nothing about the others, so within any one iteration bad outcomes tend to be offset by good outcomes elsewhere. In effect, every simulated project gets a balanced mix of good and bad luck. In reality, projects have good years and bad years, good site conditions and bad ones, cooperative supply chains and uncooperative ones. The correlation structure of a real project means that bad things tend to happen together.
The consequence of ignoring correlation is a Monte Carlo output that is too narrow — a P80 that is probably closer to the true P60 or P65. On large, complex programmes where correlation effects are material, the difference between a correlated and uncorrelated model can be 10-15% of the total project budget. This is not a technical nicety — it is potentially the difference between funding the project adequately or not. Practitioners who receive a QRA report without any mention of correlation assumptions should ask directly: was correlation modelled? If not, the contingency recommendation may be systematically understated.
Setting correlation coefficients does not require statistical expertise. The practical approach is to group related risks by common driver — all weather-sensitive activities, all activities dependent on a particular subcontractor, all activities affected by the same regulatory approval — and apply a uniform moderate positive correlation (typically 0.5 to 0.7) within each group. This captures the essential dependency without claiming false precision about specific correlation values. The result will be materially more accurate than uncorrelated sampling, and the difference will typically show up as a wider S-curve with a higher P80 that more honestly represents the project's true uncertainty.
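The effect of a uniform within-group correlation can be illustrated with a small sketch. It makes several simplifying assumptions that are not drawn from any particular tool: the group has 20 identical cost lines, each is normally distributed, and the correlation is induced by a single shared driver (a one-factor model) rather than by the rank-correlation methods that many commercial risk tools use. All names and figures are hypothetical.

```python
import math
import random

def percentile(samples, p):
    """Nearest-rank percentile: the value at or below which a fraction p of samples fall."""
    ordered = sorted(samples)
    return ordered[max(0, math.ceil(p * len(ordered)) - 1)]

def simulate_group_totals(rho, n_items=20, mean=5.0, sd=1.0, iterations=10_000, seed=11):
    """Total cost of one risk group whose items share a pairwise correlation rho.

    Each item's standard normal draw is sqrt(rho) * shared + sqrt(1 - rho) * own,
    which gives every pair of items a correlation of rho (0 <= rho <= 1).
    """
    rng = random.Random(seed)
    shared_weight = math.sqrt(rho)
    own_weight = math.sqrt(1.0 - rho)
    totals = []
    for _ in range(iterations):
        shared = rng.gauss(0.0, 1.0)  # the common driver, for example weather
        total = 0.0
        for _ in range(n_items):
            z = shared_weight * shared + own_weight * rng.gauss(0.0, 1.0)
            total += mean + sd * z
        totals.append(total)
    return totals

independent = simulate_group_totals(rho=0.0)
correlated = simulate_group_totals(rho=0.6)

p80_independent = percentile(independent, 0.80)
p80_correlated = percentile(correlated, 0.80)
print(f"P80 with no correlation:  {p80_independent:.1f}")
print(f"P80 with 0.6 correlation: {p80_correlated:.1f}")
```

Both runs have the same expected total, but the correlated run has a visibly higher P80 because bad draws now arrive together instead of cancelling.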
Reading the output for a board
The standard output from a Monte Carlo cost or schedule analysis is an S-curve and a tornado chart. Both are useful, but neither is self-explanatory to an audience that has not seen them before. The facilitator presenting these outputs to a board or client needs a clear narrative that translates the numbers into decisions.
For the S-curve: start with what the curve represents. "This chart shows the range of possible project costs given the risks and uncertainties we have modelled. The horizontal axis is cost; the vertical axis is the cumulative probability of finishing at or below that cost. The point where the curve crosses 50%, here, is our P50 estimate of £X. This means that in half of all the scenarios we modelled, the project came in at or below £X. The point where it crosses 80% is £Y, our P80 estimate. We recommend funding to the P80 level, which gives an 80% probability of completing within budget." That is the whole explanation. Boards do not need to understand the Monte Carlo engine; they need to understand what the percentiles mean and which one they are being asked to approve.
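For anyone who wants to see where the S-curve comes from, it is simply the sorted simulation results plotted against their cumulative share. The sketch below builds the curve and reads off two percentiles; the triangular cost distribution and its £m parameters are hypothetical stand-ins for a real model's output.

```python
import random

def s_curve(samples):
    """Return (cost, cumulative probability) points, sorted by cost."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(cost, (i + 1) / n) for i, cost in enumerate(ordered)]

def cost_at_confidence(curve, confidence):
    """Read across from a confidence level to the first cost that reaches it."""
    for cost, cumulative in curve:
        if cumulative >= confidence:
            return cost
    return curve[-1][0]

# Hypothetical simulated project costs in £m (low 80, most likely 95, high 160).
rng = random.Random(2)
costs = [rng.triangular(80.0, 160.0, 95.0) for _ in range(5_000)]

curve = s_curve(costs)
p50 = cost_at_confidence(curve, 0.50)
p80 = cost_at_confidence(curve, 0.80)
share_at_or_below_p80 = sum(c <= p80 for c in costs) / len(costs)

print(f"P50 = £{p50:.0f}m, P80 = £{p80:.0f}m")
print(f"share of scenarios at or below P80: {share_at_or_below_p80:.0%}")
```

Reading the curve this way makes the board narrative concrete: the P80 is nothing more than the cost that 80% of the simulated scenarios came in at or below.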
For the tornado chart: explain what drives the range of outcomes. "This chart shows the top risk drivers — the risks and uncertainties that most influence whether the project comes in at the low end or the high end of our cost range. The longest bar — ground conditions — is the single largest contributor to our cost uncertainty. If ground conditions prove better than expected, the project will likely come in below our P50 estimate. If they prove worse, we could be approaching P80 or beyond. This tells us where to focus risk management effort over the coming months."
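The ranking behind a tornado chart can be sketched by measuring how strongly each input moves with the total across iterations. Tools differ in how they size the bars (correlation, regression coefficients, or contribution to variance); this sketch assumes a plain Pearson correlation, independent inputs, and three hypothetical cost drivers with made-up £m ranges.

```python
import math
import random
import statistics

# Hypothetical cost drivers in £m as (low, most likely, high).
DRIVERS = {
    "ground conditions": (10.0, 15.0, 40.0),
    "steel price": (18.0, 22.0, 30.0),
    "design fees": (4.0, 5.0, 6.0),
}

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_drivers(drivers, iterations=5_000, seed=5):
    """Rank drivers by the strength of their correlation with total cost."""
    rng = random.Random(seed)
    samples = {name: [] for name in drivers}
    totals = []
    for _ in range(iterations):
        total = 0.0
        for name, (low, likely, high) in drivers.items():
            value = rng.triangular(low, high, likely)
            samples[name].append(value)
            total += value
        totals.append(total)
    scores = {name: pearson(values, totals) for name, values in samples.items()}
    # Longest bar first: sort by absolute correlation with the total.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

ranking = rank_drivers(DRIVERS)
for name, score in ranking:
    print(f"{name:18s} correlation with total cost: {score:.2f}")
```

With these made-up ranges, ground conditions tops the list because it has by far the widest range, which is the same message the longest bar on the chart conveys.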
Two things to avoid. First, do not present the P50 as "the most likely cost" without qualification — it is the median outcome, not the mode. In a skewed distribution (which most project cost distributions are, because costs tend to overrun rather than underrun), the P50 may be somewhat above the deterministic estimate and the mode may be below the P50. The distinction is technical, but if a board member challenges you on it, you need to be able to explain it. Second, do not present the P80 as a ceiling — it is not. There is a 20% probability that the actual cost will exceed the P80. If the client or funder asks what happens if it does, the answer should be: "That is what management reserve is for, and here is the residual exposure at P95." Having the P95 number ready is standard preparation for any QRA presentation to a senior audience.
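Both points can be shown on a deliberately skewed example. The sketch assumes a hypothetical cost distribution (triangular, in £m, with a long upper tail) in place of a real model's output, and a nearest-rank percentile.

```python
import math
import random
import statistics

# Hypothetical right-skewed cost distribution in £m.
LOW, MODE, HIGH = 80.0, 95.0, 160.0

def percentile(samples, p):
    """Nearest-rank percentile: the value at or below which a fraction p of samples fall."""
    ordered = sorted(samples)
    return ordered[max(0, math.ceil(p * len(ordered)) - 1)]

rng = random.Random(3)
costs = [rng.triangular(LOW, HIGH, MODE) for _ in range(20_000)]

p50, p80, p95 = (percentile(costs, p) for p in (0.50, 0.80, 0.95))
mean_cost = statistics.fmean(costs)
share_above_p80 = sum(c > p80 for c in costs) / len(costs)

print(f"mode (most likely) = £{MODE:.0f}m")
print(f"P50 (median)       = £{p50:.0f}m")
print(f"mean               = £{mean_cost:.0f}m")
print(f"P80 = £{p80:.0f}m, P95 = £{p95:.0f}m")
print(f"share of scenarios above P80: {share_above_p80:.0%}")
```

With the long tail on the high side, the most likely value sits below the P50, which in turn sits below the mean. One scenario in five still lands above the P80, which is why the P95 figure belongs in the presenter's notes.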
The most important message to leave with a board is not the specific numbers but the appropriate level of confidence to place in them. A Monte Carlo based on well-calibrated inputs, realistic correlation assumptions, and a risk register that reflects the project's genuine exposure is a reliable planning tool. A Monte Carlo based on narrow estimates, no correlation, and a sanitised risk register is a false assurance. Telling the board which kind they are looking at, and what was done to ensure credibility, is part of the professional responsibility of anyone presenting a QRA. The same applies in front of an MoD CADMID Main Gate Board or an IPA Gateway Review team: reviewers familiar with the framework will press on correlation, reference class, and the optimism-bias adjustment, and will quickly detect a Monte Carlo whose distribution has been engineered tight to the funding envelope.