QSRA Readiness — What a Schedule Needs Before the Monte Carlo Runs

Why DCMA 14-point, CIOB PP21 and Acumen Fuse SQI each answer a different question, what risk-load readiness actually means, and the specific failure modes that kill a QSRA before the simulation has a chance.

Adam O'Neill | 19 April 2026 | 11 min read | Part of Quantitative risk analysis (QRA)

The question DCMA does not answer

Every QSRA practitioner has seen the same pattern. The schedule passes DCMA 14-point — or most of it. The contractor's planner is satisfied. The QRA lead signs it off. The schedule and the risk register load into Safran, the simulation runs, and the output makes no sense. Variance is suppressed. The tornado is shallow. The P80 sits suspiciously close to the deterministic finish. Something is wrong, but nothing in the inputs was obviously broken.

The cause is almost always a category error about what the input checks were testing for. DCMA 14-point was written in 2005 by the US Defense Contract Management Agency as a contract-auditing checklist. It asks: does this schedule pass a generic quality audit? That is a useful question, but it is not the question a QSRA is built to answer. A QSRA asks: given the uncertainty in every activity duration and the risks in the register, what is the range of plausible completion dates? For that to produce a meaningful answer, the schedule has to be prepared for probabilistic analysis — and DCMA does not test for that.

CIOB PP21, published by the Chartered Institute of Building in 2017, is closer to the construction reality but blends schedule quality with process maturity. Scores depend partly on the project management environment around the schedule, not just on the schedule itself. That makes PP21 useful as an organisational benchmark and harder to use as a pass/fail gate for whether a specific schedule is ready for risk-loading. Acumen Fuse SQI, widely used because it ships with the tool, is closed-source and calibrated to Deltek's defaults — it treats every schedule as if it were a QSRA-ready baseline, without distinguishing between a schedule that is structurally sound and one that has been prepared for probabilistic analysis.

None of the three frameworks is wrong. They solve the problems they were designed to solve. But none of them asks the QSRA question directly, and a schedule can pass all three and still be entirely unready for a meaningful Monte Carlo. This is the first thing to accept: a DCMA green light is not a QSRA green light. They measure different things.

What "QSRA-ready" actually means

A schedule is QSRA-ready when the Monte Carlo engine can model it correctly — which means the logic, durations, calendars, constraints, float and scope markers all behave the way the simulation expects. Put differently: every assumption the tool is going to make when it iterates the schedule ten thousand times has to be true. If any of them are not, the variance the simulation produces will be an artefact of the input problems, not a reflection of real programme risk.

At SOMA we decompose QSRA readiness into five weighted domains, each containing specific checks against specific failure modes. The first, Logic Integrity (25% weighting), covers the foundation: missing logic, dangling ends, the mix of finish-to-start and other relationship types, merge hotspots, logic density and relationship direction. If the logic is soft, the Monte Carlo will find it — activities with no predecessors cannot be risk-driven, activities with no successors create stranded variance, and unusually high merge-node density produces "merge bias" where the simulation over-counts the compounding effect of parallel risks.
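
The open-end checks in this domain reduce to set operations over the relationship table. The following is a minimal sketch, assuming the activities and relationships have already been extracted from the XER into plain Python structures; the function name `find_open_ends` and the sample IDs are hypothetical and do not belong to any tool's API.

```python
from typing import Iterable


def find_open_ends(
    activity_ids: Iterable[str],
    relationships: Iterable[tuple[str, str]],
) -> dict[str, list[str]]:
    """Return activities with no predecessor and activities with no successor.

    relationships is an iterable of (predecessor_id, successor_id) pairs.
    """
    ids = set(activity_ids)
    rels = list(relationships)
    has_predecessor = {succ for _, succ in rels}
    has_successor = {pred for pred, _ in rels}
    return {
        "no_predecessor": sorted(ids - has_predecessor),
        "no_successor": sorted(ids - has_successor),
    }


# Hypothetical four-activity network: A -> B -> D, with C left dangling.
open_ends = find_open_ends(
    ["A", "B", "C", "D"],
    [("A", "B"), ("B", "D")],
)
```

In a real schedule the project start and finish milestones are legitimate open ends, so a production check would normally exempt one designated start and one designated finish before reporting.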

The second domain, Duration Health (20%), asks whether the activities can carry risk sensibly. Very-high-duration activities (say, over 90 working days) are usually summary activities disguised as detail — giving them a three-point range spreads risk over a bucket, not over a real operation. Zero-duration activities that are not milestones (a common XER import artefact) produce zero-variance points that mask real risk. Duration distribution tells you whether the schedule is padded, whether it is overly granular at one tier and coarse at another, and whether the activities look planned or guessed.
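
The two duration flags described above can be sketched as simple filters. This is an illustration under stated assumptions: the `Activity` record and its field names are hypothetical, and the 90-working-day threshold is the indicative figure from the text rather than a fixed rule.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    activity_id: str
    duration_days: float  # planned duration in working days
    is_milestone: bool


# Indicative threshold taken from the text; tune it per programme.
LONG_DURATION_DAYS = 90


def duration_findings(activities: list[Activity]) -> dict[str, list[str]]:
    """Flag very long activities and zero-duration activities that are not milestones."""
    too_long = [
        a.activity_id for a in activities
        if a.duration_days > LONG_DURATION_DAYS
    ]
    zero_non_milestone = [
        a.activity_id for a in activities
        if a.duration_days == 0 and not a.is_milestone
    ]
    return {"too_long": too_long, "zero_non_milestone": zero_non_milestone}


sample = [
    Activity("A100", 12, False),
    Activity("A110", 140, False),  # likely a summary disguised as detail
    Activity("A120", 0, False),    # zero duration but not a milestone
    Activity("M900", 0, True),     # genuine milestone, not flagged
]
findings = duration_findings(sample)
```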

The third, Constraints & Calendars (15%), covers the silent levers that distort the critical path. Hard constraints — start-on, finish-on, must-finish — override logic; the Monte Carlo cannot push through them. A schedule with dozens of hard constraints will produce simulations where the critical path is not real, because the constraints are holding it in place. Calendar assignments that are inconsistent across connected activities break duration calculations. Milestones with missing or wrong calendar assignments show up as phantom criticality.
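
A rough sketch of the two checks in this domain: listing hard-constrained activities and listing relationships whose two ends sit on different calendars. The constraint labels and data layout below are illustrative placeholders, not XER field values, and a cross-calendar link is only a prompt for review since some are deliberate.

```python
# Constraint labels are illustrative placeholders, not XER field values.
HARD_CONSTRAINTS = {"start_on", "finish_on", "must_finish"}


def constraint_and_calendar_findings(activities, relationships):
    """Flag hard-constrained activities and links that cross calendars.

    activities: dict of activity_id -> {"constraint": str or None, "calendar": str}
    relationships: list of (predecessor_id, successor_id) pairs
    """
    hard = sorted(
        aid for aid, a in activities.items()
        if a["constraint"] in HARD_CONSTRAINTS
    )
    cross_calendar = [
        (pred, succ) for pred, succ in relationships
        if activities[pred]["calendar"] != activities[succ]["calendar"]
    ]
    return hard, cross_calendar


# Hypothetical three-activity chain.
activities = {
    "A": {"constraint": None, "calendar": "5-day"},
    "B": {"constraint": "finish_on", "calendar": "5-day"},
    "C": {"constraint": None, "calendar": "7-day"},
}
hard, cross_calendar = constraint_and_calendar_findings(
    activities, [("A", "B"), ("B", "C")]
)
```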

The fourth, Float & Critical Path (20%), asks whether the critical path is credible and whether float is honest. Extremely long critical paths, negative float, unusual volumes of high-float activities, float distribution skewed either way, and broken ERMHDR records (the header line of the XER export, which identifies the exporting version and export date) all break the risk model in ways the simulation will not flag: it will just produce confident-looking output that is not trustworthy.

The fifth, QSRA Readiness (20%), is the part most frameworks miss entirely. It covers in-scope definition, lag hygiene, status consistency, data-date alignment, activity-ID uniqueness, WBS depth, LOE (level-of-effort) discipline, TRA and TBP (time-risk allowance and time-bridging provision) hygiene, and procurement-lag exemption. These are the checks that ask: has the planner, explicitly, drawn the boundary around what the QSRA should model? A schedule where LOE activities are mixed into risk-loaded sections will inflate the tail. A schedule where procurement lead-times sit as lags on logic will absorb variance instead of driving it. These problems do not show up on DCMA because DCMA was not looking for them.
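
The five weightings above sum to 100%, so a headline number can be formed as a weighted average of the domain scores. The sketch below assumes each domain is scored from 0 to 100 and combined as a simple weighted mean; the text does not specify the aggregation rule, so treat that choice, the function name and the sample scores as assumptions.

```python
# Domain weights in percent, as listed in the text.
DOMAIN_WEIGHTS = {
    "Logic Integrity": 25,
    "Duration Health": 20,
    "Constraints & Calendars": 15,
    "Float & Critical Path": 20,
    "QSRA Readiness": 20,
}


def headline_score(domain_scores: dict[str, float]) -> float:
    """Weighted mean of 0-100 domain scores (the aggregation rule is an assumption)."""
    if set(domain_scores) != set(DOMAIN_WEIGHTS):
        raise ValueError("scores must cover exactly the five domains")
    total = sum(DOMAIN_WEIGHTS[d] * domain_scores[d] for d in DOMAIN_WEIGHTS)
    return total / sum(DOMAIN_WEIGHTS.values())


# Invented domain scores for illustration.
example = headline_score({
    "Logic Integrity": 90,
    "Duration Health": 80,
    "Constraints & Calendars": 60,
    "Float & Critical Path": 85,
    "QSRA Readiness": 70,
})
```

With these invented scores the weak Constraints & Calendars result pulls the headline down to 78.5, even though three of the five domains score 80 or better.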

The failure modes that kill a QSRA

Specific input failures produce specific and predictable distortions in the simulation output, and an experienced QSRA practitioner can often recognise them from the S-curve shape alone. Knowing the failure modes backwards is the best defence against producing a model that is confidently wrong. The most common ones are worth walking through in detail.

Hard constraints in the middle of the schedule are the single biggest cause of suppressed variance. When a major delivery milestone is pinned by a "start-on" or "finish-on" constraint, the simulation cannot push it later even when the activities feeding into it slip. The result is an output where the P80 and the deterministic date are suspiciously close — the model is confident because the constraint is doing the work. Planners sometimes add these constraints to make the bar chart look tidy, not realising they are telling the Monte Carlo engine that the date is fixed.
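
The suppression effect is easy to reproduce in a toy model. The sketch below is a deliberately simplified illustration, not how any particular tool handles constraints: three activities in series with invented triangular ranges feed a milestone, and the hard constraint is modelled by overwriting the simulated milestone date with the pinned day.

```python
import random

# (minimum, most likely, maximum) working days for three activities in series.
# All figures are invented for illustration.
RANGES = [(18, 20, 30), (25, 30, 45), (9, 10, 16)]
DETERMINISTIC_FINISH = sum(ml for _, ml, _ in RANGES)  # day 60


def p80_finish(pinned_day=None, iterations=5000, seed=42):
    """Return the P80 milestone day, optionally pinned by a hard constraint."""
    rng = random.Random(seed)
    finishes = []
    for _ in range(iterations):
        # random.triangular takes (low, high, mode).
        finish = sum(rng.triangular(lo, hi, ml) for lo, ml, hi in RANGES)
        if pinned_day is not None:
            finish = pinned_day  # the constraint overrides the logic
        finishes.append(finish)
    finishes.sort()
    return finishes[int(0.8 * iterations) - 1]


p80_free = p80_finish()
p80_pinned = p80_finish(pinned_day=DETERMINISTIC_FINISH)
```

Under these assumptions the unconstrained P80 lands well after day 60, while the pinned run reports exactly day 60 at every percentile: the upstream slippage has not gone away, it has just been hidden from the output.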

LOE activities mixed into variance calculations are the next most common. Level-of-effort activities — typically summary bands like "project management" or "site facilities" — have durations driven by their start and end logic rather than by a real operation. Giving them a three-point range produces nonsense ranges, because the activity has no underlying work content to extend or compress. Good QSRA tools will exclude LOE activities from variance calculation by default, but only if the activities are correctly tagged. When they are not — when LOEs are flagged as task-dependent or have no activity-type marker at all — they go into the simulation as if they were real operations and the output gets polluted.

Treating procurement lags as risk-bearing activity duration is a subtler but equally damaging failure. A lag on a logic relationship, such as a 10-day delay between design approval and construction start, has no activity, no resource, and no explicit ownership. Risk events cannot drive variance through a lag, because the lag is static. If procurement lead-time is represented as a lag rather than as a real activity with a planner-set duration and a risk mapping, the simulation will treat it as a fixed delay and the variance modelling will under-count procurement risk.

Orphan risks in the register are the fourth killer. An orphan is a risk that is mapped to an activity that does not exist in the XER — often because the activity has been renumbered between programme revisions, or because the risk was mapped to a coding tag that the latest baseline dropped. The simulation will either ignore the risk entirely (in which case the tornado is missing drivers that should be there) or assign it a default mapping that the QSRA analyst has not sanctioned. Both failure modes produce output that is structurally wrong.

The fifth common failure is schedule status lag. Compare the schedule's data date with the current programme actual-finish dates on complete activities: if the data date is two months stale, the schedule you are risk-loading is not the current reality, and the simulation will be modelling a world that no longer exists. Status consistency — making sure that the status shown on each activity matches the data date and the actual progress — is essential before any risk-loading exercise.
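
A status-consistency check of this kind can be sketched in a few lines. The 30-day staleness threshold, the function name and the input layout are assumptions made for the example; the input is taken to be the actual-finish dates of activities already marked complete.

```python
from datetime import date
from typing import Optional


def status_findings(
    data_date: date,
    as_of: date,
    actual_finishes: dict[str, Optional[date]],
    max_age_days: int = 30,  # assumed tolerance, not a standard
) -> list[str]:
    """Check data-date staleness and actual finishes on complete activities."""
    findings = []
    age = (as_of - data_date).days
    if age > max_age_days:
        findings.append(f"data date is {age} days old")
    for activity_id, finish in sorted(actual_finishes.items()):
        if finish is None:
            findings.append(f"{activity_id}: complete but no actual finish")
        elif finish > data_date:
            findings.append(f"{activity_id}: actual finish after data date")
    return findings


# Hypothetical schedule statused on 1 February and reviewed on 1 April.
issues = status_findings(
    data_date=date(2026, 2, 1),
    as_of=date(2026, 4, 1),
    actual_finishes={
        "A100": date(2026, 1, 20),
        "A110": date(2026, 2, 14),
        "A120": None,
    },
)
```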

Why the risk register is half the job

Every hour spent auditing the schedule is wasted if the risk register behind it is broken — and most registers, on most real programmes, have structural issues that undermine the Monte Carlo even after the schedule is cleaned. The register problems are different from the schedule problems, but they are equally capable of producing confidently wrong output, and they need their own assurance pass before the simulation runs.

The first and most common register failure is missing mandatory fields. Whatever tool the register came out of — Riskhive, Xactium, ActiveRisk, Predict! or a plain consolidated workbook — the set of fields each row needs to carry is the same: probability of occurrence, minimum / most-likely / maximum cost impact, minimum / most-likely / maximum schedule impact, distribution type, mapped activity. When fields are missing, most QRA tools will either drop the row silently, substitute defaults, or apply a distribution that the analyst did not choose. All three behaviours produce simulation output that does not reflect the register the team actually built.
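
The mandatory-field check is a straightforward schema test. In the sketch below the column names are hypothetical stand-ins for the fields listed above; a real register export will use its own headings, which would need mapping first.

```python
# Hypothetical column names for the fields listed in the text.
MANDATORY_FIELDS = [
    "probability",
    "cost_min", "cost_ml", "cost_max",
    "sched_min", "sched_ml", "sched_max",
    "distribution",
    "mapped_activity",
]


def missing_fields(row: dict) -> list[str]:
    """Return the mandatory fields that are absent or blank (zero counts as present)."""
    return [f for f in MANDATORY_FIELDS if row.get(f) in (None, "")]


register = [
    {"risk_id": "R1", "probability": 0.3,
     "cost_min": 10, "cost_ml": 20, "cost_max": 50,
     "sched_min": 0, "sched_ml": 5, "sched_max": 15,
     "distribution": "triangular", "mapped_activity": "A100"},
    {"risk_id": "R2", "probability": 0.1,
     "cost_min": 5, "cost_ml": 8, "cost_max": 12,
     "sched_min": 2, "sched_ml": 4, "sched_max": None,
     "distribution": "", "mapped_activity": "A110"},
]
incomplete = {
    row["risk_id"]: missing_fields(row)
    for row in register
    if missing_fields(row)
}
```

Reporting the missing fields per row, rather than dropping the row, keeps the decision with the analyst instead of with the tool's defaults.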

The second is three-point estimates that are not actually three-point. A distressing proportion of production risk registers use the same three-point range for every risk — typically 80% / 100% / 130% of the most-likely — because the team ran out of time to calibrate each one. The simulation will run happily against these, and the output will show a realistic-looking spread. It will also be wrong in ways that are invisible from the output alone, because the spread is an artefact of the uniform ranges rather than a reflection of differential risk. Each risk needs a range that reflects actual expected variability: some are tightly bounded, some have long tails, some have binary outcomes, and the register has to capture that.
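
One way to spot uniform ranges is to normalise each estimate by its most-likely value and measure how much of the register shares the same pair of ratios. This is a heuristic sketch only; the function name is hypothetical, and the share at which a register should be challenged is a judgement call.

```python
from collections import Counter


def dominant_range_share(estimates):
    """Return the most common (min/ml, max/ml) ratio pair and its share of rows.

    estimates: list of (minimum, most_likely, maximum) tuples.
    Rows with a zero most-likely value are skipped.
    """
    ratios = [
        (round(lo / ml, 2), round(hi / ml, 2))
        for lo, ml, hi in estimates
        if ml
    ]
    if not ratios:
        return None, 0.0
    pair, count = Counter(ratios).most_common(1)[0]
    return pair, count / len(ratios)


# Invented register: three risks share an 80% / 130% spread, one is calibrated.
pair, share = dominant_range_share([
    (8, 10, 13),
    (16, 20, 26),
    (40, 50, 65),
    (5, 10, 30),
])
```

Here three of the four rows collapse to the same 0.8 / 1.3 ratio pair, which is the signature of a default spread applied without calibration.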

The third failure is probability bounds. Some tools allow probabilities above 100% through operator error, some allow negative probabilities, some treat a blank field as "100%" and others as "0%". A register with probabilities outside the 0–100 range, or with a mix of representations (some rows in 0–1 format, others in 0–100), will produce simulation output where the total risk-load is the wrong shape. This is an easy validation check and one that should run against every register before the model starts.
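
A minimal version of this validation is sketched below. It assumes probabilities have been read into a dictionary keyed by a hypothetical risk ID, with `None` for blank cells, and it treats a mix of values strictly between 0 and 1 with values above 1 as evidence of mixed representations.

```python
from typing import Optional


def probability_findings(probs: dict[str, Optional[float]]) -> list[str]:
    """Flag blank, out-of-range and mixed-representation probabilities."""
    findings = []
    present = {rid: p for rid, p in probs.items() if p is not None}
    for rid in sorted(set(probs) - set(present)):
        findings.append(f"{rid}: blank probability")
    for rid, p in sorted(present.items()):
        if p < 0 or p > 100:
            findings.append(f"{rid}: probability {p} outside 0-100")
    # Values strictly between 0 and 1 suggest fractions; values above 1 suggest percent.
    fractions = [p for p in present.values() if 0 < p < 1]
    percents = [p for p in present.values() if 1 < p <= 100]
    if fractions and percents:
        findings.append("mixed 0-1 and 0-100 representations")
    return findings


# Invented register column with one of each problem.
issues = probability_findings(
    {"R1": 0.3, "R2": 40, "R3": None, "R4": 120, "R5": -5}
)
```

A value of exactly 1 is ambiguous (1% or 100%) and cannot be resolved by inspection, which is one reason to fix the representation in a written schema rather than infer it.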

The fourth and most structurally important is risk-to-activity mapping. Every risk in the register must map to one or more activities in the XER, or to a coded scope marker that the simulation tool can resolve. When mappings are weak — vague scope references like "civils works generally", mappings to summary activities rather than to detail, activities that have been renumbered — the simulation either cannot locate the impact point or maps to the wrong point. Split-impact weights, used where a risk affects several activities in different proportions, need explicit weighting values; when they are missing, the tool either applies the full impact to every mapped activity (inflating the tail) or applies proportional defaults that the analyst did not specify. Catching mapping failures is the job of a specific risk-to-activity mapping validator, separate from the schedule and register checks.
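
The mapping checks can be sketched as a cross-reference between the register and the activity list. The data layout, the activity-type labels and the rule that split-impact weights should sum to 1 are all assumptions for the example; a given tool may express weights differently.

```python
def mapping_findings(risk_map, activities):
    """Cross-check risk mappings against the schedule.

    risk_map: risk_id -> list of (activity_id, weight or None)
    activities: activity_id -> type label ("task", "loe", "milestone", "summary")
    """
    findings = []
    for risk_id, targets in sorted(risk_map.items()):
        if not targets:
            findings.append((risk_id, "unmapped"))
            continue
        for activity_id, _ in targets:
            if activity_id not in activities:
                findings.append((risk_id, f"orphan: {activity_id} not in schedule"))
            elif activities[activity_id] != "task":
                findings.append(
                    (risk_id, f"{activity_id} is {activities[activity_id]}, not a detail task")
                )
        if len(targets) > 1:
            weights = [w for _, w in targets]
            if any(w is None for w in weights):
                findings.append((risk_id, "missing split-impact weight"))
            elif abs(sum(weights) - 1.0) > 1e-9:  # assumed convention
                findings.append((risk_id, "split-impact weights do not sum to 1"))
    return findings


# Hypothetical schedule and register extract.
activities = {"A100": "task", "A110": "task", "S10": "summary"}
risk_map = {
    "R1": [("A100", None)],                  # single mapping, fine
    "R2": [("A999", None)],                  # renumbered activity
    "R3": [("A100", 0.6), ("A110", None)],   # split impact, one weight blank
    "R4": [("S10", None)],                   # mapped to a summary
}
issues = mapping_findings(risk_map, activities)
```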

The full register audit takes about the same time as the schedule audit on a mid-sized programme. Skipping it and trusting that the register is fine "because the team built it carefully" is the most common reason a simulation produces output that everyone in the room wants to believe but no-one can defend under scrutiny.

Testing readiness before the model runs

The practical question for a QSRA analyst with a model to run next week is: how do I test these inputs before I load them, and how much time is the check going to take? The answer depends on the scale of the programme and the tools available, but a disciplined readiness pass on a mid-sized UK infrastructure schedule and its risk register takes roughly half a day and saves days of downstream rework.

The first pass is framework-based. Run DCMA 14-point against the XER to catch the generic schedule quality issues — missing logic, hard constraints, negative float, float outliers, invalid dates. DCMA will take perhaps half an hour with a reasonable tool, and it will clear the obvious structural problems. The schedule that fails DCMA is almost never ready for QSRA; the schedule that passes DCMA is not necessarily ready, but you have eliminated one class of failure.

The second pass is specific to risk-load readiness. This is where DCMA hands off to something QSRA-focused: scope markers (LOE, milestones, hammocks), lag hygiene (procurement represented as an activity, not a lag), status consistency, data-date alignment, activity-ID uniqueness, WBS depth (so roll-ups work), and TRA / TBP discipline. The SOMA QSRA Readiness framework codifies 25 checks against these failure modes; other frameworks cover parts of the same ground with different emphasis. The important thing is to be deliberate about what you are checking for, not to rely on a generic audit and hope.

The third pass is the register. Run the register through a validator that checks mandatory fields, probability bounds, three-point-estimate plausibility and distribution types. This is usually a table-based exercise and can be done in an hour on a spreadsheet if the tooling is not available, or in seconds if it is. Flag the rows with missing fields, the rows with suspiciously uniform ranges, and the rows with unusual distribution types that the simulation tool may not handle as expected.

The fourth pass is the cross-check between schedule and register — the mapping validator. For every risk in the register, verify that the mapped activity exists in the XER, is in the QSRA-relevant scope (not LOE, not milestone-only, not excluded by TRA discipline), and has a sensible split-impact weight if multiple activities are mapped. Orphan risks and unmapped activities get flagged here. This is the step that most teams skip and that catches the most expensive errors.

The fifth pass is the readiness scorecard. Aggregate the results into something an assurance lead can read in a minute — a RAG-banded score per domain, a headline number, a short list of the specific findings that caused any amber or red results. On a well-run programme, the score should be above 85 (green) before the simulation starts. A score in the 70–85 range (amber) means the simulation can run but the output needs to be caveated with the specific findings. A score below 70 (red) means the schedule is not ready and pushing ahead will produce defensible-looking output that is not defensible on closer inspection.
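
The RAG banding can be expressed as a small function. The thresholds come from the text, but how the exact boundary values are treated is an assumption here: a score of exactly 85 or exactly 70 is banded amber.

```python
def rag_band(score: float) -> str:
    """Band a 0-100 readiness score.

    Boundary handling is an assumption: the text gives "above 85",
    "70-85" and "below 70", so 85 and 70 are both treated as amber.
    """
    if score > 85:
        return "green"
    if score >= 70:
        return "amber"
    return "red"


# Invented per-domain scores for illustration.
domain_scores = {
    "Logic Integrity": 90,
    "Duration Health": 80,
    "Constraints & Calendars": 60,
    "Float & Critical Path": 85,
    "QSRA Readiness": 70,
}
bands = {domain: rag_band(score) for domain, score in domain_scores.items()}
needs_findings = sorted(d for d, band in bands.items() if band != "green")
```

The `needs_findings` list is the set of domains for which the scorecard should carry the specific findings behind the amber or red result.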

Tools to run these checks have improved significantly in the last five years. A practitioner today can run the schedule check, register check, mapping check and aggregated scorecard in roughly the time it used to take to do the DCMA check alone. At SOMA we built the QSRA Validator — a desktop tool covering all four validators and producing the aggregated scorecard — because our own delivery teams were losing days per engagement to spreadsheet-based audits that were not producing defensible, consistent output. The principle generalises: if the checks are formalised, the tooling will follow.

What to do on Monday morning

If you are running a QSRA in the next quarter and the schedule / register pair has not been through a dedicated readiness check, the practical improvements are specific and achievable within two weeks. First, run DCMA 14-point against the current XER. If it does not pass, the schedule is not ready — go back to the planner with the findings and fix them before thinking about risk-loading. DCMA alone takes half an hour with the right tool; there is no reason to skip it.

Second, separately, audit the schedule against a QSRA-specific framework. SOMA QSRA Readiness is one option; adapted CIOB PP21 subsets or bespoke checklists are others. The important thing is that you have checked against the right questions — LOE discipline, lag hygiene, scope markers — and not just against generic schedule quality. Document the checks you have run and the findings; this is what a defensible methodology sign-off looks like at the point a regulator asks.

Third, validate the risk register against a written schema: mandatory fields, probability bounds, three-point ranges calibrated per risk rather than one uniform spread applied to every row, and explicit distribution types. If you are working in Riskhive or Xactium, each has its own export quirks; be aware of them and build the validator to catch the specific failure modes your tool produces. A register that passes this check is not necessarily a good register, but a register that fails it is definitely not ready.

Fourth, run the risk-to-activity mapping check. Every row in the register should resolve to an activity in the XER, or to a coded scope marker the tool can resolve. Orphan risks get flagged and resolved; unmapped risk-bearing activities get flagged and resolved. Split-impact weights that are missing get filled in with explicit values, not left blank. This is the step that catches the mapping errors no-one else is going to catch for you.

Fifth, produce a readiness scorecard and sign it off before the simulation runs. The scorecard is not a governance artefact for its own sake — it is the evidence that the inputs were checked properly and the analyst has a defensible position on input quality. If the Monte Carlo output turns out to be surprising (good or bad), the first question a reviewer will ask is "were the inputs clean?" and a signed readiness scorecard answers that question directly. A QSRA without one is an opinion; a QSRA with one is a defensible methodology.

The broader point is that QSRA input assurance has become a specific practitioner discipline in its own right, and frameworks like DCMA, CIOB PP21 and Acumen Fuse SQI — all of which are useful for the questions they were designed to answer — do not fully answer the QSRA question. The question is whether the schedule and register are ready to be risk-loaded, and the only way to know is to check. Tools have caught up with the discipline in the last few years; the practice should catch up with the tools. For teams running major UK infrastructure programmes, where QSRA output drives contingency conversations with boards and sponsors, the cost of running the readiness check is small and the cost of not running it can be significant.
