
Stress-testing your FIRE number

April 24, 2026 · 10 min read

Figure: fan chart of 500 simulated paths from a £1M starting portfolio. The median path ends at £2M; the 5th-percentile path reaches zero before year 30.

There is a common pattern in how people plan for early retirement. They calculate a target portfolio using the 4% rule, or maybe 3.5% if they are cautious. They check the number against their current savings rate. They adjust the retirement date accordingly. They feel reassured. They stop.

This is the single most dangerous moment in FIRE planning. Not because the calculation is wrong, but because a point estimate for a long-horizon plan exposed to several interacting risks is barely a plan at all. It is a wish with arithmetic.

Stress-testing is the process of deliberately pushing your plan against conditions that might actually occur and seeing what breaks. Done properly, it turns "I have enough to retire" into something you can actually defend. Done badly, or not at all, it is why people retire at 45, experience two years of bad markets, and quietly start job-searching at 47.

This post covers what to test for, how to test for it, and what to look at in the output.

What stress-testing is actually for

The purpose of stress-testing is not to prove your plan works. It is to identify the conditions under which it doesn't, and decide whether those conditions are tolerable.

Every retirement plan fails in some scenarios. A plan that has a 100% success rate across all simulated paths is either massively over-funded (you worked longer than you needed to) or tested against scenarios too narrow to be useful. The question is not whether failures exist but whether the failures happen in conditions you can live with or prepare for.

This reframing matters because it changes what you look at. "My plan has a 92% success rate" tells you almost nothing. Which 8% fails? What do the failures have in common? When do they happen? Could you detect them in year 5 or only in year 25? Would you have any options left by the time you noticed? These questions are what stress-testing actually answers, and they are qualitatively different from the survival-probability framing most FIRE tools produce.

The four tests that matter

Not every stress test is equally useful. A plan that survives a single catastrophic scenario may fail in a completely different one, and running thirty stress tests is just a way to generate thirty false-confidence statistics. Four tests do most of the real work.

Test 1: historical worst-case sequences

Run your plan through the specific historical sequences that have broken retirements. For the US, the canonical worst cases are 1906 (pre-Depression start), 1929 (the Depression itself), 1966 (stagflation entry), and 2000 (dot-com bust entry). For the UK, 1937, 1965, and 1972 are the brutal starting points. For Germany, 1923 (hyperinflation) and 1914 (pre-war) rarely appear in English-language planning because the data is harder to obtain, but they are not hypothetical.

This test answers: "if I had retired at the worst historical moment for my market, would my plan have survived?" If yes, your plan has passed the strongest test history can provide. If no, you need to know how badly it failed and when.

The limitation is that historical back-testing uses a single past, not a distribution of possible futures. The next worst-case sequence will not be a repeat of 1929 — it will be something new. Historical testing is necessary but insufficient.
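As a rough sketch of what a sequence replay involves, the Python below runs a fixed real withdrawal through one series of annual real returns and reports when, if ever, the money runs out. The function name `run_sequence` and the return series are illustrative assumptions: the numbers are placeholders, not historical data, and you would swap in the actual real-return series for your market and start year.

```python
def run_sequence(start_balance, annual_spend, real_returns):
    """Replay one sequence of annual real returns against a fixed real withdrawal.

    Returns the list of end-of-year balances and the first year in which the
    portfolio could not fund the full withdrawal (None if it never fails).
    """
    balance = start_balance
    path = []
    failed_year = None
    for year, r in enumerate(real_returns, start=1):
        if balance < annual_spend:
            # Not enough left to fund this year's spending: record the failure.
            failed_year = year
            path.append(0.0)
            break
        # Withdraw at the start of the year, then apply that year's return.
        balance = (balance - annual_spend) * (1 + r)
        path.append(balance)
    return path, failed_year


# Placeholder returns only: an early crash followed by a weak recovery.
# Replace with the real-return series for the start year you are testing.
illustrative_returns = [-0.25, -0.15, 0.05, -0.10, 0.08] + [0.03] * 25

path, failed_year = run_sequence(1_000_000, 40_000, illustrative_returns)
print("failed in year:", failed_year)
print("balance after year 10:", round(path[9]) if len(path) >= 10 else 0)
```

Withdrawing before applying the year's return is a deliberately conservative ordering; other conventions shift the result slightly. The useful output is not just whether the path fails but the year it fails and what the balance looked like along the way.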

Test 2: Monte Carlo with country-appropriate inputs

Monte Carlo simulation samples return distributions thousands of times and reports the portfolio outcome across those samples. The methodology is standard. What matters is what you sample from.

A Monte Carlo that samples from US historical returns will produce a reassuring picture because US historical returns are unusually good. A Monte Carlo that samples from the 38-country dataset used by Anarkulova et al. (2025) will produce a considerably less reassuring picture. A Monte Carlo that samples from forward-looking capital markets assumptions (what Morningstar and similar firms publish) will produce something in between, reflecting expectations about how future returns might differ from history.

The test to actually run is one that samples from a distribution broad enough to include scenarios you would rather not think about. Japan 1990-2019 (negative real equity returns across thirty years) should be in your sampling distribution. Germany 1948-1978 (post-war recovery from hyperinflation) should be in your sampling distribution. If they are not, your Monte Carlo is telling you "you will be fine, assuming the next forty years look like an above-average American century."
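To make the "what you sample from" point concrete, here is a minimal sketch of a block-bootstrap Monte Carlo in Python. It stitches together random multi-year blocks from a pool of return series and counts how often a fixed real withdrawal survives. Everything named here (`block_bootstrap_path`, `success_rate`, the two series in `pool`, the 5-year block length) is an assumption for illustration; the return figures are invented placeholders, not data for any real market.

```python
import random


def block_bootstrap_path(return_pool, years, block_len, rng):
    """Build one return path by stitching together random contiguous blocks.

    return_pool maps a label (for example a country) to its annual real returns.
    Sampling whole blocks keeps multi-year bad runs intact instead of
    shuffling them away.
    """
    labels = list(return_pool)
    path = []
    while len(path) < years:
        series = return_pool[rng.choice(labels)]
        start = rng.randrange(len(series) - block_len + 1)
        path.extend(series[start:start + block_len])
    return path[:years]


def success_rate(return_pool, start_balance, annual_spend, years=40,
                 block_len=5, n_paths=2000, seed=0):
    """Share of bootstrapped paths in which the portfolio funds every year."""
    rng = random.Random(seed)
    survived = 0
    for _ in range(n_paths):
        balance = start_balance
        for r in block_bootstrap_path(return_pool, years, block_len, rng):
            balance = (balance - annual_spend) * (1 + r)
            if balance <= 0:
                break
        else:
            # The loop finished without hitting zero: this path survived.
            survived += 1
    return survived / n_paths


# Invented placeholder series, NOT real data: one favourable market and one
# that spends long stretches going nowhere.
pool = {
    "strong_market": [0.09, 0.12, -0.08, 0.15, 0.07, 0.10, -0.12, 0.11, 0.06, 0.08],
    "lost_decades": [-0.20, -0.05, 0.03, -0.10, 0.02, -0.15, 0.04, 0.01, -0.03, 0.05],
}
narrow = success_rate({"strong_market": pool["strong_market"]}, 1_000_000, 40_000)
broad = success_rate(pool, 1_000_000, 40_000)
print(f"narrow pool: {narrow:.0%}, broad pool: {broad:.0%}")
```

The same plan and the same code give very different answers depending on which series are allowed into the pool, which is the whole argument for widening it.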

Test 3: compound stress tests

Single-variable stress tests are easy and misleading. "What if inflation is 2% higher?" "What if returns are 1% lower?" "What if I spend more than expected?" Tested in isolation, each of these can be passed by a plan that still fails when they happen together.

Real retirement failures are typically compound events. The 1970s in the UK and US combined high inflation (corroding fixed income), low real equity returns, and oil-shock-driven spending pressure. The Depression that began in 1929 combined catastrophic equity losses with deflation (which helped), bank failures (which hurt), and sustained unemployment (which raised spending needs). These were not separate misfortunes arriving one after another. They were correlated disruptions that compounded.

A proper stress test examines scenarios where two or three bad things happen together. For example: a 20% equity drawdown, plus 5% inflation for three years, plus a 15% increase in spending due to a health event. These are the scenarios that actually break retirements, and they are the ones single-variable testing misses.
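The compound scenario above can be sketched in a few lines of Python. This is a simplified nominal projection under stated assumptions, not a full model: a 6% nominal return and 2% baseline inflation (both placeholders), the drawdown replacing the first year's return, the high inflation applying to the first three years, and the spending increase treated as permanent. The function `ending_balance` and its parameters are hypothetical names.

```python
def ending_balance(start, spend, years=30, base_return=0.06, base_inflation=0.02,
                   crash=None, shock_inflation=None, shock_years=3, spend_shock=0.0):
    """Nominal projection with optional shocks at the start of retirement.

    crash: equity loss replacing the year-1 return (0.20 means -20%).
    shock_inflation: inflation rate applied for the first shock_years years.
    spend_shock: permanent proportional rise in spending from year 1.
    Returns the ending balance, or 0.0 if the portfolio is exhausted.
    """
    balance = start
    spend = spend * (1 + spend_shock)
    for year in range(1, years + 1):
        balance -= spend
        if balance <= 0:
            return 0.0
        r = -crash if (crash is not None and year == 1) else base_return
        balance *= 1 + r
        # Next year's spending rises with whichever inflation rate applies.
        shocked = shock_inflation is not None and year <= shock_years
        spend *= 1 + (shock_inflation if shocked else base_inflation)
    return balance


scenarios = {
    "baseline": {},
    "crash only": {"crash": 0.20},
    "inflation only": {"shock_inflation": 0.05},
    "spending only": {"spend_shock": 0.15},
    "all three": {"crash": 0.20, "shock_inflation": 0.05, "spend_shock": 0.15},
}
results = {name: ending_balance(1_000_000, 40_000, **kw) for name, kw in scenarios.items()}
for name, value in results.items():
    label = "exhausted" if value == 0 else f"£{value:,.0f}"
    print(f"{name:>15}: {label}")
```

Under these placeholder assumptions, a £1M portfolio funding £40,000 a year still has money left at year 30 after each shock taken alone, but is exhausted when all three arrive together. The specific figures depend entirely on the assumed return and inflation; the pattern is what matters.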

Test 4: longevity stress tests

The standard retirement planning horizon is 30 years. This was set in the 1990s, when Bengen framed the 4% rule around a 65-year-old retiree who might plausibly live into their mid-90s. For an early retiree at 45 or 50, a 30-year horizon ends at 75-80, which is a plausible life expectancy for someone in poor health but an obvious underestimate for most others.

The right horizon for stress-testing is not a single number but a distribution. Some portion of retirees will die in their early 70s. Most will live into their 80s. A non-trivial percentage will live past 95. For a couple, the chance that at least one partner survives to 95 is higher than the chance for either partner alone. Morningstar's 2025 State of Retirement Income report explicitly notes that 40-year horizons produce safe withdrawal rates of 3.3% versus 3.9% for 30-year horizons, a meaningful difference for anyone retiring before 60.

The test: run your plan against your actual mortality distribution, not the conventional 30-year cap. Anarkulova et al. (2025) did this at the research level and found that proper mortality-table adjustment lowered the safe withdrawal rate materially. Your personal plan should follow the same logic.
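One way to picture the difference between a fixed cap and a mortality distribution is to weight each simulated failure by the chance you are still alive when it happens. The Python sketch below does that. It assumes independent lifespans for a couple, and the survival curve is a made-up shape for illustration only, not a mortality table; `joint_survival`, `prob_outliving_money` and the depletion years are all hypothetical.

```python
def joint_survival(curve_a, curve_b):
    """P(at least one partner alive) per year, assuming independent lifespans."""
    return [1 - (1 - a) * (1 - b) for a, b in zip(curve_a, curve_b)]


def prob_outliving_money(depletion_years, survival):
    """Average, across simulated paths, of P(still alive when the money runs out).

    depletion_years: one entry per path, the year the portfolio hit zero,
    or None if it lasted the whole simulation.
    survival[t]: probability of being alive t years into retirement.
    """
    total = 0.0
    for year in depletion_years:
        if year is not None:
            total += survival[year]
    return total / len(depletion_years)


YEARS = 56  # retire at 45, curve runs to age 100
# Made-up survival shape for illustration; use a real mortality table in practice.
single = [max(0.0, 1 - (t / (YEARS - 1)) ** 3) for t in range(YEARS)]
couple = joint_survival(single, single)

# Hypothetical simulation output: 100 paths, 4 deplete in year 28, 6 in year 35.
depletion_years = [None] * 90 + [28] * 4 + [35] * 6

capped_at_30 = sum(1 for y in depletion_years
                   if y is not None and y <= 30) / len(depletion_years)
print(f"failure rate with a 30-year cap: {capped_at_30:.1%}")
print(f"single retiree outlives the money: {prob_outliving_money(depletion_years, single):.1%}")
print(f"couple outlives the money: {prob_outliving_money(depletion_years, couple):.1%}")
```

The 30-year cap simply never sees the paths that fail in year 35, while the mortality-weighted figure counts them in proportion to the chance someone is still there to be affected.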

What to look at in the output

Every stress test produces output. Most people look at the wrong parts of it. Three specific numbers matter more than success rate.

The 5th percentile portfolio trajectory. Forget the average. Forget the median. Look at the worst 5% of outcomes and ask: do they fail slowly, giving you time to react, or do they fail suddenly with no warning? A plan that fails in year 28 of a 30-year retirement is survivable with modest cuts. A plan that is already 40% below starting value by year 7 is terminal, even if the statistical success rate looks fine.

The shape of the first-decade drawdown. Most retirement plans that eventually fail do so because of what happens in the first ten years. Look at the distribution of portfolio values at year 10 across your simulations. If the 10th percentile shows you 50% below your starting balance, your plan is highly sensitive to the specific sequence you happen to retire into. The middle of the distribution is not what matters; the left tail is.

The maximum drawdown at any point. Not just the final outcome, but the worst moment along the way. A retirement plan that ends at exactly the same balance after thirty years might have done so smoothly, or it might have dropped 60% in year 8 before recovering. Those are different plans from a psychological-survival standpoint. Most people cannot rationally stay invested through a 60% drawdown while actively withdrawing from it, even if the maths says they should.

The best summary of stress-test output is not a single success probability. It is something closer to: "In the 5th percentile outcome, my portfolio hits £X by year 10, recovers or fails to £Y by year 20, and ends at £Z. I am prepared to respond to the year-10 warning sign by doing specific things A, B, and C."

That is a plan. "92% success rate at 4% withdrawal" is not.
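For readers who run their own simulations, the three numbers above can be pulled out of a set of paths with very little code. The sketch below is one way to do it in Python, assuming each path is a list of balances with index 0 as the starting value. The return model (normally distributed real returns with a 4% mean and 15% standard deviation) is a placeholder assumption used only to generate something to measure, and the helper names are hypothetical.

```python
import math
import random


def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def percentile_path(paths, pct):
    """Year-by-year percentile across equal-length paths of balances."""
    return [percentile(year_values, pct) for year_values in zip(*paths)]


def max_drawdown(path):
    """Largest peak-to-trough fall along one path, as a fraction of the peak."""
    peak, worst = path[0], 0.0
    for value in path:
        peak = max(peak, value)
        if peak > 0:
            worst = max(worst, (peak - value) / peak)
    return worst


def simulate(start_balance, annual_spend, years, rng, mean=0.04, stdev=0.15):
    """One path of balances; index 0 is the start, index t the end of year t."""
    path = [float(start_balance)]
    for _ in range(years):
        remaining = max(0.0, path[-1] - annual_spend)
        path.append(max(0.0, remaining * (1 + rng.gauss(mean, stdev))))
    return path


rng = random.Random(42)
# 500 paths from the placeholder return model described above.
paths = [simulate(1_000_000, 40_000, 30, rng) for _ in range(500)]

p5 = percentile_path(paths, 5)
p10 = percentile_path(paths, 10)
drawdowns = [max_drawdown(p) for p in paths]

print(f"5th percentile: year 10 £{p5[10]:,.0f}, year 20 £{p5[20]:,.0f}, year 30 £{p5[30]:,.0f}")
print(f"10th percentile at year 10 vs start: {p10[10] / 1_000_000 - 1:+.0%}")
print(f"median maximum drawdown: {percentile(drawdowns, 50):.0%}")
```

Note that a year-by-year percentile line is not a single simulated path; it is the 5th percentile of balances at each year, which is usually what a fan chart draws. Maximum drawdown, by contrast, has to be measured path by path.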

Common mistakes in stress-testing

Five mistakes turn up repeatedly. Watching for them is itself a useful test.

Over-reliance on US data. The dominant FIRE tools are US-built and use US historical returns by default. A UK, EU, or other non-US investor running these tools is stress-testing against the wrong distribution. See the previous post in this series for why this matters structurally.

Confusing success rate with plan quality. A plan with a 92% success rate that survives only by consuming 90% of the portfolio is technically successful but practically catastrophic. The "ending balance" distribution across simulations matters as much as the binary survival metric. A plan that ends with £200k in the worst surviving case is much weaker than one that ends with £800k.

Testing one variable at a time. As noted above, real failures are compound. Stress tests that examine inflation, returns, and spending in isolation produce a plan that survives each test individually and then fails when they correlate, which is their usual pattern.

Ignoring non-market shocks. The stress tests most people run vary returns and inflation. They rarely vary spending. Real retirement spending is volatile — a health event, a family emergency, an unexpected home repair — and the plan needs to survive those too. A £40,000-a-year plan that fails when actual spending is £50,000 in two non-consecutive years has a fragility that no market-focused stress test would catch.

Treating the stress test as a one-time exercise. The test is useful at the point of retirement, but more useful repeatedly over the first decade. Conditions change. Your plan should be re-stress-tested every year or two, with updated portfolio values, return expectations, and spending data. The early-retirement equivalent of a medical check-up.
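The non-market shock described above (a £40,000-a-year plan that has to absorb £50,000 in two non-consecutive years) is easy to add to any replay. A minimal Python sketch, assuming real returns and real spending; `final_balance`, the shock years and the return series are illustrative placeholders rather than data:

```python
def final_balance(start_balance, base_spend, real_returns, extra_spend=None):
    """Replay real returns with a real base spend plus one-off extras.

    extra_spend maps a 1-indexed year to additional real spending that year.
    Returns 0.0 if the portfolio cannot fund a year's spending.
    """
    extra_spend = extra_spend or {}
    balance = start_balance
    for year, r in enumerate(real_returns, start=1):
        need = base_spend + extra_spend.get(year, 0)
        if balance < need:
            return 0.0
        balance = (balance - need) * (1 + r)
    return balance


# Placeholder 30-year return series, not historical data.
returns = [0.06, -0.08, 0.10, 0.02, 0.05] * 6

baseline = final_balance(1_000_000, 40_000, returns)
# Spending is £50,000 instead of £40,000 in years 3 and 9.
shocked = final_balance(1_000_000, 40_000, returns, {3: 10_000, 9: 10_000})

print(f"ending balance without shocks: £{baseline:,.0f}")
print(f"ending balance with shocks:    £{shocked:,.0f}")
print(f"cost of £20,000 extra spending: £{baseline - shocked:,.0f}")
```

In this placeholder series the gap in ending balance is larger than the £20,000 actually spent, because the money withdrawn early also loses its subsequent growth. On a plan with a thinner margin, the same two years can be the difference between surviving and not.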

The variable-spending corrective

The single most effective response to stress-testing is to build variable-spending rules into the plan itself. A plan that withdraws a fixed 4% adjusted for inflation is brittle. A plan that adjusts withdrawals based on portfolio performance — Guyton-Klinger guardrails, Vanguard's dynamic spending, or a simple floor-and-ceiling rule — absorbs most of the damage from bad sequences without requiring drastic intervention.

The trade-off is income variability, which is psychologically harder than it sounds. People's actual spending is less flexible than they think. A plan that requires cutting £8,000 from annual spending in year 6 of retirement to preserve long-term viability is mathematically correct and behaviourally unreliable. Most people will not do it.

The honest response is to build the flexibility in at the front end, not the back end. Instead of a £40,000-a-year plan with an emergency brake, run a £32,000-a-year plan with £8,000 of discretionary spending clearly labelled as such. The discretionary component is the one that flexes in a bad sequence. This sounds like a trivial semantic difference; behaviourally it is not. "Cancel the holiday because the market is down" is a very different cognitive task from "we'll do the holiday this year because we said we would."
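The essential-plus-discretionary split can be expressed as a very small spending rule. The Python sketch below assumes one hypothetical rule: discretionary spending happens only in years that start with the portfolio at or above 80% of its starting value. That trigger is an illustration, not Guyton-Klinger or Vanguard's method, and the return series is a placeholder rather than historical data.

```python
def simulate_flexible(start_balance, essential, discretionary, real_returns,
                      trigger=0.8):
    """Spend essential every year; add discretionary only when the portfolio
    starts the year at or above trigger * start_balance.

    Returns the ending balance and the list of amounts actually spent.
    """
    balance = start_balance
    spending = []
    for r in real_returns:
        spend = essential
        if balance >= trigger * start_balance:
            spend += discretionary
        # Cannot spend more than is left.
        spend = min(spend, max(balance, 0.0))
        balance = (balance - spend) * (1 + r)
        spending.append(spend)
    return balance, spending


# Placeholder sequence: a bad first five years, then steady 5% real returns.
returns = [-0.25, -0.15, 0.05, -0.10, 0.08] + [0.05] * 25

# Fixed plan: the full £40,000 is treated as essential.
fixed_end, fixed_spend = simulate_flexible(1_000_000, 40_000, 0, returns)
# Flexible plan: £32,000 essential plus £8,000 discretionary.
flex_end, flex_spend = simulate_flexible(1_000_000, 32_000, 8_000, returns)

print(f"fixed £40,000 plan ends at:  £{fixed_end:,.0f}")
print(f"£32,000 + £8,000 plan ends at: £{flex_end:,.0f}")
print(f"years with discretionary spending: {sum(1 for s in flex_spend if s > 32_000)}")
```

In this placeholder sequence the fixed plan runs out while the flexible plan does not, but only because the discretionary £8,000 is dropped after the first year and never restored. That is the trade-off in numbers: the plan survives by giving up most of the discretionary spending, which is why it matters that the £8,000 was labelled as optional from the start.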

Where Endute fits

Most FIRE calculators show you a single projected path with a smooth return assumption and a success-rate number. This is exactly the presentation that lets people skip the actual stress-testing step. Endute's FIRE simulator is built to show the distribution of outcomes, including the 5th and 10th percentile paths, the first-decade drawdown shape, and the ending-balance distribution under variable-spending rules. The simulator does not replace a professional financial planner for people who need one, and for complex cross-border plans or high-stakes decisions nothing replaces qualified advice. But for the basic question of "does my plan survive bad sequences, not just average ones," it gives you the shape of the answer rather than a single reassuring number.

Closing note

Stress-testing is not a confidence-building exercise. If you run it properly and come away more relaxed than before, you probably ran it wrong. The purpose is to find weaknesses while there is still time to do something about them.

The best indicator that you have stress-tested well is that you know specifically what signals would tell you your plan is off-track in year 3, year 7, and year 15, and you have specific responses prepared for each. "I'd go back to work" is not a response. "I'd cut my discretionary spending by £6,000 and delay the house purchase by two years until my portfolio recovers to £X" is a response.
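Writing the signals and responses down in a structured form is one way to check that they really are specific. A minimal Python sketch follows; the `Tripwire` class, the thresholds and the first and third responses are hypothetical. In practice the thresholds might be taken from the 5th or 10th percentile path of your own simulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tripwire:
    year: int            # checkpoint, in years since retirement
    min_balance: float   # warning threshold for the portfolio at that checkpoint
    response: str        # the specific action agreed in advance


def responses_due(tripwires, year, balance):
    """Responses for every tripwire at this checkpoint whose threshold is breached."""
    return [t.response for t in tripwires
            if t.year == year and balance < t.min_balance]


# Thresholds and actions are placeholders for illustration.
plan = [
    Tripwire(3, 850_000, "pause discretionary spending for 12 months"),
    Tripwire(7, 750_000, "cut discretionary spending by £6,000 and delay "
                         "the house purchase by two years"),
    Tripwire(15, 600_000, "move to the essential-only budget"),
]

print(responses_due(plan, year=7, balance=700_000))
```

If you cannot fill in the `min_balance` and `response` fields for each checkpoint, that gap is itself the finding.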

Most people do not have this level of clarity about their plan because the tools they use do not produce it. That is a product choice, not a limitation of retirement mathematics. The mathematics are fine. The tools are the weak link.

Sources and further reading

Bengen, W.P. (1994). "Determining Withdrawal Rates Using Historical Data." Journal of Financial Planning 7(4):171-180.
Morningstar (2025). The State of Retirement Income 2025. 30-year base case 3.9%, 40-year base case 3.3%.
Anarkulova, A., Cederburg, S., O'Doherty, M.S., Sias, R. (2025). "The safe withdrawal rate: evidence from a broad sample of developed markets." Journal of Pension Economics and Finance 24(3):464-500. Uses actual mortality distributions and block bootstrap from 38-country data.
Guyton, J. and Klinger, W. (2006). "Decision Rules and Maximum Initial Withdrawal Rates." Journal of Financial Planning. Variable-spending guardrail methodology.
Pfau, W.D. (2010). "An International Perspective on Safe Withdrawal Rates: The Demise of the 4 Percent Rule?" Journal of Financial Planning, December 2010. Historical worst-case sequences by country.

This is the seventh post in a cluster on early retirement planning. The preceding post in this series explains sequence-of-returns risk in general terms; this one operationalises that concept into specific tests. The next post examines tax-efficient withdrawal order in early retirement — which wrapper to draw from first, why, and how the answer differs by jurisdiction.
