Model Risk: When Your Mental Map Becomes the Failure Point

Success depends on the plan as well as the execution

Feb 10, 2026


In 2008, financial institutions discovered that their risk models, sophisticated mathematical frameworks built by the brightest quantitative minds, had catastrophically failed. The models said the portfolios were safe. Reality disagreed. The mismatch didn’t come from calculation errors or data problems. It came from the models themselves: they were built on assumptions that stopped being true. The result was the all-too-well-known Great Recession.

This is but one example of model risk. Model risk is the danger that emerges from having the wrong plan, not merely from poor execution. Failure occurs not because the measurements were taken in error, but because the measurement itself is useless. The arrow fails to hit the target not because of poor accuracy, but because the archer is playing backgammon.

Every system runs on models. Whether it’s a financial organization, a technical group of scientists, or a disparate group of grassroots activists, they all come loaded, not as blank slates, but with their own ideas and assumptions about what success looks like, how to get there, and what will indicate whether or not their efforts are making a difference. Seldom are these ideas made explicit. Instead they persist as implicit, embedded procedures, heuristics, and habits. All of them are simplifications of reality. And when the simplification diverges too far from what’s actually happening, the system built on that model fails, often spectacularly.

The Ubiquity of Models

Before exploring model risk, we must recognize how thoroughly models permeate every system.

Mental Models

The most common models are mental: the assumptions and heuristics people use to navigate their work, which I’ve written about before. A manager operates on a model of how employees respond to incentives. A doctor operates on a model of how diseases progress. An engineer operates on a model of how systems fail. The author of this Substack operates on a model of systems dynamics.

These mental models are usually unconscious and unexamined. They’re built from experience, training, and cultural norms. They work well when current conditions match the conditions under which they were formed. They fail when conditions change but the models don’t.

W. Edwards Deming emphasized the importance of making assumptions explicit. His concept of profound knowledge required understanding not just what is happening but why—understanding the theory behind the observations. Without this, people optimize locally based on mental models that may be fundamentally misaligned with system-level reality.

Operational Models

Organizations encode models into processes and procedures. For instance, when we establish that “standard processing time is 2 days,” we’re creating a model. When we set inventory reorder points based on historical demand patterns, we’re building a model. When we staff call centers based on expected call volume, we’re operating from a model.

These operational models often outlive their validity. The procedure manual reflects how things worked five years ago. The inventory model assumes demand patterns that changed last quarter. The staffing model doesn’t account for new product launches that shift call patterns.

Analytical Models

Formal models, which are often captured in artifacts like spreadsheets, algorithms, and simulations, are particularly dangerous because they appear precise. A financial forecast predicting 15% growth seems definitive. A demand planning model predicting 10,000 units needed seems authoritative. A machine learning model predicting customer churn seems objective.

But precision isn’t the same as accuracy, and the appearance of rigor doesn’t guarantee validity. These models are built on data from the past and assumptions about the future. When the future doesn’t resemble the past, the model fails, and it often fails silently. Nobody notices or responds. The model continues to produce confident predictions that are confidently wrong. Often in error, never in doubt.

How Assumptions Become Invisible Risk Factors

The most dangerous assumptions are the ones we don’t know we’re making.

Implicit Distributional Assumptions

Many models don’t look past a basic average. If the average time it takes to order a drink from Starbucks is 3 minutes, people think it will take 3 minutes. No more, no less. They don’t, and usually can’t, account for variation in the service time.

If there is such accounting, it usually means assuming that variables follow certain distributions, typically the normal distribution, the bell curve. This seems mathematically convenient and often appears to fit historical data, but for several applications (like service times) we know this will be incorrect.
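To make the gap concrete, here is a minimal sketch comparing how often a 3-minute average turns into a wait of more than 6 minutes under two different distributional assumptions. The exponential model and the 1-minute standard deviation for the bell curve are illustrative assumptions, not measurements of any real coffee shop.

```python
import math

MEAN_MINUTES = 3.0   # the "average" everyone plans around
THRESHOLD = 6.0      # a wait twice as long as the average


def normal_tail(threshold, mean, std_dev):
    """P(T > threshold) if service time were normally distributed."""
    z = (threshold - mean) / std_dev
    return 0.5 * math.erfc(z / math.sqrt(2))


def exponential_tail(threshold, mean):
    """P(T > threshold) if service time were exponentially distributed."""
    return math.exp(-threshold / mean)


# Assumed spread for the bell curve: 1 minute (hypothetical).
p_normal = normal_tail(THRESHOLD, MEAN_MINUTES, std_dev=1.0)
p_exponential = exponential_tail(THRESHOLD, MEAN_MINUTES)

print(f"Normal model:      {p_normal:.4f}")
print(f"Exponential model: {p_exponential:.4f}")
print(f"Ratio:             {p_exponential / p_normal:.0f}x")
```

Both models agree on the average. Under these assumptions they disagree by roughly two orders of magnitude on how often a customer waits twice as long, which is the number a staffing decision actually depends on.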

Financial risk models before 2008 often assumed that asset returns followed normal distributions with predictable correlations. This worked fine for years, producing accurate risk assessments under ordinary conditions. But during crisis conditions, correlations approached 1.0 (everything fell together), and returns showed fat tails (extreme events occurred far more frequently than normal distributions predicted). Learn more about all that here.

The models weren’t wrong about the math. They were wrong about the fundamental structure of the phenomena they modeled. The risk wasn’t in the calculation; it was in the assumption that reality would continue to match the model’s distributional premises.

Stability Assumptions

Most models assume stability in relationships between variables. If we’ve historically converted 10% of leads to customers, we model future conversion at 10%. If production has historically required 2.5 labor hours per unit, we plan future production on that basis.

This works until something changes. A new competitor emerges. A process improvement reduces labor requirements. A regulatory change alters customer behavior. The model keeps projecting based on historical relationships that no longer hold.

The Theory of Constraints warns against this explicitly. Goldratt argued that local optimization based on static models often harms global performance because it fails to account for dynamic system interactions. Improving Department A based on historical patterns might create new bottlenecks in Department B that the model never predicted.

Linearity Assumptions

Many models assume linear relationships: double the input, double the output. Double the marketing spend, double the leads. Double the staff, double the throughput.

But most real systems are nonlinear. Returns diminish. Capacity constraints bind. Coordination overhead increases. Feedback loops create unexpected effects.

A staffing model built on linear assumptions might predict that adding 20% more staff increases output 20%. In reality, output might increase only 10% because the constraint wasn’t staff but equipment availability. Or output might increase 30% because the additional staff eliminated a coordination bottleneck. The model fails not because the math is wrong but because the assumption of linearity doesn’t match the system’s actual behavior.
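The equipment-constrained case can be sketched in a few lines. The productivity figure and the equipment ceiling below are hypothetical numbers chosen so that the result matches the 20% versus 10% example; they are not drawn from any real operation.

```python
def linear_model(staff, units_per_person):
    """The planning model: output scales directly with headcount."""
    return staff * units_per_person


def constrained_model(staff, units_per_person, equipment_capacity):
    """A slightly richer model: output is capped by equipment capacity."""
    return min(staff * units_per_person, equipment_capacity)


UNITS_PER_PERSON = 10       # hypothetical productivity per person
EQUIPMENT_CAPACITY = 1100   # hypothetical ceiling set by equipment

for staff in (100, 120):
    predicted = linear_model(staff, UNITS_PER_PERSON)
    delivered = constrained_model(staff, UNITS_PER_PERSON, EQUIPMENT_CAPACITY)
    print(f"staff={staff}: linear model predicts {predicted}, "
          f"constrained system delivers {delivered}")
```

At 100 staff the two models agree, so the linear model looks validated. The divergence only appears once the constraint binds, which is exactly when the plan has already been committed.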

Parameter Drift: When the Model Stops Matching Reality

Even when a model’s structure is sound, its parameters can drift.

Slow Drift

Customer preferences gradually shift. Competitor strategies evolve. Technology capabilities advance. Economic conditions change. Each individual change is small, perhaps imperceptible. But cumulatively, they alter the relationships the model depends on.

A demand forecasting model built on three years of historical data might work well initially. But if customer behavior is slowly changing, perhaps due to demographic shifts or evolving preferences, the model’s parameters become increasingly stale. It continues to forecast confidently, but accuracy gradually degrades.

The danger is that this degradation is often invisible. The model doesn’t throw errors. It doesn’t signal that its assumptions are outdated. It just quietly becomes less reliable while maintaining the appearance of precision.
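A small sketch shows how quietly this happens. It assumes a model fitted at a 10% conversion rate while the true rate declines by a hypothetical 0.2 percentage points per month; every number here is illustrative.

```python
MODEL_RATE = 0.10          # conversion rate fitted on historical data
DRIFT_PER_MONTH = 0.002    # hypothetical slow decline in the true rate
LEADS_PER_MONTH = 1000


def forecast_errors(months):
    """Return the model's percentage over-forecast for each month."""
    errors = []
    for month in range(months):
        true_rate = MODEL_RATE - DRIFT_PER_MONTH * month
        forecast = LEADS_PER_MONTH * MODEL_RATE   # the model never updates
        actual = LEADS_PER_MONTH * true_rate
        errors.append((forecast - actual) / actual * 100)
    return errors


errors = forecast_errors(24)
for month in (0, 6, 12, 23):
    print(f"month {month:2d}: over-forecast by {errors[month]:.1f}%")
```

No single month looks alarming compared with the one before it, yet after a year the forecast is off by about a third. Nothing in the model itself ever raises a flag.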

Sudden Breaks

More dramatic is a structural break, when something changes fundamentally and immediately. An event disrupts normal demand patterns. A key supplier goes bankrupt. A regulatory change invalidates business models. A technological breakthrough makes current approaches obsolete. This is exactly what happened to supply chains during the COVID-19 crisis: demand halted and labor forces were sent home. Oh, and let’s not forget about the container ship stuck in the Suez Canal.

Models built on pre-break data become instantly unreliable. Yet organizations often continue using them because “it’s what we’ve always done” or because replacing them takes time that feels unavailable during crisis.

Overfitting: The Illusion of Precision

Statistical models, especially sophisticated machine learning models, can achieve impressive accuracy on historical data through overfitting: learning patterns that are artifacts of the specific data sample rather than true underlying relationships.

An overfitted model might predict the training data with 98% accuracy but fail completely on new data. It’s learned the noise, not the signal. It’s optimized for the past, not for the future. This is one of the first no-nos you learn to avoid when studying machine learning and data analysis, but despite the elementary nature of the error, it propagates frequently across industries.
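Here is a deliberately contrived sketch of the effect, using only the standard library. The "overfitted" model is a lookup table that memorizes every training point, and the simple model is a least-squares line. The data (a signal of y = 2x with noise alternating between +1 and -1) and the noise-free held-out targets are assumptions made to keep the arithmetic transparent.

```python
# Training data: true signal y = 2x plus noise that alternates +1 / -1.
train_x = list(range(10))
train_y = [2.0 * x + (1.0 if x % 2 == 0 else -1.0) for x in train_x]

# Held-out targets: the same x values, but only the underlying signal.
test_x = train_x
test_y = [2.0 * x for x in test_x]


def fit_line(xs, ys):
    """Ordinary least squares for a straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Overfitted model: memorizes the training set, noise included.
memorized = dict(zip(train_x, train_y))
# Simple model: a straight line through the same data.
slope, intercept = fit_line(train_x, train_y)

memo_train = mse([memorized[x] for x in train_x], train_y)
memo_test = mse([memorized[x] for x in test_x], test_y)
line_train = mse([slope * x + intercept for x in train_x], train_y)
line_test = mse([slope * x + intercept for x in test_x], test_y)

print(f"memorizer: train MSE {memo_train:.3f}, held-out MSE {memo_test:.3f}")
print(f"line:      train MSE {line_train:.3f}, held-out MSE {line_test:.3f}")
```

The memorizer is perfect on the data it has seen and much worse than the line on the held-out targets. The line looks worse in training precisely because it declines to learn the noise.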

Overfitting also shows up outside of data analysis, in more holistic areas like organizational planning. A five-year strategic plan that fits historical trends perfectly might be overfit to circumstances that won’t continue. The plan has high fidelity to the past but no validity for the future.

The solution isn’t to avoid modeling; models are necessary and useful. It’s to maintain epistemic diffidence, recognizing that models are tools for thinking, not truth. They’re useful when they approximate reality well enough for decision-making, and dangerous when we mistake the map for the territory. To quote George Box:

“All models are wrong but some are useful.”

Forecast Risk

Organizations run on forecasts. Budget forecasts. Demand forecasts. Capacity forecasts. Headcount forecasts. Every one is a model-based prediction, and every one carries model risk.

Budgeting: Planning on False Premises

Annual budgets are detailed financial models of the coming year. Departments negotiate for resources based on forecasted needs. Projects are approved or rejected based on forecasted returns. Headcount is allocated based on forecasted demand.

When forecasts are accurate, this works. When they’re not, the entire organizational plan is wrong. The budget assumed 20% growth; actual growth is 5%. Now every resource allocation is mismatched to reality. The organization is either starved for resources (if forecasts were too pessimistic) or bloated with excess (if forecasts were too optimistic).

The typical response is to blame the forecasters. But forecast error is inevitable. The real model risk is designing organizations that can’t adapt when forecasts prove wrong: organizations so tightly planned to the forecast that any deviation creates a crisis.

Staffing: People Costs of Model Errors

Staffing decisions based on demand forecasts create multi-month commitments. Hiring takes time. Training takes time. If you forecast high demand and staff accordingly, and that demand doesn’t materialize, you have expensive excess capacity. If you forecast conservatively and demand spikes, you have burnout and attrition.

The asymmetry is crucial: the cost of understaffing (burned-out employees, lost business, degraded quality) often exceeds the cost of modest overstaffing. But traditional efficiency models treat both errors symmetrically, leading to chronic understaffing as the “safer” bet.

This reflects model risk in organizational incentives. The model says: minimize labor cost. The reality is: chronic understaffing destroys institutional capability. The model optimizes the wrong objective.
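The effect of treating the two errors symmetrically can be sketched as a small expected-cost calculation. The demand scenarios, their probabilities, and the 3-to-1 cost ratio are all hypothetical assumptions chosen for illustration.

```python
# Hypothetical demand scenarios: (staff actually needed, probability).
SCENARIOS = [(80, 0.3), (100, 0.4), (120, 0.3)]


def expected_cost(staff_level, over_cost, under_cost):
    """Probability-weighted cost of being over- or understaffed."""
    total = 0.0
    for needed, probability in SCENARIOS:
        excess = max(staff_level - needed, 0)
        shortfall = max(needed - staff_level, 0)
        total += probability * (excess * over_cost + shortfall * under_cost)
    return total


def best_staff_level(over_cost, under_cost):
    """Pick the candidate staffing level with the lowest expected cost."""
    candidates = range(80, 121, 10)
    return min(candidates, key=lambda s: expected_cost(s, over_cost, under_cost))


# Symmetric model: one idle person costs the same as one missing person.
symmetric_choice = best_staff_level(over_cost=1.0, under_cost=1.0)
# Asymmetric model: a missing person is assumed to cost three times as much
# (burnout, lost business, degraded quality).
asymmetric_choice = best_staff_level(over_cost=1.0, under_cost=3.0)

print("Symmetric costs  -> staff", symmetric_choice)
print("Asymmetric costs -> staff", asymmetric_choice)
```

With the same forecast, the symmetric model staffs to the middle scenario and the asymmetric one staffs to the high scenario. The forecast did not change; only the objective did.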

Capital Planning: Betting on the Wrong Future

Capital investments are built on multi-year forecasts of technology needs, capacity requirements, and market conditions. A manufacturer invests in production equipment based on demand projections. A tech company builds data centers based on growth forecasts. A hospital expands based on demographic predictions.

When these forecasts are wrong, the capital is misallocated—sometimes catastrophically so. Equipment sits idle. Facilities are underutilized or overwhelmed. The strategic model said to invest here; reality says the need was elsewhere.

This is why Theory of Constraints emphasizes throughput accounting over cost accounting. Traditional models optimize for asset utilization and cost minimization. TOC recognizes that the true objective is system throughput—generating value through the constraint. These different models lead to radically different investment decisions.

Real-World Model Failures

Long-Term Capital Management (1998)

A hedge fund whose partners included Nobel laureates, and which relied on sophisticated mathematical models, achieved spectacular returns until its models failed. The models assumed that historical price relationships would continue, that markets were liquid enough to unwind positions, and that extreme events had predictable probabilities.

In 1998, none of these assumptions held. The Russian financial crisis created unprecedented volatility. Correlations that the model said were impossible occurred. Markets became illiquid exactly when LTCM needed liquidity. The model wasn’t wrong about the math; it was wrong about the world.

The Boeing 737 MAX (2019)

Boeing’s approach to the 737 MAX involved adding larger engines that changed the aircraft’s handling. Rather than redesign extensively, they added software (MCAS) to compensate. The model was: we can solve aerodynamic changes with software, and pilots will respond appropriately if issues arise. I’ve written about this extensively, here, here, here, and elsewhere too.

The model failed on multiple levels. It underestimated how aggressively MCAS would need to intervene and overestimated the reliability of the software. It failed to account for single-point sensor failures. It assumed pilots would recognize and respond to a system they weren’t trained on. Each assumption seemed reasonable in isolation; together, they created a fatal failure mode.

COVID-19 Forecasting (2020-2021)

Early pandemic models made necessary but ultimately flawed assumptions about virus behavior, intervention effectiveness, and human behavior. Models predicted death tolls of several million in New York City alone. As hysteria grew, policymakers all but abandoned the approach their models supported, which was to flatten the curve and ride it out alongside more protracted policies like sequestering the most vulnerable (the elderly), in favor of a fanaticism for the vaccines and Operation Warp Speed.

Building Systems That Don’t Collapse When Models Fail

The goal isn’t to create perfect models but to create systems that remain functional when models prove imperfect.

1. Make Models Explicit

Hidden models are more dangerous than explicit ones. When assumptions are implicit, they can’t be examined or challenged.

Make your models visible:

Document the assumptions underlying operational procedures
Expose the logic in spreadsheet forecasts
Articulate the mental models driving strategic decisions

Once explicit, models can be stress-tested: what if this assumption doesn’t hold? What’s our contingency?

2. Maintain Model Registries

Organizations should maintain inventories of their critical models:

What models exist
What decisions they inform
What assumptions they rest on
When they were last validated
Who owns them

This sounds bureaucratic, but it’s essential. You can’t manage model risk if you don’t know what models you’re relying on.
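A registry does not need special tooling to start. As a minimal sketch, the five items in the inventory above map directly onto a small record type. The class name, field names, example entries, and the one-year staleness rule are all hypothetical choices, not a standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ModelRecord:
    """One registry entry; the fields mirror the inventory list above."""
    name: str
    decisions_informed: list[str]
    assumptions: list[str]
    last_validated: date
    owner: str

    def is_stale(self, today: date, max_age_days: int = 365) -> bool:
        """True if the model has gone longer than max_age_days without validation."""
        return (today - self.last_validated).days > max_age_days


# Hypothetical entries for illustration only.
registry = [
    ModelRecord(
        name="call-center staffing model",
        decisions_informed=["weekly shift schedules"],
        assumptions=["call volume follows last year's seasonal pattern"],
        last_validated=date(2024, 6, 1),
        owner="operations",
    ),
    ModelRecord(
        name="inventory reorder model",
        decisions_informed=["reorder points"],
        assumptions=["demand is stable quarter to quarter"],
        last_validated=date(2025, 11, 15),
        owner="supply chain",
    ),
]

today = date(2026, 2, 10)  # fixed date so the example is reproducible
stale_models = [m.name for m in registry if m.is_stale(today)]
print("Needs revalidation:", stale_models)
```

Even a list this simple answers the question most organizations cannot: which models have not been checked against reality in over a year, and who is responsible for them.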

3. Build in Monitoring and Triggers

Models should be instrumented to detect when they’re failing. Establish leading indicators that your model might be wrong:

Forecast error exceeding thresholds
Parameter drift beyond expected ranges
Increasing variance in residuals
Assumption violations

When triggers fire, don’t dismiss them. Investigate whether the model needs revision.
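The first indicator on that list, forecast error exceeding a threshold, can be sketched as follows. The three-period window, the 15% threshold, and the sample numbers are assumptions for illustration; the right values depend on how noisy the underlying process is.

```python
def mean_absolute_percentage_error(forecasts, actuals):
    """Average of |forecast - actual| / |actual|, expressed as a percentage."""
    ratios = [abs(f - a) / abs(a) for f, a in zip(forecasts, actuals)]
    return sum(ratios) / len(ratios) * 100


def forecast_trigger_fired(forecasts, actuals, window=3, threshold_pct=15.0):
    """True if the error over the most recent `window` periods exceeds the threshold."""
    recent_error = mean_absolute_percentage_error(
        forecasts[-window:], actuals[-window:]
    )
    return recent_error > threshold_pct


# Hypothetical history: the forecast stays flat while actuals slide.
forecasts = [100, 100, 100, 100, 100, 100]
actuals = [98, 103, 95, 88, 80, 75]

print("Trigger after 3 periods:", forecast_trigger_fired(forecasts[:3], actuals[:3]))
print("Trigger after 6 periods:", forecast_trigger_fired(forecasts, actuals))
```

Checking only the recent window matters here. Averaged over the whole history, the early accurate periods would dilute the error and delay the warning.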

4. Create Adaptive Capacity

Systems that depend rigidly on a single model are fragile. Build in flexibility:

Slack resources: Don’t plan to 100% of forecast capacity
Decision checkpoints: Review and revise plans periodically, not just annually
Multiple scenarios: Plan for several futures, not just the most likely one
Fast feedback loops: Detect model failures quickly and respond

Deming’s Plan-Do-Study-Act cycle embodies this adaptive approach. You plan based on your current model, execute, study the results, and act to revise both the execution and the underlying model. The cycle never ends because models are never perfect.

5. Embrace Robust Decision-Making

Rather than optimizing for the most likely scenario (which is optimal only if your model turns out to be right), consider robust strategies that perform acceptably across many scenarios.

This might mean accepting lower expected returns in exchange for avoiding catastrophic losses under model failure. It means prioritizing resilience over optimization, adaptability over efficiency.
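One simple way to express this trade-off is to compare the strategy that wins under the most likely scenario with the one whose worst case is least bad (a maximin rule). The strategies, scenarios, and payoffs below are hypothetical, and maximin is only one of several robustness criteria.

```python
# Hypothetical payoffs for each strategy under each scenario.
PAYOFFS = {
    "lean": {"forecast holds": 100, "demand drops": -60, "demand spikes": -20},
    "buffered": {"forecast holds": 80, "demand drops": 10, "demand spikes": 50},
}
MOST_LIKELY_SCENARIO = "forecast holds"


def best_for_scenario(payoffs, scenario):
    """Optimize for a single scenario, as a point forecast implicitly does."""
    return max(payoffs, key=lambda strategy: payoffs[strategy][scenario])


def best_worst_case(payoffs):
    """Maximin: pick the strategy whose worst outcome is least bad."""
    return max(payoffs, key=lambda strategy: min(payoffs[strategy].values()))


optimized_choice = best_for_scenario(PAYOFFS, MOST_LIKELY_SCENARIO)
robust_choice = best_worst_case(PAYOFFS)

print("Optimized for the forecast:", optimized_choice)
print("Robust across scenarios:   ", robust_choice)
```

In this sketch the buffered strategy gives up 20 points if the forecast holds and avoids a 60-point loss if it does not. Whether that trade is worth making is a judgment call, but it can only be made once the alternative scenarios are written down.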

Theory of Constraints’ emphasis on protecting the constraint is a form of robust decision-making. Rather than optimizing every resource, you optimize the system’s throughput by ensuring the constraint is never starved—even if your models of demand and processing times are imperfect.

6. Maintain Epistemic Humility

The most important defense against model risk is intellectual: remember that all models are wrong; some are useful. Treat model outputs as inputs to thinking, not substitutes for thinking.

This means:

Being skeptical of precise predictions
Seeking disconfirming evidence for your models
Welcoming challenges to your assumptions
Updating models as evidence accumulates

This cultural stance is harder to build than technical safeguards but ultimately more important. Organizations that treat models as truth become brittle. Organizations that treat models as tools remain adaptive.

The Meta-Model Problem

There’s a recursive element to model risk: our models of model risk are themselves models that might be wrong. We might be monitoring the wrong indicators, stress-testing the wrong assumptions, building resilience against the wrong failure modes.

This isn’t cause for paralysis. It’s cause for humility and continuous learning. The goal is to reduce the probability and magnitude of model-driven failures while accepting that some will still occur.

Deming’s concept of profound knowledge encompassed this recursive awareness. Understanding a system requires understanding not just what you know but the limits of what you know, the boundaries of your models, and the uncertainty beyond them.

Conclusion: Maps and Territories

“The map is not the territory,” said Alfred Korzybski. Every model is a map—a simplified representation designed to help navigate complexity. Maps are indispensable, but they’re not the terrain itself.

Model risk emerges when we forget this distinction, when we navigate by the map so intently that we stop looking at the ground beneath our feet. We follow the forecast instead of watching actual demand. We execute the plan despite changing conditions. We trust the algorithm without questioning its outputs.

The solution isn’t to abandon models. It’s to hold them lightly, to treat them as provisional guides rather than absolute truth, to build systems that adapt when models fail rather than collapse.

Every forecast will be wrong. Every plan will meet unexpected conditions. Every mental model will eventually misalign with reality. The question isn’t whether your models will fail but whether your systems can survive and rebound when they do.

In the end, the most dangerous model might be the meta-model that says we can eliminate model risk. We can’t. We can only manage it through explicit articulation, continuous validation, adaptive systems, and the intellectual humility to recognize that our best understanding of the world is always incomplete.

The territory keeps changing. The map must change with it. And the system must function even when the map hasn’t caught up to the terrain.
