Estimates Are Forecasts¶

Estimates are one of the most contested artifacts in software engineering. Stakeholders treat them as commitments; engineers treat them as guesses; neither side is wrong about what they meant. The resulting friction (broken trust, padded numbers, slipped dates, and accusations on both sides) is one of the most reliable sources of dysfunction in software organizations.

The framing that holds up best is that estimates are forecasts: probabilistic predictions about how long something will take or what something will cost, with uncertainty proportional to how much the team knows about the work.¹ They are no more reliable than weather forecasts, and treating them otherwise produces predictably bad decisions on every side.

Why Software Estimation Is Hard¶

The intuitive complaint, that engineers should "just be better at estimating," misunderstands the difficulty. Estimating software is structurally harder than estimating most other kinds of work, for reasons that no amount of skill eliminates:

The work has never been done before. Every non-trivial software task contains some amount of discovery. If the team had done this exact thing before, they would not be writing new code; they would be running the old one. The novelty is the work.²
The problem is being understood while it is being solved. A task that takes two hours of coding can require two weeks of clarifying what was actually being asked for. The clarification is often invisible at estimation time.
Communication and coordination costs grow non-linearly. Brooks's Law: adding people to a late project makes it later, because the cost of keeping everyone synchronized outpaces the additional output.²
Dependencies hide. A change that should be small turns out to require updates in three other systems, owned by three other teams, with three other release cycles. The dependency tree is rarely visible up front.
Compounding small mistakes. Each subtask is estimated optimistically (a few percent low); the project is the sum of many subtasks; the underestimates compound.³
Hofstadter's Law. "It always takes longer than you expect, even when you take into account Hofstadter's Law."

These are properties of the work, not failures of the estimator. An estimation discipline that pretends they can be eliminated through more rigor is an estimation discipline that produces overconfidence.

What Estimates Are Actually For¶

Despite the difficulty, estimates serve real purposes when used honestly:

Funding decisions. A rough sense of whether something will take a week or a year is necessary to decide whether to fund it at all. Refusing to estimate at this level is refusing to make the decision.
Sequencing. When two pieces of work compete for the same team, an estimate (even a crude one) helps decide which to do first. Comparative estimates ("about twice as much work as the last one") are usually more useful than absolute estimates.
Scope-shaping. "This will take six months as scoped, or six weeks if we drop these three things." Estimates that surface the trade-offs let stakeholders make informed choices about scope.
Risk surfacing. A range estimate ("two to twelve weeks") is honest signal about uncertainty. The width of the range tells the team what they need to learn before the estimate can narrow.

What estimates are not for:

Performance evaluation. Punishing engineers for missed estimates trains them to pad estimates. Padding produces stable estimates that are systematically wrong in the safe direction.
Commitments to external parties. An estimate is a forecast; a commitment is a promise. Treating an estimate as a commitment converts honest signal into a political artifact.
Substitutes for discovery. A precise estimate produced before the work is understood is a guess dressed up in numbers. The number does not become more accurate by being written down.

The Estimation Toolkit¶

Several techniques recur in practice. None is universally right; each has appropriate uses.

T-shirt sizing. Crude buckets (S, M, L, XL). Cheap to produce, useful for relative comparison early in a project. Loses precision but loses honestly, which is sometimes the point.
Story points. Numerical (often Fibonacci) estimates of relative effort, used within a team to forecast throughput. Most useful as the team accumulates history; works poorly as a cross-team comparison or as a productivity metric.⁴
Three-point estimates. Best-case, expected, and worst-case durations, sometimes combined as a weighted average (the PERT formula). Forces the team to articulate uncertainty rather than hide it in a single number.
Reference-class forecasting. Compare the work to historical projects with similar shape, and use the actual outcomes of those projects as the base for this one. Often surprisingly accurate, because it incorporates the outside view that the team's optimistic inside view misses.³
Forecasting from cycle time. Once a team has shipped enough work, the distribution of how long similar work has historically taken is itself a forecast. Often more reliable than the team's intuitive estimates, especially for steady-state work.⁶
#NoEstimates. A loosely-defined movement that argues many of the activities estimates are used for can be replaced with smaller batch sizes, faster delivery, and forecasting from cycle time, removing the estimation step entirely.⁵ Useful as a corrective to overinvestment in estimation; less useful as a universal prescription.

A useful target: pick the cheapest estimation technique that produces a number you would actually act on. Anything more precise is overinvestment; anything less is a guess.

Common Failure Modes¶

Estimates-as-deadlines. A team estimates eight weeks; leadership reports it as "shipping in eight weeks"; engineering is held to the date. The estimate's uncertainty was discarded in the handoff.
Padding. Engineers, having been burned by being held to estimates, add buffer. The buffer is honest protection in an unsafe culture, and a defect in a healthy one. Both versions exist.
Anchoring. The first number mentioned in the conversation becomes the default for everyone. Stakeholders who hear "six weeks" remember six weeks even when the team later said "six to twelve."
Estimation theatre. A multi-day planning meeting produces detailed estimates that are no more accurate than a five-minute conversation would have produced. The team has paid for precision that the underlying uncertainty cannot support.
The fixed-price, fixed-scope, fixed-deadline contract. A contract that fixes all three is a contract that will be violated. The honest contract fixes two and treats the third as the variable that absorbs reality.⁷
Optimism on novel work. Teams systematically underestimate work they have not done before, because the unknowns are unknown. The fix is to add explicit risk premium for novelty, not to estimate "better."
Ignoring the maintenance tail. Estimates count the work to ship and stop. The work to operate, maintain, and evolve the system continues for years; pretending otherwise is the source of much "we built it cheap" mythology.⁸

What This Looks Like in Practice¶

A few habits keep estimation honest:

Estimate in ranges, not single numbers. "Two to four weeks" is more honest than "three weeks" and forces the conversation about uncertainty into the open.
State explicit assumptions. What did the estimate depend on? Which assumptions, if wrong, would change the answer? An estimate without assumptions is an estimate that cannot be revised when new information arrives.
Re-estimate as you learn. An estimate from six weeks ago, before the team had built anything, is not the same artifact as an estimate from today. Updating estimates is signal, not failure.
Track estimate accuracy, not estimate adherence. A team that consistently overshoots by forty percent has useful information; a team that always "hits the estimate" has probably trained itself to pad.
Distinguish forecasts from commitments at the point of communication. When a number leaves the team, the team should be able to say whether it is a forecast (uncertain) or a commitment (promise). Both are valid; conflating them is the problem.
Prefer fixing scope over fixing time. When a deadline is hard, the lever the team has is what to ship by that date. Pretending the scope is also fixed is how teams ship late and incomplete instead of on-time and focused.

Key principle

Estimates are forecasts, not commitments. The number is a probabilistic statement about how long something will likely take, with uncertainty proportional to how much the team currently understands about the work. A team that treats estimates as commitments is a team that will be punished for the uncertainty inherent in software work, and will respond by hiding that uncertainty rather than addressing it.

Steve McConnell, Software Estimation: Demystifying the Black Art (Microsoft Press, 2006). The most thorough single-volume treatment of software estimation as a discipline, including the empirical research on estimation accuracy, the "cone of uncertainty," and the systematic biases that distinguish good estimators from bad ones. The book argues, with data, that experienced estimators are routinely off by 4x even on familiar work, and that no amount of additional planning closes the gap further than the underlying uncertainty allows. ↩
Frederick P. Brooks Jr., The Mythical Man-Month: Essays on Software Engineering (Addison-Wesley, 1975; anniversary edition 1995). The originating analysis of why adding people to a late project makes it later, and why software estimation is structurally harder than the manufacturing-derived intuitions most stakeholders apply to it. ↩↩
Daniel Kahneman, Thinking, Fast and Slow (Farrar, Straus and Giroux, 2011). On the planning fallacy, the inside-vs-outside view distinction, and reference-class forecasting as a corrective. Kahneman's earlier work with Amos Tversky on the planning fallacy is the empirical foundation for why estimators consistently underestimate even when they know they will. ↩↩
Mike Cohn, Agile Estimating and Planning (Prentice Hall, 2005). The standard practitioner reference for story points, planning poker, and the discipline of treating estimates as relative judgments rather than absolute predictions. ↩
Vasco Duarte, No Estimates: How to Measure Project Progress Without Estimating (2015), and the broader #NoEstimates conversation. Argues that small batch sizes and forecasting from historical cycle time can replace much of the activity teams use estimation for, and that estimation as currently practiced is often net-negative. Useful as a corrective; controversial as a universal prescription. ↩
Joel Spolsky, Evidence Based Scheduling (2007): https://www.joelonsoftware.com/2007/10/26/evidence-based-scheduling/. A specific methodology for converting per-developer historical estimate-versus-actual ratios into Monte Carlo probability distributions over ship dates. The piece's contribution is operational: it shows what cycle-time forecasting looks like as a working practice rather than a principle, including the discipline of breaking work into small enough units (under 16 hours) for the statistics to mean anything. ↩
See Ethics of Spending Client Money on fixed-price contracts. ↩
See Total Cost of Ownership. ↩