Glossary

A working vocabulary for the concepts used throughout the guide. Where a term is treated at length in a specific chapter, the entry links to it.

A/B Testing

A controlled experiment in which different users see different versions of a product, and the team measures which version produces better outcomes. Most useful for narrowly scoped, high-traffic decisions; the wrong tool for strategic or low-volume questions. See A/B Testing.

Accessibility

The property of a system that determines whether it can be used by people with different abilities, devices, environments, and assistive technologies. Not a feature added on top of an interface; a quality of every design decision.

API

Application Programming Interface. The defined surface through which one piece of software communicates with another. API design is a long-lived architectural commitment, since clients depend on the interface and not on the implementation.

Architecture

The set of design decisions that determine how expensive future change becomes. Architecture is a business concern because change is a business concern. See Architecture Is About Change.

Architecture Decision Record (ADR)

A lightweight document capturing an architectural decision, the alternatives considered, and the reasons for the choice. ADRs make decisions revisitable later by someone who was not in the room when they were made.

Backfill

The phase of a schema migration in which existing data is moved to a new shape, usually running in the background without blocking writes. The third step of an Expand-Contract Migration.

Blameless Postmortem

An incident investigation that looks for what about the system allowed the incident to occur, not who to blame. The discipline that turns incidents from costs into investments by surfacing structural causes that "be more careful" cannot fix. See Incident Response.

Blue-Green Deployment

A deployment pattern in which two full copies of the production environment exist; new code is deployed to the inactive copy, and traffic is switched to it once verified. Rollback is a traffic switch rather than a redeploy.
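
A minimal sketch of the switch, assuming a toy in-memory router (the Router class and the "blue"/"green" names are illustrative, not any real load balancer's API). The point it tries to show is that cutover and rollback are the same cheap operation.

```python
class Router:
    """Toy traffic router for two environments (hypothetical, for illustration)."""

    def __init__(self):
        self.versions = {"blue": "v1", "green": "v1"}
        self.active = "blue"

    def inactive(self):
        return "green" if self.active == "blue" else "blue"

    def deploy(self, version):
        # New code always goes to the copy that is not serving traffic.
        self.versions[self.inactive()] = version

    def switch(self):
        # Cutover and rollback are the same operation: flip the pointer.
        self.active = self.inactive()

    def serve(self):
        return self.versions[self.active]


router = Router()
router.deploy("v2")              # green now runs v2; users still see v1
before_switch = router.serve()
router.switch()                  # cutover
after_switch = router.serve()
router.switch()                  # rollback, with no redeploy
after_rollback = router.serve()
```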

Bounded Context

A region of a system in which a term has a specific, agreed meaning that may differ from the same term elsewhere in the business. The discipline of bounded contexts lets "customer" mean one thing in billing and another in support, as long as the boundary is explicit. See Shared Language and Operational Clarity.

Build Trap

The assumption that building software is the same thing as creating value. Teams in the build trap measure progress by features shipped rather than by customer outcomes. The term comes from Melissa Perri. See The Build Trap.

Canary Release

A deployment pattern in which a small percentage of traffic is routed to the new version while the rest stays on the old, allowing the team to detect problems before they affect everyone.
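
One common way to pick the "small percentage" is to hash a stable identifier into buckets, so the same user consistently lands on the same version. The sketch below assumes a string user ID and a 100-bucket split; the function name and the 5% figure are illustrative.

```python
import hashlib


def routes_to_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < canary_percent


# Hypothetical population, used only to show the split is roughly 5%.
users = [f"user-{n}" for n in range(10_000)]
canary_share = sum(routes_to_canary(u, 5) for u in users) / len(users)
```

Because the bucket depends only on the ID, raising the percentage keeps everyone who was already on the canary there and adds new users on top.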

CI/CD

Continuous Integration (every change is automatically built and tested against trunk on its way in) plus Continuous Delivery (every change is built, tested, and ready to be deployed at any time). Continuous Deployment is a further step in which every change that passes the pipeline is automatically deployed. See Deployment Strategies.

Conway's Law

Melvin Conway's 1968 observation that "any organization that designs a system will produce a design whose structure is a copy of the organization's communication structure." Communication structure shapes system structure, whether the team wants it to or not.

Cost of Delay

The economic cost of not having something available, expressed per unit time. The defining unit of analysis in Reinertsen's product development flow framework; the quantity most organizations should be measuring and almost none are.

Distributed Monolith

A system technically composed of services that nonetheless share a database, deploy together, and cannot be released independently. All the costs of distribution; none of the benefits. The most expensive of the common microservice failure modes.

DORA Metrics

The four key metrics of software engineering performance identified by Google's DORA program: deploy frequency, lead time for changes, mean time to restore service, and change failure rate. Strong empirical predictors of organizational performance, not just engineering output.
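
As a rough illustration, the four metrics can be derived from a deployment log. The record fields, the seven-day window, and the choice of the median for lead time are all assumptions of this sketch, not part of the DORA definitions.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical deployment log; field names are assumptions for this sketch.
deploys = [
    {"committed": datetime(2024, 1, 1, 9), "deployed": datetime(2024, 1, 1, 13),
     "failed": False, "restored": None},
    {"committed": datetime(2024, 1, 2, 10), "deployed": datetime(2024, 1, 3, 10),
     "failed": True, "restored": datetime(2024, 1, 3, 11)},
    {"committed": datetime(2024, 1, 4, 9), "deployed": datetime(2024, 1, 4, 11),
     "failed": False, "restored": None},
    {"committed": datetime(2024, 1, 5, 9), "deployed": datetime(2024, 1, 5, 15),
     "failed": False, "restored": None},
]
window_days = 7

deploy_frequency = len(deploys) / window_days            # deploys per day
lead_time = median(d["deployed"] - d["committed"] for d in deploys)
failures = [d for d in deploys if d["failed"]]
change_failure_rate = len(failures) / len(deploys)
# Assumes at least one failure in the window; guard the division otherwise.
mean_time_to_restore = sum(
    (d["restored"] - d["deployed"] for d in failures), timedelta()
) / len(failures)
```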

Error Budget

A formal allowance for unreliability: the amount of failure a system's service-level objective permits (for a 99.9% objective, the remaining 0.1%) before a reliability response is triggered. From the SRE tradition. Implies that 100% is the wrong reliability target.
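
The arithmetic is short enough to sketch. The example assumes a time-based availability objective over a 30-day window; the function name and the 12-minute outage are illustrative.

```python
def error_budget_minutes(slo_target: float, window_days: int) -> float:
    """Minutes of allowed unavailability for an availability SLO over a window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


budget = error_budget_minutes(0.999, 30)   # roughly 43.2 minutes
remaining = budget - 12.0                  # after a hypothetical 12-minute outage
no_budget = error_budget_minutes(1.0, 30)  # a 100% target leaves nothing to spend
```

A 100% target yields a budget of zero, which is one way to see why it leaves no room for deploys, experiments, or ordinary failure.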

Estimate

A forecast of how long a piece of work will take or how much it will cost, with uncertainty proportional to how much the team currently understands about the work. Distinct from a commitment, which is a promise. Treating the two as interchangeable is one of the most common sources of software-organization dysfunction. See Estimates Are Forecasts.

Expand-Contract Migration

The standard pattern for changing a schema (or any long-lived shared state) safely: expand the schema in a backward-compatible way, deploy code that handles both shapes, backfill data, then contract by removing the old shape. Also called parallel change. See Expand-Contract Migrations.
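
The four steps can be sketched against an in-memory "table" of dictionaries. The rename from name to full_name and the helper functions are hypothetical; a real migration would run the same sequence as separate deploys against a database.

```python
# Hypothetical change: replace the "name" field with "full_name".
rows = [{"id": 1, "name": "Ada Lovelace"}, {"id": 2, "name": "Alan Turing"}]

# 1. Expand: add the new field without touching the old one.
for row in rows:
    row.setdefault("full_name", None)


# 2. Deploy code that handles both shapes.
def write_name(row, value):
    row["name"] = value        # old readers keep working
    row["full_name"] = value   # new readers work too


def read_name(row):
    return row["full_name"] if row["full_name"] is not None else row["name"]


write_name(rows[0], "Ada King")                      # a live write mid-migration
names_mid_migration = [read_name(r) for r in rows]

# 3. Backfill: copy existing data into the new shape.
for row in rows:
    if row["full_name"] is None:
        row["full_name"] = row["name"]

# 4. Contract: once deployed code only uses full_name, remove the old field.
for row in rows:
    del row["name"]
```

At every intermediate step both old and new code can read and write, which is what makes each step independently deployable and reversible.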

Feature Flag

A switch in the code that decides whether a particular piece of functionality is active at runtime. Separates code deployment from feature release. Comes in distinct flavors (release, operational, permission, experiment) with different lifecycles. See Feature Flags.
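
A minimal sketch of a release flag, assuming an in-memory flag store (FLAGS, is_enabled, and the new_checkout flag are illustrative names, not a particular flag library's API):

```python
# Hypothetical in-memory flag store. Real systems usually load flags from
# configuration or a flag service so they can change without a deploy.
FLAGS = {"new_checkout": False}


def is_enabled(flag: str) -> bool:
    return FLAGS.get(flag, False)   # unknown flags default to off


def checkout(cart_total: float) -> str:
    if is_enabled("new_checkout"):
        return f"new flow: {cart_total:.2f}"
    return f"old flow: {cart_total:.2f}"


deployed_but_dark = checkout(10)    # new code is in production, not released
FLAGS["new_checkout"] = True        # release: flip the flag, no deploy
released = checkout(10)
```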

Fitness Function

An automated check that verifies an architectural characteristic holds (deployability, observability, response time, modularity, and so on). The mechanism by which evolutionary architecture keeps architectural commitments verifiable rather than aspirational.
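
For example, a modularity fitness function might fail the build when inner-layer code starts depending on outer-layer code. The sketch assumes an import graph has already been extracted; the module names and the layering rule are hypothetical.

```python
# Hypothetical import graph: module -> modules it imports.
IMPORTS = {
    "domain.orders": ["domain.pricing"],
    "domain.pricing": [],
    "infrastructure.db": ["domain.orders"],
}


def layering_violations(imports, inner="domain.", outer="infrastructure."):
    """Return (module, dependency) pairs where inner code depends on outer code."""
    return [
        (module, dep)
        for module, deps in imports.items()
        if module.startswith(inner)
        for dep in deps
        if dep.startswith(outer)
    ]


violations = layering_violations(IMPORTS)
# Run in CI, this assertion is what keeps the rule from being aspirational.
assert violations == [], f"domain must not import infrastructure: {violations}"
```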

Fix Forward

Recovering from a bad deploy by shipping a corrective change rather than rolling back. The default in mature continuous-delivery environments, with rollback reserved for emergencies. See Fix Forward.

High Cardinality

Data with many distinct values per dimension (user IDs, request IDs, full URLs). High-cardinality data is what distinguishes observability from monitoring: it lets the team ask questions they did not pre-define.

Hypothesis-Driven Development

A way of framing product work in which each significant change is treated as a hypothesis to test rather than a feature to deliver. Most directly associated with Ries's The Lean Startup. See The Build Trap.

Idempotent

A property of an operation whose effect is the same whether it is performed once or many times. Critical for safe retries during deployments and outages.
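
A small contrast may help. The sketch below assumes the caller attaches a unique request ID to each logical operation; the account structure and function names are illustrative.

```python
def add_credit(account: dict, amount: int) -> None:
    """Not idempotent: every retry changes the balance again."""
    account["balance"] += amount


def apply_credit(account: dict, request_id: str, amount: int) -> None:
    """Idempotent: a repeated request_id is recognized and ignored."""
    if request_id in account["applied"]:
        return
    account["balance"] += amount
    account["applied"].add(request_id)


naive = {"balance": 0}
add_credit(naive, 50)
add_credit(naive, 50)                  # a retry double-credits

safe = {"balance": 0, "applied": set()}
apply_credit(safe, "req-1", 50)
apply_credit(safe, "req-1", 50)        # a retry is a no-op
```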

Incident

An event that exceeds a team's threshold for declared response: customer impact, duration, or severity beyond a defined bar. See Incident Response.

Information Architecture

The structural organization of concepts, screens, and navigation in a system. Part of UX, distinct from visual design.

Lead Time

The elapsed time from a code change being committed to that change running in production. One of the four DORA metrics, and a primary indicator of an engineering organization's responsiveness.

Legacy Code

In Michael Feathers's definition: code without tests. The lack of tests is what makes code legacy, regardless of how recently it was written, because untested code cannot be safely changed.

Lock-in

The cost of leaving a vendor, technology, or architectural choice. Most lock-in is invisible at the moment of adoption and painful at the moment of exit.

Maintainability

The property of a system that determines how cheaply it can be changed once built. A cluster of related characteristics (readability, modularity, testability, observability, operational simplicity) that together determine whether the system can absorb new requirements gracefully. See Maintainability.

Mean Time to Recovery (MTTR)

The average elapsed time from incident detection to incident resolution. One of the DORA metrics; in mature operations, MTTR matters more than incident frequency.

Microservice

A component of a system deployed and operated independently, with its own data store and release cycle. Solves specific problems of scale, team autonomy, and operational independence at the cost of distributed-system complexity. See Monolith First.

Minimum Viable Product (MVP)

The smallest version of a product that can be released to learn whether the underlying hypothesis is right. Often misunderstood as "the cheapest thing we can ship"; correctly understood as "the smallest experiment that produces real evidence."

Monolith

A software system deployed as a single application. Often the right architecture for an early-stage product, despite the term's current pejorative use. See Monolith First.

Observability

The property of a system that allows the team to investigate failure modes they did not anticipate. Goes beyond logs, metrics, and traces; characterized by the cardinality of the data the team can query against. See Observability.

One Metric That Matters (OMTM)

The single number that best captures whether a product is succeeding at its current stage of life. From Croll and Yoskovitz's Lean Analytics. Most organizations should measure fewer things more rigorously rather than measuring everything badly.

Opportunity Cost

What the team is not doing while they are doing something else. Almost never tracked as a line item in any budget, and routinely the largest cost of any decision.

Pair Programming

The practice of two engineers working on the same problem at the same workstation, one driving and one navigating, swapping regularly. Folds design, code, and review into a single activity rather than separate phases. See Collaborative Engineering.

Parallel Change

See Expand-Contract Migration.

Postmortem

A structured review of an incident, intended to surface what about the system allowed the incident and what should change. Most useful when conducted blamelessly. See Incident Response.

Product Owner

The function (not necessarily the title) responsible for connecting the people who define value to the people who implement it. Owns priorities, accepts work, surfaces trade-offs, and protects the product's direction. See Product Ownership.

Pull Request

A proposed code change submitted for review before being integrated into the trunk. The unit of conversation in most modern engineering workflows.

Refactoring

A behavior-preserving transformation of code that improves its structure without changing what it does. Most effective when done continuously as part of normal work, not as a separate cleanup project. Fowler's Refactoring is the canonical reference.

Release vs. Deployment

Deployment is the act of putting code into production. Release is the act of exposing functionality to users. Feature flags are how the two get separated. A team that conflates them is paying release risk on every deploy.

Resilience Engineering

The discipline of treating reliability as a property of how systems and the humans inside them adapt to failure, rather than as a property of pre-engineered correctness. Argues that complex systems run as constantly broken systems whose ongoing operation requires continuous human adaptation. Most associated with Richard Cook's How Complex Systems Fail and John Allspaw's work.

Rollback

Returning a system to a previous known-good state by undoing a deployment. In modern continuous-delivery environments, rollback is the emergency tool; the default recovery option is Fix Forward.

Rolling Deployment

A deployment pattern in which new code is rolled out to instances of the service one at a time or in small batches, with health checks between each step. The cost is that the system runs with mixed old and new code during the rollout, which constrains how the application has to be written.
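
A sketch of the loop, assuming instances are plain dictionaries and the health check is a caller-supplied function (both hypothetical). It halts at the first unhealthy instance, which leaves the fleet in exactly the mixed state described above.

```python
def rolling_deploy(instances, new_version, is_healthy):
    """Update instances one at a time; stop at the first failed health check."""
    for instance in instances:
        previous = instance["version"]
        instance["version"] = new_version
        if not is_healthy(instance):
            instance["version"] = previous   # restore this instance and halt
            return False
    return True


fleet = [{"name": f"web-{n}", "version": "v1"} for n in range(3)]
# Hypothetical health check that fails on the second instance.
ok = rolling_deploy(fleet, "v2", is_healthy=lambda inst: inst["name"] != "web-1")
versions = [inst["version"] for inst in fleet]
```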

SaaS

Software as a Service. A product accessed over a network, typically subscribed to rather than purchased, with the vendor handling operations. One of the main archetypes in build-vs-buy decisions. See Owning vs Licensing.

Service-Level Objective (SLO) / Indicator (SLI) / Agreement (SLA)

A Service-Level Indicator is a measurement of a system's behavior (e.g., latency, error rate). A Service-Level Objective is the target the team commits to for that indicator. A Service-Level Agreement is the contractual commitment to a customer, usually set looser than the SLO so that the internal target is missed before the contractual one. From the SRE tradition.
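
The relationship between the three can be sketched with a request-based availability indicator. The targets and request counts below are made-up numbers for illustration.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests served successfully; 1.0 when there was no traffic."""
    return good_requests / total_requests if total_requests else 1.0


SLO = 0.999   # internal target (hypothetical)
SLA = 0.995   # contractual promise (hypothetical), looser than the SLO

sli = availability_sli(good_requests=99_850, total_requests=100_000)
meets_slo = sli >= SLO   # False: the team should respond
meets_sla = sli >= SLA   # True: no contractual breach yet
```

The gap between the two thresholds is the warning margin: the team learns it is off target while the customer commitment still holds.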

Shadow Traffic

A deployment validation technique in which new code receives a copy of production traffic but does not affect user-facing responses. Verifies behavior under real load without exposing any user to the new code.
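
A minimal sketch, assuming both versions are callable handlers and that comparing responses for equality is meaningful. The handler names and pricing logic are hypothetical, and real setups typically run the shadow call asynchronously so it cannot add latency.

```python
def handle_with_shadow(request, primary, shadow, mismatches):
    """Serve from primary; run shadow on a copy and record any differences."""
    response = primary(request)
    try:
        shadow_response = shadow(dict(request))   # copy: shadow cannot mutate the original
        if shadow_response != response:
            mismatches.append((request, response, shadow_response))
    except Exception as exc:                      # shadow failures never reach the user
        mismatches.append((request, response, exc))
    return response


def old_price(req):
    return req["qty"] * 10


def new_price(req):
    # Hypothetical new behavior: a bulk discount from five items up.
    return req["qty"] * 10 if req["qty"] < 5 else req["qty"] * 9


mismatches = []
served = [handle_with_shadow({"qty": q}, old_price, new_price, mismatches) for q in (1, 5)]
```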

SOC 2

A widely adopted compliance framework for service organizations, focused on how customer data is handled. One of the more common compliance regimes a software product will encounter; an ongoing operational obligation, not a one-time audit.

Sunk Cost

Money already spent. Not a valid input to a decision about future investment. The relevant question is always forward-looking: given what we know now, what is the best use of the next dollar?

Technical Debt

A widely misused metaphor for a category of trade-off in code. Better treated as a portfolio of different problems (deliberate strategy, natural entropy, organizational dysfunction) with different appropriate responses, rather than as a single quantity that can be "paid off." See Technical Debt.

Theory of Constraints

Eliyahu Goldratt's framework for thinking about flow in any system: the throughput of the whole is determined by its bottleneck, and improvements to anything other than the bottleneck do not improve the system. The analytical foundation underneath much of modern operations thinking.
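
The bottleneck claim can be checked with a toy model, assuming a strictly serial pipeline in which each stage has a fixed weekly capacity (the stage names and numbers are made up):

```python
def throughput(stage_capacity: dict) -> int:
    """A serial pipeline moves only as fast as its slowest stage."""
    return min(stage_capacity.values())


stages = {"design": 12, "build": 10, "review": 4, "deploy": 20}  # items per week
baseline = throughput(stages)
faster_build = throughput({**stages, "build": 30})    # not the bottleneck
faster_review = throughput({**stages, "review": 8})   # the bottleneck
```

Tripling build capacity changes nothing, while doubling review capacity doubles output.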

Three Ways

The three principles articulated in The Phoenix Project for adapting lean thinking to software operations: Flow (improve throughput from idea to production), Feedback (amplify signal from downstream to upstream), and Continuous Learning and Experimentation. Each Way is necessary; none is sufficient alone.

Total Cost of Ownership (TCO)

The full cost of operating a system over its useful life, including implementation, operation, maintenance, security, compliance, support, and eventual decommissioning. The decision-relevant number for any significant software choice; almost always much larger than the build estimate suggests. See Total Cost of Ownership.

Trunk-Based Development

A workflow in which all engineers integrate their work into a single shared branch many times a day, with short-lived branches (if any) and feature flags for incomplete work. The empirical evidence for trunk-based development as a high-performance practice is among the strongest in software engineering. See Trunk-Based Development.

Type 1 / Type 2 Decisions

Jeff Bezos's distinction between irreversible "one-way door" decisions (Type 1) and reversible "two-way door" decisions (Type 2). Most organizations apply Type 1 deliberation to Type 2 problems and pay the cost in speed of learning.

Ubiquitous Language

A single shared vocabulary used by domain experts, stakeholders, and software teams, consistent across conversation, documentation, and code. From Eric Evans's Domain-Driven Design. See Shared Language and Operational Clarity.

User Experience (UX)

The functional design of how a user accomplishes work with a product: structure, sequence, language, defaults, and feedback, in addition to the visual layer. Not decoration. See UI/UX Is Not Decoration.

Validated Learning

A unit of progress in hypothesis-driven product work: an experiment that produced evidence the team did not have before. From Ries's The Lean Startup. The argument is that learning, not shipping, is the actual product of early-stage development.

Westrum Typology

Ron Westrum's three-type classification of organizational cultures based on information flow: pathological (information is hoarded), bureaucratic (information follows rules), and generative (information flows where it is needed). Accelerate found that generative cultures are also high-performance engineering cultures.

Work in Progress (WIP)

Work that has been started but not finished. Excess WIP is one of the most common causes of slow flow in knowledge work. Visualizing WIP and setting explicit limits on it is one of the core practices from the lean and kanban traditions.

Yagni

"You Aren't Gonna Need It." Martin Fowler's articulation of the principle that building flexibility for needs that never materialize is consistently expensive. The decision to not build something is often the higher-leverage choice.