Feature Flags

A feature flag is a switch in the code that decides whether a particular piece of functionality is active at runtime. The most important thing it does is separate two activities that have historically been the same event: deploying code to production and exposing functionality to users.

Once the two are separated, several things become easier. Releases stop being binary, high-stakes events. Incomplete work can ship safely. A bad release can be recovered from without a redeploy. New code can be exposed to small audiences first. Many of the practices that high-performing teams rely on quietly depend on flags being a routine part of the codebase.[1]

What Flags Actually Do

A flag is just a conditional. The discipline is in what the conditional gates and how the team manages it over time.
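
To make "just a conditional" concrete, here is a minimal sketch in Python. The in-memory FLAGS dictionary, the is_enabled helper, and the new_checkout flag name are all hypothetical and exist only for illustration; a real system would usually read flag state from a flag service or configuration store.

```python
# Hypothetical in-memory flag store, used only for illustration.
FLAGS = {"new_checkout": False}


def is_enabled(name: str) -> bool:
    """Return the flag's current state; unknown flags default to off."""
    return FLAGS.get(name, False)


def checkout(cart_total: float) -> str:
    # The flag is nothing more than this branch. Both paths are deployed;
    # only one of them is active at runtime.
    if is_enabled("new_checkout"):
        return f"new flow: {cart_total:.2f}"
    return f"legacy flow: {cart_total:.2f}"
```

Flipping FLAGS["new_checkout"] to True changes behavior without a redeploy. Defaulting unknown flags to off is an assumption of this sketch, chosen so that a missing definition never exposes unfinished work.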

A few distinct kinds of flag, with different lifecycles:

  • Release flags. Hide incomplete work that is on trunk but not yet ready for users. These flags are short-lived: once the feature is shipped, the flag and the dormant branch of code are removed.
  • Operational flags. Allow on-call engineers to disable a misbehaving feature, throttle a problematic code path, or shed load under pressure. These flags are long-lived by design and are part of the system's operational toolkit.
  • Permission flags. Gate functionality by user, team, plan, or environment. Often integrated with the product's authorization system rather than implemented as ad-hoc flags.
  • Experiment flags. Drive A/B tests and other controlled experiments. Short-lived by intent; the flag exists for the duration of the experiment.

Mixing these together in the same codebase, with the same lifecycle, is one of the most common ways flag systems become hard to maintain.
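
One way to keep the lifecycles distinct is to record the kind of each flag in its definition and let the kind drive the rules. The sketch below is an illustration under assumptions, not a prescribed design: FlagKind, FlagDefinition, and the remove_by field are hypothetical names, and the rule that release and experiment flags must carry a removal date is one possible policy.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class FlagKind(Enum):
    RELEASE = "release"
    OPERATIONAL = "operational"
    PERMISSION = "permission"
    EXPERIMENT = "experiment"


# Kinds described above as short-lived: they must declare when they go away.
SHORT_LIVED = {FlagKind.RELEASE, FlagKind.EXPERIMENT}


@dataclass(frozen=True)
class FlagDefinition:
    """Hypothetical flag record; the field names are illustrative."""

    name: str
    kind: FlagKind
    owner: str
    purpose: str
    remove_by: Optional[date] = None  # None means long-lived by design

    def __post_init__(self) -> None:
        # Reject short-lived flags that have no planned removal date.
        if self.kind in SHORT_LIVED and self.remove_by is None:
            raise ValueError(
                f"{self.kind.value} flag {self.name!r} needs a remove_by date"
            )
```

With this in place, an operational kill switch can be declared without a removal date, while a release flag without one fails at definition time rather than lingering unnoticed.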

What Flags Make Possible

  • Trunk-based development. Incomplete work can land on trunk behind a disabled flag rather than living on a long-lived branch. The benefits of frequent integration become available without the risk of shipping partial features.[2]
  • Deploy on one schedule, release on another. Code can be deployed daily while features are released by date, by audience, by experiment, or by readiness. The two schedules can be decoupled because the activation point is independent of the deployment.
  • Safer rollouts. New functionality can be turned on for internal users first, then a small percentage of customers, then larger cohorts. Issues surface before they affect everyone.
  • Faster recovery. When a feature misbehaves in production, flipping a flag off is faster and less risky than rolling back a deployment. The bad code is still present; it is just inactive.
  • Real production testing. Some classes of bug only appear under real traffic. Flagged exposure to a small percentage of real users surfaces these without committing to a full rollout.
  • Controlled experimentation. A/B testing requires the ability to expose different users to different code paths at the same time. Flags are the substrate that makes this practical.
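
The staged rollout described above (internal users first, then a growing percentage of customers) is commonly built on deterministic bucketing, so that a given user gets the same answer on every request. The following is a sketch of that idea, not the implementation of any particular flag product; bucket, is_enabled_for, and the internal_users parameter are hypothetical names.

```python
import hashlib
from typing import AbstractSet


def bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled_for(
    flag_name: str,
    user_id: str,
    rollout_percent: int,
    internal_users: AbstractSet[str] = frozenset(),
) -> bool:
    # Internal users see the feature first, whatever the percentage is.
    if user_id in internal_users:
        return True
    # A user is in the rollout when their bucket falls below the threshold,
    # so raising the percentage only ever adds users and never removes them.
    return bucket(flag_name, user_id) < rollout_percent
```

Hashing the flag name together with the user ID means different flags pick different subsets of users, so the same early cohort does not absorb every new feature. The same bucketing can assign users to experiment arms.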

Common Anti-Patterns

  • Flag debt. Flags that should have been removed are left in the codebase indefinitely. Each one is a permanent piece of conditional complexity, and the codebase accumulates them faster than anyone removes them.
  • Mixing flag types in the same system. Treating a long-lived operational flag the same way as a short-lived release flag means neither lifecycle is managed well.
  • No ownership of removal. A flag is added when a feature ships and forgotten when the feature stabilizes. The fix is to make removal a tracked task, with an owner and a date.
  • Flags as configuration. A flag is meant to be flipped during the lifecycle of work. A configuration value is meant to be set and forgotten. Conflating them muddies both the code and the operational tooling.
  • Untested off-paths. The disabled branch of a flag is rarely exercised in test environments. Six months later, nobody is sure whether turning the flag off would actually work. Flags require periodic validation in both states.
  • Production-only flag systems. A flag that can only be flipped in production cannot be exercised in CI or staging. The team learns about the failure modes of the off state when a customer reports them.
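
Flag debt and unowned removal are easier to catch when removal dates are machine-readable. As a rough sketch, assuming each flag is tracked with an owner and an optional removal date (TrackedFlag and overdue_flags are hypothetical names), a scheduled job or CI step could list the flags that have outlived their plan:

```python
from datetime import date
from typing import Iterable, List, NamedTuple, Optional


class TrackedFlag(NamedTuple):
    """Hypothetical tracking record for a flag."""

    name: str
    owner: str
    remove_by: Optional[date]  # None marks a flag that is long-lived by design


def overdue_flags(flags: Iterable[TrackedFlag], today: date) -> List[TrackedFlag]:
    """Return flags whose removal date has passed, oldest deadline first."""
    late = [f for f in flags if f.remove_by is not None and f.remove_by < today]
    return sorted(late, key=lambda f: f.remove_by)


def report(flags: Iterable[TrackedFlag], today: date) -> List[str]:
    # One line per overdue flag, naming the owner who should remove it.
    return [
        f"{f.name}: owned by {f.owner}, due {f.remove_by.isoformat()}"
        for f in overdue_flags(flags, today)
    ]
```

Whether an overdue flag fails the build or only raises a ticket is a team decision; the sketch only produces the list.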

What This Looks Like in Practice

  • Track every flag's lifecycle. Each flag should have an owner, a purpose, and an expected lifespan. Long-running flags are explicitly long-running, not accidentally so.
  • Default to "flag and remove." Most release flags should be removed within weeks of full rollout. A team that does not routinely remove flags is a team that will eventually struggle to add new ones.
  • Test both states. CI should exercise the system with the flag both on and off, especially for release and operational flags.
  • Keep operational flags discoverable. On-call engineers should know which flags exist and what they do, ideally documented in the runbook for the system.
  • Use flags to enable trunk-based development, not to avoid the work of incremental design. A feature flag hiding a large incomplete change is not as good as the same feature broken into small changes that ship and integrate one at a time.
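
Testing both states can be as simple as running the same test once per flag value. The sketch below uses the standard library's unittest with subTest; render_price and the new_price_format flag are hypothetical, and passing flags in as a dictionary is an assumption made so that tests can set either state without touching a live flag system.

```python
import unittest
from typing import Mapping


def render_price(amount_cents: int, flags: Mapping[str, bool]) -> str:
    """Hypothetical flagged function: the new format sits behind a release flag."""
    if flags.get("new_price_format", False):
        return f"${amount_cents / 100:,.2f}"
    return f"{amount_cents} cents"


class PriceRenderingTest(unittest.TestCase):
    def test_both_flag_states(self) -> None:
        # Expected output for the flag on and for the flag off.
        expected = {True: "$12.50", False: "1250 cents"}
        for state, want in expected.items():
            # subTest reports each flag state separately if it fails.
            with self.subTest(new_price_format=state):
                got = render_price(1250, {"new_price_format": state})
                self.assertEqual(got, want)
```

Running the module with python -m unittest exercises both branches on every build, so the off path stays verified for as long as the flag exists.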

Important distinction

Deploying software is not the same thing as releasing features. Flags are how the two get separated. A team that has not internalized this distinction takes on release risk every time it deploys.

See also: Trunk-Based Development, Deployment Strategies, Fix Forward, Expand-Contract Migrations, A/B Testing.



  1. Pete Hodgson, Feature Toggles (aka Feature Flags) (Martin Fowler's blog, 2017). The most thorough public taxonomy of feature flags, including the distinction between release, operational, permission, and experiment toggles, and the operational practices that keep flag systems maintainable: https://martinfowler.com/articles/feature-toggles.html 

  2. See Trunk-Based Development for the longer treatment of how flags enable safe integration of incomplete work.