AI as Force Multiplier

Generative AI is the largest change to engineering work in a generation. It is also one of the most poorly handled cultural transitions in software, with organizations swinging between "AI will replace engineers" and "AI cannot be trusted with anything" without spending much time on the questions that actually matter: what is AI good at, what is it bad at, and how should an engineering team integrate it without losing the judgment that makes engineering valuable in the first place?

The framing that holds up best is the one in the chapter title: AI as a force multiplier. It is a tool that amplifies the skill and judgment of the person using it. It does not generate skill or judgment that was not already in the room. The implications of taking this framing seriously are not what either of the loudest factions assumes.

A caveat on this chapter: the AI tooling landscape is changing faster than any field guide can keep up with. Specific tools, capabilities, and benchmarks cited here may be stale by the time you read them. The underlying patterns about how amplification works, and how to keep human judgment in the loop, have proved more durable.²

The Amplification Pattern

Force multipliers, by definition, multiply. A capable engineer with strong product judgment and a good model of the system can use AI to ship more, faster, and with better quality than they could without it. The same engineer can use AI to produce a great deal of plausible-looking code that they no longer understand, and that is more expensive to debug than if they had written it themselves.

The factor of multiplication can be positive or negative. It depends almost entirely on the underlying skill and discipline of the person using the tool.

This has practical implications:

AI rewards expertise more than it democratizes it. The engineers who get the most leverage from AI are the ones who already had strong judgment about design, testing, debugging, and code review. They use AI to compress the time it takes to do work they could already do.
AI can mask absence of skill in the short term. Code that looks correct is easy to produce. Code that is correct in the context of an existing system, handles edge cases the model did not see, and fits the design of the surrounding codebase, is much harder. The visible gap between these two used to take experience to bridge. AI narrows the visible gap and widens the actual one.
The expensive parts of engineering are not the parts AI is best at. Typing the code is rarely the bottleneck. Knowing what code to type, recognizing that the code is wrong, and predicting how it will behave in production are the bottlenecks. These remain expensive.

Where AI Genuinely Helps

A few categories of work consistently benefit from AI integration:

Scaffolding and boilerplate. The first draft of a routine handler, a test file's structure, a typical configuration. Work where the structure is well known and the cost is mostly typing.
Translation and reformatting. Converting one data format to another, rewriting code in a different style, translating between two equivalent ways of expressing the same thing.
Exploration and idea generation. Brainstorming approaches, listing options, surfacing concepts the engineer might not have known to search for. Use as a thought partner, not as an authority.
Documentation and explanation. Drafting docs from code, summarizing a long thread, producing first-pass code comments. The team still needs to review for accuracy; the draft saves the labor of the blank page.
Code review support. A second pair of eyes (or sometimes the only pair, on a small team) that catches the class of mistakes that fresh attention surfaces.
Test ideation. Listing edge cases the engineer might not have considered. The tests themselves still need human judgment about which ones are actually worth running.
Repetitive transformations. Renaming across a codebase, applying a consistent change across many files, mechanical refactoring where the rule is clear but the labor is large.

In all of these, AI does work that is real but not the most cognitively expensive part of engineering. The engineer remains responsible for design, judgment, and verification.³

Where AI Quietly Hurts

The failure modes are subtler than the successes, which is part of what makes them dangerous:

Confident wrongness. AI tools generate plausible output regardless of whether the underlying claim is correct. The output is fluent, well-formatted, and frequently wrong in ways that take real expertise to detect.
Context collapse. AI works from a window of context it was given. Codebase-wide conventions, organizational decisions, ongoing migrations, and prior incidents are usually not in that window. Code that looks right in isolation can be wrong for the system it lives in.
Atrophy of fundamentals. Engineers who outsource basic tasks to AI without ever doing them themselves do not build the mental models that experience produces. The cost shows up years later, as an inability to debug, reason about, or improve systems when the AI cannot do it for them.
Hidden context loss in handoffs. When AI generates code, the "why" behind decisions is often not captured anywhere. The next engineer reading the code has to reverse-engineer intent from artifact, which is more expensive than understanding intent articulated at the time.
False productivity signals. Pull requests get larger, faster. Velocity charts go up. Whether the resulting code is actually working better, behaving better in production, or aging better in the codebase is a separate question that velocity metrics do not answer.⁴
Erosion of code review as a learning practice. Code review historically served two purposes: catching defects and transferring knowledge between engineers. When much of the code under review is AI-generated, the second purpose can quietly disappear unless the team is deliberate about preserving it.

These are not arguments against AI use. They are the reasons careful AI use produces a different result from careless AI use, even with the same tools.
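The false-productivity-signal point can be made concrete with a small sketch. It assumes a hypothetical deployment log in which each deploy records how many lines changed and whether it later needed a rollback or hotfix. The `Deploy` record, its field names, and the sample numbers are illustrative assumptions, not data from any real team. The idea is only that a volume metric should be reported next to an outcome metric, here a simple change failure rate.

```python
from dataclasses import dataclass


@dataclass
class Deploy:
    """One production deployment (hypothetical record shape)."""
    week: int
    lines_changed: int
    caused_incident: bool  # True if the deploy needed a rollback or hotfix


def weekly_summary(deploys):
    """Return {week: (lines_changed, change_failure_rate)}."""
    by_week = {}
    for d in deploys:
        by_week.setdefault(d.week, []).append(d)

    summary = {}
    for week, items in sorted(by_week.items()):
        lines = sum(d.lines_changed for d in items)
        failures = sum(1 for d in items if d.caused_incident)
        summary[week] = (lines, failures / len(items))
    return summary


# Made-up sample: output volume quadruples in week 2, and so does risk.
deploys = [
    Deploy(1, 200, False), Deploy(1, 300, False),
    Deploy(2, 900, True), Deploy(2, 1100, False),
]
summary = weekly_summary(deploys)
for week, (lines, cfr) in summary.items():
    print(f"week {week}: {lines} lines shipped, change failure rate {cfr:.0%}")
```

In this made-up sample a velocity chart would show only the jump from 500 to 2,000 lines. Pairing it with the failure rate shows what the larger number cost.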

The Skill Question

The most consequential question for engineering organizations is not "how do we adopt AI" but "how do we keep producing engineers whose judgment is worth amplifying."

If AI does the work that used to build junior engineers into senior ones, the pipeline that produced the senior engineers stops working. The team becomes structurally dependent on AI not just for productivity but for capability that used to grow inside the organization.

Several thoughtful practitioners have written about this risk.¹ The honest answer requires investment that AI itself cannot provide: explicit attention to skill development, deliberate practice of fundamentals without AI assistance, mentorship that happens outside the tooling, and clear distinction between work AI is doing and work the human is doing.

A team that uses AI to ship faster while quietly hollowing out the engineering judgment underneath it has not multiplied its force. It has traded the engineers it will need tomorrow for what it can ship today.

What This Looks Like in Practice

A few habits keep AI a force multiplier rather than a force replacer:

Read what the AI produces with the same care you would read a junior engineer's PR. Or more. AI output is fluent, which means it is harder to spot the mistakes that less fluent output would have made visible.
Use AI in domains where you can verify the output. The signal-to-noise ratio of AI assistance is highest where you can quickly tell whether the output is right. It is lowest in unfamiliar domains, which is also where the temptation to trust the output is highest.
Keep doing the fundamentals by hand sometimes. The engineers who maintain the most leverage from AI are the ones who can still do the work without it, and choose to delegate it knowingly. The engineers who lose leverage are the ones who can no longer do the work and have to delegate.
Treat AI output as a first draft, not a final answer. The draft saves blank-page labor. The labor of making it actually correct, contextually appropriate, and integrated into the system is still engineering work.
Be honest about what was generated and what was authored. Commit messages, PR descriptions, and design docs benefit from clarity about which parts were AI-assisted and which were not. Future debuggers will appreciate the context.
Invest in the team's underlying capability, not just its tooling. AI tools change every quarter. The judgment that uses them well, or compensates for their failures, takes years to develop. Spend on the slower thing too.⁵
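One lightweight way to practice the habit of being honest about what was generated is a commit-message trailer. The sketch below assumes a team convention of an `Assisted-by:` trailer. That trailer name, the helper function, and the sample commit message are all hypothetical, not an established standard. The code only shows that provenance recorded this way stays machine-readable for whoever debugs the change later.

```python
def assisted_by(commit_message):
    """Return the values of all `Assisted-by:` trailers in a commit message.

    `Assisted-by` is a hypothetical team convention, not a Git built-in.
    """
    # Trailers sit in the last paragraph of the message, one `Key: value` per line.
    paragraphs = commit_message.strip().split("\n\n")
    last = paragraphs[-1]

    values = []
    for line in last.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "assisted-by":
            values.append(value.strip())
    return values


# Illustrative commit message; names and details are made up.
message = """Add retry with backoff to payment client

Retry loop drafted with an AI assistant. Backoff limits were chosen
by hand after reviewing the client's timeout behavior.

Assisted-by: ai-assistant (retry loop draft)
Reviewed-by: A. Engineer
"""

print(assisted_by(message))
```

The trailer marks which part was drafted by a tool. The body of the message still carries the "why", which is the part a future reader cannot recover from the code.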

Key principle

AI multiplies whatever judgment is in the room. It does not put judgment in the room. A team with strong engineering judgment and AI tools will ship better software than the same team without them. A team without strong engineering judgment will ship more code than before, and it will not necessarily be better software.

1. Ethan Mollick, Co-Intelligence: Living and Working with AI (Portfolio, 2024). One of the more measured published treatments of how to integrate AI into professional work, including the argument that AI amplifies existing skill rather than substituting for it, and the warning that organizations need to actively defend against the erosion of the human capabilities AI appears to make optional. Mollick's One Useful Thing Substack is the ongoing companion.
2. Andrej Karpathy, Software 2.0 (Medium, 2017). The originating essay reframing software development as a discipline where parts of the codebase are increasingly trained rather than written. Predates the current LLM wave but holds up as foundational framing for how AI changes the shape of engineering work.
3. Simon Willison's blog at simonwillison.net is one of the most reliable practitioner-level references for what current AI tools can and cannot do. Willison (co-creator of Django, builder of Datasette) writes regularly about prompt engineering, tool use, evaluations, and the practical mechanics of integrating LLMs into engineering workflows. The signal-to-noise ratio is unusually high for a topic where most writing is either marketing or alarm.
4. METR (Model Evaluation and Threat Research), Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (2025). A notable study finding that experienced developers were measurably slower when using AI coding assistance than without, despite perceiving themselves as faster. A useful counterweight to vendor productivity claims, and a reminder that perceived productivity and actual productivity are not the same metric.
5. Google's DORA (DevOps Research and Assessment) program publishes an annual State of DevOps report and ongoing research on engineering performance, including, in recent editions, the effects of AI adoption on developer productivity, throughput, and quality. The DORA team's findings consistently complicate the simpler vendor narratives in both directions. Available at https://dora.dev.