
The Prompt Is the New Pull Request

Published: at 08:36 PM


The artifact is shifting

For the last decade, the pull request has been the atomic unit of engineering work. You open a PR, someone reviews it, it merges, work is done. The PR is the receipt. It’s how we measure velocity, how we onboard juniors, how we conduct interviews.

That model worked when a PR represented roughly a day’s worth of human effort. You wrote the code, you wrote the tests, you opened the PR — and someone could roughly estimate the thinking behind it by looking at the diff.

Now a PR can represent ten minutes. Describe the feature to Claude, copy the output, write a quick test, open the PR. The diff looks the same as a day’s work. But the effort behind it is fundamentally different. The real work — the thinking, the spec, the edge cases — happened before a single file was touched.

The diff is no longer the artifact. It's the output of the artifact. Photo by James Harrison on Unsplash.

The prompt becomes the interesting part

Here’s what I’ve noticed in my own work: the thing worth reviewing is rarely the code anymore. The code is usually correct, idiomatic, and well-structured; Claude is good at that. The thing worth reviewing is the prompt. The spec. The assumptions.

Did we describe the right problem? Did we include the right constraints? Did we tell Claude about the existing pattern in src/utils that it shouldn’t reinvent? Did we mention that this endpoint needs to handle a specific error code that the happy-path spec doesn’t cover?

These questions live upstream of the code. They live in the prompt. And right now, nobody reviews prompts. We review the code the prompt produced, which is like reviewing the cake without looking at the recipe. It might look fine today and fail tomorrow because the recipe had a hidden assumption.

What prompt-level review looks like

I’ve started doing something on my team that felt strange at first: I ask to see the prompt alongside the PR. Not the full chat history — just the final prompt that produced the implementation.

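It takes thirty seconds to read. But in those thirty seconds I can catch things the diff would never show me: a missing constraint, an existing pattern the prompt never mentioned, an error code the happy-path spec doesn’t cover.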

These are real bugs. They just happen to live in the prompt instead of the code. And catching them at prompt-review time is dramatically cheaper than catching them in QA or production.

How this changes code review

Code review as we know it is about to split into two layers:

Layer one: the prompt review. Does the spec match the requirement? Are the constraints complete? Is the context fresh? This is where your senior judgment pays off. A junior can write a prompt; a senior knows what’s missing from it.

Layer two: the output review. Does the code match the prompt? Are there mechanical issues — performance, security, style? This layer is increasingly automated. Linters, type checkers, test suites, even Claude reviewing its own output. The human role here is shrinking to a final sanity check.

The ratio is flipping. A year ago, code review was roughly 90% mechanical checking and 10% “does this actually solve the right problem.” Now the mechanical part is increasingly automated, and the judgment part is most of what remains. The prompt review is the judgment part.

What this means for your workflow

If you’re using AI tools in your daily work, treat the prompt as a first-class artifact:

Version your prompts. If a prompt produced a PR, save it in the PR description, a linked doc, or a comment. Six months from now, when someone asks “why does this endpoint work this way,” the prompt will tell them more than the code ever could.
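
One way to make saving the prompt a habit is to append it to the PR description mechanically. This is a minimal sketch, not a standard: the `attach_prompt` function and the "## Prompt" heading are names I made up for illustration, and the short hash is just a convenient way to refer to a specific prompt version later.

```python
import hashlib

# Hypothetical section heading; not a GitHub or GitLab convention.
PROMPT_HEADING = "## Prompt"


def attach_prompt(pr_description: str, prompt: str) -> str:
    """Return the PR description with the generating prompt appended as a quote."""
    prompt = prompt.strip()
    # A short content hash lets reviewers refer to one exact prompt version.
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]
    # Quote every line so the prompt is visually separate from the description.
    quoted = "\n".join(f"> {line}" if line else ">" for line in prompt.splitlines())
    return f"{pr_description.rstrip()}\n\n{PROMPT_HEADING} (sha256: {digest})\n\n{quoted}\n"
```

Calling `attach_prompt("Add retry to /orders", final_prompt)` returns the description followed by the quoted prompt, which you can paste into the PR or feed to whatever tooling opens PRs for you.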

Iterate on prompts, not just code. If the implementation is wrong, fix the prompt first, then regenerate. Fixing the code directly means the prompt stays broken, and the next person who uses it makes the same mistake.
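
The loop is easier to see in code. The sketch below assumes two callables you would supply yourself: `generate`, which turns a prompt into code (for example, a call to your model of choice), and `check`, which returns a list of failure descriptions (for example, from a test run). Both are hypothetical, and appending failures to the prompt automatically is a simplification; in practice a person would rewrite the prompt by hand.

```python
from typing import Callable


def regenerate_until_passing(
    prompt: str,
    generate: Callable[[str], str],     # hypothetical: prompt -> generated code
    check: Callable[[str], list[str]],  # hypothetical: code -> list of failures
    max_rounds: int = 3,
) -> tuple[str, str]:
    """Fix the prompt rather than the output, then regenerate from scratch."""
    for _ in range(max_rounds):
        code = generate(prompt)
        failures = check(code)
        if not failures:
            # Return the prompt that actually produced the passing code.
            return prompt, code
        # Fold what was missing back into the prompt instead of patching the code.
        notes = "\n".join(f"- {failure}" for failure in failures)
        prompt = f"{prompt}\n\nAlso required:\n{notes}"
    raise RuntimeError(f"no passing output after {max_rounds} rounds")
```

The point of returning the final prompt alongside the code is that the prompt you save is the one that works, not the first draft.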

Build a prompt library. The best prompts for your codebase — the system prompts, the patterns, the constraints — are reusable. They’re context. They’re the beginning of the organisational brain.
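
A prompt library does not need to be elaborate to be useful. Here is a rough sketch of one possible shape, assuming prompts are stored as named templates with a list of standing constraints. `PromptEntry` and `PromptLibrary` are illustrative names, not an existing package; the only real API used is `string.Template` from the Python standard library.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class PromptEntry:
    """A reusable prompt: a template plus constraints that always apply."""
    name: str
    template: str                      # uses $placeholders, filled by string.Template
    constraints: tuple[str, ...] = ()  # standing rules for this codebase

    def render(self, **values: str) -> str:
        # substitute() raises KeyError if a placeholder is left unfilled.
        body = Template(self.template).substitute(**values)
        if not self.constraints:
            return body
        rules = "\n".join(f"- {rule}" for rule in self.constraints)
        return f"{body}\n\nConstraints:\n{rules}"


class PromptLibrary:
    """An in-memory registry of prompts, keyed by name."""

    def __init__(self) -> None:
        self._entries: dict[str, PromptEntry] = {}

    def add(self, entry: PromptEntry) -> None:
        if entry.name in self._entries:
            raise ValueError(f"duplicate prompt name: {entry.name}")
        self._entries[entry.name] = entry

    def render(self, name: str, **values: str) -> str:
        return self._entries[name].render(**values)
```

For example, an entry with the template "Implement $route." and the constraint "Reuse helpers in src/utils" renders to a prompt that carries the constraint every time, so nobody has to remember to type it.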

The hiring implication nobody’s ready for

Here’s the uncomfortable part: if prompts become the work artifact, our hiring processes are measuring the wrong thing. We test coding speed. We do live coding interviews. We ask people to invert binary trees on a whiteboard.

But the skill we actually need now — the skill that separates good from great — is the ability to specify precisely, review critically, and catch the gap between what was asked and what was needed. Those aren’t tested by watching someone type a for-loop.

The companies that figure out how to evaluate specification and review skills — not coding speed — will hire the engineers who thrive in this era. The companies that stick to LeetCode will hire fast typists with outdated skills.

Where this is going

The prompt is just the current interface. In a year or two, we’ll have better interfaces — maybe voice, maybe something more structured, maybe agents that interview you to extract the spec rather than waiting for you to write it. The specific interface doesn’t matter.

What matters is the trend: the artifact of engineering work is moving upstream, away from the implementation and toward the intent. The people who embrace that, who give their prompts the same care they used to give their code, will produce better output with less effort. The people who resist it will find themselves writing code that a machine could have written faster, while their colleagues focus on the part machines can’t do.

The diff was never the work. It was the receipt for the work. The receipt just moved upstream — and that’s the same shift, from author to orchestrator, that runs through everything in my AI Engineering practice.