Building a Safer AI Code Review Workflow

June 23, 2026 · 9 min read

AI code review tools are easy to demo and hard to trust.

The hard part is not getting a model to comment on a pull request. The hard part is deciding what the model is allowed to see, where it is allowed to comment, how its output is validated, and when a human remains responsible for the final action.

That is the boundary I wanted to explore.

I built a small local workflow that reviews the current pull request or merge request, asks an AI model for inline review suggestions, validates every suggested comment against the actual changed lines, previews the result, and only posts after explicit confirmation.

The goal is not autonomous review. The goal is a cheaper first pass with a narrow trust boundary.

The Problem

Code review has two very different modes.

The first mode is mechanical: scan the diff, look for obvious regressions, missing edge cases, unsafe assumptions, bad error handling, accidental debug code, suspicious API changes, or tests that no longer cover the behavior.

The second mode is judgment-heavy: understand the product intent, architecture, ownership boundaries, migration strategy, future maintenance cost, and whether the change is the right change at all.

AI can help with the first mode. It can sometimes prepare you for the second mode. But I do not want it acting as the final reviewer, and I definitely do not want it posting unbounded comments without validation.

There was also a smaller workflow problem: I was tired of constantly switching between the code and the comment box in the review UI. Read a line, open a comment, lose a bit of context, go back to the diff, repeat. It is not the deepest engineering problem in the world, but it adds friction exactly when I am trying to keep the whole change in my head.

The practical question became:

Can I get useful inline review suggestions while keeping the model constrained to the diff, validating every comment target, and preserving human approval before anything is posted?

The Design

The workflow is intentionally simple:

  1. Find the current pull request or merge request.
  2. Generate a diff against the base branch.
  3. Parse the diff and extract the exact changed lines that can receive inline comments.
  4. Send the model only the diff, the valid target list, and a compact review rubric.
  5. Require structured JSON output.
  6. Validate and normalize the JSON.
  7. Convert validated comments into a review payload.
  8. Preview the comments in the terminal.
  9. Post only after manual confirmation.
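Step 3 is the piece everything else leans on, so here is a minimal sketch of it in Python (3.9+). It assumes the diff comes from `git diff` in unified format and that only added lines on the new side of the diff count as valid targets. The `Target` shape and the `T1`, `T2`, ... identifiers are illustrative names, not taken from my script.

```python
import re
from dataclasses import dataclass

# Matches a unified-diff hunk header such as "@@ -10,4 +12,6 @@ def foo():"
# and captures the starting line number on the new side of the diff.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


@dataclass(frozen=True)
class Target:
    target_id: str  # hypothetical identifier handed to the model
    path: str       # file path on the new side of the diff
    line: int       # line number in the new version of the file
    text: str       # content of the added line


def extract_targets(diff_text: str) -> list[Target]:
    """Return one Target per added line in a git-style unified diff."""
    targets: list[Target] = []
    path = None
    new_line = 0
    in_hunk = False
    for raw in diff_text.splitlines():
        if raw.startswith("diff --git "):
            # A new file section begins; forget the previous file's state.
            path, in_hunk = None, False
        elif not in_hunk and raw.startswith("+++ "):
            name = raw[4:].strip()
            # Deleted files have no new side, so they get no targets.
            path = None if name == "/dev/null" else name.removeprefix("b/")
        elif (match := HUNK_RE.match(raw)):
            new_line, in_hunk = int(match.group(1)), True
        elif in_hunk and path is not None:
            if raw.startswith("+"):
                target_id = f"T{len(targets) + 1}"
                targets.append(Target(target_id, path, new_line, raw[1:]))
                new_line += 1
            elif raw.startswith(" "):
                # Context lines advance the new-side counter but are not targets.
                new_line += 1
            # Removed lines and "\ No newline" markers do not advance it.
    return targets
```

The resulting list serves two purposes: it is rendered into the prompt so the model can only reference known IDs, and it is kept in memory so every returned comment can be checked against it later.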

The model gets to reason. Deterministic code decides what is acceptable.
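To make "deterministic code decides" concrete, here is a rough sketch of the validation step in Python. The field names `target_id`, `body`, and `severity` match the output shape described in this post. The specific severity values and the assumption that the model returns a top-level JSON array are mine, chosen for illustration.

```python
import json

# Assumed severity vocabulary; replace with whatever your rubric defines.
ALLOWED_SEVERITIES = {"info", "warning", "error"}


def validate_comments(raw_output: str, valid_target_ids: set[str]):
    """Parse model output and split it into accepted comments and rejections.

    Raises ValueError when the output is not a JSON array at all, so the
    caller can exit nonzero instead of posting anything.
    """
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of comments")

    accepted: list[dict] = []
    rejected: list[str] = []
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            rejected.append(f"#{index}: not an object")
            continue
        target_id = item.get("target_id")
        body = item.get("body")
        severity = item.get("severity")

        # The target must be one of the IDs computed from the diff.
        if not isinstance(target_id, str) or target_id not in valid_target_ids:
            rejected.append(f"#{index}: unknown target_id {target_id!r}")
            continue
        if not isinstance(body, str) or not body.strip():
            rejected.append(f"#{index}: empty body")
            continue
        normalized = severity.strip().lower() if isinstance(severity, str) else None
        if normalized not in ALLOWED_SEVERITIES:
            rejected.append(f"#{index}: unknown severity {severity!r}")
            continue

        # Normalize what survives so later steps see one consistent shape.
        accepted.append(
            {"target_id": target_id, "body": body.strip(), "severity": normalized}
        )
    return accepted, rejected
```

Returning the rejections alongside the accepted comments is a deliberate choice in this sketch: the preview can then show how much of the model's output was discarded, which is a useful signal about prompt and model quality.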

Why Not Use A Hosted AI Reviewer?

Hosted tools like CodeRabbit and GitHub Copilot code review are real options. For many teams, they may be the right choice.

CodeRabbit is built specifically around AI code review workflows. It supports pull request reviews, IDE feedback, CLI usage, planning workflows, and integrations with tools like GitHub, GitLab, Azure DevOps, Bitbucket, Jira, Linear, and Slack.

GitHub Copilot code review is also a natural choice if your team already lives in GitHub. You can request Copilot as a reviewer, configure automatic reviews, use repository instructions, and in some setups provide additional context through skills or MCP servers.

So the reason was not "those tools are bad."

The reason was that I wanted to own the review contract for this experiment:

  • What context is sent to the model
  • Which files are excluded
  • How large diffs are reduced
  • How valid comment targets are computed
  • What output shape the model must return
  • How model output is validated
  • When comments are posted
  • How failures affect the exit code
  • How token usage is surfaced

Commercial tools optimize for adoption and product experience. That is valuable. This local workflow optimizes for inspectability. I can open one file and understand the entire system: diff creation, prompt construction, target extraction, validation, preview, and review submission.

That may be less convenient than a polished product, but it is easier to reason about while I am still forming an opinion about where AI review helps and where it creates noise.

What Determines Review Quality?

This workflow is only as good as the inputs and constraints around it.

Review quality depends mostly on four things:

  • Prompt quality
  • Diff quality
  • Model quality
  • How much context is passed

Prompt quality matters because the model needs a clear job. A vague prompt produces vague comments. A better prompt defines the review rubric, the output format, the severity mapping, and what the model should avoid.

Diff quality matters because the model reviews what it sees. If the diff is too large, too noisy, or dominated by generated files, the signal drops. A clean diff is not just nicer for humans. It is better input for the model.
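One cheap way to improve diff quality is to drop low-signal files before the diff is ever built. The sketch below is a hypothetical Python filter; the pattern list and directory names are assumptions for a typical web project and should be adjusted to the repository at hand.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Assumed file-name patterns for lockfiles, minified assets, and source maps.
EXCLUDE_PATTERNS = (
    "package-lock.json",
    "pnpm-lock.yaml",
    "*.lock",
    "*.min.js",
    "*.min.css",
    "*.map",
)
# Assumed directory names for build and coverage output.
EXCLUDE_DIRS = {"dist", "build", "coverage"}


def is_low_signal(path: str) -> bool:
    """Return True when a changed file is unlikely to deserve model review."""
    parsed = PurePosixPath(path)
    # Any excluded directory anywhere above the file rules it out.
    if any(part in EXCLUDE_DIRS for part in parsed.parts[:-1]):
        return True
    # Otherwise decide by file name, case-sensitively on every platform.
    return any(fnmatchcase(parsed.name, pattern) for pattern in EXCLUDE_PATTERNS)


def keep_reviewable(paths: list[str]) -> list[str]:
    """Filter a list of changed paths down to the ones worth sending."""
    return [path for path in paths if not is_low_signal(path)]
```

The surviving paths can then be passed to `git diff` as explicit path arguments, so excluded files never reach the prompt at all.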

Model quality matters because different models have different strengths. Some are better at reasoning about edge cases. Some are better at following strict output formats. Some are more conservative. Some produce more noise. Guardrails can constrain output, but they cannot turn a weak review into a strong one by shell scripting harder.

Context matters because every review has a tradeoff. If I pass only the diff, I get a tighter and cheaper review, but the model may miss behavior that depends on surrounding code. If I pass more repository context, the model can reason more deeply, but token usage goes up and the trust boundary gets wider.

For this workflow, I intentionally bias toward less context: review the diff, validate targets, and let the human reviewer bring the broader system knowledge.

Limitations

There are real limitations.

Because the model primarily reviews the diff, it can miss problems that only become visible with broader repository context. I chose that narrower scope intentionally, but it is still a tradeoff.

Because the target list can get large, token usage can climb quickly on big pull requests. A large-diff fallback helps, but there is more work to do.
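As an illustration only, here is one possible shape for such a guard in Python. This is not a description of the fallback in my script: the environment variable names, the default limits, and the policy of capping the target list or skipping the run entirely are all assumptions made for this sketch.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read a positive integer setting from the environment, failing loudly."""
    raw = os.environ.get(name, "").strip()
    if not raw:
        return default
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None
    if value <= 0:
        raise ValueError(f"{name} must be positive, got {value}")
    return value


def plan_review(diff_text: str, target_ids: list[str], *,
                max_diff_bytes: int, max_targets: int) -> dict:
    """Decide whether to review fully, with a capped target list, or not at all."""
    diff_bytes = len(diff_text.encode("utf-8"))
    if diff_bytes > max_diff_bytes:
        return {
            "mode": "skip",
            "reason": f"diff is {diff_bytes} bytes, limit is {max_diff_bytes}",
            "target_ids": [],
        }
    if len(target_ids) > max_targets:
        return {
            "mode": "truncated",
            "reason": f"{len(target_ids)} targets, keeping first {max_targets}",
            "target_ids": target_ids[:max_targets],
        }
    return {"mode": "full", "reason": "", "target_ids": list(target_ids)}


# Hypothetical variable names; any naming scheme works.
MAX_DIFF_BYTES = env_int("REVIEW_MAX_DIFF_BYTES", 200_000)
MAX_TARGETS = env_int("REVIEW_MAX_TARGETS", 400)
```

Whatever policy is chosen, reporting the mode and reason in the preview keeps the human aware that the model saw only part of the change.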

Because comments are generated by a model, some will be obvious, some will be wrong, and some will be phrased in a way I would not personally write. The preview step is not optional polish. It is part of the safety model.
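A minimal sketch of that preview-and-confirm gate in Python might look like the following. The comment dictionaries and the `targets_by_id` mapping (target ID to a path and line number) are assumed shapes for illustration. The key property is that anything other than an explicit "y" or "yes" counts as a refusal.

```python
def render_preview(comments: list[dict], targets_by_id: dict) -> str:
    """Format validated comments for the terminal, one block per comment."""
    lines = []
    for number, comment in enumerate(comments, start=1):
        path, line = targets_by_id[comment["target_id"]]
        lines.append(f"[{number}] {comment['severity'].upper()} {path}:{line}")
        lines.append(f"    {comment['body']}")
    return "\n".join(lines)


def confirm(prompt: str = "Post these comments? [y/N] ", read=input) -> bool:
    """Return True only on an explicit yes; default to not posting."""
    try:
        answer = read(prompt)
    except EOFError:
        # No interactive input available (for example in CI): never post.
        return False
    return answer.strip().lower() in {"y", "yes"}
```

Defaulting to "no" on empty input and on a closed stdin means the script cannot post by accident when it is piped or run non-interactively.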

Because the workflow can post review comments, rerunning it can create duplicate or overlapping feedback. Duplicate detection would be a good next improvement.
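A simple starting point for duplicate detection, sketched in Python, is to fingerprint each comment by file, line, and normalized body, then skip anything already present on the pull request. This assumes existing comments have been fetched from the hosting provider into dictionaries with `path`, `line`, and `body` keys, which is a hypothetical shape.

```python
import hashlib
import re


def fingerprint(path: str, line: int, body: str) -> str:
    """Build a stable key from location plus whitespace- and case-normalized text."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    key = f"{path}\n{line}\n{normalized}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def drop_duplicates(new_comments: list[dict], existing_comments: list[dict]) -> list[dict]:
    """Keep only new comments whose fingerprint has not been seen before."""
    seen = {
        fingerprint(c["path"], c["line"], c["body"]) for c in existing_comments
    }
    fresh = []
    for comment in new_comments:
        key = fingerprint(comment["path"], comment["line"], comment["body"])
        if key not in seen:
            # Also guards against the model repeating itself within one run.
            seen.add(key)
            fresh.append(comment)
    return fresh
```

Exact matching like this only catches verbatim repeats. A model usually rephrases between runs, so a stricter rule, such as allowing at most one generated comment per file and line, may be the more practical next step.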

And because the first version is Bash, complexity has a ceiling. If this grows much more, I would probably move the orchestration into TypeScript or Go and keep the shell script as a thin entrypoint.

Appendix: A Prompt To Build Your Own Version

I am not sharing my exact script as a reusable package because your repository, team norms, and review preferences are probably different from mine. But if you want to build something similar, this is the kind of prompt I would start with:

I want to build a local AI-assisted pull request review script for my repository.

Create a Bash script that reviews the current pull request and suggests inline review comments using an AI model. The script should be designed as a guarded workflow, not an autonomous reviewer.

Requirements:
- Detect the current pull request from the active branch, or accept a pull request / merge request number as an optional argument.
- Fetch the base branch and generate a diff against it.
- Exclude low-signal files such as lockfiles, build output, coverage output, minified files, and source maps.
- Parse the unified diff and extract valid changed-line targets that can receive inline review comments.
- Send the model only:
  - the diff
  - the valid changed-line target list
  - a compact code review rubric
- Require the model to return structured JSON with:
  - target_id
  - body
  - severity
- Validate the model output before using it.
- Drop comments with invalid target IDs, empty bodies, or unknown severities.
- Convert valid comments into a review payload for the hosting provider.
- Show a terminal preview of the comments before posting.
- Ask for explicit human confirmation before posting anything.
- Post comments only after confirmation.
- Handle model failures with a nonzero exit code.
- Add a timeout for the model run.
- Show progress while the model is running.
- If possible, show token usage after the model finishes.

Review rubric:
- Prioritize correctness, regressions, unsafe behavior, missing edge cases, missing tests, security issues, and performance risks.
- Prefer no comment over speculative or low-value feedback.
- Keep comments concise and actionable.
- Do not comment on unchanged lines.
- Do not inspect the repository or run commands; review only the supplied diff.

Implementation preferences:
- Keep the script readable and easy to modify.
- Use structured JSON validation before posting.
- Use a real parser or small script where diff parsing becomes awkward in shell.
- Use environment variables for tunable settings such as max diff size, timeout, diff context, and max comment targets.
- Make failure modes explicit.
- Do not auto-post comments.
- Include comments in the script only where they clarify non-obvious logic.

Also explain how to run the script, what dependencies it requires, and which parts I should customize for my own review preferences.

The important thing is not that the generated script matches mine line for line. The important thing is that it preserves the same safety shape: constrain the input, validate the output, preview before posting, and keep the human reviewer in control.

The Takeaway

The most interesting part of this workflow is not that it uses AI.

The interesting part is the boundary around AI.

The workflow treats model output as untrusted until it passes through deterministic checks. It narrows the prompt, validates the shape, maps comments to known targets, preserves exit codes, surfaces token usage, and asks before posting.

That is the version of AI tooling I trust more: not magical, not fully autonomous, not pretending review is solved.

Just a focused assistant that helps with the first pass, while leaving judgment where it belongs.
