Code Review for Software Engineers (2026)
In short
Code review is the most under-taught senior+ skill: most engineers learn it by absorption, and most teams have implicit standards nobody wrote down. The two structural truths in 2026 are that AI handles first-pass mechanical review (style, obvious bugs, missing error handling) and that humans must focus on what AI cannot do well (architecture, context, blast radius, mentorship). This page walks through Google's published Engineering Practices Documentation (google.github.io/eng-practices/review/) with real review-comment examples and the named patterns that distinguish a senior reviewer from a junior reviewer.
Key takeaways
- Google's Engineering Practices Documentation (google.github.io/eng-practices/review/) is the single best public resource on code review — read 'The Standard of Code Review' and 'How to do a code review' end-to-end at least once.
- Reviewers approve when the change is 'definitely an improvement' even if not perfect — perfection is the enemy of velocity (eng-practices/review/reviewer/standard.html).
- Architecture review and style review are different jobs. Architecture: 'should this exist?'. Style: 'is this readable?'. Senior+ reviewers do both but in that order — never approve style on a wrong abstraction.
- Comment severity should be explicit: nit (optional), suggestion (defer to author), blocking (must fix). Implicit severity is the #1 source of reviewer-author friction.
- AI-assisted review (Cursor, CodeRabbit, GitHub Copilot for PRs) handles first-pass mechanical issues — style, missing error cases, obvious bugs. This raises the bar for human reviewers to the things AI is bad at: architectural fit, blast radius, organizational context.
- The single highest-leverage senior practice: review the design before the implementation. If the approach is wrong, no amount of style fixing helps.
What good code review actually looks like (Google's published bar)
Google's Engineering Practices Documentation (google.github.io/eng-practices/review/) is publicly available and was written by Mike Bland and Dan Bentley. It's the closest thing to an industry-standard code-review rubric. Three load-bearing principles to internalize:
Principle 1: 'In general, reviewers should favor approving a CL once it is in a state where it definitely improves the overall code health of the system being worked on, even if the CL isn't perfect.' (eng-practices/review/reviewer/standard.html)
Translation: a CL that improves the codebase by 20% with three minor issues should ship; you can file the issues separately. Holding CLs hostage to perfection is what kills team velocity. If the CL is a regression on net (i.e., the issues outweigh the improvement), block.
Principle 2: 'It is ALMOST NEVER okay to block a CL just because the change isn't structured exactly the way you would have written it.' (eng-practices/review/reviewer/standard.html)
Translation: 'I would have done it differently' is not a blocking comment. Blocking comments are for correctness, security, performance regressions, or breaking team conventions. Style preference is a 'suggestion'.
Principle 3: Review the design first. The doc explicitly orders the review checklist: design → functionality → complexity → tests → naming → comments → style. (eng-practices/review/reviewer/looking-for.html)
If the design is wrong, fixing names doesn't help. If the design is right, the rest is recoverable.
Real review-comment pairs: bad vs good
The single fastest way to get better at review is to study side-by-side comparisons. Six concrete examples, all from patterns documented in Google's eng-practices doc and refined by practitioners (Sarah Drasner's 'How to write a code review', Will Larson's StaffEng essays).
Example 1: pointing out a missing error case.
Bad: 'This is wrong.'
Good: 'requests.get() raises ConnectionError on network failure. Without a try/except here, a flaky external service will 500 the user. Suggestion: wrap in try/except, log the error, return a sentinel or raise a domain exception. (Blocking — this is on the user-facing path.)'
Why the second is better: it names the specific failure (ConnectionError), the impact (user-visible 500), the fix (try/except), the severity (blocking, with reason). The author can act without a back-and-forth.
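What follows is a minimal sketch of the fix that comment is asking for, assuming a hypothetical fetch_profile helper and a domain-level ExternalServiceError; the URL and names are illustrative, not from the original comment.

```python
import logging

import requests

logger = logging.getLogger(__name__)


class ExternalServiceError(Exception):
    """Domain exception raised when the upstream service is unreachable."""


def fetch_profile(user_id: str) -> dict:
    # Hypothetical endpoint; the point is the error handling, not the URL.
    url = f"https://profiles.internal/api/v1/users/{user_id}"
    try:
        response = requests.get(url, timeout=5)
        response.raise_for_status()
    except requests.ConnectionError as exc:
        # A flaky upstream now surfaces as a domain error the caller can map
        # to a 502/503 instead of an unhandled 500.
        logger.error("profile service unreachable for user %s: %s", user_id, exc)
        raise ExternalServiceError("profile service unavailable") from exc
    return response.json()
```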
Example 2: questioning a design choice.
Bad: 'Why is this a separate service?'
Good: 'I see this is split as a separate service. The trade-off I'm worried about: this introduces a network hop on the auth path (current p99 is 80ms; this likely adds 20-30ms). Two questions: (1) is the deployment-decoupling benefit worth the latency cost? (2) Is there a path to in-process if scale demands? Not blocking — happy to discuss.'
Why the second is better: states the concrete cost (latency), asks specific questions, signals openness, marks severity (not blocking). The author can defend or rethink.
Example 3: requesting a test.
Bad: 'Where's the test?'
Good: 'Suggestion: add a test for the empty-input case — process_batch([]) currently returns None, and we should pin this behavior so a future refactor can't accidentally raise. (Suggestion — not blocking.)'
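A sketch of the test that comment is asking for, assuming process_batch is importable from a hypothetical app.pipeline module and that returning None on empty input is the behavior the team wants to pin:

```python
from app.pipeline import process_batch  # hypothetical module path


def test_process_batch_empty_input_returns_none():
    # Pins the current contract so a future refactor cannot silently start
    # raising (or returning []) on empty input.
    assert process_batch([]) is None
```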
Example 4: catching a security issue.
Bad: 'SQL injection.'
Good: 'BLOCKING: db.execute(f"SELECT * FROM users WHERE email = '{email}'") is a SQL injection vector — an email like x' OR '1'='1 reads the whole table. Use a parameterized query: db.execute("SELECT * FROM users WHERE email = %s", (email,)). Required by OWASP A03 (owasp.org/Top10/A03_2021-Injection/).'
Severity called out, exploit articulated, fix shown, citation. This is the bar.
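A sketch of the before and after, assuming a DB-API-style handle with psycopg2-style %s placeholders (as in the comment); find_user_by_email is a hypothetical wrapper:

```python
def find_user_by_email(db, email: str):
    # Vulnerable: attacker-controlled email is interpolated into the SQL text,
    # so  x' OR '1'='1  turns the WHERE clause into a tautology.
    # db.execute(f"SELECT * FROM users WHERE email = '{email}'")

    # Fixed: the driver sends the value separately from the statement, so the
    # input can never change the query's structure.
    db.execute("SELECT * FROM users WHERE email = %s", (email,))
    return db.fetchone()
```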
Example 5: nit (style) feedback.
Bad: 'Rename this.'
Good: 'nit: data as a variable name is too generic; this is specifically a list of RawTransactionEvent. events or raw_events would read better. Take or leave.'
The 'nit:' prefix is critical. The 'take or leave' makes the severity explicit.
Example 6: catching architectural drift.
Bad: 'This violates our pattern.'
Good: 'This is a query directly against the users table; our convention is to go through UserRepository. Reasons: (1) consistent caching layer, (2) auth checks centralized, (3) we plan to migrate this table next quarter and the repository abstracts it. Suggestion: refactor to use the repository before merging — happy to point at examples in app/services/auth_service.py:42-58. (Blocking — convention break, but easy fix.)'
Names the convention, explains the reasons, points to an example, severity is explicit.
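A hedged sketch of what the requested refactor might look like; UserRepository, the cache object (assumed to expose get/set), and the query itself are illustrative, not from any real codebase:

```python
# Before (the pattern the comment flags): a handler hitting the users table
# directly, bypassing the caching layer and centralized auth checks.
#
#   db.execute("SELECT id, email FROM users WHERE id = %s", (user_id,))
#   row = db.fetchone()
#
# After: the same lookup behind the repository seam, so caching and the
# planned table migration live in one place.
class UserRepository:
    def __init__(self, db, cache):
        self._db = db        # DB-API-style cursor/handle
        self._cache = cache  # any object with get/set, e.g. a thin dict wrapper

    def get_by_id(self, user_id: str):
        cached = self._cache.get(user_id)
        if cached is not None:
            return cached
        self._db.execute("SELECT id, email FROM users WHERE id = %s", (user_id,))
        row = self._db.fetchone()
        self._cache.set(user_id, row)
        return row
```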
Architecture review vs style review: separate them
The single most common review failure mode: mixing architecture concerns and style concerns in the same comment thread. The author can't tell what's blocking. The reviewer dilutes the architectural concern by burying it in nits.
The fix: do two passes.
- Pass 1: design pass. Read the description and the test cases. Don't read the implementation yet. Ask: is the abstraction right? Is the test coverage right? Is this the right place for this code? If you have blocking concerns, comment them first and request changes before you do pass 2.
- Pass 2: implementation pass. Once design is settled, read the implementation. Look for: error handling, performance, security, readability. Now nits are appropriate.
Why this matters: if you give 30 nits and one architectural concern in pass 1, the author may fix the nits, defend the architecture in a comment, and request re-review. You then have to either (a) approve a wrong architecture because the nits got fixed or (b) explain that the architectural concern is the only blocking one and the nits don't matter yet. The two-pass discipline avoids this.
A concrete signal of seniority: on a 200-line PR, giving 4-6 substantive comments and approving. Giving 25 comments on the same PR is a junior signal — it reads as nit-picking and breaks the author's flow. The senior reviewer prioritizes; the junior reviewer enumerates.
Reviewing AI-generated code: what to watch for
AI-generated code in 2026 has a specific signature that experienced reviewers learn to recognize. The common failure modes:
- Confident wrong code that passes the happy path. The function looks reasonable, the test covers one case, edge cases blow up. Reviewer fix: ask for property-based tests or fuzz tests on the boundaries (see the sketch after this list).
- Hallucinated APIs. AI invents method signatures that don't exist on the real library. Reviewer fix: check the import; verify the method on the docs.
- Off-by-one and indexing errors. AI is inconsistent on inclusive/exclusive ranges. Reviewer fix: read every range and slice, mentally substitute boundary values (0, 1, len-1, len).
- Pattern matching from older idioms. AI pulls from training data that includes deprecated patterns (e.g., asyncio.coroutine when async/await is correct). Reviewer fix: verify the idiom matches the current major version of the language.
- Missing the team-specific convention. AI doesn't know your codebase has a custom logger, a specific error type, a preferred ORM. Reviewer fix: a CLAUDE.md or .cursorrules file in the repo so the next AI invocation has the conventions.
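A minimal sketch of the boundary-focused, property-based check suggested in the first item above; last_n is a hypothetical AI-written helper, and the test assumes the hypothesis library is available:

```python
from hypothesis import given, strategies as st


def last_n(items: list[int], n: int) -> list[int]:
    # Hypothetical helper under review: return the last n items of the list.
    return items[-n:] if n > 0 else []


@given(st.lists(st.integers(), max_size=20), st.integers(min_value=0, max_value=25))
def test_last_n_boundaries(items, n):
    result = last_n(items, n)
    # These properties hold at n = 0, n = 1, n = len(items), and n > len(items),
    # exactly the boundaries where AI-generated slicing tends to be wrong.
    assert len(result) == min(n, len(items))
    assert result == items[len(items) - len(result):]
```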
Two patterns that work for AI-generated PRs:
Pattern 1: 'Did the human verify this?' If the PR description says nothing about the AI run or the verification, ask: 'How did you verify the edge cases? Did you run mutation testing?' This is not gatekeeping; it's the same question you'd ask a junior who copy-pasted from Stack Overflow.
Pattern 2: AI as second reviewer. Many teams now run an AI-tool pre-review (CodeRabbit, GitHub Copilot for PRs, Cursor's BugBot) before human review. The AI catches mechanical issues; the human focuses on architecture and judgment. The two-stage pattern is becoming standard at AI-forward companies (Anthropic, Vercel, Cursor publish about this in their engineering blogs).
When to escalate (author and reviewer disagree)
Disagreements happen. The senior practice is to escalate gracefully when (a) the disagreement is on architecture, not style; (b) you've made your case once clearly and the author hasn't moved; (c) further back-and-forth is unproductive.
Escalation paths (in order):
- Synchronous conversation. 5-minute call. Most architectural disagreements collapse when you can ask 'what's the trade-off you're prioritizing?' in real time.
- Tech-lead or senior engineer arbitration. Bring in a third party with context. Useful when the disagreement is between peers and one of them needs a tiebreaker.
- Manager arbitration. Last resort; managers should rarely break ties on technical matters. If a manager has to break a tie often, the team has a deeper problem.
- Defer. If you've made your case and the author still disagrees and you're not the team's owner of this code, sometimes the right move is to approve with a comment ('I'd have done it differently — happy to revisit if it causes problems'). Note the disagreement publicly so future engineers see the trade-off.
Google's eng-practices doc explicitly notes (eng-practices/review/reviewer/standard.html): 'In general, the author of the CL should be the one to make the decision on what's best.' Reviewers can request changes for blocking issues but should not block on preference.
The pattern that breaks teams: reviewers who block on preference and authors who can't move forward without unanimous approval. Both are anti-patterns. The Google bar is 'definitely an improvement' — not 'unanimous'.
Reviewing across language and team boundaries
Cross-team or cross-language review is the senior+ scenario most often missed in code-review training. The patterns that work:
Reviewing in a language you don't write daily: focus on what doesn't require fluency — control flow, error handling, naming, test coverage, security. Defer language-idiomatic feedback to someone who writes in the language daily. Mark your comments: 'I don't write Go regularly — flag if this is idiomatic to ignore.'
Reviewing for a team you don't own: defer to the team's conventions. If you have a preference that conflicts with their convention, raise it as a discussion, not as a block. Their codebase, their conventions. Cross-team reviewers who impose their own team's conventions are a known anti-pattern.
Reviewing security- or performance-sensitive code: different bar. Bring in a domain reviewer (security engineer, performance engineer) explicitly. The senior practice: 'I'm approving with the caveat that I'm not a security expert; @sec-team should sign off on the auth path.'
Reviewing PRs that span 1000+ lines: push back on the size before reviewing. Will Larson's pattern in 'An Elegant Puzzle' (chapter 3): 'CLs over 400 lines should usually be split.' Beyond ~400 lines, reviewer attention degrades. The author should split the PR; if they can't, the design likely needs reconsidering.
Mentorship through review: what juniors should learn from your comments
The most under-appreciated function of code review: it's the dominant mechanism for transmitting team conventions and engineering judgment to junior engineers. The senior practice is to make every comment educational.
Pattern: explain the 'why', not just the 'what'.
What-only: 'Use a context manager here.'
Why-and-what: 'Use a context manager (with statement) here so the file handle closes deterministically even if process_row raises. Without it, on exception the file leaks until the next garbage collection. This pattern is documented in Python's PEP 343 (peps.python.org/pep-0343).'
The why-and-what comment is 30 seconds longer to write and saves the junior 10 reviews of the same kind in the future.
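As a concrete illustration of the pattern that comment describes, here is a minimal sketch; load_rows, process_row, and the CSV format are hypothetical:

```python
import csv


def load_rows(path: str) -> list[dict]:
    # The with statement guarantees the file handle is closed even if
    # process_row raises partway through — the failure mode the review
    # comment calls out (PEP 343).
    with open(path, newline="") as handle:
        return [process_row(row) for row in csv.DictReader(handle)]


def process_row(row: dict) -> dict:
    # Hypothetical per-row transformation; may raise on malformed input.
    if not row.get("id"):
        raise ValueError(f"row missing id: {row!r}")
    return row
```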
Pattern: link to canonical references in your codebase. 'See app/services/auth_service.py:42-58 for an example of this pattern in our codebase.' Junior engineers learn by example faster than by abstract description.
Pattern: separate 'team norms' from 'universal best practices'. 'In our codebase we use Result types instead of exceptions for expected errors — see ARCHITECTURE.md' is different from 'never use except: pass — it swallows real errors'. Make the distinction.
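To make the 'team norm' side of that distinction concrete, here is a hedged sketch of what a Result-type convention could look like in Python; the Ok/Err classes and parse_port are hypothetical, not any specific team's implementation:

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass
class Ok(Generic[T]):
    value: T


@dataclass
class Err(Generic[E]):
    error: E


def parse_port(raw: str) -> Union[Ok[int], Err[str]]:
    # Expected failures (bad user input) come back as values the caller must
    # handle explicitly; truly unexpected failures still raise as usual.
    if not raw.isdigit():
        return Err(f"not a number: {raw!r}")
    port = int(raw)
    if not 1 <= port <= 65535:
        return Err(f"port out of range: {port}")
    return Ok(port)
```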
Anti-pattern: terse, judgmental comments to juniors. 'Wrong.' 'No.' 'Why?' These read as dismissive and make the junior less likely to ask questions on their next CL. The senior practice is to be exact and patient. If you don't have time to be patient, defer the review to someone who does.
Frequently asked questions
- How long should I spend on a code review?
- Roughly 5 minutes per 100 lines of diff for first-pass review, more if the change is architecturally significant. Google's data (eng-practices/review/reviewer/speed.html) suggests reviewers should aim to respond within one business day. The pattern that breaks teams: reviewers who batch reviews to once a week. Author velocity collapses, context degrades. The discipline: review within hours; flag complex CLs that need a longer block of time.
- Should I use 'I' or 'you' in review comments?
- Prefer 'I' or 'we' over 'you' for non-blocking feedback. 'I think this could be clearer as X' is friendlier than 'you should rename this'. For blocking issues, neutral structural language: 'this introduces a SQL injection vector' is better than 'you wrote a SQL injection vulnerability'. The substance is identical; the tone makes the receiver more likely to engage. Reference: Sarah Drasner's 'How to give helpful code reviews' (sarahdrasnerdesign.com).
- How do I review a PR from a senior engineer when I'm more junior?
- Same standards apply. Senior code is not exempt from review. If the senior is wrong, say so with the same evidence and citations you'd want from them. The asymmetric pattern that fails teams: junior reviewers who rubber-stamp senior PRs and then complain in private. Be the reviewer who calls real issues regardless of author level. Senior engineers respect (and often prefer) reviewers who actually engage.
- What's the right pattern for blocking vs requesting-changes vs approving?
- Three states with different meanings: (a) Approve = 'this is good, ship it'; (b) Approve with comments = 'good enough; address my comments at your discretion'; (c) Request changes = 'I have at least one blocking concern; I'll re-review when you address it'. The right pattern: approve liberally; request changes only for real blockers (correctness, security, breaking convention). Junior reviewers default to request-changes; senior reviewers default to approve-with-comments unless there's a real blocker.
- How do I review a PR I disagree with on philosophy but not on correctness?
- Approve and document. 'I would have approached this with X pattern but acknowledge Y is also valid. Approving — flagging that we should align on a team convention before this becomes the dominant pattern.' Then file a follow-up issue or discussion to align team convention. The wrong move: blocking on philosophy with no clear convention violation. Google's eng-practices is explicit: don't block on personal preference.
- Should AI tools auto-approve PRs?
- Currently no, despite the temptation. AI-tool pre-review is a strong signal but not a substitute for human approval. The reasons: (a) AI cannot evaluate organizational context (is this consistent with our roadmap?); (b) AI cannot evaluate blast radius (does this affect downstream teams?); (c) AI confidence on architectural fit is poor. Use AI as a first-pass reviewer; humans as approvers. This is the published pattern at Anthropic, Vercel, and Cursor as of 2026.
- How do I handle a PR that's been blocked for weeks with no resolution?
- Three steps. (1) Bring it to a synchronous conversation — 15 minutes resolves what 3 weeks of comments cannot. (2) Find the right tiebreaker — usually the team's tech lead or the engineering manager for architectural questions. (3) If still stuck, ask: 'is this PR worth this much team energy?' — sometimes the right answer is to abandon and try a different approach. Long-stalled PRs are a strong signal that something deeper is wrong: unclear ownership, an architectural disagreement that the team hasn't resolved, or a 'too clever' approach that should be rebuilt simpler.
- What's the relationship between code review and design docs / RFCs?
- Code review is the wrong place to discover an architecture problem; design docs are. Senior engineers move architectural decisions upstream — to a design doc reviewed before implementation begins. Code review then enforces the design rather than re-litigating it. If you find yourself in a code review arguing about whether the feature should exist or whether the abstraction is right, that's a signal: this work needed a design doc and didn't get one. The fix forward: write the design doc retroactively, get alignment, restart implementation. Reference: Google's design doc culture (rsdoiel.github.io/blog/2018/02/06/design-docs.html).
Sources
- Google — Engineering Practices Documentation, Code Review section (publicly available, written by Mike Bland & Dan Bentley).
- Google Eng Practices — 'The Standard of Code Review'.
- Google Eng Practices — 'What to look for in a code review'.
- Will Larson — StaffEng essays on engineering judgment and review.
- Will Larson — 'An Elegant Puzzle' (chapter 3 covers CL size and review).
- Anthropic Engineering blog — published patterns on AI-assisted review.
- OWASP Top 10 — A03 Injection (canonical reference for security-blocking comments).
About the author. Blake Crosley founded ResumeGeni and writes about product design, hiring technology, and ATS optimization. More writing at blakecrosley.com.