Add code-review plugin with automated PR review workflow

Add new code-review plugin that provides automated pull request reviews using multiple specialized agents with confidence-based scoring to filter false positives.

Key features:
- Multiple parallel agents for independent auditing (CLAUDE.md compliance, bug detection, historical context)
- Confidence-based scoring (0-100) with 80+ threshold to filter false positives
- Automatic skipping of closed, draft, or already-reviewed PRs
- Links directly to code with full SHA and line ranges

Updates:
- Add code-review plugin directory with command and README
- Update plugins/README.md to document the new plugin

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-30 04:02:03 +00:00 · 2025-10-23 23:11:44 -07:00
parent 3be7215354
commit 546f0b46ac
3 changed files with 342 additions and 0 deletions
--- a/plugins/README.md
+++ b/plugins/README.md
@@ -32,6 +32,16 @@ Simplifies common git operations with streamlined commands for committing, pushi
  - `/clean_gone` - Clean up stale local branches marked as [gone]
 - **Use case**: Faster git workflows with less context switching

+### [code-review](./code-review/)
+
+**Automated Pull Request Code Review Plugin**
+
+Provides automated code review for pull requests using multiple specialized agents with confidence-based scoring to filter false positives.
+
+- **Command**:
+  - `/code-review` - Automated PR review workflow
+- **Use case**: Automated code review on pull requests with high-confidence issue detection (threshold ≥80)
+
 ### [feature-dev](./feature-dev/)

 **Comprehensive Feature Development Workflow Plugin**
--- a/plugins/code-review/README.md
+++ b/plugins/code-review/README.md
@@ -0,0 +1,246 @@
+# Code Review Plugin
+
+Automated code review for pull requests using multiple specialized agents with confidence-based scoring to filter false positives.
+
+## Overview
+
+The Code Review Plugin automates pull request review by launching multiple agents in parallel to independently audit changes from different perspectives. It uses confidence scoring to filter out false positives, ensuring only high-quality, actionable feedback is posted.
+
+## Commands
+
+### `/code-review`
+
+Performs automated code review on a pull request using multiple specialized agents.
+
+**What it does:**
+1. Checks if review is needed (skips closed, draft, trivial, or already-reviewed PRs)
+2. Gathers relevant CLAUDE.md guideline files from the repository
+3. Summarizes the pull request changes
+4. Launches 4 parallel agents to independently review:
+   - **Agents #1 & #2**: Audit for CLAUDE.md compliance
+   - **Agent #3**: Scan for obvious bugs in changes
+   - **Agent #4**: Analyze git blame/history for context-based issues
+5. Scores each issue 0-100 for confidence level
+6. Filters out issues below 80 confidence threshold
+7. Posts review comment with high-confidence issues only
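
Step 1 of the list above can be partly sketched in Python. This is a minimal illustration, not the plugin's implementation: it assumes the PR metadata has already been fetched as JSON (for example with `gh pr view --json state,isDraft`), that `state` is reported as `OPEN`, `CLOSED`, or `MERGED`, and the helper name `skip_reason` is hypothetical. It covers only the closed and draft checks; deciding whether a PR is trivial or already reviewed is left to the agent.

```python
import json


def skip_reason(pr_json: str):
    """Return why a PR should be skipped, or None if review can proceed.

    Assumes `pr_json` is a JSON object with `state` and `isDraft` keys,
    as an illustration of the closed/draft checks only.
    """
    pr = json.loads(pr_json)
    if pr.get("state") in ("CLOSED", "MERGED"):
        return "PR is closed"
    if pr.get("isDraft"):
        return "PR is a draft"
    return None
```

A caller would stop the workflow whenever `skip_reason` returns a string and continue to the guideline-gathering step otherwise.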
+
+**Usage:**
+```bash
+/code-review
+```
+
+**Example workflow:**
+```bash
+# On a PR branch, run:
+/code-review
+
+# Claude will:
+# - Launch 4 review agents in parallel
+# - Score each issue for confidence
+# - Post comment with issues ≥80 confidence
+# - Skip posting if no high-confidence issues found
+```
+
+**Features:**
+- Multiple independent agents for comprehensive review
+- Confidence-based scoring reduces false positives (threshold: 80)
+- CLAUDE.md compliance checking with explicit guideline verification
+- Bug detection focused on changes (not pre-existing issues)
+- Historical context analysis via git blame
+- Automatic skipping of closed, draft, or already-reviewed PRs
+- Links directly to code with full SHA and line ranges
+
+**Review comment format:**
+```markdown
+## Code review
+
+Found 3 issues:
+
+1. Missing error handling for OAuth callback (CLAUDE.md says "Always handle OAuth errors")
+
+https://github.com/owner/repo/blob/abc123.../src/auth.ts#L67-L72
+
+2. Memory leak: OAuth state not cleaned up (bug due to missing cleanup in finally block)
+
+https://github.com/owner/repo/blob/abc123.../src/auth.ts#L88-L95
+
+3. Inconsistent naming pattern (src/conventions/CLAUDE.md says "Use camelCase for functions")
+
+https://github.com/owner/repo/blob/abc123.../src/utils.ts#L23-L28
+```
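
The comment format above could be produced by a small helper along these lines. This is a sketch under assumptions: the function name `render_review_comment` and the `(description, reason, link)` tuple shape are hypothetical, and the singular "issue" wording for a single finding is a choice made here, not something the README specifies.

```python
def render_review_comment(issues):
    """Render issues as a Markdown review comment.

    `issues` is a list of (description, reason, link) tuples.
    Returns None when there is nothing to post.
    """
    if not issues:
        return None
    noun = "issue" if len(issues) == 1 else "issues"
    lines = ["## Code review", "", f"Found {len(issues)} {noun}:", ""]
    for number, (description, reason, link) in enumerate(issues, start=1):
        # Each entry is a numbered description followed by its code link.
        lines += [f"{number}. {description} ({reason})", "", link, ""]
    return "\n".join(lines).rstrip()
```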
+
+**Confidence scoring:**
+- **0**: Not confident, false positive
+- **25**: Somewhat confident, might be real
+- **50**: Moderately confident, real but minor
+- **75**: Highly confident, real and important
+- **100**: Absolutely certain, definitely real
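
The threshold filter that follows scoring can be sketched as below. This is an illustration only: the `ScoredIssue` type and `filter_issues` function are hypothetical names, and the only behavior taken from the plugin is that issues scoring below 80 are dropped.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 80  # default threshold described in this README


@dataclass(frozen=True)
class ScoredIssue:
    description: str
    reason: str  # e.g. "CLAUDE.md adherence", "bug", "historical git context"
    score: int   # confidence on the 0-100 scale


def filter_issues(issues, threshold=CONFIDENCE_THRESHOLD):
    """Keep only issues whose score is at or above the threshold."""
    for issue in issues:
        if not 0 <= issue.score <= 100:
            raise ValueError(f"score out of range: {issue.score}")
    return [issue for issue in issues if issue.score >= threshold]
```

Note that an issue scored exactly at the 75 anchor ("Highly confident") falls below the default threshold and would be filtered out by this sketch.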
+
+**False positives filtered:**
+- Pre-existing issues not introduced in PR
+- Code that looks like a bug but isn't
+- Pedantic nitpicks
+- Issues linters will catch
+- General quality issues (unless in CLAUDE.md)
+- Issues with lint ignore comments
+
+## Installation
+
+This plugin is included in the Claude Code repository. The command is automatically available when using Claude Code.
+
+## Best Practices
+
+### Using `/code-review`
+- Maintain clear CLAUDE.md files for better compliance checking
+- Trust the 80+ confidence threshold - it filters most false positives
+- Run on all non-trivial pull requests
+- Review agent findings as a starting point for human review
+- Update CLAUDE.md based on recurring review patterns
+
+### When to use
+- All pull requests with meaningful changes
+- PRs touching critical code paths
+- PRs from multiple contributors
+- PRs where guideline compliance matters
+
+### When not to use
+- Closed or draft PRs (automatically skipped anyway)
+- Trivial automated PRs (automatically skipped)
+- Urgent hotfixes requiring immediate merge
+- PRs already reviewed (automatically skipped)
+
+## Workflow Integration
+
+### Standard PR review workflow:
+```bash
+# Create PR with changes
+/code-review
+
+# Review the automated feedback
+# Make any necessary fixes
+# Merge when ready
+```
+
+### As part of CI/CD:
+```bash
+# Trigger on PR creation or update
+# Automatically posts review comments
+# Skip if review already exists
+```
+
+## Requirements
+
+- Git repository with GitHub integration
+- GitHub CLI (`gh`) installed and authenticated
+- CLAUDE.md files (optional but recommended for guideline checking)
+
+## Troubleshooting
+
+### Review takes too long
+
+**Issue**: Agents are slow on large PRs
+
+**Solution**:
+- Normal for large changes - agents run in parallel
+- 4 independent agents ensure thoroughness
+- Consider splitting large PRs into smaller ones
+
+### Too many false positives
+
+**Issue**: Review flags issues that aren't real
+
+**Solution**:
+- Default threshold is 80 (already filters most false positives)
+- Make CLAUDE.md more specific about what matters
+- Consider if the flagged issue is actually valid
+
+### No review comment posted
+
+**Issue**: `/code-review` runs but no comment appears
+
+**Solution**:
+Check if:
+- PR is closed (reviews skipped)
+- PR is draft (reviews skipped)
+- PR is trivial/automated (reviews skipped)
+- PR already has review (reviews skipped)
+- No issues scored ≥80 (no comment needed)
+
+### Link formatting broken
+
+**Issue**: Code links don't render correctly in GitHub
+
+**Solution**:
+Links must follow this exact format:
+```
+https://github.com/owner/repo/blob/[full-sha]/path/file.ext#L[start]-L[end]
+```
+- Must use full SHA (not abbreviated)
+- Must use `#L` notation
+- Must include line range with at least 1 line of context
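
The link rules above can be checked mechanically. The following is a rough sketch, assuming a full SHA means a 40-character lowercase hex SHA-1; the pattern and the name `is_valid_review_link` are illustrative and not part of the plugin. It does not verify that the range includes surrounding context lines.

```python
import re

# owner/repo/blob/<40-hex sha>/<path>#L<start>-L<end>
PERMALINK_RE = re.compile(
    r"^https://github\.com/(?P<owner>[^/]+)/(?P<repo>[^/]+)/blob/"
    r"(?P<sha>[0-9a-f]{40})/(?P<path>[^#]+)#L(?P<start>\d+)-L(?P<end>\d+)$"
)


def is_valid_review_link(url: str) -> bool:
    """Return True if the URL matches the required permalink shape."""
    match = PERMALINK_RE.match(url)
    if match is None:
        return False
    # The range must not run backwards.
    return int(match["start"]) <= int(match["end"])
```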
+
+### GitHub CLI not working
+
+**Issue**: `gh` commands fail
+
+**Solution**:
+- Install GitHub CLI: `brew install gh` (macOS) or see [GitHub CLI installation](https://cli.github.com/)
+- Authenticate: `gh auth login`
+- Verify repository has GitHub remote
+
+## Tips
+
+- **Write specific CLAUDE.md files**: Clear guidelines = better reviews
+- **Include context in PRs**: Helps agents understand intent
+- **Use confidence scores**: Issues ≥80 are usually correct
+- **Iterate on guidelines**: Update CLAUDE.md based on patterns
+- **Review automatically**: Set up as part of PR workflow
+- **Trust the filtering**: Threshold prevents noise
+
+## Configuration
+
+### Adjusting confidence threshold
+
+The default threshold is 80. To adjust, modify the command file at `commands/code-review.md`:
+```markdown
+Filter out any issues with a score less than 80.
+```
+
+Change `80` to your preferred threshold (0-100).
+
+### Customizing review focus
+
+Edit `commands/code-review.md` to add or modify agent tasks:
+- Add security-focused agents
+- Add performance analysis agents
+- Add accessibility checking agents
+- Add documentation quality checks
+
+## Technical Details
+
+### Agent architecture
+- **2x CLAUDE.md compliance agents**: Redundancy for guideline checks
+- **1x bug detector**: Focused on obvious bugs in changes only
+- **1x history analyzer**: Context from git blame and history
+- **Nx confidence scorers**: One per issue for independent scoring
+
+### Scoring system
+- Each issue independently scored 0-100
+- Scoring considers evidence strength and verification
+- Threshold (default 80) filters low-confidence issues
+- For CLAUDE.md issues: verifies guideline explicitly mentions it
+
+### GitHub integration
+Uses `gh` CLI for:
+- Viewing PR details and diffs
+- Fetching repository data
+- Reading git blame and history
+- Posting review comments
+
+## Author
+
+Boris Cherny (boris@anthropic.com)
+
+## Version
+
+1.0.0
--- a/plugins/code-review/commands/code-review.md
+++ b/plugins/code-review/commands/code-review.md
@@ -0,0 +1,86 @@
+---
+allowed-tools: Bash(gh issue view:*), Bash(gh search:*), Bash(gh issue list:*), Bash(gh api:*), Bash(gh pr comment:*), Bash(gh pr diff:*), Bash(gh pr view:*), Bash(gh pr review:*), Bash(gh pr list:*)
+description: Code review a pull request
+---
+
+Provide a code review for the given pull request.
+
+To do this, follow these steps precisely:
+
+1. Use an agent to check if the pull request (a) is closed, (b) is a draft, (c) does not need a code review (eg. because it is an automated pull request, or is very simple and obviously ok), or (d) already has a code review from you from earlier. If so, do not proceed.
+2. Use another agent to give you a list of file paths to (but not the contents of) any relevant CLAUDE.md files from the codebase: the root CLAUDE.md file (if one exists), as well as any CLAUDE.md files in the directories whose files the pull request modified
+3. Use an agent to view the pull request, and ask the agent to return a summary of the change
+4. Then, launch 4 parallel agents to independently code review the change. The agents should do the following, then return a list of issues and the reason each issue was flagged (eg. CLAUDE.md adherence, bug, historical git context, etc.):
+   a. Agents #1 and #2: Independently audit the changes to make sure they comply with the CLAUDE.md
+   b. Agent #3: Read the file changes in the pull request, then do a shallow scan for obvious bugs. Avoid reading extra context beyond the changes, focusing just on the changes themselves. Focus on large bugs, and avoid small issues and nitpicks. Ignore likely false positives.
+   c. Agent #4: Read the git blame and history of the code modified, to identify any bugs in light of that historical context
+5. For each issue found in #4, launch a parallel agent that takes the PR, issue description, and list of CLAUDE.md files (from step 2), and returns a score to indicate the agent's level of confidence for whether the issue is real or false positive. To do that, the agent should score each issue on a scale from 0-100, indicating its level of confidence. For issues that were flagged due to CLAUDE.md instructions, the agent should double check that the CLAUDE.md actually calls out that issue specifically. The scale is (give this rubric to the agent verbatim):
+   a. 0: Not confident at all. This is a false positive that doesn't stand up to light scrutiny, or is a pre-existing issue.
+   b. 25: Somewhat confident. This might be a real issue, but may also be a false positive. The agent wasn't able to verify that it's a real issue. If the issue is stylistic, it is one that was not explicitly called out in the relevant CLAUDE.md.
+   c. 50: Moderately confident. The agent was able to verify this is a real issue, but it might be a nitpick or not happen very often in practice. Relative to the rest of the PR, it's not very important.
+   d. 75: Highly confident. The agent double checked the issue, and verified that it is very likely it is a real issue that will be hit in practice. The existing approach in the PR is insufficient. The issue is very important and will directly impact the code's functionality, or it is an issue that is directly mentioned in the relevant CLAUDE.md.
+   e. 100: Absolutely certain. The agent double checked the issue, and confirmed that it is definitely a real issue, that will happen frequently in practice. The evidence directly confirms this.
+6. Filter out any issues with a score less than 80. If there are no issues that meet this criterion, do not proceed.
+7. Finally, comment back on the pull request with a list of issues you found. When writing your comment, keep in mind to:
+   a. Keep your output brief
+   b. Avoid emojis
+   c. Link and cite relevant code, files, and URLs
+
+Examples of false positives, for steps 4 and 5:
+
+- Pre-existing issues
+- Something that looks like a bug but is not actually a bug
+- Pedantic nitpicks that a senior engineer wouldn't call out
+- Issues that a linter will catch (no need to run the linter to verify)
+- General code quality issues (eg. lack of test coverage, general security issues), unless explicitly required in CLAUDE.md
+- Issues that are called out in CLAUDE.md, but explicitly silenced in the code (eg. due to a lint ignore comment)
+
+Notes:
+
+- Use `gh` to interact with GitHub (eg. to fetch a pull request, or to create inline comments), rather than web fetch
+- Make a todo list first
+- You must cite and link each bug (eg. if referring to a CLAUDE.md, you must link it)
+- For your comment, follow the following format precisely (assuming for this example that you found 3 issues):
+
+---
+
+## Code review
+
+Found 3 issues:
+
+1. <brief description of bug> (CLAUDE.md says "<...>")
+
+<link to file and line with full sha1 + line range for context, eg. https://github.com/anthropics/claude-code/blob/1d54823877c4de72b2316a64032a54afc404e619/README.md#L13-L17>
+
+2. <brief description of bug> (some/other/CLAUDE.md says "<...>")
+
+<link to file and line with full sha1 + line range for context>
+
+3. <brief description of bug> (bug due to <file and code snippet>)
+
+<link to file and line with full sha1 + line range for context>
+
+🤖 Generated with [Claude Code](https://claude.ai/code)
+
+<sub>[ANT-ONLY]</sub>
+<sub>- If this code review was useful, please react with 👍. Otherwise, react with 👎.</sub>
+<sub>- To improve the quality of this code review, update the relevant CLAUDE.md with better guidance or post in #claude-code-feedback on Slack for advice.</sub>
+
+---
+
+- Or, if you found no issues:
+
+---
+
+## Auto code review
+
+No issues found. Checked for bugs and CLAUDE.md compliance.
+
+## 🤖 Generated with [Claude Code](https://claude.ai/code)
+
+- When linking to code, follow the following format precisely, otherwise the Markdown preview won't render correctly: https://github.com/anthropics/claude-cli-internal/blob/c21d3c10bc8e898b7ac1a2d745bdc9bc4e423afe/package.json#L10-L15
+  - Requires full git sha
+  - Repo name must match the repo you're code reviewing
+  - # sign after the file name
+  - Line range format is L[start]-L[end]
+  - Provide at least 1 line of context before and after, centered on the line you are commenting about (eg. if you are commenting about lines 5-6, you should link to `L4-L7`)
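
The linking rules above can be illustrated with a short Python sketch. The function name `build_review_link` and its parameters are hypothetical, and a full SHA is assumed to be 40 hex characters. The end of the range is not clamped to the file length, since that is not known here.

```python
def build_review_link(owner, repo, full_sha, path, first_line, last_line, context=1):
    """Build a permalink with `context` lines before and after the range."""
    if len(full_sha) != 40:
        raise ValueError("expected a full 40-character git SHA")
    if first_line < 1 or last_line < first_line:
        raise ValueError("invalid line range")
    start = max(1, first_line - context)  # never go above line 1
    end = last_line + context
    return (
        f"https://github.com/{owner}/{repo}/blob/{full_sha}/{path}"
        f"#L{start}-L{end}"
    )
```

For a comment about lines 5-6 with one line of context, the fragment produced is `#L4-L7`, which matches the `L[start]-L[end]` format.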