Skip to content

Rework pr-reviewer workflow around suggestions#707

Open
aram356 wants to merge 13 commits into
mainfrom
pr-reviewer-iterative-workflow
Open

Rework pr-reviewer workflow around suggestions#707
aram356 wants to merge 13 commits into
mainfrom
pr-reviewer-iterative-workflow

Conversation

@aram356

@aram356 aram356 commented May 17, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Reworks the pr-reviewer agent around a single review pass that posts GitHub review comments and user-approved suggestion blocks, instead of pushing fixes or opening a stacked fix-up PR.
  • Adds explicit PR, branch-remote, and branch-local modes with private review refs, merge-base-aware diff handling, branch-only output, and a head re-check before submission.
  • Tightens CI and submission behavior: required checks are classified from gh pr checks --json buckets, failed required CI becomes a blocking body-level finding, pending reviews are preserved unless the user chooses to delete them, and review payloads are built as strict JSON.

Changes

File Change
.claude/agents/pr-reviewer.md Replaces the previous fix-up-PR loop with a one-pass review workflow. Documents mode resolution, worktree setup, full changed-file reads, CI classification, finding triage, GitHub suggestion eligibility, scratch verification, verdict precedence, branch-only rendering, PR head drift handling, pending-review handling, and safe gh api review submission.

Closes

Closes #706

Test plan

This is a workflow/documentation-only change under .claude/agents/; no Rust or JS source is affected.

  • git diff --check -- .claude/agents/pr-reviewer.md
  • Mocked CI capture cases:
    • nonzero exit with JSON -> classify from JSON
    • exit 8 with no JSON -> pending/no-output note
    • nonzero exit with no JSON -> command/API/auth diagnostic
  • cargo test --workspace - n/a
  • cargo clippy --workspace --all-targets --all-features -- -D warnings - n/a
  • cargo fmt --all -- --check - n/a
  • JS tests: cd crates/js/lib && npx vitest run - n/a
  • JS format: cd crates/js/lib && npm run format - n/a
  • Docs format: cd docs && npm run format - n/a, file is outside docs/

Checklist

  • Changes follow CLAUDE.md conventions
  • No production code changed
  • No secrets or credentials committed

The pr-reviewer agent now iterates review → implement → re-review on
a stacked fix-up PR until a full pass surfaces no actionable findings,
rather than producing a single read-only review.

Key workflow changes:
 - Re-resolve the PR's current head at the start of every pass; work
   against origin/<headRefName> rather than a previously checked-out copy
 - Triage each finding twice: include in review, and implement as code
 - One fix-up branch / PR per review engagement, reused across passes:
   branch `review/<timestamp>-<pr-number>` (UTC YYYYMMDD-HHMMSS), title
   `<timestamp> Review fixes for #<pr-number>`, base = the PR's head
 - Inline comments and the review-body summary reference the fix-up PR
   for findings that were implemented; verdict no longer forces
   REQUEST_CHANGES when all wrenches are addressed in the fix-up PR
 - Stop conditions: no new actionable findings (ideal merge candidate),
   blocked on author, or user says so
 - Rules forbid pushing or submitting without explicit user approval,
   targeting `main` from a fix-up PR, or skipping --force-with-lease
   after a rebase

Closes #706
@aram356 aram356 self-assigned this May 17, 2026

@ChristianPavilonis ChristianPavilonis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review by Yesman.

I found one blocking correctness issue in the updated pr-reviewer instructions. The PR adds a stale-head guardrail in Step 1, but Step 2 still says to enumerate changed files with git diff main...HEAD --name-only (.claude/agents/pr-reviewer.md, current line 59). Because that line is not part of the submitted diff, I could not attach this as an inline comment.

This should use the just-fetched PR head, not the reviewer worktree's local HEAD. Otherwise later passes can enumerate files from an old checkout, a fix-up branch, or another branch entirely, causing the agent to miss current PR changes before submitting a review. Please make it consistent with Step 1, for example git diff origin/<baseRefName>...origin/<headRefName> --name-only after fetching the base, or git diff main...origin/<headRefName> --name-only if main is intentionally fixed.

CI is mostly passing with some checks pending at review time.

Comment thread .claude/agents/pr-reviewer.md Outdated
@aram356 aram356 changed the title Make pr-reviewer agent iterate to an ideal merge candidate Rework pr-reviewer workflow around suggestions Jun 9, 2026

@ChristianPavilonis ChristianPavilonis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: I reviewed PR #707 against origin/main at head 5ea9211305c040fe8ddce01da70b392aa63db554. The workflow rewrite is generally careful around stale heads, suggestion safety, pending reviews, and JSON review submission, but I found one CI-coverage gap that can cause the reviewer to miss failing Trusted Server gates.

Findings

P1 / High

  1. Do not limit CI classification to branch-protection-required checks.claude/agents/pr-reviewer.md:370
    • Issue: The new CI collection uses gh pr checks --required and the body template later says to render only checks reported by that required-only query. In this repository, that currently returns only cargo test, format-typescript, cargo fmt, and format-docs; it omits checks such as Analyze (rust), vitest, CodeQL, browser integration tests, and integration tests.
    • Why it matters: CLAUDE.md treats those broader workflows as PR gates. If one of the omitted checks fails, the reviewer can still report clean CI and potentially approve/comment with no failed-CI finding, hiding a real regression in Trusted Server's Rust/JS/integration validation.
    • Suggested fix: Query and report all PR checks (for example, gh pr checks "$NUMBER" --json name,bucket,state,link), then separately mark which are required if branch-protection status matters. At minimum, classify failures/cancellations from the full check set as review findings, while preserving the existing required-check handling for merge-blocking branch protection.

CI / Existing Reviews

CI is currently passing. Existing review feedback included an older stale-HEAD concern that appears addressed by the current $DIFF_RANGE / private-ref workflow; I did not duplicate that finding.

Comment thread .claude/agents/pr-reviewer.md Outdated
aram356 added 2 commits June 15, 2026 15:58
…ecks

The CI collection used gh pr checks --required, which in this repo returns
only the branch-protection-required checks and omits gates like Analyze
(rust), vitest, CodeQL, and integration tests. A failure in one of those
could be missed, letting the reviewer report clean CI and approve over a
real regression.

Query the full check set for failure classification, capture the required
names separately so failed required checks can still be flagged as
merge-blocking, and render every reported check (annotated as required) in
the CI Status section.

@ChristianPavilonis ChristianPavilonis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: I reviewed PR #707 at head a7b267c955d45a27a309b23e7bef33b787ae288f. The workflow rewrite is careful overall and the previously noted stale-head and required-only CI issues appear addressed in this revision. I found one non-blocking robustness gap around reusing the agent-owned worktree after an interrupted suggestion-verification pass; no blocking correctness, security, data-loss, authorization, or severe compatibility issues found.

Findings

  • P2: Reused reviewer worktrees should clear ignored JS bundles before a new pass — see inline comment.

CI / Existing Reviews

All GitHub checks currently pass. Existing review feedback covered the stale HEAD diff issue and the required-only CI query gap; I did not duplicate those findings.

Comment thread .claude/agents/pr-reviewer.md
aram356 added 2 commits June 16, 2026 17:39
reset --hard + clean -fd restore tracked files and remove untracked
non-ignored files, but leave ignored artefacts behind. A prior invocation
interrupted after running node build-all.mjs leaves a stale crates/js/dist/
from the old head or a discarded suggestion; crates/js/build.rs consumes
those bundles via include_str!(), so the leftover leaks into the next pass's
Rust compile output and can produce misleading verification results.

Scope the extra clean to crates/js/dist so the target/ and node_modules/
caches that speed up verification are preserved.

@ChristianPavilonis ChristianPavilonis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: I reviewed PR #707 at head ba8a52113214d0502da7b4e2df822010c7eef059. The revised pr-reviewer workflow is much safer than the previous fix-up-PR approach, and I did not find blocking correctness, security, data-loss, authorization, or severe compatibility issues. I left two medium-severity inline comments for branch-only flow clarity and Markdown suggestion-fence safety.

CI / Existing Reviews

Current GitHub checks are mostly passing; integration tests and integration tests (EdgeZero entry point) were still pending at review time. Existing review feedback on stale HEAD handling, full CI classification, and stale JS bundle cleanup appears addressed; I did not duplicate those findings.

Comment thread .claude/agents/pr-reviewer.md Outdated
Comment thread .claude/agents/pr-reviewer.md
@aram356

aram356 commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator Author

@aram356 Check if workflow creates worktree twice

aram356 added 2 commits June 26, 2026 09:02
…acktick collapse

Two review-workflow doc fixes:

- The step 1 orientation note said step 8 is 'skipped entirely' for BRANCH-*
  modes, but 8a composes the branch-only artifact and would-have-been verdict;
  only 8b (GitHub submission) is skipped. An agent reading the earlier sentence
  could stop without rendering any branch-only findings. Reword to scope the
  skip to 8b.

- Suggestion blocks always used a literal three-backtick fence. When the
  replacement bytes contain their own three-or-more backtick run (Markdown/docs
  suggestions with a nested code fence), the inner run closes the outer block
  early and the one-click suggestion is truncated or malformed. Add a
  fence-length rule (widen to longest-inner-run + 1, else demote to prose) in
  step 7a and reference it from the step 6 triage preview and the 7b inline
  comment body so preview and submission stay identical.
aram356 added 2 commits June 30, 2026 21:28
…rectory

The worktree guard keyed off directory existence (`[ ! -e "$WT" ]`) while
git tracks worktrees by registration. When the two disagree — a registered
worktree whose directory was manually removed — the old flow took the
first-time branch and ran `git worktree add`, which fails with 'missing but
already registered' instead of recreating.

Prune dangling registrations first (`--expire now`, which only touches
worktrees whose directory is already gone), then decide reuse-vs-create from
`git worktree list` (the source of truth). This guarantees exactly one
`git worktree add` per path: reuse when a live worktree exists, refuse when an
untracked directory sits there, create only when neither does.
Keep the per-PR worktree name (pr-${NUMBER}-review) but normalize $NUMBER to
bare digits, so a '#707' input and a '707' input map to the same worktree
instead of splitting into pr-#707-review and pr-707-review. Combined with the
idempotent setup that reuses rather than re-adds, each PR now gets exactly one
worktree, reused across review passes.

@ChristianPavilonis ChristianPavilonis left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: I reviewed PR #707 against origin/main at head db9b3f33d1e87cb620bac5e291d0e8cdad8edb01. The workflow rewrite is detailed and the previously reported stale-head, required-only CI, branch-mode, suggestion-fence, and worktree-registration issues appear addressed. I found one blocking workflow correctness issue: the new JS scratch-verification and stale-bundle cleanup instructions point at a non-existent crates/js/... tree instead of this repository's crates/trusted-server-js/... package, so the reviewer would fail or skip the intended JS checks and leave stale bundles behind.

Findings

P1 / High

  • JS suggestion verification uses the wrong package path — see inline comment.

CI / Existing Reviews

gh pr checks 707 currently reports one failing check: prepare integration artifacts. The job failed while Docker was fetching php:8.3-cli-alpine due to a Docker Hub TLS handshake timeout, so I am treating it as CI context rather than attributing it to this docs-only change. Existing automated review feedback on stale HEAD diffing, full-check CI classification, stale JS bundle cleanup, branch-only step 8 handling, suggestion-fence length, and idempotent worktree setup appears addressed; I did not duplicate those resolved comments.

`crates/js/lib/src/`:

```bash
(cd "$WT/crates/js/lib" && npx vitest run)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review: 🔧 wrench — The JS verification commands still use $WT/crates/js/lib, but this repository's JS package is under crates/trusted-server-js/lib (as documented in CLAUDE.md and present in the tree). The same wrong crates/js/... prefix is also used for the lib/src trigger and dist cleanup earlier/later in this file. As written, JS/TS suggestions would cd into a missing directory, fail to detect real TS source edits, and clean a non-existent crates/js/dist instead of the ignored crates/trusted-server-js/dist that crates/trusted-server-js/build.rs consumes. That breaks the scratch-verification and stale-bundle guarantees this workflow is adding.

Suggested fix: replace the crates/js/... references throughout this workflow with the actual package paths: crates/trusted-server-js/lib, crates/trusted-server-js/lib/src/, crates/trusted-server-js/dist, and crates/trusted-server-js/build.rs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make pr-reviewer agent iterate to an ideal merge candidate

2 participants