Rework pr-reviewer workflow around suggestions#707
Conversation
The pr-reviewer agent now iterates review → implement → re-review on a stacked fix-up PR until a full pass surfaces no actionable findings, rather than producing a single read-only review. Key workflow changes: - Re-resolve the PR's current head at the start of every pass; work against origin/<headRefName> rather than a previously checked-out copy - Triage each finding twice: include in review, and implement as code - One fix-up branch / PR per review engagement, reused across passes: branch `review/<timestamp>-<pr-number>` (UTC YYYYMMDD-HHMMSS), title `<timestamp> Review fixes for #<pr-number>`, base = the PR's head - Inline comments and the review-body summary reference the fix-up PR for findings that were implemented; verdict no longer forces REQUEST_CHANGES when all wrenches are addressed in the fix-up PR - Stop conditions: no new actionable findings (ideal merge candidate), blocked on author, or user says so - Rules forbid pushing or submitting without explicit user approval, targeting `main` from a fix-up PR, or skipping --force-with-lease after a rebase Closes #706
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Automated review by Yesman.
I found one blocking correctness issue in the updated pr-reviewer instructions. The PR adds a stale-head guardrail in Step 1, but Step 2 still says to enumerate changed files with git diff main...HEAD --name-only (.claude/agents/pr-reviewer.md, current line 59). Because that line is not part of the submitted diff, I could not attach this as an inline comment.
This should use the just-fetched PR head, not the reviewer worktree's local HEAD. Otherwise later passes can enumerate files from an old checkout, a fix-up branch, or another branch entirely, causing the agent to miss current PR changes before submitting a review. Please make it consistent with Step 1, for example git diff origin/<baseRefName>...origin/<headRefName> --name-only after fetching the base, or git diff main...origin/<headRefName> --name-only if main is intentionally fixed.
CI is mostly passing with some checks pending at review time.
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Automated review: I reviewed PR #707 against origin/main at head 5ea9211305c040fe8ddce01da70b392aa63db554. The workflow rewrite is generally careful around stale heads, suggestion safety, pending reviews, and JSON review submission, but I found one CI-coverage gap that can cause the reviewer to miss failing Trusted Server gates.
Findings
P1 / High
- Do not limit CI classification to branch-protection-required checks —
.claude/agents/pr-reviewer.md:370- Issue: The new CI collection uses
gh pr checks --requiredand the body template later says to render only checks reported by that required-only query. In this repository, that currently returns onlycargo test,format-typescript,cargo fmt, andformat-docs; it omits checks such asAnalyze (rust),vitest, CodeQL, browser integration tests, and integration tests. - Why it matters:
CLAUDE.mdtreats those broader workflows as PR gates. If one of the omitted checks fails, the reviewer can still report clean CI and potentially approve/comment with no failed-CI finding, hiding a real regression in Trusted Server's Rust/JS/integration validation. - Suggested fix: Query and report all PR checks (for example,
gh pr checks "$NUMBER" --json name,bucket,state,link), then separately mark which are required if branch-protection status matters. At minimum, classify failures/cancellations from the full check set as review findings, while preserving the existing required-check handling for merge-blocking branch protection.
- Issue: The new CI collection uses
CI / Existing Reviews
CI is currently passing. Existing review feedback included an older stale-HEAD concern that appears addressed by the current $DIFF_RANGE / private-ref workflow; I did not duplicate that finding.
…ecks The CI collection used gh pr checks --required, which in this repo returns only the branch-protection-required checks and omits gates like Analyze (rust), vitest, CodeQL, and integration tests. A failure in one of those could be missed, letting the reviewer report clean CI and approve over a real regression. Query the full check set for failure classification, capture the required names separately so failed required checks can still be flagged as merge-blocking, and render every reported check (annotated as required) in the CI Status section.
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Automated review: I reviewed PR #707 at head a7b267c955d45a27a309b23e7bef33b787ae288f. The workflow rewrite is careful overall and the previously noted stale-head and required-only CI issues appear addressed in this revision. I found one non-blocking robustness gap around reusing the agent-owned worktree after an interrupted suggestion-verification pass; no blocking correctness, security, data-loss, authorization, or severe compatibility issues found.
Findings
- P2: Reused reviewer worktrees should clear ignored JS bundles before a new pass — see inline comment.
CI / Existing Reviews
All GitHub checks currently pass. Existing review feedback covered the stale HEAD diff issue and the required-only CI query gap; I did not duplicate those findings.
reset --hard + clean -fd restore tracked files and remove untracked non-ignored files, but leave ignored artefacts behind. A prior invocation interrupted after running node build-all.mjs leaves a stale crates/js/dist/ from the old head or a discarded suggestion; crates/js/build.rs consumes those bundles via include_str!(), so the leftover leaks into the next pass's Rust compile output and can produce misleading verification results. Scope the extra clean to crates/js/dist so the target/ and node_modules/ caches that speed up verification are preserved.
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Automated review: I reviewed PR #707 at head ba8a52113214d0502da7b4e2df822010c7eef059. The revised pr-reviewer workflow is much safer than the previous fix-up-PR approach, and I did not find blocking correctness, security, data-loss, authorization, or severe compatibility issues. I left two medium-severity inline comments for branch-only flow clarity and Markdown suggestion-fence safety.
CI / Existing Reviews
Current GitHub checks are mostly passing; integration tests and integration tests (EdgeZero entry point) were still pending at review time. Existing review feedback on stale HEAD handling, full CI classification, and stale JS bundle cleanup appears addressed; I did not duplicate those findings.
|
@aram356 Check if workflow creates worktree twice |
…acktick collapse Two review-workflow doc fixes: - The step 1 orientation note said step 8 is 'skipped entirely' for BRANCH-* modes, but 8a composes the branch-only artifact and would-have-been verdict; only 8b (GitHub submission) is skipped. An agent reading the earlier sentence could stop without rendering any branch-only findings. Reword to scope the skip to 8b. - Suggestion blocks always used a literal three-backtick fence. When the replacement bytes contain their own three-or-more backtick run (Markdown/docs suggestions with a nested code fence), the inner run closes the outer block early and the one-click suggestion is truncated or malformed. Add a fence-length rule (widen to longest-inner-run + 1, else demote to prose) in step 7a and reference it from the step 6 triage preview and the 7b inline comment body so preview and submission stay identical.
…rectory The worktree guard keyed off directory existence (`[ ! -e "$WT" ]`) while git tracks worktrees by registration. When the two disagree — a registered worktree whose directory was manually removed — the old flow took the first-time branch and ran `git worktree add`, which fails with 'missing but already registered' instead of recreating. Prune dangling registrations first (`--expire now`, which only touches worktrees whose directory is already gone), then decide reuse-vs-create from `git worktree list` (the source of truth). This guarantees exactly one `git worktree add` per path: reuse when a live worktree exists, refuse when an untracked directory sits there, create only when neither does.
Keep the per-PR worktree name (pr-${NUMBER}-review) but normalize $NUMBER to
bare digits, so a '#707' input and a '707' input map to the same worktree
instead of splitting into pr-#707-review and pr-707-review. Combined with the
idempotent setup that reuses rather than re-adds, each PR now gets exactly one
worktree, reused across review passes.
ChristianPavilonis
left a comment
There was a problem hiding this comment.
Automated review: I reviewed PR #707 against origin/main at head db9b3f33d1e87cb620bac5e291d0e8cdad8edb01. The workflow rewrite is detailed and the previously reported stale-head, required-only CI, branch-mode, suggestion-fence, and worktree-registration issues appear addressed. I found one blocking workflow correctness issue: the new JS scratch-verification and stale-bundle cleanup instructions point at a non-existent crates/js/... tree instead of this repository's crates/trusted-server-js/... package, so the reviewer would fail or skip the intended JS checks and leave stale bundles behind.
Findings
P1 / High
- JS suggestion verification uses the wrong package path — see inline comment.
CI / Existing Reviews
gh pr checks 707 currently reports one failing check: prepare integration artifacts. The job failed while Docker was fetching php:8.3-cli-alpine due to a Docker Hub TLS handshake timeout, so I am treating it as CI context rather than attributing it to this docs-only change. Existing automated review feedback on stale HEAD diffing, full-check CI classification, stale JS bundle cleanup, branch-only step 8 handling, suggestion-fence length, and idempotent worktree setup appears addressed; I did not duplicate those resolved comments.
| `crates/js/lib/src/`: | ||
|
|
||
| ```bash | ||
| (cd "$WT/crates/js/lib" && npx vitest run) |
There was a problem hiding this comment.
Automated review: 🔧 wrench — The JS verification commands still use $WT/crates/js/lib, but this repository's JS package is under crates/trusted-server-js/lib (as documented in CLAUDE.md and present in the tree). The same wrong crates/js/... prefix is also used for the lib/src trigger and dist cleanup earlier/later in this file. As written, JS/TS suggestions would cd into a missing directory, fail to detect real TS source edits, and clean a non-existent crates/js/dist instead of the ignored crates/trusted-server-js/dist that crates/trusted-server-js/build.rs consumes. That breaks the scratch-verification and stale-bundle guarantees this workflow is adding.
Suggested fix: replace the crates/js/... references throughout this workflow with the actual package paths: crates/trusted-server-js/lib, crates/trusted-server-js/lib/src/, crates/trusted-server-js/dist, and crates/trusted-server-js/build.rs.
Summary
pr-revieweragent around a single review pass that posts GitHub review comments and user-approvedsuggestionblocks, instead of pushing fixes or opening a stacked fix-up PR.gh pr checks --jsonbuckets, failed required CI becomes a blocking body-level finding, pending reviews are preserved unless the user chooses to delete them, and review payloads are built as strict JSON.Changes
.claude/agents/pr-reviewer.mdgh apireview submission.Closes
Closes #706
Test plan
This is a workflow/documentation-only change under
.claude/agents/; no Rust or JS source is affected.git diff --check -- .claude/agents/pr-reviewer.mdcargo test --workspace- n/acargo clippy --workspace --all-targets --all-features -- -D warnings- n/acargo fmt --all -- --check- n/acd crates/js/lib && npx vitest run- n/acd crates/js/lib && npm run format- n/acd docs && npm run format- n/a, file is outsidedocs/Checklist