feat(seo): static sitemap.xml with git-based lastmod #222
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a400d681f1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
    const pages = source.getPages() as Array<{ url: string; path: string }>;
    const contentPages = pages.map((page) => ({
      url: page.url,
      sourcePaths: [`content/docs/${page.path}`],
Track changelog data as a sitemap source
When /docs/changelog changes because src/lib/changelog-entries.json is regenerated or committed, this mapping still dates the page only from content/docs/changelog/index.mdx. That MDX renders <ChangelogTimeline /> (content/docs/changelog/index.mdx:92), and the component imports the JSON data (src/components/ChangelogTimeline.tsx:2), so changelog updates can ship with a stale <lastmod> until the wrapper MDX file is touched. Add the generated changelog JSON (or other supplemental component data) to the source paths for that page before resolving lastmod.
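One way to act on this suggestion, sketched under assumptions: keep a small map from a page's primary source file to the extra files its rendered output depends on, and merge those into the page's source paths before any date lookup. The names `SitemapPage`, `supplementalSources`, and `withSupplementalSources` are hypothetical and not taken from the PR.

```typescript
// Hypothetical shape for illustration; the PR's real page type may differ.
type SitemapPage = { url: string; sourcePaths: string[] };

// Assumed mapping: primary source file -> extra files the rendered page depends on.
const supplementalSources: Record<string, string[]> = {
  "content/docs/changelog/index.mdx": ["src/lib/changelog-entries.json"],
};

// Returns a copy of the page whose sourcePaths also list any supplemental files.
function withSupplementalSources(
  page: SitemapPage,
  extras: Record<string, string[]>,
): SitemapPage {
  const expanded = new Set<string>(page.sourcePaths);
  for (const path of page.sourcePaths) {
    for (const extra of extras[path] ?? []) {
      expanded.add(extra);
    }
  }
  // A Set keeps insertion order, so the primary source stays first.
  return { url: page.url, sourcePaths: Array.from(expanded) };
}
```

Pages without an entry in the map pass through unchanged, so the mapping only has to list the exceptions.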
Deploying with Cloudflare Workers

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful! View logs | superwall-docs | 622c80a | Jun 26 2026, 12:21 AM |
💡 Codex Review
Reviewed commit: d83f1de65f
    const SUPPLEMENTAL_SOURCES: Record<string, string[]> = {
      // /docs/changelog renders <ChangelogTimeline/>, which imports this generated,
      // committed JSON; changelog regenerations don't touch the wrapper MDX.
      "content/docs/changelog/index.mdx": ["src/lib/changelog-entries.json"],
    };
Track page-tree list dependencies in lastmod sources
When a support article is added, removed, or renamed under a folder rendered by <SupportFolderList> (for example content/docs/support/faq/index.mdx:93), that folder index page changes because the component renders links from the page tree (src/components/SupportFolderList.tsx:23-35). The sitemap generator only expands raw <include>s plus this one supplemental JSON mapping, so those child pages/meta files never contribute to /docs/support/faq (and the other support folder indexes) and their <lastmod> can remain stale even though the rendered page changed. Add page-tree/child dependencies for these list pages before resolving lastmod.
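A minimal sketch of one possible approach, assuming the generator already has a file → date map built from `git log --name-only`: date a folder index page from the newest change to any path under its folder. Because the map is keyed by every path that appears in history, added, deleted, and (as far as I know) renamed children would all contribute. `latestDateUnderFolder` is a hypothetical helper, not part of the PR.

```typescript
// Returns the newest YYYY-MM-DD date recorded for any path in the same folder
// (or a subfolder) as the given index page, or undefined if none is known.
function latestDateUnderFolder(
  indexPath: string, // e.g. "content/docs/support/faq/index.mdx"
  dates: Map<string, string>, // path -> latest commit date (YYYY-MM-DD)
): string | undefined {
  const folder = indexPath.slice(0, indexPath.lastIndexOf("/") + 1);
  let latest: string | undefined;
  for (const [path, date] of Array.from(dates)) {
    // ISO dates sort lexicographically, so string comparison is enough.
    if (path.startsWith(folder) && (latest === undefined || date > latest)) {
      latest = date;
    }
  }
  return latest;
}
```

This over-approximates: a change in a nested subfolder also bumps the parent index. If the list component only renders direct children, the prefix test would need to be tightened accordingly.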
    const pages = source.getPages() as Array<{ url: string; path: string }>;
    const contentPages = pages.map((page) => ({
      url: page.url,
      sourcePaths: [`content/docs/${page.path}`],
Include referenced images in lastmod sources
When a screenshot or other docs image changes without touching the MDX, the rendered page changes but the sitemap date does not: for example content/docs/dashboard/paywalls.mdx:93 renders /images/docs-paywalls-overview.png, which is copied from content/docs/images, yet each page starts with only its MDX path and the expander only follows <include>/supplemental files. Image-only documentation updates will now publish stale <lastmod> values; add referenced image files to each page's source paths before resolving the git date.
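A rough sketch of how image references could be collected, assuming images are referenced as `/images/...` (markdown syntax or a `src` attribute) and that `/images/<name>` is copied from `content/docs/images/<name>` as the comment describes. The helper name and the regular expression are illustrative assumptions; a real implementation might walk the MDX AST instead.

```typescript
// Finds "/images/..." references in markdown image syntax or src attributes and
// maps them to their assumed location in the repo (content/docs/images/...).
function imageSourcePaths(mdx: string): string[] {
  // Matches "](/images/x.png" and src="/images/x.png" (single or double quotes).
  const imageRef = /(?:\]\(|src=["'])(\/images\/[^\s)"']+)/g;
  const found = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = imageRef.exec(mdx)) !== null) {
    found.add(`content/docs${match[1]}`);
  }
  return Array.from(found);
}
```

The returned paths can be appended to a page's source paths so that an image-only commit contributes its date.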
Replace the request-time sitemap route (which stamped every URL with new Date() on each crawl, training Google to ignore <lastmod>) with a build-time static sitemap whose <lastmod> comes from real git history.
- New scripts/generate-sitemap.ts (runs in the build chain): one git log pass for per-file dates, resolves <include> deps into content/shared so shared edits bump the right pages, and serves dist/client/docs/sitemap.xml.
- Shallow clones omit <lastmod> rather than publish one wrong date; git failures degrade gracefully instead of breaking the build.
- src/lib/sitemap.ts refactored to pure, testable, worker-safe helpers.
- Remove runtime route src/routes/sitemap[.]xml.ts (regenerates routeTree).
Two follow-ups from PR review on the sitemap generator:
- Deploy environments (Cloudflare Workers Builds) shallow-clone with no fetch-depth setting, which left the deployed sitemap with no <lastmod>. Detect a shallow clone and deepen it with 'git fetch --unshallow' (anonymous; the repo is public). Falls back to omitting <lastmod> if history still can't be obtained; never fails the build.
- /docs/changelog renders <ChangelogTimeline/>, which imports the committed src/lib/changelog-entries.json. Add that data file as a supplemental source so changelog regenerations bump the page's date.
d83f1de to 622c80a
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 622c80a. Configure here.
        return stdout.trim() === "true";
      } catch {
        return false;
      }
Shallow check errors publish dates
Medium Severity
When git rev-parse --is-shallow-repository fails, isShallowRepository treats the repo as non-shallow, so ensureFullHistory skips deepening and still runs buildGitDateMap. On a shallow or truncated history, that can emit clustered, misleading <lastmod> values instead of omitting them as intended.
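One way to address this, sketched under assumptions: report three states instead of a boolean, so a failed or unrecognized check is treated as "unknown" and dates are withheld rather than trusted. The git invocation is injected as a function so the logic can be tested without a repository; `ShallowState`, `GitRunner`, `detectShallowState`, and `shouldEmitLastmod` are hypothetical names, not the PR's.

```typescript
type ShallowState = "shallow" | "full" | "unknown";

// Assumed abstraction: runs `git <args>` and resolves with stdout, rejecting on failure.
type GitRunner = (args: string[]) => Promise<string>;

async function detectShallowState(runGit: GitRunner): Promise<ShallowState> {
  try {
    const stdout = await runGit(["rev-parse", "--is-shallow-repository"]);
    const value = stdout.trim();
    if (value === "true") return "shallow";
    if (value === "false") return "full";
    // Anything else (for example, a Git build that does not know the flag)
    // is not evidence of full history.
    return "unknown";
  } catch {
    // A failing check must not be read as "not shallow".
    return "unknown";
  }
}

// Dates are only trustworthy with known-full history, or after a successful deepen.
function shouldEmitLastmod(state: ShallowState, deepened: boolean): boolean {
  return state === "full" || (state === "shallow" && deepened);
}
```

With this shape, the "unknown" case takes the same path as a shallow clone that could not be deepened: the sitemap is still written, just without <lastmod>.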


What & why
/docs/sitemap.xml was generated at request time on the Cloudflare Worker and stamped every URL with new Date().toISOString(). That tells Google "every page changed just now" on every crawl, so Google learns to ignore our <lastmod> entirely (it only trusts the field when it's consistently accurate).
This replaces that with accurate, build-time <lastmod> derived from git history, served as a static asset. Verified end-to-end on a Cloudflare preview deploy: 592 URLs, all dated.
Approach
Cloudflare Workers have no filesystem / git at request time, so dates are resolved during bun run build (Node, full repo), the same pattern as the existing generate-static-cache / generate-search-index post-build scripts. The site has no runtime content source (MDX is compiled into the bundle), so the page set is fixed at build time and a live route buys nothing. The generated dist/client/docs/sitemap.xml is served at /docs/sitemap.xml, the same delivery path proven by search-index.json (confirmed in preview: HTTP 200, application/xml).
How dates are computed
- One git log --no-merges --name-only --pretty=format:…%cs pass builds a file → latest-commit-date (YYYY-MM-DD) map (1 subprocess, not ~590).
- Each content page maps to content/docs/<page.path>.
- <include> dependencies are resolved transitively: 107 pages render shared bodies from content/shared/**, so an edit to a shared file bumps every page that includes it.
- /docs/changelog renders <ChangelogTimeline/>, which imports the committed src/lib/changelog-entries.json; that file is added as a supplemental source so changelog regenerations bump the page's date.
- /docs → src/routes/index.tsx; /home (301 → dashboard) → dashboard content.
- Routes such as /ios, /android, … inherit their content source and only get a priority bump (single source of truth).
- When no git date can be resolved, <lastmod> is omitted (never falls back to new Date()).
Robustness: shallow clones self-heal
Deploy environments (Cloudflare Workers Builds) shallow-clone with no fetch-depth setting, which would otherwise leave every page date-less. The generator detects a shallow clone and deepens it with git fetch --unshallow (anonymous; the repo is public). Verified in the Cloudflare build log: ✓ Fetched full git history → 592 urls (592 with <lastmod>). If history still can't be obtained, it omits <lastmod> rather than publish a wrong date, and never fails the build (git errors degrade gracefully).
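The single-pass date map and the omit-when-unknown rule described above could be sketched roughly as follows. This is an illustration, not the PR's code: the description elides the real format string before %cs, so the `@@` marker here is an assumption (as if git were run with `--pretty=format:@@%cs`), and `parseGitLog` and `latestDate` are hypothetical names. The parser works on the captured output, so it makes no claim about how the subprocess is spawned.

```typescript
// Assumed prefix that marks a commit-date line in the git log output.
const DATE_MARKER = "@@";

// Parses `git log --name-only` output into a path -> latest date (YYYY-MM-DD) map.
function parseGitLog(output: string): Map<string, string> {
  const dates = new Map<string, string>();
  let current: string | undefined;
  for (const raw of output.split("\n")) {
    const line = raw.trim();
    if (line === "") continue; // blank separators between commits
    if (line.startsWith(DATE_MARKER)) {
      current = line.slice(DATE_MARKER.length);
      continue;
    }
    if (current === undefined) continue; // file line before any commit header
    const previous = dates.get(line);
    // Keep the maximum; ISO dates compare correctly as strings.
    if (previous === undefined || current > previous) {
      dates.set(line, current);
    }
  }
  return dates;
}

// Newest date across a page's sources; undefined means "omit <lastmod>".
function latestDate(
  sourcePaths: string[],
  dates: Map<string, string>,
): string | undefined {
  let latest: string | undefined;
  for (const path of sourcePaths) {
    const date = dates.get(path);
    if (date !== undefined && (latest === undefined || date > latest)) {
      latest = date;
    }
  }
  return latest;
}
```

Returning `undefined` instead of a fallback date keeps the "never falls back to new Date()" rule visible in the types: the caller has to decide explicitly what to do when no date is known. Paths that themselves begin with the marker would be misread, which is one reason a real format string might use something less likely to collide.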
Changes
- src/lib/sitemap.ts: pure, worker-safe helpers: getSitemapSourceEntries (dedupe + priority merge), attachLastModified (date resolution injected by caller), optional <lastmod>.
- scripts/generate-sitemap.ts: new build-time generator (git dates, include + component data resolution, shallow self-heal, graceful degradation), wired into build.
- Removed src/routes/sitemap[.]xml.ts (+ regenerated routeTree.gen.ts).
- src/lib/seo-routes.test.ts: updated for the new API.
Testing
- bun test: 69 pass.
- … <lastmod>, valid XML (xmllint); /docs/changelog correctly reflects max(wrapper, changelog JSON); include resolution verified.
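For reference, the transitive include resolution mentioned above can be pictured as a graph walk with a cycle guard. This is a sketch under assumptions: how an <include> target is parsed and resolved to a repo path is not shown in the PR text, so it is abstracted behind a hypothetical `includesOf` callback, and `resolveSources` is likewise an illustrative name.

```typescript
// Collects a file plus everything it includes, directly or indirectly.
// `includesOf` is an assumed callback returning the already-resolved repo paths
// that a given file includes.
function resolveSources(
  entry: string,
  includesOf: (path: string) => string[],
): string[] {
  const seen = new Set<string>();
  const stack: string[] = [entry];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    if (seen.has(current)) continue; // guards against include cycles
    seen.add(current);
    for (const target of includesOf(current)) {
      stack.push(target);
    }
  }
  return Array.from(seen);
}
```

Every path in the result is a candidate for the page's date, so an edit to a shared file two levels down still bumps the including page.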
Note on current dates
~586 of ~590 pages currently share 2026-06-23 because of recent bulk commits (#218/#219). That's accurate git history; dates diverge naturally as pages are edited individually.
Notes
/home is a redirecting URL in the sitemap (pre-existing); left as-is.