fix(snapshot): bound S3 parallelGet and make snapshot serving range-aware#356
Draft
worstell wants to merge 1 commit into
Draft
fix(snapshot): bound S3 parallelGet and make snapshot serving range-aware#356worstell wants to merge 1 commit into
worstell wants to merge 1 commit into
Conversation
…ware The S3 full-object download path (parallelGet) let workers read every chunk into memory ahead of a slow consumer, so a single large snapshot served to a slow client could buffer the whole object in the Go heap. Bound it with a window of in-flight chunks (peak now numWorkers x chunkSize). The git snapshot endpoint also ignored client Range headers, so it always served the full object via the unbounded path. Forward Range/If-Range to the cache on the cache-hit path, return 206/Content-Range (416 when not satisfiable), and advertise Accept-Ranges, so clients can fetch with bounded parallel range requests (client.ParallelGet) instead of one full GET. Amp-Thread-ID: https://ampcode.com/threads/T-019ef6fe-55c6-71e8-96dc-ca3ef4301d36 Co-authored-by: Amp <amp@ampcode.com>
3bb7057 to
ad2f09d
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two defense-in-depth fixes for snapshot serving heap pressure, plus the metadata wiring ranged clients need.
The S3 full-object download path (
parallelGet) let workers read every chunk into memory ahead of a slow consumer, so a single large snapshot served to a slow client could buffer the whole object in the Go heap. A live Datadog heap profile during the incident showed(*S3).parallelGet.func1holding ~86% of live Go heap. This bounds in-flight chunks with a window token (acquired before fetch, released once the writer consumes the chunk), so peak memory stays atnumWorkers × chunkSize.The git snapshot endpoint also ignored client
Rangeheaders, so it always served the full object via that unbounded path. It now forwardsRange/If-Rangeto the cache on the cache-hit path, returns206/Content-Range(416when not satisfiable), and advertisesAccept-Ranges: bytes. This lets clients fetch with bounded parallel range requests (client.ParallelGet) instead of one full GET.The snapshot freshen metadata (
X-Cachew-Snapshot-Commit/X-Cachew-Bundle-Url) is now emitted on206responses too, factored into a sharedsetSnapshotMetadataHeadershelper used by both the full and ranged paths. Without it a ranged client would download the tarball but lose the commit/bundle metadata it needs to apply a delta bundle.Tests:
TestS3ParallelGetMultiChunk(bounded-window reassembly),TestSnapshotGetHonorsRange(206/416/Content-Range and matching commit metadata on partial responses, via a non-*os.Filememory cache reader so range support comes from forwardingRangetoOpenrather thanhttp.ServeContent).