fix: decrypt copy source per-frame across internal parts (InvalidTag on large multipart COPY) #104
Merged
_iter_multipart_plaintext decrypted each whole client part as a single AES-GCM seal. A client part expands into multiple internal parts (separate S3 parts), each itself a sequence of independent frames, so any source whose parts hold more than one frame (e.g. internal parts >8MB) failed to COPY with cryptography.exceptions.InvalidTag. Walk internal_parts -> frames and decrypt one frame at a time, matching the GET reader. Also bounds copy source-read peak memory to O(frame) instead of a whole client part.
Problem
Server-side COPY of large multipart-encrypted objects fails with cryptography.exceptions.InvalidTag. Observed in production on oceanio-dc2-scylladb-backups-v3 (1000MB SSTable backups): every copy of a multipart source whose internal parts exceed 8MB fails, and the pods OOMKill under the load.
Root cause
_iter_multipart_plaintext (the copy source reader) decrypted each whole client part as a single AES-GCM seal. But a client part is not one seal. It expands into multiple internal parts (separate S3 parts), and each internal part is a sequence of independent frames (nonce||ct||tag, FRAME_PLAINTEXT_SIZE = 8MB each). The failing request fetched a ~50MB client part (ciphertext_size: 52429136, 6 internal parts of 8.33MB, each >8MB → 2 frames) and authenticated it as one GCM message → InvalidTag.
The GET path was already fixed for this layout (get.py::_stream_internal_parts); the copy source-read path through _iter_multipart_plaintext was never updated. Existing tests missed it because they only build single-frame source parts (exactly 8MB, internal_parts=[]).
Fix
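As a rough illustration of the layout the reader has to follow, the sketch below recomputes the frame sizes for the failing request. It is not the project's code: frame_ciphertext_sizes and split_evenly are hypothetical helpers, the 12-byte nonce and 16-byte tag are assumed (the usual AES-GCM sizes), and the client part is assumed to be exactly 50MB of plaintext split as evenly as possible across 6 internal parts. Under those assumptions the total comes out to the ciphertext_size quoted above.

```python
FRAME_PLAINTEXT_SIZE = 8 * 1024 * 1024  # 8MB, as stated in the description
NONCE_SIZE = 12  # assumption: standard AES-GCM nonce length
TAG_SIZE = 16    # assumption: standard AES-GCM tag length


def frame_ciphertext_sizes(plaintext_size: int) -> list[int]:
    """Hypothetical helper: ciphertext size of each frame in one internal part."""
    sizes = []
    remaining = plaintext_size
    while remaining > 0:
        chunk = min(remaining, FRAME_PLAINTEXT_SIZE)
        # Each frame is nonce || ct || tag, so it carries its own overhead.
        sizes.append(NONCE_SIZE + chunk + TAG_SIZE)
        remaining -= chunk
    return sizes


def split_evenly(total: int, parts: int) -> list[int]:
    """Hypothetical helper: split `total` bytes into `parts` near-equal sizes."""
    base, extra = divmod(total, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


client_part_plaintext = 50 * 1024 * 1024  # assumption: exactly 50MB
internal_parts = split_evenly(client_part_plaintext, 6)
layout = [frame_ciphertext_sizes(size) for size in internal_parts]

frames_per_part = [len(frames) for frames in layout]
total_ciphertext = sum(sum(frames) for frames in layout)

print(frames_per_part)   # [2, 2, 2, 2, 2, 2]
print(total_ciphertext)  # 52429136
```

Twelve frames at 28 bytes of overhead each account for the 336 bytes between the plaintext and ciphertext sizes, which is why authenticating the whole client part as one GCM message cannot succeed.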
Walk internal_parts → frames (via crypto.ciphertext_frame_byte_sizes) and decrypt one frame at a time, with range trimming preserved. Legacy single-seal parts are the 1-frame case and read through unchanged. Bonus: peak memory for copy source reads drops from a whole ~50MB client part to O(frame) = 8MB, relieving the OOM/MEMORY_BACKPRESSURE symptom on these copies.
Verification
TestIterMultipartPlaintextFramed reproduces the exact production InvalidTag against the old reader (fails pre-fix, passes post-fix), plus a range round-trip.
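The per-frame walk with range trimming described under Fix might look roughly like the sketch below. It is a simplified illustration, not the actual _iter_multipart_plaintext: iter_range_plaintext is a hypothetical name, decryption is injected as a callable, and the demo uses a placeholder that only strips an assumed 12-byte nonce and 16-byte tag instead of doing real AES-GCM. For simplicity it also decrypts frames that lie before the requested range, which a real reader would likely skip.

```python
from typing import Callable, Iterable, Iterator


def iter_range_plaintext(
    frames: Iterable[bytes],
    decrypt_frame: Callable[[bytes], bytes],
    start: int,
    end: int,
) -> Iterator[bytes]:
    """Yield plaintext bytes in [start, end), holding one frame at a time."""
    offset = 0  # plaintext offset of the current frame
    for frame in frames:
        if offset >= end:
            break  # everything requested has been yielded
        plaintext = decrypt_frame(frame)  # one independent seal per frame
        frame_start, frame_end = offset, offset + len(plaintext)
        offset = frame_end
        if frame_end <= start:
            continue  # frame lies entirely before the range
        lo = max(start, frame_start) - frame_start
        hi = min(end, frame_end) - frame_start
        yield plaintext[lo:hi]


def fake_decrypt(frame: bytes) -> bytes:
    """Placeholder only: strips nonce and tag, performs no authentication."""
    return frame[12:-16]


def fake_frame(plaintext: bytes) -> bytes:
    """Placeholder frame in the nonce || ct || tag shape."""
    return b"N" * 12 + plaintext + b"T" * 16


frames = [fake_frame(b"abcd"), fake_frame(b"efgh"), fake_frame(b"ij")]
result = b"".join(iter_range_plaintext(frames, fake_decrypt, 2, 9))
print(result)  # b'cdefghi'
```

Because each yielded chunk is at most one frame of plaintext, peak memory stays bounded by the frame size rather than by the client part size.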