feat(vi/cvi): recover VirtualImage and ClusterVirtualImage from ImageLost when data returns #2564
Open
danilrwx wants to merge 2 commits into
Restart the import process when a Ready image is lost in DVCR, for recoverable data sources (HTTP, ContainerImage, ObjectRef). Upload images stay in ImageLost since their data cannot be re-fetched.
Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
338e9b5 to 9454c18
…n data returns
Instead of re-importing a lost image, poll DVCR while the image is in ImageLost and restore it to Ready once the data reappears (for example, when the DVCR PVC is remounted). No re-download, so upload-sourced images recover too. Replaces the previous restart-import recovery approach.
Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
chore: drop stray comments from image presence handlers
Signed-off-by: Daniil Antoshin <daniil.antoshin@flant.com>
d98ea7f to 3d3798d
Description
Automatically recover VirtualImage and ClusterVirtualImage resources that entered the ImageLost phase (the image is missing in DVCR) without re-importing the data.
The ImagePresenceHandler already moved a Ready image to ImageLost when it disappeared from DVCR. It is now symmetric and also handles the reverse transition:
- While a resource is in ImageLost, the handler keeps polling DVCR (RequeueAfter), so the return of the data is noticed shortly after it happens.
- Once the image is found again, the resource goes back to Ready. Only the phase and the Ready condition are flipped; the rest of the status (Target, Size, Format, …) was never cleared.
- A VirtualImageLostRecovered / ClusterVirtualImageLostRecovered event is emitted on recovery.
Because recovery reuses the blobs already present in DVCR, no re-download happens and every source recovers, including Upload, whose data cannot be re-fetched.
The LifeCycleHandler early-returns on ImageLost as before (no import restart). Handler order (LifeCycleHandler → ImagePresenceHandler) is unchanged and the ImagePVCLost phase is not affected. Healthy Ready images are not polled: loss detection stays event-driven as before, and polling is added only while a resource is lost.
Why do we need it, and what problem does it solve?
When the DVCR backing storage (its PVC) is temporarily lost and later comes back, all images served by DVCR are physically intact; only the registry was unavailable for a while. Today VI/CVI correctly move to ImageLost during the outage, but then stay stuck there even though the blobs returned, forcing users to delete and recreate the resources by hand.
This is especially painful for a mass DVCR outage affecting dozens of images at once. Re-importing is both unnecessary (the data is already there) and impossible for Upload sources. Flipping back to Ready when the data reappears restores the images automatically, with no re-download and no user action.
What is the expected result?
- […] Ready.
- […] ImageLost.
- […] Ready on its own; a VirtualImageLostRecovered / ClusterVirtualImageLostRecovered event is recorded. This works for every source type, including Upload.
- […] ImageLost and DVCR keeps being rechecked periodically.
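The transitions described in this PR can be pictured as a small two-state machine. The sketch below is only an illustration, not the code from the PR: the `Image` struct, the `Registry` interface, `fakeRegistry`, `HandlePresence`, and the 30-second `pollInterval` are all hypothetical names and values chosen for the example, and the requeue delay is modelled as a plain `time.Duration` rather than a controller-runtime result.

```go
package main

import (
	"fmt"
	"time"
)

// Phase is a simplified stand-in for the VI/CVI status phase.
type Phase string

const (
	PhaseReady     Phase = "Ready"
	PhaseImageLost Phase = "ImageLost"
)

// pollInterval is an assumed value; the PR text only says that
// polling is done through RequeueAfter.
const pollInterval = 30 * time.Second

// Image is a hypothetical, trimmed-down view of the resource status.
type Image struct {
	Phase          Phase
	ReadyCondition bool
	Target         string   // never cleared on loss, so recovery can reuse it
	Events         []string // recorded event reasons
}

// Registry is a hypothetical lookup against DVCR.
type Registry interface {
	Exists(target string) bool
}

// fakeRegistry is an in-memory Registry used only for this example.
type fakeRegistry map[string]bool

func (r fakeRegistry) Exists(target string) bool { return r[target] }

// HandlePresence applies one reconcile step and returns the requeue
// delay; zero means "do not requeue".
func HandlePresence(img *Image, reg Registry) time.Duration {
	found := reg.Exists(img.Target)
	switch img.Phase {
	case PhaseReady:
		if found {
			return 0 // healthy images are not polled
		}
		img.Phase = PhaseImageLost
		img.ReadyCondition = false
		return pollInterval // start polling while the image is lost
	case PhaseImageLost:
		if !found {
			return pollInterval // still missing, keep rechecking
		}
		// Data is back: flip only the phase and the Ready condition.
		img.Phase = PhaseReady
		img.ReadyCondition = true
		img.Events = append(img.Events, "VirtualImageLostRecovered")
		return 0
	}
	return 0
}

func main() {
	reg := fakeRegistry{"dvcr/vi/demo": true}
	img := &Image{Phase: PhaseReady, ReadyCondition: true, Target: "dvcr/vi/demo"}

	steps := []bool{true, false, false, true} // image presence in DVCR per step
	for _, present := range steps {
		reg[img.Target] = present
		delay := HandlePresence(img, reg)
		fmt.Println(present, img.Phase, delay, img.Events)
	}
}
```

Running `main` walks one image through a healthy check, a loss, a repeated recheck while the data is still missing, and a recovery. Note that `Target` is never modified, which mirrors the statement that the rest of the status is not cleared when the image is lost.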