cri: replace busy-wait loop with direct UpdateImage in checkpoint restore #12940
veophi wants to merge 1 commit into containerd:main
…tore

During checkpoint restore, the code used a busy-wait loop (up to 500 iterations) to poll LocalResolve() waiting for the CRI image store cache to be updated by an asynchronous ImageCreate event. This was unreliable and the original author noted 'This is probably wrong'.

The root cause is that CRImportCheckpoint creates images via the containerd ImageService directly (bypassing CRI), so the CRI in-memory image store cache only gets updated asynchronously through the event handler. Instead of polling, we can proactively call UpdateImage() to synchronously refresh the CRI cache, which is the same operation the event handler would eventually perform.

This change:
- Removes the busy-wait polling loop
- Calls UpdateImage() to synchronously sync the image into the CRI cache
- Removes the unused imagestore import

Signed-off-by: sunweixiang <sunweixiang@xiaohongshu.com>
/cc @adrianreber
@veophi: GitHub didn't allow me to request PR reviews from the following users: adrianreber. Note that only containerd members and repo collaborators can review this PR, and authors cannot review their own PRs.
@veophi If this works better, then I am all for it. I didn't know how to implement it better when I did it initially. I was looking for a blocking call to download the image and was not able to find one. If your change blocks until the image has been downloaded, then this PR looks better than what was there initially.
During checkpoint restore, the code used a busy-wait loop (up to 500 iterations) to poll LocalResolve() waiting for the CRI image store cache to be updated by an asynchronous ImageCreate event. This was unreliable and the original author noted 'This is probably wrong'.
The root cause is that CRImportCheckpoint creates images via the containerd ImageService directly (bypassing CRI), so the CRI in-memory image store cache only gets updated asynchronously through the event handler. Instead of polling, we can proactively call UpdateImage() to synchronously refresh the CRI cache, which is the same operation the event handler would eventually perform.
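To make the difference between polling and a synchronous refresh concrete, here is a minimal, self-contained sketch. It is a toy model and not containerd code: `backend`, `imageCache`, `restoreImage`, and the method signatures are hypothetical stand-ins for the containerd ImageService and the CRI in-memory image store, and the real `UpdateImage()` and `LocalResolve()` signatures may differ.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("image not found")

// backend is a hypothetical stand-in for the containerd image service,
// the source of truth that the checkpoint import writes to directly.
type backend struct {
	images map[string]string // image reference -> image ID
}

func newBackend() *backend { return &backend{images: map[string]string{}} }

// Create records an image in the backend only; the cache is not touched.
func (b *backend) Create(ref, id string) { b.images[ref] = id }

// imageCache is a hypothetical stand-in for the CRI in-memory image store.
type imageCache struct {
	byRef map[string]string
}

func newImageCache() *imageCache { return &imageCache{byRef: map[string]string{}} }

// LocalResolve consults only the cache, so it misses until the cache is synced.
func (c *imageCache) LocalResolve(ref string) (string, error) {
	id, ok := c.byRef[ref]
	if !ok {
		return "", errNotFound
	}
	return id, nil
}

// UpdateImage synchronously re-reads ref from the backend and refreshes the
// cache entry. In this model it is the same work an event handler would do later.
func (c *imageCache) UpdateImage(b *backend, ref string) error {
	id, ok := b.images[ref]
	if !ok {
		delete(c.byRef, ref)
		return errNotFound
	}
	c.byRef[ref] = id
	return nil
}

// restoreImage imports an image directly into the backend, then syncs the
// cache explicitly instead of polling LocalResolve in a loop.
func restoreImage(b *backend, c *imageCache, ref, id string) (string, error) {
	b.Create(ref, id) // bypasses the cache, like the direct import
	if err := c.UpdateImage(b, ref); err != nil {
		return "", fmt.Errorf("sync cache for %q: %w", ref, err)
	}
	return c.LocalResolve(ref)
}

func main() {
	b, c := newBackend(), newImageCache()
	id, err := restoreImage(b, c, "localhost/checkpoint:latest", "sha256:abc")
	fmt.Println(id, err)
}
```

In this model the lookup after `UpdateImage` cannot miss, because the refresh has completed before `LocalResolve` runs. A polling loop, by contrast, depends on when the asynchronous event happens to be delivered.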
This change: