daemon: close EventsService on shutdown#51448
Conversation
On daemon shutdown, the HTTP server tries to gracefully shutdown for 5 seconds. If there's an open API connection to the '/events' endpoint, it fails to do so as nothing interrupts that connection, thus forcing the daemon to wait until that timeout is reached. Add a Close method to the EventsService, and call it during daemon shutdown. It'll close any events channel, signaling to the '/events' handler to return and close the connection. It now takes ~1s (or less) to shutdown the daemon when there's an active '/events' connection, instead of 5. Signed-off-by: Albin Kerouanton <albin.kerouanton@docker.com>
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a bug where the Docker daemon's HTTP server fails to shut down gracefully within its 5-second timeout when there's an active connection to the /events endpoint. The fix adds a Close() method to the EventsService that closes event channels, allowing the API handler to return immediately and close the connection. This reduces shutdown time from 5+ seconds to ~1 second or less.
Key changes:
- Added
Close()method toEventsstruct that closes the publisher - Modified
/eventshandler to detect closed channels and return gracefully - Called
EventsService.Close()during daemon shutdown - Added integration test to verify shutdown timing with active events connection
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| daemon/events/events.go | Adds Close() method to terminate all event subscriber channels |
| daemon/daemon.go | Calls EventsService.Close() during shutdown sequence |
| daemon/server/router/system/system_routes.go | Handles closed event channels by checking ok flag and returning gracefully |
| integration/daemon/daemon_linux_test.go | Adds regression test verifying daemon shuts down in under 5 seconds with active events connection |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| apiClient := d.NewClientT(t) | ||
| // Open a connection to the '/events' endpoint. | ||
| apiClient.Events(ctx, client.EventsListOptions{}) |
There was a problem hiding this comment.
The returned event channel from apiClient.Events() is not being consumed or closed, which could lead to resource leaks. Consider either consuming the events in a goroutine or explicitly closing the channel before stopping the daemon.
| apiClient.Events(ctx, client.EventsListOptions{}) | |
| eventsChan, eventsErrChan := apiClient.Events(ctx, client.EventsListOptions{}) | |
| // Consume the events channel in a goroutine to avoid resource leaks. | |
| go func() { | |
| for range eventsChan { | |
| // Discard events | |
| } | |
| }() | |
| // Optionally, consume errors as well | |
| go func() { | |
| for range eventsErrChan { | |
| // Discard errors | |
| } | |
| }() |
| d.Stop(t) | ||
|
|
||
| dt := time.Since(t0) | ||
| if dt.Seconds() > 5 { |
There was a problem hiding this comment.
[nitpick] Consider using a constant for the 5-second threshold to match the HTTP server's graceful shutdown timeout mentioned in the PR description, improving maintainability if this timeout changes.
|
Thanks! I ran this through some Docker Desktop CI tests: all green; and in my logs I see: Much faster! |
- What I did
On daemon shutdown, the HTTP server tries to gracefully shutdown for 5 seconds. If there's an open API connection to the '/events' endpoint, it fails to do so as nothing interrupts that connection, thus forcing the daemon to wait until that timeout is reached.
Add a Close method to the EventsService, and call it during daemon shutdown. It'll close any events channel, signaling to the '/events' handler to return and close the connection.
It now takes ~1s (or less) to shutdown the daemon when there's an active '/events' connection, instead of 5.
- How to verify it
Within the dev container:
Run
docker system events, killdockerdand observe how much time it takes to shutdown.Before (5s delay):
After (~1s or less):
- Human readable description for the release notes