libn/d/overlay: calculate SPI like older engines #51951
Merged
vvoland merged 1 commit into moby:master on Jan 29, 2026
The Security Parameter Index value signals to the recipient which key to decrypt the packet with. The overlay driver derives the SPI value for a flow from a hash digest of the source and destination IP addresses. The source and destination need to derive the same digest given the same information as the SPI values are not signaled over the overlay driver's control plane.

Refactoring the overlay driver to use netip types accidentally changed the hash function to digest IPv4 addresses in 4-byte form, causing newer engines to calculate a different SPI value for a flow than older engines would. Restore the original calculation by hashing IPv4 addresses in their 16-byte form, and refactor the buildSPI function to take netip.Addr parameters to prevent 16-byte vs 4-byte mixups from being possible in the future.

Signed-off-by: Cory Snider <csnider@mirantis.com>
7b430df to 51664a2
GHA oddness;
thaJeztah approved these changes on Jan 28, 2026
vvoland approved these changes on Jan 29, 2026
- What I did
The Security Parameter Index value signals to the recipient which key to decrypt the packet with. The overlay driver derives the SPI value for a flow from a hash digest of the source and destination IP addresses. The source and destination need to derive the same digest given the same information as the SPI values are not signaled over the overlay driver's control plane. Refactoring the overlay driver to use netip types accidentally changed the hash function to digest IPv4 addresses in 4-byte form, causing newer engines to calculate a different SPI value for a flow than older engines would. Restore the original calculation by hashing IPv4 addresses in their 16-byte form, and refactor the buildSPI function to take netip.Addr parameters to prevent 16-byte vs 4-byte mixups from being possible in the future.

- How I did it
While it would be straightforward to get the receiving side to decrypt packets tagged with both possible SPI values concurrently by programming two sets of states into the kernel, there is no easy solution for the sending side. The sender would need to know which algorithm each recipient is using to calculate its SPIs so that it can pick the same SPI to program the kernel to transmit with. Or it could transmit every packet twice.

Discovering each recipient's algorithm may be possible, e.g. by looking up the Engine version in Swarm's node inventory, but it would be a significant amount of work. Since we recommend against running Swarm clusters with mixed engine versions and only provide best-effort support, YAGNI. Mixed-version clusters should be a transient condition which only occurs during a rolling upgrade, so whatever heroics would be needed to get the latest engine to pass encrypted overlay traffic with engine versions that use the bugged SPI calculation would only be active for that one maintenance window, where degraded availability is already expected.

- How to verify it
- Human readable description for the release notes
- Fix encrypted overlay networks not passing traffic to containers on v28 and older Engines. Encrypted overlay networks will no longer pass traffic to containers on v29.2.0 through v29.0.0, v28.2.2, v25.0.14 or v25.0.13.

- A picture of a cute animal (not mandatory but encouraged)