
Fix coordination mode mismatch for parallel replicas with aggregation in order #98467

Open
alexey-milovidov wants to merge 2 commits into master from fix-parallel-replicas-coordination-mode

@alexey-milovidov
Member

Summary

  • Fix a logical error exception ("Replica decided to read in WithOrder mode, not in ReverseOrder") when using parallel replicas with optimize_aggregation_in_order and parallel_replicas_local_plan.
  • The initiator replica (with 0 parts) used a stale result.read_type (Default) to compute the coordination mode, which incorrectly mapped to ReverseOrder, while remote replicas correctly computed WithOrder from input_order_info->direction.
  • The fix uses query_info.input_order_info->direction directly, which is always consistent regardless of whether range analysis ran.
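
To illustrate why disagreeing replicas surface as a logical error, here is a minimal sketch of a coordinator that fixes the mode on the first announcement and rejects any later announcement with a different mode. `ToyCoordinator`, `handleAnnouncement`, and `toString` are hypothetical names for illustration only, and the assumption that the first announcement decides the mode is a simplification, not the actual ClickHouse coordinator code.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

// Simplified stand-in for the coordination modes named in the error message.
enum class CoordinationMode { Default, WithOrder, ReverseOrder };

// Hypothetical helper: human-readable mode name for the error text.
const char * toString(CoordinationMode mode)
{
    switch (mode)
    {
        case CoordinationMode::Default: return "Default";
        case CoordinationMode::WithOrder: return "WithOrder";
        case CoordinationMode::ReverseOrder: return "ReverseOrder";
    }
    return "Unknown";
}

// Hypothetical toy coordinator: the first announcement sets the mode
// (an assumption), and every later announcement must agree with it.
class ToyCoordinator
{
public:
    void handleAnnouncement(std::size_t replica, CoordinationMode announced)
    {
        if (!mode)
        {
            mode = announced;
            return;
        }
        if (*mode != announced)
            throw std::logic_error(
                "Replica " + std::to_string(replica) + " decided to read in "
                + toString(announced) + " mode, not in " + toString(*mode));
    }

private:
    std::optional<CoordinationMode> mode;
};
```

In this model, if the initiator announces ReverseOrder first and a remote replica then announces WithOrder, the second call throws with a message of the same shape as the one reported in this PR.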

Fixes #95524

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=03c7c3d6f2c843cd5f0f681e5032bf968e97e88a&name_0=MasterCI&name_1=AST%20fuzzer%20%28amd_debug%29

Test plan

  • Existing test 03810_pr_aggr_in_order_read_mode passes with the fix
  • Added test case with parallel_replicas_filter_pushdown = 1 enabled (previously had a TODO to enable after fix)

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix logical error exception "Replica decided to read in WithOrder mode, not in ReverseOrder" when using parallel replicas with optimize_aggregation_in_order.

🤖 Generated with Claude Code

… in order

When the initiator replica has no parts to read, `result.read_type`
remains `Default` (never updated to `InOrder`), but
`query_info.input_order_info` is set with direction=1 by the
`optimizeAggregationInOrder` optimization. The coordination mode
calculation incorrectly used `result.read_type` to determine the mode,
falling into the `ReverseOrder` branch for non-`InOrder` read types.
Meanwhile, the remote replica with actual data correctly computed
`WithOrder` (direction=1), causing the coordinator to throw:
"Replica 0 decided to read in WithOrder mode, not in ReverseOrder."

Fix by using `query_info.input_order_info->direction` directly, which
is always consistent regardless of whether `result.read_type` was
updated.
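
The difference between the two ways of deriving the mode can be sketched as follows. This is a simplified model under stated assumptions, not the code changed by this commit: `ReadType`, `CoordinationMode`, and `InputOrderInfo` are reduced stand-ins, and `modeFromReadType`, `modeFromDirection`, and `demonstrateMismatch` are hypothetical names.

```cpp
#include <cassert>
#include <optional>

// Reduced stand-ins for the types mentioned in the commit message.
enum class ReadType { Default, InOrder, InReverseOrder };
enum class CoordinationMode { Default, WithOrder, ReverseOrder };

struct InputOrderInfo
{
    int direction; // 1 = ascending, -1 = descending
};

// Behaviour described as buggy: any read type other than InOrder
// falls into the ReverseOrder branch, including a stale Default.
CoordinationMode modeFromReadType(const std::optional<InputOrderInfo> & order_info, ReadType read_type)
{
    if (!order_info)
        return CoordinationMode::Default;
    return read_type == ReadType::InOrder ? CoordinationMode::WithOrder
                                          : CoordinationMode::ReverseOrder;
}

// Behaviour described as the fix: rely only on the direction, which is
// set by the optimization whether or not the replica has parts to read.
CoordinationMode modeFromDirection(const std::optional<InputOrderInfo> & order_info)
{
    if (!order_info)
        return CoordinationMode::Default;
    return order_info->direction == 1 ? CoordinationMode::WithOrder
                                      : CoordinationMode::ReverseOrder;
}

// Initiator with no parts: read_type stays Default, direction is 1.
void demonstrateMismatch()
{
    const std::optional<InputOrderInfo> ascending = InputOrderInfo{1};
    // Old derivation on the initiator disagrees with the remote replica.
    assert(modeFromReadType(ascending, ReadType::Default) == CoordinationMode::ReverseOrder);
    assert(modeFromReadType(ascending, ReadType::InOrder) == CoordinationMode::WithOrder);
    // New derivation gives the same answer on both sides.
    assert(modeFromDirection(ascending) == CoordinationMode::WithOrder);
}
```

Because the direction-based derivation has no dependency on `read_type`, a replica with zero parts and a replica with data cannot disagree in this model.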

Also enable `parallel_replicas_filter_pushdown` in the existing test
for this issue.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=03c7c3d6f2c843cd5f0f681e5032bf968e97e88a&name_0=MasterCI&name_1=AST%20fuzzer%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@clickhouse-gh
Contributor

clickhouse-gh bot commented Mar 2, 2026

Workflow [PR], commit [839544f]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Upgrade check (amd_release) | | failure | | |
| | Error message in clickhouse-server.log (see upgrade_error_messages.txt) | FAIL | cidb | |

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Mar 2, 2026
Revert modification to the existing test `03810_pr_aggr_in_order_read_mode`
and add a new test `04009_pr_aggr_in_order_coordination_mode` that covers the
fix with `parallel_replicas_filter_pushdown` enabled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Labels

pr-bugfix Pull request with bugfix, not backported by default

Development

Successfully merging this pull request may close these issues.

PR: Replica 1 decided to read in WithOrder mode, not in ReverseOrder