
improved flux attention qkv unpacking #1306

Open
bssrdf wants to merge 1 commit into leejet:master from bssrdf:improve-flux-attn-qkv

Conversation

@bssrdf (Contributor) commented Mar 1, 2026

This PR improves performance a bit for FLUX models by getting rid of some ggml_cont ops during attention qkv unpacking.

RTX 4090

| FLUX.2 Klein 4B (CFG 1, 4 steps, bf16) | master | This PR |
|---|---|---|
| 512x512 | 7.8 it/s | 8.2 it/s |
| 1024x1024 | 2.5 it/s | 2.57 it/s |
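The idea behind the change can be sketched outside of ggml: when q, k, and v are packed into one tensor, each of them can be extracted as a strided view instead of forcing a contiguous copy (which is what `ggml_cont` does) before the attention kernel consumes them. The sketch below is illustrative only, assuming a NumPy framing with hypothetical function names; the actual PR operates on the C++/ggml compute graph and the real layouts may differ.

```python
import numpy as np

def unpack_qkv_copy(qkv, num_heads):
    # Baseline-style unpacking: each split is forced into its own
    # contiguous buffer (the role ggml_cont plays), costing a copy.
    n_tokens, packed_dim = qkv.shape
    head_dim = packed_dim // (3 * num_heads)
    q, k, v = np.split(qkv, 3, axis=-1)
    # .copy() materializes a contiguous buffer before reshaping
    out = lambda t: t.copy().reshape(n_tokens, num_heads, head_dim)
    return out(q), out(k), out(v)

def unpack_qkv_view(qkv, num_heads):
    # Improved-style unpacking: slicing the packed tensor yields
    # strided views into the original buffer; no data is moved, and
    # an attention kernel that accepts non-contiguous inputs can
    # consume these directly.
    n_tokens, packed_dim = qkv.shape
    head_dim = packed_dim // (3 * num_heads)
    d = num_heads * head_dim
    q = qkv[:, 0 * d : 1 * d].reshape(n_tokens, num_heads, head_dim)
    k = qkv[:, 1 * d : 2 * d].reshape(n_tokens, num_heads, head_dim)
    v = qkv[:, 2 * d : 3 * d].reshape(n_tokens, num_heads, head_dim)
    return q, k, v

qkv = np.arange(4 * 3 * 2 * 8, dtype=np.float32).reshape(4, 48)
q_c, k_c, v_c = unpack_qkv_copy(qkv, num_heads=2)
q_v, k_v, v_v = unpack_qkv_view(qkv, num_heads=2)

# Both paths produce identical values...
assert all(np.array_equal(a, b) for a, b in
           [(q_c, q_v), (k_c, k_v), (v_c, v_v)])
# ...but only the view path still aliases the packed buffer.
assert np.shares_memory(qkv, q_v) and not np.shares_memory(qkv, q_c)
```

Whether the copy can be dropped depends on the backend attention kernel tolerating the strided layout, which is why the speedup shows up as a modest per-step gain rather than a large one.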

bssrdf changed the title from "improved flux attention speed by removing cont op for qkv" to "improved flux attention qkv unpacking by removing cont op" on Mar 1, 2026
bssrdf changed the title from "improved flux attention qkv unpacking by removing cont op" to "improved flux attention qkv unpacking" on Mar 1, 2026
