
ffpa-attn-mma

Public

FFPA (Split-D): Yet another Faster Flash Prefill Attention with O(1) GPU SRAM complexity for headdim > 256, ~2x speedup vs. SDPA EA.
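One way to read "Split-D" and the O(1) SRAM claim is that the attention scores are accumulated over fixed-size slices of the head dimension, so the working set per step depends on the slice width rather than on headdim. The following is a minimal sketch of that idea in plain Python, not the project's CUDA kernel; the function names `scores_full` and `scores_split_d` and the `chunk` parameter are hypothetical and chosen only for illustration.

```python
import random


def scores_full(q, k):
    """Reference Q @ K^T computed over the whole head dimension at once."""
    return [[sum(qi[t] * kj[t] for t in range(len(qi))) for kj in k] for qi in q]


def scores_split_d(q, k, chunk=64):
    """Q @ K^T accumulated over slices of the head dimension.

    Each step only touches `chunk` columns of q and k, so the per-step
    working set does not grow with the head dimension (assumed reading
    of "Split-D"; hypothetical helper, not the project's API).
    """
    d = len(q[0])
    acc = [[0.0] * len(k) for _ in q]
    for start in range(0, d, chunk):
        stop = min(start + chunk, d)
        for i, qi in enumerate(q):
            for j, kj in enumerate(k):
                acc[i][j] += sum(qi[t] * kj[t] for t in range(start, stop))
    return acc


random.seed(0)
d = 320  # a head dimension above 256, as in the description
q = [[random.gauss(0, 1) for _ in range(d)] for _ in range(4)]
k = [[random.gauss(0, 1) for _ in range(d)] for _ in range(6)]

full = scores_full(q, k)
split = scores_split_d(q, k, chunk=64)
max_abs_err = max(
    abs(a - b) for row_a, row_b in zip(full, split) for a, b in zip(row_a, row_b)
)
```

The two results agree up to floating-point rounding, which is why slicing the head dimension can bound on-chip memory without changing the computed scores.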

Created: 2024-11-29T19:47:23
Updated: 2025-03-27T11:33:38
https://zhuanlan.zhihu.com/p/13975660308
Stars: 160
Stars increase: 2

Related projects