Tag: Linear Attention
All posts tagged "Linear Attention".
-
Kimi Linear: An Expressive, Efficient Attention Architecture
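Updated on: Kimi Linear, with fairly detailed experiments and scale-up results. The conclusion that RoPE can be dropped once Linear Attention is in place is a pleasant surprise.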
-
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Updated on: Work from AI Lab on LLM inference acceleration in the "broad" sense, covering Linear Attention, Sparse Attention, Diffusion LLM, Applications, and more.
-
Parallelizing Linear Transformers with the Delta Rule over Sequence Length
Updated on: DeltaNet