Tag: Linear Attention
All the articles with the tag "Linear Attention".
Kimi Linear: An Expressive, Efficient Attention Architecture
Updated: at 19:10 · Published: at 13:55
Kimi Linear, with fairly detailed experiments and scale-up results. The finding that Linear Attention makes it possible to drop RoPE is a pleasant surprise.
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Updated: at 15:15 · Published: at 17:05
Work from AI Lab on LLM inference acceleration in a broad sense, covering Linear Attention, Sparse Attention, Diffusion LLMs, applications, and more.
Parallelizing Linear Transformers with the Delta Rule over Sequence Length
Updated: at 16:46 · Published: at 14:43
DeltaNet
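For context on the DeltaNet entry above: the delta rule treats the linear-attention state as a fast-weight matrix updated by an online error-correction step. Below is a minimal sequential NumPy sketch of that recurrence; it is only an illustration, not the chunkwise-parallel algorithm that the paper actually proposes, and the shapes and the per-token `beta` gate are illustrative assumptions.

```python
# Minimal sequential sketch of a delta-rule linear-attention recurrence
# (DeltaNet-style). Illustrative only; the paper's contribution is a
# parallel-over-sequence-length reformulation, which is not shown here.
import numpy as np

def delta_rule_attention(q, k, v, beta):
    """q, k: (T, d_k); v: (T, d_v); beta: (T,) per-token step sizes in [0, 1]."""
    T, d_k = k.shape
    d_v = v.shape[1]
    S = np.zeros((d_v, d_k))           # fast-weight state (one matrix per head)
    outputs = np.zeros((T, d_v))
    for t in range(T):
        pred = S @ k[t]                # value currently stored under key k_t
        # Delta rule: correct the stored value toward v_t by step beta_t.
        S = S + beta[t] * np.outer(v[t] - pred, k[t])
        outputs[t] = S @ q[t]          # linear-attention readout
    return outputs

# Toy usage with random inputs
T, d_k, d_v = 8, 4, 4
rng = np.random.default_rng(0)
out = delta_rule_attention(rng.normal(size=(T, d_k)),
                           rng.normal(size=(T, d_k)),
                           rng.normal(size=(T, d_v)),
                           rng.uniform(0.0, 1.0, size=T))
print(out.shape)  # (8, 4)
```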