Tag: Linear Attention
All the articles with the tag "Linear Attention".
Kimi Linear: An Expressive, Efficient Attention Architecture
Updated: at 19:10 · Published: at 13:55
Kimi Linear, with fairly detailed experiments and scale-up results. The finding that Linear Attention makes it possible to drop RoPE is a pleasant surprise.
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Updated: at 15:15 · Published: at 17:05
Work from AI Lab on LLM inference acceleration in a broad sense, covering Linear Attention, Sparse Attention, Diffusion LLMs, applications, and more.
Parallelizing Linear Transformers with the Delta Rule over Sequence Length
Updated: at 16:46 · Published: at 14:43
DeltaNet
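For context on the DeltaNet entry above: the delta rule treats the linear-attention state as a fast-weight matrix updated by an online error-correction step. Below is a minimal sequential NumPy sketch of that recurrence; it is only an illustration, not the chunkwise-parallel algorithm that the paper actually proposes, and the shapes and the per-token `beta` gate are illustrative assumptions.

```python
# Minimal sequential sketch of a delta-rule linear-attention recurrence
# (DeltaNet-style). Illustrative only; the paper's contribution is a
# parallel-over-sequence-length reformulation, which is not shown here.
import numpy as np

def delta_rule_attention(q, k, v, beta):
    """q, k: (T, d_k); v: (T, d_v); beta: (T,) per-token step sizes in [0, 1]."""
    T, d_k = k.shape
    d_v = v.shape[1]
    S = np.zeros((d_v, d_k))           # fast-weight state (one matrix per head)
    outputs = np.zeros((T, d_v))
    for t in range(T):
        pred = S @ k[t]                # value currently stored under key k_t
        # Delta rule: correct the stored value toward v_t by step beta_t.
        S = S + beta[t] * np.outer(v[t] - pred, k[t])
        outputs[t] = S @ q[t]          # linear-attention readout
    return outputs

# Toy usage with random inputs
T, d_k, d_v = 8, 4, 4
rng = np.random.default_rng(0)
out = delta_rule_attention(rng.normal(size=(T, d_k)),
                           rng.normal(size=(T, d_k)),
                           rng.normal(size=(T, d_v)),
                           rng.uniform(0.0, 1.0, size=T))
print(out.shape)  # (8, 4)
```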