Tag: LLM
All the articles with the tag "LLM".
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Updated: at 15:06
Published: at 18:32
From IPADS. Uses a model to predict which MoE experts or neurons in an LLM need to be activated, reducing resource consumption.