
Visualizing the structure of the DeepSeek R1 distilled (Llama-8B) model

Tech | AI, DeepSeek, Large Language Models (LLM), Distillation | Author: NullThought | 2025-03-01

I took a visual look at the structure of the DeepSeek R1 distilled (Llama-8B) model, exported in ONNX format.

The model has 360 layers and 884 op nodes.

[Figure: structure of the DeepSeek R1 distilled (Llama-8B) model]
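For anyone who wants to check the op node count without a graphical viewer, here is a minimal sketch in Python. It assumes the `onnx` package is installed and that the model is a single ONNX file. The function names and the `model.onnx` path used below are placeholders of mine, not anything from the original export. It only counts top-level graph nodes, so nodes nested inside control-flow subgraphs would not be included, and it makes no attempt to reproduce the "360 layers" figure, since how a viewer groups nodes into layers is not something I can state here.

```python
from collections import Counter
from typing import Iterable, List


def summarize_ops(op_types: Iterable[str]) -> Counter:
    """Tally how many times each operator type occurs."""
    return Counter(op_types)


def load_op_types(path: str) -> List[str]:
    """Return the op_type of every top-level graph node in an ONNX file."""
    # Imported lazily so summarize_ops stays usable without the onnx package.
    import onnx

    # load_external_data=False skips external weight files; only the graph
    # topology is needed to count nodes.
    model = onnx.load(path, load_external_data=False)
    return [node.op_type for node in model.graph.node]


def format_report(counts: Counter, top: int = 10) -> str:
    """Render the total node count and the most frequent operator types."""
    lines = [f"total op nodes: {sum(counts.values())}"]
    for op, n in counts.most_common(top):
        lines.append(f"{op}: {n}")
    return "\n".join(lines)
```

Usage would look like `print(format_report(summarize_ops(load_op_types("model.onnx"))))`, with the path replaced by wherever the exported model actually lives.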
