Took a visual look at the structure of the DeepSeek R1 distilled (Llama-8B) model in ONNX format. The model has 360 layers and 884 op nodes.
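For anyone who wants to cross-check the node count without a viewer, here is a minimal sketch using the `onnx` Python package. The helper names and the file path in the usage comment are hypothetical. It counts only top-level graph nodes, so the totals may not match what a visualizer reports as "layers".

```python
from collections import Counter


def summarize_graph(graph):
    """Return (total_nodes, per-op-type counts) for an ONNX GraphProto-like object.

    Only top-level nodes are counted; nodes inside If/Loop subgraphs are ignored.
    """
    counts = Counter(node.op_type for node in graph.node)
    return sum(counts.values()), counts


def summarize_file(path):
    """Load an ONNX file and summarize its graph (requires `pip install onnx`)."""
    import onnx  # third-party; imported here so summarize_graph works without it

    # Skip external weight files: only the graph structure is needed for counting.
    model = onnx.load(path, load_external_data=False)
    return summarize_graph(model.graph)


# Usage (the path is a placeholder, not the actual file name):
#   total, counts = summarize_file("model.onnx")
#   print(total, counts.most_common(5))
```

Loading with `load_external_data=False` keeps memory use low, since an 8B model's weights are usually stored outside the `.onnx` file.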