Transformer模型的一些学习参考资料

IT, Tech | AI Transformer模型谷歌 | 作者： NullThought | 2023-05-03 | 发表评论

1. 第一当然是论文Attention Is All You Need了。经典论文，必看。

2.李沐博士对论文Attention Is All You Need的讲解视频：Transformer论文逐段精读。

3.《动手学深度学习》，李沐博士也是作者之一。

4.AI大神Andrej Karpathy手把手讲解一个nanoGPT代码：Let’s build GPT: from scratch, in code, spelled out.
代码所在的Google Colab文件：https://colab.research.google.com/drive/1JMLa53HDuA-i7ZBmqV7ZnA3c_fvtXnx-?usp=sharing
Andrej Karpathy对Attention的解释很经典：’Attention is a communication mechanism. Can be seen as nodes in a directed graph looking at each other and aggregating information with a weighted sum from all nodes that point to them, with data-dependent weights.‘

5.Tensorflow官网上的Transformer教程。

6. Drawing the Transformer Network from Scratch，作者Thomas Kurbiel。可视化动态展示。

7.Transformers Explained Visually，作者Ketan Doshi。这一系列四篇文章对Transformer的解释很到位，推荐指数5颗星。

发表评论取消回复