artificialfintelligence / xformers_w_attn Public

Building Transformer Models with Attention: Implementation from Scratch in TensorFlow Keras

Repository contents (83 commits):

playground
xformer
.gitignore
MANIFEST.in
README.md
requirements.txt
setup.py

Building Transformer Models with Attention

Implementation from Scratch in TensorFlow Keras

I'm following this book to teach myself the transformer architecture in depth.

Some excellent resources I've come across along the way:

Illustrated Guide to Transformers Neural Network: A step by step explanation - by Michael Phi (@LearnedVector)
Let's build GPT: from scratch, in code, spelled out. - by the legendary Andrej Karpathy (@karpathy)
Transformers from Scratch - by Peter Bloem (@pbloem)
Lil'Log > The Transformer Family Version 2.0 - by Lilian Weng (@lilianweng)
The Illustrated Transformer - by Jay Alammar (@jalammar)
Transformer Architecture: The Positional Encoding - by Amirhossein Kazemnejad (@kazemnejad)
Dive into Deep Learning > Attention Mechanisms and Transformers
Harvard NLP > The Annotated Transformer
Towards Data Science > Transformers Explained Visually: Part 1, Part 2, Part 3 and Part 4 - by Ketan Doshi
Lecture 12 of the "Deep Learning at the Vrije Universiteit Amsterdam" (DLVU) Series - by Peter Bloem (@pbloem)
Natural Language Processing in Action Using Transformers in TensorFlow 2.0 - by Aurélien Géron (@ageron)
TensorFlow Tutorials > Neural machine translation with a Transformer and Keras
