+1, Andrej is an amazing educator! I'd also recommend his video https://youtu.be/kCc8FmEb1nY?si=mP0cQlQ4rcceL2uP and checking out his GitHub repos. minGPT, for example, implements a small GPT model that's compatible with the HF API, whereas the more modern nanoGPT shows how to use newer features such as flash attention. The quality of every video and every blog post is just so high.