This explanation is better, because it puts things into perspective, but you don't seem to realize that your 1 and 2 are almost trivial compared to 3 and 4.
At the heart of it are "methods of creating distributed representation of words"; that's where the magic happens. So I'd focus on helping people understand those methods. You should probably also mention subword tokenization methods like BPE, since that's what most models use today.
I've noticed that many educators make this mistake: they spend a lot of time explaining very basic, trivial things, then rush through difficult-to-grasp concepts or details.