> I had heard specifically that word vectors weren't a game-changer for document classification, because the averaging method didn't work well.
As with anything, your mileage may vary.
One aspect of FastText that definitely helped in my case was its n-gram support (both word and character n-grams, tunable via command-line arguments). My corpus consists of short sentence fragments with misspelled words, incorrect grammar, etc., and my test set contains out-of-vocabulary words.
n-grams are more robust to these than Word2Vec, which uses a static vocabulary.
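To make the mechanism concrete, here's a minimal, self-contained sketch of the character n-gram idea (illustrative only, not the real FastText library, which learns its n-gram embeddings rather than hashing them): a word's vector is built from the vectors of its character n-grams, so a misspelled or unseen word still gets a sensible vector as long as it shares n-grams with known words.

```python
# Illustrative sketch of FastText-style character n-grams.
# ngram_vec is a hypothetical stand-in for learned n-gram embeddings.
import hashlib

DIM = 8  # toy embedding dimension

def ngrams(word, n_min=3, n_max=6):
    # FastText wraps words in boundary markers before extracting n-grams
    w = f"<{word}>"
    return [w[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def ngram_vec(gram):
    # Deterministic pseudo-random vector per n-gram (a stand-in for
    # the embeddings FastText would actually learn during training)
    h = hashlib.md5(gram.encode()).digest()
    return [b / 255.0 - 0.5 for b in h[:DIM]]

def word_vec(word):
    # A word's vector is the average of its n-gram vectors, so even an
    # out-of-vocabulary word gets a vector from the n-grams it shares
    # with training words
    vecs = [ngram_vec(g) for g in ngrams(word)]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

# A typo shares most of its character n-grams with the correct spelling,
# which is what keeps their vectors close
shared = set(ngrams("misspelled")) & set(ngrams("misspeled"))
print(f"shared n-grams between 'misspelled' and 'misspeled': {len(shared)}")
print(f"vector for unseen word: {word_vec('misspeled')[:3]}...")
```

By contrast, a plain Word2Vec model has one row per vocabulary word, so "misspeled" would simply have no entry at all.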