Most of the techniques for this sort of thing don't work well on sparse datasets. So given a choice between showing bad results for users with relatively few comments and not showing any results ... especially when users can search not just for themselves but also for folks they know and enjoy ... Also, scraping the full history of HN is not cool.
A while ago, pg posted an archive of HN comments, specifically so that nobody would have to scrape them. (It was coupled with comments on how the Arc server was holding up, IIRC.) I haven't had any luck searching for it, though: there are too many discussions about the ethics of scraping comments, archive file formats, and the like.
I remember that too. It was, as you say, a while ago, so it's long outdated by now. And I do recall it being posted by pg, but all I can find is this non-pg release with all its links broken: http://news.ycombinator.com/item?id=173045. I looked for it in all the links to tar and zip archives, but could've missed something.