I think Google used to be better when it weighted incoming links more heavily than outgoing ones; e.g. these days, searching for a programming problem gets you irrelevant "StackOverflow/GitHub clones". Cosine similarity, obviously, only measures similarity, which can be helpful in contextless scenarios, but the number of duplicates that go undetected in a specific context is absurd.
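
To make the cosine similarity point concrete, here is a minimal sketch over plain bag-of-words counts. The function name `cosine_similarity` and the sample strings are made up for illustration, and this is not a claim about how Google or any real search engine scores pages.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between two texts using raw word counts."""
    va = Counter(a.lower().split())
    vb = Counter(b.lower().split())
    # Dot product over the words the two texts share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical page titles, invented for this example.
original = "how to reverse a list in python"
clone = "how to reverse a list in python solved"
unrelated = "best pizza places near me"

score_clone = cosine_similarity(original, clone)
score_unrelated = cosine_similarity(original, unrelated)
```

The clone scores close to 1 and the unrelated text scores 0, but the number is symmetric: it says the two pages are near-identical and nothing about which one is the original. Deciding that needs context from outside the text, such as who links to whom.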