
I would agree, but on the other hand, Transformer (or really attention-based) models seem to be the first time that computers are generating ad hoc human text on a variety of topics, so I do believe the hype is justified. I mean... people have spent entire careers in pursuit of this goal, and it's here... as long as what you want to talk about fits in a 4096-token (or some other few-thousand-token) context window.

Given how little progress (relatively) was made until transformers, it seems totally reasonable to pursue attention models.
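To make the context-window point concrete, here's a minimal sketch of checking whether some text fits under that kind of limit, assuming the tiktoken library and its cl100k_base encoding (the one used by gpt-3.5-era models); the 4096 figure is just the number mentioned above, not a universal constant.

  import tiktoken

  CONTEXT_LIMIT = 4096  # the window size mentioned above; varies by model

  enc = tiktoken.get_encoding("cl100k_base")

  def fits_in_context(text: str, limit: int = CONTEXT_LIMIT) -> bool:
      # Count BPE tokens and compare against the window size.
      return len(enc.encode(text)) <= limit

  print(fits_in_context("Attention is all you need. " * 100))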



And as long as it is English. How well do they work for other languages with large corpuses?

I know they suck for Serbian, but I wonder what kind of corpus they need to become useful?


FWIW, I talk to GPT-3.5 in Spanish frequently, and I don't hit any problems that don't also exist in the English version.


Interesting: I do wonder about slightly more complex languages that have declensions and gendered verb forms (e.g. in Serbian "pevala" means "(a female) sang", whereas "pevao" means that a male did). And nouns and adjectives decline across 7 cases: "plavom olovkom" means "with a blue pen", whereas "a blue pen" is just "plava olovka".

ChatGPT always mixes these up and hallucinates a bunch of words (with inappropriate prefixes, declensions, etc.), and it is very happy to explain the meaning of these imaginary words. I can imagine smaller, more complex languages like Serbian needing even larger corpora than English, yet that's exactly the hard part: there is simply less content to go off of.
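As a rough illustration of one piece of this (purely a tokenization sketch, assuming tiktoken's cl100k_base BPE; exact counts vary by tokenizer version), you can compare how many tokens the same phrase costs in English vs Serbian. Inflected Serbian forms tend to get split into more, less meaningful pieces, so the model gets less signal per word on top of there simply being less text overall.

  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")

  # Compare token counts for equivalent English and Serbian phrases.
  for phrase in ["with a blue pen", "plavom olovkom", "a blue pen", "plava olovka"]:
      tokens = enc.encode(phrase)
      pieces = [enc.decode([t]) for t in tokens]
      print(f"{phrase!r}: {len(tokens)} tokens -> {pieces}")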



