Hacker News

This comment might come across as rude, but can anyone explain what Fei-Fei Li accomplished in AI to be considered a pioneer?

I read her autobiography and I still do not understand. The only thing she did was create the ImageNet dataset, by paying Amazon Mechanical Turk workers. Am I missing something? I don’t understand how she is mentioned in the same breath as LeCun, Goodfellow, Hinton, or even Karpathy.



She has been exceptionally good at creating a narrative IMO. I agree that her actual accomplishments are less clear. At Stanford, a person in her position gets added as last author to all her students’ work, so her citation count is heavily skewed. I’ve not heard anything from anyone in the field (past students, CV researchers) about her capabilities, whereas I’ve heard loads about other PIs like Chris Ré, Percy Liang, etc.

Also, I hate to say this, but when I was at Stanford there was a distinct sense of promoting women in AI, and she was asked to speak in or co-teach courses and lectures for what seemed to be that reason alone. For example, the course on “AI for human thriving” and such.


I think there is value in including women in this way even if the objective scientific output is not the same as that of her peers.


Data work is traditionally both incredibly valuable and something no one wants to actually do. Further, ImageNet has probably been cited a _lot_ in other important research.

I agree that it feels a _tad_ underwhelming but that's what these people do - try to strike big with some research and then spend their lives convincing others of the value of that research. If you're lucky, you might even do this a few times as she appears to be trying to do.


Disagree. Stanford researchers have been making datasets for years and years. SQuAD is another Stanford dataset. Everyone knows that publishing datasets gets you citations. But that gig is now sorta over because the word is out.


Sure, maybe that's true now. Less true when ImageNet was created though. And in any case, what is this argument that this is a "citation-hack"? Like, yeah, if you found the dataset useful during your own research you should cite it... ImageNet did in fact provide value for many years. All the original work on guided diffusion was trained on ImageNet, just as a for instance. Of course now we have superior, larger datasets like LAION and whatever OpenAI uses internally. But w.r.t. the times, it was valuable.


The ImageNet dataset is the main thing AFAIK. But even so, I find Dr. Li's contribution big enough. For context, computer vision datasets at the time were mostly small, so neural networks were rarely considered a good method for CV. That only changed when AlexNet won the ImageNet challenge, and the world was different after that. I remember many people initially scoffed at ImageNet, arguing that the dataset was flawed and that a “bad” method like a NN (AlexNet) could only win because of those flaws. Simply saying “paying” is an understatement, because we also need to account for the academic politics of her time. A little fun fact: even though most research papers nowadays try to propose a new dataset, if we take ImageNet and pretrain the backbone on it, we usually end up with a very strong baseline.

Btw, not sure why you think Karpathy has a bigger impact than Fei-Fei Li. I can't think of anything he is doing that is actually changing the playing field.


While the folks you mentioned advanced deep learning techniques, Fei-Fei Li transformed computer vision by creating ImageNet, which fuelled progress in the computer vision field. Beyond that, she’s been a champion of ethical, human-centered AI and has worked to make AI more accessible.


Hard to believe that ImageNet fueled advancements in that field. Dataset creation is a very widely known “citation hack” because it forces others to cite your work even though the dataset is often used for thrift reasons over any genuine value.


I think it's telling that someone who "only" orchestrated one of the most used pre-training and benchmark datasets in computer vision is seen as less accomplished than those who developed algorithms that require those datasets to work.

Yes, dataset creation and curation are less glamorous, but it's important work. I've worked with a few modelers who could stand to learn that firsthand.



