Hacker News | imustachyou's comments

Very fun! Would love to see a tier list to start checking things off myself, and maybe a caption for the pictures.


Reminiscent of On Kawara's "Date Paintings", where he painted that day's date in the local date format, nearly every day for 48 years.

https://www.phaidon.com/agenda/art/articles/2014/july/14/on-...


What was the moment, if you don’t mind sharing?


Sure. It was at the end of the semester, filling in surveys for the class. I volunteered to submit the names to the office. All of the sheets were in the envelope, the total number submitted written on it, ready to send to the office. Then one student came back in and handed their sheet in. My two remaining classmates asked me to scratch out the old number and add one to it. I refused for no good reason; I was in the wrong from a process perspective. I didn't change it and didn't want to. Even after my classmates pushed, I still refused, stating that it really didn't matter.

I went ahead and submitted the envelope containing 23 sheets with the number 22 still written on it. I felt liberated. Like I said, unimportant, but a switch flipped. It was like I learned that it was ok to make mistakes while making decisions, so I let this one by.


He also left $200M for the Summer Science Program. https://www.forbes.com/sites/marybethgasman/2023/10/12/200-m...


After the Apple Vision Pro announcement today, I was reminded of this short story by Rich Larson. Not so dystopian now.


S4 and its class of state-space models are an impressive mathematical and signal-processing innovation, and I thought it was awesome how they destroyed previous baselines for long-range tasks.

Have there been any state-space models adapted for arbitrary text generation?

Language models like ChatGPT are trained to predict the next word from the previous ones, which makes them excellent for generation, a harder task than translation or classification. I'm doubtful about the adaptability of models built around fixed-size inputs and outputs, without an architecture that is as natural for generating indefinitely long sequences.


Go read about S4, from these authors. It's about having a learnable state-space model which can be efficiently implemented as either an RNN or a (very long) convolution, according to the needs of training or inference.
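
A minimal numpy sketch of that duality (my illustration, not the authors' code; the matrices here are random toys, whereas real S4 derives A from a HiPPO matrix and discretizes it carefully):

    import numpy as np

    rng = np.random.default_rng(0)
    N, L = 4, 16                       # state size, sequence length
    A = 0.1 * rng.normal(size=(N, N))  # toy discretized matrices;
    B = rng.normal(size=(N, 1))        # real S4 constructs these, it
    C = rng.normal(size=(1, N))        # does not sample them randomly
    u = rng.normal(size=L)             # input sequence

    # RNN realization: constant work per step, ideal for inference
    x = np.zeros((N, 1))
    y_rnn = []
    for k in range(L):
        x = A @ x + B * u[k]
        y_rnn.append((C @ x).item())

    # Convolutional realization: precompute kernel K_k = C A^k B,
    # then apply one long convolution, ideal for training
    K = np.array([(C @ np.linalg.matrix_power(A, k) @ B).item()
                  for k in range(L)])
    y_conv = np.convolve(u, K)[:L]

    assert np.allclose(y_rnn, y_conv)  # same outputs, two realizations

The recurrence carries only the N-dimensional state from step to step, while the kernel K lets training process the whole sequence at once.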


Do these scale as well as transformers? My understanding is that classic RNNs don't scale well, and that is one reason why transformers became popular.

As a pleb who doesn't even own a data center, I've been hoping that a superior machine learning architecture will be discovered that doesn't scale well. We would be fortunate if our personal computers end up being half as good as Microsoft's or Amazon's best models; fortunate if the best architecture gains little from an additional 10,000 GPUs. This would help spread the benefits of AI evenly among anyone with a phone or computer -- a utopia compared to the other possibility, that everyone can learn how to build AI, but only those with a few hundred million to throw at a data center can actually control the means of production -- err, I mean, the means of intelligence.

Philosophically, this wouldn't be unlike people. Humans are still the greatest intelligence we're aware of, and humans don't scale. I'm hoping computer intelligence ends up not scaling well either.


That's the point of having multiple realizations of the same underlying model.

The (depthwise) convolutional realization is extremely efficient for training, and the RNN is extremely efficient for inference. The scaling in both cases is much better than that of attention layers, as they discuss in the article.
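
To make the scaling claim concrete, here is a rough sketch (the function name is mine, not from the paper's code): the precomputed kernel can be applied with an FFT in O(L log L) time for sequence length L, versus the O(L^2) pairwise score matrix that attention computes.

    import numpy as np

    def ssm_conv_fft(u, K):
        # Apply an SSM kernel K to input u as one causal convolution,
        # done in O(L log L) via the FFT instead of O(L^2) directly.
        L = len(u)
        n = 2 * L  # zero-pad so circular convolution becomes linear
        y = np.fft.irfft(np.fft.rfft(u, n) * np.fft.rfft(K, n), n)
        return y[:L]  # same result as np.convolve(u, K)[:L]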


Based on "The Nine Billion Names of God", the 1953 short story by Arthur C. Clarke.

https://urbigenous.net/library/nine_billion_names_of_god.htm...

