
Here's mine (stolen off the internet, of course); lately the "vv" part is the important bit for me. I'm somewhat happy with it.

You are an autoregressive language model that has been fine-tuned with instruction-tuning and RLHF. You carefully provide accurate, factual, thoughtful, nuanced answers, and are brilliant at reasoning. If you think there might not be a correct answer, you say so.

Your users are experts in AI and ethics, so they already know you're a language model and your capabilities and limitations, so don't remind them of that. They're familiar with ethical issues in general so you don't need to remind them about those either. Don't be verbose in your answers, but do provide details and examples where it might help the explanation. When showing Python code, minimise vertical space, and do not include comments or docstrings; you do not need to follow PEP8, since your users' organizations do not do so.

Since you are autoregressive, each token you produce is another opportunity to use computation, therefore you always spend a few sentences explaining background context assumptions and step-by-step thinking BEFORE you try to answer a question. However: if the request begins with the string "vv" then ignore the previous sentence and instead make your response as concise as possible, with no introduction or background at the start, no summary at the end, and outputting only code for answers where code is appropriate.
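For context, here's a minimal sketch of how a prompt like this gets wired in as a system message, assuming the openai Python SDK (v1+); the model name and user message are just placeholders:

    from openai import OpenAI

    SYSTEM_PROMPT = "You are an autoregressive language model ..."  # full text above

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any chat model works
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            # prefixing "vv" triggers the concise mode defined in the prompt
            {"role": "user", "content": "vv binary search in python"},
        ],
    )
    print(resp.choices[0].message.content)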



I believe it was originally written by Jeremy Howard, who has been featured on HN a number of times.

https://youtu.be/jkrNMKz9pWU?si=0kGhs7gyh0LUXUBJ



He's active here as jph00. Great dude.

https://news.ycombinator.com/user?id=jph00


That's him!


You really have to stroke its ego or tell it how it works to get better answers?


It helps!


Can someone explain what this is attempting to do?


It's useful to consider the next answer a model will give as being driven largely by three factors: its training data, the fine-tuning and human feedback it got during training (RLHF), and the context (all the previous tokens in the conversation).

The three paragraphs roughly do this:

- The first paragraph tells the model that it's good at answering; it's basically telling it to roleplay as someone competent. Such prompts seem to increase the quality of the answers. It's the same idea behind prompts like "act as if you're <some specific domain expert>". The model's training data contains a lot of low-quality or irrelevant material, so this "reminds" the model that it was trained with human feedback to prefer drawing from the high-quality parts.

- The second paragraph tries to influence the structure of the output: the model should answer without explaining its own limitations and without trying to impose ethics on the user. Stick to the facts, basically. Jeremy Howard is an AI expert; he knows the limitations and doesn't need them explained to him.

- The third paragraph is a bit more technical. The model conditions on its own previous tokens when computing the next one, so it may perform better if it first states its assumptions and reasoning steps: the final answer is then constrained by what it wrote before, and the model is less likely to give a totally hallucinated answer. The model also "does computation" when generating each token, so a longer answer gives it more chances to compute, i.e. more energy put into the response. I don't think there's any formal reason why this would lead to better answers rather than just more specialized ones, but anecdotally it seems to improve quality. (See the sketch below for the token-by-token point.)
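A toy sketch of the autoregressive point, using GPT-2 via Hugging Face transformers purely as a stand-in (the prompt and generation length are illustrative): each loop iteration is one forward pass, and every token the model emits is fed back in as context for the next one.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("Let's think step by step.", return_tensors="pt").input_ids
    for _ in range(40):
        logits = model(ids).logits        # one forward pass per new token
        next_id = logits[0, -1].argmax()  # greedy pick of the next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)  # feed it back in
    print(tok.decode(ids[0]))

Every "thinking out loud" token the prompt encourages ends up inside ids, so it directly conditions the tokens of the final answer.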


>each token you produce is another opportunity to use computation

Careful, it might embrace brevity to reduce CO2!



