And it will be even more expensive to train it again on larger amounts of data, with a model that has 10 times more parameters.
Only Big Tech giants like Microsoft and Google can afford to foot the bill and throw millions into training LLMs. Meanwhile, we celebrate and hype ChatGPT while LLMs get bigger and significantly more expensive to train, even as they get confused, hallucinate over silly inputs, and confidently generate bullshit.
That can't be a good thing. OpenAI's ClosedAI model needs to be disrupted, like how Stable Diffusion challenged DALL-E 2 with an open source AI model.
I disagree. I run a small tech company with a group that's been experimenting with Stable Diffusion, and we noticed that an extreme version of the Pareto principle applies here as well: you can get ~90% of the benefits for something like 5% of the cost. That's combined with the fact that computing power is continuously getting cheaper.
Based on that group's success, they've recently proposed a mini project inspired by GPT that I am considering funding; the data it's trained on is all publicly available for free, and most of it comes from Common Crawl. I suspect that it will also yield similar results: you can tailor your own version of GPT and get reasonably good models for a fraction of the price as well. We're nowhere close to the scale of the Big Tech giants, but I've noticed over the better part of 15 years that small companies can derive a great deal of the benefits that larger companies have, for a fraction of the cost, if they play it smart and keep things tight.
This is happening already. The trick is to run a search against an existing search engine, then paste the search results into the language model's prompt and ask it to answer questions based only on what you provide it.
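The pattern described above can be sketched in a few lines. This is a minimal, self-contained illustration: `search_web` and `complete` are hypothetical stand-ins for a real search API and a real LLM endpoint, and only the prompt-stuffing structure is the point.

```python
def search_web(query: str) -> list[str]:
    # Hypothetical stand-in for a call to a real search engine API.
    return [
        "LLM stands for large language model.",
        "GPT-3 was released by OpenAI in 2020.",
    ]

def complete(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM completion request.
    return "GPT-3, a large language model, was released in 2020."

def answer_with_search(question: str) -> str:
    # Run the search, paste the results into the prompt, then ask the
    # model to answer based only on what was provided.
    snippets = search_web(question)
    context = "\n".join(f"- {s}" for s in snippets)
    prompt = (
        "Answer the question using only the search results below.\n"
        f"Search results:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return complete(prompt)

print(answer_with_search("When was GPT-3 released?"))
```

With real implementations swapped in for the two stubs, this is essentially the orchestration layer discussed in the replies below.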
A small difference between the pattern you describe and the one in the inquiry is where responsibility lies for retrieving and incorporating the augmentation. You describe a pattern in which an orchestration layer sits in front of the model, performs the retrieval, and then decides how to feed that information to the model. The inquiry asks whether the AI/model itself can perform the retrieval and incorporation.
It's a small difference, perhaps, but one with some significance, since retrieval and incorporation occurring outside the model come with a different set of trade-offs. I'm not specifically aware of any work where model architectures are being extended to perform this function directly, but I am keen to learn of such efforts.
Yes, check out LangChain [0]. It enables you to wire together LLMs with other knowledge sources or even other LLMs. For example, you can use it to hook GPT-3 up to WolframAlpha. I’m sure you could pretty easily add a way for it to communicate with a human expert, too.
If an expert writes a long text and you add "In summary: " at the end, the model will complete it with something approximating the truth (depending on the size of the model, its training, etc.).
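The trick above is just prompt construction. A toy sketch, where the expert text and the model call are placeholders and only the appended suffix matters:

```python
def build_summary_prompt(expert_text: str) -> str:
    # Appending "In summary:" nudges a completion model to condense the
    # preceding text rather than continue it in some other direction.
    return expert_text.rstrip() + "\n\nIn summary:"

prompt = build_summary_prompt(
    "Transformers process all tokens in a sequence in parallel, "
    "using attention to weigh relationships between them. ..."
)
print(prompt)
```

The model's completion of this prompt is then the summary, with the usual caveats about errors and omissions.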
Humans do a similar thing. We have a model of the discussed subject in our heads and we can summarize it, but we will forget some parts, make errors, etc. GPT is very similar.
It is! You can specify in its prompt that it should "request additional info via search query, using the following syntax: [[search terms here]], before coming to a final conclusion". Then you integrate it with a traditional knowledge-base text lookup and run it again with that information concatenated.
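That loop can be sketched as follows. Assumptions are labeled in the comments: `fake_model` and `lookup` are hypothetical stubs for an LLM call and a knowledge-base lookup; the regex-and-concatenate loop is the actual technique being described.

```python
import re

SEARCH = re.compile(r"\[\[(.+?)\]\]")

def lookup(terms: str) -> str:
    # Hypothetical stand-in for a traditional knowledge-base lookup.
    kb = {"boiling point of water": "Water boils at 100 C at sea level."}
    return kb.get(terms, "No results.")

def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for an LLM. It asks for a search first, and
    # gives a final answer once the lookup result appears in the prompt.
    if "Water boils" in prompt:
        return "Final conclusion: water boils at 100 C at sea level."
    return "I need more info: [[boiling point of water]]"

def run(question: str, max_rounds: int = 3) -> str:
    prompt = (
        "Request additional info via search query, using the following "
        "syntax: [[search terms here]], before coming to a final "
        f"conclusion.\n\nQuestion: {question}"
    )
    for _ in range(max_rounds):
        reply = fake_model(prompt)
        match = SEARCH.search(reply)
        if not match:
            return reply  # No search requested: treat as the conclusion.
        # Concatenate the lookup result and run the model again.
        prompt += "\n" + lookup(match.group(1))
    return reply

print(run("At what temperature does water boil?"))
```

The `max_rounds` cap is there because a real model may keep asking for more searches indefinitely.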
Stable Diffusion could do it because the task turned out to be amenable to reasonably small models. But there's no evidence of that being the case with GPT.
That said, the other organizations that can afford to foot the bill for it are governments. This is hardly ideal, since such models will also come with plenty of strings attached - indeed, probably more than the private ones - but at least those policies are somewhat checked by democratic mechanisms.
Long-term I think the demand for more AI compute power will lead to much more investment in GPU design and manufacture, driving the prices down. Since the underlying tech itself is well-understood, I fully expect to see the day when one can train and run a customized GPT-3 instance for one's private use, although the major players will likely be far ahead by then.