At the bottom, in their coming soon section: "Customization: Securely extend ChatGPT’s knowledge with your company data by connecting the applications you already use"
On the other hand, there was a lot of knowledge in those documents that effectively got lost - while the relevant tech is still underpinning half the world. For example: DCOM/COM+.
I saw it, but it only mentions "applications" (whatever that means) and not bare documents. Does this mean companies might be able to upload, say, PDFs, and fine-tune the model on that?
Pretty unlikely. Generally you don't use fine-tuning for bare documents. You use retrieval augmented generation, which usually involves vector similarity search.
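Stripped to the bone, the pattern looks something like this (the hashed bag-of-words `embed` function and the documents are just placeholders; a real pipeline would use a proper embedding model):

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy stand-in embedding (hashed bag of words). In practice you'd
    call a real embedding model (OpenAI, sentence-transformers, etc.)."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

documents = [
    "Refunds are accepted within 30 days of purchase.",
    "DCOM/COM+ still underpins many internal Windows services.",
    "The quarterly report is due the first Friday of each quarter.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list[str]:
    # Vectors are unit length, so dot product == cosine similarity.
    scores = doc_vectors @ embed(query)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

question = "When is the quarterly report due?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# `prompt` then goes to the LLM as-is -- no fine-tuning involved.
print(prompt)
```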
Fine-tuning isn't great at learning knowledge. It's good at adopting tone or format. For example, a chirpy helper bot, or a bot that outputs specifically formatted JSON.
I also doubt they're going to have a great system for fine-tuning. Successful fine-tuning requires some thought about what the training data should look like (bare docs won't work), at which point you have technical people working on the project anyway.
Their future connection system will probably take the form of API calls that request data from an enterprise system, using their existing function-calling feature. They tried this already with plugins, and they didn't work very well. Maybe they'll come up with a better system. Generally this works better if you write your own simple API for the model to call, one that does the heavy lifting of talking to the actual enterprise systems, so the AI doesn't output garbled API requests so much.
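As a sketch of the wrapper idea, using OpenAI's function-calling schema shape (the `search_customer_records` endpoint and its internals are made up):

```python
# The model sees one simple, forgiving tool; the wrapper does the messy work.
tool_schema = {
    "name": "search_customer_records",
    "description": "Look up customer records with a free-text query.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Plain-language search terms"}
        },
        "required": ["query"],
    },
}

def search_customer_records(query: str) -> list[dict]:
    """Server-side heavy lifting: auth, retries, field mapping, pagination.
    Translating `query` into the real CRM's arcane request format happens
    here, so the model never gets a chance to garble it."""
    # ...call the actual enterprise system here (hypothetical)...
    return [{"customer": "Acme Corp", "status": "active"}]
```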
When I first started working with GPT I was disappointed by this. I thought, like the previous commenter, that I could fine-tune by adding documents and it would add them to the "knowledge" of GPT. Instead I had to do what you suggest: vector similarity search, adding the relevant text to the prompt.
I do think an open line of research is finding an easy way for users to add arbitrary docs to an LLM's knowledge.
Yes, this would definitely be a game changer for almost all companies. Considering how huge the market is, I guess it's pretty difficult to do, or it would have been done already.
I certainly don't expect a nice drag-and-drop interface where I can drop in my Office files and then ask questions about them to arrive in 2023. Maybe 2024?
That would be the absolute game-changer. Something with the "intelligence" of GPT-4, but it knows the contents of all your stuff - your documents, project tracker, emails, calendar, etc.
Unfortunately even if we do get this, I expect there will be significant ecosystem lock-in. Like, I imagine Microsoft is aiming for something like this, but you'd need to use all their stuff.
There are great tools that do this already in a support-multiple-ecosystems kind of way! I'm actually the CEO of one of those tools: Credal.ai - which lets you point-and-click connect accounts like O365, Google Workspace, Slack, Confluence, etc., and then use OpenAI, Anthropic, etc. to chat/Slack/Teams/build apps drawing on that contextual knowledge, all in a SOC 2 compliant way. It does use a retrieval-augmented generation approach (rather than fine-tuning), but the core reason for that is just that it tends to actually offer better results for end users than fine-tuning on the corpus of documents anyway!
Link: https://www.credal.ai/
What are the limitations on adding documents to your system? Your website doesn't particularly highlight that feature set, which it probably should if you support it!
Thanks for the feedback! Going to make some changes to the website to reflect that later today! Right now we support connecting Google Docs, Google Sheets, PDFs from Google Drive, Slack channels, or Confluence spaces. O365, Notion, and a couple of other source integrations are in beta. We don't technically have restrictions on volume; the biggest customers we have hold around 100 GB of data with us in total. If you were trying to connect a terabyte's worth of data, that might be a conversation about pricing! :)
Your pricing seems to eliminate some use cases, including mine.
Rather than wanting to import N documents per month, I would want to import M documents all at once, then use that set of documents until at some future time I want to import another batch of K documents (probably a lot smaller than M) or just one document once in a while.
By limiting it to a fixed number of documents per month, it eliminates all the applications where you need to import a complete corpus before the service is useful.
Totally agree. Retrieval-augmented generation is still the preferred way to give the LLM more knowledge. Fine-tuning is mostly useful for adapting the base model to another task. I wrote about this in a recent blog post: https://vectara.com/fine-tuning-vs-grounded-generation/.
Does anyone know how this new capability works in terms of where the model inference will be done? Would it still be on the OpenAI side, or is this going to be on the customer side?
I've been using RAG with pgvector for the last few months with temperature 0, and it's been pretty great with very little hallucination.
The small context window is the limiting factor.
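In case it's useful, the retrieval step boils down to a query like this (table and column names are illustrative; assumes a `chunks` table with a pgvector `embedding` column):

```python
import psycopg2

def top_chunks(query_embedding: list[float], k: int = 5) -> list[str]:
    """Return the k nearest chunks by cosine distance (`<=>` in pgvector)."""
    with psycopg2.connect("dbname=rag_demo") as conn, conn.cursor() as cur:
        cur.execute(
            "SELECT body FROM chunks ORDER BY embedding <=> %s::vector LIMIT %s",
            (str(query_embedding), k),
        )
        return [row[0] for row in cur.fetchall()]

# The returned chunks get pasted into the prompt; with temperature 0 the
# model mostly sticks to what's in them.
```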
In principle, I don't see the difference between fine-tuning on a bunch of prompts along the lines of "here is another context section: <~4k-n tokens of the corpus>" and the same material appearing in a RAG prompt anyway.
Maybe the distinction between "tone" and "context" comes down to the role of the prompts you train on, rather than being a restriction of the fine-tuning process itself?
In theory, fine-tuning on ~100k tokens like that would allow for better inference, even combined with a RAG prompt that includes a few sections from the same corpus. It would prevent issues where the vector search results are too thin despite their high similarity, e.g. picking out only one or two sections of a book that is actually really long.
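Concretely, I'm imagining training examples shaped something like this (OpenAI-style chat fine-tuning JSONL; the framing text is made up):

```python
import json

sections = ["<section 1 of the corpus>", "<section 2 of the corpus>"]

# One training example per corpus section, each wrapped in the same
# "here is another context section" framing described above.
with open("corpus_tune.jsonl", "w") as f:
    for s in sections:
        example = {
            "messages": [
                {"role": "user", "content": f"Here is another context section:\n{s}"},
                {"role": "assistant", "content": "Understood."},
            ]
        }
        f.write(json.dumps(example) + "\n")
```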
For example, I've seen some folks use arbitrary chunking of tokens in batches of 1k or so as an easy implementation choice, but that totally breaks the semantic meaning of longer paragraphs, and those paragraphs might not come back grouped together from the vector search. My approach has been manual curation of sections, allowing variation from 50 to 3k tokens, to get the chunks to be more natural. It has worked well, but I could still see fine-tuning on the whole corpus as extra insurance against losing context.
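A mechanical approximation of that curation, grouping whole paragraphs under soft size bounds (word counts stand in for real token counts, e.g. from tiktoken):

```python
def chunk_paragraphs(text: str, min_tokens: int = 50, max_tokens: int = 3000) -> list[str]:
    """Group whole paragraphs into chunks of roughly min_tokens..max_tokens,
    never splitting a paragraph mid-way."""
    chunks: list[str] = []
    current: list[str] = []
    count = 0
    for para in text.split("\n\n"):
        words = len(para.split())
        # Flush the current chunk if adding this paragraph would blow the
        # budget and we've already accumulated enough to stand alone.
        if current and count + words > max_tokens and count >= min_tokens:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```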
It's not impossible that fine-tuning would also help RAG, but it's certainly not guaranteed and hard to control. Fine-tuning essentially changes the weights of the model, and might result in other, potentially negative outcomes, like loss of other knowledge or capabilities in the resulting fine-tuned LLM.
Other considerations:
(A) Would you fine-tune daily? Weekly? As data changes?
(B) Cost and availability of GPUs (there's a current shortage)
My experience is that RAG is the way to go, at least right now.
But you have to make sure your retrieval engine works optimally, getting the most relevant pieces of text from your data: (1) using a good chunking strategy that's better than arbitrary 1K or 2K chars, (2) using a good embedding model, (3) using hybrid search, and a few other things like that.
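As a toy illustration of (3), the simplest way to combine keyword and vector results is reciprocal rank fusion (the doc IDs here are purely illustrative):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: score each doc by the sum of 1/(k + rank)
    across all ranked lists it appears in."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]  # e.g. from a BM25/keyword index
vector_hits = ["doc1", "doc5", "doc3"]   # e.g. from an embedding index
print(rrf([keyword_hits, vector_hits]))  # docs found by both rank highest
```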
Certainly the availability of longer-sequence models is a big help.
Yeah, I'll be curious to see what it means by this. Could be a few things, I think:
- Codebases
- Documents (by way of connection to your Box/SharePoint/GSuite account)
- Knowledgebases (I'm thinking of something like a Notion here)
I'm really looking forward to seeing what they come up with here, as I think this is a truly killer use case that will push LLMs into mainstream enterprise usage. My company uses Notion and has an enormous amount of information on there. If I could ask it things like "Which customer is integrated with tool X" (we keep a record of this on the customer page in Notion) and get a correct response, that would be immensely helpful to me. The same goes for connecting a support person to a knowledgebase of answers that then becomes incredibly easy to search.
Azure-hosted GPT already lets you "upload your own documents" in their playground; it seems to be similar to how ChatGPT GPT-4 Code Interpreter handles file uploads.