Hacker News

I'm not really familiar with the summarization or NLP space, but I remember that around 2011-2015 I signed up for a couple of daily email services that summarized news articles, and the summaries were fantastic. I don't even remember what they were called; they eventually sold out with ads and worse formatting/summaries to make money, I guess. I often use them as an example of 1) why LLMs are a bit old news for the summary use case and 2) how various LLM use cases will probably also be ruined, because for a lot of people tools like that seem novel and useful, but all I can see is onboarding to more advertising.

So to someone who is actually knowledgeable in this space, are LLMs really that much better than whatever we had 10 years ago? Is this tech the key to some features we truly didn't have before?



Not an NLP expert, but the biggest difference in my experience is guided focus, so to speak. When summarizing something huge like the US Code, for example, you can tell the LLM to focus on specific topics and anything adjacent to them so that it ignores irrelevant details (which is usually >99.9% of the text in my use case). The word relationships encoded in the LLM are really good at identifying important adjacent topics and entities.
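A minimal sketch of what that "guided focus" prompt might look like in Python. The function name and instruction wording are my own illustration, not anything from the parent comment; the messages are shaped for the OpenAI chat-completions API:

```python
# Hypothetical prompt builder for focused summarization.
# The instruction text is illustrative; tune it for your documents.

def build_focus_messages(topics: list[str], document: str) -> list[dict]:
    instruction = (
        "Summarize the document below. Focus only on these topics and "
        "anything directly adjacent to them; ignore everything else: "
        + ", ".join(topics)
    )
    return [
        {"role": "system", "content": instruction},
        {"role": "user", "content": document},
    ]

messages = build_focus_messages(
    ["tax credits", "filing deadlines"],
    "26 U.S. Code ...",
)
```

You would then pass `messages` to the chat-completions endpoint as usual.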

LLMs are also really good at the harder NLP problems like coreference resolution, dependency parsing, and relation extraction, which makes a huge difference when using recursive summarization on complex documents where something like "the Commissioner" might be defined at the beginning and used throughout a 100,000 token document. When instructed, the LLM can track the definitions itself and even modify them live by calling OpenAI functions.


Interesting, so maybe not my trivial "summarize an article" example, but clearly the upper bound on what's possible is higher and more interesting.


Might I ask how you use OpenAI's function calling here? That's the one bit of their functionality I haven't really explored.


I use OpenAI function calling most of the time I use the OpenAI API, since it's the easiest way to get structured data and implement retry logic.

The simplest implementation is "retrieve_definition(word_to_lookup, word_to_replace)" with some number of tokens at the beginning of the prompt dedicated to definitions. You can use a separate LLM call with a long list of words (without their definitions) to do the actual selection, since sometimes there might be ambiguity, which the LLM can usually figure out itself (it can also include both definitions when it's too uncertain, if instructed).
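A rough sketch of that simple variant, assuming a local dictionary of definitions; the schema follows the shape the OpenAI chat-completions API expects for tools, but the definitions store and the example term are made up:

```python
# Illustrative definitions store; in practice this would be populated
# from the document's "Definitions" section.
DEFINITIONS = {
    "the Commissioner": "the Commissioner of Internal Revenue",
}

# Tool schema in the shape the OpenAI chat-completions API expects.
RETRIEVE_DEFINITION_TOOL = {
    "type": "function",
    "function": {
        "name": "retrieve_definition",
        "description": "Look up a defined term and the word it replaces.",
        "parameters": {
            "type": "object",
            "properties": {
                "word_to_lookup": {"type": "string"},
                "word_to_replace": {"type": "string"},
            },
            "required": ["word_to_lookup", "word_to_replace"],
        },
    },
}

def retrieve_definition(word_to_lookup: str, word_to_replace: str) -> str:
    """Handler the application runs when the model calls the tool.
    word_to_replace identifies the mention being resolved in the chunk."""
    definition = DEFINITIONS.get(word_to_lookup)
    if definition is None:
        return f"(no definition recorded for {word_to_lookup!r})"
    return f"{word_to_lookup} ({word_to_replace}): {definition}"
```

The tool result string gets appended back into the conversation so the model can use the definition while summarizing.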

A more complex variant does multiple passes: first pass identifies ambiguous words in each chunk, second pass identifies their definitions, third pass does actual summarization using the output of the previous passes to craft the prompt.



