Thanks for taking the time for some sober analysis in the midst of reactionary chaos.
I can't wait until everyone stops falling for the "AGI ubermodel end of times" myth and we can actually have boring announcements that treat these things as what they actually are: tools. Tools for doing stuff, that's it.
Maybe I'm wrong, maybe stuffing a computer with enough language and binary patterns is indeed enough to achieve AGI, but then, so what? There's no point in being right about this. Buying into this ridiculous marketing will get us "AGI" in the form of machines, but only because all the human beings have gotten so stupid as to make critical reasoning an impossibility.
> According to this document, 1 of the 18 Anthropic staff surveyed even said the model could completely replace an entry level researcher.
>
> So I'd say we've reached this milestone.
If 1 out of N=18 meets our bar for statistical significance on world-altering claims, then yeah, I think we can replace all the researchers.
This makes me feel like karpathy is a tad behind the times. Many agent users I know already do precisely this as part of "agentic" development. If you use a harness, the harness is already empowered to do much of this under the hood, no fancy instruction file required. Just ask the agent to update some knowledge directory at the end of each convo, done. If you really need to automate it, write some scheduling tool that tells the agent to read past convos and summarize (a sketch below). It really is that easy.
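Something like this, minimally sketched. Everything here is a placeholder: conversations are assumed to live as text files in ./convos, summaries go to ./knowledge, and `agent` stands in for whatever harness CLI you actually use (it is not a real command).

```python
# Minimal sketch of the "scheduled summarizer" idea above. All paths and the
# `agent` command are illustrative placeholders, not a real tool's interface.
import subprocess
from pathlib import Path

CONVO_DIR = Path("convos")        # hypothetical location of saved conversations
KNOWLEDGE_DIR = Path("knowledge") # the "knowledge directory" the agent maintains

def summarize_new_convos() -> None:
    KNOWLEDGE_DIR.mkdir(exist_ok=True)
    for convo in sorted(CONVO_DIR.glob("*.txt")):
        out = KNOWLEDGE_DIR / f"{convo.stem}.md"
        if out.exists():
            continue  # already summarized on an earlier run
        prompt = (
            "Read this conversation and write down any durable facts, "
            "decisions, or conventions worth remembering:\n\n" + convo.read_text()
        )
        # Hand the prompt to the agent and capture its summary.
        result = subprocess.run(["agent"], input=prompt, capture_output=True, text=True)
        out.write_text(result.stdout)

if __name__ == "__main__":
    # Run nightly, e.g. via cron: 0 3 * * * python summarize_convos.py
    summarize_new_convos()
```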
Totally. I was just remarking today how funny it is that it was apparently ok for humans to suffer from a dearth of documentation for years, but suddenly, once the machines need it, everyone is frantic to make their tools as usable and well-documented as possible.
> everyone is frantic to make their tools as usable and well-documented as possible
Eh, enjoy it while it lasts. Companies are still trying to figure out how to get value by letting a thousand flowers blossom. The walled-garden gates will swing shut soon enough, just like they did after the last open access revolutions (the semantic web, Web 2.0, etc.).
I too am wondering exactly what form slamming the gates shut in our face will take. Closing the "first hit is free" train and opening the "pay me, $#%&" doors.
> I too am wondering exactly what form slamming the gates shut in our face will take.
"You will rent only the best PCs, eat only the tastiest bugs, and live in the 15-minute City of Tomorrow (also known as New Kowloon). And you will like it. Or else."
Outside of having a military, several tech companies are probably more powerful than nation states at this point, and I think some of them realize this. As long as a complete slip into barbarism isn't fully on the table, nations need the data that tech companies have more or less entirely captured and built a hegemony around. They also rely directly on their products. I guess the EU is starting to wake up to how problematic this is.
I actually think being a full-time writer is a more feasible profession today than it probably was a few hundred years ago. On the other hand, back in the 1800s random newspapers would pay for serialized stories. That doesn't really happen anymore (save a few surviving exceptions like the New Yorker), but now we have Substack and a ton of other avenues writers can use to stay afloat.
If you read John Fante’s Ask the Dust, he has a number of dollar amounts in there for short story sales. Those numbers are better than pretty much every contemporary opportunity without even adjusting for inflation. I would say that the 20s and 30s were the ideal time. Right now, it’s pretty grim for nearly all writers. Substack and other venues tend to pay peanut money, and there are few writers who make a living from them, especially compared to the long tail of those who make nearly nothing. And most of those who earn significant money had big reputations before Substack.
It makes the black box slightly more transparent. Knowing more in this regard allows us to be more precise: you go from prompt-tweak witchcraft and divination to something closer to science and precise method.
Can this method be extended to go down to the sentence level?
In the example it shows how much of the reason for an answer is due to data from Wikipedia. Can it drill down to show paragraph or sentence level that influences the answer?
Your question should be "Can it drill down to show the paragraphs or sentences that influence the answer?"
I believe that the plagiarism complaint about LLMs comes from the assumption that there is a one-to-one relationship between training data and answers. I think the real and delightfully messier situation is that it's a many-to-one relationship.
Exactly! We will have a future post that shows this more granularly over the coming weeks. Here is a post we wrote on how this works at smaller scale: https://www.guidelabs.ai/post/prism/
Oh, that looks like a wonderful article. I just skimmed it, and I hope to get back to it later today. One thing I would love to see is how much of the training set is substantially similar to each other, especially in the code training set.
Great questions. We have several posts in the works that will drill down more into these things. The model was actually designed to answer these questions for any sentence (or group of tokens) it generates.
It can tell you which specific text (chunk) in the training data led to the output the model generated. We plan to show more concrete demos of this capability over the coming weeks.
It can tell you where in the model's representations it learned about science, art, religion, etc. And you can trace all of these back to either the input context, the training data, or the model's representations.
Does it? If I make a system prompt for most models right now, tell them they were trained on {list} of datasets, and to attribute their answer to their training data, I get quite similar output (a quick sketch below). It even seems quite reasonable. The reason being that each data corpus has a "vibe" to it, and the predictions simply assign the response's vibe to the dataset's vibe.
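For anyone who wants to reproduce this, here's a minimal sketch using the OpenAI chat API; the dataset list and the model name are just illustrative placeholders, not anyone's actual training corpora.

```python
# Sketch of the "self-attribution" prompt described above. The dataset list is
# invented; the point is that the model will happily attribute its answer to
# whichever corpus "vibes" with the question, with no real attribution happening.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

DATASETS = ["Wikipedia", "arXiv papers", "GitHub code", "web forums"]

resp = client.chat.completions.create(
    model="gpt-4o",  # any chat model works; the effect isn't model-specific
    messages=[
        {
            "role": "system",
            "content": "You were trained on these datasets: "
            + ", ".join(DATASETS)
            + ". After every answer, state which dataset(s) the answer came from.",
        },
        {"role": "user", "content": "Why is the sky blue?"},
    ],
)
print(resp.choices[0].message.content)
# Typically cites "Wikipedia" for encyclopedic questions -- plausible-sounding,
# but it's vibe-matching, not attribution.
```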
Even though it cannot be reversed or eradicated (yet, let's hope), detection can allow individuals to adopt interventions that help them either adjust their lives to better cope with its progression or mitigate some of the detrimental behavioral consequences. In addition, if you have family to care for, it may be an impetus to get certain things in order for them before the later stages of the disease, etc. It's horrible and bleak, but I could certainly see why one might want to know.
In the lucky case, it can also relieve anxiety. Even though false negatives may still be possible, a negative result might bring relief to people who are anxious about certain symptoms, since they can rule out (rightly or wrongly) a pretty severe disease.
And that's precisely why the term "reasoning" was a problematic choice.
Most people, when they use the word "reason," mean something akin to logical deduction, and they would call this a reasoning failure, told as they are that "LLMs reason" rather than the more accurate picture you just painted of what actually happens (behavioral basins emerging from the training distribution).
It's actually very understandable to me that humans would make this kind of error, and we all make errors of this sort all the time, often without even realizing it. If you had the metacognitive awareness to police every action and decision you've ever made with complete logical rigor, you'd be severely disappointed in yourself. One of the stupidest things we can do is overestimate our own intelligence. Just reflect for a second and you'll realize that, while a lot of dumb people exist, a lot of smart ones do too, and in many cases it's hard to choose a single measure of intelligence that would adequately account for the complete range of human goals and successful behavior in relation to those goals.