Also the merits of documentation and specs. It’s been eye-opening to see the subset of developers who were almost disdainful about writing documentation for their colleagues but are now tripping over themselves to do so for their clanker.
People falling all over themselves to write docs for their pile-of-linear-algebra-with-a-smiley-face-painted-on-it [0] don't read the docs, no. People who give a shit about writing solid software that doesn't get them paged at three in the damn morning do.
[0] The face is there to provide social-trustworthiness signals to engage the human pack-bonding instinct, natch.
Your sarcasm is unwarranted, because what I said is true and reflects the experience of a lot of people.
A decade ago I left a job and spent my last week there thoroughly documenting every flow and code section of an app I worked on, which was the company's core value proposition. A couple of years later I asked around, and nobody had even looked at it.
People just don't read, and there are actually good reasons for that. One is that documentation is outdated in most orgs, and the effort of keeping it up to date is greater than the effort of reading the code.
That’s a rather stunning comparison: racism is a problem because it’s unfairly treating sentient beings, but a pile of linear algebra is not even sentient, much less your peer. That’s part of why I used the term: “agent” isn’t correct because agents have, well, agency and can be held accountable.
People are rediscovering everything. Some people have proposed using a more formal language to tell the AI precisely what code to write. That's a compiler.
Those are all technology variations of “automated web ui tests”, which is a subset of “automated ui tests”, which is itself almost (but not quite exactly) a subset of “automated user acceptance tests”, none of which are new categories.
I mean, this is difficult to calculate because of prompt caching, the input/output token ratio, etc., but if you just do some napkin math, I find it hard to believe people are getting this many tokens on a $20 plan.
Here's some napkin math.
gpt-oss-120b is priced at $0.039 in / $0.18 out per million tokens on OpenRouter. Here are some assumptions:
1. The input/output ratio is about 25/1 (coding is mostly grep, with fairly low output).
2. You are getting 75% prompt cache reads.
Case B: 50% prompt caching discount (standard provider rate)
At 75% prompt caching:
Total tokens obtained: 658,749,010 (approx. 659 million tokens)
Input: ~633 mil tokens
~475 mil cached at 50% input pricing = ~$9.25
~158 mil uncached = ~$6.15
Output: ~25 mil tokens (~$4.50)
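For anyone who wants to check the napkin math above, here is a minimal Python sketch of the same calculation. It uses only the figures quoted in the comment (the $20 budget, $0.039/$0.18 per million tokens, the 25:1 input/output ratio, 75% cache reads at a 50% discount); the function and parameter names are made up for illustration, and the model is deliberately crude (one blended price, no per-request overhead).

```python
# Napkin math: how many tokens does a fixed budget buy at per-million-token prices?
# Every default below is a figure quoted in the comment above; names are illustrative.

def tokens_for_budget(
    budget_usd=20.0,
    input_price_per_m=0.039,   # USD per million uncached input tokens
    output_price_per_m=0.18,   # USD per million output tokens
    input_to_output_ratio=25,  # 25 input tokens for every output token
    cache_hit_rate=0.75,       # share of input tokens served from prompt cache
    cache_discount=0.50,       # cached input billed at 50% of the input price
):
    # Split each token of traffic into its input and output share.
    input_share = input_to_output_ratio / (input_to_output_ratio + 1)
    output_share = 1 - input_share

    # Effective input price after mixing cached and uncached reads.
    cached_price = input_price_per_m * (1 - cache_discount)
    effective_input_price = (
        cache_hit_rate * cached_price
        + (1 - cache_hit_rate) * input_price_per_m
    )

    # Blended price per million tokens of total traffic.
    blended_price = (
        input_share * effective_input_price
        + output_share * output_price_per_m
    )

    total_m = budget_usd / blended_price      # millions of tokens in total
    input_m = total_m * input_share
    cached_m = input_m * cache_hit_rate
    uncached_m = input_m - cached_m
    output_m = total_m * output_share

    return {
        "total_tokens": total_m * 1_000_000,
        "cached_cost": cached_m * cached_price,
        "uncached_cost": uncached_m * input_price_per_m,
        "output_cost": output_m * output_price_per_m,
    }


if __name__ == "__main__":
    for name, value in tokens_for_budget().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the total lands at about 659 million tokens, and the cost split comes out within a few cents of the rounded figures quoted above.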
This doesn't even account for profit margins at inference providers, or the fact that OpenAI probably has a much more efficient inference stack.
It's really hard to know what these companies are actually paying, but from everything I'm hearing, people are reporting that API inference pricing carries a 50% margin.
I didn't say "use openrouter", as you might end up using subsidized resources; part of the argument is to avoid that and reach the true capital cost of inference per token (or something like that).
I meant, buy/lease the hardware that lets you run this model, run gpt-oss-120b and measure. I did this once and it was like 10x more expensive than any hosted alternative, and $20 wouldn't get you far there.
An H100 today costs $2.95 an hour on vast.ai[1], which is already a good deal.
gpt-oss-120b on an H100 gives you ~200-250 tokens per second. I will be generous and say you can get a million tokens an hour out of it.
OpenCode Go (which I gladly pay for, because of this in part) is $10 a month, that's three hours of H100 use, and the models you have there are more expensive than gpt-oss-120b. Sure, they have "scale" (although that doesn't apply to AI inference, but whatever) and this and that, they're still pricing it 20-30x below their minimum threshold of capital expense.
Apples to apples: they sell you GLM 5.1 at $4.40 per million tokens, but at ~50 tps on an H100 (being generous) it costs ~$16 to generate a million tokens.
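The H100 arithmetic above can be reproduced with a short Python sketch. It assumes, as the comment does, a single rented GPU at the quoted $2.95/hour producing tokens at a steady rate with no batching across requests; the function name is hypothetical.

```python
# Cost of generating one million tokens on a rented GPU at a steady throughput.
# Assumes one stream per GPU (no batching), matching the estimate in the comment.

def cost_per_million_tokens(gpu_usd_per_hour, tokens_per_second):
    tokens_per_hour = tokens_per_second * 3600
    return gpu_usd_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    h100_rate = 2.95  # USD per hour, the vast.ai figure quoted above

    # gpt-oss-120b at the upper end of the quoted 200-250 tokens/s range
    print(f"gpt-oss-120b: ${cost_per_million_tokens(h100_rate, 250):.2f} per million")

    # GLM 5.1 at the quoted ~50 tokens/s
    print(f"GLM 5.1:      ${cost_per_million_tokens(h100_rate, 50):.2f} per million")

    # How many H100 hours a $10/month plan would cover
    print(f"$10 buys {10 / h100_rate:.2f} GPU hours")
```

At 50 tokens per second this gives roughly $16.39 per million tokens, in line with the ~$16 estimate.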
The role of evolution is always a confounding factor as well, and the various analogies for how it maps onto AI research are never quite satisfactory.
Yes, it seems most anti-LLM researchers take issue with LLMs over fundamental math/architecture-based properties but miss all the engineering going on around the model to make it useful.
Those mathematical shortcomings may very well mean they aren't a path to true AGI, but that honestly seems fairly irrelevant at this point.
People game benchmarks for fake internet points to get their favorite web framework to the top of the list. I'm pretty sure they will do it for billions of dollars.
Maybe it's insane to think this, but if all AI providers turned off free plans tomorrow, I think they would easily have enough people willing to pay $20 a month to sustain all their spending.
Everyone is still fighting for market share, so they are giving stuff away, but that doesn't mean people wouldn't be willing to pay for it if it wasn't free.
This proposition boils down to a belief that there are 3 billion people who are interested in AI for free but aren’t currently paying $20, but who would pay $20 if that was the price.
The global median income is around $12k, so this would mean roughly 0.5% of everyone’s annual income, globally, going to chatbots.
If you’re off by half, the price doubles for each person.
I think you’d make a lot of money betting against the existence of 3 billion ghost customers
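A rough Python sketch of the sensitivity argument above, using only the figures from the comment (3 billion subscribers, $20 a month, $12k median income). The function names are hypothetical. It computes the share of income for a single paying subscriber, which is a different quantity from a share averaged over everyone, so it is not meant to reproduce the 0.5% figure.

```python
# If the revenue target is fixed, the price each subscriber must pay scales
# inversely with the number of subscribers who actually show up.

def required_monthly_price(target_monthly_revenue, subscribers):
    return target_monthly_revenue / subscribers


def annual_income_share(monthly_price, annual_income):
    # Fraction of one subscriber's yearly income spent on the subscription.
    return monthly_price * 12 / annual_income


if __name__ == "__main__":
    # Revenue implied by the comment: 3 billion people paying $20 a month.
    target = 3_000_000_000 * 20

    for subscribers in (3_000_000_000, 1_500_000_000, 750_000_000):
        price = required_monthly_price(target, subscribers)
        share = annual_income_share(price, 12_000)
        print(f"{subscribers:,} subscribers -> ${price:.2f}/month "
              f"({share:.1%} of a $12k income)")
```

Halving the subscriber count doubles the required price, which is the point made above.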
I feel like I'm going insane