mediaman · 2026-06-09T22:39:47 1781044787

Anthropic is, I believe, fully pursuing the idea that you shouldn't use their model with anything but their own products. They don't care whether it generalizes.

I agree it's very frustrating to use with custom tools/harnesses that can speed up the process for domain specific purposes.

mediaman · 2026-06-09T21:42:14 1781041334

Double check your math. All of their posts in this thread are correct.

1/30,000 * 100 ≈ 0.0033

ViscountPenguin · 2026-06-10T03:05:04 1781060704

Oh, fuck

freakynit · 2026-06-10T05:38:17 1781069897

/r/TheyDidTheMath IYKYK

mediaman · 2026-06-09T21:37:19 1781041039

That is not what their policy states. It specifically says they will sabotage even non-distillation attempts, such as distributed training pipeline design. And given that their classifier has so far been very inaccurate, expect it to randomly flag far more topics that are wide of the mark.

The fun part is that you will never know if your neural net classification project is getting silently sabotaged because their classifier doesn't work!

DonsDiscountGas · 2026-06-10T02:46:50 1781059610

You could try actually reading the code that it wrote

baq · 2026-06-10T06:59:07 1781074747

Good luck understanding it and finding malevolent inefficiencies if it’s already necessarily better at optimizing training pipelines than everyone except some Anthropic and OpenAI employees. Not a new thing either, see fast16.

mediaman · 2026-06-09T00:49:49 1780966189

The funny thing about this comment is that neural networks are universal function approximators.

The most fundamental essence of what they do is exactly what you say they don't: estimate.

airstrike · 2026-06-09T01:38:07 1780969087

Funny and ironic in a way, but the point still stands that they do not actually estimate the time it will take.

greenavocado · 2026-06-09T03:28:25 1780975705

> they do not actually estimate the time it will take

You can't prove that )))

airstrike · 2026-06-09T04:31:08 1780979468

Right, but extraordinary claims require...

greenavocado · 2026-06-09T14:55:55 1781016955

Instructions unclear, hard drive reformat completed.

mediaman · 2026-06-03T19:41:10 1780515670

There are plenty of US-based inference providers available, including AWS, that serve Chinese models at competitive prices (vs frontier US models). They also have lots of usage. Not necessarily for coding, but for other enterprise tasks.

mediaman · 2026-06-01T17:27:32 1780334852

I think he likely means "code that is hand-reviewed" and not directly controlled by the agent. He probably means to differentiate it from code written by the in-process agent. It doesn't matter too much if that fixed code was written by an LLM under guidance and review of the SWE, outside the agent.

Barbing · 2026-06-01T18:02:54 1780336974

Agreed, “literally written by hand” didn’t cross my mind. Not by keyboard or pen.

footydude · 2026-06-01T19:01:24 1780340484

Ahh ok - that's fair enough - hand-reviewed/not controlled by the agent seems a sensible approach (wasn't sure if it was indicative of a complete distrust of AI-generated code)

mediaman · 2026-05-29T23:05:08 1780095908

What I've heard is that much of the model "intelligence" is a commingled bucket: although you can specialize specific knowledge somewhat, it's hard to specialize advanced reasoning to specific domains because so much of reasoning is a generalized capability that is not unique to, say, coding.

It turns out coding has to do with a lot of the same reasoning needed in math or in legal analysis, even if the grammatical expression is different.

This is less true of lower intelligence tasks. Classification requires a lot less reasoning capacity and so can be much smaller and more specialized.

mediaman · 2026-05-28T23:09:44 1780009784

If "normie" means a noncorporate knowledge worker who uses the free version, yes.

For enterprise, Anthropic is crushing it. In the manufacturing sector I anecdotally hear a 2:1 ratio of Claude to ChatGPT for teams who are settling on a platform.

toraway · 2026-05-29T00:10:18 1780013418

At my company the grassroots advocacy from devs has certainly been for Claude Code.

Unfortunately, even though we have a degree or two of separation from most federal contracts, the punitive DoD blacklisting had enough of a chilling effect on our legal team to make them drag their feet on approving any contract involving Anthropic.

So I pitched OpenAI Business with Codex, which was approved without pushback, so we could drop our GitHub Copilot Business subscription before the billing change takes effect June 1st.

I felt some responsibility for finding an immediate solution to dump Copilot since I was the one who recommended adopting Copilot in the first place, ugh... Our prices would have quadrupled based on the single month Microsoft in their beneficence allowed previewing with their tool to simulate what the post-rug pull pricing would have looked like.

Codex becoming more or less a 1:1 replacement for CC made that a no brainer given our options and the exploitative value proposition of Copilot under the new pricing model (which Microsoft evidently hoped companies like us would just accept despite being a third tier option in the dev space these days).

mediaman · 2026-05-27T23:32:39 1779924759

Just look at large open weights models being served by inference providers.

Kimi 2.6 is a 1 trillion total / 32B active parameter model that's something comparable to Sonnet. Sonnet's API pricing is $5 in, $15 out per million tokens. Deepinfra serves Kimi at $0.75 in, $3.50 out, and about the same at openrouter. So you're looking at a 4-7x multiple that Anthropic is charging compared to market rates that any plebe can get with a credit card.
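As a rough check on that 4-7x figure, here is a minimal sketch that divides the per-million-token prices quoted above. The variable names are illustrative only, and the prices are simply the ones stated in this comment, not values pulled from any provider's API.

```python
# Prices in USD per million tokens, exactly as quoted in the comment above.
# The dictionary names are hypothetical and used only for illustration.
sonnet = {"input": 5.00, "output": 15.00}
kimi_deepinfra = {"input": 0.75, "output": 3.50}

# Price multiple of Sonnet over the Kimi price, per token direction.
ratios = {kind: sonnet[kind] / kimi_deepinfra[kind] for kind in sonnet}

for kind, ratio in ratios.items():
    print(f"{kind}: {ratio:.1f}x")
# prints:
# input: 6.7x
# output: 4.3x
```

The actual blended multiple for a given workload depends on its mix of input and output tokens, so it lands somewhere between those two numbers.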

majormajor · 2026-05-28T01:02:59 1779930179

I'm not sure just how good that looks for Anthropic/OpenAI.

4-7x isn't a tiny markup, but how does that compare to high-margin internet businesses like AdSense? Meta and Google do hundreds of billions in ad revenue a year, and after taking out the publisher's portion (60-80% per some searching), I wonder what the ratio of the remaining tens-of-billions is against the compute cost and headcount required to run it.

And how much room for maintaining or improving that margin do they have if the cheap competitors also continue getting better? Is there a "good enough" point where the easier inference tasks are all moving to vendors massively undercutting them, and then they don't have the volume necessary to justify spending on further cutting-edge development?

re-thc · 2026-05-28T10:26:52 1779964012

> Kimi 2.6 is a 1 trillion total / 32B active parameter model that's something comparable to Sonnet.

No it's not. On some rigged paper maybe. Some such benchmarks say all models group together, which they clearly do not.

> Sonnet's API pricing is $5 in, $15 out per million tokens. Deepinfra serves Kimi at $0.75 in, $3.50 out, and about the same at openrouter. So you're looking at a 4-7x multiple that Anthropic is charging compared to market rates that any plebe can get with a credit card.

That's not saying much. You can get "cloud" at AWS and you can get a VPS. There is likely a 10x difference. It's not the "same". And while AWS costs more, it doesn't have 7x margins either.

mediaman · 2026-05-26T19:18:01 1779823081

I've found that legacy system users (or at least the execs) are pretty excited about AI because they hate their legacy systems but can't really do anything about it (ERP changes are an extreme nightmare, and often no better system exists with all the capability they need). They want to wrap it in AI to automate stuff without changing out the core system.

This seems like a good approach to me. I work with a lot of legacy ERP-using companies in the manufacturing sector and can immediately see how we could put this to use for our customers.

I especially like that it's not doing computer use for everything, which so far doesn't really seem to be working, especially outside the browser.

fchishtie · 2026-05-26T19:23:56 1779823436

yeah those core systems of record are so locked in place - I can't even imagine the change management