More

kouteiheika · 2026-01-15T17:02:26 1768496546

> Imagine you could run a stack of Mac minis that replaced your monthly Claude code bill. Might pay for itself in 6mo (this doesn’t exist yet but it theoretically could happen)

You don't have to imagine. You can, today, with a few (major) caveats: you'll only match Claude from roughly ~6 months ago (open-weight models roughly lag behind the frontier by ~half a year), and you'd need to buy a couple of RTX 6000 Pros (each one is ~$10k).

Technically you could also do this with Macs (due to their unified RAM), but the speed won't be great so it'd be unusable.

kouteiheika · 2026-01-15T11:17:08 1768475828

A natural language based smart home interface, perhaps?

Tiny LLMs are pretty much useless as general purpose workhorses, but where they shine is when you finetune them for a very specific application.

(In general this is applicable across the board, where if you have a single, specific usecase and can prepare appropriate training data, then you can often fine-tune a smaller model to match the performance of a general purpose model that is 10x its size.)

michaelmior · 2026-01-15T11:21:46 1768476106

I think there's a lot of room to push this further. Of course there are LLMs being used for this case and I guess it's nice to be able to ask your house who the candidates were in the Venezuelan presidential election of 1936, but I'd be happy if I could just consistently control devices locally and a small language model definitely makes that easier.

kouteiheika · 2026-01-14T14:42:43 1768401763

Yes. All `&mut` references in Rust are equivalent to C's `restrict` qualified pointers. In the past I measured a ~15% real world performance improvement in one of my projects due to this (rustc has/had a flag where you can turn this on/off; it was disabled by default for quite some time due to codegen bugs in LLVM).

steveklabnik · 2026-01-14T14:45:52 1768401952

Not just all &mut T, but also all &T, where the T does not transitively contain an UnsafeCell<T>. Click "show llvm ir" instead of "build" here: https://play.rust-lang.org/?version=stable&mode=release&edit...

marcianx · 2026-01-14T15:00:23 1768402823

I was confused by this at first since `&T` clearly allows aliasing (which is what C's `restrict` is about). But I realize that Steve meant just the optimization opportunity: you can be guaranteed that (in the absence of UB), the data behind the `&T` can be known to not change in the absence of a contained `UnsafeCell<T>`, so you don't have to reload it after mutations through other pointers.

steveklabnik · 2026-01-14T15:19:48 1768403988

Yes. It's a bit tricky to think about, because while it is literally called 'noalias', what it actually means is more subtle. I already linked to a version of the C spec below, https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf but if anyone is curious, this part is in "6.7.4.2 Formal definition of restrict" on page 122.

In some ways, this is kind of the core observation of Rust: "shared xor mutable". Aliasing is only an issue if the aliasing leads to mutability. You can frame it in terms of aliasing if you have to assume all aliases can mutate, but if they can't, then that changes things.

dmitrygr · 2026-01-15T02:53:38 1768445618

Do you not use restrict in your normal everyday C code that you write? I use it in my normal C code.

kouteiheika · 2026-01-15T04:54:42 1768452882

I used to use it, but very rarely, since it's instant UB if you get it wrong. In tiny codebases which you can hold in your head it's probably practical to sprinkle it everywhere, but in anything bigger it's quite risky.

Nevertheless, I don't write normal everyday C code anymore since Rust has pretty much made it completely obsolete for the type of software I write.

kazinator · 2026-01-15T08:08:02 1768464482

restrict works by making some situations undefined behavior that would otherwise be defined without it. It is probably unwise to use casually or habitually.

kazinator · 2026-01-15T08:03:44 1768464224

But of course the only thing restrict does in C is potentially introduce certain kinds of undefined behavior into a program that would be correct without it (and then things can be optimized on the assumption that the code is not invoked in a way that it would happen)

kouteiheika · 2026-01-13T22:37:20 1768343840

> With the one AI, we can do word-to-image to generate an image. Clearly, that is a derived work of the training set of images

> The question of whether AI is stealing material depends exactly on what the training pathway is; what it is that it is learning from the data.

No it isn't. The question of whether AI is stealing material has little to do with the training pathway, but everything to do with scale.

To give a very simple example: is your model a trillion parameter model, but you're training it on 1000 images? It's going to memorize.

Is your model a 3 billion parameter model, but you're training it on trillions of images? It's going to generalize because it simply doesn't physically have the capacity to memorize its training data, and assuming you've deduplicated your training dataset it's not going to memorize any single image.

It literally makes no difference whether you'll use the "trained on the same scene but one in daylight and one at night" or "generate the image based on a description" training objective here. Depending on how you pick your hyperparameters you can trivially make either one memorize the training data (i.e. in your words "make it clearly a derived work of the training set of images").

kouteiheika · 2026-01-13T10:37:17 1768300637

> It’s such a commodity that there are only 3 SOTA labs left and no one can catch them.

No one can outpace them in improving the SOTA, everyone can catch up to them. Why are open-weight models perpetually 6 months behind the SOTA? Given enough data harvested from SOTA models you can eventually distill them.

The biggest differentiator when training better models are not some new fancy architectural improvements (even the current SOTA transformer architectures are very similar to e.g. the ancient GPT-2), but high quality training data. And if your shiny new SOTA model is hooked into a publicly available API, guess what - you've just exposed a training data generator for everyone to use. (That's one of the reasons why SOTA labs hide their reasoning chains, even though those are genuinely useful for users - they don't want others to distill their models.)

kouteiheika · 2026-01-11T08:03:08 1768118588

> when we could have had an open world with open models run locally instead where you got to keep your private health information private

But we can have that? If you have powerful enough hardware you can do it, right now. At very least until the anti-AI people get their way and either make the models' creators liable for what the models say or get rid of the "training is fair use" thing everyone depends on, in which case, sure, you'll have to kiss legal open-weight models goodbye.

kouteiheika · 2026-01-09T06:41:20 1767940880

So you'd prefer that only rich megacorporations and criminals have access to this technology, and not normal people and researchers?

karlgkk · 2026-01-17T23:27:41 1768692461

No? Show me where I said that please :)

renewiltord · 2026-01-09T07:03:19 1767942199

How is that surprising? The advent of modern AI tools has resulted in most people being heavily pro-IP. Everyone now talks about who has the copyright to something and so on.

kouteiheika · 2026-01-09T07:53:15 1767945195

Yes, people are now very pro-IP because it's the big corporations that are pirating stuff and harvesting data en-masse to train their models, and not just some random teenagers in their basements grabbing an mp3 off LimeWire. So now the IP laws, instead of being draconian, are suddenly not adequate.

But what is frustrating to me is that the second order effects of making the law more restrictive will be doing us all a big disfavor. It will not stop this technology, but it will just make it more inaccessible to normal people and put more power into the hands of the big corporations which the "they're stealing our data!" people would like to stop.

Right now I (a random nobody) can go on HuggingFace, download model which is more powerful that anything that was available 6 months ago, and run it locally on my machine, unrestricted and private.

Can we agree that's, in general, a good thing?

So now if you make the model creators liable for misuse of the models, or make the models a derivative work of its training data, or anything along these lines - what do you think will happen? Yep. The model on HuggingFace is gone, and now the only thing you'll have access to is a paywalled, heavily filtered and censored version of it provided by a megacorporation, while the megacorporation itself has internally an unlimited, unfiltered access to that model.

Joel_Mckay · 2026-01-09T17:37:27 1767980247

>Can we agree that's, in general, a good thing?

The models come from overt piracy, and are often used to make fake news, slander people, or other illegal content. Sure it can be funny, but the poison fruit from a poison tree is always going to be overt piracy.

I agree research is exempt from copyright, but people cashing in on unpaid artists works for commercial purposes is a copyright violation predating the DMCA/RIAA.

We must admit these models require piracy, and can never be seen as ethical. =3

'"Generative AI" is not what you think it is'

https://www.youtube.com/watch?v=ERiXDhLHxmo

kouteiheika · 2026-01-10T11:34:49 1768044889

> are often used to make fake news, slander people, or other illegal content.

That's not how these models are used in the the vast majority of cases.

This argument is like saying "kitchen knives are often used to kill people so we need to ban the sale of kitchen knives". Do some people use kitchen knives to kill? Sure. Does it mean they should be banned because of that?

> I agree research is exempt from copyright, but people cashing in on unpaid artists works for commercial purposes is a copyright violation predating the DMCA/RIAA. We must admit these models require piracy, and can never be seen as ethical. =3

So, may I ask - where exactly do you draw the line? For the sake of argument, let's imagine something like this:

    1. I scrape the whole internet onto my disk.
    2. I go through the text, and gather every word bigram, and build a frequency table.
    3. I delete everything I scraped.
    4. I use that frequency table (which, compared to the exabytes of the source text I used to build it, is a couple hundred megabytes at most) to build a text generator.
    5. I profit from this text generator.

Would you consider this unethical too? Because this is essentially how LLMs work, just in a slightly fancier way. On what exact basis do you draw the line between "ethical" and "unethical" here?

Joel_Mckay · 2026-01-10T20:39:30 1768077570

> 1. I scrape the whole internet onto my disk.

This is illegal under theft-of-service laws, and a violation of most sites terms-of-service. If these spider scapers respected the robot exclusion standard under its intended use-case for search-engines, than getting successfully sued for overt copyright piracy and quietly settling for billions would seem unfair.

Note too, currently >52% of the web is LLM generated slop, so any model trained on that output will inherit similar problems.

> 2. I go through the text, and gather every word bigram, and build a frequency table.

And when (not if) a copyrighted work is plagiarized without citation it is academic misconduct, IP theft, and an artistic counterfeit. Copyright law is odd, and often doesn't make a distinction about the origin of similar works. Note this part of the law was recently extended to private individuals this year:

"OpenAI Stole Scarlet Johansson's Voice"

https://www.youtube.com/watch?v=YhgYMH6n004

> 3. I delete everything I scraped.

This doesn't matter if the output violates copyright. Images in jpeg format are compressed in the frequency domain, have been around for ages, and still get people sued or stuck in jail regularly.

Academic evaluation usually does fall under a fair-use exception, but the instant someone sells or uses IP in some form of trade/promotion it becomes a copyright violation.

> 4. I use that frequency table

See above, the how it is made argument is 100% BS. The statistical salience of LLM simply can't prevent plagiarism and copyright violations. This was cited in the original topic links.

> 5. I profit from this text generator.

Since this content may inject liabilities into commercial settings, only naive fools will use this in a commercial context. Most "AI" companies lose around $4.50 per new customer, and are a economic fiction driven by some very silly people.

LLM businesses are simply an unsustainable exploit. Unfortunately they also proved wealthy entities can evade laws through regulatory capture, and settling the legal problems they couldn't avoid.

I didn't make the rules, but do disagree cleverness supersedes a just rule of law. Have a wonderful day =3

beeflet · 2026-01-09T07:07:31 1767942451

intellectual property isn't going to save us. it's a flimsy retort, like the water usage complaints

Joel_Mckay · 2026-01-09T17:26:49 1767979609

This covers the data center resource green-washing rhetoric, and most taxpayers will be paying more for energy now regardless of what they think:

'"Generative AI" is not what you think it is'

https://www.youtube.com/watch?v=ERiXDhLHxmo

And this paper proved the absurd outcome of the bubble is hype:

'Researchers Built a Tiny Economy. AIs Broke It Immediately'

https://www.youtube.com/watch?v=KUekLTqV1ME

It is true bubbles driven by the irrational can't be stopped, but one may profit from peoples delusions... and likely get discount GPUs when the economic fiction inevitably implodes. Best of luck =3

beeflet · 2026-01-09T21:37:03 1767994623

We can generate more energy, fabricate more computer chips and collect more water, but the impact on labor will be irreversible.

Joel_Mckay · 2026-01-10T02:44:05 1768013045

Energy is finite, and asking the public to fund a private firms irrational project is unethical.

"Memoirs of extraordinary popular delusions and the madness of crowds" (Charles Mackay, 1852)

https://www.gutenberg.org/files/24518/24518-h/24518-h.htm

I look forwards to buying the failed data center assets. LLM make great search engines, but are not the path to "AGI". Neuromorphic computing looks more interesting. Have a great day =3

beeflet · 2026-01-10T09:14:41 1768036481

The amount of electricity we can produce is limited only by regulation, because we have practically unlimited amount of fission energy under our feet. That is what you are seeing now with all of these new nuclear plants being built and de-decommissioned. If that is too scary for you, we also have the world's greatest reserves of shale gas.

I am not pro-AI, and I agree that the market will crash. But what I take issue with is this NIMBY mentality that we should nitpick proposals with a thousand fake reasons for why we can't build anything in this country. We can't do big engineering projects like china because they are too much of an eyesore or they use too much water or they're not zoned correctly.

We can't put up a new apartment block, it's too much of a strain on the local water supply. Okay can we collect more water, invest in a new reservoir? Of course not, it will endanger the tumbleweed population.

We can't let a new datacenter go up because it will cause everyone's power prices to increase. Okay maybe we can produce more power?? No, BECAUSE ENERGY IS FINITE AND THE SUN IS JUST GOING TO EXPLODE ANYWAYS SO WHY DO YOU EVEN CARE. WTF?

Why can't we build things? Because we just can't, and actually it's impossible and you are rude for suggesting we build anything ever. It's circular reasoning designed to placate suburban NPCs.

If you oppose AI because it is ruining art, or it will drive people out of jobs, just say that. Because these fake complaints about power and water are neither compelling nor effective (they are just technological and material problems which will be ironed out in the coming generations).

Joel_Mckay · 2026-01-10T11:33:07 1768044787

These firms can do what they like if and only if they pay for every $7B reactor, the 30k year waste stewardship, and disconnect from community resources people paid for with taxes. However, currently the unethical burden cities with the endless bill for resources, contribute no actual value, and one may spot the data center waste heat signatures and industrial run-off from space.

Consider most "AI" firms lost on average $4.50 for every new user, rely on overt piracy, and delusional boards sand-bagging for time... these LLM businesses are simply unsustainable fictions.

Many problems don't have simple answers, but one may merely profit by their predictable nature. I would recommend volunteering with a local pet rescue society if you find yourself getting upset about trivia. Have a great day. =3

https://www.youtube.com/watch?v=JAcwtV_bFp4

https://www.youtube.com/watch?v=Xx4Tpsk_fnM

https://www.youtube.com/watch?v=t-8TDOFqkQA

https://www.youtube.com/watch?v=yftBiNu0ZNU

https://www.youtube.com/watch?v=vrTrOCQZoQE

beeflet · 2026-01-10T21:51:26 1768081886

What trivia? I don't disagree that the AI companies are unprofitable.

These AI companies are paying for the reactors. As for waste, The Department of Energy handles spent nuclear fuel. Protests against the construction of yucca mountain have made this impossible. Nuclear power plants repeatedly sue the US Government for the cost of storing this nuclear waste on-site, because it's the DOE's problem.

And it is a totally artificial political problem. It is not even nessisarially "waste" in the sense that we ordinarily think: there is a significant amount of fissile isotope in spent fuel and countries like france recycle the majority of spent nuclear fuel. We could do the same with the right infrastructure, and it would vastly decrease the amount of waste we produce and uranium we need to mine.

My point is that complaints in these youtube videos you link (which I am very accustomed to, I have been following this for decades) present the argument that AI is politically dangerous, and this is totally separate from these material complaints (not enough water, not enough power, not enough chips, etc.) you pretend are a significant problem.

These are just extrinsic flaws which can be solved (and WILL be solved, if the USA is able to restore its manufacturing base, which it should). But my issue is purely with the intrinsic dangers of this tech, which are not fixable.

Some of the videos you link are just this suburban NIMBY nagging about muh noise pollution. You might as well get a video of people complaining about EMF pollution. The big issue here is that AI is going to take all of our jobs and will essentially harken the end of the world as we know it. It is going to get incredibly ugly very soon. Who cares about what some 50 year old boomer homeowner (who isn't going to live to see this unfold anyways) thinks about some gray building being built remotely nearby their suburb. They should go back to watching TV.

As for me, I am going to campaign to have my local pet rescue society demolished. It uses too much water and space and electricity, and for what? Something I don't care for? Seems unethical to me that I should bear the cost incurred through increased demand for these resources, even though I did not explicitly consent to the animal shelter being constructed.

Joel_Mckay · 2026-01-10T22:25:09 1768083909

>These AI companies are paying for the reactors.

This is demonstrably false with negative revenue, and when the gamblers default on the loans it is the public that will bear the consequences. Similar to sub-prime mortgages people on the con are getting tired.

Dismissing facts because you personally feel they are not important is silly. If you think the US will "win" the "AGI" race... than you are fooling yourself, as everything has already been stolen.

Have a great day, and maybe go outside for a walk to settle down a bit if you are uncomfortable with the way imaginary puppies, bunnies, and kittens make you feel. Community non-profit organizations offer tangible goodwill, and are very different from ephemeral LLM fads externalizing a suckers-bet on the public. =3

https://www.youtube.com/watch?v=FcGLveebwjo

Joel_Mckay · 2026-01-09T07:01:46 1767942106

The studios did already rip off Mark Hamill of all people.

Arguing regulatory capture versus overt piracy is a ridiculous premise. The "AI" firms have so much liquid capital now... they could pay the fines indefinitely in districts that constrain damages, and already settled with larger copyright holders like it was just another nuisance fee. =3

spencerflem · 2026-01-09T09:05:15 1767949515

Why not? I don’t think normal people have very many good uses for deepfake tech.

scotty79 · 2026-01-09T09:56:21 1767952581

Who is normal person? Non-creative? Deepfakes have immense creative potential.

spencerflem · 2026-01-09T16:38:13 1767976693

I don’t really see it to be honest. I feel like their best and most natural use is scams.

Maybe a different comparison you would agree with is Stingrays, the devices that track cell phones. Ideally nobody would have them but as is, I’m glad they’re not easily available to any random person to abuse.

Joel_Mckay · 2026-01-09T17:41:56 1767980516

>Deepfakes have immense creative potential

...and the lawyers win. =3

https://www.youtube.com/watch?v=zpcWv1lHU6I

kouteiheika · 2026-01-08T05:33:51 1767850431

> modern LLM architectures (which aren't that different) on his website and in the github repo: e.g. he has a whole article on implementing the Qwen3 architecture from scratch.

This might be underselling it a little bit. The difference between GPT2 and Qwen3 is maybe, I don't know, ~20 lines of code difference if you write it well? The biggest difference is probably RoPE (which can be tricky to wrap your head around); the rest is pretty minor.

libraryofbabel · 2026-01-08T06:02:57 1767852177

There’s Grouped Query Attention as well, a different activation function, and a bunch of not very interesting norms stuff. But yeah, you’re right - still very similar overall.

kouteiheika · 2026-01-07T04:54:49 1767761689

> Would also love to better understand what factors into "accuracy" since there might be some nuance there depending on the measure.

It's accuracy across GSM8K, MMLU, IFEVAL and LiveCodeBench.

They detail their methodology here: https://byteshape.com/blogs/Qwen3-4B-I-2507/

kouteiheika · 2026-01-07T04:53:46 1767761626

It's accuracy across GSM8K, MMLU, IFEVAL and LiveCodeBench.

They detail their methodology here: https://byteshape.com/blogs/Qwen3-4B-I-2507/