That seems incredibly prescient for accounts created before even GPT-1. Obviously broad data scraping existed before then, but even amongst this crowd I find it hard to believe that’s the real motivator.
Yes, they do. It's called the Neural Engine, and it's an NPU. It isn't being used for local LLMs on Macs because it's optimized for power-efficient inference on much smaller models.
Meanwhile, the GPU is powerful enough for LLMs but has lacked dedicated matrix multiplication acceleration. This changes that.
From a compute perspective, GPUs are mostly about fast vector arithmetic, from which you can build decently fast matrix multiplication. But starting with NVIDIA's Volta architecture at the end of 2017, GPUs have been gaining dedicated hardware units for matrix multiplication. The main motivation for adding matrix multiplication hardware to GPU architectures is machine learning. These units aren't directly useful for 3D graphics rendering, but their inclusion in consumer GPUs has been justified by ML-based post-processing and upscaling features like NVIDIA's various iterations of DLSS.
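To make the "vector arithmetic implements matmul" point concrete, here's a minimal sketch (in NumPy, purely for illustration): each step of the inner loop is one vectorized scale-and-accumulate over a row, which is the kind of primitive classic GPU shader cores execute fast. Dedicated matmul units instead perform a whole small tile multiply-accumulate as a single hardware operation.

```python
import numpy as np

def matmul_via_vector_ops(A, B):
    """Compute C = A @ B using only vector multiply-adds (axpy-style)."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n))
    for i in range(m):
        for p in range(k):
            # One fused multiply-add across an entire row of B:
            # scalar A[i, p] times vector B[p, :], accumulated into C[i, :].
            C[i, :] += A[i, p] * B[p, :]
    return C

A = np.random.rand(4, 3)
B = np.random.rand(3, 5)
# Matches the reference matmul up to floating-point rounding.
assert np.allclose(matmul_via_vector_ops(A, B), A @ B)
```

The vector version needs m x k separate vector instructions; a tensor-core-style unit collapses a whole 4x4 or 16x16 tile of that work into one instruction, which is where the throughput win for ML workloads comes from.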