You get considerably more ML FLOPS per dollar from a 4090 than from any Mac. The base M2 Max seems to be at roughly the same price point, though it does grant you more RAM.
Quadro and Tesla cards might be a different story. I would still like to see concrete FLOPS/$ numbers.
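For a rough sense of scale, here's a back-of-envelope sketch. The TFLOPS figures are approximate FP32 spec-sheet numbers and the ~$2000 price points are assumptions from this thread; peak FLOPS is a crude proxy for real ML throughput, which depends heavily on precision, memory, and software.

```python
# Back-of-envelope GFLOPS per dollar; spec numbers and prices are assumptions,
# and peak FP32 throughput is a poor proxy for real-world ML performance.
cards = {
    "RTX 4090": (82.6e12, 2000),  # ~82.6 FP32 TFLOPS, ~$2000 street price
    "M2 Max":   (13.6e12, 2000),  # ~13.6 FP32 TFLOPS (GPU), base Mac Studio price
}
for name, (flops, usd) in cards.items():
    print(f"{name}: {flops / usd / 1e9:.1f} GFLOPS/$")
```

By this crude measure the 4090 comes out far ahead, which is the FLOPS/$ gap being discussed.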
They don't need the entire Mac. Their cost per Max chip is probably $200-300, which beats the 4090 by a massive margin, and each chip can do more than a 4090 because it also has a CPU onboard.
The 4090 peaks at around 550W, which means they can run 5+ of their Max chips in the same power budget.
A 4090 is $2000. Apple can probably put 5 chips on a custom motherboard for that cost. They'll use the same amount of power but get a lot more raw compute.
> Their cost per Max chip is probably $200-300 which beats the 4090 by a massive margin...
That's true. I was talking about end user pricing.
> ...each chip can do more than a 4090 because it also has a CPU onboard.
That's a strange thing to say. It has a CPU, correct. That makes the chip more versatile, but for data-center ML tasks it doesn't really matter. A 4090 also has much more ML-relevant compute per chip, so Apple's chips can't really "do more than a 4090" in any relevant way.
Of course Apple pays less for its in-house chips than for external products. That comparison doesn't seem relevant in this context, though; e.g. they're not going to be challenging CUDA with internal chips.
They might get more compute per watt, though. My guess is that Nvidia's datacenter chips are competitive in that space, but that's another story.
You need to consider this in the context of the relevant task. Nvidia GPUs have extremely high peak performance for GEMM, but when working with LLMs, bandwidth (and RAM capacity) becomes the limiting factor. There is a reason why real ML-focused datacenter Nvidia GPUs use much wider RAM interfaces and a much higher price point. The M2 Ultra might not have the raw compute, but it has a lot of RAM and large caches.
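A roofline-style sketch of why bandwidth dominates here: generating each token of an LLM streams roughly the entire set of weights from memory, so single-stream decode speed is capped at about bandwidth / model size. The bandwidth figures below are approximate spec numbers used purely for illustration.

```python
# Decode-speed ceiling for a memory-bandwidth-bound LLM (single stream).
# Bandwidth figures are approximate spec numbers, assumed for illustration.
def tokens_per_sec(bandwidth_gb_s, params_billion, bytes_per_param):
    model_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# A hypothetical 70B-parameter model with 8-bit weights (70 GB):
for name, bw in [("RTX 4090 (~1008 GB/s)", 1008),
                 ("M2 Ultra (~800 GB/s)", 800),
                 ("H100 SXM (~3350 GB/s)", 3350)]:
    print(f"{name}: ~{tokens_per_sec(bw, 70, 1):.0f} tok/s ceiling")
```

Note that a 70 GB model doesn't even fit in a 4090's 24 GB of VRAM, which is exactly the capacity point: on large models the M2 Ultra's big unified RAM matters more than raw GEMM throughput.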
Part of the advantage of using "one 4090" is that the max TDP is only 450W, as opposed to 5 M2 Ultras running at ~150W each. When you scale up to Nvidia's latest Blackwell architecture, I genuinely don't know how Apple could beat them on performance-per-watt. Buying M2 Ultras wholesale is probably cheaper than an NVL72 cluster, but it's certainly not what you'd want for Linux or for maximizing AI performance-per-watt.
You are missing the point. We're discussing if Apple can use their own chips more cheaply than buying Nvidia's chips.
The max TDP is not the actual peak power consumption. Gamers Nexus recorded a 500W peak, and almost 670W overclocked. Most reviews I've looked at put peak power consumption around 550W.
The M2 Ultra wasn't even mentioned, and it uses more than 150W. The right comparison would be the M3 Max, since we have solid numbers on it: the M3 Max uses around 100W when both the GPU and CPU are heavily utilized, and less than that when only the GPU is used.
This means Apple could run 5 of their M3 Max chips in the same peak power as one 4090. And the 4090 doesn't run in a vacuum: it requires a separate CPU setup and a couple hundred more watts. That means they could power 7 or so M3 Max chips with the same amount of power.
Of course, this isn't the whole story. The 4090 isn't a professional chip either (while Apple can bin and certify their own chips and know they're getting a server-grade part), and the 4090 also doesn't have nearly enough RAM. The H100 starts at $25,000 and goes up; Apple could buy 75-100 M3 Max chips for that kind of money. That's certainly a lot more compute than an H100 would offer, and Blackwell will be even more expensive in comparison.
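Spelling out the arithmetic in that argument (the M3 Max power draw, host overhead, and per-chip cost figures are this thread's assumptions, not measured or published numbers):

```python
# Power budget: how many ~100W M3 Max chips fit in a 4090 system's envelope.
peak_4090_w = 550          # measured peak per the reviews cited above
host_overhead_w = 150      # assumed CPU/platform overhead for the 4090 box
m3_max_w = 100             # assumed draw under combined CPU+GPU load

print(peak_4090_w // m3_max_w)                      # chips within the GPU's budget alone
print((peak_4090_w + host_overhead_w) // m3_max_w)  # chips within the whole system's budget

# Cost: M3 Max chips per H100 budget, at an assumed internal cost per chip.
h100_usd = 25_000
m3_max_cost_usd = 300      # assumed; actual internal cost is unknown
print(h100_usd // m3_max_cost_usd)
```

That yields 5 chips in the GPU-only budget, 7 in the system budget, and roughly 83 chips per H100 dollar, matching the 75-100 range above once the cost assumption is varied between $250 and $333.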
The M2 is a chip designed for a laptop (and it is quite powerful given its low power consumption). Presumably they have a different chip, or at least a completely different configuration (RAM, networking, etc.), in their data centers.
The interesting point here is that developers targeting the Mac can safely assume that the users will have a processor capable of significant AI/ML workloads. On the Windows (and Linux) side of things, there's no common platform, no assumption that the users will have an NPU or GPU capable of doing what you want. I think that's also why Microsoft was initially going for the ARM laptops, where they'd be sure that the required processing power is available.
> The interesting point here is that developers targeting the Mac can safely assume that the users will have a processor capable of significant AI/ML workloads
Also that a significant proportion (majority?) of them will have just 8 GB of memory, which is not exactly sufficient to run any complex AI/ML workload.
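For a sense of scale, weight memory alone for a common local-model size (illustrative arithmetic only; real usage adds KV cache, activations, and the OS's own footprint on top):

```python
# GB of weight memory for a model with `params_b` billion parameters.
def weight_gb(params_b, bits_per_param):
    return params_b * bits_per_param / 8

print(weight_gb(7, 16))  # a 7B model at fp16: 14.0 GB, already over 8 GB total RAM
print(weight_gb(7, 4))   # the same model 4-bit quantized: 3.5 GB, fits with room to spare
```

So on an 8 GB machine even a modest 7B model only fits heavily quantized, and the unified memory is shared with everything else running.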
I believe MS is trying to standardize this, in the same way they do with DirectX feature levels, but I agree it's probably going to be inherently a bit less consistent than Apple's offerings.
How does it help me (with a maxed-out M3 Max) right now that Apple might have some chip in the future? I do DL on an A6000 and a 4090; I'm not waiting until Apple someday produces a chip that's faster than a 1650 in ML...
There was a rumor floating around that Apple might try to enter the server chip business with an AI chip, which is an interesting concept. Apple's never really succeeded in the B2B business, but they have proven a lot of competency in the silicon space.
Even their high-end prosumer hardware could be interesting as an AI workstation given the VRAM available if the software support were better.
> Apple's never really succeeded in the B2B business
Idk, every business I've worked at and all the places my friends work seem to be 90% Apple hardware, with a few Lenovos issued for special-case roles in finance or something.
Of course you do; Apple's selling mobile SoCs, not high-end cards. That doesn't mean they're incapable of making them for the right application. You don't seriously think the server farms are running on M4 Pro Max chips, do you...