Intel has been sitting twiddling its thumbs for years, so it was easy for Apple to actually innovate and create an incredible processor; controlling the entire stack and being able to switch away from x86 also helps.
But on the GPU side NVIDIA and AMD have been battling it out for a while, competition is tight, and thinking Apple could come in, compete, and actually beat the fastest GPU currently available to consumers was a bit of a pipe dream, in hindsight. The GPU market hasn't been sitting on its laurels for as long as Intel has. That said, I am still positively surprised at how close they managed to get to the 3090, but alas, no cigar.
Beside "being able to switch from x86", there are some other factors that make this feat look easier than it actually is:
- building on the ARM architecture, which (1) saved them a lot of design cost and (2), being a simpler architecture, already has a "built-in" performance advantage over x86.
- having years of experience already with the "A-series" of chips used in iPhones and iPads since 2010. Everyone is talking about "Apple silicon" now, but these predecessors are often forgotten.
- having a privileged partnership with TSMC, where they have access to the latest and best processes and priority over all other TSMC customers.
Just within the Mac they've gone from 680x0 to PowerPC (related to POWER but separate), to x86/AMD64, to Apple Silicon. Before the Mac, they also had the 6502-based platforms.
They went 32-bit x86 before adding amd64 support, and then eliminated 32-bit support right before going to Apple Silicon (presumably motivated in part by Rosetta 2 being optimized for translating amd64 software).
That was like 20 years ago. I wonder how many of those folks are still around. It would be interesting to understand how they ran this project internally. Did it build off of that previous work, or did they approach it from the ground up?
The mere fact that Apple's claims to challenge NVidia's best weren't dismissed with a laughable graph, and instead were concluded to be "well, a bit of a marketing stretch, but not a bad attempt Apple"...
When Intel has been failing at discrete graphics competition with ATI/NVidia for... two decades? Yet Apple cranks out a functional competitor a year after their first desktop chip. Yes they probably have lots of experience with integrated graphics on SOC for the iPhones so this isn't totally out of the blue. But still.
What has Intel been doing all these years? Is it THAT dysfunctional? Add to that the failures of VLIW/Itanium, loss of the SSD crown, underwhelming XPoint, no ARM competitors, no real position in mobile, and twice getting passed in the performance race by AMD, a company with 1/10th, possibly 1/100th, the resources.
Intel did make a GPU recently, I think? I thought they have at least improved from "just don't play games on this" to "plays fine if it isn't power hungry". Considering they haven't really done anything about gaming in the past 10(?) years, it is a decent improvement.
Do you realize that Apple Silicon is the result of decades of chip design experience? They didn't just decide one day to do it by Q3 next year.
Intel rested on their laurels because AMD almost wiped themselves out, which led to Intel getting complacent with their process. The actual designs were quite good; it took AMD a few years of a better process and newer designs to properly dethrone Intel (which Intel has now reclaimed in the mid range; this will now yo-yo, as it always should).
> Did Bulldozer really lose or was the competition cheating?
Bulldozer and its competition were both designed at a time when leaking information to another process on the same physical core was outside the threat model CPU memory protection was designed to mitigate. Intel was optimizing aggressively within the envelope of expectations at the time, which were upended by the rise of cloud computing.
The worst case I see is approximately a 33% performance penalty in aggregate for Xeon parts, with much worse performance in specific scenarios. Comparing against the original Bulldozer benchmarks, this does close the gap like the GP suggested.
Also, refresh your memory on how the mitigations work because there’s definitely an impact for most programs.
Semi-custom (aka console) chips kept them afloat, and they executed well on Ryzen while Intel made modest year-over-year improvements. Doesn't mean they weren't working on a relatively tight R&D budget, though.
Ryzen worked in part because it didn't try to be super-clever. Bulldozer was very opinionated about how computing would look (cheap cpu + big coprocessor) whereas Zen is much more practical
A large part, likely the majority, of Intel's CPU dominance had been a result of their "unfair" advantage in fabs, not in hardware design. An often-quoted rule of thumb is 90%/10%: 90% of the progress in semiconductors is attributable to process shrinkage, 10% to improved hardware design. TSMC contributed the lion's share in making AMD competitive again. Design-wise, chips from Intel, AMD and Apple are all incredibly impressive. Afaik, Apple pays top dollar to get access to the highest-end TSMC process - any performance comparison should keep that in mind.
My understanding is that everything was pretty much wrapped up with the quid pro quo settlement where Intel allows AMD an x86 license and AMD allows Intel a license to the x64 extensions, and it also applies to extensions since then, like AVX.
I don't think any money actively changes hands on an ongoing basis for this, but when this was signed Intel did pay $1.25 billion and AMD agreed to drop anti-trust complaints against Intel.
Absolutely. I wasn't expecting Apple to beat any decent current GPU on performance.
What I like about the M1 is that I get 60-70% of the performance with a chip that is much smaller and cooler than the GPUs it beats. I think for a long time we thought of computers as big, bulky, noisy heat generators, and Apple just shows that we can achieve roughly the same in much smaller equipment that is much cooler (in terms of temperature).
M1 Ultra is 114 billion transistors, while the RTX 3090 is 28 billion. (Not sure of physical size of the chips, though - do you have a reference for that?)
Original point stands, of course, on efficiency/heat!
The last time Intel released transistor count was for the i9-7980XE. It's a physically larger 18 core chip on the same process node as the 10 core 10900K tested, and googling says it had 9 billion transistors. If we assume that their improvements on 14nm let them fit all the same chips as the larger CPU into the newer one, then.... the i9 transistor count doesn't make a meaningful difference to the comparison.
> Intel has been sitting twiddling its thumbs for years,
What evidence do you have for that?
Intel really wanted to release (in my opinion) very interesting processors, but had serious problems over years with getting their 10 nm process "stable" (i.e. sufficiently high and reproducible yield rates). They even back-ported some of their planned CPU architectures to their working 14 nm process "just to get something out". Since these processors were not developed with 14 nm in mind, they of course were not as good as they could have been.
> Intel really wanted to release (in my opinion) very interesting processors...
No. Intel just wanted to show some little improvements to keep the performance gap constant, hide their neat tricks until the competition catches up, and use them only when clients or the market really demanded it.
The only notable efforts I've seen were reducing the performance penalty of SpeedStep frequency switching, making better memory controllers to "catch" AMD, and other power-gating and independent throttling capabilities to address density issues in systems.
When fab/power/thermal issues became apparent, they started to hide AVX/AVX2 frequencies, created frankenprocessors for some applications, etc.
However, I've seen no real effort to make groundbreaking innovations in the x86 space beyond protecting what they already had.
Performance counters, and the other underlying piping to make the processor observable, were nice though.
As a result, I can still use a 3rd generation i7 as a daily driver for almost all tasks at hand, including development. The only definitive performance difference shows itself when I run my scientific code after compiling it with platform-specific optimizations on newer systems. In that regard, an M1 MacBook Air can be 25% faster than a 7th gen 7700K processor, and I find it ironic.
> However, I've seen no real effort to make groundbreaking innovations in the x86 space beyond protecting what they already had.
I consider what AVX-512 has to offer to be highly innovative.
Unluckily, just when they planned to introduce AVX-512 into most desktop/laptop CPUs (not just server CPUs or special-purpose accelerators), the problems with 10 nm occurred. So this was delayed a lot, and even today many of Intel's desktop/laptop CPUs have no support for this feature.
Also, Intel TSX was in my opinion really innovative (even though this feature was, to my knowledge, mostly used in (business) databases; what a pity).
I wouldn't call wider SIMD lanes terribly innovative. Particularly when they come with power costs, time penalties just to fill the registers with enough data from cache or memory, and real workloads that don't benefit from SIMD as much in practice, because compilers are terrible at autovectorization (and humans are only marginally better at doing it manually).
AVX-512 is an example of a feature that improves special cases that show up in faux-workloads (eg: fancy benchmarks and HPC) but does not manifest higher performance for the vast majority of workloads, including things that ostensibly should be embarrassingly parallel and reap gains from SIMD.
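To make the autovectorization point concrete, here are two toy loops (a hedged sketch; the function names are mine and actual behavior is compiler- and flag-dependent) of the sort that show up constantly in real code and get little or nothing out of wide SIMD:

    #include <cstddef>

    // (1) Loop-carried dependence: each iteration needs the previous result,
    //     so no amount of SIMD width helps here.
    void running_decay(float *x, std::size_t n) {
        for (std::size_t i = 1; i < n; ++i)
            x[i] += 0.5f * x[i - 1];
    }

    // (2) Possible aliasing plus a data-dependent store: many compilers either
    //     give up or emit runtime checks; adding __restrict and masked stores
    //     is usually what makes this vectorizable in practice.
    void scale_positive(float *dst, const float *src, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            if (src[i] > 0.0f)
                dst[i] = 2.0f * src[i];
    }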
SIMD units are just a miniaturization of older vector co-processors, a la CRAY-in-a-box.
As an HPC sysadmin and scientific software developer/researcher, I can confidently say that SIMD can provide real performance gains, however there are trade-offs and decisions to be made.
- First of all, SIMD is very data-hungry. You either need to constantly push data into it, or do a lot of work on the data you've pushed. Otherwise it just sits idle.
- Then there's the power and frequency penalty. In Intel's case, it needs a humongous amount of power in CPU budget terms, and that creates heat and slowdowns. So you have to test your code with and without SIMD (-mtune, -march, etc.). If your code is as fast or faster, use SIMD.
- Moreover, you can't just compile an extremely optimized binary and fan it out. Older processors will just throw "illegal instruction" and halt. You either provide multiple binaries with specific optimizations for each, or a lowest common denominator per vendor (an AMD binary and an Intel binary), or just leave it all out. The best way is giving the source out and providing a simple makefile to let the researcher/user compile it, but not all code is open, one may guess. Creating a universal binary with multiple code paths is also possible, yet it needs a lot of elbow grease and may not always be optimal.
- Lastly, your code doesn't have to be embarrassingly parallel to use SIMD. Matrix/linear algebra libraries like Eigen can use almost all of the processor's units when compiled with the correct flags (-O3, -mtune=native, -march=native). However, if you want to accelerate small data with SIMD, you need to create a parallel loop that saturates the SIMD pipelines, which OpenMP can easily do with a parallel for (a minimal sketch follows below).
All of this doesn't change the fact that SIMD is a special horse which can't run on all courses; however, it's not useless.
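For reference, a minimal sketch of that last point, assuming GCC or Clang (the function name is mine): an elementwise loop that OpenMP spreads across cores and the compiler across SIMD lanes, provided you build with the right flags.

    #include <cstddef>

    // y[i] = a*x[i] + y[i]; OpenMP splits the range across threads and the
    // "simd" clause asks the compiler to vectorize each thread's chunk.
    // Assumes x and y don't overlap. Build with something like:
    //   g++ -O3 -march=native -fopenmp saxpy.cpp
    void saxpy(float a, const float *x, float *y, std::ptrdiff_t n) {
        #pragma omp parallel for simd
        for (std::ptrdiff_t i = 0; i < n; ++i)
            y[i] = a * x[i] + y[i];
    }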
I didn't say it was useless, just that it wasn't a magic bullet and AVX-512 isn't particularly innovative, and doesn't solve most users' problems.
I think you're missing the point of my post. I agree with your points in their specifics (except one, but this isn't the forum to discuss FMV in modern compilers), yet they miss the grander point that Intel hasn't made computers faster via more SIMD. The amount of expertise required to make use of it is just more evidence of that.
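Not to reopen the FMV tangent, but for reference, a minimal sketch of what modern compilers offer here, assuming GCC's target_clones attribute (recent Clang supports it too; the function name is mine): one source function, one binary, per-ISA clones picked at load time.

    #include <cstddef>

    // GCC emits one clone per listed target plus a resolver (an ifunc) that
    // picks the best clone for the CPU the binary actually runs on.
    __attribute__((target_clones("avx512f", "avx2", "sse4.2", "default")))
    void scale(float *x, float s, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            x[i] *= s;   // each clone gets autovectorized for its own ISA
    }

    // Build with plain: g++ -O3 fmv.cpp  (no -march needed; dispatch happens at runtime)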
AVX512 was clearly a great innovation in the vectorization landscape. A far cleaner instruction set, complete and symmetric, with very interesting blend, ternlog, lane-crossing instructions and the especially interesting mask registers. Lots and lots of goodies and an eye for compiler implementation.
I feel Intel failed hard at diffusing the ISA (why not put it everywhere, at half performance; it'll improve later, no change in code needed) and also by not pushing more energy/dollars into ispc. Yeah yeah, your compiler engineers are clever, but you've been doing this for 20 years and autovectorization is still a ways off. Let me write code in a way that can be easily vectorized. A subset of C. Less awkward than CUDA.
Now it seems AVX512 and large vector units are dying and are still too niche. Sad.
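As a small illustration of what the mask registers buy you, a hedged sketch using documented AVX-512F intrinsics (the function name is mine; it assumes n is a multiple of 16 to stay short; build with -mavx512f): a predicated update with no branches and no blend gymnastics.

    #include <immintrin.h>
    #include <cstddef>

    // Add b[i] to a[i] only where a[i] > 0, sixteen floats at a time.
    void add_where_positive(float *a, const float *b, std::size_t n) {
        const __m512 zero = _mm512_setzero_ps();
        for (std::size_t i = 0; i < n; i += 16) {
            __m512 va = _mm512_loadu_ps(a + i);
            __m512 vb = _mm512_loadu_ps(b + i);
            __mmask16 m = _mm512_cmp_ps_mask(va, zero, _CMP_GT_OQ);        // per-lane predicate
            _mm512_storeu_ps(a + i, _mm512_mask_add_ps(va, m, va, vb));    // inactive lanes keep va
        }
    }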
The cleanup being tied to the width increase was the first problem. The new width still being a fixed one was the second.
SVE is SIMD actually done right – on the Arm side in the near future, everything from smartphones to massive HPC boxes will be covered by the same clean SIMD ISA.
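To show what "done right" means in practice, a hedged sketch with the ACLE SVE intrinsics (the function name is mine; build with something like -march=armv8-a+sve): the same code runs on 128-bit mobile cores and 512-bit HPC cores, and the predicate handles the tail for free.

    #include <arm_sve.h>
    #include <cstdint>

    // Vector-length-agnostic a[i] += b[i]: svcntw() is however many 32-bit
    // lanes this particular CPU has, and the while-less-than predicate
    // deactivates lanes past n, so there's no scalar tail loop.
    void vla_add(float *a, const float *b, int64_t n) {
        for (int64_t i = 0; i < n; i += svcntw()) {
            svbool_t pg = svwhilelt_b32_s64(i, n);
            svfloat32_t va = svld1_f32(pg, a + i);
            svfloat32_t vb = svld1_f32(pg, b + i);
            svst1_f32(pg, a + i, svadd_f32_x(pg, va, vb));
        }
    }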
I agree it would have been nice to have 'infinite-sized' instructions, chopped up to the actual underlying vector size. But there were so many complaints about AMD implementing some instructions not as 256-bit-wide but as 2x128 that I feel they went for the least-microcode route.
Mask registers offset the size problem a bit. I just wish we'd rebuild a language or clean libraries to take full advantage of this programming model. Is ispc still maintained? Does anyone use it in prod? Genuinely curious.
I feel SVE is 'too late' as most CPU makers seem to go back to smaller vector units (leaving the vectorized stuff to gpus - I know they're not the same thing, but if you're investing in heavy perf hardware, for repetitive computing...) and even Intel doesn't seem very serious about AVX512 except in the Xeon world. But then if you pay 8000EUR for a platinum thing, you might be able to pay for top talent to handcraft some intrinsics.
We spent years with quad-core i7 processors being the norm, with higher-core-count processors locked to Intel's HEDT platform. When Ryzen came out, all of a sudden the i7 8700K was able to be a 6-core processor instead of a 4-core. Then it wasn't until Alder Lake released in November of last year that we finally got desktop processors that weren't on 14nm+++ (or however many +'s it was at). That's not including the fact that you could overclock all the initial Ryzen processors, while Intel locked you to special SKUs of CPU and motherboard.
It's basically innovation 101: don't spend more money developing something if you know your customers' needs are already met.
They likely knew what Apple was up to with efficiency cores several years ago, and only decided to accelerate manufacturing of Alder Lake once they realised the market was cool with that form of architecture.
Putting effort where the demand is doesn't seem like the stupidest strategy to me. Creating demand for a new kind of product or service is nice, but meeting a market where its needs are seems clever.
What baffles me about this 'innovate on something else than peak perf' is... What did they innovate on massively instead then, if not that? Apart from AVX512?
Intel did not have any major advances from 2nd gen all the way until the 7th. The advancements were generally small (often single-digit) IPC or clock speed improvements.
Only after AMD released Ryzen did Intel have to respond, with their 8th gen, by cranking up core counts. And IPC did not have any increases until 11th gen (the backported arch you mentioned). In my opinion, the performance delta between 32nm Sandy Bridge and 14nm Kaby Lake is ridiculously small.
> In my opinion, the performance delta between 32nm Sandy Bridge and 14nm Kaby Lake is ridiculously small.
Two of my Linux workstations are Sandy Bridge and Kaby Lake, and my real world experience bears this out. I can't distinguish between the two for everyday use cases; only synthetic benchmarks show any real advantage in the newer system. I can't speak for Windows performance differences, as my only Windows system is my gaming rig with a Ryzen 5 3600, which of course trounces both of the workstations no matter the OS.
Are there still software features exclusive to Intel's chips? I remember my dabbling with Android Studio was painful, since the 'simulated device' functionality was only available on Intel, while I had some FX-8??? AMD chip.
I'm not in the market for a new gaming PC quite yet, but it's also going to be a personal workstation, so I don't want to deal with anything like above if it can be helped.
Get a Ryzen. Intel is restricting many features like ECC, overclocking or virtualization (important for emulators/android development) to certain processors/chipsets. If you want to save money, get a used workstation.
Looks like it's time to update the fantasy roster on pc part picker. A shame, since it had such a nice aesthetic! The Vision D looks so clean and well-featured
By the time I get around to actually building it, there may be some similarly-styled mini-ITX AM4 motherboards, so not all hope is lost.
Back in ~2012 I ran the Python Meetup here in Phoenix. One of the compiler leads for Atom processor attended and he basically told the tale that the Atom processor was being neutered. Intel higher ups were worried the Atom was getting too close to desktop/server processor performance. They were very concerned about cannibalization from within. Also, even then, Intel thought very little of mobile processors.
Interesting processors? Yes, Intel released IA64 with Itanium, which was interesting and a dud in the market. Then they came up with Xeon Phi, which was a dud in the market. Then they brought Larrabee, which was a failure in the market. Part of Larrabee's issues did stem from process limitations. Intel had every opportunity to hedge its bets by buying from TSMC, Samsung, or others just as everyone else did, but kept sinking more money into their own foundries without getting what they paid for.
Meanwhile AMD gave the market what customers clamored for: a 64-bit extension to the IA32 platform. Then AMD gave us massively performant APUs. Then AMD gave us multi-die packaging and left the IO die on a more sensible process for that function while using smaller processes where frequencies matter more.
Then Apple, with some help from ARM, gave us an SoC for laptop and desktop use that's frankly kind of embarrassing for AMD and Intel, not so much for the core design as for its integration with memory.
Intel isn't just unlucky here. They've made a series of serious missteps going back a couple of decades now.
> Intel really wanted to release (in my opinion) very interesting processors, but had serious problems over years with getting their 10 nm process "stable" (i.e. sufficiently high and reproducible yield rates).
Intel was just doing what they have been doing for the last 40 years - building faster x86 CPU's.
They weren't even considering something as grand for desktop/laptops as Apple was with the M1 (i.e. a fully integrated SOC).
I don't think they were twiddling their thumbs to be fair - they were probably pushing hard in the same direction they have been pushing in for the last 40 years, but failed to see the industry change under their feet.
> They weren't even considering something as grand for desktop/laptops as Apple was with the M1 (i.e. a fully integrated SOC).
Perhaps I misinterpret your argument, but it is my impression that Intel (and also AMD!) did take huge steps in that direction, just in a more incremental way than Apple did:
- developing a smartphone SoC: Intel did invest serious money into it and developed SoFIA and Broxton. Intel even strongly subsidized smartphone producers to use them. This all turned out to be a huge commercial failure and thus Intel left the smartphone SoC business.
It is also not the case that a fully integrated SoC is "better". Rather, not having everything in one SoC enables much more flexibility for OEMs. A fully integrated SoC versus more chips is a trade-off between various goals.
Integrated GPUs have been around since 1991, so I wouldn't personally point to that as an example of Intel continuing to be highly 'incrementally' innovative.
Similarly the M1 chip came out of a smartphone SOC - it was just that Apple saw the potential for laptop/desktop adoption while Intel clearly didn't (maybe because they failed to get into the smartphone business - but their failure in that market is yet another example of 'too slow, too little, too late').
"Intel was just doing what they have been doing for the last 40 years - building faster x86 CPU's."
Yes Intel has been doing some minimum improvement to CPUs each year, but the reality of "Intel was just doing what they have been doing for the last 40 years - " is....
"Milking their Monopoly"
The AMD lawsuit with Intel back in the Netburst days showed how Intel was just as bad, if not worse, than Microsoft at anticompetitive behavior to lock competitors out of the PC market. I'll throw Intel a bone in that, for decades with this monopoly power, they still continued to push the process and design envelope (probably because they were afraid of becoming Motorola, and because they still had engineering leadership left over from the earlier days).
But Intel is a badly overfed Jabba the Hutt. Gelsinger has his work cut out for him.
It's expected that they're trying to build chips that are faster on x86 rather than switch to an entirely new architecture - they can't switch without the full support of Microsoft and at least some major Linux distributions, not to mention the OEMs they sell their chips to.
The fact that Microsoft has released several ARM laptops but selected Qualcomm to provide the processors suggests that Intel at least had an opportunity to play the game; they just never came to the table. It's hardly Microsoft slowing them down when Microsoft is ahead of Intel on this, but their reluctance to push forwards means that they are now behind.
It's not Microsoft's job to push Intel anyway; it's Intel's job to create a product so compelling that their partners adopt it. If they want the support of major Linux distros, they just have to write it themselves rather than wait for volunteers to do their work for them.
Yeah, I've always been pretty skeptical of the "Intel just sat doing nothing" narrative. The impression I got back in the day is that they went quite hard on their 10nm process with a reasonably ambitious set of changes, then failed to scale it to production. That had substantial knock-on effects including delays to the associated microarchitectures as well as subsequent nodes and backports.
Regardless of the reasons I hope they recover; multiple providers of cutting-edge fabrication technology will be essential.
Intel has absolutely been lazy and literally just gave a 5% performance increase per year for quite some time. When AMD was making shitty processors, Intel was just trying to squeeze as much money as possible out of marginal upgrades.
You can run Windows on an Intel CPU that is 10 years old and notice hardly any difference in performance.
AMD, on the other hand, changed everything. Their CPUs are actually innovative and really fast, and they brought the whole multi-core thing into consumer hands in a real way.
That's not super surprising. The 3090 is a beast. The fact that they've gotten into a similar ballpark to maybe the 3070 based on those numbers with a SoC is pretty impressive, though.
That said, am I the only one getting annoyed with shallow comparisons of the M1 products? Or is SEO BS hiding the good stuff? It seems like the rule most sites are following is 50% or more of your benchmark graphs should be Geekbench which is quite eye-roll inducing when you've been spoiled by comprehensive benchmarks in the PC market for decades.
It depends on how you define "the media" and "professional journalist." It's more complicated than most people think.
Some of the world's best actual journalists, like Peter Jennings, were high school dropouts. At the same time, some of the world's most prominent people who often present themselves as journalists are absolutely not.
The old school of journalism allowed for cadetships and on the job professionalisation. The expectation that a degree can get you a journo job has made the industry better in many regards, but I feel like the boundary was blurred into comms, media, and marketing at some point.
The Medill School of Journalism at Northwestern University was one of the most highly-regarded journalism schools in the nation.
Then, its focus changed, and it is now the Medill School of Journalism, Media, and Integrated Marketing Communications with the motto "The future belongs to those who understand the art and science of marketing communications."
I used to work at a place where we would get Medill students as interns. I was told that they were always high-quality. But then we started getting "journalism" majors who couldn't spell, and didn't know the first thing about grammar, fact-checking, interviewing, storytelling, or the difference between fact and opinion; but were experts at putting together videos for YouTube.
A bunch of the popular PC benchmarks have never been ported to the Mac, so that limits one's options - but none of these mass-market tech publications really lean into the technical aspects of hardware performance.
I generally wait till somewhere like Anandtech publishes benchmarks (which tends to take a while longer, given the more detailed testing).
Yeah, there's an option to see the original if you click into the image at the bottom of the lightbox, which is sharper. Also, I think the colour key was messed up when I viewed it, because it implied more FPS at higher resolutions in Shadow of the Tomb Raider, which didn't seem right.
It looks like The Verge (the source) has corrected it since, though the decision not to test the 3090 at 4K seems... weird to me: https://i.imgur.com/P2motB1.png
Also they just handwave "2019 Mac Pro" for the comparison, though the 2019 Mac Pro has _5_ GPU options. Going toe to toe with the Intel GPU is embarrassing; going toe to toe with the W6800 is quite good. Though I suspect, since they've had the device for two years, it's the W5700 model, but it would be good to have that specified.
I don't understand why people compare hardware using software benchmarks that don't actually do the same work.
Geekbench's Compute benchmark is useless for M1s with more than 32 GPU cores, according to Anandtech. Shadow of the Tomb Raider runs as an x86 program under Rosetta 2.
Isn't there a single GPU benchmark that actually does the same work so that comparisons can actually be made?
> Isn't there a single GPU benchmark that actually does the same work so that comparisons can actually be made?
Apple has deprecated OpenGL and does not support Vulkan, only Metal, and most app devs have better things to do than rewriting code for the Mac.
The recommended way of gaming on a Mac is to use emulation: emulating whatever API the game uses with a Metal wrapper.
IMO the claim that these benchmarks aren't fair is naive. I don't care about how good the hardware is, but about what performance I get. If I get poor performance because the software, drivers, etc. are poor, I want to know that.
> I don't care about how good the hardware is, but about what performance I get.
Of course, I'm not at all saying that you're wrong to want that, or that these benchmarks don't show what you're interested in.
However this and many other articles are using these benchmarks to derive comparative hardware performance, which is simply wrong to do. That's what I'm criticizing.
Since you can't really buy Apple CPUs or GPUs and put them in a PC for benchmarking, today at least one can't compare Apple's hardware against Intel hardware in isolation.
What one can compare is the performance of the "Apple platform" and the "PC platform" at similar price points, power budgets, and features. This is more meaningful for most people, who mostly care about how fast the computer can do X (whether it uses the CPU, GPU, or some other chip, most people don't care).
Is this a serious comment? A wrapper doesn't use emulation. An API wrapper simply translates calls to the appropriate API. But more to the point, if the benchmark isn't properly ported to the platform it's testing, then the benchmark is BROKEN. It isn't naive to expect them to do their job right.
It depends on your perspective. Yours is a subjective one: you care about your experience only, which makes sense for a consumer deciding what to buy. (But then everyone's subjective bias has different weights, so to speak.)
But from a technological perspective, the logical way to test is to eliminate as many variables as possible and really compare the hardware capability. Using software that's not optimized for the hardware is not an objective test of the hardware.
There are many ways to try to conduct as objective a test as possible. By that criterion, I found that only the reviews from Anandtech are up to standard. Articles like this are more like clickbait.
Now I'm not an expert, but if I were to compare the performance somewhat objectively, I might start with TensorFlow, for which Apple has released a Metal backend. Then maybe write some naive kernels in Julia using CUDA and the prerelease Metal library. (It might not be fair, but that's where I would start given what I know.)
Linux for Macs with M1 architecture is still in development. There's a team that's currently reverse engineering the M1 GPU to create linux drivers for it ... but until that's finished, unfortunately no, you can't just install Linux on both systems :/
Using the in-progress Asahi Linux to benchmark the hardware could justifiably be seen as hampering the hardware even more than macOS's limited support for mainstream graphics APIs does.
Probably a fairer comparison would be something that runs natively using Vulkan or Metal on both Apple's M1 and x86/Nvidia and isn't heavily optimized for just one of those. There aren't that many modern games that you could reach for here, but something that can target both Vulkan and Metal would be what you'd want. The upcoming X-Plane 12 is something I might like a Mac Studio for.
What's interesting here is that we're haggling over comparing something that is an SoC design with fairly modest energy usage and cooling against a video card and CPU (presumably) that need a lot of power and cooling. Imagine Apple putting a few of these CPUs in the Mac Pro next year (or whenever they release that).
Of course these things are pricey. 5K for the 64 core model, ouch. I think they are going to be making lots of money selling these.
As for Intel, AMD, and Nvidia: they each need to get their act together on the SoC front. SoCs are not for crippled laptops anymore but for high-end workstations as well. Nvidia should reassess what they want now that their ARM acquisition is off the table. IMHO they should get rid of AMD/Intel as the go-to CPU for their GPUs and just build their own SoC and target it at gaming machines, workstations, etc. There's no good reason to stick with x86 anymore either. License ARM, or maybe hop on the RISC-V bandwagon.
Nvidia has been making SoCs targeted at gaming for years now; they are in the Nintendo Switch. They have a lineup of chips with ARM cores and GPU cores available now, but they are targeting industrial applications.
I know. That's exactly the point. It's not a flagship product for them but a watered-down thing they aim at low-cost devices. Apple charges 5K for their 64-core Mac Studio. It's the fastest thing they have right now, and it's an SoC. The fact that it even comes close to Nvidia's latest and greatest shows that this is the right thing to do from a design point of view.
> Isn't there a single GPU benchmark that actually does the same work so that comparisons can actually be made?
There's IndigoBench, but some people will complain about it using OpenCL (which still works great in spite of Apple's efforts to kill their own creation, to the detriment of the whole GPU compute industry) instead of Metal.
Sure, if the consumer wants to run Shadow of the Tomb Raider, that's true.
But benchmarks where one piece of hardware has to do much more work than the other are obviously not fit to determine the actual performance differences between hardware; and that is exactly what the article we're commenting on is doing.
Everyone understands the point you're making, it's just a pointless point.
People aren't concerned about the hardware itself. They're concerned about what it can do. If circumstances mean that your powerful hardware can't be utilized, then the powerful hardware is just extra cost and no benefit.
And to pile on, the games are tuned to run well on Nvidia hardware. They haven't even seen this new GPU yet, nor do they care to optimize for it. Sometimes even small changes can make big differences.
First off, I don't doubt for a second that Apple has over-egged the performance of the M1 Ultra's GPU – or has, at least, selected performance indicators which aren't totally representative of common workflows. It's important that we have reviews that make it clear that this is the case and look at more useful numbers.
But this article is a terrible example of doing that. It's just a republishing of minimal benchmark results from another outlet, but without even the tiny amount of analysis that was conducted there. Even the original article gives us close to no information about the scope or setup of this testing, and proceeds to draw conclusions that are suspect at best – for instance, there's a <20% gap in 1440p "Shadow of the Tomb Raider" framerate between the Ultra and the RTX3090 – not mentioning that this is a game that's running under Rosetta – and no 4K comparison, which is the only one likely to be GPU-bound.
More importantly, the actual benchmark results are clearly and obviously wrong, with the numbers for 1080p and 4k performance being switched (unless graphics cards magically now render faster at higher resolution). The Verge has fixed the original, but how can you realistically quote these numbers and make a judgement on them without noticing this?
Just an absolutely embarrassing indictment of the state of technology journalism.
Because they have a week where they can control the narrative before the press gets their hands on actual devices. Every news site on the planet wants to write about the new Apple stuff, and since there are no 3rd party benchmarks yet, they have to parrot Apple's claims.
A week later, most of the press has already forgotten about Apple's new stuff, and no one cares about the actual benchmarks. Except us techies, who didn't believe Apple's claims in the first place.
To be honest I'm much more disappointed that they shipped yet another crappy webcam in the Studio display after claiming it's the best webcam ever. (The webcams in all the M1 Macs suck ass, despite what Apple and the early press reviews claim)
> The webcams in all the M1 Macs suck ass, despite what Apple and the early press reviews claim
I have yet to see any computer ship with a half-decent camera to be fair, neither in the Windows nor in the Apple world. People want thin screens, about half as thin as your average phone. There's simply no physical space to fit good optics in there - there's a reason why external high quality webcams are so large!
> Except us techies, who didn't believe Apple's claims in the first place.
That's very debatable. Even individuals on HN (which is a rare atmosphere) defended Apple's claims.
The other side of this is that people have been exclusively using Apple products for so long, that they don't have the experience to compare what Apple is doing with the rest of the industry. Apple only has to outdo themselves.
>That's very debatable. Even individuals on HN (which is a rare atmosphere) defended Apple's claims.
This isn't an accident, it happens due to Apple's pricing/marketing strategy. If you can convince people to pay a premium for your product (when compared to competitors), people psychologically _need_ that product to be better than the cheaper alternatives, or else they feel ripped off. Best to ignore anecdotes from customers and instead rely only on hard benchmarks/data and make purchasing decisions based on what is most suited to you.
This is definitely not the case. I purchased a 32GB 8-core AMD CPU laptop (1060 NVIDIA GPU, unfortunately), in 2019, for $1100. There wasn't, and still is not, an Apple SKU that approaches anywhere near to the performance/dollar.
Apple products have a very distinct and precise market. It's "good enough shit" at an "I'm not willing to bother" price. That has _extreme_ value in the modern market. If you're asking anything serious of their hardware then you are either ignorant about what else is on offer (having only used Apple hardware for the past <many> years), or you have workloads that so happen to fall into that extremely narrow niche.
> Similar-performing hardware for years
In the past year I have <been forced to> work with Apple hardware as a daily driver. When did you last use one of the alternatives as a daily driver?
I mean, fuck, imagine for a second that Windows updates took 45min, because Apple updates sure as fuck do. That's a hard pill to swallow when my NixOS updates take 0 seconds.
GP's claim has been publicly proven time and again, yet the myth persists because of the failure to actually match Apple's hardware feature for feature, spec for spec. Sure, you can build your own laptop without MagSafe or Thunderbolt for <$x, but then you've failed to actually match Apple's hardware and specs. Once you actually do that, you'll discover you cannot duplicate any Apple hardware, at any point in history, with third-party off-the-shelf parts without spending $100-$1000 more. And this does not take into account that Macs, as a rule, remain in use longer and retain their resale value far better than any PC, so the total cost of ownership ends up being a lot less. Even that doesn't count all the user time Windows invariably wastes on idiotic things like spontaneously dropping a driver without cause and requiring attention to restore functionality of, say, a printer. Try using the search engine of your choice with the terms "macs are more expensive," and nearly all the results reveal that the notion that Macs are more expensive was always a myth.
If you list all dimensions of Apple's computing products as a radar chart and as a consumer that chart "shape" meets your needs, you will not find a competitor that meets those same needs on all of those dimensions but is also meaningfully cheaper. You can find comparable devices that are comparatively priced, and you can find cheaper devices that come with a slew of compromises, but Apple is incredibly hard to beat at their own game.
That's kind of a weird way to look at it. It assumes that your needs match exactly what Apple offers, which is rarely the case. For most users, Apple devices come with their own set of compromises.
Instead of taking Apple's devices as the benchmark and trying to find a device that's cheaper and better on all axes, you have to start with the user's requirements and see where you get from there.
For example, if your requirement is "I want a silent machine for programming with lots of RAM for running VMs" then it's very unlikely that the cheapest option is going to be a Mac.
If your requirement is "I want a machine for checking my emails and watching Netflix with a big screen" then it's also very unlikely that the cheapest option is a Mac.
The only scenario where a Mac is reasonably priced is if your requirement is "I want a machine that is exactly like one of the few machines that Apple sells and I don't mind its shortcomings".
Honestly well before then. Phones—Apple's largest product by far—have been competitively priced their entire lifespan. One might even argue based on performance numbers and battery lifespan that they've been by far the best deal.
As far as I understand it, their laptops have also been within a pretty reasonable margin of being competitively priced (at least for base models) for many years. Sure you can find cheaper alternatives, but those usually come with significant sacrifices (build quality, missing features or lower-quality versions of those features), etc. Obviously different people can vary widely in how much they value feature sets, but competitors' laptop models that line up closely in those feature sets have tended to be around the same price for awhile now.
More like, "inexperienced". Tech aside (contexts where complexities offer controversial space in some comparisons, out of benchmarks that may not measure the actual general goal), many are convinced that the spread of false information is severely limited through social and legal devices: loss of reputation, sanctions (e.g. "false advertisement") etc.
> so they blindly believe all the fake comparisons that apple makes on their webpage
They're not that fake to begin with, at least not in "experienced performance" aka how performant the system feels for the user. Apple's stuff is expensive but in terms of performance even the Intel-based Mac lineup just blows the direct competition away:
- Apple's solid-aluminium body provides vastly better thermals than the Windows world, which is usually plastic crap; this allows them to run the same Intel processor in boost mode for longer, and the fans are, at least in my experience, a lot better (especially in loudness and "annoyance" factor - even at full power, the fans of a MacBook are way better to listen to than those of your average Windows laptop). For the Mac Pro / iMac lineup, same thing. Most of the Windows world ships barely adequate coolers with no thought given to airflow, whereas Apple... well, open any tower Mac Pro and see for yourself. That's what a cooler should look like...
- Apple's OS has way less crap and clutter going on that gobbles performance (there's just at this moment a submission on the front page detailing, among other things, how Windows got transmuted into a horrible mess of advertising and half a dozen competing frameworks), and they have a lot more control over the hardware and driver stack, which means that Apple does not depend on the shoddy work that vendors in the Windows world can pass off as "quality tested drivers". The result is better battery performance (because Apple controls the entire stack, they can make full use of sleep modes and other energy-saving measures) and fewer bugs.
- Related, but not something Apple has done: there isn't much malware on macOS and never has been compared to the Windows world - you can get by without a virus scanner just fine. In contrast, most of the Windows world ships with some crap "free for three months" virus scanner bundled, which just destroys performance.
- Apple's own chip designs are, while still based on standards such as the ARM ISA, completely fine-tuned to their needs - which means that, unlike the Windows world, which carries about three decades' worth of old garbage in its CPUs, Apple can devote more of the same chip area to the needs of modern computing.
- Here's a bit of a controversial take: unlike the Windows world, where most vendors ship systems with highly modular components (i.e. CPU, RAM and storage can be swapped and upgraded), Apple solders the chips to the mainboard, very close together. That allows for higher performance, not just because the latencies are lower but also because the chips can be cooled together and so run at a higher frequency.
Do I really need to be the one pointing out that M1 Ultra has an "integrated" GPU and that these tests are disadvantaging it due to Rosetta and weak engine support for Metal?
What does that really mean at this point? A big part of what made integrated cards suck so much is that they had to share memory with the CPU, and with a unified-memory SiP design that doesn't seem like a problem. Plus the M1 Ultra has 114 billion transistors whereas the 3090 has 28 billion. Even if the M1 Ultra is only using a quarter for the GPU, that's still the same number of transistors. Even a sixth is still huge.
If Apple wants people to take their claims seriously, and if weak engine support for Metal and Rosetta are a problem, they should pay someone to port a popular graphics-intensive game properly. Of course it works in their favor not to, because fans can point to it and say, well, it's just not optimized for it! Like we all did back in the PPC days, until they admitted how crap those chips were when switching to Intel.
Apple doesn't care about gaming, but they do care about 3D rendering. Blender benchmarks (which is just now getting official Metal support, sponsored by Apple), should be the real deal.
Sounds about right; it's not magic. AFAIK the Metal support in Blender is not very optimized yet (it's brand new), but even optimized I don't expect it to come close to an RTX 3090. I still find the results amazing: for the amount of power this thing needs, if it ends up 2x slower after optimizations, that's massive. Don't forget that the Ultra also includes the CPU and RAM in the power consumption.
That's wall power without the CPU encoding; with the CPU it's ~140W.
But I can agree the results are good. We also need to remember that Samsung 8nm (the RTX 3090's node) is 4 years old, so two generations behind TSMC 5nm. The 4090 will probably be on a 5nm node in a few months and will presumably have 3x the teraflops (probably 3x in TF32 precision, but it could be FP32), using 1.5-2x(?) more power.
(Also, I'm not sure about the optimization; it was written by Apple employees, and Apple likes to drop open source support and focus on proprietary software.)
I agree, but that is still something we need to take into account when comparing the hardware. Imagine if Arm Windows gets popular and it somehow has great drivers for this GPU (unlikely, but humor me here). Then the actual power of the hardware would be a lot more relevant for gaming related benchmarks.
Who cares about hardware you can't utilize? It's just a piece of silicon with potentially any performance imaginable. What you can do with it, now, is what matters. If this changes in some theoretical future, well, we re-evaluate performance claims and stats then.
Nobody bothers to guess the potential of some new graphics card once all the drivers, games, OS, etc. are fully tweaked and optimized. You decide based on performance now.
You can already use it now if you're targeting the platform. It's not the platform provider's problem if you're using a compatibility layer, that was your choice. Well, from a development POV at least, from a user POV, if your tools are using the compatibility layer, there is not much you can do.
We knew this would happen for everything using Rosetta, that's just part of transitioning to a new CPU architecture. For Vulkan vs Metal that's self-inflicted on Apple's part.
You mean that we should take into account how complicated it actually is to get software that fully supports the hardware, for HW comparisons where one type of HW can only run with a specific OS?
No, I mean that it's not relevant when comparing what the hardware can do. It's like how the PS3 was super hard to program for, but in the end it produced more impressive results than the Xbox 360. This was not a surprising result; it was to be expected from the hardware. It's not that different from what is happening here: benchmarks running under x86 emulation on engines using compatibility layers for Metal are not representative of the hardware.
With the difference that Apple HW performance might be "good enough" for 3rd-party software vendors if they take into account the extra development burden of the platform. Anyway, some of those benchmarks were already using Metal and are still behind.
To be fair, while they promoted DirectX, Microsoft has provided support for using OpenGL in Windows[1] (while the docs here are dated 2019, the porting guide from IRIS GL, OpenGL's early-90s predecessor, indicates they're actually much older).
Microsoft supported OpenGL prior to developing Direct3D. As soon as that got into good enough shape to supplant OpenGL, OpenGL got dropped.
These days you're better off using OpenGL via ANGLE, built on top of Direct3D.
That will probably also happen analogously on Apple's platform with Metal, so I don't quite get the kvetching about Apple deprecating OpenGL.
As if it matters whether Apple ships an OpenGL implementation as part of the OS or purely as a normal user space library.
I’ve already seen a significant CPU improvement on my 2019 16” and I can now use my 5500M to render. My PC’s RTX 2080 Super is significantly faster but it draws more than 200W.
An integrated GPU with 64 cores? (Compare this to the 112 "cores" of the 3090.) At that point the distinction makes no damn difference. The rumor mills were speculating years ago that AMD was going to release high-performance server APUs with 32 GB of HBM.
Really, if you can fit the memory on top of the GPU, then there is no meaningful distinction between a discrete and integrated GPU because the fundamental difference on the PC platform is whether the iGPU uses the system DDR RAM or whether it uses dedicated GDDR RAM. That distinction really doesn't exist on Apple's M1 platform because the memory is placed directly on top of/next to the SoC.
At this point, Apple Silicon is going on 18 months. If there still isn't software available to run GPU benchmarks natively, that's a problem for that GPU. And it's a problem for validating marketing claims.
Integrated is an advantage, not a disadvantage, on a technical level; it's just worse for the consumer, as you cannot upgrade the parts separately. What you should be comparing is the number of transistors, to see whether the architecture is impressive or whether the speed is due to brute-force size. The M1 Ultra has 4x the number of transistors of the 3090 (probably about half of which is the GPU)! It's a freaking humongous chip.
But they never said that it is faster than the 3090; they said it uses 60% less power for the same performance. The shady thing they did is trim the plot's watt axis to cut off the 3090's full performance, which might be intentional, to give most people the wrong idea. But it was clear to me it doesn't have the same performance as the 3090.
On the other hand, the plot of the 3090 flattens so much at 320W that you wouldn't suspect a doubling of performance at its peak wattage (which I think is ~360W?).
Of course, this is marketing material. But as far as I am concerned, the jury is still out, I'd like to see a benchmark with native binaries (non-Rosetta) and some optimization for Metal.
Though even if the performance was half of the RTX 3090, that would be mighty impressive. I have an RTX 3090 on my desk and under full load it is loud and consumes a lot of power. Mac Studio + M1 Ultra's fans are apparently barely audible.
Yes, but if there are close to 0 games that don't require Rosetta, what's the point? Disclosing a theoretical emulation overhead that can't be proven would even be worse.
It's either this or no comparison at all. And it's a game that apple often uses in their promotional materials.
> Yes, but if there are close to 0 games that don't require Rosetta, what's the point?
Because it's not being marketed as a gaming computer, and other use-cases aren't run under Rosetta (e.g. 3D rendering, video editing etc).
e.g. Apple's marketing doesn't claim gaming performance, it claims this:
> You can run complex particle simulations or work with massive 3D environments that were previously impossible to render. And with twice the media engine resources, M1 Ultra can support up to 18 streams of 8K ProRes 422 video playback — something no other personal computer can do.
Their marketing page doesn't even mention games...
The most frustrating part about reading this article for me isn't the article itself, but that Apple - in fine tradition - didn't back up their graphics and performance claims with anything concrete and reproducible.
With M1, I think they can move beyond these shady marketing tactics entirely and focus on what they're developing. In other words, promote the merits of their architecture and systems (which are excellent) rather than try to compare Apples with oranges.
It's been like this for years now. I always remember the Apple twittersphere gushing about how insanely powerful the iPhone and iPad processors were, yet even years after that news we didn't really see the platforms doing any remarkable forms of processing.
They are insanely powerful, and if you write the right apps for it, you will see that. I've written a custom CAD app from scratch for iPad, and was blown away by the speed I saw with Metal.
I mean, they turned these chips into desktops, because that's how good they are. Wtf are you talking about?
The phone in my pocket renders 4K60 video faster than my desktop, with no moving parts. My iPad is even faster. It wasn't until that iPad processor went into my Mac that I could do that without getting burned or putting my Bose headphones on.
I agree. M1 with Metal is already quite good. Just leave it at that. You're not going to be able to compete with a high-end discrete GPU in most areas, so stop pretending.
These benchmarks are all rather useless because they lack the necessary details on methodology and context. For example, Shadow of the Tomb Raider may very well be CPU-bottlenecked with that 3090, given it only runs a 10900 (non-K?), and who knows how the port for the M1 Ultra is done? Is it running x86 emulation and something like DXVK, or is it a truly native port?
Even if that's true for M1 Macs, it still invalidates the benchmarks, because it's not an objective comparison.
Edit: so I get downvoted for commenting that a benchmark needs to control external variables (like emulation vs. native) in order to get a valid comparison of the underlying hardware? Never change, Apple fanboys, never change!
The dispute over what makes a fair benchmark comes down to the question of whether it's fair to test the M1 Ultra in the single deployment it's available in (with its fixed OS, API support, and therefore required emulation for many use cases), or whether you can come up with some test that somehow evaluates the M1 Ultra in isolation.
Until either Asahi/Windows support for M1 gets further along, there are more M1 Ultra equipped machines, or more workloads get ported to Apple native (either because there's incentive enough despite the difficulties Apple introduces, or because Apple reduces those difficulties), it's hard to have an apples-to-apples comparison of the M1 Ultra vs 3090.
I fall more into the camp that we have to evaluate the package of the Mac Studio as presented, for now. Especially as it's not like Apple is going to sell this chip standalone, so the only other deployment in the near future is a likely M1 Ultra Mac Pro. This means it gets the advantages (lower power draw, an OS specifically optimised for it) and disadvantages (emulation for common workloads) that come with that.
> Not everyone puts a lot of stock in synthetic benchmarks like Geekbench, so the Verge also fired up Shadow of the Tomb Raider, and the RTX 3090 easily mopped the floor with the M1 Ultra, delivering 142 fps at 4K and 114 fps at 1440p. Unfortunately, the Mac Studio could only muster 108 fps and 96 fps, respectively.
Games are always less optimized for Mac/Metal than for Windows. It would have been better to use a content creation benchmark.
> delivering 142 fps at 4K and 114 fps at 1440p...
Why and how would these GPUs deliver more fps at a higher resolution? 4K is 2160p, or 2x1080. Shouldn't 1440p (2x720) deliver more fps? Someone please tell me what is going on here.
Not benchmarks, but a back-of-the-envelope calculation: when idling, a 3090 draws ~20W and an M1 ~200mW. At full power it's ~400W vs 100W, so considering the performance of both at maximum and interpolating between those power values, it's a pretty solid bet.
Older GPUs with larger node sizes would fare even worse.
I've yet to see a fair comparison/assessment of the M1 Ultra's GPU performance in terms of ML potential. The same goes for 3D rendering outside of Blender 3.1 speeds, which are not optimal (I also wonder if industrial rendering engines such as RenderMan will support ARM in the future). As more and more software is optimized for ARM, I see a great deal of potential for these machines above and beyond what is possible right now within a limited ecosystem. I am 100% convinced Apple struck gold with this.
That graph from the verge can’t be right. There is no way you get more FPS at 4K than at 1080p on Shadow of the Tomb Raider while keeping all the other settings the same.
I'm wary of synthetic benchmarks. Are they representative of anything meaningful? Are they gamed? It's why people tend to look for real-world performance. For GPUs this often means games. This system too can be gamed (e.g. drivers silently optimizing for specific benchmarks). Some games will also favour a particular architecture. Taken in totality you tend to get a good big-picture view.
Such real-world benchmarks are thin on the ground for the M1 Ultra thus far and are clouded by things like programs and drivers not being optimized and/or translation layers. This will improve over time.
But I've seen a couple of game benchmarks for the M1 Ultra that seem to show it's >50% of the FPS of the 3090. That's honestly astounding.
I'm still confused by the power draw of the M1 Ultra. It's claimed to be 60W but the Mac Studio seems to draw >300W so what's the reality?
But if a CPU/GPU combo does in fact draw 60W (compared to the 350W+ for the 3090 alone and another 100W+ for any likely CPU) and gets over 50% of the performance? That is absolutely huge (if true). This is still a first-generation product. Many expect the M2 later this year. That may bring respectable (but definitely not cutting edge) GPU performance to the Mac Mini. Apple still has a lot of headroom here.
The 3090 is on Samsung's 8nm node, which is 2+ years old, so it will always be at a disadvantage. Still, Apple's GPU claims are always overstated, and it's good that people are talking about it, considering Apple's propaganda about the FASTEST CONSUMER GPU.
Despite this, the M1 Ultra GPU is a big step for everyone interested in the Apple market and DL/ML workloads.
> The 3090 is on Samsung's 8nm node, which is 2+ years old, so it will always be at a disadvantage
While the M1 Ultra is the latest CPU, it still uses A14/M1 cores from 2020. It's likely that Apple made a lot of progress on the microarchitecture since then, plus rumors are that M2 will switch to a 4nm node.
> considering apple propaganda of FASTEST CONSUMER GPU
Well as a mostly Apple user, I don't read Apple propaganda. Or nvidia propaganda for that matter.
The M1 Max Mac Studio's full-system consumption under load seems to be ~50 W. I wonder how much performance Nvidia can deliver with that. I think ... they simply don't have a product in that power budget?
I know most people only care about the FASTEST CONSUMER GPU (yay, it doubles as a leaf blower and a space heater), but I still think Nvidia's best product was the 1050 Ti ... decent 1080p performance in 75 W.
It's out of stock just about everywhere and the few places that you can get it have scalpers charging thousands for it.
A standalone GPU that consumes multiple times more power than the M1, is a massive brick of hardware and is impossible to get for most consumers performs better? Shocking.
That's a very disingenuous take on "in stock". Sure, if you're willing to pay multiples of MSRP, major retailers will now take your money too and not just let scalpers get in on the profits.
If we don't stop praising Nvidia for its glorious raw performance, I guess we deserve the rumored 4090 drawing 600 watts of power. Nvidia delivers what the majority wants.
The graph from the Verge embedded in this article is mislabeled. 4K should have the lowest fps and 1080p the highest. From the Verge's testing, the $6,200 M1 Ultra Mac Studio's performance in Shadow of the Tomb Raider is as follows:
4K: 60fps
1440p: 96fps
1080p: 108fps
The Verge doesn't mention what graphics setting SoTR is set at, but assuming it is Ultra, the M1 Ultra's performance is about on par with an RTX 3060 Ti (MSRP $400):
4K: 54fps
1440p: 96fps
1080p: 130fps
Here's what a 3070 Ti (MSRP $600) gets:
4K: 67fps
1440p: 111fps
1080p: 162fps
Here's what a 3080 Ti (MSRP $1200) gets:
4K: 83fps
1440p: 147fps
1080p: 210fps
GPU performance in games doesn't scale linearly. You pay a lot more for a small boost in fps. Sourced from gpucheck.com which loosely matches up with the reviews I checked.
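To put that diminishing-returns point in numbers, here is a quick sketch computing fps per dollar at 1440p from the figures listed above. Caveat: the $6,200 figure is for an entire Mac Studio while the others are GPU MSRPs, so this is only a rough illustration, not a like-for-like price comparison.

```python
# fps per dollar at 1440p, using the fps figures and MSRPs listed above.
# Note: $6,200 buys a whole Mac Studio, not a bare GPU, so this is only
# a rough illustration of diminishing returns.
systems = [
    ("M1 Ultra Mac Studio", 6200, 96),
    ("RTX 3060 Ti",          400, 96),
    ("RTX 3070 Ti",          600, 111),
    ("RTX 3080 Ti",         1200, 147),
]

for name, price, fps in systems:
    print(f"{name:22s} {fps / price:.3f} fps per dollar")
```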
Maybe an unpopular take here: comparing the specs, we already knew that (the 3090 is faster). We can look at theoretical peak FLOPS (32-bit), available memory, and internal bandwidth. Of the three, the 3090 clearly wins on FLOPS, internal bandwidth is roughly a draw depending on how you read it, and the M1 Ultra has more memory.
Now, Apple's benchmarks, like all benchmarks released by the manufacturer itself (Intel included), use tests that are favorable to their specific hardware.
No two workloads are the same; there probably exists some workload that your hardware runs fastest compared to the competition.
In this case, relying on the GPU alone doesn't seem to put the M1 Ultra first, but they have a secret weapon: unified memory. Anyone who does GPGPU work, for example, will know that the latency of copying back and forth between main and GPU memory can be a deal breaker in some workloads.
So choosing a benchmark that makes the M1 Ultra stand out would involve picking some intensive workload that involves moving data between the GPU and CPU (see the sketch below). That movement would diminish some of the 3090's advantage, and with the right mix, the M1 Ultra can come out on top.
I'm not accusing Apple of being disingenuous, at least not beyond what is common practice in the industry.
Apple's charts are very unusual though. E.g., at a few points they keep showing the same statistic compared against another product, where consecutive charts just change the "other thing". Any sane person would put all of them in the same chart. The reason? Probably because nothing is to scale, so drawing all of them in the same chart would be meaningless.
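As a sketch of the kind of workload being described: the snippet below (assuming PyTorch and a CUDA-capable discrete GPU; the matrix size and iteration count are arbitrary) separates host-to-device copy time from compute time on data already resident in GPU memory. On a unified-memory design the copy column largely disappears, which is exactly the effect a benchmark could be built around.

```python
import time
import torch

# Rough micro-benchmark sketch: how much wall time goes to host->device
# copies vs. compute on data already resident in GPU memory.
# Assumes a CUDA-capable discrete GPU; sizes here are arbitrary.

def timed(fn, iters=20):
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

x_cpu = torch.randn(4096, 4096)   # ~64 MB of float32 on the host
x_gpu = x_cpu.cuda()              # one-time copy to device memory

copy_ms = timed(lambda: x_cpu.cuda()) * 1e3       # PCIe transfer on every call
compute_ms = timed(lambda: x_gpu @ x_gpu) * 1e3   # matmul on resident data

print(f"copy: {copy_ms:.2f} ms/iter, compute: {compute_ms:.2f} ms/iter")
```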
For applications that are performance-limited by memory bandwidth, the M1 Ultra should perform at about 85% of a 3090, given their respective bandwidths of 800 GB/s versus 936.2 GB/s.
People using such applications where power efficiency is additionally paramount, such as memory-hard proof-of-work, should be very keen to see whether the M1 Ultra lives up to its promise...
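For what it's worth, that 85% is just the ratio of the two quoted bandwidth numbers; a trivial check:

```python
# Ratio of quoted peak memory bandwidths (M1 Ultra vs. RTX 3090).
m1_ultra_gbps = 800.0
rtx_3090_gbps = 936.2
print(f"{m1_ultra_gbps / rtx_3090_gbps:.1%}")  # prints ~85.5%
```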
I find it strange that people even keep pushing this comparison, given the death of eGPU support for modern Macs. I'm just glad we have some serious GPU muscle for both gaming and graphical work.
One question I've never gotten an answer to is: what about ARM makes these Apple chips so good? Is it just that Apple can design them themselves, compared to when they were with Intel? Or is there something about ARM that makes it better for building the processors Apple wants? Outside of the M1 Ultra, I feel like most of the advancements are just due to being on the newest processes.
Well, Apple got an architecture license, has been iterating on their design for many years, has bought companies with significant expertise, and has been pretty aggressive about making their iPhone/iPad CPUs the best they can be, more recently applying a very similar design to the MBPs, Mini, and Studio. The M1 is based on the Apple A14 core, and the future M2 will be based on the A15 core that has been shipping for a while now in iPhones.
Sadly you can't take advantage of Apple's CPU designs unless you buy into their ecosystem. Here's hoping Asahi Linux (Marcan's port to the M1) goes well.
There are little things like ditching the 32-bit ISA; not a big deal, but it does make each new iteration that much easier from a design and testing standpoint.
ARM also has a looser memory model, which makes it easier to use the existing bandwidth effectively.
But there's no magic. It's a straight-up aggressive design: wide instruction decode (yes, that's easier for ARM than x86-64), a large re-order buffer, wide issue, etc. Anandtech has a nice summary comparing various M1 features to the Intel and AMD competition. Apple has high IPC and they make the most of their lower clocks (usually around 3.2 GHz). The chips from Intel/AMD that they are compared against are often in the 4-5 GHz range and burn significantly more power because of that.
Are there any Speedometer benchmark results for the M1 Ultra yet? I've looked, but haven't found any.
Given how much time I spend in the browser/Electron apps, browser benchmarks are the only benchmarks that are relevant to me (tongue placed firmly in cheek, I'm sure the entry-level M1 would suit me just fine).
It's wild that Apple has gotten this close with an integrated GPU. This is really the best engineering organization on the planet attached to an extremely annoying marketing/ecosystem/revenue team.
Given that Apple has finally conceded that a laptop thicker than an iPad is acceptable (the M1 Pro/Max models), we may see the Ultra chip in a new flagship MacBook Pro soon. It won't be cranked all the way up like it is in the Mac Studio, obviously, but it's definitely possible to put one (or the inevitable M2 version) in a large laptop.
> we may see the Ultra chip in a new flagship Macbook Pro soon
I'm not hopeful, to be honest, at least not without a significant performance penalty to keep thermals in check. The M1 Ultra in the Mac Studio has a heat sink that's 1 kg heavier than the Max's due to the extra heat.
Personally I think this entire situation has been driven by the AR push. Same with things like putting LiDAR in the iPhone: push that price/performance down in a proprietary way so that when AR/VR becomes a market you have an unassailable advantage. At least that's my pet theory. In the meantime it has paid dividends for me as a Mac fan.
Yes, I was thinking that. AR/VR is going to be interesting; I definitely wouldn't trust Facebook recording everything I do and adding their ad tech to everything I see while in AR...
They also have to consider the dollar cost of the GPU and the time to earn back the initial investment. The AMD RX 5000 series and the 3080 were already more popular than the 3090 among miners because of the hashrate per dollar; considering you can't slot more GPUs into a Mac Studio and would need to buy a whole other system, I'd imagine its current price would leave it uncompetitive for miners even if it did match the 3090's performance.
(Nvidia had the same issue: miners still preferred their LHR consumer cards over Nvidia's attempts to push them toward business-tier mining cards.)
All Apple has to do is to enable workstation-level features in its drivers and they'd beat 3090 handily. Even Titan RTX demolishes 3090 with workstation drivers in software that uses those features. As for raw performance, 3090 is far ahead.
Where do you see any gimping? It's known that NVidia gimps its consumer range by shipping drivers that are slow in some tasks despite the same hardware being able to be much faster. My comment was about Apple getting an easy win just by not doing that in pro tasks that matter; in other words, beating the 3090 in some apps is fairly easy due to NVidia's politics.
Yes Nvidia gimps because they have other products that are not gimped. But Apple doesn't. The M1 Ultra is the best of the best they have to offer.
I'm not saying Apple gimps their hardware. I'm asking because you are saying Apple gimps their hardware by not releasing "workstation drivers". Thus I ask you: why would Apple gimp their hardware?
You were implicitly assuming that. How about: "Apple can just implement those premium features NVidia is gimping", i.e. implying Apple might not have those capabilities in their drivers yet, not intentionally gimping them.
If this transition is like the others, the performance of Apple software and the operating system is going to increase with every new release for the next couple of years as they optimize it.