The article seems to suggest that the loop buffer provides no performance benefit and no power benefit.
If so, it might be a classic case of "Team of engineers spent months working on new shiny feature which turned out to not actually have any benefit, but was shipped anyway, possibly so someone could save face".
I see this in software teams when someone suggests it's time to rewrite the codebase to get rid of legacy bloat and increase performance. Yet, when the project is done, there are more lines of code and performance is worse.
In both cases, the project shouldn't have shipped.
> but was shipped anyway, possibly so someone could save face
It was shipped anyway because it can be disabled with a firmware update, and because drastically altering the physical hardware layout mid-design would likely have had worse impacts.
What you describe would be shipped physically but disabled, and that certainly happens a lot, for exactly those reasons. What the GP described was shipped not only physically present but not even disabled, because of politics. That would be a very different thing.
No kidding. I was adjacent to a tape-out with some last-minute tweaks - ugh. The problem is that the current cycle time is very slow and costly, and you spend as much time validating things as you do designing. It’s not programming.
I once interviewed at a place which made sensors that were used a lot in the oil industry. Once you put a sensor on the bottom of the ocean, 100+ meters (300+ feet) down, it's not getting serviced any time soon.
They showed me the facilities, and the vast majority was taken up by testing and validation rigs. The sensors would go through many stages, taking several weeks.
The final stage had an adjacent room with a viewing window and a nice couch, so a representative from the client could watch the final tests before taking the sensors back with them.
Quite the opposite to the "just publish a patch" mentality that's so prevalent these days.
If you work on a critical piece of software (especially one you can't update later), you absolutely can spend way more time validating than you do writing code.
The ease of pushing updates encourages lazy coding.
> The ease of pushing updates encourages lazy coding.
Certainly in some cases, but in others it just shifts the economics. Fault tolerance can be laborious and time-consuming, and that time and labor is taken from something else. When the nature of your dev and distribution pipelines renders faults less disruptive, and you have a good foundational codebase and a code review process that pays attention to security and core stability, quickly creating three working features can be much, much more valuable than making sure one working feature will never ever generate a support ticket.
Even for software it’s often risky to remove code once it’s in there. Lots of software products ship with tons of unused code and assets because no one’s got time to validate that nothing’s gonna go wrong when you remove them. Check out some game teardowns; they often have dead assets from years ago, sometimes even completely unrelated things from the studio’s past projects.
The article also mentions they had trouble measuring power usage in general, so we can't necessarily (and, really, shouldn't) conclude that it has no impact whatsoever. I highly doubt that AMD's engineering teams are so unprincipled as to allow people to add HW features with no value (why would you dedicate area and power to a feature which doesn't do anything?), so I'm inclined to give them the benefit of the doubt here and assume that Chips 'n Cheese simply couldn't measure the impact.
Note - I saw the article through from start to finish. For power measurements I modified my memory bandwidth test to read AMD's core energy status MSR, and modified the instruction bandwidth testing part to create a loop within the test array. (https://github.com/clamchowder/Microbenchmarks/commit/6942ab...)
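Roughly, the MSR-based energy read looks like this (a simplified sketch rather than the actual harness in the linked commit; the MSR addresses are AMD's documented RAPL-style energy MSRs for Zen, and /dev/cpu/*/msr assumes the Linux msr kernel module is loaded):

```c
// Simplified sketch of reading AMD's core energy counter on Linux.
// Requires root and the msr kernel module (modprobe msr).
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_RAPL_PWR_UNIT    0xC0010299  // energy unit in bits [12:8]
#define MSR_CORE_ENERGY_STAT 0xC001029A  // 32-bit core energy accumulator

static uint64_t read_msr(int fd, uint32_t reg) {
    uint64_t value = 0;
    pread(fd, &value, sizeof(value), reg);
    return value;
}

int main(void) {
    int fd = open("/dev/cpu/0/msr", O_RDONLY);
    if (fd < 0) { perror("open /dev/cpu/0/msr"); return 1; }

    // Energy ticks are 1 / 2^ESU joules; ESU is bits [12:8] of the unit MSR.
    uint64_t unit = read_msr(fd, MSR_RAPL_PWR_UNIT);
    double joules_per_tick = 1.0 / (double)(1ULL << ((unit >> 8) & 0x1F));

    // Sample the accumulator around the workload (ignores counter wraparound).
    uint64_t before = read_msr(fd, MSR_CORE_ENERGY_STAT) & 0xFFFFFFFFULL;
    sleep(1);  // run the instruction bandwidth loop of interest here instead
    uint64_t after = read_msr(fd, MSR_CORE_ENERGY_STAT) & 0xFFFFFFFFULL;

    printf("core energy: %.3f J\n", (double)(after - before) * joules_per_tick);
    close(fd);
    return 0;
}
```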
Remember, most of the technical analysis on Chips and Cheese is a one-person effort, and I simply don't have infinite free time or equipment to dig deeper into power. That's why I wrote "Perhaps some more mainstream tech outlets will figure out AMD disabled the loop buffer at some point, and do testing that I personally lack the time and resources to carry out."
Sorry, I totally didn't mean this as a slight to your work--I've been a fan for quite a while :)
It's more that estimating power when you don't have access to post-synthesis simulations or internal gas gauges is very hard. For something so small, I can easily see this being a massive pain to measure in the field and the kind of thing that would easily vanish into the noise on a real system.
But in the absence of any clear answer, I do think it's reasonable to assume that the feature does in fact have the power advantages AMD intended, even if small.
> engineering teams are so unprincipled as to allow people to add HW features with no value
This is actually pretty common, as the performance characteristics are often unknown until late in the hardware design cycle - it would be "easy" if each cycle were just changing that single unit with everything else static, but that isn't the case, as everything is changing around it. And by the time you've got everything together complete enough to actually test end-to-end pipeline performance, removing things is often the riskier choice.
And that's before you even get to the point of low-level implementation/layout/node specific optimizations, which can then again have somewhat unexpected results on frequency and power metrics.
Working at... a very popular HW company... I'll say that we (the SW folks) are currently obsessed with 'doing something' even if the thing we're doing hasn't fully been proven to have benefits outside of some narrow use cases or targeted benchmarks. It's very frustrating, but no one wants to put the time in to do the research up front. It's easier to just move forward with a new project because upper management stays happy and doesn't ask questions.
Is it the expectation of major updates coming at a fixed cycle? Not only expected by upper management but also by end users? That's a difficult trap to get out of.
I wonder if that will be the key benefit of Google's switch to two "major" Android releases each year: it will get people used to nothing newsworthy happening within a version increment. And I also wonder if that's intentional, and my guess is not the tiniest bit.
Yeah, we've made great progress and folks are used to it. Now we've got to deliver, but most of the low-hanging fruit has been picked (some of it also incurred tech debt).
Do you have new software managers/directors who are encouraging such behavior? In my experience, new leaders tend to lean on these tactics to grab power.
Strangely no. Our management hasn't really changed in several years. Expectations have risen though and we've picked a lot of the low-hanging fruit. We also failed to invest in our staffing and so we don't have enough experienced devs to actually do the work now.
Well, the other possibility is that the power benchmarks are accurate: the buffer did save power, but then they figured out an even better optimization at the microcode level that made the regular path save even more power, so the buffer actually became a power hog.
>> when the project is done, there are more lines of code and performance is worse
There is an added benefit though - the new programmers are now fluent in the codebase. That benefit might be worth more than LOCs or performance.
"The article seems to suggest that the loop buffer provides no performance benefit and no power benefit."
It tests the performance benefit hypothesis in different scenarios and does not find evidence that supports it. It makes one best effort attempt to test the power benefit hypothesis and concludes it with: "Results make no sense."
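For a sense of what those performance tests look like, the general shape is a tiny loop of cheap instructions, small enough to fit in a loop buffer, timed for front-end throughput. This is an illustrative sketch only, not the article's actual harness, and it assumes x86-64 with GCC/Clang inline asm:

```c
// Illustrative loop-buffer-sized microbenchmark: time a tiny hot loop and
// estimate instruction throughput. Real tests also sweep loop sizes and
// instruction mixes; this just shows the shape.
#include <stdint.h>
#include <stdio.h>
#include <time.h>

int main(void) {
    const uint64_t iters = 1ULL << 30;
    uint64_t n = iters;
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    // 8 NOPs plus the loop overhead: small enough to sit in a loop/uop buffer.
    __asm__ volatile(
        "1:\n\t"
        "nop\n\tnop\n\tnop\n\tnop\n\t"
        "nop\n\tnop\n\tnop\n\tnop\n\t"
        "dec %0\n\t"
        "jnz 1b\n\t"
        : "+r"(n));
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    // 10 instructions per iteration (8 NOPs + dec + jnz).
    printf("%.2f billion instructions/s\n", iters * 10 / secs / 1e9);
    return 0;
}
```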
I think the real takeaway is that performance measurements without considering power tell only half the story. We've come a long way on the performance measurement half, but power measurement is still hard. We should work on that.
Tell that to the shareholders. As a public company, they can very quickly lose enormous amounts of money by falling behind or coming in below expectations on just about anything.
Someone elsewhere quotes a game-specific benchmark of about 15%, which will mostly matter when your FPS is low enough that gameplay becomes difficult.
There will be a certain number of people who will delay an upgrade a bit longer because the new machines don’t have enough extra oomph to warrant it. Little’s Law can apply to finance when the quantity of interest is the interval between purchases.
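Roughly how that mapping works (symbols and numbers here are illustrative, not from the thread):

```latex
% Little's Law: L = \lambda W
%   L       -- installed base (machines currently in service)
%   \lambda -- purchase rate (units bought per year)
%   W       -- average interval between purchases (years)
\[
  L = \lambda W \quad\Longrightarrow\quad \lambda = \frac{L}{W}
\]
% With a roughly fixed installed base, stretching the average upgrade
% interval W directly lowers the purchase rate \lambda: going from a
% 5-year to a 6-year cycle cuts yearly unit sales by about 17%.
```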