
As a developer who's micro-optimized some genetic software, I can confirm that I'd considered AVX-512 but decided against it after learning that the hardware being purchased by the company would not have the full AVX-512 feature set desired and it was simpler/easier to just write it in AVX2. Getting the software to also work on older/cheaper hardware made the business owner happy too.


If you really care about performance, you could always compile on the target machine directly via -xhost [0] or whatever the flag is on your compiler.

[0] https://software.intel.com/en-us/cpp-compiler-developer-guid...


In my case, it's GCC. The option is `-march=native -mtune=native`.

The trick though is _describing_ the scalar operations in the language and getting the compiler to understand how to efficiently vectorize them. I couldn't get GCC to do it at the time (GCC-5 if I recall, though we deployed with GCC-6); maybe it was just inexperience on my part. But I ended up writing the intrinsics by hand. To be quite honest it was my first dive into SIMD and I thought it was rather fun to do.
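For anyone curious what "writing the intrinsics by hand" looks like next to the scalar version, here is a minimal sketch. It is not the code from the project above: the function names (`add_i32`, `add_i32_scalar`, `add_i32_avx2`) and the element-wise add are made up for illustration. It assumes GCC or Clang, and uses a per-function `target("avx2")` attribute plus a runtime check so the file builds without `-mavx2` and still runs on pre-AVX2 hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain scalar loop: the form one hopes the auto-vectorizer will handle. */
static void add_i32_scalar(const int32_t *a, const int32_t *b,
                           int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hand-written AVX2: eight int32 lanes per 256-bit register.
 * The target attribute enables AVX2 for this function only. */
__attribute__((target("avx2")))
static void add_i32_avx2(const int32_t *a, const int32_t *b,
                         int32_t *out, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(out + i), _mm256_add_epi32(va, vb));
    }
    /* Scalar tail for the last n % 8 elements. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
#endif

/* Hypothetical dispatcher: use AVX2 if the CPU reports it, else scalar. */
void add_i32(const int32_t *a, const int32_t *b, int32_t *out, size_t n)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        add_i32_avx2(a, b, out, n);
        return;
    }
#endif
    add_i32_scalar(a, b, out, n);
}
```

Calling `add_i32(a, b, out, n)` gives the same result on either path; only the instruction mix differs. The runtime dispatch is one way to keep older/cheaper hardware working from a single binary.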


-march=native implies -mtune=native.

You can say -march=native -mtune=sandybridge, but there would be no point.

You can say -march=sandybridge -mtune=native, usefully. It might go slower on a real sandybridge than if tuned for it, but would still work, and would go as fast as the smaller instruction mix allows on your build machine.
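A quick way to see the difference between the two flags: `-march` changes which instruction-set macros the compiler predefines, while `-mtune` only affects scheduling and leaves them alone. The sketch below assumes GCC or Clang on x86; `widest_simd_enabled` is a made-up helper name, not anything from the compiler.

```c
/* Reports the widest SIMD extension the compiler was allowed to emit
 * for this translation unit, based on its predefined macros. */
const char *widest_simd_enabled(void)
{
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "none";
#endif
}
```

Built with `-march=sandybridge -mtune=native`, this should report AVX regardless of what the build machine supports, since Sandy Bridge has AVX but not AVX2. Built with `-march=native`, the answer depends on the build machine.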


I know this. I don't care. I use `-march=native -mtune=native` specifically to point other developers on the team to the two relevant compiler options. And if they don't look, nothing's lost.


Which ISA did it have?

Even the minimal AVX-512 ISA on any mainstream CPU (SKX) is pretty much a strict superset of AVX2.


> Which ISA did it have?

Business side was considering whether to buy Skylake or Broadwell.



