
Most high-performance software on supercomputers uses the Intel C and Fortran compilers, and much engineering and scientific software on workstations uses the Intel Math Kernel Library (MKL) for high-performance linear algebra.

Now that AMD EPYC processors are powering a lot of next-generation supercomputer clusters, we're going to have to figure out some workarounds!



I just compiled TensorFlow on AMD EPYC and had no idea https://github.com/oneapi-src/oneDNN was actually MKL... now I'm wondering if I'm even getting all that power


The actual CPUID-checking code can be traced from here: https://github.com/oneapi-src/oneDNN/blob/master/src/cpu/x64...

to here: https://github.com/oneapi-src/oneDNN/blob/master/src/cpu/x64...

It's using feature-flag checks, not family checks, so you shouldn't be affected if you're using oneDNN.


thank you!


I took a reversing course some years ago, and during the first part we learned how to identify the compiler using common patterns. Long story short, the Intel compiler did a phenomenal job of optimizing. This was 10 years ago, so things may be different now.


10 years ago LLVM was a baby and GCC was still on version 4. Intel probably still has an advantage in areas where people pay them for it, but GCC and LLVM are excellent compilers today.

Anecdotally (ignoring that I'm still not sure whether to trust it or not), Intel stopped developing IACA and pointed users toward (though did not formally recommend) LLVM's MCA, which does suggest a changing of the guard in some way.


I think this: https://developer.amd.com/amd-aocl/amd-math-library-libm/ is supposed to be the alternative to MKL for those applications.


No need to develop an alternative when you can trick MKL into not crippling AMD: https://www.pugetsystems.com/labs/hpc/How-To-Use-MKL-with-AM...

Edit: the link I posted follows Agner's advice from the bottom of the OP's link. However, I think the extra information it adds is that Zen 2 Threadrippers outpaced Intel's then-current top contender. Once Zen 3 and Intel's 11th gen become available, repeating these benchmarks would be very valuable.


Thank you! I wasn't aware of this. But this is only a replacement for libm (i.e. basic trig and exp functions), not the matrix-oriented BLAS, LAPACK, and ScaLAPACK routines in which scientific codes spend >90% of their time.


I'm not personally familiar with those, but it seems BLAS, ScaLAPACK, and others are also available:

https://developer.amd.com/amd-aocl/



