As a developer who's micro-optimized some genetic software, I can confirm that I considered AVX-512 but decided against it. The hardware the company was purchasing would not have had the full AVX-512 feature set I wanted, and it was simpler to just write it in AVX2. Getting the software to also work on older/cheaper hardware made the business owner happy too.
In my case, it's GCC. The option is `-march=native -mtune=native`.
The trick though is _describing_ the scalar operations in the language and getting the compiler to understand how to efficiently vectorize them. I couldn't get GCC to do it at the time (GCC-5 if I recall, though we deployed with GCC-6); maybe it was just inexperience on my part. But I ended up writing the intrinsics by hand. To be quite honest it was my first dive into SIMD and I thought it was rather fun to do.
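For anyone curious what "scalar description versus hand-written intrinsics" looks like in practice, here is a minimal sketch. It is not the code from that project: the kernel (counting positions where two byte sequences match) and the names `count_matches_scalar`, `count_matches_avx2`, and `count_matches` are hypothetical, chosen only for illustration. It assumes GCC or Clang, since it relies on the `target` attribute and `__builtin_cpu_supports`.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86 1
#endif

/* Scalar reference (hypothetical kernel): count positions where a[i] == b[i].
   This is the form you would hope the compiler vectorizes on its own. */
static size_t count_matches_scalar(const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (a[i] == b[i]);
    return count;
}

#ifdef HAVE_X86
/* Hand-written AVX2 version: compares 32 bytes per iteration.
   The target attribute lets this one function use AVX2 even when the
   rest of the file is built for a baseline CPU. */
__attribute__((target("avx2")))
static size_t count_matches_avx2(const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t count = 0;
    size_t i = 0;
    for (; i + 32 <= n; i += 32) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        __m256i eq = _mm256_cmpeq_epi8(va, vb);            /* 0xFF where bytes match */
        uint32_t mask = (uint32_t)_mm256_movemask_epi8(eq); /* one bit per byte */
        count += (size_t)__builtin_popcount(mask);
    }
    for (; i < n; i++)                                      /* scalar tail */
        count += (a[i] == b[i]);
    return count;
}
#endif

/* Runtime dispatch: use AVX2 when the CPU has it, otherwise fall back. */
size_t count_matches(const uint8_t *a, const uint8_t *b, size_t n)
{
#ifdef HAVE_X86
    if (__builtin_cpu_supports("avx2"))
        return count_matches_avx2(a, b, n);
#endif
    return count_matches_scalar(a, b, n);
}
```

Calling `count_matches(a, b, n)` gives the same answer on any machine; only the speed differs. The runtime check is one way to keep a single binary working on older hardware, as opposed to baking the instruction set in at build time.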
You can say `-march=native -mtune=sandybridge`, but there would be no point.
You can say `-march=sandybridge -mtune=native`, usefully. It might go slower on a real Sandy Bridge than if tuned for it, but would still work, and would go as fast as the smaller instruction mix allows on your build machine.
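A quick way to see what a given `-march` actually enabled is to look at the predefined macros. The sketch below assumes GCC or Clang on x86, which define `__AVX2__` and `__AVX512F__` when those instruction sets are permitted for the build; the function names are hypothetical. Note that `-mtune` only affects scheduling and does not change these macros.

```c
#include <stdio.h>

/* Returns 1 if this translation unit was compiled with AVX2 enabled
   (for example via -march=haswell or -march=native on an AVX2 machine). */
int built_with_avx2(void)
{
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if AVX-512 Foundation instructions were enabled at build time. */
int built_with_avx512f(void)
{
#ifdef __AVX512F__
    return 1;
#else
    return 0;
#endif
}

/* Prints a short report of what the chosen -march allowed the compiler to emit. */
void print_isa_report(void)
{
    printf("AVX2 enabled at build time:     %s\n", built_with_avx2() ? "yes" : "no");
    printf("AVX-512F enabled at build time: %s\n", built_with_avx512f() ? "yes" : "no");
}
```

Building the same file once with `-march=sandybridge` and once with `-march=native` and comparing the reports shows the difference between the two settings on your build machine.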
I know this. I don't care. I use `-march=native -mtune=native` specifically to point other developers on the team to the two relevant compiler options. And if they don't look, nothing's lost.