
Found the discrepancy. I use single precision in PyTorch, and when I benchmark sgemm (the single-precision matrix multiply), MKL selects the SSE code path.

Conclusion: MKL detects Zen now, but currently only implements a Zen code path for dgemm and not for sgemm. To get good performance for sgemm, you have to fake being an Intel CPU.
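For anyone who wants to check this on their own machine, here is a rough sketch of the kind of comparison I mean. It is not the exact benchmark I ran: the helper names (`gemm_gflops`, `time_matmul`, `compare_precisions`), the matrix size and the repeat count are all made up for illustration, and it assumes a CPU build of PyTorch linked against MKL.

```python
import time


def gemm_gflops(n, seconds):
    """GFLOP/s of an n x n by n x n matrix multiply, counted as 2*n^3 operations."""
    return 2.0 * n**3 / seconds / 1e9


def time_matmul(matmul, a, b, repeats=10):
    """Best wall-clock time of matmul(a, b) over `repeats` runs, after one warm-up call."""
    matmul(a, b)  # warm-up so one-time initialisation is not measured
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        matmul(a, b)
        best = min(best, time.perf_counter() - start)
    return best


def compare_precisions(n=4096):
    """Print float32 (sgemm) and float64 (dgemm) throughput for n x n matrices."""
    import torch  # imported here so the helpers above work without PyTorch

    for dtype in (torch.float32, torch.float64):
        a = torch.rand(n, n, dtype=dtype)
        b = torch.rand(n, n, dtype=dtype)
        seconds = time_matmul(torch.mm, a, b)
        print(f"{dtype}: {gemm_gflops(n, seconds):.1f} GFLOP/s")
```

Call `compare_precisions()` to run it. On a CPU with a proper code path for both, float32 would normally be expected to come out well ahead of float64, so a float32 number that is no better than the float64 one hints at the problem above. Running with `MKL_VERBOSE=1` in the environment makes MKL log its BLAS calls, which can help confirm what is actually being invoked.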

Edit, longer description: https://github.com/pytorch/builder/issues/504


