This is cool, but following some of the links it seems like there are a lot of immature parts of the ecosystem and things will not "just work". See for example this issue, which I found via the blog post:
https://github.com/odsl-team/julia-ml-from-scratch/issues/2
Summarizing, they benchmark some machine learning code that uses KernelAbstractions.jl on different platforms and find:
* AMD GPU is slower than CPU
* Intel GPU doesn't finish / seems to leak memory
* Apple GPU doesn't finish / seems to leak memory
It would also be interesting to compare these benchmarks against hand-written CUDA kernels (both in Julia and C++) to quantify the overhead of the KernelAbstractions layer.
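For context on what the abstraction layer actually buys you: with KernelAbstractions.jl you write the kernel once and pick the backend at runtime. A minimal sketch (the kernel name and sizes here are illustrative, not taken from the benchmark repo):

```julia
using KernelAbstractions

# Trivial elementwise kernel: y .= a .* x .+ y
@kernel function axpy_kernel!(y, a, @Const(x))
    i = @index(Global)
    @inbounds y[i] = a * x[i] + y[i]
end

# The same kernel runs on any backend: swap CPU() for CUDABackend(),
# ROCBackend(), oneAPIBackend(), or MetalBackend() to target a GPU --
# which is exactly the portability the linked issue is stress-testing.
backend = CPU()
x = rand(Float32, 1_000)
y = rand(Float32, 1_000)
axpy_kernel!(backend)(y, 2.0f0, x; ndrange = length(y))
KernelAbstractions.synchronize(backend)
```

The comparison I'm suggesting would pit something like this against the equivalent raw CUDA.jl `@cuda` kernel and a C++ CUDA kernel, so the cost of the extra abstraction shows up directly in the timings.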