Julia's GPU capabilities are _far_ better than Python's. There are near-complete, free wrappers for all the CUDA libraries; the same is not true for Python (e.g. CUSPARSE).
This is a consequence of how easy it is to interface with languages like C from Julia: whenever a new CUDA library appears or an existing one changes, it takes the Julia devs less effort to update their wrappers.
Add to this Julia's HPC support and it is by far the best offering if you want to do REPL-style programming on a GPU cluster.
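To make the C-interop point concrete, here is a minimal sketch of what a raw call into the CUDA runtime looks like from Julia (assuming libcudart is on the library path; the function name and error handling are only illustrative of the pattern, not taken from any particular package):

    # Query the number of CUDA devices via the C runtime API.
    # C signature: cudaError_t cudaGetDeviceCount(int* count)
    function device_count()
        count = Ref{Cint}(0)
        status = ccall((:cudaGetDeviceCount, "libcudart"), Cint, (Ptr{Cint},), count)
        status == 0 || error("cudaGetDeviceCount failed with status $status")
        return Int(count[])
    end

    println(device_count())  # e.g. 1 on a single-GPU machine

No build step and no separate extension module - the ccall is a direct call into the shared library, which is why keeping wrappers current is comparatively cheap.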
Wow! Thanks for the CUSPARSE.jl shoutout (I'm the maintainer of that package)!
I must take issue with "near complete" - CUSOLVER.jl, my other CUDA wrapper, is missing a lot of the RF (refactorization) functionality and doesn't really have a nice high-level API yet. The doc and testing situation is also pretty bad. Everyone else's packages in JuliaGPU (the overarching GPU org on GitHub) are in a much better state.
You are right, though, that writing CUDA bindings in Julia is very easy - so easy that I can do it! It's also thanks to packages like Clang.jl, which make it easy for us to automate the procedure of wrapping the low-level interfaces.
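For a sense of what those auto-generated low-level bindings roughly look like, here is a hand-written sketch of one CUSPARSE entry point (not the actual CUSPARSE.jl code - the real bindings are generated from the headers, and the name and error handling here are only illustrative):

    # C signature: cusparseStatus_t cusparseCreate(cusparseHandle_t* handle)
    function cusparse_create()
        handle = Ref{Ptr{Cvoid}}(C_NULL)
        status = ccall((:cusparseCreate, "libcusparse"), Cint, (Ptr{Ptr{Cvoid}},), handle)
        status == 0 || error("cusparseCreate failed with status $status")
        return handle[]  # opaque library handle passed to later CUSPARSE calls
    end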
Finally, I must add the caveat that although I should test CUSPARSE.jl and CUSOLVER.jl with MPI (and GPUDirect), I've been extremely busy recently and haven't been able to do so. If anyone wants to help out in this regard, I would be very appreciative!
This is a good argument, but that is not the market where the majority of Python programmers are.
Once the wrappers are done, for most developers it becomes a matter of layering those building blocks to build their own functionality. "Julia is a better language for writing wrappers" is not a strong enough argument for people to switch, especially when the Python wrappers for data science are already treated as a de-facto built-in.
Not saying Python is perfect, but IMHO Julia isn't hitting the right spot to really stand up against Python.