Thanks for the explanation! __user and sparse seem to be almost exactly the kinds of tools I had in mind, with the caveat that __user would be the default for a pointer argument to a syscall, so that it wouldn't need to be specified.
I'm not sure I understand what interface you're referring to that was "unsafe" and subsequently "used in an unsafe manner". What is the "unsafe interface" here that was being used in an unsafe manner? It seems to me that the problem was that the pointer was not marked as __user? Which is awful, because shouldn't __user be the implicit default behavior for a pointer argument to a syscall? Why should the default behavior be the unsafe one you pretty much never want?
System call pointer arguments are usually marked with __user annotations (I cannot recall whether some odd calls take a kernel pointer; none should need one, but there may be a legacy call or two). In particular, the infop argument to waitid() is marked as a user-space pointer [1].
Before using a pointer to user-space, one should check it with access_ok(). The usual safe interfaces (copy_{to,from}_user(), put_user(), get_user()) always perform this check and fail with an error if the pointer is not a valid user-space pointer.
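To make the "check, then store" contract concrete, here is a tiny user-space model of it. This is not kernel code: the array-backed "address space", the USER_LIMIT boundary, and the model_* names are all made up for illustration, and the real access_ok() and put_user() take pointers and are considerably more involved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (hypothetical names throughout): the "address space" is an
 * array of slots. Slots below USER_LIMIT stand for user-space memory,
 * the remaining slots stand for kernel memory. */
#define SPACE_SIZE 64u
#define USER_LIMIT 32u
static_assert(USER_LIMIT <= SPACE_SIZE, "user region must fit in the model");

static int32_t space[SPACE_SIZE];

/* Stand-in for access_ok(): does the slot lie in the user region? */
static int model_access_ok(size_t slot)
{
    return slot < USER_LIMIT;
}

/* Stand-in for put_user(): always checks the destination first and
 * refuses to store if it is not a user-space location. */
static int model_put_user(int32_t value, size_t slot)
{
    if (!model_access_ok(slot))
        return -EFAULT;
    space[slot] = value;
    return 0;
}
```

In this model a store to a user slot succeeds, while a store to a kernel slot returns -EFAULT and leaves the slot untouched, which is the behaviour the safe interfaces are meant to guarantee.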
The commit that introduced the vulnerability [2] replaced the safe interface with unsafe ones, possibly for performance. The code had used put_user() to set individual fields of a struct. Those calls were replaced with calls to unsafe_put_user(), which does not perform an access_ok() check on every call. A NULL-pointer check was added before the stores, but no access_ok() check. unsafe_put_user() still checks whether the address points to actually mapped memory, but does not verify that the location is in user-space.
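A rough user-space model of that pattern, for anyone who wants to see the difference side by side. Again, this is a sketch and not the kernel code: the slot-based "address space", the fill_info_* functions, and the choice of slot 0 as "NULL" are assumptions made for the example, and the real waitid() fills in more fields than the two shown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (hypothetical names throughout): slots below USER_LIMIT are
 * "user-space", slots from USER_LIMIT up to SPACE_SIZE are "kernel",
 * anything beyond SPACE_SIZE is "unmapped". Slot 0 plays the role of NULL. */
#define SPACE_SIZE 64u
#define USER_LIMIT 32u
static_assert(USER_LIMIT <= SPACE_SIZE, "user region must fit in the model");

static int32_t space[SPACE_SIZE];

/* Stand-in for access_ok(): is [slot, slot + count) entirely user-space? */
static int model_access_ok(size_t slot, size_t count)
{
    return slot <= USER_LIMIT && count <= USER_LIMIT - slot;
}

/* Stand-in for unsafe_put_user(): fails on "unmapped" memory, but does
 * not care whether the destination is user or kernel memory. */
static int model_unsafe_put_user(int32_t value, size_t slot)
{
    if (slot >= SPACE_SIZE)
        return -EFAULT;
    space[slot] = value;
    return 0;
}

/* The vulnerable shape: a NULL check, then unchecked stores. */
static int fill_info_buggy(size_t slot, int32_t signo, int32_t code)
{
    if (slot == 0)
        return 0;
    if (model_unsafe_put_user(signo, slot) ||
        model_unsafe_put_user(code, slot + 1))
        return -EFAULT;
    return 0;
}

/* The intended shape: one range check up front, then the cheap stores. */
static int fill_info_checked(size_t slot, int32_t signo, int32_t code)
{
    if (slot == 0)
        return 0;
    if (!model_access_ok(slot, 2))
        return -EFAULT;
    if (model_unsafe_put_user(signo, slot) ||
        model_unsafe_put_user(code, slot + 1))
        return -EFAULT;
    return 0;
}
```

In the model, fill_info_buggy() happily writes into a "kernel" slot when handed one, whereas fill_info_checked() rejects the same destination with -EFAULT before any store happens. The single up-front check is what keeps the unchecked stores safe while still avoiding a check per field.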
The commit was not really discussed in depth on LKML [3], as it came from Al Viro, who should know better, is one of the Sparse maintainers, etc. Some projects require a human-written justification for any use of unsafe interfaces during code review (like flagging a review with 'needs-check' or something similar that requires a sign-off by another human that the unsafe thing is actually safe). This may have been a case where that would have mattered, since a static analysis tool should not produce bogus warnings for interfaces that are designed to do unsafe things. Though it may also be useful to add a check to Sparse that verifies that unsafe_{get,put}_user() calls are preceded by an access_ok() call in the same function.