Simple Linux kernel memory corruption bug can lead to complete system compromise

oneepic · on Oct 20, 2021

In case anyone was confused like me, this appears to be describing a past bug from ~10 months ago, not an open one. (the blog post links to the bug also) https://bugs.chromium.org/p/project-zero/issues/detail?id=21...

perihelions · on Oct 20, 2021

Did Debian-stable not release the patch until February 2021, or did am I confused?

https://lists.debian.org/debian-security-announce/2021/msg00...

https://tracker.debian.org/news/1226431/accepted-linux-41917...

geofft · on Oct 20, 2021

The commit messages are pretty unclear about whether there's any security impact:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...

At least the second one says "... leading to use-after-free errors." But style in the Linux community is to not mention security impact and just to give a dense explanation of the bug itself. (Jann Horn, as a person who does care about security, tends to be better about this than most kernel developers; if the fix were from the average subsystem maintainer, I wouldn't expect to even see a mention of "use-after-free.")

Also, if you look at the Project Zero bug log (https://bugs.chromium.org/p/project-zero/issues/detail?id=21...), it's clear that Horn wasn't totally sure whether/how this could be exploited, just that it seemed funny.

(This should probably lead you to question whether "stable" kernels are a meaningful concept and whether the hypothesis that stable kernels are patched / otherwise do what they claim to do is even falsifiable.)

tialaramex · on Oct 20, 2021

The kindest way to look at the Linux commit behaviour is that almost by definition its bugs are security bugs. The bug means the kernel doesn't behave as intended, and you are entitled from a security point of view to assume it does behave as intended, so, hence bugs are security bugs. And thus argubaly having a "security bug" flag is redundant.

For example suppose for some crazy reason, on Sundays all USB audio devices have stereo flipped by mistake. That's a bug. Is it a security bug? At first glance you may think "No". It's just a weird logic bug. But the user is entitled to reason that if they routed stereo left from their USB digital audio feed to the mono "Security Announcements" PA in the building, and their "security announcements" code is silent on stereo right, that's fine, that will still cause the announcements to come from their PA system as desired. It's astonishing that on Sunday this Linux bug silences the PA and they don't get PA announcements that there's a seismic alarm down in gold vaults D, E and F because somebody has drilled into them. It was a security bug after all.

shawnz · on Oct 20, 2021

That seems like a stretch to me. If the hypothetical issue you are describing were caused by a bug in the system's media player software, would you say it's a security bug there too, just because it could be used to play security announcements? With this way of looking at things you are essentially saying that any bug in any software that could be called by other software is a security bug.

tialaramex · on Oct 20, 2021

Yes, I would actually be comfortable with the latter claim in extremis.

But here we're talking about the Linux kernel and so we needn't stretch that far. "The buck stops here" so to speak, if you can't trust the OS kernel to do what it promised, you're pretty much screwed.

All you have left are broader physical or mathematical guarantees e.g. even a Linux kernel bug at Let's Encrypt can't leak your TLS server private keys because they don't have your private keys and so mathematically that can't happen - or even if there's a really drastic Linux kernel zero day this Android phone can't travel faster than the speed of light because physics doesn't rely on kernel code.

Datagenerator · on Oct 20, 2021

Thank you, security officer on duty goes back to sleep.

diegocg · on Oct 20, 2021

It is also a local exploit, not a remote one

Steltek · on Oct 20, 2021

At some point in my career, I picked up the notion that there an infinite number of local exploits laying around on your average Linux box. Any local user could find their way to root unless you took extra steps to lock things down. I'm not saying that there are still bash one-liners that give you a root prompt. Just that the "attack surface" of privileged binaries and kernel APIs is so enormous that there must be something to leverage. I don't mean to pick on anything unfairly but I figured a specially crafted filesystem or FUSE command would do the trick quite easily.

Is that still the case or am I just old?

kuroguro · on Oct 20, 2021

Not sure about the particular setup you mentioned, but given that there's an LPE published every few months, I wouldn't be surprised.

https://ssd-disclosure.com/ssd-advisory-overlayfs-pe/

https://blog.qualys.com/vulnerabilities-threat-research/2021...

ncmncm · on Oct 20, 2021

The distinction is moot. A remote exploit that gets you local execution combined with a local to root gets you remote to root, and game over.

phendrenad2 · on Oct 20, 2021

Feels like a few years ago, posts about a local privilege escalation would be shouted down with "It doesn't matter, if someone has access to your machine, it's game over man". And remote code execution in a non-privileged context would be shouted down with "so what, it can't run as root". Glad to see people are finally connecting the dots.

amelius · on Oct 20, 2021

Can it be turned into a remote exploit through a web-browser? (Assuming the user knows what they are doing)

geofft · on Oct 20, 2021

Not without a sandbox escape, at which point you often have more juicy targets without even getting to root (e.g., bank account credentials). The attack requires the ability to open a terminal device and do strange ioctls on it. Web browsers don't open terminal devices at all, so you are pretty unlikely to induce the browser to reuse parts of its code to do it; you'd need the ability to run arbitrary code.

nyc_pizzadev · on Oct 20, 2021

A handful of C projects I have seen use magic numbers in allocated structs to prevent use-after-free and other memory bugs[0]. Basically, in this case, when the ref count hits zero and the struct is freed, the magic is zeroed and any further access will be stopped. The author makes no reference of this, so I guess this isn’t a widespread safety pattern?

[0] https://github.com/varnishcache/varnish-cache/blob/4ae73a5b1...

vlovich123 · on Oct 20, 2021

It’s possible some projects do this correctly but I suspect most have a false sense of security as the compiler will elide all stores that are happening in a struct about to be freed and there’s no C/C++/LLVM language that’s really immune from this [1].

Usually a more thorough approach is to turn on malloc scribbling, ASAN or valgrind which is something Darwin’s allocator can be told to do (it’ll scribble separate uninitialized and freed patterns).

I could see the appeal of there being a magic value though. I think that’s what memset_s is for so hopefully your favorite project is doing that properly.

[1] http://www.daemonology.net/blog/2014-09-04-how-to-zero-a-buf...

nyc_pizzadev · on Oct 20, 2021

> I suspect most have a false sense of security as the compiler will elide all stores that are happening in a struct about to be freed

The magic field is reset before returning the pointer to the allocator, so it’s definitely a live write to a valid pointer.

geofft · on Oct 20, 2021

It's a live write to a valid pointer that's never read from, so it can't have any influence on the program behavior if the rest of the program is valid C. Compilers are free to optimize that (and they basically need to for performance reasons - unused writes are common in normal code).

cogman10 · on Oct 20, 2021

In C and C++, It's unlikely the compiler will optimize that because of pointer aliasing. It's basically impossible for the compiler to know if a given pointer will ever be re-referenced in the future due to pointer math or in an extreme case, explicit pointer addresses (happens semi frequently in the embedded world).

DSMan195276 · on Oct 20, 2021

If you're passing the pointer to the standard library's `free()` then the compiler doesn't actually have to worry about pointer aliasing - it's undefined behavior to use a pointer after it has been passed to `free()`, so it can conclude it never happens in your program.

FWIW most of what you're referring to (reaching a free'd pointer via pointer arithmetic, or going straight to its address) is actually undefined behavior according to the C standard, what you're allowed to do with pointers is more limited than that. There are ways of getting around some of that (obviously, since it's necessary in embedded and other low-level situations), but it requires knowledge about your specific compiler and/or potentially the use of extensions or similar things not part of the standard.

Jweb_Guru · on Oct 20, 2021

You can put the code into godbolt right now and see that it's optimized out. The C and C++ standards say the compiler can make plenty of assumptions about what can and can't be aliased, and they generally won't hesitate to exploit this for optimization purposes. If they did not do this, C++ in particular would be pretty screwed as a high performance language (C might still fare okay in benchmarks, though not in real programs).

nyc_pizzadev · on Oct 20, 2021

The pointer is read by the allocator, which then performs its own set of writes and management. I think you guys might be confusing stack memory and dynamic memory here.

Interestingly enough, if you were to incorrectly use stack memory in this scenario, the magic checks should trigger, pointing out improper memory usage.

DSMan195276 · on Oct 20, 2021

Allocation and free functions from the standard library are 'special' though and do actually get their own logic in regards to whether a pointer is necessary or not. If you do some simple tests you can see that gcc will optimize away entire `malloc()` and `free()` calls if it can prove the resulting memory or assignments are unnecessary (Ex. Store a constant to it, and then read the constant). It's willing to do that because the behavior of those functions is defined by the standard, so it 'knows' that removing those calls has no affect on the behavior of the program.

I'm pretty sure you can get similar optimization behavior when you mark functions with special attributes, I'm not 100% sure on that point though. So for the Linux Kernel I'm not sure that kind of optimization would ever be done since obviously it's not using the standard defined functions and the compiler might not be given enough information to know the functions have the same semantics as 'malloc()' and 'free()'.

I personally was able to get the exact behavior described here by compiling the below code using `-O2`. The volatile write just ensures the first constant write and the malloc itself cannot be optimized out (and that does happen! Take out the volatile and there are no calls to malloc in the result), but gcc is still free to do whatever it wants with the other write and for me it's completely gone in the resulting program.

    int main()
    {
        int *p = malloc(sizeof(*p));
        *(volatile int *)p = 20;

        printf("p=%d\n", *p);
        *p = 30; /* this is gone from the -O2 compiled code */
        free(p);
        return 0;
    }

The relevant part of the assembly looks like this, I don't think I'm missing anything in that the 30 assignment is completely gone:

    push   $0x4
    call   460 <malloc@plt>
    movl   $0x14,(%eax) # Assignment of 20
    mov    %eax,%esi
    pop    %eax
    lea    -0x1930(%ebx),%eax
    pop    %edx
    pushl  (%esi) # push the 20
    push   %eax
    call   440 <printf@plt>
    mov    %esi,(%esp) # load address to pass to free(), no assignment between printf() and free()
    call   450 <free@plt>

pjmlp · on Oct 20, 2021

Yes, it is gone. You can play with it on Compiler Explorer.

https://godbolt.org/z/Ea3WPcad1

nyc_pizzadev · on Oct 20, 2021

The point of the magic values are that these values are eventually checked. This should break out of these kinds of optimizations.

Edit note: allocators are user space libraries (stdlib) and not part of the C spec. Use-after-free is extremely unsafe, however its completely valid C.

2nd note: running all programs below give me the expected result, writes to pointers are live, regardless if they are freed. So please provide more concrete steps to reproduce your results.

DSMan195276 · on Oct 20, 2021

> Edit note: allocators are user space libraries (stdlib) and not part of the C spec. Use-after-free is extremely unsafe, however its completely valid C.

I'm not sure why you think this, but the C standard includes a whole section defining the behavior of the standard library functions, including malloc and free. And it includes this note as undefined behavior:

> The value of a pointer that refers to space deallocated by a call to the free or realloc function is used (7.20.3)

So it is undefined behavior to access memory passed to free, or IE use-after-free is undefined behavior if you're using the standard library.

> 2nd note: running all programs below give me the expected result, writes to pointers are live, regardless if they are freed. So please provide more concrete steps to reproduce your results.

I would check that you're compiling them with `-O2`, and also check the assembly output. However as someone linked, you can already see from the online compiler output that clearly gcc is capable of optimizing the assignment out. Here's my second program in the same compiler, notice how printf is called twice but the 30 assignment (which is what is supposed to set your magic number) is still completely gone: https://godbolt.org/z/9reErcbGK

Edit: Sorry, I previously included the wrong quote from the standard (it was about a double-free being UB, slightly different), I have the right one now. I got it from here if you want to look at it, it's a draft but practically the same as the actual one: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

guerrilla · on Oct 20, 2021

Thanks for going through all this so meticulously. I actually didn't know this was possible (although I usually disable optimizations, so that might be why.) Just to be clear, using volatile on the magic number will ensure this is avoided?

vlovich123 · on Oct 20, 2021

I recommend reading the article I linked. They thought they found a solution with volatile and it turns out didn’t actually work. Malloc/free are 100% special and a sufficiently smart compiler can remove this stuff (sometimes automatically, sometimes if you turn on LTO).

memset_s is the only function that would be defined to work precisely because this was needed for crypto so it is solvable but just take it on faith that if it’s not explicitly called out as forcing the compiler to not do dead store elimination, it will happen.

guerrilla · on Oct 20, 2021

Thanks for the recommendation, I had skipped it. It's worse than I thought. Wow, I need to reprogram my brain actually. Maybe it's time for me to actually read the standard and rethink these things.

DSMan195276 · on Oct 20, 2021

> Just to be clear, using volatile on the magic number will ensure this is avoided?

I think that's your best bet but volatile does bring its own set of problems with it (how it works differs from compiler to compiler). That said it should probably work fine, this is a pretty simple use of volatile. You might check the documentation for your particular compiler though if you have one in mind, it should tell you what it does and if it does anything undesirable (or if there's a better way to do this). Also the volatile cast like I did may be a better approach than actually making the member itself volatile.

FWIW, the whole idea is that whether this happens or not has no impact on your program, so in theory you shouldn't ever notice this happens. Detecting use-after-free like this is not really standards compliant so that's a big reason why it's problematic to implement.

Also, if you're not using the standard allcator then most of this logic doesn't apply because the compiler won't know your special allocator has the semantics of 'free()'. There are `malloc` attributes in gcc that might trigger similar optimization behavior, but you'd have to be using them, and even then I'm not really sure as I haven't looked into what all they do.

vlovich123 · on Oct 20, 2021

Best bet is to use the APIs that standards/compilers guarantee things about. The compiler is totally free to optimize volatile variables under very specific situations. Volatile has a very specific definition, but it does not mean “compiler is not allowed to optimize this”.

Trying to structure code to trick the compiler is a bad idea. The compiler authors know the standard better than you and eventually the compiler will exploit your misunderstanding.

DSMan195276 · on Oct 20, 2021

I agree, but FWIW I'm not claiming that volatile means "the compiler can't optimize this". This program would still function anyway even if the volatile is optimized out, so this is much more of a hint rather than a "this has to happen". A simple volatile store like this is also very unlikely to be optimized out, I don't know of any compiler that would attempt to do it and frankly it would break lots of stuff if it did (just because it can't be optimized out doesn't mean it can't cause other problems though). But when you get down to it trying to catch these use-after-free errors is never going to be guaranteed to work since as we've established use-after-free already breaks the standard itself. Still, using one of the various 'secure zero' APIs if you have one is definitely better, though the logic would need to be changed slightly.

vlovich123 · on Oct 21, 2021

> A simple volatile store like this is also very unlikely to be optimized out,

I think there's been godbolt links posted in this thread showing this assumption is wrong. Dead store elimination applies just fine to volatiles.

DSMan195276 · on Oct 21, 2021

I'd really like to see the example if that's the case, that would be very surprising to me and frankly sounds like a bug if it's really just a simple usage. The gcc documentation[0] suggests it will always emit a load and store, even when the result is completely ignored and is effectively dead code. I (and others) have interpreted this to mean they will never optimize out the actual load or store regardless of context (though reordering and such is still on the table in some cases, obviously, but that doesn't matter for this usage).

As context, the Linux Kernel uses volatile to ensure loads and stores happen, that's ultimately how READ_ONCE and WRITE_ONCE work[1]. If that's actually broken in such a simple case I think they'd like to know xD

[0]: https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html

[1]: https://elixir.bootlin.com/linux/latest/source/tools/include...

Edit: To be clear, I looked for the example you mentioned but couldn't find it. I'm somewhat wondering if you were thinking of the example I posted, since I used volatile to get gcc to not optimize the store out :P

vlovich123 · on Oct 21, 2021

It’s literally in the top level link I supplied [1]. You may trick some compilers today but there’s no guarantee that tomorrow’s compilers won’t get smart and leave you scratching your head about what went wrong. Memory allocation and deallocation is special in the standard. I agree it’s a bit weird but there are reasons for it (this is a form of dead store elimination that isn’t the same as normal dead store elimination which the compiler can’t optimize for volatiles because of what volatile means semantically). Your example with the kernel doesn’t apply because there’s no free happening there.

I’m genuinely amazed at the response. There’s literally an API defined that has the contract you want and your response is “yeah, but I want to write it a totally other way the standard doesn’t allow”. Just use memset_s. It’s a compiler builtin so the generated code is as efficient (more so) as compared with a volatile version except actually safe. Volatile has a totally different purpose and isn’t suitable to try to write a value before calling free.

I’ll leave writing a godbolt example of writing to a volatile right before a free in the same compilation unit at O3 for you to try out.

[1] http://www.daemonology.net/blog/2014-09-04-how-to-zero-a-buf...

DSMan195276 · on Oct 22, 2021

First I will point out, as I said memset_s is a good solution, I have no problem with that and would suggest it's use if it's possible. My complaint is simply the suggestion that volatile doesn't work here, it does.

As far as that article goes, the example for `secure_memzero` works and you will not find any compiler that will 'optimize that out', it would be a bug. And as I linked, the gcc documentation says as much. With that, memory allocation is not as special as you're making it out to be, normal memory can be volatile in perfectly valid situations (even ones mentioned in the standard), and just because it's related to a free() does not mean the compiler is now allowed to remove a volatile dead store - and even if you think it does, gcc will not do that.

Here's an example of such a case[0]. A signal handler is able to view the object being set right before the free() call, and a signal could trigger at that point, but the compiler still optimizes it out (which is correct). Using volatile on the variable to ensure all loads and stores actually happen (and are visible to the signal handler) is the suggested way, and if you do that then the code does set the value before the free().

As for your suggestion of writing to a volatile right before a free(), I'm not sure if you tried but it works just fine as expected, look[1]. I am perfectly confident in saying you will never find an example where the volatile store doesn't happen. With that, if it was willing to make such an optimization in the first place, don't you think my original example that used it to avoid dead store elimination and memory allocation elimination wouldn't have worked in the first place? ;)

[0]: https://godbolt.org/z/WWPz5Gqjo [1]: https://godbolt.org/z/anc1cfnPs

MauranKilom · on Oct 21, 2021

I have at times (ab)used volatile to aid in debugging sessions, something like

  volatile bool doCheck = false;

  if (doCheck)
  {
    // code I want to enable at some point during debugging
  }

The idea is that I attach a debugger, and then only at a certain point enable doCheck.

I was baffled to learn that MSVC will happily constant-fold the false into the if, as long as the variable is function-local. The variable still exists and I can change it in the debugger, but it doesn't actually impact control flow as intended. The "solution" is to move it to e.g. global scope (this is a debugging hack, remember).

Not an exact match for what you asked, but I think a good reminder that optimizers work in mysterious ways, and sprinkling in volatile may confuse the programmer more than the optimizer...

reza_n · on Oct 20, 2021

You can use `explicit_bzero()` to bypass DCE (dead code elimination). Otherwise, simply initializing your memory before using is enough to trigger magic failures when you use-after-free. C programs barely function if they do not initialize memory. Context, I work on Varnish which the OP referenced for this.

DSMan195276 · on Oct 20, 2021

The issue is that value is only checked in the context of a use-after-free error though, which the compiler assumes never happens, so it doesn't actually matter to the optimizer. The code I wrote will optimize the same way even if I add a random pointer assignment or read elsewhere (any of which could technically trigger a use-after-free), because the compiler is allowed to assume that because I'm passing that pointer to `free()` I'm never going to use it again.

To show this even more, if you add an extra printf to print the value of p after the `free()` (so add a literal use-after-free to check the 'magic' value) the 30 assignment is still gone. It prints zero for me because free() clears that memory, but the assembly does not include the 30 assignment even though I'm clearly reading the value it assigned after the `free()` statement, which is exactly the behavior you're attempting to catch. If `free()` didn't touch the value I'd still see 20.

With a little finessing I got my code to print 20 both times (the larger struct gets malloc() to leave the magic value alone) even with the 30 assignment still in the code:

    struct foo {
        char bar[35];
        int p;
    };

    int main()
    {
        struct foo *p = malloc(sizeof(*p));

        *(volatile int *)(&p->p) = 20;

        printf("p=%d\n", p->p);

        p->p = 30;
        free(p);
        printf("p=%d\n", p->p); /* Prints 20 for me, the 30 assignment is optimized out completely */

        return 0;
    }

geofft · on Oct 20, 2021

No, it should not, because it is impossible for the check to fail in valid C. A use-after-free is invalid C. Type confusion is invalid C.

If you have the function

    void free_ws(struct ws *ptr) {
        ptr->magic = 0xdeadc0de;
        free(ptr);
    }

there is no possible circumstance, in valid C, where ptr->magic could be read and have its value equal to 0xdeadc0de. That object is freed immediately after that write.

If you do a read in some other function, say

    char read_from_ws(struct ws *ptr) {
        if (ptr->magic != WS_MAGIC) {
            exit(1);
        }
        return ptr->s[0];
    }

it is impossible for the pointer passed into read_from_ws to be a pointer that has passed through free_ws, i.e., it is impossible for ptr->magic to have been set to 0xdeadc0de by free_ws. Therefore, free_ws doesn't need to actually do the write.

You're right that the correct magic check is not dead, and cannot be eliminated by the same logic. But the effect there is that the magic check always succeeds!

vlovich123 · on Oct 21, 2021

If you compile with LTO the compiler will see through your obfuscation. Play dangerous games assuming the compiler is stupid, win dangerous prizes.

turminal · on Oct 20, 2021

It can be turned of though. Compiler optimizations performed around memory allocations are weird sometimes.

geofft · on Oct 20, 2021

Yes, but this is by far the least weird type of optimization, and the one you'd least want to turn off. It's the sort taught in undergraduate compiler design classes.

Here's a piece of code that would go slower if you turned it off:

    for (i = 0; i < 1000; i++) {
        struct foobar *ptr = malloc(sizeof(struct foobar));
        ptr->this = a[i];
        ptr->that = b[i];
        if (ptr->this > 0) {
            do_stuff(ptr);
        }
        free(ptr);
    }

You only need to write to ptr->that if the condition succeeds. (And frankly you only need to do the allocation at all if a[i] > 0.)

Compilers that don't do this get rejected for compilers that do.

geofft · on Oct 20, 2021

This only protects you against unintentional use-after-free. If a use-after-free of struct ws is a thing you're worried about an attacker intentionally causing, in order for this to be useful, the attacker needs to control one of those four char * pointers and point them somewhere useful. Typically they'd do that by inducing the program to re-allocating the freed memory with a buffer under their control (like input from the network) and then filling it in with specific bytes.

If they can do that, they can very easily fill in the magic numbers too. It's even easier than pointers because it doesn't require inferring information about the running program - the magic number is the same across all instances of Varnish and right there in the source.

"Heap spray" attacks are a generalization of this where the attacker doesn't have precise enough control about what happens between the unwanted free and the reuse, but they can allocate a very large buffer (e.g., send a lot of data in from the network, or open a lot of connections) and put their data in that way. This approach would be basically perfect for defeating the "magic number" approach.

(The blog post itself has a discussion of a number of more advanced variants on the "magic number" approach - see the mention of "tagging pointers with a small number of bits that are checked against object metadata on access".)

tjalfi · on Oct 22, 2021

The Stratus recommendations for structure marking would be a bit harder to defeat.

Here's an excerpt from an old Stratus presentation[0] on writing robust software.

Add TYPE, SIZE, VERSION, and OWNER to data structure

TYPE: Unique number for each different structure

SIZE: in bytes

VERSION: Changed whenever structure declaration changes

OWNER: Unique ID of owner, must be independent of structure contents; can be UID, least significant bits of clock, etc.

[0] https://web.archive.org/web/20170303065858/http://ftp.stratu...

nyc_pizzadev · on Oct 20, 2021

> they can very easily fill in the magic numbers too

Right, recreating the magic does side step this defense.

The context for software security these days is defense in depth and not something like “total defense” anymore. In this case, the use of magics is more of a dev testing mechanism than a runtime protect, although it does provide great runtime protection. What this means is if you use magics with proper testing and load testing, errors should surface before you release.

tjalfi · on Oct 22, 2021

Multics and Stratus VOS are two operating systems that use structure marking[0].

[0] https://multicians.org/thvv/marking.html

pmarreck · on Oct 20, 2021

Solutions like this depend on the will, skill and ethics of the coder.

Better IMHO to design a language in such a way that dangerous errors like this are completely impossible.

(I mean... this is basically why I switched from Ruby to Elixir for web dev, eliminating an entire class of bugs... If the language itself doesn't provide an error-reduction feature, then you are reliant on other developers to "do the right thing" and lose any guarantees)

nottorp · on Oct 20, 2021

The title sounds like it's the end of the world. Reading, I see it's another local exploit. And long fixed to boot.

Can we stop having tabloid titles for technical matters?

tssva · on Oct 20, 2021

The actual title of the article is "How a simple Linux kernel memory corruption bug can lead to complete system compromise". Which is a much less tabloid title than the changed title here. It also more properly reflects the purpose of the article which isn't discussing the specific bug but how such bugs can be exploited and more importantly how to prevent such bugs from being exploited.

bruo · on Oct 20, 2021

This text is not a news report, it’s a technical one about this specific bug. It shows how the attack develops and suggest mitigations at the kernel development level.

The bug itself is small and it lead to a whole system compromise, and the title is very good to guide us to the point they are trying to make… memory corruption is a problem and that needs to be addressed at early stages even, even if the overhead seems not worth it.

rndgermandude · on Oct 20, 2021

I agree. The title is quite factual.

It would be nice if the article stated what's affected more clearly, and importantly, that patches were rolled out long ago for most distros.

rndgermandude · on Oct 20, 2021

Nothing is ever really just a local exploit. It's always also one half of a remote exploit chain...

r1ch · on Oct 20, 2021

I think the title is fine, it's showing how even the most simple memory safety bugs can be exploited to lead to system compromise. Not every submission has to be about something happening right now.

fsflover · on Oct 20, 2021

If you care about such attack vectors, consider security through isolation, which can be provided by Qubes OS: https://qubes-os.org.

alexfromapex · on Oct 20, 2021

Preventing bugs like this is where Rust would shine in the kernel

chc4 · on Oct 20, 2021

Rust would help with bugs like the initial memory unsafety. Half the blog post is about resilience even in the face of memory unsafety though, especially since the entire point is that there only has to be one bug, in any legacy subsystem, to exploit the entire kernel. Using Rust doesn't magically add any of those defense-in-depth mitigations and pessimistic sanity checks.

lmm · on Oct 20, 2021

Resilience is impossible in C-family languages, given undefined behaviour. Any defence-in-depth checks you add can only be triggered once you're already in an undefined state, so the compiler will helpfully strip them out (this has already happened in Linux and is the reason they build with -fno-delete-null-pointer-checks, but C compilers have very little appetite for broadening that kind of thing).

IshKebab · on Oct 20, 2021

That seems like a weird way to do it. Wouldn't it be better to have compile errors if the compiler figures out that it can remove a null check?

flohofwoe · on Oct 20, 2021

The "inconvenient truth" with Undefined Behaviour is that it provides essential wiggle room for the compiler to implement some optimizations that wouldn't be possible with a stricter specification. Of course the downside is that sometimes the compiler goes too far and removes critical code (which AFAIK is still a hot discussion topic how much compilers should be allowed to exploit UB for optimizations).

pjmlp · on Oct 20, 2021

Because that was the only way for C to catch up with stuff that was being done in languages like PL.8.

So compiler optimizers get to take advantage of UB for that last mile optimization, which in a language like C always expect the developer to be a ISO C expert, and when they aren't or are too tired trying to meet project deadlines, surprises happen.

"Oh, it was quite a while ago. I kind of stopped when C came out. That was a big blow. We were making so much good progress on optimizations and transformations. We were getting rid of just one nice problem after another. When C came out, at one of the SIGPLAN compiler conferences, there was a debate between Steve Johnson from Bell Labs, who was supporting C, and one of our people, Bill Harrison, who was working on a project that I had at that time supporting automatic optimization...The nubbin of the debate was Steve's defense of not having to build optimizers anymore because the programmer would take care of it. That it was really a programmer's issue.... Seibel: Do you think C is a reasonable language if they had restricted its use to operating-system kernels? Allen: Oh, yeah. That would have been fine. And, in fact, you need to have something like that, something where experts can really fine-tune without big bottlenecks because those are key problems to solve. By 1960, we had a long list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60. These are higher-level than C. We have seriously regressed, since C developed. C has destroyed our ability to advance the state of the art in automatic optimization, automatic parallelization, automatic mapping of a high-level language to the machine. This is one of the reasons compilers are ... basically not taught much anymore in the colleges and universities."

-- Fran Allen interview, Excerpted from: Peter Seibel. Coders at Work: Reflections on the Craft of Programming

lmm · on Oct 21, 2021

It's not "essential". On realistic code it's single-digit percentages at most. But it's essential for winning compiler benchmarks, so you'll never get C compilers to stop doing it.

rcxdude · on Oct 20, 2021

You would have a huge number of false positives, because if you're doing a bunch of null checking defensively you're going to be doing it redundantly a lot.

IshKebab · on Oct 20, 2021

Right, but if the compiler has proved that you don't need those null checks then you can simply remove them. If you think you do need them then it's a sign you've screwed up somewhere and should fix it!

DSMan195276 · on Oct 20, 2021

The problem is that it's not that simple. A NULL check might only become redundant after various rounds of inlining and optimizing, and the same NULL check could be redundant in one usage and necessary in the other. It's also very likely that changes to other parts of the code will make a previously 'redundant' NULL check now necessary, and obviously you're unlikely to get a warning in that direction.

_flux · on Oct 20, 2021

> Right, but if the compiler has proved that you don't need those null checks then you can simply remove them.

This can be the case now, but then later someone adds new code that you did need that null check.

> If you think you do need them then it's a sign you've screwed up somewhere and should fix it!

Hm, so instead of

> Wouldn't it be better to have compile errors if the compiler figures out that it can remove a null check?

you wanted the opposite: warn when a null check is there and was actually required :).

Not sure if this is a working solution either. Maybe if it was behind a macro, JUST_CHECKING_IF_NULL(x)..

IshKebab · on Oct 20, 2021

Yeah sounds like the best solution is explicit checks that the compiler understands - REDUNDANT_NULL_CHECK(...), REQUIRED_NULL_CHECK(...).

I mean obviously the solution is to use a sane language but you know...

vlovich123 · on Oct 20, 2021

Did you read the last part of the article? It explicitly says that Rust (or some other kind of languages guarantees) would absolutely remove the need for more complex runtime measures. Additionally, such checks stop the exploit chain early before it’s able to pick up steam.

ncmncm · on Oct 20, 2021

It speculates that Rust would help. That is very far from demonstrating that Rust would actually help.

Rust would certainly not help in this instance, because nobody is writing pty handling code in Rust. I.e., using Rust in place B does not help with bugs in place A. Any expectation that Linux would get more secure if some new code were Rust is optimistic to the point of fantasy.

The best possible outcome of allowing Rust in kernel code is that the kernel would not become even more insecure as a result of the added Rust code. That would be good, by itself, even if not what we really want. But whether even that would be achieved in practice is still to be demonstrated.

smoldesu · on Oct 20, 2021

The best possible outcome of allowing Rust in kernel code is being able to guarantee type safety, memory safety and perfect concurrency where it's applied. Those 3 pain points are where C fails right now, and it's part of the reason why scheduling on Linux feels 'worse' than Windows or even MacOS at times.

ncmncm · on Oct 20, 2021

I.e., exactly what I just said, but longer.

vlovich123 · on Oct 21, 2021

I think the point is that as Rust is able to tackle more parts of the kernel, existing code can start to be rewritten in it.

atoav · on Oct 20, 2021

But it might remove another hole in the swiss cheese security model, which can be worth it.

skavi · on Oct 20, 2021

Do we know what the performance impact of adding all these checks is?

habibur · on Oct 20, 2021

Huge. Only checking array bounds on every access degrades performance considerably.

But I am waiting for the Rust-OS to complete -- if one is under construction now. We can check how that stands when released.

volta83 · on Oct 20, 2021

> Huge. Only checking array bounds on every access degrades performance considerably.

Citation needed, since all evidence points to the contrary.

Could you please point us to a Rust application (there are hundreds of thousands at this point) that gets noticeably faster when disabling bound checks?

In servo, a whole web browser written in Rust, the cost of doing this was negligible, to the point that it was barely measurable (1-3%, noise levels for such a big app).

Same for Firefox which has a substantial amount of Rust.

Go ahead and give Fuchsia a try. You can enable bound checks for a substantial part of Android's user space and not really notice it.

Same for Redox, or any operating system kernel written in Rust.

You have many large applications to choose from, so please, just point us to 1 for which this is the case.

---

Compared with other mitigations already in the kernel, that can cost you up to 50% perf, and that people seem to be ok with, bound checking all the array accesses seems like a no brainer, given that ~70% of CVE are caused by memory issues.

When most people think about bound checking all array accesses, they think, for some "i can only think inside the box" reason, that this happens on hardware, for every memory access.

But that is not how Rust works. Rust adds bound checks "in Rust", and the Rust compiler and LLVM are really good at removing duplicates, hoisting many bound checks out of a loop into a single bound check at the beginning of the loop, etc.

People also think that this is an all or nothing approach, but Rust allows you to manually access without bound checks and do the hoisting manually. So if you find a function in which this makes a big difference, you can just fix the performance issue there manually.

tialaramex · on Oct 20, 2021

For the optimisation, the compiler will even reason that e.g. iterating over the vector necessarily involves knowing the size of the vector and stopping before the end, so it doesn't need to add the bounds check at all because that's redundant. This is easier in Rust because the compiler knows nobody else can be mutating this vector while you're iterating over it - that's forbidden in the language.

So, in general, the idiomatic Rust "twiddle all these doodads" compiles to the same machine code as the idiomatic C++ for that problem, even though Rust bounds checked it and C++ didn't care. Lots of Rust checks are like this, they compile away to nothing, so long as what you did is necessarily correct. The Option<NonZeroU64> stuff a few days ago is another example. Same machine code as a C++ long integer using zero as a signal value, but with type safety.

pjmlp · on Oct 20, 2021

Same applies when using C++ with bounds checking enabled, but the FUD regarding bounds checking is deep.

tialaramex · on Oct 20, 2021

What happens about mutation in the C++ case though? Maybe I should just try it in Godbolt instead of bothering people on HN.

pjmlp · on Oct 20, 2021

What mutation?

Naturally this only kind of works when everyone on the team goes safety first.

Doing some C style coding will just bork it, similarly to any unsafe block or FFI call in other better suited languages.

But in the subject of making juice with lemons, is way better than plain C.

tialaramex · on Oct 20, 2021

Right, so long as there isn't mutation, we're golden, which is why the machine code is the same.

This is, after all, why Godbolt was first invented as I suspect you know (Matt Godbolt wondered if C++ iterators really do produce the same machine code as a hand-rolled C-style for loop, and rather than just trust an expert he built the earliest Compiler Explorer to show that yes, with real C++ code you get the same machine code, any time spent hand-rolling such loops is time wasted)

pjmlp · on Oct 20, 2021

Yeah, but since there are domains where C++ is unavoidable, this is the best we can do.

By the way, this should be a nice update about the state of affairs on Android (I am yet to watch it).

"Improving Memory Safety in Android 12 Using MTE"

https://devsummit.arm.com/en/sessions/57

tialaramex · on Oct 20, 2021

I don't understand what you mean by "domains where C++ is unavoidable" in this context. C++ is a choice, presumably usually a reasonable choice, but a choice, so if they wanted to people could avoid it.

Memory tagging (which is what MTE is about) reminds me of ASLR and password entropy requirements. They're slightly raising the bar which is not something I have much time for. I prefer to put the effort in to solve problems permanently so I can worry about something else instead. Whether that's a practical opportunity here is unclear though, and I think Rust is a big part of finding out.

Measter · on Oct 20, 2021

I did once have a bizarre situation where removing a bounds check that always succeeded degraded performance by over 30%.

The bounds check wasn't being elided either. I checked and it was there in the assembly, so I figured that the function is so hot that an unchecked access might help things. Apparently not. The only thing I can think of is that the reduction in code-size for that function had an unintended effect elsewhere, either for the optimizer or that it resulted in a hot bit of code crossing a cache line?

rini17 · on Oct 20, 2021

Or the always succeeding check helped CPU's branch predictor substantially.

pjmlp · on Oct 20, 2021

I have used C++ since 1993 with bounds checking enabled, it hardly has ever mattered.

The very few cases where it actually mattered for project delivery acceptance, were proven with a profiler, and fixed on those single cases only.

Most of the time it is just cargo cult taken from C into other languages.

Multics had a better DoD security profile assessment than UNIX, thanks to PL/I being bounds checked by default.

Mac OS used Object Pascal until they decided to switch to C++ around 1992, it also hardly impacted their sales, using a language with bounds checking.

throwaway81523 · on Oct 20, 2021

> Huge. Only checking array bounds on every access degrades performance considerably.

I have not found this, at least in application code. There is usually at most a few percent different between v[i] and v.at(i) (the latter checks bounds) with C++ std::vector, for example. So I almost always use .at() these days, and it does catch bugs.

scns · on Oct 20, 2021

Well, there is the rub. The safe thing to do is more verbose, the unsafe way is in muscle memory. Rust did it the other way around deliberately. But that is easy if you do a new design and don't need to retrofit a safe solution. No snark intended.

pjmlp · on Oct 20, 2021

That is why one should always enable the compiler switches that turn operator[]() into at(), and then only disable them on case by case scenario, if proven worthwhile to do so.

gpderetta · on Oct 20, 2021

wouldn't compiling with _GLIBCXX_ASSERTIONS or the corresponding for your compiler of choice be a better solution? It will also catch quite a few more issues (dereferencing null smart pointers, empty optionals, etc.), while being still relatively lightweight.

throwaway81523 · on Oct 20, 2021

Thanks, I didn't know (or didn't remember) about this, but it's not clear from the docs that it bounds checks []. I don't find it difficult to use .at(). I'll try the debug mode too, but when I bother writing something in C++ it's usually because I actually care about its speed, so I don't want the overhead to be too bad.

https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_macros...

pjmlp · on Oct 20, 2021

Not the OP, that is my approach, keep on the bounds checking settings from VC++ also in release builds.

And before STL was a thing, all the custom types I had were bounds checked by default.

fabianhjr · on Oct 20, 2021

AFAIK RedoxOS has been bootable for a while now. (Though, a very simple new OS)

https://www.redox-os.org/

b112 · on Oct 20, 2021

[flagged]

scns · on Oct 20, 2021

I have a problem with the classification of traits as mental disorders, which may be a huge boon in the right situation. A friend of mine could be classified as compulsive/obsessive. He works as an airplane mechanic. I would rather ride a plane that he worked on, than his "normal" colleague, who talked to someone else while refilling a tank with pure oxygen, walking away and having his arm catch fire from a static discharge. No snark intended.

okamiueru · on Oct 20, 2021

I'm sure if that was the management attitude that would have welcomed them, it is a win-win situation.

b112 · on Oct 20, 2021

Employers don't need zealots or evangelists, they need realists, and people ready to work.

Not prattle on endlessly about the latest fad.

Rust has some pluses. Yet the evangelism, with the "it fixes everything" lunacy circulating, taints rust, makes legitimate and sane advocates look like nutjobs.

I don't have the time to sort the wheat from the chaff, so when I see a big deal made about rust on a resume, odds are it's a zealot, and into the bin with 'em.

okamiueru · on Oct 20, 2021

You are expressing widely different things though. You said you would consider out right rejecting someone and question their mental health for having rust on their resume.

Having a tool on a resume, or listing experience with tools, should certainly not be interpreted as someone being a rust zealot, evangelist, "pratting on endlessly", and all the rest of the examples you gave. So, maybe you initially wrote something different to what you actually meant.

b112 · on Oct 20, 2021

[flagged]

okamiueru · on Oct 20, 2021

I'm not sure to what extend you are being facetious, so I'll take it at face value. Rust doesn't particularly attract "crazies", and nor are software developers mostly crazies. So, I believe your criteria is exceptionally bad at filtering those kinds of developers out. Hiring good people is challenging enough as it is without doing the job poorly. My advice, unwarranted as it is, is to revise your preconceptions.

Azsy · on Oct 20, 2021

Obviously the top comment is dumb, but please keep on doing this.

Nobody is going to be happy if you hire someone who knows rust.

> "As a shortcut for filtering out new people and ideas, i will filter them on the most annoying person that has tried them".

At least your partially honest with how many divergent ideas someone working with you is allowed to have.

throwaway81523 · on Oct 20, 2021

I think that says more about you than it does about the rusters.

kwertyoowiyop · on Oct 20, 2021

Certain people didn’t appreciate your sense of humor! :-)

andy_threos_io · on Oct 20, 2021

+1000

vadfa · on Oct 20, 2021

>It's almost at the point, where if I see rust on a resume, I'll figure the dev has a mental disorder and skip them.

I figured that one out when I tried to join the rust discussion server on discord and saw the server photo.

Cryptonic · on Oct 20, 2021

Well, denialists often think that the normal guys have mental disorders. The Rust kernel will come one day. Not doing it in the face of todays massive cyber security threads is madness. Even Windows and MacOS will be rewritten in Rust Ahahahahahahahahhahddahhadad runs in circles

kwertyoowiyop · on Oct 20, 2021

After we have rewritten everything in Rust, we will also have to delete all code, stored anywhere, that was written in C/C++. Storing such code will be considered a Thought Crime and punished by exile to the Radioactive Wastelands.

Koffiepoeder · on Oct 20, 2021

To be honest, the article makes much more nuanced suggestions to avoid these kind of bugs. I am not sure Rust would even have helped here, since the cause seemed to be a race condition because of an invalid lock being used. It might have been possible to avoid in Rust with an RwLock, but in this case that was also the fix for the original bug (using the correct lock). I have only looked into this bug report semi-thoroughly however, so I might be mistaken.

athrowaway3z · on Oct 20, 2021

I was typing a comment about how Rust wouldn't have made a difference for the race condition, but after reading through it again to be sure i'm now on the side that Rust would have errored on the original bug.

  spin_lock_irq(&tty->ctrl_lock);
  put_pid(real_tty->pgrp);
  real_tty->pgrp = get_pid(pgrp);
  spin_unlock_irq(&tty->ctrl_lock);

rustifying this would be

  let mut tty_lock = tty.ctrl_lock();
  put_pid(real_tty);
  real_tty.pgrp = get_pid(pgrp);
  std::mem::drop(tty_lock);

Which would give an error that you are not allowed to mutate real_tty.pgrp.

nyanpasu64 · on Oct 20, 2021

The problem was that one of two threads locked the wrong lock before accessing a shared resource (when two threads read or write shared memory, both sides must acquire mutexes or it's useless), resulting in a data race.

Rust could prevent this issue by requiring that all non-exclusive accesses to the shared data acquire the mutex (and if you use Mutex<T> which wraps the data, you'll always acquire the right mutex). The & vs. &mut system can model exclusive access ("initialization/destruction functions that have exclusive access to the entire object and can access members without locking"). It doesn't help with RCU vs. refcounted references, or "non-RCU members are exclusively owned by thread performing teardown" or "RCU callback pending, non-RCU members are uninitialized" or "exclusive access to RCU-protected members granted to thread performing teardown, other members are uninitialized". And Rust is worse than C at working with uninitialized memory (Rust not using the same type for initialized vs. uninitialized memory/references is a feature, but uninitialized memory references are too awkward IMO).

alexgartrell · on Oct 20, 2021

I shared this perspective, but luckily my job is awesome and (in a routine 1:1!) Paul told me why it's less straightforward than I thought: https://paulmck.livejournal.com/62436.html

my takeaway was essentially that you get sweet perf wins from semantics that are hard to replicate with a type system that's also making really strong guarantees without making the code SUPER gross.

encryptluks2 · on Oct 20, 2021

Seriously, this comment and similar comments is why I'm pretty much convinced that Rust will be booted from the Linux kernel altogether.

ylyn · on Oct 20, 2021

If enough people want it, it will happen.