Strict memcpy() bounds checking for the Linux kernel

kortex · on July 31, 2021

Relevant in 2009, relevant today: C's biggest mistake: https://www.digitalmars.com/articles/C-biggest-mistake.html

C uses (pseudo)fat pointers all the time, any time an array(pointer) is passed with size (as a second argument). It's just not as hygienic. Just think of all the dev hours, all the painful debugging, all the exploits prevented, if in the 80's they put out a stdlib with a fat pointer buffer type. Can't argue they didn't know better, as plenty of languages had them by then.

pjmlp · on July 31, 2021

Languages already had them in 1950's, e.g. JOVIAL, but is was more fun to create their own.

pjmlp · on July 31, 2021

The only way it will really work, is as Solaris SPARC has been doing for years, and Google is in the process of adopting for ARM based Android devices, hardware memory tagging.

Unfortunately Intel keeps borking their attempts to do the same.

usr1106 · on July 31, 2021

For those who like myself wonder what this hardware tagging is: https://lwn.net/Articles/834289/

tialaramex · on July 31, 2021

A bunch of the cases covered here are far too niche for memory tagging.

For example if a routine wants to quickly zero fields pkt.mask pkt.sent pkt.copied pkt.recvd, pkt.ackd, and pkt.ready but "obviously" not the other members of the pkt structure, somebody will write code that zeroes out the whole stretch of the pkt structure from "mask" to "ready" inclusive, and then comment in the structure definition that it's important not to re-arrange these members. And that's faster, and it works.

Now, suppose I'm a bad guy & I'm able to fool a routine into calling the zero-out function with any range I like instead of just the range from "mask" to "ready". Memory tagging might prevent me clearing the adjacent structure, but all I want to do is smash pkt.uid to zero so that I can become root and that's just part of pkt...

pjmlp · on July 31, 2021

Naturally given C's nature as pseudo-macro assembler, there will always ways to work around the mitigations, see iOS PAC exploits, however them not being there is even worse as proven by the numbers from Google and Microsoft security reports.

saagarjha · on Aug 1, 2021

Many of the PAC exploits have actually come from incorrect interactions with the OS or the PAC handlers written in assembly; the compiler is actually quite good at making PAC work.

pjmlp · on Aug 2, 2021

Thanks, good to know.

notorandit · on July 31, 2021

> The C programming language is famously prone to memory-safety problems.

I thought it was the programmer to make mistakes, not the language to be error prone.

flohofwoe · on July 31, 2021

I like C (the language) a lot, but the C runtime library is a single big minefield (the mem*() and math.h functions are actually mostly fine, but everything else is terrible. Thankfully C doesn't rely as much on its standard libary as most other languages, and in many cases it actually is better to reinvent the wheel, because it's almost impossible to come up with worse APIs than the C standard library (and before "switch to C++": most areas of the C++ stdlib are even worse, because they allow the same memory corruption issues, but hide them more cleverly, which in turn makes debugging harder).

tialaramex · on July 31, 2021

We can't blame the standard library for numerous infelicities in the language itself, and we likely shouldn't blame it for library designs that obviously fall out of those infelicities.

Example #1 the array type looks useful, but oh, it's actually just coerced into a much less useful pointer

You can have a C function signature that says it takes arrays of sixteen integers. But it doesn't! The compiler cheerfully ignores this and uses the function for any pointer to integers. Sixteen integers, or Zero, or Sixty, the same function executes.

If you've used enough C you might just assume that's how it has to be. Nope. A Rust function declared for arrays of sixteen integers takes... arrays of exactly sixteen integers like you asked for. Not any other size.

Example #2 implicit narrowing conversions everywhere

This blew up in Linux recently, but C is OK with the idea of just assigning 64-bit values to 32-bit integers for example. The value doesn't fit in the narrower variable, but no problem just throw away the extra bits and it'll go in...

flohofwoe · on July 31, 2021

For Example #1, C (but not C++) has a syntax for hinting the expected array size (note the warnings):

https://www.godbolt.org/z/7bcd9xPr3

For example #2, proper compilers have warnings for implicit conversions which would lose data, for instance:

https://www.godbolt.org/z/vEP8ra9MP

But for a proper solution to the array size problem you need array slice support built into the language, and this would be the first "opaque builtin struct type" in C (because slice types need either a pointer/pointer or pointer/size pair). At this point it's really better to switch to a different language.

tialaramex · on July 31, 2021

But the compiler cheerfully still calls the function with a pointer instead, even if that pointer is in fact pointing at the too-short array.

And yes, it really is better to switch, I've been doing so, and I am hopeful that the kernel can follow in due time.

Aside: Matt Godbolt deserves some sort of award for Compiler Explorer. So much less awful to just link these examples in such conversations. It might well be as big a contribution to software engineering as git bisect or Tinderbox, or (really going back) Grace Hopper's "compiler". Yes Compiler Explorer seems "obvious" and people talked about things like this but did you build it? 'cos just talking about it doesn't make my life any better but building it does.

pjmlp · on July 31, 2021

Indeed, for Matt Godbolt, and the whole community that has added the remaining AOT languages, .NET is one of the few missing from Compiler Explorer, but it has a similar one at https://sharplab.io/ and there is one at http://shader-playground.timjones.io/ for GPGPU shaders.

Here is one talk from Matt Godbolt about how Compiler Explorer was born,

CppCon 2019: Matt Godbolt “Compiler Explorer: Behind The Scenes”

https://www.youtube.com/watch?v=kIoZDUd5DKw

formerly_proven · on July 31, 2021

The mem* functions are mostly fine because those operate on memory blocks and so everything needs to use explicit lengths. Most of the brokenness with the str* function family revolves around questions of who needs to check the length, will things be terminated correctly etc. and various ergonomic problems.

fragbait65 · on July 31, 2021

Well, I do think a programming language is a tool and it is kind of wrong to blame the tool for the mistakes made by the one wielding it.

The problem with C is that it's a tool where you have no safety and I don't think there's a programmer alive that can actually use it in a safe manner. I guess some are close, but they will still end up with the occasional wound here and there on their bodies...

I would probably avoid using C for anything I put in production today, I still love the language though. It still feels special, sitting down with all that power at your fingertips knowing that you'll have a built in buffer overflow if you lose focus for a second, gets (no pun intended) my blood flowing! :-)

VortexDream · on July 31, 2021

> it is kind of wrong to blame the tool for the mistakes made by the one wielding it.

I always find takes like this bizarre. We're constantly improving the safety of the tools and devices we use. Our history is filled with examples of tools designed without safety in mind leading to deaths or maiming or other injuries. Thankfully, the people who came before us saw that it was up to us to reduce the likelihood of accidents by improving the tools that we use.

Just look at the aerospace industry. They didn't say "well, don't blame the plane, the pilot should've gotten it right". Often times improvements in planes were to avoid common mistakes because of how fallible we human beings are, instead of holding us up to impossible standards.

fragbait65 · on July 31, 2021

I agree.

The difference is that engineering over all is mature enough that you can make incremental changes over time and it still does not invalidate what you did previously. If you built a bridge 10 years ago its probably safe enough to leave standing even if you can build a safer bridge today.

The same cannot be said for programming yet. It's hard to improve the safety of the tool, the C compiler, without breaking your past bridges or having to redo the work again, i.e rebuilding the bridge.

tialaramex · on July 31, 2021

England has lots of Victorian railways bridges. Now, the Victorian engineers knew perfectly well how to build a railway bridge, which is why those bridges are still there. But they hadn't yet got to the place where you always properly document your work as you go, and they also didn't think too hard about (from their perspective) the distant future, our present.

So a modern railway engineer inspecting a Victorian bridge has a problem. The bridge was built in the usual fashion of the time, and it's impractical to fully inspect the load-bearing materials without dismantling the bridge. There is no detailed paperwork because the Victorians didn't keep any.

Still, it stands to reason that if the cast iron load structure exposed in one place has 15-20 years of life left in it, the unexposed structures you can't see are similar and this bridge can be scheduled for replacement in say 10-15 years. Right?

And then, one night, as a fully laden freight train crosses it, the bridge collapses. The driver feels something wrong on the bridge and then, a few seconds later, the locomotive automatically brakes to a full halt - unable to sense the rear of the train which is in fact now laying in the rubble of the broken bridge.

The Victorians saved a little money by using thinner metal for the unexposed girder, which had therefore failed earlier than the predictions based on the thicker metal.

Documentation is essential. If your 30 year old C project doesn't have adequate documentation chances are you don't know whether those unexposed elements are as strong as the parts you can see or if they're paper thin and likely to fail at any moment.

publicola1990 · on July 31, 2021

Yet the very same aerospace industry uses almost exclusively, the very unsafe C to write software instead of using something with more safety guarantees.

pjmlp · on July 31, 2021

Not at all, they use C dialects like MISRA-C, Frama-C, ACC3 among others, that are basically Ada with C syntax.

Additionally they use coding practices that would make the most hyped TDD advocates from Silicon Valley startups walk away from the projects without looking twice about what they were leaving behind.

When code kills, every line of code gets validated.

publicola1990 · on July 31, 2021

Yes but C being C, its hard to write it safely, and harder to ensure that it is safe beyond all doubt.

pjmlp · on July 31, 2021

That is why such standards exist.

https://en.m.wikipedia.org/wiki/The_Power_of_10:_Rules_for_D...

This is not the kind of C you will find in FOSS or UNIX software.

adwn · on July 31, 2021

> it is kind of wrong to blame the tool for the mistakes made by the one wielding it

I strongly disagree. I think it is absolutely legitimate to blame a tool which is practically impossible to use correctly [1]. Some of C's design decisions were justified in its historical context, and some – like zero-terminated strings – were indefensible even back then.

[1] Alternatively, you could blame its creators, but that's not very useful.

flohofwoe · on July 31, 2021

Zero terminated strings make sense when you consider the pecularities of the PDP instruction set.

Snippet from "https://dave.cheney.net/2017/12/04/what-have-we-learned-from...":

One can write a string copy routine using two instructions, assuming that the source and destination are already in registers.

    loop:   MOVB (src)+, (dst)+
            BNE loop

The routine takes full advantage of the fact that MOV updates the processor flag. The loop will continue until the value at the source address is zero, at which point the branch will fall through to the next instruction. This is why C strings are terminated with zeros.

pjmlp · on July 31, 2021

Except the world stop being a PDP-11 around 1980's, while ISO C refuses to update itself to modern times.

The biggest issue with C isn't its footguns, rather the WG14 unwillingness to provide additional language or library features that would allow for a safer C outside the low level code where it pretends to be a portable macro assembler.

flohofwoe · on July 31, 2021

In the early years the job of the C committee wasn't to improve C, but merely to harmonize existing C implementations, and by the end of the 80's it was already too late, zero-terminated strings had already been baked into operating system APIs (e.g. zero-terminated strings are no longer primarily a language problem, but an ABI problem).

Besides, the x86 "repeat while" string instructions continued the PDP legacy.

kortex · on July 31, 2021

That argument sounds akin to "Well people are used to driving without seatbelts, it'd be painful to make them switch now."

Or like my spouse likes to joke when we get in the car to leave and forgot something in the house: "it's too late, the door's shut."

I get why null terminated strings once existed. It's baffling that they continue to exist 50 years later. Not to mention, they don't even work on data buffers, so you need fat pointers anyways!

flohofwoe · on July 31, 2021

Replacing the string memory layout in operating system APIs is more similar to changing the track width on an existing railroad network, across the whole world.

It may be a good idea from a theoretical standpoint, but once you start calculating the cost it simply doesn't make sense.

kortex · on July 31, 2021

Obviously not with that attitude. I'm not even talking about back-porting. I'm saying going forward, in new drivers and extensions, and new growth where it makes sense to do so.

There's even plenty of means for backwards-compatible strings and arrays, such as sds.

But really the point was more to "this should have really been addressed decades ago."

https://github.com/antirez/sds

pjmlp · on July 31, 2021

Liability and lawsuits due to security exploits will take care of that.

Thankfully they are starting to pick up.

pjmlp · on July 31, 2021

We aren't speaking about ANSI/ISO C89 here, rather in what happened the following 32 years.

flohofwoe · on July 31, 2021

That doesn't matter, even C doesn't matter in that regard anymore, or the opinion of any other language on the best string memory layout. Once zero-terminated strings leaked into operating system APIs the damage was done forever, and there never was a good time to fix the problem afterwards.

pjmlp · on July 31, 2021

That is no excuse not to try to improve the status quo.

Maybe liability and lawsuits are really needed to stop with such excuses.

fragbait65 · on July 31, 2021

I think some incremental improvements could be done for C, question is if it's worth the time?

Maybe it's better to just use a better tool for new code bases?

pjmlp · on July 31, 2021

Except embedded developers and UNIX clones will never move beyond C.

fragbait65 · on July 31, 2021

Well, I agree that you can wish that the tool was better designed, that's a whole different thing though.

What I meant to say, in a rather roundabout way to be fair, is that the problem is both the language and the programmers. C is a tool that is too hard to use correctly, but the programmers who write crappy C code are to blame for their crappy code. It's another thing if an expert fails to use the tool safely, then one might blame the design of the tool used.

There seems to be 2 camps, one camp blames C and one camp blame it on poor programmers. Poor programmers have given C a reputation and the tool itself is too hard to use correctly, even for experts, so both camps are right and also wrong...

I think some of the decisions made for C back in the day where fine, C was designed to be lightning fast and close to the metal, but I think it's time to pivot. I don't think it's necessary to sacrifice security to squeeze out the last percent of "speed" today. C has a lot of legacy code still in use though, so I don't think it will ever happen, that's why I use other languages for production code today and only write C when needed for C code bases still in use.

flohofwoe · on July 31, 2021

There's a lot of room to explore between C and Rust, while not compromising on low-level features and performance (e.g. "performance" and "safety" are both not absolutes, and also not excluding each other. I hope that there will be many new languages (the more the merrier) exploring that space.

But we'll live for a very long time with C code, it will most likely outlive us all, because important infrastructure code is never really replaced, it just becomes a new sediment layer.

pjmlp · on July 31, 2021

Which is why stuff like Checked C, or something like SDS should really be made part of C, as UNIX clones are going to outlive us anyway.

However just like rotating blade covers, butcher metal gloves, seat belts, helmets,..., apparently external forces like government regulations are required to make WG14 act accordingly.

pjmlp · on July 31, 2021

It was already known to Dennis and Ritchie that C without appropriate tooling wasn't an ideal tool, hence why C's history of static analysis goes all the way back to 1979.

Unfortunately, the large majority to this day thinks they don't need such kind of tooling.

"Although the first edition of K&R described most of the rules that brought C's type structure to its present form, many programs written in the older, more relaxed style persisted, and so did compilers that tolerated it. To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions. "

https://www.bell-labs.com/usr/dmr/www/chist.html

Gibbon1 · on July 31, 2021

Bad tools should be replaced with improved ones.

The problem with C is the two generations of standard committee members have stubbornly shirked their duty to fix these problems.

masklinn · on July 31, 2021

A language which is not error prone will usually lead to a combination of the programmer making less mistakes and the language catching those mistakes the programmer makes (whether at compile-time or runtime, or more problematically attempting to DWIM those mistakes thankfully language design has rather moved away from that very bad idea).

Programmers making mistakes is a fact of life, that C is error-prone is what makes those mistakes into significant and recurring issues.

alkonaut · on July 31, 2021

All programmers make errors. Languages/compilers can make those errors more obvious or even impossible to represent or caught at compile time (e.g stronger typing). The error-prone-ness of a language is just a sloppy description of the tendency for errors to be discovered at runtime rather than development time, and the severity of such errors.