> like forgetting to check error returns One of the most common bugs in C (in my...

geofft · on Jan 21, 2018

Yes! In fact I'd argue that null-pointer dereferences are not a memory-safety bug the way buffer overflows / use-after-frees / etc. are, they're a type-safety bug. It's not that NULL is a currently-invalid pointer, it's that NULL is not a pointer at all - it's a value that's been stuffed into the pointer type because C has no better way to represent this. The actual return type of malloc is the set of all possible pointers along with NULL, a separate thing. Passing it to code that only expects something from the set of all possible pointers should require you to check for NULL and do something else instead.

If you want to optimize memory layout by reserving memory address zero, sure (and Rust's Option<&T> does exactly that), but at the language level, you shouldn't be able to use NULL as a pointer any more than you should be able to use 0.0 or '\0' or false as a pointer.
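As a rough illustration of the layout point, here is a minimal Rust sketch. The helper `same_size` is a hypothetical name invented for this example; the size guarantees for `Option<&T>` and `Option<Box<T>>` are documented by the standard library.

```rust
use std::mem::size_of;

/// Hypothetical helper: true when the two types occupy the same number of bytes.
fn same_size<A, B>() -> bool {
    size_of::<A>() == size_of::<B>()
}

fn main() {
    // References and boxes are never null, so the all-zero bit pattern is
    // free to represent `None`. The Option wrapper costs no extra space.
    assert!(same_size::<Option<&u32>, &u32>());
    assert!(same_size::<Option<Box<u32>>, Box<u32>>());

    // A plain u32 has no spare bit pattern, so Option<u32> needs extra room.
    assert!(!same_size::<Option<u32>, u32>());

    println!("Option<&u32> is {} bytes", size_of::<Option<&u32>>());
}
```

The representation matches a nullable C pointer, but the type system still treats `None` as something other than a reference.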

nwmcsween · on Jan 22, 2018

Why do you think NULL isn't a pointer? It most definitely is a pointer; it's the kernel that prevents a pointer to 0 from being used. In fact, you can use it if mmap_min_addr is set to 0.

geofft · on Jan 22, 2018

NULL is not a pointer—what does it point to?

0x00000000 is a pointer, sure, and it points to the memory at address zero. But that's not the same concept as NULL, even if it happens to have the same representation. If malloc(32) returns NULL, it doesn't mean the memory between 0x00000000 and 0x00000020 is available for me to use.

false also has the same representation, but no one would claim that false is a pointer.

mmap_min_addr exists because C is unable to distinguish NULL and a pointer to address zero, and the Linux kernel is written in C, and too much code wrongly treats NULL as a pointer to 0x00000000 and attempts to read or execute the contents of memory there. If code did not confuse the two, mmap_min_addr would not need to exist.

And it is relatively recent; it was added as a security measure in June 2007 for Linux 2.6.23 in https://github.com/torvalds/linux/commit/ed0321895182ffb6ecf... , about 40 years after the invention of NULL.

deathanatos · on Jan 22, 2018

The point is that there are two very fundamentally different things that can be stored in a C pointer:

1. A valid address to an object of some type.

2. Null.

One of these can be dereferenced, the other cannot; the valid operations are not the same, because they are not the same "type". Now, C of course will happily store both the address of an object and "NULL" in the same "type", and that's the problem.

There are a great many places in code in C where one would like to take a pointer that must contain an object's address; that is, a non-null pointer. The type system offered by C has no way to indicate this, and so the compiler cannot catch passing NULL to such a function.
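For comparison, a sketch of what a type system that does make this distinction looks like, in Rust. The `User` type and the function names are made up for illustration only.

```rust
/// Hypothetical record type, used only for this example.
struct User {
    name: String,
}

/// Requires a real User: a `&User` can never be null.
fn greeting(user: &User) -> String {
    format!("hello, {}", user.name)
}

/// May find nothing. The "nothing" case is a separate variant, not a pointer.
fn find_user<'a>(users: &'a [User], name: &str) -> Option<&'a User> {
    users.iter().find(|u| u.name == name)
}

fn greet_by_name(users: &[User], name: &str) -> String {
    // Writing `greeting(find_user(users, name))` is rejected at compile time:
    // expected `&User`, found `Option<&User>`. The check cannot be skipped.
    match find_user(users, name) {
        Some(user) => greeting(user),
        None => String::from("no such user"),
    }
}

fn main() {
    let users = vec![User { name: "ada".to_string() }];
    println!("{}", greet_by_name(&users, "ada"));
    println!("{}", greet_by_name(&users, "bob"));
}
```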

(And I honestly would bet that the kernel limits it more due to C, than C uses 0 because the kernel limits it. That is, yes, the kernel limits allocating address 0, but the arrow of causality is the other way around.)

wott · on Jan 22, 2018

> One of the most common bugs in C (in my experience) is forgetting to check the error return on a malloc call.

Very unlikely.

1. It won't happen in regular use, only if you are asking for a huge amount of memory (possibly by mistake), or the system is already full, in which case you will likely experience problems with the whole system (freezes, instability, processes killed, etc.) and not just with your program.

2. It will never happen in Linux, because in the classical setup (with memory overcommit enabled), malloc() never returns NULL, even when there is no memory available.

So you have to have those conditions + a return value unchecked for the bug to have a chance to appear. There are thousands of other bug sources.

nwmcsween · on Jan 22, 2018

The 'classical' setup is horribly broken, and relying on it just makes more broken code.

nh2 · on Jan 22, 2018

I think much worse is forgetting to check the size returned by a read() or write() call, especially when dealing with sockets.

An unchecked malloc() failure crashes quickly and visibly; data chopped off in the middle usually triggers problems on the /other/ side.
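The same hazard exists with any API that reports how many bytes it actually handled. Below is a minimal Rust sketch, assuming a made-up `ShortWriter` that accepts at most 4 bytes per call as a stand-in for a socket doing short writes; `send_all` is likewise a hypothetical helper name.

```rust
use std::io::{self, Write};

/// Hypothetical writer that accepts at most 4 bytes per call,
/// standing in for a socket that performs short writes.
struct ShortWriter {
    received: Vec<u8>,
}

impl Write for ShortWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(4);
        self.received.extend_from_slice(&buf[..n]);
        Ok(n) // may be less than buf.len()
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Keeps calling write() until every byte has been accepted.
fn send_all<W: Write>(out: &mut W, mut data: &[u8]) -> io::Result<()> {
    while !data.is_empty() {
        match out.write(data) {
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::WriteZero,
                    "peer stopped accepting data",
                ))
            }
            Ok(n) => data = &data[n..], // advance past the bytes that were taken
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let msg = b"hello, world";

    // Careless: a single write() whose returned size is never acted on.
    let mut careless = ShortWriter { received: Vec::new() };
    let n = careless.write(msg)?;
    assert_eq!(n, 4); // only "hell" arrived; the rest is silently lost

    // Careful: loop until the whole message is out.
    let mut careful = ShortWriter { received: Vec::new() };
    send_all(&mut careful, msg)?;
    assert_eq!(careful.received.as_slice(), &msg[..]);
    Ok(())
}
```

In real Rust code the standard `Write::write_all` method already does this loop; it is spelled out here only to show what the size check amounts to.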

quotemstr · on Jan 22, 2018

Using exceptions solves that problem and many others besides; it's a shame they're not currently in fashion. But you do end up needing them, so in Rust, we end up with error codes _and_ exceptions (but spelled "panic").

geofft · on Jan 22, 2018

I wouldn't call panic the same thing as exceptions - panics are quite explicitly meant to not be caught, except at fairly hefty boundaries like processes, threads, or FFI. Using std::panic::catch_unwind to catch, say, accessing an out-of-bounds element in a vector would be super un-idiomatic, both because that's what .get() -> Option<&T> is for, and also because even if you catch the panic, the message gets printed to stderr: https://play.rust-lang.org/?gist=587f976dc4bcf4010eb0026c9ed...
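A small sketch of the two approaches side by side (this is not the contents of the linked playground, which is truncated above). `index_with_catch` is a hypothetical helper, and the example assumes the default unwinding panic strategy rather than `panic=abort`.

```rust
use std::panic;

/// Un-idiomatic: index, let it panic, and catch the unwind.
/// The default panic hook still prints the message to stderr.
fn index_with_catch(v: &[i32], i: usize) -> Option<i32> {
    panic::catch_unwind(|| v[i]).ok()
}

fn main() {
    let v = vec![1, 2, 3];

    // Idiomatic: the out-of-bounds case is an ordinary value.
    assert_eq!(v.get(1), Some(&2));
    assert_eq!(v.get(10), None);

    // Same answers, but the failing call also writes a panic message to stderr.
    assert_eq!(index_with_catch(&v, 1), Some(2));
    assert_eq!(index_with_catch(&v, 10), None);
}
```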

Rust doesn't have exceptions in the sense that C++/Java/Python/etc. have exceptions, i.e., things that unwind the stack and are part of a function's expected API. And I think the specific reason they're out of style is the inherent contradiction in that statement: either all the exceptions in the API of any of the functions you call are also part of your public API, or you're carefully filtering exceptions in any of the functions you call that raise them, and so you might as well not use unwinding.

Panics unwind the stack, but are not part of the API; they're for erroneous conditions where the usual right thing to do is to kill the process, but maybe you only want to kill e.g. the current HTTP request. Errors as return values / the Result<T, E> type do not automatically unwind - they're just normal data types returned from a function - but they have syntax (the question mark operator) for explicitly unwinding them one step, and there are proposals in progress to introduce syntax that uses "throw" or "catch" to refer to returning the error case or handling such returns in a block of code, so it seems like people think Result more closely matches exceptions in other languages.
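To make the "one step" propagation concrete, here is a minimal sketch using only the standard library. The function names `parse_port` and `describe_port` are invented for the example.

```rust
use std::num::ParseIntError;

/// Returns the error as an ordinary value. Nothing unwinds.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

fn describe_port(text: &str) -> Result<String, ParseIntError> {
    // `?` unwraps the Ok value, or returns the Err to our caller:
    // exactly one stack frame up, and visible in the signature.
    let port = parse_port(text)?;
    Ok(format!("listening on port {}", port))
}

fn main() {
    assert_eq!(
        describe_port("8080"),
        Ok("listening on port 8080".to_string())
    );
    assert!(describe_port("not a port").is_err());

    // By contrast, `"not a port".parse::<u16>().unwrap()` would panic and
    // unwind past every caller without appearing in any function signature.
}
```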

quotemstr · on Jan 22, 2018

In other words, Rust _does_ have exceptions! That some people don't use them all that much is a matter of convention in a particular community, not a matter of language feature set. You could implement exactly the same model in C++. The fact is that Rust has exceptions to exactly the same extent C++ has them.

geofft · on Jan 22, 2018

Kind of? I mean, the fact that panics get printed to stderr makes it cumbersome to use it. (I mean, yes, you can do things to suppress the exceptions. You can also write some C macros to implement tagged enums and make a libc whose malloc returns Option.)
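One way to do that suppression, sketched with the standard library's panic hook functions. `quiet_catch` is a hypothetical helper name. The hook is process-global, so this sketch is not safe to use while other threads may panic, which is part of why it is cumbersome.

```rust
use std::panic;

/// Runs `f`, returning None if it panics, without printing the panic message.
fn quiet_catch<R>(f: impl FnOnce() -> R + panic::UnwindSafe) -> Option<R> {
    // Swap in a hook that does nothing, remembering the previous one.
    let previous = panic::take_hook();
    panic::set_hook(Box::new(|_| {}));

    let result = panic::catch_unwind(f);

    // Restore the earlier hook so later panics are reported normally.
    panic::set_hook(previous);
    result.ok()
}

fn main() {
    let v = vec![1, 2, 3];
    assert_eq!(quiet_catch(|| v[1]), Some(2));
    assert_eq!(quiet_catch(|| v[10]), None); // nothing is printed to stderr
}
```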

I don't really see a huge distinction between a language and its community, for the primary reason that feature evolution in a language - e.g., that Results recently got the question-mark operator, and whether Results will get "throw"/"catch" syntax - is driven by the language community and what sorts of things are or aren't common practice. A Rust community that made heavy use of panics in normally-operating code would probably want to fork Rust just to optimize panicking and catching panics, to fix the fact that catch_unwind is documented to not necessarily catch all panics, etc., and would eventually make deeper language changes to improve the syntax around panicking and not merge the corresponding changes to improve the syntax around Result. Which is historically what's happened with languages that have developed multiple communities - there are lots of Lisp dialects, lots of BASIC dialects, etc. Whether BASIC has a feature isn't a well-formed question; whether GW-BASIC or VB.NET or your TI-83 has a feature is well-formed.

kvark · on Jan 22, 2018

Well, error codes are not special constructs in any way. And exceptions can be easily overused. So you have to draw the line somewhere. Rust just chose the balance where facing an exception is truly exceptional :)

quotemstr · on Jan 22, 2018

"Exceptions are for exceptional conditions" is a meme that I wish would just die. Its origin lies in 1990s C++ compilers, which were extremely inefficient when dispatching exceptions, leading programmers, as a pragmatic measure, to use error codes for "expected" errors and exceptions only for cases thought to occur infrequently.

We're long since past the time when we had to worry about such concerns. "Exceptions for exceptional errors" just means that you have to write code that both cares about exception safety and propagates error codes from subroutines. It's the worst of both worlds.

Just use exceptions for all errors. It's elegant.

geofft · on Jan 22, 2018

It's not elegant - it means that every single exception raised by any function you call, or any function they might call, and so forth, is now part of your API. If you're an HTTPS library, and you're using some OpenSSL bindings for certificate validation, OpenSSLCertificateValidationError is part of your API because callers are now catching that. If you switch to BoringSSL or NSS or whatever, callers won't be expecting BoringSSLCertificateValidationError or NSSMismatchedCertsException.

Java has a particularly inelegant and ugly solution here involving declaring what types of exceptions might be thrown. But that still doesn't change the fact that changing that list is an API change, it just makes it more explicit.

So if you care about API stability (and I firmly believe that no solution that ignores API stability is "elegant" - it is at best "cute"), you're basically required to catch the vast majority of exceptions your own dependencies generate and translate them to your own exception types. You'll need to make a MyHTTPLibCertificateValidationError, unwrap the contents of OpenSSLValidationError, and put them in the new object, or you can never switch away from OpenSSL without an API break. And you want your dependencies to follow the same discipline.

At that point, as I said above, why use unwinding? None of the exceptions in your program can safely pass more than one level of the call stack at a time; each level has to explicitly approve raising it another level or wrap the exception in its own type (or handle it). The only ones that can really unwind are standard library ones like OutOfMemoryError that are expected to go all the way up the call stack to the top of the program or at best the top of the current request, print or otherwise log a backtrace, and abort the entire thing in progress - i.e., exceptional conditions. Exceptions for expected conditions are a different thing entirely, precisely because you don't want unwinding, you want step-by-step propagation.

This has nothing to do with efficiency. This has to do with correctness and robustness.

And you get your syntactic elegance with a library for translating and wrapping error objects, like https://docs.rs/error-chain , combined with syntax for immediately translating and returning errors from dependencies, like https://doc.rust-lang.org/book/second-edition/ch09-02-recove... .
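A hand-rolled sketch of that wrapping discipline, using only the standard library rather than error-chain. Everything here is hypothetical: `BackendCertError` stands in for whatever error type the TLS dependency exposes, and `backend_validate` and `fetch` are invented names.

```rust
use std::fmt;

/// Hypothetical error type owned by a TLS backend dependency.
#[derive(Debug)]
struct BackendCertError {
    reason: String,
}

/// The HTTP library's own error type: the only one callers ever see.
#[derive(Debug, PartialEq)]
enum MyHttpLibError {
    CertificateValidation(String),
}

impl fmt::Display for MyHttpLibError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MyHttpLibError::CertificateValidation(reason) => {
                write!(f, "certificate validation failed: {}", reason)
            }
        }
    }
}

impl std::error::Error for MyHttpLibError {}

/// The translation step: unwrap the backend error and re-wrap its contents.
impl From<BackendCertError> for MyHttpLibError {
    fn from(e: BackendCertError) -> Self {
        MyHttpLibError::CertificateValidation(e.reason)
    }
}

/// Stand-in for the dependency's validation call.
fn backend_validate(host: &str) -> Result<(), BackendCertError> {
    if host.ends_with(".example") {
        Ok(())
    } else {
        Err(BackendCertError {
            reason: format!("untrusted certificate for {}", host),
        })
    }
}

/// Public API: swapping the backend changes the `From` impl, not this signature.
fn fetch(host: &str) -> Result<String, MyHttpLibError> {
    backend_validate(host)?; // `?` converts the error through `From`
    Ok(format!("response from {}", host))
}

fn main() {
    assert_eq!(fetch("good.example"), Ok("response from good.example".to_string()));
    match fetch("bad.invalid") {
        Ok(body) => println!("{}", body),
        Err(e) => println!("error: {}", e),
    }
}
```

Each level of the call stack approves the error explicitly: the conversion is the only place the backend's type is mentioned.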

vvanders · on Jan 22, 2018

Except that exceptions need RTTI information and so you'll see binary size increase as you start using them.

Some compilers won't even let you use them without turning on full-blown RTTI (or doing it implicitly for you), which is even worse.

imron · on Jan 22, 2018

Not in C (which does not have exceptions) and not in C++ when doing what the parent suggested (calling malloc, not new).