My view strongly aligns with John Regehr's [1] (emphases original):
> My main idea is that we need to teach C in a way that helps students understand why a very large fraction of critical software infrastructure, including almost all operating systems and embedded systems, is written in C, while also acknowledging the disastrously central role that it has played in our ongoing computer security nightmare. [...]
> We’d like students to be able to answer the question: Is C an appropriate choice for solving this problem? [...] The second big thing each student should learn is: How can I avoid being burned by C’s numerous and severe shortcomings?
"How can I avoid being burned by C’s numerous and severe shortcomings?"
The only solution is using a program analyzer which can prove the absence of certain errors. Frama-C and Astrée come to mind.
Normal static analyzers are best-effort: they produce both false positives and false negatives, i.e. they will miss errors. This is the difference between testing and verification.
Dynamic analyzers like ASan or Valgrind depend on having good code coverage and good value-domain coverage. The former should be obvious; the latter ensures that error-prone code which fails only when variables hold specific values is also flagged. E.g., adding two ints is defined for some value ranges and undefined behaviour for others.
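To make that last point concrete, here's a minimal sketch (my own illustration, not from the thread) of an addition that is fine for some inputs and undefined for others, which a dynamic tool like UBSan (-fsanitize=undefined) only catches if a test actually exercises the bad values:

    #include <limits.h>
    #include <stdio.h>

    /* Signed int addition is only defined while the mathematical result
     * fits in an int; overflowing it is undefined behaviour, not wraparound. */
    static int add(int a, int b) {
        return a + b;                     /* UB if the sum overflows */
    }

    int main(void) {
        printf("%d\n", add(1, 2));        /* well-defined */
        printf("%d\n", add(INT_MAX, 1));  /* UB: flagged only if a test
                                             reaches this value */
        return 0;
    }

A test suite with good line coverage but poor value coverage would execute add() and never notice the problem.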
Now the main issue with all of the above is that virtually no self-respecting C programmer will use a verifier, so while theoretically possible to write safe C, it is practically almost unheard of - safety-critical domains aside.
There is no good safe, user-friendly alternative to C; all of them are either much more complex languages (Rust, C++), less performant (Swift, Go), or have a tiny market share (D, obscure things like Zig, etc.).
Although it is a C++ talk, she also touches on C quite a few times, given the goal of getting them to adopt C++ as a safety improvement, and naturally how to tailor CLion to the markets they are in.
D doesn't belong in this category. If anything, D should be put next to Rust and C++. D is a very complex language compared to C, and it offers all the C++-style features like templates and OOP.
You can use D's betterC mode, supported by the compilers. This avoids the D runtime entirely: there are no classes, exceptions, etc. You can still use templates and CTFE if you like (which can be awesome), but there's nothing forcing you to do that. You can just copy-paste some C code, make a few syntactic changes to get it to compile, and get a decent level of safety that way.
I find it hard to believe that you can seriously describe Rust as "much more complex" than C and recommend Frama-C in the very same comment! The point of all that "complexity" in Rust is precisely to enforce the use of patterns that will make a Frama-C-like analysis feasible and not overly complex, even for large codebases. This basically boils down to avoiding shared mutable access; not coincidentally, this is also what functional programming languages do! (That is, sequential mutability that's isolated to some part of the code can be modeled through constructs like the ST monad and is relatively benign; shared mutability is a whole other can of worms.)
I'm not recommending Frama-C; I'm opining on what's necessary to write reliable C code.
Much of Rust's complexity is not directly addressing memory safety, it's providing high-level language features. A C-with-borrow-checker would be very different from Rust.
"A C-with-borrow-checker" is a good-enough description of Cyclone, which is not that different from Rust and which was basically abandoned for Rust.
Sure, Rust itself has a number of higher-level features, some of which are a bit problematic by comparison with C (e.g. monomorphized generics lead to intermediate-code bloat, which causes long compile times), but it turns out that these features are practically needed, either to make programming-in-the-large more feasible or to abstract some underlying details from user-level Rust code so it can keep working even as the compiler and standard library evolve underneath it. C, being a mature platform, doesn't have these problems, but Rust still does.
"How can I avoid being burned by C’s numerous and severe shortcomings?"
I think this is where we start bumping up against the limitations of the universe we live in, because the question becomes "where do we put the bugs?".
The biggest flaw with C is the lack of array bounds checking, especially if that array is on the stack, along with other bugs related to "raw" pointers:
- We could enforce things through hardware, but then you've put the bugs in the microcode.
- You can change your language, but then you've put the bugs in your compiler / interpreter / vm.
- You can try and formally verify the correctness of your code, but it's less feasible for large codebases. And what if there's a bug in your proof? (proofs are programs)
I'm not suggesting people use C/C++, but I wouldn't blame C for the problem of natural numbers.
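To illustrate the bounds-checking flaw above (my own sketch, not from the thread): C compiles an out-of-bounds write to a stack array without complaint, because nothing in the language checks the index or the length:

    #include <string.h>

    /* Nothing in C checks that 'name' fits in 'buf'; an oversized
     * argument silently overwrites adjacent stack memory. */
    static void greet(const char *name) {
        char buf[8];
        strcpy(buf, name);              /* no bounds check anywhere */
    }

    int main(void) {
        greet("hi");                    /* fine */
        greet("definitely too long");   /* undefined behaviour: stack smash */
        return 0;
    }

Hardware, the language, or a verifier has to catch this somewhere; the question is only where the residual bugs end up.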
> - You can change your language, but then you've put the bugs in your compiler / interpreter / vm.
I'd rather put my bugs in one, more easily verifiable place. Verifying all code that's ever going to be compiled is close to impossible. Verifying the compiler / interpreter / VM is hard, but it's much easier.
At the end of the day our field is basically: "get rid of repetitive work". So arguing that it's better to have total manual control (and basically always re-implement things at the point of use) doesn't strike me as the best approach.
"I'd rather put my bugs in one, more easily verifiable place." We are in agreement, that is why I was saying people shouldn't use C unless they need it.
> You can change your language, but then you've put the bugs in your compiler / interpreter / vm.
Sure, but the number of bounds-checking bugs in, say, the Rust compiler/stdlib is multiple orders of magnitude lower than if all Rust applications had to roll their own bounds-checking code.
> You can change your language, but then you've put the bugs in your compiler / interpreter / vm.
Memory safety issues account for an overwhelming share of security bugs¹ and are avoidable by moving to managed languages. Compilers and VMs are very, very good at eliminating this primary source of bugs in code.
The first answer to the C/C++ problem should be to, whenever possible, move to languages that do a large part of the work for you.
In particular in the linux world it seems like a lot of application code that could be written in memory safe languages is still written in plain C.
I'm not sure I quite follow you, because if you mean what I think you mean, then I wouldn't call it "the problem of natural numbers", but I agree that C has no fault in it.
Precisely what makes C so performant is also what makes it "unsafe". It is thus up to the programmer to ensure safety in C; this manual management is the price to pay for the performance gained using the language.
I was simply saying that bounds checking at compile time is undecidable in general, so it's better to use languages that do bounds checking at run time. But you shouldn't blame C for this.
And yet almost all security problems on the internet involved PHP and JavaScript.
When you read about people who have had problems with C, you will find that these are people who never really learned the language; the language is as simple as one can get, which should also make one question their abilities.
Half the PHP security vulnerabilities are of the "someone chmoded 777 the Wordpress plugins directory". The other half are C vulnerabilities leaking through the interpreter. And the third half :-) are SQL injections and generic web vulnerabilities.
The first batch would happen in any language, it's an operational issue.
The second batch comes from C and would likely be prevented by PHP being written in idiomatic Rust.
The third batch is a higher level development issue that is cross-language.
I think there's value in removing classes of vulnerabilities.
And blaming the operator is NEVER a good idea. It was the exact opposite of this mentality that made us have safe cars and airplanes today.
Before the 70s, the auto industry was doing exactly what you're doing. Once they stopped doing that, people stopped dying from easily preventable issues.
People who complain about "no bounds checking/type safety/pointer safety/garbage collection in C" often don't get the simple idea that these weren't supposed to be there to begin with; in C, the burden of doing it falls on the developer.
There is nothing wrong with setting types manually, managing memory manually, and thinking through how pointers and array boundaries are computed. In fact this is what makes C unbeatable in its niche, and leads to software with very low resource and performance requirements.
In my opinion, as a programmer who learned C when it was still in its infancy as a language, and who has used it in nearly every single project in 30+ years of professional development, what we need to do is not just teach the C LANGUAGE, but also the C RUNTIME, EXECUTION ENVIRONMENT, and .. most important of all .. the C COMPILER.
Each of these aspects of the "C ecosystem" needs to be well understood in order to be a productive, high-quality C developer. It's not enough to just type a bunch of stuff and then throw it at the compiler and see if it works; you have to understand what the compiler is doing with your language constructs and how this will be executed in the target execution environment.
So many times I've had to debug "professional C developers" who have no clue what the compiler is doing to their code, no idea what a TEXT SEGMENT is, absolutely zero responsibility for the HEAP, let alone runtime loading and linking. It's all just 'voodoo' behind the 'black box' of the compiler.
But even just having a basic understanding of these components can mean a huge difference in quality C code.
Another thing every C developer needs to know: how to debug code and read assembly language in the context of the operating/execution environment. You don't need to be able to WRITE assembly, but at least fire up the debugger and step through your program a few times to see how it behaves .. this can go a long way towards increasing a C developer's understanding of what is happening and why it's important to know all the other components. Too many times I've solved a major, delaying bug in a project by just re-ordering the linker commands or, even worse, cleaning up the linker commands to remove stuff that was glibly thrown at the wall by some other dev.
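A quick way to start de-mystifying those runtime pieces: print addresses from the major regions of a running C program and compare them against the linker map, or /proc/<pid>/maps on Linux. This is only an illustrative sketch of mine; exact layout is platform-specific:

    #include <stdio.h>
    #include <stdlib.h>

    int initialized = 42;   /* data segment */
    int uninitialized;      /* BSS */

    int main(void) {
        int on_stack = 0;                          /* stack */
        int *on_heap = malloc(sizeof *on_heap);    /* heap */

        /* Casting a function pointer to void* isn't strictly
         * conforming ISO C, but works on POSIX platforms. */
        printf("text (code): %p\n", (void *)main);
        printf("data:        %p\n", (void *)&initialized);
        printf("bss:         %p\n", (void *)&uninitialized);
        printf("heap:        %p\n", (void *)on_heap);
        printf("stack:       %p\n", (void *)&on_stack);

        free(on_heap);
        return 0;
    }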
Any reading recommendations on how libc, libstdc++ and the compiler all interact, and how to manage that when the target OS is older than the toolchain?
> There are many ways of writing wrong C code, but you only need to make sure what you write is correct and in defined behavior, that’s all about C programming.
But this is something that even experts fail to do.
There are developers at Google (most likely in increasing numbers) who would look like a deer caught in the headlights when asked to work on a C++ part of the code base (anecdotally, according to a guy working at Google).
It is something experts must fail to do. Documented extensions fall under "undefined behavior", and a lot of real-world coding requires them. Oh, and use of header files and functions not in ISO C is undefined behavior. On a POSIX system #include <fcntl.h> provides definitions of things like F_DUPFD. But there is no reason why on some non-POSIX system, #include <fcntl.h> might not cause the rest of the translation unit to be compiled as Fortran 77. The include mechanism per se has the well-defined behavior that, if the header is found, it replaces itself by the content, which is thereby incorporated into the translation unit. But if that header isn't coming from the program, or from ISO C, then the content is not defined by ISO C.
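For instance (my own minimal sketch of that point): the following is perfectly ordinary, correct POSIX C, yet F_DUPFD and its semantics come from POSIX's <fcntl.h>, not from ISO C, so ISO C alone leaves this "undefined":

    #include <fcntl.h>

    /* Duplicate fd onto the lowest free descriptor numbered >= minfd.
     * Defined by POSIX; ISO C says nothing about fcntl or F_DUPFD. */
    int dup_above(int fd, int minfd) {
        return fcntl(fd, F_DUPFD, minfd);
    }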
Twenty years ago I wrote C++ code that was easily verifiable to be memory safe and free of leaks, just by using a handful of smart classes and sticking to them.
"Using a handful of smart classes and sticking to them" by no means guarantees memory safety, unless you also rule out references and a bunch of other normal C++ features.
I made use of const references in the code to avoid some unnecessary refcount bumping.
I hadn't ruled out calling outside platform API functions, which were all written in C. You can't rule that out in any language, unless you're writing a pure text filter or calculator for the Unix command-line environment (and don't count the I/O and math functions).
Twenty years ago I was also mostly a C++ dev, and unless the code was 100% written by me I would never issue such a statement, given the total lack of control over what others in the team or binary 3rd-party libraries are doing.
I still use C++ as one of my favourite hobby languages, and although it has improved a lot, using C++20 best practices across a team (let's assume it is already available), with binary dependencies, is still a challenge to make 100% memory safe.
In C++, static analysis is optional, while it is part of the language in Rust.
Then there is the whole language culture.
Coming from stronger systems languages, I always strove to have bounds checking enabled on my own C++ projects; good luck selling that to most C++ teams, even though in 99% of the use cases its impact is negligible.
Finally, going all the way back to NEWP, system languages that require explicit unsafe blocks are much easier to code review than those where every line of code can possibly trigger unsafe behaviour, and C++ inherited a lot of such cases from C.
But did you verify it never overflows? I think it is much harder to get that right without language (or compiler) support. Keep in mind that overflow is as severe as a memory bug in C/C++ due to the unforgiving nature of UB in those languages.
> that overflow is as severe as memory bug in C/C++
In practice, it isn't. In many traditional compilers it has predictable behavior (two's complement wrapping), if we're not talking about floating-point overflow.
Some programs explicitly rely on it. Compiler support can be provided for those programs.
It's simply not in the same category as memory corruption bugs.
Of course, ISO C and C++ have just one category for undefined! However, note that "undefined behavior" is a formal term which extends over beneficial areas such as documented extensions and the use of third-party libraries and headers.
That was practically a poster example to sell C++. Look, you can overload operators and get arrays with bounds checking, integers that trap overflows and so on.
Twenty years ago, C++ was still hot and there was a lot of interest in all sorts of techniques. Books, seminars, papers, blogs, you name it.
There is a way to use C++ template partial specialization to mimic the built-in conversion rules, like "int op long" promoting the left operand to "long". You can mirror the language in itself and bend the rules.
"Undefined behavior" is a broad area which covers everything from defects in a program that make it crash, to documented language and library extensions (which real-world programs can hardly avoid using).
The problem is that just because it's not causing problems now doesn't mean it can't suddenly start causing incredibly hard-to-track-down problems years later, after an innocuous compiler upgrade...
If a programmer can't at least read (normal) C, I'd doubt they're a fully qualified programmer. (Just my way of viewing things, I might be wrong. My reasoning is that if you don't know C and pointers, you probably don't know what a simple line like 'foo = bar + "abc"' even entails.)
It's a tiny language, has easy syntax, and "undefined" behavior (which you don't normally run into) exists for a reason -- e.g. to avoid having to check for unlikely cases every time a heavily used function (e.g. memcpy) is called.
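The memcpy example is worth spelling out (my own sketch): because overlap is undefined rather than checked, memcpy never pays for an overlap test on the hot path, and callers who do overlap must say so by calling memmove:

    #include <string.h>

    static char buf[16] = "abcdefgh";

    void shift_right_one(void) {
        /* Overlapping copy: must be memmove, which handles overlap. */
        memmove(buf + 1, buf, 8);
        /* memcpy(buf + 1, buf, 8) would be undefined behaviour,
         * precisely so memcpy can skip the overlap check on every call. */
    }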
If a programmer can't at least read (normal) assembler code, I'd doubt they're a fully qualified programmer. (Just my way of viewing things, I might be wrong. My reasoning is that if you don't know assembler code, you probably don't know what a simple line like 'foo = bar + "abc"' even entails.)
My point being that it is a bit arrogant to claim that somebody isn't a fully qualified programmer unless they understand X technology. In my book you can be a fully qualified programmer (whatever that entails) even if the only programming you have ever done is in Excel.
Comparing C to python, the lack of explicit memory management and pointers in python make it hard to understand what the machine is actually doing compared to writing in C.
Are there equivalent examples between C and assembler? What sort of conceptual stuff is C missing that assembler clarifies?
Although that lower level stuff (like calling conventions and register use) is missing in C, you have to deal with it when debugging C. C programmers know how to debug at the machine level.
By contrast, the average Python programmer couldn't debug into the Python internals if their life depended on it.
You could argue it's not conceptual, but you don't have to deal with registers in C. Also system calls. C handles dynamic library loading, I think. Not too familiar with how that works in assembly.
You have to deal with those registers when debugging C, at least from time to time. C debuggers show you the registers; e.g. "info reg" in the GNU debugger. Linux kernel debugging often means working from a dump of the CPU registers and the raw memory around the stack pointer and the instruction pointer. From that you figure out what bad thing the C code did.
Very much so. But there are many architectures and C is a reasonable "portable assembler" of sorts, so having an inkling of a single ISA and a grasp of C will help a lot in de-mystifying more complicated data structures in any language on most architectures.
But you are correct that C by itself is not quite enough.
To be fair, machine-generated assembly code is much more cryptic than hand-written assembly. That's why, even though the language is dead simple, reading through it is still difficult. The same goes for high-level languages: generated parsers, generated protocol layers, generated object systems, etc.
And, yes, knowing assembly sometimes does make a huge difference as a programmer, but only in limited fields.
The difference, of course, is that significant classes of programs are still mostly in C - operating systems, compilers, databases, web servers, etc. The same can not be said of assembler.
If a programmer doesn't even have any curiosity to understand what's beneath their abstractions and how things really work behind the scenes in 10+ years, I wouldn't call them "great" at all. They may be productive, but still a mediocre hacker.
C has nothing to do with this. C is a quirky old high-level language which has nothing to do with underlying abstractions, i.e. with how the machine works.
If you want to learn what's beneath, you open a machine manual and read it.
They have no debugging skills worth a damn. If a problem occurs in a tool or library they are using at a level below what they are used to, they need help. They are not complete engineers.
You have to be a complete engineer because you take full responsibility for what you're doing. All your tools carry disclaimers in their licenses which absolve them of any responsibility if something goes wrong.
for LANG in Python, Ruby, Rust, Java, ... :
If $LANG has a bug, and because of that your $LANG program causes some harm to the customer's data, you can't blame $LANG.
If I'm in a situation where I'm somehow required to do something with Python, and something goes wrong, I can debug it right into the Python internals. If the problem is with how GCC compiled $LANG, I can debug that too, and I can drop to the machine level if needed.
But in 10 years they will have touched JavaScript, C++, C# or Java, and they will surely understand how memory works, so they should be able to read C.
Doubtful. I think web development, mobile and enterprise are the biggest development domains by headcount, and none of them normally has to touch C or C++.
And that's fine, why does one have to understand an ancient language when they can interact perfectly fine with machines by using Python, Swift, Java etc?
That's why I included Java, C# or Javascript. If you can read those languages you can read C.
And it's not the language that they need to understand. C in itself is a fairly simple language; that's why I said that any person with experience in that family of languages would be able to read it straight away. What one should understand is how the computer is doing what you ask it to do. What does "create an object" mean, what is a "reference to an object", what are the heap and the stack, etc. These are concepts that are extremely useful and, as the parent comment said, almost required to know in order to be a full programmer and be able to understand what happens when you create your programs.
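In C those concepts stop being abstract. A minimal sketch (my own illustration) of what "create an object" can mean:

    #include <stdlib.h>

    struct point { int x, y; };

    int main(void) {
        struct point a = {1, 2};              /* automatic object on the stack:
                                                 gone when the function returns */
        struct point *b = malloc(sizeof *b);  /* object on the heap: lives until
                                                 free(); 'b' is roughly what a
                                                 higher-level "reference" hides */
        if (b) {
            b->x = 3;
            b->y = 4;
            free(b);
        }
        return 0;
    }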
Any JavaScript or C# programmer will understand the ifs, elses and fors, but that doesn't mean they'll understand what the C code does or whether it's correct - two essential aspects of reading code.
I've been doing C++ for more than a decade and had two tough experiences lately:
* I had to review a C project of several thousand lines. It's impossible for a human to understand if memory management is done correctly without weeks of deep analysis, so I had to resort to tools.
The logic of the code is also hard to make out because of the memory bookkeeping.
* I was reading the open source code of a Linux program and had to dig deep into a couple of system topics and read the documentation for half a dozen syscalls to make sense of it.
For me, C is one of the hardest languages to understand, you're always zoomed in at 10x, looking at all the insignificant details of memory management, working with strings and arrays, etc.
> Any JavaScript or C# programmer will understand the ifs, elses and fors,
I think that's what the parent comment meant with "reading (normal) C". The language itself is simple, because it only has the ifs, elses, and fors. Of course, given the simplicity of C the projects end up being complex because they have to manage a lot of things that other languages do for the programmer, but that is outside the scope of the language (i.e., you don't need to resort to the language manual to understand it).
In other words, the discussion is not that everybody should be able to dive deep into C projects and instantly know what's happening, and know all syscalls and everything. The discussion is that a programmer should know what is memory management, what is a system call, etc, which is practically synonymous to being able to read normal C.
If you can read those languages, you can probably tokenize C fairly well, and parse a decent proportion of it. But only understand maybe a small fraction.
>And that's fine, why does one have to understand an ancient language when they can interact perfectly fine with machines by using Python, Swift, Java etc?
If the interpreter or VM used by any of those languages is written in C, someone still needs to understand it. Not to mention that it doesn't hurt to understand the internals of your chosen language if you want to make the most efficient use of it.
Where someone should be a C pro, not the local JS guy that dabbles in C and can "read" it.
Doesn't hurt to know the internals, but one's time is limited and better spent on other endeavors. If I weren't doing embedded I wouldn't touch C with a 3m pole, nor would I need to :)
In fact my life's mostly C free nowadays and I'm quite pleased with that.
Because (1) those are written in that ancient language, so you're not a complete engineer if you don't understand the workings of your tool, and those tools come with disclaimers which make every issue your fault. (2) integration with tons of C libraries or libraries with C API's is a reality.
One of the biggest fallacies I see spread on forums is the statement that the age of a language or a program is an indication of its failings and a need to be replaced. Typically spoken by amateurs and other hobbyists.
You're likely correct, but most of that code's either been written long ago and is now in maintenance mode or if it's new code it's very specialized and irrelevant for 99% of the developer population. Do I care what some internal chip fw is written in? Not really, but it's probably C and it's probably full of security vulnerabilities. Such is life.
There's also those that misguidedly use C when they shouldn't (e.g to speed up Python code). It's unfortunate, but such is C's siren song - mesmerizing one with promises of speed only to be dragged into the depths of undefined behavior.
6 of the 8 things you list are C or C++ which is, essentially, C with benefits--most of the code in C++ is identical to C except for interfaces--classes/inheritance and all that. AI/ML is a small slice of the programming pie, performed by people who wouldn't use C at the top level, but you ignore the fact that, when speed is needed, C is used even there.
I'd go into this more but I don't feel like explaining these things to people ad nauseam. C is one of the most used languages around the world, in new projects too, and nothing you've said changes that.
No need to explain anything. Where are the new C projects? :)
Some of the things I listed are C++, which is a different language, which kept evolving unlike C and also unlike C is in demand today for all sorts of hot domains. The two can't be mixed.
Redis comes to my mind, which is actively developed and can be considered one of the new projects powering a lot of new stuff ... Also SQLite and Postgres, which are very actively developed and again being used in much of the newer stuff, either directly or indirectly ...
But in a way, you may be right ... all the critical new lower-level programs like web servers are now mostly written in Go or Rust, and C is not favourably considered for them ...
It's just an instance of "No True Scotsman". What is fully qualified? There are certainly areas of programming where not knowing C is a deal killer, but they are a minor subset of all professional development.
Yes, and my computer and phone run binary machine code through their smashed rock in which we put electricity. I don't know much about the smashed rock nor about electricity (more than high school level knowledge).
If a programmer can't understand C, or can't immediately understand it just by looking at the code, with C being a really clear language syntax-wise, there's a problem.
Hmm, I'm very skeptical of this premise. Basically every halfway decent CS program has a systems level class in C. While bootcamps usually don't teach C, they're not necessarily dismissing C as much as ignoring the entirety of systems programming.
I don't want to be overly harsh on the writing, as English does not appear to be the author's first language (you should see my Chinese). However, here's some pointers (heh)
Especially with Go and Rust go viral right now => Especially with Go and Rust having gone viral
No matter you never touched C or you are a veteran => Whether you are a C novice or a C veteran
^[1]
and not as primitive => and is not as primitive
Regardless of you a system language programmer => Regardless if you are a systems language programmer
In general, I'd recommend using "you" a little less frequently and working in more. I.e. "I highly recommend you read Modern C if you haven’t read it before" could become "I'd highly recommend Modern C to those who have not read it already".
But again, wonderful job all things considered. Please continue to write blog posts.
[1] I tried to rewrite this in the spirit of the original, but the double negative of "No matter...never" was too awkward to keep. Also feel free to remove the C's
I think if you are interested in systems or embedded programming C is definitely a good language to learn still. I programmed in C ( and C++ ) most of the 90s, focused on C++ and other languages for the 00s, then came back to it 2010ish, and I found it refreshing using modern techniques in C. However, I'd still say avoid it if you don't have to use it. But the key here is knowing exactly what you are trading off before choosing to avoid it. In most cases C++ is a viable alternative ( not always ). Rust is starting to look good in the embedded world, but I'd like to see more vendor support for it, and very few professional Rust devs in the embedded space. C is still often the better tradeoff. Also Electronics Engineers often only know C which lets them debug and diagnose hardware. So in the embedded world, you should know C, in the systems world, its a good idea, it will let you read a large majority of all open source system level software
I would rather resist the temptation if I can avoid it.
I'm so frustrated with what we've come to tolerate with software.
Ada/SPARK seems to be trying to make a comeback, something I welcome.
They are interacting with the 'maker community' [1] and the online learning resources have improved a ton.[2]
SPARK is even taking some inspiration from Rust. [3]
What I find fascinating about C is how mature codebases develop their own meta language with copious macros. These in turn inspired future languages and their built-in capabilities.
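A minimal sketch of what such a macro meta language looks like (the macro name is mine, purely illustrative):

    #include <stdlib.h>

    /* One macro declares a type-specific linked-list node and a push
     * helper; invoke it once per element type and the codebase grows
     * its own little generics system. */
    #define DECLARE_LIST(T)                                             \
        struct T##_node { T value; struct T##_node *next; };            \
        static struct T##_node *T##_push(struct T##_node *head, T v) {  \
            struct T##_node *n = malloc(sizeof *n);                     \
            if (n) { n->value = v; n->next = head; }                    \
            return n;                                                   \
        }

    DECLARE_LIST(int)   /* expands to struct int_node and int_push() */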
My current preferred language/runtime/execution environment is: just enough C to support the Lua VM.
Write low-level stuff in C as necessary, expose it as an abstraction to the Lua VM, write all app logic in Lua.
This really represents a great bridge between two worlds and I've just found it so incredibly productive.
Disclaimer: grey-beard C dev who doesn't need to learn your new-fangled language that 'fixes' everything (npm, lol) because I already did it decades ago with Lua ..
Not just mature codebases; also mature developers. I collected a bunch into libcperciva, and I would hate to write code without ELASTICARRAY_DECL, GETOPT, or PARSENUM.
21st Century C is my favorite. It focuses on the whole ecosystem and not just the language: the more notable GNU extensions, and overall pragmatism. Most C resources fail at pragmatism.
There's a number of points to this story and maybe one of the reasons I liked it so much is because it says things that are otherwise hard to describe, but here are two pointers:
- in the monastery they visit, no two things are equal. I read this as an overapplication of DRY.
- the monks speak an extremely terse language that is so complicated that even their master has to think for a good while before he can say anything. In the story the master also speaks to the newbie in this language, implying that he either doesn't realize nobody else understands it, or maybe rather that he doesn't care so much about being helpful as he cares about being terse.
One thing the internet has still not learned is how not to needlessly flame. A lot of the comments here are dissing the problems of C (fine) but are completely missing the author's main point. For heaven's sake, it's even a lower bar this time, the title says it all, and people are cursing about C. Their main point is simply:
>Especially with Go and Rust go viral right now, C seems already forgotten by people. Nevertheless, IMHO, C is still a thing which is worth for you spending some time on it.
I happen to agree: especially if you have time to learn Go or Rust but don't use them for anything[0], you have time to learn modern C. The author didn't quite make this point (they seem more on the side of "you should, in general"), but if you don't have the time, I'd think anyone could decide for themselves what's worth their time or not.
As an aside, this bit is partly true but partly isn't:
>Last but not least, because C is so “low-level”, you can leverage it to write highly performant code to squeeze out CPU when performance is critical in some scenarios.
The reality is C has been around a long time and compilers are written to make fast code from C programs. See this, "C Is Not a Low-level Language; Your computer is not a fast PDP-11."[1]
[0] Point about Go or Rust is if you do intend to use it for something, then that isn't learning for edifications' sake. Ditto C. If you refuse to learn C but it's relevant to your job/work then that's simply neglect of your duties.
I've programmed for many years and taught programming; my trajectory through the various programming languages and paradigms seems to have worked for me. But things have changed in big ways since I learned to program.
My first job after grad school was as an assembly language programmer. It's humbling and teaches some good programming practices (like "take a small bite and then run tests"). Today, assembly language programming is not appropriate for most applications. Processors are very fast, memories are very large, and instruction sets include numerous complex features that optimizing compilers handle very well. Meanwhile there has been one new programming language after another gradually improving the high level programming tools available.
If I was designing a university curriculum for CS, I would introduce languages in this order:
First, Python: its basics are easy to learn and it allows new programmers to actually tackle interesting assignments. Those that don't go on in CS will still have acquired familiarity with a useful language for writing simple programs. It's a good language to use as a lingua franca in subsequent courses.
Second, simple Java and OO design: there are just so many jobs using this language.
Third, C, introduced along with a study of data structures. Learning data structures as an undergraduate is mostly learning linked structures of one kind or another, and C is the best language for this. C is close to what's happening at the machine level, whereas Python and Java have garbage collection and language features that already include lists, maps, etc. Studying data structures in C lets students see how these higher-level abstractions are implemented and prepares them for seeing kernel code.
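The kind of exercise meant here, sketched briefly (my example, not from the comment): a growable array, i.e. what a Python list or Java ArrayList does behind the scenes:

    #include <stdlib.h>

    struct vec { int *data; size_t len, cap; };

    /* Append x, doubling the backing storage when it runs out;
     * this is the machinery that list.append() hides. */
    int vec_push(struct vec *v, int x) {
        if (v->len == v->cap) {
            size_t ncap = v->cap ? v->cap * 2 : 4;
            int *nd = realloc(v->data, ncap * sizeof *nd);
            if (!nd) return -1;     /* caller decides how to handle OOM */
            v->data = nd;
            v->cap = ncap;
        }
        v->data[v->len++] = x;
        return 0;
    }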
After this there are still some big important languages left out. Lisp/Scheme can be introduced in an AI class. Javascript can be introduced in a web programming class. Assembly language can be introduced in a hardware class.
The Go language would work well for an algorithms class, and "modern" Java with collections, streams, modules, etc. for an undergrad compiler class or software engineering class.
Naturally SQL should be used in a database class. Git could be introduced after the first year and used for all assignments. LaTeX could be required for all writing assignments.
The important language left out here is C++. It just seems so difficult to learn that I don't know when or whether it should be taught. C++'s new features are big improvements, but the historical baggage is a lot to take on as a new programmer. When should C++ (or Rust, or Haskell for that matter) be learned?
C++ lends itself well to large-scale, performance-intensive, cross-platform software. Rust could in theory be used for similar projects.
Using C to learn pointers & data structures gives students just enough rope to hang themselves later if they want to use C in a real project. I'd go for an STL-heavy C++ and dig into the memory handling and pitfalls.
Students need to understand how to solve resource management issues in low-level programming. Since they're essentially unsolvable in C (be careful, try harder, use valgrind & pray), it doesn't seem like a good teaching language.
In software development (web/mobile) for about 10 years, I finally picked up learning C. 4 weeks in, I finished writing my first Gameboy (Color) game and started work on my first Neo Geo game, with awesome progress already. Will share more soon, but the gist is that you can't do this in JavaScript, Go or Rust. And don't say GBStudio now :)
> I highly recommend you read Modern C if you haven’t read it before.
I was pretty disappointed with Modern C, in contrast with Deep C Secrets, which taught me a ton. Maybe I should write a book about all of the GNU C extensions and the situations in which they are useful (or the only possible solutions).
Honestly, working with GCC is the closest thing I've found to Vinge's programmer archaeology. There's a lot of useful stuff in there, named / hidden behind the ugliest incantations imaginable in C, and often platform-specific or otherwise buggy.
I've been reading Modern C too. I don't know C well yet, but can say that the book is very raw and in dire need of heavy editing. It seems that it will officially be released in September via a publisher, so hopefully this will improve things. It has lots of typos, LaTeX markup errors, and pacing issues. In the past, some people on HN rightly pointed out that the book contains some code trickery. I haven't finished it yet because it is a very dense book. Still, I feel it is definitely worth reading.
I skimmed the foreword, didn’t like the first half, but did like the three levels of understanding and expected that the book would follow this. If it doesn’t, then your book idea sounds excellent.
I wasn’t very interested in systems level stuff until the last few years with this new wave of systems languages. I’m itching to learn Zig whenever I get some free time!
I know most people will say Go isn’t a systems language but it certainly taught me to think differently than when I use C# or Python.
Yep. With all of the PL research that has gone into Zig, Nim, Rust, and others, one would really hope that greenfield projects would be written in not-C. Of course, when does anyone really get the luxury of writing something completely from scratch--especially with low-level systems? Maybe someday!
Low level system development is very common in the embedded systems domain! There's always been a large corpus of development done on embedded systems in Ada, and there's quite a bit of interest lately in low-level development in Rust for microcontrollers. As well as interest in using Rust/Ada for operating system development.
I am not a big Go fan, but it definitely can be used for such tasks.
You can always point those people to Go's main compiler being written in Go, Android's GPU debugger, gVisor, Biscuit as examples from systems programming in Go.
Check out C++: its modern versions translate almost verbatim from Python (throw in auto and semicolons and replace . with ::). See for example the PyTorch C++ API or nlohmann's JSON library. At the same time it allows you to program a 16 kB microcontroller with ease.
You can learn enough C to leverage existing code, and then use the FFI facilities that many languages provide to interface with it. That way you don't have to engineer your entire application in C. I think this is pretty common practice, but maybe some people don't realize how common it is that their libraries (e.g. in Python) are leveraging C code.
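The usual shape of this, as a hedged sketch (function name and build line are mine, purely illustrative): a small C function compiled into a shared library, which the high-level language then loads through its FFI:

    /* hot.c -- build with, e.g.:  cc -shared -fPIC -o libhot.so hot.c */

    /* A numeric kernel worth keeping in C; everything else
     * stays in the high-level language. */
    double dot(const double *a, const double *b, int n) {
        double s = 0.0;
        for (int i = 0; i < n; i++)
            s += a[i] * b[i];
        return s;
    }

From Python, for example, ctypes can load libhot.so and call dot after declaring its argument and return types; this is essentially what many numeric libraries do under the hood, just with more machinery.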
I disagree. The C programming language is directly responsible for countless damning flaws in modern software and can be credited for the existence of the majority of the modern computer security industry.
You can write system software in many languages, including Lisp. For a less outlandish example, Ada is specifically designed for producing reliable systems worked on by teams of people in a way that reduces errors at program run time.
I find it amusing to mention UNIX fundamentals as a reason to learn C, considering UNIX is the only reason C has persisted for so long anyway. Real operating systems focused largely on interoperation between languages, not funnelling everything through a single one; Lisp machines focused on compilation to a single language, but that language was well-designed and well-equipped, unlike C.
>Last but not least, because C is so “low-level”, you can leverage it to write highly performant code to squeeze out CPU when performance is critical in some scenarios.
It’s actually the opposite. The C language is too high-level to correspond to any single machine, yet too low-level for compilers to optimize for the specific machine without gargantuan mechanisms. It’s easier to optimize logical count and bit manipulations in Common Lisp than in C, because Common Lisp actually provides mechanisms for these things; meanwhile, C has no equivalent to logical count and its bit shifting is a poor replacement for field manipulations. Ada permits specifying the bit structures of data types at a very high level, while continuing to use abstract parts, whereas large C projects float in a bog of text-replacement macros.
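For reference, here is what "logical count" (population count, Common Lisp's logcount) looks like when the language gives you no primitive for it; portable C spells out the loop or reaches for a compiler extension like GCC/Clang's __builtin_popcount (my sketch):

    /* Kernighan's trick: each iteration clears the lowest set bit,
     * so the loop runs once per 1-bit. */
    unsigned popcount(unsigned x) {
        unsigned n = 0;
        while (x) {
            x &= x - 1;
            n++;
        }
        return n;
    }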
Those are my thoughts on why C isn’t worth learning, although this is nothing against the author.
C might not be particularly designed for high performance computing, but there has been an absolutely enormous amount of work on compilers for C. Due to its prevalence, even CPU manufacturers take care to design their chips so that C code runs fast on them. Effectively this makes C one of the fastest languages around.
The story is similar for Javascript. The language design is fairly terrible, but thanks to millions of dollars invested into Javascript JITs, it's now one of the fastest interpreted languages.
Only to catch up with what was already happening outside Bell Labs world.
C is indeed the JavaScript of systems programming.
"Oh, it was quite a while ago. I kind of stopped when C came out. That was a big blow. We were making so much good progress on optimizations and transformations. We were getting rid of just one nice problem after another. When C came out, at one of the SIGPLAN compiler conferences, there was a debate between Steve Johnson from Bell Labs, who was supporting C, and one of our people, Bill Harrison, who was working on a project that I had at that time supporting automatic optimization...The nubbin of the debate was Steve's defense of not having to build optimizers anymore because the programmer would take care of it. That it was really a programmer's issue.... Seibel: Do you think C is a reasonable language if they had restricted its use to operating-system kernels? Allen: Oh, yeah. That would have been fine. And, in fact, you need to have something like that, something where experts can really fine-tune without big bottlenecks because those are key problems to solve. By 1960, we had a long list of amazing languages: Lisp, APL, Fortran, COBOL, Algol 60. These are higher-level than C. We have seriously regressed, since C developed. C has destroyed our ability to advance the state of the art in automatic optimization, automatic parallelization, automatic mapping of a high-level language to the machine. This is one of the reasons compilers are ... basically not taught much anymore in the colleges and universities."
-- Fran Allen, excerpted from Peter Seibel, Coders at Work: Reflections on the Craft of Programming
Absolutely agree. After I began learning Ada, I could never look at C the same way again.
It's an incredibly fluid and intuitive language that shines in the areas that C has dominated, without any of the pitfalls that make serious development in C so difficult. It's very easy to see why it remains the language of choice for life-critical applications.
My mother bought the family a PC back in the early 90s. She was working on a masters degree, and her coursework required it. I taught myself Turbo Pascal on that machine. Loved it!
Years later I started college (originally majoring in CS), and they taught their courses in C++. I read the manual cover-to-cover and finished out the course, just to give it a fair shot. Wasted time though. To know it is to loathe it. At the end of the course, changed my major to computer engineering and never looked back.
I had lots of experience in Modula-2 and Pascal (including Turbo Pascal) before C. I liked C very much. The very syntax was a breath of fresh air; just } to close scopes and so on. The declarations that resembled uses of the identifier looked good to me.
Good old Don Knuth had a similar reaction (also coming from lots of experience in Pascal, Algol and their ilk). In a 1993 interview he said this: I think C has a lot of features that are very important. The way C handles pointers, for example, was a brilliant innovation; it solved a lot of problems that we had before in data structuring and made the programs look good afterwards. C isn't the perfect language, no language is, but I think it has a lot of virtues, and you can avoid the parts you don't like. I do like C as a language, especially because it blends in with the operating system (if you're using UNIX, for example).
I don't entirely agree with him but sympathize with the reaction. (No you can't avoid the bits you don't like, unless you're a lone hacker who doesn't have to integrate any third party-code, which pretty much describes Knuth.)
For me, C versus Turbo Pascal 6.0 just felt backwards; thankfully the professor who was giving us C classes also made a C++ compiler available in the school lab.
So given the choice of C vs C++, when coming from Wirth's school of type safety, the option was obvious.
I find it absurd that, in criticizing C, you conflate that with the underlying machine.
But, in a way, sure. There's been plenty of hardware that had safety features and other high-level qualities, but C could never take advantage of them and now there are plenty of stupid RISC machines that resemble a PDP-11 well enough.
So, without C, the common hardware would probably be better, too.
I tend to treat C like English: you don't have to learn it, and you can certainly get by without it if you stay within a certain semi-isolated community where it's not really used. However, it is certainly something that's a nice thing to know, if for nothing else that a lot of other people will know it too.
For me, the advances in thread and memory sanitizers in Xcode made it much safer to use C. It's almost fun to create memory leaks now and see the sanitizer spotting them :-)
They are from the LLVM project, but the integration is very well done. This is something that I miss a lot in Visual Studio, and that makes me consider it inferior to Xcode.
The main problem with C and C++ is the development model. There is a standard committee that meets for a couple of days three times a year. At this meeting, they discuss proposals. They have been talking about ranges since maybe around 2000 and they still aren't quite in the language. It would be impossible for me to contribute to any standard even if I wanted to.
Rust's development happens all on github. I can see all the discussions as they are happening and potentially even meaningfully contribute. Rust already has essentially all the things the C++ standard committee wants to standardize in the next couple of years and then some more.
Besides the language itself, the lack of a package manager is a huge hurdle.
Is this about knowing C or developing in C? Nothing wrong with knowing C, but developing in C is a bad idea (also immoral /s).
In programming languages, slow evolution is a feature; this is particularly true when there is so much existing code out there. Fast iteration is probably good until the 1.0 version, but then you want stability.
> Besides the language itself, the lack of a package manager is a huge hurdle.
It's a systems language; it uses the system's package manager. We don't need language-specific layers on top of the OS.
I don't have a traditional CS background and learned several high-level languages (interpreted and compiled) before buying Learn C the Hard Way earlier this year. I admittedly stopped short of implementing the larger projects at the end of the book (got too frustrated with the Makefile setup constantly breaking in weird ways for bigger projects...) but I still think I got a lot of value out of implementing some basic algorithms like lists or hashmaps from scratch and having to deal with how memory is used exactly. I agree with the sentiment that C is syntactic sugar for assembly. I would never dare to write anything non-trivial in C, but I can highly recommend anyone to at least spend a couple of months fighting with it.
While C is undoubtedly necessary as a systems programming language, where it really shines is in scientific computing. All the modern scientific stacks (R, julia, numpy, torch, tensorflow, octave, ...) are based mostly on C and, to a lesser extent, on Fortran.
I don't do any systems programming, but I think it would be valuable to learn C for educational purposes. Could I get some recommendations on books or online material that could get me started?
The author refers to systems programming as a broad term, not exclusively kernels and drivers.
For example:
> Regardless of you a system language programmer, DevOps, performance engineer or wear other hats, the more you know about the Operating System, the more you can do your job better. Take all prevailing Unix-like Operating Systems as an example, from kernel to command line tools, they are almost implemented in C.
In the past, I got the impression that Go is a systems language because of how people "bragged" about Go's performance. Then someone reminded me that Go is in the same basket as Java, not C++.
As I wrote in another comment, the author seems to use a broader term for systems programming:
> Regardless of you a system language programmer, DevOps, performance engineer or wear other hats, the more you know about the Operating System, the more you can do your job better. Take all prevailing Unix-like Operating Systems as an example, from kernel to command line tools, they are almost implemented in C.
[1] https://blog.regehr.org/archives/1393