If it's about "prettier code" then I think a number one candidate would be making bitfields more viable for use. It could make driver code much cleaner and safer.
Windows is only targeting little-endian systems which makes life easier (and in any case they trust MSVC to do the right thing) so Windows drivers make much use of them (just look at the driver samples on Microsoft's GitHub page.)
Linux is a little afraid to rely on GCC/Clang doing the right thing, and in any case bitfields are underpowered for a system which targets multiple endiannesses. So where Windows C uses bitfields, Linux uses systems of macros for shifting and masking instead. This is considerably uglier and easier to make a mess of. It would be a real improvement in quality-of-life if this were not so.
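To make the contrast concrete, here is a rough sketch of the two styles. The register layout and every name in it (CTRL_*, ctrl_make, struct ctrl_bits) are made up for illustration; this is not taken from any real driver or from the kernel's own helpers.

```c
#include <stdint.h>

/* Hypothetical 32-bit control register (illustrative layout only):
 *   bit  0      ENABLE
 *   bits 1..3   MODE
 *   bits 8..15  DIVIDER
 */
#define CTRL_ENABLE_SHIFT   0
#define CTRL_ENABLE_MASK    (UINT32_C(0x1) << CTRL_ENABLE_SHIFT)
#define CTRL_MODE_SHIFT     1
#define CTRL_MODE_MASK      (UINT32_C(0x7) << CTRL_MODE_SHIFT)
#define CTRL_DIVIDER_SHIFT  8
#define CTRL_DIVIDER_MASK   (UINT32_C(0xff) << CTRL_DIVIDER_SHIFT)

/* Extract a field from a register value. */
#define CTRL_GET(reg, name) \
    (((reg) & name##_MASK) >> name##_SHIFT)

/* Return the register value with one field replaced. */
#define CTRL_SET(reg, name, val) \
    (((reg) & ~name##_MASK) | \
     (((uint32_t)(val) << name##_SHIFT) & name##_MASK))

/* Macro style: the result does not depend on how the compiler
 * lays out bitfields, only on integer arithmetic. */
uint32_t ctrl_make(unsigned enable, unsigned mode, unsigned divider)
{
    uint32_t reg = 0;
    reg = CTRL_SET(reg, CTRL_ENABLE, enable);
    reg = CTRL_SET(reg, CTRL_MODE, mode);
    reg = CTRL_SET(reg, CTRL_DIVIDER, divider);
    return reg;
}

/* Bitfield style: much nicer to read, but the order in which bits are
 * allocated inside the storage unit is implementation-defined, so this
 * only matches the layout above on ABIs that allocate from the least
 * significant bit (typical for little-endian targets). */
struct ctrl_bits {
    unsigned int enable    : 1;
    unsigned int mode      : 3;
    unsigned int reserved0 : 4;
    unsigned int divider   : 8;
    unsigned int reserved1 : 16;
};
```

With the macros, ctrl_make(1, 5, 0x42) yields 0x420B on any target; the bitfield struct only gives that bit pattern when the ABI happens to agree.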
Struct packing also isn't guaranteed by the C standard; everything around it is implementation-defined, as is bitfield packing. However, __attribute__((packed)) (or an equivalent such as #pragma pack) is implemented by any sensible C compiler, and structure layout and the memory layout of data types are specified in the compiler's manual. C would be useless without those implementation-specified guarantees, because much of C's appeal for low-level programming comes from the ease of deserialisation through something like copying or casting a raw byte buffer straight into a struct.
Of course this can only work if an implementation tells you exactly what the memory layout of 'struct deserialized' and all the data types in it are.
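A minimal sketch of that kind of deserialisation, assuming GCC/Clang's __attribute__((packed)). The field names and the 6-byte header are invented for illustration; only the struct name comes from the comment above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire header. Without "packed" the compiler would
 * normally insert padding before `length`. */
struct __attribute__((packed)) deserialized {
    uint8_t  type;
    uint32_t length;   /* sits at offset 1 only because of "packed" */
    uint8_t  flags;
};

/* These are exactly the guarantees the implementation has to give us. */
_Static_assert(sizeof(struct deserialized) == 6, "unexpected padding");
_Static_assert(offsetof(struct deserialized, length) == 1, "unexpected offset");

/* Copy the raw bytes into the struct. memcpy is used instead of a
 * pointer cast so that buffer alignment and aliasing are not a concern.
 * Multi-byte fields still arrive in whatever byte order the sender
 * used; no endianness conversion happens here. */
struct deserialized parse_header(const unsigned char *buf)
{
    struct deserialized d;
    memcpy(&d, buf, sizeof d);
    return d;
}
```

The static assertions turn the layout assumption into a compile-time check, which is about as close as portable-ish C gets to documenting what the compiler promised.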
Btw, ordering is somewhat more defined than packing, in that the usual forward/reverse/little/big-endian shenanigans are OK. But relative ordering of each field is always preserved by the C standard.
Doesn't Linux make similar demands of the compiler, just not for bitfields? And I seem to recall Linus having some choice words for the C Standard's tendency over the years to expand the domain of undefined behavior. I don't think the Linux devs have much patience for C thinking it can weasel out of doing what it's told due to some small print in the Standard.
PowerPC "supports" both, but I believe it's typically run in big endian mode. Same with MIPS AFAIK.
(Mini rant: CPU people seem to think that you can avoid endianness issues by just supporting both little and big endian, not realizing the mess they're creating higher up the stack. The OS's ABI needs to be either big endian or little endian. Switchable endianness at runtime solves nothing and causes a horrendous mess.)
You could actually support both at runtime with both ABIs being available. This is done routinely on x86_64 with x86 ABI for compatibility (both sets of system libraries are installed), for a while I used to run 3 ABIs (including x32 - the 64bit with short pointers) for memory savings with interpreted languages.
IRIX iirc supported all 4 variants of MIPS; HP-UX did something weird too! I’d say for some computations one or the other endianness is preferred and can be switched at runtime.
Back in the day it also saved on a lot of network stack overheads - the kernel can switch endianness at will, and did so.
Are you advocating that Linux systems on PowerPC should have two variants of every single shared library, one using the big endian ABI for big endian programs and one using the little endian ABI for little endian programs?
Because that's how 32-bit x86 support is handled. There are two variants of every library. These days, Linux distros don't even provide 32-bit libraries by default, and Ubuntu has even moved to remove most of the 32-bit libraries from their repositories in recent years.
Apple removed 32-bit x86 support entirely a few years back so that they didn't have to ship two copies of all libraries anymore.
What you're proposing as a way to support both little and big endian ABIs is the old status quo that the whole world has been trying (successfully) to move away from for the past decade due to its significant downsides.
And this is to say nothing of all the protocols out there which are intended for communication within one computer and therefore assume native endianness.
There are downsides. Unsure if significant vs negligible. And same in terms of “internal” protocols - that essentially goes against the modularity (and while in the past there were good reasons to get away from modularity in pursuit of performance, darn, baudline.com of 2010 works amazingly well and is still in my toolbox!)
Big advantage of the “old ways” was the cohesion of software versions within a heterogenous cluster. In a way I caught the tail end of that with phasing out of MIT Athena (which at the time was very heterogeneous on the OS and architecture side) - but the question is, well, why.
Our industry is essentially a giant loop of centralizing and decentralizing, with advantages in both, and propagation delays between “new ideas” and implementation. Nothing new, all the economy is necessarily cyclic so why not this.
I’d argue that in the era of inexpensive hardware (again) and heterogeneous edge compute, being able to run a single binary across all possible systems will again be advantageous for distribution. Some of that is the good old cosmopolitan libc, some of that is just a broad failure of /all/ end-point OSes (which will breed its own consequences) - Windows 11, OSX, Android, etc.
I have no idea what you're trying to say. Are the "old ways" you're referring to having multiple ABIs on one system, like 32-bit and 64-bit x86? Were software versions within a heterogenous cluster more cohesive when we had 32-bit and 64-bit on the same machine..? What?
SGI IRIX and HP-UX handled multiple ABIs from one root partition, with the dynamic linker loader using appropriate paths for various userlands.
This had the advantage that one, networked root filesystem could boot both M68K and PA-RISC, or both o32 and n64 MIPS ABIs, and I’m pretty sure this would’ve worked happily on IA64 (again, from the same FS!)
The notion of “local storage boot” was relatively new and expensive in the Unix-land; single-user computing was alien, everyone was thin-clienting in. And it was trivial to create a boot server in which 32bit and 64bit and even other-arch (!) software versions were in perfect sync.
Nothing in current Linux actually forbids that. With qemu binfmt you can easily have one host and multiple userland architectures; and it sometimes even works OK for direct kernel syscalls.
All essentially aiming for a very different world, one that still runs behind the scenes in many places. The current Linux is dominated both by the “portable single-user desktop” workloads (laptops), and by essentially servers running JIT-interpreted language fast time to market startups. Which pushed the pendulum in the direction of VMs, containerization and essentially ephemeral OS. That’s fine for the stated usecase, but there are /still/ tons of old usecases of POS terminals actually using up a console driver off a (maybe emulated) old Unix. And a viable migration path for many of those might well be multi-endian (but often indeed emulated) something.
Even early Windows NT handled multi-architecture binaries and could’ve run fat binaries! We only settled on x86 in mid 1990s!
Linus hates introducing a ton of complexity and opportunity for bugs for no upside. Pre-emptively adding runtime endianness switching to RISC-V when there's not even market demand for it 100% falls into that category. Adding runtime endianness switching to the RISC-V ISA also falls into that category.
Supporting big endian for big-endian-only CPUs does not fall into that category.
The first line in the email that I linked is pretty unambiguous:
> Oh Christ. Is somebody seriously working on BE support in 2025?
Followed by Eric:
> And as someone who works on optimized crypto and CRC code, the arm64 big endian kernel support has always been really problematic. It's rarely tested, and code that produces incorrect outputs on arm64 big endian regularly gets committed and released. It sometimes gets fixed, but not always; currently the arm64 SM3 and SM4 code produces incorrect outputs on big endian in mainline
BE support is unambiguously best-effort (which is none in some cases).
No, the Kernel does not take BE seriously. Not sure why I have to quote from the mailing list when the URL was a significant portion of my comment text - it directly contradicts your assertion on multiple fronts.
That's about someone adding BE support to an architecture which previously didn't have it and therefore has no need for it. If you improve BE support in the kernel in order to fix or improve something on a supported BE-only architecture, I guarantee that Linus would have no qualms about it.
BE support in ARM is poor because ARM is primarily a LE architecture and almost nobody needs ARM BE.
Linux still supports BE for several targets; his point, I think, was that no one uses RISC-V as BE except maybe in an academic setting. I don't think LLVM or GCC will even target BE, so I'm not sure how they were going to compile those mods anyway.
AFAIK, this is probably the easiest way to test BE on hardware (if you need that for some reason) - NetBSD on a Raspberry Pi running in BE mode is easy to use. (EDIT: Actually the more important thing is that it's cheap and easy to acquire)
To use big endian on real-world systems. And one of the reasons to use big endian is because diversity helps to find bugs in code. A bug that might result in a hidden off-by-one on little endian might crash on big endian.
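Not the off-by-one case mentioned above, but a related and very common kind of bug that little endian hides completely. This is only an illustrative sketch; the function names are made up.

```c
#include <stdint.h>
#include <string.h>

/* Buggy: reads only the FIRST byte in memory of a 32-bit value.
 * On little endian that happens to be the least significant byte, so
 * for small values the bug is invisible. On big endian it is the most
 * significant byte, so the result is wrong straight away. */
uint8_t low_byte_buggy(const uint32_t *value)
{
    uint8_t first;
    memcpy(&first, value, 1);
    return first;
}

/* Portable: works on the value, not on its memory representation. */
uint8_t low_byte_portable(const uint32_t *value)
{
    return (uint8_t)(*value & 0xffu);
}

/* Runtime probe of the host byte order, used only to describe what
 * the buggy version will do on this machine. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

For a value of 42 the buggy version returns 42 on a little-endian host and 0 on a big-endian one, which is the sort of thing a BE test box surfaces on the first run.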
Wouldn't that only matter if the bug has no effect on little endian?
Otherwise you don't need the other endian to confirm it?
Or are you saying to test their software before it goes out to big endian devices, which doesn't answer the question as to why someone would want to use it on those end devices?
> Wouldn't that only matter if the bug has no effect on little endian?
I don't know whether this logic applies to this specific sort of bug, but there is a long history of bugs that "don't matter" combining with other bugs to become something that matters. Therac-25 was a bug that didn't matter on the hardware it was developed for, but then the software that "worked fine" got repurposed for another hardware platform and people died. If there's an easy way to identify an entire class of bugs it seems like a good idea to do it.
Yes. The LEON series of microprocessors is quite common in the space industry. It is based on SPARC v8, and SPARC is big-endian. And also, yes, SPARC v8 is a 33-year-old 32-bit architecture; in space we tend to stick to the trailing edge of technology.
We’re stuck with big endian forever because of network byte order. There will probably always be a niche market for BE CPUs for things that do lots of packet processing in software.
For anything that is a bitstream on a slow processor, BE has the advantage of being simpler (think in-order processing). For anything else it does not matter, thanks to caches and because adding a few more FETs here and there to convert between your preferred format and the arriving format is a non-issue.
(though for debugging hex encoded data I still prefer BE but that is just a personal preference.)
From first hand experience, swapping the endianness is a non-issue in network processing performance-wise (it is headache-wise though). When processing packets in software, the cost is dominated by the following:
- memory bandwidth limits: for each packet, you do pkt NIC -> RAM, headers RAM -> cache, process, cache -> RAM, pkt RAM -> NIC. Oh and that's assuming you're only looking at headers for e.g. routing; performing DPI will have the whole packet do RAM -> cache.
- branch predictor limits: if you have enough mixed traffic, the branch predictor will be basically useless. Even performing RPS will not save you if you have enough streams
So yeah, endianness is a non-issue processing-wise. All the more so since one of the most expensive operations (checksumming) can be done on a LE CPU without swapping the byte order.
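The checksum point can be sketched like this: the 16-bit ones' complement sum used by IP/TCP/UDP gives the same bytes in memory whether you add the words in native order or in network order. A simplified sketch that assumes an even length and a short buffer; function names are made up.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Sum the 16-bit words exactly as they sit in memory (native order)
 * and store the result in native order too. No byte swapping at all.
 * Assumes len is even and small enough not to overflow the accumulator. */
void checksum_native(const unsigned char *data, size_t len,
                     unsigned char out[2])
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2) {
        uint16_t word;
        memcpy(&word, data + i, 2);
        sum += word;
    }
    uint16_t csum = (uint16_t)~fold16(sum);
    memcpy(out, &csum, 2);
}

/* Reference version: interpret every word as big endian (network
 * order) and store the result as big endian. */
void checksum_be(const unsigned char *data, size_t len,
                 unsigned char out[2])
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    uint16_t csum = (uint16_t)~fold16(sum);
    out[0] = (unsigned char)(csum >> 8);
    out[1] = (unsigned char)(csum & 0xffu);
}
```

Both functions write the same two bytes on either kind of host, because swapping the bytes of every input word just swaps the bytes of the ones' complement sum.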
Even assuming this does have a measurable performance effect for the kind of processors you run Linux on (as opposed to something like a Cortex-M), you only need to have load-big-endian and store-big-endian instructions.
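In C terms, "load-big-endian" and "store-big-endian" can be written portably with shifts, roughly as below. The helper names are made up; optimizing compilers often turn this pattern into a single load or store plus a byte swap (or a dedicated instruction where the ISA has one), though that is not guaranteed.

```c
#include <stdint.h>

/* Read a 32-bit big-endian value from a byte buffer, independent of
 * the host's byte order. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Write a 32-bit value to a byte buffer in big-endian order. */
void store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

Code written against helpers like these never needs to know what the host endianness is, which is most of the headache gone.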
They're a consultancy: ultimately they do things because either some company paid them to do it, or because they think it will help bring in future business.
There are definitely companies out there who want to run "wrong endian" configs -- traditionally this was "I have a big endian embedded networking device and I want to move away from a dying architecture (e.g. MIPS or PPC) but I really don't want to try to find all the places in my enormous legacy codebase where we accidentally or deliberately assumed big endian".
Personally I'm not in favour of having the niche usecase tail wag the general toolchain dog, and I think that's the sentiment behind Linus's remarks.
Huh. I thought the article was vague on what exactly these extensions permit, so I'd thought I'd look up the GNU documentation. Surprisingly, it [1] was rather vague too!
The only concrete example is:
Accept some non-standard constructs used in Microsoft header files.
In C++ code, this allows member names in structures to be similar to previous types declarations.
The important one is "Unnamed Structure and Union Fields"[1], in particular unnamed structs and union fields without a tag.
ISO C11 and onward allows for this:
    struct {
        int a;
        union {
            int b;
            float c;
        };
        int d;
    } foo;
In the above, you can access b as foo.b. In ISO C11, the inner struct/union must be defined without a tag. Meaning that this is invalid:
    struct {
        int a;
        union bar {
            int b;
            float c;
        };
        int d;
    } foo;
As is this:
    union bar {
        int b;
        float c;
    };
    struct {
        int a;
        union bar;
        int d;
    } foo;
-fms-extensions makes both of the above valid. You might be wondering why this is useful. The most common use is for nicer struct embedding/pseudo-inheritance.
The "vendor" in this case is GCC and there are plenty of non-standard GCC extensions in use today. The Linux kernel was long built as gnu89, not C89, after all. I doubt you can even compile a usable Linux kernel sticking purely to the official C standard.
The same tricks are also enabled in the plan9 extensions, but enabling plan9 extensions also enables a bunch of other tricks and those changes landed later than the Microsoft ones. Aiming to enable plan9 instead probably could've saved the Linux kernel half a decade of "Microsoft bad" comments, though.
Note that this cast would be valid without the MS extensions too: you can always cast a pointer to a struct to a pointer to its first member and vice versa. What the MS extensions allow you to do is to just do `c->i` directly, without having to name the parent.
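For reference, the standard-C half of that looks roughly like the sketch below. The struct and function names are made up, and the extension form is only shown in a comment so the snippet compiles without any special flags.

```c
/* Hypothetical "parent" type and a "child" that embeds it as its
 * first member, using a named member as plain ISO C requires. */
struct base {
    int i;
};

struct derived {
    struct base parent;   /* must be first for the cast below */
    int j;
};

int base_value(const struct base *b)
{
    return b->i;
}

int derived_value(const struct derived *d)
{
    /* Valid in standard C: a pointer to a struct, suitably converted,
     * points to its first member. */
    return base_value((const struct base *)d);
}

/* With -fms-extensions the embedding could instead be written as
 *
 *     struct derived { struct base; int j; };
 *
 * and the parent's field would be reachable as d->i rather than
 * d->parent.i. The cast itself would be unchanged. */
```

So the extension buys shorter member access, not the pointer conversion.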
> Some implementations have permitted anonymous member-structures and -unions in extended C to contain tags, which allows tricks such as the following.
    struct point { float x, y, z; };
    struct location {
        char *name;
        struct point; // inheritance in extended C, but
                      // forward declaration in C++
    };
> This proposal does not support that practice, for two reasons. First, it introduces a gratuitous difference between C and C++, since C++ implementations must treat the declaration of point within location as a forward reference to the type location::point rather than a definition of an unnamed member. Second, this feature does not seem to be used widely in applications, perhaps because it compiles differently in extended C vs. C++.
If C and C++ standardization had included both languages since the beginning, compatibility could have been a thing but it didn't so the languages have diverged since C-with-classes.
I don't understand why the C standard has to get bogged down with bizarro-world-C restrictions from C++.
You do realize there are a lot of projects written in C, right? Including Linux and most of its programs / utilities that you may be using.
I have new projects written in C, too, and you can do a lot to check for potential bugs using various flags to GCC / Clang, among other things like cppcheck and the rest.
No, people should not give up on C. C is really good to know, for many reasons... even if you are not going to use it.
I believe GP meant that the idea of "C/C++", that is, of C and C++ as compatible languages, should be abandoned. GP thinks that they should diverge more when necessary; neither of them should be held back for compatibility with the other.
Now the reasoning isn't present in the patch but it probably is because they want step increments and -fms-extensions is a small-ish first step. Maybe -fplan9-extensions could make sense later, in a few years.
Plan 9 extensions would only require enough examples to justify and might not take years. Though your taking years assessment would be right if there's a dearth of kernel spots to add up where automatic pointer conversion for anonymous fields, or using the typedef name to access them, offer some improvement, not necessarily even a huge improvement.
Since with the Microsoft extension, it was just waiting until enough examples were woven into the discussion to overcome the back and forth that was preventing "biting the bullet".
Can you confirm whether or not anonymous member structures originated with the Plan 9 C compiler? I know I first learned of them from the Plan 9 compiler documentation, but that was long after they were already in GCC. I can't find when they were added to Microsoft's C compiler, but I'm guessing GCC's "-fms-extensions" flag is so named simply because it originated as a compatibility option for the MinGW project, and doesn't by itself imply they were a Microsoft invention. GCC gained -fms-extensions and anonymous member structures in 1999, and MinGW is first mentioned in GCC in 1997. (Which maybe suggests Microsoft C gained anonymous structure members between 1997 and 1999?)
Relatedly, do you know if anonymous member unions originate with C++, Plan 9 C, or elsewhere?
I would have liked the extension named after Plan9 more than this one after Microsoft. Not based on any ideology, mind you, but rather because the former is more powerful and allows this:
    struct parent { int a; } p;
    struct child {
        struct parent;
        int b;
    } c;
    void foo(struct parent *);
    foo(&p); // valid of course
    foo(&c); // also valid under plan9 extension, no casting!
Note that these are not the Microsoft "C Extensions", but the "Microsoft C Extensions" of the GNU Compiler Toolchain. I doubt MSVC supports -fms-extensions.
Extremely tangential: I maintain some of Rasmus's code. I've never met the man. I'd heard that kernel programmers were the "rockstar programmers of rockstar programmers", but I only grok it now.
His code is so clear, clean, concise, commented it feels divine in comparison to the drivel I subject myself to daily.
You almost have to wonder whether the past decisions to avoid this were based on the merits of the situation, or just based on the default hate of Microsoft. If it had been called "-fgnu-extensions" instead, would it have taken this long to enable?
"Almost", because -std=gnu11 is already used, so the answer seems to be right there.
That is an independent FOSS project to my knowledge.
> Knowing Microsoft, it will be both.
Possible. But then when they are content with both the kernel and the userspace, why should they switch at all? They must be wanting to replace some maintenance burden, otherwise they won't do it.
> They must be wanting to replace some maintenance burden
I'm sure they'd love to drop Windows as-is, certainly on the desktop, and let others have that burden (moving Gamers to a walled garden via XBox, let Apple and Linux have all the hardware and user support issues on desktops/laptops and phones, etc.), but momentum and backwards compatibility are massive problems and even ignoring those dropping Windows would just be too embarrassing.
Windows isn't the cash-cow it once was, it might even be a cost depending on how you massage the figures, the bread-winners for MS are currently: Azure, SQL Server, and Office. Office itself is part of the collection of things that would hold back a desktop OS exit for MS: there isn't a full port for any other OS and the online version is not feature-complete.
Sure, but developing both Win32/Linux and GNU/NT, means they would not get rid of anything. They need to keep the userspace API, so maybe they would favor Win32/Linux over GNU/NT. But then why should they get rid of the kernel, which is the least of their problems, isn't held back by API compatibility, is widely praised for its quality and supports a lot of things, which userspace doesn't. (fork, symlinks, etc.) What is the benefit of Win32/Linux over Win32/NT for Microsoft?
The alternative doesn't make sense either. There is no point in integrating their kernel into a different OS, when the result only is a different flavor of that OS. Why should anyone use GNU/NT over GNU/Linux, when it's still incompatible with all Windows software?
They could massively invest into Wine or an alternative and maybe also implement a shim for kernel modules. Or they could open-source Win32 and wait for it to merge with Wine, so that Win32/Linux becomes viable. Lastly they could spin off Windows OS into its own company and it would become just one of their targets for their software. This would likely also improve Windows, as the OS' main problem is forcing ads and bloatware into it. But why would Microsoft allow that?
-----
Arguably GNU/NT already exists with MSYS2/Cygwin. What they are missing is an OS package manager, integration into the user/process/permission system, a registry shim and a way to force random program installers to install into the FHS.
Agreed. As I said, in the words immediately after those you quoted:
> but momentum and backwards compatibility are massive problems
That, and loss of face.
> Lastly they could spin off Windows OS into its own company and it would become just one of their targets for their software.
That is an option that hadn't occurred to me, and it is closer to what I was meaning with “drop Windows as-is” than “drop parts of windows and replace with GNU or Linux”: drop windows desktop full-stop, concentrate on milking Azure platform income, Office subscriptions, and SQL Server licenses. I wouldn't envisage they'd drop everything, at least not immediately, as keeping a server subset alive to support more minor products like Exchange might be more practical than quickly porting them.
> This would likely also improve Windows, as the OS' main problem is forcing ads and bloatware into it. But why would Microsoft allow that?
Not quite sure what you are meaning there, but wrt ads and bloatware that boat has already sailed and MS is actively doing it, not just allowing it.
> Not quite sure what you are meaning there, but wrt ads and bloatware that boat has already sailed and MS is actively doing it, not just allowing it.
Exactly the opposite. A standalone Windows company wouldn't have a reason to force Copilot, Cortana, a Microsoft account, etc., since their only objective is to develop an OS. "But why would Microsoft allow that?" = "Why would they give up that power over the OS?"
> That, and loss of face.
Right now. But when people got used to the OS being shitty, nobody would be sad if it's gone. Maybe that's the plot with Windows 11 and why they already partially have given up with backwards compatibility. :-)
> You are right, Git for Windows is by random people.
In addition, Git for Windows and MSYS2 are different projects, but Git for Windows ships a stripped-down version of MSYS2, which is a bit annoying once you want to use anything else, as now you have two (incompatible) versions of MSYS2 around. I would like it if they would backport Git for Windows into MSYS2 proper.
You can also look at Managarm (which benefits from C++ here) for another approach to making this less fraught: https://github.com/managarm/managarm/blob/a698f585e14c0183df...