Errr, how exactly does this work? What's the underlying technology that makes it...

jnevill · on March 7, 2016

I'd like to know more about this too. Data storage and retrieval is such an important and integral part of an RDBMS, and the switch to Linux suggests that they've had to tear up a lot of their low-level logic. How well they've optimized storage for ext3/ext4 seems pretty important. It would be interesting to see what's changed and how they managed this.

oblio · on March 7, 2016

If Microsoft's past is any indication, most of their software is cross-platform, designed that way from the start. It's just that they never bothered to actually port it to anything other than Windows.

Of course, bits and pieces might be tied to a platform and the dependency chain might drag other parts down, but overall Windows, Office, etc. generally have some sort of abstraction at the lower level that could theoretically allow porting to any platform.

jnevill · on March 7, 2016

That's my concern though. When you look at other enterprise-sized RDBMSs, you find tight integration at the "machine" level. Oracle's storage, for instance, uses "blocks" which are sized based on the block size of the disk they are written to. This tight integration at such a low level is part of what makes these RDBMSs fast. I would imagine that, rather than running on an abstraction layer that can be built across many OSes, an RDBMS like SQL Server would be more tightly integrated with the machine it runs on in order to maximize performance and use of disk space. If so, I would really love to dig through what that means for SQL Server.

telotortium · on March 7, 2016

Might that not imply that the implementation is only loosely tied to a particular OS? Taking your example, as long as there's a way to get the block size for a disk, there's not much more OS-specific code to deal with -- in particular, adapting your code to the file system doesn't need to be done since the DB will essentially create its own anyway. Similarly, only a few threading primitives need to be used -- the DB isn't going to tie itself heavily to the vagaries of the async IO implementation, which does tend to differ heavily and pervasively between platforms.
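To make the block-size point concrete, here is a minimal sketch in Python of how small that OS-specific piece could be. It says nothing about how any real database does it: `fs_block_size`, `page_size_for`, the 4096-byte fallback, and the 8192-byte minimum page are all hypothetical names and values chosen for illustration. The only OS call is `os.statvfs`, which exists on POSIX systems.

```python
import os

DEFAULT_BLOCK_SIZE = 4096  # assumed fallback for this sketch


def fs_block_size(path="."):
    """Return the block size reported for the filesystem holding `path`."""
    if hasattr(os, "statvfs"):  # POSIX systems (Linux, macOS, BSD)
        reported = os.statvfs(path).f_bsize
        if reported > 0:
            return reported
    # Platforms without statvfs would need their own call here;
    # this sketch simply falls back to a common default.
    return DEFAULT_BLOCK_SIZE


def page_size_for(path=".", min_page=8192):
    """Pick a page size that is a whole multiple of the block size."""
    block = fs_block_size(path)
    # Round min_page up to the next multiple of the block size.
    return ((min_page + block - 1) // block) * block


if __name__ == "__main__":
    print("block size:", fs_block_size("."))
    print("page size: ", page_size_for("."))
```

Everything above the `fs_block_size` call is portable; only the body of that one function would change per platform.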

daigoba66 · on March 7, 2016

SQL Server ships with almost its own OS implementation, aptly named SQLOS. If I had to guess, the interface between SQLOS and Windows is relatively thin (and just IO, Threads, and Memory Management).

There are a lot of other bits, such as Integrated Security, Windows Perf Counters, and Windows Event Log, that are probably stripped out.

But I also suspect that the Linux implementation is based on the implementation that they use for Azure SQL Databases.
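For what a thin IO/threads/memory boundary could look like, here is a rough sketch in Python. It is a guess at the general shape only and not SQLOS's actual interface: `HostPlatform`, `PosixHost`, `write_page`, `read_page`, and the 8192-byte `PAGE_SIZE` are hypothetical names invented for this example. The backend uses `os.pread` and `os.pwrite`, which are available on Unix-like systems.

```python
import abc
import os
import threading

PAGE_SIZE = 8192  # illustrative page size, an assumption of this sketch


class HostPlatform(abc.ABC):
    """Hypothetical boundary between the engine and the host OS."""

    @abc.abstractmethod
    def read_at(self, fd, length, offset):
        """Read up to `length` bytes from `fd` starting at `offset`."""

    @abc.abstractmethod
    def write_at(self, fd, data, offset):
        """Write `data` to `fd` at `offset`; return bytes written."""

    @abc.abstractmethod
    def spawn(self, fn, *args):
        """Run `fn(*args)` on a new worker; return a joinable handle."""

    @abc.abstractmethod
    def allocate(self, nbytes):
        """Return a zeroed, writable buffer of `nbytes` bytes."""


class PosixHost(HostPlatform):
    """One possible backend, built on POSIX-style calls."""

    def read_at(self, fd, length, offset):
        return os.pread(fd, length, offset)

    def write_at(self, fd, data, offset):
        return os.pwrite(fd, data, offset)

    def spawn(self, fn, *args):
        worker = threading.Thread(target=fn, args=args)
        worker.start()
        return worker

    def allocate(self, nbytes):
        return bytearray(nbytes)


def write_page(host, fd, page_no, payload):
    """Engine-side code: it only ever talks to the HostPlatform."""
    if len(payload) > PAGE_SIZE:
        raise ValueError("payload larger than one page")
    page = host.allocate(PAGE_SIZE)
    page[: len(payload)] = payload  # rest of the page stays zeroed
    return host.write_at(fd, page, page_no * PAGE_SIZE)


def read_page(host, fd, page_no):
    """Fetch one page by number through the host interface."""
    return host.read_at(fd, PAGE_SIZE, page_no * PAGE_SIZE)
```

In this shape, porting would mean writing a second `HostPlatform` subclass, while `write_page` and `read_page` stay untouched.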

trollied · on March 7, 2016

SQL Server was a port of Sybase initially. Think of it as the RDBMS going full circle back to its POSIX roots.

hoodoof · on March 8, 2016

What language was Sybase written in?

Back then a lot of that stuff was C. Hard to imagine Microsoft continued writing SQL Server in C.

mrweasel · on March 8, 2016

Why wouldn't they write it in C? I'm pretty sure it's a mix of C and C++. PostgreSQL is written in C as well, as are most of the SQL servers people use. Or at least a mix of C and C++.

I don't really see many other choices being available. Sure, we have Rust, Go and other languages now, but most of the current SQL servers are 20 years old or more by now. C is a well-known language, there is good tooling, the compilers generate fast code, and you can control memory very precisely.

serge2k · on March 7, 2016

What do you mean?

hoodoof · on March 7, 2016

Well, presumably SQL Server is written in C++ talking to Windows APIs. The Windows APIs aren't present on Linux except through projects like Wine, and presumably that is not the basis of this code.