
I would like to see an in-depth treatise explaining why existing bytecode VMs (LLVM, the Java VM, and the Ecma CLR) were never seriously considered for the world of browsers. These VMs already exist for numerous platforms, have been optimized to death, already have a plethora of languages that compile to them, and, besides the Java VM, are open source (the Ecma CLR exists in the Mono project). I've looked at WebAssembly and I don't understand why it needed to be reinvented from scratch and why it needs to be so limited. We could already be writing web code in Rust, Java, C#, Python, and heck even Haskell, if we had just done that. I know that I'm skipping over the engineering effort required to make this happen, but I get the sense that the engineering effort is not the stumbling block. I want to know the details of what is.


My guess is that none of those bytecode VMs were designed with the explicit goal of running untrusted code at global scale in a rock-solid sandbox.

If anything, I expect those existing VMs to slowly be replaced by WebAssembly due to how crucial and complicated that very specific sandbox requirement is - and how useful that is once you have it working reliably.

Personally I never want to run untrusted code on any of my computers outside of a robust sandbox. I look forward to a future where every application I might install runs in a sandbox that I can trust.


From the day WebAssembly was announced:

https://news.ycombinator.com/item?id=9732827

The Web is an evolving system, too large and long-lived for any single company, stable consortium, or standards body to do the deed, so none of Java, Flash (AVM), .NET/CLR, NaCl/PNaCl, Dart, or the others I have forgotten about ever had a chance to take over.

JS got out first and evolved through several jumps into https://asmjs.org/, a typed (as in static types) subset suitable, with AOT+JIT techniques, for hosting near-native-speed code such as Unreal Engine 3. https://brendaneich.com/2013/03/the-web-is-the-game-platform...

Java was mismanaged as a plugin (and only ever a plugin -- no deep or even shallow browser integration worth talking about) by Sun, who tried getting it into Windows while Microsoft was killing Netscape (Microsoft then killed Java in Windows and pulled the trigger on .NET; Oracle later bought Sun).

Flash had its day but fell to HTML5 and fast JS; Adobe threw in the towel well before the Wasm announcement and even salted the earth re: good Flash tools instead of retargeting them at the Web.

Google was a house divided all along but had absolutely no plan for getting PNaCl supported by Apple, never mind Mozilla or Microsoft. I told them so, and still get blame and delicious tears to drink as I sit on my Throne of Skulls, having caused all of this by Giant-Fivehead mind control (testimony from one of my favorite minions at https://news.ycombinator.com/item?id=9555028).


On why no extant 1995 language and why no bytecode:

https://news.ycombinator.com/item?id=1905155

https://kripken.github.io/talks/2020/universal.html#/ (from Alon Zakai in 2020)


"Secure Java" is something I recall hearing decades ago. No idea if it still exists.

The more important thing to consider, however, is the fact that CLR, JVM, etc. provide internal memory safety whereas Wasm runtimes don't.

e.g. a C program that indexes far enough out of bounds on an array will typically segfault on a native target, but that runtime trap does not necessarily occur on a wasm target: the access may simply land elsewhere in the module's linear memory. That is to say, the program in the sandbox can have totally strange runtime behavior -- still defined behavior according to wasm -- even though the program has undefined behavior in the source language. In JVM languages this can't really happen, because every array access is bounds-checked.
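A minimal sketch of the difference (the adjacency of the two globals is toolchain-dependent, so treat this as illustrative only): compiled to wasm, the out-of-bounds store below can silently clobber a neighboring global inside linear memory instead of trapping.

    #include <stdio.h>

    /* Two globals that a wasm toolchain may place next to each other
       in the module's linear memory (layout is not guaranteed). */
    int arr[4] = {0, 1, 2, 3};
    int secret = 42;

    int main(void) {
        /* Undefined behavior in C: writes one element past the end of arr.
           Natively this may crash; inside a wasm module it stays within
           the sandbox's linear memory and may just overwrite `secret`. */
        arr[4] = 1337;
        printf("secret = %d\n", secret); /* may print 1337 under wasm */
        return 0;
    }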


SecurityManager? Java's current direction (using the word "integrity" rather than "security", but it seems relevant) looks interesting to me: https://news.ycombinator.com/item?id=41520246


https://cybercultural.com/p/1995-the-birth-of-javascript/

> As told in JavaScript: The First Twenty Years, Brendan Eich joined Netscape in April 1995.

> [..]

> However, Eich didn’t think he’d have to write a new language from scratch. There were existing options available — such as the research language, Scheme, or a Unix-based language like Perl or Python. So when he joined, Eich “was expecting to implement Scheme in the browser.” But the increasingly fractious politics of the software companies of the day (it was, basically, everyone against Microsoft) soon saw the project take a more creative turn.

> On 23 May 1995, Sun Microsystems launched a new programming language into the world: Java. As part of the launch, Netscape announced that it would license Java for use in the browser. This was all well and good, but Java didn’t really fit the bill for the web. Java is a general-purpose programming language that promised Write Once, Run Anywhere (WORA) functionality, but it was too complicated for web designers and other non-programmers to use. So Netscape decided it needed a scripting language, which was a trendy term at the time for a smaller, easier to learn programming language.

There's a whole lot more interesting stuff but I think that part directly answers most of what you're wondering.


None of them started out with web security in mind.

Look at Java bytecode, and you'll see it features things such as a goto with an arbitrary offset: https://en.m.wikipedia.org/wiki/List_of_Java_bytecode_instru...

They had to build a verifier that attempts to ensure the bytecode isn't doing anything bad. That proved to be fairly difficult, and comes at a considerable cost.


But it's not as if security concerns are specific to the Web. Look at the vulnerabilities found in CPUs over the last decade or so. Security is necessary no matter what the delivery medium, so I don't see why this is a rationale for reinventing the wheel.


They genuinely spent years trying to make Java more secure for the web. That was entirely new effort.


And?


Perhaps reread your comment and mine, and you'll see there is a relationship between statements.


They both mention security. That's about it as far as I can tell.


NIH aside, CIL is probably ultra-overkill for browser-based scenarios. It implements a complex type system with all sorts of high-level features that significantly complicate the runtime/compiler. That makes it drastically easier to target, but much harder to implement.

I'm not a huge fan of WASM, but it's easy to see that the authors clearly did not want to leave control in the hands of Microsoft or Oracle (and as a result all of us are hostages to Google instead, because of the evil that is Chromium).

https://ecma-international.org/publications-and-standards/st...


They were. There are lots of reasons why it turned out how it turned out; basically a local minimum in the gradient descent. Computers were much slower, for one. The JVM wasn't open source at the time, for another. NIH is another 100 reasons.


A core requirement of WebAssembly was that (ignoring I/O for the moment and considering only the computational core) you should be able to run arbitrary existing code on it, and the effort involved in getting it working should be comparable to porting to a new architecture, not to a new programming language. What this particularly meant, in practice, was that it needed to be a good compilation target for C and C++, since most code is written either in those languages or in interpreted languages whose interpreters are written in those languages. (It also needs to support languages for which that's not true, like Go, Rust, and Swift, but once you've got C and C++, those languages don't pose major additional conceptual difficulties.)
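To make "porting to a new architecture" concrete, here is a hedged sketch of compiling plain C straight to wasm with clang's built-in wasm32 backend (the file name is illustrative; real projects typically reach for Emscripten or the WASI SDK to get a libc):

    /* add.c -- pure computation, no I/O, so no libc is required.
     *
     * Build (requires clang and lld built with wasm support):
     *   clang --target=wasm32 -nostdlib -Wl,--no-entry -Wl,--export-all \
     *         -O2 -o add.wasm add.c
     */
    int add(int a, int b) {
        return a + b;
    }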

The JVM and CLR are poor compilation targets for C and C++, because those languages weren't designed to target those runtimes and those runtimes weren't designed to run those languages. (C++/CLI isn't C++.) It's possible to get something working, and a few people have tried, but you run into a lot of impedance mismatches and compatibility issues. I think you would see people run into a lot more problems trying to get their code running on the JVM or CLR than they in fact run into trying to get it running on WebAssembly. (Though I think the CLR is less bad about this than the JVM.)

As for the idea of using LLVM bitcode as an interchange format, we don't have to guess how that would have gone, because it was actually tried! Google implemented this in Chrome and called it PNaCl, and some sites and extensions relied on it for a while. They ultimately withdrew it in favor of WebAssembly. I don't understand all the reasons why it failed, but I think part of the problem is that it ran into a bunch of "the spec is whatever LLVM happens to do" type issues that were real problems for would-be toolchain authors and made the other browser vendors (including Apple, LLVM's de facto primary corporate sponsor) reluctant to support it. WebAssembly has a relatively short and simple standard that you can actually read; writing a WebAssembly interpreter is an undergraduate exercise, though of course writing a highly performant one is much more work.
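To give a sense of scale for the "undergraduate exercise" claim, here is a toy evaluator for a three-instruction slice of wasm's opcode space (i32.const, i32.add, end). It assumes the const immediate fits in a single signed LEB128 byte and skips validation, modules, traps, and everything else a real interpreter needs:

    #include <stdint.h>
    #include <stdio.h>

    /* Toy stack machine over a tiny subset of wasm opcodes. */
    int32_t eval(const uint8_t *code, size_t len) {
        int32_t stack[64];
        int sp = 0;
        for (size_t pc = 0; pc < len; pc++) {
            switch (code[pc]) {
            case 0x41: {                  /* i32.const <sleb128> */
                int32_t v = code[++pc] & 0x7F;
                if (v & 0x40) v -= 0x80;  /* sign-extend the 7-bit payload */
                stack[sp++] = v;
                break;
            }
            case 0x6A: {                  /* i32.add */
                int32_t rhs = stack[--sp];
                stack[sp - 1] += rhs;
                break;
            }
            case 0x0B:                    /* end */
                return stack[sp - 1];
            }
        }
        return 0;
    }

    int main(void) {
        /* function body of (i32.add (i32.const 2) (i32.const 3)) */
        const uint8_t body[] = {0x41, 0x02, 0x41, 0x03, 0x6A, 0x0B};
        printf("%d\n", eval(body, sizeof body)); /* prints 5 */
        return 0;
    }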

Also, as far as I can tell, LLVM hasn't been optimized at all for the use case of runtime code generation, where the speed of the compiler matters about as much as the speed of the generated code. The biggest dynamic language I know of that uses LLVM is Julia, which is a decently big deal, but the overwhelming majority of LLVM usage is ahead-of-time compilation of languages like C, C++, Swift, and Rust.

On a bigger-picture note, I'm not sure I understand why adopting an existing bytecode language would have made things easier at all. Yes, it would have been much easier to reuse existing Java code if the JVM had been adopted, or existing C# code if the CLR had been adopted, but those options are mutually exclusive; the goal was something that would work at least okay for all the languages. Python doesn't have a stable bytecode format, and Rust and Haskell compile to LLVM bitcode (which LLVM has no problem lowering to WebAssembly, since WebAssembly was designed to make that straightforward), so I don't see how those languages are in any way disadvantaged by the choice of WebAssembly as the target bytecode language instead of some alternative.

Or are your concerns about I/O? That's a bigger can of worms, and you'd need to explain how you imagine this would work, but the short version is that reusing the interfaces that existing OSes provided would not have worked well, because the browser has a different (and in many ways better) security model.


This is not true. CIL could be an excellent compilation target for C++ and was quite literally designed with that in mind. C# was inspired as much by C++ as it was by Java, and the CLR back then was designed with consideration for C++/CLI, which exists even today. You can't effectively express C++ code in JVM bytecode, but you absolutely can in CIL. Nowadays you can even express most of Rust's generics in CIL, retaining monomorphization, save for zero-sized structs and other edge cases.


I mostly don't mind JavaScript; the only thing for me is the number data type and the lack of an int. The other annoying part is the lack of a standard library, which is how we end up with left-pad crap.


What disqualifies "the JVM" (usually referring to HotSpot implementations) from being considered open source? Are you talking about OpenJ9 or something else?


Java is as open-source as it gets (its reference implementation, OpenJDK, has the same license as the Linux kernel).

And it was used by some browsers; there was just no consensus between different vendors, due to politics. The problem largely solved itself by... only one vendor remaining: Chromium.



