Cheerp 3.0: C++ compiler for the Web, now permissively licensed (leaningtech.com)
195 points by apignotti on March 14, 2023 | 43 comments


> Cheerp directly uses the LLVM bytecode format as the intermediate representation, for both object files and libraries.

The paragraph with this might be a little confusing. It suggests that this is a benefit of Cheerp, but upstream LLVM (and Emscripten, which uses upstream LLVM) can do the exact same thing if you build with -flto to enable LTO.

The benefit of upstream LLVM is that it can do both LTO and non-LTO builds. Non-LTO builds link much, much faster in a large project, since the linker is only concatenating wasm object files, just like a normal linker. That can matter a lot for CI times and local debugging etc.


Apparently LTO = "Link-Time Optimisation" ? https://en.wikipedia.org/wiki/Interprocedural_optimization#W...



Thanks for the input. This paragraph is indeed a selling point of Cheerp. "if you build with -flto to enable LTO" is only true if you run on Linux since LLVM LTO support on Windows and Mac is limited.


That is not true for compiling to wasm.

You can use upstream LLVM (and Emscripten, which uses it) with -flto normally on all platforms. It is basically cross-compiling, so it does not matter at all on what platform you are building, and Windows/Mac/Linux all work the same.

Just to confirm that, here is an Emscripten CI run on MacOS, where you can see among other tests ones that build with lto (search for "lto" in the "run tests" tab):

https://app.circleci.com/pipelines/github/emscripten-core/em...


Limited in what way? LLVM LTO works seamlessly if you use LLD on both macOS and Windows; that's what Chrome does, for example. LLVM LTO also works pretty seamlessly with ld64 on macOS; the linker needs libLTO, but the driver takes care of that for you.


I'm waiting for the same thing to happen to CheerpX [0].

Would be great to be able to sandbox x86 code trivially.

[0]: https://news.ycombinator.com/item?id=25646022 [0]: https://docs.leaningtech.com/cheerpx/


CheerpJ is the one I'm waiting for. That thing is wicked cool for Java.


While you wait for CheerpJ, please check out TeaVM:

https://teavm.org/

It has a permissive Apache license and numerous success stories:

https://frequal.com/TeaVM/TeaVmBasedSites.html

There is also a full SPA toolkit called Flavour Plus for making pure Java, web-native, full-stack web apps:

https://frequal.com/FlavourPlus/


Thanks, I've checked it in the past and recognize all the effort you've placed into the platform. There are reasons why it is mostly ignored:

+ website itself is difficult to find the hello world example

+ after a long time digging one finds the hello world, but it is hosted somewhere else with (again) very scarce documentation on how to get it started.

+ the number of examples on the site is 2, and only one is for the Java language.

Please invest more in newbie first steps. It isn't that we (end users) are lazy; it is just that our time is precious too, and having this kind of information available makes a huge difference in gaining the confidence to quickly grasp the basics and grow into more complex cases. This isn't possible when so much time is lost hunting around the site for info, with just 1 example as a showcase. (I know more examples are on GitHub, but again I had to dig to find them.) Even that second link you mention has interesting examples but, again, they are not even listed on the main page.

If you compare to similar projects like GraalVM, you can notice they include snippet examples after snippet examples for end-users to dive quick and deep on what can be done: https://www.graalvm.org/latest/reference-manual/wasm/

My apologies for the long answer. It just pains me to see a good project and a motivated person like you answering here but still with the handicap of a site that pushes end-users away.


> + website itself is difficult to find the hello world example

I opened the website, saw a docs link that had a "Getting started" section, and started wondering what you were talking about. Then I realised all it gives is Gradle build instructions and a Maven build command.

It'll tell me how to interact with javascript, use coroutines, all sorts of stuff, without actually giving me a very basic "Hello World" sort of example. I don't think I've ever come across a language / runtime that hasn't at least had a very simple example or two within their own documentation.

https://github.com/konsoletyper/teavm/tree/master/samples/he... seems to be where they have their hello world, but ... what am I even looking at? You're right, this is shockingly bad by way of a newbie experience.


> You're right, this is shockingly bad by way of a newbie experience.

Sorry, but what exactly were you expecting? I would improve the documentation, but to me it looks ok. What sort of example do you need? What's wrong with the Hello World link you posted, i.e. https://github.com/konsoletyper/teavm/tree/master/samples/he... ?


Almost every language has a simple walkthrough that goes step by step and explains what the parts are, or gets you to the point of having your first working application.

Here you go "here's how to set up your build platform".. and good luck?

The content under that github repo doesn't explain anything. It's just a dump of code. Why are the various parts there, how does this interact with TeaVM, how does this interact with the web page and why?

The documents are written assuming knowledge, and the code does nothing to actually explain what is going on, how things should be laid out or anything. There's no readme, and no comments other than the license blurb.

How does someone with no familiarity with TeaVM at all, know how to even get started?

I appreciate writing documentation sucks, but if you want your project to take off, it's not enough for it to be good on a technical level. You need to be hand holding through the entire initial process. One of the things that is consistent right throughout technology is that the best solution isn't the one that gets chosen. It's the one that has the lowest barrier of entry, and the best user experience.

One of the things that catapulted Ruby from being some obscure language to immense popularity was the first Ruby on Rails tutorials that had you making a functional blog site, taking you from first line of code to functional site pretty quickly. Ruby wasn't that special or unique as a language, but suddenly this framework came along on it that was transformative (compared to the general state of the market) that made it easy to build a website and then start modifying it.

By way of contrast with your documentation, searching for rust webassembly takes me to https://www.rust-lang.org/what/wasm which has a link to introductory documentation https://rustwasm.github.io/docs/book/introduction.html Then in the "Hello World" section, https://rustwasm.github.io/docs/book/game-of-life/hello-worl..., it explains what each file is, why, what's happening in them, and ultimately leads to you having a working example website.

To take another example, I just googled python and webassembly, which took me to pyodide's website. The first section of their site after the introduction to what pyodide is, https://pyodide.org/en/stable/usage/quickstart.html walks you through how pyodide works in the browser, how to get your code in and functional, and gets you to a place where you've created a functional page.

Ruby's stuff around webassembly is pretty fresh and documentation doesn't seem to be linked from their website, but from a quick online search https://ruby.github.io/ruby.wasm/ appears to be there. Same story as with pyodide, albeit the documentation for ruby wasm is really bare right now, but it still gives you a quick example, explains how to put your code in to the site, or as WASI, and gets you up and running.

If you want TeaVM to take off, you have to make it easy for people to get started. Absurdly so. Walk them step by step to having a functional website leveraging it. Explain what's going on at each stage, and why various lines of code are needed. What they're doing. Build up their understanding that you already have.


A good introductory article is this one from Java Magazine:

https://blogs.oracle.com/javamagazine/post/java-in-the-brows...

A good collection of specific, detailed how-to's for various use cases is in Tea Sampler here:

https://frequal.com/tea-sampler/

A more general collection of education and advocacy articles is here:

https://frequal.com/TeaVM/

I do agree that more needs to be written about this mature, permissively-licensed tool that makes Java work in modern browsers. Hopefully these links are a start.


Do I understand correctly that you are the one behind TeaVM? If so, I would be interested in helping out with the documentation if you can reply to this message with your email.


Hello. No, I'm the one behind TeaVM. You can write me on `info at teavm.org` or send me a message via Gitter (https://app.gitter.im/#/room/#teavm_Lobby:gitter.im) or post a message to Google Groups or on Github discussions (https://github.com/konsoletyper/teavm/discussions)


If you can include those links on the main site it would certainly help many others with the same difficulties.


CheerpJ or TeaVM would be amazing if they were able to convert pdfbox to wasm as a library that runs on WASI and the web.



It's not open source


Yes, unfortunately, the more advanced products are much more restricted.


Related:

Cheerp: A C++ Compiler for the Web - https://news.ycombinator.com/item?id=15727346 - Nov 2017 (9 comments)

Cheerp 1.2 – C++ to JavaScript: faster than Emscripten with dynamic memory - https://news.ycombinator.com/item?id=11018516 - Feb 2016 (21 comments)

Cheerp 1.1 – C++ for the Web with fast startup times, dynamic memory - https://news.ycombinator.com/item?id=10524784 - Nov 2015 (20 comments)

Cheerp – A C/C++ compiler for web applications - https://news.ycombinator.com/item?id=8167536 - Aug 2014 (49 comments)


> You can download Cheerp here. For Debian/Ubuntu, consider using our PPA

> https://launchpad.net/~leaningtech-dev/+archive/ubuntu/cheer...

They recommend using an Ubuntu PPA with Debian? That's... definitely a bold strategy. Let's hope it works out for them.

I mean, given that they themselves suggest doing so, I take it that the PPA has been properly tested on Debian, and that they'll answer support requests and help resolve any issues if they crop up? There aren't any Debian releases listed in the "Display sources.list entries for:" dropdown.

Hmmm....following a couple of documentation links, I get to:

https://docs.leaningtech.com/cheerp/Ubuntu-Debian-installati...

Which states "Manually edit sources.list (works on Debian testing / stretch)". Huh. testing / stretch.

Yup, I'm sure it's fiiiiine.


It can actually manipulate the DOM by writing JS code.

This is really amazing.

> The question is, what will you build with Cheerp?

I don't really know, for now all I want is executing python in the browser like brython, and see if WASM can make things better, at least give more elegant errors.

But other than that, I'd rather dump HTML as a format and have something better suited for it. I already thought about a replacement for HTML, because there are so many things to remove in HTML.


> It can actually manipulate the DOM by writing JS code. This is really amazing.

Why is that amazing? Anyone can manipulate the DOM by writing javascript.

> I don't really know, for now all I want is executing python in the browser

You might want that, but do people using what you make want that? Python is already orders of magnitude slower than a native program, it's going to be even slower running in a browser.


Manipulating the DOM with C++ means one doesn't have to write JS at all, and can use C++.

Even things like TS still rely on heavy machinery to translate TS to JS.

I don't really know how TS works and if it puts the generated JS in cache.

Well brython is already quite fast already, so I don't think a WASM version would be worse.

And python performance is not such a big problem, it's up to developers to write faster programs instead by picking relevant modules and not writing low-level code. Python performance has already improved a bit, and performance is rarely a real problem in software anyway.


TypeScript: in all sane deployments, it’s translated to JavaScript at build time, so that the end user is just receiving normal JavaScript. If you eschew a few features that do generate some new JavaScript (e.g. enums), TypeScript compilation can even be as simple as just removing type annotations. For most practical purposes, TypeScript is just JavaScript. The end user doesn’t experience the “heavy machinery” to translate it to JavaScript.

Python: whether you use the likes of Brython (>5MB of JavaScript, though if you’re unrealistically careful—given the choice to use Brython—you may be able to whittle it down towards 1MB) or CPython compiled to WASM (mostly even larger, but of WASM rather than JS), performance is always going to be worse than reasonably-equivalent JavaScript or ahead-of-time-compiled WASM from languages like C++ or Rust. It’s fundamentally doing more work than alternatives, so it’ll always be heavier to download, slower to run, and more memory-hungry on equivalent code. It should never be used for web frontend stuff except when operating under extraordinarily unusual constraints, because it’s simply significantly worse in every way imaginable other than “is it more like Python than like JavaScript”.


> Well brython is already quite fast already, so I don't think a WASM version would be worse.

What are you basing this on? Brython's own web page says simple operations are slower than regular python and regular python is going to run at about 1/60th the speed of webasm.

Even python users seem to recommend not using it.

https://old.reddit.com/r/Python/comments/3z54u1/is_it_really...

> Python performance has already improved a bit, and performance is rarely a real problem in software anyway.

Again, I think this is for users to decide if performance isn't a problem.

Most big and popular websites are sluggish even on powerful desktops, let alone a phone that's a few years old.

When someone talks about using super slow languages for something interactive and distributed to users, that's a pretty big red flag, even more when they say 'performance is rarely a problem'.


> I don't really know how TS works and if it puts the generated JS in cache.

Usually, typescript is compiled to javascript in a build step (eg on the developer’s computer) using the typescript compiler. The output is javascript which can be run in the web browser directly, or bundled with webpack & friends, then loaded into the browser.


It still annoys me that DOM manipulation was included in the original Java Applet implementations and no one took advantage of it; we just got lame grey boxes.


This looks interesting. Has anyone tried it before who can comment on how well it works on a real-world project?


I'm particularly curious about what parts cheerp adds to their clang+llvm base. Presumably it's something like the C standard target library for WASM/JS?

For reference, here are examples of what you could do with the baseline clang with wasm (but not JS?) [1] [2] [3], referenced from a similar thread on HN.

[1] https://github.com/ern0/howto-wasm-minimal

[2] https://github.com/robrohan/wefx

[3] https://github.com/PetterS/clang-wasm


Disclaimer: I work on Cheerp.

We do include the C (Musl) and C++ (Clang's libcxx) standard libraries.

And we do have some custom LLVM passes to improve code size and performance (See for example [1] and [2]).

But the main selling point in my opinion is the ability to target JavaScript, and (almost) freely mix data and code compiled to either JS or Wasm.

Emscripten can also target JS, but it still retains the linear memory model of Wasm (or X86, Arm, ...), which means that what you get in the end is a big TypedArray and operations on it using basic types.

With Cheerp, code compiled to JS uses an object memory model: a C++ object will become a garbage-collected JS object. While this has some limitations (you can't do unsafe casts and treat your memory like it's a big array, because it isn't), it allows seamless integration with the Browser (or any third-party) APIs:

- You can store a DOM element directly in your C++ objects (instead of doing everything through tables)

- You can directly manipulate JS Strings instead of constantly converting back and forth to C strings

- You can create nice zero-overhead interfaces to use your C++ classes from manually written JS

- Or, you can just write your whole program in C++, including callbacks for DOM events and whatnot

And you can still compile the performance-sensitive or type-unsafe parts of your code to Wasm (losing access to some convenience in exchange for speed).

You can get an idea from the Pong tutorial [3], although it's a bit of a contrived example to showcase what can be done.

[1] https://medium.com/leaningtech/partialexecuter-reducing-weba...

[2] https://docs.leaningtech.com/cheerp/Cheerp-PreExecuter.html

[3] https://docs.leaningtech.com/cheerp/Cheerp-Tutorial-Mixed-mo...
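
The difference between the two memory models described above can be sketched in plain, standard C++. This is not Cheerp or Emscripten code, only an illustration: `LinearHeap`, `PointRef` and `make_point` are hypothetical names, and `std::shared_ptr` merely stands in for a garbage-collected JS object reference.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <stdexcept>
#include <vector>

struct Point {
    std::int32_t x;
    std::int32_t y;
};

// Linear memory model (Wasm, or Emscripten's JS output): every object is a
// run of bytes inside one big buffer, and a "pointer" is just an offset.
class LinearHeap {
public:
    explicit LinearHeap(std::size_t size) : bytes_(size, 0) {}

    // Hypothetical bump allocator, only here to hand out offsets.
    std::size_t allocate(std::size_t size) {
        if (size > bytes_.size() - next_) throw std::out_of_range("heap full");
        std::size_t offset = next_;
        next_ += size;
        return offset;
    }

    // Objects are copied into and out of the buffer byte by byte.
    // No bounds checks on offsets: this is a sketch, not an allocator.
    void store(std::size_t offset, const Point& p) {
        std::memcpy(bytes_.data() + offset, &p, sizeof p);
    }

    Point load(std::size_t offset) const {
        Point p;
        std::memcpy(&p, bytes_.data() + offset, sizeof p);
        return p;
    }

    // Raw access at any offset is possible, which is what lets unsafe
    // casts work under this model.
    std::int32_t load_i32(std::size_t offset) const {
        std::int32_t v;
        std::memcpy(&v, bytes_.data() + offset, sizeof v);
        return v;
    }

private:
    std::vector<std::uint8_t> bytes_;
    std::size_t next_ = 0;
};

// Object memory model (Cheerp's JS output, as described above): each object
// is its own entity reached through a reference, with no byte buffer behind
// it that could be reinterpreted as something else.
using PointRef = std::shared_ptr<Point>;

PointRef make_point(std::int32_t x, std::int32_t y) {
    return std::make_shared<Point>(Point{x, y});
}
```

In the first model, reading the second field of a stored `Point` through `load_i32(offset + 4)` works because everything is bytes; in the second there is no equivalent operation, which is the limitation (and the integration benefit) the comment describes.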


> a C++ object will become a garbage-collected JS object.

This sounds great for ease-of-use and cross-language binding. But, there's a downside with respect to performance. With Emscripten, you never see any GC events show up in the profiler.


Indeed, as I said it can be a downside.

What we usually suggest is to compile most of the application to Wasm (which is also the default), and move the parts that most interact with the outside world to JS.

This can actually result in a speedup if it avoids multiple back-and-forths between JS and Wasm, which is common if you use a Browser/JS api from Wasm.


How is this much different than wt [1] or compiling qt to emscripten? Sincere question.

[1] https://www.webtoolkit.eu/wt


Does the library side of this provide an emulated socket API? The CheerpJ docs [1] say:

> Not yet. The main problem is that RuneScape requires low level network connections primitives (sockets)

So, I'm guessing the same applies to the C++ version?

[1] https://docs.leaningtech.com/cheerpj/Frequently-Asked-Questi...


We don't support sockets out-of-the-box yet, but we have built a solution for the "networking-from-the-browser" problem a few months ago and we may backport that to Cheerp eventually.

https://leaningtech.com/webvm-virtual-machine-with-networkin...


Congratulations, looks like a great release! I'm a big fan of CheerpJ - these tools are making web development from other languages a reality.


That's awesome. I don't know how exactly that compares to Emscripten, some things in Emscripten were a bit awkward. I wonder how the C++ to WASM compilation landscape changes through this.


Does Cheerp expose a C API or is it cpp only?


Cheerp works with C code. It's also possible to use __attribute__((cheerp_jsexport)) as an equivalent to [[cheerp::jsexport]].

From C, accessing the DOM / external JavaScript can be done by using the __asm__ syntax; use of the client namespace is only possible from C++ code.
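
For illustration, a minimal sketch of a function marked with the export attribute mentioned above. The `__CHEERP__` guard, the `JSEXPORT` helper macro and the `clamp_to_byte` function are assumptions made for this example, not something stated in the thread; the guard is there so the same file still builds with an ordinary C++ compiler.

```cpp
// Assumption: the Cheerp compiler predefines __CHEERP__. If it does not,
// the attribute is never applied and this remains plain C++.
#ifdef __CHEERP__
#define JSEXPORT [[cheerp::jsexport]]
#else
#define JSEXPORT
#endif

// Hypothetical example function: clamps a value into the 0..255 range.
// Once exported, handwritten JavaScript would be able to call it.
JSEXPORT int clamp_to_byte(int value) {
    if (value < 0) return 0;
    if (value > 255) return 255;
    return value;
}
```

In a C file the macro would expand to `__attribute__((cheerp_jsexport))` instead, following the equivalence given in the comment.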


> Since its release in 2014 Cheerp has been licensed under a dual licensing scheme: GPLv2 for non-commercial users, and a proprietary license for anybody not willing to comply with GPLv2 terms.

Using exclusively GPLv3 would have been better.



