Ubus (OpenWrt micro bus architecture) (openwrt.org)
113 points by todsacerdoti on Aug 27, 2023 | 54 comments


Story time: I coded a similar thing when I started working after graduation at a DTV software company; we needed IPC on Linux, and I whipped up a very crude equivalent of protobuf (which I didn't know about at the time), based on an RLE lib I stumbled upon, and without any form of discovery.

It was circa 2009; I had only been exposed to plain-text protocols and didn't know about JSON. In hindsight, we might've been better off using standard dbus or protobuf, but I was a rookie and it provided the performance we needed (for DTV metadata).

I'm happy to see that these can still thrive, and I just recently figured out that discovery is a net multiplier in these projects. Doing this really proved to me that any problem is solvable if you have some time to think about it and can prototype.

I long for those moments now; I feel like nearly all computing issues have been solved and we are now just plumbers, connecting libraries and software modules through config files instead of building things.


Only if you choose to be a plumber. You can equally write new things if you want.

Also I think "any problem is solvable" needs some qualification - there are a ton of problems that have yet to be solved or are super complicated. I still haven't figured out how unbounded model checking works for example.


As someone having to debug issues on OpenWRT derivatives, I wish I had a time machine to tell the inventors of Unix to never add other forms of IPC than pipes.

It's all a big pile of stateful daemons notifying other daemons with a billion race conditions and zero debugging capabilities, like a parody of how not to create reliable systems. When you have shell scripts parsing JSON messages, you know it's over.


Pipes and named FIFOs are easy and great. I say this after implementing various IPC methods (Unix domain sockets with fd passing, POSIX message queues, 0MQ, XML-RPC, local TCP sockets, just to name a few). Use a simple line-oriented protocol. If you are passing complex data through your IPC, you know it's time for files. Shared memory is another way to do IPC, but then you'd better have a robust method of detecting the liveness of your local processes, and you have to give up the Unix file paradigm.
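
For illustration, a minimal sketch of that line-oriented approach over a plain pipe(2), with stdio doing the buffering; the message fields here are made up:

    /* Hypothetical line-oriented IPC over a pipe: one message per line,
       parsed with plain stdio. Field names are invented for illustration. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];
        if (pipe(fds) != 0) return 1;

        if (fork() == 0) {                  /* child: producer */
            close(fds[0]);
            FILE *out = fdopen(fds[1], "w");
            fprintf(out, "new_station mac=00:11:22:33:44:55\n");
            fclose(out);
            _exit(0);
        }

        close(fds[1]);                      /* parent: consumer */
        FILE *in = fdopen(fds[0], "r");
        char line[256], event[64], mac[32];
        while (fgets(line, sizeof line, in))   /* one message per fgets */
            if (sscanf(line, "%63s mac=%31s", event, mac) == 2)
                printf("got %s from %s\n", event, mac);
        fclose(in);
        return 0;
    }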


But then how do you add datatypes (a schema) to pipes? How do we add a column to the messages on the pipe?

Newline-delimited != SOTA

https://en.wikipedia.org/wiki/Apache_Arrow :

> Arrow allows for zero-copy reads and fast data access and interchange without serialization overhead between these languages and systems

JSON lines formatted messages can be UTF-8 JSON-LD: https://jsonlines.org/

A lot of Linux networking is now built on eBPF.


The same way you add it to any other SOCK_STREAM? How do we add types and schemas to TCP connections?
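
For illustration, the usual answer on any byte stream (pipe or TCP alike) is to frame the messages yourself; a version byte plus a length prefix lets an old reader skip payload bytes it doesn't understand, which is how you "add a column". A hypothetical sketch:

    /* Hypothetical stream framing: works the same over a pipe or a TCP
       socket. The header is 4 bytes with no padding; fields are sent in
       host byte order in this sketch. */
    #include <stdint.h>
    #include <unistd.h>

    struct msg_hdr {
        uint8_t  version;  /* bump when the payload layout changes */
        uint8_t  type;     /* message kind, e.g. 1 = "new_station" */
        uint16_t len;      /* number of payload bytes that follow  */
    };

    static int send_msg(int fd, uint8_t type, const void *payload, uint16_t len) {
        struct msg_hdr h = { .version = 1, .type = type, .len = len };
        if (write(fd, &h, sizeof h) != (ssize_t)sizeof h) return -1;
        if (write(fd, payload, len) != (ssize_t)len) return -1;
        return 0;
    }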


But then when I need to scale to more than one node with multiple cores, do pipes scale?



https://mazzo.li/posts/fast-pipes.html#splicing :

> In general when we write to a socket, or a file, or in our case a pipe, we’re first writing to a buffer somewhere in the kernel, and then let the kernel do its work. In the case of pipes, the pipe is a series of buffers in the kernel. All this copying is undesirable if we’re in the business of performance.

> Luckily, Linux includes system calls to speed things up when we want to move data to and from pipes, without copying. Specifically:

> - splice moves data from a pipe to a file descriptor, and vice-versa.

> - vmsplice moves data from user memory into a pipe.

> Crucially, both operations work without copying anything.

But then to scale to more than one node, everything has to be copied from the pipe buffer to the network socket buffer; unless splice()'ing from a pipe to a socket is zero-copy like sendfile() (which doesn't work with pipes, IIRC).

"SOCKMAP - TCP splicing of the future" (2019) https://blog.cloudflare.com/sockmap-tcp-splicing-of-the-futu...


sendfile is implemented with splice, but the scenarios in which splice is actually zero-copy are not well defined (the man pages just say "it's generally avoided"), so you'd have to dig into your kernel source to know for sure. That being said, chances are the cost of copying the data would be far outpaced by the cost of serializing over the network anyway, so if performance is of maximum concern you'd want to architect your application to send as little data over the network as possible.


I haven't checked, but at the end of the day I doubt eBPF is much slower than select() on a pipe.


Unfortunately, pipes don't cover all the kinds of IPC we use today.


Using SOCK_SEQPACKET would be saner than pipes or SOCK_STREAM.
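
A minimal sketch of that suggestion: with SOCK_SEQPACKET the kernel preserves message boundaries, so no framing layer is needed on top:

    /* SOCK_SEQPACKET demo: two writes arrive as two distinct messages,
       unlike SOCK_STREAM, where they could coalesce. */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void) {
        int sv[2];
        if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) != 0) return 1;

        write(sv[0], "hello", 5);
        write(sv[0], "world", 5);

        char buf[64];
        ssize_t n = read(sv[1], buf, sizeof buf);  /* returns "hello" only */
        printf("first:  %.*s (%zd bytes)\n", (int)n, buf, n);
        n = read(sv[1], buf, sizeof buf);          /* returns "world"      */
        printf("second: %.*s (%zd bytes)\n", (int)n, buf, n);
        return 0;
    }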


Not to be confused with the (dead) ubus project[1] by the suckless folks. (Mostly I just want to write down the link to that one, because I keep losing it.)

[1] https://web.archive.org/web/20131209010702/http://unixbus.or...


In case you want to react to stuff connecting to your router via ubus:

    iw event | awk '/new station/ {print $4}' | xargs -n 1 sh -c 'ubus send new_station {\"mac\":\"$1\"}' _

(Yes, this contains silly hacks to work around busybox limitations)


I'm pretty sure that there already is an event emitted upon station connection.


Certainly not by default, anyway. That would have been great, though. I double-checked with ubus listen in case it had changed over the last couple of years.


It seems to have some similarities to RouterOS' IPC and of course they have a similar role and environment. I'm curious if anyone who has looked into both of them in detail has any thoughts?

https://news.ycombinator.com/item?id=33904105



Did they consider using a memory-safe language for this?


There is no real case to be made against the idea of memory safety, but there is a case to be made for C.

You can write C code such that it's unit tested, fuzzed, statically analyzed and reviewed, and often embedded code has only one or two specific jobs.

I would have loved to see Zig here, as it makes all of the above (testing, analysis, code clarity for review) easier, but it's also not necessarily memory safe.

You could write it in D with @safe (SafeD or whatever it's called), which is memory safe, but that's not a very popular language.

You could use Ada, but again, it's not as easy to find devs for as C is.

You could use Go or OCaml, though Go isn't really memory safe (you can easily cause data races in goroutines), and OCaml has a runtime AFAIK, so that's out of the question for low-memory devices.

You could use Rust, assuming it had existed back when they started, and you would get memory safety, but also an entire kitchen sink of useless garbage (like C++). You'd also have to resort to unsafe{} in a lot of places, unless you use a crate for it, which will then do unsafe{} for you.

So chances are, whatever you do, C is a pretty sane choice, or maybe C++ if you want RAII to at least make your resource management and lifetimes easy to handle.

I don't think you deserve to be downvoted, since this is an interesting discussion to have. However, I think it would have been more helpful for the discussion if you had outlined which language you'd suggest and why.

To steelman your argument: I would say you think C is unsafe to such a degree that even a for loop is UB much of the time (signed integer overflow is UB), there's no real way to check array bounds, no real way to catch off-by-one and similarly stupid simple errors, use-after-free, etc., and that the entire ecosystem relies on raw pointers and macros, and it's a shitshow. I think your point would have been to suggest Rust, as it fixes all these issues while bringing a stronger type system and a better toolchain.
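
To make the overflow point concrete, a contrived example: signed overflow is UB, so a compiler is allowed to treat the loop condition below as always true and emit an infinite loop:

    /* i + 1 > i looks like a safe termination check, but once i hits
       INT_MAX the increment is undefined behaviour, so optimizing
       compilers commonly fold the condition to "true" at -O2. */
    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        for (int i = INT_MAX - 2; i + 1 > i; i++)
            printf("%d\n", i);
        return 0;
    }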


> You could use Rust, assuming it had existed back when they started, and you would get memory safety, but also an entire kitchen sink of useless garbage (like C++). You'd also have to resort to unsafe{} in a lot of places, unless you use a crate for it, which will then do unsafe{} for you.

I'm not saying C is a wrong choice here, but this argument doesn't make much sense to me. All the other languages also come with "useless garbage" (some even with a collector for them). I don't see why you'd need that many unsafe blocks either.

I feel like Rust, as a language, is much better equipped for writing a daemon like this. I'll even go as far as to say that modern C++ (C++2x) is a better choice than C for this stuff with its superior memory and resource management tools.

Of course most of the OpenWRT development is done in C, so doing it in another language requires a very good reason. I wouldn't look at the concept of "a networked daemon exchanging messages between arbitrary services" and think C of all languages is a good fit unless every other developer on the team can only write C.

Zig would've been a nice way to meet halfway, but the language isn't finished yet.

Of course the point is moot, because this particular project existed years before any of the modern alternatives or many of the C++ improvements were even available. Still, if they were to start a project like this again, I'd hope they'd pick a better language for it.


They just rewrite the impacted mess in Rust?


A decent question. I've wondered about Rust programs on OpenWrt, but at the moment they're just too big! Linking the stdlib into every binary doesn't help, and the code is generally larger than the C equivalent. It doesn't seem unsolvable though; I'm hopeful. no_std Rust binaries can be near competitive with C.

Binary size is sometimes a big deal. Two decades ago I investigated using C++ for an "embedded" Linux SSH server, but decided the 30kB overhead was too large (the target was a 4MB laptop). The server ended up being used in OpenWRT and other places; I'm curious whether that would have happened if I'd gone with C++ instead.


This is on point. I work on small-ish embedded Linux systems and I would really like to use Rust there, but a single Rust binary takes up enough space for 10 C programs, so it's prohibitive. You can dynamically link your libraries (especially since these systems are usually built as full images, so ABI compatibility does not really matter) and it helps a little, but nowhere close to the order of magnitude you need.

Alas, there is little interest in, or awareness of, the middle ground between the no_std world and the "size doesn't matter" world in the Rust community right now. I've asked a bunch of people for pointers on where to start working towards supporting those cases, but I mainly got shrugs. There were attempts at more lightweight stdlibs and so on, but they all seem to have fizzled out.

I wonder how much of this is inherent in the language design. Rust heavily leans on monomorphization, so as a first approximation, you will always generate more code (before maybe optimizing that away again). Famously, the Swift people went to great lengths to avoid these problems: https://faultlore.com/blah/swift-abi/. But while running Swift on Linux is possible, this is even more niche.


There's some interest in a binary size working group. https://rust-lang.zulipchat.com/#narrow/stream/131828-t-comp...


Totally agree, that's the same reason I gave up on Rust for embedded systems: beyond the no_std bare-metal use case, it is way too big compared to C and C++. Golang has a similar size issue, sadly.


I used Rust on an attiny22 to decode standard 433MHz remote radio packets. It is a little 8-bit microcontroller with 256 bytes of RAM and enough flash to store about 1024 instructions.

The key was to turn off panic unwinding, turn on size optimization, and enable full LTO. I think I recall that increasing the inlining threshold helped a bit.


> I'm hopeful. no_std Rust binaries can be near competitive with C.

I feel like Rust without std might as well be a different language. Almost all of the available mindshare, docs, and libraries are dependent on it.

Sure, you can use it. But to me it just looks like technological poverty. Even compared to the C ecosystem.


I'd disagree. Using something like Embassy is pretty pleasant for embedded development, with fewer worries than C. The only issue is maturity.

There were sufficient crates to write an SSH server for an RP2040 (mostly out of curiosity) using what's available in no_std.


Do you have that example posted anywhere? I'm curious to see it. Also, any support for the Wifi and/or Bluetooth on Pi Pico W from Rust?


https://github.com/mkj/sunset/tree/main/embassy/demos/picow is a WiFi SSH-to-serial implementation for the RP2040. I've got it plugged into my home server's serial port. It's using cyw43 for WiFi, also from Embassy (dirbaio is prolific!). There's some WIP in Embassy for Bluetooth.

That repo has a few crates depending on each other: sunset is the top-level SSH, sunset-embassy adds no_std async, and the async dir has std Rust, with a command-line SSH client in the examples dir.


First time I'm hearing about Embassy. Looks like things have progressed since the last time I tried deploying Rust on an STM32.


Actually, I'm often surprised by just how much out there is available with no_std feature options.

There's obviously a lot of room for improvement in Rust with regard to small systems, as people have pointed out. But there's also a lot more middle ground in Rust than in, say, C++: you can be no_std and still have Vec, for example, while in C++ the STL is an all-or-nothing thing.

It's a smaller community, and it will come down to individuals deciding it's important and making contributions.


> Linking the stdlib into every binary doesn't help

It should be possible to use a shared library for std, but it would come with the limitation that the binaries must be compiled with exactly the same rustc as std.


Considering the repo dates back to 2010 (predating Rust by 4+ years), there probably wasn’t a better option that could run on the embedded devices that OpenWRT targets.


The project age is good background info. However, Rust didn't invent memory safety; there were reasonable options for systems programming at that time too (eg OCaml, Go, Ada).

edit: Also D in @safe mode, like mentioned above. Not sure if it was around in 2010?


Another consideration (apart from the 64MB ROM / 8MB RAM configuration OpenWRT targets, up from 32MB / 4MB in 2010) is that the original—and still very popular—platform for OpenWRT is Linux on big-endian MIPS, which does not exactly have ubiquitous compiler support outside of the C world. (Rust treats it as Tier 2, on par with Windows on ARM and Solaris on x86-64, which isn’t bad as these things go.) So—Go is about as realistic as Java, OCaml I don’t think runs on MIPS, neither does MLton, and IIRC the GNU Ada implementation was in a much worse state then. There’s also the part where the devs are by necessity C hackers, what with sorting out the manufacturers’ kernel patches and all.


I think any language with a large runtime would be out of the question, because OpenWRT has to fit into systems with very limited storage.


To put it in context:

On some devices you have 32MB flash storage in total.

32 Megabytes, not gigabytes. This is not a typo.

This needs to fit the bootloader, the Linux kernel, the initrd, the rootfs containing all the user-space tools making OpenWRT an actual usable OS and whatever daemons you need to implement your particular network needs. Oh and you probably want the WebUI too.

In 32MB. There are newspapers online which load more than that just to show the front page!

There’s no room for 2MB HelloWorld type languages in this space.


It's even less than that. Current OpenWrt versions work on devices with 8 MB of flash and 64 MB of RAM. OpenWrt 19.07 worked on 4MB/32MB devices.


This is a good point. But OCaml and Ada are not in the 2MB hello-world club (an OCaml hello world is more like 200k); I don't know about Go. Also, accounting that to a single binary is not the right perspective, I think: adopting a safe language in OpenWRT should be arranged so that all the programs using that language share the same copy of the runtime.


200k is very large in the OpenWrt world


Ada then?

Eg on https://ada.godbolt.org/ you can see that the default program compiles to ~25 lines of assembly.

edit: actually it contains a call to a bounds-check function that presumably is in the runtime library; I don't know how big that is.

A major point in this is that C also has a fairly big runtime library that should be factored into the comparison!


Current OpenWRT uses musl libc[1], which can be optimized to have a tiny footprint and supports full static linking; before that it used uClibc, which was similarly optimized.

You can still build software for OpenWRT that requires the much bigger Glibc, but of course it will not work that well on devices with limited memory.

[1]: https://musl.libc.org/


Musl libc seems to be 2MB-ish; it's smaller than glibc, but still something when compared to other languages and their runtimes.


    464.7K Oct 15  2022 /lib/libc.so
As seen on OpenWRT on an armv7l system


Also OpenWRT udev itself is tiny:

  20.0K Oct 13  2022 /usr/lib/libudev.so.1
   8.0K Dec 15  2022 /sbin/udevtrigger


> C also has a fairly big runtime library

The original releases of OpenWRT used uClibc, which is nowhere near Glibc levels of bloat (Musl beats it on code quality and is used today, but didn’t exist back then). Also, yeah, you’re going to have a libc on a Linux system no matter what, so this is one of the rare cases where dynamic linking makes for a legitimate optimization.


> A major point in this is that C also has a fairly big runtime library that should be factored into the comparison!

But libc will be there anyways, unless you are rewriting absolutely all of userland.


musl is a few hundred kb


Back then, your suggestions were niche languages (and some still are), and they are still not popular for embedded systems or network equipment. Large runtimes or huge static binaries are not suitable, due to the memory and storage constraints.

You have to consider the surrounding ecosystem. Those interested in such languages are not necessarily those interested in contributing solutions to the problem space. Any project attempting to use such a language in OpenWrt would very likely not have survived until today.


From working with OpenWRT for years on embedded systems where sometimes 128 bytes made a difference: this is not a useful question.

In the embedded space, you use small, effective languages like assembly and C. You either design for explicit memory use entirely, or you systematically test for memory usage and waste, especially if you're delivering industrial applications. The so-called "memory safe" systems tend to be very expensive in memory usage and space, and not worth the investment. Usually, anyway. This may change in the future, but it's a bad change if it comes with additional power usage requirements when the environment also requires minimal power use.


I have no idea if they did. C is a "default choice" in embedded systems.

I don't know if we now have a better alternative that is similar in speed, RAM use, and binary size.

If we do, I'd read a discussion on which one to use with interest.



