When can we expect discord not to take 20-30 seconds to launch on a $5000 PC? What exactly is it doing, recompiling the client from source using a single core each time it opens?
It needs to download the 20 distinct update patches that they added in the past 4 hours, in series, all of which together combine to change the actual product in precisely no way whatsoever.
The deb file probably contains new versions of electron (or at least versions of electron that change the actual v8/chrome version) and the regular updates during launch are loading JavaScript. Just a guess.
Flatpak and its ilk are a scourge that should be avoided. I don’t want to have to install another package management system when apt is right there. Moreover, I like apt precisely because it has in-depth knowledge about everything in the system. If I install something from source, it’s a careful decision, and I know the tradeoffs. With things like Flatpak, that line is blurred.
While I think you might be missing out, I totally respect your preferences. For the longest time it was also my preference.
In my case, ultimately, paranoia has won. :)
How much do you really trust discord? Or games on steam, or slack, or zoom client?
Flatpak does provide you, out of the box, with a sandbox for the whole Discord app. The sandbox is built on standard Linux machinery like bubblewrap / seccomp / namespaces. This should prevent Discord from, for example, accessing anything other than ~/Downloads (rw), ~/Pictures (ro) and so on. Or seeing what other processes are running on my PC. Or snooping in my ~/.ssh.
Some stuff might not work out of the box and will require reading the app's README or wiki. I imagine stuff like "john is listening to song - artist" would require additional permission or configuration. File sharing could be complicated if, for example, Discord could not access my screenshots directory. Webcam, voice and screen sharing work out of the box but are protected by "portals" provided by (in my case) KDE.
Discord, for example, does not pollute my /usr or /home. It stays somewhere in /var/lib/flatpak/ and keeps my user files in ~/.var/app/$APPNAME. Steam, Slack, all other "big proprietary apps" also keep their stuff in ~/.var/app/$APPNAME. It's not really a choice for them; flatpak just mounts proper directories in proper places and it works really well.
You can use `nsenter --target $PID --pid --net --mount --ipc` and look around to what kind of access flatpak apps have to your system.
Sorry about preaching, I just like that bit of additional separation from stuff I don't really want to trust or care about.
Take a few minutes to learn how it works then. Flatpak doesn't replace apt, which is a system package manager. Flatpak is for standalone apps, like Discord.
? Apt doesn’t require you to install from source? It’s not even its default behavior?
Flatpak provides a basic level of sandboxing that ensures non-malicious stupid stuff, like the Steam bug that deleted the entire user directory, doesn't actually impact you.
Beyond that, Discord probably won't fix issues with the N distro package managers if something breaks. They're a lot more likely to fix a Flatpak problem, though.
I meant that if I deviate from apt (e.g. I want a newer version of something than what the repo has), I carefully consider what ramifications compiling and installing something from source may have on the stability of the system, which likely shares many of the same libraries.
Much of this is just grumbling. I love Debian, hate Ubuntu, and think things shouldn’t change a lot.
This is why I tend to use Flatpak browsers. Download access to the Downloads folder is really all they need on my system (for my needs; needs may vary by user).
I mute every "server" permanently, disable all "mentions" (@here, @role, etc) and it stays quiet except for DMs. If there's something I want to see I'll go look in a channel.
I have applied the same measures as GP; if it is interesting/important enough, I will find it manually. (Similarly, I am an administrator of a public server; I regularly remind regulars and other admins that I'll only get notified from DMs.)
For announcement channels that I'm feeling more certain I don't want to miss: I created a server for myself with a channel that has subscriptions to all the announcement channels that I want to be notified about (I also check this manually, but at least it is a custom-tailored feed just for me).
There is no coping with the (..checks Discord..) "4k+" outstanding notifications (and counting) that I am bombarded with. Slightly ironic that the Discord interface can't even be bothered to tell me the exact count; I'm not going to bother.
It used to amuse me to see people complain about @everyone tags. The commonality of it leaves me desensitized these days.
The channels with no activity remain grey instead of white. It's important to me that I discover messages when I have some downtime and am interested in so-called discovery, as opposed to having junk interrupt my flow at times when I am not.
Loads in less than a second for me using the website. I'm just on a 1st gen Mac Mini M1 w/ Chrome. I mainly use Discord for MidJourney, and it has always been smooth.
Are there any benefits/features to using their Electron desktop app?
I generally find that it's easier to interact with AV devices (webcams, microphones) on the desktop app, than when having to deal with a browser's opinions. It also separates it from your browser, which might be desirable if you keep a bunch of tabs and don't want them open in the background.
I use either option in different circumstances; they're both comparably fine.
It almost never is. While HN contrarians endlessly complain about it, the rest of the world happily ships Electron. Discontent with Electron is so comically over represented here that I'm beginning to think it's an insecurity or complex or something.
In my experience, some regular users also dislike the UX of websites as applications, and that's what Electron apps are. They feel heavier than native software, the UI elements tend to redraw and jump around unexpectedly, and the keyboard response is noticeably slower.
You can recognize them because they dislike the feel without even knowing what an Electron app is. I can vouch that they're not insecure about Electron.
Only if using DOM right? If someone ported a WebGL game to Electron the same 3D engine would be at play. Or 2D canvas for that matter. I think you're specifically referring to the overhead of browser DOM and how elements therein load, focus, apply styling, etc.
I hate every Electron app I use, and that includes vscode.
There is a unique type of jank associated with it.
Would love to live in the apparently "bizarro" world where desktop-class apps developed using tools that try to be more native to the OS were still successful in 2024.
Every update to Adobe Creative Suite makes it closer and closer to an Electron app and further and further away from its roots as one of the premier high-end desktop-caliber programs. I interpret the rise of Electron as mostly correlated with the lack of knowledge of old-school desktop GUI development among the folks who teach CS.
I expect that FL Studio, Photoshop, Word, and every other remaining good desktop-caliber app will eventually be lost to the "make everything run in a web browser" trend. I'm so over it, and it hasn't even happened yet.
This is also the reason why GenAI STILL has no good prosumer tools. ComfyUI/Automatic1111/Forge/lm-studio are the closest we have and they're gradio or electron webapps which indicates that AI folks are not down with leaving the python or JS ecosystem for even a minute.
This means we need good desktop caliber GUI developers who understand AI. Unfortunately, they basically don't exist.
Thus, the world runs SOTA models on janky shit gradio or electron frontends instead of the desktop GUIs we had for similar UIs 10-20 years ago.
Word 2003 was peak software, and I'll never change my mind.
For me the problem with Electron is the “uncanny valley” effect - every app implements its own UI elements, and nothing is quite the same as the OS’s native behaviour.
Different apps have different unique quirks/odd behaviour, almost none of them do the right thing on platforms like macOS.
I do find it sad that native apps were quick and snappy back in the day, but now everything is effectively a browser, and has a DOM that needs to be parsed and dealt with, thus negating the hardware Moore's Law improvements that have happened since.
I used to feel the same way until Apple started dogfooding their ill-performing SwiftUI. The insane hardware packed into M1-3 chips compensates for it a bit (not always), but especially on my Intel MBP it can be jankier than most of the Electron apps I interact with regularly. If VSCode was as janky as the "native" Settings app on macOS I sure as hell wouldn't use it.
yea man, I think it's stupid for a text editor like Atom to be 200mb and take up 1gb of ram because I have a "complex". I'm annoyed at the 120 independent copies of CEF on my computer using 20gb of disk space because I'm "insecure"
The onus is really on Discord, but you can use https://openasar.dev to partially fix the problem for yourself - it's an open source drop-in replacement for the client updater/bootstrapper.
* You can't share your screen in a way optimized for text clarity. In the native app, even without nitro, you effectively have the "source resolution, 5 fps" option. With the web app, you are limited to 720p.
* You can't make the full use of your camera. On the web, 640x480 is the only available resolution, while in the native app, 1280x720 is the default.
* Also, on platforms other than Linux, with the native app, you can stream the game audio separately from your microphone. This functionality is not available via the web or on Linux unless you install an unofficial client.
Ah, I didn't know people were using Discord for video calls. No one has ever asked me to jump on a Discord video call. Then again, I'm not a PC gamer. I honestly thought it was basically just a modern IRC, to chat in specific groups/channels, and to ask the MidJourney bot to generate images of cats wearing pajamas.
In true JS fashion, it rewrites itself in a new framework every time you open it, and every fifth time, it chooses a different package manager just for fun.
This is true, but in the past various Discord employees have explicitly said (on HN, no less) that they don't intentionally ban accounts for using them, only that sometimes their anti-spam systems can flag them as false positives (and that you may be able to submit a ticket in this case), so I wouldn't worry too much.
Reading through the post, they seem to have been hyper-focused on compression ratios and reducing the payload size/network bandwidth as much as possible, but I don't see a single mention of CPU time or evidence of any actual measurable improvement for the end user. I have been involved with a few such efforts at my own company, and the conclusion always was that the added compression/decompression overhead on both sides resulted in worse performance. Especially considering we are talking about packets at the scale of bytes or a few kilobytes at most.
They explicitly mention compression time. It's actually lower in the new approach.
> the compression time per byte of data is significantly lower for zstandard streaming than zlib, with zlib taking around 100 microseconds per byte and zstandard taking 45 microseconds
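If anyone wants to sanity-check that claim against their own payloads, here's a rough micro-benchmark sketch using Python's zlib and the python-zstandard package (the payload shape and message count are made up, and Discord's gateway is Elixir, so treat this purely as an illustration of streaming compression per message):

```python
import json
import time
import zlib

import zstandard  # assumption: the python-zstandard package is installed

# Hypothetical small gateway-style payload (a few hundred bytes).
message = json.dumps({
    "op": 0, "t": "MESSAGE_CREATE",
    "d": {"id": "1234567890", "channel_id": "42",
          "content": "hello world", "author": {"id": "99"}},
}).encode()

def bench_zlib(msgs):
    c = zlib.compressobj(6)
    start = time.perf_counter()
    for m in msgs:
        c.compress(m)
        c.flush(zlib.Z_SYNC_FLUSH)  # streaming: emit a flushed block per message
    return time.perf_counter() - start

def bench_zstd(msgs):
    c = zstandard.ZstdCompressor(level=6).compressobj()
    start = time.perf_counter()
    for m in msgs:
        c.compress(m)
        c.flush(zstandard.COMPRESSOBJ_FLUSH_BLOCK)  # keep the stream open between messages
    return time.perf_counter() - start

msgs = [message] * 10_000
for name, fn in (("zlib", bench_zlib), ("zstd", bench_zstd)):
    elapsed = fn(msgs)
    print(f"{name}: {elapsed * 1e6 / len(msgs):.1f} us/msg, "
          f"{len(message) * len(msgs) / elapsed / 1e6:.1f} MB/s")
```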
For what it’s worth, the benchmark on the Zstandard homepage[1] shows none of the setups tested breaking 1GB/s on compression, and only the fastest and sloppiest ones breaking 1GB/s on decompression. If you can live with its API limitations, libdeflate is known[2] to squeeze past 1GB/s decompressing normal Deflate compression levels. In any case, asking for multiple GB/s is probably unfair.
Still, looking at those benchmarks, 10MB/s sounds like the absolute minimum reasonable speed, and they’re reporting nearly three orders of magnitude below that. A modern compressor does not run at mediocre dialup speeds; something in there is absolutely murdering the performance.
And I’m willing to believe it’s just the constant-time overhead. The article mentions “a few hundred bytes” per message payload in a stream of messages, and the actual data of the benchmarks implies 1.6KB uncompressed. Even though they don’t reinitialize the compressor on each message, that is still a very very modest amount of data.
So it might be that general-purpose compressors are simply a bad tool here from a performance standpoint. I’m not aware of a good tool for this kind of application, though.
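For concreteness, the back-of-the-envelope behind those figures is just unit conversion; nothing here is Discord-specific:

```python
US_PER_BYTE_ZSTD = 45   # figure quoted in the article
PAYLOAD_BYTES = 1_600   # implied uncompressed payload size

throughput_kb_s = 1 / (US_PER_BYTE_ZSTD * 1e-6) / 1_000    # ~22 KB/s
ms_per_payload = PAYLOAD_BYTES * US_PER_BYTE_ZSTD / 1_000  # ~72 ms

print(f"~{throughput_kb_s:.0f} KB/s, ~{ms_per_payload:.0f} ms per payload")
# Against a 10 MB/s floor that's roughly a 450x shortfall; a compressor running
# at hundreds of MB/s would be around 0.005 us/byte, not 45.
```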
One thing to note is that on a given gateway server there are potentially 100k other compression contexts active, and given each connection is transmitting a trickle of small data in an unpredictable way, from different CPU cores as the processes are scheduled by the erlang VM, chances are the CPU caches are absolutely being thrashed. I imagine this contributes to some level of fixed overhead here too, especially when you're measuring these timings on a machine serving actual production traffic as opposed to simply running a bunch of small payloads through a single compressor.
It’s possible, I guess, but it wouldn’t be my first thought. It’s too slow for that.
A payload of 1.6KB at 45us/B is 75ms, which is below the typical scheduling quantum of about 100ms. (Can’t say anything about Erlang, let alone Erlang bindings to C libraries, but I wouldn’t expect it to be that much smaller either, precisely because of the switching overhead both direct and indirect.) So a single compression operation shouldn’t be getting preempted enough to affect the results.
Typical RAM bandwidth is tens of GB/s (even consumer-class SSDs[1] are single-digit GB/s) so with tens to hundreds of cores that’s not enough to affect anything, and even taking into account the compressor’s window is measured in megabytes not kilobytes that’s likely not enough (it would be a bad compressor that reread its whole window each time, anyway). And the data we’re compressing is not only minuscule, it has just been generated and is virtually guaranteed to be cached.
Honestly, I almost want to say that the benchmark is measuring the wrong thing somehow, except they’re reporting a 2× speedup switching from one compressor to another. So it can’t be the JSON encoding overhead or whatnot, and, unless one of the Erlang bindings is somehow drastically stupider than the other, it shouldn’t be the FFI overhead, and even those are a huge stretch. The Flying Spaghetti Monster be merciful, I cannot see anything here that we could be spending over a hundred million cycles on.
At this point I’m hoping somebody just mixed up the units, because this is really unsettling.
First, let's establish a cheery mood: Happy Friday!!!!
Second, I noticed we're extrapolating from a tossed out measurement in "microseconds per byte" here, of extremely small payloads, probably included fixed-cost overhead of doing anything at all.
All leading up to: Is "atrocious" the right word choice here? :)
More directly: do you really think Discord rolled out a compression algorithm that does 23 KB/s for payloads in the megabytes?
Even more directly, avoiding being passive and just adopting your tone: this is atrocious analysis that glibly chooses to create obviously wrong numbers, then criticizes them as if they are real.
I think one thing this blog post did not mention was the historical context of moving from uncompressed to compressed traffic (using zlib), something I worked on in 2017. IIRC, the bandwidth savings were massive (75%). It did use more server side CPU, and negligible client side CPU, so we went for it anyways as bandwidth is a very precious thing to optimize for especially with cloud bandwidth costs.
Either way the incremental improvements here are great - and it's important to consider optimization both from transport level (compression, encoding) and also from a protocol level (the messages actually sent over the wire.)
Also, one thing not mentioned is that client-side decompression on desktop moved from a JS implementation of zlib (pako) to a native implementation that's exposed to the client via napi bindings.
Can't remember the last time I had to worry about bandwidth for the servers. It only came up when talking about iPhones at the start, because everyone was on really slow mobile networks. Our company is usually very cost sensitive, but all our server hotels so far have had unmetered bandwidth connected to a 100 Mbps interface. We've had zero complaints during the last 20 years, even though we fill that one now and then.
But we don't use any of the usual cloud offerings, only smaller local companies.
I think the person is concerned with client-side compute, not just server-side compute. The article does not mention whether zstd has additional decompression overhead compared to zlib.
Client-side compute may sound like a contrived issue, but Discord runs on a wide variety of devices. Many of these devices are not necessarily the latest flagship smartphones, or a computer with a recent CPU.
I am going to guess that zstd decompression is roughly as expensive as zlib, since (de)compression time was a motivating factor in the development of zstd. That's also the reason to prefer zstd over xz, despite the latter providing better compression efficiency.
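If someone wanted to actually measure it, a rough client-side decompression sketch in Python (made-up payload; not anything Discord ships) would be something like:

```python
import json
import time
import zlib

import zstandard  # assumption: the python-zstandard package is installed

# Made-up payload roughly resembling a list of messages.
payload = json.dumps(
    {"d": [{"id": str(i), "content": "x" * 100} for i in range(100)]}
).encode()

zlib_blob = zlib.compress(payload, 6)
zstd_blob = zstandard.ZstdCompressor(level=6).compress(payload)

def mean_us(fn, runs=5_000):
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs * 1e6

dctx = zstandard.ZstdDecompressor()
print("zlib decompress:", round(mean_us(lambda: zlib.decompress(zlib_blob))), "us")
print("zstd decompress:", round(mean_us(lambda: dctx.decompress(zstd_blob))), "us")
```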
though I always thought lz4 to be the sweet spot for anything requiring speed, somewhat less compression ratio in exchange for very fast compression and decompression
> Looking once again at MESSAGE_CREATE, the compression time per byte of data is significantly lower for zstandard streaming than zlib, with zlib taking around 100 microseconds per byte and zstandard taking 45 microseconds.
They're going from 2+MB (for some reason) to 300KB - even if decompression is "slow," that's going to be a win for their bandwidth costs and for perceived speed for _most_ users.
I was surprised to see little server-side CPU benchmarking too, though. While I'd expect overall client timing for (transfer + decompress) to be improved dramatically unless the user was on a ridiculously fast network connection, I can't imagine server load not being affected in a meaningful way.
There already was compression before, through zlib. The finding, as shown in the post, was that Zstandard was also a lot more efficient than zlib from a CPU-time standpoint.
The 2MB case is pathological - an account on MANY servers with no local cache state (the READY payload only sends data that's changed since you last connected, by having the client send hashes of the data it already knows.)
The bandwidth probably doesn't really matter, but a 2MB must-have blob vs a 300kB must-have blob at the start of a connection is a big difference.
The start of a TCP connection is limited by round-trip times more than bandwidth. Especially for mobile, optimizing to reduce the number of round trips required is pretty handy.
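A rough slow-start estimate, assuming an initial congestion window of 10 segments, a ~1460-byte MSS, the window doubling each round trip, and ignoring TLS/WebSocket overhead:

```python
def round_trips(payload_bytes, init_cwnd=10, mss=1460):
    """Roughly how many round trips slow start needs to deliver a payload."""
    segments = -(-payload_bytes // mss)  # ceiling division
    sent, cwnd, rtts = 0, init_cwnd, 0
    while sent < segments:
        sent += cwnd
        cwnd *= 2  # slow start: the window doubles every round trip
        rtts += 1
    return rtts

print(round_trips(2_000_000))  # ~2 MB blob   -> about 8 round trips
print(round_trips(300_000))    # ~300 kB blob -> about 5 round trips
```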
Some of those payloads are much larger than a few kilobytes (READY, MESSAGE_CREATE etc.)
There is a section and data on "time to compress". No time to decompress though.
Performance is probably the wrong lens. Mobile data is often expensive in terms of money, whereas compression is cheap in terms of CPU time. More compression is almost always the right answer for users of mobile apps.
Interesting way to approach this (dictionary based compression over JSON and Erlang ETF) vs. moving to a schema-based system like Cap'n Proto or Protobufs where the repeated keys and enumeration values would be encoded in the schema explicitly.
Also would be interested in benchmarks between Zstandard vs. LZ4 for this use case - for a very different use case (streaming overlay/HUD data for drones), I ended up using LZ4 with dictionaries produced by the Zstd dictionary tool. LZ4 produced similar compression at substantially higher speed, at least on the old ARM-with-NEON processor I was targeting.
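For anyone curious what that dictionary workflow looks like in practice, here's a minimal sketch with python-zstandard (the sample payloads are invented; a dictionary trained this way can also be handed to LZ4's dictionary APIs):

```python
import json
import random

import zstandard  # assumption: the python-zstandard package is installed

# Invented samples standing in for many small, similarly-shaped messages.
samples = [
    json.dumps({"op": 0, "t": "MESSAGE_CREATE",
                "d": {"channel_id": str(random.randrange(10**18)),
                      "content": f"message {i}"}}).encode()
    for i in range(1_000)
]

# Train a small shared dictionary from the samples.
dictionary = zstandard.train_dictionary(16_384, samples)

plain = zstandard.ZstdCompressor(level=6)
with_dict = zstandard.ZstdCompressor(level=6, dict_data=dictionary)

msg = samples[0]
print("no dict:  ", len(plain.compress(msg)), "bytes")
print("with dict:", len(with_dict.compress(msg)), "bytes")
```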
I guess it's not totally wild but it's a bit surprising that common bootstrapping responses (READY) were 2+MB, as well.
Hard disagree, given the constraints. Every bot is also consuming the Discord API, and forcing 3rd-party devs, many of whom aren't particularly advanced coders, to suddenly deal with a binary wire format would be painful, especially if you needed to constantly update a proto file. Their API is also part WebSocket, part HTTP, with many methods doing double duty.
To be fair, this is exactly what the Accept and Content-Type standard HTTP headers are for. Clients can tell the API "OK, send me application/json data instead of binary data" or vice versa. You can have the majority of your traffic (client traffic) using the binary format, and still support JSON for bot API usage. This is standardized for both WebSockets and HTTP.
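As a sketch of what the server side of that negotiation could look like (hypothetical handler; msgpack here is just a stand-in for whatever binary encoding you'd actually pick):

```python
import json

import msgpack  # assumption: stand-in binary encoding, not what Discord would pick

def encode_response(payload: dict, accept_header: str) -> tuple[bytes, str]:
    """Pick a wire format based on the client's Accept header."""
    if "application/x-msgpack" in accept_header:
        return msgpack.packb(payload), "application/x-msgpack"
    # Default: keep JSON for bots and anything that didn't opt in.
    return json.dumps(payload).encode(), "application/json"

body, content_type = encode_response(
    {"op": 0, "d": {"content": "hi"}}, "application/x-msgpack"
)
print(content_type, len(body), "bytes")
```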
Is there a way to do this that doesn't require keeping two sets of books for every API? Because the JSON API is right now the canonical one and still has to work. I don't imagine the lift is worth it for the difference between compressed JSON and BSON.
Also, how much can you realistically win when the payloads for small messages, where the difference matters, are mostly text?
Modern serialization libraries make supporting multiple serialization formats pretty transparent - of course, I'm not sure what the current situation is in Elixir land, which Discord seem to be using, but Go and Rust (as trivial examples) have serialization libraries which make serialization based on content negotiation pretty much transparent. Of course, this doesn't help with testing, you'll still need to be testing both content types separately, but the savings in bandwidth might just be worth it.
Moreover, I imagine a lot of these bots are built on top of an SDK instead of directly working with API calls, so it would just be a matter of changing the SDK internals.
The initial handshake request is still over regular HTTP, though, which is where I'd assume you'd want to agree upon which content type you'll be sending anyways.
Some RIFF-like format would not be that hard to parse different sections of. You get to ignore the parts you don't recognise and decode the parts you do.
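Something like this, where each section is a tag plus a length so a reader can skip anything it doesn't recognise (purely illustrative layout, not anything Discord uses):

```python
import struct

def iter_chunks(buf: bytes):
    """Yield (tag, payload) pairs from a simple tag/length/value layout."""
    offset = 0
    while offset < len(buf):
        tag, length = struct.unpack_from("<4sI", buf, offset)
        offset += 8
        yield tag, buf[offset:offset + length]
        offset += length

# A reader decodes the tags it knows and skips the rest.
blob = (struct.pack("<4sI", b"MSGC", 5) + b"hello"
        + struct.pack("<4sI", b"XXXX", 3) + b"???")
for tag, payload in iter_chunks(blob):
    if tag == b"MSGC":
        print("known chunk:", payload)
    # unknown tags (like b"XXXX") are simply ignored
```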
Moving to a binary format would be better for 99.9% of users and would be a slight inconvenience to a few people creating bots. Discord could easily publish a library for reading the format if needed.
I don't understand why so many protocols that expect to handle large amounts of data don't default to a binary schema. JSON is fine on the edges, but the wire format between nodes is not the edge.
Erlang terms are, to a first approximation, the same as JSON for most relevant communication metrics.
To a second approximation it gets more complicated. Atoms can save you some repetition of what would otherwise be strings, because they are effectively interned strings and get passed as tagged integers under the hood, but it's fairly redundant to what compression would get you anyhow, and erlang term representations of things like dictionaries can be quite rough:
Even with compression that's a loss compared to '{"a":1,"b":2}'.
Plus, even if you're stuck with JSON, one of the first things you do is shorten all your keys to the point where they're as efficient as tagged integers on the wire anyhow ("double-quote x double-quote" can even beat an integer; that's only 3 bytes). It doesn't take a genius to note that "the_message_after_it_has_been_formatted_by_the_markdown_processor" can be tightened up a bit if bandwidth is tight.
It isn't clearly a loss over JSON, but it is certainly not a clear win either. If you're converting from naive Erlang terms to some communication protocol encoding, you're already paying for a total conversion and you might as well choose from the full set of options at that point.
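To make the key-shortening point above concrete, a toy comparison (invented record; the exact savings obviously depend on the data):

```python
import json
import zlib

# The same record with verbose vs shortened keys.
verbose = {"the_message_after_it_has_been_formatted_by_the_markdown_processor": "hi",
           "author_display_name": "someone"}
short = {"x": "hi", "a": "someone"}

for name, record in (("verbose", verbose), ("short", short)):
    raw = json.dumps(record, separators=(",", ":")).encode()
    print(f"{name}: {len(raw)} bytes raw, {len(zlib.compress(raw, 6))} bytes zlib'd")
```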
>Diving into the actual contents of one of those PASSIVE_UPDATE_V1 dispatches, we would send all of the channels, members, or members in voice, even if only a single element changed.
> the metrics that guided us during the [zstd experiment] revealed a surprising behavior
This feels so backwards. I'm glad that they addressed this low-hanging fruit, but I wonder why they didn't do this metrics analysis from the start, instead of during the zstd experiment.
I also wonder why they didn't just send deltas from the get-go. If PASSIVE_UPDATE_V1 was initially implemented "as a means to scale Discord servers to hundreds of thousands of users", why was this obvious optimization missed?
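Conceptually the delta fix is just "send only the fields that changed"; a naive sketch (hypothetical payload shape, not Discord's actual PASSIVE_UPDATE_V2 schema) is a few lines:

```python
def delta(previous: dict, current: dict) -> dict:
    """Return only the top-level fields that changed since the last snapshot."""
    return {key: value for key, value in current.items()
            if previous.get(key) != value}

previous = {"channels": ["general", "dev"], "members": 40_000, "members_in_voice": 12}
current = {"channels": ["general", "dev"], "members": 40_001, "members_in_voice": 12}

print(delta(previous, current))  # {'members': 40001} instead of resending everything
```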
Something important that didn't get mentioned, neither in the post nor in the comments, is whether this is safe in the face of compression oracle attacks[1] like BREACH[2]. Given how much effort it seems Discord put into the compression rollout, I would be inclined to believe that they surely must have considered this, and I wish that they had written something more specific.
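For anyone unfamiliar with that attack class, the core observation is that compressed length leaks whether attacker-controlled input matches a secret compressed in the same context. A toy illustration (nothing to do with Discord's actual framing; real attacks average over many measurements):

```python
import zlib

SECRET = "session_token=hunter2"  # secret reflected into the compressed stream

def compressed_len(attacker_guess: str) -> int:
    # Attacker-controlled text compressed in the same context as the secret.
    return len(zlib.compress((SECRET + "session_token=" + attacker_guess).encode(), 9))

# A correct guess tends to compress better (shorter output) because it extends
# the back-reference; real attacks average over many requests to beat the noise.
for guess in ("hunter2", "abcdefg"):
    print(guess, compressed_len(guess))
```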
Sounds like a problem with your computer. I can have discord, 50 browser tabs, two different games, a JetBrains IDE and various other stuff open at the same time without any trouble at all.
And my computer isn't particularly crazy. Maybe like $1500.
My computer isn't great, I'll admit, but I'm making a relative comparison. I visit many different websites on my low spec computer but discord is a noticeable outlier on how it affects the performance.
How many servers have you joined and how many of those are large and active? Also relevant, do you need to be in all of them?
Most of the time I have seen people complain about this it is because they have joined a ton of hyperactive servers.
You could argue it shouldn't be an issue, and that things like messages should be loaded more dynamically per server. But then you'd have people complaining that switching servers takes so long.
>How many servers have you joined and how many of those are large and active?
Yes.
>Also relevant, do you need to be in all of them?
Yes.
You must be new here, because if you aren't connected to dozens of servers and idling in hundreds of channels (you only speak in maybe two or three of them) you aren't IRCing right.
What? I'm a confused old clod because we're talking about Discord in the year of our lord 2024? Same thing, it's a massive textual chat network based on a server-channel hub-spoke architecture at its core.
What is actually worth our time asking is why we could do all that and more with no problems in the 80s and 90s using hardware a thousandth or less as powerful as what we have today.
Given the 'mutual servers' feature of Discord, one could argue that Discord encourages users to idle in many servers even more aggressively than IRC did, because of the social-networking implications.
In reality, on IRC almost every client connected with invisible mode on, so aggregating big Venn diagrams of overlapping channels between users in every channel was a lot more laborious.
One thing I appreciate very much about this article is that they describe things they tried that didn't work as well. It's becoming increasingly rare (and understandably so) for articles to describe failed attempts, but it's very interesting and helpful for someone unfamiliar with the space!