When can we expect discord not to take 20-30 seconds to launch on a $5000 PC? What exactly is it doing, recompiling the client from source using a single core each time it opens?
It needs to download the 20 distinct update patches that they added in the past 4 hours, in series, all of which together combine to change the actual product in precisely no way whatsoever.
The deb file probably contains new versions of electron (or at least versions of electron that change the actual v8/chrome version) and the regular updates during launch are loading JavaScript. Just a guess.
Flatpak and its ilk are a scourge that should be avoided. I don’t want to have to install another package management system when apt is right there. Moreover, I like apt precisely because it has in-depth knowledge about everything in the system. If I install something from source, it’s a careful decision, and I know the tradeoffs. With things like Flatpak, that line is blurred.
While I think you might be missing out, I totally respect your preferences. For the longest time it was also my preference.
In my case, ultimately, paranoia has won. :)
How much do you really trust discord? Or games on steam, or slack, or zoom client?
Flatpak does provide you, out of the box, with a sandbox for the whole Discord app. The sandbox is built on standard Linux machinery like bubblewrap / seccomp / namespaces. This should prevent Discord from, for example, accessing anything other than ~/Downloads (rw), ~/Pictures (ro) and so on. Or seeing what other processes are running on my PC. Or snooping in my ~/.ssh.
Some stuff might not work out of the box and will require reading the app's README or wiki. I imagine stuff like "john is listening to song - artist" would require additional permission or configuration. File sharing could be complicated if, for example, Discord could not access my screenshots directory. Webcam, voice and screen sharing work out of the box but are protected by "portals" provided by (in my case) KDE.
Discord, for example, does not pollute my /usr or /home. It stays somewhere in /var/lib/flatpak/ and keeps my user files in ~/.var/app/$APPNAME. Steam, Slack, all other "big proprietary apps" also keep their stuff in ~/.var/app/$APPNAME. It's not really a choice for them; flatpak just mounts proper directories in proper places and it works really well.
You can use `nsenter --target $PID --pid --net --mount --ipc` and look around to what kind of access flatpak apps have to your system.
Sorry about preaching, I just like that bit of additional separation from stuff I don't really want to trust or care about.
Take a few minutes to learn how it works then. Flatpak doesn't replace apt, which is a system package manager. Flatpak is for standalone apps, like Discord.
? Apt doesn’t require you to install from source? It’s not even its default behavior?
Flatpak provides a basic level of sandboxing that ensures non-malicious stupid stuff, like the Steam bug that deleted the entire user directory, doesn't actually impact you.
Beyond that, Discord probably won't fix issues with the N distro package managers if something breaks. They're a lot more likely to fix a Flatpak problem, though.
I meant that if I deviate from apt (e.g. I want a newer version of something than what the repo has), I carefully consider what ramifications compiling and installing something from source may have on the stability of the system, which likely shares many of the same libraries.
Much of this is just grumbling. I love Debian, hate Ubuntu, and think things shouldn’t change a lot.
This is why I tend to use Flatpak browsers. Download access to the Downloads folder is really all they need on my system (for my needs; needs may vary by user).
I mute every "server" permanently, disable all "mentions" (@here, @role, etc) and it stays quiet except for DMs. If there's something I want to see I'll go look in a channel.
I have applied the same measures as GP; if it is interesting/important enough, I will find it manually. (Similarly, I am an administrator of a public server; I regularly remind regulars and other admins that I'll only get notified from DMs.)
For announcement channels that I'm feeling more certain I don't want to miss: I created a server for myself with a channel that has subscriptions to all the announcement channels that I want to be notified about (I also check this manually, but at least it is a custom-tailored feed just for me).
There is no coping with the (..checks Discord..) "4k+" outstanding notifications (and counting) that I am bombarded with. Slightly ironic that the Discord interface can't even be bothered to tell me the exact count; I'm not going to bother.
It used to amuse me to see people complain about @everyone tags. The commonality of it leaves me desensitized these days.
The channels with no activity remain grey instead of white. It's important to me that I discover messages when I have some downtime and am interested in so-called discovery, as opposed to having junk interrupt my flow at times when I am not.
Loads in less than a second for me using the website. I'm just on a 1st gen Mac Mini M1 w/ Chrome. I mainly use Discord for MidJourney, and it has always been smooth.
Are there any benefits/features to using their Electron desktop app?
I generally find that it's easier to interact with AV devices (webcams, microphones) on the desktop app, than when having to deal with a browser's opinions. It also separates it from your browser, which might be desirable if you keep a bunch of tabs and don't want them open in the background.
I use either option in different circumstances; they're both comparably fine.
It almost never is. While HN contrarians endlessly complain about it, the rest of the world happily ships Electron. Discontent with Electron is so comically over represented here that I'm beginning to think it's an insecurity or complex or something.
In my experience, some regular users also dislike the UX of websites as applications, and that's what Electron apps are. They feel heavier than native software, the UI elements tend to redraw and jump around unexpectedly, and the keyboard response is noticeably slower.
You can recognize them because they dislike the feel without even knowing what an Electron app is. I can vouch that they're not insecure about Electron.
Only if using DOM right? If someone ported a WebGL game to Electron the same 3D engine would be at play. Or 2D canvas for that matter. I think you're specifically referring to the overhead of browser DOM and how elements therein load, focus, apply styling, etc.
I hate every Electron app I use, and that includes vscode.
There is a unique type of jank associated with it.
Would love to live in the apparently "bizarro" world where desktop-class apps developed using tools that try to be more native to the OS were still successful in 2024.
Every update to Adobe Creative Suite makes it closer and closer to an Electron app and further and further away from its roots as one of the premier high-end desktop-caliber programs. I interpret the rise of Electron as mostly correlated with the lack of knowledge of old-school desktop GUI development among the folks who teach CS.
I expect that FL Studio, Photoshop, Word, and every other remaining good desktop-caliber app will eventually be lost to the "make everything run in a web browser" trend. I'm so over it, and it hasn't even happened yet.
This is also the reason why GenAI STILL has no good prosumer tools. ComfyUI/Automatic1111/Forge/lm-studio are the closest we have and they're gradio or electron webapps which indicates that AI folks are not down with leaving the python or JS ecosystem for even a minute.
This means we need good desktop caliber GUI developers who understand AI. Unfortunately, they basically don't exist.
Thus, the world runs SOTA models on janky shit gradio or electron frontends instead of the desktop GUIs we had for similar UIs 10-20 years ago.
Word 2003 was peak software, and I'll never change my mind.
For me the problem with Electron is the “uncanny valley” effect - every app implements its own UI elements, and nothing is quite the same as the OS’s native behaviour.
Different apps have different unique quirks/odd behaviour, almost none of them do the right thing on platforms like macOS.
I do find it sad that native apps were quick and snappy back in the day, but now everything is effectively a browser, and has a DOM that needs to be parsed and dealt with, thus negating the hardware Moore's Law improvements that have happened since.
I used to feel the same way until Apple started dogfooding their ill-performing SwiftUI. The insane hardware packed into M1-3 chips compensates for it a bit (not always), but especially on my Intel MBP it can be jankier than most of the Electron apps I interact with regularly. If VSCode was as janky as the "native" Settings app on macOS I sure as hell wouldn't use it.
yea man, I think it's stupid for a text editor like Atom to be 200mb and take up 1gb of ram because I have a "complex". I'm annoyed at the 120 independent copies of CEF on my computer using 20gb of disk space because I'm "insecure"
The onus is really on Discord, but you can use https://openasar.dev to partially fix the problem for yourself - it's an open source drop-in replacement for the client updater/bootstrapper.
* You can't share your screen in a way optimized for text clarity. In the native app, even without nitro, you effectively have the "source resolution, 5 fps" option. With the web app, you are limited to 720p.
* You can't make the full use of your camera. On the web, 640x480 is the only available resolution, while in the native app, 1280x720 is the default.
* Also, on platforms other than Linux, with the native app, you can stream the game audio separately from your microphone. This functionality is not available via the web or on Linux unless you install an unofficial client.
Ah, I didn't know people were using Discord for video calls. No one has ever asked me to jump on a Discord video call. Then again, I'm not a PC gamer. I honestly thought it was basically just a modern IRC, to chat in specific groups/channels, and to ask the MidJourney bot to generate images of cats wearing pajamas.
In true JS fashion, it rewrites itself in a new framework every time you open it, and every fifth time, it chooses a different package manager just for fun.
This is true, but in the past various Discord employees have explicitly said (on HN, no less) that they don't intentionally ban accounts for using them, only that sometimes their anti-spam systems can flag them as false positives (and that you may be able to submit a ticket in this case), so I wouldn't worry too much.
Reading through the post, they seem to have been hyper-focused on compression ratios and reducing the payload size/network bandwidth as much as possible, but I don't see a single mention of CPU time or evidence of any actual measurable improvement for the end user. I have been involved with a few such efforts at my own company, and the conclusion always was that the added compression/decompression overhead on both sides resulted in worse performance. Especially considering we are talking about packets at the scale of bytes or a few kilobytes at most.
They explicitly mention compression time. It's actually lower in the new approach.
> the compression time per byte of data is significantly lower for zstandard streaming than zlib, with zlib taking around 100 microseconds per byte and zstandard taking 45 microseconds
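If anyone wants to sanity-check that claim against their own payloads, here's a rough micro-benchmark sketch using Python's zlib and the python-zstandard package (the payload shape and message count are made up, and Discord's gateway is Elixir, so treat this purely as an illustration of streaming compression per message):

```python
import json
import time
import zlib

import zstandard  # assumption: the python-zstandard package is installed

# Hypothetical small gateway-style payload (a few hundred bytes).
message = json.dumps({
    "op": 0, "t": "MESSAGE_CREATE",
    "d": {"id": "1234567890", "channel_id": "42",
          "content": "hello world", "author": {"id": "99"}},
}).encode()

def bench_zlib(msgs):
    c = zlib.compressobj(6)
    start = time.perf_counter()
    for m in msgs:
        c.compress(m)
        c.flush(zlib.Z_SYNC_FLUSH)  # streaming: emit a flushed block per message
    return time.perf_counter() - start

def bench_zstd(msgs):
    c = zstandard.ZstdCompressor(level=6).compressobj()
    start = time.perf_counter()
    for m in msgs:
        c.compress(m)
        c.flush(zstandard.COMPRESSOBJ_FLUSH_BLOCK)  # keep the stream open between messages
    return time.perf_counter() - start

msgs = [message] * 10_000
for name, fn in (("zlib", bench_zlib), ("zstd", bench_zstd)):
    elapsed = fn(msgs)
    print(f"{name}: {elapsed * 1e6 / len(msgs):.1f} us/msg, "
          f"{len(message) * len(msgs) / elapsed / 1e6:.1f} MB/s")
```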
For what it’s worth, the benchmark on the Zstandard homepage[1] shows none of the setups tested breaking 1GB/s on compression, and only the fastest and sloppiest ones breaking 1GB/s on decompression. If you can live with its API limitations, libdeflate is known[2] to squeeze past 1GB/s decompressing normal Deflate compression levels. In any case, asking for multiple GB/s is probably unfair.
Still, looking at those benchmarks, 10MB/s sounds like the absolute minimum reasonable speed, and they’re reporting nearly three orders of magnitude below that. A modern compressor does not run at mediocre dialup speeds; something in there is absolutely murdering the performance.
And I’m willing to believe it’s just the constant-time overhead. The article mentions “a few hundred bytes” per message payload in a stream of messages, and the actual data of the benchmarks implies 1.6KB uncompressed. Even though they don’t reinitialize the compressor on each message, that is still a very very modest amount of data.
So it might be that general-purpose compressors are simply a bad tool here from a performance standpoint. I’m not aware of a good tool for this kind of application, though.
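For concreteness, the back-of-the-envelope behind those figures is just unit conversion; nothing here is Discord-specific:

```python
US_PER_BYTE_ZSTD = 45   # figure quoted in the article
PAYLOAD_BYTES = 1_600   # implied uncompressed payload size

throughput_kb_s = 1 / (US_PER_BYTE_ZSTD * 1e-6) / 1_000    # ~22 KB/s
ms_per_payload = PAYLOAD_BYTES * US_PER_BYTE_ZSTD / 1_000  # ~72 ms

print(f"~{throughput_kb_s:.0f} KB/s, ~{ms_per_payload:.0f} ms per payload")
# Against a 10 MB/s floor that's roughly a 450x shortfall; a compressor running
# at hundreds of MB/s would be around 0.005 us/byte, not 45.
```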
One thing to note is that on a given gateway server there are potentially 100k other compression contexts active, and given each connection is transmitting a trickle of small data in an unpredictable way, from different CPU cores as the processes are scheduled by the erlang VM, chances are the CPU caches are absolutely being thrashed. I imagine this contributes to some level of fixed overhead here too, especially when you're measuring these timings on a machine serving actual production traffic as opposed to simply running a bunch of small payloads through a single compressor.
It’s possible, I guess, but it wouldn’t be my first thought. It’s too slow for that.
A payload of 1.6KB at 45us/B is 75ms, which is below the typical scheduling quantum of about 100ms. (Can’t say anything about Erlang, let alone Erlang bindings to C libraries, but I wouldn’t expect it to be that much smaller either, precisely because of the switching overhead both direct and indirect.) So a single compression operation shouldn’t be getting preempted enough to affect the results.
Typical RAM bandwidth is tens of GB/s (even consumer-class SSDs[1] are single-digit GB/s) so with tens to hundreds of cores that’s not enough to affect anything, and even taking into account the compressor’s window is measured in megabytes not kilobytes that’s likely not enough (it would be a bad compressor that reread its whole window each time, anyway). And the data we’re compressing is not only minuscule, it has just been generated and is virtually guaranteed to be cached.
Honestly, I almost want to say that the benchmark is measuring the wrong thing somehow, except they’re reporting a 2× speedup switching from one compressor to another. So it can’t be the JSON encoding overhead or whatnot, and, unless one of the Erlang bindings is somehow drastically stupider than the other, it shouldn’t be the FFI overhead, and even those are a huge stretch. The Flying Spaghetti Monster be merciful, I cannot see anything here that we could be spending over a hundred million cycles on.
At this point I’m hoping somebody just mixed up the units, because this is really unsettling.
First, let's establish a cheery mood: Happy Friday!!!!
Second, I noticed we're extrapolating from a tossed out measurement in "microseconds per byte" here, of extremely small payloads, probably included fixed-cost overhead of doing anything at all.
All leading up to: Is "atrocious" the right word choice here? :)
More directly: do you really think Discord rolled out a compression algorithm that does 23 KB/s for payloads in the megabytes?
Even more directly, avoiding being passive and just adopting your tone: this is atrocious analysis that glibly chooses to create obviously wrong numbers, then criticizes them as if they are real.
I think one thing this blog post did not mention was the historical context of moving from uncompressed to compressed traffic (using zlib), something I worked on in 2017. IIRC, the bandwidth savings were massive (75%). It did use more server side CPU, and negligible client side CPU, so we went for it anyways as bandwidth is a very precious thing to optimize for especially with cloud bandwidth costs.
Either way the incremental improvements here are great - and it's important to consider optimization both from transport level (compression, encoding) and also from a protocol level (the messages actually sent over the wire.)
Also, one thing not mentioned is that client-side decompression on desktop moved from a JS implementation of zlib (pako) to a native implementation that's exposed to the client via napi bindings.
Can't remember the last time I had to worry about bandwidth for the servers. It only came up when talking about iPhones at the start, because everyone was on really slow mobile networks. Our company is usually very cost sensitive, but all our server hotels so far have had unmetered bandwidth connected to a 100 Mbps interface. We've had zero complaints during the last 20 years, even though we fill that one now and then.
But we don't use any of the usual cloud offerings, only smaller local companies.
I think the person is concerned with client-side compute, not just server-side compute. The article does not mention whether zstd has additional decompression overhead compared to zlib.
Client-side compute may sound like a contrived issue, but Discord runs on a wide variety of devices. Many of these devices are not necessarily the latest flagship smartphones, or a computer with a recent CPU.
I am going to guess that zstd decompression is roughly as expensive as zlib, since (de)compression time was a motivating factor in the development of zstd. That's also the reason to prefer zstd over xz, despite the latter providing better compression efficiency.
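If someone wanted to actually measure it, a rough client-side decompression sketch in Python (made-up payload; not anything Discord ships) would be something like:

```python
import json
import time
import zlib

import zstandard  # assumption: the python-zstandard package is installed

# Made-up payload roughly resembling a list of messages.
payload = json.dumps(
    {"d": [{"id": str(i), "content": "x" * 100} for i in range(100)]}
).encode()

zlib_blob = zlib.compress(payload, 6)
zstd_blob = zstandard.ZstdCompressor(level=6).compress(payload)

def mean_us(fn, runs=5_000):
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs * 1e6

dctx = zstandard.ZstdDecompressor()
print("zlib decompress:", round(mean_us(lambda: zlib.decompress(zlib_blob))), "us")
print("zstd decompress:", round(mean_us(lambda: dctx.decompress(zstd_blob))), "us")
```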
though I always thought lz4 to be the sweet spot for anything requiring speed, somewhat less compression ratio in exchange for very fast compression and decompression
> Looking once again at MESSAGE_CREATE, the compression time per byte of data is significantly lower for zstandard streaming than zlib, with zlib taking around 100 microseconds per byte and zstandard taking 45 microseconds.
They're going from 2+MB (for some reason) to 300KB - even if decompression is "slow," that's going to be a win for their bandwidth costs and for perceived speed for _most_ users.
I was surprised to see little server-side CPU benchmarking too, though. While I'd expect overall client timing for (transfer + decompress) to be improved dramatically unless the user was on a ridiculously fast network connection, I can't imagine server load not being affected in a meaningful way.
There already was compression before, through zlib. The finding, as shown in the post, was that Zstandard was also a lot more efficient than zlib from a CPU-time standpoint.
The 2MB case is pathological - an account on MANY servers with no local cache state (the READY payload only sends data that's changed since you last connected, by having the client send hashes of the data it already knows.)
The bandwidth probably doesn't really matter, but a 2MB must-have blob vs a 300kB must-have blob at the start of a connection is a big difference.
The start of a TCP connection is limited by round-trip times more than bandwidth. Especially for mobile, optimizing to reduce the number of round trips required is pretty handy.
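A rough slow-start estimate, assuming an initial congestion window of 10 segments, a ~1460-byte MSS, the window doubling each round trip, and ignoring TLS/WebSocket overhead:

```python
def round_trips(payload_bytes, init_cwnd=10, mss=1460):
    """Roughly how many round trips slow start needs to deliver a payload."""
    segments = -(-payload_bytes // mss)  # ceiling division
    sent, cwnd, rtts = 0, init_cwnd, 0
    while sent < segments:
        sent += cwnd
        cwnd *= 2  # slow start: the window doubles every round trip
        rtts += 1
    return rtts

print(round_trips(2_000_000))  # ~2 MB blob   -> about 8 round trips
print(round_trips(300_000))    # ~300 kB blob -> about 5 round trips
```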
Some of those payloads are much larger than a few kilobytes (READY, MESSAGE_CREATE etc.)
There is a section and data on "time to compress". No time to decompress though.
Performance is probably the wrong lens. Mobile data is often expensive in terms of money, whereas compression is cheap in terms of CPU time. More compression is almost always the right answer for users of mobile apps.
Interesting way to approach this (dictionary based compression over JSON and Erlang ETF) vs. moving to a schema-based system like Cap'n Proto or Protobufs where the repeated keys and enumeration values would be encoded in the schema explicitly.
Also would be interested in benchmarks between Zstandard vs. LZ4 for this use case - for a very different use case (streaming overlay/HUD data for drones), I ended up using LZ4 with dictionaries produced by the Zstd dictionary tool. LZ4 produced similar compression at substantially higher speed, at least on the old ARM-with-NEON processor I was targeting.
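For anyone curious what that dictionary workflow looks like in practice, here's a minimal sketch with python-zstandard (the sample payloads are invented; a dictionary trained this way can also be handed to LZ4's dictionary APIs):

```python
import json
import random

import zstandard  # assumption: the python-zstandard package is installed

# Invented samples standing in for many small, similarly-shaped messages.
samples = [
    json.dumps({"op": 0, "t": "MESSAGE_CREATE",
                "d": {"channel_id": str(random.randrange(10**18)),
                      "content": f"message {i}"}}).encode()
    for i in range(1_000)
]

# Train a small shared dictionary from the samples.
dictionary = zstandard.train_dictionary(16_384, samples)

plain = zstandard.ZstdCompressor(level=6)
with_dict = zstandard.ZstdCompressor(level=6, dict_data=dictionary)

msg = samples[0]
print("no dict:  ", len(plain.compress(msg)), "bytes")
print("with dict:", len(with_dict.compress(msg)), "bytes")
```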
I guess it's not totally wild but it's a bit surprising that common bootstrapping responses (READY) were 2+MB, as well.
Hard disagree, given the constraints. Every bot is also consuming the Discord API, and forcing 3rd-party devs, many of whom aren't particularly advanced coders, to suddenly deal with a binary wire format would be painful, especially if you needed to constantly update a proto file. Their API is also part WebSocket, part HTTP, with many methods doing double duty.
To be fair, this is exactly what the Accept and Content-Type standard HTTP headers are for. Clients can tell the API "OK, send me application/json data instead of binary data" or vice versa. You can have the majority of your traffic (client traffic) using the binary format, and still support JSON for bot API usage. This is standardized for both WebSockets and HTTP.
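As a sketch of what the server side of that negotiation could look like (hypothetical handler; msgpack here is just a stand-in for whatever binary encoding you'd actually pick):

```python
import json

import msgpack  # assumption: stand-in binary encoding, not what Discord would pick

def encode_response(payload: dict, accept_header: str) -> tuple[bytes, str]:
    """Pick a wire format based on the client's Accept header."""
    if "application/x-msgpack" in accept_header:
        return msgpack.packb(payload), "application/x-msgpack"
    # Default: keep JSON for bots and anything that didn't opt in.
    return json.dumps(payload).encode(), "application/json"

body, content_type = encode_response(
    {"op": 0, "d": {"content": "hi"}}, "application/x-msgpack"
)
print(content_type, len(body), "bytes")
```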
Is there a way to do this that doesn't require keeping two sets of books for every API? Because the JSON API is right now the canonical one and still has to work. I don't imagine the lift is worth it for the difference between compressed JSON and BSON.
Also, how much can you realistically win when the payloads for small messages, where the difference matters, are mostly text?
Modern serialization libraries make supporting multiple serialization formats pretty transparent - of course, I'm not sure what the current situation is in Elixir land, which Discord seem to be using, but Go and Rust (as trivial examples) have serialization libraries which make serialization based on content negotiation pretty much transparent. Of course, this doesn't help with testing, you'll still need to be testing both content types separately, but the savings in bandwidth might just be worth it.
Moreover, I imagine a lot of these bots are built on top of an SDK instead of directly working with API calls, so it would just be a matter of changing the SDK internals.
The initial handshake request is still over regular HTTP, though, which is where I'd assume you'd want to agree upon which content type you'll be sending anyways.
Some RIFF-like format would not be that hard to parse different sections of. You get to ignore the parts you don't recognise and decode the parts you do.
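Something like this, where each section is a tag plus a length so a reader can skip anything it doesn't recognise (purely illustrative layout, not anything Discord uses):

```python
import struct

def iter_chunks(buf: bytes):
    """Yield (tag, payload) pairs from a simple tag/length/value layout."""
    offset = 0
    while offset < len(buf):
        tag, length = struct.unpack_from("<4sI", buf, offset)
        offset += 8
        yield tag, buf[offset:offset + length]
        offset += length

# A reader decodes the tags it knows and skips the rest.
blob = (struct.pack("<4sI", b"MSGC", 5) + b"hello"
        + struct.pack("<4sI", b"XXXX", 3) + b"???")
for tag, payload in iter_chunks(blob):
    if tag == b"MSGC":
        print("known chunk:", payload)
    # unknown tags (like b"XXXX") are simply ignored
```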
Moving to a binary format would be better for 99.9% of users and would be a slight inconvenience to a few people creating bots. Discord could easily publish a library for reading the format if needed.
I don't understand why so many protocols that expect to handle large amounts of data don't default to a binary schema. JSON is fine on the edges, but the wire format between nodes is not the edge.
Erlang terms are, to a first approximation, the same as JSON for most relevant communication metrics.
To a second approximation it gets more complicated. Atoms can save you some repetition of what would otherwise be strings, because they are effectively interned strings and get passed as tagged integers under the hood, but it's fairly redundant to what compression would get you anyhow, and erlang term representations of things like dictionaries can be quite rough:
Even with compression that's a loss compared to '{"a":1,"b":2}'.
Plus, even if you're stuck with JSON, one of the first things you do is shorten all your keys to the point where they're as efficient as tagged integers on the wire anyhow ("double-quote x double-quote" can even beat an integer; that's only 3 bytes). It doesn't take a genius to note that "the_message_after_it_has_been_formatted_by_the_markdown_processor" can be tightened up a bit if bandwidth is tight.
It isn't clearly a loss over JSON, but it is certainly not a clear win either. If you're converting from naive Erlang terms to some communication protocol encoding, you're already paying for a total conversion and you might as well choose from the full set of options at that point.
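To make the key-shortening point above concrete, a toy comparison (invented record; the exact savings obviously depend on the data):

```python
import json
import zlib

# The same record with verbose vs shortened keys.
verbose = {"the_message_after_it_has_been_formatted_by_the_markdown_processor": "hi",
           "author_display_name": "someone"}
short = {"x": "hi", "a": "someone"}

for name, record in (("verbose", verbose), ("short", short)):
    raw = json.dumps(record, separators=(",", ":")).encode()
    print(f"{name}: {len(raw)} bytes raw, {len(zlib.compress(raw, 6))} bytes zlib'd")
```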
>Diving into the actual contents of one of those PASSIVE_UPDATE_V1 dispatches, we would send all of the channels, members, or members in voice, even if only a single element changed.
> the metrics that guided us during the [zstd experiment] revealed a surprising behavior
This feels so backwards. I'm glad that they addressed this low-hanging fruit, but I wonder why they didn't do this metrics analysis from the start, instead of during the zstd experiment.
I also wonder why they didn't just send deltas from the get-go. If PASSIVE_UPDATE_V1 was initially implemented "as a means to scale Discord servers to hundreds of thousands of users", why was this obvious optimization missed?
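Conceptually the delta fix is just "send only the fields that changed"; a naive sketch (hypothetical payload shape, not Discord's actual PASSIVE_UPDATE_V2 schema) is a few lines:

```python
def delta(previous: dict, current: dict) -> dict:
    """Return only the top-level fields that changed since the last snapshot."""
    return {key: value for key, value in current.items()
            if previous.get(key) != value}

previous = {"channels": ["general", "dev"], "members": 40_000, "members_in_voice": 12}
current = {"channels": ["general", "dev"], "members": 40_001, "members_in_voice": 12}

print(delta(previous, current))  # {'members': 40001} instead of resending everything
```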
Something important that didn't get mentioned, neither in the post nor in the comments, is whether this is safe in the face of compression oracle attacks[1] like BREACH[2]. Given how much effort it seems Discord put into the compression rollout, I would be inclined to believe that they surely must have considered this, and I wish that they had written something more specific.
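For anyone unfamiliar with that attack class, the core observation is that compressed length leaks whether attacker-controlled input matches a secret compressed in the same context. A toy illustration (nothing to do with Discord's actual framing; real attacks average over many measurements):

```python
import zlib

SECRET = "session_token=hunter2"  # secret reflected into the compressed stream

def compressed_len(attacker_guess: str) -> int:
    # Attacker-controlled text compressed in the same context as the secret.
    return len(zlib.compress((SECRET + "session_token=" + attacker_guess).encode(), 9))

# A correct guess tends to compress better (shorter output) because it extends
# the back-reference; real attacks average over many requests to beat the noise.
for guess in ("hunter2", "abcdefg"):
    print(guess, compressed_len(guess))
```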
Sounds like a problem with your computer. I can have discord, 50 browser tabs, two different games, a JetBrains IDE and various other stuff open at the same time without any trouble at all.
And my computer isn't particularly crazy. Maybe like $1500.
My computer isn't great, I'll admit, but I'm making a relative comparison. I visit many different websites on my low spec computer but discord is a noticeable outlier on how it affects the performance.
How many servers have you joined and how many of those are large and active? Also relevant, do you need to be in all of them?
Most of the time I have seen people complain about this it is because they have joined a ton of hyperactive servers.
You could argue it shouldn't be an issue, and that things like messages should be loaded more dynamically per server. But then you'd have people complaining that switching servers takes so long.
>How many servers have you joined and how many of those are large and active?
Yes.
>Also relevant, do you need to be in all of them?
Yes.
You must be new here, because if you aren't connected to dozens of servers and idling in hundreds of channels (you only speak in maybe two or three of them) you aren't IRCing right.
What? I'm a confused old clod because we're talking about Discord in the year of our lord 2024? Same thing, it's a massive textual chat network based on a server-channel hub-spoke architecture at its core.
What is actually worth our time asking is why we could do all that and more with no problems in the 80s and 90s using hardware a thousandth or less as powerful as what we have today.
Given the 'mutual servers' feature of Discord, one could argue that Discord encourages users to idle in many servers even more aggressively than IRC did, because of the social-networking implications.
In reality, on IRC almost every client connected with invisible mode on, so aggregating big Venn diagrams of overlapping channels between users in every channel was a lot more laborious.
One thing I appreciate very much about this article is that they describe things they tried that didn't work as well. It's becoming increasingly rare (and understandably so) for articles to describe failed attempts, but it's very interesting and helpful for someone unfamiliar with the space!