But wait! Not so fast (smile). This benchmark uses the compositor "raw Mutter 46.0", not GNOME Shell. Raw mutter is "a very bare-bones environment that is only really meant for testing."
In addition, this measurement is not end-to-end because it does not include keyboard latency. In this test, the "board sends a key press over USB (for example, Space)". Latencies just within a keyboard can go up to 60msec by themselves:
https://danluu.com/keyboard-latency/
What are the true end-to-end numbers for the default configuration, which is the only situation and configuration that really matters? I wish the article had measured that. I suspect the numbers will be significantly worse.
I do congratulate the work of the GNOME team and the benchmarker here. Great job! But there are important unanswered questions.
Of course, the Apple //e used hardware acceleration and didn't deal with Unicode, so there are many differences. Also note that the Apple //e design was based on the older Apple ][, designed in the 1970s.
Still, it would be nice to return to the human responsiveness of machines 41+ years old.
I'd argue that it's actually a good thing that the author ignored keyboard latency. We all have different keyboards plugged into different USB interfaces plugged into different computers running different versions of different operating systems. Throw hubs and KVMs into the mix, too.
If the latency of those components varies wildly over the course of the test, it would introduce noise that reduces our ability to analyze the exact topic of the article - VTE's latency improvements.
Even if the latency of those components were perfectly consistent over the course of the author's test, then it wouldn't affect the results of the test in absolute terms, and the conclusion wouldn't change.
This exact situation is why differences in latency should never be expressed as percentages. There are several constants that can be normalized for a given sample set, but can't be normalized across an entire population of computer users. The author does a good job of avoiding that pitfall.
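To make the percentage pitfall concrete, here's a rough sketch with made-up numbers (none of them are from the article): a fixed overhead shared by before and after leaves the absolute improvement untouched but changes any percentage you quote.

    # Toy numbers (not from the article): a fixed shared overhead
    # (keyboard, USB, monitor) distorts percentages but not the
    # absolute difference.
    overhead_ms = 25.0       # hypothetical keyboard + display overhead
    before_ms = 40.0         # hypothetical software-only latency, old VTE
    after_ms = 20.0          # hypothetical software-only latency, new VTE

    abs_improvement = before_ms - after_ms
    rel_software_only = abs_improvement / before_ms
    rel_end_to_end = abs_improvement / (before_ms + overhead_ms)

    print(abs_improvement)    # 20.0 ms -- the same for every user
    print(rel_software_only)  # 0.50 -> "50% faster"
    print(rel_end_to_end)     # ~0.31 -> "31% faster", depends entirely on the overhead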
The Mutter thing is interesting. The author holds that constant, and GNOME sits on top of Mutter, so I think it's reasonable to assume we'd see the same absolute improvement in latency. GNOME may also introduce its own undesirable variance, just like keyboard latency. I'd be very curious to see if those guesses hold up.
> We all have different keyboards plugged into different USB interfaces plugged into different computers running different versions of different operating systems.
I've actually been curious about getting a wireless keyboard recently, but wondered about how big of a latency impact there would be. Normally I use IDEs that definitely add a bit of sluggishness into the mix by themselves, but something compounded on top of that would probably make it way more annoying.
Just found that interesting, felt like sharing. Might actually go for some wireless keyboard as my next one, if I find a good form factor. Unless that particular keyboard does something really well and most others just have way worse wireless hardware in them.
Then use the Apple 2e and forgo all the niceties of modern operating systems. Honestly this take is a whole lot of words to shit on all the open source devs working hard to provide things to us for free and I’m not having it.
+100. It's my least favorite talking point because I'm old enough to have seen it linked 100 times, I find it very unlikely it was faster when measured by the same methods, and the article itself notes the funny math around CRTs.
Yet as a lover of Cherry Reds and Blues, in my opinion that time should most definitely be included. I am not a gamer, but I do notice the difference when I'm on a Red keyboard and when I'm on a Blue keyboard.
My initial gut reaction to this was - yeah, of course. But after reading https://danluu.com/keyboard-latency/ - I'm not so sure. Why exactly should physical travel time not matter? If a keyboard has a particularly late switch, that _does_ affect the effective latency, does it not?
I can sort of see the argument for post actuation latency in some specific cases, but as a general rule, I'm struggling to come up with a reason to exclude delays due to physical design.
It's a personal choice of input mechanism that you can add to the measured number. Also, the activation point is extremely repeatable. You become fully aware of that activation point, so it shouldn't contribute to the perceived latency, since that activation point is where you see yourself as hitting the button. This is the reason I don't use mechanical keyboards; I can't activate the key in a reasonable time.
>This is the reason I don't use mechanical keyboards; I can't activate the key in a reasonable time.
From what I understand, non-mechanical keyboards need the key to bottom out to actuate, whereas mechanical switches have a separate actuation point and do not need to be fully pressed down. In other words mechanical switches activate earlier and more easily. What you said seems to imply something else entirely.
If you're comparing a mechanical key switch with 4mm travel to a low-profile rubber dome with 2mm or less of travel, the rubber dome will probably feel like it actuates sooner—especially if the mechanical switch is one of the varieties that doesn't provide a distinct bump at the actuation point.
No, I’m speaking only of travel required to activate the key. There’s still travel to the activation point for mechanical keyboards. I’ve yet to find a mechanical switch with an activation distance as small as, say, a MacBook (1 mm). Low-travel mechanical switches, like Choc (as others have mentioned), are 1.3mm. Something like a Cherry Red is 2mm.
A lot of modern keyboards allow you to swap out switches, which means switch latency is not inherently linked to a keyboard.
It also completely ignores ergonomics. A capacitive-touch keyboard would have near-zero switch latency, but be slower to use in practice due to the lack of tactile feedback. And if we're going down this rabbit hole, shouldn't we also include finger travel time? Maybe a smartphone touch screen is actually the "best" keyboard!
Latency isn't everything, but that doesn't mean it's irrelevant either. I'm OK with a metric that accurately represents latency, with the caveat that feel or other factors may be more important. If key and/or switch design impacts latency in practice, shouldn't we measure that?
I guess that is an open question - perhaps virtually all the variance in latency due to physical design is tied up with fundamental tradeoffs between feel, feedback, sound, and preference. If so - then sure: measuring the pre-activation latency is pointless. On the other hand, if there are design choices that meaningfully affect latency without meaningfully impacting other priorities, or even where gains in latency are perhaps more important than (hypothetically) small losses elsewhere - then measuring that would be helpful.
I get the impression that we're still in the phase where this isn't actually a trivially solved problem; i.e. where at least having the data and only _then_ perhaps choosing how much we care (and how to interpret whatever patterns arise) is worth it.
Ideally of course we'd have both post-activation-only and physical-activation-included metrics, and we could compare.
I'm fine with wanting to measure travel time of keyboards but that really shouldn't be hidden in the latency measurement. Each measure (travel time and latency) is part of the overall experience (as well as many other things) but they are two separate things and wanting to optimize one for delay isn't necessarily the same thing as wanting to optimize both for delay.
I.e. I can want a particular feel to a keyboard which prioritizes comfort over optimizing travel distance, independent of wanting the keyboard to have a low latency when it comes to sending the triggered signal. I can also type differently than the tester, and that should change the travel times in comparisons, not the latencies.
Because starting measuring input latency from before the input is flat out wrong. It would be just as sensible to start the measurement from when your finger starts moving or from when the nerve impulse that will start your finger moving leaves your brainstem or from when you first decide to press the key. These are all potentially relevant things, but they aren't part of the keypress to screen input latency.
Dealing with Unicode is not the challenge that people seem to believe it is. There are edge cases where things can get weird, but they are few and those problems are easily solved.
What really got my goat about this article is that prior to the latest tested version of Gnome, the repaint rate was a fixed 40Hz! Whose decision was that?
Unicode is more challenging when you're talking about hardware acceleration. On an Apple //e, displaying a new character required writing 1 byte to a video region. The hardware used that byte to index into a ROM to determine what to display. Today's computers are faster, but they must also transmit more bytes to change the character display.
That said, I can imagine cleverer uses of displays might produce significantly faster updates.
Modern video acceleration wouldn’t copy 1 byte into video memory even if we stuck to ASCII. They have to blit those characters onto a surface in the required type-face.
The extra few bytes for Unicode characters outside of the usual ASCII range is effectively a rounding error compared with the bitmap data you’re copying around.
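For a rough sense of scale (the cell size and pixel format here are assumptions, not numbers from any particular terminal):

    # Back-of-the-envelope: bytes touched to update one character cell.
    apple_iie_bytes = 1                     # one byte indexes the character ROM

    cell_w, cell_h = 10, 20                 # assumed glyph cell in pixels
    bytes_per_pixel = 4                     # assumed RGBA framebuffer
    modern_bytes = cell_w * cell_h * bytes_per_pixel

    extra_utf8_bytes = 3                    # at most, for a non-ASCII code point
    print(modern_bytes)                     # 800 bytes of pixel data per cell
    print(extra_utf8_bytes / modern_bytes)  # ~0.4% -- the "rounding error" above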
> What really got my goat about this article is that prior to the latest tested version of Gnome, the repaint rate was a fixed 40Hz! Whose decision was that?
From a previous VTE weekly status message, leading up to the removal:
"I still expect more work to be done around frame scheduling so that we can remove the ~40fps cap that predates reliable access to vblank information." (from https://thisweek.gnome.org/posts/2023/10/twig-118/)
So, as with a lot of technical debt, it was grounded in common sense, and it's long since past time it should have been removed. It just took someone noticing it was there and doing the work to remove it.
That’s a valid reason to choose a repaint rate, sure, but 40Hz? It’s halfway between 30Hz and 60Hz in frame time (25ms sits midway between 33.3ms and 16.7ms), but it seems like a poor choice to me. 60Hz would have been much more reasonable.
Also, why do userland applications need to know the monitor refresh rate at all? Repaint when the OS asks you to. Linux is a huge pile of crap pretending to be a comprehensive OS.
Yes, you're right, you're absolutely the smartest person in the industry, and absolutely everyone else doing all of this work has no clue what they are doing at all, and should absolutely bow down to your superior intellect.
You not understanding why something was done, much like with your original comments about 40Hz, doesn't actually mean that it is wrong, or stupid. It means that you should probably spend time learning why before making proclamations.
I found the claim in the keyboard latency article suspicious. If keyboards regularly had 60ms key-press-to-USB latency, rhythm games would be literally unplayable. Yet I never had this kind of problem with any of the keyboards I have owned.
Real physical acoustic pianos have a latency of about 30ms (the hammer has to be released and strike the string).
Musicians learn to lead the beat to account for the activation delay of their instrument - drummers start the downstroke before they want the drum to sound; guitarists fret the note before they strike a string… I don’t think keyboard latency would make rhythm games unplayable provided it’s consistent and the feedback is tangible enough for you to feel the delay in the interaction.
My wife has JUST started learning drums in the past week or so. She doesn't even have her own sticks or a kit yet, but we have access to a shared one from time to time. It's been interesting watching her learn to time the stick / pedal hits so they sound at the same time.
I'm a semi-professional keyboard player, and in the past I played with some setups that had a fair bit of latency - and you definitely learn to just play things ahead of when you expect to hear them, especially on a big stage (where there's significant audio lag just from the speed of sound). And some keyboard patches have a very long attack, so you might need to play an extra half beat early to hit the right beat along with the rest of the band.
If you watch an orchestra conductor, you may notice the arm movements don't match up with the sounds of the orchestra - the conductor literally leads the orchestra, directing parts before you hear them in the audience.
Absolutely, humans are incredibly good at dealing with consistent and predictable latency in a fairly broad range. Dynamic latency, on the other hand ... not so good.
I recall a guitarist friend who figured out their playing was going to hell trying to play to a track when their partner used the microwave. They were using an early (and probably cheap) wireless/cordless system and must have had interference.
My wife is a very good singer, and took singing lessons for years while singing in chorale and other group activities. She used to sing harmony in our church worship band where I played keys weekly.
She's been learning Irish tin whistle for a few years, and is a big fan of the Dropkick Murphys and other Celtic punk bands, along with 90s alternative bands like Foo Fighters, Red Hot Chili Peppers, and Weezer. I've been learning guitar / bass / ukulele / mandolin, and it would be great fun if she can play drums and sing while I play something else....
On higher-end pianos (mostly grands), there is "double escapement" action, which allows much faster note repetition than without. I suspect the latency would be lower on such pianos.
> Musicians learn to lead the beat to account for the activation delay of their instrument
Yes, this is absolutely a thing! I play upright bass, and placement of bass tones with respect to drums can get very nuanced. Slightly ahead or on top of the beat? Slightly behind? Changing partway through the tune?
It's interesting to note also how small discrepancies in latency can interfere: a couple tens of milliseconds of additional latency from the usual — perhaps by standing 10-15 feet farther away than accustomed, or from using peripherals that introduce delay — can change the pocket of a performance.
For rhythm games, you want to minimize jitter; latency doesn't matter much. Most games have some kind of compensation, so really the only thing that high latency does is delay the visual feedback, which usually doesn't matter that much as players are not focusing on the notes they just played. And even without compensation, it is possible to adjust as long as the latency is constant (it is not a good thing though).
It matters more in fighting games, where reaction time is crucial because you don't know in advance what your opponent is doing. Fighting game players are usually picky about their controller electronics for that reason, the net code of online games also gets a lot of attention.
In a rhythm game, you usually have 100s of ms to prepare your moves, even when the execution is much faster. It is a form of pipelining.
Yes and no. Latency across a stage is one reason why orchestras have conductors. An orchestra split across a stage can have enough latency between one side and another to cause chaos sans conductor. It takes noticeable time for sound to cross the stage.
I don't buy the 60ms latency either, but it's very easy to compensate for consistent latency when playing games, and most rhythm games choreograph what you should do in "just a moment" which is probably at least 10x more than 60ms
Only on movement inputs, which don’t (?) make as big a difference as aiming speed for most people I think. (I aim with my feet in first person shooters but I think that is a habit, maybe bad habit, picked up from playing on consoles for many years).
Lots of people might not be good enough to care about missing 1-2 frames.
Could it be that the user simply learns to press a key slightly earlier to compensate for the latency? There is key travel time you have to account for anyway.
Rhythm games almost always have a calibration setting, where they ask you to press a key on a regular beat. They can also check the visual latency by doing a second test with visuals only. This allows them to calculate the audio and video latency of your system to counter it when measuring your precision in the actual game.
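A minimal sketch of that kind of calibration (assumed logic, not any particular game's code): average the signed timing errors collected during the calibration beat, then subtract that constant offset when judging hits.

    # Signed errors (ms) between when the player pressed and when the
    # calibration beat actually occurred -- hypothetical data.
    calibration_errors_ms = [48, 55, 52, 50, 47, 53]

    offset_ms = sum(calibration_errors_ms) / len(calibration_errors_ms)

    def judge(press_time_ms, beat_time_ms, window_ms=30):
        """Score a hit after removing the measured constant latency."""
        error = (press_time_ms - offset_ms) - beat_time_ms
        return abs(error) <= window_ms

    print(round(offset_ms, 1))   # ~50.8 ms of constant system latency
    print(judge(1050, 1000))     # True: on the beat once the offset is removed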
Oooh, that explains why when watching some super high bpm runs I always got the impression that they were like a frame off the marked area - but it was just shifted ahead of the actual music and they were actually in sync.
Rhythm games will often have compensation for the delays of the input, audio, and visuals. Some do this without telling the users, others do it explicitly, e.g. Crypt of the Necrodancer.
The latency of the physical key going down is counted in that post, so it includes mechanical "latency" that will differ depending on how hard you press the keys and if you fully release the key.
> a smaller input median latency (Console ~12 msec) than an Apple //e from 1983 (30 msec).
> Still, it would be nice to return to the human responsiveness of machines 41+ years old.
A while ago someone posted a webpage where you could set an arbitrary latency for an input field, and while I don't know how accurate it was, I'm pretty sure I remember having to set it to 7 or 8ms for it to feel like xterm.
BTW, remember, most people still have 60Hz monitors. Min latency can only be 16.6ms, and "fast as possible" is just going to vary between 16.6 and 33.3ms.
The real improvement wouldn't be reducing latency as much as allowing VRR signaling from windowed apps, it'd make the latency far more consistent.
>BTW, remember, most people still have 60Hz monitors. Min latency can only be 16.6ms, and "fast as possible" is just going to vary between 16.6 and 33.3ms.
No, the minimum will be 0ms, since if the signal arrives just before the monitor refreshes then it doesn't need to wait. This is why people disable VSync/FPS caps in videogames - because rendering at higher than 60 FPS means that the latest frame is more up-to-date when the 60Hz monitor refreshes.
The maximum monitor-induced latency would be 16.6ms. Which puts the average at 8.3ms. Again, not counting CPU/GPU latency.
33.3ms would be waiting about two frames, which makes no sense unless there's a rendering delay.
The article is about testing actual user experienced latency, and Mutter still buffers 1 frame. Actual observed latency is going to be between 16.6ms and 33.3ms before the user can see anything hit their eyeballs.
Best-case median latency @60Hz is 8.3ms (i.e. if there were zero time consumed by the input and render, it would vary, equidistributed, between 0 and 16.6ms).
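A quick simulation of that reasoning, under idealized assumptions (zero render time, a 60 Hz display, and optionally one extra frame of compositor buffering):

    import random

    REFRESH_MS = 1000 / 60          # ~16.67 ms per frame at 60 Hz

    def display_latency_ms(buffered_frames=0, samples=100_000):
        """Average wait from 'frame ready' to scanout, with the frame
        becoming ready at a uniformly random point in the refresh interval."""
        total = 0.0
        for _ in range(samples):
            ready = random.uniform(0, REFRESH_MS)
            wait = REFRESH_MS - ready          # wait for the next vblank
            total += wait + buffered_frames * REFRESH_MS
        return total / samples

    print(display_latency_ms(0))   # ~8.3 ms: no compositor buffering
    print(display_latency_ms(1))   # ~25 ms: one buffered frame (16.7..33.3 ms range)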
Xfce always felt like Linux to me. Like sure, there are other interfaces, but this is the Linux one.
I want an ARM laptop with expandable memory, user-replaceable battery, second SSD bay, and a well-supported GNU/Linux OS that has xfce as the UI - from the factory. That's the dream machine.
IMO, better to install yourself. Too much potential for the manufacturer to add adware annoyances in a pre-install.
Although, mine is an x86-centric take. There are occasionally issues around getting the right kernel for ARM, right? So maybe it would be helpful there.
Xfce was my favorite until I found i3/sway. Even more responsive, and less mouse required since everything is first and foremost a keyboard shortcut, from workspace switching to resizing and splitting windows.
Xfce is very configurable, and it's fairly trivial to set up tiling-WM-style keyboard shortcuts for window positioning, and get the best of both worlds.
True. I find resizing and tiling, specifically a 3-window screen in a 1/2, 1/4, 1/4 setup, impossible to figure out - often I just move windows around repeatedly until I give up. If I could drag directly it would definitely make it easier. But that's relatively rare for me.
I have Super+Numpad assigned as window positioning hotkeys in XFCE. The first window would be Super+4 to position it on the left half of the screen. The next two would be Super+9 and Super+3 respectively, to position each in the upper-right and lower-right corners. Super+5 maximizes.
XFCE's goal is also to implement a stable, efficient, and versatile DE that adheres to standard desktop conventions, rather than to import all of the limitations and encumbrances of mobile UIs onto the desktop in the mistaken belief that desktop computing is "dead".
Well, the Apple II had a 280x192 display (53 kpixel), and my current resolution (which is LoDPI!) is 2560x1600 (4 Mpixel). When you phrase it as "in 2024 we can render 76x the pixels with the same speed" it actually sounds rather impressive :)
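The arithmetic, for anyone who wants to check it:

    apple_ii_pixels = 280 * 192      # 53,760 pixels (hi-res mode)
    modern_pixels = 2560 * 1600      # 4,096,000 pixels

    print(modern_pixels / apple_ii_pixels)   # ~76.2x as many pixels per frame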
Yeah, one of those is that modern stacks often deliberately introduce additional latency in order to tackle some problems that Apple IIe did not really have to care about. Bringing power consumption down, less jitter, better hardware utilization and not having visual tearing tend to be much more beneficial to the modern user than minimizing input-to-display latency, so display pipelines that minimize latency are usually only used in a tiny minority of special use-cases, such as specific genres of video games.
It's like complaining about audio buffer latency when compared to driving a 1-bit beeper.
The changes described in the OP are unequivocally good. But the most promoted comment is obviously something snarky complaining about a tangential issue and not the actual thing the article’s about.
It's an interesting observation, and it's constructive, providing actual data and knowledge. I downvoted you for negativity because you're providing nothing interesting.
I found parent to be constructive as to tone rather than content and upvoted them for that reason. The constructive part of grandparent's point can be made without needlessly crapping on people's hard work.
I love both the underlying focus on performance by the VTE developers, and the intense hardware-driven measurement process in the article!
The use of a light sensor to measure latency reminded me of Ben Heck's clearly named "Xbox One Controller Monitor" product [1], which combines direct reading of game console controller button states with a light sensor to help game developers keep their latency down. It looks awesome, but it's also $900.
This, and the linked article, show the photo sensor halfway up the monitor. Nothing wrong with that for comparing measurements, but for quite a lot (possibly the majority) of typical monitors out there, that actually means that at a refresh of 60Hz, putting the sensor at the top of the screen will give you about 8msec faster and at the bottom 8msec slower measurements, because pixels / lines thereof are driven top to bottom. Like a CRT, basically. So if you're getting into the details (just like where to put the threshold on the photo sensor signal to decide when the pixel is on) that should probably be mentioned. Also because 8msec is quite the deal when looking at the numbers in the article :)
Likewise, just saying 'monitor x' is 30msec slower than 'monitor y' can be a bit of a stretch; it's more like 'I measured this to be xx msec slower on my setup with settings X and Y and Z'. I.e. one should also check whether the monitor isn't applying some funny 'enhancement' adding latency with no perceivable effect but which can be turned off, and whether when switching monitors your graphics card and/or its driver didn't try to be helpful and magically switch to some profile where it tries to apply enhancements, corrections, scaling and whatnot which add latency. All without a word of warning from said devices usually, but these are just a couple of the things I've seen.
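To put a number on the scanout point above, a small helper (assuming a simple top-to-bottom scanout and no extra processing in the monitor):

    REFRESH_HZ = 60
    FRAME_MS = 1000 / REFRESH_HZ             # ~16.67 ms to scan out one frame

    def scanout_offset_ms(sensor_y, screen_height):
        """Extra delay before the sensor's row is actually drawn,
        relative to a sensor at the very top of the screen."""
        return FRAME_MS * (sensor_y / screen_height)

    print(scanout_offset_ms(0, 1600))      # 0.0 ms at the top edge
    print(scanout_offset_ms(800, 1600))    # ~8.3 ms at mid-screen
    print(scanout_offset_ms(1600, 1600))   # ~16.7 ms at the bottom edge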
It's crazy to me that we exist in a world where we can render hyper-realistic 3D scenes and play games that once felt like they would be impossible to render on consumer-grade hardware, AND a world where we are still trying to perfect putting text on a screen for a terminal LOL :)
Isn't some of it that we're optimizing more for graphics and there are tradeoffs between the two (so getting better at graphics tends to make the text worse)? Partially offset by terminals using GPU acceleration, but you're still paying for that pipeline.
How much of this comes down to the fact that a) it didn't matter too much previously, i.e. it works, and b) until recently there's been a lot of network latency to solve for in a lot of terminal use cases?
Not related to the speed, but is there any terminal for Linux that works like the Mac OSX terminal, in that you can shut it down and restart and it will bring up all the tabs and their cmd histories and scrollbacks for each tab? They do that by setting different bash history files for each tab etc.
This is pretty tangential, but I just (1 hour ago) found out that iterm2 on Mac can integrate with tmux [1] if you run tmux with the argument "-CC", in a way that makes your tmux sessions map to the GUI windows / tabs in iterm2, including when using tmux on a remote machine via ssh.
I'm really excited about this, since I always forget the hotkeys/commands for controlling tmux.
> They do that by setting different bash history files for each tab etc.
I wonder what happens when you close all the tabs and then open a new one. Are all the tabs' histories merged back into a "general" history file when you close them, so you will get access to their commands in new ones?
Yes, but if one process is writing its history to ~/.bash_history_1, and some time later you spawn a new shell whose history file points to ~/.bash_history_2, you won't have the commands from the previous session available, right?
Correct, but most of the time I use the same set of panes, created at startup, so the commands are in the history where I need them. And when they are not I grep in ~/.bash_history_*.
This. I was just looking for exactly the same thing. For now I use tmux with tmux-resurrect to persist state between reboots. It works okayish; I would say it's a good hack, but still a hack. It's sad there isn't really a solution for this problem, aside maybe from Warp. My little UX dream is to have such a solution for saving workspaces integrated with the whole OS and the apps inside it - that would be cool.
I find it preferable to set up bash (or whatever shell you are using) to append to the .bash_history file at every command. You don't always remember in which tab/pane/window you typed which command, and most people would do Ctrl+R or use a fuzzy search tool anyway.
Also, many times you open a new tab/pane/window and would like to access the history of another one that is already busy running a job, so a common history is usually preferable.
At the time macOS got that functionality, their MacBooks had removable batteries.
One fun fact: you could just remove the battery while your apps were running, and when booting up again every window you had open would just reopen, with everything you had typed saved to disk in the case of the iWork apps.
While I fully believe this could work, seems like it might need to be a temp solution. Ripping out the battery seems like a solution that might have more downsides than advantages.
A lot depends on what are the important parts. It could be essentially identical if it supplies all the parts you care about and don't actually care too much what exact form it takes. But if you require the same exact gui controls then sure, totally different and not even an answer to the question.
And you have to get into tmux configs, which is a hassle by itself. I am not a fan of terminal multiplexers; I tend to forget them, and when the server crashes or reboots it won't replay/restart what I was doing in those tabs anyway. I just use them for long-running jobs I am testing. I also don't like the whole tmux window/pane manipulation you can do; I'd rather have my primary DM do that well.
It's funny how tmux is downvoted in those replies though :D.
How does it work in MacOS? Since some terminal commands are interactive (vim) and others have side effects (well, really, effects— rm won’t have the same result if you replay it!) I guess they can’t just be replaying/restarting programs, right?
Personally the only thing I really need is `mode-keys vi`, the rest is optional. I guess you want to configure mouse on/off depending on preference, if it differs from the default.
Usually I will at least change the color on workstations and desktops just to make it more visually obvious which tmux environment I'm looking at when sshing and nesting.
It's C-b by default, C-a is what screen uses. Maybe you copied someone's config, as that remapping is fairly common (or weird distro default? but seems unlikely). I'm personally fine with the default C-b and have always kept it everywhere.
You can do the saving part automatically by setting a different history file for each instance of the shell, for example by using a timestamp in your rc file, and forcing them to append after every command.
Then if you open a new shell and want history from a particular older shell you can do `history -r <filename>`
So it is a hybrid: automatically saved, manually recovered.
Presuming my understanding of persistent sessions lines up with yours, set `terminal.integrated.enablePersistentSessions`. You may also want to change the value of `terminal.integrated.persistentSessionScrollback` to something higher than the default (1000? not sure)
It does for me when using a remote instance, like a devcontainer or over ssh. Maybe that is just because the server side keeps running and when you reconnect the UI reloads from there. Locally nothing would remain to be reloaded when you restart VSCode.
I used Gnome for years, then switched to sway and alacritty 2 years ago and honestly I can't tell any difference. I guess this is just like high end audio equipment, my ears/eyes are not tuned to the difference.
I use Alacritty as well and for me I care about having a low latency of response when I press Ctrl-C after I accidentally cat a huge text file. I want my key press to be acted upon immediately and that I want the pages and pages of text to finish rendering quickly so I can get back to my shell prompt.
When I did a quick test two years ago, Alacritty outperformed Gnome Terminal. Looking forward to trying this again when I update my system.
I've been using Gnome for years and am currently on Gnome 46. I hadn't noticed any difference in the terminal latency from Gnome 45. Like you, I think I just don't notice these things.
I'm on Gnome, and I moved from Gnome Console to Wezterm because of the latency. It wasn't noticeable when I used terminal as a small window, but most of the time my terminal was fullscreen, and the delay was unbearable.
Not a fair comparison, probably, but I swore off of gnome-terminal ~20 years ago because it was using half of my cpu while compiling a kernel. Xterm used ~2%.
To be fair, even up to this day and on a modern Linux setup (Ryzen 7000 series CPU), that's still how Gnome terminal (before the patches in TFA are applied) feels compared to xterm.
Maybe your screen and/or keyboard is adding enough latency that you'll never get good results no matter what software you use. The difference between bad latency and very bad latency isn't so obvious. Have you tried gaming hardware?
Latency adds up. When I had to work with a not-so-fast monitor (60Hz, mediocre processing and pixel-response lag), it became very apparent to the point of annoyance. Using alacritty helped a bit. Our brains adapt quickly, but you also notice it when switching frequently between setups (monitors or terminal emulators).
Finally a terminal benchmark that isn't just cat-ing a huge file. Would love to see the same test with a more diverse set of terminals, especially the native linux console.
Those tend to say things like "This benchmark is not sufficient to get a general understanding of the performance of a terminal emulator. It lacks support for critical factors like frame rate or latency. The only factor this benchmark stresses is the speed at which a terminal reads from the PTY."
So although those tests may be more extensive, they're not "better" in every regard.
Of course it's perfectly reasonable to want a benchmark that people can run without needing to build any custom hardware.
They are better in every regard compared to catting a large file, which is what the OP was complaining about.
Certainly if you want to comprehensively understand terminal emulator performance there are multiple axes along which to measure it and various tradeoffs involved. But I think nowadays we have much better alternatives to catting large files or running `find /`, both of which have been the go-to for measuring terminal performance for most people.
Sorry for being off-topic but what I dislike the most about Gnome Terminal is that it opens a small window by default (like 1/4 of my screen size) and even if you resize it, it doesn't remember the size after restart. It turns out you need to go to settings and manually specify how many columns and rows you want.
That behavior is pretty common with many terminals. Off the top of my head, I know the default macOS Terminal and Windows Terminal both behave like that where you need to change the default dimensions in the settings.
I personally prefer having the dimensions set to a default size and then resizing specific windows that require more space. But it should at least be an option to have it remember resizes.
I often have many terminals open of various sizes. It's not clear what size would be the correct one to remember.
Therefore, I don't want it to try. It's fine for software to try to be smart if there's high confidence it knows what you want. Otherwise, it's yet another case, "Look, I automatically messed things up for you. You're welcome."
It is just old-school terminal behavior. Even xterm defaults to 80 columns by 25 lines, IIRC. That's the natural screen size. See also the natural terminal colors: green letters on a black background.
Neat! Would love to see the benchmark also include Mitchell Hashimoto's Ghostty terminal when it comes out publicly (it's still in the development/polishing stage and in private beta).
I use xterm and i3wm on Debian and I never experienced anything faster. Surely the idea of wasting GPU on the terminal never even crossed my mind, so alacritty is IMO overkill.
I think I feel the same way, lag has never occurred to me when using xterm -- and I use the heavier features like translations and sixels day-to-day. Maybe it's just the Athena widget style that is making people dismiss it, because it's great.
I never did tests that were as in depth as the OP's blog post but no terminal I've used ever matched how good Xterm feels when typing. It's like you're typing within a zero resistance system.
I don't know what this has to do with "terminals", other than that the author is using a terminal to run the benchmark.
According to the author, Gnome 46's Mutter, when in direct mode (which windowed apps do not use, so the benchmark is partially invalid; useful for gamers, but not much else) is faster.
That's great. Gnome is now possibly as fast as all the wlroots-based Wayland WMs (Sway, River, Hyprland, etc) and KDE's Kwin.
I've looked into why Mutter has historically been the worst WM on Linux for the longest time: it makes a lot of assumptions that "smoother == better", no matter the cost. OSX's WM does the same thing, so they felt justified.
If you lag more than 1 frame, it is noticeable to non-gamers; ergo, a latency of between 16 and 32ms (since WMs do not do full-time VRR, although they could, and maybe should) once the app flushes its draw commands; this is on top of whatever the app did, which may also be assuming 60Hz.
Modern WMs try to get latency down as far as possible, even implementing "direct mode" which directly displays the fullscreen app's framebuffer, instead of compositing it, thus zero latency added by the WM itself.
Lower latency is not the ultimate goal. Energy efficiency, for example, is another important factor DEs have to optimize for, that can be negatively affected by lower latencies.
Ultra-low latency on a laptop that will power off in an hour is probably a bad tradeoff.
You either check whether any state has changed, doing nothing in the happy path at the price of slightly higher latency, or you just go with the unhappy path all the time.
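A sketch of that trade-off (hypothetical names, nothing from Mutter's actual code): check a damage flag each frame and skip all work when nothing changed, accepting the cost of the check itself.

    class Surface:
        def __init__(self):
            self.damaged = False      # set elsewhere when new input/output arrives

        def repaint(self):
            pass                      # the expensive part: composite and submit a frame

    def on_frame(surface):
        # Happy path: nothing changed, so do no work and save power,
        # at the price of the check itself (a sliver of added latency).
        if not surface.damaged:
            return
        surface.repaint()
        surface.damaged = False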
I only tested for software latency (monitor, keyboard and other hardware latency is not included in Typometer benchmarks). I ran the test on Arch Linux with Xorg + bspwm without a compositor. You can find the full results on my blog https://beuke.org/terminal-latency/.
Compared to similar 6-year-old [1] and 3-year-old [2] (by the Zutty maker) comparisons, VTE terminals are still (at least pre-46) bad on the latency front. (They're as high as VS Code, according to beuke's article.) Xterm still rules it. (As pointed out in [2], this is due to direct rendering via Xlib, which comes with the downside of poor throughput.) Alacritty significantly improved, Konsole got worse. About Alacritty, it's pointed out in [2] that there were various open tickets related to its poor performance and it wasn't an easy problem to solve. So kudos to the Alacritty devs for succeeding and the GNOME devs for improving in the new version.
Alacritty, Kitty, Zutty, GNOME, others, quite a rejuvenation in terminal development.
>However, if we custom-tune the settings for the lowest latency possible (I chose minlatency = 0 and maxlatency = 1), then we have a new winner. Applying this custom tuning results in an average latency of 5.2 ms, which is 0.1 ms lower than xterm, and that’s with having a much more sane terminal without legacy cruft.
Huh, the devs really weren't lying, Alacritty really got better on the latency front. I started using it for supposed better security than Xterm, but at the time I think it was quite a lot worse on latency, but the throughput was way better.
Alacritty feels fast but they refuse to add support for tabs or tiling. They just say to go use tmux but that isn't the answer at all.
Kitty is quite nice but if you SSH into machines a lot, all hell breaks loose if they don't have the kitty terminfo files installed, and doing that isn't always possible. You can override TERM, but honestly don't have the patience for it.
It doesn’t bother me, I was just interested in whether the benchmark is fair in this respect (it is xorg only, so the answer is yes). I personally believe that 120+ hz gives barely any benefit, though.
Speeding up the gnome terminal is useless when you realize GNOME has been removing keyboard support features from Gtk since 2014.
Try to copy a file path from that Gnome 46 terminal and do a File-Open operation in a Gtk application and ctrl-v<enter> to paste in the path and open it.
Woops! Error! "The folder contents could not be displayed. Operation not supported"
GNOME and Gtk3/4 no longer prioritize keyboard inputs and hide them behind complex key shortcuts. They let keyboard inputs bitrot and fail because they only care about GUI inputs. It's an open bug since 2014, opened and closed as wontfix literally dozens of times. Brought up in #gtk chat a similar amount. Latest (2023) wontfix: https://gitlab.gnome.org/GNOME/gtk/-/issues/5872
Do you really hit . or / and then backspace and then paste when you want to paste a file path, and enjoy it? That's at least 2 extra keystrokes. Even worse when you have to use the mouse to dismiss the pop-down list of closest file names that prevents pasting.
Also not really cross-platform, contrary to what's indicated in the first word of its github description, and the owner is kind of an ass about it https://github.com/kovidgoyal/kitty/issues/6481.
But it should support at least more than one platform. And it's disputable what exactly one considers as a platform, or just a flavor of some platform.
As said, it depends on the definition of platform for this case. All I see is support of a bunch of flavors of one platform, namely POSIX, unixoids, or how you want to call it. Yes, they are different desktop-platforms, but the purpose of this software is still limited to one specific environment. Or to give a different perspective, nobody would call it cross-platform, just because it can run with Gnome and KDE, under X11 and Wayland.
And I'm curious how much adaption happens for each OS really. Are there specific changes for MacOS and BSD, outside of some paths for configurations?
The entire point of POSIX is that, if you only use what it defines, your program automatically becomes cross-platform, because it will run on several Unices, as well as other systems (like Haiku).
It's probably fair to say that an application with native Wayland and X11 support is multiplatform. I can understand somebody disputing that, but certainly Linux and MacOS are different platforms. They don't even share executable formats.
The author replied with the same effort as the person who reported the issue. You kinda need to do this as a maintainer if you don't want to drown under low-quality reports and burn all your energy. I'm sure lucasjinreal would have gotten a kinder answer if they took time to phrase their issue ("demand", at this point, also misguided) nicely.
It's not really, I just remembered wanting to try out this terminal emulator and being quite surprised that something actively advertised as cross-platform didn't support Windows.
I agree that the person posting the issue wasn't really doing it in a diplomatic way, but in the end, the result is the same. I think it's disingenuous to actively advertise something as cross-platform, without even specifying which platforms are actually supported (even if yes, technically it's cross-platform)
> without even specifying which platforms are actually supported
The first line of the README (ok, second line if you include the title) is "See the kitty website" with a link, and on the site the top menu has a "cross platform" entry which then lists "Linux, MacOS, Various BSDs".
It seems like a stretch to classify that as disingenuous.
Maybe, on the other hand: the link you posted was to a benchmark using kitty 0.31, since then it had an all new escape code parser using SIMD vector CPU instructions that sped it up by 2x.
https://sw.kovidgoyal.net/kitty/changelog/#cheetah-speed
I never understood why people want a bunch of features on their terminal. I just want a terminal that doesn't get in the way of my tools. Alacritty is great at that
I can notice a slight latency in Gnome Terminal ... running in a VM ... on a Windows host ... being accessed via RDP over Wi-Fi and a 100 Mbps line. Not enough to bother me.
The latency is literally about how late the pixels appear on the display, so it has to be about seeing.
If you type fast, but still have to look at your output, it may be a good idea to wean off that; you should be able to type while reading something elsewhere on the screen, only occasionally glancing on what you're typing.
Traditional typewriter typists had to be able to transcribe an existing written text (e.g. manuscript). That meant their eyes were on the text they were reading, not on their hands or the typewriter ribbon strike area.
I appreciate the fine work done by developers and testers, however I've been using gnome-terminal for around two decades and never perceived the input latency.
Like, I installed Alacritty some years ago side-by-side with gnome-terminal, and gosh, I could not visually sense any difference. Do I have bad sight?
Very occasionally (once a week or less) do I open VS code or so. The rest of the 8+ hours a day I spend in vim, zsh and such.
I don't perceive gnome-terminal as slow, or its latency as high. Alacritty did not show me any difference from gnome-terminal, other than that I needed to learn a lot of new stuff, which I'm not going to do if there's no clear win. For me there was no clear win.
No, because the human does not need to process the entire screen contents with every frame, only the delta from the last frame (usually). Therefore getting those deltas into eyeballs as quickly as possible increases how quickly the human can interact with the computer.
There are cases where this makes a big difference.
For a dramatic example, consider the VESA framebuffer console, a slow serial link, or a bad SSH connection with a good CPU on the other end.
With enough terminal output it will bottleneck whatever you're doing. Sometimes extremely dramatically so. To the point of adding hours to a task in really bad cases.
For such situations, it often helps a lot to run remote tasks in something like screen and only switch to it to check progress once in a while.
> With enough terminal output it will bottleneck whatever you're doing. Sometimes extremely dramatically so. To the point of adding hours to a task in really bad cases.
At the very laziest, redirect your output with &> foo. Or preferably add logging to your application. Go back 20 years and you wouldn't have had the luxury of spewing hundreds of megabytes of text to a remote terminal for no reason.
And? If we go back 200 years you probably wouldn't have had the luxury of surviving childhood. I don't see what either has to do with anything... Particularly when it comes to tools for the world in 2024.
The reason is that the limiting factor absolutely should be the human, so if the computer is slow enough that the human can possibly notice it, it needs to be 10x faster.
Hi, I'm the author of almost all of those VTE performance improvements. I wrote them because the latency in gnome-terminal, which I've used for well over twenty years, was starting to make me mistype due to how slow the feedback was on Wayland.
I've noticed this myself on my main rig running Ubuntu 22.04, which never ever had any perceptible lag. Now it is so bad I was forced to switch to Alacritty.
Ironically, a bug in mutter crept into Ubuntu 22.04 and later just a week ago. It increased the keyboard latency dramatically on some systems and made typing a pain.
I remember many years ago, SBCL would build noticeably slower on gnome terminal as compared to xterm due to how verbose the build process was. I think they even put a note in the README about it.
My distro recently upgraded to Gnome 46 and as someone who spends a big chunk of their day in a terminal, the difference was very noticeable. Great work to everyone involved!
There is a lot of interesting stuff that could be measured here. Valve's Gamescope is supposed to be this low-latency compositor, so it might be interesting to compare it to more typical desktop compositors. Similarly, testing VRR could have interesting results.
Well written article and interesting results, thank you.
I'm surprised that there has been input latency of tens of milliseconds with the said setup. What are typical input latencies on comparable Windows laptops and Macs?
I don't have any results for Windows or macOS yet unfortunately. I wanted to run these tests on Windows eventually, and include things like cmd.exe and the Windows Terminal. Maybe when I get around to re-benchmarking a wider range of terminals. Mac would certainly be interesting to include, but I don't have access to any of those.
I would encourage you to do these sort of tests yourself, the hardware involved is pretty inexpensive (I'd expect you should be able to get something for <$50)
The lack of mention to Linux's power management system during measurement is worrying. This is the kind of test that gets completely affected by Linux power management policies, up to the point where results may be meaningless.
Personally, IDGAF about latency. I'm used to typing things into servers that are two crossings of a continent away (one for corp VPN entry, and then another one to get to a server 50 away via the cross-country VPN hop).
What gets me is the performance when a terminal command spews an unexpectedly large amount of output and/or I forget to pipe it through less. E.g., the latency between ^C being entered and having an impact. This can be 10s of seconds on slower terminals like xterm, and this is what finally got me to switch away from xterm to one of these new-fangled terminals like the one from LXDE.
OK you can send these tests to the trashbin as this is so unrepresentative of what most users are using.
This is sometimes infuriating. A year ago, I installed linux on my partner's computer, her HDD running windows had died.
Things were seemingly working fine until I realized some apps (dnfdragora comes to mind) were unusable on her 1366x768 with 1.x ratio as they take way too much screen real estate. I think worldwide there are still more than 20% of desktop users using screen resolutions with less than 768px of height and around 50% of users with less than 900px of height.
I have no issue with developers using decent machines to develop, but when it comes to testing performance and usability they should also do it on that 10+ year old entry-level laptop they probably still have sitting somewhere in a storage area. Or buy an old netbook for 20 bucks for that specific use.
A ten year old system is a Core 4xxx or 5xxx series, which is plenty fast. The main struggle would be the integrated GPU at higher resolutions, but presumably a 10 year old laptop has a pretty low resolution.
Also you never know, one patch that improves perf on latest-gen systems might do the opposite on much more humble ones, which still represent a sizable portion of the computers running today. It is important to know what you are gaining (or not) and where.
And I don't agree with your initial premise. The fact one is running an old system that might be super slow on most js-hungry websites, playing AAA games or compressing 4k videos doesn't mean slowness has to be expected on native and/or small or simple apps.
Also, while we can expect an old computer to be slow at things it wasn't designed for, like decompressing/compressing videos with much higher bitrates/resolutions or handling traffic with more CPU-intensive encryption algorithms, we should not accept a computer being slower at things it was designed for and was working well at a decade ago. My motorbike is probably marginally slower from wear and tear of its engine, yet still goes like stink at the green light and has no issue reaching the speed limits. My well-maintained 35-year-old bicycle is still as fast as it ever was, probably even faster thanks to better tires. My oven and gas stoves are still heating my meals the same way they did before. Why should an old computer be slower at doing the same things? This is a disgrace and the whole IT industry should be ashamed of accepting and enabling this.
Don't get me wrong, I get where you're coming from. I just hate the overall sentiment
Here we have a person who
... observed an interesting change in a open source project
... built out a tool to carry out his own testing
... shared the firmware for said tool
... ran some benchmarks and visualized the data
... took the time to write an informative article outlining the thought process and motivation, and sharing the results
And your comment essentially came down to
> "Testing methodology bad, not representative of your personal usecase, should have been done different, data is trash"
I think it's incredibly rude and a steep ask to expect benchmarking to be done on a meaningful set of legacy hardware. Sure, legacy systems are a non-insignificant portion of computers operating today. But the author used a system he had available and showed the results gathered. I'm sure his time is better spent on other projects or another blog article rather than benchmarking the same stuff on the n-th processor generation.
The Linux community can be accused of many things, not caring about performance certainly isn't one of those. The beauty is, if you deeply care about a specific configuration that you feel like is being neglected, you can step up and take over some optimization efforts. If you don't care enough about that scenario, why should anybody else?
Limiting this to that specific instance: the hardware is affordable to maintain, the firmware is linked. If you want to benchmark a different system of yours, go for it and publish an article. I'm gonna read it.
On the other hand because the benchmark is comparative, it doesn't really matter what hardware it's run on. So they might as well just run it on decently recent hardware. The person on that 10y old laptop is going to have a bad experience anyway, it's just not very relevant for the actual comparison.
1. Random gui apps have anything at all to do with terminal emulators. Did the terminal emulator on your partner's system display any of the problems you're complaining about?
2. Resolution issues have anything to do with performance testing - those are different.
3. You think you get to demand anything of developers making something in their free time and giving it away. If they don't want to cater to you, they don't have to - there's no contractual, employment, or other obligation. You can choose to be useful and fix it yourself (or pay someone to), until then you are no different than any random ms employee thinking that ffmpeg gives a shit about teams.
1) My remark is not specific to this particular test but more general: he used a relatively high-end machine, which is not representative of what most people are using. If we are talking about optimization, this is important.
2) yup in this case usability, see comment above.
3) I am not demanding, I am giving my opinion on what should be done to make better software. If some developers prefer to gatekeep, wallow in mediocrity, they are free to do so as much as I am free to use other pieces of software, write my own at my own level of quality and as much as you are free to wipe your ass with my comments in a piece of paper.
For testing usability, you are right. For testing performance of terminal emulators, you're wrong. A very large resolution should make the differences easier to measure. On smaller screens they should all perform closer to perfect, the difference between them will be smaller and your reason to care about these results will also be diminished. Basically, it doesn't matter.
You are getting downvoted, probably because of your tone, but you raise a very important point. I am still using a third-generation i7 with an Nvidia 1050 and 16GB of RAM. It runs Windows 10 perfectly, and I have been able to play Dota 2 at medium graphics, and several games such as Mass Effect, BioShock, etc...
I can edit video using old software with no problem; however, attempting to do it in DaVinci is impossible.
There is other software that is painfully slow on my machine.
Also, desktop resolutions suck: a lot of laptops, and even home monitors, do not have anything over 1080p, when 1440p should be the norm.
Yeah, I'm still on a 1080p monitor as my main one. I don't think any of my laptops have screens where 100% or 200% scaling is the correct choice, either, unless you're DHH level obsessed with font rendering.