The thing that concerns me most is that, looking at the fix, it is very difficult to see why it is correct. It also appears that there is a lot of code without explicit bounds checks. That worries me because, even if the logic is safe, it makes the logic very complex to verify. I wonder what the cost would be of adding an explicit, local bounds check at every array access. That would serve as a backup that is much easier to verify. I suspect the cost would be relatively small. Small enough that I personally would be happy to pay it.
This is also a great reminder that fuzzing isn't a solution to memory unsafe languages and libraries. If anything the massive amount of bugs found via fuzzing should scare us as it is likely only scratching the surface of the vulnerabilities that still lie in the code, a couple too many branches away from being likely to be found by fuzzing.
> ...isn't a solution to memory unsafe languages and libraries. If anything the massive amount of bugs found via fuzzing should scare us as it is likely only scratching the surface of the vulnerabilities that still lie in the code
Yup. For example, the Linux code for its relatively new[1] io_uring subsystem was so riddled with memory exploits that Google disabled it for apps on Android, disabled it entirely on ChromeOS, and disabled it on their servers[2]. It is insane how easy this has made it for a single person to break into Linux.
[1] released with kernel version 5.1 which came out in May 2019
The design of io_uring has nothing to do with the language the Linux kernel is implemented in. And safer languages really can't provide guarantees when the issue at hand is designing a shared-memory, cross-executable (kernel ↔ userspace) API.
Just having first-class slice types, where your pointer is paired with a length so bounds checking can happen automatically, is a huge upgrade over C, even if it doesn't solve every other problem.
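To make that concrete, here is a minimal Rust sketch (illustrative only, with made-up names, not libwebp code) of what "pointer paired with a length" buys you:

    // Minimal sketch, not libwebp code: a slice carries its length, so every
    // index is checked against it automatically.
    fn lookup(table: &[u32], index: usize) -> u32 {
        // Panics on an out-of-range index instead of reading past the buffer.
        table[index]
    }

    fn main() {
        let table = vec![10u32, 20, 30];
        println!("{}", lookup(&table, 2)); // prints 30
        // lookup(&table, 8) would abort with an index-out-of-bounds panic,
        // rather than silently reading whatever happens to follow the buffer.
    }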
Security exploits from out of bounds access should not be happening today, bounds checking is a solved problem, and has been solved for decades.
That's a pretty interesting paper. Thanks for sharing it. I wonder why Dennis Ritchie, with the weight his name carries, wasn't able to push it into mainstream implementation...
I want to create software that doesn’t harm users. That doesn’t at all imply that I’m not interested in computers. In fact guaranteeing safety for users shows a profound interest in computers.
As an example, see how C's 'int', 'short', 'long' or whatever become actual distinct types
> just admit you're not interested in computers
I already have the popcorn ready for when the next C "hotshot" shoots himself in the foot yet again because he thinks a chainsaw without an emergency brake is just more fun
A 2-5× performance hit is also a 2-5× CO₂ hit, and I wouldn't be willing to let myself or other people take that hit. We're throwing enough heat into the atmosphere already to make cats dance on our portable supercomputers as is.
I understand that the patch is correct. The point that I am trying to make is that the correctness of this code relies on non-local reasoning (the allocation/reallocation of the table before it is used). I believe that non-local reasoning raises the chance of error (and is likely one of the reasons why this bug existed in the first place).
By adding length counts and explicit bounds checks you provide local reasoning, which is much *easier* to make and keep correct, as a backup for that non-local reasoning. I think that would result in code that is less likely to have similar bugs in the future.
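To sketch the shape of what I mean (a hypothetical example with made-up names, written in Rust for brevity; in C it would be an explicit if/abort before the write):

    // Hypothetical sketch, not the real libwebp structures: the table carries
    // its length, and the precondition is asserted at the access site, so the
    // safety argument is local instead of depending on distant sizing logic.
    struct HuffmanTable {
        entries: Vec<u32>,
    }

    impl HuffmanTable {
        fn set(&mut self, index: usize, value: u32) {
            // Local, trivially verifiable backup for the non-local reasoning
            // that allocated `entries` "big enough".
            assert!(index < self.entries.len(), "huffman table index out of bounds");
            self.entries[index] = value;
        }
    }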
Over a library used by billions of people a lot of times a day, those bounds checks add up to a lot of wasted time and energy if you don't strictly need them.
If this is a common function in the webp library, this function is probably called something like a trillion times a day.
The past 40+ years of CS research has created a huge number of tools for reducing the costs of memory safety. Some of these made it into Rust, but others are sitting in (often abandoned) academic projects. It is an abject failure of industry that we are still parsing untrusted input in languages like C.
WebP gets a slight pass because the code is so old; WebP is at least partly based on code from On2 that extends back at least until the early '00s. Rewriting working code is always going to be lower priority than writing new code (and it's not always a safety win either; being deployed for 10 years gets you an awful lot of real-world testing!).
But look at the implementations for newer image formats and you see more of the same: C and C++, with the token Java implementation that won't get used outside of the JVM.
C and C++ are often used because at the optimization level that these sorts of libraries use, Rust makes tradeoffs that do not work out to 0 overhead. I would personally suggest that Google et al invest in figuring out how to make "formally-verified C" a more widely adopted language for libraries that are truly security-critical.
C that is safe enough for parsing untrusted input like this needs to be either formally verified to be safe (not necessarily to be correct, at least if it's the canonical implementation), or written in an intrinsically memory-safe way whose safety humans can fully reason about.
And at that latter point it's slower than Rust. (Not to mention that Rust's awareness of memory lifetimes enables substantial optimizations beyond plain C code that quite often result in faster code at the same level of effort.)
Apple has actually measured the impact of adding bounds checking to commonly used libraries including media libraries and found no impact on overall power use. Audio and video encoding libraries took around a 1% performance impact.
I don't think there is any evidence to support that bounds checking causes any significant increase in power consumption, and measurements currently available such as shown above show minimal runtime performance impact.
Bounds checking is the perfect situation for branch predictors, 99.9% of the time the branch is predicted correctly, and when it's not it doesn't matter because you're probably aborting the program!
There's also the point that you can opt out of bounds checking as-needed on hot loops if you really need to.
There is no technological reason to not use bounds checking as the default mode of operation today.
There is no technological reason to not use bounds checking as the default mode of operation since ALGOL exists.
My favourite quote is from C.A.R. Hoare's 1980 ACM Turing Award Lecture:
"A consequence of this principle is that every occurrence of
every subscript of every subscripted variable was on every
occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they
urged us not to--they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980 language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have
long been against the law."
They aren't wasted if they are preventing security vulnerabilities.
I also wouldn't be surprised if patching, rebuilding and distributing this fixed version of libwebp to all of these devices would be comparable to the extra cost of decoding.
"Preventing security vulnerabilities" is not an unadulturated good (the most secure computer does nothing but checking to make sure it hasn't been breached), and any bounds check that does not actively prevent an intrusion is wasted from a value perspective.
Your intuition about the cost of a patch vs the extra instructions in the hot path to check bounds everywhere is also very much wrong: the cost of a patch (which is also amortized among many fixes) is not even close. The hot path adds up when you call it a lot.
I think your intuition on the cost of not-taken bounds check branches is off, especially in the context of devices that spend most of their resources browsing the web, running radios, powering screen backlights. I bet any holistic measurement could not tell the difference.
The best solution to this sort of bug is not using languages that are susceptible to this sort of bug. At the very least, I think it's time to retire the fallacy that we're generally capable of producing sound programs in memory unsafe languages. Just like we don't write code manually checking raw SQL to protect against injections and we don't roll our own crypto when we need to encrypt something or do a key exchange, we shouldn't treat writing safe code in C as something anyone can do by just being vaguely diligent.
I feel like we need for C and C++ what TypeScript is for JavaScript: not a language from scratch, but something as close as possible to the thing everyone is already familiar with while doing the thing Rust does. A standard library where everything has the same names, the same kind of C++ objects and templates and RAII etc.; change only this and nothing else.
Because otherwise you make people learn 100 other things at once to start using it effectively, and then they don't.
That isn't the thing Rust does, and is slower, so then people won't use it.
The advantage of Rust is that you can produce efficient safe code. The disadvantage is that you have to learn a different language first, which deters adoption.
Zig is something like this for C, but not C++. There are a number of ongoing attempts to do something like this for C++, but it is very hard and they are all way behind Rust in terms of mindshare.
Check out the tooling from the seL4 project.
I'm aware they used intermediate levels between raw C and the proof assistant, but I don't see why that should be prohibitive.
Implementing webp and similar libraries in another language may not be quick or easy, but having these sorts of issues continue to crop up over and over doesn't feel particularly practical either; it's just easier to act like every critical vulnerability caused by memory unsafety is an isolated issue than to confront the reality that it's a systemic issue and there aren't any easy solutions.
Agreed, I'll stop saying it when we stop having these issues. I don't think people want to confront the fact that there isn't any easier solution, but we're going to keep running into the same thing over and over until we do.
If you read what they proposed and disagreed with it, then don't ask "What do you propose?" - Argue/debate why you think their proposal, which you read, is not useful.
Fuzzing needs to cover all important bits of the code to be useful. The problem I see is that incomplete coverage creates a false sense of security. Projects have some minimal fuzzing coverage (e.g. in oss-fuzz) and care less about quality of the code, thinking fuzzing will catch all security bugs.
Rust code needs proper fuzzing too. It takes a lot of effort to ensure everything is covered and stays covered as the code is developed. Crashing libraries or applications can be a denial of service. Sure, it's lower impact than an RCE due to a buffer overflow, but it is still a security issue.
This isn't even the problem. There are shitloads of open bugs found by fuzzing, even in the linux kernel, that nobody is fixing. Serious security vulns in the kernel regress because of lack of tests.
It is clear that even if we stood up the best bug finding systems the world has ever seen that critical software will still be a disaster.
This is the sort of thing where I am very curious to hear what happens if you fine-tune an uncensored version of GPT4 on some memory exploits like this along with the vulnerable code and ask it for more.
It seems like it ought to be really good at this and I am suspicious that people in the know are afraid to talk about it publicly because it's too good at it and once people start weaponizing LLMs for this purpose we just won't be able to use memory unsafe code anymore.
LLMs should indeed be very good at finding vulns. It is a safe bet state actors have already done this, and have added to their stockpile of zero-day exploits. I suspect you are right that collectively we are whistling past the graveyard and the current "solution" is to make widely available LLMs explicitly not provide that kind of capability.
My experience is they're not good at this for most vulnerability classes, especially those that are tough to discover by classical methods. Have you had any experience using them for this?
Trivial vulnerabilities are easily discoverable yes -- but, they are also trivially discoverable by standard automation available today. I've found GPT-4 to be shockingly bad at vulnerability analysis for all except the most popular vulnerability classes. My speculation is that there just isn't enough literature on these vulnerability classes for it to have practical mastery of them.
Complex vulnerabilities are the emergent phenomena of multiple events across a codebase and its dependencies, involving control flow and data flow, while missing type information and other runtime data. Even Anthropic's 100K context windows won't nearly fit it all, and if you stuff all the code into embeddings, the ability to reason across all this space will be poor.
You can train a model to ask very pointed questions about particular snippets, but wholesale LLM-based analysis to find vulnerabilities seems like it'll be extremely slow, expensive and inaccurate.
I am surprised that you can even say "it's not very good." The sorts of prompts I'm imagining, I would expect to trigger the "I'm afraid I can't do that, Dave" safeguards every time. I guess, yeah, I'm just imagining using it as an advanced fuzzer: take the fuzzer code, ask the LLM to generate slight variations, feed those into a test, and ask the LLM to flag the ones that look like they might be an exploit. When it generates nonsense code, you just throw out those runs. But on the other hand, hallucination feels like an advantage here, since it's going to try things you never would have thought to test.
I don't have any problem getting it to help with exploit development. I never had any issue with that with any of them, in fact, which is surprising in retrospect.
Inference is so slow, and almost everything about fuzzers is meant to be super fast. Maybe there's a late-stage part in crash validation/analysis where you can use it, but my bias is that we're just not there yet.
I would not expect an LLM to be good at this without specialized training. I have tried prompting for code generation.
I do not know of LLMs that have been specifically trained on, say, the testing corpus of some "lint" programs and against known vulns. As you point out, it wouldn't be possible as a user of an LLM AI to do the equivalent by showing it some vulns, while it would be perfectly reasonable to get an LLM to write, for example, business case studies by showing it examples.
I don't think specialized training solves any of the problems I mentioned. It doesn't increase the window size, or provide any of the types of highly specialized and optimized multi-file, multi-technique analysis, or make it any cheaper.
Humans find security vulnerabilities all the time too. The assumption is a lot of the misses are because highly trained security experts don't have time to carefully read and understand all code without fatigue.
> To put this in context: if this bug does affect Android, then it could potentially be turned into a remote exploit for apps like Signal and WhatsApp. I'd expect it to be fixed in the October bulletin.
Interesting quote from Ben Hawkes (former Project Zero manager) in the article. I regularly compile Signal-Android from source and happened to notice they vendored libwebp a few days ago:
Android is particularly troublesome here with the number of phones out there receiving no updates, and just a single download away from being exploited.
For me, personally, it's a race to see if Google can get this patched for my Pixel 5 before security updates stop in October.
> A phone released in October 2020 is about to stop getting security updates?!
I actually looked into it, my previous phone was released in June 2020.
The last security update for the phone came out in January 2022.
I'm not sure whether they're just updating things infrequently or whether that model is abandoned altogether, but neither would speak highly of the Android support landscape for non-flagship phones.
This ends up being kind of silly: if I want to have a mostly secure backup device (for 2FA and my bank/eID apps), then I probably need two comparatively recent devices, since a 5 year old budget phone would no longer quite do as a temporary daily driver, in case I'd lose my main phone.
I can say that their rugged designs served me well, but some of the newer phones apparently have this weird component for wireless earphones and I don't really want that, so my current and future purchases are from different manufacturers. Was actually pretty close to stock Android, though!
It stops getting guaranteed security updates, whatever that means. If this is really so serious that it enables drive-by attacks on the web, it'd be negligent not to port the fix to older OSes.
My phone came out in 2017. I'm still on Android 8. There's an update to Android 9.1 available somewhere, but AT&T never got around to porting it to their crapware-riddled fork.
The problem is that they'll use the opportunity to schlep a bunch of non-security related stuff into the update as well. That's the thing that really bothers me about these, that you can't say just the security patches and hold the telemetry/marketing/spyware/adware/crapware/malware/etc.
The point about Android is particularly important. I wouldn't like to estimate the proportion of Android phones that are in regular use that no longer receive security updates.
Android phones don't have an awful lot of attack surface for typical users, though. Messenger apps will already refuse to display arbitrary images; WhatsApp, for example, will only display JPEGs and MP4s sent from other contacts.
True, and most websites will reencode images for compression anyway. But the point still stands: this makes "hacked by clicking a link" a reality again for some people, which should have ended along with Flash and Java applets. The current mainstream assumption is that "you won't get hacked if you don't install anything" (not to say that this is the best security practice..). Since you don't install images, most people will just assume they are safe. I hope this whole ordeal doesn't hurt JPEG-XL adoption.
The reference implementation is C++, and it’s nearly guaranteed to have equally worrisome bugs in it — every image library has seen those over the years.
We live in 2023. We can deal with slightly worse compression until someone rewrites it in a sane language.
Yeah, I don't mess with webp, VP8, etc. I see near zero benefit to myself from that, with significant downsides.
Like, Google Meet insists on using VP8/9. Why, cause it's "free?" The strain on my laptop and extra energy usage for it to CPU-en/decode video ain't free. Zoom just uses h.264 instead of being annoying about it.
Especially when there are various neural net compression methods on the horizon that look like they can probably reduce file sizes 10x or more for similar user enjoyment.
It's not just compression ratios that new formats provide though: it's also higher bit depths (meaning less banding for things like gradients, i.e. in skies at sunset) and HDR support (values over 1.0 in 'linear' space, rather than being clamped).
Smart pointers aren't the beginning and end of things. C++ is riddled with places that can be converted into weird machines, even if you guarantee correct lifetimes of everything on the heap.
Look at how much fun we can just have with the stack!
    std::string_view foo(std::string_view s) {
        return s;
    }

    auto s = foo(std::string("temporary")); // kaboom: s now dangles, because the
                                            // temporary std::string it views is
                                            // destroyed at the end of this statement
Modern C++ does not force you to initialize everything before it is read. Modern C++ happily lets you ignore bounds checks with vector operator[] or by using c style arrays or by doing pointer arithmetic. Modern C++ happily lets you overflow integers or silently truncate when widths change. And on and on and on. Turning every "new" into "make_shared" is nowhere close to enough to make C++ safe in the face of bugs.
Agreed. Security shouldn't depend on not visiting certain web sites, because if nothing else the ad networks are able to get your browser to load all sorts of content from all sorts of places that you wouldn't deliberately do.
Clicking a link in a web browser means that any image decoding happens in the browser renderer sandbox.
That sandbox is pretty robust - the difficulty of finding an exploit somewhere in a browser renderer is much lower than the difficulty of finding a way out of the sandbox the renderer runs in.
Yes, not exactly "clicking a link" but still at the same level ("add to contacts"). Also many apps have users with profile pictures. An attacker can send a link that will open in an app (eg. Instagram, facebook). Also within browsers could this still leak whatever is currently rendering on the page?
Is there any site that shows what phones are still getting security updates? I'm worried that this will be the thing that makes me retire my son's old Moto G... 5 I think? It's probably out of security updates. Which kills me, he's a careful boy and it's a solid phone, this is unnecessary E-Waste.
I’m surprised, given the history of exploits, Google didn’t decide to start shipping the image decoders as an APEX system component in the Play Store, with the built-in ones serving as a fallback. They just might now.
Having spelunked through some Google announcements on this, it looks like they planned to do so back when Android 11 was in development (https://android-developers.googleblog.com/2020/02/Android-11... ), but I can’t find any evidence that image decoders are included in the current Android Mainline system updates that ship via Google Play - video codecs are, but not image ones apparently.
It looks as if they planned to include image decoders, but that was dropped sometime during the Android 11 development cycle, unless I’ve missed something (which is certainly possible).
I'm always suspicious of Google's behavior. It might be a good opportunity for them to tell everyone they need the latest version of Chrome because security. Manifest 3.0 and cohorts/FLoC, you can't escape them.
I'm tired and cranky today so this will lack subtlety, but:
You don't have to use Rust but you **can't** use C.
There's no reason to be finding these bugs in 2023, period. We can do better and we know how to do better; there's just no reason, apart from legacy code (and even then), that you should be using memory unsafe languages in production.
Responses like this are why people still use C when it may make more sense to use a different language. Don’t just tell people, “trust me bro, never use C, there’s never a good reason”. Never make broad claims that leave no wiggle room, because the one thing that is always true is that there will be exceptional cases. Instead of saying this kind of stuff, why don’t you elucidate what circumstances exist where using C makes sense? Microcontrollers, FPGAs, glue code that can expose your API to other programming languages, maintaining an old code base already written in C, etc.
And as an aside, this is also why people don’t take security seriously. Not every bounds overflow will result in my database being breached and/or my computer being pwned. Instead of freaking out every time a bounds overflow or some sort of memory error is found, it may be useful to actually find out what the impact is. Is that code sandboxed? Is it local, or does it have access to the web? Does it request elevated permissions at any point? Does it even need elevated permissions?
There are many factors that can reduce or increase the risk factor of a bug. Security researchers need to start actually explaining the risk instead of blindly proclaiming everything as a “critical” vulnerability.
And everything I’m saying here has no bearing on this specific vulnerability. It’s in response to the general claim that every memory bug is critical.
For an image decoding library, you'll still need to expose a C API if you want anything to be able to use your library. You could only provide a library for your preferred language(s) but if like Google you're trying to push an all-new format the ecosystem buy-in is a big deal. And obviously you can provide a C-compatible library without writing the library in C, but it's more complexity to handle.
(Edit: I should mention that this is one reason why I really like wuffs, mentioned elsewhere in this thread; it's a safer language that compiles down to C, so you can have mostly the best of both worlds)
I don't think the answer is to do fine-grained bounds checking absolutely everywhere, nor do I think it'll happen anyway. Even "unsafe" code has a coarse type of bounds-checking, the kernel's virtual memory layer, which adds less overhead. C etc is fine if it's not in the same memory space as the safe code.
This is pretty bad; there will be many vulnerable applications, and someone just posted how to exploit it. I just reproduced it on a VM:
SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/<name>/webp_test/examples/dwebp+0xb24e9) in BuildHuffmanTable
Shadow bytes around the buggy address:
...
Also, here's the webp it generated as base64, but chrome doesn't crash or anything. Other apps may handle it differently, but many just say invalid format.
If I recall correctly, WhatsApp supports webp for their sticker packs. And those are open to anyone to develop and integrate into WhatsApp on-device with an intent.
Remember when the primary purpose of flatpak's shared runtimes was so that libraries like libwebp are not actually bundled per-application? libwebp in the runtimes has been patched with a fix for over a week.
A memory-safe language could have easily resulted in the same vulnerability.
In this context, “memory-safe” means it does bounds checking on an array when you try to access an element. But the webp code does bounds checks up-front so that array accesses can be non-checked, to help performance. (If they didn’t want this performance, they could have easily used a std::vector and used bounds-checked access.) The vulnerability happened because this up-front bounds checking logic was incorrect.
If we were to hypothesize a counterfactual world where webp was written in rust, presumably the devs would have wanted a similar optimization where they did the bounds checking up-front still, and they would have put the actual array access in an unsafe block to have the same perf optimization. The same bug would have thus happened.
The lesson here is that bounds checking on every element access is probably a worthwhile overhead, no matter what the language is. C++ has safe ways to do this (well, safe-ish… safe for the purposes of this bug at least) with std::vector, and maybe they should just switch to that and eat the performance overhead.
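To make the counterfactual concrete, here is a sketch of the two patterns being compared (illustrative names only, not the real decoder):

    // Pattern 1: default Rust, every access bounds-checked.
    fn decode_checked(table: &[u16], code: usize) -> u16 {
        table[code] // panics if code >= table.len()
    }

    // Pattern 2: the hypothesized port that keeps webp's "check up front,
    // access unchecked" optimization. If the up-front sizing logic is wrong,
    // this has the same bug as the C code.
    fn decode_unchecked(table: &[u16], code: usize) -> u16 {
        debug_assert!(code < table.len());
        // SAFETY: relies on non-local reasoning that the table was sized so
        // every reachable `code` is in range, exactly the invariant that broke.
        unsafe { *table.get_unchecked(code) }
    }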
This is a big presumption. Yes, it could happen. In practice, doing this isn't even the first tool you'd reach for in this circumstance; the compiler can and will eliminate duplicate bounds checks, so if you've hoisted the check early, you shouldn't be using unchecked accesses, even if you care about performance, until you've demonstrated that the compiler isn't already removing them. The extra ceremony ("unsafe { foo.get_unchecked(n) }" vs "foo[n]") makes this even simpler to catch in code review.
> The extra ceremony ("unsafe { foo.get_unchecked(n) }" vs "foo[n]") makes this even simpler to catch in code review
Right, and in said code review, the webp author could have easily said “yup, we want unsafe here because we already checked up front that the buffer shall not exceed k elements”. Sure it’s easier to see that unchecked access is happening, but when the whole point of large sections of the huffman table code in webp is to make this very thing work, it wouldn’t cause any additional scrutiny in a code review. In other words, it would already be super clear to the reviewer that bounds checking is being disabled for performance sake, seeing the word `unsafe` as ceremony isn’t really adding any information here.
It’s possible webp could have been implemented in rust with a naive approach to early bounds checking and relying on the optimizer to elide it, but having looked at the code, with all the buffer sizes they’re passing around, it looks unlikely that equivalent rust code would have been able to auto-optimize it. I don’t think it’s unlikely at all that, in this parallel universe, they would have found the bounds checking to be a decent enough overhead that they would have reached for unchecked access, and would have passed code review the same way the current code did.
For sure, this absolutely could (and will, at some point) happen. But security isn't only about what can happen, but what will probably happen. Each of these things introduces a possibility to catch the bug, not an impossibility that a bug would happen.
> it wouldn’t cause any additional scrutiny in a code review.
I would hope that it would at least cause a "demonstrate that this is actually necessary" review. People do do this! And sometimes you do have to use unchecked access. A famous example of this is literally Huffman coding, by Dropbox, back in 2016. They ended up providing both bounds-checked and unchecked options as a flag, because they actually did measure and demonstrate that the checks were causing performance drops. I am curious whether they'd still see the same results today, given that a lot of time has passed, and also there were some other factors that would cause me to wonder a bit, but I haven't followed up on them myself. Regardless, this scenario will end up happening for good reasons at some point; the goal is to get it to happen only when it needs to, and not before.
That was for Dropbox's specific use case, and 11% seems like a lot in any case.
For example, what improvements has rustc gained in the meantime to better optimize bounds checking? Is the overhead still the same, or is it less? Does it actually matter for the acceptance criteria, for the definition of done on the story?
You’re arguing about a potential vulnerability in a hypothetical code that exists only in your imagination.
Meanwhile the real code that exists doesn’t work like that. It’s not just a matter of personal style. Rust has constructs that help avoid bounds checks. It has design patterns that move unsafe code into smaller, easier to verify helpers. It has a culture of caring about safety – people don’t rewrite projects in Rust to just give up on its biggest selling point.
We know that the webp project does use unchecked access. We know this is intentional because they go through a lot of effort to do up front bounds checking rather than at every access. We also know that companies like Dropbox, when implementing their own Huffman table implementation, demonstrated an 11% speed up when disabling bounds checks: https://dropbox.tech/infrastructure/lossless-compression-wit...
It is not a stretch whatsoever to assume that a direct rewrite of the webp library into rust would have uncovered the same perf findings that Dropbox did, and decided that the pattern of “up front bounds checks, disabled bounds checks at element access” is a reasonable perf enhancement, especially for a Huffman table, where you’re continually accessing the compression table over and over when expanding it.
Edit: the more I think about it, you’re probably right. I had mistakenly thought I was reading C++ code when I was looking at the original patch to the vulnerability. The fact that it’s actually written in C, erodes my argument quite a bit… my logic in my head went something like, “if they wanted bounds checking they would have used std::vector”, but now I realize that’s impossible since it’s not C++. I’ll concede that there’s no real reason to assume that they would have skipped bounds checking if written in a safe language.
While yes, that's theoretically possible, do you have data to establish this? For example, having small sections of the code be marked as unsafe would allow for greater scrutiny of those sections. Also, unsafe access is more annoying to perform in Rust than in C or C++, so maybe that would have acted as a deterrent (or at least the code would have been profiled to make sure that unsafe access was worth it).
It looks like Dropbox experimented with disabling bounds checks in their Huffman coding impl, and found that using the unsafe pattern increased throughput from 224 MB/s to 249 MB/s (11%-ish faster). We don't even need to hypothesize about whether webp would have eliminated bounds checking; we can see that other companies arrived at the same conclusion: disabling it can be worth it if you're quite sure you've gotten the up-front checking right. We can imagine that if Dropbox went to prod with the unchecked Huffman implementation (never mind that that article isn't about webp in particular), they could easily have had the same bug. And I don't think a naive code review saying "unsafe is bad" would have stopped them from doing it: they clearly did the work to show why it's worth it.
The risk probably doesn't matter for their use case because they're doing all this datacenter-scale image conversion on processes separate from the main logic (and likely not even on the same machines). Unlike in a phone's web browser or something, where it's 11% speedup with ??% added risk.
It's already common for PC apps to split potentially unsafe rendering into subprocesses, like in Chrome. If you don't want to pay the full IPC toll, there's shared memory. In theory should be about the same speed as inlined unsafe code, right? What if Rust's "unsafe" blocks could do this for you?
It isn't, but there's a big difference between writing straight C and writing some unsafe Rust with safe wrappers. It's generally possible to have comparably few lines of unsafe Rust that do the heavy lifting and can be verified to maintain memory safety without sacrificing performance.
Well, the code you mark unsafe is probably also the most complicated piece and thus the most likely to have a bug. I can trust decent C devs to write their basic logic safely, just not the hyper-optimized portions.
A big blob of complex unsafe code is the opposite of how Rust devs approach unsafe optimizations.
Rust has a pattern of isolating unsafety into small components behind a safe interface, so that the component can be understood and tested in isolation. For example, if you need some adventurous pointer arithmetic, you write an Iterator for it, rather than do it in the middle of a complex algorithm. This way the complicated logic can be in safe code.
It's sort of like Lego, where you build from safe higher-level blocks, but you can design custom blocks if you need.
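A toy example of that Lego-block pattern (not from any real codebase): the raw pointer arithmetic is confined to one small, testable component with a safe interface, and the algorithm that uses it stays in safe code.

    use std::marker::PhantomData;

    // The unsafe pointer stepping lives here and only here.
    struct StrideIter<'a> {
        ptr: *const u8,
        remaining: usize,
        stride: usize,
        _borrow: PhantomData<&'a [u8]>,
    }

    impl<'a> StrideIter<'a> {
        fn new(data: &'a [u8], stride: usize) -> Self {
            assert!(stride > 0);
            StrideIter {
                ptr: data.as_ptr(),
                remaining: data.len() / stride,
                stride,
                _borrow: PhantomData,
            }
        }
    }

    impl<'a> Iterator for StrideIter<'a> {
        type Item = u8;
        fn next(&mut self) -> Option<u8> {
            if self.remaining == 0 {
                return None;
            }
            // SAFETY: `new` guaranteed at least `remaining * stride` readable
            // bytes from `ptr`, and the borrow keeps the buffer alive.
            let byte = unsafe { *self.ptr };
            self.ptr = unsafe { self.ptr.add(self.stride) };
            self.remaining -= 1;
            Some(byte)
        }
    }

    // Callers never touch a raw pointer:
    fn sum_every_other(data: &[u8]) -> u32 {
        StrideIter::new(data, 2).map(u32::from).sum()
    }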
Likewise, C code will be organized into separate pieces with a few small super-optimized/complicated parts, and they can fuzz-test the complicated pieces that are most likely to have buffer overflows.
I can't find the actual code causing the libwebp vulnerability, so idk if mixed safe/unsafe Rust code would've been any better here. Maybe what we really need is an "unsafe-jail" block in Rust that uses a child process limited to a piece of shared mem, and you put big pieces in there to avoid overhead. Like, libwebp can screw up all it wants, just don't touch the rest of my app.
The difference is that in C you can't make a "safe" interface, which the compiler will enforce is used properly.
C's type system is not nearly expressive enough for this. You can barely declare a pointer non-null with extensions, but you can't express ownership. You can't force callers of your function to check for error before using the returned value. You can't force correct lifecycle of objects (e.g. Rust can have methods that can be called once, and no more, and then statically forbid further uses of the object. Great for cleanup without double-free.)
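A minimal sketch of that call-once idea (illustration only, not any particular API):

    // A handle whose cleanup method takes `self` by value: after `close`,
    // the compiler statically rejects any further use, so a double-free
    // style bug is a compile error rather than a runtime vulnerability.
    struct Connection {
        name: String,
    }

    impl Connection {
        fn close(self) {
            // `self` is consumed here and dropped when the function returns.
            println!("closing {}", self.name);
        }
    }

    fn main() {
        let conn = Connection { name: "db".to_string() };
        conn.close();
        // conn.close();              // compile error: use of moved value `conn`
        // println!("{}", conn.name); // compile error for the same reason
    }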
C doesn't have the ability to turn off Undefined Behavior for a section of code. You can't just forbid dangling pointers.
For a foolproof API the best you can do is use handles and opaque objects, and a ton of run-time defenses. But in Rust that is unnecessary. Use of the API can be guaranteed safe, and a lot of that is guaranteed at compile time, with zero run-time overhead.
For example, a mutable slice in Rust (a buffer) has a guarantee that the pointer is not null, is a valid allocation for the length of the slice, is aligned, points to initialized memory, is not aliased with any other pointer, and will not be freed for as long as you use it. And the compiler enforces that the safe Rust code can't break these guarantees, even if it's awful code written by the most incompetent amateur while drunk. And at run time this is still just a pointer and a length.
In C you don't get this compartmentalization and low- or zero-overhead layering. Instead of unsafe + actually enforced safe, you have unsafe + more unsafe.
Right, but I can trust a decent C developer to use it safely in the simple parts, especially with tooling like valgrind to detect obvious bugs. The only part where I'd say the usual "nobody is perfect" is in the hard parts.
There's 40 years history of trying, and it doesn't work.
These decent C programmers are like True Scotsmen. When top software companies keep getting pwned, even in their most security-sensitive projects, it's because they hire crap programmers.
Even basic, boring C can be exploitable. Android was hit by an integer overflow in `malloc(items * size)` (stagefright). Mozilla's NSS had a vulnerability due to a wrong buffer size, which fuzzing did not catch (BigSig).
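For concreteness, here is the `items * size` overflow class sketched in Rust (hypothetical numbers and function name, not the actual stagefright code):

    // In C the multiplication can silently wrap, so malloc returns a tiny
    // allocation and later writes run off the end of it. checked_mul turns
    // that wrap into an error instead.
    fn allocate_table(items: usize, size: usize) -> Option<Vec<u8>> {
        let bytes = items.checked_mul(size)?; // None on overflow
        Some(vec![0u8; bytes])
    }

    fn main() {
        assert!(allocate_table(1 << 20, 16).is_some());   // 16 MiB, fine
        assert!(allocate_table(usize::MAX, 2).is_none()); // would wrap in C
    }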
The only overhead I see in theory is the additional process's kernel resources, which are negligible. Wherever you'd normally write to memory for some unsafe code to mutate, you instead write to the shared memory. Kernel is doing the same virtual mem protection either way, only in this case it opens up access to the child process too.
Am I missing something else, like shared mem being slower? Maybe the inability to share CPU caches?
The fact that you have a whole other process, was my line of thinking. If the scheduling doesn't play nice then latency will suffer. I don't really know in practice though.
Latency is an issue with file or network-based IPC, but shared memory is supposed to be the same speed as non-shared. Apparently the X window system relies on it a lot.
Scheduling, I dunno. Would imagine it's not bad as long as you don't spawn a ton of these.
The solution, if you can consider there to be one, was always languages with semantics to guarantee logical correctness. But we're never going to get that because it requires a pace of software development that's incompatible with the money people's fetish for churn and burn. So let's put all our eggs into the Rust basket and while we're at it trust the hardware to never lead us astray :)
Buffer overflows are bugs. You're asking how bugs can still be happening in this day and age? The answer is inadequate testing and broken engineering processes.
The question is how buffer overflows can still be happening, not bugs in general. No matter how bad my eng practices are, I cannot create a buffer overflow bug in a memsafe language.
Buffer overflows happen because a bounds check is missing. Using a language with implicit bounds checking (memsafe?) does _not_ excuse you from explicit bounds checking. Depending upon the language an implicit bounds check might "recover" by intentionally crashing or raising an exception. Either approach could result in lost or corrupt data with the latter being arguably worse as the program continues running in a potentially broken state. Bounds checking is error checking and non-trivial applications need robust error checking and recovery regardless of the language they are written in.
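A small sketch of that distinction (illustrative names only): the implicit check turns a bad index into a crash, while an explicit check turns it into an error the caller can actually handle.

    fn pixel_implicit(row: &[u8], x: usize) -> u8 {
        row[x] // implicit bounds check: panics on bad input, mid-operation
    }

    fn pixel_explicit(row: &[u8], x: usize) -> Result<u8, String> {
        // Explicit bounds check treated as ordinary error handling.
        row.get(x)
            .copied()
            .ok_or_else(|| format!("index {} outside row of length {}", x, row.len()))
    }

    fn main() {
        let row = [10u8, 20, 30];
        assert_eq!(pixel_explicit(&row, 1), Ok(20));
        assert!(pixel_explicit(&row, 9).is_err()); // handled, no crash
        let _ = pixel_implicit(&row, 1); // fine here; index 9 would panic
    }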
Exception gets caught somewhere up the call stack, like a webserver returning error 500. It's not the most user-friendly error handling, so you should avoid that whenever you can, but it's not a big problem. In some languages like C++, you'll trigger a crash instead, which can be more of an issue but still isn't nearly as bad as an undetected buffer overflow.
Additional timeline info, as I was curious myself. WebP is old enough that a memory safe language was not a feasible option when the project started.
Android 12 was the first version to support Rust code, and came out in 2021 [0, link talks about the first year of integration].
On the iOS side (which also was affected by this), Swift 1.0 came out in ~2014.
As far as I can tell, Chrome doesn't yet support a memory safe language, but it does have a bunch of other safety measures built in (see MiraclePtr, sandboxing, etc.). Since both WebP and Chrome are from Google, that would have held back a possible transition.
WebP was announced in 2010, and had its first stable release in 2018 [1].
> Chrome doesn't yet support a memory safe language
In addition to the safety features you mentioned, Chrome supports Wuffs, a memory safe programming language that supports runtime correctness checks, designed for writing parsers for untrusted files. I don’t think it existed at the start of the webp project either, but that’s what I would expect the webp parser to be written in, over Rust or a garbage collected language.
Is Rust in practice a memory safe language when you're doing tricks like Huffman-decoding Huffman tables into buffers? It seems like once you optimize for performance this much, you're liable to turn off bounds checking here or there.
I agree it's definitely possible, and I'm certain we'll see a vuln due to some crazy Rust optimizations in the future.
That said, not switching over to a memory safe language is, in my opinion, letting perfect be the enemy of good. Folks will still be able to write footguns, but better language choices will prevent bugs in all the non-crazy-optimized parts.
There are already Huffman decoding routines and some parts of the webp algorithms in https://github.com/google/wuffs (a language that finds missing bounds checks at compile time). On the contrary, according to the README, the language allows writing more optimized code than C. WebP decoding is stated as a mid-term target on the roadmap.
The point of `unsafe` in Rust is that you can generally limit `unsafe` code to a few short functions and encapsulate them in a safe interface. This means less code to review and maintaining strong guarantees in the rest of the safe Rust code.
Incorrect. Semantically, Rust always inserts runtime bounds-checks, regardless of optimization level. The compiler _may_ then remove them as part of an optimization pass, iff it can prove the check is redundant.
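For example (a sketch; whether the checks are actually removed is up to the optimizer, not a language guarantee):

    // Both functions use ordinary checked indexing. The second does one
    // up-front slice (a single check) and then iterates, which gives the
    // optimizer an easy proof that no per-iteration check is needed.
    pub fn sum_first_n(data: &[u32], n: usize) -> u64 {
        let mut total = 0u64;
        for i in 0..n {
            total += data[i] as u64; // each access checked against data.len()
        }
        total
    }

    pub fn sum_first_n_hoisted(data: &[u32], n: usize) -> u64 {
        let head = &data[..n]; // one check up front: panics if n > data.len()
        head.iter().map(|&x| x as u64).sum()
    }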
Do bounds checks even matter, even for very tight loops? This article [1] seems to suggest that removing the checks can give up to 10-15% speed increase under some circumstances. Worth it? I'd say no.
Ada has the bounds checking, but doesn't (AFAIK) have a safe way to deallocate dynamic memory. At least as of now, there is a proposal to add something like Rust in the future.
It also has memory unsafe concepts like specifying an address for a variable.
No. That is true for the list upthread (well, if you consider runtimes measuring a couple of MB "large", it's reasonable but quite arguable). It doesn't have much correlation with any feature.
This is a needlessly aggressive take IMO, and hailing Rust as the be-all and end-all for secure software lacks nuance, giving the language and its adherents a bad reputation.
Downvotes of course. Programmers like to think they can write secure code with no guardrails. The fact that CVE lists are full of memory errors in 2023 says otherwise.
Even if you can write secure code in unsafe languages, it’s still a problem. Bugs creep in over time as more people work on a project, changes are merged, it’s ported to other platforms, etc.
If your project has any longevity you can’t bet on your pristine C code staying pristine.
We simply don't appreciate other people imposing their will on us. We should be able to start a C project in 2023 without shame if we want to. Simply because we like C and don't like Rust. There's literally no need to justify it any further. We are free to do what we want.
If in doubt, remember: the software is distributed in the hopes it will be useful but with ABSOLUTELY NO WARRANTY OF ANY KIND.
There's a substantial difference between a hobbyist writing a random project in C, and Google writing an image processing implementation, and then including it in Chrome and Android.
Freedom of speech is not freedom from consequences. You can start your project in C and release it on an AS IS basis, and we can make fun of you for it, object to its use in professional contexts, and so on.
Don't complain about the downvotes or the well-deserved replies then.
> object to its use in professional contexts
Liability limitation is standard legal boilerplate in literally every free software and open source license. No doubt it's also present in the terms of service documents which nobody reads before accepting. If you're gonna object to using software based on that, you might find you'll have to rewrite all of it or pay top dollar for service level agreements.
> And how is the limitation of LEGAL liability relevant
It sure as hell matters when a guy comes here posting about how C programmers need to be sued for negligence. Can you imagine some asshole suing you because you started a project on GitHub but didn't use one of the approved languages?
No worries, in a couple of years you will be able to get government clearance for C, in a similar way to how companies have to when dealing with hazardous substances.
I've been in tech for ages and I hate fads, and consider myself decent at spotting them. I was skeptical when I first started seeing Rust stuff but then I tried the language.
Definitely not a fad. It's a real innovation. Not the best language ever for language nerds / purists, but it brings a lot of important innovations down into the realm of the practical stuff you can easily use for real projects.
The only way it's going away is if something dramatically superior appears. Language design is very very hard and most "advanced" languages are impractical for real world use for various reasons, so I am not holding my breath.
The other newish language I like is Go, but for different use cases. Has a different design philosophy and imposes less cognitive load, but it's also designed for a different niche.
Observation: Uncompressed bitmaps, while bloated in terms of necessary bandwidth, still are provably the most secure form of bitmap -- just as uncompressed video (again, while super-bloaty and bandwidth intensive) would be...
That is, to abstract, our security issue exists because:
A) There is complex compression/decompression software/code;
B) To implement this compression/decompression -- there are one or more lookup tables in effect;
C) The software implementing those lookup tables and the decompression side of things was never properly fuzzed, bounds-checked, and/or mathematically proven not to create out-of-bounds errors; that is, proven that every potential use of the lookup table is correct given any possibility, any combination of data in an input stream.
Also -- Didn't stuff like this already happen in GIF and PNG formats? Weren't there past security vulnerabilities for those formats?
Isn't this just (I dare to say) -- computer history repeating itself ?
Point: Software Engineering Discipline:
If you as a Software Engineer implement a decompressor for whatever format (or hell, more broadly and generically something that uses lookup tables on incoming streams of data) -- then the "burden of proof" is on you, to prove (one way or another) that it does not have vulnerabilities.
Fuzzing can help, mathematics and inductive/deductive logic can help, testing can help, and paranoid coding (always bounds-checking array accesses, etc.) can help, running in virtual machines and other types of limited environments and sandboxes can help. Running as interpreted code (albeit slower) could help. Deferring decompression to other local network attached resources running in limited execution environments could help.
In short... a monumental challenge... summed up as:
"Prove that all code which uses lookup tables can not generate hidden/unwanted states."
Uncompressed bitmaps with absolutely NO extraneous unnecessary "handling code" (which would have existed in the case of the Windows BMP display code, if a security flaw was found in them...).
Also... many posters suggest use of Rust may be a solution...
It may turn out to be... but I think a more broader generalization of the language /compiler aspect of things may be:
Not so much to "use Rust", so much as "NOT to use C".
That is -- any library performing decompression of any sort (regardless of whether that decompression is related to visual images or not), if it uses C and lookup tables and implements decompression -- should at least be considered a potential source of future problems, and should (ideally) be migrated to a more memory-safe, bounds-checked language -- in the future.
Rust -- may or may not turn out to be this language...
Negatives for Rust -- large size and complexity of Rust's compiler source code.
Positives for Rust -- Rust's treatment of memory and bounds-checking.
Anyway, some thoughts on the language/compiler aspect of this...
Yes, but as a counterargument: there are battle-tested compression formats that offer pretty-damned-good compression. If anything, this raises questions about the risk-vs-reward calculus of embracing new compression standards. When was the last time there was a serious vulnerability in the major JPEG libs? Or even in h.264, which is much newer? Yes, a .webp is like 2/3 the size of a similar JPEG, but for most people are JPEG image sizes a huge cause for concern? Once you're doing lossy compression you're already sacrificing fidelity for file size, so you don't see a lot of machine-crushingly-huge JPEGs.
https://github.com/webmproject/libwebp/commit/902bc919033134...