
LLMs use tokens, with 1-dimensional positions and rich, fuzzy meanings, as their native "syntax", so to them LISP is alien and hard to process.

That's like reading binary for humans. 1s and 0s may be the simplest possible representation of information, but not the one your wet neural network recognizes.


Over two years ago already, using GPT-4, I experimented with code generation in a relatively unknown dialect of Lisp for which there are few online materials or discussions. Yet the results were good. The LLM slightly hallucinated, mixing that dialect up with Scheme and Common Lisp, but corrected itself when instructed clearly. When given a verbal description of a macro available in the dialect, it was able to refactor the code to take advantage of it.


This has been true since the beginning of HTML email. It hasn't stopped HTML email from proliferating, or from becoming de facto mandatory, and there's no chance of reversing course now.

HTML is going to be an inseparable part of e-mail for as long as e-mail lives, and yeah, it seems more likely that e-mail will die as a whole than that it will ever get technically simpler.

At this point we can only get better at filtering the HTML.


Rust already supports switching between borrow checker implementations.

It has migrated from a scope-based borrow checker to the non-lexical borrow checker (NLL), and now has the next-generation, experimental Polonius implementation as an option. However, once a new implementation becomes production-ready, the old one gets discarded, because there's no reason to choose it: borrow checking is fast, and the newer checkers accept strictly more (correct) programs.

You also have the Rc and RefCell types, which give you greater flexibility at the cost of some runtime checks.
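
For illustration, a minimal sketch (my own, using only std's Rc/RefCell APIs) of how the exclusive-borrow check moves from compile time to runtime:

    use std::cell::RefCell;
    use std::rc::Rc;

    fn main() {
        // Shared ownership of mutable data, with runtime-checked borrows.
        let shared = Rc::new(RefCell::new(vec![1, 2, 3]));
        let alias = Rc::clone(&shared);

        // Mutation through one handle; exclusivity is checked at runtime.
        shared.borrow_mut().push(4);

        // Overlapping mutable borrows are caught at runtime, not compile time.
        let guard = alias.borrow_mut();
        assert!(shared.try_borrow_mut().is_err()); // borrow_mut() here would panic
        drop(guard);

        assert_eq!(*shared.borrow(), vec![1, 2, 3, 4]);
    }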


The new borrow checker is not yet all that fast. For instance, it was 5000x slower, according to a recent report.

https://users.rust-lang.org/t/polonius-is-more-ergonomic-tha...

>I recommend watching the video @nerditation linked. I believe Amanda mentioned somewhere that Polonius is 5000x slower than the existing borrow-checker; IIRC the plan isn't to use Polonius instead of NLL, but rather use NLL and kick off Polonius for certain failure cases.


I think GP is talking about somehow being able to, for example, more seamlessly switch between manual borrowing and "checked" borrowing with Rc and RefCell.


> actually prove that aliasing doesn't happen in select cases

In the safe subset of Rust it's guaranteed in all cases. Even across libraries. Even in multi-threaded code.
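
A minimal sketch (my own, using std::thread::scope) of the multi-threaded case, where the same guarantee rejects mutable aliasing across threads:

    use std::thread;

    fn main() {
        let mut data = vec![1, 2, 3];

        // Two scoped threads capturing `&mut data` at once would be mutable
        // aliasing, so this is rejected at compile time:
        //
        //     thread::scope(|s| {
        //         s.spawn(|| data.push(4));
        //         s.spawn(|| data.push(5)); // error: cannot borrow `data` as mutable more than once
        //     });

        // A single scoped thread borrowing `data` mutably is fine; the borrow
        // ends when the scope joins the thread.
        thread::scope(|s| {
            s.spawn(|| data.push(4));
        });

        data.push(5);
        assert_eq!(data, vec![1, 2, 3, 4, 5]);
    }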


To elaborate on that some more, safe Rust can guarantee that mutable aliasing never happens, without solving the halting problem, because it forbids some programs that could've been considered legal. Here's an example of a function that's allowed:

    fn foo() {
        let mut x = 42;
        let mut mutable_references = Vec::new();
        let test: bool = rand::random();
        if test {
            mutable_references.push(&mut x);
        } else {
            mutable_references.push(&mut x);
        }
    }
Because only one if/else branch is ever allowed to execute, the compiler can see "lexically" that only one mutable reference to `x` is created, and `foo` compiles. But this other function that's "obviously" equivalent doesn't compile:

    fn bar() {
        let mut x = 42;
        let mut mutable_references = Vec::new();
        let test: bool = rand::random();
        if test {
            mutable_references.push(&mut x);
        }
        if !test {
            mutable_references.push(&mut x); // error: cannot borrow `x` as mutable more than once at a time
        }
    }
The Rust compiler doesn't do the analysis necessary to see that only one of those branches can execute, so it conservatively assumes that both of them can, and it refuses to compile `bar`. To do things like `bar`, you have to either refactor them to look more like `foo`, or else you have to use `unsafe` code.
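
For example, a sketch (mine, not from the parent) of the "refactor to look more like `foo`" option: a single if/else expression makes the exclusivity lexically visible, so it compiles:

    fn bar_refactored() {
        let mut x = 42;
        let mut mutable_references = Vec::new();
        let test: bool = rand::random();
        // Only one `&mut x` is ever created, and the borrow checker can see
        // that from the single if/else expression.
        let reference = if test { &mut x } else { &mut x };
        mutable_references.push(reference);
    }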


It requires that the libraries you use do not have UB. If you have no unsafe code, but a library you use does, you can still get UB.

https://github.com/rust-lang/rust/pull/139553

This is why it may be a good idea to run Miri on your Rust code, even when it has no unsafe, since a library, even the Rust stdlib, might have UB.


Isn't this a pretty trivial observation, though? All code everywhere relies on the absence of UB. The strength of Rust comes from the astronomically better tools to avoid UB, including Miri.


Miri is good, but it still has significant limitations. And the recommendation to use Miri, given the state of UB in the Rust ecosystem, is unlikely to carry over to similar tools for many other programming languages. It is recommended, for instance, by:

https://materialize.com/blog/rust-concurrency-bug-unbounded-...

https://zackoverflow.dev/writing/unsafe-rust-vs-zig

>If you use a crate in your Rust program, Miri will also panic if that crate has some UB. This sucks because there’s no way to configure it to skip over the crate, so you either have to fork and patch the UB yourself, or raise an issue with the authors of the crates and hopefully they fix it.

>This happened to me once on another project and I waited a day for it to get fixed, then when it was finally fixed I immediately ran into another source of UB from another crate and gave up.

Further, Miri is slow to run, discouraging people from using it even for the subset of cases where it can catch UB.

>The interpreter isn’t exactly fast, from what I’ve observed it’s more than 400x slower. Regular Rust can run the tests I wrote in less than a second, but Miri takes several minutes.

Even if Miri ran only 50x slower than normal code, that would limit which code paths people exercise under it.

So, while I can imagine that Miri could be best in class, that class itself has significant limitations.


> So, while I can imagine that Miri could be best in class, that class itself has significant limitations.

Sure -- but it's still better than writing similar code in C/C++/Zig where no comparable tool exists. (Well, for C there are some commercial tools that claim similar capabilities. I have not been able to evaluate them.)


The English language is awful, and we keep updating it instead of moving to a newer language.

Some things are used for interoperability, and switching to a newer, incompatible thing throws away all of that value.


It's funny that when OpenAI developed GPT-2, they kept warning that it was going to be disruptive. But the warnings were largely dismissed, because GPT-2 was way too dumb to be taken as a threat.


My guess is that the public library interface of GCC doesn't support it this way.

This back-end uses the confusingly named libgccjit (which, despite the name, is not used as a JIT here), and it exposes only a subset of GCC's functionality.

If something isn't already exposed, it might take a while to get patches to GCC and libgccjit accepted and merged.


You can't use this implementation to bootstrap Rust (in the sense of bootstrapping from a non-Rust language, or from a compiler that isn't rustc).

The GCC support here is only a backend in the existing Rust compiler, which is written in Rust. The Rust compiler uses GCC as a language-agnostic assembler and optimizer, not as a Rust compiler. The GCC part doesn't even know what Rust code looks like.

There is a different project (gccrs) meant to reimplement the Rust front end from scratch in C++ inside GCC itself, but that implementation is far behind and can't compile non-toy programs yet.


Compression is limited by the pigeonhole principle. You can't get any compression for free.

There's every possible text somewhere in Pi, but on average it costs the same or more bits to encode the location of the text than to encode the text itself.

To get compression, you can only shift costs around, by making some things take fewer bits to represent, at the cost of making everything else take more bits to disambiguate (e.g. instead of all bytes taking 8 bits, you can make a specific byte take 1 bit, but all other bytes will need 9 bits).
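
A toy sketch of my own (not a real codec) to make that trade-off concrete: spend 1 bit on one "favourite" byte and 9 bits on every other byte, and the scheme only pays off when that byte is common enough.

    // Toy prefix code: byte 0x00 is encoded as the single bit `0`,
    // every other byte as `1` followed by its 8 bits (9 bits total).
    fn encoded_bits(data: &[u8]) -> usize {
        data.iter().map(|&b| if b == 0x00 { 1usize } else { 9 }).sum()
    }

    fn main() {
        let mut mostly_zeros = vec![0u8; 90];
        mostly_zeros.extend(vec![1u8; 10]);
        let no_zeros = vec![1u8; 100];

        // 90*1 + 10*9 = 180 bits vs. 800 bits for plain 8-bit bytes: a win.
        assert_eq!(encoded_bits(&mostly_zeros), 180);

        // 100*9 = 900 bits vs. 800 bits: the same trick now costs more.
        assert_eq!(encoded_bits(&no_zeros), 900);
    }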

To be able to reference words from an English dictionary, you will have to dedicate some sequences of bits to them in the compressed stream.

If you use your best and shortest sequences, you're wasting them on picking from an inflexible fixed dictionary, instead of representing data in some more sophisticated way that is more frequently useful (which decoders already do by building adaptive dictionaries on the fly and other dynamic techniques).

If you try to avoid hurting normal compression and instead assign the less valuable, longer bit sequences to the dictionary words, those sequences will likely end up longer than the words themselves.


I thought you'd link to the one by Steve Mould, who implemented a snake game on it: https://www.youtube.com/watch?v=rf-efIZI_Dg


That one was way better, I just hadn't seen it linked from the official page.

