Hacker News | whytevuhuni's comments


Moral of the story: A truly secure website would be a continuously morphing one where an LLM keeps rewriting and redeploying large parts of its code every minute, so that no attacker can keep up.


Hmm. Now that you mention it, wasn't that part of what was happening in Neuromancer? The "encryption" (or whatever it was) kept changing so the attack had to respond by "evolving" to get in.

Excuse me, I need to go solicit VC for my new evolving web security startup that is really just Claude rewriting 10% of the infra each day....


Very surprised to hear that, since editions are exactly the kind of mechanism Rust is using to make sure software will keep working unchanged for decades.

The Rust compiler can build a 2024 edition application which depends on a 2015 edition library, which in turn depends on a 2018 edition library.

Every crate can upgrade at their own pace, or even never at all.


Oh, I'm aware of the benefits of editions. For some reason, I misread OP's comment.


> Then add in the fact that a change to history gets rippled down the descendent commits.

This sounds interesting. Could you go into a bit more detail?

I have 3 branches off of a single commit, update that commit, and all branches automatically rebase? Or?


Yes, exactly that. In Jujutsu you don't have Branches like you do in Git. You have branches in the sense that you have forks in the tree and you can place a "bookmark" against any revision in that tree. (When exporting to a Git repo those bookmarks are mapped to Git branch heads.)

So yeah if I have revision `a` with two children `b` and `c`, and even if those children have their own children, a change to `a` will get rippled down to `b` and `c` and any further children. It's a bit like Git rerere if you've used it, except you're not forced to fix every conflict immediately.

Any conflicts along the way are marked on those revisions. You just fix the earliest conflicts first, and quite often that'll ripple down and fix everything up. Or maybe there'll be a second conflict later down the stack of commits and you'll just fix that one the same way.

To fix a conflict you typically create a new revision off the conflicted one (effectively forking the tree at that point) using `jj new c` (let's call the result `cxy`), fix the conflict in that new revision, and then `jj squash` revision `cxy` back into `c`. This, again, gets rippled down, fixing up all of the descendant commits.


Yep they automatically rebase. If that creates conflicts it's marked on the child commit and you can swap over and resolve it any time.


The only thing this leads to is that you'll have hundreds of vendored dependencies, with a combined size impossible to audit yourself.

But if you somehow do manage that, then you'll soon have hundreds of outdated vendored dependencies, full of unpatched security issues.


> full of unpatched security issues

If you host your own internal crates.io mirror, I see two ways to stay on top of security issues that have been fixed upstream. Both involve the use of

  cargo audit
which uses the RustSec advisory DB https://rustsec.org/

Alternative A) would be to redirect the DNS for crates.io in your company internal DNS server to point at your own mirror, and to have your company servers and laptops/workstations all use your company internal DNS server only. The servers and laptops/workstations would also trust a company-controlled CA certificate that issues TLS certificates for “crates.io”. Then cargo and cargo audit would work transparently, assuming they use the host CA trust store when validating TLS certificates on connections to crates.io. You would use the RustSec DB directly from upstream, without mirroring it or hosting an internal copy. The drawback is that if you accidentally leave some servers or laptops/workstations on external DNS, connections go to the real crates.io instead, and developers end up pulling in versions of deps that have not been audited by the company and added to the internal mirror.

Alternative B) that I see is to set up the crates host under a DNS name you control, e.g. crates dot your company internal network DNS name. Then set up cargo audit to use an internally hosted copy of the advisory DB that is automatically kept up to date, but in which references to the cargo registry have been rewritten to point at your own crates mirror registry. I think that should work. It is already very easy to set up your own crates mirror registry; cargo has excellent built-in support for using registries other than, or in addition to, crates.io. You would then have a company policy that crates.io is never to be used, enforced by automatic scanning of all company repos to check that no entries in Cargo.toml and Cargo.lock files use crates.io.

It would probably even be a good idea to have separate internal crate registries for crates that come from crates.io and crates that are internal to the company itself, to avoid any name collisions and the like.

Regardless of whether you go with A) or B), you’d then be able to run cargo audit and see security advisories for all your dependencies, while the dependencies themselves are downloaded from your internal mirror of crates.io crates, where you audit every package's source code before adding it to the mirror registry.


You are getting distracted by domain names: your Cargo.lock files already cryptographically address the source code. Either make sure all your Cargo.lock files contain no known-bad hashes, or make sure all your Cargo.lock files contain only known-good hashes. Maybe also mirror the .crate files for the absolute worst-case scenario of crates.io going offline.
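As a rough sketch of the "only known-good hashes" check: the Rust below scans lockfile text line by line for `checksum = "..."` entries and reports any that are not on an allowlist. The type and function names (`Locked`, `parse_lock`, `unknown_packages`) and the short placeholder checksums are made up for illustration; a real tool would more likely use a proper TOML parser.

```rust
use std::collections::HashSet;

/// One `[[package]]` entry that carries a registry checksum.
#[derive(Debug, PartialEq)]
struct Locked {
    name: String,
    version: String,
    checksum: String,
}

/// Extracts the quoted value from a line such as `name = "foo"`.
fn value_of<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    let rest = line.trim().strip_prefix(key)?.trim_start();
    let rest = rest.strip_prefix('=')?.trim();
    rest.strip_prefix('"')?.strip_suffix('"')
}

/// Naive line-based scan of Cargo.lock text (an assumption of this sketch,
/// not how cargo itself reads the file).
fn parse_lock(lock: &str) -> Vec<Locked> {
    let mut out = Vec::new();
    let mut name: Option<String> = None;
    let mut version: Option<String> = None;
    for line in lock.lines() {
        if line.trim() == "[[package]]" {
            name = None;
            version = None;
        } else if let Some(v) = value_of(line, "name") {
            name = Some(v.to_string());
        } else if let Some(v) = value_of(line, "version") {
            version = Some(v.to_string());
        } else if let Some(v) = value_of(line, "checksum") {
            if let (Some(n), Some(ver)) = (name.clone(), version.clone()) {
                out.push(Locked { name: n, version: ver, checksum: v.to_string() });
            }
        }
    }
    out
}

/// Returns every locked package whose checksum is not on the allowlist.
fn unknown_packages<'a>(locked: &'a [Locked], known_good: &HashSet<&str>) -> Vec<&'a Locked> {
    locked
        .iter()
        .filter(|p| !known_good.contains(p.checksum.as_str()))
        .collect()
}

fn main() {
    // Placeholder lockfile text; the checksums here are shortened stand-ins.
    let lock = r#"
version = 3

[[package]]
name = "alpha"
version = "1.2.3"
checksum = "aaaa"

[[package]]
name = "beta"
version = "0.4.0"
checksum = "bbbb"
"#;
    let known_good: HashSet<&str> = ["aaaa"].iter().copied().collect();
    let locked = parse_lock(lock);
    for p in unknown_packages(&locked, &known_good) {
        println!("not on the allowlist: {} {}", p.name, p.version);
    }
}
```

The known-bad variant is the same check with the filter condition inverted.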


There is sadly a lot that is missed by cargo audit; far from everyone reports their vulnerabilities to RustSec.


A large number of security issues in the supply chain are found in the weeks or months after library version bumps. Simply waiting six months to update dependency versions can skip these. It allows time to pass and for the dependency changes to receive more eyeballs.

Vendoring buys an additional layer of security.

When everyone has Claude Mythos, we can self-audit our supply chain in an automated fashion.


You don't need vendoring for this; Cargo.lock already gives you locked dependencies until you run `cargo update`. There is an ongoing RFC to support having cargo intentionally only use library versions that are at least X days old:

https://github.com/rust-lang/rfcs/pull/3923
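To make the "at least X days old" idea concrete, here is a minimal Rust sketch of the selection rule only. It is not taken from the RFC or from cargo; the `Release` struct, the `newest_aged` function, and the sample data are all hypothetical.

```rust
/// A published release: (major, minor, patch) plus its age in days.
/// Hypothetical type for this sketch, not a cargo data structure.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Release {
    version: (u32, u32, u32),
    age_days: u32,
}

/// Picks the highest version that has been public for at least `min_age_days`.
fn newest_aged(releases: &[Release], min_age_days: u32) -> Option<Release> {
    releases
        .iter()
        .copied()
        .filter(|r| r.age_days >= min_age_days)
        .max_by_key(|r| r.version)
}

fn main() {
    // Made-up release history for one dependency.
    let releases = [
        Release { version: (1, 0, 0), age_days: 400 },
        Release { version: (1, 1, 0), age_days: 30 },
        Release { version: (1, 2, 0), age_days: 2 },
    ];
    // With a 14-day cooldown the two-day-old 1.2.0 is skipped.
    println!("{:?}", newest_aged(&releases, 14));
}
```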


Maybe this was a genius move made precisely to be ambiguous on whether it was April Fools or not... so that the author can later read the room and clarify whether it was or was not April Fools, without much repercussion either way.


Nope:

> Timing just worked out this way. New month, ideal timing for testing a new rule.


Or so one says. (Not necessarily saying that it was a bad decision.)


What evidence of malice supports your claims here?

(Evidence-free conspiracy theories are generally unwelcome at HN.)


That's not much different from other distros. The way auto-update usually works, it can't use root permissions or the system package manager (in any distro), so it has to install the newer version in $HOME. Once the update is installed, the system package becomes a trampoline to that.

I tried Discord, and this one seems to download some updates on first run, but the version sticks to the one from the system (0.0.127, latest is 0.0.129). So I assume it just doesn't update, or it tries to and fails.


Interesting, although I checked and on NixOS the binary is just 29MB. It was statically linked, with just libc left as dynamic.

I think 29MB is still huge for a terminal text editor, but nevertheless not "hundreds".


Language grammars are ~200-250MB though. They are in a separate folder, and often they are all bundled to support all the languages. Some of them are HUGE.

  .rwxr-xr-x  4.6M aa    6 Mar 21:52  ocaml-interface.so
  .rwxr-xr-x  4.6M aa    6 Mar 21:52  rpmspec.so
  .rwxr-xr-x  4.9M aa    6 Mar 21:52  tlaplus.so
  .rwxr-xr-x  5.1M aa    6 Mar 21:52  ocaml.so
  .rwxr-xr-x  5.1M aa    6 Mar 21:52  c-sharp.so
  .rwxr-xr-x  5.3M aa    6 Mar 21:52  kotlin.so
  .rwxr-xr-x  5.4M aa    6 Mar 21:52  ponylang.so
  .rwxr-xr-x  5.5M aa    6 Mar 21:52  slang.so
  .rwxr-xr-x  6.1M aa    6 Mar 21:52  crystal.so
  .rwxr-xr-x  6.8M aa    6 Mar 21:52  fortran.so
  .rwxr-xr-x  9.2M aa    6 Mar 21:52  nim.so
  .rwxr-xr-x  9.5M aa    6 Mar 21:52  julia.so
  .rwxr-xr-x  9.9M aa    6 Mar 21:52  sql.so
  .rwxr-xr-x   16M aa    6 Mar 21:52  lean.so
  .rwxr-xr-x   18M aa    6 Mar 21:52  verilog.so
  .rwxr-xr-x   22M aa    6 Mar 21:52  systemverilog.so


That's exactly what I found. Why should these files exist at all? Some other IDEs just have a bunch of highlighting rules based on regular expressions and have a folder of tiny XML grammar files instead of a folder of bloaty shared libraries.


Because it's far more reliable to use proper parsers instead of a bunch of regular expressions. Most languages cannot be properly parsed with regexes.

Those files are compiled tree-sitter grammars, read up on why it exists and where it is used instead of me poorly regurgitating official documentation:

https://tree-sitter.github.io/tree-sitter
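One small illustration of the regex point, as a sketch: Rust block comments nest, so finding where a comment ends needs a depth counter, which a classic regular expression cannot express. The function name `block_comment_end` is made up for this example and has nothing to do with tree-sitter's API.

```rust
/// Returns the byte index just past the block comment that starts at the
/// beginning of `src`, or None if `src` does not start with `/*` or the
/// comment is never closed. Nested `/* ... */` pairs are tracked by depth.
fn block_comment_end(src: &str) -> Option<usize> {
    let bytes = src.as_bytes();
    if !src.starts_with("/*") {
        return None;
    }
    let mut depth = 0usize;
    let mut i = 0;
    while i + 1 < bytes.len() {
        match (bytes[i], bytes[i + 1]) {
            (b'/', b'*') => {
                depth += 1;
                i += 2;
            }
            (b'*', b'/') => {
                depth -= 1;
                i += 2;
                if depth == 0 {
                    return Some(i);
                }
            }
            _ => i += 1,
        }
    }
    None
}

fn main() {
    let src = "/* outer /* inner */ still comment */ let x = 1;";
    let end = block_comment_end(src).unwrap();
    // Stopping at the first "*/" (what a non-greedy regex would do)
    // lands inside the comment.
    let naive = src.find("*/").unwrap() + 2;
    assert!(naive < end);
    assert_eq!(&src[end..], " let x = 1;");
}
```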


Funny enough, they are less than 10MB when compressed. I guess they could use something like upx to compress these binaries.

The whole Linux release is 15MB, but it uncompresses to a 16MB binary and 200MB of grammars on disk.

Why do we need to have 40MB of Verilog grammars on disk when 99% of people don't use them?


That would waste CPU time and introduce additional delays when opening files.

They could probably lazily install the grammars like neovim does, but as someone who doesn't have much faith in the reliability of internet infrastructure, I'll personally take it...

Just ran `:TSInstall all` in neovim out of curiosity, and the results were predictable:

  ~/.local/share/nvim/lazy/nvim-treesitter/parser
  files 309
  size 232M

  /usr/lib/helix/runtime/grammars
  files 246
  size 185M
If disk space is important for your use case, I guess filesystem compression would save far more than just compressing binaries with upx. btrfs+zstd handle those .so well:

  $ compsize ~/.local/share/nvim/lazy/nvim-treesitter/parser
  Type       Perc     Disk Usage   Uncompressed Referenced
  TOTAL       11%       26M         231M         231M

  $ compsize /usr/lib/helix/runtime/grammars
  Type       Perc     Disk Usage   Uncompressed Referenced
  TOTAL       12%       23M         184M         184M


I mean, they could decompress it once when using a language for the first time. It would still be fully offline, but with a bit of decompressing.


If this is a concern, why not compress at the filesystem level?


For real parsing, a proper compiler codebase (via a language server implementation) should be used. Writing something manually can't work properly, especially for languages like C++ and Rust with complex includes/imports and macros. Newer LSP versions support syntax-based highlighting/colorizing, but if some LSP implementation doesn't support it, a regexp-based fallback is mostly fine.


Will it be okay though? i32 to u64 has two ways to convert it:

    i32 -> u32 -> u64
    i32 -> i64 -> u64
This matters with negative numbers, where the first one pads with 32 bits of 0, the second one pads it with 32 bits of 1. Sometimes (as it once happened to me), you wanted the wrong one.
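A small Rust sketch of the two paths using `as` casts (the function names are just for illustration). Note that a direct `x as u64` on an `i32` sign-extends, i.e. it behaves like the second path.

```rust
/// Reinterpret as unsigned first, then widen: the upper 32 bits are zero.
fn via_u32(x: i32) -> u64 {
    (x as u32) as u64
}

/// Widen as signed first: the sign bit is copied into the upper 32 bits.
fn via_i64(x: i32) -> u64 {
    (x as i64) as u64
}

fn main() {
    let x: i32 = -1;
    println!("{:#018x}", via_u32(x)); // 0x00000000ffffffff
    println!("{:#018x}", via_i64(x)); // 0xffffffffffffffff
    // For non-negative inputs both paths agree.
    assert_eq!(via_u32(7), via_i64(7));
}
```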


Yes, it will be okay because I'm making a pretty picture :) If the default behavior of a conversion surprises me, I'd be able to suss it out and replace it with explicit behavior.

I'm not saying this is worth adding such a thing to Rust just for this use case, but it would be very nice not to write intos for every number


Not quite, because depending on the compiler implementation / memory model, other things can also lead to UB, and thus be unsafe, e.g.:

* data races in a multi-threaded program

* type-casting to the wrong type (e.g. via unions) with compiler optimizations that rely on proper types

* assuming values are absolutely never aliased (shared xor mutable) when there's actually a way to create aliased values (e.g. unsafe {} and raw pointers)

* creating objects out of a sequence of bytes that is not valid for their type

...and likely many others I can't remember.
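To illustrate the last bullet in Rust terms (a sketch, assuming Rust is the language in question): the standard library's checked constructors reject byte patterns that are invalid for the target type, while the `unsafe` unchecked counterparts (`char::from_u32_unchecked`, `std::str::from_utf8_unchecked`) skip the check and leave validity to the caller.

```rust
fn main() {
    // 0xD800 is a surrogate code point, so it is not a valid `char`.
    assert_eq!(char::from_u32(0xD800), None);
    assert_eq!(char::from_u32(0x41), Some('A'));

    // 0xFF can never appear in UTF-8, so these bytes are not a valid `str`.
    assert!(std::str::from_utf8(&[0x68, 0x69, 0xFF]).is_err());
    assert_eq!(std::str::from_utf8(&[0x68, 0x69]), Ok("hi"));
}
```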

