Hacker News — rep_lodsb's comments

There's also a problem with unmodified phones containing malware, namely an operating system made by an advertising company, which is designed to collect as much information about you as possible.

And this malware is largely based on open source code (Linux) that was originally developed on open, documented hardware, where the firmware boot loader did nothing more than load the first 512 bytes of your hard disk to address 0x7c00 and transfer complete control to it.

Yes, there were viruses that exploited this openness, but imagine if Linus Torvalds had needed a cryptographic certificate from IBM or Microsoft to be allowed to run his own code! That is basically the situation we have today, and if you don't see how dystopian it is, I don't know what more to say.

I will never understand why such an overwhelming majority of people seem to just accept this. When frigging barcodes were introduced, there were widespread conspiracy theories about them being the Mark of the Beast -- ridiculous, of course, but look where we are now: in some places you literally can't buy or sell without carrying around a device that is hostile to your interests. And soon it will be mandated by the state for everyone.

Google must be destroyed.


Yeah, randomly calling software that you don't like "malware" isn't making as strong a case as you think it does. Nor does it help this discussion.

It's doing things that are against the interest of the user. But obviously, that's no longer an acceptable definition! According to our benevolent overlords, Android is definitely not malware, while yt-dlp is.

For the code generator, it produced this annotated disassembly:

    2100 push ax            ;--- EmitByte: write one byte to code output ---
    2101 mov di, [code_ptr] ;DI → current position in output buffer
    2104 stosb              ;Write AL to output, advance DI
    2105 mov [code_ptr], di ;Update code pointer
    2108 pop ax             ;Restore AX
    2109 ret                ;Every compiled instruction flows through this 6-instruction emitter
    2110 mov al, 0E8h       ;--- EmitCall: generate CALL instruction ---
    2112 call EmitByte      ;Emit opcode byte E8h (near CALL)
    2115 sub bx, [code_ptr] ;Calculate relative offset
    2118 sub bx, 2          ;Adjust for instruction length
    211A xchg ax, bx        ;AX = relative offset
    211B call EmitWord      ;Emit 16-bit relative displacement
    211E ret                ;Generated: E8 lo hi — a complete CALL instruction
Obviously, there has to be a lot more to even a simple-minded x86 code generator than just a generic "emit opcode byte" and "emit call" routine. In general, what A"I" produced here is not a full disassembly but a collection of short snippets, potentially not even including the really interesting ones. But is it even correct?

EmitByte here is unnecessarily pushing/popping AX, which isn't modified by the few instructions in between at all. No competent assembly language programmer would do this. So maybe against all expectations, Turbo Pascal is just really badly coded? No, it's of course a hallucination: those instructions don't appear in the binary at all!

That the hex addresses are wrong can already be seen from the instruction "mov di,[code_ptr]" apparently being only three bytes long (2101 to 2104). In reality it takes four! And it's easy to confirm that this code isn't present at the addresses shown.

So maybe it's somewhere else? x86 disassembly can be complicated because the opcodes are variable length, and particularly in old programs like this the code and data are often not cleanly separated. Claude apparently ran it through NDISASM, which doesn't even attempt to handle that task.

But searching for e.g. the byte sequence B0 E8 ('mov al,0xe8') is enough to confirm that this code snippet isn't to be found anywhere in the file.

There is a lot more suspicious code, including some that couldn't possibly work (like the "ret 1" in the system call dispatcher, which would misalign the stack).

Conclusion: it's slop


Thanks for this, I've added that to my write-up of the project here: https://simonwillison.net/2026/Mar/20/turbo-pascal/#hallucin...

> Because it's amusing to loop this kind of criticism through a model

Maybe it could become a general pattern: have an agent whose only task is to dispute the validity of the output. GANs are a very successful technique; perhaps the same adversarial setup could work for language models too.


>Protip: your functions should be padded with instructions that'll trap if you miss a return.

Galaxy-brained protip: instead of a trap, use return instructions as padding; that way it will just work correctly!

Some compilers insert trap instructions when aligning the start of functions, mainly because the empty space has to be filled with something, and it's better for that filler to trap if this supposedly unreachable code is ever jumped to. But if you have to add the padding manually, it doesn't really help: forgetting the padding is at least as easy as forgetting the return.


That only works for unsigned integers.


Signed 64-bit is the worst case. When I tried to enable overflow checking, the overhead on RISC-V and Arm was comparable: https://news.ycombinator.com/item?id=46588159#46668916


Refer to the spec for the official idioms to handle every case.


Yes, you can detect signed overflow that way, but it's a lot more instructions so it won't be used in practice.

The designers of RISC-V included the bare minimum needed to compile C, everything else was deemed irrelevant.


>but it's a lot more instructions so it won't be used in practice.

It will be used where overflow actually needs to be handled, i.e. in the cases where, on another architecture, an exception handler would do the job. Which is seldom the case.

More instructions doesn't mean slower, either. Superscalar machines have a hard time keeping themselves busy, and this is an easily parallelizable task.

>The designers of RISC-V included the bare minimum needed to compile C, everything else was deemed irrelevant.

Refer to "Computer Architecture: A Quantitative Approach" by John L. Hennessy and David A. Patterson for the actual methodology followed.


Secure boot can be disabled even on modern PCs.


It has nothing to do with being unable to run 16-bit code, that's a myth.

https://man7.org/linux/man-pages/man2/modify_ldt.2.html

Set seg_32bit=0 and you can create 16-bit code and data segments. This still works on 64-bit kernels. What's missing is V86 mode, which emulates the real-mode segmentation model.


That can be trapped for sure.


You're confusing several things here. The only x86 processor that didn't allow returning to real mode was the 16-bit 80286 - on all later ones it's as simple as clearing bit 0 of CR0 (and also disabling paging if that was enabled).

Nothing more privileged than ring 0 is required for that.

"v86" is what allowed real mode to be virtualized under a 32-bit OS. This is no longer available in 64-bit mode, but the CPU still includes it (as well as newer virtualization features which could be used to do the same thing).


You can write to CR0 from a DOS COM program while in V86 mode??? :o Wouldn't that cause a GPF / segfault / EMM386 crash?


The scenario was about the first fusion (hydrogen) bomb test causing a runaway "ignition" of the atmosphere. It was never considered likely, but they still did the math to make certain it couldn't happen.


Why is that surprising? The trap into kernel mode alone would already take more cycles than dedicated hardware needs for the full page table walk.


Since we're talking about defining our own processor, that means we need to define one with cheaper traps.

Expanding on what I wrote above about "bits of hardware acceleration", maybe adding a few primitives to the instruction set that make page table walking easier would help.

And with a trusted compiler architecture you don't need to keep the ISA stable between iterations, since it's assumed that all code gets compiled at the last minute for the current ISA.

Lots of fun things to experiment with.


Taking this to an extreme, the whole idea of a TLB sounds like hardware protection too?

As a thought experiment, imagine an extremely simple ISA and memory interface where you would do address translation or even cache management in software if you needed it... the different cache tiers could just be different NUMA zones that you manage yourself.

You might end up with something that looks more like a GPU or super-ultra-hyper-threading to get throughput masking the latency of software-defined memory addressing and caching?


In TempleOS, everything runs in ring 0, but that's not the same as doing protection in software (which would require disallowing any native code not produced by some trusted translator). It simply means there's no protection at all.


Very fitting if that was intended to be protection by faith.

