The unit economics at this point are about utilization. Their cost is well below what they’re charging, but only when there is enough traffic to keep the GPUs busy. So the game is about increasing demand to level the load.
GPT has negligible moat because they gave up on all their integrations. Claude Code is starting to develop one as people start to build things that require Claude Code specifically, not just any LLM.
> Claude Code is starting to develop one as people start to build things that require Claude Code specifically, not just any LLM.
I hate to be a "source?" guy, but I'm curious if you have any examples of this. Skills and MCP are really the only extensions on CC itself I'm aware of, and these are both supported in Codex.
Things like Dispatch / remote sessions is something CC has that Codex does not, but these features are quite easy to replicate (and I expect Codex to do so in short order).
I agree, that’s a great question which I don’t really know the answer to. These tools have been moving in lockstep for some time now: one will innovate, and within 2-3 weeks the others have that feature. From where I sit, though, the mindshare all seems to be going to Claude. The moat develops when people build something that only works on one tool; even if the others have the same features, it doesn’t matter unless they’re literally binary compatible. Skills are just prompts at the end of the day, with nothing more specialized than a file naming convention.
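For reference, a skill in Claude Code is just a markdown file with a little YAML front matter, living at a conventional path (`.claude/skills/<name>/SKILL.md`). The skill name and body below are made up for illustration:

```markdown
---
name: changelog
description: Draft a changelog entry from recent commits
---

Read the git log since the last tag and summarize user-facing
changes as bullets under Added/Changed/Fixed headings.
```

Nothing binds that format to one vendor except the path convention and which harness reads it.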
Having written several orchestrators, I’ll say that the code to invoke the tool is pretty equivalent, but the details matter: exact CLI flags and JSON fields.
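A minimal sketch of what that looks like in practice. The flag sets below reflect my understanding of the current CLIs (Claude Code’s `-p` print mode, Codex’s `exec` subcommand) and may drift between versions; the point is that the orchestrator logic is identical and only these details differ:

```python
# Orchestrator side: same shape of code for either tool, but the argv
# and the output format you must parse are tool-specific.

def build_command(backend: str, prompt: str) -> list[str]:
    """Return argv for a one-shot, machine-readable invocation.
    Flag names are assumptions based on current CLI versions."""
    if backend == "claude":
        # Claude Code: -p runs non-interactively ("print" mode)
        return ["claude", "-p", prompt, "--output-format", "json"]
    if backend == "codex":
        # Codex CLI: exec subcommand with JSON output
        return ["codex", "exec", "--json", prompt]
    raise ValueError(f"unknown backend: {backend}")
```

Swapping backends is a one-line change here, yet every downstream parser that reads the JSON is coupled to one tool’s field names.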
Also not like it’s a particularly good piece of tech. It was the first to show a new category. But jeebus the design and security are a nightmare. Any of the numerous other claws are better choices for anything serious.
Classic SV hubris. Talk to OpenAI people and they’re so convinced they’re untouchable, they don’t bother worrying about things like revenue, or product strategy. All they cared about was being the first to AGI. Well it looks like that isn’t happening soon enough. And now they have zero moat except brand recognition, which is quickly getting eroded.
The idea that they don’t learn from experience might be true in some limited sense, but ignores the reality of how LLMs are used. If you look at any advanced agentic coding system the instructions say to write down intermediate findings in files and refer to them. The LLM doesn’t have to learn. The harness around it allows it to. It’s like complaining that an internal combustion engine doesn’t have wheels to push it around.
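That harness pattern is simple enough to sketch. Here `call_llm` is a stand-in for any chat-completion call, and the `NEW FINDINGS:` marker is an invented convention; the point is that the model stays stateless while the loop around it persists memory to a file:

```python
from pathlib import Path

NOTES = Path("findings.md")

def run_turn(task: str, call_llm) -> str:
    """One agent turn: feed prior notes in, write new findings out."""
    notes = NOTES.read_text() if NOTES.exists() else ""
    reply = call_llm(
        f"Task: {task}\n"
        f"Notes from earlier turns:\n{notes}\n"
        "Append anything worth remembering after the line NEW FINDINGS:"
    )
    # Persist whatever the model asked to remember for the next turn.
    if "NEW FINDINGS:" in reply:
        new = reply.split("NEW FINDINGS:", 1)[1].strip()
        NOTES.write_text(notes + "\n" + new if notes else new)
    return reply
```

Each invocation starts from a blank context, but the file accumulates across turns, which is exactly the "learning" the engine-without-wheels complaint overlooks.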
But how will I make ad-supported YouTube videos about how I automated my life with OpenClaw, using a $10M boutique AI server to make a few thousand in ad revenue while burning tens of thousands per month on API costs?
These specs look enormously cheaper than doing it with Dell servers. The last quote I had for a bog-standard Dell server was $50k, and only if bought in the next few days. The prices are going up weekly.
These are "unsupported" configurations. Nvidia/AMD discourage running multiple gaming/workstation cards and encourage customers to buy $500K SXM/OAM servers.
The DGX Spark is a fantastic option at this price point. You get 128GB of VRAM, which is extremely difficult to find for this money, and it’s a fairly fast GPU. Plus stupidly fast Mellanox networking: 200Gbps, or 400Gbps if you find coin for another one.
Meh. DGX is Arm and CUDA; Strix is x86 and ROCm. CUDA has better support than ROCm, and x86 has better support than Arm.
Nowadays I find most things work fine on Arm. Sometimes something needs to be built from source which is genuinely annoying. But moving from CUDA to ROCm is often more like a rewrite than a recompile.
> But moving from CUDA to ROCm is often more like a rewrite than a recompile.
Isn't everyone* in this segment just using PyTorch for training, or wrappers like Ollama/vllm/llama.cpp for inference? None have a strict dependency on Cuda. PyTorch's AMD backend is solid (for supported platforms, and Strix Halo is supported).
* enthusiasts whose budget is in the $5k range. If you're vendor-locked to CUDA, Mac Mini and Strix Halo are immediately ruled out.
Most everything starts as PyTorch. (Or maybe Jax.) But the inference engines all use hand tuned CUDA kernels - at least the good ones do. You have to do that to optimize things.
I'm certain inference engines don't use hand-tuned CUDA on Radeon or Mac Mini chips. My statement holds: those engines have no strict dependency on CUDA, or they'd be Nvidia-only.
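This is visible in how a typical PyTorch script selects hardware: ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` namespace, and Apple GPUs appear as the `mps` backend, so user code never names a vendor. A sketch with the selection logic pulled out into a pure function (so it’s checkable without a GPU):

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Map backend availability to a PyTorch device string."""
    if cuda_ok:   # true for both CUDA and ROCm builds of PyTorch
        return "cuda"
    if mps_ok:    # Apple Silicon (Metal Performance Shaders)
        return "mps"
    return "cpu"

# In a real script (assumes torch is installed):
#   import torch
#   device = pick_device(torch.cuda.is_available(),
#                        torch.backends.mps.is_available())
#   model.to(device)
```

The hand-tuned kernels live below this interface, inside PyTorch or the inference engine, which is why the same user code runs on all three vendors.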
I’m not very well versed in this domain, but I think it’s not going to be “VRAM” (GDDR) memory, but rather “unified memory”, which is essentially RAM (some flavour of DDR5, I assume). These two types of memory have vastly different bandwidth.
I’m pretty curious to see any benchmarks on inference on VRAM vs UM.
I’m using VRAM as shorthand for “memory which the AI chip can use”, which I think is fairly common shorthand these days. For the Spark it is unified, and it has lower bandwidth than almost any modern GPU (about 300 GB/s, comparable to an RTX 3060).
So LLM inference is relatively slow because of that bandwidth, but you can load much bigger, smarter models than you could on any consumer GPU.
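The big-but-slow tradeoff follows from back-of-envelope arithmetic: to decode one token, a dense model streams roughly all of its weights through memory once, so bandwidth divided by model size bounds tokens per second. The numbers below are illustrative:

```python
def rough_decode_tps(bandwidth_gb_s: float, model_gb: float) -> float:
    """Rough upper bound on decode tokens/sec for a dense model:
    each generated token reads approximately all weights once."""
    return bandwidth_gb_s / model_gb

# ~273 GB/s of unified memory vs. a 70B model at 8-bit (~70 GB):
#   rough_decode_tps(273, 70)  -> about 3.9 tokens/sec
# A consumer GPU at ~1000 GB/s would run a 20 GB model at ~50 tok/s,
# but could never fit the 70 GB model in its 24 GB of VRAM.
```

Batching, speculative decoding, and MoE sparsity all bend this bound, but for single-stream dense decoding it’s a decent first estimate.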
> Windows touches more people’s lives than almost any technology on Earth.
Thankfully Ballmer failed and this isn’t even close to true. I, like a lot of highly technical professionals, have been Windows sober for many years now.
Not OP, but it is probably either "Average Hold Time" or "Average Handle Time". I suppose the usage here indicates some call-center metric that management expected in a certain range, but the new tool skewed it in a different direction.