Hacker News | rahimiali's comments

Could someone explain to me how this works?

When I run an agent, I don't normally leave it running. I ask Cursor or Claude a question, it runs for a few minutes, and then I move on to the next session. Some of these topics, where agents are talking about what their human had asked them to do, appear to be running continually, and maybe grabbing context from disparate sessions with their users? Or are all these agents just free-running, hallucinating interactions with humans, and interacting only with each other through moltbook?


The agents are running OpenClaw (previously known as Moltbot and Clawdbot before that) which has heartbeat and cron mechanisms: https://docs.openclaw.ai/gateway/heartbeat

The Moltbook skill adds a heartbeat every 4 hours to check in: https://www.moltbook.com/skill.md https://www.moltbook.com/heartbeat.md

Of course, the humans running the agents could be editing these files however they want to change the agent's behavior, so we really don't know exactly why an agent posts something.

OpenClaw has the concept of memory too, so I guess the output to Moltbook could be pulling from that, but my guess is a lot of it is just hallucinated or directly prompted by humans. There have been some people on X saying their agent posted made-up interactions with them on Moltbook.


I did look at the skill file, but I still don't understand how it can possibly pull from my other interactions. Is that skill file loaded for every one of my interactions with Claude, for example? Like, if I load the Claude CLI and ask it to refactor some code, does this skill kick in and save some of the context somewhere for later upload? If so, I couldn't find that functionality in the skill description.


Agreed. But these things have a way of not working out, and in the sadness, one forgets to celebrate the intermediate victories. I wanted to share an intermediate victory before reality crushes the joy.


just to be clear, semiseparate in this context means H = D + CC', where D is block diagonal and C is tall & skinny?

If so, it would be nice if this were the case, because you could then just use the Woodbury formula to invert H. But I don't think such a decomposition exists. I tried to exhaustively search through all the decompositions of H that involved one dummy variable (of which the above is a special case) and I couldn't find one. I ended up having to introduce two dummy variables instead.
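To illustrate why that decomposition would be so convenient: if H = D + CC' did hold, the Woodbury identity reduces inverting the n×n matrix H to inverting D (cheap when block diagonal) plus one small k×k solve. A minimal sketch, assuming a diagonal D for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 2

# D: block diagonal (here simply diagonal for brevity), C: tall & skinny
D = np.diag(rng.uniform(1.0, 2.0, n))
C = rng.standard_normal((n, k))
H = D + C @ C.T

# Woodbury: (D + CC')^{-1} = D^{-1} - D^{-1} C (I + C' D^{-1} C)^{-1} C' D^{-1}
Dinv = np.diag(1.0 / np.diag(D))
small = np.eye(k) + C.T @ Dinv @ C            # only a k x k system to solve
Hinv = Dinv - Dinv @ C @ np.linalg.solve(small, C.T @ Dinv)

print(np.allclose(Hinv, np.linalg.inv(H)))    # True
```

The point is that the expensive n×n inversion never happens: everything is diagonal scaling plus a k×k solve.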


> just to be clear, semiseparate in this context means H = D + CC', where D is block diagonal and C is tall & skinny?

Not quite, it means any submatrix taken from the upper (or lower) triangular part of the matrix has low rank. For example, a matrix is {3,4}-semiseparable if any submatrix taken from the lower triangular part has rank at most 3 and any submatrix taken from the upper triangular part has rank at most 4.

The inverse of an upper bidiagonal matrix is {0,1}-semiseparable.

There are a lot of fast algorithms if you know a matrix is semiseparable.

edit: link https://people.cs.kuleuven.be/~raf.vandebril/homepage/public...
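The bidiagonal claim above is easy to check numerically. A small sketch (using submatrices from the strictly lower/upper triangular parts, which is one common convention for this definition):

```python
import numpy as np

n = 6
# invertible upper bidiagonal matrix: diagonal 1..n, superdiagonal of ones
B = np.diag(np.arange(1.0, n + 1)) + np.diag(np.ones(n - 1), k=1)
T = np.linalg.inv(B)   # upper triangular

def max_rank(A, part):
    # largest rank over the maximal submatrices inside the given
    # strictly triangular part of A
    r = 0
    for i in range(1, n):
        sub = A[i:, :i] if part == "lower" else A[:i, i:]
        r = max(r, np.linalg.matrix_rank(sub))
    return r

print(max_rank(T, "lower"), max_rank(T, "upper"))   # 0 1
```

Every submatrix from the lower part is zero (rank 0) and every submatrix from the upper part has rank at most 1, matching the {0,1}-semiseparable claim.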


thanks for the explanation! sorry, i had misread the AI summary on "semiseparable".

i need to firm up my intuition on this before i can say anything clever, but i agree it's worth thinking about!


Good q. The method computes Hessian-inverse on a batch. When people say "Newton's method" they're often thinking H^{-1} g, where both the Hessian and the gradient g are on the full dataset. I thought saying "preconditioner" instead of "Newton's method" would make it clear this is solving H^{-1} g on a batch, not on the full dataset.


Just a heads up in case you didn't know: taking the Hessian over batches is indeed referred to as Stochastic Newton, and methods of this kind have been studied for quite some time. Inverting the Hessian is often done with CG, which tends to work pretty well. The only problem is that the Hessian is often not invertible, so you need a regularizer (same as here, I believe). Newton methods work at scale, but no one with the resources to try them at scale seems to be aware of them.

It's an interesting trick though, so I'd be curious to see how it compares to CG.

[1] https://arxiv.org/abs/2204.09266 [2] https://arxiv.org/abs/1601.04737 [3] https://pytorch-minimize.readthedocs.io/en/latest/api/minimi...
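To make the CG variant concrete: the step described above solves (H + λI) s = g on a minibatch, and with CG you never need to materialize H, only Hessian-vector products. A minimal sketch on a toy least-squares batch (all names here are illustrative, not from the paper under discussion):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
n_params, batch = 10, 32

# toy minibatch least-squares loss: L(w) = 0.5 ||X w - y||^2
X = rng.standard_normal((batch, n_params))
y = rng.standard_normal(batch)
w = rng.standard_normal(n_params)

grad = X.T @ (X @ w - y)     # batch gradient
lam = 1e-3                   # regularizer: the batch Hessian may be singular

def hvp(v):
    # Hessian-vector product (here X'Xv + lam*v), no explicit Hessian formed
    return X.T @ (X @ v) + lam * v

H_op = LinearOperator((n_params, n_params), matvec=hvp)
step, info = cg(H_op, grad)  # approximately solve (H + lam I) step = grad
w_new = w - step             # stochastic (damped) Newton step
```

In a real network the `hvp` would come from autodiff (e.g. a Hessian-vector product of the batch loss) rather than an explicit X'X.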


For solving physics equations there is also Jacobian-free Newton-Krylov methods.


Yes, the combination of Krylov and quasi-Newton methods is very successful for physics problems (https://en.wikipedia.org/wiki/Quasi-Newton_method).

IIRC, GMRES for example is a popular Krylov subspace method.


I recently used these methods, and BFGS worked better than CG for me.


Absolutely plausible (BFGS is awesome), but this is situation dependent (no free lunch and all that). In the context of training neural networks, it gets even more complicated when one takes implicit regularisation coming from the optimizer into account. It's often worthwhile to try an SGD-type optimizer, BFGS, and a Newton variant to see which works best for a particular problem.
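Running such a head-to-head comparison is cheap with `scipy.optimize.minimize`, which exposes both methods behind one interface. A quick sketch on the classic Rosenbrock benchmark:

```python
import numpy as np
from scipy.optimize import minimize, rosen, rosen_der

# compare BFGS and nonlinear CG on the 10-dimensional Rosenbrock function
x0 = np.zeros(10)
results = {m: minimize(rosen, x0, jac=rosen_der, method=m)
           for m in ("BFGS", "CG")}
for m, res in results.items():
    print(m, "iterations:", res.nit, "final loss:", res.fun)
```

Which method wins in iterations or wall-clock time depends on the problem's conditioning, exactly the "no free lunch" point above.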


I'd call it "Stochastic Newton's Method" then. :-)


fair. thanks. i'll sleep on it and update the paper if it still sounds right tomorrow.

probably my nomenclature bias is that i started this project as a way to find new preconditioners on deep nets.


It's neat to see an attempt at writing a compiler in Python without using a compiler toolkit and without writing it in Haskell. But also, I think you're running past some of the hard problems without solving them.

For example, your while-loops here

https://github.com/AGDNoob/axis-lang/blob/main/code_generato...

look like they might not be able to nest, since they assume the condition is always in eax and the loop doesn't push it down. So you'll need some kind of register allocation, which is a terrible pain in x86.
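One common way out (a hypothetical sketch, not the repo's actual code) is to generate fresh labels per loop and re-evaluate the condition at the top of each iteration, so an inner loop is free to clobber eax:

```python
# hypothetical emitter sketch: fresh labels per loop, condition re-evaluated
# each iteration, so nested loops can freely clobber eax
class Emitter:
    def __init__(self):
        self.lines, self._n = [], 0

    def fresh(self, base):
        self._n += 1
        return f".{base}_{self._n}"

    def emit(self, s):
        self.lines.append(s)

    def while_loop(self, gen_cond, gen_body):
        top, end = self.fresh("while"), self.fresh("endwhile")
        self.emit(f"{top}:")
        gen_cond()                      # leaves the condition result in eax
        self.emit("  test eax, eax")
        self.emit(f"  jz {end}")
        gen_body()                      # may contain nested while_loop calls
        self.emit(f"  jmp {top}")
        self.emit(f"{end}:")

e = Emitter()
e.while_loop(lambda: e.emit("  mov eax, [outer_cond]"),
             lambda: e.while_loop(lambda: e.emit("  mov eax, [inner_cond]"),
                                  lambda: e.emit("  ; body")))
print("\n".join(e.lines))
```

This dodges the problem rather than solving it; keeping values live across a nested loop still needs spilling or real register allocation.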

Also, I think it's worth coming up with an opinion about what other system programming languages are missing. And do the minimum work to provide that as a proof of concept, rather than trying to build a competitor to Zig right out of the gate. For example, maybe you have a perspective on a datastructure that should be a first class citizen, or maybe you've discovered the next best construct since async. Having that kind of vision might help focus the effort.


Thanks, that's fair criticism. You're right about the while-loop thing; that code was very naive and did break with nesting. I actually ran into exactly the pain you described and ended up fixing it the hard way. It was one of those moments where I realized how quickly you start fighting the architecture instead of working on the language itself.

About the bigger point: I agree with you, and that's kind of the direction I'm drifting towards now. I'm not really interested in competing with Zig feature-for-feature. What I'm more interested in is whether there's a different mental model for system programming that feels simpler.

I originally planned to add pointers, but they gave me massive headaches. That was exactly the point where "low-level" and "simple" started to completely collide in my brain. The more I tried to make pointers feel clean, the more complex everything became.

So the current idea I'm exploring is: what if you could write system-level code without having to think in memory addresses at all, but still keep things explicit and predictable? More like thinking in values and state changes, instead of locations in memory. That's still very much an experiment, but that's the "missing opinion" I'm trying to test.


I doubt an LLM would have written this:

       # Parameter in Stack-Slots laden (für MVP: nur Register-Args)
        # Semantic Analyzer markiert Params mit is_param=True
        # Wir müssen jetzt die first 6 Args aus Registern laden
        # TODO: Implementiere Parameter-Handling
        # for now: Params bleiben in Registern (keine lokalen Vars mit gleichem Namen)

Also I love that I can understand all of this comment without actually understanding German.


One of the three rules on the page does not apply to Persian or Urdu: the article "Al" isn't used in those languages, so the third rule doesn't apply.


don't you get the same benefit if you version-controlled the db file with git? with git, each commit saves a diff from the previous one as a blob. the difference is that in git, in addition to the diffs, you also have to keep a working copy of the db, which means you use at least 2x the storage. in your implementation, the diff blobs are the live db, which saves you ~2x storage. is that the main benefit?


In git, each version of the database will be a full copy, and git has to perform a diff, i.e. scan the whole database file. Imagine doing commits and creating snapshots very frequently.

Have a look at https://github.com/sudeep9/mojo/blob/main/design.md#index


sorry, yes, you mention in another comment the use case of multiple readers operating on different versions of the db simultaneously. that'd be difficult to do with git for the reason you mention.


It sounds like you're in an average FAANG team. You could try switching teams.

Here's why I think you might be in an average team:

>"I have tried over the past 2 years to propose different solutions to hard problems and I just get blown off."

A good team has tough problems, and they need clever solutions. Maybe your team's mandate isn't to solve a tough problem.

>"product managers and “leadership” assign to our team with barely any input on the overall project or ability to propose new projects."

This sounds like you might be in a workhorse/executing engineering team.

You say this:

>"I’m scared to move teams ... because I have a good manager ... and my job isn’t that stressful."

A better manager would be trying to increase the team's scope, and yours. If you're not feeling some stress, your manager isn't growing you. A better manager would create a challenging environment for you where you'd feel like your ass is getting kicked.

There are great FAANG teams, and great FAANG managers. Seek them out! (I'm at Amazon, probably the "A" that didn't make it in your acronym. But if you drop me a note I could introduce you to great managers at Amazon)


Can't believe it can be like that at Amazon. I'm only in my 4th month, but I already want to get out. I'm making myself wait at least 24 months. In 10+ years of experience, the last time I wanted to get out this badly was in fintech.


Why waste 20 months of your life and their time?


$$$


Going by his throwaway handle (awsthrow1234), he is already at FAmazonNetflixGoogle. Think he skipped "Apple", maybe?


Just a typo, I ran out of time for a Correction Of Errors :)


could someone explain the benefit of storing energy as natural gas? once you burn it, doesn't it result in co2? doesn't that defeat the effort? also is natural gas really easier to pipe around than electricity?

I know I'm missing the point of the article so looking for helpful guidance.


Do you want to burn new carbon that’s in the ground, or recycle what’s already in the air?

We have tremendous infrastructure already dedicated to using natural gas (cooking, heating, transport, industrial equipment) and it won’t be electrified overnight.

First get to carbon neutral, then worry about carbon negative.


Thanks for making it crisp. This argument was in the article, but somehow it wasn't popping at me.

