Hacker News | reilly3000's comments

This no doubt takes some inspiration from mcp_agent_mail https://github.com/Dicklesworthstone/mcp_agent_mail

I get it. A stunning indictment of our times… but there is something useful AI could be doing that MS has dropped the ball on: personal finance management. I should be able to have Copilot grab all my transactions, build me budgets, show me what-if scenarios, raise concerns, and help me meet my goals. It should be able to work in Excel where I can see and steer it. The math should be validated with several checks and the output needs to be trustworthy. Ship a free personal finance agent harness and you have your killer app.

I think there are business reasons why they wouldn’t do that, and that makes me sad.


I have a personal budget app and every so often I try and get the latest model to compare my data against the statements and find any discrepancies.

Every time it hallucinates visits to Starbucks.

I never go to Starbucks; it’s just a probable finding given the words in the question.

This should work. I want it to work. But until it can do this correctly all analysis capabilities should be suspect.


Maybe it's the model you are using.

Even a year ago I had success with Claude giving it a photo of my credit card bill and asking it to give me repeating category subtotals, and it flawlessly OCR'd it and wrote a Python program to do as asked, giving me the output.

I'd imagine if you asked it to do a comparison to something else it'd also write code to do it, and so get it right (and certainly would if you explicitly asked).
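For what it's worth, the comparison itself is small enough to do deterministically once the rows are extracted. A minimal sketch, assuming both the budget app and the statement can be exported as (date, merchant, amount) rows; the function name and row layout are made up for illustration, not anything a particular model or app produces:

```python
from collections import Counter
from decimal import Decimal


def find_discrepancies(budget_rows, statement_rows):
    """Compare two lists of (date, merchant, amount) rows.

    Returns (missing_from_budget, missing_from_statement), each a sorted list.
    The row layout is an assumption made for this sketch.
    """
    def normalize(row):
        date, merchant, amount = row
        # Decimal avoids float rounding; merchant names are case-folded and trimmed.
        return (date, merchant.strip().lower(), Decimal(str(amount)))

    budget = Counter(normalize(r) for r in budget_rows)
    statement = Counter(normalize(r) for r in statement_rows)

    # Counter subtraction keeps only positive counts, so duplicate
    # transactions on the same day are handled correctly.
    missing_from_budget = sorted((statement - budget).elements())
    missing_from_statement = sorted((budget - statement).elements())
    return missing_from_budget, missing_from_statement
```

Nothing here can invent a row: every reported discrepancy is a row that exists in one of the two inputs.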


Maybe. But it’s always Claude. I even tried copying the text in directly to take OCR out of consideration. It still didn’t work very well.

Have you tried to get LLMs to do math or quantitative analysis? They're remarkably poor at it.

Typing in URLs by hand is a choice you can make. Scrolling down to organic results (for brands you like) is another choice you can make. Paying for a search engine service is a great choice.

Brands can ask you to add them to your contacts with their website in their vcard. They can prompt you to bookmark them. They could publish a feed for you.

Sure, Google can get us routed in a way we’re all conditioned to depend on, but there are plenty of other ways to get to your destination. There must be 50 ways to leave…


Phone a friend. A buddy of mine texted me yesterday and we went to lunch together today. We’re talking about starting the old hackathon up again, this time with agent armies. It was just fun and easy and long overdue. Be the one to break the ice if you can.

I just cancelled, citing this as the reason. I’m actually not all that torn up about it. I mostly want to see how Anthropic responds to the community about this issue.

I love it! I used it last month to find a track that recently got noisy in our neighborhood and learned a lot about the regional rail network.


How I wish we could just see and patch up the raw context before it goes out. If I could hand edit a compaction it would result in better execution going forward and better for my own mental model. It’s such a small feature, but Anthropic would never give it to us.


Undervalued, not underperforming. Companies that others overlook but have solid fundamentals and great strategy.


Codex CLI with gpt-5.2-codex is so reliably good, it earns the default position in my book. I had cancelled my subscription in early 2024 but started back up recently and have been blown away by how terse, smart, and effective it is. Their CLI harness is top-notch and it manages to be extremely efficient with token usage, so the little plan can go for much of the day. I don’t miss Claude’s rambling or Gemini’s random refactorings.


Records and documents are usually private and owned for various good reasons. I don’t understand the core concept of decoupling them. What is the benefit? How does one make associations or use tools? Is everything public? How can you prevent spam?


Good questions. Records aren’t public by default — they’re decoupled from accounts, not from access control.

The benefit is durability and reuse: records can persist, move, or be re-associated without being owned by a single app or login. Identity is layered on top rather than baked in.

Tools operate on records they have explicit access to (by Star / capability), not global visibility. Spam is constrained by identity cost and rate limits, same as any system — decoupling doesn’t imply openness.

If this ends up being a bad abstraction, I want that to fail visibly rather than hide behind a conventional model.
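To make the "decoupled from accounts, not from access control" idea concrete, here is a rough sketch of capability-scoped access. Every name in it (`Capability`, `RecordStore`, the `"read"` right) is hypothetical and chosen only for illustration; it is not the actual design being discussed, just one way the shape could look:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Capability:
    """A grant saying `holder` may exercise `right` on `record_id` (hypothetical)."""
    holder: str
    record_id: str
    right: str  # e.g. "read"


@dataclass
class RecordStore:
    # Records carry no owner field: identity is layered on through grants.
    records: dict = field(default_factory=dict)
    grants: set = field(default_factory=set)

    def put(self, record_id, payload):
        self.records[record_id] = payload

    def grant(self, holder, record_id, right):
        self.grants.add(Capability(holder, record_id, right))

    def read(self, holder, record_id):
        # No grant means no visibility; nothing is readable by default.
        if Capability(holder, record_id, "read") not in self.grants:
            raise PermissionError(f"{holder} holds no read capability for {record_id}")
        return self.records[record_id]
```

In this sketch a record can be re-associated with a different identity by issuing a new grant, without touching the record itself.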

