I know they're supposed to be smarter than a year ago but you could have fooled me
I'm in a loop with Opus 4.5: I tell it "be logically consistent", it says "you're absolutely right", and then it proceeds to be logically inconsistent again, for the 20th time.
Webhooks are absolutely the right tool for async / background jobs, or event-driven domains. I’m not sure payments, and its flip side of entitlements, is best modeled as an event-driven domain.
In a perfect world money would trade hands and you’d just see what that meant for the features or value your customers could access. This is what happens in other domains of online commerce like Shopify. Shopify has webhooks, but they are not the primary load-bearing site of payment integration for most storefronts.
Payments webhooks were, from what I can tell of the history, a hack for sidestepping gnarly problems in domain modeling and read optimization.
> I’m not sure payments [..] is best modeled as an event-driven domain.
But payments are event-driven in reality. Credit card charges can be disputed, some transactions only succeed after a delay, and transactions or refunds you thought were successful can turn out to have actually failed. Some payment methods are more susceptible than others, but even credit cards are affected by some of this.
The point is, stuff will happen to your payments without you initiating the change, and out of your control. Like, you know, events. Now you will need to become aware of them, and either you keep polling updates for a lot of your payments to catch the one that changed, or you have the information pushed to you.
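The push model described above can be sketched concretely. This is a minimal, hypothetical example (all names and states invented for illustration): an idempotent handler that applies webhook events by id, since deliveries can arrive late, duplicated, or out of order.

```python
# Sketch of an idempotent payment-event handler (all names hypothetical).
# Webhook deliveries can be duplicated or arrive out of order, so we key
# on an event id and only ever move a payment's state forward.

# Ordering of states: later events must not regress earlier ones.
STATE_RANK = {"pending": 0, "succeeded": 1, "refunded": 2, "disputed": 3}

payments = {}        # payment_id -> current state
seen_events = set()  # event ids already applied (dedupe)

def handle_event(event):
    """Apply one webhook event; safe to call twice with the same event."""
    if event["id"] in seen_events:
        return  # duplicate delivery, ignore
    seen_events.add(event["id"])
    pid, new_state = event["payment_id"], event["state"]
    current = payments.get(pid, "pending")
    # Only advance: a stale 'succeeded' must not overwrite 'disputed'.
    if STATE_RANK[new_state] > STATE_RANK.get(current, -1):
        payments[pid] = new_state

handle_event({"id": "evt_1", "payment_id": "pay_1", "state": "succeeded"})
handle_event({"id": "evt_1", "payment_id": "pay_1", "state": "succeeded"})  # dup
handle_event({"id": "evt_2", "payment_id": "pay_1", "state": "disputed"})
```

The same reconciliation logic is what a polling loop would need anyway; webhooks just change who initiates the transfer.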
3DS is a big deal in Europe, and card payments can be disputed. So I'd argue that credit cards are async, too, at least on some edges that can be expensive to ignore.
Some initial feedback on the landing page: it looks great, but for me there is too much motion going on on the homepage and the use cases page. May be an unpopular opinion!
Agreed, homepage was confusing for me also. I tried to scroll around and see a demo. For a product like this that is so visual, I expected to be able to find a 30s demo clip somewhere but couldn't see one on the homepage or product page (and the scrolling on the product page was annoying for me).
The limits of LLMs for systematic trading were and are extremely obvious to anybody with a basic understanding of either field. You may as well be flipping a coin.
I agree. Plus it's way too short a timeframe to evaluate any trading activity seriously.
But I still think the experiment is interesting, because it gives us insight into how LLMs approach risk management, and what effect prompting has on that.
In general, I agree - but there is one exception, I think: however you put AI into a stat arb context, I think it may help for trading on a daily basis, like "tell me where I should enter this morning and exit this evening".
(not daytrading throughout the whole day)
But I haven't tested it so far, since I don't believe it either :D
Why would it know anything better than a bunch of 12-year-olds given the same question? LLMs don't know things very well; they don't cross concepts in their mind. I'll give you an example: I made $1500 yesterday trading Nvidia.
I followed the curve for the last month, scalping a few times - I get a feel that the panic point is ~$180 and the hype point ~$195; it's like that most swings. There were earnings yesterday, and people are afraid the company is over its head already and prefer to de-risk, which I do too sometimes on other stuff. It is true that Nvidia is overpriced, of course, but I feel we have maybe a few good runs left, and that's where the risk, and therefore the potential reward, is. I enter around 184, and a bit more around 182. I go to sleep (I'm in China), and when I wake up I sell at 194. I got lucky, and I would not do it again before I understand why Nvidia would be swinging again.
Is an LLM gonna be any better? My brain did a classic Bayes analysis, using the recent past as a strong signal for my prediction of the future (a completely absurd bias, of course, but all traders are absurd humans). I played a company that wasn't gonna burn me too much, since I'm still happy to own shares of Nvidia whatever the price, and the money put there was entirely losable without too much pain.
Do I need AI? Meh. For your next play, do you trust me or ChatGPT more? I can explain my decisions very coherently, with good caveats about my limits and biases, and warnings about what risk to afford when. I have experienced losses and gains, and I know the effects and causes of both, and how to deal with them. I prefer me to it.
But it won't give me anything interesting though! Like, would you trust it on an even higher scale? It has no basis for its investment thesis; it's a word statistician, not a risk-weighted decision maker!
20 years ago NNs were considered toys and it was "extremely obvious" to CS professors that AI can't be made to reliably distinguish between arbitrary photos of cats and dogs. But then in 2007 Microsoft released Asirra as a captcha problem [0], which prompted research, and we had an AI solving it not that long after.
Edit - additional detail: The original Asirra paper from October 2007 claimed "Barring a major advance in machine vision, we expect computers will have no better than a 1/54,000 chance of solving it" [0]. It took Philippe Golle from Palo Alto a bit under a year to get "a classifier which is 82.7% accurate in telling apart the images of cats and dogs used in Asirra" and "solve a 12-image Asirra challenge automatically with probability 10.3%" [1].
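The 10.3% figure follows almost directly from the per-image accuracy: assuming independent images, an 82.7% classifier solves a 12-image challenge with probability 0.827^12.

```python
# Reproduce the Asirra numbers: an 82.7% per-image classifier solving a
# 12-image challenge, under a naive independence assumption.
p_image = 0.827
p_challenge = p_image ** 12
print(round(p_challenge, 3))  # ~0.102, close to the paper's 10.3%
```

The independence approximation gives roughly 10.2%; the paper's 10.3% suggests the attack did slightly better than treating images as independent.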
Edit 2: History is chock-full of examples of human ingenuity solving problems for very little external gain. And here we have a problem where the incentive is almost literally a money printing machine. I expect progress to be very rapid.
The Asirra paper isn't from an ML research group. The statement "Barring a major advance in machine vision, we expect computers will have no better than a 1/54,000 chance of solving it" is just a statement of fact - it wasn't any form of prediction.
If you read the paper you'll note that they surveyed researchers about the current state of the art ("Based on a survey of machine vision literature and vision experts at Microsoft Research, we believe classification accuracy of better than 60% will be difficult without a significant advance in the state of the art.") and noted what had been achieved at PASCAL 2006 ("The 2006 PASCAL Visual Object Classes Challenge [4] included a competition to identify photos as containing several classes of objects, two of which were Cat and Dog. Although cats and dogs were easily distinguishable from other classes (e.g., “bicycle”), they were frequently confused with each other.")
I was working in an adjacent field at the time. I think the general feeling was that advances in image recognition were certainly possible, but no one knew how to get above the 90% accuracy level reliably. This was in the day of hand coded (and patented!) feature extractors.
OTOH, stock market prediction via learning methods has a long history, and plenty of reasons to think that long-term prediction is actually impossible. Unlike vision systems, there isn't something else we can point to and say "it must be possible", and in this case we are literally trying to predict the future.
Short term prediction works well in some cases in a statistical sense, but long term isn't something that new technology seems likely to solve.
Maybe I misunderstand, but it seems that there's nothing in your comment that contradicts any aspect of mine.
Regarding image classification. As I see it, a company like Microsoft surveying researchers about the state of the art and then making a business call to recommend the use of it as a captcha is significantly more meaningful of a prediction than any single paper from an ML research group. My intent was just to demonstrate that it was widely considered to be a significant open problem, which it clearly was. That in turn led to wider interest in solving it, and it was solved soon after - much faster than expected by people I spoke to around that time.
Regarding stock market prediction, of course I'm not claiming that long term prediction is possible. All I'm saying is that I don't see a reason why quant trading could be used as a captcha - it's as pure a pattern matching task as could be, and if AIs can employ all the context and tooling used by humans, I would expect them to be at least as good as humans within a few years. So my prediction is not the end of quant trading, but rather that much of the work of quants would be overtaken by AIs.
Obviously a big part of trading at the moment is already being done by AIs, so I'm not making a particularly bold claim here. What I'm predicting (and I don't believe that anyone in the field would actually disagree) is that as tech advances, AIs will be given control of longer trading time horizons, moving from the current focus on HFT to day trading and then to longer term investment decisions. I believe that there will still be humans in the loop for many many years, but that these humans would gradually turn their focus to high level investment strategy rather than individual trades.
> making a business call to recommend the use of it as a captcha is significantly more meaningful of a prediction than any single paper from an ML research group.
That's not what this is. It's a research paper from 3 researchers at MSR.
Ok, I'll take it. It definitely wasn't a business call at the level of Microsoft saying that everyone should be using it, but it was an actual service offered under the Microsoft umbrella and used by many sites in the wild, e.g. via this MediaWiki extension [0], for 8 years [1].
What makes trading such a special case is that as you use new technology to increase the capability of your trading system, other market participants you are trading against will be doing the same; it's a never-ending arms race.
The only applications of generative AI I can envisage for trading, systematically or otherwise, are the following:
- data extraction: It's possible to get pretty good levels of accuracy on unstructured data, e.g. financial reports, with relatively little effort compared to before decent LLMs
- sentiment analysis: Why bother with complicated sentiment analysis when you can just feed an article into an LLM for scoring?
- reports: You could use it to generate reports on your financial performance, current positions, etc.
- code: It can generate some code that might sometimes be useful in the development of a system
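The sentiment-scoring use case above is easy to sketch. Everything here is hypothetical: `call_llm` is a stub standing in for a real model API, and the -5..5 scale is just one plausible choice.

```python
# Sketch of LLM-based sentiment scoring for articles (hypothetical API;
# call_llm is a stub standing in for a real hosted-model call).

def call_llm(prompt: str) -> str:
    # Stub: a real implementation would call a model endpoint here.
    return "2"

def sentiment_score(article: str) -> int:
    """Score an article from -5 (very bearish) to +5 (very bullish)."""
    prompt = (
        "Rate the sentiment of this financial article toward the company "
        "on an integer scale from -5 (very bearish) to 5 (very bullish). "
        "Reply with the integer only.\n\n" + article
    )
    reply = call_llm(prompt).strip()
    score = int(reply)  # real code would need more robust parsing
    return max(-5, min(5, score))  # clamp out-of-range replies

print(sentiment_score("Acme beat earnings estimates and raised guidance."))
```

The value here is that the LLM replaces a bespoke NLP pipeline, not that it makes the trading decision.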
The issue is that these models don't really reason, and they trade in what might as well be a random way. For example, a stock might have just dropped 5%. One LLM might say that we should buy the stock now and follow a mean reversion strategy. Another may say we should short the stock and follow the trend. The same LLM may give a different output on a different call. A minuscule difference in price, time, or other data will potentially change the output, when really a signal should be relatively robust.
And if you're going to tell the model say, 'we want to look for mean reversion opportunities' - then why bother with an LLM?
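The robustness point can be made concrete: a conventional signal like a z-score threshold is a fixed, continuous function of the data, so a tiny change in the inputs moves the signal only slightly. This is an illustrative sketch, not a real strategy; the thresholds and prices are invented.

```python
# A conventional mean-reversion signal: deterministic and continuous in
# its inputs, unlike a free-form LLM call. Parameters are illustrative.
from statistics import mean, stdev

def zscore_signal(prices, entry_z=2.0):
    """Return 'buy', 'sell', or 'hold' from a z-score over the window."""
    mu, sigma = mean(prices), stdev(prices)
    z = (prices[-1] - mu) / sigma
    if z < -entry_z:
        return "buy"   # price far below its mean: bet on reversion up
    if z > entry_z:
        return "sell"  # price far above its mean: bet on reversion down
    return "hold"

prices = [100, 101, 99, 100, 101, 100, 99, 100, 101, 93]  # big drop at end
print(zscore_signal(prices))  # "buy": z is about -2.7
```

Calling it twice with the same prices gives the same answer, and perturbing the last price by a cent doesn't flip it - which is exactly the property an LLM call lacks.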
Another angle:
LLMs are trained on the vast swathe of scammy internet content and rubbish in relation to the stock market. 90%+ of active retail traders lose money. If an LLM is fed on losing / scammy rubbish, how could it possibly produce a return?
RL would reasonably be expected to work if the market had some sort of discoverable static behavior.
The reason why RL by backtesting cannot work is that the real market is continuously changing, as all the agents within it, both human and automated, are constantly updating their opinions and strategies.
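A toy illustration of that decay, on synthetic data: a rule whose backtest looks great in one regime loses money once the regime flips. The "regimes" and returns below are fabricated for the example.

```python
# Toy illustration of why backtest-fit parameters decay: a momentum rule
# fit on one regime loses when the regime flips. Data is synthetic.

def pnl(returns, follow_trend):
    """P&L of 'hold next period in the direction of the last return'."""
    total = 0.0
    for prev, nxt in zip(returns, returns[1:]):
        direction = 1 if prev > 0 else -1
        if not follow_trend:
            direction = -direction  # contrarian variant
        total += direction * nxt
    return total

trending = [0.01, 0.02, 0.01, 0.02, 0.01, 0.02]     # regime A: momentum
choppy   = [0.01, -0.01, 0.02, -0.02, 0.01, -0.01]  # regime B: reversal

# "Backtest": trend-following wins in regime A...
print(pnl(trending, follow_trend=True) > pnl(trending, follow_trend=False))
# ...but the same rule loses in regime B, where contrarian wins.
print(pnl(choppy, follow_trend=True) < pnl(choppy, follow_trend=False))
```

In a real market the regime shift happens partly *because* other participants adapt, so the training distribution is never the deployment distribution.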
Good one!
The thing is, you are assuming a "perfect/symmetric distribution" of all known/available technologies across all market participants - this is far off from reality.
Sure: Jane Street et al. are on the same level, but the next big buckets are a huge variety of trading shops doing whatever proprietary stuff to get their cut; most of them may be aware of the latest buzz, but just don't deploy it yet.
> 'allows' being the operative term here which means that the statement you attempt to contradict has not been contradicted.
That's a totally absurd statement in response to "the current government put a new law on the books to give themselves the power to prosecute."
> How many executives has the UK government sought to prosecute under this law?
As stated in my original reply: it came into force in February 2025. It's currently November 2025. It takes time to commit "multiple" offences, particularly given they then need to be investigated and prosecuted. And no, dumping sewage twice doesn't automatically count under any legal regime.
In addition, DEFRA identified that Ofwat (the water regulator) is not fit for purpose in July 2025 and set out a proposal to abolish and replace it.
So yes, it does contradict your statement because the government is literally acting. What would you propose happen instead? That the government hold show trials and just start locking up water company staff and execs?