I know they're supposed to be smarter than a year ago but you could have fooled me
I'm in a loop with Opus 4.5: I tell it "be logically consistent", it says "you're absolutely right", and then it proceeds to be logically inconsistent again, for the 20th time.
Webhooks are absolutely the right tool for async / background jobs, or event-driven domains. I’m not sure payments, and its flip side of entitlements, is best modeled as an event-driven domain.
In a perfect world money would trade hands and you’d just see what that meant for the features or value your customers could access. This is what happens in other domains of online commerce like Shopify. Shopify has webhooks, but they are not the primary load-bearing site of payment integration for most storefronts.
Payments webhooks were, from what I can tell of the history, a hack for sidestepping gnarly problems in domain modeling and read optimization.
> I’m not sure payments [..] is best modeled as an event-driven domain.
But payments are event-driven in reality. Credit card charges can be disputed, some transactions only succeed after a delay, and transactions or refunds you thought were successful can turn out to have actually failed. Some payment methods are more susceptible than others, but even credit cards are affected by some of this.
The point is, stuff will happen to your payments without you initiating the change, and out of your control. Like, you know, events. Now you will need to become aware of them, and either you keep polling updates for a lot of your payments to catch the one that changed, or you have the information pushed to you.
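The push model described above can be sketched concretely. This is a minimal, hypothetical example (all names and states invented for illustration): an idempotent handler that applies webhook events by id, since deliveries can arrive late, duplicated, or out of order.

```python
# Sketch of an idempotent payment-event handler (all names hypothetical).
# Webhook deliveries can be duplicated or arrive out of order, so we key
# on an event id and only ever move a payment's state forward.

# Ordering of states: later events must not regress earlier ones.
STATE_RANK = {"pending": 0, "succeeded": 1, "refunded": 2, "disputed": 3}

payments = {}        # payment_id -> current state
seen_events = set()  # event ids already applied (dedupe)

def handle_event(event):
    """Apply one webhook event; safe to call twice with the same event."""
    if event["id"] in seen_events:
        return  # duplicate delivery, ignore
    seen_events.add(event["id"])
    pid, new_state = event["payment_id"], event["state"]
    current = payments.get(pid, "pending")
    # Only advance: a stale 'succeeded' must not overwrite 'disputed'.
    if STATE_RANK[new_state] > STATE_RANK.get(current, -1):
        payments[pid] = new_state

handle_event({"id": "evt_1", "payment_id": "pay_1", "state": "succeeded"})
handle_event({"id": "evt_1", "payment_id": "pay_1", "state": "succeeded"})  # dup
handle_event({"id": "evt_2", "payment_id": "pay_1", "state": "disputed"})
```

The same reconciliation logic is what a polling loop would need anyway; webhooks just change who initiates the transfer.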
3DS is a big deal in Europe, and card payments can be disputed. So I'd argue that credit cards are async, too, at least on some edges that can be expensive to ignore.
Some initial feedback on the landing page: it looks great, but for me there is too much motion going on on the homepage and the use cases page. May be an unpopular opinion!
Agreed, homepage was confusing for me also. I tried to scroll around and see a demo. For a product like this that is so visual, I expected to be able to find a 30s demo clip somewhere but couldn't see one on the homepage or product page (and the scrolling on the product page was annoying for me).
The limits of LLMs for systematic trading were and are extremely obvious to anybody with a basic understanding of either field. You may as well be flipping a coin.
I agree. Plus it's way too short a timeframe to evaluate any trading activity seriously.
But I still think the experiment is interesting, because it gives us insight into how LLMs approach risk management, and what effect prompting has on that.
In general, I agree - but there is one exception, I think: however you put AI into a stat arb context, I think it may help for trading on a daily basis, like "tell me where I should enter this morning and exit this evening".
(not daytrading throughout the whole day)
But I haven't tested it so far, since I don't believe it either :D
Why would it know anything better than a bunch of 12-year-olds given the same question? LLMs don't know things very well; they don't cross concepts in their mind. I'll give you an example: I made $1500 yesterday trading Nvidia.
I followed the curve for the last month, scalping a few times - I get a feel that the panic point is ~$180 and the hype point ~$195; it's like that most swings. There were earnings yesterday, and people are afraid the company is over its head already and prefer to de-risk, which I do too sometimes on other stuff. It is true that Nvidia is overpriced, of course, but I feel we have maybe a few good runs left, and that's where the risk, and therefore the potential reward, is. I enter around 184, and a bit more around 182. I go to sleep (I'm in China), and when I wake up I sell at 194. I got lucky, and I would not do it again before I understand why Nvidia would be swinging again.
Is an LLM gonna be any better? My brain did a classic Bayes analysis, using the recent past as a strong signal for my prediction of the future (a completely absurd bias, of course, but all traders are absurd humans). I played a company that wasn't gonna burn me too much, since I'm still happy to own shares of Nvidia whatever the price, and the money put there was entirely losable without too much pain.
Do I need AI? Meh. For your next play, do you trust me or ChatGPT more? I can explain my decisions very coherently, with good caveats about my limits and biases, and warnings about what risk to afford when. I have experienced losses and gains, and I know the effects and causes of both, and how to deal with them. I prefer me to it.
But it won't give me anything interesting though! Like, would you trust it on an even higher scale? It has no basis for its investment thesis; it's a word statistician, not a risk-weighted decision maker!
20 years ago NNs were considered toys and it was "extremely obvious" to CS professors that AI can't be made to reliably distinguish between arbitrary photos of cats and dogs. But then in 2007 Microsoft released Asirra as a captcha problem [0], which prompted research, and we had an AI solving it not that long after.
Edit - additional detail: The original Asirra paper from October 2007 claimed "Barring a major advance in machine vision, we expect computers will have no better than a 1/54,000 chance of solving it" [0]. It took Philippe Golle from Palo Alto a bit under a year to get "a classifier which is 82.7% accurate in telling apart the images of cats and dogs used in Asirra" and "solve a 12-image Asirra challenge automatically with probability 10.3%" [1].
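The 10.3% figure follows almost directly from the per-image accuracy: assuming independent images, an 82.7% classifier solves a 12-image challenge with probability 0.827^12.

```python
# Reproduce the Asirra numbers: an 82.7% per-image classifier solving a
# 12-image challenge, under a naive independence assumption.
p_image = 0.827
p_challenge = p_image ** 12
print(round(p_challenge, 3))  # ~0.102, close to the paper's 10.3%
```

The independence approximation gives roughly 10.2%; the paper's 10.3% suggests the attack did slightly better than treating images as independent.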
Edit 2: History is chock-full of examples of human ingenuity solving problems for very little external gain. And here we have a problem where the incentive is almost literally a money printing machine. I expect progress to be very rapid.
The Asirra paper isn't from an ML research group. The statement "Barring a major advance in machine vision, we expect computers will have no better than a 1/54,000 chance of solving it" is just a statement of fact - it wasn't any form of prediction.
If you read the paper you'll note that they surveyed researchers about the current state of the art ("Based on a survey of machine vision literature and vision experts at Microsoft Research, we believe classification accuracy of better than 60% will be difficult without a significant advance in the state of the art.") and noted what had been achieved at PASCAL 2006 ("The 2006 PASCAL Visual Object Classes Challenge [4] included a competition to identify photos as containing several classes of objects, two of which were Cat and Dog. Although cats and dogs were easily distinguishable from other classes (e.g., “bicycle”), they were frequently confused with each other.")
I was working in an adjacent field at the time. I think the general feeling was that advances in image recognition were certainly possible, but no one knew how to get above the 90% accuracy level reliably. This was in the day of hand coded (and patented!) feature extractors.
OTOH, stock market prediction via learning methods has a long history, and plenty of reasons to think that long-term prediction is actually impossible. Unlike vision systems, there isn't something else we can point to and say "it must be possible", and in this case we are literally trying to predict the future.
Short term prediction works well in some cases in a statistical sense, but long term isn't something that new technology seems likely to solve.
Maybe I misunderstand, but it seems that there's nothing in your comment that contradicts any aspect of mine.
Regarding image classification. As I see it, a company like Microsoft surveying researchers about the state of the art and then making a business call to recommend the use of it as a captcha is significantly more meaningful of a prediction than any single paper from an ML research group. My intent was just to demonstrate that it was widely considered to be a significant open problem, which it clearly was. That in turn led to wider interest in solving it, and it was solved soon after - much faster than expected by people I spoke to around that time.
Regarding stock market prediction, of course I'm not claiming that long term prediction is possible. All I'm saying is that I don't see a reason why quant trading could be used as a captcha - it's as pure a pattern matching task as could be, and if AIs can employ all the context and tooling used by humans, I would expect them to be at least as good as humans within a few years. So my prediction is not the end of quant trading, but rather that much of the work of quants would be overtaken by AIs.
Obviously a big part of trading at the moment is already being done by AIs, so I'm not making a particularly bold claim here. What I'm predicting (and I don't believe that anyone in the field would actually disagree) is that as tech advances, AIs will be given control of longer trading time horizons, moving from the current focus on HFT to day trading and then to longer term investment decisions. I believe that there will still be humans in the loop for many many years, but that these humans would gradually turn their focus to high level investment strategy rather than individual trades.
> making a business call to recommend the use of it as a captcha is significantly more meaningful of a prediction than any single paper from an ML research group.
That's not what this is. It's a research paper from 3 researchers at MSR.
Ok, I'll take it. It definitely wasn't a business call at the level of Microsoft saying that everyone should be using it, but it was an actual service offered under the Microsoft umbrella and used by many sites in the wild, e.g. via this MediaWiki extension [0], for 8 years [1].
What makes trading such a special case is that as you use new technology to increase the capability of your trading system, other market participants you are trading against will be doing the same; it's a never-ending arms race.
The only applications of generative AI I can envisage for trading, systematically or otherwise, are the following:
- data extraction: It's possible to get pretty good levels of accuracy on unstructured data, e.g. financial reports, with relatively little effort compared to before decent LLMs
- sentiment analysis: Why bother with complicated sentiment analysis when you can just feed an article into an LLM for scoring?
- reports: You could use it to generate reports on your financial performance, current positions, etc.
- code: It can generate some code that might sometimes be useful in the development of a system
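The sentiment-scoring use case above is easy to sketch. Everything here is hypothetical: `call_llm` is a stub standing in for a real model API, and the -5..5 scale is just one plausible choice.

```python
# Sketch of LLM-based sentiment scoring for articles (hypothetical API;
# call_llm is a stub standing in for a real hosted-model call).

def call_llm(prompt: str) -> str:
    # Stub: a real implementation would call a model endpoint here.
    return "2"

def sentiment_score(article: str) -> int:
    """Score an article from -5 (very bearish) to +5 (very bullish)."""
    prompt = (
        "Rate the sentiment of this financial article toward the company "
        "on an integer scale from -5 (very bearish) to 5 (very bullish). "
        "Reply with the integer only.\n\n" + article
    )
    reply = call_llm(prompt).strip()
    score = int(reply)  # real code would need more robust parsing
    return max(-5, min(5, score))  # clamp out-of-range replies

print(sentiment_score("Acme beat earnings estimates and raised guidance."))
```

The value here is that the LLM replaces a bespoke NLP pipeline, not that it makes the trading decision.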
The issue is that these models don't really reason, and they trade in what might as well be a random way. For example, a stock might have just dropped 5%. One LLM might say that we should buy the stock now and follow a mean reversion strategy. Another may say we should short the stock and follow the trend. The same LLM may give a different output on a different call. A minuscule difference in price, time, or other data will potentially change the output, when really a signal should be relatively robust.
And if you're going to tell the model say, 'we want to look for mean reversion opportunities' - then why bother with an LLM?
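The robustness point can be made concrete: a conventional signal like a z-score threshold is a fixed, continuous function of the data, so a tiny change in the inputs moves the signal only slightly. This is an illustrative sketch, not a real strategy; the thresholds and prices are invented.

```python
# A conventional mean-reversion signal: deterministic and continuous in
# its inputs, unlike a free-form LLM call. Parameters are illustrative.
from statistics import mean, stdev

def zscore_signal(prices, entry_z=2.0):
    """Return 'buy', 'sell', or 'hold' from a z-score over the window."""
    mu, sigma = mean(prices), stdev(prices)
    z = (prices[-1] - mu) / sigma
    if z < -entry_z:
        return "buy"   # price far below its mean: bet on reversion up
    if z > entry_z:
        return "sell"  # price far above its mean: bet on reversion down
    return "hold"

prices = [100, 101, 99, 100, 101, 100, 99, 100, 101, 93]  # big drop at end
print(zscore_signal(prices))  # "buy": z is about -2.7
```

Calling it twice with the same prices gives the same answer, and perturbing the last price by a cent doesn't flip it - which is exactly the property an LLM call lacks.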
Another angle:
LLMs are trained on the vast swathe of scammy internet content and rubbish in relation to the stock market. 90%+ of active retail traders lose money. If an LLM is fed on losing / scammy rubbish, how could it possibly produce a return?
RL would reasonably be expected to work if the market had some sort of discoverable static behavior.
The reason why RL by backtesting cannot work is that the real market is continuously changing, as all the agents within it, both human and automated, are constantly updating their opinions and strategies.
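A toy illustration of that decay, on synthetic data: a rule whose backtest looks great in one regime loses money once the regime flips. The "regimes" and returns below are fabricated for the example.

```python
# Toy illustration of why backtest-fit parameters decay: a momentum rule
# fit on one regime loses when the regime flips. Data is synthetic.

def pnl(returns, follow_trend):
    """P&L of 'hold next period in the direction of the last return'."""
    total = 0.0
    for prev, nxt in zip(returns, returns[1:]):
        direction = 1 if prev > 0 else -1
        if not follow_trend:
            direction = -direction  # contrarian variant
        total += direction * nxt
    return total

trending = [0.01, 0.02, 0.01, 0.02, 0.01, 0.02]     # regime A: momentum
choppy   = [0.01, -0.01, 0.02, -0.02, 0.01, -0.01]  # regime B: reversal

# "Backtest": trend-following wins in regime A...
print(pnl(trending, follow_trend=True) > pnl(trending, follow_trend=False))
# ...but the same rule loses in regime B, where contrarian wins.
print(pnl(choppy, follow_trend=True) < pnl(choppy, follow_trend=False))
```

In a real market the regime shift happens partly *because* other participants adapt, so the training distribution is never the deployment distribution.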
Good one!
The thing is, you are assuming a "perfect/symmetric distribution" of all known/available technologies across all market participants - this is far off from reality.
Sure: Jane Street et al. are on the same level, but the next big buckets are a huge variety of trading shops doing whatever proprietary stuff to get their cut; most of them may be aware of the latest buzz, but just don't deploy it yet.
> 'allows' being the operative term here which means that the statement you attempt to contradict has not been contradicted.
That's a totally absurd statement in response to "the current government put a new law on the books to give themselves the power to prosecute."
> How many executives has the UK government sought to prosecute under this law?
As stated in my original reply: it came into force in February 2025. It's currently November 2025. It takes time to commit "multiple" offences, particularly given they then need to be investigated and prosecuted. And no, dumping sewage twice doesn't automatically count under any legal regime.
In addition, DEFRA identified that Ofwat (the water regulator) is not fit for purpose in July 2025 and set out a proposal to abolish and replace it.
So yes, it does contradict your statement because the government is literally acting. What would you propose happen instead? That the government hold show trials and just start locking up water company staff and execs?