It's pretty surprising that they're willing to charge a flat rate rather than by token, but it's great news for users. It's inevitable that you get annoyed at an AI when it burns tokens and produces a bad answer, or starts reading files that aren't really relevant. The flat rate takes away that bad taste. The business modelling behind it must be quite intense; I hope this doesn't blow up in JetBrains' face if Junie's usage patterns change over time.
JetBrains are in a great position to do this though, perhaps the best position. Whereas a tool like Claude Code or Aider can give the LLM grep and little else [1], Junie can give the LLM a kind of textual API to the IDE's own static analysis database. If Claude/GPT wants to understand what a function does and how it's used, it could issue a tool call that brings up nicely rendered API docs and nothing else, or a tool call to navigate into the function and read just its body, and so on. And Junie can use the IDE to check whether generated code complies with the language's rules more or less instantly, without a full compile, catching hallucinated APIs and syntax errors.
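To make that concrete, here's a rough sketch of what such a tool surface could look like, written as an OpenAI-style function-calling schema. Every name below is hypothetical; none of it comes from Junie's actual implementation:

    # Hypothetical tools an IDE could expose over its static analysis
    # database. Names and shapes are illustrative, not Junie's real API.
    IDE_TOOLS = [
        {
            "name": "get_docs",
            "description": "Rendered API docs for a symbol, nothing else.",
            "parameters": {
                "type": "object",
                "properties": {"symbol": {"type": "string"}},
                "required": ["symbol"],
            },
        },
        {
            "name": "get_body",
            "description": "Navigate to a function and return just its body.",
            "parameters": {
                "type": "object",
                "properties": {"symbol": {"type": "string"}},
                "required": ["symbol"],
            },
        },
        {
            "name": "check_code",
            "description": "Run the IDE's instant analysis on one file and "
                           "return diagnostics (unresolved references, type "
                           "and syntax errors).",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"}},
                "required": ["path"],
            },
        },
    ]

The point is that each call returns a small, precise slice of the analysis database instead of raw file contents, which is exactly what keeps token usage down.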
So much potential with this kind of integration; all of that is barely the start.
[1] Aider attempts to build a "repo map" using a PageRank over symbols extracted by tree-sitter, but it never worked well for me.
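For anyone curious, the core of the repo-map idea fits in a page. A loose sketch, with a crude regex standing in for tree-sitter's symbol extraction and Python files only, so emphatically not Aider's actual implementation:

    # Sketch: rank files by PageRank over a defines/references graph.
    # A regex stands in for tree-sitter here, which makes it very crude.
    import re
    from pathlib import Path
    import networkx as nx

    def build_repo_graph(root: str) -> nx.DiGraph:
        defs = {}   # symbol name -> file that defines it
        refs = []   # (file, symbol name it references)
        for path in Path(root).rglob("*.py"):
            text = path.read_text(errors="ignore")
            for name in re.findall(r"^\s*(?:def|class)\s+(\w+)", text, re.M):
                defs[name] = str(path)
            for name in re.findall(r"\b(\w+)\s*\(", text):
                refs.append((str(path), name))
        g = nx.DiGraph()
        for src, name in refs:
            dst = defs.get(name)
            if dst and dst != src:
                g.add_edge(src, dst)  # src depends on dst
        return g

    def top_files(root: str, k: int = 10) -> list[str]:
        ranks = nx.pagerank(build_repo_graph(root))
        return sorted(ranks, key=ranks.get, reverse=True)[:k]

Files whose symbols are referenced from many places rank highest and go into the map first. How well this works hinges on how reliably symbols resolve, which is presumably where results start to vary between codebases.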
> The business modelling behind it must be quite intense; I hope this doesn't blow up in JetBrains' face
Historically... this tends to work out. Reminds me of Gmail initially offering a massive inbox. YouTube doing free hosting. All the various unmetered LAMP hosting...
If necessary they'll add an anti-abuse policy or the like to rein in the heaviest users.
The sophisticated modelling is basically "just get going" with a guesstimate, then adjust if needed.
I doubt the pricing structure will sink any ships. It's going to come down to utility.
> Historically... this tends to work out. Reminds me of Gmail initially offering a massive inbox. YouTube doing free hosting. All the various unmetered LAMP hosting...
One difference I see: storage capacity and compute performance aren't increasing the way they used to, so companies can't count on those costs dropping dramatically later to offset the cash they bleed early on to gain market share.
The cost of inference[0] at a given quality has been dropping by nearly 10x year over year. I'm not sure when that trend will slow down, but there's still a lot of low-hanging fruit around algorithmic efficiency.
Sure. I agree that usage/demand is likely to outgrow compute performance.
But... a lot of the other dynamics that make this game winnable still stand. Maybe they'll need to move to metering or some other pricing structure eventually... but it will work out.
It's odd that they don't seem to let you pay for overages; it looks like you're just shit out of luck past a certain point, even on the most expensive plan.
Were you able to figure out what constitutes a "credit"? I initially assumed they were following Cursor's (early) model of 1 prompt = 1 credit, with the tokens used to fulfill the prompt not costing anything. If that's how they're doing it, it still leaves a bad taste when you waste a credit on something that doesn't work, but it does remove the need to care about how the tool gets there.
Token-based pricing is a strong enough downside for me that it would push me to other tools like Cursor (even though I love JetBrains IDEs). I get actively stressed watching an automated system burn through my money on its own recognizance. If I'm going to have quotas or usage-based pricing, I want the metered quantities to be things I directly control, not things the service provider controls.
TANSTAAFL - there ain't no such thing as a free lunch. With flat pricing, companies have an incentive to downgrade you to cheaper models - which currently correlates strongly with worse output quality - or, more likely, to trim context significantly and hope you won't notice.
But yes, there should absolutely be ways to track usage, ideally before the prompt is even submitted (say, a warning for queries over N tokens, where N is configurable in settings but has a sensible default).
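That kind of check is cheap to do client-side. A minimal sketch, assuming an OpenAI-family model so tiktoken's tokenizer applies; the default threshold is just a stand-in for the configurable N:

    # Warn before submitting a prompt that exceeds N tokens.
    import tiktoken

    def confirm_before_send(prompt: str, model: str = "gpt-4o",
                            n_warn: int = 4000) -> bool:
        enc = tiktoken.encoding_for_model(model)
        count = len(enc.encode(prompt))
        if count <= n_warn:
            return True
        reply = input(f"Prompt is ~{count} tokens (over {n_warn}). Send anyway? [y/N] ")
        return reply.strip().lower() == "y"

This only counts the prompt text itself; a real implementation would also count attached files and conversation history, since that's where agentic tools quietly spend most of their tokens.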