I was able to sign up for the Max plan & start using it via opencode. It does a way better job than Qwen3 Coder in my opinion. Still extremely fast, but in less than 1 hour I used 7M input tokens, so with a single agent running I could easily hit that 120M daily token limit (at 7M/hour, that's roughly 17 hours of continuous use). The speed difference compared to Claude Code is significant though - to the point where most of the time I'm not waiting for generation, I'm waiting for my tests to run.
For reference, each new request needs to send all previous messages - tool calls force new requests too. So it's essentially cumulative when you're chatting with an agent - my opencode agent's context window is only 50% used at 72k tokens, but Cerebras's online tracking shows that I've already used 1M input tokens and 10k output tokens.
> For reference, each new request needs to send all previous messages - tool calls force new requests too. So it's essentially cumulative when you're chatting with an agent - my opencode agent's context window is only 50% used at 72k tokens, but Cerebras's online tracking shows that I've already used 1M input tokens and 10k output tokens.
This is how every "chatbot" / "agentic flow" / etc works behind the scenes. That's why I liked that "you should build an agent" post a few days ago. It gets people to really understand what's behind the curtain. It's requests all the way down, sometimes with more context added, sometimes with less (subagents & co).
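In (very rough, hypothetical) Python, the loop behind basically every agent looks something like this - the key detail being that `messages` is re-sent in full on every request:

```python
# Sketch of the loop behind basically every agent (all names hypothetical).
# The key detail: every turn POSTs the *entire* history again, so billed
# input tokens are cumulative even when the context window isn't full.

def call_model(messages: list[dict]) -> dict:
    # Stand-in for a real chat-completions request; a real client sends
    # `messages` wholesale and gets one assistant message back.
    return {"role": "assistant", "content": "done", "tool_calls": []}

def run_tool(call: dict) -> str:
    return "tool output"  # stand-in for executing the tool locally

messages = [{"role": "user", "content": "fix the failing test"}]
billed_input = 0

while True:
    # Each request is billed for the whole history, ~4 chars per token.
    billed_input += sum(len(m["content"]) // 4 for m in messages)
    reply = call_model(messages)
    messages.append(reply)
    if reply["tool_calls"]:
        for call in reply["tool_calls"]:
            messages.append({"role": "tool", "content": run_tool(call)})
        continue  # tool results force yet another full request
    break  # plain text answer: the turn is done
```

That's how 72k tokens of live context turns into 1M billed input tokens: you pay for the history once per request, not once per conversation.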
Many API endpoints (and local services, for that matter) do caching at this point though, with much cheaper prices for input tokens that hit the cache. I know Anthropic does this, and I think DeepSeek does too, at the very least.
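If I remember the Anthropic API correctly, you mark the stable prefix with a `cache_control` block - something like this sketch (model id and prompt are placeholders, check the current docs):

```python
# Sketch of Anthropic-style prompt caching, as I remember the API
# (model id and prompt are placeholders - check the current docs).
import anthropic

LONG_SYSTEM_PROMPT = "...big, stable prefix worth caching..." * 100

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env
resp = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": LONG_SYSTEM_PROMPT,
        "cache_control": {"type": "ephemeral"},  # mark the prefix as cacheable
    }],
    messages=[{"role": "user", "content": "What changed in this diff?"}],
)
# resp.usage reports cache_creation_input_tokens / cache_read_input_tokens;
# cached reads are billed at a fraction of the normal input price.
print(resp.usage)
```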
At those speeds, it's probably impossible. It would require an enormous amount of memory (which the chip simply doesn't have - there's no room for it) or quite a lot of off-chip bandwidth to storage, and again they wouldn't want to waste surface area on the wiring. A bit of a drawback of increasing density.
Is this built with JS / something like Fabric.js? There are some things that feel very similar to a web app I worked on before. Wondering if there are plans for a plugin API at some point, if it is.
One interesting thing here is that the chat side panel is agentic - it can read tab contents, open links in the existing tab or create new tabs, and handle most of the standard things ("summarize", etc.) too.
This might be the first time that I move off of Chrome for an extended period of time.
uBlock Origin Lite kinda sucks compared to the OG uBlock Origin, though. YouTube videos have this awkward buffering at the start now, sometimes YouTube homepage ads still load, sponsored placements on GrubHub/DoorDash show up and can't be removed, etc.
"I pay to remove ads so my experience with a neutered adblocker isn't as bad" is a weird take.
If you think the end game is companies deciding they're comfortable with removing ads in exchange for a subscription, rather than a subscription with a gradually increasing amount of ads, then I have a bridge to sell you.
I support the creators I watch by donating to them directly.
I mentioned multiple domains...? I said it also impacts sponsored listings on food delivery platforms. Those used to be blocked and, more broadly, the ability to manually block specific elements of a webpage was lost with the transition to UB lite.
Specifically, it looks like skills are a different structure than MCP, but overlap in what they provide? Skills seem to be just a markdown file & scripts (instead of the prompts & tool calls defined in MCP?).
The question I have is: why would I use one over the other?
One difference I see is that with tool calls the LLM doesn't see the actual code - it delegates the task to the tool. With scripts in a skill, I think the agent can see the code being run and can decide to run something different. I may be wrong about this. The documentation says that assets aren't read into context; it doesn't say the same about scripts, which is what makes me think the LLM can read them.
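To make the contrast concrete, here's roughly what a tool looks like on the MCP side with the official Python SDK (server name and tool are made up):

```python
# Rough sketch of the MCP side, using the official Python SDK's FastMCP
# helper (server name and tool are made up). The model only ever sees the
# generated name/description/schema - the function body stays server-side.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def word_count(path: str) -> int:
    """Count words in a file."""  # becomes the tool description the LLM sees
    with open(path) as f:
        return len(f.read().split())

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```

A skill, by contrast, is just files on disk (a markdown file plus optional scripts), so an agent with file access can read the script before deciding what to run - which is the asymmetry I was getting at.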
This looks great! At a previous job we had a fork of Jupyter notebooks that was used this way by some teams. I see that remote execution is on the roadmap, but I was also wondering if you'll have some form of parallel remote execution as well (i.e. one runbook run across 10 or 100 VMs, similar to parallel ssh). Definitely more complicated than single execution, but potentially very powerful for debugging fleets where you don't have Chef or Ansible. I guess the alternative is to just run the runbook locally and shell out to pssh in some commands to get a similar result, as sketched below.
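Something like this sketch is what I mean by the pssh-style fallback (hosts and command are placeholders):

```python
# Rough sketch of the pssh-style fallback: one command on many hosts in
# parallel over plain ssh (hosts and command are placeholders).
import subprocess
from concurrent.futures import ThreadPoolExecutor

HOSTS = ["vm-01", "vm-02", "vm-03"]  # hypothetical fleet
CMD = "uptime"

def run(host: str) -> tuple[str, str]:
    out = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, CMD],
        capture_output=True, text=True, timeout=30,
    )
    return host, (out.stdout or out.stderr).strip()

with ThreadPoolExecutor(max_workers=10) as pool:
    for host, result in pool.map(run, HOSTS):
        print(f"{host}: {result}")
```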
We already support execution of script + terminal blocks over SSH, but want much tighter integration. Parallel execution is certainly part of that too. Anything else you'd want to see?
Nothing in particular - when I wrote/used the Jupyter 'runbooks', they were most helpful when a SEV (site event / severe error) was happening, or when a new person on the team was handling oncall for the first time.
Any chance you could add "extract clips" in addition to "extract frames"? Specifically, I recently had to split a video into x-second clips & had to use AI to get the right command for it.
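For reference, what I ended up with was ffmpeg's segment muxer - roughly this, wrapped in Python here (paths and duration are placeholders):

```python
# ffmpeg's segment muxer splits a video into fixed-length clips without
# re-encoding (paths/duration are placeholders; -c copy cuts on keyframes,
# so clip lengths are approximate).
import subprocess

subprocess.run([
    "ffmpeg", "-i", "input.mp4",
    "-c", "copy",              # no re-encode, so it's fast
    "-f", "segment",
    "-segment_time", "10",     # target clip length in seconds
    "-reset_timestamps", "1",  # each clip starts at t=0
    "clip_%03d.mp4",
], check=True)
```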
How does this compare to s6? I recently used it to set up an init system in Docker containers & was wondering if nitro would be a good alternative (there were a lot of files I had to set up via s6-overlay, and it wasn't as intuitive as I would've hoped).
Thanks! Reading some of your other comments, it seems like runit or nitro may not be a good choice for my use case? (I'm using dependencies between services, so a specific order is enforced, and also logging for 3 different services.)
You seem to know quite a bit about init systems - for containers in particular, do you have any heuristics on which init system works best for which use cases?