Impressive stuff! Has anyone tried it for computer/browser control? How does it ...

radq · 2025-09-27T01:49:46 1758937786

The 'point' skill is trained on a ton of UI data; we've heard of a lot of people using it in combination with a bigger driver model for UI automation. We are also planning on post-training it to work end-to-end for this in an agentic setting before the final release -- this was one of the main reasons we increased the model's context length.

Re: chart understanding, there are a lot of different types of charts out there, but it does fairly well! We posted ChartQA benchmarks in the blog: it's on par with GPT5* and slightly better than Gemini 2.5 Flash.

* To be fair to GPT5, it's going to work well on many more types of charts/graphs than Moondream. To be fair to Moondream, GPT5 isn't really well suited to deploy in a lot of vision AI applications due to cost/latency.
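
For anyone wondering what the "pointing model plus bigger driver model" setup mentioned above might look like in practice, here is a rough Python sketch. Everything in it is an assumption for illustration: the `Point` type, the idea that the pointing model returns coordinates normalized to [0, 1], and the `locate` and `click` callables are hypothetical stand-ins, not Moondream's actual API or any particular automation library.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Point:
    # Assumed: coordinates normalized to [0, 1] relative to the screenshot.
    x: float
    y: float


def to_pixels(point: Point, width: int, height: int) -> Tuple[int, int]:
    """Map a normalized point to a pixel inside a width x height screenshot."""
    # Clamp so that x == 1.0 or slightly out-of-range values stay on screen.
    px = min(max(int(point.x * width), 0), width - 1)
    py = min(max(int(point.y * height), 0), height - 1)
    return px, py


def click_target(
    target: str,
    locate: Callable[[str], Optional[Point]],
    click: Callable[[int, int], None],
    screen_size: Tuple[int, int],
) -> bool:
    """Ask the pointing model where `target` is and click it if found.

    `locate` is a hypothetical wrapper around the pointing model;
    `click` is a hypothetical wrapper around a mouse/automation backend.
    Returns False when the pointing model finds nothing, so the driver
    model can decide what to try next.
    """
    point = locate(target)
    if point is None:
        return False
    width, height = screen_size
    click(*to_pixels(point, width, height))
    return True


if __name__ == "__main__":
    # Stubs standing in for the real pointing model and mouse backend.
    def fake_locate(target: str) -> Optional[Point]:
        return Point(0.5, 0.25) if target == "Submit button" else None

    def fake_click(x: int, y: int) -> None:
        print(f"click at ({x}, {y})")

    click_target("Submit button", fake_locate, fake_click, (1920, 1080))
```

The driver model would pick the `target` description at each step and call something like `click_target`; keeping the locate and click steps as plain callables makes it easy to swap in whichever model client and automation backend you actually use.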

bobdyl87 · 2025-09-27T01:51:17 1758937877

I’m labeling a dataset with it. We’ll see how it turns out.

bobdyl87 · 2025-09-27T06:36:03 1758954963

Pretty good so far. Have 100,000 detections