Not yet, but I have used VLMs to read water meters, with mixed results. It is definitely the easiest approach to prototype. On my balcony, every vision-based approach is limited to viewing angles that don’t capture any neighbors.
But even with an LLM backbone, you’d still need some setup to decide when to query the LLM. At that point you already have a low-false-negative pigeon detector, which may be sufficient for your use case.
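To make the "when to query" part concrete, here is a rough sketch of the kind of gate I mean: plain frame differencing plus a cooldown. Everything in it is an assumption for illustration (the `MotionGate` name, the thresholds, frames as nested lists of grayscale values), not something I actually run. A real setup would probably use a PIR sensor or a proper background subtractor instead.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

Frame = Sequence[Sequence[int]]  # grayscale image, pixel values 0..255


def changed_fraction(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Fraction of pixels whose brightness changed by more than pixel_delta."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


@dataclass
class MotionGate:
    """Hypothetical gate: says when a frame is worth sending to the model."""
    min_changed: float = 0.02   # assumed threshold: 2% of pixels must change
    cooldown_s: float = 10.0    # assumed limit: at most one query per 10 s
    _prev: Optional[Frame] = None
    _last_query: Optional[float] = None

    def should_query(self, frame: Frame, now: float) -> bool:
        prev, self._prev = self._prev, frame
        if prev is None:
            return False  # first frame: nothing to compare against
        if changed_fraction(prev, frame) < self.min_changed:
            return False  # scene looks static
        if self._last_query is not None and now - self._last_query < self.cooldown_s:
            return False  # something moved, but we queried too recently
        self._last_query = now
        return True
```

With a low `min_changed` this fires on almost anything that moves, so it rarely misses a pigeon but produces plenty of false positives. That is exactly the point where you have to decide whether the model call behind it is still worth the effort.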