One major risk is that their customers simply build teams in-house. Microsoft has had a very large team for years (and MSR / Ofer Dekel has actually published a lot of useful research on how to handle “crowdsourced” labels). Companies have been building productive offshore labeling / moderation teams since the early days of CrowdFlower. At some point it’s not just that the cost makes sense; the Product team also wants a reliable workforce that it can control.
Another risk is that the well-funded self-driving customers go belly-up. However, one important facet is that dead players don’t release much data. Mobileye has a vast dataset (including images from not just Tesla but other automakers) but that data isn’t going anywhere. Neither is Nvidia’s 180 PB of HD recordings. (Release or transfer in part requires dealing with PII of the people in the recordings. Now if only the offshore labelers weren’t handed PII for free...).
The valuation is likely a forward-looking bet on AI as a whole rather than on the current suite of contracts. Anybody using an off-the-shelf model will want some labels after their first proof of concept. I wouldn’t argue that the math makes sense, only that demand does look underserved.