We appear to be working on solving the same problem! :)
I had a chance to use delomore and you're off to a great start... Just finding the shops to index is certainly a challenge. When we search for an item like "Oxford Shoes", the results are all the shops that may sell shoes, presumably oxfords.
Are you also building out the ability to return individual items that match the search query? For this to be a functional shopping search engine, users will want to view and click on relevant items, not just the shops that might sell them.
We have built a search engine that indexes 100 million items from 140,000 shops... We are currently returning product cards from relevant results for the search query, and we're working to help shoppers find the exact item they are shopping for, and compare against other items from other sellers...
I always considered doing something like this, and then I realized the only technique I know to get all Shopify sites is subdomain enumeration, and that's probably not uhhh white hat or whatever the term is.
I'm sure there are people doing market research out there who don't really give a damn about any of that, but I'm curious how you build out your search functionality if it's not that?
I've listed all of the (English language) shops I can find through Common Crawl etc. That gives me 56 million products from 395k shops. Shopify quotes 1.75 million shops in all languages. So I've got a fraction of the total (about 23%), but enough to be interesting, I hope.
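For anyone curious what "finding shops through Common Crawl" could look like in practice, here is a minimal sketch, not the actual pipeline behind delomore. It assumes you already have a plain list of URLs (for example, pulled from a crawl index) and that a shop can be recognized by a `*.myshopify.com` hostname. That is only an assumption: shops on custom domains would be missed entirely. The function name `shopify_hosts` is made up for illustration.

```python
from urllib.parse import urlsplit

# Assumption: a shop is identified by a subdomain of myshopify.com.
# Shops served only from custom domains are not detected by this check.
SHOPIFY_SUFFIX = ".myshopify.com"


def shopify_hosts(urls):
    """Return the sorted, de-duplicated *.myshopify.com hostnames found in urls."""
    hosts = set()
    for url in urls:
        try:
            # .hostname is lowercased, and is None when the URL has no host part
            host = urlsplit(url).hostname
        except ValueError:
            # Malformed URLs (e.g. broken IPv6 brackets) are skipped
            continue
        if host and host.endswith(SHOPIFY_SUFFIX):
            hosts.add(host)
    return sorted(hosts)


if __name__ == "__main__":
    sample = [
        "https://Acme-Shoes.myshopify.com/products/oxford",
        "https://acme-shoes.myshopify.com/",
        "https://example.com/collections/shoes",
    ]
    print(shopify_hosts(sample))
```

This only reads URLs that a public crawl has already seen, which is a different thing from the subdomain enumeration mentioned above: nothing is guessed or probed.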
That's exactly what my startup [1] is making. We have 395,000 shops at the moment. I'd love to hear any feedback.
[1] https://www.delomore.com/