Ask HN: How to approach a computer vision freelancer

thegrif · on Feb 24, 2018

This is a useful post. I'm also currently looking for a computer vision resource to provide expertise in a project centered on visually identifying anomalies in video of industrial machines. In other words, watching a robot and firing an alert when he misbehaves or breaks down :-)

Similar to what I believe your approach is, I'm looking for someone comfortable working in iterations. So much of this type of work is experimentation and trial and error.

If you want to connect, hit me up using:

http://telegram.me/tomgriffin http://linkedin.com/in/tomgriffin http://facebook.com/tomgriffin

redknight666 · on Feb 20, 2018

Computer vision is a highly specialized field. If all you need is an MVP, just grab some code off the internet, pair it with a run-of-the-mill dataset, and build the MVP on that. Doing so means you don't care about accuracy, recall, F1, and all the mumbo jumbo that we do care about, and that you should care about too if you plan to build a product around computer vision.
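For readers who have not met those metrics, here is a minimal sketch of how accuracy, precision, recall and F1 are computed for a binary classifier. The function name and the labels are made up for illustration; a real project would more likely use a library routine.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 3 positives out of 8 samples, only one of them detected.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On this toy data accuracy comes out at 0.625 while recall is only one third, which is the usual reason accuracy alone is not enough when the interesting class is rare.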

You can even run that kind of thing through Microsoft's or IBM's cloud APIs for an MVP, if all you need is to test the product idea. Mind you, if I were you, I wouldn't even build the product in the first place; I would do proper UX work first, which lets you figure out quickly and cost-effectively whether it makes business sense to pursue computer vision at all.

Now, assuming that you have all those angles covered: no, you shouldn't hire a freelancer. This is a piece of technology that is very specialized, very complex, and difficult to get right (fit for purpose) and to debug. You want either a contracting company that knows what it is doing and is skilled in knowledge transfer (the good ones are not cheap), at least if you want your company to be able to build its own team on top of the base technology down the road, or a third-party product, in which case you accept the risk that the technology is not fit for your purposes now or a year from now.

Disclaimer: I work for that type of contractor.

clearwaterdata · on Feb 20, 2018

Thank you for these thoughts!

Using existing software through an API won’t work because I need the model to work offline.

The idea of hiring a firm is a good one - how would you answer that same series of questions when approaching a firm rather than a freelancer (cost, time frame, needed training images, technical specifications, etc)?

redknight666 · on Feb 20, 2018

Training images: there is a tension here. You need to know whether what you are asking for requires specialized imagery (say, machine parts) or whether you can get away with currently available datasets. So unless what you need is very general, the answer is yes, you will need to provide your own.

Cost: firms that know what they are doing aren't cheap. It is a simple cost-structure issue. Think about it: a single AI guy may cost around 300K a year just in salary, so that is the starting point on the cost side. They have to make at least enough to cover those costs each year. Now, an interesting tidbit that is not widely known: not all AI guys live and breathe in the United States (aka SV, NY and Boston). Our company is based in Argentina, and we have built, among other things, face recognition software and have done research for big names in the States. So if costs are prohibitive, think about shopping offshore; you will be surprised.

Given that you don't have any real target, anyone skilled enough will interview you and figure out pretty fast whether you understand what you need. If they push you for targets for, say, accuracy and you cannot back up why you need X or Y, they will know you are not well versed, and because of the risk:

1) They will pad their price to deal with your uncertainty.

2) The most professional ones will flat-out refuse to do the project on those terms (I know this for a fact, because we have rejected many projects where the client was not ready to pursue such an endeavour).

Now, the most important point: there is a lot of smoke out there. Today, doing ML work is cool and fancy (so everybody is doing it, right?). The truth is that not many people know what they are doing, and chances are you will end up in a bad spot if you are not very diligent. That is why I suggested doing the pre-screening UX work before committing to anything. You will embark on this path only if it really makes sense, and you will have spent, say, 30K to figure out that you probably don't need it, or that you need to build something so constrained that you can get away with very rough and trivial tech.

If you go to a firm and ask for a classifier, they will probably build you a convolutional neural network or an RNN or whatever; sometimes you don't even need that. A year ago we had a client that needed something very specific: they essentially asked us for a fairly intelligent OCR. We could have built it, or retrofitted a commercial one, but while working with them on the pre-screening we figured out that the use case was actually quite restricted, and because of specific constraints we were able to build, in 5 days, a solution that is in production to this date (plain old template matching did the trick). They spent 20K up front and ended up saving 300K or more in the very short run (and had a product out there in 3 weeks instead of 6 months).
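To give a feel for how simple "plain old template matching" can be, here is a minimal sketch using a sum-of-squared-differences search over a grayscale image stored as nested lists. This is a generic illustration, not the solution described above, whose details were not given; the function name and the toy data are assumptions. In practice a library routine such as OpenCV's `cv2.matchTemplate` would normally be used instead.

```python
def match_template(image, template):
    """Slide `template` over `image` and return (row, col, ssd) of the best match.

    Both arguments are 2D lists of grayscale values. A lower sum of squared
    differences (SSD) means a better match; 0 is a pixel-perfect match.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    if tpl_h > img_h or tpl_w > img_w:
        raise ValueError("template must not be larger than image")

    best = None
    for r in range(img_h - tpl_h + 1):
        for c in range(img_w - tpl_w + 1):
            ssd = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(tpl_h)
                for j in range(tpl_w)
            )
            # Strict comparison keeps the first position found on ties.
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best


# Toy 5x5 "image" containing a 2x2 pattern with its top-left corner at row 2, column 1.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 9, 1, 0, 0],
    [0, 1, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 1],
    [1, 9],
]
print(match_template(image, template))
```

This brute-force version only works when the target appears at a fixed scale and orientation with stable lighting, which is exactly the kind of constraint a pre-screening phase can uncover.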

How would we do it? In those cases we start with a very small project to figure out what kind of technology we actually need; that may involve UX work or plain ML MVP creation work. The aim is to keep costs as low as possible while producing something we can actually test, to figure out whether it is good enough to be part of the product. Sometimes that just involves incorporating a pre-trained model that you can retrofit; other times it involves scanning the state of the art and figuring out what you need to build in order to achieve the goals. But for that to work, there must be a time constraint and the understanding that you are converging toward a solution. It is essentially risk management at its best. I wouldn't trust anyone who doesn't start like that for a brand-new vision-based product or feature.

clearwaterdata · on Feb 20, 2018

This is incredibly helpful. Thank you. I definitely understand the need to have a clear understanding of the software performance, and I have many of those details already nailed down. I absolutely plan on running an MVP project first to see if trivial tech will suffice, and to get a feel for the firm/freelancer that I'm working with. It could be that my project is constrained enough we do not need the same level of sophistication as many models.

I also am entirely willing to push my project offshore, so long as I can communicate with my partner and am confident that they can deliver.

If I wanted to engage your firm, how could I get a hold of them/you for further discussion about my project details?

redknight666 · on Feb 20, 2018

Contact me: federico.lois (at) corvalius.com

eggie5 · on Feb 20, 2018

This sounds like the classic image classification challenge. The next question I would have is: is this an open-world or a closed-world problem? Meaning, what is the world of images that the model will be exposed to in the wild? Will it only be exposed to sailboat/windmill/bird, or will it be exposed to sailboat/windmill/bird and every other possible item in the world?
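The distinction matters because a closed-world classifier always answers with one of its known classes. A minimal sketch of one simple open-world baseline, rejecting predictions whose softmax confidence falls below a threshold, could look like this. The function names, the raw scores, and the 0.8 threshold are all assumptions for illustration, and confidence thresholding is known to be unreliable because networks can be confidently wrong on unfamiliar inputs.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_reject(class_scores, threshold=0.8):
    """Return (label, confidence); label is "unknown" below the threshold.

    `class_scores` maps each known class name to the model's raw score.
    """
    labels = list(class_scores)
    probs = softmax([class_scores[label] for label in labels])
    best = max(range(len(labels)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return "unknown", probs[best]
    return labels[best], probs[best]


# Hypothetical raw scores from some classifier.
confident = {"sailboat": 4.0, "windmill": 0.5, "bird": 0.2}
ambiguous = {"sailboat": 1.0, "windmill": 1.0, "bird": 1.0}
print(classify_with_reject(confident))
print(classify_with_reject(ambiguous))
```

In a closed-world deployment the reject branch is unnecessary; in an open-world one, the threshold would have to be tuned on real examples of things the model should not recognise.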

Also, do you want to deploy this on a standard machine with or without a GPU? That can have implications for inference speed, but there are ways to optimise for that. The absence of a network connection is not a problem.

Also, do you have training data? If not, we can probably leverage a pre-trained model for this use case, in which case we would only need a handful of training examples.
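One common way to get by with a handful of examples is to use the pre-trained model purely as a feature extractor and fit a very small classifier on top of its embeddings. The sketch below assumes the embeddings have already been computed; the three-dimensional vectors are made up, and a nearest-centroid rule stands in for whatever classifier a developer would actually choose.

```python
import math


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def fit_centroids(examples):
    """`examples` maps each label to a few embedding vectors for that class."""
    return {label: centroid(vecs) for label, vecs in examples.items()}


def predict(centroids, embedding):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], embedding))


# Made-up embeddings: two training examples per class.
examples = {
    "sailboat": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "windmill": [[0.1, 0.9, 0.1], [0.2, 0.8, 0.0]],
    "bird": [[0.0, 0.1, 0.9], [0.1, 0.2, 0.8]],
}
centroids = fit_centroids(examples)
print(predict(centroids, [0.85, 0.15, 0.05]))
print(predict(centroids, [0.05, 0.10, 0.95]))
```

Real embeddings have hundreds or thousands of dimensions, but the idea is the same: the expensive part was learned elsewhere, so only the class centroids depend on your own images.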

To answer some of your questions:

Where should I look to find a quality freelancer? Not sure.

What formats should I specify for the deliverables? Depends on what tools the developer uses. For example, I would deliver files in TensorFlow export format.

What timeline should I expect? Using a pre-trained model, I would expect to deliver it in about 40 man-hours.

What pricing should I expect? Probably about $6000, but that depends on the contractor.

What would a good developer expect me to provide in terms of training data? Using a pre-trained model you would only need a handful of training images. But, of course, the more the better.

What API parameters would they expect me to specify? I can't think of anything.

I have experience building and deploying these vision models. Feel free to reach out to me for more info. It's my username @ gmail.com