
I somewhat agree with you. Someone who is spending time now studying jQuery and becoming proficient at developing web services would necessarily not be able to keep up with the pace of deep learning. On the other hand, there are people who managed to become relatively proficient at developing software a decade ago, and then spent the last decade becoming proficient at deep learning.


You need more than a decade to become proficient with deep learning at the level of researchers solving novel business problems.

It takes at least a decade just to study the prerequisite materials in vector calculus, linear algebra, advanced statistics, classifier algorithms, convex and gradient-based optimization, matrix computations and numerical methods, and associated software engineering skills. That’s all just to get to “base camp” of deep learning.

On the flip side, it’s pretty low effort to just use plug-n-play network components from popular libraries and follow a few tutorials or open source projects.

That’s why there’s effectively zero employment demand for the skill of naive Keras or PyTorch lego building. It’s as easy as it is meaningless.

Given that you’d already have spent a decade-plus of your life on advanced math if you planned to work on deep learning to solve real problems, there’s a huge impedance mismatch with the idea that you’d somehow be happy ignoring that specialized skill and the time investment sunk into it, and instead be content writing throw-away little Flask apps or optimizing routine ETL queries.


My assumption was "starting from a post-graduate level in computer science, natural sciences or equivalent". By the way, I don't see how anyone could have more than a decade specifically in deep learning, considering that the field only got started around a decade ago.

On the flip side, TensorFlow 2.0 and AutoML are coming ;). And generic RL agents that do not require reward hacking are also on the horizon. Who cares if a researcher spent 10,000 hours reading articles AND 10,000 hours building products, if a more general algorithm obsoletes it all ;)


> “My assumption was "starting from a post-graduate level in computer science, natural sciences or equivalent".”

Yes, same for me. That builds nearly a decade of preparatory work into the timeline... so it seems we agree.

> “On a flip side, TensorFlow 2.0 and AutoML are coming ;). And generic RL agents that do not require reward hacking are also on the horizon.”

I work professionally in deep learning for image processing. This quote reads like parody to me. I cannot imagine anyone familiar with the realities of AutoML or deep reinforcement learning talking this way. It’s like an excerpt from the script of Silicon Valley.


Have you used AutoML in practice for DNN architecture search?


Yes, I have used AutoKeras in practice, with mixed results. I have also written in-house hyperparameter search tooling to distribute parametric architecture search across a training cluster, with about the same mixed success. I have done this for both large-scale image processing networks and natural language processing networks.
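For a sense of what that kind of in-house tooling looks like, here is a hedged sketch (not my actual code; the search space, the fake scoring function, and the thread-based dispatch are all stand-ins for real training jobs fanned out across a cluster):

```python
# Sketch of parametric architecture search: enumerate a hyperparameter grid
# and farm each candidate out to a worker. train_and_evaluate is a placeholder
# for launching a real training run and returning a validation metric.
from concurrent.futures import ThreadPoolExecutor
from itertools import product

SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "width": [64, 128, 256],
    "learning_rate": [1e-3, 1e-4],
}

def train_and_evaluate(config):
    # Placeholder score so the sketch runs end-to-end; a real version would
    # train the configured network and return validation accuracy.
    score = 1.0 / (config["num_layers"] * config["learning_rate"] * config["width"])
    return config, score

def run_search():
    # Expand the grid into concrete configs.
    configs = [dict(zip(SEARCH_SPACE, values))
               for values in product(*SEARCH_SPACE.values())]
    # In production this pool would be a cluster scheduler, not local threads.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(train_and_evaluate, configs))
    return max(results, key=lambda r: r[1])

best_config, best_score = run_search()
```

The real work, of course, is in everything this sketch fakes: checkpointing, fault tolerance, and deciding which region of the space is worth the GPU-hours.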

Given the pricing, using AutoML in practice is beyond foolish for all but a small minority of customers. And neural architecture search is not a silver bullet; it is frequently no help at all for model selection. For example, suppose your trade-off space imposes a severe penalty on runtime and your deployed system is constrained to be CPU-only. You may trade accuracy for fewer convolutional layers in a completely ad hoc, business-driven way that does not translate into any kind of objective function for a NAS library to optimize. One of the most important production systems I currently work on has exactly this type of constraint.
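To make that concrete, here is a minimal sketch of the selection logic I mean (the architecture names and numbers are made up for illustration): the latency budget acts as a hard feasibility filter, which a scalar NAS objective cannot naturally express.

```python
# Hard CPU latency budget applied as a filter, not as a term in a scalar
# objective. Candidates and their measurements are hypothetical.
LATENCY_BUDGET_MS = 50.0

# (architecture name, measured CPU latency in ms, validation accuracy)
candidates = [
    ("conv8_deep",   120.0, 0.94),
    ("conv4_medium",  60.0, 0.92),
    ("conv2_shallow", 35.0, 0.88),
    ("conv3_pruned",  48.0, 0.90),
]

def select(candidates, budget_ms):
    # Hard constraint first: anything over budget is unusable no matter how
    # accurate. A weighted-sum objective can't express this cliff without
    # hand-tuning a penalty that merely imitates it.
    feasible = [c for c in candidates if c[1] <= budget_ms]
    # Then optimize accuracy among the survivors.
    return max(feasible, key=lambda c: c[2])

best = select(candidates, LATENCY_BUDGET_MS)
```

Note that the most accurate model loses outright here; that lexicographic "feasibility, then accuracy" ordering is exactly the business-driven logic that resists being flattened into one number.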


Interesting. I agree, it is not trivial to estimate the runtime of a model on a target device. I wonder how Google does it. They've been boasting about precisely this ability: optimizing an architecture under constraints on both accuracy and runtime for a target device, and then claiming they found an architecture better than one a team of engineers had optimized over several years.
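For reference, the published version of Google's trick (the MnasNet paper, if I recall correctly) sidesteps the hard constraint by folding measured on-device latency into a soft reward; a sketch:

```python
def nas_reward(accuracy, latency_ms, target_ms, w=-0.07):
    # Soft latency penalty: reward = accuracy * (latency / target)**w.
    # w = -0.07 is the exponent reported in the MnasNet paper; over-budget
    # models are penalized smoothly rather than rejected outright.
    return accuracy * (latency_ms / target_ms) ** w
```

Which is exactly why it doesn't cover the hard-constraint case discussed above: a smooth penalty like this will still pick an over-budget model if it is accurate enough, whereas a strict CPU-only deployment budget is a cliff.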


It’s all hype coming out of Google. Most of this stuff is meant for foisting overpriced solutions onto unwitting GCP customers who get burnt by vendor lock-in and don’t have enough in-house expertise to vet claims about e.g. overpriced TPUs or overpriced AutoML.


>You need more than a decade to become proficient with deep learning at the level of researchers solving novel business problems.

No. Even these people haven’t been doing it for a decade.


Yes. They were doing 4 years of math-intensive undergrad and 5+ years of math-intensive post-grad work before deep learning even took off around 2008-2012.


So they were doing deep learning as an undergrad, huh? Studying tensors for 10 years? Next you’ll be telling me they have 20 years experience coding for CUDA.

No. None of these people have the math background you think, nor do you need it.


I don’t want to belabor this point (but apparently I do, since I’m replying to my own post 2 hours later), but your idea of going back and counting undergraduate work — when it isn’t even related — is simply padding for padding’s sake. Why stop there? Why not 4 years of high school, too? Isn’t that a prerequisite? You can go all the way back to preschool, since counting is a prerequisite to math, and hell, counting is a whole section of discrete math. But you don’t, because claiming you need 25 years of mathematical training sounds ludicrous.



