I think in these sorts of discussions two concepts with the same name tend to get conflated, so it's important to make a distinction between:
1) AI Research as applying/tweaking known ML/DL methods to a novel problem. I would term these something like "AI Engineering Research"
2) AI Research as examining the theoretical frameworks & approaches to ML/DL in a way that may itself lead to shifts in the understanding of ML/DL as a whole and/or develop fundamentally new tools for the purpose of #1. What might be termed "basic" or "pure" research.
I'm not placing one of these above the other in terms of importance. Both are necessary, and they form a virtuous feedback loop; deprived of one, the other would wither on the vine.
In the example of this particular person, Emil Wallner, he appears to be doing #1, and perhaps doing so in a way that might help inform more of #2.
I'm having trouble differentiating 1 from 2. Some cases seem obvious: discovering deep learning is #2; labeling some data and throwing it at an algorithm after tuning a few hyperparameters sounds like #1.
But in my mind there is also a lot of overlap. Mind providing some concrete examples? For instance, where do discovering "transfer learning", "pre-training with self-supervised learning", or "building PyTorch" fall?
Yep! There can be. But if you want concrete examples, I used Xgboost to identify people within a population at risk for an adverse event. This is strictly #1. If I optimized Xgboost code to make it faster, that's also probably firmly #1. If I improved Xgboost with a better understanding of gradient boosting to provide more accurate results, that's probably a firm case of overlap. When Leo Breiman [0] did his work that led to gradient boosting and tools like Xgboost, that was firmly #2.
It's like the difference between, say, pure and applied sciences. One is focused on developing and studying new algorithms, while the other is focused on using algorithms developed by someone else in practical applications.
To put it differently, it's like physics vs engineering. A physicist might develop new structural analysis methods, while the engineer would use those methods to model a bridge.
I understand the separation between physics and engineering. But most structural analysis methods are discovered by professors of structural engineering, not physicists (and much of it is empirical).
But I was asking because I was specifically looking for concrete examples in deep learning.
Yep, this is why I talk about the virtuous feedback loop between these two modes. Empirical methods feed theory which feeds empirical methods ad infinitum.
In the field of ML, a concrete example might be the tool Xgboost (#1) and the original work that led to and developed Gradient Boosting itself (#2), of which Xgboost is an implementation, and probably one that has helped refine the underlying theory as well.
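The Xgboost vs. gradient-boosting split can be made concrete in code. The core #2 idea (Friedman's gradient boosting, building on Breiman's work) is simply: repeatedly fit a weak learner to the residuals of the current ensemble, i.e. the negative gradient of the loss. Below is a stdlib-only toy sketch of that idea on made-up one-dimensional data; everything here is illustrative, not how Xgboost is actually implemented:

```python
# Toy sketch of the idea behind gradient boosting (the #2 contribution),
# of which Xgboost is a heavily engineered implementation (#1).
# The data and parameters are made up for illustration.

def fit_stump(xs, residuals):
    """Pick the threshold split on x that best fits the residuals."""
    best_err, best = float("inf"), None
    for t in xs:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - (lv if x <= t else rv)) ** 2
                  for x, r in zip(xs, residuals))
        if err < best_err:
            best_err, best = err, (t, lv, rv)
    return best

def predict_stump(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv

def gradient_boost(xs, ys, n_rounds=50, lr=0.1):
    base = sum(ys) / len(ys)              # start from the mean prediction
    pred = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are the negative gradient of squared-error loss.
        residuals = [y - p for y, p in zip(ys, pred)]
        stump = fit_stump(xs, residuals)
        if stump is None:
            break
        stumps.append(stump)
        pred = [p + lr * predict_stump(stump, x) for p, x in zip(pred, xs)]
    return base, lr, stumps

def predict(model, x):
    base, lr, stumps = model
    return base + sum(lr * predict_stump(s, x) for s in stumps)

# A step function is learned by stacking many small corrections.
model = gradient_boost(list(range(1, 11)), [0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
```

The #1 work is then everything Xgboost adds on top of this skeleton: regularization, sparsity handling, cache-aware data structures, distributed training, and so on.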
> I understand the separation between physics and engineering. But most structural analysis methods are discovered by professors of structural engineering, not physicists (and much of it is empirical).
You're confusing the occupation with the role. Just because your job title is "professor of structural engineering" doesn't mean you aren't studying "matter, its motion and behavior through space and time, and the related entities of energy and force."
One possible distinction is: does the work reveal anything beyond the solution itself? The work might, for example, give one instance of a class of problems for which the tool is useful (bonus points for a formal statement to that effect), improve the tool, or improve understanding of the tool's strengths and weaknesses.
I think this is what we try to capture as “expanding human knowledge”.
IMO the more isolated the result ("technique x gave good results for problem y, the end"), the less like "research" it is. Though plenty of such papers get into good conferences every year. A nice story and a little reviewer luck go a long way.
I think the questions asked by researchers in #2 are very different from those asked in #1. They mostly surround the whys and hows of AI, i.e., mathematical questions. To take examples from deep learning, #2 might ask about the robustness and generalization of deep neural networks, or apply dynamics/ODE theory to certain architectures such as ResNets.
#1 might ask about the performance of a deep neural network in approximating a given model in a specific application. AlphaFold, on the front page currently, is an example of #1.
I have a decent understanding of the approach and would vote for #1. I'd say almost all applications of ML in the physical sciences are #1. In contrast, applying methods of statistical physics to understand how deep learning (as in DNN+SGD) works at all is a good example of #2.
I'm hardly the official judge of these things, but I would say it depends on how novel an approach AlphaFold takes to the problem. If it's a more efficient tool for doing the same things as before, I would put it towards the #1 end of the spectrum, unless it has also improved our basic understanding of folding or approaches to exploring the solution space of folded proteins, which would shift it towards #2.
Personally, I don't know enough about AlphaFold or the problems of protein folding to be remotely confident in my judgment on it.
Neural networks are differentiable regexes that can be trained from examples. In the AlphaFold case, as with a lot of bioinformatics, you don't need to know much about the biological domain to be successful at solving "data" problems in the field.