The problem with this comment is that it doesn't teach us anything. If the article is wrong, it would be valuable to explain how it is wrong in a way that readers here can understand. But a sarcastic dismissal that leaves out the substance merely adds negativity.
I think that, if a phrase is meaningless, it is enough to explain why it is meaningless (because it proposes an 'exact' connexion between something rigorously defined and something that is not), without also having to explain how the argument for it fails to fulfil that meaningless goal.
Nonetheless, it's surely also true that it would be nice to suggest a constructive remedy; and, fortunately, one need not go farther than the abstract to find (a better approximation to) the precise statement that the authors are making:
> We construct an exact mapping from the variational renormalization group, first introduced by Kadanoff, [to] deep learning architectures based on Restricted Boltzmann Machines (RBMs).
That was a case of not reading the link rather than of genuine meaninglessness. The abstract makes clear that 'deep learning' is broad enough as an umbrella term to even cover RG, an otherwise completely unrelated concept from physics.
E.g. abbreviating deep belief nets as DBM, which is the commonly used acronym for deep Boltzmann machines. These are superficially similar but very different models. Calling an RBM an encoder is not entirely far-fetched, but there are many differences between autoencoders and RBMs. He eventually claims that an RBM minimises reconstruction error, which is just plain wrong and shows that this guy has absolutely no clue what he is writing about.
'Technically' this is correct--the RBM's contrastive-divergence (CD) algorithm does not minimize that function directly; but that's not the point.
It is known that when training an RBM, the reconstruction error decreases, but not monotonically; in fact it fluctuates. In the words of Hinton, 'use it but don't trust it'.
So in a global sense, yes, I would say that the RBM does eventually minimize the reconstruction error even though it fluctuates.
I can even offer a conjecture on why the error fluctuates: in a discrete RG flow map, there could be finite-size effects that would give log-periodic fluctuations. This is a stretch--but it is something that could be tested.
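The CD-vs.-reconstruction-error point can be made concrete with a toy RBM. Below is a minimal sketch (the dataset, layer sizes, and hyperparameters are made-up illustrations, not anything from the article): CD-1 follows an approximate log-likelihood gradient, yet the reconstruction error it tracks as a side effect still tends to fall over training, with the fluctuations described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden = 6, 4
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b = np.zeros(n_visible)   # visible biases
c = np.zeros(n_hidden)    # hidden biases

# Toy dataset: two repeated binary patterns (made up for illustration).
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]] * 10, dtype=float)

def cd1_step(v0, lr=0.1):
    """One CD-1 update. Returns the reconstruction error, which CD
    does NOT optimize directly -- it is only a monitoring heuristic."""
    global W, b, c
    h0_prob = sigmoid(v0 @ W + c)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    v1_prob = sigmoid(h0 @ W.T + b)          # one-step reconstruction
    h1_prob = sigmoid(v1_prob @ W + c)
    # Approximate gradient of the log-likelihood, not of the recon error
    W += lr * (np.outer(v0, h0_prob) - np.outer(v1_prob, h1_prob))
    b += lr * (v0 - v1_prob)
    c += lr * (h0_prob - h1_prob)
    return float(np.mean((v0 - v1_prob) ** 2))

# Per-epoch mean reconstruction error: generally decreasing, not monotone.
errors = [np.mean([cd1_step(v) for v in data]) for _ in range(50)]
```

Plotting `errors` for a run like this shows the typical behaviour: an overall downward trend with per-epoch noise from the stochastic hidden sampling.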
As to stacking the RBMs to form a DBN--yeah that's the point.
"Hinton showed that RBMs can be stacked and trained in a greedy manner to form so-called Deep Belief Networks (DBN)"
http://deeplearning.net/tutorial/DBN.html
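The greedy layer-wise procedure referred to above can be sketched briefly: train one RBM, freeze it, and feed its hidden-unit probabilities upward as the training data for the next RBM. Everything below (layer sizes, data, epoch count) is an illustrative assumption, not code from the linked tutorial.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Tiny binary RBM trained with CD-1 on mini-batches."""
    def __init__(self, n_vis, n_hid):
        self.W = 0.01 * rng.standard_normal((n_vis, n_hid))
        self.b = np.zeros(n_vis)
        self.c = np.zeros(n_hid)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def cd1(self, v0, lr=0.1):
        h0p = self.hidden_probs(v0)
        h0 = (rng.random(h0p.shape) < h0p).astype(float)
        v1p = sigmoid(h0 @ self.W.T + self.b)
        h1p = sigmoid(v1p @ self.W + self.c)
        self.W += lr * (v0.T @ h0p - v1p.T @ h1p) / len(v0)
        self.b += lr * (v0 - v1p).mean(axis=0)
        self.c += lr * (h0p - h1p).mean(axis=0)

# Greedy stack: train layer k, freeze it, pass its hidden probs upward.
layer_sizes = [8, 6, 4, 2]                     # visible -> deepest hidden
data = (rng.random((32, 8)) < 0.5).astype(float)

dbn = []
x = data
for n_vis, n_hid in zip(layer_sizes[:-1], layer_sizes[1:]):
    rbm = RBM(n_vis, n_hid)
    for _ in range(20):
        rbm.cd1(x)
    dbn.append(rbm)
    x = rbm.hidden_probs(x)                    # input for the next layer

# x is now the deepest-layer representation of the data.
```

Each RBM only ever sees the layer directly below it, which is what makes the training "greedy"; fine-tuning of the whole stack (e.g. with backprop) would come afterwards.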
> Engineering is the application of scientific, economic, social, and practical knowledge in order to invent, design, build, maintain, research, and improve structures, machines, devices, systems, materials and processes.
The author should probably revise his definition of engineering.
Yes. Also the guy does not seem to be that knowledgeable.
He did not get the memo that more input data can lead to worse results; he claims a slight variation of the opposite (i.e. that the more precise the input data, the higher the quality of the result).
I think the difference is that Prolog aimed to solve all kinds of programming problems. PPs, on the other hand, are very domain-specific from the start.
> A 'predictive model' should say 'if you do X then you will end up with Y' - and the X cannot be adjusting some number. The X has to be stuff like 'building ETU's in West Africa', or 'canceling all flights',... A predictive model should be able to say 'don't bother canceling flights, it's no use - instead do this...'.
This is just wrong. A predictive model does not necessarily have any "action" input. Example: weather forecast.
Well, not that important. It's only important that the buzzword appears in the title!