This is similar to the emerging "Bayesian brain" theory, which views the brain as a system that tries to minimise prediction error (which may be roughly what "free energy" refers to in some related publications) by comparing its expectations against the actual information coming in from the senses.
So far it seems to explain quite a lot of data, as well as many mental illnesses (e.g. many disorders can be thought of as the brain under-correcting or over-correcting for the prediction error).
By under-correcting, the brain does not learn enough from its mistakes, which may lead to delusions of superiority (e.g. being stuck in habitual behaviour, or an inability to change one's world-view based on new information). When over-correcting, on the other hand, the world may seem unpredictable and frightening, leading to self-doubt, anxiety and negative thoughts.
There is a fascinating Bayesian explanation for schizophrenia: the story goes that schizophrenic people have a much sharper prior/posterior than non-schizophrenic people, which makes it harder for them to correct their internal models when the environment diverges from their predictions, and so they drift off into their own realities.
For example, if you run the rubber hand experiment with non-schizophrenic people, even if you don't stroke their hand and the rubber hand at exactly the same time (say the timing offset is Gaussian with standard deviation sigma), with enough repeated exposures to the stimuli they will come to recognize the rubber hand as their own. In contrast, if you repeat the same experiment with schizophrenic people, it takes a smaller standard deviation or substantially more trials before they recognize the rubber hand as their own.
I wish I had the references lying around, but I dug into the literature for this a few years back and found this hypothesis to be surprisingly well supported.
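A minimal numerical sketch of that sharper-prior intuition, using a conjugate normal model picked purely for illustration (the numbers, the prior, and the Gaussian offsets are my assumptions, not taken from the literature): the sharper the prior over the felt-vs-seen timing offset, the more synchronous-stroking trials it takes for the belief to move.

```python
import numpy as np

def posterior_mean(prior_mean, prior_var, obs, obs_var):
    """Conjugate normal update with known observation variance."""
    post_prec = 1.0 / prior_var + len(obs) / obs_var
    return (prior_mean / prior_var + obs.sum() / obs_var) / post_prec

rng = np.random.default_rng(0)
# Offsets actually experienced during the illusion: roughly synchronous strokes.
offsets = rng.normal(loc=0.0, scale=0.2, size=50)

# Prior belief: "what I feel lags what I see by about 1 second", i.e. "not my hand".
# A broad prior is pulled onto the evidence almost immediately; a sharp prior
# lags far behind at every trial count.
for prior_var in (1.0, 0.01):
    for n in (2, 10, 50):
        m = posterior_mean(1.0, prior_var, offsets[:n], obs_var=0.2 ** 2)
        print(f"prior var {prior_var:>4}, {n:>2} trials -> believed offset {m:+.2f}")
```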
I agree; Karl Friston's work is among the most interesting I have ever read, period. Interestingly, his 2009 paper (https://www.fil.ion.ucl.ac.uk/~karl/The%20free-energy%20prin...) on the free-energy principle makes use of reinforcement learning, gradient descent, Markov blankets, Helmholtz machines, and other foundational ideas of modern machine learning ... Relatedly, Geoff Hinton (a foundational figure in modern machine learning) overlapped with Friston during Hinton's time in England.
Yes, an interesting man. I encountered him at a workshop by Bert Kappen on stochastic optimal control. Kappen's work shows that there are different control strategies for different noise levels, separated by phase transitions.
"In short, free energy minimization will tend to produce local CLE that fluctuate at near zero values and exhibit self-organized instability or slowing."
I have to study further what he means by self-organized instability.
Note that Geoffrey Hinton's first Restricted Boltzmann Machines were designed to minimize free energy. The first Restricted Boltzmann Machine, however, was Paul Smolensky's Harmonium. It maximized a metric called harmony, which was essentially the inverse of free energy. When Hinton and Smolensky collaborated with Rumelhart on a publication, they settled on calling it "goodness of fit".
My point is that saying that the brain is maximizing harmony is quite reasonable -- and much easier to understand.
Rumelhart, D. E., Smolensky, P., McClelland, J. L., & Hinton, G. E. (1986). Schemata and Sequential Thought Processes in PDP Models. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. II. MIT Press, Cambridge, MA.
"the brain" Can any of this grand top-down 'delusions and personality traits are thermodynamic xyz' theorising about "the brain" apply to, say a bee's brain or a worm's?
Regarding complexity: Probably anything with a brain has the problem of balancing lots of different sources of information and maintaining (enough) coherence of behavior. So even very simple worms will need a simple version of this...
Any study about "criticality" needs to be taken with a huge grain of salt. The standard methodology treats criticality as synonymous with power laws. And power laws are straight lines on log-log plots, with slope equal to the critical exponent.
So when anybody says "we showed X was critical", they actually mean "we plotted a fuzzy cloud of data points on a log-log plot and fitted a line through it", nothing more. But you can fit a line through anything. Even a normal distribution shows up as a line on a log-log plot if your data has a small enough range.
Criticality studies trade on the reputation of physics, where the idea came from, and there it works fantastically. For instance, we can measure critical exponents for liquid/gas phase transitions to three or four significant figures, and even predict those numbers from pure theory. Applications outside of physics usually have barely one significant figure, if they're even measuring power laws at all, and no predictive theory.
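To make the "you can fit a line through anything" point concrete, here is a toy demonstration of my own (none of this is from the parent comment or any paper): the right-hand tail of a plain Gaussian, restricted to a small range, produces a respectably straight log-log fit.

```python
import numpy as np

# Samples from a plain normal distribution: definitely not a power law.
rng = np.random.default_rng(1)
x = rng.normal(loc=10.0, scale=1.0, size=100_000)
x = x[(x > 10) & (x < 13)]          # keep only the decaying tail: a small dynamic range

# Histogram the tail and fit a straight line in log-log space.
counts, edges = np.histogram(x, bins=25)
centers = 0.5 * (edges[:-1] + edges[1:])
keep = counts > 0
logx, logy = np.log(centers[keep]), np.log(counts[keep])
slope, intercept = np.polyfit(logx, logy, 1)

# R^2 of the log-log line: well above 0.9, despite there being no power law at all.
residuals = logy - (slope * logx + intercept)
r2 = 1.0 - residuals.var() / logy.var()
print(f"meaningless 'critical exponent': {slope:.1f}, log-log R^2: {r2:.2f}")
```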
John von Neumann was at a talk where the presenter had put up a slide with a cloud of points, and had optimistically drawn a line through the cloud. Von Neumann muttered, "at least they lie on a plane."
Huh, I stand corrected. My original complaint still applies to >99% of criticality studies, but the field might be coming around since the last time I looked into it, 3 years ago. Skimming through the papers, I'm still skeptical of how stringent a test their 'exponent relation' really is, but it's an improvement.
What exactly does exhibit criticality? As correctly stated, all kinds of phenomena can exhibit power laws.
Avalanche sizes on a sandpile follow a power law. The typical example of self-organized criticality (Per Bak).
Back in the day I played with renormalization group theory to prove SOC, but most systems break down if there is loss on a microscopic scale. Intuitively, you need conservation laws at the microscopic scale or the very large events do not happen.
This is unlikely to be the case in a biological system, so I wouldn't expect it to be at a critical state, only hovering around "an interesting area".
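For anyone who hasn't seen it, the sandpile model mentioned above fits in a few lines. This is a toy Bak-Tang-Wiesenfeld implementation of mine (grid size and number of grain drops are arbitrary choices), which records the size of each toppling avalanche so you can histogram it yourself.

```python
import numpy as np

def sandpile_avalanches(n=30, drops=10_000, seed=0):
    """Bak-Tang-Wiesenfeld sandpile; returns the number of topplings per dropped grain."""
    rng = np.random.default_rng(seed)
    grid = np.zeros((n, n), dtype=int)
    sizes = []
    for _ in range(drops):
        i, j = rng.integers(n, size=2)
        grid[i, j] += 1
        topples = 0
        while True:
            unstable = np.argwhere(grid >= 4)
            if len(unstable) == 0:
                break
            for a, b in unstable:
                grid[a, b] -= 4          # site topples, sheds one grain to each neighbour
                topples += 1
                if a > 0: grid[a - 1, b] += 1
                if a < n - 1: grid[a + 1, b] += 1
                if b > 0: grid[a, b - 1] += 1
                if b < n - 1: grid[a, b + 1] += 1   # grains leaving the edge are lost
        sizes.append(topples)
    return np.array(sizes)

sizes = sandpile_avalanches()
print("avalanches recorded:", len(sizes), "| largest avalanche:", sizes.max())
```

Plot a histogram of `sizes` on log-log axes and you get the famous straight-ish line, with all the caveats from the comments above.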
Hengen said: “Recently, people moved away from measuring simple power laws, which can pop out of random noise, and have started looking at something called the exponent relation. So far, that’s the only true signature of criticality, and it’s the basis of all of our measurements.”
They measure neuronal activity and how firing cascades spread - if I understand correctly, without having looked at the paper. And they found that, thanks to the sophisticated network of inhibitory neurons, the whole network balances around this mathematical criticality.
I guess it means if there were less inhibition then entropically there would be too many firing actions leading to a feedback frenzy, which is obviously not that effective for information processing.
And if there were more inhibition then the information wouldn't be able to spread to all the special small parts of the brain, thus making them too specialized.
"Avalanches were analyzed in terms of size (S, the number of spikes), and duration (D, time) (Fig 1A), and power law exponents were fit to the two distributions. In critical systems, the exponents of the two distributions can be used to predict the mean avalanche size (<S>) observed at a given duration (i.e. the distributions scale together). When <S> is plotted against avalanche duration, the difference between the empirically derived best-fit exponent and the predicted exponent serves as a compact measure of the deviation from criticality (Deviation from Criticality Coefficient, “DCC”, Fig 1B)."
Recap. There are several matters that are at times conflated.
+ Define an order parameter which "significantly changes" (undergoes a phase transition).
+ Define a control parameter that drives the system through those different regimes.
+ Establish that there is a critical point (not just a "region").
+ Define properties at the critical point that are scale-free (or having "all scales").
+ Define this critical point as an attractor in a dynamical system sense. (The system gets infinitely close to it given enough time.)
They show, however, that the system has an attractor "near" the phase transition. Moreover, they do not really establish that this is a critical phase transition. They establish "nearness" with respect to the critical phase transition by fitting two power laws (size vs duration) according to an expected exponent relation, (α - 1)/(τ - 1). They see this as a quantitative measure of nearness. In other words, their nearness measure depends on their definition of criticality.
A lot of these studies talk about "near-critical" states. However, that doesn't always make mathematical sense.
PS: If you're thinking "ah, you just have to make the control parameter the output of a system whose input is some value that quantifies how critical the system currently is", yeah, that might work. However, the really interesting systems do not use this kind of macroscopic information; they have local processes that lead to the same result.
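To make the quoted DCC recipe concrete, here is a rough sketch of how I read it (the crude log-log regression fits are my own shortcut; the paper presumably uses proper maximum-likelihood power-law fitting, and all variable names here are mine):

```python
import numpy as np

def fit_exponent(values, bins=30):
    """Crude power-law exponent from a log-log histogram fit (illustration only)."""
    values = values[values > 0]
    edges = np.geomspace(values.min(), values.max(), bins)
    counts, _ = np.histogram(values, bins=edges)
    centers = np.sqrt(edges[:-1] * edges[1:])
    keep = counts > 0
    slope, _ = np.polyfit(np.log(centers[keep]), np.log(counts[keep]), 1)
    return -slope                              # P(x) ~ x^(-exponent)

def deviation_from_criticality(sizes, durations):
    """DCC as described in the quote: |fitted - predicted| exponent of <S> vs D."""
    tau = fit_exponent(sizes)                  # P(S) ~ S^(-tau)
    alpha = fit_exponent(durations)            # P(D) ~ D^(-alpha)
    beta_pred = (alpha - 1.0) / (tau - 1.0)    # crackling-noise exponent relation
    keep = (sizes > 0) & (durations > 0)
    d_vals = np.unique(durations[keep])
    mean_s = np.array([sizes[keep][durations[keep] == d].mean() for d in d_vals])
    beta_fit, _ = np.polyfit(np.log(d_vals), np.log(mean_s), 1)   # <S>(D) ~ D^beta
    return abs(beta_fit - beta_pred)
```

A DCC near zero is then read as "operating near criticality"; the objections above about what such a fit really establishes still apply.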
I think it's better phrased as: it has active feedback control loops. If there is room for more specialization, or if it can get away with less energy, fewer neurons, or some other kind of optimization, then it seems it will do that.
And there are loops that work against these to prevent breakdown, forgetting too much, slowing down too much, etc.
Of course it's not really known yet what these control systems are exactly. (At least I'm not aware that we have good data and theories about this aspect of the brain.)
When we think of all the different neurotransmitters (dopamine, serotonin, etc.) we sometimes forget that the vast majority of neurons produce either excitatory outputs (via glutamate) or inhibitory outputs (via GABA). The other neurotransmitters are modulators of this basic phenomenon of excitation/inhibition.
The brain has a tightly balanced feedback loop between excitation and inhibition. Too much excitation and the brain gets a seizure (positive feedback). Too little, and you black out.
TBH, it's quite a good example of the principle of balance in the concept of Yin-Yang.
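As a cartoon of that excitation/inhibition balance (a toy linear rate model I made up for illustration; nothing here comes from the article): the net recurrent gain, excitation minus inhibition, determines whether a pulse of activity dies out, lingers, or blows up.

```python
import numpy as np

def simulate(w_exc, w_inh, steps=50):
    """Toy 1-D rate model: r(t+1) = max(0, (w_exc - w_inh) * r(t) + input(t))."""
    r, rates = 0.0, []
    for t in range(steps):
        inp = 1.0 if t == 0 else 0.0          # a single pulse of external input
        r = max(0.0, (w_exc - w_inh) * r + inp)
        rates.append(r)
    return np.array(rates)

for w_exc, w_inh, label in [(1.2, 0.10, "runaway excitation (seizure-like)"),
                            (1.2, 0.21, "near the balance point: the pulse lingers"),
                            (1.2, 0.70, "inhibition dominates: the pulse vanishes")]:
    final = simulate(w_exc, w_inh)[-1]
    print(f"E={w_exc}, I={w_inh}: activity after 50 steps = {final:.3g}  ({label})")
```

Real networks obviously aren't a single scalar gain, but the knife-edge between the first and third regimes is the point.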
Interesting, but that article -- Ma et al. (2019) "Cortical Circuit Dynamics Are Homeostatically Tuned to Criticality In Vivo" [https://www.cell.com/neuron/fulltext/S0896-6273(19)30737-8] -- makes no mention of Karl Friston or his work (Friston is also mentioned elsewhere in this thread, e.g. @Agebor's comment re: the 'Bayesian brain'), which seems highly relevant.
E.g.
* Friston, K. (2009). The Free-Energy Principle: A Rough Guide to the Brain? Trends in Cognitive Sciences, 13(7), 293-301.
* Solms, M. (2018). The Hard Problem of Consciousness and the Free Energy Principle. Frontiers in Psychology, 9, 2714. DOI: 10.3389/fpsyg.2018.02714 | PMCID: PMC6363942 | PMID: 30761057
"The activity of a brain—or even a small region of a brain devoted to a particular task—cannot be just the summed activity of many independent neurons. Here we use methods from statistical physics to describe the collective activity in the retina as it responds to complex inputs such as those encountered in the natural environment. We find that the distribution of messages that the retina sends to the brain is very special, mathematically equivalent to the behavior of a material near a critical point in its phase diagram."
Off topic, but seeing "wustl.edu" as the source brought up some memories.
In the early 90s, I used their ftp site at wuarchive.wustl.edu very frequently. It was a reliable source to download open source software like Perl, tcl, trn, gcc, and so on.
This problem of runaway excitation in wetware reminds me of exploding gradients in artificial neural nets. We try to handle this with data normalization, batch normalization, and gradient clipping of various sorts (although unless the clipping is incorporated into the loss, as in PPO (Schulman), it's very brittle and dependent on the data and network architecture). So I wonder if we can glean something from these inhibitory neurons for artificial nets. The opposite problem, vanishing gradients, results from too much inhibition, which happens in recurrent neural nets - so it's definitely a balance. Currently PPO does the best job IMO, but it's specific to reinforcement learning.
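For anyone who hasn't run into the clipping side of that comparison, here's a bare-bones, framework-free sketch of global-norm gradient clipping (my own illustration, not any library's API): if the combined gradient norm explodes, everything gets scaled back down.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale a list of gradient arrays so their combined L2 norm is <= max_norm."""
    total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if total_norm <= max_norm:
        return grads
    scale = max_norm / (total_norm + 1e-12)
    return [g * scale for g in grads]

# A gradient that has "exploded" (runaway positive feedback in the backward pass).
grads = [np.array([250.0, -80.0]), np.array([[12.0, 900.0]])]
clipped = clip_by_global_norm(grads, max_norm=5.0)
print([np.round(g, 3) for g in clipped])
```

PPO's trick is different: rather than rescaling gradients after the fact, it clips the policy probability ratio inside the objective itself, which is what makes the "inhibition" part of the loss rather than a post-hoc patch.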
They are using criticality in the same sense it is used in physics (in the context of phase transitions). You may find the wiki article useful in this regard [1]
That link offers criticality as a metaphor with multiple meanings. You're suggesting phase-transition criticality is close to what they mean - the characteristic boundaries where matter transitions between solid and liquid, liquid and gas, etc.?
I won’t spin mental cycles guessing. I’d rather the authors were explicit.
Well frankly I did not read the article. But I am all too familiar with discussions of criticality in the context of neuroscience and it is always meant in the same way physicists use the word.
I'm afraid this may be a scenario where some deeper knowledge is needed to fully appreciate the discussion. You would be well rewarded for putting in the effort, though; it's a fascinating notion. I'd recommend starting with the Ising model [0], which is the canonical system exhibiting critical phenomena.
edit: if by 'this' you are referring to the wiki article on critical phenomena, you're definitely missing the larger picture. The examples that wiki article lists aren't metaphorical; they're all essentially 'corollaries' (in a very loose sense) of the same underlying thing. Start with the Ising model.
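If it helps, here is a minimal 2-D Ising model with single-spin Metropolis updates (lattice size, temperatures, and sweep counts are arbitrary choices of mine, and it's deliberately unoptimized): below the critical temperature T_c ≈ 2.27 the lattice stays magnetised, above it the magnetisation melts away, and right around T_c you get the scale-free fluctuations everyone in this thread is arguing about.

```python
import numpy as np

def ising_magnetisation(T, n=24, sweeps=200, seed=0):
    """2-D Ising model with Metropolis updates; returns |mean spin| after the sweeps."""
    rng = np.random.default_rng(seed)
    spins = np.ones((n, n), dtype=int)           # start fully ordered
    for _ in range(sweeps * n * n):
        i, j = rng.integers(n, size=2)
        neighbours = (spins[(i + 1) % n, j] + spins[(i - 1) % n, j] +
                      spins[i, (j + 1) % n] + spins[i, (j - 1) % n])
        dE = 2 * spins[i, j] * neighbours        # energy cost of flipping spin (i, j)
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            spins[i, j] *= -1
    return abs(spins.mean())

for T in (1.5, 2.27, 3.5):    # well below, near, and well above T_c
    print(f"T = {T}: |magnetisation| ~ {ising_magnetisation(T):.2f}")
```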
“Recently, people moved away from measuring simple power laws, which can pop out of random noise, and have started looking at something called the exponent relation.”
Sadly there's no further explanation of this. "Exponent relation" sounds like it could just be a synonym for the power-law exponent, too.
The brain being poised towards criticality is a serious meme in the neuro community. Not that it's an uninteresting idea or one unworthy of pursuit. It's just a bit hard to get people to take you seriously.
Well, these ideas are very likely accurate, but so general that they are disconnected from solving practical problems. Fine, the brain is a prediction machine, it optimizes over some program space by something like annealing while doing homeostasis/regulation, and maybe it tends to occupy certain kinds of states now and then. This, however, tells us very little about what the learning rules for the synaptic weights should be or how we should wire things up.
In fact, I believe human-relevant problems are best solved by such a special subregion of program space that one needs pretty specific architectural priors, as otherwise the search will take too long. These are unlikely to be derived from general concepts; they are more likely evolved, either literally by evolutionary algorithms or by people doing the trial and error. The issue is that general concepts about prediction errors and program spaces know nothing about our specific world.
E.g. none of these general concepts predict the usefulness of CNNs. CNNs exploit fairly specialized priors about object translation invariance and locality in image statistics, which are specific computations occurring in our universe when parts of it are perceived by geometric projection of EM rays onto an image plane with sensors. Hinton's capsules go in the right direction, exploiting some more priors about spatial reference-point invariance, but we need to go deeper. The brain disassembles the world into stable episodic chunks and operates on them, and it manages to backpropagate values through such episodic memories. Currently, no neural architecture does something like this.
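Just to put a number on how strong that CNN prior is (a back-of-the-envelope comparison of my own, not from the comment above): locality plus weight sharing is the whole difference between a layer you can train and one you can't.

```python
# One layer mapping a 224x224 RGB image to 64 feature maps of the same size.
h, w, c_in, c_out, k = 224, 224, 3, 64, 3

dense_params = (h * w * c_in) * (h * w * c_out)   # fully connected: every pixel to every unit
conv_params = (k * k * c_in) * c_out              # 3x3 conv: locality + translation invariance

print(f"dense layer: {dense_params:,} weights")   # ~4.8e11
print(f"3x3 conv   : {conv_params:,} weights")    # 1,728
```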
https://towardsdatascience.com/the-bayesian-brain-hypothesis...
Being wrong around 15% of the time might actually be the optimal rate for learning... https://www.independent.co.uk/news/science/failing-study-suc...