A powerful and general hierarchical principle of error correction and invariance

In working with Sparsey, I've discovered an extremely important principle of information representation/processing, with substantial consequences for error correction and invariance. The principle explicitly involves hierarchy. In words, the basic idea is this: suppose there are multiple sources of evidence [i.e., activation patterns in coding fields (CFs), which we call "macs"], each representing a probability distribution over an event space, and all of them influence a downstream probability distribution over another event space [represented by an activation pattern in the downstream (i.e., target) CF]. Then the correctness of the estimate in the downstream distribution is generally higher than the average correctness of the upstream (source) distributions. Another way of saying this, specific to Sparsey's use of SDCs as the representations of the probability distributions, is: correctly active units (black, in the figures below) across the multiple source macs that provide input to a final common target mac are necessarily 100% correlated with each other, whereas incorrectly active units (red) are in general <100%, and often <<100%, correlated, or uncorrelated, or even anticorrelated with each other. Fig. 1 shows an example of the principle.
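To make the statistical version of the claim concrete, here is a small, hypothetical simulation (plain NumPy, not Sparsey code): each of several sources independently names the correct hypothesis with some probability, and the downstream estimate is simply the hypothesis with the most converging support. The number of hypotheses, the per-source correctness, and the number of sources are all made-up illustrative values.

```python
# Minimal sketch of the claim: a downstream estimate formed from several
# independent, partially correct sources is correct more often than the
# average source. All parameters below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_hypotheses = 50       # size of the event space (assumed)
p_source_correct = 0.6  # each source names the correct hypothesis with this probability
n_sources = 3           # e.g., three input macs feeding one target mac
n_trials = 20000

correct = 0
source_hits = 0
for _ in range(n_trials):
    truth = rng.integers(n_hypotheses)
    votes = np.zeros(n_hypotheses)
    for _ in range(n_sources):
        if rng.random() < p_source_correct:
            guess = truth
        else:
            # an incorrect source picks some other hypothesis at random,
            # so errors across sources are uncorrelated
            guess = rng.choice([h for h in range(n_hypotheses) if h != truth])
        source_hits += guess == truth
        votes[guess] += 1
    # downstream estimate: the hypothesis with the most converging evidence
    winners = np.flatnonzero(votes == votes.max())
    correct += rng.choice(winners) == truth

print(f"average source correctness:        {source_hits / (n_trials * n_sources):.3f}")
print(f"downstream (combined) correctness: {correct / n_trials:.3f}")
```

With these assumed numbers the combined estimate comes out correct roughly three quarters of the time, versus 0.6 for any single source, precisely because the sources' errors do not line up with each other.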

Why is this? Why are correctly active units across multiple source macs necessarily 100% correlated? It's because Sparsey's overall learning rubric is single-trial. That is, Sparsey assigns codes (SDCs) to inputs in single trials, and these codes are permanent: they do not change over time. Thus, when the learning instance ("trained association") depicted in the figure occurs, the weights from all active (black) units in all three input macs to all active units in the target mac are set to 1. That is the formation (storage) of the association. That association is ground truth: the three codes in the input macs and the code in the target mac are, by definition, correct, since this is essentially a storage event, i.e., the storage of an "episodic" memory. All 15 black units in the three independent input macs become 100% correlated, i.e., entangled, in that storage event. The mechanism of their entanglement is the correlated physical setting of their efferent weights onto the units comprising the common target code.
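Here is a minimal sketch of that single-trial storage step, under assumed toy sizes (20 units per mac, 5 active units per code); it is not Sparsey's actual code, just an illustration of setting the efferent weights of all active input units onto the active target units in one shot.

```python
# Hypothetical sketch of single-trial storage: weights from every active unit
# in the three input macs to every active unit in the target mac are set to 1.
import numpy as np

rng = np.random.default_rng(1)

units_per_mac = 20   # assumed mac size
active_per_mac = 5   # assumed SDC size (active units per code)
n_input_macs = 3

# one SDC (set of active unit indices) per input mac, plus one for the target mac
input_codes = [rng.choice(units_per_mac, active_per_mac, replace=False)
               for _ in range(n_input_macs)]
target_code = rng.choice(units_per_mac, active_per_mac, replace=False)

# binary weights: (input mac, input unit) -> target unit, all zero initially
W = np.zeros((n_input_macs, units_per_mac, units_per_mac), dtype=np.uint8)

# single-trial, permanent storage of the association
for m, code in enumerate(input_codes):
    for i in code:
        W[m, i, target_code] = 1

# all 15 active input units now have identical efferent weight vectors onto the
# target mac: this is the sense in which they become 100% correlated ("entangled")
rows = np.vstack([W[m, u] for m, code in enumerate(input_codes) for u in code])
print("all active input units share the same efferent pattern:",
      bool((rows == rows[0]).all()))
```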

[Fig. 1: cross-mac correlation error correction principle]

Having learned (stored) this one association, suppose we present a new input to the model. Note that we are not depicting the actual raw sensory input field; that field would be input to the level comprising the three "input" macs shown here. Suppose this new input is similar to the input that gave rise to the three mac codes in the training instance, but not exactly the same. In that case, we can assume that the codes that arise in the three input macs will be similar to, but not exactly the same as, the codes that arose in those macs during training. Hence, we depict codes with varying degrees of overlap (intersection) with the original codes, e.g., 60%, 60%, and 40%. The crucial thing to see here is that during the learning instance, the eight black input units in the test instance (second row of the first figure) will all have had their weights onto the 5 active target units increased to 1. Thus, on the test instance, each of these 5 target units, i.e., the "correct" ones, will have an input summation of 8.
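The following sketch continues the toy setup above (same assumed sizes): it builds test codes that overlap the trained codes by 60%, 60%, and 40%, and computes each target unit's input summation. With only the one association stored, the 5 correct target units each sum to 8 and every other target unit sums to 0.

```python
# Hypothetical continuation of the storage sketch: present a test input whose
# mac codes only partially overlap the trained codes and compute the target
# units' input summations. Sizes and overlaps are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
units_per_mac, active_per_mac, n_input_macs = 20, 5, 3

input_codes = [rng.choice(units_per_mac, active_per_mac, replace=False)
               for _ in range(n_input_macs)]
target_code = rng.choice(units_per_mac, active_per_mac, replace=False)

W = np.zeros((n_input_macs, units_per_mac, units_per_mac), dtype=int)
for m, code in enumerate(input_codes):
    for i in code:
        W[m, i, target_code] = 1      # single-trial storage, as before

# test codes: keep 3, 3, and 2 of the trained units (60%, 60%, 40% overlap),
# replace the rest with previously inactive ("red") units
overlaps = [3, 3, 2]
test_codes = []
for code, k in zip(input_codes, overlaps):
    inactive = np.setdiff1d(np.arange(units_per_mac), code)
    test_codes.append(np.concatenate(
        [code[:k], rng.choice(inactive, active_per_mac - k, replace=False)]))

# input summation at each target unit = number of active test units with a
# weight of 1 onto it
summation = np.zeros(units_per_mac, dtype=int)
for m, code in enumerate(test_codes):
    summation += W[m, code].sum(axis=0)

print("summation at the 5 correct target units:", summation[target_code])
print("max summation at any other target unit: ",
      np.delete(summation, target_code).max())
```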

Now consider one of the "incorrect" (red) input units. By definition, it was not active during the training instance, so it will not have undergone the same efferent weight increases (i.e., onto the 5 active target units) in that instance. Now, assume that other associations will also have been stored. In general, that red unit may have been (correctly) active in one or more of those other training instances, and in those instances, some other set of 5 target units will generally have been active. (It can also happen that in some of those instances, one or more of the 5 target units chosen for the depicted training instance were active as well.) In any case, in those other training instances, said red unit will in general have increased its weights onto target units other than the 5 active in the depicted training instance. Thus, in the depicted test instance, such red units will in general be contributing to the input summations of incorrect target units. But, crucially, the effects (influences upon the target unit summations) of any one red unit will not be 100% correlated, and will usually be far less correlated, or even anticorrelated, with the effects of any of the other red units, not only red units in other input macs, but also other red units in its own mac. So that's it. That's the principle. It's explicitly a hierarchical principle.
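To illustrate the crosstalk argument, the sketch below (again with toy sizes and counts of my own choosing, not Sparsey parameters; the macs are enlarged to 50 units so crosstalk stays modest) stores a number of additional random associations before presenting the partially overlapping test input. The red units now do raise the summations of some incorrect target units, but because their contributions do not converge on any single set of target units, the 5 correct target units still stand out.

```python
# Hypothetical sketch of crosstalk from other stored associations: red units'
# contributions are spread across target units rather than converging, so the
# correct target units keep the highest input summations.
import numpy as np

rng = np.random.default_rng(2)
units_per_mac, active_per_mac, n_input_macs = 50, 5, 3
n_other_associations = 15

def random_code():
    return rng.choice(units_per_mac, active_per_mac, replace=False)

# the association of interest
input_codes = [random_code() for _ in range(n_input_macs)]
target_code = random_code()

W = np.zeros((n_input_macs, units_per_mac, units_per_mac), dtype=int)

def store(in_codes, tgt_code):
    for m, code in enumerate(in_codes):
        for i in code:
            W[m, i, tgt_code] = 1

store(input_codes, target_code)
# store other, unrelated associations; the red units may have been active in these
for _ in range(n_other_associations):
    store([random_code() for _ in range(n_input_macs)], random_code())

# test input: 60%, 60%, 40% overlap with the trained codes, rest are red units
test_codes = []
for code, k in zip(input_codes, [3, 3, 2]):
    inactive = np.setdiff1d(np.arange(units_per_mac), code)
    test_codes.append(np.concatenate(
        [code[:k], rng.choice(inactive, active_per_mac - k, replace=False)]))

summation = sum(W[m, code].sum(axis=0) for m, code in enumerate(test_codes))

correct = summation[target_code]
others = np.delete(summation, target_code)
print("summations at the 5 correct target units:", sorted(correct.tolist(), reverse=True))
print("top 5 summations elsewhere:              ", sorted(others.tolist(), reverse=True)[:5])
```

The correct target units are guaranteed a summation of at least 8 (from the black test units), while each incorrect target unit only accumulates whatever scattered crosstalk the red and black units happen to contribute from other associations.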

The figure below explicitly compares a model (left) in which the principle is operative, because there are multiple input fields (macs), to a model (right) in which the principle cannot be operative, since there is only a single input mac.

[Fig. 2: cross-mac correlation error correction principle]

I have plenty of examples from actual simulations that demonstrate this principle in action, with very strong effects. I'll post them soon. When I do, I'll also elaborate on how this principle constitutes an extremely powerful and general principle for invariance, in particular for part-invariance, i.e., for recognizing entities invariant to which, or how many, of their parts are present in a given instance.