### 2017

- A Radically Novel Explanation of Probabilistic Computing in the Brain. Invited Talk to Xaq Pitkow Lab Weekly Seminar Dec. 18, 2017.
**Abstract:** It is widely believed that the brain computes probabilistically and via some form of population coding. I describe a concept and mechanism of probabilistic computation and learning that differs radically from existing probabilistic population coding (PPC) models. The theory, Sparsey, is based on the idea that items of information (e.g., concepts) are represented as sparse distributed representations (SDRs), i.e., relatively small subsets of cells chosen from a much larger field, where the subsets may overlap, cf. ‘cell assembly’, ‘summary statistic’ (Pitkow & Angelaki, 2017). A Sparsey coding field consists of *Q* winner-take-all (WTA) competitive modules (CMs), each consisting of *K* units. Thus, all codes are of fixed size, *Q*, and the code space has size *K*^*Q*. This allows an extremely simple way to represent the likelihood/probability of a concept: the probability of concept X is simply the fraction of X’s SDR code present in the currently active code. But to make sense, this requires that more similar concepts map to more similar (more highly intersecting) codes (“SISC” property). If SISC is enforced, then any single active SDR code simultaneously represents both a particular concept (at 100% likelihood) and the entire likelihood distribution over all concepts stored in the field (with likelihoods proportional to the sizes of their codes’ intersections with the currently active code). The core of Sparsey is a learning/inference algorithm, the code selection algorithm (CSA), which ensures SISC and which runs in fixed time, i.e., the number of operations needed both to learn (store) a new item and to retrieve the best-matching stored item remains constant as the number of stored items increases (cf. locality-sensitive hashing). Since any SDR code represents the entire distribution, the CSA also realizes fixed-time ‘belief update’. I will describe the CSA and address the neurobiological correspondence of the theory’s elements/processes and highlight relationships with Pitkow & Angelaki, 2017.
- Superposed Episodic and Semantic Memory via Sparse Distributed Representation. Submitted to NIPS Cognitively Informed Artificial Intelligence (CIAI) Workshop 2017.
**Abstract:** The ability to perceive, learn, and use generalities, similarities, and classes, i.e., semantic memory (SM), is central to cognition. Machine learning (ML), neural network, and AI research has been primarily driven by tasks requiring such abilities. However, another central facet of cognition, single-trial formation of permanent memories of experiences, i.e., episodic memory (EM), has had relatively little focus. Only recently has EM-like functionality been added to Deep Learning (DL) models, e.g., Neural Turing Machine, Memory Networks. However, in these cases: a) EM is implemented as a separate module, which entails substantial data movement (and so, time and power) between the DL net itself and EM; and b) individual items are stored localistically within the EM, precluding realizing the exponential representational efficiency of distributed over localist coding. We describe Sparsey, an unsupervised, hierarchical, spatial/spatiotemporal associative memory model differing fundamentally from mainstream ML models, most crucially, in its use of sparse distributed representations (SDRs), or cell assemblies, which admits an extremely efficient, single-trial learning algorithm that maps input similarity into code space similarity (measured as intersection). SDRs of individual inputs are stored in superposition and, because similarity is preserved, the patterns of intersections over the assigned codes reflect the similarity, i.e., statistical, structure, of all orders, not simply pairwise, over the inputs. Thus, SM, i.e., a generative model, is built as a computationally free side effect of the act of storing episodic memory traces of individual inputs, either spatial patterns or sequences. We report initial results on MNIST and on the Weizmann video event recognition benchmarks. While we have not yet attained SOTA class accuracy, learning takes only minutes on a single CPU.
- The Brain’s Computational Efficiency derives from using Sparse Distributed Representations. Rejected from Cognitive Computational Neuroscience 2017.
**Abstract:** Machine learning (ML) representation formats have been dominated by: a) localism, wherein individual items are represented by single units, e.g., Bayes Nets, HMMs; and b) fully distributed representations (FDR), wherein items are represented by unique activation patterns over all the units, e.g., Deep Learning (DL) and its progenitors. DL has had great success vis-a-vis classification accuracy and learning complex mappings (e.g., AlphaGo). But, without massive machine parallelism (MP), e.g., GPUs, TPUs, and thus high power, DL learning is intractably slow. The brain is also massively parallel, but uses only 20 watts and moreover, the forms of MP used in DL, model / data parallelism and shared parameters, are patently non-biological, suggesting DL’s core principles do not emulate biological intelligence. We claim that a basic disconnect between DL/ML and biology and the key to biological intelligence is that instead of FDR or localism, the brain uses sparse distributed representations (SDR), i.e., “cell assemblies”, wherein items are represented by small sets of binary units, which may overlap, and where the pattern of overlaps embeds the similarity/statistical structure (generative model) of the domain. We’ve previously described an SDR-based, extremely efficient, one-shot learning algorithm in which the primary operation is permanent storage of experienced events based on single trials (episodic memory), but in which the generative model (semantic memory, classification) emerges automatically, and as a computationally free, in terms of time and power, side effect of the episodic storage process. Here, we discuss fundamental differences between the mainstream localist/FDR-based and our SDR-based approaches.
- A Radically new Theory of How the Brain Represents and Computes with Probabilities. (arXiv)
**Abstract:** The brain is believed to implement probabilistic reasoning and to represent information via population, or distributed, coding. Most previous population-based probabilistic (PPC) theories share several basic properties: 1) continuous-valued neurons; 2) fully(densely)-distributed codes, i.e., all(most) units participate in every code; 3) graded synapses; 4) rate coding; 5) units have innate unimodal tuning functions (TFs); 6) intrinsically noisy units; and 7) noise/correlation is considered harmful. We present a radically different theory that assumes: 1) binary units; 2) only a small subset of units, i.e., a sparse distributed code (SDC) (cell assembly, ensemble), comprises any individual code; 3) binary synapses; 4) signaling formally requires only single (first) spikes; 5) units initially have completely flat TFs (all weights zero); 6) units are not inherently noisy; but rather 7) noise is a resource generated/used to cause similar inputs to map to similar codes, controlling a tradeoff between storage capacity and embedding the input space statistics in the pattern of intersections over stored codes, indirectly yielding correlation patterns. The theory, Sparsey, was introduced 20 years ago as a canonical cortical circuit/algorithm model, but not elaborated as an alternative to PPC theories. Here, we show that the active SDC simultaneously represents both the most similar/likely input and the coarsely-ranked distribution over all stored inputs (hypotheses). Crucially, Sparsey's code selection algorithm (CSA), used for both learning and inference, achieves this with a single pass over the weights for each successive item of a sequence, thus performing spatiotemporal pattern learning/inference with a number of steps that remains constant as the number of stored items increases. We also discuss our approach as a radically new implementation of graphical probability modeling.
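
The 2017 abstracts above describe reading a likelihood for each stored concept directly off the active code, as the fraction of that concept's code present in the active code. The following is a minimal sketch of that readout only, not of the code selection algorithm (CSA) itself. The codes are hand-picked for illustration, and the names `overlap_fraction`, `likelihoods`, and `codebook` are hypothetical, not taken from Sparsey.

```python
from typing import Dict, Tuple

# A code is one winning unit index (0..K-1) in each of Q competitive modules.
Code = Tuple[int, ...]


def overlap_fraction(active: Code, stored: Code) -> float:
    """Fraction of the stored code's units that are also in the active code."""
    if len(active) != len(stored):
        raise ValueError("codes must span the same number of modules (Q)")
    matches = sum(a == s for a, s in zip(active, stored))
    return matches / len(stored)


def likelihoods(active: Code, codebook: Dict[str, Code]) -> Dict[str, float]:
    """Score every stored concept against the currently active code."""
    return {name: overlap_fraction(active, code) for name, code in codebook.items()}


# Illustrative values only: Q = 5 modules, K = 4 units per module.
# The codes are chosen by hand so that similar concepts share more units
# (the SISC property); a real model would have to learn such codes.
codebook: Dict[str, Code] = {
    "cat": (0, 1, 2, 3, 0),
    "car": (0, 1, 2, 1, 2),
    "dog": (3, 0, 1, 2, 1),
}

scores = likelihoods(codebook["cat"], codebook)
print(scores)
```

With the code for "cat" active, "cat" scores 1.0, "car" scores 0.6 (three of five modules agree), and "dog" scores 0.0, so a single active code yields a graded ranking over everything stored.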

### 2014

- Sparsey™: Event recognition via deep hierarchical sparse distributed codes. (2014) Frontiers in Computational Neuroscience. v. 8 December 2014 | doi: 10.3389/fncom.2014.00160
- Sparse Distributed Coding & Hierarchy: The Keys to Scalable Machine Intelligence. DARPA UPSIDE Year 1 Review Presentation. 3/11/14. (PPT)

### 2013

- A cortical theory of super-efficient probabilistic inference based on sparse distributed representations. 22nd Annual Computational Neuroscience Meeting, Paris, July 13-18. BMC Neuroscience 2013, 14(Suppl 1):P324 (Abstract)
- Constant-Time Probabilistic Learning & Inference in Hierarchical Sparse Distributed Representations, Invited Talk at the Neuro-Inspired Computational Elements (NICE) Workshop, Sandia Labs, Albuquerque, NM, Feb 2013.

### 2012

- Probabilistic Computing via Sparse Distributed Representations. Invited Talk at Lyric Semiconductor Theory Seminar, Dec. 14, 2012.
- Quantum Computation via Sparse Distributed Representation. (2012) Gerard Rinkus. NeuroQuantology 10(2) 311-315.

Abstract: Quantum superposition states that any physical system simultaneously exists in all of its possible states, the number of which is exponential in the number of entities composing the system. The strength of presence of each possible state in the superposition—i.e., the probability with which it would be observed if measured—is represented by its probability amplitude coefficient. The assumption that these coefficients must be represented physically disjointly from each other, i.e., localistically, is nearly universal in the quantum theory/computing literature. Alternatively, these coefficients can be represented using sparse distributed representations (SDR), wherein each coefficient is represented by a small subset of an overall population of representational units and the subsets can overlap. Specifically, I consider an SDR model in which the overall population consists of Q clusters, each having K binary units, so that each coefficient is represented by a set of Q units, one per cluster. Thus, K^Q coefficients can be represented with KQ units. We can then consider the particular world state, X, whose coefficient’s representation, R(X), is the set of Q units active at time t to have the maximal probability and the probabilities of all other states, Y, to correspond to the size of the intersection of R(Y) and R(X). Thus, R(X) simultaneously serves both as the representation of the particular state, X, and as a probability distribution over all states. Thus, set intersection may be used to classically implement quantum superposition. If algorithms exist for which the time it takes to store (learn) new representations and to find the closest-matching stored representation (probabilistic inference) remains constant as additional representations are stored, this would meet the criterion of quantum computing. Such algorithms, based on SDR, have already been described. 
They achieve this "quantum speed-up" with no new esoteric technology, and in fact, on a single-processor, classical (Von Neumann) computer.
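
As a worked instance of the counting argument in the abstract above, take the illustrative values Q = 10 and K = 8 (chosen here for the arithmetic, not taken from the paper). Normalizing the intersection size by Q is likewise an assumption made here so that the score lies between 0 and 1.

$$
\begin{aligned}
\text{units} &= KQ = 8 \times 10 = 80, \\
\text{representable coefficients} &= K^{Q} = 8^{10} = 2^{30} = 1{,}073{,}741{,}824, \\
\text{score}(Y \mid X) &= \frac{|R(X) \cap R(Y)|}{Q}, \qquad \text{score}(X \mid X) = 1 .
\end{aligned}
$$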

### 2010

- A cortical sparse distributed coding model linking mini- and macrocolumn-scale functionality. (2010) Gerard Rinkus. Frontiers in Neuroanatomy 4:17. doi:10.3389/fnana.2010.00017

### 2009

- Familiarity-Contingent Probabilistic Sparse Distributed Code Selection in Cortex. (**in prep**, also see this page)
- Overcoding-and-Pruning: A Novel Neural Model of Temporal Chunking and Short-term Memory. (2009) Gerard Rinkus. Invited Talk in Gabriel Kreiman Lab, Dept. of Ophthalmology and Neuroscience, Children's Hospital, Boston, July 31, 2009.
- Overcoding-and-paring: a bufferless neural chunking model. (2009) Gerard Rinkus. Frontiers in Computational Neuroscience. Conference Abstract: Computational and systems neuroscience. (COSYNE '09) doi: 10.3389/conf.neuro.10.2009.03.292

### 2008

- Population Coding Using Familiarity-Contingent Noise. (abstract/poster) *AREADNE 2008: Research in Encoding And Decoding of Neural Ensembles*, Santorini, Greece, June 26-29. (abstract) (poster)
- Overcoding-and-pruning: A novel neural model of sequence chunking (manuscript **in prep**) **-- Patented**

Abstract: We present a radically new model of chunking, the process by which a monolithic representation emerges for a sequence of items, called overcoding-and-pruning (OP). Its core insight is this: if a sizable population of neurons is assigned to represent an ensuing sequence immediately, at sequence start, it can then be repeatedly pruned as a function of each successive item. This solves the problem of assigning unique chunk representations to sequences that start in the same way, e.g., "CAT" and "CAR", without requiring temporary buffering of the items' representations. OP rests on two well-supported assumptions: 1) information is represented in cortex by sparse distributed representations; and 2) neurons at progressively higher cortical stages have progressively longer activation duration, or persistence. We believe that this type of mechanism has been missed so far due to the historical bias of thinking in terms of localist representations, which cannot support it since pruning cannot be applied to a single representational unit.
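
The pruning idea in the abstract above can be illustrated with a toy sketch. It is not the OP model: a deterministic hash stands in for whatever neural mechanism decides which units survive each item, the first item is treated as a pruning step like any other, and the names `survives` and `overcode_and_prune` and the population size of 256 are assumptions made for this example.

```python
import hashlib
from typing import FrozenSet


def survives(unit: int, position: int, item: str) -> bool:
    """Hypothetical stand-in rule: a unit survives an item about half the time."""
    digest = hashlib.sha256(f"{unit}:{position}:{item}".encode()).digest()
    return digest[0] % 2 == 0


def overcode_and_prune(sequence: str, n_units: int = 256) -> FrozenSet[int]:
    """Start with a large active population, then prune it once per item."""
    population = set(range(n_units))  # overcoding: assigned at sequence start
    for position, item in enumerate(sequence):
        # pruning: keep only the units compatible with the item just seen
        population = {u for u in population if survives(u, position, item)}
    return frozenset(population)


prefix = overcode_and_prune("CA")
cat = overcode_and_prune("CAT")
car = overcode_and_prune("CAR")
print(len(prefix), len(cat), len(car))
```

Because "CAT" and "CAR" share the prefix "CA", both chunk codes are subsets of the same intermediate population and diverge only when the third item arrives, and no item representation is buffered along the way. With a single representational unit there would be nothing left to prune, which is the abstract's point about localist codes.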

### 2007

- A Functional Role for the Minicolumn in Cortical Population Coding. Invited Talk at *Cortical Modularity and Autism*, University of Louisville, Louisville, KY, Oct 12-14, 2007. (PPT) (pdf) Animations do not show in pdf version.

### 2006

- Hierarchical Sparse Distributed Representations of Sequence Recall and Recognition. Presentation given at The Redwood Center for Theoretical Neuroscience (University of California, Berkeley) on Feb 22, 2006. (PPT) (video) (Note: PPT presentation uses heavy animations)

### 2005

- Time-Invariant Recognition of Spatiotemporal Patterns in a Hierarchical Cortical Model with a Caudal-Rostral Persistence Gradient (2005) (poster) Rinkus, G. J. & Lisman, J. *Society for Neuroscience Annual Meeting, 2005*. Washington, DC. Nov 12-16. Note that this poster is almost identical to the one presented at the First Annual Computational Cognitive Neuroscience Conference.
- A Neural Network Model of Time-Invariant Spatiotemporal Pattern Recognition (2005) (abstract) Rinkus, G. J. *First Annual Computational Cognitive Neuroscience Conference*, Washington, DC, Nov. 10-11.

### 2004 and earlier

- A Neural Model of Episodic and Semantic Spatiotemporal Memory (2004) Rinkus, G.J. *Proceedings of the 26th Annual Conference of the Cognitive Science Society*. Kenneth Forbus, Dedre Gentner & Terry Regier, Eds. LEA, NJ. 1155-1160. Chicago, Ill. (pdf) A Quicktime animation that walks you through the example in Figure 4 of the paper.

- Software tools for emulation and analysis of augmented communication. (2003) Lesher, G.W., Moulton, B.J., Rinkus, G. & Higginbotham, D.J. *CSUN 2003*, California State University, Northridge.
- Adaptive Pilot-Vehicle Interfaces for the Tactical Air Environment. (2001) Mulgund, S.S., Zacharias, G.L., & Rinkus, G.J. in Psychological Issues in the Design and Use of Virtual Adaptive Environments. Hettinger, L.J. & Haas, M. (Eds.) LEA, NJ. 483-524.
- Leveraging word prediction to improve character prediction in a scanning configuration. (2002) Lesher, G.W. & Rinkus, G.J. *Proceedings of the RESNA 2002 Annual Conference.* Reno. (pdf)
- Domain-specific word prediction for augmentative communications (2001) Lesher, G.W. & Rinkus, G.J. *Proceedings of the RESNA 2002 Annual Conference*, Reno. (pdf)
- Logging and analysis of augmentative communication. (2000) Lesher, G.W., Rinkus, G.J., Moulton, B.J., & Higginbotham, D.J. *Proc. of the RESNA 2000 Annual Conference*, Reno. 82-85.
- Intelligent fusion and asset manager processor (IFAMP). (1998) Gonsalves, P.G. & Rinkus, G.J. *Proc. of the IEEE Information Technology Conference* (Syracuse, NY) 15-18. (pdf)
- A Monolithic Distributed Representation Supporting Multi-Scale Spatio-Temporal Pattern Recognition (1997) *Int'l Conf. on Vision, Recognition, Action: Neural Models of Mind and Machine*, Boston University, Boston, MA May 29-31. (abstract)
- Situation Awareness Modeling and Pilot State Estimation for Tactical Cockpit Interfaces. (1997) Mulgund, S., Rinkus, G., Illgen, C. & Zacharias, G. Presented at *HCI International*, San Francisco, CA, August. (pdf)
- OLIPSA: On-Line Intelligent Processor for Situation Assessment. (1997) S. Mulgund, G. Rinkus, C. Illgen & J. Friskie. *Second Annual Symposium and Exhibition on Situational Awareness in the Tactical Air Environment*, Patuxent River, MD. (pdf)
- A Neural Network Based Diagnostic Test System for Armored Vehicle Shock Absorbers. (1996) Sincebaugh, P., Green, W. & Rinkus, G. *Expert Systems with Applications*, **11**(2), 237-244.
- A Combinatorial Neural Network Exhibiting Episodic and Semantic Memory Properties for Spatio-Temporal Patterns (1996) G. J. Rinkus. Doctoral Thesis. Boston University. Boston, MA. (pdf)
- TEMECOR: An Associative, Spatiotemporal Pattern Memory for Complex State Sequences. (1995) *Proceedings of the 1995 World Congress on Neural Networks*. LEA and INNS Press. 442-448. (pdf)
- Context-sensitive spatio-temporal memory. (1993) *Proceedings of World Congress On Neural Networks*. LEA. v.2, 344-347.
- Context-sensitive Spatio-temporal Memory. (1993) Technical Report CAS/CNS-93-031, Boston University Dept. of Cognitive and Neural Systems. Boston, MA. (pdf)
- A Neural Model for Spatio-temporal Pattern Memory (1992) *Proceedings of the Wang Conference: Neural Networks for Learning, Recognition, and Control*, Boston University, Boston, MA
- Learning as Natural Selection in a Sensori-Motor Being (1988) *Proceedings of the 1st Annual Conference of the Neural Network Society*, Boston.
- Learning as Natural Selection in a Sensori-Motor Being (1986) G.J. Rinkus. Master's Thesis. Hofstra University, Hempstead, NY.