A Unified Information Theory of Subjective Cognition
Jian-Sheng Kang
Submitted 2025-09-22 | ChinaXiv: chinaxiv-202509.00067

Abstract

Consciousness is considered the most challenging problem in the science of the mind, especially Chalmers' hard problem. From the standpoint of systems science philosophy, many consciousness theories overlook the structure-function correlation principle, thereby reducing their explanatory power regarding consciousness. Therefore, we first introduce a philosophy of systems science. Based on the structure-function correlation principle, we recognize that the neck structure of the dendritic spine is key to information encoding in the frequency domain. Consequently, the entire work is discussed within the frequency domain. At the level of sensory modalities, the intrinsic function of simple auditory neurons encodes information as a wave function. Subsequently, at the mesoscopic level, we successfully resolve the quantum mechanism for frequency adaptation, which represents an energy constraint. In particular, we find that the accommodation mechanism and the multi-layered retina are specialized to employ the principles of Fourier optics and quantum optics. Specifically, the stratified structures, positions, and functions of retinal ganglion cells and bipolar cells are specialized for the Fock states in quantum optics. The entire brain can be considered as a neuronal quantal field, comprising billions of neurons and trillions of synapses, which function intrinsically as wave functions and perceptual units, respectively. Furthermore, we address both the "hard problem" and "easy problems" of consciousness by analyzing them in the frequency domain, thereby fully resolving the pseudo-metaphysical hard problem. In summary, understanding the mechanisms of neuronal function and consciousness represents a significant achievement, and this Unified Information Theory (UIT) provides a bridge between microscopic and macroscopic levels. Additionally, building upon the identification of time perception as an intrinsic manifestation of causal asymmetry and a representation of imaginary temporal dynamics, we present a quantum geometric derivation of Newton's gravitational constant and resolve the cosmological constant problem through discrete spacetime structures, unified by Hodge duality mechanisms.

Full Text

Preamble

A Unified Information Theory of Subjective Cognition: Quantum Information Processing from Neuron to Consciousness

Jian-Sheng Kang*
Clinical Systems Biology Laboratories, The First Affiliated Hospital of Zhengzhou University, Zhengzhou 450052, China

*Correspondence should be addressed to J.-S. Kang, Clinical Systems Biology Laboratories, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, 450052, China. E-mail: kjs@zzu.edu.cn ORCID ID: 0000-0002-2603-9718.

Abstract: Consciousness represents the most challenging problem in the science of the mind, particularly Chalmers' hard problem. From a systems science philosophy perspective, many consciousness theories overlook the structure-function correlation principle, thereby reducing their explanatory power. We therefore introduce a philosophy for systems science and, based on the structure-function correlation principle, identify the dendritic spine neck as a key structure for frequency-domain information encoding. Consequently, our entire framework operates in the frequency domain. At the sensory modality level, simple auditory neurons encode information intrinsically as wave functions. At the mesoscopic level, we resolve the quantum mechanism for frequency adaptation as a representation of energy constraints, finding that accommodation mechanisms and the multi-layered retina specialize in employing principles from Fourier optics and quantum optics. Specifically, the stratified structures, positions, and functions of retinal ganglion cells and bipolar cells are specialized for Fock states in quantum optics. The entire brain can be considered a neuronal quantal field comprising billions of neurons and trillions of synapses that function intrinsically as wave functions and perceptive units, respectively. We address both the "hard problem" and "easy problems" of consciousness through frequency-domain analysis, thereby dissolving the pseudo-metaphysical hard problem. This Unified Information Theory (UIT) provides a thrilling bridge between microscopic and macroscopic levels. Building on time perception as an intrinsic manifestation of causal asymmetry and imaginary temporal dynamics, we derive a quantum geometric expression for Newton's gravitational constant and resolve the cosmological constant problem through discrete spacetime structures unified by Hodge duality mechanisms. Neural systems encode information through frequency adaptation that aligns spike intervals with prime-rich regions, realizing the Hilbert-Pólya conjecture and providing a novel pathway to explore deep connections between biology, geometry, and number theory, offering insights toward proving the Riemann Hypothesis.

Keywords: Frequency adaptation; Frequency Domain; Quantum Information; Unified Information Theory (UIT); Hilbert-Pólya conjecture; Riemann Hypothesis; Consciousness; Systems Science Philosophy

Preface: A Philosophy of Systems Science

The mind-body relationship constitutes both a scientific problem and a philosophical theme. Surveys reveal that 73.3% of Americans and 43.2% of Europeans believe in an afterlife, reflecting a dualistic view that the mind can survive death (Riekki et al., 2013). Similarly, most college students and the general public subscribe to dualism—the belief that mind and brain are separable (Demertzi et al., 2009). In Introduction to Systems Philosophy (Laszlo, 1972), Laszlo rejected dualism and pluralism in favor of holism and system monism. However, the prevalence of dualistic views indicates problematic gaps in scientific explanations of the mind-body relationship. We therefore briefly review relevant philosophical history and introduce a philosophy for systems science.

Western dualism is closely linked to René Descartes' meditations (1641), which posited that the conscious, self-aware mind is nonphysical. In East Asia, Zhu Xi's Neo-Confucian Li-Qi dualism has been influential since China's Southern Song Dynasty. "Li" represents abstract, unchanging universal patterns governing the cosmos, while "Qi" constitutes the tangible, mutable substance of the material world (Fung and Bodde, 1942; 1948). This dualism originates from the systematic worldview of the I Ching (Book of Changes), whose "Xici" appendix states: "There is in the Changes the Great Primal Beginning (Taiji). This generates the two primary forces. The two primary forces generate the four images. The four images generate the eight trigrams" (Wilhelm and Baynes, 1967). The Great Primal Beginning (Taiji) represents a monistic concept.

In the West, Austrian biologist Karl Ludwig von Bertalanffy (1901-1972) founded general systems theory, while Chinese scientist Qian Xuesen (1911-2009) pioneered engineering cybernetics in the East. In a 1982 report, Qian advocated integrating general systems theory, information theory, and cybernetics within the broader field of systems science, with systems theory at the center (Qian, 1984). Subsequently, Wei Hongsen and Zeng Guoping's System Theory—Systems Science Philosophy (1995) expanded Qian's concept into systematicism, a philosophy of systems science grounded in general system theory, information theory, cybernetics, dissipative structure theory, synergetics, hypercycle theory, catastrophe theory, chaos theory, and fractal theory. They condensed systematicism into eight principles: integrity, hierarchy, openness, teleonomy/purposefulness, catastrophe, stability, self-organization, and similarity, plus five fundamental laws: structure-function correlation, feedback in information, competition and cooperation, order emerging from fluctuations, and optimization of evolutionary processes.

Embracing systematicism is essential for addressing today's intricate challenges, requiring interdisciplinary approaches that weave together insights from multiple disciplines, particularly when examining theoretical constructs and fundamental understandings of neurons, brain functions, and consciousness. Our analyses will primarily rely on the laws of structure-function correlation and optimization of evolutionary processes, along with the principles of integrity, hierarchy, openness, teleonomy, and stability. We prioritize logical coherence and clarification of fundamental concepts over rigorous mathematical and physical details, suggesting additional reference materials for deeper comprehension of terminologies when necessary.

1. Matter (Structure), Energy and Information

From a physical perspective, the brain functions as an open system exhibiting integrity through coordinated interactions of hierarchical components, openness through exchange of matter, energy, and information with its environment, and stability through maintenance of functional states despite external disturbances. Matter, energy, and information constitute three indispensable elements for preserving brain integrity and stability over time. Time remains fundamental, though often implicit in rate/frequency domain discussions—analogous to projected or reciprocal space in mathematical and physical contexts. Frequency domain representation offers a more concise and lucid description than time domain, circumventing the need to engage with abundant dynamic parameters and intricate temporal details that have been intensively documented, such as the biophysics of spike timing in single neurons (Koch, 1998) and spike rates as Bayesian posterior probabilities (Rieke et al., 1999).

The brain represents an evolutionarily optimized structure that emerged under energy constraints. Its hierarchical structures and functions evolved for information processing through storage, encoding, decoding, transmission, and quantitation. Concurrently, precise definition of information—including its emergent properties—holds the key to unlocking neuronal and brain mechanisms. Our subsequent analyses dissect the links between the brain's hierarchical structures and their information-processing functions.

1.1 Information Storage in Binary Form

The brain constitutes approximately 2% of human body mass yet accounts for roughly 20% of the body's energy consumption (Raichle and Gusnard, 2002). Notably, brain energy efficiency significantly surpasses contemporary computing and AI systems by approximately 9×10⁸ times (Stiefel and Coggan, 2023), likely resulting from evolutionary optimization under energetic constraints. This high efficiency indicates that the all-or-none action potential represents an efficient form for storing and processing information at any given moment.

This concept is supported by Norbert Wiener's conclusion in Chapter V of the 1965 second edition of Cybernetics, which established the minimum cost required to reliably store one bit of information in a binary system (Wiener, 1965). These constraints apply not only to information stored in action potentials but also likely pertain to long-term information storage via synaptic structure stability, recurrent action potential patterns, and potential epigenetic modifications (Santoni et al., 2024).

Beyond its all-or-none nature, the action potential exhibits a short refractory period and shows adaptation to stimulation; furthermore, neurons encode information intensity as impulse frequency (Adrian and Zotterman, 1926; Koch, 1998; Kandel et al., 2021). Action potentials initiate at the distal axon initial segment (AIS) [FIGURE:1], where high concentrations of low-threshold sodium channel Nav1.6 reside. Concurrently, the proximal AIS region enriched with high-threshold sodium channel Nav1.2 is crucial for action potential backpropagation (Hu et al., 2009). Upon arrival at the presynaptic terminal, voltage-gated calcium channels open, allowing calcium influx detected by the calcium sensor synaptotagmin (Xu et al., 2007), which activates the SNARE complex to facilitate neurotransmitter release from presynaptic vesicles (Südhof and Rothman, 2009).

1.2 The Release Probability of Presynaptic Vesicles

Synaptic vesicle pools, containing from nearly a hundred to approximately a million vesicles, can be categorized into the readily releasable pool (RRP) docked at the presynaptic active zone, the recycling pool, and the reserve pool (Rizzoli and Betz, 2005) [FIGURE:1]. In contrast, the average number of synaptic vesicles per active zone is around two hundred (Rizzoli and Betz, 2005). RRPs of rat hippocampal varicosities contain ~5-10 vesicles (Schikorski and Stevens, 2001). Similarly, RRPs per active zone have ~2-9 vesicles in the rat Calyx of Held (~600 active zones) (Sätzler et al., 2002), a giant presynaptic terminal in the auditory system with a total RRP size around ~3000-5000 vesicles (personal communication) (Guo et al., 2015).

[TABLE:1] contrasts time domain and frequency domain viewpoints. In Chapter 4 of Biophysics of Computation: Information Processing in Single Neurons (Koch, 1998), synaptic input is modeled in the time domain from the dendritic tree perspective. The probability of quantal release is typically represented using a binomial distribution, while synaptic weight is averaged over a Poisson-distributed presynaptic spike pattern (Koch, 1998). Rate theory regards spike rates as Bayesian posterior probabilities (Rieke et al., 1999). Currently, activation functions in artificial neural networks, such as the Rectified Linear Unit (ReLU), emulate neurotransmitter release to introduce nonlinearity.

In the frequency domain, we initially focus on a single active zone within the presynaptic terminal. The RRP, as its name suggests, contains vesicles ready for immediate release upon action potential arrival. As noted above, RRP size is usually less than 5% of the total vesicles per active zone. Considering that synaptic transmission is stochastic and release probability can be quite low (Koch, 1998), the Poisson distribution appropriately models vesicle release probability. More precisely, the non-homogeneous Poisson process is crucial for presynaptic modeling, especially when release probability varies over time or with synaptic activity, such as through presynaptic inhibition modulation (Naumann and Sprekeler, 2020).

Mathematically, random action potential arrivals at presynaptic terminals can be modeled using Bernoulli trials with time-varying firing rate λ(t). The probability of firing within interval Δt is λ(t)•Δt. Given that k vesicles are released per action potential, the probability p(k; λ(t)Δt) follows a non-homogeneous Poisson process generalized from the Poisson process with independent but nonstationary increments (Ross, 1996; Lu and Zhang, 2012):

$$p(k; \lambda(t)\Delta t) = \frac{(\lambda(t)\Delta t)^k e^{-\lambda(t)\Delta t}}{k!} \tag{1.2.1}$$

When λ(t)•Δt is time-invariant, Equation 1.2.1 becomes a standard Poisson distribution. Intriguingly, for simplicity and without loss of generality, the standard Poisson distribution suffices to discuss RRP size per presynaptic active zone [FIGURE:1]. As demonstrated in [FIGURE:2], if Δt is brief (e.g., 1 ms) and λ(t)Δt ≤ 1, the probability of k ≥ 5 vesicles being released concurrently during interval Δt approximates zero, implying that 5-10 vesicles in the RRP per active zone may encompass full probabilities for short intervals.
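As a quick numerical check of Equation 1.2.1, the short sketch below (Python with the standard library; the rate value is an illustrative assumption, not a measured quantity) evaluates the Poisson tail probability of k ≥ 5 vesicles when λ(t)Δt ≤ 1.

```python
import math

def poisson_pmf(k, rate):
    """P(k; lambda*dt) from Equation 1.2.1 with a constant rate."""
    return (rate ** k) * math.exp(-rate) / math.factorial(k)

# With dt ~ 1 ms and lambda(t)*dt <= 1, the probability that 5 or more
# vesicles are released within a single interval is effectively zero.
rate = 1.0  # assumed value of lambda(t) * dt
p_ge_5 = 1.0 - sum(poisson_pmf(k, rate) for k in range(5))
print(f"P(k >= 5 | lambda*dt = {rate}) = {p_ge_5:.5f}")  # ~0.004
```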

1.3 The Function of Dendritic Spine in the Frequency Domain

We now examine the postsynaptic terminal model in the frequency domain [TABLE:1]. As shown in [FIGURE:1], the spine neck has an average diameter of 0.2 ± 0.06 μm for layer 2/3 pyramidal cells from mouse visual cortex (Arellano et al., 2007). In the time domain, spine necks function as diffusion barriers for large proteins, small ions, and molecules (Adrian et al., 2014; Tønnesen et al., 2014). Molecular diffusion is acutely sensitive to spine neck width variations and inversely proportional to cross-sectional area (Tønnesen et al., 2014), implying that spine neck structure dictates diffusion rates. Consequently, the spine neck's role as a diffusion barrier can be accurately modeled using a point spread function (PSF), with the one-dimensional sinc(x) function, defined as sin(x)/x with sinc(x=0) = 1, serving as an appropriate and simple example. The sinc function is fundamental in probability and information theory (Woodward and Davies, 1952).

An interesting property is that the sums or integrals of sinc(x) and sinc²(x) over all real numbers are equal (Baillie et al., 2008), simplifying extension from one to two dimensions. The sinc function possesses remarkable properties in summation, multiplication, and integration (Kac, 1959; Baillie et al., 2008; Ortiz Gracia and Oosterlee, 2016). As illustrated in [FIGURE:3], the Fourier transform of the sinc function is a finite rectangle window in the frequency domain, signifying that dendritic spine structure effectively represents a finite frequency response.
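To make the sinc-to-rectangle correspondence concrete, here is a minimal numerical sketch (assuming Python with NumPy; the grid size and band edges are illustrative choices) that approximates the Fourier transform of sinc(x) = sin(x)/x and checks that its magnitude spectrum is roughly flat inside |ω| < 1 and near zero outside, i.e., a finite rectangular window.

```python
import numpy as np

# Sample sinc(x) = sin(x)/x on a wide, fine grid and take its FFT.
x = np.linspace(-200.0, 200.0, 2 ** 14)
dx = x[1] - x[0]
signal = np.sinc(x / np.pi)          # numpy's sinc(t) is sin(pi*t)/(pi*t)
spectrum = np.abs(np.fft.fftshift(np.fft.fft(signal))) * dx
freqs = np.fft.fftshift(np.fft.fftfreq(x.size, d=dx)) * 2 * np.pi  # rad/s

# Inside |omega| < 1 the magnitude approximates pi; outside it drops to ~0,
# reproducing the finite rectangular frequency window of the sinc function.
inside = spectrum[np.abs(freqs) < 0.8].mean()
outside = spectrum[np.abs(freqs) > 1.2].mean()
print(f"mean |F| inside band: {inside:.2f}, outside band: {outside:.3f}")
```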

1.4 Neuron as a Summator of Synchronous Frequencies and a Bernoulli Binary Generator

Having discussed presynaptic and postsynaptic terminal structures, functions, and models, we can now address synaptic integration, which is fundamentally a convolution process between the presynaptic input signal and the postsynaptic impulse response. In the frequency domain, overall dendritic output results from superposition of individual outputs obtained by convolving presynaptic input with various finite frequency responses of spines on the dendritic tree. Convolving the non-homogeneous Poisson distribution parameterized by λ(t)•Δt (Equation 1.2.1) with finite frequency responses of spines eliminates the time interval Δt due to the scaling property of Fourier transformation [FIGURE:3]. Additionally, given the all-or-none nature of action potentials, amplitude scaling becomes redundant. As dendritic trees receive presynaptic spikes conforming to a Poisson distribution (Koch, 1998), and considering vesicle release probability characterized by the non-homogeneous Poisson process with rate parameter λ(t) (Equation 1.2.1), the resulting postsynaptic activity can be described as:

$$B(t) = \sum_{i=1}^{N} \lambda_i(t)^{k_i} e^{-\lambda_i(t)} \tag{1.4.1}$$

where B(t) represents the summation of individual spine outputs in response to presynaptic inputs, and N denotes the substantial number of postsynaptic spines—exemplified by over 17,000 spines on mouse layer III pyramidal neurons (Ballesteros-Yáñez et al., 2006) and up to 97,853 spines on human Purkinje cell dendrites (Masoli et al., 2024).

Equation 1.4.1 can also be derived by modeling neuronal function as a busy period B(t) in an M/G/1 queueing system using its AIS and dendritic spines, where spikes arrive following a Poisson process with rate λ(t). Here, M denotes "Markovian" arrivals (memoryless Poisson process inter-arrival times), G represents "General" service time distribution (non-exponential, any general pattern), and 1 indicates a single server (AIS) (Ross, 1996, section 2.3.1). Alternatively, as discussed in Section 2.2.2.1, Equation 1.4.1 can be formulated by treating synapses as coherent states, since the n-th number state in a coherent state follows a Poisson distribution in quantum optics (Gerry and Knight, 2005).

Moreover, as signaling molecules such as Na⁺ from excitatory synapses and Cl⁻ from inhibitory synapses diffuse along ion gradients through the dendritic tree to the distal AIS to trigger or inhibit action potentials, the dendritic tree structure can enhance synchronous frequencies while attenuating non-synchronous frequencies. This effectively transforms a conditional compound Poisson process into more discrete, all-or-none behavior resembling a Bernoulli process. Intriguingly, given that simple cells receive one simple type of input (Hubel and Wiesel, 1962), Equation 1.4.1 implies that simple cells function as linear time-invariant (LTI) units in the frequency domain [FIGURE:4], yielding Bernoulli binary outputs with an average firing rate λ(t). For clearer understanding of λ(t), refer to Equation 2.1.3 or coherent states in Section 2.2.2.1. Thus, a neuron functions as both a summator of synchronous frequencies and a Bernoulli binary generator.
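A minimal simulation sketch of Equation 1.4.1 follows (Python with NumPy); the spine count, per-spine rates λ_i(t), vesicle counts k_i, and the firing threshold are hypothetical placeholders used only to illustrate how summed spine outputs yield an all-or-none, Bernoulli-like output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dendritic tree of N spines; rates and vesicle counts are
# illustrative, not fitted to experimental data.
N = 17_000
rates = rng.uniform(0.1, 1.0, size=N)     # momentary lambda_i(t) per spine
ks = rng.poisson(rates)                   # k_i vesicles released per spine

# Per-spine term lambda_i(t)^k_i * exp(-lambda_i(t)) and their sum (Eq. 1.4.1).
terms = rates ** ks * np.exp(-rates)
B_t = terms.sum()

# Bernoulli binary output: an all-or-none spike when the summed synchronous
# drive crosses a hypothetical threshold at the AIS.
threshold = 0.4 * N
spike = int(B_t > threshold)
print(f"B(t) = {B_t:.1f}, threshold = {threshold:.0f}, spike = {spike}")
```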

1.5 The Bernoulli Process and Ergodic Properties

The Bernoulli process models random phenomena where each trial yields success or failure. Named after Jacob Bernoulli, this process describes neuronal function in the frequency domain [FIGURE:4], offering both benefits and drawbacks. The independence of Bernoulli trials—guaranteed throughout the action potential refractory period [FIGURE:1]—enables quick adaptability and decision-making based solely on current information without considering prior outcomes (Markov property). However, ignoring long-term dependencies is disadvantageous since the Bernoulli process is memoryless.

The discrete nature of Bernoulli processes can conflict with continuous sensory data streams, making ergodicity essential for resolution. A Bernoulli process is ergodic if it satisfies certain conditions, such as simple cells with time-invariant frequencies in the frequency domain discussed above. Ergodicity is crucial in statistical mechanics, probability theory, functional analysis, and dynamical systems theory (Ashley, 2015). In statistical mechanics, ergodicity ensures equivalence between time averages and ensemble averages. Ergodic systems are characterized by stationary or invariant measures, signifying time-invariant probability distributions of system states, such as Bernoulli shifts and Markov shifts (Quas, 2011). Additionally, a non-homogeneous Poisson process driven by an almost periodic intensity function possesses ergodic properties (Rolski, 1990), providing a natural rationale for the vesicle release probability in presynaptic terminals adopting a non-homogeneous Poisson distribution (Equation 1.2.1 and [FIGURE:4]).
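The sketch below (Python with NumPy; the firing probability and sample sizes are arbitrary) illustrates the ergodic property for a time-invariant Bernoulli process: the time average over one long spike train and the ensemble average across many identical trials both converge to p.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.3  # assumed time-invariant firing probability of a "simple cell"

# Time average: one long spike train from a single neuron.
time_avg = rng.binomial(1, p, size=100_000).mean()

# Ensemble average: one time bin across many identical, independent neurons.
ensemble_avg = rng.binomial(1, p, size=100_000).mean()

# For an ergodic Bernoulli process the two averages coincide (both ~ p).
print(f"time average = {time_avg:.4f}, ensemble average = {ensemble_avg:.4f}")
```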

1.6 The Modular Group and Congruency Subgroups

Bernoulli trial outcomes (failure or success) can be expressed as {0, 1}, forming an additive cyclic group Z₂ under addition modulo 2, or as sign changes {-1, 1}, forming a multiplicative cyclic group isomorphic to Z₂. The success outcome (action potential) always maps to 1.

Before discussing group representations, we must introduce certain concepts for clarity, particularly given potential confusion from texts emphasizing non-trivial groups while disregarding trivial ones. A group is a closed set equipped with a binary operation adhering to associativity, identity element existence, and inverse element existence (Simon, 1996). A cyclic group is generated by a single element. The digit 1 and identity matrices satisfy multiplicative group and cyclic group definitions with the simplest structures—trivial yet fundamentally important for later discussions.

In mathematics, the modular group is represented by the projective special linear group PSL(2, ℤ), consisting of 2×2 matrices with integer coefficients and determinant 1 (Diamond and Shurman, 2006). To streamline notation in the projective frequency domain, we use the special linear group SL(2, ℤ) to denote the modular group without ambiguity:

$$SL(2, \mathbb{Z}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z},\ ad - bc = 1 \right\} \tag{1.6.1}$$

The modular group (1.6.1) is generated via fractional linear transformation:

$$A(\tau) = \frac{a\tau + b}{c\tau + d} \tag{1.6.2}$$

with two matrix generators: the translation matrix T (1.6.3) and the inversion matrix S (1.6.4), given in their standard form together with the identity matrix I (1.6.5) and the projective relation (1.6.6):

$$T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad S^2 = -I \equiv I \tag{1.6.3-1.6.6}$$

The fractional linear transformation maps the modular group from the complex upper half-plane onto itself (Im(τ) > 0) [FIGURE:5], where τ is a complex number with positive imaginary part (Serre, 1973; Apostol, 2012). Note that matrices A and -A are identical in the modular group, making it a quotient group SL(2, ℤ)/{I, -I} in projective space, so S² = I in Equation 1.6.6.

Beyond {0, 1} and sign representations {-1, 1}, Z₂ can be expressed as 2×2 matrix representations. The Pauli matrices form a set of three 2×2 complex matrices (σₓ, σᵧ, σ_z):

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{1.6.7-1.6.9}$$

The 2×2 matrix representation of Z₂ includes the identity matrix I (1.6.5) and either Pauli σₓ (1.6.7) or σ_z (1.6.9), which generates the multiplicative cyclic group Z₂. The action potential can be mapped to the 2×2 identity matrix I, while mapping Bernoulli failure outcomes to σₓ or σ_z. As discussed, the action potential expressed as I alone forms a trivial group. Moreover, the identity matrix I belongs to every principal congruence subgroup of SL(2, ℤ), denoted as the gamma congruence subgroup Γ(n):

$$\Gamma(n) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} (\text{mod } n) \right\} \tag{1.6.10}$$

All elements of Γ(n) are matrices congruent to the identity matrix modulo n, where diagonal elements a and d ≡ 1 (mod n) and off-diagonal elements b and c ≡ 0 (mod n). The principal congruence subgroup Γ(2) is the smallest non-trivial congruence subgroup of SL(2, ℤ), and the action potential expressed as the 2×2 identity matrix I as a trivial group encapsulates the simplest action form within Γ(2) of SL(2, ℤ).

Beyond principal congruence subgroups, the notable congruence subgroups Γ₁(n) and Γ₀(n) hold significance for later classifications:

$$\Gamma_1(n) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} 1 & * \\ 0 & 1 \end{pmatrix} (\text{mod } n) \right\} \tag{1.6.11}$$

$$\Gamma_0(n) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \equiv \begin{pmatrix} * & * \\ 0 & * \end{pmatrix} (\text{mod } n) \right\} \tag{1.6.12}$$

Every congruence subgroup Γ has finite index in SL(2, ℤ) (Diamond and Shurman, 2006), and these subgroups follow the inclusion hierarchy {I} ⊂ Γ(n) ⊂ Γ₁(n) ⊂ Γ₀(n) ⊂ SL(2, ℤ), with Γ(n) ⊆ Γ(2) whenever n is a multiple of 2.
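As a concrete illustration of Equations 1.6.10-1.6.12, the following sketch (Python with NumPy; the classification logic is a simplified reading of the congruence conditions, not the authors' algorithm) tests 2×2 integer matrices, including the standard generators T and S, for membership in Γ(n), Γ₁(n), and Γ₀(n) with n = 2.

```python
import numpy as np

def classify(M, n):
    """Classify a 2x2 integer matrix in SL(2, Z) modulo n (Eqs. 1.6.10-1.6.12)."""
    a, b, c, d = int(M[0, 0]), int(M[0, 1]), int(M[1, 0]), int(M[1, 1])
    assert a * d - b * c == 1, "not in SL(2, Z)"
    if (a % n, b % n, c % n, d % n) == (1 % n, 0, 0, 1 % n):
        return "Gamma(n)"      # congruent to the identity matrix mod n
    if a % n == 1 % n and c % n == 0 and d % n == 1 % n:
        return "Gamma_1(n)"    # diagonal entries 1, lower-left entry 0 mod n
    if c % n == 0:
        return "Gamma_0(n)"    # only the lower-left entry constrained mod n
    return "SL(2, Z) only"

I = np.array([[1, 0], [0, 1]])
T = np.array([[1, 1], [0, 1]])       # translation generator
S = np.array([[0, -1], [1, 0]])      # inversion generator
for name, M in [("I", I), ("T", T), ("T^2", T @ T), ("S", S)]:
    print(name, classify(M, 2))
```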

1.7 The Definition of Information and the Minimum Uncertainty Principle

In group theory, the 2×2 identity matrix representation of the action potential serves as both a fundamental element and the smallest congruence subgroup of SL(2, ℤ). Correspondingly, in neuroscience, the action potential represents the smallest known unit of information carried by neurons. Given that additivity is essential for information, modeling the action potential as a Bernoulli trial provides a solid foundation for quantifying information in neurons and the brain.

1.7.1 Shannon Information and Fisher Information

Addressing Shannon's inquiry—"How does one measure amount of information?"—requires precise definition. Shannon's solution for determining "capacity of a communication channel" was information entropy, or Shannon entropy (S), measured in bits (Shannon and Weaver, 1998):

$$S = -\sum_i p_i \log_2(p_i) \tag{1.7.1}$$

For an action potential firing with probability p as a Bernoulli trial outcome with failure rate 1-p, its Shannon information (S_B) is:

$$S_B = p\log_2\left(\frac{1}{p}\right) + (1-p)\log_2\left(\frac{1}{1-p}\right) \tag{1.7.2}$$

In physics, entropy associates with randomness, disorder, or uncertainty, while information links to accuracy, precision, invariance, or stability. In statistics, precision of random variable x is defined as inverse variance. The variance lower bound is the Cramér-Rao bound (CRLB), defined as the inverse of Fisher's information (Cramér, 1946; Rao, 1992). Fisher's information is an additive "intrinsic accuracy" defined by Fisher (1922). For probability mass function f(x, p) of discrete random variable x with unbiased estimator p, Fisher information F(p) is:

$$F(p) = E\left[\left(\frac{\partial}{\partial p} \ln f(x, p)\right)^2\right] = -E\left[\frac{\partial^2}{\partial p^2} \ln f(x, p)\right] \tag{1.7.3}$$

where the partial derivative of the natural logarithm of f(x, p) with respect to p is called the score; Fisher information F(p) is the expected value (E) of the squared score, i.e., the variance of the score, since the score has zero mean.

1.7.2 Neuronal Information and the Minimum Uncertainty Principle

If x is one sample from a Bernoulli trial, the Fisher information carried by x is:

$$F(p) = -E\left[\frac{\partial^2}{\partial p^2} \ln(p^x(1-p)^{1-x})\right] = \frac{1}{p(1-p)} \tag{1.7.4}$$

where p is success probability. Therefore, the Bernoulli trial variance p(1-p) is the inverse of Fisher information [TABLE:2], meeting the CRLB lower bound and indicating minimal variance or uncertainty. Consequently, Fisher information in a Bernoulli trial equals its precision, signifying that intrinsic accuracy is synonymous with precision.

The concept of Fisher information's intrinsic invariance is more intuitively understood through information geometry (Nielsen, 2022). Similar to how curvature in differential geometry depicts manifold structure, Fisher information's second-order property (Equation 1.7.3) captures the local curvature of conditional probability space, related to relative entropy or Kullback-Leibler divergence (Gourieroux and Monfort, 1995). In differential geometry, curvature is recognized as an intrinsic invariant, a concept established by Gauss's Theorem Egregium (Gu and Yau, 2008). In summary, neuronal information in Bernoulli processes can be defined and quantified as Fisher information, adhering to the principle of minimum uncertainty.
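The following sketch (Python with NumPy; the success probability and sample sizes are illustrative) checks Equation 1.7.4 empirically: the variance of the maximum-likelihood estimate of p over repeated Bernoulli experiments matches the Cramér-Rao bound 1/(nF(p)).

```python
import numpy as np

rng = np.random.default_rng(2)
p, n_trials, n_repeats = 0.2, 1000, 2000

# Fisher information of a single Bernoulli trial (Equation 1.7.4).
fisher = 1.0 / (p * (1.0 - p))

# Empirical variance of the maximum-likelihood estimator p_hat = mean(x).
p_hats = rng.binomial(n_trials, p, size=n_repeats) / n_trials
empirical_var = p_hats.var()

# Cramer-Rao lower bound for n_trials i.i.d. samples: 1 / (n * F(p)).
crlb = 1.0 / (n_trials * fisher)
print(f"empirical var = {empirical_var:.2e}, CRLB = {crlb:.2e}")
```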

1.7.3 Relationship between Fisher Information and Shannon Entropy

1.7.3.1 Isomorphic Relationship
Through appropriate mathematical transformations (such as logarithmic transformation of Equation 1.7.4), Fisher information and Shannon entropy (Equation 1.7.2) exhibit an isomorphic relationship for Bernoulli trials. This isomorphism reflects shared behavior across probability space, as the Shannon entropy vector can be expressed as a linear action of the logarithmic Fisher information vector via multiplication by a diagonal probability matrix:

$$\begin{pmatrix} p\log_2\left(\frac{1}{p}\right) \\ (1-p)\log_2\left(\frac{1}{1-p}\right) \end{pmatrix} = \begin{pmatrix} p & 0 \\ 0 & 1-p \end{pmatrix} \begin{pmatrix} \log_2\left(\frac{1}{p}\right) \\ \log_2\left(\frac{1}{1-p}\right) \end{pmatrix} \tag{1.7.5}$$

However, this isomorphism does not imply direct equivalence or interchangeability. Shannon entropy measures overall system uncertainty, while Fisher information focuses on parameter estimation precision. Despite structural similarities, they serve distinct statistical purposes.

1.7.3.2 Dialectical Relationship
The extremal behavior of Shannon entropy and logarithmic Fisher information exhibits opposite trends. For a Bernoulli distribution with parameter p, Shannon entropy achieves its maximum value of 1 bit (log₂2) at p = 0.5 (maximum uncertainty) and approaches 0 as p → 0 or 1. Fisher information and its base-2 logarithm attain minimum values of 4 and 2 (= 2log₂2), respectively, at p = 0.5 (least precision), while approaching infinity as p → 0 or 1 (highest precision). This opposite yet complementary behavior at extreme values embodies a dialectical unity of opposites, underscoring their deep interconnection and complementary roles in describing different information aspects within a system.
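A small numerical table (Python with NumPy; the probed probabilities are arbitrary) makes the opposite extremal trends explicit: Shannon entropy (Equation 1.7.2) peaks at p = 0.5 while Fisher information (Equation 1.7.4) is minimal there and diverges toward the boundaries.

```python
import numpy as np

def shannon_entropy(p):
    """Bernoulli Shannon entropy in bits (Equation 1.7.2)."""
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def fisher_info(p):
    """Bernoulli Fisher information (Equation 1.7.4)."""
    return 1.0 / (p * (1 - p))

for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(f"p={p:4.2f}  S_B={shannon_entropy(p):5.3f} bits  "
          f"F(p)={fisher_info(p):8.2f}  log2 F(p)={np.log2(fisher_info(p)):5.2f}")
# S_B peaks at p = 0.5 (1 bit) while F(p) is minimal there (4);
# the two quantities move in opposite directions as p approaches 0 or 1.
```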

1.7.3.3 Information Geometry and Group Representations
Information geometry studies the geometric structure of statistical models, treating probability distributions as points on a manifold (Nielsen, 2022). The Fisher information metric defines the Riemannian metric on this manifold, providing distance measures between distributions. This framework reveals how parameter changes affect distribution shape and information content per parameter.

In group theory and Lie groups, generators construct all group elements via the exponential map (Greub, 1975; Hall, 2003). In information geometry, this concept broadens to represent transformations or flows on the statistical manifold. In differential geometry (Petersen, 1998), generators correspond to infinitesimal parameter changes causing distribution shifts. The logarithmic isomorphism between Fisher information and Shannon entropy uncovers deeper theoretical connections rooted in information geometry and group theory. Viewing entropy as representing transformation generators on the statistical manifold enriches understanding of the interplay between global uncertainty and local parameter sensitivity.

1.7.3.4 Dialectical Unification Relations

1.7.3.4.1 One-Parameter Transformation Groups and Fisher Geometry in Statistical Manifolds
In Lie theory, a one-parameter subgroup of Lie group G is a smooth homomorphism from the additive group of real numbers (ℝ,+) to G, expressed as γ(t) = exp(tX) where X is a Lie algebra element and t ∈ ℝ is a real parameter. The exponential map connects the infinitesimal generator X (a tangent space element at identity) to group elements through left-invariant vector field flow (Hall, 2003).

In information geometry, we can consider a one-parameter transformation group acting on probability space with infinitesimal generator X = p(1-p)∂/∂p. This vector field encodes directional fluctuations analogous to thermal energy K_BT (with K_B as Boltzmann constant and temperature T in Kelvin), corresponding to Bernoulli process variance. Fisher information emerges as the metric dual of X, quantifying statistical manifold local curvature—a role analogous to 1/(K_BT) in thermodynamics. The group action on probability space p ∈ (0,1) is given by Φ_t(p) = pe^t/(1-p+pe^t), satisfying both group composition properties (Φ_s(Φ_t(p)) = Φ_{s+t}(p)) and consistency with infinitesimal generator verification (∂Φ_t/∂t|_{t=0} = p(1-p)). This formulation bridges stochastic fluctuations in binary systems to geometric structures, mirroring thermodynamic analogies between statistical variability and thermal noise.
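The sketch below (Python with NumPy; the test values of p, s, and t are arbitrary) verifies the two stated properties of the flow Φ_t(p) = pe^t/(1 - p + pe^t): the group composition law and the infinitesimal generator p(1 - p).

```python
import numpy as np

def flow(t, p):
    """Group action Phi_t(p) = p*e^t / (1 - p + p*e^t) on the interval (0, 1)."""
    return p * np.exp(t) / (1.0 - p + p * np.exp(t))

p, s, t = 0.3, 0.7, -0.4

# Composition property: Phi_s(Phi_t(p)) == Phi_{s+t}(p).
print(np.isclose(flow(s, flow(t, p)), flow(s + t, p)))     # True

# Infinitesimal generator: dPhi_t/dt at t = 0 equals p*(1 - p).
eps = 1e-6
numeric_generator = (flow(eps, p) - flow(-eps, p)) / (2 * eps)
print(np.isclose(numeric_generator, p * (1 - p)))           # True
```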

To achieve transparent insight into the Fisher-Shannon relationship, we rewrite Equation 1.7.4 (Bernoulli trial Fisher information) in relation to one-parameter subgroup form generated by Lie algebra elements:

$$F(p) = \frac{1}{p(1-p)} = \exp[\ln(1/p)] \cdot \exp[\ln(1/(1-p))] \tag{1.7.6}$$

In information theory (Alajaji and Chen, 2018), the terms ln(1/p) and ln(1/(1-p)) define the self-information of the two Bernoulli outcomes. Equation (1.7.6) demonstrates that Fisher information equals the product of the reciprocal spiking and silence probabilities, that is, the product of the exponentials of these two self-information terms. Crucially, this Bernoulli process identity holds as an algebraic equality, not merely a group-theoretic analogy.

1.7.3.4.2 Shannon Entropy: A Weighted Average of Self-Information Quantifying System-Wide Uncertainty
Expressing the probability matrix in Equation 1.7.5 as a probability vector, the Bernoulli trial Shannon entropy (S_B) becomes the dot product of the probability vector and the self-information vector (in bits):

$$S_B = (p, 1-p) \cdot \left(\frac{\ln(1/p)}{\ln 2}, \frac{\ln(1/(1-p))}{\ln 2}\right) = p\log_2\left(\frac{1}{p}\right) + (1-p)\log_2\left(\frac{1}{1-p}\right) \tag{1.7.7}$$

In neuronal information modeled as a Bernoulli process, Shannon entropy serves as a weighted average of self-information, capturing overall firing pattern uncertainty. Shannon entropy quantifies average surprise, while Fisher information quantifies parameter estimation precision. This perspective highlights the dialectical unity between Shannon entropy and Fisher information concerning information uncertainty and encoding efficiency.

When neural activity is described by independent and identically distributed Bernoulli trials, each trial represents a discrete time point where the neuron may fire ("success") or remain silent ("failure"). Shannon entropy quantifies outcome uncertainty by calculating self-information for all possible spiking events, weighted by their probabilities. The principle that less probable events carry more information because they provide greater "surprise" is central to this understanding. Conversely, Fisher information focuses on firing probability estimation precision based on observational data, measuring spike train responsiveness to minute firing probability changes. While Fisher information serves as a metric for parameter estimate precision, Shannon entropy gauges total uncertainty conveyed by these estimates.

1.7.3.4.3 Fisher-Shannon Tangling in Bernoulli Processes: Unifying Information and Entropy in Tangled Factor

1.7.3.4.3.1 Unifying Information and Entropy in Tangled Factor
Treating Bernoulli parameters p (success probability) and q (failure probability) as independent variables for gradient decomposition (while retaining the constraint p+q=1), gradients can be taken with respect to either parameter. The formal sum of partial derivatives of Shannon entropy with respect to p and q is ∇S_B = A∂/∂p + B∂/∂q where A = log₂(1/p) + 1 and B = log₂(1/q) + 1.

For the Fisher geometry statistical manifold defined by the one-parameter transformation group (Section 1.7.3.4.1), we define self-information vector fields ln(1/p)∂/∂p and ln(1/(1-p))∂/∂p as Lie algebra elements dA = (A-1)ln2 ∂/∂p and dB = (B-1)ln2 ∂/∂p, naturally derived from entropy gradient components through the Fisher information metric. Under the linear constraint p+q=1, which induces ∂/∂q = -∂/∂p, these vector fields do not commute, leading to a non-vanishing bracket [dA, dB] (Equation 1.7.8) that quantifies covariance between precision (Fisher information) and uncertainty (entropy). This bracket satisfies anti-commutativity and the Jacobi identity—defining properties of Lie algebras that emerge directly from constraint-induced non-commutativity, reflecting how geometric structures encode functional relationships. This mathematical rigor aligns with the systems science principle that structure determines function.

$$[dA, dB]f = \left[\ln\left(\frac{1}{p}\right)\frac{\partial}{\partial p}, \ln\left(\frac{1}{1-p}\right)\frac{\partial}{\partial p}\right]f = F(p)S_B\ln 2 \frac{\partial f}{\partial p} \tag{1.7.8}$$

This bracket quantifies a fundamental constraint: firing precision F(p) and uncertainty S_B are tangled such that their product scales the mismatch between spiking and silent states. This coupling defines the Tangled Factor (TF):

$$TF = F(p)S_B\ln 2 = \frac{1}{p(1-p)}\left(p\log_2\frac{1}{p} + (1-p)\log_2\frac{1}{1-p}\right)\ln 2 \tag{1.7.9}$$

TF arises directly from the non-vanishing Lie bracket [dA, dB] (Equation 1.7.8), where the coupling of F(p) (precision) and S_B (uncertainty) is quantified by their product.
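As a numerical consistency check of Equations 1.7.8 and 1.7.9 (Python with NumPy; the probed probabilities are arbitrary), the coefficient of the Lie bracket of the two self-information vector fields is compared against TF = F(p)S_B ln 2.

```python
import numpy as np

def tangled_factor(p):
    """TF = F(p) * S_B * ln2  (Equation 1.7.9), with S_B in bits."""
    F = 1.0 / (p * (1 - p))
    S_B = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return F * S_B * np.log(2)

def bracket_coefficient(p):
    """Coefficient of d/dp in [ln(1/p) d/dp, ln(1/(1-p)) d/dp] (Equation 1.7.8)."""
    a, b = np.log(1 / p), np.log(1 / (1 - p))
    da, db = -1 / p, 1 / (1 - p)            # derivatives of a and b w.r.t. p
    return a * db - b * da

for p in (0.1, 0.3, 0.5, 0.8):
    print(p, np.isclose(tangled_factor(p), bracket_coefficient(p)))  # all True
```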

1.7.3.4.3.2 Tangled Factor as Local Geometric Curvature
In gauge theory (Baez and Muniain, 1994, Chapter 3), the curvature 2-form K associated with connection 1-form A is defined by the exterior covariant derivative: K = dA + A ∧ A, where d denotes exterior derivative and ∧ represents wedge product. In local coordinates, if A = A_i dx^i (with A_i being Lie algebra-valued components), its curvature 2-form components are K_{ij} = ∂_i A_j - ∂_j A_i + [A_i, A_j], where the bracket denotes Lie bracket of gauge group generators. This encodes both geometric distortion (from dA) and algebraic non-commutativity (via A ∧ A), reflecting how parallel transport fails to close around infinitesimal loops.

For a one-parameter subgroup, the term ∂_i A_j - ∂_j A_i vanishes, and nonzero curvature arises solely from the non-vanishing Lie bracket [A_i, A_j]. In the Bernoulli one-parameter subgroup Fisher information manifold (Section 1.7.3.4.1), TF is proportional to this bracket, making TF a local geometric curvature.

These derivations rely on unique Bernoulli process features. The beauty and advantage of Bernoulli processes with constraint p+q=1 lie in their statistical independence across trials and memoryless temporal intervals—properties rooted in physical mechanisms such as neuronal refractory periods and radioactive decay dynamics. For example, after a neuron fires (due to sodium channel activation), it enters a refractory period where ion channels reset, erasing memory of past activity and ensuring subsequent spikes depend only on current input. Meanwhile, inter-spike intervals follow an exponential distribution with rate λ = -ln(1-p) because firing probability in the next millisecond depends solely on present conditions, not elapsed time since the last spike. Geometrically, p-q complementarity arises from how the Bernoulli system embeds into an information-geometric structure where parameters p and q form a dual relationship scaling with the Fisher metric tensor. This unique combination of computational efficiency and neuropsychological meaning is irreplicable in trigonometric or other systems.

In summary, within Bernoulli processes, Shannon entropy and Fisher information exhibit dialectical unity concerning information uncertainty, encoding efficiency, and tangled factor (Equation 1.7.9). While Fisher information highlights local system sensitivity to parameter changes, Shannon entropy offers a global perspective on total neural signal uncertainty and encoding capacity.

1.7.3.5 Statistical Manifold of Perception under Self-Information Geometry

1.7.3.5.1 Neural Perception as Bernoulli Logit: Canonical Parameterization in Information Geometry
The Bernoulli probability mass function is:

$$P(X=x|p) = p^x(1-p)^{1-x}$$

Rewritten in exponential family form:

$$P(X=x|p) = \exp\left(x \ln\frac{p}{1-p} + \ln(1-p)\right) \tag{1.7.10}$$

This simplifies to the canonical representation:

$$P(X=x|p) = \exp(x\theta - A(\theta)) \tag{1.7.11}$$

where θ denotes the canonical parameter (logit function for Bernoulli trials) and A(θ) serves as the log partition or cumulant function (Wainwright and Jordan, 2008). Notably, the Legendre dual of A(θ) obtained by convex conjugation equals negative entropy for Bernoulli processes. Intriguingly, A(θ) unifies statistical, geometric, and physical interpretations: it directly maps to inter-spike interval failure rate λ of its exponential distribution, encodes success probability p = dA/dθ, and its Legendre dual yields negative entropy, quantifying uncertainty as disorder. This connection binds information geometry (Fisher metric curvature), probabilistic dynamics (event likelihood), and thermodynamic analogs (entropy as disorder), forming a cohesive framework where A(θ) elegantly bridges discrete outcomes and continuous-time processes.

In the Fisher information manifold framework, Bernoulli variance p(1-p) and Fisher information exhibit mathematical correspondence to thermodynamic quantities thermal energy K_BT and its inverse precision 1/(K_BT) (Section 1.7.3.4.1). For exponential family distributions (Wainwright and Jordan, 2008), Equation 1.7.8 encodes the canonical parameter—specifically the logit function for Bernoulli trials (Equations 1.7.10, 1.7.11)—via gradient relationships (∂f/∂p = ∂(p(1-p))/∂p = (1-2p) ∝ ∇θ) under natural units (K_B = 1). This establishes geometric correspondence to thermodynamic formalism with entropy-energy analogy S_B∇θ/(K_BT) ~ ∇E/(K_BT). The logit function is not an emergent neural property but rather the mathematically necessary canonical parameter for Bernoulli-distributed perceptual decisions. Notably, the logit function unfolds the confined probability space (0,1) into an unbounded, perceptually rich continuum (ℝ). This aligns with the Weber-Fechner Law, where human perception follows logarithmic scaling (Vorländer, 2020). Inspired by this geometric relationship between decision probabilities and information manifolds, we propose quantifying cognitive tension in binary choices via a new metric: Mental Knots (MK), operationally defined by Equation 1.7.8:

$$MK = -F(p)S_B\ln 2 (1-2p) = (2p-1)TF \tag{1.7.12}$$

The antisymmetric term (2p-1) encodes directional bias (deviation from p=0.5), while TF quantifies inherent information-geometric constraints (Equation 1.7.9). Thus, MK synthesizes both effects into a unified measure of cognitive tension. Both MK and TF are dimensionless meta-information about neuronal firing probability. MK can be heuristically interpreted as: (1) Cognitive unease—quantifying subjective "mental knots" during decision-making hesitation stemming from precision-uncertainty tradeoffs; and (2) Survival dynamics—unifying Fisher information (precision) and Shannon entropy (uncertainty) to frame instinctive unease as a measurable quantity impacting cognitive effort and survival outcomes.
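The short sketch below (Python with NumPy; the probabilities are illustrative) computes MK from Equation 1.7.12 and compares it with the logit: both vanish at p = 0.5 and share the same sign, but MK is rescaled by the precision-uncertainty coupling TF.

```python
import numpy as np

def tangled_factor(p):
    """TF (Equation 1.7.9)."""
    S_B = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
    return S_B * np.log(2) / (p * (1 - p))

def mental_knots(p):
    """MK = (2p - 1) * TF  (Equation 1.7.12)."""
    return (2 * p - 1) * tangled_factor(p)

def logit(p):
    return np.log(p / (1 - p))

# MK vanishes at p = 0.5 (no directional bias) and tracks the sign of the
# logit, while being scaled by the information-geometric coupling TF.
for p in (0.2, 0.4, 0.5, 0.6, 0.8):
    print(f"p={p:.1f}  logit={logit(p):+.3f}  MK={mental_knots(p):+.3f}")
```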

1.7.3.5.2 From Sensory Logits to Tension Geometry: A Unified Perceptual-Decision Framework
In the Fisher information manifold framework, perceptual decisions are formally modeled as Bernoulli processes where the logit function defines the canonical parameter θ of the underlying exponential family distribution (Equations 1.7.10 and 1.7.11). For binary outcomes governed by probability p (e.g., "stimulus detected"), this canonical parameter is:

$$\theta \equiv \text{logit}(p) = \ln\frac{p}{1-p} = \ln\frac{1}{1-p} - \ln\frac{1}{p} \tag{1.7.13}$$

Equation 1.7.13 shows that θ equals the difference between failure and success self-informations. This parameterization endows the system with linear additivity of θ under independent and identically distributed (i.i.d.) events. For neural computations integrating multiple stimulus features—such as summed intensities, frequencies, or phase variations in holographic signal encoding—the canonical parameter decomposes as:

$$\theta = \beta_0 + \sum_g \beta_g \sum_j x_j \tag{1.7.14}$$

where x_j represents sub-features (e.g., intensity, frequency, phase) and β_g is the vector/matrix of linear weights mapping features x_j to canonical parameter weights of an exponential family distribution. In neuroscience, β_g represents a perceptual gain factor. The linear expansion of θ (Equation 1.7.14) generalizes perceptual integration: sub-features x_j (e.g., edge orientations in vision, spectral peaks in audition) are combined across groups g with weights β_g. These perceptual gain factors dynamically scale feature contributions to decision variable θ, implementing context-dependent salience prioritization shaped by synaptic architecture and adaptive plasticity.

The logistic probability expression explicitly links θ to firing probability:

$$p = \sigma(\theta) = \frac{1}{1+e^{-\theta}} = \frac{1}{1+e^{-(\beta_0 + \sum_g \beta_g \sum_j x_j)}} \tag{1.7.15}$$

Critically, the marginal effect of sensory feature x_j on firing probability p integrates perceptual gain and conflict gating:

$$\frac{\partial p}{\partial x_j} = p(1-p)\beta_g \tag{1.7.16}$$

where β_g (perceptual gain factor) scales feature contribution via synaptic efficacy, and p(1-p) gates neuroplasticity, peaking at p=0.5 (decision conflict) and vanishing at p=0 or 1 (certainty states).
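A minimal sketch of Equations 1.7.14-1.7.16 follows (Python with NumPy); the feature values x_j and gain factors β_g are hypothetical placeholders that serve only to show how the canonical parameter, firing probability, and marginal effects are computed.

```python
import numpy as np

def sigmoid(theta):
    return 1.0 / (1.0 + np.exp(-theta))

# Hypothetical sub-features x_j (e.g., intensity, frequency, phase) and
# perceptual gain factors beta_g; values are illustrative, not from the paper.
beta_0 = -1.0
beta_g = np.array([0.8, 0.5, 0.3])        # gains for three feature groups
x = np.array([1.2, 0.4, -0.7])            # summed sub-features per group

theta = beta_0 + np.dot(beta_g, x)        # canonical parameter, Eq. 1.7.14
p = sigmoid(theta)                        # firing probability, Eq. 1.7.15

# Marginal effect of each feature on p (Equation 1.7.16): p(1-p) * beta_g.
marginal = p * (1 - p) * beta_g
print(f"theta={theta:.3f}, p={p:.3f}, marginal effects={marginal.round(4)}")
```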

In the Fisher information manifold framework, perceptual decision-making is rigorously formalized through Bernoulli processes. The canonical parameter θ (Equation 1.7.13) provides an algebraic basis for feature integration, while MK (Equation 1.7.12) captures geometric constraints on decision dynamics [FIGURE:6]. This reveals a unified neurocomputational principle: perceptual integration (θ) and cognitive tension (MK) emerge from complementary algebraic and geometric properties of the same information-theoretic substrate. [FIGURE:6] illustrates that MK aligns with logit in central probability ranges but enhances interpretability, stability, and explanatory power for modeling decision dynamics. This framework establishes a continuous pathway: sensory logits (cortical feature coding) → tension geometry (MK in decision circuits) → behavioral adaptation, resolving the algebra-geometry divide in perceptual-decision models.

1.7.3.5.3 More Insights about Logit and Self-Informations
The Bernoulli model is the simplest non-trivial exponential family example, characterized by single scalar parameter p ∈ (0,1). The associated statistical manifold is one-dimensional, with geometry governed by the Fisher-Rao metric encoding parameter space intrinsic curvature through Fisher information (Section 1.7).

The 1D Bernoulli manifold is intrinsically conformally flat. When embedded isometrically into ℝ² via self-information or root coordinates, it becomes a curved submanifold of a conformally flat ambient space. The Fisher metric can be expressed in isothermal coordinate form, relating to both intrinsic and extrinsic parameterizations. The line element is:

$$ds^2 = p(1-p)d\theta^2 = \frac{e^\theta}{(1+e^\theta)^2}d\theta^2 \tag{1.7.17}$$

In isothermal coordinates (logit transform), the metric is ds² = p(1-p)dθ². Coordinate transitions are conformal. Square-root coordinates (u,v) = (√p, √(1-p)) satisfy u² + v² = 1, while self-information coordinates (x,y) = (-ln p, -ln(1-p)) satisfy e⁻ˣ + e⁻ʸ = 1. These 2D isothermal coordinates induce a conformal metric structure on the embedded curve, preserving angles relative to ambient Euclidean space. Mathematically, this arises because the induced metric takes the form ds² = e^{2φ(μ,ν)}(dμ² + dν²), where scalar factor e^{2φ(μ,ν)} scales lengths uniformly at each point, ensuring angle invariance under coordinate transformations. For Bernoulli processes, this enables seamless geometric interpretations: self-information coordinates and logit parameters θ are conformally related through holomorphic mappings, preserving angular relationships while reparameterizing the 1D statistical manifold embedded in ℝ². This unifies Fisher information geometry with Euclidean angle preservation, critical for quantifying perceptual decision dynamics in curved parameter spaces.
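The following sketch (Python with NumPy) numerically confirms the three coordinate relations above: square-root coordinates lie on the unit circle, self-information coordinates satisfy e⁻ˣ + e⁻ʸ = 1, and the isothermal line-element factor p(1 - p) equals e^θ/(1 + e^θ)².

```python
import numpy as np

p = np.linspace(0.01, 0.99, 50)

# Square-root coordinates lie on the unit circle: u^2 + v^2 = 1.
u, v = np.sqrt(p), np.sqrt(1 - p)
print(np.allclose(u ** 2 + v ** 2, 1.0))                   # True

# Self-information coordinates satisfy e^(-x) + e^(-y) = 1.
x, y = -np.log(p), -np.log(1 - p)
print(np.allclose(np.exp(-x) + np.exp(-y), 1.0))            # True

# Isothermal (logit) line-element factor: p(1-p) = e^theta / (1 + e^theta)^2.
theta = np.log(p / (1 - p))
print(np.allclose(p * (1 - p), np.exp(theta) / (1 + np.exp(theta)) ** 2))  # True
```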

The Bernoulli model provides a simple yet rich example of information-theoretic geometric interpretation. Different parameterizations—canonical parameter θ, expectation parameter p, and self-information coordinates—represent equivalent geometric perspectives related by conformal or holomorphic transformations when embedded in ℝ². This opens doors to generalizations in higher-dimensional exponential families and deeper connections between information geometry and differential geometry.

1.7.3.5.4 Perception as an Element in the Bilinear Self-Information Manifold
Both the TF metric (Equation 1.7.9) and MK meta-metric (Equation 1.7.12) exhibit strict bilinear proportionality to the non-vanishing Lie bracket of self-information vector fields (Equation 1.7.8). This Lie bracket quantifies non-commutativity between information gradients, revealing intrinsic geometric structure encoded in perceptual processes. Furthermore, the canonical parameter θ (Equation 1.7.13) demonstrates bilinear dependence on self-information component pairs, establishing a bridge between differential information geometry and perceptual representation. Together, these relationships reveal that perception operates within a structured bilinear information manifold, where Fisher metric properties, geometric invariants, and parameterizations emerge naturally from algebraic interplay of self-information quantities. This framework unifies measurement-theoretic foundations with differential geometric structures through shared dependence on fundamental informational differences, demonstrating that perceptual organization inherently resides in a linearized manifold defined by bilinear self-information dynamics.

As discussed in Section 1.7.3.5.2, the neurogeometric framework rigorously unifies perception and cognition by embedding Bernoulli decision processes into a curved Fisher information manifold, where algebraic (canonical parameter θ) and geometric (MK invariant) properties coexist as dual manifestations of the same informational substrate. The bilinear relationships among self-information differences (θ), cognitive tension (MK), and non-vanishing Lie brackets ([dA,dB]) reveal that perceptual decisions inherently unfold in a linearly structured information manifold. TF serves as curvature measuring cognitive tension, while MK encodes additional directional bias intrinsically embedded in manifold topology. Collectively, the geometric essence of perception fundamentally shapes how brains unify sensory logits and decision-related geometric local invariants (Fisher information, TF, MK) with topologically constrained global functions (e.g., Shannon entropy under Bernoulli ergodic dynamics, Section 1.5).

1.8 Information and Energy

1.8.1 The Information Capacity of Single Action Potential

Neuronal information can be calculated using the Fisher information definition (Equation 1.7.3). In practice, it is easily computed via the inverse relationship between Fisher information and variance [TABLE:2], since Fisher information is synonymous with precision and intrinsic accuracy. Shannon information primarily addresses information transmission limits in noisy physical systems rather than quantifying transmitted information content (Rieke et al., 1999). The Shannon-Hartley theorem defines the maximum reliable data transmission rate (channel capacity C, in bits/second) as the maximum mutual information between input and output, expressed as a function of channel bandwidth (W, in hertz) and signal-to-noise ratio (SNR) (Cover and Thomas, 2005; Shannon and Weaver, 1998):

$$C = W \log_2(1 + \text{SNR}) \tag{1.8.1}$$

Based on Shannon's channel capacity theory, we can discuss single action potential information capacity. For a given time interval, SNR can be expressed as an energy ratio. The energy required for an action potential at a node of Ranvier is at least 10⁶ ATP molecules (Aiello and Bach-y-Rita, 2000), thermodynamically equivalent to about 12.2×10⁶ K_BT (where K_B is Boltzmann's constant and T is temperature in Kelvin). This conversion follows ATP hydrolysis to ADP, releasing 12.2 K_BT per molecule (30.6 kJ/mol or 7.3 kcal/mol). Considering verified Landauer's principle linking information and thermodynamics (Bérut et al., 2012), we take the Landauer limit ln(2)K_BT as thermal noise energy, linking single action potential information capacity to neuronal information transmission rate (C_axon) as an axon channel function:

$$C_{\text{axon}} = W \log_2\left(1 + \frac{12.2 \times 10^6}{\ln 2}\right) \cong 24W \tag{1.8.2}$$

Assuming 1 Hz bandwidth for a simple neuron, the estimated capacity is approximately 24 bits per second, aligning with experimental data from cat visual system (Eckhorn and Pöpel, 1975). If a visual simple neuron fires at rate λ(t), single action potential information capacity (C_impulse) is:

$$C_{\text{impulse}} \cong 24 \cdot \lambda(t)\Delta t \tag{1.8.3}$$

Therefore, a visual simple neuron firing at 1 Hz within a 1 Hz bandwidth conveys roughly 24 bits per action potential (24 bits per second at 1 Hz). This substantially surpasses 1 bit per impulse and exceeds the maximum of about 10 bits per impulse observed experimentally (Eckhorn and Pöpel, 1975), suggesting the recorded neuron likely had an actual firing rate above 2.4 Hz or a narrower bandwidth (Equation 1.8.3).
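For transparency, the arithmetic behind Equations 1.8.2 and 1.8.3 can be reproduced in a few lines (Python with the standard library); the bits-per-impulse figures assume a 1 Hz bandwidth and treat 24W/λ as the per-impulse capacity, consistent with the 2.4 Hz inference above.

```python
import math

# Shannon-Hartley capacity (Equation 1.8.1) with the action-potential energy
# (~12.2e6 k_B*T) divided by the Landauer limit ln(2)*k_B*T as the SNR.
snr = 12.2e6 / math.log(2)
W = 1.0                                     # assumed bandwidth in Hz
C_axon = W * math.log2(1 + snr)             # Equation 1.8.2
print(f"C_axon ~ {C_axon:.1f} bits/s for W = {W:.0f} Hz")   # ~24 bits/s

# Bits per impulse at firing rate lam over a 1 s window: C_axon / lam,
# so ~10 bits per impulse corresponds to a firing rate near 2.4 Hz.
for lam in (1.0, 2.4, 10.0):
    print(f"rate {lam:4.1f} Hz -> {C_axon / lam:5.1f} bits per impulse")
```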

1.8.2 The Law of Structure-Function Correlation at the Neuronal Level

Our discussions have demonstrated strong structure-function correlations in neurons, including AIS, axons, presynaptic vesicles, dendritic spines, and dendritic trees. Particularly, the dendritic spine neck is key for frequency-domain information encoding. Structure-function correlation also manifests in energy metabolism via mitochondria—the neuronal powerhouses [FIGURE:7].

Axonal and dendritic mitochondria show distinct morphologies and distributions [FIGURE:7A]. Named by Carl Benda in 1898 from Greek "mitos" (thread) and "chondros" (granule), these terms are vividly represented by filamentous reticular dendritic mitochondria and discrete axonal mitochondria, as demonstrated in an excellent three-dimensional ultrastructural study (Popov et al., 2005). Impressively, every reconstructed dendritic segment in hippocampal areas contains a single giant filamentous mitochondrion, independent of brain functional state, with average mitochondrial volume about 7% of the reconstructed dendritic segment volume including spines (Popov et al., 2005). Beyond providing ATP energy currency, filamentous dendritic mitochondria can facilitate ion diffusion partially via volume effects increasing ion effective gradients (Section 1.4, [FIGURE:4]), thus influencing information integration and neuronal excitability.

Considering that Nav1.2 dysfunction causes epilepsy or autism by modulating neuronal excitability (Zhang et al., 2021), the backpropagation of action potentials generated by Nav1.2-enriched proximal AIS regions [FIGURE:1] might be necessary for optimizing dendritic tree structure, dendritic mitochondrial morphology and function, and information integration.

1.8.3 Information and Energy

Under central limit theory and Gaussian distributed noise, Shannon channel capacity represents the upper bound of mutual information (see Chapter 3 of Spikes: Exploring the Neural Code) (Rieke et al., 1999). Conversely, Fisher information is the lower bound on mutual information for Gaussian channels (Brunel and Nadal, 1998). In more realistic situations with relaxed noise distribution assumptions, Fisher information emerges as an upper bound on mutual information in small noise regimes (Wei and Stocker, 2016). Consequently, Fisher information can serve as a good approximation for mutual information when effective noise is Gaussian distributed and the square root of Fisher information aligns with the input distribution (Wei and Stocker, 2016).

As discussed in Sections 1.1-1.7, neuronal information in Bernoulli processes is ideally characterized as Fisher information, and neuronal information capacity is well approximated by Fisher information provided that the thermal noise follows a Gaussian distribution with identically distributed input and output. A prominent feature of Fisher information (Equation 1.7.4) is that it approaches infinity when the probability p is extremely small or large, potentially related to human perceptions or phantoms of salient features (Ramachandran and Blakeslee, 1999). Equation 1.7.4 indicates that the lower bound of C_impulse is the Fisher information I(1/2), which equals a dimensionless 4 per impulse, or 2 bits per impulse (success or failure).

Intriguingly, the derivative of mutual information (in nats) with respect to SNR equals half the minimum mean-square error (MMSE), a relationship holding irrespective of discrete or continuous input distribution (Guo et al., 2005). For neuronal systems, this fundamental relationship reflects the intrinsic accuracy property inherent in both Fisher information and neuronal information, adhering to the minimum uncertainty principle. This also implies that causal MMSE equals noncausal MMSE averaged over SNR for Gaussian channels with finite average power (Guo et al., 2005). Similarly, as discussed in Spikes (Chapter 3 and Appendix 15), the optimal signal spectrum complements the noise spectrum over finite bandwidth (Rieke et al., 1999).

The MMSE criterion in communication systems aims to minimize signal propagation mean square error, reducing energy consumption for reliable data transmission and enhancing system energy efficiency. Consistently, as discussed in Sections 1.1 and 1.7, evolution has optimized brain energy efficiency due to energetic constraints, and neuronal information defined as Fisher information complies with the minimum uncertainty or variance principle (MMSE).

Remarkably, brain metabolic activity remains almost constant, with local blood flow variations (measured by PET) under 5% for most cognitive tasks, and ~80% of brain energy use correlates with active signaling processes (Raichle and Gusnard, 2002). At the intracellular level, ATP as energy currency and its metabolites are critical for fundamental metabolic processes (Wallimann et al., 1992; Boutilier, 2001). [ATP] (~2-5 mM) remains stable through replenishment from oxidative phosphorylation (Cooper, 2000) and de novo pathways (Schultheisz et al., 2008), plus active buffering by the creatine kinase/phosphocreatine (CK/PCr) system (Bonilla et al., 2021). In human brain grey matter, the maximal ATP synthesis rate by CK reaction (0.8 μmol/s/g) far exceeds maximal oxidative phosphorylation ATP synthesis (0.1-0.2 μmol/s/g) (Zhu et al., 2012). Therefore, the notably high PCr concentration (~5-10 mM) (Wallimann et al., 1992) and rapid CK reaction kinetics designate the CK/PCr system as a critical spatiotemporal energy buffer [FIGURE:7B]. The CK/PCr system facilitates tight coupling between PCr-based ATP replenishment and energy demands of processes like Na⁺-K⁺ ATPase activity, a primary ATP-consuming process in neurons that may consume up to 80% of energy (Boutilier, 2001). This tight coupling is crucial for action potential refractory kinetics and information encoding, explored in greater detail in Section 2.1.6.

1.8.4 Entropy and its Derivatives as Duality in Fisher Geometry

Shannon entropy (Equation 1.7.7), as expected self-information of a Bernoulli process, resides in an inverse dual space where its derivatives reveal fundamental statistical relationships through information geometry. The first derivative corresponds to the negative logit function (Equation 1.7.13):

$$\frac{\partial S_B}{\partial p} = \ln\frac{1-p}{p} = -\theta \tag{1.8.4}$$

which linearizes probabilities into log-odds—a key involution mapping between probability space and its dual manifold in Fisher geometry (Section 1.7.3.4.1). The second derivative yields the inverse of Fisher information (Equation 1.7.4):

$$\frac{\partial^2 S_B}{\partial p^2} = -\frac{1}{p(1-p)} = -F(p) \tag{1.8.5}$$

encoding curvature as negative precision in parameter estimation. Together, these derivatives form a duality mirroring moment-generating function structure but generating "uncertainty moments" through inverse relationships between Shannon entropy and Fisher geometry.

Intriguingly, entropy and its inverse dual derivatives form one face of a coin, with the Fisher manifold in information geometry constituting the opposite face: while logit and Fisher information map log-odds and precision dynamics through parametric gradients and curvature, entropy quantifies average uncertainty itself and its derivatives as inverse dualities of Fisher geometry. This duality links probabilistic complexity to geometric structure, creating an inseparable framework for analyzing how abstract uncertainty interacts with measurable statistical properties.
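
The dual relationships in Equations 1.8.4 and 1.8.5 are easy to verify symbolically. The sketch below uses sympy purely as a convenience (it is not part of the original analysis): differentiating the Bernoulli entropy recovers the negative logit and the negative Fisher information, with F(1/2) = 4 as noted in Section 1.8.3.

```python
import sympy as sp

p = sp.symbols('p', positive=True)
S_B = -(p*sp.log(p) + (1 - p)*sp.log(1 - p))    # Bernoulli (Shannon) entropy, in nats

dS = sp.simplify(sp.diff(S_B, p))                # Equation 1.8.4: ln((1-p)/p) = -logit(p)
d2S = sp.simplify(sp.diff(S_B, p, 2))            # Equation 1.8.5: -1/(p(1-p)) = -F(p)

print(dS)                                        # log(1 - p) - log(p)
print(sp.simplify(d2S + 1/(p*(1 - p))))          # 0, confirming d²S/dp² = -F(p)
print(d2S.subs(p, sp.Rational(1, 2)))            # -4, so F(1/2) = 4
```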

2. Information Decomposition and Encoding in Sensory Systems

The human sensory system constitutes our primary interface with the external world, enabling perception and interpretation of diverse stimuli. Historically, sensory modalities have been categorized into five primary senses according to Aristotle: sight, hearing, taste, smell, and touch (Brandt et al., 2024). Each sense involves specific organs and receptors that transduce stimuli into nerve impulses for brain interpretation. Sensory receptors can be classified by proximity to stimuli: exteroceptors detect external environmental changes, interoceptors monitor internal organs and tissues, and proprioceptors sense body part movement (Biga et al., 2019). Beyond traditional exteroceptive senses, our sensory system encompasses interoception (visceral sensations like hunger/satiety and thirst), proprioception (body position and movement awareness), vestibular sense (balance and head movement), nociception (potentially harmful stimuli), and special senses such as chronoception.

The sensory system's initial insight is that information decomposes into distinct modalities upon environmental interaction. For simplicity, we begin by examining two intensively researched modalities: the auditory and visual systems.

2.1 Simple Cells in the Auditory System

The auditory system, a pivotal human sensory modality, integrates outer, middle, and inner ear with delicate neural networks that decode sound from initial reception to brain perception. Sound waves are gathered by the outer ear (auricle and external auditory canal) and funneled to the eardrum, initiating mechanical processes in the middle ear. The ossicular chain—malleus, incus, and stapes (the body's three smallest bones)—amplifies vibrations and transmits them to the inner ear's cochlea ("snail shell" in ancient Greek). This spiral, fluid-filled structure converts mechanical vibrations into electrochemical impulses via hair cells in the Organ of Corti, which are relayed along the auditory nerve to the primary auditory cortex for sound perception (Longstaff, 2007; Biga et al., 2019).

2.1.1 Tonotopy in the Auditory System

A fascinating auditory system feature is tonotopic organization—a neuroanatomical arrangement where different sound frequencies are processed in orderly regions along the auditory pathway. The tonotopic pattern begins in the cochlea, where distinct basilar membrane areas respond to different frequencies [FIGURE:8A]. This gradient is maintained throughout subcortical structures: cochlear nuclei project to lateral and medial superior olives (LSO and MSO) on both sides, crossing the pons via the medial nucleus of the trapezoid body (MNTB) (Kandler et al., 2009) to the contralateral inferior colliculus (IC) in the midbrain (Ress and Chandrasekaran, 2013), then via auditory radiation from the medial geniculate nucleus (MGN) of thalamus (Longstaff, 2007) to the primary auditory cortex in the temporal lobe (Hackett et al., 2011).

Tonotopy derives from Greek "tono" (frequency) and "topos" (place). The tonotopic organization of cochlear hair cells enables correct frequency perception as proper pitch [FIGURE:8]. Therefore, after information initially breaks down into distinct sensory modalities like the auditory system, sound is further decomposed into frequencies [FIGURE:8A], allowing auditory neurons in hierarchically layered tonotopic spaces to function as simple cells with characteristic frequencies.

2.1.2 Simple Cell Encodes Decomposed Information in a Wave Function

The hierarchically conserved auditory tonotopy implies that simple auditory neurons have invariant characteristic frequencies from cochlea to primary auditory cortex. This translational invariance suggests these neurons can be described using group representations. As discussed in Section 1.6, simple auditory neurons can be classified as congruence subgroup Γ₁(n). In the cochlea, simple neurons with even frequencies subclassify as subgroup Γ₂(n), while those with odd frequencies subclassify as principal congruence subgroup Γ(n). Auditory relay neurons with translationally invariant frequencies in hierarchically layered tonotopic spaces belong to subgroup Γ₁(n).

Consequently, a simple auditory neuron's intrinsic function can be considered as encoding decomposed sound information through a wave function ψ satisfying translational invariance and possessing characteristic frequency:

$$\psi = e^{i(2\pi ft - \mathbf{k}\cdot\mathbf{r})} \tag{2.1.1}$$

where f is characteristic frequency, t is time, i = √-1, k is wave number inversely related to spatial wavelength, and r is a tensor describing spatial properties within the coordinate system for corresponding hierarchical tonotopy spaces. The dot product k·r reflects orientation projection and simple cell selectivity, suggesting spatial invariance interpretable as phase stability in neural responses.

Loudness is a subjective perception influenced by but not directly proportional to sound wave amplitude. Humans generally require higher intensity to perceive low-frequency sounds as loud, indicated by equal-loudness contours [FIGURE:8B]. Notably, our auditory system is more sensitive to low frequencies (<1 kHz) in speech/voice-sensitive regions (Moerel et al., 2012). This subjective loudness implies that simple neurons' perception wave function ψ in the primary auditory cortex is assigned an amplitude A:

$$\psi = Ae^{i(2\pi ft - \mathbf{k}\cdot\mathbf{r})} \tag{2.1.2}$$

For the average firing rate λ(t) discussed in Section 1.4, simple auditory neurons fire at relatively constant rate λ during interval Δt for Bernoulli trials, since these neurons equipped with perception wave functions and characteristic frequencies act as Bernoulli binary generators:

$$\lambda = 2\pi f|A|, \quad p = |A|^2 \tag{2.1.3}$$

The rate and probability in Equation 2.1.3 are important for statistical analyses. Frequentists focus on probability's long-run frequency interpretation, while Bayesians use probability to quantify degrees of belief (Bland and Altman, 1998). In physics, a first-level loudness approximation is proportional to relative change in measurable sound stimulus—the Weber-Fechner law (Vorländer, 2020). Loudness in decibels uses a logarithmic ratio scale. The discrepancy between equal-loudness contours and loudness impression suggests that amplitude A (Equations 2.1.2 and 2.1.3) is a function not only of synapse number but also of vesicle release efficiency. Consistently, high-frequency cells have more synapses and larger RRP size, but low-frequency cells release vesicles faster when normalized to release site number (Schnee et al., 2005). For relay neurons, amplitude A equals one, so their intrinsic function can be expressed as Equation 2.1.1. Given the all-or-none action potential property (Section 1.1), relay neuron stochastic output can be considered a regenerative process (Ross, 1996) with probability 1, temporal frequency f, or angular frequency 2πf (Equation 2.1.3).

Auditory neurons can faithfully follow sound stimulus frequencies below 800 Hz, with phase-locking capability declining from 800 Hz to 4 kHz and diminishing further for higher frequencies (Reichenbach and Hudspeth, 2012). Notably, maximum faithful phase locking at 800 Hz indicates a minimum 1.25 ms inter-spike interval for human auditory neurons. In contrast, high-frequency cells employ quadrature amplitude modulation (QAM) for information encoding, incorporating both alternating current (AC) components (action potentials) and sustained direct current (DC) components with broader bandwidths (Johnson, 2015). This versatility is crucial for discerning interaural time differences at microsecond scales.

For frequencies below 1 kHz in [FIGURE:8B], the hearing threshold curve can be elegantly modeled using a power function with exponent approximately -1, implying that products of logarithms of amplitudes and frequencies remain remarkably constant (~1730) for this low-frequency spectrum. This suggests that successful low-frequency perception requires more synapse activations to reach hearing threshold, potentially explained by ergodicity (Section 1.5). Ergodicity ensures low-frequency neurons can efficiently use spatial averages to approximate time averages within short periods.

2.1.3 Empirical Fano-Factor Time Curves for Neural Activity in Mammalian Auditory Nerve Fibers

Spontaneous firing data from cat auditory neurons under continuous tone stimulation have been intensively discussed with diverse statistical interpretations and time-domain modeling (Teich, 1989; Teich and Lowen, 1994; Rieke et al., 1999; Moezzi et al., 2014). However, comprehensive understanding of underlying rules remains incomplete. In the double-log plot shown in [FIGURE:9], the Poisson distribution Fano factor is represented as a horizontal dashed line equal to 1 [TABLE:2]. For large time intervals, the Fano factor scales with the square root of time [FIGURE:9] (Teich, 1989). Consistent with the Bernoulli simple neuron model (Section 1.4), the Bernoulli trial Fano factor (1-p) is smaller than 1 during initial stages.

As discussed, auditory simple neurons can be considered regenerative processes. When spike numbers are large, by the central limit theorem, such renewal processes converge to a normal distribution with mean (t/μ) and variance (tσ²/μ³), where μ and σ² represent interarrival time mean and variance (Ross's theorem 3.5) (Ross, 1996). This is also known as the inverse Gaussian distribution [TABLE:2], also called the Wald distribution, Tweedie distribution, or first passage time distribution of Brownian motion with positive drift (Folks and Chhikara, 1978).
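
The first-passage-time reading of the inverse Gaussian can be illustrated with a short Monte Carlo sketch: drifted Brownian paths are run until they first cross a threshold, and the empirical mean and variance of the crossing times are compared with the inverse Gaussian values (mean a/ν and variance aσ²/ν³ for drift ν, noise scale σ, and threshold a, the first-passage form of the t/μ and tσ²/μ³ quoted above). All numerical parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumptions): drift nu, noise scale sigma, threshold a, step dt
nu, sigma, a, dt = 1.0, 0.5, 5.0, 5e-3
n_paths, n_steps = 2000, 4000                 # 4000 steps cover 20 s per path

increments = nu*dt + sigma*np.sqrt(dt)*rng.standard_normal((n_paths, n_steps))
paths = np.cumsum(increments, axis=1)         # drifted Brownian motion, one row per path

crossed = paths >= a
valid = crossed.any(axis=1)                   # paths that cross the threshold within 20 s
first_idx = crossed[valid].argmax(axis=1)     # index of the first crossing per path
fpt = (first_idx + 1) * dt                    # first-passage times

print("empirical mean, variance:", fpt.mean().round(3), fpt.var().round(3))
print("inverse Gaussian theory :", a/nu, a*sigma**2/nu**3)   # 5.0 and 1.25
```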

The inverse Gaussian interpretation as first passage time is particularly interesting for information relay latency, which could be a lag time (0-500 ms) correlated with music intensity profiles and might partially explain temporal requirements for full cognitive processing and interpretation (Ding et al., 2019). Additionally, Miller's 1956 proposal of "The magical number seven, plus or minus two: Some limits on our capacity for processing information" remains independent of stimulus attributes such as tastes, colors, tones, or points (Miller, 1956). Consequently, first passage time might be involved in both information processing capacity and transmission latency, analogous to Ilya Prigogine's notion of "internal time" (Prigogine, 1980). When musical notes 1 through 7 are played, the result is perceived as a cohesive melody rather than isolated notes.

2.1.4 Time Classification: Real Time, Duration Time, Characteristic Time

To avoid confusion, we must define time classifications for further discussion. Real time (wall time in physics) refers to absolute time. Duration time is the period between event start and end. Characteristic time is an intrinsic time associated with action potential refractory periods. Among these, duration time most likely relates to Prigogine's internal time.

The double-log plot of Fano factor versus duration time for driven firing of cat auditory neurons under continuous tone at phase-locking frequency (10.2 kHz) shows similar shape but with increased inverse Gaussian fitted slope (Teich and Lowen, 1994). As discussed in Section 1.1, neurons encode information intensity as impulse frequency (Adrian and Zotterman, 1926; Koch, 1998), so Fano factor for large duration times varies with stimulus intensity.

The similarity between driven and spontaneous firing shapes is consistent with auditory simple neuron properties, which remain relatively stable, especially during initial stages. Since Brownian motion distribution variance is linear with duration time (Equation 2.1.4) (Wiener, 1921; Ross, 1996) and the inverse Gaussian Fano factor scales with square root of time [FIGURE:9], we immediately know that simple neuron firing is also linear with the square root of duration time:

$$\sigma^2 = c^2 t \tag{2.1.4}$$

where c (with dimension t⁻¹/²) need not be constant. This relationship indicates that the simple neuron regenerative process is asymptotically normal with square root of duration time as mean and duration time as variance (Ross, 1996), also satisfying Poisson distribution characterization with equal mean and variance when c = 1 [FIGURE:9].

Indeed, as demonstrated in [FIGURE:10], the square root of duration time is approximately linear with spike number. This intriguing relationship between simple neuron firing and duration time suggests that action potential refractory periods and spike frequency adaptation can be explained by the square root of duration time.
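
A quick way to see why the square-root relation implies adaptation: if the cumulative spike count n grows linearly with the square root of duration time, as in [FIGURE:10], then the time of the n-th spike grows like n², inter-spike intervals lengthen linearly with n, and the instantaneous rate falls roughly as 1/n. The sketch below, with an arbitrary scaling constant, simply makes this bookkeeping explicit.

```python
import numpy as np

c = 10.0                                # arbitrary scaling: spike count n = c * sqrt(t)
n = np.arange(1, 11)                    # spike index
t_n = (n / c) ** 2                      # spike times implied by n ∝ sqrt(t)
isi = np.diff(t_n)                      # inter-spike intervals grow linearly with n
rate = 1.0 / isi                        # instantaneous firing rate falls as ~1/n

print("spike times:", t_n.round(3))
print("ISIs       :", isi.round(3))     # 0.03, 0.05, 0.07, ... (spike frequency adaptation)
print("rates      :", rate.round(1))
```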

2.1.5 Simulation of Cellular Spike Frequency Adaptation

Stimulus duration represents a critical parameter for both perception and intensity encoding. This raises a fundamental question: how does a simple neuron encode temporal information? Spike frequency adaptation emerges as a plausible encoding strategy that expands the bandwidth of simple neurons and enhances information processing capacity (Equation 1.8.3). Consequently, we must examine the underlying mechanisms governing the action potential refractory period and spike frequency adaptation.

Spike frequency adaptation manifests across multiple levels, from individual sensory neurons to neuronal networks mediated by inhibitory feedback (Gutkin and Zeldenrust, 2014). Here we focus specifically on simulations and theoretical explanations of auditory sensory adaptation. Notably, no existing model has considered the linear relationship between the square root of stimulus duration and spike frequency adaptation in double-logarithmic coordinates (Figure 2.1.3). This relationship can be interpreted as a first-order approximation of a complex natural exponential function (Equation 2.1.1), where the peak spike current $\mathcal{I}(t)$ is proportional to the linear term $ic\sqrt{t}$ in the Taylor expansion of $e^{ic\sqrt{t}}$ around $t = 0$ (Equation 2.1.5):

$$\mathcal{I}(t) \propto e^{ic\sqrt{t}} \approx 1 + ic\sqrt{t} \tag{2.1.5}$$

where $\mathcal{I}(t)$ represents the membrane current, and the approximation indicates that, to first order, $e^{ic\sqrt{t}}$ is approximately equal to $1 + ic\sqrt{t}$. The coefficient $c$ is subsequently determined through simulation, consistent with the factor $2\pi\Delta t$ appearing in Equation 2.1.6. After extensive trials, spike frequency adaptation can be effectively simulated using the square root of duration time, as demonstrated in Figure 2.1.4. Remarkably, the simulation achieves an excellent fit without requiring undefined parameters (Equation 2.1.6):

$$\mathcal{I}(t) = \frac{\cos\!\left(2\pi\Delta t\sqrt{t}\right)V(t) - V_{\text{pump}}^{\text{NaK}}}{\text{TS}} = \frac{\left(\mathcal{I}_{\text{inject}} - \mathcal{I}_{\text{Na}} - \mathcal{I}_{\text{leak}}\right) - \mathcal{I}_{\text{K}}}{\text{TS}} \tag{2.1.6}$$

All parameters possess clear physical and biological significance: $t$ is the duration time; $\Delta t$ is the characteristic time (i.e., the inter-spike interval). The first-order approximation of Equation 2.1.5 remains valid for small $\Delta t$ values, assuming the higher-order terms of the Taylor series remain negligible within the range of interest. $V(t)$ denotes the neuronal membrane potential; $V_{\text{pump}}^{\text{NaK}}$ is the zero-current potential (-156 ± 14 mV) of the Na-K pump/Na⁺-K⁺ ATPase (Gadsby et al., 1985); $\mathcal{I}_{\text{inject}}$ is the injected current; $\mathcal{I}_{\text{Na}}$ is the sodium current; $\mathcal{I}_{\text{leak}}$ is the leak current; and $\mathcal{I}_{\text{K}}$ is the potassium current.

[FIGURE:2.1.4] Simulation of spike frequency adaptation. The spike train is simulated using the Hodgkin-Huxley spiking neuron model, with a modified Python script from the GitHub swharden/pyHH repository implementing Equation 2.1.6 for spike frequency adaptation. The Y-axis unit is mV. The X-axis represents time (details omitted to avoid potential misinterpretation). Red lines indicate the duration period $t$ of the continuously injected current $\mathcal{I}_{\text{inject}}$; $\mathcal{I}(t)$ represents the net membrane current; $\Delta t$ is the characteristic time (i.e., inter-spike interval); $V(t)$ is the membrane potential; $V_{\text{pump}}^{\text{NaK}}$ is the zero-current potential (-156 ± 14 mV) of the Na-K pump/Na⁺-K⁺ ATPase (Gadsby et al., 1985); $\mathcal{I}_{\text{Na}}$ is the sodium current; $\mathcal{I}_{\text{K}}$ is the potassium current; $\mathcal{I}_{\text{leak}}$ is the leak current. The adaptation simulation script (osf.io/yaf8n) is available on the OSF server (Kang, 2025).

A striking feature of adaptation is the vivid manifestation of the 'magic number seven' phenomenon in Figure 2.1.4, which clearly illustrates cognitive capacity limits. A direct explanation for Miller's 'magic number seven plus or minus two' is that the temporal cost increases dramatically after five spikes for temporal information or intensity encoding.

2.1.6 Analytic Solution of Neuronal Refractory Period and Spike Frequency Adaptation

The successful simulation of adaptation without unexplained parameters (Equation 2.1.6 and Figure 2.1.4) strongly suggests the existence of an analytic solution for cellular spike frequency adaptation. The zero-current potential of the Na-K pump/Na⁺-K⁺ ATPase and the square root of duration time provide rich information about the underlying mechanism. As discussed in Section 1.8.3, the Na-K pump/Na⁺-K⁺ ATPase constitutes a primary ATP-consuming process in neurons, potentially consuming up to 80% of neuronal energy (Boutilier, 2001), which implies that spike frequency adaptation emerges as a consequence of energy constraints. Therefore, a more intriguing and enlightening explanation for the universal 'magic number seven plus or minus two' is that energy constraints manifest as temporal expenditure, indicating a fundamental trade-off between energy consumption and time allocation.

[FIGURE:2.1.5] Creatine as an electric dipole in an AC (alternating current) field. A. One of creatine's zwitterionic forms. B. The dipole in an electric field. The charge is $q$ for both the negatively and positively charged atoms; the distance between the two charged atoms is $d$; the magnitude of the electric dipole moment $\mu$ is defined as the product of $q$ and $d$, and the direction of $\mu$ points from the negative charge to the positive charge; $\theta$ is the angle between the dipole moment $\mu$ and the electric field $E(t)$; $E(t)$ is a periodic function of time; $\tau$ is the torque on the dipole under the electric field.

Furthermore, the square root of duration time may indicate a diffusion process governed by Brownian motion (Equation 2.1.4). Considering the CK/PCr system as a critical spatiotemporal energy buffer (Section 1.8.3) and recognizing that creatine in its zwitterionic forms (Figure 1.8.1B and 2.1.5A) undergoes Brownian motion, creatine likely serves as the key molecule actively involved in spike frequency adaptation. Because creatine is electrically neutral, its diffusion is not directly influenced by electric fields associated with fluctuations in neuronal membrane potential. However, the charged groups within the zwitterionic creatine molecule, which can be considered as forming a molecular dipole, are susceptible to electric field influences (Figure 2.1.5B). This influence can affect molecular orientation, potentially causing alignment with the field. Let us consider the creatine zwitterionic molecule as a dipole under time-dependent perturbation.

2.1.6.1 The Hamiltonian of an Electric Dipole

Focusing on a single molecular dipole, we first simplify the system to consider only one entity as a free particle, which is reasonable if the molecule is isolated and does not interact significantly with other molecules during sufficiently short characteristic time intervals. Assuming that at $t = t_0$ the Hamiltonian $H_0$ of the undisturbed system represents the kinetic energy of the molecule without electric field influence:

$$H_0 = \frac{\mathbf{p}^2}{2m} \tag{2.1.7}$$

where $m$ is the molecular mass and $\mathbf{p}$ is the momentum operator. The potential effect of magnetic fields is ignored since brain magnetic fields are extremely weak.

The total Hamiltonian can be expressed as:

$$H = H_0 + H_{\text{int}} \tag{2.1.8}$$

where $H_{\text{int}}$ is the interaction energy between the electric dipole moment and the electric field (Figure 2.1.5). The Hamiltonian of an electric dipole in an AC field can be written as:

$$H_{\text{int}} = \int \tau\,d\theta - \boldsymbol{\mu} \cdot \mathbf{E}(t) = \left(\int \sin\theta\,d\theta - \cos\theta\right) qd\,E(t) = (b - 2d\cos\theta)\,qE(t) \tag{2.1.9}$$

where $\theta$ is the angle between the dipole moment $\boldsymbol{\mu}$ and the electric field $\mathbf{E}(t)$; $\tau$ is the torque on the dipole under the electric field; the term $(b - 2d\cos\theta)$ provides a mathematically rigorous and physically complete description of electric dipole interactions under arbitrary initial angles $\theta$. The parameter $b$, an integration constant from $\int \tau\,d\theta$, encodes static dipole asymmetry (e.g., structural defects or thermal offset) and is resolved to 0 via quantum boundary conditions (Equation 2.1.19). The coefficient $2d$ reflects kinetic-potential energy coupling in AC fields.

The total mechanical energy of a dipole in an external electric field (Equation 2.1.9) comprises the sum of its rotational kinetic energy and potential energy. The work done on the dipole by external torque typically equals the change in the dipole's potential energy.

As discussed in Section 2.1.2, the intrinsic function of simple auditory neurons encodes decomposed sound information in a wave function with characteristic frequency and stable phase, allowing the electric field $\mathbf{E}(t)$ of a simple neuron to be expressed as:

$$\mathbf{E}(t) = \mathbf{E}_0\,\mathrm{Re}\!\left(e^{i2\pi ft}\right) = \mathbf{E}_0 \cos(2\pi ft) \tag{2.1.10}$$

Let us denote $d\cos\theta$ as a projected position operator $\mathbf{x}$. Substituting Equation 2.1.10 into Equation 2.1.9 yields a simplified expression for $H_{\text{int}}$ for full circular trajectories:

$$H_{\text{int}} = -(b - 2\mathbf{x})\,qE_0 \cos(2\pi ft) \tag{2.1.11}$$

2.1.6.2 The Dipole Hamiltonian Evolution Under Time-Dependent Perturbation

In the Heisenberg picture of quantum mechanics (Zhang, 2015; Zeng, 2013), the time-dependent Hamiltonian evolution of the system under interaction with an electric field is:

$$H_I = U^\dagger H_{\text{int}} U = e^{iH_0 t/\hbar}\, H_{\text{int}}\, e^{-iH_0 t/\hbar} \tag{2.1.12}$$

where $U = e^{-iH_0 t/\hbar}$ is the time-evolution operator. Using the chain rule, the time differential of $H_I$ is:

$$\frac{dH_I}{dt} = \frac{i}{\hbar}[H_I, H_0] + \frac{\partial H_I}{\partial t} \tag{2.1.13}$$

where the expression $[a, b] = ab - ba$ defines the commutator of elements $a$ and $b$. In group theory and Lie algebra, this operation is also known as the Lie bracket (Dai, 2014; Baez and Muniain, 1994; Hall, 2003). If $ab = ba$, the elements $a$ and $b$ are said to commute. If $ab \neq ba$, they do not commute, which is referred to as noncommutativity. In quantum mechanics, the non-commutativity of the position $\mathbf{x}$ and momentum $\mathbf{p}$ operators, $[\mathbf{x}, \mathbf{p}] = i\hbar$, embodies the Heisenberg uncertainty principle, where $\hbar$ is the reduced Planck constant.

Substituting Equation 2.1.11 into 2.1.13, we obtain:

$$\frac{dH_I}{dt} = U^\dagger\, \frac{2qE_0}{\hbar} \cos(2\pi ft)\, [\mathbf{x}, H_0]\, U - U^\dagger (b - 2\mathbf{x})\, 2\pi f qE_0 \sin(2\pi ft)\, U \tag{2.1.14}$$

Since $[\mathbf{x}, H_0] = \left[\mathbf{x}, \frac{\mathbf{p}^2}{2m}\right] = \frac{1}{2m}[\mathbf{x}, \mathbf{p}^2] = \frac{1}{2m}\left(\mathbf{p}[\mathbf{x}, \mathbf{p}] + [\mathbf{x}, \mathbf{p}]\mathbf{p}\right) = \frac{i\hbar \mathbf{p}}{m}$, we have:

$$\frac{dH_I}{dt} = U^\dagger\, \frac{2qE_0}{\hbar} \cos(2\pi ft)\, \frac{i\hbar \mathbf{p}}{m}\, U + U^\dagger (b - 2\mathbf{x})\, 2\pi f qE_0 \sin(2\pi ft)\, U \tag{2.1.15}$$
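
The operator identity used in this step, [x, p²] = p[x, p] + [x, p]p, is pure algebra and can be checked with noncommutative symbols; the snippet below uses sympy only as a convenience for the bookkeeping.

```python
import sympy as sp

x, p = sp.symbols('x p', commutative=False)

lhs = sp.expand(x*p*p - p*p*x)                     # [x, p^2]
rhs = sp.expand(p*(x*p - p*x) + (x*p - p*x)*p)     # p[x, p] + [x, p]p

print(sp.expand(lhs - rhs) == 0)                   # True: the identity holds
# Substituting [x, p] = i*hbar gives [x, p^2] = 2*i*hbar*p, hence [x, H0] = i*hbar*p/m.
```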

2.1.6.3 The Position Representation Under Time-Dependent Perturbation

Equation 2.1.15 has a momentum representation. Using the energy conservation law ($dH_I = 0$) and the definition of the momentum operator $\mathbf{p} = -i\hbar \frac{\partial}{\partial x}$ in the position representation, we can transform the momentum representation into the position representation, obtaining:

$$E_0 \cos(2\pi ft)\, \frac{d\mathbf{x}}{dt} = -2\pi f qE_0 \sin(2\pi ft)\,(2\mathbf{x} - b) \tag{2.1.16}$$

Simplifying further:

$$\frac{d\ln(2\mathbf{x} - b)}{dt} = m\pi f \tan(2\pi ft) \tag{2.1.17}$$

Multiplying by $x$ and integrating both sides yields:

$$\mathbf{x} = \frac{b}{2} \exp\!\left(\int m\pi f \tan(2\pi ft)\, \frac{1}{x}\, dt\right) + \frac{b}{2} \tag{2.1.18}$$

Using the boundary condition $\langle \mathbf{x} \rangle = d$ at the initial time $t = 0$ (Figure 2.1.5), we obtain:

$$d = \frac{b}{2} \quad \text{and} \quad b = 0 \tag{2.1.19}$$

This result suggests integration along full circular trajectories in Equation 2.1.9, yielding:

$$\mathbf{x} = d \exp\!\left(\int m\pi f \tan(2\pi ft)\, \frac{1}{x}\, dt\right) \tag{2.1.20}$$

The equation can be further simplified using properties of the Dirac delta function: $\delta(x) = \frac{1}{2\pi} \int \exp(ikx)\, dk$, $\delta(x/a) = |a|\delta(x)$, $\int \delta(x)\,dx = 1$, $\int f(x)\delta(x)\,dx = f(0)$. Additionally, the derivative of the tangent function is $\frac{d}{dx}\tan(x) = \frac{1}{\cos^2 x}$. Applying these relations to Equation 2.1.20, we obtain:

$$\mathbf{x} = \frac{\pi\hbar}{2} \cos^2(2\pi ft)\, \delta(x) \tag{2.1.21}$$

As discussed at the beginning of Section 2.1.6, creatine in its zwitterionic form, being electrically neutral, experiences Brownian diffusion, a process not directly influenced by electric fields related to neuronal membrane potential fluctuations. Brownian motion is characterized by a normal (or z-normalized) distribution, where the variable $z$ represents the standard score calculated as $z = (x - \mu)/\sigma$, with $x = \sigma z + \mu$ being the value, $\mu$ the mean, and $\sigma$ the standard deviation. When $x = 0$ and $\theta = \pi/2$ (Figure 2.1.5), this marks the moment when the Brownian diffusion of the dipole has zero mean and the torque in the dipole-field interaction is maximal. Therefore, after incorporating Brownian diffusion with zero mean and the variance expressed in Equation 2.1.4, Equation 2.1.21 becomes:

$$\mathbf{x} = \frac{\pi\hbar}{2} \cos^2(2\pi ft)\, c\sqrt{t} \tag{2.1.22}$$

2.1.6.4 The Schrödinger Picture Under Time-Dependent Perturbation

Combining Equations 2.1.10, 2.1.11, 2.1.19, and 2.1.22, we obtain:

$$H_{\text{int}} = \pi\hbar\, \frac{dc\sqrt{t}}{E_0 \cos(2\pi ft) \cos^2(2\pi ft)} = \pi\hbar\, \frac{dc\sqrt{t}}{E(t)\left(1 - \sin^2(2\pi ft)\right)} = \pi\hbar = H - H_0 \tag{2.1.23}$$

Therefore, the total Hamiltonian of a single electric dipole is:

$$H = \pi\hbar\, \frac{dc\sqrt{t}}{E(t)} \tag{2.1.24}$$

Here, $\varepsilon_0$ denotes the quantization energy required by the reduced Planck constant ($\hbar$), as determined through dimensional analysis. In the Schrödinger picture of quantum mechanics, the evolution of the neuronal wave function under time-dependent perturbation is governed by:

$$|\psi_n(t)\rangle = e^{-iH\Delta t/\hbar}\, |\psi_n(0)\rangle \tag{2.1.25}$$

where $|\psi_n(t)\rangle$ is the ket form in Dirac notation, representing the time-evolved quantum state of the wave function in the Schrödinger picture; $n$ represents the number of creatine molecules required for an action potential; and $\Delta t$ is the characteristic time/interval.

According to the Copenhagen interpretation, the all-or-none firing probability (Equation 2.1.3) is:

$$p = \left| e^{-iH\Delta t/\hbar}\, \delta\!\left(\frac{H\Delta t}{\hbar} - k\frac{\pi}{2}\right) \right|^2 = \cos^2\!\left(2\pi\, \frac{H\Delta t}{\hbar}\right) \tag{2.1.26}$$

The equality in Equation 2.1.26 holds only when the ratio ( H\Delta t/\hbar ) is an integer multiple of ( \pi/2 ), matching the constant value of 1 as an integer multiple from the modulus squared of the complex exponential. Therefore, we have successfully derived the analytic solution for neuronal refractory period and spike frequency adaptation, thereby explaining the simulation equation 2.1.6 and Figure 2.1.4. This analytic solution supports the preceding discussions about energy constraints, which drive neuronal refractory periods and spike frequency adaptation. Equation 2.1.26 also explains how energy limits neuronal activity and shapes information encoding via spike frequency adaptation. The analytic solution further supports the notion that the CK/PCr system functions as a critical spatiotemporal energy buffer (Figure 1.8.1B), as discussed in Section 1.8.3.

2.1.6.5 Analytic Determination of Quantized Energy in Single Action Potentials

Compared to the simulation equation 2.1.6, we can extract more insightful information, such as:

$$n \approx 1.36 \times 10^6 \tag{2.1.27}$$

Here, the parameter $c$ of Brownian motion (Equation 2.1.4) is taken as unity after the analytical parameters have been parsed; the molecular size and weight of creatine are approximately 1 nm (10⁻⁹ m) and 131.135 g/mol, respectively; the Avogadro constant is 6.02214076 × 10²³/mol; and the atomic unit of charge is 1.602176634 × 10⁻¹⁹ coulombs. Then, assuming the quantization energy $\varepsilon_0$ is one unit, we calculate the number $n$ of creatine molecules and determine it to be ~1.36 × 10⁶, consistent with the discussion in Section 1.8.1 regarding the estimated energy (~10⁶ ATP) required for an action potential at a node of Ranvier (Aiello and Bach-y-Rita, 2000). Interestingly, in a volume of ~1 µm³, such as the node of Ranvier and the distal part of the AIS (Figure 1.1.1), there are approximately ~10⁶ creatine molecules available for action potential generation, since the PCr concentration in the brain is ~5-10 mM (Wallimann et al., 1992). Consistently, the PCr/creatine turnover rate is also ~10⁶ molecules per millisecond, as the energy usage rate of a single human cortical neuron is estimated to be ~10⁶ ATP/ms (Zhu et al., 2012), suggesting that the characteristic time/interval $\Delta t$ is on the millisecond scale, such as ~0.5-50 ms for human auditory neurons. Additionally, concentrations up to 10 mM ensure that direct dipole-dipole interactions are negligible, as there is less than one creatine molecule per 1000 nm³, supporting the rationality of the initial assumption in Section 2.1.6.1. In muscle or heart cells, where the PCr concentration reaches 35 mM (Wallimann et al., 1992), the situation becomes more complex, further complicated by the additional calcium dynamics of cardiac beating.

One challenge arises from inconsistent dimensions among these quantities, indicating the need to employ dimensionless parameters, such as quantized energies. Consequently, Equation 2.1.27 serves as a self-consistency condition for applying dimensionless parameters, such that $\varepsilon_0$ in Equation 2.1.24 represents the neuronal quantization energy, which is approximately 10⁶ ATP/PCr/creatine molecules per action potential. The neuronal quantization energy also supports the discussion in Section 1.8.1 regarding the information capacity of single action potentials.

2.1.6.6 Topological Protection of Quantum Neuronal States via Creatine Dipoles

More intriguingly, $b = 0$ in Equation 2.1.19 implies integration along full circular trajectories in Equation 2.1.9. This suggests that the classical torque along full circular trajectories under the dipole-electric field interaction produces a non-classical effect as the phasor factor in Equation 2.1.25. In physics, a small electric current loop generated by circular dipole movement is equivalent to an infinitesimal magnetic dipole (Jin, 2011). In quantum mechanics, the Aharonov-Bohm effect is a purely quantum-mechanical phenomenon demonstrating that the vector potential in electromagnetism can significantly affect the wavefunction (Baez and Muniain, 1994).

Therefore, the analytic solution implies that the infinitesimal magnetic dipole generated by circular dipole movement may play a role in neuronal refractory periods and spike frequency adaptation, and could explain the weak magnetic fields (~femtotesla, fT - picotesla, pT) detected in single neuron action potentials (Tonini et al., 2022). Moreover, transcranial magnetic stimulation (TMS) is currently used to treat depression, though its mechanism remains obscure (Cosmo et al., 2021). TMS might facilitate creatine adopting a specific zwitterionic form and orientation, and binding to CK, thereby enhancing energy coupling between the CK/PCr system and Na⁺-K⁺ ATPase activity (Figure 1.8.1B).

In summary, we introduce a novel biophysical theory in which topological protection is central to neuronal spike frequency adaptation and refractory periods. By modeling zwitterionic creatine molecules as electric dipoles under oscillatory neuronal fields, we derive an analytic solution to the time-dependent Hamiltonian evolution using quantum perturbation theory. A key result is the emergence of a boundary condition ($b = 0$) that enforces full circular trajectories of dipole alignment, leading to quantized phase evolution analogous to topological phase conditions in condensed matter systems.

This topological protection mechanism provides energetic robustness to spike generation, shielding the system from decoherence and thermal noise. Importantly, due to the random angular distribution and Brownian drift of creatine molecules, only a small fraction are effectively aligned with the electric field at any given moment. This statistical sparsity naturally yields an effective contribution rate on the millisecond timescale, consistent with known neuronal firing intervals. The resulting quantization condition links ATP consumption (~10⁶ ATP per action potential), creatine concentration (~5-10 mM), and neuronal refractory periods, forming a unified model of metabolic constraint and information encoding. These results provide a biophysical explanation for cognitive limits such as Miller's "magic number 7," rooted in energy constraints rather than abstract information-theoretic bounds.

By replacing speculative or empirically limited constructs, such as "neural competition" in Global Workspace Theory or $\Phi$ in Integrated Information Theory (IIT) (Seth and Bayne, 2022), with testable topologically protected quantization via dipole dynamics, this framework offers a falsifiable, room-temperature mechanism for robust information encoding. In contrast to decoherence-limited quantum consciousness models, such as the orchestrated objective reduction (Orch OR) model (Hameroff and Penrose, 2014) invalidated by Tegmark's decoherence analysis (Tegmark, 2000), our theory operates entirely within biologically plausible parameters while extending quantum formalism to neural energetics and cognitive processes.

2.1.7 The Wave Function Evolution of Simple Cells in the Auditory System

In summary, combining Equations 2.1.2, 2.1.6, 2.1.24, 2.1.25, and 2.1.27, the wave function $\psi$ of simple neurons in the auditory system evolves according to an expression with a phasor factor constrained by the energy coupling between PCr/creatine/ATP and the Na-K pump/Na⁺-K⁺ ATPase (Figure 1.8.1B), such that:

$$\psi(t) = e^{iH\Delta t/\hbar}\, \delta\!\left(\frac{H\Delta t}{\hbar} - m\pi\right) e^{i\Phi(2\pi l + \theta_0)} \quad \text{and} \quad H_{\text{AP}} = \pi\hbar\, I\sqrt{t}\, \frac{V(t) - V_{\text{pump}}^{\text{NaK}}}{\text{TS}} \tag{2.1.28}$$

where $H_{\text{AP}}$ represents the Hamiltonian (~10⁶ ATP/PCr/creatine) required for a single action potential during the characteristic time (inter-spike interval) $\Delta t$. The delta function represents the energy constraint condition that the ratio $H\Delta t/\hbar$ must be an integer ($m$) multiple of $\pi/2$. Here, $I$ is a unit with dimensions of $nqdc/(m\varepsilon_0)$; $V(t)$ is the membrane potential; $V_{\text{pump}}^{\text{NaK}}$ is the zero-current potential (-156 ± 14 mV) of the Na-K pump/Na⁺-K⁺ ATPase (Gadsby et al., 1985); $l$ represents the periodically layered tonotopy, and the phase function $\Phi(2\pi l + \theta_0)$ represents the dot product $\mathbf{k} \cdot \mathbf{r}$ in Equation 2.1.2, reflecting the orientation projection of simple neurons. $\Phi(2\pi l + \theta_0)$ is a function of the spatial angle $\theta_0$ between $\mathbf{k}$ and the tensor $\mathbf{r}$; $\theta_0$ is an invariant of simple neurons sharing the same characteristic frequency in the auditory system (Equation 2.1.27); and $t$ is the stimulus duration time.

2.1.8 Absolute Refractory Period

Combining Equations 2.1.26-2.1.28, we derive the relation:

$$I\sqrt{t}\, \frac{V(t) - V_{\text{pump}}^{\text{NaK}}}{\text{TS}}\, \Delta t = k\,\frac{\pi}{2} \tag{2.1.29}$$

Considering a Bernoulli trial of a simple neuron firing at its characteristic frequency, the duration time equals the characteristic time, which can be taken as the refractory period. With membrane potential ( V(t) ) ranging from -70 mV to 40 mV, the driving force of the Na-K pump/Na⁺-K⁺ ATPase lies between 86 mV and 196 mV. Consequently, when ( k = 1 ), the minimum/absolute refractory period for a Bernoulli trial spans 0.0187 ms to 0.0323 ms.
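
The quoted range can be reproduced with a back-of-the-envelope calculation. Assuming Equation 2.1.29 with t = Δt and k = 1, and folding the unspecified factor I/TS into a single constant chosen so that Δt^(3/2) × (V(t) - V_pump^NaK) ≈ 0.5 ms^(3/2)·mV (a value inferred here from the stated range, not one given in the text), the two driving-force extremes yield the reported bounds:

```python
# Back-of-the-envelope check of the quoted absolute refractory period range.
# Assumption: Delta_t**1.5 * driving_force ≈ 0.5 (ms^1.5 · mV); the unspecified
# factor I/TS of Equation 2.1.29 is folded into this single constant.
K = 0.5

for v_mem in (-70.0, 40.0):                   # membrane potential extremes (mV)
    driving_force = v_mem - (-156.0)          # V(t) - V_pump^NaK: 86 or 196 mV
    delta_t = (K / driving_force) ** (2 / 3)  # refractory period in ms
    print(f"driving force {driving_force:5.0f} mV -> Delta_t ≈ {delta_t:.4f} ms")
# Prints ≈ 0.0323 ms at 86 mV and ≈ 0.0187 ms at 196 mV.
```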

The absolute refractory period, along with other physiological factors such as sodium and potassium channel kinetics (Figure 1.1.1B), may influence neuronal phase-locking capabilities across species. Impressively, the barn owl demonstrates phase locking up to frequencies near 10 kHz (Köppl, 1997), reflecting diverse auditory ranges in the animal kingdom. Additionally, Equation 2.1.29 suggests that the minimum refractory period is independent of characteristic frequency. In more realistic scenarios, the integer ( k ) follows a geometric distribution, making the expected value of ( k ) inversely proportional to temporal frequency.

In summary, these analyses provide self-consistent evidence that simple neuron behavior in the auditory system is governed by the Schrödinger wave equation (Equations 2.1.25 and 2.1.28). Intriguingly, Equation 2.1.28 demonstrates that wave function evolution in auditory simple cells incorporates all three temporal classifications discussed in Section 2.1.4: real time, duration time, and characteristic time. The characteristic time validates the first-order approximation of Equation 2.1.5, establishing a linear relationship between the square root of duration time and spike number in Figure 2.1.3. Beyond the driving force of the Na-K pump/Na⁺-K⁺ ATPase, both duration time and characteristic time participate in neuronal adaptation (Equation 2.1.26).

2.2 Simple Cells in the Visual System

Visual perception involves intricate collaboration between the eye and brain to interpret the visual world, such as reconstructing three-dimensional representations from binocular two-dimensional retinal images (Longstaff, 2007). This process, while seemingly straightforward, poses significant challenges to both laypeople and experts. A comprehensive understanding necessitates a multidisciplinary approach, drawing from optics, neurophysiology, cognitive psychology, and computer science to unravel its complexities.

Light enters the eye and projects onto the retina, where photoreceptor cells convert it into electrochemical signals. These signals then traverse a layered network of neural pathways, including the lateral geniculate nucleus (LGN), superior colliculus, and visual cortex. The LGN-superior colliculus-pulvinar pathway functions as an unconscious rapid process for quick safety-related actions (Soares et al., 2017). The LGN-visual cortex pathway is where visual information is further decomposed, processed, and eventually transformed into the rich visual experiences we consciously perceive. This pathway performs complex visual operations including stereoscopic vision, pattern recognition, depth perception, color rendering, and motion detection. Processing occurs sequentially in the primary visual cortex (V1, also known as striate cortex) for basic visual processing and stereopsis, the secondary visual cortex (V2) and V3 for fine pattern recognition and depth perception, V4 for color rendering, and the middle temporal area (MT or V5) for motion detection. Consequently, visual information processing bifurcates into two streams: the dorsal stream from V5 projecting into the parietal lobe to determine object location ("where"), and the ventral stream from V4 to the temporal lobe for object identification ("what") (Longstaff, 2007). Koniocellular cells in the LGN also project directly to V5, which may account for the phenomenon of blindsight (Sincich et al., 2004).

2.2.1 The Retina as a Fourier Plane in Fourier Optics

[FIGURE:2.2.1] The retina as a Fourier plane in Fourier optics. A. Diagram of lens imaging. The distance ($d$) between an object and a lens exceeds the focal length ($f$) of the lens. The posterior focal plane is the Fourier plane in Fourier optics. B. The human retina is the focal plane of the human eye lens system. The lens system comprises the cornea, aqueous humor, lens, and vitreous humor, which work together to focus light onto the retina. The iris and pupil regulate light entry, with the iris controlling pupil size. By the lens formula, the image plane coincides with the Fourier plane under the paraxial approximation, where the object distance significantly exceeds the focal length ($d \gg f$). C. A Fourier-transformed 2D image generated with ImageJ software encapsulates spatial frequency, phase angle, and amplitude, representing the frequency-domain information received by the retina. D. The original image for panel C. The real image plane simultaneously received by the retina is an inverted rendition of the original scene, akin to a mirrored reflection. The Chinese characters "感知" mean "perception" in English. The human eye schematic is modified from a version shared by Mohamed El-haddad (10.5281/zenodo.3926106).

Deciphering the brain's mechanisms for processing visual information, integrating it, and defining conscious perception represents a frontier that pushes the limits of scientific knowledge. This endeavor encompasses information decomposition, neural encoding, algorithms the brain uses to construct reality, and even philosophical inquiries about consciousness and the nature of subjective experiences or qualia, such as colors.

The first fundamental challenge is understanding how visual information is decomposed and encoded in the visual system. To clarify the decomposition mechanism and uncover how we see and what we see, we must understand key properties of lenses in Fourier optics. The lens effectively performs a Fourier transform (Goodman, 2005), with the image's frequency domain representation appearing at the posterior focal plane (Figure 2.2.1A).

The human eye lens system includes the cornea and lens, which refract light and focus it onto the retina. The accommodation mechanism, consisting of ciliary muscles and zonule fibers, controls lens shape for clear vision at varying distances. As demonstrated in Figure 2.2.1B, the human retina serves as a Fourier plane in Fourier optics (Figure 2.2.1C), beyond its traditional role as an image plane (Figure 2.2.1D). The Fourier plane, capturing spatial frequency, phase angle, and amplitude of visual input (Figure 2.2.1C), may be critical for the retinotopic map in V1 cortex, which preserves the spatial organization of visual input. For instance, angles constitute fundamental information for the retinotopic map based on quasiconformal mapping (Ta et al., 2022). Moreover, beyond spatial frequency, amplitudes in the spatial frequency domain efficiently encode visual stimulus intensities. The retina's function as a Fourier plane provides a fundamental condition for subsequent analyses.
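
The frequency-domain picture in Figure 2.2.1C can be reproduced for any grayscale image with a plain 2D FFT. The sketch below is numpy-based and analogous to, but independent of, the ImageJ transform used for the figure; the synthetic test image is an assumption for illustration.

```python
import numpy as np

# Synthetic grayscale "scene": a tilted grating plus a brighter square patch
y, x = np.mgrid[0:256, 0:256]
image = np.sin(2*np.pi*(0.05*x + 0.02*y))
image[96:160, 96:160] += 1.0

spectrum = np.fft.fftshift(np.fft.fft2(image))   # centred 2D Fourier transform
amplitude = np.abs(spectrum)                     # spatial-frequency amplitudes
phase = np.angle(spectrum)                       # phase angle at each spatial frequency

# FFT viewers such as ImageJ typically display the log-scaled amplitude
log_amplitude = np.log1p(amplitude)
print(log_amplitude.shape, round(float(phase.min()), 2), round(float(phase.max()), 2))
```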

2.2.2 Information Decomposition in the Visual System as Radon Transform

The retina contains five neuronal types. From outer to inner layers, these are photoreceptor cells, horizontal cells, bipolar cells, amacrine cells, and retinal ganglion cells (RGCs). Photoreceptor cells have three types: rods, cones, and the less prevalent (~1%) intrinsically photosensitive retinal ganglion cells (ipRGCs). Rod cells exhibit remarkable sensitivity, capable of detecting single photons (Rieke and Baylor, 1998; Tinsley et al., 2016). For conscious visual perception, retinal rods must absorb 5 to 14 photons (Fu, 1995). Cone cells are abundant in the fovea and sensitive to color and detail (Figure 2.2.2A). Recent progress demonstrates that ipRGC-mediated light sensation can promote synaptogenesis via oxytocin (Hu et al., 2022) and facilitate visual orientation feature perception (Shi et al., 2024). RGCs are categorized into three distinct types: parvocellular RGCs (P-cells), magnocellular RGCs (M-cells), and koniocellular cells (K-cells). P-cells are smaller and responsible for conveying color vision and fine detail information. M-cells are larger and specialized for low spatial frequency, high temporal resolution, and motion detection. M-cells are sensitive to low contrast and play key roles in black-and-white (luminance) vision.

The human retina contains ~92 million rod cells, twenty times the ~4.6 million cone cells (Curcio et al., 1990), ~10 million bipolar cells (Visual Processing: Eye and Retina, Section 2, Chapter 14, Neuroscience Online: An Electronic Textbook for the Neurosciences | Department of Neurobiology and Anatomy - The University of Texas Medical School at Houston), and ~1 million RGCs (Watson, 2014). In the retinal-LGN-visual cortex pathway (Figure 2.2.2A-C), the LGN contains ~2 million neurons (Figure 2.2.2B) (Dorph-Petersen et al., 2009), while V1 cortex possesses ~300 million neurons (Figure 2.2.2C) (Garcia-Marin et al., 2024).

[FIGURE:2.2.2] Visual information processing and decomposition as Radon transform. A. Schematic of retinal cells and numbers. Gray vertical rectangles represent rod cells (~92 × 10⁶). Colored sharp bars represent cone cells (~4.6 × 10⁶), which are abundant in the fovea and serve as photoreceptor cells sensitive to color and detail. ipRGCs are not shown as they are not involved in image-forming vision (Fu, 1995). Collate shapes represent bipolar cells (~1 × 10⁷). Light blue ellipses of different sizes represent the two major retinal ganglion cell (RGC) types (~1 × 10⁶): parvocellular RGCs (P-cells) and magnocellular RGCs (M-cells). "On-center" and "Off-center" terminology typically describes subtypes of bipolar cells and RGCs. B. Schematic of the lateral geniculate nucleus (LGN) (~2 × 10⁶). Koniocellular cells (not shown) are located between the magnocellular and parvocellular layers, projecting to extrastriate visual areas (V2-V5) as well as striate cortex (V1) (Eiber et al., 2018). C. LGN output projects to V1 (~3 × 10⁸) via the optic radiations. V1 organization is known as retinotopic mapping, maintaining the spatial order of the visual field from the retina, particularly the angle of eccentricity. D. An example of a coherent state in quantum optics. E. An example of a Fock state (photon number $n > 0$) in quantum optics. Coherent and Fock states in quantum optics are realized and visualized using the qfunc function of qutip, an open Python framework for quantum systems (Johansson et al., 2013). F. The wavefunction of a coherent state $\psi(y)$ exhibits a two-dimensional Gaussian distribution in phase space. $Z$ is a normalizing parameter. G. The wave function of the LGN. H. An expression of the Gabor transform. I. An example of Gabor transformation using the same Chinese characters as in Figure 2.2.1D, using a modified Python script from the GitHub mhaghighat/gabor repository (Haghighat et al., 2015). Overall, the information processing function of the retinal-LGN-visual cortex pathway is equivalent to the Radon transform in mathematics and physics.

2.2.2.1 Quasi-Probability Distributions and the Fock State in Quantum Optics

Rod cells possess the remarkable capability to detect single photons, indicating exceptionally high quantum efficiency (Rieke and Baylor, 1998). To further decipher the functions of photoreceptor cells, bipolar cells, and RGCs in quantum mechanical terms, we introduce key concepts from quantum optics (Guo and Zhou, 2022): the Glauber-Sudarshan P representation ( P(\alpha) ), the Husimi Q representation ( Q(\alpha) ), and the Wigner representation ( W(x, p) ). The ( P(\alpha) ), ( Q(\alpha) ), and ( W(x,p) ) are three important quasi-probability distributions in quantum mechanics that provide different ways to represent quantum states in phase space (Gerry and Knight, 2005). The 2D Fourier transforms of ( P(\alpha) ), ( Q(\alpha) ), and ( W(x,p) ) are their characteristic functions, unified through an ( \varepsilon )-parameterized characteristic function ( C(k, \varepsilon) ) (Cahill and Glauber, 1969):

[
C(k, \varepsilon) = e^{\varepsilon|k|^2/2} C_W(k) = e^{\varepsilon|k|^2/2} \text{Tr}[\rho D(k)] \tag{2.2.1}
]

where ( C_W(k) ) is the Wigner characteristic function; ( \rho ) is the density matrix in quantum mechanics; ( D(k) ) is the displacement operator in quantum optics (Gerry and Knight, 2005; Weedbrook et al., 2012; Guo and Zhou, 2022); ( \varepsilon ) is a Levi-Civita symbol-like parameter: ( \varepsilon = 0 ) represents ( C_W(k) ); ( \varepsilon = 1 ) represents the normally ordered ( C_N(k) ) for the ( P(\alpha) ) representation; ( \varepsilon = -1 ) represents the antinormally ordered ( C_A(k) ) for the ( Q(\alpha) ) representation; ( k ) is a complex wave number, since ( \exp(-|k|^2/2) ) can be regarded as a photon wave packet function integrated from different plane waves with various wavenumbers ( k_i ).

For coherent states, ( P(\alpha) ) functions are given as 2D delta functions (Gerry and Knight, 2005). In quantum optics, light beam intensity is detected by our eyes through photon absorption, which is an annihilation state—i.e., a coherent state. Additionally, since the retina is a Fourier plane, photoreceptor cells interact with single modes of the electromagnetic field (photons in pure states), so photoreceptor cells function as 2D delta functions under the ( P(\alpha) ) representation. For coherent states, as illustrated in Figure 2.2.2D, both ( Q(\alpha) ) and ( W(x,p) ) are 2D Gaussian distributions (Equation 2.2.1), considering that the Fourier transform or inverse Fourier transform of a Gaussian function is simply another Gaussian function with different parameters. For coherent states, the 2D Gaussian distributions of ( Q(\alpha) ) and ( W(x,p) ) are also equivalent to 2D delta functions under the ( P(\alpha) ) representation and wave packet forms of photons in mathematics and physics (Equation 2.2.1) (Zeng, 2013), and satisfy the minimum uncertainty principle because coherent states are minimum uncertainty states (Gerry and Knight, 2005). Considering that 5 to 14 photons must be absorbed for efficient visual information transmission and conscious perception (Fu, 1995), these photon number states (Fock states, ( n > 0 )) can be visualized with the ( Q(\alpha) ) representation (Figure 2.2.2E), which is the only quasi-probability distribution possessing the nonnegative property of real probability (Gerry and Knight, 2005).

Once photoreceptor cells are activated by visual stimuli, bipolar cells undergo depolarization or hyperpolarization, propagating graded potentials based on light stimulus intensities. However, unlike most neurons, bipolar cells do not typically generate action potentials. Visual information in the retina is processed in an interesting pattern: both bipolar cells and RGCs can be classified into "On-center" and "Off-center" cells (Figure 2.2.2A). These on-center and off-center cells are crucial for processing brightness changes, contrast, and motion within the visual system. The probability distribution in Figure 2.2.2E explains why "On-center" and "Off-center" cells are necessary for visual information processing in quantum optics. As illustrated in Figure 2.2.2A, light sequentially passes through RGC and bipolar cell layers before photon annihilation by photoreceptor cells, guaranteeing the Fock state (photon number ( n > 0 )) for RGCs and bipolar cells through this layered retinal design. Therefore, the functions of RGCs and bipolar cells are specialized for Fock states (Figure 2.2.2E) under the ( Q(\alpha) ) representation. Moreover, the Fock state in Figure 2.2.2E can also explain why 5 to 14 rod cells must each absorb a photon to generate a visual response (Fu, 1995).

2.2.2.2 LGN Neuron Functions as Gabor Transform

The on-center and off-center patterns also suggest that RGCs act as pixel units of the 2D spaces (retinal, image, and Fourier planes) and integrate gradients encoded in bipolar cells with assistance from horizontal and amacrine cells. Therefore, RGC functions can be represented as a coherent state and a 2D Gaussian function ( f_{2D}(x, y) ) (Equation 2.2.2) for simplicity without loss of generality:

[
f_{2D}(x, y) = Z \exp\left(-\left(\frac{(x-x_0)^2}{2\sigma_x^2} + \frac{(y-y_0)^2}{2\sigma_y^2}\right)\right) \tag{2.2.2}
]

where ( Z ) is a normalizing parameter; ( x_0 ) and ( y_0 ) denote the means, while ( \sigma_x^2 ) and ( \sigma_y^2 ) represent variances along the x and y axes, respectively, on the projected 2D x-y plane. This equation can be expressed in quadratic form, capturing the essence of the dot product ( \mathbf{k} \cdot \mathbf{r} ) in Equation 2.1.2:

[
f_{2D}(x, y) = Z \exp\left(-(x - x_0, y - y_0) \mathbf{F}^{-1} \begin{pmatrix} x - x_0 \\ y - y_0 \end{pmatrix}\right) \tag{2.2.3}
]

Using the tensor ( \mathbf{r} ) defined in Equation 2.1.1 and the Fisher information discussed in Section 1.7, Equation 2.2.3 simplifies to:

[
f_{2D}(x, y) = Z \exp(-\mathbf{r}^T \mathbf{F}^{-1} \mathbf{r}) \tag{2.2.4}
]

where ( \mathbf{r} = [\mathbf{x}, \mathbf{y}]^T ) denotes the spatial tensor on the x-y plane; ( T ) represents transpose; ( \mathbf{F}^{-1} ) is the inverse of the Fisher information weighted matrix and equals half the precision matrix (( \mathbf{F}^{-1} = \frac{1}{2}\Sigma^{-1} ), where ( \Sigma = \mathrm{diag}(\sigma_x^2, \sigma_y^2) ) is the covariance matrix). The absence of an explicit "1/2" factor in Equations 2.2.2-2.2.4 reflects deliberate quantum notation: coherent states encode minimal uncertainty ( \Delta r^2 = 1/2 ), which is the standard quantum limit in dimensionless units. This arises naturally from second quantization, where vacuum fluctuations are ( \langle r^2 \rangle = 1/2 ).

This also represents the amplitude information received by LGN neurons (Figure 2.2.2G). The function of the LGN neuron as a perception neuron (Equation 2.1.2) has the expression:

[
\psi_{\text{LGN}} = \sqrt{f_{2D}(x, y)} \exp\left(i(2\pi ft + \Phi(\theta))\right) \tag{2.2.5}
]

Intriguingly, Equation 2.2.5 also represents the Gabor transform (Figure 2.2.2H), which is a special case of the short-time Fourier transform. Information from LGN neurons to be transformed is first multiplied by a Gaussian function before applying the Fourier transform. The Gaussian function serves as a window function in physics. Consistently, two-dimensional Gabor filter models of visual systems have been intensively discussed for over forty years (Daugman, 1980; Jones and Palmer, 1987; Olshausen and Field, 1996). Therefore, the function of LGN neurons can naturally be regarded as the Gabor transform (Equation 2.2.5) (Figure 2.2.2I).
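
To make the Gaussian-window-times-complex-exponential structure of Equation 2.2.5 concrete, a minimal Gabor kernel can be written directly in numpy. This sketch is not the mhaghighat/gabor script used for Figure 2.2.2I; the kernel size, bandwidth ( \sigma ), spatial frequency, and orientation are illustrative choices.

```python
# Minimal numpy sketch of a 2D Gabor kernel: a Gaussian window multiplied by a complex
# plane wave, mirroring the structure of Eq. 2.2.5. Parameters are illustrative only.
import numpy as np

def gabor_kernel(size=31, sigma=4.0, freq=0.15, theta=0.0):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)          # carrier direction
    window = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))     # Gaussian window function
    carrier = np.exp(2j * np.pi * freq * x_rot)            # complex exponential
    return window * carrier

kernel = gabor_kernel(theta=np.pi / 4)
print(kernel.shape, np.abs(kernel).max())                  # (31, 31), peak amplitude 1.0
```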

2.2.2.3 The Retinal-LGN-Visual Cortex Pathway Functions as a Radon Transform

Given that the LGN contains ~2 million neurons while V1 cortex possesses ~300 million neurons, visual information from LGN neurons is further linearly decomposed into V1 cortex (Figure 2.2.2). Apparently, after fine details of the real image are pixelized in RGCs, Gabor-filtered via LGN, and further decomposed into V1 cortex, visual information through the retinal-LGN-visual cortex pathway is completely decomposed into the frequency domain. Consequently, information processing via this pathway is equivalent to the Radon transform in mathematics and physics, which converts a planar function defined on 2D space (such as an image or Fourier plane) into a set of one-dimensional projections representing the sum of image intensities along specific angles. For instance, a 2D Gaussian function transforms into a series of line functions, each corresponding to a different orientation on the plane.

The Radon transform and its inverse play pivotal roles in biomedical imaging, such as X-Ray Computed Tomography (CT), Positron Emission Tomography (PET), and Single Photon Emission Computed Tomography (SPECT). CT generates and reconstructs cross-sectional images from projections taken at different angles around the patient to identify internal abnormalities for clinical diagnosis, while PET and SPECT visualize metabolic and functional activity in tissues (Barret, 1984).
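
The text's example, a 2D Gaussian decomposed into one projection per orientation, can be sketched with scikit-image's radon function. This is not the Octave pipeline used for the figures; the image size, the off-center Gaussian, and the angle set are assumptions for illustration.

```python
# Sketch of the forward Radon transform of an off-center 2D Gaussian: each column of the
# sinogram is the 1D projection ("line function") along one angle. Illustrative only.
import numpy as np
from skimage.transform import radon

size = 128
y, x = np.mgrid[-size // 2:size // 2, -size // 2:size // 2]
image = np.exp(-(x**2 + (y - 20.0)**2) / (2 * 8.0**2))     # off-center 2D Gaussian

angles = np.linspace(0.0, 180.0, 60, endpoint=False)       # projection angles (degrees)
sinogram = radon(image, theta=angles, circle=False)        # rows: detector bins, cols: angles

print(sinogram.shape)   # one 1D projection per angle, as described in the text
```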

2.2.3 An Example of a Simple Neuron in Monkey V1 Cortex

Beyond theoretical analyses, perceptual knowledge about V1 neuron firing patterns is valuable (Figure 2.2.3). The evoked monkey V1 activity plotted in Figure 2.2.3 is shared by the Kohn group (Smith and Kohn, 2008) and deposited in CRCNS (Collaborative Research in Computational Neuroscience) (Kohn and Smith, 2016).

[FIGURE:2.2.3] The evoked V1 activity of Macaque monkey under drifting sinusoidal gratings. A. Drifting sinusoidal gratings and raster plots (neuron vs time) of evoked V1 datasets recorded using Utah arrays. Arrows represent grating drift directions. The imagesc function of Octave software is used for raster plots. B. A raster plot (time vs trial) example of recorded data from a simple neuron (monkey3-S1(0º)-N18) in V1 cortex. This simple neuron (N18) also responds to the S7 stimulus of vertical grating drifting left (180º) with the same frequency (Figure 2.3.2A), suggesting the neuron responds to vertical grating patterns rather than directions. The dataset (pvc-11) was recorded by Matthew Smith and Adam Kohn and deposited in CRCNS (Kohn and Smith, 2016).

A higher-order modularity exists in V1 cortex structure called the hypercolumn (Longstaff, 2007; Kandel et al., 2021). Consequently, only limited data demonstrate a simple neuron firing pattern with a characteristic frequency (Figure 2.2.3B). The simple neuron responds to vertical grating drifting irrespective of drift direction. The temporal frequency tuning of V1 neurons involves inhibitory modulation (Lauritzen and Miller, 2003), representing an example of adaptation in a neuronal network caused by inhibitory feedback (Gutkin and Zeldenrust, 2014), discussed further in Section 2.3.2 regarding complex cells.

As illustrated in Figure 2.2.2B, each LGN layer receives input primarily from one eye, with alternating layers receiving inputs from left and right eyes. However, binocular visual information is not mixed in the LGN, meaning magnocellular, parvocellular, and koniocellular pathways remain separate from retina to LGN. Binocular visual information first combines in V1 cortex for stereopsis. Considering the retina as a Fourier plane, binocular visual information contains all amplitude and phase information necessary for a hologram, so information encoded in V1 simple neurons can be regarded as a Fourier-sliced hologram. More precisely, since pixelized visual information in RGCs is further Gabor-filtered via LGN and linearly decomposed into V1 cortex, the fine information encoded in V1 simple neurons is a pixelized and Fourier-sliced hologram. We can generalize Equations 2.2.4 and 2.2.5 for simple neurons in V1 cortex:

[
\psi_{V1} = Z' \exp\left(i\left(2\pi f_0 t + \Phi(\theta_0) + \frac{i}{2} \mathbf{r}^T \mathbf{F}_3 \mathbf{r}\right)\right) \tag{2.2.6}
]

where 3 represents the dimension of tensor ( \mathbf{r} ) for V1 simple neurons, so ( \mathbf{F}_3 ) represents the 3D Fisher information weighted matrix. Here, ( \Phi(\theta_0) ) as a phase factor can be simplified to ( \theta_0 ), ranging from 0 to ( 2\pi ):

[
\psi_{V1} = \exp(i\theta_0) Z' \exp\left(-\frac{1}{2} \mathbf{r}^T \mathbf{F}_3 \mathbf{r}\right) \exp(2\pi i f_0 t) \tag{2.2.7}
]

Here, ( \theta_0 ) belongs to a projective space with circular symmetry after Gabor transform and angular decomposition by LGN (Figure 2.2.2C and 2.2.2I). This space is called ray space (dimension: ( n + 1 )) in physics, known as projective Hilbert space in mathematics. The ( \exp(i\theta) ) can be regarded as a group representation of a unitary transformation in ray space, and the phase information contained in ( \exp(i\theta) ) is important for stereopsis formation; ( f_0 ) denotes temporal frequency.

2.3.1 Extrastriate Cortex Processing Visual Information as Inverse Radon Transform

After holograms are pixelized, Fourier-sliced, and stored in V1 simple neurons via the Radon transform action of the LGN-visual cortex pathway, the pixelized and Fourier-sliced visual information is sequentially merged in V2 and V3 for fine pattern recognition and depth perception, V4 for color rendering, and V5 for motion detection. As discussed in Section 1.4 and Equation 1.4.1, neurons act as summators, integrating outputs from individual dendritic spines in response to presynaptic inputs, which can be regarded as independent and identically distributed random variables. Additionally, wave functions (Equation 2.2.7) of V1 simple neurons are characteristic functions of presynaptic inputs. In the context of Ross's Example 1.5a, which pertains to a stochastic process, the characteristic function of the sum of a random number of random variables equals the product of the individual characteristic functions of those variables (Ross, 1996). Moreover, extrastriate visual areas exhibit reciprocal projections to one another (Figure 2.3.1A). Consequently, visual information integration in extrastriate visual areas is sequentially processed as the inverse Radon transform to higher ( \mathbf{F}_n ) dimensions (Figure 2.3.1B):

[
\psi_{Vx} = \exp\left(i \sum \theta_0\right) Z' \exp\left(-\frac{1}{2} \mathbf{r}^T \mathbf{F}_n \mathbf{r}\right) \exp\left(2\pi i \sum f_0 t\right) \tag{2.3.1}
]

where ( Vx ) represents neurons in extrastriate visual areas; ( n ) represents the dimension of tensor ( \mathbf{r} ) for extrastriate visual area neurons, so ( \mathbf{F}_n ) represents the n-D Fisher information weighted matrix.

[FIGURE:2.3.1] Visual information processing as inverse Radon transform in extrastriate visual areas. A. Schematic of communications between visual cortex areas. B. Examples of inverse Radon transformations with different angular resolutions. The inverse Radon transform is performed with the iradon command of Octave software.
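
The angular-resolution effect illustrated in Figure 2.3.1B can be sketched with scikit-image's radon/iradon pair in place of the Octave iradon command. The Shepp-Logan phantom and the three angle counts, standing in for progressively richer reciprocal projections, are assumptions for illustration.

```python
# Sketch of inverse Radon reconstruction at increasing angular resolution, analogous to
# Figure 2.3.1B. Phantom choice and angle counts are illustrative assumptions.
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, resize

image = resize(shepp_logan_phantom(), (128, 128))

for n_angles in (8, 32, 128):                              # coarse to fine angular sampling
    theta = np.linspace(0.0, 180.0, n_angles, endpoint=False)
    sinogram = radon(image, theta=theta)                    # forward projections
    recon = iradon(sinogram, theta=theta)                   # filtered back-projection
    rmse = np.sqrt(np.mean((recon - image) ** 2))           # error shrinks with more angles
    print(n_angles, round(float(rmse), 4))
```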

Given that the number of V1 neurons is only ~150 times the number of LGN neurons (Figure 2.2.2), reciprocal projections between extrastriate visual areas are necessary for enhancing angular resolution to reconstruct intricate visual object details with increasing clarity, as demonstrated in Figure 2.3.1B. Eventually, a fine reconstructed scene (( \sum \theta_0 = 2k\pi )) is encoded in complex neurons as:

[
\psi_{Vx} = Z' \exp\left(-\frac{1}{2} \mathbf{r}^T \mathbf{F}_n \mathbf{r}\right) \exp\left(2\pi i \sum f_0 t\right), \quad Z' = \frac{\det(\mathbf{F}_n)^{1/2}}{(2\pi)^{n/2}} \tag{2.3.2}
]

where the normalization parameter ( Z' ) is necessary for characterizing the all-or-none firing pattern and satisfies the condition that the integral over all space of the amplitude-squared of the wavefunction equals unity; ( \det(\mathbf{F}_n) ) is the determinant of ( \mathbf{F}_n ); ( f_0 ) denotes temporal frequency. Reciprocal projections between extrastriate visual areas are also required for integrating additional information, such as that received and encoded in koniocellular cells. The ( \mathbf{F}_n ) dimension of complex neurons in V4 cortex is likely large, including extra dimensions for patterns, gradients, depth, etc.; while ( \sum f_0 ) is critical for complex neurons in V5 cortex for motion detection, for example, in representing a serene scene of gently flowing water.

2.3.2 An Example of a Complex Neuron in Monkey V4 Cortex

For comparison, evoked monkey V1 activity (monkey3-S7(180º)-N18) as a simple neuron example is plotted in Figure 2.3.2A (Kohn and Smith, 2016). Evoked monkey V4 activity data in Figure 2.3.2B are shared and deposited in figshare by Matthew Smith's laboratory (Smith, 2020).

[FIGURE:2.3.2] The evoked V1 and V4 activities of Macaque monkey. A. A raster plot (time vs trial) example of recorded data from a simple neuron (monkey3-S7(180º)-N18) in V1 cortex. B. A raster plot (time vs trial) example of recorded data from monkey V4 cortex. Blue curves represent the Fano factor, and orange curves represent normalized variance.

As discussed in Section 2.2.3, inhibitory modulation is involved in temporal frequency tuning of V1 neurons. The negative slopes of the Fano factor in Figure 2.3.2 suggest that inhibitory modulation is active during the late stage of visual stimulation, consistent with reports that temporal responses to natural stimulation have stronger late inhibitory components in V1 neurons (David et al., 2004). This represents a vivid example of adaptation and synchronization in visual neuronal networks caused by inhibitory feedback. Intriguingly, the modulatory role of inhibition in synchronization may paradoxically contribute to epilepsy intractability. Consistently, fast-spiking parvalbumin-positive interneurons play active roles in maintaining oscillatory dynamics, such as phase-locking and coherence (Hijazi et al., 2023).

Neurons within extrastriate visual areas demonstrate complex cellular activities corresponding to mixed states within the quantum optics framework. As illustrated in Figure 2.3.2B, the complex neuron in V4 cortex exhibits a rich firing pattern with various frequencies and phases (Equation 2.3.1) compared to the spiking of the V1 simple neuron (Figure 2.3.2A). Complex cells in the visual cortex can be classified as the largest congruence subgroup ( \Gamma_0(n) ), as discussed in Section 1.6.

2.3.3 Working Algorithms of Complex Cells and Their Perceptions

After examining auditory and visual systems, we can identify the working algorithms of complex cells from Equations 2.2.5-2.3.2: the information integration algorithm is the inverse Radon transform; information decomposition in complex cells can employ the Radon transform and Gabor transform as demonstrated in Figure 2.2.2; and neuronal information transmission follows a Bernoulli process or regenerative process as discussed in Section 2.1.

Furthermore, how is visual information consciously perceived and created by complex cells? As discussed in Section 2.2.3, information encoded in V1 simple neurons is a pixelized and Fourier-sliced hologram. Therefore, each synapse of extrastriate V2 or V3 complex neurons receives a pixelized and sliced hologram, and V2 and V3 complex neurons selectively reconstruct specialized information (such as pattern, depth, color, and motion) via the inverse Radon transform. Intriguingly, for conscious perception or visualization, synapses of single complex neurons in extrastriate visual areas can act as pixels, collectively creating snapshots for specialized perceptions.

2.4.1 The Law of Structure-Function Correlation at the Sensory Modality Level

Stimuli have four elementary features: modality, location, intensity, and duration (Kandel et al., 2021). Human sensory systems employ specialized receptors to detect various modalities, classified by energy types (Kandel et al., 2021), such as mechanical energy (touch, proprioception, pain, hearing), electromagnetic energy (visible light), thermal energy (temperature changes), chemical energy (taste and smell), and gravitational energy (balance). For intensity, neurons encode stimulus intensity through spike rate as introduced in Section 1.1. For chronoception of duration, we have discussed the energy-constrained adaptation of auditory simple neurons in Section 2.1. For location, this information is inherently encoded within the hologram, which is reconstructed in visual cortex areas via inverse Radon transform as illustrated in Figure 2.3.1B.

After examining auditory and visual modalities, we observe that both sensory systems completely decompose auditory and visual information into the frequency domain. The exquisite structures of the cochlea in the ear and the lens system in the eye function as Fourier transformers. The tonotopic map and retinotopic map in the brain store information as wave functions in simple neurons with their characteristic frequencies and spatial configurations (Equations 2.1.28 and 2.2.7, respectively). The accommodation mechanism and multi-layered retina are specialized to employ principles of Fourier optics and quantum optics (Section 2.2). Both the structures and functions of auditory and visual modalities represent evolutionary optimized adaptations for sensory systems that preferentially respond and adapt to stimuli from changing circumstances (Kandel et al., 2021).

For the modular group SL(2, Z) and its congruence subgroups introduced in Section 1.6, simple auditory neurons can be classified as the congruence subgroup ( \Gamma_1(n) ). Simple neurons featuring even frequencies can be further subclassified as the principal congruence subgroup ( \Gamma_2(n) ), as discussed in Section 2.1.2. Complex cells can be classified as the largest congruence subgroup ( \Gamma_0(n) ). Naturally, the modular group and its congruence subgroups offer a systematic framework for classifying and understanding a wide range of neurons and their functions in the frequency domain.

2.4.2 Quantum Mechanics of the Neuronal System at the Mesoscopic Level

The group representation and hierarchical spatial invariance of auditory tonotopy facilitate our conception of simple auditory neurons' intrinsic function as a wave function ( \psi ) with characteristic frequency and invariant phase. This is a critical step toward successfully resolving the quantum mechanism of "the magic seven" and explaining the operative mechanism of energy constraints at the mesoscopic level (~10⁶ creatine/PCr/ATP). Provocatively, quantum optics perfectly explains the design, structure, and function of the retina (Figure 2.2.2). Particularly, the stratified structures, inner positions, and functions of RGCs and bipolar cells are specialized for Fock states (Figure 2.2.2E) under the ( Q(\alpha) ) representation. Interestingly, vertebrates and cephalopods (such as octopus, cuttlefish, and squid) have their retinal photoreceptors oriented in opposite directions (Baden and Nilsson, 2022), suggesting that the specialization of RGCs and bipolar cells for Fock states originated from an evolutionary ancient bifurcation.

Moreover, the wave function as the characteristic function of neurons provides insightful perspectives on LGN neuron functions as the Gabor transform (Equation 2.2.5), the retinal-LGN-visual cortex pathway function as Radon transform (Figure 2.2.2), and extrastriate visual area information processing as inverse Radon transform (Equation 2.3.1). In summary, we have learned the quantum mechanisms of time-dependent adaptation and the decomposition and reconstruction of spatial information from auditory and visual systems.

In the realm of quantum wave mechanics, particularly when examining the frequency domain—akin to Wigner's contemplation first published in 1960 (Wigner, 1995)—the impressively self-consistent information flow resonates with an emotional response: the unreasonable effectiveness of mathematics and physics in neuroscience.

2.4.3 Color Detection and Discrimination

All sensory modalities instantiate two fundamental functions: detection and discrimination (Kandel et al., 2021). Color, as a well-known subjective perception or brain creation, deserves further discussion regarding potential mechanisms for color detection and discrimination.

2.4.3.1 Color Detection

For color detection, cone cells can be further classified into three or four submodalities according to their wavelength sensitivity. Typically, the human eye contains three cone types for three primary colors: red, green, and blue, so the human eye uses the additive RGB mode for color detection, also known as trichromatic theory. Each cone type has maximal sensitivity to specific wavelengths: short-wavelength (S) cones are most sensitive to blue light around 420-440 nm; medium-wavelength (M) cones are most sensitive to green light around 530-540 nm; long-wavelength (L) cones are most sensitive to red light around 560-580 nm. Approximately 12% of women carry X chromosome-linked genes for anomalous trichromacy, which has potential links to tetrachromacy; however, true tetrachromacy with a functional fourth cone type is rarer, identified in about 1 out of 24 carriers of deuteranomaly with red-shifted mutant M cones (Jordan et al., 2010). Tetrachromacy is common among birds, fish, and reptiles with additional UV or violet-sensitive cones (Bowmaker, 2008).

[FIGURE:2.4.1] Color detection and discrimination. A. A modified diagram (Wikipedia) demonstrating the additive RGB mode. Red, green, and blue are the three primary colors. Cyan, magenta, yellow, and white are additive colors. B. A schematic for a color discrimination strategy with gradients. C. An adapted Kanizsa triangle illustrating the creative aspect of visual perception.

2.4.3.2 Color Discrimination and Creation

For trichromacy, red, green, and blue cones provide the human eye's capability to detect three primary colors. Complex neurons in V4 cortex may then use 3 bits (2³) to code eight colors, including black (no light stimuli) and four other additive colors: cyan, magenta, yellow, and white (Figure 2.4.1A). With gradients combined with three primary colors, human perception might discern approximately 16.7 million (2⁸)³ to 1 billion (2¹⁰)³ different colors, because the human eye can typically sense and discriminate around 8 to 10 bits of brightness gradients, varying based on lighting, contrast, and other factors. Therefore, most digital displays in practice, such as monitors and TVs, provide sufficient bit depth (8-bit or higher) to match human visual capabilities under standard conditions. Consequently, complex neurons in V4 cortex may typically require 24-30 bits (2²⁴⁻³⁰) to discriminate and create rich, nuanced colors varying in hue, saturation, and intensity (Figure 2.4.1B).
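
The color-counting arithmetic above can be checked with one short calculation; the bit depths are the ones cited in the text.

```python
# Quick check of the color-counting arithmetic: 3 bits give the eight primary/additive
# colors, and 8-10 bits of brightness gradient per primary give ~16.7 million to
# ~1.07 billion discriminable colors.
primary_combinations = 2 ** 3            # 8 colors from on/off coding of R, G, B
colors_8bit = (2 ** 8) ** 3              # 16,777,216
colors_10bit = (2 ** 10) ** 3            # 1,073,741,824
print(primary_combinations, colors_8bit, colors_10bit)
```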

Since visual perception is a creative process (Kandel et al., 2021), we can use an adapted Kanizsa triangle to demonstrate visual creation flexibility (Figure 2.4.1C). First, we perceive three sharp edges of a white triangle despite the absence of explicit strokes; second, beyond three small blue triangles, we largely sense one large blue triangle under the white triangle; third, we can notably identify three partially shielded purple circles with gradients converging toward the Kanizsa triangle's center.

2.4.3.2.1 The Bit-Wise Spatial Coding

Color information can be embedded using either temporal modulation or spatial coding. For temporal modulation, the adaptation mechanism discussed in Section 2.1.6 might serve as a color encoding scheme, but it is vulnerable to the 'magic seven' constraint with associated time costs. An octal color code with three or four repeats might be feasible for temporal color schemes. Compared to temporal modulation, bit-wise spatial coding offers time-saving advantages. Considering that rods outnumber cones twentyfold in the retina (Figure 2.2.2A), V4 complex neurons might prefer spatial coding schemes and have the potential capability to discriminate up to ~20 bits of brightness gradients. Moreover, spatial coding can leverage the ergodicity discussed in Section 1.5. Averaging ~2300 synapses per neuron in visual cortex (O'Kusky and Colonnier, 1982), the information accuracy for each bit (total 20×3+3=63) is ensured by averaging across ~36 synapses. Considering the ergodic nature, it becomes clear why each neuron has numerous synapses beyond their role in perception, as discussed in Section 2.3.3. Using spatial coding also suggests that synapse configurations and spatial positions must be defined and organized according to the encoding scheme, such as uniform spatiotemporal weighting.

Most importantly, with representation of spatial coding using equally weighted 63 bits (2³×(2²⁰)³), billions of colors combined with other rich features can have unique identities in binary-coded digits for color and feature discrimination. Consequently, the dimension of the Fisher information matrix ( \mathbf{F}_n ) in Equation 2.3.2 is likely around 63 for V4 complex neurons. Therefore, for visual information, one neuronal unit might represent a digital information unit consisting of ~63 bits, suggesting an upper limit for visual discrimination and associated memory, including color, text, faces, spatial data, and other visual elements. For intuitive comparison, in computing, 2⁶³ bits is 2⁶⁰ bytes, which equals 1 exbibyte (EiB) based on binary prefixes. Using decimal prefixes common in storage capacity, 2⁶⁰ bytes is roughly equivalent to 1.15 exabytes (EB), or about 1150 petabytes (PB). In 2022, the Library of Congress in the United States had 21 PB of digital collection content comprising 914 million unique files. Thus, 2⁶⁰ bytes (~1150 PB) is roughly 55 times larger than the 2022 digital collection of the Library of Congress.
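
The storage comparison can likewise be verified with a short calculation; the 21 PB figure for the Library of Congress's 2022 digital collection is taken from the text.

```python
# Verification of the capacity comparison for a ~63-bit neuronal information unit.
bits = 2 ** 63
n_bytes = bits // 8                       # 2**60 bytes = 1 EiB (binary prefix)
petabytes = n_bytes / 1e15                # decimal petabytes
print(n_bytes, round(petabytes, 1), round(petabytes / 21, 1))
# 1152921504606846976 bytes ~ 1152.9 PB ~ 55 x the 21 PB 2022 digital collection
```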

2.4.3.2.2 The Octal-Color Temporal Scheme

Time information is important for motion detection and discrimination by V5 complex neurons. An octal color code with 21 repeats (2³)²¹ might be optimal for full temporal color information within reasonable time costs due to frequency adaptation. Here we discuss two recent advancements by Cheng Wang's group (Cheng et al., 2024; Chen et al., 2024). The octal temporal color scheme likely participates in the egocentric (self-centered) response of the retrosplenial cortex (RSC, Brodmann areas 29 and 30), since visual inputs rather than vestibular inputs contribute to egocentric representation (Cheng et al., 2024). Egocentric representation of objects in self-centered space is involved in navigation (Cheng et al., 2024). Additionally, the octal temporal color scheme likely participates in integration and the trade-off relationship between space and time in the allocentric (world-centered) hippocampus (Figure 2.4.2) (Chen et al., 2024). The data and code for Figure 2.4.2 are shared in Chen's paper and deposited in Zenodo (Chen et al., 2024).

[FIGURE:2.4.2] Temporal rate map and correlation of two place fields in a mouse hippocampal conjunctive neuron (Chen et al., 2024). A. A range-distance-ordered lap-wise temporal rate map. Two colored lines indicate corresponding temporal rate maps of two place fields. B. The correlation between temporal rate maps of the two fields.

2.4.4 The Storage and Advanced Processing of Visual-Related Information

2.4.4.1 Visual Memory

Ungerleider and Mishkin first delineated the "dorsal" and "ventral" streams of visual information in 1982, using monkey lesion models to reveal these distinct pathways (Ingle et al., 1982; Goodale and Milner, 1992). The inferior temporal cortex (ITC) is the ventral stream's end stage via V4 cortex (Figure 2.4.3). The ITC contains or is associated with regions like the lateral occipital complex (LOC), fusiform face area (FFA), parahippocampal place area (PPA) (Goebel et al., 2012), and visual word form area (VWFA) (Dehaene and Cohen, 2011), serving as a crucial hub for high-level visual processing. The ITC encompasses specialized regions exhibiting lateralization, including FFA and VWFA. FFA is typically larger and more active in the right hemisphere of right-handed individuals, demonstrating high selectivity for facial processing and recognition (Tsao et al., 2006; Kanwisher and Yovel, 2006). Conversely, VWFA, associated with visual word form processing, tends to be dominant in the left hemisphere (Dehaene and Cohen, 2011). The extrastriate body area (EBA), part of the extrastriate visual cortex near the occipital-temporal border, is specialized for body part perception (Downing et al., 2001). The lateral occipital complex (LOC) processes object shape and is involved in non-face object perception. The parahippocampal place area (PPA) is involved in scene and place recognition (Goebel et al., 2012). Consequently, the ITC uses distributed representations, meaning different visual features and categories are represented across a neuronal network. This distributed coding enables robust recognition even when parts of visual stimuli are missing or occluded, and provides functional buffering against potential brain ischemia, edema, and head trauma.

[FIGURE:2.4.3] The storage and advanced processing of visual information. A. Cortical areas involved in storage and processing of visual intermediate-level and high-level information. FEF, frontal eye fields. The retrosplenial cortex (RSC) serves as a hub that translates between egocentric and allocentric information, essential for navigation, spatial memory, and orientation in varying contexts (Van Wijngaarden et al., 2020; Alexander et al., 2023). The extrastriate body area (EBA) near V5 responds to body parts (Downing et al., 2001). B. Storage and lateralization of information in the inferior temporal cortex (ITC) and associated areas. The ITC encompasses specialized lateralized regions including the fusiform face area (FFA) and visual word form area (VWFA). The lateral occipital complex (LOC) processes object shape and non-face objects. The parahippocampal place area (PPA) is involved in scene and place recognition (Goebel et al., 2012). Diagrams are adapted from Wikipedia entries on the two-streams hypothesis and FFA, respectively.

What principle governs the distribution of visual feature categories? Considering that smooth information flow in complex cells is managed by inverse Radon transform and Gabor transform for inputs and outputs, respectively, as discussed in Section 2.3.3, the distribution principle may be attributed to hierarchical levels and distinct identities of extrastriate visual areas projecting to V4. Visual feature categories imply certain invariances in the distribution principle, such as the dimension of the Fisher information weighted matrix in Equation 2.3.2. These Fisher matrices should have invariant dimensions and ordered hierarchy levels for each visual feature category.

More importantly, all these categorized and stored visual features provide rich prior probabilities and solid foundations for rate theory based on Bayesian algorithms (Rieke et al., 1999), as mentioned in Section 1.2. Therefore, neurons in the visual system may also use Bayesian inference for discrimination, learning, and prediction. Particularly, the EBA, located near V5 and specialized for body part perception (Downing et al., 2001), may play a crucial role in Bayesian inference for predicting body motion and assessing potential threats.

2.4.4.2 Egocentric and Allocentric Representations/Coordinates

The dorsal "where" stream to the posterior parietal cortex (PPC) serves spatial vision and visually guided actions (Goebel et al., 2012). The PPC is an egocentric cortex, particularly with an eye-centered representation/coordinate modulated by eye-, head-, body-, or limb-position signals (Cohen and Andersen, 2002). Egocentric frames (eye-, head-, body-, and limb-centered) are based on the observer's current position, while allocentric frames represent spatial layout from a fixed environmental point.

The RSC serves as a hub that translates between egocentric and allocentric information, essential for navigation, spatial memory, and orientation in varying contexts (Van Wijngaarden et al., 2020; Alexander et al., 2023). The RSC and posterior cingulate cortex (PCC, Brodmann areas 23 and 31) together form the main hub of the default mode network, which is active in the 'resting brain state' (Vann et al., 2009; Hemmings et al., 2019). By supporting transformation and integration of spatial information across reference frames, the RSC enables flexible and efficient navigation and spatial awareness. The RSC as an association cortex interconnects with the egocentric parietal cortex, medial prefrontal cortex (mPFC), visual areas, and allocentric hippocampus (Figure 2.4.3A). The allocentric hippocampus also integrates memory, spatial, and temporal information from the parahippocampal cortex via the entorhinal cortex (EC) (Eichenbaum, 2017), which includes the egocentric lateral (LEC) and allocentric medial (MEC) entorhinal cortex (Wang et al., 2018). The RSC, hippocampus, and precuneus (the median portion of PPC) are particularly susceptible to hypometabolism and atrophy in Alzheimer's disease (AD) (Mendez et al., 2002; Biran and Coslett, 2003; Ryu et al., 2010; Nestor et al., 2003; Chételat et al., 2016), accounting for typical AD symptoms such as disorientation and misidentification.

Moreover, the brain's navigation system employs specialized cells to map spatial information, facilitating orientation and movement. Place cells in the hippocampus are active at specific locations, constructing a cognitive map (O'Keefe et al., 1998). Grid cells in MEC generate a hexagonal lattice, aiding position and distance tracking across large areas (Hafting et al., 2005; Tukker et al., 2022). Boundary cells, found in the subiculum, MEC, parahippocampal cortex, and RSC, respond to environmental boundaries such as walls and personal spaces, and to spatial representations of fixed features (Barry et al., 2006; Lever et al., 2009). Collectively, this neuronal network ensures accurate spatial navigation and memory encoding.

The PPC as a common reference frame for movement plans requires intensive computational processes for transformations between reference frames (Cohen and Andersen, 2002). This information is important for frontal eye fields (FEF) in the prefrontal cortex (PFC) (Figure 2.4.3A), which plays a crucial role in controlling visual attention and eye movements (Vernet et al., 2014). We are not conscious of these computational frame-transformation processes, nor do we need to consciously control our eye movements most of the time.

Consequently, as mentioned at the beginning of Section 2.2, the superior colliculus-pulvinar pathway of the thalamus functions as an unconscious rapid process (Soares et al., 2017). Visual information from the retina is transmitted to the superior colliculus and further sent to the PPC and FEF via the pulvinar, which serves as a central relay for cortical areas (Kandel et al., 2021). The FEF controls eye movements by projecting back to the superior colliculus via the caudate nucleus and substantia nigra, with the superior colliculus further projecting to oculomotor nuclei (including the abducens nucleus) via the paramedian pontine reticular formation (Kandel et al., 2021).

2.4.4.3 The Capacity of Working Memory

As discussed in Section 2.4.3.2.2, time information is important for motion detection and perception, and the octal temporal color scheme may be optimal for egocentric and allocentric responses. Time-adapted information is likely transmitted to the PFC, which plays a critical role in working memory and decision-making (Figure 2.4.3A), particularly considering the 'magic seven' adaptation discussed in Section 2.1.

The "magic number seven, plus or minus two" is a classic concept illustrating working memory capacity limits—the cognitive system for temporarily holding and manipulating information (Miller, 1956). While Miller's original observation suggested a limit of approximately seven items, more recent research indicates capacity may be closer to three to four items for complex information (Fukuda et al., 2010; Cowan, 2010). Factors such as attention, chunking strategies, cognitive load, and information nature can all influence working memory's effective capacity.

The maximal number of asynchronous three- or four-item representations for working memory suggests that the dorsal "where" stream to the PFC might use a vigesimal (base-20) color scheme for time-critical complex tasks, since neuronal adaptation time costs increase dramatically after a few spikes for temporal encoding, as discussed in Section 2.1.5. Additionally, simulation (Figure 2.1.4) and analytic solution (Section 2.1.6) of neuronal adaptation imply that working memory's maximum capacity could be five items, still within Miller's proposed range. Furthermore, considering the minimum refractory period of ~0.0187 ms discussed in Section 2.1.7, it is more appropriate to use the characteristic time as the absolute refractory period when simulating maximal number representations for working memory (Figure 2.4.4).

[FIGURE:2.4.4] Simulation of maximal number representations for working memory. A. When ( \Delta t = 0.02 ) ms is used as characteristic time for simulation, three spikes are visible during the stimulation period (red duration time). B. There are five spikes for ( \Delta t = 0.0187 ) ms as the minimum refractory period. The working memory simulation script (osf.io/xkjvt) is available on the OSF server (Kang, 2025).

2.4.4.4 Bayes Inference

2.4.4.4.1 Bayesian Theorem and Conjugate Distributions

Bayes' theorem is a foundational probabilistic formula that quantifies how evidence updates beliefs about hypotheses (Ross, 2002):

[
P(H|E) = \frac{P(E|H)P(H)}{P(E)} \tag{2.4.1}
]

where ( P(H) ) represents prior belief in hypothesis ( H ); ( P(E|H) ) is the likelihood of observing data ( E ) given ( H ); and ( P(H|E) ) is the posterior probability reflecting updated beliefs. Bayesian inference extends this framework, using Bayes' theorem iteratively to update probabilistic models as new data are observed. It integrates prior knowledge with empirical evidence for decision-making under uncertainty.

A conjugate distribution refers to a prior distribution that, when combined with the likelihood function from an exponential family model (e.g., gamma, Gaussian), results in a posterior distribution of the same family (Diaconis and Ylvisaker, 1979). This property simplifies computation by ensuring closed-form solutions for the posterior (e.g., Beta-Binomial or Normal-Inverse-Gamma conjugacy).

Bayesian theorem and inference provide a coherent probabilistic framework for modeling uncertainty, integrating prior knowledge with observed data to iteratively update beliefs. This approach enables nuanced quantification of confidence in conclusions, supports sequential learning through dynamic posterior updates, and unifies hypothesis testing, parameter estimation, and prediction under a single paradigm. Conjugate distributions further enhance computational efficiency by ensuring closed-form posterior solutions when priors are chosen from the same family as the likelihood function (e.g., Normal-Inverse-Gamma). This allows streamlined inference without resorting to approximation methods such as Markov Chain Monte Carlo (MCMC), which are required when closed-form updates are unavailable.
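
A minimal sketch of conjugacy in action, using the Beta-Binomial pair mentioned above: the posterior stays in the prior's family and is obtained by simple parameter updates. The prior hyperparameters and the observed counts are arbitrary example numbers.

```python
# Minimal Beta-Binomial conjugate update: a Beta prior combined with binomial evidence
# yields a Beta posterior in closed form, with no need for MCMC. Numbers are arbitrary.
def beta_binomial_update(alpha, beta, successes, failures):
    return alpha + successes, beta + failures      # posterior stays in the Beta family

alpha, beta = 2.0, 2.0                             # prior belief about a success probability
alpha, beta = beta_binomial_update(alpha, beta, successes=7, failures=3)
posterior_mean = alpha / (alpha + beta)            # closed-form posterior mean
print(alpha, beta, round(posterior_mean, 3))       # 9.0 5.0 0.643
```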

2.4.4.4.2 The Gamma Distribution as Conjugate Prior for Visual Information

As discussed in Section 2.4.4.1, the rich prior probabilities provided by categorized and stored visual features in the ITC form a solid foundation for rate theory based on Bayesian algorithms. The precision or Fisher information of a normal distribution is inversely proportional to its variance. In Bayesian statistics, when modeling a Gaussian distribution with unknown mean and precision, a conjugate prior is often assigned to the precision, typically following a gamma distribution.

Utilizing properties and definitions of the diagonal Fisher information matrix, determinant, and quadratic form, we can explicitly reformulate Equation 2.3.2 as:

[
\psi_{Vx} = \left(\prod_k \frac{F_k}{\pi}\right)^{1/4} \exp\left(-\frac{1}{2} \mathbf{r}^T \mathbf{F} \mathbf{r}\right) \exp\left(2\pi i \sum_j f_j t\right) \tag{2.4.2}
]

where ( F_k ) denotes the k-th Fisher information component, and ( \mathbf{r} ) represents spatial coordinates in the tensor basis; ( f_j ) corresponds to temporal frequencies.

Then:

[
|\psi_{Vx}|^2 = \prod_k \sqrt{\frac{F_k}{\pi}} \exp(-F_k r_k^2) \tag{2.4.3}
]

The probability density function (PDF) of a gamma-distributed random variable with shape parameter ( \alpha ) and rate parameter ( \lambda ) is:

[
\text{PDF}(x; \alpha, \lambda) = \frac{x^{\alpha-1} e^{-\lambda x} \lambda^\alpha}{\Gamma(\alpha)} \tag{2.4.4}
]

The cumulative distribution function (CDF) is:

[
\text{CDF}(x; \alpha, \lambda) = \int_0^x \frac{t^{\alpha-1} e^{-\lambda t} \lambda^\alpha}{\Gamma(\alpha)} dt = \frac{\gamma(\alpha, \lambda x)}{\Gamma(\alpha)} \tag{2.4.5}
]

where ( \gamma(\cdot) ) is the lower incomplete gamma function. The gamma-distributed random variable ( F \sim \text{Gamma}(\alpha, \lambda) ) has closed-form expressions for both PDF and CDF, with mean ( E[F] = \alpha/\lambda ) and variance ( \text{Var}(F) = \alpha/\lambda^2 ). The gamma distribution serves as the conjugate prior for the precision ( F ) (inverse variance) of a Gaussian distribution with known mean, as well as for Poisson, exponential, and gamma distributions with known shape parameter.

Given the ITC's role as the final ventral stream stage via V4 cortex (Figure 2.4.3) and its connectivity with specialized visual regions (LOC, FFA, PPA, VWFA), this implies that precision parameters/Fisher information in these areas follow gamma distributions (Equation 2.4.3). Consequently, stored visual features in ITC maintain gamma-distributed prior probabilities, facilitating efficient Bayesian inference. This further supports that neuronal information can be conveniently and naturally defined and quantified as Fisher information (Section 1.7).

2.4.4.4.3 Hierarchical Normal-Gamma Conjugacy and Posterior Probability

Comparing Equations 2.4.3 and 2.4.4, we observe that ( |\psi(F_k)|^2 ) follows a ( \text{Gamma}(3/2, r_k^2) ) distribution. Simultaneously, ( |\psi(r_k)|^2 ) follows a Gaussian distribution ( N(0, 1/(2F_k)) ). The quadratic form ( \mathbf{r}^T \mathbf{F}_n \mathbf{r} ) (Sections 2.2 and 2.3) follows a chi-squared distribution ( \chi^2(n) ) with ( n ) degrees of freedom. This establishes Equation 2.4.3 as a hierarchical conjugate model with Gaussian likelihood for ( r_k ) and gamma prior over precision ( F_k ). Equation 2.4.3 provides the mathematical foundation for Bayesian causal inference implemented by complex cells in LOC/FFA/PPA/VWFA and associated regions (EBA, V5, RSC, FEF; Sections 2.4.4.1 and 2.4.4.2). The Normal-Gamma conjugacy enables optimal integration of multisensory evidence under uncertainty through closed-form posterior updates.

The Gamma distribution ( \text{Gamma}(3/2, \beta) ) provides critical advantages for automated inference systems:

  1. Stable closed-form Bayesian updates: Its conjugacy with Gaussian likelihood ensures analytical computation of posterior distributions for spatial coordinates ( r_k ) and Fisher information ( F_k ), eliminating numerical instabilities in non-conjugate models. This ensures reliable inference even under noisy or sparse data.

  2. Outlier robustness via heavy tails: Compared to Gaussian distributions, the Gamma prior's heavy tails naturally suppress outliers in precision estimates (( F_k )), preventing overfitting while preserving high-confidence features essential for visual recognition.

  3. Linear-time scalability: Posterior computation scales with input dimension as ( O(n) ), enabling real-time deployment on edge devices (e.g., ITC-linked neural circuits for object recognition).

  4. Self-calibration through ( \beta ) adaptation: The scale parameter ( \beta ) dynamically updates through sensory evidence (e.g., ( \sum r_k^2 )), enabling autonomous precision updates without retraining—a process functionally analogous to plasticity in visual areas like FFA.

For 3D spatial data, ( \text{Gamma}(3/2, \beta = 1/2) ) is optimal. ( \text{Gamma}(3/2, 1/2) ) corresponds exactly to a chi-squared distribution (( \chi^2(3) )), where three degrees of freedom inherently encode 3D coordinates (x, y, z). The choice ( \beta = 1/2 ) represents minimal uncertainty in the quadratic form ( \mathbf{r}^T \mathbf{F} \mathbf{r} ) (Section 2.2.2).

Posterior Derivation of Normal-Gamma Conjugacy:

Likelihood for ( \mathbf{r}_k ):
[
p(\mathbf{r}_k | F_k) = \sqrt{\frac{F_k}{\pi}} e^{-F_k \mathbf{r}_k^2} \Rightarrow \text{Var}(\mathbf{r}_k) = \frac{1}{2F_k}
]

Conjugate prior for ( F_k ):
[
p(F_k) \sim \text{Gamma}(\alpha, \beta) = \frac{F_k^{\alpha-1} e^{-\beta F_k} \beta^\alpha}{\Gamma(\alpha)}
]

Posterior probability given observation ( \mathbf{r}_k ):
[
p(F_k | \mathbf{r}_k) \propto p(\mathbf{r}_k | F_k) p(F_k) = F_k^{1/2} e^{-F_k \mathbf{r}_k^2} F_k^{\alpha-1} e^{-\beta F_k} = F_k^{\alpha-1/2} e^{-(\beta + \mathbf{r}_k^2)F_k}
]

Thus:
[
p(F_k | \mathbf{r}_k) \sim \text{Gamma}\left(\alpha + \frac{1}{2}, \beta + \mathbf{r}_k^2\right) \tag{2.4.9}
]

Note: No explicit ( 1/2 ) appears in the updated ( \beta ) since it is embedded in ( F_k )'s role as half precision (Section 2.2.2). The factor ( 1/2 ) in ( \text{Var}(\mathbf{r}_k) = 1/(2F_k) ) is intrinsic to the quantum-classical interface, harmonizing visual wavefunctions with Bayesian conjugacy.

Consequently, the posterior probability of hierarchical Normal-Gamma conjugacy with ( n ) observations (Equation 2.4.3) is:

[
p(F_k | {\mathbf{r}_k}) \sim \text{Gamma}\left(\alpha + \frac{n}{2}, \beta + \sum \mathbf{r}_k^2\right) \tag{2.4.10}
]

The mean precision with hyperparameters has the expression:

[
E[F_k | {\mathbf{r}_k}] = \frac{\alpha + n/2}{\beta + \sum \mathbf{r}_k^2} \tag{2.4.11}
]

The variance of precision is:

[
\text{Var}[F_k | {\mathbf{r}_k}] = \frac{\alpha + n/2}{(\beta + \sum \mathbf{r}_k^2)^2} \tag{2.4.12}
]

Immediately, we observe key insights about precision updates. First, the posterior mean is context dependent: ( E[F_k | {\mathbf{r}_k}] ) remains equal to, rises above, or falls below the prior mean ( \alpha/\beta ) according to whether ( \sum \mathbf{r}_k^2 ) equals, falls below, or exceeds the critical value ( n\beta/(2\alpha) ); the relationship between evidence and precision is therefore not monotonic. Second, quantum uncertainty principles impose a constraint on the denominator ( \beta + \sum \mathbf{r}_k^2 ), which must equal or exceed unity (Section 2.2.2). This ensures that posterior variance decreases with updates to ( \sum \mathbf{r}_k^2 ) if ( \Delta \mathbf{r}_k^2 ) remains constant during persistent exercise or learning, reducing uncertainty in the precision/Fisher information ( F_k ). Conversely, when ( \text{Var}[F_k | {\mathbf{r}_k}] ) exceeds ( \text{Var}[F_k^{\text{prior}} | {\mathbf{r}_k^{\text{prior}}}] ), posterior uncertainty has grown, representing a period of plasticity or of decline in learning. Equation 2.4.12 further substantiates the existence of an extremal value (6) during initial stages via extreme-value analysis, reflecting inherent bounds on precision quantification in underdetermined systems.
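
The updates in Equations 2.4.10-2.4.12, and the critical-value behavior of the posterior mean noted above, can be sketched in a few lines; the prior hyperparameters and the observations ( \mathbf{r}_k ) are illustrative numbers.

```python
# Sketch of the hierarchical Normal-Gamma posterior updates (Eqs. 2.4.10-2.4.12).
# Prior hyperparameters and observations are illustrative, not fitted values.
import numpy as np

def normal_gamma_posterior(alpha, beta, r):
    r = np.asarray(r, dtype=float)
    n = r.size
    alpha_post = alpha + n / 2.0                   # Eq. 2.4.10 (shape update)
    beta_post = beta + np.sum(r ** 2)              # Eq. 2.4.10 (rate update)
    mean_F = alpha_post / beta_post                # Eq. 2.4.11 (posterior mean precision)
    var_F = alpha_post / beta_post ** 2            # Eq. 2.4.12 (posterior variance)
    return alpha_post, beta_post, mean_F, var_F

alpha, beta = 1.5, 0.5                             # Gamma(3/2, 1/2), i.e., chi-squared(3)
obs = [0.3, -0.4, 0.2, 0.1, -0.2]                  # example observations r_k
a_p, b_p, mean_F, var_F = normal_gamma_posterior(alpha, beta, obs)
print(a_p, round(b_p, 2), round(mean_F, 3), round(var_F, 3))
# sum(r_k^2) = 0.34 < n*beta/(2*alpha) = 0.83, so the posterior mean (~4.76) exceeds
# the prior mean alpha/beta = 3.0, consistent with the critical value discussed above.
```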

Using functional magnetic resonance imaging (fMRI), Nomi et al. reported that spontaneous Blood Oxygenation Level Dependent (BOLD) signal variability decreases linearly with age in visual cortex but increases in ventral temporal cortex (Nomi et al., 2017). This opposing variability trajectory is consistently explained as above, reflecting age-related increases in weighting of prior expectations during Bayesian inference.

Crucially, the hierarchical Normal-Gamma conjugate structure remains self-consistent under these dynamics. The universal learning signal arises from this variance reduction mechanism, which continuously refines probabilistic estimates by prioritizing precision gains across all plausible parameter configurations. Additionally, the hierarchical Gamma prior structure is ideal for processing multisensory information, such as auditory signals (Equation 2.1.28), as the gamma distribution serves as conjugate prior for both Poisson and exponential distributions. Empirical evidence supports the role of hierarchical Bayesian causal inference in integrating visual and auditory stimuli (Rohe et al., 2019).

2.4.4.4.4 Cognitive Implications of Normal-Gamma Dynamics

The posterior distribution of hierarchical Normal-Gamma conjugacy has dual properties: context-dependent accuracy quantified by mean precision (( E[F_k] )); and confidence ( I^2 ) defined as:

[
I^2 = \frac{1}{\text{Var}[F_k | {\mathbf{r}_k}]} = \frac{(\beta + \sum \mathbf{r}_k^2)^2}{\alpha + n/2} \tag{2.4.13}
]

Confidence (Equation 2.4.13) can well explain BOLD variability increasing linearly in ventral temporal cortex for Bayesian inference. Moreover, accuracy and confidence provide a unified quantum-Bayesian framework for cognitive phenomena:

  1. Photographic Memory: At minimal stimulus (( \sum \mathbf{r}_k^2 \to 1/2 )), ( \text{Var}[F_k] ) is saturated and remains at its prior level (≤ 6), permitting low-fidelity, prior-driven recall.

  2. Cognitive Rigidity: With aging (( n \gg 1 )), ( \text{Var}[F_k] \sim O(1/n) ) asymptotically vanishes, cementing prior beliefs and confidences. While the algorithmic process favors rigidity, metacognition and high-impact evidence can still override default updating rules. Therefore, personality traits (e.g., stubbornness in aging) naturally emerge from individual differences in cognitive algorithms governing belief updating.

  3. Survivor Bias: Extreme events (large ( \sum \mathbf{r}_k^2 )) reduce accuracy but trigger a confidence explosion (( I^2 \to \infty )) that overpowers the accuracy reduction, causing pathological overweighting of rare outcomes.

Critically, the quantum constraint ( \beta + \sum \mathbf{r}_k^2 \geq 1 ) enforces energy optimization: reducing confidence uncertainty (decreasing ( \text{Var}[F_k] )) optimizes decision energy (( \propto 1/\text{Var}[F_k] )) (Section 1.8.3). This mechanism demonstrates that Bayesian confidence always strengthens with evidence during persistent exercise or learning, while accuracy adaptively responds to prediction errors, an evolutionary design optimized for metabolic efficiency.

2.4.4.4.5 Hierarchical Conjugacy Enables Human-Like Efficient Learning

The ventral visual stream encodes hierarchical object representations through specialized cortical regions including LOC, FFA, PPA, and VWFA. These areas serve as terminal nodes for hierarchical visual information binding, integrating visual features with semantic knowledge (Section 2.4.4.1). Specifically, FFA specializes in facial features, PPA prioritizes spatial configurations, LOC handles generic object forms, and VWFA decodes orthographic symbols. This processing mechanism operates within a shared hierarchical Bayesian framework, wherein prior knowledge dynamically refines posterior beliefs.

To computationally model this process, we propose a minimum five-level generative architecture grounded in Bayesian hierarchical models with Normal-Gamma conjugacy. The layers correspond to distinct processing stages: Levels 1-2 involve low-level feature extraction (edges → textures); Levels 3-4 perform mid-level pattern recognition (object parts → categories); and Level 5 integrates high-level visual-semantic representations in the specified regions. Each inference step is discretized into ( n = 5 ) hierarchical levels, reflecting distributed cortical computation across areas. This formulation captures propagation of gamma-distributed prior precision and normal likelihood distributions through the hierarchy (Figure 2.4.5).
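
A schematic sketch of the five-level propagation is given below. It is not the author's validated Python implementation: the per-level evidence, the ( \text{Gamma}(3/2, 1/2) ) prior, and the way each level passes a precision-weighted summary upward are assumptions chosen only to illustrate how gamma-distributed precision propagates through the hierarchy.

```python
# Schematic five-level hierarchy: each level performs a Normal-Gamma update on its
# evidence and passes a coarser, precision-weighted summary to the next level.
# All choices here (evidence, prior, upward pass) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 1.5, 0.5                            # Gamma(3/2, 1/2) prior over precision

evidence = rng.normal(0.0, 0.5, size=32)          # level-1 features (e.g., edges)
for level in range(1, 6):                         # levels 1-5: edges -> ... -> semantics
    alpha += evidence.size / 2.0                  # Eq. 2.4.10 shape update
    beta += np.sum(evidence ** 2)                 # Eq. 2.4.10 rate update
    mean_precision = alpha / beta
    # upward pass: fewer, precision-weighted summary features for the next level
    evidence = evidence.reshape(-1, 2).mean(axis=1) / np.sqrt(mean_precision)
    print(level, round(mean_precision, 3))
```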

2.4.4.4.5.1 Confidence Gain and Mastery Level

The model was implemented in Python and validated on visual object recognition tasks. Figure 2.4.5 illustrates the architecture of a hierarchical Bayesian framework with Normal-Gamma conjugacy, demonstrating human-like learning efficiency. Specifically, the model achieves robust knowledge acquisition from minimal experience (2-3 examples) by tracking two meta-metrics derived from Fisher information:

Confidence Gain (CG) is defined as normalized confidence improvement during training relative to initial confidence ( I_0 ):

[
CG = \frac{I^2 - I_0^2}{I_0^2} \tag{2.4.14}
]

Mastery Level (ML) reflects the proportion of acquired knowledge concerning precision or likelihood parameters, formulated to quantify learning progress:

[
ML = \frac{E[F_k] - E[F_k]_{\text{prior}}}{E[F_k]} \tag{2.4.15}
]

Confidence, CG, and ML serve as indicators of the system's confidence in its parameter estimates, bridging computational neuroscience models with empirical observations. By operationalizing abstract concepts like "knowledge mastery" into concrete metrics, the model provides a formal framework for analyzing cognitive efficiency.

This approach not only explains human-like learning dynamics observed in biological systems but also offers a foundation for evaluating artificial intelligence paradigms under information constraints, exemplified by specialized brain regions like VWFA.

2.4.4.4.5.2 Achievement Emotion and Frustration Feeling

Figure 2.4.5D also depicts temporal dynamics of achievement emotions throughout distinct learning stages. In addition to achievement emotions, learners frequently experience frustration during study sessions (Figure 2.4.6). To systematically quantify these affective phenomena, we formally introduce two complementary meta-metrics: Achievement Emotion (AE) and Frustration Feeling (FF).

[
AE = \log_2\left(1 + \frac{I^2 - I_0^2}{I_0^2} \cdot \frac{ML}{1 - ML}\right), \quad \text{when } I^2 \geq I_0^2 \tag{2.4.16}
]

[
FF = -\log_2\left(1 + \left|\frac{I^2 - I_0^2}{I_0^2} \cdot \frac{ML}{1 - ML}\right|\right), \quad \text{when } I^2 \leq I_0^2 \tag{2.4.17}
]

Here, AE is constrained to the closed interval [0, 1], indicating positive shifts in learning progress, whereas FF falls within [-1, 0], representing negative setbacks. For visual clarity, these metrics are represented in percentage formats within Figure 2.4.6.

AE and FF serve as quantitative indicators of emotional responses within the computational model. This formalization bridges cognitive science and affective computing by linking learning dynamics to measurable emotional states.
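
For concreteness, the confidence-based metrics can be computed directly from Equations 2.4.14, 2.4.16, and 2.4.17. In the sketch below the confidence values ( I_0^2 ) and ( I^2 ) and the mastery level ML are supplied as example numbers rather than derived from a fitted hierarchical model.

```python
# Sketch of the meta-metrics: confidence gain (Eq. 2.4.14), achievement emotion
# (Eq. 2.4.16), and frustration feeling (Eq. 2.4.17). Inputs are example numbers.
import math

def confidence_gain(I2, I2_0):
    return (I2 - I2_0) / I2_0                                   # Eq. 2.4.14

def achievement_emotion(I2, I2_0, ML):                          # Eq. 2.4.16, for I2 >= I2_0
    return math.log2(1.0 + confidence_gain(I2, I2_0) * ML / (1.0 - ML))

def frustration_feeling(I2, I2_0, ML):                          # Eq. 2.4.17, for I2 <= I2_0
    return -math.log2(1.0 + abs(confidence_gain(I2, I2_0) * ML / (1.0 - ML)))

print(round(achievement_emotion(I2=2.4, I2_0=2.0, ML=0.6), 3))  # 0.379, positive progress
print(round(frustration_feeling(I2=1.6, I2_0=2.0, ML=0.6), 3))  # -0.379, negative setback
```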

[FIGURE:2.4.6] Achievement emotion and frustration feeling. Mastery levels, achievement emotions (≥0), and frustration feelings (≤0) at different learning stages are shown in percentage formats.

2.4.4.5 Meta-Metrics and Qualia

The proposed framework integrates quantifiable meta-metrics (confidence, confidence gain (CG), mastery level (ML), achievement emotion (AE), and frustration feeling (FF)) to address challenges in neuroscience, artificial general intelligence (AGI), and consciousness studies. The framework defines these metrics as meta-information quantifying distinct visual signal properties derived from precision/Fisher information variances. Each metric structurally encodes a formalizable aspect of subjective experience, enabling scientific investigation of previously intractable phenomena. These metrics are not emergent properties but rather quantifiable components that can be measured and predicted. Unlike emergent properties, which arise only when components interact at a critical scale (e.g., neural network complexity of V4) and are irreducible and unpredictable from individual parts alone (e.g., saltiness of NaCl and wetness of H₂O), these meta-metrics provide closed-form representations of subjective experiences. For example, CG captures temporal changes in confidence, ML reflects prior knowledge saturation in perception, AE encodes positive momentum from skill acquisition, and FF measures negative affect during learning setbacks.

These metrics serve as quantitative proxies for qualia—fundamental units of phenomenal consciousness—bridging Bayesian inference with subjective experience without relying on the irreducible nature of emergent properties, which are elaborated in Section 3. They structurally encode indivisible perceptual units to address issues like Leibniz's monadic perception problem, map cognitive processes to quantifiable states, and resolve the hard problem of consciousness proposed by Chalmers. Mechanistically unifying mental phenomena with neuro-computational variables, they reconcile Kantian dualism between noumenal/phenomenal realms or mind-brain separation.

This framework demonstrates how subjective experiences—certainty (CG), skill acquisition (ML), positive affective responses (AE), and negative emotional states (FF)—can be systematically quantified within a mathematical model. By formalizing the relationship between neural activity and subjective experience, Bayesian hierarchical Normal-Gamma conjugate inference and its meta-metrics provide principles that advance AGI development through mechanistic, computationally grounded metrics for human-like awareness.

2.4.5 Time perception and the fundamental cosmic architecture

In Section 2.4.4.4.3, the hierarchical Normal-Gamma conjugacy follows a Gamma(3/2, r') distribution. For visual signals, r⁴ can be expressed as (cΔt)⁴ in terms of time or simplified to Δt⁴ under natural units where c = 1. This allows the Gamma(3/2, Δt⁴) distribution to serve as a conjugate prior for modeling temporal events like expert and novice reaction times.

2.4.5.1 Hierarchical Bayesian inference explains professional reaction speed

Figure 2.4.7 illustrates how Bayesian inference accounts for professional reaction speed, with the critical threshold referring to a variance threshold that serves as a screening criterion. The Gamma(3/2, Δt⁴) model explains professional reaction speed through cumulative Bayesian updates that convert training repetitions into precision gains, hyperparameter optimization refining neural efficiency, uncertainty minimization achieving automaticity when variance (σ²) falls below a critical threshold, and predictive processing leveraging priors for pre-stimulus readiness. This framework provides the first computational account of why 10,000 hours of training produce exponential rather than linear improvements, why experts outperform novices by 200-300 ms despite identical neural transmission limits, and how "game sense" arises from statistically optimized Bayesian inference. The model reveals expertise as a biological implementation of efficient Bayesian algorithms, with Gamma(3/2, Δt⁴) serving as the mathematical foundation for human mastery.

σ² emerges as a robust metric to simulate skill state assessment, reflecting stability in performance across trials and tracking progress toward professional proficiency. By monitoring σ² relative to a critical threshold, this framework enables objective evaluation of training outcomes, distinguishing novices from experts through their capacity to minimize uncertainty and maintain consistent reaction times under varying conditions. Empirically, this value aligns closely with real-world data, where professional performance stabilizes around σ² ≈ 0.035, validating its utility as a reliable indicator of skill maturity in both simulated and natural environments.
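To make the training dynamics tangible, the sketch below applies conjugate Gamma updates to reaction-time precision with a known mean, using shape 3/2 as in the Gamma(3/2, Δt⁴) family; the exponential practice law, its parameters, the prior rate, and the reaction-time scale are hypothetical choices for illustration rather than the fitted model of this work.

```python
import math
import random

def trials_to_expertise(mu=0.25, sigma_start=0.40, sigma_floor=0.15, decay=3e-4,
                        shape=1.5, rate=1.0, threshold=0.035,
                        max_trials=200_000, seed=0):
    """Conjugate Gamma updating of reaction-time precision with a known mean mu.

    Each trial draws a reaction time whose true variability shrinks with practice
    (a hypothetical exponential law) and updates the Gamma posterior over precision tau:
        shape += 1/2,  rate += (x - mu)**2 / 2.
    Returns the trial at which the variance estimate rate/shape first drops below
    the critical threshold sigma^2 ~ 0.035 quoted above, or None if it never does.
    """
    rng = random.Random(seed)
    for trial in range(1, max_trials + 1):
        sigma = sigma_floor + (sigma_start - sigma_floor) * math.exp(-decay * trial)
        x = rng.gauss(mu, sigma)              # simulated reaction time (seconds)
        shape += 0.5
        rate += 0.5 * (x - mu) ** 2
        if rate / shape < threshold:
            return trial
    return None

print(trials_to_expertise())   # on the order of tens of thousands of trials for these settings
```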

Age emerges as a critical determinant in skill acquisition, with Bayesian inference simulations demonstrating that older individuals require significantly more training volume and intensity to reach expert levels. As prior rigidity increases with age as discussed in Section 2.4.4.4.4, the learning curve flattens, necessitating extended exposure and enhanced precision gains to achieve the same variance threshold. Simulations reveal that adult learners must compensate for reduced adaptability through intensified practice regimens to match the efficiency of younger counterparts. This age-related disparity underscores why peak performance in high-stakes domains like esports remains concentrated among individuals who begin training during adolescence.

The 10,000-hour rule in esports reflects a realistic timeline requiring 3-5 years of sustained effort (10-12 hours daily) to achieve professional-level proficiency, as excessive training beyond this threshold risks burnout and reduced efficiency. Deliberate practice outweighs mindless repetition, while non-negotiable factors like youth and raw ability determine whether 10,000 hours yield elite status or merely "amateur excellence." While the 10,000-hour benchmark serves as a guideline, true mastery demands talent, optimized training, and early development, with progress validated by measurable thresholds in high-stakes scenarios.

2.4.5.2 Time perception is an enforced imprint of causal asymmetry

Time perception is an enforced imprint of causal asymmetry, as the mathematical structure of Bayesian inference inherently encodes temporal directionality through its dependence on positive time intervals Δt, which are always greater than zero in this formulation. By reformulating Equation 2.4.2 as:

ψ_Vx = ∏_k (F_k/π)^(1/4) exp(−½ Σ_k F_k Δt_k²) exp(2πi Σ_j f_j (t + Δt)) (2.4.18)

Equation 2.4.18 reveals that time is encoded in an imaginary format. Bayesian inference mechanisms inherently distinguish between past and future through their reliance on positive temporal increments, which aligns with the observed one-directionality of cause-effect relationships. This mathematical property suggests that our perception of time as flowing in a single direction is not merely an abstract concept but a fundamental constraint imposed by the way information is processed probabilistically in systems that track sequential events. The requirement for Δt to remain positive reflects the asymmetry between prior and posterior probabilities in Bayesian updating, where new evidence always builds upon previous states, creating an irreversible temporal structure embedded in the formalism itself.

In addition, Equation 2.4.18 demonstrates that Δt arises as an intrinsic accompanying perceptual component in spatial Bayesian inference reflecting the necessity of temporal differentiation between sequential states in dynamic environments where causal relationships must be dynamically updated through probabilistic reasoning. This necessity reflects how perception of time is not an external construct but a fundamental feature of probabilistic reasoning systems that operate within evolving spatial contexts, where each new observation must be anchored to prior states through quantifiable temporal increments to maintain coherence in predictive models.

2.4.5.3 Imaginary time is a reality without reality

In the Schrödinger picture of quantum mechanics, the time evolution operator for a quantum state is typically expressed as:

U(t) = e^(-iHt/ℏ) = e^(-iH̃) (2.4.19)

where H is the Hamiltonian and ℏ is the reduced Planck constant; H̃ denotes Ht/ℏ for simplicity. Its Taylor expansion around t = 0 is given by:

U(t) = Σ_n (−iH̃)ⁿ/n! = I − iH̃ + (−i)² H̃²/2! + O(H̃³) = I − iH̃ − H̃²/2 + O(H̃³) = I − i(H̃ − iH̃²/2) + O(H̃³) (2.4.20)

where O(H̃³) represents the infinitesimal higher-order terms; I is the identity operator, and the higher-order terms involve higher powers of the Hamiltonian. If the Hamiltonian H̃ arises from symmetry-based interactions (e.g., local gauge transformations), the term in parentheses in the last expression resembles a gauge potential in specific contexts. The Taylor expansion reflects the unitary nature of time evolution in quantum mechanics, ensuring conservation of probability over time. The series converges for all finite t when H̃ is bounded, and practical implementations often truncate it at a finite order for approximations.
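A minimal numerical check of this truncation, using an arbitrary 2×2 Hermitian H̃ chosen purely for illustration, shows both the third-order truncation error and the small departure from unitarity it introduces.

```python
import numpy as np

# Arbitrary small Hermitian H-tilde (dimensionless Ht/hbar); values chosen only for illustration.
H = 0.05 * np.array([[1.0, 0.3],
                     [0.3, -0.5]])

# Exact unitary via eigendecomposition: U = V exp(-i diag(w)) V^dagger
w, V = np.linalg.eigh(H)
U_exact = V @ np.diag(np.exp(-1j * w)) @ V.conj().T

# Second-order truncation of Equation 2.4.20: I - iH - H^2/2
U_trunc = np.eye(2) - 1j * H - (H @ H) / 2

print(np.abs(U_exact - U_trunc).max())                        # truncation error, O(|H|^3) ~ 1e-5
print(np.abs(U_trunc.conj().T @ U_trunc - np.eye(2)).max())   # unitarity violated only at O(|H|^4)
```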

For the Hamiltonian defined in Equation 2.1.28 for neuronal action potential, H̃ is proportional to Δt√t. This relationship arises from the Brownian diffusive dynamics of creatine dipoles, as formalized in Equation 2.1.22. Consequently, truncating the series in Equation 2.4.20 at second order reveals real-time fluctuations akin to a Wiener process, where variance scales linearly with time t, mirroring Brownian motion's diffusive behavior. This connection underscores how imaginary time encodes structural information about temporal evolution, even as it remains abstract in its mathematical form.

In this context, imaginary time operates as a dual reality, mathematically essential yet empirically indirect. While real-time fluctuations (e.g., Brownian motion) manifest through classical stochastic processes, their underlying structure is mirrored in the quantum realm via imaginary-time formalisms. The interplay between these frameworks reveals that time's perceived directionality may arise from emergent properties of complex systems rather than being an inherent feature of spacetime itself.

Thus, imaginary time remains a powerful tool for probing fundamental symmetries (Wick rotation in quantum field theory and the standard model) (Schwartz, 2014) and thermodynamic behaviors (finite-temperature or thermal field theories) (Mustafa, 2023), even as it retains its enigmatic status as "a reality without reality." Importantly, the Kubo-Martin-Schwinger (KMS) condition ⟨A(t)B⟩ = ⟨BA(t + iβ)⟩ arises in the context of thermal equilibrium states in quantum statistical mechanics and characterizes the periodicity of correlation functions under imaginary time evolution (Mustafa, 2023). We have demonstrated that neurons employ quantum topological protection of quantum neuronal states through creatine dipoles (Section 2.1.6.6) under imaginary time evolution (Section 2.1.6), and given their thermally equilibrated state with ambient thermal conditions at body temperature, they inherently satisfy the KMS condition. Consequently, imaginary time naturally emerges in neural information coding, accompanying Bayesian inference as a framework through which the human brain processes probabilistic uncertainties and contextual dependencies. In the context of time perception, imaginary time functions as a mathematical abstraction that encodes structural properties of temporal evolution, existing as a 'reality without reality' in both theoretical and empirical frameworks.

2.4.5.4 The fundamental cosmic architecture

While imaginary time is conventionally treated as a mathematical tool in quantum field theory and statistical mechanics (e.g., via Wick rotation), its role in revealing fundamental symmetries suggests a deeper connection to spacetime geometry. Equation 2.4.18 highlights a tension: though imaginary time lacks direct empirical correspondence to macroscopic time perception, its mathematical structure may encode essential features of spacetime architecture, implying that causality and temporal directionality could emerge from geometric or entropy-related principles rather than being primitive attributes of reality.

2.4.5.4.1 Hamiltonian mechanics, symplectic geometry, quaternions, and emergent spacetime dimension

Hamiltonian mechanics naturally operates within a symplectic geometric framework, where the phase space is equipped with a closed, non-degenerate 2-form (Feng and Qin, 2010). The Hamiltonian admits a quaternionic representation due to the isomorphism between quaternion algebra and rotational symmetries. By extending the quaternion-based Hamiltonian framework through imaginary time, the mathematical description of time evolution transitions into a hyperkähler manifold, which is a special geometric space governed by three hyper-complex structures that obey fundamental quaternion relationships. Temporal dynamics arise through spontaneous symmetry breaking, as exemplified by the Taylor expansion's role in encoding deviations from idealized symmetry, rather than emerging from fundamental symmetries themselves or existing as an independent dimension, as discussed in Section 2.4.5.3. Therefore, the effective spacetime dimension is 11 (2 × 4 + 3), where the four quaternionic dimensions split into real and imaginary components. This framework establishes a necessary and sufficient geometric condition for universal energy conservation and quantum coherence. The 11-dimensional structure is central to supersymmetric theories in physics (Nishino, 2000).

2.4.5.4.2 Supersymmetric space and the cosmological constant problem

In an 11-dimensional supersymmetric space, the vacuum energy density (cosmological constant) is inherently suppressed due to the cancellation of bosonic and fermionic contributions under unbroken supersymmetry. However, observed cosmic acceleration implies a tiny but non-zero cosmological constant. This discrepancy is resolved by recognizing that supersymmetry breaking corresponds to time symmetry breaking, as the emergence of real time (from imaginary) introduces an asymmetry in energy distributions. The low dark energy density arises naturally from this transition, while experimental evidence for supersymmetry breaking, such as the absence of superpartner particles, reflects the non-trivial topology of spacetime at high energies. By coupling this to the 11-dimensional structure, both the cosmological constant problem and the hierarchy between gravitational and electromagnetic forces are addressed simultaneously. Therefore, resolving these issues just requires moving beyond conventional frameworks and redefining our understanding of time.

2.4.5.4.3 A quantum geometric origin of Newton's gravitational constant

In the 11-dimensional supersymmetric framework, spacetime naturally decomposes into a 4D observable sector (our universe) and a 7D compactified dual space, governed by geometric duality. Within this framework, the 3-form C₃ of 11D supergravity generates the field strength F₄ = dC₃, and its Hodge dual G₇ = ⋆F₄ encodes gravitational dynamics in the compactified space. This reflects the fundamental 4-form ↔ 7-form duality since 4 + 7 = 11.

The photon 2-form field strength F₂ = dA resides strictly in the 4D observable subspace. The massless photon propagates along geodesics in an SO(3)-affine manifold, where the local symmetry constrains its dynamics. Through Hodge conjugation in 11D, the photon-graviton duality arises naturally: the 2-form F₂ has a Hodge dual ⋆F₂, a 9-form in 11D, encoding gravitational degrees of freedom in a complementary 9D subspace orthogonal to the 2D gauge orbits of the photon (11 - 2 = 9). The duality is governed by:

dF₂ = 0 ⟺ d(⋆F₂) = 0 (2.4.21)

which reflects shared masslessness and gauge invariance (Baez and Muniain, 1994). Importantly, this structure implies that the graviton effective degrees of freedom can be expressed via a 7-form field on a dual 9D subspace, consistent with dimensional reduction mechanisms. This formulation also naturally captures the dilution effect: as the gravitational field spreads across the larger 9D dual volume, its coupling strength scaling as f_g ∝ V₉⁻¹ becomes increasingly weak in 4D, offering a geometric interpretation for the smallness of G in terms of volume dilution across extra dimensions.

The derivation of Newton's gravitational constant G_N proceeds by relating the coupling strength of the gravitational 7-form flux G₇ to the effective volume of the 9D space. The 11D effective action includes higher-order graviton interaction terms due to the extended dimensionality. The series expansion (Equation 2.4.20) is truncated at the ninth order for gravitons (the maximal form degree in the 9D dual subspace), and at the second order for photons, consistent with their 2-form nature.

The graviton flux strength (α_g) is related to the square of the gauge coupling, namely the fine-structure constant α = e²/(ℏc) ≈ 1/137 (Kawai et al., 1986) (Bern et al., 2010) (Borsten, 2020). The fine-structure constant α quantifies the strength of the electromagnetic interaction as a dimensionless combination of e, ℏ, and c. Consequently, we consider the graviton expansion scaling as:

(α_g/α) ≈ 3.3 × 10⁻³⁸ (2.4.22)

This ratio quantifies the geometric dilution of gravitational flux under small perturbations (Δf_g ∝ ΔG) in the vacuum: the exponent 7/9 in (α_g/α)^(7/9) reflects the attenuation of the graviton flux G₇ strength (modeled as a 7-form wedge product) in 9D space, while the 9! factorial encodes the topological suppression factor from high-order curvature terms. For a proton with mass m_p and charge e, the gravitational force between two protons is given by f_g = G_N m_p²/r², where r denotes the separation distance; the electromagnetic interaction strength is expressed as f_em = k_e e²/r² = e²/(4πε₀r²), with k_e being Coulomb's constant and ε₀ the vacuum permittivity, highlighting the vastly stronger nature of electromagnetic forces compared to gravitational interactions at comparable distances. This formulation explains the extreme weakness of gravity relative to electromagnetism as a natural consequence of geometric flux spreading in extra dimensions.
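For reference, the classical force ratio between two protons mentioned above can be checked directly; it is independent of the separation r and comes out near 8 × 10⁻³⁷ with standard constants.

```python
# Gravitational vs. electrostatic force between two protons (the r^2 factors cancel).
G   = 6.674e-11      # m^3 kg^-1 s^-2
m_p = 1.67262e-27    # kg
k_e = 8.9875e9       # N m^2 C^-2 (Coulomb's constant)
e   = 1.60218e-19    # C

ratio = (G * m_p**2) / (k_e * e**2)
print(f"f_g / f_em ~ {ratio:.2e}")   # ~8.1e-37
```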

By substituting:

α = e²/(ℏc) (2.4.23)

we relate gauge coupling directly to Planck units. We then introduce the proton mass m_p as a reference scale. This is a physically natural choice since protons dominate baryonic matter and are used in defining the known gravitational-to-electromagnetic force ratio. Substituting into Equation 2.4.22, we derive an analytic expression for Newton's gravitational constant under small perturbations in the vacuum from the photon-graviton duality field:

ΔG = G - G_N = -G_N = γ α^(9/2) ℏc / (9! m_p²) (2.4.24)

where α^(9/2) = α^(7/2) × α: the α^(7/2) term originates from Equation 2.4.22, and the additional α arises from substitution via Equation (2.4.23); here, G_N represents the classical Newtonian gravitational strength, while G incorporates quantum corrections from the photon-graviton dual field. The condition G = 0 emerges because the massless, gauge-invariant nature of this duality field contributes no curvature to spacetime, aligning with its topological defect structure (Equation 2.4.21); γ is an effective dimensionless coefficient.

The factor γ is identified with the Barbero-Immirzi parameter, a key quantity in Loop Quantum Gravity (LQG) (Domagala and Lewandowski, 2004) (Meissner, 2004) (Vyas and Joshi, 2022). This parameter quantizes area in discrete spatial geometry and enters the entropy formula of quantum black holes. Equation 2.4.21 implies that the net gravitational effect of the photon-graviton duality field is zero, consistent with its massless gauge nature. Therefore, in this perturbative framework, we have G = 0, and a calculated value of the Barbero-Immirzi parameter, γ ≈ -0.241873. The value of |γ| lies within the theoretical bounds ln(2)/π ≈ 0.221 and ln(3)/π ≈ 0.350 (Domagala and Lewandowski, 2004), as well as the subsequently derived range of 0.23753 to 0.27399 (Meissner, 2004).

The result can be rewritten in dimensionless form as:

G_N m_p²/(ℏc) = −γ α^(9/2)/9! (2.4.25)

This expression unites quantum gauge dynamics (α), quantum geometry (γ), and the classical gravitational strength (G_N) in a single coherent formulation. This framework redefines gravity not as a fundamental force, but as an emergent consequence of deeper quantum and discrete geometric structures, governed by discrete symmetries and topological constraints characterized by the Barbero-Immirzi parameter γ. It implies that both ℏ and α must remain finite, not merely as constants, but as residual manifestations of quantum discreteness forbidding a classical limit ℏ → 0. Thus, classical spacetime and its gravitational behavior emerge only in the presence of a fundamentally discrete quantum geometry, with the cosmological constant and dark energy scale naturally constrained by these residual quantum structures.

Even more provocatively, the smooth flow of time itself may be an illusion: what appears as perceptive temporal continuity emerges solely within the imaginary-time formulation of the wave Equation 2.4.18, so that imaginary time is a reality without ontological reality, as discussed in Section 2.4.5.3. It is through this lens that continuity is conjured from a truly discrete foundation (Δt > 0). The seamless fabric of spacetime is thus not fundamental, but a projection, a classical mirage woven from the threads of an inherently quantized geometry.

Within the 11-dimensional supersymmetric framework, Newton's gravitational constant G_N emerges analytically from Hodge duality between photon and graviton forms, dimensional dilution of gravitational flux, perturbative expansions reflecting field degrees, and quantum geometric corrections incorporating the Barbero-Immirzi parameter. This unified expression bridges electromagnetism, quantum gravity, and discrete geometry, offering a first-principles derivation of interaction strengths and addressing one of the most profound challenges in physics: reconciling quantum field theory with general relativity.

2.4.5.4.4 The discrete spacetime and its implications for the cosmological constant problem

The conceptual evolution of space-time comprises three pivotal stages: the Newtonian absolute framework, where space and time were regarded as independent and unchanging backgrounds; the Einsteinian relativistic paradigm, which unified spacetime into a dynamic geometric entity shaped by mass-energy distributions; and the discrete spacetime models emerging from quantum gravity theories, suggesting that spacetime may exhibit granular structures at fundamental scales. Real time, as a physical manifestation of spontaneous symmetry breaking in an underlying imaginary-time framework, is not an independent foundational dimension but arises from constraints imposed by the Euclidean substrate (Sections 2.4.5.2 and 2.4.5.3). This perspective aligns with Hartle and Hawking's proposal that quantum gravity can be formalized through path integrals over Euclidean metrics, leading to a cosmological wave function describing the universe's quantum state (Hartle, 1983), while in quantum cosmology, the entire universe is inherently described by a wave function rather than classical spacetime geometry (Vilenkin, 1994).

We have derived Newton's gravitational constant G_N analytically from quantum gauge dynamics and discrete quantum geometry under vacuum perturbations. This formulation unites the fine-structure constant (α), the Barbero-Immirzi parameter (γ), and the proton mass (m_p), taken as a reference scale, into a single coherent expression (Equation 2.4.24). Rewriting the expression in a dimensionless form (Equation 2.4.25), the framework suggests that gravity emerges from the topological defect condensate of the photon-graviton duality field, a massless, gauge-invariant F₂ ↔ ⋆F₂ dual field in 11D supergravity.

The Einstein field equations are expressed as (Grøn and Hervik, 2007):

G_μν + Λg_μν = (8πG_N/c⁴) T_μν (2.4.26)

where G_μν = R_μν − ½ Rg_μν is the Einstein tensor; g_μν denotes the metric tensor; T_μν is the stress-energy tensor; and the cosmological constant Λ is traditionally linked to vacuum energy and dark energy (Peebles and Ratra, 2003). Weinberg's 1989 formulation of vacuum energy density in natural units (ℏ = c = 1) is given by ⟨ρ_vac⟩ = Λ/(8πG_N), which translates to SI units (J/m³) as:

⟨ρ_vac⟩ = (c⁷Λ)/(8πG_Nℏ) (2.4.27)

Here, E_cutoff signifies the upper energy bound of vacuum fluctuations, which enters through Weinberg's cutoff estimate of the vacuum energy density, ⟨ρ_vac⟩ ≈ E_cutoff⁴/(16π²ℏ³c³) in SI units.

By equating this to the observed dark energy density:

(c⁷Λ)/(8πG_Nℏ) = ⟨ρ_vac⟩ = Ω_Λ ρ_crit c² = Ω_Λ (3H₀²c²)/(8πG_N) (2.4.28)

with ρ_crit = 3H₀²/(8πG_N) as the critical density in cosmology (Peacock, 1999) and Ω_Λ representing the dark energy fraction (Aghanim et al., 2020). H₀ denotes the Hubble parameter. The upper bound of the Λ-related energy (in joules) becomes:

E_cutoff = (6π Ω_Λ ℏ³c⁵H₀²/G_N)^(1/4) (2.4.29)

This yields E_cutoff ≈ 1.275 × 10⁻²¹ J, or equivalently ~7.9578 meV. Notably, the typical photon energy in cosmic microwave background (CMB) radiation with a CMB temperature (T_CMB) of approximately 2.725 K is about 3k_B T_CMB ≈ 1.129 × 10⁻²² J (0.70 meV). The proximity of E_cutoff to the CMB photon energy scale suggests a deeper quantum-gravitational origin of the observed dark energy density, rather than a mere numerical coincidence. The Λ-related energy E_cutoff could represent the maximum energy scale of photon fluctuations within the photon-graviton F₂ ↔ ⋆F₂ duality field. This interpretation frames the cosmological constant as a geometric distortion arising from quantum vacuum fluctuations in such a field, linking dark energy to fundamental quantum-gravitational effects.
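The following numerical sketch evaluates E_cutoff from Equation 2.4.29 as given above and compares it with the typical CMB photon energy; the Planck-like values adopted for H₀ and Ω_Λ are assumptions, and modest changes in them shift E_cutoff only mildly.

```python
import math

hbar = 1.0546e-34               # J s
c    = 2.9979e8                 # m/s
G    = 6.674e-11                # m^3 kg^-1 s^-2
k_B  = 1.380649e-23             # J/K
H0   = 67.4e3 / 3.0857e22       # Hubble parameter in s^-1 (Planck-like value, assumption)
Omega_L = 0.685                 # dark energy fraction (assumption)
T_cmb   = 2.725                 # K
eV      = 1.602e-19             # J

E_cutoff = (6 * math.pi * Omega_L * hbar**3 * c**5 * H0**2 / G) ** 0.25   # Equation 2.4.29
E_cmb    = 3 * k_B * T_cmb                                                # typical CMB photon energy

print(f"E_cutoff    ~ {E_cutoff:.3e} J ({1e3 * E_cutoff / eV:.2f} meV)")  # ~1.27e-21 J, ~8 meV
print(f"3 k_B T_CMB ~ {E_cmb:.3e} J ({1e3 * E_cmb / eV:.2f} meV)")        # ~1.13e-22 J, ~0.70 meV
```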

The sensitivity analysis of E_cutoff (Eq. 2.4.29) confirms its robustness: ±5% variations in H₀ or Ω_Λ shift E_cutoff by only ≤2.5%, demonstrating its stability against observational uncertainties and ruling out a fragile coincidence with CMB photon energy, instead establishing E_cutoff as the fundamental energy scale of vacuum fluctuations in the photon-graviton duality field (linked to Λ via Eq. 2.4.27). Critically, Hubble's constant H₀ governs dark energy density (ρ_vac): Its direct scaling in Eq. 2.4.28 reveals H₀ not merely as an expansion rate, but as a fundamental regulator of the quantum-gravitational vacuum (F₂ ↔ ⋆F₂ field), where increased H₀ amplifies both the fluctuation energy E_cutoff and the emergent cosmological constant.

While the Barbero-Immirzi parameter γ > 0 in standard LQG, the negative sign here suggests quantum asymmetries in discrete spacetime geometry, implying chirality or torsion in quantum spacetime potentially detectable via pulsar timing or interferometry (Lommen, 2015). These insights into gravity's emergent nature and spacetime's discrete symmetries not only resolve theoretical tensions but also hint at technological implications, particularly in leveraging vacuum geometric asymmetries for propulsion. For instance, curvature propulsion systems would rely on an analytical formula for G_N, which explains the quantum origin of gravity while providing a foundation for engineering spacetime distortions. The speculative "quantum sail" concept extends this idea: acting as a geometric resonance cavity that redistributes vacuum energy anisotropically, it would harness self-amplifying vacuum torsion (from γ < 0) to extract directional thrust from quantum fluctuations, effectively generating motion through controlled spacetime curvature. This redefines cosmic travel as an interplay between geometric symmetry and quantum asymmetry, opening possibilities for future technologies that exploit the discrete fabric of reality.

2.5.1 Smell and taste modalities

Both the olfactory and gustatory systems play crucial roles in our perception of food and environmental cues, influencing behaviors related to feeding, safety, and social interactions. Clinically, dysfunction of the olfactory and gustatory systems can lead to various symptoms. For instance, Alzheimer's disease may affect olfactory processing areas (Miceli and Caccia, 2023b), leading to hyposmia or phantosmia. Brain injuries or certain neurodegenerative diseases may affect gustatory processing areas, resulting in dysgeusia or changes in food preferences. These sensory impairments not only affect quality of life but can also be early indicators of more serious neurological conditions.

2.5.1.1 The olfactory system

The sense of smell begins with olfactory receptor neurons in the nasal cavity, which detect odorant molecules. These neurons send their signals directly to the olfactory bulb, bypassing the thalamus, which is unusual compared to other sensory systems. From the olfactory bulb, information travels to the olfactory tract and then to several areas in the brain, including the olfactory tubercle, piriform cortex, and entorhinal cortex (Kandel et al., 2021). This pathway is critical for identifying odors and linking them to emotional and motivational states.

Neurons in the primary olfactory cortex (piriform cortex) encode odor identity, and represent the earliest and most odorant-specific and concentration-invariant features of the odor (Bolding and Franks, 2018). Odorant information is relayed from the olfactory bulb to the piriform cortex through the lateral olfactory tract (Wang et al., 2020). Within the human piriform cortex, neurons responsive to a given odorant are distributed without apparent spatial preference, and exhibit discontinuous receptive fields (Stettler and Axel, 2009). The lack of spatial preference suggests that odor information is encoded by olfactory sensory neurons with a specific odor receptor (OR). Consistently, olfactory sensory neurons fire bursts of action potentials with odor-specific latencies and concentration-dependent prolonged responses (Bolding and Franks, 2018). Moreover, beyond repertoires of trace amine-associated receptors (TAAR) (Hashiguchi and Nishida, 2007) and vomeronasal receptors (Grus et al., 2005), the largest OR repertoires in human and rat genomes contain ~400 and ~1200 OR genes, respectively (Niimura, 2009).

Structurally, the EC is organized into several layers and subdivisions (Wang et al., 2018), including the egocentric LEC and the allocentric MEC discussed above. The EC receives input from the olfactory bulb and piriform cortex, processing and relaying this information to the hippocampus. The EC is also involved in the processing of chemical compositions of odors by integrating olfactory information with contextual and spatial cues (Witter et al., 2017). Naturally, odors as diffusible signals are tightly integrated with spatial data. This tight spatial integration is essential for creating a coherent representation of odor-related discriminations, memories and spatial environments.

The direct projection of the olfactory system to the EC might suggest that the odor information is additively integrated with spatial data. The unusual characteristic of the olfactory system bypassing the thalamus might also imply the limited role of the thalamus in information integration and conscious perception.

2.5.1.2 The gustatory system

Taste perception starts with taste receptor cells on the tongue and other parts of the mouth, which respond to five submodalities (sweet, sour, salty, bitter, umami). These cells send signals through cranial nerves VII (facial), IX (glossopharyngeal), and X (vagus) to the brain stem, specifically the solitary nucleus (Kandel et al., 2021). From there, projections go to the thalamus and then to the gustatory cortex, primarily located in the anterior insular cortex and the anterior part of the inferior parietal lobe. The gustatory cortex is organized in a complex, non-topographical manner, allowing for the combination of taste modalities with other sensory inputs such as smell, texture, and temperature, contributing to the perception of flavor (Avery et al., 2020).

The insular cortex, or insula, is a deep brain region located within the lateral sulcus, divided into anterior and posterior sections with distinct functions. The anterior insula is involved in higher-order processes like emotional awareness, interoception (perception of internal bodily states), empathy, and decision-making, playing a key role in connecting sensory experiences with emotions and social cognition. It is also part of the gustatory cortex, processing taste. The posterior insula is more engaged in sensory processing, such as pain, temperature, and visceral sensations, as well as proprioception and sensorimotor integration, contributing to bodily awareness and self-location. Together, the insula integrates sensory, emotional, and cognitive information, supporting experiences like bodily self-awareness, emotion regulation, and social interactions (Menon and Uddin, 2010) (Kurth et al., 2010).

The anterior insular cortex processes gustatory information as a personal sensory experience, indicating that the ability to discern distinct flavors is closely linked to the emotional responses to food, such as enjoyment and aversion. This projection also suggests that gustatory information is additively integrated with feelings. Consequently, both the olfactory and gustatory systems shape our perception of food and environmental cues, and their unique pathways and cortical organizations underscore the importance of these senses in daily life and survival.

2.5.2 The Sensory cortex

The sensory cortex, specifically the primary somatosensory cortex (S1), is a critical brain region involved in processing tactile information from the body. It is located in the postcentral gyrus of the parietal lobe and is organized in a highly specific and detailed manner. One of its most notable features is the sensory homunculus, a distorted representation of the body's sensory map (Figure 2.5.1) (Nguyen and Duong, 2024). This cortical map reflects the density and sensitivity of sensory input from different body parts, with areas of the body that have more precise sensory receptors, such as the fingers and lips, occupying larger portions of the cortex. Conversely, body parts with less sensitive or fewer receptors, such as the back and legs, are represented with smaller cortical areas.

[FIGURE:2.5.1] The sensory homunculus (Wikipedia).

The sensory homunculus is organized in a manner known as somatotopic organization, where adjacent areas of the body are mapped to adjacent regions of the cortex. This organization allows for the fine-tuned processing of sensory information, facilitating detailed tactile perception and spatial awareness. The size of the representation in the sensory homunculus correlates not with the actual size of the body part but with its sensory importance and resolution, highlighting the cortex's adaptability and functional specialization (Figure 2.5.1). This map is not static but can be modified by experience and sensory input, reflecting the brain's plasticity and ability to adjust sensory processing in response to changes in sensory input or body usage, such as phantom limb sensation (Ramachandran and Blakeslee, 1999).

The unique structure of the sensory cortex allows for the processing of various sensory modalities, such as touch, temperature, and proprioception (the sense of body position and movement). This somatotopic organization implies that these sensory modalities use spatial encoding in the brain. This information can be relayed to the posterior insula, which is involved in processing somatosensory inputs, including pain, temperature, itch and other feelings. In turn, the anterior insula receives inputs from the posterior dorsal insula and plays a key role in integrating sensory information with emotional and cognitive processes (Craig, 2002). It is involved in interoceptive awareness, which is the perception of internal bodily states like coolness, hunger, thirst, and emotions, and is also critical for emotional processing and social cognition. The insula helps translate sensory information from the external environment into subjective feelings and experiences by connecting the sensory cortex with brain regions involved in emotional and decision-making processes, such as the anterior cingulate cortex (ACC) and the right (non-dominant) orbitofrontal cortex (OFC) for subjective awareness of feelings (Craig, 2002). The right OFC is preferentially associated with negative emotions such as sadness, anger, panic and disgust (Craig, 2002). The emotionally volatile right hemisphere is also vividly discussed by Ramachandran and Blakeslee in Chapter 7 of their book 'Phantoms in the Brain' (Ramachandran and Blakeslee, 1999).

Beyond the insular cortex, the sensory cortex is also interconnected with the motor cortex, the PPC, the PFC and the ACC, enabling the integration of sensory information with motor responses and higher cognitive functions. This connectivity is essential for the brain's ability to make sense of the sensory input, generate appropriate reactions and interact with the world.

3.1 Perceptions and the self-consciousness

In summary, we have mentioned some conscious and unconscious information processing, such as conscious visualization and the unconscious computational processes for frame transformation. After the discussion of additional sensory modalities beyond the auditory and visual systems, we have a clearer picture of perceptions and self-consciousness. First, temporal and spatial information are decomposed, reconstructed, perceived and redistributed via the auditory and visual systems. Secondly, olfactory and gustatory information are additively integrated into specialized perceptions, such as the spatial data in the EC and the interoception of the anterior insular cortex. Thirdly, the sensory cortex together with the egocentric PPC integrates egocentric information about one's positions and various personal sensory modalities, such as touch, temperature, and proprioception; this autobiographical personal information is further transmitted to the ACC and OFC via the insula cortex to generate subjective feelings and awareness.

As discussed in Section 2.3.3, synapses of complex neurons can act as pixels, and together generate snapshots for specialized perceptions. Apparently, in the whole brain, all active synapses, which receive pixelized information, together can act as a unified internal representation for the self-consciousness about personal perceptions, feelings and thoughts, filled with tremendous details from trillions of synaptic inputs. Thus, the whole brain might be considered as a neuronal quantal field, comprising ~86 billion neurons (Azevedo et al., 2009) and trillions of synapses (Voglewede and Zhang, 2022), which intrinsically function as wave functions and perceptive units, respectively.

Sensory modalities and their submodalities produce distinct information chunking at various levels, which suggests that self-consciousness encompasses hierarchical structures and modular functional units. In the brain, the advantage of such a design for the structure-function correlations may be employed as a fault-tolerant mechanism.

3.2 Cortical areas associated with clinical diseases

Cortical areas associated with clinical diseases provide valuable information about potential fault-tolerant mechanisms and functional perspectives. For example, achromatopsia (color blindness) confirms that V4 is critical for color processing (Heywood et al., 1992), and akinetopsia (motion blindness) identifies the function of V5/MT in motion processing (Ardila, 2016). Cortical blindness in patients with normal eye function validates the fundamental role of the V1 area in visual perception (Barbot et al., 2021). Damage to the FFA and the PPA causes deficits in recognizing faces (prosopagnosia) (Haeger et al., 2021) and the layout of scenes (Epstein et al., 1999), respectively. EBA damage induces autotopagnosia (inability to localize body parts) (Ogden, 1985) or asomatognosia (denial of ownership of a body part) (Saetta et al., 2021).

The hypometabolism of the RSC in AD patients (Chételat et al., 2016) could be a key factor contributing to difficulties in remembering spatial environments and a loss of directional sense. Impairments of the somatosensory cortex in the parietal lobe lead to tactile agnosia (impaired recognition of objects by touch despite intact sensation), hemispatial neglect (Leibovitch et al., 1998), phantom limb syndrome (Chahine and Kanazi, 2007), and astereognosis (inability to identify objects by touch without visual input) (Amick, 2018). The frontal lobe syndrome (dysexecutive syndrome) affects planning, decision-making, and social behaviors (Hanna-Pladdy, 2007). Damage to the insula cortex results in indifference to illness (anosodiaphoria) (Prigatano, 2013) and denial of ownership of a limb (somatoparaphrenia) (Gandola et al., 2012). Intriguingly, anosodiaphoria and somatoparaphrenia provide strong clinical evidence for the crucial role of the insula cortex in subjective awareness and self-consciousness. Asomatognosia (Saetta et al., 2021) and hemispatial neglect (Leibovitch et al., 1998) also imply that the EBA, the ACC and the parietal cortex are actively involved in self-consciousness.

The prefrontal cortex (PFC) has important roles in cognitive control and executive function (Friedman and Robbins, 2022). Consequently, the hypometabolism of the left-side dorsolateral prefrontal cortex (Brodmann area 46) is common to unipolar depression, bipolar depression, obsessive-compulsive disorder and major depression (Baxter et al., 1989). Consistently, patients with depression may experience benefits following intermittent theta burst (iTBS) repetitive transcranial magnetic stimulation (rTMS) targeted at the left dorsolateral prefrontal cortex (Blumberger et al., 2018). On the other hand, the hypermetabolism/hyperactivity of the subgenual cingulate (Brodmann area 25) is associated with negative mood states (Mayberg et al., 2005). Thus, along with the role of the right OFC in negative emotions discussed above (Craig, 2002), personal emotional stability is contingent upon the subtle equilibrium within the functional regions of the left and right cerebral hemispheres (Table 3.2.1).

The arcuate fasciculus is typically larger and more developed in the left hemisphere, which is dominant for language functions in most people. This evolutionarily shaped left-hemispheric dominance is specialized in its role in connecting language-related areas: Broca's area, Geschwind's area and Wernicke's area, which are crucial for speech, language perception and comprehension (Sousa et al., 2017). Alterations of the arcuate fasciculus, Broca's area, Geschwind's area and Wernicke's area result in aphasia with different clinical symptoms (Table 3.2.1). Wernicke's area is located near the auditory cortex and the visual cortex, while Broca's area is anterior to the inferior part of the premotor cortex in the frontal lobe. Therefore, Broca's aphasia (Mohr et al., 1978) shows difficulty in speech production, with non-fluent but meaningful speech; Wernicke's aphasia (Mesulam et al., 2015) is characterized by fluent but nonsensical speech, demonstrating the inability to understand spoken or written language.

The motor systems are hierarchically organized (Kandel et al., 2021). Voluntary movements for purposeful tasks need to activate a brain network: the frontal lobe for decision-making integrates inputs from the premotor cortex and the supplementary motor area (SMA) for initial and higher-level planning, respectively; the PPC contributes to spatial processing, while the primary motor cortex executes movements; the basal ganglia select and initiate actions; the substantia nigra ensures smooth movement; and the cerebellum coordinates and fine-tunes motions (Table 3.2.1). Particularly, the medium spiny neuron in the basal ganglia and the Purkinje neuron in the cerebellum receive large numbers of synaptic inputs: about 10,000 synapses for a single medium spiny neuron (Koos et al., 2004), and up to 97,853 synaptic connections for a Purkinje cell (Masoli et al., 2024). Therefore, a medium spiny neuron receives and integrates inputs from thousands of cortical neurons, excluding those from the primary visual and primary auditory cortices (Purves et al., 2001). A Purkinje neuron in the cerebellum can process up to ~10,000 inputs in parallel, showcasing the high computational capacity of the cerebellum. The reciprocal connection between the basal ganglia and the cerebellum implies that the two structures may form an integrated functional network (Bostan et al., 2010) for sensorimotor and neurocognitive processing. Consequently, in recent years, clinical and functional neuroimaging studies suggest that the cerebellum participates in higher-order functions (Schmahmann and Sherman, 1998), such as executive function, linguistic processing, spatial cognition, and affect regulation (Hoche et al., 2018). Beyond cerebellar motor syndrome and vestibulo-cerebellar syndrome, cerebellar cognitive affective syndrome is also known as Schmahmann's syndrome (Table 3.2.1) (Manto and Mariën, 2015) (Bodranghien et al., 2016). These cerebellar syndromes are related to different functional regions due to specific input-output pathways and re-entrant loops between each cerebellar compartment and the cerebral cortex. Like the sensory homunculus, the primary motor cortex is organized according to the motor homunculus. Additionally, the basal ganglia-thalamocortical circuit employs somatotopic organization for encoding and processing motor information (Nambu, 2011).

Table 3.2.1 Cortical areas and their associated clinical diseases.

Overall, sensory and functional modalities, along with their submodalities, are organized into a modular and hierarchical framework that includes spatial partitioning. This design approach, which emphasizes both structural hierarchy and spatial organization within modules, contributes to a fault-tolerant system that can withstand brain lesions without incurring catastrophic functional losses. Meanwhile, these clinical conditions (Table 3.2.1) provide valuable insights into the fault-tolerant mechanisms and enhance our understanding of the functional roles of their associated cortical areas.

3.3 The hard problem of consciousness

Consciousness is considered the most challenging problem in the science of the mind, especially the hard problem of consciousness (Chalmers, 1995). The hard problem of consciousness, raised by David Chalmers in 1995, refers to the difficulty of explaining why and how subjective experiences (or qualia) arise from physical processes in the brain. Chalmers distinguishes between the "easy problems" of consciousness (such as the ability to discriminate and categorize, deliberate cognitive functions, and attention) and the hard problem (subjective experience, phenomenal consciousness or qualia) (Chalmers, 1995). Chalmers also criticizes existing approaches, observing that 'most existing theories of consciousness either deny the phenomenon, explain something else, or elevate the problem to an eternal mystery' (Chalmers, 1995). From the systematicism perspective, many consciousness theories fail to recognize the structure-function correlation principle, diminishing their explanatory power regarding consciousness. For further details, please refer to a recent review of theories of consciousness (Seth and Bayne, 2022), recent progress (Consortium et al., 2023), and recent debates (Lenharo, 2023).

As discussed in Section 3.1, self-consciousness naturally emerges as a unified internal representation from information flow and integration across active synapses of complex neurons located in hierarchical and modular brain areas. Consequently, subjective awareness and self-consciousness are personal feelings, which emerge from autobiographical self-awareness along with unconscious perceptions, such as proprioception. Moreover, we have already discussed some emerging information, such as the cohesive melody that emerged from music notes via adaptation-encoded duration time discussed in Section 2.1, and the color discrimination and creation discussed in Section 2.4.3. In Sections 2.4.4.4 (Bayesian inference) and 2.4.4.5 (Meta-metrics and qualia), we have resolved Chalmers' hard problem without invoking irreducible emergent properties.

In light of all the details above, in the frequency domain, there is no fundamental difference between the hard problem and the easy problems of consciousness. They share and rely on the same fundamental characteristics and algorithms of complex cells discussed in Section 2.3.3. Both the hard problem and the easy problems are explainable and resolvable in the realm of quantum wave mechanics along with the law of the structure-function correlation as corroborated in the above cortical areas associated with clinical diseases (Table 3.2.1).

The fundamental issue regarding consciousness with emergent properties is defining and quantifying the information that arises. Emergent properties in consciousness imply that certain complex systems exhibit characteristics not present in their individual components, making it challenging to pinpoint exactly where or how this information emerges. Understanding these emergent aspects requires developing precise methods to measure and describe the information involved in conscious experiences, which could provide insights into the nature of subjective awareness.

3.4 Definition and quantitation of information emerging

As the "internal time" noticed by Ilya Prigogine (Prigogine, 1980), a cohesive melody can emerge from playing music notes 1 through 7 consecutively, which is mentioned in section 2.1.3. Consequently, we have defined three classifications of time: real time, duration time and characteristic time in section 2.1.4. We also have discussed how the adaptation is related to the capacity of working memory in section 2.4.4.3. Importantly, we have analytically explained neuronal refractory period and spike frequency adaptation, which is a representation of energy constraint as discussed in section 2.1.6.

For simplicity, we can first use the adaptation of auditory simple neurons to discuss the definition and quantitation of information emerging. We can rewrite Equation 1.8.3, the information capacity of a single action potential (C_impulse), as:

C_impulse ≈ (4W/n) log₂(1 + S/N) ≈ 24 bits (3.4.1)

where W is the bandwidth, S/N is the signal-to-noise ratio, and n denotes the spike count within the duration time.
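One illustrative parameter choice that reproduces the quoted ≈24 bits is W = 1, n = 1 and S/N = 63, so that log₂(1 + S/N) = 6; the S/N value is an assumption used only to match the quoted figure.

```python
import math

def c_impulse(W, n, snr):
    """Information capacity per action potential, Equation 3.4.1: 4 (W/n) log2(1 + S/N)."""
    return 4 * (W / n) * math.log2(1 + snr)

print(c_impulse(W=1, n=1, snr=63))   # 24.0 bits for this illustrative choice
```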

[FIGURE:3.4.1] Schematic of the information-emerging mechanism of spike frequency adaptation.

Comparing the spike trains with or without frequency adaptation in Figure 3.4.1, it is immediately clear that adaptation markedly increases the information capacity of a single action potential by expanding the bandwidth (W) and reducing the spike number (n). Consequently, we can define a new dimensionless information capacity per spike (C_info/spike) over the same duration time as the ratio of C_impulse with adaptation to C_impulse of a simple neuron without adaptation:

C_info/spike = W n_s / n (3.4.2)

where n_s is the number of spikes of a simple neuron without frequency adaptation within the duration time, W_s is the bandwidth of the simple neuron. Importantly, equation 3.4.2 can explain the high energy efficiency of the brain discussed at the beginning. The adaptation can save considerable energy (n_s - n) in terms of the quantization energy ε₀ discussed in section 2.1.6.5.

Considering the characteristic frequency of a simple neuron is invariant, the bandwidth of a simple neuron without frequency adaptation is adopted as one (W_s = 1). Consequently, total information within a duration time with n event-related spikes is:

I_total = C_info/spike × n = W n_s / n × n = W n_s (3.4.3)

Further subtracting the spike number n_s of a simple neuron without frequency adaptation over duration t, dimensionless information emerging (I_emerging) is defined as:

I_emerging = W n_s - n_s = (W - 1) n_s (3.4.4)

This definition is useful for quantifying the information emerging in the working memory discussed in Section 2.4.4.3. Within a given time window, n_s represents a constant spike count of simple neurons without frequency adaptation (as defined in Equation 2.1.1, where characteristic frequency remains invariant), so that information content is primarily determined by bandwidth W.
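A minimal sketch of Equations 3.4.2 to 3.4.4 (with the simple-neuron bandwidth W_s taken as 1) is given below; the numerical inputs are illustrative only.

```python
def info_metrics(W, n_s, n, W_s=1.0):
    """Equations 3.4.2-3.4.4 with the simple-neuron bandwidth W_s set to 1."""
    c_per_spike = (W * n_s) / (W_s * n)   # Eq. 3.4.2: ratio of C_impulse with vs. without adaptation
    i_total     = c_per_spike * n         # Eq. 3.4.3: reduces to W * n_s when W_s = 1
    i_emerging  = i_total - n_s           # Eq. 3.4.4: (W - 1) * n_s
    return c_per_spike, i_total, i_emerging

# Illustrative values: adaptation triples the bandwidth and cuts 12 spikes down to 4,
# saving (n_s - n) = 8 quanta of the energy unit epsilon_0 discussed in Section 2.1.6.5.
print(info_metrics(W=3, n_s=12, n=4))     # (9.0, 36.0, 24.0)
```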

For the information emerging in the complex cells, we have intensively discussed the flexibilities of visual complex neurons, such as color detection, discrimination and creation in section 2.4.3. In the frequency domain, the modular group SL(2, Z) and its congruence subgroups are introduced in section 1.6 and also discussed in sections 2.1.2 and 2.4.1. Consequently, simple auditory neurons are classified as the congruence subgroup Γ₁(n), while the simple neurons featured with even frequencies are further subclassified as the principal congruence subgroup Γ₂(n). Complex cells are classified as the largest congruence subgroup Γ₀(n). These groups reflect hierarchical information complexity: high-complexity subgroups exhibit richer generator structures and topological properties (Section 1.6). Complex neurons further incorporate high-dimensional Fisher matrices beyond mixed frequencies (Equation 2.3.2).

Empirical evidence from the monkey V4 cortex demonstrates broader bandwidth in complex neurons (Section 2.3.2). Complex neurons exhibit enhanced functional flexibility beyond color discrimination and creation, such as the storage and advanced processing of visual-related information (Section 2.4.4), as well as potential roles in self-consciousness and cortical areas associated with clinical diseases (Section 3.2). These findings underscore the necessity to quantify information emergence in complex systems, as detailed in Section 3.5.

3.5.1 Biophysical redefinition of neural bandwidth

Neural bandwidth (W) quantifies the information-processing capacity of a neuron or neural system, defined as the number of distinguishable independent signals. For simple neurons lacking frequency adaptation, W is defined as 1 in Equation 3.4.3 and Figure 3.4.1, reflecting limited functional complexity due to their response to single uniform inputs and lack of "information emergence" (Equation 3.4.4). This underscores that W is fundamentally discrete, dependent on signal distinguishability.

The information emergence defined in Equation 3.4.4 represents an integer. Incorporating the information of simple neurons without frequency adaptation, Equation 3.4.3 is redefined as the total digitized information (DI) over duration t:

DI = W n_s, 1 ≤ W (3.5.1)

where n_s is the intrinsically invariant number of spikes of simple neurons without frequency adaptation within the duration t. This formulation of neuronal information captures the discrete neural processing based on distinguishable signals.

For the DI of simple neurons without frequency adaptation, W = 1. For simple neurons with frequency adaptation, W is greater than 1 but less than n_s:

DI = W n_s, 1 < W < n_s (3.5.2)

For complex neurons, W increases with signal diversity and may exceed n_s. Key factors include: (1) Frequency: The number of distinct oscillatory patterns (e.g., sound frequencies, firing rates) a neuron can encode. For example, visual neurons process varying temporal or spatial frequencies. (2) Phase: Differences in the timing of signals, critical for tasks like stereoscopic vision. Binocular disparity, where the phase difference between inputs from each eye creates depth perception, is a prime example of how phase contributes to neural bandwidth. (3) Amplitude: Variations in signal strength, such as loudness in hearing or brightness in vision. The human auditory system's ability to detect subtle amplitude changes highlights its role in expanding W. (4) Frequency adaptation: Dynamic sensitivity adjustments via the ATP/PCr/creatine-dependent mechanism, which is extensively discussed in sections 2.1.5 and 2.1.6.

Equation 2.1.28 formalizes frequency adaptation by integrating time-dependent adjustments into a wavefunction expression that captures how neurons adapt their response profiles. Concurrently, Equation 2.3.2 defines the wave function of complex neurons in dynamic environments, reflecting their ability to process multiple signal dimensions (e.g., frequencies, phases, amplitudes). These two frameworks are combined in Equation 3.5.3, which synthesizes the frequency-adaptive wave function of complex neurons into a coherent mathematical formulation.

ψ_c = Σ δ(t - mΔt) exp(-i∫H(t)dt/ℏ) exp(-½ rᵀF r) exp(2πi Σ f_j t) (3.5.3)

where H_AP = πℏI√t (V(t) − V_rest)/Δt, defined in Equation 2.1.28, represents the Hamiltonian (~10⁶ ATP/PCr/creatine) required for a single action potential during the characteristic time (inter-spike interval) Δt. The sum of the intervals Δt is the duration time t, where m is an integer as defined in Equation 2.1.28. F represents the n-dimensional Fisher-information-weighted diagonal matrix defined in Equation 2.3.2, and f_j denotes temporal frequency.

The dimension of distinct wavefunctions described by Equation 3.5.3 directly determines the W, as it quantifies how many independent signal features a neuron can encode simultaneously (Equation 3.5.4). This integration ensures that W reflects both static structural properties and dynamic functional adaptations of complex neurons.

W = dim(ψ_c) = n + m + j, 1 < W (3.5.4)

where n is the dimension of the Fisher diagonal matrix; m is the maximum spike count modulated by frequency adaptation; j is the cardinality of distinct temporal frequencies; dim(ψ_c) denotes the dimension of the wavefunction (Equation 3.5.3). Consequently, W represents the dimensionality of neural encoding, capturing dynamic adjustments to stimulus features, such as color/spatial contrast (n), intensity (m) and temporal frequency (j).

The biophysical definition of W is therefore rooted in the number of independent wavefunctions (Equation 3.5.3) that a neuron can perceive. A "wavefunction" here refers to any distinct, measurable signal pattern, whether electrical (e.g., action potentials), mechanical (e.g., hair cell vibrations in hearing), or optical (e.g., light intensity variations). By quantifying W as the count of these wavefunctions, this framework provides a unified metric to compare information processing across sensory modalities and neural systems.

In summary, neural bandwidth (W) quantifies a neuron's capacity to decode multidimensional information, integrating biophysical principles with functional criteria (frequency, phase, amplitude, frequency adaptation). This definition bridges the gap between the structural properties of neurons and their computational roles in perception. It provides a scalable and testable framework for investigating how biological systems encode complexity, ranging from single cells to networks. By anchoring W in quantifiable wavefunction properties, this redefinition bridges neurobiological mechanisms with computational models of information processing.

3.5.2 Digitized information emergence in complex neurons

The total digitized information of a complex neuron over duration t is given by:

DI = W n_s = (n + m + j) n_s, 1 < W (3.5.5)

where W = dim(ψ_c). The emergent information (EI) is obtained by subtracting the information content of simple neurons without frequency adaptation (Equation 3.5.1 with W = 1):

EI = (n + m + j) n_s - n_s = (n + m + j - 1) n_s (3.5.6)

In visual signal processing, the primary (V1) and secondary (V2) visual cortices operate with dimensional complexity below n_s, resulting in minimal information emergence due to Equation 3.5.6 compared to the DI of simple neurons in Equation 3.5.2. However, the hierarchical and reciprocal connectivity between V1-V5 enables progressive dimensionality expansion, reaching a holographic information state, which is discussed in Section 2.3. The delta function in Equation 2.1.28 implies that, over duration t, simple neuronal functions formally resemble discrete quantum information channels with the maximum dimension n_s. Here, the number of discrete Weyl operators scales as n_s² (Rehman et al., 2018). Consequently, the dimension of the holographic information states is formally n_s⁴, representing a theoretical maximum for quantum-inspired encoding capacity. This assigned value establishes a conceptual correspondence between quantum information-theoretic formalisms and neurobiological perception processes.

For V4, the system supports creative information emergence, such as color perception (qualia). Similarly, V5, specialized in motion processing, generates dynamic perceptual states by leveraging temporal coherence and higher-order dimensional integration. Quantifying this creative capacity of complex neurons remains critical.

3.5.3 Creation/qualia of complex cells in the visual system

The creative information (CI) is derived by subtracting the DI upper bound of simple neurons (Equation 3.5.2 with W = n_s):

CI = (n + m + j) n_s - n_s² = (n + m + j - n_s) n_s (3.5.7)

CI emerges naturally when W = dim(ψ_c) > n_s. As discussed in Section 2.4.3, the dimension n of the Fisher information matrix F is likely to be around 63 for V4 complex neurons, and may even reach 300 (Bohon et al., 2016), which should significantly exceed biologically plausible n_s values in simple neurons.

The CI > 0 condition defines a critical transition threshold where the system transitions from physical signal encoding to color perception. This manifests in color perception as the integrated emergence of qualia, including hue, saturation, and intensity dimensions. Bandwidth × spike count in Equation 3.5.1 quantifies informational richness, such as three primary colors combining with brightness/gradients to create and discriminate million-to-billion colors discussed in Section 2.4.3.2. Equation 3.5.7 describes the modular architecture of color globs and interglobs in V4, as well as the multimodal integration of non-color features such as shape, orientation, spatial frequency, depth, motion, and attentional selection (Roe et al., 2012; Bohon et al., 2016).
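
As an illustrative aid (not part of the original derivation), the following minimal Python sketch evaluates Equations 3.5.4-3.5.7 for a hypothetical V4-like complex neuron; n = 63 follows the Fisher-matrix dimension cited above, while m, j, and n_s are placeholder example values rather than measured constants.

def bandwidth(n, m, j):
    # Neural bandwidth W = dim(psi_c) = n + m + j (Equation 3.5.4)
    return n + m + j

def digitized_information(W, n_s):
    # Total digitized information DI = W * n_s (Equation 3.5.5)
    return W * n_s

def emergent_information(W, n_s):
    # EI = (W - 1) * n_s, DI minus the simple-neuron case W = 1 (Equation 3.5.6)
    return (W - 1) * n_s

def creative_information(W, n_s):
    # CI = (W - n_s) * n_s, DI minus the simple-neuron upper bound W = n_s (Equation 3.5.7)
    return (W - n_s) * n_s

n, m, j, n_s = 63, 5, 4, 17   # n from Section 2.4.3; m, j, n_s are hypothetical examples
W = bandwidth(n, m, j)
print("W =", W, "DI =", digitized_information(W, n_s),
      "EI =", emergent_information(W, n_s), "CI =", creative_information(W, n_s))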

3.6 Energy efficiency and realization probability of information emergence

The emergence of information during spike frequency adaptation reflects a synergistic optimization of energy expenditure and informational capacity (Section 3.4). By reducing the number of spikes from n_s (non-adapting case) to n, the system conserves quantized energy ΔE = (n_s - n)ε₀, where ε₀ represents the minimum energy unit (Section 2.1.6.5). Simultaneously, the emergent information EI scales with the bandwidth expansion W and n_s. This dual optimization is not a simple trade-off but an example of synergistic efficiency, where energy savings directly enable the emergence of structured information through dynamic adaptation.

3.6.1 The probability of emergent information associated with energy savings

In auditory simple neurons with tonotopic organization, high-frequency receptors at the cochlear base and low-frequency ones at the apex (Section 2.1.1; Figure 2.1.1), energy constraints predict that high-frequency neurons prioritize spike frequency adaptation to optimize energy efficiency. Given this, the probability of emergent information associated with energy savings (EI_s) is hypothesized to follow a Boltzmann-like relationship:

P_emergent = 1 - e^(-β(n_s - n)ε₀/ε₀) (3.6.1)

where β encodes the system's sensitivity to energy savings; the exponent is the energy saving (n_s - n)ε₀ normalized by the quantization energy ε₀ (β = 1 is used throughout the examples below). Equation 3.6.1 satisfies P_emergent = 0 when n_s = n.

Shannon deduced the logarithmic form of entropy by leveraging continuity, monotonicity, and the additivity of independent subsystems (Shannon and Weaver, 1998). Similarly, the probability expression for EI_s is essentially fixed: Equation 3.6.1 is uniquely determined by the smoothness, monotonicity, and multiplicative property of 1 - P_emergent.

3.6.2 Energy efficiency of information emergence

Multiplying P_emergent by an amplitude α yields a meaningful expression of EI_s as a function of energy conservation:

EI_s = α(1 - e^(-(n_s - n)ε₀/ε₀)) (3.6.2)

Energy efficiency η is defined as the ratio of EI_s to energy consumption nε₀:

η = EI_s / (nε₀) = α(1 - e^(-(n_s - n)ε₀/ε₀)) / (nε₀) (3.6.3)

η also reflects the realization probability of information emergence.

Figure 3.6.1 demonstrates the energy efficiency η for information emergence. The simulation employed the following parameters: α = 24, ε₀ = 1, n_s = 17, and β = 1, based on initial and boundary conditions from Sections 1.8.1 and 3.4, as well as data presented in Figure 3.4.1.

Figure 3.6.1 Energy efficiency and realization potential of information emergence. The "magic number seven, plus or minus two" (Miller, 1956), a classic limit on working memory capacity, is highlighted in green. Fukuda (Fukuda et al., 2010) and Cowan (Cowan, 2010) suggest that complex information may be constrained to 3-4 items (highlighted in red), aligning with the model's predictions. Working memory capacity is discussed in Section 2.4.4.3.
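
For readers who wish to reproduce the curve in Figure 3.6.1, a minimal sketch using only the parameters stated above (α = 24, ε₀ = 1, n_s = 17, β = 1) is given below; it tabulates P_emergent (Equation 3.6.1) and η (Equations 3.6.2-3.6.3) over spike counts n, leaving the plotting and working-memory annotations aside.

import math

alpha, eps0, n_s, beta = 24.0, 1.0, 17, 1.0   # parameters stated for Figure 3.6.1

def p_emergent(n):
    # Probability of emergent information associated with energy savings (Equation 3.6.1)
    return 1.0 - math.exp(-beta * (n_s - n) * eps0 / eps0)

def eta(n):
    # Energy efficiency of information emergence (Equations 3.6.2 and 3.6.3)
    return alpha * p_emergent(n) / (n * eps0)

for n in range(1, n_s + 1):
    print(f"n = {n:2d}   P_emergent = {p_emergent(n):.3f}   eta = {eta(n):.3f}")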

3.6.3 Quantification of realizable information emergence

Equation 3.6.1 represents the probability of emergent information associated with energy savings under β = 1, where the term exp(-(n_s - n)ε₀/ε₀) quantifies the probability of failing to realize a saving of (n_s - n)ε₀, normalized by the quantization energy ε₀. This expression mirrors the structural and conceptual role of the Boltzmann factor in statistical mechanics, where exponential decay governs probabilistic outcomes based on energetic constraints.

Considering energetic constraints, we can further derive realizable information emergence metrics, including realizable total digitized information (RDI), realizable emergent information (REI), and realizable creative information (RCI). These are obtained by multiplying Equation 3.6.1 with Equations 3.5.5, 3.5.6 and 3.5.7, respectively:

RDI = W n_s (1 - e^(-(n_s - n)ε₀/ε₀)) (3.6.4)
REI = EI (1 - e^(-(n_s - n)ε₀/ε₀)) (3.6.5)
RCI = CI (1 - e^(-(n_s - n)ε₀/ε₀)) (3.6.6)

Note that RDI, REI, and RCI are discrete quantities but not necessarily integer-valued.
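
A minimal continuation of the sketch above (with hypothetical W, n, and n_s values) evaluates the realizable metrics of Equations 3.6.4-3.6.6 by scaling DI, EI, and CI with the emergence probability of Equation 3.6.1.

import math

def realizable(metric, n, n_s, eps0=1.0):
    # Scale an information metric by the emergence probability of Equation 3.6.1 (beta = 1)
    return metric * (1.0 - math.exp(-(n_s - n) * eps0 / eps0))

W, n, n_s = 72, 4, 17          # hypothetical example values
DI = W * n_s                   # Equation 3.5.5
EI = (W - 1) * n_s             # Equation 3.5.6
CI = (W - n_s) * n_s           # Equation 3.5.7
print("RDI =", realizable(DI, n, n_s),   # Equation 3.6.4
      "REI =", realizable(EI, n, n_s),   # Equation 3.6.5
      "RCI =", realizable(CI, n, n_s))   # Equation 3.6.6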

3.7 Unifying the hard and easy problems of consciousness in the frequency domain

Based on Section 3 and its subsections, the definition and quantitation of information emergence provide a unified framework to demonstrate that the hard problem (subjective experience) and easy problems (functional processes) of consciousness exhibit no fundamental difference in the frequency domain. Both share identical foundational characteristics rooted in complex cell dynamics governed by principles such as the structure-function correlation and mesoscopic quantum wave mechanics.

By defining neuronal information through Fisher information under the principle of minimum uncertainty, and quantifying emergent information via mechanisms like adaptive time scales (real time, duration time, characteristic time) and energy-constrained optimization (Section 3.4), the framework establishes a computable basis for analyzing consciousness. For example, subjective qualia (e.g., color perception in the visual system, Section 3.5.3) are modeled via equations like Equation 3.5.7, capturing modular integration and multimodal feature processing. Energy efficiency metrics (Section 3.6) link information emergence to biophysical constraints, demonstrating how spike frequency adaptation optimizes energy expenditure and informational capacity.

These quantitative approaches dissolve the hard/easy problem dichotomy by framing both as manifestations of informational richness and adaptation-encoded dynamics in hierarchical neural systems. Therefore, we have resolved Chalmers' hard problem both with emergent properties (Sections 3.4-3.7) and without them (Sections 2.4.4.4 and 2.4.4.5). The clinical data on cortical areas (Table 3.2.1) further corroborate this, showing that fault-tolerant mechanisms and structure-function correlations underpin both phenomenological and functional aspects of consciousness. These formalizations enable rigorous, computable analyses of consciousness, bridging theoretical divides through empirical and mathematical rigor.

3.8 The hard problem of consciousness: a reassessment of its metaphysical status

Paradigm constraints and framework limitations, analogous to Wittgenstein's 'language games', restrict what can be meaningfully addressed in metaphysics. What appear as profound questions may dissolve through shifts in conceptual coherence and advances in science and technology, as frameworks evolve. History reveals many pseudo-metaphysical questions that were eventually dispelled by empirical progress: (1) Luminiferous ether (physics): once posited as a necessary medium for light propagation, invalidated by relativity theory. (2) Élan vital (biology): a hypothetical life force replaced by biochemical explanations. (3) Phlogiston (chemistry): an assumed element of combustion refuted by oxidation chemistry.

The hard problem of consciousness, why subjective experience arises at all, is now progressing along this trajectory. Advances in neuroscience are transforming what appeared to be an intractable philosophical question into a domain of measurable mechanisms: quantifiable mental knots (Section 1.7.3.5.1), meta-metrics (Sections 2.4.4.4 and 2.4.4.5), and emergent properties (Sections 3.4-3.7). Evolving scientific frameworks reclassify this issue from pseudo-metaphysical speculation to empirical investigation.

3.9 Integration of perceptions and the center of self-awareness

Now we are in a good position: we have established that perception acts as the logit function in the bilinear self-information manifold (Section 1.7.3.5.4) and that decision acts as the MK meta-metric under cognitive tension (Section 1.7.3.5.1). Additionally, we have established quantifiable mental knots (Section 1.7.3.5.1), meta-metrics (Sections 2.4.4.4 and 2.4.4.5), and emergent properties (Sections 3.4-3.7). We also have Equations 2.1.2, 2.1.3, 2.1.28 and the logit function to quantify auditory perception, and Equations 2.4.3 and 3.5.3 to quantify visual perceptions. We can then use the Gamma-distribution conjugate prior of Bayesian inference for visual information (Sections 2.4.4.4 and 2.4.5) to integrate multi-modular perceptions.

3.9.1 Integration of perceptions

3.9.1.1 Auditory perception

In this analysis, we continue to use the auditory simple neuron as a foundational case to systematically quantify auditory perception, replicating here the logit definition (the canonical parameter θ) and associated functions for clarity:

θ ≡ logit(p) = ln(p/(1-p)) = ln(p) - ln(1-p) (1.7.13)

The perception wave function ψ_AP of simple neurons in the primary auditory cortex with an amplitude A:

ψ_AP = A e^(i(k·r - ωt)) (2.1.2)

This perceptual wave function encodes tonotopically invariant frequency information across hierarchical neural structures, with characteristic frequencies mapped through congruence subgroups Γ₁(n) that preserve translational invariance from cochlear hair cells to cortical processing layers (Section 2.1.2). The classification of invariant neural representations is governed by congruence subgroups (Section 1.6), which act as discrete symmetry groups preserving the topological order of neural phase configurations. These subgroups arise naturally in the modular symmetry of frequency-lattice mappings from cochlear hair cells to cortical columns. Crucially, the neural state space is a finite moduli space (Section 1.6), a quotient space of neural configurations modulo topological equivalence under Γ₁(n) or Γ₀(n) (Section 2.4.1). This space is finite-dimensional and discrete, reflecting the bounded resources of biological systems, and stands in contrast to infinite-dimensional Hilbert spaces of conventional quantum mechanics. Nevertheless, the presence of intrinsic quantum coherence and topological protection (Section 2.1.6.6) justifies the use of wave function formalism not as an analogy, but as a physically grounded description of neural information encoding.

The rate and probability of the perceptual wave function are defined as:

λ = 2πf|A|, p = |A|² (2.1.3)

Auditory perception (θ_AP) is then parameterized via the Bernoulli logit transformation:

θ_AP = ln(p/(1-p))

For neurons with frequency adaptation, the energy constrained probability p_EI of emergent information follows (Section 3.6.1):

p_EI = 1 - e^(-(n_s - n)ε₀/ε₀) (3.9.1)

where n_s is the number of spikes of a simple neuron without frequency adaptation within the duration time, n is the spike number under frequency adaptation.

We consider the well-discussed 'internal time' example involving seven musical notes processed by auditory working memory. According to Equation 1.7.14 (Section 1.7.3.5.2), the perception θ_AP under i.i.d. events satisfies linear additivity. Thus, we have the expression of auditory perception θ_AP^FA under frequency adaptation (p_i = p_EI(i)):

θ_AP^FA = Σ ln(p_i/(1-p_i)) = Σ ln(e^((n_s - n_i)ε₀/ε₀) - 1) (3.9.3)
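
The identity used in Equation 3.9.3 can be checked symbolically: if p = 1 - e^(-x) with x = (n_s - n_i)ε₀/ε₀ (Equation 3.9.1, β = 1), then p/(1-p) = e^x - 1, so the logit reduces to ln(e^x - 1). The following minimal sketch verifies this and then evaluates the sum for a toy seven-note example; the individual spike counts n_i are hypothetical.

import math
import sympy as sp

x = sp.symbols('x', positive=True)
p = 1 - sp.exp(-x)                                  # Equation 3.9.1 with x = (n_s - n_i)*eps0/eps0
print(sp.simplify(p / (1 - p) - (sp.exp(x) - 1)))   # 0, hence ln(p/(1-p)) = ln(exp(x) - 1)

n_s, eps0 = 17, 1.0                                 # n_s as in Figure 3.6.1; eps0 = 1
n_i = [4, 5, 6, 7, 8, 9, 10]                        # hypothetical adapted spike counts for seven notes
theta_AP_FA = sum(math.log(math.exp((n_s - ni) * eps0 / eps0) - 1) for ni in n_i)
print(theta_AP_FA)                                   # Equation 3.9.3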

For perceptual decision-making θ_APD, the MK(p) local meta-metric (Equation 1.7.12) and logit(p) are integrated over a bilinear self-information manifold M to account for continuous signal interactions:

θ_APD = ∫_M MK(p_i) = Σ (2p_i - 1) TF(p_i) = Σ (2p_i - 1) F(p_i) S_i ln(2) (3.9.4)

where MK(p_i) (Equation 1.7.12) is defined as the mental knot metric quantifying cognitive tension in binary decisions; TF is defined in Equation 1.7.9 and derived from Equation 1.7.8. Fisher information F(p_i) and Shannon entropy S_i exhibit linear additivity due to the i.i.d. nature of discrete success probabilities p_i, ensuring MK(p_i) and θ_APD inherit this property. This aligns with the bilinear structure of perceptual manifolds described in Section 1.7.3.5.4, where algebraic logit transformations (θ coordinates) and geometric invariants like MK coexist as dual descriptors on the curved Fisher manifold (Section 1.7.3.5.4).

Additionally, as demonstrated in Section 3.4, frequency adaptation induces information emergence. Consequently, total auditory perception (θ_AP^total) combines baseline θ_AP, adaptive contributions θ_AP^FA, cognitive tension θ_APD, and realizable total digitized information (RDI), which includes the realizable emergent information (REI) and realizable creative information (RCI) (Sections 3.5.2-3.6.3):

θ_AP^total = θ_AP + θ_AP^FA + θ_APD + RDI (3.9.5)

The proposed auditory perception parameter θ_AP^total (Equation 3.9.5) unifies baseline perceptual processing (θ_AP), adaptive frequency contributions (θ_AP^FA), and realizable information metrics (RDI, REI, RCI), grounded in the biophysical principles of neural adaptation and energy-constrained computation. Frequency adaptation mechanisms (Section 3.4) enable emergent information quantification via RDI = W n_s (1 - e^(-(n_s - n)ε₀/ε₀)) (Equation 3.6.4), where W encodes multidimensional signal processing capacity through wavefunction dimensions (Equations 3.5.3, 3.5.4), while REI and RCI denote emergent feature integration and creative cognition. This framework aligns with structural-functional correlations observed in tonotopic hierarchies (Section 2.1.1) and energy-limited adaptation (Section 2.1.6), embedding logit-transformed perceptions into a geometric space where conscious experience emerges from integrated information flows constrained by neural energetics. The formulation bridges discrete neural encoding (e.g., spike timing, amplitude modulation) with emergent consciousness by rigorously linking biophysical parameters to computable metrics of perception and cognition.

In summary, we demonstrate how established definitions and equations can systematically quantify auditory perception in the auditory cortex by modeling neural responses as wave functions encoding frequency-specific information with translational invariance. Simple neurons are described via the wave function (Equation 2.1.2), where characteristic frequencies f and spatial tensors r capture tonotopic organization, while amplitude A modulates loudness perception according to non-linear relationships like equal-loudness contours (Figure 2.1.1) for low-frequency sensitivity. The dot product k·r in Equation 2.1.2 reflects the orientation projection and selectivity of simple cells, suggesting a form of spatial invariance that can be interpreted as phase stability under translation, or as an invariant angle in information geometry. Firing rates λ and probability estimates p (Equation 2.1.3) bridge neurophysiological measurements with Bayesian and frequentist interpretations of auditory processing, enabling statistical analyses of signal detection and belief updating in noisy environments. These frameworks integrate quantum-inspired phase-stability principles with mesoscopic neural dynamics, explaining how hierarchical tonotopy and energy-constrained computations underpin the transformation of acoustic inputs into perceptually meaningful representations within cortical circuits.

3.9.1.2 Visual perception and integration of multi-modular perceptions

The integration of visual perception into the total perceptual framework follows principles analogous to auditory processing, but is equipped with unique features, such as Bayesian-inferred meta-metrics. For visual perception (θ_VP), Equation 2.4.3 defines the inverse Radon transform (Section 2.3.1) reconstructing holographic signals from pixelized synapse inputs in extrastriate areas like V2/V3/V4, while Equation 3.5.3 formalizes digitized qualia creation via complex cell algorithms (Section 3.5).

Here, we replicate these Equations for clarity:

p_v = |ψ_V|² = ∏ (F_k/π) exp(-F_k r_k²) (2.4.3)

Equation 3.5.3 synthesizes the frequency-adaptive wave function of complex neurons as:

ψ_c = Σ δ(t - mΔt) exp(-i∫H(t)dt/ℏ) exp(-½ rᵀF r) exp(2πi Σ f_j t) (3.5.3)

For simplicity, let's first consider a simple visual neuron without frequency adaptation. The p_v in Equation 2.4.3 includes holographic information:

θ_VP = ln(p_v/(1-p_v)) = ln(∏ exp(-F_k r_k²) / (1 - ∏ exp(-F_k r_k²))) (3.9.6)
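
As a small illustration of Equation 3.9.6 (added here; the Fisher weights F_k and coordinates r_k are hypothetical placeholders), the following sketch evaluates p_v as the Gaussian product of Equation 2.4.3 and the corresponding logit θ_VP.

import math

F = [2.0, 1.5, 0.8]     # hypothetical Fisher information weights F_k
r = [0.3, 0.1, 0.4]     # hypothetical receptive-field coordinates r_k

p_v = math.prod(math.exp(-Fk * rk**2) for Fk, rk in zip(F, r))   # Gaussian product (Equation 2.4.3 form)
theta_VP = math.log(p_v / (1.0 - p_v))                           # Equation 3.9.6
print(p_v, theta_VP)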

Then, we consider a complex visual neuron (Equation 3.5.3) with frequency adaptation Σ θ_VP(m) and temporal frequencies Σ θ_VP(j). Moreover, Bayesian inference based on visual information, especially Equation 2.4.3, generates quantifiable meta-metrics including confidence, confidence gain (CG), mastery level (ML), achievement emotion (AE), and frustration feeling (FF) (Sections 2.4.4.4.5, 2.4.4.5). Frequency adaptation also induces visual emergent information. Together, we have total visual perception (θ_VP^total) including baseline Σ θ_VP(n), frequency contributions Σ θ_VP(m) + Σ θ_VP(j), visual cognitive tension θ_VPD, RDI and meta-metrics:

θ_VP^total = Σ θ_VP(n) + Σ θ_VP(m) + Σ θ_VP(j) + θ_VPD + RDI + CG + ML + AE/FF (3.9.7)

The visual perception framework θ_VP^total (Equation 3.9.7) integrates structural-functional principles analogous to auditory processing but incorporates unique features like Bayesian inference for confidence metrics. The core visual signal reconstruction follows Equation 2.4.3, where inverse Radon transform synthesizes holographic images from pixelized synaptic inputs in extrastriate areas (V2/V3/V4), with the logit-based perception metric θ_VP derived from probability distribution p_v reflecting spatial-frequency-amplitude-phase encoding. Complex visual neurons enhance this via Equation 3.5.3, embedding frequency-adaptive wavefunctions to generate emergent qualia through dimensionality W (combining color/contrast n, intensity m, and temporal frequencies j). Bayesian mechanisms formalized in Sections 2.4.4.4 and 2.4.4.5 enable quantification of meta-metrics: CG tracks learning efficiency, ML assesses skill acquisition, while emotional states (AE/FF) modulate perceptual feelings. The total visual perception θ_VP^total unifies baseline features (Σ θ_VP(n)), adaptive contributions (Σ θ_VP(m) + Σ θ_VP(j)), cognitive tension (θ_VPD for decision), and energy-constrained digitized information (RDI), forming a biophysically grounded model where conscious visual experience arises from integrated computational processes constrained by neural energetics and probabilistic inference.

Moreover, the visual perception θ_VP^total is derived through hierarchical Bayesian inference using Normal-Gamma conjugacy, mirroring auditory processing in Equation 3.9.5 but with additional unique features (Equation 3.9.7). Following Aristotle's categorization of the five senses, sight θ_VP^total, hearing θ_AP^total, taste θ_T^total, smell θ_O^total, and touch θ_S^total, the total integrated perception becomes:

θ = θ_AP^total + θ_VP^total + θ_T^total + θ_O^total + θ_S^total + ⋯ (3.9.8)

Here, the role of Gamma distribution as a conjugate prior (Sections 2.4.4.4 and 2.4.5) ensures efficient multisensory integration across Poisson-spiking neurons and exponentially distributed temporal encoding (Sections 1.2, 2.1.6). This Bayesian framework enforces linear additivity of perception metrics under i.i.d. sensory inputs, preserving computational efficiency (Section 3.6.2) through hierarchical Normal-Gamma dynamics.
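
To make the conjugacy concrete, a minimal sketch of the standard Gamma-Poisson update for a Poisson-spiking neuron is given below; the prior parameters and spike counts are hypothetical, and the full hierarchical Normal-Gamma machinery of Sections 2.4.4.4 and 2.4.5 is not reproduced.

# Standard Gamma-Poisson conjugacy: with a Gamma(a, b) prior (shape a, rate b) on the firing
# rate and observed spike counts k_1..k_T in unit time bins, the posterior is Gamma(a + sum k, b + T).
a, b = 2.0, 1.0                   # hypothetical prior shape and rate
counts = [3, 5, 4, 6, 2]          # hypothetical spike counts per time bin
a_post = a + sum(counts)
b_post = b + len(counts)
print("posterior:", (a_post, b_post), "mean rate:", a_post / b_post)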

3.9.2 The center of self-awareness

An intriguing question in neuroscience is whether a centralized neural substrate for self-awareness or self-consciousness exists to unify distributed sensory perceptions into integrated conscious experience. The proposed self-awareness center must satisfy two defining features: spatiotemporal anchoring of the self (spatial positioning and state tracking) and homeostatic stability maintenance via negative feedback loop mechanisms. Consequently, the center of self-awareness likely emerges from the integrative network of multiple brain regions, rather than a single anatomical locus.

3.9.2.1 The homeostasis of self-awareness

In Section 2.4.4.2, we have discussed the neural basis of egocentric and allocentric spatial representations: the posterior parietal cortex (PPC) functions as a core egocentric processor, encoding real-time spatial relationships relative to the body (e.g., eye-, head-, or hand-centered coordinates), while the retrosplenial cortex (RSC) acts as a critical hub for transforming and integrating egocentric signals from the PPC with allocentric representations generated by the hippocampal formation. This dual role positions the RSC not only as a high-level visual-association cortex, but also as a key node in a Bayesian inference engine that maintains perceptual coherence under uncertainty (Sections 2.4.4.4 and 2.4.5). Consequently, the PPC and RSC are fundamental components of a self-awareness center responsible for spatiotemporal anchoring of the self, which is the continuous process by which the brain maintains a stable, situated representation of "where I am" and "who I am" across time and context.

A closed-loop architecture for the homeostasis of self-awareness. We propose that this system operates as a minimal negative feedback control loop, analogous to homeostatic regulation in biological systems. Such a loop requires three canonical components: (1) Sensors: The PPC serves as the primary sensor, continuously monitoring egocentric spatial input via the dorsal visual stream, integrating visual, proprioceptive, and kinesthetic signals to track bodily position and movement in real time (Cohen and Andersen, 2002). (2) Comparator: The RSC functions as the comparator, receiving egocentric data from the PPC and aligning it with allocentric frameworks derived from the hippocampus and parahippocampal cortex via the entorhinal cortex (EC). This comparison generates a prediction error signal, a neural representation of the discrepancy between current sensory input and memory-based expectations. The dense connectivity of RSC with visual association areas, DMN nodes (e.g., mPFC, posterior cingulate), and limbic structures enables it to compute this error in a context-sensitive, self-referential manner. (3) Effector: The prefrontal cortex (PFC) (Friedman and Robbins, 2022), particularly the dorsolateral (dlPFC) and frontal eye fields (FEF), acts as the effector, initiating corrective actions, such as saccadic eye movements, reorientation, or cognitive reappraisal, through motor and executive control pathways. This completes the loop by adjusting behavior to reduce spatial or cognitive dissonance.
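
A deliberately abstract toy sketch of this sensor-comparator-effector loop is given below (an added illustration, not a biophysical model): a noisy egocentric reading stands in for the PPC, the comparison against an allocentric prediction stands in for the RSC, and a proportional correction stands in for the PFC; the gain and noise values are hypothetical.

import random

random.seed(0)
target = 0.0           # allocentric prediction of self-location (schematic hippocampal map)
state = 2.0            # current egocentric estimate maintained by the loop
gain = 0.5             # hypothetical corrective gain of the effector

for step in range(10):
    sensed = state + random.gauss(0.0, 0.05)    # sensor: noisy egocentric input (PPC role)
    error = sensed - target                     # comparator: prediction error (RSC role)
    state -= gain * error                       # effector: corrective action (PFC role)
    print(f"step {step}: error = {error:+.3f}, state = {state:+.3f}")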

Together, the PPC-RSC-PFC triad forms a closed-loop system that maintains the homeostasis of self-awareness, dynamically stabilizing the self in space and time through continuous sensor-comparator-effector coordination. The PPC-RSC-PFC triad center of self-awareness is a core component of self-consciousness as outlined in Section 3.1, where self-consciousness encompasses additional critical brain regions beyond this central hub, such as the anterior cingulate cortex (ACC), somatosensory cortex, and insula. Importantly, self-consciousness is not a rigid entity but a dynamic, hierarchical, and modular system integrating perceptual and cognitive processes. This structure enables binocular rivalry experiments, where competing visual inputs are resolved through interactions among specialized neural modules. Hierarchical processing across regions like the ACC, somatosensory cortex, and insula integrates sensory, emotional, and contextual information, facilitating the competition and alternation of awareness observed in such tasks, reflecting the flexible, context-sensitive organization of self-consciousness.

Bayesian hierarchical architecture of spatial cognition. The spatiotemporal anchoring of the self is inherently visually grounded and multisensory, requiring integration of dorsal stream spatial signals with ventral stream object recognition and mnemonic context. The RSC, as a transmodal association cortex, is uniquely positioned to mediate this integration. It interconnects with: PPC (egocentric input), mPFC/DMN (self-referential processing), visual areas (V1-V4, MT), hippocampus (allocentric cognitive maps), parahippocampal cortex (scene context), and EC (spatiotemporal binding). EC further subdivides into: Lateral EC (LEC): processes egocentric, object-related, and temporal information; Medial EC (MEC): encodes allocentric spatial metrics via grid cells, border cells, and head direction cells (Hafting et al., 2005).

This architecture supports a hierarchical generative model of spatial cognition. As modeled in Section 2.4.4.4.5, five-level Bayesian architectures enable human-like efficient learning. The PPC-RSC-hippocampus-EC network likely implements a double cover of six- or seven-level Bayesian hierarchy, wherein parallel, bidirectional processing streams (e.g., egocentric vs. allocentric, prediction vs. error) provide redundant and complementary inference pathways. This topological redundancy enhances robustness and supports continuous state estimation under uncertainty. The use of Normal-Gamma conjugate priors (Sections 2.4.4.4 and 2.4.5) enables efficient online spatiotemporal frame realignment within this structured belief space.

In this framework, the hippocampus and EC function analogously to specialized sensory hubs such as the fusiform face area (FFA) or visual word form area (VWFA), but for spatial and contextual inference with place cells constructing context-specific cognitive maps, grid cells providing a metric for path integration, boundary cells anchoring spatial representations to environmental geometry (Section 2.4.4.2). Crucially, like the inferotemporal cortex (ITC) where object representations are dynamically refined through perceptual inference (Section 2.4.4.4), the hippocampus and EC do not merely store static memories. Instead, they operate as active generative models, continuously predicting self-location and environmental structure by integrating prior experience with ongoing sensory input. This predictive function is maintained through frequent, online Bayesian updates, ensuring that spatial and contextual representations remain aligned with the current state of the world.

Clinical validation: Alzheimer's disease as a systems-level failure. The integrity of this closed-loop system is clinically validated by Alzheimer's disease (AD), which selectively targets the network that sustains self-awareness. Early hypometabolism in the RSC, PPC (precuneus), and hippocampal-EC circuits (Chételat et al., 2016) disrupts the energy-intensive processes required for Bayesian spatial updating, leading to mild disorientation in the early stage, topographical agnosia and navigation failure in the mid-stage, and prosopagnosia, familial misidentification, and ego dissolution in the late stage. These symptoms reflect a progressive collapse of the feedback loop: PPC dysfunction impairs real-time spatial sensing, RSC degradation disrupts comparator function and self-environment integration, and PFC atrophy erodes executive control and error correction. The result is dyshomeostasis of self-awareness, a breakdown in the ability of the brain to maintain the dynamic equilibrium underlying a coherent sense of self.

In summary, the homeostasis of self-awareness is maintained by a metabolically optimized, closed-loop system centered on the PPC-RSC-PFC-hippocampus network. This system operates through Bayesian predictive coding, integrating real-time sensory input with prior knowledge to anchor the self in space and time. Its vulnerability in AD underscores its high energy demand and functional centrality. Rather than being a passive correlate, this network is causally essential for the continuity of the self, making it a prime target for both theoretical neuroscience and therapeutic intervention in disorders of consciousness and identity.

3.9.2.2 Energy efficiency is a fundamental theme in brain cognitive function

At the beginning of Section 1.1, we have discussed that the energy efficiency of the brain is approximately 9 × 10⁸ times greater than contemporary computing and artificial intelligence systems (Stiefel and Coggan, 2023), suggesting that the brain's architecture and function have been evolutionarily optimized under stringent energy constraints. This extraordinary efficiency is supported by multiple mechanisms across different spatial and temporal scales.

We have demonstrated that frequency adaptation can significantly reduce energy consumption, particularly in terms of quantization energy, as illustrated in Figure 3.4.1 and discussed in Section 3.4. The mechanism of frequency adaptation is further supported by both simulation and analytical solutions, including the topological protection of quantum neuronal states via creatine dipoles (Section 2.1.6), which may enhance the stability and energy efficiency of neural signaling at the molecular level.

In Section 2.4.4.4, we have explored the cognitive implications of Normal-Gamma dynamics, which provide a Bayesian framework for understanding how the brain optimizes energy use during cognitive tasks. These dynamics explain how the brain balances precision and uncertainty in perception and decision-making, particularly under aging, where increased reliance on prior expectations reflects an energy-saving strategy by reducing the need for real-time sensory processing.

Figure 3.6.1 (Section 3.6.2) illustrates how the energy efficiency of information emergence can account for the limited capacity of working memory, typically 3-4 items, by modeling the trade-off between information gain and metabolic cost. This principle extends to the quantification of realizable information emergence under energetic constraints, discussed in Section 3.6.3, which has potential clinical applications in assessing cognitive resilience and dysfunction.

As summarized in Section 3.2 and Table 3.2.1, RSC and PFC exhibit early and pronounced hypometabolism in both AD and multiple forms of depression, suggesting a shared metabolic vulnerability across neurodegenerative and psychiatric conditions. Crucially, RSC and PPC are identified as core components of the self-awareness network, playing pivotal roles in the default mode network (DMN). The DMN, active during rest and self-referential tasks, generates persistent baseline neural activity with elevated metabolic demands, supporting homeostatic regulation of self-referential processes and their susceptibility to dysfunction in neurological and psychiatric conditions. This network integrates egocentric and allocentric spatial representations, enabling coherent navigation and self-location in the environment.

Moreover, PFC, hippocampus, RSC, and PPC form a hierarchically organized Bayesian inference system that supports spatial cognition, memory integration, and self-referential processing through continuous belief updating under uncertainty. The RSC acts as an "integrator-orchestrator," translating between different spatial reference frames and maintaining perceptual coherence through intact entorhinal-hippocampal-RSC loops. Disruption of this system, as seen in AD, leads to early hypometabolism in RSC, PPC, and hippocampal networks, impairing spatial frame transformations and self-awareness. Progression of AD reflects an energy crisis in the self-awareness center, manifesting as disorientation, topographical agnosia, prosopagnosia, and ego dissolution.

In summary, the cognitive architecture of the brain is fundamentally shaped by energy constraints, with mechanisms ranging from molecular-level quantum protection to large-scale network dynamics all contributing to energy optimization. The PPC-RSC-PFC system of the self-awareness center exemplifies how energy efficiency is not merely a byproduct but a guiding principle in the evolution and operation of neural systems, with profound implications for both artificial intelligence and the understanding of neurodegenerative diseases.

3.9.2.3 Alzheimer's disease as a consequence of an energy crisis in the self-awareness center

Alzheimer's disease (AD) is increasingly recognized not merely as a proteinopathy (amyloid-β and tau), but as a systemic metabolic failure centered on brain networks with high baseline energy demands, particularly the default mode network (DMN) and its core constituents: the posterior parietal cortex (PPC), retrosplenial cortex (RSC), prefrontal cortex (PFC), and hippocampal formation. This network constitutes a self-awareness center that maintains egocentric-allocentric spatial integration, self-referential cognition, and perceptual coherence through energy-intensive, dynamically balanced processes.

Early-stage AD: hypometabolism as the first sign of energy crisis. AD pathophysiology begins with early and selective hypometabolism in the RSC, PPC (especially precuneus), and hippocampal networks (Chételat et al., 2016). These regions exhibit among the highest baseline glucose metabolism in the resting brain, reflecting their role in continuous, unconscious spatial updating and self-location monitoring. The onset of reduced ATP availability disrupts the spatial frame transformations that convert between egocentric (body-centered) and allocentric (world-centered) representations, leading to mild disorientation, often one of the earliest clinical signs. This hypometabolism is not merely a consequence of neurodegeneration but likely a primary driver, as energy failure impairs synaptic maintenance, ion homeostasis, and neurotransmitter recycling, particularly in long-range, high-bandwidth connections.

Mid-stage AD: breakdown of reference frame integration. As the energy deficit progresses, the RSC, acting as the "integrator-orchestrator" of spatial cognition, loses its ability to mediate transformations between egocentric (PPC-driven) and allocentric (hippocampal-driven) reference frames. This failure manifests as severe allocentric-egocentric integration deficits, resulting in topographical disorientation and topographical agnosia—inability to navigate familiar environments. The EC-hippocampus-RSC loop, essential for Bayesian spatiotemporal updating (Sections 2.4.4.4, 2.4.5 and 3.9.2.1), becomes dysfunctional. Normally, this loop integrates sensory input with prior expectations to maintain a stable, probabilistic model of self-location. Under energy constraint, predictive coding fails: the brain cannot efficiently update its internal model, leading to perceptual incoherence and spatial confusion.

Late-stage AD: collapse of self-referential processing. With progressive atrophy of the RSC, PPC, and mPFC, self-referential processing rooted in RSC-PFC interactions erodes. This network supports autonoetic consciousness: the ability to mentally time-travel and recognize oneself across contexts. Its degradation leads to: prosopagnosia (face recognition loss), due to disrupted integration of facial identity with self-referential context, and familial misidentification, reflecting a breakdown in the self-world distinction, a core function of the RSC-mediated Bayesian system. These symptoms are not isolated cognitive deficits but emergent phenomena of a collapsing self-model, sustained only by continuous, energy-dependent neural computation.

The self-awareness center as a negative feedback loop. The PPC-RSC-PFC system operates as a negative feedback loop that maintains homeostasis of self-awareness: PPC provides real-time egocentric sensory input; RSC transforms and integrates egocentric frames with hippocampus maintaining allocentric cognitive maps; PFC stabilizes self-referential context and executive control. This loop is metabolically costly but evolutionarily optimized for energy efficiency through mechanisms such as frequency adaptation, sparse coding, and topological protection of quantum neuronal states via creatine-phosphocreatine (PCr) systems (Section 2.1.6). When energy supply fails due to vascular dysfunction and mitochondrial impairment, this loop destabilizes, leading to catastrophic failure of self-coherence.

AD as a disease of energy inefficiency. Thus, AD is best understood not just as a neurodegenerative disorder, but as a failure of energy homeostasis in the self-awareness center. The extreme energy efficiency of the brain (~10⁹ times greater than AI systems) (Stiefel and Coggan, 2023) is maintained by finely tuned, high-demand networks. The RSC-centered system, due to its integrative role and high metabolic rate, is uniquely vulnerable to energy insufficiency. This perspective reframes AD pathogenesis: amyloid and tau may be secondary responses to chronic energy stress, rather than primary causes. This view is fully aligned with our earlier synthesis (Liu et al., 2019), which identified cerebral hypometabolism and impaired autophagy as upstream, causative factors in neural atrophy and AD progression. Rather than mere correlates, these metabolic deficits disrupt proteostasis, compromise synaptic resilience, and ultimately trigger pathological protein accumulation. Consequently, therapeutic strategies should prioritize enhancing metabolic resilience, restoring mitochondrial function, and optimizing interventions that target the root causes of neurodegeneration, rather than solely focusing on the downstream clearance of protein aggregates.

In conclusion, Alzheimer's disease represents a systemic energy crisis in the brain's self-awareness center, where the RSC acts as the pivotal hub whose dysfunction cascades into spatial disorientation, memory fragmentation, and ultimately, the dissolution of self. This framework unifies metabolic, cognitive, and clinical observations under the principle of energy efficiency as the foundation of consciousness and cognition, offering a more holistic and mechanistically grounded understanding of neurodegeneration.

4.1 Symmetric bilinear hyperbolic self-information spaces

The information processing of the brain may be understood through the lens of geometric statistics and dynamical symmetry. Neuronal spiking activity, modeled as Bernoulli processes (Sections 1.7.3.4-1.7.3.5), gives rise to a one-dimensional statistical manifold whose intrinsic geometry is governed by the Fisher-Rao metric. However, beyond this local statistical structure, global neural representations may emerge within a bilinear hyperbolic information space, exemplified by the upper half-plane ℍ = {x + iy ∈ ℂ | y > 0} under the action of discrete subgroups of SL(2, ℝ), such as the modular group SL(2, ℤ) and its congruence subgroups (Section 1.6).

This hyperbolic manifold supports a duality between probabilistic spectral parameters and geometric invariants. Let p ∈ (0,1) denote a spiking activity parameter, with its dual q = 1 - p. The symmetric quantity p(1-p) plays a dual role: In information geometry, it determines the curvature scale of the Fisher metric g(p) = 1/(p(1-p)) dp², corresponding to the metabolic cost of maintaining spiking precision with higher curvature indicating greater ATP expenditure. In hyperbolic geometry, p(1-p) corresponds to the eigenvalue of the hyperbolic Laplacian Δ:

Δ = -y²(∂²/∂x² + ∂²/∂y²) (4.1.1)

which acts on automorphic forms on the upper half-plane ℍ, or on the quotient space SL(2, ℤ)\ℍ, such as Eisenstein series and Maass wave forms in number theory (Kurokawa et al., 2012). In particular, Eisenstein series are eigenfunctions of the Laplacian operator Δ with the eigenvalue s(1-s) on the upper half-plane ℍ (Kurokawa et al., 2012). Automorphic forms on SL(2, ℤ)\ℍ can represent invariant neural codes under discrete symmetry transformations, such as grid cell firing patterns in the entorhinal cortex.
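
As a small symbolic check of this eigenvalue parameterization (added here for illustration): the function y^s, the leading constant-term contribution of the non-holomorphic Eisenstein series, satisfies Δ y^s = s(1 - s) y^s for the Laplacian of Equation 4.1.1.

import sympy as sp

x, y, s = sp.symbols('x y s', positive=True)
f = y**s
laplacian_f = -y**2 * (sp.diff(f, x, 2) + sp.diff(f, y, 2))   # Equation 4.1.1 applied to f
print(sp.simplify(laplacian_f - s * (1 - s) * f))             # 0: eigenvalue lambda = s(1 - s)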

In addition, in the isothermal coordinate θ (the logit transform), the metric takes the form ds² = p(1-p) dθ² (Equation 1.7.15); the coefficient p(1-p) in the θ coordinate carries the same invariant content as the Fisher metric expressed in the coordinate p (Section 1.7.3.5.3). Notably, p(1-p) is also the variance of the Bernoulli distribution (Section 1.7.2) and can be interpreted as a Hamiltonian H_p = p(1-p) (Section 1.8.3).

The Hamiltonian H_p can represent the invariant Hamiltonian H_AP (~10⁶ ATP/PCr/creatine) required for a single action potential under the time-evolution of the wave function (Equation 2.1.28). Consistently, p(1-p) is also invariant as the metric in isothermal coordinates θ. The Hamiltonian can be considered as an exponential map of a symmetric bilinear self-information form:

H_p = p(1-p) = e^(ln p + ln q) = e^(-[(-ln p) + (-ln q)]) (4.1.2)

which is in contrast to the canonical parameter logit function, an antisymmetric bilinear self-information form under p ↔ 1-p (Equation 1.7.13).

The Hamiltonian H_p induces a Hamiltonian vector field ∇H_p = (1-2p) ∂/∂p. Here, the tangent vector fields X₁ = (1-p) ∂/∂p and X₂ = p ∂/∂p naturally form elements of a Lie algebra, satisfying the properties of a Lie algebra: bilinearity, antisymmetry, and the Jacobi identity. The non-commutativity of X₁ and X₂ is evident from the non-vanishing Lie brackets:

[X₁, X₂] = X₁X₂ - X₂X₁ = ∂/∂p (4.1.3)
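
A quick symbolic verification of this bracket (an added check) applies X₁ and X₂ to a generic function f(p) and confirms that their commutator is the first-order field ∂/∂p.

import sympy as sp

p = sp.symbols('p')
f = sp.Function('f')(p)
X1 = lambda g: (1 - p) * sp.diff(g, p)    # X1 = (1 - p) d/dp
X2 = lambda g: p * sp.diff(g, p)          # X2 = p d/dp
print(sp.simplify(X1(X2(f)) - X2(X1(f)))) # Derivative(f(p), p), i.e. [X1, X2] = d/dp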

The dual cotangent fields Y₁ = (1-p) dp and Y₂ = p dp are not elements of a Lie algebra, but their dual vector fields (e.g., Y₁/(1-p), Y₂/p) generate a Lie algebra under the Lie bracket [Y₁/(1-p), Y₂/p] = -1/(p²(1-p)²). However, these single-parameter Lie groups are insufficient to generate the non-abelian structure of sl(2, ℝ), which requires three independent generators with non-trivial commutators.

4.2 Quaternionic extension of the Hamiltonian framework

In Section 2.4.5.3.2.1, we have discussed that the Hamiltonian framework in symplectic geometry inherently preserves energy through a closed, non-degenerate 2-form (Feng and Qin, 2010). Therefore, a quaternionic representation is necessary and sufficient for embedding the Hamiltonian H_p into the quaternionic upper half-plane ℍ = {a + bi + cj + dk ∈ ℍ | b,c,d > 0}, equipped with the hyperbolic metric. To further expose the arithmetic depth of this framework, a key step is to extend the parameter p to a quaternionic coordinate on a hyperbolic manifold:

p = ½ + i r₁ + j r₂ + k r₃, where r = (r₁, r₂, r₃) ∈ ℝ³ (4.2.1)

with the quaternionic identities i² = j² = k² = ijk = -1. The constraint p + q = 1, where q = 1 - p = p̅ (the conjugate of p), is preserved:

p + q = [½ + i r₁ + j r₂ + k r₃] + [½ - i r₁ - j r₂ - k r₃] = 1 (4.2.2)

To simplify notation, define r as a quaternionic vector with components r₁, r₂, r₃. This yields the relation:

p = ½ + i r, ‖r‖² = r₁² + r₂² + r₃² (4.2.3)

p(1-p) = [½ + i r][½ - i r] = ¼ + ‖r‖² (4.2.4)

where H_p = ¼ + ‖r‖² generates dynamics via Poisson bracket ṙ_i = {r_i, H_p}, derived from the symplectic structure of ℍ. This corresponds to the invariant Hamiltonian H_AP translating along axons under time-evolution of the wave function (Equation 2.1.28), establishing a direct link between geometric dynamics and neural information processing in hyperbolic spaces.
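
The scalar identity in Equation 4.2.4 can be checked with an explicit Hamilton product (a minimal added sketch): multiplying p = ½ + i r₁ + j r₂ + k r₃ by its conjugate q = 1 - p returns the scalar ¼ + ‖r‖² with vanishing imaginary parts.

import sympy as sp

def hamilton(q1, q2):
    # Hamilton product of quaternions written as tuples (a, b, c, d) = a + b*i + c*j + d*k
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

r1, r2, r3 = sp.symbols('r1 r2 r3', real=True)
p = (sp.Rational(1, 2),  r1,  r2,  r3)          # Equation 4.2.1
q = (sp.Rational(1, 2), -r1, -r2, -r3)          # q = 1 - p, the quaternionic conjugate of p
print([sp.expand(c) for c in hamilton(p, q)])   # [1/4 + r1**2 + r2**2 + r3**2, 0, 0, 0]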

Pure-imaginary quaternions (three independent generators) are isomorphic to 3 × 3 real antisymmetric matrices, the Lie algebra so(3). For any square matrix M, det(e^M) = e^(tr(M)). Real antisymmetric matrices satisfy tr(M) = 0, so det(e^M) = 1. Therefore, e^M belongs to the SO(3) group; its action is volume-preserving (flux or density remains constant). If the Hamiltonian of neural activity (e.g., direction-selective neurons in visual cortex) is invariant under SO(3) rotations, these imaginary quaternionic generators leave it unchanged while maintaining unit determinant. SO(3) invariance ensures that direction-selective neurons conserve information flux, minimizing metabolic cost in orientation-selective circuits.
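
A small numerical illustration of the volume-preserving claim (added here): exponentiating a random 3 × 3 real antisymmetric matrix yields an element of SO(3) with unit determinant.

import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
M = A - A.T                              # real antisymmetric matrix: tr(M) = 0
R = expm(M)                              # matrix exponential
print(np.linalg.det(R))                  # ~1.0, since det(e^M) = e^(tr M) = 1
print(np.allclose(R @ R.T, np.eye(3)))   # True: R is a rotation (orthogonal)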

Split-quaternions are used to model the non-compact Lie algebra sl(2, ℝ), for example in models of perceptual white balance (Berthier et al., 2023). Unlike standard quaternions, split-quaternions have a basis {1, i, j, k} with i² = -1, j² = k² = 1, and ij = k = -ji. They are isomorphic to the algebra of 2×2 real matrices. Under this isomorphism, the trace-zero matrices forming sl(2, ℝ) correspond exactly to the pure split-quaternions (those with no scalar part). The commutator of split-quaternions then reproduces the Lie bracket of sl(2, ℝ), giving a simple, intrinsic embedding. A notable element s = ½(1 + j) within the split-quaternion algebra, where j² = 1, exhibits dual properties as both an idempotent (s² = s) and a zero divisor: its product with its non-zero conjugate counterpart s̅ = ½(1 - j) vanishes (s s̅ = 0).

Crucially, split-quaternionic representations align naturally with self-adjoint Hamiltonian operators because their algebra supports Hermitian matrices (H = H†) without requiring artificial constraints. In the quantization of neuronal action potential dynamics, the Hamiltonian operator H governing the system must be self-adjoint (i.e., H = H†) to guarantee two critical properties: real energy eigenvalues of H and unitary time evolution U(t) = e^(-iHt/ℏ). In this framework, the Hamiltonian's self-adjointness is not just a mathematical requirement but a foundational constraint that aligns quantum-inspired models of neural dynamics with observed biological reality, such as frequency adaptation constrained by creatine dipole dynamics discussed in Section 2.1.6.

4.3.1 Maass wave forms, Riemann zeros and primes

In the frequency domain, the modular group SL(2, ℤ) gives rise to real analytic automorphic Maass wave forms in number theory (Kurokawa et al., 2012). Maass wave forms are non-holomorphic automorphic eigenfunctions of the hyperbolic Laplacian Δ, satisfying the eigenvalue equation Δ𝑓 = 𝜆𝑓 with 𝜆 = 𝑠(1 − 𝑠) (Equation 4.3.1). This equation exhibits the symmetry 𝑠 ↔ 1 − 𝑠, which leaves the eigenvalue 𝜆 invariant (Kurokawa et al., 2012). Geometrically, this symmetry is realized through the invariance of the upper half-plane ℍ under the modular group SL(2, ℤ), where fractional linear transformations preserve the hyperbolic structure (Section 1.6).

This framework enables a quaternionic extension of classical and quantum systems, as explored in Section 4.2. Here, we focus on the interplay between Maass waveforms—non-holomorphic eigenfunctions of the hyperbolic Laplacian—and the Riemann zeta function, which encodes deep arithmetic and geometric structures. Non-holomorphic Eisenstein series, a class of Maass waveforms, satisfy identities linking their L-functions to products of Riemann zeta values (Kurokawa et al., 2012):

𝐿(𝑠, 𝐸(1/2 + 𝑖𝑟, ∙)) = 𝜁(𝑠 + 𝑖𝑟)𝜁(𝑠 − 𝑖𝑟) (4.3.2)

where 𝐸(1/2 + 𝑖𝑟, ∙) is the non-holomorphic (Maass) Eisenstein series on SL(2, ℤ)\ℍ, and 𝐿(𝑠, 𝐸(1/2 + 𝑖𝑟, ∙)) is its associated Dirichlet series (the Mellin transform of its Fourier coefficients), which equals 𝜁(𝑠 + 𝑖𝑟)𝜁(𝑠 − 𝑖𝑟). Here 𝜁(∙) denotes the Riemann zeta function. When 𝑠 equals 1/2, this reduces to:

𝐿(1/2, 𝐸(1/2 + 𝑖𝑟, ∙)) = 𝜁(1/2 + 𝑖𝑟)𝜁(1/2 − 𝑖𝑟) (4.3.3)

Under the Riemann hypothesis, all non-trivial zeros of ζ(s) lie on the critical line Re(s) = 1/2. Thus, the product 𝜁(1/2 + 𝑖𝑟)𝜁(1/2 − 𝑖𝑟) involves conjugated non-trivial zeros of ζ(s), ensuring that the product is real-valued and reflects the pairing of zeros on the critical line.

The Riemann zeta function has a symmetric form, often referred to as the completed zeta function or Riemann xi function, which is a central object in analytic number theory (Spigler, 2025). The xi function 𝜉(𝑠) is defined as:

𝜉(𝑠) = 𝑠(𝑠 − 1)𝜋^(-𝑠/2)Γ(𝑠/2)𝜁(𝑠) (4.3.4)

The functional equation of the Riemann zeta function is given by:

𝜁(𝑠) = 2^𝑠𝜋^(𝑠−1)sin(𝜋𝑠/2)Γ(1 − 𝑠)𝜁(1 − 𝑠) (4.3.5)

The xi function 𝜉(𝑠) eliminates the explicit symmetry-breaking terms (e.g., the sine factor), resulting in perfect symmetry under 𝑠 ↔ 1 − 𝑠:

𝜉(𝑠) = 𝜉(1 − 𝑠) (4.3.6)

This symmetry reflects the deep connection between the non-trivial zeros of 𝜁(𝑠) and the critical line Re(s)=1/2. The symmetric form 𝜉(𝑠) of the Riemann zeta function underscores the crucial role of the Riemann Hypothesis in connecting the distribution of zeta zeros to the structure of prime numbers.

The Riemann zeta function 𝜁(𝑠) is conventionally defined for complex numbers s with Re(s)>1 by the Dirichlet series and the Euler product formula, which connects it to the prime numbers:

𝜁(𝑠) = ∏_𝑝 (1 − 𝑝^(−𝑠))^(−1) (4.3.7)

This reveals a profound link between additive number theory (via the Dirichlet series) and multiplicative arithmetic (via primes). This duality mirrors global-local correspondences in mathematics: global phenomena (e.g., zeros of 𝜁(𝑠)) are deeply tied to local arithmetic data (e.g., primes). Under the Riemann hypothesis, Equations 4.3.3 and 4.3.7 demonstrate that the standard L-function of the non-holomorphic Eisenstein series (Maass forms with spectral parameter 1/2 + 𝑖𝑟, Equation 4.1.10) encodes a direct relationship between the distribution of non-trivial zeros of the Riemann zeta function and arithmetic properties of prime numbers.
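
Equation 4.3.7 can be illustrated numerically (an added sketch): for s = 2, the truncated Euler product over small primes and the truncated Dirichlet series both approach ζ(2) = π²/6.

import math

def primes_up_to(limit):
    # Simple sieve of Eratosthenes
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit**0.5) + 1):
        if sieve[i]:
            sieve[i*i::i] = [False] * len(sieve[i*i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

s = 2.0
dirichlet = sum(n**-s for n in range(1, 100000))                      # truncated Dirichlet series
euler = math.prod(1.0 / (1.0 - p**-s) for p in primes_up_to(1000))    # truncated Euler product
print(dirichlet, euler, math.pi**2 / 6)                               # all approximately zeta(2)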

The Selberg trace formula establishes a profound connection between spectral geometry and number theory by relating the eigenvalues of the Laplace operator on a Riemannian manifold to its geometric invariants. For finite-volume hyperbolic surfaces such as the modular surface SL(2, ℤ)\ℍ, this formula takes a symmetric form where the discrete spectrum of the Laplacian, parameterized by 𝜆 = 𝑠(1 − 𝑠), corresponds to the lengths of closed geodesics (Blackman and Lemurell, 2016):

∑_spectral terms ℎ(𝜆) = ∑_geometric terms 𝑔(𝑙) (4.3.8)

where ℎ is an even test function satisfying ℎ(𝜆) = ℎ(−𝜆), and 𝑔 is the Fourier transform of ℎ. This duality implies that the distribution of zeta zeros is deeply connected to the hyperbolic geometry of modular spaces, forming a framework integrating arithmetic, analysis, physics, and information theory. In this context, global neural representations in bilinear hyperbolic spaces align with Selberg's structure: the geometric terms correspond to zeros (𝑔(𝑙) = 𝑔(0), e.g., the δ-defined spike firing in Equation 2.1.28) and the terms on the left side reflect energy costs (the quantized H_P = nH_AP from Equation 2.1.28). The following equation (4.3.9) realizes this framework:

∑_spectral terms δ(H_p − H_AP) = ∑_geometric terms δ(H_AP Δt m) (4.3.9)

where H_AP represents the Hamiltonian (~10⁶ ATP/PCr/creatine) required for a single action potential over the characteristic time Δt. The delta function on the right side encapsulates a quantization condition on energy constraints, requiring the ratio H_AP Δt/ℏ to be an integer multiple (m) of π/2, as originally defined in Equation 2.1.28. For frequency adaptation, the condition requires that m ≥ 3.

Equations 4.1.2, 4.2.4, 4.3.1, 4.3.3, 4.3.6 and 4.3.9 collectively suggest that binary decision-making or perceptual dichotomies in brain information encoding can be naturally embedded in this symmetric geometric framework through the dualities 𝑝 ↔ 1 − 𝑝 and 𝑠 ↔ 1 − 𝑠. These symmetries reflect inherent properties of both the mathematical structures, such as modular forms and spectral data, and potential mechanisms for representing opposing states or choices. While the speculative analogies remain to be rigorously validated, the mathematical structures described here provide a robust foundation for interdisciplinary exploration. The interplay of Maass waveforms, zeta functions, and modular geometry exemplifies how abstract number theory can inform both pure mathematics and applied fields like information theory and computational neuroscience.

4.3.2 Primes and information encoding

The interplay between primes, Riemann zeros, and number theory is inevitable: information encoding demands uniqueness, security, and efficiency, all of which are naturally realized through the arithmetic irreducibility of primes and their distribution governed by Riemann zeros. Number theory, as the study of discrete structure, aligns perfectly with information science, which deals with discrete bits and symbols. Primes are the simplest discrete irreducibles, making them the natural foundation for representing identity, classification, and authentication. The congruence subgroups (Section 1.6), which classify invariant neural representations, further reinforce this connection by mirroring the structural principles of prime factorization.

For congruence subgroups, the quotient group Γ₁(N)/Γ(N) serves as an algebraic invariant reflecting the prime decomposition of N. Its homomorphic images and group-theoretic properties, such as simplicity, order, and cyclic factorization, are uniquely determined by distinct prime factors of N, rooted in the fundamental theorem of finitely generated congruence subgroups (Section 1.6). This correspondence ensures that arithmetic information about N is encoded directly in its group-theoretic structure, mirroring how primes underpin number theory and modular forms. These are reasons why the classification of invariant neural representations (such as invariant frequencies in hierarchically layered auditory tonotopic spaces in Section 2.1.2) is governed by congruence subgroups (Section 1.6).

In short, primes are not merely tools in information coding but embody mathematical identities that are both universal and unforgeable. Their deep ties to number theory and Riemann zeros ensure their role as fundamental carriers of discrete information, reflecting arithmetic laws that underpin the logic of information itself.

Intriguingly, 𝑝(1 − 𝑝) and 𝑠(1 − 𝑠) are dual observables: one geometric (associated with the quaternionic modulus of probability distributions) and one spectral (linked to eigenvalues of automorphic forms). Their coupling reflects a deep symmetry in the bilinear hyperbolic information manifold, where the interplay between geometric configurations and spectral data encodes structural constraints. This structure resonates with the Langlands program, where automorphic representations of GL(2) correspond to 2-dimensional Galois representations via reciprocity laws. In this context, the spectral data of Maass forms (governed by 𝑠(1 − 𝑠)) encode arithmetic invariants, including distributions of primes and Riemann zeros through their associated L-functions (Equations 4.3.3 and 4.3.7). The modular symmetry of these L-functions further ensures that information encoded in the spectral plane is invariant under transformations induced by congruence subgroups, establishing a bridge between analytic number theory and information-theoretic principles.

This duality underscores how arithmetic objects, such as primes, zeros, and automorphic forms, are not isolated constructs but interconnected components of a unified framework. The geometric interpretation of 𝑝(1 − 𝑝) as a measure of variance/Hamiltonian in probabilistic systems parallels the spectral interpretation of 𝑠(1 − 𝑠) as an eigenvalue problem in harmonic analysis. Together, they form a self-consistent system where discrete arithmetic structures (primes, factorizations) and continuous analytic objects (L-functions, modular forms) coexist, enabling robust encoding mechanisms that balance precision, scalability, and adaptability. Such principles may inform the design of future information systems, leveraging number-theoretic symmetries to achieve secure, efficient, and scalable representations of complex data.

4.4 GL(2) unification of bilinear self-information spaces under the Langlands program

Any square matrix 𝑀 can be uniquely expressed as the sum of a symmetric matrix 𝑀_sym = (𝑀 + 𝑀ᵀ)/2 and an antisymmetric (skew-symmetric) matrix 𝑀_anti = (𝑀 − 𝑀ᵀ)/2, satisfying:

𝑀 = 𝑀_sym + 𝑀_anti (4.4.1)

This follows directly from the definitions of symmetry (𝑀_symᵀ = 𝑀_sym) and antisymmetry (𝑀_antiᵀ = −𝑀_anti). To prove uniqueness, suppose an alternate decomposition exists: 𝑀 = 𝑆 + 𝐴, where 𝑆ᵀ = 𝑆 and 𝐴ᵀ = −𝐴. Subtracting the standard decomposition yields (𝑆 − 𝑀_sym) + (𝐴 − 𝑀_anti) = 0. Taking transposes of both sides and using 𝑆ᵀ = 𝑆, 𝐴ᵀ = −𝐴 gives (𝑆 − 𝑀_sym) − (𝐴 − 𝑀_anti) = 0; adding and subtracting the two equations shows 𝑆 − 𝑀_sym = 0 and 𝐴 − 𝑀_anti = 0. Hence, the decomposition is unique.
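
As a minimal numerical illustration of Equation 4.4.1 and its uniqueness argument, the following Python sketch (using NumPy, with an arbitrary example matrix) verifies the decomposition directly:

```python
import numpy as np

# Minimal numerical check of Equation 4.4.1 for an arbitrary 2x2 matrix.
M = np.array([[1.0, 2.0],
              [3.0, 4.0]])

M_sym  = (M + M.T) / 2    # symmetric part:      M_sym.T  ==  M_sym
M_anti = (M - M.T) / 2    # antisymmetric part:  M_anti.T == -M_anti

assert np.allclose(M, M_sym + M_anti)     # decomposition holds
assert np.allclose(M_sym.T, M_sym)        # symmetry
assert np.allclose(M_anti.T, -M_anti)     # antisymmetry

# Uniqueness: any alternative split S + A with S symmetric and A antisymmetric
# must coincide with M_sym and M_anti, because their difference would be both
# symmetric and antisymmetric, hence zero.
```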

This foundational structure underpins deeper mathematical unifications. For example, in the Langlands program, such decompositions appear when analyzing the Lie algebra of the general linear group gl(n). Specifically, within gl(2), the decomposition into symmetric and antisymmetric components corresponds to distinct subspaces that govern complementary behaviors: symmetric matrices encode geometric/spectral properties (e.g., invariant metrics in hyperbolic geometry or automorphic forms central to number theory), while antisymmetric matrices relate to informational and dynamical processes (e.g., symplectic structures for Hamiltonian systems and information geometry in Sections 1.7 and 1.8.4). This duality mirrors the Langlands correspondence's interplay between number-theoretic objects (e.g., automorphic representations) and geometric structures (e.g., Galois representations), where symmetric/antisymmetric components act as algebraic "building blocks" for synthesizing diverse mathematical domains. Thus, while Equation 4.4.1 is a linear-algebraic identity, its generalization to Lie algebras and representation theory reveals profound connections in modern mathematics, exemplified by the role of gl(2) as both a computational framework and an abstract unifier.

A symmetric bilinear form, such as the Riemannian metric 𝑑𝑠², defines a geometric structure that is preserved under coordinate transformations. In the context of the upper half-plane (Figure 1.6.1), the space of complex numbers with positive imaginary part, the hyperbolic metric 𝑑𝑠² = (𝑑𝑥² + 𝑑𝑦²)/𝑦² is invariant under the action of PSL(2, ℝ), a quotient of GL(2, ℝ). When restricted to the discrete subgroup SL(2, ℤ), this action produces the modular surface SL(2, ℤ)\ℍ, a central object in number theory. On this surface, Maass forms—smooth, non-holomorphic eigenfunctions of the hyperbolic Laplacian Δ = −𝑦²(∂²/∂𝑥² + ∂²/∂𝑦²) (Equation 4.1.1)—play a fundamental role. These functions satisfy Δ𝑓 = 𝜆𝑓, where the eigenvalue is parameterized as 𝜆 = 𝑠(1 − 𝑠) for a complex variable s (Equation 4.1.9). Each Maass form gives rise to an L-function 𝐿(𝑠, 𝑓) = ∑ₙ 𝑎ₙ 𝑛⁻ˢ, whose nontrivial zeros are conjectured to lie on the critical line Re(s) = 1/2 (Equation 4.1.11). This mirrors the Riemann Hypothesis and suggests a deep spectral origin for the distribution of prime numbers.
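
The invariance of the hyperbolic line element under this group action, and the reality of the eigenvalue 𝜆 = 𝑠(1 − 𝑠) on the critical line, can be checked numerically. The following Python sketch uses an arbitrary SL(2, ℤ) matrix, base point, and a hypothetical spectral parameter r purely for illustration:

```python
# Minimal check that the hyperbolic line element |dz|/y is invariant under a
# Mobius map from SL(2, Z); the matrix, base point, and displacement below
# are arbitrary illustrative choices.
a, b, c, d = 2, 1, 1, 1                  # integer matrix with det = 1 (in SL(2, Z))
z  = 0.3 + 0.7j                          # a point in the upper half-plane
dz = 1e-6 * (1 + 1j)                     # a small displacement

w  = (a * z + b) / (c * z + d)           # Mobius action on the upper half-plane
dw = dz / (c * z + d) ** 2               # first-order transformation of dz

print(abs(dz) / z.imag)                  # |dz| / y  before the map
print(abs(dw) / w.imag)                  # |dw| / Im(w) after the map: same value

# Spectral parameterization of Laplacian eigenvalues: for s on the critical line
# Re(s) = 1/2, lambda = s(1 - s) = 1/4 + r^2 is real.
r = 9.5337                               # hypothetical spectral parameter
s = 0.5 + 1j * r
print(s * (1 - s))                       # real-valued eigenvalue, imaginary part ~ 0
```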

Figure 4.4.1 Unified diagram: GL(2) bridges geometry, information, and number theory. The diagram illustrates how GL(2), acting via translations on its Lie algebra, induces metrics/forms that split into symmetric and antisymmetric components. These components lead to distinct mathematical domains: hyperbolic geometry through symmetric bilinear forms, and information geometry through antisymmetric bilinear forms. Langlands Correspondence connects automorphic forms and Galois representations. This unified perspective reveals the profound interplay between geometry, information, and number theory under the umbrella of GL(2).

In contrast, the Lie algebra of GL(2)'s antisymmetric part corresponds to 2-forms like 𝜔 = 𝑑𝑥 ∧ 𝑑𝑦, which define symplectic structures ensuring Hamiltonian conservation (Section 4.2). In information theory, such antisymmetry appears in the behavior of conjugate variables and response functions. A key example is the logit function, logit(𝑝) = log(𝑝/(1 − 𝑝)), which satisfies the antisymmetric identity logit(𝑝) = −logit(1 − 𝑝). This reflects the duality between a probability p and its complement, and positions the logit as a natural coordinate for perceptual and cognitive scaling (Section 1.7.3.5). The Fisher-Rao metric, given by 1/(𝑝(1 − 𝑝)), captures the intrinsic curvature of statistical models. It governs directional flows, such as entropy production ∇𝑆 and score dynamics. Thus, the split between symmetric and antisymmetric components of GL(2) mirrors the duality between static geometry and dynamic information processing.
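
A brief Python check of the logit antisymmetry and the Bernoulli Fisher-Rao coefficient discussed above (the probability values are arbitrary):

```python
import numpy as np

# Minimal check of the antisymmetric identity logit(p) = -logit(1 - p) and of
# the Bernoulli Fisher information F(p) = 1/(p(1 - p)).
def logit(p):
    return np.log(p / (1.0 - p))

p = np.array([0.1, 0.25, 0.5, 0.8])
assert np.allclose(logit(p), -logit(1.0 - p))   # duality p <-> 1 - p

fisher = 1.0 / (p * (1.0 - p))                  # Fisher-Rao metric coefficient
variance = p * (1.0 - p)                        # the p(1 - p) term
print(fisher, variance)                         # F(p) is the reciprocal of p(1 - p)
```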

The interplay between symmetric and antisymmetric structures in GL(2) reveals a dual nature: while symmetry defines static geometric frameworks (e.g., hyperbolic metrics, Maass forms), antisymmetry governs dynamic processes (e.g., symplectic structures, information flows, perceptual scaling). This duality is not merely formal but reflects a deep structural principle in mathematics. In this light, GL(2) emerges as a universal group that transcends disciplinary boundaries. Its symmetric substructures encode conservation laws in geometry and arithmetic, while antisymmetric components model irreversible processes like entropy production or cognitive inference. The Langlands correspondence completes this picture by framing both as facets of a single mathematical object: automorphic representations are eigenfunctions (spectral data) for number-theoretic L-functions, just as Maass forms are eigenfunctions for the hyperbolic Laplacian. This spectral unity, where geometric operators (e.g., Laplacian Δ) and arithmetic objects (e.g., Galois actions) share common eigenstructures, suggests that mathematics itself is a coherent system of symmetries. As shown in Figure 4.4.1, GL(2) acts as a bridge, revealing that geometry, information, and number theory are not isolated disciplines, but interconnected manifestations of a deeper mathematical unity.

4.5.1 Temporal dynamics on Lie group manifolds: revisiting duration time classification

In Section 2.1.4, three temporal classifications are defined: wall/real time (clock-based progression), duration time (interval measurement), and characteristic time (intrinsic system-specific timescale). The latter is redefined in Section 2.4.4.3 as the absolute refractory period for working memory capacity modeling. To enhance analytical clarity, two supplementary temporal metrics are introduced: inter-spike interval (ISI) denoting the time between consecutive spikes, and delay time from first spike (TFS), representing the latency following initial spiking activity. Equation 2.1.29 is reformulated in non-dimensionalized form as:

𝐸√𝑡 Δ𝑡 = 𝑘^(2/3) (4.5.1)

where 𝐸 represents the driving force of the Na⁺-K⁺ pump (Na⁺-K⁺ ATPase), and 𝜏 denotes a unitary dimension-normalization factor (the inverse of 𝐼 in Equation 2.1.28). This expression satisfies the right-hand-side condition of Equation 4.3.9 and aligns with Figure 2.1.3, where √𝑡 exhibits approximate linearity with spike count under sustained stimuli.

Intriguingly, using identical data from Figure 2.1.3, geometric insights emerge in Figure 4.5.1: the linear relationship between √𝑡 and ln(1+TFS) suggests that temporal dynamics operate on a Lie group manifold, where √𝑡 acts as an element of the group governing time-scale invariance. This structure implies a continuous symmetry underlying frequency adaptation, with TFS serving as infinitesimal generators for state evolution under time-translation operators. Specifically, the logarithmic transformation ln(1+TFS) corresponds to exponential mapping between the Lie algebra (tangent space at identity) and the group manifold, while √𝑡 reflects an invariant metric under this action. For small TFS values, a first-order approximation holds:

ln(1 + 𝑇𝐹𝑆) ≈ 𝑇𝐹𝑆 (4.5.2)

Substituting into the fitted regression (√𝑡 ∝ 2.8272 × ln(1 + 𝑇𝐹𝑆)) of Figure 4.5.1 yields the empirical relation:

𝑡 ∝ 2.8272² 𝑇𝐹𝑆² ≈ 8 𝑇𝐹𝑆² (4.5.3)

Since √𝑡 originates from Brownian diffusion (Equation 2.1.22), the scaling factor can be interpreted as a diffusion coefficient D = 1/8 of the infinitesimal generator (TFS) in a one-parameter subgroup, which is not arbitrarily chosen but uniquely determined by the intrinsic geometric constraints of SU(2).

As a 3-dimensional compact Lie group, SU(2) is the connected double cover of SO(3) and is homeomorphic to the 3-sphere S³, a constant-curvature manifold. The adjoint-invariant symmetric bilinear form on su(2), realized as pure quaternions (Dubois, 2006), is defined via generators 𝑇ⱼ = 𝜎ⱼ/2 (𝑗 = 1,2,3) where 𝜎ⱼ are Pauli matrices defined in Equations 1.6.7-1.6.9. Normalized to satisfy ⟨𝑇ⱼ, 𝑇ₖ⟩ = −Tr(𝑇ⱼ𝑇ₖ) = 𝛿ⱼₖ (using the identity Tr(𝜎ⱼ𝜎ₖ) = 2𝛿ⱼₖ), this metric uniquely aligns with both the su(2) semi-simple Lie algebra structure and a natural isometry with the standard Euclidean inner product on ℝ³. Combining this metric with Einstein's mean-square fluctuation formula (⟨d²⟩ = 2nDt) (Selmeczi et al., 2007) and SU(2) diffusion dimension n = 3 (matching its Lie algebra dimension) uniquely yields D = n/(4×2n) = 1/8. This synthesis bridges biophysics and Lie group theory, enabling geometric analysis of refractory period constraints in information encoding through exponential mappings on manifolds as discussed in Sections 4.1-4.4.
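
The trace identity Tr(𝜎ⱼ𝜎ₖ) = 2𝛿ⱼₖ and the arithmetic leading from the fitted slope 2.8272 to D = 1/8 can be evaluated directly; the following Python sketch simply checks these expressions as stated in the text:

```python
import numpy as np

# Verify the Pauli-matrix trace identity and evaluate the slope/diffusion arithmetic.
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_1
         np.array([[0, -1j], [1j, 0]], dtype=complex),    # sigma_2
         np.array([[1, 0], [0, -1]], dtype=complex)]      # sigma_3

traces = np.array([[np.trace(sj @ sk).real for sk in sigma] for sj in sigma])
assert np.allclose(traces, 2 * np.eye(3))                 # Tr(sigma_j sigma_k) = 2 delta_jk

slope = 2.8272                                            # fitted slope from Figure 4.5.1
print(slope ** 2)                                         # ~8, the scaling factor in t ~ 8 TFS^2

n = 3                                                     # SU(2) diffusion dimension
D = n / (4 * 2 * n)                                       # the expression given in the text
print(D)                                                  # evaluates to 1/8
```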

Figure 4.5.1 Linear relationship between √𝑡 and ln(1+TFS). TFS denotes the delay time from the first spike.

Importantly, the linear relationship between √𝑡 and ln(1+TFS) implies that neural systems inherently leverage Lie-theoretic principles to process temporal information. Empirical data in Figure 4.5.1 demonstrate that all ln(1+TFS) vary linearly with √𝑡, indicating a hierarchical or self-similar structure in neural timing mechanisms governed by Lie-theoretic symmetries. Particularly, neural frequency adaptation follows a √𝑡 law (Figures 2.1.3 and 4.5.1) that aligns with Lie-group theory: the logarithmic identity ln√𝑡 = (1/2)ln 𝑡 reveals an intrinsic Lie group structure where the nonlinearity of √𝑡 is encoded as a linear combination in the generator space (Lie algebra, ln 𝑡), enabling analytical tractability through exponential reconstruction. The exponential map then recovers the original dynamics √𝑡 as an invariant metric, encoding dynamics through an infinitesimal generator that preserves biological plausibility across time scales.

4.5.2 Reinterpreting characteristic time in Lie algebra

Building on the symmetric bilinear hyperbolic spaces and Maass forms discussed in Sections 4.1-4.4, we interpret √𝑡 as an element of a Lie group manifold governing temporal scaling symmetry. Specifically, its one-parameter subgroup structure is defined by its exponential identity:

√𝑡 = 𝑒^(ln(√𝑡)) = 𝑒^(ln 𝑡 / 2) (4.5.4)

This duality between logarithmic linearization and exponential reconstruction is essential for preserving both nonlinearity (√𝑡) and analytical tractability, while maintaining the temporal invariance properties observed in biological systems (Figure 4.5.1).

In Lie theory, the term ln 𝑡 acts as the infinitesimal generator of the one-parameter subgroup, particularly for group elements t near the identity element of the group, where the logarithm map is well-defined. Specifically, for small perturbations near the identity (𝑡 ≈ 1), the Taylor expansion of Equation 4.5.4 becomes:

√𝑡 ≈ 1 + (1/2)ln 𝑡 (4.5.5)

This Taylor expansion is valid under the normalized condition 𝑡/𝑡₀ ≈ 1, that is, for times close to the identity after normalization by 𝑡₀.
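
A short numerical check of Equation 4.5.5 under this normalization (illustrative values of 𝑡/𝑡₀ near 1):

```python
import numpy as np

# Compare sqrt(t) with its first-order expansion 1 + ln(t)/2 near the identity t = 1.
t = np.array([0.8, 0.9, 1.0, 1.1, 1.2])   # normalized times t/t0 near 1
exact  = np.sqrt(t)
approx = 1.0 + 0.5 * np.log(t)
print(np.max(np.abs(exact - approx)))      # small near t = 1, grows away from it
```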

Motivated by the duality between a group element and its generator, we therefore reinterpret the characteristic time Δ𝑡 in Equation 4.5.1 as being proportional to the generator in the Lie algebra. To ensure dimensional consistency and to isolate the scale-invariant core of the theory, we introduce a compound characteristic biological time scale 𝑎𝜏₀ with the dimension of time, formed from the product of two fundamental parameters, each with dimension of t^(1/2): a unitary normalization factor 𝑎 encapsulating membrane electrochemical dynamics (such as creatine dipole √𝑡-scaled dynamics described in Section 2.1.6), and an intrinsic biological time 𝜏₀ (such as the Lie group time √𝑡).

Since Δ𝑡 is used in two different contexts—as a generator proxy and as a group element proxy—we first define the time-scale Δ𝑡 associated with the generator in the Lie algebra with a dimension of wall time:

Δ𝑡 = (𝑎𝜏₀) ln(𝑡/(𝑎𝜏₀)) (4.5.6)

Here, the infinitesimal generator ln(𝑡/(𝑎𝜏₀)) is normalized by the characteristic time 𝑎𝜏₀, yielding a dimensionless generator. The prefactor (𝑎𝜏₀), with dimension of t, scales this generator into a wall time quantity, representing its action in the algebra.

Naturally, we define a corresponding temporal scale Δ𝑡_group on the group manifold proportional to the group element itself: Δ𝑡_group = √𝑡 with a dimension of t^(1/2) as discussed in Section 4.5.1. This establishes a duality between two fundamental time parameters: Δ𝑡 representing generator dynamics with dimension t, and Δ𝑡_group corresponding to group element evolution with square-root scaling of dimension t^(1/2). The dimensional consistency of the Lie algebra parameter Δ𝑡 aligns precisely with physical principles governing energy conservation (Sections 2.4.5.4 and 4.2) and time translation symmetry through unitary temporal evolution 𝑈(𝑡) = 𝑒^(−𝑖𝐻𝑡/ℏ) (Equation 2.4.19), where the product of the Hamiltonian generator and wall time directly corresponds to the algebraic formalism. This principle extends to neural perception via Bernoulli logit functions (ln(𝑝/(1−𝑝))), mapping dimensionless probability ratios directly into wall-time-scaled Lie algebra elements. This preserves dimensional coherence between physical time evolution and perceptual event representation (Section 1.7.3.5).
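
As a minimal sketch of the unitary temporal evolution invoked here, the following Python example builds 𝑈(𝑡) = 𝑒^(−𝑖𝐻𝑡/ℏ) for an arbitrary two-level Hermitian generator (with ℏ = 1 for illustration) and checks unitarity and the one-parameter group property:

```python
import numpy as np
from scipy.linalg import expm

# Illustrative unitary temporal evolution U(t) = exp(-i H t / hbar) (Equation 2.4.19).
hbar = 1.0
H = np.array([[1.0, 0.5],
              [0.5, -1.0]])                        # Hermitian generator (arbitrary example)
t = 0.7                                            # wall time (arbitrary)

U = expm(-1j * H * t / hbar)                       # one-parameter unitary subgroup
assert np.allclose(U.conj().T @ U, np.eye(2))      # unitarity: time-translation symmetry
assert np.allclose(U, expm(-1j * H * (t / 2) / hbar) @ expm(-1j * H * (t / 2) / hbar))
# The second assertion is the group property U(t) = U(t/2) U(t/2).
```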

For clarity, we replicate Equations 2.1.26 and 2.1.27 here:

𝑝 = 𝑖𝑒^(−𝐻𝑖𝑛𝑡) − 𝑘𝑍 = cos(2π𝐻𝑙𝑑𝑐√𝑡/𝐸(𝑡)Δ𝑡) (2.1.26)

𝑛 = 𝑙³² (2.1.27)

From Equations 2.1.26, 2.1.27 and 4.5.1, we derive a formula for spiking frequency:

𝑓 = 𝐸√𝑡/Δ𝑡 (4.5.7)

Furthermore, simplifying Equation 4.5.1 yields:

𝑓Δ𝑡 = 𝑘^(2/3) (4.5.8)

Substituting our proposed relation Δ𝑡 from Equation 4.5.6 into Equation 4.5.8 suggests:

𝑓𝑎𝜏₀ ln(𝑡/(𝑎𝜏₀)) ∝ 𝑘 (4.5.9)

For a single inter-spike interval (ISI), where t is represented by the ISI and k is a constant of order 1, this leads to the proportionality:

𝑓𝑎𝜏₀ ln(𝐼𝑆𝐼/(𝑎𝜏₀)) ∝ 1 (4.5.10)

This relationship is asymptotically intriguing. The function ln(𝑥)/𝑥, which would be equivalent to 𝑓𝑎𝜏₀ ln(𝐼𝑆𝐼/(𝑎𝜏₀)) if 𝑓 = 1/𝐼𝑆𝐼, has a maximum value of 1/e at x = e and does not yield 1 for any real x.

However, a local linear approximation around 𝐼𝑆𝐼 = 𝑒𝑎𝜏₀² (the identity element of the scaled time in the Lie algebra) provides a valid region of applicability. Writing 𝑥 = 𝐼𝑆𝐼/(𝑎𝜏₀), the first-order Taylor expansion of ln(𝑥)/𝑥 about 𝑥 = 𝑒 is:

ln(𝐼𝑆𝐼/(𝑎𝜏₀))/(𝐼𝑆𝐼/(𝑎𝜏₀)) ≈ 1/𝑒 + ((1 − ln 𝑒)/𝑒²)(𝐼𝑆𝐼/(𝑎𝜏₀) − 𝑒) + ⋯ (4.5.11)

Since ln 𝑒 = 1, the linear coefficient vanishes, so the function is locally flat at its maximum value of 1/𝑒.

This implies that near 𝐼𝑆𝐼 ≈ 𝑒𝑎𝜏₀², the spike frequency can be approximated by 𝑓 ≈ 1/(𝑎𝜏₀ ln(𝐼𝑆𝐼/(𝑎𝜏₀))), which is consistent with certain SU(2)-derived dynamics under constraints of bi-invariant metrics.

Equation 4.5.10 is particularly compelling when compared to the Prime Number Theorem:

𝜋(𝑥) ~ 𝑥/ln(𝑥) (4.5.12)

The dimensionless product 𝑓𝑎𝜏₀ ln(𝐼𝑆𝐼/(𝑎𝜏₀)) might represent an invariant quantity in the neural code, analogous to the density of primes (1/𝜋(𝑥)). This mathematical analogy proposes a mechanism whereby the inherent dynamics of frequency adaptation, modeled as a Lie group process with an inherent timescale 𝜏₀, could favor the emergence of intervals with prime-number-like properties. Consistent with this hypothesis, simulations of spike frequency adaptation in Figures 2.1.4 and 2.4.4 demonstrate patterns where neurons adapt to fire prime numbers of spikes.

4.5.3 Prime-like information density and the emergence of magic seven in neuronal encoding

In Section 4.5.1, we established that √𝑡 acts as a macroscopic invariant metric, encoding dynamics through an infinitesimal generator that preserves biological plausibility across time scales. Consequently, interpreting the inherent time scale 𝜏₀ as a group time √𝐼𝑆𝐼 and neglecting the unitary factor 𝑎 yields:

ln(𝐼𝑆𝐼/𝜏₀)/(𝐼𝑆𝐼/𝜏₀) = ln(𝐼𝑆𝐼)/(2√𝐼𝑆𝐼) (4.5.13)

which attains its maximum value of 1/e at ISI = e² ≈ 7.389. Notably, this extremal value in the Lie algebra (√𝐼𝑆𝐼) maps onto the asymptotic density of primes 1/𝜋(𝑥) ~ ln(𝑥)/𝑥 in number theory (Equation 4.5.12), with the extremum shifted from 𝑥 = 𝑒 to 𝐼𝑆𝐼 = 𝑒² ≈ 7, and finds a functional correlate in neuronal spiking patterns, thereby aligning with empirical 'magic seven' phenomena (Sections 2.1.5 and 3.6.2). In addition, the dimensionless product 𝑓𝑎𝜏₀ ln(𝐼𝑆𝐼/(𝑎𝜏₀)), analogous to the density of primes 1/𝜋(𝑥), suggests that 7 marks a critical point in information density where neuronal frequency adaptation is not merely energy-efficient (Section 2.1.5), but also maximizes encoding efficacy and emergent functional complexity (Section 3.4) within a "prime-rich" region. This bridges number-theoretic patterns with neurophysiological constraints, revealing how prime-like structures emerge from biophysical principles to optimize information transmission under refractory period limitations.

Importantly, √𝑡 as an invariant metric ensures that the system dynamically adjusts to maintain optimal trade-offs between energy expenditure and information encoding. Without this metric, the natural logarithmic function ln(𝑥)/𝑥 attains its maximum at 𝑥 = 𝑒 ≈ 2.718, a value too small for frequency adaptation (≥ 3 spikes, Equation 4.3.9). By incorporating √𝑡, the critical point shifts to 𝑥 ≈ 7, where neurons align their operation with both empirical cognitive limits (~7 ± 2 items in working memory) (Miller, 1956) and prime-rich regions of information density. By exploiting the uniqueness of prime numbers, this design achieves the metabolic efficiency required for energy-constrained biological systems (Sections 3.6.2, 3.9.2.2 and Figure 3.6.1). Additionally, in number theory, there is a theorem of the smallest prime divisor: every composite number 𝑛 has a prime divisor 𝑝 less than or equal to its square root √𝑛 (Hardy et al., 2008). Thus, the metric √𝑡 acts as a bridge between abstract algebraic principles and empirical neurophysiology, enabling neurons to harness prime-rich dynamics for robust information processing.
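
The two critical points cited above, 𝑥 = 𝑒 for ln(𝑥)/𝑥 and 𝐼𝑆𝐼 = 𝑒² ≈ 7.389 for ln(𝐼𝑆𝐼)/(2√𝐼𝑆𝐼) (Equation 4.5.13), can be confirmed with a short numerical sketch:

```python
import numpy as np

# Locate the maxima of ln(x)/x and ln(x)/(2*sqrt(x)) on a dense grid.
x = np.linspace(1.5, 20.0, 200000)

f1 = np.log(x) / x                      # prime-density analogue, peaks at x = e
f2 = np.log(x) / (2.0 * np.sqrt(x))     # Equation 4.5.13, peaks at x = e^2

print(x[np.argmax(f1)], f1.max())       # ~2.718, ~0.3679 (= 1/e)
print(x[np.argmax(f2)], f2.max())       # ~7.389, ~0.3679 (= 1/e)
```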

4.5.4 Absolute refractory period revisited in Lie algebra

Section 2.1.8 originally defined the absolute refractory period as 𝐼𝑆𝐼ₘᵢₙ = Δ𝑡 (0.0187-0.0323 ms), where the characteristic time Δ𝑡 is linearly proportional to the interspike interval (ISI). The infinitesimal generator ln(𝑡/𝜏₀) in Equation 4.5.6, derived from the group element √𝑡, is a dimensionless mathematical object within the Lie algebra. To connect it to a physical quantity related to √𝐼𝑆𝐼 in the Lie algebra, we normalize it. This is achieved by scaling with the factor 𝑎 (with a dimension of t^(1/2)). Using the parameters 𝑡 = 𝐼𝑆𝐼 and interpreting the inherent time scale 𝜏₀ as the group time √𝐼𝑆𝐼 (dimension t^(1/2)), consistent with Section 4.5.3, the normalization procedure of Equation 4.5.6 results in the Lie algebraic time increment expression:

Δ𝑡 = 𝑎√𝐼𝑆𝐼 (4.5.14)

This expression serves as our physically realizable representation of the infinitesimal generator in the Lie algebra. Consequently, Equations 4.5.1, 4.5.6 and 4.5.14 jointly yield a revised minimum refractory period:

𝐼𝑆𝐼ₘᵢₙ = (𝑎𝜏₀)² (4.5.15)

Here 𝐸, the net driving force of the Na⁺-K⁺ ATPase (86-196 mV), enters through Equation 4.5.1. The unitary factors 𝑎 and 𝜏₀ are crucial as they ensure dimensional homogeneity of Δ𝑡 and 𝐼𝑆𝐼ₘᵢₙ in the Lie algebra, and absorb the conversion factors necessary to transition from a pure mathematical symmetry to an emergent biophysical constraint. This formula reduces the minimum refractory period to 0.0051-0.0126 ms, reflecting a tighter lower bound constrained by both ionic kinetics and √𝐼𝑆𝐼-invariant dynamics. The shift from Δ𝑡 ∝ 𝐼𝑆𝐼 to Δ𝑡 ∝ √𝐼𝑆𝐼 underscores the nonlinear interplay among temporal resolution (Figures 2.1.3 and 4.5.1), energy constraints of frequency adaptation (Sections 3.4 and 3.6.2), and prime-based information encoding (Section 4.5.3). These findings align with robust simulations and theoretical predictions about frequency adaptation and seven spiking numbers (Figure 2.1.4, using 𝐼𝑆𝐼ₘᵢₙ = 0.01 ms). This mechanistic framework forms the theoretical basis for the simulations in Figure 2.1.4 by defining the absolute refractory period as a fundamental time constant while addressing limitations inherent to linear ISI-based models in Section 2.1.8. Equation 4.5.14 establishes that the absolute refractory period is determined by Na⁺-K⁺ ATPase energy demands and its CK/PCr buffering system (Section 1.8.3), with constraints arising from creatine dipole dynamics described in Section 2.1.6.

4.6.1 Oscillatory behavior of primes: a foundation for frequency-adapted encoding

Based on the explicit formula derived by von Mangoldt:

𝜓₀(𝑥) = 𝑥 − ∑_𝜌 𝑥^𝜌/𝜌 − 𝜁′(0)/𝜁(0) − (1/2)ln(1 − 𝑥⁻²) (4.6.1)

the oscillatory components in the distribution of prime numbers are directly attributed to the sum over non-trivial zeros 𝜌 = 𝛽 + 𝑖𝛾, where 𝛽 = Re(𝜌) and 𝛾 = Im(𝜌) (the imaginary part). Decomposing the term 𝑥^𝜌 via Euler's formula yields:

𝑥^𝜌/(𝛽 + 𝑖𝛾) = 𝑥^𝛽 ∙ (cos(𝛾 ln x) + 𝑖 sin(𝛾 ln x))/(𝛽 + 𝑖𝛾) (4.6.2)

This reveals that oscillations arise from the imaginary components (∝ sin(𝛾 ln x)) modulated by damping factors 𝑥^𝛽/|𝜌|.
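
A minimal numerical illustration of Equations 4.6.1 and 4.6.2, assuming mpmath and SymPy are available: the truncated sum over the first N zero pairs (N is an arbitrary choice here) tracks the oscillation of the Chebyshev function 𝜓(𝑥) around its main term 𝑥, with agreement improving as N grows:

```python
import mpmath as mp
from sympy import primerange

def psi_exact(x):
    """Chebyshev psi(x): sum of log p over prime powers p^k <= x."""
    total = mp.mpf(0)
    for p in primerange(2, int(x) + 1):
        pk = p
        while pk <= x:
            total += mp.log(p)
            pk *= p
    return total

def psi_explicit(x, N=50):
    """Truncated von Mangoldt explicit formula (first N pairs of non-trivial zeros)."""
    x = mp.mpf(x)
    s = x - mp.log(2 * mp.pi) - 0.5 * mp.log(1 - x ** -2)
    for n in range(1, N + 1):
        rho = mp.zetazero(n)                 # 1/2 + i*gamma_n
        s -= 2 * (x ** rho / rho).real       # pair each zero with its conjugate
    return s

for x in (20.5, 50.5, 100.5):                # half-integers avoid prime-power jumps
    print(x, float(psi_exact(x)), float(psi_explicit(x)))
```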

Under the assumption of the Riemann Hypothesis (all non-trivial zeros satisfy 𝛽 = 1/2), von Koch proved in 1901 that the prime-counting function 𝜋(𝑥) satisfies:

𝜋(𝑥) = Li(𝑥) + O(√𝑥 ln 𝑥) (4.6.3)

where Li(𝑥) = ∫₂ˣ 𝑑𝑡/ln 𝑡 is the logarithmic integral approximation, with the error bound √𝑥 ln 𝑥 multiplied by a constant (von Koch, 1901). The explicit factorization of Equation 4.6.1 thus bridges analytic number theory to the spectral interpretation of prime distributions via zeta-zero dynamics.
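
The von Koch bound in Equation 4.6.3 can be examined numerically; in the following sketch the constant multiplying √𝑥 ln 𝑥 is taken as 1 purely for illustration:

```python
import math
from mpmath import li
from sympy import primepi

# Compare pi(x) - Li(x) with the scale sqrt(x) * ln(x) of the von Koch error bound.
for x in (10**3, 10**4, 10**5, 10**6):
    error = float(primepi(x)) - float(li(x, offset=True))   # Li(x) = integral from 2
    bound = math.sqrt(x) * math.log(x)
    print(x, error, bound)                                   # |error| stays well below the bound
```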

By comparing Equation 4.5.8 (neuronal firing dynamics) and Equation 4.6.2 (zeta-zero oscillations), we observe a structural analogy: the spiking frequency 𝑓 ∝ 𝐸√𝑡 (Equation 4.5.7) in neuronal adaptation is mathematically analogous to the imaginary component 𝛾 of Riemann zeta zeros 𝜌. This correspondence suggests that the voltage-dependent driving force 𝐸 of Na⁺-K⁺ ATPase and creatine dipole √𝑡-scaled dynamics (Section 2.1.6) synergistically encode information through a frequency spectrum akin to the prime-number oscillations governed by zeta-zero dynamics as discussed in Sections 4.3.1, 4.3.2, 4.5.3 and 4.5.4. Moreover, the fluctuations in creatine dipole dynamics governed by √𝑡-scaled evolution (Section 2.1.6) within the Lie algebra framework anchored by an infinitesimal generator proportional to ln(𝑡) (Section 4.5.2), exhibit quantitative similarity to the error bound √𝑥 ln 𝑥 (Equation 4.6.3) characterizing deviations in the prime-counting approximation 𝜋(𝑥) − Li(𝑥). These highlight how non-linear biological oscillators might exploit mathematical structures originally derived from analytic number theory to optimize information encoding efficiency.

4.6.2 Conceptual implications for the Riemann Hypothesis through Hilbert-Pólya operator theory

The Hilbert-Pólya conjecture proposes that a self-adjoint operator in a physical system could establish a direct correspondence between its spectral properties and the non-trivial zeros of the Riemann zeta function (Spigler, 2025). Realizing the Hilbert-Pólya conjecture would provide a sufficient route to proving the Riemann Hypothesis; whether it is also necessary, that is, whether the two are equivalent, remains unresolved.

Figure 4.6.1 Realizing the Hilbert-Pólya conjecture via frequency-adapted prime information encoding. The diagram explores the geometry of a Bernoulli statistical model, focusing on how symmetry and antisymmetry emerge through exponential/logarithmic maps. It links information-theoretic quantities (like entropy and self-information) to geometric structures (tangent spaces, Fisher information), and connects these ideas to advanced mathematical physics via the Hilbert-Pólya conjecture and the Riemann hypothesis, suggesting deep ties between statistics, geometry, and number theory.

The proposed framework integrates three interrelated components: First, frequency-adapted prime encoding in neural information processing models yields an eigenvalue spectrum aligned with the statistical distribution of Riemann zeros via the relation s(1-s) (Section 4.3), mirroring the Hilbert-Pólya requirement for eigenvalues of self-adjoint Hamiltonian operators 𝐻ₚ (Section 4.2) corresponding to critical zeros (Equation 4.3.9). Second, the Selberg trace formula establishes a geometric bridge between hyperbolic surfaces and the Riemann Hypothesis through spectral equivalence: the spectrum of the Laplace operator on such surfaces is analytically linked to geodesic length distributions via the Selberg trace formula (Section 4.3.1). Third, an observed analogy emerges between the prime-counting function 𝜋(𝑥) and frequency adaptation in neural encoding patterns across Sections 4.5-4.6, suggesting a shared structural logic underlying number-theoretic sequences and information processing dynamics. This synthesis demonstrates how spectral theory (via self-adjoint operators, Section 4.2), geometric analysis (Selberg trace formula on hyperbolic surfaces, Section 4.3.1), and computational models of information encoding (Sections 4.5-4.6) can converge to realize a pathway toward proving the Riemann Hypothesis by rigorously linking arithmetic zeros, operator eigenvalues, and computational analogs of prime distributions.

The unified diagram in Figure 4.6.1 illustrates how probabilistic and geometric structures on a statistical manifold, where every point 𝑝 ∈ (0, 1) encodes event probabilities, are intertwined with number-theoretic principles through dualities like 𝑝 ↔ 1 − 𝑝 and 𝑠 ↔ 1 − 𝑠. The tangent space (𝑒₁ = 𝑝∂𝑝) and cotangent space (𝜀₁ = 𝑑𝑝/𝑝) provide directional and metric tools, while symmetric bilinear hyperbolic self-information spaces preserve Hamiltonians 𝐻𝑝 = 𝑝(1 − 𝑝) under exponential mappings (Section 4.1). Antisymmetric frameworks (Section 1.7.3.4), by contrast, use logit transformations to invert 𝑝 ↔ 1 − 𝑝 into canonical parameter flips (𝜃 → −𝜃), encoding information-theoretic quantities like self-information (ln(1/𝑝)) and Fisher precision (𝐹(𝑝) = 1/(𝑝(1 − 𝑝))).

Extending this manifold into complex domains (e.g., 𝑝 = 1/2 + 𝑖𝑟) suggests links to unsolved problems, such as Maass waveforms and the Selberg trace formula (Section 4.3.1). This mathematical pathway proposes a biological framework where frequency-adapted neuronal encoding of primes could provide empirical grounding for Hilbert-Pólya conjectures, linking spectral analysis to arithmetic structures. By integrating differential geometry, information geometry, and number theory through GL(2)-equivariant bilinear self-information spaces embedded in the Langlands program (Section 4.4), as illustrated in Figure 4.4.1, this interdisciplinary framework bridges statistical intuition with deep arithmetic mysteries. It enables unified exploration of biological dynamics via isothermal coordinates, geometric symmetries through topological invariants, and foundational conjectures like the Riemann Hypothesis.

4.7 Unifying temporal dynamics: a Lie-algebraic framework for time perception across scales

The conceptual evolution of spacetime has profoundly shaped scientific progress, particularly in physics, where three historical paradigms illustrate its transformative role: the Newtonian absolute framework treating space and time as fixed and independent entities providing a static stage for physical laws; the Einsteinian relativistic paradigm unifying these dimensions into a dynamic geometric continuum warped by mass-energy distributions; and the quantum discrete spacetime emerging from theories like loop quantum gravity or string theory, positing granular structures at Planck scales (Section 2.4.5.4.4). This Lie-algebraic framework for time modeling synthesizes such historical insights with modern mathematical tools, offering a universal language to describe temporal dynamics across spatial and functional scales from subatomic processes to macroscopic systems.

This comprehensive study introduces several groundbreaking concepts about time that intertwine neural processing with fundamental principles from topology and group theory, particularly leveraging Lie algebra to explain the temporal dynamics of neural systems and unresolved questions in physics and mathematics. The core concepts about time perception include:

  1. Time definitions: The work defines multiple types of time: wall/real time, duration time, characteristic time/absolute refractory period, inter-spike interval (ISI), delay time from first spike (TFS), intrinsic biological time 𝜏₀ = √𝐼𝑆𝐼, Lie algebraic time Δ𝑡 ∝ √𝐼𝑆𝐼, and Lie group time √𝑡 ∝ ln(1 + 𝑇𝐹𝑆) (Section 4.5). These definitions provide a comprehensive framework for understanding neural dynamics across different scales. An appropriate space-time perspective becomes essential for cross-scale analysis and determining mathematical-physical solvability by aligning intrinsic biological constraints with extrinsic measurement frameworks.

  2. Time scaling symmetry: Neural systems inherently utilize Lie-theoretic principles to process time information, as evidenced by the linear relationship between √𝑡 and ln(1+TFS) in Figure 4.5.1. This relationship reveals a hierarchical or self-similar structure governed by Lie-group symmetries.

  3. Dimensional consistency: The parameter Δ𝑡 from the Lie algebra aligns precisely with physical principles of energy conservation and time translation symmetry through unitary temporal evolution 𝑈(𝑡) = 𝑒^(−𝑖𝐻𝑡/ℏ) (Equation 2.4.19). This establishes alignment between quantum time (via imaginary exponentials, Section 2.4.5.3) and biological perception, connecting theoretical frameworks to empirical neural data.

  4. Topology protection: The quantum mechanisms involving topological protection (Section 2.1.6) are foundational to consequent analyses and discussions, ensuring robustness in temporal information processing by neurons.

  5. Neural perception via Bernoulli logit functions: Dimensionless probability ratios are mapped into wall-time-scaled Lie algebra elements using Bernoulli logit functions (Section 1.7.3.5), preserving coherence between physical and perceptual time representations.

These concepts collectively establish a coherent theoretical framework that bridges neural information processing with abstract mathematical structures, offering testable hypotheses. The proposed formalism suggests potential links to tough questions in physics and mathematics, such as the cosmological constant problem (Section 2.4.5.4) or the Hilbert-Pólya conjecture (Section 4.6), by demonstrating how temporal symmetries in neural systems could mirror universal scaling laws (Section 2.4.5). The self-consistent integration of Lie-group-based time hierarchies and topological robustness (e.g., Section 2.1.6) supports the hypothesis that precise temporal perception (via Δ𝑡) is critical for neural computation, as demonstrated by the dimensional consistency between physical time evolution (imaginary exponentials) and perceptual event representation (Bernoulli logit function).

Epilogue: Summary and perspectives

The mind-body relationship encompasses both scientific inquiry and philosophical debate. Among the complexities of the science of the mind, consciousness stands as the most enigmatic challenge, particularly the "hard problem" of understanding subjective experience (Chalmers, 1995). Recently, there has been a spirited debate about theories of consciousness (Lenharo, 2023). From the standpoint of systems science philosophy, many consciousness theories overlook the structure-function correlation principle, thereby reducing their explanatory power regarding consciousness.

Therefore, we initially introduce systematicism, the philosophy for systems science, with five fundamental laws: the correlation between structure and function, feedback in information, competition and cooperation, order emerging from fluctuations, and the optimization of evolutionary processes (Wei and Zeng, 1995). Embracing systematicism is crucial for tackling the complex issues of our time, necessitating an interdisciplinary framework that integrates knowledge across various fields. This is particularly important when exploring theoretical models and the fundamental workings of the brain and consciousness.

Based on the structure-function correlation principle at the neuronal level, we have demonstrated the strong correlation between structures and functions in neurons, including the axon initial segment (AIS), axons, presynaptic vesicles, dendritic spines, and dendritic trees. Particularly, the neck structure of the dendritic spine is a key for information encoding in the frequency domain. Therefore, the whole work is discussed in the frequency domain.

At the level of sensory modalities, we find that the intrinsic function of simple auditory neurons encodes information as a wave function ψ with a characteristic frequency and an invariant phase. Subsequently, at the mesoscopic level, we successfully resolve the quantum mechanism for frequency adaptation, a representation of energy constraint. Particularly, we find that the accommodation mechanism and the multi-layered retina are specialized to employ the principles of Fourier optics and quantum optics. The stratified structures, positions and functions of retinal ganglion cells and bipolar cells are specialized for the Fock states in quantum optics. The whole brain might be considered as a neuronal quantal field, comprising billions of neurons and trillions of synapses, which intrinsically function as wave functions and perceptive units, respectively.

Sensory modalities and their submodalities generate distinct information chunking at various levels, indicating that self-consciousness encompasses hierarchical structures and modular functional units. This design in the brain may serve as a fault-tolerant mechanism for structure-function correlations. Therefore, we have summarized and discussed cortical areas associated with clinical diseases (Table 3.2.1), which provide valuable information about potential fault-tolerant mechanisms and enhance our understanding of the functional roles of their associated cortical areas.

The hard problem and the easy problems of consciousness share and rely on the same fundamental characteristics and algorithms of complex cells. Both are explainable and resolvable in the realm of quantum wave mechanics, together with the law of structure-function correlation, as corroborated in the cortical areas associated with clinical diseases. The principal questions are how to define and quantify emergent information. We define neuronal information as Fisher information adhering to the principle of minimum uncertainty, and we also propose a definition and quantification of information emergence. Consequently, we have resolved Chalmers' hard problem in the frequency domain (Sections 2.4.4.4, 2.4.4.5 and 3.4-3.8).

Most recently, Zhou et al. reported that larger language models become less reliable (Zhou et al., 2024). As mentioned in Table 1.2.1, the activation function ReLU in an artificial neural network emulates the neurotransmitter release process at the synapse to introduce non-linearity. However, in reality, neurons can efficiently use energy-constrained adaptation to introduce nonlinear compound effects of the real time, the square root of the duration time, and the characteristic time as discussed in Section 2.1. Our findings support the need for a fundamental change in the design and development of general-purpose artificial intelligence (Zhou et al., 2024). In Section 2.4.4.5, we have proposed that Bayesian hierarchical Normal-Gamma conjugate inference and its associated meta-metrics offer foundational principles to advance AGI development. These methods provide mechanistic, computationally grounded metrics aimed at mimicking human-like awareness. By leveraging these statistical approaches, we can quantify and assess the emergence of consciousness-like phenomena in AI systems, thereby fostering more robust and human-aligned AGI capabilities.

In summary, it is thrilling to understand the mechanism and theoretical framework of neuronal function and consciousness employing quantum wave mechanics at the mesoscopic level. We name this theoretical framework the Unified Information Theory (UIT) for three reasons: first, the quantum mechanism of frequency adaptation at the mesoscopic level provides a bridge between microscopic and macroscopic levels. Second, we express our deepest thanks to experts in multidisciplinary fields and their contributions, including philosophy, mathematics, physics, and neuroscience. Third, we show appreciation for Everett's Universal Wave Function but not DeWitt's many-worlds interpretation (DeWitt and Graham, 2015). Naturally, our brains and animal brains are intricate subsystems of the Universal Wave Function (Hartle, 1983; Vilenkin, 1994), as evidenced by human-dog interbrain neural coupling (Ren et al., 2024).

Acknowledgments

We thank all researchers for their work in multidisciplinary fields including mathematics, physics, philosophy, medicine, computer science, chemistry, molecular biology, biochemistry, cell biology and neuroscience. We apologize to the many authors whose papers, books, or software may remain uncited.

We thank the First Affiliated Hospital of Zhengzhou University, the Chinese Academy of Sciences, the National Natural Science Foundation of China and the National Basic Research Program for their support.

Finally, I express my sincere gratitude for the cumulative impact of my life's encounters, the intellectual enrichment derived from literature, and the formative influence of personal and professional experiences. These elements have significantly contributed to my academic and personal development. As Karl Marx once said, "The essence of man is the sum of all social relations." All of these have made everything possible.
