Definition and Value Reconstruction of Human Creativity in the AI Era
Jiang Ke, Huang Ruizhi, Shen Zhoujie, Yan Wenjing
Submitted 2025-11-07 | ChinaXiv: chinaxiv-202511.00064 | Mixed source text

Abstract

In the AI era, traditional definitions of creativity can no longer accurately distinguish the value of human intelligent activity from that of AI. AI's "high-score" performance on traditional creativity tasks has fueled the illusion that "AI possesses high creativity." In fact, AI's excellent performance in board games and in big-data-driven precision decision-making is not equivalent to "high creativity."

Creative activities can be categorized into three types based on their outputs:
1. Interpolative creativity, which refers to the generation of knowledge with application value through the structuring and refinement of internal cognitive systems;
2. Extrapolative creativity, which refers to the process of transcending the boundaries of existing knowledge to generate new knowledge through induction and generalized inference, or applying knowledge from an existing cognitive domain to a new field;
3. Leapfrog creativity, which refers to the formation of higher-level abstractions and generalized statement models based on existing knowledge, achieving a transformation of the fundamental knowledge model.

Traditional creativity assessment tasks emphasize interpolative and extrapolative creativity while relatively neglecting the consideration of leapfrog creativity; furthermore, these assessment tasks do not precisely distinguish between the three creative modes or provide separate evaluations for them. Within these three types of creative activities, creativity is defined as the ratio of the creative output value obtained in each activity to the cost (energy consumption) incurred; that is, for a given creative output value, the lower the cost incurred, the higher the creativity of the system.

In the three-dimensional model of creativity assessment, creativity is defined as the psychometric and statistical norms across three dimensions: $X$ (interpolative creativity), $Y$ (extrapolative creativity), and $Z$ (leapfrog creativity). According to this model, the creativity levels and values of different types of creative activities—such as engineering manufacturing, research reviews, and theoretical innovation—can be evaluated uniformly. This provides a psychometric measurement standard for the distinction and unified evaluation of human and AI creativity.
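Read literally, the ratio definition and the three-dimensional model above admit a compact formalization. The value and cost symbols $V$ and $E$ below are our shorthand for "creative output value" and "cost (energy consumption)"; they are not notation from the original text:

```latex
% Creativity of a system in one creative activity:
% output value obtained per unit of cost (energy) incurred.
C = \frac{V}{E}

% Three-dimensional creativity profile, one ratio per mode:
% X = interpolative, Y = extrapolative, Z = leapfrog.
\mathbf{C} = (C_X,\, C_Y,\, C_Z)
           = \left( \frac{V_X}{E_X},\; \frac{V_Y}{E_Y},\; \frac{V_Z}{E_Z} \right)
```

The psychometric norms mentioned above would then be established over each component separately before comparing humans and AI systems.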

Full Text

Preamble

Definition and Value Reconstruction of Human Creativity in the AI Era

Affiliations:
1. School of Teacher Education, Lishui University, Lishui, Zhejiang
2. Center for Brain, Mind, and Education Research, Shaoxing University, Shaoxing, Zhejiang
3. School of Mental Health, Wenzhou Medical University, Wenzhou, Zhejiang

Abstract

The rapid advancement of Artificial Intelligence (AI) is fundamentally reshaping the landscape of human productivity and intellectual labor. As machine learning models increasingly demonstrate capabilities in generating art, literature, and scientific hypotheses, the traditional understanding of creativity faces an unprecedented challenge. This paper explores the evolving definition of human creativity in the AI era and proposes a framework for its value reconstruction. We argue that while AI excels at combinatorial creativity and pattern recognition, human creativity remains distinct through its dimensions of intentionality, sociocultural resonance, and ethical judgment. By synthesizing perspectives from cognitive science, education, and philosophy, this study aims to redefine the core competencies of human creators and provide a theoretical foundation for fostering unique human creative value in an automated world.

1. Introduction

For decades, creativity has been regarded as one of the most complex and uniquely human cognitive functions. However, the emergence of generative AI has blurred the boundaries between human and machine output. From large language models (LLMs) to diffusion models for visual arts, AI is now capable of producing high-quality creative artifacts that were once thought to require deep human intuition and emotional experience. This technological shift necessitates a rigorous re-examination of what constitutes "creativity" and how human contributions should be valued when the cost of generation approaches zero.

2. The Evolution of Creativity Definitions

Historically, creativity has been defined as the ability to produce work that is both novel and useful. In the context of the AI era, this definition requires refinement. We categorize creativity into three distinct levels:

  1. Combinatorial Creativity: The novel synthesis of existing ideas.
  2. Exploratory Creativity: The exploration of structured conceptual spaces to find new possibilities.
  3. Transformational Creativity: The alteration of the conceptual space itself, leading to paradigm shifts.

While AI has demonstrated mastery over combinatorial and exploratory creativity, transformational creativity—which often involves breaking established rules based on subjective values or radical new insights—remains a predominantly human domain.

3. The Human Advantage: Intentionality and Context

The primary distinction between human and machine creativity lies in intentionality: human creative acts are purposeful, embedded in a sociocultural context, and answerable to ethical judgment, whereas machine outputs emerge from statistical optimization over prior data.


Keywords: Artificial Intelligence; Creativity; Paradigmatic Revolution; Genetic Epistemology; Piaget; Free Energy

Abstract

This paper explores the intersection of artificial intelligence and human creativity through the lens of genetic epistemology and the free energy principle. By analyzing the mechanisms of cognitive development and the formal structures of creative thought, we propose a theoretical framework for understanding how AI systems might achieve a paradigmatic revolution in creative output. We draw upon Jean Piaget’s theories of equilibration and the minimization of variational free energy to model the transition from generative imitation to genuine conceptual innovation.

Introduction

The rapid advancement of machine learning and deep learning has brought the question of machine creativity to the forefront of academic discourse. While current generative models demonstrate remarkable proficiency in synthesizing existing styles and data distributions, the capacity for a "paradigmatic revolution"—the ability to break from established frameworks and establish new conceptual foundations—remains a significant challenge. This study investigates the epistemological requirements for such a leap, framing AI creativity not merely as a statistical phenomenon but as a dynamic process of cognitive restructuring.

1. Genetic Epistemology and the Structure of Creativity

According to Jean Piaget’s genetic epistemology, knowledge is not a static representation of reality but a continuous process of construction. This construction occurs through the dual mechanisms of assimilation and accommodation. In the context of artificial intelligence, assimilation corresponds to the integration of new data into existing model parameters, while accommodation represents the fundamental restructuring of the model's internal representations to account for anomalies or novel challenges.

1.1 Assimilation and Accommodation in Neural Networks

Current deep learning architectures excel at assimilation. By minimizing loss functions across vast datasets, models internalize the underlying statistical regularities of their training environment. However, genuine creativity requires more than the interpolation of known data points. It necessitates a form of "cognitive rupture" where the system's internal logic undergoes a qualitative shift.
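The assimilation/accommodation contrast described here can be caricatured in a few lines of code: assimilation tunes parameters within a fixed hypothesis class, while accommodation replaces the class itself once errors resist further tuning. The sketch below is purely illustrative; the linear/quadratic model families and all names are our assumptions, not anything proposed in this paper:

```python
# Toy contrast between Piagetian assimilation and accommodation.
# The data follow y = x^2; a linear model can only assimilate so far.

def fit(model, grad, params, data, steps=500, lr=0.01):
    """Assimilation: adjust parameters inside a fixed model family."""
    for _ in range(steps):
        g = [0.0] * len(params)
        for x, y in data:
            err = model(params, x) - y
            for i, gi in enumerate(grad(params, x)):
                g[i] += 2 * err * gi / len(data)
        params = [p - lr * gi for p, gi in zip(params, g)]
    return params

def mse(model, params, data):
    return sum((model(params, x) - y) ** 2 for x, y in data) / len(data)

data = [(x / 10, (x / 10) ** 2) for x in range(-10, 11)]

linear = lambda p, x: p[0] * x + p[1]
lin_grad = lambda p, x: [x, 1.0]
p_lin = fit(linear, lin_grad, [0.0, 0.0], data)

# Accommodation: persistent residual error forces a change of
# representation -- a new model family, not more parameter tuning.
quad = lambda p, x: p[0] * x * x + p[1] * x + p[2]
quad_grad = lambda p, x: [x * x, x, 1.0]
p_quad = fit(quad, quad_grad, [0.0, 0.0, 0.0], data)

assert mse(quad, p_quad, data) < mse(linear, p_lin, data)
```

Gradient descent (assimilation) drives the linear model to its best achievable fit, but only the change of representation (accommodation) removes the systematic residual error.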

[FIGURE:1]

As shown in [FIGURE:1], the transition from standard generative tasks to creative innovation involves a recursive feedback loop between the agent's internal model and the external environment. This process is governed by the principle of equilibration, where the system seeks a higher state of balance after encountering information that contradicts its current internal schema.

2. The Free Energy Principle and Creative Dynamics

To formalize the drive toward innovation, we employ the Free Energy Principle (FEP). In this framework, an intelligent agent (biological or artificial) acts to minimize its variational free energy, which serves as an upper bound on surprise, the negative log-evidence of its sensory observations.
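For reference, the standard decomposition of variational free energy (a textbook identity, not a result of this paper) for beliefs $q(s)$ about hidden states $s$, observations $o$, and generative model $p(o,s)$ is:

```latex
F[q] = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o,s)\big]
     = D_{\mathrm{KL}}\big[q(s)\,\big\|\,p(s \mid o)\big] - \ln p(o)
     \;\ge\; -\ln p(o)
```

Since the KL term is non-negative, minimizing $F$ simultaneously improves the agent's posterior approximation and bounds the surprise $-\ln p(o)$ of what it observes.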

The two major themes of human development in the 21st century are health (World Health Organization, 2000) and creativity (Delors, 1998). The intervention of artificial intelligence has brought unprecedented opportunities for human development, but it may simultaneously introduce significant risks.

Since the turn of the century, artificial intelligence has been integrating into human life at an exponential rate. Since 2023, the commercialization of Large Language Models (LLMs) has begun to threaten traditional human existential values. In a speech at the World Artificial Intelligence Conference held in Shanghai in July, Geoffrey Hinton, often regarded as the "Godfather of AI," explicitly suggested that the development of AI might lead to a situation where humanity is "nurturing a tiger that will eventually cause trouble." This potential threat is primarily manifested in the challenge AI poses to human creativity. As AI becomes capable of "creating" works of art and completing complex tasks with high efficiency, the "irreplaceability" that defines being human appears to be undergoing a process of "disenchantment" by artificial intelligence. In the face of AI's rapid evolution, how should humanity perceive this maturing "young tiger"? How should we evaluate the "creativity" exhibited by artificial intelligence? As AI continues to advance, should we, as humans, re-examine the fundamental meaning and value of human creativity?

2 AI and the Deconstruction of the Meaning of Classical Creativity

Introduction

The concept of creativity has long been a cornerstone of human intellectual and cultural development. However, the traditional or "classical" understanding of creativity often relies on a set of assumptions that merit rigorous deconstruction. In the contemporary academic landscape, particularly with the advent of machine learning and deep learning, the definition of what it means to be "creative" is undergoing a profound transformation. This section explores the foundational elements of classical creativity and begins the process of re-evaluating its core tenets.

The Myth of the Individual Genius

Classical interpretations of creativity frequently center on the "Great Man" theory, attributing breakthroughs to the singular flashes of insight experienced by isolated geniuses. This perspective posits that creativity is an inherent, almost mystical trait possessed by a select few. However, a deconstructive approach reveals that creativity is rarely an isolated event. Instead, it is a networked phenomenon, emerging from a complex web of cultural heritage, social collaboration, and incremental improvements on existing ideas. By shifting the focus from the individual to the system, we can better understand how creative outputs are synthesized from a vast array of pre-existing data and social influences.

Originality and the "Ex Nihilo" Fallacy

A primary pillar of classical creativity is the demand for absolute originality—the idea that a creative work must be created ex nihilo (out of nothing). From a technical and philosophical standpoint, this requirement is increasingly untenable. In the realm of deep learning, for instance, generative models produce "new" content by identifying and recombining patterns within massive datasets. This process mirrors human cognition more closely than the classical model suggests. Deconstructing "originality" reveals it to be a matter of degree rather than kind; what we label as original is often a sophisticated synthesis or a novel transformation of existing information.

Intentionality and the Creative Process

Another critical component of the classical definition is intentionality. It is often argued that for a work to be creative, it must be the result of a conscious, purposeful effort by a human agent. This anthropocentric view is challenged by the rise of autonomous systems. If a machine learning algorithm produces a symphony or a painting that evokes profound emotion in a human audience, does the lack of "intent" negate its creativity? Deconstructing this requirement allows us to separate the process of creation from the reception of the creative product, suggesting that creativity can be evaluated based on the output's impact and novelty rather than the conscious intentions of its producer.

2.1 Does AI Possess "Creativity"?

Does AI truly possess "creativity"? In classical theories of creativity, researchers typically employ metrics such as fluency, flexibility, and originality to measure the level of creative output (see [TABLE:1]). However, the performance of AI is currently challenging these established standards. In well-structured tasks, where the task structure is highly clear and feedback is immediately verifiable, the outputs and structural reorganization strategies of AI are often praised by professionals as "unprecedented" or "extraordinary." For example, on March 9, 2016, in Seoul, South Korea, during the second game of the man-machine match against Lee Sedol, AlphaGo's "Move 37" broke sharply with established human joseki (commentators put its prior probability at roughly one in ten thousand) and significantly altered human players' concepts of opening layouts and thickness. The following year, the new-generation AlphaZero, trained through "zero-knowledge self-play" starting from only the basic rules, reached and surpassed the level of AlphaGo within dozens of hours \cite{Jiang2018}. This demonstrates that in game spaces with closed rules and decidable goals, a new generation of AI can generate perspectives and problem-solving strategies regarded by humans as "highly creative" through reinforcement learning and tree search \cite{Silver2016, Silver2018, Silver2017, Wikipedia2025a}.

According to commentary from Go professionals, "Move 37" completely broke away from the traditional formulaic thinking of human Go. At the time, commentators and professional players were shocked, even initially dismissing it as a "mistake." However, the subsequent development of the game proved the profound strategic value of this move, ultimately leading to AlphaGo's victory. This "stroke of genius" is widely regarded as a landmark of artificial intelligence breaking through the limitations of human experience in the field of Go. It also made the evaluation of a "one-in-ten-thousand prior probability" a classic annotation of the innovative nature of AI players. Official media reports, such as those from People's Daily Online, detailed the results and interpretations of this man-machine battle, highlighting its significance for the future of AI.
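The tree search behind such play selects actions by trading off exploitation against exploration. In the standard UCT rule (AlphaGo itself uses a prior-weighted PUCT variant), the action chosen at state $s$ is:

```latex
a^{*} = \arg\max_{a}\left[\, Q(s,a) + c\,\sqrt{\frac{\ln N(s)}{n(s,a)}} \,\right]
```

where $Q(s,a)$ is the mean simulated value of action $a$, $N(s)$ the visit count of the state, $n(s,a)$ the visit count of the action, and $c$ an exploration constant. A move like "Move 37" emerges when the value estimate learned from self-play outweighs the very low prior that human experience would assign to it.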

On the other hand, creators and critics from fields such as literature and art maintain a more cautious attitude toward the "creativity" displayed by AI. While they generally acknowledge the advantages of AI-generated results in terms of diversity, technical characteristics, efficiency, and cost control, they also point out that AI output remains a recombination within existing stylistic spaces, lacking true original characteristics. When it comes to issues involving value judgments, emotional resonance, and deep-level aesthetic evaluation, AI works often appear quite immature or even produce absurd errors.

For instance, although GPT-4 scored highly on the Torrance Tests of Creative Thinking (TTCT), the texts it "creates" still exhibit semantic repetition and formulaic generation \cite{Cropley2023, Guzik2023}. Similar criticisms are corroborated by scientometric research related to "disruptiveness" and paradigm shifts \cite{Park2023, Wu2019}. It appears that whether AI possesses "creativity" in the true sense remains a debated topic. This paper intends to begin with the question of "whether AI possesses true creativity" by analyzing the performance of AI in creative tasks and comparing the outputs of AI and humans. Through this, we explore the essential characteristics and value of human creativity. Simultaneously, starting from a critique of traditional concepts of creativity, this paper reflects on traditional definitions and measurement indicators, attempting to reconstruct the conceptual framework and evaluation strategies for creativity in the AI era.

2.2 What AI Can and Cannot "Create"

What can and cannot be "created"? Contemporary mainstream generative AI systems are typically composed of a coupling of three categories of strategies: (1) High-probability fitting, which involves learning the conditional distribution of output results through probabilistic sampling within large-scale corpora (such as Large Language Models); (2) Reinforcement learning and search training, where optimal response strategies are formed through self-play and Monte Carlo Tree Search within problem spaces characterized by closed rules and clear objectives, such as board games (e.g., AlphaGo/AlphaZero); and (3) Hybrid generation strategies, which utilize external knowledge bases or vector retrieval techniques to enhance the certainty and coverage of generated results \cite{lewis2020retrieval, silver2016mastering, silver2018general, silver2017mastering}.
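Strategy (1), high-probability fitting, is easiest to see in miniature: a model that learns conditional next-token distributions by counting, then samples from them. Everything below (the corpus, the function names) is an illustrative toy of the mechanism, not any production system:

```python
import random
from collections import defaultdict

# Learn P(next word | current word) from a tiny corpus by counting.
corpus = "the cat sat on the mat the cat saw the rat".split()
counts = defaultdict(lambda: defaultdict(int))
for cur, nxt in zip(corpus, corpus[1:]):
    counts[cur][nxt] += 1

def sample_next(word, rng):
    """Sample the next word from the learned conditional distribution."""
    candidates = list(counts[word])
    weights = [counts[word][w] for w in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

def generate(start, length, seed=0):
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        if out[-1] not in counts:  # dead end: no observed continuation
            break
        out.append(sample_next(out[-1], rng))
    return " ".join(out)

print(generate("the", 6))
```

Every bigram the toy emits already occurs in its corpus: the output is fluent recombination of what was seen, never a token outside the training distribution, which is the point the section goes on to make at scale.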

These strategies often demonstrate significant advantages in tasks with established paradigms or well-defined structures. Under the prerequisites of formal correctness and semantic coherence, they can efficiently reorganize existing elements and optimize candidate outputs at a low marginal cost. However, in the complex tasks emphasized by classical creativity theory, such as remote association, problem discovery (problem finding), and problem restructuring, these advantages no longer hold, and the limitations of the underlying generative pathways are exposed.

Findings, Redefinition, and Conceptual Integration

The process of identifying new findings, redefining existing parameters, introducing novel concepts, and integrating theoretical significance represents a sophisticated level of academic synthesis. In these areas, AI models frequently expose the limitations of their underlying generative pathways. While these models can aggregate vast amounts of data, they often struggle with the nuanced cognitive leaps required to redefine a field or establish the profound implications of a new discovery.

Conceptual Innovation and Theoretical Significance

True academic progress relies on the ability to move beyond pattern recognition toward conceptual innovation. This involves not only the discovery of new data points but also the redefinition of foundational frameworks. AI-generated content often lacks the intentionality required to synthesize disparate ideas into a cohesive new theory. Consequently, while AI can assist in the preliminary stages of literature review or data processing, the critical task of assigning meaning and establishing the broader significance of research remains a uniquely human intellectual endeavor.

Limitations in Generative Logic

The generative logic of current AI systems is primarily probabilistic, which can lead to "hallucinations" or the superficial application of technical terminology without a deep understanding of the underlying logic. When tasked with high-level conceptual integration, AI may produce text that appears structurally sound but lacks the rigorous internal consistency and original insight expected in high-impact academic scholarship. This exposure of generative constraints highlights the necessity for human oversight in ensuring that new concepts are both theoretically sound and practically relevant.

Current machine learning models face significant limitations, including heavy data dependency, weak generalization capabilities, and low sensitivity to semantic mutations \cite{Lake2018, Mednick1962, Mumford2012, Ontanon2022, Wold2025}. An intuitive comparison can be drawn between performance in board games versus poetic couplet generation. In closed-rule game systems with clear objectives and decidable outcomes, such as Go or Chess, models can stably output "novel" strategies, verifiable through win rates, via self-play training (e.g., AlphaGo's "Move 37"). However, in language games such as Chinese couplets and classical poetry creation, the situation is markedly different. Although models can adhere to formal constraints like tonal patterns (pingze), parallelism, rhyme schemes, and character counts to rapidly generate multiple candidates, they generally struggle with the problem of "easy formal compliance, difficult semantic integration" \cite{Yi2018, Yan2016, Chen2024, Jiang2020}.
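The asymmetry of "easy formal compliance, difficult semantic integration" is visible in how little code the formal side takes: length and position constraints are mechanically checkable, while whether the second line genuinely answers the first in meaning is not. A toy checker (the example couplet and the simplified rules are illustrative; real pingze verification requires a tone dictionary):

```python
def formally_matched(upper: str, lower: str) -> bool:
    """Check mechanical couplet constraints: equal length and no
    repeated character in the same position. Real checkers would also
    verify tonal patterns (pingze) against a tone dictionary."""
    if len(upper) != len(lower):
        return False
    return all(u != l for u, l in zip(upper, lower))

# Formal compliance is decidable in a few lines; semantic parallelism,
# the part generative models get wrong, has no such simple test.
assert formally_matched("春风得意", "秋月无声")
assert not formally_matched("春风得意", "春光无限")  # repeats 春 at position 0
```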

Further research indicates that in these linguistic tasks, models frequently exhibit forced semantics, loose thematic coherence, misuse of literary allusions, and contextual mismatches \cite{He2012, Yan2016, Liu2022}. Similar issues are prevalent in visual image generation tasks. For instance, while models can produce images with clear compositions and coordinated colors, they frequently commit common-sense errors, such as generating "six-fingered hands." These failures suggest that the models do not lack image-processing capability per se, but rather the ability to integrate underlying meaning.

AI can reconstruct shapes and imitate brushstrokes, yet it struggles to understand the "semantic consistency" between the number of fingers and their functional purpose. This is because AI has never possessed or experienced a "hand." This lack of experience defines the disembodied nature of AI—it lacks an existential body, sensory organs, and behavioral experience. Consequently, it cannot bridge the gap between symbolic structures and the meaning of life.

In human cognition, the generation of meaning relies not only on language and logic but is also deeply rooted in bodily experience and situational interaction. Lacking this embodied experience, AI remains unable to truly "understand" the biological logic and experiential significance behind its outputs, even if it can simulate linguistic and visual rules \cite{Bender_Koller_2020, Bisk_Holtzman_Thomason_et_al_2020}. Therefore, whether it manifests as semantic defocusing in language generation or common-sense mismatches in image generation, these phenomena reflect the same fundamental dilemma: AI cannot organize local forms into the coherent meanings that humans acquire through embodied experience.

In summary, within the framework of the classical definition of creativity, models can generally and reliably satisfy the criterion of "fluency." While these systems perform well across metrics such as fluency, flexibility, originality, and elaboration, they remain significantly deficient in the more complex dimensions of creativity, particularly in tasks involving remote association, problem restructuring, semantic integration, and the establishment of value criteria (Koestler, 1964; Mednick, 1962; Mumford et al., 2012).

[Footnote 3] When an image of a palm with six fingers is submitted to such a system for recognition, the system fails to perceive any distinction between this "six-fingered hand" and a normal palm. Correspondingly, if we task the model with generating an "image of a palm with six fingers," the output is often erratic and fails to conform to the instruction.

2.3 Why Can Machines Score Highly on Classic Creativity Measurement Tasks?

The classic definition of creativity is the ability of a subject to produce ideas, works, or solutions that are both novel and useful \cite{Runco2012}. Based on this definitional framework, classic creativity assessment tools (most notably the Torrance Tests of Creative Thinking, TTCT) place divergent thinking at the core of evaluation. These tests primarily measure an individual's performance across four dimensions: fluency, flexibility, originality, and elaboration \cite{Torrance1974, Warne2022}.

In recent years, five major categories of creativity measurement standards frequently adopted in global psychology journals, along with their core indicators, have come to be regarded as the most consensual "benchmarks" in current research (see Table 1) \cite{Carson2005, Diaz2024, Dollinger2005, Gough1979, Hadas2025, Hongdizi2023, Kapoor2023, Ogurlu2023, RuncoAcar2012, Zhang2023}.

[TABLE:1]

Table 1. Major categories of creativity measurement standards and their core indicators.

1. Alternative Uses Task (AUT; divergent thinking). Typical scoring indicators: fluency, originality, flexibility, elaboration. Representative literature: Ogurlu & Runco, 2023; Hadas et al., 2025; Hongdizi et al., 2023. Evaluates the ability to generate diverse and rare ideas in an open-ended context.
2. Remote Associates Test (RAT; convergent thinking). Typical scoring indicators: number of correct answers, response time. Representative literature: Wu et al., 2020; Diaz et al., 2024. Measures the efficiency of finding a unique cross-domain association among multiple semantic paths.
3. Creative Achievement Questionnaire (CAQ). Typical scoring indicator: diachronic creative achievement accumulated across 10 domains. Representative literature: Carson et al., 2005; Zhang et al., 2025. Tracks authentic, socially recognized creative achievements over time.
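The AUT indicators in Table 1 are simple to operationalize, which helps explain why a fast, broad generator scores well on them. A minimal scorer follows; the norm sample, category labels, and the 5% rarity cutoff are illustrative assumptions, not a standardized protocol:

```python
from collections import Counter

def aut_scores(responses, category_of, sample_counts, rarity_cutoff=0.05):
    """Score one participant's Alternative Uses Task responses.
    fluency     = number of ideas produced
    flexibility = number of distinct categories used
    originality = ideas rarer than `rarity_cutoff` in the norm sample
    """
    total = sum(sample_counts.values())
    fluency = len(responses)
    flexibility = len({category_of[r] for r in responses})
    originality = sum(
        1 for r in responses
        if sample_counts.get(r, 0) / total < rarity_cutoff
    )
    return {"fluency": fluency, "flexibility": flexibility,
            "originality": originality}

# Hypothetical norm sample: how often each "uses for a brick" idea occurs.
norm = Counter({"build a wall": 60, "doorstop": 25, "paperweight": 10,
                "grind into pigment": 1})
category = {"build a wall": "construction", "doorstop": "weight",
            "paperweight": "weight", "grind into pigment": "material"}

scores = aut_scores(["doorstop", "paperweight", "grind into pigment"],
                    category, norm)
print(scores)
```

A system that can emit many responses quickly, spread across categories, and retrieve rare combinations maximizes all three counters at once, which is exactly the structural advantage discussed later for LLMs.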

Introduction

The integration of technology and the humanities has become a defining characteristic of contemporary academic discourse. This interdisciplinary approach is particularly evident in the fields of art and science, where the boundaries between creative expression and empirical methodology are increasingly blurred. By leveraging machine learning and deep learning techniques, researchers are now able to analyze artistic styles, historical trends, and scientific data with unprecedented precision.

1.1 The Intersection of Art and Science

The relationship between art and science is not merely one of mutual influence but of fundamental convergence. In the realm of digital humanities, computational tools are employed to quantify aesthetic qualities that were previously considered subjective. For instance, the application of neural networks to large-scale datasets of historical paintings allows for the identification of subtle stylistic shifts that may elude the human eye. Conversely, scientific visualization utilizes artistic principles to represent complex data in a manner that is both intuitive and informative.

[FIGURE:1]

1.2 Methodological Framework

To bridge these two domains, we propose a framework that utilizes advanced algorithmic models to process diverse data types. The core of this methodology relies on the extraction of features from both visual and quantitative sources. By applying $\mathcal{F}$ to the input space, we can map disparate elements into a unified latent representation. This process is governed by the following relationship:

$$
\begin{aligned}
L(x, y) = \sum_{i=1}^{n} \omega_i \cdot \phi(x_i, y_i) + \lambda R(\theta)
\end{aligned}
$$

where $x$ represents the artistic features, $y$ represents the scientific parameters, and $\phi$ denotes the transformation function. The regularization term $R(\theta)$ ensures that the model maintains generalizability across different cultural and historical contexts.
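Operationally, the objective above is just a weighted per-pair loss plus a regularization term; a direct transcription (function and variable names are ours, chosen for illustration):

```python
def loss(pairs, weights, phi, theta, lam, R):
    """L(x, y) = sum_i w_i * phi(x_i, y_i) + lambda * R(theta)."""
    return sum(w * phi(x, y)
               for (x, y), w in zip(pairs, weights)) + lam * R(theta)

# Example instantiation: squared error with an L2 penalty on theta.
phi = lambda x, y: (x - y) ** 2          # transformation function
R = lambda th: sum(t * t for t in th)    # regularizer R(theta)

value = loss(pairs=[(1.0, 0.5), (2.0, 2.0)], weights=[1.0, 2.0],
             phi=phi, theta=[0.3, 0.4], lam=0.1, R=R)
print(value)  # 0.25 + 0 + 0.1 * 0.25 = 0.275
```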

1.3 Data Collection and Processing

The efficacy of machine learning models in this context depends heavily on the quality of the training data. Our dataset comprises high-resolution scans of classical artworks and corresponding metadata from scientific archives. As noted in \cite{Ref1}, the preprocessing stage is critical for removing noise and artifacts that could bias the model's output. We employ a multi-stage pipeline to normalize the data, ensuring that feature values are consistent across all samples.

[TABLE:1]

As shown in [TABLE:1], the distribution of data points across various categories highlights the diversity of the collected sources.

Research Methodology and Evaluation

The assessment of creative achievements (such as academic research, creative writing, and other professional outputs) often relies on the Consensual Assessment Technique (CAT) to directly evaluate the quality and novelty of the work. This method is widely regarded as the "gold standard" in creativity research because it leverages the collective expertise of domain specialists to provide a reliable measure of a work's significance.

Novelty and the Consensual Assessment Technique

The Consensual Assessment Technique is particularly effective for evaluating complex creative products where objective metrics may be insufficient. By utilizing a panel of independent experts who possess deep knowledge of the specific field, the CAT ensures that the evaluation reflects the current standards and nuances of the discipline. These experts assess the work based on several dimensions, primarily focusing on novelty—the degree to which the work is original and unexpected—and appropriateness—the extent to which the work is useful, valuable, or meets the constraints of the task.

[TABLE:1]

In the context of academic writing and scientific discovery, the application of CAT allows for a nuanced understanding of how a particular contribution advances the field. Unlike automated metrics that may only count citations or keywords, expert assessment can identify the subtle shifts in theoretical frameworks or the innovative application of methodologies that characterize high-impact research.

Direct Evaluation of Work Quality

Directly evaluating the quality of a work involves a comprehensive review process that goes beyond surface-level characteristics. Experts are typically asked to provide ratings across multiple scales, which are then aggregated to reach a consensus. This process minimizes individual bias and highlights the strengths and weaknesses of the achievement from a professional standpoint.

[FIGURE:1]

Furthermore, the integration of machine learning and deep learning techniques has begun to complement traditional assessment methods. While expert consensus remains the benchmark, computational models can assist in identifying patterns of novelty or linguistic sophistication in large datasets of creative work. By combining human expertise with algorithmic precision, researchers can achieve a more robust and scalable evaluation of creative and academic achievements.


Dollinger & Shafran's Creative Product Semantic Scale (CPSS) is commonly used for pre- and post-intervention measurements or cross-group comparisons of expert ratings regarding creative aesthetics and surprise. Similarly, the Kaufman Domains of Creativity Scale (K-DOCS) \cite{Zhang2023} captures creative engagement across multiple domains, ranging from everyday to professional levels. Meanwhile, the Creative Personality Scale (CPS) \cite{Kapoor2021} measures the frequency of creative behaviors and personality traits based on a self-report personality matrix.

Recent studies have found that Large Language Models (LLMs) often score significantly higher than the human average on these standardized tasks, sometimes even outperforming highly creative individuals \cite{Cropley2023, Guzik2023}. This result does not stem from LLMs possessing genuine high-level creative ability, but rather from their structural advantages within the assessment dimensions.

For instance, regarding fluency, LLMs can output a vast number of candidate responses in an extremely short time through massively parallel generation mechanisms, easily satisfying the quantitative criterion of "idea quantity." In contrast, humans are constrained by working memory capacity and information retrieval speed, making it difficult for even highly creative individuals to match the performance levels of these models. In terms of flexibility, the training corpora of LLMs cover numerous knowledge domains, allowing models to switch rapidly between different semantic clusters and demonstrating an advantage in cross-category generation. Regarding originality (or rarity) scores, high-speed retrieval capabilities and massive training sets enable models to generate combinations that are relatively rare in human samples within a limited timeframe, thereby securing high scores in this dimension.
In terms of elaboration, the algorithms of LLMs represent a form of fully rational inference; this ensures they rarely make logical errors and remain unaffected by irrational factors (such as emotion) during generalized reasoning. Consequently, they can generate logically rigorous, clearly structured, and rationally ordered content. In divergent thinking or "remote association" tasks, generative LLMs similarly rely on their advantages in information processing capacity and retrieval efficiency to obtain higher scores than humans. When humans perform such tasks, they are limited by working memory capacity. It is well-established that the instantaneous holding capacity of human working memory is approximately $7 \pm 2$ information units \cite{Miller1956}. This means that when faced with the need to generate large-scale, cross-domain, and structurally diverse novel results, human individuals often find it difficult to maintain the simultaneous activation and transformation of multiple parallel lines of thought. LLMs, however, based on nearly infinite data pools and computing power, can execute large-scale information retrieval and comparison in a very short time.

They can not only discover a large number of information combinations from the corpus that are difficult for humans to find or have been forgotten, but also completely preserve all previously executed generation paths and semantic mapping records. In summary, the creative behavior of LLMs is essentially an efficient coverage and compressed expression of a large-scale combinatorial space; it aligns well with the scoring orientations of "diversity," "flexibility," "originality," "elaboration," and "remote association" in creativity assessments. Therefore, LLMs do not achieve "high creativity scores" by effectively simulating the human creative mind, but rather by satisfying the formalized requirements of creativity assessment indicators through technical means under entirely different information processing conditions.

Multiple assessment results show that GPT-4's scores in the originality and fluency dimensions of creativity tasks reach the top 1% of human samples, with flexibility also falling within the 93rd to 99th percentile range \cite{Colton2012, Guzik2023}. This performance is "near-perfect" within the scoring systems of these scales. However, comparisons across recent evaluations reveal a noteworthy trend: over the past 12 months, the scores of mainstream LLMs on tasks such as the Divergent Association Task and the Alternative Uses Task have not continued to rise, and have even stagnated \cite{ScienceDirect2023, UMWestern2023, TechnologyNetworks2023, Tykierko2025}. This suggests that the training corpora and combinatorial strategies on which current LLMs rely may be approaching the limits of their expressive space. We can refer to this corpus-bounded limit of expressive space as the "creativity bottleneck" of LLMs. In other words, models like GPT-4 can obtain high creativity scores within existing scoring frameworks, but these scores do not imply that the models possess the capacity for genuine creative expansion or cognitive breakthroughs.

More critically, creativity measurement tools themselves possess structural biases, a typical case being the widely discussed "fluency-originality coupling." Specifically, because scoring rules often use rarity as the primary indicator of originality, the more results a participant generates (i.e., the higher the fluency score), the higher the probability that rare answers will appear, thereby systematically inflating the originality score. This phenomenon is particularly significant in the TTCT Figural tasks. Existing research shows that the correlation coefficient between these two scores can reach as high as $r = 0.80$ (with typical values around $0.70$) \cite{Kim2004, Kim2006, Kim2006b}. Studies from reliability and equivalence testing have found that when subtests with potentially overlapping responses are removed, the correlation between the two drops significantly. This indicates that high scores on these tests only reflect a portion of the subject's creative processing ability, while another portion of the score may be a statistical bias caused by the task structure itself \cite{Acar2023, Forthmann2020, Said-Metwaly2018}. For LLMs supported by big data and high computing power, the scoring advantage resulting from this structural bias is even more pronounced.

This implies that traditional creativity scales amplify the importance of fluency and flexibility and use a low-threshold rarity index to approximate originality. Consequently, they lack sufficient consideration and effective differentiation of creativity dimensions such as the strength of remote associations, problem restructuring, conceptual introduction, and theoretical integration. Under this evaluation framework, LLMs can leverage their massive training corpora and high computing power to naturally excel in tasks involving "fluency-flexibility-unique combinations," thus earning a "highly creative" evaluation. However, this scoring performance is not equivalent to "originality in the strong sense." Especially when tasks involve value judgment and deep integration of meaning, the restructuring of problem spaces, and paradigm shifts across semantic scenarios, the creative capacity demonstrated by LLMs declines significantly \cite{Koestler1964, Mednick1962, Mumford2012}.

2.4 Re-examining the Meaning and Value of Human Creativity

The high scores achieved by models in creativity tests provide a compelling reason to deconstruct classical creativity assessment schemes and reconstruct the meaning of human creativity. By using fluency, flexibility, originality, and elaboration as core evaluation metrics, these assessment schemes inevitably and systematically overestimate the creativity of models while simultaneously overlooking the unique advantages of humans in high-order originality and the capacity for paradigm shifts. The structural difficulties exhibited by deep learning systems when processing out-of-distribution (OOD) data serve as a direct critique of this evaluative orientation \cite{Lake & Baroni, 2018; Liu et al., 2021}. Biases in the evaluation of model creativity have led to two alarming trends in current public opinion: on one hand, some have begun to doubt the level and value of human creativity due to the startling generative capabilities of models; on the other hand, others misjudge or dismiss the potential auxiliary creative functions of these models by overemphasizing their difficulties in specific tasks.

The fundamental purpose of analyzing model creativity is to gain a clearer understanding of the characteristics and value of human creativity. Deconstructing classical creativity assessment systems in light of model performance is essentially an effort to reconstruct the definition and evaluation strategies of human creativity within the context of AI. This serves as a starting point for further reflection: as AI continues to advance, what is the meaning and value of human existence and development? How should humans view AI rationally, and how can we establish a relationship of cooperation and symbiosis with it?

This reconstruction must be grounded in the fundamental ontological and epistemological differences between humans and AI. Human creative activity is an adaptive expression of groups or individuals within the competitive struggle for survival driven by natural selection; it is inherently embodied.

Human creativity is aimed at meaning generation and problem-solving; its cognitive processes are constrained by limited computational power and incomplete experiential accumulation, representing a cognitive expression of "small computing power and small data pools." In contrast, AI is a typical disembodied cognition system whose generative capacity relies on preset procedural architectures, massive datasets, and high-density computational support. In essence, AI is a process of system optimization based on symbol processing and pattern mapping, rather than the generation and construction of meaning. Simply put, human creativity fundamentally serves the survival of the subject and the continuation of the species, whereas the operation of AI lacks any teleology or purpose linked to its own existential needs.

Therefore, we advocate for a redefinition of creativity within an evolutionary dynamics framework centered on survival and development, understanding it as a functional characteristic oriented toward the "effect-to-cost ratio." This proposed definition of creativity will provide a theoretical foundation for constructing creativity measurement models applicable to human cognitive systems. Furthermore, it offers a quantifiable and computable representation system for creativity, allowing human creativity and AI-generated outputs to be integrated into a unified epistemological framework for comparative evaluation.
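As a back-of-the-envelope illustration of the effect-to-cost orientation: the figures for brain power and GPT-3 training energy are the published estimates cited later in this paper, while the equal-value normalization is a deliberate simplification of the hard part, measuring output value itself. The comparison is a sketch, not an implementation of the proposed metric.

```python
def creativity(output_value: float, energy_joules: float) -> float:
    """Effect-to-cost ratio: creative output value per joule expended."""
    return output_value / energy_joules

# Hypothetical comparison: one hour of human ideation vs. total model
# training.  Both are assigned the same normalized output value V,
# purely to isolate the cost side of the ratio.
V = 1.0

BRAIN_POWER_W = 20                 # upper estimate for the human nervous system
brain_joules = BRAIN_POWER_W * 3600

GPT3_TRAINING_MWH = 1287           # published estimate for GPT-3 training
training_joules = GPT3_TRAINING_MWH * 1e6 * 3600

ratio = creativity(V, brain_joules) / creativity(V, training_joules)
print(f"human/AI effect-to-cost ratio at equal output value: {ratio:.1e}")
```

Under these crude assumptions the per-joule ratios differ by roughly seven orders of magnitude; amortizing training cost over many inference episodes would shrink the gap, and that accounting choice is exactly what a cost-sensitive definition forces into the open.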

3 Analysis

Before beginning a discussion on the significance of creativity, it is essential to distinguish between the meaning and value of creative activity versus that of creativity itself. Creative activity describes a specific type of organic behavior—namely, a series of actions capable of producing novel and effective results. In contrast, creativity refers to the evaluative measure of an individual's capacity to engage in such activities. As previously mentioned, when comparing artificial intelligence with humans, the conceptual distinction between creative activity and creativity is often overlooked. Consequently, we may be so struck by the stunning performance of a generated work that we adopt the work itself as the sole criterion for judging "creativity." Furthermore, we may fail to recognize that the performance of AI relative to humans varies significantly across different types of creative activities.

Based on the phenomenological expression of creative outcomes, we categorize creative activities into three distinct forms: interpolation, extrapolation, and epistemological leap. These forms correspond, respectively, to the reorganization of existing concepts, the expansion of knowledge boundaries, and the elevation or generation of new cognitive schemas.

3.1 Interpolation: Refining Knowledge within the System

Interpolation in this context refers to the structural refinement and systematic enhancement of knowledge within an existing intellectual framework through deductive reasoning. This type of creative activity centers on logical consistency and the completeness of the theoretical system.

Its core characteristic is that, although such activities do not break through the boundaries of the original problem framework, they achieve a high degree of refinement and organizational restructuring within that framework. Euclidean geometry is a typical representative of this type of creative activity: starting from a limited set of axioms and proceeding by deductive reasoning, it constructs a comprehensive and rigorously structured geometric system.

This type of creative activity can be defined as "system-internal optimizing creativity": under the premise that existing cognitive schemas remain stable, it enhances the organization and internal consistency of a knowledge system through the reorganization of internal elements and logical deduction \cite{Lenk, 2009}. The psychological representation of interpolative creativity typically manifests as the reorganization of elements supported by high levels of fluency and flexibility. Within the classical "novelty $\times$ effectiveness" definitional framework, interpolative creativity satisfies both the novelty and the appropriateness criteria of standard assessments \cite{Runco & Jaeger, 2012; Torrance, 1966/1974; Warne et al., 2022}.

The logical significance of "interpolation" can be expressed through the interpolation theorem (Craig): if an implication $\varphi \to \psi$ is valid, then there exists an intermediary proposition $\theta$ such that $\varphi \to \theta$ and $\theta \to \psi$ are both valid, and $\theta$ contains only the non-logical symbols shared by $\varphi$ and $\psi$. The interpolation theorem serves as a logical foundation for automated proof and program verification within formal systems. In modern computing, this "interpolative deductive reasoning" is widely applied in formal verification, theorem provers, and logic programming languages (such as Prolog). Its core operations are built upon enumerable derivations and optimal selection mechanisms within a defined rule space.
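Craig's interpolation theorem, the standard form of the interpolation result, guarantees for any valid implication $\varphi \to \psi$ an interpolant $\theta$ built only from their shared vocabulary. A toy propositional instance, checked exhaustively over all truth assignments (the particular formulas are chosen for illustration):

```python
from itertools import product

# Toy Craig interpolant: phi = p AND q, psi = q OR r.
# theta = q uses only the shared symbol q, and phi -> theta -> psi holds.
phi   = lambda p, q, r: p and q
theta = lambda p, q, r: q
psi   = lambda p, q, r: q or r

for p, q, r in product([False, True], repeat=3):
    assert not phi(p, q, r) or theta(p, q, r)   # phi -> theta
    assert not theta(p, q, r) or psi(p, q, r)   # theta -> psi
print("q interpolates between (p AND q) and (q OR r)")
```

Interpolants of exactly this kind are what SAT/SMT-based model checkers extract from refutation proofs when verifying programs, which is the "interpolative deductive reasoning" referred to above.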

The "creative solutions" demonstrated in well-structured tasks—such as chess, formal decision-making, and logical puzzles—can be understood as a complex form of interpolation. In these contexts, the model performs search and recombination operations within a defined rule space to generate problem-solving strategies that are formally compliant yet novel in their composition.

In information-theoretic terms, interpolative creativity can be understood as the efficient encoding of a broad range of facts using a shorter description length. Consequently, interpolative creativity manifests not only as the optimization and reorganization of systemic knowledge according to established rules but also as the regulation and adjustment of the rule systems themselves.
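This description-length intuition can be made tangible with a general-purpose compressor standing in, very loosely, for a theory: rule-governed data admits a far shorter encoding than patternless data of the same size. The two sequences below are arbitrary choices for illustration.

```python
import random
import zlib

# A rule-governed sequence: fully described by "i mod 7, for i < 10000".
structured = bytes(i % 7 for i in range(10_000))

# A patternless sequence of the same length.
random.seed(0)
noise = bytes(random.randrange(256) for _ in range(10_000))

print(len(zlib.compress(structured)), "bytes for the rule-governed data")
print(len(zlib.compress(noise)), "bytes for the noise")
```

The compressed size of the structured sequence is close to the length of the generating rule itself, which is the minimum-description-length reading of interpolative refinement: re-encode what the system already contains more economically.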

A prime example of this is David Hilbert's work in Foundations of Geometry (1899), where he reconstructed Euclidean geometry using 21 axioms and introduced formal discussions regarding axiomatic independence and systemic consistency. This case illustrates how "interpolative creativity" expands from the generation of theorems to a meta-level reflection on the rule system itself \cite{Blanchette, 2024; Hilbert, 1899; Wikipedia, 2025b; Zach, 2023}.

Generative AI models often demonstrate a high degree of creativity in "interpolative" tasks. A notable example is the award-winning work Théâtre d’Opéra Spatial (commonly known as "Space Opera House") from the 2022 Colorado State Fair digital arts competition. The piece performs a sophisticated interpolation centered on the conceptual and visual vocabularies of "space" and "opera house," blending these disparate elements into a cohesive and novel aesthetic expression: the elements are coordinated and harmonized through a unified stylistic fusion, presenting a level of "exquisite image craftsmanship" and an "aesthetic style of profound fit" \cite{Smithsonian2022, VICE2022, Hyperallergic2022}.

3.2 Extrapolation: Expanding the Boundaries of Knowledge

"Extrapolation" refers to the process of applying existing structures to new domains, typically relying on analogy and inductive reasoning. The essence of extrapolative creativity lies in expanding existing cognitive boundaries, thereby increasing the total information content of a knowledge system and reducing the uncertainty within the cognitive system.

From the perspective of organismal survival and adaptation, extrapolative creativity enables an organism to break through the original boundaries of its cognitive system. By introducing new elements—such as variables involving conditions or concepts—it reduces system uncertainty and achieves broader adaptability. This capacity for cognitive expansion makes extrapolation the "incremental" engine in the evolution of knowledge systems.

In contrast to the structural optimization within a knowledge system represented by interpolation, extrapolation represents a functional adaptation of a biological system to new environments. Extrapolative creativity primarily relies on inductive reasoning. However, the philosophical significance and epistemological foundations of inductive reasoning remain controversial. Hume's famous "problem of induction" led subsequent philosophers to recognize that inductive reasoning cannot achieve logical necessity from premises to conclusions \cite{Hume, 1748/2000}. Although philosophers after Hume have strived to argue for its validity, they have consistently failed to prove the logical legitimacy of inductive reasoning \cite{Russell, 1948/2004}. Nevertheless, in practical knowledge construction and behavioral decision-making, humans and other organisms universally demonstrate a reliance on inductive reasoning. Experimental research indicates that even insects with limited cognitive abilities, such as fruit flies, can form experience-based inductive extrapolations within their environments to perform path selection and resource anticipation \cite{Brembs, 1990}. This suggests that inductive extrapolation is not unique to complex agents but is a universal adaptive strategy possessed by biological individuals.

From an epistemological perspective, inductive reasoning is not only an important cognitive process for generating new concepts and forming categorical judgments, but also a key channel for abstracting more general understandings from local experiences. It often manifests in the form of analogical reasoning, characterized by the transfer of knowledge from an old domain to a new one; it is a vital method in the human cognitive system for promoting the "re-encoding of problem spaces" \cite{Gentner, 1983}. Therefore, the capacity for extrapolation is an instinct for living organisms to adapt to their environments, predict risks, and optimize resource allocation. By constructing generalized response tendencies from limited experience, organisms make optimal decisions amidst uncertainty. This is a process of information compression and entropy reduction that "foresees the whole through the part."

The improvement of capabilities in extrapolation tasks (such as generalization and reasoning) primarily relies on the iteration and optimization of algorithmic architectures: expanding model scale (such as the widening of Transformer architectures), enhancing context modeling, using chain-of-thought prompting, and employing code generation tools (such as code interpreters) to simulate multi-step Bayesian inference; reinforcement learning and memory mechanisms are likewise used to improve the efficiency of strategy transfer. However, these advances are essentially improvements in the computational efficiency of probabilistic inference and do not touch the transformation of the "problem space" itself.

Moreover, this enhancement of extrapolative capability is often predicated on extremely high energy costs. The training of large language models frequently consumes thousands of megawatt-hours of electricity; the training of GPT-3, for instance, was estimated at approximately 1,287 MWh \cite{Patterson et al., 2024}. To support the training and inference of next-generation models, some frontier institutions have begun constructing gigawatt-level computing and energy infrastructure \cite{Gao et al., 2025; Reuters, 2025; Tom’s Hardware, 2025}. Similarly, reinforcement learning systems represented by AlphaGo and AlphaZero involve massive computational overhead and long-running self-play learning phases \cite{Silver et al., 2017}.

In contrast, the human brain achieves efficient cognitive processing at extremely low energy cost. The power required to operate the nervous system is only about 10-20 W, and energy utilization in gray matter has been highly optimized through evolution via ion-channel dynamics and glucose metabolic pathways \cite{Attwell & Laughlin, 2001; Raichle & Gusnard, 2002}. Biological neural networks are far more efficient than existing general-purpose computing architectures in the value of inference and abstraction produced per unit of energy \cite{Sengupta, 2013; Sengupta, 2014}.
Such comparisons indicate that current AI, when performing extrapolative knowledge creation, primarily relies on a "brute-force probabilistic optimization" strategy driven by computing power, lacking the energy-efficiency trade-offs inherent in biological intelligence.

3.3 Leap: The Elevation of Cognitive Structures

Interpolation and extrapolation constitute the reorganization and generalization of cognitive activities within a two-dimensional problem space. In contrast, an "epistemological leap" allows creative activity to break through this two-dimensional plane, achieving a hierarchical elevation of three-dimensional cognitive structures and a reflexive reconstruction of theoretical systems. This "leap" refers to the cognitive advancement of the subject from the empirical level to the relational level, and ultimately to a higher structural level, resulting in a meta-structural reorganization of the original cognitive framework. For example, Newton began with the phenomenon of a "falling apple," abstracted concepts such as "motion" and "mass," and further reconstructed a unified physical framework for celestial and terrestrial mechanics. This led to the formulation of a unified mathematical expression capable of describing everything from the motion of an apple on Earth to the movement of celestial bodies. This series of cognitive leaps is not merely data generalization; rather, it represents a meta-structural reorganization of the problem space, explanatory structures, and evaluative metrics.

Among the three types of creative activities, leap-style creativity is characterized by the highest degree of compression, theoretical unity, and structural reconstructibility. Consequently, "leap-style creativity" occupies a particularly vital position in the process of cognitive evolution \cite{Hanson1958, Kuhn1962, Lakatos1978}.

3.3.1 From "Phenomena" to "Relations" and "Structures"

From "Relations" to "Structures": The higher-order manifestation of creativity is found in the extraction and construction of cognitive structures and meta-structures \cite{Piaget1950, Piaget1970}. Piaget’s genetic epistemology describes this process of cognitive development as a continuous cycle of "equilibration and re-equilibration."

This process reflects a power-raising cognitive hierarchy that ascends from "phenomena $\to$ relations $\to$ structures $\to$ meta-structures." In this progression, logico-mathematical relations are generated through the coordination of sensory-motor actions; subsequently, the coordination of these logical relations gives rise to cognitive structures and operational rules. Ultimately, this culminates in the formalized construction of a meta-rule system, resulting in the generation of a "cognitive schema." While current Large Language Models (LLMs) have demonstrated sophisticated combinatorial capabilities at the "phenomena-relations" level (such as pattern recognition and semantic matching), their essence remains the reorganization and fitting of explicit patterns. They do not yet possess the capacity to generate deep structures from heterogeneous phenomena. In contrast, human cognitive creativity often occurs at the level of structural abstraction. For instance, Newton unified terrestrial and celestial mechanics through "universal gravitation," and Darwin integrated species variation through "natural selection." Both examples illustrate the process of achieving structural abstraction through cognitive identification to construct overarching meta-theories.

Furthermore, a structural leap is not merely a local adjustment at the "relational level"; rather, it involves the construction of a new "structural layer" to reshape the problem space. This transformation transcends the simple identification or combination of existing input-output relationships. In other words, the key to saltatory (leap-like) creativity is not the discovery of more correlations, but the proposal of a structural framework capable of governing or replacing old relational systems. This allows previously scattered or incompatible information to be integrated into a new explanatory system.

Further leaps beyond the "structural layer" manifest as the construction and reorganization of "structures of structures," or meta-structures. This process is no longer confined to rule systems within specific domains; instead, it introduces meta-rules and formalized grammars capable of governing multiple cognitive domains and multi-level representations, thereby achieving cross-domain unified representation and problem transformation. For example, calculus abstracts continuous change into an operable limit process; group theory formalizes symmetry into a transferable operational structure; and information theory uses entropy as a core metric to quantify uncertainty and information volume. Abstractions at this meta-structural level not only enhance the compressibility and universality of knowledge organization but also provide the logical and regulatory foundations for the development of new theoretical frameworks.

3.3.2 A Classic Case

The Unity and Verifiability of the Law of Universal Gravitation

The "unity and verifiability" of the Law of Universal Gravitation represents one of the most iconic and transformative leaps in the history of scientific creation. A primary example of this is Newton's synthesis of the laws governing the motion of terrestrial objects and celestial bodies.

Before Newton, the prevailing view—rooted in Aristotelian philosophy—maintained a strict dichotomy between the "sublunary" sphere (Earth) and the "superlunary" sphere (the heavens). It was believed that objects on Earth followed linear paths and were subject to decay, while celestial bodies moved in perfect circles and were eternal. Newton shattered this dualism by demonstrating that the same physical force—gravity—governed both the falling of an apple and the orbiting of the Moon.

This unification was not merely philosophical; it was rigorously mathematical and verifiable. By applying the inverse-square law, Newton was able to calculate that the force required to keep the Moon in its orbit was consistent with the acceleration of gravity measured on the Earth's surface, adjusted for distance. This leap in reasoning transformed physics from a collection of isolated observations into a coherent, universal system. The ability to verify celestial mechanics through terrestrial experimentation remains a cornerstone of the modern scientific method, proving that the laws of nature are consistent throughout the universe.
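Newton's "Moon test" can be redone in a few lines with modern textbook values (the constants below are standard reference figures, not Newton's own data): scale surface gravity down by the inverse-square law and compare it with the Moon's centripetal acceleration computed from its orbit.

```python
import math

# Newton's "Moon test": does the inverse-square law scale surface gravity
# down to the Moon's observed orbital acceleration?
g = 9.81                # m/s^2, gravitational acceleration at Earth's surface
r_earth = 6.371e6       # m, mean Earth radius
r_moon = 3.844e8        # m, mean Earth-Moon distance (~60 Earth radii)
T = 27.32 * 86400       # s, sidereal month

predicted = g * (r_earth / r_moon) ** 2        # inverse-square scaling
observed = 4 * math.pi**2 * r_moon / T**2      # centripetal acceleration

print(f"predicted {predicted:.5f} m/s^2, observed {observed:.5f} m/s^2")
```

The two accelerations agree to about one percent, which is the verifiability half of the claim: a single inverse-square structure is checked against two phenomenologically unrelated observations.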

The Construction of a Unified Theory

The falling of an apple and the orbital motion of the Moon around the Earth are perceived as distinct types of kinetic events at the level of sensory experience. The former is a common observation within the Earth's immediate vicinity, while the latter is an astronomical phenomenon; phenomenologically, there is no direct correlation between the two. Newton's core insight lay in the unified integration of these two classes of phenomena within the Law of Universal Gravitation and the Three Laws of Motion \cite{Newton, 1687/1999}. This transition reflects a power-raising elevation of cognitive structures in three specific dimensions. The first is unification: by using the Law of Universal Gravitation to characterize the shared mechanism governing both terrestrial and celestial motion, Newton achieved a mechanical fusion in which "all paths lead to one," dismantling the traditional celestial-terrestrial meta-opposition and the epistemological divide it entailed \cite{Janiak, 2008}.

The second is compressibility. Through a few fundamental laws (such as $F = ma$ and $F = G m_1 m_2 / r^2$), it is possible to derive quantitative predictions for a vast array of complex phenomena, including tidal variations, cometary trajectories, and planetary motion \cite{Cohen1985}. In the sense of information theory, this embodies the principle of using a shorter description length to cover a broader range of empirical data.

The principle of parsimony, which seeks to explain a "broad domain of phenomena" with minimal complexity, is a cornerstone of inductive inference (Li & Vitányi, 2008).
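The compression claim can be made concrete: for a circular orbit, Kepler's third law falls out of the two laws quoted above in two lines, with $M$ the central mass and $T$ the orbital period:

```latex
\frac{G M m}{r^{2}} = m\,\frac{v^{2}}{r}, \qquad v = \frac{2\pi r}{T}
\;\;\Longrightarrow\;\;
T^{2} = \frac{4\pi^{2}}{G M}\,r^{3}.
```

An entire empirical regularity, induced by Kepler from decades of observation, is thus recovered as a corollary of two short laws, which is precisely the shorter-description-length reading.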

The third is testability, a cornerstone of scientific theory. A robust theory must not only explain known phenomena but also generate quantifiable new predictions, allowing for the measurement of deviations and subsequent model refinement. A classic example is how the combination of Newtonian mechanics and observed anomalies in Uranus's orbit led to the prediction of Neptune's existence. This process constitutes an "adaptive feedback loop" in an evolutionary sense \cite{Lakatos, 1978}.

From the perspective of cognitive creation, this process involves a leap from observing "phenomenal relationships" to establishing a unified representation of "structure," and further ascending to power-raising operations on "meta-structures." Within a collective knowledge system, this upward operation manifests as a paradigm shift in the Kuhnian sense. A paradigm shift represents a fundamental update to the "worldview" adopted by a scientific community \cite{Kuhn & Hacking, 1970}. Lakatos also noted that progress in scientific theory is not merely incremental accumulation but rather the evolution of "research programs"; the hallmark of such progress is when a new theory surpasses the old paradigm in both unity and predictive power \cite{Lakatos, 1978}. Furthermore, Newton's theory provided a paradigmatic template for "structural compression" in subsequent scientific developments.

For example, Maxwell's equations unified electricity and magnetism into a single electromagnetic theory, while Einstein's general relativity replaced Newtonian models of gravity and motion with a geometric framework. In essence, both can be viewed as higher-order transitions from "structure to meta-structure" \cite{Piaget1950, Piaget1970}. Furthermore, classic creative breakthroughs in the history of science represent the manifestation of such transitions at the level of collective knowledge \cite{Kuhn1962}.

3.3.3 Embodied Experience: Humans and AI

Human knowledge does not originate from the isolated operation of abstract computation; rather, it is rooted in a triple embedding of embodied experience, concrete situations, and social practices. First, the body serves not only as an actuator for perception and action but also forms the initial cognitive framework through motor coordination, spatial navigation, and the perception of bodily boundaries \cite{Gallagher2005, ThelenSmith1994}. Second, cognitive activity always occurs within specific contexts, constrained by factors such as task goals, tool availability, and environmental affordances, a property known as "situatedness" \cite{Suchman1987}. Third, social norms, linguistic systems, and cultural practices constitute the interactive context for knowledge formation, ensuring that knowledge possesses not only functional efficacy but also group-shared intelligibility and public evaluative metrics \cite{Tomasello1999, Vygotsky1978}.

Within this multiply nested embedding, the abstract structure of individual knowledge is not bestowed top-down; instead, it is gradually constructed through multi-channel coordination and feedback during embodied operation and social interaction. This process is characterized by both historicity and structurality: the accumulation of experience is not linear growth but a leap from operational structures to abstract formats achieved through representational reorganization and cross-layer integration \cite{Barsalou2008, Clark1997}. Supported by social embedding, human cognition demonstrates highly hierarchical, parallel, and cross-modal coordination capabilities \cite{Barsalou2008}. One of its core features is the mind's ability to jump and align between multiple "cognitive windows," much like an operating system \cite{Baars1993, Nelson1990}. These cognitive windows may include the sensorimotor layer (e.g., audiovisual feedback and spatial manipulation), the relational-symbolic layer (e.g., understanding metaphors, identity, and power relations), and the physico-logical layer (e.g., abstract reasoning and model construction). In complex tasks, humans can rapidly switch between windows and integrate multi-level information through metacognitive regulatory mechanisms \cite{Clark2015}.

Findings in neuroscience support this multi-layer coordination model. The cerebral cortex exhibits a progressive hierarchical structure ranging from primary sensorimotor areas, through the limbic system (emotional processing areas such as the amygdala and cingulate gyrus), to the abstract planning and control systems supported by the prefrontal cortex \cite{Fuster2000, Mesulam1998}. These three layers roughly correspond to an information processing hierarchy of central, intermediate, and peripheral systems: the central system forms goals and plans, the intermediate system integrates cross-modal information, and the peripheral system implements strategies through interaction between the body and the environment, feeding back to adjust plans. Cognitive transition unfolds precisely within this nested hierarchical system; the coordination between levels allows knowledge not only to form structures but also to achieve a "power-raising" operation from structure to "meta-structure."

Although generative systems (such as large language models) have demonstrated significant success in simulating surface-level human linguistic structures and identifying statistical patterns, the probabilistic modeling algorithms on which mainstream AI relies remain confined to conditional fitting and sampling optimization within a multi-layered parameter space.

While their deep neural networks possess an algorithmic architecture capable of multi-level transformations—seemingly approximating the power-raising operations of any complex function—this "algorithmic hierarchy" does not correspond to the "levels of meaning" found in human embodied cognition. Instead, it represents the superposition and optimization of linear computational mapping chains \cite{Bengio2021, Chollet2019}.

A more fundamental issue is that the "knowledge" of AI primarily stems from the statistical integration of large-scale corpora rather than the multi-layered nested structures of meaning formed through embodied action, social practice, and situational reflection. Due to a lack of sensorimotor coordination and emotional-intentional regulation, AI cannot perform situational adjustment, normative internalization, or value alignment on its generated outputs. Consequently, its performance in the cross-hierarchical reflexive control required for cognitive transitions remains unsatisfactory \cite{Bisk2020, Lake2017}. Furthermore, creativity at the meta-structural level requires a system to possess the capacity for reflexive adjustment of its own structures—namely, the understanding, modification, and reconstruction of generative rules. Current AI systems have yet to demonstrate meta-control over their own representational mechanisms; they can simulate the results of rules but cannot propose, evaluate, or reconstruct the rules themselves. This deficiency makes it difficult for AI to complete the cognitive leap from "structure" to "meta-structure," resulting in a lack of a cognitive path toward theoretical dimensionality elevation and paradigmatic transformation. Therefore, despite excelling in intra-paradigmatic optimization and combinatorial innovation, AI's creativity in terms of hierarchical transitions remains inherently limited by its computational architecture.

3.3.4 Individual and Collective Cognitive Leaps

Collective cognitive leap creativity manifests not only as the updating of knowledge content but, more fundamentally, as the reorganization of the cognitive architecture itself and a shift in paradigms. This process can be understood as a cognitive system jumping from one steady-state structure to a higher-order paradigm structure, thereby generating entirely new organizational logic and schemas of meaning. A leap is not a product of continuous deductive or inductive reasoning; rather, it is a nonlinear structural mutation. It is typically accompanied by the intensification of cognitive conflict, the deconstruction of existing structures, and a reconstruction toward a higher order, ultimately leading to the formation of a new stable structure. This mechanism is embodied in a dynamical sequence of "conflict–re-stabilization."

Jean Piaget's theory of cognitive development describes how an individual's cognitive structure continuously undergoes cycles of "equilibrium–disequilibrium–new equilibrium" during the developmental process. This dynamic process is driven by two core operations: assimilation, which involves integrating new information into existing schemas, and accommodation, which involves adjusting original schemas to restore structural consistency when faced with heterogeneous information. When the intensity of information input or cognitive conflict exceeds the system's regulatory threshold, the original cognitive schema faces collapse. This triggers a systemic leap, leading to the generation of a new, higher-level cognitive organizational hierarchy.

Through this process, the cognitive system completes its structural reconstruction and paradigm update. Thomas Kuhn noted that scientific paradigms possess several key characteristics. First is their framework nature: a paradigm defines the criteria for what constitutes a "valid problem" and a "reasonable solution," thereby determining the cognitive boundaries of research activities. Second is exclusivity: old and new paradigms are incommensurable, and a paradigm shift often entails a reconstruction of the entire world picture. Third is nonlinear saltation: paradigm changes do not evolve continuously but emerge through a chain of stages characterized by the "accumulation of anomalies–structural leap." Structurally, this model echoes the "re-equilibration" process proposed by Piaget; however, Kuhn extends this cognitive developmental process from the individual level to the evolutionary dimension of the scientific community's knowledge structure, emphasizing the collective cognitive rupture and worldview reshaping carried by paradigm shifts.

3.4 Mathematical Expression of the Three Forms of Creativity

To more precisely characterize the triple manifestations of creative activity, we further employ information theory, thermodynamics, and the free energy principle to perform quantitative modeling of these three forms of creativity.

3.4.1 Interpolative Creativity as Entropy Reduction

Interpolative creativity typically does not involve the expansion of knowledge boundaries; rather, it seeks optimal paths, more refined structures, or more efficient modes of organization within an existing framework. The core of this type of creativity lies in enhancing the internal order of a system and can therefore be characterized as a process of "entropy reduction." From the perspective of thermodynamic principles, the internal reorganization of order and structural optimization represent a transition from a high-entropy state to a low-entropy state. Internal entropy reduction does not alter the system's boundary conditions; instead, it improves the system's operational efficiency and explanatory power by strengthening internal coupling and the sophisticated processing of its structure.

First, thermodynamic entropy is used to measure the degree of disorder at the microscopic level or the breadth of distribution within a state space. A decrease in entropy implies that the system's state tends toward concentration, meaning its structure becomes more orderly and its constraints are strengthened. For example, when a physical system, sustained by external energy input, reorganizes internally while exporting entropy to its environment, its own entropy decreases, signifying a transformation from chaos to order. This mechanism is analogous to "interpolative creativity" within a knowledge structure: reorganizing and organizing elements within the system without expanding its boundaries to enhance overall order.

Second, from the perspective of information theory, Shannon entropy is defined as:

$$H(X) = -\sum_{i=1}^{n} P(x_i) \log P(x_i)$$

The formula measures the degree of uncertainty of a discrete random variable $X$; it can also represent the distribution breadth of information within a symbolic system. When a system undergoes processing such that its output probability distribution tends to concentrate—meaning the certainty of certain symbolic combinations increases (i.e., $P(x_i)$ increases)—the information entropy ($H(X)$) decreases. Consequently, the refinement of rules, compression of categories, and constraints on expression found in interpolative creativity constitute an "entropy reduction" process in the information-theoretic sense.

In other words, the compression of disorder achieved through deduction and integration within a logical structure manifests as a decrease in the uncertainty of the representational system. Georgiou investigated the dynamic characteristics of information entropy in cognitive processing \cite{Georgiou2005}. He found that during deductive reasoning operations, the cognitive system moves toward a more deterministic structure by continuously narrowing the space of possibilities, which is manifested as a decrease in information entropy. Organic systems must continuously introduce external energy or information to maintain or enhance their internal structural order. Correspondingly, creative cognitive activity can be viewed as an active entropy-reduction process aimed at transforming uncertain states of knowledge into a clearer, more refined, and structured system of expression.

Overall, the core characteristic of "interpolative creativity" is the generation of new structures by enhancing internal systemic order without expanding the problem space. To this end, we can model it as an entropy-reduction process within a knowledge system.

Let a system transition from an initial state $S_0$ to a final state $S_1$ through interpolative reasoning and structural optimization, with corresponding entropy values $H_0$ and $H_1$. The change in entropy is:

$$\Delta H = H_1 - H_0$$

Based on this definition, the "Interpolative Creativity Value" ($C_{int}$) can be expressed as:

$$C_{int} = -\Delta H = H_0 - H_1$$

That is to say, the system reduces internal uncertainty through structural reorganization, and the magnitude of this entropy reduction is equivalent to the "interpolative creativity value." This definition characterizes the refinement process of a knowledge system as it moves from "disorder" to "order."
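As a minimal numeric sketch, the interpolative creativity value $C_{int} = H_0 - H_1$ can be computed directly from Shannon entropy. The two distributions below are invented for illustration: a uniform (maximally disordered) state before reorganization, and a concentrated state after refinement.

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(X) = -sum p_i * log2(p_i), in bits.
    Zero-probability outcomes contribute nothing to the sum."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

# Hypothetical knowledge system: before reorganization, four symbolic
# combinations are equally likely (maximal disorder for n = 4).
H0 = shannon_entropy([0.25, 0.25, 0.25, 0.25])  # 2.0 bits

# After interpolative refinement, the distribution concentrates on one
# combination: rules sharpened, categories compressed.
H1 = shannon_entropy([0.85, 0.05, 0.05, 0.05])

# Interpolative creativity value: C_int = -ΔH = H0 - H1 (> 0 here).
C_int = H0 - H1
print(f"H0 = {H0:.3f} bits, H1 = {H1:.3f} bits, C_int = {C_int:.3f} bits")
```

Any reorganization that concentrates the output distribution yields a positive $C_{int}$; a reorganization that spreads probability mass out would yield a negative value, i.e., an entropy increase rather than interpolative creation.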

3.4.2 Extrapolative Creativity: Information Gain and Uncertainty Control

Extrapolative creativity occurs when an individual processes unknown environments or novel tasks by constructing hypothetical models or employing analogical transfer to new domains. This process serves to compress the uncertainty space of the knowledge system and enhance decision-making accuracy. It embodies the dual objectives of increasing systemic information gain and reducing environmental uncertainty; specifically, it expands the application scope of the existing knowledge base while simultaneously strengthening the cognitive system's adaptability and robustness when faced with atypical inputs.

From the perspective of information theory, extrapolation can be understood as the extraction of generalizable low-dimensional structures from high-entropy states within the input space. This allows for a "prior commitment" to a latent model, which is then continuously refined and updated through verification processing. Consequently, extrapolative creativity is not merely a rational allocation of cognitive resources, but an "output of order" directed toward the unknown \cite{Friston2010, Jaynes2003}. The "selective information sampling" model proposed by Delisle provides a biological analogy for extrapolative creativity based on cognitive constructivism. According to this model, biological agents do not adopt random information-gathering strategies when facing complex and dynamic environments. Instead, characterized by goal-orientation and structural sensitivity, they actively collect information fragments that possess predictive utility for future actions. This sampling mechanism demonstrates a preference and exploratory tendency toward latent structural regularities in the environment, enabling the agent to extract key patterns with extrapolative potential under highly uncertain input conditions. By continuously reinforcing the perception of "local integrability" within uncertain regions, individuals gradually develop structural understandings and strategic presuppositions for unfamiliar situations.

Therefore, extrapolative creativity can be viewed as an optimization activity based on Expected Information Gain (EIG). Within the frameworks of Bayesian inference and active inference, the optimization activities undertaken by an agent aim to maximize future information gain and actively reduce environmental uncertainty. Its mathematical definition is as follows: let $o$ represent the observed variables, $\pi$ represent possible action policies, and $s$ represent the true state of the system. Extrapolative creativity is manifested as a selective exploration process aimed at maximizing $EIG$, characterized by high structural sensitivity and a predictive orientation:

$$EIG(\pi) = E_{Q(o, s|\pi)} [\ln Q(s|o, \pi) - \ln Q(s|\pi)]$$
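To make the policy-selection reading concrete, the sketch below (a hypothetical two-state example, not drawn from the original text) compares the expected information gain of an informative and an uninformative probe. For a discrete model, the EIG above reduces to the prior entropy of the hidden state minus its outcome-averaged posterior entropy.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def eig(prior, likelihoods):
    """Expected information gain of a policy.
    likelihoods[o][s] = P(observation o | hidden state s)."""
    n = len(prior)
    p_o = [sum(lk[s] * prior[s] for s in range(n)) for lk in likelihoods]
    expected_posterior_H = 0.0
    for o, lk in enumerate(likelihoods):
        posterior = [lk[s] * prior[s] / p_o[o] for s in range(n)]
        expected_posterior_H += p_o[o] * entropy(posterior)
    return entropy(prior) - expected_posterior_H

# Two hidden states with a uniform prior Q(s|pi).
prior = [0.5, 0.5]

# Hypothetical policies: an informative probe whose observation tracks the
# state, and an uninformative one whose observation ignores the state.
informative = [[0.9, 0.1], [0.1, 0.9]]
uninformative = [[0.5, 0.5], [0.5, 0.5]]

print(f"EIG(informative)   = {eig(prior, informative):.3f} bits")
print(f"EIG(uninformative) = {eig(prior, uninformative):.3f} bits")
```

An agent maximizing EIG would select the informative probe; the uninformative one yields exactly zero expected gain, since its observations are independent of the hidden state.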

Taking a medical examination as an example: performing an action (for instance, conducting an inspection or running an experiment) enables us, on average, to reduce our uncertainty about what we wish to know.

This represents the information gain of the system: the prior entropy $H_{\text{prior}}$ denotes the agent's uncertainty about the environmental state before the innovative activity occurs, and the posterior entropy $H_{\text{posterior}}$ denotes the remaining uncertainty after the extrapolation operation. Denoting the difference as $IG(x,a)$, we have:
$$IG(x,a) = H_{\text{prior}} - H_{\text{posterior}}$$
Clearly, this is an entropy reduction process. The entropy reduction of the system implies an increase in information quantity, that is:
$$\Delta I = -\Delta S$$

It should be noted that the meaning of entropy reduction in extrapolative creation differs from that in interpolative creation, where order increases in a closed system; the entropy reduction in extrapolative creation refers to the reduction of uncertainty in the "unknown domain" and the overall information-theoretic gain of the system, that is:
$$\Delta S_{\text{unknown}} < 0, \quad \Delta I_{\text{system}} > 0$$
Thus, we define the extrapolative creation value as the system's information gain (IG). For example, transferring the "crossfire" strategy used in military operations to enhance strike effectiveness to the radiotherapy of tumors led to the invention of the "Gamma Knife" technique in stereotactic radiotherapy, a typical case of extrapolative creation. The essence of this process lies in transferring knowledge from a known domain to the modeling of a target in another domain. Such transfer significantly reduces the uncertainty of the target system, and the resulting information gain can be regarded as a metric for quantifying the value of this form of extrapolative creation.
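The information-gain reading of extrapolative transfer can be sketched numerically. The example below is illustrative only: a uniform prior over three candidate models of the target domain, with the transferred structure acting as an informative likelihood in a Bayesian update.

```python
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def bayes_update(prior, likelihood):
    """Posterior proportional to prior * likelihood over one hypothesis space."""
    joint = [pr * lk for pr, lk in zip(prior, likelihood)]
    z = sum(joint)
    return [j / z for j in joint]

# Hypothetical target domain: three candidate models of the unknown system,
# initially indistinguishable (high prior entropy, H = log2 3).
prior = [1/3, 1/3, 1/3]

# Transferring structure from a known domain acts like an informative
# observation: it assigns very different likelihoods to the candidates.
likelihood = [0.8, 0.15, 0.05]

posterior = bayes_update(prior, likelihood)
IG = entropy(prior) - entropy(posterior)  # IG(x, a) = H_prior - H_posterior
print(f"H_prior = {entropy(prior):.3f} bits, "
      f"H_posterior = {entropy(posterior):.3f} bits, IG = {IG:.3f} bits")
```

The positive $IG$ corresponds to the entropy reduction of the "unknown domain": after the transfer, the target system is far less uncertain than under the uniform prior.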

3.4.3 Leapfrog Creativity as a System-Level Transition

In the context of system-level transitions, the reduction of free energy and the transitions in cognitive structures manifested in human creative activities are often characterized by non-linear mutations, critical triggers, and phase transitions. Before formalizing this process, we first restate the expected-information-gain setup of the previous section in concrete terms, defining the following components:

  • $y$ (The Target Answer): The result of primary interest (e.g., "Does the patient have the disease?" or "What is the specific model label?").
  • $x$ (Known Context): The existing background and clues already available (e.g., age, symptoms, or historical data).
  • $a$ (Action/Inquiry): The action taken to reduce uncertainty, such as asking a critical question, performing a blood test, or conducting an additional measurement.
  • $p(y|x)$: The "subjective probability distribution" based solely on existing clues $x$. This represents an initial assessment that lacks full certainty.
  • $p(y|x, a)$: The updated probability distribution regarding $y$ after the action $a$ has been performed. This represents a refined judgment with increased confidence.
  • $H(p)$: Entropy, used to quantify uncertainty. Higher entropy signifies greater uncertainty, while lower entropy indicates higher certainty and clarity.

Expected Information Gain (EIG) represents the reduction in uncertainty regarding the target $y$ after an action $a$ is performed, averaged across all possible outcomes. Mathematically, it is defined as the difference between the prior entropy and the expected posterior entropy: $H(p(y|x)) - \mathbb{E}\left[H(p(y|x, a))\right]$, where the expectation runs over the possible results of the action $a$. This metric quantifies how much the entropy decreases from "before" to "after" the action is taken; this reduction is precisely the "information gain."

To illustrate this concept, consider a medical examination. Before a check-up, a doctor might only observe general symptoms ($x$). Subjectively, the doctor may feel there is a 50% chance of a specific illness, representing a state of high uncertainty where the entropy is near its maximum. However, after performing a high-accuracy diagnostic test ($a$), the diagnosis becomes much clearer (e.g., "positive for illness"). In this scenario, the uncertainty is significantly reduced, resulting in a high information gain.

This "degree of uncertainty reduction" is quantified by the difference in entropy. By averaging the entropy reduction across different patients (different $x$), we obtain the Information Gain (or Mutual Information) of the examination. A larger value indicates that the examination is more worthwhile, as it is more effective at reducing uncertainty.

Expected Information Gain (EIG) represents the average value of "how much certainty is increased (or how much uncertainty is reduced)" by a specific action. In the contexts of active learning, experimental design, diagnostic workflows, or question-answering strategies, selecting the action that yields the maximum EIG is typically considered the "most informative" choice.
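A minimal numeric sketch of the medical-examination example: the prior $P(\text{ill}) = 0.5$ matches the near-maximal-uncertainty scenario in the text, while the test's sensitivity and specificity (both 0.95) are assumed values chosen for illustration.

```python
import math

def H(p):
    """Binary entropy in bits for a probability p of the positive class."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Assumed test characteristics (illustrative, not from the text).
prior = 0.5          # P(ill) before the test: entropy at its maximum (1 bit)
sens, spec = 0.95, 0.95

# Marginal probability of a positive result.
p_pos = sens * prior + (1 - spec) * (1 - prior)

# Posterior P(ill | result) via Bayes' rule for each outcome.
post_pos = sens * prior / p_pos
post_neg = (1 - sens) * prior / (1 - p_pos)

# EIG = prior entropy minus the outcome-averaged posterior entropy.
eig = H(prior) - (p_pos * H(post_pos) + (1 - p_pos) * H(post_neg))
print(f"EIG = {eig:.3f} bits")
```

With these numbers the diagnosis moves from a 50/50 guess to 95% certainty either way, so most of the one bit of prior uncertainty is resolved; a less accurate test would yield a correspondingly smaller EIG.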

Cognitive transitions exhibit features such as probabilistic activation, which make them difficult to characterize using traditional continuous-function modeling approaches. To better describe the underlying dynamical mechanism, we introduce the electronic energy-level transition model from physics as an analogical framework. This allows us to formally characterize the cross-layer transition process of cognitive structures under specific conditions.

In the Bohr model, the necessary condition for an electron to be excited from a lower energy level to a higher energy level is the absorption of a photon with energy $E = h\nu = \Delta E$. If excitation occurs via particle collision, the incident particle must carry at least the threshold energy $\Delta E$. Whether a transition actually occurs is governed by transition probabilities in quantum mechanics (such as the Einstein coefficients or Fermi's Golden Rule), and thus exhibits statistical characteristics.

By drawing an analogy to cognitive systems, the conditions for a cognitive transition can be formally described. First, let us define the following variables:

  • $avail$: The cognitive resources available to the subject, such as attention, motivation, and the ability to invoke specific knowledge formats.
  • $E_0$: The existing resource level of the subject's current cognitive plane.
  • $\Delta E$: The abstract span and integration difficulty required for a transition in cognitive structure—essentially the "energy level difference" between the old structure and a potential new structure.
  • $I_{avail}$: The frequency or intensity of anomalous evidence, cognitive conflict, or counter-intuitive input per unit of time, representing the "excitation intensity."

When $avail + E_0 \geq \Delta E$, the probability of a structural transition, denoted as $P_{leap}$, increases significantly. This transition probability can be expressed by the following equation:

$$P_{leap} = \sigma(\gamma \cdot (I_{avail} - \Delta E))$$

In this expression, $\sigma(\cdot)$ denotes the logistic (sigmoid) function, and $\gamma$ is a coupling parameter reflecting the combined strength of factors such as the individual's background schema organization ability, motivational intensity, and degree of social support.

This formula indicates that the transition probability exhibits non-linear sensitivity toward both excitatory input and the energy budget. When the difference between the excitation intensity ($I_{avail}$) and the energy level gap ($\Delta E$) is non-zero, the system enters a non-steady state. This state may trigger a cognitive transition, leading to high-order creative behaviors such as sudden insight or paradigm shifts.

The occurrence of a transition causes the difference between the system's energy level gap ($\Delta E$) and the excitation intensity ($I$) to approach zero, thereby allowing the system to return to a steady state.
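The transition-probability formula can be sketched numerically. The parameter values below ($\gamma = 2$, $\Delta E = 1$) are illustrative assumptions; the point is the qualitative shape: $P_{leap} = 0.5$ exactly at the threshold $I_{avail} = \Delta E$, with a sharp non-linear rise above it.

```python
import math

def p_leap(I_avail, delta_E, gamma):
    """P_leap = sigma(gamma * (I_avail - delta_E)),
    with sigma the logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-gamma * (I_avail - delta_E)))

# Illustrative (hypothetical) parameters: gamma scales how sharply the
# transition probability responds to the excitation surplus I_avail - delta_E.
gamma = 2.0
delta_E = 1.0

for I in (0.5, 1.0, 1.5, 2.5):
    print(f"I_avail = {I:.1f}  ->  P_leap = {p_leap(I, delta_E, gamma):.3f}")
```

Below the threshold the leap is improbable but not impossible, matching the statistical character of quantum transitions invoked in the analogy; well above it, a structural leap becomes nearly certain.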

Next, we employ the "Free Energy Principle" (FEP) proposed by Friston \cite{Friston2010} to describe the dynamic changes in the discrepancy between the system's energy level gap ($\Delta E$) and the excitation intensity ($I$). According to predictive processing theory, the system aims to minimize this discrepancy through active inference and internal model updates.

The cognitive activities of biological agents can be viewed as a continuous regulatory process aimed at minimizing free energy, ultimately achieving superior predictions of the environment and maintaining the stability of the system's internal states (Friston, 2010). The difference between the system's energy level gap ($\Delta E$) and the excitation intensity ($I$) can be mathematically transformed into the system's free energy ($F$):

$$F = E - T \cdot H$$

Here, $E$ represents the prediction error energy, signifying the discrepancy between the agent's internal model and the external input; $H$ denotes the entropy of the system, representing the uncertainty associated with the input information; and $T$ is the activation temperature of the system, reflecting the degree of cognitive mobilization or the level of resource investment.
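A minimal numeric sketch of free-energy minimization, assuming the Helmholtz-style form $F = E - T \cdot H$ implied by the definitions of $E$, $H$, and $T$ (the specific values below are illustrative, not from the text):

```python
# Assumed Helmholtz-style free energy: F = E - T * H, where E is the
# prediction error "energy", H the entropy (uncertainty) of the input,
# and T the activation temperature (degree of resource investment).
T = 0.5        # activation temperature (illustrative)
H_input = 1.0  # uncertainty of the input information, in bits (illustrative)

def free_energy(E, T, H):
    return E - T * H

# As the internal model improves, the prediction error E shrinks and the
# free energy F decreases, returning the system toward a steady state.
for E in (2.0, 1.5, 1.0, 0.6):
    print(f"E = {E:.1f}  ->  F = {free_energy(E, T, H_input):.2f}")
```

On this reading, a leapfrog transition is the discontinuous route to the same end: when incremental model updates can no longer reduce $E$, a structural reorganization lowers the free energy in one jump.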

