Narrative Reshaping and Optimization Strategies for AI Music in Mainstream Propaganda—From the Perspective of Human-Machine Co-Creation
Huang Jiayin
Submitted 2025-07-09 | ChinaXiv: chinaxiv-202507.00300

Abstract

[Purpose] In mainstream propaganda, AI music is increasingly applied, with its lyrics and compositions often mapping directly onto propaganda objectives. However, since the technology emerged, AI-generated songs produced by media outlets have seldom achieved widespread popularity, and many works lack aesthetic value, provoking industry skepticism about the technology. [Method] This paper qualitatively analyzes the current utility boundaries of AI music through literature review and a synthesis of frontline experience. [Conclusion] The author contends that the industry overestimates the potential of AI music and has failed to elucidate its limitations in practice, so that generated outputs inadequately serve task objectives. [Result] This paper explores how to deploy AI music rationally through a human-machine co-creation workflow to achieve narrative reshaping and expressive optimization.

Full Text

Preamble

Guangzhou Broadcasting and Television Station, Guangzhou, Guangdong 510000


Keywords: AI music; narrative reshaping; large language models; human-AI co-creation; workflow
Classification Code: G222
Document Code: A
Article ID: 1671-0134(2025)05-64-04
DOI: 10.19483/j.cnki.11-4653/n.2025.05.013
Citation Format: Huang Jiayin. Narrative Reshaping and Optimization Strategies for AI Music in Mainstream Propaganda: A Human-AI Co-creation Perspective [J]. China Media Technology, 2025, 32(5): 64-67.

1. Literature Review

Current AI music research and applications often rest on a presumption of "strong technological determinism," which tends to overlook the intrinsic connections between the humanistic dimensions of technology (aesthetic imagination, emotional cognition, and affective generation) and technical implementation itself. As we move from the weak-AI era toward strong AI, emotional cognition must return to the origin of technological embodiment, excavating the generative logic and convertible factors of emotional motivation [4]. This is the direction the present research takes.

Guo Shiwei of Xi'an University of Finance and Economics notes: "Like titles, subtitles, and visuals, background music represents media values, positions, and attitudes toward news facts... However, due to its non-original nature, background music often contradicts the logic and emotion of news facts. Moreover, since most media organizations lack the capacity for original music production, they can only repurpose existing resources to score news short videos, resulting in highly homogenized soundtracks that easily induce aesthetic fatigue among audiences" [1]. AI music promises to solve these problems. As Cui Bo, music director of the Perfect Youth OST, observes: "Current AI music is not mature enough, and its output contains many flaws, but this material can be selected, refined, and enriched" [2]. This opens imaginative space for our research on how to produce good "segments" to enrich communication effects.

The academic team of Yu Guoming at Beijing Normal University believes that "all subjects, objects, and intermediaries related to communication itself are objects of communication studies. With the intelligent development of media technology, the connotations of human-machine communication are further enriched" [3], meaning AI music is becoming an important issue in journalism and communication studies. Xiao Ping of Beijing Institute of Fashion Technology argues: "The focus of current AI music research and applications often carries a 'strong technological determinism' presumption..." [4]. Liu Jie of Renmin University states: "AI music is fundamentally a kind of 'counterfeit' art, and 'music without humans' must ultimately return to 'humans'" [5]. How to leverage human wisdom and use AI music effectively is the focus of this discussion.

2. The "Lightweight" Nature of Narrative

Applying AI music in propaganda aims to "increase quality and efficiency" in narrative construction. Narrative is rooted in individual concepts and ideas, and its propagation depends on public resonance. People shape their identity through the narratives they support; supporting a narrative is essentially a process of self-affirmation [6]. Narrative spreads "bottom-up," emerging from folk word-of-mouth. Public opinion is like "chicken feathers": only something light enough can be blown upward [7]. Calibrating emotions and popularizing expression are therefore important functions of scoring. AI music not only reduces production costs significantly; it can also supply better-fitted soundtracks that shape context effectively, standing in for extensive text while reducing narrative ambiguity.

Highly popular works typically feature simple, straightforward language. Yet some practitioners currently favor abstract, obscure "big words" when writing AI music lyrics, producing convoluted sentence structures that contrast sharply with the increasingly simplified trend in popular songs. This approach neither aligns with AI's working logic nor stays within the current capabilities of AI music.

3. Characteristics and Challenges of AI Music Creation

3.1 The "Brainwashing" Trend in Popular Music

A study published in Scientific Reports found that over the past several decades, the decreasing complexity of vocal melodies (main melodies) in popular music has made songs more memorable and singable. The analysis identified two distinct simplification trends around the 1975 disco boom and the 2000 digital audio workstation revolution. The study argues that the "brainwashing" of popular music results not from musicians' laziness but from systemic social changes [10]. In retrospect, AI music may mark the beginning of a new evolution.

However, it must be noted that AI music's evolutionary path may not be further melodic simplification but rather greater "arbitrariness." Research from the University of York indicates that AI music can generate music synchronized with listeners' biological signals and create feedback loops, holding broad prospects for creative industries and music therapy. Yet how to extract meaningful control information from bodily signals and design response mechanisms remains a challenge for future work [11].

3.2 AI Solves "Mechanical Application"

Experience yields the insight that "understanding = repetition + novelty." Many propaganda terms carry high value but fail to connect with the public. AI large language models can first solve the "mechanical application" problem at the front end, generating fresh expressions that better reach minds and hearts. Through skillful prompting (breaking problems down, providing references, describing details, offering context, assigning the AI a role, guiding the AI to reflect, etc.), we can rapidly transform grand texts into poetry, prose, micro-fiction, jingles, and other forms, then create multiple songs through AI. This both enriches connotation and enlivens form.
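
The prompting techniques named above can be composed into a single structured prompt. The sketch below is illustrative only: the function and field names are assumptions, not the author's actual tooling, and the text would normally be sent to a model such as Kimi or Wenxiaoyan.

```python
# A minimal sketch (not the author's actual tooling) of how the prompting
# techniques named in the text compose into one structured prompt.
# All names here are illustrative assumptions.

def build_lyric_prompt(theme, audience, reference_lines, target_form="jingle"):
    """Compose a prompt applying role assignment, context, references,
    task decomposition, and a reflection step."""
    sections = [
        # Assign the AI a role.
        "You are a lyricist who writes plain, singable lines for a general audience.",
        # Offer context.
        f"Context: this piece supports a campaign on the theme '{theme}', aimed at {audience}.",
        # Provide references.
        "Reference lines for tone (do not copy):\n"
        + "\n".join(f"- {line}" for line in reference_lines),
        # Break the problem down.
        f"Task, in steps: (1) restate the theme in everyday words; "
        f"(2) list concrete images; (3) draft a {target_form} using them.",
        # Guide the AI to reflect.
        "Before answering, check each line: is it simple, concrete, and easy to sing?",
    ]
    return "\n\n".join(sections)

prompt = build_lyric_prompt(
    theme="community volunteering",
    audience="young urban listeners",
    reference_lines=["Morning light on the old street", "Hands that pass the warmth along"],
)
print(prompt)
```

Keeping each technique as its own block makes it easy to drop or reorder techniques per task.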

3.3 Human-AI Co-creation Workflow for Lyrics

Any discussion of "AI music" must include large language models, since only well-written lyrics can produce good music. In How Music Works, musician David Byrne describes his method of "letting lyrics emerge automatically": he first writes meaningless phrases on paper and looks for resonance among them, locking on when a meaningful spark appears [12-13]. Large language models compress this process into almost no time. AI can easily list hundreds of rhyming, relevant phrases, rapidly "releasing" massive inspiration that lets creators assemble combinations like building blocks and create subconscious meaning behind the words. As Byrne notes, poorly written lyrics destroy the "pleasurable ambiguity" in music; this is precisely why we cannot rely fully on large models and must revise lyrics by hand.

The main steps of the lyric workflow include:
(1) Initial Input Stage: Using large language model tools like Kimi and Wenxiaoyan, input creative requirements and background information to provide a foundational framework and set clear objectives and associative boundaries for lyrics.
(2) Iterative Dialogue Stage: Engaging in multiple rounds of dialogue with AI to gradually refine and deepen lyrical content. This process helps AI better understand the creative theme while stimulating associative dimensions.
(3) Lyric Optimization Stage: Continuously providing feedback during dialogue to guide the AI toward lyrics that better match the creative vision. Consider constructing "wordscapes" (time, place, characters) to strengthen narrative quality, injecting elements of cognition, familiarity, and memory to create a sense of lived fullness, or introducing novel metaphors, symbols, and other literary devices. This step aims to combine the "fuzzy thinking" of traditional music creation with AI's "associative thinking": the AI rapidly generates ideas while human creators supply depth, personalization, and direction. Practice shows that reasoning models such as DeepSeek-R1 can optimize lyrics according to prompts, but because their outputs remain uncertain and still require manual adjustment, they hold no significant advantage over general large language models at this stage.
(4) Manual Refinement Stage: Manually revising AI-generated lyrics to identify and adjust details and emotional nuances that AI may miss, further simplifying lyrical structure and modifying individual words to enhance rhythm and imagery.
(5) Final Review: Ensuring overall discourse aligns with creative goals and style. Throughout manual revision, maintain focus on lyrical originality to avoid potential copyright issues.
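
The five stages above can be sketched as a simple loop. Here `ai_revise` is a deterministic stub standing in for a real LLM call (Kimi, Wenxiaoyan, etc.); the shape of the loop, not the stub, is the point.

```python
# The five lyric-workflow stages as a loop. `ai_revise` is a stub for an
# LLM revision call, purely for illustration.

def ai_revise(draft, feedback):
    """Stub for an LLM revision call: appends a note per feedback item."""
    return draft + f" [revised: {feedback}]"

def lyric_workflow(brief, feedback_rounds, human_edit):
    # (1) Initial input: the brief sets objectives and associative boundaries.
    draft = f"Draft from brief: {brief}"
    # (2)-(3) Iterative dialogue and optimization: several feedback rounds.
    for feedback in feedback_rounds:
        draft = ai_revise(draft, feedback)
    # (4) Manual refinement: a human editing pass.
    draft = human_edit(draft)
    # (5) Final review: reject output that has drifted from the brief.
    assert brief.split(":")[0] in draft, "draft lost the creative goal"
    return draft

result = lyric_workflow(
    brief="Mid-Autumn reunion: warm, plain language",
    feedback_rounds=["simplify line lengths", "add a concrete image"],
    human_edit=lambda d: d.replace("Draft", "Final"),
)
print(result)
```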

3.4 Generating Songs and Overall Optimization

Through experience with multiple AI music models, I have found that when each line of a lyrical paragraph keeps a relatively stable character count (as in five-character or seven-character verse), the generated melody flows more smoothly. When revising lyrics, adding humming lines such as "ah," "oh," or "la" between paragraphs gives the AI more room to develop the melody, yielding more structured, usable lyrics and better tunes (Byrne likewise often adds humming to lyrics to make space for musical expression [13]). Annotating lyrical sections is also crucial: marking basic paragraphs such as verse, chorus, bridge, intro, and outro helps the AI generate more complete musical pieces.
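
These formatting rules (stable line lengths per section, humming interludes, explicit section tags) can be checked mechanically before generation. The tag style below ([Verse], [Chorus]) follows a common convention in tools like Suno; treat it as an assumption rather than a specification.

```python
# A sketch of the pre-generation formatting rules: stable line lengths per
# section, humming interludes between paragraphs, explicit section tags.

def format_for_generation(sections):
    """sections: list of (tag, lines). Returns tagged lyrics with humming
    interludes, warning when a section's line lengths vary widely."""
    out = []
    for tag, lines in sections:
        lengths = [len(line) for line in lines]
        if lengths and max(lengths) - min(lengths) > 2:
            out.append(f"# note: uneven line lengths in {tag}: {lengths}")
        out.append(f"[{tag}]")
        out.extend(lines)
        out.append("(la la la)")  # humming line gives the melody room to develop
    return "\n".join(out)

song = format_for_generation([
    ("Verse", ["Moonlight falls on the river", "Lanterns drift along the shore"]),
    ("Chorus", ["Come home, come home tonight", "The table waits in candlelight"]),
])
print(song)
```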

AI music styles cover almost all popular songs heard daily, and different musical styles carry relatively distinct emotional tendencies (e.g., rock: energetic, excited; folk: attached, nostalgic; tango: playful, comical; R&B: fashionable, romantic, etc.). Experiments show that different music styles affect mood, focus, and cognitive performance differently [14]. Selecting appropriate styles (single or combined) for generation better serves narrative purposes. Understanding these musical styles and corresponding emotional tones also makes it easier to exclude emotionally mismatched compositions during review.
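
The style-to-emotion pairings above can be expressed as a small lookup used to screen out mismatched styles before generation. The table simply transcribes the examples in the text; it is not exhaustive.

```python
# Style-to-emotion lookup transcribed from the examples in the text.

STYLE_EMOTIONS = {
    "rock": {"energetic", "excited"},
    "folk": {"attached", "nostalgic"},
    "tango": {"playful", "comical"},
    "r&b": {"fashionable", "romantic"},
}

def styles_for(target_emotion):
    """Return the styles whose typical emotional tendency matches the target."""
    return sorted(s for s, emotions in STYLE_EMOTIONS.items()
                  if target_emotion in emotions)

print(styles_for("nostalgic"))  # folk matches the text's pairing
```

The same table works in reverse during review: a generated piece tagged with a style whose emotions conflict with the narrative's target emotion is an early candidate for exclusion.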

3.5 Selecting Applicable Results

After the above human-AI co-creation steps, we reach a stage commonly called "brute force generation" (multiple generations). At this stage, each generation of the same lyrics produces different melodies, allowing us to select the optimal version. During repeated generation, we can also fine-tune lyrics and paragraph markings based on phrasing effects or change style tags. However, one point must be clear: to generate a relatively complete pop song lasting 2–3 minutes or more with repeated sections and memorable hooks, clear paragraph markings and repeated lyrics in the prompt are essential.
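
"Brute force generation" reduces to: generate several candidates from the same lyrics, score each, keep the best. In the sketch below both the generator and the score are stand-ins; a real pipeline would call a music model and let a human listener judge.

```python
# "Brute force generation" as a sketch: same lyrics, many runs, pick the best.
# generate_candidate and its score are stubs, not a real music model.
import random

def generate_candidate(lyrics, seed):
    """Stub for one generation run: same lyrics, different 'melody'."""
    rng = random.Random(seed)  # seeded, so each run is reproducible
    return {"lyrics": lyrics, "melody_id": seed, "score": rng.uniform(0, 1)}

def brute_force_select(lyrics, n_runs=8):
    candidates = [generate_candidate(lyrics, seed) for seed in range(n_runs)]
    return max(candidates, key=lambda c: c["score"])

best = brute_force_select("[Chorus]\nCome home, come home tonight")
print(best["melody_id"], round(best["score"], 3))
```

Between runs, the text's advice (fine-tuning lyrics, paragraph markings, or style tags) amounts to varying `lyrics` across batches rather than only the seed.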

Roger Dannenberg, director of the Center for Computer Music at Carnegie Mellon University, believes that repetitive structure matters both for perceived song quality and for AI music generation [15]. In a 2024 study he co-authored, however, he also noted that a lack of structure and hierarchy is a common problem in computer-generated music, and that algorithms still have much to learn about constructing a song's layers [16]. Current AI music software such as Suno operates on logic similar to diffusion image generation: "AI first generates noise, then 'comprehends' an initial inspiration from the noise, and gradually refines this inspiration into pictures or video through a learned 'denoising' process" [17]. What artificial intelligence grasps is not causal meaning or empirical law but correlation [18]. If lyrics do not repeat, then no matter how neat and parallel they are, the AI struggles to generate repeated melodies; literal repetition is what fits AI's multi-dimensional associative working logic. Of course, for more ambitious music creation one can also process generated music track by track and make local adjustments to lyrics and melodies, leveraging professional skills (with more "manual" software such as NetEase Tianyin). That, however, enters the realm of professional composition, which is beyond this paper's scope.
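
The repetition requirement can be enforced mechanically before generation: if the chorus appears only once in the prompt, duplicate it so the model receives literal repetition rather than merely parallel phrasing. A minimal sketch, with hypothetical section tuples:

```python
# Ensure the prompt contains literal chorus repetition, as the text argues
# AI generators need. Section tags and tuple shape are assumptions.

def ensure_chorus_repetition(sections, min_repeats=2):
    """sections: list of (tag, text). Appends copies of the chorus block
    until it appears at least `min_repeats` times."""
    chorus = [(tag, text) for tag, text in sections if tag == "Chorus"]
    count = len(chorus)
    if chorus and count < min_repeats:
        sections = list(sections) + chorus[:1] * (min_repeats - count)
    return sections

song = ensure_chorus_repetition([
    ("Verse", "Moonlight falls on the river"),
    ("Chorus", "Come home, come home tonight"),
])
print([tag for tag, _ in song])  # ['Verse', 'Chorus', 'Chorus']
```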

4. Practical Applications and Dilemmas

4.1 AI Music Application in Television Programs

I host a talk-show radio program at Guangzhou Broadcasting and Television Station. In the traditional production mode, such programs often insert pop songs matching the program's atmosphere for transitions or embellishment. Since 2024, however, with the broad development of AI technology, the production workflow has been transformed. Through AI, spoken program content is rapidly converted into standardized text while detailed segmented summaries and songs are generated in parallel. Text-to-speech (TTS) technology further enables the production team to quickly create segment headers and insert "custom songs" closely matched to the content between program sections, achieving seamless integration of music and language and significantly enhancing overall program quality.

The "visualization" transformation of radio stations is unfolding comprehensively. Future radio practitioners must master comprehensive skills including AI music, AI video, digital humans, and TTS technology. The "AI-ification" of radio stations will gradually become reality.

4.2 Application Dilemmas of AI Music in Media Services

Before AI music emerged, I was already involved in music creation for media services. This work requires full communication and repeated negotiation between client and producer, from composition and arrangement to lyric writing, with each step confirmed through multiple revisions. Once a musical work takes shape, even a small modification can force fundamental rework; yet the instability of AI-generated melodies makes local modification extremely difficult, an undeniable limitation. Nevertheless, as AI music is further marketized and the technology improves, all parties will develop a more rational understanding of and broader consensus about AI music products, making its cost advantage more prominent.

4.3 AI Music is Different from General Pop Songs

Pop music becomes popular partly through promotion by television, radio, and other media. In an era of fragmented communication, digital works are quickly created and quickly forgotten, becoming outdated before they can become classics. AI music thus serves different social functions from traditionally defined pop songs or classic golden melodies. If we blindly demand that AI music "measure up" to pop songs, we may miss an entirely new art form. In my view, AI music's advantage lies in its high production efficiency and its ability to significantly enhance narrative quality. The core appeal of propaganda is a certain spirit or value, not a melody; whether a song becomes famous has little bearing on propaganda effectiveness.

Early AI vocals suffered from over-compression and insufficient dynamics, sounding "robotic." With technological iteration, by the time of writing AI music software can generate multiple vocal styles, automatically enhance vocal dynamics, produce mixed-voice and vibrato techniques, and even simulate breath sounds. I have played AI music many times while hosting radio programs, and listeners could hardly tell it apart from human recordings. It is foreseeable that AI music will likely take over the role that online songs currently play as "emotional fast-moving consumer goods."

5. The Dilemma of Journalistic Professionalism

Traditional journalistic professionalism once worried that scoring would undermine professionalism, objectivity, and fairness. I believe this concern is bound to a specific historical context. During the mass communication era, news media did tend to reduce emotional intervention in order to attract the broadest audience, pursuing neutrality as the "greatest common denominator." But today's communication environment has changed qualitatively: news lacking emotional identification struggles to attract audiences and thereby loses dissemination power and influence. AI music brings a revolutionary solution to news scoring: by accurately identifying, capturing, and refining the core emotions of a news story, it can construct a more fitting and authentic emotional environment. Moreover, new media provides stages for expressing different emotions, making negative emotions such as "anger" and "fear" easier to spread and amplify. AI music holds promise for dispelling and countering these negative emotions in the public opinion arena.

Today's audiences no longer rely on a single information source. In the competition for attention, news media must first stimulate resonance so that audiences actively explore and compare information. At the same time, news media bear heavier responsibilities in guiding public opinion, maintaining ideological security, and conducting public opinion struggles, so using new technological means such as AI to enhance communication effectiveness is imperative. Music has transcended pure artistic expression to become an important tool for aiding information dissemination, and more and more media professionals will become applied music-technology talents. For media practitioners, although we may still be "laymen" in music creation, improving personal music literacy, aesthetic appreciation, and hands-on ability is an unavoidable task. Despite the many unknowns in applying AI music, professional journalists' precise grasp and deep understanding of narrative content remain the strongest internal driver of media convergence development.

References

[1] Guo Shiwei. Analysis of Soundtrack Issues and Countermeasures in Short Video News [J]. Voice and Screen World, 2023(18): 104-107.
[2] Li Danping, Jiang Xiaobin. When AI Can Write Lyrics, Compose, and Sing, Where Do Musicians Go? [N]. China Youth Daily, 2024-03-29(007).
[3] Yu Guoming Academic Studio, Yang Ya, Chen Xuejiao, et al. Brain-like, Embodied, and Empathetic: How to Study the Impact of Artificial Intelligence on Communication and Post-humanism—Based on Analysis of AI-related Topics in International Top Three Journals Science, Nature, and PNAS [J]. Academics, 2021(08): 108-117.
[4] Xiao Ping. Embodiment, Imagination, and Empathy: A Techno-phenomenological Study of AI Music Generation and Communication [J]. Modern Communication—Journal of Communication University of China, 2022, 44(09): 155-161.
[5] Liu Jie. "Music Without Humans"—From Encoders to the Displacement of Subjectivity in AI Composition [J]. Tianjin Social Sciences, 2022(02): 117-121.
[6] Robert Fulford. The Triumph of Narrative: Storytelling in the Age of Mass Culture [M]. Nanjing: Nanjing University Press, 2021: 46.
[7] Zou Zhendong. Weak Communication: The Philosophy of the Public Opinion World [M]. Joint Publishing (Hong Kong) Co., Ltd., 2020: 130.
[8] Huang Jiayin. G4 Vocabulary Book [Z]. https://app.gztv.com/plusshare/#/topicDetail?id=22909.
[9] Huang Jiayin, et al. Suiyue Ruge·Mid-Autumn AI Song Festival [Z]. https://app.gztv.com/plusshare/#/liveDetail?id=1414114955202048.
[10] Madeline Hamilton, Marcus Pearce. Trajectories and Revolutions in Popular Melody Based on U.S. Charts from 1950 to 2023 [J]. Scientific Reports, 2024. https://www.nature.com/articles/s41598-024-64571-x.
[11] Williams, et al. On the Use of AI for Generation of Functional Music to Improve Mental Health [J]. Frontiers in Artificial Intelligence, 2020.
[12-13] David Byrne. How Music Works [M]. Zhanlu Culture, 2016: 221-223.
[14] Raup Padillah, et al. Different Music Types Affect Mood, Focus and Work Performance: Exploring the Potential of Music as Therapy with AI Music [J]. Journal of Public Health, 2023, 45(4).
[15] Wang Xiaoxuan. New Trends in Future Music: Artificial Intelligence Empowering Music Development. A Review of the World Music AI Conference [J]. People's Music, 2022(01): 36-39.
[16] Shuqi Dai, Huan Zhang, Roger B. Dannenberg. The Interconnections of Music Structure, Harmony, Melody, Rhythm, and Predictivity [J]. Music & Science, 2024, 7.
[17] Zhang Lei. How AI Draws Pictures, Composes Music, and Creates Videos [J]. Middle School Technology, 2024(08): 2-7.
[18] Yang Qingfeng. Reflecting on AI Ethics Principles from AI Challenges [J]. Philosophical Analysis, 2020, 11(02): 137-150, 199.

Author Bio: Huang Jiayin (1984—), male, Han ethnicity, from Guangzhou, Guangdong, holds a master's degree and is a chief reporter. His research focuses on media convergence and new media content.
