Event Extraction and Analysis for Power Media Based on Large Language Models and Knowledge Graphs: Postprint
Li Jia, Zang Yanjiao, Gu Chenlan
Submitted 2025-07-09 | ChinaXiv: chinaxiv-202507.00209

Abstract

Objective: Against the backdrop of deepening reform in China's power system and the industry's vigorous development, power media, as a critical communication bridge between the power sector and the public, faces two core challenges: a surge in the number of power news events and the need for efficient information management, urgently requiring effective coping strategies. Method: This paper proposes an innovative solution to these challenges. Based on careful consideration of current resources and technological advances, it designs a power media event extraction and analysis method that integrates large language models and knowledge graph technology. Results: The method accurately identifies key entities in the power domain and deeply mines the intricate relationships among them, thereby constructing a knowledge graph for the power media field. Conclusion: Through the network structure presented by the knowledge graph and the powerful semantic understanding of large language models, the method achieves dual improvements in information retrieval efficiency and the intuitiveness of the interactive experience. It significantly enhances users' ability to acquire and utilize information, providing strong technical support for the intelligent development of the power media industry.

Full Text

Event Extraction and Analysis for Power Media Based on Large Language Models and Knowledge Graphs

Li Jia, Zang Yanjiao, Gu Chenlan
(Yingda Media Investment Group Co., Ltd., Beijing 100052)


Keywords: Power Media; Large Language Models; Knowledge Graphs; Entity Extraction; Relationship Extraction
Classification Number: G224
Document Code: A
Article ID: 1671-0134(2025)03-141-05
DOI: 10.19483/j.cnki.11-4653/n.2025.03.031
Citation Format: Li Jia, Zang Yanjiao, Gu Chenlan. Event Extraction and Analysis for Power Media Based on Large Language Models and Knowledge Graphs[J]. China Media Technology, 2025, 32(3): 141-145.

Introduction

In recent years, as China's power system reform deepens and the industry prospers, power media has become an important bridge for communication between the power industry and the outside world, and power events have become the core of power information. Against a backdrop of information diversification and ever-higher public expectations for information quality, traditional media content, which is often lengthy and makes key points hard to capture, no longer suits the fragmented reading habits of the internet environment. Power media therefore urgently needs in-depth analysis and precise management of content, particularly the identification and integration of power event data, to enhance the efficiency and quality of content creation while optimizing the user interaction experience.

To address these challenges, this paper proposes an event extraction and analysis strategy for power media based on large language models and knowledge graph technology. This strategy leverages advanced artificial intelligence technologies to deeply mine and intelligently analyze power media information, providing users with intuitive and detailed information displays as well as in-depth and rich power event information, thereby driving the continuous progress of the power media industry.

1.1 Entity Extraction Technology

Entity extraction, one of the core tasks in the field of Natural Language Processing (NLP), aims to deeply mine and accurately extract key entity information from textual data. This process relies on advanced semantic analysis, deep learning models, and complex natural language processing algorithms to extract meaningful entities from massive, unstructured text data such as power media information, including but not limited to events, persons, organizations, equipment and facilities, technical terms, and other abstract or concrete concepts.[1] With the vigorous development of deep learning, neural network-based methods have demonstrated clear advantages in entity extraction tasks, identifying and extracting entities more accurately. Meanwhile, the introduction of pre-trained language models (such as BERT, GPT, and Qwen) has further enhanced model comprehension and generalization, enabling entity extraction to perform well even on unknown or rare entities.[2]

1.2 Large Language Model Technology

In recent years, deep learning technology has achieved remarkable progress in the field of Natural Language Processing (NLP), with such broad application scope and significant impact that it has become an important force driving transformation in this domain. Deep learning learns complex language patterns and rules through multi-layer nonlinear transformations, thereby demonstrating excellent learning efficiency and powerful generalization capabilities. These characteristics enable deep learning models to exhibit performance unattainable by traditional machine learning algorithms when handling highly complex and uncertain tasks such as natural language semantic analysis.

With the continuous exploration and optimization of deep learning technology in the NLP field, Large Language Models (LLMs) have gradually emerged. As a special type of deep learning architecture, LLMs are centered on words or subwords as basic units, learning and understanding the intrinsic relationships and contextual dependencies between words, phrases, sentences, and even entire documents through massive trainable text datasets. This learning process goes beyond surface-level lexical matching to delve into semantic, syntactic, and even pragmatic levels, enabling the model to accurately capture and understand the diversity and complexity of human language, thus handling the multifaceted power media data with ease.[3-5]

1.3 Knowledge Graph Technology

Knowledge graphs, as a highly structured and graphical knowledge representation method, fundamentally abstract and model real-world entities (such as persons, locations, events, etc.) and their intricate relationships in the form of nodes (representing entities) and edges (representing relationships). This model not only intuitively displays the connections and interactions between entities but also deeply reveals their intrinsic relationships and logical structures, providing a powerful tool for understanding and analyzing knowledge within specific domains.[6] Furthermore, visualization technology provides an intuitive and rich interface for the display and application of knowledge graphs. By graphically presenting entities, relationships, and their attributes, users can easily browse and understand complex knowledge networks, discover patterns and trends, and thus make more informed decisions. This serves as an effective assistant for displaying the relationships between power events and their related entity elements.[7]

2. Methodology

2.1 Process Overview

The general process of power media event extraction and knowledge graph construction encompasses several core stages (see Figure 1 [FIGURE:1]). The workflow adjusts data modeling according to data content, with specific details determined by the application scenario and adapted based on data quality. Entity types are defined to instruct the LLM on which types of entities to extract, while entity attributes are specified to guide the LLM on which properties to extract for particular entities depending on the usage context. This paper will provide a more detailed exposition of these key steps to facilitate deeper understanding of their internal logic and operational details.
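The stages above can be summarized as a simple pipeline. The Python sketch below is purely illustrative: each stage is a stub, and all function names and data shapes are our own assumptions rather than the paper's implementation.

```python
# Illustrative skeleton of the workflow described above.
# Stage bodies are placeholders; names and signatures are assumptions.

def model_entity_types():
    """Define entity types and their attributes (Section 2.2)."""
    return {"person": ["name", "occupation"], "event": ["date", "location"]}

def extract_entities(text, schema):
    """Ask an LLM to extract entities/relations per the schema (Section 2.3)."""
    return []  # placeholder: a large language model would be called here

def merge_aliases(entities):
    """Unify variant names of the same entity (Section 2.4)."""
    return entities

def mark_retrieval_targets(entities):
    """Annotate entities with supported retrieval targets (Section 2.5)."""
    return entities

def build_graph(entities):
    """Assemble nodes and edges into the knowledge graph."""
    return {"nodes": entities, "edges": []}

def run_pipeline(texts):
    schema = model_entity_types()
    entities = []
    for text in texts:
        entities.extend(extract_entities(text, schema))
    entities = mark_retrieval_targets(merge_aliases(entities))
    return build_graph(entities)

graph = run_pipeline(["sample power-media article text"])
print(graph)  # {'nodes': [], 'edges': []}
```

In a real system each stub would be swapped for the corresponding component described in the following subsections; the value of the skeleton is fixing the order and interfaces of the stages.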

2.2 Entity Type Data Modeling

Entity type modeling plays a core role in the knowledge graph construction process. These entity types, known as ontologies in the traditional knowledge graph domain, have definitions and descriptions that are crucial for efficiently analyzing textual content using large language models. To extract valuable entity information from massive text data, it is essential to first define the precise meaning of entities. The purpose of this approach is not only to limit the scope of entity extraction within specific application scenarios but, more importantly, to help large language models better understand entity type concepts by pre-defining detailed attributes, thereby enabling more accurate parsing and extraction of entities and their relationships.[8][9]

Adopting this strategy requires a clear understanding of the text data to be processed. This means identifying the various forms in which different entity types may appear in text and what attributes and characteristics they possess. For instance, if the content discusses persons, attributes such as name, age, occupation, and educational background can be defined across multiple dimensions; for locations or objects, attention should be paid to their physical characteristics, positions, historical uses, and other relevant attributes. Such methodology ensures that model outputs are not only accurate but also comprehensive, providing a solid foundation for subsequent knowledge graph data applications.

In the context of power media information usage scenarios, we have further subdivided entity types into several categories, including equipment, technology, organization, person, and event. Equipment entities encompass various hardware and software devices and facilities involved in power systems, such as transformers, circuit breakers, and Apps; technology entities include relevant technologies, standards, and protocols for power transmission and conversion; organization entities comprise power companies, organizational departments, or industry associations; person entities record individuals involved in the power industry, such as engineers, technicians, and project managers; event entities refer to any activities or engineering projects related to power supply, maintenance, or accident handling. Through such meticulous entity classification and definition, a vast and complex knowledge network can be constructed to help users quickly locate information, solve practical problems, and even promote the development and application of new energy technologies.
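As a concrete illustration, the five entity categories above could be encoded as a simple schema against which extracted entities are validated. The attribute lists here are our own examples, not an exhaustive ontology from the paper.

```python
# Sketch of an entity-type schema for the power media domain.
# Attribute lists are illustrative, not a definitive specification.
ENTITY_TYPES = {
    "equipment":    ["name", "category", "location", "commission_date"],
    "technology":   ["name", "standard", "application_area"],
    "organization": ["name", "type", "region"],
    "person":       ["name", "occupation", "affiliation"],
    "event":        ["name", "date", "location", "participants"],
}

def validate_entity(entity):
    """Check that an extracted entity conforms to the schema."""
    etype = entity.get("type")
    if etype not in ENTITY_TYPES:
        return False
    # Every reported attribute must be one declared for this type.
    return all(k in ENTITY_TYPES[etype] for k in entity.get("attributes", {}))

e = {"type": "equipment",
     "attributes": {"name": "Main Transformer #1", "category": "transformer"}}
print(validate_entity(e))  # True
```

Such a schema serves double duty: it constrains what the LLM is asked to extract, and it filters malformed model output before graph construction.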

2.3 Large Model Entity and Relationship Extraction

The preceding description has carefully planned a series of detailed entity modeling data categories, ensuring comprehensive and accurate data collection. For the actual entity extraction phase, we leverage cutting-edge large language model technology. These models not only possess the capability to accurately extract entity information from textual materials but also conduct in-depth analysis and precise definition of the involved raw content. Subsequently, the original materials and predefined entity type data are provided to the large language model, which is tasked with extracting entity information and attributes. This implementation enables the program to more clearly identify each entity's type, name, and unique characteristics during analysis, while also revealing the intricate network of connections between them—including which entities possess specific resources, how these entities are interrelated, and under what contexts these relationships emerge. This approach significantly enhances the depth and breadth of data analysis, enabling a more profound understanding and mastery of entities and their interrelationships.[10]
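One common way to implement this step is to embed the predefined entity types in a prompt and ask the model to return structured JSON. The sketch below stubs out the model call; the prompt wording and response format are our own assumptions, not the paper's exact design.

```python
import json

def build_prompt(text, entity_types):
    """Assemble an extraction prompt from the source text and the
    predefined entity-type schema (wording is illustrative)."""
    return (
        "Extract entities and relationships from the text below.\n"
        f"Allowed entity types: {', '.join(entity_types)}.\n"
        'Return JSON: {"entities": [...], "relations": [...]}\n\n'
        f"Text: {text}"
    )

def parse_response(raw):
    """Parse the model's JSON reply, tolerating malformed output."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"entities": [], "relations": []}
    return {"entities": data.get("entities", []),
            "relations": data.get("relations", [])}

# A hand-written stand-in for a model reply, for illustration only:
reply = ('{"entities": [{"type": "organization", "name": "State Grid"}], '
         '"relations": [{"head": "State Grid", "relation": "operates", '
         '"tail": "substation A"}]}')
result = parse_response(reply)
print(result["entities"][0]["name"])  # State Grid
```

The defensive parsing matters in practice: LLM output occasionally deviates from the requested format, and a failed parse should degrade to an empty result rather than crash the pipeline.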

When selecting an appropriate large language model, practical application scenario requirements must be comprehensively considered, with priority given to models that demonstrate superior performance in Chinese language processing. Since entity recognition and relationship extraction often heavily depend on close contextual semantic associations within text, particular attention must be paid to whether the model supports sufficiently long context lengths during selection. Models supporting longer contexts are necessary to ensure adequate comprehension and extraction of implicit semantic information within textual content. Conversely, if limited by short supported context lengths that result in overly fragmented text materials, information loss or misuse may occur, thereby affecting the quality of overall analysis results. Through prudent evaluation and selection of suitable large language models, not only is the efficiency and accuracy of entity modeling improved, but a solid foundation is also established for subsequent data analysis.[11]

Figure 2 [FIGURE:2] Entity Extraction Record

2.4 Automatic Entity Merging

In the actual knowledge graph construction process, a common situation arises where the same entity may have multiple names that vary depending on different contexts, domains, or languages. For example, the company name "IBM" might be referred to as "International Business Machines Corporation" in technical documents, abbreviated as "IBM" in marketing materials, or even called "Big Blue" in informal settings. This phenomenon is widespread across all industries, where institutions such as banks, hospitals, and schools, along with their relevant personnel, also have their own abbreviations, aliases, or positional titles, which are important for understanding the essential meaning of entities.[12]

To ensure that knowledge graph data can accurately capture and reflect the true relationships of entities, it is necessary to unify these variant names and their connections with other entities. This involves not only the identification and recording of each entity name but also comprehensive consideration of their descriptions, attributes, and interrelationships. By doing so, the most accurate and commonly used names can be retained while merging inaccurate ones, making the entire dataset both rich and accurate.[13]

When performing entity merging, large language models are utilized to help screen from vast amounts of information those different aliases that truly represent the same entity identity. The model can understand subtle differences across various scenarios, thereby revealing different names for the same entity hidden beneath the surface. Through this approach, not only can errors be reduced and data quality improved, but the knowledge graph can also be made more accurate and practical, truly serving user needs and decision-making.[14]
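The merging step can be sketched as a canonical-name map plus attribute pooling. In practice the alias pairs would come from the LLM's judgment; here they are hard-coded (using the IBM example from the text) purely for illustration.

```python
# Sketch: unify variant names of one entity via a canonical-name map.
# Alias pairs would normally be produced by an LLM judgment step.
ALIASES = {
    "International Business Machines Corporation": "IBM",
    "Big Blue": "IBM",
}

def canonical(name):
    return ALIASES.get(name, name)

def merge_entities(entities):
    """Merge records that refer to the same entity, pooling attributes."""
    merged = {}
    for entity in entities:
        key = canonical(entity["name"])
        merged.setdefault(key, {"name": key, "attributes": {}})
        merged[key]["attributes"].update(entity.get("attributes", {}))
    return list(merged.values())

records = [
    {"name": "IBM", "attributes": {"type": "company"}},
    {"name": "Big Blue", "attributes": {"founded": "1911"}},
]
out = merge_entities(records)
print(len(out), out[0]["attributes"])
```

Note that merging pools attributes from all variants, so the surviving record is richer than any single mention; conflicting attribute values would need an additional resolution policy.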

2.5 Retrieval Target Analysis and Marking

In the aforementioned process, entity names, entity attributes, and relationships between these entities have been successfully extracted from the material repository. This data is stored in the database. Additionally, the associations between entities and materials have been thoroughly recorded, including what types of retrieval targets these materials support for specific entities, which is crucial for future data analysis and retrieval operations.

To further enhance the application effectiveness of knowledge graph data in retrieval scenarios, the concept of retrieval target attributes is introduced. This attribute establishes basic retrieval targets for various entity types, enabling the large language model to identify specific retrieval targets suitable for an entity based on the content of the current material. For example, when dealing with "organization" entities, the large language model will mark the most appropriate retrieval target from preset options such as "latest developments," "social activities," or "product releases"; when "technology" entities appear, targets like "application cases," "research achievements," or "development trends" become the selectable range. This design greatly enhances the flexibility and adaptability of knowledge graph data, enabling it to more effectively serve different query requirements.[15]
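The retrieval-target options named above can be encoded as a per-type preset list from which a choice is made for each material. The sketch below replaces the LLM's choice with a `pick` parameter for illustration; the option lists are taken from the examples in the text.

```python
# Sketch: preset retrieval targets per entity type; the LLM's choice is
# stubbed as the `pick` argument for illustration.
RETRIEVAL_TARGETS = {
    "organization": ["latest developments", "social activities", "product releases"],
    "technology":   ["application cases", "research achievements", "development trends"],
}

def mark_target(entity_type, material, pick=None):
    """Return a retrieval target valid for this entity type.
    `pick` stands in for the LLM's judgment on `material`;
    falls back to the first preset option."""
    options = RETRIEVAL_TARGETS.get(entity_type, [])
    if pick in options:
        return pick
    return options[0] if options else None

print(mark_target("organization", "...", pick="product releases"))
print(mark_target("technology", "..."))
```

Constraining the model to a closed option list is the key design choice: it keeps the annotations consistent across materials and directly queryable at retrieval time.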

3. Application and Visualization

3.1 Large Language Model-Enhanced Knowledge Graph Retrieval

When confronting the challenges of diversity and complexity in user information retrieval, we are committed to developing an efficient and intelligent information retrieval system aimed at significantly enhancing user experience. When users input query text, the system's primary task is to promptly and accurately capture the user's query intent. To achieve this goal, we innovatively integrate a large language model as the core analysis engine to deeply parse the semantic connotations of user input.

Specifically, the system first utilizes tokenization technology to segment the user's retrieval text into lexical units, then efficiently matches these terms with a pre-constructed entity database. During the matching process, full advantage is taken of the alias and abbreviation information annotated during entity merging, ensuring that even non-standard terms or abbreviations input by users can be accurately mapped to target entities. Subsequently, the large language model analyzes the specific retrieval purpose of the current query for particular entities, thereby achieving precise understanding of user requirements.[16]
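The matching step can be sketched as a lookup against an alias-annotated entity index, matched longest-first so that multi-word aliases win over shorter ones. The index entries below are illustrative examples, not the system's actual data.

```python
# Sketch: match a user query against an alias-annotated entity index.
# Index contents are illustrative.
ENTITY_INDEX = {
    "ibm": "IBM",
    "big blue": "IBM",
    "state grid": "State Grid",
}

def match_entities(query):
    """Greedy longest-first match of query substrings to known entities."""
    q = query.lower()
    hits = []
    for alias in sorted(ENTITY_INDEX, key=len, reverse=True):
        if alias in q:
            target = ENTITY_INDEX[alias]
            if target not in hits:
                hits.append(target)
    return hits

print(match_entities("latest news about Big Blue"))  # ['IBM']
```

Because the index maps every recorded alias to its canonical entity, a query phrased with a nickname or abbreviation still resolves to the same node as the formal name.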

Based on the above analysis, the system further retrieves relevant data from the constructed knowledge graph, strictly filtering results according to both the entity and the retrieval purpose. Ultimately, the system presents users with a list of articles closely related to their query entities and purposes, arranged in the natural chronological order of entities and events to form an intuitive timeline.[17] This design not only greatly enhances the accuracy and efficiency of retrieval but also significantly reduces the risk of retrieval failure due to input variations. Users no longer need to worry about terminology accuracy; they can simply phrase queries according to personal habit, and the system intelligently guides them to the relevant information, achieving a shift from "people seeking information" to "information finding people."
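The final filter-and-order step can be sketched as a query over stored article records: keep those matching both the entity and the retrieval purpose, then sort by date to form the timeline. The record fields and sample data below are our own illustration.

```python
# Sketch: filter stored articles by matched entity and retrieval purpose,
# then order chronologically. Field names and data are illustrative.
from datetime import date

ARTICLES = [
    {"title": "B", "entity": "IBM", "purpose": "latest developments",
     "date": date(2024, 6, 1)},
    {"title": "A", "entity": "IBM", "purpose": "latest developments",
     "date": date(2024, 1, 5)},
    {"title": "C", "entity": "IBM", "purpose": "product releases",
     "date": date(2024, 3, 2)},
]

def retrieve(entity, purpose):
    hits = [a for a in ARTICLES
            if a["entity"] == entity and a["purpose"] == purpose]
    return sorted(hits, key=lambda a: a["date"])  # timeline order

print([a["title"] for a in retrieve("IBM", "latest developments")])  # ['A', 'B']
```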

3.2 Intuitive Display of Knowledge Graph Network Diagrams

Based on in-depth data analysis, the knowledge graph data constructed in this paper is visualized as a network node knowledge graph chart. In this chart, each node serves as a concrete representation of an independent entity, like stars in the universe, each shining with rich information and value. To optimize chart readability and operational convenience, color coding technology is introduced, using distinct colors to differentiate entity types supplemented by detailed legends, enabling users to quickly capture information essence and understand core contexts while browsing.

The vitality of a knowledge graph lies in the intricate relationship network among its entities, a feature vividly demonstrated in the chart through closely connected edges between points. These edges not only outline the association framework between entities but also deeply reveal the underlying logical structures and intrinsic connections by clearly labeling relationship types (such as "contains," "belongs to," "related to," etc.). This design significantly reduces the cognitive burden on users in understanding complex relationships and promotes efficient information retrieval and utilization.
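One lightweight way to realize such a colored, edge-labeled display is to emit the graph in Graphviz DOT format, with one fill color per entity type and the relationship type as the edge label. The color assignments below are arbitrary examples; the paper does not specify a rendering toolkit.

```python
# Sketch: emit the knowledge graph as Graphviz DOT, with one color per
# entity type and relationship types as edge labels (colors illustrative).
TYPE_COLORS = {"equipment": "lightblue",
               "organization": "lightgreen",
               "event": "salmon"}

def to_dot(nodes, edges):
    """nodes: (name, type) pairs; edges: (head, relation, tail) triples."""
    lines = ["digraph kg {"]
    for name, etype in nodes:
        color = TYPE_COLORS.get(etype, "gray")
        lines.append(f'  "{name}" [style=filled, fillcolor={color}];')
    for head, rel, tail in edges:
        lines.append(f'  "{head}" -> "{tail}" [label="{rel}"];')
    lines.append("}")
    return "\n".join(lines)

dot = to_dot(
    nodes=[("Substation A", "equipment"), ("State Grid", "organization")],
    edges=[("State Grid", "contains", "Substation A")],
)
print(dot)
```

The resulting DOT text can be rendered with any Graphviz-compatible tool; a browser-based graph library would serve equally well for the interactive display described next.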

Figure 6 [FIGURE:6] Knowledge Graph Display

The visualization display further incorporates dynamic interactive functions to enhance the depth and breadth of user experience. In the initial display stage, the system focuses on nodes directly associated with the retrieved matching entity—that is, the most closely related and critical surrounding information—providing users with a concise and focused view. However, for users eager to explore deeper, simply clicking on any specific node triggers an immediate system response, dynamically reconstructing and displaying an expanded knowledge graph centered on that node. This new version of the graph deeply mines and extensively presents knowledge domains related to that node, providing users with a multi-dimensional, in-depth exploration platform that facilitates comprehensive and profound understanding of complex information systems.[18]
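The click-to-expand behavior amounts to recomputing a depth-limited neighborhood around the selected node. A minimal breadth-first sketch, with a toy edge list standing in for the real graph store:

```python
# Sketch: when a user clicks a node, show its depth-limited neighborhood.
# The edge list is a toy stand-in for the real graph store.
from collections import deque

EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]

def neighborhood(center, depth=1):
    """Breadth-first expansion around `center` up to `depth` hops,
    treating edges as undirected for display purposes."""
    adj = {}
    for u, v in EDGES:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen = {center}
    frontier = deque([(center, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == depth:
            continue
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return seen

print(sorted(neighborhood("B", depth=1)))  # ['A', 'B', 'C']
```

Keeping the depth small on each expansion is what produces the "concise and focused view" the text describes: the user sees only the immediate context, and each click re-centers and re-expands the graph.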

Conclusion

This paper has explored an event extraction and analysis method for power media based on large language models and knowledge graph technology, aiming to improve the efficiency and quality of power media content production while optimizing user experience. Through entity extraction, large language models, and knowledge graph technology, a comprehensive power media knowledge graph framework has been constructed. This framework can accurately extract multiple types of entities and their attributes from power media information, with large language models enhancing both accuracy and efficiency. In application scenarios, the knowledge graph significantly improves the convenience and accuracy of information retrieval, providing rich visual experiences and in-depth information exploration capabilities. However, GPU resources remain scarce, so this research encounters bottlenecks when processing massive historical data, and the limited interpretability of large language models makes optimization harder. To overcome these pain points, future work will explore several directions: algorithm and model compression, distributed and cloud computing, intelligent resource scheduling and prediction, and emerging hardware technologies, so as to achieve deep mining and efficient presentation of data value while reducing dependence on GPU resources.

References

[1] Ma Zhonggui, Ni Runyu, Yu Kaihang. Recent Advances, Key Technologies, and Challenges of Knowledge Graphs[J]. Chinese Journal of Engineering, 2020(10): 1254-1266.

[2] Li Dongmei, Zhang Yang, Li Dongyuan, Lin Danqiong. A Survey of Entity Relationship Extraction Methods[J]. Journal of Computer Research and Development, 2020(7): 1424-1448.

[3] Zhang Heyi, Wang Xin, Han Lifan, Li Zhao, Chen Zirui, Chen Zhe. Research on Question Answering Systems Integrating Large Language Models and Knowledge Graphs[J]. Computer Science and Exploration, 2023(10): 2377-2388.

[4] Huang Bo, Wu Shenao, Wang Wenguang, Yang Yong, Liu Jin, Zhang Zhenhua, Chen Nanxi, Yang Hongshan. Graph-Model Complementarity: A Survey of Knowledge Graph and Large Model Integration[J]. Journal of Wuhan University (Natural Science Edition), 2024(4): 397-412.

[5] Tang Xiaosheng, Cheng Linya, Zhang Chunhong, et al. Application of Large Language Models in Automated Construction of Disciplinary Knowledge Graphs[J]. Journal of Beijing University of Posts and Telecommunications (Social Science Edition), 2024(1): 125-136.

[6] Xu Zenglin, Sheng Yongpan, He Lirong, Wang Yafang. A Survey of Knowledge Graph Technology[J]. Journal of University of Electronic Science and Technology of China, 2016(4): 589-606.

[7] Liu Jin, Du Ning, Xu Jing, et al. Application and Research of Knowledge Graphs in the Power Field[J]. Electric Power Information and Communication Technology, 2020(1): 60-66.

[8] Pu Tianjiao, Tan Yuanpeng, Peng Guozheng, Xu Huifang, Zhang Zhonghao. Construction and Application of Knowledge Graphs in the Power Domain[J]. Power System Technology, 2021(6): 2080-.

[9] Yang Yuji, Xu Bin, Hu Jiawei, Tong Meihan, Zhang Peng, Zheng Li. An Accurate and Efficient Domain Knowledge Graph Construction Method[J]. Journal of Software, 2018(10).

[10] Zhang Ning, Simon Mahony. Opportunities and Challenges of Large Language Models for Digital Publishing[J]. Editing Friends, 2023(11): 45-51.

[11] Cai Zifan, Yu Haiyan. The Evolution of AI-Generated Content (AIGC) and Its Application Scenarios in Library Smart Services[J]. Library Journal, 2023(4): 34-43, 135-136.

[12] Yang Bo, Sun Xiaohu, Dang Jiayi, Zhao Haiyan, Jin Zhi. A Large Language Model-Based Named Entity Recognition Method for Medical Question Answering Systems[J]. Computer Science and Exploration, 2023(10): 2389-2402.

[13] Zhang Caike, Li Xiaolong, Zheng Sheng, et al. Research on Knowledge Graph Construction and Application Based on Large Language Models[J]. Computer Science and Exploration, 2024(10).

[14] Xiang Wei. A Survey of Event Knowledge Graph Construction Technology and Applications[J]. Computer and Modernization, 2020(1): 14-20.

[15] Wang Zhiyue, Yu Qing, Wang Nan, Wang Yaoguo. A Survey of Intelligent Question Answering Based on Knowledge Graphs[J]. Computer Engineering and Applications, 2020(23): 1-11.

[16] Zhang Erkun, Zhang Yixiao. ChatGPT Implications: New Issues in Communication Studies in the Era of Large Language Models[J]. International Press, 2023(6): 167-176.

[17] Li Gang, Li Yinqiang, Wang Hongtao, et al. Power Equipment Health Management Knowledge Graph: Basic Concepts, Key Technologies, and Research Progress[J]. Automation of Electric Power Systems, 2022(3): 1-13.

[18] Wang Yongchao, Luo Shengwen, Yang Yingbao, Zhang Hongxin. A Survey of Knowledge Graph Visualization[J]. Journal of Computer-Aided Design and Computer Graphics, 2019(10).

Author Biographies

Li Jia (1990—), female, from Huanghua, Hebei, Master's degree, intermediate engineer, research direction: application of artificial intelligence in power industry media.

Zang Yanjiao (1989—), female, from Baoding, Hebei, Master's degree, intermediate engineer, research direction: integrated publishing, intelligent writing, and news big data communication.

Gu Chenlan (1975—), female, from Shanghai, Master's degree, senior engineer, research direction: application of artificial intelligence in news media.

(Editor in charge: Li Yansong)
