·Review and Monograph· Digital Policies Leading the Governance Logic and Ecosystem Construction of Healthcare Data Space: An Analytical Framework Based on "Technology-Management-Law"
WANG Hongchuan¹,², ZHANG Jianbo³*, MA Wei³, ZHAO Sidi²
1. School of Public Policy and Management, Tsinghua University, Beijing 100084, China
2. Institute for Contemporary China Studies, Tsinghua University, Beijing 100084, China
3. School of Economics and Management, Xinjiang University, Urumqi 830046, China
*Corresponding author: ZHANG Jianbo, Research assistant; E-mail: 3067754029@qq.com
【Abstract】 Amid the global wave of digital transformation, data reform in the healthcare industry is accelerating, and countries around the world are enacting digital policies to drive the development of healthcare data. This paper focuses on the construction of a healthcare data space driven by digital policies and systematically analyzes the multi-dimensional challenges it faces in data sharing, privacy and security, technology integration, and ethical governance. Drawing on the practical experience of the European Union's "European Health Data Space" and China's "14th Five-Year Plan" policies, it proposes breaking through data silos with technologies such as federated learning and blockchain-based certification, optimizing resource allocation through hierarchical and penetrating capital allocation, and establishing a cross-border data mutual recognition mechanism to strengthen international cooperation. The study argues that the sustainable development of the healthcare data space urgently requires a trinity governance framework of "technology-management-law," which relies on technology and policy to create a multi-agent collaborative ecosystem and to solve core problems such as insufficient data standardization, storage bottlenecks, and talent shortages. Through a path analysis that combines theory and practice, the paper provides a systematic reference for building the digital foundation of the Healthy China strategy. It is expected to help transform healthcare data from accumulated resources into intelligent services, promoting the popularization of precision healthcare and the coordinated development of global health governance.
【Key words】 Healthcare data space; Artificial intelligence; Big data; Data security; Precision healthcare
【Chinese Library Classification】 R-056
【Document code】 A
DOI: 10.12114/j.issn.1007-9572.2025.0145
Driven by the global wave of the intelligent era, the healthcare sector is undergoing a profound data transformation. Artificial intelligence is reshaping the medical and health fields in both depth and breadth, from intelligent assisted diagnosis to accelerated new drug development, making data the core engine of medical innovation. Prominent problems remain, however: data formats are heterogeneous across medical institutions, cross-system interoperability is insufficient, and patient health information (such as electronic medical records and genomic data) is highly sensitive. Existing regulatory systems, technical standards, and management mechanisms cannot yet meet these demands, so it is increasingly urgent to build a key carrier that integrates multi-party data resources and enables their secure and efficient circulation. Meanwhile, scenarios such as precision medicine, real-world research[1], and public health emergencies place urgent demands on large-scale data sharing. Governments and international organizations have continued to introduce policies aimed at breaking down data silos, reducing privacy and security risks, and coping with increasingly stringent compliance pressures.
The European Union has taken the lead in promoting the "European Health Data Space (EHDS)," enabling secure cross-border data circulation through the establishment of a digital health market[2]. The Chinese government has also actively promoted relevant policies: the "14th Five-Year Plan for Digital Economy Development"[3] emphasizes fostering shared spaces for medical resources, and the digital transformation and upgrading of primary-level medical and health institutions is being comprehensively advanced on the basis of norms and standards such as the "National Basic Public Health Service Standards (Third Edition)"[4]. Against the background of these digital policies, this paper focuses on the construction of the healthcare data space. It systematically explores the practical challenges in data sharing innovation, privacy protection technologies, and intelligent governance models, summarizes domestic and international experience in rule-making, technology integration, and ecosystem cultivation, and provides theoretical analysis and practical reference for building a trustworthy, secure, and efficient intelligent medical ecosystem[5].
2.1 Accelerated Development of the Healthcare Data Market
The healthcare data space market is developing with strong momentum, powerfully driving the upgrading of the global medical industry. Currently, the global healthcare data market is in a stage of vigorous development, with related industries growing rapidly. According to data from relevant institutions, the global digital healthcare market size reached $286.35 billion in 2023 and is expected to continue expanding at a compound annual growth rate of 26.8%, rising to $365.67 billion in 2024 and potentially exceeding $450 billion in 2025[8]. The "14th Five-Year Plan for Bioeconomy Development" proposes[9] focusing on advanced diagnosis and treatment technologies and equipment, precision medical testing, and other directions to enhance original innovation capabilities. Against this backdrop, China's intelligent medical device market has grown rapidly from 2020 to 2025, with the market size expected to reach 24.23 billion yuan in 2025, and is projected to maintain high growth from 2026 to 2027[10].
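The growth figures quoted above can be checked with simple compound-growth arithmetic. The sketch below is only an illustration using the numbers as quoted in the text; the function names are assumptions introduced here and do not come from the cited report. Applying 26.8% to the 2023 figure gives roughly $363.1 billion, slightly below the quoted $365.67 billion, which suggests the quoted rate may be a multi-year average rather than the exact 2023-2024 growth.

```python
def project(value: float, rate: float, years: int) -> float:
    """Compound a starting value forward by `years` at annual growth `rate`."""
    return value * (1.0 + rate) ** years


def implied_rate(start: float, end: float, years: int) -> float:
    """Annual growth rate implied by two values that are `years` apart."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures as quoted in the text (USD billion); nothing here is new data.
BASE_2023 = 286.35
QUOTED_2024 = 365.67
CAGR = 0.268

projected_2024 = project(BASE_2023, CAGR, 1)      # 2023 figure grown by one year
projected_2025 = project(QUOTED_2024, CAGR, 1)    # quoted 2024 figure grown by one year
rate_2023_2024 = implied_rate(BASE_2023, QUOTED_2024, 1)

if __name__ == "__main__":
    print(f"2024 from 2023 at 26.8%: {projected_2024:.2f}")
    print(f"2025 from quoted 2024 at 26.8%: {projected_2025:.2f}")
    print(f"growth implied by quoted 2023 and 2024 values: {rate_2023_2024:.1%}")
```

Growing the quoted 2024 value by one more year at the same rate lands above $450 billion, which is consistent with the 2025 expectation stated in the text.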
The market is highly concentrated, with leading enterprises occupying dominant positions in the industry. Meanwhile, the rise of internet medical platforms has deepened scenarios such as health management and remote diagnosis and treatment, and enterprises with multimodal processing capabilities and compliance frameworks are gradually gaining an advantage. However, the market-oriented allocation mechanism for data remains imperfect: most projects rely on government funding, and the sustainability of their business models has yet to be demonstrated. Going forward, it is necessary to resolve bottlenecks such as the digital capability gap at the primary level and insufficient algorithm interpretability, so that artificial intelligence in the medical field becomes more widely accessible.
2.2 Prominent Challenges in Healthcare Data Security
Data security is critical to the healthcare data space: it safeguards the stable operation of the space, the rational use of data, and the healthy development of the medical industry. The "2020 Digital Healthcare: Research Report on Cybersecurity Risks During the Epidemic Prevention and Control Period" released by the China Academy of Information and Communications Technology[11] shows that nearly 30% of surveyed medical institutions face risks of data asset leakage. Verizon's cybersecurity report shows that, globally, healthcare is the only industry in which internal threats exceed external threats, with leakage of medical data by internal practitioners reaching alarming levels[12].
Security threats to the healthcare data space originate mainly from three dimensions. (1) Technical level: innovation capability in data governance technology is weak[13]; the information systems and devices of medical institutions contain security vulnerabilities; traditional encryption technologies offer degraded protection for diagnosis and treatment data; outdated systems that are not promptly updated create high-risk attack surfaces; and medical data in transit is vulnerable to eavesdropping, tampering, and interception by attackers. (2) Management level: management systems in medical institutions are incomplete, risk management awareness is weak, and internal permission management is chaotic[14]; staff with insufficient security awareness may obtain permissions beyond the scope of their work, increasing the risk of unnecessary data access and leakage. (3) Legal level: the ownership of medical data is ambiguously defined, and the system for allocating data rights and the related laws and regulations are imperfect[15]; there are loopholes in the standardized use and supervision of data security, and current penalty standards hardly cover new types of data crime, making it difficult to curb illegal activities effectively.
2.3 Innovation Opportunities from Technology-Policy Synergy
At present, the operation of the healthcare data space shows an evolutionary trend in which technological drivers and policy innovation are intertwined. Globally, data sharing models are shifting from centralized storage to the distributed, collaborative concept of the healthcare data space, supported by core technological breakthroughs in privacy computing, cross-chain interoperability, and the combined application of federated learning[16]. Healthcare data spaces abroad are developing rapidly: thanks to advanced information technology and mature market environments, the United States, the European Union, and other regions lead in healthcare data space standards and ecosystem construction. The United States has promoted policies such as the "21st Century Cures Act"[17] to encourage medical institutions to share data, forming a broader healthcare data sharing ecosystem. This facilitates the use of aggregated medical data in drug development and in the development of clinical decision support systems, and aids early screening and precision diagnosis.
The European Union, through its "European Health Data Space" (EHDS) strategy, breaks down data barriers between member states, enabling cross-border medical data circulation under unified rules and security frameworks, and promoting multi-center clinical research and public health collaborative governance[18]. For example, a joint Alzheimer's disease research project between France and Germany relies on the EHDS trusted data space infrastructure to integrate data from multiple medical institutions in both countries, accelerating disease mechanism research and new drug development processes. Similar successful cases include the application of France's National Health Data System (data repository) and patient electronic medical records (data sources), whose research helps explore connections and potential for improving real-world research quality[19]. Additionally, foreign countries have relatively mature support for the secure operation and privacy protection of data spaces. For instance, the U.S. Health Insurance Portability and Accountability Act[20-21] strictly regulates medical data usage, while the EU General Data Protection Regulation[22] strengthens data subject rights protection, standardizing the collection, storage, and use of personal health data and regulating various aspects of online personalized services, providing a solid legal guarantee for the development of the trusted healthcare data space industry.
The development of China's healthcare data space is in a dual opportunity period of policy-driven and technology integration. The 2024 "Guiding Opinions on Comprehensively Promoting the Construction of Compact County-level Medical Communities"[23] explicitly proposes promoting interconnectivity of medical data within counties, building unified data sharing platforms, and facilitating tiered diagnosis and treatment and resource integration. The timely release of the "National Data Standards System Construction Guide"[24] lays a solid foundation for cross-institution data interoperability by unifying medical data concept definitions, technical specifications, and security frameworks, covering seven key areas including data infrastructure, resource integration, and circulation application[25]. Policy documents such as the "Healthy China 2030" Planning Outline[26] also clearly propose promoting medical informatization construction, laying technical and management foundations for data space.
With policy support continuing to grow, actors at several levels are playing key roles in the construction and development of the healthcare data space. At the national level, policy guidance and institution-building promote the integration and application of medical data. In 2025, the Big Data Center of the National Healthcare Security Administration was officially inaugurated. It undertakes the application, management, and service of national medical insurance data, which covers 1.33 billion insured individuals and 930,000 hospitals and pharmacies, and supports medical insurance reform and the development of the pharmaceutical and health industry. Data applications in the medical field focus on assisted diagnosis, patient virtual assistants, medical image analysis, and related areas[27]. In clinical diagnosis and treatment, multiple tertiary Grade A hospitals in China have established smart medical platforms that use big data and AI for disease prediction and precision diagnosis and treatment. In public health governance, multi-modal epidemic monitoring systems that integrate case data and epidemiological survey data have enabled hierarchical, precise prevention and control and efficient resource allocation. These digital applications continue to make breakthroughs in the medical field, achieving the leap from "data resource accumulation" to "value creation" and comprehensively empowering the digital-intelligent transformation of the medical and health industry[28]. Viewed as a whole ecosystem, China is at the stage of building data sharing platforms of various kinds and aggregating medical data resources, and is evolving toward a secure, open, and interoperable healthcare data space.
3 Main Bottlenecks in Developing Healthcare Data Space
3.1 Bottlenecks in Data Storage and Sharing
The accelerated development of medical informatization has made real-time processing of massive data, security assurance capabilities, and long-term stable operation core challenges. Currently, medical institutions' data storage and processing face dual pressures from soaring algorithm complexity and increasing communication costs. Standalone storage systems can no longer meet actual storage and computing needs, making optimized data storage methods key to solving the problem[29]. Meanwhile, to ensure data security and business continuity, data storage systems need to implement a dual real-time backup mechanism of "local + remote", enabling data recovery within a short time frame even under extreme circumstances such as server damage or cyber attacks, avoiding interruption of diagnosis and treatment operations.
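The "local + remote" dual backup described above can be illustrated with a minimal sketch. It assumes both targets are reachable as directories (the `remote_dir` argument merely stands in for an off-site store) and that integrity is verified by SHA-256 checksum; the function names are hypothetical, and real-time replication, scheduling, and encryption are deliberately out of scope.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dual_backup(source: Path, local_dir: Path, remote_dir: Path) -> str:
    """Copy `source` to both targets and verify each copy by checksum."""
    expected = sha256_of(source)
    for target_dir in (local_dir, remote_dir):
        target_dir.mkdir(parents=True, exist_ok=True)
        copy = target_dir / source.name
        shutil.copy2(source, copy)
        if sha256_of(copy) != expected:
            raise IOError(f"checksum mismatch for {copy}")
    return expected


def restore(name: str, local_dir: Path, remote_dir: Path, expected: str) -> Path:
    """Return the first intact copy, preferring the local one."""
    for target_dir in (local_dir, remote_dir):
        candidate = target_dir / name
        if candidate.exists() and sha256_of(candidate) == expected:
            return candidate
    raise FileNotFoundError(f"no intact copy of {name}")
```

Keeping the checksum returned by `dual_backup` separately from both copies is what lets `restore` detect a damaged local copy and fall back to the remote one.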
However, at the hardware deployment level, regulatory, technical, and human constraints remain major obstacles[30]. Medical institutions have significant hardware shortcomings, and the tension between performance requirements and cost control is acute. To process computationally intensive tasks such as CT image analysis efficiently, hospitals need high-performance storage and high-speed network equipment. Yet the outdated power systems of primary-level hospitals cannot support high-energy-consuming equipment, and the constant-temperature and dust-proof renovations required for dedicated machine rooms are difficult to carry out because of funding shortages. At the same time, the contradiction between exponential data growth and equipment expansion urgently needs to be resolved: storage expansion in tertiary hospitals requires downtime for data migration, so expanding while the hospital continues to operate (the "expansion while operation" model) brings business continuity risks and uncontrolled expansion costs. The training of large AI medical models requires simultaneous access to massive data resources and powerful computing capabilities, further exposing the large gap in existing hardware architectures in computing power supply and data processing timeliness[31].
In the field of data sharing, multiple barriers severely constrain the release of medical value. Policy regulations and privacy barriers constitute primary obstacles, as sharing requires costly anonymization processing and obtaining hierarchical authorizations[32]. Particularly in cross-border transmission of sensitive information such as genetic data, relevant compliance reviews have become more stringent, with not only complex operational procedures but also rising risk costs. At the practical level, the absence of a "rights, responsibilities, and benefits" allocation mechanism leads to a lack of revenue guarantees for contributors. Hospitals worry that data leakage will cause legal risks and loss of competitive advantages, patients have concerns about privacy abuse, and pharmaceutical companies and research institutions struggle to establish sustainable benefit distribution models. This multi-party game creates a vicious cycle of "unwilling to share, afraid to share". Simultaneously, insufficient technology empowerment directly limits implementation pathways. Key technologies such as privacy computing (e.g., federated learning) and blockchain have yet to break through performance bottlenecks and large-scale application thresholds, making it difficult to balance the needs of secure sharing and efficient collaboration, and unable to effectively support scenarios such as clinical diagnosis and treatment and cross-domain research.
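To make concrete what federated learning offers in this setting, the following is a minimal sketch of federated averaging for a linear model in pure Python. It is an illustration under simplifying assumptions (plain gradient descent, no encryption, no differential privacy, synchronous rounds), not the scheme of any system cited here; all names are hypothetical. The point it shows is that only model weights travel between sites, while raw records stay where they were collected.

```python
from typing import List, Sequence, Tuple

Weights = List[float]
Sample = Tuple[Sequence[float], float]  # (features, label)


def local_update(weights: Weights, data: Sequence[Sample],
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Run a few epochs of gradient descent on one site's private data."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for features, label in data:
            error = sum(wi * xi for wi, xi in zip(w, features)) - label
            for i, xi in enumerate(features):
                grad[i] += 2.0 * error * xi / len(data)  # mean squared error gradient
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def federated_average(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Average site weights, weighting each site by its sample count."""
    total = sum(count for _, count in updates)
    size = len(updates[0][0])
    return [sum(w[i] * count for w, count in updates) / total for i in range(size)]


def train(sites: Sequence[Sequence[Sample]], rounds: int = 200) -> Weights:
    """Coordinate training: sites send weights, never raw records."""
    weights = [0.0] * len(sites[0][0][0])
    for _ in range(rounds):
        updates = [(local_update(weights, data), len(data)) for data in sites]
        weights = federated_average(updates)
    return weights
```

The performance bottleneck noted above shows up even in this toy version: every round requires each site to train locally and exchange a full set of weights, so communication cost grows with model size and the number of rounds.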
Ultimately, storage and sharing bottlenecks create a superimposed effect. Performance limitations of storage systems hinder high-concurrency data access and real-time analysis, while imperfect sharing mechanisms prevent the aggregation and utilization of dispersed data resources, further amplifying the hardware computing power gap. This coexistence of "storage silos" and "sharing shackles" severely delays the transformation of healthcare data from resources to productivity.
3.2 Bottlenecks in Data Standardization and Quality
Data sources for the healthcare data space are highly complex. They span clinical diagnosis and treatment, public health, patient-generated data, and scientific research, and show pronounced multi-source, heterogeneous characteristics. The main sources are: internal medical institution data, in which electronic health records, medical imaging, and laboratory data constitute the core; patient-generated data from wearable devices, mobile health applications, the Internet of Things, and smart devices; and public health data, such as disease surveillance data and population health data, which supports macro-level health policy formulation.
However, the absence of data standardization systems leads to integration challenges: disease coding lags in updates due to regional differences, with disease classification error rates remaining high; structured electronic medical records coexist with unstructured text, with low matching rates between manually entered data and system fields; primary-level medical institutions become weak links in multi-center data collaboration due to lack of unified standard interfaces and significant regional disparities in resource allocation[34].
Data quality directly affects the performance of machine learning models and the reliability of medical decisions[35]. Current medical data suffers from prominent problems of authenticity bias, contamination, and missing values. Some intelligent models produce distorted outputs because their training data is contaminated. Missing data is particularly common in primary-level settings, where, in addition to equipment failures, problems with patient compliance, follow-up mechanisms, and cooperation with medical and health institutions are prominent[36]. Lagging data processing technologies, compatibility problems with historical data, and noise differences caused by poorly maintained equipment further reduce data usability. Furthermore, primary-level institutions have not yet established full-process quality management systems; problems at the source, such as non-standard medical record entry and disorganized order sets, are frequent and continuously amplify the risk of bias in data extraction and analysis.
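One simple way to quantify the missing-data problem described above is a per-field missing rate. The sketch below is illustrative only: the field names in the usage note and the 20% flagging threshold are assumptions introduced here, not values from the cited studies.

```python
from typing import Dict, Iterable, List, Mapping, Optional

Record = Mapping[str, Optional[object]]


def is_missing(value: Optional[object]) -> bool:
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())


def missing_rates(records: Iterable[Record], fields: List[str]) -> Dict[str, float]:
    """Fraction of records in which each field is missing or absent."""
    rows = list(records)
    if not rows:
        return {name: 0.0 for name in fields}
    return {
        name: sum(is_missing(row.get(name)) for row in rows) / len(rows)
        for name in fields
    }


def flag_fields(rates: Mapping[str, float], threshold: float = 0.2) -> List[str]:
    """Fields whose missing rate exceeds the threshold (0.2 is an assumed default)."""
    return sorted(name for name, rate in rates.items() if rate > threshold)
```

For example, running `missing_rates` over follow-up records with hypothetical fields such as `blood_pressure` and `follow_up_date` would show which fields an institution most often leaves empty, which is a starting point for source-level quality management.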
As the "safety net" for safeguarding public health needs and improving population health levels, primary-level medical and health services' data governance capabilities directly relate to the accessibility and precision of public health services[37]. Currently, primary-level healthcare faces multiple structural shortcomings in data governance: (1) Traditional management models with limited integration with medical data, lacking refined data support, leading to insufficient procedural and standardized processing; (2) Exacerbated data standardization issues for medical equipment, with primary-level ECG monitoring and other devices having inconsistent technical parameters such as storage formats and sampling frequencies, high proportions of outdated equipment, and insufficient acquisition precision; (3) Disconnect between patient-generated data and professional medical data, with wearable device precision deviations and weak calibration capabilities causing difficulties in multimodal data integration, often unable to achieve format conversion and calibration due to technical and funding constraints. These challenges highlight systematic defects in primary-level medical data from collection and storage to application, urgently requiring technology empowerment and resource allocation to build digital solutions adapted to primary-level needs.
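The inconsistency in sampling frequencies mentioned in point (2) can be illustrated with a small resampling routine. This is a sketch that assumes uniformly sampled signals and uses plain linear interpolation; the function name is hypothetical, and a production pipeline would normally apply an anti-aliasing filter before downsampling, which is omitted here.

```python
from typing import List, Sequence


def resample_linear(signal: Sequence[float], source_hz: float,
                    target_hz: float) -> List[float]:
    """Resample a uniformly sampled signal by linear interpolation."""
    if source_hz <= 0 or target_hz <= 0:
        raise ValueError("sampling rates must be positive")
    if len(signal) < 2:
        return list(signal)
    duration = (len(signal) - 1) / source_hz        # seconds covered by the signal
    count = int(duration * target_hz + 1e-9) + 1    # number of output samples
    output = []
    for k in range(count):
        position = k / target_hz * source_hz        # fractional index into the source
        left = min(int(position), len(signal) - 2)  # keep left + 1 within bounds
        fraction = position - left
        output.append(signal[left] * (1 - fraction) + signal[left + 1] * fraction)
    return output
```

Bringing recordings from different devices onto one agreed sampling rate in this way is only the first step; differing storage formats and acquisition precision still need separate conversion and calibration.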
3.3 Bottleneck in Compound Talent Cultivation
The construction of healthcare data space is constrained by a shortage of cross-domain compound talents, requiring collaborative cultivation among government, industry, academia, and research to strengthen the core momentum of technology research and development and break through development bottlenecks. Shen Huiwen et al.[38] point out that hospitals generally lack compound talents who understand both clinical practice and data analysis.
Current talent shortages mainly focus on three dimensions. (1) Capability demand gap: The field of smart healthcare urgently needs compound talents with knowledge of clinical medicine, data science, and ethical regulations. However, current university curricula are disconnected from clinical practice, lacking interdisciplinary training programs in collaboration with medical institutions (such as dual degrees or joint training). Developing countries, limited by educational resources and insufficient industry-education integration, have more significant compound talent gaps than developed countries, constraining the sustainable development of healthcare data space. (2) Lagging education system: The talent cultivation mechanism lacks innovation. In the AI era, medical education and physician training are facing major transformations[39], and should rely on digital intelligence technologies to build personalized learning paths and intelligent education management systems[40] to meet society's demand for innovative and compound talents and promote high-quality education development. (3) Missing career pathways: The traditional professional title evaluation system centers on single-discipline capabilities, lacking evaluation standards for interdisciplinary compound talents, causing those with knowledge of clinical medicine, data science, and ethical regulations to face structural obstacles in promotion. It is urgent to reconstruct professional title evaluation dimensions, add specialized positions for smart healthcare, and establish interdisciplinary competency certification systems to adapt to the diversified talent needs of healthcare data space.
3.4 Ethical Dilemmas and Governance Paradoxes in Healthcare Data Space
In the construction of the healthcare data space, ethical contradictions and governance challenges are intertwined across multiple dimensions. Data ethics research is still at an early exploratory stage: academia has not reached a clear understanding of its ethical essence, and the related body of research remains fragmented[41]. Meanwhile, ambiguity in defining medical data property rights further dampens the motivation to share. Because data can be copied, patient informed consent is weakened in secondary use, and imperfect compliance guidance mechanisms expose data ownership to the risk of loss during circulation[42].
At the same time, differences in interest demands among multiple stakeholders and the legal absence of benefit distribution mechanisms create structural contradictions. Although technological evolution such as blockchain and federated learning improves processing efficiency, it simultaneously generates new problems such as fragmented property rights and rising circulation costs, ultimately constituting a governance paradox of "technology empowerment and institutional lag." The data sharing process also exacerbates leakage risks and conflicts in cross-border flow rules, forcing governments into a balancing dilemma between innovation incentives and security supervision. Overemphasizing data encryption requirements may hinder scientific research and clinical collaboration, while insufficient supervision can easily trigger security hazards. This governance deadlock of "chaos without regulation, stagnation with regulation" highlights that policy coordination needs to build a dynamic balance mechanism for security, efficiency, and fairness, promoting the transformation of data from resources to production factors through multi-dimensional institutional design, thereby supporting the achievement of precision medicine and universal health goals.
4 Innovative Paths for Promoting Healthcare Data Space Construction
4.1 Constructing a "Trinity" Collaborative Innovation Ecological Framework
As the cornerstone of modern healthcare system transformation, the construction of healthcare data space is releasing enormous social value and economic potential through technological innovation and ecological reconstruction. With the deepening of the "AI+Healthcare" initiative, applications such as intelligent assisted diagnosis and chronic disease management are accelerating the extension of high-quality medical resources to broader populations. Against this backdrop, the industry is building a "technology-management-law" trinity collaborative framework, which will help healthcare data space applications achieve leapfrog development in areas such as precision medicine accessibility and intelligent chronic disease management.
In terms of technology, the focus is on privacy computing, federated learning, and other "usable but invisible" technologies to break through data silo barriers, and on deepening the application of federated learning and quantum computing to overcome the limits of dynamic data protection and computing power. Encryption and anonymization technologies have been developed for medical data security. The IoT ciphertext index storage scheme constructed by Xu Cheng et al.[43] reduces leakage risks through distributed encryption. Song Kai[44] proposed a cross-chain privacy protection scheme based on group signature algorithms that effectively addresses privacy issues in cross-chain sharing of medical data. Meanwhile, research on optimization methods for storing massive numbers of small medical data files[45] has improved storage efficiency and data security.
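As a small illustration of the desensitization idea, the sketch below pseudonymizes a record with a keyed hash (HMAC-SHA256) from the Python standard library. It is not the ciphertext index scheme or the group signature scheme cited above, and keyed hashing is pseudonymization rather than full anonymization; the field names and function names are hypothetical.

```python
import hashlib
import hmac
from typing import Dict

# Hypothetical direct identifiers to strip before sharing.
DIRECT_IDENTIFIERS = ("name", "phone", "national_id")


def pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Keyed hash of an identifier: stable for a given key, not reversible without it."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize(record: Dict[str, str], secret_key: bytes) -> Dict[str, str]:
    """Drop direct identifiers and replace the patient ID with a pseudonym."""
    cleaned = {key: value for key, value in record.items()
               if key not in DIRECT_IDENTIFIERS}
    cleaned["patient_id"] = pseudonym(record["patient_id"], secret_key)
    return cleaned
```

Because the same key always maps a patient to the same pseudonym, records from one institution can still be linked for research, while an institution using a different key cannot re-identify them. Remaining fields such as rare diagnoses can still enable re-identification, so this step alone does not satisfy anonymization requirements.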
In terms of management, internal governance systems in medical institutions are being upgraded, with medical staff following relevant laws, regulations, and ethical guidelines and protecting patient privacy. Medical institutions are establishing data security responsibility systems, strengthening practitioners' security awareness and permission management, and conducting regular drills and vulnerability checks. Integrating digital intelligence technologies into performance management, through technical means (such as log auditing) and behavioral norms (such as prohibiting casual photography), improves the effectiveness and level of hospital performance management[46], strengthens security management capabilities, and reduces leakage risks. At the same time, tripartite medical data collaboration should be promoted: relying on the healthcare data space to break down barriers among diagnosis and treatment, medical insurance, and research and development drives policy implementation through data sharing and enhances the effectiveness of health services.
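Permission management and log auditing, as mentioned above, can be combined in a few lines. The following is a minimal sketch: the role names, the permission map, and the class and function names are all assumptions for illustration, and a real institution would define its own roles and store the log in tamper-resistant storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set

# Hypothetical role-to-permission map.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "physician": {"read_record", "write_record"},
    "nurse": {"read_record"},
    "researcher": {"read_deidentified"},
}


@dataclass
class AuditLog:
    entries: List[dict] = field(default_factory=list)

    def record(self, user: str, role: str, action: str, allowed: bool) -> None:
        """Append one access decision with a UTC timestamp."""
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user, "role": role, "action": action, "allowed": allowed,
        })

    def denied(self) -> List[dict]:
        """Attempts outside the user's role, which are worth reviewing."""
        return [entry for entry in self.entries if not entry["allowed"]]


def check_access(user: str, role: str, action: str, log: AuditLog) -> bool:
    """Allow the action only if the role grants it, and log the decision either way."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    log.record(user, role, action, allowed)
    return allowed
```

Logging denied attempts as well as granted ones is the design choice that matters here: it is what makes access beyond a staff member's work scope visible during regular audits.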
In terms of law, data security and patient privacy are the core priorities. The current legal system shows a two-tier evolution of "basic legislation-industry norms." The Personal Information Protection Law and the Data Security Law provide the legal foundation for medical data protection, but industry implementation rules require further refinement[47]. For example, such rules could clarify classification and grading standards and management models for medical data, list genetic data and electronic medical records at the highest sensitivity level, and stipulate principles for data sharing. Research by Ge Yongbin et al.[48] shows that establishing a "whitelist" system for data circulation can effectively reduce unauthorized access rates. In addition, establishing legal regulations with reference to the EU's Health Data Space Act can improve the full-chain data governance system[49].
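The combination of classification grading and a circulation whitelist can be sketched as a simple rule check. In the example below, only the placement of genetic data and electronic medical records at the highest level comes from the text; the other grades, the recipient names, and the three-level scale are assumptions for illustration and do not reflect any published standard.

```python
from enum import IntEnum
from typing import Dict


class Sensitivity(IntEnum):
    GENERAL = 1
    SENSITIVE = 2
    HIGHEST = 3


# Assumed grading table; only the two HIGHEST entries follow the text.
DATA_GRADES: Dict[str, Sensitivity] = {
    "genetic_data": Sensitivity.HIGHEST,
    "electronic_medical_record": Sensitivity.HIGHEST,
    "lab_result": Sensitivity.SENSITIVE,
    "appointment_schedule": Sensitivity.GENERAL,
}

# Hypothetical whitelist: recipient -> highest grade it is cleared to receive.
WHITELIST: Dict[str, Sensitivity] = {
    "regional_cdc": Sensitivity.SENSITIVE,
    "partner_hospital": Sensitivity.HIGHEST,
}


def may_share(data_type: str, recipient: str) -> bool:
    """Share only with whitelisted recipients cleared for the data's grade."""
    grade = DATA_GRADES.get(data_type, Sensitivity.HIGHEST)  # unknown types get the strictest grade
    clearance = WHITELIST.get(recipient)
    return clearance is not None and clearance >= grade
```

Defaulting ungraded data types to the strictest level and rejecting recipients that are not listed are both fail-closed choices, which is the behavior a whitelist system is meant to guarantee.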
4.2 Path Planning for Accelerating Digital Infrastructure
Healthcare data space governance needs a "technology-policy" dual-axis collaborative framework (Figure 1). On the technology side, privacy computing and blockchain are used to build trusted data circulation networks and break through silo barriers. On the policy side, hierarchical authorization and cross-border mutual recognition systems set the compliance bottom line. The two adapt to each other dynamically through smart contracts: federated learning supports cross-domain desensitized sharing of diagnosis and treatment data, while sandbox regulation balances AI medical innovation against risk control. At the bottom layer, new infrastructure provides computing power; at the upper layer, data rights confirmation, circulation, and application form a closed-loop ecosystem. The framework both protects highly sensitive information such as genetic data and promotes value release in scenarios such as chronic disease management, ultimately achieving secure and efficient circulation of data elements.
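The tamper-evidence that blockchain contributes to a trusted circulation network can be shown with a hash-chained log. The sketch below is a single-party hash chain, not a distributed ledger, consensus protocol, or smart contract; the function names and payload fields are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(previous: str, payload: Dict[str, Any]) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous + body).encode("utf-8")).hexdigest()


def append(chain: List[Dict[str, Any]], payload: Dict[str, Any]) -> None:
    """Append a notarization entry linked to the current chain tip."""
    previous = chain[-1]["hash"] if chain else GENESIS
    chain.append({"previous": previous, "payload": payload,
                  "hash": entry_hash(previous, payload)})


def verify(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every link; an edited payload or broken link fails the check."""
    previous = GENESIS
    for entry in chain:
        if entry["previous"] != previous:
            return False
        if entry["hash"] != entry_hash(previous, entry["payload"]):
            return False
        previous = entry["hash"]
    return True
```

Recording only events such as "dataset X was authorized for use by institution Y", rather than the data itself, keeps sensitive content off the chain while still making later alteration of the authorization history detectable.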
Figure 1. Operational plan diagram of healthcare data space
Labels in the figure: Innovation Point 1: hardware subsidy issuance; Innovation Point 2: emergency data priority computing; Innovation Point 3: dynamic feedback of construction indicators. Flow steps: upload encrypted diagnosis and treatment data; federated learning modeling (dynamic desensitization); push clinical decision-making recommendations; feed back execution difference rate; jointly develop algorithm optimization projects; return enhanced AI model; sandbox verification; high difference rate triggers retraining; submit standard proposals; certify cross-border mutual recognition agreements; propose data flow mutual recognition rules; launch cross-border joint research; authorize data usage rights; output privacy computing results; submit data space maturity report.
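The feedback step labelled in Figure 1, in which a high execution difference rate triggers retraining, can be sketched as follows. The paper does not define the difference rate or a threshold, so both are assumptions here: the rate is taken as the share of cases in which the executed clinical action differs from the model's recommendation, and the threshold value of 0.3 and the function names are purely illustrative.

```python
def difference_rate(recommended, executed):
    """Share of cases where the executed action differs from the recommendation."""
    if len(recommended) != len(executed):
        raise ValueError("both lists must describe the same cases")
    if not recommended:
        return 0.0
    differing = sum(1 for r, e in zip(recommended, executed) if r != e)
    return differing / len(recommended)


def needs_retraining(recommended, executed, threshold=0.3):
    """Return True when the difference rate exceeds the assumed threshold."""
    return difference_rate(recommended, executed) > threshold
```

A sandbox regulator could, under these assumptions, call `needs_retraining` on each reporting period's recommendations and executed actions and return the model for retraining whenever the result is true.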
Optimize capital allocation. Central finance focuses on core infrastructure such as national-level medical supercomputing centers and 5G medical private networks, while local matching funds support the digital upgrading of medical systems. The state can introduce relevant policies, establish special funds for new medical data infrastructure to support the deployment of intelligent analysis platforms, guide medical institutions to purchase specialized hardware through fiscal subsidies, and include software systems such as data middle platforms and blockchain nodes in government procurement catalogs. For example, project planning requires newly built tertiary Grade A hospitals to configure edge computing gateways and federated learning modules to achieve real-time desensitization and on-chain storage of diagnosis and treatment data[50], and incorporates healthcare data space construction indicators into the smart hospital rating system, compelling institutions to undertake intelligent transformation.
Innovative technology integration. Policies should promote the establishment of special zones for training large medical AI models, authorizing platforms such as DeepSeek to use desensitized medical data for reinforcement learning within these zones, while constructing a "sandbox supervision" mechanism to dynamically verify the clinical recommendations that the models output. Relevant laws and regulations should be revised to clearly authorize compliant institutions to use federated learning for cross-border research, and to reduce or waive cross-border data security assessment fees for data circulation that uses privacy computing technologies. Together, these measures form a virtuous cycle of "policy traction-technology driven-data value," making the healthcare data space a true digital foundation for Healthy China construction.
Talent cultivation pathways. Medical institutions are currently strengthening cooperation with universities and enterprises, for instance by jointly building laboratories or innovation centers. On this basis, enterprises and universities should be encouraged to actively explore a "dual-mentor system" talent cultivation model. At the same time, the digital skills of existing medical staff should be improved through vocational training[51], and a "data craftsman" cultivation mechanism should be established that covers cross-domain capabilities such as clinical business understanding, data cleaning, and algorithm optimization[52]. The government can also implement talent introduction policies, establish special funds for high-level overseas talent, and attract versatile professionals with multidisciplinary backgrounds.
Strengthen international cooperation. China can actively participate in formulating global healthcare data governance rules, promote the cross-border flow of medical data, establish mutual recognition mechanisms for that flow, and enhance the security and efficiency of data interaction. For example, it can sign data sharing agreements with countries along the "Belt and Road" to promote international collaborative research on medical data, while conducting full life-cycle health service research[53-54]. Such cooperation would improve global medical service levels, strengthen China's voice in the digital health field, and ultimately move the global healthcare system toward greater efficiency, fairness, and sustainability.
5 Discussion and Outlook
Currently, healthcare data space is accelerating its evolution toward intelligent collaboration and global interconnectivity, with its core driving force stemming from two-way breakthroughs in heterogeneous technology integration and institutional innovation. At the technical level, the deep coupling of federated learning and edge computing is building a distributed diagnosis and treatment network where "data is usable but invisible", supporting real-time desensitized data sharing for multi-center clinical research. Meanwhile, the collaborative application of blockchain and privacy computing can reduce dependence on single intermediaries, alleviating sovereignty disputes in cross-border data flow[55]. For example, the EU EHDS achieves property rights tracking and compliant invocation of medical data among member states through smart contracts, providing technical references for China's participation in global health governance.
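The "data is usable but invisible" principle can be illustrated with a minimal federated averaging sketch. It is a toy under stated assumptions, not a description of any deployed system: each site fits a one-variable linear model on its own records, only the model parameters are sent to a coordinator, and the function names, learning rate, and round count are all hypothetical. Desensitization, encryption, and edge deployment are outside its scope.

```python
def local_update(weights, data, lr=0.1):
    """One gradient-descent step of y = w*x + b on a site's local records."""
    w, b = weights
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return (w - lr * grad_w, b - lr * grad_b)


def federated_average(updates, sizes):
    """Weight each site's parameters by its sample count (FedAvg aggregation)."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)


def train(sites, rounds=200):
    """Coordinator loop: only parameters, never raw records, leave a site."""
    weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in sites]
        weights = federated_average(updates, [len(d) for d in sites])
    return weights
```

Here `sites` is a list in which each element holds one institution's `(x, y)` records. The coordinator sees only the averaged parameters, which is the sense in which the underlying data remains invisible while still contributing to the shared model.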
However, technological leaps still face dual challenges of lagging data standardization and computing power bottlenecks. Primary-level medical institutions suffer from chaotic data acquisition formats due to device heterogeneity, requiring reliance on a "cloud-edge-end" integrated architecture to achieve multimodal data fusion, and using AI-assisted annotation tools to improve structuring efficiency. At the policy level, top-level design needs to be strengthened: on one hand, clarify circulation boundaries for highly sensitive information such as genetic data and electronic medical records through legislation, and establish dynamic sandbox supervision mechanisms to balance innovation and risk; on the other hand, optimize "new infrastructure" capital allocation, tilting toward 5G medical private networks and edge computing node construction in central and western regions to narrow the regional digital divide.
In the future, healthcare data space will present three major trends: (1) Technology-clinical closed-loop iteration, where quantum encryption technology breaks through traditional computing power constraints, enabling cross-domain medical data to complete desensitization and feature extraction at the millisecond level and empowering real-time optimization of prediction models; (2) Policy-market collaborative evolution, where China can draw on other countries' data management models to activate social capital through public-private partnership mechanisms, directing it toward public welfare data application scenarios such as chronic disease management and rare disease research; (3) Global-regional governance nesting, relying on the "Belt and Road" medical data alliance to build an ecosystem for cross-border data mutual recognition and joint research and development. The "technology-policy" collaborative governance framework can promote the paradigm upgrade of medical data resources from production factors to global public goods. In doing so, it helps build a trustworthy, controllable, and sustainable digital foundation for a global community of health for all, boosts the innovative development of medical AI, supports the digital transformation of the medical industry, and raises global medical service levels.
References
[1] CONCATO J, CORRIGAN-CURAY J. Real-world evidence - where are we now? [J]. N Engl J Med, 2022, 386(18): 1680-1682. DOI: 10.1056/NEJMp2200089.
[2] STELLMACH C, MUZOORA M R, THUN S. Digitalization of health data: interoperability of the proposed European health data space [J]. Stud Health Technol Inform, 2022, 298: 132-136. DOI: 10.3233/shti220922.
[3] State Council Notice on Issuing the "14th Five-Year Plan for Digital Economy Development" (Guofa [2021] No. 29)[J]. China Military-to-Civilian, 2022(1): 6-12. DOI: 10.3969/j.issn.1008-5874.2022.01.002.
[4] National Health and Family Planning Commission. "National Basic Public Health Service Standards (Third Edition)"[EB/OL]. (2017-04-17)[2025-04-25]. http://www.nhc.gov.cn/ewebeditor/uploadfile/2017/04/20170417104506514.pdf.
[5] Zhang Di, Zhang Liwei. Digital Intelligence Information Ecosystem: Connotation, Composition and Mechanism[J]. Modern Intelligence, 2024, 44(4): 11-21.
[6] Wang Hongwei. The Emergence Logic, Realistic Dilemmas and Construction Path of Digital Health Community[J]. Chinese Health Service Management, 2023, 40(12): 881-884, 902.
[7] Huang Ruyi, Jing Qi. Digital Health in the Digital Era: Connotation, Characteristics, Challenges and Governance Paths[J]. Health Economics Research, 2022, 39(6): 60-63, 66. DOI: 10.14055/j.cnki.issn.0253-993X.2022.06.019.
[8] "2022-2027 China Digital Healthcare Industry Market Analysis and Investment Risk Trend Forecast Research Report"[EB/OL]. (2023-10-20)[2025-07-07]. https://www.qianzhan.com/analyst/detail/220/2310201636e358.html.
[9] National Development and Reform Commission Notice on Issuing the "14th Five-Year Plan for Bioeconomy Development"[EB/OL]. (2021-12-20)[2025-07-07]. https://www.gov.cn/zhengce/zhengceku/2022-05/10/content_5689556.htm.
[10] KPMG China "First Health Tech 50" Report[EB/OL]. (2025-07-02)[2025-07-07]. https://assets.kpmg.com/content/dam/kpmg/cn/pdf/zh/2025/07/kpmg-china-healthcare-health-tech-50.pdf.
[11] China Academy of Information and Communications Technology. "2020 Digital Healthcare: Research Report on Cybersecurity Risks During the Epidemic Prevention and Control Period"[EB/OL]. [2025-04-25]. http://www.caict.ac.cn/kxyj/qwfb/ztbg/202003/P020200316481943325476.pdf.
[12] Verizon. 2020 Data Breach Investigations Report[EB/OL]. (2020-06-20)[2025-04-21]. https://www.secrss.com/articles/20611.
[13] Que Tianshu, Wang Ziyue. Global Data Security Governance and China's Strategy in the Digital Economy Era[J]. International Security Studies, 2022, 40(1): 130-154, 158. DOI: 10.14093/j.cnki.cn10-1132/d.2022.01.006.
[14] Yang Zhengyun. Discussion on Internal Control of Public Hospitals Based on Risk Management[J]. Friends of Accounting, 2019(6): 137-140.
[15] Gao Fuping. On the Allocation of Medical Data Rights—Legal Framework for Medical Data Open Utilization[J]. Modern Law Science, 2020, 42(4): 52-68. DOI: 10.3969/j.issn.1001-2397.2020.04.04.
[16] Zhu Jianming, Zhang Qinnan, Gao Sheng, et al. A Blockchain-Based Privacy-Preserving Trusted Federated Learning Model[J]. Chinese Journal of Computers, 2021, 44(12): 2464-2484. DOI: 10.11897/SP.J.1016.2021.02464.
[17] JAFFE S. 21st century cures act progresses through US congress[J]. Lancet, 2015, 385(9983): 2137-2138. DOI: 10.1016/S0140-6736(15)61008-X.
[18] Zhao Lin, Qian Yuqiu, Zheng Han. EU Data Element Market Cultivation Policies, Practices and Models[J]. Library Tribune, 2024, 44(12): 151-160.
[19] GOFF R L, BRICE S, CONTINI A, et al. Successful linkage of electronic medical records and national health data system in type 2 diabetes research: methodological insights and implications[J]. Pharmacoepidemiol Drug Saf, 2025, 34(2): e70095. DOI: 10.1002/pds.70095.
[20] HUI K R, GILMORE C J, KHAN M. Medical records: more than the health insurance portability and accountability act[J]. J Acad Nutr Diet, 2021, 121(4): 770-772. DOI: 10.1016/j.jand.2020.06.022.
[21] BHATE C, HO C H, BRODELL R T. Time to revisit the Health Insurance Portability and Accountability Act (HIPAA)?[J]. J Am Acad Dermatol, 2020, 83(4): e313-314. DOI: 10.1016/j.jaad.2020.06.989.
[22] HOOFNAGLE C J, VAN DER SLOOT B, BORGESIUS F Z. The European Union general data protection regulation: what it is and what it means[J]. Inf Commun Technol Law, 2019, 28(1): 65-98. DOI: 10.1080/13600834.2019.1573501.
[23] Primary Health Department. "Guiding Opinions on Comprehensively Promoting the Construction of Compact County-level Medical Communities"[EB/OL]. (2023-12-30)[2025-04-20]. http://www.nhc.gov.cn/jws/s7874/202312/e5d16e73fa324533bcc8f75755844726.shtml.
[24] Central Committee of the Communist Party of China and State Council. "National Data Standards System Construction Guide"[EB/OL]. (2024-10-08)[2025-04-20]. https://www.gov.cn/zhengce/zhengceku/202410/P020241008789641651212.pdf.
[25] Yin Yuanyue, Yuan Lingyun, Chen Meihong. Research on Health Data Secure Sharing Mechanism Based on Blockchain Multi-Chain[J]. Network Security Technology and Application, 2025(2): 65-70.
[26] Central Committee of the Communist Party of China and State Council. "Healthy China 2030" Planning Outline[EB/OL]. (2016-10-25)[2025-04-20]. https://www.sport.gov.cn/gdnps/files/c25531211.pdf.
[27] Xiao Qingying, Yu Guangjun. Research and Progress on Medical Big Data[J]. Shanghai Medical Journal, 2023, 46(7): 420-423. DOI: 10.19842/j.cnki.issn.0253-993X.2023.07.012.
[28] Huang Yanli. Implementation Elements of Primary Healthcare Digital Transformation: Based on the Consolidated Framework for Implementation Research[J/OL]. Chinese General Practice, 2024: 1-10. (2024-08-29)[2025-07-07]. https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=CJFD&dbname=CJFD&filename=QKYX20240828003.
[29] He Gongshan, Zhao Chuanlei, Jiang Jinhu, et al. A Survey on Data Storage Technologies for Deep Learning[J]. Chinese Journal of Computers, 2025, 48(5): 1013-1064.
[30] RANCHON F, CHANOINE S, LAMBERT-LACROIX S, et al. Development of indirect health data linkage on health product use and care trajectories in France: systematic review[J]. J Med Internet Res, 2023, 25: e41048. DOI: 10.2196/41048.
[31] Feng Yangyang, Wang Qing, Xie Minhui, et al. From BERT to ChatGPT: Storage System Challenges and Technological Development in Large Model Training[J]. Journal of Computer Research and Development, 2024, 61(4): 809-823.
[32] GADOTTI A, ROCHER L, HOUSSIAU F, et al. Anonymization: The imperfect science of using data while preserving privacy[J]. Sci Adv, 2024, 10(29): eadn7053. DOI: 10.1126/sciadv.adn7053.
[33] SHAH S M, KHAN R A. Secondary use of electronic health record: opportunities and challenges[J]. IEEE Access, 2020, 8: 13644-13655.
[34] Mei Zihong, Liu Chanjuan. Analysis of Resource Allocation Efficiency of Primary Healthcare in China from 2012-2020[J]. Chinese Health Economics, 2022, 41(10): 54-58.
[35] GONG Y D, LIU G Z, XUE Y Z, et al. A survey on dataset quality in machine learning[J]. Inf Softw Technol, 2023, 162: 107268. DOI: 10.1016/j.infsof.2023.107268.
[36] Li Wanyu, Zhang Hanzhi, Jin Hua, et al. Study on Implementation Status of Health Management in Primary Healthcare Institutions Oriented by Active Health[J]. Chinese General Practice, 2024, 27(28): 3467-3473.
[37] Shen Xianlei, He Rongxin, Liang Wannian. Configuration and Path Study on Influencing Factors of Primary Healthcare Service Performance[J]. Chinese General Practice, 2025, 28(16): 1965-1973.
[38] Shen Huiwen, Ma Deyuan, Zhang Ce. Medical Big Data and Scientific Research Practice[J]. China Medical Education Technology, 2023, 37(3): 351-355. DOI: 10.13566/j.cnki.cmet.cn61-1317/g4.202303020.
[39] XU Y Y, JIANG Z H, TING D S W, et al. Medical education and physician training in the era of artificial intelligence[J]. Singapore Med J, 2024, 65(3): 159-166. DOI: 10.4103/singaporemedj.SMJ-2023-203.
[40] QIAN L M, CAO W R, CHEN L F. Influence of artificial intelligence on higher education reform and talent cultivation in the digital intelligence era[J]. Sci Rep, 2025, 15(1): 6047. DOI: 10.1038/s41598-025-89392-4.
[41] Zhang Chuhui, Li Zhuozhuo, Pei Lei. Interdisciplinary Integration and Multi-Scenario Embedding: A Review of Data Ethics Research at Home and Abroad[J]. Journal of Information Resources Management, 2025, 15(2): 91-107. DOI: 10.13365/j.jirm.2025.02.091.
[42] WANG L H, MENG L Y, LIU F K, et al. A user-centered medical data sharing scheme for privacy-preserving machine learning[J]. Secur Commun Netw, 2022, 2022: 3670107. DOI: 10.1155/2022/3670107.
[43] Xu Cheng, Gu Lang. Centralized Storage of Medical Big Data Information Security Based on Internet of Things[J]. Information Technology, 2023, 47(1): 109-114. DOI: 10.13274/j.cnki.hdzj.2023.01.020.
[44] Song Kai, He Liwen. Research on Blockchain Identity Privacy Scheme for Medical Big Data[J]. Software, 2023, 44(8): 150-152.
[45] Zeng Meng, Zou Beiji, Zhang Wensheng, et al. Optimization Method for Massive Small File Storage in Multi-Modal Medical Data[J]. Journal of Software, 2023, 34(3): 1451-1469. DOI: 10.13328/j.cnki.jos.006710.
[46] Xu Ruanxin, Zheng Wenxin, Lin Yu. Research on Application of Medical Big Data in Hospital Performance Management[J]. China New Telecommunications, 2023, 25(22): 71-73.
[47] Niu Li, Huo Zenghui. Research on Regulation of Medical Big Data Utilization Under the Background of Personal Information Protection Law[J]. Chinese Health Law, 2023, 31(4): 11-16. DOI: 10.19752/j.cnki.1004-6607.2023.04.002.
[48] Ge Yongbin, Dong Jianping. Compliance Requirements for Conducting Real-World Clinical Research Using Medical Big Data[J]. China Food and Drug Administration Magazine, 2023(10): 86-94.
[49] RAK R. Anonymisation, pseudonymisation and secure processing environments relating to the secondary use of electronic health data in the European health data space (EHDS)[J]. Eur J Risk Regul, 2024, 15(4): 928-938. DOI: 10.1017/err.2024.67.
[50] Tang Kai, Zhang Guoming, Chu Shengxiang. Application and Practice of Big Data Privacy Security Based on Data Desensitization Technology[J]. Chinese Journal of Health Informatics and Management, 2022, 19(3): 381-386.
[51] Deng Hui, Sun Hui. Justification and Institutional Construction of Fiduciary Duty of Personal Medical Health Data Processors[J]. Journal of East China University of Political Science and Law, 2025, 28(1): 76-90.
[52] Hu Bilian, Hu Haibo. Evolution Path and Review of Domestic and Foreign Research Themes on Medical Health Data[J]. Chinese Health Service Management, 2024, 41(12): 1434-1440.
[53] Meng Xiaowei, Li Yan, Tian Xia, et al. Research on Health Medical Data Assetization from the Perspective of Data Life Cycle[J]. Health Economics Research, 2025, 42(2): 28-31, 36. DOI: 10.14055/j.cnki.33-1056/f.2025.02.007.
[54] Du Qiujing, Yao Wenmo, Wang Dan, et al. Concept, Connotation and Research Progress of Life-Cycle Health Services[J]. West China Medical Journal, 2022, 37(12): 1909-1916.
[55] SHAHRIAR RAHMAN M, AL OMAR A, BHUIYAN M Z A, et al. Accountable cross-border data sharing using blockchain under relaxed trust assumption[J]. IEEE Trans Eng Manag, 2020, 67(4): 1476-1486. DOI: 10.1109/TEM.2019.2960829.
Author Contributions: Wang Hongchuan proposed the research question, constructed the corresponding theoretical framework, and conducted in-depth exploration and analysis of the topic; Zhang Jianbo was responsible for relevant data collation, extracting key themes and viewpoints, and writing the paper; Ma Wei proposed relevant policy recommendations, was responsible for final version revision, and is accountable for the paper; Zhao Sidi was responsible for collecting materials and data, and conducting analysis and proofreading.
Conflict of Interest: This article has no conflicts of interest.
ORCID IDs:
Wang Hongchuan https://orcid.org/0000-0001-8629-0982
Zhang Jianbo https://orcid.org/0009-0001-3422-3443