Post-print of Artificial Intelligence-Driven Pathomics in Breast Cancer Metastasis Research
Xiaoqing Fang, Qingjie Lyu
Submitted 2025-11-10 | ChinaXiv: chinaxiv-202511.00073 | Mixed source text

Abstract

Breast cancer remains one of the most significant global health challenges. Patient survival drops sharply once cancer cells spread to adjacent or distant tissues and organs and establish metastases. Consequently, early diagnosis and the prediction of malignant potential are critical for formulating effective treatment plans and improving the prognosis of breast cancer patients. Against the backdrop of rapid advances in artificial intelligence, the increasing maturity of pathomics technology has contributed significantly to the evolution of precision medicine.

By reviewing the relevant literature, this paper systematically explores the progress of pathomics across three key areas: the diagnosis of breast cancer metastasis, the prediction of metastasis, and the study of tumor heterogeneity. Artificial intelligence algorithms can automatically identify key structures within pathological sections, such as tumor cells, stromal components, and immune cells, significantly improving the efficiency and accuracy of pathomics feature extraction.

Furthermore, multimodal artificial intelligence models, constructed by integrating diverse information such as clinical data and genomic data, have further enhanced the performance of pathomics in diagnosing and predicting breast cancer metastasis, marking an important contribution to precision medicine. However, due to technical constraints and limitations such as data-sharing barriers arising from patient privacy protection, this field requires further exploration and data mining. This paper aims to provide a more instructive reference for the precision treatment of breast cancer patients and to promote the deep application and clinical integration of artificial intelligence and pathomics technology in the study of breast cancer metastasis.

Full Text

Preamble

Advances in Integrative Pathomics Enhanced by Artificial Intelligence for Breast Cancer Metastasis Research

Affiliation: Department of Pathology, Shengjing Hospital of China Medical University, Shenyang 110000, Liaoning Province, China.
Author: Lyu Qingjie, Chief Physician, Doctoral Supervisor.

Keywords: Breast cancer; Artificial intelligence; Machine learning; Pathomics; Metastasis

1. The Role of AI-Driven Pathomics in Diagnosis

The integration of machine learning and deep learning into pathomics has revolutionized the way pathologists analyze tissue samples. Traditional pathology relies on manual observation, which can be subjective and time-consuming. AI-driven pathomics allows for the high-throughput extraction of quantitative features from digital whole-slide images (WSIs). These features include cell morphology, nuclear pleomorphism, and spatial arrangements that are often imperceptible to the human eye.

In the context of breast cancer metastasis, AI models have demonstrated high sensitivity in detecting micrometastases in lymph nodes. By utilizing convolutional neural networks (CNNs), researchers can segment tumor regions with high precision, reducing the false-negative rates associated with manual screening.

2. Predicting Metastatic Potential and Patient Prognosis

Predicting whether a primary breast tumor will metastasize is essential for deciding between aggressive systemic therapy and more conservative management. Pathomics leverages the "radiomics" philosophy but applies it to microscopic tissue architecture. By analyzing the primary tumor's microenvironment—including the density of tumor-infiltrating lymphocytes (TILs) and the characteristics of the peritumoral stroma—AI models can predict the likelihood of future distant metastasis.

Recent studies have shown that pathomic signatures are strongly correlated with gene expression profiles. By combining these digital signatures with clinical variables, multimodal AI models provide a more comprehensive risk assessment than traditional staging systems alone.

3. Investigating Tumor Heterogeneity

Tumor heterogeneity is a primary driver of treatment resistance and metastatic progression. Breast cancer is not a single disease but a collection of diverse subtypes with varying biological behaviors. Pathomics allows for the mapping of spatial heterogeneity within a single tumor section. AI algorithms can quantify the diversity of cell populations and the spatial distribution of different clones, providing insights into how the tumor might evolve or respond to specific therapies.

4. Challenges and Future Directions

Despite the promising advancements, several challenges hinder the widespread clinical adoption of AI-driven pathomics:

  • Technical Constraints: The standardization of slide staining, scanning, and pre-processing remains a significant hurdle for model generalizability across different institutions.
  • Data Privacy and Sharing: Strict privacy regulations often prevent the sharing of large-scale pathological datasets, which are necessary for training robust deep learning models.
  • Interpretability: Many AI models, particularly deep learning architectures, operate as "black boxes," making it difficult for clinicians to understand the biological rationale behind a specific prediction.

Future research should focus on developing "explainable AI" (XAI) to bridge the gap between computational output and biological insight. Furthermore, the development of federated learning may offer a solution to data privacy issues, allowing models to be trained across multiple institutions without the need for raw data exchange.
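The aggregation step behind federated learning can be sketched as follows. This is a minimal illustration of federated averaging (FedAvg), chosen here as one common scheme; the source does not name a specific algorithm, and the function name, the two-hospital setup, and the parameter values are hypothetical.

```python
def federated_average(site_weights, site_sizes):
    """Combine locally trained parameter vectors into one global vector.

    Each site contributes in proportion to its number of training slides.
    Only parameters are exchanged; the slides themselves never leave a site.
    """
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(site_weights, site_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical example: two hospitals with 100 and 300 slides respectively.
hospital_a = [1.0, 2.0]
hospital_b = [3.0, 4.0]
global_weights = federated_average([hospital_a, hospital_b], [100, 300])
print(global_weights)
```

In practice this averaging would be repeated over many rounds, with each site resuming local training from the updated global parameters.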

Conclusion

AI-driven pathomics represents a transformative frontier in breast cancer research. By providing objective, quantitative, and spatially resolved data, it enhances our ability to diagnose metastasis, predict patient outcomes, and understand the complexities of tumor heterogeneity. Continued interdisciplinary collaboration between pathologists, oncologists, and computer scientists will be essential to translate these technological advances into routine clinical practice, ultimately achieving the goal of precision medicine for breast cancer patients.

Abstract

Breast cancer remains one of the most prevalent and life-threatening malignancies worldwide. The prognosis of patients markedly worsens once cancer cells metastasize to regional or distant sites. Thus, early detection and accurate assessment of metastatic potential are critical for optimizing treatment strategies and improving clinical outcomes. With the rapid evolution of artificial intelligence (AI), digital pathology has emerged as a powerful tool, significantly advancing the field of precision oncology. This review provides a comprehensive overview of recent advancements in the application of digital pathology in breast cancer metastasis, focusing on three major domains: diagnostic accuracy, predictive modeling, and tumor heterogeneity analysis. AI-driven algorithms enable automated, high-throughput recognition of key histopathological components, such as tumor cells, stromal architecture, and immune infiltrates, substantially enhancing the efficiency and reproducibility of feature extraction. Furthermore, integrating pathological imaging with multimodal data sources, including clinical parameters and genomic profiles, through advanced AI models has been shown to improve performance in metastatic risk stratification and outcome prediction. Despite these promising developments, challenges such as technical constraints and limitations in data sharing due to patient privacy concerns continue to hinder broader clinical translation. This review aims to provide a valuable reference for the development of personalized therapeutic approaches and to promote the integration of AI-assisted digital pathology in the management of metastatic breast cancer.

Keywords: Breast cancer; Artificial intelligence; Machine learning; Pathomics; Metastasis

How to cite this article:
Fang Xiaoqing, Lyu Qingjie. Application of artificial intelligence-driven pathomics in the study of breast cancer metastasis [J]. Chinese General Practice, 2025. DOI: 10.12114/j.issn.1007-9572.2024.0534.

Chinese General Practice, 2025. [Epub ahead of print] Editorial Office of Chinese General Practice. This is an open access article under the CC BY-NC-ND 4.0 license.

Introduction

According to the 2022 global cancer statistics, new cases of breast cancer (BC) reached 2.309 million, with related deaths exceeding 665,000 \cite{1}. The molecular heterogeneity of metastatic lesions not only complicates the selection of treatment regimens but also significantly raises diagnostic and therapeutic costs, highlighting an urgent need for more precise assessment tools \cite{2}. Histopathological (HP) examination remains the gold standard for diagnosing metastasis, relying on morphological observation of hematoxylin and eosin (HE) staining and immunohistochemical (IHC) marker analysis \cite{3}. However, this approach faces technical bottlenecks, including long turnaround times, high costs, and the dynamic loss of tumor markers, which limit the ability to trace the origin of metastatic lesions and evaluate therapeutic efficacy.

The innovation of digital pathology has provided a breakthrough opportunity for this field. Whole-slide imaging (WSI) technology can completely preserve histomorphological details, propelling pathological diagnosis into a new era of quantitative analysis \cite{4-5}. Against this backdrop, the integration of pathomics and artificial intelligence (AI) has significantly advanced the development of precision medicine \cite{6}. By utilizing high-throughput feature extraction algorithms, this technology captures quantitative indicators—such as morphological topological structures, cellular spatial distribution, and microenvironmental heterogeneity—from multimodal pathological images including HE staining and IHC. Combined with AI to analyze tumor evolution patterns, it provides a new paradigm for deciphering metastatic mechanisms and personalizing treatment.

This paper systematically reviews the latest progress of pathomics technology in BC metastasis research. Following the progression from traditional pathological assessment to the construction of pathomic feature systems and multi-omic applications, we focus on the application strategies of AI in scenarios such as diagnosis, prediction, and tumor heterogeneity research. Our objective is to provide a theoretical basis for optimizing the precision diagnosis and treatment pathways for breast cancer.

1 Literature Search Strategy

A comprehensive search was conducted across the PubMed, China National Knowledge Infrastructure (CNKI), and Web of Science databases. The search period spanned from the inception of each database to May 2025. The search strategy utilized a combination of keywords and terms, including: "breast cancer," "digital pathology," "computational pathology," "histology," "histopathology," "whole slide image," "metastasis," "pathomics," "artificial intelligence," "machine learning," and "deep learning."

The inclusion criteria for this study were defined as literature focusing on the application of pathomics in the diagnosis and prediction of breast cancer metastasis, as well as studies addressing the heterogeneity of metastatic lesions. Conversely, the exclusion criteria consisted of literature unrelated to the central theme of this review and studies for which the full text was unavailable.

2 AI Application Strategies in Metastatic Pathomics Research

AI is a branch of computer science encompassing research in fields such as robotics, speech recognition, image recognition, natural language processing, and expert systems \cite{7}. Machine learning (ML), a core component of AI, is commonly divided into supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning \cite{8}. Supervised learning is currently the most widely applied in BC metastasis research \cite{9}. In supervised learning, datasets include target labels, so a model can be trained on features extracted directly from pathological images and then used to predict the labels or values of unseen data. However, training a supervised model requires every input pathological image to be annotated by human experts, which incurs significant labor costs.

Semi-supervised and unsupervised learning methods offer effective ways to reduce this annotation burden. In unsupervised learning, the input data have no corresponding labels. These methods optimize models by calculating similarities between training samples and clustering unlabeled raw data according to latent structures or features. Unsupervised learning can also uncover underlying sample distribution patterns, thereby revealing new metastatic features for quantifying tumor heterogeneity and discovering subtypes. Where annotated data are scarce, semi-supervised learning trains on a small amount of labeled data combined with a large volume of unlabeled data. This approach significantly reduces labor costs while improving the model's generalization.
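The clustering idea described above can be illustrated with a minimal k-means sketch. K-means is used here only as a familiar example of similarity-based clustering; the source does not specify an algorithm, and the two-dimensional patch features (imagined as nuclear area and texture contrast) are hypothetical values.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Group unlabeled feature vectors into k clusters by similarity."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical patch features with two visibly distinct groups.
patch_features = [
    [1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
    [5.0, 5.2], [5.1, 4.9], [4.9, 5.0],
]
labels, centroids = kmeans(patch_features, k=2)
print(labels)
```

No labels are supplied at any point; the grouping emerges from the distances alone, which is what allows such methods to suggest candidate subtypes without expert annotation.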

In the process of constructing BC metastasis prediction models using machine learning, the pathomics feature framework typically involves several key steps. First, regions of interest (ROI) are extracted from whole-slide images (WSI). These undergo preprocessing and segmentation to identify cellular and tissue structures. Subsequently, multi-dimensional features—including morphological, textural, spatial distribution, and deep learning-based features—are extracted. Following this, feature selection and dimensionality reduction are performed to construct a unified feature matrix for model training and clinical tasks. This framework integrates image analysis with AI, aiming to achieve precise quantification of pathological images and provide auxiliary decision support.
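The later steps of this framework (feature selection followed by assembly of a unified feature matrix) can be sketched as below. This is an illustrative outline under stated assumptions: the feature names and values are hypothetical, a simple variance filter stands in for feature selection, and z-score standardization stands in for the normalization a real pipeline would apply before model training.

```python
from statistics import mean, pstdev


def select_by_variance(matrix, threshold):
    """Return indices of feature columns whose variance exceeds the threshold."""
    columns = list(zip(*matrix))
    return [j for j, col in enumerate(columns) if pstdev(col) ** 2 > threshold]


def standardize(matrix):
    """Z-score each column so that features share a common scale."""
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in matrix
    ]


def build_feature_matrix(roi_features, feature_names, threshold=1e-6):
    """Drop uninformative features, then standardize the remainder."""
    kept = select_by_variance(roi_features, threshold)
    reduced = [[row[j] for j in kept] for row in roi_features]
    return standardize(reduced), [feature_names[j] for j in kept]


# Hypothetical per-ROI features: morphology, texture, a constant column, spatial density.
names = ["nuclear_area", "texture_contrast", "scanner_id", "til_density"]
rois = [
    [52.0, 0.31, 1.0, 0.10],
    [61.0, 0.45, 1.0, 0.25],
    [48.0, 0.28, 1.0, 0.05],
    [70.0, 0.52, 1.0, 0.40],
]
matrix, kept_names = build_feature_matrix(rois, names)
print(kept_names)
```

The constant `scanner_id` column carries no discriminative information and is removed, while the remaining columns are rescaled so that no single feature dominates purely because of its units.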

The primary stage of pathomics research involves extracting metastasis-related features from BC tissue section images. Compared to traditional techniques that rely on handcrafted features (where features useful for diagnosis are manually selected and designed), deep learning can automatically learn and extract useful representations from large volumes of high-quality data. Consequently, deep learning is more precise and efficient when characterizing data to construct predictive models for metastatic research.

3 Applications of Pathomics in Metastatic Diagnosis

Lymph node metastasis and distant metastasis of BC are critical factors leading to poor prognosis and increased mortality; they are the most significant predictors of overall recurrence and survival in BC patients \cite{10}. In recent years, as AI algorithms have developed and attention to BC has grown, pathomics has been increasingly applied to diagnostic models of BC metastasis. To facilitate a comparative analysis of these models, their publication years, modeling methods, primary objectives, and respective strengths and weaknesses are summarized in [TABLE:1].

3.1 Classification of Metastases of Unknown Primary Origin

In the diagnosis of BC metastasis, histopathological (HP) evaluation has long relied on morphological observation of hematoxylin and eosin (HE) staining and immunohistochemical (IHC) marker analysis \cite{11}. However, its clinical application faces multiple challenges. First, the testing process involves high economic and time costs, with an average turnaround time of $\geq 7$ working days. Second, dynamic loss of markers caused by tumor heterogeneity—such as the downregulation of estrogen receptor (ER) or progesterone receptor (PR) expression after treatment—can make tracing the origin of metastases difficult \cite{12}. Epidemiological studies show that approximately 2% to 5% of metastatic malignancies have an unclear primary site; the prognosis for such patients deteriorates significantly because they cannot be matched with organ-specific therapies \cite{13}.

To address the complex challenges of diagnosing the source of BC metastasis, AI technology has demonstrated significant potential through multimodal data fusion and deep learning models. Chen \cite{14} developed a hybrid model that combines handcrafted features (nuclear morphology and texture quantification) with a Dense Convolutional Network (DenseNet) classifier. The model was applied to differentiate liver metastases originating from four primary sites (colon, esophageal, breast, and pancreatic cancer) and achieved strong classification performance; the area under the curve (AUC) for classifying metastatic colon tumors reached 0.94. The model also offered an interpretable route into tumor heterogeneity research by using heatmaps to localize morphological similarities between primary and metastatic regions.

At the WSI level, Lu et al. \cite{15} developed the Tumor Origin Assessment via Deep Learning (TOAD) model. Trained on over 30,000 HE slides, the model achieved a Top-3 accuracy of 95.5% in predicting the primary site of cases with a known origin within the training set, supporting the generalizability of deep learning in metastasis tracing. Cytological diagnosis of BC pleural metastasis, which has an incidence of 15% to 30% \cite{16}, is a further difficulty. Park et al. \cite{17} constructed an AI model based on 596 cytological WSIs that achieved an accuracy of 81.13% in patch classification tasks, surpassing the average performance of pathologists (72.49%). The model offers a new way to improve the diagnostic sensitivity for malignant pleural effusion, which is only 58% with traditional methods \cite{18}, though external validation is still required.
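For readers unfamiliar with the Top-3 metric reported above, the following minimal sketch shows how top-k accuracy is computed: a case counts as correct when the true primary site appears among the k highest-scoring candidates. The site names and probabilities are hypothetical and are not taken from the TOAD study.

```python
def top_k_accuracy(score_rows, true_labels, k=3):
    """Fraction of cases whose true label is among the k highest-scoring classes."""
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        ranked = sorted(scores, key=scores.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical predicted probabilities over four candidate primary sites.
predictions = [
    {"breast": 0.50, "lung": 0.30, "colon": 0.15, "liver": 0.05},
    {"breast": 0.10, "lung": 0.20, "colon": 0.30, "liver": 0.40},
    {"breast": 0.25, "lung": 0.40, "colon": 0.30, "liver": 0.05},
    {"breast": 0.05, "lung": 0.60, "colon": 0.25, "liver": 0.10},
]
truth = ["breast", "breast", "breast", "colon"]

top1 = top_k_accuracy(predictions, truth, k=1)
top3 = top_k_accuracy(predictions, truth, k=3)
print(top1, top3)
```

Top-k accuracy is always at least as high as top-1 accuracy, which is worth keeping in mind when comparing figures across studies that report different values of k.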

Tian et al. \cite{19} constructed the TORCH (Tumor Origin Differentiation using Cytology and Histology) model based on 57,220 cytological images of pleural effusion and ascites. This model achieved systematic localization for cancers of unknown primary (CUP), with a first-choice accuracy of 82.6%. Its diagnostic performance ($\text{AUC} = 0.969$) significantly outperformed that of pathologists ($\text{AUC} = 0.813$). Furthermore, model-assisted treatment extended the median survival of patients by 10 months, establishing a new paradigm for CUP diagnosis and treatment. These studies indicate that by integrating quantitative features of pathological images with clinical data, AI is gradually overcoming the spatial, temporal, and precision limitations of traditional diagnosis, driving BC metastasis diagnosis and treatment toward precision medicine \cite{20}.
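Because AUC is the headline metric in most of the studies cited here, a short sketch of its meaning may help. The function below computes AUC through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores; label 1 marks a metastatic case.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the gap between 0.813 and 0.969 reported above reflects a substantial difference in ranking quality.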

3.2 Automated Identification and Diagnosis of Metastasis

HP imaging is the cornerstone of BC metastasis diagnosis and staging. By analyzing the morphological characteristics of lymph nodes and distant metastases, it provides a critical basis for accurate staging and treatment decisions. Early detection of metastasis can significantly improve patient survival (increasing the 5-year survival rate by 20% to 30% \cite{21}). However, traditional HP techniques are limited by scanner heterogeneity, staining fluctuations, and variations in pathologist experience, which leads to a missed diagnosis rate for micrometastases ($< 2 \text{ mm}$) as high as 42%. Therefore, predictive models developed using AI technology are vital for improving the diagnostic efficiency of pathologists.

Islam et al. \cite{22} constructed a computer-aided diagnostic tool based on Convolutional Neural Networks (CNNs) that achieved dual-mode recognition of metastatic tissue (maximum accuracy of 95%) and invasive ductal carcinoma (maximum accuracy of 94%) in breast histopathology images. One of the core tasks of the CAMELYON16 challenge was the automated detection of sentinel lymph node (SLN) metastasis based on WSIs. Among the 32 algorithms developed, seven outperformed a diagnostic panel of 11 pathologists (reaching a maximum sensitivity of 100%), showing a significant advantage particularly in identifying micrometastases ($< 0.2 \text{ mm}$) \cite{23}.

The trade-off between diagnostic accuracy and efficiency remains a key issue in localizing metastases within WSIs. A new two-stage magnification framework uses low magnification ($5\times$) to rapidly locate suspicious coordinates (sensitivity of 95%) and high magnification ($40\times$) to accurately identify metastases (specificity of 89%) \cite{24}. The inference speed reached 0.3 seconds per slide, a seven-fold improvement over traditional methods. Retamero et al. \cite{25} employed a model trained with Multiple Instance Learning (MIL) to locate and label the areas most likely to harbor hidden metastases, improving the reading efficiency of pathologists by 55% and increasing diagnostic sensitivity from 74.5% to 93.5%. Khaliliboroujeni \cite{26} designed a spatial-channel attention model that dynamically focuses on metastasis-related morphological features. Using only a small amount of low-resolution annotated data to detect BC metastasis, the model analyzed WSIs seven times faster than a pathologist. Yu et al. \cite{27} extended the MIL framework to develop a weakly supervised model for detecting micrometastases, achieving a maximum AUC of 0.958.
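The coarse-to-fine idea behind the two-stage magnification framework can be sketched as follows. This is a schematic illustration, not the cited implementation: the tile coordinates, the lookup tables standing in for the low- and high-magnification models, and both thresholds are hypothetical.

```python
def two_stage_screen(tiles, coarse_model, fine_model,
                     coarse_threshold=0.3, fine_threshold=0.5):
    """Screen tiles cheaply first, then confirm only the suspicious ones."""
    # Stage 1: a permissive threshold at low magnification keeps sensitivity high.
    candidates = [t for t in tiles if coarse_model(t) >= coarse_threshold]
    # Stage 2: the expensive high-magnification model runs on candidates only.
    confirmed = [t for t in candidates if fine_model(t) >= fine_threshold]
    return candidates, confirmed


# Hypothetical scores standing in for real model outputs, keyed by tile coordinate.
LOW_MAG_SCORES = {(0, 0): 0.05, (0, 1): 0.45, (1, 0): 0.80, (1, 1): 0.35}
HIGH_MAG_SCORES = {(0, 1): 0.20, (1, 0): 0.93, (1, 1): 0.71}
fine_calls = []


def coarse_model(tile):
    return LOW_MAG_SCORES[tile]


def fine_model(tile):
    fine_calls.append(tile)  # Track how often the expensive model is invoked.
    return HIGH_MAG_SCORES[tile]


candidates, confirmed = two_stage_screen(list(LOW_MAG_SCORES), coarse_model, fine_model)
print(candidates, confirmed, len(fine_calls))
```

The speed gain comes from the fact that the high-magnification model never sees tiles rejected in the first stage; on a real slide, where most tissue is benign, that fraction is typically far larger than in this toy example.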

In clinical practice, the pathological diagnosis of lymph node metastasis is often hindered by the poor quality of frozen sections. Kim et al. \cite{28} established an AI diagnostic model for SLN frozen sections, in which the best-performing model achieved an AUC of 0.891. Chang et al. \cite{29} used a multi-scale feature fusion algorithm to raise the sensitivity of micrometastasis ($0.2\text{--}2.0 \text{ mm}$) detection from 67% to 85%, further mitigating the limitations of traditional frozen section diagnosis.

4 Applications of Pathomics in Recurrence and Metastasis Prediction

Breast cancer (BC) recurrence refers to the reappearance of the disease following initial treatment and a period of remission. Recurrence is further categorized into local and metastatic forms. Local recurrence occurs when the primary tumor spreads to adjacent breast tissue or regional lymph nodes. In contrast, metastatic recurrence involves the migration of tumor cells from the primary site to distant organs, such as the bones, lungs, liver, or brain \cite{30}. In recent years, an increasing number of studies have focused on developing prognostic models for BC recurrence and metastasis, as detailed in [TABLE:2].

4.1 Predicting Lymph Node Metastasis

Lymph node metastasis is not merely a localized spreading event; it serves as a "sentinel signal" for systemic metastasis. Clinical data indicate that patients with positive lymph nodes face a risk of distant metastasis within five years that is 3.2 times higher than that of the node-negative group ($95\% \text{ CI} = 2.8\text{--}3.7$). Furthermore, the pathomics features of metastatic lesions—such as a mitotic index $>10$ per high-power field and a necrotic area proportion $>30\%$—can be utilized to further refine prognostic stratification.
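The source does not state whether the 3.2-fold figure is a relative risk or a hazard ratio. Assuming it is read as a relative risk, the sketch below shows how such an estimate and its 95% confidence interval are conventionally derived from a two-by-two table, using the standard error of the log relative risk. The patient counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def relative_risk(events_exposed, n_exposed, events_control, n_control, z=1.96):
    """Relative risk with a log-scale (Katz) confidence interval."""
    risk_exposed = events_exposed / n_exposed
    risk_control = events_control / n_control
    rr = risk_exposed / risk_control
    # Standard error of ln(RR) from the four cell counts.
    se_log = math.sqrt(
        1 / events_exposed - 1 / n_exposed + 1 / events_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical cohort: 40 of 100 node-positive and 10 of 100 node-negative
# patients develop distant metastasis within five years.
rr, lower, upper = relative_risk(40, 100, 10, 100)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```

The interval is symmetric on the logarithmic scale rather than the linear one, which is why a reported interval such as 2.8 to 3.7 sits slightly off-center around 3.2.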

By utilizing full WSIs from 1,058 patients, researchers employed supervised learning methods to develop the first predictive model for SLN metastasis, achieving a high accuracy of 0.831 \cite{31}. Additionally, Yu et al. \cite{27} developed a Prototypical Multi-Instance Learning (PMIL) framework. By analyzing WSIs through a weak supervision strategy, this model achieved an AUC of 0.984 for predicting BC lymph node metastasis. Notably, the sensitivity for detecting micrometastases ($<0.2\text{ mm}$) reached 92%, representing a three-fold increase in efficiency compared to traditional manual annotation.
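The weak supervision strategy mentioned above rests on multiple instance learning: a slide is treated as a bag of patches, and only the slide-level label is known. The sketch below illustrates two generic ways of pooling patch-level scores into a slide-level prediction. It does not reproduce the prototype mechanism of PMIL, and the patch probabilities and attention logits are hypothetical.

```python
import math


def max_pool(instance_probs):
    """A slide is positive if its most suspicious patch is positive."""
    return max(instance_probs)


def attention_pool(instance_probs, attention_logits):
    """Weight each patch by a softmax over its attention logit."""
    peak = max(attention_logits)  # Subtract the maximum for numerical stability.
    weights = [math.exp(a - peak) for a in attention_logits]
    total = sum(weights)
    return sum(w / total * p for w, p in zip(weights, instance_probs))


# Hypothetical patch-level metastasis probabilities for one lymph node slide.
patch_probs = [0.10, 0.05, 0.92, 0.08]
# Hypothetical attention logits; in a trained model these would be learned.
attention = [0.0, 0.0, 3.0, 0.0]

slide_max = max_pool(patch_probs)
slide_attention = attention_pool(patch_probs, attention)
print(slide_max, round(slide_attention, 3))
```

Either pooling rule lets the training signal flow from a single slide-level label to the patches that drive the prediction, which is what removes the need for pixel-level annotation.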

Park et al. \cite{32} utilized an innovative multimodal unsupervised learning approach to achieve preoperative prediction of axillary lymph node metastasis. By identifying histopathological imaging patterns associated with metastatic status—such as micropapillary growth, invasion patterns, and necrosis—their method reached a maximum AUC of 0.801 (95% CI: 0.728–0.873) across five external validation cohorts.

4.2 Predicting Distant Metastasis

BC metastatic recurrence is a multi-step dynamic process involving several key mechanisms, including the enhancement of tumor cell invasion and migration, the remodeling of the immune microenvironment, and the penetration of selective barriers. The inherent complexity of BC recurrence and metastasis has driven the rapid development of multimodal predictive models.

Yao et al. \cite{33} used data from 198 patients in the TCGA database to construct a deep feature fusion model. By employing weak supervision to integrate pathological image features, clinical staging, and gene mutation profiles (e.g., TP53, PIK3CA), the model achieved an AUC of 0.82 for predicting metastatic risk. Similarly, Yang \cite{34} used a CNN to extract texture features from HE images and combined them with lymph node metastasis status in a prognostic model that achieved a sensitivity of 67%, a specificity of 83%, and an AUC of 0.72. Liu et al. \cite{35} developed a multi-scale risk assessment framework by fusing low-order color features with Wavelet Multi-subband Co-occurrence Matrices (WMCM); the model performed consistently, with an AUC of 0.75 in the internal validation set and 0.72 in the external validation set.
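One simple form of multimodal integration is feature-level fusion, in which image-derived features, clinical stage, and mutation status are concatenated into a single vector before scoring. The sketch below illustrates that idea only; it is not the architecture of any cited study, and the stage categories, feature values, weights, and bias are hypothetical rather than fitted to data.

```python
import math

STAGES = ("I", "II", "III")
GENES = ("TP53", "PIK3CA")


def fuse_features(image_features, clinical_stage, mutated_genes):
    """Concatenate image features with encoded clinical and genomic variables."""
    stage_one_hot = [1.0 if clinical_stage == s else 0.0 for s in STAGES]
    mutation_flags = [1.0 if g in mutated_genes else 0.0 for g in GENES]
    return list(image_features) + stage_one_hot + mutation_flags


def logistic_risk(features, weights, bias):
    """Map a fused feature vector to a probability with a logistic function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical patient: two image features, stage II, TP53 mutated.
fused = fuse_features([0.2, 0.7], "II", {"TP53"})
# Hypothetical weights; a real model would learn these from training data.
weights = [0.5, 1.0, -0.5, 0.2, 0.8, 0.6, 0.3]
risk = logistic_risk(fused, weights, bias=-1.0)
print(fused, round(risk, 3))
```

Concatenation is the most direct fusion strategy; deep models typically replace the fixed weights with learned layers, but the principle of bringing all modalities into one representation is the same.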

Research has demonstrated that the tumor-infiltrating lymphocyte (TIL) response is positively correlated with BC prognosis \cite{36-37}. Verghese et al. \cite{38} used high-resolution WSIs to identify TILs and found that TIL density is an independent prognostic factor for BC. Basaad et al. \cite{39} took a different route, applying a BERT model and Graph Neural Networks (GNN) to parse the text of histopathology reports directly. Their BERT-GNN Method for Breast Cancer (BG-MBC) model predicted the risk of distant metastasis with an AUC of 0.98.

5 Quantifying Tumor Heterogeneity and Subtype Discovery

The complexity of BC metastasis stems from its high degree of heterogeneity. By integrating multi-omics data with pathomics features, AI systematically analyzes intratumoral molecular and morphological diversity \cite{40}. Genomic research has revealed the five classic molecular subtypes of BC (Luminal A, Luminal B, HER2-enriched, Basal-like, and Normal-like) along with their underlying molecular mechanisms \cite{41}. Notably, the molecular subtypes of primary and metastatic lesions may undergo dynamic evolution \cite{42}.

Yu et al. \cite{43} defined four immunometabolic subtypes: immune-active, immune-excluded, immune-dysfunctional, and immune-desert. A deep learning model built on HE sections achieved an AUC of 0.92 for predicting these subtypes. AI models can automatically quantify a range of spatial structural features from HE images, including the distribution of tumor nuclear density, atypia scores, and the spatial density of TILs \cite{44}. Zhang et al. \cite{45} found that a higher proportion of tumor cells in WSIs was closely associated with an increased risk of BC recurrence.
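Two of the quantities discussed in this section, the diversity of cell populations and the density of TILs, can be computed from per-cell classifications with very little code. The sketch below uses the Shannon index as one common diversity measure, which is an assumption on our part rather than a method named in the source; the cell labels and region area are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(cell_types):
    """Shannon index of the cell-type composition (0 means a single type)."""
    counts = Counter(cell_types)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def til_density(cell_types, region_area_mm2):
    """Number of lymphocytes per square millimeter of tissue."""
    return sum(1 for c in cell_types if c == "lymphocyte") / region_area_mm2


# Hypothetical per-cell labels from one tumor region of 0.5 mm^2.
region = ["tumor", "lymphocyte", "lymphocyte", "stroma"]
diversity = shannon_diversity(region)
density = til_density(region, region_area_mm2=0.5)
print(round(diversity, 3), density)
```

Computing these values tile by tile across a slide yields a spatial map, so that regional differences in composition become directly comparable.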

6 Summary and Outlook

Existing machine learning models have shown considerable promise in diagnosing BC metastasis and predicting its malignant potential. However, several limitations persist: (1) Research design limitations: Most models are based on single-center, small-sample retrospective data. (2) Unresolved scientific questions: Issues such as organ-specific metastasis prediction have not been fully elucidated. (3) Data quality constraints: Supervised models rely on high-quality labels, which are scarce. (4) Model interpretability: Most AI models still exhibit "black box" characteristics.

The transition from "functional" to "practical" requires the construction of a closed loop for clinical translation. Future research should focus on algorithmic innovation in areas such as few-shot learning, semi-supervised learning, and multimodal data fusion. Furthermore, improving model interpretability will be key to achieving clinical acceptance. In summary, building a data foundation characterized by multicenter, large-sample, and multimodal synergy will drive the management of BC toward an intelligent closed loop of individualized treatment.

Author Contributions: Fang Xiaoqing and Lyu Qingjie performed the conception and design of the article; Fang Xiaoqing conducted the literature collection and organization and drafted the manuscript; Lyu Qingjie performed the feasibility analysis, revised the manuscript, and was responsible for quality control, proofreading, overall supervision, and management of the article.

The authors declare no conflicts of interest.

References

[1] BRAY F, LAVERSANNE M, SUNG H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries [J]. CA Cancer J Clin, 2024, 74(3): 229-263. DOI: 10.3322/caac.21834.

[2] PARK M, KIM D, KO S, et al. Breast cancer metastasis: mechanisms and therapeutic implications [J]. Int J Mol Sci, 2022, 23(12): 6806. DOI: 10.3390/ijms23126806.

[3] DING Q Q, HUO L, PENG Y, et al. Immunohistochemical markers for distinguishing metastatic breast carcinoma from other common malignancies: update and revisit [J]. Semin Diagn Pathol, 2022.

[4] HANNA M G, PARWANI A, SIRINTRAPUN S J. Whole slide imaging: technology and applications [J]. Adv Anat Pathol, 2020, 27(4): 251-259. DOI: 10.1097/PAP.0000000000000273.

[5] MELO R C N, RAAS M W D, PALAZZI C, et al. Whole slide imaging and its applications to histopathological studies of liver disorders [J]. Front Med, 2020, 6: 310. DOI: 10.3389/fmed.2019.00310.

[6] WAQAS A, BUI M M, GLASSY E F, et al. Revolutionizing digital pathology with the power of generative artificial intelligence and foundation models [J]. Lab Investig, 2023, 103(11): 100255.

[7] DEO R C. Machine learning in medicine [J]. Circulation, 2015, 132(20): 1920-1930. DOI: 10.1161/circulationaha.115.001593.

[8] PAIXÃO M, et al. Machine learning in medicine: review and applicability [J]. Arq Bras Cardiol, 2022, 118(1): 95-102. DOI: 10.36660/abc.20200596.

[9] BHALLA K, XIAO Q, LUNA J M, et al. Radiologic imaging biomarkers in triple-negative breast cancer: a literature review about the role of artificial intelligence and the way forward [J]. BJR Artif Intell, 2024, 1(1): ubae016. DOI: 10.1093/bjrai/ubae016.

[10] CHANG J M, LEUNG J W T, MOY L, et al. Axillary nodal evaluation in breast cancer: state of the art [J]. Radiology, 2020, 295(3): 500-515. DOI: 10.1148/radiol.2020192534.

[11] GOWN A M, FULTON R S, KANDALAFT P L. Markers of metastatic carcinoma of breast origin [J]. Histopathology, 2016, 68(1): 86-95. DOI: 10.1111/his.12877.

[12] BORCH W R, MONACO S E. Current approach to undifferentiated neoplasms, with focus on new developments and novel immunohistochemical stains [J]. Arch Pathol Lab Med, 2023, 147(12): 1364-1373. DOI: 10.5858/arpa.2022-0459-RA.

[13] LEE M S, SANOFF H K. Cancer of unknown primary [J]. BMJ, 2020, 371: m4050. DOI: 10.1136/bmj.m4050.

[14] CHEN C H, LU C, VISWANATHAN V, et al. Identifying primary tumor site of origin for liver metastases via a combination of handcrafted and deep learning features [J]. J Pathol Clin Res, 2024, 10(1): e344. DOI: 10.1002/cjp2.344.

[15] LU M Y, CHEN T Y, WILLIAMSON D F K, et al. AI-based pathology predicts origins for cancers of unknown primary [J]. Nature, 2021, 594(7861): 106-110. DOI: 10.1038/s41586-021-03512-4.

[16] CHEN M T, SUN H F, ZHAO Y, et al. Comparison of patterns and prognosis among distant metastatic breast cancer patients by age groups: a SEER population-based analysis [J]. Sci Rep, 2017, 7: 9254. DOI: 10.1038/s41598-017-10166-8.

[17] PARK H S, CHONG Y, LEE Y J, et al. Deep learning-based computational cytopathologic diagnosis of metastatic breast carcinoma in pleural fluid [J]. Cells, 2023, 12(14): 1847.

[18] KASSIRIAN S, HINTON S N, CUNINGHAME S, et al. Diagnostic sensitivity of pleural fluid cytology in malignant pleural effusions: systematic review and meta-analysis [J]. Thorax, 2023, 78(1): 32-40. DOI: 10.1136/thoraxjnl-2021-217959.

[19] TIAN F, LIU D, WEI N, et al. Prediction of tumor origin in cancers of unknown primary origin with cytology-based deep learning [J]. Nat Med, 2024, 30(5): 1309-1319. DOI: 10.1038/s41591-024-02915-w.

[20] ALI S, LI J Q, PEI Y, et al. State-of-the-art challenges and perspectives in multi-organ cancer diagnosis via deep learning-based methods [J]. Cancers, 2021, 13(21): 5546. DOI: 10.3390/cancers13215546.

[21] HEGDE S, SHETTY S, RAI S, et al. A survey on machine learning approaches for automatic detection of voice disorders [J]. J Voice, 2019, 33(6): 947.e11-947.e33. DOI: 10.1016/j.jvoice.2018.07.014.

[22] ISLAM T, HOQUE M E, ULLAH M, et al. CNN-based deep learning approach for classification of invasive ductal and metastasis types of breast carcinoma [J]. Cancer Med, 2024, 13(16): e70069. DOI: 10.1002/cam4.70069.

[23] CHALLA B, TAHIR M, HU Y, et al. Artificial intelligence-aided diagnosis of breast cancer lymph node metastasis on histologic slides in a digital workflow [J]. Mod Pathol, 2023, 36(8): 100216.

[24] WANG R, GU Y, ZHANG T Y, et al. Fast cancer metastasis location based on dual magnification hard example mining network in whole-slide images [J]. Comput Biol Med, 2023, 158: 106880.

[25] RETAMERO J A, GULTURK E, BOZKURT A, et al. Artificial intelligence helps pathologists increase diagnostic accuracy and efficiency in the detection of breast cancer lymph node metastases [J]. Am J Surg Pathol, 2024, 48(7): 846-854.

[26] KHALILIBOROUJENI S, HE X J, JIA W J, et al. End-to-end metastasis detection of breast cancer from histopathology whole slide images [J]. Comput Med Imaging Graph, 2022, 102: 102136.

[27] YU J G, WU Z H, MING Y, et al. Prototypical multiple instance learning for predicting lymph node metastasis of breast cancer from whole-slide pathological images [J]. Med Image Anal, 2023.

[28] KIM Y G, SONG I H, CHO S Y, et al. Diagnostic assessment of deep learning algorithms for frozen tissue section analysis in women with breast cancer [J]. Cancer Res Treat, 2023, 55(2): 513-522. DOI: 10.4143/crt.2022.055.

[29] CHANG C P, HSU C Y, WANG H S, et al. Detection of metastatic breast carcinoma in sentinel lymph node frozen sections using an artificial intelligence-assisted system [J]. Pathol Res Pract, 2024.

[30] AKRAM M, IQBAL M, DANIYAL M, et al. Awareness and current knowledge of breast cancer [J]. Biol Res, 2017, 50(1): 33. DOI: 10.1186/s40659-017-0140-9.

[31] XU F, ZHU C, TANG W Q, et al. Predicting axillary lymph node metastasis in early breast cancer using deep learning on primary tumor biopsy slides [J]. Front Oncol, 2021, 11: 759007. DOI: 10.3389/fonc.2021.759007.

[32] PARK D, LEE Y M, EO T, et al. Multimodal AI model for preoperative prediction of axillary lymph node metastasis in breast cancer using whole slide images [J]. NPJ Precis Onc, 2025, 9: 131. DOI: 10.1038/s41698-025-00914-9.

[33] YAO Y H, LV Y P, TONG L, et al. ICSDA: a multi-modal deep learning model to predict breast cancer recurrence and metastasis risk by integrating pathological, clinical and gene expression data [J]. Brief Bioinform, 2022, 23(6): bbac448. DOI: 10.1093/bib/bbac448.

[34] YANG J L, JU J, GUO L, et al. Prediction of HER2-positive breast cancer recurrence and metastasis risk from histopathological images and clinical information via multimodal deep learning [J]. Comput Struct Biotechnol J, 2022, 20: 333-342. DOI: 10.1016/j.csbj.2021.12.028.

[35] LIU X Y, YUAN P, LI R L, et al. Predicting breast cancer recurrence and metastasis risk by integrating color and texture features of histopathological images and machine learning technologies [J]. Comput Biol Med, 2022, 146: 105569.

[36] LIU F F, HARDIMAN T, WU K L, et al. Systemic immune reaction in axillary lymph nodes adds to tumor-infiltrating lymphocytes in triple-negative breast cancer prognostication [J]. NPJ Breast Cancer, 2021, 7: 86. DOI: 10.1038/s41523-021-00292-y.

[37] GRIGORIADIS A, GAZINSKA P, PAI T, et al. Histological scoring of immune and stromal features in breast and axillary lymph nodes is prognostic for distant metastasis in lymph node-positive breast cancers [J]. J Pathol Clin Res, 2018, 4(1): 39-54. DOI: 10.1002/cjp2.87.

[38] VERGHESE G, LI M Y, LIU F F, et al. Multiscale deep learning framework captures systemic immune features in lymph nodes predictive of triple negative breast cancer outcome in large-scale studies [J]. J Pathol, 2023, 260(4): 376-389. DOI: 10.1002/path.6088.

[39] BASAAD A, BASURRA S, VAKAJ E, et al. A BERT-GNN approach for metastatic breast cancer prediction using histopathology reports [J]. Diagnostics, 2024, 14(13): 1365. DOI: 10.3390/diagnostics14131365.

[40] SAMMUT S J, CRISPIN-ORTUZAR M, CHIN S F, et al. Multi-omic machine learning predictor of breast cancer therapy response [J]. Nature, 2022, 601(7894): 623-629. DOI: 10.1038/s41586-021-04278-5.

[41] PEROU C M, SØRLIE T, EISEN M B, et al. Molecular portraits of human breast tumours [J]. Nature, 2000, 406(6797): 747-752. DOI: 10.1038/35021093.

[42] CEJALVO J M, DE DUEÑAS E M, GALVÁN P, et al. Intrinsic subtypes and gene expression profiles in primary and metastatic breast cancer [J]. Cancer Res, 2017, 77(9): 2213-2221.

[43] YU Y F, CAI G Y, LIN R C, et al. Multimodal data fusion AI model uncovers tumor microenvironment immunotyping heterogeneity and enhanced risk stratification of breast cancer [J]. MedComm, 2024, 5(12): e70023. DOI: 10.1002/mco2.70023.

[44] AHUJA S, ZAHEER S. Advancements in pathology: Digital transformation, precision medicine, and beyond [J]. J Pathol, 2024.

[45] ZHANG H, YANG F, XU Y, et al. Multimodal integration using a machine learning approach facilitates risk stratification in HR+/HER2- breast cancer [J]. Cell Rep Med, 2025, 6(2).

[46] WANG Z Z, SAOUD C, WANGSIRICHAROEN S, et al. Label cleaning multiple instance learning: refining coarse annotations on single whole-slide images [J]. IEEE Trans Med Imaging, 2022, 41(12): 3952-3968. DOI: 10.1109/TMI.2022.3202759.
