Astronomical Techniques and Instruments, Vol. 2, July 2025, 246–254
A Monitoring System to Improve Fault Diagnosis in Telescope Arrays
Yang Xu¹*, Guangwei Li¹, Jing Wang¹, Liping Xin¹, Xiaomeng Lu¹, Lei Huang¹, Jianyan Wei¹,², Hongbo Cai¹, Xuhui Han¹
¹CAS Key Laboratory of Space Astronomy and Technology, National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100101, China
²University of Chinese Academy of Sciences, Beijing 100049, China
*Correspondence: yxu@nao.cas.cn
Received: February 7, 2025; Accepted: March 21, 2025; Published Online: May 13, 2025
https://doi.org/10.61977/ati2025019; https://cstr.cn/32083.14.ati2025019
© 2025 Editorial Office of Astronomical Techniques and Instruments, Yunnan Observatories, Chinese Academy of Sciences. This is an open access article under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/)
Citation: Xu, Y., Li, G. W., Wang, J., et al. 2025. A monitoring system to improve fault diagnosis in telescope arrays. Astronomical Techniques and Instruments, 2(4): 246−254. https://doi.org/10.61977/ati2025019.
Abstract: The Ground-based Wide-Angle Cameras array necessitates the integration of more than 100 hardware devices, 100 servers, and 2,500 software modules that must be synchronized within a 3-second imaging cycle. However, the complexity of real-time, high-concurrency processing of large datasets has historically resulted in substantial failure rates, with an observation efficiency estimated at less than 50% in 2023. To mitigate these challenges, we developed a monitoring system designed to improve fault diagnosis efficiency. It includes two innovative monitoring views for "state evolution" and "transient lifecycle." Combining these with "instantaneous state" and "key parameter" monitoring views, the system represents a comprehensive monitoring strategy. Here we detail the system architecture, data collection methods, and design philosophy of the monitoring views. During one year of fault diagnosis experimental practice, the proposed system demonstrated its ability to identify and localize faults within minutes, achieving fault localization nearly ten times faster than traditional methods. Additionally, the system design exhibited high generalizability, with possible applicability to other telescope array systems.
Keywords: Automated telescopes; Astronomical image processing; Fault diagnosis; Monitoring system
1. Introduction
The Ground-based Wide-Angle Cameras (GWAC) array, a main ground-segment component of the Chinese-French Space-based multi-band astronomical Variable Objects Monitor (SVOM) mission \cite{Wei2016}, is a fully automated, self-triggering survey system intended for follow-up observations of SVOM transient detections and for real-time autonomous detection of optical transients. Since construction was completed, a series of scientific results has been obtained with GWAC data, such as the independent detection of a short-duration, high-energy gamma-ray burst, GRB 201223A, with a timescale of only 29 s \cite{Xin2023}; the detection of more than 200 white-light flares \cite{Li2024, Li2023}; the detection of two superflares on cool stars with amplitudes of approximately 10 magnitudes \cite{Xin2024, Xin2021}; and characterization of the long-term flare activity of cool stars \cite{Li2023b}. The complete GWAC system consists of 10 telescope mounts, 50 image sensors, 50 camera focusing mechanisms, and more than 100 supporting servers.
The data processing pipeline includes multiple subsystems for observation scheduling \cite{Han2021}, observation control, automatic focus \cite{Huang2015}, automatic guiding, real-time scientific data processing, automatic follow-up \cite{Xu2020}, and scientific result management \cite{Xu2020b}.
Daily GWAC operations rely on the coordination of numerous software and hardware modules and involve complex processes that result in a moderately high fault rate. The two primary challenges of maintaining the array's operational efficiency are:
(1) Hardware-software dependence. Hardware operations strongly depend on software feedback. For example, pointing correction for each mount relies on the astrometric analysis of observed images, while maintaining image quality depends on the calculation results of the stellar energy concentration in each image. These are affected by meteorological variability, introducing uncertainty in the astrometric assessments. Because scientific data processing also depends on these results, the overall GWAC feedback loop is long, with limited reliability.
(2) Complex software architecture. The system is intricate and requires high concurrency and real-time processing. The data processing pipeline must complete a variety of critical tasks, such as observation planning, image quality assessment, template catalog generation, hardware feedback, cross-matching with multiple catalogs, and automated transient follow-up. For example, the processing of a single image requires more than 50 software modules; thus, because all 50 cameras operate simultaneously, a single image cycle necessitates more than 2,500 software module invocations. Interdependence between modules implies that a single failure can disrupt the entire operational process, reducing system robustness.
When the number of telescopes in an array is small, faults are rare and their impact on observation efficiency is minimal. However, as the number of telescopes increases, fault frequency increases, often exceeding the diagnosis and repair capacity of the maintenance staff. This results in prolonged operational periods affected by fault occurrences and a substantially reduced system efficiency. Therefore, the GWAC array critically requires an efficient monitoring system to track system state continuously, diagnose faults, issue subsequent alerts, and provide maintenance staff with guidance to solve problems rapidly and, ultimately, improve the operational efficiency.
Current monitoring solutions are inadequate for the GWAC array because of its complexity and requirements for high concurrency and real-time operation. For example, the monitoring system of the Cherenkov Telescope Array \cite{Costa2021} mainly tracks hardware degradation to prevent major failures. The Large Sky Area Multi-Object Fiber Spectroscopic Telescope \cite{Hu2021} relies on real-time assessments of its guiding system performance, focal surface defocusing, submirror performance, and active optics system performance. Similarly, radio astronomy projects such as the Square Kilometre Array \cite{DiCarlo2016} mainly monitor hardware health state. These systems emphasize hardware health or observation efficiency but cannot monitor hardware and software state comprehensively in high-concurrency environments like that of the GWAC array.
To overcome these limitations, the GWAC array requires a monitoring system that not only provides real-time hardware state tracking, but also integrates comprehensive software pipeline monitoring. Thus, by design, the GWAC monitoring system continuously assesses the running state of all array components and displays monitoring data in several views. To ensure comprehensive fault coverage, the system collects a variety of data on hardware state, real-time pipeline state, and key image parameters. Hardware-software collaboration generates considerable amounts of raw monitoring data that cannot be analyzed manually. By integrating and abstracting the raw data, the monitoring system provides multidimensional views that simplify data interpretation and reduce reliance on the experience and skills of the maintenance personnel. Additionally, the system contributes to characterizing the array's internal operations and temporal performance evolution, supports manual fault diagnosis, and establishes the basis for future automated fault detection.
Unlike traditional monitoring systems that focus on hardware health or isolated performance metrics, the proposed system introduces two original views to monitor state evolution and transient lifecycle. These views offer a dynamic, temporally resolved perspective on system behavior and partly alleviate the limitations of existing approaches. The "state evolution monitoring" view continuously records the temporal variations of key parameters (e.g., mount pointing, image quality, and module invocation time), allowing for early detection of complex faults that develop gradually or propagate across multiple modules. The "transient lifecycle monitoring" view illustrates the entire processing pipeline for transient events, from detection to follow-up, allowing for real-time identification of delays or bottlenecks. This view is particularly useful to optimize transient surveys and improve response efficiency. By integrating these original capabilities, the proposed system is designed to improve fault diagnosis and operational efficiency within a comprehensive monitoring framework, particularly suited to large-scale, high-concurrency systems like the GWAC array.
In this study, we introduce a new monitoring system for the GWAC array, with diversified monitoring data collection and an original visualization scheme. The structure of the manuscript is as follows: Section 2 describes the system architecture and its relationship with the existing GWAC pipeline; Section 3 details the system design and implementation, including monitoring view construction, database design, and system implementation; Section 4 presents the analysis of fault diagnosis cases using the new monitoring views; and Section 5 summarizes our study and suggests future uses for the proposed monitoring system.
2. System Architecture
To enhance the efficiency of fault detection and diagnosis in the GWAC system, we developed a monitoring system integrated into the existing GWAC pipeline. As illustrated in Fig. 1 [FIGURE:1], the monitoring system comprises two main components: data collection and monitoring views.
2.1. Data Collection
The "data collection" component (Fig. 1, top) represents data aggregation from the seven subsystems of the GWAC data processing pipeline. Collected data are sorted into three categories: key module invocation time, image and instrument parameters, and transient processing information, described hereafter.
(1) Key module invocation time: The seven GWAC subsystems include more than 50 software modules. Monitoring all software modules would produce excessive data, complicating data management and visualization. Thus, only selected key modules are monitored, such as observation planning generation, image exposure, image processing, and transient detection. For each key module, the invocation start time is recorded to indicate the module activity state. The specific key modules are described in Section 3.1.1 (instantaneous state monitoring).
(2) Image and instrument parameters: In the GWAC data processing pipeline, several analyses are conducted on each image to evaluate, e.g., image quality, pointing accuracy, alignment precision, and target count. The combined analysis results represent the image parameters. Additionally, physical components of the array, such as mounts or cameras, are characterized by state parameters (e.g., temperature, vacuum level, voltage, and current) representing the instrument parameters.
(3) Transient processing information: Transient properties are a critical scientific output of the GWAC array, and their timely processing is essential to the quality of the scientific results. Data in this category record the start times of key modules directly related to transients, from the acquisition time of transient discovery frames to the triggering time of follow-up observations, to track transient processing.
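As an illustration, the three categories above can be modeled as lightweight records collected once per image cycle. The minimal Python sketch below uses hypothetical field names; it is not the actual GWAC schema.

```python
# Minimal sketch of the three monitoring record categories (field names are
# illustrative assumptions, not the production GWAC schema).
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class ModuleInvocation:
    """Category 1: key module invocation time."""
    camera_id: str
    module_name: str        # e.g., "image_exposure", "transient_detection"
    invoked_at: datetime    # start time of the invocation

@dataclass
class ImageInstrumentParams:
    """Category 2: image and instrument parameters."""
    camera_id: str
    image_id: str
    fwhm: float             # image quality indicator
    pointing_error_arcsec: float
    ccd_temperature_c: float

@dataclass
class TransientProcessing:
    """Category 3: transient processing information."""
    transient_id: str
    discovery_frame_time: datetime
    followup_trigger_time: Optional[datetime] = None
```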
2.2. Monitoring Views
The "monitoring views" component (Fig. 1, bottom) represents the abstraction and presentation of the collected data into four views (described in Section 3.1): instantaneous state, state evolution, key parameters, and transient lifecycle.
(1) Instantaneous state monitoring: This view tracks real-time system state, providing immediate feedback on system health. It alerts staff of faults to allow for prompt diagnosis and resolution. This view includes real-time camera previews and displays the state of key modules.
(2) State evolution monitoring: This view illustrates the temporal evolution of the state of key modules, i.e., time series of the instantaneous state monitoring.
(3) Key parameter monitoring: This view primarily monitors the temporal evolution of parameters related to images, mounts, and cameras.
(4) Transient lifecycle monitoring: This view visualizes the lifecycle of detected transients, from discovery frame acquisition to the triggering of follow-up observations. It includes key timestamps: acquisition time of the transient candidate detection frame, start and end times of the identification process, and start and end times of follow-up observations. This view is used to verify whether processing is nominal or faulty.
3. System Design and Implementation
3.1. Monitoring View Design
Managing more than 100 custom hardware devices and 2,500 software modules on 100 servers with limited screen space is a substantial challenge for efficient presentation of system state and parameter data. It requires a well-structured user interface (UI), thorough understanding of the monitored parameters, and abstract representation. To improve fault analysis efficiency, the number of monitoring pages should be limited, with each page designed for clarity. Because excessive page switching can result in information loss and longer fault analysis time, a compact information display scheme is necessary.
On the basis of these principles, we developed four monitoring views to provide a comprehensive operational overview of the GWAC system and accelerate fault detection and localization.
3.1.1. Instantaneous State Monitoring View
This view is intended for early fault warnings and provides a comprehensive, real-time system health summary that includes images from the 50 cameras and a key module state monitoring table. The UI design details are illustrated in Fig. 2 [FIGURE:2].
(1) Camera observation image monitoring: This page displays real-time images from each camera to assess observation parameters, including focus, camera state, and weather conditions. The design challenge is to maintain clarity when displaying images from 50 cameras simultaneously. We propose a combination of thumbnails around a high-resolution image carousel (Fig. 2A). Thumbnails are displayed on the sides for browsing, while high-resolution images are presented in the center carousel, allowing users to view detailed images by clicking on the thumbnails.
(2) Key module instantaneous state monitoring: This page tracks the real-time state of the data processing pipeline, involving more than 50 software modules for each camera. Because of space constraints, only key modules are monitored and their state displayed (Fig. 2B). Each row corresponds to a camera and each column to a key module. Module state is color-coded in white (online), green (nominal), orange (warning), red (fault), and gray (offline). This layout supports rapid fault diagnosis by efficiently conveying critical information.
By providing a complete overview of the system's health state, the instantaneous state monitoring view enables users to rapidly assess overall array performance. For a detailed fault diagnosis, users can conduct a comprehensive system analysis using the additional monitoring views.
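As a minimal sketch of the color coding described in item (2) above, the following Python snippet maps module states to display colors for one table row; the state and module names are illustrative.

```python
# Sketch of the five-color state coding used in the key module state table.
STATE_COLORS = {
    "online":  "white",
    "nominal": "green",
    "warning": "orange",
    "fault":   "red",
    "offline": "gray",
}

def render_state_row(module_states: dict) -> list:
    """Return (module, color) pairs for one table row (one camera)."""
    return [(module, STATE_COLORS.get(state, "gray"))
            for module, state in module_states.items()]

# Example: a camera whose transient detection module has failed.
row = render_state_row({
    "observation_plan":  "nominal",
    "image_exposure":    "nominal",
    "image_processing":  "warning",
    "transient_detect":  "fault",
})
```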
3.1.2. State Evolution Monitoring View
This view is used to monitor continuous system changes during observation, focusing on complex mechanisms essential for fault diagnosis. For the same purpose, traditional monitoring methods use specialized, time-consuming tools that require expertise. We designed a new view for dynamic visualization of the system state. It displays the variations in mount pointing, image quality, and image lifecycle (Fig. 3 [FIGURE:3]). The UI consists of a control area for mount selection, data type selection, and observation date selection (Fig. 3, left) and of a chart displaying temporal parameter variations (Fig. 3, right), with the following layout:
(1) Horizontal axis: elapsed time in minutes since the start of the active observation.
(2) Vertical axis: parameters such as mount properties (observation plan, guiding action, pointing errors), camera properties (template image creation, focus, image quality indicators such as the full width at half maximum or FWHM¹), and invocation time of key modules in the image processing pipeline. The monitored properties are arranged sequentially on the chart for each of the five cameras installed on the selected mount.
(3) Key module association: Each item on the vertical axis represents a key module. For each camera, multiple key modules are connected from top to bottom in chronological order; the shape of the connecting lines indicates different types of faults.
¹ The term FWHM (full width at half maximum) in this work denotes the width of a star's intensity profile measured at half its maximum brightness.
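To make the chart layout concrete, the following Python sketch maps key module invocation records of one mount to chart coordinates as described above (elapsed minutes on the horizontal axis, one vertical slot per key module and camera). The module names and record fields are assumptions for illustration.

```python
# Sketch: map invocation records of one mount to (x, y) chart coordinates.
from datetime import datetime

KEY_MODULES = ["plan", "exposure", "astrometry", "photometry", "transient_detect"]

def chart_points(invocations, obs_start: datetime, cameras):
    """Return (elapsed_minutes, vertical_slot) points for the evolution chart."""
    points = []
    for rec in invocations:
        x = (rec["invoked_at"] - obs_start).total_seconds() / 60.0
        # Modules of one camera occupy consecutive vertical slots, so the
        # invocations of a single image connect top to bottom chronologically.
        y = (cameras.index(rec["camera_id"]) * len(KEY_MODULES)
             + KEY_MODULES.index(rec["module_name"]))
        points.append((x, y))
    return points
```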
3.1.3. Key Parameter Monitoring View
This view provides separate monitoring of parameters related to the mounts, cameras, and images. For example, temporal variations of the mount pointing accuracy, displayed in Fig. 4A [FIGURE:4], indicate deviations from the planned pointing. Image quality (Fig. 4B), influenced by the camera itself and by environmental factors, is monitored with parameters such as FWHM, star count, background brightness, limiting magnitude, and processing time. Temporal variations of camera hardware parameters, such as temperature, are also displayed in Fig. 4B. All of these parameters are listed in the "image parameters" menu on the left-hand side; selecting a parameter name displays the corresponding curve.
3.1.4. Transient Lifecycle Monitoring View
Because of its hardware characteristics, the GWAC array can observe fast transients, i.e., with durations of the order of minutes. The entire GWAC transient processing pipeline, from transient detection to automatic identification, has been developed and is fully operational. To date, the array has successfully observed numerous minute-scale transients, such as the 200 white-light flares reported by Li et al. \cite{Li2024, Li2023}. Follow-up identification \cite{Xu2020} of transient candidates, using an independent 60-cm telescope located at the same site, is a critical step in this observation process. Early identification of transients alerts large-aperture telescopes sooner, allowing for timely spectroscopic measurements and other follow-up observations. Therefore, optimizing the GWAC data processing pipeline is critical to accelerate the observation process, from detection to automatic identification, particularly for short-duration phenomena such as optical transients.
For this purpose, we introduced the concept of a "transient lifecycle," which consists of monitoring and optimizing key modules in the transient processing pipeline. These include detection, identification, and follow-up observation. The system records the uptime of each key module to monitor the transient lifecycle (Fig. 5 [FIGURE:5]). When a system fault occurs, time consumption markedly increases, reflecting system anomalies in real time. Fig. 6 [FIGURE:6] illustrates an example of a transient lifecycle fault.
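A minimal sketch of how lifecycle timing can be derived from the recorded timestamps is given below; the stage names and the 60 s delay threshold are illustrative assumptions, not values taken from the GWAC pipeline.

```python
# Sketch: per-stage durations of a transient lifecycle and anomaly flagging.
from datetime import datetime

STAGES = ["discovery_frame", "catalog_parsing", "identification_start",
          "identification_end", "followup_trigger"]

def stage_durations(timestamps: dict) -> dict:
    """Seconds elapsed between consecutive recorded lifecycle stages."""
    return {f"{a}->{b}": (timestamps[b] - timestamps[a]).total_seconds()
            for a, b in zip(STAGES, STAGES[1:])
            if a in timestamps and b in timestamps}

def anomalous_transitions(durations: dict, limit_s: float = 60.0) -> list:
    """Return the stage transitions whose duration exceeds the limit."""
    return [name for name, dt in durations.items() if dt > limit_s]
```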
Although the proposed transient lifecycle monitoring view is valuable to monitor the processing timeline of transient events, the system could benefit from future optimization, such as finer-resolution performance sampling or distributed tracing. Implementing more detailed performance metrics at each stage of the transient processing pipeline could allow the system to precisely identify bottlenecks and optimize resource allocation. This would further improve the transient processing efficiency and reduce delays caused by data accumulation.
3.2. Database Design
To support data storage for the proposed monitoring system, we designed a series of database tables divided into two main categories: "basic entity" and "entity state." Here, an entity represents a monitoring target within the astronomical data processing pipeline, such as a mount, camera, or image.
(1) Basic entity tables: These include the observation plan table, mount table, camera table, image table, and transient table. The basic entity tables are shared with the scientific data processing pipeline and are used to support GWAC operations and scientific result management.
(2) Entity state tables: These include the instantaneous state table, transient key module uptime table, image lifecycle monitoring table, and image and instrument parameter tables. They are specific to the monitoring system and are used to record data parameters, state parameters, and temporal lifecycle variations of the monitored entities.
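The split between the two table categories can be sketched as follows, using SQLite purely for illustration; the production schema, table names, and columns are assumptions.

```python
# Sketch of the two table categories (illustrative names and columns only).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Basic entity tables (shared with the scientific data processing pipeline)
CREATE TABLE camera (camera_id TEXT PRIMARY KEY, mount_id TEXT);
CREATE TABLE image  (image_id  TEXT PRIMARY KEY, camera_id TEXT, obs_time TEXT);

-- Entity state tables (specific to the monitoring system)
CREATE TABLE instantaneous_state (
    camera_id   TEXT,
    module_name TEXT,
    state       TEXT,      -- online / nominal / warning / fault / offline
    updated_at  TEXT,
    PRIMARY KEY (camera_id, module_name)   -- one latest record per camera/module
);
CREATE TABLE transient_module_uptime (
    transient_id TEXT,
    module_name  TEXT,
    started_at   TEXT
);
""")
```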
3.3. System Implementation
The GWAC monitoring system comprises a web server, a database, and several data collection programs. The web server facilitates view display and provides application programming interfaces (APIs) for data collection. The data collection programs run on the GWAC data processing servers, where they monitor and collect information on the local operational state, uploaded to the web server via the API during each image cycle. The collected monitoring data are then stored in the database and made accessible to users through multiple view pages on a web browser.
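A data collection program of this kind might look like the sketch below: each image cycle, it gathers the local state and posts it to the web server API. The endpoint URL, payload layout, and cycle length are assumptions for illustration.

```python
# Sketch of a per-server data collection agent (endpoint and payload assumed).
import time
import requests

MONITOR_API = "http://monitor.example/api/state"   # hypothetical endpoint
CYCLE_S = 15                                       # illustrative image cycle length

def collect_local_state() -> dict:
    """Placeholder for reading local module states and image parameters."""
    return {"camera_id": "C025",
            "module_states": {"image_processing": "nominal"}}

def run_agent() -> None:
    while True:
        try:
            requests.post(MONITOR_API, json=collect_local_state(), timeout=5)
        except requests.RequestException:
            pass   # monitoring uploads must never interrupt the science pipeline
        time.sleep(CYCLE_S)

if __name__ == "__main__":
    run_agent()
```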
Details of the monitoring system implementation, necessary to ensure efficient data collection and visualization for fault diagnosis, are given in the following sections.
3.3.1. Data Visualization Performance Optimization
The monitoring system employs several strategies to accelerate data visualization and handle a large volume of monitoring data efficiently.
(1) Instantaneous state monitoring view: The instantaneous camera state is stored in a dedicated state table, where each record represents the latest state of a camera. This minimal database footprint ensures rapid data retrieval. Additionally, a fast page refresh rate, at the frequency of the camera's observation cycle, ensures real-time state updates.
(2) State evolution monitoring view: This view aggregates state evolution data from all cameras at image granularity over an entire night. For the GWAC telescope array, a single page may display approximately 600,000 records, with a total attribute count exceeding 10 million. To optimize rendering speed, a "lazy-loading" mechanism is adopted: if cached data are available, the system loads them by default; otherwise, data are retrieved from the database in real time. Users can manually initiate data updates from the interface.
Rendering technology: The system uses a JavaScript-based canvas for online rendering on web pages. This enables users, through interactive zooming and moving, to examine fine details within the monitoring data.
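A minimal sketch of the lazy-loading behavior described in item (2) above is shown below: cached chart data are served by default, and the database is queried only when no cache exists or a refresh is requested explicitly. The file layout and function names are assumptions.

```python
# Sketch of cache-first ("lazy") loading for the state evolution view.
import json
import os

CACHE_DIR = "/tmp/gwac_monitor_cache"   # hypothetical cache location

def query_database(mount_id: str, night: str) -> list:
    """Placeholder for the real database query (~600,000 records per night)."""
    return []

def load_state_evolution(mount_id: str, night: str, refresh: bool = False) -> list:
    cache_file = os.path.join(CACHE_DIR, f"{mount_id}_{night}.json")
    if not refresh and os.path.exists(cache_file):
        with open(cache_file) as f:             # fast path: cached data
            return json.load(f)
    records = query_database(mount_id, night)   # slow path: live query
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(cache_file, "w") as f:
        json.dump(records, f)
    return records
```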
3.3.2. Database Optimization for Real-Time Performance and Reliability
The GWAC monitoring system must process approximately 18 million observation records per month. Unfortunately, the performance of a relational database degrades considerably once a single table exceeds 10 million records. To maintain real-time performance and ensure database reliability, the following optimizations are implemented.
(1) Data partitioning into daily and historical tables: A daily data table is used to store and retrieve real-time operational data, ensuring rapid access and updates for active observations. Additionally, a historical data table is used for long-term storage and supports offline fault analysis and data mining, for which real-time performance is less critical. Data are automatically migrated from the daily table to the historical table at 16:30 Beijing time every day to maintain optimal database performance.
(2) Database reliability measures: The database is configured with real-time streaming replication to ensure high availability. Backup points are established at both the observation site and the Beijing data center to provide geographical redundancy and enhance data security.
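The daily-to-historical migration of item (1) reduces to a copy-then-clear job run once per day. The sketch below uses SQLite syntax and assumed table names purely for illustration.

```python
# Sketch of the scheduled daily-to-historical migration (table names assumed).
import sqlite3

def migrate_daily_to_historical(db_path: str) -> None:
    conn = sqlite3.connect(db_path)
    with conn:   # one transaction: copy today's rows, then clear the daily table
        conn.execute("INSERT INTO state_history SELECT * FROM state_daily")
        conn.execute("DELETE FROM state_daily")
    conn.close()
```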
3.3.3. Message Queue for Real-Time Processing and System Robustness
To balance real-time data ingestion and fault tolerance, the system employs a message queue-based data processing architecture. All collected monitoring data are stored in a message queue before being committed to the database. This ensures that data are ingested in real time without overwhelming the database. Message consumption threads then process incoming data asynchronously and continuously to prevent system bottlenecks. Thus, in case of anomalous data spikes or unexpected failures, the backlog is handled without system failure, ensuring system robustness even under extreme conditions.
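This pattern can be sketched with an in-process queue and a single consumer thread, as below; the production system may well use a dedicated message broker instead, so this is an illustration of the buffering principle only.

```python
# Sketch of queue-buffered ingestion: the API enqueues records immediately,
# while a background consumer commits them to the database asynchronously.
import queue
import threading

ingest_queue = queue.Queue()

def ingest(record: dict) -> None:
    """Producer side: called for every uploaded monitoring record."""
    ingest_queue.put(record)          # returns at once; no database work here

def write_to_database(record: dict) -> None:
    """Placeholder; in production this would insert into the state tables."""
    pass

def consumer_loop() -> None:
    """Consumer side: drains the queue continuously to absorb data spikes."""
    while True:
        record = ingest_queue.get()
        try:
            write_to_database(record)
        finally:
            ingest_queue.task_done()

threading.Thread(target=consumer_loop, daemon=True).start()
```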
4. Fault Diagnosis Practices Using Monitoring Views
The monitoring system was completed in early 2024. Over nearly a year of operation, we have conducted extensive fault analyses and diagnoses using the proposed monitoring views, ultimately establishing a correspondence table between faults and monitoring views in which nearly all common faults correspond to a specific view configuration and to a solution. With our proposed monitoring views, the complete fault analysis process, from warning to diagnosis, typically lasted for a few minutes only. Compared with traditional methods, for which the same process can last from several tens of minutes to several hours, the proposed monitoring solution accelerated fault diagnosis by a factor of 10 or more. Moreover, the monitoring views described in this study do not consider all available fault information; system optimization is ongoing, and upgrades are continuously applied. In the future, the monitoring content will be further expanded to cover more fault points. Typical examples of common fault diagnoses, characterized using the proposed monitoring views, are presented hereafter.
(1) Defocused image: Fig. 7A [FIGURE:7] is a real-time preview of a faulty observation. The poor image quality indicates a failure of the autofocus function.
(2) Single or multiple response timeout: Fig. 7B displays the instantaneous state of key modules (small rectangular boxes) for each camera in the array. A majority of boxes are typically colored green, indicating nominal module state. A red box indicates an anomalous state for a specific key module. An entirely red column indicates that the corresponding key module, for all cameras (rows), either failed to start or suffered complete failure.
(3) Simultaneous data processing failure on multiple servers: This is illustrated in Fig. 8A [FIGURE:8] and Fig. 8B, which present three cases of the state evolution monitoring view. In Fig. 8A, two cameras on the same mount exhibited an unusually large FWHM because of the weather conditions, degrading the image quality and causing simultaneous data processing failures on two servers of the same mount. In Fig. 8B, multiple failures are displayed. First, a weather-related fault caused data processing failures on all servers of a single mount, triggering a pointing switch to the next planned sky region. Second, frequent sky region switches on a mount depleted its observation plan for the corresponding time period, eventually causing it to enter a waiting state and suspend observations. At the same time, focusing failure of the guiding camera or excessive pointing deviations also induced simultaneous data processing failures on all servers of a single mount. Additionally, a failure in the source extraction module of one server prevented image processing, but not image acquisition.
(4) Single-server data processing delay: Fig. 8C illustrates a network card failure on the camera control server, reducing transmission speed to below the image readout rate, which caused considerable image transmission delays. However, image processing remained nominal in the subsequent modules.
(5) Anomalous transient lifecycle: Fig. 6 illustrates the temporal evolution of transient candidate catalog parsing. In two periods, transient catalog parsing experienced substantial delays. This is typically caused by consecutive processing failures for multiple images, which generate an excessive number of transient candidates. When the number of candidates exceeds peak pipeline processing capacity, a candidate queue forms, resulting in processing delays. As processing continues, the number of queued targets gradually decreases and processing time reverts to its nominal state.
5. Summary and Future Work
The complexity of the GWAC data processing pipeline results in a high failure rate and difficult fault diagnosis. In this study, we designed and implemented an original monitoring system to improve the fault diagnosis efficiency of the GWAC array. The primary innovation of the proposed system is the design of monitoring views tailored to the highly complex pipeline, requiring the abstraction of intricate data collected from more than 2,500 software modules across 50 observing cameras and the compact and efficient presentation of this information on a limited number of computer screens. The system comprises four views that provide a comprehensive view of the monitoring data: instantaneous state monitoring, state evolution monitoring, key parameter monitoring, and transient lifecycle monitoring. In particular, the state evolution and transient lifecycle monitoring views are here proposed for the first time in the field of astronomical research.
These innovative monitoring views provide users with comprehensive diagnosis tools for the instantaneous states and state evolution of the GWAC array, including pointing, observation, data processing, and hardware-software feedback. This enhanced diagnosis capability considerably accelerates fault detection and localization. Additionally, the transient lifecycle monitoring view provides clear visualization of critical anomalies in key processing modules, thus representing a valuable reference indicator to optimize observation efficiency. Our practical case analysis indeed demonstrated that the proposed monitoring system produced a tenfold acceleration of fault localization in the GWAC array. We also detailed the collection and storage schemes implemented for the monitoring data and presented several monitoring case studies to illustrate the aspects of fault analysis related to each original monitoring view.
In its current version, the GWAC monitoring system primarily focuses on the visualization of monitoring data, assisting operators in assessing the system operation state and conducting fault analyses. Although common faults can be diagnosed with the monitoring views only, complex failures require expert intervention combining the monitoring views with detailed backend logs. Future work will involve compiling an exhaustive list of fault cases and characterizing the connection between monitoring data and faults. Additionally, research will be undertaken to establish an automated fault diagnosis system, relying on monitoring data to further improve fault diagnosis and system maintenance efficiency. The design of the proposed system monitoring views is highly generalizable, hence applicable not only to the GWAC array, but also to other arrays.
Acknowledgments
This study was supported by the Young Data Scientist Program of the China National Astronomical Data Center, the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB0550401), and the National Natural Science Foundation of China (12494573).
AI Disclosure Statement: Deepseek was employed for language and grammar checks within the article. The authors carefully reviewed, edited, and revised the Deepseek-generated texts to their own preferences, assuming ultimate responsibility for the content of the publication.
Author Contributions: Yang Xu proposed the concept of comprehensive monitoring, and implemented the monitoring system. Guangwei Li, Jing Wang, and Liping Xin used and collected fault diagnosis cases based on monitoring views. Hongbo Cai, Xuhui Han, Xiaomeng Lu, Lei Huang, and Jianyan Wei provided guidance and suggestions on system details and future work. All authors read and approved the final manuscript.
Declaration of Interests: The authors declare no competing interests.
References
Wei, J. Y., Cordier, B., Antier, S., et al. 2016. The deep and transient universe in the SVOM era: new challenges and opportunities-scientific prospects of the SVOM mission. arXiv: 1610.06892.
Xin, L. P., Han, X. H., Li, H. L., et al. 2023. Prompt-to-afterglow transition of optical emission in a long gamma-ray burst consistent with a fireball. Nature Astronomy, 7(6): 724−730.
Li, G. W., Wang, L., Yuan, H. L., et al. 2024. The white-light superflares from cool stars in GWAC triggers. The Astrophysical Journal, 971: 114.
Li, G. W., Wu, C., Zhou, G. P., et al. 2023. Magnetic activity and parameters of 43 flare stars in the GWAC archive. Research in Astronomy and Astrophysics, 23(1): 015005.
Xin, L. P., Li, H. L., Wang, J., et al. 2024. A huge-amplitude white-light superflare on a L0 brown dwarf discovered by GWAC survey. Monthly Notices of the Royal Astronomical Society, 527(2): 2232−2239.
Xin, L. P., Li, H. L., Wang, J., et al. 2021. A ΔR~9.5 mag superflare of an ultracool star detected by the SVOM/GWAC system. The Astrophysical Journal, 909(2): 106.
Li, H. L., Wang, J., Xin, L. P., et al. 2023. White-light superflare and long-term activity of the nearby M7 type binary EI Cnc observed with GWAC system. The Astrophysical Journal, 954(2): 142.
Han, X. H., Xiao, Y. J., Zhang, P. P., et al. 2021. The automatic observation management system of the GWAC network. I. System architecture and workflow. Publications of the Astronomical Society of the Pacific, 133(1024): 104504.
Huang, L., Xin, L. P., Han, X. H., et al. 2015. Auto-focusing of wide-angle astronomical telescope. Optics and Precision Engineering, 23: 174−183.
Xu, Y., Xin, L. P., Wang, J., et al. 2020. A real-time automatic validation system for optical transients detected by GWAC. Publications of the Astronomical Society of the Pacific, 132(1011): 054502.
Xu, Y., Xin, L. P., Han, X. H., et al. 2020. The GWAC data processing and management system. arXiv: 2003.00205.
Costa, A., Munari, K., Incardona, F., et al. 2021. The monitoring, logging, and alarm system for the Cherenkov Telescope Array. arXiv: 2109.05770.
Hu, T. Z., Zhang, Y., Cui, X. Q., et al. 2021. Telescope performance real-time monitoring based on machine learning. Monthly Notices of the Royal Astronomical Society, 500(1): 388−396.
Di Carlo, M., Dolci, M., Smareglia, R., et al. 2016. Monitoring and controlling the SKA telescope manager: a peculiar LMC system in the framework of the SKA LMCs. In Proceedings of SPIE, 9913: 1348−1357.