Postprint: A Study of the Spatial Distribution of Infall Candidates in Molecular Cloud Clumps
Wan Yujie, Jiang Zhibo, Jiang Yu, Chen Zhiwei
Submitted 2025-10-11 | ChinaXiv: chinaxiv-202510.00056

Abstract

Stars form from the gravitational collapse of dense molecular cloud cores. Investigating the specific locations within molecular cloud clumps where this collapse motion preferentially occurs will help elucidate star formation across different regions of molecular cloud clumps, providing additional insights for star formation studies. Utilizing CO data from the Milky Way Imaging Scroll Painting project, combined with basic information on 3533 infall candidate sources identified through CO spectral lines, we identify the molecular cloud clumps associated with these infall candidate sources and examine their distribution within the clumps. By comparing the distribution obtained from random sampling according to a specified number density in a three-dimensional sphere with the actual distribution of infall candidate sources within molecular cloud clumps, we find that the number density distribution of infall candidate sources approximately follows a Gaussian decay with normalized central distance, i.e., the relationship between the number density n of infall sources and the normalized central distance r is approximately $ n \propto {\rm e}^{-ar^{2}} $, where a is the attenuation coefficient. In 13CO clumps, the best-fit number density function is $ n \propto {\rm e}^{-4.5r^{2}} $; while in C18O clumps, the best-fit number density function is $ n \propto {\rm e}^{-3.2r^{2}} $. The results demonstrate that infall preferentially occurs in the central regions of molecular cloud clumps and is less frequent at the clump peripheries.

Full Text

Preamble

Acta Astronomica Sinica Vol. 66 No. 5 Sept., 2025 doi: 10.15940/j.cnki.0001-5245.2025.05.006

Spatial Distribution of Infall Candidates in Molecular Clumps

WAN Yu-jie¹,² JIANG Zhi-bo¹† JIANG Yu¹,² CHEN Zhi-wei¹

(1 Purple Mountain Observatory, Chinese Academy of Sciences, Nanjing 210023)
(2 School of Astronomy and Space Sciences, University of Science and Technology of China, Hefei 230026)

Abstract

Stars form through the gravitational collapse of dense molecular cloud cores. Investigating where this collapse motion preferentially occurs within molecular cloud clumps will help us understand star formation across different regions of clumps and provide valuable insights into the star formation process. Using CO data from the Milky Way Imaging Scroll Painting (MWISP) project and basic information for 3533 infall candidates identified from CO spectral lines, we search for the molecular cloud clumps associated with these infall candidates and examine their distribution within the clumps. By comparing distributions obtained from Monte Carlo simulations of points scattered at various number densities in a 3D sphere with the actual distribution of infall candidates in molecular cloud clumps, we find that the number density of infall candidates exhibits an approximate Gaussian decay with normalized center distance. Specifically, the relationship between the number density n of infall sources and the normalized center distance r is approximately $ n \propto {\rm e}^{-ar^{2}} $, where a is the decay coefficient. In 13CO clumps, the best-fitting number density function is $ n \propto {\rm e}^{-4.5r^{2}} $, while in C18O clumps it is $ n \propto {\rm e}^{-3.2r^{2}} $. The results indicate that infall motions are more likely to occur in the central regions of molecular cloud clumps and less likely at the edges.

Key words stars: formation, interstellar medium: clouds, infall candidates
CLC number: P155
Document code: A

1 Introduction

Since the first detection of the J=1⟶0 rotational transition line of interstellar carbon monoxide (CO) in Orion in the 1970s [1], mounting observational evidence has demonstrated that stars are born in cold molecular gas in interstellar space. Based on extensive observational data from nearby low-mass star-forming regions, primarily comprising millimeter-wave molecular line and infrared observations together with theoretical studies, Shu et al. [2] synthesized the standard model for low-mass star formation: dense molecular cloud cores in molecular clouds begin inside-out collapse under gravity to form protostars, with material in the envelope continuing to fall inward. The protostar gains mass through an accretion disk until the envelope and disk material are gradually exhausted, leaving a revealed star that reaches the classical T Tauri stage—that is, a pre-main-sequence star. Pre-main-sequence stars gradually contract and heat up until they reach the main sequence [3–4].

Two main models currently explain massive star formation [5]. The first is the monolithic collapse model, also known as the turbulent core accretion model [6], in which a single molecular cloud core collapses under gravity and increases its mass through disk accretion. The second is the competitive accretion model [7], which posits that massive stars form in the densest regions of molecular cloud cores by rapidly accreting gas. Regardless of which mechanism operates, the gravitational collapse of molecular cloud cores represents the earliest phase of star formation [8]. The onset of gravitational collapse in molecular cloud cores can follow two modes. One is spontaneous star formation, where a core automatically begins collapsing due to some instability after meeting certain physical conditions. The other is triggered star formation [9–10], where collapse is induced by external factors such as radiation flows, shocks, or cloud-cloud collisions [11] after the core meets certain conditions. This latter mode may dominate in burst-like star formation.

Gravitational collapse of molecular cloud cores implies that gas in the outer envelope falls toward the central region. Since the excitation temperature in the core region is higher than in the outer envelope, this infall motion produces a blue-asymmetric double-peaked profile in optically thick molecular lines, while the line center of optically thin lines falls exactly between the two peaks of the optically thick profile [12]. Using this diagnostic method, numerous studies have shown that gas infall can persist until the Class I protostar stage [13–14]. Investigations of molecular line diagnostics toward massive molecular cloud cores indicate that the mass infall rates (M⊙/yr) are sufficient to form massive stars [15–16], and that a considerable fraction of massive molecular cloud cores exhibit global collapse [17].

Most current studies have been limited to targeted observations of specific star-forming regions or dense molecular cloud cores, providing only limited sample sizes. Due to this scarcity of samples, we still lack a comprehensive understanding of gas infall motions during the gravitational collapse phase. Consequently, there is an urgent need for large-scale, unbiased sample studies across the entire Milky Way to more fully understand the dynamical processes during the earliest stages of star formation.

Based on 2400 molecular line data of CO and its isotopologues covering Galactic longitudes 12° to 230° and Galactic latitudes from the MWISP survey [18], Jiang et al. [19] identified 3533 molecular cloud clumps with gas infall features using the blue-asymmetric profiles of 12CO/13CO lines and corresponding optically thin 13CO/C18O lines, creating the largest sample of infall clumps internationally. Yang et al. [20] and Yu et al. [21] selected 343 infall sources from this sample for dense gas tracer (HCO+ and H13CO+) line studies, confirming that 96 sources exhibit clear infall signatures. Meanwhile, these dense gas tracer observations indicate that most infall sources in the sample do not show significant HCO+ emission.

To understand the physical properties and dynamical states of these 3533 infall candidates more comprehensively, we must make full use of the CO and isotopologue line data provided by MWISP. In this paper, we first perform unbiased molecular cloud core detection using the 2400 MWISP CO molecular line data and cross-match the extracted molecular cloud cores with the 3533 infall candidates. We then investigate the spatial distribution of gas infall motions relative to the molecular cloud cores. The aim is to improve our understanding of the dynamical state of gravitational collapse in molecular clouds and to provide statistically significant observational clues to star formation mechanisms.

2.1 Molecular Cloud Clump Identification Algorithm

To investigate the distribution of infall candidates within molecular cloud clumps, we must first accurately identify the molecular cloud clumps associated with these candidates using an automated detection algorithm. Since the 1990s, numerous automated clump detection algorithms have been developed, including widely used methods such as ClumpFind [22], GaussClumps [23], and FellWalker [24]. In recent years, new algorithms like LDC (Local Density Clustering) [25], ConBased [26], and FacetClumps [27] have emerged, providing more options for related research. Jiang et al. [27] compared the recall rates, distance errors, and region intersection over union metrics of FellWalker, LDC, ConBased, and FacetClumps on synthetic data composed of observed and simulated molecular cloud clumps with different densities. The experimental results demonstrated that FacetClumps achieves superior overall performance across different environments. Therefore, this work adopts FacetClumps for molecular cloud clump detection.

FacetClumps consists of four main sub-processes. First, it extracts signal regions based on morphology. Second, it detects clump centers using a Facet model. Third, it segments local regions through gradient analysis. Finally, it clusters local regions to clump centers using connectivity-based minimum distance clustering. FacetClumps combines morphological operations including threshold segmentation, opening operations, and connected component labeling to extract regions with significant signals from the data. It employs a Gaussian Facet model and multivariate function extremum theory to identify potential clump centers, improving positioning accuracy in dense regions and reducing dependence on peaks. Additionally, FacetClumps uses a gradient-based method to segment signal regions into local areas and enhances the rationality of region segmentation through connectivity-based minimum distance clustering that assigns local regions to clump centers. The algorithm is adaptive, automatically iterating parameters according to different local conditions, and optimizes the algorithm for clumps in faint or highly overlapping regions during detection.

Since this work is based on MWISP data, the parameter descriptions for FacetClumps and their adopted values in this paper are as follows:
- RMS: Represents the global noise level of the data; the value is taken as the RMS (Root Mean Square) in the header file or the median of the noise file data.
- Threshold: Represents the signal cutoff threshold; the value is 2×RMS.
- SWindow: Represents the scale of the window function; the value is 3 pixels.
- KBins: Represents the coefficient for calculating the number of eigenvalue intervals; the value is 35.
- FwhmBeam: Represents the beam size in pixels; the value is 2 pixels.
- VeloRes: Represents the velocity resolution of the instrument in channels; the value is 2 channels.
- SRecursionLBV: Represents the minimum area in spatial directions and minimum length in velocity channels for a region when recursion terminates; the values are 16 pixels and 5 channels, respectively.
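The parameter choices listed above can be collected in one place so that the only per-cube input is the noise level. The following is a minimal sketch: the function name `facetclumps_parameters` and the dictionary layout are illustrative assumptions, not the FacetClumps interface, and only the numerical values come from the list above.

```python
def facetclumps_parameters(rms):
    """Return the parameter values adopted in this paper for a cube with noise `rms`.

    The keys mirror the parameter names in the text; how they are passed to the
    detection code is an assumption left to the reader.
    """
    return {
        "RMS": rms,                # global noise level (header RMS or median of the noise file)
        "Threshold": 2 * rms,      # signal cutoff threshold, 2 x RMS
        "SWindow": 3,              # window-function scale, in pixels
        "KBins": 35,               # coefficient for the number of eigenvalue intervals
        "FwhmBeam": 2,             # beam size, in pixels
        "VeloRes": 2,              # velocity resolution, in channels
        "SRecursionLBV": (16, 5),  # minimum area (pixels) and minimum velocity length (channels)
    }


params = facetclumps_parameters(rms=0.25)  # 0.25 is an arbitrary example noise value
```

Deriving `Threshold` from `RMS` inside the function keeps the 2×RMS rule consistent when the noise level differs from one data block to another.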

2.2.1 Matching Principles

From the MWISP database, we extracted data blocks of approximately 20′×20′ centered on the positions of the infall candidates listed in the table of Jiang et al. [19], and applied the FacetClumps algorithm to detect all molecular cloud clumps and their corresponding masks within these data blocks. The candidates were identified using two line pairs (12CO & 13CO, hereafter P1, and 13CO & C18O, hereafter P2). If a candidate came from P1, we used 13CO data to detect clumps; if from P2, we used C18O data. To ensure matching accuracy, we established the following matching criteria:
- The infall candidate center must lie within the spatial mask of the clump.
- The central velocity of the infall candidate must fall within the velocity range of the clump.

Figure 1 [FIGURE:1] shows a successful matching example. The grayscale image represents the integrated intensity within the corresponding clump velocity range. The blue line indicates the 2D spatial mask of the clump, the red star marks the infall candidate position, and the lower left corner annotates the 3D coordinates of the infall candidate center and the velocity range determined by FacetClumps for that clump.

In some cases, particularly in the inner Galaxy, the complex distribution of molecular gas may cause a candidate center to match two or more clumps simultaneously, as shown in Figure 2 [FIGURE:2]. The symbols are the same as in Figure 1, with the blue and green curves representing the 2D spatial masks of the two clumps, and the velocity ranges of both clumps annotated in the lower left corner.

For such cases where a candidate matches multiple clumps, we cannot definitively determine which clump it belongs to. Therefore, we discard these ambiguous infall candidates from our matching effort and consider only those that match a unique clump.
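The two matching criteria and the uniqueness rule can be sketched as follows. This is an illustrative sketch, not the code used in this work: the dictionary fields (`mask` as a set of pixel coordinates, `v_min`, `v_max`) and the toy clumps are assumptions made for the example.

```python
def matching_clumps(candidate, clumps):
    """Return every clump whose spatial mask and velocity range contain the candidate."""
    x, y, v = candidate["x"], candidate["y"], candidate["v"]
    return [
        clump for clump in clumps
        if (x, y) in clump["mask"]                  # criterion 1: inside the spatial mask
        and clump["v_min"] <= v <= clump["v_max"]   # criterion 2: inside the velocity range
    ]


def unique_match(candidate, clumps):
    """Return the single matching clump, or None if there are zero or several matches."""
    hits = matching_clumps(candidate, clumps)
    return hits[0] if len(hits) == 1 else None


# Toy data: two clumps that overlap at pixel (1, 0) and in velocity between 2 and 5.
clump_a = {"name": "A", "mask": {(0, 0), (1, 0)}, "v_min": -5.0, "v_max": 5.0}
clump_b = {"name": "B", "mask": {(1, 0), (2, 0)}, "v_min": 2.0, "v_max": 9.0}
clumps = [clump_a, clump_b]
```

A candidate at pixel (1, 0) with velocity 3.0 satisfies both criteria for both toy clumps, so `unique_match` returns `None` and the candidate is discarded, as described above.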

2.2.2 Matching Results

Following the above matching criteria, we obtained unique matches for 1712 of the 3329 infall candidates identified from P1 (approximately 51.43%) and for 124 of the 204 candidates identified from P2 (approximately 60.78%). Subsequent analysis focuses exclusively on these uniquely matched infall candidates, totaling 1836 sources.

3.1 Two-Dimensional Projected Distribution of Infall Candidates in Molecular Cloud Clumps

Figure 3 [FIGURE:3] illustrates the position of an infall candidate sample within the sky-plane projection of its associated clump. The grayscale map shows integrated intensity, the light blue star marks the projected position of the clump's centroid, and the green star indicates the projected position of the infall candidate. Since some clumps are surrounded by others, the clump contours may not perfectly match the integrated intensity map. In most cases, the sky-plane projection of a clump is an irregular shape with varying size. To quantitatively describe the distribution of infall candidates in the 2D projection, we define a dimensionless parameter β. We connect the projected position of the clump center to that of the infall candidate center (shown as the red line segment in the figure) and extend it to the boundary of the clump's 2D projection (shown as the blue line segment). We calculate the length of the red segment, denoted as d₁, and the sum of the lengths of the red and blue segments, denoted as d₂. Then β = d₁/d₂. When approximating the clump projection as circular, β represents the normalized center distance—the ratio of the distance from a point inside the circle to the center versus the circle's radius. The β value can quantitatively reflect the distribution of infall candidates in the 2D projection of clumps.
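The construction of β can be sketched numerically. The sketch below assumes the clump projection is available as a 2D boolean mask indexed as `mask[row][col]` and that the centroid and candidate are given in pixel coordinates (x = column, y = row). The function name, the ray-marching step and the nearest-pixel lookup are illustrative choices, not the procedure used in this work.

```python
import math


def beta_from_mask(mask, centroid, candidate, step=0.05):
    """Return beta = d1 / d2 for a candidate inside a 2D clump mask.

    d1 is the centroid-to-candidate distance; d2 is the distance from the
    centroid to the mask boundary along the same direction, found by marching
    outward from the candidate in increments of `step` pixels.
    """
    cx, cy = centroid
    px, py = candidate
    d1 = math.hypot(px - cx, py - cy)
    if d1 == 0.0:
        return 0.0  # candidate coincides with the centroid
    ux, uy = (px - cx) / d1, (py - cy) / d1  # unit vector from centroid to candidate

    def inside(x, y):
        col, row = int(round(x)), int(round(y))  # nearest-pixel lookup
        return 0 <= row < len(mask) and 0 <= col < len(mask[0]) and bool(mask[row][col])

    d2 = d1
    while inside(cx + ux * (d2 + step), cy + uy * (d2 + step)):
        d2 += step  # extend the segment until the next step would leave the mask
    return d1 / d2


# Toy example: a circular mask of radius 10 pixels centered on pixel (20, 20).
disc = [[(col - 20) ** 2 + (row - 20) ** 2 <= 100 for col in range(41)] for row in range(41)]
beta_example = beta_from_mask(disc, centroid=(20, 20), candidate=(25, 20))
```

For the toy disc the candidate sits halfway to the edge, so `beta_example` comes out close to 0.5; the small offset reflects the pixelated boundary.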

By definition, β ranges from 0 to 1. Smaller β values indicate that the infall candidate's projection is closer to the clump centroid's projection, while larger β values indicate proximity to the clump projection boundary. We divide the projected region of molecular cloud clumps into three parts—central, transitional, and edge regions—based on β values using the following classification:
- Central region: β ≤ 0.3
- Transitional region: 0.3 < β ≤ 0.7
- Edge region: β > 0.7
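The three-way classification can be written as a small helper. This is a sketch; the function name and the returned labels are illustrative.

```python
def classify_region(beta):
    """Map a normalized center distance beta (0 to 1) to its region label."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie between 0 and 1")
    if beta <= 0.3:
        return "central"
    if beta <= 0.7:
        return "transitional"
    return "edge"
```

The boundary values 0.3 and 0.7 belong to the central and transitional regions, respectively, following the inequalities above.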

Figure 4 [FIGURE:4] displays six examples of infall candidate positions in molecular cloud clump projections. The symbols are identical to Figure 3, with each panel's upper right corner annotating the β value calculated from the clump and its matched infall candidate, along with the clump's velocity integration range. The top, middle, and bottom rows show infall candidates located in the central, transitional, and edge regions, respectively.

Based on the above definition, we obtained the β values for all infall candidates within their associated molecular cloud clump projections. Figure 5 [FIGURE:5] shows the distribution of infall candidates identified from the two line pairs across different regions of the clump projections. Specifically, for the P1 dataset, the β values in clump projections have a mean of 0.40 and a standard deviation of 0.039. Among these, 601 sources (approximately 35%) reside in the central region, 970 (57%) in the transitional region, and 141 (8%) in the edge region. For the P2 dataset, the mean β value is 0.44 with a standard deviation of 0.043, comprising 36 sources (29.03%) in the central region, 72 (58.06%) in the transitional region, and 16 (12.90%) in the edge region. Overall, the β values of the two candidate groups show no significant statistical difference. The distribution maps of all infall candidates in their corresponding clump projections and the matching tables are available at the following link¹. To avoid confusion caused by mismatches between some clump contours and integrated intensity maps, the figures in the link only display integrated intensity within the masked regions identified by FacetClumps.

To further investigate the positional distribution of infall candidates in clumps, we calculated the probability density function of β values. Figure 6 [FIGURE:6] presents the probability density functions of β for the P1 dataset in 13CO clump projections (left) and the P2 dataset in C18O clump projections (right). Both distributions show similar trends, characterized by higher densities in the middle and lower densities at both ends. However, it is important to note that this distribution reflects only the sky-plane projection of infall candidates in molecular cloud clumps, not their true 3D distribution.

3.2 Distribution Number Density Function in Three-Dimensional Space

3.2.1 Monte Carlo Simulation

Since we cannot directly observe the 3D structure of molecular cloud clumps in the data, we employ Monte Carlo simulations to model the spatial distribution of infall candidates in 3D. First, we assume that molecular cloud clumps are approximately spherical and that the number density distribution of infall candidates within them is isotropic, varying only along the radial direction. Given that we have normalized the observed candidate positions, this assumption should be reasonable. Next, by assuming various functional relationships for how the number density varies with radius, we scatter points in a virtual sphere through simulation. For each point's projected distance d from the center in the projection plane, we define another dimensionless normalized center distance γ = d/R, where R is the radius of the virtual sphere. Similar to β, γ ranges from 0 to 1, reflecting each point's distribution within the projected circle. This yields the probability density functions of γ values under different models.
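The scattering step can be illustrated with a short sketch. It assumes a unit sphere (R = 1), draws each radius from a tabulated shell probability proportional to r²n(r), and picks an isotropic direction. The function name, the number of points, the radial grid size and the seed are all illustrative assumptions; the sampling scheme used in this work is not specified beyond the description above.

```python
import math
import random


def sample_gamma(density, n_points=20000, n_grid=2000, seed=0):
    """Draw projected normalized distances gamma = d/R for points scattered in a
    unit sphere whose number density `density(r)` depends only on radius."""
    rng = random.Random(seed)
    dr = 1.0 / n_grid
    radii = [(i + 0.5) * dr for i in range(n_grid)]    # cell midpoints, so r = 0 is never evaluated
    weights = [r * r * density(r) for r in radii]      # shell probability is proportional to r^2 n(r)
    picks = rng.choices(radii, weights=weights, k=n_points)
    gammas = []
    for r_mid in picks:
        r = r_mid + (rng.random() - 0.5) * dr          # jitter inside the chosen radial cell
        u = rng.uniform(-1.0, 1.0)                     # cosine of the angle to the line of sight
        gammas.append(r * math.sqrt(1.0 - u * u))      # distance from the center in the sky plane
    return gammas


gamma_uniform = sample_gamma(lambda r: 1.0)                      # uniform model
gamma_gauss = sample_gamma(lambda r: math.exp(-4.5 * r * r))     # Gaussian decay with a = 4.5
```

Because the line of sight is an arbitrary axis of an isotropic distribution, projecting with a single random polar angle is sufficient; the azimuth does not affect γ.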

Finally, we determine the most likely model by calculating the Root Mean Square Error (RMSE) between the probability density functions of γ values obtained from different models and the β value probability density function, as well as through Kolmogorov-Smirnov (K-S) testing. Equation (1) shows the RMSE calculation method:

$$
\text{RMSE} = \sqrt{\frac{1}{N}\sum_{i}(f_{\beta i} - f_{\gamma i})^2}
$$

where N is the total number of bins, and $f_{\beta i}$ and $f_{\gamma i}$ are the probability density function values of β and γ at the i-th bin, respectively. Smaller RMSE values indicate smaller deviations and fluctuations between the two distributions. The K-S test is a non-parametric method that can test whether two samples originate from the same probability distribution.
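Both comparison measures can be sketched with the standard library alone. The bin count of 10 is an assumption (the binning is not stated in the text), and only the K-S statistic is computed here; a P-value could be obtained with, for example, `scipy.stats.ks_2samp`.

```python
import math


def histogram_pdf(values, n_bins=10):
    """Probability density of values in [0, 1] on n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        idx = min(int(v * n_bins), n_bins - 1)  # a value of exactly 1 goes to the last bin
        counts[idx] += 1
    width = 1.0 / n_bins
    return [c / (len(values) * width) for c in counts]


def rmse(f_beta, f_gamma):
    """Equation (1): root mean square difference between two binned PDFs."""
    n = len(f_beta)
    return math.sqrt(sum((b - g) ** 2 for b, g in zip(f_beta, f_gamma)) / n)


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d
```

The RMSE acts on binned densities and therefore depends on the bin choice, whereas the K-S statistic uses the unbinned β and γ values directly.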

3.2.2 Simulation Results for Different Functional Models

We first consider the simplest case: a uniform distribution of infall candidates within molecular cloud clumps. Figure 7 [FIGURE:7] shows the probability density functions of β values for both P1 and P2 datasets and the γ values obtained from the uniform distribution model. Significant deviations exist between the β and γ probability density functions for both datasets, with γ values distributed more heavily toward larger values than β. This suggests that the true distribution of infall candidates in clumps is not approximately uniform; instead, the number density more likely decreases from the clump center toward the boundary.
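The shift of γ toward large values in the uniform case can also be checked analytically. The short derivation below is a sketch under the same assumption of a unit sphere and is not part of the simulation itself. A thin annulus at projected distance γ intersects the sphere along a column of length $2\sqrt{1-\gamma^{2}}$, so for a uniform density

$$
p(\gamma)\,{\rm d}\gamma = \frac{2\pi\gamma\,{\rm d}\gamma \cdot 2\sqrt{1-\gamma^{2}}}{4\pi/3} = 3\gamma\sqrt{1-\gamma^{2}}\,{\rm d}\gamma, \qquad \langle\gamma\rangle = \int_{0}^{1} 3\gamma^{2}\sqrt{1-\gamma^{2}}\,{\rm d}\gamma = \frac{3\pi}{16} \approx 0.59 .
$$

This density peaks at $\gamma = 1/\sqrt{2} \approx 0.71$, whereas the observed mean β values are 0.40 (P1) and 0.44 (P2). For a general radial profile n(r), the same argument gives

$$
p(\gamma) \propto \gamma \int_{0}^{\sqrt{1-\gamma^{2}}} n\!\left(\sqrt{\gamma^{2}+z^{2}}\right) {\rm d}z ,
$$

where z is the coordinate along the line of sight.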

We therefore focus on three common decay functions: power-law decay (n ∝ r⁻ᵃ, where n is number density, r is the normalized center distance, and a is a positive coefficient), exponential decay (n ∝ e⁻ᵃʳ), and Gaussian decay (n ∝ e⁻ᵃʳ²). To determine the optimal a value for each model, we sample a series of a values at intervals of 0.1 within a certain range, calculate the RMSE between the γ probability density function from simulated scattering and the β probability density function, and identify the a value that minimizes RMSE as the best-fitting decay coefficient for that model. Because the grid spacing is 0.1, the optimal a obtained by this method is determined to within 0.1.
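The search over a can be sketched as follows. To keep the example reproducible, the model γ distribution is obtained here by numerically integrating the line-of-sight projection of n(r) through a unit sphere rather than by random scattering; this is a substitution made for illustration, not the Monte Carlo procedure described above. The bin count, the grid range of 1.0 to 6.0, and the synthetic "observed" histogram are likewise assumptions.

```python
import math


def projected_pdf(density, n_bins=10, n_sub=10, n_z=100):
    """Binned PDF of gamma for a unit sphere with radial number density `density(r)`,
    using midpoint integration of p(gamma) ~ gamma * integral of n along the line of sight."""
    width = 1.0 / n_bins
    mass = []
    for b in range(n_bins):
        total = 0.0
        for s in range(n_sub):
            g = (b + (s + 0.5) / n_sub) * width      # midpoint in gamma within the bin
            z_max = math.sqrt(1.0 - g * g)           # half-length of the column inside the sphere
            dz = z_max / n_z
            column = sum(
                density(math.sqrt(g * g + ((k + 0.5) * dz) ** 2)) for k in range(n_z)
            ) * dz
            total += g * column * (width / n_sub)
        mass.append(total)
    norm = sum(mass) * width
    return [m / norm for m in mass]


def rmse(f_beta, f_gamma):
    return math.sqrt(sum((b - g) ** 2 for b, g in zip(f_beta, f_gamma)) / len(f_beta))


def power_law(a):
    return lambda r: r ** (-a)


def exponential(a):
    return lambda r: math.exp(-a * r)


def gaussian(a):
    return lambda r: math.exp(-a * r * r)


def best_coefficient(f_beta, model, a_values):
    """Return (minimum RMSE, best a) for one model family over a grid of a values."""
    return min((rmse(f_beta, projected_pdf(model(a), len(f_beta))), a) for a in a_values)


a_grid = [i / 10 for i in range(10, 61)]         # a = 1.0, 1.1, ..., 6.0
f_obs = projected_pdf(gaussian(4.5))             # synthetic stand-in for the observed beta PDF
best_rmse, best_a = best_coefficient(f_obs, gaussian, a_grid)
```

With the synthetic input the search recovers the coefficient used to generate it; in practice `f_obs` would be replaced by the binned β distribution of the matched candidates, and the same call would be repeated for `power_law` and `exponential`.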

Figure 8 [FIGURE:8] plots the RMSE between the γ probability density function from the power-law decay model and the β probability density function as a function of a, with left and right panels showing results for datasets P1 and P2, respectively. For P1, RMSE reaches its minimum (≈0.39) at a ≈ 1.7, while for P2 it reaches its minimum (≈0.29) at a ≈ 1.5. Thus, the best-fitting power-law models for the P1 and P2 datasets in their respective molecular cloud clumps are n ∝ r⁻¹·⁷ and n ∝ r⁻¹·⁵. Figure 9 [FIGURE:9] shows the best-fitting results for the power-law decay model for both datasets. Although power-law decay fits the observed β distribution better than a uniform distribution, the fit remains unsatisfactory, prompting us to proceed with exponential decay modeling.

Figure 10 [FIGURE:10] presents the RMSE between the γ probability density function from the exponential decay model and the β probability density function as a function of a, with left and right panels for datasets P1 and P2, respectively. For P1, RMSE reaches its minimum (≈0.165) at a ≈ 4.7, while for P2 it reaches its minimum (≈0.172) at a ≈ 3.7. We therefore infer that the best-fitting exponential models for the P1 and P2 datasets are n ∝ e⁻⁴·⁷ʳ and n ∝ e⁻³·⁷ʳ, respectively. Figure 11 [FIGURE:11] shows the best-fitting results for the exponential decay model for both datasets. The exponential decay model provides a better fit to the observed β probability density function than the power-law decay model. We then perform K-S tests comparing the β and γ values for both datasets. For P1, the P-value is 9.96×10⁻⁴, far below 0.05, so the hypothesis that the β and γ values come from the same distribution is rejected. For P2, the P-value is 0.42, greater than 0.05, so the same hypothesis cannot be rejected. Thus, the exponential decay model does not fit P1 well, and the P2 sample is too small for its K-S result to be decisive, which motivates testing a Gaussian decay model.

Figure 12 [FIGURE:12] shows the RMSE between the γ probability density function from the Gaussian decay model and the β probability density function as a function of a, with left and right panels for datasets P1 and P2, respectively. For P1, RMSE reaches its minimum (≈0.113) at a ≈ 4.5, while for P2 it reaches its minimum (≈0.134) at a ≈ 3.2. Therefore, the best-fitting Gaussian decay models for the P1 and P2 datasets in their respective molecular cloud clumps are n ∝ e⁻⁴·⁵ʳ² and n ∝ e⁻³·²ʳ². Figure 13 [FIGURE:13] presents the best-fitting results for the Gaussian decay model for both datasets. The Gaussian decay model fits the β probability density function better than the exponential decay model. The K-S tests yield P-values of 0.18 for P1 and 0.74 for P2, both greater than 0.05, so for neither dataset can we reject the hypothesis that the β and γ values come from the same distribution. Of the models tested, the Gaussian decay model therefore gives the best description of the distribution of infall candidates in molecular cloud clumps. However, since actual molecular cloud clumps are not perfect spheres, some discrepancies between the observed distribution and the spherical simulation results are inevitable.

4 Discussion

Comparisons between the γ probability density functions from different distribution models and the β probability density function reflecting the true distribution of infall candidates in clumps reveal that this sample of infall candidates shows a number density that gradually decreases from clump center to boundary, approximating a Gaussian decay with normalized center distance. Infall candidates traced by 13CO (J=1⟶0) exhibit a faster decay trend than those traced by C18O (J=1⟶0). However, since the sample size of candidates selected through C18O lines is an order of magnitude smaller than that through 13CO, we caution that interpretation of the Gaussian decay parameter a should be approached carefully. Furthermore, because CO (J=1⟶0) is not the optimal line for tracing infall motions and our spectral line data have limited spatial resolution, this sample can only be regarded as infall candidates. Nevertheless, since these infall candidates were identified based solely on spectral line profile characteristics without a priori conditions such as known star-forming regions or HII regions, we believe their distribution can represent, to some extent, the spatial distribution of true infall sources. More precise determination of infall source distributions in molecular cloud clumps requires further observational studies, including additional selection (e.g., HCO+ observations for confirmation) and higher spatial resolution mapping observations.

5 Conclusions

This study uses the sample of infall candidates obtained from 12CO, 13CO, and C18O (J=1⟶0) data of the Milky Way Imaging Scroll Painting survey to investigate where infall motions occur within molecular cloud clumps. Starting from the positions of the candidates in the projected plane of their host clumps, we use Monte Carlo methods to infer the number density distribution of infall locations inside the clumps.

We employed the molecular cloud clump detection algorithm FacetClumps to extract molecular cloud clumps in the fields of 3533 infall candidates and cross-matched the extracted clumps with the infall candidates. This yielded 1836 samples with uniquely matched clumps, including 1712 from the 12CO/13CO line pair and 124 from the 13CO/C18O line pair, representing 51.43% and 60.78% of their respective total samples. To obtain clear positional information, we restricted our study to these uniquely matched samples.

Using Monte Carlo methods, we found that both datasets exhibit spatial number density distributions following Gaussian decay, with best-fitting decay function models of n ∝ e⁻⁴·⁵ʳ² and n ∝ e⁻³·²ʳ², respectively. The results demonstrate that infall motions are more likely to occur in the central regions of molecular cloud clumps and less likely at the edges.

References

  1. Wilson R W, Jefferts K B, Penzias A A. ApJ, 1970, 161: L43
  2. Shu F H, Adams F C, Lizano S. ARA&A, 1987, 25: 23
  3. Hayashi C, Hōshi R, Sugimoto D. Progress of Theoretical Physics Supplement, 1962, 22: 1
  4. Hosokawa T, Omukai K. ApJ, 2009, 691: 823
  5. Tan J C, Beltrán M T, Caselli P, et al. Protostars and Planets, 2014, 5: 149
  6. McKee C F, Tan J C. ApJ, 2003, 585: 850
  7. Bonnell I A, Bate M R, Clarke C J, et al. MNRAS, 2001, 323: 785
  8. Bachiller R. ARA&A, 1996, 34: 111
  9. Bisbas T G, Wünsch R, Whitworth A P, et al. ApJ, 2011, 736: 142
  10. Zhang S, Wang K, Liu T, et al. MNRAS, 2023, 520: 322
  11. 徐小云, 陈学鹏, 张世瑜, 等. 天文学报, 2024, 65: 45
  12. Evans I N J. ARA&A, 1999, 37: 311
  13. Mardones D, Myers P C, Tafalla M, et al. ApJ, 1997, 489: 719
  14. Gregersen E M, Evans I N J, Mardones D, et al. ApJ, 2000, 533: 440
  15. Chen X, Shen Z Q, Li J J, et al. ApJ, 2010, 710: 150
  16. Yang Y, Jiang Z, Chen Z, et al. ApJ, 2021, 922: 144
  17. Yang Y, Chen X, Jiang Z, et al. ApJ, 2023, 955: 154
  18. Su Y, Yang J, Zhang S, et al. ApJS, 2019, 240: 9
  19. Jiang Z, Zhang S, Chen Z, et al. RAA, 2023, 23: 075001
  20. Yang Y, Jiang Z B, Chen Z W, et al. RAA, 2020, 20:
  21. Yu S, Jiang Z, Chen Z, et al. AJ, 2024, 168: 52
  22. Williams J P, de Geus E J, Blitz L. ApJ, 1994, 428: 693
  23. Stutzki J, Guesten R. ApJ, 1990, 356: 513
  24. Berry D S. A&C, 2015, 10: 22
  25. Luo X, Zheng S, Huang Y, et al. RAA, 2022, 22: 015003
  26. Jiang Y, Zheng S, Jiang Z, et al. A&C, 2022, 40: 100613
  27. Jiang Y, Chen Z, Zheng S, et al. ApJS, 2023, 267: 32

¹ https://www.scidb.cn/s/VBji22
