Abstract
Wide binary stars are systems composed of two widely separated celestial bodies orbiting a common center of mass, with their physical projected separation being the most direct observable quantity. Using high-purity wide binary stars within 1 kpc of the solar neighborhood as a test sample, anomalous γ-value phenomena caused by selection effects were identified (showing increasing deviation from the normal value of γ = -1.5 across different distance shells). To address this issue, we quantified the strength of selection effects in each distance shell, recalibrated the selection function accordingly, and employed three different power-law models to perform parameter fitting for the intrinsic distribution of projected separations through Bayesian statistics and Markov Chain Monte Carlo methods. The results demonstrate that after applying the improved selection function, the power-law index γ for all three mathematical models becomes stable and normal across all distance shells. These findings provide a reference basis for correcting model parameters of the projected separation distribution of wide binary stars affected by selection effects, and are of significant importance for understanding the formation and evolution of wide binary stars, as well as for other related fields.
PROGRESS IN ASTRONOMY Vol. 43, No. 3 Sept., 2025 doi: 10.3969/j.issn.1000-8349.2025.03.04
Study on Selection Effects of Wide Binary Stars
LIU Di¹, ZHANG Peng¹,², TIAN Haijun³, LIU Gaochao¹,², YANG Xiangming¹, XIONG Zhuang¹
(1. College of Science, China Three Gorges University, Yichang 443002, China; 2. Center for Astronomy and Space Sciences, China Three Gorges University, Yichang 443002, China; 3. College of Science, Hangzhou Dianzi University, Hangzhou 310018, China)
Abstract
Wide binary systems consist of two widely separated celestial bodies orbiting a common barycenter, with their physical projected separation being the most direct observable quantity. Using a high-purity sample of wide binaries within 1 kpc of the solar neighborhood as a test sample, we discovered anomalous γ values caused by selection effects that increasingly deviate from the normal value of γ = -1.5 across different distance shells. To address this issue, we quantified the strength of selection effects in each distance shell, recalibrated the selection function accordingly, and employed three different power-law models to fit the intrinsic distribution of projected separations using Bayesian statistics and Markov Chain Monte Carlo methods. The results demonstrate that after applying the improved selection function, the power-law indices γ from all three mathematical models stabilize and approach normal values in each distance shell. These findings provide a reference basis for correcting model parameters of projected separation distributions affected by selection effects in wide binaries, which is significant for understanding the formation and evolution of wide binaries and related fields.
Keywords: binary; wide binary systems; selection effect; selection function
1 Introduction
Binary star systems consist of two celestial bodies born at approximately the same time with identical initial chemical compositions, gravitationally bound to orbit their common center of mass. Approximately half of Sun-like stars belong to binary systems, with orbital periods ranging from decades to hundreds of millions of years. A small fraction of binaries have small separations (s ≲ 10 AU), where mass exchange between components leads to complex evolutionary pathways, while most binaries have large separations, reaching up to 2 × 10⁴ AU (≈0.1 pc), allowing components to evolve independently. Systems with widely separated components and weak mutual interactions are known as wide binaries.
The formation of binary stars remains a central topic in astronomical research, with different theoretical models favoring specific formation mechanisms depending on projected separation, though no consensus has been reached. Binaries with small projected separations (s ≲ 100 AU) may form through fragmentation of a common circumstellar disk (disk fragmentation mechanism). Wide binary formation is more complex: systems with separations less than 10³ AU may form via turbulent fragmentation of molecular cloud cores, while those with larger separations (s ≳ 10³ AU or 10⁴ AU) may originate from random pairing of dissolving cluster members, dynamical evolution of unstable triple systems, or chance encounters between neighboring stellar cores.
Wide binaries represent one of the simplest, smallest, and most fragile astronomical systems. Studying their projected separation distribution enables probing variations in the Galactic gravitational potential. With extremely low binding energy, wide binary orbits are highly vulnerable to disruption by external gravitational perturbations from other stars, giant molecular clouds, and Galactic tides, or by internal evolutionary processes. Since disruption probability increases with separation, the separation distribution of an evolved wide binary population should exhibit a break, corresponding to a critical break value s_br in the power-law model parameters. Conversely, the location of this break can be used to assess differences in gravitational potential across regions of the host galaxy. Thus, wide binaries serve as powerful probes of Galactic gravitational potential on small scales. By analyzing the separation distribution of halo wide binaries, we can investigate the existence and properties of massive compact halo objects (MACHOs). For instance, Yoo et al. predicted a break in the angular separation power-law distribution of halo wide binaries in the presence of MACHOs, with the break location determined by MACHO mass and density. Tian et al. used total tangential velocity (V_⊥,tot) as a tracer of age and metallicity to divide wide binary samples into young disk-like, old halo-like, and intermediate sub-samples, revealing different break characteristics in projected separation. In the Milky Way, wide binary systems cannot maintain stable gravitational binding beyond separations sensitive to the Galactic tidal field. As extremely weakly bound systems, wide binaries can also test modified gravity theories in ultra-low gravitational environments and provide an ideal setting for probing dark matter mass distribution in ultra-faint dwarf galaxies dominated by dark matter, enabling tests of various dark matter models on small galactic scales. 
Additionally, wide binary disruption may result from internal stellar evolution, where the "kick" imparted during supernova explosions forming neutron stars or black holes can destroy these fragile systems.
Astronomers typically study binary projected separation distributions using the power-law model p(s) ∝ s^γ (γ ≈ -1.5). Since projected separation (s) is approximately linearly related to the orbital semi-major axis (a), with s ≈ 0.978a, the two quantities share similar statistical properties. Lépine and Bongiorno found that wide binary projected separation distributions follow f(s)ds ∝ s^{-l}ds (l = 1.6^{+0.1}_{-0.1}, s ≳ 4000 AU) based on Hipparcos data. Chanamé and Gould fitted angular separation distributions of disk-like and halo-like wide binaries, obtaining best-fit power-law indices of -1.67^{+0.07} in the range 3.5″ < Δθ < 900″. El-Badry and Rix characterized main sequence-main sequence (MS-MS) wide binary projected separation distributions with γ ≈ -1.6. Tian et al. used a smoothly broken power-law model to fit intrinsic distributions of disk-like, halo-like, and intermediate wide binary samples, obtaining power-law indices γ of -1.51^{+0.03}_{-0.07} and -1.55^{+0.10}, and -1.56^{+0.03}_{-0.04} and -1.55^{+0.05}.
Due to limitations in Gaia's angular resolution and binary search algorithms, our wide binary catalog is incomplete. El-Badry and Rix found that incompleteness primarily arises from blending of the two stellar components, particularly severe when brightness differences are large or angular separations are small. Using a high-purity wide binary sample constructed from the El-Badry 2021 binary catalog, we fitted the intrinsic projected separation distribution of subsamples in different distance shells using three mathematical models: single power law (SPL), double power law (DPL), and smoothly broken power law (SBPL). We discovered anomalous γ values caused by selection effects, leading to biased inferences of the intrinsic distribution. To address this, we recalibrated the selection function in each distance shell and applied it to the fitting process to obtain more accurate intrinsic distributions.
Section 2 describes additional filtering criteria used to construct the high-purity wide binary sample. Section 3 discusses selection effects in binary samples. Section 4 explains how we improved the selection function to infer intrinsic separation distributions. Section 5 presents fitting results from the three power-law models and compares γ (or γ₁) values before and after calibration. Finally, we summarize our findings and outline future research directions.
2 Data and Selection Process
Our dataset utilizes the catalog of wide binaries identified in Gaia Early Data Release 3 (Gaia EDR3) from the Gaia Space Telescope. El-Badry et al. first used this catalog to construct an initial sample containing 1,817,594 wide binary candidates within approximately 1 kpc of the Sun, with physical projected separations ranging from tens of AU to 1 pc.
The wide binary catalog contains numerous false binaries—systems where two stars are artificially paired due to randomness and chance during the binary identification process. Stars from previously merged galaxies can maintain coherent motions for several Galactic dynamical timescales as they gradually dissolve. Many sources in the initial binary sample can be identified with known open clusters and moving groups, or have companions selected from nearby clusters, satisfying binary selection criteria but likely not being genuine binaries, resulting in high contamination rates. Therefore, we adopted methods similar to El-Badry et al. and Tian et al. to exclude these contaminants and performed additional selection procedures to obtain a purer wide binary sample. We briefly summarize the selection steps below.
2.1 Selection Criteria
First, we applied stricter parameter constraints to the astrometric and photometric information of both components in wide binary systems, eliminating numerous false sources in non-physical regions of the color-magnitude diagram along with a few genuine sources.
(1) Both stars in binary candidate pairs possess five-parameter astrometric solutions and successful color index G_BP - G_RP measurements, with well-behaved astrometric fits whose unit weight error satisfies √(χ²/(ν' - 5)) < 1.2 × max(1, exp(-0.2(G - 19.5))), where χ² and ν' correspond to parameters astrometric_chi2_al and astrometric_n_good_obs_al in the Gaia archive.
(2) Since the magnitudes G_BP and G_RP are calculated by integrating low-resolution spectra, which are more dispersed than the G-band point spread function, we can assess contamination from nearby sources using the BP/RP flux excess factor (phot_bp_rp_excess_factor), which compares the summed BP and RP fluxes with the G-band flux. To ensure photometric measurements are minimally contaminated by nearby sources, both stars satisfy 1.0 + 0.015(G_BP - G_RP)² < phot_bp_rp_excess_factor < 1.3 + 0.06(G_BP - G_RP)².
(3) Both stars in the binary system have high signal-to-noise photometry: uncertainties in mean G-band flux for primary and secondary components are both less than 2% (phot_g_mean_flux_over_error > 50). Uncertainties in the mean BP and RP fluxes are less than 5% for the primary and less than 10% for the secondary, requiring phot_bp_mean_flux_over_error > 20 (>10 for the secondary) and phot_rp_mean_flux_over_error > 20 (>10 for the secondary).
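Criteria (2) and (3) can be sketched as a boolean mask over catalog columns. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name and argument names are hypothetical, and the inputs are assumed to be arrays holding G_BP - G_RP, phot_bp_rp_excess_factor, and the three flux_over_error columns for one component.

```python
import numpy as np

def passes_photometric_cuts(bp_rp, excess_factor, g_snr, bp_snr, rp_snr,
                            is_primary=True):
    """Boolean mask for criteria (2) and (3); names are illustrative only."""
    bp_rp = np.asarray(bp_rp, dtype=float)
    excess_factor = np.asarray(excess_factor, dtype=float)
    g_snr = np.asarray(g_snr, dtype=float)
    bp_snr = np.asarray(bp_snr, dtype=float)
    rp_snr = np.asarray(rp_snr, dtype=float)

    # Criterion (2): excess factor inside the colour-dependent band.
    lower = 1.0 + 0.015 * bp_rp**2
    upper = 1.3 + 0.06 * bp_rp**2
    uncontaminated = (excess_factor > lower) & (excess_factor < upper)

    # Criterion (3): G flux S/N > 50; BP and RP flux S/N > 20 for the
    # primary, > 10 for the secondary.
    min_colour_snr = 20.0 if is_primary else 10.0
    high_snr = (g_snr > 50.0) & (bp_snr > min_colour_snr) & (rp_snr > min_colour_snr)

    return uncontaminated & high_snr
```

A pair would be kept only if the primary passes with `is_primary=True` and the secondary passes with `is_primary=False`.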
Second, we defined nearby binaries as systems within 1° on the sky, ±3 mas·yr⁻¹ in proper motion coordinates, and ±5 pc in distance (1/ϖ). We calculated the number of nearby binaries (N) for each candidate in position-parallax-proper motion space, accepting only candidates with N < 2. This process removed 565,668 wide binary systems associated with clusters, moving groups, and higher-order multiples. Additionally, to obtain a purer sample, we further constrained Δμ ≤ Δμ_or + 1.0σ_Δμ and σ_Δμ < 0.12, where Δμ is the proper motion difference, Δμ_or is the maximum proper motion difference sustainable for a circular orbit of a 5 M⊙ binary, and σ_Δμ is the uncertainty in proper motion difference.
Finally, we defined stars satisfying M_G < 2.75(G_BP - G_RP) + 5.75 as "main sequence (MS) stars" and those satisfying M_G > 3.25(G_BP - G_RP) + 9.63 as "white dwarfs (WD)," where absolute magnitude M_G = G + 5 log(ϖ/mas) - 10. To eliminate effects of internal orbital evolution, we further excluded binaries containing white dwarfs (WD-MS and WD-WD) from the high-purity sample, selecting only stable MS-MS binaries with longer stellar lifetimes.
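The classification can be written compactly. The sketch below assumes that white dwarfs lie on the faint side of their boundary line (M_G larger than 3.25(G_BP - G_RP) + 9.63) and ignores extinction; the function names and the "unclassified" label are hypothetical.

```python
import numpy as np

def absolute_g_mag(g_mag, parallax_mas):
    """M_G = G + 5 log10(parallax / mas) - 10 (no extinction correction)."""
    return np.asarray(g_mag, dtype=float) + 5.0 * np.log10(parallax_mas) - 10.0

def classify_component(g_mag, bp_rp, parallax_mas):
    """Label each star 'MS', 'WD', or 'unclassified' from its CMD position."""
    bp_rp = np.asarray(bp_rp, dtype=float)
    m_g = absolute_g_mag(g_mag, np.asarray(parallax_mas, dtype=float))
    is_ms = m_g < 2.75 * bp_rp + 5.75
    # Assumption: white dwarfs are fainter than (below) this line.
    is_wd = m_g > 3.25 * bp_rp + 9.63
    return np.where(is_ms, "MS", np.where(is_wd, "WD", "unclassified"))

labels = classify_component(g_mag=[10.0, 15.0, 8.0],
                            bp_rp=[0.82, 0.0, 0.5],
                            parallax_mas=[10.0, 20.0, 100.0])
```

The two regions cannot overlap for realistic colors, because the white dwarf line sits several magnitudes below the main-sequence line.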
These selection criteria yielded a high-purity catalog of 144,458 wide binary candidates, containing no resolved higher-order multiple systems.
2.2 Overview and Characteristics of the Wide Binary Catalog
A total of 144,458 wide binary candidates passed the selection criteria (Section 2.1), comprising 142,995 MS-MS binaries, 1,412 WD-MS binaries, and 51 WD-WD binaries. We note that the classification criteria are not strictly definitive, particularly for WD-MS binaries where some samples near boundaries may be misclassified. The MS category primarily serves to exclude WDs, though it includes small fractions of giants, subgiants, and pre-main-sequence stars. The BP/RP spectral data acquisition window measures 2.1″ × 3.5″, so most sources within 2″ with relatively bright companions lack BP/RP spectral measurements and cannot be classified as WD or MS using Gaia data alone; binaries containing such stars were excluded. Under the assumption of equal distances for both components, we used the brighter primary's parallax (typically more precise) to calculate absolute magnitude M_G. For extinction, we assigned appropriate values to each component using the Galactic extinction catalog constructed by Leike and Enßlin.
[FIGURE:1] shows the sky distribution of the high-purity wide binary sample in Galactic longitude-latitude space (l-b). Gaia's scanning pattern for its all-sky survey leaves a clear imprint, as sources with well-constrained astrometric and photometric parameters are typically located in frequently visited sky regions, making the binary distribution non-uniform in the l-b plane. Binary candidate density is also higher near the Galactic plane, and this concentration was more pronounced before sources with numerous phase-space neighbors were removed.
[FIGURE:2] presents color-magnitude diagrams (CMDs) for primary and secondary stars in the high-purity sample. A secondary sequence above the main sequence is clearly visible in both CMDs, indicating the presence of hierarchical triple systems composed of unresolved close binaries.
[FIGURE:3] displays distributions of two fundamental parameters: primary apparent magnitude and magnitude difference. The median apparent magnitudes for MS-MS binaries and MS stars are G = 13.9 mag and G = 15.2 mag respectively, with 67.4% and 67.8% of their distributions falling in ranges of 11.5–15.9 mag and 12.5–16.9 mag. This reflects that stringent requirements on parallax, proper motion, and photometric measurements primarily include relatively bright stars. Apparent magnitude distributions differ significantly across distance shell subsamples (Figure 3a).
Figure 3b shows the magnitude difference distribution (ΔG = |G₁ - G₂|). The median magnitude difference in the wide binary catalog is ΔG = 1.71 mag. Significant differences exist among distance shell subsamples, which is crucial for inferring intrinsic separation distributions because Gaia's sensitivity to companions at a given angular separation varies with magnitude difference. This indicates that subsamples in different distance shells are affected by selection effects to varying degrees at small separations, motivating the differential calibration methods in Section 4.2 to correct this incompleteness.
[FIGURE:4] shows the number density distribution of wide binaries with distance. The high-purity sample (Figure 4b) was obtained from the initial sample (Figure 4a) through the selection criteria in Section 2.1, with median distance d = 316.3 pc. Approximately 67.8% of wide binaries are distributed in the range 151.6 ≲ d ≲ 599.7 pc, with about 83.8% within 600 pc. For both initial and high-purity samples, we counted distance-number density variations across 10 distance shells in 100 pc intervals. For subsamples extracted from the high-purity sample, we further counted variations in 10 pc intervals.
Due to selection effects, the number density distributions in Figures 4a and 4b both decrease with increasing distance shell, with the latter showing a steeper decline because of stricter selection criteria. Figures 4c–l display distance-number density distributions for high-purity subsamples in each distance shell, showing almost no variation and approximately uniform distribution in log space across shells. The 0 < d < 100 pc subsample shows the highest values and largest decrease.
2.3 False Binaries and Contamination Rate
Before the Gaia mission, precise astrometric data were unavailable, making construction of pure binary catalogs difficult. The first systematic binary catalog was built under the assumption that close binaries were false—a hypothesis proven wrong two decades later. Binary search methods evolved from low-dimensional phase-space searches using position or proper motion to high-dimensional searches incorporating parallax and radial velocity (RV). Gaia data releases have enabled construction of high-purity, large-sample binary catalogs. The contamination rate from false binaries depends on multiple factors; for wide binaries with small parallaxes and proper motions, false binary possibilities cannot be completely eliminated. Contamination increases with projected separation, especially dominating when s > 30,000 AU. El-Badry et al. constructed a seven-dimensional parameter space including angular separation, distance, parallax difference uncertainty, local sky density, tangential velocity, and signal-to-noise ratios of parallax and proper motion differences, using kernel density estimation to calculate contamination rate R. Our high-purity sample is limited to R < 0.1.
Contamination can also be indirectly quantified by comparing absolute radial velocity differences (ΔRV) between primary and secondary stars. Among 5,579 wide binary candidates with accurate RV measurements (σ_RV < 3 km·s⁻¹), we can effectively check contamination rates. False binaries typically show significant RV differences between components. We identify potential contaminants as systems satisfying ΔRV/σ_RV > 5 and ΔRV > 10 km·s⁻¹.
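The two-part criterion is straightforward to apply to a table of RV pairs. The sketch below is illustrative: the function name is hypothetical, and `sigma` stands for whichever RV uncertainty enters the ratio, which is an assumption here.

```python
import numpy as np

def flag_rv_contaminants(rv_primary, rv_secondary, sigma,
                         ratio_min=5.0, delta_min_kms=10.0):
    """Flag pairs with delta_RV / sigma > 5 and delta_RV > 10 km/s."""
    delta_rv = np.abs(np.asarray(rv_primary, dtype=float)
                      - np.asarray(rv_secondary, dtype=float))
    sigma = np.asarray(sigma, dtype=float)
    return (delta_rv / sigma > ratio_min) & (delta_rv > delta_min_kms)

# Toy values in km/s, not catalog data.
flags = flag_rv_contaminants([10.0, 50.0, 0.0], [12.0, 20.0, 11.0], [1.0, 2.0, 3.0])
contamination_fraction = flags.mean()
```

Requiring both conditions avoids flagging pairs whose large ΔRV is simply a consequence of large measurement uncertainty.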
[FIGURE:5] shows RV comparisons and ΔRV versus projected separation distributions. Most binaries with large ΔRV also have above-average σ_ΔRV, with only 16 pairs meeting false binary criteria (ΔRV/σ_RV > 5 and ΔRV > 10 km·s⁻¹), implying a contamination rate of about 0.15%. Most contaminants are distributed in the range 3.0 ≲ log(s/AU) ≲ 4.2. However, RV measurements provide contamination estimates only for systems in which both components are relatively bright, so the true contamination rate for the entire sample is likely higher.
2.4 Observable Separation Distribution
[FIGURE:6] shows distributions of angular separation θ and physical projected separation s. Due to Gaia's angular resolution limit, no binaries are detected at θ ≲ 1.5″. With loose selection criteria, the separation distribution shows a bimodal pattern with peaks at log(θ) ≈ 0.5 and log(θ) ≈ 2.5 in angular separation, corresponding to log(s) ≈ 3 and log(s) ≈ 5 in projected separation (black histograms). However, the large-separation peak is entirely due to false binaries rather than genuine systems: it shifts to larger separations and eventually disappears as the criteria become stricter. Dhital et al. and Oelkers et al. found bimodal projected separation distributions for wide binaries, suggesting the larger-separation population contains binaries formed through different mechanisms.
For the initial sample, contaminants dominate at large separations (θ ≈ 4.2″, s ≳ 0.05 pc), with the separation distribution peak entirely caused by false binaries. Consistent with El-Badry and Rix and Tian et al., MS-MS binaries do not follow Öpik's law (uniform distribution in log projected separation) at any s ≳ 500 AU. We divided the high-purity sample into 10 distance shell subsamples. Notably, angular separation distributions for these subsamples show different truncation at various θ (decreasing with distance shell), while raw projected separation distributions show corresponding truncation at s ≲ 10³ AU (increasing with distance shell) and a clear cutoff at s ≈ 10⁵ AU. This significant bimodal distribution is caused by randomly paired false binaries.
3 Selection Effects
Selection effects are a pervasive challenge in astronomical research, arising from instrumental limitations, data processing methods, and detection techniques that cause certain objects to be more or less easily discovered or studied, potentially biasing results and distorting understanding of celestial distributions, properties, and evolution. Therefore, considering and calibrating selection effects is crucial.
Selection effects remain widespread and complex in binary research. Bright stars are more easily detected, while faint companions may be missed, creating a bias toward bright stars. Wide angular separations are more easily resolved, while small separations may cause stellar blending, making identification difficult. Binaries can be identified as complete systems only when angular separation is sufficiently large and both stars are bright enough for independent detection. However, at small angular separations, bright stars make detecting faint companions difficult at fixed separations. Thus, sample incompleteness results from combined effects of apparent magnitude, magnitude difference, and angular separation.
Due to Gaia observational limitations and binary selection criteria, our wide binary catalog is incomplete: it lacks spatially unresolved close binaries and systems where one component is too faint to pass selection criteria or be detected initially. The lack of binaries with small physical projected separations primarily results from incompleteness at small angular separations (see [FIGURE:7]).
Selection effect strength varies across distance shells, with different subsamples showing varying degrees of incompleteness (see Figures 3 and 6). Relatively, subsamples in smaller distance shells have more observables in apparent magnitude G, magnitude difference ΔG, angular separation θ, and projected separation s than those in larger distance shells. Additionally, different distance shell subsamples show varying degrees of truncation at small separations (s ≲ 10³ AU), increasing with distance. In summary, selection effects cause sample incompleteness and affect wide binary property distributions in different ways.
[FIGURE:7] shows the distribution of wide binaries in distance-projected separation space ((1/ϖ)-s). The typical resolution limit (θ ≈ 2″) is clearly visible, preventing detection of small-projected-separation binaries at large distances. Consequently, large-distance samples are dominated by wide-separation binaries. Interestingly, uniformly dividing the sample into 10 distance shells and fitting their projected separation distributions with three power-law models (SPL, DPL, and SBPL) reveals that all power-law indices γ (or γ₁) deviate from normal values and vary systematically with distance shell (see [TABLE:1]). This anomaly caused by sample incompleteness leads to severe bias in inferred intrinsic separation distributions, making calibration of incompleteness due to missing unresolved close binaries and large brightness differences crucial.
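The resolution boundary follows from the small-angle relation s[AU] ≈ θ[arcsec] × d[pc]. A short sketch, assuming the θ ≈ 2″ limit quoted above (the function name is hypothetical):

```python
import numpy as np

def min_resolvable_separation_au(distance_pc, theta_limit_arcsec=2.0):
    """Smallest projected separation (AU) resolvable at a given distance (pc)."""
    return theta_limit_arcsec * np.asarray(distance_pc, dtype=float)

# Outer edges of the ten 100 pc shells and the matching separation floor.
shell_edges_pc = np.arange(100.0, 1001.0, 100.0)
separation_floor_au = min_resolvable_separation_au(shell_edges_pc)
```

Under this assumption the floor rises from 200 AU at 100 pc to 2000 AU at 1 kpc, so the farthest shells contain essentially no binaries below log(s/AU) ≈ 3.3.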
4 Improved Method for Correcting Power-law Indices Affected by Selection Effects
Section 3 clearly identified severe incompleteness in wide binary separation distributions at small angular separations (θ < 10″), particularly for even smaller θ. Therefore, before modeling the intrinsic distribution of physical projected separations, we must account for selection effects to compensate for incompleteness at small separations and correct model power-law indices (γ or γ₁) affected by distance and other factors. This section references the same mathematical form of the selection function as El-Badry and Rix's empirical data-driven function, with improvements considering distance effects. We uniformly divided the total distance range 0–1 kpc into shells: 0–100 pc, 100–200 pc, 200–300 pc, 300–400 pc, 400–500 pc, 500–600 pc, 600–700 pc, 700–800 pc, 800–900 pc, and 900–1000 pc. In each of these 10 distance shells, we statistically analyzed the relationship between angular separation θ and completeness for different magnitude differences ΔG, then applied these patterns to the likelihood function for fitting projected separation distributions. Finally, we modeled the intrinsic distribution of physical projected separations using three power-law models.
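Assigning each binary to one of the ten shells can be sketched as follows, assuming distance is taken as 1/ϖ with ϖ in mas. The function name and the use of -1 for out-of-range systems are choices made for this illustration.

```python
import numpy as np

def assign_distance_shell(parallax_mas, shell_width_pc=100.0, max_distance_pc=1000.0):
    """Return (distance_pc, shell_index); shells are 0..9, -1 if outside 0-1 kpc."""
    distance_pc = 1000.0 / np.asarray(parallax_mas, dtype=float)
    edges = np.arange(0.0, max_distance_pc + shell_width_pc, shell_width_pc)
    # np.digitize returns 1 for the first bin, so subtract 1 for 0-based shells.
    shell = np.digitize(distance_pc, edges) - 1
    in_range = (distance_pc > 0.0) & (distance_pc < max_distance_pc)
    return distance_pc, np.where(in_range, shell, -1)

distance_pc, shell = assign_distance_shell([20.0, 4.0, 1.001, 0.5])
```

The completeness statistics and the power-law fits are then computed separately for each value of `shell`.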
4.1 Mathematical Models for Fitting Projected Separation Distribution
Wide binary projected separation distributions typically follow specific power-law patterns. In this section, we model the physical projected separation distribution using three mathematical models: SPL, DPL, and SBPL.
4.1.1 SPL
This model corresponds to a one-dimensional free parameter space m = (γ), containing only one free parameter γ. Despite its limitations, this model can calculate the best-fit γ value within a given physical projected separation range (full or partial), expressed as:
$$\phi(s) = \phi_0 s^\gamma \tag{1}$$
4.1.2 DPL
This model corresponds to a three-dimensional free parameter space m = (γ₁, γ₂, log(s_br/AU)), adding new power-law index γ₂ and break point s_br. γ₁ and γ₂ represent power-law indices for separations s ≲ s_br and s ≳ s_br respectively. The model is expressed as:
$$\phi(s) = \phi_0 \begin{cases} s^{\gamma_1}, & s \leq s_{br} \\ s^{\gamma_2}, & s > s_{br} \end{cases} \tag{2}$$
4.1.3 SBPL
This model corresponds to a four-dimensional free parameter space m = (γ₁, γ₂, log(s_br/AU), Λ). Building on the DPL model, it adds a smoothing factor Λ to quantify the steepness or smoothness of the transition between the two power laws. The model is expressed as:
$$\phi(s) = \phi_0 \left[1 + \left(\frac{s}{s_{br}}\right)^\Lambda\right]^{(\gamma_2 - \gamma_1)/\Lambda} s^{\gamma_1} \tag{3}$$
In formulas (1)–(3), the coefficient φ₀ is a normalization constant.
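The three models can be written as unnormalized functions of s. This is a sketch rather than the authors' code: function names are hypothetical, φ₀ is omitted, and in the DPL the outer branch is multiplied by s_br^(γ₁ - γ₂) so that the two branches meet at the break. That continuity factor is an assumption of this example and does not appear in the expression above.

```python
import numpy as np

def spl(s, gamma):
    """Single power law, unnormalized."""
    return np.asarray(s, dtype=float) ** gamma

def dpl(s, gamma1, gamma2, log_s_br):
    """Double power law; outer branch rescaled for continuity (assumption)."""
    s = np.asarray(s, dtype=float)
    s_br = 10.0 ** log_s_br
    inner = s ** gamma1
    outer = s_br ** (gamma1 - gamma2) * s ** gamma2
    return np.where(s <= s_br, inner, outer)

def sbpl(s, gamma1, gamma2, log_s_br, lam):
    """Smoothly broken power law; lam controls the sharpness of the break."""
    s = np.asarray(s, dtype=float)
    s_br = 10.0 ** log_s_br
    return (1.0 + (s / s_br) ** lam) ** ((gamma2 - gamma1) / lam) * s ** gamma1

def log_slope(model, s):
    """Numerical d ln(phi) / d ln(s), useful for checking limiting slopes."""
    return (np.log(model(s * 1.01)) - np.log(model(s))) / np.log(1.01)
```

Far below the break the SBPL slope tends to γ₁ and far above it tends to γ₂, which `log_slope` can confirm numerically for any parameter choice.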
4.2 Selection Function
Wide binary identification requires both stars to be spatially resolved and satisfy astrometric and photometric selection criteria (Section 2.1). Therefore, detection depends on angular separation and flux ratio. With large flux ratios, the secondary is more likely to be masked or contaminated by the primary's light at fixed angular separation; conversely, this occurs less frequently. This effect must be considered when inferring intrinsic projected separation distributions. Additionally, detection depends on apparent magnitudes: if either star is too faint, the system is difficult to detect. If undetected binaries share the same intrinsic separation distribution as detected ones, there is no impact on inference, requiring the separation distribution to be independent of distance and absolute magnitude. El-Badry 2018's maximum distance was 200 pc, where φ(s|m) does not vary significantly with distance. However, when the high-purity sample extends to nearly 1 kpc, distance-induced variations in power-law parameters, especially index γ (or γ₁), must be considered. As established in Section 3, sample incompleteness arises from selection effects related to magnitude difference ΔG and angular separation θ. To quantitatively study these effects, we adopt El-Badry and Rix's selection function f_ΔG(θ) to describe the probability of detecting a companion at given angular separation θ. In Section 4.3, we use this selection function to correct selection effects when deriving intrinsic separation distributions:
$$f_{\Delta G}(\theta) = \frac{1}{1 + (\theta/\theta_0)^{-\beta}} \tag{4}$$
The selection function f_ΔG(θ) depends on angular separation θ and magnitude difference ΔG (the absolute value of the G-band magnitude difference between the two components). Here θ₀ is the characteristic angular separation below which sensitivity drops below unity (f_ΔG = 0.5 at θ = θ₀), and β determines the rate of sensitivity decline when θ ≪ θ₀. We first performed statistical analysis for 10 different distance shells (see [FIGURE:8]), then estimated optimal θ₀ and β values for different ΔG through interpolation fitting of discrete ΔG values. Note that f_ΔG(θ) does not represent an absolute selection function but rather the relative ratio of detected binaries at angular separation θ to those detected at arbitrarily large separations.
[FIGURE:8] shows the relationship between completeness and angular separation θ for different magnitude differences ΔG in various distance shells. Using a method similar to Arenou et al., we assessed how Gaia photometric sensitivity to companions varies with angular separation. For each ΔG interval, we calculated the relative fraction of binaries passing selection criteria (Section 2.1) in angular separation bins of width ≈0.66″. Linear interpolation fitting of discrete ΔG values (using interval midpoints) estimated optimal θ₀ and β parameters for the selection function f_ΔG(θ). Generally, smaller ΔG and larger θ yield higher completeness.
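A minimal sketch of the selection function with linear interpolation in ΔG is given below. The node values for θ₀ and β are invented purely for illustration and are not the fitted values of any distance shell; in practice one such table would be built per shell.

```python
import numpy as np

# HYPOTHETICAL nodes for a single distance shell (illustrative numbers only).
DELTA_G_NODES = np.array([0.5, 1.5, 2.5, 3.5, 4.5])   # interval midpoints, mag
THETA0_NODES = np.array([1.0, 1.5, 2.5, 4.0, 6.0])    # arcsec
BETA_NODES = np.array([8.0, 7.0, 6.0, 5.0, 4.0])

def selection_function(theta_arcsec, delta_g):
    """f_dG(theta) = 1 / (1 + (theta/theta0)^(-beta)), theta0 and beta from dG."""
    theta = np.asarray(theta_arcsec, dtype=float)
    # Linear interpolation between the discrete dG nodes; values outside the
    # node range are clamped to the end values by np.interp.
    theta0 = np.interp(delta_g, DELTA_G_NODES, THETA0_NODES)
    beta = np.interp(delta_g, DELTA_G_NODES, BETA_NODES)
    return 1.0 / (1.0 + (theta / theta0) ** (-beta))
```

With these made-up nodes, completeness at θ = 2″ is close to 1 for ΔG = 0.5 mag but only a few percent for ΔG = 3.5 mag, reproducing the qualitative trend that smaller ΔG and larger θ yield higher completeness.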
4.3 Likelihood Function for Fitting Projected Separation Distribution
Consider an intrinsic binary projected separation distribution of the form φ(s|m) = dP/ds, where m is a set of free model parameters (the free parameter spaces in Section 4.1). For a set of binaries with projected separations s_i, the corresponding likelihood function is:
$$\mathcal{L} = p(\{s_i\}|\mathbf{m}) = \prod_i p(s_i|\mathbf{m}) \tag{5}$$
where p(s_i|m) is the detection probability for the i-th binary given model parameters m, calculated as:
$$p(s_i|\mathbf{m}) = \frac{\phi(s_i|\mathbf{m}) f_{\Delta G}(s_i|d_i)}{\int_{s_{\min}}^{s_{\max}} \phi(s|\mathbf{m}) f_{\Delta G}(s|d_i) \, ds} \tag{6}$$
The probability p(s_i|m) is proportional to the probability of finding a binary with distance d_i, magnitude difference ΔG, and projected separation s_i in the catalog. s_min and s_max are the minimum and maximum projected separations. φ(s|m) is normalized such that ∫φ(s|m)ds = 1, and the selection function f_ΔG(θ) = f_ΔG(s|d_i) is the function of angular separation given by equation (4), evaluated at θ = s/d_i. The denominator in equation (6) reflects the fraction of detectable binaries at distance d_i and magnitude difference ΔG, qualitatively indicating that at larger distances and magnitude differences, wide-separation binaries are easier to detect than close ones. When projected separation is very small, the probability in equation (6) approaches zero, making the value of s_min irrelevant; we uniformly set s_min = 10^{-2} AU. When s > 10^{5.20} AU, contamination is high, so we set s_max = 10^{5.20} AU (≈0.8 pc).
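The likelihood can be sketched for the SPL case as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, θ₀ and β are passed in per binary (they would come from the ΔG and distance shell of each system), and the normalization integral is evaluated with a simple trapezoidal rule on a logarithmic grid.

```python
import numpy as np

def selection(theta, theta0, beta):
    """f_dG(theta) = 1 / (1 + (theta/theta0)^(-beta))."""
    return 1.0 / (1.0 + (theta / theta0) ** (-beta))

def trapezoid(y, x):
    """Trapezoidal rule along the last axis."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

def log_likelihood_spl(gamma, s_au, d_pc, theta0, beta,
                       log_s_min=-2.0, log_s_max=5.2, n_grid=2000):
    """Sum over binaries of ln p(s_i | gamma) for phi(s) proportional to s^gamma."""
    s_au = np.asarray(s_au, dtype=float)
    d_pc = np.broadcast_to(np.asarray(d_pc, dtype=float), s_au.shape)
    theta0 = np.broadcast_to(np.asarray(theta0, dtype=float), s_au.shape)
    beta = np.broadcast_to(np.asarray(beta, dtype=float), s_au.shape)

    # Numerator: phi(s_i) * f(s_i | d_i), with theta[arcsec] = s[AU] / d[pc].
    numerator = s_au ** gamma * selection(s_au / d_pc, theta0, beta)

    # Denominator: one integral per binary, because f depends on d_i.
    grid = np.logspace(log_s_min, log_s_max, n_grid)
    integrand = grid[None, :] ** gamma * selection(
        grid[None, :] / d_pc[:, None], theta0[:, None], beta[:, None])
    denominator = trapezoid(integrand, grid)

    # phi_0 cancels between numerator and denominator.
    return float(np.sum(np.log(numerator) - np.log(denominator)))
```

Because the selection function suppresses the integrand at small s, the result is insensitive to the lower limit, which is the reason a very small s_min can be used. The per-binary integral grid costs memory proportional to the number of binaries times `n_grid`, so a real fit would likely cache or tabulate it.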
5 Fitting Results
Within the distance range 0–1 kpc, we uniformly divided the high-purity sample into 10 distance shell subsamples. Using the emcee library, we performed posterior sampling for three power-law models of physical projected separation, with all model parameters assigned broad, flat priors.
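To illustrate posterior sampling with a flat prior without depending on an external package, the sketch below swaps in a plain Metropolis sampler in place of emcee's ensemble sampler and fits γ to synthetic separations drawn from a truncated power law with no selection function. The prior bounds, step size, sample size, and separation limits are all assumptions chosen for this toy example.

```python
import numpy as np

def draw_power_law(n, gamma, s_min, s_max, rng):
    """Inverse-CDF draws from p(s) proportional to s^gamma on [s_min, s_max]."""
    g1 = gamma + 1.0  # requires gamma != -1
    u = rng.random(n)
    return (s_min ** g1 + u * (s_max ** g1 - s_min ** g1)) ** (1.0 / g1)

def log_posterior(gamma, log_s_sum, n, s_min, s_max):
    """Flat prior on -3 < gamma < -1.05 plus the truncated power-law likelihood."""
    if not (-3.0 < gamma < -1.05):
        return -np.inf
    g1 = gamma + 1.0
    log_norm = np.log(g1 / (s_max ** g1 - s_min ** g1))
    return n * log_norm + gamma * log_s_sum

def metropolis(log_post, start, n_steps, step, rng):
    """Random-walk Metropolis sampler for a single parameter."""
    chain = np.empty(n_steps)
    current, current_lp = start, log_post(start)
    for i in range(n_steps):
        proposal = current + step * rng.normal()
        proposal_lp = log_post(proposal)
        if np.log(rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain[i] = current
    return chain

rng = np.random.default_rng(0)
s_min, s_max = 1e2, 10 ** 5.2
s = draw_power_law(2000, -1.5, s_min, s_max, rng)
log_s_sum = np.log(s).sum()
chain = metropolis(lambda g: log_posterior(g, log_s_sum, s.size, s_min, s_max),
                   start=-1.2, n_steps=4000, step=0.02, rng=rng)
gamma_median = np.median(chain[1000:])  # discard burn-in
```

With emcee, the same `log_posterior` would be handed to an ensemble of walkers instead of the single chain used here, and the likelihood would include the selection function.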
This section presents constraints on model parameters for the intrinsic projected separation distribution of high-purity binary subsamples in different distance shells, visualized using corner plots to show relationships between parameters and their marginal probability distributions. In figures showing power-law index γ (or γ₁) versus distance, "before calibration" and "after calibration" refer to using θ₀ and β values estimated in El-Badry 2018 versus those corrected for different distance shells in the selection function f_ΔG.
Below we present model parameter variations and comparisons before and after selection function calibration for three power-law models (SPL, DPL, and SBPL), focusing on power-law index γ (or γ₁). For clearer visualization, we plot absolute values |γ| in some figures.
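The posterior sampling described at the start of this section can be illustrated with a toy example. The fits in this work use emcee, which implements an affine-invariant ensemble sampler; the sketch below deliberately swaps in a plain random-walk Metropolis sampler so that it runs with the standard library alone. It also assumes a pure power law with no selection function, a toy separation range, and a flat prior on -5 < γ < 0. All names and numbers are illustrative.

```python
import math
import random

S_LO, S_HI = 1e2, 1e4   # toy separation range in AU (assumed for the demo)

def draw_power_law(gamma, n, rng):
    """Inverse-CDF draws from p(s) ~ s^gamma on [S_LO, S_HI] (gamma != -1)."""
    a = gamma + 1.0
    lo, hi = S_LO ** a, S_HI ** a
    return [(lo + rng.random() * (hi - lo)) ** (1.0 / a) for _ in range(n)]

def make_log_posterior(seps):
    """Flat prior on -5 < gamma < 0 times the exact power-law likelihood."""
    n, sum_ln_s = len(seps), sum(math.log(s) for s in seps)
    def log_post(gamma):
        a = gamma + 1.0
        if not -5.0 < gamma < 0.0 or abs(a) < 1e-9:
            return -math.inf
        ln_norm = math.log((S_HI ** a - S_LO ** a) / a)
        return gamma * sum_ln_s - n * ln_norm
    return log_post

def metropolis(log_post, start, step, n_steps, rng):
    """Random-walk Metropolis chain for a single parameter."""
    chain, current = [start], log_post(start)
    for _ in range(n_steps):
        proposal = chain[-1] + rng.gauss(0.0, step)
        candidate = log_post(proposal)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            chain.append(proposal)
            current = candidate
        else:
            chain.append(chain[-1])
    return chain

rng = random.Random(42)
seps = draw_power_law(-1.5, 2000, rng)           # synthetic "observed" sample
chain = metropolis(make_log_posterior(seps), start=-1.2, step=0.03,
                   n_steps=4000, rng=rng)
samples = chain[1000:]                           # discard burn-in
gamma_mean = sum(samples) / len(samples)
print(f"posterior mean gamma = {gamma_mean:.2f}")
```

For the multi-parameter DPL and SBPL models an ensemble sampler is the more practical choice, since a single-step-size random walk mixes poorly when parameters such as γ₂ and s_br are strongly correlated.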
5.1 SPL
We applied two projected separation cuts to the 10 distance shell subsamples: the full range s ≲ 10^5.20 AU and the restricted range s ≲ 10⁴ AU. Fitting the intrinsic distributions with the SPL model (φ(s) = φ₀s^γ) yielded the |γ| values shown in [FIGURE:9].
[FIGURE:9] shows that |γ| rises gradually with distance, from 1.70^{+0.01}_{-0.01} (s < 10⁴ AU) in the nearest shell (0 < d < 100 pc) to 1.99^{+0.02} (s < 10⁴ AU) in the farthest shell (900 < d < 1000 pc). After applying the calibrated selection function, |γ| stabilizes near 1.5.
5.2 DPL
Unlike SPL, which has the single free parameter γ, the DPL model has three free parameters, with parameter vector m = (γ₁, γ₂, log(s_br/AU)). Without selection function calibration, |γ₁| increases significantly with distance shell (see [FIGURE:10]), rising from 1.64^{+0.01}_{-0.01} in the nearest shell to 1.87^{+0.04} in the farthest shell. After recalibration, |γ₁| fluctuates around 1.5.
Because the model includes the break point s_br, we expanded the projected separation range accordingly; given the high contamination at large separations, however, we set s_max = 10^5.20 AU (Section 4.3). Using the uncalibrated and calibrated selection functions, we fitted the intrinsic distributions with the DPL model. [FIGURE:11] shows the marginal probability distributions of the DPL parameters across distance shells before and after calibration.
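To make the role of the three DPL parameters concrete, the following sketch writes down one common parametrization: a double power law that is continuous at s_br, with index γ₁ below the break and γ₂ above it. The exact functional form and normalization adopted in this work are defined in the model section and may differ in detail; the function names and the numerical values of γ₂ and s_br here are placeholders.

```python
import math

def log_dpl(s, gamma1, gamma2, log_sbr):
    """ln of an unnormalized double power law, continuous at s_br = 10**log_sbr."""
    ln_x = math.log(s) - log_sbr * math.log(10.0)     # ln(s / s_br)
    return (gamma1 if ln_x < 0.0 else gamma2) * ln_x

def dpl_integral(gamma1, gamma2, log_sbr, s_min, s_max):
    """Analytic integral of the law above over [s_min, s_max].

    Requires s_min < s_br < s_max and both indices different from -1.
    """
    s_br = 10.0 ** log_sbr
    a1, a2 = gamma1 + 1.0, gamma2 + 1.0
    below = (1.0 - (s_min / s_br) ** a1) / a1         # contribution from s < s_br
    above = ((s_max / s_br) ** a2 - 1.0) / a2         # contribution from s > s_br
    return s_br * (below + above)

# Placeholder parameters: gamma1 = -1.5, gamma2 = -3.0, s_br = 10^4 AU.
norm = dpl_integral(-1.5, -3.0, 4.0, 1e2, 1e5)
density_at_1000_au = math.exp(log_dpl(1e3, -1.5, -3.0, 4.0)) / norm
```

Dividing by `dpl_integral` gives a normalized φ(s|m) over the chosen range, which is what the likelihood requires before the selection function is applied.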
5.3 SBPL
In contrast to the two previous models, SBPL has four free parameters, with parameter vector m = (γ₁, γ₂, log(s_br/AU), Λ); the additional smoothing factor Λ quantifies the steepness of the transition between the two power laws. We kept the same projected separation limit as for DPL (s ≲ 10^5.20 AU) and fitted the intrinsic distributions using both the uncalibrated and calibrated selection functions.
With the uncalibrated selection function, |γ₁| still increases with distance shell (see [FIGURE:12]), rising from 1.64^{+0.01}_{-0.01} in the nearest shell to 1.87^{+0.04} in the farthest shell. After recalibration, |γ₁| fluctuates around 1.35. [FIGURE:13] shows the marginal probability distributions of the SBPL parameters across distance shells before and after calibration.
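The effect of a smoothing parameter on the slope near the break can be seen in a short sketch. The form used below is a widely used smoothly broken power law, φ ∝ x^γ₁ [(1 + x^(1/w))/2]^((γ₂ - γ₁)w) with x = s/s_br; it is an assumption here, and how the width w maps onto the Λ of this work is not specified in this section. The parameter values are placeholders.

```python
import math

def log_sbpl(s, gamma1, gamma2, log_sbr, width):
    """ln of an unnormalized smoothly broken power law (assumed form)."""
    ln_x = math.log(s) - log_sbr * math.log(10.0)     # ln(s / s_br)
    t = ln_x / width
    # ln(1 + e^t), written so that exp() cannot overflow
    softplus = max(t, 0.0) + math.log1p(math.exp(-abs(t)))
    return gamma1 * ln_x + (gamma2 - gamma1) * width * (softplus - math.log(2.0))

def local_slope(s, *params, eps=1e-3):
    """Central-difference estimate of d ln(phi) / d ln(s) at separation s."""
    up, down = s * math.exp(eps), s * math.exp(-eps)
    return (log_sbpl(up, *params) - log_sbpl(down, *params)) / (2.0 * eps)

params = (-1.35, -3.0, 4.0, 0.3)    # gamma1, gamma2, log10(s_br/AU), width
for s in (1e2, 3e3, 1e4, 1e6):
    print(f"s = {s:.0e} AU, local slope = {local_slope(s, *params):.3f}")
```

Under this assumed form the local slope just below s_br is already steeper than the asymptotic γ₁, so a smooth model can reproduce the same data with a smaller |γ₁| than a sharp break would need. This is one possible reading of why SBPL returns a shallower index than SPL and DPL.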
5.4 Comparison of Results Before and After Calibration
During model parameter fitting, we recorded the best-fit power-law index γ for each distance shell and model. The models are SPL, DPL, and SBPL, fitted by default over the full separation range s ≲ 10^5.20 AU; for SPL we also fitted the restricted range s ≲ 10⁴ AU.
[TABLE:1] and [TABLE:2] list the power-law indices across distance shells before and after calibration. After calibration, the absolute value |γ| for a given model and distance shell decreases significantly. Under equivalent conditions, the indices satisfy |γ_SBPL| ≲ |γ_DPL| ≲ |γ_SPL|.
We selected the distance shell (200 < d < 300 pc) as a test sample after calibration, fitted parameters using three models (SPL, DPL, SBPL), and visualized the intrinsic distribution using best-fit values.
[FIGURE:14] compares observed and model-fitted intrinsic distributions before and after selection function calibration for the test sample. The black dashed line (SPL(γ = -1.5)) represents the expected distribution. Compared to before calibration, the break point s_br is larger after calibration for the same model (DPL, SBPL), and all models approach the expected distribution at s ≲ s_br after calibration.
6 Summary and Outlook
Based on Gaia EDR3 data, we constructed a high-purity wide binary catalog containing 144,458 pairs within 1 pc projected separation in the solar neighborhood. Using three power-law models (SPL, DPL, SBPL), we studied how selection effects impact the intrinsic projected separation distribution, particularly the power-law index γ. We found that uncalibrated selection functions cause γ to deviate from the expected value (γ ≈ -1.5), with deviations increasing in more distant shells. To correct this, we proposed calibrating the selection function by distance, uniformly dividing the 0–1 kpc range into 10 shells and statistically analyzing the relationship between angular separation θ and completeness for different magnitude differences ΔG in each shell, then applying these relationships to fit the three models. Results show that incorporating distance-dependent calibration stabilizes γ, providing more reliable theoretical foundations for wide binary formation and evolution.
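The per-shell recalibration summarized above amounts to estimating θ₀ and β separately in each distance shell. As a rough sketch of how such a fit could be set up (not the procedure used in this work), assume the completeness curve f = 1/(1 + (θ/θ₀)^(-β)); then logit(f) = β(ln θ - ln θ₀) is linear in ln θ and an ordinary least-squares line recovers both parameters. The completeness values below are synthetic and the function names are hypothetical.

```python
import math

def fit_selection(thetas, completeness):
    """Least-squares fit of logit(f) = beta * (ln theta - ln theta0)."""
    xs = [math.log(t) for t in thetas]
    ys = [math.log(f / (1.0 - f)) for f in completeness]   # needs 0 < f < 1
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = my - beta * mx            # equals -beta * ln(theta0)
    theta0 = math.exp(-intercept / beta)
    return theta0, beta

def shell_index(d_pc, width_pc=100.0, n_shells=10):
    """Distance shell index: 0-100 pc -> 0, ..., 900-1000 pc -> 9."""
    return min(int(d_pc // width_pc), n_shells - 1)

# Synthetic completeness curve for one shell (placeholder theta0 = 3", beta = 4).
thetas = [1.0, 2.0, 3.0, 5.0, 8.0]                         # arcsec
measured = [1.0 / (1.0 + (t / 3.0) ** -4.0) for t in thetas]
theta0_fit, beta_fit = fit_selection(thetas, measured)
print(shell_index(250.0), round(theta0_fit, 3), round(beta_fit, 3))
```

With real data the completeness estimates are noisy and saturate near 0 and 1, so a weighted or direct nonlinear fit would likely be preferable to the plain logit regression shown here.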
After applying the distance-calibrated selection functions, the γ values fitted with SPL and DPL remain stable across all distance shells, fluctuating around -1.5. SBPL yields γ ≈ -1.35, slightly smaller in absolute value than SPL and DPL, primarily because of the effect of the smoothing factor Λ on the s ≲ s_br region. The four-parameter SBPL model describes the characteristics of the intrinsic separation distribution more comprehensively and accurately. We therefore hypothesize that previous fits of p(s) ∝ s^γ may have overestimated |γ|, with the true value lying closer to Öpik's law (γ = -1).
Because of Gaia's angular resolution limit and the effect of brightness differences on companion detection, our catalog remains incomplete. This incompleteness causes significant bias in the fitted intrinsic separation distribution (especially in γ), manifested as an increasing deficiency at small projected separations, a significant deviation of γ from -1.5, and larger break points s_br. Notably, restricting the SPL fit from the full range s < 10^5.20 AU to s < 1 × 10⁴ AU (≈ s_br) shifts γ toward the normal value of -1.5. This is reasonable, since SPL does not account for s_br whereas γ is defined for the s ≲ s_br region. For the same sample, the best-fit values typically satisfy |γ_SBPL| ≲ |γ_DPL| ≲ |γ_SPL|.
This study calibrated the biases in power-law model parameters (especially γ) caused by selection effects, but limitations remain: the sample does not cover smaller projected separations, and contamination by false binaries continues to affect catalog construction. Our catalog represents a trade-off between completeness and purity, and does not fundamentally resolve the selection effects or sample incompleteness caused by observational characteristics and search methods.
Gaia DR3 will provide more information about selection effects. Combining Gaia data with other surveys, such as LAMOST's low- and medium-resolution spectroscopic data, will enable better study of projected separation distribution characteristics and selection effect differences among various binary populations.
References
[1] Moe M, Di Stefano R. ApJS, 2017, 230(2): 15
[2] El-Badry K, Rix H W, Heintz T M. MNRAS, 2021, 506(2): 2269
[3] Halbwachs J L, Mayor M, Udry S, et al. A&A, 2003, 397(1): 159
[4] Lépine S, Bongiorno B. AJ, 2007, 133(3): 889
[5] Tobin J J, Kratter K M, Persson M V, et al. Nature, 2016, 538(7626): 483
[6] Tokovinin A, Moe M. MNRAS, 2020, 491(4): 5158
[7] Offner S S, Kratter K M, Matzner C D, et al. ApJ, 2010, 725(2): 1485
[8] Klein R L, Fisher R, Krumholz M, et al. Rev Mex Astron Astrofis, 2002, 15: 92
[9] Kouwenhoven M B N, Goodwin S P, Parker R J, et al. MNRAS, 2010, 404(4): 1835
[10] Perets H B, Kouwenhoven M B N. ApJ, 2012, 750(1): 83
[11] Reipurth B. Mem Soc Astron Ital, 2017, 88: 611
[12] Reipurth B, Mikkola S. AJ, 2015, 149(4): 145
[13] Tokovinin A. MNRAS, 2017, 468(3): 3461
[14] Chanamé J, Gould A. ApJ, 2004, 601(1): 289
[15] Yoo J, Chanamé J, Gould A, et al. ApJ, 2004, 601(1): 311
[16] Quinn D P, Wilkinson M I, Irwin M J, et al. MNRAS, 2009, 396(1): L11
[17] Monroy-Rodríguez M A, Allen C. ApJ, 2014, 790(2): 159
[18] El-Badry K, Rix H W. MNRAS, 2018, 480(4): 4884
[19] Jiang Y F, Tremaine S. MNRAS, 2010, 401(2): 977
[20] Tian H J, El-Badry K, Rix H W, et al. ApJS, 2019, 246(1): 4
[21] Andrews J J, Chanamé J, Agüeros M A. MNRAS, 2017, 472(1): 675
[22] Pittordis C, Sutherland W. MNRAS, 2018, 480(2): 1778
[23] Livernois A R, Vesperini E, Pavlík V. MNRAS, 2023, 521(3): 4395
[24] Peñarrubia J, Ludlow A D, Chanamé J, et al. MNRAS, 2016, 461(1): L72
[25] Fellhauer M, Lin D N C, Bolte M, et al. ApJ, 2003, 595(1): L53
[26] Brown A G, Vallenari A, Prusti T. A&A, 2021, 649: A1
[27] Jiang Y F, Tremaine S. MNRAS, 2010, 401(2): 977
[28] Oh S, Price-Whelan A M, Hogg D W, et al. AJ, 2017, 153(6): 257
[29] Babusiaux C, Van Leeuwen F, Barstow M A, et al. A&A, 2018, 616: A10
[30] Lindegren L, Hernández J, Bombrun A, et al. A&A, 2018, 616: A2
[31] Evans D W, Riello M, De Angeli F, et al. A&A, 2018, 616: A4
[32] Arenou F, Luri X, Babusiaux C, et al. A&A, 2018, 616: A17
[33] Leike R H, Enßlin T A. A&A, 2019, 631: A32
[34] Widmark A, Leistedt B, Hogg D W. ApJ, 2018, 857(2): 114
[35] Gaia Collaboration, et al. The Gaia Mission. https://arxiv.org/pdf/1609.04153, 2016
[36] Herschel W. Philosophical Transactions of the Royal Society of London, 1782, 72: 112
[37] Herschel W. Philosophical Transactions of the Royal Society of London, 1803, 93: 339
[38] Sesar B, Ivezić Ž, Jurić M. ApJ, 2008, 689(2): 1244
[39] Lindegren L, Hernández J, Bombrun A, et al. A&A, 2018, 616: A2
[40] Duchêne G, Kraus A. ARA&A, 2013, 51: 269
[41] Dhital S, West A A, Stassun K G, et al. AJ, 2010, 139(6): 2566
[42] Oelkers R J, Stassun K G, Dhital S. AJ, 2017, 153(6): 259
[43] Ziegler C, Law N M, Baranec C, et al. AJ, 2018, 156(6): 259
[44] Foreman-Mackey D, Hogg D W, Lang D, et al. PASP, 2013, 125(925): 306
[45] Foreman-Mackey D. JOSS, 2016, 1(2): 24