The impact of a massive Sagittarius dSph on GD-1-like streams

We investigate the effects of a massive ($\gtrsim4\times10^{10}M_\odot$) Sagittarius dwarf spheroidal galaxy (Sgr) on stellar streams using test particle simulations in a realistic Milky Way potential. We find that Sgr can easily disrupt streams formed more than $\sim3$ Gyr ago, while stars stripped more recently are generally unaffected. In certain realizations, Sgr is able to produce asymmetry between the leading and trailing tails of Pal 5, qualitatively similar to observations. Using data from the Gaia space telescope and elsewhere, we fit models to the GD-1 stream in the presence of a Sgr with various initial masses. While the best-fitting models do show perturbations resulting from interactions with Sgr, we find that the level of disruption is not significantly greater than in the observed stream. To investigate the general effects of Sgr on a population of streams, we generate 1000 mock streams on GD-1-like orbits with randomized orientations. Some streams show clear evidence of disruption, becoming folded on the sky or developing asymmetry betweeen their two tails. However, many survive unaffected and the peak surface brightness of stars is decreased by no more than $\sim0.3$ mag/arcsec$^2$ on average. We conclude that Sgr having an initial mass of $\gtrsim4\times10^{10}M_\odot$ is compatible with the survival and detection of streams formed more than 3 Gyr ago.


INTRODUCTION
The Sagittarius dwarf spheroidal galaxy (Sgr), discovered by Ibata et al. (1995), is one of the closest and brightest satellites of the Milky Way. Having just passed pericentre, it is currently undergoing strong tidal disruption and is expected to completely dissolve over the next billion years (Vasiliev & Belokurov 2020). Stars have been stripped from Sgr to form the Sagittarius stream, a pair of long tails which lead and trail Sgr. These wrap around the Milky Way in a plane roughly perpendicular to its disc (see e.g. Ibata et al. 2001;Majewski et al. 2003;Belokurov et al. 2006;Newberg et al. 2002), and are useful probes of both the Milky Way's potential and the orbit of Sgr (e.g. Ibata et al. 2001;Fellhauer et al. 2006;Law & Majewski 2010;Gibbons et al. 2014;Vasiliev et al. 2021). Dynamical modelling of the disruption of Sgr by Law & Majewski (2010) suggested that the present-day mass of the Sgr remnant is about 2.5 × 10 8 , and more recently Vasiliev & Belokurov (2020) found a mass of ∼ 4 × 10 8 . This is much less than the current mass of the Large Magellanic Cloud (LMC), about 1.4 × 10 11 ). However, since Sgr has already been significantly disrupted, its initial mass must have been much higher. Using N-body models, Jiang & Binney (2000) argued that the available data was compatible with a Sgr of initial mass ∼ 10 11 . Niederste-Ostholt et al. (2010) calculated the total luminosity of the Sgr debris and estimated the original mass of its dark matter (DM) halo to be ∼ 10 10 . Various lines of evidence have more recently emerged to suggest that this mass may have been > 10 10 . Gibbons et al. (2017) used simulations of the Sgr stream in conjunction with chemistry and kinematics from ★ E-mail: amd206@cam.ac.uk Sloan Digital Sky Survey/SEGUE data (Yanny et al. 2009) to derive an initial mass estimate of > 6 × 10 10 . Using an abundance matching technique Read & Erkal (2019) found a similar value of 5 × 10 10 .
Further evidence may be provided from the discovery by Antoja et al. (2018) of a spiral pattern in the phase space of Milky Way disc stars using data from the Gaia space telescope (Gaia Collaboration et al. 2018). This is sometimes known as a 'snail' or 'phase space spiral'. Studies have shown that such a spiral can emerge from interaction with a disc-crossing dwarf galaxy (e.g. Antoja et al. 2018;Binney & Schönrich 2018) such as Sgr, providing a possible indicator of its mass at earlier times. Laporte et al. (2018) ran simulations of Sgr with initial mass ∼ 10 11 and showed that many observed perturbations of the disc can be reproduced (Laporte et al. 2019). Bland-Hawthorn & Tepper-García (2021) have suggested that Sgr must have been losing mass at a rate of 0.5-1.0 dex per orbit for it to have excited the phase space spiral in the last 1-2 Gyr. However, Bennett et al. (2022) have argued that Sgr cannot be the sole creator of the perturbations, since their simulations were unable to match both the perturbations and the present mass of the Sgr remnant. The mass of Sgr at earlier times therefore remains an open question, one that is central to understanding the last few billion years of the Milky Way.
Stellar streams are produced by dissolving globular clusters or satellite galaxies. As the progenitor orbits the host galaxy, tidal forces strip stars from the Lagrange points to create one or two tails which lead or trail the progenitor on its orbit (e.g. Lynden-Bell & Lynden-Bell 1995;Binney & Tremaine 2008). Cold stellar streams, formed from disrupted globular clusters, are narrow with low velocity dispersion. They provide a visible trace of an approximate orbit through The pericentric and apocentric distances of GD-1 are around 14 kpc (Bowden et al. 2015) and 26 kpc (Koposov et al. 2010) respectively. The radial range explored by GD-1 therefore overlaps with that of Sgr, which reached a radius of ≈ 16 kpc at its last pericentre (Vasiliev & Belokurov 2020). de Boer et al. (2020) showed that an interaction between GD-1 and Sgr ≈ 3 Gyr ago could reproduce some of the off-track features of GD-1. These include the 'spur' and the 'blob' (Price-Whelan & Bonaca 2018), which lie ∼ 1 • away from the main track of the stream on the sky. The stream also has wiggles and density variations (e.g. de Boer et al. 2018de Boer et al. , 2020. All these features are thought to arise from encounters with DM subhaloes , and may provide information about the distributions and nature of these subhaloes in the Milky Way. It has also been proposed by Malhan et al. (2019a) and Qian et al. (2022) that the off-track features could arise if GD-1 was formed from a cluster accreted from a satellite galaxy. In these scenarios, the off-track features are comprised of stars stripped from the cluster before it was accreted into the Milky Way.
Like many other Galactic stellar streams (e.g. the Orphan stream, Grillmair 2006;Belokurov et al. 2006;Koposov et al. 2014;Shipp et al. 2018), no progenitor cluster has been discovered for GD-1, although Webb & Bovy (2019) have suggested possible scenarios for the dissolution and location of the progenitor. One possibility is that the cluster dissolved more than 2.5 Gyr ago, in which case the stream would have had ample time to interact with subhaloes including Sgr.
In this study, we investigate the effects of a Sgr of mass > 10 10 on stellar streams in the Milky Way. We focus on GD-1 due to the possibility of encounters with Sgr, and the high-quality data available over a large expanse of sky. While uncertainties in the Milky Way's potential and the position of Sgr are far too great to predict or model the nature of such encounters, the masses of Sgr being considered mean that even distant encounters may cause significant disruption to the stream. It is therefore conceivable that such disruption is unavoidable for streams orbiting at similar radii to the pericentre of Sgr. We seek to answer whether this is the case, and whether the observed GD-1 stream is compatible with it having survived encounters with a massive Sgr. If not, this would suggest that observed streams like GD-1 must have formed after Sgr had lost much of its initial mass. This would have consequences for future studies of GD-1-like streams.
The paper is arranged as follows. In Section 2 we describe the simulation setup, including the models for Sgr and the method for generating streams using the Lagrange Cloud Stripping technique. We then test these models on mock streams generated from known globular clusters in Section 3, before fitting models to data from the GD-1 stream in Section 4. In Section 5, we generate 1000 streams on GD-1-like orbits to investigate whether a population of streams can survive the passage of a massive Sgr. Finally we summarise our findings in Section 6.

SETUP
We use the package AGAMA (Vasiliev 2019a) to conduct test particle simulations of stellar streams in the combined potential of the Milky Way (MW), Large Magellanic Cloud (LMC) and Sagittarius Dwarf Galaxy (Sgr). Throughout this paper we use a right-handed Galactocentric coordinate system ( , , ), where the Sun is situated at (−8.1, 0, 0.02) kpc with a velocity of (12.9, 245.6, 7.8) km/s, consistent with ASTROPY (Astropy Collaboration et al. 2018).

Potential
In setting up the potential of the Milky Way and the LMC, we closely follow the methods described in Vasiliev et al. (2021), where a detailed description can be found.

Milky Way
For the MW's potential, we choose the triaxial fiducial model used by Vasiliev et al. (2021) to model the Sgr stream. Its halo has a radius-dependent shape: the inner halo is axisymmetric and flattened perpendicular to the disc, while the outer halo ( 50 kpc) is triaxial, with its major axis perpendicular to the disc. This potential was shown to produce reasonable matches to the Sgr stream, and is therefore an appropriate choice for studying dynamics in its vicinity. The potential is described in more detail in Appendix A, and also Section 3.4 and Table 1 in Vasiliev et al. (2021).

LMC
The contribution from the LMC to our potential consists of two components. In addition to the direct gravitational potential, there is a uniform (but time-varying) acceleration due to the reflex motion of the Milky Way's centre of mass towards the LMC. This correction was found by Gómez et al. (2015) to have a significant effect on the phase space structure of the Sgr debris, and Vasiliev et al. (2021) concluded that the recoil was necessary to find a satisfactory fit to the stream.
The LMC is modelled with a spherically symmetric density profile which is a truncated Navarro-Frenk-White (NFW) model. We set the total mass to be LMC = 1.5 × 10 11 , with a scale radius scale ≈ 15.6 kpc and truncation radius trunc = 10 scale ≈ 156 kpc. The relation between LMC and scale was chosen by Vasiliev et al. (2021) to satisfy observational constraints on the LMC's circular velocity.
The motions of the MW and LMC under their mutual gravitational attraction are found by numerically integrating the equations where x MW and x LMC (v MW and v LMC ) are the position vectors (velocities) of the MW and LMC in an inertial frame, Φ MW and Φ LMC are the gravitational potentials, andã MW andã LMC are correction terms to account for the effects of dynamical friction and deformation. These are calculated by calibration with the N-body simulations of Vasiliev et al. (2021).
Eqns (2) were integrated backwards from the present-day ( = 0) to an initial time start = −6 Gyr. The trajectory of the MW was used to compute the resulting uniform acceleration potential, as experienced by a particle at rest relative to the MW. This was added to the MW and LMC gravitational potentials in the non-inertial galactocentric coordinate system to obtain a combined MW-LMC potential consistent with the motion of the MW.

Sgr
We model the Sgr dSph as a Hernquist sphere with time-varying mass and scale radius. Again following Vasiliev et al. (2021), we place the present-day position of the Sgr remnant at Sgr = 283.76 • , Sgr = −30.48 • , with proper motions * , = −2.7 mas/yr and , = −1.35 mas/yr. The distance and radial velocity are 27 kpc and 142 km/s. This choice of distance was found by Vasiliev et al. (2021) to provide better fits to the Sgr stream than other values (e.g. 26.5 kpc, Vasiliev & Belokurov 2020).
We integrate the orbit of Sgr in the combined MW-LMC-reflex potential, including an acceleration due to dynamical friction (Chandrasekhar 1943;Vasiliev et al. 2021): Here v Sgr is the velocity of Sgr in the Galactocentric frame, is the mass density of the MW (and LMC) at the location of Sgr, is the velocity dispersion of particles at that location, and lnΛ is the Coulomb logarithm. While is calculated from the MW+LMC potential, and lnΛ must be prescribed. Following Vasiliev et al. (2021), we set = 120 km/s and lnΛ = 3. A crucial part of this work is the choice of mass decay profile Sgr ( ) of Sgr as it orbits the Milky Way. We wish for Sgr ( ) to be continuous, realistic and consistent with eqn (3), while allowing us to consider a high initial mass (> 10 10 ). We make the assumption that the total mass decreases by a factor of 10 per pericentre passage. The 'decay rate parameter' > 0 is the logarithmic mass lost per orbit in dex; i.e. log 10 Sgr decreases by every orbit. For continuity of Sgr ( ), the mass is decreased exponentially between each pericentre and the subsequent apocentre (such that log Sgr ( )  The periods of exponentially decreasing mass (linear on the logarithmic plot) occur between pericentres and subsequent apocentres. All models have Sgr = 4 × 10 8 at the present-day. Bottom panel: Galactocentric distances of Sgr models as a function of time, with the LMC included for comparison. The models all follow similar orbits over their last two periods, and have a pericentre of Sgr ≈ 20 kpc at ≈ −2.7 Gyr. Before this time, the more massive models ( 0.5) are on longer-period orbits with larger apocentres, before dynamical friction causes their orbits to decay. decreases uniformly). It is then held constant until the next pericentre. Since only a small fraction of an orbital period has elapsed since the most recent pericentre passage, we do not remove any mass between this time and the present-day. The present-day mass is set to Sgr (0) = 4 × 10 8 , the total mass of the Sgr remnant found by Vasiliev & Belokurov (2020). We assume that the Hernquist radius scales as 1/3 Sgr ( ) and set the present-day value to 1.4 kpc. This gives a peak circular velocity of 17.5 km/s, similar to the best-fitting models of Sgr in Vasiliev & Belokurov (2020). This prescription would also give a scale radius of 8.2 kpc for a model of total mass 8 × 10 10 , similar to the L2 model of Laporte et al. (2018). We approximate Sgr as a test particle moving in the MW potential, since simulating the mutual interaction of Sgr and the MW would require additional assumptions about how the mass stripped from Sgr is distributed in the MW (see Appendix B for a justification of this approximation). However, we do include the uniform acceleration potential resulting from the MW's acceleration towards Sgr. Since we are considering masses of Sgr up to ∼ 1/3 that of the LMC, this effect is non-negligible. This is especially important when considering perturbations on streams; the MW reflex acceleration causes Mock stellar streams generated from three globular clusters, with five different values of the Sgr mass decay rate parameter . All panels show the streams as viewed on the sky from the galactic centre, in coordinate systems where the 1 axis is aligned with the instantaneous orbital plane of the progenitor cluster (situated at 1 = 2 = 0). The progenitor's motion is towards the left (positive 1 ). Both leading and trailing streams are shown, and the stars are colour-coded by the time at which they were stripped from the progenitor; purple is earliest and yellow is most recent. In all three cases increasing the mass of Sgr increases the disruption of streams. The stars stripped earliest (at −3 Gyr) are most affected, while the more recently stripped stars are able to form narrow tracks with even the highest masses of Sgr. An animated version of this figure can be viewed at https://www.youtube.com/playlist?list= PLEleLLhXAwEMx6GcSror-iF-QsskHrXYM. partial cancellation of Sgr-induced perturbations, so neglecting this effect would lead to overestimation of these perturbations. We include this reflex in the same manner as that due to the LMC; we calculate the Sgr-centric acceleration evaluated at the centre of the MW, and subtract this acceleration from all equations of motion in the galactocentric frame. The LMC is ignored in this calculation because it is far from the MW at early times, when Sgr has non-negligible mass.
For the analysis we consider up to five models with different decay rates, with ∈ {0.2, 0.4, 0.6, 0.7, 1.0}. Sgr ( ) and the galactocentric distance Sgr ( ) are plotted in Fig. 1 for these five models. The two models with lowest (0.2 and 0.4) do not experience significant orbital decay due to dynamical friction, and each have 5 pericentre passages during the 6 Gyr. The = 0.6 and 0.7 models are similar to each other, both starting at large radii and experiencing 4 pericentre passages. The fastest-decaying model ( = 1.0) only passes pericentre 3 times, the earliest being at ≈ −2.8 Gyr. As a result, the initial mass (4 × 10 10 ) is actually lower than that of the = 0.7 model (5 × 10 10 ). Care is therefore needed when interpreting our results in relation to the mass of Sgr at earlier times. The = 0.7 and = 1.0 models have initial masses somewhat less than those of Laporte et al. (2018). However, those authors' values of 8 − 14 × 10 10 were the masses at infall, so their models would be more comparable to ours by the time of the first pericentric passage due to tidal stripping.

Stream generation
To rapidly generate realistic models of stellar streams we use the 'Lagrange point stripping' technique outlined by Bowden et al. (2015), also known as 'modified Lagrange Cloud Stripping' (Gibbons et al. 2014). This is similar (though not identical) to the methods of Varghese et al. (2011) and Küpper et al. (2012). We treat the stars in the stream as test particles moving in the combined potential of the MW, LMC, Sgr and a stream progenitor. This technique has been shown to produce streams which are indistinguishable from those in Nbody simulations, while being much cheaper computationally to run (Gibbons et al. 2014).
The stream progenitor is modelled as a Plummer sphere of mass prog ( ) and constant scale radius prog . Its orbit is found by integrating back in time from some present-day phase space coordinates, through the combined MW+LMC+Sgr potential Φ. At each time the locations of the Lagrange points are estimated as being along the line from the centre of the MW to the progenitor, at a displacement from the progenitor Δ = ± t . The tidal radius t is given by (Gibbons et al. 2014;Bowden et al. 2015) where Ω is the instantaneous angular speed of the progenitor about the MW's centre. Following Bowden et al. (2015), we release particles from Galactocentric radii strip = prog ± t where = 1.2, rather than from the Lagrange points themselves. This ensures that a large majority of particles escape the progenitor and enter the streams. The velocities of the stream particles are randomly drawn from a 3D isotropic Gaussian centred on v strip with standard deviation strip . The radial component of v strip is equal to that of the progenitor, while the tangential components are set equal to those of the point halfway between the centre of the progenitor and the Lagrange point. This choice has been used successfully by Küpper et al. (2012) and Bowden et al. (2015).
We set the initial progenitor mass (at = −6 Gyr) to prog = 2 × 10 4 , and either keep prog constant or decrease it uniformly to zero at some time of dispersal disp . The Plummer scale radius is set to prog = 2 pc, and the stripping velocity dispersion is strip = 0.5 km/s. The total stellar mass of the visible GD-1 stream was estimated by de Boer et al. (2020) to be 1.8 × 10 4 , so our choice of initial prog is above the absolute lower bound for the initial mass of the GD-1 progenitor. We later check whether using a lower mass affects our results (see Section 4.5). We choose the velocity dispersion to obtain a model stream with a similar on-sky width to GD-1 (in the absence of strong perturbations).

Generation of mock streams from known globular clusters
We first test our models of a massive Sgr with mock streams orbiting at similar radii to GD-1, which in our potential is in the range 12 /kpc 24 (with no Sgr). Since we wish the progenitors of these streams to be on realistic globular cluster (GC) orbits, we search for suitable GCs in the catalogue by Vasiliev (2019b) based on Gaia DR2 data.
We load the present-day 6D phase space positions of all the GCs in Vasiliev (2019b) via the galpy module (Bovy 2015). We integrate the orbits backwards to = −6 Gyr in the MW+LMC potential, but with no Sgr. We find the maximum and minimum radii max and min of all orbits, and select those which satisfy max < 25 kpc and min > 11 kpc. This limits the GC orbits to the spherical shell explored by GD-1.
Of the 150 GCs in the catalogue, only 3 have orbits that satisfy these constraints. These are Pal 1, Pal 5 and BH 176. The latter's identity as a globular cluster is not agreed upon, and it has also been identified as an open cluster (van den Bergh & Hagen 1975;Davoust et al. 2011). Interestingly, Pal 5 has its own tidal tails (Odenkirchen et al. 2001) which are among the best-studied of all cold stellar streams in the halo (see e.g. Odenkirchen et al. 2001;Rockosi et al. 2002;Dehnen et al. 2004;Grillmair & Dionatos 2006a;Küpper et al. 2015;Ibata et al. 2016;Erkal et al. 2017;Bonaca et al. 2020a). We therefore take this opportunity to investigate models of the Pal 5 tails under the influence of a massive Sgr.
We use our Lagrange point stripping method to generate stellar streams from the Lagrange points of these GCs, keeping prog constant (i.e. the clusters do not disperse). We strip stars from both the inner and outer Lagrange points, generating leading and trailing tails respectively. We repeat for different choices of the Sgr mass decay rate parameter . To display the streams, we calculate the instantaneous orbital plane of the progenitor as a function of time, and use this as the equator of a Galactocentric polar coordinate system with longitude 1 and latitude 2 . We place the progenitor at 1 = 2 = 0. The present-day appearances of the 3 streams are shown in Fig 2 for various values of . Along with Figs. 6 and 7, an animated version can be viewed at https://www.youtube.com/playlist?list= PLEleLLhXAwEMx6GcSror-iF-QsskHrXYM. Fig. 2 demonstrates that a massive Sgr can have a highly disruptive effect on streams. While these effects vary between different cases, the structures of the streams formed at early times are all heavily affected by increasing the value of . For = 1.0, the stars from Pal 1 stripped before ≈ −3 Gyr no longer form narrow tracks on the sky, and the two arms are twisted and significantly widened. The early-stripped stars from BH 176 are in a narrow track, but this is offset from the younger stream attached to the cluster. At lower values of the damage is a little less dramatic, but is still clearly visible at = 0.6, corresponding to a maximum Sgr mass of 2.5 × 10 10 . However, the disruption does not extend to stars stripped later (at > −3 Gyr). All the panels in Fig. 2 do show narrow streams comprised of recently stripped stars along the 1 axes, albeit sometimes shorter than the full streams with smaller . This is unsurprising, since encounters with a massive Sgr ( Sgr 10 10 ) can occur no later than ≈ −2.7 Gyr, around its third-from-last pericentre. This allows any material stripped after this time to form narrow streams, unaffected by Sgr. The Pal 1 and BH 176 stream models are comparable to those of Woudenberg et al. (2022), who simulated the Jhelum stream in the presence of Sgr. They similarly showed that interactions with Sgr can result in a thin stream running parallel to older, more diffuse ones. This recreated some of the features of the observed Jhelum stream.
Pal 5 does not show extreme disruption to the same extent as the other two streams. However, for some values of there is strong asymmetry between the leading and trailing tails. For = 0.6 and = 1.0 in particular, one of the tails is significantly shorter than the other. This is interesting because the real Pal 5 stream does show such asymmetry; the leading tail is shorter than the trailing tail (Dehnen et al. 2004;Erkal et al. 2017;Starkman et al. 2020;Bonaca et al. 2020a). We discuss the asymmetry of our Pal 5 models further in Section 3.2.

The asymmetry of Pal 5
To compare our Pal 5 models to observations of the stream, we transform to a heliocentric coordinate system ( 1 , 2 ) approximately aligned with the stream. This is the Pal5PriceWhelan18 coordinate system in the gala package (Price-Whelan 2017) devised to study the Pal 5 stream (e.g Bonaca et al. 2020a;Price-Whelan et al. 2019). We repeat this for ∈ {0.2, 0.5, 1.0}. The value = 0.5 corresponds to an initial Sgr mass of 4 × 10 10 , and was selected because it produces a good example of an asymmetric stream. We emphasise that these models each result from a single realisation of the progenitor positions, so represent only one set of possible outcomes. Fig. 3 shows the stream's on-sky appearance for each value of , again with stripping times colour-coded. These times are also plotted against 1 in the second row. To demonstrate the asymmetry about 1 = 0, we calculate kernel density estimates (KDEs) of the 1 distributions, first excluding stars outside the on-sky region −60 • < 1 < 60 • , −5 • < 2 < 20 • . As a measure of the asymmetry, we also compute skewness values˜3 of the same 1 distributions. The KDEs and skewness values are shown in the bottom row of Fig. 3.
With the low-mass Sgr ( = 0.2), the stream is close to symmetric about the progenitor ( 1 = 0); the leading and trailing tails have a very similar length and density distribution. In the middle row, we see that the distance of stars from the progenitor in 1 roughly correlates with the time since they were stripped (although with considerable scatter).
The symmetry is broken in the other two cases. With = 0.5, the

Time of stripping [Gyr]
Pal 5 density trailing tail (at 1 < 0 • , on the right-hand side) extends further than 50 • from the progenitor, while the leading tail (at 1 > 0 • ) is cut off sharply at 1 ≈ 25 • . The middle row reveals the structural reason for this: the leading tail is compressed so that the early-stripped stars lie closer to the progenitor. The KDE confirms that there is a significant enhancement in density at 1 ≈ 15 • . Since this compressed part is not perfectly aligned with the rest of the tail, it also results in an increased width along this part of the stream by almost a factor of 2.
The = 1.0 stream is superficially similar, though reversed; in this case, the leading stream extends beyond 50 • while the trailing stream is sharply cut off at 1 ≈ −25 • . However, the stream is structurally different: the shortened tail does not show the same buildup of earlystripped stars or the resulting density peak. A dog-leg in the stripping time plot at ≈ −2.7 Gyr hints that the trailing stream has been 'folded' and extends towards positive 1 , lying on top of the leading stream. This is confirmed by animations. After the Sgr pericentre at ≈ −2.7 Gyr, the trailing stream is twisted and its stars begin to overtake the progenitor, sometimes lying along the leading stream.
The observed tails of Pal 5 have complex structure which has been studied in detail by Erkal et al. (2017) and Bonaca et al. (2020a). The leading tail is significantly shorter than the trailing one (although discoveries by Starkman et al. (2020) of possible extensions to the stream may dispute this). The leading tail also fans out before it is curtailed, and both tails contain wiggles and gaps (i.e. underdensities). The mock Pal 5 streams shown in Figure 3 exhibit some qualitative similarities to these features, most clearly the asymmetry in lengths.
The stellar density plots on the bottom row also show underdensities a few degrees away from the cluster, even in the = 0.2 case. These are comparable to the gaps of Pal 5 shown in Fig. 5 of Erkal et al. (2017) and in Fig. 3 of Bonaca et al. (2020a), though much less prominent. The greater width of the leading tail in the = 0.5 model is also similar to the 'fan' of the real stream. Slightly different initial conditions can result in this 'folded' tail becoming more misaligned with itself, resulting in an even larger apparent width on the sky. This type of perturbation is therefore able to reproduce both shortening and widening of the tail, at least qualitatively.
However, the scale of the asymmetry is considerably larger in our mock streams. While the shorter tails are both sharply cut off 25 • from the cluster, the observed leading tail of Pal 5 extends to only 1 ≈ 7 • . The second row of Fig. 3 shows that the stars stripped since ≈ −2.7 Gyr form a tail which is unaffected by Sgr, and extends to its full unperturbed length. Only the earlier-stripped stars are shifted in 1 . The cut-off angle of 25 • is therefore set by how far the more recently stripped stars drift from the cluster in the time since the interaction with Sgr. If the asymmetry of Pal 5 was caused solely by Sgr, this rate of drift must have been considerably slower than in our models. We checked whether changing the model parameters could achieve this, and found that a progenitor mass of 1 × 10 4 reduced the cut-off angle to less than 15 • if we set = 0.5 (i.e. particles are released halfway between the cluster centre and the Lagrange point). This mass is consistent with the value of (1.39 ± 0.65) × 10 4 reported by Baumgardt & Hilker (2018). It may therefore be possible for some realistic model to match the observations even closer. Hence, we cannot rule out Sgr as the cause of the asymmetry in Pal 5, and this provides an alternative to other possible causes of the asymmetry, such as the rotation of the galactic bar (Pearson et al. 2017).

FITTING THE GD-1 STREAM
In this section, we investigate the effect of a massive Sgr on the GD-1 stream, which may have been able to interact with Sgr ∼ 3 Gyr ago. Such an interaction may have created observed off-track features such as the 'spur' (de Boer et al. 2020). We now take this idea further by questioning whether the present-day appearance of GD-1 is consistent with it having survived a pericentric passage of a more massive Sgr. To achieve this, we find maximum likelihood models of GD-1 constructed using the Lagrange cloud stripping technique.

4D Gaia data
We use astrometric data from the Gaia mission's Early Data Release 3 (EDR3; Gaia Collaboration et al. 2016Collaboration et al. , 2021 for a set of stars identified as probable members of the GD-1 stream (as described in detail in Tavangar et al. in prep). Briefly, this membership model fits the track of the stream -that is, the mean dependence of stream latitude 2 , proper motion components in stream-aligned coordinates * 1 , 2 , and radial velocity rad (when available) -using splines with regularly spaced knots in stream longitude 1 and variable values in each other coordinate dimension ( 2 , * 1 , 2 , rad ). Both the mean dependence in each coordinate and the width of the stream as a function of stream longitude are fit using spline tracks, assuming that the intrinsic width of the stream in each component is Gaussian, which allows naturally incorporating measurement uncertainties into this methodology. We use 8 spline knots for each coordinate dimension track, and 5 knots for each width track. After first filtering Gaia sources based on isochrone selection using the best-fitting stellar population parameters from Price-Whelan & Bonaca (2018), the stream track model is fit simultaneously with a model for the stellar background (i.e. field stars) in each coordinate dimension. The background stellar distribution is modelled as a mixture of Gaussians for the proper motion components, but is not explicitly modeled in the other phase space coordinates. From this procedure, each star is assigned a membership probability of belonging to the GD-1 stream; we use stars with a membership probability > 0.3. We found that using higher probabilities did not greatly change the distributions of the data points.

Distances
We use the heliocentric distance measurements reported by de Boer et al. (2020) and shown in their Table 1. These were derived using photometry from the Pan-STARRS survey (PS1; Chambers et al. 2016), complemented by Gaia DR2 astrometry (Gaia Collaboration et al. 2018). The distance measurements are 10 • apart in 1 , covering a wide range of 110 • . The values are on the order of ∼ 10 kpc, with uncertainties of ∼ 0.08 kpc. While the above measurements based on the models of the stream's stellar distribution in colour-magnitude space provide reliable relative distances, some uncertainty in the absolute distance scale may remain due to the choice of the isochrones used. In what follows, we compare the absolute distance scale along GD-1 6D Data the streams with stellar standard candles such as Blue Horizontal Branch (BHB) stars and RR Lyrae.
We select BHB stars that are probable members of the GD-1 stream using extinction-corrected Pan-STARRS (PS1) photometry from the Gaia-PS1 cross-match sample constructed in Price-Whelan & . We use an approximate distance track for the stream as a function of stream longitude 1 derived from fitting an orbit to the stream to compute PS1 absolute -band magnitudes and de-reddened − colors for all stars. We use the stream member-ship probabilities computed above to select probable stream members based on sky position and astrometry, and then select candidate BHB stars as having ( − ) < 0 and 0 < < 1. Additionally, we select candidate RR Lyrae-type (RRL) stream members using a cross-match of the Gaia DR2 RRL sample with a catalog of RRL from the PS1 survey (Sesar et al. 2017). We again use the astrometric membership probabilities defined above to select probable stream members within this subset. For the BHBs we use the colour-dependant absolute magnitude calibration ( − ) from Belokurov & Koposov (2016). For the RR Lyrae, we use the metallicity-dependant absolute magnitude calibration from Muraveva et al. (2018)  . We quantify the offset of these new data points by fitting a polynomial to the photometric distance track, and calculating the differences in the distances of the standard candles. The mean offset from the photometric track is less than 0.5 kpc, and only ∼ 0.25 kpc for the RRL. This is less than the scatter of the photometric distances about their fitted polynomial, so we choose to use the unchanged distance values as reported by de Boer et al. (2020).

Radial velocities
We use radial velocity measurements from two sources. The first set comes from Koposov et al. (2010) and were obtained at the Calar Alto observatory. These consist of radial velocity values of 23 stars which cover a smaller range in 1 than the other data, between −45 • and −13 • . Typical uncertainties are ∼ 9 km/s. We also include more precise measurements derived from high-resolution spectroscopy (Bonaca et al. 2020b), which cover a narrower range of the stream between 1 ≈ −46 • and −29 • . The uncertainties in these values are typically ∼ 1 km/s, giving a tight constraint on the radial velocity track in this range of 1 .

Cuts
While all distance and radial velocity data are included in our likelihood, we choose to use Gaia data in the range −80 • < 1 < 30 • only for the fitting. This covers a very large extent of sky, and excludes less than 18% of the 4D data points (at 1 < −80 • ). This choice was made because the observed 2 proper motion tracks significantly change gradient at more negative 1 , which causes otherwise well-fitting models to have excessively penalised log-likelihoods. For a similar reason, we also exclude Gaia stars at 2 > 0.9 • . This only removes the stars in the off-track 'spur' (see e.g. Price-Whelan & Bonaca 2018; Bonaca et al. 2019;de Boer et al. 2020). While this spur may have been produced by an interaction with Sgr (de Boer et al. 2020), our intention is not to accurately model such an interaction (nor do observational uncertainties allow this). Hence, including the spur would only serve to distort the likelihood of models which are good fits to the main stream track. However, when interpreting the results it must be remembered that GD-1 does show more disruption than we are including in our data. Our data is shown for all 6 dimensions of phase space in Fig. 4. The excluded stars in the spur at 2 > 0.9 • are also shown in red.
After these cuts our data consists of 448 stars with positions and proper motions, 12 distance data points and 66 radial velocity measurements.

Model setup
We again use the Lagrange cloud stripping technique to generate models of GD-1. Unlike Pal 5, the identity of the GD-1 progenitor is unknown, and it has likely already dispersed. We therefore choose to linearly decrease the mass of the progenitor from prog = 2 × 10 4 at = −6 Gyr to zero at = disp . We consider the scenario where disp = −3 Gyr, which roughly correspond to one of the two scenarios for the dispersal of the progenitor suggested by Webb & Bovy (2019). All stars forming such streams are already stripped before the Sgr pericentre at ≈ −2.7 Gyr, so will be exposed to any perturbations. We also use their suggestion for the location of the progenitor, placing it at 1 = −40 • . The other five phase space coordinates of the progenitor are allowed to vary, namely 2 , distance, proper motions in 1 and 2 , and radial velocity.

Likelihood
When comparing our model streams to the data, we wish to consider both the mean stream track and the width. For example, if an interaction with Sgr causes a model stream to significantly widen, its likelihood should be lowered even if its mean track is well-aligned with the data. We therefore use kernel density estimates (KDEs) to approximate the distributions of model stars in bins of 1 , from which the likelihood of each data point can be calculated (see e.g.

Palau & Miralda-Escudé 2019, for a similar idea).
For each model stream, we divide the mock stars into 12 bins of width Δ 1 = 10 • , centred on the distance data points at {−85 • , −75 • , ..., 5 • , 15 • }. To account for the path of the stream through each bin, we estimate its mean track by fitting a quadratic to each of the 5 phase space distributions as functions of 1 . We use the scipy function stats.gaussian_kde to fit KDEs to the distributions of mock stars about these quadratics. A popular and rapid algorithm to find the optimal bandwidth of a Gaussian KDE is Silverman's rule of thumb (Silverman 1986), which works well with unimodal distributions. While most of the mock streams do produce unimodal distributions, we found that perturbations can create lowdensity secondary streams offset from the main track. These resulted in much larger bandwidths and therefore oversmoothing when using Silverman's rule of thumb. To fix this issue we use only data points within 2 standard deviations of the mean when calculating the bandwidth of each distribution, so far-outlying secondary streams do not contribute. We also do not let the 2 and proper motion bandwidths exceed 0.25 • and 0.05 mas/yr respectively, to prevent oversmoothing. These values are typical widths of the streams in the corresponding dimensions. However, we include all mock stars in each bin when calculating the KDEs themselves. Tests on randomly generated streams showed that this produced KDEs which represented the distributions well. Each of the five dimensions are fitted with their own independent KDE.
We assign a likelihood to a data point by assuming its uncertainty is Gaussian, and multiplying this Gaussian by the KDE. The likelihood of the point is taken as the integral of this product over all space. We ignore the uncertainties in 2 (since they are negligible compared to the width of the stream) and do not include the membership probabilities in the likelihood. Finally, we compute the log-likelihood of the model log by summing the log-likelihoods of all the data points. The only exceptions to this are when there are too few mock stars in a bin (5 or fewer in our procedure), and when the model stream does not extend over the full range of 1 (−90 • < 1 < 30 • ). In both cases we set log = −∞ and the model is rejected. We avoid using the normalisation of the density in the likelihood, since this may be sensitive to the time-dependence of the stripping rate from the cluster and would require a more detailed model of the cluster evolution.

Likelihood optimization
To find high-likelihood models of the stream for each value of , we maximize the log-likelihood function over the 5 phase space coordinates of the progenitor (all except 1 ). We use the scipy function optimize.dual_annealing, which combines generalized simulated annealing (Xiang et al. 1997) with a local search method. We use the Nelder-Mead simplex algorithm for the local search because the likelihood function contains discontinuities, so we wish to avoid gradient-based methods. 1 For the optimization with no Sgr, the initial guess was taken from fitting cubic or quadratic polynomials to the data. For the cases with Sgr, we use both this guess and the result of the no Sgr optimization. The optimizer was allowed to explore coordinates in ranges of width Δ 2 = 4 • , Δ(distance) = 4 kpc, Δ * 1 = Δ 2 = 4 mas/yr, and Δ rad = 100 km/s, all much greater than the widths of the observed stream. The optimizations are independently repeated 50 times for no Sgr and 100 times for the models with Sgr (50 for each inital guess), with 10 iterations of the annealing in each run. The local optimization was allowed a maximum of 20 iterations, with a toleration for convergence of 20. A stricter tolerance was not used because the small-scale fluctuations and discontinuities in the log-likelihood would have prevented the optimizer from converging. The solutions were then used as initial guesses for a further 20 repetitions of the optimizer. The optimized model streams from this process for each Sgr model are taken as the results; this produced adequate fits to the data so we do not perform any more iterations. We tested this procedure on mock data generated from models both with and without a massive Sgr. The optimized streams with the correct Sgr models had the highest likelihoods, except that the = 0.2 model was slightly favoured with data generated with no Sgr. It is therefore difficult to distinguish between different Sgr models of low mass.

Results
The optimized model streams are plotted in all 6 dimensions of phase space in Fig. 5, for several values of . The 5th and 6th rows show the proper motions relative to polynomials fitted to the data, to more clearly show details of the proper motion tracks. In Fig. 6, we also show the physical appearance of the streams in galactocentric Cartesian coordinates at four different times, together with the trajectories of Sgr. Fig. 5 shows reasonably good fits to the data for each of the four models. The two components of proper motion are fitted well in all cases, with the models passing within or close to all the error bars. The distance and radial velocity data are more scattered and scarce, but the model tracks still generally provide good fits. The most obvious deviations from the data are in the on-sky coordinate intact, we do not consider that this misalignment rules out a Sgr with these masses. The = 0.7 case is one of the most disrupted of all the models, with a secondary stream offset in 2 , distance and proper motions. This results from a early close encounter with Sgr at ≈ −4.1 Gyr, when this Sgr model still has a mass of 5 × 10 10 . However, even this model stream is a reasonable fit to the data for most of 1 . The GD-1 stream itself exhibits various off-track features (Price-Whelan & Bonaca 2018), among them the spur shown in red in Fig. 4. There is also the 'blob' centred around 1 ≈ −15 • . These features were investigated in detail by de Boer et al. (2020), and both have proper motions very similar to the stars in the main stream. Combined with their proximity, this makes them very likely to be related to the main stream. Quantitatively, the proper motion residuals of the two features are 2 mas/yr in magnitude. GD-1 is also accompanied by a secondary stream, known as Kshir (Malhan et al. 2019b), which is offset by ∼ 10 • in 2 from GD-1 but shares similar kinematics. The off-track secondary stream in our = 0.7 model similarly differs by less than 1 mas/yr from the main stream. Hence even though the model stream has had a close encounter with a Sgr of mass > 10 10 , the level of disruption is not significantly greater than in the observed stream. While the offset in 2 is not so great as the observed Kshir stream, there are other examples where Sgr has resulted in differences in 2 of ∼ 10 • . For example, these are visible in the Pal 1 and BH 176 mock streams in Fig. 2. We have also seen similar cases in other models of GD-1 (not shown here). This suggests that a massive Sgr may be able to produce secondary streams similar to Kshir. The = 1.0 stream fits the mean track of the data in all dimensions well, with the exception of a sparse secondary stream. This secondary stream has a large offset in distance and kinematics as well as 2 , and results from the stream wrapping around the galaxy multiple times (see Fig. 6). It is therefore not comparable to Kshir, unlike the secondary stream in the = 0.7 model. Despite the good fit of the main track, the stream has not escaped encounters with Sgr. Fig. 6 reveals that the stream (in red) is folded at ≈ −2 Gyr before extending again in both directions. This stream is also shown in Fig. 7 at more snapshots. The left-hand column shows the stream in the galactocentric on-sky coordinates ( 1 , 2 ) as used in Fig. 2. Again the progenitor is always at 1 = 2 = 0 • and moving in the direction of positive 1 . In the middle column energy is plotted against the x-component of angular momentum , where the origin is shifted to the position of the progenitor in − space. The righthand column shows the component of acceleration from Sgr along the velocity vector of each particle, against 1 . The model stars are colour-coded according to the location of their release. Blue stars are released from the inner Lagrange point, so initially form the leading arm, while red stars are from the outer Lagrange point and form the trailing arm.
The left-hand column reveals that the folding and extension of the stream actually results in the arms being inverted; over a period of ≈ 1 Gyr, the leading (blue) arm is overtaken by the progenitor and becomes the trailing arm, and vice versa for the trailing (red) arm. This process is initiated by an encounter with Sgr at ≈ −2.7 Gyr; the right-hand column shows an acceleration of the stream particles along their direction of motion. Since Sgr passes in front of the stream, there is a gradient in the magnitude of this acceleration, with the leading tail experiencing the greatest force. The result is that the leading tail gains more energy than the trailing tail, so its new orbit has a longer period and it is overtaken. The middle column also shows a significant increase in the spread of energy and angular momentum.
Optimized GD-1 models with different δ Figure 5. 6D tracks of the optimized stream models (coloured) compared to the data (black). From top to bottom, the rows are 1 , distance, proper motions in 1 and 2 , proper motions relative to a polynomial fitted to the data, and radial velocity. All are plotted against 1 . Each column corresponds to a different value of , with the mass decay rate of Sgr increasing from left to right. The values of log-likelihood for these models are shown above each column.
The changes in energy and angular momentum of stars during an encounter can vary significantly along the stream. For the = 1.0 stream, some stars (around the centre) experience little change in their orbital parameters. However, stars in the original leading (trailing) tail gained (lost) energy and angular momentum, the latter changing by a factor of up to ∼ 1.4. The resultant gradients in energy and angular momentum along the stream offer a method by which an inverted stream can be detected, if kinematics can be measured precisely enough.
While this reversal happens quite rapidly, this is by no means always the case. Compare the on-sky appearance of the = 1.0 stream at = −1.91 Gyr to the = 1.0 Pal 5 stream in Fig. 2. These two streams share the same 'folding' of one of the tails, which in the GD-1 model leads to a complete inversion a few hundred Myr later. This folding and consequent bifurcation can therefore be seen as an intermediate stage of inversion, at which one part of the stream is being overtaken by another. This raises the possibility that a stream perturbed by Sgr could be observed undergoing an inversion today, as in our = 1.0 model of Pal 5. The distance, geometry and relative speed of the encounter are likely the main factors determining the timescale of the inversion. A closer or slower encounter would lead to a larger perturbation, and hence a larger gradient in the orbital frequencies along the stream. However, as the = 1.0 model suggests, this also results in a more dispersed and hence fainter stream. A more detailed investigation of this process and its results is required to determine the likelihood of observing such a stream, but is beyond the scope of this paper.
The optimized phase space positions themselves show few clear correlations with the Sgr mass. The most obvious trend is in the distance, though even this variation is small. Between the 'no Sgr' and = 1.0 models, the distance decreases monotonically from 7.47 to 7.27 kpc. This is accompanied by a change in * 1 from -13.39 to -13.27 mas/yr. These changes likely arise because the optimizer forces the stream to follow an orbit which leads to minimal damage at high Sgr masses. However, this is unlikely to be a general result applicable to the real stream, as such an orbit depends on the details of the Milky Way's potential over time.
To check the sensitivity of our results to the initial mass of the cluster, we generate streams from the optimized phase space coordinates (as in Fig. 5) using different initial values of prog . Webb & Bovy (2019) and Gialluca et al. (2021) have suggested initial masses of roughly 3 − 6 × 10 3 , so we test our models with prog = 5 × 10 3 . The streams generated from the lower mass clusters do not greatly differ from those shown in Fig. 5. The lower mass streams tend to have their stellar density more concentrated in their centres (near the progenitors). This is expected, since the difference in energy between the Lagrange points is smaller for a lower mass cluster, so the two tails should be less dispersed. Similarly, the sparse secondary stream in the = 1.0 model has a lower density when generated from the lower mass cluster. However, these differences are generally small and do not affect our conclusions.

Testing track-proper motion misalignment
A possible consequence of stream-satellite interactions is a misalignment between the on-sky stream track and proper motions, such that the motion of stars is not along the stream. This has been observed in the Orphan stream, resulting from an interaction with the LMC ). However, data from Gaia DR2 has not shown any inconsistency between the stream track and proper motions of GD-1 (de Boer et al. 2020). Here we search for a misalignment in both the EDR3 data and the model streams, to determine whether Sgr is expected to cause this effect. This is particularly relevant for the = 1.0 model, which appears to fit the data reasonably well despite being inverted. The presence of a clear misalignment in this model could help to rule out the inversion of GD-1. We estimate the on-sky gradient d 2 /d 1 and its uncertainty by fitting a straight line to stars in overlapping 1 bins of width 10 • . The proper motions are first corrected for solar motion, and the ratio 2 / 1 is calculated without the usual cos 2 factor in the 1 proper motion. Correcting for solar motion requires distance estimates of the stars in the data, which we take from a 4th-order polynomial fitted to the distance data. For simplicity we set the fractional uncertainties in distance to 0.1. We then calculate the weighted mean of the proper motion ratio values in each 1 bin.
The on-sky slope and proper motion ratio are plotted in Fig. 8 for the data (top panel), the model stream with no Sgr (middle panel) and the = 1.0 model (bottom panel). In all cases the slope and proper motion ratio are consistent with each other over the 1 range shown, with only a few error bars not overlapping. Hence the more precise EDR3 data support the findings of de Boer et al. (2020) that there is no significant misalignment in the observed stream. Neither of the model streams show clear misalignments either. Although the = 1.0 stream was inverted by a close encounter ∼ 2.7 Gyr ago, in almost all the bins there is very little difference between the slope and proper motion ratio. This may be because of the time elapsed since the encounter; any misalignment is likely to decrease as the stream spreads out along its orbit. With the addition of observational errors in distance and proper motions, any remaining signal could easily be unobservable. Hence the lack of an observed misalignment in GD-1 does not rule out past interactions with Sgr.

SURVIVAL OF MOCK GD-1-LIKE STREAMS
In this section we investigate the effects of a massive Sgr on streams that orbit at similar radii to GD-1, but which are not generally at the present-day location of GD-1. This addresses the probability of such a stream displaying observable signatures of a massive Sgr.

Generation of mock GD-1-like streams
We generate a large number of GD-1-like streams as follows. We integrate the orbit of the maximum likelihood progenitor with no δ = 1.0 stream at different times ( 1 , 2 ) is the Galactocentric polar coordinate system in which the instantaneous orbital plane of the progenitor is aligned with the 1 axis. The progenitor is situated at 1 = 2 = 0 • and its motion is in the direction of positive 1 (to the left). The stars are colour-coded according to the Lagrange point at which they were released. Blue (red) stars were released from near the inner (outer) Lagrange point, so initially formed the leading (trailing) tail at lower (higher) energy than the progenitor. Middle column: The same stars plotted in energy-angular momentum space, where the origin is shifted to the instantaneous position of the progenitor. Right column: The component of acceleration from Sgr (including MW reflex) along the velocity vector of each particle. An encounter with Sgr at around ≈ −2.7 Gyr causes the two tails to be inverted, and the stars which originally formed the trailing tail are now ahead of the leading tail in their orbit. The right-hand column shows that the leading (blue) tail experiences a greater acceleration along the stream than the trailing (red) tail, due to Sgr passing in front of the stream. This causes the leading tail's energy and angular momentum to increase above those of the trailing tail. The spread in energy and angular momentum of both tails also increases significantly as a result of this encounter. An animated version of this figure can be viewed at https://www.youtube.com/playlist?list=PLEleLLhXAwEMx6GcSror-iF-QsskHrXYM.
Sgr backwards for 6 Gyr, and select a segment covering exactly 12 radial orbital periods of total length 12 , from pericentre to pericentre. We find the galactocentric radius , radial velocity and tangential speed as functions of time. For each mock progenitor, we randomly generate a time from a uniform distribution between 0 and 12 , and assign the present-day radius and radial velocity to be those of GD-1 at the corresponding time. We also set the tangential velocity equal in magnitude to that of GD-1 at that time, but with a direction drawn from a uniform distribution. We also generate the Galactocentric position on the sky from a uniform distribution in solid angle. In this way we create a population of stream progenitors whose orbits have similar radial ranges to GD-1, but which are approximately uniformly distributed in radial phase and orientation. Note however that this cannot be achieved exactly in this non-spherical potential. For a variety of Sgr mass decay parameters , we again use the Lagrange Cloud Stripping technique to generate 1000 mock streams from these progenitors. This produces a sample of 1000 different streams for each , but each sample shares the same set of presentday progenitor phase space positions.
Since Sgr cannot have passed pericentre with a mass exceeding ∼ 10 10 in its last two orbits, any stream which formed recently is unlikely to be significantly perturbed by Sgr. Hence we instead study the effects on older streams whose stars were stripped before the Sgr pericentre at ≈ 2.7 Gyr. Therefore, we set disp = −3 Gyr for all the On-sky slope and proper motion ratio Figure 8. Comparison of on-sky gradient and ratio of proper motions of GD-1, for observations (top panel) and two optimized model streams (lower two panels). With the exception of a few isolated points, the slope and ratio are consistent in both the observations and the models. This is true even for the = 1.0 stream, which has been perturbed and inverted by Sgr. mock streams, so the stars are stripped only in the first 3 Gyr of the simulations.

Appearances of mock streams
The on-sky appearances of 49 mock streams are shown in Fig. 9, with no Sgr (top) and the = 1.0 model (bottom). The progenitor of each stream has the same present-day phase space position for both Sgr models. These plots reveal a variety of effects that Sgr can have on the appearance of streams. In the most extreme cases one of the tails is folded back on itself (e.g. top row, 3rd column from left), giving an appearance of two streams side by side. Several less perturbed streams instead have a smaller 'hook' at the end of one of the tails. There are also examples of inverted streams, as seen in the optimized = 1.0 GD-1 model (Fig. 7). In these streams, the originally trailing (red) tails now lead the progenitors (i.e. at positive 1 ), and vice versa for the initially leading (blue) tails. Despite the reversals these streams can show little other evidence of perturbations, remaining narrow and lying close to the 1 axes (e.g. bottom row, 2nd from right). Observations of such streams may therefore provide no indication that they have been strongly perturbed, depending on the quality of the data. There are also many streams whose appearances are not greatly affected by increasing the mass of Sgr. Some experience nothing more than a small change in the length of one or both tails (e.g. top row, left-most column). It is therefore possible for streams to survive a pericentric passage of a massive Sgr without showing obvious visible signs of an encounter.

Energy distributions
Stellar streams form tightly clustered groups in energy-angular momentum space (e.g. Bonaca et al. 2021). One approach to measuring the survival chance of the streams is therefore to measure the spreads of energy and angular momentum. While a stream spread out in energy space does not necessarily mean it will appear disrupted (e.g. see Fig. 7), it remains a strong indication that the stream has been perturbed. Since we are using a triaxial potential, the angular momentum L is not conserved. However, for orbits in a static MW potential without the LMC or Sgr, the energy is an integral of motion. We therefore focus on the energy, defined by = 1 2 2 + Φ MW , where Φ MW is the static MW potential. We treat the time-dependent parts of the potential induced by the LMC and Sgr as perturbations causing to vary with time.
In this section, we generate only leading streams; a pair of unperturbed leading and trailing streams has a bimodal energy distribution, which we wish to avoid when studying the spread in energies. Therefore, a large majority of stars in an unperturbed stream are expected to have lower energy than the progenitor from which they were stripped. To see how this is affected by the perturbations, we calculate the difference in energy Δ between each star and its progenitor. We also find the standard deviation of the stream star energies for each mock stream, which is a measure of how tightly clustered the stream is in energy space. The median Δ and are plotted for each mock stream and for each value of as functions of time in Fig. 10. The present-day distributions are also plotted in Fig. 11.
Until about 1 Gyr ago, in the absence of Sgr the median Δ and remain almost constant, and there is little variation between the different streams. This is expected, since is a constant of motion in the static MW potential if the LMC can be ignored. This changes in the last billion years, and the distributions of and median Δ become much more spread out. This is due to the LMC perturbing the energy distributions of the streams. The majority become more spread out in energy space (the median increases), but for a significant fraction decreases and the streams become more tightly clustered in energy space.
The same effects are seen when a massive Sgr is introduced. At each pericentric passage of Sgr, streams rapidly become either more or less spread out in enery space over. A minority of these leading streams also have their median energy rise above that of the progenitor; these can be associated with the inverted streams previously discussed, where the progenitor overtakes the leading tail. The top row of Fig. 11 shows the distributions of and median Δ before the LMC begins to affect them. The right-hand panel confirms that the distribution of spreads out when the mass of Sgr is increased. With = 0.7 or 1.0, the distribution has a long tail at large ; these are the streams which become significantly less clustered in energy, and consequently are more likely to disperse spatially. In the last Gyr the LMC erases some of the differences between the distributions, but remains higher on average with the larger values of . The median increases by a factor of about 1.4 with = 1.0 compared to with no Sgr. A massive Sgr therefore only has a mild effect on the average energy spread of streams.

Surface brightness
The disrupted appearances of many of the streams in Fig. 9 due to Sgr suggests that the encounters may affect the surface density of No Sgr δ = 1.0 Figure 9. Top: On-sky appearance of 49 mock streams on GD-1-like orbits, with no perturbations from Sgr. The streams are viewed from the galactic centre, and the 1 axes (grey dashed lines) are the instantaneous orbital planes of the stream progenitors. As in Fig. 7, the blue (red) stars represent initially leading (trailing) tails. Bottom: the same 49 streams (i.e. generated from progenitors with the same present-day phase space positions), but with perturbations from the = 1.0 Sgr model. While many streams are largely unchanged, several exhibit disruption from encounters with Sgr. This includes bifurcations resulting from stream 'folding' (e.g. top row, third from left), asymmetry in the lengths of the two arms (e.g. second row from bottom, right-most column), and inversion of the arms (e.g. bottom row, second from right).  Energy distributions of mock streams  stars in the stream, and hence the surface brightness and probability of detection. While some streams are widened and lengthened with reduced density, others have at least one tail significantly shortened, which may increase the local surface brightness. When studying the dynamics of subhalo encounters with streams, Erkal & Belokurov (2015) similarly found that stream gaps caused by such interactions are associated with caustics of higher density on either side. It is therefore unclear whether the presence of a massive Sgr will increase or decrease the surface brightness and detectability of streams. We consider the relative surface brightness of stars as viewed from the centre of the galaxy. We first divide the mock stars in the range −60 • < 1 < 60 • into 12 bins of width 10 • . Reusing the method described in Section 4.3, we fit a KDE to the 2 distribution in each bin, after subtracting a fitted parabola from the 2 values. The contribution from each mock star to the KDE is weighted by its flux as viewed from the galactic centre, using a fiducial luminosity of 1 . The peak surface brightness (in units of flux density per solid angle) is taken as the maximum of the KDE, normalised by the total flux from stars in the bin and the bin width. This is then converted to a surface brightness with units of magnitudes per square arcsecond. This is only a relative value, since changing the stripping rate or stellar luminosity would shift each of the values. Distributions of the peak surface brightness are plotted in the top panel of Fig. 12 for four Sgr models, with the median values marked with dashed vertical lines for the cases with no Sgr and = 1.0. The effect of Sgr on the peak surface brightness is reasonably small. Compared to with no Sgr, the = 1.0 case only causes the median surface brightness to dim by 0.3 mag/arcsec 2 , and the distributions have similar widths for all Sgr models. However, the = 1.0 model does decrease the proportion of streams at high brightness by up to ∼ 50%. Hence a massive Sgr does disperse the densest parts of some streams, but is unlikely to prevent their discovery entirely.
In the lower panel of Fig. 12 we also plot distributions of the streams' angular lengths as viewed from the centre of the galaxy. This is simply defined as the difference between the 95th and 5th percentiles of their 1 distributions. In the absence of Sgr virtually all streams have lengths between 30 • and 160 • at the present-day. However, introducing a massive Sgr results in more streams with both shorter and longer lengths, anywhere between 0 • and 360 • . This is unsurprising given the appearances of the streams in Fig. 9; some are significantly shortened while others are more dispersed along the 1 axis.

CONCLUSIONS
We have conducted test particle simulations of stellar streams in the Milky Way under the influence of a Sgr dSph with various initial masses, up to 5 × 10 10 . Each prescribed model of Sgr loses a fixed fraction of its mass at each pericentre, ending with a presentday mass of 4×10 8 . The aim of our analysis is to address whether such high masses are compatible with the survival of streams formed over 3 Gyr ago, with a focus on the well-studied GD-1 stream. Our principal conclusions are summarised below.
(i) A Sgr with an initial mass of 10 10 is able to cause significant disruption to old stellar streams in the inner halo of the Milky Way. As examples we have generated mock stellar streams from three known globular clusters, and shown that in each case the stream becomes significantly more perturbed when the initial mass of Sgr is raised to 4 × 10 10 or more. However, this only applies to the stars stripped more than ∼ 2.7 Gyr before the present-day. This is because realistic models of Sgr require at least two orbital periods to lose mass from ∼ 10 10 to the present-day value of 4 × 10 8 . Encounters with a massive Sgr can therefore occur no more recently than around the pericentre two orbital periods ago. In our choice of potential this pericentre occurs about 2.7 Gyr before the present, though this time will be shorter for some other realistic and commonly used potentials (e.g. McMillan 2017). Any stream formed since that time is able to grow unperturbed, and the effects of Sgr can be safely ignored. (ii) Pal 5 tidal tails. We have generated models of the Pal 5 tidal tails that qualitatively reproduce features in the observed stream, in particular the asymmetry between the leading and trailing tails. This is produced by an interaction with Sgr of mass > 10 9 , which causes one of the tails to be folded back on itself and shortened. While the simulated shortened tails tend to be longer than the observed one, we find that this is sensitive to the model setup. An interaction with Sgr therefore remains a possible candidate for the origin of the Pal 5 asymmetry. (iii) Fitting the GD-1 stream with different models of Sgr. For several different mass decay profiles of Sgr, we fitted models to 6D data of the GD-1 stream using a kernel density estimate to define the likelihood. For each mass of Sgr, the optimized model fitted the data reasonably well, with the most obvious deviations being in the on-sky coordinate 2 . For the highest initial mass (5 × 10 10 , the = 0.7 model), an encounter with Sgr creates off-track features parallel to the main track. The offsets of these features from the main stream are comparable to those of the 'spur' and 'blob' in the observed GD-1 stream, so the degree of disruption induced by this mass of Sgr is consistent with observations. Hence, we conclude that a Sgr of mass ∼ 5 × 10 10 as predicted by Read & Erkal (2019) and Laporte et al. (2018) is compatible with the current state of the GD-1 stream, even if the stream was formed more than 3 Gyr ago. (iv) Stream reversals. One of the highest-likelihood GD-1 models corresponds to the fastest-decaying Sgr model ( = 1.0, initial mass 4 × 10 10 ). The mean track of the mock stars fits the data reasonably well in all dimensions of phase space. Interestingly however, the stream is inverted with respect to its original orientation. The stars released from the inner Lagrange point which originally formed the leading tail now form the trailing tail, behind the stars released from the outer Lagrange point. This reversal is initiated by an encounter with Sgr and happens over a period of ∼ 1 Gyr. It results from Sgr passing in front of the stream, causing the leading stars to move onto orbits with slightly lower frequencies. They are subsequently overtaken by the trailing stars over the next orbital periods, accompanied by a contraction of the stream. The stream then extends again along its orbit, with the originally trailing stars now leading. We find that neither the data nor this stream shows any significant misalignment between the on-sky slope and proper motion ratio, so such a scenario is difficult to rule out without more precise data. It is notable that such a dramatic change could take place and leave such a well-fitting stream.
(v) Mock streams. We generated 1000 mock streams with apocentric and pericentric distances similar to those of GD-1, but with randomized orbital planes and phases. We set the dispersal time of each cluster to disp = −3 Gyr so that all stars are susceptible to interactions with Sgr around the pericentre at ≈ −2.7 Gyr. The massive Sgr models create a wide range of visible features in the streams, including asymmetry between the lengths of the two tails, bifurcations induced by 'folding' of a tail, and reversals of the leading and trailing tails. (vi) Energy distributions of mock streams. We calculated the energy of stars in each mock streams, using only the leading tails (i.e. stars stripped from the inner Lagrange point). We found their median offset in energy Δ from that of the progenitor, and the standard deviations of their energies . At = −1 Gyr, a massive Sgr has caused the spread in energy to increase for most streams, by up to an order of magnitude. However, the distributions of for each Sgr model are more similar at the present-day, due to the influence of the LMC in the last Gyr. With a Sgr of initial mass 4 × 10 10 (the = 1.0 model), the present-day median value of is also increased by only a factor of 1.4 compared to the case with no Sgr. The present-day clustering of stream stars in energy space is therefore a poor indicator of the initial mass of Sgr. (vii) Surface brightness and length of mock streams. We estimated the peak surface brightness of stars in each of the same 1000 mock streams as viewed from the centre of the galaxy. The = 1.0 Sgr model (with initial mass 4 × 10 10 ) reduces (dims) the median peak surface brightness by about 0.3 mag/arcsec 2 compared to the model with no Sgr. We conclude that a massive Sgr only slightly reduces the chance of detecting a stream formed before = −3 Gyr, so mass estimates of ∼ 5 × 10 10 are compatible with observed streams being more than 3 Gyr old. The massive Sgr models also result in a much wider range of stream lengths than with no Sgr.
We have demonstrated that a Sgr of mass ∼ 10 10 is capable of causing significant disruption to stellar streams formed more than 3 Gyr ago, including folding and reversal of the arms. However, these large masses are compatible with the survival of the streams, and it is possible to recreate GD-1 with no more damage than is observed. Our results suggest that the influence of Sgr should be considered when studying the perturbations of streams in the Milky Way, as encounters over 2.5 Gyr could result in present-day disruption. Being one of the largest satellites of the Milky Way, knowing the former mass of Sgr is key to reconstructing its history. As new data from Gaia and other surveys becomes available, stellar streams will play a crucial role in understanding our galaxy's past.