Topics in Gravitational-Wave Astronomy Theoretical studies, source modelling & statistical methods ALVIN J. K. CHUA CLARE HALL / INSTITUTE OF ASTRONOMY / UNIVERSITY OF CAMBRIDGE A dissertation submitted for the degree of Doctor of Philosophy Supervisor: Dr Jonathan R. Gair 1 MARCH 2017 ii Abstract Astronomy with gravitational-wave observations is now a reality. Much of the theoretical research in this field falls under three broad themes: the mathe- matical description and physical understanding of gravitational radiation and its effects; the construction of accurate and computationally efficient wave- form models for astrophysical sources; and the improved statistical analysis of noisy data from interferometric detectors, so as to extract and characterise source signals. The doctoral thesis presented in this dissertation is an investi- gation of various topics across these themes. Under the first theme, we examine the direct interaction between gravita- tional waves and electromagnetic fields in a self-contained theoretical study; this is done with a view to understanding the observational implications for highly energetic astrophysical events that radiate in both the gravitational and electromagnetic sectors. We then delve into the second theme of source mod- elling by developing and implementing an improved waveform model for the extreme-mass-ratio inspirals of stellar-mass compact objects into supermassive black holes, which are an important class of source for future space-based de- tectors such as the Laser Interferometer Space Antenna. Two separate topics are explored under the third theme of data analysis. We begin with the procedure of searching for gravitational-wave signals in de- tector data, and propose several combinatorial compression schemes for the large banks of waveform templates that are matched against putative signals, before studying the usefulness of these schemes for accelerating searches. Af- ter a gravitational-wave source is detected, the follow-up process is to mea- sure its parameters in detail from the data; this is addressed as we apply the machine-learning technique of Gaussian process regression to gravitational- wave data analysis, and in particular to the formidable problem of parameter estimation for extreme-mass-ratio inspirals. iii iv Acknowledgements The completion of a doctoral thesis is an intrinsically individual and (some might say) thankless experience. It is however by no means isolated, and should not pass without a word of thanks to those who have helped. I am indebted to my supervisor Jon Gair for the opportunity of working with him, the overarching guidance he has provided, and indeed just for be- ing so good at what he does. His broad and creative approach to research is inspiring, while his experience and stature in the field have been an invaluable conduit for my exposure to gravitational-wave astronomy. Further gratitude goes to Chris Moore, who is always ready to talk science, and equally adept at bouncing ideas around or combing through the details. Daily life in the office has been an enriching—and often exhausting—learning experience thanks to both him and Rob Cole, who has provided much help as well over the years. I have likewise enjoyed my interactions with other past members of the IoA gravity group, some of whom I have had the addi- tional pleasure of collaborating with: Christopher Berry, Tom Callister, Priscilla Can˜izares, Sonke Hee, Eliu Huerta, and Steve Taylor. For graciously serving as both my academic referees and thesis examiners, as well as for advice freely proffered and often heeded, I thank Leor Barack and Anthony Lasenby. I am also grateful for science- and job-related dis- cussions with various researchers at the IoA and around the globe: Mustafa Amin, Stas Babak, John Baker, Anthony Challinor, Scott Field, Ik Siong Heng, Scott Hughes, Donald Lynden-Bell, Nikku Madhusudhan, Michele Vallisneri, Maarten van de Meent, and Yan Wang (both of them). Finally, I greatly appreciate the assorted assistance afforded by Debbie Pe- terson and Margaret Harding at the IoA, Irene Hills and Magda Bergman at Clare Hall, and the good people at the Cambridge Trust. This thesis is dedicated to my family—for everything, and then some. v vi Preface This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration, except as declared in the preface and specified in the text. It is not substantially the same as any that I have submitted, or that is being concurrently submitted for a degree or diploma or other qualification at the University of Cambridge or any other university or similar institution, except as declared in the preface and specified in the text. I further state that no substantial part of my dissertation has already been submitted, or is being concurrently submitted for any such degree, diploma or other qualification at the University of Cambridge or any other university or similar institution, except as declared in the preface and specified in the text. All concepts, results and data obtained by others are properly referenced. The use of the collaborative “we” in the text is a stylistic choice. Some of the material in this dissertation has been adapted from the follow- ing journal publications: • A. J. K. Chua, P. Can˜izares & J. R. Gair, Electromagnetic signatures of far- field gravitational radiation in the 1+3 approach, Class. Quantum Grav. 32, 015011 (2015). • A. J. K. Chua & J. R. Gair, Improved analytic extreme-mass-ratio inspiral model for scoping out eLISA data analysis, Fast Track Communication, Class. Quantum Grav. 32, 232002 (2015). • A. J. K. Chua, Augmented kludge waveforms and Gaussian process regression for EMRI data analysis, J. Phys.: Conf. Ser. 716, 012028 (2016). • A. J. K. Chua & J. R. Gair, Tunable compression of template banks for fast gravitational-wave detection and localisation, Phys. Rev. D 93, 122001 (2016). vii viii • C. J. Moore, C. P. L. Berry, A. J. K. Chua & J. R. Gair, Improving gravitational-wave parameter estimation using Gaussian process regres- sion, Phys. Rev. D 93, 064001 (2016). • C. J. Moore, A. J. K. Chua, C. P. L. Berry & J. R. Gair, Fast methods for train- ing Gaussian processes on large datasets, R. Soc. Open Sci. 3, 160125 (2016). Explicit citation of these papers indicates results that have since been updated (as in Chapter 4), or work done mainly in collaboration (as in Chapter 6). This dissertation is shorter than 60,000 words in length, including the ab- stract, tables, footnotes and appendices. The word count is 52,000 words (eval- uated using Kile). — Alvin J. K. Chua 1 March 2017 Contents Abstract iii Acknowledgements v Preface vii 1 Introduction 1 1.1 Chapter synopses . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2.1 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2.2 Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Gravitational-wave astronomy 7 2.1 Gravitational waves in general relativity . . . . . . . . . . . . . . 7 2.1.1 Linearised formalism . . . . . . . . . . . . . . . . . . . . . 8 2.1.2 Gravitational-wave generation . . . . . . . . . . . . . . . 9 2.1.3 Gravitational-wave propagation . . . . . . . . . . . . . . 11 2.1.4 Gravitational-wave effects . . . . . . . . . . . . . . . . . . 12 2.2 Sources and waveform modelling . . . . . . . . . . . . . . . . . . 13 2.2.1 Source overview . . . . . . . . . . . . . . . . . . . . . . . 13 2.2.2 Binary-coalescence modelling . . . . . . . . . . . . . . . . 15 2.2.3 Comparable-mass mergers . . . . . . . . . . . . . . . . . 18 2.2.4 Extreme-mass-ratio inspirals . . . . . . . . . . . . . . . . 19 2.3 Detectors and data analysis . . . . . . . . . . . . . . . . . . . . . 21 2.3.1 Detector overview . . . . . . . . . . . . . . . . . . . . . . 22 2.3.2 Signal processing . . . . . . . . . . . . . . . . . . . . . . . 23 2.3.3 Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.3.4 Parameter estimation . . . . . . . . . . . . . . . . . . . . . 28 ix x Contents 3 Einstein–Maxwell interactions 31 3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 3.2 The 1+3 approach to relativity . . . . . . . . . . . . . . . . . . . . 34 3.3 Linearised far-field equations . . . . . . . . . . . . . . . . . . . . 37 3.4 Solutions and observational implications . . . . . . . . . . . . . 41 3.4.1 Static electromagnetic field . . . . . . . . . . . . . . . . . 41 3.4.2 Electromagnetic radiation . . . . . . . . . . . . . . . . . . 45 3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 4 Augmented kludge waveforms 55 4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 4.2 Kludge waveform models . . . . . . . . . . . . . . . . . . . . . . 57 4.2.1 Analytic kludge . . . . . . . . . . . . . . . . . . . . . . . . 59 4.2.2 Numerical kludge . . . . . . . . . . . . . . . . . . . . . . 62 4.3 Augmented analytic kludge . . . . . . . . . . . . . . . . . . . . . 65 4.3.1 The fundamental-frequency map . . . . . . . . . . . . . . 67 4.3.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . 69 4.3.3 Benchmarking . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 5 Template bank compression 81 5.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 5.2 Conic compression . . . . . . . . . . . . . . . . . . . . . . . . . . 84 5.2.1 Partition scheme . . . . . . . . . . . . . . . . . . . . . . . 89 5.2.2 Symmetric base scheme . . . . . . . . . . . . . . . . . . . 92 5.2.3 Binomial coefficient scheme . . . . . . . . . . . . . . . . . 97 5.2.4 Scheme comparison . . . . . . . . . . . . . . . . . . . . . 100 5.3 Orthogonality and subspaces . . . . . . . . . . . . . . . . . . . . 104 5.3.1 Non-orthogonal templates . . . . . . . . . . . . . . . . . . 104 5.3.2 Two-dimensional subspace . . . . . . . . . . . . . . . . . 108 5.4 Taylor-T2 example . . . . . . . . . . . . . . . . . . . . . . . . . . 111 5.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 Contents xi 6 Gaussian process regression 121 6.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 6.2 The marginalised likelihood . . . . . . . . . . . . . . . . . . . . . 123 6.2.1 The Gaussian process prior . . . . . . . . . . . . . . . . . 125 6.2.2 Training the Gaussian process . . . . . . . . . . . . . . . . 128 6.2.3 Application to black-hole binary mergers . . . . . . . . . 129 6.3 Application to extreme-mass-ratio inspirals . . . . . . . . . . . . 131 6.3.1 One-dimensional example . . . . . . . . . . . . . . . . . . 134 6.3.2 Two-dimensional example . . . . . . . . . . . . . . . . . . 138 6.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142 7 Conclusion 145 A Induced Maxwell field solution 149 B Combinatorial design theory 153 C Taylor-T2 waveform expansions 155 xii Contents List of Figures and Tables 1.1 Table: Glossary of abbreviations . . . . . . . . . . . . . . . . . . . 6 3.1 Figure: Magnetic-field amplitude of induced electromagnetic field, for static and uniform free field . . . . . . . . . . . . . . . . 43 3.2 Figure: Poynting flux amplitude envelope of induced electro- magnetic field, for radiative and planar free field . . . . . . . . . 48 3.3 Figure: Time-averaged Poynting flux of induced electromag- netic field, for radiative and planar free field . . . . . . . . . . . 50 4.1 Figure: Comparison of early-inspiral waveforms across three kludge models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 4.2 Figure: Comparison of characteristic strain for year-long wave- forms across three kludge models . . . . . . . . . . . . . . . . . . 74 4.3 Table: Comparison of two- and six-month overlaps for original and augmented analytic kludge waveforms, with respect to nu- merical kludge waveforms . . . . . . . . . . . . . . . . . . . . . . 75 4.4 Figure: Two- to six-month overlaps and computational speed- up for augmented analytic kludge waveforms with varying compact-object mass . . . . . . . . . . . . . . . . . . . . . . . . . 76 4.5 Figure: Two- to six-month overlaps and computational speed- up for augmented analytic kludge waveforms with varying black-hole spin . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76 4.6 Figure: Two- to six-month overlaps and computational speed- up for augmented analytic kludge waveforms with varying ini- tial eccentricity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 5.1 Figure: Optimal detection surface for uncorrelated statistics . . 87 xiii xiv LIST OF FIGURES AND TABLES 5.2 Figure: Receiver operating characteristic curves for partition scheme with various detection statistics . . . . . . . . . . . . . . 90 5.3 Figure: Optimal detection surface for correlated statistics of symmetric base scheme . . . . . . . . . . . . . . . . . . . . . . . . 94 5.4 Figure: Receiver operating characteristic curves for symmetric base scheme with various detection statistics . . . . . . . . . . . 95 5.5 Figure: Comparison of compression rates for various conic com- pression schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 5.6 Figure: Comparison of discrimination and identification ac- curacy for partition, symmetric base and binomial coefficient schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 5.7 Figure: Discrimination and localisation accuracy for partition scheme variants and coarsening approach, with non-orthogonal template bank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 5.8 Figure: Discrimination and identification accuracy for partition scheme, with one- and two-dimensional signal injections . . . . 109 5.9 Figure: Visualisation of signal overlaps with Taylor-T2 template bank at various levels of partition-scheme compression, for cen- tral signal injection . . . . . . . . . . . . . . . . . . . . . . . . . . 113 5.10 Figure: Detection rates and localisation accuracy for Taylor-T2 template bank with partition scheme and coarsening approach, applied to central signal injection . . . . . . . . . . . . . . . . . . 114 5.11 Figure: Visualisation of signal overlaps with Taylor-T2 template bank, for random and boundary signal injections . . . . . . . . . 116 5.12 Figure: Detection rates and localisation accuracy for Taylor-T2 template bank with partition scheme and coarsening approach, applied to random and boundary signal injections . . . . . . . . 117 6.1 Figure: Log-hyperlikelihood for metric hyperparameter of co- variance function over one-dimensional parameter space, with various training-set densities . . . . . . . . . . . . . . . . . . . . 135 LIST OF FIGURES AND TABLES xv 6.2 Figure: Accurate, approximate and marginalised likelihoods over one-dimensional parameter space, with various training- set densities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 6.3 Figure: Visualisation of various training-set grids and correla- tion ellipses over two-dimensional parameter space . . . . . . . 140 6.4 Figure: Accurate, approximate and marginalised likelihood contours over two-dimensional parameter space, with various training-set densities . . . . . . . . . . . . . . . . . . . . . . . . . 141 xvi LIST OF FIGURES AND TABLES Chapter 1 Introduction Gravitational waves (GWs) are radiative perturbations in the curvature of spacetime. They are a confirmed prediction of Einstein’s general theory of relativity [1], and hence an essential feature of any other viable theory of grav- itation. The measurement of GW effects is on the long list of tests that general relativity has undergone (and passed with flying colours), but the intrinsic weakness of these effects means they can only be detected for astrophysical sources in much stronger gravitational fields than those probed by Solar Sys- tem experiments. This allows GWs to place exclusive constraints on gravity in the strong-field regime, as well as to reveal and describe highly energetic sources that might be inaccessible to traditional electromagnetic astronomy. Radio observations of the Hulse–Taylor binary pulsar PSR B1913+16 [2] in the 1970s provided the first compelling evidence for the existence of GWs. This pulsar and its neutron star companion are losing orbital energy through the emission of GWs, which causes the orbit of the binary to shrink; the rate of de- crease in orbital period that is predicted by general relativity agrees exquisitely with the measured value. A number of similar binaries were later found and observed with even greater precision [3, 4], lending further credence to GWs and their description in the underlying theory. The detection of GWs in a more direct sense (i.e. measuring their effects as they pass through a detector) is notoriously difficult, due to the weak cou- pling of gravity to matter. Efforts to directly detect astrophysical GWs began with the resonant detectors of the 1960s, which were designed to pick up res- onantly amplified antenna oscillations caused by a passing wave [5, 6]. Al- though unsuccessful, these attempts paved the way for the adoption of laser 1 2 Introduction interferometry in GW detection, and the eventual construction of large-scale interferometers such as the Laser Interferometer Gravitational-wave Observa- tory (LIGO) [7]. These highly sensitive instruments operate by measuring the phase difference of light along their angled arms, which changes in response to the miniscule distortions of spacetime as a GW passes through. In 2015, the twin Advanced LIGO detectors situated at both ends of the continental United States simultaneously recorded a transient GW signal from the merger of two stellar-mass black holes [8]. This discovery was essentially a pair of first detections: not just that of GWs, but also the energy radiated directly by a black-hole system (which is dark in the electromagnetic sector by definition). It was shortly followed by another confirmed black-hole binary merger in the same observational run [9, 10]. More detections of stellar-mass GW sources are anticipated during the presently ongoing run and into the next decade, as Advanced LIGO is brought up to full design sensitivity. While the LIGO discoveries are the culmination of a generation-long re- search and development programme targeting the direct detection of GWs, they are also the herald of a brand new era in astronomy—one that extends its reach beyond the electromagnetic spectrum for the first time, and into the radiation spectrum of gravity. GW sources with characteristic masses of less than 103 Solar masses radiate most strongly in the kilohertz (101–104 Hz) band of the gravitational spectrum. These sources will be found and observed by a global detector network that comprises Advanced LIGO and other ground- based interferometers such as Advanced Virgo [11] and KAGRA [12]. Another exciting prospect in GW astronomy is the detection of longer-lived signals from sources exceeding 105 Solar masses; these will involve the super- massive black holes (SMBHs) that reside at the centres of galaxies, and thus revolutionise the study of formation rates and evolution scenarios for such ob- jects. There will be an abundance of these signals in the sensitivity band of millihertz (10-4–100 Hz) space-based detectors such as the proposed Laser In- terferometer Space Antenna (LISA) [13]1, or more ambitious missions such as DECIGO [15] and TianQin [16]. Although LISA is still over a decade away 1 This is the recently submitted L3 mission proposal for ESA’s Cosmic Vision programme, in which the design of LISA has been restored from the down-scoped eLISA instrument [14] to an earlier configuration with higher sensitivity. §1.1 Chapter synopses 3 from coming to fruition, recent results from the precursor LISA Pathfinder satellite [17] have already successfully demonstrated the technological viabil- ity of observing GW sources from space [18]. At even lower frequencies, precise timing observations of Galactic millisec- ond pulsars by radio telescopes allow arrays of such pulsars to be used as natural detectors in the nanohertz (10-9–10-6 Hz) gravitational band. An inter- national consortium of three pulsar timing array collaborations [19] has been formed to detect and study GW sources in this band. The most promising source for these arrays is the stochastic background signal from the cosmolog- ical population of inspiralling SMBH binaries, for which detection is expected within the next ten years [20]. With the advent of pulsar timing arrays and space-based detectors, a near-complete observational coverage of the gravi- tational spectrum can be obtained—this will in turn unlock the full scientific potential of multiband GW astronomy [21], thereby enhancing the synergy of the nascent field with its electromagnetic counterpart. 1.1 Chapter synopses We provide an introductory overview of GW astronomy in Chapter 2, with a strong focus on its theoretical aspects: the mathematical/physical framework of GWs in general relativity, the astrophysical modelling of source signals, and the statistical analysis of detector data. In Chapter 3, the interactions between far-field GWs and electromagnetic fields are studied using the 1+3 approach to general relativity. Many known ef- fects are rederived in this framework, while new results are also found. These interactions are typically extremely weak even under astrophysical conditions, but can give rise to potentially observable electromagnetic signatures. Chapter 4 describes the development and implementation of an improved approximate waveform model for the extreme-mass-ratio inspirals of stellar- mass compact objects into supermassive black holes, which will be observable by LISA and other space-based detectors. The computationally efficient new model uses the three frequencies of an orbit around a spinning Kerr black hole to better match the accuracy of more expensive models. 4 Introduction In Chapter 5, a general method of tunable compression is proposed for the large banks of waveform templates that are used in GW detection searches. Various combinatorial schemes under this method are investigated with toy waveform models. The best one is applied to a simple example featuring actual waveforms, and yields results that are promising for practical implementation. Chapter 6 explores the viability of using the machine-learning technique known as Gaussian process regression in GW parameter estimation. The method employs a Bayesian approach to account for the systematic bias that might occur due to theoretical error in waveform models. Through low- dimensional studies, it is demonstrated to be potentially useful for measuring the astrophysical parameters of extreme-mass-ratio inspirals. Finally, concluding remarks and additional discussion of possible future research directions are given in Chapter 7. 1.2 Preliminaries Readers are assumed to have at least a rudimentary knowledge of differential geometry, general relativity, Bayesian statistics and signal analysis. 1.2.1 Conventions Throughout this dissertation, we use the common set of conventions laid out in this section, unless otherwise stated here (and in the respective chapters). Notation in each chapter has been chosen for intuitiveness, concordance with standard notation, and to avoid symbolic overload. Due to the variety of topics covered in this dissertation, adherence to the first two constraints may give rise to symbols that are multiply defined across chapters (or even across sections in the same chapter), but we trust context to prevent confusion. • Latin (spacetime) indices run from 0 to 3, while Greek (space) indices run from 1 to 3; exceptions to this rule will be explicitly flagged. The Einstein summation convention applies only to lower–upper pairs of indices. §1.2 Preliminaries 5 • The metric signature is (−,+,+,+), and the Riemann tensor sign con- vention is Rab = Rcacb. • The partial-derivative operator ∂a is denoted by an index comma: T,a := ∂aT. (1.1) The covariant-derivative operator∇a is denoted by an index semicolon: T;a := ∇aT. (1.2) • Symmetrisation is denoted by round brackets around indices, e.g. T(ab) := 1 2 (Tab + Tba). (1.3) Skew-symmetrisation is denoted by square brackets around indices, e.g. T[abc] := 1 6 (Tabc + Tbca + Tcab − Tacb − Tbac − Tcba). (1.4) • Geometrised units such that c = G = 1 are adopted, where c is the speed of light in vacuum and G is the gravitational constant. Exceptions occur where units are explicitly restored, and also throughout Chapter 3 where the cosmological convention c = 8piG = µ0 = 1 is used (with µ0 being the magnetic permeability in vacuum). • Cartesian coordinates are implied unless otherwise stated: [xa] := (t, x, y, z). (1.5) • In the context of time-varying quantities q(t), q0 denotes q(0). Overdots are reserved for time derivatives with respect to coordinate time; any other time derivatives will use Leibniz’s notation. • Vectors, matrices and tensors are typically denoted by boldface letters or symbols. Their individual components are typically denoted by indices 6 Introduction on the corresponding italicised letters or symbols. Overhats might be used to emphasise that a vector has unit norm, but are often omitted to ease notation when emphasis is unnecessary. • Complex conjugation is denoted by a superscript ∗, and Fourier trans- forms are denoted by an overtilde. • The common (base-10) logarithm is denoted by lg, while the natural log- arithm is denoted by ln. • The expected value of a random variable/field is denoted by E. 1.2.2 Abbreviations AAK Augmented analytic kludge AK Analytic kludge EMRI Extreme-mass-ratio inspiral EMW Electromagnetic wave GPR Gaussian process regression GW Gravitational wave LIGO Laser Interferometer Gravitational-wave Observatory LISA Laser Interferometer Space Antenna NK Numerical kludge NR Numerical relativity PN Post-Newtonian ROC Receiver operating characteristic SMBH Supermassive black hole SNR Signal-to-noise ratio Table 1.1: Glossary of abbreviations. Chapter 2 Gravitational-wave astronomy Theoretical research in the field of GW astronomy falls generally under three broad themes (although this taxonomy is neither standard nor sharply demar- cated). The first and most fundamental is the study of mathematical frame- works in which the physics of GWs may be described and understood. An- other theme is the usage of such frameworks in constructing waveform models for astrophysical GW sources, allowing general relativity and other theories of gravitation to be tested against observations. Finally, advanced statistical and computational methods are required for the extraction of GW signals from the noisy detector data, and the characterisation of their source properties; these two procedures are known respectively as detection and parameter estimation. An overview of the basic approaches that underpin these three themes is provided in this chapter. It focuses on introducing context and relevant con- cepts for the rest of the dissertation, and is not intended to be comprehensive. 2.1 Gravitational waves in general relativity GWs are most simply described using the linearised form of general relativity, which is a sound approximation to the full theory in the weak-field region of GW sources, i.e. where the background is near-Minkowski (flat or slightly curved) and the spacetime perturbations are small. In this section, we outline the canonical description of GWs in linearised general relativity, following the influential treatment of Misner, Thorne & Wheeler [22]. 7 8 Gravitational-wave astronomy 2.1.1 Linearised formalism The metric gab of a GW-perturbed spacetime may be written as gab = ηab + hab, (2.1) where [ηab] = diag(−1, 1, 1, 1) represents the Minkowski metric in Cartesian coordinates, and the components hab of the metric perturbation are small (|hab|  1). We substitute (2.1) into the Einstein field equations Rab − 1 2 gabR = 8piTab, (2.2) and linearise in hab by neglecting terms that are O(|hab|2). In this approxima- tion, all indices may be raised and lowered using ηab instead of gab; expressions for the Ricci tensor Rab and scalar curvature R (in terms of the metric compo- nents) then depend only on second partial derivatives of hab. By further writing hab in trace-reversed form hab = h¯ab − 1 2 tr(h¯)ηab, (2.3) where tr(h¯) := h¯ aa = −h aa , and imposing the Lorenz gauge condition h¯ab,b = 0, (2.4) Equation (2.2) reduces to a Minkowski-space wave equation for h¯ab that is sourced by the stress–energy tensor Tab, i.e. h¯ab = −16piTab, (2.5) where h¯ab := h¯ cab,c defines the flat-space d’Alembert operator . The wave equation (2.5) is valid as long as the gravitational field is weak, e.g. far from any GW source, or close to a nearly Newtonian source (charac- terised by T00  |Tab| for all nonzero a or b). A similar equation applies for strong fields in full general relativity, but with a curved-space d’Alembertian and an additional effective source term −16pi∆Tab that contains all quadratic- §2.1 Gravitational waves in general relativity 9 and higher-order terms in hab. The general solution to (2.5) is given by the retarded integral h¯ab(t,x) = 4 ∫ d3x′ Tab(t− r,x′) r , (2.6) where x := [xµ], and r := |r|with r = x− x′. Far from any GW source (Tab = 0), the simplest vacuum solution to (2.5) is the monochromatic plane wave h¯ab = <[Aab exp (ikcxc)], (2.7) where the four-wavevector ka is null (kaka = 0) and the symmetric ampli- tude tensor Aab is transverse to ka (Aabkb = 0). The four constraints Aabkb = 0 on the ten independent components Aab arise from the Lorenz condition (2.4), which is a partial gauge fixing; the remaining gauge freedom may be used to further constrain the plane-wave solution. We choose the standard transverse– traceless gauge by imposing the three spatial conditions A0µ = 0, (2.8) and the trace-free condition tr(A) = 0, (2.9) which leaves two dynamical degrees of freedom in Aab. With tr(h¯) = 0 in (2.3) as a result of (2.9), the metric perturbation in this gauge reduces to its transverse and traceless part hTTab := hab = h¯ab. (2.10) 2.1.2 Gravitational-wave generation For an isolated and nearly Newtonian system, the mass monopole moment is precisely the total mass–energy M = ∫ d3x′ T00, which is a conserved quantity and contributes no radiation (M˙ = 0). Likewise, the first time derivative of the mass dipole momentmµ = ∫ d3x′ T00x′µ corresponds to the total linear momen- tum, which is also conserved with no associated radiation (m¨µ = 0). Hence the 10 Gravitational-wave astronomy lowest-order radiation emitted by such a system is quadrupolar in nature, i.e. generated by a time-varying mass quadrupole moment.2 Any astrophysical system that spins or orbits with a degree of rotational asymmetry will emit GWs, due to the oscillation of its mass quadrupole mo- ment. The far-field radiation from most GW sources is described to high accu- racy by the well-known quadrupole formula, which is valid in the slow-motion approximation (i.e. as long as the characteristic source size is small compared to the wavelength of the GWs emitted). This formula may in principle be used to compute gravitational waveforms for any such source, provided the dynam- ics of its mass distribution are known. The local mass quadrupole moment for a nearly Newtonian, slow-motion source is given by Iµν(t ′) = ∫ d3x′ T00(t′,x′)x′µx ′ ν , (2.11) where the mass–energy density T00 may be evaluated at time t′ since retarda- tion over the extent of the source will be negligible for a distant observer, i.e. |x′|  |x|. It is convenient to consider the traceless part of (2.11): Iµν = Iµν − 1 3 tr(I)ηµν , (2.12) which simplifies various expressions (e.g. (2.22) and (2.23)) and is known as the reduced mass quadrupole moment. In the far field of the source, we only require the traceless part of Iµν (equivalently, of Iµν) after projecting transverse to the direction of wave propagation. For a plane wave propagating along the z-axis, the nonzero components of ITTµν reduce to ITT11 = −ITT22 = 1 2 (I11 − I22), (2.13) ITT12 = ITT21 = I12. (2.14) Finally, to obtain the quadrupole formula, we use the result∫ d3x′ Tµν = 1 2 ∂20 ∫ d3x′ T00x′µx ′ ν , (2.15) 2 This argument is largely heuristic; a more rigorous derivation of the result is given in [23]. §2.1 Gravitational waves in general relativity 11 which follows from the flat-space conservation equations T ab,b = 0. (2.16) Via (2.6), (2.12), (2.15) and the far-field approximation r = |x|, (the spatial part of) the metric perturbation in transverse–traceless gauge is then related to the second time derivative of ITTµν by hTTµν (t,x) = 2 r I¨TTµν (t− r). (2.17) 2.1.3 Gravitational-wave propagation Energy and momentum are carried by propagating GWs, much as they are by electromagnetic radiation. The stress–energy tensor for a GW—or indeed any gravitational field—cannot be defined locally, since the equivalence principle implies the existence of a local reference frame in which the field vanishes. However, it is valid to define an effective GW stress–energy tensor that is smeared out over several gravitational wavelengths; this is given in linearised theory by the Isaacson tensor TGWab = 1 32pi 〈hTTcd,ahTTef,b〉ηceηdf , (2.18) where the angled brackets denote an average over many wave cycles. Equa- tion (2.18) is subject to the usual conservation of energy and momentum via (2.16). Although its reinsertion into (2.2) is uninformative in linearised theory (since TGWab = O(|hab|2) = 0), the energy–momentum associated with the GW field in full general relativity can contribute significantly to the background curvature and result in non-negligible backscattering or tail effects. The linearised description of GW propagation on Minkowski space is ex- tensible to slightly curved spacetimes, where the weak curvature might be due to the GWs themselves or any nearby matter/gravitational fields. This is be- cause the plane-wave solution (2.7) and the broader approach of geometri- cal “optics” remain valid in the short-wave approximation (i.e. as long as the gravitational wavelength is small compared to the background radius of cur- 12 Gravitational-wave astronomy vature). On a flat or slightly curved spacetime, the stress–energy tensor (2.18) for a plane and monochromatic GW simplifies to TGWab = 1 32pi A2kakb, (2.19) whereA2 := AabAab/2. In addition, the geometrical-optics approach allows ray effects such as redshift, lensing, and Faraday rotation of the polarisation plane to be characterised for GWs in the same way as for electromagnetic radiation. 2.1.4 Gravitational-wave effects As seen in Section 2.1.1, the field of a GW in general relativity has two dy- namical degrees of freedom. These are known as the plus (h+) and cross (h×) linear-polarisation modes; for a plane wave propagating along the z-axis, they are defined with respect to the chosen (x, y)-coordinates as h+ := h TT 11 = −hTT22 , (2.20) h× := hTT12 = h TT 21 . (2.21) In the field of a passing plus-polarised GW (h× = 0), the proper distance between two points with initial separation ` in the x-direction oscillates with h+ as (g11)1/2` ≈ (1 + h+/2)`. For two points with initial separation ` in the y-direction, it oscillates as (g22)1/2` ≈ (1−h+/2)`. This is an effective stretching and squeezing of space in two orthogonal directions. A cross-polarised GW (h+ = 0) has a similar effect, since its metric is mapped to the plus-polarised one with h+ ≡ h× by a pi/4 rotation in the polarisation plane. Although a GW is technically a distortion of both space and time, the met- ric perturbation in the far field equals its transverse and traceless part, and so in transverse–traceless gauge the far-field effects are purely spatial as de- scribed above. The strength of a far-field GW may thus be interpreted as a dimensionless strain of size O(|h|), where h := h+ + ih×. This strain is driven by the evolution of a distant GW source via (2.17); it allows the detection of such sources through interferometry, where the two arms of an interferometer §2.2 Sources and waveform modelling 13 are alternately lengthened and shortened by the passage of a GW. Another key effect of GWs in astrophysical settings is gravitational radia- tion reaction: energy E and angular momentum Lµ are carried away from GW sources, leading to phenomena such as spin-down in neutron stars or inspi- ralling in binaries. We obtain the rates of energy and angular-momentum loss (averaged over time and solid angle) for a source by integrating the respective fluxes of TGWab through a surrounding sphere of radius r. Via (2.17) and the general version of (2.18) for an arbitrary gauge, these rates are related to the time derivatives of Iµν by 〈E˙〉 = − ∫ dΩTGW01 r 2 = −1 5 〈...I µν ...I µν〉, (2.22) 〈L˙µ〉 = − ∫ dΩ µνρxνT GW ρ1 r 2 = −2 5 µνρ〈I¨νσ ...I ρτ 〉ηστ , (2.23) where we have converted to spherical polar coordinates [xµ] = (r, θ, φ), and µνρ is the three-dimensional Levi-Civita symbol. 2.2 Sources and waveform modelling In this section, we provide a brief overview of GW sources (adapted from reviews such as [24–26]), with particular emphasis on the coalescing binary systems that are relevant to much of the material in this dissertation. Differ- ent strategies are used to model such binaries in the comparable-mass and extreme-mass-ratio regimes; we summarise these methods and the present sta- tus of waveform development for both cases. 2.2.1 Source overview Gravity is many orders of magnitude weaker than the other fundamental inter- actions. The matter terms on the right-hand sides of (2.2) and (2.17) contain a suppressed multiplicative constant G/c4 that is actually∼ 10−45 in SI units. In- serting realistic scales of mass, length and time for any conceivable man-made source (e.g. the mass and orbital speed of the International Space Station) into 14 Gravitational-wave astronomy (2.17), we find |h| . 10−32; such strains are impossible to detect with modern instruments, which are around 10 orders of magnitude less sensitive. Hence detectable GW sources are necessarily of astrophysical or cosmological ori- gin, and must comprise extremely massive objects moving at near-relativistic speeds to make up for their vast distances from local detectors. The best-understood types of GW sources are “clean” astrophysical sys- tems that are relatively straightforward to model and identify. Chief among these are binary coalescences, which are some of the most ubiquitous sources in the sensitivity bands of ground- and space-based interferometers [27, 28], and indeed the first to be detected. These coalescences refer to binary systems in the final stage of their evolution, spanning from the late inspiral (starting at orbital separations small enough that the source is in band) to the post-merger ringdown (ending as the oscillations of the final object decay to zero). Other types of well-modelled GW sources include the early-inspiral stage of lower- mass binaries that do not merge in band, but are close enough for their weaker signals to be detected, and also “continuous” sources such as individual neu- tron stars that are asymmetric and rapidly spinning. Transient GW signals that are less cleanly modelled may be classified as burst sources. These are generated by processes such as core-collapse super- novae [29], glitches in the internal dynamics of neutron stars [30], and the for- mation of cusps or kinks in the one-dimensional topological spacetime defects known as cosmic strings [31]. Short-duration GW signals from binary systems in particular scenarios are also referred to as bursts, e.g. the radiation emit- ted close to periapsis by long-period, nearly parabolic binaries in the extreme- mass-ratio regime [32], or that from late-stage binary coalescences with little or no inspiral evolution (such that most of the signal strength is attributed to the highly nonlinear merger phase). The descriptions above refer to GW sources that are strong or close enough for their signals to be individually identified. However, the Universe is also filled with countless similar systems that are unresolvable; the superposi- tion of signals from such sources forms a stochastic background of GWs, which is modelled as a correlated random field and searched for in the re- sponses of independent detectors over long observing times. Like any ran- §2.2 Sources and waveform modelling 15 dom field, a stochastic background may be characterised by its power spec- tral density Sh(f), which is integrated over the redshift and mass distribu- tions of the sources under consideration. In addition to astrophysical back- grounds, stochastic backgrounds of primordial GWs can also arise through early-Universe processes such as the amplification of quantum fluctuations by inflation [33], or first-order cosmological phase transitions [34]. 2.2.2 Binary-coalescence modelling We now summarise the more general methods used to model the orbital evo- lution and GW emission of a coalescing binary system with component masses (m1,m2 ≥ m1). It is useful to introduce the standard definitions of total mass M , reduced mass µ, (small) mass ratio q and symmetric mass ratio η, i.e. M := m1 +m2, (2.24) µ := m1m2 m1 +m2 , (2.25) q := m1 m2 , (2.26) η := m1m2 (m1 +m2)2 = µ M = q (1 + q)2 . (2.27) The inspiral stage of a binary coalescence is extremely well modelled at large orbital separations, and its GW signal is essentially a “chirp” that in- creases in frequency and amplitude. For illustrative purposes, we describe here the radiation emitted by a binary of non-spinning point masses. Via (2.17), (2.22) and (2.23), it is possible to derive expressions for the binary’s GW am- plitude |h|, as well as the evolution of its orbital frequency f and eccentricity e due to radiation reaction. At leading order in f , these are given by |h| ∝ M 5/3f 2/3 r (1 +O(e2)), (2.28) f˙ ∝M5/3f 11/3(1 +O(e2)), (2.29) e˙ ∝ −M5/3f 8/3e(1 +O(e2)), (2.30) 16 Gravitational-wave astronomy where the chirp massM is defined as M := (m1m2) 3/5 (m1 +m2)1/5 = M2/5µ3/5 = Mη3/5. (2.31) For a Newtonian chirp (i.e. for a binary in an instantaneous Keplerian or- bit), there are no higher-order terms in f , and the right-hand sides of (2.28)– (2.30) truncate after terms that are O(e4). A closed-form expression for e(f) may be obtained by solving the system of equations (2.29) and (2.30); at small eccentricities, this expression reduces to e ∝ f−19/18 [35, 36]. Equations (2.28) and (2.29) thus decouple from (2.30) for a Newtonian binary. They show that the chirp mass and distance of such a source may be estimated from its inspiral alone (since (|h|, f, f˙) are measurable from the GW signal). Equations (2.28)–(2.30) for a relativistic binary include higher-order terms in f , which are known as post-Newtonian (PN) corrections [37, 38]. The cor- responding expansions are done in a PN parameter that is associated with a typical internal speed of the system, e.g. the average squared orbital speed v2. Corrections that are O(v2n) relative to leading order are referred to as nPN terms. The PN formalism is well suited to systems in which internal speeds are small, in the sense that expansions are not required to high order. In gen- eral, any GW observable may be written as a PN expansion with respect to its expression as evaluated in the Newtonian potential. For example, expansions of the GW phase include terms with different dependencies on η, which allows both masses of the binary to be estimated (rather than just the chirp mass). For a binary that is nearly circular, Kepler’s third law gives v2 ∝ (fM)2/3. By transforming to dimensionless coordinates xa/M , we can put (2.28)–(2.30) into a form that is explicitly dependent on v2 and independent of M , i.e. |h| ∝ ηv 2 r¯ (1 +O(e2)), (2.32) df¯ dt¯ = M2f˙ ∝ ηv11(1 +O(e2)), (2.33) de dt¯ = Me˙ ∝ −ηv8e(1 +O(e2)), (2.34) §2.2 Sources and waveform modelling 17 where the overbars denote dimensionless quantities. These equations illus- trate the total-mass invariance of waveforms in dimensionless coordinates, which is often exploited in the modelling of GW sources. The merger stage of a binary coalescence occurs deep within the nonlinear dynamical strong field, at which point PN methods become both impractical and inaccurate. Numerical relativity (NR) methods [39] (i.e. direct numerical integration of the Einstein field equations) are typically used to model this stage. The radiative degrees of freedom in a gravitational field are encoded in the Weyl tensor Cabcd (the fully traceless part of the Riemann tensor), which is reduced in the Newman–Penrose formalism [40] to five scalar contractions Ψn of Cabcd with a complex null tetrad. For waveform modelling, Ψ4 is the most relevant as it describes outgoing radiation at null infinity; it is related to the metric perturbation by Ψ4 = −h¨+ + ih¨× (2.35) in the limit as r →∞. In NR simulations, Ψ4 is expanded in terms of the spin-weighted spherical harmonics sYlm [41] with s = −2, i.e. Ψ4(t, r, θ, φ) = ∞∑ l=2 l∑ m=−l Ψ4,lm(t, r)−2Ylm(θ, φ). (2.36) This facilitates computation (to some desired l) of the modes Ψ4,lm, which are then extracted on a sphere of finite radius. Breakthroughs in NR during the mid-2000s [42–44] have enabled the highly accurate evolution of binary coa- lescences from several orbits before plunge (the start of the merger stage) right through to ringdown. However, NR waveforms are expensive to compute, and are limited at present to . 100 cycles—typically with small initial orbital separations (. 20M ) and mass ratios close to unity (& 1/20) [45]. The end product of a binary coalescence is a vibrating black hole or rel- ativistic star that gradually settles into a stable configuration. Although this ringdown stage may be modelled within NR, its waveforms are described more simply through black-hole perturbation theory as a superposition of quasinormal modes [46], i.e. exponentially decaying sinusoids. The frame- 18 Gravitational-wave astronomy work for studying black-hole perturbations is built upon early calculations by Regge & Wheeler [47] and Zerilli [48]; for a rotating Kerr black hole, it is cen- tred around the Teukolsky master equation [49] describing the linear pertur- bations of various scalar fields ψ on the spacetime. In the spherical-like Boyer–Lindquist coordinates, the Kerr line element for the spacetime around a mass M with angular momentum Ma is written as ds2 = −∆ Σ (dt− a sin2 θ dφ)2 + sin 2 θ Σ ((r2 + a2)dφ− a dt)2 + Σ ∆ dr2 + Σ dθ2, (2.37) where ∆ = r2− 2Mr+ a2 and Σ = r2 + a2 cos2 θ. The Teukolsky equation takes the general form sΘψ = 4piΣT, (2.38) where T is the source term (written in Newman–Penrose form), and sΘ is the Teukolsky operator for a field of spin weight s: sΘ = − ( (r2 + a2)2 ∆ − a2 sin2 θ ) ∂2t − 4Mar ∆ ∂t∂φ − ( a2 ∆ − 1 sin2 θ ) ∂2φ +∆−s∂r∆s+1∂r + 1 sin θ ∂θ sin θ ∂θ + 2s ( a(r −M) ∆ + i cos θ sin2 θ ) ∂φ +2s ( M(r2 − a2) ∆ − r − ia cos θ ) ∂t − s2 cot2 θ + s. (2.39) For outgoing gravitational radiation, we set s = −2 and ψ = ρ−4Ψ4 with ρ = 1/(r − ia cos θ); the Teukolsky equation is then separable into its radial and angular components via the expansion (2.36), such that the Kerr quasinormal modes (i.e. the eigenfunctions of (2.38) with T = 0) and in particular their complex frequencies may be obtained analytically (e.g. [50]). 2.2.3 Comparable-mass mergers Although the three stages of a binary coalescence have very different practi- cal descriptions (the two-body problem at large and small separations, then one-body perturbations), there are regimes of overlap in which the com- puted waveforms can be cross-validated and stitched together smoothly. Ap- §2.2 Sources and waveform modelling 19 proaches such as those described above may be used to construct inspiral– merger–ringdown waveform models for comparable-mass binaries with q & 10−1. These models are applicable to binary systems of stellar-mass compact objects—white dwarfs, neutron stars or stellar-origin (. 102M ) black holes— as well as SMBH binaries, through a trivial rescaling by total mass. The best-developed comparable-mass models incorporate at some level the spins of both objects in the binary, whether these are assumed to be aligned with the orbital angular momentum, or generically aligned and hence pre- cessing. One powerful framework for achieving this is the effective-one- body formalism [51], in which the relative motion of two spinning masses ((m1,S1), (m2,S2)) is mapped to the geodesic motion of an effective particle (µ,Sµ) on a deformed Kerr spacetime (M,SM). The effective-one-body ap- proach has been used to construct a succession of comparable-mass models, the latest of which includes precessing spins [52]. These models typically pro- duce inspiral–merger waveforms that are smoothly connected to quasinormal modes, with NR simulations employed to calibrate unknown coefficients and free parameters in the merger–ringdown. While most of the methods and models described so far provide time- domain waveforms h(t), it is often desirable to directly compute h˜(f) for ease of use in data analysis algorithms. This is done in the other well-known fam- ily of models, dubbed Phenom [53], which use NR-fitted phenomenological parameters to write their waveforms as closed-form analytic expressions in the frequency domain. Phenom waveforms rival those from the effective-one- body approach in accuracy, and have also been extended to account for the effects of spin precession [54]. 2.2.4 Extreme-mass-ratio inspirals The standard strategies adopted in the modelling of comparable-mass binary coalescences are not well suited to the regime where q  1, with the possi- ble exception of the effective-one-body approach (although work in this area is presently limited to circular and equatorial orbits, and requires calibration against more accurate models [55]). This is because the inspiral of the smaller 20 Gravitational-wave astronomy object includes many orbits in the strong field of the larger object, and so the PN treatment is inadequate on its own, while NR simulations are hampered by the greatly differing scales of mass, length and time in the system. Extreme-mass-ratio inspirals (EMRIs) involve a compact object in orbit around a rotating SMBH, with q . 10−4 such that (m1,m2, η) ≈ (µ,M, q). They are expected to be an important and detectable source of GWs at millihertz frequencies [56]. The most readily available EMRI waveforms are generated using mixed-formalism models known as “kludges” [57–59], which are char- acterised by their computational efficiency and modular construction. In these models, evolution of the orbit over long timescales is based on PN expressions in the limit of (non-spinning) test-particle motion on Kerr geometry; the modu- lar setup then allows the introduction of relativistic orbital corrections on both short and long timescales. Flat-space multipole formulae (e.g. (2.17)) sourced by the compact object are used for fast waveform generation to a reasonable approximation, even though the background spacetime is strongly curved. Fuller EMRI treatments require black-hole perturbation theory and the computation of the gravitational self-force on the compact object [60, 61]. At first order in q, the gravitational field of a test particle on Kerr spacetime is a linear perturbation of the background metric, i.e. gab → gab + qhab. Motion on the perturbed spacetime depends only on some regular part hRab of the metric perturbation, and may be interpreted either as geodesic motion on gab + qhRab, or as Kerr geodesic motion that is altered by an effective external force. This self-force is written as F a = µ∇abchRbc, (2.40) with the force operator∇abc given in terms of the Kerr covariant derivative by ∇abc = 1 2 (gadub − 2gabud − uaubud)uc∇d, (2.41) where ua is the test particle’s four-velocity. The adiabatic (time-averaged), dis- sipative effects of the self-force at first order in q are encoded in solutions of the Teukolsky equation (2.38) with compact-object source; while these are expen- sive to compute, they are more accurate than kludges and have been used to generate full EMRI waveforms for eccentric and inclined Kerr orbits [62, 63]. §2.3 Detectors and data analysis 21 Although Teukolsky-based waveforms are potentially sufficient for detec- tion purposes, they suffer from a build-up of error in the orbital phase over the long inspiral lifetime. To accurately recover the orbital parameters of EMRIs, waveform models must include the next dominant self-force phenomenon of resonances, where the concurrence of the various orbital frequencies causes transient jumps in the evolution of the compact object’s orbit [64]. They also need to account for higher-order phase effects due to the conservative first- order self-force and the dissipative second-order self-force [65]; these calcula- tions are the aim of the ongoing self-force programme. Full self-force models might be too computationally intensive to directly provide waveforms for data analysis, and hence will likely be used to inform kludges instead [66–68]. The actual merger and ringdown of an extreme-mass-ratio coalescence are not as important from a data analysis standpoint, since most of the source information will typically be contained in the extensive inspiral stage. Regard- less, the merger is now a relatively straightforward transition between the last stable orbit of the compact object and its geodesic plunge [69], while the ring- down is again described by the quasinormal modes of the final Kerr black hole. Finally, coalescing binary systems with 10−3 . q . 10−2 are known as intermediate-mass-ratio inspirals. Less work has been done on constructing waveform models for such sources; this is due to the scarcity of direct obser- vational evidence for intermediate-mass black holes, as well as the limitations of NR and perturbation theory at intermediate mass ratios. However, several hybrid methods have been applied to the modelling of these systems [70, 71], and the implied existence of ∼ 102M black holes in light of the first LIGO detection might presage their eventual discovery. 2.3 Detectors and data analysis We now describe some of the existing and proposed GW detectors introduced in Chapter 1, before outlining the signal processing framework used to analyse their data (with particular emphasis on the standard approaches to detection and parameter estimation for laser interferometers). The material in this sec- tion has been adapted from reviews such as [25, 26, 72, 73]. 22 Gravitational-wave astronomy 2.3.1 Detector overview The radiation spectrum of likely sources in GW astronomy will be spanned by an envisioned network of man-made and naturally occurring interferomet- ric detectors with different arm lengths and noise backgrounds. These fall into three main categories: ground-based detectors (which are sensitive at kilohertz or “high” frequencies), space-based detectors (at millihertz or “low” frequen- cies), and pulsar timing arrays (at nanohertz or “very low” frequencies). Ground-based detectors will observe stellar-mass binary coalescences, asymmetric neutron stars, supernovae and other low-mass GW sources. Mod- ern Michelson interferometers such as Advanced LIGO [7] and Advanced Virgo [11] have arms that are ∼ 103 m long, and employ Fabry–Pe´rot cavities to increase the effective arm length to ∼ 105 m, hence improving sensitivity at sub-kilohertz frequencies. They also use vibration isolation systems and materials with low mechanical loss to screen out instrumental noise, but envi- ronmental changes in the local gravitational field (e.g. seismic vibrations, or atmospheric pressure gradients) are an irreducible source of gravity gradient noise at . 101 Hz. Future variants such as KAGRA [12] and the Einstein Tele- scope [74] will explore the same frequency band with additional noise reduc- tion strategies, including underground construction and cryogenic cooling. Gravity gradient noise falls off rapidly away from the Earth, and so GWs at lower frequencies can be detected by an interferometer in space. Space-based detectors will observe Galactic white-dwarf binaries in the early-inspiral stage, and more distant sources such as EMRIs and the binary coalescences of smaller (. 107M ) SMBHs. The archetypal design for a space interferometer is that of LISA [13]: three freely flying test masses housed in drag-free spacecraft, sepa- rated by∼ 109 m, and placed in an Earth-trailing orbit around the Sun. LISA is most sensitive in the millihertz band, below which it is limited by noise from long-timescale drifts in the relative acceleration of the test masses. At decihertz frequencies, LISA’s arm length exceeds multiple wavelengths and its sensitiv- ity is reduced due to partial signal cancellation; however, future detectors with shorter test-mass separations (e.g. DECIGO [15], the geocentric TianQin [16], and other proposed missions [75]) will extend the range of space-based GW observations up to the lower frequency limits of ground interferometers. §2.3 Detectors and data analysis 23 Arrays of stable millisecond pulsars in the Milky Way emit radio pulses that arrive at Earth with exquisite regularity in time, thus effectively acting as a network of natural “interferometers” with Galactic-scale (∼ 1020 m) arm lengths. A GW that passes through the Galaxy will perturb the paths taken by the pulses, and hence their times of arrival, in a correlated way; this effect is then distinguishable from the noise in pulse arrival times due to intrinsic pulsar processes, which are uncorrelated. The peak sensitivity of a pulsar tim- ing array occurs near its lower frequency limit, i.e. the cut-off determined by the total observation time. For a five-year baseline, this corresponds to the nanohertz frequency band. Pulsar timing arrays such as the IPTA [19] and a future programme supported by the Square Kilometre Array [76] will observe inspiralling SMBH binaries at the higher end of their mass range, as well as the stochastic GW background of signals from such sources. Apart from interferometric detectors, there is another broad class of detec- tor that works by amplifying the vibrations induced in a material object due to the passage of a GW. Such instruments are known as resonant detectors [77, 78]. Their sensitivities have gradually been surpassed by those of laser in- terferometers; furthermore, they typically operate in a narrow high-frequency band (∼ 103 Hz) where there are fewer anticipated astrophysical or cosmologi- cal sources, and so are less relevant to the content of this dissertation. 2.3.2 Signal processing When analysing data from laser interferometers, the two polarisation modes h+,× are not measured directly. The intermediate step involves at least two in- dependent detector-response functions hI,II , which are orthogonal in the sense that the noise in both channels is uncorrelated. These two data streams are ob- tained from either separate interferometers or noise-correlated ones (e.g. in shared-arm configurations such as LISA); as long as the noise is not perfectly correlated, hI,II can always be orthogonalised in the usual way. Working in arbitrary coordinates, we denote the orthonormal coordinate frame adopted in (2.20) and (2.21) as (e1, e2, e3) ≡ (p,q, r) (with r pointing 24 Gravitational-wave astronomy from the source to the detector), and define the polarisation basis tensors H+µν := pµpν − qµqν , (2.42) H×µν := pµqν + qµpν , (2.43) such that the far-field metric perturbation may be written as hTTµν = h+H + µν + h×H × µν . (2.44) For a single detector whose two arms are aligned with the unit vectors (`1, `2), the strain signal caused by the passage of a GW is given by hI = 1 2 hTTµν (` µ 1` ν 1 − `µ2`ν2). (2.45) This response function is proportional to the sine of the angle α between the arms [79], and is explicitly related to h+,× via (2.44): hI = (h+F + I + h×F × I ) sinα, (2.46) F+,×I sinα = 1 2 H+,×µν (l µ 1 l ν 1 − lµ2 lν2), (2.47) where the antenna pattern functions F+,×I (θS, φS, ψS) depend on the sky loca- tion (θS, φS) and polarisation angle ψS of the source [80]. A second channel hII is required for the recovery of both modes h+,×, and is similarly defined as (the orthogonal part of) the strain signal in another linearly independent detector. GW strain data from laser interferometers is often characterised by a low instantaneous signal-to-noise ratio (SNR), due to the high levels of instrumen- tal and environmental noise. However, the data may be processed with tem- plate filters generated from waveform models, which allows the build-up of SNR over time. In the standard signal-processing framework for GW astron- omy, data from a pair of orthogonal detectors may be written as the complex time series3 x(t) = h(t) + n(t), where the source signal h = hI + ihII is a de- 3 Another common and more general approach is to consider the real time-series data from individual detectors; the SNRs for multiple orthogonal detectors may then be summed in quadrature to give a combined SNR for the detector network. §2.3 Detectors and data analysis 25 terministic function of time (and some unknown source parameters), and the detector noise n = nI + inII in both channels is assumed to be a stationary and zero-mean stochastic process. The process known as matched filtering involves the cross-correlation of x with some optimal template filter F (t) that maximises the output SNR ρ := E[(x|F )]√ E[(n|F )(n|F )] , (2.48) where the (zero-lag) cross-correlation inner product (·|·) on the function space of finite-length time series is defined as (a|b) := ∫ dt a(t)b∗(t). (2.49) Since n is stationary, its autocorrelation function Rn(t, t′) := E[n(t)n∗(t′)] may be written as an even function of one variable, i.e. Rn(τ) = E[n(t)n∗(t− τ)]. (2.50) By the Wiener–Khinchin theorem, the (two-sided) noise power spectral density Sn(f) is simply the Fourier transform of Rn(τ): Sn(f) := ∫ dτ Rn(τ) exp (−2piifτ). (2.51) The solution to the variational problem δρ/δF = 0 is then h(t) ∝ (Rn ∗ F )(t), which by the convolution theorem gives F˜ (f) ∝ h˜(f)Sn(f) , (2.52) and so the matched filter is proportional to the noise-weighted signal itself. It is convenient to shift the noise weighting into the inner product (2.49); the resultant noise-weighted inner product 〈·|·〉 is real and defined as 〈a|b〉 := (a|Fb) = ∫ dt a(t)F ∗b (t) = ∫ df a˜(f)b˜∗(f) Sn(f) (2.53) 26 Gravitational-wave astronomy via the Plancherel theorem, with Fb denoting the matched filter corresponding to the waveform template b (i.e. F˜b := b˜/Sn). With a filter that is matched to the source signal h, the output SNR (2.48) simplifies to ρ = E[〈x|h〉]√ E[〈n|h〉〈n|h〉] = √ 〈h|h〉, (2.54) where we have used the noise identities E[〈n|h〉] = 0, (2.55) E[〈n|h〉〈n|h′〉] = 〈h|h′〉. (2.56) Equation (2.54) may be interpreted as the “true” SNR of the source. If the filter is matched not to h but some other waveform h′, the output SNR is instead ρ′ = 〈h|h′〉√〈h′|h′〉 , (2.57) which is ≤ ρ by the Cauchy–Schwarz inequality. The common principle of comparing detector data against many templates and identifying the best-matching one is used in both detection and parame- ter estimation. Since the SNR might be reduced significantly if the identified template is not close to optimal, the generating waveform model should be as accurate as possible. At the same time, it must also be computationally effi- cient due to the large number of templates required in both procedures. Dif- ferent strategies apply for detection, which can be viewed as a global search on the SNR surface over the model parameter space, and parameter estimation, which is an exploration (over a more localised region) of the probability that the template-subtracted data is consistent with detector noise. 2.3.3 Detection To discern the presence of a signal buried in detector noise, one standard ap- proach is to perform a targeted search by filtering the data with a template bank, i.e. a discrete set of precomputed (or rapidly computed) unit-SNR tem- §2.3 Detectors and data analysis 27 plates that densely span an extended region of the model parameter space. This may be done while the detector is online, in order to identify high-SNR candidates for prompt follow-up. A detection is then claimed if the output SNR for any template in the bank surpasses a specified significance threshold that is determined by the properties of the concurrent noise background. Various prescriptions for template placement may be used to ensure proper coverage of the search region. The most straightforward is to grid the tem- plates on a lattice that is uniform with respect to some parameter-space met- ric. Such a metric is often defined by way of the overlap O(·|·) between two waveforms, which is given by O(a|b) := 〈a|b〉√〈a|a〉〈b|b〉 . (2.58) This bilinear form takes the value of±1 for linearly dependent waveforms and 0 for orthogonal waveforms, and hence provides a measure of the accuracy with which one waveform represents another. The overlap is related to several other commonly used quantities in the GW literature. For example, the maximal overlap between the source signal and a (discrete or continuous) set of templates from a waveform model is known as the “fitting factor” of that set [81]. The “match” between two waveforms refers to their overlap maximised over some chosen subset of parameters; these are typically extrinsic to the source and more rapidly searched over [82]. We also have the “mismatch” between neighbouring templates in a bank, which is just their match subtracted from unity, as well as the “minimal match” for a bank with uniform mismatch, which is the fraction of optimal SNR it retains when the signal falls maximally far from the nearest templates [83]. Template banks are well suited to detection searches for simply modelled signals across a low-dimensional parameter space, but their computational ef- ficiency scales poorly with waveform cost and search volume. Banks of ∼ 105 approximate templates for stellar-mass binary coalescences are employed in low-latency LIGO pipelines to search a reduced four-dimensional parameter space (the component masses and aligned-spin magnitudes) for coincident sig- nals in the two detectors. The significance of a candidate signal is assessed 28 Gravitational-wave astronomy against the SNR distribution of false coincident signals due to noise in either data stream. A frequentist approach is used to estimate this distribution: the timestamps of one data stream are artificially shifted many times, with the off- sets being large enough that all coincident signals in the time-shifted data sets are necessarily uncorrelated between detectors (and interpreted as noise). Apart from targeted searches, the LIGO pipelines also incorporate generic searches that are performed without the use of waveform models. These gen- erally involve the identification of coincident excess power in spectrograms of the data, i.e. time–frequency plots of the short-time power spectral den- sity; they provide a complementary search for (stronger) signals from well- modelled sources, and are also suitable for detecting the transient signals from burst sources. As the field of GW astronomy matures, the standard detection algorithms based on template banks and spectrograms will likely be improved or supplanted by modern statistical techniques such as deep neural networks [84] for targeted searches, or compressed sensing [85] for generic ones. 2.3.4 Parameter estimation Detection searches are performed by maximising the SNR-related quantity 〈x|h〉 over the parameter space of the waveform model h; in parameter esti- mation, it is the noise-related quantity 〈x− h|x− h〉 that is minimised instead. The probability that x − h is consistent with Gaussian detector noise supplies a likelihood function for the model parameters λ given the data, i.e. L(λ|x) ≡ p(x|λ) ∝ exp ( −1 2 〈x− h(λ)|x− h(λ)〉 ) . (2.59) This allows the optimisation problem to be cast in the framework of Bayesian statistics, such that confidence regions around the optimal parameter point may be obtained from the posterior probability density p(λ|x) = L(λ|x)pi(λ) Z(x) . (2.60) §2.3 Detectors and data analysis 29 The parameter prior pi is specified as appropriate or taken to be uninformative, while the model evidence Z = ∫ dλLpi is a normalising factor that is unused in parameter estimation. Although Z is expensive to evaluate explicitly, it can be used for the Bayes-factor comparison of different waveform models, and also provides a Bayesian approach to the detection of weaker signals via a comparison with the null-model evidence Z0 ∝ exp (−1/2〈x|x〉). Grid-based searches of the posterior surface with template banks are too computationally inefficient to estimate parameters at the precision admitted by most waveform models. The measurement precision for a model h(λ) at some point in parameter space is determined by its waveform derivatives ∂h/∂λ at that point. In the case of high SNR, the parameter estimation errors ∆λ due to Gaussian noise have the normal distribution N (0,Γ−1), where the Fisher information matrix Γ is given by [86, 87] Γij = 〈 ∂h ∂λi ∣∣ ∂h ∂λj 〉 . (2.61) The root-mean-square errors in the general case can then be approximated as ∆λi ≈ √ (Γ−1)ii, (2.62) which places stringent and typically impractical upper limits on the grid spac- ing of a template bank that is put to this purpose. Parameter estimation algorithms thus make use of stochastic search meth- ods instead, in order to map out (relevant parts of) the posterior surface at higher resolution. The unnormalised posterior density Lpi in (2.60) is sampled over a region of parameter space containing the detected signal; this is usually localised beforehand by the initial detection algorithms, although stochastic methods can in principle search extended regions and adjust their sampling resolution adaptively. Estimates of the source parameters and their associated errors are obtained directly from the distribution of samples, after it has con- verged to the underlying posterior density. The most successful class of stochastic sampling methods for GW parame- ter estimation are the Markov-chain-Monte-Carlo algorithms, which have seen 30 Gravitational-wave astronomy extensive use due to their simplicity and versatility. These typically involve sampling with a random walk across parameter space, but directed by a prob- ability density that proposes points at each step, as well as some criterion for accepting or rejecting points. The speed and reliability of these algorithms can also be improved through techniques such as simulated annealing [88] and parallel tempering [89]. Other well-known methods that have been applied to GW parameter estimation include nested sampling [90], which explores nested contours of increasing probability with a number of live points, and evolution- ary algorithms [91], which use genetic concepts such as breeding and fitness to evolve a population of points across parameter space. Chapter 3 Einstein–Maxwell interactions GWs from astrophysical sources can interact with background electromagnetic fields, giving rise to distinctive and potentially detectable electromagnetic sig- natures. In this chapter, we study such interactions for far-field gravitational radiation using the 1+3 approach to relativity. Linearised equations for the Maxwell field on perturbed Minkowski space are derived and solved analyti- cally. The inverse Gertsenshteı˘n conversion of GWs in a static electromagnetic field is rederived, and the resultant electromagnetic radiation is shown to be significant for highly magnetised neutron stars in compact binary systems. We also obtain a variety of nonlinear interference effects for interacting gravitational and electromagnetic plane waves, although wave–wave reso- nances previously described in the literature are absent when the electric– magnetic self-interaction is taken into account. The fluctuation and amplifi- cation of electromagnetic energy flux as the strength of the GW increases to- wards the gravitational–electromagnetic frequency ratio is a possible signature of gravitational radiation from extended astrophysical sources. The material in this chapter has been adapted from [92]. 3.1 Background The strongest GW sources for present and future detectors are highly ener- getic astrophysical events, many of which will be accompanied by distinctive and detectable electromagnetic signals. Such electromagnetic counterparts can aid the detection of GW sources through improved event rate prediction, the provision of search triggers, the confirmation of individual detections, and en- 31 32 Einstein–Maxwell interactions hanced parameter estimation. Once an operational network of GW observa- tories is fully realised, the synergy of complementary information from gravi- tational and electromagnetic observations will establish GWs as an important component of multimessenger astronomy [93–95]. As observations of GW sources and their electromagnetic counterparts im- prove in precision, so too must models of such dual sources, to account for any correlations between the two types of signal. A nascent line of research towards this end is the direct coupling between gravitational and electro- magnetic fields in the strong-field regime. Recent work in this area has fo- cused on the electromagnetic signatures of gravitational perturbations on var- ious curved spacetimes; the perturbed Einstein–Maxwell equations have been solved for Schwarzschild [96], slowly rotating Kerr–Newman [97] and equal- mass binary Kerr [98], with the numerical involvement increasing as per the complexity of the spacetime. The problem of Einstein–Maxwell coupling for gravitational radiation in flat space is older and more analytically tractable than that in curved space, leading to a better characterisation of the (albeit weaker) interactions between far-field GWs and electromagnetic fields. One such effect is the resonant con- version of a GW into an electromagnetic wave (EMW)—and vice versa—in the presence of a static electromagnetic field [99–101]. The direct signatures of GWs on EMWs have also been studied; these include frequency splitting [102], intensity fluctuations [102, 103], deflection of rays [103–105] and gravita- tionally induced Faraday rotation of the EMW polarisation [105–110]. Indirect GW detection schemes using microlensing [111] and phase modulation [112] effects on light have been proposed as well. Among various frameworks suited to the study of interacting GWs and electromagnetic fields is the 1+3 covariant approach to general relativity, in which spacetime is locally split into time and space via the introduction of a fundamental timelike congruence [113–115]. This approach is most commonly employed in the cosmological setting, and in particular has been used to de- scribe electromagnetic signatures of the large-scale tensor perturbations asso- ciated with primordial GWs [116–120]. It may also be applied to gravitational– electromagnetic interactions in a general spacetime [121], although any in- §3.1 Background 33 homogeneity in the spacetime typically renders the governing equations in- tractable due to tensor–vector and tensor–tensor coupling [122]. Such difficulties with the 1+3 formalism may be partially overcome by ex- tending the spacetime splitting to a 1+1+2 decomposition in the case of locally rotationally symmetric spacetimes, which have a preferred spatial direction [122]; this method has been used to semi-analytically model the electromag- netic signature of a Schwarzschild ringdown [123]. The 1+3 approach has also been applied to the interaction of far-field GWs and electromagnetic fields in the presence of a magnetised plasma [124, 125]. Finally, recent work on Minkowski-space GWs and EMWs within the 1+3 framework has uncovered resonant interactions between the two under specific conditions [120, 126]. As any resonant amplification of electromagnetic fields by gravitational ra- diation might be important for GW detection, we take a more detailed look at flat-space interactions between GWs and electromagnetic fields using the 1+3 approach to relativity. We provide in Section 3.2 a brief primer to the 1+3 formalism, which is seldom encountered in the context of GW astronomy. In Section 3.3, we derive linearised evolution and constraint equations for the electromagnetic field on GW-perturbed flat space, and approximate these on exact Minkowski space. This framework is applied to simple models of static and radiative electromagnetic fields in Section 3.4, where we consider the re- sultant effects in astrophysical settings and discuss their implications for dual observations of GW sources. We rederive in Section 3.4.1 the resonant induction of an EMW by a GW in a static electromagnetic field, and estimate that for highly magnetised neu- tron stars in compact binary systems, the energy radiated through this process might be non-negligible with respect to the magnetic dipole radiation. In Sec- tion 3.4.2, we find no resonant interaction between plane GWs and EMWs af- ter considering electric–magnetic self-interaction contributions that have been omitted in previous work [120, 126]. However, nonlinear interference effects are shown to be significant in a regime where the GW strength approaches the GW–EMW frequency ratio from below; the resultant fluctuation and am- plification of electromagnetic energy flux is a potentially stronger signature of gravitational radiation than other geometrical-optics effects in the literature. 34 Einstein–Maxwell interactions 3.2 The 1+3 approach to relativity The 1+3 covariant formalism [113–115] has been applied to many problems in general relativity. In the context of gravitational perturbations, the approach has limited benefits due to the generally intractable evolution equations that arise, although it is fairly well-suited to the analytical study of GWs in spatially homogeneous spacetimes [122]. For a general spacetime, we define a local 1+3 threading into time and space by introducing a fundamental timelike congruence of worldlines on the mani- fold. The timelike vector field tangential to this congruence is given by ua := dxa dτ , (3.1) where τ is proper time along the worldlines and uaua = −1. In the absence of vorticity, such a threading foliates the spacetime into spacelike hypersurfaces orthogonal to ua. Every tensor field T on the spacetime may then be decom- posed into its timelike and spacelike parts via contraction with, respectively, (3.1) and the spatial projection tensor4 Pab := gab + uaub. (3.2) Decomposing the covariant derivative T;a in the above way defines the co- variant time (T˙ ) and space (T:a) derivatives of T . If T is a spatially projected tensor, we may write T˙ = T;aua, (3.3) T:a = T;a + T˙ ua. (3.4) Following the conventions of [114], we also define the spacetime volume form abcd = [abcd] as the fully antisymmetric pseudotensor with 0123 := |det[gab]|1/2, and its projection onto each instantaneous rest space as abc := uddabc (such that the spatial curl is right-handed). For a general fluid solution to the Einstein field equations (2.2), the stress– 4 The usual 1+3 notation for the projection tensor is hab, by unfortunate coincidence. §3.2 The 1+3 approach to relativity 35 energy tensor decomposition with respect to (3.1) is given by Tab = µuaub + pPab + 2q(aub) + piab, (3.5) where µ is the mass–energy density, p is the isotropic pressure, qa is the energy flux vector and piab is the symmetric and traceless anisotropic stress tensor. The matter field fully determines the local Ricci curvature, but not the nonlocal cur- vature encoded in the Weyl tensor Cabcd, which is obtained from the Riemann tensor Rabcd by subtracting its various trace terms: Cabcd = Rabcd − (ga[cRd]b − gb[cRd]a − 1 3 ga[cgd]bR). (3.6) By analogy with electromagnetism, the Weyl tensor splits into its “electric” and “magnetic” parts; these are given respectively by Eab = Cacbdu cud, (3.7) Hab = 1 2  cda Ccdbeu e. (3.8) Both matter and geometry affect motion along the worldlines, which is characterised by the kinematical quantities in the decomposition ua;b = 1 3 ϑPab + σab + ωab − u˙aub. (3.9) The vorticity tensor ωab = u[a:b] and the (non-gravitational) acceleration vector u˙a vanish in GW-perturbed Minkowski space, and are henceforth taken to be zero. With this simplification, the evolution of the expansion scalar ϑ = u :aa is governed by the Raychaudhuri equation ϑ˙ = −1 3 ϑ2 − 2σ2 − 1 2 (µ+ 3p), (3.10) where σ2 := σabσab/2. The shear tensor σab is the symmetric and traceless part of u(a:b), and evolves (keeping all quantities symmetric and traceless) as σ˙ab = −2 3 ϑσab − σacσ cb − Eab + 1 2 piab. (3.11) 36 Einstein–Maxwell interactions Equations (3.10) and (3.11) are accompanied by the two spatial constraints divσab := σ :b ab = 2 3 ϑ:a − qa, (3.12) curlσab := acdσ d:c b = Hab, (3.13) where all quantities in (3.13) are again kept symmetric and traceless. An electromagnetic field Fab on the spacetime splits into its electric and magnetic parts in the usual way: Ea = Fabu b, (3.14) Ba = 1 2 abcF bc. (3.15) It evolves in accordance with Maxwell’s equations F ;bab = Ja, (3.16) F[ab;c] = 0, (3.17) where Ja = ρEua + Ja is the four-current source with charge density ρE and spatially projected current Ja. Equations (3.16) and (3.17) may also be decom- posed with respect to (3.1). With divVa := V :aa and curlVa := abcV c:b for a spa- tially projected vector Va, we write the first-order electromagnetic evolution equations (keeping all quantities spatially projected) as E˙a = −2 3 ϑEa + σabE b + curlBa − Ja, (3.18) B˙a = −2 3 ϑBa + σabB b − curlEa, (3.19) along with the two spatial constraints divEa = ρE, (3.20) divBa = 0. (3.21) The electromagnetic field in turn curves the background spacetime via its §3.3 Linearised far-field equations 37 stress–energy tensor TEMab . This may be written in the fluid form (3.5) with µ = 1 2 (E2 +B2), (3.22) p = 1 6 (E2 +B2), (3.23) qa = Sa, (3.24) piab = 1 3 (E2 +B2)Pab − EaEb −BaBb, (3.25) where E2 := EaEa and B2 := BaBa are the squared field magnitudes and Sa := abcEbBc is the covariant Poynting vector. Coupling of the Einstein and Maxwell fields is manifest in (3.10)–(3.13) and (3.18)–(3.25); we explore this in the following sections for the case of a radiative gravitational field. 3.3 Linearised far-field equations We consider the interactions between gravitational radiation and electromag- netic fields in perturbed Minkowski space with the metric η˜ab = ηab + hab. In the transverse–traceless gauge, η˜00 = −1 and we may choose ua = δa0 , where δab is the Kronecker delta. Gravitational radiation is covariantly described by the electric and magnetic parts of the Weyl tensor, and more simply in flat space by the shear tensor, which is related to the metric perturbation by σab = 1 2 h˙TTab . (3.26) Apart from the vanishing vorticity and acceleration, other kinematical and ge- ometrical quantities are greatly simplified by the flatness and symmetry of the spacetime; for example, the expansion scalar and electric Weyl tensor reduce at linear perturbative order to ϑ = 0 and (via (3.11)) Eab = −σ˙ab. Second-order evolution equations for an electromagnetic field {Ea, Ba} on the spacetime may be derived from (3.18) and (3.19) by spatially projecting their covariant time derivatives, making use of (3.10)–(3.13) to simplify the expressions. These wave-like equations are sourced by the kinematical quan- 38 Einstein–Maxwell interactions tities ϑ and σab, with additional Ricci and Weyl curvature terms arising from the non-commutativity of derivatives. Similar equations have been derived for a fully general spacetime, where the only assumption is a single perfect-fluid matter field with a barotropic equation of state [121]. For most astrophysical GW sources we expect to observe, the electromag- netic luminosity (∼ 1037 W for a typical galaxy) is dwarfed by the gravitational luminosity (some significant fraction of c5/G ∼ 1052 W) [25], and so the energy carried by gravitational radiation is generally much greater than that stored in the electromagnetic field. This translates to E2 ∼ B2  σ2  1 in geometric units. The wave-like equations for Ea and Ba contain source terms of three sizes: ∼ σE, ∼ σ2E and O(E3), with the last arising from the back-reaction of the electromagnetic field on the background spacetime via (3.22)–(3.25). Con- sidering only the leading (in σ) terms at linear order in E, we find ˜Ea = σabE˙b + 2σ˙abEb + abcσcdB :bd + abcσcd:bBd + ( ˜curlσab)Bb, (3.27) ˜Ba = σabB˙b + 2σ˙abBb − abcσcdE :bd − abcσcd:bEd − ( ˜curlσab)Eb, (3.28) where ˜T := T¨ − T :a:a for any spatially projected tensor T . Here and hence- forth, a tilde over an operator explicitly indicates that we are raising and low- ering indices with the perturbed metric η˜. Equations (3.27) and (3.28), along with the divergence constraints d˜ivEa = d˜ivBa = 0 from (3.20) and (3.21), govern the evolution of electromagnetic fields in the presence of weak-field gravitational radiation. They are coupled to the first-order evolution and constraint equations (3.11)–(3.13) for σab, which may be cast as a constrained wave-like equation in similar fashion to the derivation of (3.27) and (3.28). The shear equations contain terms that are ∼ σ2, ∼ σ3 and O(E2σ); at linear order in σ, however, we have ˜σab = 0 (3.29) with d˜ivσab = 0 (which is the transversality condition implied by (3.26)). Hence it is reasonable to treat σab as a fixed background of gravitational radiation that drives oscillations in the electromagnetic field via (3.27) and (3.28). §3.3 Linearised far-field equations 39 For weak-field calculations, it is convenient to replace the perturbed Minkowski metric η˜ab with an exact one, which simplifies index manipulation and any harmonic expansion of tensor fields. This approximation is trivially valid for (3.29) and its divergence constraint (where replacing the perturbed operators with their Minkowski counterparts only introduces terms that are quadratic- or higher-order in σ), but not so for (3.27) and (3.28). Using a per- turbative approach, we consider the gravitationally coupled electromagnetic field as the sum of a free field and an induced first-order perturbation, i.e. Ea = E (0) a + E (1) a , (3.30) Ba = B (0) a +B (1) a , (3.31) where {E(0)a , B(0)a } is a vacuum Maxwell solution, E(1)  E(0) and B(1)  B(0). Denoting operators on Minkowski space with overbars, we have ¯E(0)a = ¯B(0)a = 0, (3.32) d¯ivE(0)a = d¯ivB (0) a = 0. (3.33) Substituting (3.30)–(3.33) into the equations for {Ea, Ba} yields wave-like equations for the induced field that are essentially (3.27) and (3.28) with linear corrections. These corrections are due to the difference operators (˜− ¯) and (d˜iv − d¯iv) giving rise to terms that are ∼ σE and non-negligible with respect to (3.27) and (3.28). The induced field equations read ¯E(1)a = F [E(0)a ] +G[B(0)a ] + {linear corrections}[E(0)a ], (3.34) ¯B(1)a = F [B(0)a ]−G[E(0)a ] + {linear corrections}[B(0)a ], (3.35) d¯ivE(1)a = {linear corrections}[E(0)a ], (3.36) d¯ivB(1)a = {linear corrections}[B(0)a ]. (3.37) Here F and G are linear maps defined by (3.27) and (3.28) as F [Va] := σabV˙ b + 2σ˙abV b, (3.38) 40 Einstein–Maxwell interactions G[Va] := abcσ cdV :bd + abcσ cd:bVd + ( ¯curlσab)V b, (3.39) with the covariant time and space derivatives equal to their partial counter- parts at linear perturbative order. The divergences of the induced electromagnetic field contain terms that are generally nonzero, even in the absence of sources. Equation (3.36) in particular has been interpreted as an effective four-current generator for the induced field [102], although there is no similar analogy for its magnetic counterpart (3.37). A more suitable comparison might be to think of the corrections in (3.36) and (3.37) as “polarisation” and “magnetisation” effects generated by the space- time perturbations, with Ea and B (0) a playing the respective roles of the electric displacement and auxiliary magnetic fields [127]. In this work, we consider a far-field background GW that is plane, monochromatic and linearly polarised with constant amplitude. The geometrical-optics approximation is valid whenever the gravitational wave- length is much smaller than the background radius of curvature, i.e. across the distant wave zone of a typical astrophysical source and well into its local wave zone [23]. More realistic (multimodal) inspiral-type waveforms may be built from superpositions of our simplified model, with the resultant imprint on the electromagnetic field bearing the characteristics of the source waveform. The background GW is a solution to (3.29) (with ˜ = ¯) and d¯ivσab = 0. Choosing coordinates xa such that it propagates in the positive z-direction with zero initial phase, the wave is described by the real part of σab = σ exp (−ik(t− z))pab, (3.40) where σab has been promoted to a complex tensor. The unit polarisation tensor pab has nonzero components p11 = −p22 = cos 2α and p12 = p21 = sin 2α for some wave polarisation angle α. Any linear corrections in (3.34)–(3.37) are obtained in the usual way with the metric perturbation, which is given by hTTab = < [ 2i k σab ] (3.41) in accordance with (3.26). §3.4 Solutions and observational implications 41 3.4 Solutions and observational implications We now consider two simple models for the free field {E(0)a , B(0)a } in (3.34)– (3.37), and discuss their astrophysical implications. Section 3.4.1 deals with the effects of gravitational radiation on a static electromagnetic field, while GW–EMW interactions are examined in Section 3.4.2. 3.4.1 Static electromagnetic field When an EMW propagates through a static electromagnetic field, it is reso- nantly converted to a GW of the same frequency and wavevector; the GW is sourced by a stress–energy tensor proportional to both the radiative and static electromagnetic fields [99]. Astrophysical GWs generated through this “Gert- senshteı˘n process” are generally too weak to be of practical interest [6]. The Gertsenshteı˘n effect and its inverse process, where a GW in a static electromag- netic field induces an EMW proportional to both fields, might nevertheless be relevant for detecting individual gravitons [128] or high-frequency GWs [129]. The inverse Gertsenshteı˘n process is as inefficient as its counterpart, and the fraction of gravitational energy converted is small (. 10−10) even under typical conditions in pulsar environments [100]. However, the energy in the induced EMW might be comparable to that radiated conventionally by astro- physical systems where both the gravitational radiation and magnetic field are strong (but still in the far-field regime of Section 3.3). Hence it is worthwhile to derive the inverse Gertsenshteı˘n effect within our framework, and to revisit the feasibility of detecting it in observations. For a plane GW propagating in a uniform magnetic field, the field com- ponent in the direction of the wavevector does not affect the induced EMW. Considering only the projection of the magnetic field onto the (x, y)-plane, the free field may be written as E(0)a = 0, (3.42) B(0)a = B (0)p(0)a , (3.43) with the unit polarisation vector p(0)a = (0, cos β, sin β, 0) for some field polarisa- tion angle β. All linear corrections in (3.34)–(3.37) vanish for static and uniform 42 Einstein–Maxwell interactions electromagnetic fields, and we expect separable solutions to the system. Iso- lating the spatial dependence in our ansatz as a scalar harmonic, we promote (3.30) and (3.31) to complex vectors and write E(1)a = Ea exp (ikz), (3.44) B(1)a = Ba exp (ikz), (3.45) where {Ea,Ba} depends only on time. Equations (3.34)–(3.37) now simplify to an ordinary differential equation in time for the sole independent component of the induced field. Solving this with homogeneous initial conditions, we arrive at Ea =  bca Bbδ3c , (3.46) Ba = 1 2 hB(0)(kt exp (−ikt)− sin kt)p(1)a , (3.47) where h = 2σ/k and p(1)a = p ba p (0) b = (0, cos (2α− β), sin (2α− β), 0). Equations (3.46) and (3.47) describe a plane, monochromatic and linearly polarised EMW; its amplitude is given by E(1) = B(1) = 1 2 hB(0)(k2t2 − kt sin (2kt) + sin2 kt) 12 , (3.48) which is proportional to time for large t. The period TGW = 2pi/k and strength h of the sinusoidal background GW determine a natural timescale TGW/(2pih), at which B(1) ∼ B(0) and higher- order perturbations to the electromagnetic field become significant. In reality, the linear growth in (3.48) is contingent on a steady build-up of oscillations over time, and is more of an upper bound for EMWs induced by chirp- or ringdown-type GWs with evolving frequency and/or amplitude. We incorpo- rate such waveforms with the generalised model σab = 1 2 (k + k˙t)h exp (−i(k + 1 2 k˙t)t+ ikz − λt)pab, (3.49) where the spatial dependence has been left unchanged from (3.40) to main- §3.4 Solutions and observational implications 43 Figure 3.1: Induced EMW amplitude relative to background magnetic field strength for different gravitational waveforms with fGW = 10 Hz and hI = 10−3: a sinusoid (S), a chirp with f˙GW = 10n Hz/s (Cn), and a ringdown with τGW = 10 n s (Rn). tain separability. When λ = 0, (3.49) describes a linear chirp with constant chirp rate f˙GW = k˙/(2pi), while for k˙ = 0 it gives a ringdown with damping timescale τGW = 1/λ. Equations (3.34)–(3.37) may then be solved analytically to yield Fresnel-like integrals in the chirp case, and solutions with bounded exponential growth in the ringdown case. The inverse Gertsenshteı˘n effect is potentially significant in the context of neutron star binaries, since the induced EMW is proportional in strength to both h and B(0). While the stable GWs from the early inspirals of such sys- tems might be conducive to resonant growth, any associated magnetic fields will have fallen off drastically where the wave zone for gravitational radiation begins. Hence we consider a neutron star binary coalescence in an interaction region I with the strongest possible GW strain hI and magnetic field strength BI , i.e. at the inner edge RI := c/(2pifGW) of the local wave zone [23]. Figure 3.1 shows the ratio B(1)/B(0) for various gravitational waveforms, using typical values of (initial) frequency fGW = 10 Hz and measured strain h⊕ = 10−21 (hI := h⊕R⊕/RI = 10−3) that correspond to a neutron star binary 44 Einstein–Maxwell interactions coalescence at R⊕ = 102 Mpc. With such a large interaction strain, we have B(1) ↗ B(0) in just 300 GW periods for the sinusoidally driven EMW, which is well within the ∼ 104 waveform cycles observable by LIGO. In general, how- ever, the induced EMW amplitude is reduced with increasing variability in the gravitational waveform. The inverse Gertsenshteı˘n effect is insignificant for the R−1 waveform, and hence completely negligible for actual stellar-mass ringdowns with their damping timescales of ∼ 10−5 s. By Poynting’s theorem [127], the spacetime-averaged power density trans- ferred from a sinusoidal GW to its induced EMW is (at leading order in time) − 〈u˙GW〉 = 〈d¯ivSa〉 = 1 8µ0 h2IB 2 Iω 2 GWt, (3.50) where units have been restored and ωGW = 2pifGW.5 The GW energy density is given via (2.18) by uGW = c2 32piG h2Iω 2 GW. (3.51) Hence the fraction of gravitational energy converted in the region I is Υ = 2piG µ0c2 B2I t 2, (3.52) in accordance with the original Gertsenshteı˘n result [99, 100]. Even for a neu- tron star binary coalescence containing a magnetar6 with radius RS = 104 m and surface field strength BS = 1011 T (BI := BSR3S/R 3 I = 10 3 T), Υ over 104 GW periods is small (∼ 10−9). At leading order in time, the time-averaged Poynting flux of the induced EMW at the interaction distance RI is given by 〈S〉 := √ 〈Sa〉〈Sa〉 = c 24µ0 h2IB 2 Iω 2 GWt 2. (3.53) Like the magnetic dipole radiation emitted by a neutron star, the Gertsenshteı˘n radiation typically dwarfs the beamed radiation arising from synchrotron 5 Note that the covariant Poynting vector is now Sa =  bca <[Eb]<[Bc]. 6 Magnetars are highly magnetised neutron stars with typical periods of 1 to 10 s and sur- face fields ranging from 109 to 1011 T [130]. §3.4 Solutions and observational implications 45 emission in the magnetosphere, but can neither propagate through the ionised interstellar medium nor be detected by existing radio telescopes due to its low frequency (. 103 Hz). It is more instructive to compare (3.53) with the angle- averaged flux density of the maximal magnetic dipole radiation at RI , which is given by [131] 〈Sdip〉 = 1 6µ0c3 B2Sω 4 dipR 6 SR −2 I , (3.54) where ωdip is the neutron star’s angular velocity. For a neutron star binary coalescence containing a millisecond pulsar with radius RS = 104 m and surface field BS = 106 T, we have 〈S〉 ∼ 109 W/m2 after 300 GW periods and 〈Sdip〉 ∼ 1017 W/m2. If the pulsar is replaced by a similarly sized magnetar with a 1 s period and 1011 T surface field, the aver- age flux generated through the Gertsenshteı˘n process after 300 GW periods is ∼ 1019 W/m2—a good 104 times larger than that due to the magnetar’s dipole radiation. Although this excess flux cannot be detected directly, it should in principle contribute significantly to the heating of any bipolar outflows or nearby interstellar clouds. The resultant secondary emission of pulsed elec- tromagnetic radiation (with pulse frequency fGW) might then be observable by conventional telescopes across a range of bands, depending on the compo- sition of the surrounding nebula. 3.4.2 Electromagnetic radiation Interactions between gravitational and electromagnetic radiation in the far field are most prominently characterised by a variety of interference-like (but fully nonlinear) effects on the latter. For our framework, we consider a free EMW that is plane, monochromatic and linearly polarised; since electromag- netic wavelengths are typically much shorter than gravitational and astrophys- ical length scales, our choice is motivated by the validity of geometrical optics as much as the suitability of plane harmonics to the tensor–vector contractions in (3.38) and (3.39). The EMW is described by the real part of E(0)a = E (0) exp (i(nbx b + ψ))p(0)a , (3.55) 46 Einstein–Maxwell interactions B(0)a = 1 n  bca nbE (0) c , (3.56) where the four-wavevector na = n(−1, sin θ cosφ, sin θ sinφ, cos θ) has the usual polar and azimuthal angles (with respect to the z-axis), and ψ is the initial phase relative to (3.40). The unit polarisation vector now lies in the plane or- thogonal to the spatial wavevector nµ, and is defined such that p (0) 3 = sin θ sin γ for some wave polarisation angle γ. For separable solutions, the tensor–vector contractions in (3.38) and (3.39) motivate the ansatz E(1)a = 1 2 (E (+)a exp (im(+)µ xµ) + E (−)a exp (im(−)µ xµ)), (3.57) B(1)a = 1 2 (B(+)a exp (im(+)µ xµ) + B(−)a exp (im(−)µ xµ)), (3.58) where m(±)µ := nµ ± kµ (with kµ = kδ3µ) are spatial wavevectors associated with the first-order perturbation, and we have used the phasor multiplication rule <[exp (iΦ)]<[exp (iΨ)] = 1 2 <[exp (i|Φ + Ψ|) + exp (i|Φ−Ψ|)]. (3.59) The scalar Helmholtz harmonics exp (im(±)µ xµ) decouple from (3.34) and (3.35), leaving a system of ordinary differential equations in time for {E (±)a ,B(±)a }. Al- though the choice of ansatz is amenable to plane-wave solutions, the diver- gences (3.36) and (3.37) depend on the angular configuration {θ, φ, α, γ} of the waves, and are in general nonzero. As it turns out, the full system (3.34)–(3.37) of evolution and constraint equations is inconsistent with (3.57) and (3.58) for all but two wave configurations: parallel (θ = 0) and antiparallel (θ = pi), both of which yield plane-wave perturbations. GW–EMW interactions have previously been studied in the 1+3 formal- ism, but neglecting the electric–magnetic self-interaction terms in the evolu- tion equations (3.27) and (3.28) (i.e. setting G = 0 in (3.39)). This decouples the spatial dependence without explicit knowledge of the covariant Helmholtz harmonics, and for parallel waves the resultant ordinary differential equation describes a resonantly driven oscillator with natural and driving frequency m = n + k [120, 126]. By considering the full equations, however, we find that §3.4 Solutions and observational implications 47 the effect of G is to cancel the terms due to F when θ = 0, such that (3.34) and (3.35) become homogeneous wave equations. In other words, parallel waves do not interact at all. Such cancellation does not occur for antiparallel waves, although we find no resonant interaction either. The lack of interaction be- tween parallel GWs and EMWs is a known result that has been obtained via other approaches in the literature [104, 106, 107]. For a general interaction angle θ, (3.34)–(3.37) do not admit two-mode solu- tions of the form (3.57) and (3.58). Nevertheless, the m(±)-modes are dominant when the frequency ratio ρ := k/n is small, in the sense that the evolution and constraint equations are consistent at leading order when ρ sec θ  1. Since ρ . 10−4 in most astrophysical scenarios (the highest-frequency GW sources have fGW ∼ 103 Hz [25], while fEM ∼ 107 Hz is the lowest frequency that mod- ern radio telescopes are sensitive to [132, 133]), the two-mode ansatz is justifi- able and valid for all angular configurations except orthogonal waves. Solving the ordinary differential equations for {E (±)a ,B(±)a } with homoge- neous initial conditions, we obtain a wave perturbation with a complicated dependence on {k, n, θ, φ, α, γ} (see Appendix A). The solution (A.1)–(A.10) remains linearly polarised, however, as its components have a common phase offset ψ and time dependence E (±)a ,B(±)a ∝ m(±) exp (−i(n± k)t) −m(±) cos (m(±)t) + i(n± k) sin (m(±)t), (3.60) where m(±) = (k2 + n2 ± 2kn cos θ)1/2. This represents an effective splitting of the EMW frequency into four perturbation frequencies m(±) and n ± k, along with the original free frequency n. The amplitude of the wave perturbation vanishes in the limit for parallel waves, and is ∼ hE(0) for antiparallel waves; its characteristic size for general θ is E(1), B(1) = O(hE(0)/ρ), (3.61) which indicates that nonlinear interference effects between GWs and EMWs might become significant as h ↗ ρ. When h > ρ, higher-order perturbations come into play and the validity of the perturbative approach is limited. 48 Einstein–Maxwell interactions (a) (b) (c) (d) (e) (f) Figure 3.2: Perturbed Poynting flux envelope for h = ρ at different interaction angles between θ = pi/2 and θ = 0 ((a)–(d)), along with comparisons of the four configurations over different timescales ((e) and (f)). The bolded curves in (a)–(d) are for h = 10−1ρ. §3.4 Solutions and observational implications 49 To illustrate the behaviour in the h ∼ ρ regime, we define here a complex covariant Poynting vector7 Sa :=  bc a EbB ∗ c , (3.62) which gives the envelope S := (S∗aSa)1/2 of the usual Poynting vector magni- tude for the full field {Ea, Ba}. Due to the presence of cross terms in (3.62), the Poynting flux envelope is spatially periodic on the gravitational length scale 2pi/k. Figure 3.2 shows the relative flux envelope S/S(0) for a plus-polarised GW (α = 0) and an EMW in the (y, z)-plane ((φ, γ) = (pi/2, 0)), where S is evaluated at xµ = 0 and S(0) is the constant flux envelope of the free EMW. In accordance with previous results [126], there is an emergence of θ- dependent beats in the perturbed EMW, with frequency given by the greatest common divisor of the spectrum {n,m(±), n ± k}. It is useful to define an ap- proximate beat period Tbeat(θ) := 2pi/(k − (m(+) − m(−))/2), which describes much of the beat structure for most values of θ. As the interaction angle de- creases from pi/2− to  (where  = 10−3), the peaks for the extremal case h = ρ increase from around S/S(0) = 2 to a limiting value of S/S(0) = 9. Addition- ally, we find significant nonlinear amplification of the beats as h is raised from 10−1ρ to ρ. Beating effects are essentially negligible for h . 10−3ρ. There is an overall flux increase apparent in Figure 3.2, attributable to the transfer of energy from the GW to the electromagnetic field as in Section 3.4.1. For a clearer picture of this flux amplification and its dependence on interac- tion angle, we require the Poynting flux averaged over finite time intervals T , which we denote explicitly as 〈S〉T = (〈Sa〉T 〈Sa〉T )1/2 with 〈Sa〉T = 1 T ∫ T 0 dtSa. (3.63) Considering the same angular configuration as before, Figure 3.3 shows a se- quence of polar plots (with respect to interaction angle) for 〈S〉T/〈S(0)〉 with increasing T , where 〈S〉T is evaluated at xµ = 0 and 〈S(0)〉 for the free EMW is effectively constant over gravitational timescales. When h ∼ ρ, the overall flux in the forward sector |θ| < pi/2 is approximately doubled for small |θ| after 7 Our definition differs by a factor of 1/2 from the conventional complex Poynting vector, which is used to calculate time-averaged flux for sinusoidal plane waves. 50 Einstein–Maxwell interactions Figure 3.3: Perturbed time-averaged Poynting flux 〈S〉T/〈S(0)〉 as a radial func- tion of interaction angle, for different time intervals between T = 100TGW and T = 105TGW. Each plot is for h = ρ, with θ = 0 on the positive horizontal axis such that the GW propagates to the right. §3.4 Solutions and observational implications 51 just 102 GW periods. There is little to no flux amplification in the backward sector |θ| > pi/2. We note that the interaction between parallel waves vanishes as expected, with the beat frequency and the induced field itself going to zero smoothly as |θ| → 0; the seemingly pathological behaviour of 〈S〉T/〈S(0)〉 at θ = 0 is attributable to the non-smoothness of the time-averaging operation (3.63) in the limit as T →∞. The nonlinear interference depicted in Figures 3.2 and 3.3 is potentially relevant for GW sources with electromagnetic counterparts that are long- lived (lasting at least several GW periods), and preferably high-frequency (ρ . 10−10) for effects to be significant at low interaction strains. Possible counterparts for a compact binary system are a pulsar component as in Sec- tion 3.4.1 or, more promisingly, an extended electromagnetic source such as a bipolar outflow or interstellar cloud around the binary. If an extended source emits radiation in the band fEM ∼ fGW/hI , its radiation profile might be char- acterised by intensity fluctuations and overall flux amplification at small an- gular distances from the binary’s sky location; the fluctuations should increase in frequency to fbeat(pi/2) = fGW as the interaction angle widens, then diminish rapidly at larger angular distances as hI falls below 10−1ρ. Figure 3.3 effectively describes the flux amplification at different interaction angles, but there is actually a tiny deflection of the perturbed time-averaged Poynting vector 〈Sa〉 in the direction of the GW. The original Poynting vector, averaged over all time, is given simply by 〈S(0)a 〉 = 12<[S(0)a ]. (3.64) Its perturbed counterpart reduces to 〈Sa〉 = 〈S(0)a 〉+ 12 〈<[S(1)a ]〉+ 12 〈<[ bca E(1)b B(1)c ]〉 , (3.65) since both cross terms average to zero over all time. The first two terms in (3.65) depend only on the angular configuration of the waves and have no spatial dependence, while the spatial dependence in the third term is negligible for ρ 1. We consider the deflection angle Θdef between (3.64) and (3.65) with the 52 Einstein–Maxwell interactions same angular configuration as before; expanding the angle in powers of both h and ρ, we find Θdef = O(min {h2/ρ, ρ}), (3.66) which is valid in the forward sector but away from θ = 0, where Θdef goes sharply to pi/2 due to the time-averaging operation. Equation (3.66) becomes Θdef = O(h2/ρ) for h < ρ, such that the deflection of the time-averaged Poynting vector varies with both fGW and fEM. This is a new result, although a frequency-dependent deflection of time-averaged flux does not necessarily imply the dispersion of light by GWs. The maximal angle Θdef ∼ h agrees with previous results for the ray deflection angle in other approaches [104, 111]. A hydrogen-line radio wave passing the neutron star binary coalescence of Section 3.4.1 with an impact parameter corresponding to hI ∼ ρ ∼ 10−8 will have its Poynting vector deflected by ∼ 10−3 arcsec; this is comparable to the deflection due to conventional gravitational lensing by the same system (∼ 10−2 arcsec). Such angular deviations are too small to be observed directly, but might be amenable to microlensing techniques. Another well-documented GW–EMW interaction is the gravitational ana- logue of Faraday rotation experienced by an EMW in the field of a passing GW; if the projection of the EMW polarisation vector onto the GW polarisation plane is aligned with the plus mode, it will undergo a slight (oscillatory) ro- tation as long as the cross mode is nonzero [106, 107, 109]. In our framework, there is indeed no rotation for a plus-polarised GW (α = 0) and an aligned EMW ((φ, γ) = (pi/2, 0)), since <[E(0)a ] and <[Ea] are parallel. We consider the rotation angle8 Θrot between E (0) a and Ea for a cross-polarised GW (α = pi/4) and the same EMW at xµ = 0; expanding the angle in powers of h, we find |Θrot| = O(h), (3.67) in accordance with previous results [106, 107]. The rotation angle also oscil- lates at ∼ fGW as expected [106], with beat frequency fbeat(θ). Again, since we have h . ρ . 10−4 in most astrophysical scenarios, the gravitationally in- 8 We use the complex fields to smooth out oscillations on the electromagnetic timescale; the angle Θ between two complex vectors Va and Wa is given by cos Θ = <[V ∗aW a]/(VW ). §3.5 Discussion 53 duced Faraday rotation due to the passage of a far-field GW will typically be . 10 arcsec and difficult to detect using modern techniques. 3.5 Discussion In this chapter, we have studied far-field interactions between gravitational ra- diation and electromagnetic fields in the 1+3 approach to relativity, with a view to characterising observable signatures on any electromagnetic radiation emit- ted by astrophysical GW sources. Linearised evolution and constraint equa- tions for the electromagnetic field on a GW-perturbed spacetime have been approximated and solved perturbatively on Minkowski space, where the rele- vant harmonic expansions are explicitly known and analytically tractable. We have rederived the inverse Gertsenshteı˘n effect by applying this frame- work to the interaction of a plane GW with a static electromagnetic field, and have also considered the resonantly induced electromagnetic radiation in an astrophysical setting. Order-of-magnitude calculations have shown that the Gertsenshteı˘n radiation is comparable to the magnetic dipole radiation for highly magnetised neutron stars in compact binary systems; in the presence of a surrounding nebula, this might lead to a secondary emission of electro- magnetic radiation pulsed at the GW frequency. Several geometrical-optics effects have been found in the case of interacting GWs and EMWs. There is no resonant growth of the electromagnetic field as found in other work, due to the additional consideration of electric–magnetic self-interaction contributions here. We have also demonstrated that the non- linear fluctuation and amplification of electromagnetic energy flux becomes significant as the GW strain approaches the GW–EMW frequency ratio from below, and might serve as a distinctive astrophysical signature of gravitational radiation emitted near or within an extended electromagnetic source. From the various assumptions and approximations employed in this work, it is evident that the analytical advantages of the 1+3 approach are limited even for our simple model. A calculation of second-order perturbations in- duced by the first-order fields via (3.34)–(3.37) might provide a clearer picture of interacting waves in the h ∼ ρ regime, although a rapid blow-up of the full 54 Einstein–Maxwell interactions field (signalling the breakdown of the perturbative approach) is more likely. Numerical solutions of (3.27) and (3.28) or their unlinearised versions might be worth pursuing in this case, both to verify results from the perturbative approach and to facilitate more accurate models by extending the framework into the h > ρ regime. Our results have observational implications for two types of astrophysi- cal source: compact sources with large values of h and B(0) (for the inverse Gertsenshteı˘n effect to be relevant), and extended ones with a wide range of interaction angle (for more prominent nonlinear interference effects). They are not restricted to any specific example suggested here, however; neither have we considered scenarios where the gravitational and electromagnetic sources are separate. Detailed source models that incorporate far-field gravitational– electromagnetic interactions—or any Einstein–Maxwell coupling in general— will be an asset to the detection and observation of GW sources at present, and indeed the larger realm of multimessenger astronomy in the future. Chapter 4 Augmented kludge waveforms To prepare for the formulation of the LISA mission over the next few years, several outstanding and urgent questions in data analysis must be addressed using mock data challenges. These data challenges will require accurate and computationally affordable waveform models for anticipated sources such as EMRIs. Previous data challenges have made use of the well-known analytic kludge waveforms of Barack & Cutler, which are extremely quick to generate but dephase relative to more accurate waveforms within hours, due to their mismatched radial, polar and azimuthal frequencies. In this chapter, we describe an augmented Barack–Cutler waveform model that uses a frequency map to the correct Kerr frequencies, along with updated PN evolution equations and a local trajectory fit to a more accurate model. The augmented kludge waveforms stay in phase for months and may be generated with virtually no additional computational cost. The material in this chapter is adapted from [59, 134], but has been updated to include new unpublished results that were obtained with the latest version of the waveform model [135]. 4.1 Background With the recent success of the LISA Pathfinder mission [18], space-based GW astronomy is now one step closer to becoming a reality. The launch and opera- tion of Pathfinder provides an early demonstration of the technology necessary for proposed detectors such as LISA to probe the source-rich and scientifically rewarding millihertz GW sky. Results from the mission also offer vital insights 55 56 Augmented kludge waveforms into the noise properties of the LISA instrument; these will be used over the next few years to address several important open questions in data analysis, which must be dealt with prior to mission adoption at the end of this decade. Accurate and computationally affordable models of likely sources will allow such issues to be investigated in forthcoming mock data challenges, and are therefore urgently needed. EMRIs are an important type of source for LISA and other space-based de- tectors. Radiation reaction from the emission of GWs causes the orbit of the compact object around the central black hole to shrink and circularise adia- batically. During the final years of inspiral, the orbital dynamics display ex- treme relativistic effects that arise because the compact object is deep within the strong-field region of the black-hole spacetime. These effects are imprinted on the GW signal from the source; measuring them will allow us to map out the multipole structure of the spacetime, and hence to test the strong-field va- lidity of black-hole solutions in general relativity [136]. The optimal detection and identification of EMRI signals depends on the availability of a waveform model that is as accurate as possible. While the ex- treme mass ratio prohibits the use of NR simulations, it does allow the source to be modelled faithfully within the framework of black-hole perturbation the- ory. At first order in mass ratio, such models are based on solving for the outgoing radiation field Ψ4 with the Teukolsky equation (2.38) (sourced by a compact object moving along a Kerr geodesic). Orbital evolution may be intro- duced in Teukolsky-based models by balancing the change in orbital energy and angular momentum against the radiation fluxes of “snapshot” geodesic waveforms. The ongoing development of self-force calculations will eventu- ally provide a more accurate evolution that accounts for both dissipative and conservative self-interaction effects at higher orders in mass ratio. For scoping out data analysis, the large parameter space of EMRI mod- els and the complexity of their waveforms necessitates the use of templates that can be generated as quickly as possible. As Teukolsky-based waveforms are computationally expensive, they have been supplemented by approximate waveforms designed for robust use in data analysis. The waveform model used for previous mock LISA data challenges [137] is the analytic kludge (AK) §4.2 Kludge waveform models 57 [57], in which the orbit is built from Keplerian ellipses, with relativistic inspi- ral, periapsis precession and Lense–Thirring precession imposed using ana- lytic PN evolution equations. The AK model is extremely quick to compute, but is less accurate than the numerical kludge (NK) model [58], which com- bines Kerr geodesics with Teukolsky-fitted PN orbital evolution for greater ac- curacy. In both kludge models, generation of the waveform from the orbit is sped up from Teukolsky-based models by using a flat-space approximation. In this chapter, we describe an augmented analytic kludge (AAK) wave- form model based on a frequency mapping method. The orbital frequency and two precession rates in the AK model are matched to appropriate com- binations of the fundamental frequencies describing the radial, polar and az- imuthal components of geodesic orbits in Kerr spacetime [138]. We also update the model with suitable PN evolution equations and include an additional fit to the NK model, which itself shows excellent agreement with the Teukolsky- based geodesic and inspiral waveforms in [62, 63]. The length of time over which the AAK waveform stays in phase with the NK waveform is typically increased by a few orders of magnitude, while the added computational cost is insignificant as the map-and-fit is only performed locally. A brief overview of the AK and NK models is given in Section 4.2, while the AAK model is presented in Section 4.3. In Section 4.3.1, we first introduce the Kerr fundamental frequencies and the parameter-space map they induce in the AK model, before describing the technical details of the AAK implementation in Section 4.3.2. The performance of the augmented model is then compared to that of the original AK model in Section 4.3.3, with the more accurate but slower NK model used as the benchmark for both. 4.2 Kludge waveform models A kludge in the context of EMRI modelling is any approximate model that uses a combination of formalisms to generate waveforms quickly and extensively for sampling purposes. Kludge waveforms capture many qualitative features of more accurate EMRI waveforms, and (owing to their modular construction) can be modified to incorporate self-force information as it becomes available. 58 Augmented kludge waveforms Two widely used kludges are introduced briefly in this section: the AK waveform of Barack & Cutler [57], which is very fast to compute and provides the basis for our new model, and the NK waveform of Babak et al. [58], which we take as a fiducial model for calibration and benchmarking purposes. Other approximate EMRI models exist but at varying levels of implementation (e.g. [139, 140]), and we do not consider them in this work (apart from the PN fluxes of Sago & Fujita [141], which are used as part of the AAK model). Assuming the spin of the compact object is negligible, an EMRI can be de- scribed by 14 parameters: the two masses (µ,M  µ) of the system, the three components of the central black hole’s spin vector S, three constants E de- scribing the compact object’s (instantaneous) orbit, the three components of the compact object’s position vector X with respect to the black hole, and the three components of the system’s position vector R with respect to the So- lar System. Of these 14 degrees of freedom, seven are extrinsic to the source: two in S and one in X (corresponding to spatial rotation of the source), three in R (corresponding to spatial translation), and one in E (corresponding to temporal translation). The parameters of an EMRI model are often chosen to decouple the intrinsic degrees of freedom from the extrinsic ones, which are generally cheaper to search over during data analysis [142]. Schematically, the main ingredients of a kludge waveform model are then (i) the evolution of the orbital constants along the inspiral (i.e. the “phase- space” trajectory), using PN or fitted fluxes F: E˙ = F(µ,M,S,E); (4.1) (ii) the construction of the compact object’s worldline (i.e. the “configuration- space” trajectory), using geodesic or flux-derived expressions G: X˙ = G(µ,M,S,E); (4.2) and (iii) the generation of the waveform field h at the detector, using some weak-field multipole formula H : h(t) = H(X,R). (4.3) §4.2 Kludge waveform models 59 4.2.1 Analytic kludge In the AK model [57], both the orbital trajectory and the waveform are com- puted in a flat-space approximation, with relativistic effects such as inspi- ralling and precession added separately. The trajectory is built out of rotating Keplerian ellipses. Radiation reaction is introduced in phase space, where the orbital constants describing a Keplerian ellipse are evolved with PN equations. In configuration space, the orientation of this ellipse is also evolved with PN equations to simulate relativistic precession. The waveform is then generated using the Peters–Mathews mode-sum approximation for Keplerian orbits [35], in which the mass quadrupole moment is decomposed into harmonics of the Keplerian orbital frequency. Since a Keplerian orbit is confined within the plane normal to its angular momentum vector L, the AK waveform is constructed in an L-based coordi- nate frame (xˆ, yˆ, zˆ)L˜ := ( (Rˆ · Lˆ)Lˆ− Rˆ SL,R , Rˆ× Lˆ SL,R , Lˆ ) , (4.4) and projected transverse to the wave frame (xˆ, yˆ, zˆ)AK := ( Rˆ× Lˆ SL,R , Lˆ− (Lˆ · Rˆ)Rˆ SL,R ,−Rˆ ) , (4.5) where the normalisation factor SL,R := (1 − (Lˆ · Rˆ)2)1/2. These two frames are made time-varying (with respect to a fixed heliocentric and ecliptic-based frame [79]) through the forced precession of L. The two waveform polarisations in the transverse–traceless gauge (with respect to (xˆ, yˆ)AK) are given by the n-mode sums h+ = ∞∑ n=1 h+n , (4.6) h× = ∞∑ n=1 h×n , (4.7) 60 Augmented kludge waveforms with h+n = (1 + (Rˆ · Lˆ)2)(bn sin 2γ˜ − an cos 2γ˜) + (1− (Rˆ · Lˆ)2)cn, (4.8) h×n = 2(Rˆ · Lˆ)(bn cos 2γ˜ + an sin 2γ˜), (4.9) where γ˜ is an azimuthal angle in the orbital plane measuring the direction of periapsis with respect to (Rˆ·Lˆ)Lˆ−Rˆ (i.e. the orthogonal projection of zˆAK onto the plane normal to Lˆ).9 The functions (an, bn, cn) describe the changing mass quadrupole moment of a Keplerian orbit with mean anomaly Φ(t), eccentricity e and orbital angular frequency Φ˙, and are given by [35] an = −nA(Jn−2(ne)− 2eJn−1(ne) + (2/n)Jn(ne) +2eJn+1(ne)− Jn+2(ne)) cosnΦ, (4.10) bn = −nA(1− e2)1/2(Jn−2(ne)− 2Jn(ne) + Jn+2(ne)) sinnΦ, (4.11) cn = 2AJn(ne) cosnΦ, (4.12) where the Jn are Bessel functions of the first kind, and A = (Φ˙M)2/3µ/|R| in the extreme-mass-ratio limit. In the ecliptic-based coordinate system, the sky position Rˆ ≡ (θS, φS) of the source and the black-hole spin orientation Sˆ ≡ (θK , φK) are effectively constant. It is convenient to represent Lˆ in ecliptic coordinates with respect to Sˆ (where both vectors are moved by parallel transport to the Solar System barycentre). We have Lˆ = Sˆ cos ι+ ( zˆ− (zˆ · Sˆ)Sˆ |zˆ− (zˆ · Sˆ)Sˆ| cosα + Sˆ× zˆ |Sˆ× zˆ| sinα ) sin ι, (4.13) where zˆ = [0, 0, 1]T is normal to the ecliptic plane, ι is the inclination angle between Lˆ and Sˆ, and α is an azimuthal angle in the spin-equatorial plane measuring the direction of Lˆ − (Lˆ · Sˆ)Sˆ with respect to zˆ − (zˆ · Sˆ)Sˆ (i.e. the 9 We have changed some of the notation in [57] for consistency, since we are constructing a hybrid model using different formalisms. For example: the notation for the angles γ and γ˜ has been swapped; the notation ν for the orbital frequency Φ˙/(2pi) is unused; the notation for the inclination λ is now ι. §4.2 Kludge waveform models 61 angle between the orthogonal projections of Lˆ and zˆ onto the plane normal to Sˆ). Furthermore, since γ˜ is neither intrinsic nor extrinsic, it is useful to define the purely intrinsic parameter γ := γ˜ − β, where β = β(Rˆ, Sˆ, Lˆ) = β(θS, φS, θK , φK , ι, α) is an azimuthal angle in the orbital plane measuring the direction of Lˆ× Sˆ with respect to (Rˆ · Lˆ)Lˆ− Rˆ. Only one of the six parameters comprising E = (e, ι, Φ˙) and X = (Φ(t), γ, α) in the above Keplerian setup changes with time. In the AK model, (e, Φ˙, γ, α) are promoted to functions of time and evolved with mixed-order PN expres- sions that depend on (µ,M, a = |S|/M,E) [143–146], while ι is approximated as constant (since the inclination angle of a typical EMRI varies extremely slowly [147]). The Keplerian orbit shrinks and circularises as Φ˙(t) and e(t) increase and decrease respectively. From (4.13), the time dependence of the orbital ori- entation Lˆ(t) is confined to α(t), where α˙ is precisely the angular rate of Lense– Thirring precession. Finally, the angular rate of periapsis precession is given by γ˙ + α˙ since γ(t) is measured with respect to Lˆ(t)× Sˆ. While the waveform field h+,× is effectively planar at the Solar System and may be calculated in the fixed heliocentric frame (as opposed to a detector- centric one), the rotational and orbital motion of LISA in the ecliptic plane must be factored into the detector’s response to h+,×. In the standard LISA framework, the field is transformed into the response functions hI,II via (2.46) and (2.47), with F+,×I (θS, φS, θK , φK) given explicitly in [79]; these rotate with respect to Rˆ as the plane of the detector along its orbit precesses around the ecliptic plane. Doppler modulation of the waveform phase (through Φ(t)) is also included to correct for the orbital motion of the detector itself. The AK model was the first waveform model used to investigate the preci- sion of LISA parameter estimation over the full (modulo compact-object spin) EMRI parameter space [57]. Due to its computational efficiency, the model has also been employed in past mock LISA data challenges to generate injected signals in simulated data and parametrised templates for search algorithms [137]. However, the approximate waveforms it produces are demonstrably inaccurate, and will result in reduced detection and parameter estimation per- formance if used to analyse data sets containing realistic EMRI signals. 62 Augmented kludge waveforms 4.2.2 Numerical kludge In the NK model [58], the orbital trajectory is computed in curved space with a treatment that is fully relativistic up to the evolution of orbital constants [148, 149], i.e. it is built out of Kerr geodesics. The three constants of mo- tion for a geodesic are evolved with Teukolsky-fitted PN equations, which introduces radiation reaction. In configuration space, precession effects are obtained for free by integrating the geodesic equations along the phase-space trajectory. The curved-space coordinates of the compact object’s worldline are then associated artificially with coordinates in flat space, and the waveform is generated using the standard quadrupole formula (or variants that include additional contributions from higher-order moments of mass [150, 151]). The NK waveform is constructed in an S-based coordinate frame (xˆ, yˆ, zˆ)S := ( Rˆ× Sˆ SS,R , Rˆ− (Rˆ · Sˆ)Sˆ SS,R , Sˆ ) , (4.14) and projected transverse to the wave frame (xˆ, yˆ, zˆ)NK := ( Rˆ× Sˆ SS,R , Sˆ− (Sˆ · Rˆ)Rˆ SS,R ,−Rˆ ) , (4.15) where the normalisation factor SS,R := (1 − (Sˆ · Rˆ)2)1/2. Aligning the z-axis with S is a more natural choice for the NK model, since the compact object’s worldline is computed in Boyer–Lindquist coordinates. The two wave frames (4.5) and (4.15) are related by a rotation about R. Using the decomposition (2.44), the two waveform polarisations in the transverse–traceless gauge (with respect to (xˆ, yˆ)NK) are given by h+ = 1 2 hTTij H + klδ ikδjl, (4.16) h× = 1 2 hTTij H × klδ ikδjl, (4.17) where we have switched to Latin spatial indices and δij is the Kronecker delta. Via (2.12)–(2.14) and (2.17), the far-field metric perturbation in the quadrupole §4.2 Kludge waveform models 63 approximation is given by hTTij = 2 |R| ( P ki P l j − 1 2 PijP kl ) I¨kl, (4.18) where contraction with Pij := (δij − zˆizˆj)NK projects transverse to R. In the extreme-mass-ratio limit, the mass quadrupole moment is simply Iij = µxixj, (4.19) where the xi(t) are Cartesian components of the compact object’s position vec- tor X with respect to the frame (4.14) centred on the black hole. Although (4.18) (with (4.19)) is a weak-field equation in flat-space coordinates, the NK model specifies and calculates (x1, x2, x3) = (r sin θ cosφ, r sin θ sinφ, r cos θ) in Boyer– Lindquist coordinates. The self-consistency of this approach clearly degrades further into the strong field, but does not severely impact the effectiveness of the NK waveforms as an approximation to Teukolsky-based ones [58]. A timelike Kerr geodesic is described fully by three first integrals of motion: the orbital energyE, the projection Lz of the orbital angular momentum L onto S, and the quadratic Carter constantQ. Along such an orbit, (r(t), θ(t), φ(t)) are obtained by integrating the geodesic equations for a test particle in the Kerr spacetime (2.37); these are written in canonical form as [22] Σ dr dτ = ± √ Vr, (4.20) Σ dθ dτ = ± √ Vθ, (4.21) Σ dφ dτ = Vφ, (4.22) Σ dt dτ = Vt, (4.23) where τ is proper time along the worldline and Σ = r2 +a2 cos2 θ. The potential functions Vr,θ,φ,t are given by Vr(r) = P 2 − (r2 + (Lz − aE)2 +Q)∆, (4.24) 64 Augmented kludge waveforms Vθ(θ) = Q− cos2 θ ( a2(1− E2) + L 2 z sin2 θ ) , (4.25) Vφ(r, θ) = Lz sin2 θ − aE + aP ∆ , (4.26) Vt(r, θ) = aLz − a2E sin2 θ + (r 2 + a2)P ∆ , (4.27) with P = E(r2 + a2)− aLz and ∆ = r2 − 2Mr + a2. In practice, it is convenient to work with alternative parametrisations of (E,Lz, Q). For a bound orbit, the geodesic may be specified by the parameters (rp, ra, θmin) (the values of r at periapsis and apoapsis, and the minimal value of θ respectively), which fully describe the range of motion in the radial and polar coordinates. The roots of Vr determine rp and ra, while the roots of Vθ determine cos θmin (the maximal value of cos θ). Another parametrisation is (e, ι, p) (the quasi-Keplerian eccentricity, inclination and semi-latus rectum); these are defined in terms of (rp, ra, θmin) as (e, ι, p) := ( ra − rp ra + rp , pi 2 − θmin, 2rarp ra + rp ) . (4.28) Finally, since the configuration-space parameters (r, θ) oscillate between the bounds rp ≤ r ≤ ra and θmin ≤ θ ≤ pi − θmin, it is useful to define (ψ, χ) := ( cos−1 ( p− r er ) , cos−1 ( cos θ cos θmin )) , (4.29) such that ψ (the quasi-Keplerian true anomaly) and χ are the phases of radial and polar motion respectively. The orbital constants E = (E,Lz, Q) in the above geodesic setup do not vary with time. Radiation reaction is added to the NK model by evolving E with fluxes that depend on (µ,M, a,E) (note that the inclination ι(t) = tan−1( √ Q/Lz) is correctly time-dependent in this model). These fluxes are mixed-order PN expressions that have been fitted to the results of Teukolsky- based computations for circular inclined orbits [149]. Integrating the geodesic equations along the phase-space trajectory then gives X = (ψ(t), χ(t), φ(t)), complete with relativistic precession. Once the waveform field h+,× has been §4.3 Augmented analytic kludge 65 calculated via (4.16)–(4.19), the LISA response functions hI,II may be obtained through the method outlined in Section 4.2.1. Waveforms from the NK model display excellent agreement with Teukolsky-based waveforms in the strong-field regime; they are reliable up to a closest approach of rp ≈ 5M , with typical matches of over 0.95 [58]. NK waveforms might even be accurate enough to serve as templates in actual LISA detection algorithms. However, they are still slightly expensive to generate in large numbers due to the relatively elaborate construction of the phase- and configuration-space trajectories, while added computational cost also arises in the parameter conversion (E,Lz, Q)↔ (e, ι, p), the handling of plunge, etc. 4.3 Augmented analytic kludge The AK model is 5–15 times faster than the NK model at generating year-long waveforms sampled at 0.2 Hz for a generic (101, 106)M EMRI with low initial eccentricity (e0 . 0.3); this speed-up is enhanced for longer waveform dura- tions, but diminished for higher initial eccentricity (since more modes must be summed in the Peters–Mathews approximation).10 However, AK waveforms suffer from severe dephasing with respect to NK waveforms, even at the early- inspiral stage. In Figure 4.1, the AK waveform for a (101, 106)M EMRI with initial semi-latus rectum p0 = 15M matches the qualitative features of the cor- responding NK waveform, but is a full cycle out of phase within three hours. This is due to the mismatched frequencies in the two models. In Sections 4.3.1 and 4.3.2, we describe the construction of a hybrid model that capitalises on the benefits of both kludges. The AK model is augmented with an initial map to the fundamental frequencies of Kerr geodesic motion, which corrects the instantaneous phasing as shown in Figure 4.1. Over longer timescales, the mapped orbital trajectory is further improved through self- consistent PN evolution and a local polynomial fit to the phase-space trajectory 10 The sums in (4.6) and (4.7) must be truncated at some arbitrary number of modes N , which directly affects both the speed and accuracy of the AK model. This number may be specified by setting a threshold for the relative power radiated into the N -th harmonic, and has been experimentally determined to scale linearly with eccentricity [57]. We use N = b30e0c as the default value for both the AK and AAK models. 66 Augmented kludge waveforms Figure 4.1: First 12 hours of AK (red) and AAK (green) waveforms overlaid on NK waveform (black), for the early inspiral of a (101, 106)M EMRI with initial semi-latus rectum p0 = 15M . of the NK model. Fast algorithms for higher-order fits and plunge handling have been incorporated in the latest AAK implementation, which has been made publicly available at github.com/alvincjk/EMRI_Kludge_Suite as part of a software suite for generating kludges. The initial version of the AAK model yields waveforms that can remain phase-coherent with NK waveforms for over two months, but with overlap values lower than 0.97 [59]. This is the commonly chosen minimal match for a waveform template bank that corresponds to a 90%-ideal observed event rate [83], and thus ensures the equivalent localisation of any signal detected with such banks of AAK and NK templates. In Section 4.3.3, we report fur- ther improved results for the present AAK implementation [135]. Two-month overlaps higher than 0.97 are found for EMRIs with varying spin and eccen- tricity; however, the overlaps still degrade with proximity to plunge, due to the divergence of the AAK and NK trajectories deep within the strong field. §4.3 Augmented analytic kludge 67 4.3.1 The fundamental-frequency map The geodesic equations (4.20)–(4.23) take a simple form with the choice of a timelike parameter λ = ∫ dτ/Σ [152, 153]; this decouples (4.20) and (4.21), and for a bound orbit makes the radial and polar components of motion mani- festly periodic with respect to λ. For the azimuthal and temporal components (whose potentials depend only on (r, θ)), overall rates of evolution may be ob- tained by averaging (4.26) and (4.27) over many periods of (r, θ) motion. From the radial and polar periods Λr,θ, the average azimuthal rate 〈dφ/dλ〉 and the average temporal rate 〈dt/dλ〉 (denoted Γ by analogy with the Lorentz factor), we may define three angular and dimensionless fundamental frequen- cies Ωr,θ,φ for the test particle’s motion with respect to coordinate time. In terms of (rp, ra, θmin), these frequencies are written as [141, 154] Ωr = 2pi ΛrΓ , (4.30) Ωθ = 2pi ΛθΓ , (4.31) Ωφ = lim N→∞ 1 N2ΛrΛθΓ ∫ NΛr 0 dλr ∫ NΛθ 0 dλθ Vφ(r(λr), θ(λθ)), (4.32) where Λr,θ and Γ are given by Λr = 2 ∫ ra rp dr√ Vr , (4.33) Λθ = 4 ∫ pi/2 θmin dθ√ Vθ , (4.34) Γ = lim N→∞ 1 N2ΛrΛθ ∫ NΛr 0 dλr ∫ NΛθ 0 dλθ Vt(r(λr), θ(λθ)). (4.35) Expressions for Ωr,θ,φ in terms of (e, ι, p) have been derived by Schmidt [138]; these are less compact, but have more utility in practical implementations. In terms of the fundamental frequencies, the periapsis and Lense–Thirring precession rates are given by Ωφ−Ωr and Ωφ−Ωθ respectively. These vanish in the Newtonian limit, where Ωr,θ,φ approach a single orbital frequency Ω from 68 Augmented kludge waveforms below, i.e. Ωr ↗ Ωθ ↗ Ωφ ↗ Ω. The frequency Ω is then related to (e, p) by Kepler’s third law: Ω = ( 1− e2 p )3/2 . (4.36) In the AK model, Ω is associated with the quantity Φ˙M ; however, periapsis and Lense–Thirring precession are added on top of Φ˙ via γ˙ + α˙ and α˙ respectively, and so Ω is the lowest frequency by construction. This inconsistency with the relativistic case leads to mismatched frequencies when supplying identical pa- rameters to the two models, since the same value of p in (4.36) specifies the ra- dial AK frequency while approximating the azimuthal NK frequency. In other words, the frequencies in the AK model are generally too high. A three-dimensional endomorphism over the AK space of orbits is induced by requiring that the radial, polar and azimuthal frequencies (Φ˙, Φ˙+γ˙, Φ˙+γ˙+α˙) for any (e, ι, p) have the same values as the (dimensionful) relativistic frequen- cies ωr,θ,φ := Ωr,θ,φ/M . We map the parameters (M,a, p) rather than (e, ι, p) to unphysical values; this gives better results since periapsis and Lense–Thirring precession are more directly determined by the central mass and its rotation respectively. The map (M,a, p) 7→ (M˜, a˜, p˜) is given implicitly by solving the algebraic system of equations Φ˙(M˜, a˜, p˜) = ωr(M,a, p), (4.37) γ˙(M˜, a˜, p˜) = ωθ(M,a, p)− ωr(M,a, p), (4.38) α˙(M˜, a˜, p˜) = ωφ(M,a, p)− ωθ(M,a, p) (4.39) for the unphysical set (M˜, a˜, p˜), which is defined as the root closest to the phys- ical set (M,a, p) with a Euclidean metric on parameter space. Substituting (M˜, a˜, p˜) for (M,a, p) in the AK model provides an instanta- neous correction of its frequencies at any point along the inspiral trajectory (e(t), ι(t), p(t)). In principle, applying the map along the entirety of a fidu- cial inspiral will keep the AK waveform phase-coherent with relativistic wave- forms (generated from that trajectory) until plunge. However, such an inspiral is usually more expensive to compute (as in the case of the NK model), and §4.3 Augmented analytic kludge 69 the additional cost from evaluating the map itself also scales linearly with the number of points sampled along the trajectory. Complications also arise as the compact object approaches the point of plunge, where the fundamental frequencies diverge and the map (4.37)–(4.39) is no longer well-defined. 4.3.2 Implementation In order to retain the main advantage of the AK model, computational costs are kept as low as possible by evaluating the map at a small number of points and relying on independent evolution of the orbital constants over long timescales. Firstly, the NK trajectory is generated at and around the specified initial point (M,a, e0, ι0, p0) over a user-defined timescale Tfit, which depends on the radiation-reaction timescale TRR := M2/µ and specifies the duration over which the AAK model is calibrated. The timescale Tfit and number of sample points Nfit may be adjusted adaptively based on the proximity of (e0, ι0, p0) to plunge; they typically satisfy 0.1TRR . Tfit . 10TRR and Nfit . 10 (the latter to ensure an added computational cost of. 1%). Evaluation of the map at each of the Nfit points gives a local “best-fit” trajectory (M˜(t), a˜(t), e(t), ι, p˜(t))fit for the AAK model, where the unphysical (M˜(t), a˜(t))fit now change with time and ιfit = ι0 is approximately constant over the duration Tfit. The actual trajectory (M˜, a˜, e(t), ι, p˜(t)) in the AAK model is generated in- dependently from the NK model, and hence more rapidly. From the mapped initial point (M˜, a˜, e0, ι0, p˜0) (which also lies on the best-fit trajectory by con- struction), (e(t), p˜(t)) are evolved with 3PN O(e6) expressions given by Sago & Fujita [141], while (M˜, a˜, ι) are left constant. The configuration-space evolution of (Φ, γ, α) is performed with the appropriate combinations (4.37)–(4.39) of the fundamental frequencies, given by Sago–Fujita expressions that are consistent at 3PN O(e6) with the phase-space evolution. Higher-order 4PN O(e6) expres- sions [141] have also been tested, but these seem to result in poorer agree- ment with NK waveforms (which use fluxes of up to 3PN), possibly due to the known divergence of certain expansions beyond 3PN order [155]. With inclination constant along both the best-fit and 3PN O(e6) trajectories, sampling (M˜, a˜, e(t), p˜(t)) at each of the Nfit points allows the calculation of 70 Augmented kludge waveforms (M˜(t), a˜(t), e(t), p˜(t))fit − (M˜, a˜, e(t), p˜(t)) over the duration Tfit. This difference trajectory is then fitted to polynomials in time, extrapolated over the lifetime of the inspiral, and added to the 3PN O(e6) evolution. In the initial AAK im- plementation, the coefficients of the quadratic fit are given by second-order finite-difference quotients (i.e. Nfit = 3), which works well but only for small values of Tfit. The present version uses a quartic least-squares fit, which al- lows the choice of longer Tfit and consequently gives better long-term phase agreement with NK waveforms. The augmentations to the AK framework are focused on improving the phase information of its waveforms, since the amplitude of a GW signal is measured far less precisely than its phase. However, the calculation of ampli- tude in the original AK model (A in (4.10)–(4.12)) is a decent approximation since it is based on Φ˙M , which is assigned a value ≈ Ωφ that turns out to be correct for this purpose (see discussion around (4.36)). Hence it is the AAK amplitude that is shifted away from the fiducial NK value through the map- ping of frequencies and the unphysical evolution of M˜ . A simple adjustment is made to reverse this shift; in (4.10)–(4.12) for the AAK model, the amplitude is now given by A = (ωφM) 2/3µ |R| , (4.40) where the azimuthal frequency ωφ and the physical black-hole mass M are used in place of Φ˙ = ωr and M˜ (i.e. M in the original AK model) respectively. Finally, a fast method of plunge handling has been added to the present AAK implementation; this feature is useful in general, but especially when generating large numbers of AAK waveform templates for search algorithms. The compact object plunges when its instantaneous orbit along the phase- space trajectory E(t) becomes unstable, i.e. ∂2Vr(r, a,E) ∂r2 ≤ ∂Vr(r, a,E) ∂r = Vr(r, a,E) = 0, (4.41) where Vr is given in (4.24) with E = (E,Lz, Q). This point is termed the last stable orbit ELSO, and is precisely the point at which the discriminant D(a,E) of the quartic polynomial Vr(r) changes sign from positive (four real roots) to §4.3 Augmented analytic kludge 71 negative (two real roots) [156]. Since D is a simple analytic function of the quartic coefficients, it is computationally trivial to check for stability at every integration step for the phase-space trajectory, provided the evolution is done in terms of (E,Lz, Q).11 Plunge detection in the AAK model is far less straightforward than in the other two kludges, partly because the evolution is performed in terms of the quasi-Keplerian orbital parameters, and the computational benefits of the dis- criminant method are nullified by having to convert (e, ι, p˜)→ (E,Lz, Q). Fur- thermore, (M˜, a˜, p˜) are unphysical; the inverse of the map (4.37)–(4.39) is com- putationally expensive and (more crucially) ill-defined at plunge, and so can- not be used to obtain the physical parameters for stability calculations. To circumvent these issues, the AAK model uses (4.36) with Ω ≈ Ωφ to obtain an approximation for the physical parameter p. While generating the phase-space trajectory, it checks (between the least- and most-bound orbits [62]) the stability of (e, ι, p) at every radiation-reaction interval TRR. Once the stability changes across an interval, it then bisects that interval to find pLSO, and smoothly zeroes the waveform over 10 additional orbits with a one-sided Planck-taper window [157]. The added computational cost associated with this algorithm is . 1%. Although the approximation for p is crude, the phase- space trajectories in the AAK and NK models are generally divergent to begin with, and the plunge points for both models may differ significantly even if a more accurate expression is used. 4.3.3 Benchmarking The specified initial state of an EMRI in the AAK model is described by the intrinsic parameters (µ,M, a, e0, ι0, γ0, ψ0) and the extrinsic parameters (p0, θS, φS, θK , φK , α0, D), where D := |R|.12 In configuration space, transfor- 11 The discriminant method is applicable to the NK model, and may speed it up slightly. Currently, the NK implementation precomputes (for the specified value of a) an interpo- lated pLSO surface over the relevant region of (e, ι)LSO space, by finding and examining the roots of Vr(r) numerically. It then checks for p < pLSO(e, ι) when generating the phase-space trajectory. 12 For a source at cosmological redshift z, the values of D and (µ,M) are replaced with the luminosity distance D(1 + z) and the redshifted masses (µ(1 + z),M(1 + z)) respectively. 72 Augmented kludge waveforms mations from XAAK = (ψ, γ, α) to XAK = (Φ, γ, α) and XNK = (ψ, χ, φ) are required for a comparison of the three waveform models. We have chosen to specify the quasi-Keplerian true anomaly ψ in the (shared) AAK parame- ter space, since there is no closed-form expression for ψ in terms of the mean anomaly Φ. The conversion ψ → Φ is given by the Keplerian expressions [158] Φ = E − e sinE, (4.42) E = tan−1 (√ 1− e2 sinψ e+ cosψ ) , (4.43) where E is known as the eccentric anomaly. On the other hand, the AAK model retains the AK parameters (γ, α); these have explicit meanings in the (intrinsic) L-based coordinate frame (xˆ, yˆ, zˆ)L := ( Lˆ× Sˆ SL,S , (Sˆ · Lˆ)Lˆ− Sˆ SL,S , Lˆ ) , (4.44) where the normalisation factor SL,S := (1 − (Lˆ · Sˆ)2)1/2 and Lˆ(α) is given in ecliptic coordinates by (4.13). The unit position vector of the compact object with respect to this frame is rˆL = [cos (ψ + γ), sin (ψ + γ), 0]T , and a change of basis to the S-based coordinate frame (4.14) gives rˆS = Q T SQLrˆL, (4.45) where the orthogonal matrices QS := [xˆ|yˆ|zˆ]S and QL := [xˆ|yˆ|zˆ]L are formed from the triads in (4.14) and (4.44) respectively. It is then straightforward to obtain (χ, φ) from rˆS = [sin θ cosφ, sin θ sinφ, cos θ]T , via (4.29). As a generic example, we consider a prograde EMRI with redshifted com- ponent masses (µ,M) = (101, 106)M , spin a = 0.5M , and initial orbital pa- rameters (e0, ι0, p0) = (0.1, pi/6, 8.25M). The initial semi-latus rectum is chosen such that the compact object plunges approximately one year after entering the LISA band at a representative frequency fGW = 2.7 mHz, where fGW is defined as twice the azimuthal orbital frequency (i.e. the dominant GW harmonic at low eccentricity). In the AAK model, the fitting timescale and number of sam- §4.3 Augmented analytic kludge 73 ple points are set to Tfit ≤ 10TRR and Nfit ≤ 10 respectively, with inequality in the case of adaptive adjustments. One important result from our comparison studies is that the AK model can lead to an overestimation of SNR if used without modification. This is due to the fact that the frequencies in the AK model are generally too high for any given (e, ι, p), as mentioned in Section 4.2.1. The SNR of a signal h = hI+ihII is given by (2.54), with the noise-weighted inner product (2.53) written in terms of the one-sided noise power spectral density Sn(f > 0) := 2Sn(f) as [87] 〈a|b〉 = 2 ∫ ∞ 0 df a˜(f)b˜∗(f) + a˜∗(f)b˜(f) Sn(f) = 4< [∫ ∞ 0 df a˜(f)b˜∗(f) Sn(f) ] . (4.46) In this chapter, Sn(f) is taken to be a LISA noise model for the L6A5 (six links, five-million-kilometre arms) configuration known as classic LISA, assuming the original mission requirements and including confusion noise from the fore- ground of Galactic white-dwarf binaries [159]. At a luminosity distance of 5 Gpc, an NK signal (sampled at 0.2 Hz) from the example EMRI described above has a one-year SNR of ρ = 30.8. When using the AAK model to generate the signal, a comparable value of ρ = 32.1 is obtained. However, an AK signal from the same EMRI has ρ = 57.8. To illustrate why this is the case, we consider the characteristic strain hc of a signal and the noise amplitude hn; these are given respectively by [26] hc(f) = 2f |h˜(f)|2, (4.47) hn(f) = √ fSn(f), (4.48) such that ρ2 = ∫ ∞ −∞ d(ln f) ( hc(f) hn(f) )2 . (4.49) With these definitions, the area between hc and hf on a log–log plot gives an in- dication (but not an approximation) of SNR, and allows the relative detectabil- ity of signals to be estimated. The characteristic strain for the three signals and the LISA noise amplitude are shown in Figure 4.2, where the excess power in the AK signal at higher frequencies is evident, along with the consequent 74 Augmented kludge waveforms Figure 4.2: Characteristic strain for year-long AK (red), AAK (green) and NK (black) signals from the example EMRI, along with the noise amplitude (dashed) for the LISA configuration L6A5. A moving-average filter has been applied to the hc curves, such that oscillations are smoothed out for ease of visualisation while the overall spectral profile is preserved. §4.3 Augmented analytic kludge 75 (µ/M , a/M, e0) O(·|hNK)2 mth O(·|hNK)6 mth σ hAK hAAK hAK hAAK (101, 0.5, 0.1) 2.0× 10−3 9.5× 10−1 6.4× 10−4 1.5× 10−1 0.89 (100, 0.5, 0.1) 4.4× 10−5 5.9× 10−1 2.7× 10−6 1.3× 10−1 0.92 (101, 0.8, 0.1) −1.6× 10−3 9.3× 10−1 −2.7× 10−4 1.3× 10−1 0.91 (101, 0.5, 0.5) 8.6× 10−3 2.3× 10−1 −2.0× 10−3 4.3× 10−2 0.38 Table 4.3: Comparison of the AK and initial AAK models via two- and six- month waveform overlapsO(·|hNK) for generic EMRIs with different compact- object mass, black-hole spin and initial eccentricity. The computational speed- up σ of both models over the NK model is given as well. boost to SNR. This error is likely to persist for M & 106M , but may be miti- gated for less massive central black holes as the maxima of the three hc curves are blueshifted past the minimum of hn. For the purposes of this work (where the NK model is taken as fiducial), the phase accuracy of the AK and AAK models is assessed by how well their waveforms overlap with NK waveforms. In [59], the overlapsO(hAK|hNK) and O(hAAK|hNK) over two and six months are computed for the example EMRI, as well as for the same source with (i) µ = 100M , (ii) a = 0.8M and (iii) e0 = 0.5. The AK and initial AAK models have virtually identical computation times τ , and are both quicker than the NK model with typical speed-up factors of σ := 1− τ/τNK ≈ 0.9 (except in the case of e0 = 0.5, where σ ≈ 0.4). However, the AAK model yields overlaps that are consistently higher, and by 2–3 orders of magnitude in most cases (see Table 4.3). The speed-up factors for the AAK model are preserved by the present im- plementation, while its overlaps are increased across the board due to the en- hanced fitting algorithm. Most notably, there is substantial improvement for EMRIs with higher initial eccentricities, although this is partly attributable to the use of a higher sample rate than that in [59] (around 0.03 Hz). The overlap and timing performance of the AAK model across 2–6 months with varying compact-object mass, black-hole spin and initial eccentricity is summarised by the plots in Figures 4.4–4.6. In Figure 4.4, the overlap O(hAAK|hNK) is computed for the example EMRI, as well as for the same source with 0.5 ≤ lg (µ/M ) ≤ 2. The minimum value of 76 Augmented kludge waveforms Figure 4.4: Two- to six-month overlaps between AAK and NK waveforms for generic EMRIs with varying compact-object mass. The computational speed- up σ of the AAK over the NK is coded by colour. Vertical dashed lines cor- respond to the example EMRI, while horizontal ones indicate the standard minimal-match value of 0.97 for template banks. Figure 4.5: Overlaps and computational speed-up as in Figure 4.4, but for generic EMRIs with varying black-hole spin. §4.3 Augmented analytic kludge 77 µ ≈ 3M is chosen since the upper bound of 10TRR for Tfit exceeds six months for a less massive compact object, and so the overlaps in that regime will not show significant improvement (even the six-month overlap is already > 0.97). We also do not consider intermediate-mass black holes with µ > 100M . In- stead of choosing p0 such that the EMRI plunges after one year (as done in [59]), we consider fixed p0 = 8.25M in this analysis; this leads the overlaps to degrade at larger rather than smaller mass ratios µ/M . The two-month over- laps are> 0.97 up to µ ≈ 20M , for which plunge occurs at around 5.6 months. There is greater speed-up over the NK model for longer waveform durations as expected, but all values of σ are & 0.8. Overlaps for the example EMRI with varying spin 0.1 ≤ a/M ≤ 0.9 are shown in Figure 4.5. Again, all values of σ are & 0.8, with greater speed- up for longer waveform durations. For fixed p0, prograde EMRIs with lower spin start closer to plunge, and so the overlap values generally increase along with a. There appears to be an opposing effect at higher spin (possibly due to the additional degrees of freedom for error from fitting M˜ and a˜ in the AAK model) that causes a fall-off in the four- and six-month overlaps for a & 0.6. The two-month overlaps are > 0.97 across the full range of considered spins, which is perhaps unsurprising since the upper bound of 10TRR for Tfit is also two months for a (101, 106)M EMRI. However, we note here that while the phase accuracy of the AAK model may be arbitrarily increased in principle by taking Tfit > 10TRR, the computational requirement that Nfit . 10 will likely reduce the quality of the trajectory fit at early times. The effect of varying eccentricity for the example EMRI is illustrated in Fig- ure 4.6. We consider initial eccentricities 0.05 ≤ e0 ≤ 0.5, since there turns out to be no speed-up over the NK model (σ ≈ 0) when generating a two-month AAK waveform with e0 = 0.5. This is an important limitation of the Peters– Mathews approximation for waveform generation (see discussion in footnote 10), and will have to be addressed if the AAK model is to be useful in searches for high-eccentricity EMRIs. Nevertheless, the two- and even four-month over- laps are > 0.97 for e0 . 0.3, with σ & 0.5. As in the case of Figure 4.5, there is a peak in the six-month overlap; this is probably due to variance in the fit for e, and is unlikely to carry any fundamental significance. 78 Augmented kludge waveforms Figure 4.6: Overlaps and computational speed-up as in Figure 4.4, but for generic EMRIs with varying initial eccentricity. Finally, we summarise here the results of a Fisher matrix calculation for a year-long AAK signal from the example EMRI considered throughout this section. The Fisher information matrix Γ for a GW signal h parametrised by λ is given by (2.61), where λ = (µ,M,S,E,X,R) in the case of EMRIs. For our AAK signal normalised to an SNR of ρ = 30, we find that the masses and spin can be measured to within the fractional root-mean-square errors ∆(lnµ) ≈ 4× 10−5 (4.50) ∆(lnM) ≈ 2× 10−5, (4.51) ∆(ln (a/M)) ≈ 4× 10−5. (4.52) These errors are an order of magnitude better than the corresponding values for the AK model [57], and are more comparable to those cited for the NK model [66] (although the latter consider a circular, equatorial EMRI with a = 0.9M ). They are also consistent with (the lower end of) the values reported in the L2/L3 mission selection proposal [14] for ESA’s Cosmic Vision programme, where the AK model was used but with a modified plunge criterion. §4.4 Discussion 79 4.4 Discussion The widely used AK waveform model of Barack & Cutler is extremely quick to generate, but typically dephases within hours with respect to more accu- rate EMRI waveforms due to a mismatch of its radial, polar and azimuthal frequencies with the actual Kerr frequencies along its orbit. Hence using AK waveforms as signal templates for EMRIs will result in reduced SNRs for de- tection, as well as an inaccurate estimation of astrophysical parameters. This might limit their utility in scoping out data analysis issues for space-based GW detectors—an area where there are several important open questions that must be addressed within the next few years, in preparation for LISA mission adoption at the end of this decade. We have proposed in this chapter an augmented AK waveform model that features a frequency map to the Kerr fundamental frequencies, self-consistent 3PN O(e6) evolution equations from Sago & Fujita, and calibration to the NK model of Babak et al. via a local phase-space trajectory fit that is extrapolated over the waveform duration. The present implementation of the model also includes various improvements such as amplitude corrections and fast plunge handling. AAK waveforms are virtually as quick to compute as their predeces- sors, but stay phase-coherent with NK waveforms for months and yield high overlap values (> 0.97) across extended regions of parameter space. The most pressing deficiency in the AAK model is the ill-defined nature of (4.37)–(4.39) at the last stable orbit due to the divergent Kerr frequencies; this complicates both plunge detection and the specification of orbital parameters at plunge. Another limitation is that the mode-sum approximation (4.6) and (4.7) becomes more expensive than the quadrupole formula itself at high ec- centricities (e0 & 0.5). If the model can be made robust through the resolution of these issues, it may also be upgraded with fits to improved NK models when they become available (e.g. with Kerr self-force evolution included through the framework in [67, 68]). With the era of space-based detectors drawing closer, the AAK waveform model will be an essential tool for near-future work on LISA data analysis and mock data challenges, as well as a useful addition to the inventory of EMRI waveforms for millihertz GW astronomy. 80 Augmented kludge waveforms Chapter 5 Template bank compression One strategy for reducing the online computational cost of matched-filter de- tection searches for GWs is to introduce a compressed basis for the waveform template bank. In this chapter, we propose and investigate several combinato- rial compression schemes for a general template bank. Through offline com- pression, these tunable schemes are shown to yield faster detection and local- isation of signals, along with moderately improved sensitivity and accuracy over coarsened banks at the same level of computational cost. The general method is inspired by a recently proposed template bank com- pression scheme that potentially yields significant cost savings, but may be demonstrated to suffer from a number of fundamental problems. Our com- pression schemes address and rectify these problems; they might be useful for any search involving template banks, and especially in the analysis of data from space-based detectors, where online searches will be difficult due to the long-duration waveforms and large parameter spaces of some sources. The material in this chapter has been adapted from [160]. 5.1 Background Various strategies exist to reduce the online cost of evaluating matched-filter inner products in GW detection algorithms, typically by shifting the com- putational burden to the preparatory offline stage. Some methods focus on making individual inner products computationally cheaper: this may be achieved across regions of parameter space through direct template interpola- tion [161, 162], or more generally by using a reduced-order quadrature [163]. 81 82 Template bank compression Other methods seek to reduce the number of inner products in a grid-based search through template bank compression, i.e. the reduced-basis representa- tion of a large template bank by a smaller set of templates [164–166]. In a recently proposed method of template bank compression [167], binary labelling is used to define a non-orthogonal basis that maximises compression losslessly (in the sense of perfect signal recovery without noise). This idea is fully general and admits a much higher compression rate than existing meth- ods based on the eigenvalue structure of the template bank, but comes with significant penalties to detection sensitivity and localisation accuracy in the presence of detector noise. The method as originally described also suffers from an arbitrarily asymmetric treatment of templates, as well as a restrictive level of compression that limits its practicality to high-SNR signals. While the binary labelling method might be useful in the context of LISA (where source SNRs are potentially higher than for ground-based detectors), its practical ap- plicability to GW data analysis remains undeveloped and hence unclear. In this chapter, we introduce and develop the related method of conic com- pression (i.e. defining a compressed basis through conic combinations of tem- plates) by characterising its performance under various simplifying assump- tions, before investigating its viability for present and future GW detectors with a more realistic example. We propose several compression schemes, one of which subsumes a symmetric-treatment version of the binary labelling method as a particular case. These tunable schemes feature discrete transitions between zero and maximal compression, and offer fast detection and localisa- tion of GW signals in the search space with a controlled loss (if at all) in sen- sitivity/accuracy. Their generality and straightforward implementation also allow them to supplement existing grid-search methods, or to rapidly identify seed points for stochastic searches in parameter estimation. The general method of conic compression is set out in Section 5.2. Three families of conic compression schemes are then proposed in Sections 5.2.1– 5.2.3: a lossy scheme based on partitions of the template bank, and two lossless schemes whose conic combinations are determined permutatively or by base representations of template labels. We calculate the optimal detection statistics for these schemes, and find that the standard maximum-overlap statistic is sig- §5.1 Background 83 nificantly suboptimal for detection in the lossless case. Section 5.2.4 compares the three schemes under simplified conditions, i.e. assuming the GW signal is proportional to a single template in an orthogonal template bank. The lossy partition scheme is shown to have slightly better detection sensitivity than its lossless counterparts at the same level of compression. Furthermore, while the lossless schemes provide automatic identification (i.e. localisation to a sin- gle template) of the signal upon detection, the identification accuracy falls off more rapidly with compression in the presence of noise. We focus exclusively on the partition scheme in Section 5.3, where the or- thogonality and single-template assumptions are lifted separately. As shown in Section 5.3.1, the overall performance of the scheme is partition-dependent in the case of a correlated (non-orthogonal) template bank, and must be pre- optimised by grouping highly correlated templates together. The optimised partition scheme retains the benefits of a correlated template bank up to high levels of compression, and is superior to a simple coarsening of the template bank (obtained by increasing the maximal mismatch between neighbouring templates). Section 5.3.2 discusses the case of a GW signal lying in a low- dimensional subspace of an orthogonal template bank, for which the detection sensitivity of the scheme is not significantly reduced. In Section 5.4, we implement the optimised partition scheme for a highly correlated (maximal mismatch ≈ 0.01) template bank of ∼ 104 PN waveforms, which describe only the inspiral phase of a comparable-mass binary coales- cence. The scheme is shown to be viable for practical applications, as it per- forms well on this example up to high levels of compression and at all consid- ered values of SNR. Its detection rate for a signal injected centrally is superior to that of the coarsening approach (especially at & 80% compression), and this improvement is even more marked for a signal injected at the boundary of the bank. In addition, the accuracy rate for localisation of the injection to a . 0.1% region of the search space is undiminished up to ∼ 90% compression, and is again higher than that of the coarsening approach. The speed-up and enhanced accuracy in localising the GW signal with conic compression is promising for LISA data analysis, where the online use of template banks will be challenging for many sources (e.g. EMRIs with their 84 Template bank compression large parameter space [168]). While the long duration of LISA signals is com- putationally prohibitive to fully coherent detection searches even with com- pression, our method is suitable for the shorter semi-coherent searches that are required for rapid electromagnetic follow-up. Conic compression might also provide an alternative to the singular-value-decomposition method used in LIGO detection pipelines for stellar-mass binary coalescences [165]: it scales well with parameter-space dimensionality, and easily matches or surpasses the order-of-magnitude computational savings from that method. 5.2 Conic compression We consider a generic bank of N waveform templates hn, where the template labels n are drawn from the collection N := {n ∈ Z+ |n ≤ N}, and the tem- plates have been normalised with respect to the noise-weighted inner product (2.53) such that 〈hn|hn〉 = 1 for all n ∈ N. The inner products of the data x and the templates define N associated statistics xn := 〈x|hn〉, (5.1) which may be used for detection and localisation in a simple grid search. Our general method of compression is to reduce the number of statistic evaluations from N to M by considering conic (i.e. positive-coefficient) combi- nations of the original templates. The template labels are grouped into M sets Um, where the set labels m are drawn from the collection M := {m ∈ Z+ |m ≤ M}, and the sets satisfy⋃m∈M Um = N. These sets define M conic templates Hm := ∑ n∈Um hn, (5.2) which are prepared offline (like the template bank itself), along with M asso- ciated statistics Xm := 〈x|Hm〉 = ∑ n∈Um xn, (5.3) which are evaluated at the online stage. §5.2 Conic compression 85 Without any prior assumptions on the template bank, each template must be treated equally. This is done by ensuring that: (a) each combination is weighted equally; (b) each combination includes the same number of templates; (c) each template is included in the same number of combinations. Definition (5.2) has been chosen to satisfy Condition (a), while Condition (b) is imposed by further requiring card(Um) = card(Um′) for all m,m′ ∈M (where the set cardinality card(S) is the number of elements in the set S). Condition (c) must be enforced separately in the construction of the sets. The second equality in Definition (5.3) relates the conic statistic evaluations to the original statistics (5.1), which are no longer evaluated at the online stage. To simplify analysis, we first assume the template bank is an orthogonal set such that 〈hn|hn′〉 = δnn′ , (5.4) where δnn′ is the Kronecker delta. We further assume the GW signal (if present) lies in the one-dimensional subspace spanned by a single template in Hilbert space, i.e. x = Ah1 + {noise}, (5.5) where A > 0 and the templates have been relabelled without loss of generality. It follows from (2.54) and (5.5) that A = ρ is the source SNR. These orthogonal and 1-D restrictions are neither realistic nor optimal, but facilitate the analytic assessment and comparison of various compression schemes in this section. The overall performance of conic compression is generally improved by the lifting of these assumptions, which we consider in Sections 5.3 and 5.4. In the presence of a GW signal, the expectation values and covariances of the normally distributed original statistics (5.1) are now given by E[xn] = A〈h1|hn〉 = Aδ1n, (5.6) cov(xn, xn′) = 〈hn|hn′〉 = δnn′ . (5.7) 86 Template bank compression As the labelling of templates is itself a probabilistic process with discrete uni- form distribution, the original statistic vector x has the multivariate Gaussian distributionN (µ(i),Σ) (with µ(i)n = E[xn] and Σnn′ = cov(xn, xn′)), but summed over the N possible assignments i of 1 ∈ N and renormalised accordingly. If the signal is absent, the distribution of x is simply N (0,Σ). Hence we have p1(x) ∝ 1 N N∑ i=1 exp ( −1 2 xTΣ−1x + µT(i)Σ −1x− 1 2 µT(i)Σ −1µ(i) ) , (5.8) p0(x) ∝ exp ( −1 2 xTΣ−1x ) , (5.9) where p1 and p0 are the probability density functions of x in the respective presence or absence of a GW signal. An optimal detection region R in Hilbert space maximises the detection rate PD = ∫ R p1 subject to a given false alarm rate PF = ∫ R p0; hence p1 = λp0 on its boundary ∂R for some Lagrange multiplier λ. Using (5.6)–(5.9), we define the optimal detection statistic xopt := p1(x) p0(x) = 1 N exp ( −A 2 2 )∑ n∈N exp (Axn), (5.10) such that the optimal detection surfaces ∂R are precisely the level sets of xopt parametrised by λ, and a detection is claimed if xopt exceeds the threshold λT corresponding to some fixed value of PF . In deriving (5.10), we have implicitly assumed a population of GW sources with equal likelihood and known signal amplitude. Equation (5.10) therefore defines the optimal statistic for detecting events drawn from such a popula- tion. For a population of sources that are not equally likely, we need to replace the sum in (5.10) with a suitably weighted sum. Similarly, for a population with a distribution of amplitudes, we need to marginalise (5.10) over A; in the case of an (improper) uniform prior over the interval (−∞,∞), this would give a detection statistic proportional to ∑ n∈N exp(x 2 n/2). Any choice of population makes assumptions about the astrophysical dis- tribution of GW sources that might not be justified. In this work, the focus is §5.2 Conic compression 87 Figure 5.1: Three-dimensional projection of optimal detection surface for un- correlated statistics xn, at true SNR of ρ = 2. on the investigation and comparison of template bank compression schemes, and so we only consider the equal-likelihood and known-amplitude popula- tion assumed in the derivation of (5.10). While the treatment of amplitude in particular is artificial, a search that is optimised for sensitivity to signal ampli- tudes around the detection threshold will likely be near-optimal for any given astrophysical population (and closer to optimality than a search tuned for the wrong astrophysical population). Finally, we note that although (5.10) has been derived as a frequentist optimal statistic, the same equation also arises as the Bayes factor for the presence (versus absence) of a signal, assuming flat model priors and the source population assumptions outlined above. For sufficiently high SNR (large A), the optimal surfaces xopt = λ defined by (5.10) are well approximated by semi-infinite hypercubes (see Figure 5.1), i.e. the level sets of the standard maximum-overlap detection statistic [169] xmax := max n∈N {xn}. (5.11) Since the original statistics (5.1) are uncorrelated, the probability density func- tions of xmax in the presence or absence of a GW signal are obtainable explicitly. 88 Template bank compression These are given respectively by q1(xmax) = F0(xmax) N−1f1(xmax) + (N − 1)F0(xmax)N−2F1(xmax)f0(xmax), (5.12) q0(xmax) = NF0(xmax) N−1f0(xmax), (5.13) where fs(xmax) is the probability density function for the Gaussian distribution N (sA, 1), and Fs(xmax) is the cumulative distribution function Fs(xmax) = ∫ xmax −∞ du fs(u). (5.14) For our analysis of conic compression schemes, we also require the expec- tation values and covariances of the normally distributed conic statistics (5.3). From (5.3), (5.6) and (5.7), it follows in the presence of a GW signal that E[Xm] = ∑ n∈Um E[xn] = A card({1} ∩Um), (5.15) cov(Xm, Xm′) = ∑ n∈Um ∑ n′∈Um′ cov(xn, xn′) = card(Um ∩Um′), (5.16) where the cardinalities are determined by the choice of compression scheme. As before, the conic statistic vector X has the multivariate Gaussian dis- tribution N (µ(i),Σ) (now with µ(i)m = E[Xm] and Σmm′ = cov(Xm, Xm′)), but summed over the N possible assignments of 1 ∈ N and renormalised accord- ingly. If the signal is absent, the distribution of X is againN (0,Σ). The proba- bility density functions of X in the presence or absence of a GW signal are then given respectively by (5.8) and (5.9) with x ≡ X. We now propose and investigate three general conic compression schemes in Sections 5.2.1–5.2.3, before comparing their performance and potential ap- plicability in Section 5.2.4. The orthogonal and 1-D restrictions (5.4) and (5.5) are assumed throughout Section 5.2. §5.2 Conic compression 89 5.2.1 Partition scheme The simplest method of grouping the template labels n is to take the family of sets Um as a partition of N, i.e. Um ∩ Um′ = ∅ for all distinct m,m′ ∈ M. Condition (c) is then automatically satisfied, while Condition (b) defines the set cardinality P = card(Um) for all m ∈M. It follows that M = N/P . For the comparison of schemes in Section 5.2.4, it is useful to introduce a compression parameter K ∈ Z+ for each scheme, which determines the com- pression rate κ := 1− Neval N , (5.17) where Neval = M is the required number of statistic evaluations (for detec- tion or localisation purposes). This generates a sliding scale of groupings that ranges from no compression at K = 1 to maximal compression at some scheme-dependent value of K. We may clearly choose K = P for the parti- tion scheme, such that maximal compression is given by K = N . The minimal nontrivial compression is 50% at K = 2, while there are diminishing returns at large K since κ(K) is concave-down. From (5.15) and (5.16), we now have E[Xm] = Aδ1m, (5.18) cov(Xm, Xm′) = Pδmm′ , (5.19) where the sets have been relabelled such that 1 ∈ U1 without loss of generality. Again considering the N possible assignments of 1 ∈ N, the optimal detection statistic Xopt := p1(X)/p0(X) follows from (5.8) and (5.9) (with x ≡ X) as Xopt = 1 M exp ( −A 2 2P ) ∑ m∈M exp ( A P Xm ) . (5.20) Since the conic statistics for the partition scheme remain uncorrelated, the optimal surfaces Xopt = λ resemble that in Figure 5.1, and in lieu of (5.20) it is valid to consider the maximum-overlap detection statistic Xmax := max m∈M {Xm}. (5.21) 90 Template bank compression Figure 5.2: ROC curves for the partition scheme’s optimal and maximum- overlap detection statistics, at different values of set cardinality P (with com- pression rate κ in parentheses) for N = 256 and a true SNR of ρ = 10. The dashed diagonal line indicates the worst possible performance, i.e. a random search for which the detection and false alarm rates are equal. §5.2 Conic compression 91 Plots of detection rate PD against false alarm rate PF for a detection statistic are known as receiver operating characteristic (ROC) curves [161]. ROC curves for both the optimal and maximum-overlap statistics are compared in Figure 5.2.13 With increased compression, the performance of the maximum-overlap statis- tic falls away slightly from that of the optimal statistic, due to the lowering of effective SNR A/ √ P in (5.20); nevertheless, (5.21) is a sound approximation as both sets of curves show good overall agreement. For the partition scheme to admit a useful (i.e. populated) sliding scale of compression rates, the template bank might need to be trimmed or padded such that N has as many divisors as possible. Fixing the false alarm rate and choosing either a desired detection rate or a compression rate then allows ad- vance determination of the conic templates (5.2) and the threshold λT , which is the value of λ corresponding to the fixed false alarm rate. The algorithm for detection follows as: (i) evaluate the conic statistics (5.3); (ii) claim a detec- tion if Xmax > λT . Threshold and detection SNRs for the maximum-overlap statistic may be defined respectively as ρT := λT√ var(Xmax) , (5.22) ρD := Xmax√ var(Xmax) . (5.23) An extension of the detection algorithm is required for signal identification (i.e. localisation to a single template), since the simple coarse-graining of parti- tion compression does not distinguish between template labels in the same set. The signal is most likely to be associated with the largest conic statistic evalua- tionX(1), so the best candidate may be obtained by further evaluating all of the original statistics xn contributing to X(1) and identifying the largest. This finer level of evaluations increases the computational cost by P to Neval = M + P . For better identification accuracy at lower SNRs, we may widen our search to the i largestXm instead, at an added computational cost of iP . The standard identification algorithms Ii for the partition scheme follow (after detection) as: 13 The curves for (5.20) were obtained via 105-trial Monte-Carlo simulations, while numeri- cal integration of (5.12) and (5.13) was used to generate the curves for (5.21). 92 Template bank compression (iii) evaluate the original statistics (5.1) for all n ∈ Vi, where Vi := i⋃ j=1 U(j), (5.24) with U(j) corresponding to the j-th largest conic statistic evaluation X(j); (iv) identify maxn∈Vi{xn}. Other identification algorithms may also be considered. One such alterna- tive is obtained by defining a further partition of Vi into two sets and evalu- ating the associated conic statistics, then identifying the set V′i corresponding to the larger statistic evaluation and repeating the process with Vi ≡ V′i until card(V′i) = 1. This method might be useful for large values of P ; it yields a smaller added computational cost of 2 log2 iP , but incurs a penalty to identifi- cation accuracy since the early iterations still involve coarse-grained searches. 5.2.2 Symmetric base scheme Without an additional fine-grained search, partition compression is lossy in the sense that the GW signal is not automatically identified in the limit of zero noise. A lossless conic compression scheme proposed by Wang [167] repre- sents each template label n in binary and assigns it to the set Um if its m-th digit is 1. This binary scheme features the largest possible lossless compres- sion (M = log2N ) and an automatic identification of the GW signal; however, it suffers from an unequal treatment of templates (i.e. it violates Conditions (b) and (c)) and hence it yields an arbitrary level of performance that depends on the initial assignment of template labels. Furthermore, the restriction to maximum compression limits its usefulness in practical applications. We propose a compression scheme modelled on Wang’s binary scheme, but symmetrised (for equal treatment of templates) and generalised to a sliding scale of base representations (for tunable compression). The template labels n are represented modulo N in base B, and each set Um ≡ Uk,b is constructed by collecting all of the labels whose k-th digit is b (this includes b = 0, and gives a symmetric version of the binary scheme when B = 2). For Conditions (b) and (c) to be satisfied, we require logB N ∈ Z+; it follows that M = B logB N . §5.2 Conic compression 93 The compression parameter is chosen as K = logB N , such that maximal compression is given by K = log3N ≈ lnN (base-2 compression is slightly suboptimal with symmetrisation). In contrast to the partition scheme, com- pression for the symmetric base scheme is dependent on the size of the tem- plate bank; the minimal nontrivial compression for N = 102 is nearly 80% at K = 2 (base- √ N compression), and over 95% for N = 104. From (5.15) and (5.16), we have14 E[Xm] = Aδ0b, (5.25) cov(Xm, Xm′) = B K−2(Bδkk′δbb′ − δkk′ + 1), (5.26) where m = B(k − 1) + b+ 1, and the templates have been relabelled such that (5.5) becomes x = AhN + {noise} without loss of generality. Considering the N possible assignments of N ∈ N, the optimal detection statistic follows from (5.8) and (5.9) as Xopt = 1 N exp ( −βKA 2 2 + (β − α)A tr(X) ) K∏ k=1 B−1∑ b=0 exp (αAXk,b) (5.27) with α := B N , (5.28) β := M −K + 1 NK , (5.29) where tr(X) := ∑ m∈MXm. The higher compression rates provided by the symmetric base scheme re- sult from the non-empty intersections among the sets Uk,b with different values of k. As seen in (5.26), these also lead to correlations among the conic statistics Xk,b. The optimal detection surfaces given byXopt = λ differ significantly from that depicted in Figure 5.1; their projections onto the correlated subspaces are now compact hyperboloids, and no longer approach the semi-infinite hyper- cubes of the maximum-overlap detection statistic at high SNR (see Figure 5.3). 14 The covariance matrix defined by (5.26) is rank-deficient, but we may take the Moore– Penrose pseudoinverse Σ+ as a perturbative approximation to Σ−1 in (5.8) and (5.9). 94 Template bank compression (a) (b) Figure 5.3: Three-dimensional projection of optimal detection surface for cor- related statistics Xk,b of symmetric base scheme with N = 256 and B = 4, at true SNRs of (a) ρ = 10 and (b) ρ = 100. Without a simple approximation for the optimal detection statistic, the most feasible option is to use (5.27) itself with an estimate A of the true SNR. ROC curves for the estimated statistic XA=ρopt with  ∈ {1/2, 2} are compared against those for the optimal and maximum-overlap statistics in Figure 5.4. Not much detection sensitivity (for a fixed false alarm rate) is lost if the true SNR can be estimated to within a factor of two, while usage of the maximum-overlap statistic now incurs a more noticeable drop in performance as expected. The restriction ofN ,B andK to integer values also results in more sparsely populated sliding scales than those admitted by the partition scheme. There are two possible compression rates for N = 256 (base-2 compression is sub- optimal compared to B = 4), and three for N = 6561 = 812 = 94 = 38; most other values of N will admit only one or none. Notwithstanding the lack of tunability, a feasible strategy is to trim or pad the template bank such that N is a perfect square or cube, since the smallest values ofK already yield high com- pression rates. The detection algorithm then follows as given in Section 5.2.1, with some estimated detection statistic XA=ρopt in place of Xmax. One key feature of the symmetric base scheme and other lossless methods §5.2 Conic compression 95 Figure 5.4: ROC curves for the symmetric base scheme’s optimal, maximum- overlap and estimated detection statistics, at different values of base B (with compression rate κ in parentheses) for N = 256 and a true SNR of ρ = 10. 96 Template bank compression of compression is the automatic identification of the GW signal (upon detec- tion). In the symmetric base scheme, the label of the identified template in base-B representation is given digit-wise by the largest conic statistic evalua- tion Xk,(1) := maxb{Xk,b} for each value of k. However, as each digit k is iden- tified individually, the overall identification accuracy falls off severely with increasing K (i.e. the total number of digits). A possible modification for better accuracy is to consider the i + 1 largest Xk,b for each k and perform an additional fine-grained search over the (i+ 1)K templates, which increases the computational cost accordingly. The standard identification algorithms Ii for the symmetric base scheme follow (after detec- tion) as: (iii) evaluate the original statistics (5.1) for all n ∈ Vi, where Vi := K⋂ k=1 i+1⋃ j=1 Uk,(j), (5.30) with Uk,(j) corresponding to the j-th largest conic statistic evaluation Xk,(j) for each k; (iv) identify maxn∈Vi{xn}. Automatic identification is recovered for i = 0, where steps (iii) and (iv) become unnecessary as card(V0) = 1. For large values of K (small values of B), these identification algorithms might still suffer from poor accuracy. One alternative algorithm is obtained by defining some threshold XT and considering all conic statistic evalua- tions Xk,b > XT , then performing the additional fine-grained search over all of the corresponding templates. Such a threshold may be set prior to data- taking; if Xk,(1) < XT for some value of k, the k-th digit of the number is unconstrained and templates corresponding to all possible choices of that digit are considered. Alternatively, XT may be based on the data by setting XT = f mink{Xk,(1)} for some fixed fraction f , which ensures that at least one possible value is identified for each digit. Both approaches will in general yield increased accuracy, but they offer less control over the number of conic statistic evaluations considered and hence the overall computational cost. §5.2 Conic compression 97 5.2.3 Binomial coefficient scheme The symmetric base labelling method is not the only construction of the sets Um that preserves both lossless compression (automatic identification) and equal treatment of templates (Conditions (b) and (c)). In general, we may rep- resent any assignment of N templates to M sets with a collection of N M -digit binary labels, where the m-th digit of each label is 1 if it appears in Um and 0 otherwise. Condition (c) implies that each label must appear in exactly R sets, and hence contain exactly R 1’s. In addition, Condition (b) defines the set car- dinality C = card(Um) for all m ∈ M, which yields the constraint NR = MC (each of the N labels appears exactly R times across all sets, while each of the M sets contains exactly C labels). For some given integers N ≥ M ≥ R, this constraint is equivalent to the existence of C = NR M ∈ Z+, (5.31) which is both a necessary and sufficient condition for such a set construction to be possible [170]. We now require that the conic statistics (5.3) are correlated symmetrically, as seen in the partition scheme (but not the symmetric base scheme). This ad- ditional condition implies that the intersection of each pair of sets has fixed cardinality I , i.e. card(Um ∩ Um′) = I for all distinct m,m′ ∈ M. Consider- ing the family of all such intersections then yields the constraint NR(R− 1) = M(M − 1)I (each of the N labels appears exactly RC2 times across all intersec- tions, while each of the MC2 intersections contains exactly I labels). For some given integersN ≥M ≥ R and C satisfying (5.31), this constraint is equivalent to the existence of I = NR(R− 1) M(M − 1) = C(R− 1) M − 1 ∈ Z +, (5.32) which is a necessary (but not in general sufficient) condition for such a set construction to be possible. The general construction of a family of sets under the constraints (5.31) and (5.32) is an open problem in combinatorial design theory (see Apppendix B). In 98 Template bank compression this chapter, we restrict our focus to a special case that may be treated in greater detail. Every M -digit binary number with exactly R 1’s is taken to represent a distinct template label; the set cardinality then equals the number of (M − 1)- digit binary numbers with exactly (R−1) 1’s, while the intersection cardinality of each pair of sets equals the number of (M − 2)-digit binary numbers with exactly (R− 2) 1’s. Hence for all distinct m,m′ ∈M, we have N = MCR, (5.33) C = M−1CR−1, (5.34) I = M−2CR−2, (5.35) such that (5.31) and (5.32) are satisfied. We refer to this as the binomial coeffi- cient scheme. The usual ordering of the binary numbers gives a natural map onto the original label collection N = {n ∈ Z+ |n ≤ N}, although the inverse map is analytically nontrivial (but straightforward in practice). As the binomial coefficient scheme shares many similarities with the sym- metric base scheme, we only highlight its key features in this section. The compression parameter is chosen as K = R, such that maximal compression is given by K = cbc−1(N)/2 (where cbc(M) := Γ(M + 1)/Γ(M/2 + 1)2 is the continuous extension of the central binomial coefficient MCM/2). Compression rates again depend on the size of the template bank; at small values of K, they are only slightly higher than those of the symmetric base scheme. From (5.15) and (5.16), we have E[Xm] = A R∑ r=1 δrm, (5.36) cov(Xm, Xm′) = M−2CR−2 ( M −R R− 1 δmm′ + 1 ) , (5.37) where the sets have been relabelled such that 1 ∈ Ur for 1 ≤ r ≤ Rwithout loss of generality. Considering the N possible assignments of 1 ∈ N, the optimal §5.2 Conic compression 99 detection statistic follows from (5.8) and (5.9) as Xopt = 1 N exp ( −βKA 2 2 + (β − α)A tr(X) ) N∑ i=1 exp ( αA ∑ r∈Ri Xr ) (5.38) with α := 1 M−1CR−1 , (5.39) β := 1 M−2CR−1 , (5.40) where the sets Ri are the N distinct R-combinations of the collection M. All of the conic statistics Xm are correlated symmetrically, as seen in (5.37). Upon projection onto any three-dimensional subspace, the optimal detection surfaces given by Xopt = λ resemble those in Figure 5.3 at both low and high SNRs. It follows that the maximum-overlap detection statistic is again an in- adequate approximation to the optimal statistic, and we are compelled to use (5.38) itself (assuming an accurate estimate of true SNR). We do not include ROC curves for the binomial coefficient scheme here, as they are very similar to those in Figure 5.4. A direct comparison of the base and binomial schemes is difficult, since there are few suitable values of N that are exactly valid for both schemes. Lack of tunability is also more of an issue for the binomial scheme: the only val- ues of N that admit more than one nontrivial compression rate might be the Singmaster numbers [171] (which admit two as they appear six times in Pas- cal’s triangle), and it is not known whether any number admits more than two (apart from N = 3003 = 78C2 = 15C5 = 14C6 [172]). The problem may be overcome by considering a more general compression scheme satisfying the conditions (5.31) and (5.32). This is beyond the scope of the current work due to the complexity of set construction (see Appendix B), but might be investi- gated for specific template banks in the future. The detection algorithm for the binomial coefficient scheme is as given in Section 5.2.1, with some estimated detection statistic XA=ρopt in place of Xmax. Automatic identification is available as well, with the label of the identified template given uniquely by theR largest conic statistic evaluations. For higher 100 Template bank compression accurate-identification rates, a possible alternative is to consider the R + i largest Xm and perform an additional fine-grained search over the R+iCR tem- plates. The standard identification algorithms Ii for the binomial coefficient scheme follow (after detection) as: (iii) evaluate the original statistics (5.1) for all n ∈ Vi, where Vi := R+iCR⋃ k=1 ⋂ j∈Jk U(j), (5.41) with U(j) corresponding to the j-th largest conic statistic evaluation X(j) and the sets Jk given by the R+iCR distinct R-combinations of the set {j ∈ Z+ | j ≤ R + i}; (iv) identify maxn∈Vi{xn}. Automatic identification is recovered for i = 0, where steps (iii) and (iv) become unnecessary as card(V0) = 1. We note that another possible scheme would be a “direct sum” of the par- tition scheme and either the symmetric base or binomial coefficient scheme. The collection of template labels is first partitioned into subcollections, each of which is further decomposed into smaller sets via one of the correlated schemes; these sets may also be recombined across the initial partition for in- creased compression. We do not consider this further here, but such an ap- proach would overcome some of the difficulties associated with the restricted values of N for the base and binomial schemes. 5.2.4 Scheme comparison In this section, we compare the performance of the uncorrelated partition scheme and its two correlated alternatives across three areas: compression, detection and identification (i.e. localisation to a single template). The detec- tion and identification plots here (and throughout the rest of the chapter) were obtained using 105-trial Monte-Carlo simulations, and so the errors on each plot point are ∼ 10−3 for a 1-sigma binomial confidence interval. Log–log plots of M against N for various conic compression schemes are shown in Figure 5.5, where the maximum lossless compression provided by the binary scheme [167] has also been included for reference. As alluded to in Sections 5.2.1–5.2.3, the partition scheme has the largest range of compres- sion rates, both in terms of compression bounds (plot area) and admitted rates §5.2 Conic compression 101 Figure 5.5: Log–log plots of M against N for various tunable/fixed conic com- pression schemes. For each tunable scheme, the corresponding shaded region indicates the range of possible compression rates (with the trivial compression settingK = 1 excluded). Not every point in this region is realisable in practice, as discussed in the text. 102 Template bank compression (discrete density, not shown). The two correlated schemes cover similar areas at lower densities in Figure 5.5, with the binomial coefficient scheme offering slightly greater compression. Detection performance for each compression setting of a given scheme may be measured by detection sensitivity at a fixed false alarm rate (which is simply read off the corresponding ROC curve), or by a summary statistic that captures most of the information contained in an ROC curve (e.g. the area AROC under the curve). Since an ROC curve always lies above the no-discrimination line PD = PF , we define the discrimination D := 2AROC − 1, (5.42) which serves as a measure of how well the detection statistic discriminates between true and false positives. Figure 5.6(a) shows plots of discrimination against compression for the three proposed schemes at different values of true SNR, with N ≈ 256. We use the maximum-overlap detection statistic in lieu of the optimal statistic for the partition scheme, and are compelled to chooseN = 210 for the binomial coefficient scheme. The three schemes have comparable performance at lower SNRs, but the partition scheme begins to outpace its correlated alternatives as SNR increases. To compare identification performance (after a true detection), we consider plots of accurate-identification rate PI against compression, but only for the fastest standard algorithms of each scheme (i.e. I1 for the partition scheme, and automatic identification I0 for the correlated schemes). The rate PI for each plot point is calculated using all and only the trials with the injected signal present, and therefore assumes perfect detection throughout (PD = 1 and PF = 0). This decouples identification from detection: it allows standardised comparison of the schemes at a fixed false alarm rate, and does not penalise the identification performance of any method for having inferior detection performance. As seen in Figure 5.6(b), the usefulness of lossless compression and au- tomatic identification is limited in the presence of noise; the addition of a simple fine-grained search to the partition scheme is enough to yield signif- icantly higher identification accuracy at the cost of marginally lower compres- §5.2 Conic compression 103 (a) (b) Figure 5.6: Plots of (a) discrimination D and (b) accurate-identification rate PI against compression rate κ for the partition, symmetric base and binomial coefficient schemes, at different values of true SNR ρ for N ≈ 256. sion. The turnaround in accurate-identification rates for the partition scheme at larger values of P is due to the additional statistic evaluations used in the fine-grained search, which for I1 givesNeval = M+P in (5.17). SinceM = N/P , κ(P ) has one turning point. For this example, P = 8 and P = 64 provide the same level of compression; identification accuracy is higher for the former at ρ = 10, similar for both at ρ = 4, and higher for the latter at ρ = 2. In summary, the partition scheme offers better overall performance than its correlated alternatives at the same level of compression. For detection, the introduced correlations among the conic statistics lead to slightly reduced de- tection sensitivity and discriminatory power at high SNR; furthermore, the potential benefits of lossless compression for identification turn out to be nul- lified by the effects of noise. Hence there appears to be little reason for using correlated schemes over the partition scheme, which is more promising as it is easy to implement and admits a relatively populated sliding scale of compres- sion rates. We further investigate and implement the partition scheme as the representative conic compression scheme in Sections 5.3 and 5.4. 104 Template bank compression 5.3 Orthogonality and subspaces The conic compression schemes proposed in Section 5.2 are fully general, in the sense that no prior assumptions about the template bank are made apart from (5.4) and (5.5). These orthogonal and 1-D restrictions are neither realistic nor optimal, as template banks typically feature highly correlated neighbour- ing templates and are unlikely to contain a template exactly proportional to the GW signal itself. In this section, we discuss the (separate) lifting of each as- sumption for the partition scheme, and the resultant effects on detection sensi- tivity and localisation accuracy. Each approach may be viewed as a simplified limiting case of an actual template bank, which can always be made dense enough to include a signal-proportional template (assuming model accuracy), or orthogonally decomposed. A more realistic example with both assumptions lifted is considered in Section 5.4. 5.3.1 Non-orthogonal templates We first consider a sufficiently dense bank of correlated (non-orthogonal) tem- plates, such that the GW signal still lies in the 1-D subspace spanned by a single template in Hilbert space. From the first equalities in (5.6), (5.7), (5.15) and (5.16), it follows in the presence of a GW signal that E[Xm] = A ∑ n∈Um 〈h1|hn〉, (5.43) cov(Xm, Xm′) = ∑ n∈Um ∑ n′∈Um′ 〈hn|hn′〉. (5.44) Any partition of N as in Section 5.2.1 defines a splitting of the (sorted) orig- inal mean vector and covariance matrix into P × 1 blocks and P × P blocks respectively; each entry in the conic mean vector and covariance matrix is then simply the sum of entries in the corresponding block, which reflects the coarse- graining of the compression. As a toy model for investigating non-orthogonal templates, we use a frequency-parametrised bank of sinusoidal waveforms h = sin (2pift) with fi- §5.3 Orthogonality and subspaces 105 nite observation time T . Assuming white noise for simplicity, the inner prod- uct (2.53) may be written for real time series as 〈a|b〉 ∝ ∫ T 0 dt a(t)b(t). (5.45) For anN -template bank with fmin ≤ f ≤ fmax and δf := (fmax−fmin)/(N−1) fmin, the overlaps are given by 〈hn|hn+∆n〉 ≈ 2 ∫ 1 0 dt sin (2pifmint) sin (2pi(fmin + |∆n|δf)t), (5.46) where we have normalised to T = 1 such that f is given in waveform cycles per observation time. This sinc-like function of ∆n ∈ Z yields a band covariance matrix for xn; we set N = 256, and choose the frequency bounds such that cov(xn, xn±1) ≈ 0.97 (i.e. a maximal mismatch of around 0.03). In contrast to the orthogonal case, the choice of partition generally affects the performance of the partition scheme for non-orthogonal templates. For the one-parameter template bank with overlaps given by (5.46), we consider both a randomised partition and a more optimised (but not necessarily optimal) par- tition with Um = {n ∈ Z+ | (m− 1)P < n ≤ mP}. We also include for compar- ison a uniformly spaced M -template subset of the original bank; equivalently, Um = {n ∈ Z+ |n = mP} where ⋃ m∈M Um 6= N. This “coarsened” template bank is obtained not through compression, but by simply reducing the correla- tion (increasing the maximal mismatch) between neighbouring templates. The standard detection algorithm outlined in Section 5.2.1 is then applied for the two partition schemes and the coarsening method. Figure 5.7(a) shows plots of discrimination (using Xmax) against compres- sion for both choices of partition and the coarsened template bank, with perfor- mance in the presence of a GW signal averaged over theN possible locations of the corresponding template in the bank. The optimised partition (with highly correlated templates grouped together) outperforms its randomised counter- part at all considered values of true SNR. It also shows improvement over the coarsening method at higher compression, which is expected as it uses infor- mation from the full N -template bank rather than just an M -template subset. 106 Template bank compression (a) (b) Figure 5.7: Plots of (a) discrimination D against compression rate κ for the randomised/optimised partition scheme and the coarsening method, and (b) accurate-identification/localisation rate PI against compression rate κ for the optimised partition scheme and the coarsening method, at different values of true SNR ρ for a non-orthogonal bank. Accurate localisation here is defined as the identification of the template h1 or one of the nearest P templates. §5.3 Orthogonality and subspaces 107 The largest statistic evaluation for the coarsened template bank identifies a best guess for the GW signal, but the accuracy of this identification is zero if the signal does not correspond to a template in the coarsened bank. Since the spacing of the coarsened bank is P , we may consider the best-guess template as representative of the P templates nearest to it (or P − 1 if P is odd), and say that the largest statistic evaluation localises a best guess for the signal. We then define the localisation to be accurate if the correct template h1 is one of those templates (equivalently, if the identified best-guess template is h1 or one of the nearest P templates). The identification algorithms in Section 5.2.1 also identify a single best-guess template for the partition scheme, which allows us to consider both accurate identification (to a precision of 1) and accurate localisation (to a precision of P ) in the same way. Figure 5.7(b) shows plots of these accuracy rates (using I1, which gives Neval = M + P in (5.17)) against compression for the optimised partition scheme and the coarsening method. As in Section 5.2.4, the turnaround in accurate-identification and localisa- tion rates for the partition scheme is due to the additional statistic evaluations of the fine-grained search. The localisation rates increase up to some level of compression, which is mainly because “accurate” localisation is defined up to a degree of precision that degrades with compression; this effect is seen for the coarsening method as well. Localisation to within the spacing of the original template bank (i.e. identification) decreases monotonically in accuracy for the partition scheme, and will not be achievable for the majority of signals with the coarsening method. The partition scheme localises the GW signal with slightly greater accuracy than the coarsening method, and in fact identifies it with virtually no fall-off in accuracy at significant compression levels. Increasing the correlation between neighbouring templates is known to im- prove the detection and localisation performance of a general template bank [173]. Results in this section illustrate that the partition scheme retains these benefits up to high levels of compression, and provides a superior alternative to simply coarsening the template bank for computational savings. The via- bility of conic compression becomes even more evident in Section 5.4, where we apply the partition scheme to a larger and more broadly correlated two- parameter template bank. 108 Template bank compression 5.3.2 Two-dimensional subspace Throughout Sections 5.2 and 5.3.1, we have assumed that the GW signal is exactly proportional to a template in the bank. To understand the impact on compression performance when this is not the case, we consider a bank of N uncorrelated templates obtained through some orthogonalisation procedure (e.g. as in [164–166]) on a general template bank, and a signal lying in the N -dimensional Hilbert space spanned by the orthogonal set. IfN is large, the signal is typically restricted to a low-dimensional subspace (this follows from the volume of anN -sphere). For simplicity, we assume it lies exactly between two templates in a 2-D subspace, i.e. x = A(h1 + h2) + {noise}, (5.47) where the templates have been relabelled without loss of generality and A = ρ/ √ 2 from (2.54). Hence the expectation values of the original and conic statis- tics become E[xn] = A(δ1n + δ2n), (5.48) E[Xm] = A card({1, 2} ∩Um), (5.49) while their covariances remain as (5.7) and (5.16) respectively. The assumption (5.47) is the worst-case scenario for a 2-D subspace, since the signal is maxi- mally far from both templates in the subspace. Although it is not possible to pre-optimise the choice of partition for or- thogonal templates, the performance of the partition scheme in the presence of a 2-D GW signal falls into two partition-dependent cases. At small values of P , it is more likely that the labels 1 ∈ Um and 2 ∈ Um′ are assigned to different sets (m 6= m′); as P increases, so does the probability that they are assigned to the same set (m = m′), which improves performance (e.g. the effective SNR for detection purposes is raised by a factor of √ 2). The standard detection algorithm in Section 5.2.1 is applicable for a 2-D signal, while the standard identification algorithms may be generalised at step (iv) by considering the two largest original statistic evaluations instead. Fig- ure 5.8 shows plots of discrimination and accurate-identification rate against §5.3 Orthogonality and subspaces 109 (a) (b) Figure 5.8: Plots of (a) discrimination D and (b) accurate-identification rate PI against compression rate κ for the partition scheme with 1-D and 2-D signals, at different values of true SNR ρ. The higher dotted curves for each value of ρ correspond to the scenario where the template labels 1 and 2 are assigned to the same set, while the lower curves correspond to them being assigned to different sets. Accurate identification for the 2-D case is defined as the identi- fication of both templates h1 and h2. 110 Template bank compression compression for a 2-D signal ∝ (h1 + h2), compared against a 1-D signal ∝ h1 with the same true SNR ρ. The identification algorithm I2 is used, since the accuracy rate of I1 falls to zero if m 6= m′. This gives Neval = M + 2P in (5.17). For detection of a 2-D GW signal, the effectiveness of the partition scheme is reduced slightly at lower SNRs, but mitigated by the case where m = m′ (i.e. the higher dotted curves in Figure 5.8). Detection performance for this special case actually improves up to some level of compression, which can occur as the symmetry among all possible signals is broken (by the partitioning process). A similar effect is seen for the example in Section 5.4. The discrimination for a 1-D signal generally lies within the 2-D discrimination bounds; at higher compression rates, there is little to no detection performance lost if the signal is not confined to a 1-D subspace. Accurate identification of a 2-D GW signal (i.e. the identification of both h1 and h2 in this toy model) is more problematic than in the 1-D case, since accuracy rates are reduced to begin with and fall off rapidly even at high SNR. Nevertheless, options such as lowering compression or switching to Ii>2 are available for the partition scheme, which should at least allow the template with maximal signal overlap to be identified at acceptable accuracy rates. If the true SNR is sufficiently high, the algorithms Ii>j may also be used to identify a j-dimensional GW signal described by an arbitrary linear combina- tion of templates, i.e. x = j∑ k=1 Akhk + {noise}, (5.50) where Ak > Ak+1 and the templates have been relabelled without loss of gen- erality. At step (iv) of the algorithms, each ordered weight Ak may be approx- imated by the k-th largest original statistic evaluation x(k), with the detection SNR given by ρD = √√√√ j∑ k=1 x2(k). (5.51) While this method fully recovers the (relative) weights of a GW signal’s j largest modes in the limit of zero noise, its accuracy might be significantly diminished for lower-SNR signals or at higher levels of compression. §5.4 Taylor-T2 example 111 5.4 Taylor-T2 example In this section, we implement the (optimised) partition scheme described in Sections 5.2.1 and 5.3.1 for a larger and more realistic example: a two- parameter template bank of mixed-order PN waveforms, which describe the gravitational radiation emitted during the inspiral part of a comparable-mass binary coalescence. An optimised partition here (and in general) refers to a partition of the template bank for which the convex hulls of the sets are non- intersecting, such that highly correlated templates are grouped together. The waveform family we use is the Taylor-T2 approximant [38, 174] for a circular and non-inclined binary with comparable component masses (m1,m2 ≥ m1). These waveforms are parametrised by chirp mass (2.31) and symmetric mass ratio (2.27), and are written as PN expansions in the frequency-related variable ξ := (GMη−3/5φ˙/c3)2/3, where units have been re- stored and φ˙ is the orbital phase frequency. We truncate the PN expansions at finite order, specifying the phase and amplitude to 3.5PN and 2PN respec- tively; the resultant mixed-order waveform may be written compactly as [175] hM,η(t) = 2GMη2/5 c2R A(t) exp (2iψ(t)), (5.52) where R is the source distance (which the true SNR ρ is inversely proportional to), and expressions for the amplitude function A and tail-distorted orbital phase ψ are given in Appendix C. Template bank compression is potentially more important for analysing data from space-based detectors, since the long duration and potential com- plexity of GW signals in their low-frequency sensitivity band greatly increases the cost of a grid-based detection search. As SMBH binary coalescences are an anticipated source for LISA, we consider a Taylor-T2 signal with the parame- ters θC = (1, 0.15), where θ := (M/(106M ), η); this corresponds to a black- hole binary inspiral with component masses (m1,m2) = (0.6, 2.5)×106M . The duration of the signal is set to tc = 1 yr. We also generate a bank of Taylor-T2 templates with the same duration, each normalised with respect to the inner product (2.53), where Sn(f) is given 112 Template bank compression by an analytic approximation to the (down-scoped) LISA noise power spectral density [176]. These templates are gridded uniformly in the transformed pa- rameters θ′ := θC + L(θ − θC) (128 points in each parameter), with the signal lying in the middle of the four central templates and the linear transformation L chosen such that the template overlaps are isotropic with respect to the grid (at least for the central region). The maximal mismatch of each template with its four nearest neighbours is around 0.01. Since the N = 16384 templates are pre-sorted by the (skewed) square grid, an optimised (but not necessarily optimal) partition is obtained by the obvious grouping into M blocks of √ P ×√P templates. This particular template bank admits six nontrivial square partitions with P ∈ {4, 16, 64, 256, 1024, 4096}; we do not consider the case P = 4096, as P = 1024 already yields a compression rate of 99.9%. A large number of rectangular partitions (where P = 2i with 0 < i < 14) are also possible, but we omit these here for simplicity as they are degenerate with the square partitions and among themselves. Square par- titions are straightforward to generalise for various lattice choices [177–179], and will be fairly optimal as long as the templates are gridded uniformly in the parameter-space metric. The expectation values of the original and conic statistics (given by the first equality in (5.6) and (5.43) respectively) are visualised in Figure 5.9, where the coarse-graining of the compression is evident. Overlaps for the Taylor-T2 tem- plate bank are much less localised than the toy model overlaps in Section 5.3.1; this is due to their wider cycle widths in both M and η, as well as a slight degeneracy in the two parameters (overlaps at the boundary of the first plot in Figure 5.9 can be as high as 0.4ρ). As the templates are so broadly corre- lated and the GW signal is injected right in the centre of the bank, the partition scheme is expected to perform well up to a high level of compression. For comparison purposes, we again consider the simple coarsening method discussed in Section 5.3.1. The smaller coarsened banks are formed by select- ing individual templates near the centre of each square block in the original bank, rather than by summing the templates as in the partition scheme. Detec- tion and localisation performance for both the partition scheme and the coars- ening method on the Taylor-T2 template bank with central injection is sum- §5.4 Taylor-T2 example 113 Figure 5.9: Matrix/contour plots of the expectation values E[xn] and E[Xm] for the partition scheme, at different values of set cardinality P (with compression rate κ in parentheses) for a Taylor-T2 GW signal (red cross) injected between the four central templates of a (128×128)-template Taylor-T2 bank. The bank is gridded uniformly in linearly transformed parametersM′ (increasing from top to bottom) and η′ (increasing from left to right) with maximal mismatch≈ 0.01. Overlap values depend on the true SNR ρ (set to 1 in these plots), and range from positive (orange) to negative (blue) in some subinterval of (−Pρ, Pρ). 114 Template bank compression (a) (b) Figure 5.10: Plots of (a) detection rate PD (at fixed false alarm rate PF ) and (b) accurate-localisation rate PI (to nearest ν templates) against compression rate κ for the optimised partition scheme and the coarsening method, at different values of true SNR ρ for a GW signal injected with central parameters θC . Ac- curate localisation here is defined as the identification of a template within the central squares of ν templates. §5.4 Taylor-T2 example 115 marised in Figure 5.10. The semi-log plots in this section use an abscissa of − lg (1− κ)/3, as most of the considered compression rates are > 90%. Instead of the discrimination (5.42), we quantify detection performance us- ing the detection rates at two fixed false alarm rates PF = 10−2 and PF = 10−4 (the number of Monte-Carlo trials performed for each plot point is ∼ 105, and so the errors are ∼ 10−3 for a 1-sigma binomial confidence interval). At all considered values of SNR and fixed false alarm rate, there is no fall-off in the partition scheme’s detection performance up to κ = 93.8% (and even a slight increase, due to the special choice of central injection). While this is also the case for the coarsening method, detection rates for the partition scheme are distinctly higher at compression rates of > 90%, with improvements of over 0.1 at κ = 99.9%. The identification algorithm I1 is used to localise the GW signal, which gives Neval = M +P in (5.17). Rates for accurate localisation to within two cen- tral squares of 12×12 templates (corresponding to< 1% of the entire bank) and 4 × 4 templates (< 0.1% of the bank) are considered. Localisation is typically improved by compression up to κ = 93.7%, which is provided by two different values of P (see discussion in Section 5.2.4). The two values are P = 16, beyond which the matrix/contour plot of E[Xm] in Figure 5.9 loses scale-similarity to that of E[xn], and P = 1024, for which performance is regained as each conic template incorporates more of the original templates and accuracy is added by the fine-grained search. Localisation is poorer at κ = 98.0%, which corre- sponds to both P = 64 and P = 256. To reduce clutter in Figure 5.10(b), only the higher localisation rates for κ = 93.7% and κ = 98.0% are plotted. The partition scheme outperforms the coarsening method at most levels of com- pression, especially in the case of accurate localisation to within the smaller square of 4× 4 templates. For the special case of a centrally injected GW signal, the detection and localisation performance of the partition scheme is non-decreasing up to high levels of compression and can even rise above that of the original template bank; however, this may also be said for the coarsening method. To illustrate that the improvement of the partition scheme over the coarsening method is not simply due to the special choice of injection, we consider two other cases: 116 Template bank compression Figure 5.11: Matrix/contour plots of the expectation values E[xn] for Taylor-T2 GW signals (red crosses) injected in a (128 × 128)-template Taylor-T2 bank at random (left) and near the boundary (right). a Taylor-T2 signal injected with randomly drawn parameters θR = (1.0, 0.16), and another injected near the boundary of the bank with the parameters θB = (0.98, 0.06) (i.e. in the middle of the four corner templates with low chirp mass and symmetric mass ratio). The expectation values of the original statistics for these two injections are visualised in Figure 5.11. Figure 5.12 shows detection and accurate-localisation rates for both the par- tition scheme and the coarsening method on the random and boundary injec- tions. The random injection is actually recovered with slightly better rates than the central injection rates in Figure 5.10, but with a similar improvement of the partition scheme over the coarsening method. A more marked difference be- tween the two methods is obtained for the boundary injection. Detection rates for both methods are now non-increasing, with the partition scheme showing greater improvement over the coarsening method; for the ρ = 6 case, the im- provement is around 0.3 at κ = 93.8%. Rates for accurate localisation of the boundary injection to within the corner square of 12 × 12 templates follow a similar trend, with a largest improvement of around 0.5 (again for the ρ = 6 case at κ = 93.7%). Detection and localisation performance for this Taylor-T2 example is injection-dependent, as it is for any realistic template bank: there is clearly no symmetry among all possible GW signals, since the templates are asymmetri- §5.4 Taylor-T2 example 117 (a) (b) Figure 5.12: Plots of (a) detection rate PD (at fixed false alarm rate PF = 10−2) and (b) accurate-localisation rate PI (to nearest 12×12 templates) against com- pression rate κ for the optimised partition scheme and the coarsening method, at different values of true SNR ρ for a GW signal injected with random param- eters θR and boundary parameters θB. Accurate localisation here is defined as the identification of a template within the square of 12 × 12 templates nearest to each injection. 118 Template bank compression cally correlated and the signals may lie between templates. We have not un- dertaken a full injection-averaged analysis (similar to that in Section 5.3.1) due to the size of the template bank, but overall detection rates for such an analy- sis should decrease monotonically with compression as per intuition, with the partition scheme outperforming the coarsening method (as it does for the three injections presented here, as well as several others we have examined). The partition scheme is expected to remain robust for searches in a (d > 2)- dimensional parameter space. As the number of templates that are highly cor- related with the GW signal increases exponentially with d, enlarging the span P of each conic template at the same rate should maintain detection sensitivity while increasing the relative computational savings (which scale as 1 − 1/P ). Good scaling with parameter-space dimensionality allows conic compression to be competitive with other search techniques that reduce computational cost. For example, the method of searching over the signal time-of-arrival using fast Fourier transforms yields a logarithmic reduction in the number of search points for that parameter [177], but for multidimensional searches an overall logarithmic reduction is easily attained by the partition scheme with little im- pact on performance. The two methods might even be combined for greater savings, by constructing conic sums of templates aligned at a fixed reference time and using their Fourier transforms to search over time-of-arrival. 5.5 Discussion In this chapter, we have presented and compared three tunable conic compres- sion schemes (partition, symmetric base and binomial coefficient) for a GW template bank in a grid-based detection search. The bank is compressed in the preparatory offline stage, which yields faster detection and localisation of sig- nals by reducing the number of inner product evaluations performed online. A recently proposed binary labelling method, modified to ensure the equal treatment of templates, is subsumed as a particular case of the symmetric base scheme. Optimal detection statistics have been calculated for all three schemes under simplified conditions, and the standard maximum-overlap detection statistic is shown to be significantly suboptimal for the base and binomial §5.5 Discussion 119 schemes. While these two lossless schemes provide automatic identification of the GW signal upon detection, the benefits of this are negated in the pres- ence of noise. Furthermore, the lossy partition scheme offers better overall performance than its counterparts at the same level of compression. We have applied the partition scheme to toy models of (i) a correlated tem- plate bank with a signal-proportional template and (ii) a signal lying in the span of orthogonal templates, to show that it remains feasible under such con- ditions. These toy models are instructive as they represent the two limiting cases of a general template bank. Correlations among the original templates result in partition-dependent performance, but this may be optimised before- hand by grouping highly correlated templates together; the optimised parti- tion scheme is then superior to a simple coarsening of the template bank. If the signal is proportional to a linear combination of templates in an orthogo- nal bank, the detection performance of the scheme is not significantly reduced. Conic compression performs well if the original template bank is suffi- ciently correlated, as demonstrated by our example implementation of the op- timised partition scheme for a bank of ∼ 104 PN waveforms. We consider a centrally injected GW signal, a randomly injected one, and one at the bound- ary of the bank; again, the scheme is superior to the coarsening method across the board. The partition scheme is shown to be viable for practical applica- tions, as it maintains good detection sensitivity and localisation accuracy up to high levels of compression and at all considered values of SNR for this more realistic template bank. In summary, our tunable conic compression schemes—specifically the opti- mised partition scheme—might provide an effective method of improving the speed, detection sensitivity and localisation accuracy of GW template banks. The schemes are potentially useful for any search involving template banks, as they are fully general and may easily be adapted to supplement existing algo- rithms in GW data analysis pipelines. Conic compression is also particularly promising in the context of LISA data analysis, where online grid searches will be difficult as computational costs are more prohibitive; for example, the method could be used as an online tool to rapidly identify nearby sources be- fore merger and generate alerts for electromagnetic telescopes. 120 Template bank compression Chapter 6 Gaussian process regression The theoretical uncertainty in waveform models must be accounted for when performing Bayesian parameter estimation on data from GW detectors, espe- cially as it could be the dominant source of error at high SNRs. A recently introduced method deals with model error by marginalising over it, using a prior probability density that is constructed through the machine-learning technique of Gaussian process regression. In this chapter, we outline the framework of the method and summarise the results of its application to LIGO parameter estimation, before performing a preliminary investigation of its viability for EMRI sources in LISA data anal- ysis. Through low-dimensional parameter estimation studies, we demonstrate that the method remains computationally practical in the light of various diffi- culties associated with searching the large EMRI parameter space. We also dis- cuss possible strategies for its generalisation to higher-dimensional searches. The material in this chapter is adapted from selected parts of [134, 180, 181], but the main results in Section 6.3 have yet to be published. 6.1 Background A typical EMRI will be observed by LISA for ∼ 105 orbits over the mission lifetime. Although this allows its properties to be measured very precisely, the results are consequently susceptible to any inaccuracy in the waveform models that are used in the Bayesian parameter estimation algorithms. This “theoreti- cal error” will likely dominate over statistical error at high SNRs [182]. At the same time, EMRIs are notoriously difficult to model accurately, since the ex- 121 122 Gaussian process regression treme mass ratio prohibits the use of NR simulations. Waveform models that employ self-force calculations are still under development; they will also be computationally expensive, and are unlikely to be used directly for parameter estimation after they become available. Current data analysis studies for LISA are therefore heavily reliant on the kludges described in Chapter 4, which are designed for robust use in search algorithms and can be modified to include self-force information. However, it will still be challenging to do a rough detection search of the full EMRI pa- rameter space with kludge waveforms [168], much less explore it with the precision required for accurate parameter estimates. EMRI waveforms are ex- tremely sensitive to small changes in their parameters, and so the global peak in the vast and multimodal posterior surface is akin to the proverbial needle in a haystack. This fact, coupled with the difficulty in modelling the complex waveforms and the systematic bias due to theoretical error, makes EMRI pa- rameter estimation the most formidable problem in LISA data analysis. In this chapter, we investigate the machine-learning technique of Gaussian process regression (GPR) [183, 184] as a possible strategy for mitigating the- oretical error in EMRI parameter estimation. Specifically, GPR is used to in- terpolate a small set of precomputed waveform differences between an accu- rate model and an approximate one; the GPR interpolant then provides a prior probability density for the waveform difference, which allows theoretical error to be marginalised over in the standard Bayesian likelihood with the approx- imate model [185]. The benefits of this method for GW parameter estimation are twofold: it includes information from computationally expensive wave- forms while searching with faster but less accurate ones, and accounts for any residual model inaccuracy with more conservative error estimates. An overview of the new marginalised-likelihood method is given in Sec- tion 6.2. The technique of GPR (in the context of waveform interpolation) and the training of the Gaussian process model from the precomputed set of waveform differences are introduced in Sections 6.2.1 and 6.2.2 respectively; Section 6.2.3 then briefly summarises results from a LIGO-oriented proof- of-concept study, where the method is applied to parameter estimation for comparable-mass black-hole binary mergers. §6.2 The marginalised likelihood 123 In Section 6.3, the viability of the GPR likelihood for EMRI parameter es- timation is investigated, and a simple argument is provided to show that the minimal allowed density of the training set (over parameter space) is typically much lower than the sampling density of a parameter estimation search with the accurate waveforms. This is verified by one- and two-dimensional exam- ples in Sections 6.3.1 and 6.3.2 respectively, where the relationship between the waveform-difference Fisher information matrix and the optimality of the GPR model and training set is also made clear. Finally, possible computational strategies for generalising the method to higher-dimensional searches are dis- cussed in Section 6.4. 6.2 The marginalised likelihood In this chapter, we consider waveform models h(λ ∈ Λ) over a common `- dimensional parameter space Λ, which we take to be isomorphic to R` for simplicity. The standard Bayesian likelihood L(λ|x) for the model parame- ters (given the data x) is defined in (2.59), with the usual noise-weighted inner product (2.53) on the function space W of finite-length complex time series. A maximum-likelihood estimation λML of the parameters may then be obtained by maximising (2.59) over Λ, such that〈 ∂h ∂λ (λML) ∣∣x− h(λML)〉 = 0. (6.1) For a waveform model hacc that provides an accurate description of the source, the waveform template at the true parameter values λtrue is consistent with the source signal, i.e. x = hacc(λtrue) + n. At leading order in n, any error λ := λML − λtrue in the measured parameter values is then purely statistical, in that it depends linearly on n. Using Einstein notation, we may write (6.1) as 〈(∂hacc)b|n− (∂hacc)a(λ)a〉 = 0 =⇒ (λ)a = (Γ−1acc)ab〈n|(∂hacc)b〉, (6.2) where ∂h ≡ ∂h/∂λ is the waveform derivative vector and Γ is the Fisher infor- mation matrix (2.61), with both evaluated at λML. 124 Gaussian process regression In general, a waveform model happ that is used for parameter estimation may only be approximate, such that happ(λtrue) 6= hacc(λtrue). At leading order in n, any error in the measured parameter values may be written as (λ) a = (Γ−1app) ab〈n|(∂happ)b〉 − (Γ−1app)ab〈h|(∂happ)b〉, (6.3) where the first term is statistical in the sense of (6.2), and the second term corresponds to the theoretical error that arises from the difference h between the approximate and accurate waveforms, i.e. h := happ − hacc. (6.4) Again, all derivatives in (6.3) are evaluated at λML, while the waveform differ- ence h is evaluated at λtrue.15 The statistical-error terms in (6.2) and (6.3) are inversely proportional to the waveform SNR ρ = 〈h|h〉1/2, since ∂h ∝ ρ and Γ ∝ ρ2. On the other hand, the theoretical-error term in (6.3) is independent of ρ. Hence the systematic bias incurred by using approximate waveforms happ in (2.59) may dominate the noise uncertainty for high-SNR sources, and is likely to be the limiting factor in extracting parameter information from such signals [182]. One approach to account for this bias is to marginalise over the error of happ (with respect to hacc) in (2.59), by specifying a suitable prior probability density p(h) for the waveform difference [185]. The “marginalised likelihood” is given by the (functional) integral L ∝ ∫ Dh Lappp(h), (6.5) which can be evaluated analytically if p(h) is Gaussian (sinceLapp is also Gaus- sian). Such a prior may be obtained through the technique of GPR, which pro- vides an interpolant for h with an associated normal distribution at each point in parameter space. 15 However, h(λtrue) = h(λML) at leading order, which allows the theoretical error in (6.3) (and hence λtrue itself, at high SNR) to be estimated for a given measurement λML [182]. §6.2 The marginalised likelihood 125 6.2.1 The Gaussian process prior In the GPR approach, the waveform difference h ∈ W is modelled as a zero- mean Gaussian process over Λ, i.e. h ∼ GP(0, k), (6.6) where the mean is chosen (uninformatively) as the time series 0 ∈ W , and the covariance function k(λ,λ′) is some symmetric and positive-definite bilinear form on Λ. For any finite set of parameter points {λi ∈ Λ | i = 1, 2, ..., N}, the corresponding set of waveform differences {h(λi) ∈ W | i = 1, 2, ..., N} has a Gaussian probability distribution N (0,K) on WN , i.e. p(v) ∝ 1 det K exp ( −1 2 vTK−1v ) , (6.7) where the waveform difference vector v and covariance matrix K are given respectively by v = [h(λi)]. (6.8) K = [k(λi,λj)], (6.9) It is convenient to write the quadratic form in (6.7) as vTK−1v = tr(K−1M), (6.10) where M := vvT = [〈h(λi)|h(λj)〉]. (6.11) In choosing a frequency-independent form k(λ,λ′) for the covariance func- tion, we have assumed that the correlations among the waveform differences across parameter space do not depend on frequency. The particular normalis- ing factor in (6.7) is only correct for a single frequency component (i.e. if the waveform difference at each parameter point is perfectly correlated across all frequency bins), but is retained for practical reasons.16 Finally, the inner prod- uct for waveform differences in (6.11) is chosen to be the same noise-weighted 16 Also, the “usual” exponent of 1/2 is absent due to integration over the complex plane. 126 Gaussian process regression inner product (2.53) as for the waveforms. These assumptions simplify the GPR calculations, but are also conservative in the sense that they yield less informative likelihoods; a more detailed justification is provided in [180]. From the definition of a Gaussian process, the enlarged set {h(λi), h(λ)} is again normally distributed with zero mean and the covariance matrix K∗ := [ K k∗ kT∗ k∗∗ ] , (6.12) where k∗ := [k(λi,λ)], (6.13) k∗∗ := k(λ,λ). (6.14) If {h(λi)} is known, then the conditional probability density of h(λ) given {h(λi)} is also Gaussian, i.e. p(h(λ)|{h(λi)}) ∝ 1 σ2 exp ( −1 2 〈h(λ)− µ|h(λ)− µ〉 σ2 ) , (6.15) where µ(λ) and σ2(λ) are given respectively by µ = kT∗K −1v, (6.16) σ2 = k∗∗ − kT∗K−1k∗. (6.17) The conditional probability (6.15) forms the basis of GPR, and yields an interpolation of h(λ) from a small precomputed “training” set D := {(λi, h(λi)) | i = 1, 2, ..., N}. (6.18) This interpolant is given by the waveform difference mean µ(λ), with associ- ated variance σ2(λ); it provides a new GPR-informed waveform model hGPR := happ − µ, (6.19) which approximates hacc via (6.4). Equation (6.15) also supplies the prior den- §6.2 The marginalised likelihood 127 sity for h in (6.5), which evaluates to L ∝ 1 1 + σ2 exp ( −1 2 〈x− hGPR|x− hGPR〉 1 + σ2 ) . (6.20) The GPR marginalised likelihood has several desirable features for param- eter estimation. A maximum-likelihood estimation of λwith (6.20) gives 〈(∂hGPR)b|n− h + µ− (∂hGPR)a(λ)a〉 = 0 (6.21) from (6.1) and (6.19) with x = hacc(λtrue) + n and λML = λtrue + λ, and so (λ) a = (Γ−1GPR) ab〈n|(∂hGPR)b〉 − (Γ−1GPR)ab〈h|(∂hGPR)b〉 +(Γ−1GPR) ab〈µ|(∂hGPR)b〉, (6.22) where the third term is proportional to the GPR interpolant µ and acts to cancel the second term by design. This correction greatly reduces the systematic bias due to theoretical error, provided the interpolant is performing optimally (i.e. µ ≈ h) near λtrue. Another safeguard against theoretical error is the presence of the GPR vari- ance σ2 in (6.20). This variance is 1 when µ ≈ h, but may become ∼ 1 far from all training-set points, or in the case of a suboptimally chosen training set or covariance function. The consequent broadening of the Gaussian in (6.20) is then conservative in nature, as it usually prevents the true parameter values from being excluded at high significance. Lastly, the premise of the GPR approach is based on the availability of ac- curate waveforms hacc that are extremely expensive to compute, and hence unsuitable for use in Monte Carlo search algorithms with the standard like- lihood. The marginalised likelihood remains computationally tractable while including information from hacc, since it only uses the approximate waveforms happ and adds to them some linear combination of precomputed waveform differences via (6.16). Any extra online computational cost from using the marginalised likelihood thus scales linearly with the size N of the training set. The scaling coefficient (relative to the cost of (2.59) with happ) is typically small; for the analyses in Section 6.3, it is ∼ 10−3. 128 Gaussian process regression 6.2.2 Training the Gaussian process With the zero-mean assumption in (6.6), the waveform difference model is fully specified by the covariance function k. The standard approach is to define a functional form for k that depends on a number of hyperparameters θ, and to select values for θ by training the Gaussian process with information from the set D. A covariance function k(λ,λ′) is stationary if it depends only on the relative position λ−λ′ of the two parameter points; it is furthermore isotropic if it depends only on τ 2 := gab(λ− λ′)a(λ− λ′)b, (6.23) where the gab are the `(` + 1)/2 independent components of some constant parameter-space metric g on Λ. An investigation of various common isotropic (hence stationary) covari- ance functions in the GW context finds the performance of the GPR interpolant and the marginalised likelihood to be fairly robust against changes in the func- tional form for k [180]. Hence we consider a single fixed form in this chapter: the squared-exponential covariance function kSE(λ,λ ′) = σ2f exp ( −1 2 τ 2 ) , (6.24) which is the smooth limiting case for several different families of covariance functions. The hyperparameters for the model GP(0, kSE) then comprise only the metric components gab and some overall scale factor σ2f . As the size of the training set D increases, the covariance matrix K rapidly becomes ill-conditioned, even for a modestly sized set with N & 10. This is partly mitigated by the addition of noise to D, such that the GPR interpolant need only pass close to (rather than through) each training-set point. Keeping σ2f as the overall scale factor, we transform K→ K + σ2fσ2nIN , (6.25) where IN is the identity matrix, and the fractional noise variance σ2n of each training-set point is taken to be uniform and fixed (i.e. not treated as a hy- §6.2 The marginalised likelihood 129 perparameter). In practical terms, the transformation (6.25) effectively reduces the condition number of K, thereby facilitating its numerical inversion. We use σ2n = 10 −4 throughout this work, which is the smallest value compatible with all of the N . 100 training sets considered in Section 6.3. The most straightforward method of selecting the Gaussian process hyper- parameters θ is through maximum-likelihood estimation with the hyperlikeli- hood Z(θ|D) = p({h(λi)}) in (6.7), i.e. the likelihood for the hyperparameters given the training set. In other words, an optimal set of hyperparameters θML is obtained by maximising the log-hyperlikelihood lnZ = −1 2 tr(K−1M)− ln det K + const (6.26) over the hyperparameter space Θ. Part of this maximisation may be done analytically, since the overall scale σ2f factors out of the matrix expressions in lnZ [181]. In the case of (6.26), lnZ with K = σ2fKˆ achieves a maximum in σ 2 f at σ2f = 1 2N tr(Kˆ−1M). (6.27) Substituting (6.27) back into (6.26) then gives a scale-invariant form for lnZ: lnZ = −N ln tr(Kˆ−1M)− ln det Kˆ + const. (6.28) Equations (6.27) and (6.28) effectively reduce the dimensionality of the hyper- parameter space by one (to dim Θ = `(`+1)/2 for the model GP(0, kSE)), which is useful for the low-dimensional searches conducted in Section 6.3. 6.2.3 Application to black-hole binary mergers In [180], the viability of the GPR marginalised likelihood for improving GW parameter estimation is demonstrated through a one-dimensional (` = 1) study, using waveforms for merging black-hole binary systems with compa- rable component masses (m1,m2 ≥ m1). Two waveform approximants im- plemented in the LIGO Scientific Collaboration Algorithm Library [186] are 130 Gaussian process regression considered: the phenomenologically fitted IMRPhenomC model [187] and the analytic Taylor-F2 model [188], which are taken as accurate and approximate respectively. Even though these two waveforms are qualitatively different (IMRPhenomC describes the full inspiral–merger–ringdown while Taylor-F2 is inspiral-only), the marginalised likelihood functions as expected in reduc- ing systematic bias. Equation (6.20) is used to estimate the chirp mass (2.31) from synthetic data x = hacc(λtrue) (no detector noise), where hacc(λtrue) is an injected IMRPhe- nomC signal with Mtrue = 5.045M and fixed mass ratio q = 0.75. As the density of the training set (with respect to some metric on Λ) is expected to be the strongest determinant of interpolation performance, two different grid sizes in M are considered: ∆M = 10−2M and ∆M = 5 × 10−3M . The corresponding points are gridded uniformly around Mtrue across the range 5 ≤ M/M ≤ 5.6, such that the density of the training set is varied by fixing its span and changing its cardinality. A GPR model with the squared-exponential covariance function (6.24) is then trained on both sets by optimising the single independent metric hyper- parameter, which is chosen more intuitively in the ` = 1 case to be the covari- ance length δM := (gMM)−1/2. In general (and for the considered sets), the optimal value of δM shifts as the density and cardinality of the training set are varied, but typically by less than a factor of two. The performance of the marginalised likelihood is found to be similarly robust against the choice of δM for a given training set. Unsurprisingly, the performance of the marginalised likelihood is im- proved for the denser training set. Higher fidelity between the GPR waveform (6.19) and the accurate waveform is obtained across the span of the set; this is quantified by overlaps O(hGPR|hacc) that are & 0.999. The variance σ2 associ- ated with the waveform difference interpolant is smaller for the denser train- ing set as well, with values that are . 10−3 relative to σ2f (which is the limiting value of σ2 outside the span of the set). A maximum-likelihood estimation of M with the corresponding marginalised likelihood is therefore closer to the true value, and better constrained. The sparser training set in this study gives O(hGPR|hacc) & 0.985 and σ2/σ2f . 10−2, with a marginalised likelihood that §6.3 Application to extreme-mass-ratio inspirals 131 is shifted away and broader but still functional. As will be discussed in Sec- tion 6.3, this is because its density is close to some threshold determined by the optimal value of δM (which is largely independent of training-set density). Different source SNRs in the range 8 ≤ ρ ≤ 64 are also considered in this study. The marginalised likelihood for the sparser training set re- duces the systematic error in the maximum-likelihood estimation ofM from M = 5× 10−3M (around 10-sigma for a typical LIGO source with ρ = 16) to M = 9×10−4M ; it also broadens to remain consistent withMtrue at 2-sigma, even at high SNR. These results are obtained in the regime where the overlaps between the accurate and approximate waveforms are ≈ 0.35 across the span of the training set. Although theoretical error is reduced if the approximate model is improved, results in Section 6.3 show that the marginalised likeli- hood remains needed for overlaps as high as ≈ 0.97, which may still lead to significant systematic bias for a typical EMRI with ρ = 30. 6.3 Application to extreme-mass-ratio inspirals The detection and characterisation of EMRIs is a formidable challenge in GW data analysis, especially in the broader context of resolving these sources from a LISA data set that is likely to contain an (over)abundance of long-lived and overlapping signals [189]. Even as a standalone problem, searches of the EMRI parameter space are greatly hindered by its large volume (with respect to the Fisher information metric), which requires ∼ 1030 waveforms for full cover- age in a template bank approach [168]. This is exacerbated by the long and unwieldy waveforms themselves; a sampling rate of 0.2 Hz (the characteristic frequency for an EMRI with a 106M central black hole) yields ∼ 107 samples for both channels of a year-long waveform. Due to the O(N3) cost of computing the Cholesky decomposition for K in (6.16) and (6.17), it is clearly impractical—if not impossible—to cover any sig- nificant fraction of parameter space with a single training set. The present pur- pose of the GPR marginalised likelihood is thus restricted to precise parameter estimation in highly localised regions of parameter space. Furthermore, if the GPR approach is to be useful for EMRIs at all, the typical separation of points 132 Gaussian process regression in the training set must be greater than the Fisher metric lengths of the accu- rate waveform model, which determine the characteristic resolution of grid or stochastic searches with that model [134]. A simple argument shows that this is normally the case for EMRI wave- forms with ρ > 1. We consider a small neighbourhood of some point λ0 ∈ Λ, along with a covariance metric g for a Gaussian process that optimally de- scribes the distribution of h in that neighbourhood. The metric defines the short covariance lengths δλa := (gaa)−1/2, i.e. the half widths of the associ- ated hyperellipsoid in each one-dimensional parameter subspace through λ0. These lengths place upper bounds on the characteristic grid sizes of the train- ing set and lower bounds on its span; an incompatible training set typically yields no peak in the log-hyperlikelihood surface (6.26), and so the regression becomes suboptimal. From the optimality of the Gaussian process, the covariance lengths asso- ciated with g approximate the correlation lengths of h itself. Hence we have δλa ∼ (δλover)a, where the vector of overlap lengths δλover is defined to satisfy 〈h(λ0)|h(λ0 + Paδλover)〉 = 0 (6.29) for each a, with Pa projecting δλover onto the subspace corresponding to λa. At leading order, we then have δλa ∼ 〈h|h〉|〈h|(∂h)a〉| = ρ secφ a√ (Γ)aa , (6.30) where ∂h and Γ respectively denote the derivative vector and Fisher informa- tion matrix of the waveform difference, and φa is the principal inner-product angle between h and (∂h)a. The covariance lengths do not scale with the SNR ρ of the waveform difference, since the final denominator in (6.30) is ∝ ρ. We now consider the Fisher metric lengths of the waveform difference. As pointed out in [180], the Fisher lengths of the difference between two wave- form models are generally larger than those of the individual models, espe- cially if both models generate waveforms with high overlap. This might be due to the waveform difference having a lower SNR, or being more broadly corre- §6.3 Application to extreme-mass-ratio inspirals 133 lated across parameter space. Since the first of these two factors is known, we may separate it from the analysis and consider the Fisher lengths of the unit- SNR waveform difference instead. For the EMRI examples that follow in this section, these are found to be comparable to the Fisher lengths of the unit-SNR waveforms, which is simply the statement that the waveform difference varies across parameter space in similar fashion to the waveforms themselves. Since the short (i.e. defined analogously to δλa) Fisher metric lengths of the unit-SNR waveform difference are given by (δλFish) a := ρ√ (Γ)aa , (6.31) it follows that δλa & (δλFish)a. In general, any waveform derivative with re- spect to a parameter that only affects amplitude will have φa ≈ 0, such that δλa ∼ (δλFish)a. Nevertheless, the SNR-normalised Fisher lengths for the waveform difference (or the waveforms themselves in the EMRI case) give rough order-of-magnitude estimates for the optimal covariance lengths, and are more straightforward to obtain. These estimates may be used to provide initial guesses for the diagonal metric hyperparameters when maximising the log-hyperlikelihood with standard optimisation routines. Finally, we compare the optimal covariance lengths δλa to the unnor- malised Fisher metric lengths of the accurate waveform model, which for EM- RIs are given approximately by (δλFish)a/ρ through the above argument. The former lengths determine the maximal separation of points in the training set, while the latter describe the support of the accurate posterior probability den- sity. Since δλa & (δλFish)a, it follows for ρ > 1 that the lowest training-set density allowed (while ensuring interpolation optimality) will typically be far below the sampling density required to resolve the posterior surface. Hence the GPR approach should be viable for EMRIs, and its computational benefits are enhanced for sources with higher SNR. The validity of this simple argument is illustrated through one- and two- dimensional analyses of the marginalised likelihood in Sections 6.3.1 and 6.3.2 respectively. Kludge waveform models are used in this study, due to their availability and computational practicality. Waveforms generated from the NK 134 Gaussian process regression model in Section 4.2.2 are taken to be accurate, since they have high fidelity with Teukolsky-based waveforms up to an orbital separation of ≈ 5M . The latest version of the AAK in Section 4.3 is used as the approximate model; it is faster than the NK and matches its early phase evolution with high waveform overlap, but dephases gradually as the compact object approaches plunge. In both sections, the synthetic data x is an injected NK signal from a 101M stellar-mass black hole orbiting a 106M SMBH, with no detector noise added. The signal is two months long and sampled at 0.2 Hz, while the source dis- tance is adjusted such that the SNR is ρ = 30 with respect to an analytic ap- proximation for the (down-scoped) LISA noise power spectral density [176]. Other orbital parameters are chosen to ensure that the NK and AAK wave- forms have an overlap of ≈ 0.97, so as to investigate the scenario in which the approximate model is fairly accurate to begin with. As will be shown, EMRI parameter estimation is susceptible to significant systematic bias even at this level of theoretical error. For the given source and waveform parameters, a single evaluation of the NK likelihood takes 29.4 s on average, as compared to an average of 5.66 s per evaluation for the AAK likelihood. This disparity in computational cost is likely to be even greater in more realistic scenarios where, for example, the NK is used as the approximate model and the accurate waveforms are provided by a self-force model. In Section 6.3.1, we verify that the added cost of using the GPR marginalised likelihood over the standard likelihood with approximate waveforms is indeed proportional to the training-set size N , and with a small scaling coefficient (as mentioned in Section 6.2.1). 6.3.1 One-dimensional example In this section, the GPR marginalised likelihood (6.20) is used to estimate the compact-object mass of an EMRI with µtrue = 101M , assuming all other pa- rameters are known and fixed at their true values. The covariance metric on the corresponding parameter subspace has a single component gµµ, which is optimised through the maximisation of the log-hyperlikelihood (6.26). This one-dimensional example provides a simple illustration of the relationships §6.3 Application to extreme-mass-ratio inspirals 135 Figure 6.1: Plots of lnZ against lg (δµ/M ) for eight 10-point training sets with grid sizes −2.1 ≤ lg (∆µ/M ) ≤ −1.4 (indicated by the abscissae of the solid circles). The vertical dashed line corresponds to the Fisher metric length δµFish. between the optimal covariance length δµ := (gµµ)−1/2, the Fisher metric length δµFish := ((Γ)µµ) −1/2, and the training-set grid size ∆µ. The GPR model is trained on eight 10-point training sets with uniform grids, where the grid sizes are distributed in the range −2.1 ≤ lg (∆µ/M ) ≤ −1.4. This range is chosen to encompass the Fisher length of the unit-SNR waveform difference17, which is approximately constant across the spans of the considered training sets and evaluated at µtrue as lg (δµFish/M ) = −1.96. Each training set is placed such that µtrue lies at the geometric centre of its span (and thus maximally far from the nearest points in the set). Figure 6.1 shows plots of the log-hyperlikelihood for the eight training sets, with the optimal covariance length for each set given by the abscissa of the peak (where it exists). The optimal value δµ is approximately constant for all valid training sets, and falls in the narrow range lg (δµFish/M ) ≤ lg (δµ/M ) ≤ −1.8. In comparison to the approach of [180] described in Section 6.2.3, vary- 17 For comparison, the Fisher lengths of the unit-SNR NK and AAK waveforms at µtrue are lg (δµNKFish/M ) = −1.86 and lg (δµAAKFish /M ) = −1.85 respectively. 136 Gaussian process regression ing the density of the training set by fixing its cardinality and changing its span also shifts δµ by less than a factor of two, which demonstrates that both span and cardinality have less impact than density on a training set’s perfor- mance. Hyperlikelihood peaks emerge only for grid sizes lg (∆µ/M ) ≤ −1.8, indicating that 1/δµ determines a minimum threshold for the density of an op- timal training set. Finally, δµFish appears to set a lower bound on δµ, which is consistent with the discussion around (6.30) and (6.31). We now consider the marginalised likelihood itself for three other 10-point training sets. Firstly, a set DFish is placed around µtrue with grid size ∆µFish = δµFish; training the GPR model on this set yields an optimal covariance length lg (δµ/M ) = −1.82. Two more sets Dcov and D2cov are constructed in the same way, with the grid sizes ∆µcov = δµ and ∆µ2cov = 2δµ respectively. A different optimal covariance length is found for Dcov, while there is no hyperlikelihood peak forD2cov. Nevertheless, the performance of the marginalised likelihood is found to be practically constant across the range lg (δµFish/M ) ≤ lg (δµ/M ) ≤ −1.8, and so the Gaussian process is not retrained for Dcov and D2cov (i.e. the same value of δµ is used for all three training sets). At a source SNR of ρ = 30, a high overlap of ≈ 0.97 between the accurate and approximate waveforms still results in a 5-sigma bias due to theoretical error; as seen in Figure 6.2, the approximate likelihood Lapp is peaked away from µtrue with µ ≈ 3 × 10−3M , while the 1-sigma length for Lapp (and the accurate likelihood Lacc) is ≈ 5 × 10−4M . The marginalised likelihood with the training set DFish is virtually identical to Lacc, and is slightly broader with Dcov but remains peaked near the true value. For the sparsest training setD2cov, the peak of the marginalised likelihood has an error µ similar to that of Lapp, although it is sufficiently broadened to ensure that it is still consistent with µtrue at 2-sigma significance. The one-dimensional example in this section indicates that the GPR ap- proach might be viable for EMRIs, since even the densest considered training setDFish has a grid size that is significantly larger than the width of the accurate likelihood. By constructing a few additional training sets with different sizes N , the marginalised likelihood is found to take an average of 5.66 + 0.013N s per evaluation, i.e. ≈ 2% longer than the approximate likelihood for a 10-point §6.3 Application to extreme-mass-ratio inspirals 137 Figure 6.2: One-dimensional likelihood plots for the standard likelihood with accurate and approximate waveforms, and the marginalised likelihood with the training sets DFish, Dcov and D2cov. The only training-set points within the horizontal plot range belong to the densest setDFish, and are indicated by thick marks on the horizontal axis. 138 Gaussian process regression training set. However, it provides no computational savings over the accurate likelihood for large training sets (with & 2000 points in this particular case), which may restrict its usefulness for higher-dimensional problems. This lim- itation, as well as the generally poor scaling of GPR with dimensionality, is discussed further in Sections 6.3.2 and 6.4. 6.3.2 Two-dimensional example In this section, the GPR marginalised likelihood (6.20) is used to estimate the component masses of an EMRI with (µ,M)true = (101, 106)M , again assuming all other parameters are known and fixed at their true values. Maximisation of the log-hyperlikelihood (6.26) is now over the three independent compo- nents (gµµ, gMM , gµM) of the covariance metric on the two-dimensional param- eter subspace. The eigensystem {(λi, vˆi) | i = 1, 2} of g defines a covariance ellipse with semi-principal axes {λ−1/2i vˆi} in the usual way. Although the component-mass subspace is chosen for the two-dimensional example here, a straightforward search in (µ,M) is not necessarily optimal in the context of higher-dimensional parameter estimation. For example, since the central mass M ≈ µ + M strongly determines the characteristic frequency of an EMRI waveform, variations in the waveform difference with respect to M may be reduced by rescaling the time coordinate to dimensionless time t/M . This approach has been investigated, and yields longer (by an order of magni- tude or so) covariance lengths as expected. However, it also results in less sta- ble derivatives and poorer interpolation; this is likely because the AAK model maps M to an unphysical mass M˜ that is then evolved through fits (see Sec- tion 4.3.2), and so its waveforms vary differently from the NK waveforms with respect to M . If more accurate models are used, the waveform difference will have an infinite covariance length in total mass, such that rescaling the time coordinate by µ+M reduces the component-mass subspace to a single degree of freedom (e.g. the mass ratio µ/M ). Three different training sets are considered for the (µ,M) example in this section. The first is a (6×6)-point setDFish with (µ,M)true lying at the geometric centre of its span; its points are placed uniformly on a grid defined by the semi- §6.3 Application to extreme-mass-ratio inspirals 139 principal axes {λ−1/2i vˆi}Fish of the Fisher metric ellipse (where {(λi, vˆi)}Fish is the eigensystem of Γ for the unit-SNR waveform difference). Two more (10× 10)-point training sets Ddense and Dsparse are constructed on rectangular grids, with the grid sizes given by the short and long Fisher lengths respectively, i.e. (∆µ,∆M)dense := ( 1√ (Γ)µµ , 1√ (Γ)MM ) , (6.32) (∆µ,∆M)sparse := (√ (Γ−1 )µµ, √ (Γ−1 )MM ) . (6.33) As justified in Section 6.3.1, the GPR model is trained on a single training set (Ddense in this case), and the same optimal covariance ellipse is subsequently used for all three sets. The relative placement of points in the three training sets is shown in Figure 6.3, along with the covariance and Fisher ellipses. Both ellipses are aligned and the Fisher ellipse is slightly smaller, which is consistent with the discussion around (6.30) and (6.31). From the contour plots in Figure 6.4, the measurement of (µ,M) with the approximate likelihood Lapp has a theoretical error of (µ,M) ≈ (2 × 10−3, 6 × 100)M , and excludes (µ,M)true at beyond 2-sigma significance. The marginalised likelihood with the training set Ddense is virtually identical to the accurate likelihood Lacc; so too is the likelihood for DFish, which is sparser and contains fewer points. More surprisingly, the training set Dsparse also yields a likelihood that is very similar to Lacc, which indicates that a training-set den- sity no lower than that corresponding to the long Fisher metric lengths (i.e. the half extents of the unit-SNR Fisher ellipse in each parameter) will still be opti- mal on the level of the marginalised likelihood. However, it may be difficult to learn the optimal covariance metric from such a training set if it is too sparse or contains too few points. It is clear that a simple rectangular grid approach to the placement of train- ing set points will not scale well with the dimensionality ` of the parameter space, but uniform placement on a grid defined by the Fisher metric eigensys- tem is also limited at moderately large `. From our one- and two-dimensional studies, as well as preliminary investigations of a three-dimensional problem, 140 Gaussian process regression Figure 6.3: Training-set point placement around (µ,M)true for Ddense (dots), DFish (triangles) and Dsparse (squares). The grid for DFish is defined by the semi-principal axes of the Fisher metric ellipse (green), and is aligned with the optimal covariance ellipse (black) learnt from Ddense. The central grey square corresponds to the plot range of Figure 6.4. §6.3 Application to extreme-mass-ratio inspirals 141 Figure 6.4: Two-dimensional likelihood contour plots for the standard likeli- hood with accurate and approximate waveforms, and the marginalised likeli- hood with the training sets Ddense, DFish and Dsparse. All contours are 2-sigma. The only training-set points within the plot range belong to Ddense, and are indicated by solid circles. 142 Gaussian process regression six points along each eigendirection appears to be the bare minimum for learn- ing a covariance metric that well describes the waveform difference locally. This necessitates O(N3) operations on a 6` × 6` covariance matrix in the train- ing stage, which is computationally challenging for ` > 5. If a suitable covariance metric can be obtained, the actual set of waveform differences used in the interpolation stage does not have to be quite as large or dense as the set used to train the GPR model. A set with just three points along each eigendirection becomes unfeasible at ` > 8, since the marginalised likelihood is likely to be more expensive than the accurate likelihood for N ∼ 104; still, this is enough to estimate the seven intrinsic EMRI parameters (see Section 4.2). In general, the dimensionality problem for the GPR approach might be tackled by reducing the computational cost of operations on the N × N covariance matrix, or by reducing N itself while ensuring the training set is functional. Possible methods for achieving these are beyond the scope of the present work, but are discussed in the concluding remarks of Section 6.4. 6.4 Discussion In this chapter, we have introduced the GPR marginalised likelihood and its applications in GW data analysis, with a focus on investigating its viability for EMRI parameter estimation through low-dimensional studies. Even in the considered scenario where the waveform model used for parameter estima- tion has a ≈ 0.97 match with the source signal at the true parameter values, significant systematic bias from theoretical error will still arise for high-SNR (ρ & 30) sources. The GPR approach is shown to mitigate this bias, and hence to be suitable for improving the accuracy of EMRI parameter estimation (albeit in highly localised regions of parameter space). The performance of the marginalised likelihood is strongly dependent on its precomputed training set of waveform differences, which relies on the ex- istence of a more accurate waveform model that reproduces the source signal with high fidelity. For the method to be practical, the density of the training set must be significantly lower than the resolution of a posterior search with the accurate waveforms. This is shown to be the case for EMRIs through a simple §6.4 Discussion 143 argument, and is verified with one- and two-dimensional examples. Another key result in these sections is a demonstration of how the Fisher information matrix of the waveform difference (normalised to unit SNR) may be used to inform the placement of training-set points, as well as to estimate a covariance metric that describes the waveform difference locally. While the marginalised likelihood shows early promise for EMRI param- eter estimation, it is akin to other applications of GPR in being subject to the curse of dimensionality. The number of training-set points required to search an `-dimensional parameter subspace grows exponentially with `, which hin- ders not just the offline training stage (since the covariance matrix is larger and more ill-conditioned), but also the online interpolation stage (where a new lin- ear combination of points is computed for each likelihood evaluation). One possible approach to these computational problems is to replace the squared-exponential covariance function (6.24) in the GPR model with a co- variance function that has compact support on parameter space (e.g. the Wendland polynomials [190]), such that the covariance matrix becomes sparse. Iterative methods [191] may then be used to accelerate the Cholesky decompo- sition of K in (6.16), (6.17) and (6.26). A compact-support covariance function also reduces the number of training-set points summed in (6.16), which di- rectly determines the evaluation speed of the marginalised likelihood. Another strategy is to minimise the size of the training set itself. As seen throughout Section 6.3, the covariance metric that is learnt during the training stage also determines a fixed threshold for the density of a training set that functions optimally at the interpolation stage. However, it may be possible to lower this threshold density through reparametrisation or dimensionality re- duction methods, e.g. the component-mass example discussed in Section 6.3.2. In the case of parameters for which this is not feasible, the number of training- set points used to cover the region of relevance may still be reduced through a non-uniform placement of points, or non-geometric prescriptions such as stochastic placement algorithms [192–194]. Full coverage of the search region with a precomputed training set might not even be necessary; one possibility is to use a smaller “moving” set that is updated adaptively as the marginalised likelihood is sampled with Markov-chain-Monte-Carlo methods. 144 Gaussian process regression Chapter 7 Conclusion The historic first direct detection of GWs by Advanced LIGO at the end of 2015 [8] was but the opening event in the quick succession of exciting developments that have since unfolded in the field. Within the short span of 18 months, we have witnessed a second confirmed black-hole binary merger [9] and the unequivocal success of the LISA Pathfinder proof-of-technology mission [18], while the outlook for an impending detection with pulsar timing arrays is also much improved [20]. Against the backdrop of this rapidly evolving landscape, it is hoped that the work presented in this dissertation constitutes a modest but timely contribution to the burgeoning field of GW astronomy. In Chapter 3, we have found that Einstein–Maxwell interactions between far-field GWs and electromagnetic fields are likely too weak for the practical purposes of GW detection and parameter estimation, except under conditions that might be contrived or extreme by astrophysical standards (e.g. those con- sidered in Section 3.4). A characterisation of the electromagnetic perturbations driven by strong-field gravitational radiation should yield effects that are more pronounced, and will allow the modelling of GW sources in greater detail. However, EMWs induced by GWs in the strong-field regime are restricted to gravitational frequency bands (. 103 Hz) [96, 123], which presents the same difficulties for direct detection as mentioned in Section 3.4.1. The framework in Section 3.3 was originally intended as an intermedi- ate step towards adapting the 1+3 approach for strong-field calculations, but turned out to be analytically intractable even for the far-field problem in spher- ical coordinates, due to the mixing of spherical harmonic modes by the tensor– vector contractions in (3.38) and (3.39). An alternative research direction in 145 146 Conclusion the area of Einstein–Maxwell interactions would be to unify or extend exist- ing semi-analytical methods [96, 97, 123] for Schwarzschild/Kerr spacetimes. Unification should be achievable in principle, as a common (and perhaps in- evitable) trait of such methods is a decomposition of the coupled fields into secondary variables distributed among the axial [47] and polar [48] sectors. The work presented in Chapter 4 is of more pressing interest to the GW community, due the scarcity of accurate and computationally efficient mod- els that describe the information-rich waveforms from EMRIs. We have de- veloped an augmented variant of a well-known kludge waveform model; the new AAK model retains the speed of its predecessor, while matching the phase evolution of more accurate kludges over a significant fraction of the inspiral. With a (meta)stable version of the model released online as part of a kludge software suite [135], AAK waveforms will hopefully see widespread use in the next round of mock LISA data challenges. EMRIs are an attractive research topic due to the inherent challenges they pose for source modelling and data analysis, and there is much that remains to be done in terms of populating the EMRI waveform inventory before the LISA mission design is finalised at the end of this decade. The immediate improvements that must be made to the AAK model are workaround solutions to the problems mentioned in Section 4.4. For example, the ill-defined nature of (4.37)–(4.39) at plunge may be resolved (inelegantly) by simply starting or stopping the trajectory integration as close to plunge as numerically allowed. Other modifications will involve streamlining the model to make it as fast and robust as possible for the mock data challenges, e.g. the development of an optimised method that detects and handles plunge with a minimal amount of computational overhead, or the improvement of the fitting algorithms so as to reduce the amount of tuning presently required. Another important research direction in the area of kludge waveform models is to bridge the present disconnect between data analysis work and the gravitational self-force programme. The most accurate kludges currently available are fitted to Teukolsky-based models, and hence account only for the adiabatic, dissipative first-order self-force; a logical next step is to incorporate the subleading “post-adiabatic” self-force phenomenon of resonances. Efforts 147 have been made to explore the prevalence of resonance jumps [195] and their impact on EMRI detection rates [196], but the effect of resonances on the wave- forms themselves has yet to be studied. It is also crucial to set up a channel enabling kludges to be informed by higher-order self-force results once they become available. Early steps to- wards this goal have been made for an EMRI with a non-spinning central black hole [68]; in this work, the compact object’s orbit is described as an evolving geodesic whose parameters are governed by forced equations of mo- tion, and it is explicitly computed by supplying a model for the self-force on Schwarzschild orbits. A similar result may now be achieved for spinning black holes, by using pieces that have recently become available: the forced equa- tions of motion for a Kerr black hole [67] and the self-force on an eccentric, equatorial Kerr orbit [197]. The third major theme of this doctoral thesis is the development of im- proved techniques for GW data analysis, with the two overlapping but quali- tatively different procedures of detection and parameter estimation covered in Chapters 5 and 6 respectively. Future research directions under this theme are the most open-ended, as they typically involve the application of new statis- tical or computational methods to data analysis, as well as the study of their efficacy using realistic source signals and detector noise. Various families of tunable compression schemes for GW template banks have been proposed in Chapter 5. In particular, the optimised partition of a template bank into non-overlapping sets performs better than other consid- ered groupings with intersection (which are still interesting from a purely com- binatorial perspective), and also yields moderate gains over a straightforward coarsening of the bank. By investigating the features of the method on a bank of ∼ 104 PN waveforms, the first step towards advancing its usage in GW detection has been taken; however, the development of a practical implemen- tation and a more extensive comparison to existing orthogonal-decomposition methods [164–166] is necessary for its adoption. Nevertheless, the generality of these compression schemes means that the method need not stand on its own, and may instead be integrated with existing algorithms for further controlled compression (if the benefits are found to be worthwhile). 148 Conclusion In Chapter 6, we have found through low-dimensional parameter estima- tion studies that the GPR marginalised likelihood might be viable for improv- ing the accuracy of searches in highly localised regions of EMRI parameter space. Possible strategies for the generalisation of the method to higher di- mensions are discussed in Section 6.4. Apart from working towards realistic parameter estimation in all dimensions of the full (or at least intrinsic) param- eter space, another target is to ensure that the highly localised prior support required for the likelihood to be used in sampling algorithms is compatible with the rough localisation provided by detection-stage searches. Methods that address the curse of dimensionality might simultaneously achieve this, but the problem can also be tackled from the detection end, e.g. by employing a semi-coherent search with short segments of AAK templates that are highly accurate over their duration, such that better localisation accuracy is obtained. There are other interesting research directions associated with developing the GPR marginalised likelihood: for example, extending it in a physically mo- tivated way to account for the varying degree of error that is expected across different frequency bins. The more general method of GPR waveform interpo- lation also has possible applications that are unrelated to parameter estimation, such as the smooth combination of waveform models incorporating different features (e.g. a non-spinning eccentric model for comparable-mass mergers and a circular one with spin). Appendix A Induced Maxwell field solution For ρ sec θ  1 and homogeneous initial conditions, the solution to (3.34)– (3.37) with the background GW in (3.40), the free EMW in (3.55) and (3.56), and the induced electromagnetic field ansatz in (3.57) and (3.58) is given by E (±)a = hE(0)ξ(±)(t)eiψP (±)a , (A.1) B(±)a = hE(0)ξ(±)(t)eiψQ(±)a , (A.2) where ξ(±)(t) = m(±) exp (−i(n± k)t) −m(±) cos (m(±)t) + i(n± k) sin (m(±)t), (A.3) P (±) 0 = Q (±) 0 = 0, (A.4) 149 150 Induced Maxwell field solution P (±) 1 = i sin2 (θ/2) 8m(±)(k ± n−m(±))(k ± n+m(±)) × [2n2 sin (2α− γ − 3φ) + 6n2 sin (2α + γ − 3φ) −n2 sin (2α− γ − 2θ − 3φ) + n2 sin (2α + γ − 2θ − 3φ) +4n2 sin (2α + γ − θ − 3φ) + 4n2 sin (2α + γ + θ − 3φ) −n2 sin (2α− γ + 2θ − 3φ) + n2 sin (2α + γ + 2θ − 3φ) −2(8k2 + 10kn+ 3n2) sin (2α− γ − φ) −2n(2k + n) sin (2α + γ − φ)− n2 sin (2α− γ − 2θ − φ) +n2 sin (2α + γ − 2θ − φ)− 2n(k + 2n) sin (2α− γ − θ − φ) −2kn sin (2α + γ − θ − φ)− 2n(k + 2n) sin (2α− γ + θ − φ) −2kn sin (2α + γ + θ − φ)− n2 sin (2α− γ + 2θ − φ) +n2 sin (2α + γ + 2θ − φ)] , (A.5) P (±) 2 = i sin2 (θ/2) 8m(±)(k ± n−m(±))(k ± n+m(±)) × [2n2 cos (2α− γ − 3φ) + 6n2 cos (2α + γ − 3φ) −n2 cos (2α− γ − 2θ − 3φ) + n2 cos (2α + γ − 2θ − 3φ) +4n2 cos (2α + γ − θ − 3φ) + 4n2 cos (2α + γ + θ − 3φ) −n2 cos (2α− γ + 2θ − 3φ) + n2 cos (2α + γ + 2θ − 3φ) +2(8k2 + 10kn+ 3n2) cos (2α− γ − φ) +2n(2k + n) cos (2α + γ − φ) + n2 cos (2α− γ − 2θ − φ) −n2 cos (2α + γ − 2θ − φ) + 2n(k + 2n) cos (2α− γ − θ − φ) +2kn cos (2α + γ − θ − φ) + 2n(k + 2n) cos (2α− γ + θ − φ) +2kn cos (2α + γ + θ − φ) + n2 cos (2α− γ + 2θ − φ) −n2 cos (2α + γ + 2θ − φ)] , (A.6) 151 P (±) 3 = in sin θ 8m(±)(k ± n−m(±))(k ± n+m(±)) × [2(3k + n) sin (2α− γ − 2φ) + 2(k − n) sin (2α + γ − 2φ) −3k sin (2α− γ − θ − 2φ) + k sin (2α + γ − θ − 2φ) −3k sin (2α− γ + θ − 2φ) + k sin (2α + γ + θ − 2φ) −n sin (2α− γ + 2θ − 2φ) + n sin (2α + γ + 2θ − 2φ) −n sin (2α− γ − 2θ − 2φ) + n sin (2α + γ − 2θ − 2φ)] , (A.7) Q (±) 1 = i sin2 (θ/2) 8m(±)(k ± n−m(±))(k ± n+m(±)) × [2n2 cos (2α− γ − 3φ)− 6n2 cos (2α + γ − 3φ) −n2 cos (2α− γ − 2θ − 3φ)− n2 cos (2α + γ − 2θ − 3φ) −4n2 cos (2α + γ − θ − 3φ)− 4n2 cos (2α + γ + θ − 3φ) −n2 cos (2α− γ + 2θ − 3φ)− n2 cos (2α + γ + 2θ − 3φ) −2(8k2 + 10kn+ 3n2) cos (2α− γ − φ) +2n(2k + n) cos (2α + γ − φ)− n2 cos (2α− γ − 2θ − φ) −n2 cos (2α + γ − 2θ − φ)− 2n(k + 2n) cos (2α− γ − θ − φ) +2kn cos (2α + γ − θ − φ)− 2n(k + 2n) cos (2α− γ + θ − φ) +2kn cos (2α + γ + θ − φ)− n2 cos (2α− γ + 2θ − φ) −n2 cos (2α + γ + 2θ − φ)] , (A.8) 152 Induced Maxwell field solution Q (±) 2 = i sin2 (θ/2) 8m(±)(k ± n−m(±))(k ± n+m(±)) × [−2n2 sin (2α− γ − 3φ) + 6n2 sin (2α + γ − 3φ) +n2 sin (2α− γ − 2θ − 3φ) + n2 sin (2α + γ − 2θ − 3φ) +4n2 sin (2α + γ − θ − 3φ) + 4n2 sin (2α + γ + θ − 3φ) +n2 sin (2α− γ + 2θ − 3φ) + n2 sin (2α + γ + 2θ − 3φ) −2(8k2 + 10kn+ 3n2) sin (2α− γ − φ) +2n(2k + n) sin (2α + γ − φ)− n2 sin (2α− γ − 2θ − φ) −n2 sin (2α + γ − 2θ − φ)− 2n(k + 2n) sin (2α− γ − θ − φ) +2kn sin (2α + γ − θ − φ)− 2n(k + 2n) sin (2α− γ + θ − φ) +2kn sin (2α + γ + θ − φ)− n2 sin (2α− γ + 2θ − φ) −n2 sin (2α + γ + 2θ − φ)] , (A.9) Q (±) 3 = in sin θ 8m(±)(k ± n−m(±))(k ± n+m(±)) × [2(3k + n) cos (2α− γ − 2φ)− 2(k − n) cos (2α + γ − 2φ) −3k cos (2α− γ − θ − 2φ)− k cos (2α + γ − θ − 2φ) −3k cos (2α− γ + θ − 2φ)− k cos (2α + γ + θ − 2φ) −n cos (2α− γ + 2θ − 2φ)− n cos (2α + γ + 2θ − 2φ) −n cos (2α− γ − 2θ − 2φ)− n cos (2α + γ − 2θ − 2φ)] , (A.10) with m(±) = (k2 + n2 ± 2kn cos θ)1/2. Appendix B Combinatorial design theory The problem of constructing a family of sets Um under the cardinality con- straints (5.31) and (5.32) in Section 5.2.3 may be regarded geometrically as the problem of constructing a collection ofN distinct points (representing template labels) and M distinct lines (representing sets) with the following properties: (i) each point lies on exactly R lines; (ii) each line passes through exactly C points; (iii) any two lines intersect at exactly I points; (iv) any two points lie on at most R− 1 lines. The final property is the automatic identification condition, i.e. no two labels are assigned to exactly the same subfamily of sets. The feasibility of carrying out such a construction (or finding additional conditions on N , M and R that ensure it is possible) is a difficult and unsolved problem in combinatorics. One special case that has been studied in detail is R = C and I = 1. This implies that N = M = R2 − R + 1, and that any two points must lie on exactly one line. Under these circumstances, the four geometrical properties define a finite projective plane of order R − 1 [170]. It is known that finite projective planes exist with prime orders [170], but there is no finite projective plane of order 6 [198] or 10 [199], while the existence (or otherwise) of an order-12 finite projective plane remains an open question. The special case of finite projective planes is uninteresting from a compression-scheme point of view, as it has N = M and hence achieves no 153 154 Combinatorial design theory compression. However, it strongly indicates that the conditions (5.31) and (5.32) are not sufficient to ensure the existence of a set construction with the four required properties. Nonetheless, valid set constructions have been found for small values of N , M and R; for example, (N,M,R) = (10, 6, 3) yields C = 5, I = 2, and the set construction U1 = {1, 2, 3, 4, 5} , U2 = {1, 2, 6, 7, 8} , U3 = {1, 3, 6, 9, 10} , U4 = {2, 5, 8, 9, 10} , U5 = {3, 4, 7, 8, 10} , U6 = {4, 5, 6, 7, 9} . (B.1) Additional solutions for (N,M,R) = (12, 9, 3) and (N,M,R) = (14, 7, 3) also exist. No counterexamples (i.e. values of (N,M,R) satisfying (5.31) and (5.32) but admitting no set construction) have been found for N > M , although we have not conducted an exhaustive search. A general compression scheme satisfying the conditions (5.31) and (5.32) might potentially admit more compression rates than the symmetric base scheme for each value of N . Given the difficulties in actually constructing the sets, however, we focus instead on the special case of “maximal represen- tation” for fixed M and R (i.e. every M -digit binary number with exactly R 1’s represents a distinct template label); this gives the binomial coefficient scheme described in Section 5.2.3. Appendix C Taylor-T2 waveform expansions The Taylor-T2 PN waveform (5.52) used in Section 5.4 describes the inspi- ral part of a circular and non-inclined comparable-mass binary coalescence [38, 174, 175]. With units restored, its amplitude and phase are written as ex- pansions in the frequency-related variable ξ := ( GM c3η3/5 φ˙ )2/3 , (C.1) with the orbital phase φ given to 3.5PN accuracy by φ = −1 η { τ 5/8 + ( 3715 8064 + 55 96 η ) τ 3/8 − 3 4 piτ 1/4 + ( 9275495 14450688 + 284875 258048 η + 1855 2048 η2 ) τ 1/8 + ( − 38645 172032 + 65 2048 η ) pi ln ( τ τ0 ) + [ 831032450749357 57682522275840 − 53 40 pi2 − 107 56 γ + 107 448 ln ( τ 256 ) + ( −126510089885 4161798144 + 2255 2048 pi2 ) η + 154565 1835008 η2 − 1179625 1769472 η3 ] τ−1/8 + ( 188516689 173408256 + 488825 516096 η − 141769 516096 η2 ) piτ−1/4 } , (C.2) 155 156 Taylor-T2 waveform expansions where γ is the Euler–Mascheroni constant. Here τ is a time-related variable, written in terms of the binary coalescence time tc as τ = c3η8/5 5GM(tc − t), (C.3) where we set tc = 1 yr for a ∼ 106M black-hole binary inspiral. The GW amplitude is then proportional to the 2PN amplitude function A = ξ ( 2 + 1 3 (−13 + η)ξ + 4piξ3/2 + 1 180 (−837− 635η + 15η2)ξ2 ) , (C.4) while the GW phase is twice the tail-distorted orbital phase ψ = φ− 3ξ3/2 ( 1− η 2 ξ ) ln ( ξ ξ0 ) , (C.5) with the 1PN factor of 1− (η/2)ξ included to account for the nonlinear interac- tion between the gravitational field of the source and its emitted gravitational radiation [200]. Finally, the constant frequency in ξ0 is set to φ˙0 = 10−4pi, which corresponds to an approximate entry frequency of 10−4 Hz for LISA. Bibliography [1] A. Einstein. On gravitational waves. Proceedings of the Prussian Academy of Sciences, 1918:154, 1918. [2] R. A. Hulse and J. H. Taylor. Discovery of a pulsar in a binary system. The Astrophysical Journal, 195:L51, 1975. [3] I. H. Stairs. Testing general relativity with pulsar timing. Living Reviews in Relativity, 6:5, 2003. [4] A. G. Lyne et al. A double-pulsar system: A rare laboratory for relativis- tic gravity and plasma physics. Science, 303:1153, 2004. [5] J. Weber. Detection and generation of gravitational waves. Physical Re- view, 117:306, 1960. [6] W. H. Press and K. S. Thorne. Gravitational-wave astronomy. Annual Review of Astronomy and Astrophysics, 10:335, 1972. [7] G. M. Harry (for the LIGO Scientific Collaboration). Advanced LIGO: The next generation of gravitational wave detectors. Classical and Quan- tum Gravity, 27:084006, 2010. [8] B. P. Abbott et al. Observation of gravitational waves from a binary black hole merger. Physical Review Letters, 116:061102, 2016. [9] B. P. Abbott et al. GW151226: Observation of gravitational waves from a 22-Solar-mass binary black hole coalescence. Physical Review Letters, 116:241103, 2016. [10] B. P. Abbott et al. Binary black hole mergers in the first Advanced LIGO observing run. Physical Review X, 6:041015, 2016. 157 158 BIBLIOGRAPHY [11] F. Acernese et al. Advanced Virgo: A second-generation interferometric gravitational wave detector. Classical and Quantum Gravity, 32:024001, 2015. [12] K. Somiya (for the KAGRA Collaboration). Detector configuration of KAGRA—the Japanese cryogenic gravitational-wave detector. Classical and Quantum Gravity, 29:124007, 2012. [13] K. Danzmann et al. LISA: Laser Interferometer Space Antenna. 2017. www.elisascience.org/files/publications/LISA L3 20170120.pdf. [14] P. Amaro-Seoane et al. The gravitational universe. 2013. arXiv:1305.5720 [astro-ph.CO]. [15] S. Kawamura et al. The Japanese space gravitational wave antenna: DE- CIGO. Classical and Quantum Gravity, 28:094011, 2011. [16] J. Luo et al. TianQin: A space-borne gravitational wave detector. Classical and Quantum Gravity, 33:035010, 2016. [17] K. Danzmann (for the LISA Pathfinder Team and the eLISA Consor- tium). LISA and its pathfinder. Nature Physics, 11:613, 2015. [18] M. Armano et al. Sub-femto-g free fall for space-based gravitational wave observatories: LISA Pathfinder results. Physical Review Letters, 116:231101, 2016. [19] G. Hobbs et al. The International Pulsar Timing Array project: Using pulsars as a gravitational wave detector. Classical and Quantum Gravity, 27:084013, 2010. [20] S. R. Taylor et al. Are we there yet? Time to detection of nanohertz grav- itational waves based on pulsar-timing array limits. The Astrophysical Journal Letters, 819:L6, 2016. [21] A. Sesana. Prospects for multiband gravitational-wave astronomy after GW150914. Physical Review Letters, 116:231102, 2016. BIBLIOGRAPHY 159 [22] C. W. Misner, K. S. Thorne, and J. A. Wheeler. Gravitation. W. H. Freeman, 1973. [23] K. S. Thorne. Multipole expansions of gravitational radiation. Reviews of Modern Physics, 52:299, 1980. [24] C. Cutler and K. S. Thorne. An overview of gravitational-wave sources. In N. Bishop and S. D. Maharaj, editors, Proceedings of the 16th Interna- tional Conference on General Relativity and Gravitation. World Scientific, 2002. [25] B. S. Sathyaprakash and B. F. Schutz. Physics, astrophysics and cosmol- ogy with gravitational waves. Living Reviews in Relativity, 12:2, 2009. [26] C. J. Moore, R. H. Cole, and C. P. L. Berry. Gravitational-wave sensitivity curves. Classical and Quantum Gravity, 32:015014, 2015. [27] T. Creighton. Advanced LIGO: Sources and astrophysics. Classical and Quantum Gravity, 20:S853, 2003. [28] P. Amaro-Seoane et al. Low-frequency gravitational-wave science with eLISA/NGO. Classical and Quantum Gravity, 29:124016, 2012. [29] C. D. Ott. The gravitational-wave signature of core-collapse supernovae. Classical and Quantum Gravity, 26:063001, 2009. [30] P. D. Lasky. Gravitational waves from neutron stars: A review. Publica- tions of the Astronomical Society of Australia, 32:e034, 2015. [31] T. Damour and A. Vilenkin. Gravitational wave bursts from cosmic strings. Physical Review Letters, 85:3761, 2000. [32] L. J. Rubbo, K. Holley-Bockelmann, and L. S. Finn. Event rate for extreme mass ratio burst signals in the Laser Interferometer Space Antenna band. The Astrophysical Journal Letters, 649:L25, 2006. [33] P. D. Lasky et al. Gravitational-wave cosmology across 29 decades in frequency. Physical Review X, 6:011035, 2016. 160 BIBLIOGRAPHY [34] C. Caprini et al. Science with the space-based interferometer eLISA, II: Gravitational waves from cosmological phase transitions. Journal of Cos- mology and Astroparticle Physics, 04(2016):001, 2016. [35] P. C. Peters and J. Mathews. Gravitational radiation from point masses in a Keplerian orbit. Physical Review, 131:435, 1963. [36] P. C. Peters. Gravitational radiation and the motion of two point masses. Physical Review, 136:B1224, 1964. [37] L. Blanchet and T. Damour. Post-Newtonian generation of gravitational waves. Annales de l’Institut Henri Poincare´, 50:377, 1989. [38] L. Blanchet. Gravitational radiation from post-Newtonian sources and inspiralling compact binaries. Living Reviews in Relativity, 17:2, 2014. [39] J. Centrella, J. G. Baker, B. J. Kelly, and J. R. van Meter. Black-hole bi- naries, gravitational waves, and numerical relativity. Reviews of Modern Physics, 82:3069, 2010. [40] E. T. Newman and R. Penrose. An approach to gravitational radiation by a method of spin coefficients. Journal of Mathematical Physics, 3:566, 1962. [41] E. T. Newman and R. Penrose. Note on the Bondi–Metzner–Sachs group. Journal of Mathematical Physics, 7:863, 1966. [42] F. Pretorius. Evolution of binary black-hole spacetimes. Physical Review Letters, 95:121101, 2005. [43] M. Campanelli, C. O. Lousto, P. Marronetti, and Y. Zlochower. Accu- rate evolutions of orbiting black-hole binaries without excision. Physical Review Letters, 96:111101, 2006. [44] J. G. Baker et al. Gravitational-wave extraction from an inspiralling con- figuration of merging black holes. Physical Review Letters, 96:111102, 2006. BIBLIOGRAPHY 161 [45] The SXS Collaboration. SXS Gravitational Waveform Database, 2016. www.black-holes.org/waveforms. [46] K. D. Kokkotas and B. G. Schmidt. Quasi-normal modes of stars and black holes. Living Reviews in Relativity, 2:2, 1999. [47] T. Regge and J. A. Wheeler. Stability of a Schwarzschild singularity. Phys- ical Review, 108:1063, 1957. [48] F. J. Zerilli. Effective potential for even-parity Regge–Wheeler gravita- tional perturbation equations. Physical Review Letters, 24:737, 1970. [49] S. A. Teukolsky. Rotating black holes: Separable wave equations for gravitational and electromagnetic perturbations. Physical Review Letters, 29:1114, 1972. [50] E. W. Leaver. An analytic representation for the quasi-normal modes of Kerr black holes. Proceedings of the Royal Society A, 402:285, 1985. [51] A. Buonanno and T. Damour. Effective one-body approach to general relativistic two-body dynamics. Physical Review D, 59:084006, 1999. [52] Y. Pan et al. Inspiral–merger–ringdown waveforms of spinning, pre- cessing black-hole binaries in the effective-one-body formalism. Physical Review D, 89:084006, 2014. [53] P. Ajith et al. Inspiral–merger–ringdown waveforms for black-hole bina- ries with nonprecessing spins. Physical Review Letters, 106:241101, 2011. [54] M. Hannam et al. Simple model of complete precessing black-hole- binary gravitational waveforms. Physical Review Letters, 113:151101, 2014. [55] N. Yunes et al. Extreme mass-ratio inspirals in the effective-one-body approach: Quasicircular, equatorial orbits around a spinning black hole. Physical Review D, 83:044044, 2011. 162 BIBLIOGRAPHY [56] P. Amaro-Seoane et al. Intermediate and extreme mass-ratio inspirals: Astrophysics, science applications and detection using LISA. Classical and Quantum Gravity, 24:R113, 2007. [57] L. Barack and C. Cutler. LISA capture sources: Approximate waveforms, signal-to-noise ratios, and parameter estimation accuracy. Physical Re- view D, 69:082005, 2004. [58] S. Babak et al. “Kludge” gravitational waveforms for a test-body orbiting a Kerr black hole. Physical Review D, 75:024005, 2007. [59] A. J. K. Chua and J. R. Gair. Improved analytic extreme-mass-ratio in- spiral model for scoping out eLISA data analysis. Classical and Quantum Gravity, 32:232002, 2015. [60] L. Barack. Gravitational self-force in extreme mass-ratio inspirals. Clas- sical and Quantum Gravity, 26:213001, 2009. [61] E. Poisson, A. Pound, and I. Vega. The motion of point particles in curved spacetime. Living Reviews in Relativity, 14:7, 2011. [62] S. A. Hughes. Evolution of circular, nonequatorial orbits of Kerr black holes due to gravitational-wave emission, II: Inspiral trajectories and gravitational waveforms. Physical Review D, 64:064004, 2001. [63] S. Drasco and S. A. Hughes. Gravitational wave snapshots of generic extreme mass ratio inspirals. Physical Review D, 73:024027, 2006. [64] E´. E´. Flanagan and T. Hinderer. Transient resonances in the inspirals of point particles into black holes. Physical Review Letters, 109:071102, 2012. [65] T. Hinderer and E´. E´. Flanagan. Two-timescale analysis of extreme mass ratio inspirals in Kerr spacetime: Orbital motion. Physical Review D, 78:064028, 2008. [66] E. A. Huerta and J. R. Gair. Influence of conservative corrections on parameter estimation for extreme-mass-ratio inspirals. Physical Review D, 79:084021, 2009. BIBLIOGRAPHY 163 [67] J. R. Gair et al. Forced motion near black holes. Physical Review D, 83:044037, 2011. [68] N. Warburton et al. Evolution of inspiral orbits around a Schwarzschild black hole. Physical Review D, 85:061501(R), 2012. [69] A. Ori and K. S. Thorne. Transition from inspiral to plunge for a compact body in a circular equatorial orbit around a massive, spinning black hole. Physical Review D, 62:124022, 2000. [70] C. O. Lousto, H. Nakano, Y. Zlochower, and M. Campanelli. Intermediate-mass-ratio black-hole binaries: Numerical relativity meets perturbation theory. Physical Review Letters, 104:211101, 2010. [71] E. A. Huerta and J. R. Gair. Intermediate-mass-ratio inspirals in the Ein- stein Telescope, I: Signal-to-noise ratio calculations. Physical Review D, 83:044020, 2011. [72] P. Jaranowski and A. Kro´lak. Analysis of gravitational-wave data. Cam- bridge University Press, 2009. [73] P. Jaranowski and A. Kro´lak. Gravitational-wave data analysis: Formal- ism and sample applications. Living Reviews in Relativity, 15:4, 2012. [74] M. Punturo et al. The Einstein Telescope: A third-generation gravita- tional wave observatory. Classical and Quantum Gravity, 27:194002, 2010. [75] J. Crowder and N. J. Cornish. Beyond LISA: Exploring future gravita- tional wave missions. Physical Review D, 72:083005, 2005. [76] T. J. W. Lazio. The Square Kilometre Array pulsar timing array. Classical and Quantum Gravity, 30:224011, 2013. [77] M. Maggiore. Gravitational waves, Volume 1: Theory and experiments. Ox- ford University Press, 2007. [78] O. D. Aguiar. Past, present and future of the resonant-mass gravitational wave detectors. Research in Astronomy and Astrophysics, 11:1, 2010. 164 BIBLIOGRAPHY [79] C. Cutler. Angular resolution of the LISA gravitational wave detector. Physical Review D, 57:7089, 1998. [80] T. A. Apostolatos, C. Cutler, G. J. Sussman, and K. S. Thorne. Spin- induced orbital precession and its modulation of the gravitational wave- forms from merging binaries. Physical Review D, 49:6274, 1994. [81] T. A. Apostolatos. Search templates for gravitational waves from pre- cessing, inspiraling binaries. Physical Review D, 52:605, 1995. [82] B. F. Schutz. Data processing, analysis, and storage for interferometric antennas. In D. G. Blair, editor, The detection of gravitational waves. Cam- bridge University Press, 1989. [83] B. J. Owen. Search templates for gravitational waves from inspiraling binaries: Choice of template spacing. Physical Review D, 53:6749, 1996. [84] D. George and E. A. Huerta. Deep neural networks to enable real-time multimessenger astrophysics. 2017. arXiv:1701.00008[astro-ph.IM]. [85] P. Addesso et al. Compressed sensing for time-frequency gravitational wave data analysis. 2016. arXiv:1605.03496[astro-ph.IM]. [86] L. S. Finn. Detection, measurement, and gravitational radiation. Physical Review D, 46:5236, 1992. [87] C. Cutler and E´. E´. Flanagan. Gravitational waves from merging compact binaries: How accurately can one extract the binary’s parameters from the inspiral waveform? Physical Review D, 49:2658, 1994. [88] N. Christensen and R. Meyer. Using Markov chain Monte Carlo meth- ods for estimating parameters with gravitational radiation data. Physical Review D, 64:022001, 2001. [89] C. Ro¨ver, R. Meyer, and N. Christensen. Coherent Bayesian inference on compact binary inspirals using a network of interferometric gravita- tional wave detectors. Physical Review D, 75:062004, 2007. BIBLIOGRAPHY 165 [90] F. Feroz, J. R. Gair, M. P. Hobson, and E. K. Porter. Use of the MULTI- NEST algorithm for gravitational wave data analysis. Classical and Quan- tum Gravity, 26:215003, 2009. [91] J. Crowder, N. J. Cornish, and J. L. Reddinger. LISA data analysis using genetic algorithms. Physical Review D, 73:063011, 2006. [92] A. J. K. Chua, P. Can˜izares, and J. R. Gair. Electromagnetic signatures of far-field gravitational radiation in the 1+3 approach. Classical and Quan- tum Gravity, 32:015011, 2015. [93] E. S. Phinney. Finding and using electromagnetic counterparts of gravi- tational wave sources: A white paper for the Astro2010 Decadal Review. 2009. arXiv:0903.0098[astro-ph.CO]. [94] I. Mandel and R. O’Shaughnessy. Compact binary coalescences in the band of ground-based gravitational-wave detectors. Classical and Quan- tum Gravity, 27:114007, 2010. [95] S. Ghosh and G. Nelemans. Localizing gravitational wave sources with optical telescopes and combining electromagnetic and gravitational wave data. In C. F. Sopuerta, editor, Gravitational wave astrophysics. Springer, 2015. [96] H. Sotani, K. D. Kokkotas, P. Laguna, and C. F. Sopuerta. Gravitationally driven electromagnetic perturbations of neutron stars and black holes. Physical Review D, 87:084018, 2013. [97] P. Pani, E. Berti, and L. Gualtieri. Scalar, electromagnetic, and gravita- tional perturbations of Kerr-Newman black holes in the slow-rotation limit. Physical Review D, 88:064048, 2013. [98] C. Palenzuela et al. Binary black holes’ effects on electromagnetic fields. Physical Review Letters, 103:081101, 2009. [99] M. E. Gertsenshteı˘n. Wave resonance of light and gravitational waves. Soviet Physics JETP, 14:84, 1962. 166 BIBLIOGRAPHY [100] Y. B. Zel’dovich. Electromagnetic and gravitational waves in a stationary magnetic field. Soviet Physics JETP, 38:652, 1974. [101] C. Barrabe`s and P. A. Hogan. Interaction of gravitational waves with magnetic and electric fields. Physical Review D, 81:064024, 2010. [102] F. I. Cooperstock. The interaction between electromagnetic and gravita- tional waves. Annals of Physics, 47:173, 1968. [103] D. M. Zipoy. Light fluctuations due to an intergalactic flux of gravita- tional waves. Physical Review, 142:825, 1966. [104] R. Fakir. Gravitational wave detection: A nonmechanical effect. The Astrophysical Journal, 418:202, 1993. [105] S. Kopeikin, P. Korobkov, and A. Polnarev. Propagation of light in the field of stationary and radiative gravitational multipoles. Classical and Quantum Gravity, 23:4299, 2006. [106] A. M. Cruise. An interaction between gravitational and electromagnetic waves. Monthly Notices of the Royal Astronomical Society, 204:485, 1983. [107] E. Montanari. On the propagation of electromagnetic radiation in the field of a plane gravitational wave. Classical and Quantum Gravity, 15:2493, 1998. [108] A. R. Prasanna and S. Mohanty. Gravitational wave-induced rotation of the plane of polarization of pulsar signals. Europhysics Letters, 60:651, 2002. [109] M. Halilsoy and O. Gurtug. Search for gravitational waves through the electromagnetic Faraday rotation. Physical Review D, 75:124021, 2007. [110] V. Faraoni. The rotation of polarization by gravitational waves. New Astronomy, 13:178, 2008. [111] R. Ragazzoni, G. Valente, and E. Marchetti. Gravitational wave detection through microlensing? Monthly Notices of the Royal Astronomical Society, 345:100, 2003. BIBLIOGRAPHY 167 [112] G. B. Lesovik, A. V. Lebedev, V. Mounutcharyan, and T. Martin. Detec- tion of gravity waves by phase modulation of the light from a distant star. Physical Review D, 71:122001, 2005. [113] G. F. R. Ellis. Relativistic cosmology. In R. K. Sachs, editor, Proceedings of the International School of Physics “Enrico Fermi”, Course 47, General rela- tivity and cosmology. Academic Press, 1971. [114] G. F. R. Ellis and H. van Elst. Cosmological models: Carge`se Lectures 1998. In M. Lachie`ze-Rey, editor, Proceedings of the NATO Advanced Study Institute on Theoretical and Observational Cosmology. Kluwer Academic Publishers, 1999. [115] C. G. Tsagas, A. Challinor, and R. Maartens. Relativistic cosmology and large-scale structure. Physics Reports, 465:61, 2008. [116] P. K. S. Dunsby, B. A. C. C. Bassett, and G. F. R. Ellis. Covariant analysis of gravitational waves in a cosmological context. Classical and Quantum Gravity, 14:1215, 1997. [117] M. Marklund, P. K. S. Dunsby, and G. Brodin. Cosmological electromag- netic fields due to gravitational wave perturbations. Physical Review D, 62:101501(R), 2000. [118] C. G. Tsagas, P. K. S. Dunsby, and M. Marklund. Gravitational wave amplification of seed magnetic fields. Physics Letters B, 561:17, 2003. [119] C. G. Tsagas. Gravitomagnetic amplification in cosmology. Physical Re- view D, 81:043501, 2010. [120] C. G. Tsagas. Gravitoelectromagnetic resonances. Physical Review D, 84:043524, 2011. [121] C. G. Tsagas. Electromagnetic fields in curved spacetimes. Classical and Quantum Gravity, 22:393, 2005. [122] C. A. Clarkson and R. K. Barrett. Covariant perturbations of Schwarzschild black holes. Classical and Quantum Gravity, 20:3855, 2003. 168 BIBLIOGRAPHY [123] C. A. Clarkson, M. Marklund, G. Betschart, and P. K. S. Dunsby. The elec- tromagnetic signature of black hole ring-down. The Astrophysical Journal, 613:492, 2004. [124] M. Marklund, G. Brodin, and P. K. S. Dunsby. Radio wave emissions due to gravitational radiation. The Astrophysical Journal, 536:875, 2000. [125] M. Servin and G. Brodin. Resonant interaction between gravitational waves, electromagnetic waves, and plasma flows. Physical Review D, 68:044017, 2003. [126] A. P. Kouretsis and C. G. Tsagas. Gravito-electromagnetic resonances in Minkowski space. Physical Review D, 88:044006, 2013. [127] J. D. Jackson. Classical electrodynamics. Wiley, 1998. [128] F. J. Dyson. Is a graviton detectable? Henri Poincare´ Prize Lecture 2012. 2013. publications.ias.edu/sites/default/files/poincare2012.pdf. [129] F-Y. Li, H. Wen, and Z-Y. Fang. High-frequency gravitational waves hav- ing large spectral densities and their electromagnetic response. Chinese Physics B, 22:120402, 2013. [130] S. A. Olausen and V. M. Kaspi. The McGill magnetar catalog. The Astro- physical Journal Supplement Series, 212:6, 2014. [131] F. Pacini. Rotating neutron stars, pulsars and supernova remnants. Na- ture, 219:145, 1968. [132] H. D. Falcke et al. A very brief description of LOFAR: The Low Fre- quency Array. Highlights of Astronomy, 14:386, 2006. [133] A. R. Taylor. The Square Kilometre Array. Proceedings of the International Astronomical Union Symposia and Colloquia, 291:339, 2012. [134] A. J. K. Chua. Augmented kludge waveforms and Gaussian process regression for EMRI data analysis. Journal of Physics: Conference Series, 716:012028, 2016. BIBLIOGRAPHY 169 [135] A. J. K. Chua and J. R. Gair. EMRI Kludge Suite, 2016. github.com/al- vincjk/EMRI Kludge Suite. [136] J. R. Gair, M. Vallisneri, S. L. Larson, and J. G. Baker. Testing general relativity with low-frequency, space-based gravitational-wave detectors. Living Reviews in Relativity, 16:7, 2013. [137] K. A. Arnaud et al. An overview of the Mock LISA Data Challenges. AIP Conference Proceedings, 873:619, 2006. [138] W. Schmidt. Celestial mechanics in Kerr spacetime. Classical and Quan- tum Gravity, 19:2743, 2002. [139] M. Sasaki and H. Tagoshi. Analytic black hole perturbation approach to gravitational radiation. Living Reviews in Relativity, 6:6, 2003. [140] E. Forseth, C. R. Evans, and S. Hopper. Eccentric-orbit extreme-mass- ratio inspiral gravitational wave energy fluxes to 7PN order. Physical Review D, 93:064058, 2016. [141] N. Sago and R. Fujita. Calculation of radiation reaction effect on orbital parameters in Kerr spacetime. Progress of Theoretical and Experimental Physics, 2015:073E03, 2015. [142] A. Buonanno, Y. Chen, and M. Vallisneri. Detection template families for gravitational waves from the final stages of binary-black-hole inspirals: Nonspinning case. Physical Review D, 67:024016, 2003. [143] B. M. Barker and R. F. O’Connell. Gravitational two-body problem with arbitrary masses, spins, and quadrupole moments. Physical Review D, 12:329, 1975. [144] V. A. Brumberg. Essential relativistic celestial mechanics. CRC Press, 1991. [145] W. Junker and G. Scha¨fer. Binary systems: Higher order gravitational radiation damping and wave emission. Monthly Notices of the Royal As- tronomical Society, 254:146, 1992. 170 BIBLIOGRAPHY [146] F. D. Ryan. Effect of gravitational radiation reaction on nonequatorial orbits around a Kerr black hole. Physical Review D, 53:3064, 1996. [147] S. A. Hughes. Evolution of circular, nonequatorial orbits of Kerr black holes due to gravitational-wave emission. Physical Review D, 61:084004, 2000. [148] K. Glampedakis, S. A. Hughes, and D. Kennefick. Approximating the inspiral of test bodies into Kerr black holes. Physical Review D, 66:064005, 2002. [149] J. R. Gair and K. Glampedakis. Improved approximate inspirals of test bodies into Kerr black holes. Physical Review D, 73:064037, 2006. [150] J. D. Bekenstein. Gravitational-radiation recoil and runaway black holes. The Astrophysical Journal, 183:657, 1973. [151] W. H. Press. Gravitational radiation from sources which extend into their own wave zone. Physical Review D, 15:965, 1977. [152] S. Chandrasekhar. The mathematical theory of black holes. Clarendon Press, 1983. [153] Y. Mino. Perturbative approach to an orbital evolution around a super- massive black hole. Physical Review D, 67:084027, 2003. [154] S. Drasco and S. A. Hughes. Rotating black hole orbit functionals in the frequency domain. Physical Review D, 69:044015, 2004. [155] N. Yunes and E. Berti. Accuracy of the post-Newtonian approximation: Optimal asymptotic expansion for quasicircular, extreme-mass ratio in- spirals. Physical Review D, 77:124006, 2008. [156] E. L. Rees. Graphical discussion of the roots of a quartic equation. Amer- ican Mathematical Monthly, 29:51, 1922. BIBLIOGRAPHY 171 [157] D. J. A. McKechan, C. Robinson, and B. S. Sathyaprakash. A tapering window for time-domain templates and simulated signals in the detec- tion of gravitational waves from coalescing compact binaries. Classical and Quantum Gravity, 27:084020, 2010. [158] M. Capderou. Satellites: Orbits and missions. Springer, 2005. [159] A. Klein et al. Science with the space-based interferometer eLISA: Su- permassive black hole binaries. Physical Review D, 93:024003, 2016. [160] A. J. K. Chua and J. R. Gair. Tunable compression of template banks for fast gravitational-wave detection and localization. Physical Review D, 93:122001, 2016. [161] S. Mitra, S. V. Dhurandhar, and L. S. Finn. Improving the efficiency of the detection of gravitational wave signals from inspiraling compact bi- naries: Chebyshev interpolation. Physical Review D, 72:102001, 2005. [162] R. J. E. Smith et al. Towards rapid parameter estimation on gravitational waves from compact binaries using interpolated waveforms. Physical Review D, 87:122002, 2013. [163] P. Can˜izares, S. E. Field, J. R. Gair, and M. Tiglio. Gravitational wave parameter estimation with compressed likelihood evaluations. Physical Review D, 87:124005, 2013. [164] I. S. Heng. Rotating stellar core-collapse waveform decomposition: A principal component analysis approach. Classical and Quantum Gravity, 26:105005, 2009. [165] K. Cannon et al. Singular value decomposition applied to com- pact binary coalescence gravitational-wave signals. Physical Review D, 82:044025, 2010. [166] S. E. Field et al. Reduced basis catalogs for gravitational wave templates. Physical Review Letters, 106:221102, 2011. 172 BIBLIOGRAPHY [167] Y. Wang. Fast detection and automatic parameter estimation of a gravi- tational wave signal with a novel method. General Relativity and Gravita- tion, 47:142, 2015. [168] J. R. Gair et al. Event rate estimates for LISA extreme mass ratio capture sources. Classical and Quantum Gravity, 21:S1595, 2004. [169] S. V. Dhurandhar and B. S. Sathyaprakash. Choice of filters for the de- tection of gravitational waves from coalescing binaries, II: Detection in colored noise. Physical Review D, 49:1707, 1994. [170] N. L. Biggs. Discrete mathematics. Oxford University Press, 2002. [171] D. Singmaster. Repeated binomial coefficients and Fibonacci numbers. Fibonacci Quarterly, 13:295, 1975. [172] D. Singmaster. How often does an integer occur as a binomial coeffi- cient? American Mathematical Monthly, 78:385, 1971. [173] B. S. Sathyaprakash and S. V. Dhurandhar. Choice of filters for the de- tection of gravitational waves from coalescing binaries. Physical Review D, 44:3819, 1991. [174] L. Blanchet, B. R. Iyer, C. M. Will, and A. G. Wiseman. Gravita- tional waveforms from inspiralling compact binaries to second-post- Newtonian order. Classical and Quantum Gravity, 13:575, 1996. [175] R. H. Cole and J. R. Gair. Likelihood smoothing using gravitational wave surrogate models. Physical Review D, 90:124043, 2014. [176] P. Amaro-Seoane et al. eLISA/NGO: Astrophysics and cosmology in the gravitational-wave millihertz regime. GW Notes, 6:4, 2013. [177] P. R. Brady, T. Creighton, C. Cutler, and B. F. Schutz. Searching for peri- odic sources with LIGO. Physical Review D, 57:2101, 1998. [178] B. J. Owen and B. S. Sathyaprakash. Matched filtering of gravitational waves from inspiraling compact binaries: Computational cost and tem- plate placement. Physical Review D, 60:022002, 1999. BIBLIOGRAPHY 173 [179] R. Prix. Template-based searches for gravitational waves: Efficient lattice covering of flat parameter spaces. Classical and Quantum Gravity, 24:S481, 2007. [180] C. J. Moore, C. P. L. Berry, A. J. K. Chua, and J. R. Gair. Improving gravitational-wave parameter estimation using Gaussian process regres- sion. Physical Review D, 93:064001, 2016. [181] C. J. Moore, A. J. K. Chua, C. P. L. Berry, and J. R. Gair. Fast methods for training Gaussian processes on large datasets. Royal Society Open Science, 3:160125, 2016. [182] C. Cutler and M. Vallisneri. LISA detections of massive black hole inspi- rals: Parameter extraction errors due to inaccurate template waveforms. Physical Review D, 76:104018, 2007. [183] D. J. C. MacKay. Information theory, inference, and learning algorithms. Cambridge University Press, 2003. [184] C. E. Rasmussen and C. K. I. Williams. Gaussian processes for machine learning. MIT Press, 2006. [185] C. J. Moore and J. R. Gair. Novel method for incorporating model un- certainties into gravitational wave parameter estimates. Physical Review Letters, 113:251101, 2014. [186] LIGO Scientific Collaboration. LSC Algorithm Library Suite, 2016. wiki.ligo.org/DASWG/LALSuite. [187] L. Santamarı´a et al. Matching post-Newtonian and numerical relativity waveforms: Systematic errors and a new phenomenological model for nonprecessing black hole binaries. Physical Review D, 82:064016, 2010. [188] T. Damour, B. R. Iyer, and B. S. Sathyaprakash. Comparison of search templates for gravitational waves from binary inspiral. Physical Review D, 63:044023, 2001. 174 BIBLIOGRAPHY [189] S. Drasco. Strategies for observing extreme mass ratio inspirals. Classical and Quantum Gravity, 23:S769, 2006. [190] H. Wendland. Scattered data approximation. Cambridge University Press, 2004. [191] Y. Saad. Iterative methods for sparse linear systems. Society for Industrial and Applied Mathematics, 2003. [192] S. Babak. Building a stochastic template bank for detecting massive black hole binaries. Classical and Quantum Gravity, 25:195011, 2008. [193] C. Messenger, R. Prix, and M. A. Papa. Random template banks and relaxed lattice coverings. Physical Review D, 79:104017, 2009. [194] I. W. Harry, B. Allen, and B. S. Sathyaprakash. Stochastic template place- ment algorithm for gravitational wave data analysis. Physical Review D, 80:104014, 2009. [195] U. Ruangsri and S. A. Hughes. Census of transient orbital resonances encountered during binary inspiral. Physical Review D, 89:084036, 2014. [196] C. P. L. Berry, R. H. Cole, P. Can˜izares, and J. R. Gair. Importance of transient resonances in extreme-mass-ratio inspirals. Physical Review D, 94:124042, 2016. [197] M. van de Meent. Gravitational self-force on eccentric equatorial orbits around a Kerr black hole. Physical Review D, 94:044034, 2016. [198] R. C. Bose. On the application of the properties of Galois fields to the problem of construction of hyper-Graeco-Latin squares. Sankhya¯: The Indian Journal of Statistics, 3:323, 1938. [199] C. W. H. Lam, L. Thiel, and S. Swiercz. The non-existence of finite pro- jective planes of order 10. Canadian Journal of Mathematics, 41:1117, 1989. [200] L. Blanchet and G. Scha¨fer. Gravitational wave tails and binary star sys- tems. Classical and Quantum Gravity, 10:2699, 1993.