Topics in Gravitational-Wave Astronomy
Theoretical studies, source modelling & statistical methods
ALVIN J. K. CHUA
CLARE HALL / INSTITUTE OF ASTRONOMY / UNIVERSITY OF CAMBRIDGE
A dissertation submitted for the degree of Doctor of Philosophy
Supervisor: Dr Jonathan R. Gair
1 MARCH 2017
ii
Abstract
Astronomy with gravitational-wave observations is now a reality. Much of
the theoretical research in this field falls under three broad themes: the mathe-
matical description and physical understanding of gravitational radiation and
its effects; the construction of accurate and computationally efficient wave-
form models for astrophysical sources; and the improved statistical analysis
of noisy data from interferometric detectors, so as to extract and characterise
source signals. The doctoral thesis presented in this dissertation is an investi-
gation of various topics across these themes.
Under the first theme, we examine the direct interaction between gravita-
tional waves and electromagnetic fields in a self-contained theoretical study;
this is done with a view to understanding the observational implications for
highly energetic astrophysical events that radiate in both the gravitational and
electromagnetic sectors. We then delve into the second theme of source mod-
elling by developing and implementing an improved waveform model for the
extreme-mass-ratio inspirals of stellar-mass compact objects into supermassive
black holes, which are an important class of source for future space-based de-
tectors such as the Laser Interferometer Space Antenna.
Two separate topics are explored under the third theme of data analysis.
We begin with the procedure of searching for gravitational-wave signals in de-
tector data, and propose several combinatorial compression schemes for the
large banks of waveform templates that are matched against putative signals,
before studying the usefulness of these schemes for accelerating searches. Af-
ter a gravitational-wave source is detected, the follow-up process is to mea-
sure its parameters in detail from the data; this is addressed as we apply the
machine-learning technique of Gaussian process regression to gravitational-
wave data analysis, and in particular to the formidable problem of parameter
estimation for extreme-mass-ratio inspirals.
iii
iv
Acknowledgements
The completion of a doctoral thesis is an intrinsically individual and (some
might say) thankless experience. It is however by no means isolated, and
should not pass without a word of thanks to those who have helped.
I am indebted to my supervisor Jon Gair for the opportunity of working
with him, the overarching guidance he has provided, and indeed just for be-
ing so good at what he does. His broad and creative approach to research is
inspiring, while his experience and stature in the field have been an invaluable
conduit for my exposure to gravitational-wave astronomy.
Further gratitude goes to Chris Moore, who is always ready to talk science,
and equally adept at bouncing ideas around or combing through the details.
Daily life in the office has been an enriching—and often exhausting—learning
experience thanks to both him and Rob Cole, who has provided much help
as well over the years. I have likewise enjoyed my interactions with other
past members of the IoA gravity group, some of whom I have had the addi-
tional pleasure of collaborating with: Christopher Berry, Tom Callister, Priscilla
Can˜izares, Sonke Hee, Eliu Huerta, and Steve Taylor.
For graciously serving as both my academic referees and thesis examiners,
as well as for advice freely proffered and often heeded, I thank Leor Barack
and Anthony Lasenby. I am also grateful for science- and job-related dis-
cussions with various researchers at the IoA and around the globe: Mustafa
Amin, Stas Babak, John Baker, Anthony Challinor, Scott Field, Ik Siong Heng,
Scott Hughes, Donald Lynden-Bell, Nikku Madhusudhan, Michele Vallisneri,
Maarten van de Meent, and Yan Wang (both of them).
Finally, I greatly appreciate the assorted assistance afforded by Debbie Pe-
terson and Margaret Harding at the IoA, Irene Hills and Magda Bergman at
Clare Hall, and the good people at the Cambridge Trust.
This thesis is dedicated to my family—for everything, and then some.
v
vi
Preface
This dissertation is the result of my own work and includes nothing which is
the outcome of work done in collaboration, except as declared in the preface
and specified in the text.
It is not substantially the same as any that I have submitted, or that is being
concurrently submitted for a degree or diploma or other qualification at the
University of Cambridge or any other university or similar institution, except
as declared in the preface and specified in the text.
I further state that no substantial part of my dissertation has already been
submitted, or is being concurrently submitted for any such degree, diploma or
other qualification at the University of Cambridge or any other university or
similar institution, except as declared in the preface and specified in the text.
All concepts, results and data obtained by others are properly referenced.
The use of the collaborative “we” in the text is a stylistic choice.
Some of the material in this dissertation has been adapted from the follow-
ing journal publications:
• A. J. K. Chua, P. Can˜izares & J. R. Gair, Electromagnetic signatures of far-
field gravitational radiation in the 1+3 approach, Class. Quantum Grav. 32,
015011 (2015).
• A. J. K. Chua & J. R. Gair, Improved analytic extreme-mass-ratio inspiral model
for scoping out eLISA data analysis, Fast Track Communication, Class.
Quantum Grav. 32, 232002 (2015).
• A. J. K. Chua, Augmented kludge waveforms and Gaussian process regression
for EMRI data analysis, J. Phys.: Conf. Ser. 716, 012028 (2016).
• A. J. K. Chua & J. R. Gair, Tunable compression of template banks for fast
gravitational-wave detection and localisation, Phys. Rev. D 93, 122001 (2016).
vii
viii
• C. J. Moore, C. P. L. Berry, A. J. K. Chua & J. R. Gair, Improving
gravitational-wave parameter estimation using Gaussian process regres-
sion, Phys. Rev. D 93, 064001 (2016).
• C. J. Moore, A. J. K. Chua, C. P. L. Berry & J. R. Gair, Fast methods for train-
ing Gaussian processes on large datasets, R. Soc. Open Sci. 3, 160125 (2016).
Explicit citation of these papers indicates results that have since been updated
(as in Chapter 4), or work done mainly in collaboration (as in Chapter 6).
This dissertation is shorter than 60,000 words in length, including the ab-
stract, tables, footnotes and appendices. The word count is 52,000 words (eval-
uated using Kile).
— Alvin J. K. Chua
1 March 2017
Contents
Abstract iii
Acknowledgements v
Preface vii
1 Introduction 1
1.1 Chapter synopses . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.2 Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Gravitational-wave astronomy 7
2.1 Gravitational waves in general relativity . . . . . . . . . . . . . . 7
2.1.1 Linearised formalism . . . . . . . . . . . . . . . . . . . . . 8
2.1.2 Gravitational-wave generation . . . . . . . . . . . . . . . 9
2.1.3 Gravitational-wave propagation . . . . . . . . . . . . . . 11
2.1.4 Gravitational-wave effects . . . . . . . . . . . . . . . . . . 12
2.2 Sources and waveform modelling . . . . . . . . . . . . . . . . . . 13
2.2.1 Source overview . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.2 Binary-coalescence modelling . . . . . . . . . . . . . . . . 15
2.2.3 Comparable-mass mergers . . . . . . . . . . . . . . . . . 18
2.2.4 Extreme-mass-ratio inspirals . . . . . . . . . . . . . . . . 19
2.3 Detectors and data analysis . . . . . . . . . . . . . . . . . . . . . 21
2.3.1 Detector overview . . . . . . . . . . . . . . . . . . . . . . 22
2.3.2 Signal processing . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.3 Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3.4 Parameter estimation . . . . . . . . . . . . . . . . . . . . . 28
ix
x Contents
3 Einstein–Maxwell interactions 31
3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 The 1+3 approach to relativity . . . . . . . . . . . . . . . . . . . . 34
3.3 Linearised far-field equations . . . . . . . . . . . . . . . . . . . . 37
3.4 Solutions and observational implications . . . . . . . . . . . . . 41
3.4.1 Static electromagnetic field . . . . . . . . . . . . . . . . . 41
3.4.2 Electromagnetic radiation . . . . . . . . . . . . . . . . . . 45
3.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4 Augmented kludge waveforms 55
4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Kludge waveform models . . . . . . . . . . . . . . . . . . . . . . 57
4.2.1 Analytic kludge . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2.2 Numerical kludge . . . . . . . . . . . . . . . . . . . . . . 62
4.3 Augmented analytic kludge . . . . . . . . . . . . . . . . . . . . . 65
4.3.1 The fundamental-frequency map . . . . . . . . . . . . . . 67
4.3.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . 69
4.3.3 Benchmarking . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5 Template bank compression 81
5.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.2 Conic compression . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2.1 Partition scheme . . . . . . . . . . . . . . . . . . . . . . . 89
5.2.2 Symmetric base scheme . . . . . . . . . . . . . . . . . . . 92
5.2.3 Binomial coefficient scheme . . . . . . . . . . . . . . . . . 97
5.2.4 Scheme comparison . . . . . . . . . . . . . . . . . . . . . 100
5.3 Orthogonality and subspaces . . . . . . . . . . . . . . . . . . . . 104
5.3.1 Non-orthogonal templates . . . . . . . . . . . . . . . . . . 104
5.3.2 Two-dimensional subspace . . . . . . . . . . . . . . . . . 108
5.4 Taylor-T2 example . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
Contents xi
6 Gaussian process regression 121
6.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6.2 The marginalised likelihood . . . . . . . . . . . . . . . . . . . . . 123
6.2.1 The Gaussian process prior . . . . . . . . . . . . . . . . . 125
6.2.2 Training the Gaussian process . . . . . . . . . . . . . . . . 128
6.2.3 Application to black-hole binary mergers . . . . . . . . . 129
6.3 Application to extreme-mass-ratio inspirals . . . . . . . . . . . . 131
6.3.1 One-dimensional example . . . . . . . . . . . . . . . . . . 134
6.3.2 Two-dimensional example . . . . . . . . . . . . . . . . . . 138
6.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
7 Conclusion 145
A Induced Maxwell field solution 149
B Combinatorial design theory 153
C Taylor-T2 waveform expansions 155
xii Contents
List of Figures and Tables
1.1 Table: Glossary of abbreviations . . . . . . . . . . . . . . . . . . . 6
3.1 Figure: Magnetic-field amplitude of induced electromagnetic
field, for static and uniform free field . . . . . . . . . . . . . . . . 43
3.2 Figure: Poynting flux amplitude envelope of induced electro-
magnetic field, for radiative and planar free field . . . . . . . . . 48
3.3 Figure: Time-averaged Poynting flux of induced electromag-
netic field, for radiative and planar free field . . . . . . . . . . . 50
4.1 Figure: Comparison of early-inspiral waveforms across three
kludge models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.2 Figure: Comparison of characteristic strain for year-long wave-
forms across three kludge models . . . . . . . . . . . . . . . . . . 74
4.3 Table: Comparison of two- and six-month overlaps for original
and augmented analytic kludge waveforms, with respect to nu-
merical kludge waveforms . . . . . . . . . . . . . . . . . . . . . . 75
4.4 Figure: Two- to six-month overlaps and computational speed-
up for augmented analytic kludge waveforms with varying
compact-object mass . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.5 Figure: Two- to six-month overlaps and computational speed-
up for augmented analytic kludge waveforms with varying
black-hole spin . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.6 Figure: Two- to six-month overlaps and computational speed-
up for augmented analytic kludge waveforms with varying ini-
tial eccentricity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.1 Figure: Optimal detection surface for uncorrelated statistics . . 87
xiii
xiv LIST OF FIGURES AND TABLES
5.2 Figure: Receiver operating characteristic curves for partition
scheme with various detection statistics . . . . . . . . . . . . . . 90
5.3 Figure: Optimal detection surface for correlated statistics of
symmetric base scheme . . . . . . . . . . . . . . . . . . . . . . . . 94
5.4 Figure: Receiver operating characteristic curves for symmetric
base scheme with various detection statistics . . . . . . . . . . . 95
5.5 Figure: Comparison of compression rates for various conic com-
pression schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.6 Figure: Comparison of discrimination and identification ac-
curacy for partition, symmetric base and binomial coefficient
schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.7 Figure: Discrimination and localisation accuracy for partition
scheme variants and coarsening approach, with non-orthogonal
template bank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.8 Figure: Discrimination and identification accuracy for partition
scheme, with one- and two-dimensional signal injections . . . . 109
5.9 Figure: Visualisation of signal overlaps with Taylor-T2 template
bank at various levels of partition-scheme compression, for cen-
tral signal injection . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.10 Figure: Detection rates and localisation accuracy for Taylor-T2
template bank with partition scheme and coarsening approach,
applied to central signal injection . . . . . . . . . . . . . . . . . . 114
5.11 Figure: Visualisation of signal overlaps with Taylor-T2 template
bank, for random and boundary signal injections . . . . . . . . . 116
5.12 Figure: Detection rates and localisation accuracy for Taylor-T2
template bank with partition scheme and coarsening approach,
applied to random and boundary signal injections . . . . . . . . 117
6.1 Figure: Log-hyperlikelihood for metric hyperparameter of co-
variance function over one-dimensional parameter space, with
various training-set densities . . . . . . . . . . . . . . . . . . . . 135
LIST OF FIGURES AND TABLES xv
6.2 Figure: Accurate, approximate and marginalised likelihoods
over one-dimensional parameter space, with various training-
set densities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
6.3 Figure: Visualisation of various training-set grids and correla-
tion ellipses over two-dimensional parameter space . . . . . . . 140
6.4 Figure: Accurate, approximate and marginalised likelihood
contours over two-dimensional parameter space, with various
training-set densities . . . . . . . . . . . . . . . . . . . . . . . . . 141
xvi LIST OF FIGURES AND TABLES
Chapter 1
Introduction
Gravitational waves (GWs) are radiative perturbations in the curvature of
spacetime. They are a confirmed prediction of Einstein’s general theory of
relativity [1], and hence an essential feature of any other viable theory of grav-
itation. The measurement of GW effects is on the long list of tests that general
relativity has undergone (and passed with flying colours), but the intrinsic
weakness of these effects means they can only be detected for astrophysical
sources in much stronger gravitational fields than those probed by Solar Sys-
tem experiments. This allows GWs to place exclusive constraints on gravity
in the strong-field regime, as well as to reveal and describe highly energetic
sources that might be inaccessible to traditional electromagnetic astronomy.
Radio observations of the Hulse–Taylor binary pulsar PSR B1913+16 [2] in
the 1970s provided the first compelling evidence for the existence of GWs. This
pulsar and its neutron star companion are losing orbital energy through the
emission of GWs, which causes the orbit of the binary to shrink; the rate of de-
crease in orbital period that is predicted by general relativity agrees exquisitely
with the measured value. A number of similar binaries were later found and
observed with even greater precision [3, 4], lending further credence to GWs
and their description in the underlying theory.
The detection of GWs in a more direct sense (i.e. measuring their effects
as they pass through a detector) is notoriously difficult, due to the weak cou-
pling of gravity to matter. Efforts to directly detect astrophysical GWs began
with the resonant detectors of the 1960s, which were designed to pick up res-
onantly amplified antenna oscillations caused by a passing wave [5, 6]. Al-
though unsuccessful, these attempts paved the way for the adoption of laser
1
2 Introduction
interferometry in GW detection, and the eventual construction of large-scale
interferometers such as the Laser Interferometer Gravitational-wave Observa-
tory (LIGO) [7]. These highly sensitive instruments operate by measuring the
phase difference of light along their angled arms, which changes in response
to the miniscule distortions of spacetime as a GW passes through.
In 2015, the twin Advanced LIGO detectors situated at both ends of the
continental United States simultaneously recorded a transient GW signal from
the merger of two stellar-mass black holes [8]. This discovery was essentially
a pair of first detections: not just that of GWs, but also the energy radiated
directly by a black-hole system (which is dark in the electromagnetic sector by
definition). It was shortly followed by another confirmed black-hole binary
merger in the same observational run [9, 10]. More detections of stellar-mass
GW sources are anticipated during the presently ongoing run and into the next
decade, as Advanced LIGO is brought up to full design sensitivity.
While the LIGO discoveries are the culmination of a generation-long re-
search and development programme targeting the direct detection of GWs,
they are also the herald of a brand new era in astronomy—one that extends
its reach beyond the electromagnetic spectrum for the first time, and into the
radiation spectrum of gravity. GW sources with characteristic masses of less
than 103 Solar masses radiate most strongly in the kilohertz (101–104 Hz) band
of the gravitational spectrum. These sources will be found and observed by
a global detector network that comprises Advanced LIGO and other ground-
based interferometers such as Advanced Virgo [11] and KAGRA [12].
Another exciting prospect in GW astronomy is the detection of longer-lived
signals from sources exceeding 105 Solar masses; these will involve the super-
massive black holes (SMBHs) that reside at the centres of galaxies, and thus
revolutionise the study of formation rates and evolution scenarios for such ob-
jects. There will be an abundance of these signals in the sensitivity band of
millihertz (10-4–100 Hz) space-based detectors such as the proposed Laser In-
terferometer Space Antenna (LISA) [13]1, or more ambitious missions such as
DECIGO [15] and TianQin [16]. Although LISA is still over a decade away
1 This is the recently submitted L3 mission proposal for ESA’s Cosmic Vision programme,
in which the design of LISA has been restored from the down-scoped eLISA instrument
[14] to an earlier configuration with higher sensitivity.
§1.1 Chapter synopses 3
from coming to fruition, recent results from the precursor LISA Pathfinder
satellite [17] have already successfully demonstrated the technological viabil-
ity of observing GW sources from space [18].
At even lower frequencies, precise timing observations of Galactic millisec-
ond pulsars by radio telescopes allow arrays of such pulsars to be used as
natural detectors in the nanohertz (10-9–10-6 Hz) gravitational band. An inter-
national consortium of three pulsar timing array collaborations [19] has been
formed to detect and study GW sources in this band. The most promising
source for these arrays is the stochastic background signal from the cosmolog-
ical population of inspiralling SMBH binaries, for which detection is expected
within the next ten years [20]. With the advent of pulsar timing arrays and
space-based detectors, a near-complete observational coverage of the gravi-
tational spectrum can be obtained—this will in turn unlock the full scientific
potential of multiband GW astronomy [21], thereby enhancing the synergy of
the nascent field with its electromagnetic counterpart.
1.1 Chapter synopses
We provide an introductory overview of GW astronomy in Chapter 2, with a
strong focus on its theoretical aspects: the mathematical/physical framework
of GWs in general relativity, the astrophysical modelling of source signals, and
the statistical analysis of detector data.
In Chapter 3, the interactions between far-field GWs and electromagnetic
fields are studied using the 1+3 approach to general relativity. Many known ef-
fects are rederived in this framework, while new results are also found. These
interactions are typically extremely weak even under astrophysical conditions,
but can give rise to potentially observable electromagnetic signatures.
Chapter 4 describes the development and implementation of an improved
approximate waveform model for the extreme-mass-ratio inspirals of stellar-
mass compact objects into supermassive black holes, which will be observable
by LISA and other space-based detectors. The computationally efficient new
model uses the three frequencies of an orbit around a spinning Kerr black hole
to better match the accuracy of more expensive models.
4 Introduction
In Chapter 5, a general method of tunable compression is proposed for the
large banks of waveform templates that are used in GW detection searches.
Various combinatorial schemes under this method are investigated with toy
waveform models. The best one is applied to a simple example featuring actual
waveforms, and yields results that are promising for practical implementation.
Chapter 6 explores the viability of using the machine-learning technique
known as Gaussian process regression in GW parameter estimation. The
method employs a Bayesian approach to account for the systematic bias that
might occur due to theoretical error in waveform models. Through low-
dimensional studies, it is demonstrated to be potentially useful for measuring
the astrophysical parameters of extreme-mass-ratio inspirals.
Finally, concluding remarks and additional discussion of possible future
research directions are given in Chapter 7.
1.2 Preliminaries
Readers are assumed to have at least a rudimentary knowledge of differential
geometry, general relativity, Bayesian statistics and signal analysis.
1.2.1 Conventions
Throughout this dissertation, we use the common set of conventions laid out
in this section, unless otherwise stated here (and in the respective chapters).
Notation in each chapter has been chosen for intuitiveness, concordance with
standard notation, and to avoid symbolic overload. Due to the variety of topics
covered in this dissertation, adherence to the first two constraints may give rise
to symbols that are multiply defined across chapters (or even across sections
in the same chapter), but we trust context to prevent confusion.
• Latin (spacetime) indices run from 0 to 3, while Greek (space) indices run
from 1 to 3; exceptions to this rule will be explicitly flagged. The Einstein
summation convention applies only to lower–upper pairs of indices.
§1.2 Preliminaries 5
• The metric signature is (−,+,+,+), and the Riemann tensor sign con-
vention is Rab = Rcacb.
• The partial-derivative operator ∂a is denoted by an index comma:
T,a := ∂aT. (1.1)
The covariant-derivative operator∇a is denoted by an index semicolon:
T;a := ∇aT. (1.2)
• Symmetrisation is denoted by round brackets around indices, e.g.
T(ab) :=
1
2
(Tab + Tba). (1.3)
Skew-symmetrisation is denoted by square brackets around indices, e.g.
T[abc] :=
1
6
(Tabc + Tbca + Tcab − Tacb − Tbac − Tcba). (1.4)
• Geometrised units such that c = G = 1 are adopted, where c is the speed
of light in vacuum and G is the gravitational constant. Exceptions occur
where units are explicitly restored, and also throughout Chapter 3 where
the cosmological convention c = 8piG = µ0 = 1 is used (with µ0 being the
magnetic permeability in vacuum).
• Cartesian coordinates are implied unless otherwise stated:
[xa] := (t, x, y, z). (1.5)
• In the context of time-varying quantities q(t), q0 denotes q(0). Overdots
are reserved for time derivatives with respect to coordinate time; any
other time derivatives will use Leibniz’s notation.
• Vectors, matrices and tensors are typically denoted by boldface letters or
symbols. Their individual components are typically denoted by indices
6 Introduction
on the corresponding italicised letters or symbols. Overhats might be
used to emphasise that a vector has unit norm, but are often omitted to
ease notation when emphasis is unnecessary.
• Complex conjugation is denoted by a superscript ∗, and Fourier trans-
forms are denoted by an overtilde.
• The common (base-10) logarithm is denoted by lg, while the natural log-
arithm is denoted by ln.
• The expected value of a random variable/field is denoted by E.
1.2.2 Abbreviations
AAK Augmented analytic kludge
AK Analytic kludge
EMRI Extreme-mass-ratio inspiral
EMW Electromagnetic wave
GPR Gaussian process regression
GW Gravitational wave
LIGO Laser Interferometer Gravitational-wave Observatory
LISA Laser Interferometer Space Antenna
NK Numerical kludge
NR Numerical relativity
PN Post-Newtonian
ROC Receiver operating characteristic
SMBH Supermassive black hole
SNR Signal-to-noise ratio
Table 1.1: Glossary of abbreviations.
Chapter 2
Gravitational-wave astronomy
Theoretical research in the field of GW astronomy falls generally under three
broad themes (although this taxonomy is neither standard nor sharply demar-
cated). The first and most fundamental is the study of mathematical frame-
works in which the physics of GWs may be described and understood. An-
other theme is the usage of such frameworks in constructing waveform models
for astrophysical GW sources, allowing general relativity and other theories of
gravitation to be tested against observations. Finally, advanced statistical and
computational methods are required for the extraction of GW signals from the
noisy detector data, and the characterisation of their source properties; these
two procedures are known respectively as detection and parameter estimation.
An overview of the basic approaches that underpin these three themes is
provided in this chapter. It focuses on introducing context and relevant con-
cepts for the rest of the dissertation, and is not intended to be comprehensive.
2.1 Gravitational waves in general relativity
GWs are most simply described using the linearised form of general relativity,
which is a sound approximation to the full theory in the weak-field region
of GW sources, i.e. where the background is near-Minkowski (flat or slightly
curved) and the spacetime perturbations are small. In this section, we outline
the canonical description of GWs in linearised general relativity, following the
influential treatment of Misner, Thorne & Wheeler [22].
7
8 Gravitational-wave astronomy
2.1.1 Linearised formalism
The metric gab of a GW-perturbed spacetime may be written as
gab = ηab + hab, (2.1)
where [ηab] = diag(−1, 1, 1, 1) represents the Minkowski metric in Cartesian
coordinates, and the components hab of the metric perturbation are small
(|hab|  1). We substitute (2.1) into the Einstein field equations
Rab − 1
2
gabR = 8piTab, (2.2)
and linearise in hab by neglecting terms that are O(|hab|2). In this approxima-
tion, all indices may be raised and lowered using ηab instead of gab; expressions
for the Ricci tensor Rab and scalar curvature R (in terms of the metric compo-
nents) then depend only on second partial derivatives of hab.
By further writing hab in trace-reversed form
hab = h¯ab − 1
2
tr(h¯)ηab, (2.3)
where tr(h¯) := h¯ aa = −h aa , and imposing the Lorenz gauge condition
h¯ab,b = 0, (2.4)
Equation (2.2) reduces to a Minkowski-space wave equation for h¯ab that is
sourced by the stress–energy tensor Tab, i.e.
h¯ab = −16piTab, (2.5)
where h¯ab := h¯ cab,c defines the flat-space d’Alembert operator .
The wave equation (2.5) is valid as long as the gravitational field is weak,
e.g. far from any GW source, or close to a nearly Newtonian source (charac-
terised by T00  |Tab| for all nonzero a or b). A similar equation applies for
strong fields in full general relativity, but with a curved-space d’Alembertian
and an additional effective source term −16pi∆Tab that contains all quadratic-
§2.1 Gravitational waves in general relativity 9
and higher-order terms in hab. The general solution to (2.5) is given by the
retarded integral
h¯ab(t,x) = 4
∫
d3x′
Tab(t− r,x′)
r
, (2.6)
where x := [xµ], and r := |r|with r = x− x′.
Far from any GW source (Tab = 0), the simplest vacuum solution to (2.5) is
the monochromatic plane wave
h¯ab = <[Aab exp (ikcxc)], (2.7)
where the four-wavevector ka is null (kaka = 0) and the symmetric ampli-
tude tensor Aab is transverse to ka (Aabkb = 0). The four constraints Aabkb = 0
on the ten independent components Aab arise from the Lorenz condition (2.4),
which is a partial gauge fixing; the remaining gauge freedom may be used to
further constrain the plane-wave solution. We choose the standard transverse–
traceless gauge by imposing the three spatial conditions
A0µ = 0, (2.8)
and the trace-free condition
tr(A) = 0, (2.9)
which leaves two dynamical degrees of freedom in Aab. With tr(h¯) = 0 in
(2.3) as a result of (2.9), the metric perturbation in this gauge reduces to its
transverse and traceless part
hTTab := hab = h¯ab. (2.10)
2.1.2 Gravitational-wave generation
For an isolated and nearly Newtonian system, the mass monopole moment is
precisely the total mass–energy M =
∫
d3x′ T00, which is a conserved quantity
and contributes no radiation (M˙ = 0). Likewise, the first time derivative of the
mass dipole momentmµ =
∫
d3x′ T00x′µ corresponds to the total linear momen-
tum, which is also conserved with no associated radiation (m¨µ = 0). Hence the
10 Gravitational-wave astronomy
lowest-order radiation emitted by such a system is quadrupolar in nature, i.e.
generated by a time-varying mass quadrupole moment.2
Any astrophysical system that spins or orbits with a degree of rotational
asymmetry will emit GWs, due to the oscillation of its mass quadrupole mo-
ment. The far-field radiation from most GW sources is described to high accu-
racy by the well-known quadrupole formula, which is valid in the slow-motion
approximation (i.e. as long as the characteristic source size is small compared
to the wavelength of the GWs emitted). This formula may in principle be used
to compute gravitational waveforms for any such source, provided the dynam-
ics of its mass distribution are known.
The local mass quadrupole moment for a nearly Newtonian, slow-motion
source is given by
Iµν(t
′) =
∫
d3x′ T00(t′,x′)x′µx
′
ν , (2.11)
where the mass–energy density T00 may be evaluated at time t′ since retarda-
tion over the extent of the source will be negligible for a distant observer, i.e.
|x′|  |x|. It is convenient to consider the traceless part of (2.11):
Iµν = Iµν − 1
3
tr(I)ηµν , (2.12)
which simplifies various expressions (e.g. (2.22) and (2.23)) and is known as
the reduced mass quadrupole moment. In the far field of the source, we only
require the traceless part of Iµν (equivalently, of Iµν) after projecting transverse
to the direction of wave propagation. For a plane wave propagating along the
z-axis, the nonzero components of ITTµν reduce to
ITT11 = −ITT22 =
1
2
(I11 − I22), (2.13)
ITT12 = ITT21 = I12. (2.14)
Finally, to obtain the quadrupole formula, we use the result∫
d3x′ Tµν =
1
2
∂20
∫
d3x′ T00x′µx
′
ν , (2.15)
2 This argument is largely heuristic; a more rigorous derivation of the result is given in [23].
§2.1 Gravitational waves in general relativity 11
which follows from the flat-space conservation equations
T ab,b = 0. (2.16)
Via (2.6), (2.12), (2.15) and the far-field approximation r = |x|, (the spatial part
of) the metric perturbation in transverse–traceless gauge is then related to the
second time derivative of ITTµν by
hTTµν (t,x) =
2
r
I¨TTµν (t− r). (2.17)
2.1.3 Gravitational-wave propagation
Energy and momentum are carried by propagating GWs, much as they are by
electromagnetic radiation. The stress–energy tensor for a GW—or indeed any
gravitational field—cannot be defined locally, since the equivalence principle
implies the existence of a local reference frame in which the field vanishes.
However, it is valid to define an effective GW stress–energy tensor that is
smeared out over several gravitational wavelengths; this is given in linearised
theory by the Isaacson tensor
TGWab =
1
32pi
〈hTTcd,ahTTef,b〉ηceηdf , (2.18)
where the angled brackets denote an average over many wave cycles. Equa-
tion (2.18) is subject to the usual conservation of energy and momentum via
(2.16). Although its reinsertion into (2.2) is uninformative in linearised theory
(since TGWab = O(|hab|2) = 0), the energy–momentum associated with the GW
field in full general relativity can contribute significantly to the background
curvature and result in non-negligible backscattering or tail effects.
The linearised description of GW propagation on Minkowski space is ex-
tensible to slightly curved spacetimes, where the weak curvature might be due
to the GWs themselves or any nearby matter/gravitational fields. This is be-
cause the plane-wave solution (2.7) and the broader approach of geometri-
cal “optics” remain valid in the short-wave approximation (i.e. as long as the
gravitational wavelength is small compared to the background radius of cur-
12 Gravitational-wave astronomy
vature). On a flat or slightly curved spacetime, the stress–energy tensor (2.18)
for a plane and monochromatic GW simplifies to
TGWab =
1
32pi
A2kakb, (2.19)
whereA2 := AabAab/2. In addition, the geometrical-optics approach allows ray
effects such as redshift, lensing, and Faraday rotation of the polarisation plane
to be characterised for GWs in the same way as for electromagnetic radiation.
2.1.4 Gravitational-wave effects
As seen in Section 2.1.1, the field of a GW in general relativity has two dy-
namical degrees of freedom. These are known as the plus (h+) and cross (h×)
linear-polarisation modes; for a plane wave propagating along the z-axis, they
are defined with respect to the chosen (x, y)-coordinates as
h+ := h
TT
11 = −hTT22 , (2.20)
h× := hTT12 = h
TT
21 . (2.21)
In the field of a passing plus-polarised GW (h× = 0), the proper distance
between two points with initial separation ` in the x-direction oscillates with
h+ as (g11)1/2` ≈ (1 + h+/2)`. For two points with initial separation ` in the
y-direction, it oscillates as (g22)1/2` ≈ (1−h+/2)`. This is an effective stretching
and squeezing of space in two orthogonal directions. A cross-polarised GW
(h+ = 0) has a similar effect, since its metric is mapped to the plus-polarised
one with h+ ≡ h× by a pi/4 rotation in the polarisation plane.
Although a GW is technically a distortion of both space and time, the met-
ric perturbation in the far field equals its transverse and traceless part, and
so in transverse–traceless gauge the far-field effects are purely spatial as de-
scribed above. The strength of a far-field GW may thus be interpreted as a
dimensionless strain of size O(|h|), where h := h+ + ih×. This strain is driven
by the evolution of a distant GW source via (2.17); it allows the detection of
such sources through interferometry, where the two arms of an interferometer
§2.2 Sources and waveform modelling 13
are alternately lengthened and shortened by the passage of a GW.
Another key effect of GWs in astrophysical settings is gravitational radia-
tion reaction: energy E and angular momentum Lµ are carried away from GW
sources, leading to phenomena such as spin-down in neutron stars or inspi-
ralling in binaries. We obtain the rates of energy and angular-momentum loss
(averaged over time and solid angle) for a source by integrating the respective
fluxes of TGWab through a surrounding sphere of radius r. Via (2.17) and the
general version of (2.18) for an arbitrary gauge, these rates are related to the
time derivatives of Iµν by
〈E˙〉 = −
∫
dΩTGW01 r
2 = −1
5
〈...I µν
...I µν〉, (2.22)
〈L˙µ〉 = −
∫
dΩ µνρxνT
GW
ρ1 r
2 = −2
5
µνρ〈I¨νσ
...I ρτ 〉ηστ , (2.23)
where we have converted to spherical polar coordinates [xµ] = (r, θ, φ), and
µνρ is the three-dimensional Levi-Civita symbol.
2.2 Sources and waveform modelling
In this section, we provide a brief overview of GW sources (adapted from
reviews such as [24–26]), with particular emphasis on the coalescing binary
systems that are relevant to much of the material in this dissertation. Differ-
ent strategies are used to model such binaries in the comparable-mass and
extreme-mass-ratio regimes; we summarise these methods and the present sta-
tus of waveform development for both cases.
2.2.1 Source overview
Gravity is many orders of magnitude weaker than the other fundamental inter-
actions. The matter terms on the right-hand sides of (2.2) and (2.17) contain a
suppressed multiplicative constant G/c4 that is actually∼ 10−45 in SI units. In-
serting realistic scales of mass, length and time for any conceivable man-made
source (e.g. the mass and orbital speed of the International Space Station) into
14 Gravitational-wave astronomy
(2.17), we find |h| . 10−32; such strains are impossible to detect with modern
instruments, which are around 10 orders of magnitude less sensitive. Hence
detectable GW sources are necessarily of astrophysical or cosmological ori-
gin, and must comprise extremely massive objects moving at near-relativistic
speeds to make up for their vast distances from local detectors.
The best-understood types of GW sources are “clean” astrophysical sys-
tems that are relatively straightforward to model and identify. Chief among
these are binary coalescences, which are some of the most ubiquitous sources
in the sensitivity bands of ground- and space-based interferometers [27, 28],
and indeed the first to be detected. These coalescences refer to binary systems
in the final stage of their evolution, spanning from the late inspiral (starting at
orbital separations small enough that the source is in band) to the post-merger
ringdown (ending as the oscillations of the final object decay to zero). Other
types of well-modelled GW sources include the early-inspiral stage of lower-
mass binaries that do not merge in band, but are close enough for their weaker
signals to be detected, and also “continuous” sources such as individual neu-
tron stars that are asymmetric and rapidly spinning.
Transient GW signals that are less cleanly modelled may be classified as
burst sources. These are generated by processes such as core-collapse super-
novae [29], glitches in the internal dynamics of neutron stars [30], and the for-
mation of cusps or kinks in the one-dimensional topological spacetime defects
known as cosmic strings [31]. Short-duration GW signals from binary systems
in particular scenarios are also referred to as bursts, e.g. the radiation emit-
ted close to periapsis by long-period, nearly parabolic binaries in the extreme-
mass-ratio regime [32], or that from late-stage binary coalescences with little
or no inspiral evolution (such that most of the signal strength is attributed to
the highly nonlinear merger phase).
The descriptions above refer to GW sources that are strong or close enough
for their signals to be individually identified. However, the Universe is also
filled with countless similar systems that are unresolvable; the superposi-
tion of signals from such sources forms a stochastic background of GWs,
which is modelled as a correlated random field and searched for in the re-
sponses of independent detectors over long observing times. Like any ran-
§2.2 Sources and waveform modelling 15
dom field, a stochastic background may be characterised by its power spec-
tral density Sh(f), which is integrated over the redshift and mass distribu-
tions of the sources under consideration. In addition to astrophysical back-
grounds, stochastic backgrounds of primordial GWs can also arise through
early-Universe processes such as the amplification of quantum fluctuations by
inflation [33], or first-order cosmological phase transitions [34].
2.2.2 Binary-coalescence modelling
We now summarise the more general methods used to model the orbital evo-
lution and GW emission of a coalescing binary system with component masses
(m1,m2 ≥ m1). It is useful to introduce the standard definitions of total mass
M , reduced mass µ, (small) mass ratio q and symmetric mass ratio η, i.e.
M := m1 +m2, (2.24)
µ :=
m1m2
m1 +m2
, (2.25)
q :=
m1
m2
, (2.26)
η :=
m1m2
(m1 +m2)2
=
µ
M
=
q
(1 + q)2
. (2.27)
The inspiral stage of a binary coalescence is extremely well modelled at
large orbital separations, and its GW signal is essentially a “chirp” that in-
creases in frequency and amplitude. For illustrative purposes, we describe
here the radiation emitted by a binary of non-spinning point masses. Via (2.17),
(2.22) and (2.23), it is possible to derive expressions for the binary’s GW am-
plitude |h|, as well as the evolution of its orbital frequency f and eccentricity e
due to radiation reaction. At leading order in f , these are given by
|h| ∝ M
5/3f 2/3
r
(1 +O(e2)), (2.28)
f˙ ∝M5/3f 11/3(1 +O(e2)), (2.29)
e˙ ∝ −M5/3f 8/3e(1 +O(e2)), (2.30)
16 Gravitational-wave astronomy
where the chirp massM is defined as
M := (m1m2)
3/5
(m1 +m2)1/5
= M2/5µ3/5 = Mη3/5. (2.31)
For a Newtonian chirp (i.e. for a binary in an instantaneous Keplerian or-
bit), there are no higher-order terms in f , and the right-hand sides of (2.28)–
(2.30) truncate after terms that are O(e4). A closed-form expression for e(f)
may be obtained by solving the system of equations (2.29) and (2.30); at small
eccentricities, this expression reduces to e ∝ f−19/18 [35, 36]. Equations (2.28)
and (2.29) thus decouple from (2.30) for a Newtonian binary. They show that
the chirp mass and distance of such a source may be estimated from its inspiral
alone (since (|h|, f, f˙) are measurable from the GW signal).
Equations (2.28)–(2.30) for a relativistic binary include higher-order terms
in f , which are known as post-Newtonian (PN) corrections [37, 38]. The cor-
responding expansions are done in a PN parameter that is associated with a
typical internal speed of the system, e.g. the average squared orbital speed
v2. Corrections that are O(v2n) relative to leading order are referred to as nPN
terms. The PN formalism is well suited to systems in which internal speeds
are small, in the sense that expansions are not required to high order. In gen-
eral, any GW observable may be written as a PN expansion with respect to its
expression as evaluated in the Newtonian potential. For example, expansions
of the GW phase include terms with different dependencies on η, which allows
both masses of the binary to be estimated (rather than just the chirp mass).
For a binary that is nearly circular, Kepler’s third law gives v2 ∝ (fM)2/3.
By transforming to dimensionless coordinates xa/M , we can put (2.28)–(2.30)
into a form that is explicitly dependent on v2 and independent of M , i.e.
|h| ∝ ηv
2
r¯
(1 +O(e2)), (2.32)
df¯
dt¯
= M2f˙ ∝ ηv11(1 +O(e2)), (2.33)
de
dt¯
= Me˙ ∝ −ηv8e(1 +O(e2)), (2.34)
§2.2 Sources and waveform modelling 17
where the overbars denote dimensionless quantities. These equations illus-
trate the total-mass invariance of waveforms in dimensionless coordinates,
which is often exploited in the modelling of GW sources.
The merger stage of a binary coalescence occurs deep within the nonlinear
dynamical strong field, at which point PN methods become both impractical
and inaccurate. Numerical relativity (NR) methods [39] (i.e. direct numerical
integration of the Einstein field equations) are typically used to model this
stage. The radiative degrees of freedom in a gravitational field are encoded in
the Weyl tensor Cabcd (the fully traceless part of the Riemann tensor), which is
reduced in the Newman–Penrose formalism [40] to five scalar contractions Ψn
of Cabcd with a complex null tetrad. For waveform modelling, Ψ4 is the most
relevant as it describes outgoing radiation at null infinity; it is related to the
metric perturbation by
Ψ4 = −h¨+ + ih¨× (2.35)
in the limit as r →∞.
In NR simulations, Ψ4 is expanded in terms of the spin-weighted spherical
harmonics sYlm [41] with s = −2, i.e.
Ψ4(t, r, θ, φ) =
∞∑
l=2
l∑
m=−l
Ψ4,lm(t, r)−2Ylm(θ, φ). (2.36)
This facilitates computation (to some desired l) of the modes Ψ4,lm, which are
then extracted on a sphere of finite radius. Breakthroughs in NR during the
mid-2000s [42–44] have enabled the highly accurate evolution of binary coa-
lescences from several orbits before plunge (the start of the merger stage) right
through to ringdown. However, NR waveforms are expensive to compute,
and are limited at present to . 100 cycles—typically with small initial orbital
separations (. 20M ) and mass ratios close to unity (& 1/20) [45].
The end product of a binary coalescence is a vibrating black hole or rel-
ativistic star that gradually settles into a stable configuration. Although this
ringdown stage may be modelled within NR, its waveforms are described
more simply through black-hole perturbation theory as a superposition of
quasinormal modes [46], i.e. exponentially decaying sinusoids. The frame-
18 Gravitational-wave astronomy
work for studying black-hole perturbations is built upon early calculations by
Regge & Wheeler [47] and Zerilli [48]; for a rotating Kerr black hole, it is cen-
tred around the Teukolsky master equation [49] describing the linear pertur-
bations of various scalar fields ψ on the spacetime.
In the spherical-like Boyer–Lindquist coordinates, the Kerr line element for
the spacetime around a mass M with angular momentum Ma is written as
ds2 = −∆
Σ
(dt− a sin2 θ dφ)2 + sin
2 θ
Σ
((r2 + a2)dφ− a dt)2 + Σ
∆
dr2 + Σ dθ2, (2.37)
where ∆ = r2− 2Mr+ a2 and Σ = r2 + a2 cos2 θ. The Teukolsky equation takes
the general form
sΘψ = 4piΣT, (2.38)
where T is the source term (written in Newman–Penrose form), and sΘ is the
Teukolsky operator for a field of spin weight s:
sΘ = −
(
(r2 + a2)2
∆
− a2 sin2 θ
)
∂2t −
4Mar
∆
∂t∂φ −
(
a2
∆
− 1
sin2 θ
)
∂2φ
+∆−s∂r∆s+1∂r +
1
sin θ
∂θ sin θ ∂θ + 2s
(
a(r −M)
∆
+
i cos θ
sin2 θ
)
∂φ
+2s
(
M(r2 − a2)
∆
− r − ia cos θ
)
∂t − s2 cot2 θ + s. (2.39)
For outgoing gravitational radiation, we set s = −2 and ψ = ρ−4Ψ4 with ρ =
1/(r − ia cos θ); the Teukolsky equation is then separable into its radial and
angular components via the expansion (2.36), such that the Kerr quasinormal
modes (i.e. the eigenfunctions of (2.38) with T = 0) and in particular their
complex frequencies may be obtained analytically (e.g. [50]).
2.2.3 Comparable-mass mergers
Although the three stages of a binary coalescence have very different practi-
cal descriptions (the two-body problem at large and small separations, then
one-body perturbations), there are regimes of overlap in which the com-
puted waveforms can be cross-validated and stitched together smoothly. Ap-
§2.2 Sources and waveform modelling 19
proaches such as those described above may be used to construct inspiral–
merger–ringdown waveform models for comparable-mass binaries with q &
10−1. These models are applicable to binary systems of stellar-mass compact
objects—white dwarfs, neutron stars or stellar-origin (. 102M) black holes—
as well as SMBH binaries, through a trivial rescaling by total mass.
The best-developed comparable-mass models incorporate at some level the
spins of both objects in the binary, whether these are assumed to be aligned
with the orbital angular momentum, or generically aligned and hence pre-
cessing. One powerful framework for achieving this is the effective-one-
body formalism [51], in which the relative motion of two spinning masses
((m1,S1), (m2,S2)) is mapped to the geodesic motion of an effective particle
(µ,Sµ) on a deformed Kerr spacetime (M,SM). The effective-one-body ap-
proach has been used to construct a succession of comparable-mass models,
the latest of which includes precessing spins [52]. These models typically pro-
duce inspiral–merger waveforms that are smoothly connected to quasinormal
modes, with NR simulations employed to calibrate unknown coefficients and
free parameters in the merger–ringdown.
While most of the methods and models described so far provide time-
domain waveforms h(t), it is often desirable to directly compute h˜(f) for ease
of use in data analysis algorithms. This is done in the other well-known fam-
ily of models, dubbed Phenom [53], which use NR-fitted phenomenological
parameters to write their waveforms as closed-form analytic expressions in
the frequency domain. Phenom waveforms rival those from the effective-one-
body approach in accuracy, and have also been extended to account for the
effects of spin precession [54].
2.2.4 Extreme-mass-ratio inspirals
The standard strategies adopted in the modelling of comparable-mass binary
coalescences are not well suited to the regime where q  1, with the possi-
ble exception of the effective-one-body approach (although work in this area
is presently limited to circular and equatorial orbits, and requires calibration
against more accurate models [55]). This is because the inspiral of the smaller
20 Gravitational-wave astronomy
object includes many orbits in the strong field of the larger object, and so the
PN treatment is inadequate on its own, while NR simulations are hampered
by the greatly differing scales of mass, length and time in the system.
Extreme-mass-ratio inspirals (EMRIs) involve a compact object in orbit
around a rotating SMBH, with q . 10−4 such that (m1,m2, η) ≈ (µ,M, q). They
are expected to be an important and detectable source of GWs at millihertz
frequencies [56]. The most readily available EMRI waveforms are generated
using mixed-formalism models known as “kludges” [57–59], which are char-
acterised by their computational efficiency and modular construction. In these
models, evolution of the orbit over long timescales is based on PN expressions
in the limit of (non-spinning) test-particle motion on Kerr geometry; the modu-
lar setup then allows the introduction of relativistic orbital corrections on both
short and long timescales. Flat-space multipole formulae (e.g. (2.17)) sourced
by the compact object are used for fast waveform generation to a reasonable
approximation, even though the background spacetime is strongly curved.
Fuller EMRI treatments require black-hole perturbation theory and the
computation of the gravitational self-force on the compact object [60, 61]. At
first order in q, the gravitational field of a test particle on Kerr spacetime is a
linear perturbation of the background metric, i.e. gab → gab + qhab. Motion on
the perturbed spacetime depends only on some regular part hRab of the metric
perturbation, and may be interpreted either as geodesic motion on gab + qhRab,
or as Kerr geodesic motion that is altered by an effective external force. This
self-force is written as
F a = µ∇abchRbc, (2.40)
with the force operator∇abc given in terms of the Kerr covariant derivative by
∇abc = 1
2
(gadub − 2gabud − uaubud)uc∇d, (2.41)
where ua is the test particle’s four-velocity. The adiabatic (time-averaged), dis-
sipative effects of the self-force at first order in q are encoded in solutions of the
Teukolsky equation (2.38) with compact-object source; while these are expen-
sive to compute, they are more accurate than kludges and have been used to
generate full EMRI waveforms for eccentric and inclined Kerr orbits [62, 63].
§2.3 Detectors and data analysis 21
Although Teukolsky-based waveforms are potentially sufficient for detec-
tion purposes, they suffer from a build-up of error in the orbital phase over the
long inspiral lifetime. To accurately recover the orbital parameters of EMRIs,
waveform models must include the next dominant self-force phenomenon of
resonances, where the concurrence of the various orbital frequencies causes
transient jumps in the evolution of the compact object’s orbit [64]. They also
need to account for higher-order phase effects due to the conservative first-
order self-force and the dissipative second-order self-force [65]; these calcula-
tions are the aim of the ongoing self-force programme. Full self-force models
might be too computationally intensive to directly provide waveforms for data
analysis, and hence will likely be used to inform kludges instead [66–68].
The actual merger and ringdown of an extreme-mass-ratio coalescence are
not as important from a data analysis standpoint, since most of the source
information will typically be contained in the extensive inspiral stage. Regard-
less, the merger is now a relatively straightforward transition between the last
stable orbit of the compact object and its geodesic plunge [69], while the ring-
down is again described by the quasinormal modes of the final Kerr black hole.
Finally, coalescing binary systems with 10−3 . q . 10−2 are known as
intermediate-mass-ratio inspirals. Less work has been done on constructing
waveform models for such sources; this is due to the scarcity of direct obser-
vational evidence for intermediate-mass black holes, as well as the limitations
of NR and perturbation theory at intermediate mass ratios. However, several
hybrid methods have been applied to the modelling of these systems [70, 71],
and the implied existence of ∼ 102M black holes in light of the first LIGO
detection might presage their eventual discovery.
2.3 Detectors and data analysis
We now describe some of the existing and proposed GW detectors introduced
in Chapter 1, before outlining the signal processing framework used to analyse
their data (with particular emphasis on the standard approaches to detection
and parameter estimation for laser interferometers). The material in this sec-
tion has been adapted from reviews such as [25, 26, 72, 73].
22 Gravitational-wave astronomy
2.3.1 Detector overview
The radiation spectrum of likely sources in GW astronomy will be spanned
by an envisioned network of man-made and naturally occurring interferomet-
ric detectors with different arm lengths and noise backgrounds. These fall into
three main categories: ground-based detectors (which are sensitive at kilohertz
or “high” frequencies), space-based detectors (at millihertz or “low” frequen-
cies), and pulsar timing arrays (at nanohertz or “very low” frequencies).
Ground-based detectors will observe stellar-mass binary coalescences,
asymmetric neutron stars, supernovae and other low-mass GW sources. Mod-
ern Michelson interferometers such as Advanced LIGO [7] and Advanced
Virgo [11] have arms that are ∼ 103 m long, and employ Fabry–Pe´rot cavities
to increase the effective arm length to ∼ 105 m, hence improving sensitivity
at sub-kilohertz frequencies. They also use vibration isolation systems and
materials with low mechanical loss to screen out instrumental noise, but envi-
ronmental changes in the local gravitational field (e.g. seismic vibrations, or
atmospheric pressure gradients) are an irreducible source of gravity gradient
noise at . 101 Hz. Future variants such as KAGRA [12] and the Einstein Tele-
scope [74] will explore the same frequency band with additional noise reduc-
tion strategies, including underground construction and cryogenic cooling.
Gravity gradient noise falls off rapidly away from the Earth, and so GWs at
lower frequencies can be detected by an interferometer in space. Space-based
detectors will observe Galactic white-dwarf binaries in the early-inspiral stage,
and more distant sources such as EMRIs and the binary coalescences of smaller
(. 107M) SMBHs. The archetypal design for a space interferometer is that of
LISA [13]: three freely flying test masses housed in drag-free spacecraft, sepa-
rated by∼ 109 m, and placed in an Earth-trailing orbit around the Sun. LISA is
most sensitive in the millihertz band, below which it is limited by noise from
long-timescale drifts in the relative acceleration of the test masses. At decihertz
frequencies, LISA’s arm length exceeds multiple wavelengths and its sensitiv-
ity is reduced due to partial signal cancellation; however, future detectors with
shorter test-mass separations (e.g. DECIGO [15], the geocentric TianQin [16],
and other proposed missions [75]) will extend the range of space-based GW
observations up to the lower frequency limits of ground interferometers.
§2.3 Detectors and data analysis 23
Arrays of stable millisecond pulsars in the Milky Way emit radio pulses
that arrive at Earth with exquisite regularity in time, thus effectively acting
as a network of natural “interferometers” with Galactic-scale (∼ 1020 m) arm
lengths. A GW that passes through the Galaxy will perturb the paths taken
by the pulses, and hence their times of arrival, in a correlated way; this effect
is then distinguishable from the noise in pulse arrival times due to intrinsic
pulsar processes, which are uncorrelated. The peak sensitivity of a pulsar tim-
ing array occurs near its lower frequency limit, i.e. the cut-off determined by
the total observation time. For a five-year baseline, this corresponds to the
nanohertz frequency band. Pulsar timing arrays such as the IPTA [19] and a
future programme supported by the Square Kilometre Array [76] will observe
inspiralling SMBH binaries at the higher end of their mass range, as well as
the stochastic GW background of signals from such sources.
Apart from interferometric detectors, there is another broad class of detec-
tor that works by amplifying the vibrations induced in a material object due
to the passage of a GW. Such instruments are known as resonant detectors
[77, 78]. Their sensitivities have gradually been surpassed by those of laser in-
terferometers; furthermore, they typically operate in a narrow high-frequency
band (∼ 103 Hz) where there are fewer anticipated astrophysical or cosmologi-
cal sources, and so are less relevant to the content of this dissertation.
2.3.2 Signal processing
When analysing data from laser interferometers, the two polarisation modes
h+,× are not measured directly. The intermediate step involves at least two in-
dependent detector-response functions hI,II , which are orthogonal in the sense
that the noise in both channels is uncorrelated. These two data streams are ob-
tained from either separate interferometers or noise-correlated ones (e.g. in
shared-arm configurations such as LISA); as long as the noise is not perfectly
correlated, hI,II can always be orthogonalised in the usual way.
Working in arbitrary coordinates, we denote the orthonormal coordinate
frame adopted in (2.20) and (2.21) as (e1, e2, e3) ≡ (p,q, r) (with r pointing
24 Gravitational-wave astronomy
from the source to the detector), and define the polarisation basis tensors
H+µν := pµpν − qµqν , (2.42)
H×µν := pµqν + qµpν , (2.43)
such that the far-field metric perturbation may be written as
hTTµν = h+H
+
µν + h×H
×
µν . (2.44)
For a single detector whose two arms are aligned with the unit vectors
(`1, `2), the strain signal caused by the passage of a GW is given by
hI =
1
2
hTTµν (`
µ
1`
ν
1 − `µ2`ν2). (2.45)
This response function is proportional to the sine of the angle α between the
arms [79], and is explicitly related to h+,× via (2.44):
hI = (h+F
+
I + h×F
×
I ) sinα, (2.46)
F+,×I sinα =
1
2
H+,×µν (l
µ
1 l
ν
1 − lµ2 lν2), (2.47)
where the antenna pattern functions F+,×I (θS, φS, ψS) depend on the sky loca-
tion (θS, φS) and polarisation angle ψS of the source [80]. A second channel hII
is required for the recovery of both modes h+,×, and is similarly defined as (the
orthogonal part of) the strain signal in another linearly independent detector.
GW strain data from laser interferometers is often characterised by a low
instantaneous signal-to-noise ratio (SNR), due to the high levels of instrumen-
tal and environmental noise. However, the data may be processed with tem-
plate filters generated from waveform models, which allows the build-up of
SNR over time. In the standard signal-processing framework for GW astron-
omy, data from a pair of orthogonal detectors may be written as the complex
time series3 x(t) = h(t) + n(t), where the source signal h = hI + ihII is a de-
3 Another common and more general approach is to consider the real time-series data from
individual detectors; the SNRs for multiple orthogonal detectors may then be summed in
quadrature to give a combined SNR for the detector network.
§2.3 Detectors and data analysis 25
terministic function of time (and some unknown source parameters), and the
detector noise n = nI + inII in both channels is assumed to be a stationary and
zero-mean stochastic process.
The process known as matched filtering involves the cross-correlation of x
with some optimal template filter F (t) that maximises the output SNR
ρ :=
E[(x|F )]√
E[(n|F )(n|F )] , (2.48)
where the (zero-lag) cross-correlation inner product (·|·) on the function space
of finite-length time series is defined as
(a|b) :=
∫
dt a(t)b∗(t). (2.49)
Since n is stationary, its autocorrelation function Rn(t, t′) := E[n(t)n∗(t′)]
may be written as an even function of one variable, i.e.
Rn(τ) = E[n(t)n∗(t− τ)]. (2.50)
By the Wiener–Khinchin theorem, the (two-sided) noise power spectral density
Sn(f) is simply the Fourier transform of Rn(τ):
Sn(f) :=
∫
dτ Rn(τ) exp (−2piifτ). (2.51)
The solution to the variational problem δρ/δF = 0 is then h(t) ∝ (Rn ∗ F )(t),
which by the convolution theorem gives
F˜ (f) ∝ h˜(f)Sn(f) , (2.52)
and so the matched filter is proportional to the noise-weighted signal itself.
It is convenient to shift the noise weighting into the inner product (2.49);
the resultant noise-weighted inner product 〈·|·〉 is real and defined as
〈a|b〉 := (a|Fb) =
∫
dt a(t)F ∗b (t) =
∫
df
a˜(f)b˜∗(f)
Sn(f) (2.53)
26 Gravitational-wave astronomy
via the Plancherel theorem, with Fb denoting the matched filter corresponding
to the waveform template b (i.e. F˜b := b˜/Sn).
With a filter that is matched to the source signal h, the output SNR (2.48)
simplifies to
ρ =
E[〈x|h〉]√
E[〈n|h〉〈n|h〉] =
√
〈h|h〉, (2.54)
where we have used the noise identities
E[〈n|h〉] = 0, (2.55)
E[〈n|h〉〈n|h′〉] = 〈h|h′〉. (2.56)
Equation (2.54) may be interpreted as the “true” SNR of the source. If the filter
is matched not to h but some other waveform h′, the output SNR is instead
ρ′ =
〈h|h′〉√〈h′|h′〉 , (2.57)
which is ≤ ρ by the Cauchy–Schwarz inequality.
The common principle of comparing detector data against many templates
and identifying the best-matching one is used in both detection and parame-
ter estimation. Since the SNR might be reduced significantly if the identified
template is not close to optimal, the generating waveform model should be as
accurate as possible. At the same time, it must also be computationally effi-
cient due to the large number of templates required in both procedures. Dif-
ferent strategies apply for detection, which can be viewed as a global search on
the SNR surface over the model parameter space, and parameter estimation,
which is an exploration (over a more localised region) of the probability that
the template-subtracted data is consistent with detector noise.
2.3.3 Detection
To discern the presence of a signal buried in detector noise, one standard ap-
proach is to perform a targeted search by filtering the data with a template
bank, i.e. a discrete set of precomputed (or rapidly computed) unit-SNR tem-
§2.3 Detectors and data analysis 27
plates that densely span an extended region of the model parameter space.
This may be done while the detector is online, in order to identify high-SNR
candidates for prompt follow-up. A detection is then claimed if the output
SNR for any template in the bank surpasses a specified significance threshold
that is determined by the properties of the concurrent noise background.
Various prescriptions for template placement may be used to ensure proper
coverage of the search region. The most straightforward is to grid the tem-
plates on a lattice that is uniform with respect to some parameter-space met-
ric. Such a metric is often defined by way of the overlap O(·|·) between two
waveforms, which is given by
O(a|b) := 〈a|b〉√〈a|a〉〈b|b〉 . (2.58)
This bilinear form takes the value of±1 for linearly dependent waveforms and
0 for orthogonal waveforms, and hence provides a measure of the accuracy
with which one waveform represents another.
The overlap is related to several other commonly used quantities in the GW
literature. For example, the maximal overlap between the source signal and a
(discrete or continuous) set of templates from a waveform model is known as
the “fitting factor” of that set [81]. The “match” between two waveforms refers
to their overlap maximised over some chosen subset of parameters; these are
typically extrinsic to the source and more rapidly searched over [82]. We also
have the “mismatch” between neighbouring templates in a bank, which is just
their match subtracted from unity, as well as the “minimal match” for a bank
with uniform mismatch, which is the fraction of optimal SNR it retains when
the signal falls maximally far from the nearest templates [83].
Template banks are well suited to detection searches for simply modelled
signals across a low-dimensional parameter space, but their computational ef-
ficiency scales poorly with waveform cost and search volume. Banks of ∼ 105
approximate templates for stellar-mass binary coalescences are employed in
low-latency LIGO pipelines to search a reduced four-dimensional parameter
space (the component masses and aligned-spin magnitudes) for coincident sig-
nals in the two detectors. The significance of a candidate signal is assessed
28 Gravitational-wave astronomy
against the SNR distribution of false coincident signals due to noise in either
data stream. A frequentist approach is used to estimate this distribution: the
timestamps of one data stream are artificially shifted many times, with the off-
sets being large enough that all coincident signals in the time-shifted data sets
are necessarily uncorrelated between detectors (and interpreted as noise).
Apart from targeted searches, the LIGO pipelines also incorporate generic
searches that are performed without the use of waveform models. These gen-
erally involve the identification of coincident excess power in spectrograms
of the data, i.e. time–frequency plots of the short-time power spectral den-
sity; they provide a complementary search for (stronger) signals from well-
modelled sources, and are also suitable for detecting the transient signals from
burst sources. As the field of GW astronomy matures, the standard detection
algorithms based on template banks and spectrograms will likely be improved
or supplanted by modern statistical techniques such as deep neural networks
[84] for targeted searches, or compressed sensing [85] for generic ones.
2.3.4 Parameter estimation
Detection searches are performed by maximising the SNR-related quantity
〈x|h〉 over the parameter space of the waveform model h; in parameter esti-
mation, it is the noise-related quantity 〈x− h|x− h〉 that is minimised instead.
The probability that x − h is consistent with Gaussian detector noise supplies
a likelihood function for the model parameters λ given the data, i.e.
L(λ|x) ≡ p(x|λ) ∝ exp
(
−1
2
〈x− h(λ)|x− h(λ)〉
)
. (2.59)
This allows the optimisation problem to be cast in the framework of Bayesian
statistics, such that confidence regions around the optimal parameter point
may be obtained from the posterior probability density
p(λ|x) = L(λ|x)pi(λ)
Z(x)
. (2.60)
§2.3 Detectors and data analysis 29
The parameter prior pi is specified as appropriate or taken to be uninformative,
while the model evidence Z =
∫
dλLpi is a normalising factor that is unused
in parameter estimation. Although Z is expensive to evaluate explicitly, it can
be used for the Bayes-factor comparison of different waveform models, and
also provides a Bayesian approach to the detection of weaker signals via a
comparison with the null-model evidence Z0 ∝ exp (−1/2〈x|x〉).
Grid-based searches of the posterior surface with template banks are too
computationally inefficient to estimate parameters at the precision admitted by
most waveform models. The measurement precision for a model h(λ) at some
point in parameter space is determined by its waveform derivatives ∂h/∂λ at
that point. In the case of high SNR, the parameter estimation errors ∆λ due
to Gaussian noise have the normal distribution N (0,Γ−1), where the Fisher
information matrix Γ is given by [86, 87]
Γij =
〈
∂h
∂λi
∣∣ ∂h
∂λj
〉
. (2.61)
The root-mean-square errors in the general case can then be approximated as
∆λi ≈
√
(Γ−1)ii, (2.62)
which places stringent and typically impractical upper limits on the grid spac-
ing of a template bank that is put to this purpose.
Parameter estimation algorithms thus make use of stochastic search meth-
ods instead, in order to map out (relevant parts of) the posterior surface at
higher resolution. The unnormalised posterior density Lpi in (2.60) is sampled
over a region of parameter space containing the detected signal; this is usually
localised beforehand by the initial detection algorithms, although stochastic
methods can in principle search extended regions and adjust their sampling
resolution adaptively. Estimates of the source parameters and their associated
errors are obtained directly from the distribution of samples, after it has con-
verged to the underlying posterior density.
The most successful class of stochastic sampling methods for GW parame-
ter estimation are the Markov-chain-Monte-Carlo algorithms, which have seen
30 Gravitational-wave astronomy
extensive use due to their simplicity and versatility. These typically involve
sampling with a random walk across parameter space, but directed by a prob-
ability density that proposes points at each step, as well as some criterion for
accepting or rejecting points. The speed and reliability of these algorithms can
also be improved through techniques such as simulated annealing [88] and
parallel tempering [89]. Other well-known methods that have been applied to
GW parameter estimation include nested sampling [90], which explores nested
contours of increasing probability with a number of live points, and evolution-
ary algorithms [91], which use genetic concepts such as breeding and fitness to
evolve a population of points across parameter space.
Chapter 3
Einstein–Maxwell interactions
GWs from astrophysical sources can interact with background electromagnetic
fields, giving rise to distinctive and potentially detectable electromagnetic sig-
natures. In this chapter, we study such interactions for far-field gravitational
radiation using the 1+3 approach to relativity. Linearised equations for the
Maxwell field on perturbed Minkowski space are derived and solved analyti-
cally. The inverse Gertsenshteı˘n conversion of GWs in a static electromagnetic
field is rederived, and the resultant electromagnetic radiation is shown to be
significant for highly magnetised neutron stars in compact binary systems.
We also obtain a variety of nonlinear interference effects for interacting
gravitational and electromagnetic plane waves, although wave–wave reso-
nances previously described in the literature are absent when the electric–
magnetic self-interaction is taken into account. The fluctuation and amplifi-
cation of electromagnetic energy flux as the strength of the GW increases to-
wards the gravitational–electromagnetic frequency ratio is a possible signature
of gravitational radiation from extended astrophysical sources.
The material in this chapter has been adapted from [92].
3.1 Background
The strongest GW sources for present and future detectors are highly ener-
getic astrophysical events, many of which will be accompanied by distinctive
and detectable electromagnetic signals. Such electromagnetic counterparts can
aid the detection of GW sources through improved event rate prediction, the
provision of search triggers, the confirmation of individual detections, and en-
31
32 Einstein–Maxwell interactions
hanced parameter estimation. Once an operational network of GW observa-
tories is fully realised, the synergy of complementary information from gravi-
tational and electromagnetic observations will establish GWs as an important
component of multimessenger astronomy [93–95].
As observations of GW sources and their electromagnetic counterparts im-
prove in precision, so too must models of such dual sources, to account for
any correlations between the two types of signal. A nascent line of research
towards this end is the direct coupling between gravitational and electro-
magnetic fields in the strong-field regime. Recent work in this area has fo-
cused on the electromagnetic signatures of gravitational perturbations on var-
ious curved spacetimes; the perturbed Einstein–Maxwell equations have been
solved for Schwarzschild [96], slowly rotating Kerr–Newman [97] and equal-
mass binary Kerr [98], with the numerical involvement increasing as per the
complexity of the spacetime.
The problem of Einstein–Maxwell coupling for gravitational radiation in
flat space is older and more analytically tractable than that in curved space,
leading to a better characterisation of the (albeit weaker) interactions between
far-field GWs and electromagnetic fields. One such effect is the resonant con-
version of a GW into an electromagnetic wave (EMW)—and vice versa—in
the presence of a static electromagnetic field [99–101]. The direct signatures
of GWs on EMWs have also been studied; these include frequency splitting
[102], intensity fluctuations [102, 103], deflection of rays [103–105] and gravita-
tionally induced Faraday rotation of the EMW polarisation [105–110]. Indirect
GW detection schemes using microlensing [111] and phase modulation [112]
effects on light have been proposed as well.
Among various frameworks suited to the study of interacting GWs and
electromagnetic fields is the 1+3 covariant approach to general relativity, in
which spacetime is locally split into time and space via the introduction of a
fundamental timelike congruence [113–115]. This approach is most commonly
employed in the cosmological setting, and in particular has been used to de-
scribe electromagnetic signatures of the large-scale tensor perturbations asso-
ciated with primordial GWs [116–120]. It may also be applied to gravitational–
electromagnetic interactions in a general spacetime [121], although any in-
§3.1 Background 33
homogeneity in the spacetime typically renders the governing equations in-
tractable due to tensor–vector and tensor–tensor coupling [122].
Such difficulties with the 1+3 formalism may be partially overcome by ex-
tending the spacetime splitting to a 1+1+2 decomposition in the case of locally
rotationally symmetric spacetimes, which have a preferred spatial direction
[122]; this method has been used to semi-analytically model the electromag-
netic signature of a Schwarzschild ringdown [123]. The 1+3 approach has
also been applied to the interaction of far-field GWs and electromagnetic fields
in the presence of a magnetised plasma [124, 125]. Finally, recent work on
Minkowski-space GWs and EMWs within the 1+3 framework has uncovered
resonant interactions between the two under specific conditions [120, 126].
As any resonant amplification of electromagnetic fields by gravitational ra-
diation might be important for GW detection, we take a more detailed look
at flat-space interactions between GWs and electromagnetic fields using the
1+3 approach to relativity. We provide in Section 3.2 a brief primer to the 1+3
formalism, which is seldom encountered in the context of GW astronomy. In
Section 3.3, we derive linearised evolution and constraint equations for the
electromagnetic field on GW-perturbed flat space, and approximate these on
exact Minkowski space. This framework is applied to simple models of static
and radiative electromagnetic fields in Section 3.4, where we consider the re-
sultant effects in astrophysical settings and discuss their implications for dual
observations of GW sources.
We rederive in Section 3.4.1 the resonant induction of an EMW by a GW
in a static electromagnetic field, and estimate that for highly magnetised neu-
tron stars in compact binary systems, the energy radiated through this process
might be non-negligible with respect to the magnetic dipole radiation. In Sec-
tion 3.4.2, we find no resonant interaction between plane GWs and EMWs af-
ter considering electric–magnetic self-interaction contributions that have been
omitted in previous work [120, 126]. However, nonlinear interference effects
are shown to be significant in a regime where the GW strength approaches
the GW–EMW frequency ratio from below; the resultant fluctuation and am-
plification of electromagnetic energy flux is a potentially stronger signature of
gravitational radiation than other geometrical-optics effects in the literature.
34 Einstein–Maxwell interactions
3.2 The 1+3 approach to relativity
The 1+3 covariant formalism [113–115] has been applied to many problems in
general relativity. In the context of gravitational perturbations, the approach
has limited benefits due to the generally intractable evolution equations that
arise, although it is fairly well-suited to the analytical study of GWs in spatially
homogeneous spacetimes [122].
For a general spacetime, we define a local 1+3 threading into time and space
by introducing a fundamental timelike congruence of worldlines on the mani-
fold. The timelike vector field tangential to this congruence is given by
ua :=
dxa
dτ
, (3.1)
where τ is proper time along the worldlines and uaua = −1. In the absence of
vorticity, such a threading foliates the spacetime into spacelike hypersurfaces
orthogonal to ua. Every tensor field T on the spacetime may then be decom-
posed into its timelike and spacelike parts via contraction with, respectively,
(3.1) and the spatial projection tensor4
Pab := gab + uaub. (3.2)
Decomposing the covariant derivative T;a in the above way defines the co-
variant time (T˙ ) and space (T:a) derivatives of T . If T is a spatially projected
tensor, we may write
T˙ = T;aua, (3.3)
T:a = T;a + T˙ ua. (3.4)
Following the conventions of [114], we also define the spacetime volume form
abcd = [abcd] as the fully antisymmetric pseudotensor with 0123 := |det[gab]|1/2,
and its projection onto each instantaneous rest space as abc := uddabc (such
that the spatial curl is right-handed).
For a general fluid solution to the Einstein field equations (2.2), the stress–
4 The usual 1+3 notation for the projection tensor is hab, by unfortunate coincidence.
§3.2 The 1+3 approach to relativity 35
energy tensor decomposition with respect to (3.1) is given by
Tab = µuaub + pPab + 2q(aub) + piab, (3.5)
where µ is the mass–energy density, p is the isotropic pressure, qa is the energy
flux vector and piab is the symmetric and traceless anisotropic stress tensor. The
matter field fully determines the local Ricci curvature, but not the nonlocal cur-
vature encoded in the Weyl tensor Cabcd, which is obtained from the Riemann
tensor Rabcd by subtracting its various trace terms:
Cabcd = Rabcd − (ga[cRd]b − gb[cRd]a − 1
3
ga[cgd]bR). (3.6)
By analogy with electromagnetism, the Weyl tensor splits into its “electric” and
“magnetic” parts; these are given respectively by
Eab = Cacbdu
cud, (3.7)
Hab =
1
2
 cda Ccdbeu
e. (3.8)
Both matter and geometry affect motion along the worldlines, which is
characterised by the kinematical quantities in the decomposition
ua;b =
1
3
ϑPab + σab + ωab − u˙aub. (3.9)
The vorticity tensor ωab = u[a:b] and the (non-gravitational) acceleration vector
u˙a vanish in GW-perturbed Minkowski space, and are henceforth taken to be
zero. With this simplification, the evolution of the expansion scalar ϑ = u :aa is
governed by the Raychaudhuri equation
ϑ˙ = −1
3
ϑ2 − 2σ2 − 1
2
(µ+ 3p), (3.10)
where σ2 := σabσab/2. The shear tensor σab is the symmetric and traceless part
of u(a:b), and evolves (keeping all quantities symmetric and traceless) as
σ˙ab = −2
3
ϑσab − σacσ cb − Eab +
1
2
piab. (3.11)
36 Einstein–Maxwell interactions
Equations (3.10) and (3.11) are accompanied by the two spatial constraints
divσab := σ
:b
ab =
2
3
ϑ:a − qa, (3.12)
curlσab := acdσ
d:c
b = Hab, (3.13)
where all quantities in (3.13) are again kept symmetric and traceless.
An electromagnetic field Fab on the spacetime splits into its electric and
magnetic parts in the usual way:
Ea = Fabu
b, (3.14)
Ba =
1
2
abcF
bc. (3.15)
It evolves in accordance with Maxwell’s equations
F ;bab = Ja, (3.16)
F[ab;c] = 0, (3.17)
where Ja = ρEua + Ja is the four-current source with charge density ρE and
spatially projected current Ja. Equations (3.16) and (3.17) may also be decom-
posed with respect to (3.1). With divVa := V :aa and curlVa := abcV c:b for a spa-
tially projected vector Va, we write the first-order electromagnetic evolution
equations (keeping all quantities spatially projected) as
E˙a = −2
3
ϑEa + σabE
b + curlBa − Ja, (3.18)
B˙a = −2
3
ϑBa + σabB
b − curlEa, (3.19)
along with the two spatial constraints
divEa = ρE, (3.20)
divBa = 0. (3.21)
The electromagnetic field in turn curves the background spacetime via its
§3.3 Linearised far-field equations 37
stress–energy tensor TEMab . This may be written in the fluid form (3.5) with
µ =
1
2
(E2 +B2), (3.22)
p =
1
6
(E2 +B2), (3.23)
qa = Sa, (3.24)
piab =
1
3
(E2 +B2)Pab − EaEb −BaBb, (3.25)
where E2 := EaEa and B2 := BaBa are the squared field magnitudes and
Sa := abcEbBc is the covariant Poynting vector. Coupling of the Einstein and
Maxwell fields is manifest in (3.10)–(3.13) and (3.18)–(3.25); we explore this in
the following sections for the case of a radiative gravitational field.
3.3 Linearised far-field equations
We consider the interactions between gravitational radiation and electromag-
netic fields in perturbed Minkowski space with the metric η˜ab = ηab + hab. In
the transverse–traceless gauge, η˜00 = −1 and we may choose ua = δa0 , where δab
is the Kronecker delta. Gravitational radiation is covariantly described by the
electric and magnetic parts of the Weyl tensor, and more simply in flat space
by the shear tensor, which is related to the metric perturbation by
σab =
1
2
h˙TTab . (3.26)
Apart from the vanishing vorticity and acceleration, other kinematical and ge-
ometrical quantities are greatly simplified by the flatness and symmetry of the
spacetime; for example, the expansion scalar and electric Weyl tensor reduce
at linear perturbative order to ϑ = 0 and (via (3.11)) Eab = −σ˙ab.
Second-order evolution equations for an electromagnetic field {Ea, Ba} on
the spacetime may be derived from (3.18) and (3.19) by spatially projecting
their covariant time derivatives, making use of (3.10)–(3.13) to simplify the
expressions. These wave-like equations are sourced by the kinematical quan-
38 Einstein–Maxwell interactions
tities ϑ and σab, with additional Ricci and Weyl curvature terms arising from
the non-commutativity of derivatives. Similar equations have been derived for
a fully general spacetime, where the only assumption is a single perfect-fluid
matter field with a barotropic equation of state [121].
For most astrophysical GW sources we expect to observe, the electromag-
netic luminosity (∼ 1037 W for a typical galaxy) is dwarfed by the gravitational
luminosity (some significant fraction of c5/G ∼ 1052 W) [25], and so the energy
carried by gravitational radiation is generally much greater than that stored in
the electromagnetic field. This translates to E2 ∼ B2  σ2  1 in geometric
units. The wave-like equations for Ea and Ba contain source terms of three
sizes: ∼ σE, ∼ σ2E and O(E3), with the last arising from the back-reaction of
the electromagnetic field on the background spacetime via (3.22)–(3.25). Con-
sidering only the leading (in σ) terms at linear order in E, we find
˜Ea = σabE˙b + 2σ˙abEb + abcσcdB :bd + abcσcd:bBd + ( ˜curlσab)Bb, (3.27)
˜Ba = σabB˙b + 2σ˙abBb − abcσcdE :bd − abcσcd:bEd − ( ˜curlσab)Eb, (3.28)
where ˜T := T¨ − T :a:a for any spatially projected tensor T . Here and hence-
forth, a tilde over an operator explicitly indicates that we are raising and low-
ering indices with the perturbed metric η˜.
Equations (3.27) and (3.28), along with the divergence constraints d˜ivEa =
d˜ivBa = 0 from (3.20) and (3.21), govern the evolution of electromagnetic fields
in the presence of weak-field gravitational radiation. They are coupled to the
first-order evolution and constraint equations (3.11)–(3.13) for σab, which may
be cast as a constrained wave-like equation in similar fashion to the derivation
of (3.27) and (3.28). The shear equations contain terms that are ∼ σ2, ∼ σ3 and
O(E2σ); at linear order in σ, however, we have
˜σab = 0 (3.29)
with d˜ivσab = 0 (which is the transversality condition implied by (3.26)). Hence
it is reasonable to treat σab as a fixed background of gravitational radiation that
drives oscillations in the electromagnetic field via (3.27) and (3.28).
§3.3 Linearised far-field equations 39
For weak-field calculations, it is convenient to replace the perturbed
Minkowski metric η˜ab with an exact one, which simplifies index manipulation
and any harmonic expansion of tensor fields. This approximation is trivially
valid for (3.29) and its divergence constraint (where replacing the perturbed
operators with their Minkowski counterparts only introduces terms that are
quadratic- or higher-order in σ), but not so for (3.27) and (3.28). Using a per-
turbative approach, we consider the gravitationally coupled electromagnetic
field as the sum of a free field and an induced first-order perturbation, i.e.
Ea = E
(0)
a + E
(1)
a , (3.30)
Ba = B
(0)
a +B
(1)
a , (3.31)
where {E(0)a , B(0)a } is a vacuum Maxwell solution, E(1)  E(0) and B(1)  B(0).
Denoting operators on Minkowski space with overbars, we have
¯E(0)a = ¯B(0)a = 0, (3.32)
d¯ivE(0)a = d¯ivB
(0)
a = 0. (3.33)
Substituting (3.30)–(3.33) into the equations for {Ea, Ba} yields wave-like
equations for the induced field that are essentially (3.27) and (3.28) with linear
corrections. These corrections are due to the difference operators (˜− ¯) and
(d˜iv − d¯iv) giving rise to terms that are ∼ σE and non-negligible with respect
to (3.27) and (3.28). The induced field equations read
¯E(1)a = F [E(0)a ] +G[B(0)a ] + {linear corrections}[E(0)a ], (3.34)
¯B(1)a = F [B(0)a ]−G[E(0)a ] + {linear corrections}[B(0)a ], (3.35)
d¯ivE(1)a = {linear corrections}[E(0)a ], (3.36)
d¯ivB(1)a = {linear corrections}[B(0)a ]. (3.37)
Here F and G are linear maps defined by (3.27) and (3.28) as
F [Va] := σabV˙
b + 2σ˙abV
b, (3.38)
40 Einstein–Maxwell interactions
G[Va] := abcσ
cdV :bd + abcσ
cd:bVd + ( ¯curlσab)V
b, (3.39)
with the covariant time and space derivatives equal to their partial counter-
parts at linear perturbative order.
The divergences of the induced electromagnetic field contain terms that are
generally nonzero, even in the absence of sources. Equation (3.36) in particular
has been interpreted as an effective four-current generator for the induced field
[102], although there is no similar analogy for its magnetic counterpart (3.37).
A more suitable comparison might be to think of the corrections in (3.36) and
(3.37) as “polarisation” and “magnetisation” effects generated by the space-
time perturbations, with Ea and B
(0)
a playing the respective roles of the electric
displacement and auxiliary magnetic fields [127].
In this work, we consider a far-field background GW that is plane,
monochromatic and linearly polarised with constant amplitude. The
geometrical-optics approximation is valid whenever the gravitational wave-
length is much smaller than the background radius of curvature, i.e. across the
distant wave zone of a typical astrophysical source and well into its local wave
zone [23]. More realistic (multimodal) inspiral-type waveforms may be built
from superpositions of our simplified model, with the resultant imprint on the
electromagnetic field bearing the characteristics of the source waveform.
The background GW is a solution to (3.29) (with ˜ = ¯) and d¯ivσab = 0.
Choosing coordinates xa such that it propagates in the positive z-direction with
zero initial phase, the wave is described by the real part of
σab = σ exp (−ik(t− z))pab, (3.40)
where σab has been promoted to a complex tensor. The unit polarisation tensor
pab has nonzero components p11 = −p22 = cos 2α and p12 = p21 = sin 2α for
some wave polarisation angle α. Any linear corrections in (3.34)–(3.37) are
obtained in the usual way with the metric perturbation, which is given by
hTTab = <
[
2i
k
σab
]
(3.41)
in accordance with (3.26).
§3.4 Solutions and observational implications 41
3.4 Solutions and observational implications
We now consider two simple models for the free field {E(0)a , B(0)a } in (3.34)–
(3.37), and discuss their astrophysical implications. Section 3.4.1 deals with
the effects of gravitational radiation on a static electromagnetic field, while
GW–EMW interactions are examined in Section 3.4.2.
3.4.1 Static electromagnetic field
When an EMW propagates through a static electromagnetic field, it is reso-
nantly converted to a GW of the same frequency and wavevector; the GW is
sourced by a stress–energy tensor proportional to both the radiative and static
electromagnetic fields [99]. Astrophysical GWs generated through this “Gert-
senshteı˘n process” are generally too weak to be of practical interest [6]. The
Gertsenshteı˘n effect and its inverse process, where a GW in a static electromag-
netic field induces an EMW proportional to both fields, might nevertheless be
relevant for detecting individual gravitons [128] or high-frequency GWs [129].
The inverse Gertsenshteı˘n process is as inefficient as its counterpart, and
the fraction of gravitational energy converted is small (. 10−10) even under
typical conditions in pulsar environments [100]. However, the energy in the
induced EMW might be comparable to that radiated conventionally by astro-
physical systems where both the gravitational radiation and magnetic field are
strong (but still in the far-field regime of Section 3.3). Hence it is worthwhile
to derive the inverse Gertsenshteı˘n effect within our framework, and to revisit
the feasibility of detecting it in observations.
For a plane GW propagating in a uniform magnetic field, the field com-
ponent in the direction of the wavevector does not affect the induced EMW.
Considering only the projection of the magnetic field onto the (x, y)-plane, the
free field may be written as
E(0)a = 0, (3.42)
B(0)a = B
(0)p(0)a , (3.43)
with the unit polarisation vector p(0)a = (0, cos β, sin β, 0) for some field polarisa-
tion angle β. All linear corrections in (3.34)–(3.37) vanish for static and uniform
42 Einstein–Maxwell interactions
electromagnetic fields, and we expect separable solutions to the system. Iso-
lating the spatial dependence in our ansatz as a scalar harmonic, we promote
(3.30) and (3.31) to complex vectors and write
E(1)a = Ea exp (ikz), (3.44)
B(1)a = Ba exp (ikz), (3.45)
where {Ea,Ba} depends only on time.
Equations (3.34)–(3.37) now simplify to an ordinary differential equation in
time for the sole independent component of the induced field. Solving this
with homogeneous initial conditions, we arrive at
Ea =  bca Bbδ3c , (3.46)
Ba = 1
2
hB(0)(kt exp (−ikt)− sin kt)p(1)a , (3.47)
where h = 2σ/k and p(1)a = p ba p
(0)
b = (0, cos (2α− β), sin (2α− β), 0). Equations
(3.46) and (3.47) describe a plane, monochromatic and linearly polarised EMW;
its amplitude is given by
E(1) = B(1) =
1
2
hB(0)(k2t2 − kt sin (2kt) + sin2 kt) 12 , (3.48)
which is proportional to time for large t.
The period TGW = 2pi/k and strength h of the sinusoidal background GW
determine a natural timescale TGW/(2pih), at which B(1) ∼ B(0) and higher-
order perturbations to the electromagnetic field become significant. In reality,
the linear growth in (3.48) is contingent on a steady build-up of oscillations
over time, and is more of an upper bound for EMWs induced by chirp- or
ringdown-type GWs with evolving frequency and/or amplitude. We incorpo-
rate such waveforms with the generalised model
σab =
1
2
(k + k˙t)h exp (−i(k + 1
2
k˙t)t+ ikz − λt)pab, (3.49)
where the spatial dependence has been left unchanged from (3.40) to main-
§3.4 Solutions and observational implications 43
Figure 3.1: Induced EMW amplitude relative to background magnetic field
strength for different gravitational waveforms with fGW = 10 Hz and hI =
10−3: a sinusoid (S), a chirp with f˙GW = 10n Hz/s (Cn), and a ringdown with
τGW = 10
n s (Rn).
tain separability. When λ = 0, (3.49) describes a linear chirp with constant
chirp rate f˙GW = k˙/(2pi), while for k˙ = 0 it gives a ringdown with damping
timescale τGW = 1/λ. Equations (3.34)–(3.37) may then be solved analytically
to yield Fresnel-like integrals in the chirp case, and solutions with bounded
exponential growth in the ringdown case.
The inverse Gertsenshteı˘n effect is potentially significant in the context of
neutron star binaries, since the induced EMW is proportional in strength to
both h and B(0). While the stable GWs from the early inspirals of such sys-
tems might be conducive to resonant growth, any associated magnetic fields
will have fallen off drastically where the wave zone for gravitational radiation
begins. Hence we consider a neutron star binary coalescence in an interaction
region I with the strongest possible GW strain hI and magnetic field strength
BI , i.e. at the inner edge RI := c/(2pifGW) of the local wave zone [23].
Figure 3.1 shows the ratio B(1)/B(0) for various gravitational waveforms,
using typical values of (initial) frequency fGW = 10 Hz and measured strain
h⊕ = 10−21 (hI := h⊕R⊕/RI = 10−3) that correspond to a neutron star binary
44 Einstein–Maxwell interactions
coalescence at R⊕ = 102 Mpc. With such a large interaction strain, we have
B(1) ↗ B(0) in just 300 GW periods for the sinusoidally driven EMW, which is
well within the ∼ 104 waveform cycles observable by LIGO. In general, how-
ever, the induced EMW amplitude is reduced with increasing variability in
the gravitational waveform. The inverse Gertsenshteı˘n effect is insignificant
for the R−1 waveform, and hence completely negligible for actual stellar-mass
ringdowns with their damping timescales of ∼ 10−5 s.
By Poynting’s theorem [127], the spacetime-averaged power density trans-
ferred from a sinusoidal GW to its induced EMW is (at leading order in time)
− 〈u˙GW〉 = 〈d¯ivSa〉 = 1
8µ0
h2IB
2
Iω
2
GWt, (3.50)
where units have been restored and ωGW = 2pifGW.5 The GW energy density is
given via (2.18) by
uGW =
c2
32piG
h2Iω
2
GW. (3.51)
Hence the fraction of gravitational energy converted in the region I is
Υ =
2piG
µ0c2
B2I t
2, (3.52)
in accordance with the original Gertsenshteı˘n result [99, 100]. Even for a neu-
tron star binary coalescence containing a magnetar6 with radius RS = 104 m
and surface field strength BS = 1011 T (BI := BSR3S/R
3
I = 10
3 T), Υ over 104
GW periods is small (∼ 10−9).
At leading order in time, the time-averaged Poynting flux of the induced
EMW at the interaction distance RI is given by
〈S〉 :=
√
〈Sa〉〈Sa〉 = c
24µ0
h2IB
2
Iω
2
GWt
2. (3.53)
Like the magnetic dipole radiation emitted by a neutron star, the Gertsenshteı˘n
radiation typically dwarfs the beamed radiation arising from synchrotron
5 Note that the covariant Poynting vector is now Sa =  bca <[Eb]<[Bc].
6 Magnetars are highly magnetised neutron stars with typical periods of 1 to 10 s and sur-
face fields ranging from 109 to 1011 T [130].
§3.4 Solutions and observational implications 45
emission in the magnetosphere, but can neither propagate through the ionised
interstellar medium nor be detected by existing radio telescopes due to its low
frequency (. 103 Hz). It is more instructive to compare (3.53) with the angle-
averaged flux density of the maximal magnetic dipole radiation at RI , which
is given by [131]
〈Sdip〉 = 1
6µ0c3
B2Sω
4
dipR
6
SR
−2
I , (3.54)
where ωdip is the neutron star’s angular velocity.
For a neutron star binary coalescence containing a millisecond pulsar with
radius RS = 104 m and surface field BS = 106 T, we have 〈S〉 ∼ 109 W/m2
after 300 GW periods and 〈Sdip〉 ∼ 1017 W/m2. If the pulsar is replaced by a
similarly sized magnetar with a 1 s period and 1011 T surface field, the aver-
age flux generated through the Gertsenshteı˘n process after 300 GW periods is
∼ 1019 W/m2—a good 104 times larger than that due to the magnetar’s dipole
radiation. Although this excess flux cannot be detected directly, it should in
principle contribute significantly to the heating of any bipolar outflows or
nearby interstellar clouds. The resultant secondary emission of pulsed elec-
tromagnetic radiation (with pulse frequency fGW) might then be observable
by conventional telescopes across a range of bands, depending on the compo-
sition of the surrounding nebula.
3.4.2 Electromagnetic radiation
Interactions between gravitational and electromagnetic radiation in the far
field are most prominently characterised by a variety of interference-like (but
fully nonlinear) effects on the latter. For our framework, we consider a free
EMW that is plane, monochromatic and linearly polarised; since electromag-
netic wavelengths are typically much shorter than gravitational and astrophys-
ical length scales, our choice is motivated by the validity of geometrical optics
as much as the suitability of plane harmonics to the tensor–vector contractions
in (3.38) and (3.39). The EMW is described by the real part of
E(0)a = E
(0) exp (i(nbx
b + ψ))p(0)a , (3.55)
46 Einstein–Maxwell interactions
B(0)a =
1
n
 bca nbE
(0)
c , (3.56)
where the four-wavevector na = n(−1, sin θ cosφ, sin θ sinφ, cos θ) has the usual
polar and azimuthal angles (with respect to the z-axis), and ψ is the initial
phase relative to (3.40). The unit polarisation vector now lies in the plane or-
thogonal to the spatial wavevector nµ, and is defined such that p
(0)
3 = sin θ sin γ
for some wave polarisation angle γ.
For separable solutions, the tensor–vector contractions in (3.38) and (3.39)
motivate the ansatz
E(1)a =
1
2
(E (+)a exp (im(+)µ xµ) + E (−)a exp (im(−)µ xµ)), (3.57)
B(1)a =
1
2
(B(+)a exp (im(+)µ xµ) + B(−)a exp (im(−)µ xµ)), (3.58)
where m(±)µ := nµ ± kµ (with kµ = kδ3µ) are spatial wavevectors associated with
the first-order perturbation, and we have used the phasor multiplication rule
<[exp (iΦ)]<[exp (iΨ)] = 1
2
<[exp (i|Φ + Ψ|) + exp (i|Φ−Ψ|)]. (3.59)
The scalar Helmholtz harmonics exp (im(±)µ xµ) decouple from (3.34) and (3.35),
leaving a system of ordinary differential equations in time for {E (±)a ,B(±)a }. Al-
though the choice of ansatz is amenable to plane-wave solutions, the diver-
gences (3.36) and (3.37) depend on the angular configuration {θ, φ, α, γ} of the
waves, and are in general nonzero. As it turns out, the full system (3.34)–(3.37)
of evolution and constraint equations is inconsistent with (3.57) and (3.58) for
all but two wave configurations: parallel (θ = 0) and antiparallel (θ = pi), both
of which yield plane-wave perturbations.
GW–EMW interactions have previously been studied in the 1+3 formal-
ism, but neglecting the electric–magnetic self-interaction terms in the evolu-
tion equations (3.27) and (3.28) (i.e. setting G = 0 in (3.39)). This decouples
the spatial dependence without explicit knowledge of the covariant Helmholtz
harmonics, and for parallel waves the resultant ordinary differential equation
describes a resonantly driven oscillator with natural and driving frequency
m = n + k [120, 126]. By considering the full equations, however, we find that
§3.4 Solutions and observational implications 47
the effect of G is to cancel the terms due to F when θ = 0, such that (3.34) and
(3.35) become homogeneous wave equations. In other words, parallel waves
do not interact at all. Such cancellation does not occur for antiparallel waves,
although we find no resonant interaction either. The lack of interaction be-
tween parallel GWs and EMWs is a known result that has been obtained via
other approaches in the literature [104, 106, 107].
For a general interaction angle θ, (3.34)–(3.37) do not admit two-mode solu-
tions of the form (3.57) and (3.58). Nevertheless, the m(±)-modes are dominant
when the frequency ratio ρ := k/n is small, in the sense that the evolution and
constraint equations are consistent at leading order when ρ sec θ  1. Since
ρ . 10−4 in most astrophysical scenarios (the highest-frequency GW sources
have fGW ∼ 103 Hz [25], while fEM ∼ 107 Hz is the lowest frequency that mod-
ern radio telescopes are sensitive to [132, 133]), the two-mode ansatz is justifi-
able and valid for all angular configurations except orthogonal waves.
Solving the ordinary differential equations for {E (±)a ,B(±)a } with homoge-
neous initial conditions, we obtain a wave perturbation with a complicated
dependence on {k, n, θ, φ, α, γ} (see Appendix A). The solution (A.1)–(A.10)
remains linearly polarised, however, as its components have a common phase
offset ψ and time dependence
E (±)a ,B(±)a ∝ m(±) exp (−i(n± k)t)
−m(±) cos (m(±)t) + i(n± k) sin (m(±)t), (3.60)
where m(±) = (k2 + n2 ± 2kn cos θ)1/2. This represents an effective splitting of
the EMW frequency into four perturbation frequencies m(±) and n ± k, along
with the original free frequency n. The amplitude of the wave perturbation
vanishes in the limit for parallel waves, and is ∼ hE(0) for antiparallel waves;
its characteristic size for general θ is
E(1), B(1) = O(hE(0)/ρ), (3.61)
which indicates that nonlinear interference effects between GWs and EMWs
might become significant as h ↗ ρ. When h > ρ, higher-order perturbations
come into play and the validity of the perturbative approach is limited.
48 Einstein–Maxwell interactions
(a) (b)
(c) (d)
(e) (f)
Figure 3.2: Perturbed Poynting flux envelope for h = ρ at different interaction
angles between θ = pi/2 and θ = 0 ((a)–(d)), along with comparisons of the
four configurations over different timescales ((e) and (f)). The bolded curves
in (a)–(d) are for h = 10−1ρ.
§3.4 Solutions and observational implications 49
To illustrate the behaviour in the h ∼ ρ regime, we define here a complex
covariant Poynting vector7
Sa := 
bc
a EbB
∗
c , (3.62)
which gives the envelope S := (S∗aSa)1/2 of the usual Poynting vector magni-
tude for the full field {Ea, Ba}. Due to the presence of cross terms in (3.62), the
Poynting flux envelope is spatially periodic on the gravitational length scale
2pi/k. Figure 3.2 shows the relative flux envelope S/S(0) for a plus-polarised
GW (α = 0) and an EMW in the (y, z)-plane ((φ, γ) = (pi/2, 0)), where S is
evaluated at xµ = 0 and S(0) is the constant flux envelope of the free EMW.
In accordance with previous results [126], there is an emergence of θ-
dependent beats in the perturbed EMW, with frequency given by the greatest
common divisor of the spectrum {n,m(±), n ± k}. It is useful to define an ap-
proximate beat period Tbeat(θ) := 2pi/(k − (m(+) − m(−))/2), which describes
much of the beat structure for most values of θ. As the interaction angle de-
creases from pi/2− to  (where  = 10−3), the peaks for the extremal case h = ρ
increase from around S/S(0) = 2 to a limiting value of S/S(0) = 9. Addition-
ally, we find significant nonlinear amplification of the beats as h is raised from
10−1ρ to ρ. Beating effects are essentially negligible for h . 10−3ρ.
There is an overall flux increase apparent in Figure 3.2, attributable to the
transfer of energy from the GW to the electromagnetic field as in Section 3.4.1.
For a clearer picture of this flux amplification and its dependence on interac-
tion angle, we require the Poynting flux averaged over finite time intervals T ,
which we denote explicitly as 〈S〉T = (〈Sa〉T 〈Sa〉T )1/2 with
〈Sa〉T = 1
T
∫ T
0
dtSa. (3.63)
Considering the same angular configuration as before, Figure 3.3 shows a se-
quence of polar plots (with respect to interaction angle) for 〈S〉T/〈S(0)〉 with
increasing T , where 〈S〉T is evaluated at xµ = 0 and 〈S(0)〉 for the free EMW
is effectively constant over gravitational timescales. When h ∼ ρ, the overall
flux in the forward sector |θ| < pi/2 is approximately doubled for small |θ| after
7 Our definition differs by a factor of 1/2 from the conventional complex Poynting vector,
which is used to calculate time-averaged flux for sinusoidal plane waves.
50 Einstein–Maxwell interactions
Figure 3.3: Perturbed time-averaged Poynting flux 〈S〉T/〈S(0)〉 as a radial func-
tion of interaction angle, for different time intervals between T = 100TGW and
T = 105TGW. Each plot is for h = ρ, with θ = 0 on the positive horizontal axis
such that the GW propagates to the right.
§3.4 Solutions and observational implications 51
just 102 GW periods. There is little to no flux amplification in the backward
sector |θ| > pi/2. We note that the interaction between parallel waves vanishes
as expected, with the beat frequency and the induced field itself going to zero
smoothly as |θ| → 0; the seemingly pathological behaviour of 〈S〉T/〈S(0)〉 at
θ = 0 is attributable to the non-smoothness of the time-averaging operation
(3.63) in the limit as T →∞.
The nonlinear interference depicted in Figures 3.2 and 3.3 is potentially
relevant for GW sources with electromagnetic counterparts that are long-
lived (lasting at least several GW periods), and preferably high-frequency
(ρ . 10−10) for effects to be significant at low interaction strains. Possible
counterparts for a compact binary system are a pulsar component as in Sec-
tion 3.4.1 or, more promisingly, an extended electromagnetic source such as a
bipolar outflow or interstellar cloud around the binary. If an extended source
emits radiation in the band fEM ∼ fGW/hI , its radiation profile might be char-
acterised by intensity fluctuations and overall flux amplification at small an-
gular distances from the binary’s sky location; the fluctuations should increase
in frequency to fbeat(pi/2) = fGW as the interaction angle widens, then diminish
rapidly at larger angular distances as hI falls below 10−1ρ.
Figure 3.3 effectively describes the flux amplification at different interaction
angles, but there is actually a tiny deflection of the perturbed time-averaged
Poynting vector 〈Sa〉 in the direction of the GW. The original Poynting vector,
averaged over all time, is given simply by
〈S(0)a 〉 = 12<[S(0)a ]. (3.64)
Its perturbed counterpart reduces to
〈Sa〉 =
〈S(0)a 〉+ 12 〈<[S(1)a ]〉+ 12 〈<[ bca E(1)b B(1)c ]〉 , (3.65)
since both cross terms average to zero over all time. The first two terms in (3.65)
depend only on the angular configuration of the waves and have no spatial
dependence, while the spatial dependence in the third term is negligible for
ρ 1. We consider the deflection angle Θdef between (3.64) and (3.65) with the
52 Einstein–Maxwell interactions
same angular configuration as before; expanding the angle in powers of both
h and ρ, we find
Θdef = O(min {h2/ρ, ρ}), (3.66)
which is valid in the forward sector but away from θ = 0, where Θdef goes
sharply to pi/2 due to the time-averaging operation.
Equation (3.66) becomes Θdef = O(h2/ρ) for h < ρ, such that the deflection
of the time-averaged Poynting vector varies with both fGW and fEM. This is a
new result, although a frequency-dependent deflection of time-averaged flux
does not necessarily imply the dispersion of light by GWs. The maximal angle
Θdef ∼ h agrees with previous results for the ray deflection angle in other
approaches [104, 111]. A hydrogen-line radio wave passing the neutron star
binary coalescence of Section 3.4.1 with an impact parameter corresponding to
hI ∼ ρ ∼ 10−8 will have its Poynting vector deflected by ∼ 10−3 arcsec; this
is comparable to the deflection due to conventional gravitational lensing by
the same system (∼ 10−2 arcsec). Such angular deviations are too small to be
observed directly, but might be amenable to microlensing techniques.
Another well-documented GW–EMW interaction is the gravitational ana-
logue of Faraday rotation experienced by an EMW in the field of a passing
GW; if the projection of the EMW polarisation vector onto the GW polarisation
plane is aligned with the plus mode, it will undergo a slight (oscillatory) ro-
tation as long as the cross mode is nonzero [106, 107, 109]. In our framework,
there is indeed no rotation for a plus-polarised GW (α = 0) and an aligned
EMW ((φ, γ) = (pi/2, 0)), since <[E(0)a ] and <[Ea] are parallel. We consider the
rotation angle8 Θrot between E
(0)
a and Ea for a cross-polarised GW (α = pi/4)
and the same EMW at xµ = 0; expanding the angle in powers of h, we find
|Θrot| = O(h), (3.67)
in accordance with previous results [106, 107]. The rotation angle also oscil-
lates at ∼ fGW as expected [106], with beat frequency fbeat(θ). Again, since
we have h . ρ . 10−4 in most astrophysical scenarios, the gravitationally in-
8 We use the complex fields to smooth out oscillations on the electromagnetic timescale; the
angle Θ between two complex vectors Va and Wa is given by cos Θ = <[V ∗aW a]/(VW ).
§3.5 Discussion 53
duced Faraday rotation due to the passage of a far-field GW will typically be
. 10 arcsec and difficult to detect using modern techniques.
3.5 Discussion
In this chapter, we have studied far-field interactions between gravitational ra-
diation and electromagnetic fields in the 1+3 approach to relativity, with a view
to characterising observable signatures on any electromagnetic radiation emit-
ted by astrophysical GW sources. Linearised evolution and constraint equa-
tions for the electromagnetic field on a GW-perturbed spacetime have been
approximated and solved perturbatively on Minkowski space, where the rele-
vant harmonic expansions are explicitly known and analytically tractable.
We have rederived the inverse Gertsenshteı˘n effect by applying this frame-
work to the interaction of a plane GW with a static electromagnetic field, and
have also considered the resonantly induced electromagnetic radiation in an
astrophysical setting. Order-of-magnitude calculations have shown that the
Gertsenshteı˘n radiation is comparable to the magnetic dipole radiation for
highly magnetised neutron stars in compact binary systems; in the presence
of a surrounding nebula, this might lead to a secondary emission of electro-
magnetic radiation pulsed at the GW frequency.
Several geometrical-optics effects have been found in the case of interacting
GWs and EMWs. There is no resonant growth of the electromagnetic field as
found in other work, due to the additional consideration of electric–magnetic
self-interaction contributions here. We have also demonstrated that the non-
linear fluctuation and amplification of electromagnetic energy flux becomes
significant as the GW strain approaches the GW–EMW frequency ratio from
below, and might serve as a distinctive astrophysical signature of gravitational
radiation emitted near or within an extended electromagnetic source.
From the various assumptions and approximations employed in this work,
it is evident that the analytical advantages of the 1+3 approach are limited
even for our simple model. A calculation of second-order perturbations in-
duced by the first-order fields via (3.34)–(3.37) might provide a clearer picture
of interacting waves in the h ∼ ρ regime, although a rapid blow-up of the full
54 Einstein–Maxwell interactions
field (signalling the breakdown of the perturbative approach) is more likely.
Numerical solutions of (3.27) and (3.28) or their unlinearised versions might
be worth pursuing in this case, both to verify results from the perturbative
approach and to facilitate more accurate models by extending the framework
into the h > ρ regime.
Our results have observational implications for two types of astrophysi-
cal source: compact sources with large values of h and B(0) (for the inverse
Gertsenshteı˘n effect to be relevant), and extended ones with a wide range of
interaction angle (for more prominent nonlinear interference effects). They are
not restricted to any specific example suggested here, however; neither have
we considered scenarios where the gravitational and electromagnetic sources
are separate. Detailed source models that incorporate far-field gravitational–
electromagnetic interactions—or any Einstein–Maxwell coupling in general—
will be an asset to the detection and observation of GW sources at present, and
indeed the larger realm of multimessenger astronomy in the future.
Chapter 4
Augmented kludge waveforms
To prepare for the formulation of the LISA mission over the next few years,
several outstanding and urgent questions in data analysis must be addressed
using mock data challenges. These data challenges will require accurate and
computationally affordable waveform models for anticipated sources such as
EMRIs. Previous data challenges have made use of the well-known analytic
kludge waveforms of Barack & Cutler, which are extremely quick to generate
but dephase relative to more accurate waveforms within hours, due to their
mismatched radial, polar and azimuthal frequencies.
In this chapter, we describe an augmented Barack–Cutler waveform model
that uses a frequency map to the correct Kerr frequencies, along with updated
PN evolution equations and a local trajectory fit to a more accurate model. The
augmented kludge waveforms stay in phase for months and may be generated
with virtually no additional computational cost.
The material in this chapter is adapted from [59, 134], but has been updated
to include new unpublished results that were obtained with the latest version
of the waveform model [135].
4.1 Background
With the recent success of the LISA Pathfinder mission [18], space-based GW
astronomy is now one step closer to becoming a reality. The launch and opera-
tion of Pathfinder provides an early demonstration of the technology necessary
for proposed detectors such as LISA to probe the source-rich and scientifically
rewarding millihertz GW sky. Results from the mission also offer vital insights
55
56 Augmented kludge waveforms
into the noise properties of the LISA instrument; these will be used over the
next few years to address several important open questions in data analysis,
which must be dealt with prior to mission adoption at the end of this decade.
Accurate and computationally affordable models of likely sources will allow
such issues to be investigated in forthcoming mock data challenges, and are
therefore urgently needed.
EMRIs are an important type of source for LISA and other space-based de-
tectors. Radiation reaction from the emission of GWs causes the orbit of the
compact object around the central black hole to shrink and circularise adia-
batically. During the final years of inspiral, the orbital dynamics display ex-
treme relativistic effects that arise because the compact object is deep within
the strong-field region of the black-hole spacetime. These effects are imprinted
on the GW signal from the source; measuring them will allow us to map out
the multipole structure of the spacetime, and hence to test the strong-field va-
lidity of black-hole solutions in general relativity [136].
The optimal detection and identification of EMRI signals depends on the
availability of a waveform model that is as accurate as possible. While the ex-
treme mass ratio prohibits the use of NR simulations, it does allow the source
to be modelled faithfully within the framework of black-hole perturbation the-
ory. At first order in mass ratio, such models are based on solving for the
outgoing radiation field Ψ4 with the Teukolsky equation (2.38) (sourced by a
compact object moving along a Kerr geodesic). Orbital evolution may be intro-
duced in Teukolsky-based models by balancing the change in orbital energy
and angular momentum against the radiation fluxes of “snapshot” geodesic
waveforms. The ongoing development of self-force calculations will eventu-
ally provide a more accurate evolution that accounts for both dissipative and
conservative self-interaction effects at higher orders in mass ratio.
For scoping out data analysis, the large parameter space of EMRI mod-
els and the complexity of their waveforms necessitates the use of templates
that can be generated as quickly as possible. As Teukolsky-based waveforms
are computationally expensive, they have been supplemented by approximate
waveforms designed for robust use in data analysis. The waveform model
used for previous mock LISA data challenges [137] is the analytic kludge (AK)
§4.2 Kludge waveform models 57
[57], in which the orbit is built from Keplerian ellipses, with relativistic inspi-
ral, periapsis precession and Lense–Thirring precession imposed using ana-
lytic PN evolution equations. The AK model is extremely quick to compute,
but is less accurate than the numerical kludge (NK) model [58], which com-
bines Kerr geodesics with Teukolsky-fitted PN orbital evolution for greater ac-
curacy. In both kludge models, generation of the waveform from the orbit is
sped up from Teukolsky-based models by using a flat-space approximation.
In this chapter, we describe an augmented analytic kludge (AAK) wave-
form model based on a frequency mapping method. The orbital frequency
and two precession rates in the AK model are matched to appropriate com-
binations of the fundamental frequencies describing the radial, polar and az-
imuthal components of geodesic orbits in Kerr spacetime [138]. We also update
the model with suitable PN evolution equations and include an additional fit
to the NK model, which itself shows excellent agreement with the Teukolsky-
based geodesic and inspiral waveforms in [62, 63]. The length of time over
which the AAK waveform stays in phase with the NK waveform is typically
increased by a few orders of magnitude, while the added computational cost
is insignificant as the map-and-fit is only performed locally.
A brief overview of the AK and NK models is given in Section 4.2, while the
AAK model is presented in Section 4.3. In Section 4.3.1, we first introduce the
Kerr fundamental frequencies and the parameter-space map they induce in the
AK model, before describing the technical details of the AAK implementation
in Section 4.3.2. The performance of the augmented model is then compared
to that of the original AK model in Section 4.3.3, with the more accurate but
slower NK model used as the benchmark for both.
4.2 Kludge waveform models
A kludge in the context of EMRI modelling is any approximate model that uses
a combination of formalisms to generate waveforms quickly and extensively
for sampling purposes. Kludge waveforms capture many qualitative features
of more accurate EMRI waveforms, and (owing to their modular construction)
can be modified to incorporate self-force information as it becomes available.
58 Augmented kludge waveforms
Two widely used kludges are introduced briefly in this section: the AK
waveform of Barack & Cutler [57], which is very fast to compute and provides
the basis for our new model, and the NK waveform of Babak et al. [58], which
we take as a fiducial model for calibration and benchmarking purposes. Other
approximate EMRI models exist but at varying levels of implementation (e.g.
[139, 140]), and we do not consider them in this work (apart from the PN fluxes
of Sago & Fujita [141], which are used as part of the AAK model).
Assuming the spin of the compact object is negligible, an EMRI can be de-
scribed by 14 parameters: the two masses (µ,M  µ) of the system, the three
components of the central black hole’s spin vector S, three constants E de-
scribing the compact object’s (instantaneous) orbit, the three components of
the compact object’s position vector X with respect to the black hole, and the
three components of the system’s position vector R with respect to the So-
lar System. Of these 14 degrees of freedom, seven are extrinsic to the source:
two in S and one in X (corresponding to spatial rotation of the source), three
in R (corresponding to spatial translation), and one in E (corresponding to
temporal translation). The parameters of an EMRI model are often chosen to
decouple the intrinsic degrees of freedom from the extrinsic ones, which are
generally cheaper to search over during data analysis [142].
Schematically, the main ingredients of a kludge waveform model are then
(i) the evolution of the orbital constants along the inspiral (i.e. the “phase-
space” trajectory), using PN or fitted fluxes F:
E˙ = F(µ,M,S,E); (4.1)
(ii) the construction of the compact object’s worldline (i.e. the “configuration-
space” trajectory), using geodesic or flux-derived expressions G:
X˙ = G(µ,M,S,E); (4.2)
and (iii) the generation of the waveform field h at the detector, using some
weak-field multipole formula H :
h(t) = H(X,R). (4.3)
§4.2 Kludge waveform models 59
4.2.1 Analytic kludge
In the AK model [57], both the orbital trajectory and the waveform are com-
puted in a flat-space approximation, with relativistic effects such as inspi-
ralling and precession added separately. The trajectory is built out of rotating
Keplerian ellipses. Radiation reaction is introduced in phase space, where the
orbital constants describing a Keplerian ellipse are evolved with PN equations.
In configuration space, the orientation of this ellipse is also evolved with PN
equations to simulate relativistic precession. The waveform is then generated
using the Peters–Mathews mode-sum approximation for Keplerian orbits [35],
in which the mass quadrupole moment is decomposed into harmonics of the
Keplerian orbital frequency.
Since a Keplerian orbit is confined within the plane normal to its angular
momentum vector L, the AK waveform is constructed in an L-based coordi-
nate frame
(xˆ, yˆ, zˆ)L˜ :=
(
(Rˆ · Lˆ)Lˆ− Rˆ
SL,R ,
Rˆ× Lˆ
SL,R , Lˆ
)
, (4.4)
and projected transverse to the wave frame
(xˆ, yˆ, zˆ)AK :=
(
Rˆ× Lˆ
SL,R ,
Lˆ− (Lˆ · Rˆ)Rˆ
SL,R ,−Rˆ
)
, (4.5)
where the normalisation factor SL,R := (1 − (Lˆ · Rˆ)2)1/2. These two frames
are made time-varying (with respect to a fixed heliocentric and ecliptic-based
frame [79]) through the forced precession of L.
The two waveform polarisations in the transverse–traceless gauge (with
respect to (xˆ, yˆ)AK) are given by the n-mode sums
h+ =
∞∑
n=1
h+n , (4.6)
h× =
∞∑
n=1
h×n , (4.7)
60 Augmented kludge waveforms
with
h+n = (1 + (Rˆ · Lˆ)2)(bn sin 2γ˜ − an cos 2γ˜) + (1− (Rˆ · Lˆ)2)cn, (4.8)
h×n = 2(Rˆ · Lˆ)(bn cos 2γ˜ + an sin 2γ˜), (4.9)
where γ˜ is an azimuthal angle in the orbital plane measuring the direction of
periapsis with respect to (Rˆ·Lˆ)Lˆ−Rˆ (i.e. the orthogonal projection of zˆAK onto
the plane normal to Lˆ).9 The functions (an, bn, cn) describe the changing mass
quadrupole moment of a Keplerian orbit with mean anomaly Φ(t), eccentricity
e and orbital angular frequency Φ˙, and are given by [35]
an = −nA(Jn−2(ne)− 2eJn−1(ne) + (2/n)Jn(ne)
+2eJn+1(ne)− Jn+2(ne)) cosnΦ, (4.10)
bn = −nA(1− e2)1/2(Jn−2(ne)− 2Jn(ne) + Jn+2(ne)) sinnΦ, (4.11)
cn = 2AJn(ne) cosnΦ, (4.12)
where the Jn are Bessel functions of the first kind, and A = (Φ˙M)2/3µ/|R| in
the extreme-mass-ratio limit.
In the ecliptic-based coordinate system, the sky position Rˆ ≡ (θS, φS) of
the source and the black-hole spin orientation Sˆ ≡ (θK , φK) are effectively
constant. It is convenient to represent Lˆ in ecliptic coordinates with respect
to Sˆ (where both vectors are moved by parallel transport to the Solar System
barycentre). We have
Lˆ = Sˆ cos ι+
(
zˆ− (zˆ · Sˆ)Sˆ
|zˆ− (zˆ · Sˆ)Sˆ| cosα +
Sˆ× zˆ
|Sˆ× zˆ| sinα
)
sin ι, (4.13)
where zˆ = [0, 0, 1]T is normal to the ecliptic plane, ι is the inclination angle
between Lˆ and Sˆ, and α is an azimuthal angle in the spin-equatorial plane
measuring the direction of Lˆ − (Lˆ · Sˆ)Sˆ with respect to zˆ − (zˆ · Sˆ)Sˆ (i.e. the
9 We have changed some of the notation in [57] for consistency, since we are constructing
a hybrid model using different formalisms. For example: the notation for the angles γ
and γ˜ has been swapped; the notation ν for the orbital frequency Φ˙/(2pi) is unused; the
notation for the inclination λ is now ι.
§4.2 Kludge waveform models 61
angle between the orthogonal projections of Lˆ and zˆ onto the plane normal
to Sˆ). Furthermore, since γ˜ is neither intrinsic nor extrinsic, it is useful to
define the purely intrinsic parameter γ := γ˜ − β, where β = β(Rˆ, Sˆ, Lˆ) =
β(θS, φS, θK , φK , ι, α) is an azimuthal angle in the orbital plane measuring the
direction of Lˆ× Sˆ with respect to (Rˆ · Lˆ)Lˆ− Rˆ.
Only one of the six parameters comprising E = (e, ι, Φ˙) and X = (Φ(t), γ, α)
in the above Keplerian setup changes with time. In the AK model, (e, Φ˙, γ, α)
are promoted to functions of time and evolved with mixed-order PN expres-
sions that depend on (µ,M, a = |S|/M,E) [143–146], while ι is approximated as
constant (since the inclination angle of a typical EMRI varies extremely slowly
[147]). The Keplerian orbit shrinks and circularises as Φ˙(t) and e(t) increase
and decrease respectively. From (4.13), the time dependence of the orbital ori-
entation Lˆ(t) is confined to α(t), where α˙ is precisely the angular rate of Lense–
Thirring precession. Finally, the angular rate of periapsis precession is given
by γ˙ + α˙ since γ(t) is measured with respect to Lˆ(t)× Sˆ.
While the waveform field h+,× is effectively planar at the Solar System and
may be calculated in the fixed heliocentric frame (as opposed to a detector-
centric one), the rotational and orbital motion of LISA in the ecliptic plane
must be factored into the detector’s response to h+,×. In the standard LISA
framework, the field is transformed into the response functions hI,II via (2.46)
and (2.47), with F+,×I (θS, φS, θK , φK) given explicitly in [79]; these rotate with
respect to Rˆ as the plane of the detector along its orbit precesses around the
ecliptic plane. Doppler modulation of the waveform phase (through Φ(t)) is
also included to correct for the orbital motion of the detector itself.
The AK model was the first waveform model used to investigate the preci-
sion of LISA parameter estimation over the full (modulo compact-object spin)
EMRI parameter space [57]. Due to its computational efficiency, the model has
also been employed in past mock LISA data challenges to generate injected
signals in simulated data and parametrised templates for search algorithms
[137]. However, the approximate waveforms it produces are demonstrably
inaccurate, and will result in reduced detection and parameter estimation per-
formance if used to analyse data sets containing realistic EMRI signals.
62 Augmented kludge waveforms
4.2.2 Numerical kludge
In the NK model [58], the orbital trajectory is computed in curved space with
a treatment that is fully relativistic up to the evolution of orbital constants
[148, 149], i.e. it is built out of Kerr geodesics. The three constants of mo-
tion for a geodesic are evolved with Teukolsky-fitted PN equations, which
introduces radiation reaction. In configuration space, precession effects are
obtained for free by integrating the geodesic equations along the phase-space
trajectory. The curved-space coordinates of the compact object’s worldline are
then associated artificially with coordinates in flat space, and the waveform
is generated using the standard quadrupole formula (or variants that include
additional contributions from higher-order moments of mass [150, 151]).
The NK waveform is constructed in an S-based coordinate frame
(xˆ, yˆ, zˆ)S :=
(
Rˆ× Sˆ
SS,R ,
Rˆ− (Rˆ · Sˆ)Sˆ
SS,R , Sˆ
)
, (4.14)
and projected transverse to the wave frame
(xˆ, yˆ, zˆ)NK :=
(
Rˆ× Sˆ
SS,R ,
Sˆ− (Sˆ · Rˆ)Rˆ
SS,R ,−Rˆ
)
, (4.15)
where the normalisation factor SS,R := (1 − (Sˆ · Rˆ)2)1/2. Aligning the z-axis
with S is a more natural choice for the NK model, since the compact object’s
worldline is computed in Boyer–Lindquist coordinates. The two wave frames
(4.5) and (4.15) are related by a rotation about R.
Using the decomposition (2.44), the two waveform polarisations in the
transverse–traceless gauge (with respect to (xˆ, yˆ)NK) are given by
h+ =
1
2
hTTij H
+
klδ
ikδjl, (4.16)
h× =
1
2
hTTij H
×
klδ
ikδjl, (4.17)
where we have switched to Latin spatial indices and δij is the Kronecker delta.
Via (2.12)–(2.14) and (2.17), the far-field metric perturbation in the quadrupole
§4.2 Kludge waveform models 63
approximation is given by
hTTij =
2
|R|
(
P ki P
l
j −
1
2
PijP
kl
)
I¨kl, (4.18)
where contraction with Pij := (δij − zˆizˆj)NK projects transverse to R.
In the extreme-mass-ratio limit, the mass quadrupole moment is simply
Iij = µxixj, (4.19)
where the xi(t) are Cartesian components of the compact object’s position vec-
tor X with respect to the frame (4.14) centred on the black hole. Although (4.18)
(with (4.19)) is a weak-field equation in flat-space coordinates, the NK model
specifies and calculates (x1, x2, x3) = (r sin θ cosφ, r sin θ sinφ, r cos θ) in Boyer–
Lindquist coordinates. The self-consistency of this approach clearly degrades
further into the strong field, but does not severely impact the effectiveness of
the NK waveforms as an approximation to Teukolsky-based ones [58].
A timelike Kerr geodesic is described fully by three first integrals of motion:
the orbital energyE, the projection Lz of the orbital angular momentum L onto
S, and the quadratic Carter constantQ. Along such an orbit, (r(t), θ(t), φ(t)) are
obtained by integrating the geodesic equations for a test particle in the Kerr
spacetime (2.37); these are written in canonical form as [22]
Σ
dr
dτ
= ±
√
Vr, (4.20)
Σ
dθ
dτ
= ±
√
Vθ, (4.21)
Σ
dφ
dτ
= Vφ, (4.22)
Σ
dt
dτ
= Vt, (4.23)
where τ is proper time along the worldline and Σ = r2 +a2 cos2 θ. The potential
functions Vr,θ,φ,t are given by
Vr(r) = P
2 − (r2 + (Lz − aE)2 +Q)∆, (4.24)
64 Augmented kludge waveforms
Vθ(θ) = Q− cos2 θ
(
a2(1− E2) + L
2
z
sin2 θ
)
, (4.25)
Vφ(r, θ) =
Lz
sin2 θ
− aE + aP
∆
, (4.26)
Vt(r, θ) = aLz − a2E sin2 θ + (r
2 + a2)P
∆
, (4.27)
with P = E(r2 + a2)− aLz and ∆ = r2 − 2Mr + a2.
In practice, it is convenient to work with alternative parametrisations of
(E,Lz, Q). For a bound orbit, the geodesic may be specified by the parameters
(rp, ra, θmin) (the values of r at periapsis and apoapsis, and the minimal value
of θ respectively), which fully describe the range of motion in the radial and
polar coordinates. The roots of Vr determine rp and ra, while the roots of Vθ
determine cos θmin (the maximal value of cos θ). Another parametrisation is
(e, ι, p) (the quasi-Keplerian eccentricity, inclination and semi-latus rectum);
these are defined in terms of (rp, ra, θmin) as
(e, ι, p) :=
(
ra − rp
ra + rp
,
pi
2
− θmin, 2rarp
ra + rp
)
. (4.28)
Finally, since the configuration-space parameters (r, θ) oscillate between the
bounds rp ≤ r ≤ ra and θmin ≤ θ ≤ pi − θmin, it is useful to define
(ψ, χ) :=
(
cos−1
(
p− r
er
)
, cos−1
(
cos θ
cos θmin
))
, (4.29)
such that ψ (the quasi-Keplerian true anomaly) and χ are the phases of radial
and polar motion respectively.
The orbital constants E = (E,Lz, Q) in the above geodesic setup do not
vary with time. Radiation reaction is added to the NK model by evolving
E with fluxes that depend on (µ,M, a,E) (note that the inclination ι(t) =
tan−1(
√
Q/Lz) is correctly time-dependent in this model). These fluxes are
mixed-order PN expressions that have been fitted to the results of Teukolsky-
based computations for circular inclined orbits [149]. Integrating the geodesic
equations along the phase-space trajectory then gives X = (ψ(t), χ(t), φ(t)),
complete with relativistic precession. Once the waveform field h+,× has been
§4.3 Augmented analytic kludge 65
calculated via (4.16)–(4.19), the LISA response functions hI,II may be obtained
through the method outlined in Section 4.2.1.
Waveforms from the NK model display excellent agreement with
Teukolsky-based waveforms in the strong-field regime; they are reliable up
to a closest approach of rp ≈ 5M , with typical matches of over 0.95 [58]. NK
waveforms might even be accurate enough to serve as templates in actual LISA
detection algorithms. However, they are still slightly expensive to generate in
large numbers due to the relatively elaborate construction of the phase- and
configuration-space trajectories, while added computational cost also arises in
the parameter conversion (E,Lz, Q)↔ (e, ι, p), the handling of plunge, etc.
4.3 Augmented analytic kludge
The AK model is 5–15 times faster than the NK model at generating year-long
waveforms sampled at 0.2 Hz for a generic (101, 106)M EMRI with low initial
eccentricity (e0 . 0.3); this speed-up is enhanced for longer waveform dura-
tions, but diminished for higher initial eccentricity (since more modes must be
summed in the Peters–Mathews approximation).10 However, AK waveforms
suffer from severe dephasing with respect to NK waveforms, even at the early-
inspiral stage. In Figure 4.1, the AK waveform for a (101, 106)M EMRI with
initial semi-latus rectum p0 = 15M matches the qualitative features of the cor-
responding NK waveform, but is a full cycle out of phase within three hours.
This is due to the mismatched frequencies in the two models.
In Sections 4.3.1 and 4.3.2, we describe the construction of a hybrid model
that capitalises on the benefits of both kludges. The AK model is augmented
with an initial map to the fundamental frequencies of Kerr geodesic motion,
which corrects the instantaneous phasing as shown in Figure 4.1. Over longer
timescales, the mapped orbital trajectory is further improved through self-
consistent PN evolution and a local polynomial fit to the phase-space trajectory
10 The sums in (4.6) and (4.7) must be truncated at some arbitrary number of modes N ,
which directly affects both the speed and accuracy of the AK model. This number may be
specified by setting a threshold for the relative power radiated into the N -th harmonic,
and has been experimentally determined to scale linearly with eccentricity [57]. We use
N = b30e0c as the default value for both the AK and AAK models.
66 Augmented kludge waveforms
Figure 4.1: First 12 hours of AK (red) and AAK (green) waveforms overlaid on
NK waveform (black), for the early inspiral of a (101, 106)M EMRI with initial
semi-latus rectum p0 = 15M .
of the NK model. Fast algorithms for higher-order fits and plunge handling
have been incorporated in the latest AAK implementation, which has been
made publicly available at github.com/alvincjk/EMRI_Kludge_Suite
as part of a software suite for generating kludges.
The initial version of the AAK model yields waveforms that can remain
phase-coherent with NK waveforms for over two months, but with overlap
values lower than 0.97 [59]. This is the commonly chosen minimal match for
a waveform template bank that corresponds to a 90%-ideal observed event
rate [83], and thus ensures the equivalent localisation of any signal detected
with such banks of AAK and NK templates. In Section 4.3.3, we report fur-
ther improved results for the present AAK implementation [135]. Two-month
overlaps higher than 0.97 are found for EMRIs with varying spin and eccen-
tricity; however, the overlaps still degrade with proximity to plunge, due to
the divergence of the AAK and NK trajectories deep within the strong field.
§4.3 Augmented analytic kludge 67
4.3.1 The fundamental-frequency map
The geodesic equations (4.20)–(4.23) take a simple form with the choice of a
timelike parameter λ =
∫
dτ/Σ [152, 153]; this decouples (4.20) and (4.21), and
for a bound orbit makes the radial and polar components of motion mani-
festly periodic with respect to λ. For the azimuthal and temporal components
(whose potentials depend only on (r, θ)), overall rates of evolution may be ob-
tained by averaging (4.26) and (4.27) over many periods of (r, θ) motion.
From the radial and polar periods Λr,θ, the average azimuthal rate 〈dφ/dλ〉
and the average temporal rate 〈dt/dλ〉 (denoted Γ by analogy with the Lorentz
factor), we may define three angular and dimensionless fundamental frequen-
cies Ωr,θ,φ for the test particle’s motion with respect to coordinate time. In terms
of (rp, ra, θmin), these frequencies are written as [141, 154]
Ωr =
2pi
ΛrΓ
, (4.30)
Ωθ =
2pi
ΛθΓ
, (4.31)
Ωφ = lim
N→∞
1
N2ΛrΛθΓ
∫ NΛr
0
dλr
∫ NΛθ
0
dλθ Vφ(r(λr), θ(λθ)), (4.32)
where Λr,θ and Γ are given by
Λr = 2
∫ ra
rp
dr√
Vr
, (4.33)
Λθ = 4
∫ pi/2
θmin
dθ√
Vθ
, (4.34)
Γ = lim
N→∞
1
N2ΛrΛθ
∫ NΛr
0
dλr
∫ NΛθ
0
dλθ Vt(r(λr), θ(λθ)). (4.35)
Expressions for Ωr,θ,φ in terms of (e, ι, p) have been derived by Schmidt [138];
these are less compact, but have more utility in practical implementations.
In terms of the fundamental frequencies, the periapsis and Lense–Thirring
precession rates are given by Ωφ−Ωr and Ωφ−Ωθ respectively. These vanish in
the Newtonian limit, where Ωr,θ,φ approach a single orbital frequency Ω from
68 Augmented kludge waveforms
below, i.e. Ωr ↗ Ωθ ↗ Ωφ ↗ Ω. The frequency Ω is then related to (e, p) by
Kepler’s third law:
Ω =
(
1− e2
p
)3/2
. (4.36)
In the AK model, Ω is associated with the quantity Φ˙M ; however, periapsis and
Lense–Thirring precession are added on top of Φ˙ via γ˙ + α˙ and α˙ respectively,
and so Ω is the lowest frequency by construction. This inconsistency with the
relativistic case leads to mismatched frequencies when supplying identical pa-
rameters to the two models, since the same value of p in (4.36) specifies the ra-
dial AK frequency while approximating the azimuthal NK frequency. In other
words, the frequencies in the AK model are generally too high.
A three-dimensional endomorphism over the AK space of orbits is induced
by requiring that the radial, polar and azimuthal frequencies (Φ˙, Φ˙+γ˙, Φ˙+γ˙+α˙)
for any (e, ι, p) have the same values as the (dimensionful) relativistic frequen-
cies ωr,θ,φ := Ωr,θ,φ/M . We map the parameters (M,a, p) rather than (e, ι, p) to
unphysical values; this gives better results since periapsis and Lense–Thirring
precession are more directly determined by the central mass and its rotation
respectively. The map (M,a, p) 7→ (M˜, a˜, p˜) is given implicitly by solving the
algebraic system of equations
Φ˙(M˜, a˜, p˜) = ωr(M,a, p), (4.37)
γ˙(M˜, a˜, p˜) = ωθ(M,a, p)− ωr(M,a, p), (4.38)
α˙(M˜, a˜, p˜) = ωφ(M,a, p)− ωθ(M,a, p) (4.39)
for the unphysical set (M˜, a˜, p˜), which is defined as the root closest to the phys-
ical set (M,a, p) with a Euclidean metric on parameter space.
Substituting (M˜, a˜, p˜) for (M,a, p) in the AK model provides an instanta-
neous correction of its frequencies at any point along the inspiral trajectory
(e(t), ι(t), p(t)). In principle, applying the map along the entirety of a fidu-
cial inspiral will keep the AK waveform phase-coherent with relativistic wave-
forms (generated from that trajectory) until plunge. However, such an inspiral
is usually more expensive to compute (as in the case of the NK model), and
§4.3 Augmented analytic kludge 69
the additional cost from evaluating the map itself also scales linearly with the
number of points sampled along the trajectory. Complications also arise as
the compact object approaches the point of plunge, where the fundamental
frequencies diverge and the map (4.37)–(4.39) is no longer well-defined.
4.3.2 Implementation
In order to retain the main advantage of the AK model, computational costs are
kept as low as possible by evaluating the map at a small number of points and
relying on independent evolution of the orbital constants over long timescales.
Firstly, the NK trajectory is generated at and around the specified initial
point (M,a, e0, ι0, p0) over a user-defined timescale Tfit, which depends on
the radiation-reaction timescale TRR := M2/µ and specifies the duration over
which the AAK model is calibrated. The timescale Tfit and number of sample
points Nfit may be adjusted adaptively based on the proximity of (e0, ι0, p0) to
plunge; they typically satisfy 0.1TRR . Tfit . 10TRR and Nfit . 10 (the latter to
ensure an added computational cost of. 1%). Evaluation of the map at each of
the Nfit points gives a local “best-fit” trajectory (M˜(t), a˜(t), e(t), ι, p˜(t))fit for the
AAK model, where the unphysical (M˜(t), a˜(t))fit now change with time and
ιfit = ι0 is approximately constant over the duration Tfit.
The actual trajectory (M˜, a˜, e(t), ι, p˜(t)) in the AAK model is generated in-
dependently from the NK model, and hence more rapidly. From the mapped
initial point (M˜, a˜, e0, ι0, p˜0) (which also lies on the best-fit trajectory by con-
struction), (e(t), p˜(t)) are evolved with 3PN O(e6) expressions given by Sago &
Fujita [141], while (M˜, a˜, ι) are left constant. The configuration-space evolution
of (Φ, γ, α) is performed with the appropriate combinations (4.37)–(4.39) of the
fundamental frequencies, given by Sago–Fujita expressions that are consistent
at 3PN O(e6) with the phase-space evolution. Higher-order 4PN O(e6) expres-
sions [141] have also been tested, but these seem to result in poorer agree-
ment with NK waveforms (which use fluxes of up to 3PN), possibly due to the
known divergence of certain expansions beyond 3PN order [155].
With inclination constant along both the best-fit and 3PN O(e6) trajectories,
sampling (M˜, a˜, e(t), p˜(t)) at each of the Nfit points allows the calculation of
70 Augmented kludge waveforms
(M˜(t), a˜(t), e(t), p˜(t))fit − (M˜, a˜, e(t), p˜(t)) over the duration Tfit. This difference
trajectory is then fitted to polynomials in time, extrapolated over the lifetime
of the inspiral, and added to the 3PN O(e6) evolution. In the initial AAK im-
plementation, the coefficients of the quadratic fit are given by second-order
finite-difference quotients (i.e. Nfit = 3), which works well but only for small
values of Tfit. The present version uses a quartic least-squares fit, which al-
lows the choice of longer Tfit and consequently gives better long-term phase
agreement with NK waveforms.
The augmentations to the AK framework are focused on improving the
phase information of its waveforms, since the amplitude of a GW signal is
measured far less precisely than its phase. However, the calculation of ampli-
tude in the original AK model (A in (4.10)–(4.12)) is a decent approximation
since it is based on Φ˙M , which is assigned a value ≈ Ωφ that turns out to be
correct for this purpose (see discussion around (4.36)). Hence it is the AAK
amplitude that is shifted away from the fiducial NK value through the map-
ping of frequencies and the unphysical evolution of M˜ . A simple adjustment
is made to reverse this shift; in (4.10)–(4.12) for the AAK model, the amplitude
is now given by
A = (ωφM)
2/3µ
|R| , (4.40)
where the azimuthal frequency ωφ and the physical black-hole mass M are
used in place of Φ˙ = ωr and M˜ (i.e. M in the original AK model) respectively.
Finally, a fast method of plunge handling has been added to the present
AAK implementation; this feature is useful in general, but especially when
generating large numbers of AAK waveform templates for search algorithms.
The compact object plunges when its instantaneous orbit along the phase-
space trajectory E(t) becomes unstable, i.e.
∂2Vr(r, a,E)
∂r2
≤ ∂Vr(r, a,E)
∂r
= Vr(r, a,E) = 0, (4.41)
where Vr is given in (4.24) with E = (E,Lz, Q). This point is termed the last
stable orbit ELSO, and is precisely the point at which the discriminant D(a,E)
of the quartic polynomial Vr(r) changes sign from positive (four real roots) to
§4.3 Augmented analytic kludge 71
negative (two real roots) [156]. Since D is a simple analytic function of the
quartic coefficients, it is computationally trivial to check for stability at every
integration step for the phase-space trajectory, provided the evolution is done
in terms of (E,Lz, Q).11
Plunge detection in the AAK model is far less straightforward than in the
other two kludges, partly because the evolution is performed in terms of the
quasi-Keplerian orbital parameters, and the computational benefits of the dis-
criminant method are nullified by having to convert (e, ι, p˜)→ (E,Lz, Q). Fur-
thermore, (M˜, a˜, p˜) are unphysical; the inverse of the map (4.37)–(4.39) is com-
putationally expensive and (more crucially) ill-defined at plunge, and so can-
not be used to obtain the physical parameters for stability calculations.
To circumvent these issues, the AAK model uses (4.36) with Ω ≈ Ωφ to
obtain an approximation for the physical parameter p. While generating the
phase-space trajectory, it checks (between the least- and most-bound orbits
[62]) the stability of (e, ι, p) at every radiation-reaction interval TRR. Once the
stability changes across an interval, it then bisects that interval to find pLSO,
and smoothly zeroes the waveform over 10 additional orbits with a one-sided
Planck-taper window [157]. The added computational cost associated with
this algorithm is . 1%. Although the approximation for p is crude, the phase-
space trajectories in the AAK and NK models are generally divergent to begin
with, and the plunge points for both models may differ significantly even if a
more accurate expression is used.
4.3.3 Benchmarking
The specified initial state of an EMRI in the AAK model is described by
the intrinsic parameters (µ,M, a, e0, ι0, γ0, ψ0) and the extrinsic parameters
(p0, θS, φS, θK , φK , α0, D), where D := |R|.12 In configuration space, transfor-
11 The discriminant method is applicable to the NK model, and may speed it up slightly.
Currently, the NK implementation precomputes (for the specified value of a) an interpo-
lated pLSO surface over the relevant region of (e, ι)LSO space, by finding and examining
the roots of Vr(r) numerically. It then checks for p < pLSO(e, ι) when generating the
phase-space trajectory.
12 For a source at cosmological redshift z, the values of D and (µ,M) are replaced with the
luminosity distance D(1 + z) and the redshifted masses (µ(1 + z),M(1 + z)) respectively.
72 Augmented kludge waveforms
mations from XAAK = (ψ, γ, α) to XAK = (Φ, γ, α) and XNK = (ψ, χ, φ) are
required for a comparison of the three waveform models. We have chosen
to specify the quasi-Keplerian true anomaly ψ in the (shared) AAK parame-
ter space, since there is no closed-form expression for ψ in terms of the mean
anomaly Φ. The conversion ψ → Φ is given by the Keplerian expressions [158]
Φ = E − e sinE, (4.42)
E = tan−1
(√
1− e2 sinψ
e+ cosψ
)
, (4.43)
where E is known as the eccentric anomaly.
On the other hand, the AAK model retains the AK parameters (γ, α); these
have explicit meanings in the (intrinsic) L-based coordinate frame
(xˆ, yˆ, zˆ)L :=
(
Lˆ× Sˆ
SL,S ,
(Sˆ · Lˆ)Lˆ− Sˆ
SL,S , Lˆ
)
, (4.44)
where the normalisation factor SL,S := (1 − (Lˆ · Sˆ)2)1/2 and Lˆ(α) is given in
ecliptic coordinates by (4.13). The unit position vector of the compact object
with respect to this frame is rˆL = [cos (ψ + γ), sin (ψ + γ), 0]T , and a change of
basis to the S-based coordinate frame (4.14) gives
rˆS = Q
T
SQLrˆL, (4.45)
where the orthogonal matrices QS := [xˆ|yˆ|zˆ]S and QL := [xˆ|yˆ|zˆ]L are formed
from the triads in (4.14) and (4.44) respectively. It is then straightforward to
obtain (χ, φ) from rˆS = [sin θ cosφ, sin θ sinφ, cos θ]T , via (4.29).
As a generic example, we consider a prograde EMRI with redshifted com-
ponent masses (µ,M) = (101, 106)M, spin a = 0.5M , and initial orbital pa-
rameters (e0, ι0, p0) = (0.1, pi/6, 8.25M). The initial semi-latus rectum is chosen
such that the compact object plunges approximately one year after entering the
LISA band at a representative frequency fGW = 2.7 mHz, where fGW is defined
as twice the azimuthal orbital frequency (i.e. the dominant GW harmonic at
low eccentricity). In the AAK model, the fitting timescale and number of sam-
§4.3 Augmented analytic kludge 73
ple points are set to Tfit ≤ 10TRR and Nfit ≤ 10 respectively, with inequality in
the case of adaptive adjustments.
One important result from our comparison studies is that the AK model
can lead to an overestimation of SNR if used without modification. This is due
to the fact that the frequencies in the AK model are generally too high for any
given (e, ι, p), as mentioned in Section 4.2.1. The SNR of a signal h = hI+ihII is
given by (2.54), with the noise-weighted inner product (2.53) written in terms
of the one-sided noise power spectral density Sn(f > 0) := 2Sn(f) as [87]
〈a|b〉 = 2
∫ ∞
0
df
a˜(f)b˜∗(f) + a˜∗(f)b˜(f)
Sn(f)
= 4<
[∫ ∞
0
df
a˜(f)b˜∗(f)
Sn(f)
]
. (4.46)
In this chapter, Sn(f) is taken to be a LISA noise model for the L6A5 (six links,
five-million-kilometre arms) configuration known as classic LISA, assuming
the original mission requirements and including confusion noise from the fore-
ground of Galactic white-dwarf binaries [159].
At a luminosity distance of 5 Gpc, an NK signal (sampled at 0.2 Hz) from
the example EMRI described above has a one-year SNR of ρ = 30.8. When
using the AAK model to generate the signal, a comparable value of ρ = 32.1
is obtained. However, an AK signal from the same EMRI has ρ = 57.8. To
illustrate why this is the case, we consider the characteristic strain hc of a signal
and the noise amplitude hn; these are given respectively by [26]
hc(f) = 2f |h˜(f)|2, (4.47)
hn(f) =
√
fSn(f), (4.48)
such that
ρ2 =
∫ ∞
−∞
d(ln f)
(
hc(f)
hn(f)
)2
. (4.49)
With these definitions, the area between hc and hf on a log–log plot gives an in-
dication (but not an approximation) of SNR, and allows the relative detectabil-
ity of signals to be estimated. The characteristic strain for the three signals and
the LISA noise amplitude are shown in Figure 4.2, where the excess power
in the AK signal at higher frequencies is evident, along with the consequent
74 Augmented kludge waveforms
Figure 4.2: Characteristic strain for year-long AK (red), AAK (green) and
NK (black) signals from the example EMRI, along with the noise amplitude
(dashed) for the LISA configuration L6A5. A moving-average filter has been
applied to the hc curves, such that oscillations are smoothed out for ease of
visualisation while the overall spectral profile is preserved.
§4.3 Augmented analytic kludge 75
(µ/M, a/M, e0)
O(·|hNK)2 mth O(·|hNK)6 mth σ
hAK hAAK hAK hAAK
(101, 0.5, 0.1) 2.0× 10−3 9.5× 10−1 6.4× 10−4 1.5× 10−1 0.89
(100, 0.5, 0.1) 4.4× 10−5 5.9× 10−1 2.7× 10−6 1.3× 10−1 0.92
(101, 0.8, 0.1) −1.6× 10−3 9.3× 10−1 −2.7× 10−4 1.3× 10−1 0.91
(101, 0.5, 0.5) 8.6× 10−3 2.3× 10−1 −2.0× 10−3 4.3× 10−2 0.38
Table 4.3: Comparison of the AK and initial AAK models via two- and six-
month waveform overlapsO(·|hNK) for generic EMRIs with different compact-
object mass, black-hole spin and initial eccentricity. The computational speed-
up σ of both models over the NK model is given as well.
boost to SNR. This error is likely to persist for M & 106M, but may be miti-
gated for less massive central black holes as the maxima of the three hc curves
are blueshifted past the minimum of hn.
For the purposes of this work (where the NK model is taken as fiducial),
the phase accuracy of the AK and AAK models is assessed by how well their
waveforms overlap with NK waveforms. In [59], the overlapsO(hAK|hNK) and
O(hAAK|hNK) over two and six months are computed for the example EMRI, as
well as for the same source with (i) µ = 100M, (ii) a = 0.8M and (iii) e0 = 0.5.
The AK and initial AAK models have virtually identical computation times
τ , and are both quicker than the NK model with typical speed-up factors of
σ := 1− τ/τNK ≈ 0.9 (except in the case of e0 = 0.5, where σ ≈ 0.4). However,
the AAK model yields overlaps that are consistently higher, and by 2–3 orders
of magnitude in most cases (see Table 4.3).
The speed-up factors for the AAK model are preserved by the present im-
plementation, while its overlaps are increased across the board due to the en-
hanced fitting algorithm. Most notably, there is substantial improvement for
EMRIs with higher initial eccentricities, although this is partly attributable to
the use of a higher sample rate than that in [59] (around 0.03 Hz). The overlap
and timing performance of the AAK model across 2–6 months with varying
compact-object mass, black-hole spin and initial eccentricity is summarised by
the plots in Figures 4.4–4.6.
In Figure 4.4, the overlap O(hAAK|hNK) is computed for the example EMRI,
as well as for the same source with 0.5 ≤ lg (µ/M) ≤ 2. The minimum value of
76 Augmented kludge waveforms
Figure 4.4: Two- to six-month overlaps between AAK and NK waveforms for
generic EMRIs with varying compact-object mass. The computational speed-
up σ of the AAK over the NK is coded by colour. Vertical dashed lines cor-
respond to the example EMRI, while horizontal ones indicate the standard
minimal-match value of 0.97 for template banks.
Figure 4.5: Overlaps and computational speed-up as in Figure 4.4, but for
generic EMRIs with varying black-hole spin.
§4.3 Augmented analytic kludge 77
µ ≈ 3M is chosen since the upper bound of 10TRR for Tfit exceeds six months
for a less massive compact object, and so the overlaps in that regime will not
show significant improvement (even the six-month overlap is already > 0.97).
We also do not consider intermediate-mass black holes with µ > 100M. In-
stead of choosing p0 such that the EMRI plunges after one year (as done in
[59]), we consider fixed p0 = 8.25M in this analysis; this leads the overlaps to
degrade at larger rather than smaller mass ratios µ/M . The two-month over-
laps are> 0.97 up to µ ≈ 20M, for which plunge occurs at around 5.6 months.
There is greater speed-up over the NK model for longer waveform durations
as expected, but all values of σ are & 0.8.
Overlaps for the example EMRI with varying spin 0.1 ≤ a/M ≤ 0.9 are
shown in Figure 4.5. Again, all values of σ are & 0.8, with greater speed-
up for longer waveform durations. For fixed p0, prograde EMRIs with lower
spin start closer to plunge, and so the overlap values generally increase along
with a. There appears to be an opposing effect at higher spin (possibly due to
the additional degrees of freedom for error from fitting M˜ and a˜ in the AAK
model) that causes a fall-off in the four- and six-month overlaps for a & 0.6.
The two-month overlaps are > 0.97 across the full range of considered spins,
which is perhaps unsurprising since the upper bound of 10TRR for Tfit is also
two months for a (101, 106)M EMRI. However, we note here that while the
phase accuracy of the AAK model may be arbitrarily increased in principle by
taking Tfit > 10TRR, the computational requirement that Nfit . 10 will likely
reduce the quality of the trajectory fit at early times.
The effect of varying eccentricity for the example EMRI is illustrated in Fig-
ure 4.6. We consider initial eccentricities 0.05 ≤ e0 ≤ 0.5, since there turns out
to be no speed-up over the NK model (σ ≈ 0) when generating a two-month
AAK waveform with e0 = 0.5. This is an important limitation of the Peters–
Mathews approximation for waveform generation (see discussion in footnote
10), and will have to be addressed if the AAK model is to be useful in searches
for high-eccentricity EMRIs. Nevertheless, the two- and even four-month over-
laps are > 0.97 for e0 . 0.3, with σ & 0.5. As in the case of Figure 4.5, there is a
peak in the six-month overlap; this is probably due to variance in the fit for e,
and is unlikely to carry any fundamental significance.
78 Augmented kludge waveforms
Figure 4.6: Overlaps and computational speed-up as in Figure 4.4, but for
generic EMRIs with varying initial eccentricity.
Finally, we summarise here the results of a Fisher matrix calculation for
a year-long AAK signal from the example EMRI considered throughout this
section. The Fisher information matrix Γ for a GW signal h parametrised by λ
is given by (2.61), where λ = (µ,M,S,E,X,R) in the case of EMRIs. For our
AAK signal normalised to an SNR of ρ = 30, we find that the masses and spin
can be measured to within the fractional root-mean-square errors
∆(lnµ) ≈ 4× 10−5 (4.50)
∆(lnM) ≈ 2× 10−5, (4.51)
∆(ln (a/M)) ≈ 4× 10−5. (4.52)
These errors are an order of magnitude better than the corresponding values
for the AK model [57], and are more comparable to those cited for the NK
model [66] (although the latter consider a circular, equatorial EMRI with a =
0.9M ). They are also consistent with (the lower end of) the values reported in
the L2/L3 mission selection proposal [14] for ESA’s Cosmic Vision programme,
where the AK model was used but with a modified plunge criterion.
§4.4 Discussion 79
4.4 Discussion
The widely used AK waveform model of Barack & Cutler is extremely quick
to generate, but typically dephases within hours with respect to more accu-
rate EMRI waveforms due to a mismatch of its radial, polar and azimuthal
frequencies with the actual Kerr frequencies along its orbit. Hence using AK
waveforms as signal templates for EMRIs will result in reduced SNRs for de-
tection, as well as an inaccurate estimation of astrophysical parameters. This
might limit their utility in scoping out data analysis issues for space-based
GW detectors—an area where there are several important open questions that
must be addressed within the next few years, in preparation for LISA mission
adoption at the end of this decade.
We have proposed in this chapter an augmented AK waveform model that
features a frequency map to the Kerr fundamental frequencies, self-consistent
3PN O(e6) evolution equations from Sago & Fujita, and calibration to the NK
model of Babak et al. via a local phase-space trajectory fit that is extrapolated
over the waveform duration. The present implementation of the model also
includes various improvements such as amplitude corrections and fast plunge
handling. AAK waveforms are virtually as quick to compute as their predeces-
sors, but stay phase-coherent with NK waveforms for months and yield high
overlap values (> 0.97) across extended regions of parameter space.
The most pressing deficiency in the AAK model is the ill-defined nature of
(4.37)–(4.39) at the last stable orbit due to the divergent Kerr frequencies; this
complicates both plunge detection and the specification of orbital parameters
at plunge. Another limitation is that the mode-sum approximation (4.6) and
(4.7) becomes more expensive than the quadrupole formula itself at high ec-
centricities (e0 & 0.5). If the model can be made robust through the resolution
of these issues, it may also be upgraded with fits to improved NK models when
they become available (e.g. with Kerr self-force evolution included through the
framework in [67, 68]). With the era of space-based detectors drawing closer,
the AAK waveform model will be an essential tool for near-future work on
LISA data analysis and mock data challenges, as well as a useful addition to
the inventory of EMRI waveforms for millihertz GW astronomy.
80 Augmented kludge waveforms
Chapter 5
Template bank compression
One strategy for reducing the online computational cost of matched-filter de-
tection searches for GWs is to introduce a compressed basis for the waveform
template bank. In this chapter, we propose and investigate several combinato-
rial compression schemes for a general template bank. Through offline com-
pression, these tunable schemes are shown to yield faster detection and local-
isation of signals, along with moderately improved sensitivity and accuracy
over coarsened banks at the same level of computational cost.
The general method is inspired by a recently proposed template bank com-
pression scheme that potentially yields significant cost savings, but may be
demonstrated to suffer from a number of fundamental problems. Our com-
pression schemes address and rectify these problems; they might be useful for
any search involving template banks, and especially in the analysis of data
from space-based detectors, where online searches will be difficult due to the
long-duration waveforms and large parameter spaces of some sources.
The material in this chapter has been adapted from [160].
5.1 Background
Various strategies exist to reduce the online cost of evaluating matched-filter
inner products in GW detection algorithms, typically by shifting the com-
putational burden to the preparatory offline stage. Some methods focus
on making individual inner products computationally cheaper: this may be
achieved across regions of parameter space through direct template interpola-
tion [161, 162], or more generally by using a reduced-order quadrature [163].
81
82 Template bank compression
Other methods seek to reduce the number of inner products in a grid-based
search through template bank compression, i.e. the reduced-basis representa-
tion of a large template bank by a smaller set of templates [164–166].
In a recently proposed method of template bank compression [167], binary
labelling is used to define a non-orthogonal basis that maximises compression
losslessly (in the sense of perfect signal recovery without noise). This idea is
fully general and admits a much higher compression rate than existing meth-
ods based on the eigenvalue structure of the template bank, but comes with
significant penalties to detection sensitivity and localisation accuracy in the
presence of detector noise. The method as originally described also suffers
from an arbitrarily asymmetric treatment of templates, as well as a restrictive
level of compression that limits its practicality to high-SNR signals. While the
binary labelling method might be useful in the context of LISA (where source
SNRs are potentially higher than for ground-based detectors), its practical ap-
plicability to GW data analysis remains undeveloped and hence unclear.
In this chapter, we introduce and develop the related method of conic com-
pression (i.e. defining a compressed basis through conic combinations of tem-
plates) by characterising its performance under various simplifying assump-
tions, before investigating its viability for present and future GW detectors
with a more realistic example. We propose several compression schemes,
one of which subsumes a symmetric-treatment version of the binary labelling
method as a particular case. These tunable schemes feature discrete transitions
between zero and maximal compression, and offer fast detection and localisa-
tion of GW signals in the search space with a controlled loss (if at all) in sen-
sitivity/accuracy. Their generality and straightforward implementation also
allow them to supplement existing grid-search methods, or to rapidly identify
seed points for stochastic searches in parameter estimation.
The general method of conic compression is set out in Section 5.2. Three
families of conic compression schemes are then proposed in Sections 5.2.1–
5.2.3: a lossy scheme based on partitions of the template bank, and two lossless
schemes whose conic combinations are determined permutatively or by base
representations of template labels. We calculate the optimal detection statistics
for these schemes, and find that the standard maximum-overlap statistic is sig-
§5.1 Background 83
nificantly suboptimal for detection in the lossless case. Section 5.2.4 compares
the three schemes under simplified conditions, i.e. assuming the GW signal is
proportional to a single template in an orthogonal template bank. The lossy
partition scheme is shown to have slightly better detection sensitivity than its
lossless counterparts at the same level of compression. Furthermore, while
the lossless schemes provide automatic identification (i.e. localisation to a sin-
gle template) of the signal upon detection, the identification accuracy falls off
more rapidly with compression in the presence of noise.
We focus exclusively on the partition scheme in Section 5.3, where the or-
thogonality and single-template assumptions are lifted separately. As shown
in Section 5.3.1, the overall performance of the scheme is partition-dependent
in the case of a correlated (non-orthogonal) template bank, and must be pre-
optimised by grouping highly correlated templates together. The optimised
partition scheme retains the benefits of a correlated template bank up to high
levels of compression, and is superior to a simple coarsening of the template
bank (obtained by increasing the maximal mismatch between neighbouring
templates). Section 5.3.2 discusses the case of a GW signal lying in a low-
dimensional subspace of an orthogonal template bank, for which the detection
sensitivity of the scheme is not significantly reduced.
In Section 5.4, we implement the optimised partition scheme for a highly
correlated (maximal mismatch ≈ 0.01) template bank of ∼ 104 PN waveforms,
which describe only the inspiral phase of a comparable-mass binary coales-
cence. The scheme is shown to be viable for practical applications, as it per-
forms well on this example up to high levels of compression and at all consid-
ered values of SNR. Its detection rate for a signal injected centrally is superior
to that of the coarsening approach (especially at & 80% compression), and this
improvement is even more marked for a signal injected at the boundary of the
bank. In addition, the accuracy rate for localisation of the injection to a . 0.1%
region of the search space is undiminished up to ∼ 90% compression, and is
again higher than that of the coarsening approach.
The speed-up and enhanced accuracy in localising the GW signal with
conic compression is promising for LISA data analysis, where the online use
of template banks will be challenging for many sources (e.g. EMRIs with their
84 Template bank compression
large parameter space [168]). While the long duration of LISA signals is com-
putationally prohibitive to fully coherent detection searches even with com-
pression, our method is suitable for the shorter semi-coherent searches that
are required for rapid electromagnetic follow-up. Conic compression might
also provide an alternative to the singular-value-decomposition method used
in LIGO detection pipelines for stellar-mass binary coalescences [165]: it scales
well with parameter-space dimensionality, and easily matches or surpasses the
order-of-magnitude computational savings from that method.
5.2 Conic compression
We consider a generic bank of N waveform templates hn, where the template
labels n are drawn from the collection N := {n ∈ Z+ |n ≤ N}, and the tem-
plates have been normalised with respect to the noise-weighted inner product
(2.53) such that 〈hn|hn〉 = 1 for all n ∈ N. The inner products of the data x and
the templates define N associated statistics
xn := 〈x|hn〉, (5.1)
which may be used for detection and localisation in a simple grid search.
Our general method of compression is to reduce the number of statistic
evaluations from N to M by considering conic (i.e. positive-coefficient) combi-
nations of the original templates. The template labels are grouped into M sets
Um, where the set labels m are drawn from the collection M := {m ∈ Z+ |m ≤
M}, and the sets satisfy⋃m∈M Um = N. These sets define M conic templates
Hm :=
∑
n∈Um
hn, (5.2)
which are prepared offline (like the template bank itself), along with M asso-
ciated statistics
Xm := 〈x|Hm〉 =
∑
n∈Um
xn, (5.3)
which are evaluated at the online stage.
§5.2 Conic compression 85
Without any prior assumptions on the template bank, each template must
be treated equally. This is done by ensuring that:
(a) each combination is weighted equally;
(b) each combination includes the same number of templates;
(c) each template is included in the same number of combinations.
Definition (5.2) has been chosen to satisfy Condition (a), while Condition (b) is
imposed by further requiring card(Um) = card(Um′) for all m,m′ ∈M (where
the set cardinality card(S) is the number of elements in the set S). Condition
(c) must be enforced separately in the construction of the sets. The second
equality in Definition (5.3) relates the conic statistic evaluations to the original
statistics (5.1), which are no longer evaluated at the online stage.
To simplify analysis, we first assume the template bank is an orthogonal set
such that
〈hn|hn′〉 = δnn′ , (5.4)
where δnn′ is the Kronecker delta. We further assume the GW signal (if present)
lies in the one-dimensional subspace spanned by a single template in Hilbert
space, i.e.
x = Ah1 + {noise}, (5.5)
where A > 0 and the templates have been relabelled without loss of generality.
It follows from (2.54) and (5.5) that A = ρ is the source SNR. These orthogonal
and 1-D restrictions are neither realistic nor optimal, but facilitate the analytic
assessment and comparison of various compression schemes in this section.
The overall performance of conic compression is generally improved by the
lifting of these assumptions, which we consider in Sections 5.3 and 5.4.
In the presence of a GW signal, the expectation values and covariances of
the normally distributed original statistics (5.1) are now given by
E[xn] = A〈h1|hn〉 = Aδ1n, (5.6)
cov(xn, xn′) = 〈hn|hn′〉 = δnn′ . (5.7)
86 Template bank compression
As the labelling of templates is itself a probabilistic process with discrete uni-
form distribution, the original statistic vector x has the multivariate Gaussian
distributionN (µ(i),Σ) (with µ(i)n = E[xn] and Σnn′ = cov(xn, xn′)), but summed
over the N possible assignments i of 1 ∈ N and renormalised accordingly. If
the signal is absent, the distribution of x is simply N (0,Σ). Hence we have
p1(x) ∝ 1
N
N∑
i=1
exp
(
−1
2
xTΣ−1x + µT(i)Σ
−1x− 1
2
µT(i)Σ
−1µ(i)
)
, (5.8)
p0(x) ∝ exp
(
−1
2
xTΣ−1x
)
, (5.9)
where p1 and p0 are the probability density functions of x in the respective
presence or absence of a GW signal.
An optimal detection region R in Hilbert space maximises the detection
rate PD =
∫
R p1 subject to a given false alarm rate PF =
∫
R p0; hence p1 = λp0 on
its boundary ∂R for some Lagrange multiplier λ. Using (5.6)–(5.9), we define
the optimal detection statistic
xopt :=
p1(x)
p0(x)
=
1
N
exp
(
−A
2
2
)∑
n∈N
exp (Axn), (5.10)
such that the optimal detection surfaces ∂R are precisely the level sets of xopt
parametrised by λ, and a detection is claimed if xopt exceeds the threshold λT
corresponding to some fixed value of PF .
In deriving (5.10), we have implicitly assumed a population of GW sources
with equal likelihood and known signal amplitude. Equation (5.10) therefore
defines the optimal statistic for detecting events drawn from such a popula-
tion. For a population of sources that are not equally likely, we need to replace
the sum in (5.10) with a suitably weighted sum. Similarly, for a population
with a distribution of amplitudes, we need to marginalise (5.10) over A; in the
case of an (improper) uniform prior over the interval (−∞,∞), this would give
a detection statistic proportional to
∑
n∈N exp(x
2
n/2).
Any choice of population makes assumptions about the astrophysical dis-
tribution of GW sources that might not be justified. In this work, the focus is
§5.2 Conic compression 87
Figure 5.1: Three-dimensional projection of optimal detection surface for un-
correlated statistics xn, at true SNR of ρ = 2.
on the investigation and comparison of template bank compression schemes,
and so we only consider the equal-likelihood and known-amplitude popula-
tion assumed in the derivation of (5.10). While the treatment of amplitude in
particular is artificial, a search that is optimised for sensitivity to signal ampli-
tudes around the detection threshold will likely be near-optimal for any given
astrophysical population (and closer to optimality than a search tuned for the
wrong astrophysical population). Finally, we note that although (5.10) has
been derived as a frequentist optimal statistic, the same equation also arises
as the Bayes factor for the presence (versus absence) of a signal, assuming flat
model priors and the source population assumptions outlined above.
For sufficiently high SNR (large A), the optimal surfaces xopt = λ defined
by (5.10) are well approximated by semi-infinite hypercubes (see Figure 5.1),
i.e. the level sets of the standard maximum-overlap detection statistic [169]
xmax := max
n∈N
{xn}. (5.11)
Since the original statistics (5.1) are uncorrelated, the probability density func-
tions of xmax in the presence or absence of a GW signal are obtainable explicitly.
88 Template bank compression
These are given respectively by
q1(xmax) = F0(xmax)
N−1f1(xmax) + (N − 1)F0(xmax)N−2F1(xmax)f0(xmax), (5.12)
q0(xmax) = NF0(xmax)
N−1f0(xmax), (5.13)
where fs(xmax) is the probability density function for the Gaussian distribution
N (sA, 1), and Fs(xmax) is the cumulative distribution function
Fs(xmax) =
∫ xmax
−∞
du fs(u). (5.14)
For our analysis of conic compression schemes, we also require the expec-
tation values and covariances of the normally distributed conic statistics (5.3).
From (5.3), (5.6) and (5.7), it follows in the presence of a GW signal that
E[Xm] =
∑
n∈Um
E[xn] = A card({1} ∩Um), (5.15)
cov(Xm, Xm′) =
∑
n∈Um
∑
n′∈Um′
cov(xn, xn′) = card(Um ∩Um′), (5.16)
where the cardinalities are determined by the choice of compression scheme.
As before, the conic statistic vector X has the multivariate Gaussian dis-
tribution N (µ(i),Σ) (now with µ(i)m = E[Xm] and Σmm′ = cov(Xm, Xm′)), but
summed over the N possible assignments of 1 ∈ N and renormalised accord-
ingly. If the signal is absent, the distribution of X is againN (0,Σ). The proba-
bility density functions of X in the presence or absence of a GW signal are then
given respectively by (5.8) and (5.9) with x ≡ X.
We now propose and investigate three general conic compression schemes
in Sections 5.2.1–5.2.3, before comparing their performance and potential ap-
plicability in Section 5.2.4. The orthogonal and 1-D restrictions (5.4) and (5.5)
are assumed throughout Section 5.2.
§5.2 Conic compression 89
5.2.1 Partition scheme
The simplest method of grouping the template labels n is to take the family
of sets Um as a partition of N, i.e. Um ∩ Um′ = ∅ for all distinct m,m′ ∈ M.
Condition (c) is then automatically satisfied, while Condition (b) defines the
set cardinality P = card(Um) for all m ∈M. It follows that M = N/P .
For the comparison of schemes in Section 5.2.4, it is useful to introduce a
compression parameter K ∈ Z+ for each scheme, which determines the com-
pression rate
κ := 1− Neval
N
, (5.17)
where Neval = M is the required number of statistic evaluations (for detec-
tion or localisation purposes). This generates a sliding scale of groupings
that ranges from no compression at K = 1 to maximal compression at some
scheme-dependent value of K. We may clearly choose K = P for the parti-
tion scheme, such that maximal compression is given by K = N . The minimal
nontrivial compression is 50% at K = 2, while there are diminishing returns at
large K since κ(K) is concave-down.
From (5.15) and (5.16), we now have
E[Xm] = Aδ1m, (5.18)
cov(Xm, Xm′) = Pδmm′ , (5.19)
where the sets have been relabelled such that 1 ∈ U1 without loss of generality.
Again considering the N possible assignments of 1 ∈ N, the optimal detection
statistic Xopt := p1(X)/p0(X) follows from (5.8) and (5.9) (with x ≡ X) as
Xopt =
1
M
exp
(
−A
2
2P
) ∑
m∈M
exp
(
A
P
Xm
)
. (5.20)
Since the conic statistics for the partition scheme remain uncorrelated, the
optimal surfaces Xopt = λ resemble that in Figure 5.1, and in lieu of (5.20) it is
valid to consider the maximum-overlap detection statistic
Xmax := max
m∈M
{Xm}. (5.21)
90 Template bank compression
Figure 5.2: ROC curves for the partition scheme’s optimal and maximum-
overlap detection statistics, at different values of set cardinality P (with com-
pression rate κ in parentheses) for N = 256 and a true SNR of ρ = 10. The
dashed diagonal line indicates the worst possible performance, i.e. a random
search for which the detection and false alarm rates are equal.
§5.2 Conic compression 91
Plots of detection rate PD against false alarm rate PF for a detection statistic are
known as receiver operating characteristic (ROC) curves [161]. ROC curves for
both the optimal and maximum-overlap statistics are compared in Figure 5.2.13
With increased compression, the performance of the maximum-overlap statis-
tic falls away slightly from that of the optimal statistic, due to the lowering of
effective SNR A/
√
P in (5.20); nevertheless, (5.21) is a sound approximation as
both sets of curves show good overall agreement.
For the partition scheme to admit a useful (i.e. populated) sliding scale of
compression rates, the template bank might need to be trimmed or padded
such that N has as many divisors as possible. Fixing the false alarm rate and
choosing either a desired detection rate or a compression rate then allows ad-
vance determination of the conic templates (5.2) and the threshold λT , which
is the value of λ corresponding to the fixed false alarm rate. The algorithm
for detection follows as: (i) evaluate the conic statistics (5.3); (ii) claim a detec-
tion if Xmax > λT . Threshold and detection SNRs for the maximum-overlap
statistic may be defined respectively as
ρT :=
λT√
var(Xmax)
, (5.22)
ρD :=
Xmax√
var(Xmax)
. (5.23)
An extension of the detection algorithm is required for signal identification
(i.e. localisation to a single template), since the simple coarse-graining of parti-
tion compression does not distinguish between template labels in the same set.
The signal is most likely to be associated with the largest conic statistic evalua-
tionX(1), so the best candidate may be obtained by further evaluating all of the
original statistics xn contributing to X(1) and identifying the largest. This finer
level of evaluations increases the computational cost by P to Neval = M + P .
For better identification accuracy at lower SNRs, we may widen our search
to the i largestXm instead, at an added computational cost of iP . The standard
identification algorithms Ii for the partition scheme follow (after detection) as:
13 The curves for (5.20) were obtained via 105-trial Monte-Carlo simulations, while numeri-
cal integration of (5.12) and (5.13) was used to generate the curves for (5.21).
92 Template bank compression
(iii) evaluate the original statistics (5.1) for all n ∈ Vi, where
Vi :=
i⋃
j=1
U(j), (5.24)
with U(j) corresponding to the j-th largest conic statistic evaluation X(j); (iv)
identify maxn∈Vi{xn}.
Other identification algorithms may also be considered. One such alterna-
tive is obtained by defining a further partition of Vi into two sets and evalu-
ating the associated conic statistics, then identifying the set V′i corresponding
to the larger statistic evaluation and repeating the process with Vi ≡ V′i until
card(V′i) = 1. This method might be useful for large values of P ; it yields a
smaller added computational cost of 2 log2 iP , but incurs a penalty to identifi-
cation accuracy since the early iterations still involve coarse-grained searches.
5.2.2 Symmetric base scheme
Without an additional fine-grained search, partition compression is lossy in
the sense that the GW signal is not automatically identified in the limit of zero
noise. A lossless conic compression scheme proposed by Wang [167] repre-
sents each template label n in binary and assigns it to the set Um if its m-th
digit is 1. This binary scheme features the largest possible lossless compres-
sion (M = log2N ) and an automatic identification of the GW signal; however,
it suffers from an unequal treatment of templates (i.e. it violates Conditions
(b) and (c)) and hence it yields an arbitrary level of performance that depends
on the initial assignment of template labels. Furthermore, the restriction to
maximum compression limits its usefulness in practical applications.
We propose a compression scheme modelled on Wang’s binary scheme, but
symmetrised (for equal treatment of templates) and generalised to a sliding
scale of base representations (for tunable compression). The template labels n
are represented modulo N in base B, and each set Um ≡ Uk,b is constructed by
collecting all of the labels whose k-th digit is b (this includes b = 0, and gives a
symmetric version of the binary scheme when B = 2). For Conditions (b) and
(c) to be satisfied, we require logB N ∈ Z+; it follows that M = B logB N .
§5.2 Conic compression 93
The compression parameter is chosen as K = logB N , such that maximal
compression is given by K = log3N ≈ lnN (base-2 compression is slightly
suboptimal with symmetrisation). In contrast to the partition scheme, com-
pression for the symmetric base scheme is dependent on the size of the tem-
plate bank; the minimal nontrivial compression for N = 102 is nearly 80% at
K = 2 (base-
√
N compression), and over 95% for N = 104.
From (5.15) and (5.16), we have14
E[Xm] = Aδ0b, (5.25)
cov(Xm, Xm′) = B
K−2(Bδkk′δbb′ − δkk′ + 1), (5.26)
where m = B(k − 1) + b+ 1, and the templates have been relabelled such that
(5.5) becomes x = AhN + {noise} without loss of generality. Considering the
N possible assignments of N ∈ N, the optimal detection statistic follows from
(5.8) and (5.9) as
Xopt =
1
N
exp
(
−βKA
2
2
+ (β − α)A tr(X)
) K∏
k=1
B−1∑
b=0
exp (αAXk,b) (5.27)
with
α :=
B
N
, (5.28)
β :=
M −K + 1
NK
, (5.29)
where tr(X) :=
∑
m∈MXm.
The higher compression rates provided by the symmetric base scheme re-
sult from the non-empty intersections among the sets Uk,b with different values
of k. As seen in (5.26), these also lead to correlations among the conic statistics
Xk,b. The optimal detection surfaces given byXopt = λ differ significantly from
that depicted in Figure 5.1; their projections onto the correlated subspaces are
now compact hyperboloids, and no longer approach the semi-infinite hyper-
cubes of the maximum-overlap detection statistic at high SNR (see Figure 5.3).
14 The covariance matrix defined by (5.26) is rank-deficient, but we may take the Moore–
Penrose pseudoinverse Σ+ as a perturbative approximation to Σ−1 in (5.8) and (5.9).
94 Template bank compression
(a) (b)
Figure 5.3: Three-dimensional projection of optimal detection surface for cor-
related statistics Xk,b of symmetric base scheme with N = 256 and B = 4, at
true SNRs of (a) ρ = 10 and (b) ρ = 100.
Without a simple approximation for the optimal detection statistic, the most
feasible option is to use (5.27) itself with an estimate A of the true SNR. ROC
curves for the estimated statistic XA=ρopt with  ∈ {1/2, 2} are compared against
those for the optimal and maximum-overlap statistics in Figure 5.4. Not much
detection sensitivity (for a fixed false alarm rate) is lost if the true SNR can
be estimated to within a factor of two, while usage of the maximum-overlap
statistic now incurs a more noticeable drop in performance as expected.
The restriction ofN ,B andK to integer values also results in more sparsely
populated sliding scales than those admitted by the partition scheme. There
are two possible compression rates for N = 256 (base-2 compression is sub-
optimal compared to B = 4), and three for N = 6561 = 812 = 94 = 38; most
other values of N will admit only one or none. Notwithstanding the lack of
tunability, a feasible strategy is to trim or pad the template bank such that N is
a perfect square or cube, since the smallest values ofK already yield high com-
pression rates. The detection algorithm then follows as given in Section 5.2.1,
with some estimated detection statistic XA=ρopt in place of Xmax.
One key feature of the symmetric base scheme and other lossless methods
§5.2 Conic compression 95
Figure 5.4: ROC curves for the symmetric base scheme’s optimal, maximum-
overlap and estimated detection statistics, at different values of base B (with
compression rate κ in parentheses) for N = 256 and a true SNR of ρ = 10.
96 Template bank compression
of compression is the automatic identification of the GW signal (upon detec-
tion). In the symmetric base scheme, the label of the identified template in
base-B representation is given digit-wise by the largest conic statistic evalua-
tion Xk,(1) := maxb{Xk,b} for each value of k. However, as each digit k is iden-
tified individually, the overall identification accuracy falls off severely with
increasing K (i.e. the total number of digits).
A possible modification for better accuracy is to consider the i + 1 largest
Xk,b for each k and perform an additional fine-grained search over the (i+ 1)K
templates, which increases the computational cost accordingly. The standard
identification algorithms Ii for the symmetric base scheme follow (after detec-
tion) as: (iii) evaluate the original statistics (5.1) for all n ∈ Vi, where
Vi :=
K⋂
k=1
i+1⋃
j=1
Uk,(j), (5.30)
with Uk,(j) corresponding to the j-th largest conic statistic evaluation Xk,(j) for
each k; (iv) identify maxn∈Vi{xn}. Automatic identification is recovered for
i = 0, where steps (iii) and (iv) become unnecessary as card(V0) = 1.
For large values of K (small values of B), these identification algorithms
might still suffer from poor accuracy. One alternative algorithm is obtained
by defining some threshold XT and considering all conic statistic evalua-
tions Xk,b > XT , then performing the additional fine-grained search over all
of the corresponding templates. Such a threshold may be set prior to data-
taking; if Xk,(1) < XT for some value of k, the k-th digit of the number
is unconstrained and templates corresponding to all possible choices of that
digit are considered. Alternatively, XT may be based on the data by setting
XT = f mink{Xk,(1)} for some fixed fraction f , which ensures that at least one
possible value is identified for each digit. Both approaches will in general yield
increased accuracy, but they offer less control over the number of conic statistic
evaluations considered and hence the overall computational cost.
§5.2 Conic compression 97
5.2.3 Binomial coefficient scheme
The symmetric base labelling method is not the only construction of the sets
Um that preserves both lossless compression (automatic identification) and
equal treatment of templates (Conditions (b) and (c)). In general, we may rep-
resent any assignment of N templates to M sets with a collection of N M -digit
binary labels, where the m-th digit of each label is 1 if it appears in Um and 0
otherwise. Condition (c) implies that each label must appear in exactly R sets,
and hence contain exactly R 1’s. In addition, Condition (b) defines the set car-
dinality C = card(Um) for all m ∈ M, which yields the constraint NR = MC
(each of the N labels appears exactly R times across all sets, while each of the
M sets contains exactly C labels). For some given integers N ≥ M ≥ R, this
constraint is equivalent to the existence of
C =
NR
M
∈ Z+, (5.31)
which is both a necessary and sufficient condition for such a set construction
to be possible [170].
We now require that the conic statistics (5.3) are correlated symmetrically,
as seen in the partition scheme (but not the symmetric base scheme). This ad-
ditional condition implies that the intersection of each pair of sets has fixed
cardinality I , i.e. card(Um ∩ Um′) = I for all distinct m,m′ ∈ M. Consider-
ing the family of all such intersections then yields the constraint NR(R− 1) =
M(M − 1)I (each of the N labels appears exactly RC2 times across all intersec-
tions, while each of the MC2 intersections contains exactly I labels). For some
given integersN ≥M ≥ R and C satisfying (5.31), this constraint is equivalent
to the existence of
I =
NR(R− 1)
M(M − 1) =
C(R− 1)
M − 1 ∈ Z
+, (5.32)
which is a necessary (but not in general sufficient) condition for such a set
construction to be possible.
The general construction of a family of sets under the constraints (5.31) and
(5.32) is an open problem in combinatorial design theory (see Apppendix B). In
98 Template bank compression
this chapter, we restrict our focus to a special case that may be treated in greater
detail. Every M -digit binary number with exactly R 1’s is taken to represent a
distinct template label; the set cardinality then equals the number of (M − 1)-
digit binary numbers with exactly (R−1) 1’s, while the intersection cardinality
of each pair of sets equals the number of (M − 2)-digit binary numbers with
exactly (R− 2) 1’s. Hence for all distinct m,m′ ∈M, we have
N = MCR, (5.33)
C = M−1CR−1, (5.34)
I = M−2CR−2, (5.35)
such that (5.31) and (5.32) are satisfied. We refer to this as the binomial coeffi-
cient scheme. The usual ordering of the binary numbers gives a natural map
onto the original label collection N = {n ∈ Z+ |n ≤ N}, although the inverse
map is analytically nontrivial (but straightforward in practice).
As the binomial coefficient scheme shares many similarities with the sym-
metric base scheme, we only highlight its key features in this section. The
compression parameter is chosen as K = R, such that maximal compression
is given by K = cbc−1(N)/2 (where cbc(M) := Γ(M + 1)/Γ(M/2 + 1)2 is the
continuous extension of the central binomial coefficient MCM/2). Compression
rates again depend on the size of the template bank; at small values of K, they
are only slightly higher than those of the symmetric base scheme.
From (5.15) and (5.16), we have
E[Xm] = A
R∑
r=1
δrm, (5.36)
cov(Xm, Xm′) =
M−2CR−2
(
M −R
R− 1 δmm′ + 1
)
, (5.37)
where the sets have been relabelled such that 1 ∈ Ur for 1 ≤ r ≤ Rwithout loss
of generality. Considering the N possible assignments of 1 ∈ N, the optimal
§5.2 Conic compression 99
detection statistic follows from (5.8) and (5.9) as
Xopt =
1
N
exp
(
−βKA
2
2
+ (β − α)A tr(X)
) N∑
i=1
exp
(
αA
∑
r∈Ri
Xr
)
(5.38)
with
α :=
1
M−1CR−1
, (5.39)
β :=
1
M−2CR−1
, (5.40)
where the sets Ri are the N distinct R-combinations of the collection M.
All of the conic statistics Xm are correlated symmetrically, as seen in (5.37).
Upon projection onto any three-dimensional subspace, the optimal detection
surfaces given by Xopt = λ resemble those in Figure 5.3 at both low and high
SNRs. It follows that the maximum-overlap detection statistic is again an in-
adequate approximation to the optimal statistic, and we are compelled to use
(5.38) itself (assuming an accurate estimate of true SNR). We do not include
ROC curves for the binomial coefficient scheme here, as they are very similar
to those in Figure 5.4.
A direct comparison of the base and binomial schemes is difficult, since
there are few suitable values of N that are exactly valid for both schemes. Lack
of tunability is also more of an issue for the binomial scheme: the only val-
ues of N that admit more than one nontrivial compression rate might be the
Singmaster numbers [171] (which admit two as they appear six times in Pas-
cal’s triangle), and it is not known whether any number admits more than two
(apart from N = 3003 = 78C2 = 15C5 = 14C6 [172]). The problem may be
overcome by considering a more general compression scheme satisfying the
conditions (5.31) and (5.32). This is beyond the scope of the current work due
to the complexity of set construction (see Appendix B), but might be investi-
gated for specific template banks in the future.
The detection algorithm for the binomial coefficient scheme is as given in
Section 5.2.1, with some estimated detection statistic XA=ρopt in place of Xmax.
Automatic identification is available as well, with the label of the identified
template given uniquely by theR largest conic statistic evaluations. For higher
100 Template bank compression
accurate-identification rates, a possible alternative is to consider the R + i
largest Xm and perform an additional fine-grained search over the R+iCR tem-
plates. The standard identification algorithms Ii for the binomial coefficient
scheme follow (after detection) as: (iii) evaluate the original statistics (5.1) for
all n ∈ Vi, where
Vi :=
R+iCR⋃
k=1
⋂
j∈Jk
U(j), (5.41)
with U(j) corresponding to the j-th largest conic statistic evaluation X(j) and
the sets Jk given by the R+iCR distinct R-combinations of the set {j ∈ Z+ | j ≤
R + i}; (iv) identify maxn∈Vi{xn}. Automatic identification is recovered for
i = 0, where steps (iii) and (iv) become unnecessary as card(V0) = 1.
We note that another possible scheme would be a “direct sum” of the par-
tition scheme and either the symmetric base or binomial coefficient scheme.
The collection of template labels is first partitioned into subcollections, each
of which is further decomposed into smaller sets via one of the correlated
schemes; these sets may also be recombined across the initial partition for in-
creased compression. We do not consider this further here, but such an ap-
proach would overcome some of the difficulties associated with the restricted
values of N for the base and binomial schemes.
5.2.4 Scheme comparison
In this section, we compare the performance of the uncorrelated partition
scheme and its two correlated alternatives across three areas: compression,
detection and identification (i.e. localisation to a single template). The detec-
tion and identification plots here (and throughout the rest of the chapter) were
obtained using 105-trial Monte-Carlo simulations, and so the errors on each
plot point are ∼ 10−3 for a 1-sigma binomial confidence interval.
Log–log plots of M against N for various conic compression schemes are
shown in Figure 5.5, where the maximum lossless compression provided by
the binary scheme [167] has also been included for reference. As alluded to
in Sections 5.2.1–5.2.3, the partition scheme has the largest range of compres-
sion rates, both in terms of compression bounds (plot area) and admitted rates
§5.2 Conic compression 101
Figure 5.5: Log–log plots of M against N for various tunable/fixed conic com-
pression schemes. For each tunable scheme, the corresponding shaded region
indicates the range of possible compression rates (with the trivial compression
settingK = 1 excluded). Not every point in this region is realisable in practice,
as discussed in the text.
102 Template bank compression
(discrete density, not shown). The two correlated schemes cover similar areas
at lower densities in Figure 5.5, with the binomial coefficient scheme offering
slightly greater compression.
Detection performance for each compression setting of a given scheme may
be measured by detection sensitivity at a fixed false alarm rate (which is simply
read off the corresponding ROC curve), or by a summary statistic that captures
most of the information contained in an ROC curve (e.g. the area AROC under
the curve). Since an ROC curve always lies above the no-discrimination line
PD = PF , we define the discrimination
D := 2AROC − 1, (5.42)
which serves as a measure of how well the detection statistic discriminates
between true and false positives. Figure 5.6(a) shows plots of discrimination
against compression for the three proposed schemes at different values of true
SNR, with N ≈ 256. We use the maximum-overlap detection statistic in lieu of
the optimal statistic for the partition scheme, and are compelled to chooseN =
210 for the binomial coefficient scheme. The three schemes have comparable
performance at lower SNRs, but the partition scheme begins to outpace its
correlated alternatives as SNR increases.
To compare identification performance (after a true detection), we consider
plots of accurate-identification rate PI against compression, but only for the
fastest standard algorithms of each scheme (i.e. I1 for the partition scheme, and
automatic identification I0 for the correlated schemes). The rate PI for each plot
point is calculated using all and only the trials with the injected signal present,
and therefore assumes perfect detection throughout (PD = 1 and PF = 0). This
decouples identification from detection: it allows standardised comparison of
the schemes at a fixed false alarm rate, and does not penalise the identification
performance of any method for having inferior detection performance.
As seen in Figure 5.6(b), the usefulness of lossless compression and au-
tomatic identification is limited in the presence of noise; the addition of a
simple fine-grained search to the partition scheme is enough to yield signif-
icantly higher identification accuracy at the cost of marginally lower compres-
§5.2 Conic compression 103
(a) (b)
Figure 5.6: Plots of (a) discrimination D and (b) accurate-identification rate
PI against compression rate κ for the partition, symmetric base and binomial
coefficient schemes, at different values of true SNR ρ for N ≈ 256.
sion. The turnaround in accurate-identification rates for the partition scheme
at larger values of P is due to the additional statistic evaluations used in the
fine-grained search, which for I1 givesNeval = M+P in (5.17). SinceM = N/P ,
κ(P ) has one turning point. For this example, P = 8 and P = 64 provide the
same level of compression; identification accuracy is higher for the former at
ρ = 10, similar for both at ρ = 4, and higher for the latter at ρ = 2.
In summary, the partition scheme offers better overall performance than
its correlated alternatives at the same level of compression. For detection, the
introduced correlations among the conic statistics lead to slightly reduced de-
tection sensitivity and discriminatory power at high SNR; furthermore, the
potential benefits of lossless compression for identification turn out to be nul-
lified by the effects of noise. Hence there appears to be little reason for using
correlated schemes over the partition scheme, which is more promising as it is
easy to implement and admits a relatively populated sliding scale of compres-
sion rates. We further investigate and implement the partition scheme as the
representative conic compression scheme in Sections 5.3 and 5.4.
104 Template bank compression
5.3 Orthogonality and subspaces
The conic compression schemes proposed in Section 5.2 are fully general, in
the sense that no prior assumptions about the template bank are made apart
from (5.4) and (5.5). These orthogonal and 1-D restrictions are neither realistic
nor optimal, as template banks typically feature highly correlated neighbour-
ing templates and are unlikely to contain a template exactly proportional to
the GW signal itself. In this section, we discuss the (separate) lifting of each as-
sumption for the partition scheme, and the resultant effects on detection sensi-
tivity and localisation accuracy. Each approach may be viewed as a simplified
limiting case of an actual template bank, which can always be made dense
enough to include a signal-proportional template (assuming model accuracy),
or orthogonally decomposed. A more realistic example with both assumptions
lifted is considered in Section 5.4.
5.3.1 Non-orthogonal templates
We first consider a sufficiently dense bank of correlated (non-orthogonal) tem-
plates, such that the GW signal still lies in the 1-D subspace spanned by a
single template in Hilbert space. From the first equalities in (5.6), (5.7), (5.15)
and (5.16), it follows in the presence of a GW signal that
E[Xm] = A
∑
n∈Um
〈h1|hn〉, (5.43)
cov(Xm, Xm′) =
∑
n∈Um
∑
n′∈Um′
〈hn|hn′〉. (5.44)
Any partition of N as in Section 5.2.1 defines a splitting of the (sorted) orig-
inal mean vector and covariance matrix into P × 1 blocks and P × P blocks
respectively; each entry in the conic mean vector and covariance matrix is then
simply the sum of entries in the corresponding block, which reflects the coarse-
graining of the compression.
As a toy model for investigating non-orthogonal templates, we use a
frequency-parametrised bank of sinusoidal waveforms h = sin (2pift) with fi-
§5.3 Orthogonality and subspaces 105
nite observation time T . Assuming white noise for simplicity, the inner prod-
uct (2.53) may be written for real time series as
〈a|b〉 ∝
∫ T
0
dt a(t)b(t). (5.45)
For anN -template bank with fmin ≤ f ≤ fmax and δf := (fmax−fmin)/(N−1)
fmin, the overlaps are given by
〈hn|hn+∆n〉 ≈ 2
∫ 1
0
dt sin (2pifmint) sin (2pi(fmin + |∆n|δf)t), (5.46)
where we have normalised to T = 1 such that f is given in waveform cycles per
observation time. This sinc-like function of ∆n ∈ Z yields a band covariance
matrix for xn; we set N = 256, and choose the frequency bounds such that
cov(xn, xn±1) ≈ 0.97 (i.e. a maximal mismatch of around 0.03).
In contrast to the orthogonal case, the choice of partition generally affects
the performance of the partition scheme for non-orthogonal templates. For the
one-parameter template bank with overlaps given by (5.46), we consider both a
randomised partition and a more optimised (but not necessarily optimal) par-
tition with Um = {n ∈ Z+ | (m− 1)P < n ≤ mP}. We also include for compar-
ison a uniformly spaced M -template subset of the original bank; equivalently,
Um = {n ∈ Z+ |n = mP} where
⋃
m∈M Um 6= N. This “coarsened” template
bank is obtained not through compression, but by simply reducing the correla-
tion (increasing the maximal mismatch) between neighbouring templates. The
standard detection algorithm outlined in Section 5.2.1 is then applied for the
two partition schemes and the coarsening method.
Figure 5.7(a) shows plots of discrimination (using Xmax) against compres-
sion for both choices of partition and the coarsened template bank, with perfor-
mance in the presence of a GW signal averaged over theN possible locations of
the corresponding template in the bank. The optimised partition (with highly
correlated templates grouped together) outperforms its randomised counter-
part at all considered values of true SNR. It also shows improvement over the
coarsening method at higher compression, which is expected as it uses infor-
mation from the full N -template bank rather than just an M -template subset.
106 Template bank compression
(a) (b)
Figure 5.7: Plots of (a) discrimination D against compression rate κ for the
randomised/optimised partition scheme and the coarsening method, and (b)
accurate-identification/localisation rate PI against compression rate κ for the
optimised partition scheme and the coarsening method, at different values of
true SNR ρ for a non-orthogonal bank. Accurate localisation here is defined as
the identification of the template h1 or one of the nearest P templates.
§5.3 Orthogonality and subspaces 107
The largest statistic evaluation for the coarsened template bank identifies a
best guess for the GW signal, but the accuracy of this identification is zero if
the signal does not correspond to a template in the coarsened bank. Since the
spacing of the coarsened bank is P , we may consider the best-guess template
as representative of the P templates nearest to it (or P − 1 if P is odd), and
say that the largest statistic evaluation localises a best guess for the signal. We
then define the localisation to be accurate if the correct template h1 is one of
those templates (equivalently, if the identified best-guess template is h1 or one
of the nearest P templates). The identification algorithms in Section 5.2.1 also
identify a single best-guess template for the partition scheme, which allows
us to consider both accurate identification (to a precision of 1) and accurate
localisation (to a precision of P ) in the same way. Figure 5.7(b) shows plots
of these accuracy rates (using I1, which gives Neval = M + P in (5.17)) against
compression for the optimised partition scheme and the coarsening method.
As in Section 5.2.4, the turnaround in accurate-identification and localisa-
tion rates for the partition scheme is due to the additional statistic evaluations
of the fine-grained search. The localisation rates increase up to some level of
compression, which is mainly because “accurate” localisation is defined up to
a degree of precision that degrades with compression; this effect is seen for the
coarsening method as well. Localisation to within the spacing of the original
template bank (i.e. identification) decreases monotonically in accuracy for the
partition scheme, and will not be achievable for the majority of signals with
the coarsening method. The partition scheme localises the GW signal with
slightly greater accuracy than the coarsening method, and in fact identifies it
with virtually no fall-off in accuracy at significant compression levels.
Increasing the correlation between neighbouring templates is known to im-
prove the detection and localisation performance of a general template bank
[173]. Results in this section illustrate that the partition scheme retains these
benefits up to high levels of compression, and provides a superior alternative
to simply coarsening the template bank for computational savings. The via-
bility of conic compression becomes even more evident in Section 5.4, where
we apply the partition scheme to a larger and more broadly correlated two-
parameter template bank.
108 Template bank compression
5.3.2 Two-dimensional subspace
Throughout Sections 5.2 and 5.3.1, we have assumed that the GW signal is
exactly proportional to a template in the bank. To understand the impact on
compression performance when this is not the case, we consider a bank of N
uncorrelated templates obtained through some orthogonalisation procedure
(e.g. as in [164–166]) on a general template bank, and a signal lying in the
N -dimensional Hilbert space spanned by the orthogonal set.
IfN is large, the signal is typically restricted to a low-dimensional subspace
(this follows from the volume of anN -sphere). For simplicity, we assume it lies
exactly between two templates in a 2-D subspace, i.e.
x = A(h1 + h2) + {noise}, (5.47)
where the templates have been relabelled without loss of generality and A =
ρ/
√
2 from (2.54). Hence the expectation values of the original and conic statis-
tics become
E[xn] = A(δ1n + δ2n), (5.48)
E[Xm] = A card({1, 2} ∩Um), (5.49)
while their covariances remain as (5.7) and (5.16) respectively. The assumption
(5.47) is the worst-case scenario for a 2-D subspace, since the signal is maxi-
mally far from both templates in the subspace.
Although it is not possible to pre-optimise the choice of partition for or-
thogonal templates, the performance of the partition scheme in the presence of
a 2-D GW signal falls into two partition-dependent cases. At small values of
P , it is more likely that the labels 1 ∈ Um and 2 ∈ Um′ are assigned to different
sets (m 6= m′); as P increases, so does the probability that they are assigned to
the same set (m = m′), which improves performance (e.g. the effective SNR for
detection purposes is raised by a factor of
√
2).
The standard detection algorithm in Section 5.2.1 is applicable for a 2-D
signal, while the standard identification algorithms may be generalised at step
(iv) by considering the two largest original statistic evaluations instead. Fig-
ure 5.8 shows plots of discrimination and accurate-identification rate against
§5.3 Orthogonality and subspaces 109
(a) (b)
Figure 5.8: Plots of (a) discrimination D and (b) accurate-identification rate PI
against compression rate κ for the partition scheme with 1-D and 2-D signals,
at different values of true SNR ρ. The higher dotted curves for each value of
ρ correspond to the scenario where the template labels 1 and 2 are assigned
to the same set, while the lower curves correspond to them being assigned to
different sets. Accurate identification for the 2-D case is defined as the identi-
fication of both templates h1 and h2.
110 Template bank compression
compression for a 2-D signal ∝ (h1 + h2), compared against a 1-D signal ∝ h1
with the same true SNR ρ. The identification algorithm I2 is used, since the
accuracy rate of I1 falls to zero if m 6= m′. This gives Neval = M + 2P in (5.17).
For detection of a 2-D GW signal, the effectiveness of the partition scheme
is reduced slightly at lower SNRs, but mitigated by the case where m = m′ (i.e.
the higher dotted curves in Figure 5.8). Detection performance for this special
case actually improves up to some level of compression, which can occur as the
symmetry among all possible signals is broken (by the partitioning process).
A similar effect is seen for the example in Section 5.4. The discrimination for
a 1-D signal generally lies within the 2-D discrimination bounds; at higher
compression rates, there is little to no detection performance lost if the signal
is not confined to a 1-D subspace.
Accurate identification of a 2-D GW signal (i.e. the identification of both
h1 and h2 in this toy model) is more problematic than in the 1-D case, since
accuracy rates are reduced to begin with and fall off rapidly even at high SNR.
Nevertheless, options such as lowering compression or switching to Ii>2 are
available for the partition scheme, which should at least allow the template
with maximal signal overlap to be identified at acceptable accuracy rates.
If the true SNR is sufficiently high, the algorithms Ii>j may also be used to
identify a j-dimensional GW signal described by an arbitrary linear combina-
tion of templates, i.e.
x =
j∑
k=1
Akhk + {noise}, (5.50)
where Ak > Ak+1 and the templates have been relabelled without loss of gen-
erality. At step (iv) of the algorithms, each ordered weight Ak may be approx-
imated by the k-th largest original statistic evaluation x(k), with the detection
SNR given by
ρD =
√√√√ j∑
k=1
x2(k). (5.51)
While this method fully recovers the (relative) weights of a GW signal’s j
largest modes in the limit of zero noise, its accuracy might be significantly
diminished for lower-SNR signals or at higher levels of compression.
§5.4 Taylor-T2 example 111
5.4 Taylor-T2 example
In this section, we implement the (optimised) partition scheme described
in Sections 5.2.1 and 5.3.1 for a larger and more realistic example: a two-
parameter template bank of mixed-order PN waveforms, which describe the
gravitational radiation emitted during the inspiral part of a comparable-mass
binary coalescence. An optimised partition here (and in general) refers to a
partition of the template bank for which the convex hulls of the sets are non-
intersecting, such that highly correlated templates are grouped together.
The waveform family we use is the Taylor-T2 approximant [38, 174]
for a circular and non-inclined binary with comparable component masses
(m1,m2 ≥ m1). These waveforms are parametrised by chirp mass (2.31)
and symmetric mass ratio (2.27), and are written as PN expansions in the
frequency-related variable ξ := (GMη−3/5φ˙/c3)2/3, where units have been re-
stored and φ˙ is the orbital phase frequency. We truncate the PN expansions
at finite order, specifying the phase and amplitude to 3.5PN and 2PN respec-
tively; the resultant mixed-order waveform may be written compactly as [175]
hM,η(t) =
2GMη2/5
c2R
A(t) exp (2iψ(t)), (5.52)
where R is the source distance (which the true SNR ρ is inversely proportional
to), and expressions for the amplitude function A and tail-distorted orbital
phase ψ are given in Appendix C.
Template bank compression is potentially more important for analysing
data from space-based detectors, since the long duration and potential com-
plexity of GW signals in their low-frequency sensitivity band greatly increases
the cost of a grid-based detection search. As SMBH binary coalescences are an
anticipated source for LISA, we consider a Taylor-T2 signal with the parame-
ters θC = (1, 0.15), where θ := (M/(106M), η); this corresponds to a black-
hole binary inspiral with component masses (m1,m2) = (0.6, 2.5)×106M. The
duration of the signal is set to tc = 1 yr.
We also generate a bank of Taylor-T2 templates with the same duration,
each normalised with respect to the inner product (2.53), where Sn(f) is given
112 Template bank compression
by an analytic approximation to the (down-scoped) LISA noise power spectral
density [176]. These templates are gridded uniformly in the transformed pa-
rameters θ′ := θC + L(θ − θC) (128 points in each parameter), with the signal
lying in the middle of the four central templates and the linear transformation
L chosen such that the template overlaps are isotropic with respect to the grid
(at least for the central region). The maximal mismatch of each template with
its four nearest neighbours is around 0.01.
Since the N = 16384 templates are pre-sorted by the (skewed) square grid,
an optimised (but not necessarily optimal) partition is obtained by the obvious
grouping into M blocks of
√
P ×√P templates. This particular template bank
admits six nontrivial square partitions with P ∈ {4, 16, 64, 256, 1024, 4096}; we
do not consider the case P = 4096, as P = 1024 already yields a compression
rate of 99.9%. A large number of rectangular partitions (where P = 2i with
0 < i < 14) are also possible, but we omit these here for simplicity as they
are degenerate with the square partitions and among themselves. Square par-
titions are straightforward to generalise for various lattice choices [177–179],
and will be fairly optimal as long as the templates are gridded uniformly in
the parameter-space metric.
The expectation values of the original and conic statistics (given by the first
equality in (5.6) and (5.43) respectively) are visualised in Figure 5.9, where the
coarse-graining of the compression is evident. Overlaps for the Taylor-T2 tem-
plate bank are much less localised than the toy model overlaps in Section 5.3.1;
this is due to their wider cycle widths in both M and η, as well as a slight
degeneracy in the two parameters (overlaps at the boundary of the first plot
in Figure 5.9 can be as high as 0.4ρ). As the templates are so broadly corre-
lated and the GW signal is injected right in the centre of the bank, the partition
scheme is expected to perform well up to a high level of compression.
For comparison purposes, we again consider the simple coarsening method
discussed in Section 5.3.1. The smaller coarsened banks are formed by select-
ing individual templates near the centre of each square block in the original
bank, rather than by summing the templates as in the partition scheme. Detec-
tion and localisation performance for both the partition scheme and the coars-
ening method on the Taylor-T2 template bank with central injection is sum-
§5.4 Taylor-T2 example 113
Figure 5.9: Matrix/contour plots of the expectation values E[xn] and E[Xm] for
the partition scheme, at different values of set cardinality P (with compression
rate κ in parentheses) for a Taylor-T2 GW signal (red cross) injected between
the four central templates of a (128×128)-template Taylor-T2 bank. The bank is
gridded uniformly in linearly transformed parametersM′ (increasing from top
to bottom) and η′ (increasing from left to right) with maximal mismatch≈ 0.01.
Overlap values depend on the true SNR ρ (set to 1 in these plots), and range
from positive (orange) to negative (blue) in some subinterval of (−Pρ, Pρ).
114 Template bank compression
(a) (b)
Figure 5.10: Plots of (a) detection rate PD (at fixed false alarm rate PF ) and (b)
accurate-localisation rate PI (to nearest ν templates) against compression rate
κ for the optimised partition scheme and the coarsening method, at different
values of true SNR ρ for a GW signal injected with central parameters θC . Ac-
curate localisation here is defined as the identification of a template within the
central squares of ν templates.
§5.4 Taylor-T2 example 115
marised in Figure 5.10. The semi-log plots in this section use an abscissa of
− lg (1− κ)/3, as most of the considered compression rates are > 90%.
Instead of the discrimination (5.42), we quantify detection performance us-
ing the detection rates at two fixed false alarm rates PF = 10−2 and PF = 10−4
(the number of Monte-Carlo trials performed for each plot point is ∼ 105, and
so the errors are ∼ 10−3 for a 1-sigma binomial confidence interval). At all
considered values of SNR and fixed false alarm rate, there is no fall-off in the
partition scheme’s detection performance up to κ = 93.8% (and even a slight
increase, due to the special choice of central injection). While this is also the
case for the coarsening method, detection rates for the partition scheme are
distinctly higher at compression rates of > 90%, with improvements of over
0.1 at κ = 99.9%.
The identification algorithm I1 is used to localise the GW signal, which
gives Neval = M +P in (5.17). Rates for accurate localisation to within two cen-
tral squares of 12×12 templates (corresponding to< 1% of the entire bank) and
4 × 4 templates (< 0.1% of the bank) are considered. Localisation is typically
improved by compression up to κ = 93.7%, which is provided by two different
values of P (see discussion in Section 5.2.4). The two values are P = 16, beyond
which the matrix/contour plot of E[Xm] in Figure 5.9 loses scale-similarity to
that of E[xn], and P = 1024, for which performance is regained as each conic
template incorporates more of the original templates and accuracy is added
by the fine-grained search. Localisation is poorer at κ = 98.0%, which corre-
sponds to both P = 64 and P = 256. To reduce clutter in Figure 5.10(b), only
the higher localisation rates for κ = 93.7% and κ = 98.0% are plotted. The
partition scheme outperforms the coarsening method at most levels of com-
pression, especially in the case of accurate localisation to within the smaller
square of 4× 4 templates.
For the special case of a centrally injected GW signal, the detection and
localisation performance of the partition scheme is non-decreasing up to high
levels of compression and can even rise above that of the original template
bank; however, this may also be said for the coarsening method. To illustrate
that the improvement of the partition scheme over the coarsening method is
not simply due to the special choice of injection, we consider two other cases:
116 Template bank compression
Figure 5.11: Matrix/contour plots of the expectation values E[xn] for Taylor-T2
GW signals (red crosses) injected in a (128 × 128)-template Taylor-T2 bank at
random (left) and near the boundary (right).
a Taylor-T2 signal injected with randomly drawn parameters θR = (1.0, 0.16),
and another injected near the boundary of the bank with the parameters θB =
(0.98, 0.06) (i.e. in the middle of the four corner templates with low chirp mass
and symmetric mass ratio). The expectation values of the original statistics for
these two injections are visualised in Figure 5.11.
Figure 5.12 shows detection and accurate-localisation rates for both the par-
tition scheme and the coarsening method on the random and boundary injec-
tions. The random injection is actually recovered with slightly better rates than
the central injection rates in Figure 5.10, but with a similar improvement of the
partition scheme over the coarsening method. A more marked difference be-
tween the two methods is obtained for the boundary injection. Detection rates
for both methods are now non-increasing, with the partition scheme showing
greater improvement over the coarsening method; for the ρ = 6 case, the im-
provement is around 0.3 at κ = 93.8%. Rates for accurate localisation of the
boundary injection to within the corner square of 12 × 12 templates follow a
similar trend, with a largest improvement of around 0.5 (again for the ρ = 6
case at κ = 93.7%).
Detection and localisation performance for this Taylor-T2 example is
injection-dependent, as it is for any realistic template bank: there is clearly no
symmetry among all possible GW signals, since the templates are asymmetri-
§5.4 Taylor-T2 example 117
(a) (b)
Figure 5.12: Plots of (a) detection rate PD (at fixed false alarm rate PF = 10−2)
and (b) accurate-localisation rate PI (to nearest 12×12 templates) against com-
pression rate κ for the optimised partition scheme and the coarsening method,
at different values of true SNR ρ for a GW signal injected with random param-
eters θR and boundary parameters θB. Accurate localisation here is defined as
the identification of a template within the square of 12 × 12 templates nearest
to each injection.
118 Template bank compression
cally correlated and the signals may lie between templates. We have not un-
dertaken a full injection-averaged analysis (similar to that in Section 5.3.1) due
to the size of the template bank, but overall detection rates for such an analy-
sis should decrease monotonically with compression as per intuition, with the
partition scheme outperforming the coarsening method (as it does for the three
injections presented here, as well as several others we have examined).
The partition scheme is expected to remain robust for searches in a (d > 2)-
dimensional parameter space. As the number of templates that are highly cor-
related with the GW signal increases exponentially with d, enlarging the span
P of each conic template at the same rate should maintain detection sensitivity
while increasing the relative computational savings (which scale as 1 − 1/P ).
Good scaling with parameter-space dimensionality allows conic compression
to be competitive with other search techniques that reduce computational cost.
For example, the method of searching over the signal time-of-arrival using
fast Fourier transforms yields a logarithmic reduction in the number of search
points for that parameter [177], but for multidimensional searches an overall
logarithmic reduction is easily attained by the partition scheme with little im-
pact on performance. The two methods might even be combined for greater
savings, by constructing conic sums of templates aligned at a fixed reference
time and using their Fourier transforms to search over time-of-arrival.
5.5 Discussion
In this chapter, we have presented and compared three tunable conic compres-
sion schemes (partition, symmetric base and binomial coefficient) for a GW
template bank in a grid-based detection search. The bank is compressed in the
preparatory offline stage, which yields faster detection and localisation of sig-
nals by reducing the number of inner product evaluations performed online.
A recently proposed binary labelling method, modified to ensure the equal
treatment of templates, is subsumed as a particular case of the symmetric base
scheme. Optimal detection statistics have been calculated for all three schemes
under simplified conditions, and the standard maximum-overlap detection
statistic is shown to be significantly suboptimal for the base and binomial
§5.5 Discussion 119
schemes. While these two lossless schemes provide automatic identification
of the GW signal upon detection, the benefits of this are negated in the pres-
ence of noise. Furthermore, the lossy partition scheme offers better overall
performance than its counterparts at the same level of compression.
We have applied the partition scheme to toy models of (i) a correlated tem-
plate bank with a signal-proportional template and (ii) a signal lying in the
span of orthogonal templates, to show that it remains feasible under such con-
ditions. These toy models are instructive as they represent the two limiting
cases of a general template bank. Correlations among the original templates
result in partition-dependent performance, but this may be optimised before-
hand by grouping highly correlated templates together; the optimised parti-
tion scheme is then superior to a simple coarsening of the template bank. If
the signal is proportional to a linear combination of templates in an orthogo-
nal bank, the detection performance of the scheme is not significantly reduced.
Conic compression performs well if the original template bank is suffi-
ciently correlated, as demonstrated by our example implementation of the op-
timised partition scheme for a bank of ∼ 104 PN waveforms. We consider a
centrally injected GW signal, a randomly injected one, and one at the bound-
ary of the bank; again, the scheme is superior to the coarsening method across
the board. The partition scheme is shown to be viable for practical applica-
tions, as it maintains good detection sensitivity and localisation accuracy up to
high levels of compression and at all considered values of SNR for this more
realistic template bank.
In summary, our tunable conic compression schemes—specifically the opti-
mised partition scheme—might provide an effective method of improving the
speed, detection sensitivity and localisation accuracy of GW template banks.
The schemes are potentially useful for any search involving template banks, as
they are fully general and may easily be adapted to supplement existing algo-
rithms in GW data analysis pipelines. Conic compression is also particularly
promising in the context of LISA data analysis, where online grid searches
will be difficult as computational costs are more prohibitive; for example, the
method could be used as an online tool to rapidly identify nearby sources be-
fore merger and generate alerts for electromagnetic telescopes.
120 Template bank compression
Chapter 6
Gaussian process regression
The theoretical uncertainty in waveform models must be accounted for when
performing Bayesian parameter estimation on data from GW detectors, espe-
cially as it could be the dominant source of error at high SNRs. A recently
introduced method deals with model error by marginalising over it, using
a prior probability density that is constructed through the machine-learning
technique of Gaussian process regression.
In this chapter, we outline the framework of the method and summarise
the results of its application to LIGO parameter estimation, before performing
a preliminary investigation of its viability for EMRI sources in LISA data anal-
ysis. Through low-dimensional parameter estimation studies, we demonstrate
that the method remains computationally practical in the light of various diffi-
culties associated with searching the large EMRI parameter space. We also dis-
cuss possible strategies for its generalisation to higher-dimensional searches.
The material in this chapter is adapted from selected parts of [134, 180, 181],
but the main results in Section 6.3 have yet to be published.
6.1 Background
A typical EMRI will be observed by LISA for ∼ 105 orbits over the mission
lifetime. Although this allows its properties to be measured very precisely, the
results are consequently susceptible to any inaccuracy in the waveform models
that are used in the Bayesian parameter estimation algorithms. This “theoreti-
cal error” will likely dominate over statistical error at high SNRs [182]. At the
same time, EMRIs are notoriously difficult to model accurately, since the ex-
121
122 Gaussian process regression
treme mass ratio prohibits the use of NR simulations. Waveform models that
employ self-force calculations are still under development; they will also be
computationally expensive, and are unlikely to be used directly for parameter
estimation after they become available.
Current data analysis studies for LISA are therefore heavily reliant on the
kludges described in Chapter 4, which are designed for robust use in search
algorithms and can be modified to include self-force information. However, it
will still be challenging to do a rough detection search of the full EMRI pa-
rameter space with kludge waveforms [168], much less explore it with the
precision required for accurate parameter estimates. EMRI waveforms are ex-
tremely sensitive to small changes in their parameters, and so the global peak
in the vast and multimodal posterior surface is akin to the proverbial needle
in a haystack. This fact, coupled with the difficulty in modelling the complex
waveforms and the systematic bias due to theoretical error, makes EMRI pa-
rameter estimation the most formidable problem in LISA data analysis.
In this chapter, we investigate the machine-learning technique of Gaussian
process regression (GPR) [183, 184] as a possible strategy for mitigating the-
oretical error in EMRI parameter estimation. Specifically, GPR is used to in-
terpolate a small set of precomputed waveform differences between an accu-
rate model and an approximate one; the GPR interpolant then provides a prior
probability density for the waveform difference, which allows theoretical error
to be marginalised over in the standard Bayesian likelihood with the approx-
imate model [185]. The benefits of this method for GW parameter estimation
are twofold: it includes information from computationally expensive wave-
forms while searching with faster but less accurate ones, and accounts for any
residual model inaccuracy with more conservative error estimates.
An overview of the new marginalised-likelihood method is given in Sec-
tion 6.2. The technique of GPR (in the context of waveform interpolation)
and the training of the Gaussian process model from the precomputed set of
waveform differences are introduced in Sections 6.2.1 and 6.2.2 respectively;
Section 6.2.3 then briefly summarises results from a LIGO-oriented proof-
of-concept study, where the method is applied to parameter estimation for
comparable-mass black-hole binary mergers.
§6.2 The marginalised likelihood 123
In Section 6.3, the viability of the GPR likelihood for EMRI parameter es-
timation is investigated, and a simple argument is provided to show that the
minimal allowed density of the training set (over parameter space) is typically
much lower than the sampling density of a parameter estimation search with
the accurate waveforms. This is verified by one- and two-dimensional exam-
ples in Sections 6.3.1 and 6.3.2 respectively, where the relationship between
the waveform-difference Fisher information matrix and the optimality of the
GPR model and training set is also made clear. Finally, possible computational
strategies for generalising the method to higher-dimensional searches are dis-
cussed in Section 6.4.
6.2 The marginalised likelihood
In this chapter, we consider waveform models h(λ ∈ Λ) over a common `-
dimensional parameter space Λ, which we take to be isomorphic to R` for
simplicity. The standard Bayesian likelihood L(λ|x) for the model parame-
ters (given the data x) is defined in (2.59), with the usual noise-weighted inner
product (2.53) on the function space W of finite-length complex time series. A
maximum-likelihood estimation λML of the parameters may then be obtained
by maximising (2.59) over Λ, such that〈
∂h
∂λ
(λML)
∣∣x− h(λML)〉 = 0. (6.1)
For a waveform model hacc that provides an accurate description of the
source, the waveform template at the true parameter values λtrue is consistent
with the source signal, i.e. x = hacc(λtrue) + n. At leading order in n, any error
λ := λML − λtrue in the measured parameter values is then purely statistical,
in that it depends linearly on n. Using Einstein notation, we may write (6.1) as
〈(∂hacc)b|n− (∂hacc)a(λ)a〉 = 0 =⇒ (λ)a = (Γ−1acc)ab〈n|(∂hacc)b〉, (6.2)
where ∂h ≡ ∂h/∂λ is the waveform derivative vector and Γ is the Fisher infor-
mation matrix (2.61), with both evaluated at λML.
124 Gaussian process regression
In general, a waveform model happ that is used for parameter estimation
may only be approximate, such that happ(λtrue) 6= hacc(λtrue). At leading order
in n, any error in the measured parameter values may be written as
(λ)
a = (Γ−1app)
ab〈n|(∂happ)b〉 − (Γ−1app)ab〈h|(∂happ)b〉, (6.3)
where the first term is statistical in the sense of (6.2), and the second term
corresponds to the theoretical error that arises from the difference h between
the approximate and accurate waveforms, i.e.
h := happ − hacc. (6.4)
Again, all derivatives in (6.3) are evaluated at λML, while the waveform differ-
ence h is evaluated at λtrue.15
The statistical-error terms in (6.2) and (6.3) are inversely proportional to the
waveform SNR ρ = 〈h|h〉1/2, since ∂h ∝ ρ and Γ ∝ ρ2. On the other hand, the
theoretical-error term in (6.3) is independent of ρ. Hence the systematic bias
incurred by using approximate waveforms happ in (2.59) may dominate the
noise uncertainty for high-SNR sources, and is likely to be the limiting factor
in extracting parameter information from such signals [182].
One approach to account for this bias is to marginalise over the error of happ
(with respect to hacc) in (2.59), by specifying a suitable prior probability density
p(h) for the waveform difference [185]. The “marginalised likelihood” is given
by the (functional) integral
L ∝
∫
Dh Lappp(h), (6.5)
which can be evaluated analytically if p(h) is Gaussian (sinceLapp is also Gaus-
sian). Such a prior may be obtained through the technique of GPR, which pro-
vides an interpolant for h with an associated normal distribution at each point
in parameter space.
15 However, h(λtrue) = h(λML) at leading order, which allows the theoretical error in (6.3)
(and hence λtrue itself, at high SNR) to be estimated for a given measurement λML [182].
§6.2 The marginalised likelihood 125
6.2.1 The Gaussian process prior
In the GPR approach, the waveform difference h ∈ W is modelled as a zero-
mean Gaussian process over Λ, i.e.
h ∼ GP(0, k), (6.6)
where the mean is chosen (uninformatively) as the time series 0 ∈ W , and the
covariance function k(λ,λ′) is some symmetric and positive-definite bilinear
form on Λ. For any finite set of parameter points {λi ∈ Λ | i = 1, 2, ..., N}, the
corresponding set of waveform differences {h(λi) ∈ W | i = 1, 2, ..., N} has a
Gaussian probability distribution N (0,K) on WN , i.e.
p(v) ∝ 1
det K
exp
(
−1
2
vTK−1v
)
, (6.7)
where the waveform difference vector v and covariance matrix K are given
respectively by
v = [h(λi)]. (6.8)
K = [k(λi,λj)], (6.9)
It is convenient to write the quadratic form in (6.7) as
vTK−1v = tr(K−1M), (6.10)
where
M := vvT = [〈h(λi)|h(λj)〉]. (6.11)
In choosing a frequency-independent form k(λ,λ′) for the covariance func-
tion, we have assumed that the correlations among the waveform differences
across parameter space do not depend on frequency. The particular normalis-
ing factor in (6.7) is only correct for a single frequency component (i.e. if the
waveform difference at each parameter point is perfectly correlated across all
frequency bins), but is retained for practical reasons.16 Finally, the inner prod-
uct for waveform differences in (6.11) is chosen to be the same noise-weighted
16 Also, the “usual” exponent of 1/2 is absent due to integration over the complex plane.
126 Gaussian process regression
inner product (2.53) as for the waveforms. These assumptions simplify the
GPR calculations, but are also conservative in the sense that they yield less
informative likelihoods; a more detailed justification is provided in [180].
From the definition of a Gaussian process, the enlarged set {h(λi), h(λ)}
is again normally distributed with zero mean and the covariance matrix
K∗ :=
[
K k∗
kT∗ k∗∗
]
, (6.12)
where
k∗ := [k(λi,λ)], (6.13)
k∗∗ := k(λ,λ). (6.14)
If {h(λi)} is known, then the conditional probability density of h(λ) given
{h(λi)} is also Gaussian, i.e.
p(h(λ)|{h(λi)}) ∝ 1
σ2
exp
(
−1
2
〈h(λ)− µ|h(λ)− µ〉
σ2
)
, (6.15)
where µ(λ) and σ2(λ) are given respectively by
µ = kT∗K
−1v, (6.16)
σ2 = k∗∗ − kT∗K−1k∗. (6.17)
The conditional probability (6.15) forms the basis of GPR, and yields an
interpolation of h(λ) from a small precomputed “training” set
D := {(λi, h(λi)) | i = 1, 2, ..., N}. (6.18)
This interpolant is given by the waveform difference mean µ(λ), with associ-
ated variance σ2(λ); it provides a new GPR-informed waveform model
hGPR := happ − µ, (6.19)
which approximates hacc via (6.4). Equation (6.15) also supplies the prior den-
§6.2 The marginalised likelihood 127
sity for h in (6.5), which evaluates to
L ∝ 1
1 + σ2
exp
(
−1
2
〈x− hGPR|x− hGPR〉
1 + σ2
)
. (6.20)
The GPR marginalised likelihood has several desirable features for param-
eter estimation. A maximum-likelihood estimation of λwith (6.20) gives
〈(∂hGPR)b|n− h + µ− (∂hGPR)a(λ)a〉 = 0 (6.21)
from (6.1) and (6.19) with x = hacc(λtrue) + n and λML = λtrue + λ, and so
(λ)
a = (Γ−1GPR)
ab〈n|(∂hGPR)b〉 − (Γ−1GPR)ab〈h|(∂hGPR)b〉
+(Γ−1GPR)
ab〈µ|(∂hGPR)b〉, (6.22)
where the third term is proportional to the GPR interpolant µ and acts to cancel
the second term by design. This correction greatly reduces the systematic bias
due to theoretical error, provided the interpolant is performing optimally (i.e.
µ ≈ h) near λtrue.
Another safeguard against theoretical error is the presence of the GPR vari-
ance σ2 in (6.20). This variance is 1 when µ ≈ h, but may become ∼ 1 far
from all training-set points, or in the case of a suboptimally chosen training set
or covariance function. The consequent broadening of the Gaussian in (6.20)
is then conservative in nature, as it usually prevents the true parameter values
from being excluded at high significance.
Lastly, the premise of the GPR approach is based on the availability of ac-
curate waveforms hacc that are extremely expensive to compute, and hence
unsuitable for use in Monte Carlo search algorithms with the standard like-
lihood. The marginalised likelihood remains computationally tractable while
including information from hacc, since it only uses the approximate waveforms
happ and adds to them some linear combination of precomputed waveform
differences via (6.16). Any extra online computational cost from using the
marginalised likelihood thus scales linearly with the size N of the training set.
The scaling coefficient (relative to the cost of (2.59) with happ) is typically small;
for the analyses in Section 6.3, it is ∼ 10−3.
128 Gaussian process regression
6.2.2 Training the Gaussian process
With the zero-mean assumption in (6.6), the waveform difference model is
fully specified by the covariance function k. The standard approach is to define
a functional form for k that depends on a number of hyperparameters θ, and
to select values for θ by training the Gaussian process with information from
the set D. A covariance function k(λ,λ′) is stationary if it depends only on the
relative position λ−λ′ of the two parameter points; it is furthermore isotropic
if it depends only on
τ 2 := gab(λ− λ′)a(λ− λ′)b, (6.23)
where the gab are the `(` + 1)/2 independent components of some constant
parameter-space metric g on Λ.
An investigation of various common isotropic (hence stationary) covari-
ance functions in the GW context finds the performance of the GPR interpolant
and the marginalised likelihood to be fairly robust against changes in the func-
tional form for k [180]. Hence we consider a single fixed form in this chapter:
the squared-exponential covariance function
kSE(λ,λ
′) = σ2f exp
(
−1
2
τ 2
)
, (6.24)
which is the smooth limiting case for several different families of covariance
functions. The hyperparameters for the model GP(0, kSE) then comprise only
the metric components gab and some overall scale factor σ2f .
As the size of the training set D increases, the covariance matrix K rapidly
becomes ill-conditioned, even for a modestly sized set with N & 10. This is
partly mitigated by the addition of noise to D, such that the GPR interpolant
need only pass close to (rather than through) each training-set point. Keeping
σ2f as the overall scale factor, we transform
K→ K + σ2fσ2nIN , (6.25)
where IN is the identity matrix, and the fractional noise variance σ2n of each
training-set point is taken to be uniform and fixed (i.e. not treated as a hy-
§6.2 The marginalised likelihood 129
perparameter). In practical terms, the transformation (6.25) effectively reduces
the condition number of K, thereby facilitating its numerical inversion. We use
σ2n = 10
−4 throughout this work, which is the smallest value compatible with
all of the N . 100 training sets considered in Section 6.3.
The most straightforward method of selecting the Gaussian process hyper-
parameters θ is through maximum-likelihood estimation with the hyperlikeli-
hood Z(θ|D) = p({h(λi)}) in (6.7), i.e. the likelihood for the hyperparameters
given the training set. In other words, an optimal set of hyperparameters θML
is obtained by maximising the log-hyperlikelihood
lnZ = −1
2
tr(K−1M)− ln det K + const (6.26)
over the hyperparameter space Θ.
Part of this maximisation may be done analytically, since the overall scale
σ2f factors out of the matrix expressions in lnZ [181]. In the case of (6.26), lnZ
with K = σ2fKˆ achieves a maximum in σ
2
f at
σ2f =
1
2N
tr(Kˆ−1M). (6.27)
Substituting (6.27) back into (6.26) then gives a scale-invariant form for lnZ:
lnZ = −N ln tr(Kˆ−1M)− ln det Kˆ + const. (6.28)
Equations (6.27) and (6.28) effectively reduce the dimensionality of the hyper-
parameter space by one (to dim Θ = `(`+1)/2 for the model GP(0, kSE)), which
is useful for the low-dimensional searches conducted in Section 6.3.
6.2.3 Application to black-hole binary mergers
In [180], the viability of the GPR marginalised likelihood for improving GW
parameter estimation is demonstrated through a one-dimensional (` = 1)
study, using waveforms for merging black-hole binary systems with compa-
rable component masses (m1,m2 ≥ m1). Two waveform approximants im-
plemented in the LIGO Scientific Collaboration Algorithm Library [186] are
130 Gaussian process regression
considered: the phenomenologically fitted IMRPhenomC model [187] and the
analytic Taylor-F2 model [188], which are taken as accurate and approximate
respectively. Even though these two waveforms are qualitatively different
(IMRPhenomC describes the full inspiral–merger–ringdown while Taylor-F2
is inspiral-only), the marginalised likelihood functions as expected in reduc-
ing systematic bias.
Equation (6.20) is used to estimate the chirp mass (2.31) from synthetic data
x = hacc(λtrue) (no detector noise), where hacc(λtrue) is an injected IMRPhe-
nomC signal with Mtrue = 5.045M and fixed mass ratio q = 0.75. As the
density of the training set (with respect to some metric on Λ) is expected to
be the strongest determinant of interpolation performance, two different grid
sizes in M are considered: ∆M = 10−2M and ∆M = 5 × 10−3M. The
corresponding points are gridded uniformly around Mtrue across the range
5 ≤ M/M ≤ 5.6, such that the density of the training set is varied by fixing
its span and changing its cardinality.
A GPR model with the squared-exponential covariance function (6.24) is
then trained on both sets by optimising the single independent metric hyper-
parameter, which is chosen more intuitively in the ` = 1 case to be the covari-
ance length δM := (gMM)−1/2. In general (and for the considered sets), the
optimal value of δM shifts as the density and cardinality of the training set
are varied, but typically by less than a factor of two. The performance of the
marginalised likelihood is found to be similarly robust against the choice of
δM for a given training set.
Unsurprisingly, the performance of the marginalised likelihood is im-
proved for the denser training set. Higher fidelity between the GPR waveform
(6.19) and the accurate waveform is obtained across the span of the set; this is
quantified by overlaps O(hGPR|hacc) that are & 0.999. The variance σ2 associ-
ated with the waveform difference interpolant is smaller for the denser train-
ing set as well, with values that are . 10−3 relative to σ2f (which is the limiting
value of σ2 outside the span of the set). A maximum-likelihood estimation of
M with the corresponding marginalised likelihood is therefore closer to the
true value, and better constrained. The sparser training set in this study gives
O(hGPR|hacc) & 0.985 and σ2/σ2f . 10−2, with a marginalised likelihood that
§6.3 Application to extreme-mass-ratio inspirals 131
is shifted away and broader but still functional. As will be discussed in Sec-
tion 6.3, this is because its density is close to some threshold determined by the
optimal value of δM (which is largely independent of training-set density).
Different source SNRs in the range 8 ≤ ρ ≤ 64 are also considered
in this study. The marginalised likelihood for the sparser training set re-
duces the systematic error in the maximum-likelihood estimation ofM from
M = 5× 10−3M (around 10-sigma for a typical LIGO source with ρ = 16) to
M = 9×10−4M; it also broadens to remain consistent withMtrue at 2-sigma,
even at high SNR. These results are obtained in the regime where the overlaps
between the accurate and approximate waveforms are ≈ 0.35 across the span
of the training set. Although theoretical error is reduced if the approximate
model is improved, results in Section 6.3 show that the marginalised likeli-
hood remains needed for overlaps as high as ≈ 0.97, which may still lead to
significant systematic bias for a typical EMRI with ρ = 30.
6.3 Application to extreme-mass-ratio inspirals
The detection and characterisation of EMRIs is a formidable challenge in GW
data analysis, especially in the broader context of resolving these sources from
a LISA data set that is likely to contain an (over)abundance of long-lived and
overlapping signals [189]. Even as a standalone problem, searches of the EMRI
parameter space are greatly hindered by its large volume (with respect to the
Fisher information metric), which requires ∼ 1030 waveforms for full cover-
age in a template bank approach [168]. This is exacerbated by the long and
unwieldy waveforms themselves; a sampling rate of 0.2 Hz (the characteristic
frequency for an EMRI with a 106M central black hole) yields ∼ 107 samples
for both channels of a year-long waveform.
Due to the O(N3) cost of computing the Cholesky decomposition for K in
(6.16) and (6.17), it is clearly impractical—if not impossible—to cover any sig-
nificant fraction of parameter space with a single training set. The present pur-
pose of the GPR marginalised likelihood is thus restricted to precise parameter
estimation in highly localised regions of parameter space. Furthermore, if the
GPR approach is to be useful for EMRIs at all, the typical separation of points
132 Gaussian process regression
in the training set must be greater than the Fisher metric lengths of the accu-
rate waveform model, which determine the characteristic resolution of grid or
stochastic searches with that model [134].
A simple argument shows that this is normally the case for EMRI wave-
forms with ρ > 1. We consider a small neighbourhood of some point λ0 ∈ Λ,
along with a covariance metric g for a Gaussian process that optimally de-
scribes the distribution of h in that neighbourhood. The metric defines the
short covariance lengths δλa := (gaa)−1/2, i.e. the half widths of the associ-
ated hyperellipsoid in each one-dimensional parameter subspace through λ0.
These lengths place upper bounds on the characteristic grid sizes of the train-
ing set and lower bounds on its span; an incompatible training set typically
yields no peak in the log-hyperlikelihood surface (6.26), and so the regression
becomes suboptimal.
From the optimality of the Gaussian process, the covariance lengths asso-
ciated with g approximate the correlation lengths of h itself. Hence we have
δλa ∼ (δλover)a, where the vector of overlap lengths δλover is defined to satisfy
〈h(λ0)|h(λ0 + Paδλover)〉 = 0 (6.29)
for each a, with Pa projecting δλover onto the subspace corresponding to λa. At
leading order, we then have
δλa ∼ 〈h|h〉|〈h|(∂h)a〉| =
ρ secφ
a√
(Γ)aa
, (6.30)
where ∂h and Γ respectively denote the derivative vector and Fisher informa-
tion matrix of the waveform difference, and φa is the principal inner-product
angle between h and (∂h)a. The covariance lengths do not scale with the SNR
ρ of the waveform difference, since the final denominator in (6.30) is ∝ ρ.
We now consider the Fisher metric lengths of the waveform difference. As
pointed out in [180], the Fisher lengths of the difference between two wave-
form models are generally larger than those of the individual models, espe-
cially if both models generate waveforms with high overlap. This might be due
to the waveform difference having a lower SNR, or being more broadly corre-
§6.3 Application to extreme-mass-ratio inspirals 133
lated across parameter space. Since the first of these two factors is known, we
may separate it from the analysis and consider the Fisher lengths of the unit-
SNR waveform difference instead. For the EMRI examples that follow in this
section, these are found to be comparable to the Fisher lengths of the unit-SNR
waveforms, which is simply the statement that the waveform difference varies
across parameter space in similar fashion to the waveforms themselves.
Since the short (i.e. defined analogously to δλa) Fisher metric lengths of the
unit-SNR waveform difference are given by
(δλFish)
a :=
ρ√
(Γ)aa
, (6.31)
it follows that δλa & (δλFish)a. In general, any waveform derivative with re-
spect to a parameter that only affects amplitude will have φa ≈ 0, such that
δλa ∼ (δλFish)a. Nevertheless, the SNR-normalised Fisher lengths for the
waveform difference (or the waveforms themselves in the EMRI case) give
rough order-of-magnitude estimates for the optimal covariance lengths, and
are more straightforward to obtain. These estimates may be used to provide
initial guesses for the diagonal metric hyperparameters when maximising the
log-hyperlikelihood with standard optimisation routines.
Finally, we compare the optimal covariance lengths δλa to the unnor-
malised Fisher metric lengths of the accurate waveform model, which for EM-
RIs are given approximately by (δλFish)a/ρ through the above argument. The
former lengths determine the maximal separation of points in the training set,
while the latter describe the support of the accurate posterior probability den-
sity. Since δλa & (δλFish)a, it follows for ρ > 1 that the lowest training-set
density allowed (while ensuring interpolation optimality) will typically be far
below the sampling density required to resolve the posterior surface. Hence
the GPR approach should be viable for EMRIs, and its computational benefits
are enhanced for sources with higher SNR.
The validity of this simple argument is illustrated through one- and two-
dimensional analyses of the marginalised likelihood in Sections 6.3.1 and 6.3.2
respectively. Kludge waveform models are used in this study, due to their
availability and computational practicality. Waveforms generated from the NK
134 Gaussian process regression
model in Section 4.2.2 are taken to be accurate, since they have high fidelity
with Teukolsky-based waveforms up to an orbital separation of ≈ 5M . The
latest version of the AAK in Section 4.3 is used as the approximate model; it is
faster than the NK and matches its early phase evolution with high waveform
overlap, but dephases gradually as the compact object approaches plunge.
In both sections, the synthetic data x is an injected NK signal from a 101M
stellar-mass black hole orbiting a 106M SMBH, with no detector noise added.
The signal is two months long and sampled at 0.2 Hz, while the source dis-
tance is adjusted such that the SNR is ρ = 30 with respect to an analytic ap-
proximation for the (down-scoped) LISA noise power spectral density [176].
Other orbital parameters are chosen to ensure that the NK and AAK wave-
forms have an overlap of ≈ 0.97, so as to investigate the scenario in which the
approximate model is fairly accurate to begin with. As will be shown, EMRI
parameter estimation is susceptible to significant systematic bias even at this
level of theoretical error.
For the given source and waveform parameters, a single evaluation of the
NK likelihood takes 29.4 s on average, as compared to an average of 5.66 s per
evaluation for the AAK likelihood. This disparity in computational cost is
likely to be even greater in more realistic scenarios where, for example, the NK
is used as the approximate model and the accurate waveforms are provided by
a self-force model. In Section 6.3.1, we verify that the added cost of using the
GPR marginalised likelihood over the standard likelihood with approximate
waveforms is indeed proportional to the training-set size N , and with a small
scaling coefficient (as mentioned in Section 6.2.1).
6.3.1 One-dimensional example
In this section, the GPR marginalised likelihood (6.20) is used to estimate the
compact-object mass of an EMRI with µtrue = 101M, assuming all other pa-
rameters are known and fixed at their true values. The covariance metric on
the corresponding parameter subspace has a single component gµµ, which is
optimised through the maximisation of the log-hyperlikelihood (6.26). This
one-dimensional example provides a simple illustration of the relationships
§6.3 Application to extreme-mass-ratio inspirals 135
Figure 6.1: Plots of lnZ against lg (δµ/M) for eight 10-point training sets with
grid sizes −2.1 ≤ lg (∆µ/M) ≤ −1.4 (indicated by the abscissae of the solid
circles). The vertical dashed line corresponds to the Fisher metric length δµFish.
between the optimal covariance length δµ := (gµµ)−1/2, the Fisher metric length
δµFish := ((Γ)µµ)
−1/2, and the training-set grid size ∆µ.
The GPR model is trained on eight 10-point training sets with uniform
grids, where the grid sizes are distributed in the range −2.1 ≤ lg (∆µ/M) ≤
−1.4. This range is chosen to encompass the Fisher length of the unit-SNR
waveform difference17, which is approximately constant across the spans of
the considered training sets and evaluated at µtrue as lg (δµFish/M) = −1.96.
Each training set is placed such that µtrue lies at the geometric centre of its span
(and thus maximally far from the nearest points in the set).
Figure 6.1 shows plots of the log-hyperlikelihood for the eight training sets,
with the optimal covariance length for each set given by the abscissa of the
peak (where it exists). The optimal value δµ is approximately constant for all
valid training sets, and falls in the narrow range lg (δµFish/M) ≤ lg (δµ/M) ≤
−1.8. In comparison to the approach of [180] described in Section 6.2.3, vary-
17 For comparison, the Fisher lengths of the unit-SNR NK and AAK waveforms at µtrue are
lg (δµNKFish/M) = −1.86 and lg (δµAAKFish /M) = −1.85 respectively.
136 Gaussian process regression
ing the density of the training set by fixing its cardinality and changing its
span also shifts δµ by less than a factor of two, which demonstrates that both
span and cardinality have less impact than density on a training set’s perfor-
mance. Hyperlikelihood peaks emerge only for grid sizes lg (∆µ/M) ≤ −1.8,
indicating that 1/δµ determines a minimum threshold for the density of an op-
timal training set. Finally, δµFish appears to set a lower bound on δµ, which is
consistent with the discussion around (6.30) and (6.31).
We now consider the marginalised likelihood itself for three other 10-point
training sets. Firstly, a set DFish is placed around µtrue with grid size ∆µFish =
δµFish; training the GPR model on this set yields an optimal covariance length
lg (δµ/M) = −1.82. Two more sets Dcov and D2cov are constructed in the same
way, with the grid sizes ∆µcov = δµ and ∆µ2cov = 2δµ respectively. A different
optimal covariance length is found for Dcov, while there is no hyperlikelihood
peak forD2cov. Nevertheless, the performance of the marginalised likelihood is
found to be practically constant across the range lg (δµFish/M) ≤ lg (δµ/M) ≤
−1.8, and so the Gaussian process is not retrained for Dcov and D2cov (i.e. the
same value of δµ is used for all three training sets).
At a source SNR of ρ = 30, a high overlap of ≈ 0.97 between the accurate
and approximate waveforms still results in a 5-sigma bias due to theoretical
error; as seen in Figure 6.2, the approximate likelihood Lapp is peaked away
from µtrue with µ ≈ 3 × 10−3M, while the 1-sigma length for Lapp (and the
accurate likelihood Lacc) is ≈ 5 × 10−4M. The marginalised likelihood with
the training set DFish is virtually identical to Lacc, and is slightly broader with
Dcov but remains peaked near the true value. For the sparsest training setD2cov,
the peak of the marginalised likelihood has an error µ similar to that of Lapp,
although it is sufficiently broadened to ensure that it is still consistent with
µtrue at 2-sigma significance.
The one-dimensional example in this section indicates that the GPR ap-
proach might be viable for EMRIs, since even the densest considered training
setDFish has a grid size that is significantly larger than the width of the accurate
likelihood. By constructing a few additional training sets with different sizes
N , the marginalised likelihood is found to take an average of 5.66 + 0.013N s
per evaluation, i.e. ≈ 2% longer than the approximate likelihood for a 10-point
§6.3 Application to extreme-mass-ratio inspirals 137
Figure 6.2: One-dimensional likelihood plots for the standard likelihood with
accurate and approximate waveforms, and the marginalised likelihood with
the training sets DFish, Dcov and D2cov. The only training-set points within the
horizontal plot range belong to the densest setDFish, and are indicated by thick
marks on the horizontal axis.
138 Gaussian process regression
training set. However, it provides no computational savings over the accurate
likelihood for large training sets (with & 2000 points in this particular case),
which may restrict its usefulness for higher-dimensional problems. This lim-
itation, as well as the generally poor scaling of GPR with dimensionality, is
discussed further in Sections 6.3.2 and 6.4.
6.3.2 Two-dimensional example
In this section, the GPR marginalised likelihood (6.20) is used to estimate the
component masses of an EMRI with (µ,M)true = (101, 106)M, again assuming
all other parameters are known and fixed at their true values. Maximisation
of the log-hyperlikelihood (6.26) is now over the three independent compo-
nents (gµµ, gMM , gµM) of the covariance metric on the two-dimensional param-
eter subspace. The eigensystem {(λi, vˆi) | i = 1, 2} of g defines a covariance
ellipse with semi-principal axes {λ−1/2i vˆi} in the usual way.
Although the component-mass subspace is chosen for the two-dimensional
example here, a straightforward search in (µ,M) is not necessarily optimal in
the context of higher-dimensional parameter estimation. For example, since
the central mass M ≈ µ + M strongly determines the characteristic frequency
of an EMRI waveform, variations in the waveform difference with respect to
M may be reduced by rescaling the time coordinate to dimensionless time t/M .
This approach has been investigated, and yields longer (by an order of magni-
tude or so) covariance lengths as expected. However, it also results in less sta-
ble derivatives and poorer interpolation; this is likely because the AAK model
maps M to an unphysical mass M˜ that is then evolved through fits (see Sec-
tion 4.3.2), and so its waveforms vary differently from the NK waveforms with
respect to M . If more accurate models are used, the waveform difference will
have an infinite covariance length in total mass, such that rescaling the time
coordinate by µ+M reduces the component-mass subspace to a single degree
of freedom (e.g. the mass ratio µ/M ).
Three different training sets are considered for the (µ,M) example in this
section. The first is a (6×6)-point setDFish with (µ,M)true lying at the geometric
centre of its span; its points are placed uniformly on a grid defined by the semi-
§6.3 Application to extreme-mass-ratio inspirals 139
principal axes {λ−1/2i vˆi}Fish of the Fisher metric ellipse (where {(λi, vˆi)}Fish is
the eigensystem of Γ for the unit-SNR waveform difference). Two more (10×
10)-point training sets Ddense and Dsparse are constructed on rectangular grids,
with the grid sizes given by the short and long Fisher lengths respectively, i.e.
(∆µ,∆M)dense :=
(
1√
(Γ)µµ
,
1√
(Γ)MM
)
, (6.32)
(∆µ,∆M)sparse :=
(√
(Γ−1 )µµ,
√
(Γ−1 )MM
)
. (6.33)
As justified in Section 6.3.1, the GPR model is trained on a single training
set (Ddense in this case), and the same optimal covariance ellipse is subsequently
used for all three sets. The relative placement of points in the three training
sets is shown in Figure 6.3, along with the covariance and Fisher ellipses. Both
ellipses are aligned and the Fisher ellipse is slightly smaller, which is consistent
with the discussion around (6.30) and (6.31).
From the contour plots in Figure 6.4, the measurement of (µ,M) with
the approximate likelihood Lapp has a theoretical error of (µ,M) ≈ (2 ×
10−3, 6 × 100)M, and excludes (µ,M)true at beyond 2-sigma significance. The
marginalised likelihood with the training set Ddense is virtually identical to the
accurate likelihood Lacc; so too is the likelihood for DFish, which is sparser and
contains fewer points. More surprisingly, the training set Dsparse also yields a
likelihood that is very similar to Lacc, which indicates that a training-set den-
sity no lower than that corresponding to the long Fisher metric lengths (i.e. the
half extents of the unit-SNR Fisher ellipse in each parameter) will still be opti-
mal on the level of the marginalised likelihood. However, it may be difficult to
learn the optimal covariance metric from such a training set if it is too sparse
or contains too few points.
It is clear that a simple rectangular grid approach to the placement of train-
ing set points will not scale well with the dimensionality ` of the parameter
space, but uniform placement on a grid defined by the Fisher metric eigensys-
tem is also limited at moderately large `. From our one- and two-dimensional
studies, as well as preliminary investigations of a three-dimensional problem,
140 Gaussian process regression
Figure 6.3: Training-set point placement around (µ,M)true for Ddense (dots),
DFish (triangles) and Dsparse (squares). The grid for DFish is defined by the
semi-principal axes of the Fisher metric ellipse (green), and is aligned with the
optimal covariance ellipse (black) learnt from Ddense. The central grey square
corresponds to the plot range of Figure 6.4.
§6.3 Application to extreme-mass-ratio inspirals 141
Figure 6.4: Two-dimensional likelihood contour plots for the standard likeli-
hood with accurate and approximate waveforms, and the marginalised likeli-
hood with the training sets Ddense, DFish and Dsparse. All contours are 2-sigma.
The only training-set points within the plot range belong to Ddense, and are
indicated by solid circles.
142 Gaussian process regression
six points along each eigendirection appears to be the bare minimum for learn-
ing a covariance metric that well describes the waveform difference locally.
This necessitates O(N3) operations on a 6` × 6` covariance matrix in the train-
ing stage, which is computationally challenging for ` > 5.
If a suitable covariance metric can be obtained, the actual set of waveform
differences used in the interpolation stage does not have to be quite as large
or dense as the set used to train the GPR model. A set with just three points
along each eigendirection becomes unfeasible at ` > 8, since the marginalised
likelihood is likely to be more expensive than the accurate likelihood for N ∼
104; still, this is enough to estimate the seven intrinsic EMRI parameters (see
Section 4.2). In general, the dimensionality problem for the GPR approach
might be tackled by reducing the computational cost of operations on the N ×
N covariance matrix, or by reducing N itself while ensuring the training set is
functional. Possible methods for achieving these are beyond the scope of the
present work, but are discussed in the concluding remarks of Section 6.4.
6.4 Discussion
In this chapter, we have introduced the GPR marginalised likelihood and its
applications in GW data analysis, with a focus on investigating its viability
for EMRI parameter estimation through low-dimensional studies. Even in the
considered scenario where the waveform model used for parameter estima-
tion has a ≈ 0.97 match with the source signal at the true parameter values,
significant systematic bias from theoretical error will still arise for high-SNR
(ρ & 30) sources. The GPR approach is shown to mitigate this bias, and hence
to be suitable for improving the accuracy of EMRI parameter estimation (albeit
in highly localised regions of parameter space).
The performance of the marginalised likelihood is strongly dependent on
its precomputed training set of waveform differences, which relies on the ex-
istence of a more accurate waveform model that reproduces the source signal
with high fidelity. For the method to be practical, the density of the training set
must be significantly lower than the resolution of a posterior search with the
accurate waveforms. This is shown to be the case for EMRIs through a simple
§6.4 Discussion 143
argument, and is verified with one- and two-dimensional examples. Another
key result in these sections is a demonstration of how the Fisher information
matrix of the waveform difference (normalised to unit SNR) may be used to
inform the placement of training-set points, as well as to estimate a covariance
metric that describes the waveform difference locally.
While the marginalised likelihood shows early promise for EMRI param-
eter estimation, it is akin to other applications of GPR in being subject to the
curse of dimensionality. The number of training-set points required to search
an `-dimensional parameter subspace grows exponentially with `, which hin-
ders not just the offline training stage (since the covariance matrix is larger and
more ill-conditioned), but also the online interpolation stage (where a new lin-
ear combination of points is computed for each likelihood evaluation).
One possible approach to these computational problems is to replace the
squared-exponential covariance function (6.24) in the GPR model with a co-
variance function that has compact support on parameter space (e.g. the
Wendland polynomials [190]), such that the covariance matrix becomes sparse.
Iterative methods [191] may then be used to accelerate the Cholesky decompo-
sition of K in (6.16), (6.17) and (6.26). A compact-support covariance function
also reduces the number of training-set points summed in (6.16), which di-
rectly determines the evaluation speed of the marginalised likelihood.
Another strategy is to minimise the size of the training set itself. As seen
throughout Section 6.3, the covariance metric that is learnt during the training
stage also determines a fixed threshold for the density of a training set that
functions optimally at the interpolation stage. However, it may be possible to
lower this threshold density through reparametrisation or dimensionality re-
duction methods, e.g. the component-mass example discussed in Section 6.3.2.
In the case of parameters for which this is not feasible, the number of training-
set points used to cover the region of relevance may still be reduced through
a non-uniform placement of points, or non-geometric prescriptions such as
stochastic placement algorithms [192–194]. Full coverage of the search region
with a precomputed training set might not even be necessary; one possibility
is to use a smaller “moving” set that is updated adaptively as the marginalised
likelihood is sampled with Markov-chain-Monte-Carlo methods.
144 Gaussian process regression
Chapter 7
Conclusion
The historic first direct detection of GWs by Advanced LIGO at the end of 2015
[8] was but the opening event in the quick succession of exciting developments
that have since unfolded in the field. Within the short span of 18 months,
we have witnessed a second confirmed black-hole binary merger [9] and the
unequivocal success of the LISA Pathfinder proof-of-technology mission [18],
while the outlook for an impending detection with pulsar timing arrays is also
much improved [20]. Against the backdrop of this rapidly evolving landscape,
it is hoped that the work presented in this dissertation constitutes a modest but
timely contribution to the burgeoning field of GW astronomy.
In Chapter 3, we have found that Einstein–Maxwell interactions between
far-field GWs and electromagnetic fields are likely too weak for the practical
purposes of GW detection and parameter estimation, except under conditions
that might be contrived or extreme by astrophysical standards (e.g. those con-
sidered in Section 3.4). A characterisation of the electromagnetic perturbations
driven by strong-field gravitational radiation should yield effects that are more
pronounced, and will allow the modelling of GW sources in greater detail.
However, EMWs induced by GWs in the strong-field regime are restricted to
gravitational frequency bands (. 103 Hz) [96, 123], which presents the same
difficulties for direct detection as mentioned in Section 3.4.1.
The framework in Section 3.3 was originally intended as an intermedi-
ate step towards adapting the 1+3 approach for strong-field calculations, but
turned out to be analytically intractable even for the far-field problem in spher-
ical coordinates, due to the mixing of spherical harmonic modes by the tensor–
vector contractions in (3.38) and (3.39). An alternative research direction in
145
146 Conclusion
the area of Einstein–Maxwell interactions would be to unify or extend exist-
ing semi-analytical methods [96, 97, 123] for Schwarzschild/Kerr spacetimes.
Unification should be achievable in principle, as a common (and perhaps in-
evitable) trait of such methods is a decomposition of the coupled fields into
secondary variables distributed among the axial [47] and polar [48] sectors.
The work presented in Chapter 4 is of more pressing interest to the GW
community, due the scarcity of accurate and computationally efficient mod-
els that describe the information-rich waveforms from EMRIs. We have de-
veloped an augmented variant of a well-known kludge waveform model; the
new AAK model retains the speed of its predecessor, while matching the phase
evolution of more accurate kludges over a significant fraction of the inspiral.
With a (meta)stable version of the model released online as part of a kludge
software suite [135], AAK waveforms will hopefully see widespread use in the
next round of mock LISA data challenges.
EMRIs are an attractive research topic due to the inherent challenges they
pose for source modelling and data analysis, and there is much that remains
to be done in terms of populating the EMRI waveform inventory before the
LISA mission design is finalised at the end of this decade. The immediate
improvements that must be made to the AAK model are workaround solutions
to the problems mentioned in Section 4.4. For example, the ill-defined nature
of (4.37)–(4.39) at plunge may be resolved (inelegantly) by simply starting or
stopping the trajectory integration as close to plunge as numerically allowed.
Other modifications will involve streamlining the model to make it as fast and
robust as possible for the mock data challenges, e.g. the development of an
optimised method that detects and handles plunge with a minimal amount of
computational overhead, or the improvement of the fitting algorithms so as to
reduce the amount of tuning presently required.
Another important research direction in the area of kludge waveform
models is to bridge the present disconnect between data analysis work and
the gravitational self-force programme. The most accurate kludges currently
available are fitted to Teukolsky-based models, and hence account only for the
adiabatic, dissipative first-order self-force; a logical next step is to incorporate
the subleading “post-adiabatic” self-force phenomenon of resonances. Efforts
147
have been made to explore the prevalence of resonance jumps [195] and their
impact on EMRI detection rates [196], but the effect of resonances on the wave-
forms themselves has yet to be studied.
It is also crucial to set up a channel enabling kludges to be informed by
higher-order self-force results once they become available. Early steps to-
wards this goal have been made for an EMRI with a non-spinning central
black hole [68]; in this work, the compact object’s orbit is described as an
evolving geodesic whose parameters are governed by forced equations of mo-
tion, and it is explicitly computed by supplying a model for the self-force on
Schwarzschild orbits. A similar result may now be achieved for spinning black
holes, by using pieces that have recently become available: the forced equa-
tions of motion for a Kerr black hole [67] and the self-force on an eccentric,
equatorial Kerr orbit [197].
The third major theme of this doctoral thesis is the development of im-
proved techniques for GW data analysis, with the two overlapping but quali-
tatively different procedures of detection and parameter estimation covered in
Chapters 5 and 6 respectively. Future research directions under this theme are
the most open-ended, as they typically involve the application of new statis-
tical or computational methods to data analysis, as well as the study of their
efficacy using realistic source signals and detector noise.
Various families of tunable compression schemes for GW template banks
have been proposed in Chapter 5. In particular, the optimised partition of
a template bank into non-overlapping sets performs better than other consid-
ered groupings with intersection (which are still interesting from a purely com-
binatorial perspective), and also yields moderate gains over a straightforward
coarsening of the bank. By investigating the features of the method on a bank
of ∼ 104 PN waveforms, the first step towards advancing its usage in GW
detection has been taken; however, the development of a practical implemen-
tation and a more extensive comparison to existing orthogonal-decomposition
methods [164–166] is necessary for its adoption. Nevertheless, the generality of
these compression schemes means that the method need not stand on its own,
and may instead be integrated with existing algorithms for further controlled
compression (if the benefits are found to be worthwhile).
148 Conclusion
In Chapter 6, we have found through low-dimensional parameter estima-
tion studies that the GPR marginalised likelihood might be viable for improv-
ing the accuracy of searches in highly localised regions of EMRI parameter
space. Possible strategies for the generalisation of the method to higher di-
mensions are discussed in Section 6.4. Apart from working towards realistic
parameter estimation in all dimensions of the full (or at least intrinsic) param-
eter space, another target is to ensure that the highly localised prior support
required for the likelihood to be used in sampling algorithms is compatible
with the rough localisation provided by detection-stage searches. Methods
that address the curse of dimensionality might simultaneously achieve this,
but the problem can also be tackled from the detection end, e.g. by employing
a semi-coherent search with short segments of AAK templates that are highly
accurate over their duration, such that better localisation accuracy is obtained.
There are other interesting research directions associated with developing
the GPR marginalised likelihood: for example, extending it in a physically mo-
tivated way to account for the varying degree of error that is expected across
different frequency bins. The more general method of GPR waveform interpo-
lation also has possible applications that are unrelated to parameter estimation,
such as the smooth combination of waveform models incorporating different
features (e.g. a non-spinning eccentric model for comparable-mass mergers
and a circular one with spin).
Appendix A
Induced Maxwell field solution
For ρ sec θ  1 and homogeneous initial conditions, the solution to (3.34)–
(3.37) with the background GW in (3.40), the free EMW in (3.55) and (3.56),
and the induced electromagnetic field ansatz in (3.57) and (3.58) is given by
E (±)a = hE(0)ξ(±)(t)eiψP (±)a , (A.1)
B(±)a = hE(0)ξ(±)(t)eiψQ(±)a , (A.2)
where
ξ(±)(t) = m(±) exp (−i(n± k)t)
−m(±) cos (m(±)t) + i(n± k) sin (m(±)t), (A.3)
P
(±)
0 = Q
(±)
0 = 0, (A.4)
149
150 Induced Maxwell field solution
P
(±)
1 =
i sin2 (θ/2)
8m(±)(k ± n−m(±))(k ± n+m(±))
× [2n2 sin (2α− γ − 3φ) + 6n2 sin (2α + γ − 3φ)
−n2 sin (2α− γ − 2θ − 3φ) + n2 sin (2α + γ − 2θ − 3φ)
+4n2 sin (2α + γ − θ − 3φ) + 4n2 sin (2α + γ + θ − 3φ)
−n2 sin (2α− γ + 2θ − 3φ) + n2 sin (2α + γ + 2θ − 3φ)
−2(8k2 + 10kn+ 3n2) sin (2α− γ − φ)
−2n(2k + n) sin (2α + γ − φ)− n2 sin (2α− γ − 2θ − φ)
+n2 sin (2α + γ − 2θ − φ)− 2n(k + 2n) sin (2α− γ − θ − φ)
−2kn sin (2α + γ − θ − φ)− 2n(k + 2n) sin (2α− γ + θ − φ)
−2kn sin (2α + γ + θ − φ)− n2 sin (2α− γ + 2θ − φ)
+n2 sin (2α + γ + 2θ − φ)] , (A.5)
P
(±)
2 =
i sin2 (θ/2)
8m(±)(k ± n−m(±))(k ± n+m(±))
× [2n2 cos (2α− γ − 3φ) + 6n2 cos (2α + γ − 3φ)
−n2 cos (2α− γ − 2θ − 3φ) + n2 cos (2α + γ − 2θ − 3φ)
+4n2 cos (2α + γ − θ − 3φ) + 4n2 cos (2α + γ + θ − 3φ)
−n2 cos (2α− γ + 2θ − 3φ) + n2 cos (2α + γ + 2θ − 3φ)
+2(8k2 + 10kn+ 3n2) cos (2α− γ − φ)
+2n(2k + n) cos (2α + γ − φ) + n2 cos (2α− γ − 2θ − φ)
−n2 cos (2α + γ − 2θ − φ) + 2n(k + 2n) cos (2α− γ − θ − φ)
+2kn cos (2α + γ − θ − φ) + 2n(k + 2n) cos (2α− γ + θ − φ)
+2kn cos (2α + γ + θ − φ) + n2 cos (2α− γ + 2θ − φ)
−n2 cos (2α + γ + 2θ − φ)] , (A.6)
151
P
(±)
3 =
in sin θ
8m(±)(k ± n−m(±))(k ± n+m(±))
× [2(3k + n) sin (2α− γ − 2φ) + 2(k − n) sin (2α + γ − 2φ)
−3k sin (2α− γ − θ − 2φ) + k sin (2α + γ − θ − 2φ)
−3k sin (2α− γ + θ − 2φ) + k sin (2α + γ + θ − 2φ)
−n sin (2α− γ + 2θ − 2φ) + n sin (2α + γ + 2θ − 2φ)
−n sin (2α− γ − 2θ − 2φ) + n sin (2α + γ − 2θ − 2φ)] , (A.7)
Q
(±)
1 =
i sin2 (θ/2)
8m(±)(k ± n−m(±))(k ± n+m(±))
× [2n2 cos (2α− γ − 3φ)− 6n2 cos (2α + γ − 3φ)
−n2 cos (2α− γ − 2θ − 3φ)− n2 cos (2α + γ − 2θ − 3φ)
−4n2 cos (2α + γ − θ − 3φ)− 4n2 cos (2α + γ + θ − 3φ)
−n2 cos (2α− γ + 2θ − 3φ)− n2 cos (2α + γ + 2θ − 3φ)
−2(8k2 + 10kn+ 3n2) cos (2α− γ − φ)
+2n(2k + n) cos (2α + γ − φ)− n2 cos (2α− γ − 2θ − φ)
−n2 cos (2α + γ − 2θ − φ)− 2n(k + 2n) cos (2α− γ − θ − φ)
+2kn cos (2α + γ − θ − φ)− 2n(k + 2n) cos (2α− γ + θ − φ)
+2kn cos (2α + γ + θ − φ)− n2 cos (2α− γ + 2θ − φ)
−n2 cos (2α + γ + 2θ − φ)] , (A.8)
152 Induced Maxwell field solution
Q
(±)
2 =
i sin2 (θ/2)
8m(±)(k ± n−m(±))(k ± n+m(±))
× [−2n2 sin (2α− γ − 3φ) + 6n2 sin (2α + γ − 3φ)
+n2 sin (2α− γ − 2θ − 3φ) + n2 sin (2α + γ − 2θ − 3φ)
+4n2 sin (2α + γ − θ − 3φ) + 4n2 sin (2α + γ + θ − 3φ)
+n2 sin (2α− γ + 2θ − 3φ) + n2 sin (2α + γ + 2θ − 3φ)
−2(8k2 + 10kn+ 3n2) sin (2α− γ − φ)
+2n(2k + n) sin (2α + γ − φ)− n2 sin (2α− γ − 2θ − φ)
−n2 sin (2α + γ − 2θ − φ)− 2n(k + 2n) sin (2α− γ − θ − φ)
+2kn sin (2α + γ − θ − φ)− 2n(k + 2n) sin (2α− γ + θ − φ)
+2kn sin (2α + γ + θ − φ)− n2 sin (2α− γ + 2θ − φ)
−n2 sin (2α + γ + 2θ − φ)] , (A.9)
Q
(±)
3 =
in sin θ
8m(±)(k ± n−m(±))(k ± n+m(±))
× [2(3k + n) cos (2α− γ − 2φ)− 2(k − n) cos (2α + γ − 2φ)
−3k cos (2α− γ − θ − 2φ)− k cos (2α + γ − θ − 2φ)
−3k cos (2α− γ + θ − 2φ)− k cos (2α + γ + θ − 2φ)
−n cos (2α− γ + 2θ − 2φ)− n cos (2α + γ + 2θ − 2φ)
−n cos (2α− γ − 2θ − 2φ)− n cos (2α + γ − 2θ − 2φ)] , (A.10)
with m(±) = (k2 + n2 ± 2kn cos θ)1/2.
Appendix B
Combinatorial design theory
The problem of constructing a family of sets Um under the cardinality con-
straints (5.31) and (5.32) in Section 5.2.3 may be regarded geometrically as the
problem of constructing a collection ofN distinct points (representing template
labels) and M distinct lines (representing sets) with the following properties:
(i) each point lies on exactly R lines;
(ii) each line passes through exactly C points;
(iii) any two lines intersect at exactly I points;
(iv) any two points lie on at most R− 1 lines.
The final property is the automatic identification condition, i.e. no two labels
are assigned to exactly the same subfamily of sets.
The feasibility of carrying out such a construction (or finding additional
conditions on N , M and R that ensure it is possible) is a difficult and unsolved
problem in combinatorics. One special case that has been studied in detail is
R = C and I = 1. This implies that N = M = R2 − R + 1, and that any
two points must lie on exactly one line. Under these circumstances, the four
geometrical properties define a finite projective plane of order R − 1 [170]. It
is known that finite projective planes exist with prime orders [170], but there
is no finite projective plane of order 6 [198] or 10 [199], while the existence (or
otherwise) of an order-12 finite projective plane remains an open question.
The special case of finite projective planes is uninteresting from a
compression-scheme point of view, as it has N = M and hence achieves no
153
154 Combinatorial design theory
compression. However, it strongly indicates that the conditions (5.31) and
(5.32) are not sufficient to ensure the existence of a set construction with the
four required properties. Nonetheless, valid set constructions have been found
for small values of N , M and R; for example, (N,M,R) = (10, 6, 3) yields
C = 5, I = 2, and the set construction
U1 = {1, 2, 3, 4, 5} ,
U2 = {1, 2, 6, 7, 8} ,
U3 = {1, 3, 6, 9, 10} ,
U4 = {2, 5, 8, 9, 10} ,
U5 = {3, 4, 7, 8, 10} ,
U6 = {4, 5, 6, 7, 9} . (B.1)
Additional solutions for (N,M,R) = (12, 9, 3) and (N,M,R) = (14, 7, 3) also
exist. No counterexamples (i.e. values of (N,M,R) satisfying (5.31) and (5.32)
but admitting no set construction) have been found for N > M , although we
have not conducted an exhaustive search.
A general compression scheme satisfying the conditions (5.31) and (5.32)
might potentially admit more compression rates than the symmetric base
scheme for each value of N . Given the difficulties in actually constructing
the sets, however, we focus instead on the special case of “maximal represen-
tation” for fixed M and R (i.e. every M -digit binary number with exactly R 1’s
represents a distinct template label); this gives the binomial coefficient scheme
described in Section 5.2.3.
Appendix C
Taylor-T2 waveform expansions
The Taylor-T2 PN waveform (5.52) used in Section 5.4 describes the inspi-
ral part of a circular and non-inclined comparable-mass binary coalescence
[38, 174, 175]. With units restored, its amplitude and phase are written as ex-
pansions in the frequency-related variable
ξ :=
(
GM
c3η3/5
φ˙
)2/3
, (C.1)
with the orbital phase φ given to 3.5PN accuracy by
φ = −1
η
{
τ 5/8 +
(
3715
8064
+
55
96
η
)
τ 3/8 − 3
4
piτ 1/4
+
(
9275495
14450688
+
284875
258048
η +
1855
2048
η2
)
τ 1/8
+
(
− 38645
172032
+
65
2048
η
)
pi ln
(
τ
τ0
)
+
[
831032450749357
57682522275840
− 53
40
pi2 − 107
56
γ +
107
448
ln
( τ
256
)
+
(
−126510089885
4161798144
+
2255
2048
pi2
)
η +
154565
1835008
η2 − 1179625
1769472
η3
]
τ−1/8
+
(
188516689
173408256
+
488825
516096
η − 141769
516096
η2
)
piτ−1/4
}
, (C.2)
155
156 Taylor-T2 waveform expansions
where γ is the Euler–Mascheroni constant. Here τ is a time-related variable,
written in terms of the binary coalescence time tc as
τ =
c3η8/5
5GM(tc − t), (C.3)
where we set tc = 1 yr for a ∼ 106M black-hole binary inspiral.
The GW amplitude is then proportional to the 2PN amplitude function
A = ξ
(
2 +
1
3
(−13 + η)ξ + 4piξ3/2 + 1
180
(−837− 635η + 15η2)ξ2
)
, (C.4)
while the GW phase is twice the tail-distorted orbital phase
ψ = φ− 3ξ3/2
(
1− η
2
ξ
)
ln
(
ξ
ξ0
)
, (C.5)
with the 1PN factor of 1− (η/2)ξ included to account for the nonlinear interac-
tion between the gravitational field of the source and its emitted gravitational
radiation [200]. Finally, the constant frequency in ξ0 is set to φ˙0 = 10−4pi, which
corresponds to an approximate entry frequency of 10−4 Hz for LISA.
Bibliography
[1] A. Einstein. On gravitational waves. Proceedings of the Prussian Academy
of Sciences, 1918:154, 1918.
[2] R. A. Hulse and J. H. Taylor. Discovery of a pulsar in a binary system.
The Astrophysical Journal, 195:L51, 1975.
[3] I. H. Stairs. Testing general relativity with pulsar timing. Living Reviews
in Relativity, 6:5, 2003.
[4] A. G. Lyne et al. A double-pulsar system: A rare laboratory for relativis-
tic gravity and plasma physics. Science, 303:1153, 2004.
[5] J. Weber. Detection and generation of gravitational waves. Physical Re-
view, 117:306, 1960.
[6] W. H. Press and K. S. Thorne. Gravitational-wave astronomy. Annual
Review of Astronomy and Astrophysics, 10:335, 1972.
[7] G. M. Harry (for the LIGO Scientific Collaboration). Advanced LIGO:
The next generation of gravitational wave detectors. Classical and Quan-
tum Gravity, 27:084006, 2010.
[8] B. P. Abbott et al. Observation of gravitational waves from a binary black
hole merger. Physical Review Letters, 116:061102, 2016.
[9] B. P. Abbott et al. GW151226: Observation of gravitational waves from
a 22-Solar-mass binary black hole coalescence. Physical Review Letters,
116:241103, 2016.
[10] B. P. Abbott et al. Binary black hole mergers in the first Advanced LIGO
observing run. Physical Review X, 6:041015, 2016.
157
158 BIBLIOGRAPHY
[11] F. Acernese et al. Advanced Virgo: A second-generation interferometric
gravitational wave detector. Classical and Quantum Gravity, 32:024001,
2015.
[12] K. Somiya (for the KAGRA Collaboration). Detector configuration of
KAGRA—the Japanese cryogenic gravitational-wave detector. Classical
and Quantum Gravity, 29:124007, 2012.
[13] K. Danzmann et al. LISA: Laser Interferometer Space Antenna. 2017.
www.elisascience.org/files/publications/LISA L3 20170120.pdf.
[14] P. Amaro-Seoane et al. The gravitational universe. 2013. arXiv:1305.5720
[astro-ph.CO].
[15] S. Kawamura et al. The Japanese space gravitational wave antenna: DE-
CIGO. Classical and Quantum Gravity, 28:094011, 2011.
[16] J. Luo et al. TianQin: A space-borne gravitational wave detector. Classical
and Quantum Gravity, 33:035010, 2016.
[17] K. Danzmann (for the LISA Pathfinder Team and the eLISA Consor-
tium). LISA and its pathfinder. Nature Physics, 11:613, 2015.
[18] M. Armano et al. Sub-femto-g free fall for space-based gravitational
wave observatories: LISA Pathfinder results. Physical Review Letters,
116:231101, 2016.
[19] G. Hobbs et al. The International Pulsar Timing Array project: Using
pulsars as a gravitational wave detector. Classical and Quantum Gravity,
27:084013, 2010.
[20] S. R. Taylor et al. Are we there yet? Time to detection of nanohertz grav-
itational waves based on pulsar-timing array limits. The Astrophysical
Journal Letters, 819:L6, 2016.
[21] A. Sesana. Prospects for multiband gravitational-wave astronomy after
GW150914. Physical Review Letters, 116:231102, 2016.
BIBLIOGRAPHY 159
[22] C. W. Misner, K. S. Thorne, and J. A. Wheeler. Gravitation. W. H. Freeman,
1973.
[23] K. S. Thorne. Multipole expansions of gravitational radiation. Reviews of
Modern Physics, 52:299, 1980.
[24] C. Cutler and K. S. Thorne. An overview of gravitational-wave sources.
In N. Bishop and S. D. Maharaj, editors, Proceedings of the 16th Interna-
tional Conference on General Relativity and Gravitation. World Scientific,
2002.
[25] B. S. Sathyaprakash and B. F. Schutz. Physics, astrophysics and cosmol-
ogy with gravitational waves. Living Reviews in Relativity, 12:2, 2009.
[26] C. J. Moore, R. H. Cole, and C. P. L. Berry. Gravitational-wave sensitivity
curves. Classical and Quantum Gravity, 32:015014, 2015.
[27] T. Creighton. Advanced LIGO: Sources and astrophysics. Classical and
Quantum Gravity, 20:S853, 2003.
[28] P. Amaro-Seoane et al. Low-frequency gravitational-wave science with
eLISA/NGO. Classical and Quantum Gravity, 29:124016, 2012.
[29] C. D. Ott. The gravitational-wave signature of core-collapse supernovae.
Classical and Quantum Gravity, 26:063001, 2009.
[30] P. D. Lasky. Gravitational waves from neutron stars: A review. Publica-
tions of the Astronomical Society of Australia, 32:e034, 2015.
[31] T. Damour and A. Vilenkin. Gravitational wave bursts from cosmic
strings. Physical Review Letters, 85:3761, 2000.
[32] L. J. Rubbo, K. Holley-Bockelmann, and L. S. Finn. Event rate for extreme
mass ratio burst signals in the Laser Interferometer Space Antenna band.
The Astrophysical Journal Letters, 649:L25, 2006.
[33] P. D. Lasky et al. Gravitational-wave cosmology across 29 decades in
frequency. Physical Review X, 6:011035, 2016.
160 BIBLIOGRAPHY
[34] C. Caprini et al. Science with the space-based interferometer eLISA, II:
Gravitational waves from cosmological phase transitions. Journal of Cos-
mology and Astroparticle Physics, 04(2016):001, 2016.
[35] P. C. Peters and J. Mathews. Gravitational radiation from point masses
in a Keplerian orbit. Physical Review, 131:435, 1963.
[36] P. C. Peters. Gravitational radiation and the motion of two point masses.
Physical Review, 136:B1224, 1964.
[37] L. Blanchet and T. Damour. Post-Newtonian generation of gravitational
waves. Annales de l’Institut Henri Poincare´, 50:377, 1989.
[38] L. Blanchet. Gravitational radiation from post-Newtonian sources and
inspiralling compact binaries. Living Reviews in Relativity, 17:2, 2014.
[39] J. Centrella, J. G. Baker, B. J. Kelly, and J. R. van Meter. Black-hole bi-
naries, gravitational waves, and numerical relativity. Reviews of Modern
Physics, 82:3069, 2010.
[40] E. T. Newman and R. Penrose. An approach to gravitational radiation
by a method of spin coefficients. Journal of Mathematical Physics, 3:566,
1962.
[41] E. T. Newman and R. Penrose. Note on the Bondi–Metzner–Sachs group.
Journal of Mathematical Physics, 7:863, 1966.
[42] F. Pretorius. Evolution of binary black-hole spacetimes. Physical Review
Letters, 95:121101, 2005.
[43] M. Campanelli, C. O. Lousto, P. Marronetti, and Y. Zlochower. Accu-
rate evolutions of orbiting black-hole binaries without excision. Physical
Review Letters, 96:111101, 2006.
[44] J. G. Baker et al. Gravitational-wave extraction from an inspiralling con-
figuration of merging black holes. Physical Review Letters, 96:111102,
2006.
BIBLIOGRAPHY 161
[45] The SXS Collaboration. SXS Gravitational Waveform Database, 2016.
www.black-holes.org/waveforms.
[46] K. D. Kokkotas and B. G. Schmidt. Quasi-normal modes of stars and
black holes. Living Reviews in Relativity, 2:2, 1999.
[47] T. Regge and J. A. Wheeler. Stability of a Schwarzschild singularity. Phys-
ical Review, 108:1063, 1957.
[48] F. J. Zerilli. Effective potential for even-parity Regge–Wheeler gravita-
tional perturbation equations. Physical Review Letters, 24:737, 1970.
[49] S. A. Teukolsky. Rotating black holes: Separable wave equations for
gravitational and electromagnetic perturbations. Physical Review Letters,
29:1114, 1972.
[50] E. W. Leaver. An analytic representation for the quasi-normal modes of
Kerr black holes. Proceedings of the Royal Society A, 402:285, 1985.
[51] A. Buonanno and T. Damour. Effective one-body approach to general
relativistic two-body dynamics. Physical Review D, 59:084006, 1999.
[52] Y. Pan et al. Inspiral–merger–ringdown waveforms of spinning, pre-
cessing black-hole binaries in the effective-one-body formalism. Physical
Review D, 89:084006, 2014.
[53] P. Ajith et al. Inspiral–merger–ringdown waveforms for black-hole bina-
ries with nonprecessing spins. Physical Review Letters, 106:241101, 2011.
[54] M. Hannam et al. Simple model of complete precessing black-hole-
binary gravitational waveforms. Physical Review Letters, 113:151101,
2014.
[55] N. Yunes et al. Extreme mass-ratio inspirals in the effective-one-body
approach: Quasicircular, equatorial orbits around a spinning black hole.
Physical Review D, 83:044044, 2011.
162 BIBLIOGRAPHY
[56] P. Amaro-Seoane et al. Intermediate and extreme mass-ratio inspirals:
Astrophysics, science applications and detection using LISA. Classical
and Quantum Gravity, 24:R113, 2007.
[57] L. Barack and C. Cutler. LISA capture sources: Approximate waveforms,
signal-to-noise ratios, and parameter estimation accuracy. Physical Re-
view D, 69:082005, 2004.
[58] S. Babak et al. “Kludge” gravitational waveforms for a test-body orbiting
a Kerr black hole. Physical Review D, 75:024005, 2007.
[59] A. J. K. Chua and J. R. Gair. Improved analytic extreme-mass-ratio in-
spiral model for scoping out eLISA data analysis. Classical and Quantum
Gravity, 32:232002, 2015.
[60] L. Barack. Gravitational self-force in extreme mass-ratio inspirals. Clas-
sical and Quantum Gravity, 26:213001, 2009.
[61] E. Poisson, A. Pound, and I. Vega. The motion of point particles in
curved spacetime. Living Reviews in Relativity, 14:7, 2011.
[62] S. A. Hughes. Evolution of circular, nonequatorial orbits of Kerr black
holes due to gravitational-wave emission, II: Inspiral trajectories and
gravitational waveforms. Physical Review D, 64:064004, 2001.
[63] S. Drasco and S. A. Hughes. Gravitational wave snapshots of generic
extreme mass ratio inspirals. Physical Review D, 73:024027, 2006.
[64] E´. E´. Flanagan and T. Hinderer. Transient resonances in the inspirals of
point particles into black holes. Physical Review Letters, 109:071102, 2012.
[65] T. Hinderer and E´. E´. Flanagan. Two-timescale analysis of extreme mass
ratio inspirals in Kerr spacetime: Orbital motion. Physical Review D,
78:064028, 2008.
[66] E. A. Huerta and J. R. Gair. Influence of conservative corrections on
parameter estimation for extreme-mass-ratio inspirals. Physical Review
D, 79:084021, 2009.
BIBLIOGRAPHY 163
[67] J. R. Gair et al. Forced motion near black holes. Physical Review D,
83:044037, 2011.
[68] N. Warburton et al. Evolution of inspiral orbits around a Schwarzschild
black hole. Physical Review D, 85:061501(R), 2012.
[69] A. Ori and K. S. Thorne. Transition from inspiral to plunge for a compact
body in a circular equatorial orbit around a massive, spinning black hole.
Physical Review D, 62:124022, 2000.
[70] C. O. Lousto, H. Nakano, Y. Zlochower, and M. Campanelli.
Intermediate-mass-ratio black-hole binaries: Numerical relativity meets
perturbation theory. Physical Review Letters, 104:211101, 2010.
[71] E. A. Huerta and J. R. Gair. Intermediate-mass-ratio inspirals in the Ein-
stein Telescope, I: Signal-to-noise ratio calculations. Physical Review D,
83:044020, 2011.
[72] P. Jaranowski and A. Kro´lak. Analysis of gravitational-wave data. Cam-
bridge University Press, 2009.
[73] P. Jaranowski and A. Kro´lak. Gravitational-wave data analysis: Formal-
ism and sample applications. Living Reviews in Relativity, 15:4, 2012.
[74] M. Punturo et al. The Einstein Telescope: A third-generation gravita-
tional wave observatory. Classical and Quantum Gravity, 27:194002, 2010.
[75] J. Crowder and N. J. Cornish. Beyond LISA: Exploring future gravita-
tional wave missions. Physical Review D, 72:083005, 2005.
[76] T. J. W. Lazio. The Square Kilometre Array pulsar timing array. Classical
and Quantum Gravity, 30:224011, 2013.
[77] M. Maggiore. Gravitational waves, Volume 1: Theory and experiments. Ox-
ford University Press, 2007.
[78] O. D. Aguiar. Past, present and future of the resonant-mass gravitational
wave detectors. Research in Astronomy and Astrophysics, 11:1, 2010.
164 BIBLIOGRAPHY
[79] C. Cutler. Angular resolution of the LISA gravitational wave detector.
Physical Review D, 57:7089, 1998.
[80] T. A. Apostolatos, C. Cutler, G. J. Sussman, and K. S. Thorne. Spin-
induced orbital precession and its modulation of the gravitational wave-
forms from merging binaries. Physical Review D, 49:6274, 1994.
[81] T. A. Apostolatos. Search templates for gravitational waves from pre-
cessing, inspiraling binaries. Physical Review D, 52:605, 1995.
[82] B. F. Schutz. Data processing, analysis, and storage for interferometric
antennas. In D. G. Blair, editor, The detection of gravitational waves. Cam-
bridge University Press, 1989.
[83] B. J. Owen. Search templates for gravitational waves from inspiraling
binaries: Choice of template spacing. Physical Review D, 53:6749, 1996.
[84] D. George and E. A. Huerta. Deep neural networks to enable real-time
multimessenger astrophysics. 2017. arXiv:1701.00008[astro-ph.IM].
[85] P. Addesso et al. Compressed sensing for time-frequency gravitational
wave data analysis. 2016. arXiv:1605.03496[astro-ph.IM].
[86] L. S. Finn. Detection, measurement, and gravitational radiation. Physical
Review D, 46:5236, 1992.
[87] C. Cutler and E´. E´. Flanagan. Gravitational waves from merging compact
binaries: How accurately can one extract the binary’s parameters from
the inspiral waveform? Physical Review D, 49:2658, 1994.
[88] N. Christensen and R. Meyer. Using Markov chain Monte Carlo meth-
ods for estimating parameters with gravitational radiation data. Physical
Review D, 64:022001, 2001.
[89] C. Ro¨ver, R. Meyer, and N. Christensen. Coherent Bayesian inference
on compact binary inspirals using a network of interferometric gravita-
tional wave detectors. Physical Review D, 75:062004, 2007.
BIBLIOGRAPHY 165
[90] F. Feroz, J. R. Gair, M. P. Hobson, and E. K. Porter. Use of the MULTI-
NEST algorithm for gravitational wave data analysis. Classical and Quan-
tum Gravity, 26:215003, 2009.
[91] J. Crowder, N. J. Cornish, and J. L. Reddinger. LISA data analysis using
genetic algorithms. Physical Review D, 73:063011, 2006.
[92] A. J. K. Chua, P. Can˜izares, and J. R. Gair. Electromagnetic signatures of
far-field gravitational radiation in the 1+3 approach. Classical and Quan-
tum Gravity, 32:015011, 2015.
[93] E. S. Phinney. Finding and using electromagnetic counterparts of gravi-
tational wave sources: A white paper for the Astro2010 Decadal Review.
2009. arXiv:0903.0098[astro-ph.CO].
[94] I. Mandel and R. O’Shaughnessy. Compact binary coalescences in the
band of ground-based gravitational-wave detectors. Classical and Quan-
tum Gravity, 27:114007, 2010.
[95] S. Ghosh and G. Nelemans. Localizing gravitational wave sources
with optical telescopes and combining electromagnetic and gravitational
wave data. In C. F. Sopuerta, editor, Gravitational wave astrophysics.
Springer, 2015.
[96] H. Sotani, K. D. Kokkotas, P. Laguna, and C. F. Sopuerta. Gravitationally
driven electromagnetic perturbations of neutron stars and black holes.
Physical Review D, 87:084018, 2013.
[97] P. Pani, E. Berti, and L. Gualtieri. Scalar, electromagnetic, and gravita-
tional perturbations of Kerr-Newman black holes in the slow-rotation
limit. Physical Review D, 88:064048, 2013.
[98] C. Palenzuela et al. Binary black holes’ effects on electromagnetic fields.
Physical Review Letters, 103:081101, 2009.
[99] M. E. Gertsenshteı˘n. Wave resonance of light and gravitational waves.
Soviet Physics JETP, 14:84, 1962.
166 BIBLIOGRAPHY
[100] Y. B. Zel’dovich. Electromagnetic and gravitational waves in a stationary
magnetic field. Soviet Physics JETP, 38:652, 1974.
[101] C. Barrabe`s and P. A. Hogan. Interaction of gravitational waves with
magnetic and electric fields. Physical Review D, 81:064024, 2010.
[102] F. I. Cooperstock. The interaction between electromagnetic and gravita-
tional waves. Annals of Physics, 47:173, 1968.
[103] D. M. Zipoy. Light fluctuations due to an intergalactic flux of gravita-
tional waves. Physical Review, 142:825, 1966.
[104] R. Fakir. Gravitational wave detection: A nonmechanical effect. The
Astrophysical Journal, 418:202, 1993.
[105] S. Kopeikin, P. Korobkov, and A. Polnarev. Propagation of light in the
field of stationary and radiative gravitational multipoles. Classical and
Quantum Gravity, 23:4299, 2006.
[106] A. M. Cruise. An interaction between gravitational and electromagnetic
waves. Monthly Notices of the Royal Astronomical Society, 204:485, 1983.
[107] E. Montanari. On the propagation of electromagnetic radiation in the
field of a plane gravitational wave. Classical and Quantum Gravity,
15:2493, 1998.
[108] A. R. Prasanna and S. Mohanty. Gravitational wave-induced rotation of
the plane of polarization of pulsar signals. Europhysics Letters, 60:651,
2002.
[109] M. Halilsoy and O. Gurtug. Search for gravitational waves through the
electromagnetic Faraday rotation. Physical Review D, 75:124021, 2007.
[110] V. Faraoni. The rotation of polarization by gravitational waves. New
Astronomy, 13:178, 2008.
[111] R. Ragazzoni, G. Valente, and E. Marchetti. Gravitational wave detection
through microlensing? Monthly Notices of the Royal Astronomical Society,
345:100, 2003.
BIBLIOGRAPHY 167
[112] G. B. Lesovik, A. V. Lebedev, V. Mounutcharyan, and T. Martin. Detec-
tion of gravity waves by phase modulation of the light from a distant
star. Physical Review D, 71:122001, 2005.
[113] G. F. R. Ellis. Relativistic cosmology. In R. K. Sachs, editor, Proceedings
of the International School of Physics “Enrico Fermi”, Course 47, General rela-
tivity and cosmology. Academic Press, 1971.
[114] G. F. R. Ellis and H. van Elst. Cosmological models: Carge`se Lectures
1998. In M. Lachie`ze-Rey, editor, Proceedings of the NATO Advanced Study
Institute on Theoretical and Observational Cosmology. Kluwer Academic
Publishers, 1999.
[115] C. G. Tsagas, A. Challinor, and R. Maartens. Relativistic cosmology and
large-scale structure. Physics Reports, 465:61, 2008.
[116] P. K. S. Dunsby, B. A. C. C. Bassett, and G. F. R. Ellis. Covariant analysis
of gravitational waves in a cosmological context. Classical and Quantum
Gravity, 14:1215, 1997.
[117] M. Marklund, P. K. S. Dunsby, and G. Brodin. Cosmological electromag-
netic fields due to gravitational wave perturbations. Physical Review D,
62:101501(R), 2000.
[118] C. G. Tsagas, P. K. S. Dunsby, and M. Marklund. Gravitational wave
amplification of seed magnetic fields. Physics Letters B, 561:17, 2003.
[119] C. G. Tsagas. Gravitomagnetic amplification in cosmology. Physical Re-
view D, 81:043501, 2010.
[120] C. G. Tsagas. Gravitoelectromagnetic resonances. Physical Review D,
84:043524, 2011.
[121] C. G. Tsagas. Electromagnetic fields in curved spacetimes. Classical and
Quantum Gravity, 22:393, 2005.
[122] C. A. Clarkson and R. K. Barrett. Covariant perturbations of
Schwarzschild black holes. Classical and Quantum Gravity, 20:3855, 2003.
168 BIBLIOGRAPHY
[123] C. A. Clarkson, M. Marklund, G. Betschart, and P. K. S. Dunsby. The elec-
tromagnetic signature of black hole ring-down. The Astrophysical Journal,
613:492, 2004.
[124] M. Marklund, G. Brodin, and P. K. S. Dunsby. Radio wave emissions due
to gravitational radiation. The Astrophysical Journal, 536:875, 2000.
[125] M. Servin and G. Brodin. Resonant interaction between gravitational
waves, electromagnetic waves, and plasma flows. Physical Review D,
68:044017, 2003.
[126] A. P. Kouretsis and C. G. Tsagas. Gravito-electromagnetic resonances in
Minkowski space. Physical Review D, 88:044006, 2013.
[127] J. D. Jackson. Classical electrodynamics. Wiley, 1998.
[128] F. J. Dyson. Is a graviton detectable? Henri Poincare´ Prize Lecture 2012.
2013. publications.ias.edu/sites/default/files/poincare2012.pdf.
[129] F-Y. Li, H. Wen, and Z-Y. Fang. High-frequency gravitational waves hav-
ing large spectral densities and their electromagnetic response. Chinese
Physics B, 22:120402, 2013.
[130] S. A. Olausen and V. M. Kaspi. The McGill magnetar catalog. The Astro-
physical Journal Supplement Series, 212:6, 2014.
[131] F. Pacini. Rotating neutron stars, pulsars and supernova remnants. Na-
ture, 219:145, 1968.
[132] H. D. Falcke et al. A very brief description of LOFAR: The Low Fre-
quency Array. Highlights of Astronomy, 14:386, 2006.
[133] A. R. Taylor. The Square Kilometre Array. Proceedings of the International
Astronomical Union Symposia and Colloquia, 291:339, 2012.
[134] A. J. K. Chua. Augmented kludge waveforms and Gaussian process
regression for EMRI data analysis. Journal of Physics: Conference Series,
716:012028, 2016.
BIBLIOGRAPHY 169
[135] A. J. K. Chua and J. R. Gair. EMRI Kludge Suite, 2016. github.com/al-
vincjk/EMRI Kludge Suite.
[136] J. R. Gair, M. Vallisneri, S. L. Larson, and J. G. Baker. Testing general
relativity with low-frequency, space-based gravitational-wave detectors.
Living Reviews in Relativity, 16:7, 2013.
[137] K. A. Arnaud et al. An overview of the Mock LISA Data Challenges. AIP
Conference Proceedings, 873:619, 2006.
[138] W. Schmidt. Celestial mechanics in Kerr spacetime. Classical and Quan-
tum Gravity, 19:2743, 2002.
[139] M. Sasaki and H. Tagoshi. Analytic black hole perturbation approach to
gravitational radiation. Living Reviews in Relativity, 6:6, 2003.
[140] E. Forseth, C. R. Evans, and S. Hopper. Eccentric-orbit extreme-mass-
ratio inspiral gravitational wave energy fluxes to 7PN order. Physical
Review D, 93:064058, 2016.
[141] N. Sago and R. Fujita. Calculation of radiation reaction effect on orbital
parameters in Kerr spacetime. Progress of Theoretical and Experimental
Physics, 2015:073E03, 2015.
[142] A. Buonanno, Y. Chen, and M. Vallisneri. Detection template families for
gravitational waves from the final stages of binary-black-hole inspirals:
Nonspinning case. Physical Review D, 67:024016, 2003.
[143] B. M. Barker and R. F. O’Connell. Gravitational two-body problem with
arbitrary masses, spins, and quadrupole moments. Physical Review D,
12:329, 1975.
[144] V. A. Brumberg. Essential relativistic celestial mechanics. CRC Press, 1991.
[145] W. Junker and G. Scha¨fer. Binary systems: Higher order gravitational
radiation damping and wave emission. Monthly Notices of the Royal As-
tronomical Society, 254:146, 1992.
170 BIBLIOGRAPHY
[146] F. D. Ryan. Effect of gravitational radiation reaction on nonequatorial
orbits around a Kerr black hole. Physical Review D, 53:3064, 1996.
[147] S. A. Hughes. Evolution of circular, nonequatorial orbits of Kerr black
holes due to gravitational-wave emission. Physical Review D, 61:084004,
2000.
[148] K. Glampedakis, S. A. Hughes, and D. Kennefick. Approximating the
inspiral of test bodies into Kerr black holes. Physical Review D, 66:064005,
2002.
[149] J. R. Gair and K. Glampedakis. Improved approximate inspirals of test
bodies into Kerr black holes. Physical Review D, 73:064037, 2006.
[150] J. D. Bekenstein. Gravitational-radiation recoil and runaway black holes.
The Astrophysical Journal, 183:657, 1973.
[151] W. H. Press. Gravitational radiation from sources which extend into their
own wave zone. Physical Review D, 15:965, 1977.
[152] S. Chandrasekhar. The mathematical theory of black holes. Clarendon Press,
1983.
[153] Y. Mino. Perturbative approach to an orbital evolution around a super-
massive black hole. Physical Review D, 67:084027, 2003.
[154] S. Drasco and S. A. Hughes. Rotating black hole orbit functionals in the
frequency domain. Physical Review D, 69:044015, 2004.
[155] N. Yunes and E. Berti. Accuracy of the post-Newtonian approximation:
Optimal asymptotic expansion for quasicircular, extreme-mass ratio in-
spirals. Physical Review D, 77:124006, 2008.
[156] E. L. Rees. Graphical discussion of the roots of a quartic equation. Amer-
ican Mathematical Monthly, 29:51, 1922.
BIBLIOGRAPHY 171
[157] D. J. A. McKechan, C. Robinson, and B. S. Sathyaprakash. A tapering
window for time-domain templates and simulated signals in the detec-
tion of gravitational waves from coalescing compact binaries. Classical
and Quantum Gravity, 27:084020, 2010.
[158] M. Capderou. Satellites: Orbits and missions. Springer, 2005.
[159] A. Klein et al. Science with the space-based interferometer eLISA: Su-
permassive black hole binaries. Physical Review D, 93:024003, 2016.
[160] A. J. K. Chua and J. R. Gair. Tunable compression of template banks
for fast gravitational-wave detection and localization. Physical Review D,
93:122001, 2016.
[161] S. Mitra, S. V. Dhurandhar, and L. S. Finn. Improving the efficiency of
the detection of gravitational wave signals from inspiraling compact bi-
naries: Chebyshev interpolation. Physical Review D, 72:102001, 2005.
[162] R. J. E. Smith et al. Towards rapid parameter estimation on gravitational
waves from compact binaries using interpolated waveforms. Physical
Review D, 87:122002, 2013.
[163] P. Can˜izares, S. E. Field, J. R. Gair, and M. Tiglio. Gravitational wave
parameter estimation with compressed likelihood evaluations. Physical
Review D, 87:124005, 2013.
[164] I. S. Heng. Rotating stellar core-collapse waveform decomposition: A
principal component analysis approach. Classical and Quantum Gravity,
26:105005, 2009.
[165] K. Cannon et al. Singular value decomposition applied to com-
pact binary coalescence gravitational-wave signals. Physical Review D,
82:044025, 2010.
[166] S. E. Field et al. Reduced basis catalogs for gravitational wave templates.
Physical Review Letters, 106:221102, 2011.
172 BIBLIOGRAPHY
[167] Y. Wang. Fast detection and automatic parameter estimation of a gravi-
tational wave signal with a novel method. General Relativity and Gravita-
tion, 47:142, 2015.
[168] J. R. Gair et al. Event rate estimates for LISA extreme mass ratio capture
sources. Classical and Quantum Gravity, 21:S1595, 2004.
[169] S. V. Dhurandhar and B. S. Sathyaprakash. Choice of filters for the de-
tection of gravitational waves from coalescing binaries, II: Detection in
colored noise. Physical Review D, 49:1707, 1994.
[170] N. L. Biggs. Discrete mathematics. Oxford University Press, 2002.
[171] D. Singmaster. Repeated binomial coefficients and Fibonacci numbers.
Fibonacci Quarterly, 13:295, 1975.
[172] D. Singmaster. How often does an integer occur as a binomial coeffi-
cient? American Mathematical Monthly, 78:385, 1971.
[173] B. S. Sathyaprakash and S. V. Dhurandhar. Choice of filters for the de-
tection of gravitational waves from coalescing binaries. Physical Review
D, 44:3819, 1991.
[174] L. Blanchet, B. R. Iyer, C. M. Will, and A. G. Wiseman. Gravita-
tional waveforms from inspiralling compact binaries to second-post-
Newtonian order. Classical and Quantum Gravity, 13:575, 1996.
[175] R. H. Cole and J. R. Gair. Likelihood smoothing using gravitational wave
surrogate models. Physical Review D, 90:124043, 2014.
[176] P. Amaro-Seoane et al. eLISA/NGO: Astrophysics and cosmology in the
gravitational-wave millihertz regime. GW Notes, 6:4, 2013.
[177] P. R. Brady, T. Creighton, C. Cutler, and B. F. Schutz. Searching for peri-
odic sources with LIGO. Physical Review D, 57:2101, 1998.
[178] B. J. Owen and B. S. Sathyaprakash. Matched filtering of gravitational
waves from inspiraling compact binaries: Computational cost and tem-
plate placement. Physical Review D, 60:022002, 1999.
BIBLIOGRAPHY 173
[179] R. Prix. Template-based searches for gravitational waves: Efficient lattice
covering of flat parameter spaces. Classical and Quantum Gravity, 24:S481,
2007.
[180] C. J. Moore, C. P. L. Berry, A. J. K. Chua, and J. R. Gair. Improving
gravitational-wave parameter estimation using Gaussian process regres-
sion. Physical Review D, 93:064001, 2016.
[181] C. J. Moore, A. J. K. Chua, C. P. L. Berry, and J. R. Gair. Fast methods for
training Gaussian processes on large datasets. Royal Society Open Science,
3:160125, 2016.
[182] C. Cutler and M. Vallisneri. LISA detections of massive black hole inspi-
rals: Parameter extraction errors due to inaccurate template waveforms.
Physical Review D, 76:104018, 2007.
[183] D. J. C. MacKay. Information theory, inference, and learning algorithms.
Cambridge University Press, 2003.
[184] C. E. Rasmussen and C. K. I. Williams. Gaussian processes for machine
learning. MIT Press, 2006.
[185] C. J. Moore and J. R. Gair. Novel method for incorporating model un-
certainties into gravitational wave parameter estimates. Physical Review
Letters, 113:251101, 2014.
[186] LIGO Scientific Collaboration. LSC Algorithm Library Suite, 2016.
wiki.ligo.org/DASWG/LALSuite.
[187] L. Santamarı´a et al. Matching post-Newtonian and numerical relativity
waveforms: Systematic errors and a new phenomenological model for
nonprecessing black hole binaries. Physical Review D, 82:064016, 2010.
[188] T. Damour, B. R. Iyer, and B. S. Sathyaprakash. Comparison of search
templates for gravitational waves from binary inspiral. Physical Review
D, 63:044023, 2001.
174 BIBLIOGRAPHY
[189] S. Drasco. Strategies for observing extreme mass ratio inspirals. Classical
and Quantum Gravity, 23:S769, 2006.
[190] H. Wendland. Scattered data approximation. Cambridge University Press,
2004.
[191] Y. Saad. Iterative methods for sparse linear systems. Society for Industrial
and Applied Mathematics, 2003.
[192] S. Babak. Building a stochastic template bank for detecting massive black
hole binaries. Classical and Quantum Gravity, 25:195011, 2008.
[193] C. Messenger, R. Prix, and M. A. Papa. Random template banks and
relaxed lattice coverings. Physical Review D, 79:104017, 2009.
[194] I. W. Harry, B. Allen, and B. S. Sathyaprakash. Stochastic template place-
ment algorithm for gravitational wave data analysis. Physical Review D,
80:104014, 2009.
[195] U. Ruangsri and S. A. Hughes. Census of transient orbital resonances
encountered during binary inspiral. Physical Review D, 89:084036, 2014.
[196] C. P. L. Berry, R. H. Cole, P. Can˜izares, and J. R. Gair. Importance of
transient resonances in extreme-mass-ratio inspirals. Physical Review D,
94:124042, 2016.
[197] M. van de Meent. Gravitational self-force on eccentric equatorial orbits
around a Kerr black hole. Physical Review D, 94:044034, 2016.
[198] R. C. Bose. On the application of the properties of Galois fields to the
problem of construction of hyper-Graeco-Latin squares. Sankhya¯: The
Indian Journal of Statistics, 3:323, 1938.
[199] C. W. H. Lam, L. Thiel, and S. Swiercz. The non-existence of finite pro-
jective planes of order 10. Canadian Journal of Mathematics, 41:1117, 1989.
[200] L. Blanchet and G. Scha¨fer. Gravitational wave tails and binary star sys-
tems. Classical and Quantum Gravity, 10:2699, 1993.