Novel higher order regularisation methods for image reconstruction

Konstantinos Papafitsoros
St Edmund's College, University of Cambridge
A thesis submitted for the degree of Doctor of Philosophy
June 2014

Abstract

In this thesis we study novel higher order total variation-based variational methods for digital image reconstruction. These methods are formulated in the context of Tikhonov regularisation. We focus on regularisation techniques in which the regulariser incorporates second order derivatives or a sophisticated combination of first and second order derivatives. The introduction of higher order derivatives in the regularisation process has been shown to be an advantage over the classical first order case, i.e., total variation regularisation, as classical artifacts such as the staircasing effect are significantly reduced or totally eliminated. Also in image inpainting the introduction of higher order derivatives in the regulariser turns out to be crucial to achieve interpolation across large gaps.

First, we introduce, analyse and implement a combined first and second order regularisation method with applications in image denoising, deblurring and inpainting. The method, numerically realised by the split Bregman algorithm, is computationally efficient and capable of giving results comparable with total generalised variation (TGV), a state-of-the-art higher order method. An additional experimental analysis is performed for image inpainting and an online demo is provided on the IPOL website (Image Processing Online).

We also compute and study properties of exact solutions of the one dimensional total generalised variation problem with L^2 data fitting term, for simple piecewise affine data functions, with or without jumps. This gives insight into how this type of regularisation behaves and unravels the role of the TGV parameters.

Finally, we introduce, study and analyse a novel non-local Hessian functional. We prove localisations of the non-local Hessian to the local analogue in several topologies and our analysis results in derivative-free characterisations of higher order Sobolev and BV spaces. An alternative formulation of a non-local Hessian functional is also introduced which is able to produce piecewise affine reconstructions in image denoising, outperforming TGV.

Keywords: Higher order total variation, functions of bounded Hessian, total generalised variation, denoising, deblurring, inpainting, staircasing effect, split Bregman, exact TGV solutions, non-local Hessian, characterisation of higher order Sobolev and BV spaces.

Acknowledgements

I would like to thank at this point all the people that made this journey towards the completion of this Ph.D. thesis a pleasant one and made the last four years truly memorable. Firstly, I would like to thank Carola Schönlieb for being an excellent supervisor from every perspective. Her continuous help and encouragement played a great role in my academic development. She was the one who introduced me to the field of mathematical imaging and showed me what a modern professional mathematician should be. The people I lived with in Cambridge played an important role during my Ph.D. A big thanks goes to Sara Merino for being the best flatmate and friend possible for the last four years. The same amount of gratitude goes as well to Nayia Constantinou and Luca Calatroni. I could not have asked for better flatmates and good friends. A stimulating environment in the office makes the working hours pleasant and never boring.
Bati Sengul and Spencer Hughes were an excellent pair of officemates and friends and I thank them a lot for that. I would like to thank Thalia Sotirchou for her love and support during the demanding last year of the Ph.D. Many thanks to all my friends in CCA and CMS, Marc Briant, Kolyan Ray (my gym-mate), Damon Civin, Ed Mottram, Vaggelis Papoutsellis, Julio Brau, Ioan Manolescu and also Panagiotis Kosteletos for all the nice moments we had together. Moreover, I would like to thank my collaborators Jan Lellmann, Kristian Bredies and Daniel Spector as well as Martin Benning and Tuomo Valkonen for useful discussions. I am grateful to the CCA directors Arieh Iserles and James Norris for giving me the opportunity to be part of it and to Emma Hacking for providing excellent administrative support. I also acknowledge the financial support provided by the Engineering and Physical Sciences Research Council, the Cambridge Centre for Analysis, the Department of Pure Mathematics and Mathematical Statistics, the Department of Applied Mathematics and Theoretical Physics and the King Abdullah University of Science and Technology. I would like to thank the thesis examiners Dr. Anders Hansen and Professor Antonin Chambolle for carefully reading the manuscript and for the fruitful discussion during the examination. Above all I would like to thank my mother Emmy, my sister Nassia and all my family, without whom I would not have achieved all this!

Statement of Originality

I hereby declare that my dissertation entitled "Novel higher order regularisation methods for image reconstruction" is not substantially the same as any that I have submitted for a degree or diploma or other qualification at any other university. I further declare that it contains nothing which is the outcome of work done in collaboration with others, except where specifically indicated in the text.

Chapter 1 is a review of the existing literature on variational methods for image reconstruction. It places the dissertation into context and summarises its results. It is a result of my own personal work.

Chapter 2 introduces some necessary mathematical preliminaries for the thesis. It consists of a review of some of the known results of measure theory, functions of bounded variation and convex analysis. It is not original material.

Chapter 3 is dedicated to the introduction, analysis and implementation of the TV–TV2 method for image reconstruction and it is novel material. It is a result of collaboration and it is contained in [PS14, PSS13]. The proofs as well as the implementation of the method were written by me, apart from Theorem 3.4.7, which was proved by my supervisor Carola Schönlieb, from whom I also greatly benefited through several discussions. The C code implementation for TV–TV2 inpainting that was used in the IPOL online demonstration was written by my collaborator Bati Sengul, University of Cambridge.

Chapter 4 contains an analysis of the one dimensional total generalised variation denoising problem. It is novel material and it has been done in collaboration with Kristian Bredies, University of Graz, Austria, and follows [PB13]. All the proofs were written by me apart from Proposition 4.4.3, while I greatly benefited from my two visits to Graz.

Chapter 5 studies two forms of non-local Hessian functionals following [LPSS]. It is novel material and it has been done in collaboration with Jan Lellmann, Carola Schönlieb, University of Cambridge, and Daniel Spector, Technion, Israel.
All the proofs are the result of my own personal work while I benefited from discussions with all the three aforementioned collaborators. The numer- ical implementation in Section 5.7 is due to my collaborator Jan Lellmann. Contents List of Notation 13 1 Introduction 19 1.1 Digital images, image processing and their mathematics . . . . . . . . . . . 19 1.2 The variational approach for image reconstruction . . . . . . . . . . . . . . 20 1.3 Total variation based methods . . . . . . . . . . . . . . . . . . . . . . . . . 24 1.4 Higher order variational models . . . . . . . . . . . . . . . . . . . . . . . . . 28 1.5 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 1.5.1 The TV–TV2 approach for image reconstruction . . . . . . . . . . . 33 1.5.2 Exact solutions of TGV minimisation in one dimension . . . . . . . 35 1.5.3 Non-local Hessian . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 1.6 Organisation of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 2 Mathematical preliminaries 39 2.1 Notation on function spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 2.2 Radon measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 2.3 Functions of bounded variation . . . . . . . . . . . . . . . . . . . . . . . . . 43 2.3.1 Definitions and properties . . . . . . . . . . . . . . . . . . . . . . . . 43 2.3.2 Good representatives . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 2.4 Lower semicontinuous envelopes . . . . . . . . . . . . . . . . . . . . . . . . . 49 2.5 Convex analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 2.5.1 Basic concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 2.5.2 Fenchel–Rockafellar duality . . . . . . . . . . . . . . . . . . . . . . . 52 3 The combined TV–TV2 approach for image reconstruction 55 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 3.1.1 Organisation of the chapter . . . . . . . . . . . . . . . . . . . . . . . 57 3.2 Convex functions of measures . . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.3 The space of functions of bounded Hessian BV2(Ω) . . . . . . . . . . . . . . 60 3.3.1 Definition and basic properties . . . . . . . . . . . . . . . . . . . . . 60 3.3.2 Weak∗ and strict convergence in BV2(Ω) . . . . . . . . . . . . . . . . 61 9 CONTENTS 3.4 Well-posedness of the model . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.4.1 Existence and uniqueness . . . . . . . . . . . . . . . . . . . . . . . . 63 3.4.2 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 3.5 The numerical implementation . . . . . . . . . . . . . . . . . . . . . . . . . 73 3.5.1 Discretisation of the model . . . . . . . . . . . . . . . . . . . . . . . 74 3.5.2 Bregman iteration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 3.5.3 Numerical solution via the split Bregman algorithm . . . . . . . . . 79 3.6 Applications to image denoising and deblurring . . . . . . . . . . . . . . . . 83 3.7 Applications to image inpainting and online demo in IPOL . . . . . . . . . 93 3.7.1 Motivation and IPOL’s philosophy . . . . . . . . . . . . . . . . . . . 93 3.7.2 Split Bregman for TV–TV2 inpainting . . . . . . . . . . . . . . . . . 95 3.7.3 The colour image case . . . . . . . . . . . . . . . . . . . . . . . . . . 98 3.7.4 Stopping criteria and the selection of λ’s . . . . . . . . . . . . . . . . 100 3.7.5 Inpainting examples . . . . . . . . . . . . 
. . . . . . . . . . . . . . . 101 4 Exact solutions of the one dimensional TGV regularisation problem 107 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 4.1.1 Organisation of the chapter . . . . . . . . . . . . . . . . . . . . . . . 108 4.2 Basic properties of TGV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 4.3 Formulation of the problem and optimality conditions . . . . . . . . . . . . 110 4.4 Properties of solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 4.4.1 Basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117 4.4.2 L2–linear regressions . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 4.4.3 Even and odd functions . . . . . . . . . . . . . . . . . . . . . . . . . 123 4.5 A note on the relationship of TV and TGV in dimension two . . . . . . . . 126 4.6 Computation of exact solutions . . . . . . . . . . . . . . . . . . . . . . . . . 130 4.6.1 Piecewise constant function with a single jump . . . . . . . . . . . . 130 4.6.2 Piecewise affine function with a single jump . . . . . . . . . . . . . . 145 4.6.3 Hat function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 4.6.4 Numerical experiments . . . . . . . . . . . . . . . . . . . . . . . . . . 153 5 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications 157 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 5.1.1 Organisation of the chapter . . . . . . . . . . . . . . . . . . . . . . . 160 5.2 Localisation – The smooth case . . . . . . . . . . . . . . . . . . . . . . . . . 160 5.3 Localisation – The W 2,p(Rd) case . . . . . . . . . . . . . . . . . . . . . . . . 165 5.4 Second order non-local integration by parts . . . . . . . . . . . . . . . . . . 169 5.5 Localisation – The BV2(Rd) case . . . . . . . . . . . . . . . . . . . . . . . . 171 5.6 Non-local characterisation of W 2,p(Rd) and BV2(Rd) . . . . . . . . . . . . . 173 10 CONTENTS 5.7 An alternative non-local Hessian approach and an application to image denoising . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176 6 Conclusions 181 A 183 A.1 Exclusion of the cases N-0-P2-J, N-0-P1-C, N-0-P2-C . . . . . . . . . . . . 183 A.2 Some useful Lemmas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188 Bibliography 191 11 12 List of Notation Against each entry is the page at which the notation is introduced. Sets and Measures: R+ 41 The set of positive real numbers. R 49 The set of extended real numbers, R = R ∪ {+∞}. Sd×d 30 The set of d× d symmetric matrices. Ω 42 An open subset of Rd. B(X) 39 The Borel σ-algebra of X. |µ| 41 The total variation measure of the finite Radon measure µ. M(X) 41 The set of real valued finite Radon measures on X. Endowed with the norm ‖µ‖M(X) = |µ|(X). M(X,R`) 41 The set of R`–valued finite Radon measures on X. Endowed with the norm ‖µ‖M(X) = |µ|(X). M+(X) 41 The set of finite positive Radon measures on X. M+loc(X) 41 The set of positive Radon measures on X. ‖T‖M 42 The Radon norm of the distribution T on a set Ω: ‖T‖M = sup{〈T, v〉 : v ∈ C∞c (Ω), ‖v‖∞ ≤ 1}. If T is a finite Radon measure µ then ‖µ‖M = |µ|(Ω). L 39 The Lebesgue measure on R. Ld 39 The Lebesgue measure on Rd. H 26 The one dimensional Hausdorff measure. Hd 46 The d–dimensional Hausdorff measure. µ ν 42 The Radon–Nikody´m density of µ with respect to ν. sgn(µ) 42 Alternative notation for µ|µ| . 
µα 42 The absolutely continuous part of µ with respect to Lebesgue measure, i.e., µα = µLd . µs 42 The singular part of µ with respect to Lebesgue measure. Spaces of continuous and differentiable functions: 13 Cc(X) 40 The space of real valued continuous functions of compact support in X. Endowed with the supremum norm ‖u‖∞ = supx∈X |u(x)|. Cc(X,R`) 40 The space of R`–valued continuous functions of compact support in X. Endowed with the supremum norm ‖u‖∞ = supx∈X |u(x)|. C0(X) 40 The completion of Cc(X) under the supremum norm. C0(X,R`) 40 The completion of Cc(X,R`) under the supremum norm. Ckc (X) 40 The space of real valued, k–times continuously differentiable functions with compact support in X. Ckc (X,R`) 40 The space of R`–valued, k–times continuously differentiable functions with compact support in X. Ckc (X,Sd×d) 30 The space of Sd×d–valued, k–times continuously differen- tiable functions with compact support in X. Ck(X) 40 The space of k–times differentiable functions, with continu- ous derivatives up to the boundary of X. C∞c (X) 40 The space of real valued, infinitely many times differentiable functions with compact support in X. C∞c (X,R`) 40 The space of R`–valued, infinitely many times differentiable functions with compact support in X. Spaces of Lebesgue integrable functions: Lp(X,R`;µ) 39 Space of R`–valued, µ–measurable functions such that´ X |u|pdµ < ∞, where 1 ≤ p < ∞ and µ is a positive Radon measure. The corresponding norm is ‖u‖Lp(X,R`;µ) =(´ X |u|pdµ )1/p . Lp(X,R`) 39 Short version of Lp(X,R`;Ld). Lp(X) 39 Short version of Lp(X,R;Ld). L∞(Ω,R`;µ) 39 Space of R`–valued, µ–essentially bounded measurable func- tions, where µ is a positive Radon measure. The correspond- ing norm is ‖u‖L∞(Ω,R`;µ) = ess sup x∈Ω |u(x)|. L∞(Ω) 39 Short version of L∞(Ω,R;Ld), i.e., the space of real valued, essentially bounded Lebesgue measurable functions. Sobolev spaces: Du 24 The distributional derivative of the function u. 14 W k,p(Ω) 40 Sobolev space of functions u ∈ Lp(Ω), such that the dis- tributional derivatives up to order k are also Lp functions (weak derivatives). The corresponding norm is ‖u‖Wk,p(Ω) =(∑ |a|≤k ´ Ω |Dau|pdx )1/p , where Da denotes the a–th distri- butional derivative of u and |a| = a1 + · · ·+ ad is the order of the multiindex a = (a1, . . . , ad). Hk(Ω) 40 The Sobolev space W k,2(Ω). Hk0 (Ω) 40 The completion of C∞c (Ω) under the ‖ · ‖Hk(Ω) norm. Functions of bounded variation, bounded Hessian and bounded deformation: BV(Ω) 44 The space of functions of bounded variation on Ω, i.e., all the functions u ∈ L1(Ω), such that Du is a Rd–valued finite Radon measure. The corresponding norm is ‖u‖BV(Ω) = ‖u‖L1(Ω) + |Du|(Ω). BV(Ω,R`) 44 The space of R`–valued functions of bounded variation on Ω, i.e., all the functions u ∈ L1(Ω,R`), such that Du is a R`×d– valued finite Radon measure. The corresponding norm is ‖u‖BV(Ω,R`) = ‖u‖L1(Ω,R`) + |Du|(Ω). TV(u) 44 The total variation of u ∈ L1(Ω,R`), equal to |Du|(Ω) for a function u ∈ BV(Ω,R`). D2u 60 The second order distributional derivative of u. BV2(Ω) 29 The space of functions of bounded Hessian on Ω, i.e., all the functions u ∈ W 1,1(Ω) such that ∇u ∈ BV(Ω,Rd). The corresponding norm is ‖u‖BV2(Ω) = ‖u‖BV(Ω) + |D2u|(Ω). TV2(u) 29 The second order total variation of u, equal to |D2u|(Ω) for a function u ∈ BV2(Ω). Eu 30 The distributional symmetrised gradient of u. ICTVβ,α(u) 29 The TV–TV 2 infimal convolution of u with weights α and β, i.e., ICTVβ,α(u) = infv∈BV2(Ω) αTV(u− v) + β TV2(v). 
TGV2β,α(u) 30 The second order total generalised variation of u with weights α, β > 0. BD(Ω) 30 The space of functions of bounded deformation on Ω, i.e., all the functions u ∈ L1(Ω,Rd) such that Eu is a Rd×d–valued finite Radon measure. Dαu or ∇u 44 The absolutely continuous part of Du with respect to Lebesgue measure, i.e., shorter version of (Du)α = DuLd . Dsu 44 The singular part of Du with respect to Lebesgue measure, i.e., shorter version of (Du)s. u′ 44 Short version of ∇u, when u is an one dimensional BV func- tion. pV(u, (a, b)) 47 The pointwise variation of u in (a, b) ⊆ R. eV(u, (a, b)) 47 The essential variation of u in (a, b) ⊆ R. 15 Ju 48 The jump set of u. u(x−), u(x+) 48 Left and right limits of u at the point x. ul, ur 48 Left and right continuous representatives of u. uup, ulow 48 Good representatives of u, defined as u up(x) = max{ul(x), ur(x)} and ulow(x) = min{ul(x), ur(x)}. uΩ 46 The trace of u in ∂Ω for a function u ∈ BV(Ω). Miscellaneous notation: uΩ 46 The mean value of u in Ω, i.e., uΩ = 1 Ld(Ω) ´ Ω u dx. XA 21 The characteristic function of the set A, i.e., XA(x) = { 1 if x ∈ A, 0 if x /∈ A. IA 51 The indicator function of the set A, i.e., IA(x) = { 0 if x ∈ A, ∞ if x /∈ A. ϕ∞ 59 The recession function of ϕ : R` → R, defined for every x ∈ R`: ϕ∞(x) := limt→∞ ϕ(tx)t . scτF 50 The lower semicontinuous envelope (relaxed functional) of F : X → R, with respect to the topology τ . X∗ 50 The analytic dual of the space X. Its elements are denoted by x∗, y∗ . . .. F ∗ 51 The convex conjugate of a function F . If F : X → R, then F ∗ : X∗ → R is defined as F ∗(x∗) = supx∈X〈x∗, x〉 − F (x). Λ∗ 52 The adjoint operator Λ∗ : Y ∗ → X∗ of the bounded linear functional Λ : X → Y , where X, Y are Banach spaces. ∂F 51 The subdifferential of F . Sgn(µ) 114 The set valued sign of the finite Radon measure µ on Ω, defined as: Sgn(µ) = {v ∈ L∞(Ω) ∩ L∞(Ω, |µ|) : ‖v‖L∞(Ω) ≤ 1, v = sgn(µ), |µ| − a.e.}. f 120 The L1–linear regression of f , f := argmin v affine ‖f − v‖L1(Ω). f? 121 The L2–linear regression of f , f? := argmin v affine ‖f − v‖2L2(Ω). Gnu 37 Non-local gradient of u (explicit formulation). Hnu 37 Non-local Hessian of u (explicit formulation). Gσxu(x) 37 Non-local gradient of u at x with weighting function σx (im- plicit formulation). Hσxu(x) 37 Non-local Hessian of u at x with weighting function σx (im- plicit formulation). d 2u(x, y) 37 Second order finite difference scheme: d 2u(x, y) = u(y)− 2u(x) + u(x+ (x− y)). 16 Assumed notation: N - The set of natural numbers. R, Rd - The set of real numbers and its d–Cartesian product. In both cases the Euclidean norm is denoted by | · |. R`×d - The set of `×d real matrices, equipped with the Frobenious norm denoted by | · |. D b Ω - D ⊆ Ω with D being compact. µbA - The restriction of the measure µ to the set A. δx - The Dirac atomic measure concentrated on {x}. B(x, δ) - The open ball in Rd with center x and radius δ. a⊗ b - The `×d matrix with (ij)–th entry aibj , for a ∈ R`, b ∈ Rd. divu - Divergence of u : Ω→ Rd, i.e., divu = ∑di=1 ∂ui∂xi . div2u - Second order divergence of u : Ω→ Rd×d, i.e., ∑di,j=1 ∂2uij∂xi∂xj . ∆u - The Laplacian of u : Ω→ R, i.e., ∆u = ∑di=1 ∂2ui∂x2i . 17 18 Chapter 1 Introduction 1.1 Digital images, image processing and their mathematics This is “the earliest surviving camera photograph” [Hir08]: Figure 1.1: Joseph Nice´phore Nie´pce: Point de vue du Gras, 1826 or 1827. Source: http://en.wikipedia.org/wiki/History_of_photography. 
Even though undoubtedly a photograph like the above represents a real breakthrough for humanity, it would be considered an image of very low quality in the modern world. High quality digital cameras dominate the photography world of our time and people always look for the technically and visually perfect photo. Joseph Nicéphore Niépce had no hope of improving his photograph, and that was not only because he was lacking the proper – by today's standards – camera equipment. Even a few decades ago, when film cameras were the photographer's only choice, the post-processing techniques were only accessible to the selected few that possessed the secrets of the darkroom. Yet, those techniques were fairly limited, allowing only for some simple colour, brightness and contrast adjustment. However, the development of computers and their unavoidable use in our everyday life arrived soon and all the information started to be encoded in bits and bytes. Photographs were no exception and digital photography was born. An image is no longer seen as a chemical mark left on a film but as a collection of numbers. Even though admittedly this sounds less romantic, it provides a convenient framework for us mathematicians, to see digital images as functions: A typical digital image of resolution N × M, that is to say N·M pixels, can be modelled as a function u = (u_red, u_green, u_blue),

u : {1, . . . , N} × {1, . . . , M} → {0, . . . , 255}^3,    (1.1)

where each component measures the intensity of the corresponding channel (RGB images). This representation now gives a new meaning to image processing. For example, the sentence "Given a digital image which has some undesirable characteristics, like noise, blur etc., we would like to obtain a visually and aesthetically better version of it" can be interpreted as "Given a function that has certain characteristics and some local or global regularity properties, we would like to obtain another function, close to the initial one in some sense, with potentially different regularity properties and with some desirable features". Mathematicians use the term operator to denote an object that takes a function as an input and gives another function as an output. Thus, past complicated processing procedures in the mini lab, like contrast and colour adjustment, are just simple linear operations in the new language. However, more complicated image processing tasks like noise removal require more sophisticated "operators" or, in other words, mathematical methods and algorithms. If today we can perform these tasks it is not only due to the introduction of computers but mainly because of the development of an advanced mathematical technology both at a theoretical and an applied level. Therefore, there is an interplay between the increasingly challenging image processing problems and the progress of novel mathematical theories. While a new mathematical method is most of the time introduced to tackle a specific image processing task, it often turns out that this method leads to fruitful mathematical theories that are interesting in their own right. A mathematician working in mathematical imaging likes to prove things about a method, not only in order to provide rigorously justified insights about the behaviour of the method but also because he/she is intrigued by the mathematical beauty of the theory behind it. We do not claim that the present thesis is solely about image processing.
Even though we are motivated by complicated image processing problems and we introduce novel methods, most of the results can find their own autonomous place in the mathematical literature (especially Chapter 5).

1.2 The variational approach for image reconstruction

We have already mentioned that a digital image can be considered as a vector valued function on an N × M grid. However, in order to have a convenient and familiar mathematical framework, the analysis is usually done in the continuous setting (returning to the discrete setting for the numerical implementation). Thus, an RGB colour image u is regarded as a function

u : Ω → R^3,    (1.2)

where Ω ⊆ R^2 is typically a rectangular domain. Analogously, greyscale images are modelled as real valued functions on Ω, i.e., u : Ω → R. Suppose now that we have some image acquisition device. In an ideal world that device should provide us with a perfect image u. Of course, the word perfect is subjective but we ignore any artistic views on the matter. Thus for us, features like noise, blur, artificial artifacts are always undesirable. Quite often, due to imperfections and limitations in the acquisition process we obtain a corrupted version f of the image u. Corruptions could be noise or blur for instance. A typical model in imaging assumes that u has been transformed through a continuous and linear operator T with the additional presence of random noise. In particular we assume that the two images fulfil the model

f = Tu + η.    (1.3)

Here η denotes a random noise component that follows a certain distribution. Typically, noise is modelled by Gaussian, Poisson, uniform, "salt and pepper" (impulse) distributions or even a combination of these. The operator T is called the forward operator of the problem. Examples are the identity operator (in that case the only undesirable feature is noise), blurring operators (in which case Tu denotes the convolution with a blurring kernel) and pointwise multiplication with the characteristic function of Ω\D, X_{Ω\D}, where D ⊆ Ω is a subdomain of Ω (inpainting domain) where no data are known at all and in which information from the known part should be interpolated (image inpainting). In Figure 1.2 we give some examples of corrupted images for different forward operators T and noise distributions. Depending on the nature of the operator T the reconstruction process is given a specific name. The classical imaging tasks considered in this thesis are the following:

(i) Image denoising: T is the identity operator, i.e., only noise is present, see Figures 1.2(b) and 1.2(d).

(ii) Image deblurring: T represents a convolution (blurring) operator and noise might or might not be present, see Figure 1.2(c).

(iii) Image inpainting: In this case Tu = X_{Ω\D}u, where D ⊆ Ω is a domain of missing information and noise might or might not be present, see Figures 1.2(e) and 1.2(f).

Figure 1.2: Examples of corrupted images for different operators T and noise distributions. (a) The original image u. (b) Image f corrupted with Gaussian noise: f = u + η, where η follows a Gaussian distribution. (c) Image f corrupted by Gaussian blur and noise: f = Tu + η, where T denotes a convolution with a Gaussian kernel and η follows a Gaussian distribution. (d) Image f corrupted by impulse noise: f = u + η, where η follows a "salt and pepper" distribution (only a percentage of pixels are corrupted). (e) Image f with missing information (white letters): f = X_{Ω\D}u, where D denotes the area covered by the letters. (f) Image f with missing information (white letters) and corrupted by noise: f = X_{Ω\D}u + η, where D denotes the area covered by the letters and η follows a Gaussian distribution.

In order to reconstruct the initial image u one has to invert the operator T. However, this is not always possible. For example, in the case of inpainting the operator T is not invertible at all, while in the case of deblurring, even though the operator could be
invertible in theory, its inversion involves an ill-conditioned problem even in the absence of noise. The presence of noise further complicates the problem. In this case, it is a common procedure to add some a priori information to the model, which in general is given by a certain regularity assumption on the image u. This is the so-called regularisation process. In its most general setting one tries to recover a reconstructed version of f as a minimiser of a functional which has the following form:

J(u) = Φ(Tu, f) + Ψ(u).    (1.4)

The function Φ is called the data fidelity or data fitting term and measures the distance between the data f and the reconstruction u after the forward operator has acted on it. Small values of this term ensure that the transformed image Tu is close to the data f in a suitable sense. The function Ψ is called the regulariser or regularising term and it imposes extra regularity on u. It is expected that small values of Ψ will lead, up to a certain extent, to the elimination of the undesirable features in the corrupted data f. The two terms are typically balanced by one or more positive parameters within Ψ. The choice of regulariser also decides upon the space in which the minimisation of J is well-posed. It is normally chosen to be a Banach space X on which Ψ takes finite values.

As one can speculate, given the problem (1.3), a good or bad choice of the data fidelity term can lead to a reconstruction of high or poor quality respectively. It is well known in the mathematical imaging community that the correct choice of the fidelity term depends on the noise distribution. For example, if the noise follows a Gaussian distribution, the appropriate choice for Φ is the squared L^2 norm of the difference Tu − f, i.e., Φ(Tu, f) = ½ ∫_Ω (Tu − f)^2 dx. In the case of impulse noise, it has been shown in various contexts [Nik04, Nik02, CE05, DAG09, WGO07] that the L^1 norm of the difference Tu − f, ∫_Ω |Tu − f| dx, leads to more efficient restorations. In the case of Poisson noise, the Kullback–Leibler divergence of Tu and f, ∫_Ω (Tu − f log Tu) dx, is the most suitable choice [LCA07, BSW+13, SBJM09], while the supremum norm ‖Tu − f‖_{L^∞(Ω)} should be preferred in the case of uniform noise [Cla12]. Table 1.1 provides a summary of the above. Noise of mixed distributions can also be considered; in that case a combination of the corresponding fidelities is used [NWC13, DlRS12, CDlRS14, HL13].

Type of noise | Φ(Tu, f)
Gaussian      | ½ ∫_Ω (Tu − f)^2 dx
Impulse       | ∫_Ω |Tu − f| dx
Poisson       | ∫_Ω (Tu − f log Tu) dx
Uniform       | ‖Tu − f‖_{L^∞(Ω)}

Table 1.1: The correct choices of data fidelity terms for different noise distributions.
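To make the entries of Table 1.1 concrete, the following is a minimal numpy sketch of their discrete counterparts, where sums over pixels stand in for the integrals over Ω; the function name and the small stabilising constant eps are illustrative assumptions, not part of the thesis.

```python
import numpy as np

def fidelity(Tu, f, noise="gaussian", eps=1e-12):
    """Discrete counterparts of the data fidelity terms of Table 1.1.
    Tu and f are arrays of the same shape; pixel sums replace integrals."""
    r = Tu - f
    if noise == "gaussian":      # (1/2) * sum (Tu - f)^2
        return 0.5 * np.sum(r ** 2)
    if noise == "impulse":       # sum |Tu - f|
        return np.sum(np.abs(r))
    if noise == "poisson":       # sum (Tu - f * log Tu); eps avoids log(0)
        return np.sum(Tu - f * np.log(Tu + eps))
    if noise == "uniform":       # sup norm of the residual
        return np.max(np.abs(r))
    raise ValueError("unknown noise model: " + noise)
```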
While the statistics of the noise give rise to a particular choice of the fidelity term, the choice of the regulariser is determined by our a priori knowledge of the properties of the desired image u. Different regularisers impose different regularity and qualitative characteristics on the image u. Squared Hilbert space norms have a long tradition in inverse problems. The most prominent example is Tikhonov regularisation [Tik63, Whi23] with

Ψ(u) = α ∫_Ω |∇u|^2 dx,    (1.5)

where α > 0 and ∇u is a weak gradient of u, which – for Gaussian noise – leads to the following minimisation of (1.4):

min_{u∈H^1(Ω)} ½ ∫_Ω (Tu − f)^2 dx + α ∫_Ω |∇u|^2 dx.    (1.6)

The well-posedness of (1.6) follows from a standard application of the direct method of calculus of variations, exploiting the fact that H^1(Ω) is reflexive and thus bounded sets in that space are weakly pre-compact. In the denoising case, i.e., when T is the identity, the gradient flow of the corresponding Euler–Lagrange equation of (1.6) reads u_t = α∆u − u + f. The result of such a regularisation technique is a linearly, i.e., isotropically, smoothed image u for which the smoothing strength does not depend on f. Hence, while eliminating the disruptive noise in the given data, also desirable structures like edges in the reconstructed image are blurred, see Figure 1.3. This observation gave way to a new class of non-smooth norm regularisers, which aim to eliminate noise and smooth the image in homogeneous areas, while preserving the relevant structures such as object boundaries and edges.

Figure 1.3: Illustration of the isotropically smoothed reconstruction using Tikhonov regularisation, i.e., when Ψ(u) = α‖∇u‖^2_{L^2(Ω)}. (a) The original image u. (b) Image f corrupted with Gaussian noise. (c) Denoised image with Tikhonov regularisation.

1.3 Total variation based methods

In their pioneering work, Rudin, Osher and Fatemi [ROF92] proposed the use of the total variation of u, TV(u), as regulariser. Recall that for a function u ∈ L^1(Ω), the total variation of u is defined as

TV(u) = sup { ∫_Ω u div v dx : v ∈ C^1_c(Ω, R^2), ‖v‖_∞ ≤ 1 },    (1.7)

where C^1_c(Ω, R^2) is the space of R^2–valued continuously differentiable functions with compact support in Ω. Functions that have finite total variation are exactly those whose distributional derivative Du can be represented by a finite Radon measure (space of functions of bounded variation, BV(Ω)), see Chapter 2 for a complete account on this subject. Recall also that if u ∈ W^{1,1}(Ω), one can easily show that TV(u) = ∫_Ω |∇u| dx.
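In the discrete setting used for numerical implementations, TV(u) = ∫_Ω |∇u| dx is typically replaced by a sum of finite-difference gradient magnitudes. The following is a minimal numpy sketch of one standard (isotropic, forward-difference) discretisation; the particular discretisation and boundary handling are illustrative assumptions rather than the scheme analysed later in the thesis.

```python
import numpy as np

def total_variation(u):
    """Isotropic discrete total variation of a 2-D array u:
    sum over pixels of |grad u|, with forward differences and
    zero differences at the last row/column (Neumann-type boundary)."""
    dx = np.zeros_like(u)
    dy = np.zeros_like(u)
    dx[:, :-1] = u[:, 1:] - u[:, :-1]   # horizontal forward differences
    dy[:-1, :] = u[1:, :] - u[:-1, :]   # vertical forward differences
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))
```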
Since its introduction in image processing in 1992, total variation has found numer- ous applications in the field, as the edge preservation ability made it quite an appealing tool. These applications cover a wide area, from image denoising [ROF92, CL97, Nik04, LCA07, BC98, CKS01, DHKS10, Pas12a], image deblurring [DS96, CW98, WYZ07, BT09, Pas12b], image inpainting [SC02, CS05, Pas12c] and image zooming [ACHR06] to JPEG decompression [ADF05, BH12b] MRI and PET reconstruction [BGH+14, BMPS14, Mu¨l13] to name a few. Properties of total variation regularisation have been extensively studied at the theo- retical level as well. Acar and Vogel [AV94] studied the well-posedness of problem (1.8). Meyer [Mey01] showed that in the L2–TV denoising problem with f being the characteris- tic function of a disk, the solution is a characteristic function of the same disk with a loss of contrast, whose amount depends on the parameter α, see Figure 1.5. Alter, Caselles and Chambolle [ACC05] extended these results for characteristic functions of certain con- vex bounded sets with a bound on the mean curvature of their boundaries (calibrable 25 Introduction ∼ α f u Figure 1.5: Illustration of Meyer’s result: Total variation regularisation on a characteristic function of a disk in R2 results only to a loss of contrast which is proportional to the parameter α. sets). For the same problem, Strong and Chan [SC03] computed exact solutions for 1D and rotationally symmetric 2D data and similar results were obtained by Ring [Rin00]. In [Gra07] Grasmair proved the equivalence of one dimensional TV regularisation and the taut–string algorithm and an extension to higher dimensions was done by Hinterberger et al. [HHK+03]. Properties of the L1–TV model were studied in [Nik02, DAG09, CE05]. Finally, Caselles, Chambolle and Novaga [CCN07] showed that the discontinuity set of the solution u of the L2–TV denoising problem in Ω ⊆ Rd, is contained, up to a Hd−1–null set, in the discontinuity set of the data function f . Here Hd−1 denotes the (d−1)–dimensional Hausdorff measure in Rd. Further regularity properties of the solution u of total variation denoising were proved by Allard [All08a, All08b, All09]. As it is observed in all the aforementioned applications and rigorously justified analyti- cally, total variation regularisation promotes piecewise constant reconstructions leading to images with blocky like structures. This generally becomes apparent in images that do not only consist of flat regions and jumps but also possess slanted regions, i.e., piecewise affine parts, see Figure 1.6. This is called the staircasing effect. In one dimension, this roughly (a) Image corrupted with Gaus- sian noise (b) Total variation denoised im- age with blocky-like artifacts. (c) Detail of total variation de- noised image. Figure 1.6: Illustration of the staircasing effect in total variation denoising in two dimen- sions. 26 1.3. Total variation based methods means that the total variation regularisation of a noisy signal f , is a staircase u, whose L2 norm is close to f , see Figure 1.7. In fact it can be rigorously proved that if in an open in- Figure 1.7: The staircasing effect in total variation denoising in one dimension. terval I we have u > f (or u < f), then u is constant on I [CCN07, Rin00, BKV13]. Since u and f are BV functions defined a.e., the inequalities u > f and u < f are interpreted through inequalities of certain good representatives of them, see Chapter 2 for the relevant definitions. 
Analytical studies of the staircasing effect are also done in [DMFLM09, DS96]. Since it leads to unnatural-looking images, the staircasing effect is considered an undesirable artifact and many methods have been proposed in order to reduce or eliminate it, see Section 1.4 for a detailed discussion.

The ability of total variation regularisation to preserve and promote discontinuities in the image u has also been exploited in image inpainting [SC02, CS05, Pas12c]. There, the corresponding minimisation problem reads

min_{u∈BV(Ω)} ½ ∫_{Ω\D} (u − f)^2 dx + α TV(u),    (1.9)

where D is the inpainting domain, i.e., the missing part of the image. A desirable feature in image inpainting is the interpolation of the image along large gaps (connectivity principle). Unfortunately, it turns out that total variation is able to provide interpolation results with sharp edges only if the gap is small enough [CKS02, Sch09]. In fact, total variation minimisation can be interpreted as a penalisation of the length of the level lines, interpolating them via the shortest path. In Figure 1.8 for example, we see that interpolation of a broken stripe is achieved only if the gap length is smaller than the width of the stripe, while Tikhonov regularisation (harmonic inpainting) leads to blurry reconstructions as expected, with no connection occurring either. Total variation achieves perfect sharp reconstruction of the stripe for small gaps but fails completely for large gaps. Also, following its general characteristic, total variation promotes piecewise constant reconstructions inside the inpainting domain, resulting again in images with blocky, unnatural-looking features, see Figure 1.9. In order to overcome these issues, several inpainting methods based on higher order derivatives have been suggested, see Section 1.4.

Figure 1.8: Illustration of the dependence of the connection ability of total variation inpainting on the gap size and comparison with harmonic inpainting (Ψ(u) = α‖∇u‖^2_{L^2(Ω)}). The inpainting domain D is denoted with grey colour. (a) Small gap. (b) TV inpainting. (c) Harmonic inpainting. (d) Large gap. (e) TV inpainting. (f) Harmonic inpainting.

Figure 1.9: Total variation inpainting promotes piecewise constant reconstructions inside the inpainting domain, which is unnatural for real world images. (a) Destroyed image. (b) TV inpainted image. (c) Detail of TV inpainted image.

1.4 Higher order variational models

In recent years, higher order versions of non-smooth image reconstruction methods have been considered in order to reduce the staircasing effect and hence to improve the quality of the restored image, as well as to fulfil the connectivity principle in inpainting. Some incorporate a combination of first and higher order derivatives – mainly second order – and some others are purely higher order. Intuitively, by minimising second order derivative-based energies, one tries to recover a piecewise affine version of the data in contrast to a piecewise constant version obtained by total variation regularisation. This is expected, roughly because, due to the introduction of the second order derivatives in the regulariser, its kernel now contains affine functions and not only constants.

Already in the paper of Chambolle and Lions [CL97], the authors proposed a higher order method for denoising by means of an infimal convolution of two convex regularisers.
Here, a noisy image is split into u = u_1 + u_2 by solving

min_{u_1∈BV(Ω), u_2∈BV^2(Ω)} ½ ∫_Ω (u_1 + u_2 − f)^2 dx + α TV(u_1) + β TV^2(u_2),    (1.10)

where TV^2(u_2) := TV(∇u_2) and BV^2(Ω) = {u ∈ W^{1,1}(Ω) : ∇u ∈ BV(Ω)} is the space of functions of bounded Hessian [Dem85]. The idea is to decompose the image u into a piecewise constant part u_1 and a piecewise affine part u_2. An alternative formulation for the minimisation problem (1.10) is

min_{u∈BV(Ω)} ½ ∫_Ω (u − f)^2 dx + ICTV_{β,α}(u),    (1.11)

with

ICTV_{β,α}(u) := min_{v∈BV^2(Ω)} α TV(u − v) + β TV^2(v) = min_{w∈BV(Ω,R^2) ∩ R(∇,R^2)} α‖Du − w‖_M + β‖Dw‖_M.    (1.12)

Here BV(Ω,R^2) is the space of R^2–valued functions of bounded variation, R(∇,R^2) = {w ∈ L^1(Ω,R^2) : ∃ v ∈ W^{1,1}(Ω) such that w = ∇v} is the range of the operator ∇ and ‖T‖_M is the Radon norm of the distribution T in Ω, ‖T‖_M = sup{〈T, v〉 : v ∈ C^∞_c(Ω), ‖v‖_∞ ≤ 1}. Note that for a finite Radon measure µ on Ω, we have ‖µ‖_M = |µ|(Ω). Again, the well-posedness of (1.10) and hence (1.11) is shown using the direct method of calculus of variations. Note that while u_1 + u_2 is unique in (1.10), one can always add a constant to u_1 and subtract the same constant from u_2, thus constructing another solution. In that sense, u_1 and u_2 are not unique but their sum is.

Another attempt to combine first and second order regularisation originates from Chan, Marquina, and Mulet [CMM01], who considered total variation minimisation together with weighted versions of the Laplacian. More precisely, they considered a regularising term of the form

α ∫_Ω |∇u| dx + β ∫_Ω ϕ(|∇u|)(∆u)^2 dx,    (1.13)

where ϕ must be a function with certain growth conditions at infinity in order to allow jumps. The well-posedness of (1.13) in one dimension was rigorously analysed by Dal Maso et al. in [DMFLM09]. In [CEP07] the term ‖∆u_2‖^2_{L^2(Ω)} was used instead of TV^2(u_2) in (1.10), also in combination with the use of the H^{−1} norm in the fidelity term. Regularising with second order derivatives only, taking Ψ(u) = α TV^2(u), has also been considered by Lysaker et al. [LLT03], Scherzer [Sch98], Hinterberger and Scherzer [HS06a], Bergounioux and Piffet [BP10] and Lai et al. [LTC13]. In Lefkimmiatis et al. [LBU12], the use of the spectral norm of the Hessian matrix was proposed. Further, Pöschl and Scherzer [PS08] studied minimisers of functionals which are regularised by the total variation of the k-th derivative. Finally, a combination of total variation with a fourth order PDE was considered by Lysaker and Tai in [LT06], while a higher order nonlinear diffusion filtering denoising technique was studied by Didas et al. in [DWB09].

All the above methods manage to reduce the staircasing effect to some degree but not to eliminate it completely. For example, the decomposition obtained by the infimal convolution is not the desirable one, since the effect of the TV^2 term is not strong enough to make the reconstruction piecewise smooth. On the other hand, regularising only with TV^2 introduces a fair amount of blur in the image. Moreover, some of the problems are difficult to implement and optimise – problem (1.13) is not even convex – while others are only tuned for image denoising, lacking the wide applicability of total variation regularisation.

Total Generalised Variation

A high quality regulariser that eliminates the common staircasing artifacts is the total generalised variation (TGV) introduced recently by Bredies, Kunisch and Pock [BKP10].
The second order total generalised variation of a function u ∈ L^1(Ω) with positive parameters α and β is defined as

TGV^2_{β,α}(u) = sup { ∫_Ω u div^2 v dx : v ∈ C^2_c(Ω, S^{2×2}), ‖v‖_∞ ≤ β, ‖div v‖_∞ ≤ α },    (1.14)

where C^2_c(Ω, S^{2×2}) is the space of S^{2×2}–valued twice continuously differentiable functions with compact support in Ω, with S^{2×2} being the space of 2 × 2 symmetric matrices. Definition (1.14) is called the supremum or the predual definition of TGV. An equivalent formulation, the minimum or primal definition, was shown in [BV11]:

TGV^2_{β,α}(u) = min_{w∈BD(Ω)} α‖Du − w‖_M + β‖Ew‖_M,    (1.15)

where Ew is the distributional symmetrised gradient of w and BD(Ω) is the space of functions of bounded deformation, i.e., the functions whose symmetrised distributional gradient can be represented by a finite Radon measure, see [TS80, Tem85]. One should note the difference between ICTV_{β,α} and TGV^2_{β,α} by observing the two definitions (1.12) and (1.15). While the minimisation in the ICTV_{β,α} definition (1.12) is restricted to functions that are gradients, the minimisation in (1.15) is done over the wider space BD(Ω), leading eventually to a better decomposition into piecewise constant and piecewise affine parts [Mül13]. As a result, total generalised variation-based regularisation, i.e., minimisation of (1.4) with Ψ(u) = TGV^2_{β,α}(u), has the ability to adapt to the regularity of the data, and images restored with this method are typically piecewise smooth, that is to say, not only the discontinuities but also the affine structures are preserved, see Figure 1.10. Let us note here that a modified infimal convolution approach that has been recently proposed in the discrete setting by Setzer et al. in [SST11] gives results very similar to TGV and in fact in some of its variants the two methods are equivalent.

Figure 1.10: Illustration of the ability of total generalised variation to preserve edges and affine structures at the same time, thus eliminating the staircasing artifacts. (a) Noisy image. (b) Total variation denoising. (c) Total generalised variation denoising.

Some successful applications of total generalised variation in image restoration and related tasks have been done in image denoising [BKP10, Bre14, BDH13], image deblurring [BV11, Bre14], image zooming [Bre14, BH13b], MRI reconstruction [KBPS11, BGH+14], diffusion tensor imaging [VBK13] and JPEG decompression [BH12a, Hol13]. Moreover, there have already been a few contributions to the analysis of the model. Properties of the one dimensional L^1–TGV model were studied by Bredies, Kunisch and Valkonen in [BKV13], while in [BH13a] Bredies and Holler studied the well-posedness of the corresponding Tikhonov regularisation problem established in the BD space. A comparison among TV, TV^2, ICTV and TGV regularisation is done in [BBBM13] by Benning et al. in the context of eigenfunctions, see also Benning's Ph.D. thesis [Ben11]. In particular, the authors investigate the capability of these methods to recover certain one dimensional data exactly apart from a loss of contrast (almost exact recovery). They compute exact solutions of the corresponding minimisation problems for data that can be recovered almost exactly by one method but not the others. Additionally, in his Ph.D. thesis [Mül13], Müller constructed a 2D function that can be recovered almost exactly by TGV but not by ICTV, emphasising the differences between the two functionals in two dimensions.
Note that ICTV and TGV regularisations coincide in dimension one, since in that case every L^1 function can be written as a weak derivative of another function, compare again definitions (1.12) and (1.15).

The main disadvantage of the TGV regularisation problem is the increased computational time involved in its solution. The most popular numerical method that is used for this purpose is the primal-dual algorithm of Chambolle–Pock [CP11]. Even though TGV minimisation is solved fairly fast with this algorithm, it is much slower than TV minimisation solved with the state of the art methods that have been developed for it, e.g. the split Bregman algorithm [GO09].

Inpainting methods

Higher order inpainting methods perform in general better than first order ones, like total variation inpainting, because of the additional directional information used for the interpolation process. Euler's elastica is a popular higher order variational method of this kind which is able to achieve large gap connectivity, see Nitzberg et al. [NMS93], Masnou and Morel [MM98], Chan, Kang and Shen [CKS02] and also Tai, Hahn and Chung [THC11]. In Euler's elastica, the regularising term reads

∫_Ω ( α + β ( ∇·(∇u/|∇u|) )^2 ) |∇u| dx,    (1.16)

i.e., a combination of the total variation and the curvature of the level lines of u. However, one disadvantage of the Euler's elastica inpainting model, for both analytical and numerical purposes, is that it is a non-convex minimisation problem. Convex approximations to the elastica energy have been studied in [BPW13]. Other examples of higher order inpainting are the Cahn–Hilliard inpainting by Bertozzi, Esedoglu and Gillette [BEG07], TV–H^{−1} inpainting by Burger et al. [BHS09], Schönlieb et al. [SBBH09] and Hessian-based surface restoration by Lai et al. [LTC13]. Moreover, curvature driven diffusion inpainting has been considered by Chan and Shen [CS01] while Mumford–Shah based inpainting has been proposed by Esedoglu and Shen [ES02]. Finally, inpainting via transport (first order inpainting) has also been considered, see for instance Bertalmío et al. [BSCB00] and Bornemann and März [BM07].

All the aforementioned inpainting methods are local methods, that is to say the information that is needed to fill in the inpainting domain D is only taken from points neighbouring the boundary of D. Non-local or global methods take into account all the information from the known part of the image, usually weighted by its distance to the point that is to be filled in. This class of methods is very powerful, allowing to fill in structures and textures almost equally well. However, they still have some disadvantages, for instance high computational cost is involved in their solution. Moreover, local methods are sometimes more desirable, as they tend to preserve geometrical features better, especially in relatively small inpainting domains. For non-local inpainting methods the reader is referred to e.g. Cao et al. [CGMP09], Criminisi et al. [CPT03] and Arias, Caselles and Sapiro [ACS09].

1.5 Contribution

This thesis consists of three main distinguishable parts, as these also result from the associated papers that have been produced by the author as part of his Ph.D. [PS14, PSS13], [PB13], [LPSS]. In [PS14], we introduce, analyse and implement an efficient combined first
and second order regularisation method with applications in image denoising and deblurring, also providing an additional experimental analysis for image inpainting accompanied by an online demonstration in [PSS13]. We contribute to the analysis of TGV in [PB13] by analytically solving the one dimensional L^2–TGV denoising problem for simple data functions and studying further properties of the model. Finally in [LPSS], we introduce and study a non-local Hessian functional, proving its localisations to the classical Hessian in several regularity levels, a work that also leads to some novel characterisations of higher order Sobolev and BV spaces in the spirit of Bourgain, Brezis, Mironescu [BBM01]. We also introduce an alternative definition of the non-local Hessian that is able to produce high quality piecewise affine reconstructions in image denoising, as confirmed by numerical examples. In what follows, let us give an outline for each of these contributions, which will be presented in more detail in Chapters 3, 4 and 5.

1.5.1 The TV–TV^2 approach for image reconstruction

In Chapter 3, we introduce the following combined first and second order regulariser (TV–TV^2) for a function u ∈ BV^2(Ω):

α ∫_Ω ϕ_1(∇u) dx + β ϕ_2(D^2u)(Ω),    (1.17)

where ϕ_1 : R^2 → R^+ and ϕ_2 : R^4 → R^+ are two convex functions with at most linear growth at infinity. As usual, α, β are positive regularisation parameters that balance the two terms. The definition is set up in the context of convex functions of measures – the term D^2u is generally a finite Radon measure. This is done in order to define rigorously expressions like √(|D^2u|^2 + ε) (where ϕ_2(x) = √(|x|^2 + ε)), making them equal to √(|∇^2u|^2 + ε) in the special case when D^2u = ∇^2u L^2, i.e., when the measure D^2u is representable by a function ∇^2u. In the case when ϕ_2 is the Euclidean norm |·|, the total variation |D^2u|(Ω) of the measure D^2u is recovered. We study the corresponding variational problem

min_{u∈BV^2(Ω)} ½ ∫_Ω (Tu − f)^2 dx + α ∫_Ω ϕ_1(∇u) dx + β ϕ_2(D^2u)(Ω).    (1.18)

Under some mild additional assumptions, we prove the existence and uniqueness of solutions of (1.18), via the method of relaxation in the spirit of [Ves01], i.e., by identifying the minimising functional's lower semicontinuous envelope with respect to the naturally defined weak∗ and strict topologies in BV^2(Ω). The method is implemented with the split Bregman algorithm [GO09], to whose convergence theory we also contribute, and we present applications in image denoising and deblurring.

The idea of this combination of first and second order dynamics is to regularise with a fairly large weight α in the first order term, preserving the jumps as well as possible, and using a not too large weight β for the second order term, in a way that staircasing artifacts created by the first order regulariser are reduced without introducing any serious blur in the reconstructed image. We show that for image denoising and deblurring the model (1.18) offers solutions whose quality (assessed by the structural similarity index SSIM [WBSS04, WB09]) is not far off from the ones produced by TGV. Moreover, the computational effort needed for its numerical solution is not much more than the one needed for solving the standard total variation model. For comparison, the numerical solution of TGV regularisation is in general about ten times slower than this, see the corresponding table in Section 3.6.
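As a toy illustration of the combined regulariser in (1.17)–(1.18), the following is a hypothetical one dimensional sketch that minimises a discrete energy with the smoothed norms ϕ(x) = √(|x|^2 + ε) mentioned above, using plain gradient descent (Adam) via automatic differentiation. It is only an illustrative stand-in, not the split Bregman scheme developed in the thesis; the data, parameter values and grid scaling are assumptions.

```python
import torch

torch.manual_seed(0)
n = 256
x = torch.linspace(0, 1, n)
clean = torch.where(x < 0.5, 1.0 + x, 0.2 + 0.5 * x)  # piecewise affine signal with a jump
f = clean + 0.05 * torch.randn(n)                      # noisy data

alpha, beta, eps = 0.02, 0.02, 1e-6
u = f.clone().requires_grad_(True)
opt = torch.optim.Adam([u], lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    du = u[1:] - u[:-1]                  # first order differences  (stand-in for the gradient)
    d2u = u[2:] - 2 * u[1:-1] + u[:-2]   # second order differences (stand-in for D^2 u)
    energy = (0.5 * torch.sum((u - f) ** 2)
              + alpha * torch.sum(torch.sqrt(du ** 2 + eps))
              + beta * torch.sum(torch.sqrt(d2u ** 2 + eps)))
    energy.backward()
    opt.step()
# u now holds an approximately piecewise affine reconstruction with the jump preserved
```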
We proceed by applying the TV–TV^2 method to image inpainting, where we observe that we can achieve connectivity over larger gaps than total variation. Thus, the method results in more natural-looking inpainted images, see Figure 1.11.

Figure 1.11: Illustration of the differences between pure TV and pure TV^2 inpainting reconstructions. (a) Destroyed image. (b) TV inpainted image. (c) TV^2 inpainted image. (d) Detail of destroyed image. (e) Detail of TV inpainted image. (f) Detail of TV^2 inpainted image.

The minimisation of the TV–TV^2 functional can be seen as a convex simplification of the Euler's elastica idea, where we have replaced the non-convex curvature by the convex total variation of the gradient of u. In an experimental analysis we indicate the differences between pure first and pure second order total variation inpainting, we study the influence of the inpainting domain on large gap connection and we propose a numerically justified rule for optimal tuning of the parameters within the split Bregman iteration. We thus end up with a fast implementation of TV–TV^2 inpainting. Finally, we provide the source code for the algorithm written in C and an online demonstration on the Image Processing Online (IPOL) platform, accessible on the article web page http://dx.doi.org/10.5201/ipol.2013.40.

1.5.2 Exact solutions of TGV minimisation in one dimension

In Chapter 4, we study the one dimensional L^2–TGV denoising model:

min_{u∈BV(Ω)} ½ ∫_Ω (u − f)^2 dx + TGV^2_{β,α}(u),    (1.19)

where Ω = (a, b) is an open interval of R. The motivation is to understand more deeply how this kind of regularisation behaves and to unravel the role of the parameters α and β. We formulate the predual problem of (1.19) and we derive the corresponding optimality conditions in the spirit of [Rin00, BKV13] using Fenchel–Rockafellar duality. We examine some of the basic properties of the solutions, e.g., behaviour near and away from the boundary, preservation of discontinuities and facts about the L^2–linear regression. We also show that, at least for even data functions, TGV and TV regularisations coincide for large enough ratio β/α, and we extend this result to dimension two. Moreover, we compute exact solutions of (1.19) for three different data functions f: a piecewise constant function with a single jump, a piecewise affine function with a single jump and a hat function, see for instance Figures 1.12 and 1.13. The length of the interval (a, b), the size of the jump and the slope of the hat function appear as parameters in the solution formulae. Emphasis is given to how the characteristic features of the solutions (discontinuities, piecewise affinity) are affected by the parameters α and β. The analysis is furnished with some numerical experiments that verify our computations.

Figure 1.12: All the possible types of solutions (red) of the L^2–TGV denoising problem for a single jump data function.

Figure 1.13: All the possible types of solutions (red) of the L^2–TGV denoising problem for a hat function.

Note: We should mention here that a paper by Pöschl and Scherzer [PS13] has been prepared at the same time, independently of our work, which also addresses exact solutions of L^2–TGV in dimension one. Their focus is on determining how to choose the parameters α and β such that the solutions do not coincide with the respective L^2–TV and L^2–TV^2 regularisations. There, the authors compute exact solutions for a characteristic function of an interval I ⊆ (a, b) and for a hat function, with a fixed slope.
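For a concrete feel of problem (1.19), the following is a small numerical illustration of a discretised one dimensional L^2–TGV^2 denoising problem, written as a convex program and solved with cvxpy. The forward-difference discretisation, the toy data and the parameter values are illustrative assumptions; Chapter 4 works with the continuous problem and derives exact solutions analytically.

```python
import numpy as np
import cvxpy as cp

n = 200
x = np.linspace(0, 1, n)
clean = np.where(x < 0.5, 1.0 + x, 0.2 + 0.5 * x)           # piecewise affine data with a jump
f = clean + 0.05 * np.random.default_rng(0).normal(size=n)

alpha, beta = 0.05, 0.1
u = cp.Variable(n)
w = cp.Variable(n - 1)                    # auxiliary variable, cf. the primal definition (1.15)
objective = (0.5 * cp.sum_squares(u - f)
             + alpha * cp.norm1(cp.diff(u) - w)   # alpha * ||Du - w||
             + beta * cp.norm1(cp.diff(w)))       # beta  * ||Dw||
cp.Problem(cp.Minimize(objective)).solve()
u_tgv = u.value   # typically piecewise affine with the jump preserved
```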
1.5.3 Non-local Hessian In a pioneering work, Bourgain, Brezis and Mironescu [BBM01] examined functionals of the type ˆ Ω ˆ Ω |u(x)− u(y)|p |x− y|p ρn(x− y)dxdy, (1.20) where ρn is a sequence of radial functions that concentrates at the origin as n tends to infinity and 1 ≤ p < ∞. They characterised the spaces W 1,p(Ω) for p > 1 and BV(Ω) by 35 Introduction f u (a) f u (b) f u (c) f u (d) Figure 1.12: All the possible types of solutions (red) of the L2–TGV denoising problem for a single jump data function. f u (a) f u (b) f u (c) f u (d) Figure 1.13: All the possible types of solutions (red) of the L2–TGV denoising problem for a hat function. showing that u ∈W 1,p(Ω) ⇐⇒ u ∈ Lp(Ω) and lim inf n→∞ ˆ Ω ˆ Ω |u(x)− u(y)|p |x− y|p ρn(x− y)dxdy <∞, u ∈ BV(Ω) ⇐⇒ u ∈ L1(Ω) and lim inf n→∞ ˆ Ω ˆ Ω |u(x)− u(y)| |x− y| ρn(x− y)dxdy <∞. The above characterisations were used by Aubert and Kornprobst [AK09] to solve some non-local variational problems in image denoising. In fact, there exists a fairly large amount of literature regarding non-local models for image reconstruction – we have already mentioned some non-local inpainting methods in Section 1.4. We mention as well the work by Gilboa and Osher [GO07, GO08], Buades, Bartomeu, Morel (non-local means) [BCM05] and also the Ph.D. thesis of Sawatzky [Saw11]. In [GM01], Gobbino and Mora examined the approximation of functionals depending on the gradient of u and on the behaviour of u near the discontinuity points by families of non-local functionals where the gradient is replaced by finite differences. Ponce [Pon04] derived similar characterisations with [BBM01] by studying functionals of the type ˆ Ω ˆ Ω ω ( |u(x)− u(y)| |x− y| ) ρ(x− y)dxdy, (1.21) where ω is a continuous function and ρ are not necessarily radial. Mengesha and Spector in [MS13] introduced the following non-local gradient operator: 36 1.5. Contribution Gnu(x) = N ˆ Ω u(x)− u(y) |x− y| x− y |x− y|ρn(x− y)dy, x ∈ Ω. (1.22) Functional (1.22) is defined rigorously as a distribution. The authors prove the localisation of the functionals (1.22) to their classical analogue ∇u, in various topologies and they obtain yet another characterisation of the spaces W 1,1(Ω) and BV(Ω): u ∈W 1,p(Ω) ⇐⇒ u ∈ Lp(Ω) and lim inf n→∞ ‖Gnu‖Lp(Ω) <∞, for p > 1, u ∈ BV(Ω) ⇐⇒ u ∈ L1(Ω) and lim inf n→∞ ‖Gnu‖L1(Ω) <∞, for p = 1. In Chapter 5 which constitutes the final part of the thesis, we introduce and study the following higher order non-local functional Hnu(x) = d(d+ 2) 2 ˆ RN d 2u(x, y) |x− y|2 ( (x− y))⊗ (x− y)− |x−y|2d+2 Id ) |x− y|2 ρn(x− y)dy, x ∈ R d, (1.23) where d 2u(x, y) := u(y) − 2u(x) + u(x + (x − y)) and the functions ρn are radial. We call (1.23) the explicit formulation of non-local Hessian. We show that the functional (1.23) localises as n tends to infinity to the continuous analogue ∇2u, in the topology that corresponds to the regularity of u. In particular we show that if u is smooth, this convergence is uniform, while if u ∈ W 2,p(Rd), 1 ≤ p < ∞ then we have Lp convergence. Finally, if u ∈ BV2(Rd) then we show that the sequence of measures HnuLd converges weakly∗ to D2u. Finally after introducing of a second order non-local integration by parts formula we are able to derive non-local characterisations of the spaces W 2,p(Ω) for 1 ≤ p <∞ and BV2(Rd). In particular, we prove the following u ∈W 2,p(Rd) ⇐⇒ u ∈ Lp(Rd) and lim inf n→∞ ‖Hnu‖Lp(Rd) <∞, p > 1, u ∈ BV2(Rd) ⇐⇒ u ∈ L1(Rd) and lim inf n→∞ ‖Hnu‖L1(Rd) <∞, p = 1. 
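A quick numerical sanity check of this localisation is easy to set up in dimension one, where the constant d(d+2)/2 together with the matrix factor in (1.23) reduces to 1. The snippet below evaluates the resulting integral by quadrature, with a normalised Gaussian used in place of ρn (an illustrative choice of concentrating weight), and shows the values approaching u''(x) as the width shrinks.

```python
import numpy as np

def nonlocal_hessian_1d(u, x, sigma):
    """One dimensional instance of the explicit non-local Hessian (1.23): for d = 1 the
    prefactor d(d+2)/2 times the matrix factor equals 1, leaving
        H u(x) = int [u(y) - 2u(x) + u(2x - y)] / (x - y)^2 * rho(x - y) dy,
    evaluated here by the trapezoidal rule, with rho a normalised Gaussian of width sigma
    standing in for the concentrating sequence rho_n."""
    y = x + np.linspace(-8.0 * sigma, 8.0 * sigma, 4001)
    y = y[np.abs(y - x) > 1e-8]                         # drop the removable singularity at y = x
    rho = np.exp(-(x - y) ** 2 / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)
    d2u = u(y) - 2.0 * u(x) + u(2.0 * x - y)
    return np.trapz(d2u / (x - y) ** 2 * rho, y)

# As sigma -> 0 the value approaches u''(x): here u = sin, so u''(0.7) = -sin(0.7).
for sigma in (0.5, 0.1, 0.02):
    print(sigma, nonlocal_hessian_1d(np.sin, 0.7, sigma), -np.sin(0.7))
```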
We then proceed to an alternative, implicit formulation of a non-local Hessian func- tional which is more suitable for variational problems in imaging. We define the non-local gradient and non-local Hessian of u at a point x with a positive weighting function σx, denoted by Gσxu(x) and Hσxu(x) respectively, as the minimisers of (Hσxu(x),Gσxu(x)) := argmin H′∈Rd×d,G′∈Rd 1 2 ˆ Ω−{x} Ru,G′,H′(x, z)2σx(z) dz, (1.24) where Ru,G′,H′(x, z) = u(x+ z)− u(x)−G′>z − 1 2 z>H ′z. (1.25) Here Ω ⊆ Rd is an open and bounded domain. The weighting function σx is determined 37 Introduction by solving a weighted Eikonal equation, thus making it aware of edges in the image u. We provide numerical examples in image denoising where we use the L1 norm of this implicit form of non-local Hessian as a regulariser, i.e., Ψ(u) = α ‖Hσu‖L1(Ω), for α > 0. Our results indicate that this model is able to preserve edges and promote piecewise affine reconstructions, achieving results of better quality than TGV does in that context. 1.6 Organisation of the thesis In Chapter 2 we provide some mathematical preliminaries mostly on Radon measures, functions of bounded variation, lower semicontinuous envelopes and some results from convex analysis, following mainly [AFP00] [DM93] and [ET76]. In Chapter 3 we discuss the TV–TV2 approach for image reconstruction following [PS14] and [PSS13]. The well-posedness of the method is shown and we discuss its appli- cations to image denoising, deblurring and inpainting by implementing it with the split Bregman algorithm. In Chapter 4 we study the one dimensional total generalised variation denoising prob- lem. We examine the basic properties of the model and we compute exact solutions for simple data functions. All the material is contained in [PB13] apart from Section 4.5 in where we state a result on TV–TGV relationship in dimension two that was obtained after the submission of [PB13]. In Chapter 5, we introduce and study the explicit form of non-local Hessian func- tional following [LPSS]. We prove its localisation to the classical Hessian and we provide some non-local characterisations of W 2,p(Rd) and BV2(Rd) spaces. We also introduce the implicit non-local Hessian formulation and apply it in image denoising. Some Lemmas that require simple but extensive computations were moved to Appendix A for the sake of the nice presentation of the thesis. 38 Chapter 2 Mathematical preliminaries In this chapter we summarise some mathematical tools and notions that are necessary for the rest of the thesis. In particular, after recalling some basic facts about Radon measures, we proceed to a review of the theory of functions of bounded variation which is central for our work. After that, we state some basic facts about lower semicontinuous envelopes or otherwise called relaxed functionals. Finally, we mention some results from convex analysis mainly from duality theory. We assume that the reader has a basic knowledge of real and functional analysis, measure theory and Sobolev space theory. For these subjects we refer the reader to [DiB02, Bol90, LL01, Eva10]. 2.1 Notation on function spaces We start with a few remarks regarding our notation on functions spaces. Let µ be a positive measure on a metric measure space (X,B(X)), where B(X) denotes the σ-algebra of X. We denote by Lp(X,R`;µ) the space of R`–valued, µ–measurable functions u : X → R`, such that ´ X |u|pdµ <∞, for 1 ≤ p <∞ and the space of µ–essentially bounded functions for p =∞. 
These spaces are Banach spaces with the norms ‖u‖Lp(X,R`;µ) = (ˆ X |u|pdµ )1/p , 1 ≤ p <∞, ‖u‖L∞(X,R`;µ) = ess sup x∈X |u(x)|, p =∞. When the measure µ is omitted then it is assumed to be the Lebesgue measure Ld in Rd (L in R) and of course in that case X is a subset of Rd. Thus, Lp(X,R`) is a short version of Lp(X,R`;Ld). When the range of the functions, R`, is omitted then it is assumed that ` = 1. Hence, Lp(X) is a short version of Lp(X,R;Ld). This notation rule applies in all the function spaces. 39 Mathematical preliminaries The space of R`–valued continuous functions with compact support in X is denoted by Cc(X,R`) and it is endowed with the supremum norm ‖u‖∞ = supx∈X |u(x)|. The completion of Cc(X,R`) under the supremum norm is denoted by C0(X,R`). Consistently, Cc(X) and C0(X) are short versions of Cc(X,R) and C0(X,R) respectively. The space of R`–valued, k–times continuously differentiable functions with compact support in Ω, where Ω is an open domain in Rd, is denoted by Ckc (Ω,R`). When k = ∞, we have the space of smooth functions of compact support in Ω. Again, Ckc (Ω) and C∞c (Ω) stand for Ckc (Ω,R) and C∞c (Ω,R) respectively. Similarly, the corresponding spaces of differentiable functions with not necessarily compact support are denoted by Ck(Ω,R`), C∞(Ω,R`), Ck(Ω) and C∞(Ω). As usual, W k,p(Ω) denotes the Sobolev space of functions whose distributional deriva- tives from order zero up to k are Lp functions. The corresponding norm is ‖u‖Wk,p(Ω) = ∑ |a|≤k ˆ Ω |Dau|pdx 1/p , where Dau denotes the a–th distributional derivative of u and |a| = a1 + · · · + ad is the order of the multiindex a = (a1, . . . , ad). Furthermore, H k(Ω) denotes the Hilbert space W k,2(Ω) and Hk0 (Ω) is the completion of C∞c (Ω) under the ‖ · ‖Hk(Ω) norm. As expected, the corresponding R`–valued Sobolev spaces are denoted by W k,p(Ω,R`), Hk(Ω,R`) and Hk0 (Ω,R`). We denote with Ck(Ω) the space of k–times differentiable functions, with continuous derivatives up to the boundary of Ω. Finally, matrix–valued functions are also considered. For instance C∞c (X,R`×d) denotes the space of R`×d matrix–valued, smooth functions of compact support in X and similar notations hold for other spaces. Note that the norm used for matrices will always be the Frobenious norm, unless stated otherwise. 2.2 Radon measures Since standard textbooks are mainly concerned with positive measures, we give here a condensed account of finite Radon measures. For this and the following section on func- tions of bounded variation, we are following [AFP00] but we also refer the reader to [EG92, Giu84, Rud87]. Definition 2.2.1 (Finite Radon measures). Let X be a locally compact, separable metric space and let B(X), denote the Borel σ-algebra of X. We say that µ : B(X) → R` is an R`–valued finite Radon measure if µ(∅) = 0 and for any sequence (An)n∈N of pairwise 40 2.2. Radon measures disjoint elements of B(X) we have µ ( ∞⋃ n=0 An ) = ∞∑ n=0 µ(An). The space of R`–valued finite Radon measures is denoted by M(X,R`), while in the case ` = 1 (real valued measures), we simply write M(X). Note that every µ ∈ M(X,R`) can be written as µ = (µ1, . . . , µ`) where µi ∈ M(X) for every i = 1, . . . , `. Definition 2.2.2 (Total variation measure). For a measure µ ∈ M(X,R`) we define the total variation measure of |µ| : B(X)→ R+, where R+ is the set of positive real numbers, as follows |µ|(A) = sup { ∞∑ n=1 |µ(An)| : An ∈ B(X) pairwise disjoint, A = ∞⋃ n=1 An } , A ∈ B(X). It can be shown that if µ ∈M(X,R`) then |µ| is a finite positive measure. 
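A small example may help fix ideas about the total variation measure (the example is ours, chosen purely for illustration):

```latex
% Example: the total variation measure of a signed measure with two atoms.
% For \mu = \delta_0 - \delta_1 \in \mathcal{M}(\mathbb{R}), any partition separating the
% two atoms attains the supremum in Definition 2.2.2, so that
\[
  |\mu|(A) = \delta_0(A) + \delta_1(A) \quad \text{for all } A \in \mathcal{B}(\mathbb{R}),
  \qquad |\mu|(\mathbb{R}) = 2,
\]
% while \mu(\mathbb{R}) = 0: the total variation measure records mass without cancellation.
```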
It is also useful to introduce the notation M+(X) for the set of finite positive Radon measures on X and M+loc(X) for the set of positive Radon measures on X, i.e., all the positive measures in (X,B(X)) that are finite on every compact subset of X. The spaceM(X,R`) is a Banach space under the norm ‖µ‖M(X,R`) = |µ|(X). In fact, as the next theorem shows, the space M(X,R`) can be regarded as the dual space of (C0(X,R`), ‖ · ‖∞). Theorem 2.2.3 (Riesz representation theorem). Let X be a locally compact, separable metric space and let T be a bounded linear functional on (C0(X,R`), ‖ · ‖∞). Then there exists a unique µ ∈M(X,R`) such that T (u) = ∑` i=1 ˆ X ui dµi, ∀u ∈ C0(X,R`), i.e., T can be represented by the measure µ. Moreover, we also have ‖T‖ = |µ|(X), where ‖ · ‖ denotes the operator norm in the dual space of C0(X,R`). From the definition of the operator norm in a dual space and the fact that C∞c (X,R`) is dense in C0(X,R`) under the supremum norm, we have that for every µ ∈M(X,R`) ‖µ‖M(Ω,R`) = sup { 〈µ, v〉 : v ∈ C∞c (X,R`), ‖v‖∞ ≤ 1 } , (2.1) 41 Mathematical preliminaries where here 〈µ, v〉 denotes the duality pairing 〈µ, v〉 = ∑`i=1 ´X ui dµi. Thus (2.1) gives an alternative definition for ‖µ‖M(X,R`) and provides a nice framework to extend its use in distributions: Given an R`–valued distribution T in an open domain Ω ⊆ Rd we define the Radon norm of T to be ‖T‖M(Ω,R`) = sup { 〈T, v〉 : v ∈ C∞c (Ω,R`), ‖v‖∞ ≤ 1 } . (2.2) In fact, as a consequence of Theorem 2.2.3 we have that‖T‖M(Ω,R`) < ∞ if and only if T is a finite Radon measure µ and in that case ‖T‖M(Ω,R`) = |µ|(Ω). When there is no ambiguity for the domain and the range of T we will simply write ‖T‖M. We will often consider the Lebesgue decomposition of a finite Radon measure with respect to a positive measure. We first recall the notion of absolute continuity and mutual singularity for measures. Definition 2.2.4. Let (X,B(X)) be a measure space. (i) Let µ ∈ M+loc(X) and ν ∈ M(X,R`) . We say that ν is absolutely continuous with respect to µ and we write ν  µ, if whenever µ(A) = 0 then |ν|(A) = 0. (ii) Let µ, ν ∈M(X,R`). We say that µ and ν are mutually singular and we write µ⊥ν if there exists an A ∈ B(X) such that |µ|(E) = 0 and |µ|(X \ E) = 0. We note here that if µ ∈ M+loc(X) and f ∈ L1(X,R`;µ) then fµ denotes the measure in M(X,R`) defined by fµ(A) = ˆ A f dµ, A ∈ B(X). Theorem 2.2.5 (Lebesgue decomposition). Let µ be a σ-finite measure in M+loc(X) and ν ∈M(X,R`). Then there exists a unique pair να, νs ∈M(X,R`) such that ν = να + νs, να  µ, νs⊥µ. Moreover there exists a unique f ∈ L1(X,R`;µ) such that να = fµ. The function f is called the Radon–Nikody´m density of να with respect to µ and is denoted by ν α µ . From now on, unless otherwise stated, all the Lebesgue decompositions will be consid- ered with respect to the Lebesgue measure Ld. Thus, notation-wise µα and µs denote the absolutely continuous and singular part of µ with respect to the Lebesgue measure. It is very easy to check that a finite Radon measure µ is always absolutely continuous with respect to its total variation measure |µ| and hence it has a density with respect to it. This leads to the following theorem. Theorem 2.2.6 (Polar decomposition). Let µ ∈ M(X,R`). Then there exists a unique function sgn(µ) ∈ L1(X,R`; |µ|), equal to 1 µ-almost everywhere such that µ = sgn(µ)|µ|, i.e, sgn(µ) = µ|µ| . 42 2.3. Functions of bounded variation We finish this section mentioning two important notions of convergence in M(X,R`), namely the weak∗ and the strict convergence. 
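Before turning to these notions of convergence, let us record a concrete instance of the Lebesgue and polar decompositions (again an illustration of ours):

```latex
% Example: Lebesgue and polar decompositions on X = \mathbb{R} with respect to \mu = \mathcal{L}^1.
% Let g \in L^1(\mathbb{R}) and \nu = g\,\mathcal{L}^1 + 3\,\delta_0. Since \delta_0 \perp \mathcal{L}^1,
\[
  \nu^{\alpha} = g\,\mathcal{L}^1, \qquad \nu^{s} = 3\,\delta_0, \qquad
  \frac{\nu^{\alpha}}{\mu} = g .
\]
% Moreover |\nu| = |g|\,\mathcal{L}^1 + 3\,\delta_0, and the polar decomposition \nu = \mathrm{sgn}(\nu)\,|\nu|
% holds with \mathrm{sgn}(\nu) = \mathrm{sgn}(g) \ |\nu|\text{-a.e. on } \mathbb{R}\setminus\{0\}
% and \mathrm{sgn}(\nu)(0) = 1.
```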
Definition 2.2.7 (Weak∗ convergence of measures). Let µ ∈ M(X,R`), (µn)n∈N ⊆ M(X,R`). We say that (µn)n∈N converges weakly∗ to µ if for every u ∈ C0(X) we have lim n→∞ ˆ X u dµn = ˆ X u dµ, where the above integrals are computed component-wise, i.e, ˆ X u dµ = (ˆ X u dµ1, . . . , ˆ X u dµ` ) . Observe that the above notion of weak∗ convergence corresponds exactly to the usual notion of weak∗ convergence in duals of Banach spaces since from the Riesz representation theorem M(X,R`) is the dual space of C0(X,R`). Definition 2.2.8 (Strict convergence of measures). Let µ ∈M(X,R`), (µn)n∈N ⊆M(X,R`). We say that (µn)n∈N converges strictly to µ if µn → µ, weakly∗ in measures and lim n→∞ ‖µn‖M(X,R`) = ‖µ‖M(X,R`). Recall that from the general Banach space theory, if µn → µ weakly∗ then ‖µ‖M(X,R`) ≤ lim infn→∞ ‖µn‖M(X,R`), i.e., “mass” can still “escape” at infinity. Thus, strict convergence is a stronger notion of convergence than weak∗. 2.3 Functions of bounded variation 2.3.1 Definitions and properties The space of functions of bounded variation is a central notion in mathematical imaging as functions that belong to that space are allowed to have jump discontinuities which are interpreted as sharp edges in images. Here, we provide a short summary of the theory of this space, giving emphasis to theorems that we are going to use. As usual Ω denotes an open domain in Rd. Definition 2.3.1 (Space of functions of bounded variation). A function u ∈ L1(Ω,R`) is said to be a function of bounded variation or else u ∈ BV(Ω,R`) if its distributional derivative Du can be represented by a R`×d–valued finite Radon measure (also denoted by 43 Mathematical preliminaries Du). That is to say, Du ∈M(Ω,R`×d) and if u = (u1, . . . , u`) then ˆ Ω ua ∂v ∂xi dx = − ˆ Ω v dDiu a, ∀v ∈ C∞c (Ω), i = 1, . . . , d, a = 1, . . . , `. Consistently with our notation, BV(Ω) denotes the space BV(Ω,R). According to the Lebesgue decomposition theorem Du can be decomposed to the abso- lutely continuous and singular part with respect to the Lebesgue measure Du = Dαu+Dsu, where Dαu and Dsu are short versions of (Du)α and (Du)s respectively. We will also use the traditional notation ∇u for the absolutely continuous part of Du, that is ∇u = Dαu. When d = ` = 1, ∇u will be also denoted by u′. It is immediate from the Definition 2.3.1 that W 1,1(Ω,R`) ⊆ BV(Ω,R`) since if u ∈W 1,1(Ω,R`) then Du = ∇uLd. For a function u ∈ L1(Ω,R`) the total variation of u, TV(u), is defined as TV(u) = sup {∑` a=1 ˆ Ω uadivvadx : v ∈ C1c (Ω,R`×d), ‖v‖∞ ≤ 1 } . It can be proven that TV(u) <∞ if and only if u ∈ BV(Ω,R`) and in that case TV(u) = |Du|(Ω). If u ∈ W 1,1(Ω,R`) then |Du|(Ω) is simply the L1 norm of ∇u, ´Ω |∇u| dx. It can be easily checked that the total variation is lower semicontinuous with respect to the strong L1 convergence. The space BV(Ω,R`) endowed with the norm ‖u‖BV(Ω,R`) = ‖u‖L1(Ω,R`) + |Du|(Ω), is a Banach space. However the topology induced by that norm is very strong, in fact smooth functions are not dense with respect to that topology. To that scope, two weaker notions of convergence are introduced, namely the weak∗ and the strict convergence. Definition 2.3.2 (Weak∗ convergence in BV). Let u ∈ BV(Ω,R`), (un)n∈N ⊆ BV(Ω,R`). We say that the sequence (un)n∈N converges to u weakly∗ in BV(Ω,R`) if (un)n∈N converges to u in L1(Ω,R`) and (Dun)n∈N converges to Du weakly∗ in M(Ω,R`×d), i.e., lim n→∞ ‖un − u‖L1(Ω,R`) = 0 and limn→∞ ˆ Ω v dDun = ˆ Ω v dDu, ∀v ∈ C0(Ω). Definition 2.3.3 (Strict convergence in BV). 
Let u ∈ BV(Ω,R`), (un)n∈N ⊆ BV(Ω,R`). We say that the sequence (un)n∈N converges to u strictly in BV(Ω,R`) if (un)n∈N converges 44 2.3. Functions of bounded variation to u in L1(Ω,R`) and the total variations (|Dun|(Ω))n∈N converge to |Du|(Ω), i.e., lim n→∞ ‖un − u‖L1(Ω,R`) = 0 and limn→∞ |Dun|(Ω) = |Du|(Ω). It can be easily verified that strict convergence is induced by the following metric d(u, v) = ‖u− v‖L1(Ω,R`) + ||Du|(Ω)| − |Dv|(Ω)| . Moreover it can be also shown that strict convergence implies weak∗ convergence while the opposite implication is not true in general. The usefulness of the introduction of these weaker notions of convergence can be seen in the following theorem. Theorem 2.3.4 (Strict approximation by smooth functions in BV). Let u ∈ L1(Ω,R`). Then u ∈ BV(Ω,R`) if and only if there exists a sequence (un)n∈N ∈ C∞(Ω,R`) ∩ W 1,1(Ω,R`) such that lim n→∞ ‖un − u‖L1(Ω,R`) = 0 and limn→∞ ˆ Ω |∇un| dx <∞ In fact, if u ∈ BV(Ω,R`) then the sequence (un)n∈N can be chosen such that lim n→∞ ˆ Ω |∇un| dx = |Du|(Ω), i.e., u can be approximated strictly in BV(Ω,R`) by a sequence of smooth functions. The following compactness theorem is a useful tool for proving the well-posedness of certain variational problems through the direct method of calculus of variations. Recall that the lack of reflexivity of the Sobolev space W 1,1(Ω) is translated to lack of weak compactness. Thus, as we also see in the next sections, minimisation problems originally posed in W 1,1(Ω) and are often embedded in a suitable way in BV(Ω) whose compactness properties guarantee a solution. Theorem 2.3.5 (Compactness in BV). Let Ω be an open bounded domain in Rd with Lipschitz boundary and let (un)n∈N be a sequence in BV(Ω,R`) such that sup n∈N ‖un‖BV(Ω,R`) <∞. Then there exists a subsequence (unk)k∈N that converges to some u ∈ BV(Ω,R`), weakly∗ in BV(Ω,R`). We now state a useful embedding theorem and the version of Poincare´ inequality in BV. We define 1∗ = d/(d − 1) when d > 1 and 1∗ = ∞ when d = 1. Note also that uΩ 45 Mathematical preliminaries denotes the mean value of u in Ω, i.e., uΩ := 1 Ld(Ω) ˆ Ω u dx. (2.3) Theorem 2.3.6. ( BV embedding theorem / Poincare´ inequality) Let Ω be an open bounded domain in Rd with Lipschitz boundary. Then BV(Ω) ⊆ L1∗(Ω) with continuous embedding. Moreover if Ω is connected, the following Poincare´ type inequality holds ‖u− uΩ‖Lp(Ω) ≤ C|Du|(Ω), ∀u ∈ BV(Ω), 1 ≤ p ≤ 1∗, for some constant C depending only on Ω. We finish this section mentioning some facts about traces in BV. Theorem 2.3.7 (Boundary trace theorem in BV). Let Ω be an open bounded domain in Rd with Lipschitz boundary and let u ∈ BV(Ω,R`). Then for Hd−1–almost every x ∈ ∂Ω, there exists uΩ(x) ∈ R` such that lim δ→0 1 δd ˆ Ω∩B(x,δ) |u(y)− uΩ(x)| dy = 0. Moreover, ‖uΩ‖L1(∂Ω,R`;Hd−1) ≤ C‖u‖BV(Ω,R`), (2.4) for some constant C that depends only on Ω. The function uΩ ∈ L1(∂Ω,R`;Hd−1) is called the trace of u on ∂Ω. Moreover the extension u of u to 0 out of Ω belongs to BV(Rd,R`) and viewing Du as a measure on the whole of Rd and concentrated on Ω, Du is given by Du = Du+ (uΩ ⊗ νΩ)Hd−1b∂Ω, where νΩ is the outward unit normal vector on ∂Ω and Hd−1b∂Ω is the (d−1)–dimensional Hausdorff measure restricted on ∂Ω. As a consequence of the boundary trace theorem, we also get the following results: Theorem 2.3.8 (Integration by parts). Let Ω be an open bounded domain in Rd with Lipschitz boundary and let u ∈ BV(Ω,R`). Then for every i = 1, . . . , d, a = 1, . . . 
, ` and every function v ∈ C1(Ω) we have ˆ Ω ua ∂v ∂xi dx = − ˆ Ω vdDiu a + ˆ ∂Ω (uΩ)a(νΩ)iv dHd−1 (2.5) Note that by summing (2.5) over i and by using an appropriate affine function v we 46 2.3. Functions of bounded variation get the following estimate∣∣∣∣ˆ Ω u dx ∣∣∣∣ ≤ K (|Du|(Ω) + ‖uΩ‖L1(∂Ω,R`;Hd−1)) , ∀u ∈ BV(Ω), (2.6) where the constant K > 0, depends only on d and Ω. If Ω is connected, we can combine further the above estimate with the Poincare´ inequality (2.3) and get ‖u‖L2(Ω) ≤ C ( |Du|(Ω) + ‖uΩ‖L1(∂Ω,R`;Hd−1) ) , ∀u ∈ BV(Ω), (2.7) where again, the constant C > 0, depends only on d and Ω. Proposition 2.3.9. Let Ω be an open bounded domain in Rd with Lipschitz boundary, u ∈ BV(Ω,R`) and v ∈ BV(Rd \ Ω,R`). Then the function w(x) = u(x) if x ∈ Ω,v(x) if x ∈ Rd \ Ω, belongs to BV(Rd,R`) and viewing Du (respectively Dv) as a measure on the whole of Rd and concentrated on Ω (respectively Rd \ Ω), we have Dw = Du+Dv + ( uΩ − vRd\Ω ) ⊗ νΩHd−1b∂Ω. 2.3.2 Good representatives For this section, Ω will be a bounded open interval (a, b) ⊆ R. Since every function u ∈ L1(a, b) is defined as an equivalence class of functions that are equal almost everywhere, it is not possible to make sense of pointwise values of u. In BV(a, b) this problem can be overcome through the introduction of good representatives. This is a notion that we are going to use often in Chapter 4. Definition 2.3.10 (Pointwise variation). For any function u : (a, b) → R, the pointwise variation pV(u, (a, b)) of u in (a, b) is defined as pV(u, (a, b)) = sup { n−1∑ i=1 |u(xi+1)− u(xi)| : n ≥ 2, a < x1 < · · · < xn < b } . As one can observe the pointwise variation is extremely sensitive on the choice of the representative of u. This motivates the following definition where the pointwise variation in minimised among elements in the equivalence class of u. Definition 2.3.11 (Essential variation). For any function u : (a, b) → R, the essential 47 Mathematical preliminaries variation eV(u, (a, b)) of u in (a, b) is defined as eV(u, (a, b)) = inf {pV(v, (a, b)) : v = u, L–a.e. in (a, b)} . It can be proved that for every function u ∈ L1(a, b) the infimum in the definition of essential variation is achieved and it is equal to the total variation of u. Thus, if u ∈ BV(a, b) then eV(u, (a, b)) = |Du|(a, b). Any function u˜ in the equivalence class of u with the property pV(u˜, (a, b)) = |Du|(a, b) is called good or precise representative of u. The next theorem shows that these representatives can be characterised in a more explicit way and they have some good continuity and differentiability properties. We define the jump set of u, Ju, to be the set of atoms of Du, i.e., Ju = {x ∈ (a, b) : Du({x}) 6= 0}. Theorem 2.3.12 (Good representatives). Let u ∈ BV(a, b). Then the following state- ments hold: (i) There exists a unique c ∈ R such that ul(x) := c+Du((a, x)), ur(x) := c+Du((a, x]), ∀x ∈ (a, b), are good representatives of u, the left continuous and the right continuous one. Any other function u˜ : (a, b)→ R is a good representative of u if and only if u˜(x) ∈ { λul(x) + (1− λ)ur(x) : λ ∈ [0, 1] } , ∀x ∈ (a, b). (ii) Any good representative u˜ is continuous in (a, b) \ Ju and has a jump discontinuity at any point of Ju: u˜(x−) = ul(x) = ur(x−), u˜(x+) = ul(x+) = ur(x), ∀x ∈ Ju, where x− and x+ denote left and right limits at x. (iii) Any good representative u˜ is differentiable at L–a.e. point of (a, b) and the derivative u˜′ is the density of Du with respect to L. 
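The one dimensional objects above are easy to probe numerically. In the following sketch (with sampling conventions of our own) the discrete sum of |u(x_{i+1}) − u(x_i)| plays the role of the pointwise variation of the sampled representative: for a sampled smooth function it approximates the integral of |u'|, for a step function it returns the jump height on any grid, and perturbing a single sample illustrates how sensitive the pointwise variation is to the choice of representative.

```python
import numpy as np

def discrete_variation(u):
    # Sum_i |u_{i+1} - u_i|: the pointwise variation of the sampled values,
    # and a consistent discretisation of |Du|((a, b)) for good representatives.
    return np.sum(np.abs(np.diff(u)))

x = np.linspace(0.0, 1.0, 1001)

# Smooth case: the discrete variation approximates the integral of |u'|.
u_smooth = np.sin(2.0 * np.pi * x)
print(discrete_variation(u_smooth))      # ~ int_0^1 |2*pi*cos(2*pi*t)| dt = 4

# Jump case: u = X_{(1/2, 1)} has Du = delta_{1/2}, so |Du|((0, 1)) = 1,
# and the discrete variation equals the jump height on any grid.
u_jump = (x > 0.5).astype(float)
print(discrete_variation(u_jump))        # = 1

# Changing a single sample (a "bad" representative in the discrete picture)
# increases the pointwise variation, cf. Definition 2.3.10.
u_bad = u_jump.copy(); u_bad[250] = 5.0
print(discrete_variation(u_bad))         # = 1 + 2*|5 - 0| = 11
```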
It is clear from the above that if u ∈ BV(a, b) then the functions uup and ulow are also good representatives of u where uup(x) := max{ul(x), ur(x)} and ulow(x) := min{ul(x), ur(x)}, ∀x ∈ (a, b). see Figure 2.1 for an illustration. 48 2.4. Lower semicontinuous envelopes (a) ur (b) ul (c) uup (d) ulow Figure 2.1: Good representatives ur, ul, uup and ulow of a BV function u. 2.4 Lower semicontinuous envelopes This section concerns lower semicontinuous envelopes of functionals or otherwise called relaxed functionals. We essentially follow [DM93] but [Bra02] is a good reference as well. We denote by R the set of extended real numbers, i.e., R = R ∪ {+∞}. Let X be a set endowed with some topology T . For every x ∈ X we denote by N (x) the set of all the open neighbourhoods of x in X. Definition 2.4.1 (Lower semicontinuity). A function F : X → R is called T –lower semicontinuous at a point x ∈ X if for every t ∈ R with t < F (x), there exists a U ∈ N (x) such that t < F (y) for every y ∈ U . For first countable topological spaces (hence for metric spaces as well) lower semicontinuity is translated to a more familiar expression: Proposition 2.4.2. Suppose that (X, T ) is a first countable topological space. Then the following are equivalent: (i) F is T –lower semicontinuous at x. (ii) For every sequence (xn)n∈N converging to x, we have F (x) ≤ lim inf n→∞ F (xn). Definition 2.4.3 (Lower semicontinuous envelope). Let F : X → R. The lower semicon- tinuous envelope or otherwise called the relaxed functional of F with respect to the topology 49 Mathematical preliminaries T , scT F : X → R is defined as scT F (x) = sup { G(x) : G : X → R, T –lower semicontinuous, G(y) ≤ F (y), ∀y ∈ X} , for every x ∈ R. It easy to check that scT F is the greatest T –lower semicontinuous functional which is smaller or equal than F . It can be also checked that scT F (x) = sup U∈N (x) inf y∈U F (y). Similarly to lower semicontinuity, if X is first countable then lower semicontinuous en- velopes can be easily characterised. Proposition 2.4.4. Suppose that (X, T ) is a first countable topological space. Then scT F is characterised by the following two properties: (i) For every sequence (xn)n∈N converging to x, we have scT F (x) ≤ lim inf n→∞ F (xn). (ii) There exists a sequence (xn)n∈N converging to x, such that scT F (x) ≥ lim sup n→∞ F (xn). The following proposition shows the usefulness of lower semicontinuous envelopes re- garding the existence of minimum points. Proposition 2.4.5. Let F : X → R and suppose that infx∈X F (x) ∈ R. Then if scT F has minimum point we have that min x∈X scT F (x) = inf x∈X F (x). 2.5 Convex analysis In this section we recall some useful tools from convex analysis. In the first part we introduce the basic concepts while in the second part we provide a short summary of the Fenchel–Rockafellar duality theory. For further study we refer the reader to [ET76] which we also follow here. 2.5.1 Basic concepts We start with a few definitions and remarks on notation. If (X, ‖ · ‖X) is a Banach space then X∗ denotes its dual space. The elements of X∗ are denoted with x∗, y∗, . . . etc.. The 50 2.5. Convex analysis indicator function of a set A ⊆ X is defined as IA(x) = 0 if x ∈ A,∞ if x /∈ A. For a function F : X → R we define the domain of F , dom(F ), to be the set of points where F takes finite values, i.e., dom(F ) = {x ∈ X : F (x) <∞}. A function F is called proper if dom(F ) 6= ∅. We now define the convex conjugate F ∗ of a function F : Definition 2.5.1 (Convex conjugate). Let F : X → R. 
Then the convex conjugate of F , F ∗ : X∗ → R is defined as F ∗(x∗) = sup x∈X 〈x∗, x〉 − F (x), ∀x∗ ∈ X∗. The following, easy to verify result, will be useful to us. Proposition 2.5.2. Let (X, ‖ · ‖X) be a Banach space 1 < p <∞ and F : X → R with F (x) = 1 p ‖x‖pX , ∀x ∈ X. Then the convex conjugate of F is F ∗(x∗) = 1 q ‖x∗‖qX∗ , x∗ ∈ X∗. where 1/p+ 1/q = 1. We finally recall the definition of the subdifferential of a function. Definition 2.5.3. Let (X, ‖·‖X) be a Banach space and F : X → R. We say that x∗ ∈ X∗ belongs to the subdifferential of F at x ∈ X and we write x∗ ∈ ∂F (x) if F (x) <∞ and 〈x∗, y − x〉+ F (x) ≤ F (y), ∀y ∈ X. If ∂F (x) 6= ∅ we say that F is subdifferentiable in x. The subdifferential of F at a given point can be also characterised as follows: Proposition 2.5.4. Let F : X → R. Then x∗ ∈ ∂F (x) if and only if F (x) + F ∗(x∗) = 〈x∗, x〉. 51 Mathematical preliminaries 2.5.2 Fenchel–Rockafellar duality Fenchel–Rockafellar duality theory provides a convenient framework for characterising solutions of variational problems. In this section, we present a basic review for variational problems that have a particular form, i.e., inf x∈X F1(x) + F2(Λ(x)), (2.8) where X, Y are Banach spaces, Λ : X → Y is a bounded linear operator and F1 : X → R, F2 : Y → R are proper, convex, lower semicontinuous functions. The minimisation problem (2.8) is called the primal problem and it is denoted by Pprimal. The dual problem of (2.8) is denoted by Pdual and is defined as sup y∗∈Y ∗ −F ∗1 (Λ∗(y∗))− F ∗2 (−y∗), (2.9) where Λ∗ : Y ∗ → X∗ is the adjoint operator of Λ. We denote with inf Pprimal and supPdual the infimum of (2.8) and the supremum of (2.9) respectively. It can be shown that it always holds supPdual ≤ inf Pprimal. If supPdual = inf Pprimal we say that no duality gap occurs. The following theorem shows how the solutions of Pprimal and Pdual are related when no duality gap occurs. Theorem 2.5.5 (Optimality conditions). Suppose that both problems Pprimal and Pdual have solutions and that supPdual = inf Pprimal, with that number being finite. Then all the solutions x of Pprimal and y∗ of Pdual are related through the following optimality conditions: Λ∗(y∗) ∈ ∂F1(x), (2.10) −y∗ ∈ ∂F2(Λ(x)). (2.11) Conversely, if x ∈ X, y∗ ∈ Y ∗ satisfy (2.10)–(2.11) then x is a solution of Pprimal, y∗ is a solution of Pdual and supPdual = inf Pprimal with that number being finite. We note here that using Proposition 2.5.4 and having in mind that F ∗∗ = F for convex lower semicontinuous functions, we can easily check that the optimality conditions (2.10)- (2.11) are equivalent to x ∈ ∂F ∗1 (Λ∗(y∗)), (2.12) Λ(x) ∈ ∂F ∗2 (−y∗). (2.13) There have been quite a few results regarding necessary conditions for the absence of 52 2.5. Convex analysis duality gap. We mention here a condition due to Attouch and Brezis [AB86], which is useful for our purposes. Theorem 2.5.6 (Attouch–Brezis). Suppose that X,Y are Banach spaces, Λ : X → Y is a bounded linear operator and F1 : X → R, F2 : Y → R are proper, convex, lower semicontinuous functionals. If the set⋃ λ≥0 λ(dom(F2)− Λ(dom(F1))), is a closed subspace of Y , then inf x∈X F1(x) + F2(Λ(x)) = max y∗∈Y ∗ −F ∗1 (Λ∗(y∗))− F ∗2 (−y∗), i.e., the dual problem Pdual has a solution and no duality gap occurs. 
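To see the duality machinery of this section at work, it is instructive to spell out a standard finite dimensional example (included here only for orientation):

```latex
% A standard finite dimensional instance of the primal-dual pair (2.8)-(2.9).
% Take X = \mathbb{R}^n, Y = \mathbb{R}^m, \Lambda \in \mathbb{R}^{m \times n},
% F_1(x) = \tfrac12\|x - f\|_2^2 and F_2(y) = \alpha\|y\|_1. Direct computation gives
\[
  F_1^*(x^*) = \tfrac12\|x^*\|_2^2 + \langle x^*, f\rangle, \qquad
  F_2^*(y^*) = I_{\{\|\cdot\|_\infty \le \alpha\}}(y^*),
\]
% so the dual problem (2.9) becomes the constrained quadratic problem
\[
  \sup_{\|y^*\|_\infty \le \alpha} \; -\tfrac12\|\Lambda^* y^*\|_2^2 - \langle \Lambda^* y^*, f\rangle .
\]
% For \Lambda a discrete gradient operator this is, up to an additive constant, the well known
% dual of the discrete ROF problem, and the optimality conditions (2.10)-(2.11) characterise
% the primal solution through the dual variable y^*.
```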
53 Mathematical preliminaries 54 Chapter 3 The combined TV–TV2 approach for image reconstruction 3.1 Introduction As we have already discussed in the introduction, one way to reduce the staircasing effect in total variation-based image reconstruction methods, is the incorporation of higher order derivatives in the regularisation process. Infimal convolution type of methods have been proven to be very successful towards this task with total generalised variation being the current state of the art. However, one of the disadvantages of higher order regularisation methods is the increased computational cost involved in their implementation. TGV regularisation is no exception even though sophisticated algorithms can be used for its solution. Currently, TGV minimisation problems are commonly solved with the primal dual hybrid method of Chambolle–Pock [CP11], see [Bre14]. We will show that the method we propose here can be solved even faster with the split Bregman algorithm [GO09] while still reducing the staircasing effect in image denoising and deblurring. Moreover, unlike TGV, our method can be also used for image inpainting, having the advantage of being able to interpolate images along large gaps in the inpainting domain, which is an improvement to first order total variation inpainting. In summary the model we are studying here is the following: min u∈BV2(Ω) 1 s ˆ Ω |Tu− f |sdx+ α ˆ Ω ϕ1(∇u) dx+ β ϕ2(D2u)(Ω), s = 1, 2, (3.1) where α, β are positive parameters, ϕ1, ϕ2 are two convex functions ϕ1 : R2 → R+, ϕ2 : R4 → R+ with at most linear growth at infinity and T is suitable bounded linear operator. The minimisation (3.1) is done over BV2(Ω), the space of functions of bounded Hessian, see Section 3.3 for details. For the time being, we note that for a function u ∈ BV2(Ω), the second order distributional derivative D2u, is a finite Radon measure. 55 The combined TV–TV2 approach for image reconstruction This means that problem (3.1) is posed in the framework of convex functions of measures. The motivation for that is to make sense of expressions like ´ Ω √|∇2u|2 + , where  1 and ∇2u is not necessarily an L1 function but it can also be a measure. This concept has already been considered in the first order case, e.g., in the study of the following “smoothed” total variation denoising problem [Ves01] min u∈BV(Ω) 1 2 ˆ Ω (u− f)2dx+ ˆ Ω √ |∇u|2 +  dx, (3.2) where the term ´ Ω √|∇u|2 +  dx has to be defined appropriately in order to make sense for measures. Such “rounding offs” of non-smooth regularisers like the total variation become necessary for the numerical implementation by means of time-stepping [Ves01], multigrid methods [FSHW04, Vog95, VW97] or semi-smooth Newton methods [HS06b] for instance. Well-posedness of (3.1) is proved by identifying the minimising functional as a lower semicontinuous envelope of another naturally defined functional. The underlying topology under which the lower semicontinuous envelope is considered is a suitably defined topology in BV2(Ω) which is analogous to the strict topology in BV(Ω). Uniqueness and stability results are also shown. We proceed to the numerical implementation of our method using the split Bregman algorithm. To do so, we leave the general framework of convex functions of measures by choosing ϕ1 = | · | in R2 and ϕ2 = | · | in R4. As a result, the minimisation we treat numerically is min u∈BV2(Ω) 1 s ˆ Ω |Tu− f |sdx+ αTV(u) + β TV2(u), s = 1, 2, (3.3) where TV2(u) denotes the second order total variation of u, |D2u|(Ω). 
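Although the split Bregman treatment of (3.3) is developed in Section 3.5, the elementary operation at the heart of such splittings can already be recorded: once the derivative terms are decoupled through auxiliary variables, each L1-type subproblem is solved in closed form by a vectorial soft shrinkage, the proximal map of the Euclidean norm [GO09]. The sketch below illustrates this generic building block only; it is not the algorithm of Section 3.5.

```python
import numpy as np

def vector_shrinkage(v, lam):
    """Proximal map of lam * |.|_2 applied pixel-wise:
       argmin_d  lam * |d|_2 + 0.5 * |d - v|_2^2  =  max(|v|_2 - lam, 0) * v / |v|_2.
    Here v has shape (..., k), k being the number of components per pixel
    (k = 2 for gradient fields, k = 4 for Hessian fields in the TV-TV2 setting)."""
    norm = np.sqrt(np.sum(v ** 2, axis=-1, keepdims=True))
    scale = np.maximum(norm - lam, 0.0) / np.maximum(norm, 1e-12)  # avoid division by zero
    return scale * v

# Usage: shrink a field of 2-vectors (e.g. a gradient field) with threshold 0.3.
rng = np.random.default_rng(1)
g = rng.standard_normal((4, 4, 2))
print(vector_shrinkage(g, 0.3))
```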
Note that the case β = 0, is the classical ROF model while the case α = 0, i.e., pure second order regularisation, has been studied in [BP10]. We show that by choosing α 6= 0, β 6= 0 one can achieve better results (as these are assessed by image quality measures) by both these special cases. The idea in our approach is to weight TV with a fairly large α and in the same time keeping β small enough such that no serious blur is introduced in the image while staircasing is reduced. Even though the introduction of TV2 does not allow the preservation of edges like TGV, by keeping the value of β small, the quality of the results is still comparable with the TGV ones. Moreover, since our method do not decompose the image a´ la infimal convolution, it can solved very efficiently with the split Bregman algorithm. Thus, we end up with a very fast method which is ideal as a pre-processing step and provides a solid basis for some simple and fast further improvements, e.g., contrast adjustment and sharpening. Let us stress finally one more time that the image model we are based on is f = Tu+η, where T is a bounded, linear operator and η is random noise. This model is well suited for a 56 3.1. Introduction variety of image reconstruction tasks e.g. denoising, deconvolution, inpainting, zooming, to name a few. Even though this model and its corresponding variational problem (3.1) is not suitable for some advanced medical imaging tasks like MRI and tomography, it introduces a new regulariser, namely the weighted sum of first and second order total variation which can be exploited in future research focusing on these tasks. In that case extension of our results for complex valued images will be necessary as for instance the images that one typically obtains from an MRI machine are of that kind, as a result of the use of Fourier transform. However, as the present thesis deals with more “traditional” image processing tasks we prefer to focus on the real valued case. We only use the fast Fourier transform in Section 3.5.3 as a tool for the efficient solution of a certain minimisation problem but the data obtained in the end are real valued. 3.1.1 Organisation of the chapter In Section 3.2 we state some preliminary facts concerning convex functions of measures. The definition of a convex function of a measure is given, along with a lower semicontinuity result due to Buttazzo and Freddi. In Section 3.3 we introduce the space of functions of bounded Hessian BV2(Ω). More specifically, the definition of BV2(Ω) together with its basic properties is stated in Section 3.3.1. In Section 3.3.2 we introduce two naturally defined topologies on BV2(Ω), namely the weak∗ and the strict topology. The well-posedness of the TV–TV2 model is shown in Section 3.4. There, the min- imising functional in (3.1) is identified as the lower semicontinuous envelope of a naturally defined functional (defined in the context of convex functions instead of measures) with respect to the aforementioned topologies of BV2(Ω). Existence, uniqueness as well as stability results are shown. Section 3.5 discusses the numerical realisation of the model using the split Bregman iteration. In particular, Section 3.5.1 is concerned with the discretisation of the model. The corresponding finite dimensional norms along with the discrete differential operators are introduced. In Section 3.5.2 we recall the Bregman iteration and its use for solving constrained optimisation problems and we contribute to its convergence theory. 
In Section 3.5.3, we describe how the Bregman iteration can be adapted to our model and we propose a splitting strategy for it (split Bregman) as a numerical method for the solution of our problem. Applications in denoising and deblurring of the TV–TV2 approach are presented in Section 3.6. We present numerical experiments for images that have been corrupted by Gaussian noise, comparing the TV–TV2 method with TV, TV2, TGV and infimal convo- lution denoising and we do the same for image deblurring. The application of the TV–TV2 method in image inpainting is presented in detail in Section 3.7. In Section 3.7.1, we provide a motivation for the use of our method for 57 The combined TV–TV2 approach for image reconstruction inpainting tasks (large gap connectivity) and we present the concept and philosophy of the online journal Image Processing Online in whose platform we provide an online, user friendly, demonstration of TV–TV2 inpainting. The numerical realisation using the split Bregman algorithm is described in Section 3.7.2, while the extension of the algorithm to colour images is done in Section 3.7.3. Stopping criteria and optimal selection of the split Bregman parameters are proposed and numerically justified in Section 3.7.4. Finally, in Section 3.7.5 we provide some inpainting examples for both real life and synthetic images. We stress the visual improvement of the TV–TV2 method over pure TV inpainting and we discuss experimentally the role of the parameters α, β as well as the role of the geometry of the inpainting domain, as far as connectivity along large gaps is concerned. 3.2 Convex functions of measures In this section we introduce the notion of a convex function of a measure, for functions with at most linear growth, following essentially [DT84]. Let ϕ : R` → R be a continuous function, positively one homogeneous, i.e., for every x ∈ R` ϕ(tx) = tϕ(x), ∀t ≥ 0. Given a measure in µ ∈M(Ω,R`), where Ω is an open subset of Rd, we define the measure ϕ(µ) ∈M(Ω) as follows: ϕ(µ) := ϕ ( µ |µ| ) |µ|. (3.4) Notice that ϕ(µ) is well defined as µ/|µ| = 1, |µ|–a.e., and ϕ is bounded on S`−1 := {x ∈ R` : |x| = 1}. Moreover the following proposition, that was mentioned in [DT84] and we give here a proof, holds: Proposition 3.2.1. Suppose that ϕ : R` → R is a continuous function, positively one homogeneous and µ ∈ M(Ω,R`). Then for every measure ν ∈ M+(Ω) such that µ is absolutely continuous with respect to ν, we have ϕ(µ) = ϕ ( µ |ν| ) |ν|. Moreover, if ϕ is a convex function from R` to R, then ϕ :M(Ω,R`)→M(Ω) is a convex function is the space of measures as well. Proof. Since µ  ν, we also have that |µ|  ν. Using the fact that ϕ is positively one homogeneous and the fact that |µ|/ν is a positive function ν–a.e., we get ϕ(µ) = ϕ ( µ |µ| ) |µ| = ϕ ( µ |µ| ) |µ| ν ν = ϕ ( µ |µ| |µ| ν ) ν = ϕ (µ ν ) ν. 58 3.2. Convex functions of measures Assuming now that ϕ is convex and using the first part of the proposition we get for 0 ≤ λ ≤ 1 and µ, ν ∈M(Ω,R`): ϕ(λµ+ (1− λ)ν) = ϕ ( λµ+ (1− λ)ν |λµ+ (1− λ)ν| ) |λµ+ (1− λ)ν| = ϕ ( λµ+ (1− λ)ν |µ|+ |ν| ) (|µ|+ |ν|) = ϕ ( λ µ |µ|+ |ν| + (1− λ) ν |µ|+ |ν| ) (|µ|+ |ν|) ≤ λϕ ( µ |µ|+ |ν| ) (|µ|+ |ν|) + (1− λ)ϕ ( ν |µ|+ |ν| ) (|µ|+ |ν|) = λϕ ( µ |µ| ) |µ|+ (1− λ)ϕ ( ν |ν| ) |ν| = λϕ(µ) + (1− λ)ϕ(ν). Suppose now that ϕ is not necessarily positively one homogeneous but a convex func- tion ϕ : R` → R which has at most linear growth at infinity, i.e., there exists a positive constant K such that ϕ(x) ≤ K(1 + |x|), ∀x ∈ R`. 
In that case the recession function ϕ∞ of ϕ is well defined everywhere, where ϕ∞(x) := lim t→∞ ϕ(tx) t , ∀x ∈ R`. It can be proved in this case, see for instance [AFP00], that ϕ∞ is a convex, positively one homogeneous function. We are now ready to define convex functions of measures for functions of at most linear growth. Definition 3.2.2 (Convex function of a measure). Let ϕ : R` → R be a convex function of at most linear growth at infinity and let µ ∈ M(Ω,R`). Moreover, consider the Lebesgue decomposition of µ with respect to the Lebesgue measure Ld in Ω: µ = ( µ Ld ) Ld + µs. Then we define the measure ϕ(µ) ∈M(Ω) as follows: ϕ(µ) = ϕ ( µ Ld ) Ld + ϕ∞ ( µs |µs| ) |µs|. (3.5) It can be proved in a similar way with Proposition 3.2.1 that ϕ :M(Ω,R`)→M(Ω,R) is 59 The combined TV–TV2 approach for image reconstruction also a convex function. The following theorem, proved in [BF91] and can also be found in [AFP00] establishes the lower semicontinuity of ϕ with respect to the weak∗ convergence in M(Ω,R`). Theorem 3.2.3 (Buttazzo–Freddi, 1991). Let Ω be an open subset of Rd, ν ∈M(Ω,R`), (νn)n∈N ⊆M(Ω,R`) and let µ ∈M+(Ω), (µn)n∈N ⊆M+(Ω). Let ϕ : R` → R be a convex function with at most linear growth at infinity and suppose that νn → ν and µn → µ weakly∗ in measures. If ν = (ν/µ)µ + νs and νn = (νn/µn)µn + νsn are the Lebesgue decompositions of ν and νn with respect to µ and µn respectively, then ˆ Ω ϕ ( ν µ ) dµ+ ˆ Ω ϕ∞ ( νs |νs| ) d|νs| ≤ lim inf n→∞ ˆ Ω ϕ ( νn µn ) dµn + ˆ Ω ϕ∞ ( νsn |νsn| ) d|νsn|. In particular, if µ = µn = Ld for all n ∈ N then the above inequality can be written as ϕ(ν)(Ω) ≤ lim inf n→∞ ϕ(νn)(Ω). 3.3 The space of functions of bounded Hessian BV2(Ω) 3.3.1 Definition and basic properties In this section we recall the definition and the basic properties of the space of functions of bounded Hessian. It was firstly introduced and studied by Demengel in [Dem85]. Definition 3.3.1 (Space of functions of bounded Hessian). Let Ω ⊆ Rd be open. A function u ∈ W 1,1(Ω) is said to be a function of bounded Hessian or else u ∈ BV2(Ω) if its second order distributional derivative D2u can be represented by a Rd×d–valued finite Radon measure which is still denoted by D2u. In other words BV2(Ω) = { u ∈W 1,1(Ω) : ∇u ∈ BV(Ω,Rd) } . It can proved, in a similar way with the BV case that a function u ∈ L1(Ω) belongs to BV2(Ω) if and only if its second order total variation TV2(u) is finite, where TV2(u) = sup {ˆ Ω udiv2v dx : v ∈ C2c (Ω,Rd×d), ‖v‖∞ ≤ 1 } , and in that case we have TV2(u) = |D2u|(Ω). The space BV2(Ω) is equipped with the norm ‖u‖BV2(Ω) := ‖u‖BV(Ω) + |D2u|(Ω), which makes it a Banach space. If Ω is connected and has a Lipschitz boundary then it can be shown [Dem85] that there exist positive constants C1, C2 that depend only on Ω 60 3.3. The space of functions of bounded Hessian BV2(Ω) such that ‖∇u‖L1(Ω,Rd) ≤ C1|D2u|(Ω) + C2‖u‖L1(Ω), ∀u ∈ BV2(Ω). (3.6) It is obvious that in that case the norm ‖u‖ = ‖u‖L1(Ω) + |D2u|(Ω) is equivalent with ‖ · ‖BV2(Ω). It is immediate that W 2,1(Ω) ⊆ BV2(Ω) since if u ∈ W 2,1(Ω) then D2u = ∇2uLd, i.e., the second order distributional derivative is a function in L1(Ω,Rd×d). Generally, for a function u ∈ BV2(Ω), we denote by ∇2u, the absolutely continuous part of D2u with respect to Ld. Finally, it can also be proved, see also [Dem85] that the embedding of BV2(Ω) into W 1,1(Ω) is compact. 3.3.2 Weak∗ and strict convergence in BV2(Ω) Similarly with BV(Ω) we can define some weaker notions of convergence in BV2(Ω) namely the weak∗ and the strict convergence. 
We also prove a weak∗ compactness theorem in BV2(Ω) which is analogous to Theorem 2.3.5. Definition 3.3.2 (Weak∗ convergence in BV2). Let u ∈ BV2(Ω), (un)n∈N ⊆ BV2(Ω). We say that the sequence (un)n∈N converges to u weakly∗ in BV2(Ω) if (un)n∈N converges to u in L1(Ω,Rd) and (∇un)n∈N converges to ∇u weakly∗ in BV(Ω,Rd), i.e., lim n→∞ ‖un − u‖L1(Ω) = 0, limn→∞ ‖∇un −∇u‖L1(Ω,Rd) = 0, and lim n→∞ ˆ Ω v dD2un = ˆ Ω v dD2u, ∀v ∈ C0(Ω). It is not hard to check that weak∗ convergence is induced by a topology which has a basis consisting of the following sets: U(u0, F, ) = { u ∈ BV2(Ω) : ‖u0 − u‖L1(Ω) + ‖∇u0 −∇u‖L1(Ω,Rd) + ∣∣∣∣ˆ Ω vi dD 2u0 − ˆ Ω vi dD 2u ∣∣∣∣ < , i ∈ F} , where u0 ∈ BV2(Ω), F ⊆ N finite,  > 0 and vi ∈ C0(Ω) for every i ∈ F . Using now Theorem 2.3.5 we can easily prove the following compactness result: Theorem 3.3.3 (Compactness in BV2). Let Ω be an open bounded domain in Rd with Lipschitz boundary and let (un)n∈N be a sequence in BV2(Ω) such that sup n∈N ‖un‖BV2(Ω) <∞. Then there exists a subsequence (unk)k∈N that converges to some u, weakly ∗ in BV2(Ω). 61 The combined TV–TV2 approach for image reconstruction Proof. From the compact embedding of BV2(Ω) into W 1,1(Ω) and the fact that the sequence (∇un)n∈N is bounded in BV(Ω,Rd) we have that there exists a subsequence (unk)k∈N, a function u ∈ W 1,1(Ω) and a function v ∈ BV(Ω,Rd) such that (unk)k∈N con- verges to u strongly in W 1,1(Ω) and (∇unk)k∈N converges to v weakly∗ in BV(Ω,Rd). Then, it is obvious that ∇u = v a.e., u ∈ BV2(Ω) and (unk)k∈N converges to u weakly∗ in BV2(Ω). Analogously with BV(Ω) we define the strict convergence in BV2(Ω). Definition 3.3.4 (Strict convergence in BV2). Let u ∈ BV2(Ω), (un)n∈N ⊆ BV2(Ω). We say that the sequence (un)n∈N converges to u strictly in BV2(Ω) if (un)n∈N converges to u in L1(Ω) and the sequence (|D2un|(Ω))n∈N converges to |D2u|(Ω), i.e., lim n→∞ ‖un − u‖L1(Ω = 0 and limn→∞ |D 2un|(Ω) = |D2u|(Ω). It can be easily verified that the strict convergence in BV2(Ω) is induced by the fol- lowing metric: d(u, v) = ‖u− v‖L1(Ω) + ∣∣|D2u|(Ω)| − |D2v|(Ω)∣∣ , u, v ∈ BV2(Ω). The following lemma can be used to compare the weak∗ and the strict convergence in BV2(Ω). Lemma 3.3.5. Suppose that Ω ⊆ Rd is an open, bounded, connected set with Lipschitz boundary. Let u ∈ BV2(Ω), (un)n∈N ⊆ BV2(Ω) and suppose that (un)n∈N converges to u strictly in BV2(Ω). Then (un)n∈N converges to u strongly in W 1,1(Ω), i.e., lim n→∞ ‖un − u‖W 1,1(Ω) = 0. Proof. Since the sequence (un)n∈N is strictly convergent in BV2(Ω), we have that the se- quences (‖un‖L1(Ω))n∈N and (|D2un|(Ω))n∈N are bounded. Using the estimate (3.6) we deduce that the sequence (‖∇un‖L1(Ω,Rd))n∈N is bounded as well, which implies that the sequence (un)n∈N is bounded in BV2(Ω). From the compact embedding of BV2(Ω) into W 1,1(Ω), we get that there exists a subsequence (unk)k∈N and a function v ∈ W 1,1(Ω) such that (unk)k∈N converges to v strongly in W 1,1(Ω). In particular (unk)k∈N converges to v in L1(Ω), thus v = u a.e. and hence, (unk)k∈N converges to u strongly in W 1,1(Ω). However, since every subsequence (un)n∈N is bounded in BV2(Ω) we can repeat the same argument and deduce that for every subsequence (unk)k∈N there exists a further subse- quence (unkh )h∈N that converges to u strongly in W 1,1(Ω). This proves that the initial sequence (un)n∈N converges to u strongly in W 1,1(Ω). Corollary 3.3.6. Strict convergence implies weak∗ convergence in BV2(Ω). 62 3.4. Well-posedness of the model Proof. 
The proof is straightforward using Lemma (3.3.5) and the fact that strict conver- gence implies weak∗ convergence in BV(Ω,Rd). As in the BV case, functions in BV2(Ω) can also be approximated strictly by smooth functions. In fact a stronger result was proved in [DT84] were in addition to strict approx- imation by a sequence of smooth functions (un)n∈N, also convergence of convex functions of the second order total variations (ϕ(D2un))n∈N to ϕ(D2u) was shown. Theorem 3.3.7 (Demengel–Temam, 1984). Suppose that Ω ⊆ Rd is an open set with Lipschitz boundary and let ϕ : Rd×d → R be a convex function with at most linear growth at infinity. Then for every u ∈ BV2(Ω) there exists a sequence (un)n∈N ∈ C∞(Ω)∩W 2,1(Ω) such that (un)n∈N converges to u strictly in BV2(Ω) and lim n→∞ϕ(D 2un)(Ω) = ϕ(D 2u)(Ω). Note that for a function u ∈W 2,1(Ω) the term ϕ(D2u)(Ω) is equal to ´Ω ϕ(∇2u) dx. 3.4 Well-posedness of the model 3.4.1 Existence and uniqueness In this section we prove existence and uniqueness for the solutions of the minimisation problem (3.1). We suppose that Ω is an open, bounded, connected subset of R2 with Lipschitz boundary so that Theorems 2.3.5, 2.3.6, 3.3.3, 3.3.7 and Lemma 3.3.5 hold. We remark that the analysis is done in R2, i.e., d = 2, not only because we are planning to implement the model in two dimensional images but also because we are going to use the Poincare´ inequality ‖u− uΩ‖L2(Ω) ≤ C|Du|(Ω), ∀u ∈ BV(Ω), which is not true for d > 2. Also for convenience, we take s = 2 in (3.1) as the case s = 1 is treated similarly. We assume that T : L2(Ω)→ L2(Ω) is a bounded linear operator and f ∈ L2(Ω). We also assume that ϕ1 : R2 → R+ and ϕ2 : R4 → R+ are convex functions with at most linear growth at infinity and they both satisfy a coercivity condition, i.e., there exist constants K1, K2 > 0 such that K1|x| ≤ ϕ1(x) ≤ K2(1 + |x|), ∀x ∈ R2, (3.7) K1|x| ≤ ϕ2(x) ≤ K2(1 + |x|), ∀x ∈ R4. (3.8) 63 The combined TV–TV2 approach for image reconstruction Thus, we are interested in the following minimisation problem inf u∈BV2(Ω) H(u), (3.9) where H(u) := 1 2 ˆ Ω (Tu− f)2dx+ α ˆ Ω ϕ1(∇u) dx+ β ϕ2(D2u)(Ω). (3.10) Note that according to (3.5), H can also be written as H(u) = 1 2 ˆ Ω (Tu− f)2dx+ α ˆ Ω ϕ1(∇u) dx+ β ˆ Ω ϕ2(∇2u) (3.11) + β ˆ Ω (ϕ2)∞ ( Ds∇u |Ds∇u| ) d|Ds∇u|. Observe that BV2(Ω) is the right space in which one can try to prove existence of solutions for (3.9). Indeed, if one tries to pose the problem is a Sobolev space setting, he/she would have to minimise the following functional: F (u) = 1 2 ˆ Ω (Tu− f)2dx+ α ˆ Ω ϕ1(∇u) dx+ β ˆ Ω ϕ2(∇2u) dx, (3.12) where now ∇2u is a function in L1(Ω,R2). The natural space for the functional F to be defined in, is W 2,1(Ω). However, since this space is not reflexive we cannot prove existence of solutions for the minimisation of F using the direct method of calculus of variations. However, we will show that there is a connection between the F and H. We can naturally extend the functional F in BV2(Ω) (also denoted by F ) as follows: F (u) =  1 2 ´ Ω(Tu− f)2dx+ α ´ Ω ϕ1(∇u) dx +β ´ Ω ϕ2(∇2u) dx, if u ∈W 2,1(Ω) +∞, if u ∈ BV2(Ω) \W 2,1(Ω). (3.13) Even though F is now defined in BV2(Ω), which, as we have seen, has the useful weak∗ compactness property, one can still not prove existence of solutions of the minimisation of F using the direct method as the next proposition shows: Proposition 3.4.1. Consider the functional F defined in (3.13). 
Then F is not lower semicontinuous with respect to the strict topology in BV2(Ω) and hence it is neither with respect to the weak∗ topology in BV2(Ω). Proof. Note first that we can find a function u ∈ BV2(Ω) \ W 2,1(Ω), for instance, see [Dem85] for such an example. Hence, from the definition of F , we have F (u) = ∞. However, according to the Theorem 3.3.7 we can find a sequence (un)n∈N in W 2,1(Ω) that converges to u strictly in BV2(Ω). It follows that the sequences (‖un‖L1(Ω))n∈N, 64 3.4. Well-posedness of the model (‖∇2un‖L1(Ω,R4))n∈N, as well as (‖∇un‖L1(Ω,R2))n∈N (using estimate (3.6)) are bounded. Using the Poincare´ inequality we can easily check that (un)n∈N is also bounded in L2(Ω). From the fact that T is a bounded, linear operator and from conditions (3.7)–(3.8) on ϕ1 and ϕ2 we deduce that the sequence (F (un))n∈N is bounded as well. Hence, we get F (u) > lim inf n→∞ F (un), which proves that F is not lower semicontinuous with respect to the strict topology in BV2(Ω). We are going to prove that H is the lower semicontinuous envelope of F with respect to the weak∗ topology in BV2(Ω), i.e., H = sc(BV2(Ω),w∗)(F ). The proof is done in two steps. Firstly, we show that H is indeed lower semicontinuous with respect to the weak∗ topology. Theorem 3.4.2. The functional H is lower semicontinuous with respect to the weak∗ topology in BV2(Ω). Proof. Let u ∈ BV2(Ω), (un)n∈N ⊆ BV2(Ω) such that (un)n∈N converges to u weakly∗ in BV2(Ω). We have to show that H(u) ≤ lim inf n→∞ H(un). (3.14) From the weak∗ convergence we have that (un)n∈N converges to u in W 1,1(Ω). Using the standard Sobolev inequality, see for instance [Eva10], ‖v‖L2(Ω) ≤ C‖v‖W 1,1(Ω), ∀v ∈W 1,1(Ω), we deduce that (un)n∈N converges to u in L2(Ω). Since T : L2(Ω)→ L2(Ω) is continuous we get that the map u 7→ 12 ´ Ω(Tu− f)2dx is continuous and hence we have 1 2 ˆ Ω (Tun − f)2dx→ 1 2 ˆ Ω (Tu− f)2dx, as n→∞. (3.15) It is not hard to check that since ϕ1 is convex with at most linear growth at infinity, then is Lipschitz, say with constant L > 0. Using that fact we get∣∣∣∣ˆ Ω ϕ1(∇un) dx− ˆ Ω ϕ1(∇u) dx ∣∣∣∣ ≤ ˆ Ω |ϕ1(∇un)− ϕ1(∇u)| dx ≤ L ˆ Ω |∇un −∇u| dx. (3.16) 65 The combined TV–TV2 approach for image reconstruction From the estimate (3.16) and the fact that (∇un)n∈N converges to ∇u in L1(Ω,R2) we eventually get that ˆ Ω ϕ1(∇un) dx→ ˆ Ω ϕ1(∇u) dx, as n→∞. (3.17) Finally from the weak∗ convergence we have that (D2un)n∈N converges to D2u weakly∗ in M(Ω,R2×2). We can then apply Theorem 3.2.3 for µn = µ = L2, ν = D2u, νn = D2un and get ϕ2(D 2u)(Ω) ≤ lim inf n→∞ ϕ2(D 2un)(Ω). (3.18) Combining (3.15), (3.17) and (3.18) we derive (3.14). Theorem 3.4.3. The functional H is the lower semicontinuous envelope of F with respect to the weak∗ topology in BV2(Ω). Proof. We are going to use the characterisation of lower semicontinuous envelopes as that was given in Proposition 2.4.4. Condition (i) of Proposition 2.4.4 follows from the lower semicontinuity of H and the fact that H ≤ F . Thus it suffices to prove condition (ii), i.e., to prove that for every u ∈ BV2(Ω), there exists a sequence (un)n∈N ∈ BV2(Ω) converging to u weakly∗ and H(u) = lim n→∞F (un). (3.19) From Theorem 3.3.7 we get the existence of a sequence (un)n∈N ∈ C∞(Ω) ∩W 2,1(Ω) such that (un)n∈N converges strictly, and thus also weakly∗ in BV2(Ω). Moreover, from the same theorem we also get that lim n→∞ϕ2(D 2un)(Ω) = ϕ2(D 2u)(Ω). 
(3.20) We also have that (un)n∈N converges to u in W 1,1(Ω) which, according to the proof of Theorem 3.4.2 implies that 1 2 ˆ Ω (Tu− f)2dx+α ˆ Ω ϕ1(∇un) dx→ 1 2 ˆ Ω (Tu− f)2dx+α ˆ Ω ϕ1(∇un) dx, as n→∞. (3.21) Combining (3.20), (3.21) and that fact that H = F on C∞(Ω)∩W 2,1(Ω) we get (3.19). Observe that since strict convergence induces weak∗ convergence, we get that H is also the lower semicontinuous envelope of F with respect to the strict convergence in BV2(Ω), i.e., H = sc(BV2(Ω),w∗)(F ) = sc(BV2(Ω),strict)(F ). Let us note here that Theorem 3.4.3 can also be proved by incorporating a general relaxation result in [AC94], where the authors identify lower semicontinuous envelopes of functionals of the type ´ Ω ϕ(∇ku) dx in higher order BV spaces. 66 3.4. Well-posedness of the model We are now ready to prove existence of solutions for the minimisation problem (3.9), where we essentially use the lower semicontinuity of H, combined with the weak∗ compact- ness in BV2(Ω). The proof of the following theorem follows the proof of the corresponding theorem in [Ves01] which deals with the minimisation of the analogue first order functional (β = 0). Theorem 3.4.4. In addition to the already mentioned conditions on Ω, T , ϕ1 and ϕ2, assume also that α > 0, β > 0 and T (XΩ) 6= 0. Then the minimisation problem minu∈BV2(Ω)H(u), i.e., min u∈BV2(Ω) 1 2 ˆ Ω (Tu− f)2dx+ α ˆ Ω ϕ1(∇u) dx+ β ϕ2(D2u)(Ω), (3.22) has a solution u? ∈ BV2(Ω). Proof. Let (un)n∈N be a minimising sequence for (3.22), i.e., limn→∞H(un) = inf H and let M > 0 be an upper bound for (H(un))n∈N. We have that sup n∈N ϕ2(D 2un)(Ω) < M, sup n∈N ˆ Ω ϕ1(∇un) dx < M and sup n∈N 1 2 ˆ Ω (Tun − f)2dx < M. (3.23) The estimates (3.23) together with the coercivity assumptions (3.7)–(3.8) give that sup n∈N |Dun|(Ω) = sup n∈N ˆ Ω |∇un| dx < M K1 , (3.24) sup n∈N |D2un|(Ω) < M K1 . (3.25) We now show that the sequence (un)n∈N is bounded in L2(Ω), following essentially [Ves01]. By the Poincare´ inequality, there exists a constant C > 0 such that for every n ∈ N ‖un‖L2(Ω) = ∥∥∥∥un −XΩ 1L2(Ω) ˆ Ω un dx+ XΩ 1L2(Ω) ˆ Ω un dx ∥∥∥∥ L2(Ω) ≤ C|Dun|(Ω) + 1L2(Ω) ∣∣∣∣ˆ Ω un dx ∣∣∣∣ ≤ CM K1 + 1 L2(Ω) ∣∣∣∣ˆ Ω un dx ∣∣∣∣ . Thus in order to bound (un)n∈N in L2(Ω) it suffices to bound | ´ Ω un dx| uniformly in n. We have for every n ∈ N∥∥∥∥T (XΩ 1L2(Ω) ˆ Ω un dx )∥∥∥∥ L2(Ω) ≤ ∥∥∥∥T (XΩ 1L2(Ω) ˆ Ω un dx ) − Tun ∥∥∥∥ L2(Ω) + ‖Tun − f‖L2(Ω) + ‖f‖L2(Ω) 67 The combined TV–TV2 approach for image reconstruction ≤ ‖T‖ ∥∥∥∥un −XΩ 1L2(Ω) ˆ Ω un dx ∥∥∥∥ L2(Ω) + ‖Tun − f‖L2(Ω) + ‖f‖L2(Ω) ≤ C‖T‖|Dun|(Ω) + √ 2M + ‖f‖L2(Ω) ≤ CM K1 ‖T‖+ √ 2M + ‖f‖L2(Ω). (3.26) Thus, setting K = CMK1 ‖T‖ + √ 2M + ‖f‖L2(Ω) and using the fact that T (XΩ) 6= 0, it follows from (3.26) ∣∣∣∣ˆ Ω un dx ∣∣∣∣ ‖T (XΩ)‖L2(Ω) ≤ KL2(Ω), and since T (XΩ) 6= 0 we get ∣∣∣∣ˆ Ω un dx ∣∣∣∣ ≤ KL2(Ω)‖T (XΩ)‖L2(Ω) . Since the sequence (un)n∈N is bounded in L2(Ω) and Ω is a bounded domain, we have that it is also bounded in L1(Ω), and from estimates (3.24)–(3.25) we deduce that it is bounded in BV2(Ω) as well. From Theorem 3.3.3 we obtain the existence of a subsequence (uk)k∈N that converges weakly∗ to a function u? ∈ BV2(Ω). Since the functional H is weakly∗ lower semicontinuous we have that inf F ≤ H(u?) ≤ lim inf n→∞ H(un) = inf F, which implies that H(u?) = min u∈BV2(Ω) H(u). Let us note here that in the above proof we used the fact that α > 0 in order to obtain an L1(Ω) bound in the gradient of the minimising sequence (un)n∈N that was further used to bound (un)n∈N in L2(Ω) (for the case β = 0 see [Ves01]). 
However even in the case α = 0, we can also bound (un)n∈N in L2(Ω), and hence in BV2(Ω) using (3.6), if T satisfies a condition of the type M‖u‖L2(Ω) ≤ ‖Tu‖L2(Ω), ∀u ∈ L2(Ω), (3.27) for a constant M > 0. For example (3.27) is true if T has a bounded inverse by setting M = 1/‖T−1‖. If T does not satisfy a condition of the type (3.27) then it is not clear how to get existence. However, we are still able to prove existence when α = 0, for operators T that correspond to image inpainting. i.e., projections on subsets D of Ω. The proof is more involved than the one of Theorem 3.4.4 and makes use of theorems on BV traces. 68 3.4. Well-posedness of the model Ω \D D Figure 3.1: Inpainting domain D. Theorem 3.4.5. If β > 0, then the minimisation problem inf u∈BV2(Ω) ˆ Ω\D (u− f)2dx+ β ϕ2(D2u), (3.28) has a solution, where D b Ω is an open, connected set with Lipschitz boundary, see also Figure 3.1. Proof. Consider a minimising sequence (un)n∈N for (3.28). From the coercivity assump- tions (3.8) we get that sup n∈N |D2un|(Ω) <∞. (3.29) In order to prove the theorem it suffices to bound (un)n∈N and (∇un)n∈N in L1(Ω) and L1(Ω,R2) respectively. Then the proof can be finished using a straightforward application of the direct method as was done in the proof of Theorem 3.4.4. Since (un)n∈N is a minimising sequence for (3.28) and Ω is bounded we have that sup n∈N ‖un‖L1(Ω\D) <∞. (3.30) Estimates (3.29), (3.30) in combination with (3.6) give that sup n∈N |Dun|(Ω \D) <∞. (3.31) Denote with u (1) n and u (2) n the restrictions of un in Ω\D and D respectively. Then according to Theorems 2.3.7 and 2.3.9 we have for every n ∈ N |Dun|(Ω) = |Dun|(Ω \D) + |Dun|(D) + ‖(u(1)n )Ω\D − (u(2)n )D‖L1(∂D;H). (3.32) Since un ∈W 1,1(Ω) we have for every n ∈ N |Dun|(Ω) = ˆ Ω |∇u| dx = ˆ Ω\D |∇u| dx+ ˆ D |∇u| dx = |Dun|(Ω \D) + |Dun|(D). (3.33) From (3.32) and (3.33) we deduce that ‖(u(1)n )Ω\D‖L1(∂D;H) = ‖(u(2)n )D‖L1(∂D;H). (3.34) 69 The combined TV–TV2 approach for image reconstruction From the estimate (2.4) of Theorem 2.3.7 and from (3.30), (3.31) and (3.34) we get that sup n∈N ‖(u(2)n )D‖L1(∂D;H) <∞. (3.35) Suppose now that supn∈N ‖un‖L1(D) =∞. Then from the estimate (2.7) and (3.35), that would mean that sup n∈N |Dun|(D) =∞, (3.36) i.e, there exists an i0 ∈ {1, . . . , d} such that sup n∈N ∥∥∥∥ ∂un∂xi0 ∥∥∥∥ L1(D) =∞, (3.37) but this is a contradiction. In order to see that, observe first that again from Theorems 2.3.7 and 2.3.9 we get for every n ∈ N |D2un|(Ω) = |D2un|(Ω \D) + |D2un|(D) + ‖(∇u(1)n )Ω\D − (∇u(2)n )D‖L1(∂D,Rd;H). (3.38) From (3.29), (3.31) and the estimate (2.4) we get that sup n∈N ‖(∇u(1)n )Ω\D‖L1(∂D,Rd;H) <∞, (3.39) which in combination with (3.29) and (3.38) gives sup n∈N ‖(∇u(2)n )D‖L1(∂D,Rd;H) <∞⇒ sup n∈N ∥∥∥∥∥∥ ( ∂u (2) n ∂xi0 )D∥∥∥∥∥∥ L1(∂D;H) <∞. (3.40) But again estimate (2.7) gives ∥∥∥∥ ∂un∂xi0 ∥∥∥∥ L1(D) ≤ C ∣∣∣∣D( ∂un∂xi0 )∣∣∣∣ (D) + ∥∥∥∥∥∥ ( ∂u (2) n ∂xi0 )D∥∥∥∥∥∥ L1(∂D;H)  . (3.41) From (3.37), (3.40) and (3.41) we deduce sup n∈N ∣∣∣∣D( ∂un∂xi0 )∣∣∣∣ (D) =∞, which cannot happen due to (3.29). Observe that the above proof works also in the case where the inpainting domain consists of finitely many open connected sets (Di)i=1,...,N b Ω with Lipschitz boundaries such that dist(Di, Dj) > 0 for every i 6= j. 70 3.4. Well-posedness of the model We now prove uniqueness of solutions for (3.9). The following proof also follows the proof of the corresponding theorem for the first order analogue in [Ves01]. Theorem 3.4.6. 
If, in addition to T (XΩ), T is injective or if ϕ1 is strictly convex, then the solution of the minimisation problem (3.9) is unique. Proof. Suppose that u1 and u2 are two minimisers for (3.9). Using Proposition 3.2.1, it can be easily checked that H is convex. If Tu1 6= Tu2 then from the strict convexity of the fidelity term in H we have H ( 1 2 u1 + 1 2 u2 ) < 1 2 H(u1) + 1 2 H(u2) = inf H, which is a contradiction. Thus we must have Tu1 = Tu2. If T is injective, we have u1 = u2. If T is not injective but ϕ1 is strictly convex, we must have ∇u1 = ∇u2 otherwise we get the same contradiction as before. In that case, since Ω is connected, there exists a constant c such that u1 = u2 + cXΩ and since T (XΩ) 6= 0, we get c = 0. We note that in general we cannot obtain uniqueness results when the L1 norm is used in the fidelity term, due to the lack of strict convexity. 3.4.2 Stability In order to complete the well-posedness picture for the problem min u∈BV2(Ω) 1 2 ˆ Ω (Tu− f)2dx+ α ˆ Ω ϕ1(∇u) dx+ β ϕ2(D2u)(Ω), (3.42) it remains to analyse its stability. More precisely, we want to know which effect deviations in the data f have on a corresponding minimiser of (3.42). Ideally the deviation in the minimisers for different input data should be bounded by the deviation of the data. Let Ψ be the regularising functional in (3.42), i.e., Ψ(u) = α ˆ Ω ϕ1(∇u) dx+ β ϕ2(D2u)(Ω). It has been demonstrated by many authors [BO04, BRH07, Po¨s08, BB11] that Bregman distances related to the regularisation functional Ψ are natural error measures for vari- ational regularisation methods with Ψ convex. In particular Po¨schl [Po¨s08] has derived estimates for variational regularisation methods for power of metrics, which apply to the functional we consider here. However, for demonstration issues and in order to make the constants in the estimates more explicit we state and prove here the result for a special case of (3.42). For what we are going to do we assume that one of the regularisers is differentiable. Without use of generality, we assume that the convex function φ1 is differentiable. Analo- 71 The combined TV–TV2 approach for image reconstruction gous analysis can be done if φ2 is differentiable or even under weaker continuity solutions, see [BL11]. Let u˜ be the original image and f˜ the exact data (without noise), i.e., f˜ is a solution of T u˜ = f˜ . We assume that the noisy data deviate from the exact data by ‖f˜ − f‖L2(Ω) ≤ δ, for a small δ > 0. For the original image u˜ we assume that the following condition, called source condition, holds There exists a ξ ∈ ∂Ψ(u˜) such that ξ = T ∗q for a source element q ∈ L2(Ω), (SC) where T ∗ is the adjoint of T . As it was shown in [BO04] for the general setting, the elements that satisfy SC are exactly the minimisers of (3.42) for arbitrary f , α and β. Since φ1 is differentiable and both φ1 and φ2 are convex, the subdifferential of Ψ can be written as a sum of subdifferentials [ET76] ∂Ψ(u) = α∂ (ˆ Ω φ1(∇(·))dx ) (u) + β ∂ ( φ2(D 2(·))(Ω)) (u) = −α div(φ′1(∇u)) + β ∂ ( φ2(D 2(·))(Ω)) (u). We also define the symmetric Bregman distance for the regularising functional Ψ as DsymmΨ (u1, u2) := 〈p1 − p2, u1 − u2〉, p1 ∈ ∂Ψ(u1), p2 ∈ ∂Ψ(u2). Theorem 3.4.7. Let u˜ be the original image with source condition SC satisfying T u˜ = f˜ . Let f ∈ L2(Ω) be the noisy data with ‖f˜ − f‖L2(Ω) ≤ δ. Then a minimiser u of (3.42) satisfies αDsymm´ Ω φ1(∇(·)) (u, u˜) + βDsymm φ2(D2(·))(Ω)(u, u˜) + 1 2 ‖Tu− f˜‖L2(Ω) ≤ ‖q‖2L2(Ω) + δ2, (3.43) where q is the source element in SC. 
Moreover if u1 and u2 are minimisers for (3.42) with data f1 and f2 then we have the following estimate αDsymm´ Ω φ1(∇·) (u1, u2)+βD symm φ2(D2(·))(Ω)(u1, u2)+ 1 2 ‖Tu1−Tu2‖L2(Ω) ≤ 1 2 ‖f1−f2‖2L2(Ω). (3.44) Proof. The optimality condition for (3.42) reads αp1 + βp2 + T ∗(Tu− f) = 0, where p1 = −div(φ′1(∇u)), p2 ∈ ∂ ( φ2(D 2(·))(Ω)) (u) (3.45) Adding the element ξ = T ∗q from SC and T ∗f˜ to (3.45), we get αp1 + βp2 − ξ + T ∗(Tu− f˜) = T ∗((f − f˜)− q). (3.46) Note now that ξ = αξ1 + βξ2 for some ξ1 and ξ2 that belong to the subdifferential of´ Ω φ1(∇·) and φ2(D2(·))(Ω) respectively. Using that fact and taking the duality product 72 3.5. The numerical implementation of (3.46) with u− u˜ we get α〈p1 − x1, u− u˜〉+ β〈p2 − ξ2, u− u˜〉+ ‖Tu− f˜‖L2 = 〈(f − f˜)− q, Tu− f˜〉. (3.47) where we used T u˜ = f˜ . By Young’s inequality, we eventually get αDsymm´ Ω φ1(∇(·)) (u, u˜)+βDsymm φ2(D2(·))(Ω)(u, u˜)+ 1 2 ‖Tu−f˜‖L2(Ω) ≤ ‖q‖2L2(Ω)+‖f˜−f‖2L2(Ω), (3.48) and in view of ‖f˜ − f‖L2(Ω) ≤ δ we get (3.43). Similarly, if u1 and u2 are minimisers for (3.42) with data f1 and f2 we can derive estimate (3.44) by subtracting the optimality conditions for u1 and u2 and applying Young’s inequality once again. 3.5 The numerical implementation It is a common practice in mathematical imaging and inverse problems in general, to introduce a model in the continuous set up and after analysing it, to discretise functions and operators in order to implement it in a computer. Here we make no exceptions. In order to be completely rigorous, the link between the continuous and the discretised model should be provided. Such a link is normally given in terms of Γ-convergence [Bra02, DM93] of the discrete functional to the continuous one as the resolution of the image tends to infinity. However, as this is not our purpose here we leave this issue for future research. We use standard discretisations for the norms, differential operators and the operator T . The discretisation of the latter is the obvious one as we only consider T to be the identity function (denoising), a convolution operator (deblurring) or a projection operator (inpainting). However, let us just mention here that in some advanced medical imaging tasks e.g. MRI imaging, the way one discretises the problem is not trivial and it is crucial for the quality of the reconstruction results. In this section we work with the discretised version of the problem (3.1) and we discuss its numerical realisation by the so-called split Bregman technique [GO09]. We start by defining the discrete versions of L1 and L2 norms and we also introduce the appropriate discrete differential operators. We proceed with an introduction to the Bregman iteration which is used to solve constrained optimisation problems, an idea originated in [OBG+]. In [GO09] the Bregman iteration and an operator splitting technique (split Bregman) was used to solve the total variation minimisation problem after reformulating the problem as a constrained one. In the latter paper it was also also proved that the iterates of the Bregman iteration converge to the solution of the constrained problem assuming that the iterates satisfy the constraint in a finite number of iterations. Here, we give a more general convergence result where we do not use that assumption, see Theorem 3.5.1. We proceed with describing how our problem can be solved with the Bregman iteration, using the splitting procedure mentioned above. 
We finish by applying our method to image denoising 73 The combined TV–TV2 approach for image reconstruction and deblurring comparing it with TGV, TV2 and infimal convolution minimisation, with respect both to reconstruction quality and computational time. An extensive study of the application of our method to image inpainting is presented in Section 3.7. 3.5.1 Discretisation of the model In this section we study the discretisation of the problem (3.1). In our numerical examples we set ϕ1 = | · | and ϕ2 = | · |, the Euclidean norms in R2 and R4 respectively. That is to say we discretise the following problem min u∈BV2(Ω) 1 s ˆ Ω |Tu− f |sdx+ αTV(u) + β TV2(u), s = 1, 2. (3.49) In order to do so, we specify the corresponding discrete operators and norms that appear in the continuous functional. In the discrete setting u and f are elements of RN×M where N × M is the resolution of the image and T : RN×M → RN×M , is a linear operator. Here, we consider only greyscale images and we describe an extension to colour images for the application of (3.49) to image inpainting in Section 3.7.3. We define the finite dimensional L1 and L2 norms of a discrete vector field U : {1, . . . , N}×{1, . . . ,M} → RK with U = (U1, . . . , UK) as follows: ‖U‖2 =  N∑ i=1 M∑ j=1 K∑ k=1 Uk(i, j) 2 1/2 , ‖U‖1 = N∑ i=1 M∑ j=1 ( K∑ k=1 Uk(i, j) 2 )1/2 . We also define the discrete first and second order differential operators Dx, Dy, Dxx, Dyy and Dxy which are all operators from RN×M to RN×M . We note here that we assume periodic boundary conditions for u. By choosing periodic boundary conditions, the action of each of the discrete differential operators can be regarded as a circular convolution of u and allows the use of Fast Fourier Transform (FFT), see also [WT10]. Thus, even though the use of periodic boundary conditions is not necessary, it has been chosen in order to speed up the computational performance of our implementation, especially as far as deblurring and inpainting are concerned. In our discrete implementation, the coordinates x and y are oriented along columns and rows respectively. We define the discrete gradient ∇u = (Dxu,Dyu) as a forward finite difference operator Dxu(i, j) = u(i, j + 1)− u(i, j) if 1 ≤ i ≤ N, 1 ≤ j < M,u(i, 1)− u(i, j) if 1 ≤ i ≤ N, j = M, 74 3.5. The numerical implementation Dyu(i, j) = u(i+ 1, j)− u(i, j) if 1 ≤ i < N, 1 ≤ j ≤M,u(1, j)− u(i, j) if i = N, 1 ≤ j ≤M. We also need the discrete divergence operator div : ( RN×M )2 → RN×M that has the adjointness property −divp · u = p · ∇u, ∀u ∈ RN×M , p ∈ (RN×M)2 . This property is essentially the discrete analogue of integration by parts. For a p = (p1, p2) ∈ ( RN×M )2 we define (divp)(i, j) = ←− Dxp1(i, j) + ←− Dyp2(i, j), where ←− Dx and ←− Dy are the following backward finite difference operators: ←− Dxu(i, j) = u(i, j)− u(i,M) if 1 ≤ i ≤ N, j = 1,u(i, j)− u(i, j − 1) if 1 ≤ i ≤ N, 1 < j ≤M, ←− Dyu(i, j) = u(i, j)− u(N, j) if i = 1, 1 ≤ j ≤M,u(i, j)− u(i− 1, j) if 1 < i ≤ N, 1 ≤ j ≤M. Analogously we define the discrete Hessian ∇2u = (Dxxu,Dyyu,Dxyu,Dyxu). 
This consists of the corresponding compositions of the first order operators, Dxxu(i, j) =  u(i,m)− 2u(i, j) + u(i, j + 1) if 1 ≤ i ≤ N, j = 1, u(i, j − 1)− 2u(i, j) + u(i, j + 1) if 1 ≤ i ≤ N, 1 < j < M, u(i, j − 1)− 2u(i, j) + u(i, 1) if 1 ≤ i ≤ n, j = M, Dyyu(i, j) =  u(n, j)− 2u(i, j) + u(i+ 1, j) if i = 1, 1 ≤ j ≤M, u(i− 1, j)− 2u(i, j) + u(i+ 1, j) if 1 < i < N, 1 ≤ j ≤M, u(i− 1, j)− 2u(i, j) + u(1, i) if i = N, 1 ≤ j ≤M, Dxyu(i, j) =  u(i, j)− u(i+ 1, j)− u(i, j + 1) + u(i+ 1, j + 1) if 1 ≤ i < N, 1 ≤ j < M, u(i, j)− u(1, j)− u(i, j + 1) + u(1, j + 1) if i = N, 1 ≤ j < M, u(i, j)− u(i+ 1, j)− u(i, 1) + u(i+ 1, 1) if 1 ≤ i < N, j = M, u(i, j)− u(1, j)− u(i, 1) + u(1, 1) if i = N, j = M. 75 The combined TV–TV2 approach for image reconstruction One can easily check that Dxy = Dx(Dy) = Dy(Dx) = Dyx. As before, we also need the discrete second order divergence operator div2 : ( RN×M )4 → RN×M with the adjointness property div2q · u = q · ∇2u, ∀u ∈ Rn×m, q ∈ (RN×M)4 . For a q = (q11, q22, q12, q21) ∈ ( RN×M )4 we define (div2q)(i, j) = ←−− Dxxq11(i, j) + ←−− Dyyq22(i, j) + ←−− Dxyq12(i, j) + ←−− Dyxq21(i, j), where ←−− Dxx = Dxx, ←−− Dyy = Dyy, ←−− Dxy = ←−− Dyx with ←−− Dxyu(i, j) =  u(i, j)− u(i,M)− u(N, i) + u(N,M) if i = 1, j = 1, u(i, j)− u(i, j − 1)− u(N, j) + u(N, j − 1) if i = 1, 1 < j ≤M, u(i, j)− u(i− 1, j)− u(i,M) + u(i− 1,M) if 1 < i ≤M j = 1, u(i, j)− u(i, j − 1)− u(i− 1, j) + u(i− 1, j − 1) if 1 < i ≤ N, 1 < j ≤M. Figure 3.2 describes schematically the action of all the above discrete differential operators. According to all the definitions above the discretised version of problem (3.49) can be − + Dx i, j + 1i, j − + i, ji, j − 1 ←− Dx −2+ + i, ji, j − 1 i, j + 1 Dxx = ←−− Dxx + − + −i, j i+ 1, j i, j i− 1, j Dy ←− Dy Dyy = ←−− Dyy −2 + + i, j i− 1, j i+ 1, j + + − − + + − − i, j i, j + 1 i− 1, j − 1 i− 1, j i+ 1, j + 1i+ 1, j i, ji, j − 1 Dxy ←−− Dxy Figure 3.2: Illustration of the discrete derivative approximations. written as min u∈RN×M 1 s ‖Tu− f‖ss + α‖∇u‖1 + β‖∇2u‖1, s = 1, 2. (3.50) 76 3.5. The numerical implementation 3.5.2 Bregman iteration We now introduce the Bregman and split Bregman iterations as suitable numerical meth- ods for the solution of (3.50). We would like to recall some basic aspects of the general theory of Bregman iteration before finishing this discussion with a novel convergence result in Theorem 3.5.1. Suppose we want to solve the following constrained minimisation problem min u∈RL E(u), such that Au = b, (3.51) where E is a convex function and A : RL → RD a linear map. We can transform the con- strained minimisation problem (3.51) into an unconstrained one, introducing a parameter λ: sup λ min u∈RL E(u) + λ 2 ‖Au− b‖22, (3.52) where in order to satisfy the constraint Au = b, one has to let λ increase to infinity. Instead of doing that we perform the Bregman iteration as it was proposed in [OBG+] and [YOGD08]: Bregman Iteration un+1 = argmin u∈RL E(u) + λ 2 ‖Au− bn‖22, (3.53) bn+1 = bn + b−Aun+1. (3.54) In [OBG+], assuming that (3.53) has a unique solution, the authors derived among others, the following facts about the iterates un: (i) The constraint is satisfied at infinity: ‖Aun − b‖22 < C1 n− 1 , C1 > 0, ∀n ≥ 2. (3.55) (ii) Monotonic decrease of the residuals: ‖Aun+1 − b‖22 ≤ ‖Aun − b‖22, ∀n ∈ N. (3.56) (iii) Summability of the residuals: ∞∑ n=0 ‖Aun − b‖22 <∞, (3.57) (iv) Uniform bound on E(un): E(un) < C2, C2 > 0, ∀n ∈ N. 
(3.58) Using these results, we are able to prove convergence of the Bregman iterates to a solution of the original constrained problem (3.51). This result was proved in [GO09] in the case when the constraint is satisfied in a finite number of iterations, i.e., Aun0 = b for some iterate un0 . Here we do not use that assumption, something that makes our result a 77 The combined TV–TV2 approach for image reconstruction genuine contribution to the convergence theory of Bregman iteration. Theorem 3.5.1. Suppose that the constrained minimisation problem (3.51) has a unique solution u?. Moreover suppose that the convex function E is coercive and that (3.53) has a unique solution for every n ∈ N. Then the sequence of the iterates (un)n∈N of Bregman iteration converges to u?. Proof. Since (3.53) has a unique solution for every n ∈ N we have that the statements (3.55)–(3.58) hold. Moreover we have for every n ≥ 1 bn = b0 + n∑ `=1 (b−Au`)⇒ ‖bn‖2 ≤ ‖b0‖2 + n∑ `=1 ‖b−Au`‖2. (3.59) From (3.58) and the coercivity of E we have that the sequence (‖un‖2)n∈N is bounded, say by a constant C > 0. Thus, it suffices to show that every accumulation point of (un)n∈N is equal to u?. Using the fact that Au? = b, together with (3.55), we have for every increasing sequence of naturals (nk)k∈N, k > 2, λ 2 ‖Au? − bnk−1‖22 ≤ λ 2 (‖Au? −Aunk‖2 + ‖Aunk − bnk−1‖2)2 = λ 2 (‖Aunk − b‖2 + ‖Aunk − bnk−1‖2)2 ≤ λC1 2(nk − 1) + λ‖Aunk − b‖2‖Aunk − bnk−1‖2 + λ 2 ‖Aunk − bnk−1‖22. (3.60) Since unk is a solution to (3.53) we have E(unk) + λ 2 ‖Aunk − bnk−1‖22 ≤ E(u?) + λ 2 ‖Au? − bnk−1‖22, ∀k ≥ 1. (3.61) Combining (3.60) and (3.61) with (3.55) and (3.59) we get for all k ≥ 2, E(unk) ≤ E(u?) + λC1 2(nk − 1) + λ‖Aunk − b‖2‖Aunk − bnk−1‖2 ≤ E(u?) + λC1 2(nk − 1) + λ‖Aunk − b‖2‖Aunk‖2 + λ‖Aunk − b‖2‖bnk−1‖2 ≤ E(u?) + λC1 2(nk − 1) + λ √ C1‖A‖‖unk‖2√ nk − 1 + λ‖Aunk − b‖2‖bnk−1‖2 ≤ E(u?) + λC1 2(nk − 1) + λ √ C1‖A‖C√ nk − 1 + λ‖Aunk − b‖2 ( ‖b0‖2 + nk∑ `=1 ‖Au` − b‖2 ) ≤ E(u?) + λC1 2(nk − 1) + λ √ C1(‖A‖C + ‖b0‖2)√ nk − 1 + λ‖Aunk − b‖2 nk∑ `=1 ‖Au` − b‖2. (3.62) 78 3.5. The numerical implementation Suppose now that (unk)k∈N converges to some u˜ as ` goes to infinity. Taking (3.55) into account, we have that ‖Au˜− b‖2 = 0, i.e., u˜ satisfies the constraint Au˜ = b. On the other hand, taking limits in (3.62) and using Kronecker’s Lemma, see Lemma A.2.1 in Appendix A, we have that the limit in the right hand side of (3.62) is E(u?). Thus we have E(u˜) ≤ E(u?), which means that u˜ is a solution of the constrained problem (3.51) and since the solution is unique then u˜ = u?. Since every accumulation point of the bounded sequence (un)n∈N is equal to u?, we conclude that the whole sequence converges to u?. 3.5.3 Numerical solution via the split Bregman algorithm In this section, we explain how the Bregman iteration (3.53)–(3.54) together with a split- ting technique can be used to implement numerically the minimisation of the functional (3.50). The idea originates from [GO09], where such a procedure was applied to total variation minimisation and was given the name split Bregman algorithm. This iterative technique is equivalent to certain instances of combinations of the augmented Lagrangian method with classical operator splitting such as Douglas–Rachford, see [Set09]. Recall that we want to solve the following unconstrained minimisation problem: min u∈RN×M 1 2 ‖Tu− f‖22 + α‖∇u‖1 + β‖∇2u‖1, (3.63) where here for simplicity, we consider only the L2 fidelity case. 
The derivation of the split Bregman algorithm for solving (3.63) starts with the observation that the above minimisation problem is equivalent to the following constrained minimisation problem min u∈RN×M v∈(RN×M)2 w∈(RN×M)4 1 2 ‖Tu− f‖22 + α‖v‖1 + β‖w‖1, such that v = ∇u, w = ∇2u. (3.64) It is clear that since the discrete gradient and Hessian are linear operations, the minimi- sation problem (3.64) can be reformulated into the more general problem min z∈RL E(z), such that Az = b, where E : RL → R+ is convex, A is a L × L matrix and b is a vector of length L, where L = 7NM . It is also easy to see that the iterative scheme of the type (3.53)–(3.54) that corresponds to the constrained minimisation problem (3.64) is: (un+1, vn+1, wn+1) = argmin u,v,w 1 2 ‖Tu− f‖22 + α‖v‖1 + β‖w‖1 + λ 2 ‖bn1 +∇u− v‖22 (3.65) 79 The combined TV–TV2 approach for image reconstruction + λ 2 ‖bn2 +∇2u− w‖22, bn+11 = b n 1 +∇un+1 − vn+1, (3.66) bn+12 = b n 2 +∇2un+1 − wn+1, (3.67) where bn+11 = (b n+1 1,1 , b n+1 1,2 ) ∈ ( RN×M )2 and bn+12 = (b n+1 2,11 , b n+1 2,22 , b n+1 2,12 , b n+1 2,21 ) ∈ ( RN×M )4 . Notice that at least in the case where T is injective (e.g. denoising, deblurring), the minimisation problem (3.65) has a unique solution. Moreover, in that case, the functional E is coercive and the constrained minimisation problem (3.64) has a unique solution, thus, Theorem 3.5.1 holds. We also note that one can consider having two parameters λ1 and λ2 for the first and second order term in (3.65) respectively, as it is easily checked that this does not affect the convergence of the Bregman iteration. Our next concern is the efficient numerical solution of the minimisation problem (3.65). We follow [GO09] and minimise with respect to u, v and w alternatingly: Split Bregman for L2 −TV −TV2 denoising, deblurring Subproblem 1: un+1 = argmin u∈RN×M 1 2 ‖Tu− f‖22 + λ1 2 ‖bn1 +∇u− vn‖22 (3.68) + λ2 2 ‖bn2 +∇2u− wn‖22, Subproblem 2: vn+1 = argmin v∈(RN×M )2 α‖v‖1 + λ1 2 ‖bn1 +∇un+1 − v‖22, (3.69) Subproblem 3: wn+1 = argmin w∈(RN×M )4 β‖w‖1 + λ2 2 ‖bn2 +∇2un+1 − w‖22, (3.70) Update on b1: b n+1 1 = b n 1 +∇un+1 − vn+1, (3.71) Update on b2: b n+1 2 = b n 2 +∇2un+1 − wn+1. (3.72) We note that the above splitting strategy is tuned for image denoising and deblurring. As far as image inpainting is concerned an alternative splitting is required for efficient implementation. We discuss that in Section 3.7.2. The above alternating minimisation scheme, make up the split Bregman iteration that is proposed in [GO09] to solve the total variation minimisation problem as well as problems related to compress sensing. For convergence properties of the split Bregman iteration and also other splitting techniques we refer the reader to [CW06, EZC10, Set09]. In [WT10] and [YOGD08], it is noted that the Bregman iteration coincides with the augmented Lagrangian method. Minimising 80 3.5. The numerical implementation alternatingly with respect to the variables in the augmented Lagrangian method results to the alternating direction method of multipliers (ADMM), see [Gab83]. Thus, split Bregman is equivalent to ADMM. In [Eck89] and [Gab83] it is shown that ADMM is equivalent to the Douglas–Rachford splitting algorithm whose convergence is guaranteed. We refer the reader to [Set09] for an interesting study in this subject. We now discuss how we solve each of the minimisation problems (3.68)–(3.70) for the case of denoising and deblurring. Thus, we assume that T is either the identity or represents a circular convolution. 
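As a concrete illustration of this assumption (a minimal sketch with our own naming, using the kernel size and sigma of the deblurring experiments of Section 3.6 purely as an example), a circular convolution T and its adjoint can be applied through the FFT as follows:

% Applying a circular convolution T and its adjoint T* with the FFT, under the
% periodic boundary conditions used throughout. The kernel is zero-padded to the
% image size and centred at the (1,1) pixel, so that fft2 of it is the transfer
% function of T. Kernel size and sigma below follow the deblurring experiments.
N = 200; M = 300;                                    % example image size
[x, y] = meshgrid(-5:5, -5:5);                       % 11 x 11 grid
k = exp(-(x.^2 + y.^2) / (2*2^2));  k = k / sum(k(:));   % discrete Gaussian, sigma = 2
kpad = zeros(N, M);  kpad(1:11, 1:11) = k;
kpad = circshift(kpad, [-5 -5]);                     % move the kernel centre to pixel (1,1)
Khat = fft2(kpad);                                   % Fourier transfer function of T
T    = @(u) real(ifft2(Khat .* fft2(u)));            % u -> T u
Tadj = @(u) real(ifft2(conj(Khat) .* fft2(u)));      % u -> T* u

Since the Gaussian kernel is symmetric, conj(Khat) coincides with Khat up to rounding and T* = T, as also noted below.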
We denote by F the two dimensional discrete Fourier transform. Subproblem 1: We want to solve (3.68), i.e., un+1 = argmin u∈RN×M 1 2 ‖Tu− f‖22 + λ1 2 ‖bn1 +∇u− vn‖22 + λ2 2 ‖bn2 +∇2u− wn‖22. This is solved through its optimality condition. In the continuous setting that results in a fourth order linear PDE. In the discrete setting the optimality condition reads as follows: T ∗Tun+1 − λ1 (←− Dx(Dx(u n+1)) + ←− Dy(Dy(u n+1)) ) + λ2 (←−− Dxx(Dxx(u n+1)) + ←−− Dyy(Dyy(u n+1)) + 2 ←−− Dxy(Dxy(u n+1)) ) = f + λ1 (←− Dx(b n 1,1 − vn1 ) + ←− Dy(b n 1,2 − vn2 ) ) − λ2 (←−− Dxx(b n 2,11 − wn11) + ←−− Dyy(b n 2,22 − wn22) + 2 ←−− Dxy(b n 2,12 − wn12) ) , (3.73) where here T ∗ is the adjoint operator of T . Note that for the special cases we are consid- ering here we have in fact T = T ∗. Due to the periodic boundary conditions, the discrete differential operators act like circular convolution. The same holds for the operator T from our assumptions. Thus taking Fourier transforms in (3.73) gives FD · F(un+1) = F(Rn), where FD := F(T ∗) · F(T )− λ1 ( F(←−Dx(Dx)) + F(←−Dy(Dy))) ) + λ2 ( F(←−−Dxx(Dxx)) + F(←−−Dyy(Dyy)) + F(←−−Dxy(Dxy)) ) , and Rn is the righthand side of (3.73). Here “·” denotes pointwise multiplication of matrices. Thus we have un+1 = F−1 (F(Rn)/FD) , where “/” denotes pointwise division. Note that the term FD needs to be calculated only 81 The combined TV–TV2 approach for image reconstruction once at the beginning of the algorithm. Subproblem 2: Problem (3.68) has a closed form solution too. We have vn+1 = argmin v∈(RN×M )2 α‖v‖1 + λ1 2 ‖bn1 +∇un+1 − v‖22 = argmin v∈(RN×M )2 ∑ i,j α √ v1(i, j)2 + v2(i, j)2 + λ1 2 ( (bn1,1(i, j) +Dxu n+1(i, j)− v1(i, j))2 +(bn1,2(i, j) +Dyu n+1(i, j)− v2(i, j))2 )⇒ vn+1(i, j) = argmin (z1,z2)∈R2 ∑ i,j α √ z21 + z 2 2 + λ1 2 ( (bn1,1(i, j) +Dxu n+1(i, j)− z1)2 +(bn1,2(i, j) +Dyu n+1(i, j)− z2)2 ) . Setting sn(i, j) = (sn1 (i, j), s n 2 (i, j)) = ( bn1,1(i, j) +Dxu n+1(i, j), bn1,2(i, j) +Dyu n+1(i, j) ) ∈ R2 it is easy to check that vn+11 (i, j) = max ( |sn(i, j)| − α λ1 , 0 ) sn1 (i, j) |sk(i, j)| , vn+11 (i, j) = max ( |sn(i, j)| − α λ1 , 0 ) sn2 (i, j) |sk(i, j)| , with the convention that 0/0 = 0. The above operation on v is known as shrinkage. Subproblem 3: We have wn+1 = argmin w∈(RN×M )4 β‖w‖1 + λ2 2 ‖bn2 +∇2un+1 − w‖22. Working in the same way as in subproblem 2 and setting tn(i, j) = (tn1 (i, j), t n 2 (i, j), t n 3 (i, j), t n 3 (i, j)) = ( bn2,11(i, j) +Dxxu n+1(i, j), bn2,22(i, j) +Dyyu n+1(i, j), bn2,12(i, j) +D12u n+1(i, j), bn2,12(i, j) +Dxyu n+1(i, j) ) ∈ R4, we obtain wn+111 (i, j) = max ( |tn(i, j)| − β λ2 , 0 ) tn1 (i, j) |tn(i, j)| , wn+122 (i, j) = max ( |tn(i, j)| − β λ2 , 0 ) tn2 (i, j) |tn(i, j)| , wn+112 (i, j) = max ( |tn(i, j)| − β λ2 , 0 ) tn3 (i, j) |tn(i, j)| . 82 3.6. Applications to image denoising and deblurring We remark here that all three subproblems (3.68)–(3.70) are solved exactly. This is in contrast with several previous works on total variation minimisation using split Bregman [GO09, Pas12a, Pas12c, Pas12b] where the subproblem of the type (3.68) is solved ap- proximately with one iteration of Gauss–Seidel to the linear system that results from the optimality condition. There, the optimality condition leads to a second order linear PDE instead of fourth order. Even though, convergence is achieved using both methods, the FFT approach leads to a faster and a more robust convergence, see also the corresponding experimental discussion in Section 3.7.2. 
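To make the structure of one iteration concrete, the following is a minimal MATLAB sketch of (3.68)–(3.72) for the denoising case T = identity, combining the FFT solution of the first subproblem with the two shrinkage steps. All function and variable names are ours and the sketch assumes the periodic boundary conditions described above; it is an illustration, not the code used for the experiments that follow.

function u = tv_tv2_denoise_sb(f, alpha, beta, lambda1, lambda2, niter)
% Minimal split Bregman sketch for the discrete L2-TV-TV2 problem (3.63) with
% T = identity (denoising), following the subproblems (3.68)-(3.72).
% For deblurring with a circular convolution T one would replace the leading 1
% in FD by abs(Khat).^2 and f in the right hand side by T*f.
[N, M] = size(f);

% First and second order differences with periodic boundary conditions.
Dx   = @(u) circshift(u, [0 -1]) - u;          % forward difference along a row
Dy   = @(u) circshift(u, [-1 0]) - u;          % forward difference along a column
DxB  = @(u) u - circshift(u, [0 1]);           % backward difference (the operator <-Dx)
DyB  = @(u) u - circshift(u, [1 0]);
Dxx  = @(u) DxB(Dx(u));                        % centred second differences
Dyy  = @(u) DyB(Dy(u));
Dxy  = @(u) Dy(Dx(u));                         % mixed forward-forward difference
DxyB = @(u) DyB(DxB(u));                       % its adjoint (backward-backward)

% Fourier symbol of a circular (shift invariant) operator: apply it to an impulse.
e = zeros(N, M); e(1,1) = 1;
symb = @(op) real(fft2(op(e)));
FD = 1 - lambda1*(symb(@(z) DxB(Dx(z))) + symb(@(z) DyB(Dy(z)))) ...
       + lambda2*(symb(@(z) Dxx(Dxx(z))) + symb(@(z) Dyy(Dyy(z))) ...
                  + 2*symb(@(z) DxyB(Dxy(z))));   % symbol of the operator in (3.73)

u   = f;
v1  = zeros(N, M); v2  = v1;                   % v  ~ gradient variable
w11 = v1; w22 = v1; w12 = v1;                  % w  ~ Hessian variable (w12 = w21)
b11 = v1; b12 = v1;                            % Bregman variable b1
c11 = v1; c22 = v1; c12 = v1;                  % Bregman variable b2

for n = 1:niter
    % Subproblem 1, solved exactly with the FFT via the optimality condition (3.73).
    rhs = f + lambda1*(DxB(b11 - v1) + DyB(b12 - v2)) ...
            - lambda2*(Dxx(c11 - w11) + Dyy(c22 - w22) + 2*DxyB(c12 - w12));
    u = real(ifft2(fft2(rhs) ./ FD));

    % Subproblem 2: shrinkage of s = b1 + grad u.
    s1 = b11 + Dx(u);  s2 = b12 + Dy(u);
    mag = sqrt(s1.^2 + s2.^2);
    shr = max(mag - alpha/lambda1, 0) ./ max(mag, eps);   % convention 0/0 = 0
    v1 = shr .* s1;  v2 = shr .* s2;

    % Subproblem 3: shrinkage of t = b2 + Hessian u (mixed entry counted twice).
    t1 = c11 + Dxx(u);  t2 = c22 + Dyy(u);  t3 = c12 + Dxy(u);
    mag = sqrt(t1.^2 + t2.^2 + 2*t3.^2);
    shr = max(mag - beta/lambda2, 0) ./ max(mag, eps);
    w11 = shr .* t1;  w22 = shr .* t2;  w12 = shr .* t3;

    % Bregman updates (3.71)-(3.72).
    b11 = b11 + Dx(u)  - v1;   b12 = b12 + Dy(u)  - v2;
    c11 = c11 + Dxx(u) - w11;  c22 = c22 + Dyy(u) - w22;  c12 = c12 + Dxy(u) - w12;
end
end

A typical call for the synthetic denoising example below would be u = tv_tv2_denoise_sb(f, 0.06, 0.03, 1, 1, 300), using the empirical choice lambda1 = lambda2 = 1 discussed next.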
Note finally that different values of λ1 and λ2 can result in different convergence speeds. Even though it is not obvious a priori how to choose λ1 and λ2 in order to get optimal performance, this choice has to be made only once and a potential user does not have to worry about it. For the cases of denoising and deblurring we have found empirically that λ1 = λ2 = 1 give good results. For the case of inpainting, see the corresponding discussion in Section 3.7.4.

3.6 Applications to image denoising and deblurring

Image denoising

In this section we discuss the application of the TV–TV2 approach (3.50) to image denoising, i.e., when the operator T is the identity. We have performed experiments on images that have been corrupted by Gaussian noise, thus the L2 norm in the fidelity term is the most suitable. We compare our method with infimal convolution [CL97], also solved with a split Bregman scheme, and with total generalised variation [BKP10], solved with the primal-dual method of Chambolle and Pock [CP11] as described in [Bre14]. Among others, we present examples for α = 0, β > 0 and for α > 0, β = 0. Recall that for β = 0 our model corresponds to the classical ROF denoising model, while for α = 0 it corresponds to the pure TV2 restoration [BP10].

Our main assessment tool for the quality of the reconstructions is the structural similarity index SSIM [WBSS04, WB09]. The reason for this choice is that, in contrast to traditional quality measures like the peak signal-to-noise ratio PSNR, the SSIM index also assesses the conservation of the structural information of the reconstructed image. A justification for the choice of SSIM as a good fidelity measure instead of the traditional PSNR can be seen in Figure 3.3. Note that a perfect reconstruction has SSIM=1. Figures 3.3(c) and 3.3(d) show the denoising results of pure TV minimisation (β = 0). Figure 3.3(c) is the one with the highest SSIM value (0.6595) while the SSIM value of Figure 3.3(d) is significantly lower (0.4955). This assessment agrees with the human point of view since, even though this is subjective, one would consider Figure 3.3(c) to be the better reconstruction. On the other hand, Figure 3.3(c) has a slightly smaller PSNR value (14.02) than Figure 3.3(d) (14.63), which is the one with the highest PSNR value. Similar results are obtained for β > 0.

(a) Clean image. (b) Noisy image. (c) Best SSIM reconstruction. (d) Best PSNR reconstruction. Figure 3.3: Justification for the use of SSIM as an appropriate image quality assessment tool. The initial image (a) is contaminated with Gaussian noise of variance 0.5, shown in (b). We show the best SSIM-valued reconstruction (c) (SSIM=0.6595, PSNR=14.02) and the best PSNR-valued reconstruction (d) (SSIM=0.4955, PSNR=14.63) among images reconstructed with pure TV denoising (β = 0). The SSIM assessment agrees better with human perception.

For this section, we use a predefined number of iterations, 300 in most examples, as a stopping criterion for our algorithm. In fact, we observe that after 80–100 iterations the relative residual of successive iterates is of the order of 10−3 or lower (see also Table 3.1) and hence no noticeable change in the iterates is observed after that. Figure 3.4 depicts one denoising example, where the original image has been corrupted by Gaussian noise of variance 0.005. For better visualisation, we include the middle row slices of all reconstructions in Figure 3.5.
The highest SSIM value for TV denoising is achieved for α = 0.12 (SSIM=0.8979) while the highest one for TV–TV2 is achieved for α = 0.06, β = 0.03 (SSIM=0.9081). This is slightly better than infimal convolution (SSIM=0.9053). Note, however, that this optimal combination of α and β in terms of SSIM does not always correspond to the best visual result. In general, the latter corresponds to a slightly larger β than the one chosen by SSIM see Figure 3.4(h). Still, for proof of concept, we prefer to stick with an objective quality measure and SSIM, in our opinion, is the most reliable choice for that matter. In the image of Figure 3.4(h) the staircasing effect has almost disappeared and the image is still pleasant to the human eye despite being slightly more blurry. This slight blur, which is the price that the method pays for the removal of the staircasing effect, can be easily and efficiently removed in post processing using simple sharpening filters, e.g. in GIMP. We did that in Figure 3.4(i), also adjusting the constrast, achieving a very good result both visually and SSIM-wise (0.9463). The overall highest SSIM value is achieved by TGV (0.9249), producing a reconstruction of very good quality. However, TGV converges slowly to the true solution. In order to check that, we compute the ground truth solution (denoted by GT) for the parameters of the TGV problem that correspond to Figure 3.4(d), by taking a large amount of iterations (2000). We check the 84 3.6. Applications to image denoising and deblurring (a) Clean image, SSIM=1 (b) Noisy image, Gaussian noise, variance=0.005, SSIM=0.3261 (c) TV denoising, α=0.12, SSIM= 0.8979 (d) TGV denoising, SSIM=0.9249 (e) Infimal convolution denoising, SSIM=0.9053 (f) TV–TV2 denoising, α=0.06, β=0.03, SSIM=0.9081 (g) TV2 denoising, β=0.07, SSIM=0.8988 (h) TV–TV2 denoising, α=0.06, β=0.06, SSIM=0.8989 (i) TV–TV2 denoising, α=0.06, β=0.06, SSIM=0.9463, Post- processing: GIMP sharpening & contrast Figure 3.4: Denoising of a synthetic image that has been corrupted by Gaussian noise of variance 0.005. GPU time that is needed for the iterates to have a relative residual ‖un −GT‖2 ‖GT‖2 ≤ 10 −3, and we do the same for the TV–TV2 example of Figure 3.4(f). For TGV, it takes 1297 iterations (primal-dual method [CP11]) and 36.25 seconds while for TV–TV2 it takes 86 split Bregman iterations and 4.05 seconds, see Table 3.1. That makes our method more suitable for cases where fast but not necessarily optimal results are needed, e.g. video processing. In order to examine the quality of the reconstructions that are produced from each 85 The combined TV–TV2 approach for image reconstruction (a) Clean image, SSIM=1 (b) Noisy image, Gaussian noise, variance=0.005, SSIM=0.3261 (c) TV denoising, α=0.12, SSIM= 0.8979 (d) TGV denoising, SSIM=0.9249 (e) Infimal convolution denoising, SSIM=0.9053 (f) TV–TV2 denoising, α=0.06, β=0.03, SSIM=0.9081 (g) TV2 denoising, β=0.07, SSIM=0.8988 (h) TV–TV2 denoising, α=0.06, β=0.06, SSIM=0.8989 (i) TV–TV2 denoising, α=0.06, β=0.06, SSIM=0.9463, Post-processing: GIMP sharp- ening & contrast Figure 3.5: Corresponding middle row slices of images in Figure 3.4. method as the number of iteration increases, we have plotted the evolution of the SSIM values in Figure 3.6. In the horizontal axis, instead of the number of iterations, we put the absolute CPU time calculated by the product of the number of iterations times the CPU time per iteration as it is seen in Table 3.1. 
We observe that for TGV the SSIM value increases gradually with time, while for the methods solved with split Bregman the image quality peaks very quickly and then remains almost constant, except for TV, where the staircasing appears in the later iterations, leading eventually to a decrease of the SSIM value.

Figure 3.6: Evolution of the SSIM index with absolute CPU time (in seconds) for the examples of Figure 3.4. For TV denoising the SSIM value peaks after 0.17 seconds (0.9130) and then drops when the staircasing appears, see the corresponding comments in [GO09]. For TV–TV2 the peak appears after 1.08 seconds (0.9103) and remains essentially constant. The TGV iteration starts to outperform the other methods after 1.89 seconds. This shows the potential of split Bregman to produce visually satisfactory results before convergence has occurred, in contrast with the primal-dual method.

We also examine how the SSIM and PSNR values of the restored images behave as a function of the weighting parameters α and β. In Figure 3.7 we plot these values for α = 0, 0.02, 0.04, . . . , 0.3 and β = 0, 0.005, 0.01, . . . , 0.1. The plot suggests that both quality measures behave in a continuous way and have a global maximum. However, PSNR tends to rate higher those images that have been processed with a small value of β or even with β = 0, which is not the case for SSIM. An explanation for this is that higher values of β result in a further loss of contrast, see Figure 3.8, something that is penalised by the PSNR. Note, however, that the contrast can be recovered easily in a post-processing stage, while it is not an easy task to reduce the staircasing effect using conventional processing methods.

(a) SSIM values. (b) PSNR values. Figure 3.7: Plot of the SSIM and PSNR values of the restored image as functions of α and β, for the example of Figure 3.4. For display convenience all the values under 0.85 (SSIM) and 26 (PSNR) are coloured dark blue. The dotted cells correspond to the highest SSIM (0.9081) and PSNR (32.39) values, achieved for α = 0.06, β = 0.03 and α = 0.06, β = 0.005 respectively. Note that the first column in both plots corresponds to pure TV denoising (β = 0).

Figure 3.8: Middle row slices of reconstructed images (left) with α = 0.12, β = 0 (blue colour) and α = 0.12, β = 0.06 (red colour). Slices of the original image are plotted with black colour. A detail of the plot is provided in the right image. Even though the TV–TV2 method eliminates the staircasing effect, it also results in a further slight loss of contrast.

Finally, in Figure 3.9 we perform denoising of a natural image which has been corrupted by Gaussian noise of variance 0.005. The staircasing of TV denoising (SSIM=0.8168) is obvious in Figures 3.9(c) and (i).
The overall best performance of the TV–TV2 method (SSIM=0.8319) is achieved by choosing α = β = 0.017, Figure 3.9(d). However, one can get satisfactory results by choosing α = β = 0.023, eliminating further the stair- casing without blurring the image too much, compare for example the details in Figures 3.9(j) and 3.9(k). Image deblurring In our deblurring implementation T denotes a circular convolution with a discrete Gaussian kernel (σ = 2, size: 11 × 11 pixels). The blurred image is also corrupted by additive Gaussian noise of variance 10−4. Deblurring results are shown in Figure 3.10 and the corresponding middle row slices in Figure 3.11. As in the denoising case the introduction of the second order term with a small weight β decreases noticeably the staircasing effect, compare Figures 3.10(c) and 3.10(f). Moreover, we can achieve better visual results if we increase further the value of β without blurring the image significantly, Figure 3.10(g). Infimal convolution, does not give a satisfactory 88 3.6. Applications to image denoising and deblurring (a) Clean image, SSIM=1 (b) Noisy image, Gaussian noise, variance=0.005, SSIM=0.4436 (c) TV denoising, α=0.05, SSIM=0.8168 (d) TV–TV2 denoising, α=0.017, β=0.017, SSIM=0.8319 (e) TV–TV2 denoising, α=0.023, β=0.023, SSIM=0.8185 (f) TV2 denoising, β=0.05, SSIM=0.8171 (g) Clean image, SSIM=1 (h) Noisy image, Gaussian noise, variance=0.005, SSIM=0.4436 (i) TV denoising, α=0.05, SSIM=0.8168 (j) TV–TV2 denoising, α=0.017, β=0.017, SSIM=0.8319 (k) TV–TV2 denoising, α=0.023, β=0.023, SSIM=0.8185 (l) TV2 denoising, β=0.05, SSIM=0.8171 Figure 3.9: Denoising of a natural image that has been corrupted by Gaussian noise of variance 0.005. 89 The combined TV–TV2 approach for image reconstruction (a) Clean image , SSIM=1 (b) Blurred and noisy image, SSIM=0.8003 (c) TV deblurring, α=0.006, SSIM=0.9680 (d) TGV deblurring, SSIM=0.9806 (e) Infimal convolution deblurring, SSIM=0.9466 (f) TV–TV2 deblurring, α=0.004, β=0.0001, SSIM=0.9739 (g) TV–TV2 deblurring, α=0.004, β=0.0002, SSIM=0.9710 (h) TV2 deblurring, β=0.0012, SSIM=0.9199 Figure 3.10: Deblurring of a blurred (Gaussian kernel of variance σ = 2) and noisy (addi- tive Gaussian noise, variance 10−4) synthetic image. 90 3.6. Applications to image denoising and deblurring (a) Clean image , SSIM=1 (b) Blurred and noisy image, SSIM=0.8003 (c) TV deblurring, α=0.006, SSIM=0.9680 (d) TGV deblurring, SSIM=0.9806 (e) Infimal convolution deblurring, SSIM=0.9466 (f) TV–TV2 deblurring, α=0.004, β=0.0001, SSIM=0.9739 (g) TV–TV2 deblurring, α=0.004, β=0.0002, SSIM=0.9710 (h) TV2 deblurring, β=0.0012, SSIM=0.9199 Figure 3.11: Corresponding middle row slices of images in Figure 3.10. 91 The combined TV–TV2 approach for image reconstruction (a) Clean image, SSIM=1 (b) Blurred and noisy image, SSIM=0.7149 (c) TV deblurring, α=0.0007, SSIM=0.8293 (d) TV–TV2 deblurring, α=0.0005, β=0.0001, SSIM=0.8361 (e) TV–TV2 deblurring, α=0.0005, β=0.0003, SSIM=0.8307 (f) TV–TV2 deblurring, α=0.0005, β=0.0003, SSIM=0.8330, Post-processing: GIMP sharpening (g) Clean image, SSIM=1 (h) Blurred and noisy im- age, SSIM=0.7149 (i) TV deblurring, α=0.0007, SSIM=0.8293 (j) TV–TV2 deblurring, α=0.0005, β=0.0001, SSIM=0.8361 (k) TV–TV2 deblurring, α=0.0005, β=0.0003, SSIM=0.8307 (l) TV–TV2 deblurring, α=0.0005, β=0.0003, SSIM=0.8330, Post-processing: GIMP sharpening Figure 3.12: Deblurring of a blurred (Gaussian kernel of variance σ = 2) and noisy (addi- tive Gaussian noise, variance 10−4) natural image. 92 3.7. 
Applications to image inpainting and online demo in IPOL result here, Figure 3.10(e). TGV gives again the best qualitative result, Figure 3.10(d), but the computation takes about 10 minutes. Even though the time comparison is not completely fair here (the implementation described in [Bre14] does not use FFT) it takes a few thousands of iterations for TGV to deblur the image satisfactorily, in comparison with a few hundreds for the TV–TV2 method. In Figure 3.12 we show the performance of TV–TV2 method for deblurring a natural image. The best result for TV–TV2 (SSIM=0.8361) is achieved with α = 0.0005 and β = 0.0001, Figure 3.12(d). As in the case of denoising, one can increase the value of β slightly, reducing further the staircasing effect, Figures 3.12(e) and 3.12(k). The additional blur which is a result of the larger β can be controlled using a sharpening filter, Figures 3.12(f) and 3.12(l). Discussion In the case of denoising and deblurring we compared our method with TGV, which is con- sidered a state of the art method in the area of higher order image reconstruction methods in the variational context. Indeed, in both image reconstruction tasks, TGV gives better qualitative results, in terms of the SSIM index. However, the computational time that was needed to obtain the TGV result solved with the primal-dual method is significantly more than the one needed to compute the TV–TV2 method using split Bregman, see Table 3.1. We also show that with simple and fast post-processing techniques we can obtain results comparable with TGV. For these reasons, we think that the TV–TV2 approach is partic- ularly interesting for applications in which the speed of computation matters. Regarding the comparison with infimal convolution, our method is slightly faster and results in better reconstructions in deblurring while in denoising the results are comparable. 3.7 Applications to image inpainting and online demo in IPOL 3.7.1 Motivation and IPOL’s philosophy In this section we present examples of the application of our TV–TV2 approach to image inpainting. Recall that in image inpainting, the goal is to reconstruct an image inside a missing part D ⊆ Ω, the inpainting domain, using the information from the intact part. The corresponding minimisation problem in the discrete setting reads as follows min u∈RN×M 1 2 ‖XΩ\D(u− f)‖22 + α‖∇u‖1 + β‖∇2u‖1. (3.74) We note that in the inpainting task one wants to keep both values of α and β small, such that more weight is put on the fidelity term so it remains close to zero and hence 93 The combined TV–TV2 approach for image reconstruction Denoising (B&W) – Image size: 200×300 No of iterations for GT No of iterations for ‖uk −GT‖2/‖GT‖2 ≤ 10−3 CPU time (secs) time per iteration (secs) TV 2000 136 2.86 0.0210 TV2 2000 107 3.62 0.0338 TV–TV2 2000 86 4.05 0.0471 Inf.-Conv. 2000 58 3.33 0.0574 TGV 2000 1297 36.25 0.0279 Deblurring (B&W) – Image size: 200×300 No of iterations for GT No of iterations for ‖uk −GT‖2/‖GT‖2 ≤ 10−2 CPU time (secs) time per iteration (secs) TV 1000 478 10.72 0.0257 TV2 1000 108 3.64 0.0337 TV–TV2 1000 517 25.47 0.0493 Inf.-Conv. 1000 108 7.47 0.0692 TGV CPU time more than 10 minutes 1.22 Table 3.1: Computational times for the examples of Figures 3.4 and 3.10. We computed a ground true solution (GT) for every method by taking a large number of iterations and record the number of iterations and CPU time it takes for the relative residual between the iterates and the ground true solution to fall below a certain threshold. 
The TGV examples were computed using σ = τ = 0.25 in the primal-dual method described in [Bre14]. The implementation was done using MATLAB in a Macbook 10.7.3, 2.4 GHz Intel Core 2 Duo and 2 GB of memory. it essentially holds u = f in Ω \ D. As we are going to see later on, connectivity along large gaps in the inpainting domain is essentially achieved only with pure TV2 inpainting (α = 0). However, we keep on using this TV–TV2 combination in order to keep the method flexible, regarded as a superset of pure TV and pure TV2 inpainting. In this way, we can study the effect that each term has on the inpainting image. As we have already mentioned, our motivation here is that the kind of regularisation given in (3.74) has the ability to connect large gaps in the inpainting domain with the price of some blur, see Figure 3.13. There, the task is to inpaint a large gap in a black stripe. (a) Inpainting domain: grey area (b) Harmonic inpaint- ing (c) TV inpainting (d) TV2 inpainting Figure 3.13: Illustration of the connectivity property of TV2 inpainting. Harmonic inpainting, i.e., regularising with ‖∇u‖22, achieves no connectivity, producing a 94 3.7. Applications to image inpainting and online demo in IPOL rather smooth result, Figure 3.13(b). Pure TV inpainting is not able to make a connection at all, Figure 3.13(c) but pure TV2 inpainting is able to connect the large gap with the price of some blur. As we will see in Section 3.7.5, this connectivity depends also on the size and geometry of the inpainting domain. We will next present an algorithm in order to solve (3.74). We also provide a source code for that algorithm written in C and an online demonstration in Image Processing Online (IPOL) accessible on the online version of [PSS13], http://dx.doi.org/10.5201/ ipol.2013.40. IPOL is a peer reviewed online research journal specialising on image processing algorithms. Its philosophy is that each article consists of a standard text describing an image processing algorithm and also its source code, along with an online demonstration in a provided platform. Everyone has access to the code and the online demonstration and can test by themselves what each method does (open access software). In Figure 3.14 we provide two screenshots from the IPOL website that show the structure (a) Setting D and the parameters α, β. (b) Showing the inpainting result. Figure 3.14: IPOL’s online demonstration platform. of the online demonstration. The user can choose the parameters α and β as well as the inpainting domain D. There is also an opportunity for the users to upload their own image. Every test of the method is stored in the “Archive” section of the platform. Each algorithm should be able to compute the result in at most 30 seconds for images of resolution up to 500× 500 pixels. Thus, we need a fast and efficient algorithm in order to solve (3.74) which we present in the next section. 3.7.2 Split Bregman for TV–TV2 inpainting The algorithm for TV–TV2 inpainting is also based on a splitting technique but it is slightly different than the one used for denoising and deblurring. As in the latter cases, we first transform the unconstrained minimisation problem (3.74) into a constrained one, formulate the corresponding Bregman iteration algorithm and then we minimise alternatingly with 95 The combined TV–TV2 approach for image reconstruction respect to all the different variables. 
Note that (3.74) is equivalent to min u∈RN×M , u˜∈RN×M v∈(RN×M )2, w∈(RN×M )4 ‖XΩ\D(u−f)‖22+α‖v‖1+β‖w‖1, such that u = u˜, v = ∇u˜, w = ∇2u˜. (3.75) The introduction of the auxiliary variable u˜ is crucial since it allows the use of fast Fourier transform, leading to a fast implementation see Remark in page 97. The Bregman iteration that corresponds to (3.75) is (un+1, u˜n+1, vn+1, wn+1) = argmin u,u˜,v,w ‖XΩ\D(u− f)‖22 + α‖v‖1 + β‖w‖1 + λ0 2 ‖bn0 + u˜− u‖22 + λ1 2 ‖bn1 +∇u˜− v‖22 + λ2 2 ‖bn2 +∇2u˜− w‖22, (3.76) bn+10 = b n 0 + u˜ n+1 − un+1, bn+11 = b n 1 +∇u˜n+1 − vn+1, bn+12 = b n 2 +∇2u˜n+1 − wn+1. Solving (3.76) approximately by minimising alternatingly with respect to u, u˜, v and w, leads to the split Bregman algorithm for TV–TV2 inpainting. We also refer the reader to the works by Getreuer in IPOL [Pas12a, Pas12b, Pas12c] which use the split Bregman method for TV denoising, deblurring and inpainting respectively. In our case the split Bregman formulation reads as follows: Split Bregman for L2 −TV −TV2 inpainting – Greyscale images Subproblem 1: un+1 = argmin u∈RN×M 1 2 ‖XΩ\D(u− f)‖22 + λ0 2 ‖bn0 + u˜n − u‖22 (3.77) Subproblem 2: u˜n+1 = argmin u˜∈RN×M λ0 2 ‖bn0 + u˜− un+1‖22 + λ1 2 ‖bn1 +∇u˜− vn‖22 (3.78) + λ2 2 ‖bn2 +∇2u˜− wn‖22, Subproblem 3: vn+1 = argmin v∈(RN×M )2 α‖v‖1 + λ1 2 ‖bn1 +∇u˜n+1 − v‖22, (3.79) Subproblem 4: wn+1 = argmin w∈(RN×M )4 β‖w‖1 + λ2 2 ‖bn2 +∇2u˜n+1 − w‖22, (3.80) Update on b0: b n+1 0 = b n 0 + u˜ n+1 − un+1, (3.81) Update on b1: b n+1 1 = b n 1 +∇u˜n+1 − vn+1, (3.82) 96 3.7. Applications to image inpainting and online demo in IPOL Update on b2: b n+1 2 = b n 2 +∇2u˜n+1 − wn+1. (3.83) Note that problem (3.78) is solved through its optimality condition, using FFT, exactly like the corresponding problem (3.68). Similarly, problems (3.79) and (3.80) are solved via shrinkage, like problems (3.69) and (3.70). Lastly, it is easy to show that problem (3.77) has a closed form solution, so it can be solved very fast as well. Indeed, we have, un+1 = argmin u∈RN×M 1 2 ‖XΩ\D(u− f)‖22 + λ0 2 ‖bn0 + u˜n − u‖22 = argmin u∈RN×M ∑ i,j XΩ\D(u(i, j)− f(i, j))2 + λ0 2 (bn0 (i, j) + u˜ n(i, j)− u(i, j))2 ⇒ un+1(i, j) = argmin z∈R XΩ\D(z − f(i, j))2 + λ0 2 (bn0 (i, j) + u˜ n(i, j)− z)2 . It is trivial to calculate the solution of the last minimisation problem: un+1(i, j) = ( 2XΩ\D(i, j) λ0 + 2 ) f(i, j)+ ( λ0 + 2(1−XΩ\D(i, j)) λ0 + 2 ) (bn0 (i, j)+ u˜ n(i, j)). (3.84) Remark: One can notice here the utility of the auxiliary variable u˜. Without this variable, the minimisation problem (3.77) would be absent and the problem (3.78) would have the form un+1 = argmin u∈RN×M ‖XΩ\D(u− f)‖22 + λ1 2 ‖bn1 +∇u− vn‖22 + λ2 2 ‖bn2 +∇2u− wn‖22. (3.85) However, we cannot take advantage of the fast Fourier transform in order to solve the optimality condition of (3.85) since in general F(XΩ\D) 6= XΩ\DF(u). An alternative approach would be to solve (3.85) approximately, using one or more iterations of Gauss– Seidel to the linear system that corresponds to its optimality condition as it is done in [GO09, Pas12a, Pas12c, Pas12b]. That would reduce the number of subproblems to be solved but it would have the disadvantage that one of them is not solved exactly. However, we prefer to introduce this auxiliary variable u˜ increasing the number of subproblems to four but being able to solve them exactly. In order to justify that, see Figure 3.15 for a quantitative comparison of the two approaches. 
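Before turning to that comparison, and purely as an illustration (with our own naming; this is not the IPOL C code), the pointwise update (3.84) can be written in vectorised form as follows, where chi is the characteristic function of Ω \ D stored as an image of ones and zeros:

% Closed-form solution (3.84) of the data subproblem, computed pointwise.
% chi is 1 on the intact region Omega\D and 0 on the inpainting domain D, so on D
% the update reduces to u = b0 + utilde and the data term has no influence there.
update_u = @(f, utilde, b0, chi, lambda0) ...
    (2*chi / (lambda0 + 2)) .* f ...
  + ((lambda0 + 2*(1 - chi)) / (lambda0 + 2)) .* (b0 + utilde);

The remaining subproblems (3.78)–(3.80) are handled exactly as in the denoising case, with the FFT and shrinkage respectively. Returning now to the comparison of the two splitting strategies in Figure 3.15: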
There, we perform TV–TV2 inpainting on the image of Figure 3.15(a) using both algorithms i.e., (3.68)–(3.72) using one iteration of Gauss-Seidel for (3.68) and (3.77)–(3.83) using FFT in (3.77). For each approach, we plot the relative error ‖un −GT‖2 ‖GT‖2 , where GT denotes a ground solution which is taken to be an iterate un for a large enough 97 The combined TV–TV2 approach for image reconstruction (a) Original image. (b) Inpainting domain (c) Inpainted image with α = β = 0.001. 0 200 400 600 800 1000 1200 1400 1600 1800 2000 10−10 10−8 10−6 10−4 10−2 100 No of iterations R e la t iv e e r r o r b e tw e e n it e r a t e s -s o lu t io n FFT GS, λ 1 = λ 2 = 0.01 GS, λ 1 = λ 2 = 0.001 GS, λ 1 = λ 2 = 0.005 GS, λ 1 = λ 2 = 0.05 (d) Plot of the relative error ‖uk −GT‖2/‖GT‖2 versus the number of iterations. Here sol denotes a ground solution which is taken to be the solution un for a large enough n (3000). Figure 3.15: Comparison of the efficiency of the use of fast Fourier transform for the solution of (3.78) over one iteration of Gauss–Seidel for the solution of (3.85). n (3000). For the FFT approach, the choice of λ’s is done according to the rule (3.105) presented in Section 3.7.4, while for the Gauss–Seidel the selection of λ’s is shown in Figure 3.15(d). As Figure 3.15(d) suggests (and also observed empirically) solving the linear subproblem with FFT leads to a faster convergence. Using more than one iterations of Gauss-Seidel does not improve the situation as it increases further the computational time, see also the corresponding discussion in [GO09]. 3.7.3 The colour image case So far we have only considered greyscale images, i.e., u ∈ Rn×m. However it is easy to derive a version of the algorithm (3.77)–(3.83) for colour images, that is to say when u ∈ (R3)N×M , where for simplicity we only consider RGB images. Starting from the continuous setting, if u = (u1, u2, u3) = (u, u, u) ∈ BV(Ω,R3), then recall that the total variation of u is defined as TV(u) = sup { 3∑ a=1 ˆ Ω uadivva dx : v ∈ C1c (Ω,R3×2), ‖v‖∞ ≤ 1 } . (3.86) 98 3.7. Applications to image inpainting and online demo in IPOL As in the scalar case, if u ∈W 1,1(Ω,R3), then TV(u) = ˆ Ω |∇u| dx. (3.87) Analogously we define TV2(u). In the discrete setting the first and second order total variation have the form TV(u) = ‖(∇u,∇u,∇u)‖1, (3.88) TV2(u) = ‖(∇2u,∇2u,∇2u)‖1. (3.89) The minimisation problem corresponding to (3.74) now reads min u∈(R3)N×M ‖XΩ\D(u− f)‖22 + α‖∇u‖1 + β‖∇2u‖1. (3.90) It is not hard to check that the only difference to the minimisation algorithm of the greyscale case lies in the subproblems (3.79) and (3.80). This is the only point where the colours are “mixed”. Subproblems (3.77), (3.78) and the updates (3.81)–(3.83) are done separately for each channel. This favours parallelisation of the algorithm. Thus the split Bregman algorithm for colour image inpainting reads as follows: Split Bregman for L2 −TV −TV2 inpainting – colour images Subproblem 1: Solve (3.77) separately for un+1, un+1, un+1, (3.91) Subproblem 2: Solve (3.77) separately for u˜n+1, u˜n+1, u˜n+1, (3.92) Subproblem 3: vn+1 = argmin v∈((RN×M )2)3 α‖v‖1 + λ1 2 ‖bn1 +∇u˜n+1 − v‖22, (3.93) Subproblem 4: wn+1 = argmin w∈((RN×M )4)3 β‖w‖1 + λ2 2 ‖bn2 +∇2u˜n+1 −w‖22, (3.94) Update on b0: Do update (3.81) separately for b n+1 0 , b n+1 0 , b n+1 0 , (3.95) Update on b1: Do update (3.82) separately for b n+1 1 , b n+1 1 , b n+1 1 , (3.96) Update on b2: Do update (3.83) separately for b n+1 2 , b n+1 2 , b n+1 2 . 
(3.97) 99 The combined TV–TV2 approach for image reconstruction As far as the solution of (3.93) is concerned, similarly as before we set sn(i, j) = (sn1 (i, j), s n 2 (i, j), s n 1 (i, j), s n 2 (i, j), s n 1 (i, j), s n 2 (i, j)) = ( bn1,1(i, j) +D1u˜ n+1(i, j), bn1,2(i, j) +D2u˜ n+1(i, j), bn1,1(i, j) +D1u˜ n+1(i, j), bn1,2(i, j) +D2u˜ k+1(i, j), bn1,1(i, j) +D1u˜ n+1(i, j), bn1,2(i, j) +D2u˜ n+1(i, j) ) , and we can easily compute vn+1x (i, j) = max ( |sn(i, j)| − α λ1 , 0 ) sn1 (i, j) |sn(i, j)| , (3.98) vn+1y (i, j) = max ( |sn(i, j)| − α λ1 , 0 ) sn2 (i, j) |sn(i, j)| , (3.99) vn+1x (i, j) = max ( |sn(i, j)| − α λ1 , 0 ) sn1 (i, j) |sn(i, j)| , (3.100) vn+1y (i, j) = max ( |sn(i, j)| − α λ1 , 0 ) sn2 (i, j) |sn(i, j)| , (3.101) vn+1x (i, j) = max ( |sn(i, j)| − α λ1 , 0 ) sn1 (i, j) |sn(i, j)| , (3.102) vn+1y (i, j) = max ( |sn(i, j)| − α λ1 , 0 ) sn2 (i, j) |sn(i, j)| . (3.103) Analogously, we compute the solution of (3.94). We finally note that an analogous gen- eralisation to the colour case can be done for the denoising and deblurring algorithm (3.68)–(3.72). 3.7.4 Stopping criteria and the selection of λ’s Establishing stopping criteria for our algorithm is necessary for the online implementation. After extensive experimentation, we have found empirically that a stopping criterion de- pending on the relative residue of the iterates un is an appropriate one for the termination of the algorithm. Thus, the iterations are terminated when max { |uk − uk−1| |uk−1| , |uk − uk−1| |uk−1| , |uk − uk−1| |uk−1| } ≤ 8 · 10−5. (3.104) The values of the parameters λ0, λ1 and λ2 have to be chosen carefully. This selection is done automatically in the online demo. All the combinations eventually lead to con- vergence but the number of iterations that are needed for convergence differ dramatically for different combinations. Note that in the case of denoising and deblurring, only two parameters have to be specified, λ1 and λ2 instead of three here. This makes the choice a more difficult task as they have to be chosen such that they balance each other in an 100 3.7. Applications to image inpainting and online demo in IPOL optimal way within the problems (3.77)–(3.80). Empirically we have found that λ1 and λ2 have to be one or two orders of magnitude larger than α and β respectively. The value of λ0 should not be much larger than α and β as that leads to a slow convergence. Here we choose the following values of these parameters, a combination that has shown to ensure fast convergence: λ1 = 10α, λ2 = 10β, λ0 = max{α, β}. (3.105) Table 3.2 and Figure 3.16 justify this empirical choice. In Figure 3.16, we perform pure TV2 inpainting to the image of Figure 3.15 for different choices of λ0 and λ2 and fixed β = 0.001. In each case we plot the relative error between the iterates and the solution. The solution is taken to be a large iterate (n = 4000) that is obtained with the choice (3.105). What we generally observe can be seen in Table 3.2. The choice (3.105) leads to a fast convergence with small oscillatory behaviour in the iterates, see black line in Figure 3.16. Choice of λ’s Convergence to solution Qualitative behaviour λ0 = β, λ2 = 10β Fast Small oscillations λ0 = β, λ2  10β Fairly fast Smooth transition λ0 = β, λ2  10β Very slow Wild oscillations λ0  β, λ2 = 10β Very slow Smooth transition λ0  β, λ2 = 10β Slow Smooth transition Table 3.2: Different behaviours of the iterates for different choices of the λ parameters. 
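For completeness, here is a minimal sketch of the stopping test (3.104) as a small helper function; the per-channel Frobenius norm is our choice, since the text leaves the norm in (3.104) implicit, and the λ's themselves are set once before iterating via rule (3.105).

function done = tvtv2_converged(u, uprev, tol)
% Stopping test in the spirit of (3.104): the largest relative change of the
% iterates over the colour channels (a single channel for greyscale images).
% The per-channel Frobenius norm is our choice; (3.104) leaves the norm implicit.
% The auxiliary parameters are set once, before iterating, via rule (3.105):
%   lambda1 = 10*alpha;  lambda2 = 10*beta;  lambda0 = max(alpha, beta);
if nargin < 3, tol = 8e-5; end
done = true;
for c = 1:size(u, 3)
    num = norm(u(:,:,c) - uprev(:,:,c), 'fro');
    den = max(norm(uprev(:,:,c), 'fro'), eps);
    if num / den > tol
        done = false;
    end
end
end

The iteration is then terminated the first time this test returns true.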
We refer the reader to the Appendix of [PSS13] for some further experimental dis- cussion on the effect of different choices of λ’s. We should note here that the stopping criterion (3.104) works well only in combination with the choices (3.105). Some choices of λ’s lead to such a slow convergence that even though criterion (3.104) might be satisfied, the gap in the inpainting domain may not be filled in yet. 3.7.5 Inpainting examples In this section we give some inpainting examples of the TV–TV2 method. Mainly we want to emphasise the differences in practice between pure first order (β = 0) and pure second order (α = 0) total variation inpainting. In Figure 3.17 we see a first example where we try to remove a large font text from a natural image. The TV2 result seems more pleasant to the human eye (subjective) as TV inpainting produces a piecewise constant reconstruction in the inpainting domain. This is confirmed quantitatively by the PSNR values inside the inpainting domain, as these are 101 The combined TV–TV2 approach for image reconstruction 0 200 400 600 800 1000 1200 1400 1600 1800 2000 10−10 10−8 10−6 10−4 10−2 100 No of iterations R e la t iv e e r r o r b e t w e e n it e r a t e s - s o lu t io n λ 2 = 10 −2 , λ 0 = 10 −3 λ 2 = 10 −1 , λ 0 = 10 −3 λ 2 = 10 −4 , λ 0 = 10 −3 λ 2 = 10 −2 , λ 0 = 10 −1 λ 2 = 10 −2 , λ 0 = 10 −5 Figure 3.16: Plot of the relative error of the iterates and the solution (pure TV2 inpainting of the image in Figure 3.15) for different choices of the parameters λ0 and λ2. The value of β is 0.001 and the solution is taken to be the 4000–th iterate obtained with λ0 = 0.001, λ2 = 0.01, see black line. We observe that other choices of λ’s generally lead to a slower convergence. (a) Original image with text (b) Pure TV inpainting, α=0.001 (c) Pure TV2 inpainting, β=0.001 (d) Detail of original image (e) Detail of TV inpainting (f) Detail of TV2 inpainting Figure 3.17: Removing large font text from an image. Even though this is subjective, we believe that the pure TV2 result looks more realistic since the piecewise constant TV re- construction is not desirable for natural images, see for example the inpainting of the letter “T”. The PSNR inside the inpainting domain for the three channels is (34.19, 34.97, 36.99) for the TV inpainting and (34.74, 35.34, 37.20) for the TV2 inpainting. higher for the TV2 inpainting, see the caption of Figure 3.17. In Figure 3.18 we see a second example with colour stripes. Here we include also 102 3.7. Applications to image inpainting and online demo in IPOL (a) Original image. (b) Harmonic inpainting, α=0.001 (c) Pure TV inpainting, α=0.001 (d) Pure TV2 inpaint- ing, β=0.001 (e) Detail of TV inpainting (f) Detail of TV2 inpainting (g) Detail of harmonic inpainting (h) Detail of TV2 inpainting Figure 3.18: colour stripes inpainting. Observe the difference between TV and TV2 in- painting in the light blue stripe at the left, Figures (e) and (f), where in the TV2 case the blue colour is propagated inside the inpainting domain. a harmonic inpainting example. As expected, in the case of TV inpainting the image is piecewise constant inside the inpainting domain, see Figure 3.18(c). This produces a desirable result in stripes whose width is larger than the size of the gap, connecting the gap and preserving the edges there, while TV2 inpainting, Figure 3.18(d), adds some additional blur at the edge of the stripe that belongs to the inpainting domain. 
However, TV inpainting fails to connect the thin stripes while TV2 propagates the right colour more reliably, see Figures 3.18(e) and 3.18(f). Finally, notice the difference between harmonic and TV2 inpainting in terms of connectivity, as the example of the yellow stripe of Figures 3.18(g) and 3.18(h) indicates. Let us note here that Euler’s elastica is also able of achieving connection along large gaps but even its fast implementation in [THC11] is more than ten times slower than our method, see the corresponding tables in that paper. We also want to examine how the inpainting result behaves while we are moving from pure TV to pure TV2 inpainting in a continuous way. We can see this in Figure 3.19, where, keeping the geometry of the inpainting domain fixed, we see how connectivity is achieved while we are making the transition from pure TV to pure TV2. The parameters α and β vary as follows: α : 0.01→ 0.008→ 0.006→ 0.005→ 0.004→ 0.002→ 0, 103 The combined TV–TV2 approach for image reconstruction (a) Inpainting domain (b) α=0, β=0.01 (c) α=0.008, β=0.002 (d) α=0.006, β=0.004 (e) α=0.005, β=0.005 (f) α=0.004, β=0.006 (g) α=0.002, β=0.008 (h) α=0, β=0.01 Figure 3.19: Inpainting of a stripe with a large gap. Transition from pure TV to pure TV2. Connectivity is achieved only for large ratio β/α, see Figures (g) and (h). No connectivity is obtained when the weights α and β are equal, see Figure (e). β : 0 → 0.002→ 0.004→ 0.005→ 0.006→ 0.008→ 0.01. As we see in Figure 3.19, the large gap in the stripe is connected only for large values of β and small values of α. In the case where both weights α and β are equal we observe no connection. Finally we want to show the influence that the inpainting domain has on large gap con- nectivity. As we saw, in Figures 3.13(d) and 3.19(h), pure TV2 inpainting has the ability to connect large gaps along the inpainting domain. However, as numerical experiments have shown, the quality of the connection depends on the geometry of the inpainting domain D, see Figures 3.20 and 3.21. In particular, for the broken stripe example of Figures 3.20 and 3.21, the domain must extend above, underneath, left and right of the gap between the two parts of the stripe. For example in Figure 3.20(e), where the inpainting domain is just the area between the two parts, we do not have connection, see Figure 3.20(j), nor we have when we extend the domain only in the vertical direction, Figures 3.21(e) and (j). In conclusion, TV–TV2 inpainting, especially its pure TV2 version, is a fast inpainting method that is capable of producing visually pleasing results. Its computational cost is not much higher than the one needed for TV inpainting and its connectivity properties make it in our opinion an appealing method to use, as far as local inpainting is concerned. 104 3.7. Applications to image inpainting and online demo in IPOL (a) Domain 1 (b) Domain 2 (c) Domain 3 (d) Domain 4 (e) Domain 5 (f) Domain 1, TV2 inpainting (g) Domain 2, TV2 inpainting (h) Domain 3, TV2 inpainting (i) Domain 4, TV2 inpainting (j) Domain 5, TV2 inpainting Figure 3.20: Different pure TV2 inpainting results for different inpainting domains of decreasing height. In all computations we set β = 0.001. (a) Domain 1 (b) Domain 6 (c) Domain 7 (d) Domain 8 (e) Domain 9 (f) Domain 1, TV2 inpainting (g) Domain 6, TV2 inpainting (h) Domain 7, TV2 inpainting (i) Domain 8, TV2 inpainting (j) Domain 9, TV2 inpainting Figure 3.21: Different pure TV2 inpainting results for different inpainting domains of decreasing width. 
In all computations we set β = 0.001.

Chapter 4

Exact solutions of the one dimensional TGV regularisation problem

4.1 Introduction

As we have already seen in the introduction of the thesis, the total generalised variation of second order is a high quality regulariser that incorporates second order derivatives. In its most general form it incorporates derivatives up to k-th order, see [BKP10]. Given a function u ∈ L^1(Ω) with Ω ⊆ R^d open and bounded, its total generalised variation of order k is defined as
\[
\mathrm{TGV}^{k}_{\alpha}(u) = \sup\left\{ \int_{\Omega} u\, \mathrm{div}^{k} v\, dx \,:\, v \in C^{k}_{c}(\Omega, \mathrm{Sym}^{k}(\mathbb{R}^{d})),\ \|\mathrm{div}^{i} v\|_{\infty} \le \alpha_{i},\ i = 0,\ldots,k-1 \right\}, \quad (4.1)
\]
where α = (α_0, …, α_{k−1}) is a multi-index with strictly positive entries and Sym^k(R^d) denotes the space of symmetric tensors of order k with arguments in R^d. However, regarding its use in regularisation problems of the type
\[
\min_{u}\ \Phi(Tu, f) + \mathrm{TGV}^{k}_{\alpha}(u),
\]
only the case k = 2 is typically considered. This is because of the increased computational cost of minimising TGV^k_α(u) for k > 2, and also because no improvement of the reconstructed image significant enough to justify this additional cost has been observed. Note also that, as follows easily from the definition, TGV^1_α(u) = αTV(u).

We thus focus on the second order total generalised variation, which can be written in a more familiar way as
\[
\mathrm{TGV}^{2}_{\beta,\alpha}(u) = \sup\left\{ \int_{\Omega} u\, \mathrm{div}^{2} v\, dx \,:\, v \in C^{2}_{c}(\Omega, S^{d\times d}),\ \|v\|_{\infty} \le \beta,\ \|\mathrm{div}\, v\|_{\infty} \le \alpha \right\}, \quad (4.2)
\]
where α, β > 0 and S^{d×d} is the space of d × d symmetric matrices.

As we mentioned in the introduction, the use of TGV in variational image reconstruction problems is becoming increasingly popular, due to the high quality of the restoration results. Like TV, it has the ability to preserve edges in the reconstructed image and at the same time it avoids the creation of the staircasing effect. Our motivation for this chapter is to gain more understanding of these properties of TGV by computing exact solutions of the one dimensional TGV denoising problem with L2 fidelity term,
\[
\min_{u}\ \frac{1}{2}\int_{\Omega} (u-f)^{2}\, dx + \mathrm{TGV}^{2}_{\beta,\alpha}(u), \quad (4.3)
\]
for simple data functions f. We examine under which conditions discontinuities are preserved and what the role of the parameters α and β is. The data functions we consider are a piecewise constant function f_{p.c.}, a piecewise affine function f_{p.a.} and a hat function f_{hat}, see Figure 4.1. The rationale is that by choosing as data the functions f_{p.c.} and f_{p.a.} we can study the preservation of jumps and piecewise affinity in the solutions, while by choosing f_{hat} we examine the effect of TGV on local extrema.

[Figure 4.1: Data functions for which we compute exact solutions of the L2–TGV^2_{β,α} denoising problem: (a) the piecewise constant function f_{p.c.} with jump of size h, (b) the piecewise affine function f_{p.a.} with slope λ and a jump of size h, (c) the hat function f_{hat} with slopes ±λ and height λL.]

4.1.1 Organisation of the chapter

We start by describing some basic properties of the TGV functional in Section 4.2, including a useful equivalent definition that we are going to use for the rest of the chapter.

In Section 4.3 we formulate the one dimensional L2–TGV^2_{β,α} denoising problem. We identify its predual problem, for which we prove well-posedness, and derive the corresponding optimality conditions using Fenchel–Rockafellar duality theory, as described in Section 2.5.2.

Section 4.4 deals with a description of the properties of the solutions.
More specifically, in Section 4.4.1, the structure of solutions is examined with emphasis on piecewise affinity, the preservation of discontinuities, behaviour near and away of the boundary of Ω and near jump points. In Section 4.4.2 we prove some interesting facts about the L2–linear regression of the data. Among other results, we provide certain thresholds for the param- eters α and β, above which, the resulting solution is the L2–linear regression of the data f . Section 4.4.3 starts with some remarks on the inheritance of the data’s symmetry to the solution. We proceed with proving the equivalence of TGV and TV regularisation, at least for even data, under some assumptions on α and β. In Section 4.5 we extend the latter result to dimension two. In particular, we prove that for symmetric enough data functions f , the two dimensional TGV and TV regularisations coincide for large enough ratio β/α and we also provide some numerical examples that justify this result. The computation of exact solutions for simple data functions f is done in Section 4.6. In Section 4.6.1 exact solutions are computed, taken as data a piecewise constant function with a single jump. We show that only four types of solutions are possible and we show for which combinations of the parameters α and β we have each kind of solution. Having these results as a basis, we determine straightforwardly the solutions for a piecewise affine function with a single jump, in Section 4.6.2. In Section 4.6.3 we compute exact solutions for a hat (absolute value–type of) function and also provide the corresponding combinations for α and β that lead to each type of solutions. As in the previous case, only four different types of solutions can occur. Finally, in Section 4.6.4 we provide some numerical experiments. We show that the exact solutions coincide with the numerical ones in the absence of noise, as expected. We also numerically compute some solutions with noisy data, in which we observe only a slight deviation from the corresponding solutions with clean data. 4.2 Basic properties of TGV In this section we recall some of the basic properties of the TGV functional as these were proved in [BKP10, BV11, BKV13]. Proposition 4.2.1. Let Ω ⊆ Rd be open and bounded. Then the following facts hold: (i) TGV2β,α is a seminorm on the Banach space BGV2β,α(Ω) = { u ∈ L1(Ω) : TGV2β,α(u) <∞ } , where ‖u‖BGV2β,α(Ω) := ‖u‖L1(Ω) + TGV 2 β,α(u). (ii) TGV2β,α(u) = 0 if and only of u is a polynomial of degree less than 2. 109 Exact solutions of the one dimensional TGV regularisation problem (iii) TGV2β,α is rotationally invariant. (iv) TGV2β,α is a proper, convex, lower semicontinuous functional on every L p(Ω), 1 ≤ p <∞. (v) If Ω has Lipschitz boundary then there exist constants 0 < c < C that depend only on Ω, such that for every u with TGV2β,α(u) <∞ we have c‖u‖BV(Ω) ≤ ‖u‖L1(Ω) + TGV2β,α(u) ≤ C‖u‖BV(Ω). The statement (v) of Proposition 4.2.1 actually says that BGV2β,α(Ω) and BV(Ω) are topo- logically equivalent Banach spaces. Thus, the minimisation of the L2–TGV2β,α problem is done over BV(Ω). We now present an alternative definition of TGV2β,α that was proved in [BKV13]. Proposition 4.2.2. Let u ∈ L1(Ω). Then TGV2β,α(u) = min w∈BD(Ω) α‖Du− w‖M + β‖Eu‖M, (4.4) where BD(Ω) is the space of functions of bounded deformation, i.e., the space of all func- tions in L1(Ω,Rd) such that the symmetrised gradient Eu can be represented by a finite Radon measure. 
In one dimension we have BD(Ω) = BV(Ω) and E(u) = Du, thus (4.4) can be written as TGV2β,α(u) = min w∈BV(Ω) α‖Du− w‖M + β‖Du‖M. (4.5) As we have mentioned in the introduction, (4.4) is called the minimum or primal definition of TGV2β,α and it is the one that we are going to use for our purposes, while (4.2) is called the supremum or predual definition. 4.3 Formulation of the problem and optimality conditions We are interested in studying the one dimensional second order TGV denoising problem with L2 fidelity term, i.e., min u∈BV(Ω) 1 2 ˆ Ω (u− f)2dx+ TGV2β,α(u), (4.6) where Ω = (a, b) is an open interval of R and f ∈ L2(Ω). Note that existence of solution for (4.6) follows from a simple application of the direct method of calculus of variations using the weak∗ compactness in BV(Ω) and the lower semicontinuity of TGV while uniqueness follows from the strict convexity of (4.6). Using the formulation (4.5), the problem (4.6) 110 4.3. Formulation of the problem and optimality conditions is equivalent to min u∈BV(Ω) w∈BV(Ω) 1 2 ˆ Ω (u− f)2dx+ α‖Du− w‖M + β‖Dw‖M. (P) Let us note here that while a solution (w, u) for P exists and uniqueness is guaranteed for u, the same is not true for w. In fact w is a solution to an L1–TV problem, which is not strictly convex: w ∈ argmin w∈BV(Ω) α‖Du− w‖M + β‖Dw‖M ⇐⇒ w ∈ argmin w∈BV(Ω) ‖Dau+Dsu− w‖M + β α |Dw|(Ω) ⇐⇒ w ∈ argmin w∈BV(Ω) ˆ Ω |u′ − w| dx+ β α TV(w). In order to study exact solutions of the problem P, we essentially follow [BKV13] where the corresponding L1 fidelity term case is studied. We identify the predual problem of P and derive the optimality conditions using Fenchel–Rockafellar duality. Consider the following problem sup {ˆ Ω fv′′dx− 1 2 ˆ Ω (v′′)2dx : v ∈ H20 (Ω), ‖v‖∞ ≤ β, ‖v′‖∞ ≤ α } . (P ′) We shall prove that P ′ is the predual problem of P. Firstly, we show that P ′ has a solution indeed. Proposition 4.3.1. The problem P ′ admits a unique solution in H20 (Ω). Proof. Because of the estimate ‖v‖∞ + ‖v′‖∞ ≤ C‖v‖H20 (Ω), ∀v ∈ H 2 0 (Ω), see [Eva10], we have that the set K = { v ∈ H20 (Ω) : ‖v‖∞ ≤ β, ‖v′‖∞ ≤ α } , is norm-closed and since it is convex, it is also weakly closed from Mazur’s theorem. Since, according to Proposition 2.5.2, (12‖ · ‖2L2(Ω))∗ = 12‖ · ‖2L2(Ω), we have that sup v∈L2(Ω) ˆ Ω fv dx− 1 2 ˆ Ω v2 dx = (1 2 ‖ · ‖2L2(Ω) )∗ (f) = 1 2 ‖f‖2L2(Ω). Thus the supremum in P ′ is finite and we denote it with supP ′. Consider now a maximising 111 Exact solutions of the one dimensional TGV regularisation problem sequence (vn)n∈N such that ˆ Ω fv′′ndx− 1 2 ˆ Ω (v′′n) 2dx→ supP ′, as n→∞. Then there exists a positive constant M such that∣∣∣∣ˆ Ω fv′′n dx− 1 2 ˆ Ω (v′′n) 2 dx ∣∣∣∣ ≤M, ∀n ∈ N. (4.7) We will show that the sequence (‖v′′n‖L2(Ω))n∈N is bounded as well. Suppose not, then lim supn→∞ ´ Ω(v ′′ n) 2 dx =∞. Thus, passing to a subsequence if necessary we have ˆ Ω fv′′n dx− 1 2 ˆ Ω (v′′n) 2 dx ≤ ‖f‖L2(Ω)‖v′′n‖L2(Ω) − 1 2 ‖v′′n‖2L2(Ω) → −∞, as n→∞. which is a contradiction from (4.7). Hence, the sequence (vn)n∈N is bounded in H20 (Ω) and since that space is reflexive we get the existence of a subsequence (vnk)k∈N converging weakly to a function v ∈ H20 (Ω). Since K is weakly closed we have that v ∈ K. Moreover the maximising functional is weakly upper semicontinuous since ´ Ω f(·)′′ dx and−12‖·‖2L2(Ω) are weakly continuous and weakly upper semicontinuous in H20 (Ω) respectively. Thus, we have supP ′ ≥ ˆ Ω fv′′ dx− 1 2 ˆ Ω (v′′)2dx ≥ lim sup k→∞ ˆ Ω fv′′nkdx− 1 2 ˆ Ω (v′′nk) 2dx = supP ′, which means that v is a solution to P ′. 
This solution is unique as the maximising functional is strictly concave, defined on a convex domain. We define now the following spaces and operators: • X = H20 (Ω)×H10 (Ω), Y = H10 (Ω)× L2(Ω), • Λ : X → Y, linear, bounded, • F1 : X → R, F2 : Y → R, with Λ(v, ω) = (ω + v′, ω′), F1(v, ω) = I{‖·‖∞≤β}(v) + I{‖·‖∞≤α}(ω), F2(φ, ψ) = I{0}(φ) + ˆ Ω fψ dx+ 1 2 ˆ Ω ψ2 dx. It is easy to check that under these definitions the problem P ′ is a equivalent to − min (v,ω)∈X F1(v, ω) + F2(Λ(v, ω)). (4.8) 112 4.3. Formulation of the problem and optimality conditions The dual problem of (4.8), see Section 2.5.2, is min (w,u)∈Y ∗ F ∗1 (−Λ∗(w, u)) + F ∗2 (w, u). (4.9) Moreover, X, Y are Banach spaces, F1, F2 are proper, lower semicontinuous functions and the condition Y = ⋃ λ≥0 λ(dom(F2)− Λ(dom(F1))), (4.10) holds. Thus we have that Theorem 2.5.6 holds, i.e., no duality gap occurs:( min (v,ω)∈X F1(v, ω) + F2(Λ(v, ω)) ) + ( min (w,u)∈Y ∗ F ∗1 (−Λ∗(w, u)) + F ∗2 (w, u) ) = 0. The fact that condition (4.10) holds follows from [BKV13, Proposition 3.6], where the same condition is proved for the same X, Y , F1, Λ and for a F2 with a smaller domain. Our next step is to identify the dual problem (4.9) with the problem P. Proposition 4.3.2. The problem min (w,u)∈Y ∗ F ∗1 (−Λ∗(w, u)) + F ∗2 (w, u), (4.11) is equivalent to P in the sense that (w, u) solve (4.11) if and only if (w, u) ∈ BV(Ω) × BV(Ω) and solve P. Proof. The proof follows closely the proof of the corresponding theorem in [BKV13]. Firstly, we compute F ∗1 . For a pair (σ, τ) ∈ H20 (Ω)∗ ×H10 (Ω)∗ we have that F ∗1 (σ, τ) = sup (v,ω)∈H20 (Ω)×H10 (Ω) ‖v‖∞≤β ‖ω‖∞≤α 〈σ, v〉+ 〈τ, ω〉 = β sup v∈H20 (Ω) ‖v‖∞≤1 〈σ, v〉+ α sup ω∈H10 (Ω) ‖ω‖∞≤1 〈τ, ω〉. Using density arguments, it can be checked that F ∗1 (σ, τ) = β sup v∈C∞c (Ω) ‖v‖∞≤1 〈σ, v〉+ α sup ω∈C∞c (Ω) ‖ω‖∞≤1 〈τ, ω〉. (4.12) We also have for every (w, u) ∈ Y ∗ and (v, ω) ∈ X, −〈Λ∗(w, u︸︷︷︸ ∈Y ∗ ) ︸ ︷︷ ︸ ∈X∗ , (v, ω︸︷︷︸ ∈X )〉 = −〈(w, u),Λ(v, ω)〉 = −〈w, v′ + ω〉 − 〈u, ω′〉. (4.13) 113 Exact solutions of the one dimensional TGV regularisation problem Combining (4.12) and (4.13) we have that for every (w, u) ∈ Y ∗ F ∗1 (−(Λ∗(w, u))) = β sup v∈C∞c (Ω) ‖v‖∞≤1 −〈w, v′〉+ α sup ω∈C∞c (Ω) ‖ω‖∞≤1 −〈w,ω〉 − 〈u, ω′〉. (4.14) Since w ∈ H10 (Ω)∗ and u ∈ L2(Ω), they can be regarded as distributions and thus (4.14) can be written as F ∗1 (−(Λ∗(w, u))) = β sup v∈C∞c (Ω) ‖v‖∞≤1 〈Dw, v〉+ α sup ω∈C∞c (Ω) ‖ω‖∞≤1 〈Du− w,ω〉. (4.15) Since we consider the minimisation problem (4.11), both terms in (4.15) are finite. Thus, see Theorem 2.2.3 and the discussion after that, we have that Dw is a Radon measure, w is an L1 function (thus a BV function) and Du is a Radon measure, i.e., u is a BV function as well. Thus, if (w, u) is a pair of minimisers for (4.11) we have that (w, u) ∈ BV(Ω)×BV(Ω) and F ∗1 (−(Λ∗(w, u))) = β‖Dw‖M + α‖Du− w‖M. We now compute F ∗2 . We have F ∗2 (w, u) = sup (φ,ψ)∈Y φ=0 〈w, φ〉+ ˆ Ω uψ dx− ˆ Ω fψ dx− 1 2 ˆ Ω ψ2 dx = sup ψ∈L2(Ω) ˆ Ω (u− f)ψ dx− 1 2 ˆ Ω ψ2dx = ( 1 2 ‖ · ‖2L2(Ω) )∗ (u− f) = 1 2 ˆ Ω (u− f)2dx, and the proof is complete. We are now ready to derive the optimality conditions that link the solutions of the problems P and P ′. We first need the following definition. Definition 4.3.3. Let µ ∈M(Ω). We define the set-valued sign, Sgn(µ) as Sgn(µ) = {v ∈ L∞(Ω) ∩ L∞(Ω, |µ|) : ‖v‖L∞(Ω) ≤ 1, v = sgn(µ), |µ| − a.e.} In other words Sgn(µ) is the set of all functions v that are equal to the Radon– Nikody´m density µ/|µ|, µ–almost everywhere with the additional property that |v(x)| ≤ 1, Lebesgue–almost everywhere. 
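Before turning to the optimality conditions, it may help to see the primal problem P in a finite-dimensional form. The following is a minimal sketch of a discretised version of P using the convex-optimisation package cvxpy (a tool of our choosing for illustration, not used in the thesis experiments, which rely on primal-dual methods); the forward-difference discretisation and the treatment of the grid spacing are our own simplifying assumptions, so this is only a finite-dimensional surrogate of the continuum problem.

```python
import numpy as np
import cvxpy as cp

def tgv2_denoise_1d(f, alpha, beta, dx):
    """Discrete surrogate of problem P: 1D L2-TGV^2 denoising.

    f: sampled datum on a uniform grid with spacing dx.
    Returns (u, w), discrete analogues of the minimising pair of P.
    """
    n = f.size
    u = cp.Variable(n)
    w = cp.Variable(n - 1)
    fidelity = 0.5 * dx * cp.sum_squares(u - f)
    # ||Du - w||_M  ~  sum_i |u_{i+1} - u_i - dx * w_i|
    first_order = cp.norm1(cp.diff(u) - dx * w)
    # ||Dw||_M  ~  sum_i |w_{i+1} - w_i|
    second_order = cp.norm1(cp.diff(w))
    cp.Problem(cp.Minimize(fidelity + alpha * first_order
                           + beta * second_order)).solve()
    return u.value, w.value

# Example: denoise a noisy hat-like signal on (0, 1).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
f = np.where(x < 0.5, x, 1.0 - x) + 0.05 * rng.standard_normal(x.size)
u, w = tgv2_denoise_1d(f, alpha=0.02, beta=0.02, dx=x[1] - x[0])
```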
The following lemma was proved in [BKV13]. 114 4.3. Formulation of the problem and optimality conditions Lemma 4.3.4. If µ ∈M(Ω) then Sgn(µ) ∩ C0(Ω) = ∂‖ · ‖M(µ) ∩ C0(Ω). The following proposition characterises the minimisers of the problem P. Proposition 4.3.5 (Optimality conditions for L2–TGV2β,α). A pair (w, u) ∈ BV(Ω) × BV(Ω) is a minimiser for P if and only if there exists a function v ∈ H20 (Ω), such that v′′ = f − u, (Cf ) −v′ ∈ α Sgn(Du− w), (Cα) v ∈ β Sgn(Dw). (Cβ) Proof. Since there is no duality gap, from Theorem 2.5.5 we get that the solutions (v, ω) and (w, u) of P ′ and P respectively (or equivalently (4.8) and (4.9)) are linked through the optimality conditions (v, ω) ∈ ∂F ∗1 (−Λ∗(w, u)), (4.16) Λ(v, ω) ∈ ∂F ∗2 (w, u). (4.17) Conversely, (w, u) is a solution for P if there exists a v ∈ H20 (Ω) ⊆ C0(Ω) such that (4.16)– (4.17) hold with ω = −v′. We have that the first optimality condition (4.16) is equivalent to F ∗1 (−Λ∗(w, u)) + 〈(v, ω), (σ, τ) + Λ∗(w, u)〉 ≤ F ∗1 (σ, τ), ∀(σ, τ) ∈ X∗ ⇔ α‖Du− w‖M + β‖Dw‖M + 〈(v, ω), (σ −Dw, τ − (Du− w))〉 ≤ α‖τ‖M + β‖σ‖M, ∀(σ, τ) ∈ X∗ ⇔ α‖Du− w‖M + β‖Dw‖M + 〈v, σ −Dw〉+ 〈ω, τ − (Du− w)〉 ≤ α‖τ‖M + β‖σ‖M, ∀(σ, τ) ∈ X∗ ⇔ α‖Du− w‖M + 〈ω, τ − (Du− w)〉 ≤ α‖τ‖M, ∀τ ∈ H10 (Ω)∗β‖Dw‖M + 〈v, σ −Dw〉 ≤ β‖σ‖M, ∀σ ∈ H20 (Ω)∗ ⇔ α‖Du− w‖M + 〈ω, ψ − (Du− w)〉 ≤ α‖τ‖M, ∀τ ∈M(Ω)β‖Dw‖M + 〈v, σ −Dw〉 ≤ β‖σ‖M, ∀τ ∈M(Ω) ⇔ ω ∈ α∂‖ · ‖M(Du− w)v ∈ β ∂‖ · ‖M(Dw) (4.18) 115 Exact solutions of the one dimensional TGV regularisation problem Using Lemma 4.3.4, the relations (4.18) are equivalent to ω ∈ α Sgn(Du− w), (4.19) v ∈ β Sgn(Dw). (4.20) Now the second optimality condition (4.17) is equivalent to F ∗2 (w, u) + 〈Λ(v, ω), (wˆ, uˆ)− (w, u)〉 ≤ F ∗2 (wˆ, uˆ), ∀(wˆ, uˆ) ∈ Y ∗ ⇔ 1 2 ˆ Ω (u− f)2dx+ 〈ω + v′, wˆ − w〉+ 〈ω′, uˆ− u〉 ≤ 1 2 ˆ Ω (uˆ− f)2dx, ∀(wˆ, uˆ) ∈ Y ∗ ⇔ 0 + 〈ω + v′, wˆ − w〉 ≤ 0, ∀wˆ ∈ H10 (Ω)∗1 2 ´ Ω(u− f)2dx+ 〈ω′, uˆ− u〉 ≤ 12 ´ Ω(uˆ− f)2dx, ∀uˆ ∈ L2(Ω)∗ ⇔ ω = −v′ω′ ∈ ∂ (12‖ · −f‖2L2(Ω)) (u) ⇔ ω = −v′ω′ = u− f (4.21) Combining (4.21) with (4.19) and (4.20) we deduce (Cf )–(Cβ). We note here that analogous optimality conditions hold for the L1–TGV2β,α problem as well as for the corresponding L1–TV, L2–TV problems, see [BKV13] and [Rin00]. We provide a summary of these in Table 4.1. Optimality conditions – Dimension one L1–αTV L2–αTV L1–TGV2β,α L 2–TGV2β,α u ∈ BV(Ω), v ∈ H10(Ω) u ∈ BV(Ω), v ∈ H10(Ω) u,w ∈ BV(Ω), v ∈ H20(Ω) u,w ∈ BV(Ω), v ∈ H20(Ω) v′ ∈ Sgn(f − u), −v ∈ α Sgn(Du) v′ = f − u, −v ∈ α Sgn(Du) v′′ ∈ Sgn(f − u), −v′ ∈ α Sgn(Du− w), v ∈ β Sgn(Dw) v′′ = f − u, −v′ ∈ α Sgn(Du− w), v ∈ β Sgn(Dw) Table 4.1: Summary of the optimality conditions for the one dimensional TV and TGV denoising problems with L1 and L2 fidelity terms. 116 4.4. Properties of solutions 4.4 Properties of solutions 4.4.1 Basic properties Before we proceed to our computations of exact solutions for simple data functions, we firstly show some properties of the solutions of the one dimensional L2–TGV2β,α regular- isation. The following two propositions were proved in [BKV13] for the one dimensional L1–TGV2β,α model and they hold in the L 2 fidelity case as well. Their proofs are minor adjustments of the corresponding proofs for the L1 fidelity case and we omit them. Recall also the definitions of the good representatives uup and ulow of a function u ∈ BV(Ω), defined after Theorem 2.3.12. Proposition 4.4.1. Let f ∈ BV(Ω) and suppose that u,w ∈ BV(Ω) solve P. Suppose that uup < flow on an open interval I ⊆ Ω. 
Then the following hold: (i) (Du− w)bI = 0, that is to say u′ = w on I and |Dsu|(I) = 0. (ii) w′ = 0 on I and 0 ≤ −DwbI  δx for some x ∈ I. (iii) The function w = u′ is non-increasing on I. Similarly, if ulow > f up on I then we have (i) (Du− w)bI = 0, that is to say u′ = w on I and |Dsu|(I) = 0. (ii) w′ = 0 on I and 0 ≤ DwbI  δx for some x ∈ I. (iii) The function w = u′ is non-decreasing on I. The following proposition states that the jump set of the solution u is contained in the jump set of the solution data f . Proposition 4.4.2. Let f ∈ BV(Ω) and let Gf := {(x, t) ∈ Ω× R : x ∈ Jf , t ∈ [flow(x), fup(x)]} . If u solves the L2–TGV2β,α minimisation problem with data function f , then Gu ⊆ Gf . (4.22) In particular Ju ⊆ Jf . Let us note that in higher dimensions (Ω ⊆ Rd), an analogue inclusion holds (up to an Hd−1–null set) for the jump sets of the data and the solution of the L2–TV problem, see [CCN07] and also [Jal12, Jal14]. The corresponding problem for the L2–TGV2β,α model in higher dimensions still remains open, however, significant advances have been made recently by Valkonen [Val14a, Val14b]. The following proposition states that at least away from the boundary of Ω, the solution u can be bounded pointwise by the data f . 117 Exact solutions of the one dimensional TGV regularisation problem Proposition 4.4.3 (Behaviour away from the boundary). Let u be the solution to the problem P with data f and let (c, d) ⊆ {x ∈ (a, b) : fup(x) < ulow(x)} be a maximal interval with that property and a < c < d < b. Then ulow(c) ∈ [flow(c), fup(c)], (4.23) uup(d) ∈ [flow(d), fup(d)], (4.24) and inf x∈(c,d) fup(x) ≤ min x∈[c,d] ulow(x) ≤ max x∈[c,d] uup(x) ≤ max{fup(c), fup(d)}. (4.25) Analogue results hold for the uup < flow case. Proof. Let us note first that the set {x ∈ (a, b) : fup(x) < ulow(x)} is open, see [BKV13]. We prove here (4.23) while (4.24) can be shown similarly. Notice that in the case where ulow(c) < u up(c), i.e., c ∈ Ju, (4.23) follows directly from (4.22), since in that case we have [ulow(c), u up(c)] ⊆ [flow(c), fup(c)]. Thus, we suppose that ulow(c) = u up(c). We have two cases now, either flow(c) = f up(c) or flow(c) < f up(c). Consider the first case, flow(c) = f up(c), and suppose towards contradiction that fup(c) < ulow(c). In that case we claim that there exists a δ > 0 such that for every x ∈ (c − δ, c] we have fup(x) < ulow(x) which is a contradiction by the maximality of (c, d). Indeed, if the claim is not true, there exists a sequence (xn)n∈N with xn < c for every n ∈ N such that xn → c and ulow(xn) ≤ fup(xn) for every n ∈ N. Since both ulow and fup are continuous at c that would mean that ulow(c) ≤ fup(c). If fup(c) > ulow(c) we arrive again in contradiction since from the continuity of fup and ulow at c there must be points in (c, d) where fup > ulow which contradicts our assumption. We work similarly for the case fup(c) < ulow(c). In order to prove (4.25), notice first that Proposition 4.4.1 implies that u is continuous and convex in (c, d), thus uup = ulow there. From the convexity of u we have for every 0 ≤ λ ≤ 1 uup(λc+ (1− λ)d) ≤ λuup(c) + (1− λ)uup(d) ≤ λmax{fup(c), fup(d)}+ (1− λ) max{fup(c), fup(d)} = max{fup(c), fup(d)}. (4.26) Finally, we have that fup(x) < ulow(x) for all x ∈ (c, d). From the existence of the side limits we have that ulow can be extended continuously to [c, d] and thus inf x∈(c,d) fup(x) ≤ min x∈[c,d] ulow(x). (4.27) 118 4.4. Properties of solutions Combining (4.26) and (4.27) we get (4.25). 
We note that the above pointwise estimates hold only away from the boundary of Ω. Indeed as we will see in the next sections, there are examples where u < inf f and u > sup f near the boundary (in the sense of good representatives). This is in fact in contrast to TV regularisation, where one can show that inf f < ulow < u up < sup f in the whole domain Ω. In the following proposition we investigate further the structure of the solution u close to the boundary of Ω. Proposition 4.4.4 (Behaviour near the boundary). The following two statements hold: (i) If (a, c) is a maximal interval subset of {x ∈ (a, b) : uup < flow}, then uup is an affine function there. (ii) Suppose that u = f a.e. in a maximal set of a form (a, c) and suppose that uup < flow in a set of the form (c, d). Then u is affine in (a, d). Analogue results hold for the case ulow > f up and also near b. Proof. (i) We know from from Proposition 4.4.1 that u is continuous and concave in (a, c). We claim that Dw = 0, i.e., w is a constant function there, and thus from the fact that u′ = w we get that u is affine. Indeed, we have that v is strictly convex on (a, c) with v(a) = v′(a+) = 0. This means that v is strictly increasing in (a, c). If Dwb(a, c) = −δx for some x ∈ (a, c) we would get from (Cβ) that v(x) = −β and v(x′) < −β for every a < x′ < x which is a contradition. (ii) Because of the fact that u = f a.e. in (a, c) we have from (Cf ) that v is an affine function there and since v ∈ H20 (a, b) we have that v = 0 in (a, c). Condition (Cβ) forces w to be constant and condition (Cα) forces Du = w and thus u is an affine function in (a, c) say with derivative equal to λ1. Since u up < flow in (c, d) we have that u is also an affine function there, say with derivative equal to λ2. Moreover, u must be continuous in c because otherwise from condition (Cα) we would have that |v′(c)| = α and then v′ would be discontinuous at c. Also, we have λ1 = λ2. This is because of the fact that wb(a, c) = λ1, wb(c, d) = λ2 and if λ1 6= λ2, from condition (Cβ) we would have that |v(c)| = β and thus v would be discontinuous at c. Note finally, that u cannot have a gradient change in (c, d) because then v would have a local minimum in (c, d) which is again impossible. The second part of Proposition 4.4.4 tells us that the case u = f near the boundary can possibly happen only if f is affine there. In the next proposition we point out that the derivative of u cannot change across a jump point. Proposition 4.4.5 (Behaviour at a jump point). Let u be a solution to the problem P with data f . Suppose that u has a jump discontinuity at a point x0 and that there exists 119 Exact solutions of the one dimensional TGV regularisation problem an  > 0 such that uup < flow or ulow > f up in (x0− , x0 + ) where also f is continuous. Then there exists 0 < δ <  such that u′ is constant on (x0− δ, x0 + δ), i.e., the derivative of u does not jump on x0. Proof. Since u has a jump discontinuity at x0, from condition (Cα) we have that |v′(x0)| = α. (4.28) Since uup < flow or ulow > f up in (x0 − , x0 + ) we have from Proposition 4.4.1 that Du = w in (x0 − , x0) and also in (x0, x0 + ). Moreover, there exists δ ≤  such that Du is constant in each one of the intervals (x0−δ, x0) and (x0, x0 +δ). Thus if Du has a jump discontinuity then the same is true for w and thus from condition (Cβ) we have that |v(x0)| = β. (4.29) However its is easy to check that (4.28) and (4.29) cannot hold simultaneously, since v has an extremum in x0 and hence its derivative must be zero there. 
Thus we have a contradiction. In Figure 4.2 we provide an illustration about how the possible solutions can look like. For example, in Figure 4.2(a) solution of the type u1 cannot occur as according to Proposition 4.4.1 u must be piecewise affine in (c, d). Similarly, u3 violates the fact that u′ must be non-increasing in (c, d) and also contradicts Proposition 4.4.5 since its derivative changes at the jump point d. Solution u5 comes in contrast to Proposition 4.4.2, as it introduces a new jump. In Figure 4.2(b), solution u1 cannot occur since f is not affine near the boundary, see Proposition 4.4.4, while u3 violates the (i) part of the same proposition. Finally, in Figures 4.2(c) and 4.2(d) we describe two examples where the solution can possibly coincide with the data near the boundary. Note that the solution u1 in Figure 4.2(a) cannot occur as again the (ii) part of Proposition 4.4.4 is violated. 4.4.2 L2–linear regressions We continue with some results concerning the L2–linear regression of the data f . In [BKV13], it was proved that in the L1 fidelity case, there exist some thresholds α∗, β∗ > 0 such that if α > α∗, β > β∗ then the solution u of the one dimensional L1–TGV2β,α problem is the L1–linear regression f of f , where f ∈ argmin v affine ‖f − v‖L1(Ω). There, the values of α∗ and β∗ are independent of the function f and depend only on the domain Ω. Here we show that these thresholds exist for the L2 fidelity case as well but they depend on the data f . For a function f ∈ BV(Ω), we define f? to be the L2–linear 120 4.4. Properties of solutions u2 u3 u4 u5 c d f u1 (a) Away from the boundary a u1 u2 u3 f (b) Near the boundary a b u1 f u2 (c) Solution u equal to the data f near the boundary a b f u (d) Solution u equal to the data f near the boundary Figure 4.2: Possible (blue) and not possible (red) solutions of the L2–TGV2β,α problem as these are dictated by Propositions 4.4.1–4.4.5. These are not actual computed solutions but they illustrate their qualitative properties. regression of f , i.e., f? := argmin v affine ‖f − v‖2L2(Ω). We derive the following result: Proposition 4.4.6 (Thresholds for L2–linear regression). Let u be the solution to P with data function f . Then u is an affine function if and only if u = f?. Moreover, if α ≥ b− a 2 ‖f − f?‖∞, (4.30) β ≥ (b− a) 2 4 ‖f − f?‖∞, (4.31) then u is equal to f?. Proof. We firstly show that if u is an affine function and solves P, then u = f?. Since u is affine we have that D2u = 0, thus the optimum w in P is w = Du. Then, we obviously have u = argmin u˜ affine 1 2 ‖f − u˜‖2L2(Ω), 121 Exact solutions of the one dimensional TGV regularisation problem i.e., u = f?. For the second part, since Df? −w = 0 and Dw = 0, from (Cf )–(Cβ) we get that in order for the solution to be equal to f? it suffices to find a function v ∈ H20 (Ω), such that v′′ = f − f?, ‖v‖∞ ≤ β and ‖v′‖∞ ≤ α. (4.32) It is easy to see that for every function v ∈ W 1,∞(Ω) with v(a) = v(b) = 0 (recall that v is necessarily continuous) we have the estimate ‖v‖∞ ≤ b− a 2 ‖v′‖∞. Hence, a function v ∈ H20 (Ω) that satisfies v′′ = f − f?, it satisfies ‖v′‖∞ ≤ b− a 2 ‖f − f?‖∞, (4.33) ‖v‖∞ ≤ (b− a) 2 4 ‖f − f?‖∞. (4.34) Thus, in order to have this kind of solution, it suffices to choose α and β as in (4.30)–(4.31) and to find a function v ∈ H20 (Ω) that satisfies v′′ = f − f?. It can be easily checked that the function v(x) = ˆ x a ˆ t a (f − f∗) dsdt, satisfies these conditions, baring also in mind that the conditions ˆ b a (f − f?)′ dx = 0 and ˆ b a x(f − f?) 
dx = 0, hold for the L2–linear regression. We note that in Sections 4.6.1 and 4.6.3, we provide some sharper L2–linear regression thresholds for the data functions for which we are computing exact solutions. The next proposition states that the solution u to the L2–TGV2β,α problem, has the same L 2–linear regression with the data f . Proposition 4.4.7. Let uf,β,α be the solution to L 2–TGV2β,α minimisation problem with data f . Then f? = u?f,β,α, i.e., argmin v affine ‖f − v‖2L2(Ω) = argmin v affine ‖uf,β,α − v‖2L2(Ω). Proof. Since the L2–linear regression is a geometric notion, we can assume without loss of generality that Ω is an interval of the form (0, b). From (Cf ), we have that v ′′ = f −uf,β,α with v ∈ H20 (Ω). Thus, by integrating by parts, we have that ´ Ω v ′′ dx = 0 and ´ Ω xv ′′ dx = 122 4.4. Properties of solutions 0 which implies ˆ Ω f dx = ˆ Ω uf,β,α dx and ˆ Ω xf dx = ˆ Ω xuf,β,α dx. Then the proof is finished by simply observing that for a function g we have argmin v affine ‖g − v‖2L2(Ω) = argmin v=λx+µ ‖g − λx− µ‖2L2(Ω) = argmin v=λx+µ 1 2 λ2b2 + λb3 + bµ2 − 2λ ˆ Ω xg dx− 2µ ˆ Ω g dx, which means that g? depends only on the quantities ´ Ω g dx and ´ Ω xg dx. 4.4.3 Even and odd functions Before we proceed with the computation of exact solutions, we point out some facts concerning odd and even data functions. In particular, at the end of this section we prove that in some symmetric cases, TV and TGV regularisations coincide, given that α and β satisfy a very simple condition. It is convenient for this section to assume that Ω is an interval of the type (−L,L). We first prove the following two propositions, which essentially state that the symmetry of the data f is inherited to the solutions u and v of the problems P and P ′ respectively. Proposition 4.4.8. Let f ∈ BV(−L,L) be an odd (even) function. Then the solution v ∈ H20 (−L,L) to the problem P ′ is also odd (even). Proof. We prove the odd case, the even case is proved analogously. Suppose that f(x) = −f(−x) a.e.. We will show that v(x) = −v(−x). Let fˆ(x) = −f(−x) and vˆ to be the solution to P ′ with data function fˆ and set v˜(x) = −v(−x). Obviously ‖v‖∞ ≤ β, ‖v′‖∞ ≤ α ⇐⇒ ‖v˜‖∞ ≤ β, ‖v˜′‖∞ ≤ α. Since v˜(x) = −v(−x), we have v˜′′(x) = −v′′(−x) a.e. and thus ˆ L −L fˆ(x)v˜′′(x)dx− 1 2 ˆ L −L (v˜′′(x))2dx ≤ ˆ L −L fˆ(x)vˆ′′(x)dx− 1 2 ˆ L −L (vˆ′′(x))2dx =⇒ ˆ L −L f(x)v′′(x)dx− 1 2 ˆ L −L v′′(x)2dx ≤ ˆ L −L f(x)(−vˆ′′(−x))dx− 1 2 ˆ L −L ((−vˆ(−x))′′)2dx, thus v(x) = −vˆ(−x) but since fˆ = f we have that vˆ = v. Proposition 4.4.9. Let f ∈ BV(−L,L) be an odd (even) function. Then the solution u to the problem P is also odd (even). 123 Exact solutions of the one dimensional TGV regularisation problem Proof. Again we only prove the odd case as the even case can be proved in a similar fashion. Set fˆ(x) = −f(−x), uˆ the solution of P with data function fˆ and set u˜(x) = −u(−x). One can easily notice from the rotational invariance of TGV that TGV2β,α(u) = TGV 2 β.α(u˜). We have 1 2 ˆ L −L (uˆ(x)− fˆ(x))2dx+ TGV2β,α(uˆ) ≤ 1 2 ˆ L −L (u˜(x)− fˆ(x))2dx+ TGV2β,α(u˜) =⇒ 1 2 ˆ L −L (−uˆ(−x)− f(x))2dx+ TGV2β,α(−uˆ(−·)) ≤ 1 2 ˆ L −L (u(x)− f(x))2dx+ TGV2β,α(u). Thus we have u(x) = −uˆ(−x). Since fˆ = f we have uˆ = u and thus u(x) = −u(−x). We are going to prove that at least for even data functions f , if the ratio β/α of the TGV2β,α parameters is large enough, then TGV regularisation is equivalent to TV regularisation with weight α. We firstly need the following lemma concerning the total variation of even functions. 
It states that the total variation of an even function cannot be decreased by adding an affine function to it. Lemma 4.4.10. Let u ∈ BV(−L,L) be an even function. Then for every c ∈ R we have ‖Du‖M ≤ ‖Du+ c‖M. (4.35) Proof. We can assume without loss of generality that c > 0. In order to show (4.35), it suffices to show that ˆ (−L,L) |u′| dx ≤ ˆ (−L,L) |u′ + c| dx. (4.36) Since u is even we have that u′ is odd and thus up to null sets we have {u′ > 0} ∩ (−L, 0) = −{u′ < 0} ∩ (0, L), {u′ < 0} ∩ (−L, 0) = −{u′ > 0} ∩ (0, L). Then we estimate ˆ (−L,L) |u′| dx = ˆ {u′>0}∩(−L,0) u′ dx − ˆ {u′<0}∩(0,L) u′ dx + ˆ {u′>0}∩(0,L) u′ dx − ˆ {u′<0}∩(−L,0) u′ dx = ˆ {u′>0}∩(−L,0) c+ u′ dx − ˆ {u′<0}∩(0,L) c+ u′ dx + ˆ {u′>0}∩(0,L) c+ u′ dx − ˆ {u′<0}∩(−L,0) c+ u′ dx ≤ ˆ {u′>0}∩(−L,0) c+ u′ dx + ˆ {u′<0}∩(0,L) |c+ u′| dx + ˆ {u′>0}∩(0,L) c+ u′ dx + ˆ {u′<0}∩(−L,0) |c+ u′| dx 124 4.4. Properties of solutions + ˆ {u′=0}∩(−L,L) |c| dx = ˆ (−L,L) |u′ + c| dx. Theorem 4.4.11. Let f ∈ BV(−L,L) be an even function. If β α ≥ L, (4.37) then the solution of the L2–TGV2β,α regularisation problem is the same with the solution of the following TV regularisation problem: min u∈BV(−L,L) 1 2 ˆ L L (u− f)2dx+ αTV(u). Proof. Recall that for every u ∈W 1,1(−L,L) the following Poincare´ inequality holds, ‖u− uΩ‖L1(−L,L) ≤ L‖∇u‖L1(−L,L), i.e., the Poincare´ constant is equal to L, see for example [AFP00]. Using density arguments, we can prove that the same inequality holds for the space BV(−L,L) with the same constant, i.e., for every u ∈ BV(−L,L) ‖u− uΩ‖L1(−L,L) ≤ L‖Du‖M. (4.38) Since f is even, from Proposition 4.4.9 we have that the solution of min u∈BV(−L,L) 1 2 ˆ Ω (u− f)2dx+ TGV2β,α(u), (4.39) is an even function as well. Thus, the problem (4.39) is equivalent to min u∈BV(−L,L) u even 1 2 ˆ L −L (u− f)2dx+ TGV2β,α(u). (4.40) Similarly we can prove that the outcome of TV regularisation for even data, is even as well. Thus, it suffices to prove that for an even function u, we have TGV2β,α(u) = αTV(u) := α‖Du‖M, provided (4.37) holds. We calculate successively: TGV2β,α(u) = min w∈BV(Ω) α‖Du− w‖M + β‖Dw‖M 125 Exact solutions of the one dimensional TGV regularisation problem ≤ α‖Du‖M (choosing w = 0), ≤ α‖Du− wΩ‖M ∀w ∈ BV(Ω), (from Lemma 4.4.10), ≤ α‖Du− w‖M + α‖w − wΩ‖L1(Ω) ≤ α‖Du− w‖M + βαL β ‖Dw‖M (Poincare´ inequality), ≤ max { 1, aL β } (α‖Du− w‖M + β‖Dw‖M) = α‖Du− w‖M + β‖Dw‖M (since (4.37) holds). Thus, we have proved above that α‖Du‖M ≤ α‖Du− w‖M + β‖Dw‖M ∀w ∈ BV(Ω) which implies that TGV2β,α(u) ≤ α‖Du‖M ≤ min w∈BV(Ω) α‖Du− w‖M + β‖Dw‖M = TGV2β,α(u), and the proof is complete. Let us emphasise the fact that the condition (4.37) is independent of the function f . Moreover, as we will see in Section 4.6.1 there exist simple examples of odd data functions (function fp.c.) where the TGV and TV regularisations do not coincide for any choice of α and β. In Figure 4.3, we verify numerically the above result. Both TGV and TV minimisation problems are solved with the Chambolle–Pock primal dual method [CP11] as it is described in [Bre14]. 4.5 A note on the relationship of TV and TGV in dimension two In this section we want to point out that a version of Theorem 4.4.11 holds in dimension two as well, i.e., if the ratio β/α is large enough and the data function f is symmetric enough then TV and TGV regularisations coincide. To do so, we first recall some results from the theory of functions of bounded deformation, see [TS80, Tem85]. We state them in dimension two but higher dimensional analogues also hold. Proposition 4.5.1. 
Let Ω ⊆ R2 be an open, bounded set with Lipschitz boundary. Then 126 4.5. A note on the relationship of TV and TGV in dimension two 0 50 100 150 200 250 300 350 400 450 500 −2 −1 0 1 2 3 4 f TGV: α = 10, β = 200 TGV: α = 10, β = 3000 TV : α = 10 Figure 4.3: Numerical verification of Theorem 4.4.11. Using the even function f shown above as a data function and setting β/α ≥ L, the solutions of the TGV2β,α and αTV regularisation problems coincide. If we decrease the ratio β/α violating condition (4.37), the two regularisations do not coincide any more with the TGV solution (red) being different than the TV one (blue). for every function w ∈ BD(Ω) there exists an element rw ∈ KerE such that ‖w − rw‖L1(Ω,R2) ≤ C‖Ew‖M, (4.41) where the constant C depends only on the domain Ω. Note moreover that KerE = {r ∈ L1(Ω,R2) : r(x) = Ax+B, B ∈ R2, A ∈ R2×2 is a skew symmetric matrix}. Recall that in dimension one the variable w is a solution to an L1–TV problem. In dimension two, w is a solution to the following L1–E problem: w ∈ argmin w˜∈BD(Ω) ˆ Ω |Dαu− w˜| dx+ β α ‖Ew˜‖M. (4.42) We are thus turning our attention to the more generic L1–E problem min w∈BD(Ω) ‖f − w‖L1(Ω,R2) + λ‖Ew‖M, f ∈ L1(Ω,R2), λ > 0. (4.43) Notice that (4.43) does not have necessarily unique solutions as it is not strictly convex. The next theorem states that that if the parameter λ is large enough then a solution w of (4.43) belongs to KerE . This is analogous to the L1–TV problem, where for large enough value of the parameter λ we have that the solution is constant, i.e., it belongs to the kernel 127 Exact solutions of the one dimensional TGV regularisation problem of TV. Proposition 4.5.2. Let Ω ⊆ R2 be an open, bounded set with Lipschitz boundary and let f ∈ L1(Ω,R2). Then there exists a λ∗ > 0 such that for every λ > λ∗, if wλ is a solution of (4.43) with parameter λ then wλ ∈ argmin w∈KerE ‖f − w‖L1(Ω,R2). Proof. Since wλ is a solution of (4.43) we have that ‖f − wλ‖L1(Ω,R2) + λ‖Ewλ‖M ≤ ‖f‖L1(Ω,R2). (4.44) From inequality (4.41) we get ‖f − wλ‖L1(Ω,R2) + λ C ‖wλ − rwλ‖L1(Ω,R2) ≤ ‖f‖L1(Ω,R2). (4.45) It is now easy to see that Wλ := wλ − rwλ solves the following problem: min w∈BD(Ω) ‖(f − rwλ)− w‖L1(Ω,R2) + λ‖Ew‖M. (4.46) Indeed, one has to check that ‖(f − rwλ)−Wλ‖L1(Ω,R2) + λ‖EWλ‖M ≤ ‖(f − rwλ)− w‖L1(Ω,R2) + λ‖Ew‖M, ∀w ∈ BD(Ω) ⇐⇒ ‖(f − wλ‖L1(Ω,R2) + λ‖Ewλ‖M ≤ ‖f − (w + rwλ)‖L1(Ω,R2) + λ‖E(w + rwλ)‖M, ∀w ∈ BD(Ω) ⇐⇒ ‖(f − wλ‖L1(Ω,R2) + λ‖Ewλ‖M ≤ ‖f − w‖L1(Ω,R2) + λ‖Ew‖M, ∀w ∈ BD(Ω), with the last inequality being true since wλ is a solution of (4.43). Setting Fλ := f − rwλ , using the fact that Wλ solves (4.46) and the inequality ‖Wλ‖L1(Ω,R2) ≤ C‖EWλ‖M we get ‖Fλ −Wλ‖L1(Ω,R2) + λ C ‖Wλ‖L1(Ω,R2) ≤ ‖Fλ‖L1(Ω,R2). (4.47) A simple application of triangle inequality in (4.47) yields that if λ > C, then we must have Wλ = 0, i.e., wλ = rwλ . We thus set λ ∗ = C and it is obvious that if λ > λ∗ then wλ ∈ argmin w∈KerE ‖f − w‖L1(Ω,R2). We are now ready to prove that under some symmetry assumptions for Ω and f and for a large enough ratio β/α, TV and TGV regularisations coincide in dimension two as 128 4.5. A note on the relationship of TV and TGV in dimension two well. For simplicity we will assume that Ω is a square but other symmetric domains can be considered as well. Theorem 4.5.3. Suppose that Ω ⊆ R2 is a bounded square, centred at the origin. Let f ∈ BV(Ω) satisfy the following symmetry properties (i) f is symmetric with respect to both x and y axis, i.e., f(x1, x2) = f(−x1, x2), f(x1, x2) = f(x1,−x2), for a.e. (x1, x2) ∈ Ω. 
(4.48) (ii) f is invariant under pi/2 rotations, i.e., f(Opi/2x) = f(x), where Opi/2 denotes counterclockwise rotation by pi/2 degrees. (4.49) Then if β/α > C, where C is the constant appearing in (4.41), the solution of the L2– TGV2β,α regularisation problem with data f is the same with the solution of the following TV regularisation problem: min u∈BV(Ω) 1 2 ˆ Ω (u− f)2dx+ αTV(u). Proof. Recall that the L2–TGV2β,α reads min u∈BV(Ω) w∈BD(Ω) 1 2 ˆ Ω (u− f)2dx+ α‖Du− w‖M + β‖Ew‖M, (4.50) where as we have mentioned the variable w satisfies w ∈ argmin w∈BD(Ω) ˆ Ω |Dαu− w| dx+ β α ‖Ew‖M. Since β/α > C is satisfied from Proposition 4.5.2 we have that w is of the form w(x) = Ax+B for some B ∈ R2 and a skew symmetric matrix A. Since f satisfies the symmetry properties (4.48)–(4.49) from the rotational invariance of TGV we have the same holds for the solution u. That means that Dαu has the following properties Dαu(x) = −Dαu(−x), Dα1 u(x1, x2) = Dα1 u(x1,−x2), Dα2 u(x1, x2) = Dα2 u(−x1, x2) (4.51) D1u(Opi/2x) = D2u(x), (4.52) for almost every x = (x1, x2) ∈ Ω. Since w satisfies w ∈ argmin w˜∈KerE ‖Dαu− w˜‖L1(Ω,R2), and w is of the form w(x) = Ax+B it is easy to check (see Lemma A.2.3 in Appendix A) 129 Exact solutions of the one dimensional TGV regularisation problem that w is zero. In Figure 4.5 we provide two numerical examples that verify Theorem 4.5.3. There, we apply TV and TGV denoising to a characteristic function of a circle centred both at the origin, Figure 4.5.3(a), and away from it, Figure 4.5.3(f). Note that the symmetry properties (4.48)–(4.49) are satisfied for the first case. There, we observe that by choosing the ratio β/α large enough TGV2β,α and αTV regularisations produce the same results, Figures 4.5.3(b) and 4.5.3(c). However, TV and TGV do not coincide for small ratio β/α, Figure 4.5.3(d). Finally note that when the symmetry is broken, Figure 4.5.3(f), the TV and TGV solutions do not coincide even for large ratio β/α, Figures 4.5.3(g) and 4.5.3(h). 4.6 Computation of exact solutions In this section we compute exact solutions for the one dimensional L2–TGV2β,α regulari- sation problem for several data functions f . In particular we calculate exact solutions for simple piecewise constant, piecewise affine and hat functions. We focus on the structure of the solutions and their relationship with the parameters α and β. 4.6.1 Piecewise constant function with a single jump For convenience in the calculations, for this sections, Ω will be an interval of the form (0, 2L). We define fp.c. to be a piecewise constant function with a single jump, i.e., fp.c.(x) = 0 if x ∈ (0, L),h if x ∈ [L, 2L), (4.53) for some h > 0, see Figure 4.4. h L L Figure 4.4: Piecewise constant function fp.c. with a jump discontinuity at x = L. In order to compute exact solutions for fp.c. our strategy is as follows: Firstly, we investigate what are the possible types of solutions, describing then in a qualitative way and then we calculate them explicitly. 130 4.6. 
Computation of exact solutions (a) Original image, 200× 200 pixels (b) TV denoising, α = 10 (c) TGV denoising, α = 10, β = 106 (d) TGV denoising, α = 10, β = 200 0 50 100 150 200 250 0 0.2 0.4 0.6 0.8 1 f TGV: α = 10, β = 106 TGV: α = 10, β = 200 TV : α = 10 (e) Corresponding middle row slices (f) Original image, 200× 200 pixels (g) TV denoising, α = 10 (h) TGV denoising, α = 10, β = 106 0 50 100 150 200 250 0 0.2 0.4 0.6 0.8 1 f TGV: α = 10, β = 106 TV : α = 10 (i) Corresponding middle row slices Figure 4.5: Illustration of the two dimensional TV–TGV equivalence for symmetric data when β/α is large enough. Notice that the equivalence does not hold once the symmetry is broken, Figures (f)–(i). The red lines to the original images indicate the slices in Figures (d) and (f). 131 Exact solutions of the one dimensional TGV regularisation problem Since after a simple translation fp.c. is an odd function, from Proposition 4.4.9 we have that the solution u is symmetric with respect to the point (L, h/2). That is to say u(x) = −u(2L− x) + h, ∀x ∈ (L, 2L). Thus, we only need to describe the solution in (0, L). The following theorem states which kinds of solutions are allowed to occur. Theorem 4.6.1. Let u be the solution to the L2–TGV2β,α regularisation problem with data fp.c.. Then u can only be of the following form on (0, L): • There exists 0 < x1 < L such that u is strictly negative and affine in (0, x1) with u(x1) = 0. • There exists potentially 0 < x1 ≤ x˜1 < L such that u = 0 in [x1, x˜1]. • u > 0 in (x˜1, L) consisting of at most two affine parts of increasing gradient in (x˜1, x2) and (x2, L) respectively with x˜1 < x2 < L. Moreover in the case x˜1 = x1, u has the same gradient in (0, x1) and (x1, x2). Note: In fact, as we will see later, x˜1 must be equal to x1, i.e., the allowed solutions are strictly negative in (0, x1) and strictly positive in (x1, L). However this is not obvious by simply interpreting the optimality conditions (Cf ), (Cα) and (Cβ) (which is how Theorem 4.6.1 is proved) but it is a result of subsequent calculations, see Proposition 4.6.2 and Propositions A.1.1–A.1.3 in Appendix A. Proof of Theorem 4.6.1. Let us note first that from Propositions 4.4.2 and 4.4.3, we have that u is continuous in (0, L) with u(L−) ≥ 0. Also, from Proposition 4.4.4 we have that if u < 0 on a set of the form (0, x1) then u must be affine there. Moreover, Proposition 4.4.1 implies that there is no interval (c, d) ⊆ (0, L) such that u < 0 (u > 0) in (c, d) with u(c+) = u(d−) = 0. Indeed, for the u < 0 case, there would be a point x ∈ (c, d) such that u is affine on (c, x) and (x, d) with strictly negative and strictly positive gradient respectively on these intervals. But then u′ would be increasing in an interval where uup < (fp.c.)low, which is a contradiction. We work similarly for the u > 0 case. We conclude that u can be strictly negative only in an interval of the type (0, c), 0 ≤ c ≤ L and strictly positive only in an interval of the type (d, L), 0 ≤ d ≤ L, in the latter possibly consisting of two affine parts of increasing gradient. In particular, u is increasing in (0, L). The proof will be complete if we prove that the following situations cannot happen: (i) u > 0 in (0, L). (ii) u = 0 in (0, x1), 0 < x1 < L and u > 0 in (x1, L). (iii) u = 0 in (0, L). (iv) u < 0 in (0, x1), 0 < x1 < L and u = 0 in (x1, L). 132 4.6. Computation of exact solutions (v) u < 0 in (0, L). For (i) suppose that u > 0 in (0, L). By symmetry we have that u < h in (L, 2L). 
This means from (Cf ) that v is strictly convex in (0, L), strictly concave in (L, 2L) with v(0+) = v ′(0+) = v(2L−) = v′(2l−) = 0 something that cannon happen. For (ii) suppose that there exists a point 0 < x1 < L such that u = 0 in (0, x1) and u > 0 in (x1, L). Then from condition (Cf ) v will be an affine function in (0, x1) but since v ∈ H20 (0, 2L), we have v(0+) = v′(0+) = 0 and thus v = 0 in (0, x1). Moreover Du = 0 in (0, x1) forcing w = 0 a.e. there as well, because otherwise from condition (Cα) we would have that |v′| = α on a set of positive measure in (0, x1). Now, since u > 0 in (x1, L) (assume without loss of generality that u is affine there) from Proposition 4.4.1 we have that w = Du = c > 0 there, something that makes w to have jump discontinuity at x1 and thus from (Cβ) we get that v(x1) = β. However, this contradicts to the continuity of v. For (iii) suppose that u = 0 in (0, L). Then by symmetry we have that u = h in (L, 2L). Thus, from condition (Cf ) again, we have that v will be affine in those intervals and from its boundary conditions and continuity we get that v = 0 in (0, L). But since u has a jump discontinuity at x = L, according to condition (Cα), we must have v ′(L) = −α, a contradiction. For (iv) suppose that there exists a point 0 < x1 < L such that u < 0 in (0, x1) and u = 0 in (x1, L). Then, from (Cf ) v will be strictly convex and thus have a strictly increasing derivative in (0, x1). Since v ′(0+) = 0 we have that v′(x1) > 0. Moreover again from (Cf ), v will be affine in (x1, L) something that forces v(L) > 0. However, since after a simple translation fp.c. is an odd function and according to Proposition 4.4.8 the same is true for the continuous function v, we must have v(L) = 0. The case (v) is excluded in a similar way. Finally, for the last statement of the proposition, suppose that u < 0 in (0, x1), u > 0 in (x1, L) and suppose towards contradiction that the derivative of u has a jump discontinuity at x1. Since Du = w in (0, x1) and (x1, L) this means that w will have a jump discontinuity at x1 and condition (Cβ) imposes that |v(x1)| = β. The case v(x1) = −β is impossible since v is strictly increasing in (0, x1), and also if v(x1) = β from the fact that v ′(x1) > 0 and v′ is continuous at x1, we have that v is increasing in (x1, x1 + δ) for some δ > 0, thus v(x1 + δ) > β, contradicting (Cβ). Notice how we take advantage of the symmetries of u and v in order to prove Theorem 4.6.1. It remains to rule out the possibility that x1 < x˜1. In particular, we have to show 133 Exact solutions of the one dimensional TGV regularisation problem that the four following situations cannot occur 1: (i) u < 0 in (0, x1), u = 0 in (x1, x˜1), u > 0 with one affine part in (N-0-P1-J) (x˜1, L), jump discontinuity at x = L, (ii) u < 0 in (0, x1), u = 0 in (x1, x˜1), u > 0 with two affine parts in (N-0-P2-J) (x˜1, L), jump discontinuity at x = L, (iii) u < 0 in (0, x1), u = 0 in (x1, x˜1), u > 0 with one affine part in (N-0-P1-C) (x˜1, L), continuous at x = L, (iv) u < 0 in (0, x1), u = 0 in (x1, x˜1), u > 0 with two affine parts in (N-0-P2-C) (x˜1, L), continuous at x = L, where 0 < x1 < x˜1 < L, see also Figure 4.6. f u (a) N-0-P1-J f u (b) N-0-P2-J f u (c) N-0-P1-C f u (d) N-0-P2-C Figure 4.6: Not allowed type of solutions for the data function fp.c.. In the following we show that N-0-P1-J cannot occur. The exclusions of the cases N-0- P2-J, N-0-P1-C, N-0-P2-C are done in a similar way and they can be found in Appendix A. Proposition 4.6.2. 
The case N-0-P1-J cannot occur. Proof. Suppose that u < 0 in (0, x1), u = 0 in (x1, x˜1) and u > 0 and affine in (x˜1, L) with 0 < x1 < x˜1 < L. We claim that in that case, u ′ is constant, say equal to c > 0, in (0, x1) ∪ (x˜1, L), w = c in (0, L) and v is affine in (x1, x˜1) with gradient equal to α there. Indeed, let c be the value of the derivative of u in (0, x1). We know already from Proposition 4.4.1 that w = u′ there. Now, if w 6= c on a set of positive measure in (x1, x˜1), that will mean that Dw 6= 0 somewhere in (x1, x˜1) and thus from (Cβ), v must be ±β somewhere in (x1, x˜1) and also |v| ≤ β there. But this is not possible since v is strictly convex in (0, x1) with v ′(0+) = 0 and it is affine in (x1, x˜1) and thus v is strictly increasing in (0, x˜1). Moreover, w = v ′ = c in (x˜1, L) because otherwise w would have a jump discontinuity at x˜1, something that forces v(x˜1) = β but this cannot happen since 1N: negative, 0: zero, P1: positive one affine part, P2: positive two affine parts, J: jump discontinuity, C: continuous. 134 4.6. Computation of exact solutions v′(x˜1) > 0. Finally, since Du− w = −c < 0 in (x1, x˜1), from condition (Cα) we have that v′ = α there. Also note that since u has a jump discontinuity at x = L, condition (Cα) forces v′(L) = −α and from the symmetry of v we must have v(L) = 0. For convenience we set `1 = x1 > 0, `2 = x˜1 − x1 > 0, `3 = L− x˜1 > 0 with `1 + `2 + `3 = L. and v1 = vb(0, x1), v2 = vb(x1, x˜1), v3 = vb(x˜1, L). Since u is an affine function on (0, x1) we have that v1 is a cubic function v1(x) = a1x 3 + b1x 2 + c1x+ d1, x ∈ (0, x1), that satisfies the conditions v1(0) = 0, v ′ 1(0) = 0, v ′ 1(`1) = α, v ′′ 1(`1) = 0. (4.54) Thus v1 will have the form v1(x) = − α 3`21 x3 + α `1 x2, x ∈ (0, x1). (4.55) Since v2 is an affine function in (x1, x2) with gradient α and v is continuous at x1, we have that v2 will be of the form v2(x) = α(x− `1) + 2α`1 3 , x ∈ (x1, x˜1). (4.56) Since u is a strictly positive affine function on (x˜1, L) (with the same gradient as in (0, x1)), we have that v3 must be a cubic function v3(x) = a3x 3 + b3x 2 + c3x+ d3, x ∈ (x˜1, L), that satisfies the following conditions at x˜1 = `1 + `2: v3(`1+`2) = α`2+ 2α`1 3 , v′3(`1+`2) = α, v ′′ 3(`1+`2) = 0, v ′′′ 3 (`1+`2) = − 2α `21 , (4.57) and also the following conditions at L = `1 + `2 + `3: v3(`1 + `2 + `3) = 0, (4.58) v′3(`1 + `2 + `3) = −α. (4.59) 135 Exact solutions of the one dimensional TGV regularisation problem Conditions (4.57) give v3(x) = − α 3`21 (x− `1 − `2)3 + α(x− `1 − `2) + α`2 + 2α`1 3 , x ∈ (x˜1, L). (4.60) What is left is to find the relationship among `1, `2 and `3. Condition (4.59) gives `3 = √ 2`1, (4.61) and the condition (4.58) together with (4.61) gives 2 + √ 2 3 `1 + `2 = 0, (4.62) which is a contradiction since both `1 and `2 are strictly positive. Thus, this kind of solution cannot occur. From all the previous results, it follows that only the following situations can occur: (i) u < 0 in (0, x1), u > 0 with one affine part in (x1, L), jump (N-P1-J) discontinuity at x = L, (ii) u < 0 in (0, x1), u > 0 with two affine parts in (x1, L), jump (N-P2-J) discontinuity at x = L, (iii) u < 0 in (0, x1), u > 0 with one affine part in (x1, L), continuous (N-P1-C) at x = L, (iv) u < 0 in (0, x1), u > 0 with two affine parts in (x1, L), continuous (N-P2-C) at x = L, where 0 < x1 < L. In Figure 4.7 we provide a qualitative description of how these four allowed solutions look like, along with a description of the corresponding variables w and v. 
Notice that from Theorem 4.4.6, the solution of the type N-P1-C is in fact the L2–linear regression of fp.c.. Note finally that it can easily be checked that the direction of the jump of u at x = L is the same as that of the data fp.c.. Otherwise, from (Cα) we would have v'(L) = α and that would lead again to contradictions.

Our next step is to identify the combinations of the parameters α and β that lead to each type of solution.

Figure 4.7: All the possible types of solutions u to the minimisation problem P for the data function fp.c., together with the forms of the corresponding variables w and v: (a) solution of the type N-P1-J, (b) solution of the type N-P2-J, (c) solution of the type N-P1-C, (d) solution of the type N-P2-C.

Proposition 4.6.3 (N-P1-J). The solution u of the problem P with data function fp.c. is of the type N-P1-J, see Figure 4.7(a), if and only if α and β satisfy the conditions

β/α ≥ 4L/27 and α < hL/8. (4.63)

More specifically, in that case

u(x) = (6α/L²) x − 2α/L, if x ∈ (0, L),
u(x) = (6α/L²)(x − L) + h − 4α/L, if x ∈ (L, 2L), (4.64)

i.e., u has a jump discontinuity of size h − 8α/L at x = L.

Proof. It is easy to check that the solution u will be of the type N-P1-J if and only if we can find a function v in [0, L] that satisfies the following conditions:

v is a cubic polynomial, (from condition (Cf), v'' = −u is an affine function), (4.65)
v(0) = 0, v'(0) = 0, (boundary conditions for v), (4.66)
v(L) = 0, (v is an odd function), (4.67)
v'(L) = −α, (u jumps at L, condition (Cα)), (4.68)
0 < −v''(L) < h/2, (u jumps at L & symmetry), (4.69)
|v'| ≤ α, (condition (Cα)), (4.70)
|v| ≤ β, (condition (Cβ)). (4.71)

It is now easy to check that the function

v(x) = −(α/L²) x³ + (α/L) x², x ∈ [0, L],

satisfies conditions (4.65)–(4.68) and also condition (4.70). Observe that v attains its maximum in [0, L] at x = 2L/3, with v(2L/3) = 4αL/27, thus (4.71) can be written as 4αL/27 ≤ β. Finally, condition (4.69) can be written as 0 < 4α/L < h/2. The last two inequalities form the two conditions in (4.63). Using the fact that u = −v'' in (0, L) and taking advantage of the symmetry of u, we compute (4.64).

We note here that the computation of the above type of exact solution for fp.c. (for a fixed jump h = 2) is also done in [Ben11], in order to indicate that fp.c. cannot be recovered by TGV regularisation up to a loss of contrast, as it is in TV regularisation.

We now proceed to the study of the case N-P2-J. As we will see, it is computationally inaccessible to find the exact conditions on α and β that result in this kind of solution, but we are able to provide some sufficient conditions that are not far from being necessary as well.

Proposition 4.6.4 (N-P2-J). If α and β satisfy the conditions

β/α < 4L/27 and β > 4α²/(3h), (4.72)

then the solution u of the problem P with data function fp.c. is of the type N-P2-J, see Figure 4.7(b). More specifically, there exist points 0 < x1 < x2 < L with x1 = x2/2 and

L − 3β/α < x2 < L − 9β/(4α),

such that u < 0 and u > 0 in (0, x1) and (x1, L) respectively, with u' having a positive jump at x2.

Proof.
Again, it is easily verified that u will be a solution of the type N-P2-J if and only if we can find `1 > 0, `2 > 0 with `1 + `2 = L and two functions v1, v2 defined on [0, `1] and [`1, `1 + `2] respectively, such that the following conditions are satisfied: v1, v2 are cubic polynomials, (from condition (Cf )), (4.73) v1(0) = 0, v ′ 1(0) = 0, (boundary conditions for v), (4.74) v1(`1) = β, v ′ 1(`1) = 0, (v has a max at `1, condition (Cα)), (4.75) v2(`1) = β, v ′ 2(`1) = 0, v ′′ 2(`1) = v ′′ 1(`1), (continuity of v, v ′ and u), (4.76) v′′′2 (`1) < v ′′′ 1 (`1), (u ′ has a positive jump at `1) (4.77) v2(L) = 0, (v is an odd function), (4.78) 138 4.6. Computation of exact solutions v′2(L) = −α, (u jumps at L, condition (Cα)), (4.79) 0 < −v′′2(L) < h 2 , (u jumps at L & symmetry), (4.80) |v′i| ≤ α, i = 1, 2, (condition (Cα)), (4.81) |vi| ≤ β, i = 1, 2, (condition (Cβ)). (4.82) In that case `1 = x2 and `2 = L− `1. From conditions (4.73)–(4.75), we get that v1(x) = −2β `31 x3 + 3β `21 x2, x ∈ [0, `1]. (4.83) One can easily check from the fact that u = −v′′1 in (0, `1), that x1 = `1/2. We can also check that condition (4.81) for v2 is equivalent to 3β 2α ≤ `1, (4.84) while condition (4.82) is also satisfied in that case. After some computations, we have that conditions (4.76), (4.78), (4.79) give that v2 will be of the form v2(x) = ( 2β `21`2 − α 3`22 ) (x− `1)3 − 3β `21 (x− `1)2 + β. (4.85) We also get the following relationship between `1 and `2: γ`22 + `2` 2 1 − γ`21 = 0, with γ := 3β α , (4.86) where one can check that `2 < γ independently of the value of `1. Notice now, that if the condition (4.77) is satisfied, then v2 will be decreasing, with decreasing derivative as well something that would imply that conditions (4.81)–(4.82) hold for v2. With the help of (4.85)–(4.86) and condition `1 + `2 = L, we get that condition (4.77) is equivalent to `21 − 2L`1 + 2γL < 0. (4.87) The inequality (4.87) is true if and only if γ < L 2 and L− √ L2 − 2γL < `1 ⇐⇒ `2 < √ L2 − 2γL. (4.88) From the fact that `1 + `2 = L and (4.86) we get that `2 is the unique solution of φ(`2) := √ γ`22 γ − `2 + `2 − L = 0. (4.89) In view of (4.89) we have that `2 = √ L2 − 2γL if and only if γ = 4L/9. Since φ is strictly 139 Exact solutions of the one dimensional TGV regularisation problem increasing, one can check that (4.88) is equivalent to the very simple expression β α < 4L 27 . (4.90) Notice that in that case (4.84) is also satisfied. Finally, the last condition that has to be satisfied is (4.80). Using (4.85) and (4.86) it can be checked that it is equivalent to 4`2 − 2γ `22 < h 2α . (4.91) Ideally, one would like to obtain an explicit expression for `2 from (4.89) and obtain an inequality involving α and β using (4.91). However, this is practically impossible as one would have to solve a cubic equation for `2. This is why we are giving some estimates instead. One can check that from (4.89) and (4.90) we can derive that 3 4 γ < `2 < γ. (4.92) We have that the expression (4`2 − 2γ)/`22 in (4.91) is a strictly increasing function of `2 provided that `2 < γ which is true in our case. Thus, in order to satisfy (4.91) a sufficient (but not necessary) condition is 4γ − 2γ γ2 < h 2α ⇐⇒ β > 4α 2 3h . Remark: Let us also point out the following fact: Suppose that α∗ and β∗ are such, so that conditions (4.72) hold, i.e., the solution is of the type N-P2-J. Then there exists a 0 < c < β∗ such that for every β ≤ c the condition (4.91) is violated, so we do not have this type of solution. 
Indeed from (4.91) and (4.92) we have that 16 9γ < 4`2 − 2γ `22 < h 2α . Thus keeping α fixed and choosing β such that 16 9γ > h 2α ⇐⇒ β < 32α 2 27h , (4.93) we cannot have the solution of the type N-P2-J any more. We now turn our attention to the solution of the type N-P1-C. As we mentioned earlier, in that case u is an affine function and thus also the L2–linear regression of fp.c., see Theorem 4.4.6. In the same theorem we gave some thresholds for α and β in the general case but we can be more explicit in this specific example. 140 4.6. Computation of exact solutions Proposition 4.6.5 (N-P1-C). The solution u of the problem P with data function fp.c. is of the type N-P1-C, see Figure 4.7(c), if and only if α and β satisfy the conditions α ≥ hL 8 and β ≥ hL 2 54 . (4.94) Moreover, in that case u will be the L2–linear regression of fp.c. and equal to u(x) = 3h 4L x− h 4 , x ∈ (0, 2L). (4.95) Proof. Again we can check that the solution will be of the type N-P1-C if and only if we can find a function v defined on [0, L] such that the following conditions hold: v is a cubic polynomial, (4.96) v(0) = 0, v′(0) = 0, (4.97) v(L) = 0, (4.98) −v′′(L) = h 2 , (condition (Cf ), continuity at L & symmetry), (4.99) |v′| ≤ α, (4.100) |v| ≤ β. (4.101) We can easily check that conditions (4.96)–(4.99) give v(x) = − h 8L x3 + h 8 x2, x ∈ [0, L]. (4.102) Moreover, the maximum values of |v| and |v′| are hL254 and hL8 respectively. Thus, conditions (4.100)–(4.101) will be satisfied if and only if (4.94) holds. Finally, we investigate under which combinations of α and β we have solutions of the type N-P2-C. As in the case N-P2-J, it is computationally inaccessible to provide explicit conditions, thus we provide again some sufficient but not necessary conditions. Proposition 4.6.6 (N-P2-C). If α and β satisfy the conditions β < hL2 54 and β ≤ 4Lα 9 − L 2 27 , (4.103) then the solution u of the problem P with data function fp.c. is of the type N-P2-C, see Figure 4.7(d), and u′ has a positive jump at a point x2 where 2L3 < x2 < L. Proof. As in the N-P2-J case, u will be a solution of the type N-P2-C if and only if we can find `1 > 0, `2 > 0 with `1 + `2 = L and two functions v1, v2 defined on [0, `1] and 141 Exact solutions of the one dimensional TGV regularisation problem [`1, `1 + `2] respectively, such that the following conditions are satisfied: v1, v2 are cubic polynomials, (4.104) v1(0) = 0, v ′ 1(0) = 0, (4.105) v1(`1) = β, v ′ 1(`1) = 0, (4.106) v2(`1) = β, v ′ 2(`1) = 0, v ′′ 2(`1) = v ′′ 1(`1), (4.107) v′′′2 (`1) < v ′′′ 1 (`1), (4.108) v2(L) = 0, (4.109) −v′′(L) = h 2 , ((Cf ), continuity at L & symmetry), (4.110) |v′i| ≤ α, i = 1, 2, (4.111) |vi| ≤ β, i = 1, 2. (4.112) Conditions (4.104)–(4.106) together with (4.111)–(4.112) yield v1(x) = −2β `31 x3 + 3β `21 x2, x ∈ [0, `1], (4.113) with 3β 2α ≤ `1. (4.114) Supposing that v2 is of the form v2(x) = a3(x− `1)3 + a2(x− `1)2 + a1(x− `1) + a0, x ∈ [`1, `1 + `2], conditions (4.107), (4.109), (4.110) give v2(x) = a3(x− `1)3 − 3β `21 (x− `1)2 + β, x ∈ [`1, `1 + `2], (4.115) and a3` 3 2 − 3β`22 `21 + β = 0, (4.116) −6a3`2 + 6β `21 = h 2 . (4.117) One can check that conditions (4.107) and (4.110) impose v′2 to be decreasing , so in order to satisfy (4.111) and (4.112) it suffices to impose |v′2(`1 + `2)| ≤ α, which, with the help of (4.115) and (4.117), is equivalent to 3β`2 `21 + h`2 4 ≤ α. (4.118) 142 4.6. 
Computation of exact solutions Moreover combining (4.116), (4.117) and the fact that `1+`2 = L, we can get a relationship between `1 and `2 and an equation for `1: `2 = √√√√ 12β h+ 24β `21 , (4.119) φ(`1) := `1 + √√√√ 12β h+ 24β `21 − L = 0. (4.120) It is easy to see that the equation (4.120) has a unique solution in (0, L) but one has to solve a quartic in order to express it explicitly. Thus, as in the case N-P2-J we give some estimates. Observe firstly that using (4.113), (4.115), (4.117) and (4.119) condition (4.108) is satisfied if and only if 18β`1 `2 − 6β` 3 1 `32 < −12β, (4.121) which is satisfied if and only if `2 < `1 2 . However, from (4.119), this is true if and only if `1 > √ 24β h . From the fact that the function φ is strictly increasing, the last inequality is true if and only if β < hL2 54 . (4.122) It remains to identify some sufficient conditions for (4.118). Observe that under (4.122) we have φ(2L/3) < 0 and φ(L > 0), which means that 2L 3 < `1 < L and 0 < `2 < L 3 . (4.123) Using (4.123), we find that a sufficient (but not necessary) condition for (4.118) is 3β (2L3 ) 2 L 3 + hL 12 < α ⇐⇒ β ≤ 4Lα 9 − L 2h 27 . (4.124) We can easily verify that under (4.124) the condition (4.114) is satisfied as well. We now summarise how the solutions of the problem P with data function fp.c. are affected by the different choices of the parameters α and β. In Figure 4.8 we have par- titioned the set {α > 0, β > 0} into different areas that correspond to the four different possible solutions. These areas are: (1) Blue colour: { α < hL8 , β ≥ 4L27α } , (necessary and sufficient condition), N-P1-J, see Figure 4.7(a). (2) Purple colour: { β < 4L27α, β ≥ 3627hα2 } , (sufficient condition), N-P2-J, see Figure 143 Exact solutions of the one dimensional TGV regularisation problem hL 8 αhL 12 (0, 0) hL2 54 β = 4L27α (4) (3) (1) (1) (2) (3) (4) β hL 9 (2) β = 3227hα 2 β = 3627hα 2 β = 4L9 α− hL 2 27 Figure 4.8: Illustration of the four different types of solutions of P for the piecewise constant function fp.c. that result for different combinations of the parameters α and β. 4.7(b). (3) Yellow colour: { α ≥ hL8 , β ≥ hL 2 54 } , (necessary and sufficient condition), N-P1-C, see Figure 4.7(c). (4) Orange colour: { β ≤ 3227hα2, β < hL 2 54 } , (sufficient condition), N-P2-C, see Figure 4.7(d). Notice that according to Proposition 4.6.6 our initial sufficient conditions for the solu- tion of the type N-P2-C to happen were β < hL 2 54 and β ≤ 4L9 α− L 2h 27 . However, according to the remark after Proposition 4.6.4, page 140, when condition β ≤ 3227hα2 holds then the solution of the type N-P2-J cannot happen and since the conditions for N-P1-C and N-P2-C are necessary and sufficient we conclude that when β ≤ 3227hα2 holds (together with α ≤ hL8 ) then the solution of the type N-P2-C occur. Notice that{ α > 0, β > 0 : β ≤ 4L 9 α− L 2h 27 , α ≤ hL 8 } ⊆ { α > 0, β > 0 : β ≤ 32 27h α2, α ≤ hL 8 } , see Figure 4.8. As we have mentioned, due to computational issues it is difficult to derive necessary 144 4.6. Computation of exact solutions and sufficient conditions for the solutions of the type N-P2-J and N-P2-C. However, the estimates we have provided are not far away from being sharp as the unknown grey area in Figure 4.8, {β > 3227hα2, β < 3627hα2, β < 4L27α} is relatively small. 
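Although closed-form necessary and sufficient conditions for N-P2-J and N-P2-C are out of reach, the classification of Figure 4.8 is easy to reproduce numerically: equation (4.89) for ℓ2 can be solved by bisection (φ is strictly increasing) and the exact jump condition (4.91) tested directly, which also resolves the grey area. The sketch below does this. It assumes, as established above, that the four types are exhaustive; the values h = 100, L = 250 used in the example are our own guess at the scale of the experiments in Section 4.6.4 (they are consistent with the α, β reported in Figures 4.15–4.17), and all names are ours.

```python
# Illustrative classifier for the parameter regions of Figure 4.8 (datum f_{p.c.}).
import numpy as np

def solve_l2(gamma, L):
    """Unique root of phi(t) = sqrt(gamma*t^2/(gamma - t)) + t - L on (0, min(gamma, L)), cf. (4.89)."""
    phi = lambda t: np.sqrt(gamma * t**2 / (gamma - t)) + t - L
    lo, hi = 1e-12, min(gamma, L) - 1e-12
    for _ in range(200):                                   # bisection; phi is strictly increasing
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

def tgv_solution_type_pc(alpha, beta, h, L):
    if alpha >= h * L / 8 and beta >= h * L**2 / 54:
        return "N-P1-C"                                    # region (3), conditions (4.94)
    if alpha < h * L / 8 and beta >= 4 * L * alpha / 27:
        return "N-P1-J"                                    # region (1), conditions (4.63)
    if beta / alpha < 4 * L / 27:                          # two affine parts with u' jumping, cf. (4.90)
        gamma = 3.0 * beta / alpha
        l2 = solve_l2(gamma, L)
        jump_at_L = (4 * l2 - 2 * gamma) / l2**2 < h / (2 * alpha)   # exact condition (4.91)
        return "N-P2-J" if jump_at_L else "N-P2-C"
    return "N-P2-C"

# example with h = 100, L = 250 (illustrative values consistent with Figures 4.15-4.17):
for a, b in [(1250.0, 50000.0), (1250.0, 30000.0), (4000.0, 120000.0)]:
    print(a, b, tgv_solution_type_pc(a, b, h=100.0, L=250.0))   # N-P1-J, N-P2-J, N-P1-C
```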
4.6.2 Piecewise affine function with a single jump

In this section we choose the data function to be a simple piecewise affine function fp.a., defined as

fp.a.(x) = λ(x − L), if x ∈ (0, L),
fp.a.(x) = λ(x − L) + h, if x ∈ [L, 2L), (4.125)

for some h > 0 and λ ∈ R, see also Figure 4.9.

Figure 4.9: Piecewise affine function fp.a. with a jump discontinuity at x = L.

Note that fp.c. is a special case of fp.a. where λ = 0. However, we decided to treat the fp.c. case separately since, as the next proposition shows, the solutions of P with data fp.a. can be obtained from the ones that correspond to fp.c.. The latter are easier to compute since fp.c. has a simpler form. The following proposition essentially states that if an affine function is added to the data f, then the solution u of P is also shifted by the same affine function.

Proposition 4.6.7. The function u_{fp.c.} is a solution to the minimisation problem P with data function fp.c. if and only if u_{fp.a.} is a solution to P with data function fp.a., where u_{fp.a.}(x) = u_{fp.c.}(x) + λ(x − L).

Proof. Suppose that u_{fp.c.} is a solution of P with data fp.c., for some combination of the parameters α and β, and let v_{fp.c.}, w_{fp.c.} be the corresponding v and w variables. We will show that u_{fp.a.}(x) = u_{fp.c.}(x) + λ(x − L) is a solution for data fp.a. for the same combination of α and β, and vice versa. The optimality conditions (Cf), (Cα), (Cβ) read:

v''_{fp.c.} = fp.c. − u_{fp.c.},
−v'_{fp.c.} ∈ α Sgn(Du_{fp.c.} − w_{fp.c.}),
v_{fp.c.} ∈ β Sgn(Dw_{fp.c.}).

Observing that fp.a.(x) = fp.c.(x) + λ(x − L), we set v_{fp.a.} = v_{fp.c.} and w_{fp.a.}(x) = w_{fp.c.}(x) + λ. Then we have

v''_{fp.a.} = v''_{fp.c.} = fp.c. − u_{fp.c.} = fp.a. − u_{fp.a.},

and

−v'_{fp.a.} ∈ α Sgn(Du_{fp.a.} − w_{fp.a.}) ⟺ −v'_{fp.a.} ∈ α Sgn(Du_{fp.c.} + λ − w_{fp.c.} − λ) ⟺ −v'_{fp.c.} ∈ α Sgn(Du_{fp.c.} − w_{fp.c.}).

Finally,

v_{fp.a.} ∈ β Sgn(Dw_{fp.a.}) ⟺ v_{fp.a.} ∈ β Sgn(Dw_{fp.c.}) ⟺ v_{fp.c.} ∈ β Sgn(Dw_{fp.c.}),

thus (Cf), (Cα), (Cβ) hold for u_{fp.a.}, v_{fp.a.} and w_{fp.a.}. We can similarly show that if u_{fp.a.} is a solution for data fp.a. then u_{fp.c.}(x) = u_{fp.a.}(x) − λ(x − L) is a solution for data fp.c..

Figure 4.10: All the possible types of solutions u to the minimisation problem P for the data function fp.a.: (a) solution of the type N-P1-J, (b) solution of the type N-P2-J, (c) solution of the type N-P1-C, (d) solution of the type N-P2-C.

In Figure 4.10 we show all the possible types of solutions for the data fp.a.. These solutions correspond to the same combinations of α and β that are shown in Figure 4.8. One can observe here the capability of TGV to preserve piecewise affine structures. This is in contrast to TV regularisation, which promotes piecewise constant reconstructions. In fact, one can easily check using the optimality conditions of the L2–TV problem in Table 4.1 that all TV solutions have the form shown in Figure 4.11.

Figure 4.11: Type of solutions for the L2–TV problem with data fp.a..

4.6.3 Hat function

In this section we compute exact solutions for the hat function

fhat(x) = λ|x − L| − λL,

where λ > 0, see Figure 4.12. The study of this case gives an insight into how TGV affects local extrema.

Figure 4.12: The hat function fhat.

Note that since fhat is an even function, the solutions will be even as well.
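Before analysing fhat in detail, the shift property of Proposition 4.6.7 is worth a quick numerical illustration: starting from the exact N-P1-J solution (4.64) for fp.c., the corresponding solution for fp.a. is obtained by adding λ(x − L), so the affine trend of the data passes through TGV regularisation unchanged and the jump size h − 8α/L is unaffected. This is a minimal sketch with illustrative parameter values; the function names are ours.

```python
# Illustration of Proposition 4.6.7 for the exact N-P1-J solution (4.64).
import numpy as np

def u_pc_np1j(x, alpha, h, L):
    """Exact N-P1-J solution (4.64) for the piecewise constant datum f_{p.c.}."""
    return np.where(x < L,
                    6 * alpha / L**2 * x - 2 * alpha / L,
                    6 * alpha / L**2 * (x - L) + h - 4 * alpha / L)

def u_pa_np1j(x, alpha, h, L, lam):
    """Solution for f_{p.a.} = f_{p.c.} + lam*(x - L), obtained via Proposition 4.6.7."""
    return u_pc_np1j(x, alpha, h, L) + lam * (x - L)

# beta does not enter (4.64); it only determines that we are in the N-P1-J regime (4.63)
L, h, lam, alpha = 250.0, 100.0, 0.1, 1250.0
x = np.linspace(0.0, 2 * L, 2001)
u = u_pa_np1j(x, alpha, h, L, lam)

slopes = np.diff(u) / np.diff(x)
print(slopes[10], lam + 6 * alpha / L**2)            # slope lam + 6*alpha/L^2 on each affine piece
print(u_pa_np1j(L + 1e-9, alpha, h, L, lam)
      - u_pa_np1j(L - 1e-9, alpha, h, L, lam))       # jump ~ h - 8*alpha/L, independent of lam
```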
Working similarly as we did for the function fp.c., we conclude that the only types of solutions u are the following2: (i) u is constant in (0, x1), equal to fhat in (x1, x2) and constant in (x1, L), (C-E-C) (ii) u is affine in (0, x1), equal to fhat in (x1, x2) and affine in (x1, L), (A-E-A) (iii) u is constant in (0, L), (C) (iv) u is affine in (0, L), (A) where 0 < x1 < x2 < L. In Figure 4.13 we show how these four types of solutions look like, along with the corresponding w and v variables. Notice that the solution of the type C is the L2–linear regression of fhat. Note also, that the solutions of the type C-E-C correspond to the TV solutions predicted by Theorem 4.4.11. We finally note that the 2C: constant, E: equal to fhat, A: affine. 147 Exact solutions of the one dimensional TGV regularisation problem solution of the type A is also computed in [Ben11, Mu¨l13, BBBM13] in order to indicate that fhat can be recovered by TGV regularisation up to a loss of contrast. t t x Du w β v x x v′ = α v′ = −α u (a) Solution of the type C-E-C. t t x Du w β v x x v′ = −αv′ = α u (b) Solution of the type A-E-A. L 2 λL 2 x Du w β v x x u (c) Solution of the type C. t t L 2 Du w β v x x x u (d) Solution of the type A. Figure 4.13: All the possible types of solution u to the minimisation problem P for the data function fhat. We also show the forms of the corresponding variables w and v. In the following, we investigate which combinations of the parameters α and β corre- spond to each type of solution. Unlike the case with the function fp.c., here we are able to give necessary and sufficient conditions for all the types of solutions. We summarise our results in the following proposition. Proposition 4.6.8. Consider the problem P with data function fhat. Then we have the following four cases: (i) The solution u is of the type C-E-C, Figure 4.13(a), if and only if α < λL2 8 and β ≥ αL− 2α 3 √ 2α λ . (4.126) In that case u(x) =  −√2αλ if x ∈ ( 0, √ 2α λ ) , −λx if x ∈ (√ 2α λ , L− √ 2α λ ) , −λL+√2αλ if x ∈ ( L− √ 2α λ ), L ) . (ii) The solution u is of the type A-E-A, Figure 4.13(b), if and only if β > 2Lα 3 and β < αL− 2α 3 √ 2α λ . (4.127) 148 4.6. Computation of exact solutions In that case u(x) =  −µx−√2α(λ− µ) if x ∈ (0, 32 (L− βα)) , −λx if x ∈ ( 3 2 ( L− βα ) , L− 32 ( L− βα )) , −µx− (λ− µ)L+ 2√2α(λ− µ) if x ∈ (L− 32 (L− βα) , L) , with µ = λ− 8α 9 ( L− βα )2 . (iii) The solution u is of the type C, Figure 4.13(c), if and only if β ≥ λL 3 12 and α ≥ λL 2 8 . (4.128) In that case u(x) = λL 2 , x ∈ (0, 2L). (iv) The solution u is of the type A, Figure 4.13(d), if and only if β < λL3 12 and β ≤ 2Lα 3 . (4.129) In that case u(x) = µ|x− L| − (λL− t), x ∈ (0, 2L) with µ = λ− 12β L3 and t = 6β L2 . Proof. (i) Since we are looking for solutions of the type C-E-C, we must find 0 < x1 < x2 < L such that u = t for a constant t in (0, x1), u = fhat in (x1, x2) and u = −t−λ(x2−x1) in (x2, L). As far as the variable w is concerned, we have that w = Du = 0 in (0, x1)∪(x2, L). However since u = fhat in (x1, x2), we have that v is affine in that interval, thus it cannot have an extremum there. Thus, condition (Cβ) forces w = 0 in (x1, x2) as well and (Cα) forces v′ = α there. 
We conclude that the solution u will be of the type C-E-C if and only if we can find 0 < x1 < x2 < L and a function v with the following properties: v and v′ are continuous, (v ∈ H20 ), (4.130) v is a cubic polynomial on (0, x1) and (x2, L), (condition (Cf )), (4.131) v(0) = 0, v′(0) = 0, (boundary conditions for v), (4.132) v is affine and v′ = α on (x1, x2), (as explained above), (4.133) v′(L) = 0, (v is an even function), (4.134) 149 Exact solutions of the one dimensional TGV regularisation problem |v| ≤ β, (condition (Cβ)), (4.135) |v′| ≤ α, (condition (Cα)). (4.136) After some computations we find that x1 = L− x2 = √ 2α λ , t = √ 2αλ, v(x) =  −λ6x3 + √ αλ 2 x 2 if x ∈ ( 0, √ 2α λ ) , α(x− x1) + λx 2 1 3 if x ∈ (√ 2α λ , L− √ 2α λ ) , −λ6 (x+ x1 − L)3 + α(x+ x1 − L) +α(L− 2x1) + λx 3 1 3 if x ∈ ( L− √ 2α λ , L ) . From the equation v′′ = fhat − u we can get an expression for u. Moreover x1 < x2 holds if and only if α < λL2 8 . Finally one can check that (4.136) holds and in order to satisfy (4.135), since v is increasing in (0, L), it suffices to have v(L) ≤ β something that translates to β ≥ αL− 2α 3 √ 2α λ . (ii) The proof follows essentially the proof of (i). Here we are looking for solutions of the type u = −µx− t instead of u = t on (0, x1). In addition to the conditions (4.130)–(4.136) for v here we also have v(L) = β, (condition (Cβ)), (4.137) because w makes a positive jump at x = L, see also Figure 4.13(b). Again after some computations we find x1 = L− x2 = √ 2α λ− µ, t = √ 2α(λ− µ), v(x) =  −λ−µ6 x3 + √ α(λ−µ) 2 x 2 if x ∈ ( 0, √ 2α λ−µ ) , α(x− x1) + (λ−µ)x 2 1 3 if x ∈ (√ 2α λ−µ , L− √ 2α λ−µ ) , −λ−µ6 (x+ x1 − L)3 + α(x+ x1 − L) +α(L− 2x1) + (λ−µ)x 3 1 3 if x ∈ ( L− √ 2α λ−µ , L ) . 150 4.6. Computation of exact solutions Again one can check that |v′| < α. Moreover the condition v(L) = β gives αL− 2α 3 √ 2α λ−−µ = β ⇐⇒ µ = λ− 8α 9 ( L− βα )2 . (4.138) Since we are looking into cases where 0 < µ < λ we must have √ 2α λ−µ > √ 2α λ and using (4.138) this translates to β < αL− 2α 3 √ 2α λ . (4.139) Finally, it is easily checked that in order to impose x1 < x2, we must have β > 2Lα 3 . (4.140) (iii) In this case, we are looking for solutions of the type u = −t, t > 0. The function v will be a cubic polynomial that satisfy the conditions: v(0) = 0, v′(0) = 0, v′(L) = 0, v′′ = h+ t, |v| ≤ β, |v′| ≤ α. We easily compute t = λL 2 , v(x) = −λ 6 x3 + 1 2 tx2, x ∈ (0, L). We can also check that the conditions |v| ≤ β, |v′| ≤ α are equivalent to β ≥ λL 3 12 , α ≥ λL 2 8 , respectively. (iv) The proof is similar to (iii). We are looking for a solution of the type u(x) = −µx− t, x ∈ (0, L), 0 < µ < λ and t > 0. As before, the function v will be a cubic polynomial satisfying the conditions: v(0) = 0, v′(0) = 0, v(L) = β, v′(L) = 0, v′′ = h+ t, |v| ≤ β, |v′| ≤ α. We get that v(x) = −λ− µ 6 x3 + t 2 x2, x ∈ (0, L), with t = 6β L2 , µ = λ− 12β L3 . Thus 0 < µ < λ is equivalent to β < λL3 12 . 151 Exact solutions of the one dimensional TGV regularisation problem Finally, we check easily that |v| ≤ β holds and |v′| ≤ α is equivalent to β ≤ 2Lα 3 . α β λL2 8 λL3 12 β = 2Lα3 (3) (4) (2) (1) (2) (3) β = αL− 2α3 √ 2α λ (1) (4) (0, 0) Figure 4.14: Illustration of the four different types of solutions of P for the hat function fhat that result for different combinations of the parameters α and β. As we did for the function fp.c. we summarise how the solutions of the problem P with data fhat are determined by α and β. 
In Figure 4.14 we have partitioned again the set {α > 0, β > 0} into different areas that correspond to the four different possible solutions. We note again that in contrast to the cases of the functions fp.c. and fp.a., here we provide both necessary and sufficient conditions for all the four different types of solutions. The corresponding areas are: (1) Blue colour: { α < λL 2 8 , β ≥ αL− 2α3 √ 2α λ } , C-E-C, see Figure 4.13(a). 152 4.6. Computation of exact solutions (2) Purple colour: { β > 2Lα3 , β < αL− 2α3 √ 2α λ } , A-E-A, see Figure 4.13(b). (3) Yellow colour: { α ≥ λL28 , β ≥ λL 3 12 } , C, see Figure 4.13(c). (4) Orange colour: { β < λL 3 12 , β ≤ 2Lα3 } , A, see Figure 4.13(d). 4.6.4 Numerical experiments In this final section, we compare our theoretical results with numerical ones obtained by solving the discrete version of P with the primal-dual algorithm of Chambolle-Pock, [CP11]. A description of the algorithm for TGV minimisation can be found in [Bre14]. We also compute some numerical results with the presence of Gaussian noise that show that TGV regularisation is quite robust for noisy data. Let us note here that even though some sensitivity analysis can be done [BV11], it is not an easy task to prove that the cor- responding solutions of P with clean and corrupted data have the same structure provided the noise is sufficiently small. Some relevant work has been done in [BB13] in terms of ground states of regularisation functionals. In Figure 4.15, we plot the exact and the numerical solutions of P using the function fp.c. as a data function. We choose the parameters α and β so that we have the solution of the type N-P1-J. We observe that for clean data, the exact and the numerical solutions coincide, see Figure 4.15(a). Moreover, even under the presence of noise, the numerical solution is not far away from the corresponding solution without the noise, Figure 4.15(b). We observe similar results in Figures 4.16 and 4.17 where we choose the parameters so that we have solutions of the type N-P2-J and N-P1-C respectively. In Figure 4.18 we use the function fhat as a data function. Again, without the presence of noise, the exact solution agrees with the numerical one, Figure 4.18(a). However, when noise is added, even though the numerical solution is close to the one that corresponds to the clean data, some staircasing is observed, Figure 4.18(b). This is not surprising as with these combinations of α and β, TGV behaves like TV as it was shown in Theorem 4.4.11. In Figure 4.19 the parameters are chosen so that the solution of the type A-E-A occurs. In the clean data case we have agreement between the exact and the numerical solution, Figure 4.19(a), but in the noisy case a kind of “affine” staircasing effect appears in the area where the exact solution equals with the data, see detail of Figure 4.19(b). Finally, in Figure 4.20, the parameters are chosen so the solution of the type A occurs. Again, the numerical solution agrees with the exact one and deviates from it slightly in the presence of noise, while no staircasing of any kind appears this time. 153 Exact solutions of the one dimensional TGV regularisation problem 0 50 100 150 200 250 300 350 400 450 500 −40 −20 0 20 40 60 80 100 120 140 f TGV–numerical: α = 1250, β = 50000 TGV–exact (a) Clean data, the numerical solution agrees with the exact one. 
(b) Noisy data (Gaussian noise, σ = 10; TGV–numerical with α = 1250, β = 50000): the numerical solution deviates slightly from the corresponding exact solution with clean data.
Figure 4.15: Piecewise constant function fp.c.. The parameters α and β satisfy the conditions (4.63), thus the solution is of the type N-P1-J.

(a) Clean data (TGV–numerical with α = 1250, β = 30000): the numerical solution agrees with the exact one. (b) Noisy data (Gaussian noise, σ = 10): the numerical solution deviates slightly from the corresponding exact solution with clean data.
Figure 4.16: Piecewise constant function fp.c.. The parameters α and β satisfy the conditions (4.72), thus the solution is of the type N-P2-J.

(a) Clean data (TGV–numerical with α = 4000, β = 120000): the numerical solution agrees with the exact one. (b) Noisy data (Gaussian noise, σ = 10): the numerical solution deviates slightly from the corresponding exact solution with clean data.
Figure 4.17: Piecewise constant function fp.c.. The parameters α and β satisfy the conditions (4.94), thus the solution is of the type N-P1-C (L2–linear regression).

(a) Clean data (TGV–numerical with α = 400, β = 200000): the numerical solution agrees with the exact one. (b) Noisy data (Gaussian noise, σ = 1): appearance of the staircasing effect.
Figure 4.18: Hat function fhat. The parameters α and β satisfy the conditions (4.126), thus the solution is of the type C-E-C.

(a) Clean data (TGV–numerical with α = 400, β = 120000): the numerical solution agrees with the exact one. (b) Noisy data (Gaussian noise, σ = 1): appearance of an "affine" staircasing effect.
Figure 4.19: Hat function fhat. The parameters α and β satisfy the conditions (4.127), thus the solution is of the type A-E-A.

(a) Clean data (TGV–numerical with α = 400, β = 100000): the numerical solution agrees with the exact one. (b) Noisy data (Gaussian noise, σ = 1): the numerical solution deviates slightly from the corresponding exact solution with clean data.
Figure 4.20: Hat function fhat. The parameters α and β satisfy the conditions (4.129), thus the solution is of the type A.
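For completeness, one possible discrete set-up is sketched below: the primal-dual iteration for min_{u,w} ½‖u − f‖² + α‖Du − w‖₁ + β‖Dw‖₁ in one dimension, with forward differences and a Neumann boundary, in the spirit of [CP11, Bre14]. The step sizes, boundary handling, iteration count and grid are our own illustrative choices and not necessarily those used to produce the figures above.

```python
# Minimal 1D TGV-L2 denoising sketch via a Chambolle-Pock primal-dual iteration.
import numpy as np

def D(v):                                   # forward difference, (Dv)_{N-1} = 0 (Neumann)
    d = np.zeros_like(v)
    d[:-1] = v[1:] - v[:-1]
    return d

def DT(p):                                  # adjoint of D
    out = np.zeros_like(p)
    out[0] = -p[0]
    out[1:-1] = p[:-2] - p[1:-1]
    out[-1] = p[-2]
    return out

def tgv_denoise_1d(f, alpha, beta, iters=20000):
    u, w = f.copy(), np.zeros_like(f)
    p, q = np.zeros_like(f), np.zeros_like(f)      # duals, constrained to |p| <= alpha, |q| <= beta
    ubar, wbar = u.copy(), w.copy()
    tau = sigma = 1.0 / 3.0                        # tau*sigma*||K||^2 <= 8/9 < 1 for K(u,w) = (Du - w, Dw)
    for _ in range(iters):
        p = np.clip(p + sigma * (D(ubar) - wbar), -alpha, alpha)
        q = np.clip(q + sigma * D(wbar), -beta, beta)
        u_old, w_old = u, w
        u = (u - tau * DT(p) + tau * f) / (1.0 + tau)   # prox of 0.5*||u - f||^2
        w = w + tau * (p - DT(q))
        ubar, wbar = 2 * u - u_old, 2 * w - w_old
    return u

# piecewise constant datum, grid spacing 1, so L = 250 and h = 100 (illustrative; cf. Figure 4.15)
n, h = 500, 100.0
f = np.concatenate([np.zeros(n // 2), h * np.ones(n // 2)])
u = tgv_denoise_1d(f, alpha=1250.0, beta=50000.0)
# u should be close to the exact N-P1-J solution (4.64) with L = 250
```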
156 Chapter 5 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications 5.1 Introduction In this chapter we are concerned with two different forms of non-local Hessian functionals. For the first formulation (explicit formulation) we prove its localisation to the classical Hessian as the non-locality vanishes and we use it to characterise some higher order Sobolev and BV spaces. The second formulation (implicit formulation) that we introduce, is better tuned for use in variational problems in imaging. We provide some numerical examples for image denoising where we show that the model is capable of outperforming TGV as far as the restoration of piecewise affine data is concerned. Explicit formulation We are initially interested in studying the following non-local Hessian functional (explicit formulation): Hnu(x) = d(d+ 2) 2 ˆ RN d 2u(x, y) |x− y|2 ( (x− y)⊗ (x− y)− |x−y|2d+2 Id ) |x− y|2 ρn(x− y)dy, x ∈ R d, (5.1) with d 2u(x, y) := u(y)− 2u(x) + u(x+ (x− y)), 157 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications see Figure 5.1, where here Id is the d× d identity matrix and ρn is a sequence of L1(Rd) radial functions, i.e., ρn(x) = ρn(|x|), that satisfy the properties ρn ≥ 0, ˆ Rd ρndx = 1, lim n→∞ ˆ |x|>δ ρndx = 0, ∀δ > 0. (5.2) This means that eventually all the mass of ρn is concentrating to the origin, leading finally to the localisation of (5.1). x y1 y2 y3 y4 x+ (x− y1) x+ (x− y2) x+ (x− y3) x+ (x− y4) Figure 5.1: Configuration of the second order finite difference scheme d 2u(x, y) = u(y)− 2u(x) + u(x+ (x− y)). As we have already mentioned in the introduction, analogous first order non-local func- tionals have been introduced and studied in the literature. More particularly, Bourgain, Brezis and Mironescu [BBM01] examined functionals of the type ˆ Ω ˆ Ω |u(x)− u(y)|p |x− y|p ρn(x− y)dxdy, (5.3) where Ω is a smooth bounded domain in Rd and 1 ≤ p <∞. They characterised W 1,p(Ω) and BV(Ω) by proving that, u ∈W 1,p(Ω) ⇐⇒ u ∈ Lp(Ω) and lim inf n→∞ ˆ Ω ˆ Ω |u(x)− u(y)|p |x− y|p ρn(x− y)dxdy <∞, p > 1, u ∈ BV(Ω) ⇐⇒ u ∈ L1(Ω) and lim inf n→∞ ˆ Ω ˆ Ω |u(x)− u(y)| |x− y| ρn(x− y)dxdy <∞, p = 1. Ponce in [Pon04] derived similar characterisations by studying functionals of the type ˆ Ω ˆ Ω ω ( |u(x)− u(y)| |x− y| ) ρ(x− y)dxdy, (5.4) 158 5.1. Introduction where ω is a continuous function and ρ are not necessarily radial. Analogous were the results by Gobbino and Mora in [GM01]. Mengesha and Spector in [MS13] introduced the following non-local gradient operator: Gnu(x) = d ˆ Ω u(x)− u(y) |x− y| x− y |x− y|ρn(x− y)dy, x ∈ Ω. (5.5) Functional (5.5) is defined rigorously as a distribution. The authors proved the localisation of the functionals (5.5) to their classical analogue, ∇u, in the topology that corresponds to the regularity of the function u and they obtained yet another characterisation of the spaces W 1,p(Ω) and BV(Ω): u ∈W 1,p(Ω) ⇐⇒ u ∈ Lp(Ω) and lim inf n→∞ ‖Gnu‖Lp(Ω) <∞, p > 1 u ∈ BV(Ω) ⇐⇒ u ∈ L1(Ω) and lim inf n→∞ ‖Gnu‖L1(Ω) <∞, p = 1. In this chapter, we prove the localisation of functional (5.1) to the classical Hessian∇2u in various topologies and we obtain some novel characterisations of the spaces W 2,p(Rd) and BV2(Rd). In summary, we prove the following localisations: if u ∈ C2c (Rd) =⇒ Hnu→ ∇2u uniformly, if u ∈W 2,p(Rd) =⇒ Hnu→ ∇2u in Lp(Rd,Rd×d), 1 ≤ p <∞, if u ∈ BV2(Rd) =⇒ HnuLd → D2u weakly∗ in measures. 
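As a quick numerical sanity check of the first of these localisations, note that in one dimension (5.1) reduces to a weighted second difference (cf. (5.44) below). The sketch takes ρn to be normalised Gaussians concentrating at the origin and compares Hnu with u'' at a single point for a smooth u; the quadrature, the test function and the choice of ρn are illustrative assumptions of ours.

```python
# 1D illustration of the localisation H_n u -> u'' as rho_n concentrates at the origin.
import numpy as np

def nonlocal_hessian_1d(u, x, n, z_max=5.0, m=4001):
    """H_n u(x) = int (u(x+z) - 2u(x) + u(x-z)) / z^2 * rho_n(z) dz, by the trapezoidal rule."""
    z = np.linspace(-z_max, z_max, m)
    z = z[np.abs(z) > 1e-8]                                   # drop the (removable) singularity at 0
    rho = n / np.sqrt(2 * np.pi) * np.exp(-0.5 * (n * z)**2)  # radial, unit mass, width 1/n
    integrand = (u(x + z) - 2 * u(x) + u(x - z)) / z**2 * rho
    return np.trapz(integrand, z)

u  = lambda t: np.sin(t) + 0.1 * t**3
d2 = lambda t: -np.sin(t) + 0.6 * t
x0 = 0.7
for n in [1, 2, 4, 8, 16]:
    print(n, nonlocal_hessian_1d(u, x0, n), d2(x0))           # approaches u''(x0) as n grows
```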
We also prove the following derivative-free characterisations of W 2,p(Rd) and BV2(Rd) in the spirit of [BBM01] and [MS13]: u ∈W 2,p(Rd) ⇐⇒ u ∈ Lp(Rd) and lim inf n→∞ ‖Hnu‖Lp(Rd,Rd×d) <∞, p > 1, (5.6) u ∈ BV2(Rd) ⇐⇒ u ∈ L1(Rd) and lim inf n→∞ ‖Hnu‖L1(Rd,Rd×d) <∞, p = 1. (5.7) We must note here that the results in [MS13] hold for the vector value case as well, that is to say one can define a form of non-local Hessian as Gn(∇u)(x) = d ˆ Ω ∇u(x)−∇u(y) |x− y| ⊗ x− y |x− y|ρn(x− y)dy, (5.8) and obtain a straitforward characterisation of W 2,p(Rd) and BV2(Rd): u ∈W 2,p(Rd) ⇐⇒ u ∈W 1,p(Rd) and lim inf n→∞ ‖Gn∇u‖Lp(Rd,Rd×d) <∞, p > 1 (5.9) u ∈ BV2(Rd) ⇐⇒ u ∈W 1,1(Rd) and lim inf n→∞ ‖Gn∇u‖L1(Rd,Rd×d) <∞, p = 1. (5.10) However the advantage of the characterisations (5.6)–(5.7) over (5.9)–(5.10) is that they 159 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications are derivative-free and require the function u to belong to an Lp space instead of W 1,p. Implicit formulation We then proceed to the definition of the implicit formulation of a non-local Hessian func- tional, see Definition 5.7.1. The non-local Hessian Hσxu(x) of a function u at a point x, is defined implicitly through a minimisation problem. We note that in this formulation the weighting function σx is not necessarily radial and depends heavily on the point x. It is constructed in such a way (by solving a weighted Eikonal equation) so that when x lies near an edge, points across the edge are weighted with small or zero weight something that eventually leads to an edge preservation scheme. We solve the corresponding regularisation problem for the case of image denoising, where we observe that the model is able to recover to a very large degree piecewise affine data, outperforming TGV. 5.1.1 Organisation of the chapter In Section 5.2 we prove the localisation of the sequence (Hnu)n∈N to the classical Hessian ∇2u for smooth, compactly supported functions u. This localisation occurs in the uniform topology. In Section 5.3, after proving that Hnu is well defined for functions u ∈ W 2,p(Rd), we show that (Hnu)n∈N localises to the second order weak derivative ∇2u. A second order non-local integration by parts formula is shown in Section 5.4 where also a second order non-local divergence is introduced. This formula is an essential tool for proving the localisation of (Hnu)n∈N for the BV2 case as well as for the non-local characterisation of BV2(Rd). In Section 5.6 we derive the non-local characterisations of W 2,p(Rd) and BV2(Rd), after redefining the non-local Hessian functional as a distribution. Finally, in Section 5.7 we introduce the implicit formulation of non-local Hessian and we provide some numerical examples in image denoising. 5.2 Localisation – The smooth case Our target for this section is to show that at least for sufficiently smooth functions u, the sequence (Hnu)n∈N converges to the classical Hessian ∇2u uniformly. We first need the following lemma: Lemma 5.2.1. Let ci,jn (x) be the following quantities for i, j = 1, . . . , d: ci,jn (x) = (A(d))i,j ˆ Rd (xi − yi)2(xj − yj)2 |x− y|4 ρn(x− y)dy, 160 5.2. Localisation – The smooth case where (A(d))i,j = d2 + 2d if i 6= j,d2+2d 3 if i = j. Then ci,jn = 1 for every n ∈ N and i, j = 1, . . . , d. Proof. The proof follows directly from Lemma A.2.2 in Appendix A. We are now ready to prove the localisation of Hnu to ∇2u for sufficiently smooth functions: Theorem 5.2.2. Suppose that u ∈ C2c (Rd). 
Then Hnu→ ∇2u, uniformly as n→∞. Proof. Note first that by using the mean value theorem for one dimensional restrictions of u we have d 2u(x, y) = (x− y)T (ˆ 1 0 ˆ 1 0 ∇2u(x+ (t+ s− 1)(y − x))dsdt ) (x− y). (5.11) Define now the following quantity for δ > 0: Qδu(x) = ∣∣∣∣∣ ˆ B(x,δ) d 2u(x, y)− (x− y)T∇2u(x)(x− y) |x− y|2 (x− y)⊗ (x− y) |x− y|2 ρn(x− y)dy ∣∣∣∣∣ . (5.12) We claim that we can make Qδu(x) as small as possible, independently of x, by choosing sufficiently small δ > 0. Indeed, given  > 0, using the uniform continuity of the partial derivatives of u, we can choose δ > 0 such that for every i, j = 1, . . . , d we have∣∣∣∣ ∂u∂xi∂xj (x)− ∂u∂xi∂xj (y) ∣∣∣∣ < , whenever |x− y| < δ. Using (5.11) we can estimate Qδu(x) = ∣∣∣∣∣ ˆ B(x,δ) d 2u(x, y)− (x− y)T∇2u(x)(x− y) |x− y|2 (x− y)⊗ (x− y) |x− y|2 ρn(x− y)dy ∣∣∣∣∣ = ∣∣∣∣∣∣ ˆ B(0,δ) zT (´ 1 0 ´ 1 0 ∇2u(x+ (1− s− t)z)−∇2u(x)dsdt ) z |z|2 z ⊗ z |z|2 ρn(z)dz ∣∣∣∣∣∣ ≤ d ˆ B(0,δ) |x− y||z| |z|2 |z||z| |z|2 ρn(z)dz ≤ d. (5.13) 161 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications On the other hand we have that Qδu(x) is also equal to Qδu(x) = ∣∣∣∣∣ ˆ B(x,δ) d 2u(x, y) |x− y|2 (x− y)⊗ (x− y) |x− y|2 ρn(x− y)dy − ˆ B(x,δ) (x− y)T∇2u(x)(x− y) |x− y|2 (x− y)⊗ (x− y) |x− y|2 ρn(x− y)dy︸ ︷︷ ︸ C ∣∣∣∣∣∣∣∣∣ , (5.14) where C is an d× d matrix whose (i0, j0)–th element is equal to C(i0, j0) = d∑ i,j=1 ∂u ∂xi∂xj (x) ˆ B(0,δ) zizjzi0zj0 |z|4 ρn(z)dz. (5.15) However due to radial symmetry we have that the following integrals vanish: i0 = j0 : ˆ B(0,δ) ziz 3 j0 |z|4 ρn(z)dz = 0, for i 6= j0, i0 = j0 : ˆ B(0,δ) zizjz 2 j0 |z|4 ρn(z)dz = 0, for i 6= j0, j 6= j0, i 6= j, i0 6= j0 : ˆ B(0,δ) zizjzi0zj0 |z|4 ρn(z)dz = 0, for i 6= i0, j 6= j0. This means that when i0 6= j0, the only integrals that are not zero in (5.15) correspond to the cases i = i0, j = j0 or i = j0, j = i0. Thus, using also the symmetry of the mixed derivatives, we have that C(i0, j0) = 2 ∂u ∂xi0∂xj0 (x) ˆ B(0,δ) z2i0z 2 j0 |z|4 ρn(z)dz, i0 6= j0. (5.16) Now, when i0 = j0 we have that the only integrals that are not zero in (5.15) correspond to the case i = j. This means that C(i0, j0) = d∑ i=1 ∂2u ∂x2i (x) ˆ B(0,δ) z2i z 2 i0 |z|4 ρn(z)dz, i0 = j0. (5.17) We now proceed to the final steps of the proof. Since ci,jn = 1, in order to prove the theorem it suffices to prove that ∣∣Hnu− cn ×∇2u∣∣→ 0, uniformly as n→∞. where here “×” denotes pointwise multiplication between matrices. We prove that the above quantity converges to zero uniformly, component-wise, i.e., ∣∣∣(Hnu− cn ×∇2u)(i0,j0)∣∣∣→ 162 5.2. Localisation – The smooth case 0 by considering two cases i0 6= j0 and i0 = j0. Case i0 6= j0 : Combining, (5.13), (5.14) and (5.16) we have that for every  > 0 there exists a δ > 0 such that∣∣∣∣∣ ˆ B(0,δ) d 2u(x, x+ z) |z|2 zi0zj0 |z|2 ρn(z)dz − 2 ∂u ∂xi0∂xj0 (x) ˆ B(0,δ) z2i0z 2 j0 |z|4 ρn(z)dz ∣∣∣∣∣ ≤ . (5.18) Using (5.18), we have for large enough n ∣∣∣(Hnu− cn ×∇2u)(i0,j0)∣∣∣ = ∣∣∣∣d(d+ 2)2 ˆ Rd d 2u(x, y) |x− y|2 (xi0 − yi0)(xj0 − yj0) |x− y|2 ρn(x− y)dy − ∂u ∂xi0∂xj0 (x)d(d+ 2) ˆ Rd z2i0z 2 j0 |z|4 ρn(z)dz ∣∣∣∣∣ ≤ d(d+ 2) 2 + d(d+ 2) 2 ∣∣∣∣∣ ˆ |z|≥δ d 2u(x, x+ z) |z|2 zi0zj0 |z|2 ρn(z)dz ∣∣∣∣∣ + d(d+ 2) 2 ∣∣∣∣ ∂u∂xi∂xj (x) ∣∣∣∣ ∣∣∣∣∣ ˆ |z|≥δ z2i0z 2 j0 |z|4 ρn(z)dz ∣∣∣∣∣ ≤ d(d+ 2) 2 + d(d+ 2) 2 4‖u‖∞ δ2 ˆ |z|≥δ ρn(z)dz + d(d+ 2) 2 ‖∇2u‖∞ ˆ |z|≥δ ρn(z)dz < d(d+ 2). Case i0 = j0 : This case is technically more difficult. 
Note that we cannot follow the corresponding proof for the i0 6= j0 case as all the pure second derivatives appear in (5.17). However we can take advantage of the fact that ∂ 2u ∂x2i0 (x) appears with different weight than ∂ 2u ∂x2i (x), for i 6= i0. Note that this is the reason for the correction |x−y| 2 d+2 in the diagonal elements of the non-local Hessian Hnu. As before, combining, (5.13), (5.14) and (5.17) we have that for every  > 0 there exists a δ > 0 such that∣∣∣∣∣ ˆ B(0,δ) d 2u(x, x+ z) |z|2 z2i0 |z|2 ρn(z)dz − d∑ i=1 ∂2u ∂x2i (x) ˆ B(0,δ) z2i z 2 i0 |z|4 ρn(z)dz ∣∣∣∣∣ ≤ . (5.19) By considering the inequality (5.19) for every i 6= i0 and summing over all i 6= i0 we get∣∣∣∣∣∣∣ d∑ i=1 i 6=i0 ˆ B(0,δ) d 2u(x, x+ z) |z|2 z2i |z|2 ρn(z)dz − d∑ i=1 i 6=i0 d∑ j=1 ∂2u ∂x2j (x) ˆ B(0,δ) z2j z 2 i |z|4 ρn(z)dz ∣∣∣∣∣∣∣ ≤ (d− 1). (5.20) 163 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications Observe now that using Lemma 5.2.1, the second term in the lefthand side of (5.20) can be written as follows: d∑ i=1 i 6=i0 d∑ j=1 ∂2u ∂x2j (x) ˆ B(0,δ) z2j z 2 i |z|4 ρn(z)dz = (d− 1) ∂2u ∂x2i0 (x) ˆ B(0,δ) z2i z 2 i0 |z|4︸ ︷︷ ︸ i 6=i0 ρn(z)dz + d∑ i=1 i 6=i0 ∂2u ∂x2i (x) ˆ B(0,δ) z4i |z|4 ρn(z)dz + d∑ i=1 i 6=i0 (d− 2)∂ 2u ∂x2i (x) ˆ B(0,δ) z2j z 2 i |z|4︸ ︷︷ ︸ j 6=i ρn(z)dz = d− 1 3 ∂2u ∂x2i0 (x) ˆ B(0,δ) z4i0 |z|4 ρn(z)dx + d∑ i=1 i 6=i0 (d+ 1) ∂2u ∂x2i (x) ˆ B(0,δ) z2j z 2 i |z|4︸ ︷︷ ︸ j 6=i ρn(z)dz. (5.21) Now, (5.20) combined with (5.21) give∣∣∣∣∣∣∣ 1 d+ 1 d∑ i=1 i 6=i0 ˆ B(0,δ) d 2u(x, x+ z) |z|2 z2i |z|2 ρn(z)dz (5.22) − d− 1 3(d+ 1) ∂2u ∂x2i0 (x) ˆ B(0,δ) z4i0 |z|4 ρn(z)dx− d∑ i=1 i 6=i0 ∂2u ∂x2i (x) ˆ B(0,δ) z2i0z 2 i |z|4 ρn(z)dz ∣∣∣∣∣∣∣ ≤ d− 1 d+ 1 . Using (5.19), (5.22) and the triangle inequality we can now vanish the terms ∂ 2u ∂x2i (x), for i 6= i0: ∣∣∣∣∣∣∣ ˆ B(0,δ) d 2u(x, x+ z) |z|2 z2i0 − 1d+1 ∑d i=1 i 6=i0 z2i |z|2 ρn(z)dy − 2d+ 4 3(d+ 1) ∂2u ∂x2i0 (x) ˆ B(z,δ) z4i0 |z|4 ρn(z)dz ∣∣∣∣∣ ≤ 2dd+ 1. (5.23) which is equivalent to∣∣∣∣∣d(d+ 2)2 ˆ B(0,δ) d 2u(x, x+ z) |z|2 z2i0 − 1d+2 |z|2 |z|2 ρn(z)dz 164 5.3. Localisation – The W 2,p(Rd) case − ∂ 2u ∂x2i0 (x)(A(d))i0,j0 ˆ B(0,δ) z4i0 |z|4 ρn(z)dz ∣∣∣∣∣ ≤ d2. (5.24) Notice that the last inequality (5.24) corresponds to inequality (5.18) for the i0 6= j0 case. Using (5.24) we can now finish the proof in a similar fashion. For large enough n we have ∣∣(Hnu− cn ×∇2u)(i0,i0)∣∣ = ∣∣∣∣∣d(d+ 2)2 ˆ Rd d 2u(x, y) |y − x|2 (yi0 − xi0)2 − 1d+2 |y − x|2 |y − x|2 ρn(x− y)dy − ∂ 2u ∂x2i0 (x)(A(d))i0,j0 ˆ Rd z4i0 |z|4 ρn(z)dz ∣∣∣∣∣ ≤ d2+ d(d+ 2) 2 ˆ |z|≥δ ∣∣d 2u(x, x+ z)∣∣ |z|2 ∣∣∣∣∣z2i0 − 1d+2 |z|2|z|2 ∣∣∣∣∣︸ ︷︷ ︸ ≤1 ρn(z)dz + ∣∣∣∣∣ ∂2u∂x2i0 (x) ∣∣∣∣∣ (A(d))i0,j0 ˆ |z|≥δ z4i0 |z|4︸︷︷︸ ≤1 ρn(z)dz ≤ d2+ d(d+ 2) 2 4‖u‖∞ δ2 ˆ |z|≥δ ρn(z)dz + ‖∇2u‖∞(A(d))i0,j0 ˆ |z|≥δ ρn(z)dz ≤ 2d2. This completes the proof of the Theorem. 5.3 Localisation – The W 2,p(Rd) case The objective of this section is to show that if u ∈W 2,p(Rd), 1 ≤ p <∞ then the non-local Hessian Hnu converges to ∇2u in Lp. The first step is to show that in that case, Hnu is indeed an Lp function. This follows from the estimate in the following lemma. Lemma 5.3.1. Suppose that u ∈W 2,p(Rd), where 1 ≤ p <∞. Then Hnu ∈ Lp(Rd,Rd×d) with ˆ Rd |Hnu(x)|pdx ≤M‖∇2u‖pLp(Rd,Rd×d), (5.25) where the constant M depends only on d and p. Proof. Recall, see [Bre83], that there exists a constant M = M(p, d) such that for every v ∈W 1,p(Rd) ˆ Rd |v(x− y)− v(x)|pdx ≤M |y|p‖∇v‖p Lp(Rd,Rd). 
(5.26) We first prove the result for functions v ∈ C∞(Rd) ∩W 2,p(Rd) and then we complete the 165 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications proof using a density argument. Using (5.26), Ho¨lder’s and Jensen’s inequality, equation (5.11) as well as Fubini’s theorem we have the following successive estimates (the constant is always denoted with M and everytime it depends only on d and p): ˆ Rd |Hnv(x)|pRd×ddx = ( d(d+ 2) 2 )p ˆ Rd ∣∣∣∣∣∣ ˆ Rd d 2v(x, y) |x− y|2 ( (x− y)⊗ (x− y)− |x−y|2d+2 Id ) |x− y|2 ρn(x− y)dy ∣∣∣∣∣∣ p dx ≤M ˆ Rd (ˆ Rd |d 2v(x, y)| |x− y|2 ρn(x− y)dy )p dx ≤M ˆ Rd (ˆ Rd |d 2v(x, y)|p |x− y|2p ρn(x− y)dy )ˆ Rd ρ(x− y)dy︸ ︷︷ ︸ =1  p/p′ dx (5.27) ≤M ˆ Rd (ˆ Rd | ´ 10 ∇v(x+ t(y − x))−∇v(x+ (t− 1)(y − x))dt|p|x− y|p |x− y|2p ρn(x− y)dy ) dx ≤M ˆ Rd (ˆ Rd ´ 1 0 |∇v(x+ t(y − x))−∇v(x+ (t− 1)(y − x))|pdt |x− y|p ρn(x− y)dy ) dx ≤M ˆ Rd (ˆ Rd (ˆ 1 0 ˆ 1 0 ∣∣∇2v(x+ (t+ s− 1)(y − x))∣∣p dsdt) ρn(x− y)dy) dx = M ˆ 1 0 ˆ 1 0 ˆ Rd (ˆ Rd (∣∣∇2v(x+ (t+ s− 1)(y − x))∣∣p) ρn(x− y)dy) dxdsdt = M ˆ 1 0 ˆ 1 0 ˆ Rd (ˆ Rd (∣∣∇2v(x+ (t+ s− 1)ξ)∣∣p) ρn(ξ)dξ) dxdsdt, = M ˆ 1 0 ˆ 1 0 ˆ Rd ( ρn(ξ) ˆ Rd (∣∣∇2v(x+ (t+ s− 1)ξ)∣∣p) dx) dξdsdt, = M ˆ 1 0 ˆ 1 0 ˆ Rd ρn(ξ)‖∇2v‖pLp(Rd,Rd×d)dξdsdt = M‖∇2v‖p Lp(Rd,Rd×d). Consider now a sequence (vk)k∈N in C∞(Rd)∩W 2,p(Rd) approximating u in W 2,p(Rd). We already have from above that ˆ Rd (ˆ Rd |d 2vk(x, y)|p |x− y|2p ρn(x− y)dy ) dx ≤M‖∇2vk‖Lp(Rd,Rd×d), ∀k ∈ N, (5.28) or ˆ Rd (ˆ Rd |d 2vk(x, x+ z)|p |z|2p ρn(z)dz ) dx ≤M‖∇2vk‖Lp(Rd,Rd×d), ∀k ∈ N. (5.29) Since vk converges to u in L p(Rd) we have that there exists a subsequence (vk`)`∈N con- 166 5.3. Localisation – The W 2,p(Rd) case verging to u almost everywhere. From an application of Fatou’s lemma we get that for every x ∈ Rd ˆ Rd |d 2u(x, x+ z))|p |z|2p ρn(z)dz︸ ︷︷ ︸ F (x) ≤M lim inf `→∞ ˆ Rd |d 2vk`(x, x+ z)|p |z|2p ρn(z)dz︸ ︷︷ ︸ Fk` (x) . Applying one more time Fatou’s Lemma to (Fk`)`∈N and F we get that ˆ Rd ˆ Rd |d 2u(x, x+ z))|p |z|2p ρn(z)dzdx ≤M lim inf`→∞ ˆ Rd ˆ Rd |d 2vk`(x, x+ z))|p |z|2p ρn(z)dzdx ≤M lim inf `→∞ ‖∇2vk`‖Lp(Rd,Rd×d) = M‖∇2u‖Lp(Rd,Rd×d). From the last inequality and the fact that (5.27) holds for W 2,p functions as well, the proof is complete. We now have the necessary tools to prove the localisation for W 2,p functions. Theorem 5.3.2. Let 1 ≤ p <∞. Then for every u ∈W 2,p(Rd) we have that Hnu→ ∇2u in Lp(Rd,Rd×d) as n→∞. Proof. We notice first that the result holds for functions v ∈ C2c (Rd). In order to see that observe that it suffices to show that for every  > 0, there exists a constant L > 1 such that sup n∈N ˆ B(0,Lr)c |Hnv(x)|pdx ≤ , (5.30) where B(0, r) is a ball containing the support of v. In that case we can estimate as follows ˆ Rd |Hnv(x)−∇2v(x)|pdx ≤ ˆ B(0,Lr) |Hnv(x)−∇2v(x)|pdx+ , (5.31) and using Theorem 5.2.2 we derive lim sup n→∞ ˆ Rd |Hnv(x)−∇2v(x)|pdx ≤ , from where the result follows since  is arbitrary. In order to get (5.30) we apply Jensen’s inequality with respect to the measure ρnLd ˆ B(0,Lr) |Hnv(x)|pdx ≤ ˆ B(0,Lr)c ˆ Rd |v(y)− 2v(x) + v(x+ (x− y))|p |x− y|2p ρn(x− y)dydx 167 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications = ˆ B(0,Lr)c ˆ B(0,r) |v(y)|p |x− y|2p ρn(x− y)dydx + ˆ B(0,Lr)c ˆ {y:x+(x+y)∈B(0,r)} |v(x+ (x− y))|p |x− y|2p ρn(x− y)dydx. 
Letting z = x+ (x− y), we obtain ˆ B(0,Lr)c ˆ {y:x+(x+y)∈B(0,r)} |v(x+ (x− y))|p |x− y|p ρn(x− y)dydx = ˆ B(0,Lr)c ˆ B(0,r) |v(z)|p |z − x|2p ρn(z − x)dzdx, and therefore by the symmetry of ρn we have ˆ B(0,Lr)c |Hnv(x)|pdx ≤ 2 ˆ B(0,Lr)c ˆ B(0,r) |v(y)|p |x− y|2p ρn(x− y)dydx ≤ 2 (L− 1)2p ˆ B(0,Lr)c ˆ B(0,r) |v(y)|pρn(x− y)dydx ≤ 2 (L− 1)2p ‖ρn‖L1(Rd)‖v‖ p Lp(Rd). Thus (5.30) follows by choosing L to be large enough. We use now the fact that that C∞c (Rd) and hence C2c (Rd) is dense in W 2,p(Rd), see for example [Bre83]. Let  > 0, then from density we have that there exists a function v ∈ C2c (Rd) such that ‖∇2u−∇2v‖Lp(Rd,Rd×d) ≤ . Thus using also Lemma 5.3.1 we have ‖Hnu−∇2u‖Lp(Rd,Rd×d) ≤ ‖Hnu−Hnv‖Lp(Rd,Rd×d) + ‖Hnv −∇2v‖Lp(Rd,Rd×d) + ‖∇2v −∇2u‖Lp(Rd,Rd×d) ≤ C+ ‖Hnv −∇2v‖Lp(Rd,Rd×d) + , Taking limits as n→∞ we get lim sup n→∞ ‖Hnu−∇2u‖Lp(Rd,Rd×d) ≤ (C + 1), and thus we conclude that lim n→∞ ‖Hnu−∇ 2u‖Lp(Rd,Rd×d) = 0. 168 5.4. Second order non-local integration by parts 5.4 Second order non-local integration by parts Before we proceed to the localisation of Hn for the BV2 case we first need to introduce a second order non-local integration by parts formula which is an essential tool for the proofs of the next section. We define the second order non-local divergence of a function φ = (φij) d i,j=1 as D2nφ(x) = d(d+ 2) 2 ˆ Rd d 2φ(x, y) |x− y|2 · ( (x− y)⊗ (x− y)− |x−y|2d+2 Id ) |x− y|2 ρn(x− y)dy. (5.32) where here A · B = ∑di,j=1AijBij for two d × d matrices A and B. Notice that (5.32) is well defined for φ ∈ C2c (Rd,Rd×d). Theorem 5.4.1 (Second order non-local integration by parts formula). Suppose that u ∈ Lp(Rd), 1 ≤ p <∞, |d 2u(x,y)||x−y|2 ρn(x− y) ∈ L1(Rd×Rd) and let φ ∈ C2c (Rd,Rd×d). Then ˆ Rd Hρu(x) · φ(x)dx = ˆ Rd u(x)D2ρφ(x)dx. (5.33) Proof. The proof is very similar with the corresponding proof in [MS13]. Using Fubini’s and dominated convergence theorem we have ˆ RN Hρu(x)φ(x)dx = d(d+ 2) 2 lim →0 ˆ Rd ˆ Rd\B(x,) d 2u(x, y) |x− y|2 ( (x− y)⊗ (x− y)− |x−y|2d+2 Id ) |x− y|2 ρ(x− y) · φ(x)dydx = d(d+ 2) 2 lim →0 ˆ Dd d 2u(x, y) |x− y|2 ( (x− y)⊗ (x− y)− |x−y|2d+2 Id ) |x− y|2 ρ(x− y) · φ(x)d(L d)2(x, y), where Dd := Rd × Rd \ {|x− y| < }. Similarly we have ˆ Rd u(x)D2ρφ(x)dx = d(d+ 2) 2 lim →0 ˆ Rd ˆ Rd\B(x,) u(x) d∑ i,j=1 d 2φij(x, y) |x− y|2 (xi − yi)(xj − yj)− δij |x−y| 2 d+2 |x− y|2 ρ(x− y)dydx = d(d+ 2) 2 lim →0 ˆ Dd u(x) d∑ i,j=1 d 2φij(x, y) |x− y|2 (xi − yi)(xj − yj)− δij |x−y| 2 d+2 |x− y|2 ρ(x− y)d(L d)2(x, y), where, for notational convenience, we used the standard convention δij = 1 if i = j,0 if i 6= j. 169 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications Thus, it suffices to show that for every i, j and  > 0 we have ˆ Dd d 2u(x, y) |x− y|2 (xi − yi)(xj − yj)− δij |x−y| 2 d+2 |x− y|2 ρ(x− y)φij(x)d(L d)2(x, y) = ˆ Dd u(x) d 2φij(x, y) |x− y|2 (xi − yi)(xj − yj)− δij |x−y| 2 d+2 |x− y|2 ρ(x− y)d(L d)2(x, y). (5.34) In order to show (5.34), it suffices to prove ˆ Dd u(y)φij(x) |x− y|2 (xi − yi)(xj − yj)− δij |x−y| 2 d+2 |x− y|2 ρ(x− y)d(L d)2(x, y) = ˆ Dd u(x)φij(y) |x− y|2 (xi − yi)(xj − yj)− δij |x−y| 2 d+2 |x− y|2 ρ(x− y)d(L d)2(x, y), (5.35) and ˆ Dd u(x+ (x− y))φij(x) |x− y|2 (xi − yi)(xj − yj)− δij |x−y| 2 d+2 |x− y|2 ρ(x− y)d(L d)2(x, y) = ˆ Dd u(x)φij(x+ (x− y)) |x− y|2 (xi − yi)(xj − yj)− δij |x−y| 2 N+2 |x− y|2 ρ(x− y)d(L d)2(x, y). (5.36) Equation (5.35) can be easily showed by alternating x and y and using the symmetry of the domain. 
Finally equation (5.36) can be proved by employing the substitution u = 2x− y, v = 3x− 2y, noting that x− y = v − u and that the determinant of the Jacobian of this substitution is −1. The following lemma shows the convergence of the second order non-local divergence to the continuous analogue div2φ, where φ ∈ C2c (Rd,Rd×d) and div2φ := d∑ i,j=1 ∂φij ∂xi∂xj . Lemma 5.4.2. Let φ ∈ C2c (Rd,Rd×d). Then for every 1 ≤ p ≤ ∞ we have lim n→∞ ‖D 2 nφ− div2φ‖Lp(Rd) = 0. (5.37) Proof. The convergence for p = ∞, follows immediately from Theorem 5.2.2 and it also holds for 1 ≤ p <∞ from Theorem 5.3.2. 170 5.5. Localisation – The BV2(Rd) case 5.5 Localisation – The BV2(Rd) case In this section we prove that for functions u ∈ BV2(Rd) the sequence Hn localises to D2u weakly∗ in the sense of measures as n tends to infinity. In order to show that, we first need to prove that the non-local Hessian is well defined for BV2 functions. We do that in the following lemma. Lemma 5.5.1. Suppose that u ∈ BV2(Rd). Then Hnu ∈ L1(Rd,Rd×d) with ˆ RN |Hnu(x)|dx ≤M |D2u|(Rd), (5.38) where the constant M depends only on d. Proof. Let (uk)k∈N be a sequence of functions in C∞(Rd) that converges strictly to u in BV2(Rd). From the same calculations as in the proof of Lemma 5.3.1 we have for every k ∈ N ˆ Rd |Hnuk(x)|dx ≤M‖∇2uk‖L1(Rd,Rd×d). Using Fatou’s Lemma in a similar way as in Lemma 5.3.1 we get ˆ Rd |Hnu(x)|dx ≤M lim inf k→∞ |D2uk|(Rd) = |D2u|(Rd). We can now proceed to the proof of the localisation result for BV2 functions. We define (µn)n∈N to be the sequence of Rd×d–valued finite Radon measures with µn := HnuLd. Theorem 5.5.2. Let u ∈ BV2(Rd). Then µn → D2u, weakly∗ in measures, i.e., for every φ ∈ C0(Rd,Rd×d) lim n→∞ ˆ Rd Hnu(x) · φ(x)dx = ˆ Rd φ(x) dD2u. (5.39) Proof. Since C∞c (Rd,Rd×d) is dense in C0(Rd,Rd×d) it suffices to prove (5.39) for every ψ ∈ C∞c (Rd,Rd×d). Indeed, suppose we have done this, let  > 0 and let φ ∈ C0(Rd,Rd×d) and ψ ∈ C∞c (Rd,Rd×d) such that ‖φ− ψ‖∞ < . Then, using also the estimate (5.38), we 171 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications have∣∣∣∣ˆ Rd Hn(x) · φ(x)dx− ˆ Rd φ(x) dD2u ∣∣∣∣ ≤ ∣∣∣∣ˆ Rd Hn(x) · (φ(x)− ψ(x))dx ∣∣∣∣ + ∣∣∣∣ˆ Rd Hn(x) · ψ(x)dx− ˆ Rd ψ(x) dD2u ∣∣∣∣ + ∣∣∣∣ˆ Rd (φ(x)− ψ(x)) dD2u ∣∣∣∣ ≤  ˆ R |Hnu(x)|dx + ∣∣∣∣ˆ Rd Hn(x) · ψ(x)dx− ˆ Rd ψ(x) dD2u ∣∣∣∣ + |D2u|(Rd) ≤M|D2u|(Rd) + ∣∣∣∣ˆ Rd Hn(x) · ψ(x)dx− ˆ Rd ψ(x) dD2u ∣∣∣∣ + |D2u|(Rd). Taking the limit n→∞ from both sides of the above inequality we get lim sup n→∞ ∣∣∣∣ˆ Rd Hn(x) · φ(x)dx− ˆ Rd φ(x) dD2u ∣∣∣∣ ≤ M˜, for a constant M˜ and since  is arbitrary we have (5.39). We thus proceed to prove (5.39) for compactly supported smooth functions. From the estimate (5.38) we have that (|µn|)n∈N is bounded, thus there exists a subsequence (µnk)k∈N and a Rd×d–valued Radon measure µ such that µnk converges to µ weakly ∗. This means that for every ψ ∈ C∞c (Rd,Rd×d) we have lim k→∞ ˆ Rd Hnk(x) · ψ(x)dx = ˆ Rd ψ(x) · dµ. On the other hand from the integration by parts formula (5.33) and Lemma 5.4.2 we get lim k→∞ ˆ Rd Hnk(x) · ψ(x)dx = lim k→∞ ˆ Rd u(x)D2nkψ(x)dx = ˆ Rd u(x)div2ψ(x)dx = ˆ Rd ψ(x) · dD2u. This means that µ = D2u. Observe now that since we actually deduce that every sub- sequence of (µn)n∈N has a further subsequence that converges to D2u weakly∗, then the initial sequence (µn)n∈N converges to D2u weakly∗. Let us note here that in the case d = 1, we can also prove strict convergence of the 172 5.6. 
Non-local characterisation of W 2,p(Rd) and BV2(Rd) measures µn to D 2u, that is, in addition to (5.39) we also have |µn|(R)→ |D2u|(R). Theorem 5.5.3. Let d = 1. Then the sequence (µn)n∈N converges to D2u strictly as measures, i.e, µn → D2u, weakly∗ in measures, (5.40) and |µn|(R)→ |D2u|(R). (5.41) Proof. The weak∗ convergence was proved in Theorem 5.5.2. Since the total variation norm is lower semicontinuous with respect to the weak∗ convergence, we also have |D2u|(R) ≤ lim inf n→∞ |µn|(R). (5.42) Thus it suffices to show that lim sup n→∞ |µn|(R) ≤ |D2u|(R). (5.43) Note that in dimension one, the non-local Hessian formula is written as Hnu(x) = ˆ R u(y)− 2u(x) + u(x+ (x− y)) |x− y|2 ρn(x− y)dy. (5.44) Following the proof of Lemma 5.3.1, we can easily verify that for v ∈ C∞(R)∩BV2(R) we have ˆ R |Hnv(x)|dx ≤ ‖∇2v‖L1(R), i.e., the constant M that appears in the estimate (5.25) is equal to one. Using Fatou’s Lemma and the BV2 strict approximation of u by smooth functions we get that |µn|(R) = ˆ R |Hnu(x)|dx ≤ |D2u|(R), from where (5.43) follows straightforwardly. 5.6 Non-local characterisation of W 2,p(Rd) and BV2(Rd) Characterisation of Sobolev and BV spaces in terms of non-local, derivative-free energies has been done so far only in the first order case, see [BBM01, Pon04, Men12, MS13]. Here we characterise the spaces W 2,p(RN ), 1 < p < ∞ and BV2(RN ) using our definition of non-local Hessian. For that we need to define the non-local Hessian as a distribution. Note that so far we have used the non-local Hessian operator Hn for regular enough functions so 173 Non-local Hessian: Localisation results, characterisation of higher order Sobolev and BV spaces and applications that formula (5.1) is well-defined. Indeed, the estimates (5.25) and (5.38) show that (5.1) is well defined for W 2,p(Rd), 1 ≤ p < ∞, and BV2(Rd) functions. One can easily check that it is also well defined for W 2,∞(Rd) functions. Moreover, it is very easy to check that if u ∈ C2c (Rd) then the following estimate hold |Hnu(x)| ≤ d 2(d+ 2) 2 ‖∇2u‖L∞(Rd), ∀x ∈ Rd. (5.45) In fact as the proof of Theorem 5.4.1 shows, a sufficient condition for the well-posedness of (5.1) is |d 2u(x, y)| |x− y|2 ρn(x− y) ∈ L 1(Rd × Rd), (5.46) and in that case we have that Hnu ∈ L1(Rd). However, we define Hn as a distribution, extending thus the definition for arbitrary Lp functions 1 ≤ p <∞. Definition 5.6.1 (Distributional definition of non-local Hessian). Let u ∈ Lp(Rd) for some p ∈ [1,∞). We define the distributional non-local Hessian Hnu as 〈Hnu, φ〉 = ˆ Rd u(x)D2nφ(x)dx, ∀φ ∈ C2c (Ω,Rd×d). (5.47) Observe that since φ ∈ W 2,p(Rd) for every p ∈ [1,∞], estimates (5.25) and (5.45) imply that D2nφ ∈ Lp(Rd) for every p ∈ [1,∞] and that (5.47) is well defined from Ho¨lder’s inequality. Note also that Hnu is indeed a distribution. In order to see that observe that if u ∈ L1(Rd) then estimate (5.45) gives 〈Hnu, φ〉 ≤M‖u‖L1(Rd)‖∇2φ‖∞, (5.48) where M depends only on d. If u ∈ Lp(Rd) with 1 < p <∞ then from the estimate 5.3.1 and the fact that ∇2φ is of compact support we have 〈Hnu, φ〉 ≤M‖u‖Lp(Rd)‖∇2φ‖Lq(Rd,Rd×d) ≤M‖u‖Lp(Rd)‖∇2φ‖∞, (5.49) where 1/p+1/q = 1 and the constant M depends only on p and d. Finally, the integration by parts formula (5.33) shows that if (5.46) holds (for example in the case of W 2,p(Rd), BV2(Rd) and C2c (Ω,Rd×d) functions) then the distribution Hnu can be represented by the function Hnu. We are now ready to prove our characterisations. Theorem 5.6.2. Let u ∈ Lp(Rd) for some 1 < p <∞. 
Then
\[
u \in W^{2,p}(\mathbb{R}^d) \iff H_n u \in L^p(\mathbb{R}^d,\mathbb{R}^{d\times d})\ \ \forall n \in \mathbb{N} \quad \text{and} \quad \liminf_{n\to\infty} \int_{\mathbb{R}^d} |H_n u(x)|^p\,dx < \infty. \tag{5.50}
\]
Let $u \in L^1(\mathbb{R}^d)$. Then
\[
u \in \mathrm{BV}^2(\mathbb{R}^d) \iff H_n u \in L^1(\mathbb{R}^d,\mathbb{R}^{d\times d})\ \ \forall n \in \mathbb{N} \quad \text{and} \quad \liminf_{n\to\infty} \int_{\mathbb{R}^d} |H_n u(x)|\,dx < \infty. \tag{5.51}
\]

Proof. Firstly, we prove (5.50). Suppose that $u \in W^{2,p}(\mathbb{R}^d)$. Then Lemma 5.3.1 gives
\[
\liminf_{n\to\infty} \int_{\mathbb{R}^d} |H_n u(x)|^p\,dx \le M \|\nabla^2 u\|^p_{L^p(\mathbb{R}^d)} < \infty.
\]
Suppose now conversely that
\[
\liminf_{n\to\infty} \int_{\mathbb{R}^d} |H_n u(x)|^p\,dx < \infty.
\]
This means that, up to a subsequence, $H_n u$ is bounded in $L^p(\mathbb{R}^d,\mathbb{R}^{d\times d})$; thus there exist a subsequence $(H_{n_k} u)_{k\in\mathbb{N}}$ and $v \in L^p(\mathbb{R}^d,\mathbb{R}^{d\times d})$ such that $H_{n_k} u \to v$ weakly in $L^p(\mathbb{R}^d,\mathbb{R}^{d\times d})$. Thus, using the definition of weak convergence in $L^p$ together with the non-local integration by parts formula and Lemma 5.4.2, we have for every $\psi \in C_c^\infty(\mathbb{R}^d,\mathbb{R}^{d\times d})$,
\[
\int_{\mathbb{R}^d} v(x)\cdot\psi(x)\,dx = \lim_{k\to\infty}\int_{\mathbb{R}^d} H_{n_k}u(x)\cdot\psi(x)\,dx = \lim_{k\to\infty}\int_{\mathbb{R}^d} u(x)\,D_{n_k}^2\psi(x)\,dx = \int_{\mathbb{R}^d} u(x)\,\mathrm{div}^2\psi(x)\,dx,
\]
which shows that $v = \nabla^2 u \in L^p(\mathbb{R}^d,\mathbb{R}^{d\times d})$ is the second order weak derivative of $u$. Now, since the second order distributional derivative is a function, the same is true for the first order distributional derivatives; they belong to $L^p$ and thus $u \in W^{2,p}(\mathbb{R}^d)$. This is a consequence of the Gagliardo–Nirenberg interpolation inequality [Nir59]
\[
\|\nabla u\|_{L^p(\mathbb{R}^d,\mathbb{R}^d)} \le C \|\nabla^2 u\|_{L^p(\mathbb{R}^d,\mathbb{R}^{d\times d})}^{1/2} \|u\|_{L^p(\mathbb{R}^d)}^{1/2}, \tag{5.52}
\]
where the constant $C$ depends only on $d$.

We now proceed to the proof of (5.51). Supposing again that $u \in \mathrm{BV}^2(\mathbb{R}^d)$, Lemma 5.5.1 gives us
\[
\liminf_{n\to\infty} \int_{\mathbb{R}^d} |H_n u(x)|\,dx \le C\, |D^2u|(\mathbb{R}^d).
\]
Suppose now that
\[
\liminf_{n\to\infty} \int_{\mathbb{R}^d} |H_n u(x)|\,dx < \infty.
\]
Considering again the measures $\mu_n = H_n u\,\mathcal{L}^d$, we have that there exist a subsequence $(\mu_{n_k})_{k\in\mathbb{N}}$ and a finite Radon measure $\mu$ such that $\mu_{n_k} \to \mu$ weakly$^\ast$. Then for every $\psi \in C_c^\infty(\mathbb{R}^d,\mathbb{R}^{d\times d})$ we have, similarly as before,
\[
\int_{\mathbb{R}^d} \psi\cdot d\mu = \lim_{k\to\infty}\int_{\mathbb{R}^d} H_{n_k}u(x)\cdot\psi(x)\,dx = \lim_{k\to\infty}\int_{\mathbb{R}^d} u(x)\,D_{n_k}^2\psi(x)\,dx = \int_{\mathbb{R}^d} u(x)\,\mathrm{div}^2\psi(x)\,dx,
\]
which shows that $\mu = D^2u$. Thus $u \in \mathrm{BV}^2(\mathbb{R}^d)$, since if the second order distributional derivative is a finite Radon measure then the first order one is an $L^1$ function. This is again due to a special form of the Gagliardo–Nirenberg interpolation inequality, this time for $W^{2,1}$ functions. In order to see that, notice first that from [TS80] we have that $Du \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$. Fix an $\Omega \Subset \mathbb{R}^d$; then we have that $u|_\Omega \in \mathrm{BV}^2(\Omega)$. From Theorem 3.3.7 we can find a sequence $(u_n)_{n\in\mathbb{N}}$ in $W^{2,1}(\Omega)$ that converges to $u|_\Omega$ strictly in $\mathrm{BV}^2(\Omega)$. From the definition of strict convergence, together with the lower semicontinuity of the total variation with respect to strong $L^1$ convergence, we have
\begin{align*}
\|\nabla (u|_\Omega)\|_{L^1(\Omega)} &\le \liminf_{n\to\infty} \|\nabla u_n\|_{L^1(\Omega)} \le \liminf_{n\to\infty} C \|\nabla^2 u_n\|_{L^1(\Omega,\mathbb{R}^{d\times d})}^{1/2} \|u_n\|_{L^1(\Omega)}^{1/2} \\
&= C \|\nabla^2 (u|_\Omega)\|_{L^1(\Omega,\mathbb{R}^{d\times d})}^{1/2} \|u|_\Omega\|_{L^1(\Omega)}^{1/2} \le C \|\nabla^2 u\|_{L^1(\mathbb{R}^d,\mathbb{R}^{d\times d})}^{1/2} \|u\|_{L^1(\mathbb{R}^d)}^{1/2},
\end{align*}
from which we deduce that $\nabla u \in L^1(\mathbb{R}^d)$ with
\[
\|\nabla u\|_{L^1(\mathbb{R}^d)} \le C \|\nabla^2 u\|_{L^1(\mathbb{R}^d,\mathbb{R}^{d\times d})}^{1/2} \|u\|_{L^1(\mathbb{R}^d)}^{1/2},
\]
and hence $u \in \mathrm{BV}^2(\mathbb{R}^d)$.

5.7 An alternative non-local Hessian approach and an application to image denoising

In the previous sections we saw that the explicit formulation of the non-local Hessian defined in (5.1) has good analytical properties, i.e., the sequence $(H_n u)_{n\in\mathbb{N}}$ converges to the local analogue $\nabla^2 u$ as the non-locality vanishes. It also led to novel non-local characterisations of the spaces $W^{2,p}(\mathbb{R}^d)$ and $\mathrm{BV}^2(\mathbb{R}^d)$. However, the formulation (5.1) is not so convenient for implementation in variational problems in imaging. One reason is that it cannot be defined straightforwardly for functions $u : \Omega \to \mathbb{R}$, where $\Omega$ is a bounded domain, as the point $x + (x-y)$ does not necessarily belong to $\Omega$ for $x, y \in \Omega$.
Moreover, the radial symmetry of the weighting functions $\rho_n$ is in practice not desirable for imaging purposes. For example, given that a point $x$ of an image $u$ lies next to an edge, we would like to have the freedom to define a form of non-local Hessian of $u$ at $x$ using a weighting function that puts more weight on points that lie on the same side of the edge as $x$, see Figure 5.2. That would eventually lead to an edge-preserving regularisation model.

Figure 5.2: Radially symmetric weighting function versus a weighting function aware of discontinuities. (a) A point near an edge; (b) weighting of the neighbouring pixels corresponding to a radially symmetric weighting function; (c) weighting of the neighbouring pixels corresponding to a weighting function aware of discontinuities.

To that end we define an implicit formulation of a non-local Hessian $H_{\sigma_x}u(x)$ and a non-local gradient $G_{\sigma_x}u(x)$ of a function $u$ at a point $x$ with a weighting function $\sigma_x$ as follows:

Definition 5.7.1 (Implicit definition of non-local Hessian). Let $\Omega \subseteq \mathbb{R}^d$ be an open bounded domain and $u \in L^2(\Omega)$. We define $G_{\sigma_x}u(x)$ and $H_{\sigma_x}u(x)$ to be the non-local gradient and Hessian, respectively, of $u$ at a point $x \in \Omega$ with a positive weight function $\sigma_x \in L^\infty(\Omega - \{x\})$ as the minimisers of
\[
(H_{\sigma_x}u(x), G_{\sigma_x}u(x)) := \operatorname*{argmin}_{H' \in \mathbb{R}^{d\times d},\, G' \in \mathbb{R}^d} \frac{1}{2} \int_{\Omega - \{x\}} R_{u,G',H'}(x,z)^2\, \sigma_x(z)\,dz, \tag{5.53}
\]
where
\[
R_{u,G',H'}(x,z) = u(x+z) - u(x) - G'^\top z - \tfrac{1}{2} z^\top H' z. \tag{5.54}
\]

The idea behind Definition 5.7.1 is that at a given point $x$ we choose $H_{\sigma_x}u(x)$ and $G_{\sigma_x}u(x)$ so that the remaining values of $u$ best fit a quadratic model with gradient $G_{\sigma_x}u(x)$ and Hessian $H_{\sigma_x}u(x)$ located at $x$.

In this section we work with the discretised version of (5.53)–(5.54) in dimension 2, i.e., the integral in (5.53) is interpreted as a sum and $\Omega$ as an $N \times M$ grid. We can now define the corresponding non-local Hessian regularised minimisation problems as follows:

$L^s$–H regularisation
\[
\min_{u \in \mathbb{R}^{N\times M}} \frac{1}{s}\|u - f\|_s^s + \alpha \|H_\sigma u\|_1, \quad \alpha > 0,\ s = 1, 2, \tag{5.55}
\]
subject to
\[
(H_{\sigma_x}u(x), G_{\sigma_x}u(x)) = \operatorname*{argmin}_{H' \in \mathbb{R}^{2\times 2},\, G' \in \mathbb{R}^2} \frac{1}{2} \int_{\Omega - \{x\}} R_{u,G',H'}(x,z)^2\, \sigma_x(z)\,dz, \tag{5.56}
\]
\[
R_{u,G',H'}(x,z) = u(x+z) - u(x) - G'^\top z - \tfrac{1}{2} z^\top H' z. \tag{5.57}
\]

Note that the optimality conditions of the quadratic minimisation problem (5.56)–(5.57) are linear. Thus, essentially, problem (5.55)–(5.57) is an optimisation problem with a convex objective and linear constraints. We can solve this problem, for instance, using CVX in MATLAB, a package for specifying and solving convex programs [GB14, GB08].

Our next concern is how to choose the weighting function $\sigma_x$ at a point $x$. As we have discussed previously, ideally this function has to be chosen in such a way that points that lie across an edge have low or zero weights. We do that by first defining a positive cost function $c_x : \Omega \to \mathbb{R}$, where $c_x(y)$ must be high when we cannot move from $x$ to $y$ without passing through points with large gradient (an edge). Such a function can be defined by solving a weighted Eikonal equation
\[
\|\nabla c_x\|_2 = \phi(\nabla f), \quad c_x(x) = 0, \tag{5.58}
\]
where $\phi(\nabla f) = \|\nabla f\|_2^2 + \epsilon$, for a small $\epsilon > 0$. Equation (5.58) can be solved efficiently using a fast marching method [Set99, OF03]. Defining $c_x$ as above, we have that $c_x(y)$ is high if $x$ and $y$ are separated by a strong edge.
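Since the residual (5.54) is linear in the entries of $G'$ and $H'$, the inner minimisation (5.53) is, for fixed $u$ and fixed weights $\sigma_x$, a small weighted linear least squares problem at each pixel. The following Python sketch illustrates this fit at a single pixel of a discrete image. It is only an illustrative sketch, not the MATLAB/CVX implementation referred to above: the function name, the restriction to a symmetric $H'$, and the decision to simply discard offsets that leave the image grid are all assumptions made here for the sake of a compact example.

\begin{verbatim}
import numpy as np

def nonlocal_gradient_hessian(u, i, j, offsets, weights):
    """Weighted least-squares fit of a local quadratic model at pixel (i, j),
    i.e. a discrete version of the minimisation in (5.53)-(5.54).

    offsets: iterable of integer offsets z = (dz1, dz2)
    weights: the corresponding values sigma_x(z) >= 0
    Returns (H, G) with H a symmetric 2x2 array and G a length-2 array.
    """
    rows, rhs, w = [], [], []
    for (dz1, dz2), sigma in zip(offsets, weights):
        y1, y2 = i + dz1, j + dz2
        if sigma <= 0 or not (0 <= y1 < u.shape[0] and 0 <= y2 < u.shape[1]):
            continue  # zero-weight offsets and offsets leaving the grid are skipped
        # Unknowns: [G1, G2, H11, H12, H22] (H symmetric), so that
        # G'^T z + 0.5 * z^T H' z is linear in the unknowns.
        rows.append([dz1, dz2, 0.5 * dz1 ** 2, dz1 * dz2, 0.5 * dz2 ** 2])
        rhs.append(u[y1, y2] - u[i, j])
        w.append(np.sqrt(sigma))
    A = np.asarray(rows, dtype=float) * np.asarray(w)[:, None]  # weighted design matrix
    b = np.asarray(rhs, dtype=float) * np.asarray(w)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    G = coef[:2]
    H = np.array([[coef[2], coef[3]], [coef[3], coef[4]]])
    return H, G
\end{verbatim}

Because the optimality conditions of this per-pixel fit are linear in $u$, imposing it as the constraint (5.56) keeps the overall problem (5.55)–(5.57) a convex program with linear constraints, which is what allows the use of a generic package such as CVX.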
For efficiency in the numerical implementation we do not use all the points $y \in \Omega$ for the formulation of $H_{\sigma_x}u(x)$, but only the $K$ ones with the shortest distance $c_x(y)$ to $x$. That is to say, we sort the neighbours $y_1, y_2, \ldots$ of $x$ such that $c_x(y_1) \le c_x(y_2) \le \cdots$ and we define the weighting function $\sigma_x$ as follows:
\[
\sigma_x(y_i) =
\begin{cases}
\dfrac{1}{(c_x(y_i))^2} & \text{if } i \le K,\\[4pt]
0 & \text{if } i > K.
\end{cases} \tag{5.59}
\]
The number $K$ can be predefined; for our numerical implementations we use $K = 12$, while we also set $\epsilon = 0.01$.
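The weight construction (5.59) is straightforward to express in code. The sketch below, again in Python and again only a hedged illustration of the procedure just described rather than the implementation used for the experiments, assumes the cost map $c_x$ has already been computed, for instance by a fast marching solver applied to (5.58); the function name and the choice to identify the reference pixel through its zero cost are assumptions made here for illustration.

\begin{verbatim}
import numpy as np

def edge_aware_weights(c_x, K=12):
    """Weights sigma_x from a cost map c_x, following (5.59).

    c_x: 2D array with c_x = 0 at the reference pixel x and larger values at
    pixels separated from x by strong edges.  Only the K neighbours with the
    smallest positive cost receive the weight 1 / c_x(y)^2; all other weights
    are set to zero.
    """
    costs = c_x.ravel()
    sigma = np.zeros_like(costs, dtype=float)
    order = np.argsort(costs)               # pixels sorted by increasing cost
    order = order[costs[order] > 0][:K]     # drop the reference pixel, keep the K cheapest
    sigma[order] = 1.0 / costs[order] ** 2
    return sigma.reshape(c_x.shape)
\end{verbatim}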
Figure 5.3: Denoising example for Gaussian noise of variance 0.005. (a) Clean image; (b) noisy image, SSIM = 0.3261, PSNR = 22.82; (c) TV denoised image, SSIM = 0.8979, PSNR = 31.82; (d) TGV denoised image, SSIM = 0.9249, PSNR = 33.29; (e) $L^2$–H denoised image, $\alpha = 0.2$, SSIM = 0.9262, PSNR = 34.39; (f) $L^1$–H denoised image, $\alpha = 1.2$, SSIM = 0.9766, PSNR = 35.97.

Figure 5.4: Corresponding middle row slices for the results of Figure 5.3.

Figure 5.3 depicts a denoising example for the same synthetic image we used in Section 3.6 (Gaussian noise of variance 0.005). In Figure 5.4 we show the corresponding middle row slices. We observe that the $L^2$ fidelity based non-local Hessian denoised image is of higher quality than the TGV denoised one, as confirmed by both the PSNR and SSIM quality measures, see Figure 5.3(e). Notice, however, that we can achieve a far better result using the $L^1$ norm in the fidelity term, Figure 5.3(f), despite the fact that the noise is Gaussian. This is because this special selection of the weights $\sigma_x$ allows the use of a quite high regularisation parameter ($\alpha = 1.2$ in this example) while still preserving the edges. At the same time, the $L^1$ fidelity term is capable of preserving the contrast better than the $L^2$ one; compare the slices in Figures 5.4(e) and 5.4(f).

Chapter 6
Conclusions

Higher order variational regularisation methods have been an active field of research in the mathematical imaging community. These kinds of methods have been proposed to tackle both classical and modern image processing tasks. The high quality of the reconstructions makes them appealing for a variety of real world applications. At the same time, novel mathematical techniques are being developed, aimed at implementing these methods as well as analysing their properties. We provide here a short summary of how the present thesis has contributed to this field and we give some directions for future research.

In Chapter 3 we introduced a combined first and second order method for image restoration. After a rigorous analysis of the well-posedness of the model, we applied the method to image denoising, deblurring and inpainting, also providing an online user-friendly implementation for the latter task on the IPOL website. The simplicity of the model combined with the split Bregman algorithm leads to a very efficient implementation, indicating the suitability of the method as a fast pre-processing technique that approaches the state of the art results as far as restoration quality is concerned. We think that a full automatisation of the method, i.e., automatic selection of the optimal parameters $\alpha$ and $\beta$, is a useful research direction that will further enhance the applicability of the method. Among other techniques, this optimal parameter learning can be achieved, for instance, through bilevel optimisation approaches [DlRS12, KP13, CDlRS14].

In Chapter 4 we contributed to the analysis of the second order total generalised variation model, a state of the art higher order regulariser, focusing mainly on the one dimensional case. We exhibited results concerning the structure of solutions of the $L^2$–TGV problem as well as the relationship of TV and TGV in dimensions one and two. We computed exact solutions for simple data functions, providing further insights into the regularising mechanism of TGV. A future research direction involves the extension of these results to higher dimensions and the investigation of the analytical properties of the model for other imaging tasks, e.g., deblurring. Identifying ground states for TGV in the spirit of [BB13] is another possibility.

Finally, in Chapter 5 we introduced and analysed a non-local Hessian functional. We proved the convergence, as the non-locality vanishes, of the non-local Hessian of a function $u$ to the continuous analogue with respect to the natural topology that corresponds to the regularity of $u$. This analysis led to novel characterisations of some higher order Sobolev and BV spaces, namely $W^{2,p}(\mathbb{R}^d)$ and $\mathrm{BV}^2(\mathbb{R}^d)$. Defining a non-local Hessian in an alternative, implicit way resulted in a model that can be easily handled numerically for image processing purposes. We gave a flavour of this application in image denoising, showing that the model is capable of obtaining true piecewise affine reconstructions, being, in that sense, superior to TGV. Future work involves the investigation of the application of the method to other imaging tasks, as well as a study of the relationship between the explicit and the implicit formulation of the non-local Hessian functional.

Appendix A

A.1 Exclusion of the cases N-0-P2-J, N-0-P1-C, N-0-P2-C

In Section 4.6.1 we mentioned that solutions of the types N-0-P1-J, N-0-P2-J, N-0-P1-C and N-0-P2-C cannot occur when we consider the one dimensional $L^2$–$\mathrm{TGV}^2_{\beta,\alpha}$ minimisation problem with data $f_{\mathrm{p.c.}}$. In Proposition 4.6.2 we proved that fact for solutions of the type N-0-P1-J. Here we do the same for the cases N-0-P2-J, N-0-P1-C and N-0-P2-C in Propositions A.1.1, A.1.2 and A.1.3 respectively.

Proposition A.1.1. The case N-0-P2-J cannot occur.

Proof. Suppose that there exist points $0 < x_1 < \tilde{x}_1 < x_2 < L$ such that $u < 0$ in $(0, x_1)$, $u = 0$ in $(x_1, \tilde{x}_1)$ and $u > 0$ in $(\tilde{x}_1, L)$. Suppose further that $u'$ has a positive jump at $x_2$ and $u$ has a jump discontinuity at $x = L$. Similarly to the proof of Proposition 4.6.2 we can show that the gradient of $u$ in $(0, x_1)$ and in $(\tilde{x}_1, x_2)$ is the same, say equal to $c$, and that $w = c$ in $(0, x_2)$. As in Proposition 4.6.2 we set $\ell_1 = x_1$, $\ell_2 = \tilde{x}_1 - x_1$, $\ell_3 = x_2 - \tilde{x}_1$, $\ell_4 = L - x_2$ with $\ell_1 + \ell_2 + \ell_3 + \ell_4 = L$, and $v_1 = v|_{(0,x_1)}$, $v_2 = v|_{(x_1,\tilde{x}_1)}$, $v_3 = v|_{(\tilde{x}_1,x_2)}$, $v_4 = v|_{(x_2,L)}$. As in Proposition 4.6.2 we have
\[
v_1(x) = -\frac{\alpha}{3\ell_1^2}x^3 + \frac{\alpha}{\ell_1}x^2, \quad x \in (0, x_1), \tag{A.1}
\]
\[
v_2(x) = \alpha(x - \ell_1) + \frac{2\alpha\ell_1}{3}, \quad x \in (x_1, \tilde{x}_1), \tag{A.2}
\]
\[
v_3(x) = -\frac{\alpha}{3\ell_1^2}(x - \ell_1 - \ell_2)^3 + \alpha(x - \ell_1 - \ell_2) + \alpha\ell_2 + \frac{2\alpha\ell_1}{3}, \quad x \in (\tilde{x}_1, x_2). \tag{A.3}
\]
Now the extra conditions for $v_3$ are
\[
v_3'(\ell_1 + \ell_2 + \ell_3) = 0, \tag{A.4}
\]
and
\[
v_3(\ell_1 + \ell_2 + \ell_3) = \beta. \tag{A.5}
\]
Condition (A.4) gives
\[
\ell_3 = \ell_1, \tag{A.6}
\]
while condition (A.5) in combination with (A.6) gives
\[
\frac{4}{3}\ell_1 + \ell_2 = \frac{\beta}{\alpha}. \tag{A.7}
\]
The function $v_4$ will also be a cubic function, $v_4(x) = a_4 x^3 + b_4 x^2 + c_4 x + d_4$, $x \in (x_2, L)$. It must satisfy the following conditions at $x = \ell_1 + \ell_2 + \ell_3$:
\[
v_4(\ell_1+\ell_2+\ell_3) = \beta, \quad v_4'(\ell_1+\ell_2+\ell_3) = 0, \quad v_4''(\ell_1+\ell_2+\ell_3) = v_3''(\ell_1+\ell_2+\ell_3) = -\frac{2\alpha}{\ell_1}, \tag{A.8}
\]
from which we get that
\[
v_4(x) = a_4(x - \ell_1 - \ell_2 - \ell_3)^3 - \frac{\alpha}{\ell_1}(x - \ell_1 - \ell_2 - \ell_3)^2 + \beta, \tag{A.9}
\]
plus the following conditions at $x = \ell_1 + \ell_2 + \ell_3 + \ell_4$:
\[
v_4(\ell_1+\ell_2+\ell_3+\ell_4) = 0, \quad v_4'(\ell_1+\ell_2+\ell_3+\ell_4) = -\alpha, \tag{A.10}
\]
from which, in combination with (A.9), we get
\[
a_4\ell_4^3 - \frac{\alpha}{\ell_1}\ell_4^2 + \beta = 0, \tag{A.11}
\]
\[
3a_4\ell_4^2 - \frac{2\alpha}{\ell_1}\ell_4 = -\alpha. \tag{A.12}
\]
Multiplying (A.12) by $\ell_4$ and using (A.11) we end up with
\[
\ell_4^2 + \ell_1\ell_4 - \gamma\ell_1 = 0, \tag{A.13}
\]
where we define $\gamma := 3\beta/\alpha$. Solving (A.13) with respect to $\ell_4$ we get the positive solution
\[
\ell_4 = \frac{\sqrt{\ell_1^2 + 4\gamma\ell_1} - \ell_1}{2}. \tag{A.14}
\]
We now use the fact that $\ell_1 + \ell_2 + \ell_3 + \ell_4 = L$ in combination with (A.6), (A.7) and (A.14) and we get
\[
\frac{\sqrt{\ell_1^2 + 4\gamma\ell_1}}{2} = L - \frac{\gamma}{3} - \frac{\ell_1}{6}. \tag{A.15}
\]
Squaring both sides of (A.15), after a bit of algebra we end up with the following quadratic equation for $\ell_1$:
\[
2\ell_1^2 + \ell_1(8\gamma + 3L) - (3L - \gamma)^2 = 0, \tag{A.16}
\]
whose only possible positive root is
\[
\ell_1 = \frac{3\sqrt{9L^2 + 8\gamma^2} - (8\gamma + 3L)}{4}. \tag{A.17}
\]
However, we must make sure that $\ell_2 > 0$; thus from (A.7) we must impose
\[
0 < \ell_1 < \frac{\gamma}{4}. \tag{A.18}
\]
We claim now that the condition $|v_4'| \le \alpha$ is violated. Notice firstly that $a_4 > 0$. Indeed, from (A.12) we have that $a_4 > 0$ if and only if $\ell_4 > \ell_1/2$, and using (A.14) this holds if and only if $\ell_1 < \frac{4}{3}\gamma$, which is true from (A.18). Using that, together with the facts that $v_4'(\ell_1+\ell_2+\ell_3) = 0$ and $v_4'(L) = -\alpha$, if $v_4''(x_0) = 0$ for some $x_0 \in (\ell_1 + \ell_2 + \ell_3, L)$ then $v_4'(x_0) < v_4'(L) = -\alpha$ and we are done. Indeed, $v_4''$ vanishes at $x_0 = \ell_1 + \ell_2 + \ell_3 + \frac{\alpha}{3a_4\ell_1}$, and we have that $x_0 < L$ if and only if $\ell_4 > \ell_1$, which again from (A.14) holds if and only if $\ell_1 < \frac{\gamma}{2}$, which is true from (A.18). We conclude that this type of solution is not possible either.

Proposition A.1.2. The case N-0-P1-C cannot occur.

Proof. Suppose that there exist points $0 < x_1 < \tilde{x}_1 < L$ such that $u < 0$ in $(0, x_1)$, $u = 0$ in $(x_1, \tilde{x}_1)$ and $u > 0$ in $(\tilde{x}_1, L)$. Suppose further that $u'$ is constant in $(\tilde{x}_1, L)$ and $u$ is continuous at $x = L$. Similarly to the proofs of Propositions 4.6.2 and A.1.1 we can show that the gradient of $u$ in $(0, x_1)$ and in $(\tilde{x}_1, L)$ is the same, say equal to $c$, and that $w = c$ in $(0, L)$. Note that due to symmetry the condition $u(L) = h/2$ holds. As in Propositions 4.6.2 and A.1.1 we have, setting $\ell_1 = x_1$, $\ell_2 = \tilde{x}_1 - x_1$, $\ell_3 = L - \tilde{x}_1$,
\[
v_1(x) = -\frac{\alpha}{3\ell_1^2}x^3 + \frac{\alpha}{\ell_1}x^2, \quad x \in (0, x_1), \tag{A.19}
\]
\[
v_2(x) = \alpha(x - \ell_1) + \frac{2\alpha\ell_1}{3}, \quad x \in (x_1, \tilde{x}_1), \tag{A.20}
\]
\[
v_3(x) = -\frac{\alpha}{3\ell_1^2}(x - \ell_1 - \ell_2)^3 + \alpha(x - \ell_1 - \ell_2) + \alpha\ell_2 + \frac{2\alpha\ell_1}{3}, \quad x \in (\tilde{x}_1, L), \tag{A.21}
\]
and $v_3$ satisfies the following conditions at $x = L = \ell_1 + \ell_2 + \ell_3$:
\[
v_3(\ell_1+\ell_2+\ell_3) = 0 \iff -\frac{\ell_3^3}{3\ell_1^2} + \ell_3 + \ell_2 + \frac{2\ell_1}{3} = 0, \tag{A.22}
\]
\[
|v_3'(\ell_1+\ell_2+\ell_3)| \le \alpha \iff 0 \le \frac{\ell_3^2}{\ell_1^2} \le 2, \tag{A.23}
\]
\[
-v_3''(\ell_1+\ell_2+\ell_3) = \frac{h}{2} \iff \ell_1^2 = \frac{4\alpha\ell_3}{h}. \tag{A.24}
\]
Combining (A.23) and (A.24) we get
\[
0 \le \ell_3 \le \frac{8\alpha}{h}. \tag{A.25}
\]
Moreover, combining (A.22) and (A.24) we get
\[
-\frac{h\ell_3^2}{12\alpha} + \ell_3 + \ell_2 + \frac{2\ell_1}{3} = 0. \tag{A.26}
\]
However, one can easily check that the term $-\frac{h\ell_3^2}{12\alpha} + \ell_3$ is positive if and only if $0 \le \ell_3 \le \frac{12\alpha}{h}$, which is the case from condition (A.25); thus (A.26) gives a contradiction. We conclude that this type of solution cannot occur either.

Proposition A.1.3. The case N-0-P2-C cannot occur.

Proof.
Suppose that there exist points $0 < x_1 < \tilde{x}_1 < x_2 < L$ such that $u < 0$ in $(0, x_1)$, $u = 0$ in $(x_1, \tilde{x}_1)$ and $u > 0$ in $(\tilde{x}_1, L)$. Suppose further that $u'$ has a positive jump at $x_2$ and $u$ is continuous at $x = L$. As in the previous cases we have
\[
v_1(x) = -\frac{\alpha}{3\ell_1^2}x^3 + \frac{\alpha}{\ell_1}x^2, \quad x \in (0, x_1), \tag{A.27}
\]
\[
v_2(x) = \alpha(x - \ell_1) + \frac{2\alpha\ell_1}{3}, \quad x \in (x_1, \tilde{x}_1), \tag{A.28}
\]
\[
v_3(x) = -\frac{\alpha}{3\ell_1^2}(x - \ell_1 - \ell_2)^3 + \alpha(x - \ell_1 - \ell_2) + \alpha\ell_2 + \frac{2\alpha\ell_1}{3}, \quad x \in (\tilde{x}_1, x_2), \tag{A.29}
\]
with $v_3$ satisfying the following conditions:
\[
v_3'(\ell_1+\ell_2+\ell_3) = 0 \iff \ell_3 = \ell_1, \tag{A.30}
\]
\[
v_3(\ell_1+\ell_2+\ell_3) = \beta \iff \frac{4}{3}\ell_1 + \ell_2 = \frac{\beta}{\alpha}. \tag{A.31}
\]
Finally, as before, $v_4$ will be of the form
\[
v_4(x) = a_4(x - \ell_1 - \ell_2 - \ell_3)^3 - \frac{\alpha}{\ell_1}(x - \ell_1 - \ell_2 - \ell_3)^2 + \beta, \quad x \in (x_2, L), \tag{A.32}
\]
with the following conditions at $x = L = \ell_1 + \ell_2 + \ell_3 + \ell_4$:
\[
v_4(\ell_1+\ell_2+\ell_3+\ell_4) = 0 \iff a_4\ell_4^3 - \frac{\alpha\ell_4^2}{\ell_1} + \beta = 0, \tag{A.33}
\]
\[
|v_4'(\ell_1+\ell_2+\ell_3+\ell_4)| \le \alpha \iff \left| 3a_4\ell_4^2 - \frac{2\alpha\ell_4}{\ell_1} \right| \le \alpha, \tag{A.34}
\]
\[
-v_4''(\ell_1+\ell_2+\ell_3+\ell_4) = \frac{h}{2} \iff -6a_4\ell_4 + \frac{2\alpha}{\ell_1} = \frac{h}{2}. \tag{A.35}
\]
Combining (A.34) and (A.35) we end up with
\[
\frac{\ell_4}{\ell_1} + \frac{h\ell_4}{4\alpha} \le 1, \tag{A.36}
\]
which in particular implies that
\[
\ell_4 < \ell_1. \tag{A.37}
\]
Observe now that, because $\ell_2 > 0$, from (A.31) we get that $\ell_1 < \gamma/4$. But now, since $v_4(\ell_1+\ell_2+\ell_3) = \beta$ and $v_4(\ell_1+\ell_2+\ell_3+\ell_4) = 0$, from the mean value theorem and the fact that the derivative of $v_4$ is decreasing we have that
\[
v_4'(\ell_1+\ell_2+\ell_3+\ell_4) < -\frac{\beta}{\ell_4}. \tag{A.38}
\]
Moreover,
\[
\ell_4 < \ell_1 < \frac{\gamma}{4} \;\Rightarrow\; -\frac{\beta}{\ell_4} < -\frac{4\alpha}{3} < -\alpha,
\]
and thus from (A.38) we have that $v_4'(L) < -\alpha$, which is a contradiction. Thus this kind of solution is not possible either.

A.2 Some useful Lemmas

The following lemma (pointed out by Luca Calatroni) was used in the proof of Theorem 3.5.1.

Lemma A.2.1 (Kronecker's lemma). Suppose that $(a_n)_{n\in\mathbb{N}}$ and $(b_n)_{n\in\mathbb{N}}$ are two sequences of real numbers such that $\sum_{n=1}^{\infty} a_n < \infty$ and $0 < b_1 \le b_2 \le \cdots$ with $b_n \to \infty$. Then
\[
\frac{1}{b_n}\sum_{k=1}^{n} b_k a_k \to 0, \quad \text{as } n \to \infty.
\]
In particular, if $(c_n)_{n\in\mathbb{N}}$ is a decreasing positive real sequence such that $\sum_{n=1}^{\infty} c_n^2 < \infty$, then
\[
c_n \sum_{k=1}^{n} c_k \to 0, \quad \text{as } n \to \infty.
\]
(The second statement follows from the first by choosing $a_n = c_n^2$ and $b_n = 1/c_n$; note that $c_n \to 0$ since $\sum_{n=1}^{\infty} c_n^2 < \infty$, so that $b_n \to \infty$.)

The following lemma was used in the proof of Lemma 5.2.1.

Lemma A.2.2. Let $d \ge 2$. Then for every $i, j = 1, \ldots, d$ with $i \ne j$ and for every $R > 0$ we have
\[
\int_{|z|<R}
\]

Proof. We consider separately the cases $d = 2$ and $d \ge 3$.

$d = 2$: Employing the transformation $z_1 = r\cos\theta$, $z_2 = r\sin\theta$, we have that
\[
\int_{|z|<R}
\]

$d \ge 3$: For this case we employ the hyperspherical coordinate transformation
\begin{align*}
z_1 &= r\cos\theta_1, \\
z_2 &= r\sin\theta_1\cos\theta_2, \\
z_3 &= r\sin\theta_1\sin\theta_2\cos\theta_3, \\
&\;\;\vdots \\
z_{d-1} &= r\sin\theta_1\cdots\sin\theta_{d-2}\cos\theta_{d-1}, \\
z_d &= r\sin\theta_1\cdots\sin\theta_{d-2}\sin\theta_{d-1}.
\end{align*}
Here $\theta_1, \theta_2, \ldots, \theta_{d-2}$ range over $[0, \pi)$ while $\theta_{d-1}$ ranges over $[0, 2\pi)$. The determinant of the Jacobian of this transformation is $r^{d-1}\sin^{d-2}\theta_1 \sin^{d-3}\theta_2 \cdots \sin\theta_{d-2}$, thus we can calculate
\[
\int_{|z|<R}
\]