Collaborating queues: large service network and a limit order book

Elena Yudovina
Emmanuel College
University of Cambridge

Submitted April 2012, revised May 2012

This dissertation is submitted for the degree of Doctor of Philosophy

Summary

We analyse the steady-state behaviour of two different models with collaborating queues: that is, models in which “customers” can be served by many types of “servers”, and “servers” can process many types of “customers”.

The first example is a large-scale service system, such as a call centre. Collaboration is the result of cross-trained staff attending to several different types of incoming calls. We first examine a load-balancing policy, which aims to keep servers in different pools equally busy. Although the policy behaves order-optimally over fixed time horizons, we show that the steady-state distribution may fail to be tight on the diffusion scale. That is, in a family of ever-larger networks whose arrival rates grow as O(r) (where r is a scaling parameter growing to ∞), the sequence of steady-state deviations from equilibrium scaled down by $r^{-1/2}$ is not tight. We then propose a different policy, for which we show that the sequence of invariant distributions is tight on the $r^{1/2+\epsilon}$ scale, for any $\epsilon > 0$. For this policy we conjecture that tightness holds on the diffusion scale as well.

The second example models a limit order book, a pricing mechanism for a single-commodity market in which buyers (respectively sellers) are prepared to wait for the price to drop (respectively rise). We analyse the behaviour of a simplified model, in which the arrival events are independent of each other and the state of the limit order book. The system can be represented by a queueing model, with “customers” and “servers” corresponding to bids and asks; the roles of customers and servers are symmetric. We show that, with probability 1, the price interval breaks up into three regions.
At small (respectively large) prices, only finitely many bid (respectively ask) orders ever get fulfilled, while in the middle region all orders eventually clear. We derive equations which define the boundaries between these regions, and solve them explicitly in the case of iid uniform arrivals to obtain numeric values of the thresholds. We derive a heuristic for the distribution of the highest bid (respectively lowest ask), and present simulation data confirming it.

Contents

Chapter 1. Introduction  1
  Location of original results  4
  Notation  4
Chapter 2. Large service network  7
  Introduction  7
  1. Call centre model and the static planning problem  10
  2. Complete resource pooling  12
  3. LQFS-LB and LAP algorithms  14
  4. Fluid-scaled convergence for LQFS-LB and LAP  15
  5. LQFS-LB fluid models near equilibrium  22
  6. LAP fluid models: convergence to equilibrium  38
  7. LQFS-LB steady-state on the diffusion scale  42
  8. LAP steady-state on sub-fluid scales  56
Chapter 3. Limit order book  65
  Introduction  65
  1. Limit order book model  67
  2. Main results  69
  3. Monotonicity  70
  4. Proof of Theorems 3.5 and 3.7  73
  5. Strict limit order book  75
  6. Exact values of the thresholds $\kappa_b$ and $\kappa_a$  77
  7. Restricted limit order book, and conjecture on steady-state behaviour  84
  8. Lyapunov function  86
  9. Arrival distributions  89
  10. Market orders  92
  11. Simulation results  93
Bibliography  95
Appendix A. Continuity of functions  99
Appendix B. Another reason to restrict to a tree  101
Appendix C. Halfin-Whitt regime  103
Appendix D. Computations  107
  1. Computations for Example 2.32  107
  2. Computations for Example 2.36  111
  3. Computations for Lemma 2.39  113
  4. Vertices of the level set of the Lyapunov function in §3.8  117

CHAPTER 1
Introduction

In this thesis we discuss two examples of queueing models, with the following unifying characteristics. We have interactions of two “genders” of agents; for example, customers and servers, bids and asks, or passengers and taxis.
Activity (often, service) may happen when agents of opposite gender meet. However, agents of both genders come in multiple types, and the quality of possible matchings depends on the pair of types being matched; some pairings may be outright impossible. To use the passenger-taxi example, a passenger may have a certain quantity of luggage, for which a taxi may not have enough room; here some pairings are impossible. Our interest will be in algorithms that determine which of the possible pairings of agents actually occur.

There is a rich body of literature discussing problems of similar flavour. We give a broader historical introduction in this chapter, mentioning current research in the chapters to which it is applicable.

The concept of the two-sided queue, and in particular of the ubiquitous taxi-stand analogy, dates at least to Kendall. Kendall [1951] briefly considers the two-sided queue with exponential arrivals, describing its distribution as the difference of two Poisson processes or a symmetric random walk. Slightly later, Brigham [1955] discusses a many-server system, thinking of the waiting periods of the servers (who here are attendants behind the counter) as well as of the customers. The paper of Foster [1959] discusses a manufacturing queue with a finite amount of waiting room; the finite buffer size implies a certain duality between the arriving jobs and the servers working on them. This duality means that the roles of the two can be interchanged without altering the mathematical analysis.

Manufacturing systems naturally lead to the concept of multiclass queues: that is, there will be stations which can be working on customers of different types, and decisions will need to be made about which customer to serve first. Customers may even visit a station more than once.
Jackson [1957] considers a manufacturing system with M “departments” containing machines; each arriving job needs to be processed by one or more of the machines, in a fixed order (which may be different for different job types). Consequently, the machines will need to choose the queue from which they are currently taking jobs. Generalizations of this set-up have been successfully considered by Jackson [1963] (in the paper which introduced the concept of “Jackson networks”), Kelly [1976] (“Kelly networks”), and Baskett et al. [1975] (“BCMP networks”). These models have a steady-state distribution, which can be computed explicitly and depends only on the mean arrival rates and mean service times of customers. Interest in such networks was renewed after Kumar and Seidman [1990] and others ([Rybko and Stolyar, 1992, Bramson, 1994, Dumas, 1997]) constructed examples of processing networks with counterintuitive stability conditions.

The focus in this line of research has been on the interactions between job types that result from the multi-stage processing inherent in a manufacturing system. Conversely, there are many queueing models in which customers require processing only once, and then leave the network; this is the natural assumption when customers are human. Kendall [1951] includes in his discussion the notion of parallel-service systems, in which there is a single (undifferentiated) server pool, and customers simply go to the next available server. A more interesting case is when customers are required to pick a server on arrival, causing several parallel queues. Vvedenskaya et al. [1996] present a spectacular example of a state-dependent routing algorithm improving performance of the system.
Specifically, if a customer, on arrival, may look at a randomly chosen pair of queues and always enters the shorter one, then the probability of a queue size being substantially larger than average decays much faster – superexponential decay – than if the customer were simply routed to a randomly chosen queue.

Our interest is in models where both “customers” and “servers” have a choice. Interest in such models appears to be more recent. Wein [1991] introduces a model for a network with both routing decisions (the customer, or job, may choose the server from which it receives service) and scheduling decisions (the server may choose which customer to take into service next). The model in [Wein, 1991] arises as a version of the manufacturing set-up, but one in which there are several parallel lines of machines that could work on a job, and the job may choose to switch from one line to another. Kelly and Laws [1993] discuss the emergence of resource pooling in some models with customer routing: although there are many server pools with differentiated skills, under certain conditions they behave as if there were just a single, large, server pool. (That is, the queue size scales as it would for a single, faster, pool of servers, not as it would for a set of parallel queues.) This means that efficiency can be increased by merging smaller systems together, creating larger, more flexible working pools. The paper of van Mieghem [1995] discusses an optimal dynamic control policy for a system with multiple customer classes waiting to be serviced by a single pool of servers.

A common theme in analysis of queueing networks is using some form of asymptotic approximations for analysing “large” networks. This is because, with few exceptions, computing the steady-state distribution of a particular network is difficult. Moreover, even when the distribution can be written down in closed form, for example in [Kelly, 1976], the answer is often unenlightening.
In practice, asymptotics which reveal the scaling behaviour (“If I double the arrival rates and number of servers, what will happen to the queue size?”) are often more useful. Several scaling regimes are commonly used, among them “heavy traffic” (introduced by Kingman [1962]) and “diverse routing” (studied by Ziedins and Kelly [1989]; see also Whitt [1985]). Often a single queueing model can support several limiting regimes. For example, Halfin and Whitt [1981] made the interesting observation that in a system with many servers, the customer waiting times could be kept small even when the load on the system was quite high. This is a “heavy traffic” scenario which cannot be observed in the conventional heavy traffic scaling. (Conventional heavy traffic assumes that, as arrival rates increase, the service rates of individual servers increase proportionally; here we instead have more servers working at the original speed.) This implies that in a large service system (such as a call centre), the overstaffing necessary for all customers to have small waiting times is much smaller than would be expected from conventional heavy traffic approximations. There is also a flourishing theory of diffusion approximations (see for instance [Harrison and Nguyen, 1990]), which studies the scaling behaviour of the stochastic process of deviations of the system from some nominal working point, approximating it by the solution of an appropriate stochastic differential equation.

Throughout this thesis, we will frequently be interested in asymptotic questions. Often there is a tension between the limiting regime imposed by the steady-state behaviour of the system (i.e., t → ∞), and the limiting regime imposed by taking a “large” system. (Large here means that we consider a family of systems indexed by a scaling parameter r → ∞; typically, r determines the rate at which work arrives, with work arriving into the rth system at rate λr.)
In particular, in situations when one is interested in the long-term behaviour of a large system, one could consider taking these limits in either order; and much interesting research has been concerned with the question of whether the two procedures commute. In terms of the diagram below, we would like to know whether there is convergence along all edges, and if so, whether the limits can be taken in either order.

$$
\begin{array}{ccc}
\hat{X}^r(t) & \xrightarrow[\;t\to\infty\;]{\text{steady-state distribution}} & \hat{X}^r \\[4pt]
\Big\downarrow {\scriptstyle\ \text{limiting process},\ r\to\infty} & & \Big\downarrow {\scriptstyle\ \text{appropriate scaling?},\ r\to\infty} \\[4pt]
\hat{X}(t) & \xrightarrow[\;t\to\infty\;]{\text{steady-state distribution?}} & \hat{X}
\end{array}
$$

Much of Chapter 2 is concerned with this question explicitly. In Chapter 3, although we are unable to prove the existence of a steady-state distribution, we certainly will be interested in the question of whether different asymptotic approximations commute.

Last, we note that the problem of interactions between two “genders” of agents, which we informally posed, does not have to be modelled as a queueing system. That is, we may not want to introduce a stochastic process of arrivals and of service times. For example, Caldentey et al. [2009] study the problem using so-called infinite bipartite matchings. Specifically, they make the assumption that, in the problem of pairing “passengers” and “taxis”, there is an infinite stream supplying each type of agent, and the goal of each agent is simply to find a match. The Caldentey et al. [2009] model was inspired by housing projects, in which interested applicants are matched with housing as it becomes available, and there may not be a meaningful notion of an arrival process, not to mention service time. Our model of the limit order book in Chapter 3 fits into their framework.

Acknowledgments

This thesis would have been impossible without two people: first, my supervisor at the University of Cambridge, Professor Frank P. Kelly, and second, Dr. Alexander L. Stolyar at Alcatel-Lucent Bell Labs. My deepest gratitude is due to both of them.
Portions of this thesis can be traced to conversations with many other people, among them Maury Bramson, Michael Chmutov, Sergey Foss, Florian Simatos, Yuri Suhov, Neil Walton, Richard Weber, Damon Wischik, and the referee of [Stolyar and Yudovina, 2010] for the Annals of Applied Probability. Special thanks to Daniel Whalen for help with the Mathematica graphics. Many thanks also to Michael Chmutov, Vladimir Dokchitser, and Daniel Whalen for proofreading parts of the dissertation.

My PhD studies have been generously funded by the US National Science Foundation Graduate Research Fellowship.

Last but not least, I am indebted to the generosity of colleagues, friends, and family members, who have surrounded me with patience, kindness, and biscuits throughout the thesis-writing process.

Map of thesis

Each chapter is self-contained, with its own introduction and summary of recent relevant literature.

In Chapter 2 we introduce the model of a large service network, motivated by call centres. We study two algorithms for making routing and scheduling decisions. One (LQFS-LB) arises naturally from a static planning problem, but we show that it can lead to undesirable behaviour (unstable fluid-scale approximations over a finite time horizon, and “large” steady-state deviations from equilibrium). The other (LAP) is designed to squash such instability, and we prove that the steady-state deviations from equilibrium when LAP is used are “not too large”. We discuss finite-time horizon behaviour on a variety of scales, steady-state behaviour, and the interplay between them.

In Chapter 3 we discuss the concept of limit order books, and formulate a simple, analytically tractable model of one. We then show that even such a simple model can have interesting behaviour.

Appendices contain extra information. Appendix A gives a brief overview of the various notions of continuity of functions that we use in the thesis.
Appendix B contains a discussion of unstable networks, and offers an intuition for wanting to consider trees in Chapter 2. Appendix C presents a summary of the results in Halfin and Whitt [1981], which form the inspiration and basis for much of Chapter 2. Appendix D contains computations which would be too bulky to include in the main text.

Assumptions, definitions, examples, lemmas, propositions, and theorems share a common numbering, and are numbered consecutively within each chapter; equations are numbered sequentially throughout the thesis. Except in this chapter and the appendices, section numbers refer to sections of the same chapter.

Pages 89, 93, 94, and 117 are best viewed in colour.

Location of original results

The following sections contain new models, algorithms, or results: 2.3–8, 3.1–11, Appendix D.

Chapter 2 presents work undertaken in collaboration with Alexander L. Stolyar (Alcatel-Lucent Bell Labs, Murray Hill, NJ). The results follow [Stolyar and Yudovina, 2010], [Stolyar and Yudovina, 2012] and the corrections suggested by the referees of [Stolyar and Yudovina, 2010] for the Annals of Applied Probability, but the proofs in many sections (particularly §2.7) have been expanded. The associated computations in Appendix D.1–3 are my own, and the expository Appendices B and C are new in the thesis.

Although there necessarily is a certain amount of overlap between the material in this dissertation and the fourth term report and the Smith-Knight and Rayleigh-Knight prize essay I submitted at the end of my fourth term at Cambridge, essentially all of the text has been rewritten.

Notation

Vectors and matrices. In Chapter 2, we will encounter many vectors indexed by sets $\mathcal{I}$, $\mathcal{J}$, $\mathcal{E}$, $\mathcal{C}(j)$, and $\mathcal{S}(i)$. $\mathcal{I}$ and $\mathcal{C}(j)$ index customer types; their elements are denoted i, i′, etc. $\mathcal{J}$ and $\mathcal{S}(i)$ index server types; their elements are denoted j, j′, etc. $\mathcal{E}$ indexes (a subset of) customer-server type pairings; its elements are denoted (ij).
For any symbol γ, $(\gamma_i, i \in \mathcal{I}) = (\gamma_i) = \gamma_{\mathcal{I}}$. Similarly, $(\gamma_j, j \in \mathcal{J}) = (\gamma_j) = \gamma_{\mathcal{J}}$, and $(\gamma_{ij}, (ij) \in \mathcal{E}) = (\gamma_{ij}) = \gamma_{\mathcal{E}}$. Occasionally, we also use $\gamma_{\mathcal{I}\mathcal{J}} = (\gamma_{ij}, i \in \mathcal{I}, j \in \mathcal{J})$. Although elements of $\gamma_{\mathcal{E}}$ may have a double index ij, we treat $\gamma_{\mathcal{E}}$ as a (column) vector, not as a matrix. Unless specified otherwise,

$$\sum_j \gamma_{ij} = \sum_{j \in \mathcal{S}(i)} \gamma_{ij}, \qquad \sum_i \gamma_{ij} = \sum_{i \in \mathcal{C}(j)} \gamma_{ij}.$$

The symbols γ and Γ will reappear as placeholders, but do not have any specific meaning in the thesis.

In matrix expressions, vectors are column vectors unless specified otherwise. For a (column) vector v, its transpose (a row vector) is denoted $v^\top$. Similarly, for a matrix M, its transpose is denoted $M^\top$. For a vector $v \in \mathbb{R}^d$, its Euclidean norm is denoted ‖v‖. The zero vector is denoted simply 0; it will be clear from context that the quantity is a vector.

Sets. $\mathcal{M}$, $\mathcal{N}$ are manifolds. $\mathcal{I}$, $\mathcal{J}$, $\mathcal{E}$, $\mathcal{C}(j)$, $\mathcal{S}(i)$ are discrete index sets. $\mathcal{A}$ is an event. (Also written in the same script, but not sets: $\mathcal{P}$ is a partial ordering; $\mathcal{L}$ is a Lyapunov function; $\mathcal{F}$ and $\mathcal{F}_n$ are σ-algebras.) The one-point compactification of $\mathbb{R}^d$ is denoted $\overline{\mathbb{R}}^d$. The σ-algebra on $\mathbb{R}^d$ and $\overline{\mathbb{R}}^d$ is always the Borel σ-algebra. The space of RCLL functions with domain [η,∞) and values in $\mathbb{R}^d$ is denoted $D^d[\eta,\infty)$. Usually, η = 0. (RCLL means “right-continuous with left limits”, see below under Functions.) The notion of convergence on $D^d[\eta,\infty)$ is uniform convergence on compact sets; see below under Convergence.

Measures. Measures on $\mathbb{R}^d$ for the appropriate dimension d, or on its one-point compactification $\overline{\mathbb{R}}^d$, are denoted using Gothic script; e.g., $\mathfrak{M}$, $\mathfrak{A}$, $\mathfrak{D}$, $\mathfrak{Q}$. ($\mathfrak{A}$ is meant to resemble “A”; $\mathfrak{Q}$ is meant to resemble “Q”.) Of these, $\mathfrak{A}$, $\mathfrak{D}$, and $\mathfrak{Q}$ are counting measures on $\mathbb{R}$. For a measure $\mathfrak{M}$ on $\mathbb{R}$, we write $\mathfrak{M}[a, b]$, $\mathfrak{M}[a, b)$, and $\mathfrak{M}\{a\}$ to denote $\mathfrak{M}([a, b])$, $\mathfrak{M}([a, b))$, and $\mathfrak{M}(\{a\})$ respectively. $\pi$ and $\varpi$ are probability measures on [0, 1].

Partial orderings. Partial orderings are named $\mathcal{P}$ and variations thereon, and denoted x ≺ y. If x and y are incomparable, i.e.
none of x ≺ y, x = y, or x ≻ y holds, we write x ∼ y. (Our use of ≻ in §2.5 has nothing to do with partial orderings.)

Functions and random processes. For functions (or random processes) (γ(t), t ≥ 0) we often write γ(·); we also do this for functions with domain different from [0,∞). For a vector of functions, we may combine the shorthand vector notation with the shorthand function notation: for example, $(\gamma_i(\cdot))$ and $\gamma_{\mathcal{I}}(\cdot)$ both signify $((\gamma_i(t), i \in \mathcal{I}), t \ge 0)$. For $\gamma_t$ a state variable indexed by time, $\gamma_{t-} \equiv \lim_{\epsilon \downarrow 0} \gamma_{t-\epsilon}$ and $\gamma_\infty \equiv \lim_{t \uparrow \infty} \gamma_t$, provided the limit exists.

The indicator function of a set A is denoted $1_A$; that is, $1_A(\omega) = 1$ if ω ∈ A, and 0 otherwise. The symbol $\mathcal{L}$ denotes a Lyapunov function; see §2.6–7 and §2.8.

The term “RCLL” means “right-continuous with left-limits” (also denoted càdlàg in literature). These are functions γ(·) for which γ(t−) exists but need not be equal to γ(t), and $\gamma(t+) \equiv \lim_{\epsilon \downarrow 0} \gamma(t+\epsilon)$ exists and is equal to γ(t).

The derivative of a function f(·) is denoted $\dot{f}$.

Convergence. The symbol ⟹ denotes convergence in distribution of random processes in the Skorohod space $D^d[\eta,\infty)$, uniformly on compact sets.¹ The symbol $\xrightarrow{w}$ denotes weak convergence of probability measures on $\mathbb{R}^d$ or its one-point compactification $\overline{\mathbb{R}}^d$. It also denotes convergence in law of the associated random variables.

¹ The usual topology on the space $D^d[\eta,\infty)$ of RCLL functions is the “Skorohod topology,” or more precisely one of the Skorohod topologies. The need for a topology other than one of uniform convergence arises because $D^d[\eta,\infty)$ is not separable in the topology of uniform convergence on compact sets. However, differences between convergence in the uniform sense and convergence in the Skorohod sense arise only at jump points of the limiting process, and all limiting processes we consider will be continuous. There is an excellent discussion of this point – and the Skorohod topology – in [Pollard, 1984, Chapter VI.1].
The symbol → denotes ordinary convergence in $\mathbb{R}^d$, $\overline{\mathbb{R}}^d$, or $D^d[\eta,\infty)$. The term u.o.c. means uniform(ly) on compact sets; the domain may be defined explicitly, or be obvious from the context. The term w.p.1 means with probability 1, which is the same as almost surely.

Miscellaneous. For x ∈ $\mathbb{R}$, ⌊x⌋ is the greatest integer less than or equal to x. (The notation [x] in §3.6 is unrelated.) We use ≡ as the assignment operator, and = in equalities. That is, if we define x ≡ 2 + 2, then the equality x = 4 holds.

The index r is a scaling parameter. We are typically interested in the behaviour of quantities as r → ∞. For a function f(r) we say that f(r) is O(r) if $|r^{-1} f(r)|$ is bounded as r → ∞, and f(r) is o(r) if $|r^{-1} f(r)| \to 0$ as r → ∞. We write o(1) to mean some function which converges to 0.

For a sequence of random variables, iid means independent and identically distributed.

CHAPTER 2
Large service network

Introduction

In this chapter we model a system in which service requests of several different types arrive externally, are processed by servers with varying skills, and leave the system. Examples of such systems include call centres, cloud computing (where jobs submitted to the cloud take the role of service requests, and the machines take the role of servers; instead of “skill” the servers may be differentiated by available memory and processing power), as well as emergency wards in hospitals. Our primary example will be a call centre; we will therefore refer to the service requests as “customers”.

A common feature of these applications is the large number of servers they employ, and the relatively unscalable processing requirements for each service activity. (A call centre agent may get marginally faster at processing calls on a busy day, but the effect is unlikely to be significant.) To compensate for the inflexible speed of servers, we instead may adjust their number (e.g., by hiring more call centre agents).
To gain insight into the behaviour of a large call centre, we will be looking at the “many-server” asymptotic regime, in which the individual contribution of each server to the total processing capacity becomes negligible as the system grows. When the arrival rate of calls is close to the maximal that the servers are capable of processing, this asymptotic approximation is also known as the “Halfin-Whitt regime”, after the authors of [Halfin and Whitt, 1981].

Halfin and Whitt [1981] show that, for a model of a single-class many-server queue, by carefully managing staffing levels as the system grows larger, the expected time that customers spend queueing prior to entering service tends to zero, while the probability that an arriving customer has to wait tends to a constant strictly between 0 and 1. This is achieved by having a system with O(r) arrival rate, putting in $O(\sqrt{r})$ extra servers beyond the minimal number necessary to process all the arriving work on average. (In contrast, usual heavy load techniques only guarantee small customer waiting times – and then asymptotic probability 0 of having to wait at all – with O(r) overstaffing. The difference is considerable when there are hundreds – or, in the case of a large call centre, thousands – of agents.)

The analysis of Halfin and Whitt (which we briefly summarize in Appendix C) uses undifferentiated customers and servers. However, in a call centre there are many types of customer requests (e.g., “I lost my credit card,” “I can’t log into online banking,” and “I need to transfer money to an account overseas”¹), which are typically serviced by different pools of agents. The different pools are not entirely separated, because agents are typically cross-trained: for example, although we have assumed that “online banking” and “lost credit card” are two different call types, there probably are agents who can both email you a password reminder and block your lost credit card.
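As a rough numerical illustration of the square-root staffing rule of Halfin and Whitt described above, the following sketch evaluates the classical Erlang C formula for a single-class M/M/n queue with unit service rate. The function names, the overstaffing coefficient `beta = 1.0`, and the particular loads are illustrative assumptions rather than anything taken from the thesis. The sketch only suggests that, with about `beta * sqrt(load)` servers beyond the offered load, the probability of waiting stays near a constant while the mean wait shrinks as the system grows.

```python
import math


def erlang_c(num_servers, offered_load):
    """Probability that an arrival must wait in an M/M/n queue (Erlang C).

    offered_load is the arrival rate divided by the per-server service rate.
    """
    if offered_load >= num_servers:
        return 1.0  # overloaded queue: every arrival waits
    # Erlang B blocking probability via the standard stable recursion.
    blocking = 1.0
    for k in range(1, num_servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    utilisation = offered_load / num_servers
    # Convert Erlang B into Erlang C.
    return blocking / (1.0 - utilisation * (1.0 - blocking))


def square_root_staffing(offered_load, beta):
    """Illustrative rule: offered load plus beta * sqrt(load) extra servers."""
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


def staffing_table(loads, beta=1.0):
    """Return (load, servers, P(wait), mean wait) for each load, with mu = 1."""
    rows = []
    for load in loads:
        servers = square_root_staffing(load, beta)
        p_wait = erlang_c(servers, load)
        mean_wait = p_wait / (servers - load)  # E[W] = C / (n * mu - lambda)
        rows.append((load, servers, p_wait, mean_wait))
    return rows


if __name__ == "__main__":
    for load, servers, p_wait, mean_wait in staffing_table([100, 1000, 10000]):
        print(f"load={load:6d} servers={servers:6d} "
              f"P(wait)={p_wait:.3f} E[wait]={mean_wait:.5f}")
```

Running the script prints one row per load; comparing the rows shows how the waiting probability and the mean wait behave as the load grows by factors of ten under this staffing rule.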
It is likely that not every server can service every request type, and the associated service times may well vary. The challenge for a call centre then becomes to assign customers to servers in such a way that the entire system “looks like” a single pool of agents; in particular, so that the entire system has little customer waiting with only $O(\sqrt{r})$ overstaffing, even if the arrival rates of calls of different types change.

¹ The examples of call types listed here are simply guesses formulated while waiting on the phone; the actual division of incoming requests into classes in a bank’s call centre could be completely different.

Let us classify the servers by the training they have received, and the associated average speed with which they can process different types of customer requests. If all of the parameters of the system, such as arrival rates and mean service times, are known, we can use the solution of the static planning problem (§1.2) to design a simple probabilistic routing mechanism. If the parameters of the system satisfy the complete resource pooling condition (Assumption 2.4), Halfin-Whitt-like behaviour is likely to emerge. (We discuss this point in §1.2, after Assumption 2.4.)

However, in practice we usually want algorithms which do not rely on precise knowledge of arrival rates, since these are external to the system and may well change. In this case, somewhat less is known. A myopic, “maximal weight”-like policy, which is optimal when the number of servers is fixed [Mandelbaum and Stolyar, 2004], is known not to have optimal overstaffing requirements in the many-server regime [Stolyar and Tezcan, 2010]. Stolyar and Tezcan [2010, 2011] propose a “shadow routing” algorithm, which they conjecture does have optimal overstaffing behaviour.

Both the maximal weight and the shadow routing algorithms rely on the precise knowledge of the mean service times. However, in real systems these also can only be approximated.
It would be preferable if the algorithm for assigning jobs to servers used only the information on system state (such as queue sizes and number or proportion of idle servers in each pool) and did not explicitly rely on either the arrival or the service rates. The two algorithms we investigate in this chapter rely only on knowing the basic activity tree. This is defined in §1.2; intuitively, it indicates the set of “most efficient” customer-server type pairings for the given arrival pattern and set of service rates. We would expect it to change only rarely, because computing the basic activity tree only requires approximate knowledge of the system parameters, and is insensitive to small perturbations.

We will consider two algorithms. One (longest-queue freest-server load balancing, or LQFS-LB) has a more natural definition; however, we show that it can “misbehave”, in the sense of having large deviations from equilibrium (which will be defined later). In particular, we show that almost always in steady state the system is too far from equilibrium for diffusion approximations to be applicable. (The finite-time-horizon diffusion approximation for a family of algorithms including this one has been rigorously constructed by Gurvich and Whitt [2009]; we summarize the relevant results in §7.1.) We informally conjecture that this behaviour is “rare”: all the counterexamples we have been able to construct have somewhat unrealistic parameter values. We show that for certain parameter values the algorithm really does show the Halfin-Whitt regime behaviour (infinitesimal average waiting times and finite probability of customer waiting, with $O(\sqrt{r})$ overstaffing).

The other algorithm we consider (leaf activity priority, or LAP) is more robust, but its operating point is less intuitive. For it, we conjecture the correct overstaffing behaviour (infinitesimal average waiting times and finite probability of customer waiting, with $O(\sqrt{r})$ overstaffing).
We prove a slightly weaker result, namely that the deviations of the system state from equilibrium are $O(r^{1/2+\epsilon})$ for any $\epsilon > 0$.

As was mentioned in Chapter 1, for queueing models in general, and many-server models in particular, there is a tension between the time scaling and space scaling. We will be interested in the long-term behaviour of a large network, and we will consider a family of ever-larger networks, indexed by a scaling parameter r. We might then do one of two things: (a) consider the associated family of steady-state distributions (possibly, centered and rescaled), and take the limit; or (b) construct a limiting process which approximates system behaviour (appropriately scaled) over a finite time horizon, and take its steady-state distribution. Schematically,

$$
\begin{array}{ccc}
\hat{X}^r(t) & \xrightarrow[\;t\to\infty\;]{\text{steady-state distribution}} & \hat{X}^r \\[4pt]
\Big\downarrow {\scriptstyle\ \text{limiting diffusion process},\ r\to\infty} & & \Big\downarrow {\scriptstyle\ ?,\ r\to\infty} \\[4pt]
\hat{X}(t) & \xrightarrow[\;t\to\infty\;]{\text{steady-state distribution?}} & \hat{X}
\end{array}
$$

Typically, in this diagram it is “easier” to go down and across, i.e. to the limiting process and then to its steady-state behaviour. There are standard techniques for proving convergence of the appropriately scaled state of a large queueing network to a Semi-Martingale Reflected Brownian Motion (SRBM); this was done in conventional heavy traffic by Harrison and Williams [1987], and for multi-server models examples can be found in [Mandelbaum et al., 1998, Pang et al., 2007]. There is also a large body of work studying when the approximating SRBM has an invariant distribution. While the full characterisation of the necessary and sufficient conditions for the SRBM to have an invariant distribution has not been accomplished,² a large class of sufficient conditions is known. (The papers [Harrison and Williams, 1987, El Kharroubi et al., 2000, 2002, Bramson et al., 2010] collectively characterise stability in at most three dimensions. Perhaps more relevantly for queueing applications, [Harrison and Williams, 1987] provides a set of sufficient conditions, which works in arbitrary dimension and is more natural in the context of queueing models.)

However, the diagram above need not commute, primarily because the family of invariant distributions $\{X^r\}$ need not be tight (so need not have any limit points as r → ∞). In the many-server setting, this limit interchange problem has been particularly challenging. While there are a few individual results (notably, Corollary 2 in Halfin and Whitt [1981], reproduced in Appendix C as Theorem C.4; more recently, Gamarnik and Zeevi [2006], Gamarnik and Momcilovic [2008], Gamarnik and Stolyar [2012] as well as Stolyar and Yudovina [2010]), they show tightness only in very specialised settings. In the framework of multitype one-hop queueing networks, we provide an example of a situation where diffusion-scale tightness holds (in §7.4), as well as an example of a natural algorithm for which it does not hold (in §7.2). For the case of the leaf activity priority, in §8 we prove a family of tightness results on scales bigger than the diffusion scale (which are rarely encountered in the literature), and state as a conjecture a tightness result on the diffusion scale.

There is a third direction from which the algorithms we consider look interesting. An important aspect of queueing theory is the study of stability of queueing models. For certain types of queueing networks (Jackson networks [Jackson, 1963], Kelly networks [Kelly, 1976], BCMP networks [Baskett et al., 1975]), the stability criterion is simple: if none of the servers are on average receiving more jobs than they can process, then the network is stable.
The natural conjecture that this is the only requirement for network stability was essentially disproved³ in 1990 with the Kumar-Seidman network [Kumar and Seidman, 1990]; several other examples (e.g., [Rybko and Stolyar, 1992], [Bramson, 1994], [Dumas, 1997]) have since been produced. A common feature of most of these examples of instability is that there is a certain loop, or cycle, in the structure of the job flow graph, and after “going around” this loop the number of unserviced jobs in the system increases. The curious feature of the algorithm we analyse is that, although its routing graph is constrained to be a tree (i.e., cycle-free), it nevertheless supports unstable, exponentially growing perturbations. We discuss this further in Appendix B.

² In fact, as Gamarnik and Katz [2010] show, a simple set of necessary and sufficient conditions cannot be identified: the question of whether a SRBM is positive recurrent is undecidable.

³ There wasn’t a formal conjecture to disprove; rather, the series of papers [Kumar and Seidman, 1990], [Rybko and Stolyar, 1992], [Bramson, 1994], [Dumas, 1997] and others exhibited increasingly natural disciplines displaying instability without any given station being overloaded.

1. Call centre model and the static planning problem

1.1. Scaling regime and state descriptor. Consider a queueing model in which there are $I$ customer classes, or types, labelled $1, 2, \ldots, I$, and $J$ server (agent) pools, or classes, labelled $1, 2, \ldots, J$. Generally, we will use the subscripts $i$, $i'$ (and sometimes $k$) for customer classes, and $j$, $j'$ (and sometimes $k$) for server pools; the sets of customer classes and server classes will be denoted by $\mathcal{I}$ and $\mathcal{J}$ respectively.

We are interested in the scaling properties of the system as it grows large. The meaning of “grows large” is as follows. We consider a sequence of systems indexed by a scaling parameter $r \to \infty$.
As $r$ grows, the arrival rates and the sizes of the service pools, but not the speed of service, increase. Specifically, in the $r$th system, customers of type $i$ enter the system as a Poisson process of rate $\lambda^r_i = r\lambda_i + o(r)$, while the $j$th server pool has $r\beta_j$ individual servers. (All $\lambda_i$ and $\beta_j$ are positive parameters.)

We model the system as input-queued. That is, customers are only assigned to a server type when they are taken into service; if queueing occurs, there is a separate queue for each customer type. We do not allow customers to abandon the system before being served. (In this chapter we will be discussing a system in underload or in Halfin-Whitt-type heavy traffic; for it, waiting times ought to be negligible, and abandonment should not be important.)

When a customer of type $i$ is accepted for service by a server in pool $j$, the service time is exponential with rate $\mu_{ij}$; the service rate depends both on the customer type and the server type, but not on the scaling parameter $r$. If customers of type $i$ cannot be served by servers of class $j$, the service rate is $\mu_{ij} = 0$. All interarrival and service times are taken to be independent exponentials. We present a schematic diagram of such a model in Figure 2.1.

[Figure 2.1: customer types A, B, C, D with arrival rates $\lambda^r_A, \lambda^r_B, \lambda^r_C, \lambda^r_D$; server pools 1 to 4 with sizes $\beta^r_1, \beta^r_2, \beta^r_3, \beta^r_4$; edges labelled by service rates such as $\mu_{A1}$ and $\mu_{D4}$.]

Figure 2.1. Schematic diagram of a queueing system showing the arrival rates, service rates, and number of servers in each pool. The absence of an edge implies the corresponding service rate is zero; e.g., here $\mu_{B1} = 0$.
For the system with scaling parameter $r$, we introduce the following notation for the system state at time $t$:

    $\Psi^r_{ij}(t)$ is the number of servers of type $j$ serving customers of type $i$;
    $\Psi^r_j(t) \equiv \sum_{i \in \mathcal{I}} \Psi^r_{ij}(t)$ is the total number of busy servers of type $j$;
    $\Psi^r_i(t) \equiv \sum_{j \in \mathcal{J}} \Psi^r_{ij}(t)$ is the total number of servers serving type $i$ customers;
    $P^r_j(t) \equiv \Psi^r_j(t)/\beta_j$ is the instantaneous load of server pool $j$;
    $Q^r_i(t)$ is the number of customers of type $i$ waiting for service;
    $X^r_i(t) \equiv \Psi^r_i(t) + Q^r_i(t)$ is the total number of customers of type $i$ in the system;
    $-Z^r_j(t) \equiv \beta_j r - \Psi^r_j(t)$ is the number of idle servers of type $j$ (note that $Z^r_j(t) \le 0$).

We further describe the state of the routing choices that have been made by the algorithms up to time $t$:

    $A^r_i(t)$ is the total number of customers of type $i$ that have arrived into the system in the interval $[0, t]$;
    $D^r_{ij}(t)$ is the total number of customers of type $i$ that completed service in pool $j$ (and departed the system) in the interval $[0, t]$;
    $\Xi^r_{ij}(t)$ is the total number of customers of type $i$ that entered service in pool $j$ in the interval $[0, t]$.

1.2. Static Planning Problem. The load-balancing objective is to minimize the maximal proportion of occupied servers of any given type. Suppose that the $r$th system has a well-defined average rate $\lambda^r_{ij}$ at which requests of type $i$ are sent to servers of type $j$. Intuitively, a load-balancing algorithm should aim to have $\lambda^r_{ij} \approx \lambda_{ij} r$, where $\{\lambda_{ij}\}$ is an optimal solution to the following static planning problem (SPP) (see [Harrison, 2000]):

(1a)  $\min_{\lambda_{\mathcal{IJ}},\, \rho} \ \rho$, subject to
(1b)  $\lambda_{ij} \ge 0, \quad \forall i, j$
(1c)  $\sum_{j \in \mathcal{J}} \lambda_{ij} = \lambda_i, \quad \forall i$
(1d)  $\sum_{i \in \mathcal{I}} \lambda_{ij}/(\beta_j \mu_{ij}) \le \rho, \quad \forall j.$

Definition 2.1. The optimal value of $\rho$ in (1) is called the load on the system. If $\rho < 1$, the system is called underloaded; if $\rho = 1$, the system is called critically loaded. We will not consider the overloaded case $\rho > 1$, in which case some of the customers must abandon the system for it to be stable.
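As a rough illustration of constraints (1b)–(1d), the following Python sketch evaluates the load that a given routing $\{\lambda_{ij}\}$ places on each pool and returns the largest one, which is the value of $\rho$ that this routing attains in (1). It does not solve the linear program. The function names and the two-type, two-pool parameter values are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def pool_loads(routing, beta, mu):
    """Left-hand side of (1d) for every pool j: sum_i lambda_ij / (beta_j * mu_ij)."""
    loads = {j: Fraction(0) for j in beta}
    for (i, j), rate in routing.items():
        if rate < 0:
            raise ValueError(f"(1b) violated on activity {(i, j)}")
        if rate > 0:
            if mu.get((i, j), 0) == 0:
                raise ValueError(f"activity {(i, j)} has zero service rate")
            loads[j] += Fraction(rate) / (beta[j] * mu[(i, j)])
    return loads


def attained_load(routing, lam, beta, mu):
    """Check (1c) for every customer type, then return the largest pool load."""
    for i, rate in lam.items():
        routed = sum(r for (i2, _), r in routing.items() if i2 == i)
        if routed != rate:
            raise ValueError(f"(1c) violated for customer type {i}")
    return max(pool_loads(routing, beta, mu).values())


# Hypothetical parameters (not taken from the text): two customer types,
# two server pools, and mu_B1 = 0 (no edge between B and pool 1).
lam = {"A": 3, "B": 1}
beta = {1: 4, 2: 4}
mu = {("A", 1): 1, ("A", 2): 1, ("B", 2): 1}

balanced = {("A", 1): 2, ("A", 2): 1, ("B", 2): 1}
lopsided = {("A", 1): 3, ("B", 2): 1}

print(attained_load(balanced, lam, beta, mu))
print(attained_load(lopsided, lam, beta, mu))
```

In this made-up example every service rate equals 1, so the pool loads weighted by $\beta_j$ sum to the total arrival rate 4; since $\beta_1 + \beta_2 = 8$, no feasible routing can attain a load below $1/2$, and the balanced routing is therefore optimal here.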
Talreja and Whitt [2008] discuss fluid model asymptotics for overloaded many-server systems.

The dual problem to (1) is

(2a)  $\max_{\nu_{\mathcal{I}},\, \alpha_{\mathcal{J}}} \ \sum_{i} \lambda_i \nu_i$, subject to
(2b)  $\alpha_j \ge 0, \quad \forall j$
(2c)  $\sum_{j \in \mathcal{J}} \alpha_j = 1$
(2d)  $\alpha_j \ge \nu_i \beta_j \mu_{ij}, \quad \lambda_{ij}(\alpha_j - \nu_i \beta_j \mu_{ij}) = 0, \quad \forall i, j.$

Definition 2.2. The optimal value of $\nu_i$ in (2) is called the workload associated with a job of type $i$. The optimal value of $\alpha_j$ is called the rate at which server pool $j$ can process workload.

In a system indexed by $r$, the rate at which server pool $j$ processes workload scales as $r$, whereas the workload associated with an individual job of type $i$ does not scale. Strong duality guarantees that

(3)  $\sum_{j \in \mathcal{J}} \alpha_j = 1, \qquad \sum_{i \in \mathcal{I}} \lambda_i \nu_i = \rho \sum_{j \in \mathcal{J}} \alpha_j = \rho.$

Remark 2.3. The workloads $\nu_i$ and rates $\alpha_j$ are not intrinsic to the service system: they depend on the parameters $\beta_{\mathcal{J}}$, $\mu_{\mathcal{IJ}}$, but also on the arrival rates $\lambda_{\mathcal{I}}$. In other words, the same call centre faced with two different patterns of calls $\lambda_{\mathcal{I}}$, $\tilde{\lambda}_{\mathcal{I}}$ may well assign different values of “workload” to jobs of a given type, and different rates of processing said workload by the server pools.

However, the feasible set of the dual problem (2) defining $\nu_{\mathcal{I}}$ and $\alpha_{\mathcal{J}}$ depends only on the parameters $\beta_{\mathcal{J}}$, $\mu_{\mathcal{IJ}}$, which are intrinsic to the service system. Since (2) is a linear program whose optimum is always attained at one or more vertices of the feasible set, there is a finite set of possible “workloads”, and this set depends only on the parameters intrinsic to the system. Moreover, provided the arrival rates $\lambda_{\mathcal{I}}$ are such that the maximum is attained at only one vertex of the dual feasible set, there will be a unique possible set of workloads, and the same set will work for all sufficiently close values $\tilde{\lambda}_{\mathcal{I}}$.

2. Complete resource pooling

Throughout this chapter, we make the following complete resource pooling (CRP) assumption:

Assumption 2.4. The SPP (1) has a unique optimal solution $\{\lambda_{ij},\ i \in \mathcal{I},\ j \in \mathcal{J}\}$, $\rho$.
The solution is such that the set of pairs, or edges, $(ij)$ for which $\lambda_{ij} > 0$ forms a (connected) tree⁴ in the graph with vertex set $\mathcal{I} \cup \mathcal{J}$.

The CRP assumption can equivalently be formulated as “The linear program (1) has a unique, non-degenerate solution” (and hence so does its dual). The term “complete resource pooling” was introduced in the paper of Harrison and López [1999], where this condition is used to simplify diffusion-scale analysis; but variants of the condition are ubiquitous in discussions of systems with multiple server types that may share customers.

Definition 2.5. A basic activity is a pair $(ij)$ such that $\lambda_{ij} > 0$ in the optimal solution to (1). The basic activity tree $\mathcal{E}$ is the graph formed by the (undirected) edges $(ij)$ which are basic activities.

Assumption 2.4 consists of two parts. The assumption that the optimal solution is unique and the graph formed by basic activities contains no cycles holds “generically”: in systems where it is violated, the parameters $\lambda_{\mathcal{I}}$, $\beta_{\mathcal{J}}$, and $\mu_{\mathcal{IJ}}$ are linked by a set of polynomial equations [Stolyar and Tezcan, 2011, Theorem 2.2]. Since the arrival rates at a call centre typically oscillate throughout the day, it seems reasonable to assume that most of the time the parameters will not be so well-matched.

The assumption that the graph is connected may well fail for a large range of parameters. If it does fail, then in heavy load it is optimal to run the system as several noninteracting subsystems, where sharing occurs within each subsystem but not between them. In this case, all of the analysis below applies to each of the connected components separately.

When Assumption 2.4 holds, $\mathcal{E}$ is also the graph formed by edges $(ij)$ along which equalities hold in (2d). (Without the CRP assumption, there may be additional edges along which equality holds.) In Figure 2.2 we show the optimal tree associated with a particular set of parameter values.
The workloads are $\nu_A = \frac{1}{12}$, $\nu_B = \frac{1}{18}$, $\nu_C = \frac{1}{54}$, $\nu_D = \frac{1}{54}$, and the corresponding service rates are $\alpha_1 = \frac{1}{3}$, $\alpha_2 = \frac{1}{3}$, $\alpha_3 = \frac{1}{9}$, $\alpha_4 = \frac{2}{9}$.

⁴ A tree is a connected graph without cycles. Its leaves are nodes with only one outgoing edge.

[Figure 2.2: two panels showing the network with customer types A, B, C, D and its server pools, first labelled with the sample parameter values and then with the optimal solution; the numerical labels are not reproduced here.]

Figure 2.2. Sample parameters for the queueing system, and associated solution to the static planning problem ($\rho = 5/12$).

Definition 2.6. For a customer type $i$, let $S(i) \equiv \{j : (ij) \in \mathcal{E}\}$ denote the set of server types to whom customers of type $i$ are routed in the solution to the static planning problem; for a server type $j$, let $C(j) \equiv \{i : (ij) \in \mathcal{E}\}$ denote the set of customer classes that servers of type $j$ process in the solution to the SPP.

We can think about workload as follows. Jobs of type $i$ arrive at rate $\lambda_i$, bringing a certain amount $\nu_i$ of work with them. A server in pool $j$ that is working on a job of any type $i \in C(j)$ is processing system workload at rate $\frac{1}{\beta^r_j}\alpha_j$ (where $\beta^r_j = r\beta_j$ is the number of servers in pool $j$). If it is working on a job of some type $i' \notin C(j)$, then its rate of processing workload is strictly slower than $\frac{1}{\beta^r_j}\alpha_j$. Consequently, if we want to run the system efficiently, we should only assign servers to work on customers of types $i \in C(j)$. On the other hand, it seems intuitively plausible that any “reasonable” policy which assigns customers to servers without straying outside the basic activity tree will result in the same behaviour of the system workload; that is, effectively we will have “merged” the server pools into a single large pool that is processing system workload as efficiently as it can.

Throughout the discussion of call centre models, we make the following additional assumption:

Assumption 2.7. The basic activity tree $\mathcal{E}$ is known in advance. All assignments of customers to servers are made along edges of the basic activity tree.
This will ensure that routing choices are such that $\lambda^r_{ij} = 0$ for $(ij) \notin \mathcal{E}$; that is, all servers, when they are busy, will be processing workload as quickly as they can.

Remark 2.8. As in Remark 2.3 on the dual workload, the basic activity tree $\mathcal{E}$ depends on the arrival rates $\lambda_{\mathcal{I}}$, as well as on $\beta_{\mathcal{J}}$ and $\mu_{\mathcal{IJ}}$. If we are at a point where CRP holds (implying that there is a unique possible choice of $\mathcal{E}$), then a small perturbation in the parameters $\lambda_{\mathcal{I}}$, $\beta_{\mathcal{J}}$, or $\mu_{\mathcal{IJ}}$ will not change the tree (although it will change the optimal rates $\lambda_{\mathcal{E}}$). This suggests that, as long as the arrival rates $\lambda_{\mathcal{I}}$ and service rates $\mu_{\mathcal{IJ}}$ are not subject to wild fluctuations, we can separate time scales. First, over a longer time scale we estimate the parameters well enough to determine which of the possible basic activity trees is present in our case, and then over a very short time scale we route individual customers to servers based on the knowledge of the specific tree. Our discussion in what follows is concerned only with the routing of customers on the short time scales; (approximate) identification of the basic activity tree could be done either by measuring $\lambda_{\mathcal{I}}$ and $\mu_{\mathcal{IJ}}$ and solving the static planning problem, or (assuming the $\lambda_{\mathcal{I}}$ are more variable than the $\mu_{\mathcal{IJ}}$) by the shadow routing algorithm of Stolyar and Tezcan [2011]. (The shadow routing algorithm will give incorrect rates $\tilde{\lambda}_{\mathcal{E}}$, but will identify the correct set $\mathcal{E} \equiv \{(ij) : \tilde{\lambda}_{ij} > 0\}$.)

3. LQFS-LB and LAP algorithms

In this section we define the two algorithms we will be considering for matching customers to servers: Longest-Queue Freest-Server Load Balancing (LQFS-LB) and Leaf Activity Priority (LAP). LQFS-LB belongs to the family of algorithms considered by Gurvich and Whitt [2009] and others (Armony and Ward [2011], Atar et al. [2011]). It is a natural routing and scheduling rule that strives to equalize the load, or proportion of busy servers, on all the server pools.
LAP instead assigns static priorities to the basic activities in the basic activity tree, and strives to keep the high-priority activities “filled”.

Each of the algorithms consists of two parts: routing and scheduling. “Routing” determines where an arriving customer goes if it sees available servers of several different types. “Scheduling” determines which waiting customer a server picks if it sees customers of several different types waiting in queue.

Throughout this chapter, we alternate between analysing the two algorithms. Thus, §3-4 discuss both algorithms, §5 and §7 are devoted exclusively to LQFS-LB, and §6 and §8 are devoted exclusively to LAP.

3.1. Longest-queue, freest-server load balancing algorithm (LQFS-LB).

Scheduling: If a server of type $j$, upon completing a service, sees a customer of a class in $C(j)$ in queue, it will pick the customer from the longest queue, i.e. $i \in \arg\max_{i' \in C(j)} Q^r_{i'}(t)$. (Ties are broken in an arbitrary Markovian manner.)

Routing: If an arriving customer of type $i$ sees any unoccupied servers in server classes in $S(i)$, it will pick a server in the least loaded server pool, i.e. $j \in \arg\min_{j' \in S(i)} P^r_{j'}(t)$. (Ties are broken in an arbitrary Markovian manner.)

This algorithm is a special case of one considered by Gurvich and Whitt [Gurvich and Whitt, 2009, Remark 2.3], with constant probabilities $p_i = \frac{1}{I}$ (queues “should” be equal), $v_j = \frac{\beta_j}{\sum_{j'} \beta_{j'}}$ (the proportion of idle servers “should” be the same in all server pools). The results of [Gurvich and Whitt, 2009] which we use are briefly summarized in §7.1.

3.2. Leaf Activity Priority algorithm. The definition of the Leaf Activity Priority (LAP) policy proceeds in three steps. First, we assign priorities to customer classes by iterating the following procedure:
(1) Pick a leaf⁵ of the tree;
(2) If it is a customer class (rather than a server class), assign to it the highest priority that hasn’t yet been assigned;
(3) Remove the leaf from the tree.
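The leaf-stripping steps (1)–(3) above can be sketched in Python as follows. This is only an illustrative sketch: the function name, the data layout, and the four-edge example tree are hypothetical, and the rule used to choose among several available leaves (sorted order) is an arbitrary assumption, since the procedure allows any leaf to be picked.

```python
def customer_priorities(edges):
    """Assign priorities 1, 2, ... (1 = highest) to customer classes by
    repeatedly removing a leaf of the basic activity tree.

    edges: iterable of (customer_class, server_pool) basic activities.
    """
    adjacency = {}
    for customer, server in edges:
        c, s = ("customer", customer), ("server", server)
        adjacency.setdefault(c, set()).add(s)
        adjacency.setdefault(s, set()).add(c)

    n_customers = sum(1 for kind, _ in adjacency if kind == "customer")
    priority = {}
    while len(priority) < n_customers:
        # (1) Pick a leaf; a node left with no neighbours also counts.
        #     Taking the first in sorted order is an arbitrary tie-break.
        leaves = sorted(node for node, nbrs in adjacency.items() if len(nbrs) <= 1)
        if not leaves:
            raise ValueError("no leaf found: the activity graph is not a tree")
        leaf = leaves[0]
        kind, name = leaf
        # (2) A customer-class leaf receives the highest unassigned priority.
        if kind == "customer":
            priority[name] = len(priority) + 1
        # (3) Remove the leaf from the tree.
        for neighbour in adjacency.pop(leaf):
            adjacency[neighbour].discard(leaf)
    return priority


# Hypothetical tree A - 1 - B - 2 - C (customer classes A, B, C; pools 1, 2).
example_edges = [("A", 1), ("B", 1), ("B", 2), ("C", 2)]
print(customer_priorities(example_edges))
```

With this tie-break the leaves A and C are removed first, then pool 1, and B receives the lowest priority; a different choice of leaves could give a different, equally admissible assignment.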
Without loss of generality, we assume the customer classes are numbered in order of priority (with 1 being highest).

⁵ See the footnote to Assumption 2.4 for the definition of a leaf.

We now assign priorities to the edges of the basic activity tree by iterating the following procedure:
(1) Pick the highest-priority customer class;
(2) Pick an edge of the activity tree going out of the class to a leaf;
(3) Assign this edge the highest priority that hasn’t yet been assigned, and remove the edge;
(4) When the customer class becomes a leaf of the activity tree, assign the remaining edge out of it the highest priority that hasn’t yet been assigned, and remove the edge together with the customer class.

It is not hard to verify that this algorithm will successfully assign priorities to all edges. It suffices to check that at any time the highest remaining priority customer class will have at most one outgoing edge “leading” to customer classes of lower priority, which follows from the way we assigned priority to customer classes. We shall assume that the server classes are numbered so that the lowest-priority activity is $(IJ)$.

Remark 2.9. This procedure will not give a unique assignment of priorities: choosing the leaves in different orders will result in different assignments. We give two examples in Figure 2.3. The LAP analysis applies equally well to any priority assignment constructed according to the above algorithm.

Figure 2.3. Two examples of assigning priorities to customer classes and activities of the same tree.

Definition 2.10. We will write $(ij) < (i'j')$ to mean that activity $(ij)$ has higher priority than activity $(i'j')$. For example, if $j = j'$, we have $(ij) < (i'j)$ if and only if $i < i'$.

Now we define the LAP policy itself.

Scheduling: A server of type $j$ upon completing a service picks the customer from the queue of type $i \in C(j)$ such that $i \le i'$ for all $i' \in C(j)$ with $Q_{i'} > 0$. If no customer types in $C(j)$ have queues, the server remains idle.
Routing: An arriving customer of type $i$ picks an unoccupied server in the pool $j \in S(i)$ such that $(ij) \le (ij')$ for all $j' \in S(i)$ with $Z_{j'} < 0$. If no server pools in $S(i)$ have idle servers, the customer queues.

4. Fluid-scaled convergence for LQFS-LB and LAP

In this section, we consider the behaviour of large systems under the fluid scaling $\gamma^r(\cdot) = \frac{1}{r}\Gamma^r(\cdot)$ for all state variables $\Gamma$. This is a rather coarse description of the process; later, we will also investigate the behaviour of $r^{-1/2+\epsilon}(\Gamma^r(\cdot) - r\gamma^*)$, for $0 \le \epsilon < 1/2$ and some appropriately chosen constant $\gamma^*$.

We will show that, under the fluid rescaling, the Markov processes describing the system state converge, as $r \to \infty$, in distribution and uniformly on compact sets, to a set of Lipschitz functions satisfying certain fluid model equations. We refer to subsequential limits $\lim_{r_k \to \infty} \gamma^{r_k}(\cdot)$ as fluid limits, and to Lipschitz functions satisfying appropriate equations as fluid models; in these terms, we show that all fluid limits are fluid models. We will then analyse the behaviour of fluid models (which, by the convergence, will be approximately the same for all sufficiently large systems). Analysis of fluid models is a standard technique in the theory of queueing networks; see, for example, Bramson [2006].

In order to show convergence of processes, we would like to formalize the control we have over the arrival processes. We will assume that the arrival and service processes are rescalings of a family of independent unit-rate Poisson processes. Moreover, for any sequence $r \to \infty$ there is a subsequence (also indexed by $r$) along which the underlying Poisson processes are “well-behaved”; we will work only along such subsequences. Formally, we make the following assumptions:

Assumption 2.11. Let $\Pi^{(a)}_{\mathcal{I}}(\cdot)$ and $\Pi^{(s)}_{\mathcal{E}}(\cdot)$ be independent unit-rate Poisson processes. We will assume that, for each $r$,

    $A^r_i(t) = \Pi^{(a)}_i(\lambda_i r t), \quad \forall i \in \mathcal{I}$
    $S^r_{ij}(t) = \Pi^{(s)}_{ij}(\mu_{ij} r t), \quad \forall (ij) \in \mathcal{E}.$
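Poisson processes $\Pi^{(a)}_i(\cdot)$ and $\Pi^{(s)}_{ij}(\cdot)$ satisfy the following property (see [Mandelbaum and Stolyar, 2004, (34)]). Any subsequence of $\{r\}$ has a further subsequence such that, with probability 1, for any fixed $t > 0$ and $d > 0$, uniformly on any sequence of pairs $(s^r, t^r)$ with $0 \le s^r < t^r \le rt$ and $t^r - s^r \ge \sqrt{r}\,d$, we have

(4)  $\lim_{r \to \infty} \frac{\Pi^{(a)}_i(t^r) - \Pi^{(a)}_i(s^r)}{t^r - s^r} = 1$

and similarly for $\Pi^{(s)}_{ij}(\cdot)$. This lets us work with pathwise limits, rather than limits in distribution.

Assumption 2.12. The sequence $\{r\}$ is such that (4) holds for $\Pi^{(a)}_i(\cdot)$ and $\Pi^{(s)}_{ij}(\cdot)$, for all $i \in \mathcal{I}$ and $(ij) \in \mathcal{E}$.

4.1. Convergence for LQFS-LB. Consider the scaling

    $\left(\psi^r_{\mathcal{E}}(t), q^r_{\mathcal{I}}(t), x^r_{\mathcal{I}}(t), a^r_{\mathcal{I}}(t), \rho^r_{\mathcal{J}}(t)\right) \equiv \frac{1}{r}\left(\Psi^r_{\mathcal{E}}(t), Q^r_{\mathcal{I}}(t), X^r_{\mathcal{I}}(t), A^r_{\mathcal{I}}(t), P^r_{\mathcal{J}}(t)\right).$

Theorem 2.13. Suppose $(\psi^r_{\mathcal{E}}(0), q^r_{\mathcal{I}}(0)) \to (\psi_{\mathcal{E}}(0), q_{\mathcal{I}}(0))$. Then, w.p.1, for any sequence $r \to \infty$ there exists a subsequence along which

    $\left(\psi^r_{\mathcal{E}}(\cdot), q^r_{\mathcal{I}}(\cdot), x^r_{\mathcal{I}}(\cdot), a^r_{\mathcal{I}}(\cdot), \rho^r_{\mathcal{J}}(\cdot)\right) \to \left(\psi_{\mathcal{E}}(\cdot), q_{\mathcal{I}}(\cdot), x_{\mathcal{I}}(\cdot), a_{\mathcal{I}}(\cdot), \rho_{\mathcal{J}}(\cdot)\right)$

uniformly on compact sets, for some set of Lipschitz functions $(\psi_{\mathcal{E}}, q_{\mathcal{I}}, x_{\mathcal{I}}, a_{\mathcal{I}}, \rho_{\mathcal{J}})$ satisfying the fluid model equations (5). (The conditions involving derivatives are to be satisfied whenever the derivatives exist, which is Lebesgue-almost everywhere.)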
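The LQFS-LB fluid model equations are

(5a)  $a_i(t) = \lambda_i t, \quad \forall i \in \mathcal{I}$
(5b)  $x_i(t) = q_i(t) + \sum_j \psi_{ij}(t), \quad \forall i \in \mathcal{I}$
(5c)  $x_i(t) = x_i(0) + a_i(t) - \sum_j \int_0^t \mu_{ij}\psi_{ij}(s)\,ds, \quad \forall i \in \mathcal{I}$
(5d)  $\rho_j(t) = \frac{1}{\beta_j}\sum_i \psi_{ij}(t), \quad \forall j \in \mathcal{J}$
(5e)  $\rho_j(t) = 1$ if $q_i(t) > 0$ for any $i \in C(j)$, $\quad \forall j \in \mathcal{J}$

For any set of customer types $\mathcal{I}^* \subseteq \mathcal{I}$, and any set of server types $\mathcal{J}^* \subseteq \mathcal{J}$ such that (a) $\rho_j(t) < 1$ for all $j \in \mathcal{J}^*$, and (b) $\rho_j(t) < \rho_{j'}(t)$ whenever $j \in \mathcal{J}^*$, $j' \notin \mathcal{J}^*$, and $C(j) \cap C(j') \cap \mathcal{I}^* \neq \emptyset$,

(5fa)  $\sum_{j \in \mathcal{J}^*} \sum_{i \in C(j) \cap \mathcal{I}^*} \dot{\psi}_{ij}(t) = \sum_{i \in \cup_{j \in \mathcal{J}^*} C(j) \cap \mathcal{I}^*} \lambda_i - \sum_{j \in \mathcal{J}^*} \sum_{i \in C(j) \cap \mathcal{I}^*} \mu_{ij}\psi_{ij}(t)$

For any set of server types $\mathcal{J}^* \subseteq \mathcal{J}$, and any set of customer types $\mathcal{I}^* \subseteq \mathcal{I}$ such that (a) $q_i(t) > 0$ for all $i \in \mathcal{I}^*$, and (b) $q_i(t) > q_{i'}(t)$ whenever $i \in \mathcal{I}^*$, $i' \notin \mathcal{I}^*$ and $S(i) \cap S(i') \cap \mathcal{J}^* \neq \emptyset$,

(5fb)  $\sum_{i \in \mathcal{I}^*} \sum_{j \in S(i) \cap \mathcal{J}^*} \dot{\psi}_{ij}(t) = \sum_{j \in \cup_{i \in \mathcal{I}^*} S(i) \cap \mathcal{J}^*} \sum_{i' \in C(j)} \mu_{i'j}\psi_{i'j}(t) - \sum_{i \in \mathcal{I}^*} \sum_{j \in S(i) \cap \mathcal{J}^*} \mu_{ij}\psi_{ij}(t)$

We comment on (5f) as the least intuitive. It describes the idea that customers are only entering service at one of the least busy servers that they can find, while servers are only taking requests from one of the longest queues that they can serve.

The meaning of (5fa) is as follows. Consider a set of customer types $\mathcal{I}^*$. If a set of server types $\mathcal{J}^*$ consists of the “least busy server types for $\mathcal{I}^*$” (we will make this more precise), then arriving customers of type $i^* \in \mathcal{I}^*$ will all be routed to servers in $\mathcal{J}^*$. In this case, the total number of customers of types $\mathcal{I}^*$ in service by servers of types $\mathcal{J}^*$ will be changing at the total arrival rate of customers in $\mathcal{I}^*$, less the rate of servicing customers of types $\mathcal{I}^*$ by servers in $\mathcal{J}^*$. The requirement that $\mathcal{J}^*$ needs to satisfy for this to be the case is that there be no server types outside $\mathcal{J}^*$ with smaller instantaneous load which can serve customers of some type in $\mathcal{I}^*$.

We now consider some examples of what valid sets $\mathcal{J}^*$ can look like for a given $\mathcal{I}^*$.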
As warm-up, a one-element set $\mathcal{J}^* = \{j^*\}$ is a valid choice for a one-element set $\mathcal{I}^* = \{i^*\}$ if and only if the server pool $j^* \in S(i^*)$ has the (strictly) smallest instantaneous load among all of the server types that can serve $i^*$.

Consider now the situation in the right-hand network of Figure 2.4. If $\mathcal{I}^* = \{A, B, C, D\}$, then the only valid choices of $\mathcal{J}^*$ are $\{a\}$ and $\{a, b, c, d\}$. Note that $\mathcal{J}_0 \equiv \{a, b\}$ does not qualify, because $b$ shares a customer type $B \in \mathcal{I}^*$ with an equally-loaded server pool $c \notin \mathcal{J}_0$. If we instead look at $\mathcal{I}^* = \{C, D\}$, then $\mathcal{J}^* = \{d\}$ becomes a valid choice: no $j \in \{d\}$ and $j' \in \{a, b, c\}$ “share” a customer type in $\mathcal{I}^*$.

[Figure 2.4: the network with customer types A, B, C, D and server pools a, b, c, d, shown twice: on the left labelled with queue lengths, on the right labelled with instantaneous loads.]

Figure 2.4. Illustration for (5f).

In (5fb) we consider a similar situation, but with queueing: in this case, servers are picking customers, and not the other way around. Consider a set of server types $\mathcal{J}^*$. If a set of customer types $\mathcal{I}^*$ consists of the “longest queues for $\mathcal{J}^*$” (we will make this more precise), then servers in pools $j^* \in \mathcal{J}^*$, whenever they finish serving some customer, will immediately replace her with someone from a queue $i^* \in \mathcal{I}^*$. In this case, the total number of customers of types $\mathcal{I}^*$ in service by servers of types $\mathcal{J}^*$ will be increasing at the total rate of servicing all customers by servers in $\mathcal{J}^*$, less the rate of servicing customers of types $\mathcal{I}^*$ by servers in $\mathcal{J}^*$. The requirement that $\mathcal{I}^*$ needs to satisfy for this to be the case is that there be no customer types outside $\mathcal{I}^*$ with longer queues that servers in $\mathcal{J}^*$ can serve.

We now consider some examples of what valid sets $\mathcal{I}^*$ can look like for a given $\mathcal{J}^*$. As warm-up, a one-element set $\mathcal{I}^* = \{i^*\}$ is a valid choice for a one-element set $\mathcal{J}^* = \{j^*\}$ if and only if the customer type $i^* \in C(j^*)$ has the (strictly) longest queue among all of the customer types that can be served by $j^*$.

Consider now the situation in the left-hand network of Figure 2.4. If $\mathcal{J}^* = \{a, b, c, d\}$, then the valid choices of $\mathcal{I}^*$ are $\{A\}$, $\{C, D\}$, and $\{A, B, C, D\}$.
Note that $\mathcal{I}_0 \equiv \{C\}$ alone does not qualify, because $C$ shares a server type $d \in \mathcal{J}^*$ with a queue of the same length, $D \notin \mathcal{I}_0$. On the other hand, the fact that $q_A > q_C$ does not stop $\{C, D\}$ from being a valid choice for $\mathcal{I}^*$, because $S(i) \cap S(i') = \emptyset$ for $i \in \{A\}$, $i' \in \{C, D\}$. If we instead look at $\mathcal{J}^* = \{c\}$, then $\mathcal{I}^* = \{B, C\}$ becomes a valid choice: no $i \in \{B, C\}$ and $i' \in \{A, D\}$ “share” a server type in $\mathcal{J}^*$.

Proof of Theorem 2.13. Given property (4), it is standard to conclude that, with probability 1, any sequence of fluid-scaled processes has a subsequence which converges uniformly on compact sets to some absolutely continuous⁶ limit; see for example [Mandelbaum et al., 1998, Theorem 2.1]. That the limit is then Lipschitz follows from the fact that the arrival rate and the maximal service rate of customers are upper-bounded. We will skip the technical difficulties of demonstrating the existence of Lipschitz limits, and focus instead on the question of why any fluid limit must satisfy the fluid model equations (5).

(5a) is a direct consequence of (4).

(5b) holds in all prelimit systems, hence in the limit as well.

(5c) also follows from (4). Indeed, in the prelimit system we have

    $X^r_i(t) = X^r_i(0) + A^r_i(t) - \sum_{j \in S(i)} \Pi^{(s)}_{ij}\left(\int_0^t \mu_{ij}\Psi^r_{ij}(s)\,ds\right).$

Dividing by $r$ and using the fact that the limit as $r \to \infty$ exists and is Lipschitz, we can apply (4) to conclude

    $\frac{1}{r}\Pi^{(s)}_{ij}\left(\int_0^t \mu_{ij}\Psi^r_{ij}(s)\,ds\right) \to \int_0^t \mu_{ij}\psi_{ij}(s)\,ds.$

(5d) holds in all prelimit systems, hence in the limit as well.

(5e) follows from the fact that, in the $r$th system, $\rho^r_j(t) = 1$ whenever any customer type $i \in C(j)$ has a positive queue. (Note that if $q_i > 0$ then $q^r_i > 0$ for all sufficiently large $r$.)

We now turn to (5f). Recall that the limit is Lipschitz, hence absolutely continuous, so the equation makes sense almost everywhere. Let $t$ be one of the regular times at which the derivatives $\dot{\psi}_{ij}(t)$ of all components of the limiting process exist. Consider (5fb).
Pick a set of server types $\mathcal{J}^* \subseteq \mathcal{J}$, and a set of customer types $\mathcal{I}^* \subseteq \mathcal{I}$ satisfying the conditions. Since $q_i(t) > q_{i'}(t)$ for all $i \in \mathcal{I}^*$, $i' \notin \mathcal{I}^*$ s.t. $S(i) \cap S(i') \cap \mathcal{J}^* \neq \emptyset$, there exists a $\delta > 0$ sufficiently small that $q_i(s) > q_{i'}(s) + \delta$ for all $s \in [t, t+\delta)$; and then for all sufficiently large $r$ we have $q^r_i(s) > q^r_{i'}(s) + \delta/2$ for all $s \in [t, t+\delta)$. Consequently, during the entire time interval $[t, t+\delta]$, all servers in $\mathcal{J}^*$ that can take customers in $\mathcal{I}^*$ will do so. Now, during $[t, t+\delta]$, each server type $j \in \mathcal{J}^*$ has approximately $\sum_{i' \in C(j)} \mu_{i'j}\Psi^r_{i'j}\delta$ service completions (this is a consequence of (4)), all of which are replaced by customers of types $i \in \mathcal{I}^*$. Therefore, the total fluid-scaled number of customers of types in $\mathcal{I}^*$ being served by servers in $\mathcal{J}^*$ will be changing by approximately

    $\delta\left(\sum_{j \in \cup_{i \in \mathcal{I}^*} S(i) \cap \mathcal{J}^*} \sum_{i' \in C(j)} \mu_{i'j}\psi^r_{i'j}(t) - \sum_{i \in \mathcal{I}^*} \sum_{j \in S(i) \cap \mathcal{J}^*} \mu_{ij}\psi^r_{ij}(t)\right).$

Since we are assuming all the derivatives $\dot{\psi}_{ij}(t)$ exist, they must satisfy (5fb).

The argument for (5fa) is nearly identical. Picking a regular time $t$, let $\delta$ be small enough that strict inequalities on instantaneous loads in the limiting system hold at all times in $[t, t+\delta]$ in the prelimit fluid-scaled systems for all sufficiently large $r$. Then all customer arrivals to types in $\mathcal{I}^*$ must be routed to servers in $\mathcal{J}^*$, so the fluid-scaled number of customers of types in $\mathcal{I}^*$ being served by servers in $\mathcal{J}^*$ will be changing by approximately

    $\delta\left(\sum_{i \in \cup_{j \in \mathcal{J}^*} C(j) \cap \mathcal{I}^*} \lambda_i - \sum_{j \in \mathcal{J}^*} \sum_{i \in C(j) \cap \mathcal{I}^*} \mu_{ij}\psi^r_{ij}(t)\right).$  □

⁶ See Appendix A for a definition of absolute continuity.

Definition 2.14. We call any Lipschitz solution $(\psi_{\mathcal{E}}(\cdot), q_{\mathcal{I}}(\cdot), x_{\mathcal{I}}(\cdot), a_{\mathcal{I}}(\cdot), \rho_{\mathcal{J}}(\cdot))$ of (5) a fluid model of the LQFS-LB system with initial state $(\psi_{\mathcal{E}}(0), q_{\mathcal{I}}(0))$; a set $(\psi_{\mathcal{E}}(\cdot), q_{\mathcal{I}}(\cdot))$, which is a projection of a fluid model, we often call a fluid model as well.

Remark 2.15. In general, the set of fluid models will be larger than the set of fluid-scaled limits of queueing processes.
Further, in general, given a set of initial conditions, there need not be a unique fluid model starting from that set of conditions; indeed, there may not even be a unique fluid limit. For the rest of the exposition, it will not be important whether solutions to the fluid model equations are uniquely defined by their starting state; but it is an interesting question in its own right. We show below that, indeed, there is a unique fluid limit from any starting state, and in the process derive the additional equations that need to be added to (5) to enforce this uniqueness.

Consider the quantity $\xi_{ij}(t)$, the amount of customers of type $i$ that have been routed to servers of type $j$ up to time $t$. (We define $\xi_{ij}(0) = 0$ for concreteness.) It is not hard to see that $\xi_{\mathcal{E}}(\cdot)$ will be Lipschitz, and that knowing the initial state of the system $(\psi_{\mathcal{E}}(0), q_{\mathcal{I}}(0))$ and $\xi_{\mathcal{E}}(\cdot)$ is equivalent to knowing the entire trajectory of the fluid model. Let $\lambda_{\mathcal{E}}(t) = \frac{d}{dt}\xi_{\mathcal{E}}(t)$ whenever this is defined; $\lambda_{ij}(t)$ is the instantaneous rate of routing customers of type $i$ to servers of type $j$. Since $\xi_{\mathcal{E}}(t)$ is Lipschitz, $\lambda_{\mathcal{E}}(\cdot)$ determines $\xi_{\mathcal{E}}(\cdot)$.

We will now show that, given the state of the fluid model at time $t$, $\lambda_{\mathcal{E}}(t+)$ is uniquely determined. (Note that we already know that one feasible $\lambda_{\mathcal{E}}(t+)$ exists, because the fluid limit started from that initial state will determine some set of values.) We will usually drop the time index $t+$ in what follows.

Note firstly that $\lambda_{ij} = 0$ if (a) there exists some $i' \in C(j)$ with $q_{i'} > q_i$, or (b) there exists some $j' \in S(i)$ with $\rho_{j'} < \rho_j$. Consequently, we have partitioned the basic activity tree $\mathcal{E}$ into subtrees, such that within each subtree all customer queue sizes and all server pool loads are equal. We will now restrict attention to one such subtree, $T$; WLOG it will be a subtree with all queue sizes equal to $q > 0$. Let $C$ and $S$ denote the subsets of customer and server types belonging to $T$.
If we were given the constraint that all queues in $T$ stay equal at $t+$, then we could determine the routing rates $\lambda_{ij}$. Indeed, if all queue sizes remain equal, then necessarily

    $\dot{q}(t+) = |C|^{-1}\left(\sum_{i \in C} \lambda_i - \sum_{j \in S} \sum_{i' \in C(j)} \psi_{i'j}(t)\mu_{i'j}\right),$

and $\dot{\rho}_j(t) = 0$ for all $j \in S$ (since $q > 0$). This allows us to solve for $\lambda_{ij}(t+)$ by sequentially eliminating leaves of the tree. If customer type $i$ is a leaf with unique server pool $j$, then $\lambda_{ij}(t+) = \lambda_i - \dot{q}(t+)$; and if server type $j$ is a leaf with unique customer type $i$, then $\lambda_{ij}(t+) = \mu_{ij}\psi_{ij}(t)$.

Unfortunately, this may give $\lambda_{ij}(t+) < 0$ for some activities, which is not physical (the process $\xi_{ij}(t)$ must be nondecreasing). This indicates that, in fact, the queue sizes of customer types in $C$ will not remain equal; rather, our tree $T$ will split into subtrees $T_1, T_2, \ldots, T_n$ with the following properties:
(1) Within each $T_k$, queue sizes will remain equal, and will change at rate $\dot{q}_k(t+)$ (positive or negative). WLOG, the indexing is such that $\dot{q}_1 > \dot{q}_2 > \ldots > \dot{q}_n$.⁷
(2) The associated rates $\lambda_{ij}(t+) \ge 0$ within each subtree.
(3) The rates $\lambda_{ij}(t+)$ are 0 if $i$ and $j$ belong to different subtrees. This means that if, for a basic activity $(ij)$, we get $i \in T_k$ and $j \in T_{k'}$, then $\dot{q}_k < \dot{q}_{k'}$.

Observe that, once we know the subtrees, the associated rates $\lambda_{ij}(t+)$ are completely determined.

We now claim that the partition $P$ of $T$ into subtrees satisfying (1)–(3) is unique. (Again, we know one exists because the fluid limit with this initial state must give one.) Indeed, suppose $\tilde{P} \equiv \{\tilde{T}_1, \ldots, \tilde{T}_{\tilde{n}}\}$ is another partition, and WLOG let $\dot{q}_1 \ge \dot{\tilde{q}}_1$. Consider now the queues of the customer types $C_1$ in $T_1$. The total amount of service that they are getting in the partition $\tilde{P}$ cannot be greater than in $P$, since in $P$ they are getting all of the servers available to them.
Consequently, at least one of these queues will have a time derivative in $\tilde{P}$ that is at least $\dot{q}_1$, and equality is only possible if in $\tilde{P}$ the set of queues $C_1$ also gets, and shares equally, all the service that it can, i.e., if $C_1 = \tilde{C}_1$. Continuing inductively gives the result.

The argument is similar if we restrict our attention to a subtree where all queues are 0.

Thus, we have shown that, for any state $(\psi_{\mathcal{E}}(t), q_{\mathcal{I}}(t))$ there is a unique set of time derivatives $\lambda_{\mathcal{E}}(t+) \ge 0$ that are consistent with the fluid model equations. Now, (5) has no equations equivalent to the nonnegativity of $\lambda_{\mathcal{E}}(t+)$. However, adding this constraint (by adding the process $\xi_{\mathcal{E}}(t)$ to the state descriptor and requiring it to be nondecreasing) would, as we saw above, force uniqueness.

⁷ If we require strict inequalities between $\dot{q}_k$ for different $k$, then strictly speaking $T_k$ might end up disconnected. This makes no difference to the analysis.

4.2. Convergence for LAP. We now perform similar analysis for the LAP policy. Our state descriptor will need to be slightly larger than for the LQFS-LB model, but otherwise the analysis is very similar. Consider the scaling

    $\left(\psi^r_{\mathcal{E}}(t), q^r_{\mathcal{I}}(t), x^r_{\mathcal{I}}(t), a^r_{\mathcal{I}}(t), d^r_{\mathcal{E}}(t), \xi^r_{\mathcal{E}}(t)\right) \equiv \frac{1}{r}\left(\Psi^r_{\mathcal{E}}(t), Q^r_{\mathcal{I}}(t), X^r_{\mathcal{I}}(t), A^r_{\mathcal{I}}(t), D^r_{\mathcal{E}}(t), \Xi^r_{\mathcal{E}}(t)\right).$

Proposition 2.16. Suppose $(\psi^r_{\mathcal{E}}(0), q^r_{\mathcal{I}}(0)) \to (\psi_{\mathcal{E}}(0), q_{\mathcal{I}}(0))$. Then, w.p.1, for any sequence $r \to \infty$ there exists a subsequence along which

    $\left(\psi^r_{\mathcal{E}}(\cdot), q^r_{\mathcal{I}}(\cdot), x^r_{\mathcal{I}}(\cdot), a^r_{\mathcal{I}}(\cdot), d^r_{\mathcal{E}}(\cdot), \xi^r_{\mathcal{E}}(\cdot)\right) \to \left(\psi_{\mathcal{E}}(\cdot), q_{\mathcal{I}}(\cdot), x_{\mathcal{I}}(\cdot), a_{\mathcal{I}}(\cdot), d_{\mathcal{E}}(\cdot), \xi_{\mathcal{E}}(\cdot)\right)$

uniformly on compact sets, for some set of Lipschitz functions $(\psi_{\mathcal{E}}, q_{\mathcal{I}}, x_{\mathcal{I}}, a_{\mathcal{I}}, d_{\mathcal{E}}, \xi_{\mathcal{E}})$ satisfying the fluid model equations (7). (The conditions involving derivatives are to be satisfied whenever the derivatives exist, which is Lebesgue-almost everywhere.) The LAP fluid model equations are

(7a)  $q_i(t) \ge 0, \ \forall i \in \mathcal{I}; \quad \psi_{ij}(t) \ge 0, \ \forall (ij) \in \mathcal{E}; \quad \sum_i \psi_{ij}(t) \le \beta_j, \ \forall j \in \mathcal{J}$
(7b)  $a_i(t) = \lambda_i t, \ \forall i \in \mathcal{I}; \quad d_{ij}(t) = \int_0^t \mu_{ij}\psi_{ij}(s)\,ds, \ \forall (ij) \in \mathcal{E}$
\[ q_i(t) = q_i(0) + a_i(t) - \sum_j \xi_{ij}(t), \quad \forall i \in \mathcal{I} \tag{7c} \]
\[ \psi_{ij}(t) = \psi_{ij}(0) + \xi_{ij}(t) - d_{ij}(t), \quad \forall (ij) \in \mathcal{E} \tag{7d} \]
\[ x_i(t) = q_i(t) + \sum_j \psi_{ij}(t) = x_i(0) + \lambda_i t - \sum_j \int_0^t \mu_{ij}\psi_{ij}(s)\,ds, \quad \forall i \in \mathcal{I} \tag{7e} \]
\[ \sum_i \psi_{ij}(t) = \beta_j, \quad \text{whenever } q_{i'}(t) > 0 \text{ for at least one } i' \in \mathcal{C}(j) \tag{7f} \]
\[ \frac{d}{dt}\xi_{ij}(t) = 0, \quad \text{whenever } q_{i'}(t) > 0 \text{ for at least one } i' \in \mathcal{C}(j),\ i' < i \tag{7g} \]
\[ \frac{d}{dt}\xi_{ij}(t) = 0, \quad \text{whenever } \sum_k \psi_{kj'}(t) < \beta_{j'} \text{ for at least one } j' \in \mathcal{S}(i) \text{ with } (ij') < (ij) \tag{7h} \]
\[ \frac{d}{dt}\xi_{ij}(t) = \min\Bigl\{ \lambda_i - \sum_{(ij') < (ij)} \frac{d}{dt}\xi_{ij'}(t),\ \ \sum_{i'} \mu_{i'j}\psi_{i'j}(t) - \sum_{(i'j) < (ij)} \frac{d}{dt}\xi_{i'j}(t) \Bigr\} \tag{7i} \]
whenever $q_{i'}(t) = 0$ for all $i' \in \mathcal{C}(j)$, $i' < i$, and $\sum_k \psi_{kj'}(t) = \beta_{j'}$ for all $j' \in \mathcal{S}(i)$ with $(ij') < (ij)$.

Proof. Given property (4), it is standard to conclude that, with probability 1, any sequence of fluid-scaled processes has a subsequence which converges uniformly on compact sets to some absolutely continuous limit; see for example [Mandelbaum et al., 1998, Theorem 2.1]. That the limit is then Lipschitz follows from the fact that the arrival rate and the maximal service rate of customers are bounded above. We will skip the technical difficulties of demonstrating the existence of Lipschitz limits, and focus instead on the question of why any fluid limit must satisfy the fluid model equations (7).

(7a) holds in all prelimit systems, hence in the limit. (7b) is a direct consequence of (4). (7c) and (7d) hold in all prelimit systems, hence in the limit. (7e) also follows from (4). Indeed, in the pre-limit system we have
\[ X^r_i(t) = X^r_i(0) + A^r_i(t) - \sum_{j \in \mathcal{S}(i)} \Pi^{(s)}_{ij}\Bigl( \int_0^t \mu_{ij}\Psi^r_{ij}(s)\,ds \Bigr). \]
Dividing by $r$ and using the fact that the limit as $r \to \infty$ exists and is Lipschitz, we can apply (4) to conclude
\[ \frac{1}{r}\,\Pi^{(s)}_{ij}\Bigl( \int_0^t \mu_{ij}\Psi^r_{ij}(s)\,ds \Bigr) \to \int_0^t \mu_{ij}\psi_{ij}(s)\,ds. \]
(7f)–(7h) hold in all prelimit systems, hence in the limit. Finally, (7i) follows from (7f)–(7h). □

Definition 2.17.
We call any Lipschitz solution $(\psi_{\mathcal{E}}(\cdot), q_{\mathcal{I}}(\cdot), x_{\mathcal{I}}(\cdot), a_{\mathcal{I}}(\cdot), d_{\mathcal{E}}(\cdot), \xi_{\mathcal{E}}(\cdot))$ of (7) a fluid model of the LAP system with initial state $(\psi_{\mathcal{E}}(0), q_{\mathcal{I}}(0))$; a set $(\psi_{\mathcal{E}}(\cdot), q_{\mathcal{I}}(\cdot))$ which is a projection of a fluid model we often call a fluid model as well.

Remark 2.18. In this case, we also have uniqueness of fluid model solutions given the starting state, and rather more simply than for LQFS-LB: since LAP is a simple priority discipline, we can directly determine the quantities $\lambda_{ij}(t+) \equiv \dot\xi_{ij}(t+)$ in order of decreasing priority.

5. LQFS-LB fluid models near equilibrium

5.1. Linear ODE. In this section, we examine the behaviour of the fluid models for LQFS-LB. Define the equilibrium point of the LQFS-LB fluid model as follows.

Definition 2.19. In underload ($\rho < 1$), the equilibrium point is the state
\[ \psi^*_{ij} \equiv \frac{\lambda_{ij}}{\mu_{ij}},\ \forall (ij) \in \mathcal{E}, \qquad q^*_i \equiv 0,\ \forall i \in \mathcal{I}, \]
where $\lambda_{\mathcal{E}}$ is the optimal solution to the SPP (1) (unique by Assumption 2.4). In critical load ($\rho = 1$), an equilibrium point is any state with
\[ \psi^*_{ij} \equiv \frac{\lambda_{ij}}{\mu_{ij}},\ \forall (ij) \in \mathcal{E}, \qquad q^*_i \equiv q^*,\ \forall i \in \mathcal{I}, \]
for some constant $q^* \ge 0$. Thus, in critical load the equilibrium point is non-unique, although $\psi^*_{\mathcal{E}}$ is uniquely defined. It is easy to see that the functions $(\psi_{\mathcal{E}}(t), q_{\mathcal{I}}(t)) = (\psi^*_{\mathcal{E}}, q^*_{\mathcal{I}})$ for all $t$ are indeed a fluid model.

Definition 2.20. The values associated with the equilibrium point are henceforth referred to as nominal. For example, $\psi^*_{ij}$ is the nominal occupancy (of pool $j$ by type $i$), $\lambda_i$ is the nominal arrival rate, $\lambda_{ij}$ is the nominal routing rate (along activity $(ij)$), $\psi^*_{ij}\mu_{ij} = \lambda_{ij}$ is the nominal service rate (of type $i$ in pool $j$), $\sum_j \psi^*_{ij}\mu_{ij} = \lambda_i$ is the nominal total service rate (of type $i$), $\rho$ is the nominal total occupancy (of each pool $j$), etc.

Desirable system behaviour would be to have $\psi_{\mathcal{E}}(t) \to \psi^*_{\mathcal{E}}$ as $t \to \infty$; we will now investigate whether this in fact occurs. We will consider two cases: $\rho < 1$ and ($\rho = 1$, $q^* > 0$).$^{8}$
In the first case, in a sufficiently small neighbourhood of the equilibrium, the system state can be described by specifying the $I + J - 1$ variables $\psi_{ij}(t)$. In the second case, in a sufficiently small neighbourhood of the equilibrium, the system state can also be described by specifying $I + J - 1$ variables, namely, $q_i(t)$ and, for each $j \in \mathcal{J}$, all but one of the $\psi_{ij}(t)$ (for a total of $J - 1$ variables $\psi_{ij}(t)$). Indeed, the condition $q_i(t) > 0$ for all $i$ will imply $\sum_i \psi_{ij}(t) = \beta_j$ for all $j \in \mathcal{J}$. Since the state descriptor has this form on a neighbourhood of the equilibrium point, and the fluid models are Lipschitz, there will be an interval of time during which the state descriptor of the fluid model trajectory is of this form.

We now prove two state space collapse results (in underload and in critical load). These results show that, after a finite time, the fluid models are confined to a submanifold of dimension $I$, rather than $I + J - 1$. In the process, we confirm that LQFS-LB does work as a load-balancing mechanism, in that the instantaneous load $\rho_j(t)$ on all of the server types will be equal after a finite time.

Footnote 8. We do not consider here the case $q^* = 0$ in critical load, because for a fluid model it is “atypical”: it requires the system workload to be “just right”. We will return to the case $\rho = 1$, $q^* = 0$ in §7.4, when we discuss the Halfin-Whitt asymptotic regime.

Theorem 2.21. Let $\rho < 1$. There exists a sufficiently small $\epsilon > 0$, depending only on the system parameters, such that for all sufficiently small $\delta > 0$ the following holds. There exist times $T_1 = T_1(\delta)$ and $T_2 = T_2(\delta)$, $0 < T_1 < T_2$, such that if the initial system state $\psi_{\mathcal{E}}(0)$ satisfies $\|\psi_{\mathcal{E}}(0) - \psi^*_{\mathcal{E}}\| < \delta$, then for all $t \in [T_1, T_2]$ the system state satisfies
\[ \|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon, \qquad \rho_j(t) = \rho_{j'}(t) \text{ for all } j, j' \in \mathcal{J}. \]
Moreover, $T_1(\delta) \downarrow 0$ and $T_2(\delta) \uparrow \infty$ as $\delta \downarrow 0$. The evolution of the system during $[T_1, T_2]$ is described by a linear ODE, specified below by (12).

Theorem 2.22.
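Let $\rho = 1$, and consider an equilibrium point with $q^* > 0$. There exists a sufficiently small $\epsilon > 0$, depending only on the system parameters, such that for all sufficiently small $\delta > 0$ the following holds. There exist $T_1 = T_1(\delta)$ and $T_2 = T_2(\delta)$, $0 < T_1 < T_2$, such that if the initial system state satisfies
\[ \|\psi_{\mathcal{E}}(0) - \psi^*_{\mathcal{E}}\| < \delta, \qquad \|q_{\mathcal{I}}(0) - (q^*, \ldots, q^*)^{\top}\| < \delta, \]
then for all $t \in [T_1, T_2]$ the system state satisfies
\[ \|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon, \qquad \|q_{\mathcal{I}}(t) - (q^*, \ldots, q^*)^{\top}\| < \epsilon, \qquad q_i(t) = q_{i'}(t) \text{ for all } i, i' \in \mathcal{I}. \]
Moreover, $T_1(\delta) \downarrow 0$ and $T_2(\delta) \uparrow \infty$ as $\delta \downarrow 0$. The evolution of the system during $[T_1, T_2]$ is described by a linear ODE specified below by (14).

Proof of Theorem 2.21. Let us choose a suitably small $\epsilon > 0$. In particular, we require $\epsilon$ to be sufficiently small that if $\|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon$, then $\sum_i \psi_{ij}(t) < \beta_j$ for all $j$, so there is no queueing. Because the fluid model trajectories are continuous, we can always choose some $T_2 > 0$ such that, for all sufficiently small $\delta > 0$, if $\|\psi_{\mathcal{E}}(0) - \psi^*_{\mathcal{E}}\| < \delta$, then $\|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon$ for all $t \le T_2$.

We now show that $\rho_j(t) = \rho_{j'}(t)$ for all $j, j' \in \mathcal{J}$ during the time interval $[T_1, T_2]$, for some $T_1$ depending on $\delta$. Consider $\rho_*(t) \equiv \min_j \rho_j(t)$, $\rho^*(t) \equiv \max_j \rho_j(t)$, and assume $\rho_*(t) < \rho^*(t)$. Let $\mathcal{J}_*(t) \equiv \{j : \rho_j(t) = \rho_*(t)\}$. Then the total arrival rate to servers of type $j \in \mathcal{J}_*(t)$ is
\[ \sum_{i \in \mathcal{C}(j),\, j \in \mathcal{J}_*(t)} \lambda_i. \]
We claim that this is strictly greater (by a constant) than the nominal arrival rate
\[ \sum_{i \in \mathcal{C}(j),\, j \in \mathcal{J}_*(t)} \lambda_{ij}. \]
Indeed, under Assumption 2.4 the basic activity tree $\mathcal{E}$ is connected, and $\mathcal{J}_*(t) \subsetneq \mathcal{J}$ (else we could not have $\rho_*(t) < \rho^*(t)$). Consequently, we must have $\lambda_{ij'} > 0$ for at least one edge $(ij') \in \mathcal{E}$ such that $i \in \cup_{j \in \mathcal{J}_*(t)} \mathcal{C}(j)$ but $j' \notin \mathcal{J}_*(t)$, i.e.
\[ \sum_{i \in \mathcal{C}(j),\, j \in \mathcal{J}_*(t)} \lambda_i - \sum_{i \in \mathcal{C}(j),\, j \in \mathcal{J}_*(t)} \lambda_{ij} \ge \lambda_{ij'} > 0. \]
Taking the minimum of the $\lambda_{ij'}$ over all (nonempty, proper) subsets $\mathcal{J}_*(t) \subsetneq \mathcal{J}$ gives
\[ \sum_{i \in \mathcal{C}(j),\, j \in \mathcal{J}_*(t)} \lambda_i - \sum_{i \in \mathcal{C}(j),\, j \in \mathcal{J}_*(t)} \lambda_{ij} \ge c > 0 \]
for some constant $c$ which depends only on the solution to the static planning problem (1), i.e. only on the system parameters.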
This inequality holds at all times $t$ such that $\mathcal{J}_*(t) \ne \mathcal{J}$, i.e. $\rho_*(t) < \rho^*(t)$. On the other hand, the total rate at which customers depart from servers in $\mathcal{J}_*(t)$ is
\[ \sum_{i \in \mathcal{C}(j),\, j \in \mathcal{J}_*(t)} \mu_{ij}\psi_{ij}(t), \]
which for $\|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon$ is close to nominal. If we choose $\epsilon < c/2$, we see that arrivals exceed services by at least a constant at all times $t$ such that $\|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon$ and $\rho_*(t) < \rho^*(t)$; hence $\rho_*(t)$ is increasing at a rate bounded below by a constant. Similarly (decreasing $\epsilon$ if necessary), $\rho^*(t)$ is decreasing at a rate bounded below by a (possibly different) constant. We conclude that while $\rho^*(t) - \rho_*(t) > 0$ (and $\|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon$ continues to hold), the difference $\rho^*(t) - \rho_*(t)$ is decreasing at a rate bounded below by a constant. This difference is bounded below by 0, and, being Lipschitz, it is equal to the integral of its own derivative (see Appendix A). Consequently, in finite time $T_1 = T_1(\delta)$, we must reach the set $\rho_*(t) = \rho^*(t)$. Of course, this requires $T_1(\delta) < T_2$, but since $T_1$ is linear in $\delta$, we can always choose $\delta$ small enough for this to hold. Moreover, we clearly have $T_1(\delta) \downarrow 0$ as $\delta \downarrow 0$. Since, as we saw above, the derivative of $\rho^*(t) - \rho_*(t)$ is negative whenever $\rho^*(t) - \rho_*(t) > 0$, and the function $\rho^*(\cdot) - \rho_*(\cdot)$ is equal to the integral of its derivative, the equality $\rho_*(t) = \rho^*(t)$ will continue to hold for $T_1 \le t \le T_2$.

It remains to derive the differential equation, and to show that $T_2$ can be chosen depending on $\delta$ so that $T_2 \uparrow \infty$ as $\delta \downarrow 0$. Once we are confined to the manifold $\rho_j(t) = \rho_{j'}(t) = \rho(t)$ for all $t$, the system evolution is determined in terms of only $I$ independent variables. Recall that there is no queueing for $t \le T_2$, so we can take the $I$ variables to be $\psi_i(t)$. Given $\psi_{\mathcal{I}}(t)$, we know $\rho(t)$ as $(\sum_i \psi_i(t))/(\sum_j \beta_j)$. Consequently, we know $\sum_i \psi_{ij}(t) = \rho(t)\beta_j$ and $\sum_j \psi_{ij}(t) = \psi_i(t)$. On a tree, this allows us to solve for $\psi_{ij}(t)$ by “stripping off” leaves. (For a customer-type leaf, $\psi_{ij}(t) = \psi_i(t)$, while for a server-type leaf, $\psi_{ij}(t) = \rho(t)\beta_j$; see (15) below.) The resulting relationship will clearly be linear, i.e.
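\[ (\psi_{ij}(t)) \equiv M(\psi_i(t)) \tag{8} \]
for some matrix $M$. For future reference, we define the (“load balancing”) linear mapping $M$ from $y \in \mathbb{R}^I$ to $z = z_{\mathcal{E}} \in \mathbb{R}^{I+J-1}$ as follows: $z = My$ is the unique solution of
\[ \eta = \frac{\sum_i y_i}{\sum_j \beta_j}; \qquad \sum_i z_{ij} = \eta\beta_j,\ \forall j; \qquad \sum_j z_{ij} = y_i,\ \forall i. \tag{9} \]
Let $\mathcal{M}$ denote the manifold containing the image of $M$; that is,
\[ \mathcal{M} \equiv \{My,\ y \in \mathbb{R}^I\} \subseteq \mathbb{R}^{I+J-1}. \tag{10} \]
Thus, the assertion that $\rho_j(t) = \rho_{j'}(t) = \rho(t)$ for all $t$ is equivalent to the assertion that $\psi_{\mathcal{E}}(t) \in \mathcal{M}$. The evolution of $\psi_i(t)$ is given by
\[ \dot\psi_i(t) = \lambda_i - \sum_j \mu_{ij}\psi_{ij}(t), \quad \forall i. \tag{11} \]
(This is (5c) for the case of $q_i = 0$, i.e. $x_i(t) = \psi_i(t)$.) By the above arguments we see that this entails (in matrix form)
\[ \dot\psi_{\mathcal{I}}(t) = \lambda_{\mathcal{I}} + A_u\psi_{\mathcal{I}}(t), \tag{12} \]
where $A_u$ is an $I \times I$ matrix,
\[ A_u = SDM; \tag{13} \]
here, $M$ is given by (9), $D$ is the diagonal matrix of service rates with entries $\mu_{\mathcal{E}}$, and $S$ has entries $S_{i,(kj)} = -\delta_{ik}$.

It remains to justify the claim that $T_2(\delta) \uparrow \infty$ as $\delta \downarrow 0$. This follows from the fact that, as long as $t \ge T_1$ and $\|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon$, the evolution of the system is described by the linear ODE (12). The solutions have the general form
\[ \psi_{\mathcal{I}}(t) - \psi^*_{\mathcal{I}} = \exp\bigl(A_u(t - T_1)\bigr)\bigl(\psi_{\mathcal{I}}(T_1) - \psi^*_{\mathcal{I}}\bigr), \qquad \psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}} = M\bigl(\psi_{\mathcal{I}}(t) - \psi^*_{\mathcal{I}}\bigr), \]
where $M$ and $A_u$ are constant matrices depending on the system parameters. Therefore, if $\|\psi_{\mathcal{I}}(T_1) - \psi^*_{\mathcal{I}}\| \le \delta$ is sufficiently small, then the time it takes for $\psi_{\mathcal{E}}(t)$ to escape the set $\|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon$ can be made arbitrarily large. Since as $\delta \downarrow 0$ we have $T_1(\delta) \downarrow 0$, and the system is Lipschitz, taking $\|\psi_{\mathcal{E}}(0) - \psi^*_{\mathcal{E}}\| < \delta$ for small enough $\delta$ will guarantee that $\|\psi_{\mathcal{I}}(T_1) - \psi^*_{\mathcal{I}}\|$ is small, and hence we can choose $T_2(\delta) \uparrow \infty$. □

The proof of Theorem 2.22 proceeds similarly; we outline only the differences.

Proof of Theorem 2.22. We take $T_2$ such that $\|q_{\mathcal{I}}(t) - (q^*, \ldots, q^*)^{\top}\| < \epsilon$ for all $t \le T_2$. We will take $\epsilon > 0$ sufficiently small that this implies, in particular, $q_i(t) > 0$ for all $i \in \mathcal{I}$, and hence $\rho_j(t) = 1$ for all $j \in \mathcal{J}$, at all times $t \le T_2$. The equality of queue lengths in $[T_1, T_2]$ is shown analogously to the proof of $\rho_*(t) = \rho^*(t)$ for the underloaded case.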
Namely, the smallest queue must increase and the largest queue must decrease (as long as not all $q_i(t)$ are equal), because it is getting less (respectively more) service than nominal (we choose $\epsilon$ small enough for this to be true provided $\|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| < \epsilon$). Thus, in $[T_1, T_2]$ we will have $q_i(t) = q_{i'}(t)$ for all $i, i' \in \mathcal{I}$.

The linear equation is modified as follows. We have $\dot x_i(t) = \lambda_i - \sum_j \mu_{ij}\psi_{ij}(t)$. Since we know that all $q_i(t)$ are equal and positive, we have $q_i(t) = q(t) = \frac{1}{I}\bigl(\sum_k x_k(t) - \sum_j \beta_j\bigr)$, and therefore $\dot\psi_i(t) = \dot x_i(t) - \frac{1}{I}\sum_k \dot x_k(t)$. The rest of the argument proceeds as above to give
\[ (\dot\psi_i(t)) = \Bigl(\lambda_i - \frac{1}{I}\sum_k \lambda_k\Bigr) + A_c(\psi_i(t)) \tag{14} \]
for the appropriate matrix $A_c$, which can be computed explicitly from the basic activity tree. The trajectory $\psi_{\mathcal{I}}(\cdot)$ determines $\psi_{\mathcal{E}}(\cdot)$ on $[T_1, T_2]$ (because we are load-balanced with $\rho = 1$), and this in turn determines $x_{\mathcal{I}}(\cdot)$ and $q_{\mathcal{I}}(\cdot)$. Just as above, the existence of the linear ODE, together with the fact that $T_1(\delta) \downarrow 0$ as $\delta \downarrow 0$, implies that $T_2(\delta) \uparrow \infty$ as $\delta \downarrow 0$. □

To compute the matrix entries of $M$ of (9), and then of $A_u$, $A_c$, we carry out the “leaf-stripping” procedure mentioned in the proof of Theorem 2.21. We arrive at the following formula:
\[ \psi_{i_0 j_0}(t) = \sum_{i \preceq (i_0, j_0)} \psi_i(t) - \sum_{j \preceq (i_0, j_0)} \rho(t)\beta_j = \frac{1}{\sum_j \beta_j}\Bigl[ \sum_{i \preceq (i_0, j_0)} \sum_{j \preceq (j_0, i_0)} \psi_i(t)\beta_j - \sum_{i \preceq (j_0, i_0)} \sum_{j \preceq (i_0, j_0)} \psi_i(t)\beta_j \Bigr]. \tag{15} \]
Here, the relation $\preceq$ is defined as follows. Suppose we disconnect the basic activity tree by removing the edge $(i_0, j_0)$. Then for any node $k$ (either customer type or server type) we say $k \preceq (i_0, j_0)$ if it falls in the same component as $i_0$; otherwise, $k \preceq (j_0, i_0)$. (This is unrelated to the use of $\prec$ for partial orderings in Chapter 3.) Since in underload we have $\dot\psi_i(t) = \lambda_i - \sum_j \mu_{ij}\psi_{ij}(t)$, we obtain the following expression for $A_u$:

Lemma 2.23. (1) Let $\rho < 1$. The entries $(A_u)_{ii'}$ of the matrix $A_u$ are as follows. The coefficient of $\psi_i$ in $\dot\psi_i$ is
\[ (A_u)_{ii} = -\frac{1}{\sum_j \beta_j} \sum_{j \in \mathcal{S}(i)} \mu_{ij} \sum_{j' \preceq (j, i)} \beta_{j'}. \]
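The coefficient of $\psi_{i'}$ in $\dot\psi_i$ is
\[ (A_u)_{ii'} = \frac{1}{\sum_j \beta_j}\Bigl[ -\sum_{j \in \mathcal{S}(i),\, j \ne j_{ii'}} \mu_{ij} \sum_{j' \preceq (j, i)} \beta_{j'} + \mu_{i j_{ii'}} \sum_{j' \preceq (i, j_{ii'})} \beta_{j'} \Bigr] = (A_u)_{ii} + \mu_{i j_{ii'}}. \]
Here, $j_{ii'} \in \mathcal{S}(i)$ is the neighbour of $i$ such that, after removing the edge $(i, j_{ii'})$ from the basic activity tree, nodes $i$ and $i'$ will be in different connected components. (Such a node is unique, since there is a unique path along the tree from $i$ to $i'$.)
(2) The matrix $A_u$ is non-singular.
(3) The matrix $A_u$ depends only on $\beta_{\mathcal{J}}$, the basic activity tree structure $\mathcal{E}$, and $\mu_{\mathcal{E}}$, and does not depend on $\lambda_{\mathcal{I}}$ and $\psi^*_{\mathcal{E}}$.

Proof. (1) We simply use (15) in the expression $\frac{d}{dt}\psi_i(t) = \lambda_i - \sum_{j \in \mathcal{S}(i)} \mu_{ij}\psi_{ij}(t)$. For example, for the network in Figure 2.5, we have

[Figure 2.5. Example for calculation of the matrix $A_u$: customer types A and B, server pools of sizes $\beta_1$ and $\beta_2$, service rates $\mu_{A1}$, $\mu_{A2}$, $\mu_{B2}$.]

\[ \begin{pmatrix} \psi_{A1} \\ \psi_{A2} \\ \psi_{B2} \end{pmatrix} = \begin{pmatrix} \frac{\beta_1}{\beta_1+\beta_2} & \frac{\beta_1}{\beta_1+\beta_2} \\ 1 - \frac{\beta_1}{\beta_1+\beta_2} & -\frac{\beta_1}{\beta_1+\beta_2} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \psi_A \\ \psi_B \end{pmatrix}, \]
giving
\[ \begin{pmatrix} \dot\psi_A \\ \dot\psi_B \end{pmatrix} = \begin{pmatrix} \lambda_A \\ \lambda_B \end{pmatrix} + \begin{pmatrix} -\mu_{A1}\frac{\beta_1}{\beta_1+\beta_2} - \mu_{A2}\frac{\beta_2}{\beta_1+\beta_2} & -\mu_{A1}\frac{\beta_1}{\beta_1+\beta_2} + \mu_{A2}\frac{\beta_1}{\beta_1+\beta_2} \\ 0 & -\mu_{B2} \end{pmatrix} \begin{pmatrix} \psi_A \\ \psi_B \end{pmatrix} \]
as required. The equality between the two expressions for $(A_u)_{ii'}$ is a consequence of the identity
\[ \frac{1}{\sum_j \beta_j} \sum_{j' \preceq (j, i)} \beta_{j'} + \frac{1}{\sum_j \beta_j} \sum_{j' \preceq (i, j)} \beta_{j'} = 1; \]
e.g., in the example above, we observe that
\[ -\mu_{A1}\frac{\beta_1}{\beta_1+\beta_2} + \mu_{A2}\frac{\beta_1}{\beta_1+\beta_2} = \Bigl( -\mu_{A1}\frac{\beta_1}{\beta_1+\beta_2} - \mu_{A2}\frac{\beta_2}{\beta_1+\beta_2} \Bigr) + \mu_{A2}. \]
(2) By (12), to show that $A_u$ is nonsingular, it suffices to demonstrate the following. Given a vector of derivatives $\dot\psi_{\mathcal{I}}$, we can find a load-balancing vector $\psi_{\mathcal{E}}$ corresponding to some load $\rho$ (an unknown), which results (under (11)) in these derivatives. Consider the $I + J$ linear equations $\lambda_i - \sum_j \mu_{ij}\psi_{ij} = \dot\psi_i$ (for all $i$) and $\sum_i \psi_{ij} = \rho\beta_j$ (for all $j$). The value $\rho$ is uniquely determined by the workload derivative condition (see (3)): $\sum_i \nu_i\dot\psi_i = \sum_i \nu_i\lambda_i - \rho\sum_j \alpha_j$. Given the values $\dot\psi_{\mathcal{I}}$ and $\rho$, we can now solve for $\psi_{\mathcal{E}}$ by sequentially eliminating the leaves of the basic activity tree.
(3) Follows from (1). □

There is also an explicit expression for the entries of $A_c$, which is obtained similarly:

Lemma 2.24.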
(1) The entries $(A_c)_{ii'}$ of the matrix $A_c$ (for the critical load case, $\rho = 1$) are as follows:
\[ (A_c)_{ii'} = (A_u)_{ii'} - \frac{1}{I}\sum_k (A_u)_{ki'}. \tag{16} \]
(2) The matrix $A_c$ has rank $I - 1$. The $(I-1)$-dimensional subspace $N = \{y : \sum_i y_i = 0\}$ is invariant under the transformation $A_c$, i.e. $A_c$ maps vectors in $N$ to $N$. Letting $\pi$ denote the orthogonal projection (along $(1, \ldots, 1)^{\top}$) onto $N$, we have
\[ A_c = \pi A_u. \tag{17} \]
Restricted to $N$, the transformation $A_c$ is invertible.
(3) The linear transformation $A_c$, restricted to the subspace $N$, depends only on the basic activity tree structure $\mathcal{E}$ and the values $\mu_{\mathcal{E}}$, and does not depend on $\beta_{\mathcal{J}}$, $\lambda_{\mathcal{I}}$ and $\psi^*_{\mathcal{E}}$.

Proof. (1) The fluid model here is such that there are always non-zero queues, which are equal across customer types. We can write
\[ \dot\psi_i(t) = \dot x_i(t) - \frac{1}{I}\sum_k \dot x_k(t) = \Bigl(\lambda_i - \sum_j \mu_{ij}\psi_{ij}(t)\Bigr) - \frac{1}{I}\sum_k \Bigl(\lambda_k - \sum_j \mu_{kj}\psi_{kj}(t)\Bigr), \tag{18} \]
which implies (16).
(2) First of all, it is not surprising that $A_c$ does not have full rank: the linear ODE defining $A_c$ is such that $\sum_i \psi_i(t) = \sum_j \beta_j$ at all times, so there are at most $(I - 1)$ degrees of freedom in the system. Also, it will be readily seen that (16) asserts precisely that $A_c = \pi A_u$. Since $A_u$ is invertible and $\pi$ has rank $I - 1$, their composition has rank $I - 1$. Since the image of $A_c$ is contained in $N$, we must have equality. It remains to check that $A_c$ restricted to $N$ still has rank $I - 1$. To see this, we observe that the simple eigenvalue 0 of $A_c$ has as its unique right eigenvector the vector $A_u^{-1}(1, 1, \ldots, 1)^{\top}$. We will be done once we show that this eigenvector does not belong to $N$. Suppose instead that $A_u v = (1, 1, \ldots, 1)^{\top}$ for some $v \in N$, $\sum_i v_i = 0$. Then, for small $\epsilon > 0$, starting from some state $\psi_{\mathcal{I}}(t)$, the state $\psi_{\mathcal{I}}(t) + \epsilon v$ would (under balanced loads) have strictly faster service of all the customer types, while keeping the same proportion of servers occupied. This, however, is impossible.
When loads on all the server pools are balanced, the rate at which the system processes workload depends only on the total proportion of occupied servers, hence only on the total number of customers in service.
(3) The specific expression (16) for $A_c$ may depend on the pool sizes $\beta_{\mathcal{J}}$. However, $A_c$ is a singular $I \times I$ matrix, and the statement (3) is only concerned with the transformation of the $(I-1)$-dimensional subspace $N$ that $A_c$ induces; this transformation does not depend on $\beta_{\mathcal{J}}$, as the following argument shows. Pick any $(ij) \in \mathcal{E}$. Modify the original system by replacing $\beta_j$ by $\beta_j + \delta$ and $\lambda_i$ by $\lambda_i + \delta\mu_{ij}$. Then the ODE (18) for the modified system remains exactly the same as for the original one. Thus, the transformation $A_c$ must not depend on $\beta_{\mathcal{J}}$.

An alternative argument is purely analytic. Recall that to compute $(A_u)_{ii'}$ we used (15). In critical load, we have $\rho(t) \equiv 1$, so the (left) equation (15) for $\psi_{i_0 j_0}(t)$ simplifies to
\[ \psi_{i_0 j_0}(t) = \sum_{i \preceq (i_0, j_0)} \psi_i(t) - \sum_{j \preceq (i_0, j_0)} \beta_j. \tag{19} \]
If we substitute this in the right-hand side of (18), we will obtain a different expression for $\dot\psi_i(t)$. While its constant term will depend on $\beta_{\mathcal{J}}$, the linear term will not, since the linear term of (19) does not depend on $\beta_{\mathcal{J}}$. Therefore, the ODE describing the evolution of $(\psi_{\mathcal{I}} - \psi^*_{\mathcal{I}})$ (which drops the constant term) will not depend on $\beta_{\mathcal{J}}$. □

We will now analyse the stability of the fluid models for LQFS-LB.

5.2. Definitions of stability.

Definition 2.25. We say that the (fluid) system is locally stable if all fluid models starting in a sufficiently small neighbourhood of an equilibrium point (which is unique for $\rho < 1$; and for $\rho = 1$ we consider any equilibrium point with equal, positive queues $q^* > 0$) are such that, for some constant $C > 0$ that does not depend on the initial state,
\[ \|\psi_{\mathcal{E}}(t) - \psi^*_{\mathcal{E}}\| \le \Delta_0 e^{-Ct}, \qquad \text{where } \Delta_0 = \|\psi_{\mathcal{E}}(0) - \psi^*_{\mathcal{E}}\| + \|q_{\mathcal{I}}(0) - (q^*, \ldots, q^*)^{\top}\|. \]
We call the system globally stable if any fluid model, with arbitrary initial state, converges to some equilibrium point as $t \to \infty$.
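Since Theorems 2.21 and 2.22 reduce the dynamics near equilibrium to the linear ODEs (12) and (14), exponential convergence of the kind required here can be probed numerically through the eigenvalues of $A_u$ and $A_c$. The following is a minimal sketch for the two-type network of Figure 2.5 only. The numerical values of $\beta_1, \beta_2, \mu_{A1}, \mu_{A2}, \mu_{B2}$, the function names, and the tolerance `tol` are illustrative assumptions, not taken from the text; for $A_c$, which by Lemma 2.24 always has one zero eigenvalue, the check looks at the remaining eigenvalue.

```python
import cmath


def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


def is_stable_underload(A, tol=1e-9):
    """True if every eigenvalue has negative real part (relevant for A_u)."""
    return all(ev.real < -tol for ev in eigenvalues_2x2(A))


def is_stable_critical(A, tol=1e-9):
    """True if exactly one eigenvalue is (numerically) zero and the other has negative real part."""
    evs = eigenvalues_2x2(A)
    zero = [ev for ev in evs if abs(ev) <= tol]
    rest = [ev for ev in evs if abs(ev) > tol]
    return len(zero) == 1 and all(ev.real < -tol for ev in rest)


# Hypothetical parameter values for the Figure 2.5 network (assumptions).
beta1, beta2 = 0.6, 0.4
mu_A1, mu_A2, mu_B2 = 1.0, 2.0, 3.0
b1 = beta1 / (beta1 + beta2)
b2 = beta2 / (beta1 + beta2)

# A_u as displayed in the proof of Lemma 2.23 for this network.
A_u = [[-mu_A1 * b1 - mu_A2 * b2, -mu_A1 * b1 + mu_A2 * b1],
       [0.0, -mu_B2]]

# A_c from (16): subtract from each entry the mean of its column of A_u.
n = len(A_u)
col_mean = [sum(A_u[k][c] for k in range(n)) / n for c in range(n)]
A_c = [[A_u[r][c] - col_mean[c] for c in range(n)] for r in range(n)]

print(is_stable_underload(A_u), is_stable_critical(A_c))
```

The quadratic formula keeps the sketch free of external dependencies; for more than two customer types a general eigenvalue routine would be needed instead.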
It is not obvious that, as defined here, global stability implies local stability; however, in the example in which we can prove global stability (Theorem 2.30), we shall see that local stability also holds. The assumption of exponential convergence for local stability is the result of Theorems 2.21 and 2.22. The theorems assert that on a neighbourhood of equilibrium the fluid models are governed by a linear ODE, so if they converge at all, they do so exponentially quickly.

Remark 2.26. The definition of global stability implies $\rho_j(t) \to \rho$ for all $j \in \mathcal{J}$ and $\psi_{ij}(t) \to \psi^*_{ij}$ for all $(ij) \in \mathcal{E}$. In underload, the definition necessarily implies $q_i(t) \to 0$ for all $i \in \mathcal{I}$. In the case $\rho = 1$, the local stability criterion does not require that $q_i(t) \to q^*$, for $q^*$ associated with the chosen equilibrium point. However, local stability will guarantee that the queues converge to some common value. First, if $\psi_{\mathcal{E}}(t) \approx \psi^*_{\mathcal{E}}$ at all times $t$, then we cannot have large inequalities between queue sizes $q_i(t)$ across different customer types, because the rates at which customers of different types enter service must be approximately nominal. Second, if $\psi_{\mathcal{E}}(t) \approx \psi^*_{\mathcal{E}}$, then system workload is approximately constant; since the number of customers in service is approximately constant, we conclude that the queues are approximately constant as well. Consequently, local stability will imply that all $q_i(t) \to q$ for some $q$. In fact, it is not hard to see that $|q - q^*| \le C_0\Delta_0$ for some constant $C_0 > 0$ depending only on the system parameters. In other words, local stability guarantees convergence to an equilibrium point not too far from the “original” one.

Remark 2.27. “Global stability” is slightly weaker than the definition of “stability” usually adopted for fluid models. Typically (e.g. [Bramson, 2006, Chapter 4]), a fluid model is called stable if, for all starting states within a ball of radius 1 from the equilibrium point, the fluid model reaches the equilibrium point after a finite time.
We, on the other hand, allow convergence to be asymptotic, and do not require uniformity (although in the case of Theorem 2.30 the convergence will indeed be uniform). There is a general theory of proving positive (Harris) recurrence for queueing networks via the stability (in the sense of uniform, finite-time convergence to equilibrium) of fluid networks; see e.g. [Bramson, 2006, Chapter 4]. However, we are not trying to prove stability of LQFS-LB in this sense. In our set-up positive Harris recurrence will hold for all parameter values with $\rho < 1$, because if the queues grow large enough, then all the servers will become fully occupied, and the system will process workload faster than it arrives. (In particular, discussions of “steady state” of the LQFS-LB algorithm are well-defined.) We are interested in the finer question of whether, in steady state, the LQFS-LB algorithm will eliminate customer queueing, and our notion of global stability is more appropriate.

By Theorems 2.21 and 2.22, local stability is determined by the stability of a linear ODE, which in turn is governed by the eigenvalues of the matrix $A_u$ or $A_c$.

Definition 2.28. We will call the matrix $A_u$ stable if all its eigenvalues have negative real part. We call the matrix $A_c$ stable if all its eigenvalues have negative real part, except for one simple eigenvalue 0.$^{9}$

If $A_u$ (respectively $A_c$) is stable, then the corresponding linear ODE (12) (respectively (14)) is stable as well. On the other hand, if the matrix has an eigenvalue with positive real part, the ODE has solutions diverging from the equilibrium $\psi^*_{\mathcal{I}}$ exponentially fast; and if it has (a pair of conjugate) pure imaginary eigenvalues, the ODE has oscillating, never converging solutions. That is,

Proposition 2.29. The local stability of the underloaded, respectively critically loaded, fluid system is equivalent to the stability of the matrix $A_u$, respectively $A_c$.

We will now examine examples where the system is globally stable, locally stable, and locally unstable.
In §7 we will investigate the first and last of these cases further, under the diffusion scaling.

Footnote 9. A matrix $A$ with all eigenvalues having negative real part is usually called Hurwitz. So, stability of $A_u$ is equivalent to $A_u$ being Hurwitz, while the definition of stability for $A_c$ is slightly different, because $A_c$ considered as a linear transformation of $\mathbb{R}^I$ is singular. A symmetric matrix $A$, whose eigenvalues are all real, is Hurwitz if and only if it is negative definite, which is a property that can be easily verified by computing some polynomials in the matrix entries (see e.g. [Meyer, 2000, Section 7.6] or [Horn and Johnson, 1985, Section 7.2]). Unfortunately, neither $A_u$ nor $A_c$ is, in general, symmetric; so there appears to be no easy way of determining the sign of the real part of the eigenvalues.

5.3. Global stability. If the service rates in the system depend only on the server type, we have both global and local stability.

Theorem 2.30. Assume $\mu_{ij} = \mu_j$ for all $(ij) \in \mathcal{E}$. Then the system is globally stable both for $\rho < 1$ and for $\rho = 1$. In addition, the system is locally stable (i.e. the matrices $A_u$ and $A_c$ are stable).

Proof. Consider the underloaded system, $\rho < 1$, first. We begin by showing that the lowest load cannot stay too low. Suppose the minimal load $\rho_*(t) \equiv \min_j \rho_j(t)$ is smaller than $\rho$, and let $\mathcal{J}_*(t) \equiv \{j : \rho_j(t) = \rho_*(t)\}$. Then all customer types in $\mathcal{C}(\mathcal{J}_*(t)) \equiv \bigcup_{j \in \mathcal{J}_*(t)} \mathcal{C}(j)$ are routed to server pools in $\mathcal{J}_*(t)$, so the total arrival rate “into” $\mathcal{J}_*(t)$ is no less than nominal; on the other hand, since $\mu_{ij} = \mu_j$ and server occupancy is lower than nominal, the total departure rate “from” $\mathcal{J}_*(t)$ is smaller than nominal. This shows that if $\rho_* < \rho - \epsilon < \rho$, then $\dot\rho_* > \delta > 0$, where $\delta \ge c\epsilon$ for some constant $c > 0$ (depending on the system parameters). That is, if $\rho_*(t) < \rho$ then $\dot\rho_*(t) \ge c(\rho - \rho_*(t))$, so $\rho_*(t)$ is bounded below by a function converging exponentially fast to $\rho$.
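The exponential lower bound can be spelled out by a standard comparison argument. The auxiliary function $u(t)$ below is introduced only for this derivation, and the computation is a sketch under the assumption that the minimal load starts below $\rho$ at time 0.

```latex
% u(t) is an auxiliary quantity introduced for this derivation only.
\begin{align*}
u(t) &\equiv \rho - \rho_*(t), &
\dot u(t) &= -\dot\rho_*(t) \le -c\,u(t) \quad \text{whenever } u(t) > 0, \\
\frac{d}{dt}\bigl(e^{ct}u(t)\bigr) &= e^{ct}\bigl(\dot u(t) + c\,u(t)\bigr) \le 0
 & &\Longrightarrow \quad u(t) \le u(0)\,e^{-ct}, \\
\rho_*(t) &\ge \rho - \bigl(\rho - \rho_*(0)\bigr)e^{-ct}.
\end{align*}
```

The integration step uses that $\rho_*(\cdot)$, being Lipschitz, is the integral of its derivative (Appendix A). If $u$ reaches 0 it cannot become positive again, since its derivative is negative whenever $u > 0$, so the bound continues to hold trivially from then on.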
Consider a fixed, sufficiently small $\epsilon > 0$; we know that there exists a finite time $T_1$ such that $\rho_*(t) \ge \rho - \epsilon$ for all $t \ge T_1$. If some customer class $i$ has a queue $q_i(t) > 0$, then all server classes $j \in \mathcal{S}(i)$ have $\rho_j = 1$. It is now easy to see that the system is serving customers faster than they arrive (because $\rho < 1$ and $\epsilon$ is small). This easily implies that all $q_i(t) = 0$ after some finite time $T_2$. In the absence of queues, we can analyse $\rho^*(t) \equiv \max_j \rho_j(t)$ similarly to the way we treated $\rho_*(t)$; namely, if $\rho^*(t) > \rho$ at some point, then the servers in $\mathcal{J}^*(t) \equiv \{j : \rho_j(t) = \rho^*(t)\}$ are processing workload faster than the nominal rate, and are getting no more arrivals than the nominal quantity. Consequently, $\rho^*(t)$ is bounded above by a function converging exponentially fast to $\rho$. Since $\rho_*(t) \to \rho$ and $\rho^*(t) \to \rho$, we conclude $\rho_j(t) \to \rho$ for all $j$.

Once all $\rho_j(t)$ are close enough to $\rho$, we can use an argument similar to the proof of Theorem 2.21 to conclude that, after a further finite time, we will have $\rho_j(t) = \rho_{j'}(t)$ for all $j, j'$. (Theorem 2.21 does not apply directly, because we have $\rho_j(t) \approx \rho$, but possibly $\psi_{\mathcal{E}}(t) \not\approx \psi^*_{\mathcal{E}}$. However, because the service rates $\mu_{ij} = \mu_j$ do not depend on the customer class, we only need the total occupancies of each server pool to be approximately nominal.) Moreover, this common load $\rho(t)$ will satisfy
\[ \dot\rho(t)\sum_j \beta_j = \sum_i \lambda_i - \rho(t)\sum_j \beta_j\mu_j, \]
and therefore will be given by
\[ \rho(t) = \rho + c_1\exp(-c_2 t) \]
for some constant $c_1$ and $c_2 = (\sum_j \beta_j\mu_j)/(\sum_j \beta_j) > 0$. We conclude that $\rho(t) \to \rho$ (exponentially quickly) and $\dot\rho(t) \to 0$ (exponentially quickly). Define $\hat\lambda_{\mathcal{E}}(t)$ by
\[ \hat\lambda_{ij}(t) \equiv \mu_j\psi_{ij}(t) + \dot\psi_{ij}(t). \tag{20} \]
This is the instantaneous rate at which customers of type $i$ are being routed to servers of type $j$ in the absence of queueing. We have $\sum_j \hat\lambda_{ij}(t) = \lambda_i$ at all (large) times $t$. Further, from the discussion of $\rho(t)$ above we conclude
\[ \sum_i \hat\lambda_{ij}(t) = \mu_j\sum_i \psi_{ij}(t) + \sum_i \dot\psi_{ij}(t) = \mu_j\beta_j\rho_j(t) + \beta_j\dot\rho_j(t) \to \beta_j\mu_j\rho = \sum_i \lambda_{ij}. \]
This implies $\hat\lambda_{\mathcal{E}}(t) \to \lambda_{\mathcal{E}}$, and therefore by (20) $\psi_{\mathcal{E}}(t) \to \psi^*_{\mathcal{E}}$ as required.

Now, consider a critically loaded system, $\rho = 1$. Essentially the same argument as above tells us that, as long as not all queues $q_i(t)$ are equal, each of the longest queues gets more service than the arrival rate into it, and so $q^*(t) \equiv \max_i q_i(t)$ has derivative which is strictly negative and bounded away from 0. If at some time $t$ all $q_i(t)$ are equal and positive, then $\dot q^*(t) = 0$. We see that $q^*(t)$ is non-increasing, and so $q^*(t) \downarrow q \ge 0$. We also have $\rho_*(t) \to \rho = 1$ exponentially fast. (The same proof as above applies.) These facts easily imply convergence to an equilibrium point. We omit further detail.

In order to show local stability, it is sufficient to observe that, for all $\epsilon > 0$ there exists a $\delta > 0$ such that fluid models started in a $\delta$-neighbourhood of the equilibrium point will never leave an $\epsilon$-neighbourhood of it. In this case, taking $\epsilon$ to be small enough that the behaviour of the fluid model is controlled by a linear ODE ((12) or (14)), convergence to the equilibrium point will imply stability of $A_u$ and $A_c$. □

5.4. Local stability. If the service rates in the system depend only on the customer type, we have local stability.

Theorem 2.31. Assume $\mu_{ij} = \mu_i$ for all $(ij) \in \mathcal{E}$. Then the system is locally stable (i.e. the matrices $A_u$ and $A_c$ are stable).

Proof. For the case $\rho < 1$ and $\mu_{ij} = \mu_i$, (11) becomes $\dot\psi_i(t) = \lambda_i - \mu_i\psi_i(t)$ and $A_u$ is simply a diagonal matrix with entries $-\mu_i$, which is clearly stable. Assume now that $\rho = 1$. As we just saw, the matrix $A_u$ in this case is diagonal with entries $-\mu_i$. By Lemma 2.24, $A_c$ has off-diagonal entries $(A_c)_{ii'} = \mu_{i'}/I$ and diagonal entries $-\mu_i(1 - 1/I)$. In particular, its off-diagonal entries are strictly positive. Therefore, $A_c + \eta I$ for some large enough constant $\eta > 0$ (where $I$ in this expression denotes the identity matrix) is a positive matrix.
By the Perron-Frobenius theorem [Meyer, 2000, Chapter 8], $A_c + \eta I$ has a real eigenvalue $p + \eta$ with the property that any other eigenvalue of $A_c + \eta I$ is smaller than $p + \eta$ in absolute value (and in particular has real part smaller than $p + \eta$). Moreover, the associated left eigenvector $w$ is strictly positive, and is the unique (up to scaling) strictly positive left eigenvector of $A_c + \eta I$. Translating these statements to $A_c$, we get: $A_c$ has a real eigenvalue $p$; all other eigenvalues of $A_c$ have real part smaller than $p$; $A_c$ has a unique (up to scaling) strictly positive left eigenvector $w$; and the eigenvalue of $w$ is $p$. Now, we know that $A_c$ has a positive left eigenvector with eigenvalue 0, namely $(1, 1, \ldots, 1)$. We conclude that $p = 0$, and all other (i.e., non-zero) eigenvalues of $A_c$ have real part smaller than 0, as required. □

5.5. (Local) instability. We have shown that the matrices $A_u$ and $A_c$ are stable in the cases $\mu_{ij} = \mu_j$, $(ij) \in \mathcal{E}$, and $\mu_{ij} = \mu_i$, $(ij) \in \mathcal{E}$. Since the entries of $A_u$, $A_c$ depend continuously on the parameters $\mu_{\mathcal{E}}$ (Lemmas 2.23, 2.24), and the eigenvalues of a matrix depend continuously on its entries, we know that the matrices $A_u$, $A_c$ will be stable for all parameter settings sufficiently close to those special cases. Therefore, there exists a non-trivial parameter domain of local stability. It seems reasonable to conjecture that at least local stability holds for any set of $\mu_{\mathcal{E}}$, provided Assumption 2.4 holds. This seems a particularly natural conjecture given that the graph of available routing choices has no cycles along which an instability could propagate and grow. However, this intuition turns out to be false. We will now construct examples to demonstrate that, in general, the system can be locally unstable.

Example 2.32 (Local instability: underload). Consider a system with 3 customer types A, B, C and 4 server types 1 through 4, connected $1 - A - 2 - B - 3 - C - 4$. Set $\beta_1 = 0.97$ and $\beta_2 = \beta_3 = \beta_4 = 0.01$. Set $\mu_{A1} = \mu_{B2} = \mu_{C3} = 1$, and $\mu_{A2} = \mu_{B3} = \mu_{C4} = 100$.
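As a numerical cross-check of these parameters, the matrix $A_u = SDM$ of (13) can be assembled directly from the defining equations (9) by the leaf-stripping procedure described in the proof of Theorem 2.21. The sketch below does this in plain Python and then applies the standard Routh-Hurwitz test for a monic cubic. All function names, the dictionary layout, and the ordering (A, B, C) of customer types are choices made for this illustration, not the author's code; the eigenvalues do not depend on that ordering, although the row and column order of the matrix does.

```python
import cmath

# Example 2.32 data; the customer order (A, B, C) is a labelling choice made here.
customers = ["A", "B", "C"]
beta = {1: 0.97, 2: 0.01, 3: 0.01, 4: 0.01}
mu = {("A", 1): 1.0, ("A", 2): 100.0, ("B", 2): 1.0,
      ("B", 3): 100.0, ("C", 3): 1.0, ("C", 4): 100.0}


def load_balance(y, beta, edges):
    """Solve (9) for z = M y by repeatedly stripping a leaf off the tree."""
    eta = sum(y.values()) / sum(beta.values())
    need = {("cust", i): v for i, v in y.items()}                 # sum_j z_ij = y_i
    need.update({("pool", j): eta * b for j, b in beta.items()})  # sum_i z_ij = eta * beta_j
    remaining, z = set(edges), {}
    while remaining:
        for (i, j) in sorted(remaining):
            deg_i = sum(1 for e in remaining if e[0] == i)
            deg_j = sum(1 for e in remaining if e[1] == j)
            if deg_i == 1:      # customer leaf: the edge carries all of i's residual
                z[(i, j)] = need[("cust", i)]
                need[("pool", j)] -= z[(i, j)]
            elif deg_j == 1:    # pool leaf: the edge carries all of j's residual
                z[(i, j)] = need[("pool", j)]
                need[("cust", i)] -= z[(i, j)]
            else:
                continue
            remaining.remove((i, j))
            break
        else:
            raise ValueError("edge set is not a tree")
    return z


def build_A_u(customers, beta, mu):
    """A_u = S D M from (13): entry (r, k) is -sum_j mu_rj * (M e_k)_rj."""
    n = len(customers)
    A = [[0.0] * n for _ in range(n)]
    for k, ck in enumerate(customers):
        unit = {c: (1.0 if c == ck else 0.0) for c in customers}
        z = load_balance(unit, beta, list(mu))
        for r, cr in enumerate(customers):
            A[r][k] = -sum(mu[e] * z[e] for e in mu if e[0] == cr)
    return A


def char_poly_3x3(A):
    """Coefficients (a2, a1, a0) of det(xI - A) = x^3 + a2 x^2 + a1 x + a0."""
    (a, b, c), (d, e, f), (g, h, k) = A
    minors = (a * e - b * d) + (a * k - c * g) + (e * k - f * h)
    det = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    return -(a + e + k), minors, -det


def cubic_roots(a2, a1, a0):
    """One real root by bisection, then the other two from the deflated quadratic."""
    p = lambda x: ((x + a2) * x + a1) * x + a0
    hi = 1.0 + max(abs(a2), abs(a1), abs(a0))  # Cauchy bound on the root moduli
    lo = -hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if p(mid) < 0 else (lo, mid)
    r = 0.5 * (lo + hi)
    b, c = a2 + r, a1 + r * (a2 + r)
    s = cmath.sqrt(b * b - 4 * c)
    return r, (-b + s) / 2, (-b - s) / 2


A_u = build_A_u(customers, beta, mu)
a2, a1, a0 = char_poly_3x3(A_u)
# Routh-Hurwitz for a monic cubic: all roots in the open left half-plane
# if and only if a2 > 0, a0 > 0 and a2 * a1 > a0.
routh_hurwitz_stable = a2 > 0 and a0 > 0 and a2 * a1 > a0
real_root, z_plus, z_minus = cubic_roots(a2, a1, a0)
print(routh_hurwitz_stable, real_root, z_plus, z_minus)
```

The bisection-plus-deflation root finder is used only to avoid external dependencies; any general eigenvalue routine applied to `A_u` would serve the same purpose.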
(See Figure 2.6.) For this example, we compute using Lemma 2.23,
\[ A_u = \begin{pmatrix} -1.99 & -0.99 & -0.99 \\ 97.02 & -2.98 & -1.98 \\ 96.03 & 96.03 & -3.97 \end{pmatrix} \]
with eigenvalues $\{-17.8,\ 4.45 \pm 23.4i\}$. Therefore by Theorem 2.21, the system with these parameters is described by an unstable ODE in the neighbourhood of its equilibrium point.

[Figure 2.6. System with three customer types whose underload equilibrium is unstable.]

Remark 2.33. For this, as for any other, set of parameter values $\mu_{\mathcal{E}}$, $\beta_{\mathcal{J}}$, there exist values $\lambda_{\mathcal{I}}$ which make all the activities in $\mathcal{E}$ basic. For example, we may simply start with a load-balancing allocation $\psi_{\mathcal{E}}$, and define $\lambda_i = \sum_j \mu_{ij}\psi_{ij}$. Since the stability of $A_u$ and $A_c$ does not depend on the arrival rates $\lambda_{\mathcal{I}}$ (as long as the basic activity tree is unchanged), it does not matter which of the possible arrival rates we choose.

Remark 2.34. Although we have shown that for the parameters in Example 2.32, fluid models on a neighbourhood of equilibrium are governed by an unstable ODE (and will see in Section 7 that the stochastic system is never very close to the equilibrium point), this leaves open the question of the steady-state behaviour of fluid models. In principle, as Remark 2.15 shows, it is possible to construct the unique explicit solution to the fluid model equations (with the added constraint $\lambda_{ij}(t) \ge 0$); however, as Lemma 2.35 shows, we must be dealing with a system of dimension $\ge 6$ (and, it seems, with somewhat unwieldy parameters), which makes the numerical analysis somewhat involved. On general grounds, we can conclude that, for $\rho < 1$, all fluid limits started in a compact set $K$ will reside in some other compact set $K'$.
This follows from the arguments in Remark 2.27, whose contents are essentially as follows: if we look at an LQFS-LB system over a sufficiently long time scale, and it starts with a large queue size, then after a finite time all server pools will be fully busy, and hence will be processing workload faster than it arrives. For the (deterministic) fluid limit, it is in principle possible to give exact bounds of the form “If the initial queue size satisfies ‖q_I(0)‖ > Q, then after a finite time T_0 all server pools will be fully busy for at least another time T_1 such that ‖q_I(T_0 + T_1)‖ < ‖q_I(0)‖.” (The precise values of Q, T_0, and T_1 are unenlightening.) This means that there are three possible behaviours for the fluid model equations in the long run:

(1) It is possible that all fluid models eventually hit the submanifold of convergence of the linear ODE that governs the evolution of the system near the equilibrium point, and thus eventually they converge to the equilibrium point. (This seems unlikely.)
(2) The fluid model solutions may be periodic, or may converge to some periodic solution (which does not enter the region near the equilibrium point).
(3) The fluid model solutions may be chaotic. This intuitively seems like the most likely possibility, at least with generic parameter values.

This instability example is minimal in the following sense.

Lemma 2.35. Consider an underloaded system, ρ < 1.
(1) Let I ≥ 2. Any customer type i that is a leaf in the basic activity tree does not affect the local stability of the system. Namely, let us modify the system by removing type i, and then modifying (if necessary) input rates λ_k of the remaining types k ∈ I \ i so that the basic activity tree of the modified system is E \ (ij), where (ij) is the (only) edge in E adjacent to i. Then, the original system is locally stable if and only if the modified one is.
(2) A system with two (or one) non-leaf customer types is locally stable.

Proof.
(1) If customer type i is a leaf, the equation for ψ_i(t) is simply ψ̇_i(t) = λ_i − µ_ij ψ_i(t). This means that the unit vector in the ith coordinate direction is an eigenvector of A_u with the corresponding eigenvalue −µ_ij < 0. Further, it is easy to see that: (a) the rest of the eigenvalues of A_u are those of the matrix A_u^(−i) obtained from A_u by removing the ith row and the ith column; and (b) A_u^(−i) is exactly the “A_u-matrix” for the modified system.

(2) We can assume that there are no customer type leaves. The case I = 1 is trivial (and is covered by Theorem 2.30), so let I = 2. Throughout the proof, the pool sizes β_j are fixed. From Theorem 2.30 we know that for a certain set of service rate values (namely, µ_ij = µ_j, (ij) ∈ E), the matrix A_u is stable. Suppose that we continuously vary the parameters µ_ij from those initial values to the values of interest, without ever making µ_ij = 0. If we assume that the final matrix A_u is not stable, then as we change µ_ij the (changing) matrix A_u acquires at some point two purely imaginary eigenvalues. In that case, we must have trace(A_u) = 0. However, as seen from the form of A_u in Lemma 2.23, the diagonal entries of A_u are always negative, and therefore trace(A_u) < 0. The contradiction completes the proof. □

This argument explains how the parameters in Example 2.32 were chosen. For 3 customer types, let the characteristic polynomial of A_u be x^3 − c_2 x^2 + c_1 x − c_0. A necessary and sufficient condition for all roots of the polynomial to have negative real parts is: −c_2, c_1, −c_0 > 0 and c_2 c_1 < c_0 (see [Farkas, 2001, A1.1.1]). Using Lemma 2.23, we can evaluate the expression c_0 − c_1 c_2 in terms of the system parameters, and look for terms which appear with a “−” sign. Setting the corresponding parameters to be large relative to the rest will produce a candidate parameter set. Appendix D.1 contains the computations.

Example 2.36.
It is possible to construct an instability example with more reasonable values of β_J, µ_E, although it will have more customer types. Figure 2.7 shows the diagram. The associated 21 × 21 matrix A_u may be found in Appendix D.2; its largest eigenvalue has real part ≈ 0.03 > 0.

Figure 2.7. System with β_j = 1 and µ_ij ∈ {1/3, 1, 3} whose underload equilibrium is unstable. There are 21 customer types; µ_ij = 1 for edges going to the left, µ_ij = 1/3 for the first 12 edges going to the right, µ_ij = 3 for the last 9 edges going to the right.

Remark 2.37. One of the justifications used in Remark 2.8 for the assumption that the basic activity tree E is known in advance was an argument of separation of time scales; the routing of customers happens over quite short time scales. One could therefore argue that the rather slow exponential growth of the instability caused by an eigenvalue with real part 0.03 is irrelevant. This intuition, however, is somewhat difficult to quantify.

Example 2.38 (Local instability: critical load, q > 0). We now analyse the critically loaded system ρ = 1 with queues, i.e. the stability of the matrix A_c. Recall that the transformation A_c, restricted to the subspace N ≡ {y : ∑_i y_i = 0}, and hence the stability of A_c, does not depend on β_J, so it suffices to specify µ_E.

Consider the network of Figure 2.8, which has 5 customer types A through E and 4 server types 1 through 4, connected as A − 1 − B − 2 − C − 3 − D − 4 − E, with the following parameters:

µ_A1 = 1, µ_B1 = 100, µ_B2 = 1, µ_C2 = 100, µ_C3 = 1, µ_D3 = 100, µ_D4 = 10000, µ_E4 = 100.

Figure 2.8.
System with five customer types whose critical load equilibrium is unstable.

The matrix A_c, computed from Lemma 2.24, is

\[ A_c = \frac{1}{20} \begin{pmatrix} 9389 & 9805 & 10201 & 10597 & -29003 \\ 10894 & 9290 & 9706 & 10102 & -29498 \\ 10399 & 10795 & 9191 & 9607 & -29993 \\ -40091 & -39695 & -39299 & -40903 & 119497 \\ 9409 & 9805 & 10201 & 10597 & -31003 \end{pmatrix}, \]

and the eigenvalues of A_c are {0, −16.88, −2190.05, 2.565 ± 23.23i}.

Again, the above example is in a sense minimal:

Lemma 2.39. Consider a critically loaded system, ρ = 1.
(1) Let J ≥ 2. Any server type j that is a leaf in the basic activity tree does not affect the local stability of the system. Namely, let us modify the system by removing type j, and then replacing λ_i for the unique i adjacent to j by λ_i − β_j µ_ij. Then, the original system is locally stable if and only if the modified one is.
(2) Consider a system labelled S. We say that a system S′ is an expansion of system S if it is obtained from S by the following modification. We pick one server type j and one customer type i adjacent to it in E; we “split” type j into two types j′ and j′′; we “connect” type i to both j′ and j′′; each of the remaining types i′ ∈ C(j) \ i we connect to either j′ or j′′ (but not both); if (i′j′) (respectively (i′j′′)) is a new edge, we set µ_i′j′ = µ_i′j (respectively µ_i′j′′ = µ_i′j). Then, S is locally stable if and only if S′ is.
(3) A system with four or fewer customer types is locally stable.

Proof. (1) The argument here is similar to the one used to show the independence of the transformation A_c (restricted to the (I − 1)-dimensional invariant subspace) from β_J in the proof of Lemma 2.24. Namely, it is easy to check that the original and the modified system share exactly the same ODE (18).

(2) Again, it is easy to see that the two systems share the same ODE (18).

(3) We can assume that there are no server-type leaves, so that the tree E has only customer-type leaves, of which it can have two, three, or four. We now classify these trees.
If it has four customer type leaves, then the tree has a total of four edges, hence five nodes, i.e. a single server pool, to which all the customer types are connected.

If the tree has three customer type leaves, then letting k be the number of edges from the fourth customer type, we have k + 3 total edges, so k + 4 nodes, of which k are server types. That is, the non-leaf customer type is connected to all of the server types. Since there are no server type leaves, we must have k ≤ 3; since we are assuming the fourth customer type is not a leaf, we must have k ≥ 2; thus, k = 2 or k = 3.

The last case is of two customer type leaves. Letting k, l be the number of edges coming out of the other customer types, we have a total of k + l + 2 edges. On the other hand, since each server type has at least 2 edges coming out of it, we have at most (k + l + 2)/2 server types, so at most (k + l + 2)/2 + 4 nodes. Thus, we have (k + l + 2) + 1 ≤ (k + l + 2)/2 + 4, or k + l + 2 ≤ 6, giving k = l = 2 (since they must both be ≥ 2).

We summarize the possibilities in Figure 2.9. Note that the bottom-left system can be obtained by a sequence of expansions (in the sense of (2) above) from each of the top-left systems. Applying Lemma 2.39 we find that, to establish local stability for systems with four customer types, we only need to consider two systems: bottom-left and right.

Figure 2.9. Possible arrangements of four customer types. (Panel labels: 4 leaves; 3 leaves; 3 or 4 leaves; 2 leaves.)

In both of the resulting cases, we can use Lemma 2.24 to write out A_c and its characteristic polynomial explicitly. The characteristic polynomial will have degree 4, but one of its roots is 0, so we can reduce it to degree 3. We then symbolically verify that the stability criterion for degree 3 polynomials cited above [Farkas, 2001, A1.1.1] is satisfied. Computations can be found in Appendix D.3. □

An argument similar to that in the above proof allows us to explain how the parameters in Example 2.38 were chosen.
We seek a condition satisfied by the coefficients of a degree 4 polynomial with two imaginary roots. Letting the polynomial be x^4 − c_1 x^3 + c_2 x^2 − c_3 x + c_4, and letting the roots be η_1, η_2, ±iz (where η_1 and η_2 may be real or complex conjugates, and z ∈ R), we see that c_1 = η_1 + η_2, c_2 = η_1 η_2 + z^2, c_3 = (η_1 + η_2) z^2, and c_4 = η_1 η_2 z^2. This implies the relation c_4 c_1^2 + c_3^2 − c_1 c_2 c_3 = 0, and we can find the parameters for which this is true. (The symbolic calculation will involve rather a lot of terms, and we do not reproduce it here.)

Remark 2.40. Whereas for polynomials of degree 3 the condition c_2 c_1 − c_0 = 0 is both necessary and sufficient for the existence of two imaginary roots [Farkas, 2001, A1.1.1], the condition we derive here for polynomials of degree 4 is only necessary. (For example, the polynomial (x − 1)^2 (x + 1)^2 has c_1 = c_3 = 0, so c_4 c_1^2 + c_3^2 − c_1 c_2 c_3 = 0, but it has no imaginary roots.) Thus, checking the sign of the corresponding expression alone is insufficient to determine whether the system is unstable, but is a useful way of narrowing down the parameter ranges.

Example 2.41 (Local instability: both underload and critical load (q > 0)). It is possible to construct a single set of parameters for which both A_u and A_c will be unstable. For the local stability of the underloaded system, the leaves of the basic activity tree corresponding to customer types are irrelevant (the corresponding occupancy on the sole available server class converges to nominal exponentially). On the other hand, for the critically loaded system, the leaves corresponding to server pools are irrelevant, since the corresponding server is fully occupied by its unique available customer type. This observation allows us to merge the above two systems into a single one which is unstable both in the underloaded and in the critically loaded case.
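The two polynomial criteria used above to select parameters can be sanity-checked numerically. The following is a minimal sketch, not the computation of Appendix D: it assumes the 3 × 3 matrix A_u of Example 2.32 with entries as printed there, rewrites the degree 3 criterion in the monic form x^3 + a_2 x^2 + a_1 x + a_0 (so a_2 = −c_2, a_1 = c_1, a_0 = −c_0), and evaluates the degree 4 expression on the polynomial of Remark 2.40 and on a hypothetical polynomial with roots −1, −2, ±3i. All function names are illustrative.

```python
def char_poly_3x3(A):
    """Return (a2, a1, a0) with det(xI - A) = x^3 + a2*x^2 + a1*x + a0."""
    (a, b, c), (d, e, f), (g, h, i) = A
    trace = a + e + i
    # Sum of the three principal 2x2 minors.
    minors = (a * e - b * d) + (a * i - c * g) + (e * i - f * h)
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return -trace, minors, -det


def cubic_is_stable(a2, a1, a0):
    """Routh-Hurwitz test: all roots of the monic cubic have negative real part."""
    return a2 > 0 and a0 > 0 and a2 * a1 > a0


def quartic_relation(c1, c2, c3, c4):
    """Expression that vanishes when x^4 - c1 x^3 + c2 x^2 - c3 x + c4 has a
    pair of roots +-iz (a necessary condition only; cf. Remark 2.40)."""
    return c4 * c1 ** 2 + c3 ** 2 - c1 * c2 * c3


# A_u of Example 2.32; entries copied from the text (assumed exact as printed).
A_U_EXAMPLE_2_32 = [
    [-1.99, -0.99, -0.99],
    [97.02, -2.98, -1.98],
    [96.03, 96.03, -3.97],
]

if __name__ == "__main__":
    a2, a1, a0 = char_poly_3x3(A_U_EXAMPLE_2_32)
    print("cubic coefficients:", a2, a1, a0)
    print("locally stable:", cubic_is_stable(a2, a1, a0))
    # (x - 1)^2 (x + 1)^2 = x^4 - 2x^2 + 1: the relation holds, yet no imaginary roots.
    print("Remark 2.40 polynomial:", quartic_relation(0, -2, 0, 1))
    # (x + 1)(x + 2)(x^2 + 9): roots -1, -2, +-3i (hypothetical illustration).
    print("roots -1, -2, +-3i:", quartic_relation(-3, 11, -27, 18))
```

With the printed entries, a_2 and a_0 are positive but a_2 a_1 < a_0, so the degree 3 criterion fails; this agrees with the eigenvalues 4.45 ± 23.4i quoted in Example 2.32.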
Consider a system with 5 customer types A through E and 5 server types 0 through 4 connected as 0 − A − 1 − B − 2 − C − 3 − D − 4 − E, with µ_A0 = 100 and the remaining µ_ij as in the critically loaded case. Set β_3 = 0.96 while β_0 = β_1 = β_2 = β_4 = 0.01. (See Figure 2.10.) By the above discussion, this system, which is a modification of Example 2.38, must be unstable for ρ = 1 and positive queues, with the same eigenvalues {0, −16.88, −2190.05, 2.565 ± 23.23i}.

Figure 2.10. System with five customer types whose underload and critical load equilibrium points are both unstable.

On the other hand, in underload, we construct the matrix A_u. We may restrict A_u to the first 4 customer types, since E is a customer type leaf and, by Lemma 2.35 (1), does not affect the stability of the system.

\[ A_u = \begin{pmatrix} -1.99 & -0.99 & -0.99 & -0.99 \\ 97.02 & -2.98 & -1.98 & -1.98 \\ 96.03 & 96.03 & -3.97 & -2.97 \\ -99 & -99 & -99 & -199 \end{pmatrix} \]

with eigenvalues {−14.6, −201.1, 3.91 ± 18.1i}.

Remark 2.42. Although for this system both A_u and A_c are unstable, it is not obvious what the system behaviour would be like in the vicinity of the q = 0 equilibrium point: the trajectories governed by either matrix cross the ρ = 1, q = 0 boundary, and the question of stability of such hybrid systems is in general quite difficult. (For example, it is certainly possible to have two individually-unstable matrices combine to form a stable system.)

Example 2.43 (Local instability: common eigenvector in underload and critical load). The expression (17) suggests another way to construct a system which is always locally unstable. Namely, we find a set of parameters for which A_u has a right eigenvector (1, . . . , 1)⊤ with some non-zero real eigenvalue c, and such that A_c = πA_u is unstable (where π is the projection along (1, . . . , 1)⊤). Then the projection of the system state onto the manifold N defined in Lemma 2.24 will always evolve according to A_c, which is unstable.
(See §7.3.) Specifically, consider the system diagrammed in Figure 2.11. For sufficiently small ε, the matrix A_c for this system will be unstable, because the system in Figure 2.8 was unstable (i.e., had an eigenvalue with a positive real part), and the eigenvalues of a matrix depend continuously on its entries. By Lemma 2.39 (1), the addition of the server-type leaves 0 and 5 does not change critical-load stability.

Figure 2.11. Modification of example in Figure 2.8, for which A_u will have (1, . . . , 1)⊤ as an eigenvector.

We will next design the system parameters for which A_u has eigenvector (1, . . . , 1)⊤. For this to be the case, it suffices to construct a system for which ψ*_i = ∑_j ψ*_ij are all equal, and ∑_j µ_ij ψ*_ij = λ_i = 1 for all i. Once we find a set of suitable parameters ψ*_ij > 0, we will set β_j = ∑_i ψ*_ij to guarantee that ψ*_E achieves load balancing. (Recall that the linear transformation A_c, and in particular its stability properties, does not depend on β_J; see Lemma 2.24 (3).)

For δ > 0 small, choose ψ*_D3 = (1 − δ)/(100 − ε) and ψ*_D4 = δ/10^4; then ∑_j µ_Dj ψ*_Dj = 1 and ψ*_D = ψ*_D3 + ψ*_D4 > 1/100. (This requires changing µ_D3 from 100 to 100 − ε, otherwise we could not get ψ*_D4 > 0.) Next, set ψ*_A0 = 1/100 − δ_1 and ψ*_A1 = 100δ_1, with δ_1 > 0 small, so that ψ*_A = ψ*_D. Set ψ*_E4 = ψ*_C2 = ψ*_B1 ≡ ψ*_A0 and ψ*_E5 = ψ*_C3 = ψ*_B2 ≡ ψ*_A1.

We have shown the following.

Proposition 2.44. There exists a set of parameters λ_I, β_J, µ_E for which the following hold:
(1) Assumption 2.4 holds, and the unique optimal solution to the static planning problem (1) has ρ = 1.
(2) The matrix A_c is unstable.
(3) The matrix A_u has the vector (1, . . . , 1)⊤ as a right eigenvector, with some real (non-zero) eigenvalue.

In the above construction, the eigenvalue in question will be given by ∑_j ψ*_ij.
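The choice of occupancies in this construction can be checked mechanically. The sketch below is an illustration under stated assumptions rather than the thesis computation: it takes the service rates of Example 2.38 with µ_D3 lowered to 100 − ε, and it additionally assumes µ_A0 = 100 and µ_E5 = 1, which is what the stated values of ψ* require in order to give λ_A = λ_E = 1. The default values of ε and δ and the function names are hypothetical.

```python
def build_psi_star(eps=1.0, delta=1e-3):
    """Occupancies psi*_ij on the chain 0-A-1-B-2-C-3-D-4-E-5 (Example 2.43)."""
    # Service rates: Example 2.38 values, with mu_D3 lowered to 100 - eps.
    # mu_A0 = 100 and mu_E5 = 1 are inferred from the stated psi* (assumption).
    mu = {
        ("A", 0): 100.0, ("A", 1): 1.0,
        ("B", 1): 100.0, ("B", 2): 1.0,
        ("C", 2): 100.0, ("C", 3): 1.0,
        ("D", 3): 100.0 - eps, ("D", 4): 1e4,
        ("E", 4): 100.0, ("E", 5): 1.0,
    }
    psi = {("D", 3): (1.0 - delta) / (100.0 - eps), ("D", 4): delta / 1e4}
    psi_d = psi[("D", 3)] + psi[("D", 4)]
    # Choose delta1 so that psi*_A = 1/100 + 99 * delta1 equals psi*_D.
    delta1 = (psi_d - 0.01) / 99.0
    if not 0.0 < delta1 < 0.01:
        raise ValueError("eps and delta do not give positive occupancies")
    psi[("A", 0)] = 0.01 - delta1
    psi[("A", 1)] = 100.0 * delta1
    # Types B, C and E copy the two occupancies of type A.
    for cust, (left, right) in {"B": (1, 2), "C": (2, 3), "E": (4, 5)}.items():
        psi[(cust, left)] = psi[("A", 0)]
        psi[(cust, right)] = psi[("A", 1)]
    # Pool sizes beta_j = sum_i psi*_ij, as prescribed in the text.
    beta = {}
    for (cust, pool), value in psi.items():
        beta[pool] = beta.get(pool, 0.0) + value
    return mu, psi, beta


def per_type_totals(mu, psi):
    """Return ({i: sum_j psi_ij}, {i: sum_j mu_ij * psi_ij})."""
    occupancy, throughput = {}, {}
    for (cust, pool), value in psi.items():
        occupancy[cust] = occupancy.get(cust, 0.0) + value
        throughput[cust] = throughput.get(cust, 0.0) + mu[(cust, pool)] * value
    return occupancy, throughput


if __name__ == "__main__":
    mu, psi, beta = build_psi_star()
    occupancy, throughput = per_type_totals(mu, psi)
    print("psi*_i per type:", occupancy)
    print("sum_j mu_ij psi*_ij per type:", throughput)
    print("beta_j per pool:", beta)
```

Calling `build_psi_star(eps=0.0)` raises an error in this sketch, which mirrors the remark that µ_D3 must be lowered below 100 for ψ*_D4 > 0 to be attainable.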
We have shown that systems with two customer types cannot be locally unstable, for any parameter setting such that ρ < 1 (and Assumption 2.4 is satisfied). Here we show a form of converse to this result, namely, that for large systems there always are parameters rendering the system unstable.

Lemma 2.45. Let ρ < 1. Any shape of basic activity tree that includes a locally unstable system (i.e., with A_u having an eigenvalue with positive real part) as a subset will, with some set of parameters β_J, µ_E, become locally unstable. In particular, any shape of basic activity tree that includes Example 2.32 will be locally unstable for some set of parameters β_J, µ_E.

Proof. Let U be any system whose underload (ρ < 1) equilibrium is locally unstable, e.g. one of the examples given above, with the associated fixed set of parameters µ_ij, β_j and λ_i. Let S be a system including U as a subset, namely: the activity tree of S is a superset of that of U; the µ_ij and β_j in U are preserved in S; the µ_ij in S are fixed. Consider a sequence of systems S^ε in which β_j = ε → 0 for all j not in U. By Remark 2.33, for each ε, we can slightly perturb the arrival rates λ_i to λ_i^ε, such that as ε → 0 we have convergence

\[ \lambda_i^\varepsilon \to \begin{cases} \lambda_i, & i \in U \\ 0, & i \notin U \end{cases} \]

and all of the activities in S^ε are basic. (Keeping the value ρ from the system U, we simply prescribe the desired new occupancies ψ_ij; β_j → 0 implies ψ_ij → 0 as ε → 0, so λ_i^ε will have the desired convergence properties.)

Order the ψ_i so that the customer types i in U come first. Suppose there are I customer types in U and I + k customer types in S. Let A_u^ε be the (I + k) × (I + k) matrix associated with S^ε, and let A_u be the I × I matrix associated with U considered as an isolated system. Then as ε → 0 the top left I × I entries of A_u^ε converge to A_u, while the bottom left k × I entries of A_u^ε converge to 0.
(That is, the effect of U on the stability of the rest of the system vanishes; this is due to the fact that pool size parameters β_j in U remain constant, while β_j → 0 in the rest of the system.) Consequently, each eigenvalue of A_u is a limit of eigenvalues of A_u^ε. Since A_u had an eigenvalue with positive real part, for sufficiently small ε the matrix A_u^ε will have at least one eigenvalue with positive real part as well, so the system S^ε will be locally unstable. □

We do not have an explicit characterization of the local instability domain, either for the underloaded case or for the critically loaded one, beyond the necessity of I ≥ 3, respectively I ≥ 4. We informally conjecture that the phenomenon is “rare”:

Conjecture 2.46 (Very informal). All examples of instability have somewhat unrealistic parameters (either very many customer classes, or widely differing server pool sizes or service rates).

6. LAP fluid models: convergence to equilibrium

We now switch our attention to the fluid models for the Leaf Activity Priority (LAP) algorithm, described by (7). For LAP, we only treat the case of strict underload, ρ < 1.

The LAP discipline is not designed with load balancing in mind; consequently, its equilibrium point is different from the load-balancing one. Instead, we recursively define the “routing rates” λ_ij ≥ 0 as follows. For the activity (1j) with the highest priority, define either λ_1j = λ_1 and ψ*_1j = λ_1/µ_1j, or ψ*_1j = β_j and λ_1j = β_j µ_1j, according to whichever is smaller. Replace λ_1 by λ_1 − λ_1j and β_j by β_j − ψ*_1j, and remove the edge (1j) from the tree. Proceed similarly with the remaining activities. Formally, we define the equilibrium point as follows.

Definition 2.47. Assume ρ < 1. Set λ_ij = min λ_i − ∑_{j′:(ij′)<(ij)} λ_ij′ , µ_ij ( β_j − ∑_i′ […] 0 for all (ij) ∈ E. In particular, the equilibrium point satisfies this condition and, moreover, it is such that ∑_i ψ*_ij = β_j, ∀j < J; ∑_i ψ*_iJ < β_J.
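The recursive construction of the LAP routing rates described at the start of this section amounts to a single greedy pass over the activities in priority order. The following is a minimal sketch under the assumption that the priority order is supplied as a list (highest priority first); the function name and the toy parameters in the usage example are hypothetical and not taken from the thesis.

```python
def lap_equilibrium(activities, lam, beta, mu):
    """Greedy LAP equilibrium.

    activities: list of (i, j) pairs, highest priority first.
    lam:  {i: arrival rate of customer type i}
    beta: {j: size of server pool j}
    mu:   {(i, j): service rate of activity (i, j)}
    Returns (routing, psi) with routing[(i, j)] = lambda_ij and
    psi[(i, j)] = psi*_ij = lambda_ij / mu_ij.
    """
    lam_left = dict(lam)    # arrival rate not yet routed, per customer type
    beta_left = dict(beta)  # pool capacity not yet used, per server pool
    routing, psi = {}, {}
    for (i, j) in activities:
        # Route everything that is left of type i, unless pool j saturates first.
        rate = min(lam_left[i], mu[(i, j)] * beta_left[j])
        routing[(i, j)] = rate
        psi[(i, j)] = rate / mu[(i, j)]
        lam_left[i] -= rate
        beta_left[j] -= psi[(i, j)]
    return routing, psi


if __name__ == "__main__":
    # Toy tree: customer 1 - pool 1 - customer 2 - pool 2 (illustrative values).
    acts = [(1, 1), (2, 1), (2, 2)]
    routing, psi = lap_equilibrium(
        acts,
        lam={1: 1.0, 2: 1.0},
        beta={1: 1.5, 2: 1.0},
        mu={a: 1.0 for a in acts},
    )
    print(routing)
    print(psi)
```

In the toy example every activity receives a positive rate, pool 1 ends up fully occupied and pool 2, the last pool, keeps spare capacity, which is the pattern ∑_i ψ*_ij = β_j for j < J and ∑_i ψ*_iJ < β_J stated above.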
The assumption means that the system needs to employ (on average) all activities in order to be able to handle the load. It holds, for example, whenever ρ is sufficiently close to 1.

Remark 2.49. Although the LAP equilibrium point does not attain load balancing, the difference is negligible when the system is heavily loaded (i.e. ρ is close to 1): the LAP equilibrium point is such that all queues are small and all servers are almost fully loaded, which is the best any “load balancer” could do in a heavily loaded system.

We now show that, unlike the LQFS-LB, the LAP discipline accomplishes convergence of fluid models to equilibrium for all parameter settings.

Proposition 2.50. Suppose Assumption 2.48 holds. For any ε′ > 0 and any K > 0 there exists a finite time T = T(K) such that all fluid models whose starting state satisfies ‖(ψ_E(0), q_I(0))‖ ≤ K have ∑_i ψ_ij(t) = β_j, ∀j < J, q_i(t) = 0, ∀i ∈ I, and |ψ_ij(t) − ψ*_ij| < ε′ for all (ij) ∈ E, for all t ≥ T(K).

Sketch of proof. For the highest priority activity (1j) there are two cases. If type 1 is a leaf, then it is easily seen that ψ_1j(t) → ψ*_1j exponentially fast, uniformly on the initial states (bounded as in the proposition statement); this in turn implies that, after some time T_0 = T_0(K), q_1(t) has to be equal to 0. If pool j is a leaf, then, after some time T′_0 = T′_0(K), ψ_1j(t) = ψ*_1j = β_j. In either case, for arbitrarily small δ > 0, there exists T_1 = T_1(δ) such that |ψ_1j(t) − ψ*_1j| < δ. We can now essentially remove the highest priority activity from the tree, and proceed by induction on activity priority. (Assumption 2.48 guarantees that, for sufficiently small δ, the remaining tree will always remain connected.) □

For large systems, we can use this finite time horizon result to show that sufficiently large stochastic systems will be stable, i.e. positive recurrent. (This was trivial for LQFS-LB, see Remark 2.27, but may not be true for small systems running LAP.)
Moreover, in steady-state the system will sit close to equilibrium on the fluid scale.

Theorem 2.51. For all sufficiently large r, the LAP discipline stabilizes the network (in the sense of positive recurrence of the underlying Markov process). Moreover, the sequence of invariant distributions of (ψ^r_E(·), q^r_I(·)) is tight, and

\[ (\psi^r_E, q^r_I) \xrightarrow{w} (\psi^*_E, q^*_I = 0), \]

where (ψ*_E, q*_I) is the equilibrium point specified in Definition 2.47.

Note that convergence in law is here the same as convergence in probability, since the limit is a single point.

We will be using Foster-Lyapunov criteria (see e.g. [Bramson, 2006, Proposition 4.5], and references therein) to conclude stability and tightness of the associated measures. In order to do that, we need to establish some results on the behaviour of all sufficiently large systems over a finite time horizon.

Lemma 2.52. There exists T_1 > 0 such that for any T_2 > T_1 there exists sufficiently large C = C(T_2) for which the following holds. For any ε > 0,

\[ P\left\{ \left| \sum_{(ij)\in E} \nu_i \bigl( d^r_{ij}(T_2) - d^r_{ij}(T_1) \bigr) - (T_2 - T_1) \sum_{j\in J} \alpha_j \right| \ge \varepsilon \right\} \to 0 \quad \text{as } r \to \infty, \]

uniformly on initial states with max_{i∈I} q^r_i(0) ≥ C. Here, ν_i and α_j are the workload (of a job of type i) and the rate of processing workload (by the server pool j) respectively, defined by (2).

In plain words, we are asserting that, if the initial state of the system is large enough, then after one finite time T_1 (uniform) and until another finite time T_2 (which grows to infinity with the starting state) the system will be processing workload at the maximal possible rate ∑_j α_j.

Proof. The proof uses fluid models with infinite initial states. We cannot appeal directly to the properties of “standard” fluid models defined earlier, because we require convergence that is uniform in all large initial states. Instead, we consider the following version of the fluid limit result. Consider a sequence of initial states (ψ^r_E(0), q^r_I(0)) with ‖(ψ^r_E(0), q^r_I(0))‖ = C′(r) → ∞ as r → ∞.
If we regard q^r_I(0) as an element of R̄^I ≡ (R ∪ {∞})^I, any such sequence has a convergent subsequence; we will restrict our attention to such a subsequence. The condition ‖(ψ^r_E(0), q^r_I(0))‖ → ∞ as r → ∞ means that the limit (ψ_I(0), q_I(0)) will have q_i(0) = ∞ for at least one customer class i. Partition the customer classes as I = I_∞ ∪ I_0, where q_i(0) = ∞ for i ∈ I_∞, and q_i(0) < ∞ for i ∈ I_0.

Now, we can prove a fluid limit result, analogous to Proposition 2.50. Namely, with probability 1, any subsequence of fluid-scaled trajectories has a further subsequence which converges u.o.c. to a fluid model satisfying conditions (7), except that for all i ∈ I_∞ the queue length q_i(t) = ∞, ∀t ≥ 0. These infinite-initial-state fluid models are such that, uniformly on all of them, starting at some finite time T′_1, all server pools are fully occupied. Indeed, the same analysis as in Proposition 2.50 gives ∑_i ψ_ij(t) = β_j for all j < J after a finite time; but the existence of infinite queues together with Assumption 2.48 guarantees that after some further finite time T′_1 we will always have q_I > 0, so ∑_i ψ_iJ(t) = β_J as well. We choose T_1 = 2T′_1.

Consider any T_2 > T_1. If the lemma were false, then for some fixed ε′ > 0 we could find a sequence of systems with ‖(ψ^r_E(0), q^r_I(0))‖ = C′(r) → ∞, such that

\[ \limsup_r P\left\{ \left| \sum_{(ij)} \nu_i \bigl( d^r_{ij}(T_2) - d^r_{ij}(T_1) \bigr) - (T_2 - T_1) \sum_j \alpha_j \right| \ge \varepsilon' \right\} > 0. \]

This, however, is impossible because, from the fluid limit result, we must have w.p.1

\[ \sup_{t\in[T_1,T_2]} \max_j \left| \sum_i \psi^r_{ij}(t) - \beta_j \right| \to 0, \]

and then

\[ \left| \sum_{(ij)} \nu_i \bigl( d^r_{ij}(T_2) - d^r_{ij}(T_1) \bigr) - (T_2 - T_1) \sum_j \alpha_j \right| \to 0. \]

□

Proof of Theorem 2.51. Recall that ν_i > 0 is the workload associated with a single request of type i; i.e., the optimal dual variable associated with (1c) for type i (see (2)). We consider the total workload W^r(t) = ∑_i ν_i x^r_i(t). We will argue that the quantity L^r(t) = (W^r(t))^2 will serve as a Lyapunov function for the rth system.
Namely, the following property holds: there exist positive constants K, T, C_1, C_2, C_3 such that, for all sufficiently large r,

\[ \text{if } L^r(t) > K \text{ then } E[L^r(t+T) - L^r(t) \mid L^r(t)] < -C_1 W^r(t) + C_2 \tag{22} \]

and

\[ \text{if } L^r(t) \le K \text{ then } E[L^r(t+T) - L^r(t) \mid L^r(t)] < C_3. \tag{23} \]

Once we show (22)–(23), a standard application of the Foster-Lyapunov criteria [Bramson, 2006, Proposition 4.5] shows that for all sufficiently large r the system Markov process is positive recurrent, and moreover, the stationary distributions are such that E W^r = ∑_i ν_i E x^r_i remains uniformly (in r) bounded. This implies that the sequence (ψ^r_E, q^r_I) is tight; hence, any subsequence has a further, convergent, sub-sub-sequence. Proposition 2.50 then implies that any convergent subsequence of invariant measures must weakly converge to the point mass at equilibrium, which concludes the proof.

It remains to show (22)–(23). First, it is easy to see that, ∀T > 0,

\[ E[W^r(t+T) - W^r(t)]^2 \text{ are uniformly bounded across all } r \text{ and } t. \tag{24} \]

This guarantees (23) for any fixed K. To prove (22), we fix T_1 > 0 as in Lemma 2.52, and then choose a large fixed T > T_1. Note that

\[ \bigl(\min_{i\in I} \nu_i\bigr) \bigl(\max_{i\in I} q^r_i(t)\bigr) \le W^r(t) \le \bigl(\max_{i\in I} \nu_i\bigr) \Bigl( I \max_{i\in I} q^r_i(t) + \sum_j \beta_j \Bigr); \]

therefore, we may replace max_{i∈I} q^r_i(0) → ∞ by W^r(0) → ∞ in the conditions of Lemma 2.52. If we fix a sufficiently small ε′ > 0 and apply Lemma 2.52, we obtain the following fact: for a sufficiently large fixed K > 0, uniformly on all L^r(0) > K and all large r,

\[ P\Bigl\{ W^r(T) - W^r(0) \le 2 \sum_i \lambda_i \nu_i T_1 - \tfrac{1}{2} (1-\rho)(T - T_1) \sum_j \alpha_j \Bigr\} \ge 1 - \varepsilon'. \tag{25} \]

Here, the term 2 ∑_i λ_i ν_i T_1 is a crude upper bound on W^r(T_1) − W^r(0), which holds with high probability for large r. The term −(1/2)(1 − ρ)(T − T_1) ∑_j α_j is an upper bound on W^r(T) − W^r(T_1), also holding with high probability, which follows from Lemma 2.52 and relation (3). When T is large enough, the right-hand side of the first inequality in (25) is negative. This, along with (24), implies (22). □

7.
LQFS-LB steady-state on the diffusion scale

We will now analyse the behaviour of the LQFS-LB on the diffusion scale; that is, we will consider deviations of the system state from its equilibrium point (Definition 2.19), scaled down by √r. First we show that, over a finite time horizon, the behaviour of the system can be described by a diffusion process satisfying a certain stochastic differential equation. We will then consider the steady state behaviour on the same scale. We will show the following three results:

• If ρ < 1 and fluid models are locally unstable, then the steady state of the system does not live on the diffusion scale; that is, the invariant measure of a ball of size K√r around the equilibrium point converges to 0 as r → ∞, for any K. (Theorem 2.58.)
• The model with parameters satisfying Proposition 2.44 will not display “Halfin-Whitt-like” behaviour. (A summary of [Halfin and Whitt, 1981] can be found in Appendix C.) That is, if ρ^r → 1 with 1 − ρ^r = O(1/√r), the probability of an arriving call having to wait converges to 1. (Theorem 2.60.)
• If, however, the service rate depends only on the server type (µ_ij = µ_j for all i), then both of the above are reversed. When ρ < 1, the steady-state deviations of such a model from its equilibrium point, scaled down by √r, are tight; when ρ^r → 1 with 1 − ρ^r = O(1/√r), the same tightness holds, and implies that the probability of an arriving customer having to wait has a limiting value strictly between 0 and 1. (Theorem 2.62.)

Remark 2.53. These three possibilities are not exhaustive: for example, in the case µ_ij = µ_i we have shown local stability, but have not shown either global stability or lack of it. Theorem 2.62 applies whenever we have both global and local stability. Conversely, lack of global stability suggests, but by no means proves, that the invariant measure does not live near equilibrium; but we have only been able to show this when local stability fails.
We begin by defining the diffusion scaling.

Definition 2.54. For all state variables Γ, we let

\[ \hat\Gamma^r(t) \equiv \frac{\Gamma^r(t) - r\gamma^*}{\sqrt{r}}. \]

Specifically, we will be interested in the quantities

\[ \hat\Psi^r_{ij}(t) \equiv \frac{\Psi^r_{ij}(t) - r\psi^*_{ij}}{\sqrt{r}} \tag{26} \]

and the derived quantities

\[ \hat\Psi^r_i(t) \equiv \sum_j \hat\Psi^r_{ij}(t), \qquad \hat\Psi^r_j(t) \equiv \sum_i \hat\Psi^r_{ij}(t) = \frac{\Psi^r_j(t) - \rho r \beta_j}{\sqrt{r}}. \]

We will be interested in the behaviour of the system under this scaling in two settings: in underload and in the Halfin-Whitt regime. In underload, our assumptions on the set-up of the system are as before: namely, the rth system has arrival rates λ^r_i ≡ λ_i r, server pool sizes β^r_j ≡ β_j r, and service rates µ_ij, where the parameters λ_I, β_J, and µ_E satisfy Assumption 2.4. We now define the Halfin-Whitt regime for the multi-class case as follows:

Definition 2.55. The Halfin-Whitt asymptotic regime is a family of systems, indexed by r, with the following properties. We consider a set of parameters λ_I, β_J, and µ_E such that the unique optimal solution to the static planning problem (1) has ρ = 1, and Assumption 2.4 is satisfied. In the rth system, β^r_j ≡ rβ_j (same as throughout the chapter). However, the input rates are λ^r_i ≡ rλ_i + √r l_i, for some set of real numbers l_I such that ∑_i l_i ν_i = −C < 0. (Here, ν_I are workloads defined by (2).)

Denote by ρ^r, {λ^r_ij} the optimal solution of SPP (1) with β_j and λ_i replaced by β^r_j and λ^r_i respectively. (Under Assumption 2.4, this solution is unique for all large r.) Because ρ^r can equivalently be defined through workloads as in (3), and the workloads will be the same for all large r, we have ρ^r = 1 + (∑_i l_i ν_i)/√r = 1 − C/√r. This in turn implies that, for all large r, the Markov process describing the model is positive Harris recurrent, with a unique invariant distribution; so it makes sense to speak of steady-state variables.

We use the notation of (26) in the Halfin-Whitt regime, with the convention q*_i = 0 and z*_j = 0. Recall that Z^r_j(t) ≤ 0 measures the idleness of server pool j, and is given by Z^r_j(t) ≡ Ψ^r_j(t) − rβ_j.
We note that Ẑ^r_j(t) is measuring the deviation of pool-j occupancy from full occupancy rβ_j, not from its equilibrium value in the rth system, ρ^r rβ_j. Thus, we have queueing if Q̂^r_i > 0, and we have idleness if Ẑ^r_j < 0.

Recall the load-balancing mapping M : R^I → R^(I+J−1) (9), which sent ψ_I to the load-balancing allocation ψ_E. Let M′ be its left inverse, namely, M′z_E = (∑_j z_ij)_{i∈I}. We can rewrite the manifold ℳ defined in (10) as ℳ = {z ∈ R^(I+J−1) : z = MM′z}. The state space collapse results of Theorems 2.21, 2.22 suggest that the queueing network should “live on” ℳ; we will see that this is true. Specifically, we show that if the network starts close to the equilibrium point on the diffusion scale, then under the diffusion scaling it will jump to ℳ instantaneously, and then over a finite time it will evolve within ℳ.

7.1. Finite time horizon diffusion process approximation. We will require an approximation of the behaviour of the network under the diffusion scaling over a finite time horizon. Derivation of such behaviour is nearly standard; and, in any case, was done in Gurvich and Whitt [2009] in some generality. (Our load-balancing algorithm belongs to the family of algorithms that they consider.) The exposition below follows Gurvich and Whitt [2009] and Dai and Tezcan [2011], omitting some of the more technical details.

The term “finite time horizon” means that we will be concerned with uniform convergence of processes on compact sets. That is, in this section we fix a time interval [0, T] and look at the behaviour of the rth system on it, rather than examining the steady-state behaviour. We will return to studying the steady-state behaviour in §7.2–7.4.

Theorem 2.56 (Essentially a corollary of [Gurvich and Whitt, 2009, Theorem 3.1 and Theorem 4.4]). Let ρ < 1. Assume that as r → ∞,

\[ \hat\Psi^r_E(0) \to \hat\Psi_E(0) \tag{27} \]

where Ψ̂_E(0) is deterministic and finite. (Consequently, Ψ̂^r_I(0) → Ψ̂_I(0) ≡ M′Ψ̂_E(0).)
Then,
$$\hat\Psi^r_{\mathcal I}(\cdot) \Longrightarrow \hat\Psi_{\mathcal I}(\cdot) \quad\text{in } D^I[0,\infty), \tag{28}$$
and for any fixed $\eta > 0$,
$$\hat\Psi^r_{\mathcal E}(\cdot) \Longrightarrow M\hat\Psi_{\mathcal I}(\cdot) \quad\text{in } D^{I+J-1}[\eta,\infty), \tag{29}$$
where $\hat\Psi_{\mathcal I}(\cdot)$ is the unique (possibly weak) solution of the stochastic differential equation
$$\hat\Psi_{\mathcal I}(t) = \hat\Psi_{\mathcal I}(0) + \int_0^t A_u\hat\Psi_{\mathcal I}(s)\,ds + \big(\sqrt{2\lambda_i}\,B_i(t)\big)_{i\in\mathcal I}, \tag{30}$$
the matrix $A_u$ is defined by (12), and the processes $B_i(\cdot)$ are independent standard Brownian motions.

Sketch of proof. We will not justify why limiting processes can be defined (details can be found in Gurvich and Whitt [2009], or in [Dai and Tezcan, 2011, Theorem 4.2]). We will, however, justify why any subsequential limit must satisfy the SDE (30).

Fix an interval $[0, T]$. The finiteness of the limit $\hat\Psi_{\mathcal E}(0)$ in (27) means that under the fluid scaling, the initial state converges to the fluid equilibrium point. Applying Theorem 2.21, we conclude: as $r \to \infty$,
$$\Psi^r_{\mathcal E}(t) = r\psi^*_{\mathcal E} + o(r) \quad\text{for all } t \in [0, T].$$
In particular, since we are in underload, $\rho < 1$,
$$P\big(Q^r_i(t) > 0 \text{ for some } i \in \mathcal I,\ t \in [0, T]\big) \to 0 \quad\text{as } r \to \infty.$$
We may therefore work on the event that there is never any queueing in the system, a significant simplification relative to Gurvich and Whitt [2009]. (We also have no abandonment.)

Assuming there is no queueing on the time interval $[0, T]$, we have $\Psi^r_i(t) = X^r_i(t)$, and we can write
$$\hat\Psi^r_i(t) = \hat\Psi^r_i(0) - \sum_j \mu_{ij}\int_0^t \hat\Psi^r_{ij}(s)\,ds + \frac{1}{\sqrt r}\Big(\Pi^{(a)}_i(\lambda_i r t) - \lambda_i r t\Big) - \sum_j \frac{1}{\sqrt r}\Big(\Pi^{(s)}_{ij}\Big(\mu_{ij}\int_0^t \Psi^r_{ij}(s)\,ds\Big) - \mu_{ij}\int_0^t \Psi^r_{ij}(s)\,ds\Big).$$
The Brownian term $\sqrt{2\lambda_i}\,B_i(t)$ now follows from the functional central limit theorem for Poisson processes, because by Theorem 2.21 we know that on the fluid scale the trajectory is sitting at equilibrium: thus, $\sum_j \mu_{ij}\Psi^r_{ij}(s) = \lambda_i r + o(r)$.

To conclude the sketch of proof that the limiting process satisfies the linear SDE (30), we need to demonstrate that $\hat\Psi^r_{\mathcal E}(t) \approx M\hat\Psi^r_{\mathcal I}(t)$ for all $t \in [\eta, T]$, where we can choose $\eta \to 0$ as $r \to \infty$: that is, we need to establish the state space collapse of Theorem 2.21, but on the diffusion scale.$^{10}$
This is accomplished by considering the hydrodynamic scaling,$^{11}$
$$\gamma^{r,m}(t) = \frac{1}{\sqrt r}\Big(\Gamma^r\Big(\frac{t}{\sqrt r} + \frac{m}{\sqrt r}\Big) - r\gamma^*\Big).$$
(This is the same scaling as in §8.2, with $h(r) \equiv r^{1/2}$.) We will be considering the limits under this scaling, for $0 \le m < \sqrt{r}\,T$. Since this is a version of a fluid scaling, similarly to Theorem 2.13, as $r \to \infty$, any sequence of hydrodynamically-scaled processes (for a fixed $m$) has a subsequence which converges uniformly to a Lipschitz limit. Note, however, that in this limit quantities such as $\psi^m_{ij}(t)$ measure the deviations of the occupancy process from the equilibrium, and as such can be negative.

The hydrodynamic model equations (for $\rho < 1$ and no queueing) are:
$$\psi^m_i(t) = \psi^m_i(0), \quad \forall i \in \mathcal I \tag{31a}$$
$$\psi^m_i(t) = \sum_j \psi^m_{ij}(t), \quad \forall i \in \mathcal I \tag{31b}$$
$$\rho^m_j(t) = \Big(\sum_i \psi^m_{ij}(t)\Big)\Big/\beta_j, \quad \forall j \in \mathcal J \tag{31c}$$
For any set of customer types $\mathcal I_* \subseteq \mathcal I$, and any set of server types $\mathcal J_* \subseteq \mathcal J$ such that $\rho^m_j(t) < \rho^m_{j'}(t)$ whenever $j \in \mathcal J_*$, $j' \notin \mathcal J_*$, and $\mathcal C(j) \cap \mathcal C(j') \cap \mathcal I_* \ne \emptyset$,
$$\sum_{j\in\mathcal J_*}\ \sum_{i\in\mathcal C(j)\cap\mathcal I_*} \frac{d}{dt}\psi^m_{ij}(t) = \sum_{i \in \cup_{j\in\mathcal J_*}\mathcal C(j)\cap\mathcal I_*} \lambda_i - \sum_{j\in\mathcal J_*}\ \sum_{i\in\mathcal C(j)\cap\mathcal I_*} \mu_{ij}\psi^*_{ij}. \tag{31d}$$

Equation (31a), which corresponds to the ordinary fluid model equation (5c) but has no explicit mention of the arrival or departure processes, arises as follows. Under our rescaling, the arrival process is simply the linear function $\lambda_i t$. On the hydrodynamic scale, the approximation $\Psi^r_{ij} \approx r\psi^*_{ij}$ implies that the service rate is (in the limit) precisely equal to the nominal value. Consequently, in the hydrodynamic limit the number of customers of type $i$ in the system does not change. (This also accounts for the appearance of $\psi^*_{ij}$ in (31d).)

From these equations it follows readily that whenever $\min_j \rho^m_j(t) < \max_j \rho^m_j(t)$, the difference between the largest and the smallest loads is decreasing at a rate bounded below by a constant.
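To illustrate why the load gap closes in finite hydrodynamic time, here is a toy sketch for the simplest possible configuration: one customer type served by two pools. This configuration, the function names, and all parameter values are assumptions made for illustration and are not taken from the thesis. In this special case (31d) with $\mathcal J_*$ equal to the less loaded pool says that this pool gains customers at rate $\lambda - \mu_{\text{low}}\psi^*_{\text{low}} = \mu_{\text{high}}\psi^*_{\text{high}}$, while (31a) forces the other pool to lose customers at the same rate.

```python
def time_to_balance(psi1, psi2, beta1, beta2, mu1, mu2, psi1_star, psi2_star):
    """Closed-form hydrodynamic time until the two pool loads coincide."""
    rho1, rho2 = psi1 / beta1, psi2 / beta2
    if rho1 == rho2:
        return 0.0
    # Inflow into the less loaded pool equals the nominal departure rate of
    # the other pool, because lambda = mu1 * psi1_star + mu2 * psi2_star.
    flow = mu2 * psi2_star if rho1 < rho2 else mu1 * psi1_star
    # The total is conserved, so both loads move and the gap closes at this rate.
    closing_rate = flow * (1.0 / beta1 + 1.0 / beta2)
    return abs(rho1 - rho2) / closing_rate


def simulate(psi1, psi2, beta1, beta2, mu1, mu2, psi1_star, psi2_star, dt=1e-4):
    """Euler integration of the same toy dynamics; returns (time, psi1, psi2)."""
    t = 0.0
    low_is_1 = psi1 / beta1 < psi2 / beta2
    flow = mu2 * psi2_star if low_is_1 else mu1 * psi1_star
    sign = 1.0 if low_is_1 else -1.0
    # Run until the ordering of the two loads flips (or they become equal).
    while (psi1 / beta1 < psi2 / beta2) == low_is_1 and psi1 / beta1 != psi2 / beta2:
        psi1 += sign * flow * dt
        psi2 -= sign * flow * dt
        t += dt
    return t, psi1, psi2


if __name__ == "__main__":
    # Deviations from equilibrium may be negative; all values are hypothetical.
    args = (-0.2, 0.2, 1.0, 1.0, 1.0, 1.0, 0.4, 0.4)
    print(time_to_balance(*args))
    print(simulate(*args))
```

The closing rate does not depend on the size of the gap, so the gap shrinks linearly rather than exponentially and reaches zero after a finite time.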
Because the load gap shrinks at a rate bounded below, after a finite hydrodynamic time (corresponding to some time of order $r^{-1/2}$ under the diffusion scaling) we will arrive at $\mathcal M$; and since hydrodynamic models which start in $\mathcal M$ cannot leave it, we will remain on $\mathcal M$ for the remainder of the time interval $[0, T]$. Thus, for large $r$, the $r$th diffusion-scaled system is very close to $\mathcal M$ on the time interval $[T_1 r^{-1/2}, T]$ for some fixed, finite $T_1$; and therefore, the limiting process will stay on $\mathcal M$ during $[\eta, T]$ for any $\eta > 0$.

This concludes the sketch of proof that the diffusion-scaled process converges to a continuous process satisfying the SDE (30). Properties of such SDEs (in particular, uniqueness of solutions) can be found in [Karatzas and Shreve, 1996, Chapter 5]. □

$^{10}$ This is sufficient only if we assume the existence of the limiting process. To prove the existence, we require the processes involved to be stochastically bounded; details can be found in [Gurvich and Whitt, 2009].

$^{11}$ The term "hydrodynamic" is not used in any technical sense; this is simply another version of "fluid-like" scaling, in which the system converges to a nearly-deterministic, rather than stochastic, process. This scaling regime has features which distinguish it both from the ordinary fluid and from the local fluid limits, meriting a separate name.

The meaning of Theorem 2.56 is simple: the diffusion limit of the process $\hat\Psi^r_{\mathcal E}(\cdot)$ is such that, at initial time 0, it "instantly jumps" to the state $\hat\Psi_{\mathcal E}(0) \equiv MM'\hat\Psi_{\mathcal E}$ on the manifold $\mathcal M$ (where $\hat\Psi_{\mathcal E}(0) = \lim_{r\to\infty}\hat\Psi^r_{\mathcal E}(0) \equiv \hat\Psi_{\mathcal E}$ holds only if $\hat\Psi_{\mathcal E} \in \mathcal M$); after this initial jump, the process stays on $\mathcal M$ and evolves according to the stochastic differential equation (30). Our proofs of diffusion-scaled instability in §7.2 will rely on the analysis of this SDE.

Similar in spirit, but somewhat more involved in the details of proof, is the following result, which holds in the Halfin-Whitt regime. Let $\pi$ be the orthogonal projection along $(1, \ldots, 1)^\top$, and define the map $F$ as follows: for $y \in \mathbb{R}^I$, set
$$F(y) = \begin{cases} \pi(y), & \sum_i y_i > 0, \\ y, & \sum_i y_i \le 0. \end{cases} \tag{32}$$

Theorem 2.57 (Corollary of [Gurvich and Whitt, 2009, Theorem 3.1 and Theorem 4.4]). For a family of systems in the Halfin-Whitt regime (Definition 2.55), assume that
$$\hat X^r_{\mathcal I}(0) \to \hat X_{\mathcal I}(0), \qquad \hat\Psi^r_{\mathcal E}(0) \to \hat\Psi_{\mathcal E} \tag{33}$$
where $\hat X_{\mathcal I}(0)$ and $\hat\Psi_{\mathcal E}$ are deterministic and finite. Then,
$$\hat X^r_{\mathcal I}(\cdot) \Longrightarrow \hat X_{\mathcal I}(\cdot) \quad\text{in } D^I[0,\infty), \tag{34}$$
and for any fixed $\eta > 0$,
$$\hat\Psi^r_{\mathcal E}(\cdot) \Longrightarrow MF(\hat X_{\mathcal I}(\cdot)) \quad\text{in } D^{I+J-1}[\eta,\infty), \tag{35}$$
where $\hat X_{\mathcal I}(\cdot)$ is the unique (possibly weak) solution of the stochastic differential equation
$$\hat X_{\mathcal I}(t) = \hat X_{\mathcal I}(0) + \int_0^t A_u F(\hat X_{\mathcal I}(s))\,ds + \big(\sqrt{2\lambda_i}\,B_i(t)\big)_{i\in\mathcal I}, \tag{36}$$
and the processes $B_i(\cdot)$ are independent standard Brownian motions.

We refer to Gurvich and Whitt [2009] for the details of proof.

7.2. Steady state behaviour of locally unstable systems: evanescence of invariant measures in underload. In this section we show that if the matrix $A_u$ has eigenvalues with positive real part, then the stationary distribution of the (diffusion-scaled) process $\hat\Psi^r_{\mathcal E}(\cdot)$ escapes to infinity as $r \to \infty$. We will rely on the results of Theorem 2.56.

Theorem 2.58. Consider a set of parameters $\lambda_{\mathcal I}$, $\beta_{\mathcal J}$, $\mu_{\mathcal E}$ such that Assumption 2.4 holds, and $\rho < 1$. Consider a sequence of systems with arrival rates $\lambda^r_{\mathcal I} \equiv r\lambda_{\mathcal I}$, server pool sizes $\beta^r_{\mathcal J} \equiv r\beta_{\mathcal J}$, and unscaled service rates $\mu_{\mathcal E}$. Denote by $\mathbb{M}^r$ the stationary distribution of the process $\hat\Psi^r_{\mathcal E}(\cdot)$, a probability measure on $\mathbb{R}^{I+J-1}$. Let $b_K = \{z : |z| \le K\} \subset \mathbb{R}^{I+J-1}$ be the ball of radius $K$ in $\mathbb{R}^{I+J-1}$. Suppose the matrix $A_u$ defined in Lemma 2.23 has at least one eigenvalue with positive real part, and no purely imaginary eigenvalues.$^{12}$ Then for any $K$, $\mathbb{M}^r(b_K) \to 0$ as $r \to \infty$.

Before we proceed with the proof, let us introduce more notation and one auxiliary result. Recall that, in fluid models, after a finite time the difference $\psi_{\mathcal E}(t) - \psi^*_{\mathcal E}$ lives on the manifold $\mathcal M$ defined by (10), and satisfies (in the vicinity of the equilibrium point) the linear ordinary differential equation
$$\dot z = (MA_uM')z, \qquad z \in \mathcal M. \tag{37}$$
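The spectral hypothesis of Theorem 2.58 is easy to test numerically once a matrix is given. The sketch below does so for a $2\times 2$ real matrix using only the quadratic formula. The matrices in the example are hypothetical stand-ins, not the thesis's $A_u$ (whose definition in (12) is not reproduced here), and the choice to treat any eigenvalue with zero real part, including $0$ itself, as violating the hypothesis is an assumption of this sketch.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a real 2x2 matrix ((p, q), (s, t)) via the quadratic formula."""
    (p, q), (s, t) = a
    trace, det = p + t, p * t - q * s
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def satisfies_instability_hypothesis(a, tol=1e-12):
    """True if some eigenvalue has positive real part and none lies on the
    imaginary axis (assumption: zero real part, including 0, counts as lying on it)."""
    eig = eigenvalues_2x2(a)
    has_unstable = any(z.real > tol for z in eig)
    on_axis = any(abs(z.real) <= tol for z in eig)
    return has_unstable and not on_axis


if __name__ == "__main__":
    examples = {
        "saddle": ((1.0, 0.0), (0.0, -1.0)),
        "stable node": ((-1.0, 0.0), (0.0, -2.0)),
        "centre": ((0.0, 1.0), (-1.0, 0.0)),
        "unstable spiral": ((0.1, 1.0), (-1.0, 0.1)),
    }
    for name, a in examples.items():
        print(name, eigenvalues_2x2(a), satisfies_instability_hypothesis(a))
```

For larger matrices one would use a numerical linear algebra library instead; the closed form is used here only to keep the sketch dependency-free.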
$^{12}$ The requirement of "no purely imaginary eigenvalues" is made for convenience of differentiating between strict convergence and strict divergence. It holds for generic values of $\beta_{\mathcal J}$, $\mu_{\mathcal E}$: that is, any set of values $\beta_{\mathcal J}$, $\mu_{\mathcal E}$ has an arbitrarily small perturbation $\tilde\beta_{\mathcal J}$, $\tilde\mu_{\mathcal E}$ for which the corresponding matrix $\tilde A_u$ has no purely imaginary eigenvalues.

Let $\mathcal C \subset \mathcal M$ denote the submanifold of convergence of the ODE (37); that is,
$$\mathcal C = \{z(0) \in \mathcal M : z(t) \to 0 \text{ as } t \to \infty\}.$$
We can equivalently define $\mathcal C = M\mathcal C_{\mathcal I}$, where $\mathcal C_{\mathcal I}$ is the submanifold of convergence of the linear ODE
$$\dot y = A_u y, \qquad y \in \mathbb{R}^I;$$
this ODE describes the evolution of the fluid model quantity $\psi_{\mathcal I}(t) - \psi^*_{\mathcal I}$ near the equilibrium point.

Given the assumptions of the theorem on $A_u$, the solutions to (37) converge to 0 exponentially fast if $z(0) \in \mathcal C$, which is a submanifold of positive codimension. Solutions started from points $z(0) \in \mathcal M \setminus \mathcal C$ diverge to infinity exponentially quickly.$^{13}$

We will write
$$b_K(\delta_1, \delta_2) \equiv b_K \cap \{z : d(z, \mathcal M) \le \delta_1,\ d(z, \mathcal C) \ge \delta_2\},$$
where $d(\cdot, \cdot)$ is Euclidean distance.

Lemma 2.59. Solutions of the stochastic differential equation (30) have the following properties.
(1) For any $T > 0$ and any $\hat\Psi_{\mathcal I}(0)$,
$$P\{M\hat\Psi_{\mathcal I}(T) \in \mathcal M \setminus \mathcal C\} = 1;$$
(2) For any $K > 0$, $\delta_2 > 0$ and $\epsilon > 0$, there exist sufficiently large $T_K$ and $K' > K$, such that, uniformly over initial states with $M\hat\Psi_{\mathcal I}(0) \in b_K(0, \delta_2)$,
$$P\{M\hat\Psi_{\mathcal I}(T_K) \in b_{K'} \setminus b_{2K}\} \ge 1 - \epsilon.$$

Proof. Statement (1) follows from the fact that, regardless of the (deterministic) initial state $\hat\Psi_{\mathcal I}(0)$, the solution to SDE (30) is such that the distribution of $\hat\Psi_{\mathcal I}(T)$ is Gaussian with non-singular covariance matrix. (See [Karatzas and Shreve, 1996, Section 5.6]. In our case the matrix of diffusion coefficients is diagonal with entries $\sqrt{2\lambda_i}$.) Consequently, the probability that the state belongs to the submanifold $\mathcal C$ of positive codimension is 0.

Statement (2) follows from the fact [Karatzas and Shreve, 1996, Section 5.6] that the expectation $m(t) = E\hat\Psi_{\mathcal I}(t)$ evolves according to the ODE
$$\dot m(t) = A_u m(t).$$
Since $d(M\hat\Psi_{\mathcal I}(0), \mathcal C) \ge \delta_2$ (and thus $\hat\Psi_{\mathcal I}(0)$ is also separated by a positive distance from $\mathcal C_{\mathcal I}$), we have $|m(t) - m(0)| \ge a_1\exp(at)$ for some fixed $a_1, a > 0$ and all large $t$. It is easy to see that if the mean of a Gaussian distribution goes to infinity, then (regardless of how the covariance matrix evolves) the measure of any bounded set goes to zero; so, choosing $T_K$ sufficiently large, we will have arbitrarily high probability of leaving the set $b_{2K}$. On the other hand, both $m(t)$ and the covariance matrix remain bounded for all $t \in [0, T_K]$, for any $T_K$; so $K'$ can always be chosen sufficiently large so that $P\{M\hat\Psi_{\mathcal I}(T_K) \in b_{K'}\}$ is arbitrarily close to 1. □

We are now in position to prove Theorem 2.58.

$^{13}$ If $A_u$ has purely imaginary eigenvalues, there is a further submanifold $\tilde{\mathcal C}$ on which solutions orbit around the equilibrium point. The important thing for us is that the set of initial conditions for which solutions do not diverge to infinity is a submanifold of $\mathcal M$ with positive codimension, hence of measure 0.

Proof of Theorem 2.58. We will treat $\mathbb{M}^r$ as measures on the one-point compactification $\overline{\mathbb{R}^n} = \mathbb{R}^n \cup \{*\}$ of the space $\mathbb{R}^n$, where $n = I + J - 1$. In this space, any subsequence of $\{\mathbb{M}^r\}$ has a further subsequence along which $\mathbb{M}^r \xrightarrow{w} \mathbb{M}$ for some probability measure $\mathbb{M}$ on $\overline{\mathbb{R}^n}$. We will show that the entire measure $\mathbb{M}$ is concentrated on the infinity point $*$, i.e. $\mathbb{M}(\mathbb{R}^n) = 0$. Suppose not, i.e. $\mathbb{M}(\mathbb{R}^n) > 0$. The proof proceeds in two steps.

Step 1. We prove that $\mathbb{M}(\mathbb{R}^n) = \mathbb{M}(\mathcal M \setminus \mathcal C)$. Indeed, let us choose any $\epsilon > 0$, and $K$ large enough so that $\mathbb{M}(b_{K/2}) > (1 - \epsilon)\mathbb{M}(\mathbb{R}^n)$. Then, for all large $r$, $\mathbb{M}^r(b_K) > (1 - \epsilon)\mathbb{M}(\mathbb{R}^n)$. Choose $\delta_1 > 0$ and $T > 0$ arbitrary. By Lemma 2.59, we can choose a sufficiently small $\delta_2 > 0$ and a sufficiently large $K'$ such that, uniformly on the initial states $\hat\Psi^r_{\mathcal E}(0) \in b_K$,
$$\liminf_{r\to\infty} P\{\hat\Psi^r_{\mathcal E}(T) \in b_{K'}(\delta_1, \delta_2)\} > 1 - \epsilon.$$
This implies that for all large $r$,
$$\mathbb{M}^r(b_{K'}(\delta_1, \delta_2)) > (1 - \epsilon)^2\,\mathbb{M}(\mathbb{R}^n),$$
and then $\mathbb{M}(b_{K'}(\delta_1, \delta_2)) \ge (1 - \epsilon)^2\,\mathbb{M}(\mathbb{R}^n)$.
Since $\epsilon$ and $\delta_1$ were arbitrary, we conclude that $\mathbb{M}(\mathbb{R}^n) \le \mathbb{M}(\mathcal M \setminus \mathcal C)$, and then, obviously, equality must hold.

Step 2. We show that, for arbitrarily large $K > 0$, $\mathbb{M}(\mathbb{R}^n \setminus b_K) = \mathbb{M}(\mathbb{R}^n)$. (This is, of course, impossible when $\mathbb{M}(\mathbb{R}^n) > 0$, and thus we obtain a contradiction.) It suffices to show that for any $\epsilon > 0$, we can choose a sufficiently large $K$, such that $\mathbb{M}(\mathbb{R}^n \setminus b_K) \ge (1 - \epsilon)^2\,\mathbb{M}(\mathbb{R}^n)$. Let us choose (using Step 1) a large $K$ and a small $\delta_2 > 0$, such that
$$\mathbb{M}(b_{K/2}(\delta_1/2, 2\delta_2)) > (1 - \epsilon)\,\mathbb{M}(\mathbb{R}^n)$$
for any $\delta_1 > 0$. Then, for any fixed $\delta_1 > 0$, for all large $r$,
$$\mathbb{M}^r(b_K(\delta_1, \delta_2)) > (1 - \epsilon)\,\mathbb{M}(\mathbb{R}^n).$$
Now, using Lemma 2.59(2), we can choose $K'$ and $T_K$ sufficiently large, and then $\delta_1$ sufficiently small, so that, uniformly on the initial states $\hat\Psi^r_{\mathcal E}(0) \in b_K(\delta_1, \delta_2)$,
$$\liminf_{r\to\infty} P\{\hat\Psi^r_{\mathcal E}(T_K) \in b_{K'} \setminus b_{2K}\} \ge 1 - \epsilon.$$
Therefore, $\mathbb{M}^r(b_{K'} \setminus b_{2K}) > (1 - \epsilon)^2\,\mathbb{M}(\mathbb{R}^n)$ for all large $r$, and then for the limiting measure $\mathbb{M}$ we must have $\mathbb{M}(\mathbb{R}^n \setminus b_K) \ge (1 - \epsilon)^2\,\mathbb{M}(\mathbb{R}^n)$. □

7.3. Steady state behaviour of locally unstable systems: evanescence of invariant measures in Halfin-Whitt regime. In this section we show that for any system satisfying the conditions of Proposition 2.44, considered in the Halfin-Whitt asymptotic regime, the stationary distributions of $\hat X^r_{\mathcal I}(\cdot)$ and of $\hat\Psi^r_{\mathcal E}(\cdot)$ escape to infinity as $r \to \infty$. We will rely on the results of Theorem 2.57.

Theorem 2.60. Consider a set of parameters $\lambda_{\mathcal I}$, $\beta_{\mathcal J}$, $\mu_{\mathcal E}$ satisfying the conditions of Proposition 2.44. Consider a sequence of systems in the Halfin-Whitt asymptotic regime (Definition 2.55). Denote by $\mathbb{M}^r$ the stationary distribution of the process $\hat X^r_{\mathcal I}(\cdot)$, a probability measure on $\mathbb{R}^I$. Let $b_K = \{z : |z| \le K\} \subset \mathbb{R}^I$ be the ball of radius $K$ in $\mathbb{R}^I$. Then for any $K$, $\mathbb{M}^r(b_K) \to 0$ as $r \to \infty$.

Proof. We will show the result for the projection $\pi\hat X_{\mathcal I}$ of the limiting state $\hat X_{\mathcal I}$ onto the subspace $\mathcal N = \{z \in \mathbb{R}^I : \sum_i z_i = 0\}$. Since $(1, \ldots, 1)^\top$ is an eigenvector of $A_u$ with a real eigenvalue, we have $\pi A_u y = \pi A_u \pi y$ for all $y \in \mathbb{R}^I$. (Recall $A_c = \pi A_u$ as transformations by Lemma 2.24.)
Therefore, for the map $F$ defined in (32), we have
$$\pi A_u F(y) = \pi A_u \pi y = A_c \pi y$$
for all $y \in \mathbb{R}^I$. From this and Theorem 2.57 we see that $\pi\hat X_{\mathcal I}$ satisfies the linear SDE
$$\pi\hat X_{\mathcal I}(t) = \pi\hat X_{\mathcal I}(0) + \int_0^t A_c\big(\pi\hat X_{\mathcal I}(s)\big)\,ds + \pi\big(\sqrt{2\lambda_i}\,B_i(t)\big)_{i\in\mathcal I}. \tag{38}$$
The argument now proceeds as in the proof of Theorem 2.58, using the unstable ODE
$$\dot z = (MA_cM')z, \qquad z \in \mathcal M \cap \mathcal N$$
in the place of (37). (The entire analysis will proceed in the $(I-1)$-dimensional space $\mathcal N$.) □

We conjecture that a stronger result holds:

Conjecture 2.61. Consider a set of parameters $\lambda_{\mathcal I}$, $\beta_{\mathcal J}$, $\mu_{\mathcal E}$ satisfying the conditions of Theorem 2.58, respectively 2.60. Consider a sequence of systems in underload, respectively the Halfin-Whitt asymptotic regime. For $0 \le \epsilon < \frac12$, denote by $\mathbb{M}^r$ the stationary distribution of the process $r^{-(1/2+\epsilon)}(X^r_{\mathcal I}(\cdot) - rx^*_{\mathcal I})$, a probability measure on $\mathbb{R}^I$. Let $b_K = \{z : |z| \le K\} \subset \mathbb{R}^I$ be the ball of radius $K$ in $\mathbb{R}^I$. Then for any $K$, $\mathbb{M}^r(b_K) \to 0$ as $r \to \infty$.

That is, we suspect that the limiting measure is non-tight on all scales strictly smaller than fluid (the fluid scale corresponds to $\epsilon = \frac12$ above).

7.4. Diffusion scale tightness of stationary distributions for the case when service rate depends on the server type only. In this section we consider a special case when there exists a set of positive rates $\mu_{\mathcal J}$, such that $\mu_{ij} = \mu_j$ for all $(ij) \in \mathcal E$. We demonstrate tightness of invariant distributions of the diffusion-scaled process, assuming the system is critically loaded on the fluid scale, i.e. $\rho = 1$. (An analogous result holds for the underloaded system, but critical load is typically more relevant in applications.) This, in combination with the transient diffusion limit results, allows us to claim that the limit of invariant distributions is the invariant distribution of the limiting diffusion process.

We will work in the Halfin-Whitt asymptotic regime specified by Definition 2.55.
As noted after the definition, we use $\hat Q^r_i(t) = Q^r_i(t)/\sqrt r$, $Z^r_j(t) = \Psi^r_j(t) - r\beta_j$, $\hat Z^r_j(t) = Z^r_j(t)/\sqrt r$; here, $\hat Z^r_j(t)$ measures the deviation of pool-$j$ occupancy from full occupancy $r\beta_j$, rather than from the equilibrium value for the $r$th network, $\rho^r r\beta_j$. Our choice of signs is such that $\hat Q^r_i \ge 0$ while $\hat Z^r_j \le 0$.

Theorem 2.62. Suppose $\mu_{ij} = \mu_j$, $(ij) \in \mathcal E$, and $\rho = 1$. Consider a system under the LQFS-LB rule in the Halfin-Whitt asymptotic regime (Definition 2.55). Then, for any real
$$\theta < \theta_0 := \frac{2\min_i \lambda_i}{\sum_i \lambda_i + (\max_j \mu_j)\sum_j \beta_j},$$
the stationary distributions are such that
$$\limsup_{r\to\infty} E\Big[\sum_i \exp(\theta\hat Q^r_i) + \sum_j \beta_j\exp(\theta\hat Z^r_j/\beta_j)\Big] < \infty.$$

Proof. Note that the statement is trivial for $\theta = 0$. Also, for $\theta > 0$ each term $\exp(\theta\hat Z^r_j/\beta_j)$ is bounded and so has finite expectation, while for $\theta < 0$ each term $\exp(\theta\hat Q^r_i)$ is bounded and so has finite expectation. Our method is based on that in [Gamarnik and Stolyar, 2012]. (The exposition below is self-contained.)

Step 1: preliminary bounds. Consider the embedded Markov chain taken at the instants of (say, right after) the transitions. We will use uniformisation. That is, we keep the total rate of all transitions from any state constant at
$$\alpha^r r = \sum_i \lambda^r_i + \sum_j r\beta_j\mu^*, \qquad \mu^* \equiv \max_j \mu_j;$$
note that, as $r \to \infty$, $\alpha^r \to \alpha^* = \sum_i \lambda_i + \sum_j \beta_j\mu^*$.$^{14}$ The transitions are of three types: arrivals, departures, and virtual transitions, which do not change the state of the system. The rate of a transition due to a type $i$ arrival is $\lambda^r_i$; for the service completion at pool $j$ the rate is $\mu_j(r\beta_j + Z^r_j)$ (recall $Z^r_j \le 0$); and a virtual transition occurs at the complementary rate $\alpha^r r - \sum_i \lambda^r_i - \sum_j \mu_j(r\beta_j + Z^r_j)$. (The probability that a transition occurring at a transition instant has a given type is the ratio of the corresponding rate and $\alpha^r r$.) The stationary distribution of the embedded, uniformised Markov chain is the same as that of the original, continuous-time chain. In the rest of the proof, $\tau \in \{0, 1, 2, \ldots\}$ refers to the discrete time of the embedded Markov chain.

We will work with the following Lyapunov function:
$$L(\tau) := \sum_i \exp(\theta\hat Q^r_i(\tau)) + \sum_j \beta_j\exp(\theta\hat Z^r_j(\tau)/\beta_j). \tag{39}$$
Throughout, we use the bound
$$\exp(\theta y) \le \exp(\theta x)\Big(1 + \theta(y - x) + \tfrac12\theta^2(y - x)^2\exp(|\theta|\,|y - x|)\Big) \tag{40}$$
which arises from the second-order Taylor expansion of $\exp(\theta y)$ around $x$.

A priori we do not know that $E[L(\tau)]$ exists for $\theta > 0$. Indeed, while $\hat Z^r_j(t)$ is bounded for any $r$ (above by 0 and below by $-\beta_j\sqrt r$), the scaled queue size $\hat Q^r_i(t)$ is unbounded. To deal with this, we also consider the truncated Lyapunov function $L_K = \min\{L, K\}$.

In the equation below, let $x$ denote the variable of interest (either $\hat Q^r_i$ or $\hat Z^r_j/\beta_j$), and let $S(\tau)$ denote the state of the embedded Markov chain at time $\tau$. From (40) we obtain
$$E[\exp(\theta x(\tau+1)) - \exp(\theta x(\tau)) \mid S(\tau)] \le \exp(\theta x(\tau))\Big(\theta E[x(\tau+1) - x(\tau) \mid S(\tau)] + \tfrac12\theta^2 E\big[(x(\tau+1) - x(\tau))^2\exp(|\theta|\,|x(\tau+1) - x(\tau)|) \mid S(\tau)\big]\Big).$$
Since for both $\hat Z^r_j$ and $\hat Q^r_i$ the change in a single transition is bounded by $1/\sqrt r$, we conclude:
$$E[\exp(\theta\hat Q^r_i(\tau+1)) - \exp(\theta\hat Q^r_i(\tau)) \mid S(\tau)] \le \exp(\theta\hat Q^r_i(\tau))\Big(\theta E[\hat Q^r_i(\tau+1) - \hat Q^r_i(\tau) \mid S(\tau)] + \Big(\tfrac12\theta^2\exp(|\theta|/\sqrt r)\Big)\frac1r\Big), \tag{41a}$$
$$E[\beta_j\exp(\theta\hat Z^r_j(\tau+1)/\beta_j) - \beta_j\exp(\theta\hat Z^r_j(\tau)/\beta_j) \mid S(\tau)] \le \exp(\theta\hat Z^r_j(\tau)/\beta_j)\Big(\theta E[\hat Z^r_j(\tau+1) - \hat Z^r_j(\tau) \mid S(\tau)] + \Big(\frac{1}{\beta_j}\tfrac12\theta^2\exp(|\theta|/\sqrt r)\Big)\frac1r\Big). \tag{41b}$$
Clearly, as long as values of $\theta$ are bounded, for any fixed $C_2 > 1$ and all sufficiently large $r$ (depending on $C_2$), the second summands in (41a) and (41b) are bounded above by $C_2\tfrac12\theta^2\frac1r$ and $\frac{1}{\beta_*}C_2\tfrac12\theta^2\frac1r$, respectively, where $\beta_* = \min_j \beta_j$. Note that the second bound is independent of $j$.

$^{14}$ This use of $\alpha^r$ and $\alpha^*$ is unrelated to the rate or processing workload $\alpha_j$ defined in (2).
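The Taylor bound used in Step 1 can be checked numerically. The sketch below evaluates both sides for jumps of size $1/\sqrt r$ and a few positive and negative values of $\theta$. It uses $|\theta|$ inside the remainder's exponential so that one formula covers both signs; for $\theta > 0$ this is the same as $\exp(\theta|y - x|)$. The function names and the grid of test values are assumptions made for illustration.

```python
import math


def taylor_upper_bound(theta, x, y):
    """Second-order upper bound on exp(theta * y), expanded around x."""
    d = y - x
    remainder = 0.5 * theta ** 2 * d ** 2 * math.exp(abs(theta) * abs(d))
    return math.exp(theta * x) * (1.0 + theta * d + remainder)


def bound_holds(theta, x, y):
    """Check exp(theta * y) <= bound, allowing a tiny relative rounding slack."""
    return math.exp(theta * y) <= taylor_upper_bound(theta, x, y) * (1.0 + 1e-12)


def remainder_inflation(theta, r):
    """Factor exp(|theta| / sqrt(r)) that multiplies theta^2 / (2 r) after one
    jump of size 1 / sqrt(r); it tends to 1 as r grows."""
    return math.exp(abs(theta) / math.sqrt(r))


if __name__ == "__main__":
    ok = all(
        bound_holds(theta, x, x + sign / math.sqrt(r))
        for theta in (-2.0, -0.5, 0.5, 2.0)
        for x in (-3.0, 0.0, 3.0)
        for r in (1, 100, 10 ** 6)
        for sign in (-1.0, 1.0)
    )
    print("bound holds on the grid:", ok)
    print("inflation factors:", [remainder_inflation(1.0, r) for r in (1, 100, 10 ** 6)])
```

Because the inflation factor tends to 1, any constant $C_2 > 1$ eventually dominates it, which is how the $r$-dependent exponential is absorbed into $C_2$.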
Substituting these bounds into (41a) and (41b), we obtain
$$E[\exp(\theta\hat Q^r_i(\tau+1)) - \exp(\theta\hat Q^r_i(\tau)) \mid S(\tau)] \le \exp(\theta\hat Q^r_i(\tau))\Big(\theta E[\hat Q^r_i(\tau+1) - \hat Q^r_i(\tau) \mid S(\tau)] + C_2\tfrac12\theta^2\frac1r\Big), \tag{42a}$$
$$E[\beta_j\exp(\theta\hat Z^r_j(\tau+1)/\beta_j) - \beta_j\exp(\theta\hat Z^r_j(\tau)/\beta_j) \mid S(\tau)] \le \exp(\theta\hat Z^r_j(\tau)/\beta_j)\Big(\theta E[\hat Z^r_j(\tau+1) - \hat Z^r_j(\tau) \mid S(\tau)] + \frac{1}{\beta_*}C_2\tfrac12\theta^2\frac1r\Big). \tag{42b}$$

Next, we will obtain an upper bound on the drift $E[L(\tau+1) - L(\tau) \mid S(\tau)]$. To do that, we introduce an artificial scheduling/routing rule, which acts only within one time step, and is such that the increment $L(\tau+1) - L(\tau)$ under this rule is "almost" a (pathwise, w.p.1) upper bound on this increment under the actual rule, LQFS-LB. (It is important to keep in mind that the artificial rule is not a rule that is applied continuously. It is limited to one time step, and its sole purpose is to derive a pathwise upper bound on the increment $L(\tau+1) - L(\tau)$ within one time step.)

Step 2: Artificial scheduling/routing rule. We will use the following notation:
$$\mathcal I_+ = \mathcal I_+(\tau) \equiv \{i : \hat Q^r_i(\tau) > 0\}, \qquad \mathcal I_0 = \mathcal I_0(\tau) \equiv \{i : \hat Q^r_i(\tau) = 0\},$$
$$\mathcal J_- = \mathcal J_-(\tau) \equiv \{j : \hat Z^r_j(\tau) < 0\}, \qquad \mathcal J_0 = \mathcal J_0(\tau) \equiv \{j : \hat Z^r_j(\tau) = 0\}.$$

Artificial scheduling: Departures from servers $j \in \mathcal J_-$ are processed normally, i.e. reduce the corresponding $Z^r_j(\tau)$ by 1. Whenever there is a departure from a server pool $j \in \mathcal J_0$, the server picks a customer type $i$ with nominal probability $\lambda^r_{ij}/\sum_k \lambda^r_{kj}$. If the chosen $i$ is one of the types in $\mathcal I_+$, then we keep $Z^r_j(\tau+1) = 0$ and reduce $Q^r_i(\tau+1) = Q^r_i(\tau) - 1$. However, if $i \in \mathcal I_0$, i.e. $Q^r_i(\tau) = 0$, then we keep $Q^r_i(\tau+1) = Q^r_i(\tau) = 0$ and instead allow $Z^r_j(\tau+1) = -1$.

Artificial routing: Arrivals to customer types $i \in \mathcal I_+$ are processed normally, i.e. increase the corresponding $Q^r_i(\tau)$ by 1. Whenever there is an arrival to a customer type $i \in \mathcal I_0$, the customer picks a server type $j$ with nominal probability $\lambda^r_{ij}/\lambda^r_i$. If the chosen $j$ is one of the types in $\mathcal J_-$, then we keep $Q^r_i(\tau+1) = Q^r_i(\tau) = 0$ and increase $Z^r_j(\tau+1) = Z^r_j(\tau) + 1$. However, if the chosen $j \in \mathcal J_0$, i.e. $Z^r_j(\tau) = 0$, then we keep $Z^r_j(\tau+1) = Z^r_j(\tau) = 0$ and instead allow $Q^r_i(\tau+1) = 1$.

Step 3: One time-step drift under the artificial rule. For $i \in \mathcal I_+$,
$$E[\hat Q^r_i(\tau+1) - \hat Q^r_i(\tau) \mid S(\tau)] = \frac{1}{\alpha^r r}\frac{1}{\sqrt r}\Big(\lambda^r_i - \sum_j (\mu_j r\beta_j)\frac{\lambda^r_{ij}}{\sum_k \lambda^r_{kj}}\Big).$$
Recalling that
$$\sum_k \lambda^r_{kj} = \mu_j\beta_j r\rho^r = \mu_j\beta_j r(1 - C/\sqrt r), \tag{43}$$
we obtain
$$E[\hat Q^r_i(\tau+1) - \hat Q^r_i(\tau) \mid S(\tau)] = -\frac{C\lambda_i}{\alpha^*}\,\frac{1 + o(1)}{r}, \qquad i \in \mathcal I_+, \tag{44}$$
where $o(1)$ is a fixed function, vanishing as $r \to \infty$.

If $\hat Q^r_i(\tau) = 0$ (i.e. $i \in \mathcal I_0$), and a new arrival of type $i$ picks a server pool $j$ which has idle servers (i.e. $j \in \mathcal J_-$), then $\hat Q^r_i$ stays at 0 and $\hat Q^r_i(\tau+1) - \hat Q^r_i(\tau) = 0$. However, if a new type $i$ arrival picks some server pool $j \in \mathcal J_0$ which has no available idle servers, then (by the definition of the artificial rule) $\hat Q^r_i(\tau+1) - \hat Q^r_i(\tau) = \hat Q^r_i(\tau+1) = 1/\sqrt r$. Thus, we can write:
$$E[\hat Q^r_i(\tau+1) - \hat Q^r_i(\tau) \mid S(\tau)] = \sum_{j\in\mathcal J_0} \frac{\lambda^r_{ij}}{\alpha^r r}\frac{1}{\sqrt r}, \qquad i \in \mathcal I_0. \tag{45}$$
The right-hand side of (45) is of order $1/\sqrt r$. This may be alarming, because the time step is of order $1/r$, and we would like to avoid large jumps in a single time step. However, we will see shortly that the order $1/\sqrt r$ terms in $E[L(\tau+1) - L(\tau) \mid S(\tau)]$ cancel out, and the expected drift is in fact of order $1/r$ (the same order of magnitude as the time step).

The treatment of the drift of $\hat Z^r_j$ is similar (and again makes use of (43)). For $j \in \mathcal J_-$,
$$E[\hat Z^r_j(\tau+1) - \hat Z^r_j(\tau) \mid S(\tau)] = -\frac{1}{\alpha^r}\mu_j\big(\hat Z^r_j(\tau) + \beta_j C\big)\frac1r, \qquad j \in \mathcal J_-, \tag{46}$$
and for $j \in \mathcal J_0$,
$$E[\hat Z^r_j(\tau+1) - \hat Z^r_j(\tau) \mid S(\tau)] = -\frac{1}{\sqrt r}\sum_{i\in\mathcal I_0}\frac{r\mu_j\beta_j}{\alpha^r r}\frac{\lambda^r_{ij}}{\sum_k \lambda^r_{kj}} = -\frac{1}{1 - C/\sqrt r}\sum_{i\in\mathcal I_0}\frac{\lambda^r_{ij}}{\alpha^r r}\frac{1}{\sqrt r}, \qquad j \in \mathcal J_0. \tag{47}$$
We can rewrite (47) as
$$E[\hat Z^r_j(\tau+1) - \hat Z^r_j(\tau) \mid S(\tau)] = -\sum_{i\in\mathcal I_0}\frac{\lambda^r_{ij}}{\alpha^r r}\frac{1}{\sqrt r} - C\sum_{i\in\mathcal I_0}\frac{\lambda_{ij}}{\alpha^*}\,\frac{1 + o(1)}{r}, \qquad j \in \mathcal J_0, \tag{48}$$
where $o(1)$ is a fixed function, vanishing as $r \to \infty$.

Note that if $L(\tau) \ge K$ then $L_K$ cannot increase over the next time step. The drift of $L_K(\tau)$ starting from a value $L(\tau) < K$ is no greater than if we allowed the transitions that increase $L(\tau)$ above $K$.
Putting together this observation and equations (42), (44)–(48), we obtain
\begin{align}
E[L_K(\tau+1) - L_K(\tau) \mid S(\tau)] \le{} & \tag{49a}\\
\mathbf{1}_{\{L(\tau)\le K\}}\Big( & \sum_{i\in\mathcal I_+} \exp(\theta\hat Q^r_i(\tau))\,\theta\Big(-\frac{C\lambda_i(1 + o(1))}{\alpha^*}\Big)\frac1r \tag{49b}\\
& + \sum_{i\in\mathcal I_0,\, j\in\mathcal J_0} \theta\lambda^r_{ij}\frac{1}{\alpha^r r}\frac{1}{\sqrt r} \tag{49c}\\
& + \sum_{j\in\mathcal J_-} \exp(\theta\hat Z^r_j(\tau)/\beta_j)\,\theta\Big(-\frac{\mu_j}{\alpha^r}\Big)\big(\hat Z^r_j(\tau) + \beta_j C\big)\frac1r \tag{49d}\\
& + \sum_{j\in\mathcal J_0,\, i\in\mathcal I_0} \theta\Big(-\lambda^r_{ij}\frac{1}{\alpha^r r}\frac{1}{\sqrt r} - \frac{C\lambda_{ij}(1 + o(1))}{\alpha^*}\frac1r\Big) \tag{49e}\\
& + \sum_{i\in\mathcal I} \exp(\theta\hat Q^r_i(\tau))\Big(\frac{C_2}{2}\theta^2\Big)\frac1r \tag{49f}\\
& + \sum_{j\in\mathcal J} \frac{1}{\beta_*}\exp(\theta\hat Z^r_j(\tau)/\beta_j)\Big(\frac{C_2}{2}\theta^2\Big)\frac1r\Big). \tag{49g}
\end{align}
(There is no exponential in (49c) and (49e) because by assumption the relevant queues, respectively idlenesses, are equal to zero.) Note that the $O(1/\sqrt r)$ terms in (49c) and (49e) cancel each other as promised, so there will be no $O(1/\sqrt r)$ terms in the final bound. We will show that this bound is in fact negative later, in Step 5.

Step 4: One time-step drift under the LQFS-LB rule. We now explain in what sense the increment $L(\tau+1) - L(\tau)$ under the artificial rule is "almost" an upper bound on this increment under LQFS-LB.

To illustrate the idea, suppose first that all $\beta_j$ are equal. Then, as we will now show, the routing or scheduling decision made by LQFS-LB at time step $\tau$ will have a smaller increment $L(\tau+1) - L(\tau)$ than the artificial rule (with probability 1). Suppose the decision is associated with scheduling a customer from a queue after a service completion at server $j \in \mathcal J_0$. (After service completion at a server $j \in \mathcal J_-$ the two rules behave identically.) Suppose first that LQFS-LB schedules a customer from queue $i$, while the artificial policy attempts to schedule a customer from queue $i'$. Then by definition of LQFS-LB, $\hat Q^r_i \ge \hat Q^r_{i'}$, so the one-step increment $L(\tau+1) - L(\tau)$ is smaller for LQFS-LB. If the artificial rule chooses $i'$ with $\hat Q^r_{i'} = 0$, then LQFS-LB will decrease $\hat Q^r_i$ while the artificial rule decreases $\hat Z^r_j$ (to $-1/\sqrt r$, as in Step 2). Convexity of the exponential function shows that in this case, the one-step increment $L(\tau+1) - L(\tau)$ is again smaller for LQFS-LB.
We argue similarly when the decision to be taken by the rules is the routing of a newly arrived customer of type $i$. Therefore, when all $\beta_j$ are equal, the key estimate (49) of the expected drift holds, in exactly the same form, for the LQFS-LB rule as well.

Now consider the case of general $\beta_j$. In the event of a service completion (and then possibly taking a customer for service from one of the non-zero queues), the increment $L(\tau+1) - L(\tau)$ under LQFS-LB is still clearly no greater than under the artificial rule. The only situation when LQFS-LB can possibly cause a greater increment than the artificial rule is as follows. There is an arrival of a type $i$ customer, which the artificial rule routes to pool $j$ with $\hat Z^r_j < 0$, but LQFS-LB will instead route it to pool $k$ such that $\hat Z^r_j/\beta_j \ge \hat Z^r_k/\beta_k$. Given convexity of the function $e^{\theta x}$, the largest increment $L(\tau+1) - L(\tau)$ occurs when $\hat Z^r_j/\beta_j = \hat Z^r_k/\beta_k$. (If $\theta > 0$, increasing $\hat Z^r_k$ would make the positive increment larger; if $\theta < 0$, increasing $\hat Z^r_k$ would make the negative increment smaller in absolute value.)

Thus, as we replace the artificial rule by LQFS-LB, in the "worst case", the increment
$$\beta_j\exp(\theta[\hat Z^r_j(\tau) + r^{-1/2}]/\beta_j) - \beta_j\exp(\theta\hat Z^r_j(\tau)/\beta_j)$$
may need to be replaced by
$$\beta_k\exp(\theta[\hat Z^r_k(\tau) + r^{-1/2}]/\beta_k) - \beta_k\exp(\theta\hat Z^r_k(\tau)/\beta_k),$$
with $\hat Z^r_k(\tau)$ satisfying $\hat Z^r_j(\tau)/\beta_j = \hat Z^r_k(\tau)/\beta_k$. In this case, after applying (40) we obtain
$$\beta_k\exp(\theta\hat Z^r_k(\tau+1)/\beta_k) - \beta_k\exp(\theta\hat Z^r_k(\tau)/\beta_k) \le \exp(\theta\hat Z^r_k(\tau)/\beta_k)\Big(\theta r^{-1/2} + \Big(\frac{1}{\beta_k}\tfrac12\theta^2\exp(|\theta|/\sqrt r)\Big)\frac1r\Big),$$
which is bounded above by
$$\exp(\theta\hat Z^r_j(\tau)/\beta_j)\Big(\theta r^{-1/2} + \Big(\frac{1}{\beta_*}\tfrac12\theta^2\exp(|\theta|/\sqrt r)\Big)\frac1r\Big)$$
(where we have used $\hat Z^r_k(\tau)/\beta_k = \hat Z^r_j(\tau)/\beta_j$). Thus, (42b) remains true even if we use LQFS-LB rather than the artificial rule; so the estimate (49) continues to hold.

Step 5: Exponential moments estimates.
Next, note that for each fixed $K > 0$ and each fixed parameter $r$, the values of $\exp(\theta\hat Q^r_i(\tau))$ are uniformly bounded over all states $S(\tau)$ satisfying the condition $L(\tau) \le K$; the values of $\exp(\theta\hat Z^r_j(\tau)/\beta_j)$ are "automatically" uniformly bounded (for a fixed $r$). We take the expected values of both sides of (49) with respect to the invariant distribution. The expectation of the left-hand side is of course 0, and so we can drop the factor $1/r$ from the right-hand side expectation. We write the resulting estimates separately for the cases $\theta > 0$ and $\theta < 0$ (with the case $\theta = 0$ being trivial).

Case $\theta > 0$. For a fixed $\theta > 0$, the expected value of the sum of all terms not containing $\exp(\theta\hat Q^r_i(\tau))$ is bounded (uniformly in $r$). Indeed, this follows from the facts that $\hat Z^r_j(\tau) \le 0$ and $0 \le -\theta\hat Z^r_j(\tau)\exp(\theta\hat Z^r_j(\tau)/\beta_j) \le \beta_j/e$ (because $0 \ge xe^x \ge -\frac1e$ for $x \le 0$). Then, we obtain:
$$E\Big[\mathbf{1}_{\{L(\tau)\le K\}}\sum_{i\in\mathcal I_+}\exp(\theta\hat Q^r_i(\tau))\Big(\frac{C\lambda_i(1 + o(1))}{\alpha^*}\theta - \frac{C_2}{2}\theta^2\Big)\Big] \le C_1 \tag{50}$$
for some constant $C_1 = C_1(\theta) > 0$, uniformly in all sufficiently large $r$. Now let us fix a sufficiently small positive $\theta$, so that all coefficients of $\exp(\theta\hat Q^r_i(\tau))$ are at least some $\epsilon > 0$ (for all large $r$). Recalling that $C_2 > 1$ can be arbitrarily close to 1, it suffices that $\theta < \theta_0 = 2(\min_i \lambda_i)/\alpha^*$. Then,
$$E\Big[\mathbf{1}_{\{L(\tau)\le K\}}\sum_{i\in\mathcal I_+}\exp(\theta\hat Q^r_i(\tau))\Big] \le C_1/\epsilon,$$
from where, letting $K \to \infty$, by monotone convergence, we obtain
$$E\Big[\sum_{i\in\mathcal I_+}\exp(\theta\hat Q^r_i(\tau))\Big] \le C_1/\epsilon < \infty, \tag{51}$$
uniformly in all large $r$.

Case $\theta < 0$. Fix arbitrary $\theta < 0$. In this case, the expected value of the sum of all terms not containing $\exp(\theta\hat Z^r_j(\tau))$ is bounded (uniformly in $r$). We can write:
$$E\Big[\mathbf{1}_{\{L(\tau)\le K\}}\sum_{j\in\mathcal J_-}\exp(\theta\hat Z^r_j(\tau)/\beta_j)\Big(\theta\Big[\frac{\mu_j}{\alpha^r}\Big]\big[\hat Z^r_j(\tau) + \beta_j C\big] - \frac{1}{\beta_*}\frac{C_2}{2}\theta^2\Big)\Big] \le C_1', \tag{52}$$
for some constant $C_1' = C_1'(\theta) > 0$, uniformly in all sufficiently large $r$. Let us choose sufficiently large $K_1 > 0$, such that the condition $\hat Z^r_j(\tau) \le -K_1$ implies that
$$\theta\Big(\frac{\mu_j}{\alpha^r}\Big)\big(\hat Z^r_j(\tau) + \beta_j C\big) - \frac{1}{\beta_*}\frac{C_2}{2}\theta^2 \ge \epsilon,$$
for some $\epsilon > 0$ (and all large $r$).
Then, from (52),
$$E\Big[\mathbf{1}_{\{L(\tau)\le K\}}\sum_{j\in\mathcal J_-}\mathbf{1}_{\{\hat Z^r_j(\tau)\le -K_1\}}\exp(\theta\hat Z^r_j(\tau)/\beta_j)\Big] \le C_1'/\epsilon,$$
from where, letting $K \to \infty$, by monotone convergence, we obtain
$$E\Big[\sum_{j\in\mathcal J_-}\mathbf{1}_{\{\hat Z^r_j(\tau)\le -K_1\}}\exp(\theta\hat Z^r_j(\tau)/\beta_j)\Big] \le C_1'/\epsilon < \infty,$$
uniformly in all large $r$, which implies the required result. □

Corollary 2.63. The sequence of stationary distributions of the processes $\big(\hat Q^r_{\mathcal I}(\cdot), \hat Z^r_{\mathcal J}(\cdot)\big)$ has a weak limit, which is the unique stationary distribution of the limiting process $\big(\hat Q_{\mathcal I}(\cdot), \hat Z_{\mathcal J}(\cdot)\big)$, described as follows:
$$\hat Q_i(t) \equiv \max\{\hat Y(t)/I, 0\},\ \forall i, \qquad \hat Z_j(t) \equiv \min\Big\{\frac{\beta_j}{\sum_k \beta_k}\hat Y(t), 0\Big\},\ \forall j, \tag{53}$$
where $\hat Y(\cdot)$ is a one-dimensional diffusion process with constant variance parameter $2\sum_i \lambda_i$ and piecewise linear drift, equal at point $x$ to
$$-\Big(\sum_j \mu_j\Big)\big(C + \min\{x, 0\}\big).$$
The invariant distribution density is then a continuous function, which is a "concatenation" at point 0 of exponential (for $x \ge 0$) and Gaussian (for $x \le 0$) distribution densities.

Proof. Theorem 2.62 implies tightness of stationary distributions of $\big(\hat Q^r_{\mathcal I}(\cdot), \hat Z^r_{\mathcal J}(\cdot)\big)$. Then, it follows from [Liptser and Shiryaev, 1989, Theorem 8.5.1] (whose conditions are easily verified in our case), that as $r \to \infty$, any weak limit of the sequence of stationary distributions of the processes $\big(\hat Q^r_{\mathcal I}(\cdot), \hat Z^r_{\mathcal J}(\cdot)\big)$ is a stationary distribution of the limit process. This limiting process is the one-dimensional diffusion given by (53) (see [Gurvich and Whitt, 2009, Theorem 4.4]), and it is easy to see that its invariant distribution is the "concatenation" specified above. □

A tightness result analogous to Theorem 2.62 also holds for the underloaded system, $\rho < 1$, and can be proved by a similar method. The asymptotic regime in this case is such that $\lambda^r_i = r\lambda_i$ (there is no point in considering $O(\sqrt r)$ terms in $\lambda^r_i$ when $\rho < 1$). We denote $Z^r_j(t) = \Psi^r_j(t) - r\beta_j\rho$ (which is consistent with the definition given earlier in this section for $\rho = 1$), and keep the notation $Q^r_i(t)$ for the queue length.
We work with the following Lyapunov function:
$$L \equiv \sum_i \Big[\exp\big(\theta(1 - \rho)\sqrt r + \theta\hat Q^r_i\big) - \exp\big(\theta(1 - \rho)\sqrt r\big)\Big] + \sum_j \beta_j\exp(\theta\hat Z^r_j/\beta_j).$$
The same approach as in the proof of Theorem 2.62 leads to the following result: for any real $\theta$,
$$\limsup_{r\to\infty} E\Big[\sum_j \exp(\theta\hat Z^r_j)\Big] < \infty.$$
The limiting process for $\hat Z^r_{\mathcal J}(\cdot)$ is $\hat Z_{\mathcal J}(\cdot) \equiv \Big(\frac{\beta_j}{\sum_k \beta_k}\hat Y(\cdot)\Big)_{j\in\mathcal J}$, with $\hat Y(\cdot)$ being a one-dimensional Ornstein-Uhlenbeck process, with Gaussian stationary distribution. The limit of stationary distributions of $\hat Z^r_{\mathcal J}(\cdot)$ is the (Gaussian) stationary distribution of $\hat Z_{\mathcal J}(\cdot)$.

8. LAP steady-state on sub-fluid scales

8.1. Main theorem and set-up. The main result of this section shows that not only is LAP stable on the fluid scale, it is in fact stable on essentially all scales larger than the diffusion scale.

Theorem 2.64. Consider the sequence of systems under the LAP policy, in the scaling regime and under the assumptions specified in §1, with $\rho < 1$. Then:
(1) For all sufficiently large $r$, the system is stable, i.e. the countable state-space Markov chain $(\Psi^r_{\mathcal E}(\cdot), Q^r_{\mathcal I}(\cdot))$ is positive recurrent.
(2) For any $\epsilon > 0$, the stationary distribution of $r^{-1/2-\epsilon}(\Psi^r_{\mathcal E}(\cdot) - r\psi^*_{\mathcal E}, Q^r_{\mathcal I}(\cdot))$ weakly converges to 0.

Theorem 2.51 proves statement (1), so for all large $r$ we may define steady-state variables $(\Psi^r_{\mathcal E}, Q^r_{\mathcal I})$ such that $\Psi^r_{\mathcal E}(t) \xrightarrow{w} \Psi^r_{\mathcal E}$, $Q^r_{\mathcal I}(t) \xrightarrow{w} Q^r_{\mathcal I}$. Moreover, Theorem 2.51 implies statement (2) for $\epsilon = 1/2$; that is,
$$\lim_{r\to\infty} P\Big(\Big\|\frac1r\big(\Psi^r_{\mathcal E}(\cdot) - \psi^*_{\mathcal E}r,\ Q^r_{\mathcal I}(\cdot)\big)\Big\| > \delta\Big) = 0, \tag{54}$$
for any $\delta > 0$. (The theorem statement is about weak convergence, but weak convergence to a constant implies convergence in probability.) The rest of this section is devoted to extending this result to all $\epsilon > 0$. This will involve studying finer rescalings of the process, which we call the hydrodynamic and local-fluid scalings.

Fix $\epsilon$, $0 < \epsilon < 1/2$. From (54), for an arbitrarily small fixed $\delta > 0$, we can choose a positive function $g(r) = o(r)$, such that
$$P\{\|(\Psi^r_{\mathcal E} - r\psi^*_{\mathcal E},\ Q^r_{\mathcal I})\| \le g(r)\} \ge 1 - \delta. \tag{55}$$
Without loss of generality, assume $r^{-1/2-\epsilon}g(r) \to \infty$.
We will prove that there exist positive constants C and T , such that for any fixed δ1 > 0 the following holds for all sufficiently large r: (56) r1/2+ ≤ ∥∥∥(ΨrE(0)− rψ∗E , QrI(0))∥∥∥ ≤ g(r) implies P {∥∥∥(ΨrE(T log r)− rψ∗E , QrI(T log r))∥∥∥ ≤ Cr1/2+} ≥ 1− δ1. This fact, along with (55), will prove Theorem 2.64(ii). We will need strong law of large numbers type results, which can be obtained from a strong approximation of Poisson processes, available e.g. in [Cso¨rgo˝ and Horva´th, 1993, Chapters 1 and 2]: 56 Proposition 2.65. A unit rate Poisson process Π(·) and a standard Brownian motion W (·) can be constructed on a common probability space in such a way that the following holds. For some fixed positive constants C1, C2, C3, ∀T > 1 and ∀u ≥ 0 P ( sup 0≤t≤T |Π(t)− t−W (t)| ≥ C1 log T + u ) ≤ C2e−C3u. Applying the result to the unit rate Poisson processes Π (a) i (·) and Π(s)ij (·) which drive the exogenous arrivals and departures, we obtain the following. (For Π (a) i (·), for example, we replace t with λirt; T with λirT log r; and u with r 1/4.) Proposition 2.66. For any fixed T > 0 and any subsequence of r →∞, we can find a further subsequence (with r increasing sufficiently fast), such that: for each i ∈ I, sup 0≤t≤T log r r−1/2−/2 ∣∣∣Π(a)i (λirt)− λirt∣∣∣→ 0, w.p.1, and for each (ij) ∈ E, sup 0≤t≤T log r r−1/2−/2 ∣∣∣Π(s)ij (µijβjrt)− µijβjrt∣∣∣→ 0, w.p.1. Let F r(t) be the process of (unscaled) deviations from equilibrium; that is, F r(t) = (ΨrE(t)− rψ∗E , QrI(t)). Suppose we have a function h(r), such that r1/2+ ≤ h(r) ≤ g(r). (The value h(r) will be the “scale” of F r(0); sometimes, but not always, we simply use h(r) = ‖F r(0)‖.) We will establish properties of F r(·) under two different scalings, called hydrodynamic and local-fluid. Remark 2.67. The use of multiple scalings (in addition to the “standard” fluid scal- ing) is typical in the analysis of systems in the many-server asymptotic regime, cf. [Gur- vich and Whitt, 2009] and references therein. 
Our hydrodynamic and local-fluid scalings are somewhat unusual in that the scaling factor h(r) is strictly “between” r and r1/2. (When h(r) = r, both local-fluid and hydrodynamic scalings become the standard fluid scaling; if h(r) = r1/2, the local-fluid scaling becomes the standard diffusion scaling.) Also, although the concept of analysing the system over the course of many short intervals is not new (cf. [Shah and Wischik, 2009, Section 8]), using multiple scalings simultaneously to derive tightness of stationary distributions is, to the best of our knowledge, novel. 8.2. Hydrodynamic scaling. Consider the process under the following scaling and centering: (57) (ψ r ij(t), q r i (t), x r i (t), a r i (t), d r ij(t), ξ r ij(t)) = h(r)−1 ( Ψrij((h(r)r −1t)− rψ∗ij, Qri (h(r)r−1t), Xri (h(r)r−1t)− r ∑ j ψ∗ij, Ari (h(r)r −1t), Drij(h(r)r −1t),Ξrij(h(r)r −1t) ) i∈I,(ij)∈E . Note that since ψ r E(·) is centered before it is scaled in space, the condition ρ < 1 implies∑ i ψ r ij(t) ≤ 0 for all j < J at all times t. Theorem 2.68. Consider a sequence of deterministic realisations, such that the driv- ing realisations satisfy the functional strong law of large numbers conditions, namely: (58) (arI(t), t ≥ 0)→ (λIt, t ≥ 0), u.o.c. 57 (59) ( h(r)−1 ( Drij(h(r)r −1t)− µij ∫ h(r)r−1t 0 Ψrij(s)ds ) , t ≥ 0 ) → 0, u.o.c., ∀(ij) ∈ E . Suppose (ψ r E(0), q r I(0)) → (ψE(0), qI(0)). Then, for any subsequence of r there exists a further subsequence along which (ψ r E(·), qrI(·), xrI(·), arI(·), d r E(·), ξ r E(·)) converges uniformly on compact sets to a set of Lipschitz continuous functions (ψE(·), qI(·), xI(·), aI(·), dE(·), ξE(·)) satisfying the hydrodynamic model equations (60). (The conditions involving deriva- tives are to be satisfied whenever the derivatives exist, which is almost everywhere w.r.t. Lebesgue measure.) 
The hydrodynamic model equations are: (60a) qi(t) ≥ 0, ∀i ∈ I; ∑ i ψij(t) ≤ 0, ∀j ∈ J (60b) ai(t) = λit, ∀i ∈ I; dij(t) = µijψ∗ijt, ∀(ij) ∈ E (60c) qi(t) = qi(0) + ai(t)− ∑ j ξij(t), ∀i ∈ I (60d) ψij(t) = ψij(0) + ξij(t)− dij(t), ∀i ∈ I (60e) xi(t) = qi(t) + ∑ j ψij(t) ≡ xi(0), ∀i ∈ I (60f) ∑ i ψij(t) = 0, whenever qi′(t) > 0 for at least one i ′ ∈ C(j) (60g) d dt ξij(t) = 0, whenever qi′(t) > 0 for at least one i ′ ∈ C(j), i′ < i (60h) d dt ξij(t) = 0, whenever ∑ k ψkj′(t) < 0 for at least one (ij ′) < (ij) (60i) d dt ξij(t) = min λi − ∑ (ij′)<(ij) d dt ξij′(t), ∑ i′ µi′jψ ∗ i′j − ∑ (i′j)<(ij) d dt ξij′(t)  whenever qi(t) = 0 and ∑ k ψkj = 0. Definition 2.69. We call any Lipschitz solution of (60) (ψE(·), qI(·), xI(·), aI(·), dE(·), ξE(·)) a hydrodynamic model of the system with initial state (ψE(0), qI(0)); a set (ψE(·), qI(·)), which is a projection of a hydrodynamic model we often call a hydrodynamic model as well. Clearly, we have the following corollary of Theorem 2.68, which we record for future reference. We denote f r (·) ≡ (ψrE(·), qrI(·)), f(·) ≡ (ψE(·), qI(·)). 58 Corollary 2.70. For any fixed T > 0, K > 0 and δ2 > 0, there exists a sufficiently small δ3 > 0, such that the following holds. Uniformly on all ∥∥∥f r(0)∥∥∥ ≤ K and all sufficiently large r, conditions (61) max i sup [0,T ] |ari (t)− λit| ≤ δ3, (62) max (ij) sup [0,T ] ∣∣∣∣∣h(r)−1(Drij(h(r)r−1t)− µij ∫ h(r)r−1t 0 Ψrij(s)ds )∣∣∣∣∣ ≤ δ3, imply (63) sup [0,T ] |f r(t)− f(t)| ≤ δ2, where f(·) is a hydrodynamic model with initial state f r(0). Theorem 2.71. For any K > 0 there exists a finite time T = T (K) such that all hydrodynamic models whose starting state satisfies ∥∥(ψE(0), qI(0))∥∥ ≤ K have ∑i ψij(t) = 0,∀j < J , qi(t) = 0,∀i ∈ I, and (ψE(t), qI(t)) = (ψE(T ), qI(T )), for all t ≥ T . Moreover, (ψE(T ), qI(T )) = L(ψE(0), qI(0)), where L is a fixed linear mapping defined below by (64). Proof. Consider the highest priority activity (1j). 
There are two possible cases: 1 is a leaf or j is a leaf. If j is a leaf, then ψ1j(0) ≤ 0; if the inequality is strict, ψ1j(t) must increase at a positive, bounded away from 0, rate until it reaches 0 within a finite time; ψ1j(t) = 0 thereafter. If type 1 is a leaf, then q1(t) must decrease and ψ1j(t) increase at the same rate (positive, bounded away from 0), until the entire queue (if any) “relocates into” ψ1j; and after that time, ψ1j(t) and q1(t) = 0 will remain constant. We see that in either case, after a finite time, the highest priority activity (1j) can be in a sense ignored. This allows us to proceed by induction on the activities, from the highest priority to the lowest, to check that by some finite time T (depending on K) the hydrodynamic model gets into a state (ψE(T ), qI(T )), satisfying the conditions of the theorem, and will remain in the same state for all t ≥ T . Since xi(t) do not change, the linear mapping L is as follows: L(uE , wI) = (cE , 0) where cE is the unique solution to (64a) ∑ j uij + wi = ∑ j cij, ∀i ∈ I (64b) ∑ i cij = 0, ∀j < J. For future reference, note that L(uE , wI) = (cE , 0) is a function only of the vector zI , where zi = wi + ∑ j uij. The corresponding linear mapping from zI to cE we denote L ′. 8.3. Local-fluid scaling. The process under local fluid scaling is defined as follows. For each r consider (ψ˜rE(t), q˜ r I(t)) ≡ f˜ r(t) = h(r)−1F r(t). We will also denote x˜ri (t) = h(r) −1Xri (t) ≡ q˜ri (t) + ∑ j ψ˜ r ij(t). Note that since ψ˜rE(·) is centered before it is scaled in space, the condition ρ < 1 implies ∑ i ψ˜ r ij(t) ≤ 0 for all j < J at all times t. 59 Theorem 2.72. Consider a sequence of deterministic realisations, such that the driv- ing realisations satisfy the functional strong law of large numbers conditions, namely: (65) (h(r)−1(ArI(t)− λIrt), t ≥ 0)→ 0, u.o.c., (66) ( h(r)−1 ( Dri (t)− µij ∫ t 0 Ψrij(s)ds ) , t ≥ 0 ) → 0, u.o.c., ∀(ij). 
Assume that the initial states converge to a fixed vector (ψ˜rE(0), q˜ r I(0))→ (ψ˜E(0), q˜I(0)). Further assume that q˜I(0) = 0 and ∑ i ψ˜ij(0) = 0 for all j < J . (In other words, (ψ˜E(0), q˜I(0)) = L(ψ˜E(0), q˜I(0)).) Then, for any subsequence of r there exists a further subsequence along which (67) (ψ˜rE(·), q˜rI(·))→ (ψ˜E(·), q˜I(·)), u.o.c., where (ψ˜E(·), q˜I(·)) is a set of Lipschitz functions, with initial conditions (ψ˜E(0), q˜I(0)), satisfying the local fluid model equations (69). Moreover, these limit trajectories are such that, uniformly on all of them, (68) ∥∥∥(ψ˜E(t), q˜I(t))∥∥∥ ≤ ∥∥∥(ψ˜E(0), q˜I(0))∥∥∥ c1e−c2t, ∀t ≥ 0, where c1, c2 > 0 are fixed constants. The local fluid model equations are (69a) q˜i(t) = 0, ∀i ∈ I (69b) ∑ j ψ˜ij(t) = ∑ j ψ˜ij(0)− ∑ j ∫ t 0 µijψ˜ij(s)ds, ∀i ∈ I (69c) ∑ i ψ˜ij(t) = 0, ∀j < J The I + J − 1 equations for the I + J − 1 functions (ψ˜ij(·)) can be solved sequentially, in order of decreasing activity priority, since the highest unsolved-for priority will always correspond to either a customer-type or a server-type leaf of the remaining activity tree. Definition 2.73. We call any Lipschitz solution of (69) (ψ˜E(·), q˜I(·), x˜I(·), a˜I(·), d˜E(·), ξ˜E(·)) a hydrodynamic model of the system with initial state (ψ˜E(0), q˜I(0)); a set (ψ˜E(·), q˜I(·)), which is a projection of a hydrodynamic model we often call a hydrodynamic model as well. Proof of Theorem 2.72. The non-trivial part of the proof is establishing that the limit (ψ˜E(·), q˜I(·)) is Lipschitz, which here is not a simple consequence of the functional law of large numbers for the driving processes (as was the case for fluid and hydrodynamic limits). This is because the arrival and service rates in the system with index r are O(r), while the space is scaled down by h(r) = o(r). For the same reason, it is also not “automatic” that the limit queues q˜i(·) stay at 0. This difficulty is resolved as follows. 
Consider an arbitrary number C4 > ∥∥∥(ψ˜CE(0))∥∥∥, and the random time (70) τ(r) = min{t : ∥∥∥(ψ˜rij(t))∥∥∥ ≥ C4}. 60 Speaking informally (the formal statements are given below), the trajectory x˜rI(·) must be “almost Lipschitz” in the interval [0, τ(r)], with Lipschitz constant η = C4 ∑ (ij)∈E µij, because the absolute difference between the arrival and departure rates (scaled down by h(r)) is bounded above by η in [0, τ(r)]. A similar observation holds for each queue length trajectory q˜ri (·), as long as the corresponding queue is non-zero. We will show that τ(r) is bounded away from 0 for all large r. Suppose not, and τ(r)→ 0 along some subsequence. Denoting x˜i(0) = ∑ j ψ˜ij(0), we have (71) sup [0,τ(r)] ‖x˜rI(t)− x˜I(0)‖ → 0, sup [0,τ(r)] ‖q˜rI(t)− q˜I(0)‖ → 0. We also must have (72) sup [0,τ(r)] ∥∥∥ψ˜rE(t)− ψ˜E(0)∥∥∥→ 0; if not, we would be able to construct a hydrodynamic model which violates the condition that after a finite time the vector of occupancies ψE(t) is uniquely determined as L ′xI(t). However, (72) contradicts the definition of τ(r). We conclude that the case τ(r) → 0 is impossible, i.e. there exists some 4 > 0 such that lim inf τ(r) > 4 > 0. If lim inf τ(r) > 4 > 0 along some subsequence, then it is easy to see that there exists a further subsequence along which (73) x˜rI(·)→ x˜I(·), q˜rI(·)→ q˜I(·), where the convergences are uniform in [0, 4], and each function x˜i(·) and q˜i(·) is Lipschitz with constant η in [0, 4]. Next, in addition to (73), we show that (74) ∥∥∥(ψ˜rE(t), q˜rI(t))− L(ψ˜rE(t), q˜rI(t))∥∥∥→ 0, in particular ∥∥∥ψ˜rE(t)− L′x˜rI(t)∥∥∥→ 0, uniformly in [0, 4]. Suppose not; then we would be able to construct a hydrodynamic model which would violate the condition that (ψE(t), qI(t)) = L(ψE(t), qI(t)) must hold after a finite time. In [0, 4] we also have x˜i(t) = x˜i(0)− d˜i(t), ∀i, where the Lipschitz function d˜i(·) is a limit (along a subsequence) of∑ j ∫ t 0 µijψ˜ r ij(s)ds. 
The above properties lead to conditions (69) on the interval [0, 4]. Namely, we formally define (ψ˜E(·)) = L′(x˜I(·)), obtain the convergence (ψ˜rE(·)) → (ψ˜E(·)) from (74), and then (69) follows. Conditions (69) reduce to a system of linear ordinary differential equations for ψ˜CE(t). In particular, each local fluid model remains bounded in [0,∞). This allows us to conclude that by choosing a sufficiently large C4, the corresponding 4 (which bounds from below the time τ(r) taken for the state to leave the ball of radius C4) can be arbitrarily large. The fact that each local fluid model converges to 0 is easily established, again by induction on activities. Since the solution of a linear ODE converges exponentially quickly whenever it converges at all, the bound (68) follows.  We will actually need a generalised version of Theorem 2.72. 61 Theorem 2.74. Consider a sequence of deterministic realisations, such that the driv- ing realisations satisfy (65)–(66). Assume that the initial states converge to a fixed vector (ψ˜rE(0), q˜ r I(0)) → (ψ˜E , q˜I). (We do not assume (ψ˜E , q˜I) = L(ψ˜E , q˜I).) Then, for any subsequence of r there exists a further subsequence along which (75) (ψ˜rE(·), q˜rI(·))→ (ψ˜E(·), q˜I(·)), in D[η,∞) for any η > 0, where (ψ˜E(·), q˜I(·)) is a local fluid model with initial state (ψ˜E(0), q˜I(0)) = L(ψ˜E , q˜i). Moreover, these limit trajectories are such that, uniformly on all of them, (76) ∥∥∥(ψ˜E(t), q˜I(t))∥∥∥ ≤ ∥∥∥(ψ˜E(0), q˜I(0))∥∥∥ c1e−c2t, ∀t ≥ 0, where c1, c2 > 0 are fixed constants. The proof is a slight generalisation of that of Theorem 2.72. The initial jump in the local fluid model from (ψ˜E , q˜I) to (ψ˜E(0), q˜I(0)) is proved by considering an interval [0, T5h(r)] and the corresponding hydrodynamic scaled trajectories in [0, T5]; T5 is chosen large enough so that the hydrodynamic model reaches the state (ψE(0), qI(0)) = L(ψE , qI) by time T5. Corollary 2.75. There exists C > 0 such that the following holds. 
For any fixed $T > 0$, $K > 0$, $\delta_2 > 0$ and $\epsilon_2 > 0$, there exists a sufficiently small $\delta_3 > 0$, such that: uniformly on all $\|\tilde{f}^r(0)\| \le K$ and all sufficiently large $r$, conditions
(77) $$\max_i \sup_{[0,T]} \left| h(r)^{-1}\left(A^r_i(t) - \lambda_i r t\right) \right| \le \delta_3,$$
(78) $$\max_{(ij)} \sup_{[0,T]} \left| h(r)^{-1}\left(D^r_{ij}(t) - \mu_{ij} \int_0^t \Psi^r_{ij}(s)\,ds\right) \right| \le \delta_3,$$
imply
(79) $$\sup_{[0,T]} \|\tilde{f}^r(t)\| \le (K+1)C,$$
(80) $$\sup_{[\epsilon_2,T]} \|\tilde{f}^r(t) - \tilde{f}(t)\| \le \delta_2,$$
where $\tilde{f}(\cdot)$ is a local fluid model with initial state $L\tilde{f}^r(0)$ (so that $\tilde{f}(\cdot)$ depends on $r$).

8.4. Proof of Theorem 2.64(ii). We are now in a position to prove (56), and then Theorem 2.64(ii). The basic idea is to consider the process in the interval $[0, T\log r]$, subdivided into $\log r$ intervals, each being $T$-long. (To be precise, we should consider an integer number of subintervals, say $\lfloor \log r \rfloor$. This does not cause any difficulties besides making notation cumbersome.) Using the local fluid limit results, we show that, with high probability, in each of the $T$-long subintervals, the norm $\|F^r(t)\|$ decreases by a factor $\delta_6 \in (0,1)$, unless the norm $\|F^r(t)\|$ at the beginning of the subinterval was smaller than $r^{1/2+\epsilon}$; in this case $\|F^r(t)\|$ will be bounded above by $3Cr^{1/2+\epsilon}$ during the entire subinterval (where $C$ is as in Corollary 2.75). If $\delta_6$ is small enough, so that
(81) $$\delta_6^{\log r} < r^{1/2+\epsilon}/r, \quad \text{i.e.} \quad \delta_6 < e^{-1/2+\epsilon},$$
then the above implies $\|F^r(t)\|$ must “dip” below $r^{1/2+\epsilon}$ at least once, and therefore $\|F^r(T\log r)\| \le 3Cr^{1/2+\epsilon}$ (with high probability). We proceed with the details.

Let us choose $\delta_6 > 0$ satisfying (81), and then $\delta_2 > 0$ such that $2\delta_2 < \delta_6$. Denote by $\|L\|$ the norm of the linear operator $L$ (defined in Theorem 2.71), i.e. the maximum of the absolute values of its eigenvalues. Let us choose $T > 0$ large enough so that (see Theorem 2.74) $\|L\| c_1 e^{-c_2 T} < \delta_2$.

Suppose, for each $r$, the initial state is as in (56). To prove (56) it suffices to show that from any subsequence of $r$ we can find a further subsequence, along which (56) holds. So, consider any fixed subsequence, and a fixed $\delta_1 > 0$.
In each of the subintervals $[(i-1)T, iT]$, $i = 1, 2, \ldots, \log r$, we consider the process with the time origin reset to $(i-1)T$ and the corresponding initial state $F^r((i-1)T)$. If $\|F^r((i-1)T)\| \le g(r)$, then we set $h(r) = \max(\|F^r((i-1)T)\|, r^{1/2+\epsilon})$; if $\|F^r((i-1)T)\| > g(r)$ we set $h(r) = g(r)$ for completeness, but with high probability this will never occur. By Proposition 2.66, we can choose a further subsequence so that, w.p.1, conditions (77) and (78) hold for all large $r$, simultaneously on each of the subintervals $[0,T], [T,2T], \ldots, [T(\log r - 1), T\log r]$. We consider the corresponding local fluid scaled processes $\tilde{f}^r(\cdot)$, with their corresponding $h(r)$, on each of the subintervals; and apply Corollary 2.75. We see that, with probability 1, for all large $r$, the following holds for each interval $[(i-1)T, iT]$, $i = 1, 2, \ldots, \log r$: if $\|F^r((i-1)T)\| \in [r^{1/2+\epsilon}, g(r)]$ then $\|F^r(iT)\| \le 2\delta_2 \|F^r((i-1)T)\|$; if $\|F^r((i-1)T)\| < r^{1/2+\epsilon}$ then $\|F^r(iT)\| \le 3Cr^{1/2+\epsilon}$. Since $2\delta_2 < \delta_6$ we must have $\|F^r(iT)\| < r^{1/2+\epsilon}$ for at least one $i$. Finally, we conclude that the condition $\|F^r(T\log r)\| \le 3Cr^{1/2+\epsilon}$ must hold (w.p.1 for all large $r$). This obviously implies (56).

We believe that a stronger result is also true.

Conjecture 2.76. The sequence of stationary distributions of the processes $r^{-1/2}(\Psi^r_E(\cdot) - r\psi^*_E, Q^r_I(\cdot))$ is tight.

CHAPTER 3
Limit order book

Introduction

In this chapter we model a limit order book. A limit order book is a pricing mechanism for a single-commodity market. To illustrate the concept of a pricing mechanism, suppose you would like to buy a carrot. Depending on the amount of time and money you have (and the amount of ridicule you’re willing to put up with), there are several ways in which you could go about acquiring it:
• Go to a supermarket, and pay the price of a carrot written on the shelf.
• Go to a farm, and haggle over the price with the farmer.
• Go to a street market and in a booming voice announce “I need a carrot; who will offer me the best price?” in the hopes that this will spur the stall-keepers into a price war.
• Bid on a carrot on eBay.
• ...
All of these are mechanisms for pairing up buyers and sellers (of carrots), and for deciding the amount of money that will be exchanged during the transaction.

In these terms, a limit order book works as follows. Sellers and buyers of carrots arrive in real time. They publicly make one of the following four announcements:
• I would like to sell a carrot right now, to the highest waiting bidder in the system. (“Market ask”)
• I would like to buy a carrot right now, from the lowest seller in the system. (“Market bid”)
• I have some carrots, and could be persuaded to part with them, but only if the price rises above p. (“Limit ask”)
• I would like to invest in some carrots eventually, but only if the price drops below p. (“Limit bid”)
The market bid and market ask are essentially equivalent to trading with a supermarket: you get to buy or sell the carrot immediately, but possibly at an inconveniently high, respectively low, price. The limit orders, on the other hand, may result in better deals, but involve waiting for the order to be executed. The limit order book is the list of unfulfilled limit bids and limit asks.

Although somewhat impractical as a way of getting a single carrot for dinner, this pricing mechanism is important in financial markets, many of which are run using variants of this model. Consequently, it has generated a lot of interesting research, both empirical (studies of real-world market data) and theoretical (models of how the behaviour might arise). The following discussion is taken from the excellent survey of [Gould et al., 2011], and many more references can be found therein.

Empirical studies.
Because the information (prices and sizes of orders) in the limit order book is publicly available [1], quite a lot of statistical data about limit order book behaviour has been amassed. Many of the empirical studies contradict each other, possibly due to fundamental differences between the underlying markets, or due to the inherent difficulty of the problem. However, there are a few features shared by many markets.

[1] With caveats: not all of the limit order book is available to the general public, there may be a delay, some of the markets allow asymmetric information, and in some markets it is possible to submit partially or completely hidden orders. The hidden orders in particular greatly complicate empirical studies.

The time series of prices [2] has certain interesting characteristics, including different volatilities at different time scales. There are several ways to define volatility, but loosely speaking, volatility is a measure of variability of the logarithm of the price over that time scale. For example, to compute the 5-minute volatility, one could look at the series of prices $p(t_0), p(t_1), \ldots$ spaced by 5 minutes, and compute the standard deviation of the quantity $\log p(t_{i+1}) - \log p(t_i)$. One can also consider the time series of volatilities on a given time scale, e.g., the day-by-day 5-minute volatility. Observations suggest that high-volatility periods tend to cluster together, as do low-volatility periods; that is, large variations in prices are more likely to follow other large variations in prices than they are to occur unconditionally.

Another feature found in many markets is the “humped” shape of the limit order book. Here, we consider the total quantity of the good being offered for sale, or requested, as a function of price. Many studies find that each of the buy and sell distributions is approximately unimodal, with the maximum occurring at some price that is near, but not equal to, the current best price.
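The volatility computation described above (the standard deviation of the increments of the log-price over an equally spaced series) can be sketched as follows. This is only an illustration: the function names and the sample prices are hypothetical, and the use of the population standard deviation (rather than the sample one) is an assumption, since the text does not fix a convention.

```python
import math
import statistics


def log_returns(prices):
    """Return log p(t_{i+1}) - log p(t_i) for consecutive prices."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]


def volatility(prices):
    """Standard deviation of the log-price increments of an equally spaced series.

    Assumption: population standard deviation; needs at least two prices.
    """
    return statistics.pstdev(log_returns(prices))


# Hypothetical prices, one observation every 5 minutes.
prices = [100.0, 101.0, 100.5, 102.0, 101.5]
vol_5min = volatility(prices)
```

Applying `volatility` separately to each day's series of 5-minute prices would give the day-by-day 5-minute volatility mentioned in the text.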
Finally, some studies find that the process describing the limit order book may not be stationary; this is interpreted as the result of new information being constantly supplied to the market. It may also be possible to model this as evolving “steady-state behaviour” of a system whose underlying parameters vary over time.

The theoretical models of limit order books have largely fallen into the following two classes.

Economic game theory. A limit order book can be naturally modelled as a large repeated game, in which players have more or less information about each other’s preferences. Two studies using this set-up are [Parlour, 1998] and [Roşu, 2009]. In [Parlour, 1998], orders cannot be changed after placement, and the set of possible prices is reduced to just two ticks (“high” and “low”, corresponding to current selling price and buying price). Thus, the strategic choices are essentially “place a market order”, “join the queue of limit orders”, or “pass”. This models the trade-off between the price and the probability of an order being executed before a deadline. Roşu [2009] introduces the possibility of modifying orders after they are submitted. This and the assumption of continuously-varying prices turn out to simplify the space of possible strategies enough to derive the form of the subgame-perfect Nash equilibria for the system. Both models assume a large amount of (symmetric) common knowledge available to all the market participants; for example, everyone knows everyone else’s level of aversion to waiting.

The strategies giving the game-theoretic equilibria of these two models explain some of the features of real-world limit order book markets. While it is possible that a fuller model of this flavour would explain more of the behaviour, analysing a large repeated game in continuous time is tricky at best.

Zero-knowledge and Markovian markets.
Because modelling individual buyers is difficult, one could try to model the market without referring to the individual buyer and seller preferences, and instead specify stochastic dynamics for the market as a whole. An early paper introducing these ideas is [Gode and Sunder, 1993], which considers a small market with zero-intelligence traders. Zero-intelligence traders make their decisions based only on the current price, without attempting to game the system in any way. One interesting feature that emerges is a notion of an equilibrium price: even though the traders do not “discuss” their different valuations in any way, trades in the market eventually only occur around some single price.

[2] There isn’t a single well-defined price in the limit order book; so this could refer to the highest bid price, the lowest ask price, the mid-point price halfway between the two, or the price at which the most recent transaction occurred.

An example of a Markovian market is given in [Cont and de Larrard, 2010], which uses a Markov process to represent the market state. Cont and de Larrard [2010] assume that the Markovian state descriptor can be taken to be just the pair (bid price, ask price), rather than the full state of the limit order book. Within this model, the authors are able to derive steady-state distributions of various quantities, such as price movements.

A trend in the literature is to add assumptions to the model until it reproduces the desired statistical properties of the real-world limit order books. Unfortunately, the added complexity usually makes the models less analytically tractable. In particular, with relatively few exceptions, models of limit order books are only amenable to numerical analysis, which makes it difficult to understand the effect that the parameters of the model have on its behaviour.

The model analysed in this chapter is instead deliberately very simple and almost parameter-free. This is because we do not set the goal of approximating “real-world behaviour” as closely as possible. Rather, we would like to understand the behaviour of the underlying system of interacting queues, in the hopes that the insights will generalise to other settings. Consequently, there are very few adjustable parameters in the model we analyse, although we show some possible extensions in §9 and §10. It is interesting that even in such a simple system nontrivial behaviour emerges; for example, we see clear threshold values for orders clearing from the system.

1. Limit order book model

Definition 3.1. An order is a pair (price, type). Price is a real number; type is one of “bid” and “ask”. A bid is an order to buy a unit of good (one carrot, in the terminology of the introduction); an ask is an order to sell one unit of good.

Orders arrive exogenously into the limit order book, and cannot be cancelled: they are either executed, that is matched to an order of the opposite type (immediately or at a later time), or they remain in the system forever. We are implicitly assuming that all orders have the same size; this is a simplification (in real life, you may want to buy not one but a dozen carrots). Roşu [2009] discusses the effect that the order sizes may have on the statistical properties of real-world limit order books. The assumption that orders cannot be cancelled is also a simplification: in real life, many if not most of the orders are indeed cancelled before execution (and perhaps submitted without intending for them to be executed). We can think of our model as applying to the orders that really are meant to be executed; but the primary reason for the assumption is that the monotonicity results in §3 break if we allow orders to depart at will.

Definition 3.2.
The state of a limit order book at time $t$ is a pair of counting measures $(Q^b_t, Q^a_t)$ supported on $[0,1]$, counting bids and asks respectively, with cumulative distribution functions $Q^b_t(p) \equiv Q^b_t(-\infty, p]$ and $Q^a_t(p) \equiv Q^a_t[p, \infty)$ (note that asks are counted from the right).

The quantities $Q^b_t\{p\}$ and $Q^a_t\{p\}$ are known in the financial literature as the depth of the market at price $p$; when $p$ is, e.g., the highest bid price, this is the number of transactions that need to occur in order to change the price.

We will occasionally want to consider infinite states, but only “nice” ones. Specifically, all states we will consider have finite support, i.e. for $i = a, b$, there are only finitely many prices $p$ such that $Q^i_t\{p\} \neq 0$. However, we may have $Q^b\{p\} = \infty$ and/or $Q^a\{q\} = \infty$ for some prices $p$ and $q$. If we have both infinitely many bids at $p$ and infinitely many asks at $q$, we will always have $p < q$ [3], so that we never have to consider infinitely many departures in a finite time period.

The external arrivals to the limit order book happen according to some process $(A_t)_{t \ge 0}$; the set of times at which an arrival occurs is discrete. Each arrival event is an order, i.e. a pair (price, type). Unless otherwise noted, arrivals happen in discrete time, are iid, the type is equally likely to be “bid” or “ask”, and the price is supported on $[0,1]$. (We may also consider Poisson arrival processes, in which case we will restrict to the probability-1 event that there are only finitely many arrivals in any finite time interval, and no two arrival times coincide.) For most of this chapter, we will assume that the distribution of the price of an arriving order is the same for bids and for asks, in which case it can without loss of generality be taken to be uniform on $[0,1]$ (see §9, where we also discuss what happens if the arrivals are iid but with different distributions for bids and asks).

The departures from an order book happen only at arrival times.
Informally, an arriving bid departs if there is an ask in the system to the left of it, and in that case it departs with the leftmost such ask; similarly, an arriving ask departs if there is a bid in the system to the right of it, and in that case it departs with the rightmost such bid. We will also consider the effect of partitioning prices into discrete ticks, which we will formalise by introducing price level functions.

Definition 3.3. A price level function, which we may also refer to as a pricing scheme, called $P$ and denoted $x \prec y$, is a partial ordering on $[0,1]$ that is refined by the usual total ordering. Equivalently, it is a nondecreasing map $P : [0,1] \to [0,1]$, where we define $p \prec q$ to mean $P(p) < P(q)$. (We will always take this map to be right-continuous.) When $p$ is incomparable to $q$ (i.e., none of $p \prec q$, $p = q$, or $p \succ q$ holds), we write $p \sim q$.

We allow the highest bid-lowest ask pair (at prices $\beta_t$ and $\alpha_t$ respectively) to depart the system whenever $\beta_t \not\prec \alpha_t$ (note that one of $\beta_t$ and $\alpha_t$ in this case must be a newly arrived order).

To formally specify the dynamics of the system, we will introduce two more pairs of counting measures, counting the cumulative arrivals and the cumulative departures from a set.

Definition 3.4. For a set of prices $S$, let $A^a_t(S)$ ($A^b_t(S)$) denote the number of asks (bids) with prices in $S$ that have arrived into the system by time $t$, and let $D^a_t(S)$ ($D^b_t(S)$) denote the number of asks (bids) with prices in $S$ that have left the system up to time $t$. We will always consider starting states for which $D^a_0 = D^b_0 = 0$ is the zero measure. Let $D^b_t(p) = D^b_t(-\infty, p]$ and $D^a_t(p) = D^a_t[p, \infty)$ be the cumulative distribution functions for the departure measures; note that asks are counted from the right.

We formally define the evolution of a limit order book with price level function $P$. Upon the arrival at time $t$ of a bid at price $p$, if $\alpha_{t-} \succ p$, then the bid waits:
(82a) $$Q^b_t = Q^b_{t-} + \delta_p, \quad Q^a_t = Q^a_{t-}, \quad \text{if } A_t = (p, \text{bid}) \text{ and } \alpha_{t-} \succ p.$$
If $\alpha_{t-} \not\succ p$, then the bid departs with the leftmost ask:
(82b) $$Q^a_t = Q^a_{t-} - \delta_{\alpha_{t-}}, \quad Q^b_t = Q^b_{t-}, \quad D^b_t = D^b_{t-} + \delta_p, \quad D^a_t = D^a_{t-} + \delta_{\alpha_{t-}}, \quad \text{if } A_t = (p, \text{bid}) \text{ and } \alpha_{t-} \not\succ p.$$
The situation is symmetrical if the order arriving at time $t$ is an ask at price $p$: if $\beta_{t-} \prec p$ then the ask waits, while if $\beta_{t-} \not\prec p$, then both the ask and the rightmost bid depart.

[3] More precisely, $p \prec q$ in the appropriate partial ordering $\prec$ (Definition 3.3).

The relationship between the quantities $Q$, $A$, and $D$ is as follows: for a set of prices $S$, any time $t \ge 0$, and $i = a, b$,
(83) $$Q^i_t(S) = Q^i_0(S) + A^i_t(S) - D^i_t(S).$$
We require that $\beta_0 \prec \alpha_0$, so that no departures are possible initially. The evolution described in (82) guarantees that all of these quantities are finite, and $\beta_t \prec \alpha_t$ at all times $t \ge 0$.

2. Main results

In this section we state our main results. Their proofs will be given in later sections: the proofs of Theorems 3.5 and 3.7 are in §4; the proofs of Theorems 3.9 and 3.11 are in §6; the proof of Theorem 3.10 is in §8.

Theorem 3.5. Let L be a limit order book with deterministic starting state $(Q^b_0, Q^a_0)$ and arrival process $(A_t)_{t \ge 0}$. Suppose that the arrival events are independent. Then there exist two constants $\kappa_b$ and $\kappa_a$ such that the following hold for any $\epsilon > 0$, with probability 1.
(1) $D^b_\infty(-\infty, \kappa_b - \epsilon] < \infty$, and $D^a_\infty[\kappa_a + \epsilon, \infty) < \infty$. That is, only finitely many bid departures at prices $< \kappa_b - \epsilon$ ever occur, and only finitely many ask departures at prices $> \kappa_a + \epsilon$ ever occur.
(2) The event $\{Q^b_t[\kappa_b + \epsilon, \infty) = 0\}$ occurs infinitely often, and the event $\{Q^a_t(-\infty, \kappa_a - \epsilon] = 0\}$ occurs infinitely often. That is, infinitely often all of the bids at prices $> \kappa_b + \epsilon$ are executed, and infinitely often all of the asks at prices $< \kappa_a - \epsilon$ are executed.
Further, the constants $\kappa_b$ and $\kappa_a$ do not change if the starting state $(Q^b_0, Q^a_0)$ is modified by a finite number of bids.

Definition 3.6. We call $\kappa_b$ and $\kappa_a$ in Theorem 3.5 the threshold values on the bid and ask side respectively.
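The dynamics (82)–(83) can be sketched in code for the simplest setting: continuous pricing $P(x) = x$, discrete-time iid arrivals with uniform prices on $[0,1]$, and equally likely types. This is a minimal illustration under those assumptions, not the simulation code used for §11; the class and function names (`LimitOrderBook`, `simulate`) are hypothetical, and an empty side of the book is represented by a best price of $\pm\infty$.

```python
import heapq
import random


class LimitOrderBook:
    """Unit-size orders, no cancellations, continuous pricing P(x) = x."""

    def __init__(self):
        self.bids = []            # max-heap of waiting bids (prices stored negated)
        self.asks = []            # min-heap of waiting asks
        self.departed_bids = []   # prices of executed bids, in order of execution
        self.departed_asks = []   # prices of executed asks, in order of execution

    def best_bid(self):
        """Highest waiting bid price (beta_t); -inf if there are no bids."""
        return -self.bids[0] if self.bids else float("-inf")

    def best_ask(self):
        """Lowest waiting ask price (alpha_t); +inf if there are no asks."""
        return self.asks[0] if self.asks else float("inf")

    def arrive(self, price, kind):
        """Process one arrival, following (82a)-(82b) and their ask analogues."""
        if kind == "bid":
            if self.best_ask() > price:           # (82a): the bid waits
                heapq.heappush(self.bids, -price)
            else:                                 # (82b): departs with lowest ask
                self.departed_asks.append(heapq.heappop(self.asks))
                self.departed_bids.append(price)
        else:
            if self.best_bid() < price:           # the ask waits
                heapq.heappush(self.asks, price)
            else:                                 # departs with highest bid
                self.departed_bids.append(-heapq.heappop(self.bids))
                self.departed_asks.append(price)


def simulate(n_arrivals, seed=0):
    """Run the book from an empty state under iid uniform arrivals."""
    rng = random.Random(seed)
    book = LimitOrderBook()
    for _ in range(n_arrivals):
        kind = "bid" if rng.random() < 0.5 else "ask"
        book.arrive(rng.random(), kind)
    return book
```

For example, after `book = simulate(100000)` one can inspect the lowest executed bid price among the later entries of `book.departed_bids` as a rough empirical proxy for the bid-side threshold of Definition 3.6; any such estimate depends on the run length and the seed.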
When the arrivals are iid with some bounded density, we have the following refinement:

Theorem 3.7. Let $L$ be a limit order book with some deterministic finite starting state, arbitrary price level function, and arrival process $(A_t)_{t \ge 0}$. Let the arrival events $(A_t)_{t \ge 0}$ be iid, with
\[
\mathbb{P}(A_t \in dp \times \text{bid}) = \tfrac12\, dF^b(p), \qquad \mathbb{P}(A_t \in dp \times \text{ask}) = \tfrac12\, dF^a(p)
\]
for some pair of probability distributions $F^b$, $F^a$ on $[0,1]$ with bounded densities $f^b$, $f^a$ respectively; let
\[
M = \max_{i = a, b}\ \sup_{p \in [0,1]} f^i(p).
\]
Then the threshold values $\kappa_b$ and $\kappa_a$ satisfy $F^b(\kappa_b) = 1 - F^a(\kappa_a)$. Moreover, for any $\epsilon > 0$, w.p.1, there exists a sequence of times $T_n \to \infty$ such that
\[
Q^b_{T_n}[\kappa_b + \epsilon, \infty) = 0 \quad \text{and} \quad \limsup_{T_n \to \infty} \frac{1}{T_n} Q^a_{T_n}(-\infty, \kappa_a + \epsilon] \le 2M\epsilon.
\]

Remark 3.8. The boundedness of the densities is used to control the number of bid or ask arrivals on a small interval near the boundary values $\kappa_b$, $\kappa_a$. In particular, we only require the density to be bounded on a neighbourhood of $\kappa_b$ and $\kappa_a$.

In the above two theorems, the event of measure 1 on which the results hold may depend on $\epsilon$. In Theorem 3.7, the sequence of times $T_n$ is random as well: that is, for almost every $\omega$ in the underlying probability space there exists a sequence $T_n = T_n(\omega)$ satisfying the conditions of the theorem.

Theorem 3.5 is applicable to a very wide class of arrival processes, and it leaves open the possibility that $\kappa_b = 0$ and $\kappa_a = 1$, or that $\kappa_b = \kappa_a$. Theorem 3.9 shows that, when the arrivals are iid uniform (and the partial ordering is not too coarse), this is not the case: we really do have a positive fraction of unfulfilled bid and ask orders. Theorem 3.10 shows that $\kappa_b < \kappa_a$, so there is a nontrivial region where all bids periodically clear, and all asks periodically clear. (We do not know whether all orders will clear infinitely often on the entirety of this region.) Finally, Theorem 3.11 computes the threshold values precisely for the case of uniform arrivals and continuous pricing.

Theorem 3.9.
Let $L$ be a limit order book with some deterministic finite starting state, price level function $P$, and arrival process $(A_t)_{t \ge 0}$. Suppose the arrival events are iid uniform on $[0,1] \times \{\text{bid}, \text{ask}\}$, and suppose $P$ is such that there exists a price $p$ with $0 \prec p \prec 1 - p \prec 1$. Then
\[
\frac{p - 2p^2}{2 - 3p} \le \kappa_b, \qquad \frac{p - 2p^2}{2 - 3p} \le 1 - \kappa_a.
\]
In particular, $\tfrac19 \le \kappa_b$ and $\kappa_a \le \tfrac89$ when $P(x) = x$.

Theorem 3.10. Let $L$ be a limit order book with some deterministic finite starting state, arbitrary price level function $P$, and arrival process $(A_t)_{t \ge 0}$. Suppose the arrival events are iid uniform on $[0,1] \times \{\text{bid}, \text{ask}\}$. Then $\kappa_b \le \tfrac14$ and $1 - \kappa_a \le \tfrac14$.

For the case of $P(x) = x$, we can find the values of $\kappa_b$ and $\kappa_a$ precisely.

Theorem 3.11. Let the partial ordering $P$ be $\le$, i.e. given by $P(x) = x$. Let the arrival events be iid uniform on $[0,1] \times \{\text{bid}, \text{ask}\}$. Then the value of $\kappa_b$ is given as the unique solution to
\[
\log\left(\frac{1 - \kappa_b}{\kappa_b}\right) = \frac{\kappa_b}{1 - \kappa_b} + 1, \qquad \kappa_b \approx 0.2178,
\]
and $\kappa_a = 1 - \kappa_b$.

3. Monotonicity

In this section we gather some of the monotonicity results that our model exhibits. First, we examine what happens if we modify the starting state of the model, i.e. what effect a change in the initial configuration has on the future evolution of the limit order book. Second, we relax the pricing scheme; i.e. we replace the price level function $P$ by $\tilde P$, where $x \mathrel{\tilde\prec} y$ implies $x \prec y$.

Remark 3.12. There is a body of work on proving monotonicity for Markov processes; see for example [Massey, 1987], or more recently [Delgado et al., 2004] and [Lorek and Szekli, 2012], and references therein. In our case, the proofs are simple enough to derive from scratch.

Lemma 3.13. Let $L$ and $\tilde L$ be two limit order books sharing the same arrival process and price level function, but such that $\tilde Q^b_0 = Q^b_0 + \delta_{p_0}$ for some price $p_0$, and $\tilde Q^a_0 = Q^a_0$. Then at all subsequent times either $\tilde Q^b_t = Q^b_t + \delta_{p_t}$ and $\tilde Q^a_t = Q^a_t$, or $\tilde Q^b_t = Q^b_t$ and $\tilde Q^a_t = Q^a_t - \delta_{q_t}$, for some prices $p_t$, $q_t$ (which depend on $t$, and may be random).

Proof.
Up until the time the extra bid has departed, the departures are exactly the same in the two systems, so there is an extra bid in L˜. When the extra bid does depart, there is one fewer ask in L˜. We now repeat the argument swapping the roles of L and L˜, and of bids and asks.  Corollary 3.14. Let L˜ be obtained from L by adding some bids at some set of times; except for that change, the starting state, arrival process, and price level function of L and L˜ coincide. Then at all times, L˜ has at least as many bids as L and no more asks than L. Corollary 3.15. Let L˜ be obtained from L by modifying the starting state by a finite number ≤ M of orders; except for that change, the starting state, arrival process, and price level function of L and L˜ coincide. Then the states of L and L˜ at all times will differ by at most M orders. In particular, for any set S and i = a, b we will have∣∣∣Dit(S)− D˜it(S)∣∣∣ ≤M . We next discuss different pricing schemes. Definition 3.16. For two price level functions P , P˜ we say that P˜ is coarser than P if x≺˜y implies x ≺ y. Relaxing the pricing scheme into a coarser one means that more bid-ask pairs become “eligible” to leave; Lemma 3.17 asserts that more pairs really do leave. Lemma 3.17. Let L and L˜ be limit order books with the same starting state and external arrival process, but let P˜ be coarser than P. Then D˜bt (p) ≥ Dbt (p) and D˜at (p) ≥ Dat (p) for all prices p and times t. We will need a preliminary result, which is essentially an observation about increasing functions. Lemma 3.18. Let L, L˜ be limit order books as in Lemma 3.17, and suppose that at some time t we have D˜bt (p) ≥ Dbt (p) and D˜at (p) ≥ Dat (p) for all p. Further, suppose that D˜bt (∞) = Dbt (∞). Then the rightmost bid satisfies β˜t ≥ βt, and the leftmost ask satisfies α˜t ≤ αt. Proof. First, observe that D˜bt (∞) = D˜at (−∞) and Dbt (∞) = Dat (−∞), since bids and asks depart in pairs. 
We prove the statement about the rightmost bid; the statement about the leftmost ask will follow by an identical argument. Suppose that the lemma does not hold, i.e. the set $[\beta_t, \infty)$ has $Q^b_t[\beta_t, \infty) > 0$ but $\tilde Q^b_t[\beta_t, \infty) = 0$ (i.e., $\tilde\beta_t < \beta_t$). Since bids are only present at a discrete set of prices, let $\epsilon > 0$ be such that $Q^b_t(\beta_t - \epsilon, \infty) = Q^b_t[\beta_t, \infty)$ and similarly for $\tilde Q^b_t$. From (83) we infer that $\tilde D^b_t(\beta_t - \epsilon, \infty) > D^b_t(\beta_t - \epsilon, \infty)$, since the arrivals are equal. However, using $\tilde D^b_t(\infty) = D^b_t(\infty)$, we then have
\[
\tilde D^b_t(\beta_t - \epsilon) = \tilde D^b_t(\infty) - \tilde D^b_t(\beta_t - \epsilon, \infty) < D^b_t(\infty) - D^b_t(\beta_t - \epsilon, \infty) = D^b_t(\beta_t - \epsilon),
\]
contradicting the assumptions of the lemma. □

Proof of Lemma 3.17. The inequalities clearly hold at $t = 0$, and then we only need to check that they are preserved after an arrival. We will prove only the inequality for $D^b$; the proof for $D^a$ is identical.

Bid arrival. We consider first the arrival event $A_t = (q, \text{bid})$. If $\alpha_{t-} \succ q$ (the bid does not depart in $L$), or if $\tilde D^b_{t-}(p) > D^b_{t-}(p)$ (strictly) for all $p \ge q$, then the (nonstrict) inequality $\tilde D^b_t(p) \ge D^b_t(p)$ holds. Thus, we need to consider the case $\alpha_{t-} \not\succ q$ and $\tilde D^b_{t-}(p) = D^b_{t-}(p)$ for some $p \ge q$. Since there is an ask at $\alpha_{t-} \not\succ q \le p$ in $L$, there must be no bids to the right of $p$: $Q^b_{t-}[p, \infty) = 0$. By (83) this implies that
\[
D^b_{t-}[p, \infty) = Q^b_0[p, \infty) + A^b_{t-}[p, \infty)
\]
has the biggest value it could possibly have. Since $\tilde D^b_{t-}(p) = D^b_{t-}(p)$, we see
\[
D^b_{t-}(\infty) = D^b_{t-}(p) + D^b_{t-}[p, \infty) \ge \tilde D^b_{t-}(\infty).
\]
Since we had assumed $D^b_{t-}(\infty) \le \tilde D^b_{t-}(\infty)$, we must in fact have equality: that is, $\tilde D^b_{t-}(\infty) = D^b_{t-}(\infty)$. Applying Lemma 3.18, we conclude $\tilde\alpha_{t-} \le \alpha_{t-} \not\succ q$, and hence $\tilde\alpha_{t-} \mathrel{\tilde{\not\succ}} q$. Therefore, the arriving bid departs in both systems ($D^b_t = D^b_{t-} + \delta_q$ and $\tilde D^b_t = \tilde D^b_{t-} + \delta_q$) and the inequality is preserved.

Ask arrival. We now consider the arrival of an ask at price $q$. The argument will be very similar, except at the very end. If $\beta_{t-} \prec q$ (no bid departs in $L$), or if $\tilde D^b_{t-}(p) > D^b_{t-}(p)$ (strictly) for all $p \ge \beta_{t-}$, then the (nonstrict) inequality $\tilde D^b_t(p) \ge D^b_t(p)$ holds.
Thus, we need to consider the case $\beta_{t-} \not\prec q$ and $\tilde D^b_{t-}(p) = D^b_{t-}(p)$ for some $p \ge \beta_{t-} \not\prec q$. Since the rightmost bid in $L$ is at $\beta_{t-}$, we have $Q^b_{t-}(\beta_{t-}, \infty) = 0$, and in particular $Q^b_{t-}[p, \infty) = 0$. By (83) this implies that
\[
D^b_{t-}[p, \infty) = Q^b_0[p, \infty) + A^b_{t-}[p, \infty)
\]
has the biggest value it could possibly have; since $\tilde D^b_{t-}(p) = D^b_{t-}(p)$, we see that
\[
D^b_{t-}(\infty) = D^b_{t-}(p) + D^b_{t-}[p, \infty) \ge \tilde D^b_{t-}(\infty).
\]
Since we had assumed $D^b_{t-}(\infty) \le \tilde D^b_{t-}(\infty)$, we must in fact have equality: that is, $\tilde D^b_{t-}(\infty) = D^b_{t-}(\infty)$. Applying Lemma 3.18, we conclude $\tilde\beta_{t-} \ge \beta_{t-} \not\prec q$, hence $\tilde\beta_{t-} \mathrel{\tilde{\not\prec}} q$. Thus, the arriving ask will depart in both systems (with a bid at $\beta_{t-}$ in $L$, and with a bid at $\tilde\beta_{t-}$ in $\tilde L$). This immediately implies
\[
\tilde D^b_t(x) \ge D^b_t(x), \qquad x \notin [\beta_{t-}, \tilde\beta_{t-}). \tag{84}
\]
We claim that on the interval $[\beta_{t-}, \tilde\beta_{t-})$ we had a strict inequality $\tilde D^b_{t-}(x) > D^b_{t-}(x)$. Indeed, for $x \in [\beta_{t-}, \tilde\beta_{t-})$, (83) implies
\[
D^b_{t-}(x, \infty) = Q^b_0(x, \infty) + A^b_{t-}(x, \infty),
\]
which is the biggest value it could possibly have. On the other hand,
\[
\tilde D^b_{t-}(x, \infty) = Q^b_0(x, \infty) + A^b_{t-}(x, \infty) - \tilde Q^b_{t-}(x, \infty) \le D^b_{t-}(x, \infty) - 1 \quad \text{on } [\beta_{t-}, \tilde\beta_{t-}).
\]
The last inequality holds because $\tilde Q^b_{t-}(x, \infty) \ge 1$, since by assumption there is a waiting bid at $\tilde\beta_{t-} > x$ in $\tilde L$. Since we already know $D^b_{t-}(\infty) = \tilde D^b_{t-}(\infty)$, this inequality is enough to conclude
\[
\tilde D^b_{t-}(x) = \tilde D^b_{t-}(\infty) - \tilde D^b_{t-}(x, \infty) > D^b_{t-}(x), \qquad x \in [\beta_{t-}, \tilde\beta_{t-}). \tag{85}
\]
Combining (84) and (85) yields the result. □

Corollary 3.19. Let $L$ and $\tilde L$ be limit order books with the same starting state and external arrival process, but let $\tilde P$ be coarser than $P$ (Definition 3.16). Then $\tilde\kappa_b \le \kappa_b$ and $\tilde\kappa_a \ge \kappa_a$.

4. Proof of Theorems 3.5 and 3.7

In this section we prove Theorems 3.5 and 3.7, beginning with Theorem 3.5. We will be using the machinery of the Kolmogorov 0-1 law for tail $\sigma$-algebras [Williams, 1991]. Define the events
\[
A^b(x) \equiv \{D^b_\infty(-\infty, x] < \infty\}, \qquad A^a(x) \equiv \{D^a_\infty[x, \infty) < \infty\}.
\]
Note that for any set $S$ the functions $D^i_t(S)$, $i = a, b$, are nondecreasing in $t$, so the limits as $t \to \infty$ always exist (but may be infinite).
For limit order books $\tilde L$, $\hat L$ we will denote the corresponding events $\tilde A^i$, $\hat A^i$ ($i = a, b$).

Proof of Theorem 3.5. Without loss of generality, we may index the time by nonnegative integers. Let $\mathcal{F}_n = \sigma(\{A_t, t \ge n\})$; the tail $\sigma$-algebra is $\mathcal{F} \equiv \bigcap_n \mathcal{F}_n$. We begin by showing that for any $x$, the events $A^b(x)$, $A^a(x)$ are $\mathcal{F}$-measurable, that is, that they are $\mathcal{F}_n$-measurable for all $n$. Below we consider $A^b(x)$; the case of $A^a(x)$ is similar.

The event $A^b(x) \in \mathcal{F}_0$ because
\[
A^b(x) = \bigcup_m \bigcap_n A^b_{n,m}(x),
\]
where $A^b_{n,m}(x)$ is the event that there are at most $m$ bid departures at prices $p \le x$ by the time of the $n$th arrival (clearly, an element of $\mathcal{F}_0$). We now show that $A^b(x)$ is $\mathcal{F}_n$-measurable.

Consider the following limit order book $\tilde L$. The arrival process of $\tilde L$ is given by $(A_{t+n})_{t \ge 0}$; the starting state of $\tilde L$ is $(\tilde Q^b_0, \tilde Q^a_0) \equiv (Q^b_n, Q^a_n)$. Then at all times $t \ge 0$, $(\tilde Q^b_t, \tilde Q^a_t) = (Q^b_{t+n}, Q^a_{t+n})$. Consequently, $\tilde A^b(x)$ holds for $\tilde L$ if and only if $A^b(x)$ holds for $L$.

Now consider a limit order book $\hat L$ with arrival process $(A_{t+n})_{t \ge 0}$ (same as for $\tilde L$) but starting state $(Q^b_0, Q^a_0)$ (same as for $L$). By construction, $\hat A^b(x) \in \mathcal{F}_n$. On the other hand, since the starting states of $\tilde L$ and $\hat L$ differ by a finite number of orders, Corollary 3.15 implies that $\hat A^b(x)$ holds for $\hat L$ if and only if $\tilde A^b(x)$ holds for $\tilde L$, i.e. the events $A^b(x)$ and $\hat A^b(x)$ coincide. We conclude that $A^b(x)$ is $\mathcal{F}_n$-measurable for all $n$, and therefore it is $\mathcal{F}$-measurable.

The argument above also demonstrates that whether $A^b(x)$ holds for $L$ is unaffected by finite changes in the starting state of $L$.

By Kolmogorov's 0-1 law [Williams, 1991, Theorem 4.11], for each $x$ the event $A^b(x)$ holds with probability 0 or 1. Let $\kappa_b \equiv \sup\{x : \mathbb{P}(A^b(x)) = 1\}$. We claim that $\kappa_b$ satisfies the conditions of the theorem.

First, for any $x < \kappa_b$ there exists $y$ with $x \le y \le \kappa_b$ such that $\mathbb{P}(A^b(y)) = 1$. This implies $\mathbb{P}(A^b(x)) = 1$ as well, since for $x \le y$ we have $D^b_\infty(-\infty, x] \le D^b_\infty(-\infty, y]$.
Consequently, for any $\epsilon > 0$, only finitely many bids at prices below $\kappa_b - \epsilon$ ever depart the system, and similarly for asks.

It remains to show that, almost surely, $Q^b_t[\kappa_b + \epsilon, \infty) = 0$ infinitely often. This is by construction: for any $x > \kappa_b$ there will be infinitely many bid departures at prices $\le x$. At any time that a bid at price $\le x$ leaves the system, it must be the rightmost bid, so $Q^b_t(x, \infty) = 0$ happens infinitely often for any $x > \kappa_b$. □

We now consider the case of iid arrivals, and investigate the number of asks in the system at the times when there are no bids to the right of $\kappa_b + \epsilon$.

Proof of Theorem 3.7. We know from Theorem 3.5 that $\kappa_b$ and $\kappa_a$ exist and are unique. The assertion $F^b(\kappa_b) = 1 - F^a(\kappa_a)$ is a consequence of the fact that the arrival distributions are absolutely continuous, and bids and asks always depart in pairs. Indeed, the long-run proportion of arriving bids that leave the system is $1 - F^b(\kappa_b)$, and this must equal the long-run proportion of arriving asks that leave the system, namely $F^a(\kappa_a)$.

Let $T_n \to \infty$ be the sequence of times along which $Q^b_{T_n}[\kappa_b + \epsilon, \infty) = 0$. We analyse the number of waiting asks $Q^a_{T_n}(-\infty, \kappa_a + \epsilon]$. Using (83), we may write for any $T$,
\[
\frac1T Q^a_T(-\infty, \kappa_a + \epsilon] = \frac1T \bigl( Q^a_0(-\infty, \kappa_a + \epsilon] + A^a_T(-\infty, \kappa_a + \epsilon] - D^a_T(-\infty, \kappa_a + \epsilon] \bigr),
\]
where the contribution of the finite starting state vanishes as $T \to \infty$. The law of large numbers implies for the arrival term
\[
\lim_{T \to \infty} \frac1T A^a_T(-\infty, \kappa_a + \epsilon] = \frac12 F^a(\kappa_a + \epsilon), \quad \text{w.p.1.}
\]
For the departure term,
\[
D^a_T(-\infty, \kappa_a + \epsilon] = D^a_T(-\infty) - D^a_T(\kappa_a + \epsilon).
\]
Theorem 3.5 implies that the second term on the right remains bounded as $T \to \infty$, w.p.1. Since also $D^b_T(\kappa_b - \epsilon)$ remains bounded w.p.1, we obtain
\[
\lim_{T \to \infty} \frac1T \bigl( D^a_T(-\infty, \kappa_a + \epsilon] - D^b_T[\kappa_b - \epsilon, \infty) \bigr) = \lim_{T \to \infty} \frac1T \bigl( D^a_T(-\infty) - D^b_T(\infty) \bigr) = 0, \quad \text{w.p.1,}
\]
since bids and asks always depart in pairs. We would like to translate the statement about asks into a statement about bids. Observe
\[
D^b_T[\kappa_b - \epsilon, \infty) \le D^b_T[\kappa_b + \epsilon, \infty) + Q^b_0[\kappa_b - \epsilon, \kappa_b + \epsilon) + A^b_T[\kappa_b - \epsilon, \kappa_b + \epsilon).
\]
The second term is finite by assumption, while for the third term, w.p.1,
\[
\lim_{T \to \infty} \frac1T A^b_T[\kappa_b - \epsilon, \kappa_b + \epsilon) = \frac12 \bigl( F^b(\kappa_b + \epsilon) - F^b(\kappa_b - \epsilon) \bigr) \le \epsilon \sup_{p \in [0,1]} f^b(p).
\]
We conclude
\[
\frac1T Q^a_T(-\infty, \kappa_a + \epsilon] \le \frac12 F^a(\kappa_a + \epsilon) - \frac1T D^b_T[\kappa_b + \epsilon, \infty) + \epsilon \sup_{p \in [0,1]} f^b(p) + o(1), \quad \text{w.p.1,} \tag{86}
\]
where the $o(1)$ term tends to 0 as $T \to \infty$. Using $F^a(\kappa_a) = 1 - F^b(\kappa_b)$, we bound the first term as
\[
F^a(\kappa_a + \epsilon) \le F^a(\kappa_a) + \epsilon \sup_{p \in [0,1]} f^a(p) \le 1 - F^b(\kappa_b + \epsilon) + \epsilon \Bigl( \sup_{p \in [0,1]} f^a(p) + \sup_{p \in [0,1]} f^b(p) \Bigr).
\]
Recall $M = \max_{i = a, b} \sup_{p \in [0,1]} f^i(p)$, and that by the law of large numbers $\frac1T A^b_T[\kappa_b + \epsilon, \infty) \to \frac12 \bigl(1 - F^b(\kappa_b + \epsilon)\bigr)$ w.p.1. Putting the estimates above together, we conclude
\[
\frac1T Q^a_T(-\infty, \kappa_a + \epsilon] \le \frac1T \bigl( A^b_T[\kappa_b + \epsilon, \infty) - D^b_T[\kappa_b + \epsilon, \infty) \bigr) + 2M\epsilon + o(1).
\]
Finally, looking along the sequence $T_n$, the first term vanishes, so
\[
\limsup_{T_n \to \infty} \frac{1}{T_n} Q^a_{T_n}(-\infty, \kappa_a + \epsilon] \le 2M\epsilon. \qquad \square
\]

5. Strict limit order book

We would like to analyse the steady-state behaviour of the system when the pricing is continuous: that is, $P(x) = x$. Lemma 3.17 allows us to bound the bid and ask counts $Q^b_t(x)$, $Q^a_t(x)$ for this case from below, but not from above. To be able to bound them from above, we introduce the model of a strict limit order book.

A strict limit order book differs from the ordinary one in that a bid at price $p$ and an ask at price $q$ are only allowed to depart the system when $q \prec p$ (i.e., the inequality must be strict). In particular, a given price level may contain both a waiting bid and a waiting ask. This does not affect the dynamics when $P(x) = x$, since w.p.1 all orders in that system arrive at distinct prices. On the other hand, if for some price $p$ the set $\{x : x \sim p\}$ has positive measure, then this modification will alter the paths of the limit order book, because fewer bid-ask pairs will be eligible to leave.

The main results in this section are Lemma 3.21 and Corollary 3.25. Lemma 3.21 shows that when the arrival process and the price level function are sufficiently "nice", strict and nonstrict limit order books are really quite similar. Corollary 3.25 concludes that we can bound the constant $\kappa_b$ for a limit order book with continuous pricing from above and below using only ordinary limit order books with discrete pricing.

Lemma 3.20.
Let L and L˜ be strict limit order books with the same starting state and external arrival process, but let P˜ be coarser than P (Definition 3.16). Then D˜bt (p) ≤ Dbt (p), D˜at (p) ≤ Dat (p), ∀p, ∀t ≥ 0. The proof is similar to the proof of Lemma 3.17. We now show that the strict and non-strict versions of the limit order book are not too different. Let f bN , f a N be defined on counting measures of bids and asks as follows: for bids, f bN(Q b t) = f b N( ∑ biδpi) ≡ ∑ 1pi≤1− 1N biδpi ; faN(Q a t ) = f a N( ∑ aiδpi) ≡ ∑ 1pi≥ 1N aiδpi− 1N . Lemma 3.21. Let L be a limit order book with empty starting state, arrival process (At)t≥0 with arrivals iid uniform on [0, 1] × {bid, ask} and price level function P(x) = 1 N bNxc. Let Lˇ be a limit order book with empty starting state, the same price level function P, and arrival events Aˇt given as follows. If At = (p, ask) for p > 1N , then Aˇt = (p − 1N , ask). If At = (p, bid) for p < 1 − 1N , then Aˇt = (p, bid). (The remaining events are ignored.) Then the state of Lˇ at time t is given by Qˇbt = f b N(Q b t), Qˇ a t = f a N(Q a t ). Proof. An easy induction on t.  We derive the following two corollaries for Poisson arrival processes. 75 Corollary 3.22. Let L be a limit order book with empty starting state, price level function P(x) = 1 N bNxc, and arrival process which is Poisson on the set [0, 1]×{bid, ask}× [0,∞) (the last coordinate being time), where arrival events have rate 1 in time and are uniform on [0, 1]× {bid, ask}. We work on the probability-one event that the set of times at which arrivals occur is discrete, and times of arrival events do not coincide. Let Lˇ be a limit order book with empty starting state, price level function Pˇ(x) = 1 N−1b(N − 1)xc, and arrival process which is Poisson on the set [0, 1] × {bid, ask} × [0,∞) (the last co- ordinate being time), where arrival events have rate N−1 N in time and are uniform on [0, 1]× {bid, ask}. 
For a counting measure ∑ ciδpi with support on [0, N−1 N ] define f˜( ∑ ciδpi) = ∑ ciδ N N−1pi . Then Qˇit and f˜ ( f iN(Q i t) ) , i = a, b, are equal in distribution. Proof. Lemma 3.21 gives a coupling. Observe that the construction given in the lemma applied to a Poisson process of rate 1 for the ordinary limit order book produces an arrival process that is Poisson of rate N−1 N in time for the strict limit order book, and which is uniform on [0, N−1 N ]. If we want the arrival events to be uniform on [0, 1] instead, we simply need to rescale space using f˜ .  Corollary 3.23. Let L be a limit order book as in Corollary 3.22. Let Lˇ be a limit order book with empty starting state, price level function Pˇ(x) = 1 N−1b(N − 1)xc, and arrival process which is Poisson on the set [0, 1]× {bid, ask} × [0,∞) (the last coordinate being time), where arrival events have rate 1 in time and are uniform on [0, 1]×{bid, ask}. We work on the probability-one event that the set of times at which arrivals occur is discrete, and times of arrival events do not coincide. Then QˇiN−1 N t and f˜ ( f iN(Q i t) ) , i = a, b, are equal in distribution. Proof. For a Poisson process, rescaling time is equivalent to rescaling the rate, so this follows from Corollary 3.22.  Corollary 3.24. Let L be an ordinary limit order book with finite starting state, price level function P(x) = 1 N bNxc, and let the arrival events (At)t≥0 be iid uniform. Let Lˇ be a strict limit order book with finite starting state, price level function Pˇ(x) = 1 N−1b(N−1)xc, and the same arrival process. Then N N − 1κb = κˇb, N N − 1(1− κa) = 1− κˇa. where κi (with or without the hat) are defined as in Theorem 3.5. Proof. Since the statements about κi do not depend on the time scaling of the arrival process so long as the arrival events are independent, we may take the arrival process to be Poisson in time. 
Further, as we saw in the proof of Theorem 3.5, the statements about κ do not depend on the starting state provided it’s finite. Applying Corollary 3.23 we conclude that the defining properties of Theorem 3.5 are simultaneously true or not true of κb (respectively 1 − κa) in the ordinary system, and of κˇb ≡ NN−1κb (respectively 1− κˇa ≡ NN−1(1− κa)) in the strict system.  Combining Corollary 3.24 with Lemmas 3.17 and 3.20 gives the following Corollary 3.25. Consider the limit order book with finite starting state, P(x) = x, and arrival events (At)t≥0 iid uniform on [0, 1] × {bid, ask}. Let κi, i = a, b be the associated threshold values (Definition 3.6). Consider also the ordinary limit order books 76 LN and strict limit order books LˇN , with the same arrival process and PN(x) = 1 N bNxc, PˇN(x) = 1 N−1b(N − 1)xc. Let κNi , κˇNi , i = a, b be the corresponding threshold values. Then κNb ≤ κb ≤ κˇNb = N N − 1κ N b ,(87) 1− κNa ≤ 1− κa ≤ 1− κˇNa = N N − 1(1− κ N a ).(88) 6. Exact values of the thresholds κb and κa In this section we will derive the value of κb for the case of iid uniform arrivals and price level function P(x) = x. In the process, we will conjecture steady-state distributions for the rightmost bid and the leftmost ask. We will find certain sequences of times with the property that empirical distributions converge to the conjectured distributions. For the duration of this section, we will assume that the arrival events are iid uniform on [0, 1]× {bid, ask}. We will consider price level functions with a finite number of distinct price levels. Typically, but not necessarily, we will be interested in P(x) = 1 N bNxc for some N . In all the models we investigate, the price level functions will satisfy x ≺ y if and only if 1 − y ≺ 1 − x; together with the symmetry of the arrival process about the point 1 2 , this implies that the threshold values satisfy κb = 1− κa. Definition 3.26. We will refer to the set of prices in the kth price level as bin k. 
Let lk be the width of the k th bin; that is, the kth bin is [l1 + . . .+ lk−1, l1 + . . .+ lk) = {x : x ∼ l1 + . . .+ lk−1} (Recall that P was taken to be right-continuous in Definition 3.3.) For x ∈ [0, 1], we let [x] denote the index of the bin to which x belongs; that is, [x] = k if x belongs to bin k. Write κb(N) and κa(N) for the constants identified in Theorem 3.5. Let kb(N) = [κb(N)], unless κb = l1 + . . . + lk−1 happens to coincide with the leftmost boundary of bin k; in that case, define kb(N) = k − 1. Similarly, let ka(N) = [κa(N)] (since we are taking bins to be closed on the left and open on the right, the corresponding caveat is unnecessary). For a measure pi on [0, 1] we abuse notation slightly, and write write pi(k) to mean the measure of bin k. Theorem 3.5 implies that the number of bid departures from bins < kb(N), and ask departures from bins > ka(N), will be finite. For each δ > 0 let Tn = Tn(δ)→∞ be the sequence of (random) times along which (89a) DbT1(−∞, κb(N)− δ] = Db∞(−∞, κb(N)− δ], and (89b) DaT1 [κa(N) + δ,∞) = Da∞[κa(N) + δ,∞); (89c) QbTn [κb(N) + δ,∞) = 0, and (89d) 1 Tn QaTn(−∞, κa(N) + δ] ≤ δ. The (almost-sure) existence of such a sequence has been established in Theorems 3.5 and 3.7. 77 For any time T and 1 ≤ k ≤ N , we may define the empirical distributions of the rightmost bid βt and the leftmost ask αt up to time T as follows: pibT (k) = 1 T # {0 ≤ t ≤ T : [β(t)] = k} , piaT (k) = 1 T # {0 ≤ t ≤ T : [α(t)] = k} These are discrete probability distributions on the compact set [0, 1]; consequently, there is some subsequence of our chosen sequence Tn(δ) (which, by a slight abuse of notation, we also call Tn) along which we have convergence pibTn w→ pib, piaTn w→ pia for some discrete probability measures pib and pia. We will now use the relations in (89) to derive some properties of these limiting distributions pia and pib. Remark 3.27. Slightly more is true than stated above. 
Namely, any subsequence of $T_n(\delta)$ has a further subsequence, say $T_{n_r}$, along which the $\pi^b_{T_{n_r}}$ converge to some limit, and this limit will have all the properties given below.

Lemma 3.28. With the above definitions,
\[
\Bigl( \sum_{i \le k} l_i \Bigr) \frac{\pi^b(k)}{l_k} + \sum_{i \le k} \pi^a(i) = 1
\]
and
\[
1 - \frac{\delta}{\min l_i} \le \Bigl( \sum_{i = k}^{N} l_i \Bigr) \frac{\pi^a(k)}{l_k} + \sum_{i \ge k} \pi^b(i) \le 1.
\]

Proof. Note that, for any set $S$, we have
\[
Q^b_T(S) - Q^b_0(S) = \sum_{t \le T} 1\{A_t = (p_t, \text{bid}),\ p_t \in S,\ \alpha_t \succ p_t\} - \sum_{t \le T} 1\{A_t = (p_t, \text{ask}),\ \beta_t \in S,\ p_t \not\succ \beta_t\}.
\]
We now estimate these two terms when $S$ is bin $k$.

The first term is the number of exogenous bid arrivals into bin $k$, up to time $T$, during the times when $[\alpha_t] > k$. Recalling that exogenous arrivals are independent of the state of the limit order book,
\[
\sum_{t \le T} 1\{A_t = (p_t, \text{bid}),\ p_t \in S,\ \alpha_t \succ p_t\} = \mathrm{Bin}\Bigl( \tfrac12 l_k \Bigl( \sum_{i > k} \pi^a_T(i) \Bigr) T \Bigr)
\]
is a binomial random variable with mean $\tfrac12 l_k \bigl( \sum_{i > k} \pi^a_T(i) \bigr) T$. Indeed, during each of the $\sum_{i > k} \pi^a_T(i)\, T$ times when $[\alpha_t] > k$ there is a probability $\tfrac12 l_k$ of the arrival event being a bid at a price $p_t \in S$.

The second term is the number of external ask arrivals, up to time $T$, into bins $\le k$, during the times when $[\beta_t] = k$. We obtain that
\[
\sum_{t \le T} 1\{A_t = (p_t, \text{ask}),\ \beta_t \in S,\ p_t \not\succ \beta_t\} = \mathrm{Bin}\Bigl( \tfrac12 \Bigl( \sum_{i \le k} l_i \Bigr) \pi^b_T(k)\, T \Bigr)
\]
is a binomial random variable with mean $\tfrac12 \bigl( \sum_{i \le k} l_i \bigr) \pi^b_T(k)\, T$, since during each of the $\pi^b_T(k)\, T$ times when $[\beta_t] = k$, the probability of an ask arriving at a price $p_t \not\succ l_1 + \ldots + l_k$ is $\tfrac12 \sum_{i \le k} l_i$.

Dividing by $T$, looking along $T_n$, and using (89c) to evaluate the left-hand side, we obtain
\[
0 = \frac{1}{T_n} \mathrm{Bin}\Bigl( \tfrac12 l_k \Bigl( \sum_{i > k} \pi^a_{T_n}(i) \Bigr) T_n \Bigr) - \frac{1}{T_n} \mathrm{Bin}\Bigl( \tfrac12 \Bigl( \sum_{i \le k} l_i \Bigr) \pi^b_{T_n}(k)\, T_n \Bigr). \tag{90}
\]
By the law of large numbers, this implies
\[
0 = \lim_{n \to \infty} \Bigl( \tfrac12 l_k \Bigl( \sum_{i > k} \pi^a_{T_n}(i) \Bigr) - \tfrac12 \Bigl( \sum_{i \le k} l_i \Bigr) \pi^b_{T_n}(k) \Bigr).
\]
Using the fact that $\sum_i \pi^a_{T_n}(i) = 1$ for all $n$, and $\pi^i_{T_n} \xrightarrow{w} \pi^i$ for $i = a, b$, we conclude that
\[
\Bigl( \sum_{i \le k} l_i \Bigr) \frac{\pi^b(k)}{l_k} + \sum_{i \le k} \pi^a(i) = 1.
\]
The equation for asks is obtained similarly, using (89d) instead of (89c) in (90). □
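For completeness, the rearrangement behind the bid equation of Lemma 3.28 can be written out as a short worked step. It uses only $\sum_i \pi^a(i) = 1$ and the limit obtained from (90):
\[
0 = \tfrac12 l_k \Bigl( 1 - \sum_{i \le k} \pi^a(i) \Bigr) - \tfrac12 \Bigl( \sum_{i \le k} l_i \Bigr) \pi^b(k)
\quad \Longrightarrow \quad
\Bigl( \sum_{i \le k} l_i \Bigr) \frac{\pi^b(k)}{l_k} + \sum_{i \le k} \pi^a(i) = 1,
\]
where the implication follows on multiplying through by $2 / l_k$.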
Observe that if $\lim_{n \to \infty} \frac{1}{T_n} Q^b_{T_n}(S) = Q$ exists and is well-defined, (90) becomes
\[
Q = \frac{1}{T_n} \mathrm{Bin}\Bigl( \tfrac12 l_k \Bigl( \sum_{i > k} \pi^a_{T_n}(i) \Bigr) T_n \Bigr) - \frac{1}{T_n} \mathrm{Bin}\Bigl( \tfrac12 \Bigl( \sum_{i \le k} l_i \Bigr) \pi^b_{T_n}(k)\, T_n \Bigr),
\]
and in the limit we obtain
\[
\Bigl( \sum_{i \le k} l_i \Bigr) \frac{\pi^b(k)}{l_k} + \sum_{i \le k} \pi^a(i) = 1 - \frac{2Q}{l_k}. \tag{91}
\]
We can now prove Theorem 3.9.

Proof of Theorem 3.9. We begin by analysing a model with three bins of width $l$, $1 - 2l$, and $l$ again; that is, $N = 3$ in the above machinery. We will see how this relates to the original limit order book later.

By symmetry, $\kappa_a = 1 - \kappa_b$; the two clearly cannot be in the middle bin, so $k_b = 1$ and $k_a = 3$. Let $\delta > 0$ be small. We will consider the sequence of times $T_n$ along which (89) holds. We deduce the following variants of (90):
\[
\pi^b(1) + \pi^a(1) = 1 - \frac{\kappa_b}{l} \tag{92a}
\]
\[
\pi^a(3) + \pi^b(3) \in \Bigl[ 1 - \frac{\kappa_b + \delta}{l},\ 1 - \frac{\kappa_b}{l} \Bigr] \tag{92b}
\]
\[
\frac{1 - l}{1 - 2l} \pi^b(2) + \pi^a(2) + \pi^a(1) = 1 \tag{92c}
\]
\[
\frac{1 - l}{1 - 2l} \pi^a(2) + \pi^b(2) + \pi^b(3) \in \Bigl[ 1 - \frac{\delta}{l},\ 1 \Bigr] \tag{92d}
\]
\[
\frac1l \pi^b(3) + \pi^a(1) + \pi^a(2) + \pi^a(3) = 1 \tag{92e}
\]
\[
\frac1l \pi^a(1) + \pi^b(1) + \pi^b(2) + \pi^b(3) \in \Bigl[ 1 - \frac{\delta}{l},\ 1 \Bigr] \tag{92f}
\]
(Note that $\lim_{t \to \infty} \frac1t Q^b_t[0, l) = \tfrac12 \kappa_b$, and similarly for asks in $[1 - l, 1]$.) Taking
\[
(2 - 3l)\bigl( \text{(92a)} + \text{(92b)} \bigr) + (1 - 2l)\bigl( \text{(92c)} + \text{(92d)} \bigr) - l(1 - 2l)\bigl( \text{(92e)} + \text{(92f)} \bigr)
\]
yields
\[
(2 - 4l + 2l^2)\bigl( \pi^b(1) + \pi^a(1) + \pi^b(2) + \pi^a(2) + \pi^b(3) + \pi^a(3) \bigr) \in 6 - 12l + 4l^2 - \Bigl( \frac4l - 6 \Bigr) \kappa_b + \Bigl[ -\frac{3 - 5l}{l}\, \delta,\ (1 - 2l)\, \delta \Bigr].
\]
The left-hand side is clearly $\le 2(2 - 4l + 2l^2)$. Therefore,
\[
\kappa_b \ge \frac{l - 2l^2}{2 - 3l} + C\delta
\]
for some constant $C$. Since $\delta$ can be chosen arbitrarily small, we conclude that $\kappa_b \ge \frac{l - 2l^2}{2 - 3l}$ for the three-bin model.

The result of the theorem now follows by applying Corollary 3.19. Indeed, for any limit order book $L$, let $p$ be such that $0 \prec p \prec 1 - p \prec 1$, and consider the limit order book $\tilde L$ with the coarser pricing scheme
\[
\tilde P(x) = \begin{cases} 0, & x \le p, \\ \tfrac12, & p < x \le 1 - p, \\ 1, & 1 - p < x. \end{cases}
\]
Corollary 3.19 implies that $\frac{p - 2p^2}{2 - 3p} \le \tilde\kappa_b \le \kappa_b$. In particular, if $P(x) = x$ (so $p$ can be chosen arbitrarily), then choosing $p = 1/3$ maximises this bound at $\tfrac19$. □
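The final maximisation can be checked directly. The name $g$ below is introduced only for this check and does not appear elsewhere in the text:
\[
g(p) = \frac{p - 2p^2}{2 - 3p}, \qquad
g'(p) = \frac{(1 - 4p)(2 - 3p) + 3(p - 2p^2)}{(2 - 3p)^2} = \frac{2(1 - p)(1 - 3p)}{(2 - 3p)^2}.
\]
On $(0, \tfrac12)$ the derivative is positive for $p < \tfrac13$ and negative for $p > \tfrac13$, so the bound is largest at $p = \tfrac13$, where
\[
g\bigl(\tfrac13\bigr) = \frac{\tfrac13 - \tfrac29}{2 - 1} = \frac19.
\]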
We now establish some further properties of the limiting distributions pia and pib, for arbitrary N ≥ 3. Lemma 3.29. (1) pib(k) = 0 for k ≤ kb(N)−1 or k ≥ ka(N). Similarly, pia(k) = 0 for k ≥ ka(N) + 1 or k ≤ kb(N). (2) pi b(k) lk ≤ (∑i≤k li)−1, and pia(k)lk ≤ (∑i≥k li)−1, for all k = 1, . . . , N . (3) pib(ka(N)− 1) lka(N)−1 ≤ lka(N)−1  ∑ i≥ka(N)−1 li −1 ∑ i≤ka(N)−1 li −1 and pia(kb(N) + 1) lkb(N)+1 ≤ lkb(N)+1  ∑ i≤kb(N)+1 li −1 ∑ i≥kb(N)+1 li −1 . Proof. For (1), note that if for some k we have pib(k) > 0, then for some  > 0 and all sufficiently large n we must have pibn(k) > , i.e. #{t ≤ Tn : [β(t)] = k} > Tn for all large n. In that case law of large numbers implies lim inf n→∞ 1 Tn DbTn(bin k) > . Since for k ≤ kb(N)− 1 this limit is equal to 0, we conclude pib(k) = 0 for k ≤ kb(N)− 1. By an identical argument, pia(k) = 0 for k ≥ ka(N) + 1. Further, after a finite time there are always asks in bin ka(N). Consequently, after a finite time the rightmost bid cannot be in any bin k ≥ ka(N), so pib(k) = 0 for k ≥ ka(N) By an identical argument, pia(k) = 0 for k ≤ kb(N). 80 For (2), define Ab,QT (k) (respectively D b,Q T (k)) to be the number of bids arriving into (respectively departing from) queue in bin k up to time T (as opposed to departing without ever visiting the queue). Observe Db,QT (k) = Bin ( 1 2 pibT (k)T (∑ i≤k li )) . The number of arrivals into queue in bin k is certainly bounded above by the total number of arrivals into bin k, namely Ab,QT (k) ≤ AbT (k) = Bin (1 2 lkT ) . Dividing by T and looking along Tn →∞, lim n→∞ 1 Tn Db,QT (k) = 1 2 pib(k) (∑ i≤k li ) and lim n→∞ 1 Tn AbTn(k) = 1 2 lk. The inequality “number of departures ≤ number of arrivals” thus gives pib(k)(∑i≤k li) ≤ lk. By an identical argument for asks, we conclude pib(k) lk ≤ (∑ i≤k li )−1 , pia(k) lk ≤ (∑ i≥k li )−1 , ∀k. Finally, for (3), we use the definition of the time sequence Tn to refine the computation on the number of arrivals into queue. 
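Note that external arrivals into bin $k$ go into queue if and only if $[\alpha_t] > k$: that is, we have
\[
A^{b,Q}_T(k) = \mathrm{Bin}\Bigl( \tfrac12 \Bigl( \sum_{i > k} \pi^a_T(i) \Bigr) T\, l_k \Bigr).
\]
Let $T_n \to \infty$, and recall from (1) and (2) that
\[
\pi^a(k) = 0 \ \text{ for } k \ge k_a(N) + 1, \qquad \pi^a(k_a(N)) \le l_{k_a(N)} \Bigl( \sum_{i \ge k_a(N)} l_i \Bigr)^{-1}.
\]
This gives
\[
\limsup_{n \to \infty} \frac{1}{T_n} A^{b,Q}_{T_n}(k_a(N) - 1) \le \tfrac12\, l^2_{k_a(N)-1} \Bigl( \sum_{i \ge k_a(N)-1} l_i \Bigr)^{-1},
\]
while for the departures we still have
\[
\lim_{n \to \infty} \frac{1}{T_n} D^{b,Q}_{T_n}(k_a(N) - 1) = \tfrac12\, \pi^b(k_a(N) - 1) \Bigl( \sum_{i \le k_a(N)-1} l_i \Bigr).
\]
Combining the two yields the inequality
\[
\frac{\pi^b(k_a(N) - 1)}{l_{k_a(N)-1}} \le l_{k_a(N)-1} \Bigl( \sum_{i \ge k_a(N)-1} l_i \Bigr)^{-1} \Bigl( \sum_{i \le k_a(N)-1} l_i \Bigr)^{-1},
\]
and similarly for asks. □

We now let $N \to \infty$:

Proposition 3.30. (1) Let $T_n = T_n(\delta)$ be a sequence along which (89) is satisfied, and also $\pi^b_{T_n} \xrightarrow{w} \pi^b$, $\pi^a_{T_n} \xrightarrow{w} \pi^a$ for some distributions $\pi^b$, $\pi^a$. Then the limiting distributions satisfy
\[
\Bigl( \sum_{i \le k} l_i \Bigr) \frac{\pi^b(k)}{l_k} + \sum_{i \le k} \pi^a(i) = 1, \qquad \sum_{k = k_b(N)+1}^{k_a(N)} \pi^b(k) \ge 1 - l_{k_b(N)} \Bigl( \sum_{i \le k_b(N)} l_i \Bigr)^{-1} \tag{93a}
\]
and
\[
1 - \frac{\delta}{\min l_i} \le \Bigl( \sum_{i = k}^{N} l_i \Bigr) \frac{\pi^a(k)}{l_k} + \sum_{i \ge k} \pi^b(i) \le 1, \qquad \sum_{k = k_b(N)}^{k_a(N)-1} \pi^a(k) \ge 1 - l_{k_a(N)} \Bigl( \sum_{i \ge k_a(N)} l_i \Bigr)^{-1}. \tag{93b}
\]
(2) Consider a sequence of limit order books indexed by $N = 1, 2, \ldots$, such that the partial orderings $P_N$ are symmetric and are successive refinements. That is, $x \prec_N y$ if and only if $1 - y \prec_N 1 - x$ for all $N$, and if $x \prec_N y$ then $x \prec_n y$ for all $n \ge N$. We further require
\[
\max_{1 \le i \le N} l_i \to 0, \qquad \frac{\delta_N}{\min l_i} \to 0, \qquad \text{as } N \to \infty.
\]
Let $T^N_n(\delta_N)$ be the corresponding family of sequences along which (89) is satisfied. Each sequence of distributions $\pi^{i,N}_{T^N_n}$, $i = a, b$, has a convergent subsequence; call its limit $\pi^{i,N}$. (There may be multiple possible values for $\pi^{i,N}$, depending on the convergent subsequence we choose; pick any.) As $N \to \infty$, for some functions $\varpi^i$, the following pointwise convergence holds:
\[
\frac{1}{l_{[x]}} \pi^{b,N}([x]) \to \varpi^b(x), \qquad \frac{1}{l_{[x]}} \pi^{a,N}([x]) \to \varpi^a(x).
\]
The limits $\varpi^b$, $\varpi^a$ satisfy
\[
\varpi^b(x) = \mathbf{1}_{(\kappa, 1-\kappa)}(x)\, (1 - \kappa) \Bigl( \frac1x + \log\Bigl( \frac{1 - x}{x} \Bigr) \Bigr), \qquad \varpi^a(x) = \varpi^b(1 - x), \tag{94}
\]
where $\kappa$ is the unique real number satisfying
\[
\log \frac{1 - \kappa}{\kappa} = \frac{1}{1 - \kappa}, \qquad \kappa \approx 0.2178.
\]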
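The threshold equation in Proposition 3.30 can be checked numerically. The sketch below is only an illustration: it solves $\log\frac{1-\kappa}{\kappa} = \frac{1}{1-\kappa}$ by bisection on $(0, \tfrac12)$, where the residual is strictly decreasing, and then integrates the density (94) with a midpoint rule. The function names, the bracketing interval, and the tolerances are assumptions made for this example.

```python
import math


def threshold_residual(k):
    """log((1 - k) / k) - 1 / (1 - k); vanishes at the threshold kappa."""
    return math.log((1.0 - k) / k) - 1.0 / (1.0 - k)


def solve_kappa(lo=1e-9, hi=0.5, tol=1e-12):
    """Bisection; the residual is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if threshold_residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def varpi_b(x, kappa):
    """Limiting density (94) for the rightmost bid, zero outside (kappa, 1 - kappa)."""
    if not (kappa < x < 1.0 - kappa):
        return 0.0
    return (1.0 - kappa) * (1.0 / x + math.log((1.0 - x) / x))


def integrate_varpi_b(kappa, n=100000):
    """Midpoint-rule approximation of the integral of varpi_b over (kappa, 1 - kappa)."""
    h = (1.0 - 2.0 * kappa) / n
    return h * sum(varpi_b(kappa + (i + 0.5) * h, kappa) for i in range(n))


kappa = solve_kappa()
```

Since $\frac{\kappa}{1-\kappa} + 1 = \frac{1}{1-\kappa}$, the same root also solves the form of the equation stated in Theorem 3.11. Comparing `integrate_varpi_b(kappa)` with 1 is a quick consistency check that (94) is a probability density.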
Before proving this result, we derive Theorem 3.11 from it: Corollary (Theorem 3.11). Let the partial ordering P be P(x) = x, i.e. x ≺ y if and only if x < y. Let the arrival events be iid uniform on [0, 1] × {bid, ask}. Then the value of κb is given as the unique solution to log ( 1− κb κb ) = κb 1− κb + 1, κb ≈ 0.2178 and κa = 1− κb. Proof. We combine Proposition 3.30 with Corollary 3.25, noting that PN(x) = 1 N ! bN !xc, δN = (N !)−2 clearly satisfies the conditions of Proposition 3.30.  82 Proof of Proposition 3.30. (1) simply combines Lemma 3.28 with Lemma 3.29. For (2), since the space of probability distributions on the compact set [0, 1] is compact, any sequence of distributions has a convergent subsequence, so there is no problem with defining some set of limits pii,N (i = a, b), where pii,N = limnk→∞ pi i,N TNnk . Since these are again probability distributions on a compact set, any subsequence (along Nk, say) will have a further, convergent, subsequence. Below, we will show that the only possible limit of a convergent subsequence is the measure with density pi given by (94); this proves the result. We now work along a subsequence of N along which weak limits of the two sequences pii,N (i = a, b) exist; by a slight abuse of notation, we still denote its index N . Rewrite the inequalities (93) as follows: (95a) (∑ i≤k li )pib,N(k) lk + ∑ i≤k li ( pia,N(i) li ) = 1, ka(N)∑ k=kb(N)+1 lk ( pib,N(k) lk ) = 1− b,N and (95b) (∑ i≥k li )pia,N(k) lk + ∑ i≤k li ( pib,N(i) li ) = 1− a,N(k), ka(N)−1∑ k=kb(N) lk ( pia,N(k) lk ) = 1− a,N for some constants and functions b,N , a,N , and a,N(·) which converge to 0 as max( δN min li , li)→ 0. Lemma 3.29 (1) implies pia,N(k) = 0 for k ≤ kb(N). Inserting this into (95a) gives( ∑ i≤kb(N)+1 li )pib,N(kb(N) + 1) lkb(N)+1 = 1− pia,N(kb(N) + 1) ≡ 1− εb,N1 and( ∑ i≤kb(N)+2 li )pib,N(kb(N) + 2) lkb(N)+2 = 1− pia,N(kb(N) + 1)− pia,N(kb(N) + 2) ≡ 1− εb,N2 . 
By Lemma 3.29 (3),
\[ \pi^{a,N}(k_b(N)+1) \le l^2_{k_b(N)+1} \Bigl(\sum_{i \le k_b(N)+1} l_i\Bigr)^{-1} \Bigl(\sum_{i \ge k_b(N)+1} l_i\Bigr)^{-1}. \]
A similar inequality may be derived for $\pi^{a,N}(k_b(N)+2)$. These inequalities imply that $\varepsilon^{b,N}_{1,2} \to 0$ as $\max l_i \to 0$ (Theorem 3.9 applies, so $\kappa_b$ is bounded away from 0). Rearranging, we conclude
\[ \left| \frac{\pi^{b,N}(k_b(N)+1)}{l_{k_b(N)+1}} - \Bigl(\sum_{i \le k_b(N)+1} l^N_i\Bigr)^{-1} \right| \to 0 \tag{96a} \]
and
\[ \left| \left( \frac{\pi^{b,N}(k_b(N)+1)}{l_{k_b(N)+1}} - \frac{\pi^{b,N}(k_b(N)+2)}{l_{k_b(N)+2}} \right) - \Bigl(\sum_{i \le k_b(N)+1} l^N_i\Bigr)^{-2} l_{k_b(N)+2} \right| \to 0. \tag{96b} \]

Consider now the following system of integral equations:
\[ x\varpi^b(x) + \int_0^x \varpi^a(y)\,dy = 1, \qquad \int_\kappa^{1-\kappa} \varpi^b(x)\,dx = 1, \]
\[ (1-x)\varpi^a(x) + \int_x^1 \varpi^b(y)\,dy = 1, \qquad \int_\kappa^{1-\kappa} \varpi^a(x)\,dx = 1. \]
Here, $\kappa \equiv \lim k_b(N)/N$; the limit exists because $\mathcal{P}_N$ are refinements, and hence Corollary 3.19 implies that $\kappa_b(N)$ is an increasing sequence which must have a limit. (The rounding error in going from $\kappa_b(N)$ to $k_b(N)/N$ will not cause a problem.)

We can rewrite these equations as follows: on $(\kappa, 1-\kappa)$ we have
\[ \frac{d}{dx}\bigl(x\varpi^b(x)\bigr) = -\varpi^a(x) = -\frac{1}{1-x}\left(1 - \int_x^1 \varpi^b(y)\,dy\right) = -\frac{1}{1-x}\int_0^x \varpi^b(y)\,dy \]
and hence
\[ \frac{d}{dx}\left(-(1-x)\frac{d}{dx}\bigl(x\varpi^b(x)\bigr)\right) = \varpi^b(x). \tag{98a} \]
This is a second-order ordinary differential equation, which needs two initial conditions to have a unique solution. We take
\[ \varpi^b(\kappa) = \frac{1}{\kappa}, \qquad \frac{d}{dx}\varpi^b(x)\Big|_{x=\kappa} = -\frac{1}{\kappa^2}. \tag{98b} \]
Now, the set of coupled first-order equations (95) can be rearranged to form a second-order difference equation for $\pi^{b,N}([x])/l_{[x]}$, with initial conditions given by (96). In this setting, results on convergence of Euler's method for numerically approximating ordinary differential equations [Bradie, 2006, Chapter 7] guarantee that the functions $\frac{1}{l_{[x]}}\pi^{i,N}([x])$ converge to the unique solution of the ODE (98a) with initial conditions (98b).

The ordinary differential equation (98) can be solved explicitly to give
\[ \varpi^b(x) = \mathbf{1}_{(\kappa,1-\kappa)}(x)\,(1-\kappa)\left(\frac{1}{x} + \log\Bigl(\frac{1-x}{x}\Bigr)\right). \]
Moreover, Lemma 3.29 implies that $\varpi^b(x) \to 0$ as $x \to 1-\kappa$. Putting this into the above equation gives
\[ \log\frac{1-\kappa}{\kappa} = \frac{1}{1-\kappa}, \qquad \kappa \approx 0.2178. \]
The claim $\varpi^a(x) = \varpi^b(1-x)$ follows easily from the symmetry of the system. □

The distribution $\varpi^b(x)$, together with simulated data for it, is plotted in Figure 3.2 (§11).

7. Restricted limit order book, and conjecture on steady-state behaviour

In §6 we have constructed distributions $\varpi^b(x)$, $\varpi^a(x)$ as limits, along a certain carefully chosen sequence, of the empirical distributions of the location of the rightmost bid (respectively leftmost ask) for a limit order book with a finite number of price bins. While this is enough to prove that the value of $\kappa_b$ in a system with $\mathcal{P}(x) = x$ is $\approx 0.2178$, it by no means proves anything about the steady-state distribution of the rightmost bid in that limit order book. In fact, it is not clear that it makes sense to speak of the steady-state distribution of the rightmost bid in an ordinary limit order book, because whenever $\kappa_b > 0$, the limit order book is an obviously-transient Markov process. (While this does not prevent some marginal of the state, such as the rightmost bid, from having a steady-state distribution, it makes its existence nonobvious.)

In this section we construct a modified object, the restricted limit order book, which we conjecture to be a positive recurrent Markov process. We also prove that, in a certain sense, the ordinary limit order book and the restricted one behave in the same way.

Definition 3.31. Let $\mathcal{L}$ be a limit order book with initial state $\mathcal{L}_0$. For an interval $I \equiv [x, y]$ define the restriction of $\mathcal{L}$ to $I$, denoted $\mathcal{L}^I$, as follows. The initial state is given by
\[ (\tilde{Q}^{b,I}_0, \tilde{Q}^{a,I}_0) \equiv (\infty\delta_x + \mathbf{1}_I Q^b_0,\ \infty\delta_y + \mathbf{1}_I Q^a_0). \tag{99} \]
That is, we restrict the measures to $I$ by multiplying them by the indicator function $\mathbf{1}_I$, and add infinitely many bids at the point $x$, and infinitely many asks at the point $y$. The price level function and arrival process in $\mathcal{L}^I$ are the same as in $\mathcal{L}$.
The state of the restricted limit order book is $(Q^{b,I}_t, Q^{a,I}_t) \equiv \mathbf{1}_{(x,y)}(\tilde{Q}^{b,I}_t, \tilde{Q}^{a,I}_t)$; this is a Markovian descriptor, because $Q^b_t\{x\} = Q^a_t\{y\} = \infty$ at all times $t$. Arrival events $A_t$ of the form $p \times \text{bid}$ for $p \le x$, and of the form $p \times \text{ask}$ for $p \ge y$, do not change the state of the restricted limit order book.

Restrictions can be combined: $(\mathcal{L}^I)^J = \mathcal{L}^{I \cap J}$. For all of the examples we have considered so far, $\mathcal{L}$ coincides with $\mathcal{L}^{[0,1]}$.

The restricted order book can be thought of as a modification of the price level function:

Lemma 3.32. Let $\mathcal{L}$ be an ordinary limit order book with price level function $\mathcal{P}$. Let $\mathcal{L}^I$ be the restriction of $\mathcal{L}$ to an interval $I \equiv [x, y]$. Let $\tilde{\mathcal{P}}$ be the price level function given by
\[ \tilde{\mathcal{P}}(p) = \begin{cases} 0, & p \le x \\ \mathcal{P}(p), & x < p < y \\ 1, & p \ge y \end{cases} \]
and let $\tilde{\mathcal{L}}$ be an ordinary limit order book with price level function $\tilde{\mathcal{P}}$ and the same initial state and arrival process as $\mathcal{L}$. Then at all times $\mathbf{1}_I \tilde{\mathcal{L}}_t = \mathcal{L}^I_t$.

Proof. Easy induction in $t$. □

In particular, by Corollary 3.19 we conclude that $\kappa^I_b \le \kappa_b$, $\kappa^I_a \ge \kappa_a$ for any $I$, where $\kappa^I_b$ and $\kappa^I_a$ are the constants found in Theorem 3.5 for $\mathcal{L}^I$. However, Theorem 3.5 also strongly suggests that, for any pair $x, y$ with $x < \kappa_b < \kappa_a < y$, the restriction $\mathcal{L}^{[x,y]}$ should be "not too different" from $\mathcal{L}$ itself. Specifically, we have the following result.

Theorem 3.33. Let $\mathcal{L}$ be a limit order book, and let $I = [x, y]$ with $x < \kappa_b < \kappa_a < y$. Then for all sets $S$, with probability 1,
\[ \limsup_{t \to \infty} \bigl| Q^b_t(S) - Q^{b,I}_t(S) \bigr| < \infty, \qquad \limsup_{t \to \infty} \bigl| Q^a_t(S) - Q^{a,I}_t(S) \bigr| < \infty. \]

Proof. This is a consequence of Theorem 3.5 and Corollary 3.15. Theorem 3.5 asserts that after a finite time there will always be a bid in $\mathcal{L}$ at some price in $[x, \kappa_b)$ and an ask at some price in $(\kappa_a, y]$. At that time, the states of $\mathcal{L}$ and $\mathcal{L}^I$ restricted to $I$ differ by a finite number of orders; and Corollary 3.15 implies that this will continue to be the case at all subsequent times. □
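The dynamics in Definition 3.31 can be sketched in a few lines for the continuous case $\mathcal{P}(p) = p$ with iid uniform arrivals. This is a minimal illustration, not the simulation code used in §11: the function name, the heap-based representation, and the tie-breaking convention (an arriving bid at price $p$ trades with the lowest waiting ask whenever that ask is at most $p$) are assumptions made here. The infinitely many orders at $x$ and $y$ are never stored; an arrival that would trade with them, or join them, leaves the tracked state unchanged.

```python
import heapq
import random

def simulate_restricted_book(x, y, steps, seed=0):
    """Simulate the restriction to [x, y]; return waiting bids and asks inside (x, y).

    Arrivals are iid: price uniform on [0, 1), side bid or ask with probability 1/2.
    """
    rng = random.Random(seed)
    bids = []  # max-heap of bid prices, stored negated
    asks = []  # min-heap of ask prices
    for _ in range(steps):
        p = rng.random()
        if rng.random() < 0.5:               # arriving bid at price p
            if asks and asks[0] <= p:
                heapq.heappop(asks)          # trades with the lowest waiting ask
            elif p >= y:
                pass                         # trades with one of the asks piled at y
            elif p > x:
                heapq.heappush(bids, -p)     # waits inside (x, y)
            # p <= x: joins the infinite pile at x, tracked state unchanged
        else:                                # arriving ask at price p
            if bids and -bids[0] >= p:
                heapq.heappop(bids)          # trades with the highest waiting bid
            elif p <= x:
                pass                         # trades with one of the bids piled at x
            elif p < y:
                heapq.heappush(asks, p)      # waits inside (x, y)
            # p >= y: joins the infinite pile at y, tracked state unchanged
    return sorted(-b for b in bids), sorted(asks)

if __name__ == "__main__":
    for interval in [(0.3, 0.7), (0.1, 0.9)]:
        b, a = simulate_restricted_book(*interval, steps=200_000, seed=1)
        print(interval, "waiting bids:", len(b), "waiting asks:", len(a))
```

Comparing an interval inside $(\kappa_b, \kappa_a)$ with one that strictly contains $[\kappa_b, \kappa_a]$ gives an informal feel for the contrast between Conjecture 3.34 and Theorem 3.33; a finite run of this kind is of course no evidence of recurrence or transience by itself.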
The conjecture on the positive recurrence of the limit order book between $\kappa_b$ and $\kappa_a$ can be stated as follows:

Conjecture 3.34. For any pair $x, y$ with $\kappa_b \le x < y \le \kappa_a$, the restriction $\mathcal{L}^{[x,y]}$ is a Harris recurrent Markov process. Moreover, when strict inequalities $\kappa_b < x < y < \kappa_a$ hold, the restriction is positive recurrent.

If this is the case, then it makes sense to speak of the steady-state distribution of the rightmost bid and the leftmost ask for this process, and the analysis in §6 describes this distribution. One of the things we can derive based on the conjecture is the steady-state distribution for the departing bid-ask pairs.

Lemma 3.35. Let $\mathcal{P}(x) = x$, and assume Conjecture 3.34. Then the steady-state distribution of the departing bid-ask pairs has density $\mathbf{1}_{y \le x}(\varpi^b(x) + \varpi^a(y))$ with respect to the Lebesgue measure $dx\,dy$.

Proof. Departures of a bid at price $x$ with an ask at price $y < x$ happen either when the rightmost bid is at $x$ and an ask arrives at $y$ (this happens at rate $\frac{\varpi^b(x)\,dy}{2}$) or when the leftmost ask is at $y$ and a bid arrives at $x$ (this happens at rate $\frac{\varpi^a(y)\,dx}{2}$). To obtain a normalised probability distribution, observe
\[ \int_0^1 \int_y^1 (\varpi^b(x) + \varpi^a(y))\,dx\,dy = \int_0^1 \int_y^1 \varpi^b(x)\,dx\,dy + \int_0^1 y\,\varpi^a(y)\,dy = \mathbb{E}\varpi^b + \mathbb{E}\varpi^a. \]
Since the bid and ask distributions are symmetric, we clearly have $\mathbb{E}\varpi^b + \mathbb{E}\varpi^a = 1$, so the normalised distribution is as stated. □

For another interpretation of the restricted limit order book in terms of market orders, see §10.

8. Lyapunov function

In this section we present the proof of Theorem 3.10. The proof uses Lyapunov function techniques to show positive recurrence of the Markov chain. We consider an ordinary limit order book with 5 equal bins (see footnote 4), each having width $1/5$. We will show that the restriction of this limit order book to an interval $[\frac{1}{5} + \varepsilon, \frac{4}{5} - \varepsilon]$ for any $\varepsilon > 0$ is positive recurrent. This implies $\kappa_b \le \frac{1}{5}$ for the ordinary limit order book.
Corollary 3.25 then implies that $\kappa_b \le \frac{1}{4}$ for the strict limit order book with 4 bins, and hence $\kappa_b \le \frac{1}{4}$ for $\mathcal{P}(x) = x$. Since $\mathcal{P}(x) = x$ refines all partial orderings we consider, Corollary 3.19 implies $\kappa_b \le \frac{1}{4}$ always (and similarly, $\kappa_a \ge \frac{3}{4}$ always).

It remains to show that the restriction of the ordinary limit order book with 5 equal bins to $[\frac{1}{5} + \varepsilon, \frac{4}{5} - \varepsilon]$ is positive recurrent. We will refer to the bins of the restricted limit order book as 2, 3, 4 (inheriting the numbering from the original limit order book).

Footnote 4: The number 5 isn't magical; it's just the largest number we can comfortably analyze. Working with $N$ bins would produce an $(N-2)$-dimensional Markov chain below; for $N = 4$ it turns out to be boring, and for $N = 6$ difficult to work with.

It is clearly sufficient to consider the Markov chain with state space in $\mathbb{Z}^3$ which counts the number of orders of each type in each of the three bins; the sign will be positive if the orders are bids, and negative if they are asks. We denote the state of this Markov chain $X_t \equiv (X_t(2), X_t(3), X_t(4))$. We will show that this Markov chain is positive recurrent. (This implies that the restricted limit order book has finite mean recurrence time associated with the state "empty".)

To do this, we will construct a Lyapunov function $L = L(X_t)$. Write
\[ \Delta L(t) \equiv L(X_{t+1}) - L(X_t), \qquad \Delta X_t \equiv X_{t+1} - X_t. \]
Our Lyapunov function will have the property that $|\Delta L(t)|$ is uniformly bounded, and for all sufficiently large states, $\mathbb{E}[\Delta L(t) \mid X_t] < -\varepsilon < 0$. The Foster-Lyapunov criterion asserts that existence of such a Lyapunov function implies positive recurrence of the Markov chain. (See e.g. [Bramson, 2006, Proposition 4.4], and references therein.)

We now describe the construction of the Lyapunov function. $L$ will have the form
\[ L(X) \equiv \min_F \langle X, v_F \rangle \]
for some finite set of vectors $v_F$ which we will find below (102), (104). Therefore, level sets of $L$ will be polyhedra $P^k \equiv \{x : L(x) = k\} = kP^1$ whose faces have $v_F$ as their outer normals.
Informally, the drift $\mathbb{E}[\Delta L(t) \mid X_t]$ ought to be negative provided
\[ \langle \mathbb{E}[\Delta X_t \mid X_t], v_F \rangle < 0 \tag{100} \]
for all faces $F$ of $P^{L(X_t)}$ to which the state $X_t$ belongs (see footnote 5). We now set out to find an appropriate set $v_F$.

Note that $\mathbb{E}[X_{t+1} - X_t \mid X_t]$ has only ten different values, depending on $[\alpha_t]$, $[\beta_t]$ (i.e. on the sign of the coordinates $X(2)$, $X(3)$, $X(4)$), namely:
\[ \mathbb{E}[X_{t+1} - X_t \mid X_t] = \begin{cases}
\Delta_{+++} \equiv (\tfrac{1}{5} - \varepsilon,\ \tfrac{1}{5},\ -(\tfrac{4}{5} - \varepsilon)), & X(4) > 0 \\
\Delta_{---} \equiv (\tfrac{4}{5} - \varepsilon,\ -\tfrac{1}{5},\ -(\tfrac{1}{5} - \varepsilon)), & X(2) < 0 \\
\Delta_{++-} \equiv (\tfrac{1}{5} - \varepsilon,\ -\tfrac{3}{5},\ \tfrac{2}{5} - \varepsilon), & X(3) > 0,\ X(4) < 0 \\
\Delta_{+--} \equiv (-(\tfrac{2}{5} - \varepsilon),\ \tfrac{3}{5},\ -(\tfrac{1}{5} - \varepsilon)), & X(2) > 0,\ X(3) < 0 \\
\Delta_{++0} \equiv (\tfrac{1}{5} - \varepsilon,\ -\tfrac{3}{5},\ 0), & X(3) > 0,\ X(4) = 0 \\
\Delta_{0--} \equiv (0,\ \tfrac{3}{5},\ -(\tfrac{1}{5} - \varepsilon)), & X(2) = 0,\ X(3) < 0 \\
\Delta_{+0-} \equiv (-(\tfrac{2}{5} - \varepsilon),\ 0,\ \tfrac{2}{5} - \varepsilon), & X(2) > 0,\ X(3) = 0,\ X(4) < 0 \\
\Delta_{+00} \equiv (-(\tfrac{2}{5} - \varepsilon),\ 0,\ 0), & X(2) > 0,\ X(3) = X(4) = 0 \\
\Delta_{00-} \equiv (0,\ 0,\ \tfrac{2}{5} - \varepsilon), & X(2) = X(3) = 0,\ X(4) < 0 \\
\Delta_{000} \equiv (0,\ 0,\ 0), & X(2) = X(3) = X(4) = 0
\end{cases} \tag{101} \]
Of these, the last one is completely irrelevant to our purposes because it only applies to a one-point set.

We will index the orthants of $\mathbb{R}^3$ and the faces between them by the signs of $X(2)$, $X(3)$, $X(4)$. Note that we only care about the closure of four orthants: $+++$, $++-$, $+--$, and $---$. The level set $P \equiv P^1$ will be constructed by starting off with the four faces of a unit octahedron, corresponding to
\[ v_{+++} \equiv (1, 1, 1), \quad v_{++-} \equiv (1, 1, -1), \quad v_{+--} \equiv (1, -1, -1), \quad v_{---} \equiv (-1, -1, -1). \tag{102} \]
This choice satisfies (100) whenever $X_t$ lies in the interior of an orthant, but not on the boundaries between them.

Footnote 5: This assumes that the only relevant directions are the faces containing $X_t$, which may not be the case. Instead, the binding direction for $X_{t+1}$ could, in principle, be different from the one for $X_t$. However, if we start from a large state, and have small step size, this will not be a problem. We will return to this point below, after finding the relevant vectors $v_F$.
We will now find vectors $v_{++0}$, $v_{0--}$, $v_{+0-}$, $v_{+00}$, and $v_{00-}$ satisfying the conditions below, which will be normal to $P$ at points where it crosses the boundaries between orthants.
\[ \langle v_{++0}, \Delta_{+++} \rangle < 0, \quad \langle v_{++0}, \Delta_{++0} \rangle < 0, \quad \langle v_{++0}, \Delta_{++-} \rangle < 0 \tag{103a} \]
\[ \langle v_{0--}, \Delta_{---} \rangle < 0, \quad \langle v_{0--}, \Delta_{0--} \rangle < 0, \quad \langle v_{0--}, \Delta_{+--} \rangle < 0 \tag{103b} \]
\[ \langle v_{+0-}, \Delta_{++-} \rangle < 0, \quad \langle v_{+0-}, \Delta_{+0-} \rangle < 0, \quad \langle v_{+0-}, \Delta_{+--} \rangle < 0 \tag{103c} \]
\[ \langle v_{+00}, \Delta_{+++} \rangle < 0, \quad \langle v_{+00}, \Delta_{++0} \rangle < 0, \quad \langle v_{+00}, \Delta_{++-} \rangle < 0, \quad \langle v_{+00}, \Delta_{+00} \rangle < 0, \quad \langle v_{+00}, \Delta_{+0-} \rangle < 0, \quad \langle v_{+00}, \Delta_{+--} \rangle < 0 \tag{103d} \]
\[ \langle v_{00-}, \Delta_{---} \rangle < 0, \quad \langle v_{00-}, \Delta_{0--} \rangle < 0, \quad \langle v_{00-}, \Delta_{+--} \rangle < 0, \quad \langle v_{00-}, \Delta_{00-} \rangle < 0, \quad \langle v_{00-}, \Delta_{+0-} \rangle < 0, \quad \langle v_{00-}, \Delta_{++-} \rangle < 0 \tag{103e} \]
We also require the vectors $v$ to be outer normals at the relevant points of $P$, so the coordinates of the $v$'s must have the same sign as the corresponding index. For example, writing $v_{++0} = (v_2, v_3, v_4)$, we must have $v_2 > 0$ and $v_4 > 0$; $v_3$ can have either sign.

There is a subtle point here. When we go to solve the inequalities (103), the vector $v_{+0-}$ satisfies the following. If $X_t$ belongs to one of the orthants adjacent to the edge given by this list of signs or lies on the edge itself (i.e. $X(2) > 0$, $X(4) < 0$, and $X(3)$ has either sign or is equal to 0), then the dot product of the one-step drift from $X_t$ and $v_{+0-}$ is negative. Consequently, when we go to cut up the octahedron, we must make sure that points of the resulting level set where the outer normals are given by $v_{+0-}$ lie in one of the two orthants $++-$, $+--$, or on the boundary between them, and similarly for all the other vectors.

Provided all of the above is satisfied, we will have constructed a polyhedron $P$ with the property that whenever $X_t$ is on $P$, the vector $\mathbb{E}[X_{t+1} - X_t \mid X_t]$ always points strictly into $P$. Since $\|X_{t+1} - X_t\|$ is bounded (by 1), our construction guarantees that, starting from all sufficiently large states, (100) will hold; moreover, since we have a finite number of vectors $v_F$, the drift will be bounded away from 0.
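The inequalities (103), together with the orthant-interior cases of (100), can be checked mechanically for the drifts (101) and the candidate normals (102) and (104). The sketch below does this in exact rational arithmetic for one fixed small $\varepsilon$. The dictionary layout and function names are illustrative; the list for $v_{00-}$ pairs it with $\Delta_{+--}$, the only drift in (101) with $X(2) > 0$ and $X(3) < 0$.

```python
from fractions import Fraction as Fr

def drifts(eps):
    """One-step mean drifts from (101), keyed by sign pattern of (X(2), X(3), X(4))."""
    a, b, c, d = Fr(1, 5) - eps, Fr(2, 5) - eps, Fr(3, 5), Fr(4, 5) - eps
    return {
        "+++": (a, Fr(1, 5), -d),
        "---": (d, -Fr(1, 5), -a),
        "++-": (a, -c, b),
        "+--": (-b, c, -a),
        "++0": (a, -c, 0),
        "0--": (0, c, -a),
        "+0-": (-b, 0, b),
        "+00": (-b, 0, 0),
        "00-": (0, 0, b),
    }

# Candidate outer normals: octahedron faces (102) and boundary vectors (104).
NORMALS = {
    "+++": (1, 1, 1), "++-": (1, 1, -1), "+--": (1, -1, -1), "---": (-1, -1, -1),
    "++0": (Fr(4, 3), 1, Fr(2, 3)), "+00": (Fr(4, 3), 1, Fr(2, 3)),
    "+0-": (1, -Fr(4, 5), -Fr(9, 5)),
    "0--": (-2, -3, -4), "00-": (-2, -3, -4),
}

# For each normal, the drifts it must have negative inner product with.
REQUIRED = {
    "+++": ["+++"], "++-": ["++-"], "+--": ["+--"], "---": ["---"],
    "++0": ["+++", "++0", "++-"],                       # (103a)
    "0--": ["---", "0--", "+--"],                       # (103b)
    "+0-": ["++-", "+0-", "+--"],                       # (103c)
    "+00": ["+++", "++0", "++-", "+00", "+0-", "+--"],  # (103d)
    "00-": ["---", "0--", "+--", "00-", "+0-", "++-"],  # (103e)
}

def dot(u, v):
    return sum(Fr(p) * Fr(q) for p, q in zip(u, v))

def inner_products(eps):
    """Exact inner products <v_F, Delta_S> for every required pair (F, S)."""
    d = drifts(eps)
    return {(f, s): dot(NORMALS[f], d[s]) for f, ss in REQUIRED.items() for s in ss}

if __name__ == "__main__":
    for pair, value in sorted(inner_products(Fr(1, 100)).items()):
        print(pair, value, "ok" if value < 0 else "FAILS")
```

A check at a single value of $\varepsilon$ does not replace the argument for all sufficiently small $\varepsilon$, but every inner product here is affine in $\varepsilon$, so negativity at $\varepsilon = 0$ and at one small positive value already indicates the admissible range.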
Verifying that a solution to these inequalities exists for all sufficiently small $\varepsilon > 0$ is a simple computation: for example, we may take
\[ v_{++0} = v_{+00} \equiv \left(\tfrac{4}{3}, 1, \tfrac{2}{3}\right), \quad v_{+0-} \equiv \left(1, -\tfrac{4}{5}, -\tfrac{9}{5}\right), \quad v_{0--} = v_{00-} = (-2, -3, -4). \tag{104} \]
Finally, the function $L$ is given by
\[ L(X) = \min\{\langle X, v_F \rangle\}, \qquad F \in \{+++,\ ++-,\ +--,\ ---,\ +00,\ +0-,\ 00-\}, \tag{105} \]
where the vectors $v_F$ are given by (102) and (104).

In Figure 3.1 we show the constructed level set $P$. We only care about the four orthants (with boundary) which are physically possible for the limit order book, i.e. $+++$, $++-$, $+--$, and $---$; hence the resulting polyhedron is not convex.

[Figure 3.1: two three-dimensional views, with axes $X(2)$, $X(3)$, $X(4)$.]
Figure 3.1. Two views of the level set $P$ for the Lyapunov function (105). Red face is normal to $v_{+0-}$, green is normal to $v_{+00}$, blue is normal to $v_{00-}$. The remaining planes come from the octahedron $|X(2)| + |X(3)| + |X(4)| = 1$ and the coordinate planes.

The vertices and faces of this polyhedron are listed in Appendix D.4.

Remark 3.36. The method of constructing a Lyapunov function by building a polyhedral level set can in principle be carried out for finer price level functions. However, the procedure of starting with the octahedron $\sum |x_i| = \text{const.}$ and "filing away" at the vertices and edges with additional hyperplanes that satisfy the appropriate constraints becomes substantially more difficult as the number of dimensions increases.

9. Arrival distributions

In this section we discuss the behaviour of the limit order book if the arrivals are iid, but the distribution of prices is not uniform. We will assume that arriving orders are equally likely to be bids and asks. (If this is not satisfied, it is clear that one of the order types will have a queue going off to infinity.)

First, suppose that in a limit order book $\mathcal{L}$ prices of arriving bids and of arriving asks have the same distribution $F$, i.e.
\[ \mathbb{P}(A_t \in dp \times \text{bid}) = \mathbb{P}(A_t \in dp \times \text{ask}) = dF(p). \]
The transformation $x \mapsto \frac{1}{2\pi}(\arctan(x) + \pi)$ maps $\mathbb{R}$ into $(0, 1)$, so (reparametrising the price if necessary) we may assume that the support of $F$ is contained in $[0, 1]$. As always, we use the convention that $F$ is RCLL (right-continuous with left limits). Let $F^{-1} : (0, 1] \to [0, 1]$ be defined by
\[ F^{-1}(y) = \inf\{x : F(x) \ge y\}. \]
Let the price level function of $\mathcal{L}$ be $\mathcal{P} : [0, 1] \to [0, 1]$.

Consider now a limit order book $\tilde{\mathcal{L}}$ whose arrivals are iid uniform on $[0, 1] \times \{\text{bid}, \text{ask}\}$, and apply the map $x \mapsto F^{-1}(x)$; this reproduces the arrival processes of $\mathcal{L}$. Thus, if we take the price level function on $\tilde{\mathcal{L}}$ to be $\tilde{\mathcal{P}} \equiv \mathcal{P}F^{-1}$, we will reproduce the dynamics of $\mathcal{L}$ in the following sense:

Proposition 3.37. Let $\mathcal{L}$ be a limit order book whose arrivals are iid, and
\[ \mathbb{P}(A_t \in dp \times \text{bid}) = \mathbb{P}(A_t \in dp \times \text{ask}) = dF(p) \]
for some distribution $F$ on $[0, 1]$; let $\mathcal{P}$ be the associated price level function. Let $\tilde{\mathcal{L}}$ be a limit order book whose arrivals are iid uniform on $[0, 1] \times \{\text{bid}, \text{ask}\}$, and whose price level function is $\tilde{\mathcal{P}} \equiv \mathcal{P}F^{-1}$. Then, defining for a counting measure $M = \sum m_i \delta_{x_i}$ and a function $g$ the composition $g(M) \equiv \sum m_i \delta_{g(x_i)}$, we have equality in distribution
\[ F^{-1}\tilde{A}^i_t \overset{d}{=} A^i_t, \qquad F^{-1}\tilde{Q}^i_t \overset{d}{=} Q^i_t \]
for $i = a, b$ and all times $t \ge 0$.

As mentioned above, restricting the support of $F$ to lie within $[0, 1]$ does not lose generality, because we can always reparametrise the prices so that this holds.

In the case where the partial ordering of price levels is simply the total ordering on $\mathbb{R}$ ($\mathcal{P}(x) = x$) and the arrival distribution is absolutely continuous with positive density $f(x)$, instead of modifying the price level function $\mathcal{P}$, we can simply add an extra coordinate transformation. Applying this intuition to Theorem 3.5 and Theorem 3.10, we obtain the following:

Corollary 3.38. Suppose that the arrival events are iid, with
\[ \mathbb{P}(A_t \in dp \times \text{bid}) = \mathbb{P}(A_t \in dp \times \text{ask}) = dF(p) \]
for an absolutely continuous distribution $F$ with strictly positive density $f$ on $\mathbb{R}$.
Suppose further that the starting state of the limit order book is some deterministic finite state $(Q^b_0, Q^a_0)$. Then there exist two constants $\kappa_b$ and $\kappa_a$, with $|\kappa_i| < \infty$, such that the following hold for any $\varepsilon > 0$, with probability 1.
(1) $D^b_\infty(-\infty, \kappa_b - \varepsilon] < \infty$, and $D^a_\infty[\kappa_a + \varepsilon, \infty) < \infty$. That is, there are finitely many bid departures at prices $< \kappa_b - \varepsilon$, and finitely many ask departures at prices $> \kappa_a + \varepsilon$.
(2) The event $\{Q^b_t[\kappa_b + \varepsilon, \infty) = 0\}$ occurs infinitely often, and the event $\{Q^a_t(-\infty, \kappa_a - \varepsilon] = 0\}$ occurs infinitely often.
The constants $\kappa_b$ and $\kappa_a$ are given by
\[ \kappa_b = F^{-1}(\kappa), \qquad \kappa_a = F^{-1}(1 - \kappa), \]
where $\kappa \approx 0.2178$ is defined by Theorem 3.10.

That is, even when the distribution of the prices of the arriving orders has infinite support, there will only be departures of bids and asks from a finite interval of prices.

Remark 3.39. Proposition 3.37 and Corollary 3.38 contain no new mathematics, but the result of Corollary 3.38 is, perhaps, surprising: no matter what the distribution, on $\mathbb{R}$, of arriving bids and asks is (provided it's the same distribution for both of them), trading will only occur on a finite interval of prices.

We can also consider the case when arrivals of bids and asks are iid, but follow different distributions. We will assume that the prices of arriving bids and asks have well-defined, continuously differentiable densities on $[0, 1]$. (Restricting to $[0, 1]$ is no loss of generality since we can always reparametrise the price.) We will consider continuous pricing, i.e. $\mathcal{P}(x) = x$.

Consider a limit order book $\mathcal{L}$ with some deterministic finite starting state, price level function $\mathcal{P}(x) = x$, and arrival events $(A_t)_{t \ge 0}$ which are iid. Let the distribution of $A_t$ be given by
\[ \mathbb{P}(A_t \in dp \times \text{bid}) = \tfrac{1}{2}\,dF^b(p), \qquad \mathbb{P}(A_t \in dp \times \text{ask}) = \tfrac{1}{2}\,dF^a(p) \]
for some pair of probability distributions $F^b$, $F^a$ on $[0, 1]$ with continuously differentiable, bounded densities $f^b$, $f^a$ respectively; let
\[ M = \max_{i = a, b}\ \sup_{p \in [0,1]} f^i(p). \]
The analysis of Section 6 can be carried through more or less verbatim to derive limiting densities $\varpi^{i,f}$ ($i = a, b$) satisfying the following equations:
\[ F^a(x)\,\varpi^{b,f^b}(x) = f^b(x) \int_x^1 \varpi^{a,f^a}(y)\,dy, \qquad \int_{\kappa_b}^{\kappa_a} \varpi^{b,f^b}(x)\,dx = 1, \tag{106a} \]
\[ (1 - F^b(x))\,\varpi^{a,f^a}(x) = f^a(x) \int_0^x \varpi^{b,f^b}(y)\,dy, \qquad \int_{\kappa_b}^{\kappa_a} \varpi^{a,f^a}(x)\,dx = 1 \tag{106b} \]
with the boundary condition
\[ \varpi^{b,f}(\kappa_a) = \varpi^{a,f}(\kappa_b) = 0. \tag{106c} \]
We can proceed as in Section 6 to derive ordinary differential equations for $\varpi^{b,f}$, which in particular determine the threshold values $\kappa_b$ and $\kappa_a$ in the case of (almost) arbitrary iid arrivals.

Theorem 3.40. Let $\mathcal{L}$ be a limit order book with some deterministic finite starting state, price level function $\mathcal{P}(x) = x$, and arrival events $(A_t)_{t \ge 0}$ which are iid. Suppose that the distribution of arrival prices of bids, respectively asks, has bounded, continuously differentiable density $f^b$, respectively $f^a$. Then the threshold values $\kappa_b$ and $\kappa_a$ identified in Theorem 3.5 are given as the unique values for which the system of equations (106) has a solution.

If we restrict our attention to symmetric arrival distributions, i.e.
\[ f^a(p) = f^b(1 - p) \tag{107} \]
above, then (106) can be written in a particularly satisfying form. Namely, we obtain
\[ \varpi^{b,f}(x)\,F^a(x) = f^a(1 - x)\,\Pi^{b,f}(1 - x), \qquad x \in [\kappa, 1 - \kappa], \tag{108} \]
where $\Pi^{b,f}(x) = \int_0^x \varpi^{b,f}(p)\,dp$. In particular, we see that the roles played by the arrival distribution and the distribution of the extreme order (highest bid or lowest ask) are in a certain sense symmetric.

If the arrival distributions of bids and asks are not the same, Theorem 3.9 no longer holds, so we no longer can conclude that $\kappa_b > 0$. Indeed, it is easy to come up with examples of densities $f^b$, $f^a$ for which this is not the case. For example, taking $f^b = 2 \cdot \mathbf{1}_{[1/2,1]}$ and $f^a = 2 \cdot \mathbf{1}_{[0,1/2]}$, any bid-ask pair is eligible to depart, and the entire system becomes essentially a single two-sided queue, or a symmetric random walk.
The inequality need not be so extreme; the analysis of restricted limit order books in §7 implies that we can take $f^b$ to be uniform on $(\kappa, 1]$ and $f^a$ to be uniform on $[0, 1 - \kappa)$, for $\kappa = \kappa_b \approx 0.2178$ defined by Theorem 3.11.

We might like to know whether a pair of continuously differentiable, bounded distributions $f^b$, $f^a$ with
\[ 0 < \inf_{p \in [0,1]} \frac{f^b(p)}{f^a(p)} \le \sup_{p \in [0,1]} \frac{f^b(p)}{f^a(p)} < \infty \]
can have $\kappa_b = 0$ and $\kappa_a = 1$, i.e. distributions $\varpi^b$, $\varpi^a$ which are positive on $(0, 1)$. The answer is negative, as the following argument shows.

Note that we must have $f^a$, $f^b$ strictly positive on $(0, 1)$ (bids cannot remain waiting in a region where they do not arrive). We may therefore reparametrise so that $f^a(p) = 1$, and $f^b$ is some distribution that is bounded from below and above. Solving (106) for $f^b$, we obtain
\[ f^b(x) = \frac{x}{1 - x} \cdot \frac{(1 - x)\,\pi^b(x)}{\int_x^1 \pi^a(y)\,dy}. \]
Take the limit as $x \to 0$. The last term converges to 1, while the first term clearly tends to 0. This contradicts the assumption that $f^b$ is bounded away from 0.

If we remove the assumption of bounded ratios and simply require both densities to be positive on all of $(0, 1)$, it is certainly possible to have $\varpi^b > 0$ on all of $(0, 1)$. For example, in the symmetric case, taking the (unbounded) ask arrival density $f^a(x) = \frac{1}{2\sqrt{x}} = f^b(1 - x)$ and solving the resulting (108) does give a density $\varpi^{b,f}$ whose support is equal to the entire interval $[0, 1]$ (here, $\varpi^{b,f} = f^a$). However, the techniques used in the proof of Theorem 3.7 do not apply for unbounded distributions, so it is not clear whether $\varpi^{b,f}$ has an interpretation as the empirical distribution in this setting.

10. Market orders

In all of the above discussion we have explicitly assumed that all orders arrive as limit orders, although some of them will be executed immediately upon arrival. We can also consider a model in which there is a steady stream of market orders arriving at rate $\theta$.

Definition 3.41.
A market bid (respectively market ask) arriving at time $t$ matches the lowest available waiting limit ask (respectively highest available waiting limit bid) in the system. If no asks (respectively no bids) are waiting in the system, the market bid (respectively market ask) is cancelled immediately, and disappears.

Market orders are submitted by impatient traders, who want a trade to be executed immediately. We will assume that orders are not submitted when there is nobody waiting to trade. Equivalently, we may think that there is an infinite supply of waiting asks at price "$+\infty$" (very high), and an infinite supply of waiting bids at price "$-\infty$" (very low); in real-world financial markets, such liquidity might be provided by market makers (see footnote 6).

Note that some of the "limit" orders we have considered so far are also matched immediately, so in a sense the model we have been considering already has a notion of "market orders" as orders that do not have to wait to be executed. Let us refer to limit orders which are executed immediately as pseudo-market orders, to distinguish them from the new stream of market orders we are introducing in this section.

Pseudo-market orders have the property that when the bid price $\beta_t$ is high, the number of pseudo-market asks increases; when the ask price $\alpha_t$ is low, the number of pseudo-market bids increases. This is a natural dynamic: if the "price" of impatience decreases, we would expect more impatient orders. What we add now is an extra stream of market orders, corresponding to the assumption that a fraction of traders are impatient irrespective of the price.

Let $A^b_m(t)$ (respectively $A^a_m(t)$) denote the number of market bids (respectively asks) that have arrived up to time $t$. Consider a limit order book $\mathcal{L}$ whose arrivals happen in discrete time, and are supported on $[0, 1]$. Assume further that $A^b_m(t)$ is binomial with parameter $\theta_b t$, and $A^a_m(t)$ binomial with parameter $\theta_a t$, independently of all the other arrival and departure processes.
We can construct a process with the same distribution of waiting orders as follows. Speed up time, so that arrivals happen at times $\tau_t = \frac{t}{1 + \theta_a + \theta_b}$. With probability $\frac{\theta_a}{1 + \theta_a + \theta_b}$, let the incoming arrival be a market ask, which we will model as a pseudo-market ask with price uniformly distributed over $[-\theta_a, 0]$, so that it lies below every limit bid. With probability $\frac{\theta_b}{1 + \theta_a + \theta_b}$, let the incoming arrival be a market bid, which we will model as a pseudo-market bid with price uniformly distributed over $[1, 1 + \theta_b]$, so that it lies above every limit ask. Finally, with the remaining probability $\frac{1}{1 + \theta_a + \theta_b}$, let the incoming arrival be a limit order, which we will model as the first arrival event from the original arrival process that has not yet occurred.

Footnote 6: A market maker is an agent in the market who offers to buy and sell large quantities of the commodity, at low and high prices respectively. The difference between the buying and selling price pays for the risks associated with providing liquidity to the market.

Up to rescaling the price by a factor of $1 + \theta_a + \theta_b$, this is the same construction as was used in the restricted limit order book in §7. Consequently, we see that restricted limit order books can be naturally interpreted as limit order books with market orders.

In particular, if arrivals are iid uniform on $[0, 1]$, and $\theta_a = \theta_b = \theta$, we expect qualitatively different behaviour for $\theta < \frac{\kappa}{1 - 2\kappa} \approx 0.386$ and for $\theta > \frac{\kappa}{1 - 2\kappa}$ (where $\kappa = \kappa_b$ is as defined in Theorem 3.10). Specifically, we conjecture the following:

Conjecture 3.42. Suppose limit order arrivals are iid uniform on $[0, 1] \times \{\text{bid}, \text{ask}\}$, and suppose market bids and market asks arrive at rate $\theta$ as above, independently of the state of the system. Suppose $\theta > \frac{\kappa}{1 - 2\kappa}$, where $\kappa = \kappa_b$ is as defined in Theorem 3.10. Then the Markov chain describing the waiting orders in the system is positive recurrent.
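The critical rate in Conjecture 3.42 comes from simple arithmetic, which the sketch below spells out for the symmetric case $\theta_a = \theta_b = \theta$. It assumes, as the rescaling remark suggests, that market orders occupy price bands of width $\theta$ on either side of $[0, 1]$, so that after dividing the price axis by $1 + 2\theta$ the limit orders occupy the middle interval. The function names are illustrative.

```python
def restricted_interval(theta):
    """Interval [x, y] occupied by limit prices after rescaling [-theta, 1 + theta] to [0, 1]."""
    total = 1.0 + 2.0 * theta
    return theta / total, (1.0 + theta) / total

def critical_theta(kappa):
    """Market-order rate at which the left endpoint x of the interval equals kappa."""
    return kappa / (1.0 - 2.0 * kappa)

if __name__ == "__main__":
    kappa = 0.2178  # approximate threshold for iid uniform arrivals
    theta_star = critical_theta(kappa)
    print("critical theta:", theta_star)
    for theta in (0.2, theta_star, 0.6):
        x, y = restricted_interval(theta)
        side = "inside" if x >= kappa else "strictly containing"
        print(f"theta={theta:.3f}: [x, y]=[{x:.4f}, {y:.4f}] is {side} [kappa, 1-kappa]")
```

For $\theta$ above the critical value the interval sits strictly inside $(\kappa, 1-\kappa)$, which is the regime where Conjecture 3.34 predicts positive recurrence; for $\theta$ below it the interval strictly contains $[\kappa, 1-\kappa]$.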
For $\theta < \frac{\kappa}{1 - 2\kappa}$ we know that the Markov chain is not positive recurrent, because the corresponding restricted limit order book is restricted to an interval containing $[\kappa, 1 - \kappa]$, and consequently, the number of waiting bids and waiting asks is tending to infinity (Theorem 3.33).

11. Simulation results

In this section we show a few results of simulating the limit order book with $N = 100$ equally spaced bins, and uniform arrivals. The simulations were done in Maple.

Figure 3.2 plots the distribution $\varpi^b$ (in green) described in Proposition 3.30, along with the empirical distribution (in red) of the rightmost bid for a limit order book with $N = 100$ price bins over a sufficiently long time. The close agreement between the empirical distribution and the curve $\varpi^b$ suggests that the rightmost bid really does have a steady-state distribution, $\varpi^b$.

Figure 3.2. Distribution of the rightmost bid, together with the predicted curve.

Figure 3.3 presents the joint empirical distribution of the highest bid and the lowest ask with $N = 100$ bins. It seems plausible that the pair $(\beta_t, \alpha_t)$ has a true steady-state density on $[0, 1]^2$ (although we have not proved anything about the joint distribution even along subsequences; all our analysis was concerned with the marginal distributions). The distribution is supported on a triangle because we always have $\beta(t) < \alpha(t)$; the wide strip around the distribution corresponds to the threshold values $\kappa_b$ and $\kappa_a$. Although the density appears to dip sharply as we approach the corner $(\beta_t, \alpha_t) = (\kappa_b, \kappa_a)$, it appears to be positive everywhere except possibly the corner itself. This supports the conjecture that the restriction of a limit order book to any interval $[\kappa_b + \varepsilon, \kappa_a - \varepsilon]$ with $\varepsilon > 0$ is positive Harris recurrent.

Figure 3.3. Joint distribution of the rightmost bid and the leftmost ask.

Bibliography

M. Armony and A. R. Ward. Blind fair routing in large-scale service systems. Submitted, 2011. http://www.stern.nyu.edu/om/faculty/armony/ArWa_10_6_11.pdf.
R. Atar, Y. Shaki, and A. Shwartz. A blind policy for equalizing cumulative idleness. Queueing Systems, 67(4):275–293, 2011.
F. Baccelli and T. Bonald. Window flow control in FIFO networks with cross traffic. Queueing Systems, 32:195–231, 1999.
F. Baskett, K. M. Chandy, R. R. Muntz, and F. G. Palacios. Open, closed and mixed networks of queues with different classes of customers. Journal of the ACM, 22(2):248–260, 1975.
B. Bradie. A friendly introduction to numerical analysis. Prentice Hall, 2006.
M. Bramson. Stability and Heavy Traffic Limits for Queueing Networks: St. Flour Lectures Notes. Springer, 2006. http://www.math.duke.edu/~rtd/CPSS2007/Bramson.pdf.
M. Bramson. Instability of FIFO queueing networks. Annals of Applied Probability, 4(2):414–431, 1994.
M. Bramson, J. G. Dai, and J. M. Harrison. Positive recurrence of reflecting Brownian motion in three dimensions. Annals of Applied Probability, 20(2):753–783, 2010.
G. Brigham. On a congestion problem in an aircraft factory. Journal of the Operations Research Society of America, 3(4):412–428, 1955.
R. Caldentey, E. Kaplan, and G. Weiss. FCFS infinite bipartite matching of servers and customers. Advances in Applied Probability, 41:695–730, 2009.
R. Cont and A. de Larrard. Price dynamics in a Markovian limit order market. Working paper, 2010. http://ssrn.com/abstract=1735338.
M. Csörgő and L. Horváth. Weighted approximations in probability and statistics. Wiley, 1993.
J. G. Dai and T. Tezcan. State space collapse in many-server diffusion limits of parallel server systems. Mathematics of Operations Research, 36(2):271–320, 2011.
R. Delgado, F. J. López, and G. Sanz. Local conditions for the stochastic comparison of particle systems. Advances in Applied Probability, 36(4):1252–1277, 2004.
V. Dumas. A multiclass network with non-linear, non-convex, nonmonotonic stability conditions. Queueing Systems, 25:1–43, 1997.
A. El Kharroubi, A. Ben Tahar, and A. Yaacoubi. Sur la récurrence positive du mouvement Brownien réflechi dans l'orthant positif de R^n. Stochastics and Stochastics Reports, 68(3-4):229–253, 2000.
A. El Kharroubi, A. Ben Tahar, and A. Yaacoubi. On the stability of the linear Skorohod problem in an orthant. Mathematical Methods of Operations Research, 56(2):243–258, 2002.
M. Farkas. Dynamical Models in Biology. Academic Press, 2001.
F. G. Foster. A unified theory for stock, storage and queue control. Operations Research, 10(3):121–130, 1959.
D. Gamarnik and D. Katz. Stability of Skorokhod problem is undecidable. Submitted, 2010. arXiv:1007.1694v1.
D. Gamarnik and P. Momcilovic. Steady-state analysis of a multiserver queue in the Halfin-Whitt regime. Advances in Applied Probability, 40:548–577, 2008.
D. Gamarnik and A. L. Stolyar. Multiclass multiserver queueing system in the Halfin-Whitt heavy traffic regime. Asymptotics of the stationary distribution. Queueing Systems, pages 1–27, 2012. URL http://dx.doi.org/10.1007/s11134-012-9294-x. To appear in print.
D. Gamarnik and A. Zeevi. Validity of heavy traffic steady-state approximations in generalized Jackson networks. The Annals of Applied Probability, 16:56–90, 2006.
D. K. Gode and S. Sunder. Allocative efficiency of markets with zero-intelligence traders: Market as a partial substitute for individual rationality. Journal of Political Economy, 101(1):119–137, 1993.
M. D. Gould, M. A. Porter, S. Williams, M. McDonald, D. J. Fenn, and S. D. Howison. Limit order books. In preparation, 2011. arXiv:1012.0349v2.
I. Gurvich and W. Whitt. Queue-and-Idleness-Ratio controls in many-server service systems. Mathematics of OR, 34(2):363–396, 2009.
S. Halfin and W. Whitt. Heavy-traffic limits for queues with many exponential servers. Operations Research, 29(3):567–588, 1981.
J. M. Harrison. Brownian models of open processing networks: Canonical representation of workload. The Annals of Applied Probability, 10(1):75–103, 2000.
J. M. Harrison and M. J. López. Heavy traffic resource pooling in parallel-server systems. Queueing Systems, 33(4):339–368, 1999.
J. M. Harrison and V. Nguyen. The QNET method for two-moment analysis of open queueing networks. Queueing Systems: Theory and Applications, 6:1–32, 1990.
J. M. Harrison and R. J. Williams. Brownian models of open queueing networks with homogeneous customer populations. Stochastics, 22:77–115, 1987.
R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, 1985.
J. R. Jackson. Networks of waiting lines. Operations Research, 5(4):518–521, 1957.
J. R. Jackson. Jobshop-like queueing systems. Management Science, 10(1):131–142, 1963.
I. Karatzas and S. Shreve. Brownian Motion and Stochastic Calculus. Springer, second edition, 1996.
F. P. Kelly. Networks of queues. Advances in Applied Probability, 8:416–432, 1976.
F. P. Kelly and C. N. Laws. Dynamic routing in open queueing networks: Brownian models, cut constraints and resource pooling. Queueing Systems, 13:47–86, 1993.
D. Kendall. Some problems in the theory of queues. Journal of the Royal Statistical Society, 13(2):151–185, 1951.
J. F. C. Kingman. On queues in heavy traffic. Journal of the Royal Statistical Society, 24:383–392, 1962.
P. R. Kumar and T. I. Seidman. Dynamic instabilities and stabilization methods in distributed real-time scheduling of manufacturing systems. IEEE Transactions on Automatic Control, 35:289–298, 1990.
R. S. Liptser and A. N. Shiryaev. Theory of Martingales. Kluwer Academic Publishers, 1989. Translated from Russian by K. Dzjaparidze. In Russian published by Nauka 1986.
P. Lorek and R. Szekli. Strong stationary duality for Möbius monotone Markov chains: Unreliable networks. Queueing Systems, pages 1–17, 2012. URL http://dx.doi.org/10.1007/s11134-012-9284-z. To appear in print.
A. Mandelbaum and A. L. Stolyar. Scheduling flexible servers with convex delay costs: Heavy-traffic optimality of the generalized cµ-rule. Operations Research, 52:836–855, 2004.
A. Mandelbaum, W. A. Massey, and M. I. Reiman. Strong approximations for Markovian service networks. Queueing Systems, 30:149–201, 1998.
W. A. Massey. Stochastic orderings for Markov processes on partially ordered spaces. Mathematics of Operations Research, 12(2):350–367, 1987.
C. D. Meyer. Matrix analysis and applied linear algebra, volume 1. SIAM, 2000.
O. A. Nielsen. An Introduction to Integration and Measure Theory. John Wiley & Sons, 1997.
G. Pang, R. Talreja, and W. Whitt. Martingale proofs of many-server heavy-traffic limits for Markovian queues. Probability Surveys, 4:193–267, 2007.
C. A. Parlour. Price dynamics in limit order markets. Review of Financial Studies, 11(4):789–816, 1998.
D. Pollard. Convergence of Stochastic Processes. Springer-Verlag, 1984.
I. Roşu. A dynamic model of the limit order book. Review of Financial Studies, 22:4601–4641, 2009.
A. N. Rybko and A. L. Stolyar. Ergodicity of stochastic processes describing the operation of open queueing networks. Problemy Peredachi Informacii, 28(3):3–26, 1992.
D. Shah and D. Wischik. The teleology of scheduling algorithms for switched networks under light load, critical load, and overload. Submitted, 2009. http://www.cs.ucl.ac.uk/staff/d.wischik/Research/netsched2.pdf.
A. L. Stolyar and T. Tezcan. Shadow-routing based control of flexible multi-server pools in overload. Operations Research, 59(6):1427–1444, 2011.
A. L. Stolyar and T. Tezcan. Control of systems with flexible multi-server pools: A shadow routing approach. Queueing Systems, 66:1–51, 2010.
A. L. Stolyar and E. Yudovina. Systems with large flexible server pools: Instability of "natural" load balancing. Submitted, 2010. arXiv:1012.4140.
A. L. Stolyar and E. Yudovina. Tightness of invariant distributions of a large-scale flexible service system under a priority discipline. Submitted, 2012. arXiv:1201.2978.
R. Talreja and W. Whitt. Fluid models for overloaded multiclass many-server queueing systems with first-come, first-served routing.
Management Science, 54(8):1513–1527, 2008. J. A. van Mieghem. Dynamic scheduling with convex delay costs: The generalized cµ rule. The Annals ofApplied Probability, 5:809–833, 1995. N. D. Vvedenskaya, R. L. Dobrushin, and F. I. Karpelevich. Queueing system with selection of the shortest of two queues: An asymptotic approach. Problemy Peredachi Informacii, 32(1):20–34, 1996. L. M. Wein. Brownian networks with discretionary routing. Operations Research, 39(2): 322–340, 1991. W. Whitt. Blocking when service is required from several facilities simultaneously. AT&T Tehcnical Journal, 64(8):1807–1856, 1985. D. Williams. Probability with Martingales. Cambridge University Press, 1991. I. B. Ziedins and F. P. Kelly. Limit theorems for loss networks with diverse routing. Advances in Applied Probability, 21:804–830, 1989. 97 APPENDIX A Continuity of functions In this appendix we summarise the various notions of continuity of functions that we use. Definition A.1. A function f : R→ R is Lipschitz continuous with constant K if |f(x)− f(y)| ≤ |x− y| , ∀x, y ∈ R The definition is identical for functions defined on a subset of R. We may omit any explicit mention of the Lipschitz constant. Definition A.2. A function f : R → R is absolutely continuous if for all  > 0 there exists δ > 0 such that for any finite number of disjoint intervals (ak, bk)k=1,...,n with ∑n k=1(bk − ak) < δ we have ∑n k=1 |f(bk)− f(ak)| < . (The definition is similar for functions defined on a subset of R.) Clearly, any Lipschitz function is absolutely continuous; simply take δ = /K. The converse need not be true: the function y = x1/3 is absolutely continuous on R, but is not Lipschitz on any interval containing the origin. Absolutely continuous functions have the following property: Proposition A.3 (Theorem 20.8 of [Nielsen, 1997]). Let f : R → R be absolutely continuous. Then the set of points x such that f ′(x) does not exist has Lebesgue measure 0. 
Moreover, $f(y) - f(x) = \int_x^y f'(t)\,dt$, where the integral is the Lebesgue integral.

Definition A.4. A function $f : \mathbb{R} \to \mathbb{R}$ is right-continuous with left limits (RCLL) if, for all $x$,
$$\lim_{\epsilon \to 0} f(x + \epsilon) = f(x), \qquad \lim_{\epsilon \to 0} f(x - \epsilon) \text{ exists.}$$
(Here, $\epsilon$ is understood to be positive.) The definition is identical if the domain is a subset of $\mathbb{R}$. If the range is $\mathbb{R}^d$ rather than $\mathbb{R}$, the RCLL property needs to hold for each of the coordinates. This property is also often denoted càdlàg (“continue à droite, limitée à gauche”).

APPENDIX B
Another reason to restrict to a tree

In this appendix we informally describe another reason for wanting to restrict attention in Chapter 2 to a routing scheme where the allowed activities form a graph without cycles. This is based on an (apparently flawed!) intuition for how instabilities arise in networks. The examples in this section are taken from [Bramson, 2006, Chapter 3].

The earliest examples of queueing networks which exhibited unstable oscillations despite each station, on average, getting no more work than it could process were given by Kumar and Seidman [1990] and Rybko and Stolyar [1992]. In Figure B.1 we sketch the route the jobs in the Kumar-Seidman network take.

[Figure B.1. Diagram of job flow through the Kumar-Seidman network. Labels in the diagram: servers A and B; stations with service parameters $m_1$, $m_4$ at A and $m_2$, $m_3$ at B.]

We think directly about a “fluid” model. That is, each “job” has infinitesimal size, and instead we talk about the “work mass” (in a queue, or being processed by a server in a unit of time). The “work mass” arrives into the system at the top-left corner, deterministically at rate 1. Each of the rectangles A, B represents a single server, with two work stations (1 and 4 at A, 2 and 3 at B). Each job needs to visit each server twice, in the order indicated by the arrows. Service is deterministic: if the server is employed at station $i$, then it can process $m_i^{-1}$ units of work per unit time. We assume that each server is individually not overloaded: since work arrives at rate 1 and each unit of work needs $m_i$ units of server time at station $i$, this means $m_1 + m_4 < 1$ and $m_2 + m_3 < 1$.
We will take particular values $m_2 = m_4 = \frac{2}{3}$, $m_1 = m_3 = 0$; the result holds for any values satisfying $m_2 + m_4 > 1$.

There are no routing choices here, as each job simply moves on to the next station. There is, however, a choice of scheduling, i.e. which of the stations to service. Let us suppose that the discipline is a clearing policy: once a server starts working on the jobs at one of its two stations, it will continue working at the same station until there is no more work in that queue. (This is a sensible policy if there are large costs associated with the server switching from one station to the other.)

Suppose the network starts empty, and work mass $M$ enters the system at station 1. It is immediately served by server A at station 1, and passes on to station 2. At time $\frac{2}{3}M$, server B at station 2 will finish processing these initial jobs; however, an additional $\frac{2}{3}M$ jobs will enter the system in that time, be immediately processed at station 1, and join the queue at station 2, so server B continues processing the jobs at station 2. This cycle will terminate at time $2M$, when the queue at station 2 will empty out, and server B will immediately process the batch of jobs through station 3. Thus, at time $2M$ the $3M$ jobs (the initial $M$ plus the $2M$ that arrived in the meantime) will enter service at station 4, blocking service of new arrivals at station 1. The $3M$ jobs at station 4 will be served from time $2M$ until $4M$. During this time all jobs arriving into the system at station 1 will be forced to wait. Finally, at time $4M$ stations 2, 3 and 4 will again be empty, and a batch of $2M$ queued jobs will enter service at station 1. As we iterate the process, the number of jobs queueing in the system will double at every iteration, so this network is not stable.
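The arithmetic of one cycle of this walk-through can be reproduced with a short Python sketch. It is only an illustration under the assumptions stated in the text (arrival rate 1, $m_1 = m_3 = 0$, clearing policy, network otherwise empty at the start of the cycle); the function names `kumar_seidman_cycle` and `batch_sizes` are hypothetical and do not come from the thesis or from [Bramson, 2006].

```python
from fractions import Fraction


def kumar_seidman_cycle(batch, m2, m4):
    """One cycle of the fluid walk-through, assuming m1 = m3 = 0 and arrival rate 1.

    `batch` is the work mass M released at station 1 at the start of the cycle.
    Returns (t_drain, work_at_4, t_end, next_batch), with times measured
    from the start of the cycle.
    """
    if not (0 < m2 < 1 and 0 < m4 < 1):
        raise ValueError("this sketch assumes 0 < m2 < 1 and 0 < m4 < 1")
    # Station 2 serves at rate 1/m2 while new work keeps arriving at rate 1,
    # so its queue shrinks at net rate 1/m2 - 1 and empties at time t_drain.
    t_drain = batch / (1 / m2 - 1)
    # All work that has arrived by then passes station 3 instantly (m3 = 0)
    # and reaches station 4 as a single batch.
    work_at_4 = batch + t_drain
    # Server A clears station 4; arrivals at station 1 have to wait meanwhile.
    service_time_4 = m4 * work_at_4
    t_end = t_drain + service_time_4
    # Work accumulates at station 1 at rate 1 during the station-4 service.
    next_batch = service_time_4
    return t_drain, work_at_4, t_end, next_batch


def batch_sizes(initial, m2, m4, cycles):
    """Batch released at station 1 at the start of each successive cycle."""
    sizes = [initial]
    for _ in range(cycles):
        sizes.append(kumar_seidman_cycle(sizes[-1], m2, m4)[3])
    return sizes


if __name__ == "__main__":
    two_thirds = Fraction(2, 3)
    print(kumar_seidman_cycle(Fraction(1), two_thirds, two_thirds))
    print(batch_sizes(Fraction(1), two_thirds, two_thirds, 5))
```

In this sketch the batch is multiplied by $m_4/(1 - m_2)$ in each cycle, which exceeds 1 exactly when $m_2 + m_4 > 1$; for $m_2 = m_4 = \frac{2}{3}$ the factor is 2, matching the doubling described above.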
The reason for the instability is that jobs block the servers from doing work: if the servers were to be busy all the time, they would be able to handle the amount of work coming in, but the scheduling discipline means that it is sometimes impossible for jobs to reach the idle servers.

It seems that a salient feature of this network is the cycle present in the routing graph: jobs need to be serviced by server A, then B, then A again, and this repetition allows the system state to grow over the course of the cycle. The same feature appears in almost all of the examples in [Bramson, 2006, Chapter 3]. (An exception is the example constructed by Baccelli and Bonald [1999]; in their network, a single stream of jobs interacts with many cross-traffic streams, without ever returning to a previously visited server.) It seems intuitively plausible that restricting the routing graph to a tree would alleviate the problem, as there is “less room” to create growing excitations. Of course, the instability examples in §2.5.5 demonstrate that this intuition is flawed, and that unstable, exponentially-growing excitations can occur even without any cyclic structure.

APPENDIX C
Halfin-Whitt regime

In this appendix we summarise some of the results of Halfin and Whitt [1981]. Our notation in what follows is, as much as possible, consistent with the rest of the section on call centre models, rather than with [Halfin and Whitt, 1981].

Consider a sequence of $M/M/\beta^r$ queues, indexed by a scaling parameter $r$ (see footnote 1). The arrival process in the $r$th system is Poisson of rate $\lambda^r$; the service in the $r$th system is exponential with parameter $\mu$, and there are $\beta^r \equiv r$ servers. We will be interested in the case where
$$\rho^r \equiv \frac{\lambda^r}{\mu \beta^r} = 1 - \nu r^{-1/2}$$
for some $\nu > 0$. (Here, $\mu$ and $\nu$ are constants which do not depend on $r$.)

Remark C.1. Currently, our systems are indexed by the number of servers, and we carefully regulate the arrival rate into the $r$th queue.
By renumbering, we can of course equivalently think of $\lambda^r = \lambda r$ with $\beta^r = \frac{\lambda}{\mu} r + O(\sqrt{r})$.

Let $X^r(t)$ denote the number of customers in the $r$th system (in queue and in service) at time $t$. Under the above scaling of arrival and service rates, the $r$th system will be described by a positive recurrent Markov process, so $X^r(t) \to X^r$ in distribution as $t \to \infty$ for some limiting variable $X^r$. Recall that the steady-state distribution of the $M/M/\beta$ queue is
$$p_k = P(X = k) = \begin{cases} \dfrac{1}{B} \dfrac{(\beta\rho)^k}{k!}, & k \le \beta \\[2ex] \dfrac{1}{B} \dfrac{\beta^\beta \rho^k}{\beta!}, & k \ge \beta \end{cases}$$
where $B$ is a normalising constant. We can also compute the probability that an arriving customer will have to wait (note that Poisson arrivals see time averages, so it is enough to compute the steady-state probability of all $\beta$ servers being occupied):
$$P(X \ge \beta) = \frac{1}{B} \frac{(\beta\rho)^\beta}{\beta!\,(1 - \rho)}.$$
Note that in all of this, $B$ will implicitly depend on $r$. Cleverly expanding this for the case $\beta^r = r$ and $\rho^r = 1 - \nu r^{-1/2}$ gives:

Theorem C.2 (Proposition 1 of [Halfin and Whitt, 1981]). Let $\beta^r \equiv r$ and $\rho^r \equiv 1 - \nu r^{-1/2}$. Then as $r \to \infty$,
$$P(X^r \ge \beta^r) \to \alpha \equiv \left(1 + \sqrt{2\pi}\,\nu\,\Phi(\nu)\exp(\nu^2/2)\right)^{-1}.$$
Here, $\Phi$ is the cumulative distribution function of the standard normal random variable.

That is, under the $O(\sqrt{r})$ overstaffing, the probability that an arriving customer will have to wait converges to some constant between 0 and 1.

We can also consider the behaviour of the diffusion-scaled queue size process, $\hat{X}^r(t) = r^{-1/2}(X^r(t) - \beta^r)$. Note that this can be both positive and negative: when $\hat{X}^r(t)$ is positive, it corresponds to a positive queue, and when it is negative, it corresponds to server idleness.

Footnote 1. In $\beta^r$, $r$ is the index rather than the exponent; we will in fact take $\beta^r = r$.

Theorem C.3 (Theorem 2 of [Halfin and Whitt, 1981]). Let $\beta^r \equiv r$ and $\rho^r \equiv 1 - \nu r^{-1/2}$. Suppose $\hat{X}^r(0) \to \hat{X}(0)$.
Then $\hat{X}^r(\cdot) \Rightarrow \hat{X}(\cdot)$ in the Skorohod space $D[0,\infty)$, where $\hat{X}$ is a diffusion process satisfying the stochastic differential equation
$$d\hat{X}_t = m(\hat{X})\,dt + \sigma(\hat{X})\,dB_t,$$
where $B$ is a Brownian motion (independent from all other quantities in the problem), and
$$m(x) = -\mu\nu - \mu x\,1_{\{x \le 0\}}, \qquad \sigma(x) = \sqrt{2\mu}.$$

Without proving the convergence, we argue that the drift and variance are correct. Indeed, the instantaneous drift in the $r$th system is
$$m^r(\hat{X}^r(t)) \equiv \lim_{\epsilon \to 0} \frac{1}{\epsilon} E\big[\hat{X}^r(t+\epsilon) - \hat{X}^r(t) \,\big|\, \hat{X}^r(t)\big] = \begin{cases} r^{-1/2}(-r\mu + \lambda^r), & \hat{X}^r(t) > 0 \\ r^{-1/2}\big(-(r + \sqrt{r}\,\hat{X}^r(t))\mu + \lambda^r\big), & \hat{X}^r(t) \le 0 \end{cases}$$
which by the scaling on $\beta^r$ and $\rho^r$ equals $-\mu\nu - \mu x\,1_{\{x \le 0\}}$ with $x = \hat{X}^r(t)$. The instantaneous variance is
$$(\sigma^r)^2(\hat{X}^r(t)) \equiv \lim_{\epsilon \to 0} \frac{1}{\epsilon} E\big[(\hat{X}^r(t+\epsilon) - \hat{X}^r(t))^2 - (\epsilon\, m^r(\hat{X}^r(t)))^2 \,\big|\, \hat{X}^r(t)\big] = \begin{cases} r^{-1}(r\mu + \lambda^r), & \hat{X}^r(t) > 0 \\ r^{-1}\big((r + \sqrt{r}\,\hat{X}^r(t))\mu + \lambda^r\big), & \hat{X}^r(t) \le 0 \end{cases}$$
which converges to $2\mu$ under the chosen scaling. Here, we use the fact that the arrival and service processes are Poisson; clearly, the variance would change for general interarrival or service times.

We now have the following limits:
$$\begin{array}{ccc} \hat{X}^r(t) & \xrightarrow[\;t \to \infty\;]{\text{positive recurrent}} & \hat{X}^r \\ \Big\downarrow {\scriptstyle\, r \to \infty \text{ (Theorem C.3)}} & & \Big\downarrow {\scriptstyle\, r \to \infty\ ?} \\ \hat{X}(t) & \xrightarrow[\;t \to \infty\;]{?} & \hat{X} \end{array}$$

We would like to know whether there is in fact a limiting distribution $\hat{X}$ on the diffusion scale, which is simultaneously the steady-state distribution of the diffusion process $\hat{X}(t)$ (assuming it has a steady-state distribution) and the weak limit of the steady-state variables $\hat{X}^r$. Typically, we are interested in the steady-state $X^r$ for large $r$, which we might attempt to approximate by the steady-state of the diffusion approximation (which is computationally “easier”). However, for this model it turns out to be relatively simple to compute the limit of $\hat{X}^r$ directly:

Theorem C.4 (Theorem 1 and Corollary 2 of [Halfin and Whitt, 1981]). In the above set-up, $\hat{X}^r \to \hat{X}$, where $\hat{X}$ has an exponential tail with parameter $\nu$ above 0, and a normal tail below 0.
More precisely, $P(\hat{X} \ge 0) = \alpha$ as given by Theorem C.2, above 0 we have $P(\hat{X} > x \mid \hat{X} \ge 0) = e^{-x\nu}$, and below 0 we have $P(\hat{X} \le x \mid \hat{X} \le 0) = \Phi(\nu + x)/\Phi(\nu)$, where $\Phi$ is the cumulative distribution function of a standard normal variable. This is also the invariant distribution of the diffusion process $\hat{X}(\cdot)$ given by Theorem C.3.

The argument for showing convergence $\hat{X}^r \to \hat{X}$ is similar to the argument for Theorem C.2, in that it involves explicitly approximating the steady-state distributions for an $M/M/\beta$ queue. Note that if $\hat{X}^r \to \hat{X}$, then $\hat{X}$ must be the steady-state distribution of $\hat{X}(\cdot)$. Indeed, letting $\hat{X}^r(0) \equiv \hat{X}^r \to \hat{X} \equiv \hat{X}(0)$, we obtain at all future times $\hat{X}(t) = \hat{X}$ in distribution, so $\hat{X}$ is the invariant distribution for $\hat{X}(\cdot)$.

In more general systems, it will be sufficient to prove (a) tightness of the family of distributions $\hat{X}^r$, and (b) the existence of a “nice” limiting process (e.g., solution to an SDE) $\hat{X}(\cdot)$: then any subsequential limit of $\hat{X}^r$ will be an invariant distribution of $\hat{X}(\cdot)$, which for a nice process is unique. Therefore, all convergent subsequences of $\hat{X}^r$ converge to $\hat{X}$, and under tightness, any subsequence of $\hat{X}^r$ has a convergent subsequence. We conclude that $\hat{X}^r \to \hat{X}$.

APPENDIX D
Computations

1. Computations for Example 2.32

In this section, we include the Maple code (with output) for constructing the small example of a call centre model which is unstable in underload (Example 2.32 in §2.5.5). In the multi-page formula expanding the quantity $c_2 c_1 - c_0$, whose sign determines the presence of eigenvalues with positive real part in the matrix $A_u$, note that all but four terms are preceded by a − sign (the four pluses have been replaced by ⊕ to make them stand out). This supports the informal Conjecture 2.46 that parameters leading to instability are rare.

# We show here the calculation that allows us to find
# the local instability example in underload.
# We consider a simple 3-customer-type,
# 4-server-type system as in the paper.
# The customer types are a, b, c;
# the server types are 1, 2, 3, 4.
# We do not divide through by B = sum_j beta_j
# in computations below; in the numerical example
# it will be equal to 1.
#
# Entries of the matrix A_u (or, rather, of
# B*A_u, as explained above):
# Diagonal entries
#
A[aa]:=-mu[a1]*beta[1]-mu[a2]*(beta[2]+beta[3]+beta[4]):
A[bb]:=-mu[b2]*(beta[1]+beta[2])-mu[b3]*(beta[3]+beta[4]):
A[cc]:=-mu[c3]*(beta[1]+beta[2]+beta[3])-mu[c4]*beta[4]:
#
# Off-diagonal entries
B:= beta[1]+beta[2]+beta[3]+beta[4]:
A[ab]:= A[aa]+B*mu[a2]: A[ac]:=A[aa]+B*mu[a2]:
A[ba]:= A[bb]+B*mu[b2]: A[bc]:=A[bb]+B*mu[b3]:
A[ca]:= A[cc]+B*mu[c3]: A[cb]:=A[cc]+B*mu[c3]:
#
# Matrix
A[u]:=Matrix([[A[aa], A[ab], A[ac]],
              [A[ba], A[bb], A[bc]],
              [A[ca], A[cb], A[cc]]]):
#
# Characteristic polynomial
with(LinearAlgebra):
charpol:=CharacteristicPolynomial(A[u], x):
#
# If the polynomial is
# x^3 - c_2*x^2 + c_1*x - c_0 then
# c_0 is the product of the roots, c_1 is the sum of
# products of pairs of roots, c_2 is the sum of the roots.
# The expression c_2*c_1 - c_0 will be
# negative if all the roots have negative real parts.
# We compute the coefficients and the expression below.
# Observe that most terms in the expression come
# with a "-" sign. However, some come with a "+" sign;
# these have the "+" circled.
# Setting the corresponding parameters to be very
# large while keeping the remaining parameters very small
# produces the desired counterexample: see below.
# c[2]:=-coeff(charpol, x^2): c[1]:=coeff(charpol, x): c[0]:=-coeff(charpol, x, 0): expr:=simplify(c[2]*c[1]-c[0]); # expr := −2µc3β31µb2µa1 − 2µc3β32µb2µa2 − 2µc3β33µb3µa2 − 2µc4β34µb3µa2 − µ2c3β 2 1β4µb2 − 3µ2c3β21µb2β2 − µ2c3β21µb3β3 − µ2c3β21µa2β2 − µ2c3β21µa2β3 − 2µ2c3β 2 1β2µa1− 2µ2c3β21β3µb2− 2µ2c3β21β3µa1− 2µ2c3β1β23µa2− 2µ2c3β1β23µb3− 2µ2c3β1β 2 2µa2 − 3µ2c3β1β22µb2 − µ2c3β21β4µa1 − µ2c3β22β4µb2 − µ2c3β22µb3β3 − µ2c3β 2 2µa1β1 − 3µ2c3β22µa2β3 − µ2c3β22µa2β4 − 2µ2c3β22β3µb2 − 3µ2c3β2β23µa2 − 2µ2c3β2β 2 3µb3 − µ2c3β23µb2β1 − µ2c3β23µb2β2 − µ2c3β23µb3β4 − µ2c3β23µa1β1 − µ2c3β 2 3µa2β4 − µ2a2β32µc3 − µ2a1β31µb2 − 3µ2a2β2µc3β23 − 2µ2a2β2µb3β23 − µ2a2β 2 2µb2β1 − 2µ2a2β22µb2β3 − 2µ2a2β22µb2β4 − µ2a2β22µb3β3 − µ2a2β22µb3β4 − 2µ2a2β2µb3β 2 4 − µ2a2β23µc4β4 − µ2a2β23µc3β1 − 2µ2a2β23µc3β4 − 2µ2a2β3µc4β24 − µ2a2β 2 3β1µb3 − µ2a2β23µb2β2 − 3µ2a2β23µb3β4 − 3µ2a2β3µb3β24 − µ2a2β24µc3β2 − µ2a2β 2 4µc3β3 − µc4β4µ2a2β3β1 − µc4β24µa2µa1β1 − µc4β4µb2β21µa1 − µc4β4µa1β 2 1µb3 − µc4β4µa2β21µb2 ⊕ µc4β4µa2β21µb3 − µc3β4µa1β21µa2 − µc3β4µ 2 a2β2β1 − µc3β4µ2a2β3β1 − µc3β24µa2µa1β1 − µc3β4µb2β21µa1 − µc3β4µa1β 2 1µb3 − µc3β4µa2β21µb2 − µc3β4µa2β21µb3 − µc4β24β1µb3µa1 − µc4β 2 4β1µb3µa2 − µc4β24β1µb2µa1 − µc4β24β1µb2µa2 − µc4β4µb2β21µb3 − µc4β4µb2β 2 2µb3 − µc4β4µ2b3β3β1 − µc4β4µ2b3β3β2 − µc4β24µb3µb2β1 − µc4β 2 4µb3µb2β2 − µc3β24β1µb3µa1 − µc3β24β1µb3µa2 ⊕ µc3β24β1µb2µa1 − µc3β 2 4β1µb2µa2 − 2µc3β4µ2b2β1β2 − µc3β4µb2β21µb3 − µc3β4µb2β22µb3 − µc3β4µ 2 b3β3β1 − µc3β4µ2b3β3β2 − µc3β24µb3µb2β1 − µc3β24µb3µb2β2 − 2µc4β4µb2β 2 2µa2 − 2µc4β24µb2β2µa2 − 2µc4β24µb3β2µa2 − 2µc3β4µb2β22µa2 − 2µc3β4µb3β 2 2µa2 − 2µc3β24µb3β2µa2 − 2µc3β21µb2β3µa1 − 2µc3β21µb2µa2β2 − 4µc3β 2 1µb2β2µa1 − 4µc3β1µb2β22µa2 − 2µc3β1µb3β23µa2 − 2µc3β22µb2µa1β1 − 4µc3β 2 2µb2µa2β3 − 2µc3β22µb3β3µa2 − 4µc3β2µb3β23µa2 − 2µc3β23µb2β2µa2 − 4µc3β 2 3µb3µa2β4 − 2µc3β3µb3β24µa2 − 2µc4β4µb3β23µa2 − 4µc4β24µb3β3µa2 − 2µ2c3β1β2µb3β3 − 4µ2c3β1β2µa2β3 − µ2c3β1β2µa2β4 − 4µ2c3β1β3µb2β2 − µ2c3β1β3µb3β4 − µ2c3β1β3µa2β4 − 
µc3β1µc4β24µa2 − µc3β1µc4β24µb3 − µc3β 2 1µc4β4µa2 − µc3β21µc4β4µb3 − 2µc3β22µc4β4µa2 − µc3β22µc4β4µb3 − µ2c3β2β3µb3β4 − 2µ2c3β2β3µa1β1 − 2µ2c3β2β3µa2β4 − 2µc3β2µc4β24µa2 − µc3β2µc4β 2 4µb3 − µc4β24µ2a2β1 − µc3β4µ2a1β21 − µc4β24µ2b3β1 − µc4β24µ2b3β2 − µc3β4µ 2 b2β 2 1 − µc3β4µ2b2β22 − µ2c4β24µa2β2 − µ2c4β24µa2β3 − µ2c4β24µb3β2 − µ2c4β 2 4µb3β3 − µ2c4β24µa2β1 − µ2c4β24µb3β1 − 3µ2b2β21µc3β2 − µ2b2β21µc3β3 − 3µ2b2β1µc3β 2 2 − 2µ2b2β1β22µa2 − µ2b2β21β3µa1 − µ2b2β21β4µa1 − µ2b2β21µa2β2 − 108 2µ2b2β 2 1β2µa1 − µ2b2β22µc3β3 − µ2b2β22µa1β1 − µ2b2β22µa2β3 − µ2b2β22µa2β4 − µ2b3β 2 3µc3β1 − µ2a2β34µc4 − µ2a2β33µb3 − µ2a2β33µc3 − µ2a2β32µb2 − µ2a2β34µb3 − µ2a1β 3 1µc3 − µ2b3β34µa2 − µ2b3β34µc4 − µ2a2β24β1µb3 − µ2a2β24µb2β2 − µc4β4µa2β3µa1β1 − µc4β4µb2β2µa1β1 − µc4β4µa1β1µb3β2 − 3µc4β4µb2β1µa2β2 ⊕ µc4β4µa2β1µb3β2 − 3µc3β4µa2β2µa1β1 − 3µc3β4µa2β3µa1β1 − µc3β4µb2β2µa1β1 − µc3β4µa1β1µb3β2 − 3µc3β4µb2β1µa2β2 − 3µc3β4µa2β1µb3β2 − µc4β4β1µb3β3µa1 − µc4β4β1µb3β3µa2 − µc4β4β1µb2β3µa1 − µc4β4β1µb2β3µa2 − 2µc4β4µb2β1µb3β2 − µc4β4µb3β3µb2β1 − µc4β4µb3β3µb2β2 − 3µc3β4β1µb3β3µa1 − 3µc3β4β1µb3β3µa2 ⊕ µc3β4β1µb2β3µa1 − 3µc3β4β1µb2β3µa2 − 2µc3β4µb2β1µb3β2 − 3µc3β4µb3β3µb2β1 − 3µc3β4µb3β3µb2β2 − 2µc4β4µb2β2µa2β3 − 2µc4β4µb3β2µa2β3 − 2µc3β4µb2β2µa2β3 − 6µc3β4µb3β2µa2β3 − 6µc3β1µb2β2µa2β3 − 2µc3β1µb3β3µa2β2−µ2b3β23µc3β2−2µ2b3β23µc3β4−µ2b3β23µc4β4−2µ2b3β3µc4β24− µ2b3β 2 3β1µa2 − µ2b3β23µa2β2 − 3µ2b3β23µa2β4 − 3µ2b3β3β24µa2 − µ2b3β24µc3β3 − µ2b3β 2 4β1µa2 − µ2b3β24µa2β2 − µ2a1β21µc3β2 − µ2a1β21µc3β3 − µ2a1β21µb2β3 − µ2a1β 2 1µb2β4 − µ2a1β21µb2β2 − µ2a2β22µc4β4 − µ2a2β22µc3β1 − 3µ2a2β22µc3β3 − 2µ2a2β 2 2µc3β4 − 2µ2a2β2µc4β24 − µ2b2β32µa2 − µ2b2β32µc3 − µ2b2β31µa1 − µ2b3β 3 3µa2 − µ2b3β33µc3 − µ2c4β34µb3 − µ2c4β34µa2 − µ2b2β31µc3 − µ2c3β31µb2 − µ2c3β 3 3µb3 − µ2c3β33µa2 − µ2c3β32µb2 − µ2c3β32µa2 − µ2c3β31µa1 − µc4β4µ2a2β2β1 − µc4β4µa1β 2 1µa2 − µc4β4µa2β2µa1β1 − 2µb2β22µc3µb3β3 − 2µb2β2µc3β23µb3 − 2µb2β2µb3β 2 3µa2 − µ2b2β2β1β3µa1 − µ2b2β2β1β4µa1 − 2µb2β22µb3β3µa2 − 2µb2β 2 2µb3β4µa2 − 
2µb2β2µb3β24µa2 − 2µb3β23µc3µa1β1 − 2µb3β3µc3β21µa1 − µb3β3µb2β 2 1µa1 − 2µ2b3β3β1β4µa2 − µb3β23β1µb2µa1 − 2µ2b3β3β4µa2β2 − µb3β4µb2β 2 1µa1 − µb3β24β1µb2µa1 − 2µa1β21µc3µa2β2 − 2µa1β21µc3µa2β3 − 2µa1β1µc3β 2 3µa2 − 2µa1β1µc3β22µa2 − µa1β1µb3β23µa2 − 2µa1β1µb2β22µa2 − µa1β 2 1µb3β3µa2 − µa1β21µb3β4µa2 − 2µa1β21µb2µa2β2 − µa1β1µb3β24µa2 − 2µ2a2β2µc4β4β3 − 2µ2a2β2µc3β1β3 − 4µ2a2β2µc3β3β4 − µ2a2β2β1µb3β3 − µ2a2β2β1µb3β4 − 4µ2a2β2µb3β3β4 − µa2β3µb2β21µa1 − 2µ2a2β3β1µb3β4 − µa2β 2 3β1µb2µa1 − µ2a2β3µb2β1β2 − 2µ2a2β3µb2β2β4 − µa2β4µb2β21µa1 − µa2β 2 4β1µb2µa1 − µ2a2β4µb2β1β2 − µ2c3β2β4µa1β1 − 2µc3β23µc4β4µa2 − µ2c3β3β4µb2β1 − µ2c3β3β4µb2β2 − 2µc3β23µc4β4µb3 − 2µc3β3µc4β24µa2 − 2µc3β3µc4β 2 4µb3 − µ2c3β3β4µa1β1 − µc4β24µc3µb2β1 − µc4β24µc3µb2β2 − µc4β4µc3β 2 2µb2 − µc4β4µc3β21µa1 − µc4β4µc3β21µb2 − µc4β24µc3µa1β1 − 2µb2β 2 1µc3µb3β3 − 2µb2β21µc3µa2β3 − 2µ2b2β1µc3β3β2 − 2µb2β1µc3β23µa2 − 2µb2β1µc3β 2 3µb3 − µb2β1µb3β23µa2 − µb2β21µb3β3µa2 − µb2β21µb3β4µa2 − µ2b2β1β2µa2β3 − µ2b2β1β2µa2β4 − µb2β1µb3β24µa2 − 2µc3β2β1µb2β3µa1 − 2µ2c3β1β4µb2β2 − 3µc3β1µc4β4µa2β2 − 3µc3β1µc4β4µa2β3 − 2µc3β1µc4β4µb3β2 − 3µc3β1µc4β4µb3β3 − 4µc3β2µc4β4µa2β3 − 3µc3β2µc4β4µb3β3 − 2µc4β4µc3β1µb2β2 − µc4β4µc3β2µa1β1 − µc4β4µc3β3µb2β1−µc4β4µc3β3µb2β2−µc4β4µc3β3µa1β1−4µb2β1µc3β2µb3β3− 3µb2β1µb3β3µa2β2 − 2µb2β1µb3β3µa2β4 − 3µb2β1µb3β4µa2β2 − 4µb2β2µb3β3µa2β4 − 2µb3β3µc3β2µa1β1 − 2µb3β3β1µb2β4µa1 − µb3β3µb2β2µa1β1 − µb3β4µb2β2µa1β1 − 4µa1β1µc3β2µa2β3 − 3µa1β1µb2β2µa2β3 − 3µa1β1µb2β2µa2β4 − µa1β1µb3β3µa2β2 − 2µa1β1µb3β3µa2β4 − µa1β1µb3β4µa2β2 − 2µa2β3β1µb2β4µa1 # # Numerical values and the counterexample # # We find a positive term in the above expression: 109 coeff(coeff(coeff(coeff(expr, mu[a2]),mu[b3]),mu[c4]),beta[1]^2); # β4 # We set beta_1, mu_a2, mu_b3, mu_c4 to be large # and the rest of the parameters to be small. # Setting beta_4 large would not be productive, # because there may well be (and, in fact, are) # positive terms depending on beta_4 through # beta_4^2 or even beta_4^3. 
#
eval(expr, {mu[a1] = 1, mu[a2] = 100, mu[b2] = 1, mu[b3] = 100,
            mu[c3] = 1, mu[c4] = 100, beta[1] = .97, beta[2] = 0.1e-1,
            beta[3] = 0.1e-1, beta[4] = 0.1e-1});
# 6464.105200
# We compute the matrix, and check that it in fact
# has a pair of eigenvalues with positive real part
#
A:=eval(A[u], {mu[a1] = 1, mu[a2] = 100, mu[b2] = 1, mu[b3] = 100,
               mu[c3] = 1, mu[c4] = 100, beta[1] = .97, beta[2] = 0.1e-1,
               beta[3] = 0.1e-1, beta[4] = 0.1e-1});

$$A := \begin{bmatrix} -3.97 & 96.03 & 96.03 \\ -1.98 & -2.98 & 97.02 \\ -0.99 & -0.99 & -1.99 \end{bmatrix}$$

Eigenvalues(A);

$$\begin{bmatrix} 4.45477075946149625 + 23.3689162935988684\,i \\ 4.45477075946149625 - 23.36891629\,i \\ -17.8495415189230116 + 0.0\,i \end{bmatrix}$$

2. Computations for Example 2.36

In this section, we include the Maple code (with output) for demonstrating that the 21-customer-class call centre model in Example 2.36 (§2.5.5) is unstable in underload.

# We show that the 21-customer-type example
# is really an example of local instability.
#
#   a   b   c   d   e   f   g   h   i   j   k   l  ...
#  / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ /
# 1   2   3   4   5   6   7   8   9  10  11  12  13
#
# ...   m   n   o   p   q   r   s   t   u
#    \ / \ / \ / \ / \ / \ / \ / \ / \ / \
#    13  14  15  16  17  18  19  20  21  22
#
# The service rates are 1 to the left always,
# 1/3 to the right on the leftmost 12 edges (a2 ... l13),
# 3 to the right on the rightmost 9 edges (m14 ... u22).
#
# We construct A_u:
#
betas:=Vector(22,1): B:=22:
#
mu:=Matrix(21,22,0):
for i from 1 to 21 do mu[i,i]:=1: od:      # service rate to the left is 1
for i from 1 to 12 do mu[i,i+1]:=1/3: od:  # service rate to the right is 1/3 for first 12 edges
for i from 13 to 21 do mu[i,i+1]:=3: od:   # and 3 for the last 9 edges
#
A_u:=Matrix(21,21,0):
for i from 1 to 21 do   # diagonal entries of A_u
  A_u[i,i]:=-1/B*(mu[i,i]*i+mu[i,i+1]*(B-i)):
od:
for i from 1 to 21 do   # off-diagonal entries of A_u
  for k from 1 to i-1 do A_u[i,k]:=A_u[i,i]+mu[i,i]: od:
  for k from i+1 to 21 do A_u[i,k]:=A_u[i,i]+mu[i,i+1]: od:
od:
#
#
# We compute the eigenvalues:
with(LinearAlgebra):
evals:=evalf(Eigenvalues(A_u)):
evalm(evals);

0.03033448065 + 0.3508691241i, −1.099781956 + 2.908783840i, −0.2847639055 + 0.2914327420i,
−1.796376325 + 1.354320217i, −0.3937698928 + 0.2093609582i, −1.894085480 + 0.7212796463i,
−0.4453938101 + 0.1408808854i, −0.4715951453 + 0.08138816700i, −1.923518325 + 0.3232948910i,
−0.4828957275 + 0.02663815111i, −1.930853280, −0.4828957275 − 0.02663815111i,
−1.923518325 − 0.3232948910i, −0.4715951453 − 0.08138816700i, −0.4453938101 − 0.1408808854i,
−1.894085480 − 0.7212796463i, −0.3937698928 − 0.2093609582i, −1.796376325 − 1.354320217i,
−0.2847639055 − 0.2914327420i, −1.099781956 − 2.908783840i, 0.03033448065 − 0.3508691241i

#
# Largest real part of an eigenvalue is positive:
max(Re(evals));

0.03033448065

3. Computations for Lemma 2.39

In this section, we include the Maple code (with output) for demonstrating that call centre models with (at most) 4 customer classes are locally stable in critical load, when q > 0 (Lemma 2.39).

# We show here that the four-customer-type
# critically loaded systems are locally stable.
# There are two essentially different arrangements
# of the four customer types:
# a   b   c   d
#  \ / \ / \ /
#   1   2   3
# and
# b--1
#     \
#      a--2--c
#     /
# d--3
# We do not divide through by B = sum_j beta_j
# in computations.
# This does not affect the sign of anything,
# and makes the results more legible.
#
# Case 1:
# a   b   c   d
#  \ / \ / \ /
#   1   2   3
# We compute the entries of A_c by first computing
# the entries of A_u (or, rather, of B*A_u as we
# explained above):
# Entries of A_u:
# Diagonal entries
#
Au1[aa]:=-mu[a1]*(beta[1]+beta[2]+beta[3]):
Au1[bb]:=-mu[b1]*beta[1]-mu[b2]*(beta[2]+beta[3]):
Au1[cc]:=-mu[c2]*(beta[1]+beta[2])-mu[c3]*beta[3]:
Au1[dd]:=-mu[d3]*(beta[1]+beta[2]+beta[3]):
#
# Off-diagonal entries
B:=beta[1]+beta[2]+beta[3]:
Au1[ab]:=0: Au1[ac]:=0: Au1[ad]:=0:
Au1[ba]:=Au1[bb]+B*mu[b1]: Au1[bc]:=Au1[bb]+B*mu[b2]: Au1[bd]:=Au1[bb]+B*mu[b2]:
Au1[ca]:=Au1[cc]+B*mu[c2]: Au1[cb]:=Au1[cc]+B*mu[c2]: Au1[cd]:=Au1[cc]+B*mu[c3]:
Au1[da]:=0: Au1[db]:=0: Au1[dc]:=0:
#
# Entries of A_c
cola:=1/4*(Au1[aa]+Au1[ba]+Au1[ca]+Au1[da]):
colb:=1/4*(Au1[ab]+Au1[bb]+Au1[cb]+Au1[db]):
colc:=1/4*(Au1[ac]+Au1[bc]+Au1[cc]+Au1[dc]):
cold:=1/4*(Au1[ad]+Au1[bd]+Au1[cd]+Au1[dd]):
Ac1[aa]:=Au1[aa]-cola: Ac1[ab]:=Au1[ab]-colb: Ac1[ac]:=Au1[ac]-colc: Ac1[ad]:=Au1[ad]-cold:
Ac1[ba]:=Au1[ba]-cola: Ac1[bb]:=Au1[bb]-colb: Ac1[bc]:=Au1[bc]-colc: Ac1[bd]:=Au1[bd]-cold:
Ac1[ca]:=Au1[ca]-cola: Ac1[cb]:=Au1[cb]-colb: Ac1[cc]:=Au1[cc]-colc: Ac1[cd]:=Au1[cd]-cold:
Ac1[da]:=Au1[da]-cola: Ac1[db]:=Au1[db]-colb: Ac1[dc]:=Au1[dc]-colc: Ac1[dd]:=Au1[dd]-cold:
#
# Matrix A_c
A[c,1]:=Matrix([[Ac1[aa], Ac1[ab], Ac1[ac], Ac1[ad]],
                [Ac1[ba], Ac1[bb], Ac1[bc], Ac1[bd]],
                [Ac1[ca], Ac1[cb], Ac1[cc], Ac1[cd]],
                [Ac1[da], Ac1[db], Ac1[dc], Ac1[dd]]]):
#
# Characteristic polynomial
# We know 0 is an eigenvalue, so we divide by x
with(LinearAlgebra):
charpol1:=simplify(CharacteristicPolynomial(A[c,1],x)/x):
#
# Coefficients and the expression c_2 c_1 - c_0.
# For stability, we need -c_2 > 0, c_1 > 0, -c_0 > 0,
# and -expr > 0.
# (The coefficients are called c[i,1], and the expression
# is called expr1, corresponding to case 1.)
#
c[2,1]:=-coeff(charpol1, x^2):
c[1,1]:=coeff(charpol1, x):
c[0,1]:=-coeff(charpol1, x, 0):
expr1:=expand(c[2,1]*c[1,1]-c[0,1]):
#
# c[i,1] and expr1 are polynomials in mu's and beta's.
# We will compute the sign of the minimal coefficient
# of -c_2, c_1, -c_0, and -expr as polynomials
# of mu's and beta's.
# Observing that these are all positive (+1), we see
# that the expressions are positive whenever mu's and
# beta's are positive (which they are).
# Thus, the matrix is stable.
#
signs1:=[sign(min(coeffs(-c[2,1]))), sign(min(coeffs(c[1,1]))),
         sign(min(coeffs(-c[0,1]))), sign(min(coeffs(-expr1)))];
#
# signs1 := [1, 1, 1, 1]

# Case 2:
# b--1
#     \
#      a--2--c
#     /
# d--3
# We compute the entries of A_c by first computing
# the entries of A_u (or, rather, of B*A_u as we
# explained above):
# Entries of A_u:
# Diagonal entries
#
B:=beta[1]+beta[2]+beta[3]:
Au2[aa]:=-mu[a1]*beta[1]-mu[a2]*beta[2]-mu[a3]*beta[3]:
Au2[bb]:=-mu[b1]*B:
Au2[cc]:=-mu[c2]*B:
Au2[dd]:=-mu[d3]*B:
#
# Off-diagonal entries
Au2[ab]:=Au2[aa]+B*mu[a1]: Au2[ac]:=Au2[aa]+B*mu[a2]: Au2[ad]:=Au2[aa]+B*mu[a3]:
Au2[ba]:=0: Au2[bc]:=0: Au2[bd]:=0:
Au2[ca]:=0: Au2[cb]:=0: Au2[cd]:=0:
Au2[da]:=0: Au2[db]:=0: Au2[dc]:=0:
#
# Entries of A_c
cola:=1/4*(Au2[aa]+Au2[ba]+Au2[ca]+Au2[da]):
colb:=1/4*(Au2[ab]+Au2[bb]+Au2[cb]+Au2[db]):
colc:=1/4*(Au2[ac]+Au2[bc]+Au2[cc]+Au2[dc]):
cold:=1/4*(Au2[ad]+Au2[bd]+Au2[cd]+Au2[dd]):
Ac2[aa]:=Au2[aa]-cola: Ac2[ab]:=Au2[ab]-colb: Ac2[ac]:=Au2[ac]-colc: Ac2[ad]:=Au2[ad]-cold:
Ac2[ba]:=Au2[ba]-cola: Ac2[bb]:=Au2[bb]-colb: Ac2[bc]:=Au2[bc]-colc: Ac2[bd]:=Au2[bd]-cold:
Ac2[ca]:=Au2[ca]-cola: Ac2[cb]:=Au2[cb]-colb: Ac2[cc]:=Au2[cc]-colc: Ac2[cd]:=Au2[cd]-cold:
Ac2[da]:=Au2[da]-cola: Ac2[db]:=Au2[db]-colb: Ac2[dc]:=Au2[dc]-colc: Ac2[dd]:=Au2[dd]-cold:
#
# Matrix A_c
A[c,2]:=Matrix([[Ac2[aa], Ac2[ab], Ac2[ac], Ac2[ad]],
                [Ac2[ba], Ac2[bb], Ac2[bc], Ac2[bd]],
                [Ac2[ca], Ac2[cb], Ac2[cc], Ac2[cd]],
                [Ac2[da], Ac2[db], Ac2[dc], Ac2[dd]]]):
#
# Characteristic polynomial
# We know 0 is an eigenvalue, so we divide by x
charpol2:=simplify(CharacteristicPolynomial(A[c,2],x)/x):
#
# Coefficients and the expression c_2 c_1 - c_0.
# For stability, we need -c_2 > 0, c_1 > 0, -c_0 > 0,
# and -expr > 0.
# (The coefficients are called c[i,2], and the expression
# is called expr2, corresponding to case 2.)
#
c[2,2]:=-coeff(charpol2, x^2):
c[1,2]:=coeff(charpol2, x):
c[0,2]:=-coeff(charpol2, x, 0):
expr2:=expand(c[2,2]*c[1,2]-c[0,2]):
#
# c[i,2] and expr2 are polynomials in mu's and beta's.
# We will compute the sign of the minimal coefficient
# of -c_2, c_1, -c_0, and -expr as polynomials
# of mu's and beta's.
# Observing that these are all positive (+1), we see
# that the expressions are positive whenever mu's and
# beta's are positive (which they are).
# Thus, the matrix is stable.
#
signs2:=[sign(min(coeffs(-c[2,2]))), sign(min(coeffs(c[1,2]))),
         sign(min(coeffs(-c[0,2]))), sign(min(coeffs(-expr2)))];
# signs2 := [1, 1, 1, 1]

4. Vertices of the level set of the Lyapunov function in §3.8

Here we describe the polyhedron $P \equiv \{X : L(X) = 1\}$ constructed in §3.8 in the course of proving stability of the limit order book with 5 equally sized bins restricted to the interval $[\frac{1}{5} + \epsilon, \frac{4}{5} - \epsilon]$.
Identifying $(x, y, z) \equiv (X(2), X(3), X(4))$, the (filled-in) polyhedron $\tilde{P} \equiv \{X : L(X) \le 1\}$ consists of the points satisfying the inequalities
$$|x| + |y| + |z| \le 1, \qquad x - \tfrac{4}{5}y - \tfrac{9}{5}z \le 1,$$
$$\tfrac{4}{3}x + y + \tfrac{2}{3}z \le 1, \qquad -2x - 3y - 4z \le 1$$
and one of the four orthant constraints

x ≥ 0, y ≥ 0, z ≥ 0   (+ + +)
x ≥ 0, y ≥ 0, z ≤ 0   (+ + −)
x ≥ 0, y ≤ 0, z ≤ 0   (+ − −)
x ≤ 0, y ≤ 0, z ≤ 0   (− − −)

The polyhedron P has fifteen vertices (numbered here in the order listed, for reference in the list of faces):

 1. {0, 0, 0}
 2. {0, 1, 0}
 3. {0, 0, 1}
 4. {1/2, 0, 1/2}
 5. {45/58, 2/29, −9/58}
 6. {6/7, −1/7, 0}
 7. {29/34, −2/17, −1/34}
 8. {3/4, 0, 0}
 9. {11/50, 6/25, −27/50}
10. {0, 3/7, −4/7}
11. {11/26, −6/13, −3/26}
12. {2/5, −3/5, 0}
13. {0, −1/3, 0}
14. {−1/2, 0, 0}
15. {0, 0, −1/4}

and ten faces (defined as ordered sets of vertices, possibly with varying orientations; some of the faces will be non-convex)

{4, 3, 2}, {5, 2, 10, 9}, {7, 6, 12, 11}, {1, 3, 4, 8}, {1, 8, 6, 12, 14}, {1, 3, 2, 10, 15}, {1, 15, 14}, {7, 5, 9, 11}, {2, 4, 8, 6, 7, 5}, {9, 10, 15, 14, 12, 11}

The last three faces are the red, green, and blue face in Figure 3.1, which we reproduce below.

[Figure 3.1, reproduced: three-dimensional plot of the polyhedron, with axes X(2), X(3), X(4).]
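The vertex list can be sanity-checked against the inequalities with exact rational arithmetic. The following Python sketch is an illustration only, not part of the thesis computations. It assumes that the last inequality reads $-2x - 3y - 4z \le 1$ and that vertices are taken in the order listed. It tests necessary conditions only: each listed point satisfies all four inequalities and lies in one of the four allowed orthants, and each point other than the origin attains equality in at least one inequality. It does not verify that the list of vertices is complete, nor does it check the faces. The names `VERTICES`, `left_hand_sides` and `check` are hypothetical.

```python
from fractions import Fraction as F

# Vertices in the order listed in the text.
VERTICES = [
    (F(0), F(0), F(0)), (F(0), F(1), F(0)), (F(0), F(0), F(1)),
    (F(1, 2), F(0), F(1, 2)), (F(45, 58), F(2, 29), F(-9, 58)),
    (F(6, 7), F(-1, 7), F(0)), (F(29, 34), F(-2, 17), F(-1, 34)),
    (F(3, 4), F(0), F(0)), (F(11, 50), F(6, 25), F(-27, 50)),
    (F(0), F(3, 7), F(-4, 7)), (F(11, 26), F(-6, 13), F(-3, 26)),
    (F(2, 5), F(-3, 5), F(0)), (F(0), F(-1, 3), F(0)),
    (F(-1, 2), F(0), F(0)), (F(0), F(0), F(-1, 4)),
]

# Sign patterns (+ + +), (+ + -), (+ - -), (- - -) of the orthant constraints.
ORTHANTS = [(1, 1, 1), (1, 1, -1), (1, -1, -1), (-1, -1, -1)]


def left_hand_sides(v):
    """Left-hand sides of the four inequalities, each required to be <= 1."""
    x, y, z = v
    return [
        abs(x) + abs(y) + abs(z),
        x - F(4, 5) * y - F(9, 5) * z,
        F(4, 3) * x + y + F(2, 3) * z,
        -2 * x - 3 * y - 4 * z,  # assumed reading of the last inequality
    ]


def in_allowed_orthant(v):
    """True if v satisfies at least one of the four orthant constraints."""
    return any(all(s * c >= 0 for s, c in zip(signs, v)) for signs in ORTHANTS)


def check(v):
    """Return (feasible, tight) for a candidate vertex v."""
    lhs = left_hand_sides(v)
    feasible = all(value <= 1 for value in lhs) and in_allowed_orthant(v)
    tight = any(value == 1 for value in lhs)
    return feasible, tight


results = [check(v) for v in VERTICES]

if __name__ == "__main__":
    for index, (v, (feasible, tight)) in enumerate(zip(VERTICES, results), start=1):
        print(index, [str(c) for c in v], "feasible" if feasible else "INFEASIBLE",
              "tight" if tight else "interior to all four inequalities")
```

Using `fractions.Fraction` rather than floating point keeps the equality tests exact, which matters here because the vertices lie exactly on the bounding planes.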