Stochastic Hub and Spoke Networks Edward Eric Hult Homerton College Judge Business School University of Cambridge A thesis submitted for the degree of Doctor of Philosophy 2011 I would like to dedicate this thesis to my loving wife who has been by my side through all my ups and downs, who has supported me and helped me, during the entire course of my Ph.D., which has been both a very emotional and intellectually challenging journey. Without her, things would not be the same. i Declaration of Originality I declare, as the author of this thesis, the work and research presented here is, to the best of my knowledge and belief, original except where due acknowledgement, in accordance with the standard referencing practices, is made in the text of the thesis. I certify that the work has not been previously submitted for a degree or diploma at any other institution. I declare that no ideas, techniques, or any other material from the work of other people is included in my thesis where full acknowledgement is not made with the exception of the work contributed by my two supervisors Daniel Ralph and Houyuan Jiang in Chapter 2. Chapter 2 is a co-authored paper, where I am the first author, together with my two supervisors. Apart from standard supervision input, both Daniel Ralph and Houyuan Jiang have contributed with the write up of Chapter 2 and in addition Houyuan Jiang has assisted in producing the majority of the C++ code referred to in this chapter. Edward E. Hult, 18 May 2011 ii Acknowledgement It is difficult to overstate my gratitude to my first and second Ph.D. supervisors, Prof. Daniel Ralph and Dr. Houyuan Jiang, for all their invaluable help. I have learned a lot from them during my time at Cambridge and I would like to thank them for always challenging me and questioning me, teaching me to always make sure I could fully motivate and understand any new ideas I brought to the table. I would like to thank them for all the great, lively, and very useful meetings we have had over the last few years and for all their useful tips and tricks for attacking and overcoming any problems which have been encountered along the way. Most importantly I would like to thank them for all their time they have given me and all their patience they have shown me, during the course of my Ph.D. degree. I would also like to thank my sister in law, Britt Hult, for her time, help and useful comments, proof reading this thesis. Additionally, I would like to thank my friends and family who have been there for me, especially my parents who have always inspired me and motivated me to always stay the course, no matter how hard things get. iii Abstract Transportation systems such as mail, freight, passenger and even telecommunication systems most often employ a hub and spoke network structure since correctly designed they give a strong balance between high service quality and low costs resulting in an economically competitive operation. In addition, consumers are increasingly demanding fast and reliable transportation services, with services such as next day deliveries and fast business and pleasure trips becoming highly sought after. This makes finding an efficient design of a hub and spoke network of the utmost importance for any competing transportation company. However real life situations are complicated, dynamic and often require responses to many different fixed and random events. Therefore modeling the question of what is an optimal hub and spoke network structure and finding an optimal solution is very difficult. Due to this, many researchers and practitioners alike make several assumptions and simplifications on the behavior of such systems to allow mathematical models to be formulated and solved optimally or near optimally within a practical timeframe. Some assumptions and simplifications can however result in practically poor network design solutions being found. This thesis contributes to the research of hub and spoke networks by introducing new stochastic models and fast solution algorithms to help bridge the gap between theoretical solutions and designs that are useful in practice. Three main contributions are made in the thesis. First, in Chapter 2, a new formulation and solution algorithms are proposed to find exact solutions to a stochastic p-hub center problem. The stochastic p-hub center problem is about finding a network structure, where travel times on links are stochastic, which minimizes the longest path in the network to give fast delivery guarantees which will hold for some given probability. Second, in Chapter 3, the stochastic p-hub center problem is looked at using a new methodological approach which gives more realistic solutions to the network structures when applied to real life situations. In addition a new service model is proposed where volume of flow is also accounted for when considering the stochastic nature of travel times on links. Third, in Chapter 4, stochastic volume is considered to account for capacity constraints at hubs and, de facto, reduce the costs embedded in excessive hub volumes. Numerical experiments and results are conducted and reported for all models in all chapters which demonstrate the efficiency of the new proposed approaches. iv Contents 1 Introduction 1 1.1 The Hub and Spoke Network Design . . . . . . . . . . . . . . . . . . . . 1 1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.3 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 2 Paper One 10 2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.1.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 2.2 The Stochastic Hub Center Problem and a Reformulation . . . . . . . . . 15 2.3 Solution Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 2.3.1 SpHCP-Pull, preprocessing . . . . . . . . . . . . . . . . . . . . . . 19 2.3.2 SpHCP-Push, separation algorithm . . . . . . . . . . . . . . . . . 20 2.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3 Paper Two 31 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 3.1.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 3.2 New Methodological Approach to a Stochastic p-hub Center Problem . . 38 v 3.2.1 Variable departure times . . . . . . . . . . . . . . . . . . . . . . . 39 3.2.2 Preset departure times . . . . . . . . . . . . . . . . . . . . . . . . 46 3.2.3 Numerical results . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 3.3 The Service Cost Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 56 3.3.1 Stochastic formulation . . . . . . . . . . . . . . . . . . . . . . . . 57 3.3.2 Reformulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 3.3.3 Numerical results . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 3.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 4 Paper Three 70 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.1.1 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . 73 4.2 The Stochastic Capacitated Single Allocation Hub Location Problem . . 76 4.3 Sample Average Approximation . . . . . . . . . . . . . . . . . . . . . . . 80 4.3.1 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 4.4 Solution Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 4.5 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86 4.5.1 Main run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 4.5.2 Effects of using different sample sizes . . . . . . . . . . . . . . . . 89 4.5.3 Stochastic vs. deterministic solutions . . . . . . . . . . . . . . . . 91 4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94 5 Conclusion 95 5.1 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 vi List of Figures 1.1 a) fully connected network b) single allocation hub and spoke design c) multiple allocation hub and spoke design. . . . . . . . . . . . . . . . . . . . . . . . . 2 2.1 Performance of CPU times for SpHCPL, SpHCPL-Pull, and SpHCPL-Push: The top panel for the CAB dataset and the bottom panel for the AP dataset. 27 2.2 Sensitivity results of the CPU times in coefficient of variation ν and service level parameter γ for test example AP 25.2. . . . . . . . . . . . . . . . . . . 28 3.1 Origin-destination path via hub nodes. . . . . . . . . . . . . . . . . . . . . . 32 3.2 Departure times for a deterministic model. . . . . . . . . . . . . . . . . . . . 34 3.3 Example network. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.4 Critical section of a network. . . . . . . . . . . . . . . . . . . . . . . . . . . 67 3.5 Decreasing values. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 4.1 Origin - destination path. . . . . . . . . . . . . . . . . . . . . . . . . . . . 71 4.2 Cumulative distribution graph for total volume flowing into a hub, showing the VaRγ , CVaRγ and the hub capacity, Γ. . . . . . . . . . . . . . . . . . . . . 79 4.3 Number of times an optimal solution was found from 100 different runs for each sample size S = {100, 200, 400, 600, 800}. . . . . . . . . . . . . . . . . . 89 4.4 Average solution times over ten different runs for each sample size S = {100, 200, 300, 400, 500, 600, 700, 800, 900, 1000}. . . . . . . . . . . . . . . . . 90 vii List of Tables 2.1 Numerical results for SpHCPL, SpHCPL-Pull, and SpHCPL-Push, CAB examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 2.2 Numerical results for SpHCPL, SpHCPL-Pull, and SpHCPL-Push, AP examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 3.1 Departure times from each ghost node kθ, θ = {1, ..., 4}, to nodes m, m = {Paris, Rome, New York}. . . . . . . . . . . . . . . . . . . . . . . . 49 3.2 Numerical results for SpHCPDVL , CAB examples. . . . . . . . . . . . . . 52 3.3 Numerical results for SpHCPDVL , AP examples. . . . . . . . . . . . . . . 53 3.4 Test results for SpHCPDVL using different step sizes. . . . . . . . . . . . . 54 3.5 Numerical results for SpHCPDFL , CAB examples. . . . . . . . . . . . . . 55 3.6 Numerical results for SpHCPDFL AP, examples. . . . . . . . . . . . . . . . 56 3.7 Numerical results for SCP CAB, examples. . . . . . . . . . . . . . . . . . 63 3.8 Numerical results for SCP AP, examples. . . . . . . . . . . . . . . . . . . 64 4.1 Numerical results for AP examples 10LL - 25TT, S = 400. . . . . . . . . 88 4.2 Numerical results for AP examples 40LL - 50TT, S = 200. . . . . . . . . 89 4.3 Expected capacity breaches at different hubs for AP 20 examples. . . . . 92 4.4 Cost and capacity breach differences for AP 20LT and 20TT examples. . 93 viii Chapter 1 Introduction Different network systems are found everywhere in our world, from DNA strands which make up life to the early Roman water management and road network systems to modern day transportation systems and social networks such as Facebook and Twitter. The topic of this PhD thesis is an in-depth research study on a specific type of network system known as hub and spoke networks. Specifically the design phase of such systems is studied considering optimization tools and solution procedures. Special emphasis is made on the stochastic properties of such networks and how they can be incorporated into solvable mathematical programming models. 1.1 The Hub and Spoke Network Design Hub and spoke networks are used and found in many different areas, such as shipping and postal mail systems, air and rail transportation, and telecommunication networks [3, 13]. Given multiple origin nodes which each need to be connected to multiple destination nodes it becomes economically and practically infeasible to connect every origin and destination node (OD pair) with a direct single link connection. Instead a hub and spoke network aggregates multiple origin flows at a single hub node where the high volume aggregated flow travels from one hub to another via a hub-hub link as shown in figure 1.1, all hub nodes are assumed to be fully connected. Arriving at the second hub node the flow then gets split up and sorted for shipment to each of its respective destination nodes. Travel or shipment along a hub-hub link is often quicker and less expensive per unit flow due to the aggregation of flows. Examples include larger jets 1 for an airline passenger company — fitting more passengers into one airplane — or the shipment via trains or airplanes compared to trucks on a node-hub or hub-node link in a postal system. Figure 1.1: a) fully connected network b) single allocation hub and spoke design c) multiple allocation hub and spoke design. Not all networks require both hub location and node allocations to be found but instead aim to find an optimal node allocation based on an existing set of hub nodes, known in short as an allocation problem. The allocation problem is of importance to companies who have spent a large amount of time and money building and setting up hub nodes at specific points in the network but where origin and demand flows have changed over time and need to be reallocated to keep a similar standard of service quality and profit margin. In general, allocation problems are easier to solve for than the full location problems, due to the known and fixed hub set. In both cases the designer can specify if each origin and destination node should be connected to only one, single allocation (figure 1.1 b)), or several, multiple allocation (figure 1.1 c)), hub nodes. All our models employ a single allocation strategy. The most popular hub and spoke optimization objectives found in the literature are the median, center and covering problems. The objective of the median problem, which has been most studied in the literature, is to minimize the total cost of flow across a network under a given set of constraints [49]. A variation of the median problem is the hub location problem which has an additional term added to the objective which accounts for the fixed costs of setting up hub nodes [50]. The center problem is a minimax type problem where the objective is to either minimize the maximum cost for any path or single link in the network [10]. And lastly, the covering problem minimizes the number of hubs needed to fulfil some criteria such as the cost of each link or path to not exceed some given value [10]. The design aspect of hub and spoke networks is extremely important. For example, one European postal company [1] has approximately 20 million packages and letters sent out each day, with about 50% of them requiring less than 24 hours delivery. The importance of having a well structured network is critical. It directly impacts the quality of service 2 provided where design decisions affect economic viability through cost and reliability of deliveries. The choice of optimal hub locations and/or node allocations is however not an easy task. The economy of scale on hub-hub links make hub and node allocation choices far from obvious with the addition of accounting for many other factors which are not always clear or easy to capture. Each system has its own unique characteristics and often has multiple dynamic and unpredictable or random elements to it which need to be considered. On the other hand, good hub and spoke network designs can still be found by simplifying some appropriate network characteristics. However, as can be seen throughout the hub and spoke literature, as we will demonstrate in Sections 1.3, 2.1.1, 3.1.1 and 4.1.1 by citing some of the academic literature, the computational difficulty of solving even simple hub and spoke design problems has led to a somewhat narrow research output. This research pool can be divided up into two different focus areas. The first aims to create new or expand on existing models in an attempt to create more realistic representations of real life systems. The second focuses on reformulating and/or constructing new solution methods to existing models to improve upon the computa- tional performance allowing larger sized networks to be solved. These areas of research are of importance to hub and spoke design since they both, in their own right, contribute to bridging the gap between small solvable theoretical models and large computationally infeasible real life systems. This thesis contributes to both of these research areas by working in the realm of stochastic optimization. Due to the complexity of solving for optimal hub and spoke network designs, the vast majority of the literature simplifies all stochastic events which take place in a real life system by assuming average values [6, 9, 10, 12, 16, 16, 17, 18, 20, 22, 23, 24, 25, 27, 28, 36, 37, 38, 40, 41, 45, 47, 48, 50, 53, 49, 52, 56, 60, 61, 66, 67, 68, 70, 71, 72, 74, 80], [alternatively see the two literature review papers [3, 13] which have over 100 references each of papers considering deterministic hub and spoke network designs]. Even though such simplification may be sensible in some cases, there are many situations where such simplifications can result in poor designs due to the flaw of averages [62]. The flaw of averages simply states: “plans based on the assumption that average conditions will occur are usually wrong” [63]. In optimal decision making, such as in hub and spoke network design, Rosenhead et al. [59] divided decision making into three categories — certainty, risk and uncertainty. All deterministic situations are classified as certainty, i.e. everything is known and no randomness exists. For risk and uncertainty situations, random events occur, which is the norm in everyday life. For risk, the random events are governed by probability distributions which are known. For uncertainty, no properties 3 of the random events are available. The facility location literature, among others, has picked up on the importance of incorporating stochastic optimization and has produced a vast pool of research papers covering the different decision making environments [69]. For hub and spoke network design models, risk and uncertainty research is very limited but is nonetheless very important to consider, see for example the Appendix of Chapter 3: “Example A: Importance of considering stochastic link travel times”. The three papers presented in this thesis all deal within the risk environment. Risk modeling has been around for some time and has played a vital role in financial modeling. In 1952 Markowitz [46] introduced modern portfolio theory by studying the trade off between risk and expected return. The models included either fixing risk and maximizing the expected return or fixing the expected return and minimizing risk. By changing the fixed parameters, one can create an efficient frontier and then choose the optimal portfolio depending on one’s risk preference. The random events in these models follow a Gaussian distribution and the risk is defined as the variance. Another, and probably the most common, risk measure in finance is value-at-risk (VaR) which first became widely used in the early 1990s following the stock market crash of 1987, also known as Black Monday [35]. The idea behind VaR is to minimize the value, φ, which a loss should not exceed for a given probability level γ. In general this differs from a mean-variance model, where one either minimizes the risk or maximizes the expected return, and instead finds the setup which gives the smallest loss for a given confidence interval. Although VaR gives a minimum value φ which should not be exceeded for some probability, it does not account for how bad things might get when φ is exceeded. To be able to account for this, Rockafellar and Uryasev [57] introduced conditional value-at- risk (CVaR). CVaR states that for a lower value φ, such that the cost does not exceed φ with a probability γ, then CVaR is the expected value when φ is exceeded. For example, if 95% of the time it takes a postal company less than 24 hours to deliver a package from A to B, then CVaR calculates the expected time when the delivery exceeds 24 hours. Since CVaR is the expected value when VaR is breached, we automatically get a small VaR value as well when we minimize CVaR. On the other end, if whatever happens when some value is breached does not incur extra costs based on the size of the breach, but instead results in some known fixed cost, one can instead minimize the probability of exceeding such a value by formulating a minimum probability model. This list is far from comprehensive but highlights some key risk measures which can be used for hub and spoke design problems. 4 1.2 Contributions We present three different stand alone papers which make up Chapters 2, 3 and 4 of this thesis. Our first paper is a computational contribution applied on a stochastic uncapacitated single allocation p-hub center problem proposed in Sim et al. [65]. The problem considers the stochastic nature of travel times on links and the authors construct heuristic methods to find approximate solutions to networks of up to 40 nodes in size. We contribute to this problem by finding exact solutions to 50 node networks. We first propose a new and more efficient formulation to the problem in [65] which allows exact solutions to be found for networks up to 25 nodes in size. We then preprocess the model, reducing its size, which further allows us to solve 40 node networks exactly. Lastly we propose a separation algorithm which effectively solves 50 node networks to optimality. Our second paper, Chapter 3, further extends the methodology proposed in [65] and used in Chapter 2. We show that specifically accounting for departure times at hub nodes in the model can lead to better and more practical network designs. Two different models are proposed where one allows optimal departure times to be solved for while the other assumes a fixed list of departure times are available for each potential hub to hub link. Solutions are found by reformulating the models into an approximate linear model, using a step function, and an equivalent linear model, respectively, by assuming independent normal distributions on the link travel times. Further we prove that the optimal location- allocation solutions to the approximation model are a subset of the optimal solutions to the original stochastic model when the step sizes are small enough. Computational tests are also presented which demonstrate the effectiveness of the solution formulations. In addition we also propose a new model in the second paper for a hub allocation problem. The objective of this model is to minimize the probability of all paths of over exceeding some guaranteed delivery time. There is a cost function associated with each path which accounts for the volume along the paths. The purpose of the model is to maximize the percent of total packages which can be delivered on time as opposed to paths which deliver within some probability on time. Three solution approaches are also presented and tested for this new model where fixed departure times are assumed. In our third paper, Chapter 4, we consider a capacitated single allocation hub location problem with stochastic demand. Most real life systems have some type of capacity constraints on the amount of incoming flow a hub can handle. Considering the daily fluctuations of flow volume is important. Knowing that the average traffic flow falls 5 below the capacity of a hub tells us nothing about the likelihood of the flow volume exceeding capacity on a given day or the likely size of such a breach. When considering average flow volumes, capacity breaches will occur as often as every other flow cycle which can cause huge unwanted costs and poor service quality. We set up a chance constraint model to guarantee that capacity constraints at hub nodes are not breached for some probability. We develop an approximation solution based on a sample average approximation method, using the conditional value at risk, which gives a safe volume of flow into each hub. We are able to solve test problems of up to 50 nodes in size and also show that, for a large enough sample size, the optimal solutions for the approximation model are a subset of the optimal solutions for the stochastic model. Additionally, we show the benefits of accounting for stochastic demand by comparing simulation tests carried out on both the optimal network designs found by a deterministic capacitated single allocation hub location model and the designs found by our proposed model. 1.3 Literature Review Deterministic modeling work began with the first mathematical model being formulated in O’Kelly [49], which was named the single allocation p-hub median problem. A few years later O’Kelly [50] introduced the single allocation hub location problem, which is similar to the p-hub median problem, but differing by accounting for some given fixed costs of setting up the hubs at specific nodes. This additional cost was added to the objective function, resulting in a solution suggesting the number of hubs to set up instead of using the predetermined p number of hubs. Campbell [9] was the first to introduce the use of multiple allocations to the p-hub median problem and a couple of years later Campbell [10] proposed two additional sets of problems, namely, the p-hub center problem and the hub covering problem. The median, including the hub location problem, the center, and the covering problems have become benchmark problems in the literature. All of them have been both improved upon computationally, via reformulation or exact or heuristic algorithm developments, or extended upon to capture more real life system properties, by several different authors. For example [37, 24] both individually reformulate the single allocation p-hub center problem presented in [10] where each formulation computationally outperforms the prior. [54] presents a heuristic method and [48] presents an efficient exact solution method, which they call the two-phase algorithm, also for the center problem. The median problem has received the most attention in the 6 literature where some sort of computational improvements or extensions can be found in for example [6, 40, 41, 10, 66, 53, 12, 25, 52, 67, 68, 70, 27, 56, 71, 72, 22, 23, 20, 16, 18, 16, 60]. Further literature review discussions can be found in Sections 2.1.1, 3.1.1 and 4.1.1. We also refer the interested reader to the two literature review papers [3, 13]. We now move on to talk about the work done on stochastic hub and spoke problems as this is where the contributions of this thesis are found. As mentioned earlier, there are but a few papers which consider risk or uncertainty models for hub and spoke networks and there is a vital need for more research to be conducted in this area. Sim et al. [65] model a stochastic single allocation p-hub center problem with stochastic travel times on each link. This model has the added benefit of not only minimizing the maximum travel time, as in the deterministic case, but also requiring all path travel times to be less than or equal to the maximum time for at least some probability γ. This gives a more reliable network design compared to a deterministic p-hub center model, where the longest paths will be delayed half of the time. [65] formulates a stochastic mathematical program which is then reformulated into an equivalent mixed integer linear program by assuming all links to be independent and normally distributed. However, they do not manage to find any exact solutions to the problem and instead focus their attention on developing several heuristic methods. They are able to approximately solve networks up to 40 nodes in size. There are three additional papers which all consider a variation of the hub location problem. Marianov and Serra [44] consider an airline passenger system and model a multiple allocation hub location problem with chance constraints on the capacity of the queue length of arriving airplanes. They develop an M/D/c queuing model for the number of airplanes waiting to land at a hub airport. This is constructed considering Poisson arrivals of airplanes. A deterministic service time (landing time, plus time at the gate, plus take off time) and c servers (number of runways at the airport) are assumed. They calculate the steady-state probabilities, assuming the average arrival rates are constant during peak hours, and later use this to define the capacity constraints. The capacitated hub location problem is then formulated with a constraint on the average number of airplanes arriving to some hub to be less than or equal to the pre-calculated maximum arrival rate. Their model requires a large number of variables and constraints and thus they solve the associated deterministic problem by developing a heuristic based on tabu search to get approximate solutions for networks up to 30 nodes in size. Yang [81] looks at a stochastic air freight uncapacitated multiple allocation hub location 7 problem (UMAHLP) where seasonal variations on demand, as well as seasonal variations on the discount factors for hub to hub flights, are accounted for. The model is separated into two stages. The first stage determines the number and location of hubs, taking into account the solution found from the second stage flight routing problem, where stochasticity of the seasonal demands and discount factors appear. They rewrite the two stage model as an extensive form, combining both stages into the same problem to allow exact solutions to be found. All scenarios from the second stage are accounted for in the extensive form. They assume a discrete probability distribution, involving three possible scenarios representing average values for each of the three seasonal variations. The paper includes and solves a case study of a 10 node air freight network in Taiwan and China. Contreras et al. [15] also study a stochastic uncapacitated multiple allocation hub lo- cation problem. They look at three different cases. First they consider a stochastic UMAHLP where demand is stochastic. They show that in such circumstances the stochastic model is equivalent to the deterministic problem where demand is consid- ered on an average basis, i.e. there is no flaw of averages. Second they consider a different stochastic UMAHLP where transportation costs on all paths are stochastic, but assume the costs are all dependent on a single random parameter. They also show here, as in the first case, that the stochastic model is equivalent to the deterministic problem, where costs are based on average values. In the third case they similarly ac- count for stochastic transportation costs on all paths, but now assume all path costs are independent and described by some known probability distributions. They construct a Monte-Carlo simulation, using a sample average approximation method coupled with a Benders’ decomposition algorithm, as a solution approach for the later problem. They are able to approximately solve the third case model for networks of up to 50 nodes in size. There are a lot of existing gaps, both computational and methodological, in the stochas- tic hub and spoke literature. For example, no existing models consider what percent of packages actually arrive on time, or account for daily flow volume variations from different origin nodes and the problems this may cause when capacity constraints exist. Filling some of these major gaps, like the two just mentioned and more, is the main contribution of this thesis, to further develop stochastic hub and spoke optimization models and methods. More research is still however needed and is discussed further in the conclusion, Chapter 5. 8 Although the location and allocation problem is the topic of this thesis, it is still worth quickly mentioning some of the other design aspects considered in a working hub and spoke system. Transportation networks may optimize the unloading, sorting and loading operations at a hub node. This is of special importance for sea freight terminals where large volumes of containers are gathered at each port [73]. For truck to truck or truck to train or airplane (or vice versa) transfers, the cross docking problem is an important design strategy. The purpose with a cross docking terminal is to unload, sort and load cargo from different sources onto waiting outbound vehicles. Well planned assignment of trucks to docks is the key aspect to achieving less costly and fast transfers. Incoming trucks can be assigned directly to a docking station or asked to wait in a queuing area until an appropriate docking station is freed up. The queuing system is often unavoidable, due to trucks generally outnumbering the number of docking stations and the arrival of a high number of trucks at the same time during peak periods. Due to the growing number of shipments on a global scale, a common scenario nowadays is the large queue times of waiting trucks. Many cross docking models are set up slightly differently with different assumptions, due to the fact that there are very many different types of docking stations, for example in [82] there are 32 models proposed. However, papers such as [8, 51] propose a more standard base case model. Similarly, work is also done for flight gate assignment problems where one of the more common objectives is to minimize the walking distance of connecting passengers (or sometimes goods) from one flight to the next [21]. In addition, some hub and spoke networks also need to consider the vehicle routing problem. The vehicle routing problem optimizes the delivery and pick up of, for example, mail and cargo from the local buildings which are associated with each of the individual origin and destination nodes. The vehicle routing problem can also be found in many other applications outside of hub and spoke networks [31]. 9 Chapter 2 Paper One Exact Computational Approaches to a Stochastic Uncapacitated Single Allocation p-Hub Center Problem Edward Hult, Houyuan Jiang, Daniel Ralph Judge Business School, University of Cambridge, Trumpington Street, Cambridge CB2 1AG, United Kingdom, {e.hult,h.jiang,d.ralph@jbs.cam.ac.uk} Abstract The stochastic uncapacitated single allocation p-hub center problem is an extension of the deterministic version which aims to minimize the longest origin-destination path in a hub and spoke network. Considering the stochastic nature of travel times on links is important when designing a network to guarantee the quality of service measured by a maximum delivery time for a proportion of all deliveries. We propose an efficient refor- mulation for a stochastic p-hub center problem and develop exact solution approaches based on preprocessing and a separation algorithm. We report numerical results to show effectiveness of our new reformulations and approaches by finding global solutions of small-medium sized problems. 10 Keywords: Hub location; Center problem; Stochastic programming; Separation algo- rithm. 2.1 Introduction Transportation of goods and people plays a vital role in the economy. One of the key elements affecting the efficiency of shipping services is the network structure. Hub and spoke networks are often employed to balance cost of building and maintaining services between pairs of nodes and having short routes, in terms of distance or time, between pairs of nodes. The hub and spoke structure has found wide application, in air and rail transportation, postal mail systems, and telecommunication networks [3, 13]. Although traffic flows are patently uncertain, with stochastic travel time on links and stochastic volumes between nodes to name just two sources of uncertainty, the main work in the design of hub and spoke networks has focussed on deterministic hub location models. In this literature stochastic data are replaced by averages. This may be sensible in some cases but can suggest network designs that are far from optimal, on average, in a stochastic environment. This phenomenon has been recognized in the facility location literature [7, 76] and is an instance of a more general notion, the so-called flaw of averages [62] which merely says that knowing the average values of stochastic inputs to a process is not enough to determine its average output. The benefit of deterministic network design is the relative tractability of this problem. The deterministic model underlying our work is the so-called p-center problem which, given a list of N nodes and number p less than N , is to identify p nodes as hubs in order to minimize the maximum travel time across the network. Although this is a combinatorial optimization problem, progress has been impressive. For instance the work of Meyer et al. [48] shows that combining clever integer linear programming reformulations with hybrid algorithms, in a branch and bound framework, can achieve global optimality in reasonable time for networks of up to 400 nodes. We consider a stochastic version of the uncapacitated single allocation p-hub center problem which is the same problem proposed by Sim et al. [65]. The challenge is the computational burden of dealing with the stochastic problem since it is not obvious how 11 to take advantage of the formulations and methods that are efficient for deterministic problems. For instance in [65], exact solution seems to be out of the question and even heuristic methods are at their limits when N = 50. We approach the goal of finding global solutions in reasonable time in three steps. First we formulate a stochastic model with chance constraints, second we develop a compact integer linear program (ILP) reformulation of the stochastic model under certain conditions, and thirdly we adopt and manipulate various methods and techniques such as preprocessing and introduce a separation algorithm to find efficient solution procedures. Beyond exact solutions, exact methods can provide some guarantees on approximation. For example, an exact solution method based on branch & bound gives guaranteed upper and lower bounds during execution, even if you stop before an exact solution is found. However, heuristic methods, even when they identify feasible points, typically are not associated with guarantees on the quality of such points. The quality of the solution identified by a heuristic is typically only estimated by extension of empirical results on problems for which the exact solutions are known. This is not to say heuristics are not useful: they play an important role and are very powerful for solving large complicated life sized problems which exact solution methods fail to solve in reasonable time. However, our main contribution is being able to efficiently solve 40 and 50 node networks exactly. This is computationally similar to, or better than, the performance of the heuristic methods proposed in [65] which solve 40 node networks. Although our stochastic model is similar to that in [65], our reformulation is an ILP using O(N2) binary variables and O(N3) constraints (see Proposition 2.2.1). This formulation is compact relative to the reformulation model of [65] (as can be found in the Appendix) which has O(N4) constraints and O(N4) binary variables. Nevertheless computational efficiency benefits considerably from preprocessing and separation algorithm approaches. Indeed the building blocks of our approach — compact ILP reformulations and hybrids of branch and bound with higher level algorithms — are standard; our contribution is in part the development of a new reformulation but mainly to demonstrate the value of exploring generic ILP tools in the stochastic case. The paper is organised as follows. We review the hub and spoke literature in Section 2.1.1. In Section 2.2 we present the stochastic p-hub center problem (SpHCP) and establish our compact integer linear programming reformulation (SpHCPL). In Section 2.3, we propose pull (preprocessing) and push (separation algorithm) methods for solving the ILP reformulation exactly. In Section 2.4, we report numerical results in which exact 12 solution of the stochastic p-hub problem for up to N = 50 is routine. 2.1.1 Literature review Sim et at. [65] are the first to consider the single allocation p-hub center problem with stochastic travel times on each link. They formulate a stochastic mathematical program and then reformulate the problem into an integer linear program by assuming travel time on links have independent normal distributions. Since exact solutions to this problem are hard to get the authors focus their attention on developing several heuristic methods to approximately solve network sizes up to 40 nodes. In addition to [65] we mention three other hub and spoke papers considering stochas- tic properties. Marianov and Serra [44] consider a hub location problem with chance constraints where the arrival times of incoming aircraft are stochastic. They build an M/D/c queuing model which considers the number of airplanes waiting to land based on an assumed Poisson arrival rate of the aircraft. The result from the M/D/c queuing model is then used as a hard constraint requiring the average number of airplanes, arriv- ing to some hub, to be less than or equal to the found value. They developed a heuristic algorithm based on tabu search and are able to solve networks up to 30 nodes in size. Yang [81] looks at an air freight hub location problem where demand and the associated discount factors on hub-hub links are stochastic. The model is separated into two stages where the first stage determines the number and location of hubs taking into account stochasticity that appears in the second stage flight routing problem. [81] assumes a discrete probability distribution involving only three possible scenarios on demand. The paper includes a case study of a 10 node air freight network in Taiwan and China. Con- treras et al. [15] study the uncapacitated multiple allocation hub location problem and include stochastic properties of both demand and transportation cost. For the rather special case in which a single random parameter affects the transportation cost on all links equally, they show that their stochastic model is equivalent to the deterministic problem in which all random variables are replaced by their average values, i.e., there is no flaw of averages. For independent transportation costs, however, where the uncer- tainty of a link cost is independent of all other links in the network, the same cannot be done. They therefore move on to approximately solve the latter problem for network sizes up to 50 nodes using Monte-Carlo simulation coupled with Benders’ decomposition. Although the literature on stochastic hub and spoke networks is small, there is a rel- 13 atively larger pool of research done on deterministic problems. Hub location became a recognised field of study in the late 1980s. The state-of-the-art for the research in hub location can be found in two review papers [3, 13]. Attention has focussed on two themes: proposing mathematical models such as median, center and covering hub problems to approximate the real-world business challenges, and developing efficient numerical approaches for solving these proposed mathematical models. Computational methods dominate the hub location literature. Unsurprisingly, as the number of nodes in the network grows, exact methods are more prone to computational difficulties — such as excessive time or memory requirements — than are heuristics. Campbell [10] is the first to introduce and study the p-hub center problem in the hub literature. This includes the deterministic version of the problem we study which is motivated by a hub system involving perishable or time sensitive items in which cost refers to time. Campbell [10] defines the uncapacitated single allocation p-hub center problem (USApHCP) as a quadratic programming problem and proposes a linear programming reformulation using O(N4) variables where, as above, N is the number of nodes in the network. The same problem has been studied by several different authors aiming to speed up the computational efficiency. Kara and Tansel [37] present an equivalent re- formulation using O(N2) variables and O(N3) constraints. An additional improvement on the computational performance of the USApHCP was found by Ernst et al. [24] by reformulating it using what they call the radius approach, which requires only O(N2) variables and O(N2) constraints. The radius approach can solve test problems that have up to 200 nodes in the network. Recently a new 2-phase method combining a branch and bound algorithm and the radius approach was introduced by Meyer et al. [48] and can solve test examples that have up to 400 nodes. Jutte et al. [36] give a polyhedral analysis on the hub center problem based on the radius approach. The other two standard problems in the hub literature are the p-hub median prob- lem and the hub covering problem which both follow the same trend with different researchers trying to either improve on the original formulation computationally or to study extensions and/or variations to the problem. The p-hub median problem, which is to minimize the total cost of all flows between all OD pairs over the network, has received more attention in the literature. Transportation costs between hub nodes are usually discounted by a factor to reflect economies of scale. O’Kelly [49] is the first to formulate a hub location problem mathematically. Campbell [9, 10] later produces the 14 first integer linear programming formulations for single and multiple allocation p-hub median problems. Skorin-Kapov et al. [67] reformulate the linear formulation to produce exact solutions in quicker times. Ernst and Krishnamoorthy [25] present a very efficient reformulation requires O(N3) variables and O(N3) constraints. Recent advances for the hub median problems can be found in [16, 60]. The third standard problem is the hub covering problem, first mathematically formulated by Campbell [10]. The motivation is to find a set of hubs so that the cost of opening hubs or the number of hubs to use is minimized subject to delivering a certain service. We refer the interested reader to Alumur and Kara [3] for additional details on hub covering. 2.2 The Stochastic Hub Center Problem and a Re- formulation Here we present the stochastic p-hub center problem (SpHCP for short) after Sim et al. [65] and, in the deterministic case, Campbell [10]. This is a quadratic program which has O(N2) binary variables and O(N4) constraints. In Proposition 2.2.1 we show it has an integer linear program representation of O(N2) binary variables and O(N3) constraints. Consider a network consisting of N nodes, denoted by N and options to link any pair of nodes. The designer wants to select p hubs in N and to assign each node in N to exactly one hub. If k and m are hub nodes and nodes i and j are allocated to k and m, respectively, then the path for delivering the goods from node i to node j is i→ k → m→ j. Assume that Dij represents random travel time between nodes i and j with an average travel time of dij and a standard deviation of σij. Assume α ∈ (0, 1) is the discount factor for travel times over hub arcs (links between hub nodes). The total travel time along the path i→ k → m→ j is Tijkm = Dik + αDkm +Dmj. Let β be the nominal maximum travel times between all OD pairs in the network. Assume γ ∈ [0, 1] is the service level. Then the service level guarantee means that for any nodes i and j, with probability of γ, the travel time for the goods delivered from i to j will not exceed β. For instance if β = 24 hours and γ = 0.95, then the service level guarantee requires that the travel time from i to j will not exceed 24 hours for 95% of journeys. This type of service level guarantee is often employed in call centers and other 15 telecommunication services [30, 32]. Clearly, the designer prefers a smaller value for β and a larger value for γ. However, these two parameters usually act against each other. For example, a large value of γ often results in a relatively large value of β. Define Xik to be a binary variable such that Xik = 1 if and only if k is a hub node and node i is assigned to k. SpHCP is defined by the following nonlinear stochastic program: (SpHCP) min. Xik,β β (2.1) s.t. γ ≤ Pr(β ≥ (Dik + αDkm +Dmj)XikXjm), ∀i, j, k,m ∈ N (2.2)∑ k∈N Xik = 1, ∀i ∈ N (2.3) Xik ≤ Xkk, ∀i, k ∈ N (2.4)∑ k∈N Xkk = p, (2.5) Xik ∈ {0, 1}, ∀i, k ∈ N . (2.6) In the above formulation, the objective is to minimize the nominal maximum travel time between all OD pairs in the network. Constraint (2.3) indicates that each node is allocated to exactly one hub. Constraint (2.4) shows that if node i is allocated to hub k, then node k must be a hub. Constraint (2.5) specifies that exactly p hubs are established. Constraint (2.6) states that Xik is a binary variable. The most complicated is constraint (2.2), which defines the service level: The probability such that the travel time on each OD path is less than or equal to β is greater than or equal to γ. Note that when either Xik = 0 or Xjm = 0, i → k → m → j is not a valid travel path between nodes i and j. In this case, constraint (2.2) still holds because Pr(β ≥ 0) = 1 and this constraint becomes redundant. A few remarks are in order. Firstly, if the travel times on all links are deterministic, then constraint (2.2) is equivalent to the following constraint: β ≥ (dik + αdkm + dmj)XikXjm, and SpHCP is equivalent to the p-hub center problem studied by Campbell [10]. This shows that SpHCP is an extension of the traditional deterministic p-hub center problem. Secondly, SpHCP is equivalent to the stochastic p-hub center problem investigated by Sim et al. [65], where four-index binary variables yijkm is used and yijkm = 1 is implied 16 by Xik = Xjm = 1. It is easy to define and understand SpHCP. Also, up to this point, we have not needed specific assumptions on either the probability distribution of link travel times Dik, or on the relationship between these random travel times for different links. However, this formulation involves both integer variables and the stochastic constraint (2.2). These features make the above formulation computationally intractable. In the sequel, we aim to convert the above formulation into a computationally tractable deterministic integer linear program based on two techniques: linearizing quadratic terms, and replacing probabilistic expressions by equivalent deterministic counterparts. To this end, we make one more assumption which states that, first, Dij is normally distributed and, second, is mutually and stochastically independent of Dk` for any nodes i, j, k, ` with (i, j) 6= (k, `). Based on this assumption, the total travel time Tijkm = Dik+αDkm+Dmj along the path i → k → m → j is also normally distributed with a mean of tijkm = dik + αdkm + dmj and a standard deviation of δijkm = √ (σik)2 + α2(σkm)2 + (σmj)2. The fact that Tijkm has a normal distribution shows that the stochastic constraint (2.2) is equivalent to a deterministic and quadratic constraint: β ≥ (tijkm + zγδijkm)XikXjm, ∀i, j, k,m ∈ N , where zγ is the z-value for a standard normal distribution and satisfies the equation Pr(Z ≤ zγ) = γ and Z is a standard normally distributed random variable. Now we can apply some standard tricks for reformulating a quadratic inequality as a linear inequality. Evidently, the optimal value for β is nonnegative. Because Xik is a binary variable, it is easy to verify that constraints β ≥ (tijkm + zγδijkm)XikXjm for all i, j, k,m ∈ N can be collectively replaced by constraints: β ≥ (tijkm + zγδijkm)(Xik +Xjm − 1), ∀i, j, k,m ∈ N . Furthermore, constraint β ≥ (tijkm + zγδijkm)XikXjm or constraint β ≥ (tijkm + zγδijkm)(Xik +Xjm− 1) is redundant if either Xik = 0 or Xjm = 0. In view of constraint (2.3), for any fixed i, j,m, constraints β ≥ (tijkm + zγδijkm)(Xik +Xjm − 1) for all k are equivalent to a single constraint β ≥ ∑ k∈N (tijkm + zγδijkm)(Xik +Xjm − 1), ∀i, j,m ∈ N . (2.7) 17 By the above two reformulation techniques, we have obtained an equivalent deterministic integer linear program for the SpHCP. Proposition 2.2.1 Assume that for all i, j ∈ N , Dij are mutually and stochastically independent and normally distributed. Then the stochastic optimization problem SpHCP defined by (2.1)-(2.6) is equivalent to the following integer linear program: (SpHCPL) min. Xik,β β s.t. (2.3), (2.4), (2.5), (2.6), (2.7). Having introducing SpHCP and its deterministic counterpart SpHCPL, we make three remarks in addition to the two made earlier in this section. First, Sim et al. [65] de- rive a deterministic counterpart from their stochastic optimization problem and the deterministic counterpart is also an integer linear program. However, the determin- istic formulation in Sim et al. [65] has 2N4 + N2 + N + 1 constraints and N4 + N2 binary variables, whereas our SpHCPL has only N3 + N2 + N + 1 constraints and N2 binary variables. This seems computationally more promising in terms of both computer memory and CPU times. Second, the appendix of [65] gives an alternative ILP represen- tation, with O(N2) variables and O(N4) constraints, requiring an extra assumption on the stochastic variables, a kind of stochastic triangle inequality. Third, when random- ness of travel times is removed, SpHCPL is similar to the integer linear programming formulation studied in Kara and Tansel [37]. The only difference is in the constraint (2.7) for defining the lower bound β on the objective function, which in [37] is given by β ≥∑k∈N (dik + αdkm)Xik + dmjXjm, for each i, j and m in N . It is important to note that the same approach can be applied for any distribution of the travel time between nodes provided that the percentiles TDijkm are known in advance, where TDijkm satisfies the equation Pr(Dij + αDkm + Dmj ≤ TDijkm) = γ. If, unlike the normal distribution, these percentiles cannot be computed directly, then they can be found by sampling or simulation techniques which can be carried out independently from and prior to solving SpHCPL. 18 2.3 Solution Methods Looking ahead to Section 2.4, our preliminary computational experience in applying CPLEX to SpHCPL indicates that this combination of model and software is adequate for problems up of to 25 nodes (Table 2.1). However N = 25 is small for the standard dataset of test problems, the Australian Post (AP) [26], where deterministic problems of hundreds of nodes can be solved exactly [48], reasonably quickly. Therefore, in the follow- ing two subsections, we investigate two computational approaches for solving SpHCPL more efficiently. These are called pull and push approaches, respectively. In the pull approach, we aim to remove redundant constraints related to (2.7) and to add some valid cuts. The push approach is a separation algorithm in which we start with no or a few constraints of type (2.7) and we gradually add more such constraints if required. It is worthwhile to mention that Camargo et al. [20] solve the uncapacitated multiple allocation problem using the Benders’ decomposition approach, which allows them to solve test examples with a network size of up to 200 nodes. 2.3.1 SpHCP-Pull, preprocessing Suppose that βU and βL are upper and lower bounds for the optimal objective function value for SpHCPL or equivalently SpHCP. We give a lemma that uses these bounds to identify valid cuts and redundant constraints for SpHCPL; its proof is elementary and appears in the Appendix. Then we suggest a way to generate valid upper bounds. Lemma 2.3.1 (a) Assume the travel time is symmetric: i.e., Dij and Dji have the same probability distribution. If for given j,m, 2djm + zγ √ 2σjm > β U , then Xjm = 0 is a valid cut for SpHCPL. (b) If for given i, j,m, minNk=1(tijkm + zγδijkm) > β U , then Xjm = 0 is a valid cut for SpHCPL. (c) If for given i, j,m, maxNk=1(tijkm+zγδijkm) < β L, then constraint (2.7) is redundant for the given i, j,m and can be removed. (d) Assume the triangular inequality property holds over dij, i.e. dij + djm ≥ dim. If for given j,m, djm + zγσjm + αmax N `=1 dm` > β U , then Xjm = 0 is a valid cut. 19 (e) If Xjm = 0 is a valid cut, then for any i and the corresponding j,m, constraint (2.7) is redundant for SpHCPL. Based on Lemma 2.3.1, if an upper bound and/or a lower bound for SpHCPL are avail- able, then we can obtain a better integer linear program than SpHCPL by adding some additional cuts and by removing some redundant constraints. We call this modified in- teger linear programming formulation SpHCPL-Pull. The exact format of SpHCPL-Pull depends on the test example and available upper and lower bounds. Next we design a heuristic method for deriving an upper bound. Define d˜ik = dik + zγσik. It is easy to verify that Pr(Dik ≤ d˜ik) = γ, which represents the service level on the link between i and k. Based on d˜ik, we construct a new network whose network structure is identical to the existing network, but the direct stochastic travel time Dik between i and k is replaced by the deterministic travel time d˜ik. For this new network, we follow Ernst et al. [24] to propose the radius-based formulation: (Heuristic) min. Xik,rk,β β s.t. rk ≥ d˜ikXik,∀i, k ∈ N β ≥ rk + rm + αd˜km, ∀k,m ∈ N (2.3), (2.4), (2.5), (2.6). Here rk represents the radius of hub node k. When k is not a hub node, rk = 0 holds automatically. If for all i, k, Dik is a random variable with a single pulse, then Dik = d˜ik = dik and the above formulation reduces to the radius-based formulation proposed in Ernst et al. [24]. In terms of computational times, the radius-based formulation is the state-of-the-art integer linear programming formulation for the single allocation p-hub center problem. The optimal solution for (Heuristic) gives a feasible solution for SpHCP, which in turn generates an upper bound for SpHCP. 2.3.2 SpHCP-Push, separation algorithm In this subsection we focus on the so-called push approach SpHCPL-Push, which is a separation algorithm. SpHCPL-Push consists of two separate components: so-called restrictive master problems and, to check their optimality, subproblems. A restrictive master problem is a modification of SpHCPL by ignoring some constraints of type (2.7). A subproblem is to check whether or not the most recent restrictive master problem 20 indeed gives an optimal solution to SpHCPL or SpHCP. This can be done by checking whether or not any constraint of type (2.7) violates if β in (2.7) is replaced by the optimal objective function value of the most recent restrictive master problem. If not, an optimal solution for SpHCPL is obtained. Otherwise, such violating constraints are added to the most recent master problem to form an updated restrictive master problem. Due to the special structure of SpHCPL, we only need to check constraint violations for some selected ones of type (2.7). Suppose X∗ik and β ∗ are the optimal solution and the optimal objective function value for the most recent restrictive master problem, respectively. If X∗jm = 0, then for any i and the corresponding j,m, {X∗ik} satisfies constraint β∗ ≥ ∑Nk=1(tijkm + zγδijkm)(X∗ik + X∗jm − 1) automatically, which implies that there is no need to check constraint violations for any i and the corresponding j,m. If X∗jm = 1 and constraint β ∗ ≥ ∑k∈N TDijkmX∗ik is violated, then for any i and the corresponding j,m, constraint (2.7) is added to the most recent restrictive master problem. Note that for any j, there is exactly one m such that X∗jm = 1. Therefore, we add at most N2 constraints of type (2.7) to the new restrictive master problem. The above argument also shows that the subproblem can be solved in polynomial time: checking if Xjm = 0, and checking if β ∗ ≥ ∑Nk=1(tijkm + zγδijkm)X∗ik is violated when Xjm = 1. The SpHCPL-Push approach cycles iteratively between restrictive master problems and subproblems and terminates when an optimal solution for SpHCPL is found. Because there are exactly N3 constraints of type (2.7) and at least one new and different con- straint of type (2.7) is added to the restrictive master problem, an optimal solution for SpHCPL can be found in at most N3 iterations between the restrictive master problem and the subproblem. The SpHCPL-Push approach as well as some related results is outlined below. 21 SpHCPL-Push: Step 1 Generate an initial restrictive master problem: Modifying SpHCPL by removing all the constraints of type (2.7). Step 2 Solve the most recent restrictive master problem. Let X∗ik and β ∗ be its optimal solution and optimal objective function value, respectively. Step 3 Solve the subproblem by checking constraint violations for (2.7) based on the procedure described above. If there is no constraint violation, terminate and an optimal solution for SpHCPL is obtained. Otherwise, go to Step 4. Step 4 Add newly violating constraints to the most recent restrictive master problem to form a new restrictive master problem. Go to Step 2. Proposition 2.3.1 (a) The subproblem in SpHCPL-Push is polynomially solvable. (b) An optimal solution for SpHCPL can be obtained after at most N3 iterations be- tween the restrictive master problem and the subproblem. 2.4 Numerical Results In this section, we test the computational performance of the approaches proposed in the previous sections using well-known test examples from two libraries of test problems, (CAB) [29, 49] and (AP) [26]. Our code is written in C++, integer linear programs are solved using ILOG CPLEX Version 12.1, and all of the numerical experiments are carried out on a HP Pavilion laptop with an Intel(R)Core(TM)2 CPU processor with 3.00 GB RAM. The standard CAB and AP test examples are designed for the deterministic p-median hub location problem. Recall that Dik is assumed to be normally distributed with a mean of dik and a standard deviation of σik. In our test examples, we assume that dik takes the value given in the (deterministic) test libraries and σik = νdik, for a constant ν called the coefficient of variation. In our runs we set ν = 1 and γ = 0.95 for all test examples and later check the sensitivity of the CPU times with respect to both γ and ν. 22 We test three approaches: SpHCPL, SpHCPL-Pull, and SpHCPL-Push and let each run for a maximum of 1 CPU hour (3600 CPU seconds). However, for some of the smaller test problems which were not solved within the 1 hour time limit, by SpHCPL or SpHCPL-Pull, a 2 hour time limit was tried but resulted in no additional solutions found. Note that the problem (Heuristic) introduced in Section 2.3.1 is used for finding an upper bound for SpHCPL-Pull. Numerical results are shown in the tables below. For each test problem, we report: Prob problem name (for CAB examples, a name has a format of N.p.q such that α = q/10, with the exception of q = 1 then α = q, and for AP examples, a name has a format of N.p and α = 0.75), Obj optimal objective function value, Upper objective function value of the feasible solution generated from (Heuristic) which provides an upper bound to Obj, SpHCPL CPU time of running SpHCPL, SpHCPL-Pull CPU time of running SpHCPL-Pull, SpHCPL-Push CPU time of running SpHCPL-Push, and Iter number of iterations for SpHCPL-Push. For network sizes of 25 nodes or fewer (table 2.1), all methods perform relatively well. In comparison to the test runs in Sim et al. [65] — where the computational demands of the ILP formulation of [65] meant that only heuristic approaches could be usefully employed — our standard SpHCPL formulation appears to be very promising because CPLEX produces globally optimal solutions for all problems within a few minutes. Moreover, the difference between SpHCPL-Pull and SpHCPL-Push is very small for network sizes of 25 nodes or fewer as can be seen in tables 2.1 and 2.2. For network sizes of 40 nodes or more (table 2.2), however, we can no longer solve the test problems using SpHCPL with one exception, 40.2, or solve networks of size 50 nodes using SpHCPL-Pull within 1 CPU hour. What is interesting is that SpHCPL-Push performs very well for the test examples with 40 or 50 nodes. 23 Prob γ ν Obj SpHCPL SpHCPL-Pull SpHCPL-Push Upper Iter 5.3.1 0.95 1 2078.44 0.08 0.02 0.02 2276.84 2 5.3.6 0.95 1 1736.03 0.08 0.02 0.04 1788.54 2 10.2.1 0.95 1 3937.62 0.61 0.13 0.04 3986.21 2 10.2.2 0.95 1 2896.63 0.59 0.08 0.05 2896.63 2 10.2.4 0.95 1 3207.72 0.6 0.17 0.21 3207.72 3 10.2.6 0.95 1 3576.5 0.76 0.16 0.14 3811.55 3 10.2.8 0.95 1 3811.55 0.92 0.07 0.06 3811.55 2 10.3.1 0.95 1 3829.18 0.66 0.06 0.05 3881.41 2 10.3.2 0.95 1 2425.73 0.44 0.05 0.04 2425.73 2 10.3.4 0.95 1 2425.73 0.47 0.05 0.04 2425.73 2 10.3.6 0.95 1 2807.9 0.37 0.05 0.04 2807.9 2 10.3.8 0.95 1 3361.34 0.49 0.05 0.04 3386.14 2 10.4.1 0.95 1 3829.18 0.4 0.21 0.25 4676.7 4 10.4.2 0.95 1 1707.95 0.45 0.06 0.04 1707.95 2 10.4.4 0.95 1 1921.89 0.41 0.03 0.03 1921.89 2 10.4.6 0.95 1 2800.5 0.4 0.04 0.06 2806.02 3 10.4.8 0.95 1 3361.34 0.38 0.06 0.09 3741.36 3 15.2.1 0.95 1 5420.99 3.31 0.69 0.35 5655.94 4 15.2.2 0.95 1 4126.04 2.07 0.54 0.57 4126.04 5 15.2.4 0.95 1 4430.18 2.45 0.66 0.53 4681.75 5 15.2.6 0.95 1 4785.7 2.11 0.45 0.24 4798.08 3 15.2.8 0.95 1 5184.13 2.3 0.45 0.16 5257.06 3 15.3.1 0.95 1 5092.54 2.49 1.44 1.58 6890.21 7 15.3.2 0.95 1 3647.14 1.98 0.26 0.18 3789.68 3 15.3.4 0.95 1 3813.75 2.54 0.53 0.36 3813.75 4 15.3.6 0.95 1 4009.01 2.35 0.14 0.08 4009.01 2 15.3.8 0.95 1 4702.49 2.3 1.1 0.28 5512.17 4 15.4.1 0.95 1 5092.54 1.72 1.39 1.6 6890.21 7 15.4.2 0.95 1 2782.98 1.72 0.25 0.08 2782.98 2 15.4.4 0.95 1 2949.82 2.71 0.13 0.06 2949.82 2 15.4.6 0.95 1 4009.01 3.28 0.41 0.2 4134.12 3 15.4.8 0.95 1 4702.49 2.72 0.69 0.31 5512.17 4 20.2.1 0.95 1 5420.99 13.92 2.55 0.84 5655.94 5 20.2.2 0.95 1 4101.59 9.57 1.14 0.14 4101.59 2 20.2.4 0.95 1 4284.84 9.86 2.25 1.29 4681.75 5 20.2.6 0.95 1 4785.7 9.62 2.07 0.74 4928.59 4 24 20.2.8 0.95 1 5089.27 13.36 2.6 0.81 5420.99 5 20.3.1 0.95 1 5092.54 14.74 9.07 4.04 6890.21 8 20.3.2 0.95 1 3075.01 6.11 0.85 0.14 3075.01 2 20.3.4 0.95 1 3688.54 8.04 2.32 0.8 3813.75 4 20.3.6 0.95 1 4004.81 21.09 0.49 0.11 4032.51 2 20.3.8 0.95 1 4671.23 28.1 6.46 0.69 5171.19 4 20.4.1 0.95 1 5092.54 14.65 6.46 6.77 6890.21 8 20.4.2 0.95 1 2830.12 16.17 1.3 0.44 2922.82 3 20.4.4 0.95 1 3028.72 12.47 0.77 0.2 3028.72 2 20.4.6 0.95 1 4001.28 19.58 0.77 0.33 4060.07 3 20.4.8 0.95 1 4671.23 16.15 5.06 1.26 5406.61 5 25.2.1 0.95 1 5629.69 100.35 1.52 0.19 5646.17 2 25.2.2 0.95 1 4517.81 33.71 6.95 1.61 4617.72 5 25.2.4 0.95 1 4814.33 26.98 10.32 6.18 4895.53 7 25.2.6 0.95 1 5117.46 32.42 2.43 0.63 5117.46 4 25.2.8 0.95 1 5362.94 110.05 1.98 0.21 5362.94 2 25.3.1 0.95 1 5455.94 148.55 3.21 1.22 5982.31 5 25.3.2 0.95 1 3880.84 34.35 5.67 0.89 4105.16 3 25.3.4 0.95 1 4407.47 60.15 15.43 5.66 4439.91 6 25.3.6 0.95 1 4624.12 101.85 4.57 0.36 4624.12 2 25.3.8 0.95 1 5048.59 92.58 0.84 0.14 5060.09 2 25.4.1 0.95 1 5455.94 119.88 18.55 5.96 6911.02 7 25.4.2 0.95 1 3216.94 56.39 2.46 0.71 3216.94 3 25.4.4 0.95 1 3813.75 96.97 2.34 0.5 3813.75 3 25.4.6 0.95 1 4449.14 192.99 14.59 3.3 4617.72 6 25.4.8 0.95 1 5048.59 109.1 5.16 0.87 5342.62 4 Table 2.1: Numerical results for SpHCPL, SpHCPL-Pull, and SpHCPL-Push, CAB examples. 25 Prob γ ν Obj SpHCPL SpHCPL-Pull SpHCPL-Push Upper Iter 5.2 0.95 1 60402 0.08 0.02 0.05 61525.2 3 5.3 0.95 1 60402 0.07 0.03 0.06 65123.5 3 10.2 0.95 1 87498.2 0.69 0.11 0.04 87498.2 2 10.3 0.95 1 73634.5 0.69 0.1 0.09 75413.2 3 10.4 0.95 1 70902.5 0.53 0.04 0.04 70902.5 2 10.5 0.95 1 70902.5 0.33 0.09 0.15 75086.6 3 20.2 0.95 1 99570.1 8.3 3.13 0.24 99570.1 3 20.3 0.95 1 93111.2 17.91 0.97 0.25 94036.9 3 20.4 0.95 1 90871.9 13.23 2.65 1.75 100351 5 20.5 0.95 1 90871.9 16.5 2.54 1.23 100351 5 20.10 0.95 1 90871.9 4.74 2.47 1.47 100351 5 25.2 0.95 1 114205 51.99 6.2 0.63 115289 3 25.3 0.95 1 109781 53.8 30.53 7.44 120714 6 25.4 0.95 1 109781 30.97 9.17 6.08 120714 7 25.5 0.95 1 109781 37.74 12.36 6.75 120714 7 25.10 0.95 1 109781 16.27 8.86 7.42 120714 14 40.2 0.95 1 131626 1904.91 58.54 2.41 133652 3 40.3 0.95 1 121203 * 278.7 19.19 126088 6 40.4 0.95 1 114571 * 901.98 107.36 131814 7 40.5 0.95 1 114571 * 85.27 125.44 131814 9 40.10 0.95 1 114571 * 150.62 253.41 131814 9 50.2 0.95 1 133722 * * 196.88 141971 6 50.3 0.95 1 120783 * * 70.16 124513 5 50.4 0.95 1 117921 * * 2651.61 131557 10 50.5 0.95 1 117921 * * 1192.33 134376 11 50.10 0.95 1 117921 * * 739.38 134376 19 Table 2.2: Numerical results for SpHCPL, SpHCPL-Pull, and SpHCPL-Push, AP examples. To visualize the relative performance among all three numerical methods, we plot the ratio of the CPU time for each method against the best CPU time among the three methods. This relative performance of CPU times is given in figure 2.1, which shows that SpHCPL-Push is a clear winner. We remark that the graphs are truncated at the performance ratios of 20 and 80 for the CAB and AP datasets, respectively. 26 Figure 2.1: Performance of CPU times for SpHCPL, SpHCPL-Pull, and SpHCPL-Push: The top panel for the CAB dataset and the bottom panel for the AP dataset. Some additional observations can be made. First, additional tests in which we applied our proposed methods to the deterministic single-allocation p-hub center problem showed that they are more efficient than the methods of [37] but less efficient than the radius method of [24] that itself is dominated by the more recent advance of [48] in which problems of hundreds of nodes can be routinely solved. This gives an idea of the gap between tractability of stochastic and deterministic hub center problems. Second, it is clear from constraint (2.7) in SpHCPL that the objective value is increasing in the discount factor α, the coefficient of variation ν, and the service level parameter γ since increasing any of these will most often increase the right hand side of (2.7) for all i, j,m ∈ N . The exceptions being if β is bound by a path where k = m or by a path with a single pulse, then respectively increasing α or ν and/or γ no more than as to keep β bound by the same path will not affect the objective value. However it is not as clear how this may affect the CPU times. We therefore conduct a sensitivity analysis on 27 a single test problem, AP 25.2, for a fixed value of γ = 0.95 and ν = 1 respectively. We check sensitivity of the CPU times of the three methods as functions of ν (γ, resp.) and observe that the CPU times are not very sensitive to either service level γ or coefficient of variation ν as can be seen in figure 2.2. Figure 2.2: Sensitivity results of the CPU times in coefficient of variation ν and service level parameter γ for test example AP 25.2. 2.5 Conclusion In this paper we introduce a new formulation (Proposition 2.2.1) for the stochastic unca- pacitated single allocation p-hub center problem together with two solution procedures for finding globally optimal solutions based on a preprocessing approach and a separation algorithm. The combination of modeling and optimization techniques allows us to solve small to medium-sized problems in reasonable time. This is in contrast to the original formula- tion of the problem in Sim et al. [65] where computational difficulties led to the use of heuristics for even small problems. Of note is the separation algorithm approach where the subproblem is polynomially solvable, requiring at most N3 iterations. The combi- nation of the new model formulation and the separation algorithm, we are able to solve test examples up to 50 nodes in size. 28 Appendix Proof of Lemma 2.3.1. (a) Suppose that Xjm = 1. Let j = i and m = k. Then constraint (2.7) reduces to β ≥ tjjmm + zγδjjmm = 2djm + zγ √ 2σjm, which implies that a feasible solution with Xjm = 1 gives a worse objective function value than the current available upper bound βU . Hence, Xjm = 0 is a valid cut for SpHCP L. (b) Suppose that Xjm = 1. Then constraint (2.7) reduces to N∑ k=1 (tijkm + zγδijkm)Xik ≥ N min k=1 (tijkm + zγδijkm) N∑ k=1 Xik = N min k=1 (tijkm + zγδijkm), which is greater than βU according to the assumption. This shows that any feasible solution with Xjm = 1 gives a worse objective function value than β U . Therefore, Xjm = 0 is a valid cut for SpHCP L. (c) Suppose that Xjm = 1. Then constraint (2.7) reduces to β ≥ N∑ k=1 (tijkm+zγδijkm)Xik ≤ Nmax k=1 (tijkm+zγδijkm) N∑ k=1 Xik = N max k=1 (tijkm+zγδijkm) < β L, which is redundant for the given i, j,m. (d) Suppose that Xjm = 1 and assume dmi = max N `=1 dm` and Xin = 1. It follows from constraint (2.7) and the triangle inequality property that β ≥ (tijnm + zγδijnm) = din + αdnm + dmj + zγ √ (σin)2 + α2(σnm)2 + (σmj)2 ≥ αdim + dmj + zγσmj = dmj + αmax N `=1 dm` + zγσmj > βU . This shows that any feasible solution with Xjm = 1 gives a worse objective function value than βU . Hence Xjm = 0 is a valid cut. (e) When Xjm = 0, the right-hand side of constraint (2.7) for any i and the corresponding j,m is non-positive. Clearly, this constraint is redundant for SpHCPL. 29 ILP reformulation in Sim et al. [65] min. Xik,Yikmj ,β β s.t. β ≥ (dik + αkm + dmj + zγ √ σ2ik + α 2σ2km + σ 2 mj)Yikmj, ∀i, j, k,m ∈ N Yikmj ≥ Zik + Zjm − 1, ∀i, j, k,m ∈ N∑ k∈N Zik = 1, ∀i ∈ N Zik ≤ Zkk, ∀i, k ∈ N∑ k∈N Zkk = p Yikmj ∈ {0, 1}, ∀i, j, k,m ∈ N Xik ∈ {0, 1}, ∀i, k ∈ N 30 Chapter 3 Paper Two Service Risk in Hub and Spoke Networks Edward Hult Judge Business School, University of Cambridge, Trumpington Street, Cambridge CB2 1AG, United Kingdom, {e.hult@jbs.cam.ac.uk} Abstract The stochastic single allocation p-hub center problem aims to find the shortest service or delivery time guarantee over all paths in a hub and spoke network. This guarantee is set to hold for a minimum probability γ. The model considers the important stochastic nature of travel times on links but, unlike the deterministic single allocation p-hub center problem, departure times at hub nodes are not accounted for automatically. Not accounting for departure times may result in a non-optimal solution design for real life networks. We propose two new methodological approaches, motivated from talks with a few different European transportation companies, respectively accounting for variable and fixed departure times at hub nodes. An approximation solution is proposed for the variable departure times model using a step function where we prove that the optimal location-allocation solutions to the approximation model converge to the optimal location-allocation solutions of the original stochastic model when the step size is small 31 enough. In addition a linear model is formulated to give exact solutions to the stochastic p-hub center model with fixed departure times. We also propose a different risk model, using fixed departure times, which minimizes the penalty cost for origin-destination pairs missing some given maximum delivery service guarantee. The purpose is to maximize the volume of packages which arrive on time in contrast to the center problem which is path based. Three exact solution approaches are also constructed and numerical results are reported for all new models demonstrating the effectiveness of the respective solution approaches. Keywords: Hub location; Center problem; Stochastic programming. 3.1 Introduction In hub and spoke networks the flow of packages from origin to destination is designed to be simple and cost efficient. All packages from different origin nodes are collected and brought, often by truck, to a relatively nearby hub. The packages at the hub are then sorted and consolidated into bigger delivery vehicles, such as airplanes or trains, for a quicker and more cost effective transfer to a relatively far off located hub. If the origin- destination nodes are geographically close this second leg is skipped if both nodes are connected to the same hub. At the second hub, packages are disaggregated and sorted for distribution to all the directly connected destination nodes. A typical delivery path is shown in figure 3.1. Figure 3.1: Origin-destination path via hub nodes. Fast delivery times and strong service guarantees are vital aspects for modern day trans- portation and delivery companies. This paper presents new mathematical modeling tools, for network managers, providing effective design solutions. An effective solution provides a design which is practical and provides both fast and reliable delivery time 32 guarantees. The main focus of the paper is on mail and parcel delivery networks but many of the ideas developed here can be directly applied to any hub and spoke type network. The majority of work done on hub and spoke systems in the literature assumes deter- ministic network properties, for example demand and link travel times are based on average values. However, assuming fixed travel times on links in a network when con- sidering delivery times and service guarantees is a very big assumption and can lead to poor solutions with huge additional costs, see for example Example A in the Appendix. Strong assumptions are however not uncommon in the mathematical modeling commu- nity and many researchers and practitioners alike will often make several assumptions and simplifications when designing models due to the dynamic and highly complicated properties of real life systems. This paper aims to help bridge the gap between theo- retically solvable models and practical and useful design solutions for real life mail and parcel delivery systems. In our quest to help bridge the theory-practice gap we contacted four European mail and parcel companies including a major European postal company. We did this to find out what the industry’s thoughts are on considering stochastic travel times in the design process of the network. In our talks with the different companies they all emphasized the importance of having a well structured and well timed network based on stochastic link travel times. This is because at each hub there is a set departure time, which must be based on the variations of each link’s travel time, so that incoming units can arrive on time. In the classic p-hub center problem, first studied by Campbell [10], which minimizes the maximum delivery time, link travel times, and therefore also departure times, are based on average values [38, 74]. This creates a problem since any travel time variation longer than the average on a link will miss the delivery time guarantee or will delay the departure time at the upcoming hub causing a chain effect delay for multiple delivery destinations or a missed departure altogether. Sim et al. [65] and later Hult et al. [33] consider the stochastic nature of travel times on links and solve the stochastic uncapacitated single allocation p-hub center problem. They minimize the maximum delivery time β for a guaranteed service level in a network. The service level is the probability that the travel time on each origin-destination path is less than or equal to β is greater than the given probability γ. However, for the delivery path to hold all incoming packages must arrive at each hub before the set departure time from hub k to hub m and from hub m to destination node j. In the 33 literature for the deterministic case, the departure time is assumed to be when the latest origin-hub link has arrived or when the latest hub-hub link has arrived [38, 74] as shown in figure 3.2. The methodology used in [65, 33] is an extension on the formulations found in the literature on the problems deterministic counterpart. However, unlike in the deterministic case where the departure times are automatically accounted for, the existing stochastic models in [65, 33] do not account for or automatically find any departure times at the hub nodes. The implicit assumption is that a packet arriving at hub node k from any origin node i will be forwarded, on arrival, to the next hub. However, we impose a common departure time for all incoming packets. Figure 3.2: Departure times for a deterministic model. In our talks with the different companies it became clear that there are two main ways to account for departure times at hub nodes. The first is in regard to larger transportation companies which have their own airplanes. They will often opt to fly the hub-hub leg at night when there is less traffic. This allows the companies to more or less freely choose their desired departure times at hub nodes. The second situation is in regard to smaller companies who rent space from some other companies to transport their goods on the hub-hub links. For these companies there is no choice in departure time and a fixed departure time at each hub to hub transfer is given in advance. When modeling the stochastic p-hub center problem it becomes vital to specifically account for each hub’s departure times or a non-optimal solution may be found when applied to a real life situation where departures must exist from each hub node, see Counter Example B in the Appendix. We therefore propose two new methodological approaches to the stochastic uncapacitated single allocation p-hub center problem taking into account the two different departure time situations to give a practically better hub and spoke network design. We also propose two computationally efficient solution approaches to the two problems. In addition to minimizing the maximum delivery time for some given service guarantee, 34 as in the stochastic p-hub center problem, it became clear from our talks that companies are interested in finding a network setup which looks at the percent of packages which are delivered in time and not just paths which deliver in time. To accommodate this we propose a new model, with three accompanying solution approaches, which we call the Service Cost Problem (SCP) and is introduced in Section 3.3. This differs from the stochastic p-hub center problem by taking into account the volume on paths for a given delivery time, such as a 24 hour delivery guarantee. The paper is set up in the following way. A deeper look at the literature is presented in Section 3.1.1. In Section 3.2 the two new methodological approaches to a stochastic p-hub center problem are introduced in Sections 3.2.1 and 3.2.2 together with their respective models and solution procedures. Numerical results are presented in Section 3.2.3. In Section 3.3 the new service cost problem is introduced and formulated in Section 3.3.1. Solution procedures are presented in 3.3.2 and numerical results given in Section 3.3.3. The paper ends with a short conclusion in Section 3.4. 3.1.1 Literature review Hub and spoke research is a relatively new field of research and O’Kelly [49] was the first to formulate a hub location problem mathematically, named the single allocation p-hub median problem. Campbell [9] and [10] later produced the first linear integer programming formulations for the multiple and single allocation p-hub median problems respectively. The goal here is to minimize the total cost of flow across the network by finding p number of hub nodes and allocating each additional node to one (single allocation) or more (multiple allocation) hubs. Additionally Campbell [10] introduced two other main problems which have become standard in the hub and spoke literature, the p-hub center problem and the hub covering problem. The p-hub center problem aims to minimize a defined maximum cost over the links or paths, for example minimizing the maximum travel time for any origin-destination pair. The hub covering problem is about finding a set of hubs so that the distances for origin and destination nodes to hub nodes fulfill a certain criterion. The objective can for example be to minimize the cost of opening hubs to reach all destinations or to maximize the flow under a set of constraints. The p-hub median problem has received the most attention in the literature but all three have been studied with a focus on either improving the computational performance or creating different model variations to represent more realistic systems. 35 Strong computational improvements on the p-hub median problem were made in Skorin- Kapov et al. [67] and in Ernst and Krishnamoorthy [25]. Later Camargo et al. [20] solved the uncapacitated multiple allocation problem using Benders’ decomposition, al- lowing network sizes of up to 200 nodes to be solved in a few hours. Contreras et al. [16] solved the capacitated single allocation problem using column generation together with a Lagrangean relaxation also solving networks up to 200 nodes. Other variations on the median problem can be found in, for example, Elhedhli and Hu [23] who consider conges- tion effects at hub nodes, adding a non-linear congestion cost function to the objective of the uncapacitated single allocation p-hub median problem. Costa et al. [18] remodeled the capacitated single allocation hub location problem, which itself is a variation of the p-hub median problem also introduced in [10], by removing the capacity constraints and instead adding a second objective function to the formulation, minimizing the time a hub takes to process the incoming flow. Martin and Roman [47] and Sasaki [61] use one-hub models for capturing competition properties in airline companies. Marianov et al. [45] studied competition in a hub network aimed to maximize the total captured flow by using a constraint for the travel cost stating that if the travel cost is not less than a certain market value between an OD pair, then there will be no flow between that OD pair. The p-hub center problem, also first discussed by Campbell [10], aims to minimize the maximum cost of a defined type of flow over a link or path, for example minimizing the maximum length of travel for any origin-destination pair. The original formulations of the uncapacitated single allocation p-hub center problem [10] consisted of a quadratic programming formulation and also a linear programming formulation using O(n4) vari- ables. Kara and Tansel [37] later improved on this by formulating the problem using O(n2) variables and O(n3) constraints. An additional improvement on the computa- tional performance was found by Ernst et al. [24] by reformulating it using what they called the radius of hub k resulting in an O(n2) variable and constraint formulation. The radius of hub k model solves problem sizes consisting of 100 nodes without any major concerns. Meyer et al. [48] introduced a very efficient 2-phase method which can solve 400 node networks in a relatively short period of time. They use a branch and bound algorithm to find potential optimal hubs and then solve the allocation problem sepa- rately for each node in the branch and bound forest. Kara and Tansel [38] modeled the latest arrival hub location problem taking into the account the departure times at each hub. However, the departure times at each hub are assumed to be at the time the last cargo arrives at the hub and this was shown in [74] to be equivalent to the methodology 36 of the original p-hub center problem. Yaman et al. [80] also develop a latest arrival hub location problem in which they allow stopovers to occur on the node-hub links, such as when a truck may stop at several nodes on its way to the hub node. They solve network sizes of up to 16 nodes. Many variations exist for deterministic hub and spoke problems and the interested reader is referred to the two literature review papers [3, 13]. As for stochastic hub and spoke research the literature is much smaller. Marianov and Serra [44] developed a heuristic algorithm based on tabu search to solve a hub location problem with chance constraints. They build an M/D/c queuing model which considers the number of airplanes waiting to land based on an assumed Poisson arrival rate of the aircraft. The result from theM/D/c queuing model is then used in the chance constraint requiring the average number of airplanes, arriving to some hub, to be less than or equal to the found value. They are able to solve networks up to 30 nodes in size. Sim et at. [65] developed a stochastic version of the single allocation p-hub center problem based on normally distributed link travel times. They developed several heuristic methods to approximately solve network sizes up to 40 nodes. Later Hult et al. [33] studied the same problem as [65] but presented exact solution methods based on preprocessing and a separation algorithm which are able to solve up to 50 node networks. Yang [81] developed a two stage model considering stochastic demand and stochastic discount factors on the hub-hub links for an air freight hub location problem. They consider three possible discrete scenarios on demand and conduct a case study of a 10 node air freight network in Taiwan and China. Contreras et al. [15] constructed and solved, using Monte-Carlo simulation coupled with Benders’ decomposition, a stochastic uncapacitated multiple allocation hub location problem using stochastic independent transportation costs. They are able to approximately solve networks up to 50 nodes in size. There is a much larger pool of research done in the similar but, modeling wise, simpler field of stochastic facility location problems. The interested reader is referred to the large stochastic facility location literature review paper [69]. 37 3.2 New Methodological Approach to a Stochastic p-hub Center Problem Given a network of N nodes denoted by N , the designer wants to find an optimal single allocation solution for each node in N to a set of p hubs in N so as to minimise the longest path, β, in the network. β is set such that all deliveries from an origin-destination pair (OD pair) arrive before the maximum time β for at least a given probability γ. For each OD pair i → j, shipments are transported via some hub nodes k and m where k may be equal to m in some cases. We assume all hub nodes to be fully connected. Define Xik to be a binary variable such that Xik = 1 if and only if node i is allocated to hub node k. Assume T 1ik to be the stochastic travel time from node i to hub k plus the time for unloading, sorting and loading after arrival. Similarly assume T 2mj to be the stochastic travel time from hub m to node j plus the time from the latest arrival, Am, to hub m until the actual departure time from m to node j. The random time from the latest arrival to departure accounts not only for possible random effects of unloading, sorting and loading times but also accounts for possible randomness in the travel times on the incoming hub-hub links, although according to the talks we had with the different companies, the hub-hub links are often very reliable and have low variance. Assume all T 1ik and all T 2 mj are independent and normally distributed random variables where t 1 ik and t2mj are the means and σ 1 ik and σ 2 mj are the standard deviations respectively. Define tkm to be the expected travel time for each hub-hub link and α as the discount factor associated with the hub to hub transfers. The triangle inequality is assumed to hold for all mean travel times but not for their respective variances. To find an optimal solution to the uncapacitated single allocation p-hub center problem with stochastic travel times we need to incorporate and account for departure times. Not doing so may lead to non-optimal solutions, as seen in Counter Example B in the Appendix. In addition a delay in the departure from a hub to hub link can cause a chain effect delay across the network resulting in several OD pairs missing their guaranteed delivery times. The departure time for each hub-hub link, k to m, is modelled in two different ways, depending upon the type of transportation company, and each are discussed further in the following two sections. The departure time, Am, from hub m to destination node j, is more flexible, due to the more flexible modes of transportation in regards to departure times used on the hub-node links. Although it is modelled to depart 38 when the last leg has arrived it allows for delays to account for sorting and loading times of the goods. The sorting and loading times are captured by the stochastic travel time T 2mj. In addition we assume that all origin node shipments depart at the same time, for example at 16:00 from each origin depot. This can however very simply be modified by adding the constant ti onto T 1 ik in all formulations (for example T 1 ikXik turns into (T 1ik + ti)Xik) which represents a positive or negative number associated with a later or earlier departure time than the norm for each origin node i respectively. 3.2.1 Variable departure times Larger delivery companies which have access to their own airplanes, for the hub-hub link transfers, often opt to fly at night. At night there is less passenger travel and finding a departure time becomes less about checking what slot times are available and more about figuring out when will the packages arrive and how long will it take to sort and load them onto the outbound flights. To capture this additional question we propose a stochastic p-hub center model which also solves for an optimal departure time, Dk, from each hub k to all hubs m. We model the flow through the network in the following way. The packages flow from an origin node i departing at time “zero”. T 1ik time units later they have arrived and been sorted and will get loaded to depart from hub k to hub m by time Dk. From k they travel to hub m taking αtkm time units, where α is the discount factor, associated with the often faster mode of transportation, for travel times over hub arcs. At hub m they wait till the time Am for when the last cargo has arrived. Everything then gets loaded onto outbound trucks from hub m to be delivered to some destination node j which takes T 2mj time units from Am and arrives before the maximum time β with a minimum probability γ. We assume all departures from a hub k can be made simultaneously. This assumption is the same and in accordance with the literature for the deterministic versions of the problem and is satisfactory since only the longest path in the network realistically needs to depart on the found departure time for the delivery service to hold. All other paths can depart in order of length and will most often not delay the delivery service. The exception is if two paths are of same length which both make up the longest path in the network and are routed via the same hub k to go to two different hubs m and m′, then one will have to wait till the other has departed and may cause the path on the 39 later departing flight to slightly increase β so as to hold with a probability γ. This effect however is very small given the entire length of the paths and will have a very small impact on the optimality of the network structure. The stochastic p-hub center problem with variable departure times (SpHCPDV ) lets the designer solve for the optimal hub locations and node allocations and the optimal departure times at each hub while minimizing the nominal maximum travel time for all OD pairs via the decision variables Xik, Dk, Am and β respectively. (SpHCPDV ) min. Xik,Dk,Am,β β (3.1) s.t. γ ≤ Pr[T 1ikXik ≤ Dk ∩ (Am + T 2mj)Xjm ≤ β],∀i, j, k,m ∈ N (3.2) Am ≥ (Dk + αtkm)XkkXmm, ∀k,m ∈ N (3.3)∑ k∈N Xik = 1, ∀i ∈ N (3.4) Xik ≤ Xkk, ∀i, k ∈ N (3.5)∑ k∈N Xkk = p (3.6) Xik ∈ {0, 1}, ∀i, k ∈ N (3.7) Dk ≥ 0, ∀k ∈ N (3.8) Am ≥ 0, ∀m ∈ N (3.9) The objective function (3.1) minimizes the maximum travel time β for all paths in the network. The chance constraints (3.2) state that the probability of a path i− k−m− j to arrive before the time β is greater than or equal to γ. Constraint (3.3) finds the latest arrival from all incoming hub links to hub m. Constraints (3.4)-(3.7) are standard for the uncapacitated single allocation p-hub problem and respectively state that all nodes must be assigned to exactly one hub, a node can only be assigned to a node k if k is a hub node, exactly p hubs should be found and lastly all Xik are binary. Going forward we have two objectives. One is to reduce the size of the formulation, from O(N4) to O(N2) constraints, and the second, to make the chance constraints (3.2) linear. The goal is to find solutions to midsized problems within reasonable time. We note that (Dk + αtkm)XkkXmm in (3.3) could be directly inserted into constraint 40 (3.2) since the binding link i− k will always result in the latest arrivals to hub m from k due to the variable departure time Dk. This allows constraints (3.2) and (3.3) to be replaced by: γ ≤ Pr[T 1ikXik ≤ Dk ∩ ((Dk + αtkm)XkkXmm + T 2mj)Xjm ≤ β], ∀i, j, k,m ∈ N (3.10) and thus also eliminating the need for the decision variable Am. In addition due to the independence of the random variables T 1ik and T 2 mj, (3.10) can also be rewritten by adding the decision variables p1 and p2 using the following set of constraints: p1 ≤ Pr[T 1ikXik ≤ Dk], ∀i, k ∈ N (3.11) p2 ≤ Pr[((Dk + αtkm)XkkXmm + T 2mj)Xjm ≤ β], ∀j, k,m ∈ N (3.12) γ ≤ p1p2 (3.13) By adding the decision variables rm, (3.12) can be changed to: p2 ≤ Pr[T 2mjXjm ≤ rm], ∀j,m ∈ N (3.14) β ≥ Dk + αtkm + rm, ∀k,m ∈ N (3.15) Constraints (3.14) and (3.15) are equivalent to (3.12) since β in (3.15) must be greater than or equal to the departure time from hub k plus the travel time from hub k to hub m plus the longest time required for packages to travel from hub m to some destination node j whose time is guaranteed to hold for a minimum probability p2 in (3.14). Additionally if k and/or m are not hub nodes (3.15) will not be binding since, for example, if k is not a hub then Dk will be zero and k will be allocated to some other hub node k ′. This holds since β ≥ Dk′ + αtk′m + rm ≥ tkk′ + zp1σkk′ + αtk′m + rm ≥ αtkm + rm due to the triangle inequality of the mean times and where zp1 is the z-value from the standard normal distribution table for the probability p1. This results in an O(|N |2) instead of an O(|N |4) constraint formulation which we call 41 SpHCPDV 2 and is equivalent to SpHCPDV . (SpHCPDV 2 ) min. Xik,Dk,rm,p1,p2,β β s.t. rm ≥ 0, ∀m ∈ N (3.16) p1 ≥ 0 (3.17) p2 ≥ 0 (3.18) (3.4), (3.5), (3.6), (3.7), (3.8) (3.11), (3.13), (3.14), (3.15) In SpHCPDV 2 , Dk and rm also represent the radius of hub k and hub m for collection and distribution respectively. This is similar to the radius of hub k formulation by Ernst et al. [24] for the deterministic p-hub center problem, but here with chance constraints. However, the chance constraints (3.11) and (3.14) will still cause computational prob- lems. We therefore reformulate SpHCPDV 2 into an approximate deterministic linear formulation using a step function which can be optimally solved. The step function is in regards to the probability of reaching the different departure times at the different hubs and destination nodes and is defined by the set L where L = {(P 1l , P 2l ) ∈ [γ, 1] | P 1l = γ + lδ, P 1l P 2l = γ, l = 0, . . . , L} for δL = 1−γ. When |L| → ∞ we will show that the optimal solutions will be the same since all probability combinations of P 1l and P 2 l will be accounted for and will correspond with p1 and p2 in SpHCPDV 2 . Moreover, the optimal location-allocation solution, X∗ik, converges with an increasing |L|. Since the two vectors P 1l and P 2 l have the set L of possible probability combinations we can pre-calculate the parameters t1ikl and t 2 mjl so that P 1l = Pr[T 1 ik ≤ t1ikl] (3.19) and P 2l = Pr[T 2 mj ≤ t2mjl] (3.20) hold. Since T 1ik and T 2 mj are assumed to be normally distributed this can be calculated 42 by setting t1ikl = t 1 ik + z 1 l σ 1 ik and t 2 mjl = t 2 mj + z 2 l σ 2 mj where z 1 l and z 2 l represent the z- value from the standard normal distribution table for each of the probabilities P 1l and P 2 l respectively. However, if T 1ik and T 2 mj are not normally distributed t 1 ikl and t 2 mjl can easily be estimated by simulation and/or sampling at this stage for each of their respective different probabilities in P 1l and P 2 l . Since γ = P 1l P 2 l = Pr[T 1 ik ≤ t1ikl]Pr[T 2mj ≤ t2mjl] we can set:∑ l∈L t1iklXikYl ≤ Dk, ∀i, k ∈ N (3.21) and ∑ l∈L t2mjlXjmYl ≤ rm, ∀m, j ∈ N (3.22) where Yl is a binary variable equal to one if the probability combination (P 1 l , P 2 l ) ∈ L is chosen and will guarantee that P 1l ≤ Pr[T 1ikXik ≤ Dk] and P 2l ≤ Pr[T 2mjXjm ≤ rm] for the chosen l ∈ L. Note that the same probability combination is chosen for all outbound and inbound links, since all outbound links are connected to all inbound links. This allows us to reformulate SpHCPDV 2 into an approximate linear formulation using the decision variables Xik, Yl, Dk, rm and β with a step function consisting of different probability combinations (P 1l , P 2 l ) ∈ L represented by t1ikl and t2mjl. (SpHCPDVL ) min. Xik,Yl,Dk,rm,β β s.t. Dk ≥ ∑ l∈L t1ikl(Xik + Yl − 1), ∀i, k ∈ N (3.23) rm ≥ ∑ l∈L t2mjl(Xjm + Yl − 1), ∀m, j ∈ N (3.24) 1 = ∑ l∈L Yl (3.25) Yl ∈ {0, 1}, ∀l ∈ L (3.26) (3.4), (3.5), (3.6), (3.7), (3.8), (3.15), (3.16) Constraints (3.23) and (3.24) are linearizations of (3.21) and (3.22) and calculate the 43 radii of hubs k and m respectively for a chosen l ∈ L. Constraint (3.25) allows only one probability combination to be chosen and (3.26) is the binary constraint. We now show that for |L| large enough, the optimal location-allocation solution, Xik, for SpHCPDVL will also be optimal for SpHCPD V 2 . First off, it is good to note that there will always exist a feasible and optimal solution to both SpHCPDV 2 and SpHCPDVL , as long as |N | ≥ p > 0, since Dk, rm and β are decision variables. Definition 3.2.1 Define F as the set of location-allocation solutions Xik, for all i, k of (3.4) - (3.7) and denote an arbitrary location-allocation solution by f ∈ F . It is clear that F has finite cardinality, and is the set of feasible location-allocation solutions for each of the problems SpHCPDV 2 and SpHCPDVL . Definition 3.2.2 For each solution f ∈ F there exists an optimal value β, which is the minimum value of the objective function with respect to minimizing the longest path in the network, defined as βf and β L f for SpHCPD V 2 and SpHCPDVL , respectively. βf and βLf are found by minimizing the decision variables Dk, rm, p 1, p2 and Dk, rm, Yl for SpHCPDV 2 and SpHCPDVL , respectively, for each solution f ∈ F . Lemma 3.2.1 Solving for an optimal βf gives equality of constraint (3.13), γ = p 1p2. Proof: Remembering that T 1ik and T 2 mj are normally distributed, we know that the cumulative distributions Pr[T 1ik ≤ ζ] and Pr[T 2mj ≤ ζ] are strictly decreasing for each i, k ∈ N and m, j ∈ N . With this property in mind, it is clear that if a gap γ < p1p2 exists, reducing the values of the decision variables Dk and rm on the binding constraints of (3.11) and (3.14), respectively, will remove the gap and give a lower βf value from (3.15) for any given f ∈ F . Lemma 3.2.2 βLf → βf as |L| → ∞ ∀f ∈ F . Proof: Assume D∗k, r ∗ m, p 1∗ and p2∗ are the optimal solutions when solving for βf for a given f ∈ F denoted by X ′′ik. Also assume DLk , rLm and Y Ll are the optimal solutions when solving for βLf for the same f ∈ F and assume l′ is the value of index l ∈ L for when Y Ll′ = 1. From (3.11) we get p1∗ = min i,k∈N (Pr[T 1ikX ′′ ik ≤ D∗k]) 44 and from (3.23) we get DLk = max i∈N (t1ikl′X ′′ ik) (3.27) which from (3.19) results in P 1l′ = Pr[T 1 ik ≤ t1ikl′ ] = min i,k∈N (Pr[T 1ikX ′′ ik ≤ t1ikl′X ′′ik]) = min i,k∈N (Pr[T 1ikX ′′ ik ≤ DLk ]). The possible values of DLk , when it is binding as in (3.27), are bound by the possible choices of l′ ∈ L. However, when |L| → ∞ the possible values of DLk , when binding, becomes continuous as Dk is in SpHCPD V 2 . Therefore when |L| → ∞ | min i,k∈N (Pr[T 1ikX ′′ ik ≤ D∗k])− min i,k∈N (Pr[T 1ikX ′′ ik ≤ DLk ])| → 0 which gives |D∗k − DLk | → 0 and |p1∗ − P 1l | → 0. The same steps can be taken using constraints (3.14), (3.20) and (3.24) to show that |r∗m − rLm| → 0 and |p2∗ − P 2l | → 0. Therefore from (3.15) we get βLf → βf for any given f ∈ F . Definition 3.2.3 The set O is a subset of F which represents the optimal location- allocation solutions for SpHCPDV 2 . Lemma 3.2.3 There exists a positive gap  such that βf∗ +  < βf , ∀f ∗ ∈ O and ∀f ∈ F \ O. Lemma 3.2.3 holds since F is discrete and finite, therefore all βf ∀f ∈ F are discrete and finite. Definition 3.2.4 The set OL is a subset of F which represents the optimal location- allocation solutions for SpHCPDVL when |L| <∞. Proposition 3.2.1 OL ⊆ O when |L| is large enough. Proof: Let  be as in Lemma 3.2.3. Take L large enough such that |βLf − βf | < /2 for every f ∈ F . Now take f ∗ ∈ O and f ∈ F \ O and observe that βLf∗ < βf∗ + /2 < βf − /2 < βLf where the first and third inequalities are due to the approximation 45 property above and the middle one is from Lemma 3.2.3. This implies that f cannot be optimal for SpHCPDVL , hence the result follows. 3.2.2 Preset departure times Smaller and medium sized shipping companies, who don’t have their own airplanes, will rent cargo space from some other companies for the hub-hub link transfers (this may also include rental space on other modes of transportation such as trains). In such cases the shipping company in question has no choice on the departure times from any hub k to some other hub m. The departure time is given beforehand and the shipping company is forced to design its delivery network around these departure times. To model this we use a set of fixed parameters which represents all the preassigned departure times dkm for all possible hubs k to m. In some cases, on busy hub-hub links, there may exist a competing cargo space rental company (offering a similar rental space service) which may be of interest for the shipping company in question. This may result in a hub node having two different departure time choices from a hub k to a hub m which the shipping company wants to consider. To capture this in the model we add a ghost node to the network which has the second departure time choice to hub m. Even though the network now has an additional node it will not change the solution properties or the layout of the original network, see definition 3.2.5 and proposition 3.2.2 below for a formal description. In this model we first formulate and then solve exactly, via an equivalent linear reformu- lation, the stochastic uncapacitated single allocation p-hub center problem with preset (or fixed) departure times, SpHCPDF . SpHCPDF finds the minimum maximum travel time and the associated optimal location-allocation solution by finding the optimal val- ues for the decision variables Xik, Am and β. (SpHCPDF ) min. Xik,Am,β β s.t. γ ≤ Pr[T 1ikXik ≤ dkm ∩ (Am + T 2mj)Xjm ≤ β],∀i, j, k,m ∈ N(3.28) Am ≥ (dkm + αtkm)XkkXmm, ∀k,m ∈ N (3.29) (3.4), (3.5), (3.6), (3.7), (3.9) 46 Constraints (3.28) and (3.29) are similar to (3.2) and (3.3) respectively differing only in that the variable departure time Dk has been replaced by the fixed departure list dkm. In addition, unlike (Dk +αtkm)XkkXmm in SpHCPD V , (dkm +αtkm)XkkXmm cannot be inserted into constraint (3.28) directly since the latest arrival at hub m may not be from the same hub k as the binding link i− k due to the preset departure times. Similarly, the chance constraint (3.28) makes SpHCPDF very hard to solve. However, we can formulate an equivalent linear reformulation of the model which can be solved exactly. Since we have a list of all departure times, dkm, for all possible hubs k to possible hubs m, we can precalculate γikm = Pr[Tik ≤ dkm] for all i, k,m. In addition, since T 1ik and T 2mj are independent, we can rewrite constraint (3.28) as: γ ≤ Pr[T 1ikXik ≤ dkm ∩ (Am + T 2mj)Xjm ≤ β] = Pr[T 1ikXik ≤ dkm]Pr[(Am + T 2mj)Xjm ≤ β] ⇒ γXikXjm ≤ Pr[T 1ikXik ≤ dkm]Pr[(Am + T 2mj)Xjm ≤ β]XikXjm = Pr[T 1ik ≤ dkm]Pr[Am + T 2mj ≤ β]XikXjm = γikmPr[Am + T 2 mj ≤ β]XikXjm ⇒ γ γikm XikXjm ≤ Pr[Am + T 2mj ≤ β]XikXjm, ∀i, j, k,m ∈ N . (3.30) Due to the normal distribution we can solve for β by rewriting (3.30) as: (Am + tmj + zikmσmj)XikXjm ≤ β, ∀i, j, k,m ∈ N (3.31) where zikm is the corresponding z-value for each γ γikm from the standard normal distri- bution table. If γikm is zero it can be replaced by a very small value close to but greater than zero with no loss of precision. It is easy to see that the larger zikm is the larger β will need to be for all valid paths. To this end we introduce a new variable Zm which represents the largest zikm value at each hub m. By doing so we are able to formulate an equivalent 2-index linear formulation to SpHCPDF using the decision variables Xik, 47 Am, Zm and β. (SpHCPDFL ) min. Xik,Am,Zm,β β s.t. β ≥ Am + (tmj +M)Xjm + Zmσmj −M, ∀m, j ∈ N (3.32) Am ≥ (dkm + αtkm)(Xkk +Xmm − 1), ∀k,m ∈ N (3.33) Zm ≥ ∑ k∈N zikm(Xik +Xmm − 1) ∀i,m ∈ N (3.34) Zm ≥ 0, ∀m ∈ N (3.35) (3.4), (3.5), (3.6), (3.7), (3.9) Constraint (3.32), with the help of (3.34), is an equivalent linear formulation of (3.31) which in turn is equivalent to (3.28). Constraint (3.32) holds, since if either k or m are not hub nodes then Am and Zm will be zero from (3.33) and (3.34) respectively, and if j is not connected to m then the big constant M will make (3.32) non-binding. Constraint (3.33) finds the latest arrival at hub m and constraint (3.34) finds the larges zikm value for all incoming links to hub m. Note that we can sum over all k in (3.34) which holds from (3.4) since ∑ k∈N Xik = 1. As mentioned earlier, if any considered hub-hub link has a choice in departure times, both SpHCPDF and SpHCPDFL will capture this if we insert a ghost node into the network, representing the new departure time, without affecting the original network properties. We now formally state this for any number of different departure time choices. Definition 3.2.5 A ghost node kθ, for node k ∈ N , is a node that is added to the network which represents alternative departure times to the other potential hub nodes m ∈ N \ k. Each ghost node has identical travel times to and from all other nodes in N as the original node k. Let Gkm represent the number of departure times to choose from for a given node k ∈ N to a given node m ∈ N if k and m were to become hub nodes. There is one ghost node for each of the different possible combinations of departure times from node k (the original node k can now also be referred to as one of the ghost nodes). Gk is the set of ghost nodes kθ, ∀θ ∈ Gk, where |Gk| = ∏ m∈N\kGkm. For example, if London, Paris, Rome and New York are potential hub nodes where, from London, there exist two alternative departure times to Paris, 10:00 and 14:00, and two to Rome, 11:00 and 13:00, and one to New York, 15:00. Then node k (assuming 48 k is London) would be replaced by the four ghost nodes in Gk = {k1, k2, k3, k4} which account for all the possible departure time combinations as shown in table 3.1. Paris Rome New York dkθm 10:00 14:00 11:00 13:00 15:00 k1 x x x k2 x x x k3 x x x k4 x x x Table 3.1: Departure times from each ghost node kθ, θ = {1, ..., 4}, to nodes m, m = {Paris, Rome, New York}. If there are a lot of alternative departure time choices to account for the network size will explode. This is due to the fact that for each node k ∈ N , |Gk| grows exponentially fast resulting in a network size ∑ k∈N ∏ m∈N\kGkm. This makes the problem computationally impossible to solve. However, in practice, few nodes will have multiple possibilities and, additionally, delivery companies of this type will often work with one main contractor at a hub node and will most often not have negotiated with several contractors for the same hub-hub link resulting in different departure time choices for that link. Therefore the method described above will often work well in practice. Proposition 3.2.2 Considering the solution found from solving SpHCPDF or SpHCPDFL with added ghost nodes, we can take the following steps to get a solution design for the original network without affecting the found optimal solution. i) If no nodes in Gk have been chosen as a hub, then all nodes but one in Gk can manually be removed from the network. ii) If one node in Gk has been chosen as a hub, then all other nodes in Gk can manually be removed from the network. iii) Assuming there exists an optimal solution for p hubs which is no worse than the optimal solution found using p − q hubs for 0 ≤ q < p. Then, if there are qk > 1 nodes in Gk which have been chosen as hubs, all allocations to the hubs in Gk can manually be reallocated to a single node in Gk as follows. For each destination m find the latest departure time to m from the chosen hubs in Gk. Now choose the ghost node in Gk with these departure times. All other nodes in Gk can manually 49 be removed from the network. Additionally pick qk − 1 non-hub nodes in N to be merely self allocated hub nodes, and which do not affect β when turned into hubs. Proof i) Only one node in Gk needs to remain and since all nodes in Gk are identical, apart from the hub to hub departure times which are of no concern here, it does not matter which node in Gk remains. ii) Only one node is needed and the one chosen as the hub represents the optimal departure times and is therefore kept. iii) Looking at the example in table 3.1, assume k2 and k3 are chosen as hubs. We would then manually reallocate all allocations of k2 and k3 to node k4 and make k4 a hub node and remove k1, k2 and k3 from the network. The optimal solution remains the same due to the fact that the packages from an earlier departure, for example, k2 to Rome at 11:00, will have to wait at Rome till the arrival of the packages from the later departure from k3 to Rome which departs at 13:00, until they can continue their journey to some chosen destination node. In other words, the bottleneck is at the second hub node where all packages need to wait until the last packages have arrived before moving on with their journey. A remark on the assumption which is stated in proposition 3.2.2 iii). This assumption is not always true and for cases when it is not, it is strongly recommended that the network designer considers only using p− q hubs. Although unlikely, such a situation may occur due to the requirement that a node is allocated to itself if it is a hub node. For example if all remaining choices of nodes to become hubs all have very late departure times to one or more hub nodes. Alternatively, since the model does not allow three hub stops, the link travel times from the to be hub nodes to one or more of the other hubs may be very volatile. Both of these situations may result in a larger β when forced to choose additional hub nodes. 50 3.2.3 Numerical results In this section, we test the computational performance of the solution approaches to the new proposed models in the previous two sections. In addition we conduct a study to see how using different step sizes in SpHCPDVL affects the solution outcome for a few randomly chosen test examples. Numerical tests are conducted using the well- known test examples from two libraries of test problems, Civil Aeronautics Board Survey (CAB) [29, 49] and Australian Post (AP) [26]. Our code is written in C++, integer linear programs are solved using ILOG CPLEX Version 12.1, and all of the numerical experiments are carried out on a HP Pavilion laptop with an Intel(R)Core(TM)2 CPU processor with 3.00 GB RAM. The standard CAB and AP test examples are designed for the deterministic p-hub median problem. Recall that T 1ik and T 2 mj are assumed to be normally distributed with means of t1ik and t 2 mj and standard deviations of σ 1 ik and σ 2 mj respectively. In our runs we set t1ik and t 2 mj equal to the given values in the deterministic test libraries between each node i−k and m− j respectively. In addition we set σ1ik = νt1ik and σ2mj = νt2mj where ν is a constant called the coefficient of variation which we set to 0.5. For the service level we set γ = 0.95 for all test runs. Numerical results are shown in the tables below where we report the optimal objective value found (Obj) for each test problem (Prob) as well as the associated CPU time (CPU) in seconds. The problem name for CAB examples, tables 3.2, 3.4 and 3.5, has the format N.p.q such that α = q/10, with the exception of when q = 1 then α = q. For AP examples, tables 3.3 and 3.6, the name has a format of N.p and α = 0.75. All test problems were solved in a short time where for the larger sized problems it took CPLEX less then 20 CPU minutes to produces globally optimal solutions. This is very promising and demonstrates the effectiveness of the developed solution methods for each of the respective problems. Numerical results for SpHCPDVL Here we report the numerical results for SpHCPDVL using a step function of size L = 9 in L. This size, although small, is sufficient for the purpose of these tests as can also be seen from the study conducted in Section 3.2.3 ahead. All problems were solved with no occurring concerns for both the CAB and AP test libraries where only the larger problems moved from taking a few seconds to a few minutes to be solved optimally. 51 Prob Obj CPU Prob Obj CPU 5.3.1 1489.75 0.06 15.4.8 2845.96 3.99 5.3.6 1478.43 0.07 20.2.1 4417.66 7.48 10.2.1 3429.71 0.44 20.2.2 3743.39 17.14 10.2.2 2625.61 0.44 20.2.4 3743.39 14.28 10.2.4 2823.89 0.35 20.2.6 3990.19 16.38 10.2.6 3025.83 0.50 20.2.8 4272.88 11.84 10.2.8 3227.77 0.24 20.3.1 3715.91 8.10 10.3.1 2555.34 0.89 20.3.2 2770.80 25.16 10.3.2 2213.88 0.73 20.3.4 3075.24 16.97 10.3.4 2213.88 1.13 20.3.6 3411.74 14.90 10.3.6 2213.88 0.52 20.3.8 3480.69 14.12 10.3.8 2353.40 0.55 20.4.1 3277.30 12.44 10.4.1 2081.12 0.68 20.4.2 2578.04 26.62 10.4.2 1555.80 1.39 20.4.4 2684.15 15.41 10.4.4 1647.58 0.67 20.4.6 2692.20 10.81 10.4.6 1763.45 1.18 20.4.8 2922.56 21.71 10.4.8 1922.28 0.89 25.2.1 4914.35 18.05 15.2.1 4272.88 3.93 25.2.2 4115.67 53.87 15.2.2 3753.97 6.84 25.2.4 4214.44 25.42 15.2.4 3970.05 4.06 25.2.6 4523.95 26.61 15.2.6 4186.12 4.07 25.2.8 4758.16 18.91 15.2.8 4272.88 3.72 25.3.1 4406.42 54.27 15.3.1 3480.69 2.42 25.3.2 3480.69 81.69 15.3.2 3328.62 7.72 25.3.4 3810.75 117.93 15.3.4 3458.72 4.92 25.3.6 4052.16 52.12 15.3.6 3480.69 3.87 25.3.8 4165.62 62.34 15.3.8 3480.69 2.35 25.4.1 4064.67 39.44 15.4.1 3049.75 2.42 25.4.2 2905.56 149.67 15.4.2 2539.94 6.13 25.4.4 3210.20 270.76 15.4.4 2655.08 2.54 25.4.6 3480.69 200.97 15.4.6 2692.20 2.41 25.4.8 3743.05 96.98 Table 3.2: Numerical results for SpHCPDVL , CAB examples. 52 Prob Obj CPU Prob Obj CPU 5.2 49424 0.05 25.4 72022 53.63 5.3 30973 0.06 25.5 66953 29.83 10.2 79857 1.06 25.10 45553 5.63 10.3 59522 0.85 40.2 118406 270.62 10.4 55998 0.59 40.3 103972 828.23 10.5 45453 0.57 40.4 88333 1025.70 20.2 90874 9.00 40.5 81401 820.31 20.3 77681 29.39 40.10 52807 259.80 20.4 68648 21.20 50.2 118194 882.46 20.5 60880 12.77 50.3 102390 1077.73 20.10 40860 3.79 50.4 93490 1093.30 25.2 99611 52.51 50.5 77119 1080.28 25.3 84007 43.91 50.10 58359 1008.95 Table 3.3: Numerical results for SpHCPDVL , AP examples. Step size test for SpHCPDVL In this section we conduct a study on how the size of the step functions in SpHCPDVL affects the solution outputs. We randomly chose five test problems of varying size from the CAB library to conduct the test on. We solved each test problem using a step size of L = {9, 49, 199, 499}. In four out of the five cases, each solution using the different step sizes resulted in identical location-allocation solutions. The objective values do decrease slightly with an increasing step size, due to better found departure times, and are reported below. The fifth problem, test problem 10.3.4, gave the same location-allocation solution for L = 9, L = 49 and L = 499 but not for L = 199. However, it did give the same optimal objective value as L = 499 which itself only differed 0.01 from the other two step size solutions which were equivalent. This simply suggests that non-unique optimal location- allocation solutions exist for 10.3.4. The test results are in other words very promising and seem to suggest that SpHCPDVL may very well be finding the optimal location- allocation solution to SpHCPDV with a step size of only L = 9, from proposition 3.2.1, for these test problems. 53 Prob Obj CPU 10.2.2 L = 9 2625.61 0.44 L = 49 2625.61 0.62 L = 199 2625.60 1.15 L = 499 2625.59 2.46 15.4.8 L = 9 2845.96 3.99 L = 49 2845.96 5.85 L = 199 2845.28 12.15 L = 499 2845.15 759.17 20.4.1 L = 9 3277.37 14.81 L = 49 3277.37 9.72 L = 199 3276.48 172.23 L = 499 3276.31 62.38 25.3.1 L = 9 4406.42 54.27 L = 49 4406.42 29.45 L = 199 4404.91 235.36 L = 499 4404.61 639.33 Table 3.4: Test results for SpHCPDVL using different step sizes. Numerical results for SpHCPDFL Here we test the CPU performance of SpHCPDFL where preset departure times are used. Since no departure time data exists in the test libraries CAB or AP, we pre-calculate the departure times for each possible hub k to m. For these test runs all departures from a potential hub k are set at the same time to all possible hubs m, i.e. dkm = dkn for all m and n in N . The departure at a hub k is calculated by taking the average travel time to k from all nodes i in N and then multiplying the resulting value with a constant κ, which is set equal to 2.5, to give a feasible departure time for each k. Not surprisingly, due to the preset departure times, we see that the objective values in SpHCPDFL are larger compared to the objective values found by SpHCPD V L where the departure times are found in the solution. However, the CPU times for solving the test problems are extremely fast and SpHCPDFL can solve all problems, up to and including 50 nodes in size, in a few seconds. This is computationally very powerful for being an 54 exact solution formulation to a stochastic p-hub location problem. Prob Obj CPU Prob Obj CPU 5.3.1 2739.09 0.04 15.4.8 4733.14 0.30 5.3.6 2493.88 0.05 20.2.1 5821.31 0.61 10.2.1 4220.99 0.10 20.2.2 4853.97 0.53 10.2.2 3420.11 0.12 20.2.4 5096.31 0.63 10.2.4 3615.17 0.11 20.2.6 5337.98 0.52 10.2.6 3817.11 0.11 20.2.8 5579.64 0.55 10.2.8 4019.05 0.15 20.3.1 5078.66 0.65 10.3.1 4220.99 0.13 20.3.2 4324.26 0.47 10.3.2 3406.88 0.10 20.3.4 4384.23 1.05 10.3.4 3535.26 0.11 20.3.6 4595.33 1.01 10.3.6 3774.55 0.13 20.3.8 4837.00 0.76 10.3.8 4013.85 0.13 20.4.1 5078.66 0.75 10.4.1 4220.99 0.14 20.4.2 4324.26 0.36 10.4.2 3406.88 0.12 20.4.4 4353.66 0.85 10.4.4 3535.26 0.11 20.4.6 4595.32 0.80 10.4.6 3774.55 0.11 20.4.8 4837.00 0.64 10.4.8 4013.85 0.12 25.2.1 5718.37 1.25 15.2.1 5729.99 0.24 25.2.2 4792.52 1.42 15.2.2 4731.39 0.15 25.2.4 5045.48 1.44 15.2.4 5005.00 0.20 25.2.6 5286.22 1.73 15.2.6 5246.66 0.19 25.2.8 5502.29 1.33 15.2.8 5488.33 0.21 25.3.1 5207.32 0.92 15.3.1 4974.84 0.23 25.3.2 4677.12 1.12 15.3.2 4221.27 0.17 25.3.4 4677.12 0.52 15.3.4 4280.42 0.26 25.3.6 4723.99 0.57 15.3.6 4491.51 0.21 25.3.8 4965.65 1.77 15.3.8 4733.17 0.24 25.4.1 5207.32 0.51 15.4.1 4974.81 0.26 25.4.2 4677.12 0.80 15.4.2 4220.42 0.17 25.4.4 4677.12 1.07 15.4.4 4249.81 0.22 25.4.6 4723.99 1.35 15.4.6 4491.48 0.23 25.4.8 4965.65 1.96 Table 3.5: Numerical results for SpHCPDFL , CAB examples. 55 Prob Obj CPU Prob Obj CPU 5.2 62722 0.03 25.4 103304 1.86 5.3 71154 0.04 25.5 103304 1.51 10.2 81661 0.18 25.10 103304 4.24 10.3 79742 0.13 40.2 135659 6.45 10.4 80716 0.10 40.3 112144 9.73 10.5 81975 0.15 40.4 112144 10.39 20.2 102596 0.79 40.5 112144 31.35 20.3 92824 0.97 40.10 112144 10.50 20.4 91779 0.87 50.2 127218 13.06 20.5 91779 1.84 50.3 115588 34.08 20.10 92820 0.84 50.4 112324 37.71 25.2 109210 1.98 50.5 112324 41.65 25.3 103304 1.90 50.10 112324 30.28 Table 3.6: Numerical results for SpHCPDFL AP, examples. 3.3 The Service Cost Problem The deterministic and stochastic p-hub center problems, as discussed above, aim to min- imize the longest path in the network. This is important to consider since an important aspect for any transportation company to have, to be competitively strong, is to be able to provide fast delivery times for all origin-destination paths. The stochastic version of the problem further looks at the reliability of the paths and finds a network setup such that the maximum delivery time is not breached for at least a probability γ. For some delivery networks however, there may already exist a natural maximum delivery time, such as a 24 hour delivery service, which the transportation company wishes to meet. In these cases the delivery service aspect of the design process comes down to the reliability of delivering under such a time. In our talks with the different delivery companies it became clear that apart from a path based reliability it is also important to consider volume based reliability, i.e. to look at the total percent of packages that make it on time instead of considering the percent of times a path makes it on time. The main problem which all transportation companies face is minimizing the overall cost. When an OD pair misses its delivery time guarantee a cost is incurred on the company. 56 This cost may for example be in overtime working hours, refunds to customers due to service failure or loss of unhappy customers to the competition in the future. This goes hand in hand with the volume associated with the OD pair in question and, hence, the importance of considering the total percent of packages that miss or make their delivery time guarantees. To tackle this problem we introduce a new model, which we call the Service Cost Problem (SCP), which aims to minimize the costs involved, given a maximum delivery time, if an OD pair does not meet its delivery guarantee. We base this model in accordance with companies that do not have their own aircraft for the hub-hub links and therefore look to rent cargo space for the hub link transfers as discussed in Section 3.2.2. Due to computational complexities, we make a simplification to the network setup but one which has minimal impact on the solution quality. Instead of considering a location problem (finding hub locations and node-hub allocations) we instead consider only the allocation part of the problem for a given hub set. We find that for companies which rent cargo space this simplification does not affect the network setup by much, since such companies already know where there is space to be rented and which of these hubs they wish to utilise for their network design, hence a fixed hub set is already more or less given. 3.3.1 Stochastic formulation Working off the same notations as in Section 3.2, we define Cij to be the cost function representing the cost if an OD pair misses its service guarantee. Each of the costs are associated with the expected volume on the respective OD pairs. The objective is to minimize the sum product of Cij multiplied with the probability for the OD path not meeting the given delivery time β˜ which, for example, may be the latest arrival time needed to make the departure of the local delivery truck delivering all packages and letters to the local buildings in the area. Given the set of departure times dkm for all hub nodes in the hub set H of size p, we can calculate, prior to solving the model, the departure times Am from hub m to node j, which we here call dm. We do this by setting dm = max k (dkm + αtkm), ∀m ∈ H. 57 This lets us formulate the following single allocation stochastic program. (SCP) min. Xik ∑ i,j∈N Cij ∑ k,m∈H Pr[T 1ik > dkm ∪ dm + T 2mj > β˜]XikXjm (3.36) s.t. ∑ k∈H Xik = 1, ∀i ∈ N \ H (3.37) Xkk = 1, ∀k ∈ H (3.38) Xik ∈ {0, 1}, ∀i ∈ N ,∀k ∈ H. (3.39) The objective function (3.36) minimizes the sum of the cost for possibly missing the delivery time for each OD pair. Pr[T 1ik > dkm ∪ dm + T 2mj > β˜] is the probability that packages on path i − k − m − j are not delivered under the given maximum delivery time β˜, i.e. either link i − k will not make the departure at hub k or the departure from hub m or the travel time from m to j will get delayed missing the delivery time β˜. Constraint (3.37) makes sure each node is connected to exactly one hub where (3.38) assigns each hub to be allocated to itself. (3.39) is the binary constraint. 3.3.2 Reformulations Due to the difficulty of solving the objective function (3.36) we formulate two linear equivalent reformulations to SCP in addition to an exact separation algorithm in pursuit of finding a computationally effective solution approach. We start by making the objective function linear by adding N2 variables and N2p con- straints resulting in formulation SCP1 presented below. Next we develop a second for- mulation, SCP2, to see if one computationally outperforms the other. Due to the fact that p is often small in comparison to N , we develop this formulation using O(N2p2) variables but with a factor p fewer constraints than SCP1. Although SCP1 only has N2p+N constraints, a separation algorithm is developed based on SCP1 which initially removes N2p constraints to see if this can be computationally beneficial. A separation algorithm is not developed for SCP2 since the number of constraints here is even smaller. Due to the fact that the hub set and departure times are known we can first calculate the probability for each of the possible paths for each OD pair missing the delivery guarantee, i.e. the probability Pr[T 1ik > dkm ∪ dm + T 2mj > β˜] for all i, j in N and k,m in H. Since T 1ik and T 2mj are independent normal distributions we can easily calculate 58 γijkm where γijkm = Pr[T 1 ik > dkm ∪ dm + T 2mj > β˜] = 1− Pr[T 1ik ≤ dkm ∩ dm + T 2mj ≤ β˜] = 1− Pr[T 1ik ≤ dkm]Pr[dm + T 2mj ≤ β˜]. If however T 1ik and T 2 mj are non-normal distributions γijkm could instead be estimated via simulation and/or sampling. By replacing Pr[T 1ik > dkm ∪ dm + T 2mj > β˜] in (3.36) with γijkm we get the equivalent quadratic reformulation of SCP. (SCPQ) min. Xik ∑ i,j∈N Cij ∑ k,m∈H γijkmXikXjm (3.40) s.t. (3.37), (3.38), (3.39) We make the quadratic term in (3.40) linear by introducing a set of continuous decision variables Vij. Vij is set equal to ∑ k,m∈H γijkmXikXjm which is equivalent to Vij ≥∑ k∈H γijkm(Xik + Xjm − 1) for all m in H since each node is connected to exactly one hub. This results in formulation SCP1. (SCP1) min. Xik,Vij ∑ i,j∈N CijVij (3.41) s.t. Vij ≥ ∑ k∈H γijkm(Xik +Xjm − 1), ∀i, j ∈ N ,∀m ∈ H (3.42) Vij ≥ 0, ∀i, j ∈ N (3.43) (3.37), (3.38), (3.39) Next we formulate SCP2 which has more variables but fewer constraints than SCP1. Instead of the continuous decision variables Vij we introduce the positive continuous decision variables Zijkm which are equal to one if OD pair i− j goes via hubs k and m 59 respectively. (SCP2) min. Xik,Zijkm ∑ i,j∈N Cij ∑ k,m∈H γijkmZijkm (3.44) s.t. ∑ k,m∈H Zijkm = 1, ∀i, j ∈ N , (3.45) MXik ≥ ∑ j∈N ∑ m∈H Zijkm, ∀i ∈ N ,∀k ∈ H (3.46) MXjm ≥ ∑ i∈N ∑ k∈H Zijkm, ∀j ∈ N ,∀m ∈ H (3.47) Zijkm ≥ 0, ∀i, j ∈ N ,∀k,m ∈ H (3.48) (3.37), (3.38), (3.39) The objective function (3.44) is equivalent to (3.40), which in turn is equivalent to (3.36), since constraint (3.45) allows there to be only one path for each OD pair. Additionally, constraints (3.46) and (3.47) allow only a path to be valid if i is connected to k and j to m, respectively. M is a big constant. The separation algorithm we construct for SCP1 is similar to that in [33]. The algorithm iterates around a master problem and a subproblem which respectively solves a cut down version of SCP1 and then checks if an optimal solution has been found for SCP1. The steps of the algorithm are outlined below, and a more detailed description can be found in [33]. (SCP1-Push) Step 1 Generate an initial master problem by removing all of the constraints of type (3.42) in SCP1. Step 2 Solve the most recently generated master problem. Let X∗ik and V ∗ ij be its optimal solutions. Step 3 Solve the subproblem by checking constraint violations for (3.42) when X∗jm = 1. If there is no constraint violation an optimal solution has been found. Otherwise go to Step 4. Step 4 Generate a new master problem by adding the new constraint violations from Step 3 to the master problem. Go to Step 2. 60 3.3.3 Numerical results In this section we test the computational performance of SCP1, SCP2 and SCP1-Push. Numerical tests are conducted using an identical setup as described in Section 3.2.3. Departure times from each hub k to hub m are calculated in the same way as described in Section 3.2.3. The fixed hub set for each test problem consists of the hub nodes found from solving SpHCPDFL , presented in Section 3.2.2, for each of the respective test problems in the CAB and AP libraries. Cij is calculated by taking the average volume between each OD pair given by the test libraries multiplied with 1000. In the tables below for each test problem (Prob) we present the optimal objective func- tion (Obj) and CPU times in seconds for each of the three solution approaches SCP1, SCP2 and SCP1-Push respectively as well as the number of iterations (Iter) required to find the optimal solution using SCP1-Push. CPU performance on the CAB test problems is relatively similar for all three solution approaches as can be seen in table 3.7. For the AP test problems, table 3.8, we can see that for some of the test problems, specifically 25.10 and 40.10, SCP1 and SCP1- Push, respectively, take relatively long to solve. It is well known that different ILP formulations for the same problem can result in large variations in CPU times due to the fact that different formulations may require more or less of the branch & bound tree to be searched before an optimal solution is found, as often demonstrated with the famous pigeon hole problem [34]. It is observed from the computational runs here that solving SCP1, and the subproblems in SCP1-Push, for 25.10 and 40.10 respectively, the branch & bound method struggles to make progress, closing the gap between the lower and upper bounds, requiring a larger portion of the tree to be searched. For example, for problem 25.10, solving SCP1, over 500,000 nodes in the tree are searched before an optimal solution is found. This is compared to solving SCP2 where the initial LP relaxation of SCP2 finds an optimal integer solution straight away. This naturally has a major effect on the total CPU time required to solve the problem as can be seen in table 3.8. An additional observation is that some of the subproblems in SCP1-Push, for test problem 40.10, are requiring much more of the tree to be searched than if solving the full formulation, SCP1, directly. This is not uncommon as branch & bound solution times can sometimes vary drastically, even with only minor changes to the formulation setup. Thus the results show that using a similar “push” approach as in [33] does, for the most part, not improve on the computational performance compared to SCP1, and although all methods give impressive computational results, the winner here is SCP2. 61 Prob Obj SCP1 SCP2 SCP1-Push Iter 5.3.1 214.82 0.02 0.02 0.03 3 5.3.6 335.52 0.01 0.02 0.03 3 10.2.1 125.69 0.03 0.03 0.04 3 10.2.2 647.80 0.04 0.02 0.04 3 10.2.4 333.78 0.02 0.03 0.04 3 10.2.6 187.93 0.02 0.02 0.05 3 10.2.8 149.84 0.02 0.02 0.04 3 10.3.1 122.53 0.03 0.04 0.07 4 10.3.2 256.69 0.03 0.04 0.04 3 10.3.4 228.86 0.03 0.04 0.04 3 10.3.6 183.66 0.03 0.04 0.04 3 10.3.8 146.71 0.03 0.04 0.04 3 10.4.1 122.53 0.04 0.04 0.09 5 10.4.2 256.69 0.04 0.04 0.04 3 10.4.4 228.86 0.04 0.04 0.05 3 10.4.6 183.66 0.04 0.04 0.04 3 10.4.8 146.71 0.04 0.04 0.04 3 15.2.1 1505.59 0.05 0.08 0.07 3 15.2.2 573.61 0.04 0.06 0.06 3 15.2.4 1552.28 0.04 0.04 0.06 3 15.2.6 1534.06 0.05 0.04 0.06 3 15.2.8 1518.72 0.04 0.04 0.06 3 15.3.1 86.35 0.06 0.08 0.08 3 15.3.2 601.26 0.06 0.06 0.10 4 15.3.4 592.92 0.06 0.06 0.08 4 15.3.6 170.51 0.06 0.09 0.09 3 15.3.8 110.97 0.06 0.06 0.06 3 15.4.1 95.09 0.06 0.08 0.10 4 15.4.2 505.47 0.07 0.08 0.11 4 15.4.4 489.93 0.07 0.08 0.09 4 15.4.6 294.21 0.07 0.08 0.10 4 15.4.8 165.61 0.06 0.08 0.10 4 20.2.1 1643.08 0.15 0.21 0.16 3 20.2.2 807.88 0.08 0.09 0.09 3 20.2.4 1695.25 0.09 0.10 0.12 3 20.2.6 1674.93 0.08 0.07 0.10 3 62 20.2.8 1657.74 0.08 0.07 0.09 3 20.3.1 45.89 0.09 0.11 0.10 3 20.3.2 388.68 0.09 0.10 0.13 4 20.3.4 447.71 0.09 0.10 0.13 4 20.3.6 118.38 0.10 0.10 0.09 3 20.3.8 56.37 0.10 0.11 0.09 3 20.4.1 58.71 0.13 0.15 0.15 4 20.4.2 352.23 0.16 0.20 0.13 4 20.4.4 326.38 0.11 0.13 0.11 4 20.4.6 151.07 0.12 0.14 0.15 4 20.4.8 96.92 0.21 0.15 0.15 4 25.2.1 1218.70 0.13 0.14 0.14 3 25.2.2 480.59 0.13 0.16 0.13 3 25.2.4 593.46 0.14 0.13 0.15 3 25.2.6 1241.36 0.13 0.13 0.14 3 25.2.8 1228.65 0.14 0.14 0.14 3 25.3.1 175.97 0.18 0.17 0.21 4 25.3.2 221.11 0.27 0.25 0.29 5 25.3.4 230.72 0.18 0.18 0.20 4 25.3.6 234.12 0.19 0.17 0.20 4 25.3.8 200.00 0.19 0.18 0.19 4 25.4.1 164.42 0.42 0.23 0.61 5 25.4.2 221.04 0.59 0.24 0.91 6 25.4.4 232.24 0.71 0.25 0.56 5 25.4.6 237.62 0.27 0.25 0.35 5 25.4.8 200.29 0.27 0.24 0.36 5 Table 3.7: Numerical results for SCP CAB, examples. 63 Prob Obj SCP1 SCP2 SCP1-Push Iter 5.2 7229.01 0.02 0.02 0.03 6 5.3 3639.61 0.02 0.02 0.04 6 10.2 6743.20 0.02 0.02 0.07 6 10.3 2623.39 0.03 0.05 0.09 8 10.4 1542.18 0.04 0.04 0.06 8 10.5 1446.86 0.04 0.06 0.06 8 20.2 4275.26 0.08 0.08 0.10 6 20.3 1234.94 0.12 0.10 0.16 8 20.4 1197.98 0.18 0.16 0.25 10 20.5 1197.98 0.41 0.19 0.49 12 20.10 1081.89 0.76 0.53 0.76 14 25.2 1690.26 0.30 0.13 0.17 6 25.3 1017.37 0.23 0.18 0.37 8 25.4 850.98 0.56 0.31 0.99 10 25.5 854.45 1.67 0.33 2.54 12 25.10 850.92 206.67 0.90 19.99 22 40.2 19906.00 0.64 0.64 1.77 6 40.3 1702.62 2.51 0.86 4.59 8 40.4 1359.22 24.74 1.06 34.17 10 40.5 1377.48 39.68 2.60 69.60 6 40.10 1412.78 9.25 4.56 398.89 11 50.2 3295.57 1.47 1.49 1.94 3 50.3 1096.12 1.83 1.69 2.38 4 50.4 691.12 4.97 2.46 5.19 5 50.5 343.10 5.48 5.15 24.03 6 50.10 343.00 10.47 7.67 69.95 11 Table 3.8: Numerical results for SCP AP, examples. 3.4 Conclusion This paper presents two new methodological approaches to a stochastic uncapacitated single allocation p-hub center problem which incorporates departure time choices at hub nodes. It is shown that ignoring departure times in the model formulation can lead 64 to non-optimal design solutions. Both variable and fixed departure time choices are modeled together with effective solution reformulations. Numerical tests are conducted showing the efficiency of the solution formulations on network sizes of up to 50 nodes using the standard CAB and AP test libraries. In addition a new service cost problem is introduced for an allocation network. A stochastic formulation is given, where travel time on links are random, and three different solution approaches are presented. This problem minimizes the cost of origin-destination pairs missing their service guarantees for a given maximum delivery time. Numerical results are shown also using the standard CAB and AP test libraries. 65 Appendix Example A: Importance of considering stochastic link travel times Consider a network consists of two origin nodes, ‘A’ and ‘B’, which are delivering to the single destination node ‘C’, via one of the two possible hub nodes ‘H1’ or ‘H2’. If we consider the average travel times on the links, given in figure 3.3, we can easily find a network design to minimize the longest path. By rooting nodes ‘A’ and ‘B’ via hub node ‘H1’ to end at node ‘C’ we get the longest path to be 15 time units. Now let us assume that in reality link ‘A-H1’ has large volatility and the link travel time will often increase up to 15 or even 20 time units. This increase of travel time will then result in a missed delivery time guarantee from node ‘A’ to ‘C’. Figure 3.3: Example network. The missed service guarantee will result in several packages frequently taking up to 20-25 time units for delivery, increasing costs and the potential of losing unhappy customers in the future. If we instead incorporated the possible travel time variations on link ‘A-H1’, assuming all other links have no or very little variation, we would be better off choosing node ‘H2’ as our hub node. This will only increase the guaranteed delivery time by 0.5 time units but we avoid the not infrequent and costly delivery delays. Counter Example B: Importance of accounting for departure times Assume we are given an optimal solution to the stochastic p-hub center problem, as defined in [65, 33] where common departure times are not considered, for a network with independent normal distributions on the link travel times. We show that enforcing a common departure time at hubs can change the stochastic path length which may in turn change the optimal location-allocation solution. If we take a look at a small but critical part of the network, i.e. where the longest path is found, we note in figure 3.4 below that nodes i and i′ are allocated to hub k, k = m and node j is allocated to hub 66 m. The mean travel times and standard deviations for the random travel times Tik, Ti′k and Tmj, which can also be found in figure 3.4, are: tik = 15, σik = 5 ti′k = 10, σi′k = 8 tmj = 5, σmj = 2 Figure 3.4: Critical section of a network. We call path i− k − j path A and path i′ − k − j path B. The service level is γ = 0.95 which means for each path A and B we get: 0.95 ≤ Pr[Tik + Tmj ≤ β˜] 0.95 ≤ Pr[Ti′k + Tmj ≤ β˜] Solving for β on each path respectively gives: βA = 15 + 5 + 1.65 √ 52 + 22 = 28.8855 βB = 10 + 5 + 1.65 √ 82 + 22 = 28.6062 showing that path A is the binding path, β = βA. If this design methodology holds, then path A should still be the binding path when imposing common departure times as described in Section 3.2. We start by looking at departure times which can be chosen freely. It is easy to see that the earliest departure time, Dk, at hub k is: Dk = max(Tik, Ti′k). Due to independence we have for each path A and B: 0.95 ≤ Pr[Tik ≤ Dk]Pr[Dk + Tmj ≤ β] = γ1Aγ2 0.95 ≤ Pr[Ti′k ≤ Dk]Pr[Dk + Tmj ≤ β] = γ1Bγ2 67 To find the minimum maximum departure time we set 0.95 = γ1Aγ2 = γ1Bγ2 = γ1γ2 which gives: DkA = 15 + 5zγ1 DkB = 10 + 8zγ1 where zγ1 is the z-value from the standard normal distribution for the probability 0.95 < γ1 < 1. For link i − k to determine the departure time at hub k, zγ1 needs to be less then 5/3 since: DkA > DkB or 15 + 5zγ1 > 10 + 8zγ1 The functions 15+5zγ1 and 10+8zγ1 are linear and therefore convex. max(15+5zγ1 , 10+ 8zγ1) is therefore also convex. The function 5+2zγ2 is linear and convex in regards to the zγ2 value found from the probability γ2 = 0.95/γ1. The binding path will be determined by: min max(15 + 5zγ1 , 10 + 8zγ1) + 5 + 2zγ2 (3.49) Since the sum of two convex functions is convex, we can check if (3.49) is decreasing or increasing around the value zγ1 = 5/3. If (3.49) is decreasing it means that there is a lower function value for some zγ1 > 5/3 meaning that the link i ′ − k will determine the departure time and hence path B will be the binding path. Picking 2 close values below and 2 above 5/3 for zγ1 will show if (3.49) is increasing or decreasing which shows if the min max path is below or above z = 5/3 = 1.66666.... For zγ1 = {1.65, 1.66, 1.67, 1.68} (3.49) is equal to 34.75, 34.18, 33.80, 33.72 respectively which are all decreasing which can also be seen in figure 3.5 below. The minimum longest path will therefore use a Figure 3.5: Decreasing values. departure time at hub k determined by link i′− k, since the optimal value will be found with a zγ1 value > 5/3, making path B the binding path. This is not the same as when 68 departure times are not accounted for and hence the given solution may not be optimal when departure times are added. For fixed and given departure times, here dk, the same can easily be shown. We set dk = 23.5 which gives for each path A and B: 0.95/(Pr[Tik ≤ dk]) ≤ Pr[dk + Tmj ≤ β] 0.95/(Pr[Ti′k ≤ dk]) ≤ Pr[dk + Tmj ≤ β] Solving for β on each path respectively with dk = 23.5 gives: βA = 33.56 βB = 33.7 showing that path B is now the binding path, β = βB, and the solution found with the original model is non-optimal. 69 Chapter 4 Paper Three Modeling Risk in the Capacitated Single Allocation Hub Location Problem Edward Hult Judge Business School, University of Cambridge, Trumpington Street, Cambridge CB2 1AG, United Kingdom, {e.hult@jbs.cam.ac.uk} Abstract The capacitated single allocation hub location problem aims to design a hub and spoke network minimizing the total flow and hub setup costs. Hub capacity limits are modeled as constraints based on average flow volumes. This may however result in big capac- ity breaches at hub nodes and can lead to high costs and poor service quality across the network. We propose extending the deterministic version by introducing stochastic origin-destination flow volumes to account for the risk of breaching hub capacity lim- its. We formulate a chance constraint model incorporating the conditional value-at-risk which is a safe measure to assure capacity limits are not breached with at least a prob- ability γ. In addition, we propose a linear sample average approximation model as a strong solution method, whose solution converges to a solution of the original problem 70 with probability one, exponentially fast, as the sample size increases. Numerical re- sults are reported and confirm the importance of accounting for the daily fluctuations in origin-destination flow volumes when capacity limits exist. Keywords: Hub location; Stochastic programming; Risk; Conditional value-at-risk. 4.1 Introduction In mail, cargo, air fleet and telecommunication systems, to name a few, one aims to send a certain type of package from one origin point to another destination point, OD flow. This is often done via a hub and spoke network to increase volume on links by aggregating many low volume links and thus reducing the per leg per item cost [3, 13]. OD flow volumes vary from day to day and using chance constraints to model operational effects of hub capacities is of practical importance when working with a hub and spoke network design. The deterministic capacitated single allocation hub location problem (CSAHLP), first introduced by Campbell [11], considers the optimal design of a hub and spoke network by minimizing the total shipment costs across the network as well as the fixed hub setup costs. We refer to the sum of these two costs as the transportation cost. All nodes are allocated to exactly one hub and each hub has a capacity limit regarding the amount of incoming flow, see figure 4.1. The capacity constraint at each hub is important when modeling incoming volume since only so many flights can land at an airport or so many delivery trucks can dock at a docking station etc. at a given time. Figure 4.1: Origin - destination path. 71 This paper considers the CSAHLP with stochastic OD flows and accounts for its random volume fluctuations on a daily basis. When designing networks it is often much easier to work with, construct and solve deterministic models rather than models accounting for random or unknown events. However, deterministic models do not always repre- sent reality closely enough to be beneficial. Here for example, volume is an important parameter which is far from static but is a key component in the deterministic CSAHLP. When considering stochastic volume there are two types of situations which can be studied. The first, which is often found in stochastic facility location median problems, is the possible permanent volume changes that may take place in some future event. The second, which is what we consider here, is the fluctuation of volume generated from each origin node i on a daily basis. When designing a network based on capacitated flow volumes it is extremely important to consider this daily fluctuation on each origin node. It is important since if ignored, or assumed constant, hub nodes may often receive more flow than they are capable of handling leading to major congestion problems, soaring costs and poor services by the company. For example, if the average inflowing volume to a hub is equal to the hub’s capacity, it will lead to an overflow of volume at the hub up to 50 percent of the time. This is especially true if the volume distributions from the nodes are tail heavy since this can lead to a breach in capacity more often than 50 percent of the time. Due to this, the optimal solutions found by the existing deterministic models in the literature may in actuality give a poor network design. In addition, the few stochastic models in the literature cannot handle different variations of OD flow volumes on a daily basis and provide limited benefits for transportation companies who want to guard against high volume fluctuations of incoming flow to a hub node during a given time [44, 81, 15]. We therefore propose a new model which incorporates a risk measure to account for different volume fluctuations from different origin nodes. For the CSAHLP with stochastic OD flow volumes, we introduce a chance constraint formulation which accounts for the risk of breaching the capacity limits at each of the different hubs. We want to set the value-at-risk (VaR) to be less than or equal to each hub’s capacity, where the probability of flow exceeding the capacity is not more than γ. However, VaR is hard to solve since even if a normal distribution is assumed on the OD flows, one ends up with a non-linear binary formulation which is extremely hard to solve for any but very small sized networks. We therefore apply the risk measure conditional value-at-risk (CVaR) [57], also known as expected or mean shortfall and tail value-at-risk [58]. We here denote CVaR, for each hub k, as CV aRkγ, which states that 72 for a lower value φk, such that φk is not exceeded with a probability γ, then CV aR k γ is the expected value when φk is exceeded. Since CV aR k γ automatically calculates φk, we get the capacity at each hub to hold at least γ100% of the time when we set CV aRkγ as a capacity constraint to be less than or equal to each hub’s capacity limit. This holds since φk will always be less than or equal to CV aR k γ and is a great strength of the risk measure. In addition, we formulate a sample average approximation (SAA) method which we show also solves the CVaR model with probability one for large enough sample sizes. It is also shown that it converges exponentially fast with an increase of the sample size. Convergence statements and their proofs are not new within the SAA literature and it has been shown that, for models with specific properties, the above statement holds true [5, 39]. We show that the SAA convergence proof presented for the models in Atlason et al. [5] also holds true for our models. We can therefore find a reliable solution which solves for a safe VaR while considering the important random OD flow fluctuations. In Section 4.1.1 we review the hub and spoke literature as well as the facility location lit- erature in regards to stochastic and deterministic hub location problems and stochastic demand median problems, respectively. In Section 4.2 we model the stochastic CSAHLP with chance constraints. In Section 4.3 we formulate the SAA model and show that the solution found for the SAA model converges to the solution of the original model with an increasing sample size. In Section 4.4 we give an equivalent but smaller sized linear formulation which can be solved using any standard mixed integer linear programing solver. In Section 4.5 several numerical tests are conducted on up to 50 node networks. Simulation runs are also conducted on the optimal network setups found from our pro- posed model as well as the optimal solution from the existing deterministic models, which confirm the importance of modeling stochastic volume when considering hub capacities. 4.1.1 Literature review The capacitated single allocation hub location problem was first introduced by Campbell [11] who formulated a linear model with O(n4) variables and constraints. This model was later computationally improved by Ernst et al. [28] who used the uncapacitated p-hub median model from Ernst et al. [25] to formulate a tighter CSAHLP model. This formulation has been the computationally fastest MILP formulation, but was shown recently by Correia et al. [17] to be incomplete for some problems, giving infeasible values 73 to some variables. To fix this they added an additional set of constraints. A variation to the standard problem is presented by Costa et al. [18], who remove the capacity constraint and instead add a second objective function to the formulation minimizing the time a hub takes to process the incoming flow. This was done to try to spread out the volume more cost beneficially without using a hard constraint on the maximum capacity. For stochastic hub and spoke network design there is a limited pool of existing research. There are a few papers on stochastic hub location problems but none of these accounts for different variations of stochastic volume originating from each origin node when ca- pacity constraints are present. Marianov and Serra [44] build an M/D/c queuing model for the number of airplanes waiting to land assuming Poisson arrivals of airplanes. They solve the capacitated multiple allocation hub location problem with capacity constraints on the number of airplanes allowed in the queue for some probability. The queuing model calculates the capacity beforehand and then they set the average incoming flights to each hub to be less than or equal to the found maximum capacity. They develop a heuristic algorithm based on tabu search and solve up to 30 node networks. Yang [81] looks at an air freight uncapacitated multiple allocation hub location problem with stochastic demand and stochastic discount factors on the hub-hub links. They formu- late a two stage model where the first is a deterministic model, taking into account the solution found from the second stage model, and solves for the number and locations of hub nodes. The second stage solves the flight routing problem considering the stochas- tic properties. This is reformulated into one extended model considering three discrete scenarios of the stochastic outcomes. They look at and solve a case study of a 10 node air freight network in Taiwan and China. Contreras et al. [15] look at the stochastic uncapacitated hub location problem with random demand and random transportation costs. They show that considering stochastic demand, when no capacity constraints exist, or stochastic dependent transportation costs results in the same models as the respective deterministic models. They then move on to model the uncapacitated hub location problem with stochastic independent transportation costs which they approx- imately solve, using Monte-Carlo simulation coupled with Benders’ decomposition, for up to 50 node networks. None of the existing hub location papers, however, consider the daily fluctuations of the OD flow volumes for networks which have capacity constraints on the incoming flow to each hub. Since many hub and spoke networks have constraints on the amount of flow their hub nodes can efficiently handle, this is a significant gap in the literature. Filling this gap is the main contribution of this paper. 74 In addition to the three hub location problems, Sim et al. [65] consider stochastic travel times for the single allocation p-hub center problem. They assume independent normal distributions on the travel times for each link and develop heuristic methods to approximately solve network sizes up to 40 nodes. Hult et al. [33] look at the same stochastic p-hub center problem as in [65] but present exact solution methods based on preprocessing and a separation algorithm to solve up to 50 node networks. There is a much larger pool of stochastic research done in the facility location literature. Looking at the literature using uncertain demand for median type problems there is one big difference however, which is that most facility location problems look at how demand might permanently change in the future and thus change the cost of flow through the network, differing from the every day fluctuations we are interested in. Earlier work on the stochastic demand p-median facility location problem used a mean outcome model where the expected volume was calculated from different possible scenarios of how future demand may look, see for example Sheppard [64] and Weaver and Church [78]. Louveaux [43] presents a similar version of the p-median problem but includes capacity constraints and instead maximizes the expected utility of profit. Carbone [14] minimizes a chance constrained upper bound on the total demand weighted distance, choosing p facilities to hold for a given probability. Zhou and Liu [83] present a few different risk models for the capacitated median facility location problem. Demand is stochastic and due to the capacity constraints at the facilities, customers may not only be supplied by the nearest factory. They set up three different formulations, an expected value model which is linear, a chance-constrained model which provides a confidence level β (minimizing the value which the cost needs to be less than for a probability β), and also a dependent- chance model which maximizes the probability that the cost does not exceed some given value. They then solve these approximately by using a hybrid intelligent algorithm consisting of simulations, a genetic algorithm and the simplex algorithm. Wen and Iwamura [79] solve the same stochastic facility location-allocation problem as [83] above. They model it as a chance-constrained model as well but under the Hurwicz criterion, i.e. a weighted balance between minimizing the value which the cost should be less than for some confidence level and the min-max value which the cost should be more than for the same confidence level. They then solve this model using fuzzy simulation. Wang [76] uses probabilistic demand constraints for the median location problem, minimizing the total travel cost with unknown demand and then solves this exactly by considering all possible combinations of random demands. This model can however only be used to solve small networks. He also approximates the problem by assuming normal distributions but 75 exact solution times are still very large for small sized problems. Albareda-Sambola et al. [2] formulate the facility location problem with Bernoulli demands. They set capacity constraints for the facilities and use a penalty cost if demand exceeds the capacity. They also require a minimum number of customers to be allocated to a facility for it to be opened. They solve the resulting stochastic assignment problem both by using simulation and a heuristic algorithm. Wagner et al. [75] formulate the uncapacitated median facility location problem with stochastic demands. They model it using the risk measure value-at-risk borrowed from the finance literature which minimizes the value the cost needs to be less than or equal to for a given probability/risk. They solve this using simulated data and by deriving a convex nonlinear program and design a branch-and-bound algorithm. Their initial model can solve problems up to 25 nodes in size, they can however solve problems up to 40 nodes in some special cases. There are many different variations to the stochastic facility location problem that consider different objective functions and/or other stochastic properties in comparison to the median type problems mentioned here. The interested reader is referred to, for more details on different stochastic facility location problems, the literature review paper by Snyder [69]. 4.2 The Stochastic Capacitated Single Allocation Hub Location Problem We here introduce the CSAHLP with stochastic OD flows using chance constraints, V aRkγ. To make this computationally feasible, we immediately remodel via CV aR k γ. This safely sets V aRkγ less than or equal to the capacity limit at each hub for a given probability. We refer to this model as the stochastic capacitated single allocation hub location problem with chance constraints or, stochastic capacitated hub location problem (SCHLP), for short. Define a set of nodes N to be allocated to an unknown set of hub nodes H ⊆ N . Each node i is allocated to exactly one hub and all hub nodes are assumed to be fully connected. The shipment cost per unit flow from node i to k is represented by tik and Xik is the binary variable which is equal to one if and only if node i is allocated to k for all i, k ∈ N . Since each node is allocated to itself if it is a hub node Xkk equals one if node k is a hub. Zijkm are the binary variables which are equal to one if and only 76 if the path OD pair i − j takes is via hub nodes k and m. χ, δ and α are economies of scale where α, which is less than or equal to both χ and δ, is the economy of scale for consolidation. Fk is the fixed hub setup cost at node k and Γk is the maximum capacity at hub k. Let ζ denote the probability distribution of the random flow volume where Wij(ζ) is the stochastic volume of flow between OD pair i − j. Wij is the mean and σ˜2ij is the variance of Wij(ζ). We also define Oi(ζ) as the stochastic volume of flow generated at origin node i where Oi(ζ) = ∑ jWij(ζ). Oi is the mean and σ2i is the variance of Oi(ζ). Similarly we define Dj(ζ) as the stochastic demand of flow arriving at node j where Dj(ζ) = ∑ iWij(ζ) and Dj is the mean and σ̂2j the variance of Dj(ζ). Additionally Hk( ~O(ζ), ~Xk) is the total volume flowing into hub k which is dependent on the decision variables Xik, Hk( ~O(ζ), ~Xk) = ∑ i∈N Oi(ζ)Xik. V aRkγ(Hk( ~O(ζ), ~Xk)) is the value-at-risk for SCHLP, which is the value, depending on the chosen hub k, which the total volume flowing into the hub will not exceed by a probability γ. Similarly the conditional value-at-risk, CV aRkγ(Hk( ~O(ζ), ~Xk)), is the expected volume of flow into each hub k conditional on flow exceeding V aRkγ(Hk( ~O(ζ), ~Xk)). We formulate SCHLP by setting CV aRkγ(Hk( ~O(ζ), ~Xk)) as a constraint. The idea with the model is to make sure capacity holds at each hub k at least γ100% of the time. Setting CV aRkγ(Hk( ~O(ζ), ~Xk)) less than or equal to the capacity of hub k ensures that the associated VaR constraint holds, namely, capacity is not breached more than γ100% of the time. This is actually very robust with respect to minimizing the chance of breaching the capacity constraint when high values of γ are used, see Section 4.5.3. In addition to approximating V aRkγ(Hk( ~O(ζ), ~Xk)), CV aRkγ(Hk( ~O(ζ), ~Xk)), used as a constraint, has an added benefit. It guards against extreme volume events the (1− γ)100% of the time the capacity does not hold, by, for example, denying several origin nodes with tail heavy flow distributions to be assigned to the same hub. We now formulate SCHLP using the decision variables Xik and Zijkm. 77 (SCHLP) min. Xik,Zijkm E[ ∑ i,k∈N tikXik(χOi(ζ) + δDi(ζ))] + E[ ∑ i,j,k,m∈N αtkmWij(ζ)Zijkm] +∑ k∈N FkXkk (4.1) s.t. 0 ≤ ΓkXkk − CV aRkγ(Hk( ~O(ζ), ~Xk)), ∀k ∈ N (4.2)∑ k∈N Xik = 1, ∀i ∈ N (4.3) Xik ≤ Xkk, ∀i, k ∈ N (4.4)∑ k,m∈N Zijkm = 1, ∀i, j ∈ N (4.5) Zijkm ≤ Xik, ∀i, j, k,m ∈ N (4.6) Zijkm ≤ Xjm, ∀i, j, k,m ∈ N (4.7) Zijkm ∈ {0, 1}, ∀i, j, k,m ∈ N (4.8) Xik ∈ {0, 1}, ∀i, k ∈ N (4.9) The objective function consists of three terms which all define the transportation cost. The expected shipment cost on node-hub links and hub-node links are respectively cal- culated in the first term. The expected shipment cost on the hub-hub links is calculated in the second term and the fixed hub setup cost is calculated in the third term. Con- straint (4.2) is the approximate chance constraint and sets CV aRkγ(Hk( ~O(ζ), ~Xk)) to be less than or equal to the capacity at each hub k. Constraint (4.3) allows each node to be allocated to exactly one hub node and (4.4) forces each node allocation to be to a hub node. Constraint (4.5) makes sure only one path is chosen for each OD pair and constraints (4.6) and (4.7) make sure i is connected to k and j is connected to m for the chosen OD path. Constraints (4.8) and (4.9) are the binary constraints. A few remarks on the model are in order. First, as mentioned above, any feasible hub allocation flows through the network will not violate any hub constraints with at least a probability γ. This can be seen visually in figure 4.2 where CVaRγ is the average value of the shaded area. Second, one cannot guarantee that a capacity breach will never occur since extreme instances do happen, but by increasing γ the chance of a breach is lessened. This however may in turn increase the transportation cost setup and γ needs therefore to be chosen in regard to the user’s risk preference. Thirdly, similar to the deterministic models, it is assumed that at each node the origin and demand volumes are similar and therefore here it is also assumed that they have the 78 Figure 4.2: Cumulative distribution graph for total volume flowing into a hub, showing the VaRγ , CVaRγ and the hub capacity, Γ. same probability distributions. Thus if a hub k can handle the inbound flow from all nodes i which are allocated to it, it can also later handle all the flow arriving to it from the other hub nodes due to the same total inbound and outbound flow volumes. However if this is not the case then one can simply modify the above formulation by adding the additional set of risk constraints 0 ≤ ΓkXkk − CV aRkγ(H ′k( ~D(ζ), ~Xk)) for all k ∈ N , where H ′k( ~D(ζ), ~Xk) = ∑ i∈N Di(ζ)Xik. This holds since the total transfer volume arriving to a hub from all other hubs will equal the total outbound volume,∑ i∈N Di(ζ)Xik, it delivers. Since we are not concerned with possible permanent changes to flow volume in the future but are instead considering daily fluctuations, the total shipment cost function can be reformulated as a deterministic function using average flow volumes without loss of precision. This is due to the fact that the location-allocation solution found will, over time, average out the fluctuating transportation costs associated with the random volumes. From basic probability theory we know that for some random variables V1 and V2 and for some constants (or fixed decision variable) a1 and a2, the following expectation holds: E[a1V1 + a2V2] = a1E[V1] + a2E[V2] Therefore we can reformulate the first two terms in the objective function in (4.1) in the same way. 79 E[ ∑ i,k∈N tikXik(χOi(ζ) + δDi(ζ))] + E[ ∑ i,j,k,m∈N αtkmWij(ζ)Zijkm] + ∑ k∈N FkXkk =∑ i,k∈N tikXik(χE[Oi(ζ)] + δE[Di(ζ)]) + ∑ i,j,k,m∈N αtkmE[Wij(ζ)]Zijkm + ∑ k∈N FkXkk =∑ i,k∈N tikXik(χOi + δDi) + ∑ i,j,k,m∈N αtkmWijZijkm + ∑ k∈N FkXkk (4.10) This allows us to use the average values for the shipment costs in CSAHLP which can be formulated as (SCHLP2) min. Xik,Zijkm ∑ i,k∈N tikXik(χOi + δDi) + ∑ i,j,k,m∈N αtkmWijZijkm + ∑ k∈N FkXkk s.t. 0 ≤ ΓkXkk − CV aRkγ(Hk( ~O(ζ), ~Xk)), ∀k ∈ N (4.3)− (4.9) Even though we can use average values for the shipment costs we still have the condi- tional value-at-risk constraint (4.2) to deal with which is the topic of the next section. 4.3 Sample Average Approximation Financial models using CVaR can efficiently be estimated for any formulation by using data sampling to give an LP approximation, see for example [4, 42, 57, 58]. For SCHLP2 the same can be done. Since the CVaR constraint (4.2) is hard to deal with directly, we can estimate the CVaR function by taking S data samples from Oi(ζ) for all i ∈ N . This is also referred to as a sample average approximation (SAA) method. SAA problems are well studied in the literature where many papers look at convergence rates for different problems, see for example [5, 39, 77]. Taking S samples of Oi(ζ), denoted as Ois(ζs) for s ∈ S = {1, . . . , S}, we get, CV aR k γS(Hk( ~Os(ζs), ~Xk)). CV aR k γS(Hk( ~Os(ζs), ~Xk)) is a SAA of S independent and identically distributed realizations of CV aRkγ(Hk( ~O(ζ), ~Xk)) (given explicitly in (4.17) below), where ζ is replaced by its SAA ζ1, . . . , ζS and CVaR is applied to the corre- sponding discrete approximation Hk( ~Oi1(ζ1), ~Xik), . . . , Hk( ~OS(ζS), ~Xk) of the original 80 distribution Hk( ~Oi(ζ), ~Xik). This is the average value of these quantities conditional on not falling below their 100γ percentile. In Section 4.4 we show that for any given Xik solution Pr[ lim S→∞ CV aR k γS(Hk( ~Os(ζs), ~Xk)) = CV aR k γ(Hk( ~O(ζ), ~Xk))] = 1, ∀k ∈ N . Thus we can formulate SCHLP2 as a SAA model, SCHLP2S, as (SCHLP2S ) min. Xik,Zijkm ∑ i,k∈N tikXik(χOi + δDi) + ∑ i,j,k,m∈N αtkmWijZijkm + ∑ k∈N FkXkk s.t. 0 ≤ ΓkXkk − CV aRkγS(Hk( ~Os(ζs), ~Xk)), ∀k ∈ N (4.11) (4.3)− (4.9) 4.3.1 Convergence Our purpose is to show that the convergence framework of [5] applies above, hence the following result holds: Theorem 4.3.1 The optimal solutions of SCHLP2S are a subset of the optimal solutions of SCHLP2 with probability 1 when S is large. Furthermore a solution of SCHLP2S con- verges to a solution of SCHLP2 with probability 1 exponentially fast when S is increased, for example, Pr[|CV aRkγS(Hk( ~Os(ζs), ~Xk))− CV aRkγ(Hk( ~O(ζ), ~Xk))| ≥ ] ≤ ae−bS for some constants , a, b > 0 that are independent of S. We assume, as in [19] which is used in [5], that ζ ∈ Rm denotes any random variable that takes values in Rm and for which there exists c > 0, θ0 > 0, η(·) : Rm → R1 such that |Hk( ~O(ζ), ~Xk)| ≤ cη(ζ), E[eθη(ζ)] <∞, for any solution Xik and for all 0 ≤ θ ≤ θ0. This is in accordance with assumptions of Theorem 3.1 of [19] (to the expectation of (4.17), the SAA of the CV aRkγ(Hk( ~O(ζ), ~Xk)), given later in Section 4.4). 81 In [5] a generalization of the theorem above is shown to hold in the context of a call center model where the number of incoming calls are stochastic. Note that [5] insist on integer distributions but Theorem 4.3.1 above holds because [5] is built on [19] which allows for any distributions ζ satisfying the above conditions. The two models proposed in [5], which we here call G1 and G2, are defined below where G1 is the original model with chance constraints and G2 is the SAA model of G1. (G1) min. y f(y) s.t. y ∈ Y g(y) ≥ 0 (G2) min. y f(y) s.t. y ∈ Y gn(y) ≥ 0 In [5] they define the finite set Y as Y = {y ≥ 0 and integer : ∃ 0 ≤ x ∈ X and integer withAx ≥ y}. In Y , X is a compact set of integer decision variables and the entries in the matrix A are either 0 or 1. The function f(y) is defined as the linear cost function f(y) = min. {x≥0,x integer:x∈X,Ax≥y} cTx where c represents some positive cost parameters. g(y) is a function whose ith component is the expected value gi(y) = E[Gi(y, ξ)], where ξ is assumed to be a distribution of integer valued random variables. Additionally Gi(y, ξ) is defined, for each period i, as Gi(y, ξ) = Si(y, ξ) − γiN i(ξ) where Si(y, ξ) is the number of calls answered within a pre-specified time limit, N i(ξ) is the number of incoming calls and γi is the minimum acceptable service level. It is assumed an algebraic expression of g(y) is complex and not available. Therefore instead of solving (G1) directly, SAA is used to formulate (G2) using n samples from g(y) by simulation and by taking n samples from the distribution of ξ to get gn(y) = 1 n ∑n s=1 G(y, ξ s) where gin(y) is a sample mean of n independent and identically distributed realizations of gi(y) and is the ith component of gn(y). G1 and G2 are linear with the exceptions of g(y) and gn(y). Additionally they assume g(y) to 82 be finite for all y ∈ Y and that a global optimum exists for G1. It is easy to see that SCHLP2 and SCHLP2S have the same properties as G1 and G2, respectively, where ζ replaces ξ. The objective functions and constraints are all linear except for the chance constraint where the chance constraints can be estimated via sampling. Both model pairs also have feasible sets consisting of finitely many integer points. We also assume the capacity constraints are relaxed enough to allow for a feasible solution to exist for SCHLP2 and thus a global optimal solution must also exist. Therefore, since theorem 4.3.1 holds when applied for G1 and G2 as proven in [5], it will also hold for SCHLP2 and SCHLP2S. 4.4 Solution Method We know from the previous sections that, with probability 1, SCHLP2 can be solved exactly through SCHLP2S for S large enough. There are however a lot (quadratically many in the number of nodes) of variables and constraints in SCHLP2S which will cause computational difficulties even if S is small. We therefore propose to reformulate both SCHLP2 and SCHLP2S into two equivalent and computationally more tractable reformu- lations, SCHLP3 and SCHLP3S, which respectively find the same optimal network designs as SCHLP2 and SCHLP2S. Note we introduce SCHLP 3 not for any direct computational purpose, but to make a notational bridge between SCHLP2 and the linear SAA sub- problem SCHLP3LS , defined at the end of this section, that is essential for computational tractability. We first start by removing the binary variables Zijkm which also allows the removal of the constraints (4.5), (4.6), (4.7) and (4.8) since their only purpose it to define the feasible values of Zijkm. The purpose of Zijkm was to help account for the volume on each hub link in the objective function by considering which hubs each origin and destination nodes were connected to. This can however also be done by introducing the continuous decision variables Y ikm, which each represent the volume on hub-hub link k−m originating from node i. The use of the decision variables Y ikm for the deterministic capacitated single allocation hub location problem was first introduced in [28] and later completed in [17]. 83 SCHLP2 can be reformulated into the following equivalent formulation. (SCHLP3) min. Xik,Y i km ∑ i,k∈N tikXik(χOi + δDi) + ∑ i,k,m∈N αtkmY i km +∑ k∈N FkXkk (4.12) s.t. ∑ m∈N Y ikm − ∑ m∈N Y imk = OiXik − ∑ j∈N WijXjk, ∀i, k ∈ N (4.13)∑ m∈N ,m 6=k Y ikm ≤ OiXik, ∀i, k ∈ N (4.14) Y ikm ≥ 0, ∀i, k,m ∈ N . (4.15) (4.2), (4.3), (4.4), (4.9) Recall that (4.2) is the probabilistic constraint 0 ≤ ΓkXkk − CV aRkγ(Hk( ~O(ζ), ~Xk)), ∀k ∈ N . Constraints (4.13) and (4.14) assign the hub- hub link flows to Y ikm and (4.15) are the variable constraints. This formulation can similarly be used for the SAA model SCHLP2S by replacing (4.2) with (4.11) resulting in (SCHLP3S ) min. Xik,Y i km ∑ i,k∈N tikXik(χOi + δDi) + ∑ i,k,m∈N αtkmY i km +∑ k∈N FkXkk s.t. (4.3), (4.4), (4.9), (4.11), (4.13), (4.14), (4.15) Next we consider how to calculate CV aR k γS(Hk( ~Os(ζs), ~Xk)). Rockafellar and Uryasev [57] gave a linear approximation reformulation of min. x CV aRγ(f(x,y)) where x is a vector of decision variables and y is a vector of random variables, by sampling the data of y generating a collection of sampled vectors ys of size S. min. φ,x φ+ 1 S(1− γ) ∑ s∈S Θs s.t. Θs ≥ f(x,ys)− φ, ∀s ∈ S Θs ≥ 0, ∀s ∈ S 84 φ represents the VaRγ value which is a great strength of the reformulation, since VaRγ can otherwise be hard to find in and of itself. Rockafellar and Uryasev [58] showed that the same reformulation can be made when CVaR are constraints. To this end we derive an algebraic expression for V aRkγ(Hk( ~O(ζ), ~Xk)) and CV aRkγ(Hk( ~O(ζ), ~Xk)) for each hub k as V aRkγ(Hk( ~O(ζ), ~Xk)) : Φkγ = min{φk : Pr[ ∑ i∈N Oi(ζ)Xik ≤ φk] ≥ γ} (4.16) CV aRkγ(Hk( ~O(ζ), ~Xk)) : Ψkγ = E[ ∑ i∈N Oi(ζ)Xik| ∑ i∈N Oi(ζ)Xik ≥ Φkγ] = Φkγ + E[ ∑ i∈N OiXik − Φkγ| ∑ i∈N Oi(ζ)Xik ≥ Φkγ] = Φkγ + 1 1− γE[( ∑ i∈N OiXik − Φkγ)+] = Φkγ + 1 1− γE[max(0, ∑ i∈N Oi(ζ)Xik − Φkγ)].(4.17) This allows us to calculate CV aR k γS(Hk( ~Os(ζs), ~Xk)) by solving φk + 1 S(1− γ) ∑ s∈S max(0, ∑ i∈N OisXik − φk), ∀k ∈ N (4.18) where φk is the variable in Φ k γ from (4.16). It is now easy to see that for any given Xik solution Pr[ lim S→∞ CV aR k γS(Hk( ~Os(ζs), ~Xk)) = CV aR k γ(Hk( ~O(ζ), ~Xk))] = 1, ∀k ∈ N since from the strong law of large numbers we know that Pr[ lim S→∞ 1 S ∑ s∈S max(0, ∑ i∈N OisXik − φk) = E[max(0, ∑ i∈N Oi(ζ)Xik − φk)]] = 1, ∀k ∈ N . From (4.18) we can reformulate SCHLP3S as an equivalent linear SAA model SCHLP 3L S . 85 (SCHLP3LS ) min. Xik,Y i km,φk,Θks ∑ i,k∈N tikXik(χOi + δDi) + ∑ i,k,m∈N αtkmY i km + ∑ k∈N FkXkk s.t. ΓkXkk ≥ φk + 1 S(1− γ) ∑ s∈S Θks, ∀k ∈ N (4.19) Θks ≥ ∑ i∈N OisXik − φk, ∀k ∈ N , ∀s ∈ S (4.20) Θks ≥ 0, ∀k ∈ N , ∀s ∈ S (4.21) (4.3), (4.4), (4.9), (4.13), (4.14), (4.15) Constraints (4.19), (4.20) and (4.21) are the linear equivalent constraints of (4.11) where CV aR k γS(Hk( ~Os(ζs), ~Xk)) is defined in and linearized from (4.18). Θks are continuous variables. This model is now in the format of a standard mixed integer linear pro- gram, and thus solving it can be attempted using any commercial mixed integer linear programming solver. 4.5 Numerical Results The computational performance of SCHLP3LS is tested in this section using the well- known Australian Post (AP) test examples [26] which have data regarding the costs of setting up hubs as well as the hubs’ maximum capacities. For each test set |N | = {10, 20, 25, 40, 50}, problems with both tight (T) and loose (L) fixed hub costs as well as both tight (T) and loose (L) volume capacities are considered. The (T) costs have a higher hub setup cost at the nodes which have high volume flows, making it more difficult for the model to nominate these as hub nodes which would otherwise be a natural selection. Similarly the (T) capacity problems have stricter capacity constraints. The combination of (T) and (L) costs and capacities results in four test problems for each network size. Each test problem is named after its size and type of cost and capacity, for example 10TL represents the test problem where |N | = 10 using (T) costs and (L) capacities. The sampled sets were generated in C++ by randomly selecting the total outgoing volume for each origin node i from a uniform distribution between 0.5Oi and 1.5Oi. A different seed was used to generate different random data for each run based on the computer clock time. All code is written in C++ and is solved using ILOG CPLEX Version 12.1, and the numerical experiments were carried out on a HP Pavilion laptop 86 with an Intel(R)Core(TM)2 CPU processor with 3.00 GB RAM. Each run is allowed a maximum of 7200 CPU seconds (2 hours) before termination to find an optimal solution to SCHLP3LS . The 2 hour time limit was deemed appropriate after conducting numerical experiments on the CPU times, for the test problems with network size |N | = 50, where no additional solutions were found when allowing up to a 3 hour time limit. In Section 4.5.1 we look at the performance of SCHLP3LS in regards to CPU time. In Section 4.5.2 we check how much the optimal solution fluctuates and how CPU time is affected by the sample size. In Section 4.5.3 we compare the optimal solutions found from the deterministic CSAHLP and SCHLP3LS by simulating the OD flow. In the tables below we report the objective function value found (Obj) and also the CPU time in seconds (CPU) needed to find the optimal solution. In Section 4.5.3 we also report from the simulation runs the total network setup cost (Cost), the difference of cost between the deterministic solution and that found by SCHLP3LS (Cost diff) as well as the overall average breach in capacity on a daily basis (Aver breach) and the difference of breaches between the models (Aver breach diff). In addition we report with what probability the capacity will hold for each hub for each model as well as the expected size of the capacity breach if one does occur. 4.5.1 Main run Here we look at the results from all standard test examples described above for γ = 0.90 and γ = 0.95. To reduce the computational burden we sacrifice slightly on the likelihood of getting an optimal solution by reducing the sampled data set size. For network sizes |N | = {10, 20, 25} the sample set size for each node i is set to S = 400 and for |N | = {40, 50} the sample set size is set to S = 200. The sample sizes 400 and 200 are specifically chosen after conducting empirical experiments, which are discussed in the following section and shown in figures 4.3 and 4.4 below, and are a trade off between the likelihood of finding the true optimal solutions and the required CPU times for solving the different sized problems. In other words, larger sample sizes could have been chosen but at the cost of larger CPU times with relatively little solution quality gain. However, for real life situations it is worth noting that running a solution for a couple of days is worthwhile to assure a higher success rate of finding a true optimal solution. As can be seen in tables 4.1 and 4.2 solving SCHLP3LS works very well. It is clear that the test problems with tight capacities are generally harder to solve than the ones with loose 87 capacities and no optimal solutions could be found for the test problem 50TT within the allowed time frame. We also observe that there is a large time difference between solving test problem 20LT, as shown in table 4.1, when solving using γ = 0.90 compared to γ = 0.95. From the numerical tests we find that solving 20LT, with γ = 0.90, not only requires more of the branch & bound tree to be searched (approximately 60 nodes compared to 15 nodes when γ = 0.95) but also that the LP relaxation takes longer to solve. This can happen since the simplex method, applied in the branch & bound algorithm, is not a polynomial-time algorithm [55] and will sometimes require longer run times before an optimal solution is found, even for small changes in the data. We also see, as expected, that the objective value increases slightly (or stays the same) when using a higher value of γ. γ = 0.90 γ = 0.95 Prob Obj CPU Obj CPU 10LL 224250 2.72 224250 1.99 10LT 260064 8.48 260064 7.87 10TL 263400 3.69 263400 4.01 10TT 264616 4.29 274884 7.91 20LL 234691 5.47 234691 5.47 20LT 265606 1303.29 265639 56.71 20TL 271128 6.47 271128 6.09 20TT 321685 121.39 325062 96.05 25LL 245183 26.14 245183 28.33 25LT 289943 1533.84 289943 2016.00 25TL 310318 13.46 310318 13.48 25TT 360815 2234.58 362687 2385.47 Table 4.1: Numerical results for AP examples 10LL - 25TT, S = 400. 88 γ = 0.90 γ = 0.95 Prob Obj CPU Obj CPU 40LL 242114 56.08 242114 55.01 40LT 294407 2200.66 295251 1475.11 40TL 306461 103.47 306461 178.9 40TT 386788 5791.23 387090 1250.65 50LL 238595 289.69 239817 686.06 50LT 280582 2905.48 280582 478.79 50TL 324277 1628.93 324975 2870.41 50TT * * * * Table 4.2: Numerical results for AP examples 40LL - 50TT, S = 200. 4.5.2 Effects of using different sample sizes The more data we sample in SCHLP3LS , the more hopeful we are that the location- allocation part of the solution to this problem also solves SCHLP3. To see how sensitive the optimal solution is in regards to the sample size we solve SCHLP3LS for the test problems 10LT and 20TT, for sample sizes S = {100, 200, 400, 600, 800}, using γ = 0.95, 100 times each using different sampled data sets each time by applying a different seed to the function generating the random data. Figure 4.3: Number of times an optimal solution was found from 100 different runs for each sample size S = {100, 200, 400, 600, 800}. 89 Figure 4.3 presents the optimal solutions found from running SCHLP3LS 100 times using different sampled sets for each predetermined sample size. For test problem 10LT only two different optimal solutions were found (as can be seen on the horizontal axis) from the 500 total runs. For S = 800 the higher objective value solution was found 99 out of the 100 times solved followed closely by the solutions using S = 600 and S = 400 which found the same solution 95 and 94 times respectively. For the test problem 20TT there was more variation to the optimal solutions found specifically for S = 100. However it is also clear here that there is a most frequent solution found, specially for sample sizes of 200 or more. This suggests that the solutions with an objective value of 260064 and 325062 for 10LT and 20TT, respectively, are very possibly the optimal solutions to SCHLP3 found by SCHLP3LS since they are increasingly found when using a larger sampled data set size. To see how CPU time is affected by the sample size, a test was conducted on the same two test problems as above, averaging the solution CPU time over ten runs using different random sample sets. The test was conducted for sample sizes S = {100, 200, 300, 400, 500, 600, 700, 800, 900, 1000} and the CUP times follow a slight non- linear increasing curve which can be seen in figure 4.4 below. We observe that this is better than expected since it is often believed that solution times, for integer linear pro- grams, in the worst case, will increase exponentially as the problem dimension increases [55]. Figure 4.4: Average solution times over ten different runs for each sample size S = {100, 200, 300, 400, 500, 600, 700, 800, 900, 1000}. 90 4.5.3 Stochastic vs. deterministic solutions In this section we look at the difference between the optimal solutions found by the deterministic CSAHLP and the SCHLP3LS . We solve the four different test problems for |N | = 20. For SCHLP3LS we test both γ = 0.95 (SCHLP3LS -0.95) and γ = 0.85 (SCHLP3LS -0.85) using a sampled set size S = 400. We then simulate the origin volumes, Oi(ζ) at each node i, 10,000 times from the same distribution as the sampled data, i.e. a uniform distribution between 0.5Oi and 1.5Oi, which we run on the optimal network setups found by each model. This makes it possible to calculate the probability for a capacity breach to occur at each hub as well as what the average size of the breach is expected to be. Table 4.3 presents the results of the simulations and shows what hubs were chosen (Hub), the capacity constraint at the different chosen hubs (Capacity), the average volume of flow arriving to each hub (Average flow), the standard deviation of the incoming flow (S.t.d.), the probability a capacity is expected to hold for (γk = Pr[ ∑ iOi(ζ)Xik ≤ Γk]), and what the expected size of the capacity breach will be if it occurs (CV aRγk = E[ ∑ iOi(ζ)Xik| ∑ iOi(ζ)Xik > Γk]), for each selected hub. It is clear from the runs that when the hubs have loose (L) capacity constraints using the SCHLP3LS model to find an optimal network setup gives limited benefits compared to using the deterministic CSAHLP model. This is not surprising since capacities are large for the (L) capacity constraint problems and thus capacity overruns are intrinsically unlikely. However, for hubs which have tight (T) capacity constraints there is a huge difference. For the test problems 20LT and 20TT, using the setup found from the deterministic model, 40%−50% of the time the volume flowing into some of the hubs will exceed hub capacity; and on average, these breaches are 13% in excess of capacity. In contrast, capacity breaches at the hubs using the setup found from SCHLP3LS -0.95 will at most occur 1% of the time with, in general, a much smaller (than 13%) expected average breach size. For SCHLP3LS -0.85, hub 14 in 20LT and hubs 10 and 19 for 20TT hold capacity 94, 94 and 96 percent of the time, respectively. These are otherwise the major problem hubs in the deterministic model solutions as mentioned above. Although not as robust as when using the design found from SCHLP3LS -0.95, as would be expected, it still gives a remarkable improvement compared to the design found from the deterministic model regarding avoidance of capacity breaches. 91 Prob Model Hub Capacity Average flow S.t.d. γk CV aRγk 20LL CSAHLP 7 4815.31 1457.77 137.78 1.00 0.00 14 3493.96 2520.40 324.03 1.00 0.00 SCHLP3LS -0.95 7 4815.31 1457.77 137.78 1.00 0.00 14 3493.96 2520.40 324.03 1.00 0.00 SCHLP3LS -0.85 7 4815.31 1457.77 137.78 1.00 0.00 14 3493.96 2520.40 324.03 1.00 0.00 20LT CSAHLP 10 2183.73 2044.12 164.42 0.79 97.46 14 1953.10 1934.00 318.65 0.51 259.81 SCHLP3LS -0.95 6 1621.36 1069.26 127.07 1.00 0.00 10 2183.73 1485.15 182.64 1.00 0.00 14 1953.10 1422.51 278.46 0.99 117.99 SCHLP3LS -0.85 6 1621.36 1072.94 127.70 1.00 0.00 10 2183.73 1382.38 181.15 1.00 0.00 14 1953.10 1524.25 276.73 0.94 85.217 20TL CSAHLP 7 4815.31 1459.02 139.03 1.00 0.00 19 4212.93 2521.16 324.12 1.00 0.00 SCHLP3LS -0.95 7 4815.31 1459.02 139.03 1.00 0.00 19 4212.93 2521.16 324.12 1.00 0.00 SCHLP3LS -0.85 7 4815.31 1459.02 139.03 1.00 0.00 19 4212.93 2521.16 324.12 1.00 0.00 20TT CSAHLP 1 1160.77 332.17 77.75 1.00 0.00 10 2183.73 2121.15 205.01 0.60 147.76 19 1576.37 1522.04 279.71 0.56 219.34 SCHLP3LS -0.95 6 1621.36 1209.59 133.43 0.99 53.94 10 2183.73 1643.62 280.63 0.99 171.25 19 1576.37 1127.94 177.13 0.99 58.26 SCHLP3LS -0.85 9 1774.78 993.55 116.30 1.00 0.00 10 2183.73 1859.37 199.88 0.94 92.303 19 1576.37 1125.01 271.61 0.96 42.831 Table 4.3: Expected capacity breaches at different hubs for AP 20 examples. In table 4.4 the different network setup costs are presented for the different runs on the two test problems with tight capacities as well as an approximation of what the overall 92 average capacity breach difference may be. Approximately 150 additional units, on an overall average basis, are expected to breach capacity each day when using the network setup found by the deterministic model compared to SCHLP3LS -0.95 and SCHLP 3L S - 0.85. The transportation cost is slightly less using the setup found from SCHLP3LS -0.85, compared to SCHLP3LS -0.95, where the overall average breach difference is quite small. Therefore it is important for network managers to get an idea of the costs associated with a capacity breach, i.e. the costs associated with delays, extra personnel costs needed to handle the extra volume, long queuing costs at docking stations, poorer services and so forth, in order to trade off the different construction and operation costs against each other. CSAHLP SCHLP3LS -0.95 SCHLP 3L S -0.85 20LT Cost 253517 265639 262510 Cost diff 12122 8993 Aver breach 147.77 1.18 5.11 Aver breach diff -146.59 -142.66 20TT Cost 296035 325062 320873 Cost diff 29027 24838 Aver breach 155.61 2.83 7.25 Aver breach diff -152.78 -148.36 Table 4.4: Cost and capacity breach differences for AP 20LT and 20TT examples. From these test problems and simulation runs it is clear that SCHLP3LS does find differ- ent network setups compared to the deterministic CSAHLP model which vastly diminish the risk of breaching a hub’s capacity when capacity constraints are tight. For networks where capacity breaches have costly consequences it is clear that accounting for fluctu- ations in the daily OD flow volumes is extremely important and the optimal solution found from the deterministic model is poor. In addition, increasing or decreasing γ also gives the user a lot of flexibility and the value used should be adjusted accordingly to one’s risk preference in regards to the cost of breaching a hub’s capacities. 93 4.6 Conclusion The introduction of stochastic volume into the capacitated single allocation hub location problem is presented and studied in this paper. The contribution of the paper is to give better decision tools to managers when designing a hub and spoke network with capacity constraints by being able to include the realistic and important random changes that OD flow volumes have on a daily basis. A model with chance constraints is developed and formulated using the risk measure conditional value-at-risk due to its effectiveness in finding a safe value-at-risk guaranteeing that the capacity will hold with a probability greater then or equal to γ. In addition it buffers against extreme volume increases which can have fatal affects on a network. The model is reformulated into a strong approximate linear formulation using sampled data, a SAA model, whose solution converges to a solution of the original problem with probability one, exponentially fast, as the sample size S increases. The SAA model can, in principle, be solved using any standard mixed integer linear programing solver. Numerical tests are conducted on the linear SAA model and show that we can solve several test problems up to a size of 50 nodes. In addition simulation tests are carried out and clearly show that accounting for random changes in the OD flow volumes is very important when capacity constraints are tight as this drastically reduces the chance of a capacity breach to occur. 94 Chapter 5 Conclusion For transportation companies, network infrastructure and operations are critical com- ponents which directly affect the economic health of the company. The design aspect of a hub and spoke network for a transportation company, be it mail, cargo, air passenger or something else, is by no means an easy or straightforward undertaking. In addition the design aspect is not only a onetime project at the birth of the company but is an ongoing process. For example, many post and mail companies are continuously consid- ering how to restructure their network, in regards to number and placement of hubs, node allocations, departure times etc., to optimize operations as demand changes over time [1]. The academic management science and operation research hub and spoke literature aims to help network managers, with their goal of finding an optimal hub and spoke design, by providing new mathematical programming tools. Different modeling objectives result in different design benefits which help different managers in different ways. Due to the extreme complexities of finding optimal solutions to large scale systems, which consider all parameters — dynamic, static and stochastic — and all the different variables, there exists a gap between solvable models and the possibility of finding a real life global optimum. Many simplified models can however still provide valuable feedback to real life design questions. The purpose of the academic literature is not only to try to reduce the theory practice gap but to also design simplified solvable models which still provide valuable design suggestions. The hub and spoke mathematical modeling literature started with one model [49], and evolved from there. As the evolution of hub and spoke optimization started taking 95 place, heuristic algorithms, different exact solution methods and model variations began to appear [3, 13]. This evolution is still going strong and hub and spoke modeling is now taking steps into the world of stochastic optimization. This is an important step since many critical network characteristics are missed when only considering averages [62]. This thesis contributes to the hub and spoke literature by being on the front line of exploring and developing solvable models considering important stochastic processes. The results from this thesis are very promising. For the first time exact solutions are being found for medium sized stochastic hub and spoke problems by implementing more efficient reformulations and solution algorithms as demonstrated in Chapter 2. Even the approximation models in Chapters 3 and 4 seem to suggest that optimal solutions are also possibly being found for many of the medium sized network problems. In addition, important design questions are being answered by the solutions of these models. For example, Chapter 3 is devoted to showing how to structure a network to be able to provide both fast and reliable delivery services. In particular, how does one minimize the longest path in the network so as to provide competitively fast delivery times no matter where in the network a customer wants to ship from and to, while accounting for the following issues: (i) random travel times for each leg as well as the random unloading, sorting and loading times at hubs, (ii) finding and/or making sure departure times are met even though times before everything has arrived and been loaded are random, (iii) making sure each delivery reaches its destination node before the found minimax time for at least some acceptable probability level γ. Also in Chapter 3, the additional problem of maximizing the number of total packages which are delivered in time by accounting for the costs and probabilities of late deliveries, for certain network structures, is modeled and then solved numerically. In Chapter 4, network managers can now find more reliable network structures that minimize costs while safely guarding against hub capacity breaches by accounting for the daily volume fluctuations of OD flows. In so doing, costs can be reduced by limiting the possibility of delays, extended working hours, large queues, loss of unhappy customers, etc., due to over capacity volume flows at hub nodes which cannot be handled efficiently. All these questions are very important during the design phase of a hub and spoke network and solutions are now available to find where important real life stochastic processes are considered. It is important to note however, that cost and time minimizations don’t always go hand in hand and each network manager will have to weigh one against the other depending upon capital and operational budgets, existing competition and type of service. The service cost model, presented in Chapter 3, is however a promising type of hybrid where 96 costs of missed delivery times are considered. More on this together with some future research suggestions are discussed in the following section. 5.1 Future Research The hope is that future research will continue developing stochastic optimization models and methods building on, among others, the model designs and methodologies presented in the preceding chapters. The well known question, which is of great practical impor- tance, of how to deal with the dual, and often opposing, objective functions of mini- mizing delivery times and network costs simultaneously, is an important future research question. The service cost problem, introduced in Chapter 3, has great potential as a stepping stone to solving this problem. As mentioned, it is a type of hybrid model which considers the cost of missed delivery times and can therefore more easily be extended to account for the total network costs, as opposed to a center type problem where travel time has no direct network cost relationship. This would be both a very intriguing and important future research question to consider while keeping the stochastic travel time properties and incorporating stochastic demand if capacity constraints are present. How- ever, prior to this step it may be well worth first looking into and developing appropriate formulations and solution algorithms to solving the location-allocation problem, for both variable and fixed departure times applied on the service cost problem. Additionally, research on how to account for stochastic OD flow volumes and looking into how/if this can affect the probability of a path making it on time, due to possible congestion at hub nodes on high volume days, for the service cost problem, is important. Finding efficient models and solution methods for these questions would be a major contribution and very beneficial to network managers. They would have tools which would allow optimal or near optimal solutions to be found while accounting for important real life system components such as increasing the probability of goods being delivered on time, accounting for departure times at hub nodes, considering stochastic network effects on the demand and travel times, while all at the same time minimizing the overall network costs. An alternative approach to the dual objective function problem discussed above is to add probabilistic time constraints to the stochastic hub location problem presented in Chapter 4. This would not account for the costs of missed deliveries but would minimize the costs associated with the network setup while guarding against capacity constraint 97 violations while at the same time guaranteeing delivery time services. In addition, one could further look into whether the stochastic travel times of arriving goods into a hub are affecting the congestion at each hub node in a positive or a negative way. Understanding this and developing efficient models and solution methods would be of great value for network system designers. Another important future research question is in regards to the new stochastic p-hub center problems also introduced in Chapter 3. The two different departure time models presented in Chapter 3 are appropriate and of great importance for delivery companies, however, the models could further be developed to better fit airline passenger hub and spoke systems where an airline company may have multiple slot time departure options to choose from for all hub-hub links. The second model presented in Chapter 3 can handle some departure time options but would explode in size if multiple options were available for every hub-hub link, which may be the case for some airline companies. Developing this would therefore be a very beneficial research question. Several other interesting research questions exist such as modelling a robust approach for handling uncertainty environments for the center, covering and median type prob- lems while incorporating hub departure times. Accounting for risk or uncertainty in a competitive environment where one considers the actions of competing players in the system would also be an interesting and important question. This would bring game the- ory into the stochastic hub and spoke literature as well. Lastly, looking into additional improvements of existing models, to be able to solve even larger networks, is always a sought after and appreciated research question. Thus it is clear that there are many interesting future research areas for stochastic optimization, and we hope to see at least some of them taken up in the future. Only in this way will the field continue to develop. The very human nature of wanting to explore, improve, understand and find solutions is what makes us great and allows us to grow as a species. If a person can answer just a single research question, it will mean the world, since only then can we keep developing. “Almost everything you do will seem insignificant, but it is important that you do it” — Gandhi. 98 Bibliography [1] Personal communication, National postal company, Europe, 2009. [2] M. Albareda-Sambola, E. Fernndez, and F. S. da Gama. The facility location problem with Bernoulli demands. Omega, 39(3):335 – 345, 2011. [3] S. Alumur and B. Y. Kara. Network hub location problems: The state of the art. European Journal of Operational Research, 190(1):1–21, 2008. [4] F. Andersson, H. Mausser, D. Rosen, and S. Uryasev. Credit risk optimization with conditional value-at-risk criterion. Mathematical Programming, 89(2):273–291, 2001. [5] J. Atlason, M. Epelman, and S. Henderson. Call center staffing with simulation and cutting plane methods. Annals of Operations Research, 127:333–358, 2004. [6] T. Aykin. On ‘A quadratic integer program for the location of interacting hub facilities’. European Journal of Operational Research, 46:409–411, 1990. [7] O. Baron, O. Berman, and D. Krass. Facility location with stochastic demand and constraints on waiting time. Manufacturing & Service Operations Management, 10:484–505, 2008. [8] N. Boysen, M. Fliedner, and A. Scholl. Scheduling inbound and outbound trucks at cross docking terminals. OR Spectrum, 32:135–161, 2010. [9] J. F. Campbell. Location and allocation for distribution systems with transship- ments and transportation economies of scale. Annals of Operations Research, 40:77– 99, 1992. [10] J. F. Campbell. Integer programming formulations of discrete hub location prob- lems. European Journal of Operational Research, 72:387–405, 1994. 99 [11] J. F. Campbell. Integer programming formulations of discrete hub location prob- lems. European Journal of Operational Research, 72:387–405, 1994. [12] J. F. Campbell. Hub location and the p-hub median problem. Operations Research, 44(6):923–935, 1996. [13] J. F. Campbell, A. Ernst, and M. Krishnamoorthy. Hub location problems. In H. Hamacher and Z. Drezner, editors, Location Theory: Applications and Theory, pages 373–406. Springer-Verlag, 2001. [14] R. Carbone. Public facilities location under stochastic demand. ONFOR, 12:261– 270, 1974. [15] I. Contreras, J.-F. Cordeau, and G. Laporte. Stochastic uncapacitated hub loca- tion. HEC Montreal and Interuniversity Research Centre on Enterprise Networks, Logistics and Transportation (CIRRELT), Montreal, Canada, 2010. [16] I. Contreras, J. A. Diaz, and E. Fernandez. Branch-and-price for large-scale ca- pacitated hub location problems with single assignment. INFORMS Journal on Computing, 23:41–55, 2011. [17] I. Correia, S. Nickel, and F. S. da Gama. The capacitated single-allocation hub location problem revisited: A note on a classical formulation. European Journal of Operational Research, 207:92–96, 2010. [18] M. da Graca Costa, M. E. Captivo, and J. Climaco. Capacitated single allocation hub location problem - a bi-criteria approach. Computers & Operations Research, 35:3671–3695, 2008. [19] L. Dai, C. H. Chen, and J. R. Birge. Convergence properties of two-stage stochastic programming. Journal of Optimization Theory and Applications, 106(3):489–509, 2000. [20] R. S. de Camargo, G. Miranda, and H. P. Luna. Benders decomposition for the uncapacitated multiple allocation hub location problem. Computers & Operations Research, 35(4):1047–1064, 2008. [21] U. Dorndorf, A. Drexl, Y. Nikulin, and E. Pesch. Flight gate scheduling: State-of- the-art and recent developments. Omega, 35:326–334, 2007. 100 [22] J. Ebery. Solving large single allocation p-hub location problems with 2 or 3 hubs. European Journal of Operational Research, 128(2):447–458, 2001. [23] S. Elhedhli and F. X. Hu. Hub-and-spoke network design with congestion. Com- puters & Operations Research, 32:1615–1632, 2005. [24] A. T. Ernst, H. Hamacher, H. Jiang, M. Krishnamoorthy, and G. Woeginger. Un- capacitated single and multiple allocation p-hub center problems. Computers & Operations Research, 36(7):2230–2241, 2009. [25] A. T. Ernst and M. Krishnamoorthy. Efficient algorithms for the uncapacitated single allocation p-hub median problem. Location Science, 4:139–154, 1996. [26] A. T. Ernst and M. Krishnamoorthy. Exact and heuristic algorithms for the unca- pacitated multiple allocation p-hub median problem. European Journal of Opera- tional Research, 104(1):100–112, 1998. [27] A. T. Ernst and M. Krishnamoorthy. An exact solution approach based on shortest- paths for p-hub median problems. INFORMS Journal on Computing, 10(2):149– 162, 1998. [28] A. T. Ernst and M. Krishnamoorthy. Solution algorithms for the capacitated single allocation hub location problem. Annals of Operations Research, 86:141–159, 1999. [29] A. S. Fotheringham. A new set of spatial interaction models: The theory of com- peting destinations. Environment and Planning A, pages 15–36, 1983. [30] O. Garnett, A. Mandelbaum, and M. Reiman. Designing a call center with impa- tient customers. Manufacturing and Service Operations Management, 4(3):208–227, 2002. [31] G. Ghiani, F. Guerriero, G. Laporte, and R. Musmanno. Real-time vehicle routing: Solution concepts, algorithms and parallel computing strategies. European Journal of Operational Research, 151:1–11, 2003. [32] I. Gurvich, M. Armony, and A. Mandelbaum. Service-level differentiation in call centers with fully flexible servers. Management Science, 54(2):279–294, 2008. [33] E. Hult, H. Jiang, and D. Ralph. Exact computational approaches to a stochas- tic uncapacitated single allocation p-hub center problem. Judge Business School, University of Cambridge, Trumpington Street, Cambridge, United Kingdom, 2011. 101 [34] E. L. Johnson, G. L. Nemhauser, and M. W. Savelsbergh. Progress in linear programming-based algorithms for integer programming: An exposition. INFORMS Journal on Computing, 12(1):2–23, 2000. [35] P. Jorion. Value at Risk: The new benchmark for managing financial risk. The McGraw-Hill Companies, Inc., 2007. [36] S. Juette, O. Gavriliouk, and H. Hamacher. Polyhedral analysis of uncapacitated single allocation p-hub center problems. Technical Report 109, FB Mathematik, TU Kaiserslautern, 2007. [37] B. Y. Kara and B. C. Tansel. On the single-assignment p-hub center problem. European Journal of Operational Research, 125(3):648–655, 2000. [38] B. Y. Kara and B. C. Tansel. The latest arrival hub location problem. Management Science, 47(10):1408–1420, 2001. [39] A. Kleywegt, A. Shapiro, and T. Homen-De-Mello. The sample average approxima- tion method for stochastic discrete optimization. SIAM Journal on Optimization, 12(2):479–502, 2001. [40] J. G. Klincewicz. Heuristics for the p-hub location problem. European Journal of Operational Research, 53(1):25–37, 1991. [41] J. G. Klincewicz. Avoiding local optima in the p-hub location problem using tabu search and grasp. Annals of Operations Research, 40:283–302, 1992. [42] P. Krokhmal, J. Palmquist, and S. Uryasev. Portfolio optimization with conditional value-at-risk objective and constraints. Journal of Risk, 4(2):43–68, 2002. [43] F. Y. Louveaux. Discrete stochastic location models. Annals of Operations Re- search, 6:23–34, 1986. [44] V. Marianov and D. Serra. Location models for airline hubs behaving as M/D/c queues. Computers & Operations Research, 30:983–1003, 2003. [45] V. Marianov, D. Serra, and C. ReVelle. Location of hubs in a competitive environ- ment. European Journal of Operational Research, 114:363–371, 1999. [46] H. M. Markowitz. Portfolio selection. The Journal of Finance, 7:77–91, 1952. 102 [47] J. C. Martin and C. Roman. Hub location in the South-Atlantic airline market: A spatial competition game. Transportation Research Part A, pages 865–888, 2003. [48] T. Meyer, A. T. Ernst, and M. Krishnamoorthy. A 2-phase algorithm for solving the single allocation p-hub center problem. Computers & Operations Research, 36(12):3143–3151, 2009. [49] M. E. O’Kelly. A quadratic integer program for the location of interacting hub facilities. European Journal of Operational Research, 32:393–404, 1987. [50] M. E. O’Kelly. Hub facility location with fixed costs. Papers in Regional Science: The Journal of the RSAI, 71:293–306, 1992. [51] M. E. O’Kelly. Routing traffic at hub facilities. Networks and Spatial Economics, 10(2):173–191, 2010. [52] M. E. O’Kelly, D. Bryan, D. Skorin-Kapov, and J. Skorin-Kapov. Hub network de- sign with single and multiple allocation: A computational study. Location Science, 4(3):125–138, 1996. [53] M. E. O’Kelly, D. Skorin-Kapov, and J. Skorin-Kapov. Lower bounds for the hub location problem. Management Science, 41(4):713–721, 1995. [54] P. Pamuk and C. Sepil. A solution to the hub center problem via a single-relocation algorithm with tabu search. IEE Transactions, 33:399–411, 2001. [55] C. H. Papadimitriou and K. Steiglitz. Combinatorial Optimization Algorithms and Complexity. Dover Publications, Inc., 1998. [56] H. Pirkul and D. A. Schilling. An efficient procedure for designing single allocation hub and spoke systems. Management Science, 44(12):S235–S242, 1998. [57] R. T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. The Journal of Risk, 2(3), 2000. [58] R. T. Rockafellar and S. Uryasev. Conditional value-at-risk for general loss distri- butions. Journal of Banking & Finance, 26:1443–1471, 2002. [59] J. Rosenhead, M. Elton, and S. Gupta. Robustness and optimality as criteria for strategic decisions. Operational Research Quarterly (1970-1977), 23(4):413–431, 1972. 103 [60] H. W. Samir Elhedhli. A Lagrangean heuristic for hub-and-spoke system design with capacity selection and congestion. INFORMS Journal on Computing, 22:282– 296, 2010. [61] M. Sasaki. Hub network design model in a competitive environment with flow threshold. Journal of the Operations Research Society of Japan, 48:158–171, 2005. [62] S. Savage. Decision Making with Insight. Duxbury Press, 2003. [63] S. Savage. Flaw of averages. San Jose Mercury News, 2008. [64] E. Sheppard. A conceptual framework for dynamic location-allocation analysis. Environment and Planning A, 6:547–564, 1974. [65] T. Sim, T. Lowe, and B. Thomas. The stochastic p-hub center problem with service- level constraints. Computers & Operations Research, 36:3166–3177, 2009. [66] D. Skorin-Kapov and J. Skorin-Kapov. On tabu search for the location of interacting hub facilities. European Journal of Operational Research, 73:502–509, 1994. [67] D. Skorin-Kapov, J. Skorin-Kapov, and M. E. O’Kelly. Tight linear programming relaxations of uncapacitated p-hub median problems. European Journal of Opera- tional Research, 94:582–593, 1996. [68] K. Smith, M. Krishnamoorthy, and M. Palaniswami. Neural versus traditional approaches to the location of interacting hub facilities. Location Science, 4:155– 171, 1996. [69] L. V. Snyder. Facility location under uncertainty: a review. IIE Transactions, 38:537–554, 2006. [70] J. Sohn and S. Park. A linear program for the two-hub location problem. European Journal of Operational Research, 100(3):617–622, 1997. [71] J. Sohn and S. Park. Efficient solution procedure and reduced size formulations for p-hub location problems. European Journal of Operational Research, 108(1):118– 126, 1998. [72] J. Sohn and S. Park. The single allocation problem in the interacting three hub network. Networks, 35:17–25, 2000. 104 [73] D. Steenken, S. Voss, and R. Stahlbock. Container terminal operation and op- erations research - a classification and literature review. OR Spectrum, 26:3–49, 2004. [74] B. Wagner. A note on ‘the latest arrival hub location problem’. Management Science, 50:1751–1752, 2004. [75] M. R. Wagner, J. Bhadury, and S. Peng. Risk management in uncapacitated fa- cility location models with random demands. Computers & Operations Research, 36:1002–1011, 2009. [76] J. Wang. The β-reliable median on a network with discrete probabilistic demand weights. Operations Research, 55:966–975, 2007. [77] W. Wang and S. Ahmed. Sample average approximation of expected value con- strained stochastic programs. Operations Research Letters, 36:515–519, 2008. [78] J. R. Weaver and R. L. Church. Computational procedures for location problems on stochastic networks. Transportation Science, 17:168–180, 1983. [79] M. Wen and K. Iwamura. Fuzzy facility location-allocation problem under the Hurwicz criterion. European Journal of Operational Research, 184:627–635, 2008. [80] H. Yaman, B. Kara, and B. Tansel. The latest arrival hub location problem for cargo delivery systems with stopovers. Transportation Research Part B, 41(8):906–919, 2007. [81] T.-H. Yang. Stochastic air freight hub location and flight routes planning. Applied Mathematical Modelling, 33:4424–4430, 2009. [82] W. Yu. Operational strategies for cross docking systems. Iowa State University, Ph.D. Dissertation, 2002. [83] J. Zhou and B. Liu. New stochastic models for capacitated location-allocation problem. Computers & Industrial Engineering, 45:111–125, 2003. 105