Dynamic Programming Approximations for a Stochastic Inventory Routing Problem


Anton J. Kleywegt, Vijay S. Nori, Martin W. P. Savelsbergh
School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA

August 28, 2002

Abstract

This work is motivated by the need to solve the inventory routing problem when implementing a business practice called vendor managed inventory replenishment (VMI). With VMI, vendors monitor their customers' inventories, and decide when and how much inventory should be replenished at each customer. The inventory routing problem attempts to coordinate inventory replenishment and transportation in such a way that the cost is minimized over the long run. We formulate a Markov decision process model of the stochastic inventory routing problem, and propose approximation methods to find good solutions with reasonable computational effort. We indicate how the proposed approach can be used for other Markov decision processes involving the control of multiple resources.

Supported by the National Science Foundation under grant DMI

Introduction

Recently the business practice called vendor managed inventory replenishment (VMI) has been adopted by many companies. VMI refers to the situation in which a vendor monitors the inventory levels at its customers and decides when and how much inventory to replenish at each customer. This contrasts with conventional inventory management, in which customers monitor their own inventory levels and place orders when they think that it is the appropriate time to reorder. VMI has several advantages over conventional inventory management. Vendors can usually obtain a more uniform utilization of production resources, which leads to reduced production and inventory holding costs. Similarly, vendors can often obtain a more uniform utilization of transportation resources, which in turn leads to reduced transportation costs. Furthermore, additional savings in transportation costs may be obtained by increasing the use of low-cost full-truckload shipments and decreasing the use of high-cost less-than-truckload shipments, and by using more efficient routes by coordinating the replenishment at customers close to each other. VMI also has advantages for customers. Service levels may increase, measured in terms of reliability of product availability, because vendors can use the information that they collect on the inventory levels at the customers to better anticipate future demand, and to proactively smooth peaks in the demand. Also, customers do not have to devote as many resources to monitoring their inventory levels and placing orders, as long as the vendor is successful in earning and maintaining the trust of the customers. A first requirement for a successful implementation of VMI is that a vendor is able to obtain relevant and accurate information in a timely and efficient way.
One of the reasons for the increased popularity of VMI is the increase in the availability of affordable and reliable equipment to collect and transmit the necessary data between the customers and the vendor. However, access to the relevant information is only one requirement. A vendor should also be able to use the increased amount of information to make good decisions. This is not an easy task: the underlying decision problems are computationally very hard. The objective of this work is to develop efficient methods to help the vendor make good decisions when implementing VMI. In many applications of VMI, the vendor manages a fleet of vehicles to transport the product to the customers. The objective of the vendor is to coordinate the inventory replenishment and transportation in such a way that the total cost is minimized over the long run. The problem of optimal coordination of inventory replenishment and transportation is called the inventory routing problem (IRP). In this paper, we study the problem of determining optimal policies for the variant of the IRP in which a single product is distributed from a single vendor to multiple customers. The demands at the customers are assumed to have probability distributions that are known to the vendor. The objective is to maximize the expected discounted value, incorporating sales revenues, production costs, transportation costs, inventory holding costs, and shortage penalties, over an infinite horizon.

Our work on this problem was motivated by our collaboration with a producer and distributor of air products. The company operates plants worldwide and produces a variety of air products, such as liquid nitrogen, oxygen, and argon. The company's bulk customers have their own storage tanks at their sites, which are replenished by tanker trucks under the supplier's control. Approximately 80% of the bulk customers participate in the company's VMI program. For the most part each customer and each vehicle is allocated to a specific plant, so that the overall problem decomposes according to individual plants. Also, to improve safety and reduce contamination, each vehicle and each storage tank at a customer is dedicated to a particular type of product. Hence the problem also decomposes according to type of product. (This assumption does not hold if the number of drivers is a tight constraint, and drivers can be allocated to deliver one of several different products.) Therefore, in this paper we consider an inventory routing problem with a single vendor, multiple customers, multiple vehicles, and a single type of product. The main contributions of the research reported in this paper are as follows:

1. In an earlier paper (Kleywegt et al., 2002), we formulated the inventory routing problem with direct deliveries, i.e., one delivery per trip, as a Markov decision process and proposed an approximate dynamic programming approach for its solution. In this paper, we extend both the formulation and the approach to handle multiple deliveries per trip.

2. We present a solution approach that uses decomposition and optimization to approximate the value function. Specifically, the overall problem is decomposed into smaller subproblems, each designed to have two properties: (1) it provides an accurate representation of a portion of the overall problem, and (2) it is relatively easy to solve.
In addition, an optimization problem is defined to combine the solutions of the subproblems, in such a way that the value of a given state of the process is approximated by the optimal value of the optimization problem.

3. Computational experiments demonstrate that our approach allows the construction of near-optimal policies for small instances, and of policies that are better than those proposed in the literature for realistically sized instances (with approximately 20 customers). The sizes of the state spaces for these instances are orders of magnitude larger than those that can be handled with more traditional methods, such as the modified policy iteration algorithm.

In Section 1 we define the stochastic inventory routing problem, point out the obstacles encountered when attempting to solve the problem, present an overview of the proposed solution method, and review related literature. In Section 2 we propose a method for approximating the dynamic programming value function. In Section 3 the day-to-day control of the IRP process using the dynamic programming value function approximation is discussed. In Section 4 we investigate a special case of the IRP. Computational

results are presented in Section 5, and Section 6 concludes with some remarks regarding the application of the approach to other stochastic control problems.

1 Problem Definition

A general description of the IRP is given in Section 1.1, after which a Markov decision process formulation is given in Section 1.2. Section 1.3 discusses the issues to be addressed when solving the IRP, and Section 1.4 presents an overview of the proposed solution method. Section 1.5 reviews some related literature.

1.1 Problem Description

A product is distributed from a vendor's facility to N customers, using a fleet of M homogeneous vehicles, each with known capacity C. The process is modeled in discrete time $t = 0, 1, \ldots$, and the discrete time periods are called days. Let random variable $U_{it}$ denote the demand of customer i at time t, and let $U_t = (U_{1t}, \ldots, U_{Nt})$ denote the vector of customer demands at time t. Customers' demands on different days are independent random vectors with a joint probability distribution F that does not change with time; that is, $U_0, U_1, \ldots$ is an independent and identically distributed sequence, and F is the probability distribution of each $U_t$. The probability distribution F is known to the decision maker. (In many applications customers' demands on different days may not be independent; in such cases customers' demands on previous days may provide valuable data for the forecasting of customers' future demands. A refined model with a suitably expanded state space can be formulated to exploit such additional information. Such refinement is not addressed in this paper.) There is an upper bound $C_i$ on the amount of product that can be in inventory at each customer i. This upper bound $C_i$ can be due to limited storage capacity at customer i, as in the application that motivated this research.
In other applications of VMI, there is often a contractual upper bound $C_i$, agreed upon by customer i and the vendor, on the amount of inventory that may be at customer i at any point in time. One motivation for this contractual bound is to prevent the vendor from dumping too much product at the customer. The vendor can measure the inventory level $X_{it}$ of each customer i at any time t. At each time t, the vendor makes a decision that controls the routing of vehicles and the replenishment of customers' inventories. Such decisions may have many aspects, some of which are important for the method developed in this paper, and others which are not. The aspects of daily decisions that are important for the method developed in this paper are the following:

1. which customers' inventories to replenish,

2. how much to deliver at each customer, and

3. how to combine customers into vehicle routes.

On the other hand, the ideas developed in the paper are independent of the routing constraints that are imposed, and thus routing constraints are not explicitly spelled out in the formulation. Unless otherwise stated, we assume that each vehicle can perform at most one route per day. We also assume that the duration of the task assigned to each driver and vehicle is less than the length of a day, so that all M drivers and vehicles are available at the beginning of each day, when the tasks for that day are assigned. The expected value (revenues and costs) accumulated during a day depends on the inventory levels and decision of that day, and is known to the vendor. As in the case of the routing constraints, the ideas developed in the paper are independent of the exact composition of the costs of the daily decisions. Next we describe some typical types of costs for illustrative purposes. (These costs were also used in the numerical work.) The cost of a daily decision may include the travel costs $c_{ij}$ on the arcs (i, j) of the distribution network that are traversed according to the decision. Travel costs may also depend on the amount of product transported along each arc. The cost of a daily decision may include the costs incurred at customers' sites, for example due to product losses during delivery. The cost of a daily decision may include revenue: if quantity $d_i$ is delivered at customer i, the vendor earns a reward of $r_i(d_i)$. The cost of a daily decision may include shortage penalties: because demand is uncertain, there is often a positive probability that a customer runs out of stock, and thus shortages cannot always be prevented. Shortages are discouraged with a penalty $p_i(s_i)$ if the unsatisfied demand on day t at customer i is $s_i$. Unsatisfied demand is treated as lost demand, and is not backlogged.
The cost of a daily decision may include inventory holding cost: if the inventory at customer i is $x_i$ at the beginning of the day, and quantity $d_i$ is delivered at customer i, then an inventory holding cost of $h_i(x_i + d_i)$ is incurred. The inventory holding cost can also be modeled as a function of some average amount of inventory at each customer during the time period. The role played by inventory holding cost depends on the application. In some cases, the vendor and customers belong to different organizations, and the customers own the inventory. In these cases, the vendor typically does not incur any inventory holding costs based on the inventory at the customers. This was the case in the application that motivated this work. In other cases, such as when the vendor and customers belong to the same organization, or when the vendor owns the inventory at the customers, the vendor does incur inventory holding costs based on the inventory at the customers. The objective is to choose a distribution policy that maximizes the expected discounted value (rewards minus costs) over an infinite time horizon.

1.2 Problem Formulation

In this section we formulate the IRP as a discrete time Markov decision process (MDP) with the following components:

1. The state $x = (x_1, x_2, \ldots, x_N)$ represents the current amount of inventory at each customer. Thus the state space is $\mathcal{X} = [0, C_1] \times [0, C_2] \times \cdots \times [0, C_N]$ if the quantity of product can vary continuously, or $\mathcal{X} = \{0, 1, \ldots, C_1\} \times \{0, 1, \ldots, C_2\} \times \cdots \times \{0, 1, \ldots, C_N\}$ if the quantity of product varies in discrete units. Let $X_{it} \in [0, C_i]$ (or $X_{it} \in \{0, 1, \ldots, C_i\}$) denote the random inventory level at customer i at time t. Let $X_t = (X_{1t}, \ldots, X_{Nt}) \in \mathcal{X}$ denote the state at time t.

2. For any state x, let $A(x)$ denote the set of all feasible decisions when the process is in state x. A decision $a \in A(x)$ made at time t when the process is in state x contains information about (1) which customers' inventories to replenish, (2) how much to deliver at each customer, and (3) how to combine customers into vehicle routes. A decision may contain more information, such as travel times and arrival and departure times at customers (relative to time windows); the three attributes of a decision mentioned above are the important attributes for our purposes. For any decision a, let $d_i(a)$ denote the quantity of product that is delivered to customer i while executing decision a. The set $A(x)$ is determined by various constraints, such as work load constraints, routing constraints, vehicle capacity constraints, and customer inventory constraints. As discussed in Section 1.1, constraints such as work load constraints and routing constraints do not affect the method described in this paper. The constraints explicitly addressed in this paper are the limited number M of vehicles that can be used each day, the limited quantity C (vehicle capacity) that can be delivered by each vehicle on a day, and the maximum inventory levels $C_i$ that are allowed at any time at each customer i. The maximum inventory level constraints can be imposed in a variety of ways.
For example, if it is assumed that no product is used between the time that the inventory level $x_i$ is measured at customer i and the time that the delivery of $d_i(a)$ takes place, then the maximum inventory level constraints can be expressed as $x_i + d_i(a) \le C_i$ for all i, all $x \in \mathcal{X}$, and all $a \in A(x)$. If product is used during this time period, it may be possible to deliver more. The exact way in which the constraint is applied does not affect the rest of the development. For simplicity we applied the constraint as stated above. Let the random variable $A_t \in A(X_t)$ denote the decision chosen at time t.

3. In this formulation, the source of randomness is the random customer demands $U_{it}$. To simplify the exposition, assume that the deliveries at time t take place in time to satisfy the demand at time t. Then the amount of product used by customer i at time t is given by $\min\{X_{it} + d_i(A_t), U_{it}\}$. Thus the shortage at customer i at time t is given by $S_{it} = \max\{U_{it} - (X_{it} + d_i(A_t)), 0\}$, and the next inventory level at customer i at time t + 1 is given by $X_{i,t+1} = \max\{X_{it} + d_i(A_t) - U_{it}, 0\}$. The known joint probability distribution F of the customer demands $U_t$ gives a known Markov transition function Q, according to which transitions occur. For any state $x \in \mathcal{X}$, any decision $a \in A(x)$, and any Borel subset $B \subseteq \mathcal{X}$, let $U(x, a, B) \equiv \{U \in \mathbb{R}^N_+ : (\max\{x_1 + d_1(a) - U_1, 0\}, \ldots, \max\{x_N + d_N(a) - U_N, 0\}) \in B\}$.
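The one-day dynamics above translate directly into code. A minimal sketch (our own illustration; the function and array names are not from the paper):

```python
import numpy as np

def transition(x, d, u):
    """One-day inventory transition for all N customers.

    x : current inventory levels X_t, d : delivered quantities d_i(a),
    u : realized demands U_t.  All are length-N arrays.
    """
    x, d, u = np.asarray(x), np.asarray(d), np.asarray(u)
    usage = np.minimum(x + d, u)             # product used: min{X_it + d_i, U_it}
    shortage = np.maximum(u - (x + d), 0.0)  # S_it = max{U_it - (X_it + d_i), 0}
    x_next = np.maximum(x + d - u, 0.0)      # X_{i,t+1} = max{X_it + d_i - U_it, 0}
    return x_next, shortage, usage

# Example: two customers; customer 2 gets no delivery and runs short.
x_next, s, used = transition(x=[3.0, 1.0], d=[2.0, 0.0], u=[4.0, 2.0])
# x_next = [1.0, 0.0], s = [0.0, 1.0], used = [4.0, 1.0]
```

Note that unsatisfied demand is simply dropped, matching the lost-sales (no backlogging) assumption above.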

Then $Q[B \mid x, a] \equiv F[U(x, a, B)]$. In other words, for any state $x \in \mathcal{X}$ and any decision $a \in A(x)$,

$$P[X_{t+1} \in B \mid X_t = x, A_t = a] = Q[B \mid x, a] \equiv F[U(x, a, B)]$$

4. Let $g(x, a)$ denote the expected single stage net reward if the process is in state x at time t, and decision $a \in A(x)$ is implemented. To give a specific example in terms of the costs mentioned in Section 1.1, for any decision a and arc (i, j), let $k_{ij}(a)$ denote the number of times that arc (i, j) is traversed by a vehicle while executing decision a. Then,

$$g(x, a) \equiv \sum_{i=1}^{N} r_i(d_i(a)) - \sum_{(i,j)} c_{ij} k_{ij}(a) - \sum_{i=1}^{N} h_i(x_i + d_i(a)) - \sum_{i=1}^{N} E_F\left[p_i\left(\max\{U_{i0} - (x_i + d_i(a)), 0\}\right)\right]$$

where $E_F$ denotes expected value with respect to the probability distribution F of $U_0$.

5. The objective is to maximize the expected total discounted value over an infinite horizon. The decisions $A_t$ are restricted such that $A_t \in A(X_t)$ for each t, and $A_t$ may depend only on the history $(X_0, A_0, X_1, A_1, \ldots, X_t)$ of the process up to time t; i.e., when the decision maker decides on a decision at time t, the decision maker does not know what is going to happen in the future. Let $\Pi$ denote the set of policies that depend only on the history of the process up to time t. Let $\alpha \in [0, 1)$ denote the discount factor. Let $V^*(x)$ denote the optimal expected value given that the initial state is x, i.e.,

$$V^*(x) \equiv \sup_{\pi \in \Pi} E^{\pi}\left[\sum_{t=0}^{\infty} \alpha^t g(X_t, A_t) \,\middle|\, X_0 = x\right] \qquad (1)$$

A stationary deterministic policy $\pi$ prescribes a decision $\pi(x) \in A(x)$ based on the information contained in the current state x of the process only. For any stationary deterministic policy $\pi$, and any state $x \in \mathcal{X}$, the expected value $V^{\pi}(x)$ is given by

$$V^{\pi}(x) \equiv E^{\pi}\left[\sum_{t=0}^{\infty} \alpha^t g(X_t, \pi(X_t)) \,\middle|\, X_0 = x\right] = g(x, \pi(x)) + \alpha \int_{\mathcal{X}} V^{\pi}(y)\, Q[dy \mid x, \pi(x)]$$

(The last equality is a standard result in dynamic programming; see for example Bertsekas and Shreve 1978.)
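The last term of $g(x, a)$ is an expectation over the demand distribution F; such high-dimensional expectations are natural targets for sampling. A minimal Monte Carlo sketch of evaluating $g(x, a)$, with illustrative linear revenue, holding, and penalty functions (all names and numbers below are our own assumptions, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def g_estimate(x, d, arc_cost, r, h, p, sample_demands):
    """Monte Carlo estimate of the single-stage net reward g(x, a).

    x, d           : inventories and deliveries (length-N arrays)
    arc_cost       : total routing cost, sum over arcs of c_ij * k_ij(a), precomputed
    r, h, p        : per-customer revenue, holding-cost, and penalty functions
    sample_demands : (K, N) array of i.i.d. draws from F
    """
    x, d = np.asarray(x), np.asarray(d)
    revenue = sum(r(di) for di in d)
    holding = sum(h(xi) for xi in x + d)
    shortages = np.maximum(sample_demands - (x + d), 0.0)  # S_i per sample
    penalty = np.mean([sum(p(s) for s in row) for row in shortages])
    return revenue - arc_cost - holding - penalty

# Illustrative linear cost structure, Poisson demands for two customers.
demands = rng.poisson(2.0, size=(1000, 2)).astype(float)
val = g_estimate(x=[1.0, 1.0], d=[2.0, 2.0], arc_cost=5.0,
                 r=lambda q: 3.0 * q, h=lambda y: 0.1 * y,
                 p=lambda s: 10.0 * s, sample_demands=demands)
```

The revenue, routing, and holding terms are deterministic given (x, a); only the shortage-penalty term needs the sample average.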
It follows from results in dynamic programming that, under conditions that are not very restrictive (e.g., g bounded and $\alpha < 1$), to determine the optimal expected value in (1), it is sufficient to restrict attention to

the class $\Pi_{SD}$ of stationary deterministic policies. It follows that for any state $x \in \mathcal{X}$,

$$V^*(x) = \sup_{\pi \in \Pi_{SD}} V^{\pi}(x) = \sup_{a \in A(x)} \left\{ g(x, a) + \alpha \int_{\mathcal{X}} V^*(y)\, Q[dy \mid x, a] \right\} \qquad (2)$$

A policy $\pi^*$ is called optimal if $V^{\pi^*} = V^*$.

1.3 Solving the Markov Decision Process

Many algorithms have been proposed to solve Markov decision processes; for example, see the textbooks by Bertsekas (1995) and Puterman (1994). Solving a Markov decision process usually involves computing the optimal value function $V^*$ and an optimal policy $\pi^*$ by solving the optimality equation (2). This requires the following major computational tasks to be performed.

1. Computation of the optimal value function $V^*$. Because $V^*$ appears on both the left hand side and the right hand side of (2), most algorithms for computing $V^*$ involve the computation of successive approximations to $V^*(x)$ for every $x \in \mathcal{X}$. These algorithms are practical only if the state space $\mathcal{X}$ is small. For the IRP as formulated in Section 1.2, $\mathcal{X}$ may be uncountable. One may attempt to make the problem more tractable by discretizing the state space $\mathcal{X}$ and the transition probabilities Q. Even if one discretizes $\mathcal{X}$ and Q, the number of states grows exponentially in the number of customers. Thus, even for discretized $\mathcal{X}$ and Q, the number of states is far too large to compute $V^*(x)$ for every $x \in \mathcal{X}$ if there are more than about four customers.

2. Estimation of the expected value (integral) in (2). For the IRP, this is a high dimensional integral, with the number of dimensions equal to the number N of customers, which can be as much as several hundred. Conventional numerical integration methods are not practical for the computation of such high dimensional integrals.

3. The maximization problem on the right hand side of (2) has to be solved to determine the optimal decision for each state. In the case of the IRP, the optimization problem on the right hand side of (2) is very hard. For example, the vehicle routing problem (VRP), which is NP-hard, is a special case of that problem.
(Consider any instance of the VRP, with a given number of capacitated vehicles, a graph with costs on the arcs, and demand quantities at the nodes. For the IRP, let the vehicles and graph be the same as for the VRP, let the demand be deterministic with demand quantities as given for the VRP, let the current inventory level at each customer be zero, let the discount factor be zero, and let the penalties be sufficiently large that an optimal solution for the optimization problem

on the right hand side of (2) has to satisfy the demand quantities at all the nodes. Then the instance of the VRP can be solved by solving the optimization problem on the right hand side of (2).)

In Kleywegt et al. (2002) we developed approximation methods to perform the computational tasks mentioned above efficiently and to obtain good solutions for the inventory routing problem with direct deliveries (IRPDD). To extend the approach to the IRP in which multiple customers can be visited on a route, we develop in this paper new methods for the first and third computational tasks, that is, to compute, at least approximately, $V^*$, and to solve the maximization problem on the right hand side of (2). The second task was addressed in the way described in Kleywegt et al. (2002).

1.4 Overview of the Proposed Method

An outline of our approach is as follows. The first major step in solving the IRP is to construct an approximation $\hat{V}$ to the optimal value function $V^*$. The approximation $\hat{V}$ is constructed as follows. First, a decomposition of the IRP is developed. Subproblems are defined for specific subsets of customers. Each subproblem is also a Markov decision process. The subsets of customers do not necessarily partition the set of customers, but must cover the set of customers. The idea is to define each subproblem so that it gives an accurate representation of the overall process as experienced by the subset of customers. To do that, the parameters of each subproblem are determined by simulating the overall IRP process, and by constructing simulation estimates of the subproblem parameters. Second, each subproblem is solved optimally. Third, for any given state x of the IRP process, the approximate value $\hat{V}(x)$ is determined by choosing a collection of subsets of customers that partitions the set of customers. Then $\hat{V}(x)$ is set equal to the sum of the optimal value functions of the subproblems corresponding to the chosen collection of subsets, at states corresponding to x.
The collection of subsets of customers is chosen to maximize $\hat{V}(x)$. Details of the construction of $\hat{V}$ are given in Section 2. An outline of the value function approximation algorithm is given in Algorithm 1. Given $\hat{V}$, the IRP process is controlled as follows. Whenever the state of the process is x, a decision $\hat{\pi}(x)$ is chosen that solves

$$\max_{a \in A(x)} \left\{ g(x, a) + \alpha \int_{\mathcal{X}} \hat{V}(y)\, Q[dy \mid x, a] \right\} \qquad (3)$$

which is the right hand side of the optimality equation (2) with $\hat{V}$ instead of $V^*$. A method for solving problem (3) is described in Section 3. Algorithm 1 already indicates that the development of the approximating function $\hat{V}$ requires a lot of computational effort. The effort is required to determine appropriate parameters for the subproblems and to solve all the subproblems. This effort is required only once, at the beginning of the control of the IRP process
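Problem (3) amounts to a one-step greedy lookahead with respect to $\hat{V}$. A minimal sketch, with the integral replaced by a sample average over demand draws (the function names, the enumeration of decisions as delivery vectors, and the sample-average device are our own simplifications, not the paper's actual method for (3)):

```python
import numpy as np

def greedy_decision(x, feasible_decisions, g_hat, v_hat, sample_demands, alpha=0.9):
    """Choose a decision by maximizing g(x,a) + alpha * E[Vhat(next state)],
    the right hand side of (3), with the expectation estimated by sampling.

    feasible_decisions : iterable of delivery vectors d(a), standing in for A(x)
    g_hat(x, d)        : single-stage reward estimate
    v_hat(y)           : value function approximation at state y
    sample_demands     : (K, N) demand samples drawn from F
    """
    x = np.asarray(x)
    best_d, best_val = None, -np.inf
    for d in feasible_decisions:
        d = np.asarray(d, dtype=float)
        x_next = np.maximum(x + d - sample_demands, 0.0)   # (K, N) sampled next states
        val = g_hat(x, d) + alpha * np.mean([v_hat(y) for y in x_next])
        if val > best_val:
            best_d, best_val = d, val
    return best_d, best_val

# Tiny illustration: two candidate delivery vectors for two customers.
feasible = [np.zeros(2), np.array([2.0, 1.0])]
best_d, best_val = greedy_decision(
    x=[1.0, 1.0], feasible_decisions=feasible,
    g_hat=lambda x, d: -float(d.sum()),   # placeholder: delivery cost only
    v_hat=lambda y: float(y.sum()),       # crude Vhat: total inventory on hand
    sample_demands=np.ones((4, 2)))
```

Enumerating $A(x)$ explicitly is only viable for tiny instances; the point of the sketch is the structure of the daily decision, not its efficient solution.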

Algorithm 1 Procedure for computing $\hat{V}$ and $\hat{\pi}$.

1. Start with an initial policy $\hat{\pi}_0$. Set $i \leftarrow 0$.
2. Simulate the IRP under policy $\hat{\pi}_0$ to estimate the subproblem parameters.
3. Solve the subproblems.
4. $\hat{V}$ is determined by the optimal value functions of the subproblems.
5. Policy $\hat{\pi}_1$ is defined by equation (4).
6. Repeat steps 7 through 11 for a chosen number of iterations, or until a convergence test is satisfied.
7. Increment $i \leftarrow i + 1$.
8. Simulate the IRP under policy $\hat{\pi}_i$ to update the estimates of the subproblem parameters.
9. With the updated estimates of the subproblem parameters, solve the updated subproblems.
10. $\hat{V}$ is determined by the optimal value functions of the updated subproblems.
11. Policy $\hat{\pi}_{i+1}$ is given by equation (4).

(although, in practice, $\hat{V}$ may have to be changed if the parameters of the MDP change), so that a substantial effort for this initial computational task seems to be acceptable. In contrast, once the approximating function $\hat{V}$ has been constructed, only the daily problem (3) has to be solved at each stage of the IRP process, each time for a given value of the state x. Because the daily problem has to be solved many times, it is important that this computational task can be performed with relatively little effort.

1.5 Review of Related Literature

In this section we give a brief review of related literature on the inventory routing problem (Section 1.5.1) and on dynamic programming approximations (Section 1.5.2). The review is not comprehensive.

1.5.1 Inventory Routing Literature

A large variety of deterministic and stochastic models of inventory routing problems have been formulated, and a variety of heuristics and bounds have been produced. A classification of the inventory routing literature is given in Kleywegt et al. (2002). Bell et al. (1983) propose an integer program for the inventory routing problem at Air Products, a producer of products such as liquid nitrogen.
Dror, Ball, and Golden (1985) and Dror and Ball (1987) construct a solution for a short-term planning period based on identifying, for each customer, the optimal replenishment day $t^*$ and the expected increase in cost if the customer is visited on day t instead of $t^*$. An integer program is then solved that assigns customers to a vehicle and a day, or just a day, in a way that minimizes the sum of these costs plus the transportation costs. Dror and Levy (1986) use a similar method to construct a

weekly schedule, and then apply node and arc exchanges to reduce costs in the planning period. Trudeau and Dror (1992) apply similar ideas to the case in which inventories are observable only at delivery times. Bard et al. (1998) follow a rolling horizon approach to an inventory routing problem with satellite facilities where trucks can be refilled. To choose the customers to be visited during the next two weeks, they determine an optimal replenishment frequency for each customer, similar to the approach in Dror, Ball, and Golden (1985) and Dror and Ball (1987). Federgruen and Zipkin (1984) formulate an inventory routing problem quite similar to the one in Section 1.2, except that they focus on solving the myopic single-stage problem $\max_{a \in A(x)} g(x, a)$, which is a nonlinear integer program. Golden, Assad, and Dahl (1984) also propose a heuristic to solve the myopic single-stage problem $\max_{a \in A(x)} g(x, a)$, while maintaining an adequate inventory at all customers. Chien, Balakrishnan, and Wong (1989) also propose an integer programming based heuristic to solve the single-stage problem, but they attempt to find a solution that is less myopic than that of Federgruen and Zipkin (1984) and Golden, Assad, and Dahl (1984), by passing information from one day to the next. Anily and Federgruen (1990, 1991, 1993) analyze fixed partition policies for the inventory routing problem with constant deterministic demand rates and an unlimited number of vehicles. They also find lower and upper bounds on the minimum long-run average cost over all fixed partition policies, and propose a heuristic, called modified circular regional partitioning, to choose a fixed partition. Gallego and Simchi-Levi (1990) use an approach similar to that of Anily and Federgruen (1990) to evaluate the long-run effectiveness of direct deliveries (one customer on each route).
Bramel and Simchi-Levi (1995) also study fixed partition policies for the deterministic inventory routing problem with an unlimited number of vehicles. They propose a location based heuristic, based on the capacitated concentrator location problem (CCLP), to choose a fixed partition. The tour through each subset of customers is constructed while solving the CCLP, using a nearest insertion heuristic. Chan, Federgruen, and Simchi-Levi (1998) analyze zero-inventory ordering policies, in which a customer's inventory is replenished only when the customer's inventory has been depleted, and fixed partition policies, also for the deterministic inventory routing problem with an unlimited number of vehicles. They derive asymptotic worst-case bounds on the performance of the policies. They also propose a heuristic based on the CCLP, similar to that of Bramel and Simchi-Levi (1995), for determining a fixed partition of the set of customers. Gaur and Fisher (2002) consider a deterministic inventory routing problem with time varying demand. They propose a randomized heuristic to find a fixed partition policy with periodic deliveries. Their method was implemented for a supermarket chain. Burns et al. (1985) develop approximating equations for both a direct delivery policy and a policy in which vehicles visit multiple customers on a route. Minkoff (1993) also formulates the inventory routing problem as an MDP, focusing on the case with an unlimited number of vehicles, and proposes a decomposition heuristic to reduce the computational effort.

The heuristic solves a linear program to allocate joint transportation costs to individual customers, and then solves individual customer subproblems. The value functions of the subproblems are added to approximate the value function of the combined problem. Minkoff's work differs from ours in the following aspects: (1) we consider the case with a limited number of vehicles, (2) we define subproblems involving one or more customers, and the subproblems are defined differently, one reason being that the bound on the number of vehicles has to be addressed in our subproblems, and (3) we solve an optimization problem to combine the results of the subproblems. Webb and Larson (1995) propose a solution for the problem of determining the minimum fleet size for an inventory routing system. Their work is related to Larson's earlier work on fleet sizing and inventory routing (Larson, 1988). Bassok and Ernst (1995) consider the problem of delivering multiple products to customers on a fixed tour. The optimal policy for each product is characterized by a sequence of critical numbers, similar to an optimal policy found by Topkis (1968). Barnes-Schuster and Bassok (1997) study the cost effectiveness of a particular direct delivery policy for the inventory routing problem. Kleywegt et al. (2002) also consider the special case with direct deliveries. An MDP model of the inventory routing problem is formulated, and a dynamic programming approximation method is developed to find a policy. Herer and Roundy (1997) propose several heuristics to construct power-of-two policies for the inventory routing problem with constant deterministic demand rates and an unlimited number of vehicles, and they prove performance bounds for the heuristics. Viswanathan and Mathur (1997) propose an insertion heuristic to construct a power-of-two policy for the inventory routing problem with multiple products, constant deterministic demand rates, and an unlimited number of vehicles. Reiman et al.
(1999) perform a heavy traffic analysis for three types of policies for the inventory routing problem with a single vehicle. Çetinkaya and Lee (2000) study a problem in which the vendor accumulates customer orders over time intervals of length T, and then delivers the customer orders at the end of each time interval. Bertazzi et al. (2002) consider a deterministic inventory routing problem with a single capacitated vehicle. Each customer has a specified minimum and maximum inventory level. They propose a heuristic to determine the vehicle route at each discrete time point, while following an order-up-to policy, that is, each time a customer is visited the inventory at the customer is replenished to the specified maximum inventory level. They consider the impact of different objective functions. The inventory pickup and delivery problem is quite similar to the inventory routing problem. In the inventory pickup and delivery problem, there are multiple sources of a single product, multiple demand points, and multiple vehicles. The vehicles are scheduled to travel alternately between sources and demand points to replenish the inventory at the demand points. Christiansen and Nygreen (1998a, 1998b) present

a path flow formulation and column generation method for the inventory pickup and delivery problem with time windows (IPDPTW). Christiansen (1999) presents an arc flow formulation for the IPDPTW.

Dynamic Programming Approximation Literature

Dynamic programming, or Markov decision processes, provides a versatile and widely used framework for modeling dynamic and stochastic optimal control problems. However, a major shortcoming is that for many interesting applications an optimal policy cannot be computed, because (1) the state space X is too big to compute and store the optimal value V(x) and an optimal decision π(x) for each state x; and/or (2) the expected value in (2), which often is a high dimensional integral, cannot be computed exactly; and/or (3) the single stage optimization problem on the right hand side of (2) cannot be solved exactly. In this section we briefly mention some of the work that has been done to address the first issue, that is, how to attack problems with large state spaces. The second issue makes up a large part of the field of statistics, and the third issue makes up a large part of the field of optimization; these fields are not reviewed here. A natural approach for attacking MDPs with large state spaces, which is also the approach used in this paper, is to approximate the optimal value function V with an approximating function V̂. It is shown in Section 2 that a good approximation V̂ of the optimal value function V can be used to find a good policy π̂. Some of the early work on this approach is that of Bellman and Dreyfus (1959), who propose using Legendre polynomials inductively to approximate the optimal value function of a finite horizon MDP. Chang (1966), Bellman et al. (1963), and Schweitzer and Seidman (1985) also study the approximation of V with polynomials, especially orthogonal polynomials such as Legendre and Chebyshev polynomials.
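As a minimal illustration of the idea behind such polynomial approximations (a sketch, not from the paper: the function V below is a made-up stand-in for a value function, not the IRP value function), one can interpolate sampled values at Chebyshev nodes and obtain a compact polynomial surrogate:

```python
# Illustrative sketch: approximate a value function V on [0, 1] by a
# degree-8 polynomial interpolated at Chebyshev nodes. V is a made-up
# smooth stand-in, not the IRP value function.
import math

def V(x):
    # stand-in for an optimal value function (assumed smooth)
    return 1.0 / (1.0 + x) + 0.5 * x * x

n = 8  # polynomial degree
# Chebyshev nodes on [-1, 1], mapped to [0, 1]
nodes = [0.5 + 0.5 * math.cos((2 * k + 1) * math.pi / (2 * (n + 1)))
         for k in range(n + 1)]
vals = [V(x) for x in nodes]

def V_hat(x):
    """Lagrange form of the interpolating polynomial through the nodes."""
    total = 0.0
    for k, xk in enumerate(nodes):
        term = vals[k]
        for m, xm in enumerate(nodes):
            if m != k:
                term *= (x - xm) / (xk - xm)
        total += term
    return total

# The interpolant is a tight eps-approximation of V on [0, 1]
eps = max(abs(V(t / 100) - V_hat(t / 100)) for t in range(101))
assert eps < 1e-4
```

Nine stored coefficients stand in for a table of values over the whole interval, which is the appeal of such approximations when the state space is large.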
Approximations using splines are suggested by Daniel (1976), and approximations using regression splines by Chen et al. (1999). Recently a lot of work has been done on parameterized approximations. Some of this work was motivated by approaches proposed for reinforcement learning; Sutton and Barto (1998) give an overview. Tsitsiklis and Van Roy (1996), Van Roy and Tsitsiklis (1996), Bertsekas and Tsitsiklis (1996), and De Farias and Van Roy (2000) study the estimation of the parameters of these approximating functions for infinite horizon discounted MDPs, and Tsitsiklis and Van Roy (1999a) consider estimation for long-run average cost MDPs. Value function approximations are proposed for specific applications by Van Roy et al. (1997), Powell and Carvalho (1998), Tsitsiklis and Van Roy (1999b), Secomandi (2000), and Kleywegt et al. (2002). In many models the state space is uncountable and the transition and cost functions are too complex for closed form solutions to be obtained. Discretization methods and convergence results for such problems are discussed in Wong (1970a), Fox (1973), Bertsekas (1975), Kushner (1990), Chow and Tsitsiklis (1991), and Kushner and Dupuis (1992). Another natural approach for attacking a large-scale MDP is to decompose the MDP into smaller related

MDPs, which are easier to solve, and then to use the solutions of the smaller MDPs to obtain a good solution for the original MDP. Decomposition methods are discussed in Wong (1970b), Collins and Lew (1970), Collins (1970), Collins and Angel (1971), Courtois (1977), Courtois and Semal (1984), Stewart (1984), and Kleywegt et al. (2002). Some general state space reduction methods that include many of the methods mentioned above are analyzed in Whitt (1978, 1979a, 1979b), Hinderer (1976, 1978), Hinderer and Hübner (1977), and Haurie and L'Ecuyer (1986). Surveys are given in Morin (1978), and Rogers et al. (1991).

2 Value Function Approximation

The first major step in solving the IRP is the construction of an approximation V̂ to the optimal value function V. A good approximating function V̂ can then be used to find a good policy π̂, in the sense described next. Suppose that ‖V − V̂‖ < ε, that is, V̂ is an ε-approximation of V. Also suppose that the stationary deterministic policy π̂ satisfies

    g(x, π̂(x)) + α Σ_{y ∈ X} V̂(y) Q[y | x, π̂(x)]  ≥  sup_{a ∈ A(x)} { g(x, a) + α Σ_{y ∈ X} V̂(y) Q[y | x, a] } − δ     (4)

for all x ∈ X, that is, decision π̂(x) is within δ of the optimal decision using approximating function V̂ on the right hand side of the optimality equation (2). Then

    V^π̂(x) ≥ V(x) − (2αε + δ)/(1 − α)

for all x ∈ X, that is, the value function V^π̂ of policy π̂ is within (2αε + δ)/(1 − α) of the optimal value function V. This observation is the motivation for putting in the effort to construct a good approximating function V̂. This section describes the construction of V̂; the decisions referred to in this section are used only for the purpose of motivating the approximation V̂, and are not used to control the IRP process.
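The bound above can be checked on a toy example (a sketch with made-up data, not the IRP model): build a small discounted MDP, perturb its optimal value function by ε, act greedily (δ = 0) with respect to the perturbation, and verify that the resulting policy loses at most (2αε + δ)/(1 − α):

```python
# Toy check of the performance bound (all data made up for illustration):
# a greedy policy w.r.t. an eps-approximate value function V_hat is
# within (2*alpha*eps + delta)/(1 - alpha) of optimal (here delta = 0).
states, actions, alpha = [0, 1, 2], [0, 1], 0.9

def g(x, a):  # single-stage reward g(x, a) (assumed data)
    return [[1.0, 0.5], [0.2, 0.8], [0.0, 0.3]][x][a]

P = {  # transition probabilities Q[y | x, a] (assumed data)
    (0, 0): [0.7, 0.2, 0.1], (0, 1): [0.1, 0.6, 0.3],
    (1, 0): [0.3, 0.4, 0.3], (1, 1): [0.2, 0.2, 0.6],
    (2, 0): [0.5, 0.3, 0.2], (2, 1): [0.1, 0.1, 0.8],
}

def backup(V, x, a):
    return g(x, a) + alpha * sum(V[y] * P[(x, a)][y] for y in states)

# Value iteration for the optimal value function V
V = {x: 0.0 for x in states}
for _ in range(2000):
    V = {x: max(backup(V, x, a) for a in actions) for x in states}

eps = 0.05  # construct an eps-approximation V_hat of V
V_hat = {x: V[x] + (eps if x % 2 == 0 else -eps) for x in states}

# Greedy (delta = 0) policy pi_hat with respect to V_hat, as in (4)
pi_hat = {x: max(actions, key=lambda a: backup(V_hat, x, a)) for x in states}

# Policy evaluation of pi_hat
V_pi = {x: 0.0 for x in states}
for _ in range(2000):
    V_pi = {x: backup(V_pi, x, pi_hat[x]) for x in states}

bound = (2 * alpha * eps + 0) / (1 - alpha)
assert all(V[x] - V_pi[x] <= bound + 1e-9 for x in states)
```

In practice the actual loss is often far smaller than the bound; the bound is what guarantees that effort spent tightening ε pays off in policy quality.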
The decisions used to control the IRP process are described in a subsequent section.

2.1 Subproblem Definition

To approximate the optimal value function V, we decompose the IRP into subproblems, and then combine the subproblem results using another optimization problem, described in Section 2.2, to produce the approximating function V̂. Each subproblem is a Markov decision process involving a subset of customers. The subsets of customers do not necessarily partition the set of customers, but must cover the set of customers,

and it must be possible to form a partition with a subcollection of the subsets. The approach we followed was to define subproblems for each subset of customers that can be visited on a single vehicle route. Thus each single customer forms a subset, and in addition there are a variety of subsets with multiple customers. Hence, the cover and partition conditions referred to above are automatically satisfied. After the subsets of customers have been identified, a subproblem has to be defined (a model has to be constructed) for each subset. That involves determining appropriate parameters and parameter values for the MDP of each subset. An appealing idea is to choose the parameters and parameter values of each subproblem so that the subproblem represents the overall IRP process as experienced by the subset of customers. There are several obstacles in the way of implementing such an idea. First, the overall process depends on the policy controlling the process, and an optimal policy is not known. Second, even with a given policy for controlling the overall process, it is still hard to determine appropriate parameters and parameter values for each subproblem so that the combined subproblems give a good representation of the overall process. This section, including Subsections 2.1.1 and 2.1.2, is devoted to the modeling of the subproblems, that is, the determination of parameters and parameter values for each subproblem. It has the interesting feature that simulation is used in the process of constructing the subproblem models. Issues that have to be addressed are the following.

1. One question is how many vehicles are available for a given subproblem. This issue comes about because, in the overall IRP process, several subsets compete for the M vehicles, and thus, at any given time, not all M vehicles will be available to any given subset.
Also, a vehicle may visit customers in the subset as well as customers not in the subset, and thus not all of a vehicle's capacity C may be available to the given subset. Thus, the availability of vehicles and vehicle capacity to subsets of customers (and therefore in subproblems) has to be modeled.

2. Transition probabilities have to be determined for the subproblems. The transition probabilities of the inventory levels are determined by the demand distribution F as before. In addition, for the subproblems we also address the transition probabilities of vehicle availability to the subset of customers.

In the description of the subproblems, we sometimes refer to the overall process, and sometimes to the models of the individual subproblems; we attempt to keep the distinctions as well as the similarities clear. To simplify notation, the modeling of the subproblems is described for a two-customer subproblem; the models for the subproblems with one or more than two customers are similar. A two-customer subproblem for subset {i, j} is denoted by MDP_ij. The method presented in this section is for a discrete demand distribution F and a discrete state space X, which may come about naturally due to the nature of the product or because of discretization of the demand distribution and the state space. Let the support of F be denoted by U_1 × ··· × U_N, and let f_ij denote the (marginal) probability mass function

of the demand of customers i and j, that is, f_ij(u_i, u_j) ≡ F[U_1 × ··· × {u_i} × ··· × {u_j} × ··· × U_N] denotes the probability that the demand at customer i is u_i and the demand at customer j is u_j. Recall that the idea is to define each subproblem so that it gives an accurate representation of the overall process as experienced by the subset of customers. Clearly, the state of a subproblem has to include the inventory level at each of the customers in the subproblem. Furthermore, to capture information about the availability of vehicles for delivering to the customers in the subproblem, the state of a subproblem also includes a component with information about the vehicle availability to the subset of customers. To determine possible values of the vehicle availability component v_ij of the state of subproblem MDP_ij, consider the different ways in which the customers i and j can be visited in the overall IRP process. For simplicity, we assume that each customer is visited at most once per day. Consequently, on any day, the subset of two customers can be visited by 0, 1, or 2 vehicles. Hence, in subproblem MDP_ij, at any point in time, either 0, 1, or 2 vehicles are available to the subset of two customers. The simplest case is the case with no vehicles available for delivering to customers i and j (denoted by v_ij = 0 in subproblem MDP_ij). When 1 or 2 vehicles are available to the subset of two customers, we also have to specify how much of those vehicles' capacities are available to the subset of customers, because those same vehicles may also make deliveries to customers other than i or j on a route. Consider the different ways in which one vehicle could deliver to i and/or j in the overall IRP process. There are the following six possibilities:

1. exclusive delivery to i,
2. exclusive delivery to j,
3. exclusive delivery to i and j (no deliveries to other customers),
4. fraction of vehicle capacity delivered to i and no delivery to j,
5. fraction of vehicle capacity delivered to j and no delivery to i,
6. fraction of vehicle capacity delivered to i and j plus delivery to other customers.

The first three possibilities are represented by the same vehicle availability component in subproblem MDP_ij (denoted by v_ij = a), because in all three cases one vehicle is available exclusively for customers in the subproblem. The other possibilities are denoted by v_ij = b, c, d, respectively, in subproblem MDP_ij. Next consider the different ways in which two vehicles could deliver to i and j in the overall IRP process. There are the following four possibilities:

1. exclusive delivery to i and j (no deliveries to other customers),
2. exclusive delivery to i, fraction of vehicle capacity delivered to j,

3. exclusive delivery to j, fraction of vehicle capacity delivered to i,
4. fraction of vehicle capacity delivered to i and fraction of vehicle capacity delivered to j (with different vehicles visiting i and j, each also delivering to other customers).

These possibilities are denoted by v_ij = e, f, g, h, respectively, in subproblem MDP_ij. Whenever a vehicle is available for delivering a fraction of its capacity to one or both of the customers in the subset, the model for subproblem MDP_ij also needs to specify what portion of the vehicle's capacity is available to the subset. For example, when the vehicle availability v_ij ∈ {b, c, d}, one vehicle with a fraction of the capacity C is available to the two-customer subset; when v_ij = h, two vehicles, each with a fraction of the capacity C, are available to the subset; and when v_ij ∈ {f, g}, two vehicles, one with capacity C and one with a fraction of the capacity C, are available to the subset. Each of the subproblem vehicle availabilities v_ij ∈ {b, g, h} corresponds to a situation in the overall IRP in which a vehicle visits i and a customer not in {i, j}, but the same vehicle does not visit j. The fractional capacity associated with the vehicle availabilities v_ij ∈ {b, g} is the same and is denoted by λ^i_ij ∈ [0, C]. Similarly, the fractional capacity associated with the vehicle availabilities v_ij ∈ {c, f} is the same and is denoted by λ^j_ij ∈ [0, C]. When the vehicle availability is v_ij = h, one vehicle with fractional capacity λ^i_ij and another vehicle with fractional capacity λ^j_ij are available to the subset. Finally, when the vehicle availability is v_ij = d, the fractional capacity available to the subset is denoted by λ^ij_ij ∈ [0, C]. Table 1 summarizes the vehicle availability values v_ij and associated available capacities for a two-customer subproblem MDP_ij. Note that for the subproblem, it is sufficient to know the (possibly fractional) capacities available to the subset. The subproblem decision determines how the capacities will be used to serve customers i and j. A subsequent subsection explains how simulation is used to choose appropriate values for these λ-parameters.

Table 1: Vehicle availability values v_ij and associated capacities for a two-customer subproblem MDP_ij.

  v_ij value | Vehicle capacities available to customer subset {i, j}
  -----------|-------------------------------------------------------
  0          | None
  a          | One vehicle with capacity C
  b          | One vehicle with capacity λ^i_ij
  c          | One vehicle with capacity λ^j_ij
  d          | One vehicle with capacity λ^ij_ij
  e          | Two vehicles, each with capacity C
  f          | Two vehicles, one with capacity C, and one with capacity λ^j_ij
  g          | Two vehicles, one with capacity λ^i_ij, and one with capacity C
  h          | Two vehicles, one with capacity λ^i_ij, and one with capacity λ^j_ij

Each two-customer subproblem MDP_ij is a discrete time Markov decision process, and is defined as follows.

1. The state space is X_ij = {0, 1, ..., C_i} × {0, 1, ..., C_j} × {0, a, b, c, d, e, f, g, h}. State (x_i, x_j, v_ij) denotes that the inventory levels at customers i and j are x_i and x_j, and the vehicle availability is v_ij. Let X_it ∈ {0, 1, ..., C_i} denote the random inventory level at customer i at time t, and let V_ijt denote the random vehicle availability at time t.

2. For any subproblem state (x_i, x_j, v_ij), let A_ij(x_i, x_j, v_ij) denote the set of feasible subproblem decisions when the subproblem process is in state (x_i, x_j, v_ij). A decision a_ij ∈ A_ij(x_i, x_j, v_ij) contains information about (1) which of customers i and j to replenish, (2) how much to deliver at each of customers i and j, and (3) how to combine customers i and j into vehicle routes. (For a two-customer subproblem, the routing aspect of the decision is easy.) Let d_i(a_ij) denote the quantity of product that is delivered to customer i while executing decision a_ij. The feasible decisions a_ij ∈ A_ij(x_i, x_j, v_ij) satisfy the following constraints when the subproblem state is (x_i, x_j, v_ij). When the vehicle availability is v_ij = 0, then no vehicles can be sent to customers i and j, and d_i(a_ij) = d_j(a_ij) = 0. When v_ij = a, then one vehicle can be sent to customers i and j, and d_i(a_ij) + d_j(a_ij) ≤ C, x_i + d_i(a_ij) ≤ C_i, and x_j + d_j(a_ij) ≤ C_j. When v_ij = b, then one vehicle can be sent to customer i, no vehicle can be sent to customer j, and d_i(a_ij) ≤ min{λ^i_ij, C_i − x_i}, and d_j(a_ij) = 0. Feasible decisions are determined similarly if v_ij = c. When v_ij = d, then one vehicle can be sent to customers i and j, and d_i(a_ij) + d_j(a_ij) ≤ λ^ij_ij, x_i + d_i(a_ij) ≤ C_i, and x_j + d_j(a_ij) ≤ C_j. When v_ij = e, then one vehicle can be sent to each of customers i and j, and d_i(a_ij) ≤ min{C, C_i − x_i}, and d_j(a_ij) ≤ min{C, C_j − x_j}. When v_ij = f, then one vehicle can be sent to each of customers i and j, and d_i(a_ij) ≤ min{C, C_i − x_i}, and d_j(a_ij) ≤ min{λ^j_ij, C_j − x_j}.
Feasible decisions are determined similarly if v_ij = g. Finally, when v_ij = h, then both i and j can be visited by a vehicle each, and d_i(a_ij) ≤ min{λ^i_ij, C_i − x_i}, and d_j(a_ij) ≤ min{λ^j_ij, C_j − x_j}. As for the overall IRP, let the random variable A_ijt ∈ A_ij(X_it, X_jt, V_ijt) denote the decision chosen at time t.

3. The transition probabilities of the subproblems have to incorporate the probability distribution of customer demands, as well as the probabilities of vehicle availabilities to the subset of customers. Because we assume that the probability distribution f_ij of customer demands is known, the transition probabilities of the inventory levels can be determined for the subproblems as for the overall IRP. In the overall IRP process, the probabilities of vehicle availabilities to a subset of customers depend on the policy used to control the process, and are not directly obtainable from the input data of the IRP. Thus, some additional effort is required to make the transition probabilities of vehicle availabilities in the subproblems representative of what happens in the overall IRP. The basic idea is described next, and more details are provided in a later subsection. Consider any policy π ∈ Π for the IRP with unique stationary probability ν^π(x) for each x ∈ X. (Thus, as indicated in Algorithm 1, the formulation
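The feasibility constraints in item 2 above can be sketched directly in code; the capacities and λ-values below are illustrative placeholders, not values from the paper:

```python
# Hypothetical sketch of the feasible-delivery constraints for a
# two-customer subproblem. All parameter values are made up.
C = 10            # vehicle capacity
C_i, C_j = 8, 6   # storage capacities at customers i and j
lam_i, lam_j, lam_ij = 4, 3, 7   # the lambda-parameters (fractional capacities)

def feasible_deliveries(x_i, x_j, v_ij):
    """Enumerate integer delivery pairs (d_i, d_j) feasible in state (x_i, x_j, v_ij)."""
    pairs = []
    for d_i in range(C_i - x_i + 1):          # x_i + d_i <= C_i
        for d_j in range(C_j - x_j + 1):      # x_j + d_j <= C_j
            if v_ij == "0":                   # no vehicle available
                ok = d_i == 0 and d_j == 0
            elif v_ij == "a":                 # one vehicle exclusively for {i, j}
                ok = d_i + d_j <= C
            elif v_ij == "b":                 # one vehicle, fraction lam_i, i only
                ok = d_i <= lam_i and d_j == 0
            elif v_ij == "c":                 # one vehicle, fraction lam_j, j only
                ok = d_j <= lam_j and d_i == 0
            elif v_ij == "d":                 # one vehicle, fraction lam_ij, both
                ok = d_i + d_j <= lam_ij
            elif v_ij == "e":                 # two exclusive vehicles
                ok = d_i <= C and d_j <= C
            elif v_ij == "f":                 # full vehicle to i, fraction to j
                ok = d_i <= C and d_j <= lam_j
            elif v_ij == "g":                 # fraction to i, full vehicle to j
                ok = d_i <= lam_i and d_j <= C
            else:                             # v_ij == "h": fractions to both
                ok = d_i <= lam_i and d_j <= lam_j
            if ok:
                pairs.append((d_i, d_j))
    return pairs

# With no vehicle available, the only feasible decision delivers nothing.
assert feasible_deliveries(5, 4, "0") == [(0, 0)]
```

Enumerating the feasible set this way is only practical because the subproblem involves two customers; it is exactly this small action space that makes the subproblems tractable compared to the overall IRP.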


More information

Column Generation Method for an Agent Scheduling Problem

Column Generation Method for an Agent Scheduling Problem Column Generation Method for an Agent Scheduling Problem Balázs Dezső Alpár Jüttner Péter Kovács Dept. of Algorithms and Their Applications, and Dept. of Operations Research Eötvös Loránd University, Budapest,

More information

General properties of staircase and convex dual feasible functions

General properties of staircase and convex dual feasible functions General properties of staircase and convex dual feasible functions JÜRGEN RIETZ, CLÁUDIO ALVES, J. M. VALÉRIO de CARVALHO Centro de Investigação Algoritmi da Universidade do Minho, Escola de Engenharia

More information

Stuck in Traffic (SiT) Attacks

Stuck in Traffic (SiT) Attacks Stuck in Traffic (SiT) Attacks Mina Guirguis Texas State University Joint work with George Atia Traffic 2 Intelligent Transportation Systems V2X communication enable drivers to make better decisions: Avoiding

More information

Faster parameterized algorithms for Minimum Fill-In

Faster parameterized algorithms for Minimum Fill-In Faster parameterized algorithms for Minimum Fill-In Hans L. Bodlaender Pinar Heggernes Yngve Villanger Abstract We present two parameterized algorithms for the Minimum Fill-In problem, also known as Chordal

More information

The Encoding Complexity of Network Coding

The Encoding Complexity of Network Coding The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email: mikel,spalex,bruck @caltech.edu Abstract In the multicast network

More information

12.1 Formulation of General Perfect Matching

12.1 Formulation of General Perfect Matching CSC5160: Combinatorial Optimization and Approximation Algorithms Topic: Perfect Matching Polytope Date: 22/02/2008 Lecturer: Lap Chi Lau Scribe: Yuk Hei Chan, Ling Ding and Xiaobing Wu In this lecture,

More information

Chordal deletion is fixed-parameter tractable

Chordal deletion is fixed-parameter tractable Chordal deletion is fixed-parameter tractable Dániel Marx Institut für Informatik, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany. dmarx@informatik.hu-berlin.de Abstract. It

More information

THREE LECTURES ON BASIC TOPOLOGY. 1. Basic notions.

THREE LECTURES ON BASIC TOPOLOGY. 1. Basic notions. THREE LECTURES ON BASIC TOPOLOGY PHILIP FOTH 1. Basic notions. Let X be a set. To make a topological space out of X, one must specify a collection T of subsets of X, which are said to be open subsets of

More information

The Cross-Entropy Method

The Cross-Entropy Method The Cross-Entropy Method Guy Weichenberg 7 September 2003 Introduction This report is a summary of the theory underlying the Cross-Entropy (CE) method, as discussed in the tutorial by de Boer, Kroese,

More information

56:272 Integer Programming & Network Flows Final Exam -- December 16, 1997

56:272 Integer Programming & Network Flows Final Exam -- December 16, 1997 56:272 Integer Programming & Network Flows Final Exam -- December 16, 1997 Answer #1 and any five of the remaining six problems! possible score 1. Multiple Choice 25 2. Traveling Salesman Problem 15 3.

More information

Hierarchical Average Reward Reinforcement Learning Mohammad Ghavamzadeh Sridhar Mahadevan CMPSCI Technical Report

Hierarchical Average Reward Reinforcement Learning Mohammad Ghavamzadeh Sridhar Mahadevan CMPSCI Technical Report Hierarchical Average Reward Reinforcement Learning Mohammad Ghavamzadeh Sridhar Mahadevan CMPSCI Technical Report 03-19 June 25, 2003 Department of Computer Science 140 Governors Drive University of Massachusetts

More information

Chapter 1. Introduction

Chapter 1. Introduction Chapter 1 Introduction A Monte Carlo method is a compuational method that uses random numbers to compute (estimate) some quantity of interest. Very often the quantity we want to compute is the mean of

More information

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize.

Lecture notes on the simplex method September We will present an algorithm to solve linear programs of the form. maximize. Cornell University, Fall 2017 CS 6820: Algorithms Lecture notes on the simplex method September 2017 1 The Simplex Method We will present an algorithm to solve linear programs of the form maximize subject

More information

Neuro-Dynamic Programming An Overview

Neuro-Dynamic Programming An Overview 1 Neuro-Dynamic Programming An Overview Dimitri Bertsekas Dept. of Electrical Engineering and Computer Science M.I.T. May 2006 2 BELLMAN AND THE DUAL CURSES Dynamic Programming (DP) is very broadly applicable,

More information

Adversarial Policy Switching with Application to RTS Games

Adversarial Policy Switching with Application to RTS Games Adversarial Policy Switching with Application to RTS Games Brian King 1 and Alan Fern 2 and Jesse Hostetler 2 Department of Electrical Engineering and Computer Science Oregon State University 1 kingbria@lifetime.oregonstate.edu

More information

1 Linear programming relaxation

1 Linear programming relaxation Cornell University, Fall 2010 CS 6820: Algorithms Lecture notes: Primal-dual min-cost bipartite matching August 27 30 1 Linear programming relaxation Recall that in the bipartite minimum-cost perfect matching

More information

Next-Event Simulation

Next-Event Simulation Next-Event Simulation Lawrence M. Leemis and Stephen K. Park, Discrete-Event Simulation - A First Course, Prentice Hall, 2006 Hui Chen Computer Science Virginia State University Petersburg, Virginia March

More information

Basis Functions. Volker Tresp Summer 2017

Basis Functions. Volker Tresp Summer 2017 Basis Functions Volker Tresp Summer 2017 1 Nonlinear Mappings and Nonlinear Classifiers Regression: Linearity is often a good assumption when many inputs influence the output Some natural laws are (approximately)

More information

Scheduling Algorithms to Minimize Session Delays

Scheduling Algorithms to Minimize Session Delays Scheduling Algorithms to Minimize Session Delays Nandita Dukkipati and David Gutierrez A Motivation I INTRODUCTION TCP flows constitute the majority of the traffic volume in the Internet today Most of

More information

Delay-minimal Transmission for Energy Constrained Wireless Communications

Delay-minimal Transmission for Energy Constrained Wireless Communications Delay-minimal Transmission for Energy Constrained Wireless Communications Jing Yang Sennur Ulukus Department of Electrical and Computer Engineering University of Maryland, College Park, M0742 yangjing@umd.edu

More information

A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks

A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 8, NO. 6, DECEMBER 2000 747 A Path Decomposition Approach for Computing Blocking Probabilities in Wavelength-Routing Networks Yuhong Zhu, George N. Rouskas, Member,

More information

Recursive column generation for the Tactical Berth Allocation Problem

Recursive column generation for the Tactical Berth Allocation Problem Recursive column generation for the Tactical Berth Allocation Problem Ilaria Vacca 1 Matteo Salani 2 Michel Bierlaire 1 1 Transport and Mobility Laboratory, EPFL, Lausanne, Switzerland 2 IDSIA, Lugano,

More information

CSE151 Assignment 2 Markov Decision Processes in the Grid World

CSE151 Assignment 2 Markov Decision Processes in the Grid World CSE5 Assignment Markov Decision Processes in the Grid World Grace Lin A484 gclin@ucsd.edu Tom Maddock A55645 tmaddock@ucsd.edu Abstract Markov decision processes exemplify sequential problems, which are

More information

Metaheuristic Optimization with Evolver, Genocop and OptQuest

Metaheuristic Optimization with Evolver, Genocop and OptQuest Metaheuristic Optimization with Evolver, Genocop and OptQuest MANUEL LAGUNA Graduate School of Business Administration University of Colorado, Boulder, CO 80309-0419 Manuel.Laguna@Colorado.EDU Last revision:

More information

Decomposition of log-linear models

Decomposition of log-linear models Graphical Models, Lecture 5, Michaelmas Term 2009 October 27, 2009 Generating class Dependence graph of log-linear model Conformal graphical models Factor graphs A density f factorizes w.r.t. A if there

More information

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation Optimization Methods: Introduction and Basic concepts 1 Module 1 Lecture Notes 2 Optimization Problem and Model Formulation Introduction In the previous lecture we studied the evolution of optimization

More information

10703 Deep Reinforcement Learning and Control

10703 Deep Reinforcement Learning and Control 10703 Deep Reinforcement Learning and Control Russ Salakhutdinov Machine Learning Department rsalakhu@cs.cmu.edu Policy Gradient I Used Materials Disclaimer: Much of the material and slides for this lecture

More information

Rollout Algorithms for Discrete Optimization: A Survey

Rollout Algorithms for Discrete Optimization: A Survey Rollout Algorithms for Discrete Optimization: A Survey by Dimitri P. Bertsekas Massachusetts Institute of Technology Cambridge, MA 02139 dimitrib@mit.edu August 2010 Abstract This chapter discusses rollout

More information

of optimization problems. In this chapter, it is explained that what network design

of optimization problems. In this chapter, it is explained that what network design CHAPTER 2 Network Design Network design is one of the most important and most frequently encountered classes of optimization problems. In this chapter, it is explained that what network design is? The

More information

Planning and Control: Markov Decision Processes

Planning and Control: Markov Decision Processes CSE-571 AI-based Mobile Robotics Planning and Control: Markov Decision Processes Planning Static vs. Dynamic Predictable vs. Unpredictable Fully vs. Partially Observable Perfect vs. Noisy Environment What

More information

Introduction to Optimization Problems and Methods

Introduction to Optimization Problems and Methods Introduction to Optimization Problems and Methods wjch@umich.edu December 10, 2009 Outline 1 Linear Optimization Problem Simplex Method 2 3 Cutting Plane Method 4 Discrete Dynamic Programming Problem Simplex

More information

Network Topology Control and Routing under Interface Constraints by Link Evaluation

Network Topology Control and Routing under Interface Constraints by Link Evaluation Network Topology Control and Routing under Interface Constraints by Link Evaluation Mehdi Kalantari Phone: 301 405 8841, Email: mehkalan@eng.umd.edu Abhishek Kashyap Phone: 301 405 8843, Email: kashyap@eng.umd.edu

More information

Topology and Topological Spaces

Topology and Topological Spaces Topology and Topological Spaces Mathematical spaces such as vector spaces, normed vector spaces (Banach spaces), and metric spaces are generalizations of ideas that are familiar in R or in R n. For example,

More information

CHAPTER 8 DISCUSSIONS

CHAPTER 8 DISCUSSIONS 153 CHAPTER 8 DISCUSSIONS This chapter discusses the developed models, methodologies to solve the developed models, performance of the developed methodologies and their inferences. 8.1 MULTI-PERIOD FIXED

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 5 Inference

More information

Simulation-Based Approximate Policy Iteration with Generalized Logistic Functions

Simulation-Based Approximate Policy Iteration with Generalized Logistic Functions Simulation-Based Approximate Policy Iteration with Generalized Logistic Functions Journal: INFORMS Journal on Computing Manuscript ID: JOC-0--OA- Manuscript Type: Original Article Date Submitted by the

More information

CS261: A Second Course in Algorithms Lecture #16: The Traveling Salesman Problem

CS261: A Second Course in Algorithms Lecture #16: The Traveling Salesman Problem CS61: A Second Course in Algorithms Lecture #16: The Traveling Salesman Problem Tim Roughgarden February 5, 016 1 The Traveling Salesman Problem (TSP) In this lecture we study a famous computational problem,

More information

Lecture 2 The k-means clustering problem

Lecture 2 The k-means clustering problem CSE 29: Unsupervised learning Spring 2008 Lecture 2 The -means clustering problem 2. The -means cost function Last time we saw the -center problem, in which the input is a set S of data points and the

More information

An Improved Policy Iteratioll Algorithm for Partially Observable MDPs

An Improved Policy Iteratioll Algorithm for Partially Observable MDPs An Improved Policy Iteratioll Algorithm for Partially Observable MDPs Eric A. Hansen Computer Science Department University of Massachusetts Amherst, MA 01003 hansen@cs.umass.edu Abstract A new policy

More information

6. Lecture notes on matroid intersection

6. Lecture notes on matroid intersection Massachusetts Institute of Technology 18.453: Combinatorial Optimization Michel X. Goemans May 2, 2017 6. Lecture notes on matroid intersection One nice feature about matroids is that a simple greedy algorithm

More information

Using Markov decision processes to optimise a non-linear functional of the final distribution, with manufacturing applications.

Using Markov decision processes to optimise a non-linear functional of the final distribution, with manufacturing applications. Using Markov decision processes to optimise a non-linear functional of the final distribution, with manufacturing applications. E.J. Collins 1 1 Department of Mathematics, University of Bristol, University

More information

NP-Hardness. We start by defining types of problem, and then move on to defining the polynomial-time reductions.

NP-Hardness. We start by defining types of problem, and then move on to defining the polynomial-time reductions. CS 787: Advanced Algorithms NP-Hardness Instructor: Dieter van Melkebeek We review the concept of polynomial-time reductions, define various classes of problems including NP-complete, and show that 3-SAT

More information

Solving Large Aircraft Landing Problems on Multiple Runways by Applying a Constraint Programming Approach

Solving Large Aircraft Landing Problems on Multiple Runways by Applying a Constraint Programming Approach Solving Large Aircraft Landing Problems on Multiple Runways by Applying a Constraint Programming Approach Amir Salehipour School of Mathematical and Physical Sciences, The University of Newcastle, Australia

More information

Discrete Optimization. Lecture Notes 2

Discrete Optimization. Lecture Notes 2 Discrete Optimization. Lecture Notes 2 Disjunctive Constraints Defining variables and formulating linear constraints can be straightforward or more sophisticated, depending on the problem structure. The

More information

Rigidity, connectivity and graph decompositions

Rigidity, connectivity and graph decompositions First Prev Next Last Rigidity, connectivity and graph decompositions Brigitte Servatius Herman Servatius Worcester Polytechnic Institute Page 1 of 100 First Prev Next Last Page 2 of 100 We say that a framework

More information

9.5 Equivalence Relations

9.5 Equivalence Relations 9.5 Equivalence Relations You know from your early study of fractions that each fraction has many equivalent forms. For example, 2, 2 4, 3 6, 2, 3 6, 5 30,... are all different ways to represent the same

More information

Surrogate Gradient Algorithm for Lagrangian Relaxation 1,2

Surrogate Gradient Algorithm for Lagrangian Relaxation 1,2 Surrogate Gradient Algorithm for Lagrangian Relaxation 1,2 X. Zhao 3, P. B. Luh 4, and J. Wang 5 Communicated by W.B. Gong and D. D. Yao 1 This paper is dedicated to Professor Yu-Chi Ho for his 65th birthday.

More information

Core Membership Computation for Succinct Representations of Coalitional Games

Core Membership Computation for Succinct Representations of Coalitional Games Core Membership Computation for Succinct Representations of Coalitional Games Xi Alice Gao May 11, 2009 Abstract In this paper, I compare and contrast two formal results on the computational complexity

More information

AN APPROXIMATE INVENTORY MODEL BASED ON DIMENSIONAL ANALYSIS. Victoria University, Wellington, New Zealand

AN APPROXIMATE INVENTORY MODEL BASED ON DIMENSIONAL ANALYSIS. Victoria University, Wellington, New Zealand AN APPROXIMATE INVENTORY MODEL BASED ON DIMENSIONAL ANALYSIS by G. A. VIGNAUX and Sudha JAIN Victoria University, Wellington, New Zealand Published in Asia-Pacific Journal of Operational Research, Vol

More information

Column Generation II : Application in Distribution Network Design

Column Generation II : Application in Distribution Network Design Column Generation II : Application in Distribution Network Design Teo Chung-Piaw (NUS) 27 Feb 2003, Singapore 1 Supply Chain Challenges 1.1 Introduction Network of facilities: procurement of materials,

More information

Linear Programming. Meaning of Linear Programming. Basic Terminology

Linear Programming. Meaning of Linear Programming. Basic Terminology Linear Programming Linear Programming (LP) is a versatile technique for assigning a fixed amount of resources among competing factors, in such a way that some objective is optimized and other defined conditions

More information

Mathematical preliminaries and error analysis

Mathematical preliminaries and error analysis Mathematical preliminaries and error analysis Tsung-Ming Huang Department of Mathematics National Taiwan Normal University, Taiwan August 28, 2011 Outline 1 Round-off errors and computer arithmetic IEEE

More information

Approximate Dynamic Programming for a Class of Long-Horizon Maritime Inventory Routing Problems

Approximate Dynamic Programming for a Class of Long-Horizon Maritime Inventory Routing Problems Approximate Dynamic Programming for a Class of Long-Horizon Maritime Inventory Routing Problems Dimitri J. Papageorgiou, Myun-Seok Cheon Corporate Strategic Research ExxonMobil Research and Engineering

More information

Comp Online Algorithms

Comp Online Algorithms Comp 7720 - Online Algorithms Notes 4: Bin Packing Shahin Kamalli University of Manitoba - Fall 208 December, 208 Introduction Bin packing is one of the fundamental problems in theory of computer science.

More information

A NETWORK SIMPLEX ALGORITHM FOR SOLVING THE MINIMUM DISTRIBUTION COST PROBLEM. I-Lin Wang and Shiou-Jie Lin. (Communicated by Shu-Cherng Fang)

A NETWORK SIMPLEX ALGORITHM FOR SOLVING THE MINIMUM DISTRIBUTION COST PROBLEM. I-Lin Wang and Shiou-Jie Lin. (Communicated by Shu-Cherng Fang) JOURNAL OF INDUSTRIAL AND doi:10.3934/jimo.2009.5.929 MANAGEMENT OPTIMIZATION Volume 5, Number 4, November 2009 pp. 929 950 A NETWORK SIMPLEX ALGORITHM FOR SOLVING THE MINIMUM DISTRIBUTION COST PROBLEM

More information

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18 601.433/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Approximation algorithms Date: 11/27/18 22.1 Introduction We spent the last two lectures proving that for certain problems, we can

More information

Cost Optimization in the (S 1, S) Lost Sales Inventory Model with Multiple Demand Classes

Cost Optimization in the (S 1, S) Lost Sales Inventory Model with Multiple Demand Classes Cost Optimization in the (S 1, S) Lost Sales Inventory Model with Multiple Demand Classes A.A. Kranenburg, G.J. van Houtum Department of Technology Management, Technische Universiteit Eindhoven, Eindhoven,

More information