Room 385
25 November 2016, 11:00
The Stochastic Shortest Path problem (SSP) is a natural extension of the deterministic shortest path problem in which traversing an `arc' may now lead to several destinations with different probabilities. In this setting, vertices are called states and arcs are called actions. The goal is to decide, in each time step and each state, which action to take so as to converge to a predefined target with probability one over an infinite time horizon. Taking an action incurs a cost, and we wish to find a policy that minimizes the expected cost over all possible realizations. SSP forms an important class of Markov Decision Processes (MDPs) and is used extensively in practice: it arises naturally in robot motion planning, from maneuvering a vehicle over unfamiliar terrain to steering a flexible needle through human tissue or guiding a swimming micro-robot through turbulent water; and it also has many applications in operations research, artificial intelligence and economics, from inventory control and reinforcement learning to asset pricing. The SSP was studied thoroughly by Bertsekas and Tsitsiklis (1991) and later by Bertsekas and Yu (2016), and it is well understood when there are no nonpositive-cost `transition cycles'. In particular, it is known that standard methods such as Value Iteration and Policy Iteration converge in this case. In this talk we take a fresh look at the problem from a polyhedral combinatorics perspective. We study the natural linear programming relaxation of the problem and show that these standard methods in fact also converge when there are no negative-cost transition cycles. This closes the gap with the deterministic shortest path problem. Finally, we show that Dijkstra's algorithm can also be extended to the stochastic setting.
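To make the setting concrete, here is a minimal Value Iteration sketch on a three-state toy SSP; the instance, its costs and its transition probabilities are invented for illustration and are not taken from the talk:

```python
# Toy SSP: states 0 and 1, absorbing target state 2.
# actions[s] is a list of (cost, {successor: probability}) pairs.
actions = {
    0: [(1.0, {1: 0.5, 0: 0.5}),   # risky move: may stay in place
        (4.0, {2: 1.0})],          # safe but expensive direct move
    1: [(1.0, {2: 0.9, 0: 0.1})],  # usually reaches the target
    2: [],                         # target: absorbing, no actions
}
target = 2

# Bellman iteration: J(s) = min_a [ c(s,a) + sum_{s'} p(s'|s,a) * J(s') ]
J = {s: 0.0 for s in actions}
for _ in range(1000):
    new_J = {}
    for s, acts in actions.items():
        if s == target:
            new_J[s] = 0.0
            continue
        new_J[s] = min(c + sum(p * J[t] for t, p in succ.items())
                       for c, succ in acts)
    if max(abs(new_J[s] - J[s]) for s in J) < 1e-9:
        J = new_J
        break
    J = new_J
```

On this instance the iterates converge to the optimal expected costs-to-go: the risky move from state 0 is worth taking despite the chance of staying in place, so J(0) ends up below the cost 4 of the direct move.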
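The linear programming viewpoint can also be illustrated on a small example. The sketch below writes the standard LP for a toy three-state SSP (states 0 and 1 plus an absorbing target; the instance and the use of scipy are my own choices for illustration, and whether this matches the exact relaxation studied in the talk is an assumption): maximize the state values subject to one constraint per state-action pair, with the target value fixed to 0.

```python
# A sketch of the natural LP relaxation for a toy SSP. We maximize
# J0 + J1 subject to J(s) <= c(s,a) + sum_{s'} p(s'|s,a) * J(s')
# for every state-action pair, with the target value fixed to 0.
from scipy.optimize import linprog

# Variables: J0, J1. Constraints rewritten in A_ub @ J <= b_ub form:
#   state 0, risky action (cost 1, moves to 1 or stays, prob 0.5 each):
#       J0 <= 1 + 0.5*J1 + 0.5*J0   ->   0.5*J0 - 0.5*J1 <= 1
#   state 0, direct action (cost 4, reaches the target surely):
#       J0 <= 4
#   state 1 (cost 1, target w.p. 0.9, back to state 0 w.p. 0.1):
#       J1 <= 1 + 0.1*J0            ->  -0.1*J0 + J1 <= 1
A_ub = [[0.5, -0.5], [1.0, 0.0], [-0.1, 1.0]]
b_ub = [1.0, 4.0, 1.0]

# linprog minimizes, so maximize J0 + J1 by minimizing -(J0 + J1).
res = linprog(c=[-1.0, -1.0], A_ub=A_ub, b_ub=b_ub)
J0, J1 = res.x  # at the optimum: the expected costs-to-go
```

At the optimum the constraints of the actions used by the optimal policy are tight, which is the polyhedral phenomenon that makes this relaxation exact for shortest-path-type MDPs.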