Rollout, Policy Iteration, and Distributed Reinforcement Learning

ISBN-10: 1886529078
ISBN-13: 9781886529076
Edition: First Edition
Released: Aug 15, 2020
Format: Hardcover, 376 pages

Description:

This is a monograph at the forefront of research on reinforcement learning, also referred to by other names such as approximate dynamic programming and neuro-dynamic programming. It focuses on the fundamental idea of policy iteration: start from some policy and successively generate one or more improved policies. If just one improved policy is generated, the method is called rollout, which, based on broad and consistent computational experience, appears to be one of the most versatile and reliable of all reinforcement learning methods. Among other uses, it can be applied on-line using easily implementable simulation, and it works for discrete deterministic combinatorial optimization as well as for stochastic Markov decision problems.

Approximate policy iteration is more ambitious than rollout, but it is a strictly off-line method and is generally far more computationally intensive. This motivates the use of parallel and distributed computation. One of the purposes of the monograph is to discuss distributed (possibly asynchronous) methods that relate to rollout and policy iteration, both for exact implementations and for approximate implementations involving neural networks or other approximation architectures.

Several of the ideas developed in depth in this monograph have been central to recent high-profile successes, such as the AlphaZero program for playing chess, Go, and other games. In addition to the fundamental process of successive policy iteration/improvement, that program includes the use of deep neural networks to represent both value functions and policies, extensive large-scale parallelization, and the simplification of lookahead minimization through methods involving Monte Carlo tree search and pruning of the lookahead tree. This monograph likewise focuses on policy iteration, value and policy neural network representations, parallel and distributed computation, and lookahead simplification. Thus, while there are significant differences, the principal design ideas that form the core of this monograph are shared with the AlphaZero architecture, except that they are developed here in a broader and less application-specific framework.

Among its special features, the book:
a) Presents new research relating to distributed asynchronous computation, partitioned architectures, and multiagent systems, with application to challenging large-scale optimization problems such as combinatorial/discrete optimization and partially observed Markov decision problems.
b) Describes variants of rollout and policy iteration for problems with a multiagent structure, which allow a dramatic reduction of the computational requirements for lookahead minimization.
c) Establishes a connection between rollout and model predictive control, one of the most prominent control system design methodologies.
d) Expands the coverage of some research areas discussed in the author's 2019 textbook Reinforcement Learning and Optimal Control.

See the author's website for selected sections, instructional videos and slides, and other supporting material.
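To make the core idea concrete, here is a minimal sketch of one-step rollout on a hypothetical toy shortest-path problem (the problem data, state names, and helper functions below are illustrative assumptions, not taken from the book): the rollout policy improves on a base policy by choosing, at each state, the action that minimizes the transition cost plus the base policy's cost-to-go, estimated by simulating the base policy.

```python
# Toy deterministic shortest-path problem (all values illustrative).
# States 0..4, goal state 4; two actions {0, 1} at each non-goal state.
GOAL = 4
succ = {0: {0: 1, 1: 2}, 1: {0: 3, 1: 4}, 2: {0: 4, 1: 3}, 3: {0: 4, 1: 4}}
cost = {0: {0: 1, 1: 1}, 1: {0: 5, 1: 1}, 2: {0: 2, 1: 4}, 3: {0: 1, 1: 3}}

def base_policy(s):
    # A deliberately naive base policy: always take action 0.
    return 0

def simulate(s, policy):
    """Total cost of following `policy` from s to the goal (the rollout simulation)."""
    total = 0
    while s != GOAL:
        a = policy(s)
        total += cost[s][a]
        s = succ[s][a]
    return total

def rollout_policy(s):
    """One-step lookahead: minimize cost(s, a) + (base policy's cost-to-go from succ)."""
    return min(succ[s], key=lambda a: cost[s][a] + simulate(succ[s][a], base_policy))

print(simulate(0, base_policy))     # 7
print(simulate(0, rollout_policy))  # 3
```

The second cost is lower than the first, illustrating the policy improvement property: the rollout policy is at least as good as the base policy at every state. It need not be optimal, since the lookahead uses the base policy's (possibly pessimistic) cost-to-go rather than the optimal one.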

Frequently Asked Questions about Rollout, Policy Iteration, and Distributed Reinforcement Learning

You can buy the Rollout, Policy Iteration, and Distributed Reinforcement Learning book at one of 20+ online bookstores with BookScouter, the website that helps find the best deal across the web.

The price for the book starts at $81.74 on Amazon, where it is currently available from 4 sellers.

If you're interested in selling back the Rollout, Policy Iteration, and Distributed Reinforcement Learning book, you can always check BookScouter for the best deal. BookScouter checks 30+ buyback vendors with a single search and instantly gives you up-to-date buyback pricing.


The Rollout, Policy Iteration, and Distributed Reinforcement Learning book is currently in very low demand: its sales rank is 1,068,731 at the moment. A rank of 1,000,000 means the last copy was sold approximately a month ago.

The highest buyback price for the Rollout, Policy Iteration, and Distributed Reinforcement Learning book within the last three months was $8.71, on December 11.