We present new parallel algorithms that solve continuous-state partially observable Markov decision process (POMDP) problems using the GPU (gPOMDP) and a hybrid of the GPU and CPU (hPOMDP). We choose the Monte Carlo value iteration (MCVI) method as our base algorithm and parallelize this algorithm using the multi-level parallel formulation of MCVI. For each parallel level, we propose efficient algorithms to utilize the massive data parallelism available on modern GPUs. Our GPU-based method uses the two workload distribution techniques, compute/data interleaving and workload balancing, in order to obtain the maximum parallel performance at the highest level. Here we also present a CPU-GPU hybrid method that takes advantage of both CPU and GPU parallelism in order to solve highly complex POMDP planning problems. The CPU is responsible for data preparation, while the GPU performs Monte Cacrlo simulations; these operations are performed concurrently using the compute/data overlap technique between the CPU and GPU. To the best of the authors' knowledge, our algorithms are the first parallel algorithms that efficiently execute POMDP in a massively parallel fashion utilizing the GPU or a hybrid of the GPU and CPU. Our algorithms outperform the existing CPU-based algorithm by a factor of 75-99 based on the chosen benchmark.
Bibliographical noteFunding Information:
This work was supported in part by NRF in Korea (grant number 2014K1A3A1A17073365) and MCST/KOCCA in the CT R&D program 2014 (grant number R2014060011).
© SAGE Publications.
- Monte Carlo value iteration