NOAA's Earth System Research Laboratory (ESRL) has been working on parallelizing the FV3 dynamical core for fine-grained GPU and MIC processors. Initial work focused on modifying the code to expose the additional loop-level parallelism needed to run efficiently on GPUs containing over 3000 processing cores. The code changes have been quite invasive, but the original structure of the code has been maintained to retain readability for the modeling team. A primary requirement has been to demonstrate the benefits of running on GPU and MIC processors without degrading performance on the CPU. This presentation will report on the work, give performance results, and discuss our efforts to achieve performance portability.