Building a Scalable Ensemble Data Assimilation System for Coupled Models with PDAF
Efficient ensemble data assimilation with coupled models poses particular challenges due to the complexity of the model system and its high computational cost. On the methodological side, one has to account for the different time scales, and also the distinct correlation lengths, of model compartments like the ocean and the atmosphere. Computationally, one often has to deal with multiple program executables, a coupler software, observation handling for different model compartments, and the large number of processors required to compute a complex coupled model. I will focus on the computational aspects and discuss the steps required to build a highly scalable and flexible data assimilation system on the basis of the Parallel Data Assimilation Framework (PDAF, http://pdaf.awi.de) using the example of the coupled climate model AWI-CM (Sidorenko et al., Climate Dynamics, 44 (2015) 757-780). AWI-CM consists of the finite-element sea ice-ocean model FESOM, which uses an unstructured model grid, and the model ECHAM6 for the atmosphere. The model coupling is implemented with OASIS-MCT, and the model system consists of two separate executable programs for the ocean and atmosphere. In addition to the implementation steps, the scalability of the assimilation system is discussed for a realistic configuration of AWI-CM. The high scalability is obtained by an online-connection strategy for the data assimilation system. First, the parallelization of the coupled model system is modified so that the coupled model can perform ensemble forecasts. Second, the analysis (solver) step is directly inserted into the time-stepping loops of each model compartment. With the coupled model augmented in this online way, the ensemble information is kept in memory and transferred by parallel communication when necessary. Thus, one avoids the need to repeatedly write an ensemble of model fields into files and read them again for the analysis step.
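The first step, modifying the parallelization so that the coupled model can run an ensemble, can be pictured as a bookkeeping problem: every process must know which ensemble member and which compartment it belongs to. The following is a minimal sketch in plain Python, not PDAF code. The names `Slot`, `ensemble_layout` and `split_colors`, the process counts, and the member-by-member rank ordering are all assumptions made for illustration. In an MPI program, such groups would typically be formed with a communicator split such as `MPI_Comm_split`, which is likewise an assumption here rather than a statement about the actual implementation.

```python
from collections import namedtuple

# Hypothetical record describing where one global process sits in the ensemble.
Slot = namedtuple("Slot", ["member", "compartment", "local_rank"])


def ensemble_layout(n_members, ranks_per_compartment):
    """Map global ranks to (ensemble member, compartment, rank within compartment).

    ranks_per_compartment: dict such as {"ocean": 2, "atmosphere": 1}.
    Ranks are assigned member by member, compartment by compartment;
    this ordering is an assumption made for the illustration.
    """
    if n_members < 1:
        raise ValueError("need at least one ensemble member")
    layout = []
    for member in range(n_members):
        for compartment, n_ranks in ranks_per_compartment.items():
            for local_rank in range(n_ranks):
                layout.append(Slot(member, compartment, local_rank))
    return layout


def split_colors(layout):
    """Group global ranks by (member, compartment).

    The key plays the role of the 'color' in a communicator split:
    all ranks sharing a key would end up in the same model communicator.
    """
    groups = {}
    for global_rank, slot in enumerate(layout):
        groups.setdefault((slot.member, slot.compartment), []).append(global_rank)
    return groups


# Two ensemble members; each runs the ocean on 2 processes and the atmosphere on 1.
layout = ensemble_layout(n_members=2, ranks_per_compartment={"ocean": 2, "atmosphere": 1})
groups = split_colors(layout)
```

With this layout, each ensemble member owns its own set of ocean and atmosphere processes, so all members can be integrated concurrently within a single program start.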
Further, the coupled model is started only once, and there is no need to stop and restart the whole coupled model to compute the analysis step. Instead, the analysis step is performed in between time steps and is independent of the actual model coupler. These modifications of the model are supported by the framework structure of PDAF. In addition to the parallel online connection for the data assimilation system, the analysis step has to be parallelized. Here, the different model compartments are treated like parallel subdomains of the model. In this way, one can use the data assimilation algorithms provided by PDAF and can implement and perform the analysis step in analogy to uncoupled models. However, one has to take into account the different model grids and the possibly distinct ways in which the model compartments store their model fields. This results in a data assimilation system that can perform the assimilation both in-compartment (for weakly coupled assimilation) and cross-compartment (for strongly coupled assimilation).
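The difference between in-compartment and cross-compartment assimilation can be illustrated with a toy example in which each compartment holds its own segment of the state. The sketch below is plain Python under stated assumptions: it is not one of the PDAF filter algorithms, it updates only the ensemble mean from a single scalar observation using ensemble covariances, and the names `ensemble_cov` and `analysis_mean` as well as all numbers are hypothetical.

```python
from statistics import mean


def ensemble_cov(a, b):
    """Sample covariance of two equally long lists of ensemble values."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def analysis_mean(ensemble, obs_comp, obs_index, obs_value, obs_var, strongly_coupled):
    """Update the ensemble mean of each compartment from one scalar observation.

    ensemble: list of members, each a dict {compartment name: list of values}.
    Only the ensemble mean is updated; a real filter also updates the spread.
    """
    # Ensemble of the observed quantity (the observation picks one state element).
    hx = [member[obs_comp][obs_index] for member in ensemble]
    innovation = obs_value - mean(hx)
    denom = ensemble_cov(hx, hx) + obs_var

    result = {}
    for comp in ensemble[0]:
        updated = []
        for i in range(len(ensemble[0][comp])):
            xi = [member[comp][i] for member in ensemble]
            # Weakly coupled: compartments other than the observed one stay unchanged.
            use_obs = strongly_coupled or comp == obs_comp
            gain = ensemble_cov(xi, hx) / denom if use_obs else 0.0
            updated.append(mean(xi) + gain * innovation)
        result[comp] = updated
    return result


# Three hypothetical members with one ocean and one atmosphere value each.
ensemble = [
    {"ocean": [1.0], "atmosphere": [10.0]},
    {"ocean": [2.0], "atmosphere": [12.0]},
    {"ocean": [3.0], "atmosphere": [14.0]},
]
# One ocean observation with value 3.0 and error variance 1.0.
weak = analysis_mean(ensemble, "ocean", 0, 3.0, 1.0, strongly_coupled=False)
strong = analysis_mean(ensemble, "ocean", 0, 3.0, 1.0, strongly_coupled=True)
```

In the weakly coupled case the ocean observation changes only the ocean mean, whereas in the strongly coupled case the cross-compartment ensemble covariance also carries the update into the atmosphere.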