R. Krause¹, P. Benedusi¹, P. Zulian¹
¹ Institute of Computational Science, Università della Svizzera italiana, Lugano, Switzerland
We present a parallel and efficient multilevel solution strategy for solving non-linear time-dependent problems. We consider in particular the monodomain model, a non-linear reaction-diffusion equation arising from a problem in electrophysiology: the electrical activation in the human heart. Different strategies for the space-time discretization and solution of the monodomain equation are discussed, based on domain decomposition and multilevel methods. For the latter, we propose a semi-geometric multigrid method, in which the coarse-level approximation spaces are created using arbitrary hierarchies of non-nested meshes. Interpolation and restriction in the multilevel context are then realized by means of a discrete L²-projection between the non-matching meshes. This approach allows for creating the coarser levels of a multigrid hierarchy even if only a single "fine" mesh is available. Hence, multigrid hierarchies can be created for arbitrary geometries in any dimension. We discuss how this approach can be applied to the monodomain equation discretised with space-time finite elements.
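As a rough sketch of the transfer operators described above (not the authors' implementation, which targets arbitrary geometries and dimensions), the discrete L²-projection between two non-nested 1D meshes can be assembled by quadrature; the mesh sizes and node placements below are hypothetical:

```python
import numpy as np

def hat_values(nodes, x):
    """Evaluate all P1 hat basis functions of a 1D mesh at points x.
    Returns an array of shape (len(nodes), len(x))."""
    vals = np.zeros((len(nodes), len(x)))
    for i, xi in enumerate(nodes):
        v = np.zeros_like(x)
        if i > 0:                                  # rising part on [nodes[i-1], xi]
            left = nodes[i - 1]
            m = (x >= left) & (x <= xi)
            v[m] = (x[m] - left) / (xi - left)
        if i < len(nodes) - 1:                     # falling part on [xi, nodes[i+1]]
            right = nodes[i + 1]
            m = (x >= xi) & (x <= right)
            v[m] = (right - x[m]) / (right - xi)
        vals[i] = v
    return vals

# Non-nested meshes on [0, 1]: the coarse nodes are NOT a subset of the fine ones.
fine = np.linspace(0.0, 1.0, 17)
coarse = np.array([0.0, 0.21, 0.55, 0.8, 1.0])

# Composite midpoint quadrature, dense enough to resolve both meshes.
nq = 4000
xq = (np.arange(nq) + 0.5) / nq
w = 1.0 / nq

phi_c = hat_values(coarse, xq)   # coarse basis at quadrature points
phi_f = hat_values(fine, xq)     # fine basis at quadrature points

M = (phi_c * w) @ phi_c.T        # coarse mass matrix
B = (phi_c * w) @ phi_f.T        # mixed mass (coupling) matrix
P = np.linalg.solve(M, B)        # discrete L2-projection: fine -> coarse

# The projection reproduces constants (partition of unity on both meshes).
ones_f = np.ones(len(fine))
print(np.max(np.abs(P @ ones_f - 1.0)))
```

The same construction carries over to higher dimensions, where assembling the coupling matrix between non-matching meshes is the main implementation effort.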
While we use continuous finite elements in space, for stability reasons we adopt discontinuous elements in time. We briefly discuss the properties of this time discretization scheme.
We investigate how different block smoothers, coarsening strategies and orderings of the space-time variables affect the overall convergence and robustness of the solver.
Furthermore, we comment on local time-stepping for space-time discretizations.
Finally, we investigate numerically the scalability and the convergence of our multilevel and domain decomposition solution strategies.
R. Falgout¹, H. De Sterck², A. Howse³, S. MacLachlan⁴, J. Schroder¹
¹ Lawrence Livermore National Laboratory, USA
² Monash University, Australia
³ Red Deer College, Canada
⁴ Memorial University of Newfoundland, Canada
The focus of this talk is on the application of the multigrid reduction in time (MGRIT) method to hyperbolic partial differential equations. We first consider the MGRIT approach developed in [1] that coarsens in both time and space. In the case of explicit time-stepping, spatial coarsening is needed to ensure stability on all levels, but it is also useful for implicit time-stepping by producing cheaper multigrid cycles. Unfortunately, uniform spatial coarsening results in extremely slow convergence when the wave speed is near zero, even if only locally. An adaptive spatial coarsening strategy addresses this issue for the variable coefficient linear advection equation and the inviscid Burgers equation using first-order explicit or implicit time-stepping methods. Numerical results show that this offers significant improvements over classical coarsening, and parallel scaling tests indicate that speedups over serial time-stepping strategies are possible. We also discuss progress on other recent ideas for handling hyperbolic problems, one based on more accurately approximating the Petrov-Galerkin coarse-grid operator, and another based on adding higher-order dispersion to minimize diffusiveness while improving MGRIT convergence.
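As a toy illustration of the idea (not the actual strategy of [1]), an adaptive rule might keep full spatial resolution wherever the local wave speed is close to zero and coarsen 2:1 elsewhere; the threshold and wave-speed field below are hypothetical:

```python
import numpy as np

def adaptive_coarse_mask(a, threshold=0.1):
    """Toy adaptive spatial-coarsening rule: keep every fine point where the
    local wave speed |a| is below the threshold, otherwise keep only every
    other point (2:1 coarsening). Threshold is illustrative."""
    keep = np.abs(a) < threshold          # slow-wave regions stay fully resolved
    keep |= (np.arange(len(a)) % 2 == 0)  # elsewhere, standard 2:1 coarsening
    return keep

x = np.linspace(0.0, 1.0, 17)
a = np.sin(np.pi * x)                     # wave speed vanishing at both boundaries
mask = adaptive_coarse_mask(a)
print(mask.sum(), len(x))                 # coarse grid keeps the slow-wave points
```

The point is that the coarse grid stays locally fine exactly where uniform coarsening would destroy MGRIT convergence.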
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
References
[1] Hans De Sterck, Robert D. Falgout, Alexander J. M. Howse, Scott P. MacLachlan, and Jacob B. Schroder. Parallel-in-time multigrid with adaptive spatial coarsening for the linear advection and inviscid Burgers equations. SIAM J. Sci. Comput., submitted. LLNL-JRNL-737050.
S. Günther², J. B. Schroder¹, N. Gauger², R. D. Falgout¹
¹ Lawrence Livermore National Laboratory, USA
² TU Kaiserslautern, Germany
Parallel-in-time methods provide a solution to the serial time-integration bottleneck. This bottleneck is a result of long-term computer architecture trends, where current and future speedups will be available through greater concurrency, not faster clock speeds, which are stagnant. One area ripe for development with parallel-in-time methods is numerical optimization with unsteady PDEs. Here, the common adjoint-based approach suffers from a large serial bottleneck because it couples multiple sweeps forwards and then backwards in time. The backwards sweeps use the adjoint to compute sensitivities, which are then used to optimize the objective function. The multiple sweeps forwards and backwards in time create a large un-parallelized sequential process. In this talk, we examine the optimal-scaling multigrid reduction in time (MGRIT) method, which is able to parallelize over such nonlinear time-stepping processes. Initial work by Günther et al. [1, 2] showed promise by using MGRIT to compute both sweeps parallel-in-time, in the context of a traditional adjoint-based approach and also a one-shot optimization framework. Here, we describe the implementation of this work in the open-source XBraid (MGRIT) software package. The goal is to provide the broader community a general-purpose parallel-in-time optimization approach.
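The forward/backward sweep structure described above can be sketched on a scalar ODE; the model problem, objective, and parameters below are illustrative stand-ins, not XBraid's interface:

```python
import numpy as np

def forward(c, u0=1.0, a=2.0, dt=0.01, nsteps=100):
    """Forward sweep: explicit Euler for u' = -a*u + c; stores the trajectory,
    which the backward sweep needs (the source of the serial coupling)."""
    u = np.empty(nsteps + 1)
    u[0] = u0
    for n in range(nsteps):
        u[n + 1] = (1.0 - dt * a) * u[n] + dt * c
    return u

def adjoint_gradient(u, target=0.5, a=2.0, dt=0.01):
    """Backward sweep: adjoint for J = 0.5*(u_N - target)^2, giving dJ/dc."""
    nsteps = len(u) - 1
    lam = u[-1] - target              # terminal adjoint condition
    grad = 0.0
    for n in range(nsteps - 1, -1, -1):
        grad += dt * lam              # sensitivity contribution of step n
        lam *= (1.0 - dt * a)         # propagate the adjoint backwards
    return grad

c = 0.3
u = forward(c)
g = adjoint_gradient(u)

# Verify the adjoint gradient against a central finite difference.
eps = 1e-6
Jp = 0.5 * (forward(c + eps)[-1] - 0.5) ** 2
Jm = 0.5 * (forward(c - eps)[-1] - 0.5) ** 2
print(abs(g - (Jp - Jm) / (2 * eps)))
```

Both loops above are serial in time, which is precisely the sequential process MGRIT parallelizes.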
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-748131.
References
[1] S. Günther, N. R. Gauger, and J. B. Schroder. A non-intrusive parallel-in-time adjoint solver with the XBraid library. Computing and Visualization in Science, 2017. (accepted).
[2] S. Günther, N. R. Gauger, and J. B. Schroder. A non-intrusive parallel-in-time approach for simultaneous optimization with unsteady PDEs. Optimization Methods and Software, 2018. (submitted).
-Lunch-
D. Samaddar¹, D.P. Coster², X. Bonnin³, W. R. Elwasif⁴, D. B. Batchelor⁴,
L.A. Berry⁴, E. Havlickova¹, T. Casper³, S.H. Kim³, D.E. Newman⁵, R. Sanchez⁶
¹ Culham Centre for Fusion Energy, Culham Science Centre, Abingdon, OX14 3DB, UK
² Max-Planck-Institut für Plasmaphysik, Germany
³ ITER Organization, Route de Vinon-sur-Verdon, CS 90 046, 13067 St. Paul Lez Durance, Cedex, France
⁴ Oak Ridge National Laboratory, Oak Ridge, TN, USA
⁵ University of Alaska Fairbanks, AK, USA
⁶ University Carlos III de Madrid, Spain
Achieving long time scale simulations of magnetically confined fusion plasma arguably poses one of the grand challenges of the 21st century for the HPC community. While parallel architectures on modern supercomputers play a crucial role in these computations, a saturation in terms of improving computational gain with respect to increasing computing cores is encountered on existing Petascale machines. Temporal parallelization provides a crucial step in exploiting the computing power of modern supercomputers. The Parareal algorithm [1] has so far been the most widely studied method in fusion plasma, reducing wall clock times by factors of 10 and 20 in complex non-linear problems. This talk will discuss a series of applications in fusion plasma, ranging from fluid turbulence at the plasma core to multifluid scrape-off layer simulations at the plasma edge where plasma-wall interactions play a dominant role. Applications in plasma scenario modelling will also illustrate some key issues. Since the Parareal algorithm relies on a predictor-corrector approach, choosing a suitable coarse predictor often becomes the focal point of research in complex non-linear problems such as fusion. The talk will discuss both achievements and roadblocks encountered along the way. The views and opinions expressed herein do not necessarily reflect those of the European Commission or the ITER Organization.
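The predictor-corrector structure of Parareal can be sketched on a scalar model problem; the backward-Euler propagators below are simple stand-ins, not the fusion solvers discussed in the talk:

```python
import numpy as np

# Minimal Parareal sketch for u' = lam*u on [0, T]; illustrative only.
lam, T, N, m = -1.0, 1.0, 10, 20   # N coarse time slices, m fine steps per slice
dT = T / N

def G(u):                          # coarse predictor: one backward-Euler step
    return u / (1.0 - lam * dT)

def F(u):                          # fine propagator: m backward-Euler substeps
    for _ in range(m):
        u = u / (1.0 - lam * dT / m)
    return u

u = np.empty(N + 1)
u[0] = 1.0
for n in range(N):                 # initial serial coarse prediction
    u[n + 1] = G(u[n])

for k in range(5):                 # Parareal predictor-corrector iterations
    Fu = [F(u[n]) for n in range(N)]   # fine sweeps: independent, parallel in time
    Gu = [G(u[n]) for n in range(N)]   # coarse values at the previous iterate
    for n in range(N):             # cheap serial correction sweep
        u[n + 1] = G(u[n]) + Fu[n] - Gu[n]

uref = 1.0                         # serial fine reference solution
for n in range(N):
    uref = F(uref)
print(abs(u[-1] - uref))           # Parareal converges to the fine solution
```

The quality of G relative to F governs the iteration count, which is why the coarse predictor becomes the focal point in complex non-linear problems.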
References
[1] J. Lions, Y. Maday, and G. Turinici. A "parareal" in time discretization of PDE's. Comptes Rendus de l'Académie des Sciences - Series I - Mathematics, 332(7):661–668, 2001.
Gilles Pagès¹, Olivier Pironneau², Guillaume Sall¹
¹ Laboratoire de Probabilités, Statistique et Modélisation, UMR 8001, Sorbonne Université, Boîte courrier 158, 4 pl. de Jussieu, F-75252 Paris Cedex 5, France
² Laboratoire Jacques Louis Lions, UMR 7598, Case 187, 4 pl. de Jussieu, F-75252 Paris Cedex 5, France
With parallelism in mind we investigate the parareal method for American contracts both theoretically and numerically. Least-Squares Monte Carlo (LSMC, see [1, 2, 3]) and parareal time decomposition (see [4, 5]) with two or more levels are used, leading to an efficient parallel implementation which scales linearly with the number of processors and is appropriate for any multiprocessor-memory architecture in its multilevel version. We prove L² superlinear convergence for an LSMC backward-in-time computation of American contracts when the conditional expectations are known, i.e. before Monte-Carlo discretization. The argument also provides a tool to analyze the multilevel parareal algorithm; in all cases the computing time is increased only by a constant factor compared to the sequential algorithm on the finest grid, and speed-up is guaranteed when the number of processors is larger than that constant. A numerical implementation confirms the theoretical error estimates.
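For readers unfamiliar with LSMC, a minimal Longstaff-Schwartz backward induction for an American put might look as follows; the parameters and quadratic regression basis are toy choices, not the multilevel parareal implementation of the talk:

```python
import numpy as np

# Toy Least-Squares Monte Carlo (Longstaff-Schwartz) for an American put.
rng = np.random.default_rng(0)
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
M, P = 50, 20000                     # time steps, simulated paths
dt = T / M
disc = np.exp(-r * dt)               # one-step discount factor

# Simulate geometric Brownian motion paths under the risk-neutral measure.
Z = rng.standard_normal((P, M))
S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * Z, axis=1))
S = np.hstack([np.full((P, 1), S0), S])

payoff = lambda s: np.maximum(K - s, 0.0)
V = payoff(S[:, -1])                 # value at maturity

for t in range(M - 1, 0, -1):        # backward induction in time
    itm = payoff(S[:, t]) > 0        # regress only on in-the-money paths
    x, y = S[itm, t], disc * V[itm]
    coeff = np.polyfit(x, y, 2)      # regression of the continuation value
    cont = np.polyval(coeff, x)
    ex = payoff(x) > cont            # exercise where immediate payoff wins
    V = disc * V
    idx = np.where(itm)[0]
    V[idx[ex]] = payoff(x[ex])

price = disc * np.mean(V)            # discount the final step back to t = 0
print(price)
```

The backward-in-time loop is the sequential process a parareal decomposition would parallelize across time slices.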
References
[1] J. N. Tsitsiklis and B. van Roy. Regression methods for pricing complex American-style options. IEEE Trans. Neural Netw., 12(4):694–703, 2001.
[2] J. N. Tsitsiklis and B. van Roy. Regression methods for pricing complex American-style options. IEEE Transactions on Neural Networks, 12(4):694–703, 2001.
[3] F. A. Longstaff and E. S. Schwartz. Valuing American options by simulation: a simple least squares approach. Review of Financial Studies, 14:113–148, 2001.
[4] G. Bal and Y. Maday. A "parareal" time discretization for non-linear PDE's with application to the pricing of an American put. Recent Developments in Domain Decomposition Methods (Zürich, 2001), 23:189–202, 2002.
[5] J.-L. Lions, Y. Maday, and G. Turinici. Résolution d'EDP par un schéma en temps "pararéel". C.R. Acad. Sci. Paris sér. I Math., 332(7):661–668, 2001.
Camille Coti¹
¹ LIPN, CNRS UMR 7030, Université Paris 13, Sorbonne-Paris-Cité
Large-scale parallel systems are now more common and accessible to scientists. The latest release of the Top 500 list (www.top500.org), from November 2017, shows that, worldwide, 181 systems achieve a performance of 1 Petaflops or more; 58 systems feature more than 100 000 cores and 499 feature more than 10 000 cores.
Besides, at large scale, failures are statistically frequent and need to be taken into account. As a matter of fact, failures are so common that some nodes are expected to fail during the computation [1]. As a consequence, applications need to anticipate these failures and adapt in order to proceed in spite of them.
I will present a recent model for fault tolerance in parallel applications that extends the MPI standard, called user-level failure mitigation [2]. Failure detection and resolution are local and depend on communications: only the processes that try to communicate with a failed process become aware that it has failed. The rest of the system continues its execution unknowingly. This model is particularly interesting at large scale, since it involves limited synchronization between processes.
In this model, I will present two fault-tolerant, direct algorithms for dense QR and LU matrix factorizations. These algorithms are based on communication-avoiding algorithms [3], which have demonstrated good scalability on a wide range of parallel architectures. I will show how certain algorithmic and algebraic properties can be exploited to insert or take advantage of partial result redundancy and obtain algorithms that are intrinsically fault-tolerant [4].
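The reduction-tree structure of communication-avoiding (tall-skinny) QR, whose intermediate R factors provide the kind of partial redundancy mentioned above, can be emulated serially; the block partition below stands in for the per-process data distribution:

```python
import numpy as np

def tsqr_r(blocks):
    """Return the R factor of the stacked blocks via a binary reduction tree
    (serial emulation of TSQR: local QR per 'process', then pairwise merges)."""
    rs = [np.linalg.qr(b)[1] for b in blocks]           # local QR on each block
    while len(rs) > 1:
        nxt = []
        for i in range(0, len(rs) - 1, 2):              # merge neighbouring R's
            nxt.append(np.linalg.qr(np.vstack([rs[i], rs[i + 1]]))[1])
        if len(rs) % 2:                                 # odd one out passes through
            nxt.append(rs[-1])
        rs = nxt
    return rs[0]

rng = np.random.default_rng(1)
A = rng.standard_normal((1024, 8))                      # tall-skinny matrix
blocks = np.split(A, 4)                                 # 4 "processes"

R_tree = tsqr_r(blocks)
R_ref = np.linalg.qr(A)[1]
# R is unique up to the sign of each row, so compare absolute values.
print(np.max(np.abs(np.abs(R_tree) - np.abs(R_ref))))
```

Because every tree node holds a small R factor summarizing the blocks below it, losing one leaf does not destroy all the information computed so far, which is the property the fault-tolerant variants build on.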
References
[1] Daniel A. Reed, Charng-Da Lu, and Celso L. Mendes. Reliability challenges in large systems. Future Generation Computer Systems, 22(3):293–302, 2006.
[2] Wesley Bland, Aurelien Bouteiller, Thomas Hérault, Joshua Hursey, George Bosilca, and Jack J. Dongarra. An evaluation of user-level failure mitigation support in MPI. Computing, 95(12):1171–1184, 2013.
[3] James Demmel, Laura Grigori, Mark Hoemmen, and Julien Langou. Communication-optimal parallel and sequential QR and LU factorizations. SIAM Journal on Scientific Computing, 34(1):A206–A239, 2012.
[4] Camille Coti. Scalable, robust, fault-tolerant parallel QR factorization. In 15th International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES). IEEE, 2016.
V. Gopinath¹, A. Fournier¹, T. Gastine¹
¹ Institut de Physique du Globe de Paris (IPGP)
The magnetic field of the Earth is understood to be generated by the motion of an electrically conducting fluid inside the outer core. This field protects the Earth from charged particles arriving in the form of cosmic rays or the solar wind. Numerical simulations of the Earth's outer core have been an essential tool in understanding the evolution of such magnetic fields. To maintain these fields over vast time scales, convective motion inside the outer core is predicted to occur. Such convective motion can be driven either by thermal convection or by compositional convection.
The simulation of such coupled systems often involves High Performance Computing (HPC). However, even with advanced HPC resources, simulations are not yet close to operating at the actual parameters and regimes of the Earth. There is thus a need for efficient numerical strategies to tackle such problems. In this work, we aim to develop an advanced time integration scheme which would help us explore extreme simulation regimes. A two-dimensional annulus geometry which mimics an equatorial plane of the outer core is modeled. Thermal convection is set up between the outer and inner walls, its strength measured by the Rayleigh number (Ra). A code for simulating this problem has been implemented; it is structured in such a way that we are able to test new schemes efficiently. Various new time integration schemes have been implemented and are currently being tested for efficiency. These new schemes will open a window onto certain parameter regimes which were previously considered too costly to simulate. From this study, we hope to select a coarse and a fine integration scheme best suited to the parallel-in-time approach, which we will pursue for this problem.
-Dinner-
A. Clarke¹, D. Ruprecht¹, S. Tobias¹, C. Davies¹
¹ University of Leeds
The precise mechanisms responsible for the natural dynamos in the Earth and Sun are still not fully understood. Numerical simulations of natural dynamos are extremely computationally intensive, and are carried out in parameter spaces many orders of magnitude away from real conditions.
Parallelization in space is a common strategy to speed up simulations on high performance computers, but eventually hits a scaling limit. Additional directions of parallelization are desirable to utilise the high number of processor cores now available. Parallel-in-time methods can deliver speed-up in addition to that offered by spatial partitioning.
This talk will investigate Parareal's [1] ability to speed up simulations of the kinematic dynamo, where the velocity field is a prescribed field in the induction equation. We will describe an implementation of Parareal in the open-source Python software Dedalus [2], which implements spectral methods with implicit-explicit (IMEX) time-stepping. Good performance in Parareal depends on an efficient coarse propagator, which should be faster than the fine propagator whilst still giving adequate accuracy for good convergence. In this case, reduced spatial resolution is used for the coarse solver. This also allows larger time step sizes, further increasing the speed of the coarse propagator.
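Building a coarse propagator by reducing spatial resolution can be sketched for a spectral discretization; the scalar heat-equation model and mode-truncation rule below are illustrative only, not Dedalus' API or the induction equation itself:

```python
import numpy as np

# Coarse propagator by spectral truncation for u_t = nu*u_xx, periodic domain:
# restrict to the lowest Nc Fourier modes, take one large step, prolong back.
nu, Nf, Nc = 0.05, 64, 16
kf = np.fft.fftfreq(Nf, d=1.0 / Nf)       # integer wavenumbers, fine grid
kc = np.fft.fftfreq(Nc, d=1.0 / Nc)       # integer wavenumbers, coarse grid

def fine(u, dt):
    """Stand-in for the fine propagator (exact spectral step for the heat eq.);
    in practice this would be many small IMEX steps at full resolution."""
    return np.real(np.fft.ifft(np.fft.fft(u) * np.exp(-nu * kf**2 * dt)))

def coarse(u, dt):
    """Restrict to Nc modes, take one large step, prolong back to the fine grid."""
    uh = np.fft.fft(u)
    uh_c = np.concatenate([uh[:Nc // 2], uh[-Nc // 2:]])  # keep low modes only
    uh_c = uh_c * np.exp(-nu * kc**2 * dt)
    uh_f = np.zeros(Nf, dtype=complex)
    uh_f[:Nc // 2], uh_f[-Nc // 2:] = uh_c[:Nc // 2], uh_c[Nc // 2:]
    return np.real(np.fft.ifft(uh_f))

x = np.linspace(0, 2 * np.pi, Nf, endpoint=False)
u0 = np.sin(x) + 0.1 * np.sin(12 * x)     # mode 12 lies above the coarse cutoff
err = np.max(np.abs(coarse(u0, 0.5) - fine(u0, 0.5)))
print(err)                                # small: only the truncated mode differs
```

Because the coarse grid resolves fewer modes, its stability and accuracy constraints permit larger time steps, which is what makes the coarse sweep cheap.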
Convergence properties, speed-up, and efficiency of the Parareal algorithm for the Roberts [3] flow and other types of dynamos will be presented.
References
[1] Jacques-Louis Lions, Yvon Maday, and Gabriel Turinici. Résolution d'EDP par un schéma en temps «pararéel». Comptes Rendus de l'Académie des Sciences - Series I - Mathematics, 332(7):661–668, 2001.
[2] Keaton J. Burns, Geoffrey M. Vasil, Jeffrey S. Oishi, Daniel Lecoanet, Benjamin Brown, and Eliot Quataert. Dedalus: Flexible framework for spectrally solving differential equations. (In preparation).
[3] Gareth O. Roberts. Dynamo action of fluid motions with two-dimensional periodicity. Phil. Trans. R. Soc. Lond. A, 271(1216):411–454, 1972.