Version 3 (modified by 12 years ago) ( diff ) | ,
---|
Super Sweep Method
While ghosting of data between grids allows for each grid to be updated independently it does lead to redundant computation. This computation can be justified if it reduces the amount and frequency of communication. However within a single processor, this redundant computation is usually only justified by the degree to which it simplifies the algorithm. However in AMR calculations with a large number of ghost zones - this redundant calculation can become too expensive to ignore.
The super sweep algorithm treats the entire region owned by a processor as one super-grid and sweeps across the super-grid collecting values from the various grids, performing the necessary computations, and returning updated values to the various grids. Of course determining where and when to calculate different stencil pieces can be difficult… Fortunately much of the heavy lifting is already in place.
Each stencil piece has a buffer that determines how close to a final updated cell one has to be before needing to update the stencil piece. Of course this can sometimes be a half cell etc… Defining this region reduces to defining a curve or collection of curves inside of which - the stencil needs to be calculate. Of course there are many ways to store a curve … and perhaps storing it as a collection of non-overlapping boxes is the simplest. These non-overlapping boxes may require breaking up a pair of overlapping boxes into multiple pieces. Of course any two overlapping boxes can be broken into X non overlapping pieces… Given the boxes one could then take a slice in the x-plane and calculate the squares that intersect the boxes - or alternatively one could take overlapping boxes - slice them - and then break up the overlapping squares into non-overlapping squares. (2 overlapping squares can only give you up to 3 non-overlapping squares) - and for cubes I believe it is 4 instead of 3.
Each stencil piece needs a collection of boxes that defines the regions within the super-block that need updating.
In order to overlap computation with communication - two super sweeps are necessary. The first super-sweep updates performs interior calculations that can be performed within the super-block without ghost data. The second super-sweep performs calculations that could not be performed by the first super-sweep and that are needed to synchronize fluxes. There is also a collection of stencil data that is needed by both interior and exterior sweeps. This data can either be cached or re-calculated. Region that needs to be cached can have an arbitrary layout but could be described as a collection of cache boxes.
- Calculate pre-ghost regions for each stencil piece, as well as cache-regions and post-ghost/pre-sync regions.
- Begin super sweep using the pre-ghost regions for each stencil piece.
- Store any values that overlap the cache-regions. * When overlap communication has completed - begin a second super-sweep updating the post-ghost/pre-sync regions.
- At each x-position, uncache stored values and modify the bounds of the square regions to avoid redundant computation.
- Then send fluxes for synchronizing.
- While waiting for fluxes - continue the pre-ghost sweep but now using cached values from the post-ghost/pre-sync sweep.