Changes between Version 8 and Version 9 of ScramblerPaper


Timestamp: 05/11/11 13:57:16
Author: Jonathan

* Load Balancing
 * As was mentioned before, threading the levels removes the need to balance each level individually and instead allows for global load balancing.  During each root step (level 0) there will be 1 set of level 1 grids, R sets of level 2 grids, R^2^ sets of level 3 grids, and in general R^n-1^ sets of level n grids that will need to be distributed, where R is the refinement ratio.  The basic idea is that before each new distribution of level n grids, each processor evaluates how much work remains to be finished on levels coarser than n (0:n-1) per level n-1 step, as well as the new workload associated with its new children over one level n-1 advance (or R level n advances).  These two numbers are then shared among all processors so that each processor can determine its share of the next level's grids: if the AMR patch hierarchy were static, each processor would then finish updating every coarser grid at the end of the last level n step.  (A rough sketch of this exchange is given below.)
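The exchange just described might look roughly like the following.  This is only a minimal sketch: the function name, its arguments, and the use of mpi4py are assumptions made for illustration, not Scrambler's actual interfaces.

{{{
#!python
# Illustrative sketch (not the actual Scrambler routine) of the
# pre-distribution exchange: two local numbers are summed globally and each
# processor derives its share of the new level n work from the totals.
from mpi4py import MPI

comm = MPI.COMM_WORLD

def level_n_share(remaining_coarse_work, new_child_work):
    """Return how much of the new level n workload this processor should
    take so that, if the AMR patch hierarchy stayed static, every processor
    would finish its coarser updates at the end of the last level n step."""
    # The two numbers described above, summed over all processors.
    total_remaining = comm.allreduce(remaining_coarse_work, op=MPI.SUM)
    total_new = comm.allreduce(new_child_work, op=MPI.SUM)

    # If everyone is to finish at the same time, each processor should end
    # up with an equal slice of (remaining coarse work + new level n work).
    target = (total_remaining + total_new) / comm.Get_size()
    return max(target - remaining_coarse_work, 0.0)
}}}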
Basically, since the coarser grid updates can be done at any time, we would like to distribute the new fine-level grids so that the time processors spend waiting at the end of each level's update is minimized:
{{{
#!latex
Define $T_l=\displaystyle\sum_{p=0}^Pt_l^p$ as the total time required to complete the current updates of grids on level $l$, where $t_l^p$ is the time required to update the level $l$ grids on processor $p$.  Also define $R_l=\displaystyle\sum_{p=0}^Pr_l^p$ as the remaining time required to complete the current updates of grids on level $l$, where $r_l^p$ is the remaining time required to update the level $l$ grids on processor $p$.  We then want to distribute $T_N$ so as to minimize the time spent waiting after each level's update.
}}}
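Using these definitions, one way to make that criterion concrete is sketched below; this is only an assumed form of the calculation that the text leaves to be filled in, not necessarily the formula used in the code.

{{{
#!latex
Let $r^p=\displaystyle\sum_{l=0}^{N-1}r_l^p$ be the coarser work still remaining on processor $p$, and let $\tau_N^p$ be the portion of the new level $N$ work assigned to it.  Requiring every processor to finish at the same time while accounting for all of the new work ($\sum_p\tau_N^p=T_N$) gives, with $P$ processors,
$$r^p+\tau_N^p=\frac{T_N+\displaystyle\sum_{l=0}^{N-1}R_l}{P}
\quad\Longrightarrow\quad
\tau_N^p=\frac{T_N+\displaystyle\sum_{l=0}^{N-1}R_l}{P}-r^p,$$
truncated at zero for any processor whose remaining coarse work already exceeds this common finish time.
}}}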
(Include more detailed calculation.)  Each processor then determines the new workloads, which processors it should expect to receive new child grids from, and which processors it should give new grids to.  It then sorts its children in Hilbert order before distributing them in order among its child processors.  If the load of the next child grid is larger than the load assigned to the next child processor, the algorithm can optionally attempt to split (artificially fragment) the child grid into proportionate pieces before assigning them to the child processors.  Usually this is avoided and the first child processor is given the entire child grid, since future distributions of finer grids can compensate for this inequality.  If, however, we are distributing finest-level grids, then this splitting is used.  The splitting is also used on the coarsest root grid, since its size can in general be very large.  [wiki:ScramblerScheduling Wiki page with more related info to incorporate]
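The distribution loop described above could be sketched as follows.  The Grid type, the precomputed Hilbert keys, and the proportional split are simplified stand-ins chosen for the example; the actual Scrambler data structures and splitting routine will differ.

{{{
#!python
# Illustrative sketch of handing Hilbert-ordered children to child
# processors, with optional proportional splitting of oversized grids.
from collections import namedtuple

# Toy stand-in for a child grid: a precomputed Hilbert index and a load.
Grid = namedtuple("Grid", ["hilbert_key", "load"])

def split_grid(grid, fraction):
    """Stand-in for geometric splitting: fragment a grid into two pieces
    carrying `fraction` and `1 - fraction` of its load."""
    return (Grid(grid.hilbert_key, grid.load * fraction),
            Grid(grid.hilbert_key, grid.load * (1.0 - fraction)))

def distribute_children(children, child_procs, allow_splitting=False):
    """children: list of Grid; child_procs: list of (proc_id, assigned_load)
    pairs produced by the global load-balancing step."""
    # Sort the new children along the Hilbert curve so that consecutive
    # grids are spatially close and land on the same or nearby processors.
    queue = sorted(children, key=lambda g: g.hilbert_key)
    procs = [[pid, cap] for pid, cap in child_procs]  # remaining capacities
    assignments = []
    i = 0
    for grid in queue:
        while True:
            pid, cap = procs[i]
            last = (i == len(procs) - 1)
            if allow_splitting and cap > 0 and grid.load > cap and not last:
                # Fragment proportionately: fill this processor exactly and
                # carry the remainder over to the next one (used for the
                # finest level and for the root grid).
                piece, grid = split_grid(grid, cap / grid.load)
                assignments.append((pid, piece))
                procs[i][1] = 0
                i += 1
                continue
            # Usual case: give the whole (remaining) grid to this processor
            # even if it slightly overloads it; future distributions of
            # finer grids can compensate for the inequality.
            assignments.append((pid, grid))
            procs[i][1] = cap - grid.load
            if procs[i][1] <= 0 and not last:
                i += 1
            break
    return assignments

# Example: two children, two child processors, splitting enabled.
# distribute_children([Grid(5, 4.0), Grid(2, 1.0)], [(0, 3.0), (1, 2.0)],
#                     allow_splitting=True)
}}}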
 * Hilbert Ordering / Splitting