47 | | * The pipelining is done automatically provided that the dependencies between stencil pieces are explicitly stated. For example, consider a simple 2D first-order Godunov method in which the initial state is stored in {{{q}}}, and the updated fields are stored in {{{Q}}}. The x and y fluxes are stored in {{{fx,fy}}}, and the left and right interface states are stored in {{{qLx,qLy}}} and {{{qRx,qRy}}} respectively. We also adopt the convention that stencil pieces stored on cell edges (i.e. {{{qLx, qRx, fx}}}) at position {{{i-1/2}}} are stored in their respective arrays with the index {{{i}}} |
| 47 | * The pipelining is done automatically provided that the dependencies between stencil pieces are explicitly stated. For example, consider a simple 2D first-order Godunov method in which the initial state is stored in {{{q}}}, and the updated fields are stored in {{{Q}}}. The x and y fluxes are stored in {{{fx,fy}}}, and the left and right interface states are stored in {{{qLx,qLy}}} and {{{qRx,qRy}}} respectively. We also adopt the convention that stencil pieces stored on cell edges (i.e. {{{qLx, qRx, fx}}}) at position {{{i-1/2}}} are stored in their respective arrays with the index {{{i}}}. The stencil dependencies can then be expressed as: |
65 | | Without pipelining we would then simply do the following: |
66 | | {{{ |
67 | | FORALL(i=Q%range(1,1):Q%range(1,2), j=Q%range(2,1):Q%range(2,2), k=Q%range(3,1):Q%range(3,2)) |
68 | | Q%data(i,j,k)=q%data(i,j,k)+fx%data(i,j,k)-fx%data(i+1,j,k)+fy%data(i,j,k)-fy%data(i,j+1,k)+fz%data(i,j,k)-fz%data(i,j,k+1) |
69 | | END FORALL |
70 | | }}} |
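The edge-storage convention and the unpipelined update can be checked with a small stand-alone program. This is only a sketch under stated assumptions: the names {{{q_old}}} and {{{q_new}}} stand in for {{{q}}} and {{{Q}}} (Fortran is case-insensitive, so the two cannot coexist as plain variables), the grid sizes and flux values are made up for illustration, and the fluxes are assumed to already include any dt/dx factor.

```fortran
program unsplit_update_demo
  implicit none
  integer, parameter :: nx = 4, ny = 3
  real :: q_old(nx,ny), q_new(nx,ny)   ! stand-ins for q and Q
  real :: fx(nx+1,ny)                  ! fx(i,j) lives on the edge at i-1/2
  real :: fy(nx,ny+1)                  ! fy(i,j) lives on the edge at j-1/2
  integer :: i, j

  q_old = 1.0

  ! Illustrative fluxes only; a real code would obtain these from qLx/qRx
  do j = 1, ny
     do i = 1, nx+1
        fx(i,j) = 0.1*real(i)
     end do
  end do
  fy = 0.0

  ! Whole-grid update: flux in through the low edge, out through the high edge
  do j = 1, ny
     do i = 1, nx
        q_new(i,j) = q_old(i,j) + fx(i,j) - fx(i+1,j) + fy(i,j) - fy(i,j+1)
     end do
  end do

  ! Interior fluxes cancel, so the total change equals the net boundary flux
  print *, 'change in total    =', sum(q_new) - sum(q_old)
  print *, 'net boundary flux  =', sum(fx(1,:)) - sum(fx(nx+1,:))
end program unsplit_update_demo
```

The two printed values should agree to rounding error, which is a quick way to confirm that the index convention for edge-centred arrays has been applied consistently.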
71 | | To pipeline, we have to update one slice in x at a time, so instead we have: |
72 | | {{{ |
73 | | DO index = q%range(1,1), q%range(1,2) |
| 66 | For the 2D example above this would give the following windows: |
75 | | IF (istime(q, index, i)) THEN |
76 | | ... |
77 | | END IF |
78 | | |
79 | | ... |
80 | | ... |
81 | | |
82 | | |
83 | | IF (istime(Q, index, i)) THEN |
84 | | FORALL(j=Q%range(2,1):Q%range(2,2), k=Q%range(3,1):Q%range(3,2)) |
85 | | Q%data(Q%x(i),j,k)=q%data(q%x(i),j,k)+fx%data(fx%x(i),j,k)-fx%data(fx%x(i+1),j,k)+fy%data(fy%x(i),j,k)-fy%data(fy%x(i),j+1,k)+fz%data(fz%x(i),j,k)-fz%data(fz%x(i),j,k+1) |
86 | | END FORALL |
87 | | END IF |
88 | | END DO |
89 | | }}} |
90 | | where {{{index}}} is the x position within the large array that we are currently in the process of updating. {{{%x(:)}}} is an array that maps a physical x offset, measured relative to the row we are updating, to an i-index in a thin array that is only as wide as needed. Finally, the {{{istime}}} function indicates whether the given variable should be calculated on this cycle and supplies the index {{{i}}} that represents which position within the {{{Q%data}}} array we should be calculating. In the above 1D example, {{{q}}} would be three cells wide, {{{fx}}} would be two cells wide, and everything else would be only one cell wide. {{{q}}} would be retrieved two cycles before updating/storing {{{Q}}}, while {{{qLx}}}, {{{qRx}}}, and {{{fx}}} would be calculated one cycle before. |
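One way to picture the role of {{{%x(:)}}} is as a lookup into a small circular buffer. The sketch below is an assumption, not the actual implementation: the source does not say how the map is built, and {{{xmap}}}, {{{width}}} and {{{sweep_pos}}} are hypothetical names (with {{{sweep_pos}}} playing the part of {{{index}}}). It uses a three-slot buffer, matching the three-cell width quoted for {{{q}}}.

```fortran
program thin_window_demo
  implicit none
  integer, parameter :: width = 3   ! q is three cells wide in the example
  integer :: xmap(-1:1)             ! hypothetical stand-in for q%x(:)
  integer :: sweep_pos, off

  do sweep_pos = 1, 5
     ! Map each physical offset relative to the current row onto a buffer slot.
     ! modulo() is non-negative for a positive divisor, so slots are 1..width.
     do off = -1, 1
        xmap(off) = modulo(sweep_pos + off, width) + 1
     end do
     print '(a,i2,a,3i3)', 'sweep_pos=', sweep_pos, '  slots for offsets -1,0,+1:', xmap
  end do
end program thin_window_demo
```

With this choice a given physical x position keeps the same slot for as long as it stays inside the window, so advancing the sweep only overwrites the slot of the slice that has just dropped out, and no data needs to be shifted.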
91 | | |
92 | | The dependencies in each calculation are explicitly stated and then used to determine the physical extent over which each variable is needed, when during the sweep the variable is first calculated, and how long each calculated value is stored. |
| 68 | [[Image(http://www.pas.rochester.edu/~johannjc/Papers/Carroll2011/SweepDemo.png)]] |