RESULTS

SUKDMIG2D

Configuration                                Model Size    Cores    Elapsed Time    Speed up
CPU Only (Baseline), 2x E5-2698 v3 2.30GHz   2301 x 751    1        218             1.00
mig2d:

Parallel Directives

pragma

Parallelize for loops

Vectorize

Compiler vectorizes inner loops

Parallel Directives

restrict on pointers!

Limits aliasing

www.wikipedia.org/wiki/Restrict
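As a minimal sketch of the restrict point above: the function below is a hypothetical helper (`scaled_add` is not part of SUKDMIG2D) that marks both pointer arguments `restrict`. This promises the compiler that the two arrays never overlap, which is what typically lets it vectorize the loop without runtime alias checks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from SUKDMIG2D: out[i] += scale * in[i].
 * The restrict qualifiers promise that out and in never overlap, so the
 * compiler may vectorize the loop without inserting alias checks. */
void scaled_add(size_t n, float *restrict out, const float *restrict in,
                float scale)
{
    /* restrict is a promise, not a check: catch the most obvious
     * violation (same array passed twice) in debug builds. */
    assert(out != in);

    for (size_t i = 0; i < n; ++i)
        out[i] += scale * in[i];
}
```

Calling `scaled_add(4, a, b, 2.0f)` with two distinct arrays is valid. Passing the same array for both arguments breaks the promise, and the behavior is then undefined.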

pragma

Parallelize outer for loops

Compiler parallelizes inner loop

Resolve Errors!

Parallel Directives

pragma

Parallelize outer for loop

Parallelize inner loops

Resolve loop carried dependence

Add acc loop directive

Resit (managed):

537, Accelerator kernel generated
     Generating Tesla code
    538, #pragma acc loop gang /* blockIdx.x */
    553, #pragma acc loop vector(128) /* threadIdx.x */
540, Loop carried dependence of t->-> prevents parallelization
     Loop carried backward dependence of t->-> prevents vectorization
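To illustrate the kind of message shown in the compiler feedback above, here is a small sketch contrasting a true loop carried dependence with one the compiler merely cannot rule out. Both functions (`running_sum`, `subtract`) are hypothetical and not taken from SUKDMIG2D. Inside an OpenACC `parallel` region, a `loop` directive without `seq` or `auto` asserts that the iterations are independent.

```c
#include <assert.h>

/* True loop carried dependence: iteration i reads a[i-1], which the
 * previous iteration has just written, so the order must be preserved. */
void running_sum(int n, float *a)
{
    for (int i = 1; i < n; ++i)
        a[i] += a[i - 1];
}

/* No real dependence: iteration i touches only a[i] and b[i]. A compiler
 * that cannot prove a and b are distinct may still report one. The loop
 * directive inside a parallel region asserts the iterations are independent. */
void subtract(int n, float *a, const float *b)
{
    assert(n >= 0);

    #pragma acc parallel loop copy(a[0:n]) copyin(b[0:n])
    for (int i = 0; i < n; ++i)
        a[i] -= b[i];
}
```

A compiler without OpenACC support ignores the pragma and runs the loop serially, with the same result.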

Resit:

#pragma acc parallel loop
for (ix=0; ix<nx; ++ix)
{
    #pragma acc loop
    for (is=0; is<ns; ++is)
    {
        . . .
        #pragma acc loop
        for (iz=0; iz<nz; ++iz)
            t[ix][iz] -= sr0*tb[jr][iz]+sr*tb[jr+1][iz];
    }
}
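The fragment above can be fleshed out into a compilable sketch under some loud assumptions. The function name `resit_sketch`, the offset array `x`, and the rule used to derive `jr`, `sr0`, and `sr` are all hypothetical, and the elided `is` loop is omitted because its body is not shown. The `float **` arrays are assumed to live in managed (unified) memory, as the "managed" label suggests. Without that, explicit data clauses would be needed.

```c
#include <assert.h>

/* Sketch of a resit-style row update. Offsets x[ix] are assumed to be
 * non-negative. The interpolation setup is an illustrative assumption. */
void resit_sketch(int nx, int nz, int nr, float dr,
                  const float *x, float **tb, float **t)
{
    assert(nr >= 2 && dr > 0.0f);

    /* Outer loop: each iteration writes only its own row t[ix]. */
    #pragma acc parallel loop
    for (int ix = 0; ix < nx; ++ix) {
        /* Assumed rule: linear interpolation between table rows jr, jr+1. */
        float ar = x[ix] / dr;
        int jr = (int)ar;
        if (jr > nr - 2)
            jr = nr - 2;            /* keep jr+1 inside the table */
        float sr = ar - (float)jr;
        float sr0 = 1.0f - sr;

        /* Inner loop: the explicit loop directive asserts independence. */
        #pragma acc loop
        for (int iz = 0; iz < nz; ++iz)
            t[ix][iz] -= sr0 * tb[jr][iz] + sr * tb[jr + 1][iz];
    }
}
```

Because every `ix` iteration writes a different row of `t` and only reads `tb`, the iterations do not depend on each other, which is what the directives assert on the programmer's behalf.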
