[MITgcm-devel] vectorizing adseaice_solve4temp
Martin Losch
Martin.Losch at awi.de
Thu Nov 25 05:56:44 EST 2010
Hi Patrick,
the adjoint code of seaice_solve4temp handles a RECOMPUTATION by creating a local array tsurfloch. Within the "adjoint" iteration, tsurfloch is copied back to tsurfloc. Unfortunately, the full array is copied back for each (i,j). For most people this is probably benign, but on the SX8 it destroys the performance, because only the copying loop is vectorized. I have found a way (with Ralf's help) to overcome this by defining a local tape at the beginning of solve4temp (with a hack to avoid introducing another global maximum value for now):
> #ifdef ALLOW_AUTODIFF_TAMC
> IF (IMAX_TICE .GT. 10) THEN
> STOP 'S/R SEAICE_SOLVE4TEMP: IMAX_TICE > 10'
> ENDIF
> CADJ INIT comlev1_solve4temp = COMMON, sNx*sNy*10
> #endif /* ALLOW_AUTODIFF_TAMC */
and in the iteration:
> Ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
> DO ITER=1,IMAX_TICE
> DO J=1,sNy
> DO I=1,sNx
> #ifdef ALLOW_AUTODIFF_TAMC
> iicekey = I + sNx*(J-1) + (ITER-1)*sNx*sNy
> CADJ STORE tsurfloc(i,j) = comlev1_solve4temp,
> CADJ & key = iicekey, byte = isbyte
> #endif /* ALLOW_AUTODIFF_TAMC */
>
> IF ( iceOrNot(I,J) ) THEN
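As a sanity check on the key formula above, here is a minimal standalone sketch (free-form Fortran, not part of the model code) that verifies that iicekey visits every slot of a tape of length sNx*sNy*10 exactly once when IMAX_TICE is 10. The tile sizes sNx = 4 and sNy = 3 are made up for illustration; in MITgcm they come from SIZE.h.

```fortran
program check_iicekey
  implicit none
  ! Hypothetical tile sizes, chosen only for this illustration
  integer, parameter :: sNx = 4, sNy = 3
  ! Largest iteration count allowed by the guard in the email
  integer, parameter :: IMAX_TICE = 10
  ! Tape length as declared in the CADJ INIT directive
  integer, parameter :: tapeLen = sNx*sNy*10
  logical :: used(tapeLen)
  integer :: i, j, iter, iicekey

  used = .false.
  do iter = 1, IMAX_TICE
    do j = 1, sNy
      do i = 1, sNx
        ! Same key formula as in the quoted code
        iicekey = i + sNx*(j-1) + (iter-1)*sNx*sNy
        if (iicekey < 1 .or. iicekey > tapeLen) stop 'key out of range'
        if (used(iicekey)) stop 'duplicate key'
        used(iicekey) = .true.
      end do
    end do
  end do

  ! Every slot is filled only because IMAX_TICE equals 10 here;
  ! a smaller IMAX_TICE would leave the tail of the tape unused.
  if (.not. all(used)) stop 'unused tape slots'
  print *, 'all keys unique and in range, tape length =', tapeLen
end program check_iicekey
```

With IMAX_TICE larger than 10 the key would run past the end of the tape, which is what the STOP guard at the top of the routine protects against.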
This changes the adjoint code from something like
> tsurfloc(i,j) = max(273.16d0+min_tice,tsurfloc(i,j))
> adtsurfloc(i,j) = adtsurfloc(i,j)*(0.5+sign(0.5d0,tmelt-
> $tsurfloc(i,j)))
> do ip2 = 1, sny
> do ip1 = 1, snx
> tsurfloc(ip1,ip2) = tsurfloch(ip1,ip2)
> end do
> end do
to
> tsurfloc(i,j) = max(273.16d0+min_tice,tsurfloc(i,j))
> adtsurfloc(i,j) = adtsurfloc(i,j)*(0.5+sign(0.5d0,tmelt-
> $tsurfloc(i,j)))
> tsurfloc(i,j) = comlev1_solve4temp_tsurfloc_1h(iicekey)
and the routine is (because of the more efficient vectorization) dramatically faster. Should I put this inside TARGET_SX CPP flags, or is this OK for everyone? I'm asking because there was a reason for not storing tsurfloc earlier, right?
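To put a rough number on the difference, the following standalone sketch (free-form Fortran, not model code) counts array-element assignments for the two restore strategies: copying the whole of tsurfloch back for every (i,j), versus restoring a single element from the tape. The tile sizes are hypothetical, and the count only illustrates the amount of copying; it says nothing about actual timings on the SX8 or elsewhere.

```fortran
program count_restores
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  ! Hypothetical tile sizes, chosen only for this illustration
  integer, parameter :: sNx = 20, sNy = 16
  integer, parameter :: IMAX_TICE = 10
  integer(i8) :: nFull, nTape
  integer :: i, j, ip1, ip2, iter

  nFull = 0_i8
  nTape = 0_i8
  do iter = 1, IMAX_TICE
    do j = 1, sNy
      do i = 1, sNx
        ! Old adjoint code: full-array copy back for each (i,j)
        do ip2 = 1, sNy
          do ip1 = 1, sNx
            nFull = nFull + 1_i8
          end do
        end do
        ! New adjoint code: one element restored from the tape
        nTape = nTape + 1_i8
      end do
    end do
  end do

  print *, 'element copies, full-array restore:', nFull
  print *, 'element copies, tape restore:      ', nTape
  ! The ratio is the number of points in the tile, sNx*sNy
  print *, 'ratio:                             ', nFull / nTape
end program count_restores
```

The full-array variant does sNx*sNy times as many assignments as the tape variant, so the overhead grows with the tile size.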
Martin