[Mitgcm-support] Re: bug for DIVA with -mpi

mitgcm-support at dev.mitgcm.org mitgcm-support at dev.mitgcm.org
Wed Jul 9 15:52:59 EDT 2003


Ralf,

I should mention:
With the 'hand fixes' described below,
the parallel DIVA gradient checks
using 'taf -mpi' work.
So that's nice.

-p.



Patrick Heimbach wrote:
> Ralf,
> 
> I am finally trying to sort out a few DIVA issues.
> In the past we did quite a few 'manual' changes to the DIVA code,
> in particular we didn't use the taf '-mpi' option since there were problems.
> 
> (1):
> I think there's a bug in the TAF-generated DIVA code
> when using '-mpi':
> 
> The two 'call mpi_bcast' statements have to be outside of the following
> (iprod .eq. 0) block:
>  >>>
>       if (iprod .eq. 0) then
>         inquire(file='divided.ctrl',exist=iexistt)
>         if (iexistt) then
>           open(unit=76,file='divided.ctrl',form='formatted')
>           read(unit=76,fmt=*) idivbeh,idivene
>           close(unit=76)
>         else
>           idivbeh = nchklev_3
>           idivene = nchklev_3+(-1)
>         endif
>         call mpi_bcast( idivbeh,1,mpi_integer,0,mpi_comm_world,iert )
>         call mpi_bcast( idivene,1,mpi_integer,0,mpi_comm_world,iert )
>       endif
> <<<
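> otherwise only proc 0 reaches the calls. mpi_bcast is a collective
> call that every processor in mpi_comm_world has to make, so proc 0
> can't broadcast to the other processors, and proc 0 keeps spinning.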
> 
> If I shift the calls 'by hand' outside of the block
> it seems to work fine.
> Should be easy to fix.
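> 
> A minimal sketch of that hand fix as a standalone test program
> (not TAF output): the program name, the declarations and the value
> of nchklev_3 are assumptions added for illustration, and iprod is
> assumed to hold the MPI rank. Only proc 0 reads 'divided.ctrl';
> every processor then makes the two mpi_bcast calls.
>  >>>
>       program divbcast
>       implicit none
>       include 'mpif.h'
> c     nchklev_3 is a placeholder value, not the model setting
>       integer nchklev_3
>       parameter (nchklev_3 = 4)
>       integer iprod, iert, idivbeh, idivene
>       logical iexistt
> 
>       call mpi_init( iert )
>       call mpi_comm_rank( mpi_comm_world, iprod, iert )
> c     only proc 0 touches the control file
>       if (iprod .eq. 0) then
>         inquire(file='divided.ctrl',exist=iexistt)
>         if (iexistt) then
>           open(unit=76,file='divided.ctrl',form='formatted')
>           read(unit=76,fmt=*) idivbeh,idivene
>           close(unit=76)
>         else
>           idivbeh = nchklev_3
>           idivene = nchklev_3+(-1)
>         endif
>       endif
> c     collective calls: executed by all processors, root is proc 0
>       call mpi_bcast( idivbeh,1,mpi_integer,0,mpi_comm_world,iert )
>       call mpi_bcast( idivene,1,mpi_integer,0,mpi_comm_world,iert )
>       print *, 'proc', iprod, ':', idivbeh, idivene
>       call mpi_finalize( iert )
>       end
> <<<
> Built with an MPI Fortran wrapper (e.g. mpif77) and run on several
> processors, each processor should print the same pair of values.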
> 
> 
> 
> There are two other issues, not related to mpi:
> 
> (2):
> After the computation of the cost fc and the first adjoint leg,
> the adjoint state is dumped to 'snapshot', and then picked up later.
> In later runs, the value of fc itself is no longer available.
> But it would be very valuable to keep it, e.g. when doing gradient
> checks which extend over several DIVA intervals (e.g. to test DIVA).
> The easy fix, which I did by hand, is to just append fc (or whatever
> the name of the dependent variable is) to the snapshot file
> (this requires adding the common block for fc to adthe_main_loop).
> It should be very easy for TAF to do this automatically.
> What do you think?
> 
> (3):
> The other issue relates to the '-pure' option.
> Even when we use this option, the whole trajectory is recomputed.
> I guess the problem is that we generate monthly mean active files
> tbar, sbar, psbar, etc.
> I don't know the workings of TAF, but am guessing the following happens:
> although taf does not recompute fc, it recomputes the whole trajectory
> anyway, since it thinks it has to regenerate tbar, sbar, etc.
> However, that's not necessary since we can keep those files.
> I guess this is a trickier issue to solve since you need to decide
> which strategy should be adopted (several ways are conceivable).
> One could be to have a directive in which you specify which active
> files will be available in an adjoint pickup
> (e.g. taf knows that the tapelev3 files are there).
> 
> We can easily fix point (3), but it would be good to have a fix for
> (1) and (2)
> (plus the missing 'mythid' parameter in the ad<toplevel_routine>
> call when using '-pure' which I had mentioned earlier).
> 
> Thanks
> -Patrick
> 
> 
> 


-- 
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
Patrick Heimbach     Massachusetts Institute of Technology
FON: +1/617/253-5259                    EAPS, Room 54-1518
FAX: +1/617/253-4464               77 Massachusetts Avenue
mailto:heimbach at mit.edu                 Cambridge MA 02139
http://www.mit.edu/~heimbach/                       U.S.A.