[MITgcm-devel] snapshots and divided.ctrl

Matthew Mazloff mmazloff at ucsd.edu
Mon Jul 2 14:20:43 EDT 2012


Hi Patrick

My question was basically who chose the syntax for the I/O command:
>>    open(unit=77,file='snapshot'//filen,status='old',form=
>>    $'unformatted',iostat=iers)
>>       if (iers .eq. 0) then
>>         read(unit=77) adapressure0,adapressure1,adaqh0,adaqh1,adarea,
>> ...

But I don't think I need to know that anymore.  What I believe is
happening is that there is an issue with multiple processors accessing
divided.ctrl.  Occasionally, I believe, processor zero is writing
divided.ctrl in adthe_main_loop when another processor tries to read
it in cost_final_restore.  This causes the model to crash.  Then some
processors that were writing the snapshot stop writing, and this is
what I was seeing.

So I believe the problem is with cost_final_restore.  There are
numerous ways to remedy this, and I am not sure which is best.  Since
cost_final_restore is only needed for packing, I will just put it
inside EXCLUDE_CTRL_PACK, but this isn't the most robust fix, so I
won't check it in.  Anyway, I hope this fixes my problem.
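
One of the possible remedies, sketched generically: the writer puts the
new contents in a temporary file and renames it over divided.ctrl, so a
reader sees either the old file or the new one, never a partial one.
This is Python rather than MITgcm Fortran, and it is only an
illustration.  The function names, the temporary-file prefix, and the
two-integer file layout are made up here, not taken from the model or
from TAF.

```python
import os
import tempfile


def write_ctrl_atomically(path, idivbeg, idivend):
    """Write two loop bounds to `path` without exposing a partial file.

    The two-integer layout is an assumption for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".divided.",
                                    suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(f"{idivbeg} {idivend}\n")
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        # os.replace is atomic on POSIX when both paths are on the same
        # filesystem: readers see the old file or the new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_ctrl(path):
    """Read the two integers back (hypothetical reader side)."""
    with open(path) as f:
        idivbeg, idivend = (int(tok) for tok in f.read().split())
    return idivbeg, idivend
```

In the model itself the same effect could probably be had more simply
by letting only one process read the file and pass the values on, or by
synchronizing before the read.  The rename approach also relies on the
filesystem honoring atomic rename, which is worth checking on a
parallel filesystem.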

Thanks
-Matt



On Jul 2, 2012, at 6:10 AM, Patrick Heimbach wrote:

>
> Hi Matt,
>
> not sure I understand your question.
> If DIVA is enabled (via #define ALLOW_DIVIDED_ADJOINT)
> TAF automatically picks the outermost checkpoint level
> (by default ilev_3, in your case ilev_4)
> as the interval with which to checkpoint the adjoint snapshots,
> because it is here that we tell TAF to do so:
> c**************************************
> #   ifdef ALLOW_DIVIDED_ADJOINT
> CADJ loop = divided
> #   endif
> c**************************************
>
> There is no extra directive for where the I/O itself should take place;
> the natural place to do this is:
> * to read the snapshot files right before the ilev_4 loop gets
>   incremented, i.e. before
>      do ilev_4 = idivbeg, idivend+1, -1
> * to overwrite that snapshot file after that same loop is completed,
>   i.e.
>      enddo (of the above loop)
>
> So you should know exactly where in S/R the_main_loop.F TAF will  
> target the read/write,
> namely just before block:
> #   ifdef AUTODIFF_4_LEVEL_CHECKPOINT
>      do ilev_4 = 1,nchklev_4
>         if(ilev_4.le.max_lev4) then
>
> and just after block:
> #    ifdef AUTODIFF_4_LEVEL_CHECKPOINT
>       endif
>      enddo
> #    endif
>
> But I guess you know this, so not sure if this helps.
>
> Cheers
> -Patrick
>
> On Jun 29, 2012, at 8:07 PM, Matthew Mazloff wrote:
>
>> Hello
>>
>> I am having an issue where occasionally the adjoint snapshot files
>> are not properly written out.  I wanted to troubleshoot this and
>> perhaps just change the way the I/O for these files is performed,
>> but I am having trouble figuring out where TAF gets the info for
>> snapshots.  Is there a place where we tell TAF to write the adjoint
>> state snapshot, or does it do that on its own?  I can't seem to
>> locate in any forward code the call that generates the adjoint code:
>>
>> C----------------------------------------------
>> C read snapshot
>> C----------------------------------------------
>> 9813 continue
>>     if (idivbeg .lt. nchklev_4) then
>>       open(unit=77,file='snapshot'//filen,status='old',form=
>>    $'unformatted',iostat=iers)
>>       if (iers .eq. 0) then
>>         read(unit=77) adapressure0,adapressure1,adaqh0,adaqh1,adarea,
>> ...
>>
>> nor the call for the later write-snapshot block.  How is this
>> information provided to TAF?
>>
>>
>> Matt
>>
>>
>>
>> _______________________________________________
>> MITgcm-devel mailing list
>> MITgcm-devel at mitgcm.org
>> http://mitgcm.org/mailman/listinfo/mitgcm-devel
>
> ---
> Patrick Heimbach | heimbach at mit.edu | http://www.mit.edu/~heimbach
> MIT | EAPS 54-1420 | 77 Massachusetts Ave | Cambridge MA 02139 USA
> FON +1-617-253-5259 | FAX +1-617-253-4464 | SKYPE patrick.heimbach
>
>



