[MITgcm-support] Adjoint tutorial problems

Marcello Magaldi Marcello.Magaldi at jhu.edu
Fri Feb 3 22:21:39 EST 2012


Ciao Patrick,

>Several clues:
>Does it get past cost calculation? Do a
>grep 'fc ' STDOUT.0000

It does not: the grep returns nothing. (It clearly gets past the cost
calculation when the run goes through, i.e. when all processes are on one
node or when I use the GNU compilers.)
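
For reference, the same check can be repeated over every per-process output file rather than only STDOUT.0000. This is a minimal sketch, assuming the run directory holds one STDOUT.NNNN file per MPI process; the function name check_cost is hypothetical and not part of MITgcm.

```shell
# Hypothetical helper (not part of MITgcm): for each per-process output
# file in a run directory, report whether a line containing 'fc ' (the
# string the suggested grep looks for) was written.
check_cost() {
  dir="${1:-.}"                      # run directory, defaults to the current one
  for f in "$dir"/STDOUT.*; do
    [ -e "$f" ] || continue          # skip if the glob matched nothing
    if grep -q 'fc ' "$f"; then
      echo "$(basename "$f"): cost reached"
    else
      echo "$(basename "$f"): no cost line"
    fi
  done
}
```

Calling `check_cost .` in the run directory prints one line per file, which shows whether any process got as far as the cost calculation before the crash.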

>That experiment is one of few that tests 2-level checkpointing.
>How big are your tiles? Could you send SIZE.h?

I am compiling for 8 processes (but deliberately running on 2 different
nodes, 2:ppn=4).
The relevant part of my SIZE.h is

      PARAMETER (
     &           sNx =  45,
     &           sNy =  10,
     &           OLx =   3,
     &           OLy =   3,
     &           nSx =   1,
     &           nSy =   1,
     &           nPx =   2,
     &           nPy =   4,
     &           Nx  = sNx*nSx*nPx,
     &           Ny  = sNy*nSy*nPy,
     &           Nr  =  20)
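
The arithmetic implied by these values can be checked with a short shell sketch. The numbers are copied from the fragment above; the variable names nprocs and tile_points are illustrative only, and tile_points assumes a single 3-D field dimensioned over the tile plus its overlap regions.

```shell
# Values copied from the SIZE.h fragment above.
sNx=45; sNy=10; OLx=3; OLy=3
nSx=1;  nSy=1;  nPx=2; nPy=4; Nr=20

Nx=$((sNx * nSx * nPx))     # global grid points in x
Ny=$((sNy * nSy * nPy))     # global grid points in y
nprocs=$((nPx * nPy))       # number of MPI processes the build expects

# Points in one 3-D tile array, including the overlap (halo) on each side.
tile_points=$(( (sNx + 2*OLx) * (sNy + 2*OLy) * Nr ))

echo "Nx=$Nx Ny=$Ny nprocs=$nprocs tile_points=$tile_points"
# prints: Nx=90 Ny=40 nprocs=8 tile_points=16320
```

So the decomposition is consistent with 8 processes, and each tile is small (45 x 10 interior points, 51 x 16 with overlaps).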


>Could you try following changes:
>
>1. in ECCO_CPPOPTIONS.h set:
>#undef AUTODIFF_2_LEVEL_CHECKPOINT
>
>2. in tamc.h set:
>      parameter( nchklev_1      =    1 )
>and a bit further down
>      parameter( nchklev_2      =    30 )

Changes made! Compiled and ran: same problem as before. The error is again:

--------------------------------------------------------------------------
mpirun noticed that process rank 4 with PID 9555 on node ln188 exited
on signal 11 (Segmentation fault).
--------------------------------------------------------------------------











