[MITgcm-devel] good news for FIZHI !!!
Jean-Michel Campin
jmc at ocean.mit.edu
Sat Jul 30 16:40:47 EDT 2005
Hello Ed,
It's interesting.
A few months ago I ran some tests with a very simple program
that requires dynamic allocation.
With ifort on faulks, the unlimited option
did not do anything, and I was still getting the
Segmentation fault.
But with pgf77 and g77, my small test works fine.
Maybe it's the ifort compiler on faulks, or the options
that I was using?
Jean-Michel
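
A quick check for a case like this is whether the raised limit is really
in effect in the shell that launches the program. The sketch below is not
taken from the tests described above. It assumes bash, whose ulimit builtin
accepts -S (soft limit) and -H (hard limit), and the variable names are
only illustrative. It prints both stack limits and shows that a child
process sees the same soft limit as its parent shell.

```shell
#!/bin/bash
# Print the soft and hard stack limits (in kilobytes, or "unlimited").
soft=$(ulimit -Ss)
hard=$(ulimit -Hs)
echo "soft stack limit: $soft"
echo "hard stack limit: $hard"

# A child process inherits the limits of the shell that starts it.
child=$(bash -c 'ulimit -Ss')
echo "soft stack limit seen by a child: $child"
```

If the soft limit printed here is still finite after the unlimited setting
was requested, the request did not take effect in that shell, which would
be a separate question from the compiler or its options.
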
On Sat, Jul 30, 2005 at 12:24:09PM -0400, Ed Hill wrote:
>
> Hi Andrea, Jean-Michel, and Chris
>
> I have some excellent FIZHI news but first let me tell you how I found
> it since it might be helpful in the future.
>
> Despite Andrea's (good!) efforts to remove Fortran "save" and any other
> not-so-portable idioms, FIZHI still only ran with the PGI compiler
> although it compiled without any real trouble on the others. So in an
> effort to get it running on Columbia, I spent a few hours on it last
> night and this morning trying to get it working with g77 and ifort on my
> laptop. The key thing I discovered was that both ifort and g77
> segfault-ed at exactly the same place in the irrad() routine:
>
> $ gdb ./mitgcmuv
>
> [...lots of output...]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x080b52c6 in irrad_ ()
>
> (gdb) bt
> #0 0x080b52c6 in irrad_ ()
> #1 0x080b223f in lwrio_ ()
> #2 0x0809ddc4 in fizhi_driver__ ()
> #3 0x08091374 in do_fizhi__ ()
> #4 0x0810ac48 in fizhi_wrapper__ ()
> #5 0x081ccfe7 in do_atmospheric_phys__ ()
> #6 0x081d6746 in forward_step__ ()
> #7 0x0820835c in the_main_loop__ ()
> #8 0x082084b5 in the_model_main__ ()
> #9 0x081acee3 in MAIN__ ()
> #10 0x0820e4c5 in main ()
> (gdb)
>
> So, I then wasted hours trying to pare down irrad() to figure out the
> exact line(s) causing the segfault. With both g77 and ifort, the
> segfault happened before the first statement in irrad() was executed.
> That's the tip-off. It wasn't irrad() code or any of the many DATA
> statements it contains. It was irrad() simply using too much stack
> space with its many local variables!
>
> So, I set the stack size to unlimited in my shell:
>
> $ ulimit -s unlimited
>
> and now you can see on our testing pages that the FIZHI experiment works
> reasonably well on the "ernie" machine (my laptop) which has ifort v8.1
> and g77 v3.4.4 installed (Fedora Core 3):
>
> http://mitgcm.org/testing.html
>
> I'll look into our testing scripts next and try to figure out how to do
> the ulimit command within MPI runs.
>
> Ed
>
> ps - This means the PGI compiler is almost certainly creating Fortran
> local variables on the heap instead of the stack. Yeah, there's
> some useful trivia.
>
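
For the open question of applying the ulimit setting within MPI runs, one
possible approach is a small wrapper that raises the limit and then starts
the model. The sketch below is only an illustration under assumptions: it
assumes bash, the function name raise_stack_limit is hypothetical, and
whether a given MPI launcher starts each rank through such a wrapper has
to be checked for the installation in use.

```shell
#!/bin/bash
# Hypothetical helper: raise the soft stack limit as far as the hard
# limit allows, for this shell and every process it starts afterwards.
raise_stack_limit() {
    # Try "unlimited" first; this fails when the hard limit is finite.
    if ! ulimit -Ss unlimited 2>/dev/null; then
        # Fall back to the largest value the hard limit permits.
        ulimit -Ss "$(ulimit -Hs)"
    fi
}

raise_stack_limit
echo "soft stack limit is now: $(ulimit -Ss)"
```

In a wrapper script, these lines would be followed by a line that starts
the executable (for example ./mitgcmuv), and the wrapper would be given to
the MPI launcher in place of the executable itself.
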