[MITgcm-support] segmentation fault problem

Oliver Jahn jahn at MIT.EDU
Wed Aug 11 09:37:29 EDT 2010


Is there a comma missing after dummyRS in your MDS_READ_FIELD call?
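
To illustrate the point, here is a minimal sketch of the call with the comma restored. It assumes fixed-form source, where blanks and line breaks inside a continued statement are not significant, so "dummyRS" followed by "1" on the next line would be read as the single name "dummyRS1" and the routine would receive 10 arguments instead of 11. READ_FIELD_STUB, the driver program, the array sizes and the value of readBinaryPrec are hypothetical stand-ins; only the argument list is taken from the quoted call.

```fortran
      PROGRAM CALLDEMO
C     Hypothetical driver, not MITgcm code. Only the argument list
C     mirrors the MDS_READ_FIELD call quoted in this thread.
      IMPLICIT NONE
      CHARACTER*(80) dampCoeffFile
      INTEGER readBinaryPrec, myThid
      REAL*8 dampAlpha(4), dummyRS(1)
      dampCoeffFile = 'dampAlpha.bin'
C     Assumed value, for illustration only.
      readBinaryPrec = 64
      myThid = 1
C     Corrected call: note the comma after dummyRS.
      CALL READ_FIELD_STUB(
     &     dampCoeffFile, readBinaryPrec, .TRUE.,
     &     'RL', 1, 1, 1,
     &     dampAlpha, dummyRS,
     &     1, myThid )
      END

      SUBROUTINE READ_FIELD_STUB(
     &     fName, filePrec, useCurrentDir,
     &     arrType, kSize, kLo, kHi,
     &     fldRL, fldRS,
     &     irecord, myThid )
C     Stand-in with the same 11-argument layout as the quoted call.
C     It only echoes the last two arguments.
      IMPLICIT NONE
      CHARACTER*(*) fName, arrType
      INTEGER filePrec, kSize, kLo, kHi, irecord, myThid
      LOGICAL useCurrentDir
      REAL*8 fldRL(*), fldRS(*)
      PRINT *, 'irecord =', irecord, ' myThid =', myThid
      END
```

With IMPLICIT NONE in the calling routine, a compiler would normally reject the undeclared name "dummyRS1" at build time, which is one way to catch this kind of slip before it turns into a run-time crash.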

On 08/11/2010 08:52 AM, q li wrote:
> Does anyone have any thoughts on this?
> I moved my code to my own laptop and also reduced the array to 512x20.
> I still get the segmentation fault. Tracing back from the crash, I
> suspect that my MDS_READ_FIELD call is wrong. Is that right? The problem
> is still not solved.
> Here is the MDS_READ_FIELD call:
> CALL MDS_READ_FIELD(
> & dampCoeffFile, readBinaryPrec, .TRUE.,
> & 'RL', 1, 1, 1,
> & dampAlpha,dummyRS
> & 1, myThid)
> Here is some debug output (I don't know why it reports AMD x86-64 when I
> am using an Intel CPU):
> (PID.TID 0000.0001) // =======================================================
> (PID.TID 0000.0001) // Parameter file "data.relaxbt"
> (PID.TID 0000.0001) // =======================================================
> (PID.TID 0000.0001) ># Open-boundaries
> (PID.TID 0000.0001) > &RELAXBT_PARM
> (PID.TID 0000.0001) > dampCoeffFile = 'dampAlpha.bin',
> (PID.TID 0000.0001) > &
> (PID.TID 0000.0001)
> Segmentation fault (core dumped)
> [hiphop at localhost run1]$ file core.8708
> core.8708: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV),
> SVR4-style, from 'mitgcmuv'
> [hiphop at localhost run1]$ gdb ./mitgcmuv core.8708
> GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-23.el5)
> Copyright (C) 2009 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /home/hiphop/MITgcmStudy/Iwamae/run1/mitgcmuv...(no
> debugging symbols found)...done.
> Reading symbols from /usr/lib64/libg2c.so.0...(no debugging symbols
> found)...done.
> Loaded symbols for /usr/lib64/libg2c.so.0
> Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib64/libm.so.6
> Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols
> found)...done.
> Loaded symbols for /lib64/libgcc_s.so.1
> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
> Loaded symbols for /lib64/libc.so.6
> Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols
> found)...done.
> Loaded symbols for /lib64/ld-linux-x86-64.so.2
> Core was generated by `./mitgcmuv'.
> Program terminated with signal 11, Segmentation fault.
> #0 0x00000000004a32d6 in master_cpu_io__ ()
> (gdb) backtrace
> #0 0x00000000004a32d6 in master_cpu_io__ ()
> #1 0x000000000041f241 in mds_read_field__ ()
> #2 0x0000000000401cf8 in ini_relaxbt__ ()
> #3 0x00000000004023b1 in initialise_varia__ ()
> #4 0x000000000050f803 in the_main_loop__ ()
> #5 0x000000000050fb07 in the_model_main__ ()
> #6 0x00000000004a3229 in MAIN__ ()
> #7 0x0000000000517fb2 in main ()
> (gdb)
> Any help?
> Li
>
> ------------------------------------------------------------------------
> *From:* q li <qliuri at yahoo.com>
> *To:* MITgcm-support at mitgcm.org
> *Sent:* Tue, August 10, 2010 11:40:53 AM
> *Subject:* [MITgcm-support] segmentation fault problem
>
> Hi users,
> I am having a segmentation fault problem on an AMD64 cluster (ifort,
> mpich2). I got a warning (see below) when I compiled the code, and then a
> segmentation fault at run time. I thought it was a stack problem, but the
> same error still occurs even after setting "ulimit -s unlimited". Has
> anyone had this problem before?
> Li
> Here is the warning and error:
> [hiphop at node4 build]$ make > makeoutput.txt
> sigreg.c(46): warning #556: a value of type "void *" cannot be assigned
> to an entity of type "void (*)(int, siginfo_t *, void *)"
> s.sa_sigaction = (void *)killhandler;
> ^
> [hiphop at node4 run]$ !mpi
> mpirun -np 1 ./mitgcmuv
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> rank 0 in job 30 node4_46608 caused collective abort of all ranks
> exit status of rank 0: killed by signal 9
>
>
>
>
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mitgcm.org/mailman/listinfo/mitgcm-support


