[MITgcm-support] MITgcm-support Digest, Vol 237, Issue 1

Matthew Mazloff mmazloff at ucsd.edu
Fri Mar 3 19:04:35 EST 2023


It may be a problem with reading the data* files. I have had issues with this in the past when using many cores, though it has been a long time since I have seen it arise.
To try something other than the default, you can try
USE_FORTRAN_SCRATCH_FILES
I'm not sure how it works, but it only affects this part of the code, so it is definitely safe to try,

or you can try
SINGLE_DISK_IO
but this eliminates I/O from all other cores, and will thus suppress their error messages. That said, if multiple cores trying to process data* at once is your issue, this will resolve it.
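
One rough way to test the concurrent-read idea before rebuilding is to compare whatever scratch1.* files are left in the run directory after the crash. This is only a sketch, not an established MITgcm procedure: the helper names (group_by_content, odd_ones_out) are made up for illustration, and it assumes the leftover files were all copied from the same data* file, which may not hold if ranks failed at different stages.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_by_content(paths):
    """Map each SHA-256 digest to the sorted list of file names that share it."""
    groups = defaultdict(list)
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        groups[digest].append(Path(path).name)
    return {digest: sorted(names) for digest, names in groups.items()}


def odd_ones_out(paths):
    """Return names of files whose content differs from the most common content."""
    groups = group_by_content(paths)
    if not groups:
        return []
    # The largest group is taken as the "expected" content (an assumption).
    majority = max(groups.values(), key=len)
    return sorted(
        name
        for names in groups.values()
        if names is not majority
        for name in names
    )
```

Called as odd_ones_out(sorted(Path("run_ad").glob("scratch1.*"))), an empty result means the leftover copies are identical, which would suggest looking at the data* file itself; a few odd files would be more consistent with a read problem on those ranks.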

Matt




> On Mar 3, 2023, at 2:14 PM, mario wrk <wrkmario at gmail.com> wrote:
> 
> Thanks for pointing that out! The scratch files point to data.exch2 and data.ctrl. I excluded some packages one by one, but I still have similar issues.
> In the end there is a segmentation fault (srun: error: l10551: task 256: Segmentation fault). Now I strongly suspect a compiler issue, because it worked with some verification examples, but my own configuration substantially increases the resolution and I want to compile and run it on multiple processors.
> I also noticed there were similar discussions before: http://mailman.mitgcm.org/pipermail/mitgcm-support/2018-June/011593.html
> 
> Best, 
> Mario
> 
>  255: forrtl: severe (28): CLOSE error, unit 11, file "Unknown"
>  255: Image              PC                Routine            Line        Source             
>  255: libifcoremt.so.5   00001555553EFBDE  for__exit_handler     Unknown  Unknown
>  255: libifcoremt.so.5   00001555553FC78E  for__signal_handl     Unknown  Unknown
>  255: libpthread-2.28.s  0000155550C3BC20  Unknown               Unknown  Unknown
>  255: libc-2.28.so       000015555095316B  unlink                Unknown  Unknown
>  255: libifcoremt.so.5   00001555553E18B1  for__close_proc       Unknown  Unknown
>  255: libifcoremt.so.5   00001555553E0EB0  for_close             Unknown  Unknown
>  255: mitgcmuv_ad        0000000000902A3F  Unknown               Unknown  Unknown
>  255: mitgcmuv_ad        00000000009E2F4A  Unknown               Unknown  Unknown
>  255: mitgcmuv_ad        00000000009DB0E5  Unknown               Unknown  Unknown
>  255: mitgcmuv_ad        00000000009F7D38  Unknown               Unknown  Unknown
>  255: mitgcmuv_ad        000000000097DDEA  Unknown               Unknown  Unknown
>  714: mitgcmuv_ad        0000000000403852  Unknown               Unknown  Unknown
>  714: libc-2.28.so       0000155550887493  __libc_start_main     Unknown  Unknown
>  714: mitgcmuv_ad        000000000040375E  Unknown               Unknown  Unknown
>  255: mitgcmuv_ad        0000000000403852  Unknown               Unknown  Unknown
>  255: libc-2.28.so       0000155550887493  __libc_start_main     Unknown  Unknown
>  255: mitgcmuv_ad        000000000040375E  Unknown               Unknown  Unknown
> 1481: forrtl: error (78): process killed (SIGTERM)
> 1481: Image              PC                Routine            Line        Source             
> 1481: libifcoremt.so.5   00001555553FC76C  for__signal_handl     Unknown  Unknown
> 1481: libpthread-2.28.s  0000155550C3BC20  Unknown               Unknown  Unknown
> 1481: libpthread-2.28.s  0000155550C3B1D6  __open64              Unknown  Unknown
> 1481: libifcoremt.so.5   00001555554911B1  for__open_proc        Unknown  Unknown
> 1481: libifcoremt.so.5   000015555540CCBE  for_open              Unknown  Unknown
> 1481: mitgcmuv_ad        000000000097F1F5  Unknown               Unknown  Unknown
> 1481: mitgcmuv_ad        000000000090122F  Unknown               Unknown  Unknown
> 1481: mitgcmuv_ad        00000000009E2F4A  Unknown               Unknown  Unknown
> 1481: mitgcmuv_ad        00000000009DB0E5  Unknown               Unknown  Unknown
> 1481: mitgcmuv_ad        00000000009F7D38  Unknown               Unknown  Unknown
> 1481: mitgcmuv_ad        000000000097DDEA  Unknown               Unknown  Unknown
> 
> On Fri, Mar 3, 2023 at 6:35 PM <mitgcm-support-request at mitgcm.org> wrote:
>> 
>> Today's Topics:
>> 
>>    1. too many values for NAMELIST variable (mario wrk)
>>    2. Re: too many values for NAMELIST variable
>>       (Menemenlis, Dimitris (US 329B))
>>    3. Re: too many values for NAMELIST variable
>>       (Carroll, Dustin (US 329C-Affiliate))
>> 
>> 
>> ----------------------------------------------------------------------
>> 
>> Message: 1
>> Date: Fri, 3 Mar 2023 17:48:03 +0300
>> From: mario wrk <wrkmario at gmail.com>
>> To: mitgcm-support at mitgcm.org
>> Subject: [MITgcm-support] too many values for NAMELIST variable
>> Message-ID:
>>         <CAAfDP0dnjOKcAO682mC1+z1BDGmAYXdOaC3-iLQfJa0E8kinjA at mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>> 
>> Dear MITgcm community,
>> I was running a high resolution model in parallel  with OpenMPI and got the
>> error below.
>> Does anyone have a clue?
>> Best,
>> Mario
>> 
>> 
>> 799: forrtl: severe (18): too many values for NAMELIST variable, unit 11, file .........run_ad/scratch1.000000799, line 3315, position 7
>>  799: Image              PC                Routine            Line        Source
>>  799: libifcoremt.so.5   00001555553E6E79  for__io_return        Unknown  Unknown
>>  799: libifcoremt.so.5   000015555542F3F5  for_read_seq_nml      Unknown  Unknown
>>  799: mitgcmuv_ad        0000000000786AC8  Unknown               Unknown  Unknown
>>  799: mitgcmuv_ad        000000000077F9A4  Unknown               Unknown  Unknown
>>  799: mitgcmuv_ad        0000000000862732  Unknown               Unknown  Unknown
>>  799: mitgcmuv_ad        00000000008BD780  Unknown               Unknown  Unknown
>>  799: mitgcmuv_ad        00000000004037C2  Unknown               Unknown  Unknown
>>  799: libc-2.28.so       0000155550887493  __libc_start_main     Unknown  Unknown
>>  799: mitgcmuv_ad        00000000004036CE  Unknown               Unknown  Unknown
>> ------------------------------
>> 
>> Message: 2
>> Date: Fri, 3 Mar 2023 15:11:44 +0000
>> From: "Menemenlis, Dimitris (US 329B)"
>>         <dimitris.menemenlis at jpl.nasa.gov>
>> To: MITgcm Support <mitgcm-support at mitgcm.org>
>> Subject: Re: [MITgcm-support] too many values for NAMELIST variable
>> Message-ID: <541236B8-2F43-46B3-A2BE-67CC43AD6D94 at jpl.nasa.gov>
>> Content-Type: text/plain; charset="us-ascii"
>> 
>> There is a problem with one of your runtime parameter files (data or data.*) in your runtime directory run_ad.
>> 
>> ------------------------------
>> 
>> Message: 3
>> Date: Fri, 3 Mar 2023 15:35:12 +0000
>> From: "Carroll, Dustin (US 329C-Affiliate)"
>>         <dustin.carroll at jpl.nasa.gov>
>> To: "mitgcm-support at mitgcm.org" <mitgcm-support at mitgcm.org>
>> Subject: Re: [MITgcm-support] too many values for NAMELIST variable
>> Message-ID:
>>         <SJ0PR09MB90307EB60388A8CD177F8130BDB39 at SJ0PR09MB9030.namprd09.prod.outlook.com>
>> 
>> Content-Type: text/plain; charset="windows-1252"
>> 
>> To follow up on Dimitris' comment, if you open the file "scratch1.000000799" in your run_ad directory
>> and look at line 3315, position 7, this will tell you where the syntax error / incorrect parameter value
>> occurred in your data.* file.
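
The lookup described above can be scripted. The following is a minimal sketch, not part of MITgcm: the function names are hypothetical, and it assumes the line and position in the runtime message are both counted from 1.

```python
from pathlib import Path


def mark_position(text: str, line_no: int, position: int) -> str:
    """Return the requested line with a caret under the reported position.

    line_no and position are treated as 1-based (an assumption about
    how the runtime error message counts them).
    """
    lines = text.splitlines()
    if not 1 <= line_no <= len(lines):
        raise ValueError(f"line {line_no} is outside the file ({len(lines)} lines)")
    line = lines[line_no - 1]
    caret = " " * max(position - 1, 0) + "^"
    return f"{line}\n{caret}"


def show_scratch_position(path: str, line_no: int, position: int) -> str:
    """Read a scratch file from disk and mark the reported position."""
    text = Path(path).read_text(errors="replace")
    return mark_position(text, line_no, position)
```

For the error in this thread, the call would be print(show_scratch_position("scratch1.000000799", 3315, 7)), run from the run_ad directory.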
>> 
>> ------------------------------
>> 
>> End of MITgcm-support Digest, Vol 237, Issue 1
>> **********************************************
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mailman.mitgcm.org/mailman/listinfo/mitgcm-support
