[MITgcm-support] build options for Stampede
Holly Dail
hdail at MIT.EDU
Wed Feb 6 17:40:20 EST 2013
Hi Matt -
I don't have any suggestions for _fixing_ the problem, but I can tell you what drastically reduced my failure rate when I had a similar issue: modify the code to try opening the file several times (maybe 3?) before giving up. In my case the failure was so transient that simply repeating the call was effective. I can't remember whether a sleep was required between the attempts; it would probably help, but it will also slow down your code.
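To make the retry idea concrete, here is a minimal sketch of the pattern in Python. It is only an illustration: the real change would go in the Fortran I/O routines of the mdsio package, and the function name `open_with_retry`, its parameters, and the default of 3 attempts with a 0.5 s pause are all hypothetical choices, not anything taken from MITgcm.

```python
import time


def open_with_retry(path, mode="rb", attempts=3, delay=0.5, opener=open):
    """Open `path`, retrying on OSError (hypothetical helper, not MITgcm code).

    Tries up to `attempts` times and sleeps `delay` seconds between
    failed attempts. If every attempt fails, the last error is re-raised.
    `opener` defaults to the built-in open() and can be swapped for testing.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return opener(path, mode)
        except OSError as err:
            last_error = err
            # Pause only if another attempt is still to come.
            if attempt < attempts:
                time.sleep(delay)
    raise last_error
```

A longer `delay` gives a transient filesystem problem more time to clear, at the cost of slowing the run each time a failure occurs.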
Holly
On Feb 6, 2013, at 11:54 AM, Matthew Mazloff wrote:
> Hi Angela,
>
> Thanks - but unfortunately Stampede is different from these two machines. The good news is it is far faster -- almost 4 times faster. The bad news is that it doesn't seem to like the I/O statements in the mdsio package.
>
> I find it randomly trips up writing (usually with MDS_WRITE_FIELD) when the file already exists.
>
> The error I get is usually a simple
>
> forrtl: severe (121): Cannot access current working directory for unit 9, file "Unknown"
>
> Sometimes, though, it tells me that a file that definitely exists does not exist, or that the file cannot be written to.
>
> For example, the adjoint needs some average files and it writes out some "bar" fields. Yesterday it crashed writing a "bar" file after running fine for about 2 years…
>
> I also find that its processors do not like to share -- I have had to give each processor its own tiled forcing file because otherwise it randomly crashes…
>
> However, there have been a few runs that complete just fine….
>
> TACC consulting do not have any suggestions. I am hoping that with the proper compile flags I can make the run more robust.
>
> Any suggestions appreciated!
> Matt
>
> On Feb 6, 2013, at 5:42 AM, Angela Zalucha <azalucha at seti.org> wrote:
>
>> Matt: I have successfully used TACC's Lonestar and Ranger machines, but won't be in front of a real computer until tomorrow night, when I can send you the opt files.
>>
>> Angela
>>
>> P.S. I can also send you results for scaling tests, and would be interested to know how Stampede works out for you. For instance, I find that Lonestar is a lot faster than Ranger. I am actively proposing to XSEDE.
>>
>>
>> Matthew Mazloff <mmazloff at ucsd.edu> wrote:
>>
>> Hello,
>>
>> Has anyone made a compile options file for TACC's Stampede yet?
>>
>> http://www.tacc.utexas.edu/user-services/user-guides/stampede-user-guide
>>
>> I am getting random crashes on this machine due to disk access issues and suspect I may not be using an ideal option file.
>>
>> Thanks
>> Matt
>>
>>
>> _______________________________________________
>> MITgcm-support mailing list
>> MITgcm-support at mitgcm.org
>> http://mitgcm.org/mailman/listinfo/mitgcm-support
>
Holly Dail
Postdoctoral Fellow
Harvard University
http://fas.harvard.edu/~hdail/