[MITgcm-support] changing number of processors
Jonny Williams
Jonny.Williams at bristol.ac.uk
Thu Mar 5 06:19:08 EST 2015
As a related question to this thread, is it possible to output one NetCDF
file per stream (state*.nc, ptracers*.nc, etc.) rather than one per process?
I am currently running on ARCHER, the UK national supercomputing facility,
and I am not getting the speed-up that I expect for a long job, whereas I
did get the expected speed-up for a very short test job.
I am wondering whether I/O may be the bottleneck here?
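For what it's worth, the only related switch I have come across applies to
the plain binary (MDS) output rather than NetCDF: you can ask for a single
global file per stream in the main "data" namelist. A minimal sketch
(untested by me, so please check your version):

 &PARM01
  useSingleCpuIO=.TRUE.,
 &

For the NetCDF (pkg/mnc) output, one file per tile seems to be the design,
and the usual approach appears to be gluing the tile files together after
the run (e.g. with the gluemncbig script that ships with the model source).
I have not found a per-stream equivalent for NetCDF, hence the question.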
Cheers!
Jonny
On 10 February 2015 at 07:38, Martin Losch <Martin.Losch at awi.de> wrote:
> Hi Jonny and others,
>
> I am not sure if I understand your question about "the utility of the
> overlap cells": the overlaps are filled with the values of the neighboring
> tiles so that you can compute terms of the model equations near the tile
> boundary; without the overlap you would not be able to evaluate any
> horizontal gradient or average at the tile boundary.
> The size of the overlap depends on the computational stencil that you
> want to use: a 2nd-order operator needs an overlap of 1, a 3rd-order
> operator needs an overlap of 2, and so forth. I think the model tells
> you when your choice of advection scheme requires more overlap than you
> have specified.
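>
> To make that concrete, here is a minimal standalone sketch (not MITgcm
> code; all names are made up). The 2nd-order centered difference at the
> first interior point reads one cell from the overlap, which the exchange
> has filled from the neighboring tile:
>
>       program halodemo
>       integer sNx, olx, i
>       parameter ( sNx = 10, olx = 1 )
>       real phi(1-olx:sNx+olx), dphidx(sNx), dx
>       dx = 1.0
> c     pretend the exchange has already filled phi, overlap included
>       do i = 1-olx, sNx+olx
>          phi(i) = real(i)*dx
>       enddo
> c     at i=1 this reads phi(0), i.e. an overlap cell; without the
> c     overlap the loop could not start at the tile boundary
>       do i = 1, sNx
>          dphidx(i) = ( phi(i+1) - phi(i-1) ) / ( 2.0*dx )
>       enddo
>       print *, 'dphidx(1) =', dphidx(1)
>       end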
>
> Martin
>
> PS:
> Here’s my experience with scaling or not scaling (by no means are these
> absolute numbers or recommendations):
> As a rule of thumb, the MITgcm dynamics/thermodynamics kernel (various
> packages may behave differently) usually scales nearly linearly down to
> tile sizes (sNx * sNy) of about 30*30, at which point the overlap-to-domain
> overhead becomes unfavorable (because of too many local communications
> between individual tiles) and the global pressure solver takes its toll
> (because of global communications for which all processes have to wait).
> Below this tile size the time to solution still decreases with more
> processors, but ever more slowly, until the overhead costs more than the
> extra processors gain. To re-iterate what Matt already wrote: for a 30x30
> tile the overlap holds nearly 2*(OLx*sNy + OLy*sNx) cells, so for an
> overlap of 2 you already have 8*30 = 240 cells in the overlap, more than
> one quarter of the 900 cells in the interior. From this point of view a
> tile size of 2x2 with a 1-gridpoint overlap is totally inefficient.
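>
> To put numbers on this, here is a small standalone sketch (not MITgcm
> code) that prints the overlap-to-interior ratio for square tiles with a
> 2-cell overlap:
>
>       program tileoverhead
>       integer n, ol
>       real ratio
>       ol = 2
>       do n = 10, 60, 10
> c        overlap band is roughly 2*(OLx*sNy + OLy*sNx) cells
>          ratio = real( 2*(ol*n + ol*n) ) / real( n*n )
>          print *, 'tile', n, 'x', n, ': overlap/interior =', ratio
>       enddo
>       end
>
> For n = 30 this gives 240/900, i.e. the "more than one quarter" above.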
> Further, it is probably better to have nearly square tiles (so sNx ~
> sNy), except on vector machines, where you try to make sNx as large as
> possible (at least until you reach the maximum vector length of your
> machine).
>
> In my experience you need to test this on every new computer that you
> have access to, to find the range of processor counts that you can run
> with efficiently. For example, it may be more economical to use fewer
> processors and wait a little longer for the result, but have enough CPU
> time left to do a second run of the same type, than to spend all your CPU
> time on a run with twice as many processors that may finish faster, but
> not twice as fast because the linear-scaling limit has been reached.
>
> > On 09 Feb 2015, at 16:05, Jonny Williams <Jonny.Williams at bristol.ac.uk>
> wrote:
> >
> > Dear Angela, Matthew
> >
> > Thank you very much for your emails.
> >
> > For your information, I have now got round my initial problem of the
> > NaNs by using a shorter timestep, although I don't know why this should
> > have made much difference...
> >
> > Your discussion about the overlap parameters and run speed is of
> interest to me because I found that a decrease in timestep by a factor of 4
> and an increase in the number of processors by a factor of 10 resulted in
> an almost identical run speed!
> >
> > My SIZE.h parameters were as follows...
> >
> > C sNx, sNy :: tile size in x, y
> > C OLx, OLy :: overlap (halo) width in x, y
> > C nSx, nSy :: number of tiles per process
> > C nPx, nPy :: number of processes in x, y
> > C Nr       :: number of vertical levels
> > PARAMETER (
> > & sNx = 75,
> > & sNy = 10,
> > & OLx = 4,
> > & OLy = 4,
> > & nSx = 1,
> > & nSy = 1,
> > & nPx = 6,
> > & nPy = 80,
> > & Nx = sNx*nSx*nPx,
> > & Ny = sNy*nSy*nPy,
> > & Nr = 50)
> >
> > ... so (using the calculation from the earlier email) I have
> > (4+75+4)*(4+10+4) = 1494 grid cells per process, of which 75*10 = 750
> > (about 50%) are cells I care about.
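> >
> > As a sanity check on that arithmetic, here is a tiny standalone program
> > (a sketch, not MITgcm code) that reproduces the numbers:
> >
> >       program cellsperproc
> >       integer sNx, sNy, olx, oly, total, interior
> >       parameter ( sNx = 75, sNy = 10, olx = 4, oly = 4 )
> >       total    = (sNx + 2*olx)*(sNy + 2*oly)
> >       interior = sNx*sNy
> >       print *, 'cells per process =', total
> >       print *, 'interior cells    =', interior
> >       print *, 'useful fraction   =', 100.0*real(interior)/real(total)
> >       end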
> >
> > This is really good to know, but it got me to thinking: what is the
> > utility of these overlap cells in the first place?
> >
> > Many thanks!
> >
> > Jonny
>
>
> _______________________________________________
> MITgcm-support mailing list
> MITgcm-support at mitgcm.org
> http://mitgcm.org/mailman/listinfo/mitgcm-support
>
--
Dr Jonny Williams
School of Geographical Sciences
Cabot Institute
University of Bristol
BS8 1SS
+44 (0)117 3318352
jonny.williams at bristol.ac.uk
http://www.bristol.ac.uk/geography/people/jonny-h-williams