[Mitgcm-support] RE: threading and mpi

mitgcm-support at dev.mitgcm.org
Wed Jul 9 15:52:16 EDT 2003


hi arne,

 you can use nSx and nSy, and then set the variables nTx and nTy in eedata.
there are comments in various header files that
give details on what they mean, and some info
in the manual as well. basically, nSx and nSy define a shared-memory set
of tiles (each of size sNx by sNy), and nTx and nTy are the numbers of
threads in x and y that the nSx by nSy tiles get mapped to. a simple setup
would be nSx=2, nSy=4, nTx=2, nTy=4. that would give
a code with two tiles in x (each of size sNx) and four
tiles in y (each of size sNy), for a total
of eight threads (nTx*nTy), with each tile computed
by one thread.
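for concreteness, that setup would look something like the sketch below. nTx and nTy go in the &EEPARMS namelist in eedata; nSx and nSy (plus the tile sizes sNx, sNy) are compile-time settings in SIZE.h. this is only an illustration of the mapping, not a complete eedata file:

```fortran
# eedata fragment for the nSx=2, nSy=4 example above:
# map the 2x4 tiles per process onto 2x4 = 8 threads
 &EEPARMS
 nTx=2,
 nTy=4,
 &
```

(SIZE.h would correspondingly have nSx=2, nSy=4, with nPx=nPy=1 for a single-process run.)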

however, the NEC SX series used to have some issues
with threads, dating back to its copying of
various OS features from Cray. there is a range of
environment variables controlling how threads
get scheduled onto physical CPUs and how they "preempt"
one another that have to be set to get the right
behavior. this matters particularly for the "barrier" routine,
which used to need customizing for NEC shared-memory
parallel codes.

 in principle, everything is in place in the code to combine
nSx, nSy, nTx, nTy, nPx and nPy. so in the above
example, if you also had nPx=2 and nPy=1, you would have two
sets of eight tiles, running in two processes, with
each set of eight spread over eight threads.
in theory this allows
a multi-process, multi-threaded code to use MPI where
a shared address space does not exist. however,
how to do this is highly system dependent; the standards for
MPI, threads etc. leave this totally open, so each
vendor adds their own variant. in general my preference is to
figure out how to use only processes, i.e. a distributed address
space throughout, and then find out how to get the processes
to communicate using the same hardware
mechanisms that a shared-memory code would use.
done right, this gives exactly the same performance as the
multi-threaded approach, but in a cleaner, more
comprehensible way.
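 as a sketch of that hybrid case (assuming the standard SIZE.h parameter names; the tile sizes sNx and sNy below are illustrative placeholders), each of the two MPI processes owns nSx*nSy = 8 tiles, and eedata would still set nTx=2, nTy=4 so that each process spreads its eight tiles over eight threads:

```fortran
C SIZE.h fragment for the hybrid example: two processes (nPx=2, nPy=1),
C each holding a 2x4 set of tiles that eedata (nTx=2, nTy=4) maps
C onto eight threads. sNx and sNy here are illustrative placeholders;
C a real SIZE.h also sets the overlap widths and vertical dimension.
      PARAMETER (
     &           sNx =  32,
     &           sNy =  32,
     &           nSx =   2,
     &           nSy =   4,
     &           nPx =   2,
     &           nPy =   1 )
```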
 
 if you need serious help from me in this area i would
be happy to give it, but i can't do a thing without
access to the machine and its reference manuals.

hope this helps. very long and complicated, i know.
unfortunately, a nice idea has been made unnecessarily hard
by a lack of standards between hardware vendors.

chris

-----Original Message-----
From: Arne Biastoch [mailto:abiastoch at ifm.uni-kiel.de]
Sent: Wednesday, June 26, 2002 6:11 PM
To: support at mitgcm.org
Subject: threading and mpi


Alistair and Chris,

on the NEC SX-6 I would like to run the model in threaded mode, meaning 
within a node (of 8 processors with shared memory) I would not use MPI 
but simply a copying mechanism between tiles. What do I have to do to 
achieve that? Simply specify nPx and nPy and compile without MPI? 
Or nSx and nSy?

And: how would I mix the two kinds of communication, i.e. use threads 
within a node and MPI across nodes?

-Arne

-- 

Dr. Arne Biastoch

Institute for Marine Research         phone: ++49 431 600-4013
FB1 Ocean Circulation and Climate     fax  : ++49 431 600-1515
Duesternbrooker Weg 20                email: abiastoch at ifm.uni-kiel.de
24105 Kiel, Germany

http://www.ifm.uni-kiel.de/fb/fb1/tm/data/pers/abiastoch.html




