[MITgcm-support] Choice of workstation CPU for running MITgcm

Ali Ramadhan alir at mit.edu
Wed Apr 10 10:27:20 EDT 2019


+1 for Google Cloud. Trying things out on the cloud is a great option and
might be more cost-effective than buying a fancy new rig.

I just wanted to point out that the <$1 / hour for 96 CPUs and 360 GB RAM
is for a *preemptible virtual machine* (VM). From my understanding, it uses
spare resources on Google Cloud, so your VM can be interrupted at any time
(5-15% chance per day, apparently) and can only run for a maximum of 24
hours. See:
https://cloud.google.com/compute/docs/instances/preemptible
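
To get a rough feel for what that interruption risk means for a single run,
here is a minimal Python sketch. It assumes that the quoted 5-15% daily
chance behaves like a constant hazard rate spread evenly over the day, which
is my simplification and not something Google documents. The function name
is made up for this example.

```python
def interruption_probability(daily_chance: float, run_hours: float) -> float:
    """Chance that a run of `run_hours` gets preempted.

    Assumption (not from Google's docs): preemption is a constant-rate
    process, so the survival probability over h hours is
    (1 - daily_chance) ** (h / 24).
    """
    if not 0.0 <= daily_chance < 1.0:
        raise ValueError("daily_chance must be in [0, 1)")
    if not 0.0 <= run_hours <= 24.0:
        raise ValueError("a preemptible VM runs for at most 24 hours")
    survive_full_day = 1.0 - daily_chance
    return 1.0 - survive_full_day ** (run_hours / 24.0)


if __name__ == "__main__":
    # Low and high ends of the "5-15% chance per day" range quoted above.
    for daily in (0.05, 0.15):
        for hours in (6, 12, 24):
            p = interruption_probability(daily, hours)
            print(f"daily chance {daily:.0%}, {hours:2d} h run: "
                  f"{p:.1%} chance of interruption")
```

Under this assumption a short run is fairly safe, but anything longer than
24 hours has to be split into restartable segments anyway.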

A regular VM seems much more suitable for MITgcm runs, but then the node
with 96 CPUs and 360 GB RAM costs $4.56 / hour, or $3.19 / hour if you run
long enough to make use of the sustained use discounts
<https://cloud.google.com/compute/docs/sustained-use-discounts>, which is
still pretty good, I think. You also don't have to worry about setting up
the machine and maintaining it.
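
For comparison, here is a small Python sketch of the arithmetic behind those
hourly rates. The three rates are the ones quoted in this thread (with $1.00
used as an upper bound for the "<$1 / hour" preemptible price); the run
lengths are hypothetical, and the sketch simply applies the $3.19 rate to
every hour, which only holds if the VM runs long enough to earn the full
sustained use discount.

```python
# Hourly rates in USD for the 96-CPU / 360 GB RAM node, as quoted in the thread.
PREEMPTIBLE_RATE = 1.00  # upper bound for "<$1 / hour"
REGULAR_RATE = 4.56      # regular VM
SUSTAINED_RATE = 3.19    # regular VM with the sustained use discount


def run_cost(rate_per_hour: float, hours: float) -> float:
    """Total cost in USD of keeping the node up for `hours` hours."""
    return rate_per_hour * hours


# Effective discount implied by the two regular-VM rates.
implied_discount = 1.0 - SUSTAINED_RATE / REGULAR_RATE

if __name__ == "__main__":
    print(f"implied sustained use discount: {implied_discount:.1%}")
    # Hypothetical run lengths: one day, one week, one 30-day month.
    for label, hours in (("1 day", 24), ("1 week", 168), ("30 days", 720)):
        print(f"{label:>8}: "
              f"preemptible <= ${run_cost(PREEMPTIBLE_RATE, hours):,.2f}, "
              f"regular ${run_cost(REGULAR_RATE, hours):,.2f}, "
              f"sustained ${run_cost(SUSTAINED_RATE, hours):,.2f}")
```

The one-day and one-week rows are there only to compare the rates side by
side; whether the discounted rate applies to runs that short is not
something the sketch checks.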

Cheers,
Ali

On Wed, Apr 10, 2019 at 10:15 AM Ryan Abernathey <ryan.abernathey at gmail.com>
wrote:

> You can rent an Intel Skylake node with 96 CPUs and 360 GB RAM for < $1 /
> hour on Google Cloud:
> https://cloud.google.com/compute/pricing
> For low resolution simulations, this would be more than sufficient.
>
> You could use this to experiment before buying any hardware. Or maybe you
> would decide you don't actually need to buy at all.
>
> -Ryan
>
> On Wed, Apr 10, 2019 at 1:31 AM Matthew Mazloff <mmazloff at ucsd.edu> wrote:
>
>> Hi Christoph
>>
>> Some answers to your questions. But there are more knowledgeable people
>> out there!
>>
>> The MITgcm scales well and is routinely run on thousands of cores.
>> example:
>> https://people.nas.nasa.gov/~chenze/ECCO/SC05/ecco_sc05.pdf
>>
>> (Obviously if you try to run a small model domain on many cores it will
>> be inefficient.)
>>
>> In my experience with forward model runs memory isn’t a bottleneck.
>>
>> I am not sure what size runs you are talking about, but for runs with
>> greater than a few hundred cores I think the bottleneck is primarily with the
>> interconnects and I/O to the NFS. Hopefully people will correct me if I am
>> wrong.
>>
>> Matt
>>
>> On Apr 9, 2019, at 6:13 AM, Christoph Stappert <cstappert at gmx.de> wrote:
>>
>> Hello everyone,
>>
>> I am currently building a workstation to run some MITgcm simulations, and
>> I am wondering which of the different CPU models I am considering would be
>> best suited for the task:
>>
>> Ryzen 7 1700 (8x 3.0 GHz, dual-channel RAM): A consumer-grade CPU and
>> significantly cheaper than the others. However, while it does have ECC,
>> the ECC feature is not officially supported by AMD, so I am reluctant to
>> use this CPU in scientific computing.
>>
>> Xeon E-2146G (6x 3.5 GHz, dual-channel RAM): This is the option I am
>> leaning towards at the moment.
>>
>> Ryzen Threadripper 1950X (16x 3.4 GHz, quad-channel RAM): More CPU cores
>> than the other two options, but also more expensive. I am wondering, how big
>> would the performance gain actually be in practice?
>>
>> I have read in some messages on this list that MITgcm does not scale well
>> with an increasing number of CPU cores and that memory bandwidth is an
>> issue. However, these messages were more than 10 years old, so I am not
>> sure if this still applies to the latest generation of CPUs and to the
>> latest version of the software. I was not able to find any newer messages
>> on hardware recommendations, performance and such.
>>
>> My specific questions are:
>> - How well does MITgcm scale with an increasing number of CPU cores (4,
>> 8, 16, 32...)? At which point would I stop seeing a significant increase in
>> performance?
>> - Is there a bottleneck with memory bandwidth in today's CPUs? Does a
>> higher number of RAM channels significantly increase performance?
>> - Are L2 cache and L3 cache a major bottleneck?
>> - Does MITgcm benefit from using AVX-512 or other Intel-specific features
>> (since AMD hasn't really been a factor in scientific computing in the last
>> couple of years)?
>>
>> Of course, I could just get all the CPU models under consideration and do
>> my own benchmarks, but unfortunately, I do not currently have the budget or
>> the time for this. So I was hoping that someone here might have some
>> insights based on their knowledge of the MITgcm code or some personal
>> experience using different kinds of hardware.
>>
>> Thank you and kind regards,
>>
>> Christoph
>>
>> _______________________________________________
>> MITgcm-support mailing list
>> MITgcm-support at mitgcm.org
>> http://mailman.mitgcm.org/mailman/listinfo/mitgcm-support
>>
>