17 August 2015

621. Very briefly: comparison of different hardware for a single G09 calculation

And another update:
* I've set up an Amazon EC2/AWS instance and have done some benchmarking there too: http://verahill.blogspot.com.au/2016/01/626-briefly-gaussian-g09d-with-slurm-on.html 
Exciting stuff.

Another update:
* I turned off hyperthreading on the i7-4930K to see whether it improved performance. The geometry optimisation took 2 min (4%) longer but the overall runtime was 2 min (2%) shorter. This is probably within the reproducibility error for the measurement.
* I've also added results for an i7-5820K without hyperthreading


Update: I spotted a few mistakes
* the L5430 job ran on a dual-socket machine, so I've multiplied the passmark by two and have replotted
* the X3480 job use the EM64T version of gaussian, not AMD64. I don't have a license to test that system using AMD64.

Original post:
There are lots of potential flaws when comparing the performance of a computational package on different hardware. Thus, it can be difficult to find examples online comparing different hardware using computational chemistry packages which makes it challenging to decide on what hardware to budget for.

So here's a simple comparison of a few different types of hardware for a geovib calculation in Gaussian.

All systems have spinning (7200 rpm) disks and use debian jessie (64). The systems haven't been optimised in any way.

All systems used G09D rev 01 AMD 64 unless otherwise indicated. The amount of time the geometry optimisation took is given within [].

CPU Benchmarks:
i7-4930K 13059
i7-5820K 12999
i7-7700 10817
i7-6700 10011
FX 8350 8943
FX 8150 7626
i5-2400 5910
X3480 5732
PH2 X6 4998

Performance:
2h 15 min. [1h 14 min.] Intel i7-4930K/ 32 Gb ram/ 12 threads
2h 23 min. [1h 25 min.] Intel i7-5820K/ 32 Gb ram/ 6 threads (HT turned off, NOTE)
2h 28 min [1h 15 min] Intel i7-7700/ 16 Gb ram/ 8 threads (w/ HT)
2h 49 min [1h 37 min] Intel i7-6700/ 16Gb ram/ 6 threads (w/ HT)
3h 49 min. [2h 12 min.] AMD FX 8350/ 8 Gb ram/ 8 threads
4h 12 min. [2h 19 min.] Intel i5-2400/ 16 Gb ram/ 4 threads
4h 16 min. [2h 24 min.] AMD Phenom II X6 1055T/ 8 Gb ram/ 6 threads
4h 28 min. [2h 16 min.] dual-socket Intel Xeon L5430/ 16 Gb ram/ 8 threads-- rev A.02
4h 43 min. [2h 47 min.] AMD FX 8150/ 32 Gb ram/ 8 threads
9h 26 min. [5h 18 min.] AMD Athlon II X3 445/ 8 Gb ram/ 3 threads

I also tried the EM64T version of G09D rev 01 and got:
1h 43 min. [57 min.] i7-4930K/ 32 Gb ram/ 12 threads
1h 59 min  [1h    7 min] i7-6700/ 16 Gb ram/6 threads (w/ HT)
2h 12 min  [1h    7 min] i7-7700/ 16 Gb ram/8 threads (w/ HT)
3h 03 min. [1h 44 min.] i5-2400/16 Gb ram/ 4 threads
4h 21 min. [2h 27 min.] Xeon X3480/ 8 Gb ram/ 8 threads -- rev B.01

Just by switching from the AMD64 to the EM64T version we thus cut the calculation down to 75% of the time for the i7.

 I also turned off hyperthreading for the i7-4930K and ran with six threads using the EM64T version:
 1h 41 min. [59 min.] i7-4930K/ 32 Gb ram/ 6 threads
 1h 35 min. [58 min.] i7-5820K/ 32 Gb ram/ 6 threads (NOTE)
 2h   9 min. [1h 11 min.] i7-7700/ 16Gb ram/ 4 threads


Here's a plot of the run times vs Passmark benchmarks (I've multiplied the Xeon L5430 passmark by 2):

Note that the Xeon L5430 and Xeon X340 store the job files on an NFS (scratch files are local) and are not using G09 rev D.01.

It'd be tempting to draw a line through the Athlon II to the I7-4930K, in which case the Phenom II X6 and the I5-2400 perform much, much better than they should based on the CPU Passmark alone.

Either way, these are my observations. No interpretations or opinion attached.

So what's  the benchmarking job that I used? I actually prefer not to reveal it, as it'd eventually point towards my identity (and you're not supposed to publish gaussian benchmarks...)

Suffice to say that it uses:
rpbe/def2-svp and 459 functions/759 primitives (46 atoms) with opt=(verytight) and integral(ultrafine)




No comments:

Post a Comment