02 July 2013

470. Very briefly: compiling nwchem 6.3 with ifort and mkl

This used to be part of http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html, but I think it makes more sense making it a separate post.

I did this on debian wheezy.

1. Installing mkl and the compiler
MKL: http://verahill.blogspot.com.au/2013/06/465-intel-mkl-math-kernel-library-on.html
Intel compiler collection: http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html

I will henceforth presume that you have put the files in the same location as shown in those posts, and that you have created /etc/ld.so.conf.d/intel.conf as shown in the second post.

2 Compiling nwchem 6.3
sudo apt-get install build-essential libopenmpi-dev openmpi-bin
sudo mkdir /opt/nwchem -p
sudo chown $USER:$USER /opt/nwchem
cd /opt/nwchem
wget http://www.nwchem-sw.org/download.php?f=Nwchem-6.3.revision1-src.2013-05-28.tar.gz -O Nwchem-6.3.revision1-src.2013-05-28.tar.gz
tar xvf Nwchem-6.3.revision1-src.2013-05-28.tar.gz
mv nwchem-6.3-src.2013-05-28 nwchem-6.3-src.2013-05-28.ifort

export NWCHEM_TOP=`pwd`
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr

export BLASOPT="-L/opt/intel/composer_xe_2013.4.183/mkl/lib/intel64/ -lmkl_core -lmkl_sequential -lmkl_intel_ilp64"
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/intel/composer_xe_2013.4.183/mkl/lib/intel64/"

export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include


export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export ARMCI_NETWORK=SOCKETS

cd $NWCHEM_TOP/src

make clean
make nwchem_config
make FC=ifort 1> make.log 2>make.err

cd $NWCHEM_TOP/contrib
export FC=ifort
./getmem.nwchem

And it works quite fine. See e.g. here if you want to patch to allow to compile with python, and to support gabedit.

3. Performance
This is always a bit contentious, and I want to be upfront with the fact that I haven't spent much time considering whether my test example is a good one. I simply did a geo-opt + vibrational analysis as shown in this post: http://verahill.blogspot.com.au/2013/05/430-briefly-crude-comparison-of.html

The jobs were run using all cores available on that node.
gnu= gfortran + acml 5.3.1 for the Phenom and FX8150, and openblas for the i5-2400 and the Athlon 3800+..
ifort= ifort + mkl for all architectures.

The times are in seconds and are CPU times, not wall times.

Arch|                   cores  gnu      ifort Instruction sets
-------------------------------------------------------------------------
AMD Athlon 64 X2 3800+:   2    10828*    12516  sse, sse2, sse3
AMD Phenom II X6 1055T:   6    2044       2048   sse, sse2, sse3
AMD FX8150            :   8    1611       1507   sse, sse2, sse3, AVX, FMA4
Intel i5-12400        :   4    1652*      1498   sse, sse2, sse3, sse4,AVX

In the last case I also compiled using gfortran but with mkl and got 1550s.

It's a fairly small sample set, but it does seem that there's a little bit of an advantage with mkl+ifort over gfortran+acml on the newest AMD core. One would need much more data though.

A clear downside of using mkl and ifort is the fact that they are not freely available though -- i.e. you can register and download them for free for non-commercial use, but there's no guarantee that your colleague, next-door-neighbour or distant-cousin will be able to use it.

No comments:

Post a Comment