09 May 2013

410. Compiling LAMMPS on Debian (with GPU support)

MM/MD scares me a lot -- it requires experience, expertise and intuition to set up an MD simulation properly, especially if you need to parametrise a new system. DFT can of course also yield wildly inaccurate results as a result of using the wrong method/functional/basis set, or by simply asking the 'wrong' question, but I find it easier to understand and to implement based on previous literature (i.e. if I read the computational details in a paper I often know how to repeat the experiments; with MD I often don't).

Anyway, a friend who is an expert in the field is using LAMMPS, and learning by imitation is better than not learning at all, so I've decided to invest a little bit of time familiarizing myself with this software.

The reasons he cited are that it's more barebones, that it's written in C++ (an advantage for some, a disadvantage for others), and that it's very modular and therefore easy to extend (he's a theoretical chemist rather than a computational one). Finally, it has GPU support. I'm not really qualified to comment one way or the other.


Compilation

Voro++
First compile Voro++, which is used for Voronoi tessellation.
mkdir ~/tmp
cd ~/tmp
wget http://math.lbl.gov/voro++/download/dir/voro++-0.4.5.tar.gz
tar xvf voro++-0.4.5.tar.gz
cd voro++-0.4.5/
make
sudo make install

Note that it uses optimisation level 3 (-O3), which makes me nervous in general -- edit the makefile to change it to -O2 if you prefer that.
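If you want to do that in one go, something like this works (a rough sketch -- it assumes the compiler flags are set via a CFLAGS line in the top-level config.mk; check where the -O3 flag actually lives in your copy before running the sed):
sed -i 's/-O3/-O2/' config.mk
make clean
make
sudo make install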

OpenKIM API
Next compile the OpenKIM API. Note that you can't run make in parallel (i.e. don't use -j).

sudo mkdir /opt/kimdir
sudo chown $USER:$USER /opt/kimdir
cd /opt/kimdir
wget http://s3.openkim.org/openkim-api-v1.1.1.tgz
tar xvf openkim-api-v1.1.1.tgz
cd openkim-api-v1.1.1/
export KIM_DIR=`pwd`
echo "export KIM_DIR=`pwd`" >> ~/.bashrc
source ~/.bashrc
make examples
make
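A quick sanity check before moving on -- KIM_DIR should now point at the openkim-api directory, both in the current shell and in new ones:
echo $KIM_DIR
grep KIM_DIR ~/.bashrc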

LAMMPS
Grab the LAMMPS source code. You can get it directly from Sandia National Laboratories, or via SourceForge.

sudo apt-get install openmpi-bin libopenmpi-dev fftw3-dev build-essential gfortran
mkdir ~/tmp
cd ~/tmp
wget http://aarnet.dl.sourceforge.net/project/lammps/lammps-2Feb13.tar.gz
tar xvf lammps-2Feb13.tar.gz
cd lammps-2Feb13/src/
cp MAKE/Makefile.openmpi MAKE/Makefile.verahill
Edit MAKE/Makefile.verahill
 53 FFT_PATH =
 54 FFT_LIB =       -lfftw3
 55 
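If your copy of Makefile.openmpi still has the old fftw2 defaults (-DFFT_FFTW / -lfftw), the whole FFT block should end up looking roughly like this -- the -DFFT_FFTW3 define is what selects the fftw3 interface (line numbers may differ in your copy):
 52 FFT_INC =      -DFFT_FFTW3
 53 FFT_PATH =
 54 FFT_LIB =      -lfftw3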

make verahill

   text    data     bss     dec     hex filename
6111696   11448   17024 6140168  5db108 ../lmp_verahill
make[1]: Leaving directory `/opt/lammps/lammps-2Feb13/src/Obj_verahill'
This compiles a binary in src/ called lmp_verahill. Note that by default only a few packages are enabled:
make package-status
Installed NO: package ASPHERE
Installed NO: package BODY
Installed NO: package CLASS2
Installed NO: package COLLOID
Installed NO: package DIPOLE
Installed NO: package FLD
Installed NO: package GPU
Installed NO: package GRANULAR
Installed NO: package KIM
Installed YES: package KSPACE
Installed YES: package MANYBODY
Installed NO: package MC
Installed NO: package MEAM
Installed YES: package MOLECULE
Installed NO: package OPT
Installed NO: package PERI
Installed NO: package POEMS
Installed NO: package REAX
Installed NO: package REPLICA
Installed NO: package RIGID
Installed NO: package SHOCK
Installed NO: package SRD
Installed NO: package VORONOI
Installed NO: package XTC
Installed NO: package USER-MISC
Installed NO: package USER-ATC
Installed NO: package USER-AWPMD
Installed NO: package USER-CG-CMM
Installed NO: package USER-COLVARS
Installed NO: package USER-CUDA
Installed NO: package USER-EFF
Installed NO: package USER-OMP
Installed NO: package USER-MOLFILE
Installed NO: package USER-REAXC
Installed NO: package USER-SPH
Additional packages
To enable additional packages after running make verahill, do e.g.
make yes-body yes-dipole
Installing package body
Installing package dipole
Again, note that you'll need the proper dependencies installed (e.g. KIM and Voro++ -- and for KIM, make sure that you've got KIM_DIR set in your ~/.bashrc as shown above). Next, compile all the libs you need. The easiest approach is to do the following (assuming you're in the src directory):

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/openmpi/include/
cd ../lib/reax
make -f Makefile.gfortran

Important: Edit Makefile.lammps:
3 reax_SYSINC =
4 reax_SYSLIB = -lgfortran #-lifcore -lsvml -lompstub -limf
5 reax_SYSPATH = #-L/opt/intel/fce/10.0.023/lib
cd ../poems
make -f Makefile.g++
cd ../meam
make -f Makefile.gfortran
cd ../linalg
make -f Makefile.gfortran
cd ../colvars
make -f Makefile.g++
cd ../../src/
make yes-asphere yes-body yes-class2 yes-colloid yes-dipole yes-fld yes-granular yes-kim yes-mc yes-meam yes-opt yes-peri yes-poems yes-reax yes-replica yes-rigid yes-shock yes-voronoi yes-xtc

Finish by running
make clean-all
make verahill

to properly set things up.
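You can confirm which packages ended up enabled with the same status target as before:
make package-status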

GPU/CUDA
Make sure you've installed the CUDA toolkit -- on Debian it's the nvidia-cuda-toolkit package.

I'll only show the GPU package here -- there's also USER-CUDA. Read up on the difference on your own.

 Edit lib/gpu/Makefile.linux to set the correct sm value, which depends on the GPU compute capability version (you can look this up at e.g. https://developer.nvidia.com/cuda-gpus and http://www.geeks3d.com/20100606/gpu-computing-nvidia-cuda-compute-capability-comparative-table/ ).
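If you're not sure what card you actually have, check before consulting the tables above (you only need the card name; the compute capability then comes from the lookup):
lspci | grep -i vga
nvidia-smi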

For GPU compute capability 3.0 you set CUDA_ARCH to sm_30. If your card supports double precision, you can use -D_DOUBLE_DOUBLE for CUDA_PRECISION.
 6 CUDA_HOME = /usr
 7 NVCC = nvcc
 8
 9 # Tesla CUDA
10 #CUDA_ARCH = -arch=sm_21
11 # newer CUDA
12 CUDA_ARCH = -arch=sm_30
13 # older CUDA
14 #CUDA_ARCH = -arch=sm_10 -DCUDA_PRE_THREE
15
16 CUDA_PRECISION = -D_SINGLE_SINGLE
17 CUDA_INCLUDE = -I$(CUDA_HOME)/include
18 CUDA_LIB = -L$(CUDA_HOME)/lib
and edit Makefile.lammps:
3 gpu_SYSINC =
4 gpu_SYSLIB = -lcudart -lcuda
5 gpu_SYSPATH = #-L/usr/local/cuda/lib64
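If the link step later complains that it can't find -lcuda or -lcudart, check where the Debian packages actually put the libraries and adjust gpu_SYSPATH accordingly:
ldconfig -p | grep -E 'libcuda|libcudart'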
then do
make -f Makefile.linux
cd ../../src
make yes-gpu
make verahill
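Once it's built, note that the GPU won't be used unless you ask for it at run time. The simplest way is the -sf gpu suffix switch; here's a rough sketch using one of the bundled examples (the melt example uses lj/cut, which has a GPU variant -- see the LAMMPS documentation for the package gpu command if you need finer control over devices and precision):
cd ../examples/melt
mpirun -n 2 ../../src/lmp_verahill -sf gpu < in.melt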

More on GPU compute capability versions
If you use an sm_XX value which is too high, e.g. sm_30 with a GeForce 210 (compute capability 1.2), you get:
LAMMPS (2 Feb 2013)
ERROR: GPU library not compiled for this accelerator (gpu_extra.h:40)
Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 116.
*** The MPI_Abort() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
If you use sm_12 with the GeForce 210, you get to this point:
- Using GPGPU acceleration for pppm:
-  with 1 proc(s) per device.
--------------------------------------------------------------------------
GPU 0: GeForce 210, 16 cores, 0.98/1 GB, 1.4 GHZ (Single Precision)
--------------------------------------------------------------------------
Initializing GPU and compiling on process 0...Done.
Initializing GPUs 0-1 on core 0...Done.
ERROR: Double precision is not supported on this accelerator (gpu_extra.h:42)
There's more about this in lib/gpu/README:
124 NOTE: Double precision is only supported on certain GPUs (with
125       compute capability>=1.3). If you compile the GPU library for
126       a GPU with compute capability 1.1 and 1.2, then only single
127       precision FFTs are supported, i.e. LAMMPS has to be compiled
128       with -DFFT_SINGLE. For details on configuring FFT support in
129       LAMMPS, see http://lammps.sandia.gov/doc/Section_start.html#2_2_4
To do that, edit (in this case) src/MAKE/Makefile.verahill, set -DFFT_SINGLE, and make sure to link to a single-precision library (I built one as part of gromacs 4.5.5 -- see e.g. http://verahill.blogspot.com.au/2012/03/building-gromacs-with-fftw3-and-openmpi.html; there's also a rough sketch of a standalone build below):
52 FFT_INC = -DFFT_FFTW3 -DFFT_SINGLE
53 FFT_PATH =
54 FFT_LIB = /opt/fftw/fftw-3.3.2/single/lib/libfftw3f.a
Then recompile everything (make clean-all && make verahill).
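If you'd rather build a standalone single-precision fftw 3.3.2 just for this, something along these lines should do it (a sketch -- the /opt/fftw/fftw-3.3.2/single prefix is simply what I use so that it matches the FFT_LIB path above):
mkdir ~/tmp
cd ~/tmp
wget http://www.fftw.org/fftw-3.3.2.tar.gz
tar xvf fftw-3.3.2.tar.gz
cd fftw-3.3.2/
./configure --enable-single --prefix=/opt/fftw/fftw-3.3.2/single
make
sudo make install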

Note that I am in no way implying that a GeForce 210 is a suitable test card -- if you are serious about GPU calculations then there are serious cards out there, for serious money. I'm currently designing my next compute node, and while I probably won't go the GPU route anytime soon, I'm thinking about getting a motherboard with multiple PCI-E slots so that it can take several cards. But I really don't have much experience here.


Testing
You can test the binary by e.g. changing to the examples/indent directory:
cd ../examples/indent
mpirun -n 2 ../../src/lmp_verahill < in.indent

Installation
You can move the compiled binary (or the whole tree) to e.g. /opt/lammps and add it to PATH for easier execution. In my particular case I did:
sudo mkdir /opt/lammps
sudo chown $USER /opt/lammps
mv ~/tmp/lammps-2Feb13 /opt/lammps
ln -s /opt/lammps/lammps-2Feb13/src/lmp_verahill /opt/lammps/lammps
echo 'export PATH=$PATH:/opt/lammps' >> ~/.bashrc
source ~/.bashrc
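A quick check that the new PATH entry works:
which lammps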

1 comment:

  1. Dear Verahill/Lindqvist,

    I would like to thank you for this blog post, it is now 2017 and people are sporting GTX 1080 Ti's, but your post has remained timelessly relevant. I was stumped at the error:

    LAMMPS (2 Feb 2013)
    ERROR: GPU library not compiled for this accelerator (gpu_extra.h:40)
    Cuda driver error 4 in call at file 'geryon/nvd_device.h' in line 116.
    *** The MPI_Abort() function was called after MPI_FINALIZE was invoked.
    *** This is disallowed by the MPI standard.

    But your post helped me realise I needed to change the CUDA_ARCH argument in the makefile to sm_61 (because the latest GPUs have compute capability 6.1).

    My Sincere Thanks!
    Josh
