Showing posts with label dalton. Show all posts
Showing posts with label dalton. Show all posts

11 June 2013

446. B3LYP and WAH -- the confusion

Quite a while back I was looking at the WAH (Wilson-Amos-Handy) functional, and while it turned out to be a bit more complicated than I had hoped, it led me to type up a brief discussion about b3lyp in different computational packages.

The issue is that there are several different definitions, and that even if a paper is kind enough to provide the Becke 1993 communication as a reference, this is rarely the actual form of b3lyp used. In fact, the LYP part would speak directly against it. As someone whom isn't well-versed in the computational and theoretical arts I do think it would be nice if we could get to the point where we can get useful information by doing point-and-click computations, but as exchange-correlation functionals essentially are fudge-factors, this probably won't happen for some time.  In other words, 6-31G/B3LYP may be a winning combination for the computation of electronic energies of a limited range of (mostly organic) small molecules in the gas phase, it doesn't always yield anything useful about real-world systems.

NOTE: I wrote the original text as I was trying to figure out what WAH was -- and I thought at that point that it was a simple form of B3PW91 with tweaked prefactors. It isn't -- it's requires changes in the way the GIAOs are computed (I think). So don't focus on the WAH discussion.

Anyway, here's the story as a bench chemist (i.e. not a computational or theoretical chemist) understands it, in the context of trying to understand what the WAH exchange-correlation functional looks like.

The WAH functional is a hybrid exchange correlation functional which was developed to provide accurate NMR shift calculations.

Note that the WAH functional is correctly implemented in PQS -- see the manual.

But the story is really about B3LYP...


1. Definition of the WAH exchange-correlation functional:
The definition[1] consists of
We found that by using hybrid Kohn-Sham orbitals and eigenvalues with an adjusted 'exact-exchange' coefficient Cx, the NMR shielding parameters gave an accuracy approaching the best coupled-cluster calculations for molecules containing first and second row atoms. For B3LYP[2] we find Cx=0.05[..] give(s) best values (note that the coefficient of Local Density Exchange is (1-Cx) in the amended B3LYP. We name the resulting NMR values B3LYPGGA0.05[..].
This is the paper which is cited by the PQS manual. The definition is brief, but not unreasonable.

2 Definition of B3LYP
[The first-person account of the background to B3LYP is found here: http://www.ccl.net/chemistry/resources/messages/2002/05/22.008-dir/]

In the paper which is cited as a source of B3LYP, Becke defined[2] a functional as
EXC=EXCLSDA+a0 (ExHF-ExLSDA )+axΔ ExBecke88+acΔ EcPW91   (eq 1)
where EXC denotes exchange-correlation functional, Ex denotes exchange functional, Ec denotes correlation functional, Δ denotes non-local contribution, LSDA is the local spin-density approximation, Becke88[3] is the gradient-corrected LDA and PW91 is the Perdew-Wang 1991 gradient correction.[4]. The B3 in B3LYP refers to the three parameters it involves: a0=0.2, ax=0.72 and ac=0.81. Note that a0 is the same as Cx above. We'll refer to equation 1 as B3PW91.

LSDA is poorly defined but is normally taken to be the SVWN of the form
EXCLSDA=EXLSDA+ECLSDA  (eq 2)
=ExSlater+EcVWN   (eq 3)
although there's a slew of Vosko-Wilk-Nusair (VWN) functionals -- most sources suggest that Becke referred to VWN5, while my reading of the literature is a bit different (Becke states he uses the electron-gas parametrization in [4]). Either way, equation 1 now becomes
EXC=a0 ExHF+(1-a0 )ExSlater+EcVWN +axΔ ExBecke88+acΔ EcPW91 (eq 4)
This is implemented as the hybrid exchange-correlation functional acm in NWChem (the adibatic connection method) using VWN5. A very brief summary of Becke '93 vs Gaussian '92 (don't ask me about the chronology) is also available by Stephens et al. in J. Phys. Chem. 1994, 98(45), p. 11624.

2.1 Gaussian '92
You may at this point be forgiven for asking yourself why it is called B3LYP and not B3PW91. In 1991 Gaussian hadn't yet implemented PW91 (fair enough) and substituted it with the Lee-Yang-Parr (LYP) correlation functional (ΔEcLYP). Since it's difficult to separate the local component (and we want the non-local component as indicated by Δ), they wrote
Δ EcLYP=EcLYP-EcVWN (eq 5)
which turns equation 4 into
EXC=a0 ExHF+(1-a0)ExSlater+axΔ ExBecke88+acEcLYP +(1-ac)EcVWN (eq 6)
In addition, in the original Gaussian implementation VWN_1_RPA was used, which sources tell me is 100\% wrong when taking Becke's intentions into account.
EXC=a0 ExHF+(1-a0)ExSlater +axΔ ExBecke88+acEcLYP +(1-ac)EcVWN_1_RPA (eq 7)
To make matters worse, today Gaussian uses VWN_3 and it seems they know how to get the nonlocal component of LYP directly (see below). Equation 7 is what you use if you use B3LYP in most software packages (though not all -- e.g. Gamess US uses VWN5 instead of VWN_1_RPA) So in G09 it's now
EXC=a0 ExHF+(1-a0)ExSlater+axΔ ExBecke88+EcVWN_3+acEcLYP (eq 8)

2.2 PQS
PQS uses the old gaussian version (eq. 7) as the b3lyp functional, but it doesn't explicitly state which form -- b3lyp or b3pw91 -- is used for the WAH functional.


3 So what definition did WAH use?
All we really care about is reproducing the original paper by Handy et al. -- not whether Becke would approve or not. But here's where the problem of citing papers you may not have read becomes an issue.

Wilson, Amos and Handy cite the 1993 paper by Becke which defines the canonical version of B3LYP (i.e. B3PW91), which should settle it in favour of WAH being defined as shown in equation 4.

However, they used CADPACK, which implements it as in equation 7. Reading the CADPACK manual I can see what Handy et al. probably did: the way you define custom parameters for hybrid functionals in CADPACK is by doing
hybrid a0 ax ac
so that they during the development of their functional most likely typed in
b3lyp 0.05 0.72 0.81
which meant they probably used
EXC=0.05 ExHF+0.95 ExSlater+ +0.72 Δ ExBecke88+0.81 EcLYP +0.19 EcVWN_1_RPA (eq 9)

4 Implementing it in your package of choice
[NOTE: this will NOT set up WAH correctly -- I'm leaving it as it shows how to set up custom XCs in nwchem, G09 and Dalton]

4.1 NWCHEM
The canonical version of Becke's functional, B3PW91, is implemented as acm in NWChem and which is manually entered as
xc HFexch 0.2 slater 0.8 becke88 nonlocal 0.72 vwn_5 1 Perdew91 0.81
while the Gaussian '92 form is manually entered as
xc HFexch 0.2 slater 0.8 becke88 nonlocal 0.72 vwn_1_rpa 0.19 lyp 0.81
This means that the two possible forms of WAH are:
xc HFexch 0.05 slater 0.95 becke88 nonlocal 0.72 vwn_5 1 Perdew91 0.81
and
 xc HFexch 0.05 slater 0.95 becke88 nonlocal 0.72 vwn_1_rpa 0.19 lyp 0.81
We'll refer to them as B3PW910.05 and B3LYP0.05, respectively.

4.2 Gaussian 09
Gaussian is a lot less elegant. The canonical version of Becke's functional, B3PW91, is implemented as acm in Gaussian as B3PW91, which is manually entered as
BPW91 IOp(3/76=1000002000) IOp(3/77=0720008000) IOp(3/78=0810010000)
while the old Gaussian form is manually entered as
BLYP IOp(3/76=1000002000) IOp(3/77=0720008000) IOp(3/78=0810001900
This means that the two forms of WAH are:
BPW91 IOp(3/76=1000000500) IOp(3/77=0720009500) IOp(3/78=0810010000)
and
BLYP IOp(3/76=1000000500) IOp(3/77=0720009500) IOp(3/78=0810001900)
We'll refer to them as B3PW910.05 and B3LYP0.05, respectively.

4.3 Dalton
The notations are (more or less)
Combine HF=0.20 Slater=0.80 Becke=0.72 PW91c=0.81 VWN5=1
Combine HF=0.20 Slater=0.80 Becke=0.72 LYP=0.81 VWN=0.19
Combine HF=0.05 Slater=0.95 Becke=0.72 PW91c=0.81 VWN5=1
Combine HF=0.05 Slater=0.95 Becke=0.72 LYP=0.81 VWN=0.19
As an aside, I don't think anyone at this point would be surprised to learn that B3PW91 in Dalton and Gaussian are two completely different exchange-correlation functionals...

Performance
NOTE: the results don't quite make any sense to me anymore -- using the same XCs and basis sets and structures one would expect to get the exact same results for all three packages. I don't know why that wasn't the case, assuming that I implemented the XCs correctly, and assuming that the basis sets really are the same (which often they actually aren't...). So keep that in mind.

Original:
I used the same equilibrium structures as Handy (i.e. those of Cybulski), and used the aug-cc-pVTZ basis set with a fine DFT grid (not the same as Handy -- but should be as good. Also, I've done this with def2-qzvp and 6-31+g* as well). All calculations are for the gas phase. Handy's values are for the Huzinaga IV basis set in their GIAO paper[5], but it doesn't really matter much which basis set is chosen. The results are tabulated in table 1.

Note that Handy also saw chemical shifts for the oxygen in carbon monoxide which were around -80 ppm. Basically, I can reproduce everything except for his B3LYP0.05gga. I did a few test-runs with Huz-IV in Dalton and it was just as bad as the other methods.

Table 1: Tabulated calculated gas phase NMR shifts using different combinations of exchange-correlation functionals and the aug-cc-pvtz basis set. Note that Handy's experimental data are not in agreement with my sources, which are 1.0, -42.3 and 344 ppm for CO, CO and H2O, respectively.
Package b3lypa gb3lypb acmc b3pw91b b3lyp0.05 b3pw910.05 Handy
Expt.
Carbon in CO (ppm)
NWChem -10.32 -10.32 -8.80 -8.80 -8.08 -6.55 5.6 2.8
G09 -7.71 -7.71 -9.95 -9.95 -5.66 -7.74
Dalton -10.21 -10.21 N/A -12.65 -7.97 -10.38
Oxygen in CO (ppm)
NWChem -71.51 -71.51 -72.34 -72.34 -70.06 -70.93 -40 -36.7
G09 -78.82 -78.82 -77.89 -77.89 -77.57 -76.86
Dalton -71.50 -71.50 N/A -72.03 -70.04 -70.89
Oxygen in water
NWChem 328.45 328.45 329.28 329.28 329.79 330.48 327 358
G09 N/A 328.76 328.39 328.39 328.95 328.31
Dalton 328.50 328.50 N/A 328.02 329.85 329.07
aNote that G09 uses eq. 8 by default (gives -11.70, -77.63, and 328.34 ppm), but I forced it to use eq. 7. b These were my manual implementations of 'b3lyp' and 'acm' to make sure I got things right. c This is acm in NWChem and B3PW91 in Gaussian 09.


References:
[1] P. J. Wilson, R. D. Amos, H. N. C., Chem. Phys. Letters 1999, 321, 475-484.
[2] A. D. Becke, J. Chem. Phys. 1993, 98, 5648-5652.
[3] A. D. Becke, Phys. Rev. A 1988, 38, 3098-3100.
[4] J. P. Perdew, Y. Wang, Phys. Rev. B 1991, 45, 13244-13249.
[5] T. Helgaker, P. J. Wilson, R. D. Amos, N. C. Handy, J. Chem. Phys. 2000, 113, 2983-2989.

06 November 2012

276. Compiling Dalton 2011 on Debian Testing/Wheezy

UPDATE: To deal with basis sets and 'GPOPEN' errors, see bottom of this post
UPDATE2: Because of the basis set issue the code doesn't run in parallel!
UPDATE 3: All issues are solved by -O0 or -O1. The code now works in parallel and you can define basis sets the usual way. Performance-wise? No idea. So you can compile with -O3 or -O2 but the code doesn't read basis sets the intended way, or you use -O1 or -O0 and it works.

THIS WORKS NOW  :)

Original post:
I've been wanting to use dalton for a long time, but it's been difficult to compile dalton 2.0, and I didn't realise until a few days ago that there's a newer version.
See here for a description of how to compile on ROCKS 5.4.3 (i.e. Centos 5.6) which uses gfortran v 4.1. The main difference between compiling on CentOS 5.6 and Debian Wheezy is in how you edit the Makefile.config. More specifically, compile works a whole lot better with -fno-whole-file and -march=native..

Other than that the steps are the same.

In terms of running, there's an issue with the discoverability of the basis sets which I don't really understand. There's a solution to that at the end of the file.

Before you get started you may want to compile ATLAS as shown here: http://verahill.blogspot.com.au/2012/05/compile-atlas-blas-on-debian-testing.html

Alternatively, you should get the ACML libraries.

NOTE: The compile went without a hitch on my AMD II X3, AMD Phenom II X6 and Intel i5-2400.
My AMF FX 8150 is a trickier story though: it failed to compile with acml libs (gfortran64_fma4_int64) for -O3 and -O2, but compiled with -O1 and -O0. The -O1 binary segfaults though. Never tried the -O0 binary.

WARNING: If you run dalton in parallel it will -- for some reason -- delete your scratch folder when the run is over. The scratch directory is defined in the  /opt/dalton/bin/dalton script (TMPDIR)

License:
First go to http://daltonprogram.org/licence/ and fill out the license agreement. Once that's done you'll get an automated email with a license form, which you should print, sign, scan and email to the email address you're given. Once your form has been processed you'll be sent another email with a user name and password. I received my user name and password the next business day.

Go online and download the source file, Dalton2011_release_v0.tgz, and put it in ~/tmp. Sort out where you want your program to end up
sudo mkdir /share/apps/dalton
sudo chown $USER /share/apps/dalton
mkdir /share/apps/dalton/bin /share/apps/dalton/basis /share/apps/dalton/lsdalton

Next,
cd ~/tmp
tar xvf Dalton2011_release_v0.tgz
cd Dalton2011_release/DALTON
./configure 

and answer all the questions:
./configure

------------------------------------------------------------------
   Configuring the DALTON Makefile.config and "dalton" run script
------------------------------------------------------------------

INFO: Operating system from 'uname -s' : Linux
INFO: Processor type   from 'uname -m' : x86_64
No architecture specified, attempting auto-configuration:
This appears to be a -linux architecture. Is this correct? [Y/n] 
--> Installing DALTON on a -linux computer


Note that 64-bit integers are desirable for Cholesky and very large
scale CI, otherwise the most important effect is that some files will be bigger.

If you choose 64-bit integers, be careful that any system library
routines (incl. MPI) also use 64-bit integers!

Do you want 64-bit integers? [y/N] Do you want to install the program in a parallel MPI version? [Y/n] 
-->WARNING: Makefiles for MPI architecture are difficult to guess
   Please compare the generated Makefile.config with local documentation.

   Checking for Fortran compiler ...
   from this list: mpif90 mpiifort ifort pgf95 pgf90 gfortran g95 

Compiler /usr/bin/mpif90 found, use this compiler? [Y/n] 
-->Compiler mpif90 found and accepted.
Is backend compiler gfortran ? [Y/n] 
   Checking for C compiler ...
   from this list: mpicc  mpiicc   icc ecc pgcc gcc 

Compiler /usr/bin/mpicc found, use this compiler? [Y/n] 
-->Compiler mpicc found and accepted.

Testing existence of libraries in this order:
 libacml.a libmkl.so libmkl_p3.a libatlas.a libblas.a
Directory search list for libraries:
  /opt/ATLAS/lib /home/me/tmp/ATLAS/build/lib /lib /usr/local/lib /usr/lib /usr/local/lib/ATLAS /lib64 /usr/lib64 /usr/local/lib64 

Do you want to replace this with your own directory search list? [y/N] Found /opt/ATLAS/lib/libatlas.a, use it? [Y/n] 
-->The following mathematical library(ies) will be used:
   -L/opt/ATLAS/lib -llapack -llapack -lf77blas -latlas


DALTON uses almost 100 Megabytes of static
allocations, in addition to the dynamic allocation.

DALTON has the possibility to reserve an amount of static memory
for storing two-electron integrals in direct and parallel calculations
Storing some or all of the 2-el. integrals in memory will speed up
direct and parallel calculations (and in particular the latter).
NOTE: This will increase the static memory allocation used by DALTON

Would you like to activate the possibility of storing 2-el.int. in memory? [y/N] How many MB to use for storing 2-el. integrals? 
-->Program will be installed with 300 MB (39000000 words) used for storing 2-el. integrals

Maximum amount of work memory for dynamic allocations can be changed
at run time with the environment variable WRKMEM (in REAL*8 words = megabytes/8)
or by using the -M option to the run script: "dalton -M mb ..." (in megabytes).
We recommend at least 200 MB work memory,
larger for correlated calculations, but it should for maximum
efficiency NOT exceed available physical memory per CPU in parallel calculations.

How many MB to use as default for work memory (hit return for default of 1000 MB)? 
-->Program will be installed with a default work memory of 3900 MB (511000000 words)

-->Current directory is /home/me/tmp/Dalton2011_release/DALTON

Use default ../bin as installation directory for DALTON binaries and scripts? [Y/n] Please enter another installation directory: 
-->DALTON executable and script will be placed in /opt/dalton/bin directory


-->Default basis set directory will be /home/me/tmp/Dalton2011_release/DALTON/../basis/

Use this directory as default basis set directory? [Y/n] 
Please choose another default basis set directory (must end with /) 
-->Default basis set directory will be /opt/dalton/basis/


-->Job specific directories under $SCRATCH/$USER
-->will be used for temporary files when running DALTON

Use SCRATCH=/work as default root scratch space in "dalton" run script? [Y/n] Please enter default root scratch directory: 
-->Creating Makefile.config ...
gfortran version 471 prc=x86_64
INFO: Compiling with 32-bit integers.
INFO: Make sure pre-compiled BLAS, MPI etc. libraries are also with 32-bit integers!!!

Proper 64-bit file access detected.

-->Creating the DALTON run-script in /opt/dalton/bin

   The configuration of DALTON has finished succesfully.
   Check compiler flags etc. in Makefile.config and run "make" to get executable.

which generates Makefile.config. Edit it and

  • change the -march to native. 
  • add -fno-whole-file to avoid internal compiler errors
  • change optimisation level to -O1 (O0 is ok, O2 and O3 give GPOPEN problems)

Like this:

ARCH        = linux
#
#
CPPFLAGS      = -DVAR_GFORTRAN -DSYS_LINUX -DVAR_MFDS -D'INSTALL_WRKMEM=131000000' -D'INSTALL_MMWORK=65000000' -D_FILE_OFFSET_BITS=64 -DVAR_MPI -DGFORTRAN=471 -DIMPLICIT_NONE
F90           = mpif90
CC            = mpicc
LOADER        = mpif90
RM            = rm -f
FFLAGS        = -march=native -O1 -ffast-math -funroll-loops -ftree-vectorize -fbacktrace -fno-whole-file
SAFEFFLAGS    = -march=native -O1 -ffast-math -funroll-loops -ftree-vectorize -fbacktrace -fno-whole-file
CFLAGS        = -march=native -O1 -ffast-math -funroll-loops -ftree-vectorize -std=c99 -DRESTRICT=restrict -DFUNDERSCORE=1
INCLUDES      = -I../include 
MODULES       = -J../modules
LIBS          = -L/opt/ATLAS/lib -llapack -llapack -lf77blas -latlas 
INSTALLDIR    = /opt/dalton/bin
PDPACK_EXTRAS = linpack.o eispack.o gp_zlapack.o gp_dlapack.o
GP_EXTRAS     = 
AR            = ar
ARFLAGS       = rvs
# flags for ftnchek on Dalton /hjaaj
CHEKFLAGS  = -nopure -nopretty -nocommon -nousage -noarray -notruncation -quiet  -noargumants -arguments=number  -usage=var-unitialized
# -usage=var-unitialized:arg-const-modified:arg-alias
# -usage=var-unitialized:var-set-unused:arg-unused:arg-const-modified:arg-alias
#
default : dalton linuxparallel.x
SAFE_FFLAGS_for_ifort = $(FFLAGS)
#
# Parallel initialization
#
MPI_INCLUDE_DIR = 
MPI_LIB_PATH    = -L/usr/lib
MPI_LIB         = -lmpi
#
#
# Suffix rules
# hjaaj Oct 04: .g is a "cheat" suffix, for debugging.
#               'make x.g' will create x.o from x.F or x.c with -g debug flag set.
#
.SUFFIXES : .F .F90 .c .o .i .g .s

.F.o:
        $(F90) $(INCLUDES) $(MODULES) $(CPPFLAGS) $(FFLAGS) -c $*.F 

.F.i:
        $(F90) $(INCLUDES) $(MODULES) $(CPPFLAGS) -E $*.F > $*.i

.F.g:
        $(F90) $(INCLUDES) $(MODULES) $(CPPFLAGS) $(SAFEFFLAGS) -g -c $*.F 

.F.s:
        $(F90) $(INCLUDES) $(MODULES) $(CPPFLAGS) $(FFLAGS) -S -g -c $*.F 

.F90.o:
        $(F90) $(INCLUDES) $(MODULES) $(CPPFLAGS) $(FFLAGS) -c $*.F90 

.F90.i:
        $(F90) $(INCLUDES) $(MODULES) $(CPPFLAGS) -E $*.F90 > $*.i




make
make install

Now just copy the basis sets and ecp data to the proper location:
cd ../
cp basis/* -R /opt/dalton/basis

and edit your ~/.bashrc;
export PATH=$PATH:/opt/dalton/bin

And you should be good to go.


So far I haven't run all the tests, but
./TEST -dalton /opt/dalton/bin/dalton short

gave
#####################################################################
                              Summary
#####################################################################

ALL TESTS ENDED PROPERLY!

date and time         : Wed Nov  7 11:57:02 EST 2012



GPOPEN errors and how to get around them.

To make the story short: if you use -O3 or -O2 for some reason Dalton can't find the basis sets if you declare them the normal way (-O0 and -O1 take care of the problem). However, using ATOMBASIS it works.

Here's an example. Typically you'd specify the basis set for a whole molecule in your .mol file:

BASIS
STO-3G
DFT PROPERTIES TEST 
This doesn't work with O3
AtomTypes=2 Angstrom
        8.    1     
O -0.141254 0.0998816 0.00000
        1.    2     
H 0.589315 0.718039 0.00000
H -0.922641 0.652406 0.00000

but that leads to errors on the debian (but not centos) builds:
   0: Directories for basis set searches:
     /jobs/dalton:/opt/dalton/basis

 MPI node no.:     0
 Reason: ERROR (GPOPEN) UPON OPENING A FILE


 Node      0:  --- SEVERE ERROR, PROGRAM WILL BE ABORTED ---
 ERROR (GPOPEN) UPON OPENING A FILE

and
  Atomic type no.    1
  --------------------
  Nuclear charge:   8.00000
  Number of symmetry independent centers:    1
  Number of basis sets to read;    2
  Basis set file used for this atomic type with Z =   8 :
     "/opt/dalton/basis/                                                                                "


--> ERROR (GPOPEN) UPON TRYING TO OPEN FILE ON UNIT 11
--> with filename /opt/dalton/basis/
--> IOSTAT ERROR CODE RETURNED      21


 QTRACE dump of internal trace stack

 ========================
      level    module
 ========================
          7    GPOPEN      
          6    BASLIB      
          5    READ_MOL    
          4    READIN      
          3    HERMIT      
          2    DALTON      
          1    DALTON main 

whereas

ATOMBASIS
DFT PROPERTIES TEST 
This works with O3
AtomTypes=2 Angstrom
        8.    1    basis=STO-3G 
O -0.141254 0.0998816 0.00000
        1.    2     basis=STO-3G
H 0.589315 0.718039 0.00000
H -0.922641 0.652406 0.00000

works and gives
   0: Directories for basis set searches:
     /opt/dalton/basis:/opt/dalton/basis

 NOTE:    1 informational messages have been issued.
 Check output, result, and error files for "INFO".

and a normal exit:
 CPU time statistics for ABACUS
 ------------------------------

 LINRES     00:00:02      77 %

 TOTAL      00:00:03     100 %


 >>>> Total CPU  time used in ABACUS:   3.21 seconds
 >>>> Total wall time used in ABACUS:   3.22 seconds


                   .-------------------------------------------.
                   | End of Static Property Section (ABACUS) - |
                   `-------------------------------------------'

 >>>> Total CPU  time used in DALTON:   6.04 seconds
 >>>> Total wall time used in DALTON:   6.06 seconds

 
     Date and time (Linux)  : Tue Nov  6 14:54:24 2012
     Host name              : beryllium