
28 August 2015

623. Comparing frequency calculations in G09 and NWChem -- the importance of grid density

The vibrational entropy term will differ between calculations done in gaussian and nwchem unless you use "grid xfine" in nwchem. That the grid density is important is nothing new, but the magnitude of the effect on the entropy surprised me.

The difference can be large -- 30 cal/molK for [PPh4]+ at pbe0/cc-pvdz -- which becomes quite significant when multiplied by T (e.g. 298.15 K).
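Just to put a number on it, here's a trivial python check using the figure quoted above:

# contribution of a 30 cal/(mol K) entropy difference to the -T*S term at 298.15 K
dS = 30.0 / 1000    # kcal/(mol K)
T = 298.15          # K
print T * dS        # ~8.9 kcal/mol

So a grid-related entropy discrepancy of that size translates into roughly 9 kcal/mol in the free energy.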

Note that the rotational entropy term also may differ, but that this would be due to different uses of symmetry in the calculations: http://molecularmodelingbasics.blogspot.com.au/2012/12/conformational-and-rotational-entropy.html

If you turn off symmetry (noautosym) in nwchem the rotational entropy will not be corrected. I've noticed that Gaussian, on the other hand, will sneakily apply a symmetry correction if it finds an acceptable symmetry even if you request nosymm, so make sure that you scan through the output carefully.
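A quick way to check is to look for the rotational symmetry number that Gaussian reports in the thermochemistry section of the output. A minimal sketch (python 2.7; 'g09.log' is just a placeholder name, and the exact wording of the line may vary between versions):

#!/usr/bin/python
# print any lines mentioning a symmetry number in a Gaussian output file
with open('g09.log') as f:
    for line in f:
        if 'symmetry number' in line.lower():
            print line.rstrip()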

Either way, vibrational entropy is not symmetry dependent. Instead you will have to worry about the grid fineness when comparing outputs.

If your molecule is very small, such as benzene or tetramethylphosphonium, it seems that you don't have to worry about this. However, even fairly small molecules such as [PPh4]+ will be affected.

Conv. Dens. = Convergence Density


Code  Symm  Grid  Conv. Dens.  DFT Energy       ZPE       HCorr     S(tot)    S(trans)  S(rot)    S(vib)
G09   N     F     1E-8         -1266.58424152   0.370516  0.389751  147.076   43.358    35.009    68.708
G09   N     U     1E-8         -1266.58430374   0.370464  0.389691  146.829   43.358    35.009    68.461
NW    N     X     1E-8         -1266.58455223   0.370348  0.389549  146.697   43.339    34.994    68.365
NW    N     X     1E-5         -1266.58455222   0.370348  0.389549  146.704   43.339    34.994    68.371
NW    Y     F     1E-5         -1266.58453684   0.370034  0.385683  118.023   43.339    33.617    41.067
NW    Y     F     1E-8         -1266.58453684   0.370034  0.385683  118.023   43.339    33.617    41.067
NW    N     F     1E-5         -1266.58454928   0.370034  0.385683  119.394   43.339    34.994    41.062
NW    N     F     1E-8         -1266.58454929   0.370034  0.385683  119.394   43.339    34.994    41.061
NW    Y     X     1E-5         -1266.58455274   0.370348  0.389549  145.331   43.339    33.617    68.376
NW    Y     X     1E-8         -1266.58455275   0.370348  0.389549  145.337   43.339    33.617    68.382

F = fine. X = xfine. U = ultrafine (G09). Energies (DFT Energy, ZPE, HCorr) are in Hartree; entropies are in cal/(mol K).
The lower S(rot) values (the rows with symmetry turned on) are symmetry corrected. That's completely normal.

With "grid fine" NWChem gives a very different result to G09.

You can see the difference in the predicted IR spectra as well.

"grid fine" (NWChem) (blue rings) vs G09 (red circles):
"grid xfine" (NWChem) (blue rings) vs G09 (red circles):


28 August 2014

591. Briefly: Changes to compiling nwchem 6.3 with python support on debian jessie

In general you can compile nwchem 6.3 on debian jessie just like you would on wheezy -- see here for detailed instructions: http://verahill.blogspot.com.au/2013/06/449-nwchem-63-updated-sources-compiling.html

To compile with python support you need to make an additional change in the code though:

On debian wheezy apt-file search libpython2.7.a shows 
python2.7-dev: /usr/lib/python2.7/config/libpython2.7.a

whereas on jessie it says
libpython2.7-dev: /usr/lib/python2.7/config-x86_64-linux-gnu/libpython2.7.a

This causes issues, as alluded to here: http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id1363/how_do_I_set_the_path_to_python_....html

However, it is easy to solve by editing src/config/makefile.h (around lines 2357-2361) so that it reads

ifdef USE_PYTHON64
  CORE_LIBS += $(PYTHONHOME)/lib/python$(PYTHONVERSION)/config-x86_64-linux-gnu/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)
else
  CORE_LIBS += $(PYTHONHOME)/lib/python$(PYTHONVERSION)/config-x86_64-linux-gnu/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)
endif
(Note that I'm just guessing with the x86_64 part -- I don't have an i386 system to test on)
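If you're not sure which config directory your system actually has, something along these lines will tell you (just a sketch -- adjust the glob pattern if your python lives somewhere else):

#!/usr/bin/python
# find libpython2.7.* under the usual config directories
import glob
for hit in glob.glob('/usr/lib/python2.7/config*/libpython2.7.*'):
    print hit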

In addition, you will need to set
export PYTHONLIBTYPE=so
before building.

My full patch file (my.patch) for version 6.3 now looks like this (it fixes a few compilation issues, and makes nwchem more compatible with gabedit):

diff -rupN src.original/config/makefile.h src/config/makefile.h
--- src.original/config/makefile.h      2013-04-15 12:41:45.016853322 +1000
+++ src/config/makefile.h       2013-04-15 12:38:44.933319544 +1000
@@ -2039,7 +2039,7 @@ endif
 
      ifeq ($(BUILDING_PYTHON),python)
 #   EXTRA_LIBS += -ltk -ltcl -L/usr/X11R6/lib -lX11 -ldl
-     EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl
+     EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl -lssl -lz
   LDOPTIONS = -Wl,--export-dynamic 
      endif
 ifeq ($(NWCHEM_TARGET),CATAMOUNT)
diff -rupN src.original/ddscf/movecs_pr_anal.F src/ddscf/movecs_pr_anal.F
--- src.original/ddscf/movecs_pr_anal.F 2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/movecs_pr_anal.F  2013-04-15 12:23:28.100409225 +1000
@@ -195,7 +195,7 @@ c
  22         format(1x,2('  Bfn.  Coefficient  Atom+Function  ',5x))
             write(LuOut,23)
  23         format(1x,2(' ----- ------------  ---------------',5x))
-            do klo = 0, min(n-1,9), 2
+            do klo = 0, min(n-1,199), 2
                khi = min(klo+1,n-1)
                write(LuOut,2) (
      $              int_mb(k_list+k)+1, 
diff -rupN src.original/ddscf/rohf.F src/ddscf/rohf.F
--- src.original/ddscf/rohf.F   2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/rohf.F    2013-04-15 12:23:28.100409225 +1000
@@ -153,7 +153,7 @@ c
             ilo = 1
             ihi = nmo
          endif
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs, 
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs, 
      $        'ROHF Final Molecular Orbital Analysis', 
      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      $        .true., dbl_mb(k_occ))
diff -rupN src.original/ddscf/scf_vec_guess.F src/ddscf/scf_vec_guess.F
--- src.original/ddscf/scf_vec_guess.F  2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/scf_vec_guess.F   2013-04-15 12:23:28.100409225 +1000
@@ -511,19 +511,19 @@ c
          nprint = min(nclosed+nopen+30,nmo)
          if (scftype.eq.'RHF' .or. scftype.eq.'ROHF') then
             call movecs_print_anal(basis, 1,
-     &           nprint, 0.15d0, g_movecs, 
+     &           nprint, 0.01d0, g_movecs, 
      &           'ROHF Initial Molecular Orbital Analysis', 
      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      &           .true., dbl_mb(k_occ))
          else
             nprint = min(nalpha+20,nmo)
             call movecs_print_anal(basis, max(1,nbeta-20),
-     &           nprint, 0.15d0, g_movecs, 
+     &           nprint, 0.01d0, g_movecs, 
      &           'UHF Initial Alpha Molecular Orbital Analysis', 
      &           .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      &           .true., dbl_mb(k_occ))
             call movecs_print_anal(basis, max(1,nbeta-20),
-     &           nprint, 0.15d0, g_movecs(2), 
+     &           nprint, 0.01d0, g_movecs(2), 
      &           'UHF Initial Beta Molecular Orbital Analysis', 
      &           .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
      &           .true., dbl_mb(k_occ+nbf))
diff -rupN src.original/ddscf/uhf.F src/ddscf/uhf.F
--- src.original/ddscf/uhf.F    2013-04-15 12:41:45.036852381 +1000
+++ src/ddscf/uhf.F     2013-04-15 12:23:28.096409414 +1000
@@ -144,11 +144,11 @@ C
          enddo
          ihi = max(ihi-1,1)
  9611    continue
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs, 
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs, 
      $        'UHF Final Alpha Molecular Orbital Analysis', 
      $        .true., dbl_mb(k_eval), oadapt, int_mb(k_irs),
      $        .true., dbl_mb(k_occ))
-         call movecs_print_anal(basis, ilo, ihi, 0.15d0, g_movecs(2), 
+         call movecs_print_anal(basis, ilo, ihi, 0.01d0, g_movecs(2), 
      $        'UHF Final Beta Molecular Orbital Analysis', 
      $        .true., dbl_mb(k_eval+nbf), oadapt, int_mb(k_irs+nmo),
      $        .true., dbl_mb(k_occ+nbf))
diff -rupN src.original/mcscf/mcscf.F src/mcscf/mcscf.F
--- src.original/mcscf/mcscf.F  2013-04-15 12:41:45.000854073 +1000
+++ src/mcscf/mcscf.F   2013-04-15 12:23:23.748613695 +1000
@@ -719,7 +719,7 @@ c
       if (util_print('final vectors analysis', print_default))
      $     call movecs_print_anal(basis, 
      $     max(1,nclosed-10), min(nbf,nclosed+nact+10),
-     $     0.15d0, g_movecs, 'Analysis of MCSCF natural orbitals',
+     $     0.01d0, g_movecs, 'Analysis of MCSCF natural orbitals',
      $     .true., dbl_mb(k_evals), .true., int_mb(k_sym), 
      $     .true., dbl_mb(k_occ))
 c     
diff -rupN src.original/nwdft/scf_dft/dft_mxspin_ovlp.F src/nwdft/scf_dft/dft_mxspin_ovlp.F
--- src.original/nwdft/scf_dft/dft_mxspin_ovlp.F        2013-04-15 12:41:45.604825677 +1000
+++ src/nwdft/scf_dft/dft_mxspin_ovlp.F 2013-04-15 12:23:28.228403211 +1000
@@ -184,14 +184,14 @@ c
       call ga_sync()
 c
       call movecs_print_anal(basis,int_mb(k_non),int_mb(k_non)
-     & ,0.15d0,g_alpha,'Alpha Orbitals without Beta Partners',
+     & ,0.01d0,g_alpha,'Alpha Orbitals without Beta Partners',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
 c
       if (nct.GE.2) then
       do i = 2,nct
       ind = int_mb(k_non+i-1)
       call movecs_print_anal(basis,ind,ind
-     & ,0.15d0,g_alpha,' ',
+     & ,0.01d0,g_alpha,' ',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
       enddo
       endif
@@ -350,7 +350,7 @@ c      endif
 c      endif
 c 9990 format(/,18x,'THERE ARE',i3,1x,'UN-PARTNERED ALPHA ORBITALS')
 c
-       call movecs_print_anal(basis, 1, nalp, 0.15d0, g_ualpha,
+       call movecs_print_anal(basis, 1, nalp, 0.01d0, g_ualpha,
      & 'Alpha Orb. w/o Beta Partners (after maxim. alpha/beta overlap)',
      &   .false., 0.0 ,.false., 0 , .false., 0 )
 c
diff -rupN src.original/nwdft/scf_dft/dft_scf.F src/nwdft/scf_dft/dft_scf.F
--- src.original/nwdft/scf_dft/dft_scf.F        2013-04-15 12:41:45.608825490 +1000
+++ src/nwdft/scf_dft/dft_scf.F 2013-04-15 12:23:28.228403211 +1000
@@ -1774,7 +1774,7 @@ c
             else
                blob='DFT Final Beta Molecular Orbital Analysis' 
             endif
-            call movecs_print_anal(ao_bas_han, ilo, ihi, 0.15d0, 
+            call movecs_print_anal(ao_bas_han, ilo, ihi, 0.01d0, 
      &           g_movecs(ispin), 
      &           blob, 
      &           .true., dbl_mb(k_eval(ispin)), oadapt, 
diff -rupN src.original/nwdft/scf_dft_cg/dft_cg_solve.F src/nwdft/scf_dft_cg/dft_cg_solve.F
--- src.original/nwdft/scf_dft_cg/dft_cg_solve.F        2013-04-15 12:41:45.612825303 +1000
+++ src/nwdft/scf_dft_cg/dft_cg_solve.F 2013-04-15 12:23:28.220403588 +1000
@@ -183,7 +183,7 @@ c
             blob = 'DFT Final Beta Molecular Orbital Analysis'
           endif
           call movecs_fix_phase(g_movecs(ispin))
-          call movecs_print_anal(basis, ilo, ihi, 0.15d0,
+          call movecs_print_anal(basis, ilo, ihi, 0.01d0,
      &         g_movecs(ispin),blob,
      &         .true., dbl_mb(k_eval+(ispin-1)*nbf),
      &         oadapt, int_mb(k_irs+(ispin-1)*nbf),

--- src.original/config/makefile.h      2014-08-20 16:39:03.020195366 +1000
+++ src/config/makefile.h       2014-08-20 16:43:30.328351859 +1000
@@ -2355,9 +2355,9 @@ ifndef PYTHONLIBTYPE
        PYTHONLIBTYPE=a
 endif
 ifdef USE_PYTHON64
-  CORE_LIBS += $(PYTHONHOME)/lib64/python$(PYTHONVERSION)/config/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)
+  CORE_LIBS += $(PYTHONHOME)/lib/python$(PYTHONVERSION)/config-x86_64-linux-gnu/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)
 else
-  CORE_LIBS += $(PYTHONHOME)/lib/python$(PYTHONVERSION)/config/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)
+  CORE_LIBS += $(PYTHONHOME)/lib/python$(PYTHONVERSION)/config-x86_i386-linux-gnu/libpython$(PYTHONVERSION).$(PYTHONLIBTYPE)
 endif
 endif
 #

My build.sh file looks like this:

export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TOP=`pwd`
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all python"
export PYTHONLIBTYPE=so
export PYTHONVERSION=2.7
export PYTHONHOME=/usr
#export BLASOPT="-L/opt/acml/acml5.3.1/gfortran64_int64/lib -lacml"
export BLASOPT="-L/opt/openblas/lib -lopenblas"

export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include
#export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/acml/acml5.3.1/gfortran64_int64/lib"
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/openblas/lib"

export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export ARMCI_NETWORK=SOCKETS

cd $NWCHEM_TOP/src

patch -p0 < my.patch

make clean
make nwchem_config
make FC=gfortran 1> make.log 2>make.err
cd $NWCHEM_TOP/contrib
export FC=gfortran
./getmem.nwchem

10 June 2014

580. Python script: Interpolate between structures in a multi-xyz file

I'm doing a lot of NEB (nudged elastic band) calculations using nwchem at the moment, and while getting 'neb convergence' is simple enough, I often get an error along the lines of
3965547 @neb NEB calculation converged
3965548 @neb However NEB Gmax not converged...Try increasing the number of beads.

While that sounds simple enough, it's nicer if you don't have to go back to the beginning and e.g. run a more fine-grained PES job to generate a reasonable trajectory (straight linear interpolation often doesn't work) and then keep running NEB iterations. One way to (presumably) cut down on time is to simply pad the NEB path xyz file with intermediate structures, and that is what this python (2.7) script does.

Oh, and I really wish blogspot would support code inclusion better...

How to use:
python nebinterpolate.py -i neb_A_F.neb_final.xyz  -o test.xyz

Example:
Say we have the structure of methanol, and methanol in which the oxygen-carbon distance is 3.0 Ångström:

Here's the corresponding xyz file, which we'll call first.xyz:
6

C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 0.83960 0.81877 0.83960
H 1.38912 0.20156 1.38912
6

C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 1.76087 1.75017 1.76087
H 2.31039 1.13296 2.31039

Run nebinterpolate -i first.xyz -o second.xyz and you'll get a new xyz file with three structures -- the first one plus an intermediate structure:

Here's the new file, second.xyz:
6

C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 0.83960 0.81877 0.83960
H 1.38912 0.20156 1.38912
6
structure 2
C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 1.30024 1.28447 1.30024
H 1.84976 0.66726 1.84976
6

C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 1.76087 1.75017 1.76087
H 2.31039 1.13296 2.31039

 Run it again, nebinterpolate -i second.xyz -o third.xyz, and you'll get:

Here's the new file, third.xyz:
6

C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 0.83960 0.81877 0.83960
H 1.38912 0.20156 1.38912
6
structure 2
C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 1.06992 1.05162 1.06992
H 1.61944 0.43441 1.61944
6
structure 2
C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 1.30024 1.28447 1.30024
H 1.84976 0.66726 1.84976
6
structure 4
C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 1.53056 1.51732 1.53056
H 2.08007 0.90011 2.08007
6

C 0.03517 0.00549 0.03517
H -0.61778 -0.63407 0.66798
H 0.66798 -0.63407 -0.61778
H -0.60514 0.64647 -0.60514
O 1.76087 1.75017 1.76087
H 2.31039 1.13296 2.31039

Now, the real use for this is when you've been optimising a string of structures using NEB and want to increase the number of images because you're not getting gmax convergence, or if you want to do a quick and rough optimisation and then get a prettier looking set of coordinates.

You can load the multi-xyz file in nwchem by using

NEB
   ...
  XYZ_PATH path.xyz
END
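After padding the path it's worth double-checking how many structures the file now contains (I believe the number of beads you ask for should be consistent with it). Here's a minimal frame counter, assuming a well-formed multi-xyz file (atom count line, title line, then the coordinates):

#!/usr/bin/python
# count the frames in a multi-xyz file
import sys

def count_frames(filename):
    f = open(filename, 'r')
    lines = f.readlines()
    f.close()
    frames = 0
    i = 0
    while i < len(lines):
        if lines[i].strip() == '':
            i += 1
            continue
        natoms = int(lines[i].split()[0])
        frames += 1
        i += natoms + 2     # skip the title line and the coordinate lines
    return frames

if __name__ == "__main__":
    print count_frames(sys.argv[1])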


nebinterpolate.py 

#!/usr/bin/python

import sys

def getvars(arguments):
 switches={}

 version='0.1'
 
 try:
  if "-i" in arguments:
   switches['in_one']=arguments[arguments.index('-i')+1]
   print 'Input: %s '% (switches['in_one'])
  else:
   arguments="--help";
 except:
  arguments="--help";
  
 try:
  if "-o" in arguments:
   switches['o']=arguments[arguments.index('-o')+1].lower()
   print 'Output: %s'% switches['o']
  else:
   arguments="--help";
 except:
  arguments="--help";

 try:
  if "-w" in arguments:
   switches['w']=float(arguments[arguments.index('-w')+1])
   print 'Weighting: %f'% switches['w']
  else:
   print 'Assuming no weighting'
   switches['w']=1.0;
 except:
  switches['w']=1.0;

 doexit=0
 try:
  if ("-h" in arguments) or ("--help" in arguments):
   print '\t\t nebinterpolate.py version %s' % version
   print 'Creates interpolated structures'
   print 'from an multixyz file'
   print '--input \t-i\t multi-xyz file to morph'
   print '--output\t-o\t output file'
   print '--weight\t-w\t weight first structure vs second one (1=average; 0=start; 2=end)'
   print 'Exiting'
   doexit=1
 except:
  a=0 # do nothing
 if doexit==1:
  sys.exit(0)

 return switches

def getcmpds(switches):
 
 cmpds={}
 
 g=open(switches['in_one'],'r') 
 n=0
 xyz=[]
 atoms=[]
 structure_id=1

 for line in g:

  try:
   if len(xyz)==cmpds['atoms_'+str(structure_id)]:
    cmpds['coords_'+str(structure_id)]=xyz
    cmpds['elements_'+str(structure_id)]=atoms   
    structure_id+=2
    n=0
    atoms=[]
    xyz=[]
  except:
   pass

  n+=1
  line=line.rstrip('\n')
   
  if n==1:
   cmpds['atoms_'+str(structure_id)]=int(line)
  elif n==2:
   cmpds['title_'+str(structure_id)]=line
  else:
   line=line.split(' ')
   line=filter(None,line)
   xyz+=[[float(line[1]),float(line[2]),float(line[3])]]
   atoms+=[line[0].capitalize()]
 g.close()

 cmpds['coords_'+str(structure_id)]=xyz
 cmpds['elements_'+str(structure_id)]=atoms   
 cmpds['w']=switches['w']
 cmpds['structures']=(structure_id)
 
 return cmpds

def morph(cmpds,structure_id):
 coords_one=cmpds['coords_'+str(structure_id)]
 coords_two=cmpds['coords_'+str(structure_id+2)]
 
 coords_morph=[]
 coords_diff=[]
 for n in range(0,cmpds['atoms_'+str(structure_id)]):
  morph_x=coords_one[n][0]+cmpds['w']*(coords_two[n][0]-coords_one[n][0])/2.0
  morph_y=coords_one[n][1]+cmpds['w']*(coords_two[n][1]-coords_one[n][1])/2.0
  morph_z=coords_one[n][2]+cmpds['w']*(coords_two[n][2]-coords_one[n][2])/2.0
  diff_x=coords_two[n][0]-coords_one[n][0]
  diff_y=coords_two[n][1]-coords_one[n][1]
  diff_z=coords_two[n][2]-coords_one[n][2]
  coords_morph+=[[morph_x,morph_y,morph_z]]
  coords_diff+=[[diff_x,diff_y,diff_z]]

 cmpds['atoms_'+str(structure_id+1)]=cmpds['atoms_'+str(structure_id)]
 cmpds['elements_'+str(structure_id+1)]=cmpds['elements_'+str(structure_id)]
 cmpds['title_'+str(structure_id+1)]='structure '+str(structure_id+1)
 cmpds['coords_'+str(structure_id+1)]=coords_morph

 return cmpds

def genxyzstring(coords,element):
 x_str='%10.5f'% coords[0]
 y_str='%10.5f'% coords[1]
 z_str='%10.5f'% coords[2]
 
 xyz_string=element+(3-len(element))*' '+10*' '+\
 (8-len(x_str))*' '+x_str+10*' '+(8-len(y_str))*' '+y_str+10*' '+(8-len(z_str))*' '+z_str+'\n'
 
 return xyz_string

def writemorph(cmpds,outfile):
 g=open(outfile,'w') 

 for m in range(1,cmpds['structures']+1):
  g.write(str(cmpds['atoms_'+str(m)])+'\n')
  g.write(str(cmpds['title_'+str(m)])+'\n')
  for n in range(0,cmpds['atoms_'+str(m)]):
   coords=cmpds['coords_'+str(m)][n]
   g.write(genxyzstring(coords, cmpds['elements_'+str(m)][n]))
 g.close()

 return 0

if __name__=="__main__":
 arguments=sys.argv[1:len(sys.argv)]
 switches=getvars(arguments)
 cmpds=getcmpds(switches)

#check that the structures are compatible
 for n in range(1,cmpds['structures'],2):

  if cmpds['atoms_'+str(n)]!=cmpds['atoms_'+str(n+2)]:
   print 'The number of atoms differ. Exiting'
   sys.exit(1)
  elif cmpds['elements_'+str(n)]!=cmpds['elements_'+str(n+2)]:
   print 'The types of atoms differ. Exiting'
   sys.exit(1)
  cmpds=morph(cmpds,n)
 success=writemorph(cmpds,switches['o'])
 if success==0:
  print 'Conversion seems successful'

04 March 2014

561. b3pw91 in nwchem and g09

UPDATE: there was an error in the earlier version where I gave the wrong energy for the b3pw91 functional in nwchem. In the old version the energy I provided was very close to that of acm in nwchem rather than b3pw91 in g09.

Note that for a large molecule with a medium sized basis set (101 atoms, ca 1100 functions,  ca 2200 primitives) the energy difference between b3pw91 in g09 and b3pw91 in nwchem as defined below is 0.0124 Hartree, which is pretty big (7.8 kcal/mol), although in absolute terms it's quite small (nwchem: -6187.741840960054 Hartree. g09: -6187.75427966 Hartree).
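For reference, that's just the usual Hartree to kcal/mol conversion:

# 0.0124 Hartree expressed in kcal/mol
print 0.0124 * 627.509    # ~7.8 kcal/mol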

The difference is a lot smaller for the small molecule in the example below.

Original post:
According to http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id721/Are_these_definitions_correct_fo....html b3pw91 (as defined in Gaussian 09) and acm (as defined in nwchem) are identical.

Looking at the energies I've been getting, that's not true when it comes to G09 and NWCHEM.


That acm and b3pw91 are the same should be reasonable -- b3 indicates that it's Becke's 3-parameter hybrid exchange correlation functional model, which is also known as the Adiabatic Connection Method (ACM).

For historical reasons, g98 implemented the ACM as B3LYP, by using LYP instead of PW91, and using VWN_1_RPA and a few other tricks -- see section 2 in http://verahill.blogspot.com.au/2013/06/446-b3lyp-and-wah-confusion.html

Then it would stand to reason that B3PW91 would be the 'canonical' version of Becke's 3-parameter functional.

Looking at http://www.nwchem-sw.org/index.php/Release62:Density_Functional_Theory_for_Molecules acm is defined as

xc HFexch 0.2 slater 0.8 becke88 nonlocal 0.72 vwn_5 1 Perdew91 0.81

(there are several versions of VWN -- I know it's vwn_5 from the output)


Either way, using acm in a single energy calculation (no optimisation) in nwchem on a water molecule with 6-31+G* (acm/6-31+G*) gives
-76.358375905073 Hartree

G09 using B3PW91/6-31+G* (manually defined basis set so we're using the same form in both nwchem and g09) gives
 -76.3557851653 Hartree

nwchem using
xc HFexch 0.2 slater 0.8 becke88 nonlocal 0.72 vwn_5 1 Perdew91 0.81
gives
-76.358375905072 Hartree

and nwchem using
XC HFexch 0.20 slater 0.80 becke88 nonlocal 0.72 perdew91 0.81 pw91lda 1.00
obtained from http://myweb.liu.edu/~nmatsuna/gamess/refs/howto.dft.html, gives
-76.355784373093 Hartree

This last definition is thus equivalent to b3pw91 in g09.

The gaussian manual is less than helpful. In fact it is quite misleading:
"These functionals have the form devised by Becke in 1993 [Becke93a]:
A*Ex(Slater) + (1-A)*Ex(HF) + B*ΔEx(Becke) + Ec(VWN) + C*ΔEc(non-local)
[..] B3LYP uses the non-local correlation provided by the LYP expression, and VWN functional III for local correlation (not functional V). [..] B3P86 specifies the same functional with the non-local correlation provided by Perdew 86, and B3PW91 specifies this functional with the non-local correlation provided by Perdew/Wang 91."


Addendum:
While I think B3PW91 should be the same as ACM in nwchem (note that nwchem does not have b3pw91 as a keyword), I decided to have a look at how different packages define b3pw91.

nwchem -- doesn't exist. Manual.

g09 (this post) -- xc HFexch 0.20 slater 0.80 becke88 nonlocal 0.72 perdew91 0.81 pw91lda 1.00

gamess US (here) -- xc HFexch 0.20 slater 0.80 becke88 nonlocal 0.72 perdew91 0.81 pw91lda 1.00

PQS (page 52, manual) Paraphrased:
"B3PW91 -- hybrid 3-parameter HF-DFT functional comprising combination of Slater local exchange, Becke nonlocal exchange, VWN 5 local correlation and PW91 nonlocal correlation together with a portion (20%) of the exact Hartree-Fock exchange (original 3-parameter hybrid recommended by Becke)". That to me sounds like ACM.

Turbomol -- not available. Manual.

Orca -- "B3PW The three-parameter hybrid version of PW91". Not informative.

molpro -- doesn't exist. manual

Dalton -- (page 285, manual).
"B3PW91 3-parameter Becke-PW91 functional, with PW91 correlation functional. Note that PW91c includes PW92c local correlation, thus only excess PW92c local correlation is required (coe cient of 0.19).
Combine HF=0.2 Slater=0.8 Becke=0.72 PW91c=0.81 PW92c=0.19"
So the local correlation is 1*PW92c= 0.81 PW91c + 0.19 PW92c. This is, I presume, is quite different from VWN.

Q-Chem -- "B3PW91 (B3 Exchange + PW91 correlation)". Not explicit enough for me.


28 February 2014

559. Briefly: Nudge Elastic Band in NWChem

I'm currently exploring a method called Nudged Elastic Band (NEB) to find the minimum energy pathway in a reaction involving a large metal cluster. While the NWChem documentation isn't bad, it could be clearer for happy amateurs like myself, which is the impetus for this post.

As usual, this is how I understand it -- which may be wrong.

The NEB method is described here.

So...

There are a number of ways of modelling reaction pathways computationally.

Brute force PES scan (g09 or nwchem)
The fastest, cheapest, and least accurate approach is a simple Potential Energy Surface (PES) scan in which all coordinates are kept frozen except the ones directly involved in the reaction,
e.g. for Me-OH -> Me + OH you'd simply vary the C-O distance.

If you don't let the structures relax for each change in bond length you will overestimate the activation energy. If you do let the structures relax you run the risk of every single snapshot relaxing back to either pure methanol or pure Me and OH, rather than giving you a series of transitional energies.

Transition state search e.g. QST2 (g09)
A somewhat more sophisticated method is to try to identify a proper transition state (TS). You can do this by making a lucky guess and trying a transition state optimisation. You can improve your chances by using e.g. QST2 (or QST3, which also uses a specific TS guess), which takes the product and the starting material and interpolates the coordinates to generate a transition state guess. This has worked pretty well for me for cases where a simple transition state is expected (i.e. a hydride transfer). You can then model the reaction path by doing an IRC calc.

Minimum energy pathway (chain-of-state) methods, e.g. Nudged Elastic Band (nwchem)

A more sophisticated method is to use a chain-of-state method such as Nudged Elastic Band (NEB). There's a nice page about it here: http://theory.cm.utexas.edu/henkelman/research/saddle/

NEB generates a series of structures based on the starting point and the product. These are initially generated as simple interpolations between the coordinates of the starting point and the end point -- this is very similar to a brute force PES scan and gives an unnaturally high transition energy. However, the individual structures are then optimised in passes.

10 passes. Structural convergence is not achieved.

Not clear? Well, here's a rough algorithm:

1. Generate a series of linearly interpolated structures (beads) based on the Starting Point (SP) and End Point (EP) structures. In the simplest case, for N structures, each structure n is SP + (n/N)*(EP - SP) (see the short python sketch after this list).

Now, typically you're more interested in structures in the middle than at the ends (i.e. to get better resolution for the transition) and that is manipulated using the spring constant, kbeads. See figure 3 in http://theory.cm.utexas.edu/henkelman/pubs/jonsson98_385.pdf for an example of varying the spring constant. 1.0 seems to be reasonably safe.

The number of structures is set by nbeads. Make sure to use a reasonable number so that you get enough resolution.

2.  Each structure thus generated is submitted to a single geometry optimisation step, i.e. it's not optimised until convergence, but only takes a single step along the energy gradient. The speed at which it is descending along the energy gradient is set using the stepsize.

The largest possible stepsize will give the fastest descent (which is good), but too large a stepsize may cause issues due to the atoms being moved too far (which is bad) and in turn cause SCF convergence failures due to the structures becoming unreasonable. So try 1.0 and hope for the best.

Rule of thumb: if you decrease the stepsize by a factor of 10 from 1.0 to 0.1, you should increase maxiter by a factor of 10.

3. The energies of the new structures are calculated and a reaction profile is generated.

4. Repeat steps 2 and 3 until structural convergence or maxiter steps.



5. If you're lucky your calculation will end due to structural convergence. If not, and if you want to restart your calculation you can read in the last set of structures using xyz_path.

But make sure that you do print the intermediate structures in the first place by using print_shift 1.
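To make step 1 concrete, here's a minimal sketch of the linear interpolation that generates the initial beads (plain python; sp and ep are made-up endpoint coordinates, not anything NWChem-specific):

#!/usr/bin/python
# beads between a start point (sp) and an end point (ep): bead n = sp + (n/N)*(ep - sp)
sp = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]   # example: two atoms
ep = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0]]
N = 4                                     # gives N+1 beads, endpoints included

for n in range(N + 1):
    f = float(n) / N
    bead = [[a + f * (b - a) for a, b in zip(atom_sp, atom_ep)]
            for atom_sp, atom_ep in zip(sp, ep)]
    print n, bead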





15 February 2014

552. Very briefly: Enthalpy of correction and different number of reactants and products in nwchem and gaussian

Those who are well-versed in the computational arts won't get much out of this post. On the other hand, happy amateurs like myself might find it a bit more useful as it clarifies something that's been bothering  me.

Short version: Hcorr in NWChem and Gaussian includes PV.

As usual, I am not an expert in either computational or theoretical chemistry. I try to use it as a tool, and I try to use it as well as I can. But I am not an authority. Also, if you consult older posts on the blog you'll find plenty of examples of me misunderstanding basic computational concepts (with 550+ posts it's difficult to go back and erase all the embarrassing little gems).


The background
I had a bit of a fright the other day. I'm currently working on computationally characterising a system which undergoes a reaction that can lead to a large number of isomers. Only one of them is experimentally observed, and so it was of interest to see whether this is the thermodynamically favoured product as predicted by reasonably cheap computational methods.

Because DFT calculations like these are based on gas phase reactions (even if you use a solvation model) the free energy that you get is based on the standard conditions for gas phase corrections i.e. 1 bar of partial pressure of each reactant.

If you want to calculate the free energy of reaction in solution you need to use concentrations instead. As you will see, you'll only have to worry about this for reactions that involve an unequal number of reactants and products. Normally your best results will be obtained by using isodesmic reaction schemes anyway, which is a great way of avoiding this. However, if you do have unequal numbers of reactants and products you /must/ correct for it when making solution phase predictions.

A gas at ca 1 atm of pressure (roughly 1 bar) occupies about
V = n*R*T/P = 1*8.314*298.15/101325 = 0.02446 m^3 = 24.46 litres per mole


So for an example reaction like this:
A+B -> C


Using

G_SATP = G(gas phase) + R*T*ln Q
       = G(gas phase) + R*T*ln([C]/([A]*[B]))
       = G(gas phase) + R*T*ln((1/24.46)/((1/24.46)*(1/24.46)))
       = G(gas phase) + 1.89 kcal/mol

In other words: you can't ignore the standard state change when doing solution phase calculations. This is obviously of extra importance in pH calculations, which are notoriously tricky.
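The 1.89 kcal/mol figure is easy to verify (python, with R in kcal/(mol K)):

# standard state correction for A + B -> C at 298.15 K and 24.46 l/mol
import math
R = 1.987e-3    # kcal/(mol K)
T = 298.15      # K
V = 24.46       # l/mol
print R * T * math.log((1/V) / ((1/V) * (1/V)))    # ~1.89 kcal/mol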

Enthalpy
So knowing the need for standard state corrections I was a bit paranoid about how to treat the reaction enthalpy and came across this document: http://chemistry.illinoisstate.edu/standard/che38037/handouts/380.37assign3.pdf

In it, equation 2 states that for Gaussian
Ee + Hcorr = Ee + Hvib + Hrot + Htrans

(where Ee is the electronic energy), which is an expression that doesn't include Δ(PV). In that case
ΔH = Σ H(products) - Σ H(reactants) + Δn*R*T
where Δn is the difference in the number of moles of gas between products and reactants.

Consulting the excellent Gaussian thermochemistry whitepaper (http://www.gaussian.com/g_whitepap/thermo.htm) offers the following:
Hcorr=Etot+kBT
and
The Gibbs free energy includes Δ(PV) , so when it's applied to calculating ΔG for a reaction, ΔNRT=ΔPV is already included. This means that ΔG will be computed correctly when the number of moles of gas changes during the course of a reaction.
[Note that H=Ee+Hcorr=Ee+Etot+kBT]

At a first glance, kBT isn't equivalent to RT, but in fact kB=R/NA -- in the words of Wikipedia: "[The gas constant R] is equivalent to the Boltzmann constant, but expressed in units of energy".

In other words, Δ(PV) is already accounted for in Hcorr in gaussian.

Somewhat clearer: the Freq() page, http://www.gaussian.com/g_tech/g_ur/k_freq.htm, on the gaussian website now states
Sum of electronic and zero-point Energies=-527.492585  E0=Eelec+ZPE
Sum of electronic and thermal Energies= -527.489751     E= E0+ Evib+ Erot+Etrans
Sum of electronic and thermal Enthalpies=-527.488807  H=E+RT

We test that claim:
E=-527.489751 Hartree
RT=(1.987*298.15/1000)/627.503=9.4410e-04 Hartree
E+RT=-527.489751+9.4410e-04=-527.488807

It holds up. 

A similar example from an nwchem calculation:
Zero-Point correction to Energy = 120.416 kcal/mol ( 0.191895 au)
Thermal correction to Energy = 127.114 kcal/mol ( 0.202569 au)
Thermal correction to Enthalpy = 127.706 kcal/mol ( 0.203513 au)
(0.203513 - 0.202569) = 9.4400e-04 Hartree = (1.987*298.15/1000)/627.503

Nwchem also includes Δ(PV) in the thermal correction to enthalpy.
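And the RT term itself, using the same numbers as above:

# RT at 298.15 K in Hartree, plus the Gaussian E -> H check
RT = (1.987e-3 * 298.15) / 627.503    # kcal/mol -> Hartree
print RT                              # ~9.44e-04 Hartree
print -527.489751 + RT                # ~ -527.488807, i.e. the reported enthalpy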

05 September 2013

512. Briefly: zmatrices in nwchem -- methanol

And another update:
I can now confirm that using your own z matrix still does not constrain the geometry during a PES scan, which was the original impetus for this post: http://verahill.blogspot.com.au/2013/09/511-when-nwchem-pes-scans-fail-to.html

Another update:
the gaussian run failed after 14 geometry steps during the first PES point.
NTrRot= -1 NTRed= 628 NAtoms= 34 NSkip= 532 IsLin=F
Error in internal coordinate system.
Error termination via Lnk1e in /opt/gaussian/g09/l103.exe at Thu Sep 5 18:17:12 2013.
Job cpu time: 0 days 22 hours 25 minutes 27.6 seconds.
File lengths (MBytes): RWF= 192 Int= 0 D2E= 0 Chk= 28 Scr= 1
Not being an expert, to me it seems that there's something fundamentally difficult with the system I'm working on. In an ideal world I'd give the actual details, but quite apart from the risk of being scooped, doing so would also make it easier to identify me (not that it's impossible at this point).

[Suffice to say that the system holds a large polyoxoanion and a small p-block anion, both of which are symmetrical and negatively charged. The goal of the PES scan is to bring the ions closer to see whether they 'react'. Which is also a troublesome use of computational resources -- computational chemistry is good at answering well-defined questions using carefully designed computational experiments -- but not generally very good at answering ill-defined questions about synthesis (i.e. you can't generally 'mix two things together and see what happens' and expect a useful result. Anyway, regardless of that, that's exactly what I want to do.]

Update:
nwchem still gives errors about autoz in spite of using noautoz. But I also get messages about the user generated z matrix, so we'll see whether my input is respected or not.

Also, for one of the calcs I'm getting
There are insufficient internal variables: expected 95 got 96

which is really, really, really annoying since there doesn't seem to be a real fix for it -- I've tried everything suggested in http://www.nwchem-sw.org/index.php/Special:AWCforum/st/id286. I can get the same calc to run in gaussian though (gaussian has its own issues), but it'd be nice if stuff just...worked...

Original post:
Normally you don't have to fiddle with zmatrices in nwchem -- instead you'd typically supply cartesian coordinates, and nwchem would do autoz to autogenerate internal (z matrix) coordinates.

Sometimes that fails, and nwchem defaults to using cartesian coordinates. In most cases, this isn't a cause for any real concern -- the computation will continue although I think cartesian coordinates are supposed to be slightly slower.

However, if you're doing a PES scan you'll notice that it's not proceeding as intended -- the constraints are completely ignored: 511. When nwchem PES scans fail to constrain -- autoz failure

The easiest remedy is to supply the internal coordinates directly, but there honestly aren't too many examples online showing how that's done, and I kept on getting annoying failure messages along the lines of
NWChem Input Module
-------------------

 zmat ---- THE 3-D PIECE OF -Z- DATA FOR ATOM = 2 IS NEITHER FLOATING POINT
 NOR ALPHANUMERIC OR COULD NOT BE MATCHED WITH A VARIABLE. STOP
 IAT= 2 ZMAT= 2 1 0 0 0 0.00000 0.00000 0.00000
 ------------------------------------------------------------------------
 JOB STOPPED
 PROGRAM STOP IN - ZDAT -
 ------------------------------------------------------------------------
 ------------------------------------------------------------------------
 CALLS IT QUIT FROM HND_HNDERR 0
 ------------------------------------------------------------------------
 This error has not yet been assigned to a category

This particular error came about because the zmatrix module is case sensitive, and my Variables couldn't be interpreted (it should be variables). Anyway, you'll understand more after this post, and it isn't important anyway.


Calculation using a z matrix (internal coordinates) in nwchem, with a little bit of help from openbabel:

Assuming that you set up a calculation in e.g. ECCE for a geometry optimisation of methanol you'll end up with the following input file:
scratch_dir /home/andy/scratch
Title "methanol"

Start  methanol

echo

charge 0

geometry autosym units angstrom
 C     0.0351714     0.00548884     0.0351714
 H     -0.617781     -0.634073     0.667983
 H     0.667983     -0.634073     -0.617781
 H     -0.605139     0.646470     -0.605139
 O     0.839603     0.818768     0.839603
 H     1.38912     0.201564     1.38912
end

ecce_print ecce.out

basis "ao basis" spherical print
  H library "6-31+G*"
  O library "6-31+G*"
  C library "6-31+G*"
END

dft
  mult 1
  direct
  XC b3lyp
  grid fine
  mulliken
end

driver
  default
end

task dft optimize
Take the coordinates, and paste them into a file, e.g. methanol.xyz:
6
methanol
C 0.0351714 0.00548884 0.0351714
H -0.617781 -0.634073 0.667983
H 0.667983 -0.634073 -0.617781
H -0.605139 0.646470 -0.605139
O 0.839603 0.818768 0.839603
H 1.38912 0.201564 1.38912

Next, use openbabel:
babel -ixyz methanol.xyz -ogzmat 
#Put Keywords Here, check Charge and Multiplicity.

methanol

0  1
C
H  1  r2
H  1  r3  2  a3
H  1  r4  2  a4  3  d4
O  1  r5  2  a5  3  d5
H  5  r6  1  a6  2  d6
Variables:
r2= 1.1117
r3= 1.1117
a3= 109.74
r4= 1.1094
a4= 108.78
d4= 118.90
r5= 1.3984
a5= 110.18
d5= 238.51
r6= 0.9924
a6= 105.98
d6= 60.61
1 molecule converted
18 audit log messages
The format isn't quite right (the header lines down to the charge/multiplicity line and the trailing babel messages need to go, and Variables: needs to become a lower-case variables without the colon), but we can sort that out:

 babel -ixyz ~/methanol.xyz -ogzmat |sed 's/\=//g;s/V/v/g;s/\://g' |tail -n+6 > methanol.zmat
C
H 1 r2
H 1 r3 2 a3
H 1 r4 2 a4 3 d4
O 1 r5 2 a5 3 d5
H 5 r6 1 a6 2 d6
variables
r2 1.1117
r3 1.1117
a3 109.74
r4 1.1094
a4 108.78
d4 118.90
r5 1.3984
a5 110.18
d5 238.51
r6 0.9924
a6 105.98
d6 60.61

Let's update our nwchem input file with the internal coordinates:
scratch_dir /home/andy/scratch
Title "methanol"

Start  methanol

echo

charge 0

geometry noautoz
 zmatrix
  C
  H 1 r2
  H 1 r3 2 a3
  H 1 r4 2 a4 3 d4
  O 1 r5 2 a5 3 d5
  H 5 r6 1 a6 2 d6
  variables
   r2 1.1117
   r3 1.1117
   a3 109.74
   r4 1.1094
   a4 108.78
   d4 118.90
   r5 1.3984
   a5 110.18
   d5 238.51
   r6 0.9924
   a6 105.98
   d6 60.61
 end
end

ecce_print ecce.out

basis "ao basis" spherical print
  H library "6-31+G*"
  O library "6-31+G*"
  C library "6-31+G*"
END

dft
  mult 1
  direct
  XC b3lyp
  grid fine
  mulliken
end

driver
  default
end

task dft optimize

And run. Done!

511. When nwchem PES scans fail to constrain -- autoz failure

Another update:
My jobs have run long enough now that I can confirm that using your own z matrix still does not constrain the bond lengths i.e. the link at the end of the post is useless.

Update:
I'm not sure using a zmatrix actually solved this -- for each step in the optimization it seems that nwchem attempts to generate a new zmatrix, probably ignoring my input (and yes, I'm using noautoz). I'll let my calcs run for a little while to see whether the constraints are honoured or not.

But I'm getting really frustrated with nwchem right now, especially since gaussian isn't having any issues running these particular jobs (there are other issues with gaussian, such as the format of the output, etc.)

Original post:
I set up PES scans in nwchem as shown in this post: http://verahill.blogspot.de/2013/08/503-relaxed-pes-scanning-in-nwchem.html

I was noticing that while almost all of my potential energy surface scans were working out just fine, some of them would...not. What would happen is that there were no error messages, but for some reason the constrained distance -- e.g. an atom-atom distance defined using constant -- would not remain constant during the geometry optimization at each step.

I saw this when trying to move an anion (9 atoms) step-wise closer to a large, negatively charged metal oxide ion (25 atoms).

It took a while to chase this down. First I thought that maybe the distances weren't really set as immutable, but were just associated with a certain force constant -- and that the anion-anion repulsion somehow overcame this. That wasn't the case.

Instead it was something that I should've paid attention to: the zmatrix generation.
If you find that for some reason the PES scan is not constrained at all, look for something along the lines of the following in your output:
NWChem Input Module
-------------------

molecules_def2_svp
----------------

Scaling coordinates for geometry "geometry" by 1.889725989 (inverse scale = 0.529177249)

------
auto-z
------
warning. autoz generated 10 bonds for atom 24
warning. autoz generated 10 bonds for atom 25
warning. autoz generated 10 bonds for atom 26
warning. autoz generated 10 bonds for atom 27
warning. autoz generated 10 bonds for atom 28
warning. autoz generated 10 bonds for atom 29
autoz: The atoms group into disjoint clusters
cluster 1: 1 2 3 4
cluster 2: 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
Connecting clusters 1 2 via atoms 3 7 r = 3.71
autoz: regenerating connections with new bonds
warning. autoz generated 10 bonds for atom 24
warning. autoz generated 10 bonds for atom 25
warning. autoz generated 10 bonds for atom 26
warning. autoz generated 10 bonds for atom 27
warning. autoz generated 10 bonds for atom 28
warning. autoz generated 10 bonds for atom 29
autoz: excessive number of variables 2066 81
AUTOZ failed to generate good internal coordinates.
Cartesian coordinates will be used in optimizations.

If that happens, cartesian coordinates will be used, and your
python
from nwgeom import *
geom = '''
    geometry adjust
        zcoord
            bond 1 14 %f bond constant
        end
    end
'''

won't do anything.

The solution is to provide the coordinates as a zmatrix instead -- and that's the focus of my next post:
http://verahill.blogspot.com.au/2013/09/512-briefly-zmatrices-in-nwchem-methanol.html

Oh, and don't forget to include noautoz

30 August 2013

506. Extracting optimized structures from a potential energy scan in nwchem

Another update:
It now dumps the energies in a file, energies.dat, as well.

Update:
some programmes, like ecce, are more picky about the xyz format than others (e.g. jmol, vmd). I've updated the code to output xyz files that ecce too can read.

Original post:
When you use scan_input() in nwchem to do a PES scan (see e.g. here: http://verahill.blogspot.com.au/2013/08/503-relaxed-pes-scanning-in-nwchem.html) you get the energies and the gradients for the optimized structures returned as the results. However, for a casual user the actual atomic coordinates are more informative.

Here's a very simple parser written in python (2.7) which extracts the optimized structures from the output file:

#!/usr/bin/python
import sys

def getrawdata(infile):
        f=open(infile,'r')
        opt=0
        geo=0
        energy=[]
        energies=[]
        struct=[]
        structure=[]
        for line in f:
                if "Total DFT" in line:
                        line=filter(None,line.rstrip('\n').split(' '))
                        energy=float(line[4])
                if 'Optimization converged' in line:
                        opt=1
                if opt==1 and 'Geometry' in line:
                        geo=1
                if      'Atomic Mass' in line and (opt==1 and geo==1):
                        opt=0
                        geo=0
                        struct+=[structure]
                        energies+=[energy]
                        structure=[]
                if opt==1 and geo==1:
                        structure+=[line.rstrip()]
        return struct,energies

def genxyzstring(coords,element):
        x_str='%10.5f'% coords[0]
        y_str='%10.5f'% coords[1]
        z_str='%10.5f'% coords[2]
 
        xyz_string=element+(3-len(element))*' '+10*' '+\
        (8-len(x_str))*' '+x_str+10*' '+(8-len(y_str))*' '+y_str+10*' '+(8-len(z_str))*' '+z_str+'\n'
 
        return xyz_string

def getstructures(rawdata):
        
        n=0
        for structure in rawdata:
                
                n=n+1
                num="%03d" % (n,)
                g=open('structure_'+num+'.xyz','w')
                itson=False
                cartesian=[]
                        
                for item in structure:
                        
                        if itson and not(item==""):
                                coords=filter(None,item.split(' '))
                                coordinates=[float(coords[3]),float(coords[4]),float(coords[5])]
                                element=coords[1]
                                cartesian+=[genxyzstring(coordinates,element)]
                                #cartesian+=[coords[1]+'\t'+coords[3]+'\t'+coords[4]+'\t'+coords[5]+'\n']
                
                        if "---" in item:
                                itson=True
                        if item=="" and itson==True:
                                itson=False
                                if not(len(cartesian)==0):
                                        g.write(str(len(cartesian))+'\n')
                                        g.write('Structure '+str(n)+'\n')
                                        for line in cartesian:
                                                g.write(line)
                                        g.close()
                                cartesian=[]
        return 0
        
if __name__ == "__main__":
        infile=sys.argv[1]
        rawdata,energies=getrawdata(infile)
        structures=getstructures(rawdata)

        g=open('energies.dat','w')
        for n in range(0,len(energies)):
                g.write(str(n)+'\t'+str(energies[n])+'\n')
        g.close()


Presuming that you've saved it as pes_parse.py you can then generate a series of xyz files with the structures, catenate them into a trajectory file, and open it in e.g. jmol. I'm using the output from example 1 in http://verahill.blogspot.com.au/2013/08/503-relaxed-pes-scanning-in-nwchem.html as the example:

chmod +x pes_parse.py
./pes_parse.py nwch.nwout
ls
nwch.nwout structure_001.xyz structure_003.xyz structure_005.xyz structure_007.xyz structure_009.xyz structure_011.xyz structure_013.xyz structure_015.xyz structure_017.xyz structure_019.xyz pes_parse.py structure_002.xyz structure_004.xyz structure_006.xyz structure_008.xyz structure_010.xyz structure_012.xyz structure_014.xyz structure_016.xyz structure_018.xyz
cat structure_*.xyz >> trajectory.xyz
jmol trajectory.xyz

You can go through the structures by clicking on the arrows indicated by the white arrow:

Finally, using VMD it's easy to make videos -- note that they for some reason look awful here (seems like a lot of frames are removed, in particular from the beginning of the run):

And here's the SN2 reaction from post 503:


28 August 2013

503. (relaxed) PES scanning in Nwchem revisited.

Update 2: The coordinates are actually gradients, and so aren't terribly informative to a casual user like myself. See this post for how to extract the geometries properly: http://verahill.blogspot.com.au/2013/08/506-extracting-optimized-structures.html


Update:
Please note that the coordinates in square brackets ([]) in the python output are not raw coordinates for the atoms in the molecule -- I haven't quite figured out how they scale, but it's not a simple matter of just multiplying. The energies are good though, and you can always extract the coordinates the slow and painful way by manually going through the output.

Another issue which should be stressed is that scan_input(geom,[1.398],[3.398],19,'dft',task_optimize) does not do the end points -- i.e. you won't get the energy for a bond length of 1.398, and you won't get the energy for a bond length of 3.398. Instead you'll get 19 data points in between these. It's a bit...awkward.
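Based on the step values that turn up in the output (1.498, 1.598, ..., 3.298) the points appear to be evenly spaced strictly between the two endpoints. Something like this reproduces them (my guess from the output, not from the NWChem source):

# 19 scan points strictly between 1.398 and 3.398 (endpoints excluded)
start, end, n = 1.398, 3.398, 19
step = (end - start) / (n + 1)
print [round(start + (i + 1) * step, 3) for i in range(n)]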

Original post:
A long time ago I made a post on doing potential energy surface (PES) scans in nwchem using python.

This is a post giving PES another look. The impetus for the post is that I'm tired of Gaussian failing and being opaque about the whole procedure.

The following page was of great help: http://www.fqt.izt.uam.mx/html/software_fqt/user/node34.html

NOTE: you'll need to compile nwchem with python support. See e.g. http://verahill.blogspot.com.au/2013/06/449-nwchem-63-updated-sources-compiling.html (the post is a bit messy, but persevere -- it's not that difficult)

On Debian the key is to change
    EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl
to
    EXTRA_LIBS +=    -lnwcutil  -lpthread -lutil -ldl -lssl -lz

in config/makefile.h before compiling. It's not necessary on RHEL clones.

Below I'll show three examples:
* a simple bond dissociation reaction. I also discuss the use of 'constant', and task_energy vs task_optimize.
* an SN2 reaction (CH3Br + I-)
* a 2D/parallel PES scan of ethane ( C-C bond length, H-C-C angle). I also show constant vs free variables.


Example 1.
Breaking the C-O bond in methanol

I set this up in ecce (see e.g. next example), but you don't have to. The input file I used was the following:
scratch_dir /scratch
Title "meoh_pes"

Start  meoh_pes

echo

charge 0

geometry autosym units angstrom
 C     0.0351714     0.00548884     0.0351714
 H     -0.617781     -0.634073     0.667983
 H     0.667983     -0.634073     -0.617781
 H     -0.605139     0.646470     -0.605139
 O     0.839603     0.818768     0.839603
 H     1.38912     0.201564     1.38912
end

ecce_print ecce.out

basis "ao basis" cartesian print
  H library "3-21G"
  O library "3-21G"
  C library "3-21G"
END

dft
  mult 1
  direct
  XC b3lyp
  grid fine
  iterations 99
  mulliken
end

driver
  default
  maxiter 888
end

python
from nwgeom import *
geom = '''
    geometry adjust
        zcoord
            bond 1 5 %f cccc constant
        end
    end
'''
results=scan_input(geom,[1.398],[3.398],19,'dft',task_optimize)
for i in range(0,len(results)):
    print results[i][0][0],results[i][1]
end

task python
The PES bit is the python block at the end. Note the 'constant' keyword -- if you omit it the bond length will initially be set to whatever you define it to in your scan, but it can then relax back to the optimum length. If you DO set 'constant', everything BUT that bond will be relaxed. Most likely this is what you will want to do.

Also note that a constrained (i.e. not relaxed) PES scan can be done by doing task_energy instead of task_optimize.

ECCE can't quite handle the textual output (alt+O) since there are lines that are too long. The output is properly written though -- you'll just have to look in the Output folder of the job. The ecce.out file works fine though.

The job takes 90-100 seconds on an old 3-core node (AMD Athlon II X3).


The very end of the output file has all the results, but in a non-obvious way:
1.498 (-115.07289914310056, [-0.00130778291169336, 0.01798903956433226, 0.0, -4.009155466250247e-05, 1.693340302064139e-05, -6.637550254401381e-06, -4.009155466250247e-05, 1.693340302064139e-05, 6.637550254401381e-06, 2. 4514244186701895e-05, -1.5885649893555842e-05, 0.0, 0.0012636893525275195, -0.018041103298149008, 0.0, 9.97624242821682e-05, 3.4082577691996185e-05, 0.0]) (-114.8737952986994, [-4.7287277448850376e-05, 0.030029200359777717, 0.0, -1.3711175166353229e-06, -8.452926738775068e-08, 9.941241931599176e-07, -1.3711175166353229e-06, -8.452926738775068e-08, -9.941241931599176e-07, 8. 167348279908282e-07, -2.5820569179275075e-06, 0.0, 4.871429991895604e-05, -0.030027845123621805, 0.0, 4.984777179639632e-07, 1.3958792967685985e-06, 0.0]) 1.498 (-115.07289914310056, [-0.00130778291169336, 0.01798903956433226, 0.0, -4.009155466250247e-05, 1.693340302064139e-05, -6.637550254401381e-06, -4.009155466250247e-05, 1.693340302064139e-05, 6.637550254401381e-06, 2. 4514244186701895e-05, -1.5885649893555842e-05, 0.0, 0.0012636893525275195, -0.018041103298149008, 0.0, [..] 3.198 (-114.87977711993531, [-0.00018979360652668711, 0.033296276783081655, 0.0, -2.3787379704320877e-06, 1.7510009376556918e-06, 1.3530564600128248e-06, -2.3787379704320877e-06, 1.7510009376556918e-06, -1.3530564600128248e-06, 8. 24207064487048e-06, -8.055936327900498e-07, 0.0, 0.00018027576986845428, -0.03329589479259992, 0.0, 6.033241931824307e-06, -3.0783987173960137e-06, 0.0]) 3.298 (-114.8737952986994, [-4.7287277448850376e-05, 0.030029200359777717, 0.0, -1.3711175166353229e-06, -8.452926738775068e-08, 9.941241931599176e-07, -1.3711175166353229e-06, -8.452926738775068e-08, -9.941241931599176e-07, 8. 167348279908282e-07, -2.5820569179275075e-06, 0.0, 4.871429991895604e-05, -0.030027845123621805, 0.0, 4.984777179639632e-07, 1.3958792967685985e-06, 0.0])
All in all, there are 58 lines for 19 steps. I think that there are three things happening -- firstly, the output from the 19th step somehow gets mixed in with the output from all the other calculations. Secondly, the structure and energy of each step is reported twice in a row. Thirdly, the optimised structures/energies are reported one more time by being injected into the output, like this:
A
S
A
B
B
C
C
D
D
A
E
E
B

where A is the first step, S is the 19th step etc. This way you get 19x3+1=58 lines. This is clearly idiotic.

Instead, you can look through the output and search for 'Scanning NWChem input - results from step' to see all the output for the optimised structures one by one:
Scanning NWChem input - results from step 2 (-115.06618436935011, [-0.0038228970733096973, 0.050051062094932305, 0.0, 2.9196769046224702e-05, -6.928661348853948e-06, 4.746536668570611e-06, 2.9196769046224702e-05, -6.928661348853948e-06, -4.746536668570611e-06, -1.0103262985700079e-05, 1.6491089715894858e-05, 0.0, 0.003767244388907326, -0.05005618579508188, 0.0, 7.362409274846993e-06, 2.489933151654522e-06, 0.0])
In this particular case I can grep my way through by doing
cat nwch.nwout |grep '^(-'|cat -n
1 (-115.07289914310056, [-0.00130778291169336, 0.01798903956433226, 0.0, -4.009155466250247e-05, 1.693340302064139e-05, -6.637550254401381e-06, -4.009155466250247e-05, 1.693340302064139e-05, 6.637550254401381e-06, 2.4514244186701895e-05, -1.5885649893555842e-05, 0.0, 0.0012636893525275195, -0.018041103298149008, 0.0, 9.97624242821682e-05, 3.4082577691996185e-05, 0.0]) 2 (-115.06618436935011, [-0.0038228970733096973, 0.050051062094932305, 0.0, 2.9196769046224702e-05, -6.928661348853948e-06, 4.746536668570611e-06, 2.9196769046224702e-05, -6.928661348853948e-06, -4.746536668570611e-06, -1.0103262985700079e-05, 1.6491089715894858e-05, 0.0, 0.003767244388907326, -0.05005618579508188, 0.0, 7.362409274846993e-06, 2.489933151654522e-06, 0.0]) 3 (-115.05478103866017, [-0.005033784212299788, 0.06848598587431667, 0.0, -1.3396548676491982e-06, -2.5875637174599397e-08, -5.261746410523127e-07, -1.3396548676491982e-06, -2.5875637174599397e-08, 5.261746410523127e-07, 1.4459720645843e-07, -2.8328952926398587e-06, 0.0, 0.005034455335082233, -0.0684825786855032, 0.0, 1.8635897582608418e-06, -5.225422206114883e-07, 0.0]) 4 (-115.04079235517, [-0.005485543277166251, 0.07798880362126945, 0.0, 4.745460307237215e-06, -5.597510268573469e-06, 5.645418744981701e-07, 4.745460307237215e-06, -5.597510268573469e-06, -5.645418744981701e-07, -6.651712157745848e-07, 6.750842351778419e-06, 0.0, 0.00548062073181968, -0.07798086728839469, 0.0, -3.903204054994669e-06, -3.4921546817404114e-06, 0.0]) 5 (-115.02560006674966, [-0.0054233976595857575, 0.08166232318137269, 0.0, -1.659239761503395e-06, -4.376603580866223e-07, 4.4580035316599265e-06, -1.659239761503395e-06, -4.376603580866223e-07, -4.4580035316599265e-06, 3.034808945895362e-06, -6.726118036586015e-06, 0.0, 0.005436665955901393, -0.08164730868562775, 0.0, -1.2984625724410392e-05, -7.4130570159938736e-06, 0.0]) [..] 16 (-114.89364787840326, [-0.0005591249462735259, 0.04018795560035916, 0.0, -5.34666220519675e-07, 1.1370871814235517e-06, 4.809133242467123e-07, -5.34666220519675e-07, 1.1370871814235517e-06, -4.809133242467123e-07, -6.9140095421138525e-06, -3.095664552260277e-06, 0.0, 0.0005695756951453745, -0.040185884820554796, 0.0, -2.467406898132296e-06, -1.2492896190128416e-06, 0.0]) 17 (-114.8863872514371, [-0.00036666056940981573, 0.03667976502852128, 0.0, 2.9101399354747315e-06, -2.094045026924257e-06, -4.933288234976185e-06, 2.9101399354747315e-06, -2.094045026924257e-06, 4.933288234976185e-06, 1.6531622304416516e-07, 1.511517903679191e-07, 0.0, 0.00036162347288279384, -0.03668602744257765, 0.0, -9.484995716624312e-07, 1.0299352320775057e-05, 0.0]) 18 (-114.87977711993531, [-0.00018979360652668711, 0.033296276783081655, 0.0, -2.3787379704320877e-06, 1.7510009376556918e-06, 1.3530564600128248e-06, -2.3787379704320877e-06, 1.7510009376556918e-06, -1.3530564600128248e-06, 8.24207064487048e-06, -8.055936327900498e-07, 0.0, 0.00018027576986845428, -0.03329589479259992, 0.0, 6.033241931824307e-06, -3.0783987173960137e-06, 0.0]) 19 (-114.8737952986994, [-4.7287277448850376e-05, 0.030029200359777717, 0.0, -1.3711175166353229e-06, -8.452926738775068e-08, 9.941241931599176e-07, -1.3711175166353229e-06, -8.452926738775068e-08, -9.941241931599176e-07, 8.167348279908282e-07, -2.5820569179275075e-06, 0.0, 4.871429991895604e-05, -0.030027845123621805, 0.0, 4.984777179639632e-07, 1.3958792967685985e-06, 0.0])
Not pretty, but manageable.
cat nwch.nwout |grep '^(-'|sed 's/\,/\t/g;s/(\([^)]*\))/\1/g'|cat -n|gawk '{print $1,$2}' > profile.dat

and then plot it:
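If you'd rather do the plotting from a script than in e.g. gnuplot, here is a minimal matplotlib sketch (my addition, not part of the original workflow) that reads the two-column profile.dat produced by the command above:

#!/usr/bin/env python
# Minimal sketch: plot scan step vs energy from profile.dat
# (two whitespace-separated columns: step number, energy in Hartree).
import matplotlib.pyplot as plt

steps, energies = [], []
with open('profile.dat') as f:
    for line in f:
        fields = line.split()
        if len(fields) >= 2:
            steps.append(int(fields[0]))
            energies.append(float(fields[1]))

plt.plot(steps, energies, 'o-')
plt.xlabel('Scan step')
plt.ylabel('Energy (Hartree)')
plt.savefig('profile.png')

In gnuplot, plot 'profile.dat' w lp does the same job.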


Example 2.
SN2 reaction between iodide and bromomethane

You can set up your calc however you want, but ECCE is easier than anything else.

Draw bromomethane, then throw in an iodine atom. Adjust the angle across Br-C-I to 180 degrees, and set the C to I distance to 3 Å.


Set up the calculation -- in this case I used b3lyp/def2-svpd (as shown in the basis block of the input below)
Edit the input and add
python
from nwgeom import *
geom = '''
    geometry adjust
        zcoord
            bond 1 6 %f cccc constant
        end
    end
'''
results=scan_input(geom,[3.00],[1.5],20,'dft',task_optimize)
for i in range(0,len(results)):
    print results[i][0][0],results[i][1]
end

task python

(Delete 'task dft optimize')
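For reference, here is my reading of the scan_input call (an assumption on my part -- check it against the nwgeom.py shipped with your NWChem contrib scripts):

# scan_input(geom, [3.00], [1.5], 20, 'dft', task_optimize)
#   geom          -- geometry template; %f is replaced by the current bond 1-6 value
#   [3.00]        -- starting value(s) of the scanned coordinate(s), in Angstrom
#   [1.5]         -- final value(s)
#   20            -- number of scan steps
#   'dft'         -- level of theory used at each step
#   task_optimize -- task run at each point (a geometry optimisation here)
# Each entry in results then holds the coordinate value(s) and the energy,
# which is what the print loop in the block above writes out.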

You'll now have the following input file:
scratch_dir /scratch
Title "sn2_br"

Start  sn2_br

echo

charge -1

geometry noautosym units angstrom
 C     0.00000     0.00000     0.00000
 H     -0.675500     -0.675500     0.675500
 H     0.675500     -0.675500     -0.675500
 H     -0.675500     0.675500     -0.675500
 Br     1.10274     1.10274     1.10274
 I     -1.73205     -1.73205     -1.73205
end

ecce_print ecce.out

basis "ao basis" spherical print
  H library "def2-svpd"
  Br library "def2-svpd"
  C library "def2-svpd"
  I library "def2-svpd"
END
ECP
  I library "def2-ecp"
END

dft
  mult 1
  direct
  XC b3lyp
  grid fine
  iterations 99
  mulliken
end

driver
  default
  maxiter 99
end


python
from nwgeom import *
geom = '''
    geometry adjust
        zcoord
            bond 1 6 %f cccc constant
        end
    end
'''
results=scan_input(geom,[3.00],[1.5],20,'dft',task_optimize)
for i in range(0,len(results)):
    print results[i][0][0],results[i][1]
end


task python
Launch it and wait -- eventually (2 h 30 min on a slow three-core node) you'll get an output like the one below. Note that I didn't pre-optimise the bromomethane, so there's a bit of a drop in energy at the beginning. Likewise, I let the C-I distance get so short that the energy is rising rapidly at the end.
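To pull the energy profile out of this job, note that the print loop at the end of the input writes one 'distance energy' pair per scan point straight into the output. Here is a minimal sketch for collecting those lines; the script name is hypothetical, and I'm assuming those are the only lines consisting of exactly two bare numbers -- tighten the filter if your output contains others.

#!/usr/bin/env python
# get_profile.py (hypothetical name): collect the 'coordinate energy' pairs
# printed by the python scan loop in the NWChem output file given as argument.
import re
import sys

# two whitespace-separated decimal numbers and nothing else on the line
pair = re.compile(r'^\s*(-?\d+\.\d+)\s+(-?\d+\.\d+(?:[eE][+-]?\d+)?)\s*$')

with open(sys.argv[1]) as f:
    for line in f:
        m = pair.match(line)
        if m:
            print m.group(1), m.group(2)

Run it as e.g. python get_profile.py your_output_file > profile.dat and plot as in example 1.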
Structure at the beginning

Transition-state-ish structure

Product


Example 3:
Two-dimensional PES scan

I'll keep this brief. First we do a scan where we use 'constant' for the angle, but not the bond length:
scratch_dir //scratch
Title "2d_pes-1"

Start  2d_pes-1

echo

charge 0

geometry noautosym units angstrom
 C     -2.51242e-66     1.67495e-66     -0.767732
 H     -0.722530     0.722530     -1.16548
 H     -0.264464     -0.986995     -1.16548
 H     0.986995     0.264464     -1.16548
 C     2.51242e-66     -2.51242e-66     0.767732
 H     0.264464     0.986995     1.16548
 H     -0.986995     -0.264464     1.16548
 H     0.722530     -0.722530     1.16548
end

ecce_print ecce.out

basis "ao basis" cartesian print
  H library "6-31G"
  C library "6-31G"
END

dft
  mult 1
  direct
  XC b3lyp
  grid fine
  iterations 99
  mulliken
end

driver
  default
end

python
from pes_scan import pes_scan
geom = '''
    geometry noprint adjust
        zcoord
            bond 1 5 %f cc
            angle 2 1 5 %f hcc constant
        end
    end
'''
results = pes_scan(geom, \
    [1.535, 111.269], [1.800, 90], 5, 'dft', task_optimize)
end

task python
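As with scan_input above, here is my reading of the pes_scan call (an assumption -- check pes_scan.py in the NWChem contrib scripts):

# pes_scan(geom, [1.535, 111.269], [1.800, 90], 5, 'dft', task_optimize)
#   geom             -- geometry template; the two %f placeholders take the
#                       current bond 1-5 length and angle 2-1-5, in that order
#   [1.535, 111.269] -- starting values (Angstrom, degrees)
#   [1.800, 90]      -- final values
#   5                -- number of scan steps
#   'dft'            -- level of theory at each step
#   task_optimize    -- task run at each point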

And the output:
What's happening is that, without 'constant', the optimiser is free to relax the scanned bond at every step, so the bond length ends up being the same no matter what we initially set it to.

If we instead set constant for the bond as well:
python
from pes_scan import pes_scan
geom = '''
    geometry noprint adjust
        zcoord
            bond 1 5 %f cc constant
            angle 2 1 5 %f hcc constant
        end
    end
'''
results = pes_scan(geom, \
    [1.535, 111.269], [1.800, 90], 5, 'dft', task_optimize)
end

task python

And we get:

17 June 2013

456. Adding NWChem basis sets to ECCE. Part 2. A solution: nwchem2ecce.py

UPDATED!

I've moved the finished scripts to here:
https://sourceforge.net/projects/nwbas2ecce/

They work! I've also added a number of converted basis sets to the sourceforge repo under 'examples'. You'll also find example ecp and ECPOrbital files.

Phew...

Here's the README:
The programmes are not 'intelligent' -- they won't check that you are doing something reasonable. Bad input = bad output.

__Installation__:
Download eccepag and nwbas2ecce. They are both python (2.7) programmes, so you will need to install python to run them. On linux, this is normally very easy. E.g. on debian, run 'sudo apt-get install python2.7' and you are done.

If you want, you can put the files in /usr/local/bin and do
'sudo chmod +x /usr/local/bin/eccepag'
'sudo chmod +x /usr/local/bin/nwbas2ecce'
and you will be able to call the scripts from any directory.

__Usage__
nwbas2ecce can turn a full basis set, or an ECP basis set, into an ECCE-compatible set of basis set files.

Typically, an nwchem basis set consists of a single file, e.g. 3-21g. It can also be divided into several files, e.g. def2-svp and def2-ecp, where the effective core potentials (ECPs) are in def2-ecp. Other basis set files, like lanl2dz_ecp, contain both the orbital and the contraction parts.

Typically, an ECCE basis set suite consists of:
basis.BAS
basis.BAS.meta
basis.POT (for ECP)
basis.POT.meta (for ECP)

Sometimes polarization and diffuse functions are separated from the main .BAS file. E.g. 3-21++G* consists of
3-21G.BAS
3-21GS.BAS
POPLDIFF.BAS
in addition to the meta files. The meta files are just markup-language type files with e.g. references. Note that you don't HAVE to break up the basis set components like this.

Since the basis set data can be broken up into smaller files, the overall basis set is defined as an entry in a category file. For example, 3-21G is defined in the category file 'pople', and points to 3-21G.BAS. 3-21G* is also defined in pople, but points to both 3-21G.BAS and 3-21GS.BAS. ECP works in a similar way, by combining a .BAS and a .POT file. Note that the .POT files look different from the .BAS files.

nwbas2ecce generates .BAS and .POT files based on whether there are basis/end or ecp/end sections in the nwchem basis set file. If there are both, both POT and BAS files are generated. All these files are contained in server/data/Ecce/system/GaussianBasisSetLibrary

Finally, you need to generate .pag and .dir files that go into the server/data/Ecce/system/GaussianBasisSetLibrary/.DAV directory. The .dir file is always empty, while the .pag file is unfortunately a binary file. eccepag can, however, generate it with the right input. See e.g. http://verahill.blogspot.com.au/2013/06/455-adding-nwchem-basis-sets-to-ecce.html for more detailed information.

__Example__
We'll use def2-svp as an example. The nwchem basis set file def2-svp contains the basis set, while def2-ecp contains the core potentials. Use def2-svp to generate DEF2_SVP.BAS and DEF2_SVP.BAS.meta. Use def2-ecp to generate DEF2_ECP.POT and DEF2_ECP.POT.meta. As part of the generation, .descriptor files are also generated. These contain information that should go into the category file(s). Then generate the .pag files for both the POT and the BAS files, and touch the .dir files into existence.

Do like this:
nwbas2ecce -i def2-svp -o DEF2_SVP.BAS -n 'def2-svp'
nwbas2ecce -i def2-ecp -p DEF2_ECP.POT -n 'def2-ecp'
eccepag -n def2-svp -t ECPOrbital -c ORBITAL -y Segmented -s Y -o DEF2_SVP.BAS.pag
eccepag -n def2-ecp -t ecp -c AUXILIARY -o DEF2_ECP.POT.pag

NOTE: I don't actually know if def2-svp is segmented, and spherical. I don't think it matters for the .pag file generation. Also note that most inputs are case sensitive. Look at a similar .pag file for hints.
You now have the following files:
DEF2_ECP.POT
DEF2_ECP.POT.descriptor
DEF2_ECP.POT.meta
DEF2_ECP.POT.pag
DEF2_SVP.BAS
DEF2_SVP.BAS.descriptor
DEF2_SVP.BAS.meta
DEF2_SVP.BAS.pag

Copy the files. Note that you need to select the correct target directory, and that will vary with where you installed ECCE. I'll assume it's in /opt/ecce.

cp DEF2* /opt/ecce/server/data/Ecce/system/GaussianBasisSetLibrary
cd /opt/ecce/server/data/Ecce/system/GaussianBasisSetLibrary
mv *.pag .DAV/
touch .DAV/DEF2_SVP.BAS.dir .DAV/DEF2_ECP.POT.dir
cat DEF2_SVP.BAS.descriptor >> ECPOrbital
cat DEF2_ECP.POT.descriptor >> ECPOrbital
cat DEF2_ECP.POT.descriptor >> ecp

Edit ECPOrbital so that it reads:
name= def2-svp
files= DEF2_SVP.BAS DEF2_ECP.POT
atoms= H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn
atoms= Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn

11 June 2013

446. B3LYP and WAH -- the confusion

Quite a while back I was looking at the WAH (Wilson-Amos-Handy) functional, and while it turned out to be a bit more complicated than I had hoped, it led me to type up a brief discussion about b3lyp in different computational packages.

The issue is that there are several different definitions, and that even if a paper is kind enough to provide the Becke 1993 communication as a reference, this is rarely the actual form of b3lyp used. In fact, the LYP part speaks directly against it. As someone who isn't well-versed in the computational and theoretical arts I do think it would be nice if we could get to the point where we can get useful information from point-and-click computations, but as exchange-correlation functionals are essentially fudge factors, this probably won't happen for some time. In other words, while 6-31G/B3LYP may be a winning combination for computing the electronic energies of a limited range of (mostly organic) small molecules in the gas phase, it doesn't always yield anything useful about real-world systems.

NOTE: I wrote the original text as I was trying to figure out what WAH was -- and I thought at that point that it was a simple form of B3PW91 with tweaked prefactors. It isn't -- it requires changes in the way the GIAOs are computed (I think). So don't focus on the WAH discussion.

Anyway, here's the story as a bench chemist (i.e. not a computational or theoretical chemist) understands it, in the context of trying to understand what the WAH exchange-correlation functional looks like.

The WAH functional is a hybrid exchange-correlation functional which was developed to provide accurate NMR shift calculations.

Note that the WAH functional is correctly implemented in PQS -- see the manual.

But the story is really about B3LYP...


1. Definition of the WAH exchange-correlation functional:
The definition[1] consists of
We found that by using hybrid Kohn-Sham orbitals and eigenvalues with an adjusted 'exact-exchange' coefficient Cx, the NMR shielding parameters gave an accuracy approaching the best coupled-cluster calculations for molecules containing first and second row atoms. For B3LYP[2] we find Cx=0.05[..] give(s) best values (note that the coefficient of Local Density Exchange is (1-Cx) in the amended B3LYP. We name the resulting NMR values B3LYPGGA0.05[..].
This is the paper which is cited by the PQS manual. The definition is brief, but not unreasonable.

2 Definition of B3LYP
[The first-person account of the background to B3LYP is found here: http://www.ccl.net/chemistry/resources/messages/2002/05/22.008-dir/]

In the paper which is cited as a source of B3LYP, Becke defined[2] a functional as
E_{XC} = E_{XC}^{LSDA} + a_0 (E_x^{HF} - E_x^{LSDA}) + a_x ΔE_x^{Becke88} + a_c ΔE_c^{PW91}   (eq 1)
where E_{XC} denotes the exchange-correlation functional, E_x the exchange functional, E_c the correlation functional, Δ a non-local contribution, LSDA the local spin-density approximation, Becke88[3] the gradient-corrected LDA exchange and PW91 the Perdew-Wang 1991 gradient correction[4]. The B3 in B3LYP refers to the three parameters it involves: a_0=0.2, a_x=0.72 and a_c=0.81. Note that a_0 is the same as C_x above. We'll refer to equation 1 as B3PW91.

LSDA is poorly defined but is normally taken to be the SVWN of the form
E_{XC}^{LSDA} = E_x^{LSDA} + E_c^{LSDA}   (eq 2)
             = E_x^{Slater} + E_c^{VWN}   (eq 3)
although there's a slew of Vosko-Wilk-Nusair (VWN) functionals -- most sources suggest that Becke referred to VWN5, while my reading of the literature is a bit different (Becke states he uses the electron-gas parametrization in [4]). Either way, equation 1 now becomes
E_{XC} = a_0 E_x^{HF} + (1 - a_0) E_x^{Slater} + E_c^{VWN} + a_x ΔE_x^{Becke88} + a_c ΔE_c^{PW91}   (eq 4)
This is implemented as the hybrid exchange-correlation functional acm (the adiabatic connection method) in NWChem, using VWN5. A very brief summary of Becke '93 vs Gaussian '92 (don't ask me about the chronology) is also given by Stephens et al. in J. Phys. Chem. 1994, 98(45), p. 11624.

2.1 Gaussian '92
You may at this point be forgiven for asking yourself why it is called B3LYP and not B3PW91. In 1991 Gaussian hadn't yet implemented PW91 (fair enough) and substituted it with the Lee-Yang-Parr (LYP) correlation functional (ΔE_c^{LYP}). Since it's difficult to separate out the local component (and we want the non-local component, as indicated by Δ), they wrote
ΔE_c^{LYP} = E_c^{LYP} - E_c^{VWN}   (eq 5)
which turns equation 4 into
E_{XC} = a_0 E_x^{HF} + (1 - a_0) E_x^{Slater} + a_x ΔE_x^{Becke88} + a_c E_c^{LYP} + (1 - a_c) E_c^{VWN}   (eq 6)
In addition, the original Gaussian implementation used VWN_1_RPA, which sources tell me is 100% wrong when taking Becke's intentions into account:
E_{XC} = a_0 E_x^{HF} + (1 - a_0) E_x^{Slater} + a_x ΔE_x^{Becke88} + a_c E_c^{LYP} + (1 - a_c) E_c^{VWN_1_RPA}   (eq 7)
To make matters worse, today Gaussian uses VWN_3, and it seems they know how to get the non-local component of LYP directly (see below). Equation 7 is what you get if you use B3LYP in most software packages (though not all -- e.g. Gamess US uses VWN5 instead of VWN_1_RPA). So in G09 it's now
E_{XC} = a_0 E_x^{HF} + (1 - a_0) E_x^{Slater} + a_x ΔE_x^{Becke88} + E_c^{VWN_3} + a_c E_c^{LYP}   (eq 8)

2.2 PQS
PQS uses the old gaussian version (eq. 7) as the b3lyp functional, but it doesn't explicitly state which form -- b3lyp or b3pw91 -- is used for the WAH functional.


3 So what definition did WAH use?
All we really care about is reproducing the original paper by Handy et al. -- not whether Becke would approve or not. But here's where citing papers you may not have read becomes a problem.

Wilson, Amos and Handy cite the 1993 paper by Becke which defines the canonical version of B3LYP (i.e. B3PW91), which should settle it in favour of WAH being defined as shown in equation 4.

However, they used CADPACK, which implements it as in equation 7. Reading the CADPACK manual I can see what Handy et al. probably did: the way you define custom parameters for hybrid functionals in CADPACK is by doing
hybrid a0 ax ac
so that they during the development of their functional most likely typed in
b3lyp 0.05 0.72 0.81
which meant they probably used
E_{XC} = 0.05 E_x^{HF} + 0.95 E_x^{Slater} + 0.72 ΔE_x^{Becke88} + 0.81 E_c^{LYP} + 0.19 E_c^{VWN_1_RPA}   (eq 9)

4 Implementing it in your package of choice
[NOTE: this will NOT set up WAH correctly -- I'm leaving it as it shows how to set up custom XCs in nwchem, G09 and Dalton]

4.1 NWCHEM
The canonical version of Becke's functional, B3PW91, is implemented as acm in NWChem, and can be entered manually as
xc HFexch 0.2 slater 0.8 becke88 nonlocal 0.72 vwn_5 1 Perdew91 0.81
while the Gaussian '92 form is manually entered as
xc HFexch 0.2 slater 0.8 becke88 nonlocal 0.72 vwn_1_rpa 0.19 lyp 0.81
This means that the two possible forms of WAH are:
xc HFexch 0.05 slater 0.95 becke88 nonlocal 0.72 vwn_5 1 Perdew91 0.81
and
 xc HFexch 0.05 slater 0.95 becke88 nonlocal 0.72 vwn_1_rpa 0.19 lyp 0.81
We'll refer to them as B3PW91_0.05 and B3LYP_0.05, respectively.
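If you end up generating these xc lines for several values of a_0, a small helper cuts down on typos. A minimal sketch (my addition; it simply reproduces the two patterns above):

#!/usr/bin/env python
# Build the NWChem 'xc' line for the two B3-type mixes discussed above.
# a0 = exact-exchange coefficient, ax = Becke88 non-local, ac = correlation weight.
def nwchem_xc(a0=0.20, ax=0.72, ac=0.81, lyp_form=False):
    if lyp_form:
        # Gaussian '92-style mix: LYP plus VWN_1_RPA for the local part (eq 7)
        return ('xc HFexch %.2f slater %.2f becke88 nonlocal %.2f '
                'vwn_1_rpa %.2f lyp %.2f' % (a0, 1 - a0, ax, 1 - ac, ac))
    # Becke's B3PW91-style mix (eq 4 / acm)
    return ('xc HFexch %.2f slater %.2f becke88 nonlocal %.2f '
            'vwn_5 1 Perdew91 %.2f' % (a0, 1 - a0, ax, ac))

if __name__ == '__main__':
    print nwchem_xc(0.05)                  # B3PW91_0.05
    print nwchem_xc(0.05, lyp_form=True)   # B3LYP_0.05

With the defaults it reproduces the acm and Gaussian '92 b3lyp lines; with a0=0.05 it gives the two lines above.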

4.2 Gaussian 09
Gaussian is a lot less elegant. The canonical version of Becke's functional, B3PW91, is implemented in Gaussian as B3PW91, and can be entered manually as
BPW91 IOp(3/76=1000002000) IOp(3/77=0720008000) IOp(3/78=0810010000)
while the old Gaussian form is manually entered as
BLYP IOp(3/76=1000002000) IOp(3/77=0720008000) IOp(3/78=0810001900)
This means that the two forms of WAH are:
BPW91 IOp(3/76=1000000500) IOp(3/77=0720009500) IOp(3/78=0810010000)
and
BLYP IOp(3/76=1000000500) IOp(3/77=0720009500) IOp(3/78=0810001900)
We'll refer to them as B3PW91_0.05 and B3LYP_0.05, respectively.
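The IOp strings look opaque, but the four examples above suggest that each overlay simply packs two coefficients as five-digit integers (coefficient x 10000). The sketch below is my reading of that pattern only, reverse-engineered from the examples in this post -- consult the Gaussian IOp documentation before trusting it for anything else:

#!/usr/bin/env python
# Sketch: rebuild the IOp strings above from (a0, ax, ac). This only
# reproduces the digit pattern seen in this post's examples; the actual
# meaning of each overlay is documented by Gaussian, not here.
def iop_b3(a0=0.20, ax=0.72, ac=0.81, lyp_form=False):
    def pack(c1, c2):
        return '%05d%05d' % (round(c1 * 10000), round(c2 * 10000))
    lines = ['IOp(3/76=%s)' % pack(1.0, a0),
             'IOp(3/77=%s)' % pack(ax, 1.0 - a0)]
    if lyp_form:
        lines.append('IOp(3/78=%s)' % pack(ac, 1.0 - ac))
    else:
        lines.append('IOp(3/78=%s)' % pack(ac, 1.0))
    return ('BLYP ' if lyp_form else 'BPW91 ') + ' '.join(lines)

if __name__ == '__main__':
    print iop_b3(0.05)                 # B3PW91_0.05
    print iop_b3(0.05, lyp_form=True)  # B3LYP_0.05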

4.3 Dalton
The notations are (more or less)
Combine HF=0.20 Slater=0.80 Becke=0.72 PW91c=0.81 VWN5=1
Combine HF=0.20 Slater=0.80 Becke=0.72 LYP=0.81 VWN=0.19
Combine HF=0.05 Slater=0.95 Becke=0.72 PW91c=0.81 VWN5=1
Combine HF=0.05 Slater=0.95 Becke=0.72 LYP=0.81 VWN=0.19
As an aside, I don't think anyone at this point would be surprised to learn that B3PW91 in Dalton and Gaussian are two completely different exchange-correlation functionals...

Performance
NOTE: the results don't quite make sense to me anymore -- using the same XCs, basis sets and structures one would expect to get exactly the same results from all three packages. I don't know why that wasn't the case, assuming that I implemented the XCs correctly, and assuming that the basis sets really are the same (which they often actually aren't). So keep that in mind.

Original:
I used the same equilibrium structures as Handy (i.e. those of Cybulski), and used the aug-cc-pVTZ basis set with a fine DFT grid (not the same as Handy, but it should be as good; I've also done this with def2-qzvp and 6-31+g*). All calculations are for the gas phase. Handy's values are for the Huzinaga IV basis set in their GIAO paper[5], but it doesn't really matter much which basis set is chosen. The results are tabulated in table 1.

Note that Handy also saw chemical shifts for the oxygen in carbon monoxide which were around -80 ppm. Basically, I can reproduce everything except for his B3LYPGGA0.05 values. I did a few test runs with Huz-IV in Dalton and it was just as bad as the other methods.

Table 1: Calculated gas-phase NMR shifts using different combinations of exchange-correlation functionals and the aug-cc-pVTZ basis set. Note that Handy's experimental data are not in agreement with my sources, which give 1.0, -42.3 and 344 ppm for the carbon in CO, the oxygen in CO and the oxygen in H2O, respectively.
Package    b3lyp(a)   gb3lyp(b)   acm(c)    b3pw91(b)   b3lyp_0.05   b3pw91_0.05   Handy    Expt.

Carbon in CO (ppm)
NWChem     -10.32     -10.32      -8.80     -8.80       -8.08        -6.55          5.6      2.8
G09         -7.71      -7.71      -9.95     -9.95       -5.66        -7.74
Dalton     -10.21     -10.21       N/A     -12.65       -7.97       -10.38

Oxygen in CO (ppm)
NWChem     -71.51     -71.51     -72.34    -72.34      -70.06       -70.93        -40      -36.7
G09        -78.82     -78.82     -77.89    -77.89      -77.57       -76.86
Dalton     -71.50     -71.50      N/A      -72.03      -70.04       -70.89

Oxygen in water
NWChem     328.45     328.45     329.28    329.28      329.79       330.48        327      358
G09         N/A       328.76     328.39    328.39      328.95       328.31
Dalton     328.50     328.50      N/A      328.02      329.85       329.07

(a) Note that G09 uses eq. 8 by default (gives -11.70, -77.63, and 328.34 ppm), but I forced it to use eq. 7. (b) These were my manual implementations of 'b3lyp' and 'acm' to make sure I got things right. (c) This is acm in NWChem and B3PW91 in Gaussian 09.


References:
[1] P. J. Wilson, R. D. Amos, N. C. Handy, Chem. Phys. Lett. 1999, 312, 475-484.
[2] A. D. Becke, J. Chem. Phys. 1993, 98, 5648-5652.
[3] A. D. Becke, Phys. Rev. A 1988, 38, 3098-3100.
[4] J. P. Perdew, Y. Wang, Phys. Rev. B 1991, 45, 13244-13249.
[5] T. Helgaker, P. J. Wilson, R. D. Amos, N. C. Handy, J. Chem. Phys. 2000, 113, 2983-2989.