Showing posts with label gaussian. Show all posts
Showing posts with label gaussian. Show all posts

31 January 2017

633. Multiwfn on debian, with gaussian

Not much to say. I wanted to do some Bader-type AIM analyses, and I don't have AIM2000/AIMPAC. Multiwfn can do this, and a lot more.


Setting up multiwfn
Download the precompiled Multiwfn binaries here: https://multiwfn.codeplex.com/

Extract the binary and the copy the directory to .e.g /opt

$ ls /opt/Multiwfn_3.3.9_bin_Linux/
examples  Multiwfn  settings.ini

Edit your ~/.bashrc and add:
export KMP_STACKSIZE=64000000
export Multiwfnpath=/opt/Multiwfn_3.3.9_bin_Linux

When I hit 0 to visualize something, the plot screen is empty. Check/uncheck a box in the window and the plot will show.
Generating wfx files in gaussian:
If you've already run a job that has generated a .chk file, then run a simple quick job like this:
%nprocshared=2 %mem=500Mb %chk=test.chk #P chkbasis geom=allcheck guess=(check,only) output=wfx pop=(none) test.wfx

To analyse the .wfx file using multiwfn, look at the examples in the manual: https://www.codeplex.com/Download?ProjectName=multiwfn&DownloadId=1607667
The strength of multiwfn is that it can do a LOT. That's also its weakness, in the sense that the menu options are a bit overwhelming. Following the examples in the manual is probably mandatory.

18 January 2016

626. Briefly: Gaussian and cloud computing -- Gaussian G09D with Slurm on aws/ec2

Note: you may want to install awscli and euca2ools. I didn't, so I don't actually know whether they are useful.

My instructions are quite rudimentary since I don't have much time to write these blog posts anymore. Hopefully there's enough information to get you through.


AWS
Either way, sign up for AWS. If you already have an amazon ID I think you can use that. Go to https://aws.amazon.com/

Select Launch an Instance and pick the ubuntu AIM and do Launch and Review. I launched it as a t2.micro instance type, as it is free and it's sufficient for set up but not to run jobs.

Hit launch, and create a new key pair. I called mine myfirstkeypair and saved the pem file in my ~/Downloads folder

In my Downloads folder:
ssh -i "myfirstkeypair.pem" ubuntu@ec2-11-222-33-444.us-west-2.compute.amazonaws.com
I then set a password in the ubuntu AWS image:
sudo passwd ubuntu

I added my id_rsa.pub to ~/.ssh/authorized_keys on the ubuntu AWS image to make logging in via ssh easier -- that way I won't need the pem file.


Set up Gaussian
I then connected with SCP and uploaded my gaussian files -- I went straight for EM64T G09D. It went quite fast at +5 MB/s

scp E6L-103X.tgz ubuntu@ec2-00-111-22-333.us-west-2.compute.amazonaws.com:/home/ubuntu/E6L-103X.tgz

Once that was done, on the ubuntu AWS instance I did:
sudo apt-get install csh 
sudo mkdir /opt/gaussian
cd /opt 
sudo chown ubuntu gaussian -R
cd /opt/gaussian
cp ~/E6L-103X.tgz .
tar xvf E6L-103X.tgz
cd g09
csh bsd/install

echo 'export GAUSS_EXEDIR=/opt/gaussian/g09/bsd:/opt/gaussian/g09/local:/opt/gaussian/g09/extras:/opt/gaussian/g09' >> ~/.bashrc
echo 'export GAUSS_SCRDIR=/home/ubuntu/scratch' >> ~/.bashrc
echo 'export PATH=$PATH:/opt/gaussian/g09' >> ~/.bashrc
source ~/.bashrc 
mkdir ~/scratch ~/jobs

NOTE that you can't run any gaussian jobs under a t2.micro instance. You will have to stop and relaunch as at least a t2.small instance or the jobs will be 'Killed' (that's what is echoed in the terminal when you try to run)
Note that if you terminate an image it will be deleted.

Stop the image and then create a snapshot or an image from it to keep everything you've installed.

Set up Slurm
You'll want a queue manager so that you can launch several jobs in serial. Also, you can set up your script so that it shuts down the image when your job is done to save money.

sudo apt-get update
sudo apt-get install slurm-llnl

ControlMachine=localhost ControlAddr=127.0.0.1 MpiDefault=none ProctrackType=proctrack/pgid ReturnToService=2 SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd SlurmUser=slurm StateSaveLocation=/var/lib/slurm-llnl/slurmctld SwitchType=switch/none TaskPlugin=task/none FastSchedule=1 SchedulerType=sched/backfill SelectType=select/linear AccountingStorageType=accounting_storage/none ClusterName=rupert JobAcctGatherType=jobacct_gather/none SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log SlurmdLogFile=/var/log/slurm-llnl/slurmd.log NodeName=localhost NodeAddr=127.0.0.1 PartitionName=All Nodes=localhost
sudo /usr/sbin/create-munge-key
Edit /etc/default/munge:
OPTIONS=--force
Then run
sudo service slurm-llnl restart
sudo service munge restart 
Test using slurm.batch
#!/bin/bash # #SBATCH -p All #SBATCH --job-name=test #SBATCH --output=res.txt # #SBATCH --ntasks=1 #SBATCH --time=10:00 srun hostname srun sleep 60
and submit with
sbatch slurm.batch
 squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 2       All     test   ubuntu  R       0:08      1 localhost

Benchmark:
#!/bin/csh #SBATCH -p All #SBATCH --time=9999999 #SBATCH --output=slurm.out #SBATCH --job-name=benchmark setenv GAUSS_SCRDIR /home/ubuntu/scratch setenv GAUSS_EXEDIR /opt/gaussian/g09/bsd:/opt/gaussian/g09/local:/opt/gaussian/g09/extras:/opt/gaussian/g09 /opt/gaussian/g09/g09< benchmark.in > benchmark.out
Using the same opt/freq benchmark as in post 621.

c4.2xlarge 2h 11 min [1h 20 min] 8 vcpu/16 Gb
c4.4xlarge 1h 15 min [     44 min] 16 vcpu/32 Gb
c4.8xlarge      41 min [     25 min] 36 vcpu/60 Gb

It scales surprisingly well, although not perfectly linearly. It's clear that it's cheaper to use a smaller instance, so if time isn't critical or the larger memory isn't needed, c4.8xlarge is not the first choice.

Dropbox:
You might want to use dropbox to transfer files back and forth, especially finished job files (useful if you shut down the machine using a slurm script as shown below)

cd ~ && wget -O - "https://www.dropbox.com/download?plat=lnx.x86_64" | tar xzf -
~/.dropbox-dist/dropboxd
This computer isn't linked to any Dropbox account... Please visit https://www.dropbox.com/cli_link_nonce?nonce=0011223344556677889900aabbccddeef to link this device. This computer isn't linked to any Dropbox account...

Open that link in a browser, then go back to the terminal.
 
wget -O - https://www.dropbox.com/download?dl=packages/dropbox.py > dropbox.py
sudo mv dropbox.py /usr/local/bin
sudo chmod +x d/usr/local/bin/dropbox.py
dropbox.py autostart y

Now, since you don't want to use up space unnecessarily (you're paying for it after all), exclude as many directories as possible. To exclude all existing dropbox dirs, do
 
cd ~/Dropbox
dropbox.py exclude add `ld -d */`
dropbox.py exclude add `ld *.*`
dropbox.py exclude list

Note that it can't handle directories with spaces in the name, so you'll need to polish the list by hand. Next create a directory where you want to run and store your jobs,e .g.
mkdir ~/Dropbox/aws_jobs

When you run a gaussian job, make sure to specify where the .chk files should end up, e.g.
%chk=/home/ubuntu/scratch/benchmark.chk
so that you don't use up space/bandwidth for your chk files (unless of course you want to).

Stop after execution:
Use a batch script along these lines:
#!/bin/csh #SBATCH -p All #SBATCH --time=9999999 #SBATCH --output=slurm.out #SBATCH --job-name=benchmark setenv GAUSS_SCRDIR /home/ubuntu/scratch setenv GAUSS_EXEDIR /opt/gaussian/g09/bsd:/opt/gaussian/g09/local:/opt/gaussian/g09/extras:/opt/gaussian/g09 /opt/gaussian/g09/g09< benchmark.in > benchmark.out rm /home/ubuntu/scratch/*.* sudo shutdown -h now

15 October 2015

624. Gaussian fails with "traps: l502.exe[12449] general protection ip:d75df7 sp:7f7de40dcce0 error:0 in l502.exe[400000+dc8000]" on i7-5820K

Update: 
* The systems is rock solid with nwchem and ADF.  Only G09 crashes
* Gaussian has now released G09E, and the release notes say: "A Sandybridge/Haswell binary distribution is also available". Remains to be found out if this new version solves the issue. I can't check, as I don't have access to that version.

Update 18 Oct 2015:
TL;DR version: G09D (EM64T and AMD64) crash within the first 30 min to 4 hours. An NWChem job has so far run 6 days without crashing and is still going strong.

Original post:
This isn't much of a post yet. I'm mostly posting this so that people searching online will see that they aren't alone.


I just built a new node:
AU$559 Intel BX80648I75820K 6 Core i7-5820K 3.3Ghz 15MB LGA-2011-V3 (No Heatsink)
AU$407 Gigabyte X99-SLI Intel X99 S2011-3 8xDDR4/4xPCI-E/Intel GBLan/ATX Motherboard
AU$50 DeepCool FrostWin v2.0 CPU cooler
AU$155x2 Patriot 16G Kit (8Gx2) DDR4 2133 Desktop RAM
AU$185 Antec HCG-900 High Current Gamer Gaming PSU
AU$39 Gigabyte N210SL-1GI 1GB GT210 PCI-E VGA Card
AU$68 Seagate 3.5" Barracuda 1TB ST1000DM003 SATA3 7200RPM 64MB HDD (carbon)
AU$76 Antec GX500B-W Dominator Window USB3.0 Gaming Case without PSU

I've got an installation of up-to-date Jessie on it, with the following kernel: Debian 3.16.7-ckt11-1+deb8u5 (2015-10-09) x86_64 GNU/Linux.


When running G09D rev. 01 EM64T I keep getting random errors along these lines (these are collected over a couple of days and between restarts):
[100433.566789] traps: l703.exe[11236] general protection ip:df18ca sp:7fc96f595268 error:0 in l703.exe[400000+a46000] [ 2587.899019] traps: l703.exe[3727] general protection ip:9c9757 sp:7fa6fc436ce0 error:0 in l703.exe[400000+a46000] [26439.755347] traps: l502.exe[3235] general protection ip:ab8a55 sp:7ffe29504c10 error:0 in l502.exe[400000+dc8000] [43030.457126] traps: l502.exe[427] general protection ip:11565a7 sp:7f2ec1fff268 error:0 in l502.exe[400000+dc8000] [ 2587.899019] traps: l703.exe[3727] general protection ip:9c9757 sp:7fa6fc436ce0 error:0 in l703.exe[400000+a46000] [37460.207608] traps: l703.exe[14649] general protection ip:a38ae0 sp:7f1a813cf8c0 error:0 in l703.exe[400000+a46000] [ 8865.403861] traps: l502.exe[12449] general protection ip:d75df7 sp:7f7de40dcce0 error:0 in l502.exe[400000+dc8000]


Sometimes the crashes happen after 30 minutes, sometimes after 3 hours. Most happen within four hours. I seem to remember that one ran up to 12 hours, but nothing's gone beyond that. Some short (1h 30 min) calculations have managed to run to completion.

I've checked each RAM stick with memtest+ -- they are fine -- and they are distributed as recommended in the motherboard manual.

The temperature is running below 40 degrees Celsius.

The harddrive is fine according to SMART.

I log everything every two minutes, and so can go back and look at what happened right before the crash, but there's nothing odd.


My current best three hypotheses are:

* There's an issue with G09D EM64T and the new generation of LGA2011-v3 i7 intel cpus specifically

* There's an issue with any version of G09D and the new generation of LGA2011-v3 i7 intel cpus

* There's an issue with my system which is independent of G09D.

To test, I'll be:
* Do runs with G09D rev. 01 AMD64
                         Done -- this also crashed.

* Do runs with NWChem 6.5 (ifort, mkl)
                         Running -- 6 days so far without a crash!

* Update the BIOS (long shot)

* Remove CPU and check for bent pins (long, arduous shot)

I'll be posting updates...

12 May 2014

575. Gaussview: CConnetctionGFCHK::Parse_GFCHK() Missing or bad data: Alpha Orbital Energies Line Number XXXX

I've been getting a fair number of errors when trying to open .fchk files with gaussview 4.x that I've generated using g09. In particular, I've been getting this:



Thinking that it might have something to do with the version of gaussview being too old, I tried gaussview 5.x, which throws the same error. Gaussview (4.x), by the way, runs fine in wine 1.7.

Turns out it's a poorly written piece of software -- gaussview can't properly parse output generated by gaussian. Great...neither piece of software is cheap. Better yet, gabedit does not have any issue reading the unmodified fchk file.

If you do need to use gview though, the solution is outlined here: http://www.ccl.net/cgi-bin/ccl/message-new?2012+07+18+005

Briefly, edit your fchk file and change the "Number of basis functions" to the same number as is shown for the "Number of independent functions" i.e. change
1 alpha 2 FOpt RB3PW91 Gen 3 Number of atoms I 101 4 Info1-9 I N= 9 5 89 86 0 0 0 100 6 2 18 -602 7 Charge I 7 8 Multiplicity I 1 9 Number of electrons I 530 10 Number of alpha electrons I 265 11 Number of beta electrons I 265 12 Number of basis functions I 2015 13 Number of independent functions I 2005 14 Number of point charges in /Mol/ I 0 15 Number of translation vectors I 0
to
1 alpha 2 FOpt RB3PW91 Gen 3 Number of atoms I 101 4 Info1-9 I N= 9 5 89 86 0 0 0 100 6 2 18 -602 7 Charge I 7 8 Multiplicity I 1 9 Number of electrons I 530 10 Number of alpha electrons I 265 11 Number of beta electrons I 265 12 Number of basis functions I 2005 13 Number of independent functions I 2005 14 Number of point charges in /Mol/ I 0 15 Number of translation vectors I 0
And that's it.

15 February 2014

552. Very briefly: Enthalpy of correction and different number of reactants and products in nwchem and gaussian

Those who are well-versed in the computational arts won't get much out of this post. On the other hand, happy amateurs like myself might find it a bit more useful as it clarifies something that's been bothering  me.

Short version: Hcorr in Nwchem and Gaussian include PV.

As usual, I am not an expert in either computational or theoretical chemistry. I try to use it as a tool, and I try to use it as well as I can. But I am not an authority. Also, if you consult older posts on the blog you'll find plenty of examples of me misunderstanding basic computational concepts (with 550+ posts it's difficult to go back and erase all the embarrassing little gems).


The background
I had a bit of a fright the other day. I'm currently working on computationally characterising a system which undergoes a reaction that can lead to a large number of isomers. Only one of them is experimentally observed, and so it was of interest to see whether this is the thermodynamically favoured product as predict by reasonably cheap computational methods.

Because DFT calculations like these are based on gas phase reactions (even if you use a solvation model) the free energy that you get is based on the standard conditions for gas phase corrections i.e. 1 bar of partial pressure of each reactant.

If you want to calculate the free energy of reaction in solution you need to use concentrations instead. As you will see, you'll only have to worry about this for reactions that involve an unequal number of reactants and products. Normally your best results will be obtained by using isodesmic reaction schemes anyway, which is a great way of avoiding this. However, if you do have unequal numbers of reactants and products you /must/ correct for it when making solution phase predictions.

A gas at 1 bar of pressure is ca
V=n*R*T/P= 1*8.314*298.15/101350=0.024458 m^3= 24.46 litres


So for an example reaction like this:
A+B -> C


Using

G_SATP=G(gas phase) + R*T*ln Q
=G(gas phase)+ R*T*ln([C]/([A]*[B])
=G(gas phase)+ R*T*ln((1/24.46)/((1/24.46)*(1/24.46)))
=G(gas phase)+1.89 kcal/mol 

In other words: you can't ignore the standard state change when doing solution phase calculations. This is obviously of extra importance in pH calculations, which are notoriously tricky.

Enthalpy
So knowing the need for standard state corrections I was a bit paranoid about how to treat the reaction enthalpy and came across this document: http://chemistry.illinoisstate.edu/standard/che38037/handouts/380.37assign3.pdf

On page, equation 2 states that for Gaussian
Ee+Hcorr=Ee+Hvib+Hrot+Htrans

(where Ee is the electronic energy) which is an expression that doesn't include Δ(PV). In that case
Δ(H)=Σ(Hproducts)-Σ(Hreactants)+Δn*R*T
where Δn is the difference in number of moles of products and reactants.

Consulting the excellent Gaussian thermochemistry whitepaper (http://www.gaussian.com/g_whitepap/thermo.htm) offers the following:
Hcorr=Etot+kBT
and
The Gibbs free energy includes Δ(PV) , so when it's applied to calculating ΔG for a reaction, ΔNRT=ΔPV is already included. This means that ΔG will be computed correctly when the number of moles of gas changes during the course of a reaction.
[Note that H=Ee+Hcorr=Ee+Etot+kBT]

At a first glance, kBT isn't equivalent to RT, but in fact kB=R/NA -- in the words of Wikipedia: "[The gas constant R] is equivalent to the Boltzmann constant, but expressed in units of energy".

In other words, Δ(PV) is already accounted for in Hcorr in gaussian.

Somewhat clearer: the Freq() page, http://www.gaussian.com/g_tech/g_ur/k_freq.htm, on the gaussian website now states
Sum of electronic and zero-point Energies=-527.492585  E0=Eelec+ZPE
Sum of electronic and thermal Energies= -527.489751     E= E0+ Evib+ Erot+Etrans
Sum of electronic and thermal Enthalpies=-527.488807  H=E+RT

We test that claim:
E=-527.489751 Hartree
RT=(1.987*298.15/1000)/627.503=9.4410e-04 Hartree
E+RT=-527.489751+9.4410e-04=-527.488807

It holds up. 

A similar example from an nwchem calculation:
Zero-Point correction to Energy = 120.416 kcal/mol ( 0.191895 au)
Thermal correction to Energy = 127.114 kcal/mol ( 0.202569 au)
Thermal correction to Enthalpy = 127.706 kcal/mol ( 0.203513 au)
(0.203513-0.202569)=9.4400e-04 Hartree(1.987*298.15/1000)/627.503

Nwchem also includes Δ(PV) in the thermal correction to enthalpy.

05 December 2013

534. Adding new options to ECCE -- adding solvation models for G03 (G09)

To minimize the amount of manual editing of my input files I've started to modify the menus in ECCE. One thing that was missing before was the option of selecting a solvation model for Gaussian jobs. I've added that now.

I will eventually submit my changes upstreams, but for now I'll just post a clunky list of changes that I've made:
There are two files to edit:

apps/scripts/codereg/ged03theory.py.

Add the following right after the DFT options section and before the MP options section (ca line number 273 in my case)
#Theory options solvation -CAO sovSizer = EcceBoxSizer(self, label = "Solvation", cols = 2) sovLeftSizer = EcceVBoxSizer() sovRightSizer = EcceVBoxSizer() #Use solvation self.useSCRF = EcceCheckBox(self, #useCosmo label = " Use SCRF" name = "ES.Theory.SCF.UseSCRF", default = False) sovLeftSizer.AddWidget(self.useSCRF, border = EcceGlobals.BorderDefault) #SCRF type scrfChoice = ["PCM", "CPCM", "IPCM", "SCIPCM", "SMD", "Dipole"] self.scrf = EcceComboBox(self, choices = scrfChoice, name = "ES.Theory.SCF.SCRF", label = "SCRF type:", default = 0) sovRightSizer.AddWidget(self.scrf, border = EcceGlobals.BorderDefault) #Solvent type solventChoice = ["Water", "Acetonitrile", "Methanol", "Ethanol", "Manual", "Benzene", "Chloroform", "Diethylether", "Dichloromethane", "Dichloroethane", "Carbontetrachloride", "Toluene", "Chlorobenzene", "Nitromethane", "Heptane", "Aniline", "Acetone", "Tetrahydrofuran", "Dimethylsulfoxide", "Argon", "Krypton", "Xenon", "n-Octanol", "1-Butanol" "Cyclohexane", "Isoquinoline", "Quinoline", ] self.solvent = EcceComboBox(self, choices = solventChoice, name = "ES.Theory.SCF.Solvent", label = "Solvent:", default = 0) sovRightSizer.AddWidget(self.solvent, border = EcceGlobals.BorderDefault) self.scrfDielec = EcceFloatInput(self, default = 78.4, name = "ES.Theory.SCF.Dielectric", label = "Dielectric Constant:", hardRange = "(0..)", unit = "Debye") sovRightSizer.AddWidget(self.scrfDielec, border = EcceGlobals.BorderDefault) sovSizer.AddWidget(sovLeftSizer, flag = wx.ALL) sovSizer.AddWidget(sovRightSizer, flag = wx.ALL) self.panelSizer.Add(sovSizer) #End theory options solvation -CAO

Also add the following to the def CheckDependency(self) block, right before the "if (EcceGlobals.Category == "MP" or" line:
# SCRF solvation -- CAO if EcceGlobals.Category == "DFT" or EcceGlobals.Category == "SCF": self.solvent.Enable(self.useSCRF.GetValue()) self.scrf.Enable(self.useSCRF.GetValue()) self.scrfDielec.Enable(self.useSCRF.GetValue() and self.solvent.GetValue() == "Manual") # end SCRF

Edit apps/scripts/parsers/ai.gauss03 and put the following somewhere in the file.
############################################################################## # # Description: # SCRF options field -CAO # ############################################################################## sub SCRFOptions { my($options,$result); $result = ""; $options = ""; #scrf type if ($AbiDict{"ES.Theory.SCF.UseSCRF"}) { if ($AbiDict{"ES.Theory.SCF.SCRF"} eq "" ) { $options .= "PCM,"; } else {$options .= ($AbiDict{"ES.Theory.SCF.SCRF"}).","}; # solvent/dielectric if ($AbiDict{"ES.Theory.SCF.Solvent"} eq "Manual") { if (defined($AbiDict{"ES.Theory.SCF.Dielectric"})) { $options .= "Dielectric=".$AbiDict{"ES.Theory.SCF.Dielectric"}.""; } } elsif ($AbiDict{"ES.Theory.SCF.Solvent"} eq "") { $options .= "Solvent=water"; } else { $options .= "Solvent=".$AbiDict{"ES.Theory.SCF.Solvent"}; } if ($options ne "") { $result = "SCRF=("; $result .= $options; $result .= ") "; } return $result; } }
Also add the following right below the "$route .= &SCFOptions;" line:
$route .= &SCRFOptions;

And you're done:

04 November 2013

525. Briefly: rotating molecule during optimisation in Gaussian

Most people who have used gaussian will be familiar with molecules that are rotated multiple times during optimisation. Occasionally,  the molecule is rotated repeatedly without any sign of convergence, and without any changes in atom positions.

Uploading the video to blogspot has lead to some distortion (i.e. the molecule is translated -- look at the x,y,z axes in the bottom left) but the general behaviour should still be clear:


The energies aren't really changing:


Solution:
Turning off rotation using 'nosymm' might yield faster convergence:
#P rB3LYP/GEN 5D Opt=()  Freq=() SCF=(MaxCycle=128 ) NoSymm  Punch=(MO)

In ECCE, just uncheck 'Use available symmetry'.

Note that NoSymm doesn't turn off symmetry completely -- it just prevent reorientation:

From http://www.gaussian.com/g_tech/g_ur/k_symmetry.htm
The NoSymmetry keyword prevents molecule reorientation and causes all computations to be performed in the input orientation (although the program still attempts to identify the appropriate point group). Symmetry use can be completely disabled by Symmetry=None; use this option if NoSymm generates an error when identifying the point group.



How the figures were made:

First I used this script to extract the .xyz coordinates of the geometry steps:
#!/usr/bin/python
import sys 

def getrawdata(infile):
    f=open(infile,'r')
    opt=0
    geo=0
    struct=[]
    structure=[]
    energies=[]
    energy=[]
    for line in f:
    
        if opt==1 and geo==1 and not ("---" in line):
            structure+=[line.rstrip()]
    
        if 'Coordinates (Angstroms)' in line:
            if opt==0:
                opt=1
                structure=[]
    
        if opt==1 and "--------------------------" in line:
            if geo==0:
                geo=1
            elif geo==1:
                geo=0
                opt=0
        if 'SCF Done' in line:
            energy=filter(None,line.rstrip('\n').split(' ')) 
#       if  'Optimization completed' in line and (opt==0 and geo==0):
            energies+=[float(energy[4])]
            opt=0
            geo=0
            struct+=[structure]
            structure=[]
    
    return struct, energies

def periodictable(elementnumber):
    ptable={1:'H',2:'He',\
    3:'Li', 4:'Be',5:'B',6:'C',7:'N',8:'O',9:'F',10:'Ne',\
    11:'Na',12:'Mg',13:'Al',14:'Si',15:'P',16:'S',17:'Cl',18:'Ar',\
    19:'K',20:'Ca',\
    21:'Sc',22:'Ti',23:'V',24:'Cr',25:'Mn',26:'Fe',27:'Co',28:'Ni',29:'Cu',30:'Zn',\
    31:'Ga',32:'Ge',33:'As',34:'Se',35:'Br',36:'Kr',\
    37:'Rb',38:'Sr',\
    39:'Y',40:'Zr',41:'Nb',42:'Mo',43:'Tc',44:'Ru',45:'Rh',46:'Pd',47:'Ag',48:'Cd',\
    49:'In',50:'Sn',51:'Sb',52:'Te',53:'I',54:'Xe',\
    55:'Cs',56:'Ba',\
    57:'La',58:'Ce',59:'Pr',60:'Nd',61:'Pm',62:'Sm',63:'Eu',64:'Gd',65:'Tb',66:'Dy',67:'Ho',68:'Er',69:'Tm',70:'Yb',71:'Lu',\
    72:'Hf', 73:'Ta', 74:'W',75:'Re', 76:'Os', 77:'Ir',78:'Pt', 79:'Au', 80:'Hg',\
    81:'Tl', 82:'Pb', 83:'Bi',84:'Po',85:'At',86:'Rn',\
    87:'Fr',88:'Ra',\
    89:'Ac',90:'Th',91:'Pa',92:'U',93:'Np',94:'Pu',95:'Am',96:'Cm',97:'Bk',98:'Cf',99:'Es',100:'Fm',101:'Md',102:'No',\
    103:'Lr',104:'Rf',105:'Db',106:'Sg',107:'Bh',108:'Hs',109:'Mt',110:'Ds',111:'Rg',112:'Cn',\
    113:'Uut',114:'Fl',115:'Uup',116:'Lv',117:'Uus',118:'Uuo'}
    element=ptable[elementnumber]
    return element
def genxyzstring(coords,elementnumber):
    x_str='%10.5f'% coords[0]
    y_str='%10.5f'% coords[1]
    z_str='%10.5f'% coords[2]
    element=periodictable(int(elementnumber))
    xyz_string=element+(3-len(element))*' '+10*' '+\
    (8-len(x_str))*' '+x_str+10*' '+(8-len(y_str))*' '+y_str+10*' '+(8-len(z_str))*' '+z_str+'\n'

    return xyz_string

def getstructures(rawdata):

    n=0
    for structure in rawdata:

        n=n+1
        num="%03d" % (n,)
        g=open('structure_'+num+'.xyz','w')
        itson=False
        cartesian=[]

        for item in structure:

            coords=filter(None,item.split(' '))
            coordinates=[float(coords[3]),float(coords[4]),float(coords[5])]
            element=coords[1]
            cartesian+=[genxyzstring(coordinates,element)]
        g.write(str(len(cartesian))+'\n')
        g.write('Structure '+str(n)+'\n')
        for line in cartesian:
            g.write(line)
        g.close()
        cartesian=[]
    return 0

if __name__ == "__main__":
    infile=sys.argv[1]
    rawdata,energies=getrawdata(infile)
    structures=getstructures(rawdata)
    g=open('energies.dat','w')
    for n in range(0,len(energies)):
        g.write(str(n)+'\t'+str(energies[n])+'\n')
    g.close()

Then I turned all the .xyz files into one .xyz file:
cat *.xyz > structures.xyz

Which I then opened in vmd with a view to make a video. Select New Molecule and load structure.xyz. Then go to Extensions, Visualization/Movie Maker. Making a movie using VMD didn't yield to anything of sufficient quality. Instead, I selected Tachyon as the renderer. Under Movie Settings I unchecked Delete Image Files.

Once I had the ppm files I did
ffmpeg -r 2 -i symm.%05d.ppm -vcodec mpeg4 -b 5000k video.avi

The key to getting a decent quality video was picking a high enough bit rate, which is probably obvious to most people who know what bit rate means, but not to a happy amateur like myself (well, now it is).

I made the energy plot using gnuplot and the energies.dat file which the python script above generates.

11 June 2013

446. B3LYP and WAH -- the confusion

Quite a while back I was looking at the WAH (Wilson-Amos-Handy) functional, and while it turned out to be a bit more complicated than I had hoped, it led me to type up a brief discussion about b3lyp in different computational packages.

The issue is that there are several different definitions, and that even if a paper is kind enough to provide the Becke 1993 communication as a reference, this is rarely the actual form of b3lyp used. In fact, the LYP part would speak directly against it. As someone whom isn't well-versed in the computational and theoretical arts I do think it would be nice if we could get to the point where we can get useful information by doing point-and-click computations, but as exchange-correlation functionals essentially are fudge-factors, this probably won't happen for some time.  In other words, 6-31G/B3LYP may be a winning combination for the computation of electronic energies of a limited range of (mostly organic) small molecules in the gas phase, it doesn't always yield anything useful about real-world systems.

NOTE: I wrote the original text as I was trying to figure out what WAH was -- and I thought at that point that it was a simple form of B3PW91 with tweaked prefactors. It isn't -- it's requires changes in the way the GIAOs are computed (I think). So don't focus on the WAH discussion.

Anyway, here's the story as a bench chemist (i.e. not a computational or theoretical chemist) understands it, in the context of trying to understand what the WAH exchange-correlation functional looks like.

The WAH functional is a hybrid exchange correlation functional which was developed to provide accurate NMR shift calculations.

Note that the WAH functional is correctly implemented in PQS -- see the manual.

But the story is really about B3LYP...


1. Definition of the WAH exchange-correlation functional:
The definition[1] consists of
We found that by using hybrid Kohn-Sham orbitals and eigenvalues with an adjusted 'exact-exchange' coefficient Cx, the NMR shielding parameters gave an accuracy approaching the best coupled-cluster calculations for molecules containing first and second row atoms. For B3LYP[2] we find Cx=0.05[..] give(s) best values (note that the coefficient of Local Density Exchange is (1-Cx) in the amended B3LYP. We name the resulting NMR values B3LYPGGA0.05[..].
This is the paper which is cited by the PQS manual. The definition is brief, but not unreasonable.

2 Definition of B3LYP
[The first-person account of the background to B3LYP is found here: http://www.ccl.net/chemistry/resources/messages/2002/05/22.008-dir/]

In the paper which is cited as a source of B3LYP, Becke defined[2] a functional as
EXC=EXCLSDA+a0 (ExHF-ExLSDA )+axΔ ExBecke88+acΔ EcPW91   (eq 1)
where EXC denotes exchange-correlation functional, Ex denotes exchange functional, Ec denotes correlation functional, Δ denotes non-local contribution, LSDA is the local spin-density approximation, Becke88[3] is the gradient-corrected LDA and PW91 is the Perdew-Wang 1991 gradient correction.[4]. The B3 in B3LYP refers to the three parameters it involves: a0=0.2, ax=0.72 and ac=0.81. Note that a0 is the same as Cx above. We'll refer to equation 1 as B3PW91.

LSDA is poorly defined but is normally taken to be the SVWN of the form
EXCLSDA=EXLSDA+ECLSDA  (eq 2)
=ExSlater+EcVWN   (eq 3)
although there's a slew of Vosko-Wilk-Nusair (VWN) functionals -- most sources suggest that Becke referred to VWN5, while my reading of the literature is a bit different (Becke states he uses the electron-gas parametrization in [4]). Either way, equation 1 now becomes
EXC=a0 ExHF+(1-a0 )ExSlater+EcVWN +axΔ ExBecke88+acΔ EcPW91 (eq 4)
This is implemented as the hybrid exchange-correlation functional acm in NWChem (the adibatic connection method) using VWN5. A very brief summary of Becke '93 vs Gaussian '92 (don't ask me about the chronology) is also available by Stephens et al. in J. Phys. Chem. 1994, 98(45), p. 11624.

2.1 Gaussian '92
You may at this point be forgiven for asking yourself why it is called B3LYP and not B3PW91. In 1991 Gaussian hadn't yet implemented PW91 (fair enough) and substituted it with the Lee-Yang-Parr (LYP) correlation functional (ΔEcLYP). Since it's difficult to separate the local component (and we want the non-local component as indicated by Δ), they wrote
Δ EcLYP=EcLYP-EcVWN (eq 5)
which turns equation 4 into
EXC=a0 ExHF+(1-a0)ExSlater+axΔ ExBecke88+acEcLYP +(1-ac)EcVWN (eq 6)
In addition, in the original Gaussian implementation VWN_1_RPA was used, which sources tell me is 100\% wrong when taking Becke's intentions into account.
EXC=a0 ExHF+(1-a0)ExSlater +axΔ ExBecke88+acEcLYP +(1-ac)EcVWN_1_RPA (eq 7)
To make matters worse, today Gaussian uses VWN_3 and it seems they know how to get the nonlocal component of LYP directly (see below). Equation 7 is what you use if you use B3LYP in most software packages (though not all -- e.g. Gamess US uses VWN5 instead of VWN_1_RPA) So in G09 it's now
EXC=a0 ExHF+(1-a0)ExSlater+axΔ ExBecke88+EcVWN_3+acEcLYP (eq 8)

2.2 PQS
PQS uses the old gaussian version (eq. 7) as the b3lyp functional, but it doesn't explicitly state which form -- b3lyp or b3pw91 -- is used for the WAH functional.


3 So what definition did WAH use?
All we really care about is reproducing the original paper by Handy et al. -- not whether Becke would approve or not. But here's where the problem of citing papers you may not have read becomes an issue.

Wilson, Amos and Handy cite the 1993 paper by Becke which defines the canonical version of B3LYP (i.e. B3PW91), which should settle it in favour of WAH being defined as shown in equation 4.

However, they used CADPACK, which implements it as in equation 7. Reading the CADPACK manual I can see what Handy et al. probably did: the way you define custom parameters for hybrid functionals in CADPACK is by doing
hybrid a0 ax ac
so that they during the development of their functional most likely typed in
b3lyp 0.05 0.72 0.81
which meant they probably used
EXC=0.05 ExHF+0.95 ExSlater+ +0.72 Δ ExBecke88+0.81 EcLYP +0.19 EcVWN_1_RPA (eq 9)

4 Implementing it in your package of choice
[NOTE: this will NOT set up WAH correctly -- I'm leaving it as it shows how to set up custom XCs in nwchem, G09 and Dalton]

4.1 NWCHEM
The canonical version of Becke's functional, B3PW91, is implemented as acm in NWChem and which is manually entered as
xc HFexch 0.2 slater 0.8 becke88 nonlocal 0.72 vwn_5 1 Perdew91 0.81
while the Gaussian '92 form is manually entered as
xc HFexch 0.2 slater 0.8 becke88 nonlocal 0.72 vwn_1_rpa 0.19 lyp 0.81
This means that the two possible forms of WAH are:
xc HFexch 0.05 slater 0.95 becke88 nonlocal 0.72 vwn_5 1 Perdew91 0.81
and
 xc HFexch 0.05 slater 0.95 becke88 nonlocal 0.72 vwn_1_rpa 0.19 lyp 0.81
We'll refer to them as B3PW910.05 and B3LYP0.05, respectively.

4.2 Gaussian 09
Gaussian is a lot less elegant. The canonical version of Becke's functional, B3PW91, is implemented as acm in Gaussian as B3PW91, which is manually entered as
BPW91 IOp(3/76=1000002000) IOp(3/77=0720008000) IOp(3/78=0810010000)
while the old Gaussian form is manually entered as
BLYP IOp(3/76=1000002000) IOp(3/77=0720008000) IOp(3/78=0810001900
This means that the two forms of WAH are:
BPW91 IOp(3/76=1000000500) IOp(3/77=0720009500) IOp(3/78=0810010000)
and
BLYP IOp(3/76=1000000500) IOp(3/77=0720009500) IOp(3/78=0810001900)
We'll refer to them as B3PW910.05 and B3LYP0.05, respectively.

4.3 Dalton
The notations are (more or less)
Combine HF=0.20 Slater=0.80 Becke=0.72 PW91c=0.81 VWN5=1
Combine HF=0.20 Slater=0.80 Becke=0.72 LYP=0.81 VWN=0.19
Combine HF=0.05 Slater=0.95 Becke=0.72 PW91c=0.81 VWN5=1
Combine HF=0.05 Slater=0.95 Becke=0.72 LYP=0.81 VWN=0.19
As an aside, I don't think anyone at this point would be surprised to learn that B3PW91 in Dalton and Gaussian are two completely different exchange-correlation functionals...

Performance
NOTE: the results don't quite make any sense to me anymore -- using the same XCs and basis sets and structures one would expect to get the exact same results for all three packages. I don't know why that wasn't the case, assuming that I implemented the XCs correctly, and assuming that the basis sets really are the same (which often they actually aren't...). So keep that in mind.

Original:
I used the same equilibrium structures as Handy (i.e. those of Cybulski), and used the aug-cc-pVTZ basis set with a fine DFT grid (not the same as Handy -- but should be as good. Also, I've done this with def2-qzvp and 6-31+g* as well). All calculations are for the gas phase. Handy's values are for the Huzinaga IV basis set in their GIAO paper[5], but it doesn't really matter much which basis set is chosen. The results are tabulated in table 1.

Note that Handy also saw chemical shifts for the oxygen in carbon monoxide which were around -80 ppm. Basically, I can reproduce everything except for his B3LYP0.05gga. I did a few test-runs with Huz-IV in Dalton and it was just as bad as the other methods.

Table 1: Tabulated calculated gas phase NMR shifts using different combinations of exchange-correlation functionals and the aug-cc-pvtz basis set. Note that Handy's experimental data are not in agreement with my sources, which are 1.0, -42.3 and 344 ppm for CO, CO and H2O, respectively.
Package b3lypa gb3lypb acmc b3pw91b b3lyp0.05 b3pw910.05 Handy
Expt.
Carbon in CO (ppm)
NWChem -10.32 -10.32 -8.80 -8.80 -8.08 -6.55 5.6 2.8
G09 -7.71 -7.71 -9.95 -9.95 -5.66 -7.74
Dalton -10.21 -10.21 N/A -12.65 -7.97 -10.38
Oxygen in CO (ppm)
NWChem -71.51 -71.51 -72.34 -72.34 -70.06 -70.93 -40 -36.7
G09 -78.82 -78.82 -77.89 -77.89 -77.57 -76.86
Dalton -71.50 -71.50 N/A -72.03 -70.04 -70.89
Oxygen in water
NWChem 328.45 328.45 329.28 329.28 329.79 330.48 327 358
G09 N/A 328.76 328.39 328.39 328.95 328.31
Dalton 328.50 328.50 N/A 328.02 329.85 329.07
aNote that G09 uses eq. 8 by default (gives -11.70, -77.63, and 328.34 ppm), but I forced it to use eq. 7. b These were my manual implementations of 'b3lyp' and 'acm' to make sure I got things right. c This is acm in NWChem and B3PW91 in Gaussian 09.


References:
[1] P. J. Wilson, R. D. Amos, H. N. C., Chem. Phys. Letters 1999, 321, 475-484.
[2] A. D. Becke, J. Chem. Phys. 1993, 98, 5648-5652.
[3] A. D. Becke, Phys. Rev. A 1988, 38, 3098-3100.
[4] J. P. Perdew, Y. Wang, Phys. Rev. B 1991, 45, 13244-13249.
[5] T. Helgaker, P. J. Wilson, R. D. Amos, N. C. Handy, J. Chem. Phys. 2000, 113, 2983-2989.

27 May 2012

165. Approach to computing reorganisational energies using nwchem

I set out to reproduce Malagoli and Brédas in Chemical Physics Letter, 2000, 327, 13-17 (Link). Essentially it's a paper on calculating reorganisational energies in a few simple organic species, such as biphenyl.  I've already covered this work to some extent (here: http://verahill.blogspot.com.au/2012/05/dft-gridsize-ecce-defaults-to-medium.html ) but here's a step-by-step walkthrough and nwchem:

Approach (using biphenyl):
1. Draw biphenyl -- the dihedral angle between the two rings should be about 38 degrees
2. Optimize structure using b3lyp/6-31+g in gas phase. Gives you E1
3. Change the charge and multiplicity to +1 and 2, respectively.
4. Calculate single-point energy using the previously optimised structure (from step 2, i.e. don't optimise). Gives E2.
5. Now, optimise structure. Gives E3.
6. Change the charge and multiplicity to 0 and 1. Calculate single point energy of the optimised structure in step 5. Gives E4.
7. ΔE=(E2-E3)+(E4-E1). Convert energies from Hartree to eV by multiplying by 27.2107


Using nwchem:
title "Calculating E1"
start neutral_ground_state
geometry
 C     3.58691     -1.70661e-06     0.000280946
 C     2.87790     -0.739281     0.946515
 C     1.48493     -0.738173     0.945097
 C     0.747197     0.00000     0.000278762
 C     1.48488     0.738183     -0.944553
 C     2.87785     0.739273     -0.945948
 C     -0.747197     0.00000     0.000278762
 C     -1.48488     0.00000     -1.19873
 C     -2.87785     0.00000     -1.20050
 C     -1.48493     0.00000     1.19927
 C     -3.58691     0.00000     0.000281534
 C     -2.87790     0.00000     1.20107
 H     4.67272     9.42775e-05     0.000158092
 H     3.40921     -1.32260     1.69312
 H     0.975305     -1.32691     1.69865
 H     3.40931     1.32254     -1.69249
 H     -3.40931     0.00000     -2.14788
 H     -0.975305     0.00000     2.15554
 H     -4.67272     0.00000     0.000125630
 H     -3.40921     0.00000     2.14853
 H     -0.975288     0.00000     -2.15502
 H     0.975288     1.32693     -1.69812
end
charge 0
basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
end
dft
    direct
    grid fine
    xc b3lyp
    mult 1
end
task dft optimize
and
title "Calculating E2"
start cation_excited_state
geometry units angstrom
 C     0.00000     -3.56301     0.00000
 C     -1.13927     -2.85928     -0.393841
 C     -1.13879     -1.46545     -0.394153
 C     0.00000     -0.742814     0.00000
 C     1.13879     -1.46545     0.394153
 C     1.13927     -2.85928     0.393841
 C     0.00000     0.742814     0.00000
 C     1.13879     1.46545     -0.394153
 C     1.13927     2.85928     -0.393841
 C     -1.13879     1.46545     0.394153
 C     0.00000     3.56301     0.00000
 C     -1.13927     2.85928     0.393841
 H     0.00000     -4.64896     0.00000
 H     -2.02827     -3.39662     -0.711607
 H     -2.02148     -0.928265     -0.727933
 H     2.02827     -3.39662     0.711607
 H     2.02827     3.39662     -0.711607
 H     -2.02148     0.928265     0.727933
 H     0.00000     4.64896     0.00000
 H     -2.02827     3.39662     0.711607
 H     2.02148     0.928265     -0.727933
 H     2.02148     -0.928265     0.727933
end
charge 1
basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
end
dft
    direct
    grid fine
    xc b3lyp
    mult 2
end
task dft energy
and
title "Calculating E3"
start cation_ground_state
geometry units angstrom
 C     0.00000     -3.56301     0.00000
 C     -1.13927     -2.85928     -0.393841
 C     -1.13879     -1.46545     -0.394153
 C     0.00000     -0.742814     0.00000
 C     1.13879     -1.46545     0.394153
 C     1.13927     -2.85928     0.393841
 C     0.00000     0.742814     0.00000
 C     1.13879     1.46545     -0.394153
 C     1.13927     2.85928     -0.393841
 C     -1.13879     1.46545     0.394153
 C     0.00000     3.56301     0.00000
 C     -1.13927     2.85928     0.393841
 H     0.00000     -4.64896     0.00000
 H     -2.02827     -3.39662     -0.711607
 H     -2.02148     -0.928265     -0.727933
 H     2.02827     -3.39662     0.711607
 H     2.02827     3.39662     -0.711607
 H     -2.02148     0.928265     0.727933
 H     0.00000     4.64896     0.00000
 H     -2.02827     3.39662     0.711607
 H     2.02148     0.928265     -0.727933
 H     2.02148     -0.928265     0.727933
end
charge 1
basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
end
dft
    direct
    grid fine
    xc b3lyp
    mult 2
end
task dft optimize
and
title "Calculating E4"
start neutral_excited_state
geometry
 C     0.00000     -3.54034     0.00000
 C     -1.20296     -2.84049     -0.216000
 C     -1.20944     -1.46171     -0.206253
 C     0.00000     -0.721866     0.00000
 C     1.20944     -1.46171     0.206253
 C     1.20296     -2.84049     0.216000
 C     0.00000     0.721866     0.00000
 C     1.20944     1.46171     -0.206253
 C     1.20296     2.84049     -0.216000
 C     -1.20944     1.46171     0.206253
 C     0.00000     3.54034     0.00000
 C     -1.20296     2.84049     0.216000
 H     0.00000     -4.62590     0.00000
 H     -2.12200     -3.38761     -0.395378
 H     -2.13673     -0.938003     -0.401924
 H     2.12200     -3.38761     0.395378
 H     2.12200     3.38761     -0.395378
 H     -2.13673     0.938003     0.401924
 H     0.00000     4.62590     0.00000
 H     -2.12200     3.38761     0.395378
 H     2.13673     0.938003     -0.401924
 H     2.13673     -0.938003     0.401924
end
charge 0
basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
end
dft
    direct
    grid fine
    xc b3lyp
    mult 1
end
task dft energy

And it all gives:
E1: Total DFT energy =     -463.321927500065
E2: Total DFT energy =     -463.035336642074
E3: Total DFT energy =     -463.042292962541
E:4 Total DFT energy =     -463.315725187090
ΔE=-463.035336642074-(-463.042292962541)+(-463.315725187090-(-463.321927500065))=0.013158633442Hartree= .3580556270002294 eV ≅ 0.36 eV i.e. the same as the paper.

26 May 2012

164. A rough approach to calculating redox potentials using dft

I'm not a computational or theoretical chemist, but I'd like to expand my skills in the area. Ergo, if you do post, don't ask me questions about theory -- I'm only at the learning stage myself. In addition, this is my interpretation of the paper I'm following.

EDIT: see here for a rough description of my take on a paper on reorganisational energies.

I'm reproducing papers to improve my understanding of what computational tools are available to 'wet' chemists. A good place to start for this is in the past -- typically, cutting edge methods are difficult to implement for someone who's not an expert in the field (Would you take a soldering iron to an NMR probe? We do it routinely, but most people wouldn't since it takes training.) and lacks the intuition to apply the right type of fudging. 'Old' papers (5-10 years) offer the advantage of having enough papers citing them that you get a crowd-sourced approach to evaluation of them, the methods described are often implemented directly in today's software packages and the computations are much, much cheaper with today's hardware -- often parts or even an entire paper can be reproduced in a day or two.

WARNING I may be doing something wrong WARNING

Here's my attempt at reproducing part of Baik and Friesner in J. Phys. Chem. A, 2002, 106, 7407-7412 - the exact approach is somewhat flavoured by  the specific output that gaussian and nwchem return.

The basic principle is that if we know the ΔG of a redox reaction, then we can calculate the redox potential:
ΔG=-n*F*E,
where n is the number of electrons, F is Faraday's constant and E is the (unreferenced) redox potential. If we know ΔG in kcal/mol, then E=ΔG/(-n*F), where F is ca 23.06 kcal/(mol.V).

The main complication is that a normal dft calc will give us the electronic energy, ε0, while we ultimately need the thermodynamic paramaters, e.g. ΔH, ΔS, ΔG. Luckily, they are easy (although computationally costly) to get by doing a frequency calculation.

The short of it:
We here define G=ε0+Gcorr.
Gaussian will give you Gcorr directly, while e.g. nwchem makes you work for it -- but not too much -- so that
Gcorr=Hcorr-T*Srot+vib+trans

NOTE:In the paper (using Jaguar) the ZPE is explicitly included, but in nwchem and gaussian the  Hcorr/Gcorr will already contain it. Approaches that I've seen online do not include ZPE and the computed data agrees better with experimental data if it's excluded. If I had to put money on anything, I'd say: make sure your software package includes ZPE in the Hcorr reported, and don't add it explicitly.

ΔGin gas phase is obtained by taking the difference in G of the product and reactant. All that remains is the solvation energy.

Throwing in solvation using a continuum solvent model is fairly easy, but can also lead to long computational times, so we will cheat  -- the original paper is from 2002, when some of the calcs took them 40 days. So we will do a single-point/energy calculation to get the solvation.

(I think a more modern approach -- given more powerful hardware -- would be to do everything with a solvent model included for small molecules. For e.g. polyoxometalates, however, this can still be costly when using COSMO, but using PCM is almost a requirement when using g09 since anions converge very slowly. Bottomline: decide on the approach and be prepared to justify it.)

Anyway:
Gsolvation≅ (ε0(using solvation model)+Gcorr(gas phase))-(ε0(in gas phase)+Gcorr(gas phase))=ε0(using solvation model)-ε0(in gas phase)=εsolvation.

When I only write ε0 I mean in gas phase. Here's for a reaction where A goes to B through a redox process:
ΔG≅(ε0solvation+Gcorr)B-(ε0solvation+Gcorr)A=ΔGin gas phase+ΔGsolvation

How:
You need to do the following steps:
  1. Draw benzophenone
  2. Optimise the structure in the gas phase i.e. without a solvent model
  3. Do a frequency calc of the optimised structure in the gas phase
  4. Do an energy (single point) calculation of the gas-phase optimised structure using a solvent model (e.g. PCM or COSMO). That is, you use the structure you optimised in the gas phase, then apply a solvent model and do a single-point energy calculation.

  5. Draw the benzophenone anion
  6. Optimise the structure in the gas phase i.e. without a solvent model
  7. Do a frequency calc of the optimised structure in the gas phase
  8. Do an energy (single point) calculation of the gas-phase optimised structure using a solvent model

2 will give you ε0. 3 will give you ε0 again, and Gcorr. 4 will give you ε0(using solvation model).
Steps 5-8 will do the same, but for the reduced species.


Gaussian inputs:
#p opt rb3lyp/6-31g**

benzophenone, neutral, gas phase

0 1
...xyz coordinates of starting guess...
and
#p freq rb3lyp/6-31g**

benzophenone, neutral, frequency

0 1
...xyz coordinates of optimised structure...
and
#p rb3lyp/6-31g** scrf=(pcm,solvent=acetonitrile)

benzophenone, neutral, solvation

0 1
...xyz coordinates of optimised structure...
and
#p opt rb3lyp/6-31g**

benzophenone, anion, gas phase

-1 2
...xyz coordinates of starting guess...

etc. Note the multiplicity -- one unpaired electron gives a multiplicity of 2 and a charge of -1. You can of course use the .chk file as a restart point, or do both opt and freq in a single calculation.

Nwchem input:
title "Benzophenone, neutral, gas phase"
start benzophenone_gas_opt
charge 0
geometry
       ...xyz coordinates of starting guess...
end
dft
       xc b3lyp
       mult 1
       grid fine
end
task dft optimize
and
title "benzophenone, neutral, gas phase, frequency"
start benzophenone_gas_freq
charge 0
geometry
       ...xyz coordinates of optimised structure...
end
dft
       xc b3lyp
       mult 1
       grid fine
end
task dft freq
and

title "benzophenone, neutral, COSMO"
start benzophenone_solv_energy
charge 0
geometry
       ...xyz coordinates of optimised structure...
end
cosmo
       dielectric 37.5
end
dft
       xc b3lyp
       mult 1
       grid fine
end
task dft energy
and
title "Benzophenone, anion, gas phase"
start benzophenone_anion_gas_opt
charge -1
geometry
       ...xyz coordinates of starting guess...
end
dft
      xc b3lyp
      mult 2
       grid fine
end
task dft optimize
etc.

You can of course throw two task directives in one and the same calculcation etc. You can also re-use movecs files. But I'm trying to make this as simple as possible to follow.

Gaussian output:
Neutral, gas phase opt:

SCF Done:  E(UB3LYP) =  -576.648098680     A.U. after   16 cycles
             Convg  =    0.3611D-08             -V/T =  2.0097
Neutral, gas phase freq:
 Zero-point correction=                           0.191763 (Hartree/Particle)
 Thermal correction to Energy=                    0.202482
 Thermal correction to Enthalpy=                  0.203427
 Thermal correction to Gibbs Free Energy=         0.154218
 Sum of electronic and zero-point Energies=           -576.456336
 Sum of electronic and thermal Energies=              -576.445616
 Sum of electronic and thermal Enthalpies=            -576.444672
 Sum of electronic and thermal Free Energies=         -576.493881
neutral, solvation=acetonitrile, energy:
SCF Done:  E(UB3LYP) =  -576.654962012     A.U. after   13 cycles
             Convg  =    0.2187D-08             -V/T =  2.0097
Anion gas phase opt:
SCF Done:  E(UB3LYP) =  -576.656219870     A.U. after   20 cycles
             Convg  =    0.6297D-08             -V/T =  2.0095
Anion gas frequency:
 Zero-point correction=                           0.187970 (Hartree/Particle)
 Thermal correction to Energy=                    0.198790
 Thermal correction to Enthalpy=                  0.199734
 Thermal correction to Gibbs Free Energy=         0.150074
 Sum of electronic and zero-point Energies=           -576.468250
 Sum of electronic and thermal Energies=              -576.457430
 Sum of electronic and thermal Enthalpies=            -576.456486
 Sum of electronic and thermal Free Energies=         -576.506146
Anion, solvation, energy:
SCF Done:  E(UB3LYP) =  -576.732948458     A.U. after   16 cycles
             Convg  =    0.5837D-08             -V/T =  2.0096
Putting it all together:
ε0,anion,gasphase=-576.656219870
Ganion,gasphase=-576.656219870+0.150074=-576.506145870
εsolvation,anion=-576.732948458-(-576.656219870)=-0.076728588

ε0,neutral,gasphase=-576.648098680
Gneutral,gasphase= -576.648098680+0.154218=-576.493880680
εsolvation,neutral=-576.654962012-(-576.648098680)=-0.006863332

ΔG=(-576.506145870-0.076728588)-(-576.493880680-0.006863332)=-0.082130446 Hartree=-51.537594039014 kcal/mol=ΔG/(-1*F)= -51.537594039014/(-1*23.06)=2.23493469379939288811  V. Referencing it against the SCE at 4.1888 V, we get -1.95386530620060711189 V vs SCE

Looking at the paper, they got -2.426+4.1888=1.7628 V absolute, corresponding to -40.6502 kcal/mol=-0.06478026 hartree. Remember that this was done using G09 and not Jaguar, which was used in the paper. The experimental value is -1.88 V. The maximum obtainable accuracy of DFT (according to the paper) is about 2 kcal/mol.


Nwchem output:
Neutral, gas
        Total DFT energy =     -576.648088887542
      One electron energy =    -2313.472465671473
           Coulomb energy =     1046.450004818160
    Exchange-Corr. energy =      -82.832342635322
 Nuclear repulsion energy =      773.206714601093
neutral, gas, freq
 Zero-Point correction to Energy  =  120.416 kcal/mol  (  0.191895 au)
 Thermal correction to Energy     =  127.114 kcal/mol  (  0.202569 au)
 Thermal correction to Enthalpy   =  127.706 kcal/mol  (  0.203513 au)
 Total Entropy                    =  103.010 cal/mol-K
   - Translational                =   41.486 cal/mol-K (mol. weight = 182.0732)
   - Rotational                   =   31.544 cal/mol-K (symmetry #  =        1)
   - Vibrational                  =   29.980 cal/mol-K
neutral, solvated, energy
                  COSMO solvation results
                  -----------------------
  
                 gas phase energy =      -576.6480876954
                 sol phase energy =      -576.6603289676
 (electrostatic) solvation energy =         0.0122412722 (    7.68 kcal/mol)

Anion gas

       Total DFT energy =     -576.656222314875
      One electron energy =    -2320.751987024161
           Coulomb energy =     1058.312412372760
    Exchange-Corr. energy =      -83.108633373896
 Nuclear repulsion energy =      768.891985710422
Anion gas, freq
Zero-Point correction to Energy  =  118.177 kcal/mol  (  0.188327 au)
 Thermal correction to Energy     =  124.894 kcal/mol  (  0.199031 au)
 Thermal correction to Enthalpy   =  125.486 kcal/mol  (  0.199975 au)
 Total Entropy                    =  102.152 cal/mol-K
   - Translational                =   41.486 cal/mol-K (mol. weight = 182.0732)
   - Rotational                   =   31.577 cal/mol-K (symmetry #  =        1)
   - Vibrational                  =   29.090 cal/mol-K
Anion solvated, energy
                 gas phase energy =      -576.6562158410
                 sol phase energy =      -576.7436282813
 (electrostatic) solvation energy =         0.0874124403 (   54.85 kcal/mol)

Which works out to
Gcorr=(127.706-298.15*103.010/1000)/627.509=0.15456920697551748261
ε0,neutral,gasphase=-576.648088887542
Gneutral,gasphase=-576.648088887542+0.15456920697551748261+0.191895=-576.49351968056648251739+0.191895
εsolvation,neutral=0.0122412722

Gcorr=(125.486-298.15* 102.152/1000)=0.15141494338035305421
ε0,anion,gasphase= -576.656222314875
Ganion,gasphase=  -576.656222314875+0.15141494338035305421+0.188327=-576.50480737149464694579+0.188327
εsolvation,anion=0.0874124403

ΔG=(-576.50480737149464694579-0.0874124403)-(-576.49351968056648251739-0.0122412722)=-0.08645885902816442840 Hartree=-54.25371216990443230085 kcal/mol => 2.35271952167842250687 V (absolute) <=> -1.836 V vs SCE. This is very close to the experimental value of -1.88 V but not at all close to the reported value in the paper of -2.426 V vs SCE.


Discussion:
Whether I'm doing something wrong or whether the differences in solvation models and implementation in differernt software packages is to blame, I don't know. 15 kcal/mol is a lot. 
However, not including solvation at all yields values which are completely off: -3.85 V vs SCE for gaussian and 3.88 for nwchem. The solvation free energy correction is the largest factor in yielding a 'correct' (as in order of magnitude) result -- and if I'd be forced to venture a guess, solvation models/implementations (and their empirical input) both differ a lot between modelling packages and have improved a lot in the intervening 10 year.

I'll post data for larger basis sets in a week or so to see whether the nwchem/cosmo value was a fluke or 'real'.

[g09 using aug-cc-pwdCVDZ:
single point E, gas phase using lanl2dz structure:
neutral :  -576.718523917
anion:     -576.746041821
ΔG=(-576.746041821+0.150074-0.076728588)-(-576.718523917+0.154218-0.006863332)= -0.101527160 Hartree =-1.426 V relative to SCE, vs -1.88 experimental and  -1.953 with LANL2DZ. That didn't help...]

21 May 2012

157. Restarting gaussian (g09) job on an SGE system (qsub)

The Australian University I'm working at has a computational cluster where jobs are submitted using qsub. This post is more like a personal note to myself, but the point about resubmitting jobs may be of use to someone.

This page is useful reading for how these types of scripts with mixed shell and gaussian stuff work: http://www.gaussian.com/g_tech/g_ur/m_running.htm

0. The qsub header (Notes To Myself)

http://cf.ccmr.cornell.edu/cgi-bin/w3mman2html.cgi?qsub(1B)
#$ -S /bin/sh Shell. Can be csh, tcsh etc..
#$ -cwd execute in Current Working Directory
#$ -l h_rt=12:00:00 Maximum allowed run time in hours. -l is a list with resource limits.
#$ -l h_vmem=4G Memory limit. http://www.biostat.jhsph.edu/bit/cluster-usage.html#MemSpec. Should match %mem 4000mb

#$ -j y"-j join. Declares if the standard error stream of the job will be merged with the standard output stream of the job." It creates *.o* and *.p* files with what would've been echoed in the terminal.

#$ -pe g03_smpX XSeems to stand for parallel execution. X would be the number of slots it seems. The g03_smpX seems to be the message passing interface, but not entirely sure. 

1. Setting up simple jobs
First create a standard template, let's call it qsub.header

#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=12:00:00
#$ -l h_vmem=4G
#$ -j y
#$ -pe g03_smp2 2
module load gaussian/g09
time G09 << END > g09_output.log
and save it in your home folder ~.

Create an input file, e.g. water.in and put it in your work directory, e.g. ~/g09


%chk=water
#P ub3lyp/6-31G* opt

water energy

0  1
O
H  1  1.0
H  1  1.0  2  120.0

Put them together:
cat ~/qsub.header > water.qsub
cat ~/g09/water.in >>water.qsub

#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=12:00:00
#$ -l h_vmem=4G
#$ -j y
#$ -pe g03_smp2 2
module load gaussian/g09
time G09 << END > g09_output.log

%chk=water.chk
%nprocshared=6
#P ub3lyp/6-31G* opt

water energy

0  1
O
H  1  1.0
H  1  1.0  2  120.0

2. Restarting jobs

Most likely your home folder is shared across the nodes via nfs.

To find out, submit
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=12:00:00
#$ -l h_vmem=4G
#$ -j y
#$ -pe g03_smp2 2
pwd
ls -lah
tree -L 1 -d
to get some directory information about the nodes.

Once you have that, just put the absolute path to your .chk file in your restart script, e.g.

#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=12:00:00
#$ -l h_vmem=4G
#$ -j y
#$ -pe g03_smp2 2
module load gaussian/g09
time G09 << END > g09_output.log

%chk=/nfs/home/hpcsci/username/g09/water.chk
%nprocshared=2
#P ub3lyp/6-31G* opt guess=read geom=allcheck



05 May 2012

137. Setting up Gaussian g09 on debian -- precompiled binaries

Most people would use a precompiled and pre-installed copy of gaussian on their local computational grid. If you do, however, purchase your own copy, or work at an institution with a site license, then you can install gaussian on your local beowulf cluster (or over-powered desktop).

And it's easy. I've presumed that your username is verahill, and your group is verahill.


You need csh
sudo apt-get install csh

Next set up a destination directory
mkdir /opt/gaussian
sudo chown verahill:verahill /opt/gaussian

copy the gaussian binaries to /opt/gaussian/g09 so that you have

/opt/gaussian/g09
|-- basis
|-- bsd
`-- tests
    |-- com
    |-- ia64
    `-- newz

Gaussian wants almost everything to be executable
chmod +x /opt/gaussian/g09/*
chmod +x /opt/gaussian/g09/bsd/*
cd /opt/gaussian/g09
csh bsd/install

Next, edit your ~/.bashrc and add (anywhere)
export g09root=/opt/gaussian export GAUSS_SCRDIR=/scratch . /opt/gaussian/g09/bsd/g09.profile export PATH=$PATH:/opt/gaussian/g09/bsd:/opt/gaussian/g09/local

Source your bashrc to make changes take effect immediately:
source ~/.bashrc

You need to edit the bashrc of anyone wanting to use gaussian

And you're done!

03 May 2012

132. Ecce v 6.2 -- minor bug: nosymm in g09 input

Ecce 6.2 doesn't officially support Gaussian 09 as far as I know. However, they are compatible enough for a wide range of tasks.

However, if you use nosymm you will not be able to see orbital occupancy:


#P rb3lyp/6-31++g** 5D 7F Opt=()  Freq=()  Punch=(MO) Pop=() scrf=(pcm,         solvent=dichloromethane)


vs

 #P rB3LYP/6-31++g** 5D 7F Opt=()  Freq=()  Punch=(MO) Pop=() scrf=(pcm,         solvent=Dichloromethane) nosymm