31 May 2012

168. Port redirection with ECCE/nwchem

Update 1/6/2012: There may be an alternative, better way of doing this: "Another feature that may or may not be useful to you with this special node that is setup with a higher ulimit for submitting your ECCE jobs is that ECCE has a "hop" feature that lets it go from a main login node on a machine to other nodes before actually running commands (e.g. submitting jobs). If you look at the $ECCE_HOME/siteconfig/CONFIG-Examples/CONFIG.mpp2 file, you'll see this "frontendMachine" directive that is what is used to do this. I'm thinking this might allow you to skip the port redirect options with ssh and just "hop" to your special node from regular login node on the compute host. But, I don't think I'd worry about it if what you have now is working fine."
What I describe below works fine, but it does require that users are able to set up the port redirect themselves. The solution hinted at above would be a better technical solution for a larger group of users with varying technical abilities since it should be enough to copy CONFIG.<> and <>.Q files.
Here's how to use main-subnode hopping: http://verahill.blogspot.com.au/2012/06/ecce-and-inaccessible-cluster-nodes.html

Original post:

If you for some reason can't connect to your computational cluster on port 22, here's how to sort it out.

Setting it up the first time

Adding a node:
I have access to an idle computer off-campus, which sits behind a trusty old linksys router. To access it directly I need to use port forwarding.

Edit  ecce-6.3/apps/siteconfig/remote_shells.site and add

ssh_p9999: ssh -p 9999|scp -P 9999
The miniscule p for ssh and capital P for scp are important. They do the same thing, but the different programmes expect difference cases. 

In the ECCE gateway (the small floating window with icons that start when you start ECCE) select Machine Browser. Under Machine, select Register Machine:

The localhost bit is the key here -- since you'll be doing port redirect you'll formally connect to a port on localhost. Hit Add/Change.
In the main Machine Browser window, highlight oxygen and click on Setup Remote Access. Your ssh_p9999 thingy should show up there. Don't bother testing anything just yet (i.e. Machine status etc.). Since I'm writing this from memory I don't know whether you need to have the port-redirect active at this point or not. If you do, see below under Running.

It really is that simple. See below for how to actually use it.

Adding a remote site:
I work at an Australian university where I appear to be the only person using ECCE. Thus, while ulimit on the local SGE managed cluster is a meagre 32 procs, this hasn't been a problem up till now. However, ECCE launches five procs per job, so using up my procs allocation I've been locked out of the cluster on a regular basis.

As a solution, I've been offered my very own shiny submit node with a heftier 37k procs allowed. The downside is that it's only accessible from the standard submit node. Luckily, it's not much more difficult than doing port redirects for a remote node.

Edit  ecce-6.3/apps/siteconfig/remote_shells.site and add

ssh_p5454: ssh -p 5454scp -P 5454
The miniscule p for ssh and capital P for scp are important. 

In the terminal, run
ecce -admin

Most of the fields above should fairly self-explanatory. A few things to watch out for:

  • ECCE actually looks at the proc to node ratio and will impose strict limitations on the number of cores you can use per node. 50/20 means that if you want to use four cores ECCE forces the use of two nodes. Depending on how you run your jobs (configuration) this may or may not have any real impact. To be safe, pick something like 700 cores and 20 nodes.
  • Path means path. I think ECCE defaults to giving the perl path as /usr/bin/perl, but it should be /usr/bin. Same goes for the qsub path.
  • You need to create a queue. The queue name isn't used anywhere other than in ECCE, so it can be a smart way of setting up defaults. What I'm saying is: it's not that important to get it 'right' since it bears no relation to anything on your cluster.
Click on close.

In 'regular' ecce (i.e. started without -admin) go to the machine browser window, highlight the added site, hit Set up Remote Access, and pick ssh_p5454 as shown below. Don't bother testing anything just yet (e.g. Machine status etc.). Since I'm writing this from memory I don't know whether you need to have the port-redirect active at this point or not. If you do, see below under Running.

As always, setting up a site takes a bit of customisation. Here's my ecce-6.3/apps/siteconfig/CONFIG.gn54 on my ecce workstation.
NWChem: /opt/sw/nwchem-6.1/bin/nwchem
Gaussian-03: /usr/local/bin/G09
perlPath: /usr/bin
qmgrPath: /opt/n1ge62/bin/lx24-amd64
#$ -S /bin/csh
#$ -cwd
#$ -l h_rt=$wallTime
#$ -l h_vmem=4G
#$ -j y

    LD_LIBRARY_PATH /usr/lib/openmpi/1.3.2-gcc/lib/
    PATH /opt/n1ge62/bin/lx24-amd64/
NWChemCommand {
#$ -pe mpi_smp$totalprocs  $totalprocs
module load nwchem/6.1
mpirun -n $totalprocs $nwchem $infile > $outfile
Gaussian-03Command {
#$ -pe g03_smp4 4
module load gaussian/g09
time G09< $infile > $outfile }
and here's my ecce-6.3/apps/siteconfig/gn54.Q  on my ecce workstation.

# Queue details for gn54
Queues:    nwchem squ8
nwchem|minProcessors:       1
nwchem|maxProcessors:       8
nwchem|runLimit:       4320
nwchem|memLimit:       4000
nwchem|scratchLimit:       0
squ8|minProcessors:       1
squ8|maxProcessors:       6
squ8|runLimit:       4320
squ8|memLimit:       4000
squ8|scratchLimit:       0 
Finally, you need to make sure that  nwchem can find everything - put a file called .nwchemrc in your home folder on the remote node with the correct paths in it, e.g.

nwchem_basis_library /opt/sw/nwchem-6.1/data/libraries/
nwchem_nwpw_library /opt/sw/nwchem-6.1/data/libraryps/
ffield amber
amber_1 /opt/sw/nwchem-6.1/data/amber_s/
amber_2 /opt/sw/nwchem-6.1/data/amber_q/
amber_3 /opt/sw/nwchem-6.1/data/amber_x/
amber_4 /opt/sw/nwchem-6.1/data/amber_u/
spce /opt/sw/nwchem-6.1/data/solvents/spce.rst
charmm_s /opt/sw/nwchem-6.1/data/charmm_s/
charmm_x /opt/sw/nwchem-6.1/data/charmm_x/

That's it.


Starting port redirect:
Before you can pipe anything through to your supersecure remote node/remote site, you need to open a teminal window and do
ssh -C username@remoteserver -L 9999:hiddenserver:22
for the remote node above, or
ssh -C username@remoteserver -L 5454:hiddenserver:22
for the remote site above.

Just to make the syntax clear:
Remote node:
My linksys route is at IP address, and the remote node behind it is at My username is verahill
ssh -C verahill@ -L 9999:

Remote site:
The standard submit node is called msgln4.university.au, and the hidden node is called gn54.university.au. My username is lindqvist
ssh -C lindqvist@msgln4.university.au -L 5454:gn54.university.au:22
You may need to install and use autossh if you keep on being booted off due to inactivity! The syntax is identical.

Everything in ECCE now works exactly like before - just select the target computer/site/node and go.

What doesn't work:
So far I haven't been able to sort out the whole 'open' remote terminal, which means that tail -f output doesn't work either. I'm leaving that for a rainy day with too much time on my hands.

28 May 2012

167. ECCE/Nwchem on An Australian University computational cluster using qsub with g09/nwchem

I've just learned the First Rule of Remote Computing:
always start by checking the number of concurrent processes you're allowed on the head node, or you can lock yourself out faster that you can say "IT support'.

ulimit -u
If it's anywhere under 1000, then you need to be careful.
Default ulimit on ROCKS: 73728
Default ulimit on Debian/Wheezy:  63431
Ulimit on the Oz uni cluster: 32

ECCE launches FIVE processes per job.
Each pipe you add to a command launches another proc. Logging in launches a proc -- if you've reached your quota, you can't log in until a processes finishes.

cat test.text|sed 's/\,/\t/g'|gawk '{print $2,$3,$4}' 
yields three processes -- ten percent of my entire quota.

Running something on a cluster where you have limited access is very different from a cluster you're managing yourself. Apart from knowing the physical layout, you normally have sudo powers on a local cluster.

On potential issue is excessive disk usage -- both in terms of storage space and in terms of raw I/O (writing to an nfs mounted disk is not efficient anyway)
So in order to cut down on that:
1. Define a scratch directory using e.g. (use the correct path)
scratch_dir /scratch
The point being that /scratch is a local directory on the execution node

2. Make sure that you specify
or even
to do as little disk caching as possible.

I accidentally ended up storing 52 GB of aoints files from a single job. It may have been what locked me out of the submit node for three hours...

A good way to check your disk-usage is
ls -d * |xargs du -hs

Now, continue reading:

Setting everything up the first time:
First figure out where the mpi libs are:

#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=00:14:00
#$ -l h_vmem=4G
#$ -j y
locate libmpi.so
Assuming that the location is /usr/lib/openmpi/1.3.2-gcc/lib/, put 
export LD_LIBRARY_PATH=/usr/lib/openmpi/1.3.2-gcc/lib/
in your ~/.bashrc

Next, look at ls /opt/sw/nwchem-6.1/data -- if there's a default.nwchemrc file, then
ln -s /opt/sw/nwchem-6.1/data/default.nwchemrc ~/.nwchemrc

If not, create ~/.nwchemrc with the locations of the different basis sets, amber files and plane-wave sets listed as follows:

nwchem_basis_library /opt/sw/nwchem-6.1/data/libraries/
nwchem_nwpw_library /opt/sw/nwchem-6.1/data/libraryps/
ffield amber
amber_1 /opt/sw/nwchem-6.1/data/amber_s/
amber_2 /opt/sw/nwchem-6.1/data/amber_q/
amber_3 /opt/sw/nwchem-6.1/data/amber_x/
amber_4 /opt/sw/nwchem-6.1/data/amber_u/
spce /opt/sw/nwchem-6.1/data/solvents/spce.rst
charmm_s /opt/sw/nwchem-6.1/data/charmm_s/
charmm_x /opt/sw/nwchem-6.1/data/charmm_x/

Using nwchem:
A simple qsub file would be:

#$ -S /bin/sh
#$ -cwd
#$ -l h_rt=00:14:00
#$ -l h_vmem=4G
#$ -j y
#$ -pe orte 4
module load nwchem/6.1
time mpirun -n 4 nwchem  test.nw > nwchem.out

with test.nw being the actual nwchem input file which is present in your cwd (current working directory).

Using nwchem with ecce:
This is the proper way of using nwchem. If you haven't already, look here: http://verahill.blogspot.com.au/2012/05/setting-up-ecce-with-qsub-on-australian.html

Then edit your  ecce-6.3/apps/siteconfig/CONFIG.msgln4  file:

NWChem: /opt/sw/nwchem-6.1/bin/nwchem
Gaussian-03: /usr/local/bin/G09
perlPath: /usr/bin/perl
qmgrPath: /usr/bin/qsub

#$ -S /bin/csh
#$ -cwd
#$ -l h_rt=$wallTime
#$ -l h_vmem=4G
#$ -j y

NWChemFilesToDelete{ core *.aoints.* }

    LD_LIBRARY_PATH /usr/lib/openmpi/1.3.2-gcc/lib/

NWChemCommand {
#$ -pe mpi_smp4  4
module load nwchem/6.1

mpirun -n $totalprocs $nwchem $infile > $outfile

Gaussian-03Command {
#$ -pe g03_smp4 4
module load gaussian/g09

time G09< $infile > $outfile }

Gaussian-03FilesToDelete{ core *.rwf }

find /scratch/* -name "*" -user $USER |xargs -I {} rm {} -rf

And you should be good to go. IMPORTANT: don't copy the settings blindly -- what works at your uni might be different from what works at my uni. But use the above as an inspiration and validation of your thought process. The most important thing to look out for in terms of performance is probably your -pe switch.

Since I'm having problems with the low ulimit, I wrote a small bash script which I've set to run every ten minutes as a cronjob. Of course, if you've used up your 32 procs you can't run the script...also, instead of piping stuff right and left (each pipe creates another fork/proc) I've written it so it dumps stuff to disk. That way you have a list over procs in case you need to kill something manually:

 The script: ~/clean_ps.sh
ps ux>~/.job.list
ps ux|gawk 'END {print NR}'

cat ~/.job.list|grep "\-sh \-i">~/.job2.list
cat ~/.job2.list|gawk '{print$2}'>~/.job3.list
cat ~/.job3.list|xargs -I {} kill -15 {}

cat ~/.job.list|grep "echo">~/.job4.list
cat ~/.job4.list|gawk '{print$2}'>~/.job5.list
cat ~/.job5.list|xargs -I {} kill -15 {}

cat ~/.job.list|grep "notty">~/.job6.list
cat ~/.job6.list|gawk '{print$2}'>~/.job7.list
cat ~/.job7.list|xargs -I {} kill -15 {}

cat ~/.job.list|grep "perl">~/.job8.list
cat ~/.job8.list|gawk '{print$2}'>~/.job9.list
cat ~/.job9.list|xargs -I {} kill -15 {}

qstat -u ${USER} 
ps ux |gawk 'END {print NR}' 
echo "***" 

and the cron job is set up using
crontab -e
 */10 * * * * sh ~/clean_ps.sh>> ~/.cronout

Obviously this kills any job monitoring from the point of view of ecce. However, it keeps you from being locked out. You can manually check the job status using qstat -u ${USER}, then reconnect when a job is ready. Not that convenient, but liveable.

166. Briefly: nvidia API mismatch on debian when running ecce

UPDATE: There's a much better way to do this: "One thing that I did notice is your issues with OpenGL where you suggested moving the shared libraries to another directory. While that's perfectly workable, this would be another instance where consulting the $ECCE_HOME/siteconfig/site_runtime file would be useful. There you would learn about the $ECCE_MESA_OPENGL and $ECCE_MESA_EXCEPT variables that control whether to use the ECCE-supplied GL libraries or native ones (e.g. hardware OpenGL card drivers) on your machine." I'll update this post again when I've had a time to look into it. Lecture slides and grant rejoinders don't write themselves...

Original post:
If you get an error along the lines of this:
only when you're running ECCE i.e. there's an API mismatch error with a difference in kernel module version vs the nvidia driver component (in my case 295.49 and 290.10, respectively), thenyou may want to have a look in your apps folder before you launch a major investigation, e.g.
ecce-6.3/apps/rhel5-gcc4.1.2-m64/3rdparty/mesa/liblibGL.so    libGL.so.295.49  libGLU.so.1           libnvidia-glcore.so.295.49
libGL.so.1  libGLU.so        libGLU.so.1.3.071100  libnvidia-tls.so.295.49
You can symmlink the correct drivers, or -- which is even easier -- just move your 3rdparty/mesa library to e.g.3rdparty/bakmesa and see if it solves it

27 May 2012

165. Approach to computing reorganisational energies using nwchem

I set out to reproduce Malagoli and Brédas in Chemical Physics Letter, 2000, 327, 13-17 (Link). Essentially it's a paper on calculating reorganisational energies in a few simple organic species, such as biphenyl.  I've already covered this work to some extent (here: http://verahill.blogspot.com.au/2012/05/dft-gridsize-ecce-defaults-to-medium.html ) but here's a step-by-step walkthrough and nwchem:

Approach (using biphenyl):
1. Draw biphenyl -- the dihedral angle between the two rings should be about 38 degrees
2. Optimize structure using b3lyp/6-31+g in gas phase. Gives you E1
3. Change the charge and multiplicity to +1 and 2, respectively.
4. Calculate single-point energy using the previously optimised structure (from step 2, i.e. don't optimise). Gives E2.
5. Now, optimise structure. Gives E3.
6. Change the charge and multiplicity to 0 and 1. Calculate single point energy of the optimised structure in step 5. Gives E4.
7. ΔE=(E2-E3)+(E4-E1). Convert energies from Hartree to eV by multiplying by 27.2107

Using nwchem:
title "Calculating E1"
start neutral_ground_state
 C     3.58691     -1.70661e-06     0.000280946
 C     2.87790     -0.739281     0.946515
 C     1.48493     -0.738173     0.945097
 C     0.747197     0.00000     0.000278762
 C     1.48488     0.738183     -0.944553
 C     2.87785     0.739273     -0.945948
 C     -0.747197     0.00000     0.000278762
 C     -1.48488     0.00000     -1.19873
 C     -2.87785     0.00000     -1.20050
 C     -1.48493     0.00000     1.19927
 C     -3.58691     0.00000     0.000281534
 C     -2.87790     0.00000     1.20107
 H     4.67272     9.42775e-05     0.000158092
 H     3.40921     -1.32260     1.69312
 H     0.975305     -1.32691     1.69865
 H     3.40931     1.32254     -1.69249
 H     -3.40931     0.00000     -2.14788
 H     -0.975305     0.00000     2.15554
 H     -4.67272     0.00000     0.000125630
 H     -3.40921     0.00000     2.14853
 H     -0.975288     0.00000     -2.15502
 H     0.975288     1.32693     -1.69812
charge 0
basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
    grid fine
    xc b3lyp
    mult 1
task dft optimize
title "Calculating E2"
start cation_excited_state
geometry units angstrom
 C     0.00000     -3.56301     0.00000
 C     -1.13927     -2.85928     -0.393841
 C     -1.13879     -1.46545     -0.394153
 C     0.00000     -0.742814     0.00000
 C     1.13879     -1.46545     0.394153
 C     1.13927     -2.85928     0.393841
 C     0.00000     0.742814     0.00000
 C     1.13879     1.46545     -0.394153
 C     1.13927     2.85928     -0.393841
 C     -1.13879     1.46545     0.394153
 C     0.00000     3.56301     0.00000
 C     -1.13927     2.85928     0.393841
 H     0.00000     -4.64896     0.00000
 H     -2.02827     -3.39662     -0.711607
 H     -2.02148     -0.928265     -0.727933
 H     2.02827     -3.39662     0.711607
 H     2.02827     3.39662     -0.711607
 H     -2.02148     0.928265     0.727933
 H     0.00000     4.64896     0.00000
 H     -2.02827     3.39662     0.711607
 H     2.02148     0.928265     -0.727933
 H     2.02148     -0.928265     0.727933
charge 1
basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
    grid fine
    xc b3lyp
    mult 2
task dft energy
title "Calculating E3"
start cation_ground_state
geometry units angstrom
 C     0.00000     -3.56301     0.00000
 C     -1.13927     -2.85928     -0.393841
 C     -1.13879     -1.46545     -0.394153
 C     0.00000     -0.742814     0.00000
 C     1.13879     -1.46545     0.394153
 C     1.13927     -2.85928     0.393841
 C     0.00000     0.742814     0.00000
 C     1.13879     1.46545     -0.394153
 C     1.13927     2.85928     -0.393841
 C     -1.13879     1.46545     0.394153
 C     0.00000     3.56301     0.00000
 C     -1.13927     2.85928     0.393841
 H     0.00000     -4.64896     0.00000
 H     -2.02827     -3.39662     -0.711607
 H     -2.02148     -0.928265     -0.727933
 H     2.02827     -3.39662     0.711607
 H     2.02827     3.39662     -0.711607
 H     -2.02148     0.928265     0.727933
 H     0.00000     4.64896     0.00000
 H     -2.02827     3.39662     0.711607
 H     2.02148     0.928265     -0.727933
 H     2.02148     -0.928265     0.727933
charge 1
basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
    grid fine
    xc b3lyp
    mult 2
task dft optimize
title "Calculating E4"
start neutral_excited_state
 C     0.00000     -3.54034     0.00000
 C     -1.20296     -2.84049     -0.216000
 C     -1.20944     -1.46171     -0.206253
 C     0.00000     -0.721866     0.00000
 C     1.20944     -1.46171     0.206253
 C     1.20296     -2.84049     0.216000
 C     0.00000     0.721866     0.00000
 C     1.20944     1.46171     -0.206253
 C     1.20296     2.84049     -0.216000
 C     -1.20944     1.46171     0.206253
 C     0.00000     3.54034     0.00000
 C     -1.20296     2.84049     0.216000
 H     0.00000     -4.62590     0.00000
 H     -2.12200     -3.38761     -0.395378
 H     -2.13673     -0.938003     -0.401924
 H     2.12200     -3.38761     0.395378
 H     2.12200     3.38761     -0.395378
 H     -2.13673     0.938003     0.401924
 H     0.00000     4.62590     0.00000
 H     -2.12200     3.38761     0.395378
 H     2.13673     0.938003     -0.401924
 H     2.13673     -0.938003     0.401924
charge 0
basis "ao basis" cartesian print
  H library "6-31G**"
  C library "6-31G**"
    grid fine
    xc b3lyp
    mult 1
task dft energy

And it all gives:
E1: Total DFT energy =     -463.321927500065
E2: Total DFT energy =     -463.035336642074
E3: Total DFT energy =     -463.042292962541
E:4 Total DFT energy =     -463.315725187090
ΔE=-463.035336642074-(-463.042292962541)+(-463.315725187090-(-463.321927500065))=0.013158633442Hartree= .3580556270002294 eV ≅ 0.36 eV i.e. the same as the paper.

26 May 2012

164. A rough approach to calculating redox potentials using dft

I'm not a computational or theoretical chemist, but I'd like to expand my skills in the area. Ergo, if you do post, don't ask me questions about theory -- I'm only at the learning stage myself. In addition, this is my interpretation of the paper I'm following.

EDIT: see here for a rough description of my take on a paper on reorganisational energies.

I'm reproducing papers to improve my understanding of what computational tools are available to 'wet' chemists. A good place to start for this is in the past -- typically, cutting edge methods are difficult to implement for someone who's not an expert in the field (Would you take a soldering iron to an NMR probe? We do it routinely, but most people wouldn't since it takes training.) and lacks the intuition to apply the right type of fudging. 'Old' papers (5-10 years) offer the advantage of having enough papers citing them that you get a crowd-sourced approach to evaluation of them, the methods described are often implemented directly in today's software packages and the computations are much, much cheaper with today's hardware -- often parts or even an entire paper can be reproduced in a day or two.

WARNING I may be doing something wrong WARNING

Here's my attempt at reproducing part of Baik and Friesner in J. Phys. Chem. A, 2002, 106, 7407-7412 - the exact approach is somewhat flavoured by  the specific output that gaussian and nwchem return.

The basic principle is that if we know the ΔG of a redox reaction, then we can calculate the redox potential:
where n is the number of electrons, F is Faraday's constant and E is the (unreferenced) redox potential. If we know ΔG in kcal/mol, then E=ΔG/(-n*F), where F is ca 23.06 kcal/(mol.V).

The main complication is that a normal dft calc will give us the electronic energy, ε0, while we ultimately need the thermodynamic paramaters, e.g. ΔH, ΔS, ΔG. Luckily, they are easy (although computationally costly) to get by doing a frequency calculation.

The short of it:
We here define G=ε0+Gcorr.
Gaussian will give you Gcorr directly, while e.g. nwchem makes you work for it -- but not too much -- so that

NOTE:In the paper (using Jaguar) the ZPE is explicitly included, but in nwchem and gaussian the  Hcorr/Gcorr will already contain it. Approaches that I've seen online do not include ZPE and the computed data agrees better with experimental data if it's excluded. If I had to put money on anything, I'd say: make sure your software package includes ZPE in the Hcorr reported, and don't add it explicitly.

ΔGin gas phase is obtained by taking the difference in G of the product and reactant. All that remains is the solvation energy.

Throwing in solvation using a continuum solvent model is fairly easy, but can also lead to long computational times, so we will cheat  -- the original paper is from 2002, when some of the calcs took them 40 days. So we will do a single-point/energy calculation to get the solvation.

(I think a more modern approach -- given more powerful hardware -- would be to do everything with a solvent model included for small molecules. For e.g. polyoxometalates, however, this can still be costly when using COSMO, but using PCM is almost a requirement when using g09 since anions converge very slowly. Bottomline: decide on the approach and be prepared to justify it.)

Gsolvation≅ (ε0(using solvation model)+Gcorr(gas phase))-(ε0(in gas phase)+Gcorr(gas phase))=ε0(using solvation model)-ε0(in gas phase)=εsolvation.

When I only write ε0 I mean in gas phase. Here's for a reaction where A goes to B through a redox process:
ΔG≅(ε0solvation+Gcorr)B-(ε0solvation+Gcorr)A=ΔGin gas phase+ΔGsolvation

You need to do the following steps:
  1. Draw benzophenone
  2. Optimise the structure in the gas phase i.e. without a solvent model
  3. Do a frequency calc of the optimised structure in the gas phase
  4. Do an energy (single point) calculation of the gas-phase optimised structure using a solvent model (e.g. PCM or COSMO). That is, you use the structure you optimised in the gas phase, then apply a solvent model and do a single-point energy calculation.

  5. Draw the benzophenone anion
  6. Optimise the structure in the gas phase i.e. without a solvent model
  7. Do a frequency calc of the optimised structure in the gas phase
  8. Do an energy (single point) calculation of the gas-phase optimised structure using a solvent model

2 will give you ε0. 3 will give you ε0 again, and Gcorr. 4 will give you ε0(using solvation model).
Steps 5-8 will do the same, but for the reduced species.

Gaussian inputs:
#p opt rb3lyp/6-31g**

benzophenone, neutral, gas phase

0 1
...xyz coordinates of starting guess...
#p freq rb3lyp/6-31g**

benzophenone, neutral, frequency

0 1
...xyz coordinates of optimised structure...
#p rb3lyp/6-31g** scrf=(pcm,solvent=acetonitrile)

benzophenone, neutral, solvation

0 1
...xyz coordinates of optimised structure...
#p opt rb3lyp/6-31g**

benzophenone, anion, gas phase

-1 2
...xyz coordinates of starting guess...

etc. Note the multiplicity -- one unpaired electron gives a multiplicity of 2 and a charge of -1. You can of course use the .chk file as a restart point, or do both opt and freq in a single calculation.

Nwchem input:
title "Benzophenone, neutral, gas phase"
start benzophenone_gas_opt
charge 0
       ...xyz coordinates of starting guess...
       xc b3lyp
       mult 1
       grid fine
task dft optimize
title "benzophenone, neutral, gas phase, frequency"
start benzophenone_gas_freq
charge 0
       ...xyz coordinates of optimised structure...
       xc b3lyp
       mult 1
       grid fine
task dft freq

title "benzophenone, neutral, COSMO"
start benzophenone_solv_energy
charge 0
       ...xyz coordinates of optimised structure...
       dielectric 37.5
       xc b3lyp
       mult 1
       grid fine
task dft energy
title "Benzophenone, anion, gas phase"
start benzophenone_anion_gas_opt
charge -1
       ...xyz coordinates of starting guess...
      xc b3lyp
      mult 2
       grid fine
task dft optimize

You can of course throw two task directives in one and the same calculcation etc. You can also re-use movecs files. But I'm trying to make this as simple as possible to follow.

Gaussian output:
Neutral, gas phase opt:

SCF Done:  E(UB3LYP) =  -576.648098680     A.U. after   16 cycles
             Convg  =    0.3611D-08             -V/T =  2.0097
Neutral, gas phase freq:
 Zero-point correction=                           0.191763 (Hartree/Particle)
 Thermal correction to Energy=                    0.202482
 Thermal correction to Enthalpy=                  0.203427
 Thermal correction to Gibbs Free Energy=         0.154218
 Sum of electronic and zero-point Energies=           -576.456336
 Sum of electronic and thermal Energies=              -576.445616
 Sum of electronic and thermal Enthalpies=            -576.444672
 Sum of electronic and thermal Free Energies=         -576.493881
neutral, solvation=acetonitrile, energy:
SCF Done:  E(UB3LYP) =  -576.654962012     A.U. after   13 cycles
             Convg  =    0.2187D-08             -V/T =  2.0097
Anion gas phase opt:
SCF Done:  E(UB3LYP) =  -576.656219870     A.U. after   20 cycles
             Convg  =    0.6297D-08             -V/T =  2.0095
Anion gas frequency:
 Zero-point correction=                           0.187970 (Hartree/Particle)
 Thermal correction to Energy=                    0.198790
 Thermal correction to Enthalpy=                  0.199734
 Thermal correction to Gibbs Free Energy=         0.150074
 Sum of electronic and zero-point Energies=           -576.468250
 Sum of electronic and thermal Energies=              -576.457430
 Sum of electronic and thermal Enthalpies=            -576.456486
 Sum of electronic and thermal Free Energies=         -576.506146
Anion, solvation, energy:
SCF Done:  E(UB3LYP) =  -576.732948458     A.U. after   16 cycles
             Convg  =    0.5837D-08             -V/T =  2.0096
Putting it all together:

Gneutral,gasphase= -576.648098680+0.154218=-576.493880680

ΔG=(-576.506145870-0.076728588)-(-576.493880680-0.006863332)=-0.082130446 Hartree=-51.537594039014 kcal/mol=ΔG/(-1*F)= -51.537594039014/(-1*23.06)=2.23493469379939288811  V. Referencing it against the SCE at 4.1888 V, we get -1.95386530620060711189 V vs SCE

Looking at the paper, they got -2.426+4.1888=1.7628 V absolute, corresponding to -40.6502 kcal/mol=-0.06478026 hartree. Remember that this was done using G09 and not Jaguar, which was used in the paper. The experimental value is -1.88 V. The maximum obtainable accuracy of DFT (according to the paper) is about 2 kcal/mol.

Nwchem output:
Neutral, gas
        Total DFT energy =     -576.648088887542
      One electron energy =    -2313.472465671473
           Coulomb energy =     1046.450004818160
    Exchange-Corr. energy =      -82.832342635322
 Nuclear repulsion energy =      773.206714601093
neutral, gas, freq
 Zero-Point correction to Energy  =  120.416 kcal/mol  (  0.191895 au)
 Thermal correction to Energy     =  127.114 kcal/mol  (  0.202569 au)
 Thermal correction to Enthalpy   =  127.706 kcal/mol  (  0.203513 au)
 Total Entropy                    =  103.010 cal/mol-K
   - Translational                =   41.486 cal/mol-K (mol. weight = 182.0732)
   - Rotational                   =   31.544 cal/mol-K (symmetry #  =        1)
   - Vibrational                  =   29.980 cal/mol-K
neutral, solvated, energy
                  COSMO solvation results
                 gas phase energy =      -576.6480876954
                 sol phase energy =      -576.6603289676
 (electrostatic) solvation energy =         0.0122412722 (    7.68 kcal/mol)

Anion gas

       Total DFT energy =     -576.656222314875
      One electron energy =    -2320.751987024161
           Coulomb energy =     1058.312412372760
    Exchange-Corr. energy =      -83.108633373896
 Nuclear repulsion energy =      768.891985710422
Anion gas, freq
Zero-Point correction to Energy  =  118.177 kcal/mol  (  0.188327 au)
 Thermal correction to Energy     =  124.894 kcal/mol  (  0.199031 au)
 Thermal correction to Enthalpy   =  125.486 kcal/mol  (  0.199975 au)
 Total Entropy                    =  102.152 cal/mol-K
   - Translational                =   41.486 cal/mol-K (mol. weight = 182.0732)
   - Rotational                   =   31.577 cal/mol-K (symmetry #  =        1)
   - Vibrational                  =   29.090 cal/mol-K
Anion solvated, energy
                 gas phase energy =      -576.6562158410
                 sol phase energy =      -576.7436282813
 (electrostatic) solvation energy =         0.0874124403 (   54.85 kcal/mol)

Which works out to

Gcorr=(125.486-298.15* 102.152/1000)=0.15141494338035305421
ε0,anion,gasphase= -576.656222314875
Ganion,gasphase=  -576.656222314875+0.15141494338035305421+0.188327=-576.50480737149464694579+0.188327

ΔG=(-576.50480737149464694579-0.0874124403)-(-576.49351968056648251739-0.0122412722)=-0.08645885902816442840 Hartree=-54.25371216990443230085 kcal/mol => 2.35271952167842250687 V (absolute) <=> -1.836 V vs SCE. This is very close to the experimental value of -1.88 V but not at all close to the reported value in the paper of -2.426 V vs SCE.

Whether I'm doing something wrong or whether the differences in solvation models and implementation in differernt software packages is to blame, I don't know. 15 kcal/mol is a lot. 
However, not including solvation at all yields values which are completely off: -3.85 V vs SCE for gaussian and 3.88 for nwchem. The solvation free energy correction is the largest factor in yielding a 'correct' (as in order of magnitude) result -- and if I'd be forced to venture a guess, solvation models/implementations (and their empirical input) both differ a lot between modelling packages and have improved a lot in the intervening 10 year.

I'll post data for larger basis sets in a week or so to see whether the nwchem/cosmo value was a fluke or 'real'.

[g09 using aug-cc-pwdCVDZ:
single point E, gas phase using lanl2dz structure:
neutral :  -576.718523917
anion:     -576.746041821
ΔG=(-576.746041821+0.150074-0.076728588)-(-576.718523917+0.154218-0.006863332)= -0.101527160 Hartree =-1.426 V relative to SCE, vs -1.88 experimental and  -1.953 with LANL2DZ. That didn't help...]

25 May 2012

163. Long rant over LaTeX problems caused by my own stupidity: tex.pro

25 May 2012.
I'm having problems turning dvi into ps
latex test.tex
works fine.
dvips test.dvi
however gives
This is dvips(k) 5.991 Copyright 2011 Radical Eye Software (www.radicaleye.com) dvips: ! Couldn't find header file: tex.pro Process exited with error(s)
Diagnosis: PEBKAC
I was being a bloody idiot.
cat ~/.bashrc|grep TEX
export TEXINPUTS=/usr/share/texmf/tex/latex/prosper/prosper.cls:$TEXINPUTS
export TEXMFMAIN=/usr/share/texmf/

Once I commented those lines out and did
export TEXINPUTS=""
export TEXMFMAIN=""

everything was fine.

I wasted 1h30 min of prime working time on THIS. Anyway, I'm putting it all up here in case someone is having a similar problem. The moral of the story is:
always begin by checking your ~/.bashrc, ~/.cshrc and ~/.profile.

The other reason for posting it online is because the troubleshooting steps may be useful to have in the future.

kpsewhich tex.pro
gave nothing

locate tex.pro
/usr/share/texlive/texmf/dvips/base/tex.pro /usr/share/texlive/texmf-dist/dvips/gastex/gastex.pro

sudo texconfig
sudo mktexlsr
sudo texhash
helped (they pretty much do the same thing).

dpkg -S tex.pro
texlive-science: /usr/share/texlive/texmf-dist/dvips/gastex/gastex.pro texlive-base: /usr/share/texlive/texmf/dvips/base/tex.pro

show dpkg is aware of the files.
kpsewhich -path /usr/share/texlive/texmf/dvips/base tex.pro
obviously works.
dvips -d 2 draft_1.dvi
This is dvips(k) 5.991 Copyright 2011 Radical Eye Software (www.radicaleye.com) input file draft_1.dvi output file draft_1.ps swmem 180000 kdebug:Search path for PostScript header files (from texmf.cnf) kdebug: = .:/usr/share/texlive/texmf/dvips/base//dvips//:/usr/share/texlive/texmf/dvips/base//fonts/enc//:/usr/share/texlive/texmf/dvips/base//fonts/type1//:/usr/share/texlive/texmf/dvips/base//fonts/type42//:/usr/share/texlive/texmf/dvips/base//fonts/type3// kdebug: before expansion = .:$TEXMF/{dvips,fonts/{enc,type1,type42,type3}}// kdebug: application override path = (none) kdebug: application config file path = (none) kdebug: texmf.cnf path = .:$TEXMF/{dvips,fonts/{enc,type1,type42,type3}}// kdebug: compile-time path = /nonesuch kdebug: environment variables = TEXPSHEADERS PSHEADERS [..] dvips: ! Couldn't find header file: tex.pro
export TEXMF=/usr/share/texlive/texmf
solves the tex.pro problem, but gives this instead:
tex.dvips: Can't open font metric file cmr10.tfm dvips: I will use cmr10.tfm instead, so expect bad output. dvips: ! I can't find cmr10.tfm; please reinstall me with proper paths
locate cmr10.tfm
export TEXMF=$TEXMF:/usr/share/texlive/texmf-dist/fonts/tfm/public/cm
and we're back to
dvips: ! Couldn't find header file: tex.pro
sudo dpkg-reconfigure texlive-base
didn't do anything for me.
aptitude search texlive|grep ^i|gawk '{print $2,$3}>texlive.list
I then edited texlive.list and remove all the initial A on the relevant rows.

sudo apt-get purge texlive

followed by
cat texlive.list|xargs >list
I the took the list there
sudo apt-get install `cat list`
which did
sudo apt-get install feynmf latex-beamer latex-xcolor pgf prosper texlive texlive-base texlive-bibtex-extra texlive-binaries texlive-common texlive-doc-base texlive-extra-utils texlive-font-utils texlive-fonts-recommended texlive-fonts-recommended-doc texlive-generic-recommended texlive-latex-base texlive-latex-base-doc texlive-latex-extra texlive-latex-extra-doc texlive-latex-recommended texlive-latex-recommended-doc texlive-luatex texlive-metapost texlive-metapost-doc texlive-pictures texlive-pictures-doc texlive-pstricks texlive-pstricks-doc texlive-publishers texlive-publishers-doc texlive-science texlive-science-doc texpower tipa

It still doesn't [censored] work!

It's the same shit with

dvipdfm draft_1.dvi
** WARNING ** Could not open config file "dvipdfmx.cfg". draft_1.dvi -> draft_1.pdf [1] 22057 bytes written

locate dvipdfmx.cfg
/usr/share/texlive/texmf/dvipdfmx/dvipdfmx.cfg /usr/share/texlive/texmf-dist/tex/generic/pstricks/config/xdvipdfmx.cfg
sudo texhash and sudo texconfig do nothing.
cat /usr/share/texlive/texmf/ls-R|egrep "tex.pro|dvipdfmx.cfg"
dvipdfmx.cfg dvipdfmx.cfg.ucf-dist tex.pro
So what the bloody hell is the effing problem?
locate ls-R
/usr/share/texlive/texmf/ls-R /usr/share/texlive/texmf-dist/ls-R /usr/share/texmf/ls-R /var/lib/texmf/ls-R /var/lib/texmf/ls-R-TEXLIVE /var/lib/texmf/ls-R-TEXLIVEDIST /var/lib/texmf/ls-R-TEXLIVEMAIN /var/lib/texmf/ls-R-TEXMFMAIN
So which ones have tex.pro in them?
/usr/share/texlive/texmf/ls-R -- yes /usr/share/texlive/texmf-dist/ls-R -- no /usr/share/texmf/ls-R -- no /var/lib/texmf/ls-R -- no /var/lib/texmf/ls-R-TEXLIVE -- yes /var/lib/texmf/ls-R-TEXLIVEDIST -- no /var/lib/texmf/ls-R-TEXLIVEMAIN -- yes/var/lib/texmf/ls-R-TEXMFMAIN -- no

24 May 2012

162. PSPW/Carr-Parrinello using ECCE

This is more of a note to self about carr-parrinello using ecce/nwchem. As always, this isn't about the science, but about making the computation run at all. And what I may consider a bug may in fact be a feature.

If you simply click your way through ecce and try to launch a pspw carr-parrinello calc, it will fail.

Two problems:

  • task pspw carr-parrinello expects a .movecs file to be present. You can 'solve' this by putting task pspw steepest_descent before you task pspw carr-parrinello statement
  • if you relaunch a run, often you get a crash with an error referring to writing after EOF. You can solve this by cleaning out your run directory.

Problem number one comes down to this:

"Velocity Wavefunction Datafile
The one-electron orbital velocities are stored in a velocity wavefunction datafile. This is a binary file and cannot be directly edited. This datafile is used by the Car-Parrinello task and can be generated using the v_wavefunction_initializer task."

The dependence on certain files (at a minimum, .movecs) being present and the need for optimisation before CPMD and the fact that either
task pspw energy
task pspw optimize
will more often than not prevent your ecce.out file from showing any of the MD stuff means it's still better to set up your file by hand and run everything by hand in a dedicated directory. Ecce doesn't copy runtime file back and forth, and that's the main problem here.

Really, what I find a problem is that I'd like to optimize and equilibriate a set of molecules, then continue using the equilibriated set.

If all you're trying to do is to get something, anything to work to get a feel for how this stuff works, then continue reading.

Problematic example file:

  1 scratch_dir /home/me/jobs/scratch
  2 Title "biphenyl_ground_twisted_cpmd_1-1"
  4 Start  biphenyl_ground_twisted_cpmd_1-1
  6 echo
  8 charge 0
 10 geometry autosym units angstrom
 11  C     0.00676622     3.53807     0.0197363
 12  C     -1.29633     2.88855     0.554869
 13  C     -1.31879     1.38415     0.519460
 14  C     0.00129627     0.730174     -0.000557722
 15  C     1.28746     1.38368     -0.578129
 16  C     1.31931     2.90453     -0.512952
 17  C     -0.0100394     -0.758319     -0.0224661
 18  C     1.33004     -1.36336     0.563945
 19  C     1.24425     -2.89848     0.485842
 20  C     -1.31683     -1.36948     -0.531559
 21  C     0.0254501     -3.54181     0.0318405
 22  C     -1.30632     -2.89413     -0.540694
 23  H     0.0916004     4.71976     0.273191
 24  H     -2.74374     3.75562     1.47791
 25  H     -2.78549     0.594633     1.40665
 26  H     2.70470     3.74589     -1.20122
 27  H     3.09496     -3.77313     1.52949
 28  H     -2.76973     -0.640827     -1.49262
 29  H     0.0203915     -4.70472     -0.288098
 30  H     -2.76848     -3.72979     -1.38695
 31  H     2.88815     -0.631319     1.39550
 32  H     2.66933     0.621436     -1.58686
 33 end
 35 ecce_print ecce.out
 37 nwpw
 38   mult 1
 39   np_dimensions -1  -1
 40   tolerances 1e-7  1e-7
 41   car-parrinello
 42     time_step 5.000000e+00
 43     fake_mass 5.000000e+02
 44     loop 10 100
 45     scaling 1.000000e+00 1.000000e+00
 46   end
 47 end
 49 task pspw car-parrinello

Quick 'solution'
3 memory 200mw
48 task pspw steepest_descent

The line numbers are added by me. Remove them before running.

You can also stick task pspw energy or optimize in there -- but the way ecce does it, with just a task pspw carr-parrinello, won't work. Either way, it'd be nice to be able to carry over the movecs files between calculations.

See below for various errors:

Error #1:
If you set up the run using ecce, it won't work and there won't be any real error message to explain why the run exits immediately.

294      >>>  JOB STARTED       AT Thu May 24 14:19:04 2012  <<<
295           ================ input data ========================
296   library name resolved from: compiled reference
297   NWCHEM_NWPW_LIBRARY set to: </opt/nwchem/nwchem-6.1/src/nwpw/libraryps/>
298   library name resolved from: compiled reference
299   NWCHEM_NWPW_LIBRARY set to: </opt/nwchem/nwchem-6.1/src/nwpw/libraryps/>
301 -----ECCE Log Information-----
302 Starting Job: Thu May 24 14:19:02 EST 2012
303 Using /home/me/jobs/scratch as nwchem SCRATCH_DIR
304 nwchem exit status = -1
305 Final exit status = -1
306 Completed Job: Thu May 24 14:19:05 EST 2012

If you launch the run in the terminal (without mpirun -- mpi suppresses error messages sometimes) you get:
     >>>  JOB STARTED       AT Thu May 24 14:20:15 2012  <<<
          ================ input data ========================
  library name resolved from: compiled reference
  NWCHEM_NWPW_LIBRARY set to: </opt/nwchem/nwchem-6.1/src/nwpw/libraryps/>
  library name resolved from: compiled reference
  NWCHEM_NWPW_LIBRARY set to: </opt/nwchem/nwchem-6.1/src/nwpw/libraryps/>
ERROR:  Could not open pipe from input file
The reason is that ecce doesn't carry files over from previous simulations -- you need the .movecs file. This can be generated by
  task pspw steepest_descent 

If you could run all your job in the same directory that wouldn't be a problem.

Error #2
438      >>>  JOB STARTED       AT Thu May 24 14:24:38 2012  <<<
439           ================ input data ========================
440  ------------------------------------------------------------------------
441  out of heap memory        0
442  ------------------------------------------------------------------------
443  ------------------------------------------------------------------------
444   current input line :
445     48: task pspw energy
446  ------------------------------------------------------------------------
447  ------------------------------------------------------------------------
448  ------------------------------------------------------------------------
449  For more information see the NWChem manual at http://www.nwchem-sw.org/        index.php/NWChem_Documentation

We chucked task pspw steepest_descent in before our task pspw carr-parrinello and now get a new error: out of heap memory. Easily fixed. You can set e.g. 200MW under pspw/details or add
memory 200 MW
by hand.

Of course, if you add it by clicking in ecce then your task pspw steepest_descent line will be removed, so you'll have to add that by hand again.

Error #3
According to the manual "This [movecs] datafile is used by the Car-Parrinello task and can be generated using the v_wavefunction_initializer task."
Well, try
task v_wavefunction_initialize
and you get

>>>> PSPW Serial Module - v_wavefunction_initializer <<<<
0:Segmentation Violation error, status=: 11
(rank:0 hostname:beryllium pid:24675):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigSegvHandler():310 cond:0
Last System Error Message from Task 0:: No such file or directory
And you find that there's a file called ????

Error 4
Going full out:

task pspw wavefunction_initializer
task pspw pseudopotential_formatter
task pspw v_wavefunction_initializer
task pspw car-parrinello

 >>>> PSPW Serial Module - wavefunction_initializer <<<<
0:Floating Point Exception error, status=: 8
(rank:0 hostname:beryllium pid:26026):ARMCI DASSERT fail. ../../ga-5-1/armci/src/common/signaltrap.c:SigFpeHandler():249 cond:0
Last System Error Message from Task 0:: No such file or directory

Error 5:
If you haven't cleared out your run directory you get this via ecce
        ============ Car-Parrinello iteration ==============
     >>>  ITERATION STARTED AT Thu May 24 18:03:07 2012  <<<
    iter.         KE+Energy             Energy        KE_psi        KE_ion   Temperature
      10  -0.1662131203E+02  -0.1662582428E+02   0.43690E-02   0.39005E-02        143.80

-----ECCE Log Information-----
Starting Job: Thu May 24 18:00:41 EST 2012

and this if you run in the terminal

At line 847 of file cpmdv5.F (unit = 31, file = './cpmd_test.emotion')
Fortran runtime error: Sequential READ or WRITE not allowed after EOF marker, possibly use REWIND or BACKSPACE
      10   0.1175850345E+06   0.3940717762E+03   0.18793E-03   0.27950E-02       1852.03
mpirun has exited due to process rank 0 with PID 26303 on
node beryllium exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).

22 May 2012

161. Compiling Me-TV 1.4 on debian testing/wheezy

Quite some time ago me-tv crapped out on me (it's working again as of 9 Jan 2013). While good things came out of it (set up vlc with dvb -- and the image quality is much better for some reason) me-tv does the whole EPG thing really well -- and VLC doesn't really.

[Edit: note that me-tv isn't actively developed anymore: https://answers.launchpad.net/me-tv/+question/216266.]

This 'guide' will put your metv in your home directory and won't interfere with the debian package version. --prefix is your friend, always.

me-tv 1.3.7-2 

Anyway, here's how to compile me-tv v

Start here
sudo apt-get install gnome-common libglibmm-2.4-dev libxml++2.6-dev libgtkmm-2.4-dev libgconfmm-2.6-dev libunique-dev libvlc-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev libsqlite3-dev libdbus-glib-1-dev

You'll probably need more than what I've shown above -- intltool, automake etc. But those packages were the ones that were missing on my particular system. As always, if a package is missing, do
aptitude search package|grep dev
and chances are that you find what you're looking for

mkdir ~/tmp
cd ~/tmp
wget https://launchpad.net/me-tv/1.4/1.4.0/+download/me-tv-
tar xvf me-tv-
cd me-tv-
./autogen.sh --prefix=/home/${USER}/.metv-

Configure summary:
        Source code location .......: .
        Compiler ...................: gcc
        Compiler flags .............:  -O0 -g
        Enable compile warnings ....: minimum
        Enable more warnings .......: yes
        Extra Compiler Warnings ....: -g -O2 -Wall -Wno-unused  -Wextra -Wcast-align -Wcast-qual -Wcomment -Wformat -Wimplicit -Wmissing-braces -Wpacked -Wparentheses -Wpointer-arith -Wreturn-type -Wsequence-point -Wstrict-aliasing -Wstrict-aliasing=2 -Wswitch-default -Wundef
        Debug support ..............: yes
        Installing into prefix .....: /home/me/.metv-
Type make to build Me TV

Now type `make' to compile Me TV

Who am I to argue with that?
make install 

Add ~/.metv- to your PATH
echo 'export PATH=$PATH:/home/${USER}/.metv-'>>~/.bashrc

(important that you use ' and not " )



If you get
2012-05-22 17:55:00: Me TV Server started
2012-05-22 17:55:00: An unhandled exception was generated
2012-05-22 17:55:00: Error: The Me TV database version does not match the Me TV server version.

then you may shed a tear and
rm /home/${USER}/.local/share/me-tv/me-tv.db

160. Compiling kernel 3.4 on debian

The steps are the usual ones. At this point compiling your kernel is perhaps more of a hobby than a necessity to most people, unless you happen to have some fancy piece of hardware that's about to become supported.

It's not difficult, so there's no reason not to give it a spin.

UPDATE 9/7: Works fine with 3.4.4 as well (as it should). Compile time with -j7 on AMD X6 1055 is

real 33m27.472s
user 84m7.295s
sys 15m58.668s

which is underwhelming.

-- Start Here --
sudo apt-get install kernel-package fakeroot build-essential

mkdir ~/tmp
cd ~/tmp
wget http://www.kernel.org/pub/linux/kernel/v3.0/linux-3.4.tar.bz2
tar xvf linux-3.4.tar.bz2
cd linux-3.4/
cat /boot/config-`uname -r`>.config
make oldconfig

If your current kernel is 3.3.5 the questions that await are given at the bottom of this post with links to descriptions of the different options. As usual, if in doubt, just hit enter.

make-kpkg clean

Building takes ages (depending on number of cores committed), so don't launch it at 4 pm on a Friday if you need to shut down your computer before going home... As usual, use the -jX switch for parallel builds, where X is the number of cores+1 (i.e. 4 cores => -j5)

The following command goes on a single line
time fakeroot make-kpkg -j5 --initrd --revision=3.4.0 --append-to-version=-amd64 kernel_image kernel_headers

Once the build is done, move the .deb files out of the way and to your linux-3.4 directory for safe-keeping
 mv ../*3.4.0*.deb .
sudo dpkg -i *.deb


The image weighs in at about 33 Mb and the headers at 7.6 Mb
And compile time with 4 out of 6 cores?  Well, not too bad:

real    34m51.027s
user    73m35.644s
sys     15m9.169s

Boottime Graphics Resource Table support (ACPI_BGRT) [N/m/y/?] (NEW)
      Default ASPM policy
      > 1. BIOS default (PCIEASPM_DEFAULT) (NEW)
        2. Powersave (PCIEASPM_POWERSAVE) (NEW)
        3. Performance (PCIEASPM_PERFORMANCE) (NEW)
      choice[1-3]: 1
Enable PCI resource re-allocation detection (PCI_REALLOC_ENABLE_AUTO) [N/y/?] (NEW)
x32 ABI for 64-bit mode (EXPERIMENTAL) (X86_X32) [N/y/?] (NEW) See also cateee
Connection tracking timeout (NF_CONNTRACK_TIMEOUT) [N/y/?] (NEW)
Connection tracking timeout tuning via Netlink (NF_CT_NETLINK_TIMEOUT) [N/m/?] (NEW)
LOG target support (NETFILTER_XT_TARGET_LOG) [N/m/?] (NEW) M
Plug network traffic until release (PLUG) (NET_SCH_PLUG) [N/m/y/?] (NEW)
 PEAK PCAN-ExpressCard Cards (CAN_PEAK_PCIEC) [Y/n/?] (NEW)
PEAK PCAN-USB/USB Pro interfaces (CAN_PEAK_USB) [N/m/?] (NEW)
 Support for DiskOnChip G4 (EXPERIMENTAL) (MTD_NAND_DOCG4) [N/m/?] (NEW)
Universal Flash Storage host controller driver (SCSI_UFSHCD) [N/m/?] (NEW)
virtio-scsi support (EXPERIMENTAL) (SCSI_VIRTIO) [N/m/?] (NEW)
 Verity target support (EXPERIMENTAL) (DM_VERITY) [N/m/?] (NEW)
Solarflare SFC9000-family hwmon support (SFC_MCDI_MON) [Y/n/?] (NEW)
Solarflare SFC9000-family SR-IOV support (SFC_SRIOV) [Y/n/?] (NEW)
 Drivers for the AMD PHYs (AMD_PHY) [N/m/?] (NEW)
QMI WWAN driver for Qualcomm MSM based 3G and LTE modems (USB_NET_QMI_WWAN) [N/m/?] (NEW)
support MFP (802.11w) even if uCode doesn't advertise (IWLWIFI_EXPERIMENTAL_MFP) [N/y/?] (NEW)
Additional debugging output (RTLWIFI_DEBUG) [Y/n] (NEW)
TI OMAP4 keypad support (KEYBOARD_OMAP4) [N/m/y/?] (NEW)
Synaptics USB device support (MOUSE_SYNAPTICS_USB) [N/m/y/?] (NEW)
Cypress TTSP touchscreen (TOUCHSCREEN_CYTTSP_CORE) [N/m/y/?] (NEW) 
Ilitek ILI210X based touchscreen (TOUCHSCREEN_ILI210X) [N/m/?] (NEW)
Xen Hypervisor Multiple Consoles support (HVC_XEN_FRONTEND) [Y/n/?] (NEW)
HSI support (HSI) [N/m/y/?] (NEW)
Intel PCH EG20T as PTP clock (PTP_1588_CLOCK_PCH) [N/m/?] (NEW) 
Dallas 2781 battery monitor chip (W1_SLAVE_DS2781) [N/m/?] (NEW) 
 2781 battery driver (BATTERY_DS2781) [N/m/?] (NEW)
Summit Microelectronics SMB347 Battery Charger (CHARGER_SMB347) [N/m/?] (NEW) 
Microchip MCP3021 (SENSORS_MCP3021) [N/m/?] (NEW) 
TPS65217 Power Management / White LED chips (MFD_TPS65217) [N/m/?] (NEW)
  TI TPS62360 Power Regulator (REGULATOR_TPS62360) [N/m/?] (NEW) 
  GPIO IR remote control (IR_GPIO_CIR) [N/m/?] (NEW) 
 Keene FM Transmitter USB support (USB_KEENE) [N/m/?] (NEW)
AzureWave 6007 and clones DVB-T/C USB2.0 support (DVB_USB_AZ6007) [N/m/?] (NEW) 
Realtek RTL28xxU DVB USB support (DVB_USB_RTL28XXU) [N/m/?] (NEW)
Allow to specify an EDID data set instead of probing for it (DRM_LOAD_EDID_FIRMWARE) [N/y/?] (NEW)
  DisplayLink (DRM_UDL) [N/m/?] (NEW)
Intel740 support (EXPERIMENTAL) (FB_I740) [N/m/y/?] (NEW) 
Exynos Video driver support (EXYNOS_VIDEO) [N/y/?] (NEW)
Backlight driver for TI LP855X (BACKLIGHT_LP855X) [N/m/?] (NEW)
Saitek non-fully HID-compliant devices (HID_SAITEK) [N/m/?] (NEW)
TiVo Slide Bluetooth remote control support (HID_TIVO) [N/m/?] (NEW)
 Generic OHCI driver for a platform device (USB_OHCI_HCD_PLATFORM) [N/y/?] (NEW) 
Generic EHCI driver for a platform device (USB_EHCI_HCD_PLATFORM) [N/y/?] (NEW)
USB Fintek F81232 Single Port Serial Driver (USB_SERIAL_F81232) [N/m/?] (NEW) 
USB Metrologic Instruments USB-POS Barcode Scanner Driver (USB_SERIAL_METRO) [N/m/?] (NEW)
 LED support for PCA9633 I2C chip (LEDS_PCA9633) [N/m/?] (NEW)
Xen ACPI processor (XEN_ACPI_PROCESSOR) [M/n/?] (NEW) 
Memory allocator for compressed pages (ZSMALLOC) [M/y/?] (NEW) 
 Intel Management Engine Interface (Intel MEI) (INTEL_MEI) [N/m/y/?] (NEW) 
USB over WiFi Host Controller (USB_WPAN_HCD) [N/m/?] (NEW) 
Apple Gmux Driver (APPLE_GMUX) [N/m/y/?] (NEW) 
QNX6 file system support (read only) (QNX6FS_FS) [N/m/y/?] (NEW) 
 NFSv4.1 Implementation ID Domain (NFS_V4_1_IMPLEMENTATION_ID_DOMAIN) [kernel.org] (NEW) 
RPC: Enable dprintk debugging (SUNRPC_DEBUG) [N/y/?] (NEW) 
Print additional diagnostics on RCU CPU stall (RCU_CPU_STALL_INFO) [N/y/?] (NEW)
Yama support (SECURITY_YAMA) [N/y/?] (NEW)
Camellia cipher algorithm (x86_64) (CRYPTO_CAMELLIA_X86_64) [N/m/y/?] (NEW)
CRC32 perform self test on init (CRC32_SELFTEST) [N/y/?] (NEW) 
 CRC32 implementation
  > 1. Slice by 8 bytes (CRC32_SLICEBY8) (NEW)
    2. Slice by 4 bytes (CRC32_SLICEBY4) (NEW)
    3. Sarwate's Algorithm (one byte at a time) (CRC32_SARWATE) (NEW)
    4. Classic Algorithm (one bit at a time) (CRC32_BIT) (NEW)

Links to this post:

159. PES scanning of methanol bonds, angles, torsion using nwchem, nwgeom and python

NOTE: there's something dodgy with the potential/bond length plots -- they optimal bond lengths are way too long. I'll leave this post up here anyway, but be WARNED.

This is more of an overview of an idea of how to do it together with some explicit examples. This is more of a sketch than a step-by-step account.

Today's molecule is Methanol.

0. ecce and nwgeom/python
You need to set PYTHONPATH to /opt/nwchem/nwchem-6.1/contrib/python in order for nwchem to find the nwgeom module. That's easy enough on a local system since ~/.bashrc is read -- but it won't read ~/.bashrc on remote systems. For this you need to edit your CONFIG.<machine> files (in ~/.ECCE on your main node)  -- add
setup {
     setenv PYTHONPATH /opt/nwchem/nwchem-6.1/contrib/python

NWChemEnvironment {
          PYTHONPATH /opt/nwchem/nwchem-6.1/contrib/python

1. Draw the molecule,
Draw the Carbon, then the oxygen, then the protons on the carbon, then the protons on the oxygen. Basically, draw the backbone first, then add protons by hand.  Turn it into a residue-based system (under 'build') and optimise the structure using e.g. RHF/6-31G* (this was written with MM/FF parametrisation in mind -- for simple scanning, just do whatever you want )

2. Calculate the partial charges (rhf/6-31g*).  Can skip this
You can e.g. constrain the methyl groups, or force all the methyl protons to be equal. It's a bit of a soft science, really. After the calc has finished, assign charges. I can't claim to understand which method is better (RESP, CRESP, CRESP2 etc.).

 (this was written with MM/FF parametrisation in mind -- for simple scanning, skip this step)

Here's CRESP2 (some variability...):
C -0.721
O -0.368
H 0.240
H 0.240
H 0.240
H 0.368

Also, assign (atom) Types (CT, OH, HC, HC, HC, HO) -- this is done by hand. Pick atom table, select residue view (or something similar), and fill in the Type column.

Then click on Tools, check Residue table, click on the menu-looking icon in the residue view, and select write fragment. Make sure you put the fragment file in a place where ecce and nwchem will find it (e.g. amber_u). For some reason I can't get ecce to actually change the name of my residues, so edit the fragment file by hand and change all instance of UNK to the same name as the fragment file, e.g. TST if you called it TST.frg.

3. Write down the bonds, angles and dihedral angles.
H-C 1.087
C-O 1.400
O-H 0.946

H-O-C 109.467
O-C-H 112.039
H-C-H 108.682

H-C-O-H -61.229

When you scan the parameters using python you want to be able to
1) see if the lowest energy conf make sense and
2) not deviate too much from the ideal angle/bond/torsion.

Things get weird if you do.

4. Try to determine bond strength
This is best done outside of ecce, and you really should have compiled nwchem with python-support to make this easier.

Copy the input file you used for the ESP calc. Call it bonds.nw. Remove anything about esp and all task directives, then add:

Technically I think the bond strengths only really need two data points unless you want to fit the Lennard-Jones equation to them, but it certainly looks neater getting the full behaviour.

Then run using
mpirun -n 2 nwchem bonds.nw | tee bond12.nw

You can also do it in ecce -- but the plotting will have to be done by hand (I open the out file with vim, select the energy/atom-distance columns, :w angle.dat and then plot with gnuplot)

Do the same for atom pair 1-6 and 2-3. Make sure to pull the atoms far enough apart that the energy tails out (the 1-6 pair in the figure needs to be separated more)

5. Angles
As you'll discover, it's not just a matter of throwing in random numbers and scanning -- if you don't collect enough points, or if your first point is far away from the optimal angle, the data will look very odd.

6. Torsion

21 May 2012

158. Setting up ecce with qsub at An Australian University computational cluster

EDIT: this works for G09 on that particular cluster. Come back in a week or two for a more general solution (end of May 2012/beginning of June 2012).

I don't feel comfortable revealing where I work. But imagine that you end up working at an Australian University in, say, Melbourne. I do recognise that I will be giving enough information here to make it possible to identify  who I am (and there are many reasons not to want to be identifiable -- partly because students can be mean and petty, and partly because I suffer from the delusion that IT rules apply to Other People, and not me -- and have described ways of doing things you're not supposed to be doing in this blog)


My old write-ups of ecce are pretty bad, if not outright inaccurate. Anyway, I presume that in spite of that you've managed to set up ECCE well enough to run stuff on nodes of your local cluster.

Now it's time for the next level -- on a remote site using SGE/qsub

So far I've only tried this out with G09 -- they are currently looking to set up nwchem on the university cluster. Not sure what the best approach to the "#$ -pe g03_smp2 2" switch is for nwchem.


EVERYTHING I DESCRIBE IS DONE ON YOUR DESKTOP, NOT ON THE REMOTE SYSTEM. Sorry for shouting, but don't got a-messing with the remote computational cluster -- we only want to teach ecce how to submit jobs remotely. The remote cluster should be unaffected.

1. Creating the Machine
To set up a site with a queue manager, start
ecce -admin

Do something along the lines of what's shown in the figure above.

If you're not sure whether your qsub belongs to PBS or SGE, type qstat -help and look at the first line returned, e.g. SGE 6.2u2_1.

2. Configure the site
Now, edit your ecce-6.3/apps/siteconfig/CONFIG.msgln4  (local nodes go into ~/.ECCE  but remote SITES go in apps/siteconfig --  and that's what we're working with here).

   NWChem: /usr/local/bin/NWCHEM
   Gaussian-03: /usr/local/bin/G09
   perlPath: /usr/bin/perl
   qmgrPath: /usr/bin/qsub
   SGE {
   #$ -S /bin/csh
   #$ -cwd
   #$ -l h_rt=$wallTime
   #$ -l h_vmem=4G
   #$ -j y
   #$ -pe g03_smp2 2

   module load gaussian/g09
A word of advice -- open the file in vim (save using :wq!) or do a chmod +w on it first since it will be set to read-only by default.

3. Queue limits
The same goes for the next file, which controls various job limits, ecce-6.3/apps/siteconfig/msgln4.Q:
# Queue details for msgln4
Queues:    squ8

squ8|minProcessors:       2
squ8|maxProcessors:       6
squ8|runLimit:       4320
squ8|memLimit:       4000
squ8|scratchLimit:       0
4. Connect
In the ecce launcher-mathingy click on Machine Browser, and Set Up Remote Access for the remote cluster. Basically, type in your user name and password.

Click on machine status to make sure that it's connecting

5.Test it out!
If all is well you should be good to go