31 July 2013

486. MS data, part IV: Making a stacked spectrum plot using gnuplot

In part 1, I showed you how to export MS data (MS= Mass Spectrometer/ry): http://verahill.blogspot.com.au/2013/07/474-exporting-data-from-wsearch32-and.html

In part 2, I showed you how to generate theoretical isotopic envelopes, and how to compare them with observed ones: http://verahill.blogspot.com.au/2013/07/480-ms-data-part-ii-plotting-and.html

In part 3, I showed you how to make contour plots of '3D' data -- in that particular case the extra dimension was cone voltage, but it could just as easily have been time: http://verahill.blogspot.com.au/2013/07/ms-data-part-iii-generating-matrix-by.html

In part 4, we'll use the same data as in part 3, but we'll make a stacked plot of it. This is a short post.

There is at least one other way of making a stacked plot in gnuplot, but it doesn't yield what I'd consider to be publication-quality plots. It's also fairly restrictive in the type of data that can be plotted. The method shown here is general and applicable to all types of data. You can e.g. use it for time-dependent NMR data...

Here's an example of a gnuplot script:
### Preamble start
set term png size 1000,500
set output 'stack.png'

set border 1
set xtics nomirror
set ytics nomirror
unset ytics

set xrange [100:3000]
set yrange [0:100]

set multiplot
set size 0.85,0.45
unset key

### Preamble over

### The stacked plot
set origin 0,0
set label "m/z" at 1500, -18
plot '0.dat' with lines lt -1

# we want the x axis to show ONLY for the first spectrum,
# so we turn it off for the remaining spectra.
# The same goes for our label (xlabel can be tricky 
# to position correctly)
unset label
set border 0
unset xtics
unset ytics

# Here we set the horizontal position of the second spectrum (fx),
# the horizontal shift between successive spectra (f),
# the vertical position of the second spectrum (gy),
# and the vertical shift between successive spectra (g)

f=0.0025
fx=0.00
g=0.01
gy=0.02

# then we plot all the other spectra
set origin fx+0*f,gy+0*g
plot '10.dat' with lines lt -1

set origin fx+1*f,gy+1*g
plot '20.dat' with lines lt -1

set origin fx+2*f,gy+2*g
plot '30.dat' with lines lt -1

set origin fx+3*f,gy+3*g
plot '40.dat' with lines lt -1

set origin fx+4*f,gy+4*g
plot '50.dat' with lines lt -1

set origin fx+5*f,gy+5*g
plot '60.dat' with lines lt -1

set origin fx+6*f,gy+6*g
plot '70.dat' with lines lt -1

set origin fx+7*f,gy+7*g
plot '80.dat' with lines lt -1

set origin fx+8*f,gy+8*g
plot '90.dat' with lines lt -1

set origin fx+9*f,gy+9*g
plot '100.dat' with lines lt -1

set origin fx+10*f,gy+10*g
plot '110.dat' with lines lt -1

set origin fx+11*f,gy+11*g
plot '120.dat' with lines lt -1

set origin fx+12*f,gy+12*g
plot '130.dat' with lines lt -1

set origin fx+13*f,gy+13*g
plot '140.dat' with lines lt -1

set origin fx+14*f,gy+14*g
plot '150.dat' with lines lt -1

set origin fx+15*f,gy+15*g
plot '160.dat' with lines lt -1

set origin fx+16*f,gy+16*g
plot '170.dat' with lines lt -1

set origin fx+17*f,gy+17*g
plot '180.dat' with lines lt -1

set origin fx+18*f,gy+18*g
plot '190.dat' with lines lt -1

set origin fx+19*f,gy+19*g
plot '200.dat' with lines lt -1

set origin fx+20*f,gy+20*g
plot '210.dat' with lines lt -1

set origin fx+21*f,gy+21*g
plot '220.dat' with lines lt -1

set origin fx+22*f,gy+22*g
plot '230.dat' with lines lt -1

set origin fx+23*f,gy+23*g
plot '240.dat' with lines lt -1

set origin fx+24*f,gy+24*g
plot '250.dat' with lines lt -1

set origin fx+25*f,gy+25*g
plot '260.dat' with lines lt -1

set origin fx+26*f,gy+26*g
plot '270.dat' with lines lt -1

set origin fx+27*f,gy+27*g
plot '280.dat' with lines lt -1

set origin fx+28*f,gy+28*g
plot '290.dat' with lines lt -1

set origin fx+29*f,gy+29*g

plot '300.dat' with lines lt -1
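
The thirty 'set origin'/'plot' pairs above follow a simple pattern, so they can be generated rather than typed out. Here is a minimal Python sketch of that idea. The function name stack_commands is made up for illustration, and it writes out the computed numbers instead of the fx+n*f expressions, so treat it as a starting point rather than a drop-in replacement:

```python
def stack_commands(voltages, f=0.0025, fx=0.0, g=0.01, gy=0.02):
    """Return the gnuplot 'set origin'/'plot' lines for the stacked spectra."""
    lines = []
    for n, voltage in enumerate(voltages):
        # n-th stacked spectrum: shifted right by n*f and up by n*g
        lines.append("set origin %g,%g" % (fx + n * f, gy + n * g))
        lines.append("plot '%d.dat' with lines lt -1" % voltage)
        lines.append("")
    return lines


if __name__ == "__main__":
    # 10.dat ... 300.dat; 0.dat is plotted separately, with its x axis
    print("\n".join(stack_commands(range(10, 310, 10))))
```

Paste the output after the preamble and the 0.dat plot, in place of the hand-written block.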




and here's the plot:

If you want to make it really fancy, try this:
### Preamble start
set term png size 1000,800
set output 'stack.png'

set border 1
set xtics nomirror
set ytics nomirror
unset ytics

set xrange [100:3000]
set yrange [0:100]

set multiplot
set size 0.85,0.25
unset key

### Preamble over

### The stacked plot
set origin 0,0
set label "m/z" at 1500, -18
plot '0.dat' with lines lt -1

# we want the x axis to show ONLY for the first spectrum,
# so we turn it off for the remaining spectra.
# The same goes for our label (xlabel can be tricky 
# to position correctly)
unset label
set border 0
unset xtics
unset ytics

# Here we set the horizontal position of the second spectrum (fx),
# the horizontal shift between successive spectra (f),
# the vertical position of the second spectrum (gy),
# and the vertical shift between successive spectra (g)

f=0.0025
fx=0.00
g=0.01
gy=0.02

# then we plot all the other spectra
set origin fx+0*f,gy+0*g
plot '10.dat' with lines lt -1

set origin fx+1*f,gy+1*g
plot '20.dat' with lines lt -1

set origin fx+2*f,gy+2*g
plot '30.dat' with lines lt -1

set origin fx+3*f,gy+3*g
plot '40.dat' with lines lt -1

set origin fx+4*f,gy+4*g
plot '50.dat' with lines lt -1

set origin fx+5*f,gy+5*g
plot '60.dat' with lines lt -1

set origin fx+6*f,gy+6*g
plot '70.dat' with lines lt -1

set origin fx+7*f,gy+7*g
plot '80.dat' with lines lt -1

set origin fx+8*f,gy+8*g
plot '90.dat' with lines lt -1

set origin fx+9*f,gy+9*g
plot '100.dat' with lines lt -1

set origin fx+10*f,gy+10*g
plot '110.dat' with lines lt -1

set origin fx+11*f,gy+11*g
plot '120.dat' with lines lt -1

set origin fx+12*f,gy+12*g
plot '130.dat' with lines lt -1

set origin fx+13*f,gy+13*g
plot '140.dat' with lines lt -1

set origin fx+14*f,gy+14*g
plot '150.dat' with lines lt -1

set origin fx+15*f,gy+15*g
plot '160.dat' with lines lt -1

set origin fx+16*f,gy+16*g
plot '170.dat' with lines lt -1

set origin fx+17*f,gy+17*g
plot '180.dat' with lines lt -1

set origin fx+18*f,gy+18*g
plot '190.dat' with lines lt -1

set origin fx+19*f,gy+19*g
plot '200.dat' with lines lt -1

set origin fx+20*f,gy+20*g
plot '210.dat' with lines lt -1

set origin fx+21*f,gy+21*g
plot '220.dat' with lines lt -1

set origin fx+22*f,gy+22*g
plot '230.dat' with lines lt -1

set origin fx+23*f,gy+23*g
plot '240.dat' with lines lt -1

set origin fx+24*f,gy+24*g
plot '250.dat' with lines lt -1

set origin fx+25*f,gy+25*g
plot '260.dat' with lines lt -1

set origin fx+26*f,gy+26*g
plot '270.dat' with lines lt -1

set origin fx+27*f,gy+27*g
plot '280.dat' with lines lt -1

set origin fx+28*f,gy+28*g
plot '290.dat' with lines lt -1

set origin fx+29*f,gy+29*g

plot '300.dat' with lines lt -1

### second plot
set size 0.75,0.4
set origin 0.1,0.6
set xrange [800:1800]
set border 3
set xtics nomirror
set ytics nomirror
set xlabel 'm/z'
set ylabel 'Relative abundance (%)'
set title '20 V'

plot '20.dat' with lines lt -1

unset multiplot

which looks like this:

28 July 2013

485. Arch linux - kernel 3.10 issues? Won't boot/no network.

Note: I won't show any fixes in this post. This is my farewell to Arch.

I did a full upgrade of my Arch system (amd athlon II x3 with an nvidia gf210 card) earlier today. Among the packages was kernel 3.10-2 (I think).

Turning on the computer this evening I'd get to the gdm log in screen -- with an unresponsive keyboard and mouse. I also can't log in via ssh from another computer, so the network seems not to work.

I can, however, boot using the fallback option in grub. The network still isn't working though. I use wicd. If I try to connect using wicd the system freezes and crashes.

The closest I can find is this: https://bbs.archlinux.org/viewtopic.php?id=167090 -- but I use BIOS, not EFI. And there are plenty of other differences. It does, however, seem to have to do with the new kernel.

And systemd makes troubleshooting this worse than necessary since the logs are binary. Seriously, the use of plain text files in linux for configuration and logs is one of the MAIN defining features of the OS.

Anyway, Arch is a fine OS. But I've been burned a few times too many now, and I'm no longer young, with plenty of time and enthusiasm for this kind of stuff. I've just started installing debian instead...and for once I'm going straight for stable...

A contributing factor is that we use this computer as the family one, for watching tv, listening to music etc. And sitting down on a Sunday evening to relax and finding that the damned system won't even boot isn't a good way to end the weekend.

Anyway, if you are looking for a fix, all I can give you is my gut feeling -- downgrade. The old packages are in /var/cache/pacman/pkg

Otherwise, if you have a copy of the kernel sources, compile your own kernel and install it as shown e.g. here:
http://verahill.blogspot.com.au/2013/03/355-kernel-382-on-arch-linux-exploration.html

Why download the source and copy it? Because (imho) it's easier than fiddling with packages -- you can put the sources on a usb stick and copy them over. But good luck, whatever method you try. I'm switching my second to last non-debian computer back to my main OS (i.e. debian). I've still got a Scientific Linux box, which I haven't booted for well over a month now...

It's 'funny' this should come less than 24 hours after me nit-picking about the use of 'stable' to describe Arch...http://news.softpedia.com/news/Arch-Linux-Is-the-First-Stable-Distro-with-Linux-Kernel-3-10-371457.shtml



26 July 2013

484. Putting Tomato (USB) on Cisco/Linksys E2500-AU 300M

Update 18/8/2014: I've since done this on a unit with a BCM5357 chip rev 2 pkg 8 as well:


Update: the more I use it, the more I like it. I really like my old WRT54G, but I'm even happier about my fancy new E2500 since it's faster and all. Flashing them is equally easy. If you prefer someone else to do it for you -- or if you want to seek independent confirmation that the router can be flashed -- look at http://flashrouters.com/ . The focus on that site seems to be dd-wrt (which is an alternative to Tomato), but they do list tomato routers too, e.g. http://www.flashrouters.com/routers/cisco-linksys-e2500-tomatousb-router

Original post:
Flashing a router is always a bit unsettling, so here's a detailed how-to.

Anyway, I managed to pick up a Linksys E2500-AU for $45 (Broadcom BCM5357 chip rev 1 pkg 8), which isn't too shabby. Some cursory searching showed that people had managed to put dd-wrt and tomato on it. While my experience with dd-wrt hasn't been that good, I've been running tomato on a linksys WRT54G for around four years now, without any issues.

There's a number of derivatives of Tomato e.g. Tomato USB, and I'm a bit confused over what sets some of them apart. However, it seems like this Polish site is the right one for me. See here for the 'about' page.

What I'm presuming:
That you are running linux, and that you can afford to brick your router. There is always a risk associated with flashing firmware, and don't make any assumptions about the validity of the warranty...

How to:
Download the firmware:
cd ~/Downloads
wget http://tomato.groov.pl/download/K26RT-N/build5x-110-EN/Linksys%20E-series/tomato-E2500-NVRAM60K-1.28.RT-N5x-MIPSR2-110-Max.bin

With the router turned off, connect it via CAT5 cable to your computer. It should be attached to one of the LAN ports (in my case Ethernet 4) on the router. Ignore what the manual says about plugging into the WAN ('Internet') port.

Plug in power cable to the router.

On my computer I've disabled network manager (sudo rcconf, then uncheck network manager and either stop it or reboot) and my /etc/network/interfaces has this in it:
auto eth1
iface eth1 inet dhcp
ethernet-wol g
You probably won't have to worry about this. Just make sure that you don't have anything interfering with the 192.168.1.0/24 subnet.
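
If you're unsure whether an address that one of your interfaces already uses falls inside that subnet, a quick check is easy to script. This is just a sketch using Python 3's standard ipaddress module. The helper name in_subnet and the example addresses are made up:

```python
import ipaddress


def in_subnet(address, subnet="192.168.1.0/24"):
    """True if the address lies inside the given subnet."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(subnet)


print(in_subnet("192.168.1.101"))  # True
print(in_subnet("192.168.2.15"))   # False
```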

Anyway, once you have been assigned an IP address, navigate to 192.168.1.1, and work your way through the annoying warnings:


leave the user name blank, and use admin as the password



 Here's where it gets interesting. Select the .bin file you downloaded and hit ok.



Use admin for both username and password
 I was then met with this, which at first scared me a little:

I unplugged the power from the router, and plugged it in again, and I could log in:

The first step is to erase the nvram, or you might end up with "Cannot proceed: two or more lan bridges have conflicting ip addresses or overlapping subnets" when configuring your network. To erase NVRAM go to Administration/Configuration/Restore Default Configuration - Erase all data in NVRAM memory(thorough).

 You are now ready to start setting up your router.

Go to Administration/Admin access.

Set up an admin password, turn off telnet and change the colour scheme to Tomato. Optional but recommended: disable ssh access via password -- it's better if you add your public keys here.

Go to Basic/Network, and set an SSID and a password for your wireless. Set up your network details -- in my case I have static IP. I also want the subnet to be 192.168.2.0/24 and I use MAC spoofing, which you can set up under Advanced/MAC address.




 And this is what it looks like if you connect to the router via ssh:
So far so good!

23 July 2013

483. MS data, part III: generating a matrix by combining several spectra, and plotting it in gnuplot

This post is primarily intended for two particular students. However, the problem it addresses is something that a lot of spectrometrists/scopists who want to take their data presentation to the next level have encountered.

My presumption:
You're running linux.

You've already exported your data as csv files as shown in this post: http://verahill.blogspot.com.au/2013/07/474-exporting-data-from-wsearch32-and.html

In addition, for the specifics in the commands below I will presume that this data is based on a cone voltage sweep from 0 to 300 in 10 volt steps. I thus have a series of files named: 0.csv, 10.csv, 20.csv, ..., 290.csv, 300.csv.

You should be able to easily customize the approach to e.g. time or concentration dependent data.

Let's get started:
0. Pre-reqs
Make sure you have gawk, sed, xargs, gnuplot, paste and python (with numpy) installed. On debian xargs is part of findutils and paste is part of coreutils, so do
sudo apt-get install gawk sed findutils coreutils gnuplot python python-numpy

1. Convert the csv files to dat files
Create the following script and call it csv2dat.sh
#!/bin/bash
for e in 0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240 250 260 270 280 290 300
do
tail -n +8 $e.csv | sed 's/\,/\t/g'| gawk '{print $2,$4}' > $e.dat
done
and run it
sh csv2dat.sh

If all went well, you'll have a series of space-separated .dat files (gawk's print puts a single space between the two fields) which contain the m/z and the relative abundance (not absolute).
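
For reference, the same conversion can be sketched in Python. The function name csv_to_dat and the example rows are made up. The sketch also splits strictly on commas, whereas the sed/gawk pipeline splits on any whitespace after turning commas into tabs, so the two can differ if your fields contain spaces:

```python
def csv_to_dat(csv_lines, skip=7, columns=(1, 3)):
    """Drop the header lines and keep two comma-separated columns per row."""
    rows = []
    for line in csv_lines[skip:]:
        fields = line.strip().split(",")
        # rows that are too short (e.g. blank lines) are ignored
        if len(fields) > max(columns):
            rows.append(" ".join(fields[c].strip() for c in columns))
    return rows


if __name__ == "__main__":
    # made-up example: seven header lines followed by two data rows
    example = ["header"] * 7 + ["1,100.1,5000,12.5", "2,100.2,40000,100.0"]
    print("\n".join(csv_to_dat(example)))
```

Here skip=7 corresponds to tail -n +8, and columns (1, 3) are the zero-based equivalents of gawk's $2 and $4.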

2. Extract ALL m/z values from all files
Create a file called homogenize.sh and put the following in it:
#!/bin/bash
for e in {0..200..10} {210..300..10}
do
cat $e.dat | gawk '{print $1}'
echo ""
done
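We'll run the homogenize script (it does nothing of the sort though), and then use the unix tools sort and uniq to sort the m/z values in reverse numerical order and to get rid of all duplicates. uniq only removes adjacent duplicates, so the sorting has to come first. The script relies on brace expansion, so run it with bash rather than sh (which is dash on debian):
bash homogenize.sh > allmz.dat
sort -gr allmz.dat > temp.dat
uniq temp.dat mz.dat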
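
One thing to keep in mind about uniq is that it only collapses runs of adjacent identical lines, so it removes every repeat only if the input is already sorted. A small Python sketch of that behaviour follows. The uniq function is a stand-in written for illustration and the m/z values are made up:

```python
from itertools import groupby


def uniq(values):
    """Mimic the uniq tool: collapse runs of adjacent equal values."""
    return [key for key, _ in groupby(values)]


mz = [100.1, 100.2, 100.1, 100.3, 100.2]
print(uniq(mz))                        # unsorted input: nothing is removed
print(uniq(sorted(mz, reverse=True)))  # [100.3, 100.2, 100.1]
```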

3. Pad the data with zeroes
Create a file called makelist.py, and put the following in it. Watch out for the indentation. It was originally written for python 2.x; the version below sticks to syntax that python 3 should also accept. It needs numpy. It was also hacked together from an earlier script which didn't quite work the way I hoped it would.

#!/usr/bin/env python
import sys
from numpy import linspace

infile = sys.argv[1]

# Read every m/z value; blank or non-numeric lines are skipped
arr = []
print("Read %s" % infile)
f = open(infile, 'r')
for line in f:
    try:
        arr.append(round(float(line.strip()), 3))
    except ValueError:
        pass
f.close()

mylist = sorted(arr, reverse=True)

# The spacing is the smallest non-zero gap between neighbouring m/z values
print("Calculating spacing")
spacing = 1.0
old = max(mylist)
for mz in mylist:
    gap = round(abs(old - mz), 3)
    if 0 < gap < spacing:
        spacing = gap
    old = mz

values = int(round(1 + (max(mylist) - min(mylist)) / spacing))
print("Max, min, resolution: %s %s %s" % (max(mylist), min(mylist), spacing))

# Evenly spaced m/z axis, rounded so that it can be matched against the data
mylist = [round(x, 3) for x in linspace(max(mylist), min(mylist), values)]
position = dict((mz, i) for i, mz in enumerate(mylist))

voltages = range(0, 310, 10)

for n in voltages:
    print("voltage: %d" % n)
    # Start from zeroes for every voltage so that no intensities carry over
    myys = [0] * len(mylist)

    f = open(str(n) + '.dat', 'r')
    for line in f:
        fields = line.split()
        try:
            x = round(float(fields[0]), 3)
            y = float(fields[1])
        except (IndexError, ValueError):
            continue
        if x in position:
            myys[position[x]] = y
    f.close()

    g = open(str(n) + 'pad.dat', 'w')
    for y in myys:
        g.write(str(y) + '\n')
    g.close()

h = open('mz.x', 'w')
for mz in mylist:
    h.write(str(mz) + '\n')
h.close()

Run
python makelist.py allmz.dat

Getting 'fail' messages is ok -- most likely it's due to an empty line. You can check that everything worked out by doing e.g.
wc 0pad.dat 220pad.dat

The numbers in the first column should be the same if the files have the same number of lines.
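
To make the padding step concrete, here is a toy version of it in Python: each spectrum is placed on a common m/z grid, and grid points without a measurement are set to 0. The function name pad_spectrum and the numbers are made up for illustration. It is a sketch of the idea, not a replacement for makelist.py:

```python
def pad_spectrum(grid, mz, intensity):
    """Place intensities on a common m/z grid; missing points become 0."""
    position = dict((round(x, 3), i) for i, x in enumerate(grid))
    padded = [0] * len(grid)
    for x, y in zip(mz, intensity):
        i = position.get(round(x, 3))
        if i is not None:
            padded[i] = y
    return padded


grid = [100.3, 100.2, 100.1, 100.0]
print(pad_spectrum(grid, [100.2, 100.0], [55.0, 12.0]))  # [0, 55.0, 0, 12.0]
```

Every padded spectrum has exactly as many entries as the grid, which is why the line counts should agree.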

4. Make a matrix
Paste all the ms data side-by-side.
paste 0pad.dat 10pad.dat 20pad.dat 30pad.dat 40pad.dat 50pad.dat 60pad.dat 70pad.dat 80pad.dat 90pad.dat 100pad.dat 110pad.dat 120pad.dat 130pad.dat 140pad.dat 150pad.dat 160pad.dat 170pad.dat 180pad.dat 190pad.dat 200pad.dat 210pad.dat 220pad.dat 230pad.dat 240pad.dat 250pad.dat 260pad.dat 270pad.dat 280pad.dat 290pad.dat 300pad.dat > allpad.dat

5. Rotate the matrix
Create a script called rotate.sh:
gawk '
{
    for (i=1; i<=NF; i++) {
        a[NR,i] = $i
    }
}
NF>p { p = NF }
END {
    for(j=1; j<=p; j++) {
        str=a[1,j]
        for(i=2; i<=NR; i++){
            str=str" "a[i,j];
        }
        print str
    }
}' $1
and run
sh rotate.sh allpad.dat > matrix.rot.dat
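
If the gawk one-liner looks opaque: 'rotating' here simply means transposing, so that rows become columns. A minimal Python sketch on made-up numbers follows. The function name rotate is for illustration only. Unlike the gawk script, zip truncates to the shortest row, so this assumes that all rows have the same length:

```python
def rotate(rows):
    """Transpose a list of equally long rows."""
    return [list(column) for column in zip(*rows)]


print(rotate([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
```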

6. Plot using gnuplot
See the following script for an example. Note that when plotting in gnuplot using 'matrix' you don't get the benefit of proper axis labels. Instead we do a bit of on-the-fly maths to get the axes right. Specifically:
using (2999.3-($1/10)):($2*10):($3)

means that for the m/z axis ($1, the column index, which starts at 0) we take the highest value (in our case 2999.3) and remove 0.1 m/z (our resolution) for each data point. This data is in each row. For the CV axis ($2, the row index), which goes down the columns in our matrix.rot.dat, we have thirty-one values. Each one corresponds to an increase of 10 V starting at 0 V, hence we multiply by 10. $3 is the intensity, which we don't need to fiddle with.
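
The index arithmetic can be checked outside gnuplot. The sketch below follows the form used in the script further down and assumes that the matrix indices start at 0 (which, as far as I'm aware, is how gnuplot numbers them). The function name matrix_to_axes is made up, and the defaults are the values from this particular data set:

```python
def matrix_to_axes(column, row, mz_max=2999.3, mz_step=0.1, cv_step=10):
    """Convert 0-based matrix indices to (m/z, cone voltage)."""
    return mz_max - column * mz_step, row * cv_step


print(matrix_to_axes(0, 0))    # first column, first row: highest m/z, 0 V
print(matrix_to_axes(10, 30))  # ten points (1.0 m/z) lower, at 300 V
```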

Save the following as cntr.gplt
set term png size 1000,1000
set output 'map.png'
set zrange [-10:110]
set yrange [0:300]
unset surface
set contour base
set cntrparam levels 15
set view 0,0
unset ztics
unset key
splot 'matrix.rot.dat' matrix using (2999.3-($1/10)):($2*10):($3) with lines palette

Running
gnuplot cntr.gplt

VERY SLOWLY (in my case I had 0.9 M data points) gives us

If you're confused as to why the data doesn't go beyond 250 Volt (y axis) it's because I made a mistake at one point.

Changing the ranges a bit we get
And even more zoomed in:

Soon to come as a separate post:
 the same data, but as a stacked plot. Here's what it looks like though:

482. kernel 3.10.2 with CK patch

NOTE: the 304.88 nvidia kernel modules DO NOT BUILD on this kernel. I've also tried 3.10.5 and it also does not work.

NOTE II: I'm getting random slowdowns on my SL410 laptop with intel graphics. Not sure if it's the same issue as this: http://verahill.blogspot.com.au/2013/03/368-slow-mouse-and-keyboard-triggered.html
Once kworker shows up in top everything grinds to a slow crawl.

Nothing odd here. For a list of what questions to expect when going from 3.9 to 3.10, see e.g. http://verahill.blogspot.com.au/2013/07/468-kernel-310-on-debian.html

The CK patch set supposedly improves desktop performance of the kernel. As it seems like Con doesn't update that page anymore, go directly to the patches: http://ck.kolivas.org/patches/3.0/

sudo apt-get install xz-utils kernel-package fakeroot ncurses-dev
mkdir ~/tmp
cd ~/tmp
wget https://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.2.tar.xz
tar xvf linux-3.10.2.tar.xz
cd linux-3.10.2/
wget http://ck.kolivas.org/patches/3.0/3.10/3.10-ck1/patch-3.10-ck1.bz2
bunzip2 patch-3.10-ck1.bz2
patch -p1 < patch-3.10-ck1
patching file arch/powerpc/platforms/cell/spufs/sched.c
patching file Documentation/scheduler/sched-BFS.txt
patching file Documentation/sysctl/kernel.txt
patching file fs/proc/base.c
patching file include/linux/init_task.h
patching file include/linux/ioprio.h
patching file include/linux/sched.h
patching file init/Kconfig
patching file init/main.c
patching file kernel/delayacct.c
patching file kernel/exit.c
patching file kernel/posix-cpu-timers.c
patching file kernel/sysctl.c
patching file lib/Kconfig.debug
patching file include/linux/jiffies.h
patching file drivers/cpufreq/cpufreq.c
patching file drivers/cpufreq/cpufreq_ondemand.c
patching file kernel/sched/bfs.c
patching file include/uapi/linux/sched.h
patching file include/linux/sched/rt.h
patching file kernel/stop_machine.c
patching file drivers/cpufreq/cpufreq_conservative.c
patching file kernel/sched/Makefile
patching file kernel/time/Kconfig
patching file kernel/Kconfig.preempt
patching file kernel/Kconfig.hz
patching file arch/x86/Kconfig
patching file Makefile
make-kpkg clean
cat /boot/config-`uname -r`>.config
make oldconfig
time fakeroot make-kpkg -j3 --initrd kernel_image kernel_headers
sudo dpkg -i ../*3.10.2-ck*.deb
sudo rm /lib/modules/3.10.2-ck1/build
sudo ln -s /usr/src/linux-headers-3.10.2-ck1/ /lib/modules/3.10.2-ck1/build
sudo dkms autoinstall -k 3.10.2-ck1

22 July 2013

481. A little bit of samba on the command line

I have a bit of a problem with samba currently.

My problem is that my computers are sitting behind a router (on a 192.168.2.0/24 subnet) and the computers that I want to access sit on the university network, to which the router is connected. The address range is, say, 131.172.x.x.

In other words, I (think I) want to use samba across two subnets.

I've opened up ports 137-139,445 to tcp and udp on both the router and in iptables on my desktop.

My problem:
1. I can't see the network shares of the other computers using
   a) nautilus (Network/Windows Network)
   b) nmblookup
   c) sambascanner

2. I can't connect to network shares using their netbios names. For example, I'd like to connect to e.g. smb://avance400/data, but I have to use the IP address instead. For some curious reason not even that works using nautilus.

Workaround:
So here's not a solution, but a workaround.

I can connect to other computers from the command line as long as I know the IP address, and here's how
smbclient //131.172.123.30/data -U myuni/me

If you actually want to mount the share, which is password protected, and you do, then do
sudo mount -t cifs -o user=me //131.172.123.30/data /media/smbmounts/

where /media/smbmounts belongs to you (e.g. sudo mkdir /media/smbmounts && sudo chown $USER /media/smbmounts).

And that's more or less it.

Some additional information:
If you don't get prompted for the password, and get
mount: block device //131.172.123.30/data is write-protected, mounting read-only
mount: cannot mount block device //131.172.123.30/data read-only

but supplying the password as part of the command line works, then you are missing cifs-utils, so install them.

Note that mount.cifs can handle credentials from a special file, which you chmod to 600. My chief issue with that is that ~/.bash_history has exactly the same permissions (u+rw, go-rwx) and so I don't see how that's any safer than exposing everything by supplying your password as part of the mount command. Both should be avoided if possible.

On the other hand you could argue that since the password is transmitted over the network in cleartext you're inviting trouble either way...




480. MS data, part II. Plotting and comparing with predicted isotopic envelopes

NOTE: I've heard rumours about problems with Matt Monroe's calculator on Windows 7 Home, and on Windows 8. I've heard reports of it working on Windows 7 Professional. Given that wsearch also has issues,  this may be linked to VB.

This post, like this one, is written with two particular students in mind.

MS here stands for Mass Spectrometry.

I'll be presuming that you have exported your data as a csv file as shown in http://verahill.blogspot.com.au/2013/07/474-exporting-data-from-wsearch32-and.html

Our scenario:
So you've exported your data as e.g. data.csv, and you have assigned a signal in your spectrum to a species, and you'd now like to plot the predicted and observed isotopic envelopes in a way that will help you compare them.

The signal we have identified is at 211.90 m/z and we think it belongs to [Ga(CH3OH)2(OH)(NO3)]+.


The Linux way:
You'll need: sed, gawk, gnuplot, pyisocalc or Matt Monroe's Molecular Weight calculator

1. Generating the isotopic envelope:

A. Using pyisocalc:
Set the charge to 1 and output the data to 1.dat, with a gaussian broadening factor of 0.3:
isocalc -f 'Ga(CH3OH)2(OH)(NO3)' -c 1 -o 1.dat -g 0.3

B. Using Matt Monroe's molecular weight calculator
Go to Tools/Isotopic Distribution Modelling
In the spectrum window, go to Edit, Copy Data Points, and paste into e.g. a Gedit window. Save as 1.dat


2. Formatting the data.csv for gnuplot (can skip for spreadsheet programs):
In a single line we remove the first seven lines (tail -n +8 starts printing at line 8), replace all commas (,) with tabs, only keep the m/z and relative isotopic abundance columns (2 and 4) and save the output to data.dat
tail -n +8 data.csv |sed 's/\,/\t/g'|gawk '{print $2,$4}' > data.dat

3. Plotting:

A. Using gnuplot:
Create a file called 1.gplt which contains the gnuplot commands:
set term postscript eps enhanced color
set output '1.eps'
set xrange [206:220]
plot '1.dat' u ($1-0.05):($2*0.092) w lines ti 'Calculated' lc -1 lw 2,\
'data.dat' u 1:2 w lines ti 'Observed' lc 1 lw 2
($1-0.05) means we're offsetting the calculated data by 0.05 m/z. ($2*0.092) means that we're scaling the calculated data intensity to match that of the observed. lc sets line colour and lw sets the width
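
The scaling factor has to come from somewhere. One plausible way of getting a first guess (an assumption on my part, not necessarily how 0.092 was arrived at) is to divide the tallest observed peak by the tallest calculated peak within the plotted window. A Python sketch with made-up data points and a made-up function name:

```python
def scale_factor(calculated, observed, lo, hi):
    """Tallest observed peak divided by tallest calculated peak in [lo, hi]."""
    calc_max = max(y for x, y in calculated if lo <= x <= hi)
    obs_max = max(y for x, y in observed if lo <= x <= hi)
    return obs_max / calc_max


# (m/z, intensity) pairs; the values are invented for illustration
calculated = [(211.9, 100.0), (213.9, 66.0)]
observed = [(211.9, 9.2), (213.9, 6.1), (300.0, 100.0)]
print(scale_factor(calculated, observed, 206, 220))
```

You'd still want to compare the two envelopes by eye and tweak the factor (and the m/z offset) by hand.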


If you want the output as png instead of eps, just change the first two lines to
set term png size 1000,667
set output '1.png'
Using pyisocalc
Using Matthew Monroe's calculator

B. Using QtiPlot
Qtiplot is in the debian repos and is 'origin'-like (as in Microcal Origin).

You'll need to rescale your calculated data first, which is a major drawback:
cat 1.dat|gawk '{print $1-0.05,$2*0.095}'> 1_scaled.dat

Start QtiPlot and select Open. Make sure you select 'all files' as the file type. Open 1_scaled.dat.

Next, make sure that the spreadsheet is active, and go to File, Import, Import Ascii

Change the type of column 3

Select all columns and go to Plot, Line. Change the axes (double click on the axes and set the new ranges), set the top and right axes not to show, edit the titles etc.



The Windows way: 
You'll probably need: excel or open/libreoffice, origin, pyisocalc or Matt Monroe's Molecular Weight calculator

Doing this on windows is a PITA compared to Linux, and I don't have the time to go through it. If you do have Origin, it should be straightforward to translate the instructions above into an MS Win-like environment.

Any scaling will have to be done in Excel or a similar spreadsheet program. Not difficult, but it'll add a few extra steps.

19 July 2013

479. Compiling Wine 1.6 on Debian (using a chroot)

Update:
I noticed
configure: libOSMesa 32-bit development files not found (or too old), OpenGL rendering in bitmaps won't be supported.

popping up at the end of ./configure. I've added a fix for it based on http://forum.winehq.org/viewtopic.php?f=2&t=17713

Original post:
Here's a generic way of building Wine 1.6 which is now stable. And yes, it's the instructions for 1.5.28-1.6-rcX recycled.

See here for information about 3D acceleration using libGL/U with Wine: http://verahill.blogspot.com.au/2013/05/429-briefly-wine-libglliubglu-blender.html

Getting started:
If you e.g. set up a chroot to build 1.5.28 you don't need to set up a new chroot to build 1.6. In that case, skip the set-up step below and instead re-enter your existing chroot like this:

sudo mount -o bind /proc wine32/proc
sudo cp /etc/resolv.conf wine32/etc/resolv.conf
sudo chroot wine32
su sandbox
cd ~/tmp

And skip to 'Building wine'.

Otherwise do this:
Setting up the Chroot
sudo apt-get install debootstrap
mkdir $HOME/tmp/architectures/wine32 -p
cd $HOME/tmp/architectures
sudo debootstrap --arch i386 wheezy $HOME/tmp/architectures/wine32 http://ftp.au.debian.org/debian/
sudo mount -o bind /proc wine32/proc
sudo cp /etc/resolv.conf wine32/etc/resolv.conf
sudo chroot wine32

You're now in the chroot:
apt-get update
apt-get install locales sudo vim
echo 'export LC_ALL="C"'>>/etc/bash.bashrc
echo 'export LANG="C"'>>/etc/bash.bashrc
echo '127.0.0.1 localhost beryllium' >> /etc/hosts
source /etc/bash.bashrc
adduser sandbox
usermod -g sudo sandbox
echo 'Defaults !tty_tickets' >> /etc/sudoers
su sandbox
cd ~/

Replace 'beryllium' with the name of your host system (it's just to suppress error messages)

Building Wine
While still in the chroot, continue (the i386 is ok; don't worry about it -- you don't actually need it):

sudo apt-get install libx11-dev:i386 libfreetype6-dev:i386 libxcursor-dev:i386 libxi-dev:i386 libxxf86vm-dev:i386 libxrandr-dev:i386 libxinerama-dev:i386 libxcomposite-dev:i386 libglu-dev:i386 libosmesa-dev:i386 libglu-dev:i386 libosmesa-dev:i386 libdbus-1-dev:i386 libgnutls-dev:i386 libncurses-dev:i386 libsane-dev:i386 libv4l-dev:i386 libgphoto2-2-dev:i386 liblcms-dev:i386 libgstreamer-plugins-base0.10-dev:i386 libcapi20-dev:i386 libcups2-dev:i386 libfontconfig-dev:i386 libgsm1-dev:i386 libtiff-dev:i386 libpng-dev:i386 libjpeg-dev:i386 libmpg123-dev:i386 libopenal-dev:i386 libldap-dev:i386 libxrender-dev:i386 libxml2-dev:i386 libxslt-dev:i386 libhal-dev:i386 gettext:i386 prelink:i386 bzip2:i386 bison:i386 flex:i386 oss4-dev:i386 checkinstall:i386 ocl-icd-libopencl1:i386 opencl-headers:i386 libasound2-dev:i386 build-essential
mkdir ~/tmp
cd ~/tmp
wget http://prdownloads.sourceforge.net/wine/wine-1.6.tar.bz2

tar xvf wine-1.6.tar.bz2
cd wine-1.6/


Optional:
To avoid getting the

configure: libOSMesa 32-bit development files not found (or too old), OpenGL rendering in bitmaps won't be supported.

message, do the following:
1. Edit configure
 9450 LIBS="-lOSMesa -lGLU -lGL $X_LIBS $X_PRE_LIBS $XLIB -lm $X_EXTRA_LIBS $LIBS"

2. Also change
 9473     *) ac_cv_lib_soname_OSMesa=libOSMesa.so

Does it change anything? I don't know. But it removes the error message which is triggered by missing symbols so I think it does since the symbols are found in GLU/GL.
End optional.

Then do
./configure
time make -j3
sudo checkinstall --install=no
checkinstall 1.6.2, Copyright 2009 Felipe Eduardo Sanchez Diaz Duran
This software is released under the GNU GPL.

The package documentation directory ./doc-pak does not exist.
Should I create a default set of package docs? [y]:

Preparing package documentation...OK

Please write a description for the package.
End your description with an empty line or EOF.
>> wine 1.6
>>

*****************************************
**** Debian package creation selected ***
*****************************************

This package will be built according to these values:

0 - Maintainer: [ root@beryllium ]
1 - Summary: [ wine 1.6]
2 - Name: [ wine ]
3 - Version: [ 1.6]
4 - Release: [ 1 ]
5 - License: [ GPL ]
6 - Group: [ checkinstall ]
7 - Architecture: [ i386 ]
8 - Source location: [ wine-1.6 ]
9 - Alternate source location: [ ]
10 - Requires: [ ]
11 - Provides: [ wine ]
12 - Conflicts: [ ]
13 - Replaces: [ ]
Checkinstall takes a little while (In particular this step: 'Copying files to the temporary directory...').

Installing Wine

Exit the chroot
sandbox@beryllium:~/tmp/wine-1.6$ exit
exit
root@beryllium:/# exit
exit
me@beryllium:~/tmp/architectures$ 

On your host system
 Enable multiarch* and install ia32-libs, since you've built a proper 32 bit binary:

sudo dpkg --add-architecture i386
sudo apt-get update
sudo apt-get install ia32-libs

*At some point I think ia32-libs may be replaced by proper multiarch packages, but maybe not. So we're kind of doing both here.

 Copy the .deb package and install it
sudo cp wine32/home/sandbox/tmp/wine-1.6/wine_1.6-1_i386.deb .
sudo chown $USER wine_1.6-1_i386.deb
sudo dpkg -i wine_1.6-1_i386.deb

17 July 2013

478. Briefly: proftpd on debian

I need to transfer raw mass spec files off the computer controlling our Waters ZMD, and it seems like I may be the only one in the department wishing to do so.

Since the computer is running Windows NT 4 and doesn't support USB drives out of the box, and I'm a bit worried about installing new software (e.g. old versions of filezilla via oldapps) on a computer on which a lot of people rely, I have two options:

* use SMB i.e. a windows share
or
* use ftp

I'm having all sorts of trouble getting my samba to work well at work -- my computers are sitting on a 192.168.2.0/24 LAN behind a router connected to the corporate network, which has proper IP addresses (i.e. not using a reserved private network address space). I haven't managed to get my computer behind the router to 'see' the other computers and their shares at work beyond my router. I can, however, connect directly to the computers using e.g. smbclient -- they just won't show up in e.g. nautilus under windows network or using nmblookup. At any rate, connecting directly to the target computer prompts me for a password, and it seems that there are no open, accessible shares on that computer, only password protected ones.

Win NT has a DOS ftp client, so I finally decided to set up a quick and dirty ftp server on my workstation in my office so that I could transfer a couple of data files to figure out my other issue -- whether I have any piece of software that can actually open the masslynx .raw files. Turns out that neither wsearch32 nor openchrom can, so the exercise has been somewhat futile, although it has to be said that I'd like to be in charge of any raw data that leads to publications, and so I should be able to manage the storage of it myself.

Note: ftp is an inherently unsafe method since it doesn't use encryption. Use a separate user for this with no privileges, change the password of that user regularly, and close port 21 whenever you aren't using it in order to not advertise that you are running an ftp server. Use ssh/sftp if at all possible.

Anyway, setting up an ftp server was easy.

This method follows this post, http://ubuntuforums.org/showthread.php?t=79588, almost verbatim.

First install proftpd

sudo apt-get install proftpd

Edit /etc/shells:
# /etc/shells: valid login shells
/bin/csh
/bin/sh
#/usr/bin/es
#/usr/bin/ksh
#/bin/ksh
#/usr/bin/rc
#/usr/bin/esh
/bin/dash
/bin/bash
/bin/rbash
#/usr/bin/screen
#/bin/tcsh
#/usr/bin/tcsh
#/bin/ksh93
/bin/false

sudo adduser ftpuser

su ftpuser
cd ~
mkdir download
mkdir upload
exit

Edit /etc/proftpd/proftpd.conf. In addition to what was already there, I added
UserAliasOnly on
UserAlias spinebill ftpuser
ExtendedLog /var/log/ftp.log
TransferLog /var/log/xferlog
SystemLog /var/log/syslog.log
AllowStoreRestart on

<Directory /home/ftpuser>
    Umask 022 022
    AllowOverwrite off
    <Limit MKD STOR DELE XMKD RNFR RNTO RMD XRMD>
        DenyAll
    </Limit>
</Directory>

<Directory /home/ftpuser/download/*>
    Umask 022 022
    AllowOverwrite off
    <Limit MKD STOR DELE XMKD RNFR RNTO RMD XRMD>
        DenyAll
    </Limit>
</Directory>

<Directory /home/ftpuser/upload/>
    Umask 022 022
    AllowOverwrite on
    <Limit READ RMD DELE>
        DenyAll
    </Limit>
    <Limit STOR CWD MKD>
        AllowAll
    </Limit>
</Directory>

Include /etc/proftpd/conf.d/

su ftpuser
chsh -s /bin/false
exit

Check the syntax:
sudo proftpd -td5

Test:
ftp `hostname`
Connected to beryllium.
220 ProFTPD 1.3.4a Server (Debian) [192.168.1.1]
Name (beryllium:me): spinebill
331 Password required for spinebill
Password:
230 User spinebill logged in
Remote system type is UNIX.
Using binary mode to transfer files.
ftp>
I have since tested this from the Win NT 4 computer and everything is working well. I had to familiarise myself with the windows ftp client first: http://www.nsftools.com/tips/MSFTP.htm

477. OpenChrom - Dempster

I don't want to get into writing software reviews, but given how much OpenChrom, an open source program for mass spectrometry which can open a range of proprietary formats, has evolved since I tried the release code-named Syringe (http://verahill.blogspot.com.au/2012/09/using-openchrom-to-open-aglient-d-esi.html) in September 2012, I think a brief update may be in order.

Essentially, OpenChrom now seems a lot easier and more natural to use.

The installation is the same as before (I've copied the old post below):

1. Install Java v1.7 (need > 1.6)
You can either use openjdk 7 or (Oracle) Java. See here for a general guide to installing Oracle/Sun Java.

As for openjdk, you can easily install it:
sudo apt-get install openjdk-7-jdk

(the openjdk-7-jre package is enough if you don't want the full developer's kit)

Anyway.

Make sure that you've selected the right version:
 sudo update-alternatives --config java
There are 7 choices for the alternative java (providing /usr/bin/java).

  Selection    Path                                            Priority   Status
------------------------------------------------------------
  0            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      auto mode
  1            /usr/bin/gij-4.4                                 1044      manual mode
  2            /usr/bin/gij-4.6                                 1046      manual mode
  3            /usr/bin/gij-4.7                                 1047      manual mode
  4            /usr/lib/jvm/j2re1.6-oracle/bin/java             314       manual mode
  5            /usr/lib/jvm/j2sdk1.6-oracle/jre/bin/java        315       manual mode
  6            /usr/lib/jvm/java-6-openjdk-amd64/jre/bin/java   1061      manual mode
 *7            /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java   1051      manual mode




2. Get openchrom
cd ~/tmp
wget http://aarnet.dl.sourceforge.net/project/openchrom/REL-0.8.0-PREV/openchrom_linux.gtk.x86_64_0.8.0-PREV.zip
unzip openchrom_linux.gtk.x86_64_0.8.0-PREV.zip
cd linux.gtk.x86_64/OpenChrom/
sudo mkdir /opt/openchrom
sudo chown $USER /opt/openchrom 
cp * -R /opt/openchrom
chmod +x /opt/openchrom/openchrom

Stick

alias openchrom='/opt/openchrom/openchrom'

in your ~/.bashrc and source it.
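
If you re-run these steps you can end up with the same alias several times in ~/.bashrc. A minimal sketch of a way around that, using a hypothetical helper (add_line_once is not part of OpenChrom or bash) that appends a line only if it isn't already in the file:

```shell
# Hypothetical helper: append a line to a file only if that exact line is missing.
add_line_once() {
    line="$1"
    file="$2"
    touch "$file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage, with the path used in this post:
# add_line_once "alias openchrom='/opt/openchrom/openchrom'" ~/.bashrc
# . ~/.bashrc
```

Running it twice leaves a single copy of the alias in the file.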




A few notes:
You can now start OpenChrom by running
openchrom
from the command line.

You can install plug-ins, e.g. to open Agilent files, by going to Plugins/OpenChrom Marketplace. Plugins will typically run for 30 days, but can be unlocked for perpetuity by adding a serial key, which is free. Add license keys by first registering on the openchrom website, going to http://www.openchrom.net/main/content/plugins.php, and clicking on the plugin you want a serial key for. You can then enter that key by going e.g. Window/Preferences/Converter in the OpenChrom software.

Everyone has their own idea of what a good piece of mass spec software should look like, and I suspect that OpenChrom caters to them all. Layouts are called Perspectives here. On the flip side, if you accidentally use a perspective that doesn't suit you, you may be incredibly frustrated until you figure out what's going on.

I like wsearch32, so my preferred view is to go to Window/Perspective Switcher -> Chromatogram MS (exact).

Opening spectra is much easier than before: now simply go to File/Open Chromatogram (MSD) and click on the file (or directory structure in the case of e.g. Agilent .D directories) and open:

Anyway, openchrom is getting better, is easier to use, and is still the only open source program that I know which can handle such an array of proprietary formats. Also, it can export data in both .csv and .xls formats, but you will need to install plugins for that, which is luckily very simple.


15 July 2013

476: Rehash: using a browser proxy via tunnel, through a router and with reverse ssh

I may have covered this at some point, but if so, I can't find the post.

Here's the situation:
You have a linux computer at work, which is behind a corporate firewall.
You have a router at home which runs an ssh server (e.g. running tomato).
You have a computer at home, which sits behind the router above.
You want to browse from home using the corporate network

In my case it's a little bit different -- I want to make a change to the router my office network (I have my own office) sits behind, and the easiest way to do that is by logging onto that router via http (it's a stock netgear router).

How to:
First, at work, connect to your home router using reverse ssh, so that all traffic on port 19999 on the router gets sent to port 22 on your work computer:
ssh -R 19999:localhost:22 root@myhomerouter

Later, at home, forward all traffic to port 8989 on your home computer to localhost:19999 on your router (which then gets sent to port 22 on your work computer):
ssh -L 8989:localhost:19999 root@192.168.2.1

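We've assumed that the router sits on 192.168.2.1 from inside the LAN. Note that localhost in this command is resolved on the router (ssh opens port 8989 on your home computer and forwards it to port 19999 on the router), while localhost in the command before that is resolved on your work computer.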

Then, in a different terminal, open a proxy through port 8989:
 ssh -D 8888 me@localhost -p 8989

Finally, you can now edit your browser/network settings to use a SOCKS proxy on port 8888 like you would with any other proxy.
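
Since three port numbers have to line up across the three commands, it can help to print them before running anything. The following is only a sketch: print_tunnel_cmds is a made-up helper, and the host names, ports and user are the example values from this post, not anything you have to use.

```shell
# Print (do not run) the three ssh commands so the port chain can be checked first.
# All arguments are placeholders chosen by the reader.
print_tunnel_cmds() {
    router_wan="$1"   # home router as seen from work, e.g. myhomerouter
    router_lan="$2"   # home router as seen from home, e.g. 192.168.2.1
    relay_port="$3"   # port opened on the router by -R, e.g. 19999
    local_port="$4"   # port opened on the home computer by -L, e.g. 8989
    socks_port="$5"   # SOCKS port for the browser, e.g. 8888
    work_user="$6"    # your user on the work computer, e.g. me
    echo "at work: ssh -R ${relay_port}:localhost:22 root@${router_wan}"
    echo "at home: ssh -L ${local_port}:localhost:${relay_port} root@${router_lan}"
    echo "at home: ssh -D ${socks_port} ${work_user}@localhost -p ${local_port}"
}

print_tunnel_cmds myhomerouter 192.168.2.1 19999 8989 8888 me
```

The relay port appears in the first two commands and the local port in the last two; if either pair disagrees, the chain is broken.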

475. How to get into a Chemistry PhD program in Australia -- or at least a reply from a prospective supervisor

Here's yet another non-linux post. I'm currently getting ready for the start of the new semester and teaching, and so haven't had much time to work on improving my computer skills.

Anyway.

I've been advertising for an international PhD student for the past 9 months and have so far only had one great applicant and three acceptable applicants. That's out of ca 200 applicants in total.

So what does 'acceptable' mean? In this case my use actually agrees with the literal meaning -- students who will stand a chance of being accepted to the PhD program. It also means students whom I could imagine working with.

The formal requirements will likely differ between different institutions, and between supervisors. In addition, some supervisors may be looking for different personalities in their prospective hires, than others.

I don't think that I'm being unnecessarily harsh in evaluating applicants: colleagues who have reviewed my shortlists thought that I had, if anything, been a bit too optimistic in my evaluations.

At any rate, if you are looking for a PhD, be aware that there are a lot of applicants out there, and only a limited amount of money and places, so you will want to spend some time on your application.

So here are a few of my thoughts:
Before reading, keep in mind that I understand that applying for a PhD, especially if you are from the developing world and applying for a PhD position in the industrialised world, can be very tough, and sometimes depressing. You don't receive a reply to most of your applications, and when you do, the responses are normally negative.


* Try to familiarise yourself with the formal requirements, and address them in the first paragraph in your email to a prospective supervisor. In the case of my uni, there are two main requirements:
-- an undergraduate degree equivalent to a first class honours degree in Australia
-- a sufficiently good score at the IELTS

That's it. However, the hubris of many universities in Australia means that the first requirement is a significant hurdle. Typically, good grades are just the beginning. In addition to that, the applicant needs to hold a masters degree (by research) and have a couple of papers in ISI rated journals. Obviously almost none of our own undergraduate students would meet that, but there you go.

So in your first paragraph, state what unis you did your degrees at, what your cumulative GPAs (or equivalent) were, how many papers you have published and what your overall band score AND section scores on the IELTS (or TOEFL) are.

At this stage, that's much more important than your background, your hobbies, or anything else. If you can't meet the minimum requirements for entry to the PhD program, everything else doesn't matter.

* Read the advertisement, and follow any instructions
I ask applicants to submit all their documents as PDFs. Yet, I get plenty of applications with .doc, .docx, jpeg etc attached. You didn't read the instructions -- will you be more careful as a PhD student? Remember that you are competing against plenty of applicants who did read the instructions.

Did I ask for your IELTS results? Didn't attach or mention them in your email/CV? Not a good sign. Also, it means that you're probably not a candidate.

* Address the supervisor and the supervisor's research
I get way, way too many emails that start with  'Dear Sir', or 'Dear Professor' or even worse: 'Dear Sir/Madam'. Put my name in there. It'll show me that you spent at least a few minutes on personalising your email. If you don't make that effort, why should I make the effort of reading your email and looking at your documents?

Also, please do mention the research of the supervisor you are applying to. It doesn't need to be anything insightful or special, but just write something like: 'I find your research into catalytic activation of molecules in ionic liquids very interesting.' or 'I read your article in Green Chemistry, 2013, 10, 2345 and found it very interesting. In particular, I liked how it showed how the selectivity of blah blah blah'.

The reason is not that you are showing off your great scientific skills (you've got an undergraduate degree -- we don't expect much), but that it shows you spent a bit of effort writing your email and personalising it. Also, flattery -- in moderation -- can occasionally help (don't go overboard, so be careful -- too much makes you seem insincere).

* Don't cold-call
This should go without saying. I've had one student email me in the morning, then call me in the afternoon. That kind of behaviour is probably correct if you are applying for certain jobs in the Real World (marketing?), but not for a PhD in chemistry. It's a sure-fire way of annoying people.

* Don't send a LinkedIn invite
I don't have time to scroll through your profile and try to compile a CV for you. Send me your CV in pdf format instead. Also, I don't know you, and have no incentive to add you to my 'network'.

* Be careful about 'hobbies' and 'interests'.
To me as a potential supervisor they really don't matter (again, this is my personal opinion). I know that the idea is to show that you are a well-rounded individual, but knowing that you like 'travel' or that you consider 'internet browsing' a skill will not be the edge that gets you into a PhD programme.

* 'It can't help you, only harm you'.
Keep this in mind. Unless it's a piece of information required in the advertisement, or that you are absolutely certain will help your application, consider leaving it out. You may include it to highlight a particular skill or trait, but remember that a CV can be interpreted ambiguously, and your intent may not be obvious. Instead, what you feel shows how independent and committed you are can be seen as a sign of being unfocussed or difficult to work with, or can simply draw attention away from more important aspects of your CV.

* Attending lectures, conferences
In their CVs, some applicants include lectures by famous people that they've attended, or conferences that they've gone to.

Here's the problem for me: most first year PhD students struggle with the notion that doing the work is no longer enough. Doing the experiments, or following your supervisor's instructions, is not enough. To get a PhD you need to make that extra effort and make things work. And if it doesn't work, you put in 150% effort -- the extra 50% being extra-curricular work on finding a related project that will work. Life as a PhD student can be easy if you are lucky, but most often is not -- life is incredibly good when your project is working, but on the flip-side it can be hard, depressing and demoralising when it isn't. Your supervisor can alleviate some of that, but remember that your supervisor is only there to point you in a general direction -- the PhD is all about making the transition to becoming an INDEPENDENT researcher.

So be careful -- if you've presented posters or given talks at conferences or at other universities, you should definitely list them, but under a suitable heading -- NOT publications. They'll detract attention from the publications, and the publications are what will get you an offer of acceptance.

* Do not make things conditional
I had an applicant who was borderline (in terms of meeting the requirements), and in those cases occasionally the supervisor putting in extra effort into cajoling the university administration MAY be enough to get a student accepted (don't count on it). If your prospective supervisor asks you to re-take IELTS, don't write something along the lines of  'I will, but only if this is the last hurdle'.

I understand it's expensive, but remember: even if you meet all the requirements I cannot guarantee that you get accepted. And I can't wait months for each student to pass through the application system -- I need to hire someone now. So be proactive.

* Face-to-face (or skype/video) interview is a good sign
If your supervisor asks for a skype interview, this is a great sign. And likely this isn't really done in order to gauge your scientific skills, but just to get a feel for your personality. Also, it's a way of making sure that your English levels are good enough that you can communicate with your supervisor. Finally, if you are borderline in terms of IELTS/TOEFL, your supervisor may be able to argue that your English is good enough based on that interview. So take the opportunity.

And send an email a few hours after the interview thanking them for the opportunity. 1-2 lines is enough. It will show that you're a decent human being.

* Be prompt in replying to emails
It doesn't matter what stage of the application you are at -- until the paperwork has been signed you are still on probation. If you take several days to reply to any of my emails, then you are likely to be dropped. The reason is simple: if you take a week to get things done when you are a PhD student, then you will be a disaster for me. A disaster that I'll have to live with for the next 3-4 years, and who will be using up my research grant, and potentially ruining my career.

I understand that the reason for you being slow may be different -- maybe you are just nervous, maybe you have nothing to say, maybe you feel you are intruding. Still, be prompt.

So:
if you can show that you can read and follow instructions, and if you can make my life easy by addressing the selection criteria in a clear way, and if you seem like a person I might enjoy working with for the next 3-4 years, then you stand a fair chance of getting an offer.

If I think you'll need constant supervision, are sloppy and won't follow instructions, or that our personalities will clash, I'll probably avoid you no matter how good your grades are.


12 July 2013

474. MS data, part I: Exporting data as csv from wsearch32, and generating MS assignments using Matt Monroe's molecular weight calculator

NOTE: I've heard rumours about problems with wsearch on Windows 7 Home, and on Windows 8. I've heard reports of it working on Windows 7 Professional. Curiously, it works just fine on linux under wine.

This post is written with two particular students in mind. I could put this in a pdf and email it, but why not share with the wider world since other people may encounter the same issues?

See here for part II: http://verahill.blogspot.com.au/2013/07/480-ms-data-part-ii-plotting-and.html


1. Exporting data from wsearch32
To install wsearch32 under wine, see here: http://verahill.blogspot.com.au/2013/01/321-wsearch32-in-wine.html

In order to export data from wsearch so that you can plot it in e.g. gnuplot, octave, origin or excel, do the following:

Open a spectrum (chromatogram) and pick a slice, then click on the M/I icon in the bottom right:

 Pick Save As

 And save as e.g. csv (comma separated file)

Done.
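
If you want to plot the exported file in gnuplot the same way as the .dat files in the other parts of this series, it may help to turn it into plain whitespace-separated columns first. Here's a rough sketch. It assumes that the csv holds m/z in the first column and intensity in the second, possibly with a header line and Windows line endings; the exact layout that wsearch32 writes is an assumption here, so check your own file first. csv_to_dat is just a name made up for this example.

```shell
# Hypothetical helper: convert a two-column csv (assumed: m/z,intensity) to "m/z intensity" lines.
# Lines whose first field is not a plain number (e.g. a header) are skipped.
csv_to_dat() {
    tr -d '\r' < "$1" | awk -F',' '
        { gsub(/[ \t]/, "", $1); gsub(/[ \t]/, "", $2) }   # strip stray blanks
        $1 ~ /^[0-9]+\.?[0-9]*$/ { print $1, $2 }'
}

# Usage (file names are placeholders):
# csv_to_dat spectrum.csv > spectrum.dat
# and then in gnuplot: plot 'spectrum.dat' with lines
```

Alternatively, gnuplot can read the csv directly if you first do set datafile separator ",".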

2. Using formula finder in Matthew Monroe's Molecular Weight Calculator
To install the molecular weight calculator in wine, see here: http://verahill.blogspot.com.au/2012/09/matt-monroes-molecular-weight.html

Open the molecular weight calculator and go to edit abbreviations.

 Add an abbreviation for MeO. We'll call it Methx, and it has a charge of -1:
 Methanol:
 Nitrate:
 Hit OK to save the changes.

 Go to formula finder:

We'll be looking for Ga, NO3, MeOH, O, H, MeO. Then click on Formula Finder Options:
 Limit the charge to 1:
 And search:

You can do fancier stuff, e.g. searching directly for the m/z and bound the search to min/max amounts of different elements:
 As shown here:
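
To give an idea of what a search like this involves, here's a crude brute-force sketch in awk. It is NOT how the Molecular Weight Calculator works internally (I don't know its algorithm); it simply tries every combination of fragment counts up to a maximum and keeps those with a net charge of +1 and a mass within a tolerance of the target. The function name, the average masses and the formal charges (Ga 3+, NO3 1-, MeO 1-, O 2-, H 1+, MeOH neutral) are assumptions made for this example, and the electron mass is ignored.

```shell
# Illustrative brute-force formula search; find_formulas is a made-up name.
# Arguments: target m/z, tolerance, maximum count per fragment.
find_formulas() {
    awk -v target="$1" -v tol="$2" -v maxn="$3" 'BEGIN {
        n = split("Ga NO3 MeOH O H MeO", name, " ")
        split("69.723 62.005 32.042 15.999 1.008 31.034", mass, " ")  # average masses
        split("3 -1 0 -2 1 -1", charge, " ")                          # assumed formal charges
        total = 1
        for (i = 1; i <= n; i++) total *= (maxn + 1)
        # Each integer k encodes one combination of counts in base (maxn + 1).
        for (k = 1; k < total; k++) {
            m = 0; q = 0; label = ""; rest = k
            for (i = 1; i <= n; i++) {
                c = rest % (maxn + 1)
                rest = int(rest / (maxn + 1))
                m += c * mass[i]; q += c * charge[i]
                if (c > 0) label = label name[i] "(" c ") "
            }
            d = m - target; if (d < 0) d = -d
            # With a charge of +1, m/z equals the mass.
            if (q == 1 && d <= tol) printf "%s%.3f\n", label, m
        }
    }'
}

# Usage: all singly charged candidates within 0.01 of m/z 193.733,
# allowing up to 3 of each fragment:
# find_formulas 193.733 0.01 3
```

The number of combinations grows quickly with the number of fragments and the maximum count, which is one reason for bounding the min/max amounts of each element.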




10 July 2013

473. Programming a Metrohm Titrino -- not a how-to, just a ramble

Many, many years ago I learned basic programming using BASIC (the version that came with PC DOS 5, I think). I even wrote the odd game, but it was all pretty awful. A few years later I learned Turbo Pascal, which was a fantastic experience compared to Basic. It felt all sciency and grown up, and it was my first experience with a real IDE. I even ended up buying a TP book, and became somewhat proficient. This must've been when I was around 18-19. I then stopped programming completely.

At around 30 years of age I decided it was time to get serious about programming again -- I was doing mass spectrometry and needed a simple program that could generate a series of solutions to the identity of a mass/charge ratio given a range of elements. I probably had a quick look at C and C++, but ended up getting a Python book and have been a happy Python programmer ever since.

The problem is that I've never been a /good/ python programmer -- and in all these years I've never fully understood the use for (or, in all fairness, use OF) OOP. And at the moment it seems to be holding me back -- all the examples that I find of the use of the threading module, as well as of writing GUIs (using e.g. wxPython), involve using classes. And I just don't understand them well enough to sort out what I need done.

Anyway, long story short: I've written a basic program for communicating with a Metrohm Titrino 736 GP via RS 232. It's found here: https://sourceforge.net/projects/pytitrino/

Currently:
* the code is a mess (see above)
* it works fine for doing monotonic and dynamic end point titrations (MET and DET)
* it saves data to a file, but does so silently (i.e. when you run you won't get any feedback that things are working properly...)
* it uses the thread (not threading) module
* I've managed to pass parameters back and forth between the thread and the main loop using Queue

There are probably much better solutions. One day I hope to be able to stick a GUI on top of it, but the more I look at it I get the impression that one writes the GUI first, then the engine...not that I'd know.

Anyway. That's what I've been up to. Anyone with a bit of programming experience, who is in possession of an old-school Titrino (i.e. using RS 232) and wants to save $1.5k in software licenses may be interested in taking the sources and turning them into something useful.


03 July 2013

472. Briefly: Iranian PhD students in Australia

I'm not going to leave much in the way of a comment, but this doesn't seem to have been publicised enough. Searching the web quickly didn't bring this up at all, and it's a shame since it's important, in particular if you are an Iranian national thinking about doing a PhD in Australia.

About a week ago the faculty in the chemistry department at my university were informed that heavy restrictions in terms of access to instrumentation have been put in place for students from North Korea, Syria and Iran via Federal legislation.


While I don't think there are any students from North Korea or Syria around, there are several Iranian students at different stages of their PhD. In fact, I would say around 50% of our applicants are from India, 25% are from Pakistan and 20 % are from Iran (in terms of accepted students the ratio is very different)


In practical terms, this means that Iranian students in the department are not allowed to use:
FT-IR
UV-Vis
NMR
Mass spectrometers
Raman
dosimeters
OES/AAS
etc.

All of which are standard instruments which most chemists would find necessary to do research. In addition, they can hardly be considered as being cutting edge, trade secrets or anything like that -- commercial NMR instruments have been around since the 1950s, infrared and UV/Visible spectroscopy go much further back. Mass spectrometry is a standard tool which, although many of the current designs only go back to the 1980s (e.g. ESI), is so conceptually simple and innocuous that (to me) restrictions on it don't make sense. And so on.

In addition, supervisors of Iranian students have been asked to draw up a risk management plan to prevent student access to the above instruments, which is a particular problem given that they are used in teaching as well, and are available on a walk-in basis to undergraduate students doing projects in research labs.

Currently, any supervisor who has an Iranian student needing to use any of the instruments above will need to assign another student to do these measurements for the Iranian national.

While this doesn't formally preclude Iranians from coming to Australia to do a PhD, we have been advised that we should reject any applicants at this point. This may change once the university has figured out exactly where they must draw the line in terms of restricting access to Iranians to different facilities, but for now it's a blanket ban.

My personal opinion is that while you'd be led by the media to think of anyone from North Korea, Syria and Iran as potential spies, these are real people too. Many Iranians would either be completely disinterested in politics, or actively antipathetic to their regime. And the best thing about democracies -- we shouldn't have any issues with them supporting their government either. So I don't really agree with this as a security measure to prevent nuclear proliferation, which must surely be the stated goal.

And if the idea is to put in place sanctions to promote regime change, then why limit the type of instrumentation that students can access? Or are we trying to punish the children of the leadership in Iran? Then why not limit the sanctions to those specifically? Top students tend to come from all socioeconomic classes.

The timing is also very odd, given the recent election of a moderate.

And why Iran and not Belarus, China, Zimbabwe etc.?

Again, I don't like putting opinion pieces on this blog (other than as minor parts/rants of posts with actual content) but I think this should be publicized more.



02 July 2013

471. Debian Jessie -- gnome-shell bug

Update 3/7/2013:
There are now *gnome-bluetooth packages (3.8.1-2) in the jessie repos. While I haven't looked closer at them, I presume that they fix this issue.

(on a different note: dist-upgrade currently removes gnome...)

Original post:
I've used debian testing since early 2011, and I've only had a few minor issues during that time.

However, sometimes things happen that remind you that the Testing release is not meant for mission critical work (and make me happy that I only use Jessie on my laptop, which I mainly use at home).

So...

Last night I did upgrade and dist-upgrade, which installed the following packages according to /var/log/apt/history:
Start-Date: 2013-07-01  22:03:17
Commandline: apt-get dist-upgrade
Install: p11-kit:amd64 (0.18.3-2, automatic), libgnome-bluetooth11:amd64 (3.8.1-1, automatic), libgcr-base-3-1:amd64 (3.8.2-3, automatic), libtasn1-6:amd64 (3.3-1, automatic), libgcr-ui-3-1:amd64 (3.8.2-3, automatic)
Upgrade: libnm-gtk0:amd64 (0.9.8.2-1, 0.9.8.2-1+b1), libgcr-3-1:amd64 (3.4.1-3, 3.8.2-3), gir1.2-gcr-3:amd64 (3.4.1-3, 3.8.2-3), network-manager-gnome:amd64 (0.9.8.2-1, 0.9.8.2-1+b1), gnome-keyring:amd64 (3.4.1-5, 3.8.2-2), gcr:amd64 (3.4.1-3, 3.8.2-3), gnome-bluetooth:amd64 (3.4.2-1, 3.8.1-1), gir1.2-gnomebluetooth-1.0:amd64 (3.4.2-1, 3.8.1-1), gir1.2-gck-1:amd64 (3.4.1-3, 3.8.2-3)
End-Date: 2013-07-01  22:03:29

Now when I log in to gnome via gdm3, I get an empty desktop with no menus, no hot-spots or anything else indicating that things worked out. Alt+F2 doesn't work either, and conky doesn't start.

The only things that do work are
* my keyboard shortcuts (I've mapped ctrl+shift+Down arrow to chromium)
* guake (which starts with gnome)

ps aux|grep gnome-shell
returns nothing, which might be a clue.
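
A side note on this kind of check: a ps aux|grep pipeline can also match its own grep process, so the result can be ambiguous. A small sketch of a check that only matches the exact command name; is_running is a hypothetical helper, and it assumes a ps that supports -e and -o comm= (as the procps version on debian does):

```shell
# Return success if a process whose command name is exactly "$1" exists.
is_running() {
    ps -e -o comm= | grep -qx -- "$1"
}

if is_running gnome-shell; then
    echo "gnome-shell is running"
else
    echo "gnome-shell is not running"
fi
```

pgrep -x gnome-shell does much the same thing if pgrep is installed.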

Looking at the debian forums the closest post seems to be (although erroneously labelled -- gdm3 DOES start): http://forums.debian.net/viewtopic.php?f=6&t=105393&p=504077&hilit=gnome+shell#p504077

That in turn led to this bug report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=712861

My gnome-shell version is 3.4.2-8.

I don't understand how gnome-bluetooth causes this, especially given that I've disabled bluetooth in rcconf, but whatever it takes...

I tried applying the patch but it failed:
mkdir ~/tmp
cd ~/tmp
wget "http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=66;filename=GnomeBluetooth.patch;att=1;bug=712861" -O blue.patch
sed -i 's_js/ui/status/bluetooth.js_/usr/share/gnome-shell/js/ui/status/bluetooth.js_g' blue.patch
sudo patch -p0 < blue.patch

Instead, I ended up making the changes to /usr/share/gnome-shell/js/ui/status/bluetooth.js by hand (remember that you can always use the ttys using ctrl+alt+Fx):
  6 const Gio = imports.gi.Gio;
  7 const GnomeBluetoothApplet = imports.gi.GnomeBluetoothApplet;
  8 const GnomeBluetooth = imports.gi.GnomeBluetooth;
  9 const Gtk = imports.gi.Gtk;

and then delete the Applet part in GnomeBluetoothApplet so that it reads
 38         this._killswitch.connect('toggled', Lang.bind(this, function() {
 39             let current_state = this._applet.killswitch_state;
 40             if (current_state != GnomeBluetooth.KillswitchState.HARD_BLOCKED &&
 41                 current_state != GnomeBluetooth.KillswitchState.NO_ADAPTER) {
 42                 this._applet.killswitch_state = this._killswitch.state ?
 43                     GnomeBluetooth.KillswitchState.UNBLOCKED:
 44                     GnomeBluetooth.KillswitchState.SOFT_BLOCKED;
 45             } else
 46                 this._killswitch.setToggleState(false);

Then do it again:
 96     _updateKillswitch: function() {
 97         let current_state = this._applet.killswitch_state;
 98         let on = current_state == GnomeBluetooth.KillswitchState.UNBLOCKED;
 99         let has_adapter = current_state != GnomeBluetooth.KillswitchState.NO_ADAPTER;
100         let can_toggle = current_state != GnomeBluetooth.KillswitchState.NO_ADAPTER &&
101                          current_state != GnomeBluetooth.KillswitchState.HARD_BLOCKED;
102 



At this point I rebooted and everything was back to normal (you can try simply doing 'sudo service gdm3 restart' instead of rebooting).
Anyway, done.

470. Very briefly: compiling nwchem 6.3 with ifort and mkl

This used to be part of http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html, but I think it makes more sense making it a separate post.

I did this on debian wheezy.

1. Installing mkl and the compiler
MKL: http://verahill.blogspot.com.au/2013/06/465-intel-mkl-math-kernel-library-on.html
Intel compiler collection: http://verahill.blogspot.com.au/2013/07/469-intel-compiler-on-debian.html

I will henceforth presume that you have put the files in the same location as shown in those posts, and that you have created /etc/ld.so.conf.d/intel.conf as shown in the second post.

2. Compiling nwchem 6.3
sudo apt-get install build-essential libopenmpi-dev openmpi-bin
sudo mkdir /opt/nwchem -p
sudo chown $USER:$USER /opt/nwchem
cd /opt/nwchem
wget "http://www.nwchem-sw.org/download.php?f=Nwchem-6.3.revision1-src.2013-05-28.tar.gz" -O Nwchem-6.3.revision1-src.2013-05-28.tar.gz
tar xvf Nwchem-6.3.revision1-src.2013-05-28.tar.gz
mv nwchem-6.3-src.2013-05-28 nwchem-6.3-src.2013-05-28.ifort

cd nwchem-6.3-src.2013-05-28.ifort
export NWCHEM_TOP=`pwd`
export LARGE_FILES=TRUE
export TCGRSH=/usr/bin/ssh
export NWCHEM_TARGET=LINUX64
export NWCHEM_MODULES="all"
export PYTHONVERSION=2.7
export PYTHONHOME=/usr

export BLASOPT="-L/opt/intel/composer_xe_2013.4.183/mkl/lib/intel64/ -lmkl_core -lmkl_sequential -lmkl_intel_ilp64"
export LIBRARY_PATH="$LIBRARY_PATH:/usr/lib/openmpi/lib:/opt/intel/composer_xe_2013.4.183/mkl/lib/intel64/"

export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export MPI_LOC=/usr/lib/openmpi/lib
export MPI_INCLUDE=/usr/lib/openmpi/include


export LIBMPI="-lmpi -lopen-rte -lopen-pal -ldl -lmpi_f77 -lpthread"
export ARMCI_NETWORK=SOCKETS

cd $NWCHEM_TOP/src

make clean
make nwchem_config
make FC=ifort 1> make.log 2>make.err

cd $NWCHEM_TOP/contrib
export FC=ifort
./getmem.nwchem

And it works fine. See e.g. here if you want to patch the source so that it compiles with python support and works with gabedit.
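To confirm that the build actually produced something, you can look for the binary and glance at the error log. This is a rough sketch under a couple of assumptions: check_build is a helper name I made up, the binary is expected at $NWCHEM_TOP/bin/$NWCHEM_TARGET/nwchem, and the fallback values are only placeholders for when the variables aren't set in the current shell.

```shell
# check_build TOP TARGET: succeed if TOP/bin/TARGET/nwchem exists and is executable
check_build() {
    binary="$1/bin/$2/nwchem"
    if [ -x "$binary" ]; then
        echo "built: $binary"
    else
        echo "no executable at $binary" >&2
        return 1
    fi
}

# Placeholder fallbacks in case the build variables are not set in this shell
top="${NWCHEM_TOP:-/opt/nwchem/nwchem-6.3-src.2013-05-28.ifort}"
target="${NWCHEM_TARGET:-LINUX64}"

check_build "$top" "$target" || echo "have a look at $top/src/make.err"

# Show the end of the error log if the build wrote one
if [ -s "$top/src/make.err" ]; then
    tail -n 20 "$top/src/make.err"
fi
```

Note that a non-empty make.err doesn't by itself mean the build failed, since compiler warnings end up there too.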

3. Performance
This is always a bit contentious, and I want to be upfront with the fact that I haven't spent much time considering whether my test example is a good one. I simply did a geo-opt + vibrational analysis as shown in this post: http://verahill.blogspot.com.au/2013/05/430-briefly-crude-comparison-of.html

The jobs were run using all cores available on that node.
gnu= gfortran + acml 5.3.1 for the Phenom and FX8150, and openblas for the i5-2400 and the Athlon 3800+.
ifort= ifort + mkl for all architectures.

The times are in seconds and are CPU times, not wall times.

Arch                      cores  gnu      ifort   Instruction sets
-------------------------------------------------------------------------
AMD Athlon 64 X2 3800+ :  2      10828*   12516   sse, sse2, sse3
AMD Phenom II X6 1055T :  6      2044     2048    sse, sse2, sse3
AMD FX8150             :  8      1611     1507    sse, sse2, sse3, AVX, FMA4
Intel i5-2400          :  4      1652*    1498    sse, sse2, sse3, sse4, AVX
* = gfortran + openblas rather than gfortran + acml

In the last case (the i5-2400) I also compiled using gfortran but with mkl, and got 1550 s.

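It's a fairly small sample set, but it does seem that there's a small advantage with mkl+ifort over gfortran+acml on the newest AMD core (1507 s vs 1611 s on the FX8150, i.e. roughly 6% less CPU time). On the Phenom the two are indistinguishable, and on the old Athlon ifort+mkl was actually slower than gfortran+openblas. One would need much more data though.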
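To put numbers on the differences, here's a small sketch that computes how much CPU time ifort+mkl saves relative to the gnu build, as (gnu - ifort) / gnu in percent. The times are copied from the table above; the shortened row labels are my own, and a negative number means that ifort was slower.

```shell
# CPU times in seconds from the table above: label, gnu, ifort
times='Athlon-3800+ 10828 12516
Phenom-1055T 2044 2048
FX8150 1611 1507
i5-2400 1652 1498'

# Relative saving of ifort over gnu in percent, one line per machine
savings=$(printf '%s\n' "$times" | awk '{ printf "%s %.1f\n", $1, ($2 - $3) / $2 * 100 }')
echo "$savings"
```

This gives -15.6 for the Athlon, -0.2 for the Phenom, 6.5 for the FX8150 and 9.3 for the i5-2400. Keep in mind that two of the gnu builds used openblas and the other two acml, so the rows aren't strictly comparable with each other.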

A clear downside of using mkl and ifort is the fact that they are not freely available though -- i.e. you can register and download them for free for non-commercial use, but there's no guarantee that your colleague, next-door-neighbour or distant-cousin will be able to use them.

01 July 2013

469. Intel compiler (icc, icpc, ifort) on Debian

I've heard it said that MKL is faster than ACML even on AMD cpus. I've also heard it said that the Intel compiler + mkl beats everything else, even on AMD cpus.

So let's test the veracity of those statements. I'm in particular looking forward to seeing how this affects my amd fx 8150.

Note that the ACML libs are available as separate packages for different compilers -- download the right libs when linking (i.e. the gnu ones for gcc, and the intel ones for intel composer)

But first we need to install and set-up the intel compiler suite, and that's what we'll do in this post.

In the example below I've installed it on an AMD Athlon II X3, hence the message about non-Intel architecture.

Installation:

Register for the Intel Parallel Studio XE as touched upon in this post: http://verahill.blogspot.com.au/2013/06/465-intel-mkl-math-kernel-library-on.html

Download. It's about 2 GB. Then extract, and run install.sh

sudo apt-get install build-essential
sudo sh install.sh
Step no: 1 of 7 | Welcome
--------------------------------------------------------------------------------
Welcome to the Intel(R) Parallel Studio XE 2013 Update 3 for Linux* installation
program.
--------------------------------------------------------------------------------
You will complete the steps below during this installation:
Step 1 : Welcome
Step 2 : License
Step 3 : Activation
Step 4 : Intel(R) Software Improvement Program
Step 5 : Options
Step 6 : Installation
Step 7 : Complete
--------------------------------------------------------------------------------
Press "Enter" key to continue or "q" to quit:
--------------------------------------------------------------------------------
Checking the prerequisites. It can take several minutes. Please wait...
--------------------------------------------------------------------------------
Step no: 1 of 7 | Options > Missing Optional Pre-requisite(s)
--------------------------------------------------------------------------------
There are one or more optional unresolved issues. It is highly recommended to
resolve them all before you continue the installation. You can fix them without
exiting from the installation and re-check. Or you can quit from the
installation, fix them and run the installation again.
--------------------------------------------------------------------------------
Missing optional pre-requisites
-- Intel(R) VTune(TM) Amplifier XE 2013 Update 5: unsupported OS
-- Intel(R) Inspector XE 2013 Update 5: unsupported OS
-- Intel(R) Advisor XE 2013 Update 2: unsupported OS
-- Intel(R) Composer XE 2013 Update 3 for Linux*: unsupported OS
--------------------------------------------------------------------------------
1. Skip missing optional pre-requisites [default]
2. Show the detailed info about issue(s)
3. Re-check the pre-requisites
h. Help
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [1]:

Step no: 2 of 7 | License
--------------------------------------------------------------------------------
As noted in the Intel(R) Software Development Product End User License
Agreement, the Intel(R) Software Development Product you install will send Intel
[..]
--------------------------------------------------------------------------------
Do you agree to be bound by the terms and conditions of this license agreement?
Type "accept" to continue or "decline" to back to the previous menu: accept

Step no: 3 of 7 | Activation
--------------------------------------------------------------------------------
If you have purchased this product and have the serial number and a connection
to the internet you can choose to activate the product at this time. Activation
is a secure and anonymous one-time process that verifies your software licensing
rights to use the product. Alternatively, you can choose to evaluate the product
or defer activation by choosing the evaluate option. Evaluation software will
time out in about one month. Also you can use license file, license manager, or
remote activation if the system you are installing on does not have internet
access activation options.
--------------------------------------------------------------------------------
1. I want to activate my product using a serial number [default]
2. I want to evaluate my product or activate later
3. I want to activate either remotely, or by using a license file, or by using a
   license manager
h. Help
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [1]:

Note: Press "Enter" key to back to the previous menu.
Please type your serial number (the format is XXXX-XXXXXXXX):
--------------------------------------------------------------------------------
Activation completed successfully.
--------------------------------------------------------------------------------
Press "Enter" key to continue:

Step no: 4 of 7 | Intel(R) Software Improvement Program
--------------------------------------------------------------------------------
Help improve your experience with Intel(R) software
Participate in the design of future Intel software. Select 'Yes' to give us
permission to learn about how you use your Intel software and we will do the
rest.
- No Personal contact information is collected
- There are no surveys or additional follow-up emails by opting in
- You can stop participating at any time
Learn more about Intel(R) Software Improvement Program
http://software.intel.com/en-us/articles/software-improvement-program
With your permission, Intel may automatically receive anonymous information
about how you use your current and future Intel software.
--------------------------------------------------------------------------------
1. Yes, I am willing to participate and improve Intel software. (Recommended)
2. No, I don't want to participate in the Intel(R) Software Improvement Program
   at this time.
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection:

Step no: 5 of 7 | Options
--------------------------------------------------------------------------------
You are now ready to begin installation. You can use all default installation
settings by simply choosing the "Start installation Now" option or you can
customize these settings by selecting any of the change options given below
first. You can view a summary of the settings by selecting
"Show pre-install summary".
--------------------------------------------------------------------------------
1. Start installation Now
2. Change install directory [ /opt/intel ]
3. Change components to install [ All ]
4. Change advanced options
5. Show pre-install summary
h. Help
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [1]:
--------------------------------------------------------------------------------
Checking the prerequisites. It can take several minutes. Please wait...
--------------------------------------------------------------------------------
Step no: 5 of 7 | Options > Missing Optional Pre-requisite(s)
--------------------------------------------------------------------------------
There are one or more optional unresolved issues. It is highly recommended to
resolve them all before you continue the installation. You can fix them without
exiting from the installation and re-check. Or you can quit from the
installation, fix them and run the installation again.
--------------------------------------------------------------------------------
Missing optional pre-requisites
-- Intel(R) VTune(TM) Amplifier XE 2013 Update 5: The system does not have an
   Intel Architecture processor
--------------------------------------------------------------------------------
1. Skip missing optional pre-requisites [default]
2. Show the detailed info about issue(s)
3. Re-check the pre-requisites
h. Help
b. Back to the previous menu
q. Quit
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [1]:

Step no: 7 of 7 | Complete
--------------------------------------------------------------------------------
Thank you for installing and using the
Intel(R) Parallel Studio XE 2013 Update 3 for Linux*

Reminder: Intel(R) VTune(TM) Amplifier XE users must be members of the "vtune"
permissions group in order to use Event-based Sampling.

To register your product purchase, visit
https://registrationcenter.intel.com/RegCenter/registerexpress.aspx?clientsn=N433-4XGWTJLB

To get started using Intel(R) VTune(TM) Amplifier XE 2013 Update 5:
- To set your environment variables: source /opt/intel/vtune_amplifier_xe_2013/amplxe-vars.sh
- To start the graphical user interface: amplxe-gui
- To use the command-line interface: amplxe-cl
- For more getting started resources:
  /opt/intel/vtune_amplifier_xe_2013/documentation/en/welcomepage/get_started.html.

To get started using Intel(R) Inspector XE 2013 Update 5:
- To set your environment variables: source /opt/intel/inspector_xe_2013/inspxe-vars.sh
- To start the graphical user interface: inspxe-gui
- To use the command-line interface: inspxe-cl
- For more getting started resources:
  /opt/intel/inspector_xe_2013/documentation/en/welcomepage/get_started.html.

To get started using Intel(R) Advisor XE 2013 Update 2:
- To set your environment variables: source /opt/intel/advisor_xe_2013/advixe-vars.sh
- To start the graphical user interface: advixe-gui
- To use the command-line interface: advixe-cl
- For more getting started resources:
  /opt/intel/advisor_xe_2013/documentation/en/welcomepage/get_started.html.

To get started using Intel(R) Composer XE 2013 Update 3 for Linux*:
- Set the environment variables for a terminal window using one of the following
  (replace "intel64" with "ia32" if you are using a 32-bit platform).
  For csh/tcsh:
  $ source /opt/intel/bin/compilervars.csh intel64
  For bash:
  $ source /opt/intel/bin/compilervars.sh intel64
  To invoke the installed compilers:
  For C++: icpc
  For C: icc
  For Fortran: ifort
  To get help, append the -help option or precede with the man command.
- For more getting started resources:
  /opt/intel/composer_xe_2013/Documentation/en_US/get_started_lc.htm.
  /opt/intel/composer_xe_2013/Documentation/en_US/get_started_lf.htm.

To view movies and additional training, visit
http://www.intel.com/software/products.
--------------------------------------------------------------------------------
q. Quit [default]
--------------------------------------------------------------------------------
Please type a selection or press "Enter" to accept default choice [q]:


The Files and Setup:
The compiler binaries are now found in /opt/intel/composer_xe_2013.3.163/bin/intel64/ . Of particular interest are ifort, icc and icpc (fortran, c and c++).

In addition, you'll need the lib and include files, which are found in /opt/intel/composer_xe_2013.3.163/compiler/lib/intel64/ and /opt/intel/composer_xe_2013.3.163/compiler/include/intel64.

You can simply add the libs using LD_LIBRARY_PATH, but a perhaps easier and better method is to create a file, /etc/ld.so.conf.d/intel.conf, containing:
/opt/intel/composer_xe_2013.3.163/compiler/lib/intel64
/opt/intel/composer_xe_2013.4.183/mkl/lib/intel64
Once that's done, run
sudo ldconfig
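
If you'd rather create intel.conf from the command line than in an editor, something along these lines should do. It's only a sketch: it writes the two directories mentioned above to a temporary staging file (my own addition, so that nothing under /etc is touched until you've looked at the result), and the actual install step is left as a comment because it needs root.

```shell
# Write the two library directories to a staging file, one per line
staging=$(mktemp)
printf '%s\n' \
    /opt/intel/composer_xe_2013.3.163/compiler/lib/intel64 \
    /opt/intel/composer_xe_2013.4.183/mkl/lib/intel64 > "$staging"
cat "$staging"

# Once the contents look right, install the file and refresh the linker cache:
#   sudo cp "$staging" /etc/ld.so.conf.d/intel.conf
#   sudo ldconfig
```

Adjust the version numbers in the paths to whatever you actually have under /opt/intel.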

Then do
echo 'PATH=$PATH:/opt/intel/composer_xe_2013.3.163/bin/intel64' >> ~/.bashrc
source ~/.bashrc
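
A quick way of checking that the PATH change took effect is to ask the shell where it finds the three compilers. This is a minimal sketch; find_tool is just a helper name I made up.

```shell
# find_tool NAME: print where NAME resolves to, or fail if it is not on PATH
find_tool() {
    location=$(command -v "$1") || { echo "$1 not found on PATH"; return 1; }
    echo "$1 -> $location"
}

# Check the fortran, c and c++ compilers in turn
for compiler in ifort icc icpc; do
    find_tool "$compiler" || true
done
```

If any of them is reported as not found, open a new terminal (or source ~/.bashrc again) and double-check the directory you appended to PATH.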

Testing:
See this post for an example of how to compile nwchem using ifort: http://verahill.blogspot.com.au/2013/07/470-very-briefly-compiling-nwchem-63.html