19 January 2016

627. Very briefly: Turn off screen blanking in GNOME/debian jessie

The only thing that worked for turning off screen-blanking when using youtube or vlc was an ugly little hack:
I installed xdotool and ran crontab -e and added:
*/2 * * * * DISPLAY=:0 XAUTHORITY=/home/andy/.Xauthority xdotool key Ctrl
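
The crontab entry above hard-codes my user name and the display. A minimal sketch of how the same entry could be generated for another user follows; it assumes bash, that the X session runs on display :0 and that the home directory is /home/<user>. cron_line is a hypothetical helper name, not part of xdotool or cron.

```shell
#!/bin/bash
# Hypothetical helper: print the keep-awake crontab entry for a given user.
# Assumes the X session is on display :0 and that ~/.Xauthority lives in /home/<user>.
cron_line() {
    local user="$1" minutes="${2:-2}"
    printf '*/%s * * * * DISPLAY=:0 XAUTHORITY=/home/%s/.Xauthority xdotool key Ctrl\n' \
        "$minutes" "$user"
}

cron_line andy 2
```

The printed line can be pasted into crontab -e, or appended non-interactively with something like (crontab -l; cron_line "$USER") | crontab -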

Original post
In spite of not having any screen saver active and in spite of setting "Blank Screen" to "Never" under Power options, my screen kept blanking after 1-3 minutes of watching TV using mythtv. I tried a lot of things, including using xset, but the only thing that worked in the end was editing my /etc/X11/xorg.conf (I use an nvidia driver) and adding the Option lines at the end of this section:
Section "ServerLayout"
    Identifier     "X.org Configured"
    Screen      0  "Screen0" 0 0
    Screen      1  "Screen1" RightOf "Screen0"
    Screen      2  "Screen2" RightOf "Screen1"
    InputDevice    "Mouse0" "CorePointer"
    InputDevice    "Keyboard0" "CoreKeyboard"
    Option "BlankTime" "0"
    Option "StandbyTime" "0"
    Option "SuspendTime" "0"
    Option "OffTime" "0"
    Option "DPMS" "0"
Now the screen never blanks when watching TV using mythtv. On the other hand, it still blanks when watching DVDs in VLC...

18 January 2016

626. Briefly: Gaussian and cloud computing -- Gaussian G09D with Slurm on aws/ec2

Note: you may want to install awscli and euca2ools. I didn't, so I don't actually know whether they are useful.

My instructions are quite rudimentary since I don't have much time to write these blog posts anymore. Hopefully there's enough information to get you through.

Either way, sign up for AWS. If you already have an amazon ID I think you can use that. Go to https://aws.amazon.com/

Select Launch an Instance, pick the Ubuntu AMI and click Review and Launch. I launched it as a t2.micro instance type, as it is free and sufficient for setting up, though not for running jobs.

Hit launch, and create a new key pair. I called mine myfirstkeypair and saved the pem file in my ~/Downloads folder

In my Downloads folder:
ssh -i "myfirstkeypair.pem" ubuntu@ec2-11-222-33-444.us-west-2.compute.amazonaws.com
I then set a password in the ubuntu AWS image:
sudo passwd ubuntu

I added my id_rsa.pub to ~/.ssh/authorized_keys on the ubuntu AWS image to make logging in via ssh easier -- that way I won't need the pem file.
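
One way to do the key step is sketched below. It assumes bash and that the public key has already been copied to the instance (for example with scp -i myfirstkeypair.pem). append_key is a hypothetical helper that adds the key only if that exact line isn't already present.

```shell
#!/bin/bash
# Run on the instance. Appends a public key to an authorized_keys file
# unless the identical line is already there. Both paths are passed in explicitly.
append_key() {
    local key_file="$1" auth_file="$2"
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"   # keep the file private to its owner
    grep -qxF -- "$(cat "$key_file")" "$auth_file" || cat "$key_file" >> "$auth_file"
}

# Example: append_key ~/id_rsa.pub ~/.ssh/authorized_keys
```

Running it twice with the same key leaves a single entry, so it is safe to repeat.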

Set up Gaussian
I then connected with SCP and uploaded my gaussian files -- I went straight for EM64T G09D. It went quite fast at +5 MB/s

scp E6L-103X.tgz ubuntu@ec2-00-111-22-333.us-west-2.compute.amazonaws.com:/home/ubuntu/E6L-103X.tgz

Once that was done, on the ubuntu AWS instance I did:
sudo apt-get install csh 
sudo mkdir /opt/gaussian
cd /opt 
sudo chown ubuntu gaussian -R
cd /opt/gaussian
cp ~/E6L-103X.tgz .
tar xvf E6L-103X.tgz
cd g09
csh bsd/install

echo 'export GAUSS_EXEDIR=/opt/gaussian/g09/bsd:/opt/gaussian/g09/local:/opt/gaussian/g09/extras:/opt/gaussian/g09' >> ~/.bashrc
echo 'export GAUSS_SCRDIR=/home/ubuntu/scratch' >> ~/.bashrc
echo 'export PATH=$PATH:/opt/gaussian/g09' >> ~/.bashrc
source ~/.bashrc 
mkdir ~/scratch ~/jobs

NOTE that you can't run any gaussian jobs on a t2.micro instance. You will have to stop it and relaunch as at least a t2.small instance, or the jobs will be 'Killed' (that's what is echoed in the terminal when you try to run them).
Note that if you terminate an instance it will be deleted.

Stop the instance and then create a snapshot or an image from it to keep everything you've installed.

Set up Slurm
You'll want a queue manager so that you can launch several jobs in serial. Also, you can set up your script so that it shuts down the image when your job is done to save money.

sudo apt-get update
sudo apt-get install slurm-llnl

Create /etc/slurm-llnl/slurm.conf with the following content:
ControlMachine=localhost
ControlAddr=
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=2
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/linear
AccountingStorageType=accounting_storage/none
ClusterName=rupert
JobAcctGatherType=jobacct_gather/none
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
NodeName=localhost NodeAddr=
PartitionName=All Nodes=localhost
sudo /usr/sbin/create-munge-key
Edit /etc/default/munge:
Then run
sudo service slurm-llnl restart
sudo service munge restart 
Test using slurm.batch
#!/bin/bash
#
#SBATCH -p All
#SBATCH --job-name=test
#SBATCH --output=res.txt
#
#SBATCH --ntasks=1
#SBATCH --time=10:00
srun hostname
srun sleep 60
and submit with
sbatch slurm.batch
Check the queue with squeue; the job should be listed as running (R):
                 2       All     test   ubuntu  R       0:08      1 localhost

#!/bin/csh
#SBATCH -p All
#SBATCH --time=9999999
#SBATCH --output=slurm.out
#SBATCH --job-name=benchmark
setenv GAUSS_SCRDIR /home/ubuntu/scratch
setenv GAUSS_EXEDIR /opt/gaussian/g09/bsd:/opt/gaussian/g09/local:/opt/gaussian/g09/extras:/opt/gaussian/g09
/opt/gaussian/g09/g09 < benchmark.in > benchmark.out
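
Since the point of the queue manager is to line up several jobs, the script above can be generated per input file. This is only a sketch, assuming bash: make_batch is a hypothetical helper, and the per-job output file names are my own choice. The partition and paths are the ones used in this post.

```shell
#!/bin/bash
# Hypothetical helper: print a csh batch script for the Gaussian input <name>.in.
# Partition name and paths are the ones used in this post.
make_batch() {
    local name="$1"
    cat <<EOF
#!/bin/csh
#SBATCH -p All
#SBATCH --time=9999999
#SBATCH --output=$name.slurm.out
#SBATCH --job-name=$name
setenv GAUSS_SCRDIR /home/ubuntu/scratch
setenv GAUSS_EXEDIR /opt/gaussian/g09/bsd:/opt/gaussian/g09/local:/opt/gaussian/g09/extras:/opt/gaussian/g09
/opt/gaussian/g09/g09 < $name.in > $name.out
EOF
}

# Example: write and submit one script per input file in the current directory
# for f in *.in; do n="${f%.in}"; make_batch "$n" > "$n.batch"; sbatch "$n.batch"; done
```

With a single node and SelectType=select/linear, the jobs submitted this way should run one after the other.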
Using the same opt/freq benchmark as in post 621.

c4.2xlarge   2h 11 min [1h 20 min]    8 vcpu/16 GB
c4.4xlarge   1h 15 min [   44 min]   16 vcpu/32 GB
c4.8xlarge      41 min [   25 min]   36 vcpu/60 GB

It scales surprisingly well, although not perfectly linearly: going from 8 to 16 vcpus cuts the run time by a factor of about 1.7, not 2. Because larger instances cost more per hour, the smaller instance is cheaper per job, so unless time is critical or the extra memory is needed, c4.8xlarge is not the first choice.
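
To put a rough number on the cost argument, the sketch below multiplies the first run time in each row of the table (in minutes) by a relative hourly price. The 1 : 2 : 4 price ratio is an ASSUMPTION for illustration (price doubling with each step up in size), not a quoted AWS price; relative_cost is a hypothetical helper name.

```shell
#!/bin/bash
# Relative cost per job = run time (minutes) x relative hourly price.
# Run times are from the table above; the price column (1, 2, 4) is an ASSUMPTION.
relative_cost() {
    awk '{ printf "%s %d\n", $1, $2 * $3 }' <<'EOF'
c4.2xlarge 131 1
c4.4xlarge 75 2
c4.8xlarge 41 4
EOF
}

relative_cost
```

Under that assumption the same job costs roughly 15% more on c4.4xlarge and 25% more on c4.8xlarge than on c4.2xlarge.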

You might want to use dropbox to transfer files back and forth, especially finished job files (useful if you shut down the machine using a slurm script as shown below)

cd ~ && wget -O - "https://www.dropbox.com/download?plat=lnx.x86_64" | tar xzf -
Start the daemon with ~/.dropbox-dist/dropboxd. It will print something like this:
This computer isn't linked to any Dropbox account...
Please visit https://www.dropbox.com/cli_link_nonce?nonce=0011223344556677889900aabbccddeef to link this device.

Open that link in a browser, then go back to the terminal.
wget -O dropbox.py "https://www.dropbox.com/download?dl=packages/dropbox.py"
sudo mv dropbox.py /usr/local/bin
sudo chmod +x /usr/local/bin/dropbox.py
dropbox.py autostart y

Now, since you don't want to use up space unnecessarily (you're paying for it after all), exclude as many directories as possible. To exclude all existing dropbox dirs, do
cd ~/Dropbox
dropbox.py exclude add `ls -d */`
dropbox.py exclude add `ls *.*`
dropbox.py exclude list

Note that it can't handle directories with spaces in the name, so you'll need to polish the list by hand. Next create a directory where you want to run and store your jobs, e.g.
mkdir ~/Dropbox/aws_jobs
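
The backtick commands in the exclusion step split directory names on spaces. A loop that passes each name as a single quoted argument should avoid that. This is a sketch, assuming bash; for_each_dir is a hypothetical helper, and it should be run before creating aws_jobs, since it excludes every directory it finds.

```shell
#!/bin/bash
# Run the given command once per subdirectory of the current directory,
# passing the directory name as a single (properly quoted) argument.
for_each_dir() {
    local d
    for d in */; do
        [ -d "$d" ] || continue   # no subdirectories: the glob stays literal
        "$@" "${d%/}"
    done
}

# Example (inside ~/Dropbox): for_each_dir dropbox.py exclude add
```

Trying it first as for_each_dir echo shows which names would be passed along without changing anything.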

When you run a gaussian job, make sure to specify where the .chk files should end up, e.g.
so that you don't use up space/bandwidth for your chk files (unless of course you want to).
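
As an illustration only, assuming the checkpoint file should go in the scratch directory set up earlier, the %chk line at the top of a Gaussian input could be produced like this. chk_line is a hypothetical helper name and benchmark is just the job name used elsewhere in this post.

```shell
#!/bin/bash
# Hypothetical helper: print a Link 0 line that puts the checkpoint file
# in the scratch directory instead of under ~/Dropbox.
# Falls back to /home/ubuntu/scratch if GAUSS_SCRDIR is not set.
chk_line() {
    printf '%%chk=%s/%s.chk\n' "${GAUSS_SCRDIR:-/home/ubuntu/scratch}" "$1"
}

chk_line benchmark
```

Keep in mind that anything placed in the scratch directory is lost if the job script cleans that directory when it finishes.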

Stop after execution:
Use a batch script along these lines:
#!/bin/csh
#SBATCH -p All
#SBATCH --time=9999999
#SBATCH --output=slurm.out
#SBATCH --job-name=benchmark
setenv GAUSS_SCRDIR /home/ubuntu/scratch
setenv GAUSS_EXEDIR /opt/gaussian/g09/bsd:/opt/gaussian/g09/local:/opt/gaussian/g09/extras:/opt/gaussian/g09
/opt/gaussian/g09/g09 < benchmark.in > benchmark.out
rm /home/ubuntu/scratch/*.*
sudo shutdown -h now
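
If several jobs are queued, an unconditional shutdown at the end of the first one would take the rest of the queue down with it. A possible guard, sketched here for a batch script written in bash rather than csh, is to shut down only when nothing else is left in the queue. list_jobs and is_last_job are hypothetical helper names; squeue -h lists jobs without the header line.

```shell
#!/bin/bash
# List queued and running jobs, one per line, without the header.
list_jobs() {
    squeue -h
}

# True when at most one job (the one running this script) is left in the queue.
is_last_job() {
    local n
    n=$(list_jobs | wc -l)
    [ "$((n))" -le 1 ]
}

# At the end of a bash batch script:
# is_last_job && sudo shutdown -h now
```

The check counts every job in the queue, so it assumes the instance is only used for these jobs.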

12 November 2015

625. In Progress: Dead node -- no power, unless 8-pin EPS/ATX12V1 unplugged

One of my nodes died last night. It's got a Corsair GS700, AsRock 990FX Extreme 3 mobo and an AMD FX 8150 cpu. The mobo and PSU are 2 years old, whereas the CPU is over three.

In its full configuration, it refuses to turn on. There's no indication that there's any power. No lights, no fan movements, no beeps.

I checked the PSU, and it's fine.

I then discovered that if I only plug in the 24-pin motherboard PSU connector, but don't attach the 8-pin EPS CPU power, the computer behaves as if there's power, i.e. all fans, including the one on the PCI-E VGA card, spin etc.

There's no output though, and no POST beeps.

Plugging in the EPS/ATX12V1 cable again leads to a dead system.

Removing the CPU or RAM still yields a dead system (not sure what a mobo without a CPU is supposed to do i.e. whether it's supposed to POST. Either way, it's dead).

The CMOS battery tests fine with a DMM. The EPS/ATX12V1 output yields 12 V on all four yellow pins.

The power button is obviously working given that the system 'turns on' when the EPS cable is unplugged.

So it's down to the MOBO or CPU being faulty.

I've ordered a POST card and a CPU socket tester, and will update when I've done further diagnostics.

UPDATE 18/11: I put an old Phenom II in the socket. System still won't light up. This points towards the motherboard being dead. That's not a bad thing as the mobo might still be under warranty.