Biostatistical Computing, PHC 6068

HiperGator

Zhiguang Huo (Caleb)

Wednesday September 23, 2020

Basic usage of HiperGator

Most of these commands also apply to other Linux machines.

Preparation

How to log in to HiperGator (Windows)

How to log in to HiperGator (Mac, Linux)

Your HiperGator Home directory

Common linux commands:
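
As a self-contained sketch of the most common commands (run in any scratch directory; `demo_dir` and the file names are placeholders):

```shell
# work in a throwaway directory so nothing important is touched
mkdir -p demo_dir          # make a directory (-p: no error if it already exists)
cd demo_dir                # change into it
pwd                        # print the current working directory

echo "line1" >  notes.txt  # create a small file
echo "line2" >> notes.txt  # append a second line to it
ls                         # list files in the current directory
cat notes.txt              # print the whole file
head -n 1 notes.txt        # print only the first line

cp notes.txt backup.txt    # copy a file
mv backup.txt old.txt      # rename (move) a file
rm old.txt                 # delete a file
cd ..                      # move back to the parent directory
```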


FileZilla


You can transfer files between your local computer and HiperGator.
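
FileZilla is graphical; the same transfers can be done from the command line with scp. A sketch that only writes the two commands to a reference file (the username is a placeholder, and hpg.rc.ufl.edu is assumed here as the login host):

```shell
# assumed values -- replace with your own GatorLink username
HPG_USER="gatorlink"
HPG_HOST="hpg.rc.ufl.edu"

# write the two transfer commands to a small reference script;
# run them from your LOCAL machine, not from HiperGator
cat > transfer_commands.sh <<EOF
# upload: local file -> your HiperGator home directory
scp mycars.csv ${HPG_USER}@${HPG_HOST}:~/
# download: HiperGator file -> current local directory
scp ${HPG_USER}@${HPG_HOST}:~/mycars.csv .
EOF

cat transfer_commands.sh
```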

Login node and working node


Open an interactive session on HiperGator (1)

## open interactive R session
module load ufrc
srundev  --account=phc6068 --qos=phc6068 --time=04:00:00

module load R ## load R
R

Open an interactive session on HiperGator (2)

## open interactive R session
srun --account=phc6068 --qos=phc6068 --ntasks=1 --cpus-per-task 1 --mem=8gb  --time=04:00:00 --pty bash -i

module load R ## load R

R

Open an interactive R session on HiperGator

## do the following on hiperGator
getwd()
dir()

head(cars)
mycars <- cars
write.csv(mycars, "mycars.csv")
dir()

Submit a job (I)

  1. R script (saveCars.R): contains your R code
  2. SLURM job script (saveCars.slurm): coordinates your job with the scheduler
  3. Submit: sbatch the SLURM file
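
The three pieces above can be sketched in one shot: write a minimal SLURM script from the shell, then submit it. The sbatch call only works on HiperGator, so it is left commented here; the directive values are the class examples, to be adjusted to your own account.

```shell
# write a minimal SLURM job script (quoted EOF: nothing is expanded)
cat > saveCars.slurm <<'EOF'
#!/bin/sh
#SBATCH --job-name=serial_job_test
#SBATCH --account=phc6068
#SBATCH --qos=phc6068
#SBATCH --ntasks=1
#SBATCH --mem=8gb
#SBATCH --time=04:00:00
module load R
R CMD BATCH saveCars.R
EOF

cat saveCars.slurm      # inspect what will be submitted
# sbatch saveCars.slurm # run this on HiperGator to actually submit
```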

Prepare R script (a simple one) (I)

WD <- "/blue/phc6068/share/zhuo/example/testR" ## change to your own directory
dir.create(WD, recursive = TRUE) ## create the folder (and any parent folders) if needed

setwd(WD) ## set to your own directory!
mycars <- mtcars
write.csv(mycars,"mycars.csv")

Prepare SLURM job script (I)

#!/bin/sh
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --account=phc6068             # your own sponsor or the account for this class
#SBATCH --qos=phc6068                 # your own sponsor QOS or the QOS for this class
#SBATCH --mail-type=ALL               # Mail events
#SBATCH --mail-user=xx@xx.xx          # Where to send email 
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --cpus-per-task 1             # Run on a single CPU
#SBATCH --mem=8gb                     # Memory limit
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --output=serial_test_%j.out   # Output and error log 

pwd; hostname; date 

module load R 

echo "Running save cars script on a single CPU core" 

R CMD BATCH saveCars.R ## make sure saveCars.R is at your current working directory
## R --no-save --quiet --slave < saveCars.R ## alternative way

date

Submit the job (I)

cd /blue/phc6068/share/zhuo/example/testR
sbatch saveCars.slurm ## submit job
  1. Submit the SLURM job (saveCars.slurm)
  2. The SLURM job runs the R script (saveCars.R)
  3. The R script writes the result

Check log file (I)

cd /blue/phc6068/share/zhuo/example/testR
cat serial_test_25280301.out  ## print the whole log (your job ID will differ)
head serial_test_25280301.out ## print the first lines of the log
more serial_test_25280301.out ## page through the log

cat saveCars.Rout ## R output captured by R CMD BATCH
cat mycars.csv    ## result file written by the R script
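
Besides cat, head, and more, grep and wc are useful for scanning long log files. A self-contained sketch using a made-up log (`fake_test.out` is a placeholder, not a real job log):

```shell
# create a small fake log file to practice on
printf 'starting job\nWarning message:\ndone\n' > fake_test.out

wc -l fake_test.out               # count lines in the log
grep -i "warning" fake_test.out   # find warning lines (case-insensitive)
tail -n 1 fake_test.out           # show the last line (did the job finish?)
grep -c "done" fake_test.out      # count occurrences of "done"
```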

Exercise (I)

  1. Copy the saveCars.R and saveCars.slurm into your own working directory
  2. Revise the saveCars.R
    • Change to your own working directory
  3. Try to revise the saveCars.slurm
    • Try to specify your email, revise time and memory
  4. Submit your job

Submit a job with external argument (II)

R script (with external arguments) (II)

args = commandArgs(trailingOnly = TRUE) ## pass in external argument

rowID <- args[1]
aarg <- as.numeric(rowID)

setwd("/blue/phc6068/share/zhuo/example/testR2")
mycars <- mtcars[aarg,]
filename <- paste0("arg",aarg,".csv")
write.csv(mycars,filename)
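
In shell terms, the value after --args plays the same role as a positional parameter: a minimal shell analogue of saveCarsArgs.R (the script and file names here are hypothetical):

```shell
# a tiny script that takes one argument, like args[1] in R
cat > save_row.sh <<'EOF'
#!/bin/sh
rowID=$1                          # first external argument
echo "row ${rowID}" > "arg${rowID}.txt"
EOF
chmod +x save_row.sh

./save_row.sh 1                   # analogous to: R ... --args 1 < saveCarsArgs.R
cat arg1.txt
```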

SLURM job script (II)

#!/bin/sh
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --account=phc6068             # your own sponsor or the account for this class
#SBATCH --qos=phc6068                 # your own sponsor QOS or the QOS for this class
#SBATCH --mail-type=ALL               # Mail events
#SBATCH --mail-user=xx@xx.xx          # Where to send email 
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --cpus-per-task 1             # Run on a single CPU
#SBATCH --mem=8gb                    # Memory limit
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --output=serial_test_%j.out   # Output and error log 

pwd; hostname; date 

module load R 

echo "Running save cars script on a single CPU core" 

R --no-save --quiet --slave --args 1 < saveCarsArgs.R ## again, make sure the R file is at your current working directory.

date

Submit the job (II)

cd /blue/phc6068/share/zhuo/example/testR2
sbatch saveCarsArgs.slurm ## submit job
  1. Submit the SLURM job (saveCarsArgs.slurm)
  2. The SLURM job runs the R script (saveCarsArgs.R) with an extra argument
  3. The R script writes the result

Check log file (II)

cd /blue/phc6068/share/zhuo/example/testR2
cat serial_test_25280860.out ## you may have your own log file name
cd /blue/phc6068/share/zhuo/example/testR2
cat arg1.csv

Submit a job with loops (III)

SLURM job script (III)

#!/bin/sh
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --account=phc6068             # your own sponsor or the account for this class
#SBATCH --qos=phc6068                 # your own sponsor QOS or the QOS for this class
#SBATCH --mail-type=ALL               # Mail events
#SBATCH --mail-user=xx@xx.xx          # Where to send email 
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --cpus-per-task 1             # Run on a single CPU
#SBATCH --mem=8gb                    # Memory limit
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --output=serial_test_%j.out   # Output and error log 

pwd; hostname; date 

module load R 

for i in {2..10}
do
echo "Running save cars" $i 
R --no-save --quiet --slave --args $i < saveCarsArgs.R 
done

date

Submit the job (III)

cd /blue/phc6068/share/zhuo/example/testR3
sbatch saveCarsArgsLoops.slurm ## submit a loop job
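
A common alternative to looping inside one job is a SLURM job array, where each value of the argument runs as its own task (in parallel rather than sequentially). A sketch, left unsubmitted here since sbatch only exists on the cluster:

```shell
# quoted EOF keeps $SLURM_ARRAY_TASK_ID literal inside the script
cat > saveCarsArray.slurm <<'EOF'
#!/bin/sh
#SBATCH --job-name=array_job_test
#SBATCH --account=phc6068
#SBATCH --qos=phc6068
#SBATCH --ntasks=1
#SBATCH --mem=8gb
#SBATCH --time=04:00:00
#SBATCH --array=2-10                  # one task per value, 2 through 10
#SBATCH --output=array_test_%A_%a.out # %A: job ID, %a: array index
module load R
# each array task receives its own index in $SLURM_ARRAY_TASK_ID
R --no-save --quiet --slave --args $SLURM_ARRAY_TASK_ID < saveCarsArgs.R
EOF

cat saveCarsArray.slurm
# sbatch saveCarsArray.slurm   # run on HiperGator to submit all 9 tasks at once
```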

Check job status

Use squeue -u your_username to list your jobs and their states, and scancel job_ID to cancel a job.

Burst mode

If all regular computing resources under your group's allocation are occupied, jobs can be submitted to the burst QOS, which draws on idle cluster capacity.
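
Only the qos directive changes in the job script. The "-b" suffix below follows the usual HiperGator burst naming convention, but this is an assumption; confirm the exact QOS name for your group in the UFRC documentation.

```shell
#SBATCH --account=phc6068      # account stays the same
#SBATCH --qos=phc6068-b        # burst QOS: group QOS plus "-b" (assumed naming; verify for your group)
```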