Introduction to Biostatistical Computing PHC 6937

HiperGator

Zhiguang Huo (Caleb)

Wednesday September 21, 2022

HiperGator user agreement

Acceptable Use Policy

I acknowledge that the access to the HPC resources operated by UF Research Computing is subject to the UF Acceptable Use Policy at https://it.ufl.edu/policies/acceptable-use/acceptable-use-policy/ and the Research Computing policies at http://www.rc.ufl.edu/services/procedures/ and that I am responsible for following these policies.

I also certify that using restricted data and software on the HPC resources requires extra steps described at UFRC Policies and at UFRC Export Policies, and that I will notify both my account sponsor and the Office of Research (Research Compliance) and Research Computing at support.rc.ufl.edu when I am working with such data.

Basic usage of HiperGator

Most of these commands can also be used on other Linux machines.

Preparation

How to log in to HiperGator (RStudio)
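Besides the RStudio interface, you can also log in from a terminal with ssh. A minimal sketch (hpg.rc.ufl.edu is assumed to be the standard UFRC login host; check the UFRC documentation, and replace the placeholder with your GatorLink username):

ssh <GatorLink-username>@hpg.rc.ufl.edu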

Common Linux commands:

Your HiperGator Home directory

Common Linux commands:
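A minimal sketch of commonly used Linux commands (the file and folder names are placeholders):

pwd                  ## print the current working directory
ls -l                ## list files in the current directory with details
cd /blue/phc6937     ## change to another directory
cd ~                 ## go back to your home directory
mkdir myfolder       ## create a new folder
cp a.txt b.txt       ## copy a file
mv b.txt myfolder/   ## move (or rename) a file
rm myfolder/b.txt    ## delete a file; there is no trash bin, so be careful
cat a.txt            ## print a file's contents
head a.txt           ## print the first few lines of a file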

FileZilla for file transfer

You can transfer files between your local computer and HiperGator.
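FileZilla is the graphical route. A minimal sketch of a command-line alternative, run from your local terminal (assuming the same hpg.rc.ufl.edu login host; replace the username and paths with your own):

## local computer -> HiperGator
scp mycars.csv <GatorLink-username>@hpg.rc.ufl.edu:/blue/phc6937/share/zhuo/example/testR/

## HiperGator -> local computer
scp <GatorLink-username>@hpg.rc.ufl.edu:/blue/phc6937/share/zhuo/example/testR/mycars.csv .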

Login node and working node

Open an interactive session on HiperGator (1)

## open interactive R session
module load ufrc
srundev  --account=phc6937 --qos=phc6937 --time=04:00:00

module load R ## load R
R

A module is like a software package: loading a module makes that software available in your session.
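A few common module commands (a minimal sketch; module spider assumes the Lmod module system used on HiperGator):

module avail     ## list modules available to load
module spider R  ## search for a module and list its versions
module list      ## show the modules currently loaded in your session
module unload R  ## unload a module
module purge     ## unload all loaded modules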

Quit interactive session: exit

Open an interactive session on HiperGator (2)

## open interactive R session
srun --account=phc6937 --qos=phc6937 --ntasks=1 --cpus-per-task=1 --mem=8gb --time=04:00:00 --pty bash -i

module load R ## load R

R

Open an interactive R session on HiperGator

## do the following on HiperGator
getwd() ## print the current working directory
dir()   ## list the files in the current directory

head(cars)                      ## preview the built-in cars data set
mycars <- cars                  ## copy it into a new object
write.csv(mycars, "mycars.csv") ## write it out as a csv file
dir()                           ## mycars.csv should now appear

Submit a job (I)

  1. R script (saveCars.R): contains your R code
  2. SLURM job script (saveCars.slurm): tells the scheduler how to run your job on the server
  3. Submit: run sbatch on the SLURM job script

Prepare R script (a simple one) (I)

WD <- "/blue/phc6937/share/zhuo/example/testR" ## change to your own directory
dir.create(WD, recursive=T) ## force to create this folder

setwd(WD) ## set to your own directory!
mycars <- mtcars
write.csv(mycars,"mycars.csv")

Prepare SLURM job script (I)

#!/bin/sh
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --account=phc6937             # your own sponsor's account, or the account for this class
#SBATCH --qos=phc6937                 # your own sponsor's QOS, or the QOS for this class
#SBATCH --mail-type=ALL               # Mail events
#SBATCH --mail-user=xx@xx.xx          # Where to send email 
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --cpus-per-task=1             # Run on a single CPU
#SBATCH --mem=8gb                     # Memory limit
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --output=serial_test_%j.out   # Output and error log 

pwd; hostname; date 

module load R 

echo "Running save cars script on a single CPU core" 

R CMD BATCH saveCars.R ## make sure saveCars.R is in your current working directory
## R --no-save --quiet --slave < saveCars.R ## alternative way

date

Submit the job (I)

cd /blue/phc6937/share/zhuo/example/testR
sbatch saveCars.slurm ## submit the job

  1. Submit the SLURM job script (saveCars.slurm)
  2. The SLURM job runs the R script (R CMD BATCH saveCars.R)
  3. The R script writes out the result

Check log file (I)

cd /blue/phc6937/share/zhuo/example/testR

cat serial_test_25280301.out  ## you may have your own log file name
head serial_test_25280301.out ## you may have your own log file name
more serial_test_25280301.out ## you may have your own log file name

cat saveCars.Rout ## the R output log produced by R CMD BATCH
cat mycars.csv    ## the csv file written by saveCars.R
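While the job is still running, tail -f (not shown above) follows the log file as new lines are written:

tail -f serial_test_25280301.out ## use your own log file name; press Ctrl-C to stop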

Exercise (I)

  1. Copy saveCars.R and saveCars.slurm into your own working directory (see the sketch after this list)
  2. Revise saveCars.R
    • Transfer it to your local computer using FileZilla
    • Change the working directory in the script to your own directory
    • Transfer it back to HiperGator
  3. Revise saveCars.slurm
    • Specify your email address; revise the time and memory limits
  4. Submit your job
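A minimal sketch for step 1 (the destination directory is a placeholder; use your own folder under /blue/phc6937):

mkdir -p /blue/phc6937/<your-folder>/testR
cp /blue/phc6937/share/zhuo/example/testR/saveCars.R     /blue/phc6937/<your-folder>/testR/
cp /blue/phc6937/share/zhuo/example/testR/saveCars.slurm /blue/phc6937/<your-folder>/testR/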

Submit a job with external argument (II)

R script (with external arguments) (II)

args = commandArgs(trailingOnly = TRUE) ## pass in external arguments

rowID <- args[1]          ## the first trailing argument (a character string)
aarg <- as.numeric(rowID) ## convert it to a number

setwd("/blue/phc6937/share/zhuo/example/testR2")
mycars <- mtcars[aarg,]                 ## keep only the row selected by the argument
filename <- paste0("arg", aarg, ".csv") ## e.g., arg1.csv
write.csv(mycars, filename)
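To test the script interactively before submitting it, you can also run it with Rscript (a minimal sketch, not part of the slurm workflow below; Rscript passes trailing arguments to commandArgs(trailingOnly = TRUE) in the same way):

cd /blue/phc6937/share/zhuo/example/testR2
module load R
Rscript saveCarsArgs.R 1 ## passes 1 as the external argument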

SLURM job script (II)

#!/bin/sh
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --account=phc6937             # your own sponsor's account, or the account for this class
#SBATCH --qos=phc6937                 # your own sponsor's QOS, or the QOS for this class
#SBATCH --mail-type=ALL               # Mail events
#SBATCH --mail-user=xx@xx.xx          # Where to send email 
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --cpus-per-task=1             # Run on a single CPU
#SBATCH --mem=8gb                     # Memory limit
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --output=serial_test_%j.out   # Output and error log 

pwd; hostname; date 

module load R 

echo "Running save cars script on a single CPU core" 

R --no-save --quiet --slave --args 1 < saveCarsArgs.R ## again, make sure the R file is in your current working directory

date

Submit the job (II)

cd /blue/phc6937/share/zhuo/example/testR2
sbatch saveCarsArgs.slurm ## submit the job

  1. Submit the SLURM job script (saveCarsArgs.slurm)
  2. The SLURM job runs the R script (saveCarsArgs.R) with an extra argument
  3. The R script writes out the result

Check log file (II)

cd /blue/phc6937/share/zhuo/example/testR2
cat serial_test_25280860.out ## you may have your own log file name
cd /blue/phc6937/share/zhuo/example/testR2
cat arg1.csv

Submit a job with loops (III)

SLURM job script (III)

#!/bin/sh
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --account=phc6937             # your own sponsor's account, or the account for this class
#SBATCH --qos=phc6937                 # your own sponsor's QOS, or the QOS for this class
#SBATCH --mail-type=ALL               # Mail events
#SBATCH --mail-user=xx@xx.xx          # Where to send email 
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --cpus-per-task=1             # Run on a single CPU
#SBATCH --mem=8gb                     # Memory limit
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --output=serial_test_%j.out   # Output and error log 

pwd; hostname; date 

module load R 

for i in {2..10}
do
echo "Running save cars" $i 
R --no-save --quiet --slave --args $i < saveCarsArgs.R 
done

date

Submit the job (III)

cd /blue/phc6937/share/zhuo/example/testR3
sbatch saveCarsArgsLoops.slurm ## submit a loop job
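An alternative to looping inside a single job (a sketch, not covered in the slides): a SLURM job array submits one task per index, so the runs for rows 2 through 10 can execute in parallel instead of one after another. The script below reuses saveCarsArgs.R and replaces the loop variable with $SLURM_ARRAY_TASK_ID; the file name saveCarsArgsArray.slurm is hypothetical.

#!/bin/sh
#SBATCH --job-name=array_job_test     # Job name
#SBATCH --account=phc6937             # your own sponsor's account, or the account for this class
#SBATCH --qos=phc6937                 # your own sponsor's QOS, or the QOS for this class
#SBATCH --ntasks=1                    # Each array task runs on a single machine (node)
#SBATCH --cpus-per-task=1             # Each array task uses a single CPU
#SBATCH --mem=8gb                     # Memory limit per array task
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --array=2-10                  # One task for each index 2,3,...,10
#SBATCH --output=array_test_%A_%a.out # %A = array job ID, %a = array index

pwd; hostname; date

module load R

echo "Running save cars" $SLURM_ARRAY_TASK_ID
R --no-save --quiet --slave --args $SLURM_ARRAY_TASK_ID < saveCarsArgs.R

date

Submit it with sbatch saveCarsArgsArray.slurm, just like the loop version.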

Check job status
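A few standard SLURM commands for monitoring jobs (a minimal sketch; substitute your own job ID):

squeue -u $USER   ## list your pending and running jobs
squeue -A phc6937 ## list jobs under the class account
scancel 25280301  ## cancel a job by its job ID
sacct -j 25280301 ## accounting summary for a finished job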

Burst mode

Since our entire class shares these resources, please plan ahead when working on your homework. Using HiperGator at the last minute may leave you competing with other students for resources.

If all computing resources allocated to the class QOS are occupied, newly submitted jobs will wait in the queue until resources become available.
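A hedged sketch of what burst mode can look like in practice: UF Research Computing groups typically have a secondary burst QOS that runs jobs on idle cluster resources at lower priority. The QOS name below follows UFRC's usual "-b" suffix convention and is an assumption for this class; confirm the actual name with your account sponsor or the UFRC documentation before using it.

## hypothetical burst QOS name; verify the real name before submitting
srun --account=phc6937 --qos=phc6937-b --ntasks=1 --cpus-per-task=1 --mem=8gb --time=04:00:00 --pty bash -i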