Biostatistical Computing, PHC 6068

HiperGator

Zhiguang Huo (Caleb)

Monday September 23, 2018

Advanced usage of HiperGator (Optional)

These techniques also apply to other Linux machines.

Rstudio on HiperGator

Review:

Preparation

How to login HiperGator (Windows)

How to login HiperGator (Mac, Linux)

Your HiperGator Home directory

Common Linux commands:

FileZilla

You can transfer files between your local computer and HiperGator.

Login node and working node

Open an interactive session on HiperGator

## open interactive R session
srun --account=phc6068 --qos=phc6068 --ntasks=1 --cpus-per-task 1 --mem=8gb  --time=04:00:00 --pty bash -i

module load R ## load R

R
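To end the interactive session when you are done (not covered above): quit R first, then exit the shell so the allocation is released.

```
q()   ## quit R; you will be asked whether to save the workspace
exit  ## leave the srun session and return to the login node
```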

Open an interactive session on HiperGator

## do the following on hiperGator
getwd()
dir()

head(cars)
mycars <- cars
write.csv(mycars, "mycars.csv")
dir()

Submit a job (I)

  1. R script (saveCars.R): contains your R code
  2. SLURM job script (saveCars.slurm): tells the scheduler how to run your job
  3. Submit: run sbatch on the SLURM file

Prepare R script (a simple one) (I)

WD <- "/ufrc/phc6068/share/zhuo/example/testR" ## change to your own directory
dir.create(WD, recursive = TRUE) ## create the folder (and any parent folders) if needed

setwd(WD) ## set to your own directory!
mycars <- mtcars
write.csv(mycars,"mycars.csv")

Prepare SLURM job script (I)

#!/bin/sh
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --account=phc6068             # your sponsor or account for this class
#SBATCH --qos=phc6068                 # your QOS (quality of service) for this class
#SBATCH --mail-type=ALL               # Mail events
#SBATCH --mail-user=xx@xx.xx          # Where to send email 
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --cpus-per-task 1             # Run on a single CPU
#SBATCH --mem=8gb                     # Memory limit
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --output=serial_test_%j.out   # Output and error log 

pwd; hostname; date 

module load R 

echo "Running save cars script on a single CPU core" 

R CMD BATCH saveCars.R ## make sure saveCars.R is at your current working directory
## R --no-save --quiet --slave < saveCars.R ## alternative way

date

Submit the job (I)

cd /ufrc/phc6068/share/zhuo/example/testR
sbatch saveCars.slurm ## submit job
  1. Submit the slurm job (saveCars.slurm)
  2. The slurm job will submit the R job (saveCars.R)
  3. The R job will return the result

Check log file (I)

cd /ufrc/phc6068/share/zhuo/example/testR
cat serial_test_25280301.out ## your log file name will differ
head serial_test_25280301.out
more serial_test_25280301.out
cat saveCars.Rout
cat mycars.csv

Exercise (I)

  1. Copy the saveCars.R and saveCars.slurm into your own working directory
  2. Submit the job
  3. Try to revise the saveCars.R
    • Change to your own working directory
    • Just try to output any other results, and save them.
  4. Try to revise the saveCars.slurm
    • Try to specify your email, revise time and memory
  5. Submit your job again
cp /ufrc/phc6068/share/zhuo/example/testR/saveCars.R .
cp /ufrc/phc6068/share/zhuo/example/testR/saveCars.slurm .

Submit a job with external argument (II)

R script (with external arguments) (II)

args <- commandArgs(trailingOnly = TRUE) ## read external arguments

rowID <- args[1]
aarg <- as.numeric(rowID)

setwd("/ufrc/phc6068/share/zhuo/example/testR2")
mycars <- mtcars[aarg,]
filename <- paste0("arg",aarg,".csv")
write.csv(mycars,filename)

SLURM job script (II)

#!/bin/sh
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --account=phc6068             # your sponsor or account for this class
#SBATCH --qos=phc6068                 # your QOS (quality of service) for this class
#SBATCH --mail-type=ALL               # Mail events
#SBATCH --mail-user=xx@xx.xx          # Where to send email 
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --cpus-per-task 1             # Run on a single CPU
#SBATCH --mem=8gb                     # Memory limit
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --output=serial_test_%j.out   # Output and error log 

pwd; hostname; date 

module load R 

echo "Running save cars script on a single CPU core" 

R --no-save --quiet --slave --args 1 < saveCarsArgs.R ## pass 1 as the external argument

date

Submit the job (II)

cd /ufrc/phc6068/share/zhuo/example/testR2
sbatch saveCarsArgs.slurm ## submit job
  1. Submit the slurm job (saveCarsArgs.slurm)
  2. The slurm job will submit the R job (saveCarsArgs.R) with extra argument
  3. The R job will return the result

Check log file (II)

cd /ufrc/phc6068/share/zhuo/example/testR2
cat serial_test_25280860.out ## you may have your own log file name
cd /ufrc/phc6068/share/zhuo/example/testR2
cat arg1.csv

Submit a job with loops (III)

SLURM job script (III)

#!/bin/sh
#SBATCH --job-name=serial_job_test    # Job name
#SBATCH --account=phc6068             # your sponsor or account for this class
#SBATCH --qos=phc6068                 # your QOS (quality of service) for this class
#SBATCH --mail-type=ALL               # Mail events
#SBATCH --mail-user=xx@xx.xx          # Where to send email 
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --cpus-per-task 1             # Run on a single CPU
#SBATCH --mem=8gb                     # Memory limit
#SBATCH --time=04:00:00               # Time: hrs:min:sec
#SBATCH --output=serial_test_%j.out   # Output and error log 

pwd; hostname; date 

module load R 

for i in {2..10}
do
  echo "Running save cars" $i
  R --no-save --quiet --slave --args $i < saveCarsArgs.R
done

date

Submit the job (III)

cd /ufrc/phc6068/share/zhuo/example/testR3
sbatch saveCarsArgsLoops.slurm ## submit a loop job
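The loop above runs the nine R jobs one after another inside a single allocation. As an alternative sketch (not from the slides), a SLURM job array runs each argument as its own job, potentially in parallel; --array and $SLURM_ARRAY_TASK_ID are standard SLURM features, but the script name saveCarsArgsArray.slurm and any array-size limits on HiperGator are assumptions to confirm.

```
#!/bin/sh
#SBATCH --job-name=array_job_test     # Job name
#SBATCH --account=phc6068             # your sponsor or account for this class
#SBATCH --qos=phc6068                 # your QOS (quality of service) for this class
#SBATCH --ntasks=1                    # Run on a single machine (node)
#SBATCH --mem=8gb                     # Memory limit (per array task)
#SBATCH --time=04:00:00               # Time: hrs:min:sec (per array task)
#SBATCH --array=2-10                  # one task per value of the index
#SBATCH --output=array_test_%A_%a.out # %A = job ID, %a = array index

module load R
R --no-save --quiet --slave --args $SLURM_ARRAY_TASK_ID < saveCarsArgs.R
```

Each array task sees its own index in $SLURM_ARRAY_TASK_ID, so the same R script produces arg2.csv through arg10.csv without a shell loop.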

Check job status
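A minimal sketch of the standard SLURM monitoring commands, run from a HiperGator login node; the job ID 25280301 is just an example from earlier slides.

```
squeue -u $USER        ## list your pending and running jobs
squeue -j 25280301     ## check a specific job by its ID
scancel 25280301       ## cancel a job
sacct -j 25280301      ## accounting info for a finished job
```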

Burst mode

If all computing resources under your regular QOS are occupied, you can request burst resources to use otherwise idle capacity.
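A sketch of requesting the burst QOS, assuming HiperGator's convention that the burst QOS name is the group QOS with a "-b" suffix; confirm the exact QOS name for your group before using it.

```
#SBATCH --account=phc6068             # account stays the same
#SBATCH --qos=phc6068-b               # burst QOS (group QOS + "-b"; confirm the name)
```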