Zhiguang Huo (Caleb)
Thursday August 31, 2023
## [1] "double"
## [1] "character"
## [1] "logical"
## [1] "double"
## [1] "integer"
## [1] TRUE
## [1] "logical"
## [1] TRUE
## [1] "integer"
## [1] TRUE
## [1] FALSE
## [1] 1.6
## [1] 1.6
## [1] 1.6

Difference between = and <-
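A minimal sketch of one practical difference between the two operators. The variable names (`a`, `b`, `vals`, `m1`, `m2`) are hypothetical and chosen only for illustration. At top level both operators assign, but inside a function call `=` names an argument while `<-` also creates a variable in the calling environment.

```r
# At top level, both operators assign a value.
a = 1.6
b <- 1.6
identical(a, b)   # TRUE

# Inside a call, `=` matches the argument named x and creates no variable.
m1 <- median(x = c(1, 2, 10))

# Inside a call, `<-` assigns first, so `vals` now exists in the
# calling environment as a side effect.
m2 <- median(vals <- c(1, 2, 10))
exists("vals")    # TRUE
```

Because of this side effect, `<-` is the usual choice for assignment and `=` is reserved for naming arguments.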
## [1] "double"
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] "character"
## [1] TRUE
## [1] "He doesn't like Biostatistical computing"

| Functions | Meaning |
|---|---|
| length(x) | Number of elements in x | 
| unique(x) | Unique elements of x | 
| sort(x) | Sort the elements of x | 
| rev(x) | Reverse the order of x | 
| names(x) | Name the elements of x | 
| which(x) | Indices of x that are TRUE | 
| which.max(x) | Index of the maximum element of x | 
| which.min(x) | Index of the minimum element of x | 
| append(x, values) | Append values to the vector x (optionally after a given position) | 
| match(x, table) | Index of the first match in table for each element of x | 
| union(x, y) | Union of x and y | 
| intersect(x, y) | Intersection of x and y | 
| setdiff(x, y) | Elements of x that are not in y | 
| setequal(x, y) | Do x and y contain the same elements | 
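A short sketch of several functions from the table, using hypothetical vectors `x` and `y`. The expected values in the comments follow from these inputs only.

```r
x <- c(3, 1, 3, 2)   # hypothetical example vector
y <- c(2, 5)

u  <- unique(x)                 # 3 1 2
s  <- sort(x)                   # 1 2 3 3
r  <- rev(x)                    # 2 3 1 3
w  <- which(x == 3)             # 1 3
wm <- which.max(x)              # 1 (index of the first maximum)
ap <- append(x, 9, after = 1)   # 3 9 1 3 2
mt <- match(c(2, 7), x)         # 4 NA (7 does not occur in x)

un <- union(x, y)               # 3 1 2 5
it <- intersect(x, y)           # 2
df <- setdiff(x, y)             # 3 1
se <- setequal(x, c(1, 2, 3))   # TRUE (duplicates are ignored)
```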

| Functions | Meaning | 
|---|---|
| sum(x) | Sum of x | 
| prod(x) | Product of x | 
| cumsum(x) | Cumulative sum of x | 
| cumprod(x) | Cumulative product of x | 
| min(x) | Minimum element of x | 
| max(x) | Maximum element of x | 
| pmin(x, y) | Pairwise minimum of x and y | 
| pmax(x, y) | Pairwise maximum of x and y | 
| mean(x) | Mean of x | 
| median(x) | Median of x | 
| var(x) | Variance of x | 
| sd(x) | Standard deviation of x | 
| cov(x, y) | Covariance of x and y | 
| cor(x, y) | Correlation of x and y | 
| range(x) | Minimum and maximum of x | 
| quantile(x, probs) | Quantiles of x for the given probabilities | 
| summary(x) | Numerical summary of x | 
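A brief sketch of the cumulative, pairwise and summary functions from the table. The vectors `x` and `y` are hypothetical.

```r
x <- c(1, 4, 2, 8)   # hypothetical example vectors
y <- c(3, 3, 5, 6)

cs <- cumsum(x)      # 1 5 7 15
cp <- cumprod(x)     # 1 4 8 64
pn <- pmin(x, y)     # 1 3 2 6 (element-wise minimum)
px <- pmax(x, y)     # 3 4 5 8 (element-wise maximum)

mean(x)              # 3.75
median(x)            # 3
range(x)             # 1 8 (minimum and maximum, not their difference)

# sd() is the square root of var()
all.equal(sd(x), sqrt(var(x)))

# quantile() takes the probabilities in its probs argument
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
```

Note that `min(x, y)` returns a single number across both vectors, whereas `pmin(x, y)` compares them position by position.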
## [1] 9
## [1] 3
## [1] 4.75
## [1] 2 9
## [1] "character"
## [1] 0 0 1
## [1] "FALSE" "FALSE" "TRUE"
## [1] "double"
## [1] "logical"
## [1] "integer"
## [1] "double"
## [1] "character"
## List of 4
##  $ : int [1:3] 1 2 3
##  $ : chr "a"
##  $ : logi [1:3] TRUE FALSE TRUE
##  $ :List of 2
##   ..$ : num 2.3
##   ..$ : num 5.9
## [1] 1 2 3
## List of 4
##  $ p1: int [1:3] 1 2 3
##  $ p2: chr "a"
##  $ p3: logi [1:3] TRUE FALSE TRUE
##  $ p4:List of 2
##   ..$ : num 2.3
##   ..$ : num 5.9
## [1] 1 2 3
## [1] 1 2 3

| | Homogeneous | Heterogeneous |
|---|---|---|
| 1d | Atomic vector | List | 
| 2d | Matrix | Data frame | 
| nd | Array | | 
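A minimal sketch that builds one object of each kind in the table. All object names and values are hypothetical.

```r
v  <- c(1.5, 2, 3)                                 # 1d, homogeneous: atomic vector
l  <- list(1:3, "a", TRUE)                         # 1d, heterogeneous: list
m  <- matrix(1:6, nrow = 2)                        # 2d, homogeneous: matrix
d  <- data.frame(x = 1:3, y = c("a", "b", "c"))    # 2d, heterogeneous: data frame
a  <- array(1:24, dim = c(2, 3, 4))                # nd, homogeneous: array

# Mixing types in an atomic vector coerces every element to one type.
typeof(c(1, "a", TRUE))   # "character"

# A data frame is a list of equal-length columns, so column types may differ.
is.list(d)                # TRUE
sapply(d, typeof)

dim(m)                    # 2 3
dim(a)                    # 2 3 4
```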
## [1] "This is a vector"
## $my_attribute
## [1] "This is a vector"
##  [1] "a" ""  ""  ""  ""  ""  ""  ""  ""  ""
## NULL
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    3    5    7    9
## [2,]    2    4    6    8   10
## [1] "matrix" "array"
## [1] a b b a
## Levels: a b
## [1] "factor"
## [1] "a" "b"

Factors are very useful when a class has no observations: the empty level still appears in the counts.
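A minimal sketch of this behaviour, assuming a hypothetical sample in which only males were observed. The names `sex_char` and `sex_factor` follow the output labels in this section; how they were originally constructed is an assumption.

```r
# Only "m" is observed; "f" is a valid class with no observations.
sex_char   <- c("m", "m", "m")
sex_factor <- factor(sex_char, levels = c("m", "f"))

# A character vector only knows about the values it contains.
tab_char <- table(sex_char)

# A factor remembers all declared levels, so the missing class
# is reported with a count of zero.
tab_factor <- table(sex_factor)

tab_char
tab_factor
```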
## sex_char
## m 
## 3
## sex_factor
## m f 
## 3 0
##      C.1 C.2 C.3
## row1   1   3   5
## row2   2   4   6
## [1] "C.1" "C.2" "C.3"
## [1] "row1" "row2"
## [1] 3
## [1] 2
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
##      C.1 C.2 C.3
## row1   1   3   5
## row2   2   4   6
## 'data.frame':    3 obs. of  3 variables:
##  $ x: int  1 2 3
##  $ y: chr  "a" "b" "c"
##  $ z: num  0 0 0
## x y z same as x y z
## 3 same as 3

sentences <- "R is a great statistical software.\n\nWe use R in the Biostatistical computing class!"
sentences
## [1] "R is a great statistical software.\n\nWe use R in the Biostatistical computing class!"
## R is a great statistical software.
## 
## We use R in the Biostatistical computing class!
## [1] "this is a dog."
## [1] "THIS IS A DOG."
## [1] "www.ufl.edu"
## [1] 14
## [1] 1
## [1] 5 5 7
## [1] 3
## [1] "t"
## [1] "dog"
## [1] "this is a cat"
## [[1]]
## [1] "this" "is"   "a"    "dog"
## [[1]]
##  [1] "t" "h" "i" "s" " " "i" "s" " " "a" " " "d" "o" "g"
## [[1]]
## [1] "this" "is"   "a"    "dog" 
## 
## [[2]]
## [1] "this" "is"   "a"    "cat" 
## 
## [[3]]
## [1] "this"  "is"    "a"     "gator"
## [1] "this is a dog"
## [1] "thisisadog"
## [1] 4 2 1 3
## [1] "this is a dog"
## [1] "this is a cat"
## [1] "this is a cat"

chars <- c("this is a dog", "this is a cat", "this is a gator")
gsub("this","that",chars) ## pattern, replacement, x
## [1] "that is a dog"   "that is a cat"   "that is a gator"
## [1] 3
## [1] 1 2 3
## [1] "this is a gator"
## [1] 1 2
## [1] 1 2
## [1] 1 2 3
## [1] 3
## [1] 1 2
## [1] 2
## [1] 1 2
## [1] 1 2
## integer(0)

chars <- c("this is a dog", "this is a cat", "this is a   gator")
gsub(pattern = "[aeiou]",replacement="Z",x=chars) ## pattern is vowel, replacement is Z
## [1] "thZs Zs Z dZg"     "thZs Zs Z cZt"     "thZs Zs Z   gZtZr"
## [1] "a#b#"
## 1  2  3  4  5  6  7  8  9  10
## 1  2  3  4  5  6  7  8  9  10
## 1  3  5  7  9
## 1  3  5  7  9
## 1

x <- list(1:3, "a", c(TRUE, FALSE, TRUE), list(2.3, 5.9))
for(i in seq_along(x)){ ## seq_along(x) is safe even when x is empty
  ax <- x[[i]]
  print(ax)
}
## [1] 1 2 3
## [1] "a"
## [1]  TRUE FALSE  TRUE
## [[1]]
## [1] 2.3
## 
## [[2]]
## [1] 5.9
## [1] 1 2 3
## [1] "a"
## [1]  TRUE FALSE  TRUE
## [[1]]
## [1] 2.3
## 
## [[2]]
## [1] 5.9
## [1] 1 2 3
## [1] "a"
## [1]  TRUE FALSE  TRUE
## [[1]]
## [1] 2.3
## 
## [[2]]
## [1] 5.9

Example: x <- c(2.1, 4.2, 3.3, 5.4). How can we obtain a subset of this vector?
## [1] 3.3 2.1
## [1] 2.1 3.3 4.2 5.4
## [1] 2.1 2.1 2.1

## [1] 4.2 5.4
## [1] 2.1 3.3 5.4

## [1] 2.1 4.2
## [1] FALSE  TRUE  TRUE  TRUE
## [1] 4.2 3.3 5.4
## [1] 2.1 4.2  NA

## [1] 2.1 4.2 3.3 5.4
## numeric(0)
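The calls below are one way to produce results like those listed above. The original calls are not shown, so the exact expressions are an assumption. The named copy `xn` is a hypothetical addition.

```r
x <- c(2.1, 4.2, 3.3, 5.4)

# Positive integers select elements by position.
x[c(3, 1)]
x[order(x)]        # positions that sort x
x[c(1, 1, 1)]      # a position may be repeated

# Negative integers omit elements at those positions.
x[-c(1, 3)]
x[-2]

# Logical vectors keep the elements where the index is TRUE.
x[c(TRUE, TRUE, FALSE, FALSE)]
x > 3
x[x > 3]
x[c(TRUE, TRUE, NA, FALSE)]   # an NA index yields NA in the result

# Nothing returns the whole vector; zero returns an empty vector.
x[]
x[0]

# Character vectors select by name (requires a named vector).
xn <- setNames(x, c("a", "b", "c", "d"))
xn["a"]
xn[c("b", "c")]
```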
##   a 
## 2.1
##   b   c 
## 4.2 3.3
##      A B C
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
##      A B C
## [1,] 1 4 7
## [2,] 2 5 8
##      B A
## [1,] 4 1
## [2,] 6 3
##      A C
## [1,] 1 7
## [2,] 2 8
## [3,] 3 9
##   x y z
## 2 2 1 b
##   x z
## 1 1 a
## 2 2 b
##   x z
## 1 1 a
## 2 2 b

| | Simplifying | Preserving | 
|---|---|---|
| List | x[[1]] | x[1] | 
| Vector | x[[1]] | x[1] | 
| Factor | x[1:2, drop=TRUE] | x[1:2] | 
| Data frame | x[,1] or x[[1]] | x[, 1, drop=FALSE] or x[1] | 
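A small sketch of the simplifying and preserving forms in the table, using hypothetical objects `lst`, `f` and `d`.

```r
# List: [[ extracts the element itself, [ returns a shorter list.
lst <- list(a = 111, b = "text")
typeof(lst[["a"]])   # "double"
typeof(lst["a"])     # "list"

# Factor: drop = TRUE removes levels that are no longer used.
f <- factor(c("a", "b", "b", "a", "c"))
levels(f[1:2])                # "a" "b" "c"
levels(f[1:2, drop = TRUE])   # "a" "b"

# Data frame: selecting one column simplifies to a vector by default.
d <- data.frame(x = 1:2, y = c("a", "b"))
is.data.frame(d[, 1])                 # FALSE
is.data.frame(d[, 1, drop = FALSE])   # TRUE
is.data.frame(d[1])                   # TRUE
```

Preserving forms are the safer choice inside functions, because the type of the result does not depend on how many elements happen to be selected.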
## [1] 111
## [1] "double"
## $a
## [1] 111
## [1] "list"

grades <- c(1,2,2,3,1)
info <- data.frame(grade=3:1, desc=c("Excellent", "Good", "Poor"), fail=c(FALSE,FALSE,TRUE))
id <- match(grades, info$grade)
id
## [1] 3 2 2 1 3
##     grade      desc  fail
## 3       1      Poor  TRUE
## 2       2      Good FALSE
## 2.1     2      Good FALSE
## 1       3 Excellent FALSE
## 3.1     1      Poor  TRUE

Read in txt/csv files

read.csv()
read.table()
read.delim(file, sep=";")

Also pay attention to arguments such as header and row.names.
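A self-contained sketch of how `header`, `sep` and `row.names` affect the result. The toy data frame and the temporary files are hypothetical and exist only for this illustration.

```r
# Hypothetical toy data written to temporary files.
toy <- data.frame(id = c("s1", "s2"), score = c(90, 85))
csv_file <- tempfile(fileext = ".csv")
write.csv(toy, file = csv_file, row.names = FALSE)

# read.csv() assumes header = TRUE and sep = ",".
d1 <- read.csv(csv_file)

# row.names = 1 turns the first column into row names.
d2 <- read.csv(csv_file, row.names = 1)

# read.table() defaults to header = FALSE and whitespace separators,
# so both must be set explicitly for a csv file.
d3 <- read.table(csv_file, header = TRUE, sep = ",")

# read.delim() defaults to tab-separated input; override sep for ";".
semi_file <- tempfile(fileext = ".txt")
write.table(toy, file = semi_file, sep = ";", row.names = FALSE, quote = FALSE)
d4 <- read.delim(semi_file, sep = ";")

unlink(c(csv_file, semi_file))   # remove the temporary files
```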
write.csv(burnData, file = "myBurnData.csv")
write.table(burnData, file = "myBurnData.txt")
write.table(burnData, file = "myBurnData.txt", append = TRUE)

fileNameFull <- 'https://caleb-huo.github.io/teaching/data/Burn/burn.csv'
con  <- file(fileNameFull, open = "r")
while (length(oneLine <- readLines(con, n = 1, warn = FALSE)) > 0) {
    aline = strsplit(oneLine, ",")[[1]]
    print(aline)
} 
close(con) ## remember to close files

If a result takes a long time to compute, how can you save it so that you do not have to re-run the computation later?
a <- 1:4
b <- 2:5
ans <- a * b
result <- list(a=a, b=b, ans=ans)
save(result,file="myResult.rdata")
load("myResult.rdata")
result2 <- get(load("myResult.rdata"))

Alternatively, save a single object with saveRDS() and restore it with readRDS():
a <- 1:4
b <- 2:5
ans <- a * b
result <- list(a=a, b=b, ans=ans)
saveRDS(result,file="myResult.rdata")
result2 <- readRDS("myResult.rdata")

Dates are represented as the number of days since 1970-01-01, with negative values for earlier dates.
mydates <- as.Date(c("2017-09-11", "2012-12-17", "1970-01-01"))
# number of days in between 
days <- mydates[1] - mydates[2]
days
## Time difference of 1729 days
## [1] 17420 15691     0
## [1] "2023-09-04"
## [1] "Mon Sep  4 17:48:19 2023"

| Symbol | Meaning | Example | 
|---|---|---|
| %d | day as a number (01-31) | 01-31 | 
| %a | abbreviated weekday | Fri | 
| %A | unabbreviated weekday | Friday | 
| %m | month as a number (01-12) | 01-12 | 
| %b | abbreviated month | Oct | 
| %B | unabbreviated month | October | 
| %y | 2-digit year | 22 | 
| %Y | 4-digit year | 2022 | 
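A short sketch of the symbols in the table, using a fixed, hypothetical date so that the results are reproducible. Month and weekday names (`%b`, `%B`, `%a`, `%A`) depend on the locale, so no expected output is given for those.

```r
d <- as.Date("2023-09-04")   # fixed date chosen for illustration

format(d, "%d")          # "04"
format(d, "%m")          # "09"
format(d, "%y")          # "23"
format(d, "%Y")          # "2023"
format(d, "%m/%d/%Y")    # "09/04/2023"
format(d, "%B %d %Y")    # month name depends on the locale

# The same symbols describe how to parse text into a Date.
parsed <- as.Date("09/04/2023", format = "%m/%d/%Y")
parsed == d              # TRUE

# Internally a Date is a day count relative to 1970-01-01.
as.numeric(as.Date("1970-01-01"))   # 0
as.numeric(as.Date("1969-12-31"))   # -1
```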
## [1] "September 04 2023"

# convert date info in format 'mm/dd/yyyy'
strDates <- c("01/05/1995", "08/16/1995")
dates <- as.Date(strDates, "%m/%d/%Y")
## [1] "2007-06-22" "2004-02-13"
## [1] "1995-01-05" "1995-08-16"
## [1] 9135 9358