Zhiguang Huo (Caleb)
Thursday August 31, 2023
## [1] "double"
## [1] "character"
## [1] "logical"
## [1] "double"
## [1] "integer"
## [1] TRUE
## [1] "logical"
## [1] TRUE
## [1] "integer"
## [1] TRUE
## [1] FALSE
## [1] 1.6
## [1] 1.6
## [1] 1.6
Difference between = and <-
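At the top level, `=` and `<-` both assign a value. They differ inside a function call, where `=` names an argument and `<-` still creates a variable. A minimal sketch (the variable names `m1` and `m2` are chosen for illustration and are not from the slides):

```r
# `<-` inside a call assigns x in the calling environment, then passes it on
m1 <- median(x <- c(1, 5, 9))

# `=` inside a call only matches the argument named x; no variable is changed
m2 <- median(x = c(2, 4, 6))

print(x)   # still the vector assigned by `<-` above
print(m1)
print(m2)
```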
## [1] "double"
## [1] TRUE
## [1] FALSE
## [1] TRUE
## [1] "character"
## [1] TRUE
## [1] "He doesn't like Biostatistical computing"
Functions | Meaning |
---|---|
length(x) | Number of elements in x |
unique(x) | Unique elements of x |
sort(x) | Sort the elements of x |
rev(x) | Reverse the order of x |
names(x) | Name the elements of x |
which(x) | Indices of x that are TRUE |
which.max(x) | Index of the maximum element of x |
which.min(x) | Index of the minimum element of x |
append(x, values) | Append (or insert) values into x |
match(x, table) | Index of the first match of each element of x in table |
union(x, y) | Union of x and y |
intersect(x, y) | Intersection of x and y |
setdiff(x, y) | Elements of x that are not in y |
setequal(x, y) | Do x and y contain the same elements |
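A short sketch of several functions from the table, using a small made-up vector (the values are illustrative only):

```r
x <- c(3, 1, 2, 3)

u   <- unique(x)                 # distinct values, in order of first appearance
s   <- sort(x)                   # ascending order
idx <- which(x == 3)             # positions where the condition is TRUE
top <- which.max(x)              # position of the first maximum
ins <- append(x, 10, after = 2)  # insert 10 after the second element
pos <- match(2, x)               # first position of the value 2 in x

un <- union(1:3, 2:5)
it <- intersect(1:3, 2:5)
sd <- setdiff(1:3, 2:5)
eq <- setequal(c(1, 2), c(2, 1, 1))  # duplicates and order are ignored
```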
Functions | Meaning |
---|---|
sum(x) | Sum of x |
prod(x) | Product of x |
cumsum(x) | Cumulative sum of x |
cumprod(x) | Cumulative product of x |
min(x) | Minimum element of x |
max(x) | Maximum element of x |
pmin(x, y) | Pairwise minimum of x and y |
pmax(x, y) | Pairwise maximum of x and y |
mean(x) | Mean of x |
median(x) | Median of x |
var(x) | Variance of x |
sd(x) | Standard deviation of x |
cov(x, y) | Covariance of x and y |
cor(x, y) | Correlation of x and y |
range(x) | Range of x |
quantile(x) | Quantiles of x for given probabilities |
summary(x) | Numerical summary of x |
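A brief sketch of the summary functions above on two made-up vectors (the values are illustrative and are not the data used on the slides):

```r
x <- c(2, 9, 4, 5)
y <- c(3, 3, 6, 1)

total   <- sum(x)
running <- cumsum(x)      # running total
lo      <- pmin(x, y)     # element-by-element minimum
hi      <- pmax(x, y)     # element-by-element maximum
med     <- median(x)
rng     <- range(x)       # c(min(x), max(x))
v       <- var(x)         # sample variance (divides by n - 1)
q       <- quantile(x, probs = c(0.25, 0.75))

summary(x)
```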
## [1] 9
## [1] 3
## [1] 4.75
## [1] 2 9
## [1] "character"
## [1] 0 0 1
## [1] "FALSE" "FALSE" "TRUE"
## [1] "double"
## [1] "logical"
## [1] "integer"
## [1] "double"
## [1] "character"
## List of 4
## $ : int [1:3] 1 2 3
## $ : chr "a"
## $ : logi [1:3] TRUE FALSE TRUE
## $ :List of 2
## ..$ : num 2.3
## ..$ : num 5.9
## [1] 1 2 3
## List of 4
## $ p1: int [1:3] 1 2 3
## $ p2: chr "a"
## $ p3: logi [1:3] TRUE FALSE TRUE
## $ p4:List of 2
## ..$ : num 2.3
## ..$ : num 5.9
## [1] 1 2 3
## [1] 1 2 3
Dimension | Homogeneous | Heterogeneous |
---|---|---|
1d | Atomic vector | List |
2d | Matrix | Data frame |
nd | Array | |
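One way to construct each structure in the table; the object names and values below are made up for illustration:

```r
v  <- c(1.5, 2, 3)                          # atomic vector: one type only
l  <- list(1:3, "a", TRUE)                  # list: elements may differ in type
m  <- matrix(1:10, nrow = 2)                # matrix: 2d, one type only
df <- data.frame(x = 1:3,                   # data frame: columns may differ in type
                 y = c("a", "b", "c"))
a  <- array(1:24, dim = c(2, 3, 4))         # array: n dimensions, one type only

str(l)
dim(m)
dim(a)
```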
## [1] "This is a vector"
## $my_attribute
## [1] "This is a vector"
## [1] "a" "" "" "" "" "" "" "" "" ""
## NULL
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 3 5 7 9
## [2,] 2 4 6 8 10
## [1] "matrix" "array"
## [1] a b b a
## Levels: a b
## [1] "factor"
## [1] "a" "b"
Factors are very useful when some classes have no observations: a factor still reports the empty class, whereas a character vector silently drops it.
## sex_char
## m
## 3
## sex_factor
## m f
## 3 0
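The two tables above can be reproduced along these lines. This is a sketch that assumes the data are three observations of "m" and that the factor is declared with levels "m" and "f":

```r
sex_char   <- c("m", "m", "m")
sex_factor <- factor(sex_char, levels = c("m", "f"))

tab_char   <- table(sex_char)     # only the observed value "m" appears
tab_factor <- table(sex_factor)   # the empty level "f" is kept with count 0

tab_char
tab_factor
```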
## C.1 C.2 C.3
## row1 1 3 5
## row2 2 4 6
## [1] "C.1" "C.2" "C.3"
## [1] "row1" "row2"
## [1] 3
## [1] 2
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 3 6
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
## C.1 C.2 C.3
## row1 1 3 5
## row2 2 4 6
## 'data.frame': 3 obs. of 3 variables:
## $ x: int 1 2 3
## $ y: chr "a" "b" "c"
## $ z: num 0 0 0
## x y z same as x y z
## 3 same as 3
sentenses <- "R is a great statistical software.\n\nWe use R in the Biostatistical computing class!"
sentenses
## [1] "R is a great statistical software.\n\nWe use R in the Biostatistical computing class!"
## R is a great statistical software.
##
## We use R in the Biostatistical computing class!
## [1] "this is a dog."
## [1] "THIS IS A DOG."
## [1] "www.ufl.edu"
## [1] 14
## [1] 1
## [1] 5 5 7
## [1] 3
## [1] "t"
## [1] "dog"
## [1] "this is a cat"
## [[1]]
## [1] "this" "is" "a" "dog"
## [[1]]
## [1] "t" "h" "i" "s" " " "i" "s" " " "a" " " "d" "o" "g"
## [[1]]
## [1] "this" "is" "a" "dog"
##
## [[2]]
## [1] "this" "is" "a" "cat"
##
## [[3]]
## [1] "this" "is" "a" "gator"
## [1] "this is a dog"
## [1] "thisisadog"
## [1] 4 2 1 3
## [1] "this is a dog"
## [1] "this is a cat"
## [1] "this is a cat"
chars <- c("this is a dog", "this is a cat", "this is a gator")
gsub("this","that",chars) ## pattern, replacement, x
## [1] "that is a dog" "that is a cat" "that is a gator"
## [1] 3
## [1] 1 2 3
## [1] "this is a gator"
## [1] 1 2
## [1] 1 2
## [1] 1 2 3
## [1] 3
## [1] 1 2
## [1] 2
## [1] 1 2
## [1] 1 2
## integer(0)
chars <- c("this is a dog", "this is a cat", "this is a gator")
gsub(pattern = "[aeiou]",replacement="Z",x=chars) ## pattern is vowel, replacement is Z
## [1] "thZs Zs Z dZg" "thZs Zs Z cZt" "thZs Zs Z gZtZr"
## [1] "a#b#"
## 1 2 3 4 5 6 7 8 9 10
## 1 2 3 4 5 6 7 8 9 10
## 1 3 5 7 9
## 1 3 5 7 9
## 1
x <- list(1:3, "a", c(TRUE, FALSE, TRUE), list(2.3, 5.9))
for (i in seq_along(x)) {  ## seq_along(x) is safer than 1:length(x) when x is empty
  ax <- x[[i]]
  print(ax)
}
## [1] 1 2 3
## [1] "a"
## [1] TRUE FALSE TRUE
## [[1]]
## [1] 2.3
##
## [[2]]
## [1] 5.9
## [1] 1 2 3
## [1] "a"
## [1] TRUE FALSE TRUE
## [[1]]
## [1] 2.3
##
## [[2]]
## [1] 5.9
## [1] 1 2 3
## [1] "a"
## [1] TRUE FALSE TRUE
## [[1]]
## [1] 2.3
##
## [[2]]
## [1] 5.9
Example: x <- c(2.1, 4.2, 3.3, 5.4). How can we obtain a subset of this vector?
## [1] 3.3 2.1
## [1] 2.1 3.3 4.2 5.4
## [1] 2.1 2.1 2.1
## [1] 4.2 5.4
## [1] 2.1 3.3 5.4
## [1] 2.1 4.2
## [1] FALSE TRUE TRUE TRUE
## [1] 4.2 3.3 5.4
## [1] 2.1 4.2 NA
## [1] 2.1 4.2 3.3 5.4
## numeric(0)
## a
## 2.1
## b c
## 4.2 3.3
## A B C
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
## A B C
## [1,] 1 4 7
## [2,] 2 5 8
## B A
## [1,] 4 1
## [2,] 6 3
## A C
## [1,] 1 7
## [2,] 2 8
## [3,] 3 9
## x y z
## 2 2 1 b
## x z
## 1 1 a
## 2 2 b
## x z
## 1 1 a
## 2 2 b
Data type | Simplifying | Preserving |
---|---|---|
List | x[[1]] | x[1] |
Vector | x[[1]] | x[1] |
Factor | x[1:2, drop = TRUE] | x[1:2] |
Data frame | x[, 1] or x[[1]] | x[, 1, drop = FALSE] or x[1] |
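Simplifying subsetting returns the simplest structure that can hold the result, while preserving subsetting keeps the structure of the input. A small sketch with made-up objects:

```r
lst <- list(a = 111, b = "text")
t_simplify <- typeof(lst[["a"]])   # the element itself
t_preserve <- typeof(lst["a"])     # a list of length 1

df <- data.frame(x = 1:3, y = c("a", "b", "c"))
col_vec <- df[, 1]                 # simplified to a plain vector
col_df  <- df[, 1, drop = FALSE]   # still a data frame with one column

f <- factor(c("a", "b", "c"))
f_keep <- f[1:2]                   # unused level "c" is kept
f_drop <- f[1:2, drop = TRUE]      # unused level "c" is dropped
```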
## [1] 111
## [1] "double"
## $a
## [1] 111
## [1] "list"
grades <- c(1,2,2,3,1)
info <- data.frame(grade=3:1, desc=c("Excellent", "Good", "Poor"), fail=c(F,F,T))
id <- match(grades, info$grade)
id
## [1] 3 2 2 1 3
## grade desc fail
## 3 1 Poor TRUE
## 2 2 Good FALSE
## 2.1 2 Good FALSE
## 1 3 Excellent FALSE
## 3.1 1 Poor TRUE
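The table above looks like the result of indexing `info` by `id`, which repeats a row of the lookup table for every grade. A sketch under that assumption (resetting the row names is an optional extra and is not from the slides):

```r
grades <- c(1, 2, 2, 3, 1)
info <- data.frame(grade = 3:1,
                   desc = c("Excellent", "Good", "Poor"),
                   fail = c(FALSE, FALSE, TRUE),
                   stringsAsFactors = FALSE)

id <- match(grades, info$grade)  # row of info that matches each grade
lookup <- info[id, ]             # one row per element of grades
rownames(lookup) <- NULL         # replace row names such as "2.1" with 1..n
lookup
```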
Read in txt/csv files
read.csv()
read.table()
read.delim(file, sep = ";")
Also pay attention to arguments such as header and row.names.
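A self-contained sketch of how `header`, `row.names`, and `sep` change what is read. The file contents and column names are invented, and temporary files stand in for real data:

```r
# Write a small comma-separated file and a semicolon-separated copy
csv_file <- tempfile(fileext = ".csv")
writeLines(c("id,age,group", "p1,34,a", "p2,51,b"), csv_file)
semi_file <- tempfile(fileext = ".txt")
writeLines(c("id;age;group", "p1;34;a", "p2;51;b"), semi_file)

# header = TRUE: first line holds column names; row.names = 1: first column holds row names
dat1 <- read.csv(csv_file, header = TRUE, row.names = 1)

# read.table() needs the separator spelled out for comma-separated input
dat2 <- read.table(csv_file, header = TRUE, sep = ",")

# read.delim() defaults to tab-separated input; sep overrides that
dat3 <- read.delim(semi_file, sep = ";")
```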
write.csv(burnData, file = "myBurnData.csv")
write.table(burnData, file = "myBurnData.txt")
write.table(burnData, file = "myBurnData.txt", append = TRUE)
fileNameFull <- 'https://caleb-huo.github.io/teaching/data/Burn/burn.csv'
con <- file(fileNameFull, open = "r")
while (length(oneLine <- readLines(con, n = 1, warn = FALSE)) > 0) {
  aline <- strsplit(oneLine, ",")[[1]]
  print(aline)
}
close(con) ## remember to close files
If a result takes a long time to compute, how can you save it so that you do not have to re-run the code in the future?
a <- 1:4
b <- 2:5
ans <- a * b
result <- list(a=a, b=b, ans=ans)
save(result,file="myResult.rdata")
load("myResult.rdata")
result2 <- get(load("myResult.rdata"))
If a result takes a long time to compute, how can you save it so that you do not have to re-run the code in the future? An alternative is saveRDS() and readRDS(), which store and restore a single object.
a <- 1:4
b <- 2:5
ans <- a * b
result <- list(a=a, b=b, ans=ans)
saveRDS(result, file = "myResult.rds")  ## .rds is the conventional extension for saveRDS()
result2 <- readRDS("myResult.rds")
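The practical difference between the two approaches is that `load()` restores objects under their original names, while `readRDS()` returns the object so it can be assigned to any name. A sketch using temporary files (the file names are placeholders):

```r
result <- list(a = 1:4, b = 2:5, ans = (1:4) * (2:5))

f_rdata <- tempfile(fileext = ".RData")
f_rds   <- tempfile(fileext = ".rds")
save(result, file = f_rdata)
saveRDS(result, file = f_rds)

rm(result)                     # pretend this is a fresh session
loaded_names <- load(f_rdata)  # recreates `result`; returns the names it loaded
copy <- readRDS(f_rds)         # returns the object; the name is our choice

identical(result, copy)
```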
Dates are represented as the number of days since 1970-01-01, with negative values for earlier dates.
mydates <- as.Date(c("2017-09-11", "2012-12-17", "1970-01-01"))
# number of days in between
days <- mydates[1] - mydates[2]
days
## Time difference of 1729 days
## [1] 17420 15691 0
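A short sketch of the day-count representation, including a negative value for a date before 1970-01-01 (the dates in `d` are made up; 17420 is the value printed above for 2017-09-11):

```r
d <- as.Date(c("1970-01-01", "1970-01-10", "1969-12-31"))
n_days <- as.numeric(d)   # days since 1970-01-01; negative before that date
n_days

# Going the other way: a day count plus an origin gives back a Date
back <- as.Date(17420, origin = "1970-01-01")
back
```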
## [1] "2023-09-04"
## [1] "Mon Sep 4 17:48:19 2023"
Symbol | Meaning | Example |
---|---|---|
%d | day as a number (01-31) | 01-31 |
%a | abbreviated weekday | Fri |
%A | unabbreviated weekday | Friday |
%m | month as a number (01-12) | 01-12 |
%b | abbreviated month | Oct |
%B | unabbreviated month | October |
%y | 2-digit year | 22 |
%Y | 4-digit year | 2022 |
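The symbols in the table can be combined in `format()` to print a date and in `as.Date()` to parse one. A sketch with a fixed date; month and weekday names depend on the session locale, so their output is not shown here:

```r
d <- as.Date("2023-09-04")

us_style  <- format(d, "%m/%d/%y")   # numeric month/day, 2-digit year
eu_style  <- format(d, "%d/%m/%Y")   # numeric day/month, 4-digit year
long_form <- format(d, "%A, %B %d %Y")  # names are locale dependent

# Parsing uses the same symbols to describe the input string
parsed <- as.Date("04/09/2023", format = "%d/%m/%Y")

us_style
eu_style
long_form
```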
## [1] "September 04 2023"
# convert date info in format 'mm/dd/yyyy'
strDates <- c("01/05/1995", "08/16/1995")
dates <- as.Date(strDates, "%m/%d/%Y")
## [1] "2007-06-22" "2004-02-13"
## [1] "1995-01-05" "1995-08-16"
## [1] 9135 9358