Zhiguang Huo (Caleb)
Wednesday October 10, 2018
## [1] 7 6
## [1] 6 2
## [1] 10 8
## [1] 5 4
set.seed(32611) ## if you keep the same random seed, you will end up with the exact same result
sample(x = a, size = 2)
## [1] 5 4
set.seed(32611) ## if you keep the same random seed, you will end up with the exact same result
sample(x = a, size = 2)
## [1] 5 4
## [1] 5 4
## [1] 1 5
## [1] 2 1
## [1] 5 4
## [1] 1 5
## [1] 2 1
## [1] 9 4 6 10 2 5 8 3 7 1
## [1] 7 8 6 10 3 4 9 2 10 4
## [1] "A" "B" "C" "B" "H" "F" "H" "H" "E" "E"
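The three outputs above look like a random permutation, a draw with replacement, and a draw from a character vector. The calls that produced them are not shown here, so the following is only a sketch with assumed arguments (`a <- 1:10` and `LETTERS[1:10]`), not the original code. Note also that R 3.6.0 changed the default sampling algorithm, so the same seed may give different numbers than the ones printed above.

```r
a <- 1:10                                   # assumed: the vector being sampled

set.seed(32611)
perm        <- sample(x = a)                # no size given: random permutation of a
with_rep    <- sample(x = a, replace = TRUE)              # values may repeat
letters_rep <- sample(x = LETTERS[1:10], replace = TRUE)  # characters work too

# Resetting the seed replays the same random stream
set.seed(32611)
perm_again <- sample(x = a)
identical(perm, perm_again)                 # TRUE
```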
For normal distribution:
Distribution | R command (random number generation) |
---|---|
binomial | rbinom |
Poisson | rpois |
geometric | rgeom |
negative binomial | rnbinom |
uniform | runif |
exponential | rexp |
normal | rnorm |
gamma | rgamma |
beta | rbeta |
student t | rt |
F | rf |
chi-squared | rchisq |
Weibull | rweibull |
log normal | rlnorm |
Distribution | R command (density / probability mass) |
---|---|
binomial | dbinom |
Poisson | dpois |
geometric | dgeom |
negative binomial | dnbinom |
uniform | dunif |
exponential | dexp |
normal | dnorm |
gamma | dgamma |
beta | dbeta |
student t | dt |
F | df |
chi-squared | dchisq |
Weibull | dweibull |
log normal | dlnorm |
Distribution | R command (cumulative distribution) |
---|---|
binomial | pbinom |
Poisson | ppois |
geometric | pgeom |
negative binomial | pnbinom |
uniform | punif |
exponential | pexp |
normal | pnorm |
gamma | pgamma |
beta | pbeta |
student t | pt |
F | pf |
chi-squared | pchisq |
Weibull | pweibull |
log normal | plnorm |
Distribution | R command (quantile) |
---|---|
binomial | qbinom |
Poisson | qpois |
geometric | qgeom |
negative binomial | qnbinom |
uniform | qunif |
exponential | qexp |
normal | qnorm |
gamma | qgamma |
beta | qbeta |
student t | qt |
F | qf |
chi-squared | qchisq |
Weibull | qweibull |
log normal | qlnorm |
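The four tables share one naming convention: the prefix `r` draws random numbers, `d` evaluates the density (or probability mass), `p` evaluates the cumulative distribution function, and `q` returns quantiles. A minimal sketch with the binomial distribution shows how the four fit together; the parameters `n = 10` and `p = 0.3` are illustrative choices, not values from the slides.

```r
n <- 10; p <- 0.3                            # illustrative parameters
k <- 0:n

pmf <- dbinom(k, size = n, prob = p)         # d: P(X = k)
cdf <- pbinom(k, size = n, prob = p)         # p: P(X <= k)

sum(pmf)                                     # the pmf sums to 1
all.equal(cdf, cumsum(pmf))                  # the CDF accumulates the pmf

q_med <- qbinom(0.5, size = n, prob = p)     # q: smallest k with P(X <= k) >= 0.5
pbinom(q_med, n, p) >= 0.5                   # TRUE by definition of the quantile

set.seed(32611)
x <- rbinom(1000, size = n, prob = p)        # r: random draws
mean(x)                                      # close to n * p = 3
```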
\[f(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\]
aseq <- seq(-4,4,.01)
plot(aseq,dnorm(aseq, 0, 1),type='l', xlab='x', ylab='Density', lwd=2)
lines(aseq,dnorm(aseq, 1, 1),col=2, lwd=2)
lines(aseq,dnorm(aseq,0, 2),col=3, lwd=2)
legend("topleft",c(expression(paste(mu==0, ", " , sigma==1 ,sep=' ')),
expression(paste(mu==1, ", " , sigma==1 ,sep=' ')),
expression(paste(mu==0, ", " , sigma==2 ,sep=' '))),
col=1:3, lty=c(1,1,1), lwd=c(2,2,2), cex=1, bty='n')
mtext(side=3,line=.5,'Normal distributions',cex=1, font=2)
aseq <- seq(-4,4,.01)
plot(aseq,dnorm(aseq),type='l', xlab='x', ylab='Density', lwd=2)
lines(aseq,dt(aseq,10),col=2, lwd=2)
lines(aseq,dt(aseq,4),col=3, lwd=2)
lines(aseq,dt(aseq,2),col=4, lwd=2)
legend("topleft",c(expression(normal), expression(paste(df==10,sep=' ')),
expression(paste(df==4,sep=' ')),
expression(paste(df==2,sep=' '))),
col=1:4, lty=rep(1,4), lwd=rep(2,4), cex=1, bty='n')
mtext(side=3,line=.5,'t distributions',cex=1, font=2)
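The plot suggests that the t density moves toward the standard normal density as the degrees of freedom grow. A quick numerical check of that impression, on the same grid as the plot, is sketched below; `df = 100` is an extra illustrative value that is not in the figure.

```r
x   <- seq(-4, 4, 0.01)
dfs <- c(2, 4, 10, 100)                      # 100 is an added, illustrative value

# Largest absolute gap between the t density and the normal density on the grid
max_gap <- sapply(dfs, function(d) max(abs(dt(x, df = d) - dnorm(x))))
names(max_gap) <- paste0("df=", dfs)
max_gap                                      # shrinks as df increases
```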
\[f(k;\lambda) = \frac{\lambda^k e^{-\lambda}}{k!},\] where \(k\) is a non-negative integer.
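As a sanity check, the probability mass function above can be evaluated by hand and compared with `dpois`. The rate `lambda = 3` and the range of `k` are illustrative assumptions.

```r
lambda <- 3                                  # illustrative rate
k <- 0:10

manual <- lambda^k * exp(-lambda) / factorial(k)   # formula evaluated directly
all.equal(manual, dpois(k, lambda))                # TRUE
```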
\[f(x;\alpha,\beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)} x^{\alpha - 1}(1 - x)^{\beta - 1}\]
aseq <- seq(.001,.999,.001)
plot(aseq,dbeta(aseq,.25,.25), type='l', ylim=c(0,6), ylab='Density', xlab='Proportion (p)',lwd=2)
lines(aseq, dbeta(aseq,2,2),lty=2,lwd=2)
lines(aseq, dbeta(aseq,2,5),lty=1,col=2,lwd=2)
lines(aseq, dbeta(aseq,12,2),lty=2,col=2,lwd=2)
lines(aseq, dbeta(aseq,20,.75),lty=1,col='green',lwd=2)
lines(aseq, dbeta(aseq,1,1),lty=2,lwd=2, col=4)
legend(.2,6,c(expression(paste(alpha==.25,', ', beta==.25)), expression(paste(alpha==2,', ',beta==2)), expression(paste(alpha==2,', ', beta==5)), expression(paste(alpha==12,', ',beta==2)), expression(paste(alpha==20,', ',beta==.75)), expression(paste(alpha==1,', ', beta==1))), lty=c(1,2,1,2,1,2), col=c(1,1,2,2,'green',4), cex=1,bty='n',lwd=rep(2,6))
mtext(side=3,line=.5,'Beta distributions',cex=1, font=2)
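\[f(x;k,\theta) = \frac{1}{\Gamma(k) \theta^k} x^{k-1}e^{-\frac{x}{\theta}}\] where \(k\) is the shape and \(\theta\) is the scale. R's dgamma(x, shape, rate) takes the rate \(\beta = 1/\theta\) as its second positional argument, so the plot below is labeled with \(\alpha = k\) and \(\beta = 1/\theta\).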
aseq <- seq(0,7,.01)
plot(aseq,dgamma(aseq,1,1),type='l', xlab='x', ylab='Density', lwd=2)
lines(aseq,dgamma(aseq,2,1),col=4, lwd=2)
lines(aseq,dgamma(aseq,4,4),col=2, lwd=2)
legend(3,1,c(expression(paste(alpha==1,', ',beta==1,sep=' ')), expression(paste(alpha==2,', ',beta==1,sep=' ')), expression(paste(alpha==4,', ', beta==4,sep=' '))), col=c(1,4,2), lty=c(1,1,1), lwd=c(2,2,2), cex=1, bty='n')
mtext(side=3,line=.5,'Gamma distributions',cex=1, font=2)
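The density formula is written with a shape \(k\) and a scale \(\theta\), while the plotting code passes two positional arguments to `dgamma`, which R reads as shape and rate. A short check of how the two parameterizations line up is sketched below, using the red curve's values (shape 4, rate 4, hence scale 1/4).

```r
x <- seq(0.01, 7, 0.01)
k <- 4; theta <- 1 / 4                       # shape and scale; rate = 1 / theta = 4

# Formula evaluated directly in the shape/scale form
manual <- x^(k - 1) * exp(-x / theta) / (gamma(k) * theta^k)

all.equal(manual, dgamma(x, shape = k, scale = theta))  # TRUE
all.equal(manual, dgamma(x, 4, 4))                      # TRUE: 2nd positional arg is rate
```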
A positive random variable \(X\) is log-normally distributed if the logarithm of \(X\) is normally distributed: \[\ln(X) \sim N(\mu, \sigma^2)\]
aseq <- seq(0,7,.01)
plot(aseq,dlnorm(aseq,.1,2),type='l', xlab='x', ylab='Density', lwd=2)
lines(aseq,dlnorm(aseq,2,1),col=4, lwd=2)
lines(aseq,dlnorm(aseq,0,1),col=2, lwd=2)
legend(3,1.2,c(expression(paste(mu==0.1,', ',sigma==2,sep=' ')), expression(paste(mu==2,', ',sigma==1,sep=' ')), expression(paste(mu==0,', ',sigma==1,sep=' '))), col=c(1,4,2), lty=c(1,1,1), lwd=c(2,2,2), cex=1,bty='n')
mtext(side=3,line=.5,'Lognormal distributions',cex=1, font=2)
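The definition can be checked by simulation: the log of a log-normal sample should look normal with the stated \(\mu\) and \(\sigma\). The sketch below uses \(\mu = 0\), \(\sigma = 1\) (the red curve) and an arbitrary sample size; it also verifies the change-of-variables relation between `dlnorm` and `dnorm`.

```r
set.seed(32611)
mu <- 0; sigma <- 1
x <- rlnorm(1e5, meanlog = mu, sdlog = sigma)

c(mean(log(x)), sd(log(x)))                  # close to mu and sigma

# Density relation from the change of variables y = log(x)
xs <- seq(0.1, 7, 0.1)
all.equal(dlnorm(xs, mu, sigma), dnorm(log(xs), mu, sigma) / xs)  # TRUE
```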
The weak law of large numbers states that the sample average converges in probability to the expected value as the sample size \(n\) grows.
\[\frac{1}{n} \sum_{i=1}^n X_i \overset{p}{\rightarrow} \mathbb{E}(X) \quad \text{as } n \rightarrow \infty\]
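A small simulation makes the statement concrete. The sketch below tracks the running average of exponential draws; the choice of distribution (rate 1, so \(\mathbb{E}(X) = 1\)) and the sample size are illustrative assumptions.

```r
set.seed(32611)
n <- 1e5
x <- rexp(n, rate = 1)                       # E(X) = 1 for rate 1
running_mean <- cumsum(x) / seq_len(n)       # sample average after 1, 2, ..., n draws

running_mean[c(10, 100, 1000, 10000, n)]     # settles near 1 as n grows

plot(running_mean, type = "l", log = "x", xlab = "n", ylab = "Sample average")
abline(h = 1, col = 2, lty = 2)              # the expected value
```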
## [1] 0.02275013
## [1] 0.04550026
## [1] 1.959964
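The three values above match standard normal tail probabilities and a quantile. The calls that produced them are not shown here, so the following is a reconstruction that reproduces the same numbers, not necessarily what was originally run.

```r
pnorm(-2)        # P(Z <= -2): lower tail,        0.02275013
2 * pnorm(-2)    # P(|Z| >= 2): both tails,       0.04550026
qnorm(0.975)     # 97.5% quantile of N(0, 1),     1.959964
```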
\[X \sim N( \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} )\]
## [,1] [,2]
## [1,] 10 3
## [2,] 3 2
## [,1] [,2]
## [1,] 10.144428 2.880464
## [2,] 2.880464 1.859495
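The two matrices above look like a covariance matrix and its estimate from simulated data. The generating code is not shown, so the sketch below is an assumption-laden reconstruction: the zero mean vector and the sample size are guesses, and only `Sigma` is taken from the printed output.

```r
set.seed(32611)
mu    <- c(0, 0)                             # assumed mean; not shown above
Sigma <- matrix(c(10, 3, 3, 2), 2, 2)        # matches the first matrix printed above

X <- MASS::mvrnorm(n = 1e5, mu = mu, Sigma = Sigma)   # one row per draw
Sigma_hat <- cov(X)                          # sample covariance matrix
Sigma_hat                                    # close to Sigma for large n
```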
mu <- c(0,0)
Sigma1 <- matrix(c(1,0,0,1),2,2)
Sigma2 <- matrix(c(1,0,0,5),2,2)
Sigma3 <- matrix(c(5,0,0,1),2,2)
Sigma4 <- matrix(c(1,0.8,0.8,1),2,2)
Sigma5 <- matrix(c(1,1,1,1),2,2)
Sigma6 <- matrix(c(1,-0.8,-0.8,1),2,2)
arange <- c(-6,6)
par(mfrow=c(2,3))
plot(MASS::mvrnorm(n = 1000, mu, Sigma1), xlim=arange, ylim=arange)
plot(MASS::mvrnorm(n = 1000, mu, Sigma2), xlim=arange, ylim=arange)
plot(MASS::mvrnorm(n = 1000, mu, Sigma3), xlim=arange, ylim=arange)
plot(MASS::mvrnorm(n = 1000, mu, Sigma4), xlim=arange, ylim=arange)
plot(MASS::mvrnorm(n = 1000, mu, Sigma5), xlim=arange, ylim=arange)
plot(MASS::mvrnorm(n = 1000, mu, Sigma6), xlim=arange, ylim=arange)
## Loading required package: stats4
## Loading required package: evd
## [1] 0.3602613
## [1] 1.913265
## [1] 0.8379001
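Let \(Z = F_X(X)\), where \(F_X\) is the continuous, strictly increasing CDF of \(X\). For \(x \in [0, 1]\):
- \[F_Z(x) = P(F_X(X) \le x) = P(X \le F_X^{-1}(x)) = F_X(F_X^{-1}(x)) = x\]
- If \(U\) is a uniform random variable taking values in \([0, 1]\), \[F_U(x) = \int_{-\infty}^{x} f_U(u)\, du = \int_0^x du = x\]

Thus \(Z \sim UNIF(0, 1)\).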
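The result can be seen numerically in both directions: applying a variable's own CDF to it gives uniform values, and applying the quantile function to uniform values gives draws from the target distribution. The sketch below uses an exponential distribution with rate 2, which is an illustrative choice and not taken from the slides.

```r
set.seed(32611)
x <- rexp(1e5, rate = 2)                     # illustrative choice of X
z <- pexp(x, rate = 2)                       # Z = F_X(X)
hist(z, breaks = 20, main = "Z = F_X(X)")    # roughly flat on [0, 1]

# Inverse-transform sampling: push uniforms through the quantile function
u  <- runif(1e5)
x2 <- qexp(u, rate = 2)
c(mean(x), mean(x2))                         # both close to 1 / 2
```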
Simulate 100 samples with group label A from \[X \sim N( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & -0.8 \\ -0.8 & 1 \end{pmatrix} )\]
Simulate 100 samples with group label B from \[X \sim N( \begin{pmatrix} 3 \\ 3 \end{pmatrix}, \begin{pmatrix} 1 & 0.8 \\ 0.8 & 1 \end{pmatrix} )\]
Apply hclust() to the data (dimension \(200 \times 2\)) and visualize the hierarchical tree structure.
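One possible way to set up the exercise is sketched below. The seed, Euclidean distance, and `hclust`'s default complete linkage are assumptions; other distances or linkage methods are equally valid answers.

```r
set.seed(32611)
SigmaA <- matrix(c(1, -0.8, -0.8, 1), 2, 2)
SigmaB <- matrix(c(1,  0.8,  0.8, 1), 2, 2)
A <- MASS::mvrnorm(n = 100, mu = c(0, 0), Sigma = SigmaA)   # group A
B <- MASS::mvrnorm(n = 100, mu = c(3, 3), Sigma = SigmaB)   # group B

dat   <- rbind(A, B)                         # 200 x 2 data matrix
label <- rep(c("A", "B"), each = 100)

hc <- hclust(dist(dat))                      # Euclidean distance, complete linkage
plot(hc, labels = label, cex = 0.4)          # dendrogram with group labels at the leaves

table(cutree(hc, k = 2), label)              # compare a 2-cluster cut with the true labels
```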