Biostatistical Computing, PHC 6068

Monte Carlo methods

Zhiguang Huo (Caleb)

Monday Nov 13, 2017

Bayesian sampling method

  1. Monte Carlo methods.
    • Direct sampling.
    • Rejection sampling.
    • Importance sampling.
    • Sampling-importance resampling.
  2. Markov chain Monte Carlo.
    • Metropolis-Hasting.
    • Gibbs sampling. (Next lecture)

Notation

Our targets

Examples: \(\mathbb{E} (x | p^*(x))\) or \(\mathbb{V}\mbox{ar} (x | p^*(x))\)

A simulation approach

Problem: We want to estimate \[\mathbb{E}(g (x) | p^*(x)) = \int g(x) p^*(x) dx,\] Given distribution \(p^*(x)\).

Examples: \(\mathbb{E} (x | p^*(x))\) or \(\mathbb{V}\mbox{ar} (x | p^*(x))\)

Un-normalized density function problems

p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)}
x <- seq(-4, 4, 0.01)
plot(x,p(x),type="l",  main = expression(p(x) == exp (0.4(x-0.4)^{2}  - 0.08 * x^{4})))

integrate(f = p, lower = -4, upper = 4)
## 7.852178 with absolute error < 9.1e-06

Direct Sampling

p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)}
x2 <- seq(-4, 4, 0.1)
plot(x,p(x),type="n",  main = expression(p(x) == exp (0.4(x-0.4)^{2}  - 0.08 * x^{4})))
segments(x2,0,x2,p(x2))

Rejection sampling (Demonstrate Monte Carlo method)

(e.g. \(p(x) = \exp [ 0.4(x-0.4)^2 - 0.08x^4 ]\))

  1. \(x \sim q^*(x)\)
  2. accept \(x\) with prob \(p(x)/c q^*(x)\):
    • Sample \(u \sim \mbox{UNIF}(0,1)\), accept if \(p(x)/c q^*(x) > u\).
  3. Repeat Step 1 and 2 many times.

Rejection sampling

p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)}
x <- seq(-4, 4, 0.01)
qstar <- function(x, C = 30){
  C*dnorm(x,sd = 3) 
}
plot(x,p(x),type="l", ylim = c(0,5))
curve(qstar,add = T)
text(0, 4.5, expression({q^"*"} (x) == 30* N (x , 0, 3^2) ))
text(1, 2, expression(p(x) == exp (0.4(x-0.4)^{2}  - 0.08 * x^{4})))
x0 <- -2.5
segments(x0,0,x0,qstar(x0),col=2)
N <- 10
for(i in 1:N){
  set.seed(i)
  ay <- runif(1,0,qstar(x0))
  acol = ifelse(ay < p(x0),2,4)
  points(x0,ay,col=acol,pch=19)
}

Rejection sampling

Proof: \[\begin{align*} p^*(x) &= \frac{p(x)}{Z} \\ &= \frac{p(x)}{\int_x p(x) dx} \\ &= \frac{[p(x)/c q^*(x)]q^*(x)}{\int_x [p(x)/c q^*(x)]q^*(x)dx} \\ \end{align*}\]

Interpretation of the numerator:

Rejection sampling

## rejection sampling
p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)}
x <- seq(-4, 4, 0.1)
plot(x,p(x),type="l")

# uniform proposal on [-4,4]:
qstar <- function(x){rep.int(0.125,length(x))}
# we can find M in this case:
C <- round(max(p(x)/qstar(x))) + 1; C
## [1] 25
# number of samples
N <- 1000
# generate proposals and u
x.h <- runif( N, -4, 4 )
u <- runif( N )
acc <- u < p(x.h) / (C * qstar(x.h))
x.acc <- x.h[ acc ]
# how many proposals are accepted
sum( acc ) /N
## [1] 0.326
# calculate some statistics
c(m=mean(x.acc), s=sd(x.acc))
##          m          s 
## -0.7046715  1.3898611
par(mfrow=c(1,2), mar=c(2,2,1,1))
plot(x,p(x),type="l")
barplot(table(round(x.acc,1))/length(x.acc))

Discussion: What does the acceptance rate depend on?

Importance sampling

Importance sampling is not a method for generating samples from \(p(x)\) (target 1), it is just a method for estimating the expectation of a function \(g(x)\) (target 2).

Importance sampling, algorithm

\[\begin{align*} \mathbb{E} (\phi (x) | p^* ) &= \int \phi (x) p^*(x) dx\\ &= \frac{\int \phi (x) p^*(x) dx}{\int p^*(x) dx}\\ &= \frac{\int [\phi (x) p(x)/Z] dx}{\int [p(x)/Z] dx}\\ &= \frac{\int [\phi (x) p(x)/q^*(x)] q^*(x) dx}{\int [p(x)/q^*(x)] q^*(x) dx}, \end{align*}\]

\[\hat{\mathbb{E}} (\phi (x) | p^* ) = \frac{\frac{1}{M} \sum_{m=1}^M[\phi (x_m) p(x_m)/q^*(x_m)] }{ \frac{1}{M} \sum_{m=1}^M[p(x_m)/q^*(x_m)]}\]

\(w(x_m) =\frac{p(x_m)}{q^*(x_m)}\)

\[\hat{\mathbb{E}} (\phi (x) | p^* ) = \frac{ \sum_{m=1}^M \phi(x_m) w(x_m)}{ \sum_{m=1}^M w(x_m)} \]

Importance sampling, examples

par(mfrow=c(1,2), mar=c(2,2,2,1))

p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)}
x <- seq(-4, 4, 0.01)
plot(x,p(x),type="l",  main = expression(p(x) == exp (0.4(x-0.4)^{2}  - 0.08 * x^{4})))

phi <- function(x){ (- 1/3*x^3 + 1/2*x^2 + 12*x - 12) / 30 + 1.3}
x <- seq(-4, 4, 0.01)
plot(x,phi(x),type="l",main= expression(phi(x)))

Underlying solution

\[\begin{align*} \mathbb{E} (\phi (x) | p^* ) &= \int \phi (x) p^*(x) dx\\ &= \frac{\int \phi (x) p^*(x) dx}{\int p^*(x) dx}\\ &= \frac{\int [\phi (x) p(x)/Z] dx}{\int [p(x)/Z] dx}\\ &= \frac{\int [\phi (x) p(x)] dx}{\int [p(x)] dx}\\ \end{align*}\]
ep <- function(x) p(x)*phi(x)
truthE <- integrate(f = ep, lower = -4, upper = 4)$value/integrate(f = p, lower = -4, upper = 4)$value
truthE
## [1] 0.6971733

Importance sampling, examples (2)

q.r <- rnorm
q.d <- dnorm

par(mfrow=c(1,2))
plot(x,q.d(x),type="l",main='sampler distribution Gaussian')
curve(p, from = -4,to = 4 ,col=2 ,  main = expression(p(x) == exp (0.4(x-0.4)^{2}  - 0.08 * x^{4})))

M <- 1000
x.m <- q.r(M)
ww <- p(x.m) / q.d(x.m)
qq <- ww / sum(ww)
x.g <- phi(x.m)
sum(x.g * qq)
## [1] 0.6956244

Number of samples for importance sampling

M <- 10^seq(1,7,length.out = 30)

result.g <- numeric(length(M))
for(i in 1:length(M)){
  aM <- M[i]
  x.m <- q.r(aM)
  ww <- p(x.m) / q.d(x.m)
  
  qq.g <- ww / sum(ww)
  x.g <- phi(x.m)
  
  result.g[i] <- sum(x.g * qq.g)/sum(qq.g)
}

plot(log10(M),result.g,main='importance sampling result Gaussian')
abline(h = truthE, col = 2)

Sampling from a narrow Gaussian distribution

q.r_narrow <- function(x){rnorm(x,0,1/2)}
q.d_narrow <- function(x){dnorm(x,0,1/2)}

par(mfrow=c(1,2))
plot(x,q.d_narrow(x),type="l",main='sampler narrow distribution Gaussian')
curve(p, from = -4,to = 4 ,col=2 ,  main = expression(p(x) == exp (0.4(x-0.4)^{2}  - 0.08 * x^{4})))

Number of samples for importance sampling (narrow Gaussian distribution)

M <- 10^seq(1,7,length.out = 30)

result.narrow <- numeric(length(M))
for(i in 1:length(M)){
  aM <- M[i]
  x.m <- q.r_narrow(aM)
  ww <- p(x.m) / q.d_narrow(x.m)
  
  qq.c <- ww / sum(ww)
  x.c <- phi(x.m)
  
  result.narrow[i] <- sum(x.c * qq.c)/sum(qq.c)
}
plot(log(M,10),result.narrow)
abline(h = truthE, col = 2)

Importance sampling remark

Importance resampling (SIR)

Remark:

Implement importance resampling (SIR)

p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)}
x <- seq(-4, 4, 0.01)
plot(x,p(x),type="l")

qstar <- function(x){rep.int(0.125,length(x))}
N <- 10000
S <- 1000
x.qstar <- runif( N, -4, 4 )
ww <- p(x.qstar) / qstar(x.qstar)
qq <- ww / sum(ww)
x.acc <-sample(x.qstar, size = S, prob=qq, replace=F)
par(mfrow=c(1,2), mar=c(2,2,1,1))
plot(x,p(x),type="l")
barplot(table(round(x.acc,1))/length(x.acc))

Summarize

limitation of Monte Carlo method

Motivation of Metropolis-Hastings

compare importance sampling and Metropolis-Hastings

compare importance sampling and Metropolis-Hastings

Monte Carlo vs Markov Chain Monte Carlo

Introduce MH algorithm

MH algorithm

Implementation of MH algorithm

p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)}
x <- seq(-4, 4, 0.1)
plot(x,p(x),type="l")

N <- 10000
x.acc5 <- rep.int(NA, N)
u <- runif(N)
acc.count <- 0
std <- 1 ## Spread of proposal distribution
xc <- 0; ## Starting value
for (ii in 1:N){
  xp <- rnorm(1, mean=xc, sd=std) ## proposal
  alpha <- min(1, (p(xp)/p(xc)) *
                 (dnorm(xc, mean=xp,sd=std)/dnorm(xp, mean=xc,sd=std)))
  x.acc5[ii] <- xc <- ifelse(u[ii] < alpha, xp, xc)
  ## find number of acccepted proposals:
  acc.count <- acc.count + (u[ii] < alpha)
}
## Fraction of accepted *new* proposals
acc.count/N
## [1] 0.7341
par(mfrow=c(1,2), mar=c(2,2,1,1))
plot(x,p(x),type="l")
barplot(table(round(x.acc5,1))/length(x.acc5))

Check samples from MH

plot(x.acc5,type="l")

Good convergence.

Check samples from MH

Remove initial values for burnin period

N <- 1000
x.acc5 <- rep.int(NA, N)
u <- runif(N)
acc.count <- 0
std <- 1 ## Spread of proposal distribution
xc <- 8; ## Starting value
for (ii in 1:N){
  xp <- rnorm(1, mean=xc, sd=std) ## proposal
  alpha <- min(1, (p(xp)/p(xc)) *
                 (dnorm(xc, mean=xp,sd=std)/dnorm(xp, mean=xc,sd=std)))
  x.acc5[ii] <- xc <- ifelse(u[ii] < alpha, xp, xc)
  ## find number of acccepted proposals:
  acc.count <- acc.count + (u[ii] < alpha)
}
## Fraction of accepted *new* proposals
acc.count/N
## [1] 0.732
plot(x.acc5,type="l")

Doesn’t converge

N <- 1000
x.acc5 <- rep.int(NA, N)
u <- runif(N)
acc.count <- 0
std <- 0.1 ## Spread of proposal distribution
xc <- 0; ## Starting value
for (ii in 1:N){
  xp <- rnorm(1, mean=xc, sd=std) ## proposal
  alpha <- min(1, (p(xp)/p(xc)) *
                 (dnorm(xc, mean=xp,sd=std)/dnorm(xp, mean=xc,sd=std)))
  x.acc5[ii] <- xc <- ifelse(u[ii] < alpha, xp, xc)
  ## find number of acccepted proposals:
  acc.count <- acc.count + (u[ii] < alpha)
}
## Fraction of accepted *new* proposals
acc.count/N
## [1] 0.974
plot(x.acc5,type="l")

Why MH converge? (optional)

\[ \int_{x} p(x)T(x'|x) dx = \int_{x} p(x')T(x|x') dx\] \[ p(x') = \int_{x} p(x)T(x'|x) dx\]

Since p(x) is the true distribution. MH algorithm will eventually converges to the true distribution.

Special cases for MH

Always remember for MCMC

References

knitr::purl("MC.rmd", output = "MC.R ", documentation = 2)
## 
## 
## processing file: MC.rmd
## 
  |                                                                       
  |                                                                 |   0%
  |                                                                       
  |..                                                               |   2%
  |                                                                       
  |...                                                              |   5%
  |                                                                       
  |.....                                                            |   7%
  |                                                                       
  |......                                                           |  10%
  |                                                                       
  |........                                                         |  12%
  |                                                                       
  |.........                                                        |  14%
  |                                                                       
  |...........                                                      |  17%
  |                                                                       
  |............                                                     |  19%
  |                                                                       
  |..............                                                   |  21%
  |                                                                       
  |...............                                                  |  24%
  |                                                                       
  |.................                                                |  26%
  |                                                                       
  |...................                                              |  29%
  |                                                                       
  |....................                                             |  31%
  |                                                                       
  |......................                                           |  33%
  |                                                                       
  |.......................                                          |  36%
  |                                                                       
  |.........................                                        |  38%
  |                                                                       
  |..........................                                       |  40%
  |                                                                       
  |............................                                     |  43%
  |                                                                       
  |.............................                                    |  45%
  |                                                                       
  |...............................                                  |  48%
  |                                                                       
  |................................                                 |  50%
  |                                                                       
  |..................................                               |  52%
  |                                                                       
  |....................................                             |  55%
  |                                                                       
  |.....................................                            |  57%
  |                                                                       
  |.......................................                          |  60%
  |                                                                       
  |........................................                         |  62%
  |                                                                       
  |..........................................                       |  64%
  |                                                                       
  |...........................................                      |  67%
  |                                                                       
  |.............................................                    |  69%
  |                                                                       
  |..............................................                   |  71%
  |                                                                       
  |................................................                 |  74%
  |                                                                       
  |..................................................               |  76%
  |                                                                       
  |...................................................              |  79%
  |                                                                       
  |.....................................................            |  81%
  |                                                                       
  |......................................................           |  83%
  |                                                                       
  |........................................................         |  86%
  |                                                                       
  |.........................................................        |  88%
  |                                                                       
  |...........................................................      |  90%
  |                                                                       
  |............................................................     |  93%
  |                                                                       
  |..............................................................   |  95%
  |                                                                       
  |...............................................................  |  98%
  |                                                                       
  |.................................................................| 100%
## output file: MC.R
## [1] "MC.R "