Zhiguang Huo (Caleb)
Monday Nov 30, 2020
Target 2: To estimate expectations of functions under this distribution, for example \[\mathbb{E}(g (x) | p^*(x)) = \int g(x) p^*(x) dx.\]
Problem: Given a distribution \(p^*(x)\), we want to estimate \[\mathbb{E}(g (x) | p^*(x)) = \int g(x) p^*(x) dx.\]
Examples: \(\mathbb{E} (x | p^*(x))\) or \(\mathbb{V}\mbox{ar} (x | p^*(x))\)
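In one dimension these two examples can be approximated by numerical integration, which gives a reference value for the sampling methods. The following is a minimal sketch, assuming the unnormalized example density of these notes, \(p(x) = \exp [ 0.4(x-0.4)^2 - 0.08x^4 ]\), and truncating the integrals to \([-8, 8]\), where the tails are negligible. The names `Z`, `m` and `v` are illustrative.

```r
## unnormalized target density (the example used in these notes)
p <- function(x, a = 0.4, b = 0.08) exp(a * (x - a)^2 - b * x^4)

## normalizing constant Z = integral of p(x)
Z <- integrate(p, lower = -8, upper = 8)$value

## E(x | p*) = integral of x p(x), divided by Z
m <- integrate(function(x) x * p(x), lower = -8, upper = 8)$value / Z

## Var(x | p*) = E(x^2 | p*) - E(x | p*)^2
v <- integrate(function(x) x^2 * p(x), lower = -8, upper = 8)$value / Z - m^2

c(Z = Z, mean = m, var = v)
```

This only works because the example is one-dimensional; the sampling methods in these notes are aimed at cases where such direct integration is not practical.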
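p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)} ## unnormalized target density p(x)
## numerical integral of p(x), i.e. the normalizing constant Z:
## 7.852178 with absolute error < 9.1e-06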
x <- seq(-4, 4, 0.01)
plot(x,p(x),type="l", main = expression(p(x) == exp (0.4(x-0.4)^{2} - 0.08 * x^{4})))
x2 <- seq(-4, 4, 0.1)
plot(x,p(x),type="n", main = expression(p(x) == exp (0.4(x-0.4)^{2} - 0.08 * x^{4})))
segments(x2,0,x2,p(x2))
(e.g., \(p(x) = \exp [ 0.4(x-0.4)^2 - 0.08x^4 ]\))
Sample from a simpler distribution \(q^*(x)\).
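Rejection sampling algorithm: draw \(x \sim q^*(x)\) and \(u \sim \mbox{Unif}(0, 1)\); accept \(x\) if \(u < p(x) / [c q^*(x)]\), otherwise reject it. Here \(c\) is a constant chosen so that \(c q^*(x) \ge p(x)\) for all \(x\).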
x <- seq(-4, 4, 0.01)
plot(x,p(x),type="l", main = expression(p(x) == exp (0.4(x-0.4)^{2} - 0.08 * x^{4})))
x <- seq(-4, 4, 0.01)
qstar <- function(x, C = 30){
C*dnorm(x,sd = 3)
}
plot(x,p(x),type="l", ylim = c(0,5))
curve(qstar,add = T)
text(0, 5, expression({q^"*"} (x) == N (x , 0, 3^2) ))
text(0, 4.5, expression({cq^"*"} (x) == 30* N (x , 0, 3^2) ))
text(1, 2, expression(p(x) == exp (0.4(x-0.4)^{2} - 0.08 * x^{4})))
x0 <- -2.5 ## suppose we sampled -2.5 from the proposal distribution
segments(x0,0,x0,qstar(x0),col=2)
N <- 10
for(i in 1:N){
  set.seed(i)
  ay <- runif(1,0,qstar(x0)) ## uniform height between 0 and c q*(x0)
  acol <- ifelse(ay < p(x0),2,4) ## color 2: accepted (below p); color 4: rejected
  points(x0,ay,col=acol,pch=19)
}
Proof: \[\begin{align*} p^*(x) &= \frac{p(x)}{Z} \\ &= \frac{p(x)}{\int_x p(x) dx} \\ &= \frac{\left[\frac{p(x)}{c q^*(x)}\right] q^*(x)}{\int_x \left[\frac{p(x)}{c q^*(x)}\right] q^*(x) dx} \end{align*}\]
Interpretation of the numerator:
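One way to read the last line of the proof, sketched here under the assumptions that \(q^*(x)\) is a normalized density and that \(c q^*(x) \ge p(x)\) everywhere, is to split it into a "propose" factor and an "accept" factor:

\[\begin{align*} \underbrace{q^*(x)}_{\text{propose } x} \times \underbrace{\frac{p(x)}{c q^*(x)}}_{\text{accept } x} &= \frac{p(x)}{c}, \\ \int_x \frac{p(x)}{c q^*(x)} q^*(x) dx &= \frac{Z}{c}. \end{align*}\]

Read this way, the numerator is the joint density of proposing \(x\) and then accepting it, and the denominator is the overall probability that a proposal is accepted. Their ratio is the density of an accepted draw, which is \(p(x)/Z = p^*(x)\).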
## rejection sampling
#p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)}
x <- seq(-4, 4, 0.1)
qstar <- function(x){
dnorm(x,sd = 3)
}
# we can find C in this case (smallest integer bound on p(x)/qstar(x) over the grid, plus 1):
C <- round(max(p(x)/qstar(x))) + 1; C
## [1] 28
# number of samples
N <- 1000
# generate proposals and u
x.h <- rnorm( N, sd = 3)
u <- runif( N )
acc <- u < p(x.h) / (C * qstar(x.h))
x.acc <- x.h[ acc ]
# how many proposals are accepted
sum( acc ) /N
## [1] 0.285
## m s
## -0.6207873 1.4258200
par(mfrow=c(1,2), mar=c(2,2,1,1))
plot(x,p(x),type="l")
barplot(table(round(x.acc,1))/length(x.acc))
Discussion: What does the acceptance rate depend on?
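As a hedged illustration for this discussion: when \(q^*\) is a normalized density, averaging the acceptance probability \(p(x)/[C q^*(x)]\) over the proposal gives \(Z/C\), so the acceptance rate depends on how tightly \(C q^*(x)\) envelopes \(p(x)\). The sketch below compares this value with a simulated rate, assuming the example density of these notes, the \(N(0, 3^2)\) proposal, and \(C = 28\) as computed above. The integration range \([-8, 8]\) and the names `expected.rate` and `observed.rate` are illustrative choices.

```r
p <- function(x, a = 0.4, b = 0.08) exp(a * (x - a)^2 - b * x^4)
qstar <- function(x) dnorm(x, sd = 3)

## theoretical acceptance rate Z / C
Z <- integrate(p, lower = -8, upper = 8)$value
C <- 28
expected.rate <- Z / C

## simulated acceptance rate
set.seed(1)
N <- 1e5
x.h <- rnorm(N, sd = 3)
u <- runif(N)
observed.rate <- mean(u < p(x.h) / (C * qstar(x.h)))

c(expected = expected.rate, observed = observed.rate)
```

A larger \(C\), or a proposal whose shape matches \(p(x)\) poorly, lowers \(Z/C\) and therefore wastes more proposals.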
Importance sampling is not a method for generating samples from \(p(x)\) (target 1); it is a method for estimating the expectation of a function \(g(x)\) (target 2). The function is written \(\phi(x)\) below.
\[\mathbb{E} (\phi (x) | p^* ) = \int \phi (x) p^*(x) dx\]
\[\hat{\mathbb{E}} (\phi (x) | p^* ) = \frac{1}{M}\sum_{m=1}^M\phi (x_m) \]
\[\begin{align*} \mathbb{E} (\phi (x) | p^* ) &= \int \phi (x) p^*(x) dx\\ &= \frac{\int \phi (x) p^*(x) dx}{\int p^*(x) dx}\\ &= \frac{\int [\phi (x) p(x)/Z] dx}{\int [p(x)/Z] dx}\\ &= \frac{\int [\phi (x) p(x)/q^*(x)] q^*(x) dx}{\int [p(x)/q^*(x)] q^*(x) dx}, \end{align*}\]
\[\hat{\mathbb{E}} (\phi (x) | p^* ) = \frac{\frac{1}{M} \sum_{m=1}^M[\phi (x_m) p(x_m)/q^*(x_m)] }{ \frac{1}{M} \sum_{m=1}^M[p(x_m)/q^*(x_m)]}\]
\(w(x_m) =\frac{p(x_m)}{q^*(x_m)}\)
\[\hat{\mathbb{E}} (\phi (x) | p^* ) = \frac{ \sum_{m=1}^M \phi(x_m) w(x_m)}{ \sum_{m=1}^M w(x_m)} \]
Importance ratio: \(\frac{w(x_m)}{ \sum_{m=1}^M w(x_m)}\)
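When \(q^* = p^*\), all weights \(w(x_m)\) are equal, so the importance sampling estimator reduces to the regular sample mean; the regular mean estimator is a special case of importance sampling.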
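To make this special case explicit, here is a short derivation, assuming the proposal equals the normalized target, \(q^*(x) = p^*(x) = p(x)/Z\):

\[\begin{align*} w(x_m) &= \frac{p(x_m)}{q^*(x_m)} = \frac{p(x_m)}{p(x_m)/Z} = Z, \\ \hat{\mathbb{E}} (\phi (x) | p^* ) &= \frac{ \sum_{m=1}^M \phi(x_m) Z}{ \sum_{m=1}^M Z} = \frac{1}{M}\sum_{m=1}^M\phi (x_m). \end{align*}\]

Every importance ratio is then \(1/M\), which is the equal weighting used by the ordinary sample mean.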
par(mfrow=c(1,2), mar=c(2,2,2,1))
x <- seq(-4, 4, 0.01)
plot(x,p(x),type="l", main = expression(p(x) == exp (0.4(x-0.4)^{2} - 0.08 * x^{4})))
phi <- function(x){ (- 1/3*x^3 + 1/2*x^2 + 12*x - 12) / 30 + 1.3}
x <- seq(-4, 4, 0.01)
plot(x,phi(x),type="l",main= expression(phi(x)))
\[\begin{align*} \mathbb{E} (\phi (x) | p^* ) &= \int \phi (x) p^*(x) dx\\ &= \frac{\int \phi (x) p^*(x) dx}{\int p^*(x) dx}\\ &= \frac{\int [\phi (x) p(x)/Z] dx}{\int [p(x)/Z] dx}\\ &= \frac{\int [\phi (x) p(x)] dx}{\int [p(x)] dx}\\ \end{align*}\]
ep <- function(x) p(x)*phi(x)
truthE <- integrate(f = ep, lower = -4, upper = 4)$value/integrate(f = p, lower = -4, upper = 4)$value
truthE
## [1] 0.6971733
q.r <- rnorm
q.d <- dnorm
par(mfrow=c(1,2))
plot(x,q.d(x),type="l",main='sampler distribution Gaussian')
curve(p, from = -4,to = 4 ,col=2 , main = expression(p(x) == exp (0.4(x-0.4)^{2} - 0.08 * x^{4})))
M <- 1000
x.m <- q.r(M)
ww <- p(x.m) / q.d(x.m)
qq <- ww / sum(ww) ## importance ratio
x.g <- phi(x.m)
sum(x.g * qq)
## [1] 0.7022795
M <- round(10^seq(1,7,length.out = 30)) ## sample sizes from 10 to 10^7, equally spaced on the log scale
result.g <- numeric(length(M))
for(i in seq_along(M)){
  aM <- M[i]
  x.m <- q.r(aM)
  ww <- p(x.m) / q.d(x.m)
  qq.g <- ww / sum(ww) ## importance ratio (sums to 1)
  x.g <- phi(x.m)
  result.g[i] <- sum(x.g * qq.g)
}
plot(M,result.g,log = "x", main='importance sampling result Gaussian')
abline(h = truthE, col = 2)
What will affect the estimation accuracy?
- sample size \(M\)
- others?
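One candidate for "others" is how well the proposal covers the target. A rough way to explore this is to repeat the estimate many times with two different Gaussian proposals and compare the spread of the results. This is a sketch under stated assumptions: the helper `is.estimate`, the proposal standard deviations 1 and 2, and the 200 replicates are illustrative choices, not part of the original notes.

```r
p <- function(x, a = 0.4, b = 0.08) exp(a * (x - a)^2 - b * x^4)
phi <- function(x) (-1/3 * x^3 + 1/2 * x^2 + 12 * x - 12) / 30 + 1.3

## self-normalized importance sampling estimate with a N(0, prop.sd^2) proposal
is.estimate <- function(M, prop.sd) {
  x.m <- rnorm(M, sd = prop.sd)
  ww <- p(x.m) / dnorm(x.m, sd = prop.sd)
  sum(phi(x.m) * ww) / sum(ww)
}

set.seed(1)
est.sd1 <- replicate(200, is.estimate(M = 1000, prop.sd = 1))
est.sd2 <- replicate(200, is.estimate(M = 1000, prop.sd = 2))

## mean and spread of the repeated estimates for each proposal
rbind(sd1 = c(mean = mean(est.sd1), sd = sd(est.sd1)),
      sd2 = c(mean = mean(est.sd2), sd = sd(est.sd2)))
```

Both rows should have a mean close to the value obtained by numerical integration; comparing the two `sd` columns gives a rough sense of how much the choice of proposal matters at a fixed sample size.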
Remark:
#p <- function(x, a=.4, b=.08){exp(a*(x-a)^2 - b*x^4)}
x <- seq(-4, 4, 0.01)
plot(x,p(x),type="l")
qstar <- function(x){rep.int(0.125,length(x))} ## proposal distribution: uniform distribution.
N <- 10000
S <- 1000
x.qstar <- runif( N, -4, 4 )
ww <- p(x.qstar) / qstar(x.qstar)
qq <- ww / sum(ww) ## importance ratio
x.acc <-sample(x.qstar, size = S, prob=qq, replace=F)
par(mfrow=c(1,2), mar=c(2,2,1,1))
plot(x,p(x),type="l")
barplot(table(round(x.acc,1))/length(x.acc))
\(\frac{p(x')}{q(x'|x)}/\frac{p(x)}{q(x|x')}\) is a ratio of importance sampling weights.
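For a Gaussian random-walk proposal such as the one used in the MH code of these notes, the proposal density is symmetric, \(q(x'|x) = q(x|x')\), so the two \(q\) terms cancel and the ratio reduces to \(p(x')/p(x)\). A quick numerical check, with arbitrary illustrative values for the current state `x.cur` and the proposed state `x.new`:

```r
p <- function(x, a = 0.4, b = 0.08) exp(a * (x - a)^2 - b * x^4)
std <- 1
x.cur <- 0.3    ## current state x (arbitrary illustrative value)
x.new <- -1.2   ## proposed state x' (arbitrary illustrative value)

## full ratio of importance sampling weights
ratio.full <- (p(x.new) / dnorm(x.new, mean = x.cur, sd = std)) /
  (p(x.cur) / dnorm(x.cur, mean = x.new, sd = std))

## simplified ratio, valid for a symmetric proposal
ratio.sym <- p(x.new) / p(x.cur)

all.equal(ratio.full, ratio.sym)
```

With an asymmetric proposal the \(q\) terms do not cancel, and the full ratio is needed.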
N <- 10000
x.acc5 <- rep.int(NA, N)
u <- runif(N)
acc.count <- 0
std <- 1 ## Spread of proposal distribution
xp <- 0; ## Starting value
for (ii in 1:N){
xc <- rnorm(1, mean=xp, sd=std) ## xp: previous sample; xc: candidate (proposed) sample
alpha <- min(1, (
p(xc) /dnorm(xc, mean=xp,sd=std) /
(p(xp) / dnorm(xp, mean=xc,sd=std))
)
)
x.acc5[ii] <- xp <- ifelse(u[ii] < alpha, xc, xp)
## find number of accepted proposals:
acc.count <- acc.count + (u[ii] < alpha)
}
## Fraction of accepted *new* proposals
acc.count/N
## [1] 0.737
Trade-off between acceptance rate and convergence.
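A small experiment can make this trade-off concrete. The sketch below runs a random-walk MH chain on the example density of these notes for three proposal spreads and reports the acceptance rate together with the lag-1 autocorrelation of the chain. The helper `mh.chain`, the spreads 0.1, 1 and 10, and the use of lag-1 autocorrelation as a crude proxy for how slowly the chain explores the target are all assumptions made for illustration.

```r
p <- function(x, a = 0.4, b = 0.08) exp(a * (x - a)^2 - b * x^4)

## random-walk MH; returns the chain and the fraction of accepted proposals
mh.chain <- function(N, std, x.start = 0) {
  x.chain <- numeric(N)
  x.cur <- x.start
  n.acc <- 0
  for (ii in seq_len(N)) {
    x.new <- rnorm(1, mean = x.cur, sd = std)
    ## symmetric proposal, so the q terms cancel in the ratio
    alpha <- min(1, p(x.new) / p(x.cur))
    if (runif(1) < alpha) {
      x.cur <- x.new
      n.acc <- n.acc + 1
    }
    x.chain[ii] <- x.cur
  }
  list(chain = x.chain, acc.rate = n.acc / N)
}

set.seed(1)
stds <- c(0.1, 1, 10)
fits <- lapply(stds, function(s) mh.chain(N = 5000, std = s))

## acceptance rate and lag-1 autocorrelation for each proposal spread
res <- data.frame(
  std = stds,
  acc.rate = sapply(fits, function(f) f$acc.rate),
  lag1.acf = sapply(fits, function(f) acf(f$chain, plot = FALSE)$acf[2])
)
res
```

A very small spread accepts almost every proposal but moves in tiny steps, while a very large spread proposes big jumps that are mostly rejected; neither extreme explores the target efficiently.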
N <- 1000
x.acc5 <- rep.int(NA, N)
u <- runif(N)
acc.count <- 0
std <- 1 ## Spread of proposal distribution
xp <- 8; ## Starting value
for (ii in 1:N){
xc <- rnorm(1, mean=xp, sd=std) ## xp: previous sample; xc: candidate (proposed) sample
alpha <- min(1, (
p(xc) /dnorm(xc, mean=xp,sd=std) /
(p(xp) / dnorm(xp, mean=xc,sd=std))
)
)
x.acc5[ii] <- xp <- ifelse(u[ii] < alpha, xc, xp)
## find number of accepted proposals:
acc.count <- acc.count + (u[ii] < alpha)
}
## Fraction of accepted *new* proposals
acc.count/N
## [1] 0.73
N <- 1000
x.acc5 <- rep.int(NA, N)
u <- runif(N)
acc.count <- 0
std <- 0.1 ## Spread of proposal distribution
xc <- 0; ## Starting value
for (ii in 1:N){
xp <- rnorm(1, mean=xc, sd=std) ## proposal
alpha <- min(1, (p(xp)/p(xc)) *
(dnorm(xc, mean=xp,sd=std)/dnorm(xp, mean=xc,sd=std)))
x.acc5[ii] <- xc <- ifelse(u[ii] < alpha, xp, xc)
## find number of accepted proposals:
acc.count <- acc.count + (u[ii] < alpha)
}
## Fraction of accepted *new* proposals
acc.count/N
## [1] 0.971
Concepts:
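Let \(A(x'|x) = \min\left(1, \frac{p(x')q(x|x')}{p(x)q(x'|x)}\right)\) be the acceptance probability and \(T(x'|x) = q(x'|x) A(x'|x)\) the transition density for a move from \(x\) to \(x'\). Without loss of generality, suppose the ratio is at most 1, so that \(A(x|x') = 1\). Then \[ A(x'|x) = \frac{p(x')q(x|x')}{p(x)q(x'|x)}\] \[ p(x)q(x'|x) A(x'|x) = p(x')q(x|x') \] \[ p(x)q(x'|x) A(x'|x) = p(x')q(x|x') A(x|x')\] \[ p(x)T(x'|x) = p(x')T(x|x') \] The last line is called the detailed balance condition.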
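Integrating both sides over \(x\) and using \(\int_{x} T(x|x') dx = 1\): \[ \int_{x} p(x)T(x'|x) dx = \int_{x} p(x')T(x|x') dx\] \[ p(x') = \int_{x} p(x)T(x'|x) dx\]
So \(p(x)\) is a stationary distribution of the chain: once the samples follow \(p\), they still follow \(p\) after every transition. Provided the chain can reach every region where \(p(x) > 0\), the MH algorithm will therefore eventually converge to the true distribution.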
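Detailed balance and stationarity can be checked numerically on a discretized version of the problem. The sketch below is an illustration under stated assumptions rather than part of the original notes: the target is restricted to a grid on \([-4, 4]\), the proposal is a Gaussian kernel renormalized over the grid (so it is no longer exactly symmetric), and the names `target`, `Q`, `A` and `Tmat` are hypothetical.

```r
p <- function(x, a = 0.4, b = 0.08) exp(a * (x - a)^2 - b * x^4)

## discretize the target on a grid (an illustrative simplification)
x <- seq(-4, 4, by = 0.1)
target <- p(x) / sum(p(x))

## proposal matrix: Q[i, j] = probability of proposing x[j] from x[i]
Q <- outer(x, x, function(from, to) dnorm(to, mean = from, sd = 1))
Q <- Q / rowSums(Q)

## acceptance probability A[i, j] = min(1, target[j] Q[j, i] / (target[i] Q[i, j]))
ratio <- outer(target, target, function(p.from, p.to) p.to / p.from) * t(Q) / Q
A <- pmin(ratio, 1)

## transition matrix: accepted moves off the diagonal, rejections on the diagonal
Tmat <- Q * A
diag(Tmat) <- 0
diag(Tmat) <- 1 - rowSums(Tmat)

## detailed balance: target[i] Tmat[i, j] equals target[j] Tmat[j, i]
flow <- target * Tmat
db.gap <- max(abs(flow - t(flow)))

## stationarity: one transition leaves the target unchanged
st.gap <- max(abs(drop(target %*% Tmat) - target))

c(detailed.balance.gap = db.gap, stationarity.gap = st.gap)
```

Both gaps should be zero up to floating-point rounding, which mirrors the two displayed equations: the flow between any two states is balanced, and summing the flow into a state returns the target probability of that state.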