Zhiguang Huo (Caleb)
Monday November 5, 2018
\[Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + \ldots + \beta_pX_p + \varepsilon, \text{with }\varepsilon \sim N(0, \sigma^2)\]
Intepretation of linear model coefficients:
How about other data types?
\[Y = X^\top\beta + \varepsilon, \text{with }\varepsilon \sim N(0, \sigma^2)\]
We can generalize these conditions for linear regression model
Link | \(X^\top\beta = g(\mu)\) | \(\mu = g^{-1} (X^\top\beta)\) |
---|---|---|
identity | \(X^\top\beta = \mu\) | \(\mu = X^\top\beta\) |
log | \(X^\top\beta = \log(\mu)\) | \(\mu = \exp(X^\top\beta)\) |
logit | \(X^\top\beta = \log \frac{\mu}{1 - \mu}\) | \(\mu = \frac{\exp(X^\top\beta)}{1 + \exp(X^\top\beta)}\) |
probit | \(X^\top\beta = \Phi^{-1}({\mu})\) | \(\mu = \Phi (X^\top\beta)\) |
X <- matrix(c(189, 104, 10845, 10933), nrow=2,
dimnames=list(Treatment=c("Placebo","Aspirin"),
"Myocardial Infarction"=c("Yes", "No")))
X
## Myocardial Infarction
## Treatment Yes No
## Placebo 189 10845
## Aspirin 104 10933
## [1] 1.832054
ID | Y(Yes/No) | Treatment(1/0) |
---|---|---|
1 | 1 | 0 |
2 | 0 | 1 |
3 | 1 | 1 |
4 | 1 | 0 |
… | … | … |
The data is from the book element of statistical learning
## 'data.frame': 97 obs. of 10 variables:
## $ lcavol : num -0.58 -0.994 -0.511 -1.204 0.751 ...
## $ lweight: num 2.77 3.32 2.69 3.28 3.43 ...
## $ age : int 50 58 74 58 62 50 64 58 47 63 ...
## $ lbph : num -1.39 -1.39 -1.39 -1.39 -1.39 ...
## $ svi : int 0 0 0 0 0 0 0 0 0 0 ...
## $ lcp : num -1.39 -1.39 -1.39 -1.39 -1.39 ...
## $ gleason: int 6 6 7 6 6 6 6 6 6 6 ...
## $ pgg45 : int 0 0 20 0 0 0 0 0 0 0 ...
## $ lpsa : num -0.431 -0.163 -0.163 -0.163 0.372 ...
## $ train : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
glm_binomial_logit <- glm(svi ~ lcavol, data = prostate, family = binomial(link = "logit"))
summary(glm_binomial_logit)
##
## Call:
## glm(formula = svi ~ lcavol, family = binomial(link = "logit"),
## data = prostate)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.79924 -0.48354 -0.21025 -0.04274 2.32135
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.0296 1.0429 -4.823 1.42e-06 ***
## lcavol 1.9798 0.4543 4.358 1.31e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 101.35 on 96 degrees of freedom
## Residual deviance: 64.14 on 95 degrees of freedom
## AIC: 68.14
##
## Number of Fisher Scoring iterations: 6
glm_binomial_probit <- glm(svi ~ lcavol, data = prostate, family = binomial(link = "probit"))
summary(glm_binomial_probit)
##
## Call:
## glm(formula = svi ~ lcavol, family = binomial(link = "probit"),
## data = prostate)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.77321 -0.49129 -0.16705 -0.00706 2.34194
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.9142 0.5508 -5.290 1.22e-07 ***
## lcavol 1.1486 0.2474 4.643 3.43e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 101.353 on 96 degrees of freedom
## Residual deviance: 63.654 on 95 degrees of freedom
## AIC: 67.654
##
## Number of Fisher Scoring iterations: 7
flogit <- function(x) exp(x)/(exp(x) + 1)
fprobit <- pnorm
curve(flogit, -5, 5, ylab = "f(x)", lwd = 2)
curve(fprobit, -5, 5, add = T, col = 2 , lwd = 2)
legend("topleft", c("logit", "probit"), col=c(1,2),lwd = 2)
The natural form of the probability density functuion (pdf) of a distribution in the exponential family is \[f(y) = \exp (\frac{y\theta - b(\theta)}{a(\phi)} + c(y; \phi))\]
Comments:
Canonical link is a function that links canonical parameter \(\theta\) in terms of the mean of the distribution \(\mu = E(Y|X)\)
The canonical link for each family is the one most commonly used
It arises naturally from the general formula for distributions in the exponential families.
\[f(y) = \exp (\frac{y\theta - b(\theta)}{a(\phi)} + c(y; \phi))\]
\[f(y) = \exp (\frac{y\theta - b(\theta)}{a(\phi)} + c(y; \phi))\] \[f(y) = \mu^y (1 - \mu)^{1 - y}\]
\[f(y) = \exp(\frac{y \log(\frac{\mu}{1 - \mu}) - \log(\frac{1}{1 - \mu})}{1} + 0)\]
\[f(y) = \exp (\frac{y\theta - b(\theta)}{a(\phi)} + c(y; \phi))\] \[f(y) = \frac{\mu^y \exp (-\mu)}{y!}\]
\[f(y) = \exp(\frac{y \log(\mu) - \mu}{1} - \log(y!))\]
Family | Canonical link | Canonical parameter \(\theta\) | Link function | Mean function |
---|---|---|---|---|
Gaussian | identity | \(\theta = \mu\) | \(X^\top\beta = \mu\) | \(\mu = X^\top\beta\) |
Binomial | logit | \(\theta = \log(\frac{\mu}{1 - \mu})\) | \(X^\top\beta = \log(\frac{\mu}{1 - \mu})\) | \(\mu = \frac{X^\top\beta}{1 + X^\top\beta}\) |
Poisson | log | \(\theta = \log(\mu)\) | \(X^\top\beta = \log(\mu)\) | \(\mu = \exp(X^\top\beta)\) |
glm_binomial_logit <- glm(svi ~ lcavol, data = prostate, family = binomial())
summary(glm_binomial_logit)
##
## Call:
## glm(formula = svi ~ lcavol, family = binomial(), data = prostate)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.79924 -0.48354 -0.21025 -0.04274 2.32135
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.0296 1.0429 -4.823 1.42e-06 ***
## lcavol 1.9798 0.4543 4.358 1.31e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 101.35 on 96 degrees of freedom
## Residual deviance: 64.14 on 95 degrees of freedom
## AIC: 68.14
##
## Number of Fisher Scoring iterations: 6
\[y|X \sim f(\mu)\]
\[\hat{OR} = \frac{\hat{odds}|_{x_i=1}}{\hat{odds}|_{x_i=0}} = \frac{\frac{P(y_i=1)}{P(y_i=0)}|_{x_i=1}}{\frac{P(y_i=1)}{P(y_i=0)}|_{x_i=0}} = \frac{\frac{\hat{\mu}_i}{1 - \hat{\mu}_i}|_{x_i=1}}{\frac{\hat{\mu}_i}{1 - \hat{\mu}_i}|_{x_i=0}} = \frac{\exp(\hat{\beta}_0 + 1 \times \hat{\beta}_1)}{\exp(\hat{\beta}_0 + 0 \times \hat{\beta}_1)} = \exp(\hat{\beta}_1) \]
\[\hat{OR} = \frac{\hat{odds}|_{x_i=1}}{\hat{odds}|_{x_i=0}} = \frac{\frac{\hat{\mu}_i}{1 - \hat{\mu}_i}|_{x_i=1}}{\frac{\hat{\mu}_i}{1 - \hat{\mu}_i}|_{x_i=0}} = \frac{\exp(\hat{\beta}_0 + 1 \times \hat{\beta}_1 + Z_i^\top\gamma)}{\exp(\hat{\beta}_0 + 0 \times \hat{\beta}_1 + Z_i^\top\gamma)} = \exp(\hat{\beta}_1) \]
glm_binomial3 <- glm(svi ~ lcavol + lweight + age, data = prostate, family = binomial())
summary(glm_binomial3) # display results
##
## Call:
## glm(formula = svi ~ lcavol + lweight + age, family = binomial(),
## data = prostate)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.80253 -0.51277 -0.20249 -0.03868 2.29319
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.26605 3.85789 -1.624 0.104
## lcavol 1.99773 0.46970 4.253 2.11e-05 ***
## lweight -0.24193 0.91732 -0.264 0.792
## age 0.03203 0.05097 0.628 0.530
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 101.353 on 96 degrees of freedom
## Residual deviance: 63.742 on 93 degrees of freedom
## AIC: 71.742
##
## Number of Fisher Scoring iterations: 6
## Waiting for profiling to be done...
## 2.5 % 97.5 %
## (Intercept) -14.28625935 1.0304960
## lcavol 1.18463130 3.0485710
## lweight -2.09281024 1.5785799
## age -0.06780234 0.1336198
## (Intercept) lcavol lweight age
## 0.001899712 7.372329880 0.785110058 1.032546950
## Waiting for profiling to be done...
## 2.5 % 97.5 %
## (Intercept) 6.245344e-07 2.802456
## lcavol 3.269481e+00 21.085191
## lweight 1.233400e-01 4.848066
## age 9.344452e-01 1.142958
## 1 2 3 4 5
## 0.0015117583 0.0007476917 0.0038053593 0.0004963039 0.0263507563
## 6
## 0.0005295007
## 1 2 3 4 5 6
## -0.05500731 -0.03867742 -0.08732260 -0.03150959 -0.23110240 -0.03254661
## Start: AIC=55.72
## svi ~ lcavol + lweight + age + lbph + lcp + gleason + pgg45 +
## lpsa
##
## Df Deviance AIC
## - lcavol 1 37.760 53.760
## - pgg45 1 37.772 53.772
## - lweight 1 37.782 53.782
## - gleason 1 37.794 53.794
## - lbph 1 38.248 54.248
## - age 1 38.907 54.907
## <none> 37.721 55.721
## - lcp 1 48.013 64.013
## - lpsa 1 49.249 65.249
##
## Step: AIC=53.76
## svi ~ lweight + age + lbph + lcp + gleason + pgg45 + lpsa
##
## Df Deviance AIC
## - pgg45 1 37.809 51.809
## - gleason 1 37.839 51.839
## - lweight 1 37.857 51.857
## - lbph 1 38.248 52.248
## - age 1 38.912 52.912
## <none> 37.760 53.760
## - lcp 1 50.549 64.549
## - lpsa 1 51.558 65.558
##
## Step: AIC=51.81
## svi ~ lweight + age + lbph + lcp + gleason + lpsa
##
## Df Deviance AIC
## - lweight 1 37.898 49.898
## - gleason 1 38.076 50.076
## - lbph 1 38.398 50.398
## - age 1 38.920 50.920
## <none> 37.809 51.809
## - lpsa 1 52.945 64.945
## - lcp 1 54.318 66.318
##
## Step: AIC=49.9
## svi ~ age + lbph + lcp + gleason + lpsa
##
## Df Deviance AIC
## - gleason 1 38.102 48.102
## - lbph 1 38.758 48.758
## - age 1 39.002 49.002
## <none> 37.898 49.898
## - lpsa 1 53.462 63.462
## - lcp 1 54.379 64.379
##
## Step: AIC=48.1
## svi ~ age + lbph + lcp + lpsa
##
## Df Deviance AIC
## - lbph 1 38.959 46.959
## - age 1 39.066 47.066
## <none> 38.102 48.102
## - lpsa 1 53.509 61.509
## - lcp 1 57.700 65.700
##
## Step: AIC=46.96
## svi ~ age + lcp + lpsa
##
## Df Deviance AIC
## - age 1 39.653 45.653
## <none> 38.959 46.959
## - lpsa 1 54.840 60.840
## - lcp 1 59.815 65.815
##
## Step: AIC=45.65
## svi ~ lcp + lpsa
##
## Df Deviance AIC
## <none> 39.653 45.653
## - lpsa 1 55.355 59.355
## - lcp 1 60.781 64.781
##
## Call: glm(formula = svi ~ lcp + lpsa, family = binomial(), data = prostate)
##
## Coefficients:
## (Intercept) lcp lpsa
## -8.532 1.343 2.198
##
## Degrees of Freedom: 96 Total (i.e. Null); 94 Residual
## Null Deviance: 101.4
## Residual Deviance: 39.65 AIC: 45.65
When the outcome categories \(1,2,\ldots,J\) are unordered, a multinomial logit model defines \(J - 1\) logits with the form (treating \(j=1\) as the reference category):
\[\log (odds_{j1}) = \log(\frac{\mu_{ij}}{\mu_{i1}}) = \log(\frac{P(Y_i = j)}{P(Y_i = 1)}) = \beta_{0j} + \sum_k \beta_{kj} X_{ik}\]
ml <- read.csv("https://raw.githubusercontent.com/rlowrance/re/master/hsbdemo.csv",row.names=1)
head(ml)
## id female ses schtyp prog read write math science socst
## 1 45 female low public vocation 34 35 41 29 26
## 2 108 male middle public general 34 33 41 36 36
## 3 15 male high public vocation 39 39 44 26 42
## 4 67 male low public vocation 37 37 42 33 32
## 5 153 male middle public vocation 39 31 40 39 51
## 6 51 female high public general 42 36 42 31 39
## honors awards cid
## 1 not enrolled 0 1
## 2 not enrolled 0 1
## 3 not enrolled 0 1
## 4 not enrolled 0 1
## 5 not enrolled 0 1
## 6 not enrolled 0 1
ml$prog2 <- relevel(ml$prog, ref = "vocation")
ml$ses2 <- relevel(ml$ses, ref = "low")
library("nnet")
test <- multinom(prog2 ~ ses2 + write, data = ml)
## # weights: 15 (8 variable)
## initial value 219.722458
## iter 10 value 179.993001
## final value 179.981726
## converged
## Call:
## multinom(formula = prog2 ~ ses2 + write, data = ml)
##
## Coefficients:
## (Intercept) ses2high ses2middle write
## academic -5.218216 0.9826776 -0.2913902 0.11360290
## general -2.366021 -0.1801588 -0.8246874 0.05567435
##
## Std. Errors:
## (Intercept) ses2high ses2middle write
## academic 1.163550 0.5955674 0.4763737 0.02221991
## general 1.174249 0.6484555 0.4901231 0.02333137
##
## Residual Deviance: 359.9635
## AIC: 375.9635
\[log (\frac{P(Y > j | X)}{P(Y \le j | X)}) = log (\frac{P(Y > j | X)}{1 - P(Y > j | X)}) = \alpha_j + \beta^T X\] For all \(j = 1, \ldots, J - 1\).
## apply pared public gpa
## 1 very likely 0 0 3.26
## 2 somewhat likely 1 0 3.21
## 3 unlikely 1 1 3.94
## 4 somewhat likely 0 0 2.81
## 5 somewhat likely 0 0 2.53
## 6 unlikely 0 1 2.59
library(MASS)
dat$apply2 <- relevel(dat$apply, ref = "unlikely")
m <- polr(apply2 ~ pared + public + gpa, data = dat, Hess=TRUE)
summary(m)
## Call:
## polr(formula = apply2 ~ pared + public + gpa, data = dat, Hess = TRUE)
##
## Coefficients:
## Value Std. Error t value
## pared 1.04769 0.2658 3.9418
## public -0.05879 0.2979 -0.1974
## gpa 0.61594 0.2606 2.3632
##
## Intercepts:
## Value Std. Error t value
## unlikely|somewhat likely 2.2039 0.7795 2.8272
## somewhat likely|very likely 4.2994 0.8043 5.3453
##
## Residual Deviance: 717.0249
## AIC: 727.0249
## pared public gpa
## 2.8510579 0.9429088 1.8513972
One unit increase in parental education, from 0 (Low) to 1 (High), the odds of “very likely” applying versus “somewhat likely” or “unlikely” applying combined are 2.85 greater .
The odds “very likely” or “somewhat likely” applying versus “unlikely” applying is 2.85 times greater .
For gpa, when a student’s gpa increases 1 unit, the odds of moving from “unlikely” applying to “somewhat likely” or “very likely” applying (or from the lower and middle categories to the high category) are multiplied by 1.85.
## Classes and Methods for R developed in the
## Political Science Computational Laboratory
## Department of Political Science
## Stanford University
## Simon Jackman
## hurdle and zeroinfl functions by Achim Zeileis
## art fem mar kid5 phd ment
## 1 0 Men Married 0 2.52 7
## 2 0 Women Single 0 2.05 6
## 3 0 Women Single 0 3.75 6
## 4 0 Men Married 1 1.18 3
## 5 0 Women Single 0 3.75 26
## 6 0 Women Married 2 3.59 2
glm_poisson <- glm(art ~ . , data = bioChemists, family = poisson())
summary(glm_poisson) # display results
##
## Call:
## glm(formula = art ~ ., family = poisson(), data = bioChemists)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.5672 -1.5398 -0.3660 0.5722 5.4467
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.304617 0.102981 2.958 0.0031 **
## femWomen -0.224594 0.054613 -4.112 3.92e-05 ***
## marMarried 0.155243 0.061374 2.529 0.0114 *
## kid5 -0.184883 0.040127 -4.607 4.08e-06 ***
## phd 0.012823 0.026397 0.486 0.6271
## ment 0.025543 0.002006 12.733 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 1817.4 on 914 degrees of freedom
## Residual deviance: 1634.4 on 909 degrees of freedom
## AIC: 3314.1
##
## Number of Fisher Scoring iterations: 5
## Waiting for profiling to be done...
## 2.5 % 97.5 %
## (Intercept) 0.10155469 0.50526581
## femWomen -0.33193995 -0.11781821
## marMarried 0.03520230 0.27584836
## kid5 -0.26422746 -0.10689876
## phd -0.03881122 0.06467552
## ment 0.02154163 0.02940716
## (Intercept) femWomen marMarried kid5 phd ment
## 1.3561053 0.7988403 1.1679422 0.8312018 1.0129051 1.0258718
## Waiting for profiling to be done...
## 2.5 % 97.5 %
## (Intercept) 1.1068905 1.6574260
## femWomen 0.7175304 0.8888576
## marMarried 1.0358292 1.3176480
## kid5 0.7677989 0.8986166
## phd 0.9619323 1.0668128
## ment 1.0217753 1.0298438
## 1 2 3 4 5 6
## -1.977948 -1.610197 -1.627842 -1.698833 -2.101566 -1.388027
## Start: AIC=3314.11
## art ~ fem + mar + kid5 + phd + ment
##
## Df Deviance AIC
## - phd 1 1634.6 3312.3
## <none> 1634.4 3314.1
## - mar 1 1640.8 3318.5
## - fem 1 1651.5 3329.2
## - kid5 1 1656.5 3334.2
## - ment 1 1766.2 3444.0
##
## Step: AIC=3312.35
## art ~ fem + mar + kid5 + ment
##
## Df Deviance AIC
## <none> 1634.6 3312.3
## - mar 1 1640.8 3316.6
## - fem 1 1651.8 3327.5
## - kid5 1 1656.7 3332.4
## - ment 1 1776.7 3452.5
##
## Call: glm(formula = art ~ fem + mar + kid5 + ment, family = poisson(),
## data = bioChemists)
##
## Coefficients:
## (Intercept) femWomen marMarried kid5 ment
## 0.34517 -0.22530 0.15218 -0.18499 0.02576
##
## Degrees of Freedom: 914 Total (i.e. Null); 910 Residual
## Null Deviance: 1817
## Residual Deviance: 1635 AIC: 3312
\(\beta_1\) is difference in log of expected counts when \(X\) increases by 1 unit.
\(\exp (\beta_1)\) is multiplicative effect of the mean of \(Y\) when \(X\) increase by 1 unit.
If \(\beta_1 > 0\), then \(\exp(\beta_1) > 1\). Therefore, \(\mu_1 > \mu_0\); \(\mu_1\) is \(\exp (\beta_1)\) times larger than \(\mu_0\).
If \(\beta_1 < 0\), then \(\exp(\beta_1) < 1\). Therefore, \(\mu_1 < \mu_0\); \(\mu_1\) is \(\exp (\beta_1)\) times smaller than \(\mu_0\).
burnData <- read.csv("https://caleb-huo.github.io/teaching/data/Burn/burn.csv", row.names = 1)
head(burnData)
## treatment gender race percent head buttock trunk upper_leg lower_leg
## 1 0 0 0 15 0 0 1 1 0
## 2 0 0 1 20 0 0 1 0 0
## 3 0 0 1 15 0 0 0 1 1
## 4 0 0 0 20 1 0 1 0 0
## 5 0 0 1 70 1 1 1 1 0
## 6 0 0 1 20 1 0 1 0 0
## resp_tract type numfup infection
## 1 0 2 12 0
## 2 0 4 9 0
## 3 0 2 7 1
## 4 0 2 29 0
## 5 0 2 4 1
## 6 0 4 8 1
glm_poisson_rate <- glm(infection ~ . - numfup + offset(log(numfup)), data = burnData, family = poisson())
summary(glm_poisson_rate) # display results
##
## Call:
## glm(formula = infection ~ . - numfup + offset(log(numfup)), family = poisson(),
## data = burnData)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.0008 -0.8023 -0.5350 1.0419 2.5011
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.603081 1.201833 -4.662 3.13e-06 ***
## treatment -0.622350 0.316669 -1.965 0.0494 *
## gender -0.724283 0.396879 -1.825 0.0680 .
## race 2.155901 1.016616 2.121 0.0340 *
## percent 0.003402 0.009854 0.345 0.7299
## head -0.023392 0.353237 -0.066 0.9472
## buttock 0.612703 0.421929 1.452 0.1465
## trunk 0.058203 0.500184 0.116 0.9074
## upper_leg -0.472179 0.373669 -1.264 0.2064
## lower_leg -0.299164 0.387843 -0.771 0.4405
## resp_tract -0.042332 0.364401 -0.116 0.9075
## type -0.045439 0.171668 -0.265 0.7912
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 207.46 on 153 degrees of freedom
## Residual deviance: 184.46 on 142 degrees of freedom
## AIC: 304.46
##
## Number of Fisher Scoring iterations: 7
\[Y|\lambda \sim Poisson(\lambda)\] \[\lambda \sim gamma(\alpha, \beta)\] \[Y \sim NB(\alpha, 1/(\beta + 1))\]
The canonical link for the negative binomial distribution is rather complicated and hard to interpret, so it is rarely used. Instead to facilitate comparisons with the Poisson generalized linear model, a log link is typically used.
\[b''(\theta) = \frac{\partial^2 b}{\partial p^2} \times (\frac{\partial p}{\partial \theta})^2 + \frac{\partial b}{\partial p} \times \frac{\partial^2 p}{\partial \theta^2} = \frac{r(1-p)}{p^2} \]
\[b''(\theta) = \frac{\partial^2 b}{\partial p^2} \times (\frac{\partial p}{\partial \theta})^2 + \frac{\partial b}{\partial p} \times \frac{\partial^2 p}{\partial \theta^2} = \mu + \alpha \mu^2 \] The variance is always larger than the mean for the negative binomial.
NB is possion if \(\alpha = 0\)
##
## Call:
## glm.nb(formula = art ~ ., data = bioChemists, init.theta = 2.264387695,
## link = log)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.1678 -1.3617 -0.2806 0.4476 3.4524
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.256144 0.137348 1.865 0.062191 .
## femWomen -0.216418 0.072636 -2.979 0.002887 **
## marMarried 0.150489 0.082097 1.833 0.066791 .
## kid5 -0.176415 0.052813 -3.340 0.000837 ***
## phd 0.015271 0.035873 0.426 0.670326
## ment 0.029082 0.003214 9.048 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Negative Binomial(2.2644) family taken to be 1)
##
## Null deviance: 1109.0 on 914 degrees of freedom
## Residual deviance: 1004.3 on 909 degrees of freedom
## AIC: 3135.9
##
## Number of Fisher Scoring iterations: 1
##
##
## Theta: 2.264
## Std. Err.: 0.271
##
## 2 x log-likelihood: -3121.917
## Warning in glm.nb(infection ~ . - numfup + offset(log(numfup)), data =
## burnData): alternation limit reached
##
## Call:
## glm.nb(formula = infection ~ . - numfup + offset(log(numfup)),
## data = burnData, init.theta = 0.6478173549, link = log)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.7731 -0.8564 -0.5674 0.5065 2.3192
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -5.285673 1.300638 -4.064 4.83e-05 ***
## treatment -0.840907 0.371826 -2.262 0.0237 *
## gender -1.037906 0.464514 -2.234 0.0255 *
## race 2.620297 1.081204 2.423 0.0154 *
## percent 0.005769 0.011903 0.485 0.6279
## head -0.208180 0.410433 -0.507 0.6120
## buttock 0.899329 0.500276 1.798 0.0722 .
## trunk 0.339320 0.565532 0.600 0.5485
## upper_leg -0.692250 0.446461 -1.551 0.1210
## lower_leg -0.180517 0.428152 -0.422 0.6733
## resp_tract -0.191947 0.437243 -0.439 0.6607
## type -0.149369 0.201242 -0.742 0.4579
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for Negative Binomial(0.6478) family taken to be 1)
##
## Null deviance: 149.78 on 153 degrees of freedom
## Residual deviance: 126.24 on 142 degrees of freedom
## AIC: 299.93
##
## Number of Fisher Scoring iterations: 1
##
##
## Theta: 0.648
## Std. Err.: 0.246
## Warning while fitting theta: alternation limit reached
##
## 2 x log-likelihood: -273.928
For a change of \(\Delta\) in \(x\), the expected count increases by a factor of \(\exp(\gamma \Delta)\) holding other variables \(Z = z\) constant.
\[ P(y_i = j)= \begin{cases} \pi_i + (1 - \pi_i) \exp (-\mu_i), \mbox{if } j=0\\ (1 - \pi_i) \frac{\mu_i^{y_i} \exp(-\mu_i)}{y_i!}, \mbox{if } j>0 \end{cases} \] - The Poisson component is \[\log(\mu_i) = \beta_0x_{10} + \beta_1 x_{1i} + \ldots + \beta_px_{1p}\] - The zero proportion component is \[\pi_i = \frac{\exp(\gamma_0z_{10} + \gamma_1 z_{1i} + \ldots + \gamma_pz_{1p})}{1 + \exp(\gamma_0z_{10} + \gamma_1 z_{1i} + \ldots + \gamma_pz_{1p})}\]
fm_pois <- glm(art ~ ., data = bioChemists, family = poisson)
## without inflation
fm_zip <- zeroinfl(art ~ . | 1, data = bioChemists)
## with simple inflation (no regressors for zero component)
fm_zip2 <- zeroinfl(art ~ . | ., data = bioChemists)
## inflation with regressors
## ("art ~ . | ." is "art ~ fem + mar + kid5 + phd + ment | fem + mar + kid5 + phd + ment")
fm_nb <- MASS::glm.nb(art ~ ., data = bioChemists)
## without inflation
fm_zinb <- zeroinfl(art ~ . | 1, data = bioChemists, dist = "negbin")
## with simple inflation (no regressors for zero component)
fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin")
## inflation with regressors
## ("art ~ . | ." is "art ~ fem + mar + kid5 + phd + ment | fem + mar + kid5 + phd + ment")
We can generalize these conditions for linear regression model
How about non linear systematic component: