Zhiguang Huo (Caleb)
Monday October 21, 2019
\[Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + \ldots + \beta_pX_p + \varepsilon, \text{with }\varepsilon \sim N(0, \sigma^2)\]
Intepretation of linear model coefficients:
How about other data types?
\[Y = X^\top\beta + \varepsilon, \text{with }\varepsilon \sim N(0, \sigma^2)\]
We can generalize these conditions for linear regression model
| Link | \(X^\top\beta = g(\mu)\) | \(\mu = g^{-1} (X^\top\beta)\) | 
|---|---|---|
| identity | \(X^\top\beta = \mu\) | \(\mu = X^\top\beta\) | 
| log | \(X^\top\beta = \log(\mu)\) | \(\mu = \exp(X^\top\beta)\) | 
| logit | \(X^\top\beta = \log \frac{\mu}{1 - \mu}\) | \(\mu = \frac{\exp(X^\top\beta)}{1 + \exp(X^\top\beta)}\) | 
| probit | \(X^\top\beta = \Phi^{-1}({\mu})\) | \(\mu = \Phi (X^\top\beta)\) | 
X <- matrix(c(189, 104, 10845, 10933), nrow=2,
            dimnames=list(Treatment=c("Placebo","Aspirin"), 
                          "Myocardial Infarction"=c("Yes", "No")))
X##          Myocardial Infarction
## Treatment Yes    No
##   Placebo 189 10845
##   Aspirin 104 10933
## [1] 1.832054
| ID | Myocardial Infarction (Yes/No) | Treatment(1/0) | 
|---|---|---|
| 1 | 1 | 0 | 
| 2 | 0 | 1 | 
| 3 | 1 | 1 | 
| 4 | 1 | 0 | 
| … | … | … | 
The data is from the book element of statistical learning
## 'data.frame':    97 obs. of  10 variables:
##  $ lcavol : num  -0.58 -0.994 -0.511 -1.204 0.751 ...
##  $ lweight: num  2.77 3.32 2.69 3.28 3.43 ...
##  $ age    : int  50 58 74 58 62 50 64 58 47 63 ...
##  $ lbph   : num  -1.39 -1.39 -1.39 -1.39 -1.39 ...
##  $ svi    : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ lcp    : num  -1.39 -1.39 -1.39 -1.39 -1.39 ...
##  $ gleason: int  6 6 7 6 6 6 6 6 6 6 ...
##  $ pgg45  : int  0 0 20 0 0 0 0 0 0 0 ...
##  $ lpsa   : num  -0.431 -0.163 -0.163 -0.163 0.372 ...
##  $ train  : logi  TRUE TRUE TRUE TRUE TRUE TRUE ...
glm_binomial_logit <- glm(svi ~ lcavol, data = prostate,  family = binomial(link = "logit"))
summary(glm_binomial_logit)## 
## Call:
## glm(formula = svi ~ lcavol, family = binomial(link = "logit"), 
##     data = prostate)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.79924  -0.48354  -0.21025  -0.04274   2.32135  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -5.0296     1.0429  -4.823 1.42e-06 ***
## lcavol        1.9798     0.4543   4.358 1.31e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 101.35  on 96  degrees of freedom
## Residual deviance:  64.14  on 95  degrees of freedom
## AIC: 68.14
## 
## Number of Fisher Scoring iterations: 6
glm_binomial_probit <- glm(svi ~ lcavol, data = prostate,  family = binomial(link = "probit"))
summary(glm_binomial_probit)## 
## Call:
## glm(formula = svi ~ lcavol, family = binomial(link = "probit"), 
##     data = prostate)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.77321  -0.49129  -0.16705  -0.00706   2.34194  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -2.9142     0.5508  -5.290 1.22e-07 ***
## lcavol        1.1486     0.2474   4.643 3.43e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 101.353  on 96  degrees of freedom
## Residual deviance:  63.654  on 95  degrees of freedom
## AIC: 67.654
## 
## Number of Fisher Scoring iterations: 7
flogit <- function(x) exp(x)/(exp(x) + 1)
fprobit <- pnorm 
curve(flogit, -5, 5, ylab = "f(x)", lwd = 2)
curve(fprobit, -5, 5, add = T, col = 2 , lwd = 2)
legend("topleft", c("logit", "probit"), col=c(1,2),lwd = 2)The natural form of the probability density functuion (pdf) of a distribution in the exponential family is \[f(y) = \exp (\frac{y\theta - b(\theta)}{a(\phi)} + c(y; \phi))\]
Comments:
Canonical link is a function that links canonical parameter \(\theta\) in terms of the mean of the distribution \(\mu = E(Y|X)\)
The canonical link for each family is the one most commonly used
It arises naturally from the general formula for distributions in the exponential families.
\[f(y) = \exp (\frac{y\theta - b(\theta)}{a(\phi)} + c(y; \phi))\]
\[f(y) = \exp (\frac{y\theta - b(\theta)}{a(\phi)} + c(y; \phi))\] \[f(y) = \mu^y (1 - \mu)^{1 - y}\]
\[f(y) = \exp(\frac{y \log(\frac{\mu}{1 - \mu}) - \log(\frac{1}{1 - \mu})}{1} + 0)\]
\[f(y) = \exp (\frac{y\theta - b(\theta)}{a(\phi)} + c(y; \phi))\] \[f(y) = \frac{\mu^y \exp (-\mu)}{y!}\]
\[f(y) = \exp(\frac{y \log(\mu) - \mu}{1} - \log(y!))\]
| Family | Canonical link | Canonical parameter \(\theta\) | Link function | Mean function | 
|---|---|---|---|---|
| Gaussian | identity | \(\theta = \mu\) | \(X^\top\beta = \mu\) | \(\mu = X^\top\beta\) | 
| Binomial | logit | \(\theta = \log(\frac{\mu}{1 - \mu})\) | \(X^\top\beta = \log(\frac{\mu}{1 - \mu})\) | \(\mu = \frac{\exp (X^\top\beta)}{1 + \exp (X^\top\beta)}\) | 
| Poisson | log | \(\theta = \log(\mu)\) | \(X^\top\beta = \log(\mu)\) | \(\mu = \exp(X^\top\beta)\) | 
glm_binomial_logit <- glm(svi ~ lcavol, data = prostate,  family = binomial())
summary(glm_binomial_logit)## 
## Call:
## glm(formula = svi ~ lcavol, family = binomial(), data = prostate)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.79924  -0.48354  -0.21025  -0.04274   2.32135  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -5.0296     1.0429  -4.823 1.42e-06 ***
## lcavol        1.9798     0.4543   4.358 1.31e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 101.35  on 96  degrees of freedom
## Residual deviance:  64.14  on 95  degrees of freedom
## AIC: 68.14
## 
## Number of Fisher Scoring iterations: 6
\[\hat{OR} = \frac{\hat{odds}|_{x_i=1}}{\hat{odds}|_{x_i=0}} = \frac{\frac{P(y_i=1)}{P(y_i=0)}|_{x_i=1}}{\frac{P(y_i=1)}{P(y_i=0)}|_{x_i=0}} = \frac{\frac{\hat{\mu}_i}{1 - \hat{\mu}_i}|_{x_i=1}}{\frac{\hat{\mu}_i}{1 - \hat{\mu}_i}|_{x_i=0}} = \frac{\exp(\hat{\beta}_0 + 1 \times \hat{\beta}_1)}{\exp(\hat{\beta}_0 + 0 \times \hat{\beta}_1)} = \exp(\hat{\beta}_1) \]
\[\hat{OR} = \frac{\hat{odds}|_{x_i=1}}{\hat{odds}|_{x_i=0}} = \frac{\frac{\hat{\mu}_i}{1 - \hat{\mu}_i}|_{x_i=1}}{\frac{\hat{\mu}_i}{1 - \hat{\mu}_i}|_{x_i=0}} = \frac{\exp(\hat{\beta}_0 + 1 \times \hat{\beta}_1 + Z_i^\top\gamma)}{\exp(\hat{\beta}_0 + 0 \times \hat{\beta}_1 + Z_i^\top\gamma)} = \exp(\hat{\beta}_1) \]
glm_binomial3 <- glm(svi ~ lcavol + lweight + age, data = prostate,  family = binomial())
summary(glm_binomial3) # display results## 
## Call:
## glm(formula = svi ~ lcavol + lweight + age, family = binomial(), 
##     data = prostate)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.80253  -0.51277  -0.20249  -0.03868   2.29319  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -6.26605    3.85789  -1.624    0.104    
## lcavol       1.99773    0.46970   4.253 2.11e-05 ***
## lweight     -0.24193    0.91732  -0.264    0.792    
## age          0.03203    0.05097   0.628    0.530    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 101.353  on 96  degrees of freedom
## Residual deviance:  63.742  on 93  degrees of freedom
## AIC: 71.742
## 
## Number of Fisher Scoring iterations: 6
## Waiting for profiling to be done...
##                    2.5 %    97.5 %
## (Intercept) -14.28625935 1.0304960
## lcavol        1.18463130 3.0485710
## lweight      -2.09281024 1.5785799
## age          -0.06780234 0.1336198
## (Intercept)      lcavol     lweight         age 
## 0.001899712 7.372329880 0.785110058 1.032546950
## Waiting for profiling to be done...
##                    2.5 %    97.5 %
## (Intercept) 6.245344e-07  2.802456
## lcavol      3.269481e+00 21.085191
## lweight     1.233400e-01  4.848066
## age         9.344452e-01  1.142958
## Start:  AIC=55.72
## svi ~ lcavol + lweight + age + lbph + lcp + gleason + pgg45 + 
##     lpsa
## 
##           Df Deviance    AIC
## - lcavol   1   37.760 53.760
## - pgg45    1   37.772 53.772
## - lweight  1   37.782 53.782
## - gleason  1   37.794 53.794
## - lbph     1   38.248 54.248
## - age      1   38.907 54.907
## <none>         37.721 55.721
## - lcp      1   48.013 64.013
## - lpsa     1   49.249 65.249
## 
## Step:  AIC=53.76
## svi ~ lweight + age + lbph + lcp + gleason + pgg45 + lpsa
## 
##           Df Deviance    AIC
## - pgg45    1   37.809 51.809
## - gleason  1   37.839 51.839
## - lweight  1   37.857 51.857
## - lbph     1   38.248 52.248
## - age      1   38.912 52.912
## <none>         37.760 53.760
## - lcp      1   50.549 64.549
## - lpsa     1   51.558 65.558
## 
## Step:  AIC=51.81
## svi ~ lweight + age + lbph + lcp + gleason + lpsa
## 
##           Df Deviance    AIC
## - lweight  1   37.898 49.898
## - gleason  1   38.076 50.076
## - lbph     1   38.398 50.398
## - age      1   38.920 50.920
## <none>         37.809 51.809
## - lpsa     1   52.945 64.945
## - lcp      1   54.318 66.318
## 
## Step:  AIC=49.9
## svi ~ age + lbph + lcp + gleason + lpsa
## 
##           Df Deviance    AIC
## - gleason  1   38.102 48.102
## - lbph     1   38.758 48.758
## - age      1   39.002 49.002
## <none>         37.898 49.898
## - lpsa     1   53.462 63.462
## - lcp      1   54.379 64.379
## 
## Step:  AIC=48.1
## svi ~ age + lbph + lcp + lpsa
## 
##        Df Deviance    AIC
## - lbph  1   38.959 46.959
## - age   1   39.066 47.066
## <none>      38.102 48.102
## - lpsa  1   53.509 61.509
## - lcp   1   57.700 65.700
## 
## Step:  AIC=46.96
## svi ~ age + lcp + lpsa
## 
##        Df Deviance    AIC
## - age   1   39.653 45.653
## <none>      38.959 46.959
## - lpsa  1   54.840 60.840
## - lcp   1   59.815 65.815
## 
## Step:  AIC=45.65
## svi ~ lcp + lpsa
## 
##        Df Deviance    AIC
## <none>      39.653 45.653
## - lpsa  1   55.355 59.355
## - lcp   1   60.781 64.781
## 
## Call:  glm(formula = svi ~ lcp + lpsa, family = binomial(), data = prostate)
## 
## Coefficients:
## (Intercept)          lcp         lpsa  
##      -8.532        1.343        2.198  
## 
## Degrees of Freedom: 96 Total (i.e. Null);  94 Residual
## Null Deviance:       101.4 
## Residual Deviance: 39.65     AIC: 45.65
When the outcome categories \(1,2,\ldots,J\) are unordered, a multinomial logit model defines \(J - 1\) logits with the form (treating \(j=1\) as the reference category):
\[\log (odds_{j1}) = \log(\frac{\mu_{ij}}{\mu_{i1}}) = \log(\frac{P(Y_i = j)}{P(Y_i = 1)}) = \beta_{0j} + \sum_k \beta_{kj} X_{ik}\]
ml <- read.csv("https://caleb-huo.github.io/teaching/data/mlogistic_program/hsbdemo.csv",row.names=1)
head(ml)##    id female    ses schtyp     prog read write math science socst
## 1  45 female    low public vocation   34    35   41      29    26
## 2 108   male middle public  general   34    33   41      36    36
## 3  15   male   high public vocation   39    39   44      26    42
## 4  67   male    low public vocation   37    37   42      33    32
## 5 153   male middle public vocation   39    31   40      39    51
## 6  51 female   high public  general   42    36   42      31    39
##         honors awards cid
## 1 not enrolled      0   1
## 2 not enrolled      0   1
## 3 not enrolled      0   1
## 4 not enrolled      0   1
## 5 not enrolled      0   1
## 6 not enrolled      0   1
ml$prog2 <- relevel(ml$prog, ref = "vocation")
ml$ses2 <- relevel(ml$ses, ref = "low")
library("nnet")
test <- multinom(prog2 ~ ses2 + write, data = ml)## # weights:  15 (8 variable)
## initial  value 219.722458 
## iter  10 value 179.993001
## final  value 179.981726 
## converged
## Call:
## multinom(formula = prog2 ~ ses2 + write, data = ml)
## 
## Coefficients:
##          (Intercept)   ses2high ses2middle      write
## academic   -5.218216  0.9826776 -0.2913902 0.11360290
## general    -2.366021 -0.1801588 -0.8246874 0.05567435
## 
## Std. Errors:
##          (Intercept)  ses2high ses2middle      write
## academic    1.163550 0.5955674  0.4763737 0.02221991
## general     1.174249 0.6484555  0.4901231 0.02333137
## 
## Residual Deviance: 359.9635 
## AIC: 375.9635
\[\log (\frac{P(Y > j | X)}{P(Y \le j | X)}) = \log (\frac{P(Y > j | X)}{1 - P(Y > j | X)}) = \alpha_j + \beta^T X\] For all \(j = 1, \ldots, J - 1\).
\[\log (\frac{P(Y > 1 | X)}{P(Y \le 1 | X)}) = \log (\frac{P(Y = 2 | X) + P(Y = 3 | X)}{P(Y = 1 | X)}) = \alpha_1 + \beta^T X\]
\[\log (\frac{P(Y > 2 | X)}{P(Y \le 2 | X)}) = \log (\frac{P(Y = 3 | X)}{P(Y = 1 | X) + P(Y = 2 | X)}) = \alpha_2 + \beta^T X\]
\[\log (\frac{P(Y > j | X)}{P(Y \le j | X)}) = \log (\frac{P(Y > j | X)}{1 - P(Y > j | X)}) = \alpha_j + \beta^T X\] For all \(j = 1, \ldots, J - 1\).
##             apply pared public  gpa
## 1     very likely     0      0 3.26
## 2 somewhat likely     1      0 3.21
## 3        unlikely     1      1 3.94
## 4 somewhat likely     0      0 2.81
## 5 somewhat likely     0      0 2.53
## 6        unlikely     0      1 2.59
library(MASS)
dat$apply2 <- relevel(dat$apply, ref = "unlikely")
m <- polr(apply2 ~ pared + public + gpa, data = dat, Hess=TRUE)
summary(m)## Call:
## polr(formula = apply2 ~ pared + public + gpa, data = dat, Hess = TRUE)
## 
## Coefficients:
##           Value Std. Error t value
## pared   1.04769     0.2658  3.9418
## public -0.05879     0.2979 -0.1974
## gpa     0.61594     0.2606  2.3632
## 
## Intercepts:
##                             Value   Std. Error t value
## unlikely|somewhat likely     2.2039  0.7795     2.8272
## somewhat likely|very likely  4.2994  0.8043     5.3453
## 
## Residual Deviance: 717.0249 
## AIC: 727.0249
##     pared    public       gpa 
## 2.8510579 0.9429088 1.8513972
One unit increase in parental education, from 0 (Low) to 1 (High), the odds of “very likely” applying versus “somewhat likely” or “unlikely” applying combined are 2.85 greater .
The odds “very likely” or “somewhat likely” applying versus “unlikely” applying is 2.85 times greater .
For gpa, when a student’s gpa increases 1 unit, the odds of moving from “unlikely” applying to “somewhat likely” or “very likely” applying (or from the lower and middle categories to the high category) are multiplied by 1.85.
## Classes and Methods for R developed in the
## Political Science Computational Laboratory
## Department of Political Science
## Stanford University
## Simon Jackman
## hurdle and zeroinfl functions by Achim Zeileis
##   art   fem     mar kid5  phd ment
## 1   0   Men Married    0 2.52    7
## 2   0 Women  Single    0 2.05    6
## 3   0 Women  Single    0 3.75    6
## 4   0   Men Married    1 1.18    3
## 5   0 Women  Single    0 3.75   26
## 6   0 Women Married    2 3.59    2
glm_poisson <- glm(art ~ . , data = bioChemists,  family = poisson())
summary(glm_poisson) # display results## 
## Call:
## glm(formula = art ~ ., family = poisson(), data = bioChemists)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.5672  -1.5398  -0.3660   0.5722   5.4467  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  0.304617   0.102981   2.958   0.0031 ** 
## femWomen    -0.224594   0.054613  -4.112 3.92e-05 ***
## marMarried   0.155243   0.061374   2.529   0.0114 *  
## kid5        -0.184883   0.040127  -4.607 4.08e-06 ***
## phd          0.012823   0.026397   0.486   0.6271    
## ment         0.025543   0.002006  12.733  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 1817.4  on 914  degrees of freedom
## Residual deviance: 1634.4  on 909  degrees of freedom
## AIC: 3314.1
## 
## Number of Fisher Scoring iterations: 5
## Waiting for profiling to be done...
##                   2.5 %      97.5 %
## (Intercept)  0.10155469  0.50526581
## femWomen    -0.33193995 -0.11781821
## marMarried   0.03520230  0.27584836
## kid5        -0.26422746 -0.10689876
## phd         -0.03881122  0.06467552
## ment         0.02154163  0.02940716
## (Intercept)    femWomen  marMarried        kid5         phd        ment 
##   1.3561053   0.7988403   1.1679422   0.8312018   1.0129051   1.0258718
## Waiting for profiling to be done...
##                 2.5 %    97.5 %
## (Intercept) 1.1068905 1.6574260
## femWomen    0.7175304 0.8888576
## marMarried  1.0358292 1.3176480
## kid5        0.7677989 0.8986166
## phd         0.9619323 1.0668128
## ment        1.0217753 1.0298438
\(\beta_1\) is difference in log of expected counts when \(X\) increases by 1 unit.
\(\exp (\beta_1)\) is multiplicative effect of the mean of \(Y\) when \(X\) increases by 1 unit.
## Start:  AIC=3314.11
## art ~ fem + mar + kid5 + phd + ment
## 
##        Df Deviance    AIC
## - phd   1   1634.6 3312.3
## <none>      1634.4 3314.1
## - mar   1   1640.8 3318.5
## - fem   1   1651.5 3329.2
## - kid5  1   1656.5 3334.2
## - ment  1   1766.2 3444.0
## 
## Step:  AIC=3312.35
## art ~ fem + mar + kid5 + ment
## 
##        Df Deviance    AIC
## <none>      1634.6 3312.3
## - mar   1   1640.8 3316.6
## - fem   1   1651.8 3327.5
## - kid5  1   1656.7 3332.4
## - ment  1   1776.7 3452.5
## 
## Call:  glm(formula = art ~ fem + mar + kid5 + ment, family = poisson(), 
##     data = bioChemists)
## 
## Coefficients:
## (Intercept)     femWomen   marMarried         kid5         ment  
##     0.34517     -0.22530      0.15218     -0.18499      0.02576  
## 
## Degrees of Freedom: 914 Total (i.e. Null);  910 Residual
## Null Deviance:       1817 
## Residual Deviance: 1635  AIC: 3312
burnData <- read.csv("https://caleb-huo.github.io/teaching/data/Burn/burn.csv", row.names = 1)
head(burnData)##   treatment gender race percent head buttock trunk upper_leg lower_leg
## 1         0      0    0      15    0       0     1         1         0
## 2         0      0    1      20    0       0     1         0         0
## 3         0      0    1      15    0       0     0         1         1
## 4         0      0    0      20    1       0     1         0         0
## 5         0      0    1      70    1       1     1         1         0
## 6         0      0    1      20    1       0     1         0         0
##   resp_tract type numfup infection
## 1          0    2     12         0
## 2          0    4      9         0
## 3          0    2      7         1
## 4          0    2     29         0
## 5          0    2      4         1
## 6          0    4      8         1
glm_poisson_rate <- glm(infection ~ . - numfup + offset(log(numfup)), data = burnData,  family = poisson())
summary(glm_poisson_rate) # display results## 
## Call:
## glm(formula = infection ~ . - numfup + offset(log(numfup)), family = poisson(), 
##     data = burnData)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0008  -0.8023  -0.5350   1.0419   2.5011  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -5.603081   1.201833  -4.662 3.13e-06 ***
## treatment   -0.622350   0.316669  -1.965   0.0494 *  
## gender      -0.724283   0.396879  -1.825   0.0680 .  
## race         2.155901   1.016616   2.121   0.0340 *  
## percent      0.003402   0.009854   0.345   0.7299    
## head        -0.023392   0.353237  -0.066   0.9472    
## buttock      0.612703   0.421929   1.452   0.1465    
## trunk        0.058203   0.500184   0.116   0.9074    
## upper_leg   -0.472179   0.373669  -1.264   0.2064    
## lower_leg   -0.299164   0.387843  -0.771   0.4405    
## resp_tract  -0.042332   0.364401  -0.116   0.9075    
## type        -0.045439   0.171668  -0.265   0.7912    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for poisson family taken to be 1)
## 
##     Null deviance: 207.46  on 153  degrees of freedom
## Residual deviance: 184.46  on 142  degrees of freedom
## AIC: 304.46
## 
## Number of Fisher Scoring iterations: 7
\[Y|\lambda \sim Poisson(\lambda)\] \[\lambda \sim gamma(\alpha, \beta)\] \[Y \sim NB(\alpha, 1/(\beta + 1))\]
The canonical link for the negative binomial distribution is rather complicated and hard to interpret, so it is rarely used. Instead to facilitate comparisons with the Poisson generalized linear model, a log link is typically used.
\[b''(\theta) = \frac{\partial^2 b}{\partial p^2} \times (\frac{\partial p}{\partial \theta})^2 + \frac{\partial b}{\partial p} \times \frac{\partial^2 p}{\partial \theta^2} = \frac{r(1-p)}{p^2} \]
\[b''(\theta) = \frac{\partial^2 b}{\partial p^2} \times (\frac{\partial p}{\partial \theta})^2 + \frac{\partial b}{\partial p} \times \frac{\partial^2 p}{\partial \theta^2} = \mu + \alpha \mu^2 \] The variance is always larger than the mean for the negative binomial.
NB is possion if \(\alpha = 0\)
## 
## Call:
## glm.nb(formula = art ~ ., data = bioChemists, init.theta = 2.264387695, 
##     link = log)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.1678  -1.3617  -0.2806   0.4476   3.4524  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  0.256144   0.137348   1.865 0.062191 .  
## femWomen    -0.216418   0.072636  -2.979 0.002887 ** 
## marMarried   0.150489   0.082097   1.833 0.066791 .  
## kid5        -0.176415   0.052813  -3.340 0.000837 ***
## phd          0.015271   0.035873   0.426 0.670326    
## ment         0.029082   0.003214   9.048  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Negative Binomial(2.2644) family taken to be 1)
## 
##     Null deviance: 1109.0  on 914  degrees of freedom
## Residual deviance: 1004.3  on 909  degrees of freedom
## AIC: 3135.9
## 
## Number of Fisher Scoring iterations: 1
## 
## 
##               Theta:  2.264 
##           Std. Err.:  0.271 
## 
##  2 x log-likelihood:  -3121.917
## Warning in glm.nb(infection ~ . - numfup + offset(log(numfup)), data =
## burnData): alternation limit reached
## 
## Call:
## glm.nb(formula = infection ~ . - numfup + offset(log(numfup)), 
##     data = burnData, init.theta = 0.6478173549, link = log)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.7731  -0.8564  -0.5674   0.5065   2.3192  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -5.285673   1.300638  -4.064 4.83e-05 ***
## treatment   -0.840907   0.371826  -2.262   0.0237 *  
## gender      -1.037906   0.464514  -2.234   0.0255 *  
## race         2.620297   1.081204   2.423   0.0154 *  
## percent      0.005769   0.011903   0.485   0.6279    
## head        -0.208180   0.410433  -0.507   0.6120    
## buttock      0.899329   0.500276   1.798   0.0722 .  
## trunk        0.339320   0.565532   0.600   0.5485    
## upper_leg   -0.692250   0.446461  -1.551   0.1210    
## lower_leg   -0.180517   0.428152  -0.422   0.6733    
## resp_tract  -0.191947   0.437243  -0.439   0.6607    
## type        -0.149369   0.201242  -0.742   0.4579    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for Negative Binomial(0.6478) family taken to be 1)
## 
##     Null deviance: 149.78  on 153  degrees of freedom
## Residual deviance: 126.24  on 142  degrees of freedom
## AIC: 299.93
## 
## Number of Fisher Scoring iterations: 1
## 
## 
##               Theta:  0.648 
##           Std. Err.:  0.246 
## Warning while fitting theta: alternation limit reached 
## 
##  2 x log-likelihood:  -273.928
For a change of \(\Delta\) in \(x\), the expected count increases by a factor of \(\exp(\gamma \Delta)\) holding other variables \(Z = z\) constant.
\[ P(y_i = j)= \begin{cases} \pi_i + (1 - \pi_i) \exp (-\mu_i), \mbox{if } j=0\\ (1 - \pi_i) \frac{\mu_i^{y_i} \exp(-\mu_i)}{y_i!}, \mbox{if } j>0 \end{cases} \] - The Poisson component is \[\log(\mu_i) = \beta_0x_{10} + \beta_1 x_{1i} + \ldots + \beta_px_{1p}\] - The zero proportion component is \[\pi_i = \frac{\exp(\gamma_0z_{10} + \gamma_1 z_{1i} + \ldots + \gamma_pz_{1p})}{1 + \exp(\gamma_0z_{10} + \gamma_1 z_{1i} + \ldots + \gamma_pz_{1p})}\]
fm_pois <- glm(art ~ ., data = bioChemists, family = poisson) 
## without inflation
fm_zip <- zeroinfl(art ~ . | 1, data = bioChemists) 
## with simple inflation (no regressors for zero component)
fm_zip2 <- zeroinfl(art ~ . | ., data = bioChemists) 
## inflation with regressors
## ("art ~ . | ." is "art ~ fem + mar + kid5 + phd + ment | fem + mar + kid5 + phd + ment")
fm_nb <- MASS::glm.nb(art ~ ., data = bioChemists)
## without inflation
fm_zinb <- zeroinfl(art ~ . | 1, data = bioChemists, dist = "negbin")
## with simple inflation (no regressors for zero component)
fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin")
## inflation with regressors
## ("art ~ . | ." is "art ~ fem + mar + kid5 + phd + ment | fem + mar + kid5 + phd + ment")
We can generalize these conditions for linear regression model
How about non linear systematic component: