Inference with Regression

and regression tables…

January 29, 2025

Inference

Confidence Intervals


  • What is our goal here?

Bootstrap approach
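
A minimal sketch of the bootstrap idea for a regression slope, using simulated data rather than the V-Dem data. The names `sim_data` and `boot_slope` and the simulated coefficients are illustrative assumptions, not part of the original analysis. The sketch resamples rows with replacement, refits the model each time, and takes the middle 95% of the resulting slopes as the interval.

```r
# Simulated stand-in data (NOT the V-Dem data): y is linear in x plus noise
set.seed(123)
n <- 200
x <- rnorm(n)
y <- 0.4 - 0.08 * x + rnorm(n, sd = 0.2)
sim_data <- data.frame(x = x, y = y)

# One bootstrap replicate: resample rows with replacement, refit, keep the slope
boot_slope <- function(data) {
  rows <- sample(nrow(data), replace = TRUE)
  coef(lm(y ~ x, data = data[rows, ]))[["x"]]
}

# Repeat many times to approximate the sampling distribution of the slope
boot_slopes <- replicate(1000, boot_slope(sim_data))

# Percentile interval: the middle 95% of the bootstrap slopes
boot_ci <- quantile(boot_slopes, probs = c(0.025, 0.975))
boot_ci
```

The percentile method is only one of several ways to turn bootstrap replicates into an interval; it is used here because it is the simplest to read.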


Regression Model Output

linear_reg() %>%
  set_engine("lm") %>%
  fit(v2x_libdem ~ v2cacamps, data = modelData) %>%
  tidy(conf.int = TRUE) %>%   # add 95% confidence bounds to the output
  select(term, estimate, std.error, conf.low, conf.high) %>%
  kable(digits = 3)
term         estimate  std.error  conf.low  conf.high
(Intercept)     0.411      0.018     0.376      0.446
v2cacamps      -0.079      0.013    -0.104     -0.054

Interpretation


  • Our best guess (the point estimate) is that the regression coefficient on polarization is -0.079

  • We are 95 percent confident that it is somewhere between -0.104 and -0.054

    • Because the whole interval lies below zero, we are 95 percent confident that the coefficient is negative, NOT zero or positive.
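
As a rough check on where these bounds come from, the interval can be approximated by hand as the estimate plus or minus a critical t-value times the standard error. This sketch assumes 179 observations (the number reported for this model in the regression table later in the deck) and uses the rounded estimate and standard error, so it only matches the reported bounds up to rounding.

```r
# Values copied from the rounded model output
estimate  <- -0.079        # slope on v2cacamps
std_error <- 0.013
n_obs     <- 179           # observations in the polarization model
df        <- n_obs - 2     # intercept and slope are estimated

# Critical value for a two-sided 95% interval
t_crit <- qt(0.975, df)

# Interval: estimate minus and plus the margin of error
ci <- estimate + c(-1, 1) * t_crit * std_error
round(ci, 3)
```

The result should land within about 0.001 of the reported -0.104 and -0.054; the small gap comes from working with inputs already rounded to three digits.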

Hypothesis Testing


  • Null hypothesis: explanatory variable has no association with the response variable

    • The coefficient is 0
  • Alternative hypothesis: there is an association between the variables

Regression Model Output

linear_reg() %>%
  set_engine("lm") %>%
  fit(v2x_libdem ~ v2cacamps, data = modelData) %>%
  tidy() %>%                  # one row per term, including the p-value
  select(term, estimate, std.error, p.value) %>%
  kable(digits = 3)
term         estimate  std.error  p.value
(Intercept)     0.411      0.018        0
v2cacamps      -0.079      0.013        0

Interpretation


  • The p-value is very close to 0 (it displays as 0 after rounding to three digits)

  • What should we conclude?

  • How should we interpret this?

  • If there were no association between polarization and democracy, there would be a less than 5% chance of seeing a relationship at least this strong from random chance alone (here, close to a 0% chance)
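
To see why the p-value prints as 0, it can be approximated from the rounded output: the test statistic is the estimate divided by its standard error, and the two-sided p-value is the probability of a t-value at least that far from zero under the null. This sketch again assumes 179 observations and uses rounded inputs, so it is an approximation rather than the exact value the model reports.

```r
# Values copied from the rounded model output
estimate  <- -0.079
std_error <- 0.013
df        <- 179 - 2       # observations minus two estimated parameters

# Test statistic under the null hypothesis that the coefficient is 0
t_stat <- estimate / std_error

# Two-sided p-value: probability of a t-value at least this far from 0
p_value <- 2 * pt(-abs(t_stat), df)

t_stat
p_value
format.pval(p_value, eps = 0.001)   # reporting style for very small p-values
```

The p-value is tiny but not literally zero, which is why tables often report it as "< 0.001" instead of 0.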

Questions?

Regression Tables

What’s in a Regression Table?

Load VDEM Data

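library(tidyverse)   # dplyr verbs and the %>% pipe
library(vdemdata)    # assumed source of the vdem data frame

modelData <- vdem %>% 
  filter(year == 2019) %>% 
  select(country_name, v2x_libdem, e_gdppc, v2cacamps, e_total_oil_income_pc, v2x_corr) %>% 
  mutate(lg_gdppc = log(e_gdppc))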

Models

gdp <- linear_reg() %>%
  set_engine("lm") %>%
  fit(v2x_libdem ~ lg_gdppc, data = modelData) 
  
polarization <- linear_reg() %>%
  set_engine("lm") %>%
  fit(v2x_libdem ~ v2cacamps, data = modelData) 

both <- linear_reg() %>%
  set_engine("lm") %>%
  fit(v2x_libdem ~ v2cacamps + lg_gdppc, data = modelData) 

Prep Data for Display


models <- list("GDP" = gdp,  # store list of models in an object
               "Polarization" = polarization, 
               "Both" = both)

coef_map <- c("lg_gdppc" = "GDP per capita (log)",  # map coefficients
              "v2cacamps" = "Polarization",         # (change names and order)
              "(Intercept)" = "Intercept")

caption <- "Table 1: Predictors of Liberal Democracy"  # store caption
reference <- "V-Dem Data 2019."                        # store reference notes

modelsummary Code

library(modelsummary)

modelsummary(models,                      # display the table
             stars = TRUE,                # include stars for significance
             gof_map = c("nobs"),         # goodness of fit stats to include   
             coef_map = coef_map,         # coefficient mapping
             title = caption,             # title
             notes = reference)           # source note

Display the Models in a Table

Table 1: Predictors of Liberal Democracy
                       GDP        Polarization  Both
GDP per capita (log)   0.120***                 0.100***
                       (0.015)                  (0.015)
Polarization                      -0.079***     -0.054***
                                  (0.013)       (0.012)
Intercept              0.131***   0.411***      0.182***
                       (0.038)    (0.018)       (0.038)
Num.Obs.               174        179           174
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001
V-Dem Data 2019.
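
One way to read the "Both" column is to plug values into the fitted equation. The sketch below uses the rounded coefficients from the table; the helper `predict_libdem` and the input values (a log GDP per capita of 2, polarization scores of 0 and 1) are hypothetical and chosen only for illustration.

```r
# Rounded coefficients from the "Both" column of Table 1
b_intercept <- 0.182
b_gdp       <- 0.100    # GDP per capita (log)
b_polar     <- -0.054   # Polarization

# Hypothetical helper: predicted liberal democracy score
predict_libdem <- function(lg_gdppc, v2cacamps) {
  b_intercept + b_gdp * lg_gdppc + b_polar * v2cacamps
}

# Two hypothetical countries with the same log GDP per capita
baseline   <- predict_libdem(lg_gdppc = 2, v2cacamps = 0)
more_polar <- predict_libdem(lg_gdppc = 2, v2cacamps = 1)

baseline
more_polar - baseline   # equals the polarization coefficient
```

Holding log GDP per capita fixed, a one-unit increase in polarization shifts the prediction by exactly the polarization coefficient, which is what "controlling for GDP" means in this table.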