Building on the possible dependent and explanatory variables compiled in this Google Sheet, this page describes possible designs for studying the effects of redistricting, starting with the most basic models.
As political scientists on the UW 2020 project, our comparative advantage is that our colleagues in other disciplines are creating and comparing alternative maps, both those that exist and hypothetical alternatives. For our purposes, these maps will come with two key types of information:
Measures of both district partisanship and community relate theoretically to measures of representation. If we can estimate relationships between district characteristics and representation for the districts we observe, we can then estimate the quality of representation for hypothetical districts. Many kinds of representation may be of interest; the list below is incomplete and will grow as we develop measures of different forms of representation.
For example, NOMINATE scores.
Our preliminary analysis counting keywords in floor speeches is here.
To see what comparing district partisanship to speeches will look like, see this comparison of district partisanship to the total number of speeches. (There is no relationship between presidential vote share and one’s total number of speeches, but there may be one in the content of those speeches.)
The relationship between district partisanship and the number of references to one’s district.
The relationship between district partisanship and the amount of bipartisan rhetoric.
The relationship between district partisanship and the amount of partisan vitriol.
Another possible line of inquiry lies in assessing tradeoffs among the normative goals of districting. (See the relationships among goals in the exploratory DAG.)
For simplicity, assume a perfect correlation of “interest” and partisanship. (Interest and community are squishy concepts anyway.) Maximizing the goal of “preserving” “communities of interest” (by which we mean grouping communities of interest, even where they have previously been split) then results in maximizing the difference in two-party vote share. In a world without other constraints, maximizing the grouping of communities of interest would yield some districts that are all majority party and others that are all minority party. In the extreme, unconstrained world, grouping communities of interest into districts leads to stable districts that are always represented by the same party that enjoys an overwhelming majority in the district. These districts are never competitive in a general election but are presumably very competitive in primary elections.
The following hypothetical outcomes each take one goal to the extreme. All have the same total number of majority- and minority-party voters, allocated differently across 15 districts, yielding different numbers of seats for the majority party.
# imagined data maximizing interest coherence
interest <- tibble(district = 1:15,
                   majority_party_vote = c(rep(100, 8),
                                           rep(0, 7))) %>%
  mutate(minority_party_vote = 100 - majority_party_vote)
# imagined data maximizing competitiveness
competitive <- tibble(district = 1:15,
                      majority_party_vote = c(100,
                                              rep(50, 14))) %>%
  mutate(minority_party_vote = 100 - majority_party_vote)
# imagined data maximizing majority seats
majority <- tibble(district = factor(1:15),
                   majority_party_vote = c(rep(55, 14),
                                           rep(30, 1))) %>%
  mutate(minority_party_vote = 100 - majority_party_vote)
# imagined data maximizing minority-party seats
minority <- tibble(district = factor(1:15),
                   minority_party_vote = c(rep(10, 2),
                                           20,
                                           rep(55, 12))) %>%
  mutate(majority_party_vote = 100 - minority_party_vote)
dplot <- function(d){
  ggplot(d) +
    aes(x = factor(district),
        y = majority_party_vote) +
    geom_col() +
    geom_hline(yintercept = 50) +
    geom_text(aes(label = ifelse(majority_party_vote == 45 | minority_party_vote == 45,
                                 "Cracked", NA)),
              vjust = -1) +
    geom_text(aes(label = ifelse(majority_party_vote %in% c(10, 20, 30) | minority_party_vote %in% c(10, 20, 30),
                                 "Packed", NA)),
              vjust = -1) +
    scale_y_continuous(limits = c(0, 100)) +
    labs(x = "District")
}
dplot(interest) +
labs(title = "Districting that maximizes interest alignment within districts",
y = str_c("Majority Party Vote Share \n Expected Seats = ", sum(interest$majority_party_vote>50)))
dplot(competitive) +
labs(title = "Districting that maximizes the number of competitive districts\n (and thus proportionality, in expectation)",
y = str_c("Majority Party Vote Share \n Expected Seats = 1 + (14*.5) = 8"))
dplot(majority) +
labs(title = "Districting that maximizes majority-party seats",
y = str_c("Majority Party Vote Share \n Expected Seats = ", sum(majority$majority_party_vote>50)))
dplot(minority) +
labs(title = "Districting that maximizes minority-party seats",
y = str_c("Majority Party Vote Share \n Expected Seats = ", sum(minority$majority_party_vote>50)))
Assume continuous measures of representation, polarization, and ideological extremity in the range \([0,1]\), like those described in the “DVs” sheet.
Also, assume a measure of partisan advantage, “packed” and “cracked” indicators, and other explanatory variables like those described in the “EVs” sheet.
Let \(y_i\) be a measure of representation for official \(i\).
Let \(v_i\) be the margin of victory of official \(i\) (or the margin for official \(i\)’s party in their district, etc.).
Let \(p_i\) be an indicator of whether \(i\)’s district is packed.
We can simulate data for a hypothetical set of safe districts (e.g., \(v > 5\%\)). For illustration, I set the mean of \(y\) to .4 for packed districts and .6 for non-packed districts.
packed <- tibble(packed = TRUE,
                 vote_margin = sample(seq(5, 45, 1),
                                      100,
                                      replace = TRUE),
                 representation = rnorm(100,
                                        .4,
                                        .1))
notpacked <- tibble(packed = FALSE,
                    vote_margin = sample(seq(5, 45, 1),
                                         100,
                                         replace = TRUE),
                    representation = rnorm(100,
                                           .6,
                                           .1))
d <- full_join(packed, notpacked)
ggplot(d) +
aes(x = representation,
fill = packed) +
geom_histogram()
ggplot(d) +
aes(x = vote_margin,
fill = packed) +
geom_histogram()
We can then estimate representation given the vote margin and whether a district is packed, \(y|p,v\).
A linear fit would look like this:
\(\hat{y_i} = \beta_0 + \beta_1 p_i + \beta_2 v_i\)
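A minimal sketch of fitting this additive model directly (re-simulating the data so the chunk is self-contained; the seed and the object names `d_sim` and `m1` are arbitrary choices, not project conventions):

```r
set.seed(313)  # arbitrary seed, for reproducibility

# re-simulate 100 packed and 100 non-packed districts as above
d_sim <- data.frame(
  packed = rep(c(TRUE, FALSE), each = 100),
  vote_margin = sample(seq(5, 45, 1), 200, replace = TRUE),
  representation = c(rnorm(100, .4, .1), rnorm(100, .6, .1))
)

# the additive model: representation on a packed indicator and vote margin
m1 <- lm(representation ~ packed + vote_margin, data = d_sim)
coef(m1)  # packedTRUE should sit near the simulated difference of -.2
```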
ggplot(d) +
aes(x = vote_margin,
y = representation,
color = packed) +
geom_point(alpha = .2) +
geom_smooth(method = "lm") +
scale_color_viridis_d()
Now let \(y_i\) be a measure of ideological extremity.
## d %<>% mutate(ideological_extremity = ifelse(packed == 1,...))
d$ideological_extremity <- rbeta(200, 1, 3)
ggplot(d) +
aes(x = ideological_extremity,
fill = packed) +
geom_histogram()
We can then estimate ideological extremity given the vote margin and whether a district is packed, \(y|p,v\).
A linear fit would look like this:
\(\hat{y_i} = \beta_0 + \beta_1 p_i + \beta_2 v_i\)
m2 <- lm(ideological_extremity ~ packed + vote_margin,
         data = d) %>%
  augment(se_fit = TRUE)  # augment() is from the broom package
ggplot(m2) +
aes(x = vote_margin,
y = ideological_extremity,
color = packed,
fill = packed ) +
geom_point(alpha = .2) +
geom_ribbon(aes(ymin = .fitted - .se.fit,
ymax = .fitted + .se.fit),
alpha = .5,
color = NA) +
geom_line(aes(y = .fitted)) +
scale_color_viridis_d()+
scale_fill_viridis_d()
Now consider the set of districts with voter splits like those engineered by a partisan gerrymander (margins \(v\) of about +5% for the advantaged party).
Let \(c_i\) be a measure of whether representative \(i\)’s district is “cracked.”
For \(v_i \in[5,10]\), we can then test for a difference in mean ideological extremity by whether a district is cracked \(y|c\).
## use the same sample dist
d$cracked <- d$packed
d1 <- d %>% filter(vote_margin >= 5,
                   vote_margin <= 10)
d1 %<>% group_by(cracked) %>%
  mutate(mean = mean(ideological_extremity))
d1 %>% ggplot() +
  aes(x = vote_margin,
      y = ideological_extremity,
      color = cracked) +
  geom_boxplot() +
  geom_point() +
  facet_grid(. ~ cracked) +
  # geom_line(aes(y = mean)) +
  scale_color_viridis_d()
t.test(x = d1 %>% filter(cracked) %>%
.$ideological_extremity,
y = d1 %>% filter(!cracked) %>%
.$ideological_extremity)
##
## Welch Two Sample t-test
##
## data: d1 %>% filter(cracked) %>% .$ideological_extremity and d1 %>% filter(!cracked) %>% .$ideological_extremity
## t = 2.088, df = 32.877, p-value = 0.04462
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.003480229 0.269571174
## sample estimates:
## mean of x mean of y
## 0.3485650 0.2120393
Now let \(y_i\) be a measure of the representation of Black residents.
Let \(b_i\) be the percent of \(i\)’s district’s residents that identify as Black.
## use the same sample dist
d$black_representation <- d$representation

## draw percent Black
d$percent_black <- rbeta(200, 1, 3)
ggplot(d) +
aes(x = percent_black,
fill = cracked) +
geom_histogram()
A linear fit would look like this:
\(\hat{y_i} = \beta_0 + \beta_1 c_i + \beta_2 b_i + \beta_3 c_i b_i\)
m2 <- lm(black_representation ~ cracked * percent_black,
         data = d) %>%
  augment(se_fit = TRUE)
ggplot(m2) +
aes(x = percent_black,
y = black_representation,
color = cracked,
fill = cracked ) +
geom_point(alpha = .2) +
geom_ribbon(aes(ymin = .fitted - .se.fit,
ymax = .fitted + .se.fit),
alpha = .5,
color = NA) +
geom_line(aes(y = .fitted)) +
scale_color_viridis_d()+
scale_fill_viridis_d()
There are two sub-questions:
Ideally, what we want is counterfactual levels of representation and polarization at different levels of partisan advantage.
The level of partisan advantage is always confounded by time (in within-state analysis) or by state characteristics (in across-state analysis).
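One standard response to this confounding, offered as an illustrative sketch rather than a committed design, is a two-way fixed-effects specification that absorbs time-invariant state characteristics and common time shocks. All names in the chunk below (`panel`, `partisan_advantage`, `fe_fit`) are hypothetical placeholders, and the data are simulated:

```r
set.seed(60)  # arbitrary seed, for reproducibility

# hypothetical district-year panel; every variable name is a placeholder
panel <- expand.grid(state = factor(1:10), year = factor(2012:2020))
panel$partisan_advantage <- runif(nrow(panel))
panel$representation <- .5 - .2 * panel$partisan_advantage +
  rnorm(nrow(panel), 0, .05)

# state and year fixed effects absorb time-invariant state traits and
# common shocks, so identification comes from within-state variation
fe_fit <- lm(representation ~ partisan_advantage + state + year, data = panel)
coef(fe_fit)["partisan_advantage"]
```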