This vignette shows how to assess bias in literature review networks using covariates drawn from metadata about the studies and authors in the review of the redistricting literature in the main manuscript. Specifically, for each study, we collect the lead author’s gender, H-Index, and total number of citations. We then assess the impact of selecting studies on these covariates in two ways:

First, we subset the network (e.g., to studies where the lead author is a man) and observe how many nodes and edges are missing in these subsets. This reveals the contributions of underrepresented scholars to the network by showing what we lose if they are excluded.
Second, we draw random samples of 100 studies weighted by covariates. This simulates a literature review that is biased (e.g., toward scholars who are men or have many citations). We then compare these biased samples to an unweighted random sample of studies in the network.
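
As a minimal sketch of the first approach, the base-R snippet below subsets a toy edge list to rows whose lead author is a man and counts the nodes and edges that disappear. The data frame `toy`, its values, and the helper names `edges_of()` and `nodes_of()` are invented for illustration; they are not the redistricting data or part of `netlit`.

```r
# Toy edge list: each row is one citation supporting an edge (from -> to).
# All values are hypothetical and only illustrate the bookkeeping.
toy <- data.frame(
  from          = c("a", "a", "b", "c", "c"),
  to            = c("b", "c", "c", "d", "e"),
  author_gender = c("M", "F", "M", "F", "M"),
  stringsAsFactors = FALSE
)

# Unique edges and nodes in a set of rows
edges_of <- function(d) unique(d[, c("from", "to")])
nodes_of <- function(d) unique(c(d$from, d$to))

# Keep only rows where the lead author is a man
men_only <- toy[toy$author_gender == "M", ]

missing_edges <- nrow(edges_of(toy)) - nrow(edges_of(men_only))
missing_nodes <- setdiff(nodes_of(toy), nodes_of(men_only))

missing_edges  # number of edges lost in the subset
missing_nodes  # nodes that no longer appear
```

In this toy case, node `d` is supported only by a study whose lead author is a woman, so it drops out of the men-only network.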

1 Metadata

1.1 Lead Author Gender, H-Index, and Citations

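# Packages used below (assumed here; the rendered vignette does not show its setup chunk)
library(tidyverse)   # dplyr, tidyr, purrr, stringr, ggplot2
library(magrittr)    # %<>%
library(netlit)      # review()
library(igraph)      # cluster_walktrap(), diameter()
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

# Load replication version of main data and metadata on citations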
load(here::here("replication_data","literature_metadata.rda"))
load(here::here("replication_data","literature.rda"))

names(literature_metadata) %<>% janitor::make_clean_names()

literature_metadata %<>% 
  rename(author_gender = author_sex)

literature_metadata %>% kable()

id	author	year	publication	title	citations	outside_u_s	author_gender	author_h_index	author_citations
Hayes & McKee 2011	Hayes & McKee	2011	AJPS	The Intersection of Redistricting, Race, and Participation	41	NA	M	19	2675
Katz, King & Rosenblatt 2020	Katz, King & Rosenblatt	2020	APSR	Theoretical Foundations and Empirical Evaluations of Partisan Fairness in District-Based Democracies	20	NA	M	30	18684
Lo 2013	Lo	2013	QJPS	Legislative Responsiveness to Gerrymandering: Evidence from the 2003 Texas Redistricting	20	NA	M	18	1611
Chen & Rodden 2013	Chen & Rodden	2013	QJPS	Unintentional Gerrymandering: Political Geography and Electoral Bias in Legislatures	361	NA	M	8	375
Matsusaka 2010	Matsusaka	2010	QJPS	Popular Control of Public Policy: A Quantitative Approach	138	NA	M	38	10435
Moskowitz & Schneer 2019	Moskowitz & Schneer	2019	QJPS	Reevaluating Competition and Turnout in US House Elections	10	NA	M	2	18
McGhee & Shor 2017	McGhee & Shor	2017	Perspectives on Politics	Has the Top Two Primary Elected More Moderates?	20	NA	M	17	1809
Wildgen & Engstrom 1980	Wildgen & Engstrom	1980	Legislative Studies Quarterly	Spatial Distribution of Partisan Support and the Seats/Votes Relationship	30	NA	NA	NA	NA
Cain 1985	Cain	1985	APSR	Assessing the Partisan Effects of Redistricting	243	NA	M	44	9625
Buchler 2005	Buchler	2005	Journal of Theoretical Politics	Competition, Representation and Redistricting: The Case Against Competitive Congressional Districts	81	NA	M	NA	NA
Lublin & McDonald 2006	Lublin & McDonald	2006	Election Law Journal	Is It Time to Draw the Line?: The Impact of Redistricting on Competition in State House Elections	44	NA	M	26	3164
Caughey et al. 2017	Caughey et al.	2017	JOP	Incremental Democracy: The Policy Effects of Partisan Control of State Government	92	NA	M	17	1959
Glazer et al. 1987	Glazer et al.	1987	AJPS	Partisan and Incumbency Effects of 1970s Congressional Redistricting	102	NA	NA	NA	NA
Jacobson 2005	Jacobson	2005	Political Science Quarterly	Polarized Politics and the 2004 Congressional and Presidential Elections	134	NA	M	49	16104
Abramowitz et al. 2006	Abramowitz et al.	2006	JOP	Incumbency, Redistricting, and the Decline of Competition in US House Elections	461	NA	M	53	14228
Cain et al. 2005	Cain et al.	2005	Brookings Institution	From Equality to Fairness: The Path of Political Reform since Baker v. Carr	56	NA	M	44	9625
Griffin & Newman 2007	Griffin & Newman	2007	JOP	The Unequal Representation of Latinos and Whites	145	NA	M	16	1923
Grofman et al. 2000	Grofman et al.	2000	NCL Review	Drawing Effective Minority Districts: A Conceptual Framework and Some Empirical Evidence	165	NA	M	74	22114
McDonald 2006	McDonald	2006	PS: Political Science & Politics	Drawing the Line on District Competition	76	NA	M	30	3441
Desposato & Petrocik 2003	Desposato & Petrocik	2003	AJPS	The Variable Incumbency Advantage: New Voters, Redistricting, and the Personal Vote	186	NA	M	27	3003
Ashworth & Bueno de Mesquita 2006	Ashworth & Bueno de Mesquita	2006	JOP	Delivering the Goods: Legislative Particularism in Different Electoral and Institutional Settings	201	NA	M	17	2856
Hayes & McKee 2008	Hayes & McKee	2008	American Politics Research	Toward a One-Party South?	65	NA	M	19	2675
Winburn & Wagner 2010	Winburn & Wagner	2010	Political Research Quarterly	Carving Voters Out: Redistricting’s Influence on Political Information, Turnout, and Voting Behavior	46	NA	M	11	450
Hayes & McKee 2009	Hayes & McKee	2009	AJPS	The Participatory Effects of Redistricting	85	NA	M	19	2675
Chen 2010	Chen	2010	AJPS	The Effect of Electoral Geography on Pork Barreling in Bicameral Legislatures	38	NA	M	8	375
Cameron et al. 1996	Cameron et al.	1996	APSR	Do Majority-Minority Districts Maximize Substantive Black Representation in Congress?	709	NA	M	28	6226
Canon 1999	Canon	1999	Legislative Studies Quarterly	Electoral Systems and the Representation of Minority Interests in Legislatures	77	NA	M	21	2864
Gay 2007	Gay	2007	JOP	Legislating Without Constraints: The Effect of Minority Districting on Legislators’ Responsiveness to Constituency Preferences	47	NA	F	NA	NA
Bratton & Haynie 1999	Bratton & Haynie	1999	JOP	Agenda Setting and Legislative Success in State Legislatures: The Effects of Gender and Race	740	NA	F	NA	NA
Wyrick 1991	Wyrick	1991	American Politics Quarterly	Management of Political Influence: Gerrymandering in the 1980s	10	NA	M	NA	NA
Barabas & Jerit 2004	Barabas & Jerit	2004	State Politics & Policy Quaterly	Redistricting Principles and Racial Representation	44	NA	M	20	3789
Shotts 2003	Shotts	2003	JOP	Does Racial Redistricting Cause Conservative Policy Outcomes? Policy Preferences of Southern Representatives in the 1980s and 1990s	76	NA	M	22	3066
Bullock 1995	Bullock	1995	American Politics Quarterly	The Impact of Changing the Racial Composition of Congressional Districts on Legislators’ Roll Call Behavior	63	NA	M	NA	NA
Overby & Cosgrove 1996	Overby & Cosgrove	1996	JOP	Unintended Consequences? Racial Redistricting and the Representation of Minority Interests	164	NA	M	NA	NA
Sharpe & Garand 2001	Sharpe & Garand	2001	Political Research Quarterly	Race, Roll Calls, and Redistricting: The Impact of Race-Based Redistricting on Congressional Roll-Call	48	NA	M	NA	NA
LeVeaux & Garand 2003	LeVeaux & Garand	2003	Social Science Quarterly	Race‐Based Redistricting, Core Constituencies, and Legislative Responsiveness to Constituency Change*	13	NA	F	NA	NA
Lyons & Galderisi 1995	Lyons & Galderisi	1995	Political Research Quarterly	Incumbency, Reapportionment, and US House Redistricting	43	NA	M	NA	NA
Hirsch 2003	Hirsch	2003	Election Law Journal	The United States House of Unrepresentatives: What Went Wrong in the Latest Round of Congressional Redistricting	159	NA	NA	NA	NA
Grofman 1982	Grofman	1982	Political Geography Quarterly	Reformers, Politicians, and the Courts: A Preliminary Look at US Redistricting in the 1980s	12	NA	M	74	22114
Forgette & Winkle 2006	Forgette & Winkle	2006	Social Science Quarterly	Partisan Gerrymandering and the Voting Rights Act	16	NA	M	NA	NA
Hetherington et al. 2003	Hetherington et al.	2003	JOP	The Redistricting Cycle and Strategic Candidate Decisions in US House Races	91	NA	M	24	10690
Carson et al. 2006	Carson et al.	2006	AJPS	The Electoral Costs of Party Loyalty in Congress	342	NA	M	24	5637
Lublin 1999	Lublin	1999	APSR	Racial Redistricting and African-American Representation: A Critique of “Do Majority-Minority Districts Maximize Substantive Black Representation in Congress?”	216	NA	M	26	3164
Forgette & Platt 2005	Forgette & Platt	2005	Political Geography	Redistricting Principles and Incumbency Protection in the US Congress	33	NA	M	NA	NA
Carson & Crespin 2004	Carson & Crespin	2004	State Politics & Policy Quaterly	The Effect of State Redistricting Methods on Electoral Competition in United States House of Representatives Races	100	NA	M	24	5637
Katz et al. 2020	Katz et al.	2020	APSR	Theoretical Foundations and Empirical Evaluations of Partisan Fairness in District-Based Democracies	20	NA	M	30	18684
Bertelli & Carson 2011	Bertelli & Carson	2011	Electoral Studies	Small Changes, Big Results: Legislative Voting Behavior in the Presence of New Voters	12	NA	M	29	3204
Chen & Cottrell 2016	Chen & Cottrell	2016	Electoral Studies	Evaluating Partisan Gains from Congressional Gerrymandering: Using Computer Simulations to Estimate the Effect of Gerrymandering in the U.S. House	62	NA	M	8	375
Hunt 2018	Hunt	2018	Electoral Studies	When Does Redistricting Matter? Changing Conditions and Their Effects on Voter Turnout	8	NA	M	2	25
Sauger & Grofman 2016	Sauger & Grofman	2016	Electoral Studies	Partisan Bias and Redistricting in France	15	1	M	21	1492
Wong 2019	Wong	2019	BJPS	Gerrymandering in Electoral Autocracies: Evidence from Hong Kong	16	1	NA	11	454
Incerti 2018	Incerti	2018	Electoral Studies	The Optimal Allocation of Campaign Funds in US House Elections	5	NA	M	11	454
Limbocker & You 2020	Limbocker & You	2020	Electoral Studies	Campaign Styles: Persistency in Campaign Resource Allocation	2	NA	M	6	119
Carson et al. 2014	Carson et al.	2014	State Politics & Policy Quaterly	Reevaluating the Effects of Redistricting on Electoral Competition, 1972–2012	38	NA	M	24	5637
Makse 2014	Makse	2014	State Politics & Policy Quaterly	The Redistricting Cycle, Partisan Tides, and Party Strategy in State Legislative Elections	12	NA	M	9	474
Hood & McKee 2013	Hood & McKee	2013	State Politics & Policy Quaterly	Unwelcome Constituents: Redistricting and Countervailing Partisan Tides	7	NA	M	21	1948
Goedert 2017	Goedert	2017	State Politics & Policy Quaterly	The Pseudoparadox of Partisan Mapmaking and Congressional Competition	6	NA	M	5	183
Kirkland 2013	Kirkland	2013	State Politics & Policy Quaterly	Wallet-Based Redistricting: Evidence for the Concentration of Wealth in Majority Party Districts	6	NA	M	15	927
Carsey et al. 2017	Carsey et al.	2017	State Politics & Policy Quaterly	Rethinking the Normal Vote, the Personal Vote, and the Impact of Legislative Professionalism in U.S. State Legislative Elections	6	NA	M	25	5340
Stephanopoulos & McGhee 2015	Stephanopoulos & McGhee	2015	University of Chicago Law Review	Partisan Gerrymandering and the Efficiency Gap	345	NA	M	23	1917
Chen & Rodden 2015	Chen & Rodden	2015	Election Law Journal	Cutting Through the Thicket: Redistricting Simulations and the Detection of Partisan Gerrymanders	87	NA	M	8	375
Barnes & Solomon 2020	Barnes & Solomon	2020	Political Analysis	Gerrymandering and Compactness: Implementation Flexibility and Abuse	13	NA	M	11	652
Atsusaka 2021	Atsusaka	2021	APSR	A Logical Model for Predicting Minority Representation: Application to Redistricting and Voting Rights Cases	0	NA	M	1	3
Gatesman & Unwin 2021	Gatesman & Unwin	2021	Political Analysis	Lattice Studies of Gerrymandering Strategies	0	NA	M	1	1
Magleby & Mosesson 2018	Magleby & Mosesson	2018	Political Analysis	A New Approach for Developing Neutral Redistricting Plans	23	NA	M	6	148
Deford, Eubank & Rodden 2020	Deford, Eubank & Rodden	2020	Political Analysis	Partisan Dislocation: A Precinct-Level Measure of Representation and Gerrymandering	0	NA	M	10	309
Krasa & Polborn 2018	Krasa & Polborn	2018	APSR	Political Competition in Legislative Elections	41	NA	M	14	673
Saxon 2020	Saxon	2020	Political Analysis	Reviving Legislative Avenues for Gerrymandering Reform with a Flexible, Automated Tool	3	NA	M	5	457
Kang 2017	Kang	2017	Michigan Law Review	Gerrymandering and the Constitutional Norm Against Government Partisanship	61	NA	M	22	1501
Stephanopoulos 2012	Stephanopoulos	2012	University of Pennsylvania Law Review	Redistricting and the Territorial Community	61	NA	M	23	1917
Altman & McDonald 2010	Altman & McDonald	2010	Duke Journal of Constitutional Law and Public Policy	The Promise and Perils of Computers in Redistricting	87	NA	M	31	7948
McDonald & Best 2015	McDonald & Best	2015	Election Law Journal	Unfair Partisan Gerrymanders in Politics and Law: A Diagnostic Applied to Six Cases	73	NA	M	25	5220
Tam Cho & Liu 2016	Tam Cho & Liu	2016	Election Law Journal	Toward a Talismanic Redistricting Tool: A Computational Method for Identifying Extreme Redistricting Plans	76	NA	F	18	1171
McGhee 2014	McGhee	2014	Legislative Studies Quarterly	Measuring Partisan Bias in Single-Member District Electoral Systems	97	NA	M	17	1809
Wang 2016	Wang	2016	Stanford Law Review	Three Tests for Practical Evaluation of Partisan Gerrymandering	97	NA	M	44	9711
Cox & Holden 2011	Cox & Holden	2011	University of Chicago Law Review	Reconsidering Racial and Partisan Gerrymandering	76	NA	M	20	2488
Stewart et al. 2019	Stewart et al.	2019	Nature	Information Gerrymandering and Undemocratic Decisions	81	NA	M	14	1090
Siegel-Hawley 2013	Siegel-Hawley	2013	Harvard Educational Review	Educational Gerrymandering? Race and Attendance Boundaries in a Demographically Changing Suburb	58	NA	F	27	2839
Richards 2014	Richards	2014	American Educational Research Journal	The Gerrymandering of School Attendance Zones and the Segregation of Public Schools	91	NA	F	11	815
Fraga 2016	Fraga	2016	JOP	Redistricting and the Causal Impact of Race on Voter Turnout	67	NA	M	12	714
De Assis et al. 2014	De Assis et al.	2014	Computers & Operations Research	A Redistricting Problem Applied to Meter Reading in Power Distribution Networks	52	NA	F	5	220
Hayes et al. 2010	Hayes	2010	Legislative Studies Quarterly	Redistricting, Responsiveness, and Issue Attention	52	NA	M	11	544
Liu et al. 2016	Liu et al.	2016	Swarm and Evolutionary Computation	PEAR: A Massively Parallel Evolutionary Computation Approach for Political Redistricting Optimization and Analysis	62	NA	M	22	1415
Yoshinaka & Murphy 2011	Yoshinaka & Murphy	2011	Political Research Quarterly	The Paradox of Redistricting: How Partisan Mapmakers Foster Competition but Disrupt Representation	53	NA	M	15	1486
Webster 2013	Webster	2013	Political Geography	Reflections on Current Criteria to Evaluate Redistricting Plans	53	NA	M	24	1875
Gentry et al. 2013	Gentry et al.	2013	American Journal of Transplantation	Addressing Geographic Disparities in Liver Transplantation Through Redistricting	137	NA	F	30	3498
Grainger 2010	Grainger	2010	The Journal of Law and Economics	Redistricting and Polarization: Who Draws the Lines in California?	50	NA	M	13	972
Masket et al. 2012	Masket et al.	2012	PS: Political Science & Politics	The Gerrymanderers are Coming! Legislative Redistricting Won’t Affect Competition or Polarization Much, No Matter Who Does It	57	NA	M	23	3217
Altman & McDonald 2011	Altman & McDonald	2011	Journal of Statistical Software	BARD: Better Automated Redistricting	84	NA	M	31	7948
Gul & Pesendorfer 2010	Gul & Pesendorfer	2010	American Economic Review	Strategic Redistricting	67	NA	M	24	3381
Cain 2011	Cain	2011	Yale Law Journal	Redistricting Commissions: A Better Political Buffer	128	NA	M	44	9625
Arrington 2016	Arrington	2016	Election Law Journal	A Practical Procedure for Detecting a Partisan Gerrymander	5	NA	M	7	126
Ladewig 2018	Ladewig	2018	Election Law Journal	‘‘Appearances Do Matter’’: Congressional District Compactness and Electoral Turnout	0	NA	M	10	498
Campisi et al. 2019	Campisi et al.	2019	Election Law Journal	Declination as a Metric to Detect Partisan Gerrymandering	5	NA	F	3	19
Makse 2012	Makse	2012	Election Law Journal	Defining Communities of Interest in Redistricting Through Initiative Voting	17	NA	M	9	474
Gimpel & Harbridge-Yong 2020	Gimpel & Harbridge-Yong	2020	Election Law Journal	Conflicting Goals of Redistricting: Do Districts That Maximize Competition Reckon with Communities of Interest?	1	NA	M	42	7339
Chen 2017	Chen	2017	Election Law Journal	The Impact of Political Geography on Wisconsin Redistricting: An Analysis of Wisconsin’s Act 43 Assembly Districting Plan	22	NA	M	8	375
Ansolabehere & Snyder 2012	Ansolabehere & Snyder	2012	Election Law Journal	The Effects of Redistricting on Incumbents	26	NA	M	37	5606
Sabouni & Shelton 2021	Sabouni & Shelton	2021	Election Law Journal	State Legislative Redistricting: The Effectiveness of Traditional Districting Principles in the 2010 Wave	0	NA	M	1	4
Williamson 2019	Williamson	2019	Election Law Journal	Examining the Effects of Partisan Redistricting on Candidate Entry Decisions	2	NA	M	7	155
Veomett 2018	Veomett	2018	Election Law Journal	Efficiency Gap, Voter Turnout, and the Efficiency Principle	23	NA	F	3	37
Tamas 2019	Tamas	2019	Election Law Journal	American Disproportionality: A Historical Analysis of Partisan Bias in Elections to the U.S. House of Representatives	5	NA	M	6	91
Duchin et al. 2019	Duchin et al.	2019	Election Law Journal	Locating the Representational Baseline: Republicans in Massachusetts	25	NA	F	8	154
McGhee 2017	McGhee	2017	Election Law Journal	Measuring Efficiency in Redistricting	29	NA	M	17	1809
Wang et al. 2018	Wang et al.	2018	Election Law Journal	An Antidote for Gobbledygook: Organizing the Judge’s Partisan Gerrymandering Toolkit into Tests of Opportunity and Outcome	5	NA	M	44	9711
Caughey et al. 2017b	Caughey et al.	2017	Election Law Journal	Partisan Gerrymandering and the Political Process: Effects on Roll-Call Voting and State Policies	31	NA	M	17	1959
Powell et al. 2020	Powell et al.	2020	Election Law Journal	Partisan Gerrymandering, Clustering, or Both? A New Approach to a Persistent Question	2	NA	M	5	110
Fougere et al. 2010	Fougere et al.	2010	Election Law Journal	Partisanship, Public Opinion, and Redistricting	33	NA	M	2	17
Best et al. 2018	Best et al.	2018	Election Law Journal	Considering the Prospects for Establishing a Packing Gerrymandering Standard	32	NA	F	10	522
Warrington 2018	Warrington	2018	Election Law Journal	Quantifying Gerrymandering Using the Vote Distribution	34	NA	M	15	736
Gardner 2012	Gardner	2012	Election Law Journal	How to Do Things with Boundaries: Redistricting and the Construction of Politics	14	NA	M	10	881
Goedert 2014	Goedert	2014	Election Law Journal	Redistricting, Risk, and Representation: How Five State Gerrymanders Weathered the Tides of the 2000s	8	NA	M	5	183
Wang 2016b	Wang	2016	Election Law Journal	Three Practical Tests for Gerrymandering: Application to Maryland and Wisconsin	38	NA	M	44	9711
Ramachandran & Gold 2018	Ramachandran & Gold	2018	Election Law Journal	Using Outlier Analysis to Detect Partisan Gerrymanders: A Survey of Current Approaches and Future Directions	9	NA	F	5	104
Nagle 2019	Nagle	2019	Election Law Journal	What Criteria Should Be Used for Redistricting Reform?	16	NA	M	82	25088

# split out multiple cites per edge 
literature_long <- literature %>% 
  mutate(id = str_split(cites, ";")) %>% 
  unnest(id)

# merge edgelist with metadata
literature_long %<>% full_join(literature_metadata)
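
The split step above turns an edge whose `cites` field lists several studies into one row per cited study. A base-R analogue on a toy edge list is sketched below; the data are hypothetical, and unlike the `str_split()` and `unnest()` call above, this version trims whitespace around each id and drops the original `cites` column.

```r
# Toy edge list with several citations packed into one string per edge.
# Values are hypothetical.
edges <- data.frame(
  from  = c("a", "b"),
  to    = c("b", "c"),
  cites = c("Study 1;Study 2", "Study 3"),
  stringsAsFactors = FALSE
)

# Split each cites string on ";" and repeat the edge once per citation
ids <- strsplit(edges$cites, ";", fixed = TRUE)
edges_long <- data.frame(
  from = rep(edges$from, lengths(ids)),
  to   = rep(edges$to, lengths(ids)),
  id   = trimws(unlist(ids)),
  stringsAsFactors = FALSE
)

edges_long
```

Each row of `edges_long` can then be matched to study-level metadata by `id`.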

literature_long %>% 
  ggplot() +
  aes(x = author_h_index, fill = author_gender)+
  geom_histogram()

literature_long %>% 
  ggplot() +
  aes(x = author_citations, fill = author_gender)+
  geom_histogram()

library(ggraph)

1.2 The Full Graph

lit <- literature_long %>% 
  distinct(to, from) %>% 
  review()

lit

## A netlit_review object with the following components:
## 
## $edgelist
##  - 69 edges
##  - edge attributes: edge_betweenness
## $nodelist
##  - 56 nodes
##  - node attributes: degree_in, degree_out, degree_total, betweenness
## $graph
##    an igraph object

# best seed 1,4, *5*
set.seed(5)

netlit_plot <- function(g){
ggraph(g, layout = 'fr') + 
  geom_node_point(
    aes(color = degree_total %>% as.factor() ),
    size = 6, 
    alpha = .7
    ) + 
  geom_edge_arc2(
    start_cap = circle(3, 'mm'),
    end_cap = circle(6, 'mm'),
    aes(
      color = edge_betweenness,
      ),
    curvature = 0,
    arrow = arrow(length = unit(2, 'mm'), 
                  type = "open")
    ) +
  geom_edge_loop(
      start_cap = circle(5, 'mm'),
      end_cap = circle(2, 'mm'),
      aes( color = edge_betweenness),
      n = 300,
      strength = .6,
    arrow = arrow(length = unit(2, 'mm'), 
                  type = "open")
    ) +
  geom_node_text( aes(label = name), size = 2.3) + 
  ggplot2::theme_void() + 
  theme(legend.position="bottom") + 
  labs(edge_color = "Edge Betweenness",
       color = "Total Degree\nCentrality",
       edge_linetype = "") + 
scale_edge_color_viridis(option = "plasma", 
                         begin = 0, 
                         end = .9, 
                         direction = -1, 
                         guide = "legend") +
  scale_color_viridis_d(option = "mako", 
                        begin = 1, 
                        end = .5)
}


g <- literature_long %>% 
  distinct(to, from) %>% 
  review()  %>% 
  .$graph 

g %>% 
  netlit_plot()

# for plotting bias
netlit_bias_plot <- function(subgraph){
  
  # lit with edge attribute indicating missing from subgraph 
lit <- literature_long %>% 
  distinct(to, from) %>% 
    left_join( subgraph$edgelist %>% distinct(to, from) %>% mutate(missing_edges = "Not missing") 
) %>% 
    mutate(missing_edges = replace_na(missing_edges, "Missing")) 

lit %<>% 
  review(edge_attributes = names(lit))  
  
#  missing nodes 
  missing_nodes <- lit$nodelist$node[!lit$nodelist$node %in% subgraph$nodelist$node]

  set.seed(5)

ggraph(lit$graph, layout = 'fr') + 
  geom_node_point(
    aes(color = ifelse(name %in% missing_nodes, "Missing", "Not Missing")),
    size = 6, 
    alpha = .7
    ) + 
  geom_edge_arc2(
    start_cap = circle(3, 'mm'),
    end_cap = circle(6, 'mm'),
    aes(
      color = missing_edges,
      ),
    curvature = 0,
    arrow = arrow(length = unit(2, 'mm'), 
                  type = "open")
    ) +
  geom_edge_loop(
      start_cap = circle(5, 'mm'),
      end_cap = circle(2, 'mm'),
      aes(color = missing_edges),
      n = 300,
      strength = .6,
    arrow = arrow(length = unit(2, 'mm'), 
                  type = "open")
    ) +
  geom_node_text( aes(label = name), size = 2.3) + 
  ggplot2::theme_void() + 
  theme(legend.position="bottom") + 
  labs(edge_color = "",
       color = "",
       edge_linetype = "") +
  scale_color_discrete() + 
  scale_edge_color_discrete()
}


literature_long %<>%
  mutate(author_is_man = author_gender == "M")
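
The flagging logic inside `netlit_bias_plot()` (a left join that marks which edges of the full network are absent from a subsample, plus a set difference for nodes) can be sketched in base R as follows. The toy data and object names are hypothetical, and `merge()` stands in for `left_join()`.

```r
# Full toy edge list and a subsample of it (hypothetical values)
full <- data.frame(from = c("a", "a", "b", "c"),
                   to   = c("b", "c", "c", "d"),
                   stringsAsFactors = FALSE)
sub  <- full[c(1, 3), ]

# Left-join the subsample onto the full edge list; unmatched edges get NA
sub$missing_edges <- "Not missing"
flagged <- merge(full, sub, by = c("from", "to"), all.x = TRUE)
flagged$missing_edges[is.na(flagged$missing_edges)] <- "Missing"

# Nodes present in the full graph but absent from the subsample
missing_nodes <- setdiff(unique(c(full$from, full$to)),
                         unique(c(sub$from, sub$to)))

flagged
missing_nodes
```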

2 Biased Samples

# biased sample weights 
literature_long %<>% 
    mutate(unbiased = .5,
           weight = case_when(
      author_is_man ~ .6,
      !author_is_man ~ .4,
      TRUE~ .5 
    ))


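# a function to sample the network 
# `n` receives the iteration index from map() and is not used inside the function;
# the sample size is fixed at 100 rows
sample_lit <- function(n, literature_long, prob){
  
  # create an index for the sample
  samp_idx <- sample(seq_len(nrow(literature_long)), 
                     100, # number of rows to draw, without replacement
                     prob=prob # relative sampling weights, one per row
                     )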
  
  # subset sample to index 
  sample <- literature_long %>% 
    rowid_to_column() %>% 
    filter(rowid %in% samp_idx) %>% 
    distinct(to, from) %>% 
    review()
  
    return(sample)
}

n_samples <-1000
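
`sample()` treats `prob` as relative weights, so only their ratios matter: weights of .6 and .4 make a row for a man 1.5 times as likely as a row for a woman to be picked on any given draw. Because rows are drawn without replacement, the share of men's rows in a sample of 100 need not equal the first-draw probability. The sketch below illustrates this with a hypothetical pool of 165 rows; the 132/33 gender split is invented and is not taken from the data.

```r
set.seed(1)

# Hypothetical pool: 165 rows, of which 132 have a lead author who is a man.
# The 132/33 split is invented for illustration.
is_man <- rep(c(TRUE, FALSE), times = c(132, 33))
w <- ifelse(is_man, 0.6, 0.4)

# Probability that the first draw is a man's row: weights are normalized
p_first_man <- sum(w[is_man]) / sum(w)

# Share of men's rows in repeated weighted samples of 100 without replacement
share_men <- replicate(2000, {
  idx <- sample(seq_along(w), size = 100, prob = w)
  mean(is_man[idx])
})

c(pool = mean(is_man), first_draw = p_first_man, sampled = mean(share_men))
```

In this toy setup the average sampled share falls between the pool share and the first-draw probability, because the favored group is depleted faster as sampling proceeds.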

2.1 Random draws of 100 studies (1000 draws)

There are 165 studies in the original literature review. We draw 100 of them, first uniformly at random and then with weights based on covariates. For each type of simulated bias, we take 1000 draws.

random_samples <- map(1:n_samples, # n_samples = 1000 samples 
                      sample_lit,
                      literature_long=literature_long, 
                      prob = literature_long$unbiased)

samples <- random_samples

mean_edge_betw <- . %>% pull(edge_betweenness) %>% mean()
mean_node_betw <- . %>% pull(betweenness) %>% mean()
mean_node_degree <- . %>% pull(degree_total) %>% mean()

# make a table of the total number of nodes, edges, and the graph object for plotting
summarise_samples <- function(samples){
summary <- tibble(
  #edge stats
  edges = samples %>% map(1) %>% modify(nrow) %>% unlist(),
  edge_between_mean = samples %>% map(1) %>% modify(mean_edge_betw) %>% unlist(),
  # nodes stats
    nodes = samples %>% map(2) %>% modify(nrow) %>% unlist(),
  node_between_mean = samples %>% map(2) %>% modify(mean_node_betw) %>% unlist(),
  node_degree_mean = samples %>% map(2) %>% modify(mean_node_degree) %>% unlist(),
  #graph stats 
  communities = samples %>% map(3) %>% modify(cluster_walktrap) %>% modify(length) %>% unlist(),
  diameter = samples %>% map(3) %>% modify(diameter)  %>% unlist(),
  graph = samples %>% map(3)
  )
return(summary)
}
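
The per-sample bookkeeping in `summarise_samples()` can be illustrated without `netlit` or `igraph`. The sketch below repeatedly samples rows from a toy citation-level edge list and records how many distinct edges and nodes each sample recovers. The data and the helper `count_recovered()` are hypothetical, and the betweenness, community, and diameter statistics are omitted.

```r
set.seed(2)

# Toy citation-level edge list (hypothetical): several rows can support one edge
toy <- data.frame(
  from = c("a", "a", "a", "b", "c", "c", "d", "d"),
  to   = c("b", "b", "c", "c", "d", "e", "e", "e"),
  stringsAsFactors = FALSE
)

# Draw `size` rows and count the distinct edges and nodes they recover.
# `i` is the iteration index supplied by vapply() and is otherwise unused.
count_recovered <- function(i, d, size) {
  s <- d[sample(seq_len(nrow(d)), size), ]
  e <- unique(s[, c("from", "to")])
  c(edges = nrow(e), nodes = length(unique(c(e$from, e$to))))
}

# 200 draws of 4 rows each; the rows of `draws` are "edges" and "nodes"
draws <- vapply(1:200, count_recovered, numeric(2), d = toy, size = 4)

rowMeans(draws)  # average edges and nodes recovered per draw
```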

summary <- summarise_samples(samples)

random <- summary %>% mutate(
  sample = "Random"
)

# map(random$graph, netlit_plot)
map(random_samples[1:10], netlit_bias_plot)

Average nodes recovered: 43.8

Average node betweenness recovered: 2.9607115

Average edges recovered: 46.94

Average edge betweenness recovered: 5.1360215

Average node degree recovered: 2.1470984

Average communities recovered: 10.12

Average diameter recovered: 4.65

2.2 Gender-biased draws

2.2.1 pr(cite|man) = .60, pr(cite|woman) = .40

#  biased samples
gender_samples <- map(1:n_samples, sample_lit,literature_long=literature_long, prob = literature_long$weight)

samples <- gender_samples

summary <- summarise_samples(samples)

gender <- summary %>% mutate(sample = "Gender bias favoring men")
  
# map(gender_samples[1:10], netlit_bias_plot)
map(gender_samples[1:10], netlit_bias_plot)

Average nodes recovered: 44.25

Average node betweenness recovered: 2.9773211

Average edges recovered: 47.472

Average edge betweenness recovered: 5.1271756

Average node degree recovered: 2.1480552

Average communities recovered: 10.328

Average diameter recovered: 4.667

2.2.2 pr(cite|man) = 1, pr(cite|woman) = .30

# biased sample weights 
literature_long %<>% 
    mutate(weight = case_when(
      author_is_man ~ 1,
      !author_is_man ~ .3,
      TRUE~ .5 
    ))


#  biased samples
gender_samples <- map(1:n_samples, sample_lit,literature_long=literature_long, prob = literature_long$weight)

samples <- gender_samples

summary <- summarise_samples(samples)

gender <- summary %>% mutate(
  sample = "Gender bias favoring men"
)
  
#map(gender$graph, netlit_plot)
map(gender_samples[1:10], netlit_bias_plot)

Average nodes recovered: 45.339

Average node betweenness recovered: 3.3160159

Average edges recovered: 48.951

Average edge betweenness recovered: 5.5354191

Average node degree recovered: 2.1615376

Average communities recovered: 10.722

Average diameter recovered: 4.837

2.2.3 pr(cite|man) = .30, pr(cite|woman) = 1

# biased sample weights 
literature_long %<>% 
    mutate(weight = case_when(
      author_is_man ~ .3,
      !author_is_man ~ 1,
      TRUE~ .5 
    ))

gender_samples2 <- samples <- map(1:n_samples, sample_lit,literature_long=literature_long, prob = literature_long$weight)


# biased samples
summary <- summarise_samples(samples)

gender2 <- summary %>% mutate(
  sample = "Gender bias favoring women"
)
  
#map(gender$graph, netlit_plot)
map(gender_samples2[1:10], netlit_bias_plot)

Average nodes recovered: 42.591

Average node betweenness recovered: 2.3101483

Average edges recovered: 44.627

Average edge betweenness recovered: 4.3232846

Average node degree recovered: 2.0983178

Average communities recovered: 9.96

Average diameter recovered: 4.249

2.3 H-Index-biased draws

(replacing missing H-Index values with 0)

literature_long %<>%
  mutate(author_h_index = replace_na(author_h_index, 0 ))
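
Replacing a missing value with 0 has a concrete consequence whenever the covariate is passed directly to `sample()` as a weight: an element with weight 0 is never drawn, so rows with a missing value are excluded from every weighted sample. A minimal sketch with hypothetical H-Index values:

```r
set.seed(3)

# Hypothetical H-Index values; NA marks authors with no recorded H-Index
h_index <- c(19, 30, NA, 8, NA, 44)
w <- ifelse(is.na(h_index), 0, h_index)

# Draw 3 of the 6 rows, 500 times, weighting by H-Index
picked <- replicate(500, sample(seq_along(w), size = 3, prob = w))

# How often each row was drawn; rows 3 and 5 (weight 0) never appear
counts <- tabulate(as.vector(picked), nbins = length(w))
counts
```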

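# H-Index-biased samples: weight each row by the lead author's H-Index
# (passing literature_long$weight here would reuse the gender weights from 2.2.3)
hindex_samples <- samples <- map(1:n_samples, sample_lit,literature_long=literature_long, prob = literature_long$author_h_index)



summary <- summarise_samples(samples)

hindex <- summary %>% mutate(
  sample = "H-Index bias"
)
  
# map(hindex$graph, netlit_plot)
map(hindex_samples[1:10], netlit_bias_plot)

The values reported below are identical to those in 2.2.3, which indicates that they were generated with the 2.2.3 gender weights (`literature_long$weight`) rather than with H-Index weights. They should be regenerated with the call above before being interpreted as the effect of H-Index bias.

Average nodes recovered: 42.591

Average node betweenness recovered: 2.3101483

Average edges recovered: 44.627

Average edge betweenness recovered: 4.3232846

Average node degree recovered: 2.0983178

Average communities recovered: 9.96

Average diameter recovered: 4.249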

2.4 Citation-biased draws

(replacing NA author citations with 0)

literature_long %<>%
  mutate(author_citations = replace_na(author_citations, 0 ))

# citation-biased samples
citations_samples <- map(1:n_samples, sample_lit,literature_long=literature_long, prob = literature_long$author_citations)

samples <- citations_samples

summary <- summarise_samples(samples)

citations <- summary %>% mutate(
  sample = "Citations bias"
)
  
# map(citations$graph, netlit_plot)
map(citations_samples[1:10], netlit_bias_plot) # %>% .[c(1:10)]

Average nodes recovered: 46.811

Average node betweenness recovered: 3.754296

Average edges recovered: 51.905

Average edge betweenness recovered: 6.1469967

Average node degree recovered: 2.2184447

Average communities recovered: 10.823

Average diameter recovered: 4.638

3 Comparing Biases

# stack the per-sample summaries (each already carries its own `sample` label)
s <- bind_rows(random, gender, gender2, hindex, citations)

round2 <- . %>% round(1)

s_table <- s %>% group_by(sample) %>% 
  select(where(is.numeric)) %>% summarise_all(mean) %>% 
  group_by(sample) %>% 
  mutate_all(round2) %>% 
  arrange(desc(sample))

color.me <- which(s_table$sample == "Random")

names(s_table) %<>% str_remove("_mean")

s_table %>% 
  kable(booktabs = TRUE) %>% 
  kable_styling()

sample	edges	edge_between	nodes	node_between	node_degree	communities	diameter
Random	46.9	5.1	43.8	3.0	2.1	10.1	4.7
H-Index bias	44.6	4.3	42.6	2.3	2.1	10.0	4.2
Gender bias favoring women	44.6	4.3	42.6	2.3	2.1	10.0	4.2
Gender bias favoring men	49.0	5.5	45.3	3.3	2.2	10.7	4.8
Citations bias	51.9	6.1	46.8	3.8	2.2	10.8	4.6
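
One way to read the table is as differences from the unweighted baseline. The sketch below re-enters the edge and node means reported above and expresses each as a percentage change relative to the `Random` row. The object names (`s_means`, `baseline`) are illustrative and not part of the original analysis.

```r
# Means copied from the table above
s_means <- data.frame(
  sample = c("Random", "H-Index bias", "Gender bias favoring women",
             "Gender bias favoring men", "Citations bias"),
  edges  = c(46.9, 44.6, 44.6, 49.0, 51.9),
  nodes  = c(43.8, 42.6, 42.6, 45.3, 46.8),
  stringsAsFactors = FALSE
)

baseline <- s_means[s_means$sample == "Random", ]

# Percentage change relative to the random baseline
s_means$edges_pct <- round(100 * (s_means$edges / baseline$edges - 1), 1)
s_means$nodes_pct <- round(100 * (s_means$nodes / baseline$nodes - 1), 1)

s_means[, c("sample", "edges_pct", "nodes_pct")]
```

Positive values mean the biased samples recover more of the network than random sampling on average; negative values mean they recover less.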

s %>% 
  ggplot() + 
  aes(x = nodes, fill = sample, color = sample) +
  geom_density(alpha = .3) + 
  scale_color_viridis_d() + 
  scale_fill_viridis_d() +
  theme_minimal() + 
  labs(color = "", 
       fill = "", y = "Density",
       x = "Nodes Recovered (out of 56)") + 
  theme(axis.text.y = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank())

s %>% 
  ggplot() + 
  aes(x = edges, fill = sample, color = sample) +
  geom_density(alpha = .3) + 
  scale_color_viridis_d() + 
  scale_fill_viridis_d() +
  theme_minimal() + 
  labs(color = "", 
       fill = "", y = "Density",
       x = "Edges Recovered (out of 69)") + 
  theme(axis.text.y = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank())


s %>% 
  ggplot() + 
  aes(x = edge_between_mean, fill = sample, color = sample) +
  geom_density(alpha = .3) + 
  scale_color_viridis_d() + 
  scale_fill_viridis_d() +
  theme_minimal() + 
  labs(color = "", 
       fill = "", y = "Density",
       x = "Average Edge Betweenness") + 
  theme(axis.text.y = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank())

s %>% 
  ggplot() + 
  aes(x = node_between_mean, fill = sample, color = sample) +
  geom_density(alpha = .3) + 
  scale_color_viridis_d() + 
  scale_fill_viridis_d() +
  theme_minimal() + 
  labs(color = "", 
       fill = "", y = "Density",
       x = "Average Node Betweenness") + 
  theme(axis.text.y = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank())

s %>% 
  ggplot() + 
  aes(x = node_degree_mean, fill = sample, color = sample) +
  geom_density(alpha = .3) + 
  scale_color_viridis_d() + 
  scale_fill_viridis_d() +
  theme_minimal() + 
  labs(color = "", 
       fill = "", y = "Density",
       x = "Average Degree") + 
  theme(axis.text.y = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank())
  
s %>% 
  ggplot() + 
  aes(x = communities, fill = sample, color = sample) +
  geom_density(alpha = .3) + 
  scale_color_viridis_d() + 
  scale_fill_viridis_d() +
  theme_minimal() + 
  labs(color = "", 
       fill = "", y = "Density",
       x = "Communities") + 
  theme(axis.text.y = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank())

s %>% 
  ggplot() + 
  aes(x = diameter, fill = sample, color = sample) +
  geom_density(alpha = .3) + 
  scale_color_viridis_d() + 
  scale_fill_viridis_d() +
  theme_minimal() + 
  labs(color = "", 
       fill = "", y = "Density",
       x = "Diameter") + 
  theme(axis.text.y = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.minor.y = element_blank())

Simulating Biased Literature Reviews with `netlit`

Redistricting Literature

Devin Judge-Lord, Adeline Lo & Kyler Hudson
