The visual taming of a paradox

@drob has posted code to play with on Twitter today. To illustrate what he calls a veridical paradox he’s posted the set up, the code and result of a coin flipping experiment:

There are some good and exact explanations in the thread, for this at-first-glance puzzle. But I didn’t see a visualization that might give you quick intuition about what is going on.

So I prepare one here. We’ll use the tidyverse packages and stringr.

library(tidyverse)
library(stringr)

First we simulate one flip’s possible outcomes.

one_flip <- tribble(
  ~flip,
  "Heads", 
  "Tails"
)

We can also simulate the possible outcomes for histories which have equal probability.

two_flips <- crossing(one_flip, one_flip)
two_flips

And so on…

crossing(two_flips, one_flip)

For more flips I use a for-loop. Here I just have the histories for six flips. This give us 2^5 (32) equally probable histories. This is enough, I think, to make a viz that might illuminate the paradox.

flip_histories <- one_flip
for(i in 1:4){
flip_histories <- crossing(flip_histories, one_flip)
}

dim(flip_histories)

Now let’s plot these histories to give us some insights about the apparent paradox.

We’ll use ggplot2(), so first we get the data into tidy form.

names(flip_histories) <- paste0("flip", 1:ncol(flip_histories))
flip_histories <- flip_histories %>% mutate(history = 1:n()) 


tidy_df = gather(flip_histories, "flip", "outcome",  1:5) %>% 
  mutate(flip = as.numeric(str_extract(flip, "\\d"))) %>% 
  arrange(history)

Then we compute the position where we have observed the first heads heads pattern, and where we observed the first heads tails pattern.

tidy_df = tidy_df %>% group_by(history) %>% 
  mutate(next_flip_outcome = lead(outcome)) %>% 
  mutate(hh = outcome == "Heads" & next_flip_outcome == "Heads") %>% 
  mutate(first_hh = min(flip[hh], na.rm = T) + 1) %>% 
  mutate(ht = outcome == "Heads" & next_flip_outcome == "Tails") %>% 
  mutate(first_ht = min(flip[ht], na.rm = T) + 1)

Now we can plot the two scenarios of interest side-by-side. The full hypothetical histories are plotted, but the transparency is increased if the goal is reached previously in the flipping space.

library(cowplot)
g_hh <- ggplot(tidy_df) +
  aes(flip, history, 
      alpha = flip <= first_hh, 
      col = outcome) +
  geom_point() +
  scale_alpha_discrete(range = c(.3,1)) +
  theme_classic() +
  labs(title = "Heads-heads as success case") +
  geom_hline(yintercept = seq(.5,32.5, by = 1), 
             lty = "dotted", col = "grey")

g_ht <- g_hh +
  aes(alpha = flip <= first_ht) +
  labs(title = "Heads-tails as success case") 

cowplot::plot_grid(g_hh, g_ht)

You observe that the success in flips 1 and 2 for the heads-heads desired outcome leads less opportunities for flips 2 and 3 to be a success, compared with the heads-tails case. In the heads-heads case, success is the kind of success that quenches more success.

It might be fun too plot these as branching might-have-been networks (like with tidygraph!). But I will leave that for someone else or another day.

Packages used:

H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2018). dplyr: A Grammar of Data Manipulation. R package version 0.7.5. https://CRAN.R-project.org/package=dplyr

Hadley Wickham and Lionel Henry (2018). tidyr: Easily Tidy Data with ‘spread()’ and ‘gather()’ Functions. R package version 0.8.1. https://CRAN.R-project.org/package=tidyr

Claus O. Wilke (2017). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 0.9.2. https://CRAN.R-project.org/package=cowplot

Hadley Wickham (2018). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.3.1. https://CRAN.R-project.org/package=stringr

Evangeline Reynolds
Visiting Teaching Assistant Professor

My research interests include international institutions, causal inference, data visualization, and computational social science and pedagogy.