The visual taming of a paradox
@drob has posted code to play with on Twitter today. To illustrate what he calls a veridical paradox he’s posted the set up, the code and result of a coin flipping experiment:
There are some good and exact explanations in the thread, for this at-first-glance puzzle. But I didn’t see a visualization that might give you quick intuition about what is going on.
So I prepare one here. We’ll use the tidyverse packages and stringr.
library(tidyverse)
library(stringr)
First we simulate one flip’s possible outcomes.
one_flip <- tribble(
~flip,
"Heads",
"Tails"
)
We can also simulate the possible outcomes for histories which have equal probability.
two_flips <- crossing(one_flip, one_flip)
two_flips
And so on…
crossing(two_flips, one_flip)
For more flips I use a for-loop. Here I just have the histories for six flips. This give us 2^5 (32) equally probable histories. This is enough, I think, to make a viz that might illuminate the paradox.
flip_histories <- one_flip
for(i in 1:4){
flip_histories <- crossing(flip_histories, one_flip)
}
dim(flip_histories)
Now let’s plot these histories to give us some insights about the apparent paradox.
We’ll use ggplot2(), so first we get the data into tidy form.
names(flip_histories) <- paste0("flip", 1:ncol(flip_histories))
flip_histories <- flip_histories %>% mutate(history = 1:n())
tidy_df = gather(flip_histories, "flip", "outcome", 1:5) %>%
mutate(flip = as.numeric(str_extract(flip, "\\d"))) %>%
arrange(history)
Then we compute the position where we have observed the first heads heads pattern, and where we observed the first heads tails pattern.
tidy_df = tidy_df %>% group_by(history) %>%
mutate(next_flip_outcome = lead(outcome)) %>%
mutate(hh = outcome == "Heads" & next_flip_outcome == "Heads") %>%
mutate(first_hh = min(flip[hh], na.rm = T) + 1) %>%
mutate(ht = outcome == "Heads" & next_flip_outcome == "Tails") %>%
mutate(first_ht = min(flip[ht], na.rm = T) + 1)
Now we can plot the two scenarios of interest side-by-side. The full hypothetical histories are plotted, but the transparency is increased if the goal is reached previously in the flipping space.
library(cowplot)
g_hh <- ggplot(tidy_df) +
aes(flip, history,
alpha = flip <= first_hh,
col = outcome) +
geom_point() +
scale_alpha_discrete(range = c(.3,1)) +
theme_classic() +
labs(title = "Heads-heads as success case") +
geom_hline(yintercept = seq(.5,32.5, by = 1),
lty = "dotted", col = "grey")
g_ht <- g_hh +
aes(alpha = flip <= first_ht) +
labs(title = "Heads-tails as success case")
cowplot::plot_grid(g_hh, g_ht)
You observe that the success in flips 1 and 2 for the heads-heads desired outcome leads less opportunities for flips 2 and 3 to be a success, compared with the heads-tails case. In the heads-heads case, success is the kind of success that quenches more success.
It might be fun too plot these as branching might-have-been networks (like with tidygraph!). But I will leave that for someone else or another day.
Packages used:
H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
Hadley Wickham, Romain François, Lionel Henry and Kirill Müller (2018). dplyr: A Grammar of Data Manipulation. R package version 0.7.5. https://CRAN.R-project.org/package=dplyr
Hadley Wickham and Lionel Henry (2018). tidyr: Easily Tidy Data with ‘spread()’ and ‘gather()’ Functions. R package version 0.8.1. https://CRAN.R-project.org/package=tidyr
Claus O. Wilke (2017). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 0.9.2. https://CRAN.R-project.org/package=cowplot
Hadley Wickham (2018). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.3.1. https://CRAN.R-project.org/package=stringr