Up until recently, I was scheduled to be on a panel, “Public Engagement: Blogging, Twitter & Podcasts”, at the annual Midwest Political Science Association Conference. I thought I might use the chance to tell colleagues about flipbooks and the new package {flipbookr}.
With worry reasonably growing about Covid-19, the other panelists and I are unable to make it to the conference. But I have this story to tell. It’s a story about getting things done at a distance, reaching out to people that you don’t know in real life to collaborate, and creating a tool that I think helps communicate clearly at a distance. I think we need stories like this now, with all of the change on the horizon (at least in the short term) for how we get things done. I feel nervous about these changes, amidst the other worries of these times. Anyway, I’ve also just been wanting to write down an account of the influences and key moments on the way to {flipbookr} for some time, just for myself, before time slips by and I start forgetting.
Let’s begin. I hope that it reaches some of the same audience that would have been there live at the panel and maybe a few more folks here and there.
Don’t care much about the origin story, but are interested that flipbooks exist? No worries - you can just jump to the all-things-flipbooks page here.
The very beginning I guess is grad school — when I started learning R. It was base R times, and that was fine. I loved base R. Learning R at all, rather than another analysis tool, was a bit of a twist of fate too; my department had only just moved over to teaching R from another statistical software.
Later on in grad school I was hearing murmurings about dynamic documents: documents that combined prose and code. I was intrigued. I ultimately compiled my first dynamic document under the tutelage of Roger Peng, whose Coursera class was a part of the wildly popular Johns Hopkins data science series. Peng also taught me how to write a function. So useful. It was probably around this time I started hearing the murmurings about ggplot and some new data manipulation tools too. But I didn’t really get into them.
I joined Twitter to participate in an RStudio Shiny competition in 2015. I saw the competition announcement and wanted to participate and win a t-shirt. It seems kind of silly now. I didn’t win and didn’t really have any chance of winning - I joined on the day the competition was closing, 3 hours before the end!
Just got on twttr. T Shirt competition. Like my App! 3hrs left!https://t.co/PL7usRvJ7M #rstudioshirt @rstudio pic.twitter.com/LclSvo0gMv
— Gina Reynolds (@EvaMaeRey) October 16, 2015
The next phase is basically about learning ggplot2, which I and so many people love to use to build plots!
In December 2016, I learned about #MakeoverMonday. It happened by listening to the podcast “Data Stories”. One of the first episodes I heard was an interview with Andy Kriebel and Andy Cotgreave, who were talking about the data visualization initiative #MakeoverMonday that they were running on Twitter to practice data viz with Tableau. They were posting datasets on a weekly basis, and whoever wanted to could try their hand at building a viz and sharing it with the #MakeoverMonday community.
I thought #MakeoverMonday sounded fantastic and posted a submission within a week or two. I used base R graphics. My viz wasn’t too pretty (though I was trying to be fancy with some “enhanced histogram” idea that I’d been working with) but I got a “Welcome to #Makeovermonday” message. I was hooked. There were just a few of us #rstats people in the #MakeoverMonday mix, but we were always welcomed! Eva Murray and Andy Kriebel, who were issuing feedback, cared more about the composition of plots than the tools used to build them.
My first #makeovermonday! Just one week late with the dangerous driving data. Information enhanced histograms for small # of observations. pic.twitter.com/Zkf9cZOw5T
— Gina Reynolds (@EvaMaeRey) December 19, 2016
In Spring 2017, a Quantitative Political Methodology Summer School for Women was announced, to be held at the University of Zurich. I applied and was lucky enough to go! Especially because there was a workshop one afternoon on ggplot2. I’d done some ggplot2 plotting here and there, using the popular copy-paste-tweak method (zero theory), but hadn’t really had a formal introduction. After the workshop, I decided to focus on learning that tool for the #MakeoverMonday submissions.
Zurich Summer School, class of 2017 (+organizers) #ZurichSummerSch pic.twitter.com/YFmL44um7v
— Anita Gohdes (@ARGohdes) August 9, 2017
Especially as I explored the new-to-me ggplot2 tool, I found #MakeoverMonday to be totally addictive. I built a lot of plots.
In the Spring of 2018, maybe it was April or so, Harvard Political Science professor Matt Blackwell tweeted some advice about how to present figures in talks. Basically it was to present them incrementally. Present the x-axis, then the y, then a point, and then the rest of the data. Bite-sized pieces are good for an audience’s digestion!
My best tip on how to give better quantitative presentations is to (a) use more plots and (b) build up your plots on multiple overlays, as in:
— Matt Blackwell (@matt_blackwell) April 30, 2018
- Just x-axis (explain it)
- Add y-axis (explain it)
- Add 1 data point (explain it)
- Plot the rest of the data (explain it)
The tip resonated with tons of people. It was a great idea. With me, it resonated and reverberated — it sounded a lot like ggplot2’s philosophy; it was a layered presentation of graphics. (By the way, this progressive presentation is exactly what Hans Rosling did in some of his presentations — dramatically introducing the x and y axes, and subsets of the data.)
Good ideas! Very #ggplot. A layered presentation of graphics. https://t.co/M4tRB9bGS5
— Gina Reynolds (@EvaMaeRey) May 1, 2018
In the comments on the tweet, it didn’t sound like an efficient way of doing this was totally worked out, and definitely not with ggplot2; the discussion was about doing it in Stata.
I turned to the problem immediately, although I’m sure I was meant to be doing something else that morning; I guess I still feel some guilt at failing to stay focused. With a bunch of ggplot2::last_plot() statements, you can manage the task pretty efficiently. ggplot2::last_plot() (which I learned in Zurich) lets you carry progress from a previous version of a plot into a new phase; in Matt’s case, from an incomplete plot to the next phase of the plot for the slow presentation.
an implementation w/ #ggplot2 pic.twitter.com/sDxUDrlMcw
— Gina Reynolds (@EvaMaeRey) May 2, 2018
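For readers who want to see the idea concretely, here is a minimal sketch of an incremental build with last_plot(). It is not the code from the tweeted implementation; the built-in cars dataset and the frame_* names are assumptions chosen purely for illustration.

```r
library(ggplot2)

# Frame 1: axes only, no data drawn yet
frame_1 <- ggplot(data = cars, mapping = aes(x = speed, y = dist))

# Frame 2: pick up the previous plot and add a single data point
frame_2 <- last_plot() + geom_point(data = cars[1, ], colour = "red", size = 3)

# Frame 3: pick up the previous plot again and add the rest of the data
frame_3 <- last_plot() + geom_point()

# Show the three phases in order, one per slide
frame_1
frame_2
frame_3
```

Each call to last_plot() returns the most recently created or modified plot, so every frame starts from where the previous one left off.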
I found the problem really engaging, and kept mulling it over. Came back to it two days later. “It sure would be nice if you could add aes() outside of ggplot, and one at a time”. I wanted it to be true. I tried it. AND. IT. WAS. TRUE!!! Slow ggplot was allowed!
I like this implementation even better. I'm using aes() outside of the ggplot() function or a geom function seems not to be conventional. I just tried and see that it works in this case as I was hoping. But, should it be avoided? bad style? @StatGarrett pic.twitter.com/GxUGo1agHU
— Gina Reynolds (@EvaMaeRey) May 4, 2018
And not many people seemed to know about it. It was a well-kept secret, but so handy, and I loved it and felt very clever for having found it.
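A minimal sketch of the + aes() technique, using the built-in cars dataset as an assumed example (the variable name slow_plot is also just for illustration). Each aes() call is added on its own line, so every aesthetic mapping becomes a separate, sequential decision.

```r
library(ggplot2)

slow_plot <- ggplot(data = cars) +
  aes(x = speed) +          # declare the x position first
  aes(y = dist) +           # then the y position
  geom_point() +            # then draw the points
  aes(colour = dist > 50)   # aesthetics can still be added after a geom

slow_plot
```

Because these aes() calls update the plot-level mapping, the points pick up the colour aesthetic even though it is declared after geom_point().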
Garrick Aden-Buie created something like a modern flipbook in early 2018. Mara Averick @dataandme tweeted about it later in the year. It presented code side-by-side with output — slowly building up code. I took special notice because Garrick, as I, had pulled the aes statement out of the ggplot statement. This made me feel a little less clever and insider-y about the + aes() technique. But I got over this fairly quickly, the feeling overwhelmed by admiration of the stunning side-by-side, incremental presentation. It was great. And wouldn’t it be cool to even take it one step further — to move as incrementally as possible and make all the decisions sequential?
Really dig how @grrrck builds up #ggplot2 syntax w/ 📊:
— Mara Averick (@dataandme) May 13, 2018
📽 “A Gentle Guide to the Grammar of Graphics with ggplot2” https://t.co/2Okhri7Hox #rstats #dataviz pic.twitter.com/qO43IeTtHG
Kindly enough, Garrick had shared the Xaringan rmarkdown file from his slides, so I could isolate the “flipbook part”. And I was using Xaringan in teaching my classes too, so wasn’t feeling too overwhelmed. I built my own side-by-side code-plot slow build. I titled the frames “Slow ggplot2” and tweeted about it.
I was pleased, but I did have to back off from a much more ambitious project — which would have shown a much more complicated plot of 25 or so lines of code. I chose one made up of about 10. It was too confusing to keep track of how much code was needed on all the slides and for the longer case. Better methods would be needed for a longer case.
Here, building up a #ggplot2 as slowly as possible, #rstats. Incremental adjustments. #rstatsteachingideas pic.twitter.com/nUulQl8bPh
— Gina Reynolds (@EvaMaeRey) August 13, 2018
Thereafter, Garrick also wrote about his methods in a blog post… “A recent tweet by Gina Reynolds reminded me that I’ve been sitting on this blog post for a while.”
A month or so later, Emi Tanaka joined in, tweeting about a code-evolution set of slides she’d built. She was using her gorgeous Xaringan styles kunoichi and ninjutsu, and she had embraced the fully sequential and incremental workflow of “Slow ggplot2” that I’d put forth - totally sequential, totally incremental. Unbeknownst to me, this process got her thinking about full-fledged flipbooks - flipbooks that would be built automatically from a single input of code.
Inspired by @grrrck and @EvaMaeRey, made the kunoichi + ninjutsu (ninja-theme) version of #ggplot tutorial although Garrick already does explaining this in his excellent blog https://t.co/msXfg14Ztn. Gist for ninja-theme here: https://t.co/soHH4Qvz4F #rstats pic.twitter.com/YlRHAGnaUm
— Emi Tanaka 🌾 (@statsgen) September 16, 2018
Meanwhile Andy Kriebel and Eva Murray were busy with a new project: they planned to write a book for #Makeovermonday. They approached me, among many other participants, about contributing visualizations that might be included in the book. I was glad to have been asked and sent a couple of higher resolution visualizations and my permission to include the work.
Their project also got me thinking, was it time to put together some kind of gallery for my own visualizations? They were scattered on Twitter and on my laptop, but might be more compelling in some kind of collection. On the internet would be fine. Modest goals.
RStudio announced the first bookdown competition in September 2018. There was a thought. What if I put together a book of my data visualizations in one place, maybe in the bookdown tool? And wouldn’t it be really marvellous to show the figures all being built, as Garrick, Emi, and I had done with the simple plots! The bookdown competition was a great pretext for getting in touch with them too. Proposing a collab for the contest.
Announcing the 1st Bookdown Contest: https://t.co/oCB20Lhv9k We cordially invite you to submit your bookdown examples, so that future authors may create more beautiful/free/open-access books and future students no longer need to struggle with formatting their dissertations! pic.twitter.com/fIf5eovg4z
— RStudio (@rstudio) July 27, 2018
I reached out to Emi and Garrick via direct message on Twitter, asking about the bookdown competition – maybe we could build a “flipbook” of ggplot examples together:
“Recently, I’ve been putting together a collection of plots I’ve made with ggplot for the Tableau initiative #MakeoverMonday (it is like the Tidy Tuesday initiative), in the bookdown format; I thought I should just submit it for fun. But, it would be much cooler to make a flipbook, showing how each line of code updates the plot (with fewer plots naturally).”
Emi is “in” to collaborate, though doubtful we could meet the bookdown contest deadline.
Garrick is also “in”, and expresses desire to automate.
I express that I’ve had the same wish.
Emi points us to a partial automation that she’d already worked out with a then-secret-and-possibly-dangerous knitr function, knitr:::knit_code$get(), and had even blogged about days before. She quoted Yihui Xie in her post:
There was only one thing upon which I hesitated when deciding whether I should give users the access. That is knitr:::knit_code. Here the triple-colon is obviously a danger sign. When you can even modify the content of a code chunk, I have no idea what can happen. Evil or creative? I’ll leave it to you to think about.
I’m so happy that he didn’t let the hesitations get in the way! And Emi expressed her motivations as coming from exactly the same frustrations that I’d experienced when I tried to “flipbook” the initial 25-line ggplot2 pipeline.
The slide was made using xaringan and the incremental reveal was made by copying and pasting the slide multiple times, deleting lines and then adding highlight to the right line. It did the job but this was far from ideal especially when I decided to change the order of the line so that theme_bw appears last.
Emi concluded:
Now that I know how knitr:::knit_code works, it’s giving me ideas.
Her gist is here: https://gist.github.com/emitanaka/99c5673ddc8f9103dd3c8fec05ab15ea
Garrick adds some know-how, knitr::knit(text = ?) and glue::glue()… We were at full automation!
His gist is here: https://gist.github.com/gadenbuie/634060984f0007bf390a931dd3b31bab
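The core idea behind the automation can be sketched in a few lines of base R. This is only an illustration of the concept, not the code from either gist or from {flipbookr}: build_frames() is a hypothetical helper, and the code string is typed in by hand here rather than pulled from a chunk with knitr:::knit_code$get().

```r
# Hypothetical helper (not part of knitr, glue, or {flipbookr}):
# turn one block of ggplot2 code into a list of cumulative partial builds.
build_frames <- function(code) {
  lines <- strsplit(code, "\n", fixed = TRUE)[[1]]
  lapply(seq_along(lines), function(i) {
    partial <- lines[seq_len(i)]
    # Drop a trailing "+" on the last revealed line so that
    # each partial pipeline is complete, runnable R code.
    partial[i] <- sub("[[:space:]]*\\+[[:space:]]*$", "", partial[i])
    paste(partial, collapse = "\n")
  })
}

# Example input, written out by hand for illustration
code <- paste(
  c("ggplot(cars) +",
    "  aes(x = speed) +",
    "  aes(y = dist) +",
    "  geom_point()"),
  collapse = "\n"
)

frames <- build_frames(code)

# Print each partial build, one per would-be slide
for (frame in frames) cat(frame, "\n---\n", sep = "")
```

Each element of frames is complete R code, so in principle it could be pasted into its own slide chunk (for example with glue::glue()) and rendered with knitr::knit(text = ...).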
After the September 24th rush of productivity, things slowed down a bit. Mostly, I was writing the ggplot flipbook. Also, we decided bookdown was not the best platform for the decision-by-decision reveals for ggplot2. Xaringan, the slide show tool, was already perfectly suited to this, so I went back to that. The code that had originally been used to produce the #MakeoverMonday plots was adjusted to a set of “Slowggplot2” rules that I had made, which tried to deliver feedback in the plot for each new code reveal. Also, there was cleaning up to do in terms of naming arguments, which I thought would help in communicating. And I wanted to write up a bit of explanation about each plot too, to introduce them. It was manageable, but took some time.
my #ggplot2 flipbook project is online! 😎🤓🤓 Incrementally walks through plotting code (#MakeoverMonday, soon #TidyTuesday plots). Using #xaringan with reveal function; thanks, @statsgen @grrrck. #rstats. https://t.co/bBBzv0iZLw pic.twitter.com/tFtD78IOHZ
— Gina Reynolds (@EvaMaeRey) February 11, 2019
The rest of the story is just details. The response to the ggplot flipbook was really fantastic and demonstrated a hunger for tools like flipbooks. There were a ton of obvious features to be added out of the gate (extending the flipbooks to data manipulation, and allowing code to span multiple lines), and features that I added as I felt that I “needed” them in teaching (like non-sequential reveals and multiple realizations of the exact same code) and user requests (reveal only the output). And it was clear that eventually the tools would need to be packaged up. {flipbookr} was born.