Creating Visualizations in R - Dataviz with #TidyTuesday and #SWDchallenge

If you can't already tell, I've been really into R lately.

I've been doing DataCamp nearly every day (I've already finished 12 of the courses) and I've been trying to do all my analyses in R. The latter has been a bit slow going; new tasks take me 2-3x as long in R as they typically do in Excel and/or SPSS. But the reproducible nature of the code is going to be a major game-changer, and my code will be there for future reference. I already put together one reference sheet (merging datasets with dplyr) that I've used twice since I made it!

However, the big thing I knew I needed to tackle was how to create visualizations in R. I've gotten really good at doing my dataviz in Excel, but I knew I needed to be able to do this in R, particularly for the more complex stuff I just can't quite do in Excel. 

So imagine my luck when I found both #TidyTuesday and #SWDchallenge on Twitter!

As the organizers of #TidyTuesday describe, it is "a weekly data project aimed at the R ecosystem. An emphasis will be placed on understanding how to summarize and arrange data to make meaningful charts with ggplot2, tidyr, dplyr, and other tools in the tidyverse ecosystem."

#SWDchallenge is from Storytelling with Data (a great podcast if you don't listen to it already). It is similar to #TidyTuesday, except they pick a particular challenge each month, and this month's challenge was to create a dot plot. Challenge accepted!

Dallas Animal Services

The open data provided this week was from Dallas Animal Services and covered all the animals they took in over the course of a year or so. I didn't delve too deep into this data, as I am still very much a novice at dataviz with ggplot2 in R. You can see all the code for these visualizations here.

For this data, I decided to create a bar plot. Simple, right?

Not simple at all. This took me probably two hours total to create. There are a few reasons why:

  1. Getting the data in the format I needed took quite a while. I'm still new to the tidyverse, and figuring out things like ordering the data to get the max was difficult.
  2. Figuring out how to color the Dogs column in red but the others in grey also took a while. That required the mutate and scale_fill_manual code (and a lot of Googling!) to achieve. 
  3. I struggled with the different colors in the title. This is one thing where Excel, well... excels. Just double-click and edit the text! As you can tell from the code, this took a lot more finagling to finally make it work (see the stuff about grobs in the code). I'm still not 100% confident I understand how it works, but it worked.
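
For anyone curious what the second point looks like in practice, here is a minimal sketch of the mutate plus scale_fill_manual idea. It is not the exact code from my repo: the data frame, column names, and counts below are made up for illustration, and the real Dallas data has different columns.

```r
library(dplyr)
library(ggplot2)

# Made-up intake counts, for illustration only (not the real Dallas data)
animals <- tibble(
  animal_type = c("DOG", "CAT", "BIRD", "WILDLIFE", "LIVESTOCK"),
  n = c(120, 60, 10, 25, 5)
)

plot_data <- animals %>%
  mutate(
    # Flag the one bar that should stand out
    highlight = if_else(animal_type == "DOG", "yes", "no"),
    # Order the bars from the largest count to the smallest
    animal_type = reorder(animal_type, -n)
  )

bar_plot <- ggplot(plot_data, aes(x = animal_type, y = n, fill = highlight)) +
  geom_col() +
  # Map the flag to colors by name and hide the legend
  scale_fill_manual(values = c(yes = "red", no = "grey70"), guide = "none") +
  labs(x = NULL, y = "Number of animals") +
  theme_minimal()
```

Printing bar_plot draws the chart. The trick is that fill is mapped to a helper column rather than to the animal type itself, so scale_fill_manual only has two values to deal with.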

SWD Challenge

As if that simple (yet complex!) bar chart wasn't hard enough, then I saw on Twitter the #SWDchallenge calling for people to create a dot plot. I decided to use the #TidyTuesday data for this since I'd already started to get used to it. 

To keep things from getting too complicated, I decided to just add another layer to my previous graph and look at each type of animal by outcome (rather than only the animals that were rescued, like in the previous graph).

This probably took me around the same amount of time. Some of the issues I came across:

  1. Again, getting the data in the format I needed took a while. This time, what took the most time was figuring out how to get my data in percentages. This is where the janitor package came in handy!
  2. Getting all the colors across both the geom_point and geom_label to work properly took me a long time. Figuring out where to put which fill/color aesthetic was annoying. 
  3. On my computer, the dots were showing up as low resolution. I couldn't figure out why, but when I used ggsave they came out looking perfectly fine. Oh well!
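
Here is a rough sketch of the first two points, again with made-up data and column names rather than my actual code. Note that it computes the percentages with plain dplyr (group_by and mutate) instead of the janitor package I used, just to keep the example short.

```r
library(dplyr)
library(ggplot2)

# Made-up outcome counts, for illustration only
outcomes <- tibble(
  animal_type = c("DOG", "DOG", "DOG", "CAT", "CAT", "CAT"),
  outcome = c("ADOPTION", "RETURNED TO OWNER", "EUTHANIZED",
              "ADOPTION", "RETURNED TO OWNER", "EUTHANIZED"),
  n = c(50, 30, 20, 40, 10, 50)
)

# Percentage of each outcome within each animal type
pct_data <- outcomes %>%
  group_by(animal_type) %>%
  mutate(pct = 100 * n / sum(n)) %>%
  ungroup()

# color is mapped once at the top level, so both layers inherit it:
# geom_point uses it for the dot, geom_label for the text and border.
# fill is left alone, so the label boxes keep their default background.
dot_plot <- ggplot(pct_data, aes(x = pct, y = outcome, color = animal_type)) +
  geom_point(size = 4) +
  geom_label(aes(label = paste0(round(pct), "%")),
             nudge_y = 0.3, show.legend = FALSE) +
  labs(x = "Percent of animals", y = NULL, color = NULL) +
  theme_minimal()
```

As for the blurry dots, my guess (and it is only a guess) is that the on-screen preview renders at a lower resolution than the saved file. ggsave() takes a dpi argument, so you can control the resolution of the saved image directly.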

Conclusion

Overall, I am really happy I went through this process, even if it did take probably 5 hours out of my week. In the future, I will have code for dot plots and bar charts ready to go that I will only need to tweak. Furthermore, I am now feeling much more confident in my ggplot2 and tidyverse skills! Onward and upward!

Again, here is the code for anyone who may be interested. 
