Webinar Series - Visualization and Reproducible Reports in R

Date: Tuesday, March 31, 2020
Event type: Webinar
Time: 12:00 noon to 1:30 pm PST
Location: Online

Sessions: Tuesday March 31 | Thursday April 2 | Tuesday April 7 | Thursday April 9

 

 


Overview

This webinar series focuses on using ggplot2 and other tidyverse packages to generate reproducible reports. The tidyverse is a collection of R packages for data science that share an underlying design philosophy, grammar and data structures. Data visualization is the graphical representation of data to gain insight into its underlying structure. Visualizations are useful for summarizing large amounts of data, examining trends and patterns, and helping users understand data in a visual context (e.g. no more long tables). Data visualization has grown with the advent of interactive reports and tools made available online. Static graphs can be made interactive with pop-ups, animations, panning, zooming and many other features. R libraries are built for interactive web display, so users are no longer required to learn multiple languages to display and visualize their data: it can all be done in R!

RMarkdown (https://rmarkdown.rstudio.com/) allows your R analysis to be rendered dynamically in a range of output formats. RMarkdown can create fully reproducible reports, and its code chunks can be written in R, Python and SQL. The main benefit of using RMarkdown in research settings is that it gives you a template for a report that can be output as a Word, PDF or HTML file. So, when you inevitably have to go back to change your analysis, you can do it all in RMarkdown and create a revised report in no time.
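As a rough illustration of the "one template, several outputs" idea, the sketch below wraps rmarkdown::render() in a small helper. The helper name render_report and the file name "report.Rmd" are hypothetical and not part of the course materials; the PDF format additionally assumes a LaTeX installation is available.

```r
# Hypothetical helper: render one R Markdown file to several output formats.
# "report.Rmd" is a placeholder file name, not a file supplied by the course.
render_report <- function(input = "report.Rmd",
                          formats = c("html_document",
                                      "word_document",
                                      "pdf_document")) {
  # rmarkdown::render() accepts a vector of format names and
  # produces one output file per format.
  rmarkdown::render(input, output_format = formats)
}

# Uncomment once a report.Rmd file exists in the working directory:
# render_report("report.Rmd")
```

After changing the analysis inside the .Rmd file, calling the helper again regenerates every output from the same source.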

The webinar series will be divided into four 1.5-hour sessions. Background theory and live demos will be presented during each live session, with take-home practice assignments for some sessions. This is not a statistics or analysis course, but it will use tidyverse packages to summarize and shape data for the purpose of data visualization.

Prior required knowledge

Familiarity with R is an asset though not mandatory for enrollment.

Webinar objectives

By the end of this webinar series, participants will be able to:

  • Use the ggplot2 library to visualize data
  • Use tidyverse packages to analyze data
  • Create reproducible PDF and Word reports using RMarkdown
  • Create interactive reports

Course content

Session 1:

  • What is data visualization?
  • Data visualization principles
  • Shapes and colours for clear communication and accessibility
  • What is the tidyverse?
  • What is Markdown?
  • Setting up your environment and projects
  • Introduction to RMarkdown
  • Introduction to the wildfires data set
  • Exploratory analysis of wildfires

Session 2:

  • ggplot2 library and grammar
  • Plotting in ggplot2
    • Scatterplots
    • Bar charts
    • Boxplots
    • Bubble charts
    • Line graphs

Session 3:

  • Many plots in ggplot2
    • Faceting over time
    • Faceting by region
    • Patchwork
  • Generate Word and PDF reports
  • Creating your own themes
  • Visualizing spatial data in ggplot2
    • Intro to sf
    • geom_sf()

Session 4:

  • Interactive reports
  • gganimate
  • plotly
  • Leaflet
  • Generate an interactive report

Webinar Format

The interactive webinar software will provide remote access for students to view the instructor's screen, listen to the lecture in real time, and ask questions. The instructor will provide lecture slides (PowerPoint) and required readings prior to the start of the webinar. For practice between webinar sessions and for follow-up study, students will also receive training data and R code.

Students can download R for use on their computers from https://www.r-project.org/ and the RStudio development environment from https://www.rstudio.com/products/rstudio/download/

Libraries to Install

The following list should cover the libraries you will use in this workshop.

ggplot2, tidyverse, plotly, rmarkdown, dplyr, sf, purrr, leaflet, gganimate, patchwork, RColorBrewer, viridis

Upon registration, you will be provided with additional instructions for software download.
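One possible way to install the listed libraries from within R is sketched below. The helper install_missing is hypothetical (it is not part of the course materials) and assumes the packages are available from your default CRAN mirror; the instructions sent at registration take precedence.

```r
# Packages listed for this workshop.
workshop_packages <- c(
  "ggplot2", "tidyverse", "plotly", "rmarkdown", "dplyr", "sf",
  "purrr", "leaflet", "gganimate", "patchwork", "RColorBrewer", "viridis"
)

# Hypothetical helper: install only the packages that are not
# already present in the local library.
install_missing <- function(pkgs) {
  to_install <- setdiff(pkgs, rownames(installed.packages()))
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # Return (invisibly) the names of the packages that were requested.
  invisible(to_install)
}

# Uncomment to run; this downloads from CRAN and needs an internet connection.
# install_missing(workshop_packages)
```

Installing tidyverse also brings in ggplot2, dplyr and purrr, so listing them separately is redundant but harmless.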

Instructor

Lauren Yee is a multifaceted researcher and data scientist. She currently works in the consulting industry and designs projects around data visualization, dashboards, and ecological and spatial analysis. Her interdisciplinary background has given her experience working with many different types of datasets and methodologies, along with their related data quality issues and methods of visualization. She has taught workshops on GIS and modelling in academic settings and in the municipal environment. Her research interests include spatial epidemiology, the Ecohealth approach, determinants of health, white-nose syndrome and zoonoses.

Course fees

Regular rate: $260
Student rate: $160

For more information, contact Ann Greenwood, Education and Training Lead, at ann.greenwood@popdata.bc.ca

 

 


Page last revised: March 11, 2020