Power of Population Data Science Webinar - Visualising Logistic Regression: Application of coloring book technique in a reproducible ggplot2 system
All sessions will be delivered live and online via the GoToWebinar system.
Can’t attend the live session? This presentation will be recorded and posted on PopData's YouTube channel and the International Journal of Population Data Science (IJPDS) website for future reference. We recommend that you register for the presentations of your choice so that we can send you links to the recorded sessions as they become available.
Visualising the results of statistical modeling is a key component of the data science workflow. Statistical graphs are often the best means to explain and promote research findings. However, to find the one graph that tells a story worth sharing, we sometimes have to try out and sift through many data visualizations. How should we approach such a task? What can we do to make it easier from both production and evaluation perspectives?
This presentation will demonstrate a reproducible graphing system designed for the IPDLN-2018 hackathon. The system evaluates synthetic socioeconomic and mortality data with logistic regression. The data was prepared for the hackathon by Statistics Canada and represents the Canadian population.
Topics covered will include:
- Introduction to a visualisation technique that uses color to create meaningful expectations from the results of a logistic regression.
- Details of the workflow of the project that implements this graphing system (github.com/andkov/ipdln-2018-hackathon)
- Building the case for preferring reproducible workflows with version control over computational notebooks (e.g. Jupyter, R Notebook).
View webinar presentation below.
Andriy Koval, Ph.D. is a data scientist with a background in quantitative methods and interests in data-driven models of human aging. He is a Health System Impact Fellow with the BC Observatory for Population and Public Health and an incoming tenure-track assistant professor at the University of Central Florida. Andriy’s work centers on developing tools for reproducible research, with R and GitHub as key components. Presently, Andriy’s work focuses on developing statistical methods for analysing transactional data extracted from the electronic health records (EHR) of the Vancouver Island Health Authority. His current interests include the design of information displays with R, literate programming, statistical modelling in general, and probabilistic computing in particular. See more at https://github.com/andkov/bio