install.packages(c("tidyverse", "ggbeeswarm", "broom", "visdat"))
ETC5521 Worksheet Week 3
Initial data analysis
🎯 Objectives
Practice conducting initial data analyses, and make a start on learning how to assess significance of patterns.
Prof. Di Cook
1. Take a `glimpse` of the `penguins` data. What types of variables are present in the data?
2. Look up what the variables mean in the `palmerpenguins` package, or see if AI knows.
3. Using the `visdat` package, make an overview plot to examine the types of variables and to check for missing values.
4. Plot each variable by species using the `geom_quasirandom()` function in the `ggbeeswarm` package. There seems to be some bimodality in some species on some variables, e.g. `bill_len`. Why do you think this might be? Check your thinking by making a suitable plot.
5. Make a scatterplot of `body_mass_g` vs `flipper_length_mm` for all the penguins. What do the vertical stripes indicate? Are there any other unusual patterns to note, such as outliers, clustering or nonlinearity?
6. Fit a linear model of `body_mass_g` on `flipper_length_mm`. From the residual plot, are there any concerns about the model fit?
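The exercises above could be started along the following lines. This is only a sketch of one possible workflow, not a model answer. It assumes the data is the `penguins` table from the `palmerpenguins` package (which is not in the install line above and would need installing separately), where the bill length column is named `bill_length_mm` rather than `bill_len`. Colouring by `sex` is just one candidate explanation to check for the bimodality, and the object names (`p_species`, `p_sex`, `p_scatter`, `fit`, `fit_aug`, `p_resid`) are illustrative.

```r
library(tidyverse)
library(ggbeeswarm)
library(broom)
library(visdat)
# Assumption: the data comes from the palmerpenguins package
library(palmerpenguins)

# Overview of variable types and missing values
glimpse(penguins)
print(vis_dat(penguins))

# One measured variable by species, as a quasirandom (beeswarm-like) plot
p_species <- ggplot(penguins, aes(x = species, y = bill_length_mm)) +
  geom_quasirandom()
print(p_species)

# Candidate check for the bimodality: colour the same plot by sex
p_sex <- ggplot(penguins,
                aes(x = species, y = bill_length_mm, colour = sex)) +
  geom_quasirandom()
print(p_sex)

# Scatterplot of body mass against flipper length for all penguins
p_scatter <- ggplot(penguins,
                    aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(alpha = 0.6)
print(p_scatter)

# Simple linear model; lm() drops rows with missing values by default
fit <- lm(body_mass_g ~ flipper_length_mm, data = penguins)

# augment() returns the fitted values and residuals for the rows used
fit_aug <- augment(fit)

# Residuals against fitted values, with a reference line at zero
p_resid <- ggplot(fit_aug, aes(x = .fitted, y = .resid)) +
  geom_hline(yintercept = 0, colour = "grey60") +
  geom_point(alpha = 0.6)
print(p_resid)
```

Swapping `bill_length_mm` for the other measured variables repeats the species plots for each variable. The interpretation of the stripes, the bimodality and the residual pattern is left to the reader.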