ETC5521 Tutorial 4
Initial data analysis
🎯 Objectives
Practice conducting initial data analyses, and make a start on learning how to assess the significance of patterns.
🔧 Preparation
The reading for this week is Wickham et al. (2010) Graphical inference for Infovis.
- Complete the weekly quiz, before the deadline!
- Make sure you have this list of R packages installed:
install.packages(c("tidyverse"))
- Open your RStudio Project for this unit (the one you created in week 1, ETC5521). Create a .qmd document for this week's activities.
📥 Exercises
This tutorial focuses on IDA for the gardenR data, with the goal of answering this question:
Which variety of tomato produces the most return on investment, as measured by weight?
Exercise 1
- How many types of vegetables were grown in each year?
- How many vegetables were grown in 2020 that were not grown in 2021?
- What are some of the data recording errors that can be seen by comparing vegetables grown in each year?
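One way to approach these questions is to count the distinct vegetables in each year and then compare the two sets. The sketch below is only a starting point: `harvest_a` and `harvest_b` are hypothetical toy tables with made-up values, not the gardenR data, and the `vegetable` column name is an assumption about how the real harvest tables are laid out.

```r
library(dplyr)

# Toy stand-ins for two years of harvest records (made-up values).
harvest_a <- tibble(vegetable = c("tomatoes", "lettuce", "kale", "tomatoes"))
harvest_b <- tibble(vegetable = c("tomatoes", "Lettuce", "peas"))

# Number of distinct vegetable types recorded in each year.
n_a <- n_distinct(harvest_a$vegetable)
n_b <- n_distinct(harvest_b$vegetable)

# Vegetables recorded in the first year but not in the second.
only_a <- setdiff(unique(harvest_a$vegetable), unique(harvest_b$vegetable))

# Repeating the comparison after lower-casing shows which differences
# are caused only by capitalisation, one kind of recording error.
only_a_clean <- setdiff(
  unique(tolower(harvest_a$vegetable)),
  unique(tolower(harvest_b$vegetable))
)
```

Names that appear in `only_a` but drop out of `only_a_clean` are candidates for inconsistent recording rather than genuinely different crops.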
Exercise 2
- Join the harvest, spending and planting data for the two years, after adding a new variable to each, called year. Show your code.
- Make a subset containing just the tomatoes, for each set.
- Are the varieties of tomatoes grown each year the same?
- Are the tomato varieties grown in the same plots each year?
- When are tomatoes planted and harvested, in Lisa’s garden?
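The steps in this exercise could be sketched as follows for one of the three tables. This is a minimal illustration, not a solution: `planting_a` and `planting_b` are hypothetical toy tables with made-up values, the column names (`vegetable`, `variety`, `plot`, `date`) are assumptions, and the two years are combined here by stacking rows with `bind_rows()`.

```r
library(dplyr)

# Toy stand-ins for the planting table in two years (made-up values).
planting_a <- tibble(
  vegetable = c("tomatoes", "tomatoes", "kale"),
  variety   = c("variety A", "variety B", "variety K"),
  plot      = c("P1", "P2", "P3"),
  date      = as.Date(c("2020-05-20", "2020-05-21", "2020-05-02"))
)
planting_b <- tibble(
  vegetable = c("tomatoes", "tomatoes"),
  variety   = c("variety A", "variety C"),
  plot      = c("P2", "P4"),
  date      = as.Date(c("2021-05-18", "2021-05-25"))
)

# Add a year variable to each table, then stack them into one.
planting <- bind_rows(
  mutate(planting_a, year = 2020),
  mutate(planting_b, year = 2021)
)

# Keep only the tomato rows.
tomato_planting <- filter(planting, vegetable == "tomatoes")

# For each variety, list the years it was grown and the plots used.
variety_by_year <- tomato_planting |>
  group_by(variety) |>
  summarise(
    years = paste(sort(unique(year)), collapse = ", "),
    plots = paste(sort(unique(plot)), collapse = ", "),
    .groups = "drop"
  )

# Earliest and latest tomato planting dates in each year.
planting_window <- tomato_planting |>
  group_by(year) |>
  summarise(first = min(date), last = max(date), .groups = "drop")
```

The same pattern (add `year`, stack, filter) would apply to the harvest and spending tables, with the harvest dates giving the harvesting window.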
Exercise 3
Try to answer the original question.
- How should you calibrate weight of harvest by amount of seeds planted?
- Which variety produces the most return on investment?
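One possible calibration is harvested weight per seed planted. The sketch below illustrates that calculation under stated assumptions: both tables are hypothetical toy data with made-up values, and the column names `weight` and `number_seeds` are assumed rather than taken from the gardenR data.

```r
library(dplyr)

# Toy tomato harvest records (weights in grams, made-up values).
tomato_harvest <- tibble(
  variety = c("variety A", "variety A", "variety B", "variety C"),
  weight  = c(500, 700, 900, 300)
)

# Toy planting records: seeds planted per variety (made-up values).
tomato_planting <- tibble(
  variety      = c("variety A", "variety B", "variety C"),
  number_seeds = c(4, 2, 1)
)

# Total harvested weight per variety.
total_weight <- tomato_harvest |>
  group_by(variety) |>
  summarise(weight = sum(weight), .groups = "drop")

# Total seeds planted per variety.
total_seeds <- tomato_planting |>
  group_by(variety) |>
  summarise(number_seeds = sum(number_seeds), .groups = "drop")

# Weight per seed, sorted from highest to lowest.
yield <- total_weight |>
  inner_join(total_seeds, by = "variety") |>
  mutate(weight_per_seed = weight / number_seeds) |>
  arrange(desc(weight_per_seed))
```

Summarising each table before joining avoids duplicating harvest rows when a variety has more than one planting record. If investment is better measured in money than in seeds, the cost from the spending data could take the place of `number_seeds` as the denominator.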
👌 Finishing up
Make sure you say thanks and good-bye to your tutor. This is also a time to report what you enjoyed and what you found difficult.