ETC5521 Worksheet Week 7

Making comparisons between groups and strata

Author

Prof. Di Cook

Exercise 1: Discussion

The Women’s Weekly published a story about the breast cancer experience of famous Australian model Elle Macpherson. Diagnosed 7 years ago, she is in remission after choosing alternative therapies as treatment. The original diagnosis was accompanied by a lumpectomy to remove the cancerous tissue.

What does data say relative to this statement?

Alternative therapies assisted Elle’s being considered cleared of cancer today.

This article has good explanations that clarify missing pieces from the Women’s Weekly article.

Elle was diagnosed with HER2 positive oestrogen receptive intraductal carcinoma. What was not reported was that her cancer was non-invasive.

If you read this information you will see that the survival rate for localised (non-invasive) breast cancer is 99%.

Exercise 2: Hate Crime

If this topic is upsetting for you, please feel free to take yourself out of the discussion, and exit the workshop.

A certain person made the following statement about this data and used the graph below to illustrate his point.

The post-9/11 upsurge in hate crimes against Muslims was real and unforgivable, but the horrible truth is that it didn’t loom that large compared with what Blacks face year in and year out.
Code
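# tidyverse supplies tribble(), mutate(), fct_reorder(), left_join(), deframe() and ggplot();
# colorspace supplies scale_color_discrete_qualitative()
library(tidyverse)
library(colorspace)

# Victim counts by year and offense, as listed in the table below
df <- tribble(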
  ~year, ~offense, ~count,
  2000, "Anti-Black", 3535,
  2000, "Sexual Orientation", 1558,
  2000, "Anti-Islamic", 36,
  2001, "Anti-Black", 3700,
  2001, "Sexual Orientation", 1664,
  2001, "Anti-Islamic", 554,
  2002, "Anti-Black", 3076,
  2002, "Sexual Orientation", 1513,
  2002, "Anti-Islamic", 174
) |>
  mutate(offense = fct_reorder(offense, -count))

pop_df <- tribble(
  ~pop, ~size,
  "Anti-Black", 36.4e6,
  "Sexual Orientation", 28.2e6,
  "Anti-Islamic", 3.4e6
)

# Attach the population estimates and compute the proportion of each population victimised
crime_df <- left_join(df, pop_df, by = c("offense" = "pop")) |>
  mutate(prop = count / size)

Victims of hate crime in USA in years 2000-2002.

Discuss whether the plot supports his statement or not. Is his comparison of the number of crimes against Muslims and Blacks fair? What graph would you suggest making to support or disprove his statement? The data and additional information are provided below.

This uses the data from the USA hate crime statistics found here. The number of victims for three particular types of hate crime is shown in the table below.

The number of victims by hate crime in the USA. Data sourced from https://ucr.fbi.gov/hate-crime.
Year    Offense               Victims
2000    Anti-Black            3535
2000    Sexual Orientation    1558
2000    Anti-Islamic          36
2001    Anti-Black            3700
2001    Sexual Orientation    1664
2001    Anti-Islamic          554
2002    Anti-Black            3076
2002    Sexual Orientation    1513
2002    Anti-Islamic          174

The 2000 USA Census reports that a total of 36.4 million people reported themselves as Black or African American. Weeks (2003) estimated that there are 3.4 million Muslims in the USA. The LGBT population is harder to estimate, but reports indicate 2-10% of the population, so it is likely below 28.2 million people in the USA.
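Because the LGBT figure is an upper bound, any per-capita rate for sexual orientation hate crimes is sensitive to it. The following is a minimal base R sketch, assuming that 28.2 million corresponds to the 10% end of the range (so the 2% end is one fifth of it). The object names are illustrative only.

```r
# Victim counts for sexual orientation hate crimes, from the table above
victims <- c("2000" = 1558, "2001" = 1664, "2002" = 1513)

# ASSUMPTION: 28.2 million corresponds to 10% of the population,
# so the 2% end of the range is one fifth of that figure.
pop_high <- 28.2e6
pop_low  <- pop_high * 2 / 10

# Victims per 10,000 people under each population assumption
rate_bounds <- rbind(
  low_rate  = victims / pop_high * 1e4,  # largest population, smallest rate
  high_rate = victims / pop_low * 1e4    # smallest population, largest rate
)
round(rate_bounds, 2)
```

Under the smaller population assumption the estimated rate is five times larger, so comparisons that rank groups by per-capita rate should be read with this uncertainty in mind.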

  • The use of a line plot rather than bar plot makes it easier to compare the trend across years.
  • The second part of the quote compares the number of victims of anti-Black hate crimes with the number of victims of anti-Islamic hate crimes.
  • The problem with this comparison is that the population sizes of the two groups are vastly different.
  • While the number of anti-Black victims is far larger than the number of anti-Islamic victims, as shown in Plot (A) below, the Muslim community is roughly 10% of the size of the Black community.
  • Assuming the population size is roughly the same across 2000-2002, a rough estimate of the proportions of hate crime victims for each population is compared in Plot (B).
  • The significant surge in anti-Islamic crimes in 2001 is more apparent in Plot (B).
  • Plot (C) shows each year's estimated rate relative to the rate in 2000 (labelled an odds ratio in the plot; because these proportions are very small, the rate ratio and the odds ratio are nearly identical). The anti-Islamic rate in 2001 was about 15 times the 2000 rate, falling to about 4.8 times in 2002. Both are far higher than the ratios for anti-Black and sexual orientation hate crimes, which stayed fairly stable across 2000-2002 (close to 1).
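The figures behind Plots (B) and (C) can be checked by hand. The sketch below uses base R only and, like the bullets above, assumes the population sizes are constant across 2000-2002. The object names are illustrative.

```r
# Victim counts by year (rows) and offense (columns), from the table above
counts <- rbind(
  "2000" = c("Anti-Black" = 3535, "Sexual Orientation" = 1558, "Anti-Islamic" = 36),
  "2001" = c(3700, 1664, 554),
  "2002" = c(3076, 1513, 174)
)

# Population estimates quoted in the text, assumed constant over the three years
pop <- c("Anti-Black" = 36.4e6, "Sexual Orientation" = 28.2e6, "Anti-Islamic" = 3.4e6)

# Plot (B): victims per 10,000 people; sweep() divides each column by its population
per_10k <- sweep(counts, 2, pop, "/") * 1e4

# Plot (C): each year's rate relative to the 2000 rate for the same offense
rel_2000 <- sweep(per_10k, 2, per_10k["2000", ], "/")

round(per_10k, 2)
round(rel_2000, 2)
```

Because the population is held fixed within each group, the relative rates in `rel_2000` equal the ratios of the raw counts (for example, 554 / 36 for anti-Islamic victims in 2001).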

Once again, these plots show that the answer to the question (is the quote true?) depends on how one interprets the quote. Is the person saying that the number of victims of anti-Black hate crimes is always higher than the number of victims of anti-Islamic hate crimes? Then the answer is likely yes. However, per capita, the rate of anti-Islamic hate crimes far exceeded anything else in 2001, which goes against the quote.
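The two readings of the quote can be put side by side for 2001. This is a minimal base R sketch using the counts and population estimates given in this exercise. The object names are illustrative.

```r
# 2001 victim counts and population estimates quoted in this exercise
victims_2001 <- c("Anti-Black" = 3700, "Anti-Islamic" = 554)
population   <- c("Anti-Black" = 36.4e6, "Anti-Islamic" = 3.4e6)

# Reading 1: raw counts. How many anti-Black victims per anti-Islamic victim?
count_ratio <- victims_2001[["Anti-Black"]] / victims_2001[["Anti-Islamic"]]

# Reading 2: per-capita rates (victims per 10,000 people in each population)
rate_2001  <- victims_2001 / population * 1e4
rate_ratio <- rate_2001[["Anti-Islamic"]] / rate_2001[["Anti-Black"]]

round(count_ratio, 1)  # anti-Black count relative to anti-Islamic count
round(rate_2001, 2)    # per-capita rate for each group
round(rate_ratio, 1)   # anti-Islamic rate relative to anti-Black rate
```

Both ratios come out above 1, which is why the raw counts and the per-capita rates point in opposite directions for 2001.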

Code
ggplot(crime_df, aes(as.factor(year), count, color = offense)) +
  geom_point() +
  geom_line(aes(group = offense)) +
  scale_color_discrete_qualitative() +
  labs(
    x = "Year", y = "The number of victims",
    color = "Offense", tag = "(A)"
  )

Code
ggplot(crime_df, aes(as.factor(year), prop * 10000, color = offense)) +
  geom_point() +
  geom_line(aes(group = offense)) +
  scale_color_discrete_qualitative() +
  labs(
    x = "Year", y = "Incidence estimate per 10,000 people",
    color = "Offense", tag = "(B)"
  )

Code
year2000dict <- crime_df |>
  dplyr::filter(year == 2000) |>
  dplyr::select(offense, prop) |>
  deframe()

crime_df |>
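  # Look up the 2000 baseline by name; as.character() guards against offense being a factor,
  # which would otherwise index the lookup vector by integer code
  mutate(rel2000 = prop / year2000dict[as.character(offense)]) |>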
  dplyr::filter(year != 2000) |>
  ggplot(aes(as.factor(year), rel2000, color = offense)) +
  geom_point() +
  geom_line(aes(group = offense)) +
  scale_color_discrete_qualitative() +
  scale_y_continuous(breaks = c(1, 4, 5, 15, 16)) +
  labs(
    x = "Year", y = "Odds ratio with respect to year 2000",
    color = "Offense", tag = "(C)"
  )