Trivia Questions Flashcards
You are working in R with a data frame called df. In this data frame, you have a variable called gdp_per_capita, and it has some missing values. How would you get the mean of this variable?
mean(df$gdp_per_capita, na.rm = TRUE)
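To see why na.rm = TRUE matters, here is a minimal sketch using a made-up three-row data frame (the values are hypothetical and only for illustration). Without na.rm, a single missing value makes the mean NA.

```r
# Hypothetical toy data: one of the three values is missing
df <- data.frame(gdp_per_capita = c(1000, NA, 3000))

# Without na.rm, the missing value propagates to the result
mean_with_na <- mean(df$gdp_per_capita)

# With na.rm = TRUE, the NA is dropped before averaging
mean_without_na <- mean(df$gdp_per_capita, na.rm = TRUE)

print(mean_with_na)     # NA
print(mean_without_na)  # 2000
```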
You are working in R. Your data frame is called df, which has data for France over the years 2000-2010. In this data frame, you have variables called year and poverty_rate. How would you create a labelled scatter plot of these two variables using the ggplot2 package?
# Make a scatter plot of poverty rate by year
library(ggplot2)
scatter = ggplot(df, aes(x = year, y = poverty_rate)) +
  geom_point() +
  labs(x = "Year",
       y = "Poverty Rate",
       title = "The Poverty Rate in France",
       caption = "Source: World Bank")
print(scatter)
You are working in R. How would you import a .csv file called “this class rocks.csv” and assign it to an object called “data”?
library(rio)
data = import("this class rocks.csv")
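If the rio package is not installed, base R's read.csv() can read the same kind of file. The sketch below is an alternative to the card's answer, not a replacement for it. It first writes a small made-up CSV to a temporary folder (the columns id and score are hypothetical) so that it runs on its own.

```r
# Write a hypothetical CSV to a temporary folder so the example is self-contained
csv_path <- file.path(tempdir(), "this class rocks.csv")
write.csv(data.frame(id = 1:3, score = c(90, 85, 77)), csv_path, row.names = FALSE)

# Base R alternative to rio::import(); spaces in the file name are fine inside quotes
data <- read.csv(csv_path)
str(data)
```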
What is a null hypothesis?
According to Kellstedt and Whitten (2018), a null hypothesis is a theory-based statement about what we would observe if there were no relationship between the dependent and independent variables.
Lionel Messi scored 100 goals in 2010 and 90 goals in 2011. Cristiano Ronaldo scored 70 goals in 2010 and 80 goals in 2011.
Write tables for this dataset in both the long and wide formats. (Bonus: What can we infer from these goal numbers?)
LONG
Player Year Goals
Messi 2010 100
Messi 2011 90
Ronaldo 2010 70
Ronaldo 2011 80
WIDE
Player 2010 2011
Messi 100 90
Ronaldo 70 80
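The same conversion can be sketched in R. This is one possible approach using base R's reshape(); the object names long and wide are chosen here for illustration. Note that reshape() names the new columns Goals.2010 and Goals.2011 rather than plain 2010 and 2011.

```r
# Long format: one row per player-year
long <- data.frame(
  Player = c("Messi", "Messi", "Ronaldo", "Ronaldo"),
  Year   = c(2010, 2011, 2010, 2011),
  Goals  = c(100, 90, 70, 80)
)

# Wide format: one row per player, one Goals column per year
wide <- reshape(long, idvar = "Player", timevar = "Year", direction = "wide")
print(wide)
```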
In the context of hypothesis testing, what is a Type I error?
A Type I error occurs when you reject a null hypothesis that is actually true. It is also called a false positive.
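A small simulation can make this concrete. The sketch below assumes a 0.05 significance level and draws both groups from the same normal distribution, so the null hypothesis is true by construction and every rejection is a Type I error. The sample size, number of repetitions, and seed are arbitrary choices.

```r
set.seed(42)  # arbitrary seed so the run is reproducible

# Run 2000 t-tests in which both groups come from the same distribution,
# so the null hypothesis of equal means is true in every test
p_values <- replicate(2000, t.test(rnorm(30), rnorm(30))$p.value)

# Share of tests that (wrongly) reject the null at the 0.05 level
type1_rate <- mean(p_values < 0.05)
print(type1_rate)
```

The printed share should land close to the chosen significance level, which is the Type I error rate the test is designed to have.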
Provide one variable with a nominal scale and one variable with an ordinal scale
Nominal: colors (red, green, blue)
Ordinal: rating (bad, OK, excellent)
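In R, this distinction maps onto unordered and ordered factors. A minimal sketch, with variable names chosen here for illustration:

```r
# Nominal: categories with no inherent order
color <- factor(c("red", "green", "blue"))

# Ordinal: categories with a meaningful order, declared through levels
rating <- factor(c("bad", "OK", "excellent"),
                 levels = c("bad", "OK", "excellent"),
                 ordered = TRUE)

print(is.ordered(color))      # FALSE
print(is.ordered(rating))     # TRUE
print(rating[1] < rating[3])  # TRUE: "bad" ranks below "excellent"
```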
Name a statistic that allows you to test if the difference in means between two groups is statistically significant.
t-ratio statistic or z-score
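As a rough illustration, base R's t.test() reports the t statistic and its p-value for a difference in means. The two groups below are made-up numbers, and the call uses the function's default Welch version of the test, which does not assume equal variances.

```r
# Hypothetical measurements for two groups
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0)
group_b <- c(4.2, 4.4, 4.8, 4.1, 4.5)

# Two-sample t-test (Welch by default)
result <- t.test(group_a, group_b)

print(result$statistic)  # the t statistic
print(result$p.value)    # compare with the chosen significance level, e.g. 0.05
```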