test 3 Flashcards
what are zener cards
cards used to test psychic powers; each card shows 1 of 5 possible symbols
success/correct = the symbol you guessed matches the symbol on the card, meaning you predicted it right
difference between N, n, and p
N: sample size, the number of times the whole experiment is repeated (ex. number of students)
n: number of trials in each experiment (ex. number of cards in the deck)
p: probability of success
what is a binomial distribution
used for determining the probability of getting a certain number of successes in a fixed number of independent trials, where each trial has only 2 possible outcomes (success or failure) and the same probability of success (used in zener decks)
difference between normal and binomial distribution
binomial: used for discrete counts when you have a fixed number of trials (counting fails or successes)
normal: used for continuous data that can take on any value in an interval, shaped like a bell curve, no fixed number of trials
explain how this sapply function works (which is used inside of data frames):
sapply(0:n, function(X) sum(x==X))
- sapply applies the given function to each element of a vector and returns the results as a vector
- 0:n generates the sequence 0, 1, ..., n
- function(X) sum(x==X) is applied to each element of this sequence
- inside the function, X (capital) is the current element of the sequence and x (lowercase) is the data vector that was generated earlier
- x==X is TRUE wherever x equals X, and sum() counts the TRUEs, so sum(x==X) is how many times the value X occurs in x
so in essence: the sapply function is used to count how many times each number from 0 to n occurs in the generated x data
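A small sketch of the counting step that can be checked by hand. The vector x and n = 5 here are made-up values for illustration, not the zener data.

```r
# Made-up example data (assumption): 6 scores between 0 and 5
x <- c(0, 2, 2, 3, 5, 2)
n <- 5

# For each X in 0..n, count how many entries of x equal X
counts <- sapply(0:n, function(X) sum(x == X))
counts

# Same counts with table(); the factor levels keep the zero counts
counts_tab <- as.vector(table(factor(x, levels = 0:n)))
counts_tab
```

The value 2 appears three times in x, so the third entry of counts (the one for X = 2) is 3.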
what’s the code?
create a frequency distribution of the observed number correct in a zener deck with 25 cards and 100 students
first generate a set of data using the rbinom function
x <- rbinom(100, size=25, prob = 0.2)
prob of correct is 1/5 so prob is 0.2; 100 students is N (the first argument) and size is n = 25 cards
now make a data frame for it:
df.zener <- data.frame(Count=0:25, Frequency = sapply(0:25, function(X) sum(x==X)))
this gives you the frequency distribution of the number correct
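The two steps above can be run together as one sketch. The set.seed() call and the Expected column (theoretical frequency from dbinom) are additions for illustration and are not part of the original card.

```r
set.seed(1)                 # arbitrary seed (assumption) so the run is repeatable
N <- 100                    # number of students
n <- 25                     # cards per deck
p <- 1/5                    # chance of guessing one card correctly

x <- rbinom(N, size = n, prob = p)   # number correct for each student

df.zener <- data.frame(
  Count     = 0:n,
  Frequency = sapply(0:n, function(X) sum(x == X)),   # observed
  Expected  = N * dbinom(0:n, size = n, prob = p)     # theoretical (added column)
)
head(df.zener, 10)
```

The Frequency column always adds up to N, since every student lands in exactly one Count row.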
mean and standard deviation formulas in a binomial distribution + codes for mean and mean prob.
mean = np
n = number of trials
p = prob. of success
standard deviation = √(np(1-p))
- these formulas give the theoretical mean and standard deviation of any binomial distribution
codes (estimates from the simulated data x):
mean(x)
mean(x)/n for prob. (estimate of p)
sd(x) for standard deviation
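A quick sketch comparing the formulas with the sample estimates, assuming the zener setup (N = 100, n = 25, p = 0.2). The seed is arbitrary and only there to make the run repeatable.

```r
set.seed(42)                # arbitrary seed (assumption)
N <- 100; n <- 25; p <- 0.2
x <- rbinom(N, size = n, prob = p)

mu    <- n * p                     # theoretical mean
sigma <- sqrt(n * p * (1 - p))     # theoretical standard deviation

c(theory_mean = mu,    sample_mean = mean(x))
c(theory_sd   = sigma, sample_sd   = sd(x))
c(true_p      = p,     estimated_p = mean(x) / n)
```

With these values the theoretical mean is 5 and the theoretical standard deviation is 2; the sample versions should land close to them but will not match exactly.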
95% confidence interval - why its used and how to code for it (using binomial distribution)
it's used because it gives the range where 95% of outcomes are expected to fall, so a result outside it counts as unusual
- similar to going about 2 standard deviations (1.96) on either side of the mean
code:
qbinom(0.025, size= n, prob =p) #lower limit
qbinom(0.975, size=n, prob =p) #upper limit
the qbinom function is used for quantiles (percentiles), and you are finding the 2.5th percentile and 97.5th percentile to get the 95% confidence interval
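A sketch with the zener numbers plugged in, assuming n = 25 cards and p = 0.2.

```r
n <- 25     # cards per deck
p <- 0.2    # chance of a correct guess

lower <- qbinom(0.025, size = n, prob = p)   # 2.5th percentile
upper <- qbinom(0.975, size = n, prob = p)   # 97.5th percentile
c(lower = lower, upper = upper)              # lower = 1, upper = 9
```

Since the binomial is discrete, the limits are whole numbers of cards, so the interval covers at least 95% rather than exactly 95%.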
what’s the code?
use the cumulative distribution function to get the probability of observing more than a certain value
pbinom(qbinom(0.975, size =n, prob=p), size =n , prob=p, lower.tail=FALSE)
pbinom gives the cumulative probability P(X ≤ q); lower.tail=FALSE switches it to the upper tail, P(X > q)
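A sketch of the same call with the zener numbers, assuming n = 25 and p = 0.2, plus a check that lower.tail=FALSE is the same as 1 minus the lower tail.

```r
n <- 25; p <- 0.2

q <- qbinom(0.975, size = n, prob = p)                    # upper limit (9 here)
p_upper <- pbinom(q, size = n, prob = p, lower.tail = FALSE)   # P(X > q)
p_check <- 1 - pbinom(q, size = n, prob = p)                   # same thing

c(q = q, p_upper = p_upper, p_check = p_check)
```

The upper-tail probability comes out below 0.025, as expected for a value past the 97.5th percentile.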
what’s the code?
what’s the probability that at least one person in N gets more than 9 cards correct?
P = prob. that one person gets more than 9 cards correct, so prob. that one person does not → (1-P)
prob. that no one out of N gets more than 9 correct → (1-P)^N, then take the complement by subtracting from 1 → 1 - (1-P)^N
code:
1 - (1 - pbinom(9, size=n, prob=p, lower.tail=FALSE))^N
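A sketch with numbers filled in, assuming N = 100 students, n = 25 cards and p = 0.2, and assuming the students guess independently of each other.

```r
N <- 100; n <- 25; p <- 0.2

P_one  <- pbinom(9, size = n, prob = p, lower.tail = FALSE)  # one person gets > 9
P_none <- (1 - P_one)^N                                      # nobody gets > 9
P_any  <- 1 - P_none                                         # at least one does

c(P_one = P_one, P_none = P_none, P_any = P_any)
```

Even though one person scoring more than 9 is rare (under 2%), with 100 students the chance that at least one of them does it is large (roughly 0.8).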
what’s the code?
finding the 95% confidence interval using the normal distribution
qnorm(0.025, mean = n*p, sd = sqrt(n*p*(1-p))) #lower limit
qnorm(0.975, mean = n*p, sd = sqrt(n*p*(1-p))) #upper limit
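A sketch of the normal-approximation limits with the zener numbers, assuming n = 25 and p = 0.2 (so the mean is 5 and the standard deviation is 2).

```r
n <- 25; p <- 0.2

mu    <- n * p                     # 5
sigma <- sqrt(n * p * (1 - p))     # 2

lower <- qnorm(0.025, mean = mu, sd = sigma)
upper <- qnorm(0.975, mean = mu, sd = sigma)
c(lower = lower, upper = upper)    # about 1.08 and 8.92
```

Unlike qbinom, these limits are not whole numbers, because the normal distribution is continuous.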
formula to find Z score
Z = (X - μ)/σ
X = vector of observations
μ = mean
σ = standard deviation
Z score (what it is, what values are common + Z score code)
Z score: tells you how far a particular data point is from the average of a group of data points, measured in standard deviations
- Z score of 1 means 1 standard deviation and so on
Z 0.025 = -1.96 & Z 0.975 = 1.96
Z- score code: qnorm(0.975) or qnorm(0.025)
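A sketch of turning raw scores into Z scores with the formula Z = (X - μ)/σ. The scores in x are made-up values, and the mean and sd are taken from the sample itself.

```r
# Made-up scores (assumption), e.g. number of cards correct
x <- c(3, 5, 4, 8, 6, 2, 7, 5)

z <- (x - mean(x)) / sd(x)   # one Z score per observation
round(z, 2)

# Common cutoffs for a 95% interval
z_lower <- qnorm(0.025)      # about -1.96
z_upper <- qnorm(0.975)      # about  1.96
c(z_lower, z_upper)
```

After this transformation the Z scores have mean 0 and standard deviation 1, so a score with |Z| above 1.96 is outside the middle 95% of a normal distribution.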
what’s the code?
expected lower and upper 95% confidence limit of X using Z score
when it says expected, it has to do with proportions
qnorm(0.025) * sqrt(n*p*(1-p)) + n*p #lower limit
qnorm(0.975) * sqrt(n*p*(1-p)) + n*p #upper limit
formula: X =σZ + μ
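A sketch showing that X = σZ + μ gives the same limits as passing mean and sd straight to qnorm, assuming n = 25 and p = 0.2.

```r
n <- 25; p <- 0.2
mu    <- n * p
sigma <- sqrt(n * p * (1 - p))

# Z score first, then convert back to X
lower_z <- qnorm(0.025) * sigma + mu
upper_z <- qnorm(0.975) * sigma + mu

# Let qnorm do the conversion
lower_direct <- qnorm(0.025, mean = mu, sd = sigma)
upper_direct <- qnorm(0.975, mean = mu, sd = sigma)

rbind(z_method = c(lower_z, upper_z),
      direct   = c(lower_direct, upper_direct))
```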
what's the code?
one sided z-test for proportions using p and p0 & two-sided z-test
z <- (p - p0) / sqrt(p0 * (1-p0)/N)
one sided
pnorm(z, lower.tail=TRUE)
- in this one sided test we only care about the left hand side (lower tail) bc the observed value is less than what we expected it to be; if the observed value were greater, use lower.tail=FALSE
what you just calculated is the p value: if it's less than 0.05 you reject the null hypothesis, if it's greater you fail to reject the null hypothesis
two sided
2*pnorm(abs(z), lower.tail=FALSE)
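- can do either tail but be consistent: use abs(z) with lower.tail=FALSE, or -abs(z) with lower.tail=TRUE (abs(z) with lower.tail=TRUE gives a value above 1, which is not a valid p value)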
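A worked sketch of both tests. The counts are made up for illustration: 64 correct guesses out of N = 400, tested against p0 = 0.2.

```r
# Made-up data (assumption): 64 correct out of 400 guesses
correct <- 64
N  <- 400
p  <- correct / N    # observed proportion, 0.16
p0 <- 0.2            # proportion expected under the null hypothesis

z <- (p - p0) / sqrt(p0 * (1 - p0) / N)    # about -2

# One sided (observed is below expected, so use the lower tail)
p_one <- pnorm(z, lower.tail = TRUE)

# Two sided, written both ways; they give the same answer
p_two     <- 2 * pnorm(abs(z), lower.tail = FALSE)
p_two_alt <- 2 * pnorm(-abs(z), lower.tail = TRUE)

c(z = z, one_sided = p_one, two_sided = p_two)
```

Here both p values are under 0.05 (about 0.023 and 0.046), so the null hypothesis would be rejected in either test.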