General Flashcards
What is Z Score
Tells you how many standard deviations a point lies from the mean; most readily interpreted when the data are normally distributed
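As a minimal sketch of the z score calculation, assuming a small made-up numeric vector `x` (the data are hypothetical, for illustration only):

```r
# Hypothetical sample data, for illustration only
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# z score: (value - mean) / standard deviation
z <- (x - mean(x)) / sd(x)

# scale() centers and scales the same way; it returns a one-column matrix,
# so convert it back to a plain numeric vector
z_scaled <- as.numeric(scale(x))
```

After standardizing, the z scores have mean 0 and standard deviation 1.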
What is Z distribution
The standard normal distribution: a normal distribution with mean 0 and standard deviation 1
What is Standard Deviation
Roughly the typical distance of values from the mean; formally, the square root of the variance
What is Variance
The average squared deviation from the mean; equal to the standard deviation squared
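A short sketch of how the two relate in R, assuming made-up data in `x`. Note that `var()` and `sd()` compute the sample versions, which divide by n - 1 rather than n:

```r
# Hypothetical sample data, for illustration only
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

v <- var(x)  # sample variance (n - 1 denominator)
s <- sd(x)   # sample standard deviation

# The same variance computed by hand
v_manual <- sum((x - mean(x))^2) / (length(x) - 1)
```

Here `s^2` matches `v`, and `v` matches `v_manual`.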
Interactively edit a data frame
fix() or edit()
See data type of object
class()
See Data type of column
you can use class(Dataframe$columnname)
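For example, assuming a small hypothetical data frame `people` (the names and values are made up):

```r
# Hypothetical data frame, for illustration only
people <- data.frame(
  name = c("Ann", "Bob"),
  age = c(30, 41),
  stringsAsFactors = FALSE
)

object_class <- class(people)       # class of the whole object
column_class <- class(people$age)   # class of one column

# Class of every column at once
all_classes <- sapply(people, class)
```

`sapply(people, class)` is a convenient way to check all columns in one call.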
Return subset of dataframe that meets certain conditions
subset(Dataframe, Dataframe[, 2] == 'Hamel')
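A runnable sketch of the same idea, assuming a hypothetical data frame `people` whose second column holds last names:

```r
# Hypothetical data frame, for illustration only
people <- data.frame(
  id = 1:4,
  last_name = c("Hamel", "Smith", "Hamel", "Jones"),
  stringsAsFactors = FALSE
)

# By column position, as on the card
by_position <- subset(people, people[, 2] == "Hamel")

# By column name, which is usually easier to read
by_name <- subset(people, last_name == "Hamel")

# Bracket indexing selects the same rows
by_bracket <- people[people$last_name == "Hamel", ]
```

All three forms return the rows with ids 1 and 3.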
What is the kurtosis and skew of normal distribution
kurtosis = 3, skew = 0
Interpret Kurtosis
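if kurtosis > 3, the distribution has a sharper peak and heavier tails than a normal distribution
if kurtosis < 3, the distribution has a flatter peak and lighter tails than a normal distribution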
Interpret Skew
If skewness is less than −1 or greater than +1, the distribution is highly skewed.
If skewness is between −1 and −½ or between +½ and +1, the distribution is moderately skewed.
If skewness is between −½ and +½, the distribution is approximately symmetric.
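Base R has no built-in skewness or kurtosis function, so here is a minimal sketch using the moment-based definitions. The helper names `moment_skewness` and `moment_kurtosis` are made up for this example, and packaged implementations may apply small-sample corrections that give slightly different values:

```r
# Moment-based skewness: third central moment / (second central moment)^(3/2)
moment_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^(3 / 2)
}

# Moment-based kurtosis: fourth central moment / (second central moment)^2
# (a normal distribution gives 3 on this scale)
moment_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

symmetric <- c(1, 2, 3, 4, 5)       # hypothetical symmetric sample
right_skewed <- c(1, 1, 1, 2, 10)   # hypothetical sample with a long right tail

skew_symmetric <- moment_skewness(symmetric)
skew_right <- moment_skewness(right_skewed)
kurt_symmetric <- moment_kurtosis(symmetric)
```

The symmetric sample has skewness 0, while the sample with a long right tail has skewness above +1, so it counts as highly skewed under the rule above.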
Different type of correlation measurements
- Pearson: when x & y are continuous
- Point bi-serial: when 1 var is continuous, 1 is dichotomous
- Phi coefficient: when both vars are dichotomous
- Spearman: when both vars are ordinal (ranked data)
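A sketch of how each measure can be computed with base R's `cor()`, assuming made-up data. The point bi-serial and phi coefficients are numerically the Pearson correlation applied to variables coded 0/1, so no extra package is needed here:

```r
# Hypothetical data, for illustration only
hours   <- c(1, 2, 3, 4, 5, 6, 7, 8)           # continuous
score   <- c(52, 55, 61, 60, 68, 71, 75, 80)   # continuous
passed  <- c(0, 0, 0, 1, 1, 1, 1, 1)           # dichotomous
tutored <- c(0, 0, 1, 0, 1, 1, 0, 1)           # dichotomous

pearson  <- cor(hours, score, method = "pearson")
spearman <- cor(hours, score, method = "spearman")

# Point bi-serial: Pearson with one continuous and one 0/1 variable
point_biserial <- cor(score, passed)

# Phi coefficient: Pearson with two 0/1 variables
phi <- cor(passed, tutored)
```

Spearman is the Pearson correlation of the ranks, which is why it suits ordinal data.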
Homoscedasticity
Means the variance of the residuals is constant across all values of the variable (the spread of the errors does not grow or shrink with it)
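One rough way to see the difference, sketched with simulated data. The helper `spread_ratio` is made up for this example and is only an informal check, not a formal test:

```r
set.seed(1)
x <- 1:100

# Simulated responses: constant noise vs. noise that grows with x
y_const <- 2 * x + rnorm(100, sd = 5)
y_fan   <- 2 * x + rnorm(100, sd = 0.2 * x)

fit_const <- lm(y_const ~ x)
fit_fan   <- lm(y_fan ~ x)

# Hypothetical helper: residual spread in the upper half of x
# divided by residual spread in the lower half
spread_ratio <- function(fit, x) {
  r <- resid(fit)
  sd(r[x > median(x)]) / sd(r[x <= median(x)])
}

ratio_const <- spread_ratio(fit_const, x)  # near 1 suggests homoscedasticity
ratio_fan   <- spread_ratio(fit_fan, x)    # well above 1 suggests heteroscedasticity
```

Plotting `resid(fit_fan)` against `fitted(fit_fan)` shows the same thing visually: a fan shape instead of an even band.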
Assumptions when interpreting the Pearson correlation coefficient
- Normal distribution for x and y
- Linear relationship
- Homoscedasticity
- Reliability
- Validity
- Random Sample
What is reliability
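How consistently does a measure give the same result when repeated under the same conditions? (How closely a sample reflects the true population is representativeness, not reliability.)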