Statistics I Flashcards
Define Measurement!
The process of assigning a number to an attribute (or phenomenon) according to a (set of) rule(s).
Define Sample!
A collection of individual observations selected by a specific procedure.
Define Population!
Totality of individual observations about which inferences are to be made.
What’s a variable?
A symbol that stands for a value that may vary.
Statistics dealing with just one variable is called?
Univariate Statistics
Who introduced Student’s t-test, and when?
William Sealy Gosset, 1908 (publishing under the pseudonym “Student”)
What is Hotelling’s t-squared test in comparison to Student’s t-test?
Hotelling’s t-squared test is the multivariate generalization of Student’s t-test.
What’s the Type I error?
We reject H0, but it is true.
What’s the Type II error?
We fail to reject H0, but it is false.
What does H0 stand for?
Null hypothesis
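What’s the p-value with respect to H0?
The probability, assuming H0 is true, of obtaining a result at least as extreme as the one observed. It is compared against the significance level α, which is the accepted probability of a Type I error.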
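As a rough illustration of what a p-value measures, here is a minimal sketch for a two-sided one-sample z-test. The z-test itself, the function name `two_sided_p_value`, and the assumption of a known population standard deviation `sigma` are not part of these cards; they are chosen only because the calculation needs nothing beyond the Python standard library.

```python
import math
from statistics import NormalDist


def two_sided_p_value(sample_mean, mu0, sigma, n):
    """P(result at least this extreme | H0: population mean == mu0).

    Assumes a normally distributed population with known
    standard deviation sigma (an illustrative simplification).
    """
    # Standardized distance between the sample mean and the H0 mean.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Probability mass in both tails beyond |z| under the standard normal.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical numbers: sample mean 103 from n = 25, H0 mean 100, sigma 10.
p = two_sided_p_value(103, 100, 10, 25)
print(p < 0.05)  # decision at significance level alpha = 0.05
```

A small p-value says the observed data would be unusual if H0 were true. It is not the probability that H0 is true.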
How is Student’s t-test defined on Wikipedia?
A two-sample location test of the null hypothesis that the means of two normally distributed populations are equal.
Strictly speaking, the name “Student’s t-test” should only be applied if …
… the variances of the two tested populations are also assumed to be equal.
A “Student’s t-test” in which the variances of the two populations are not hypothesized to be equal is sometimes called …
… Welch’s t-test.
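The difference between the two variants can be sketched in a few lines. The function names `student_t` and `welch_t` and the sample data are hypothetical. The formulas are the usual pooled-variance statistic and the Welch statistic with Welch-Satterthwaite degrees of freedom.

```python
import math
import statistics


def student_t(a, b):
    """Two-sample t statistic assuming equal population variances."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances (n - 1)
    # Pool both sample variances into one estimate of the common variance.
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Two-sample t statistic without the equal-variance assumption."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


group_a = [1, 2, 3, 4, 5]    # illustrative data only
group_b = [2, 4, 6, 8, 10]
print(student_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

With equal group sizes the two t statistics coincide; only the degrees of freedom differ, and Welch’s are lower when the sample variances are unequal.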
What is a dependent variable?
The dependent variable represents the output or effect, or is tested to see if it is the effect.
What is an independent variable?
The independent variables represent the inputs or causes, or are tested to see if they are the cause.
Roles of dependent and independent variables in statistics:
In a statistical experiment, the dependent variable is the event studied and expected to change whenever the independent variable is altered.
In a study of whether taking vitamin C pills daily makes people live longer, researchers will dictate the vitamin C intake of a group of people over time. One part of the group will be given vitamin C pills daily. The other part of the group will be given a placebo pill. Nobody in the group knows which part they are in. The researchers will check the life span of the people in both groups. Independent and dependent variables?
Here, the dependent variable is the life span and the independent variable is a binary variable for the use or non-use of vitamin C.
In a study measuring the influence of different quantities of fertilizer on plant growth,
the independent variable would be …
The dependent variable would be …
The controlled variables would be …
… the amount of fertilizer used.
… the growth in height or mass of the plant.
… the type of plant, the type of fertilizer, the amount of sunlight the plant gets, the size of the pots, etc.
In a study of how different doses of a drug affect the severity of symptoms, a researcher could compare the frequency and intensity of symptoms when different doses are administered.
Independent and dependent variables?
Here the independent variable is the dose and the dependent variable is the frequency/intensity of symptoms.
In measuring the amount of color removed from beetroot samples at different temperatures, temperature is …
and amount of pigment removed is …
… the independent variable …
… the dependent variable.
In sociology, in measuring the effect of education on income or wealth, the dependent variable is …
and the independent variable is …
… level of income/wealth …
… the education level of the individual.
What’s variance?
The mean of the squared deviations from the mean; equal to the square of the standard deviation.
What’s standard deviation?
The square root of the variance; it indicates how far the values of a group typically lie from their mean.
What is the median?
The value such that half of the values are above it and half of them are below it (for an even number of values, the mean of the two middle values).
e.g. 56, 75, 84, 265, 266 -> median = 84
What’s the “mode”?
The value that occurs most often.
e.g. 1, 3, 2, 2, 3, 3, 3, 4, 3 -> mode = 3
2, 2, 3, 4, 10
mean?
median?
mode?
2, 2, 3, 4, 10
mean = 4.2
median = 3
mode = 2
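The three answers above can be checked with Python’s standard `statistics` module; this is just a quick sketch using the same five values.

```python
import statistics

data = [2, 2, 3, 4, 10]

mean = statistics.mean(data)      # sum of the values divided by their count
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequently occurring value

print(mean, median, mode)
```

The single large value 10 pulls the mean above the median, while the median and mode are unaffected by it.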
List three statistical measures of dispersion!
- Range
- Variance
- Standard Deviation
Another term for variance?
Mean squared deviation (from the mean)
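A minimal sketch of the three dispersion measures, reusing the values 2, 2, 3, 4, 10 from the earlier card. It assumes the cards mean the population variance (dividing by n); the sample variance (dividing by n - 1) is shown for contrast.

```python
import statistics

data = [2, 2, 3, 4, 10]

# Range: distance between the largest and smallest value.
value_range = max(data) - min(data)

# Variance written out as the mean of the squared deviations from the mean.
mean = statistics.mean(data)
mean_squared_deviation = sum((x - mean) ** 2 for x in data) / len(data)

# The same quantities from the standard library.
population_variance = statistics.pvariance(data)  # divides by n
population_sd = statistics.pstdev(data)           # square root of pvariance
sample_variance = statistics.variance(data)       # divides by n - 1

print(value_range, population_variance, population_sd, sample_variance)
```

Squaring the standard deviation gives back the variance, which is why the two always carry the same information in different units.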
A measure to represent the asymmetry of data?
Skewness
What are the two possible directions of skewness?
Right-skewed (positive skew, e.g. the distribution of wealth) and left-skewed (negative skew). On which side is the long tail in each case?
For a unimodal distribution, negative skew indicates that the tail on the left side of the probability density function is longer or fatter than the right side – it does not distinguish these shapes. Conversely, positive skew indicates that the tail on the right side is longer or fatter than the left side.
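The sign convention can be illustrated with the moment coefficient of skewness (third central moment divided by the 1.5th power of the second). The function name `skewness` and the data sets are illustrative assumptions, and other skewness estimators exist.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


right_tailed = [2, 2, 3, 4, 10]      # long tail towards large values
left_tailed = [-10, -4, -3, -2, -2]  # mirror image: long tail towards small values
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed) > 0)  # positive skew
print(skewness(left_tailed) < 0)   # negative skew
print(skewness(symmetric))         # zero for symmetric data
```

Cubing the deviations keeps their sign, so a few values far out on one side dominate the sum and decide the sign of the result.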