Week 1 Flashcards
Five stages of the statistical process.
Hypothesis - Question of interest
Study Design - how to collect information
Collect Data
Analyse Data - descriptive statistics (understanding), inference (modelling)
Present Results
What are the six principles of statistics?
-collection
-classification
-organisation
-analysis
-interpretation
-presentation of information
Explanatory variable.
Variable that we think may explain changes in the response variable. We observe differences in the explanatory variable and note whether they are related to differences in the response variable.
Define population.
Entire set of objects of interest.
Define sample.
Subset of the set of objects of interest.
What is statistical inference?
Analysing a sample from a population to make inferences about that population.
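A minimal sketch of the idea, assuming a made-up population of 10,000 simulated heights (the numbers 170 and 10 are illustrative only): the mean of a random sample of 100 is used as an estimate of the population mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in cm.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Draw a simple random sample of 100 without replacement.
sample = random.sample(population, 100)

# The sample mean is our estimate of the (usually unknown) population mean.
print("population mean:", round(statistics.mean(population), 2))
print("sample mean:    ", round(statistics.mean(sample), 2))
```

In practice only the sample is available, so the population mean printed here is the quantity the sample mean is trying to estimate.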
What is a designed experiment? Give an example.
Researcher exerts control over the experimental units. E.g. Researcher gives medicine to some patients and a placebo to others.
What is an observational study? Give an example.
Researcher observes the experimental units and records the variables of interest. E.g. To assess study habits, students are asked how they prepared for an exam and then their grades are compared.
What is quantitative data?
Measurements that are recorded on a natural numerical scale.
What is continuous data?
Quantitative data whose measurements can fall anywhere on the real line (or within an interval of it).
What is discrete data?
Quantitative data whose measurements can only take values from a finite or countable set, e.g. counts.
What is qualitative data?
Measurements that cannot be recorded on a natural numerical scale.
What is nominal data?
Qualitative data with no meaningful ordering.
What is ordinal data?
Qualitative data with an inherent order.
What does a bar chart represent?
Shows the distribution of categorical values.
What does a histogram represent?
Shows the distribution of numerical values.
What does a scatterplot represent?
Shows the relationship between two numerical variables.
What are bins?
Intervals on the x-axis of a histogram into which the values are grouped. The height of each bar (y-axis) is the frequency of values falling in that bin.
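As a rough illustration of binning, the sketch below counts how many values fall into each equal-width interval, which is the count a histogram would draw as bar heights. The data, the bin width and the helper name bin_counts are all made up for this example.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall into each equal-width bin.

    Bin k covers the half-open interval
    [start + k*width, start + (k+1)*width).
    Values outside the overall range are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < n_bins:
            counts[k] += 1
    return counts

heights = [151, 158, 160, 162, 167, 169, 171, 174, 178, 183]  # made-up data
counts = bin_counts(heights, start=150, width=10, n_bins=4)

# Print a crude text histogram: one '#' per value in the bin.
for k, c in enumerate(counts):
    lo = 150 + 10 * k
    print(f"[{lo}, {lo + 10}): {'#' * c}")
```

Changing the bin width changes the shape of the histogram, so it is worth trying more than one width on the same data.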
What are the measures of location? (4)
Mean
Median
Mode
Percentiles
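The four measures above can be computed with Python's standard statistics module. This is a sketch on made-up marks. Note that several quartile conventions exist; the "inclusive" method used here is one choice and other software may give slightly different values.

```python
import statistics

marks = [52, 61, 61, 68, 70, 74, 79, 85, 89]  # made-up data

mean = statistics.mean(marks)      # arithmetic average
median = statistics.median(marks)  # middle value of the sorted data
mode = statistics.mode(marks)      # most frequent value

# quantiles(n=4) returns the three cut points Q1, Q2, Q3,
# i.e. the 25th, 50th and 75th percentiles.
q1, q2, q3 = statistics.quantiles(marks, n=4, method="inclusive")

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("quartiles:", q1, q2, q3)
```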
What are the measures of spread?
Variance
Standard Deviation
Range
IQR
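A short sketch of the four measures of spread on made-up marks, using Python's standard statistics module. The quartiles use the "inclusive" method, which is an assumption; other conventions give slightly different IQR values.

```python
import statistics

marks = [52, 61, 61, 68, 70, 74, 79, 85, 89]  # made-up data

variance = statistics.variance(marks)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(marks)      # sample standard deviation
data_range = max(marks) - min(marks)   # largest value minus smallest value

# IQR is the distance between the 3rd and 1st quartiles.
q1, _, q3 = statistics.quantiles(marks, n=4, method="inclusive")
iqr = q3 - q1

print("variance:", variance)
print("standard deviation:", round(std_dev, 3))
print("range:", data_range)
print("IQR:", iqr)
```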
Explain box plots.
Median is the horizontal line in the box.
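Top of box is the 3rd quartile (Q3).
Bottom of box is the 1st quartile (Q1).
Arms (whiskers) stretch to the most extreme values lying within 1.5 times the IQR below Q1 and above Q3 respectively.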
Dots outside arms represent outliers.
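The numbers behind a box plot can be worked out by hand. The sketch below assumes the common convention that whiskers reach the most extreme values within 1.5 times the IQR of the box, and uses made-up data with one extreme value.

```python
import statistics

data = [2, 4, 5, 6, 7, 8, 9, 10, 30]  # made-up data with one extreme value

# Box: first quartile, median, third quartile.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences: values beyond these limits are drawn as outlier dots.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme values that are still inside the fences.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("box:", q1, median, q3)
print("whiskers:", whisker_low, whisker_high)
print("outliers:", outliers)
```

Here the value 30 lies above the upper fence of 15, so it would be plotted as a dot rather than being reached by the upper whisker.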
Define standard deviation.
Records the root mean squared deviation of values from the mean.
What is the relationship between sample standard deviation and sample variance?
sample variance = (sample standard deviation) ^2
Formula for sample s.d.
s = root( 1/(n-1) * sum( (xi - Xbar)^2 ) )
where xi is each value. Xbar is sample mean. n is sample size.
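To make the sample formula concrete, here is a minimal from-scratch sketch: each deviation from the sample mean is squared first, the squares are summed, the sum is divided by n - 1, and the square root is taken last. The function name sample_sd and the data are made up for illustration; the result is compared against the standard library's statistics.stdev.

```python
import math
import statistics

def sample_sd(values):
    """Sample standard deviation: root of squared deviations over n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    x_bar = sum(values) / n                                # sample mean
    squared_devs = sum((x - x_bar) ** 2 for x in values)   # square each deviation
    return math.sqrt(squared_devs / (n - 1))

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data
print(sample_sd(values))
print(statistics.stdev(values))    # library result for comparison
```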
Formula for population s.d.
sigma = root( 1/N * sum( (xi - mu)^2 ) )
where xi is each value. mu is population mean. N is population size.
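The population version differs from the sample version only in dividing by N instead of n - 1. A minimal sketch, with a made-up function name population_sd and made-up data treated as a complete population:

```python
import math
import statistics

def population_sd(values):
    """Population standard deviation: root of squared deviations over N."""
    size = len(values)
    if size == 0:
        raise ValueError("need at least one value")
    pop_mean = sum(values) / size  # population mean
    return math.sqrt(sum((x - pop_mean) ** 2 for x in values) / size)

population = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, treated as the whole population
print(population_sd(population))        # from-scratch result
print(statistics.pstdev(population))    # library result for comparison
```

For the same data, dividing by N gives a slightly smaller value than dividing by n - 1.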
Formula for sample variance.
s^2 = 1/(n-1) * sum( (xi - Xbar)^2 )
where xi is each value. Xbar is sample mean. n is sample size.
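A small worked sketch of the sample variance on five made-up values, small enough to check by hand: the mean is 3, the squared deviations are 4, 1, 0, 1, 4, their sum is 10, and dividing by n - 1 = 4 gives 2.5. The function name sample_variance is made up for illustration.

```python
import math

def sample_variance(values):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

values = [1, 2, 3, 4, 5]  # made-up data
s_squared = sample_variance(values)
s = math.sqrt(s_squared)  # the sample s.d. is the square root of the variance

print("sample variance:", s_squared)
print("sample s.d.:", round(s, 3))
```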