Displaying Data, Stats and Errors Flashcards
Understand how data can be displayed
What are the three main ways of collecting data?
- Polls
- Experiments
- Observational studies
What is a sample?
A subset of the target population
What are the benefits of random sampling?
- Gives every individual in the target population an equal chance of being chosen for the sample
- Avoids bias
- Allows the likely size of sampling errors to be calculated
- Sampling error decreases as the sample size increases
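The last benefit can be checked with a small simulation. This is a minimal sketch, assuming a made-up population of 10,000 normally distributed values; the names `population` and `spread_of_sample_means` are illustrative only. It measures sampling error as the spread of the sample mean across repeated random samples.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(50, 10) for _ in range(10_000)]  # hypothetical population

def spread_of_sample_means(sample_size, repeats=200):
    """Standard deviation of the sample mean across repeated random samples."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]
    return statistics.stdev(means)

small_sample_error = spread_of_sample_means(10)
large_sample_error = spread_of_sample_means(400)
print(f"n = 10:  {small_sample_error:.2f}")
print(f"n = 400: {large_sample_error:.2f}")
```

The spread for samples of 400 should come out clearly smaller than the spread for samples of 10.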
Describe precision
- Implies that the value of the statistic is similar in all samples
Describe bias
- Implies that the sample statistic differs systematically from the true population value
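The difference between the two ideas can be illustrated with a simulation. This is a sketch under stated assumptions: the population is the made-up values 1 to 1000, and the "biased" scheme is a hypothetical one that can only ever reach the 500 smallest values. Bias is measured as the average offset of the sample mean from the true mean; precision is judged by the spread of the sample means.

```python
import random
import statistics

rng = random.Random(7)                     # fixed seed so the run is repeatable
population = list(range(1, 1001))          # hypothetical values 1..1000
true_mean = statistics.mean(population)    # 500.5

def sample_means(pool, sample_size=50, repeats=300):
    """Means of repeated random samples drawn from the given pool."""
    return [statistics.mean(rng.sample(pool, sample_size)) for _ in range(repeats)]

random_means = sample_means(population)        # every individual can be chosen
biased_means = sample_means(population[:500])  # only the 500 smallest can be chosen

random_bias = statistics.mean(random_means) - true_mean
biased_bias = statistics.mean(biased_means) - true_mean
random_spread = statistics.stdev(random_means)
biased_spread = statistics.stdev(biased_means)

print(f"random: bias {random_bias:8.1f}, spread {random_spread:.1f}")
print(f"biased: bias {biased_bias:8.1f}, spread {biased_spread:.1f}")
```

The biased scheme gives similar values from sample to sample (it is precise), yet every estimate sits far below the true mean (it is biased).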
Describe the process of systematic random sampling with a random start
- Take the size of the population (N) and the size of the sample (n)
- Calculate the fixed sampling interval (K = N/n)
- Randomly pick a starting number between 1 and K
- Sample the individual at that position, then every Kth individual after it (start + K, start + 2K, and so on)
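The steps above can be sketched in a few lines. This is a minimal illustration, assuming the population size divides evenly by the sample size; the function name `systematic_sample` and the population of 100 numbered individuals are made up for the example.

```python
import random

def systematic_sample(population, sample_size, rng):
    """Systematic random sampling with a random start."""
    k = len(population) // sample_size   # fixed sampling interval
    start = rng.randint(1, k)            # random start between 1 and k, inclusive
    # Take the individual at the start position, then every k-th one after it.
    return [population[start - 1 + i * k] for i in range(sample_size)]

population = list(range(1, 101))         # hypothetical individuals numbered 1..100
sample = systematic_sample(population, 10, random.Random(1))
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the interval.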
The process of splitting a population into groups or subsets (strata) and then sampling randomly within each group is called…
Stratified random sampling
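A minimal sketch of the idea, assuming three made-up age strata and the same sampling fraction in every stratum (proportional allocation, which is one common choice and not the only one). The function name `stratified_sample` is hypothetical.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw a simple random sample of the same fraction from every stratum."""
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical strata of different sizes.
strata = {
    "under_30": [f"u{i}" for i in range(60)],
    "30_to_60": [f"m{i}" for i in range(30)],
    "over_60": [f"o{i}" for i in range(10)],
}
chosen = stratified_sample(strata, 0.1, random.Random(0))
sizes = {name: len(members) for name, members in chosen.items()}
print(sizes)
```

Every stratum is represented in the sample, in proportion to its size.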
Name the 8 types of non-sampling errors
- Selection Bias
- Self-selection Bias
- Interviewer effects
- Non-response Bias
- Question effects
- Survey format/conduct
- Behavioural considerations
- Transferring Findings
What is the term for taking a group in an experiment and splitting them up (by age, for example)?
Blocking of experimental units
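A sketch of blocking in practice. The flashcard only describes splitting units into blocks; the random assignment of treatment and control within each block is an assumption here, following the usual randomized block design. The unit names, the two age blocks and the function `assign_within_blocks` are all made up for illustration.

```python
import random
from collections import defaultdict

def assign_within_blocks(units, rng):
    """Group units by block, then randomly split each block between two arms."""
    blocks = defaultdict(list)
    for name, block in units:
        blocks[block].append(name)

    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)            # random order within the block
        half = len(shuffled) // 2
        for name in shuffled[:half]:
            assignment[name] = "treatment"
        for name in shuffled[half:]:
            assignment[name] = "control"
    return assignment

# Hypothetical units: six participants under 40 and six aged 40 or over.
units = [(f"p{i}", "under_40" if i < 6 else "40_plus") for i in range(12)]
assignment = assign_within_blocks(units, random.Random(3))
print(assignment)
```

Each age block ends up with the same number of treated and control units, so age cannot explain a difference between the two arms.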
What are the two types of observational studies?
- Prospective (for future events)
- Retrospective (for past events)
Why might you use an observational study?
If it is impossible, unethical or impractical to conduct an experiment
What is a confounding variable?
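A factor not accounted for in the study that is related to both the explanatory and response variables, and so introduces a difference in outcomes that can be mistaken for the effect being studied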
What are the three types of plot sampling?
- Completely random sampling
- Systematic grid, random sampling
- Systematic grid, systematic sampling with a random start
What are the descriptions of response and explanatory variables?
- Response: the variable we would like to predict
- Explanatory: a variable that helps us explain or predict the response variable
What are the two types of quantitative data variables?
- Continuous (can take any value within a range, so infinitely many possible values)
- Discrete (distinct, countable values)
What are the two types of qualitative data variables?
- Ordinal (non-numerical categories with a relative order, like good or bad)
- Nominal (categories distinct by name only, like green or October)
What is the interquartile range?
The difference between the 75th percentile (third quartile) and the 25th percentile (first quartile)
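A short worked example, assuming a small made-up data set. It uses `statistics.quantiles` from the Python standard library with its default "exclusive" method; other quartile conventions can give slightly different cut points for the same data.

```python
import statistics

data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 15]   # hypothetical measurements

# n=4 splits the data into quarters and returns the three cut points.
q1, median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```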
Describe Histograms
- Data partitioned into bins on the x axis
- Number of data points in each bin on the y axis
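The binning step can be sketched without a plotting library. This is a minimal illustration, assuming equal-width bins, made-up data, and values that all lie at or above the lower edge; the function name `histogram_counts` is hypothetical.

```python
def histogram_counts(values, low, bin_width, n_bins):
    """Count how many values fall into each equal-width bin starting at low."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - low) // bin_width)
        index = min(index, n_bins - 1)   # values at the top edge go in the last bin
        counts[index] += 1
    return counts

values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 7, 8, 9]   # hypothetical data
counts = histogram_counts(values, low=0, bin_width=2, n_bins=5)

# Text version of the plot: bins down the side, one '#' per data point.
for i, count in enumerate(counts):
    print(f"{i * 2:>2} to {i * 2 + 2:<2} | {'#' * count}")
```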
Describe the shape of a histogram when Mean > Median, Mean < Median, and Mean = Median
- Mean > Median: right skewed
- Mean < Median: left skewed
- Mean = Median: symmetrical
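The link between skew and the mean and median can be checked on small made-up data sets. This is a sketch of the rule of thumb only; comparing the mean with the median is a quick guide to skew and not a formal test, and the function name `shape_from_mean_and_median` is hypothetical.

```python
import statistics

def shape_from_mean_and_median(data):
    """Rule-of-thumb shape label from comparing the mean with the median."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "right skewed"
    if mean < median:
        return "left skewed"
    return "symmetrical"

right_tail = [1, 2, 2, 3, 3, 4, 20]          # one large value pulls the mean up
left_tail = [-20, -4, -3, -3, -2, -2, -1]    # one small value pulls the mean down
balanced = [1, 2, 3, 4, 5, 6, 7]

for data in (right_tail, left_tail, balanced):
    print(data, "->", shape_from_mean_and_median(data))
```

Extreme values in the long tail drag the mean toward them, while the median barely moves.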
What are some of the possible ways to visually display data?
- Histogram
- Box plot
- Violin plot
- Quilt plot
- Bar chart
- Pie Chart
What are the lines that extend out of a box plot?
Whiskers. By common convention they extend to the most extreme data points that lie within 1.5 times the interquartile range of the edges of the box
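A worked example of where the whiskers end, assuming the common 1.5 times IQR convention and a made-up data set with one extreme value. Plotting packages differ in their quartile method and whisker rule, so treat this as one possible calculation.

```python
import statistics

data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 30]    # hypothetical data, 30 is extreme

q1, _, q3 = statistics.quantiles(data, n=4)     # default "exclusive" method
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers stop at the most extreme data points that are still inside the fences.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"whiskers: {whisker_low} to {whisker_high}, outliers: {outliers}")
```

Points beyond the fences are usually drawn individually instead of being covered by a whisker.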
Describe the Normal distribution
- Bell shaped
- Defined by two parameters, mean and variance
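A short sketch using `statistics.NormalDist` from the Python standard library. The mean of 100 and variance of 225 are made-up values; note that `NormalDist` takes the standard deviation (the square root of the variance) as its second parameter.

```python
import math
from statistics import NormalDist

mean, variance = 100.0, 225.0                   # hypothetical parameters
dist = NormalDist(mu=mean, sigma=math.sqrt(variance))
sd = dist.stdev                                 # 15.0

# Bell shape: the density is symmetric about the mean and highest at the mean.
left_height = dist.pdf(mean - 10)
right_height = dist.pdf(mean + 10)
peak_height = dist.pdf(mean)

# Proportion of values within one standard deviation of the mean.
within_one_sd = dist.cdf(mean + sd) - dist.cdf(mean - sd)

print(f"density at mean -10 and +10: {left_height:.5f}, {right_height:.5f}")
print(f"within one standard deviation: {within_one_sd:.3f}")
```

About 68% of values fall within one standard deviation of the mean, whatever the mean and variance are.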
Describe what confidence intervals are used to represent
A range of values within which we are confident, at a stated level (commonly 95% or 99%), that the true value of the mean or other population statistic lies
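A minimal sketch of an interval for a mean, assuming a made-up sample of 50 values and the normal approximation (mean plus or minus z times the standard error). For small samples a t-distribution is normally used instead; the function name `mean_confidence_interval` is hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return mean - z * standard_error, mean + z * standard_error

rng = random.Random(5)
sample = [rng.gauss(20, 4) for _ in range(50)]       # hypothetical measurements

low95, high95 = mean_confidence_interval(sample, 0.95)
low99, high99 = mean_confidence_interval(sample, 0.99)
print(f"95% interval: {low95:.2f} to {high95:.2f}")
print(f"99% interval: {low99:.2f} to {high99:.2f}")
```

Asking for more confidence (99% instead of 95%) gives a wider interval from the same data.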