Lecture 8 Flashcards

1
Q

What is a variable?

A

Any quantity that can be measured

2
Q

Finish the sentence “In a dataset there will be ___ of a variable for each individual in the sample”

A

Observations

3
Q

What is central tendency?

A

The typical value of a variable

4
Q

What is dispersion?

A

How far from the typical value the individual observations of a variable are

5
Q

What is an association?

A

How a variable relates to another variable

6
Q

What are inferential statistics?

A

Statistics used to draw conclusions about the parameters of a population from sample data

7
Q

What are parameters?

A

Characteristics of the population (e.g. the population mean)

8
Q

What estimates the parameters?

A

Statistics computed from a sample

9
Q

What is probability?

A

The chance that a particular event will occur

10
Q

What is a sampling distribution?

A

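The probability distribution of a sample statistic across all possible samples of a given size; it tells us how likely we are to obtain the statistic observed in our sample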

11
Q

What is hypothesis testing?

A

Testing whether the sample data support our beliefs (hypotheses) about the population

12
Q

When do we use descriptive statistics?

A

To summarise sample data

13
Q

When do we use statistical inference?

A

To generalise about population parameters

14
Q

What determines or influences what statistical methods we can apply?

A

The level of measurement of the data

15
Q

What are descriptive statistics for?

A

To summarise the key features of data.
- To make it understandable for human readers
- To identify characteristics
- To identify patterns
- To provide basis for further analysis

16
Q

What are the measures of central tendency?

A

Mean (x̄), median (M), mode (Z)

17
Q

What is a measure of central tendency?

A

Single number that represents the ‘typical’ value of a variable (an average: mean, median, mode)

18
Q

How would you visualise data?

A

With frequency tables and charts, e.g. bar charts and histograms
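
A frequency table simply counts how often each category occurs. A minimal sketch in Python, assuming a made-up list of survey responses (`responses` and its values are hypothetical, not from the lecture):

```python
from collections import Counter

# Hypothetical nominal data: how six people travel to work.
responses = ["bus", "car", "bus", "walk", "bus", "car"]

# Counter builds the frequency table: category -> number of observations.
frequency = Counter(responses)

# Print the table, most frequent category first, with a crude text bar.
for category, count in frequency.most_common():
    print(f"{category:<6}{count:>3}  {'#' * count}")
```

The row of `#` characters is only a stand-in for a bar chart; a plotting library would normally be used for a real figure.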

19
Q

What is skewness?

A

Asymmetry in a distribution: a relatively higher proportion of values sits at the low (left) or high (right) end of the range (on the graph)

20
Q

How can you best visualise skewness?

A

By comparing the values of the mean, median and mode on a histogram

21
Q

What does a normal distribution look like?

A

Evenly spread above and below the mean (bell shape)

22
Q

Which side does a positive skew lean towards?

A

Right: the long tail extends to the right, while most values sit at the low end

23
Q

Which side does a negative skew lean towards?

A

Left: the long tail extends to the left, while most values sit at the high end

24
Q

What is the mean the best representation of?

A

The average in most cases of continuous data

25
What does the median identify?
The central point
26
What is the median useful for?
Skewed data (it is less affected by extreme values), or continuous variables measured on subjective scales
27
What is the mode suitable for?
Nominal data or grouped data
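
The three averages can be compared on one small dataset. This is only an illustrative sketch: the `hours` values are invented, and Python's standard `statistics` module is used for the calculations.

```python
import statistics

# Hypothetical sample: weekly study hours for seven students.
# The value 19 is an extreme observation at the high end.
hours = [2, 3, 3, 4, 5, 6, 19]

mean_hours = statistics.mean(hours)      # sum of the values / number of values
median_hours = statistics.median(hours)  # middle value once the data are sorted
mode_hours = statistics.mode(hours)      # most frequently occurring value

print("mean:", mean_hours, "median:", median_hours, "mode:", mode_hours)
```

Here the single extreme value pulls the mean (6) above the median (4), which is why the median is often preferred when data are skewed.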
28
What does dispersion measure?
How far, on average, each observation is from the central tendency (mean)
29
What does the dispersion figure represent?
The variation in values within a variable
30
What do lower values of dispersion indicate?
That the central tendency (mean) is a better representation of the 'typical value' (more accurate)
31
What do the range and interquartile range provide?
A basic measure, useful for visualisation and identifying outliers
32
Why should we use variance and standard deviation?
They use every observation in the dataset, so they are more statistically powerful measures
33
What's the interquartile range?
The range of the middle 50% of values (the difference between the medians of the upper and lower halves of the data)
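
The "median of each half" description can be sketched as below. The data are hypothetical, and this is one of several quartile conventions; other methods (including the default of `statistics.quantiles`) can give slightly different values.

```python
import statistics

# Hypothetical data, sorted so it can be split into halves.
values = sorted([1, 3, 4, 6, 7, 8, 10, 12])

# Split into lower and upper halves. With an odd number of values,
# this slicing leaves the middle value out of both halves.
half = len(values) // 2
lower_half = values[:half]
upper_half = values[-half:]

q1 = statistics.median(lower_half)  # lower quartile
q3 = statistics.median(upper_half)  # upper quartile
iqr = q3 - q1                       # spread of the middle 50% of values

print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
```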
34
What is variance?
The mean of the squared differences between each data point and the mean
35
What is standard deviation?
Square root of the variance (most common measure of dispersion)
36
What can measures of dispersion not be applied to?
Nominal variables
37
What is a good visual form for understanding dispersion of a variable and identifying outliers?
Box plots
38
What is a box plot outlier?
A value that lies outside the limits (whiskers) of the box plot
39
How do you calculate variance?
Mean of the squared differences between each value in the dataset and the mean
40
How do you calculate standard deviation?
Square root of the variance
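
A worked sketch of both calculations, following the definitions on these cards step by step. The `values` list is made up, and the code divides by n (the "mean of squared differences" form given here); the sample version used in many textbooks divides by n - 1 instead.

```python
import math

# Hypothetical dataset.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: the mean.
mean = sum(values) / len(values)

# Step 2: squared difference between each value and the mean.
squared_diffs = [(x - mean) ** 2 for x in values]

# Step 3: variance is the mean of those squared differences (divide by n).
variance = sum(squared_diffs) / len(values)

# Step 4: standard deviation is the square root of the variance.
sd = math.sqrt(variance)

print("mean:", mean, "variance:", variance, "SD:", sd)
```

Because the standard deviation is back in the original units of the variable, it is usually easier to interpret than the variance.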
41
What does the measure of association consider?
The relationship between two variables
42
What does kurtosis mean?
Flatness: how peaked or flat a distribution is
43
What does a large SD (standard deviation) indicate?
A wide, flat distribution
44
What does a small SD (standard deviation) indicate?
A narrow, peaked distribution
45
What does standard deviation tell us about in terms of distribution?
How flat (spread out) or narrow the distribution is
46
What's the statistic for association between categorical data?
Chi-squared (χ²)
47
What's the statistic for association between continuous data?
Pearson's correlation coefficient (r)
48
What does Chi-Squared measure?
The association between two categorical variables
49
What does chi-squared compare?
The observed frequencies in the sample data with the frequencies we would expect if there were no relationship
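
The observed-versus-expected comparison can be sketched for a 2 x 2 table. The counts in `observed` are invented for illustration; the expected count for each cell is (row total x column total) / grand total, which is what we would see if the two variables were unrelated.

```python
# Hypothetical observed counts: rows are one categorical variable,
# columns are the other.
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected frequency in each cell if there were no relationship.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Chi-squared: sum over all cells of (observed - expected)^2 / expected.
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print("expected:", expected)
print("chi-squared:", round(chi_squared, 3))
```

The larger the statistic, the further the observed counts are from what "no relationship" would predict; whether it is large enough is decided by a significance test.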
50
What does correlation (r) measure?
The strength (magnitude) and direction (sign) of the linear association between two variables
51
What does a positive correlation mean?
Increase in one variable associated with an increase in the other
52
What does a negative correlation mean?
Increase in one variable associated with a decrease in the other
53
What does 0 correlation mean?
No linear association
54
What does Chi-Squared rely on?
Testing for statistical significance
55
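What is statistical significance?
A result is statistically significant when it would be unlikely to occur by chance alone if there were no real effect or association in the population
56
What is a critical value?
How far from the expected centre a test statistic needs to be before we say the result is unusual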
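
As an illustration of a critical value, the sketch below finds the two-tailed cut-off on the standard normal distribution. It assumes a 5% significance level and an invented test statistic; `alpha` and `test_statistic` are hypothetical names chosen for this example.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level (the conventional 5%)

# Two-tailed critical value on the standard normal distribution:
# the point that leaves alpha / 2 of the probability in each tail.
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

test_statistic = 2.3  # hypothetical Z statistic computed from a sample

# The result counts as "unusual" if it lies beyond the critical value.
is_significant = abs(test_statistic) > critical_z

print(round(critical_z, 2), is_significant)
```

The critical value here is about 1.96, so a statistic of 2.3 lies beyond it and would be judged statistically significant under these assumptions.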
57
What is correlation used for?
Ordinal and scale data (continuous)
58
What is chi-squared used for?
Nominal (Categorical) data
59
What is covariance?
The degree to which two variables deviate from their expected values (mean) in similar ways
60
What does a positive covariance indicate?
Variables that tend to ‘move together’ away from their means: if we observe a high value of x, we also expect to see a high value of y
61
What's a scatter plot good for?
Checking if there is a linear relationship between two variables
62
What does negative covariance indicate?
Variables that move in opposite directions: if we observe a high value of x, we expect to see a value of y below its mean
63
What is a strong correlation's r value?
Around r = ±0.8 or beyond
64
What is a weak correlation's r value?
Around r = ±0.3 or closer to 0
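
Covariance and Pearson's r can be computed by hand on a small paired dataset. This is a sketch with made-up values of `x` and `y`; it divides by n throughout, and r comes out the same whether n or n - 1 is used as long as the choice is consistent.

```python
import math

# Hypothetical paired observations of two continuous variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Covariance: average product of the paired deviations from each mean.
covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

# Standard deviations of each variable (dividing by n, to match).
sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)

# Pearson's r: covariance rescaled to lie between -1 and +1.
r = covariance / (sd_x * sd_y)

print("covariance:", round(covariance, 3), "r:", round(r, 3))
```

The covariance is positive because high values of x tend to occur with high values of y; dividing by the two standard deviations removes the units, which is what makes r comparable across datasets.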
65
What is an omitted variable?
An unmeasured third factor that could lead to changes in both X and Y
66
What is reverse causality?
A change in Y leads to the change in X, rather than X leading to Y
67
What is sample selection bias?
When the individuals sampled have a different tendency to show the association than the whole population
68
What's a measurement error?
Values in the data that differ from the true value of the variable
69
What does association NOT imply?
Causation
70
What are alternative reasons for finding a relationship?
Omitted variables, reverse causality, sample selection bias and measurement error
71
What does the test we use depend on?
Data meeting certain assumptions
72
What does the statistical inference process rely on?
Estimating the probability of obtaining our sample results, based on the distribution of sample statistics and population parameters.
73
What's the standard normal (Z) distribution's mean and SD value?
Mean = 0 SD = 1
74
What does a value's Z-score represent?
How many standard deviations it lies from the mean
75
What does a higher Z-score mean?
A lower probability of observing a value that extreme
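
A short sketch of a Z-score and the probability attached to it, assuming a normally distributed variable. The `mean`, `sd` and `value` figures are hypothetical; `NormalDist` is from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Hypothetical population: mean 100, standard deviation 15.
mean = 100
sd = 15
value = 130

# Z-score: how many standard deviations the value lies from the mean.
z = (value - mean) / sd

# Probability of observing a value at least this high, using the
# standard normal (Z) distribution with mean 0 and SD 1.
upper_tail = 1 - NormalDist().cdf(z)

print("z:", z, "P(Z >= z):", round(upper_tail, 4))
```

With these numbers the value sits 2 standard deviations above the mean, and only around 2% of observations would be expected to fall that high or higher.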
76
What is the central limit theorem?
The sampling distribution of the mean will be approximately normally distributed, whatever the shape of the population distribution, as long as the sample size is large enough
77
What is the 'sampling distribution of the sample means'?
The distribution consisting of the means of all random samples of a given size (n) that can be drawn from a population.
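
The idea can be explored with a small simulation. This is a rough sketch under stated assumptions: the population is an invented, strongly skewed one (exponential), and 1,000 random samples stand in for "all" possible samples; the seed, sizes and variable names are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

# Hypothetical skewed population of 10,000 values (exponential, mean near 1).
population = [random.expovariate(1.0) for _ in range(10_000)]

sample_size = 50      # n, the size of each sample
number_of_samples = 1_000

# Draw many random samples and record the mean of each one.
sample_means = [
    statistics.mean(random.sample(population, sample_size))
    for _ in range(number_of_samples)
]

pop_mean = statistics.mean(population)
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print("population mean:", round(pop_mean, 3))
print("mean of sample means:", round(mean_of_means, 3))
print("SD of sample means:", round(sd_of_means, 3))
```

The sample means cluster around the population mean and vary far less than the individual values do; plotting `sample_means` as a histogram would show a roughly bell-shaped distribution even though the population itself is skewed.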
78
What is the alternative hypothesis (H1)?
What we believe the data will support
79
What's the null hypothesis (H0)?
It covers all states we want to disprove (typically that there is no effect or association)
80
What is the conventional statistical significance level?
0.05 (a significance level of 5%)
81
What are the 4 levels of measurement of data?
Nominal, ordinal, interval and ratio
82
What do larger samples provide?
More reliable results
83