Variables, Data and Statistics Flashcards

1
Q

What are variables?

A

A feature (characteristic) of the population that is of interest

2
Q

What kinds of qualitative data are there?

A
  • Nominal
    • Eye colour, job
  • Ordinal (inherent order)
    • Rank teaching as poor/fair/good/very good
    • Order needs to be preserved
    • Age bands (e.g. 18-25, 26-30)
3
Q

What kinds of quantitative data are there?

A
  • Discrete (count)
    • In almost every case, whole numbers
    • Number of people
    • Age (as of last birthday)
    • Bar charts
  • Continuous (interval)
    • Things we’ve measured
    • Height, weight, exam marks, incomes
    • Age (exact)
    • Histograms
  • Ratio data
    • Data that have all the characteristics of continuous data but also have a true zero point
4
Q

What is the notation for population average?

A

µ

5
Q

What is the notation for population variance?

A

σ²

6
Q

What is the notation for population standard deviation?

A

σ

7
Q

What is the notation for sample average?

A

X̅

8
Q

What is the notation for sample variance?

A

s²

9
Q

What is the notation for sample standard deviation?

A

s

10
Q

What is the notation for sample correlation?

A

r

11
Q

What is the notation for sample proportion?

A

p̂

12
Q

What is the notation for population proportion?

A

p

13
Q

Which measure of centre is best?

A
  • Mean generally most commonly used but it is sensitive to extreme values
  • If the data are skewed or extreme values are present, the median is better because it is robust to outliers (e.g. real estate prices)
  • Mode generally best for categorical data (ratings etc)
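As a rough illustration of the points above, the sketch below compares the three measures on made-up data. The house prices and ratings are hypothetical values chosen only to show the effect of one extreme value; nothing here comes from a real data set.

```python
from statistics import mean, median, mode

# Hypothetical house prices (in $1000s); the last value is an extreme outlier.
prices = [300, 320, 340, 360, 2000]

print(mean(prices))    # 664: pulled upward by the single extreme value
print(median(prices))  # 340: unchanged by how extreme the largest value is

# Hypothetical categorical ratings: the mode is the most frequent category.
ratings = ["good", "fair", "good", "poor", "good"]
print(mode(ratings))   # good
```

Replacing 2000 with 400 would change the mean but leave the median at 340, which is what "robust to outliers" means in practice.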
14
Q

Describe the relationship between the mean and the median

A
  • If symmetric, mean = median
  • If positive skew, mean > median
  • If negative skew, mean < median
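A small check of these three cases, using invented data sets (the values are assumptions picked so that each list has an obvious shape):

```python
from statistics import mean, median

# Hypothetical data sets, one per shape.
symmetric = [1, 2, 3, 4, 5]
positive_skew = [1, 2, 3, 4, 20]    # long right tail
negative_skew = [-14, 2, 3, 4, 5]   # long left tail

for name, data in [("symmetric", symmetric),
                   ("positive skew", positive_skew),
                   ("negative skew", negative_skew)]:
    print(name, "mean =", mean(data), "median =", median(data))
```

All three lists share a median of 3; only the mean moves, and it moves toward the long tail.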
15
Q

What is the coefficient of variation?

A
  • A measure of the spread of the data relative to the level of the centre
  • Equal to the standard deviation divided by the mean
  • Called cv
  • cv = s/X̅
  • Sometimes multiplied by 100 and reported as a percentage
  • Unitless (not in the same units as the data)
  • Especially useful when comparing two or more sets of data that are measured in different units
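A minimal sketch of the calculation, assuming the sample standard deviation (n - 1 divisor) and a non-zero mean. The function name `coefficient_of_variation` and the height and weight figures are hypothetical, used only to compare two variables measured in different units.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Return cv = s / x-bar, using the sample standard deviation."""
    return stdev(data) / mean(data)

# Hypothetical measurements in different units.
heights_cm = [160, 170, 180]
weights_kg = [60, 70, 80]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

# Both samples have s = 10, but relative to their means the weights vary more.
print(f"heights: {cv_height:.1%}")
print(f"weights: {cv_weight:.1%}")
```

Here the two standard deviations are equal, so comparing them directly would suggest the same spread; dividing by the mean shows the weights are more variable relative to their level.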
16
Q

How are percentiles found?

A
  • Location of the Pth percentile in the ordered data = ((number of data points + 1) × P)/100
  • Lp = ((n + 1)P)/100
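The location formula can be sketched as follows. The card only gives the location; what to do when the location is not a whole number is an assumption here (linear interpolation between the two neighbouring ordered values, with locations outside 1..n clamped to the ends). The function names are hypothetical.

```python
def percentile_location(n, p):
    """Location of the p-th percentile: Lp = (n + 1) * p / 100."""
    return (n + 1) * p / 100

def percentile(data, p):
    """Value at location Lp in the ordered data.

    Assumption: fractional locations are resolved by linear interpolation,
    and locations outside 1..n are clamped to the smallest/largest value.
    """
    values = sorted(data)
    n = len(values)
    loc = min(max(percentile_location(n, p), 1), n)
    lower = int(loc)        # 1-based position at or below the location
    frac = loc - lower      # how far towards the next position
    if lower == n:
        return values[-1]
    return values[lower - 1] + frac * (values[lower] - values[lower - 1])

data = [2, 4, 6, 8, 10, 12, 14]          # n = 7
print(percentile_location(len(data), 50))  # 4.0, i.e. the 4th ordered value
print(percentile(data, 50))                # 8.0
```

Other textbooks and software use slightly different location rules, so results can differ a little from this one for small samples.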
17
Q

What is a covariance?

A
  • Measures how X and Y vary together, i.e. the direction of their linear relationship
  • Sign indicates the direction of the slope (negative vs positive relationship), but the magnitude depends on the units of measurement, so it cannot tell us whether the relationship is strong or weak
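To make the unit dependence concrete, here is a small sketch that assumes the usual sample covariance (n - 1 divisor); the card itself does not state the formula. The helper `sample_covariance` and the data are hypothetical.

```python
from statistics import mean

def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - x_bar)(y - y_bar), divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical heights and weights for four people.
heights_cm = [150, 160, 170, 180]
heights_m = [h / 100 for h in heights_cm]
weights_kg = [50, 60, 65, 80]

cov_cm = sample_covariance(heights_cm, weights_kg)
cov_m = sample_covariance(heights_m, weights_kg)

# Same people, same relationship, but the covariance is 100 times larger in cm.
print(cov_cm, cov_m)

# A negative relationship gives a negative covariance.
print(sample_covariance([1, 2, 3], [6, 4, 2]))  # -2.0
```

The sign is the same in both units, which is why only the sign of a covariance is interpreted.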
18
Q

Describe what covariances tell us

A
  • If cov>0, then as X increases, Y increases; as X decreases, Y decreases
  • If cov<0, then as X increases, Y decreases; as X decreases, Y increases
19
Q

What is the coefficient of correlation?

A
  • Measures the strength and direction of the linear relationship between X & Y
  • Is bounded between -1 and +1; a value of -1 or +1 indicates a perfect linear relationship
  • If covariance = 0, correlation = 0: no linear relationship
  • Units cancel out
  • Correlation ≠ Causation
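A short sketch of why the units cancel, assuming the standard definition r = cov(X, Y) / (sx × sy) with sample (n - 1) statistics; the card does not spell the formula out. The function `sample_correlation` and the data are hypothetical.

```python
from statistics import mean, stdev

def sample_correlation(xs, ys):
    """r = sample covariance divided by the product of sample standard deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

# Hypothetical heights and weights for four people.
heights_cm = [150, 160, 170, 180]
heights_m = [h / 100 for h in heights_cm]
weights_kg = [50, 60, 65, 80]

r_cm = sample_correlation(heights_cm, weights_kg)
r_m = sample_correlation(heights_m, weights_kg)

# Changing the units rescales the covariance and the standard deviation
# by the same factor, so r is unchanged (up to floating-point rounding).
print(round(r_cm, 4), round(r_m, 4))

# A perfectly linear relationship sits at the bound.
print(sample_correlation([1, 2, 3], [2, 4, 6]))
```

Because r does not change with the units, its magnitude can be compared across data sets in a way that a covariance cannot.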