Chapter 1 Flashcards

1
Q

What are categorical variables?

A

Variables that place each case into one of several groups or categories. Examples: gender, race, music genre

2
Q

What are quantitative variables?

A

Variables that take numerical values for which arithmetic operations such as adding and averaging make sense

Examples: income, weight, heart rate

3
Q

What does the distribution of a variable tell us?

A

The values that a variable takes and how often it takes each value
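
As a small illustration, the distribution of a categorical variable can be tabulated by counting how often each value occurs. The data and the names genres, counts, and proportions are made up for this sketch:

```python
from collections import Counter

# Hypothetical data: favourite music genre for eight cases
genres = ["rock", "pop", "rock", "jazz", "pop", "rock", "pop", "rock"]

# The distribution: each value the variable takes and how often it takes it
counts = Counter(genres)

# The same distribution expressed as proportions of all cases
proportions = {value: n / len(genres) for value, n in counts.items()}
```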

4
Q

Define population

A

The entire group of individuals about which we want information

5
Q

Define sample

A

The part of the population from which we actually collect information

6
Q

List some problems with taking a census

A
  • people can be hard to locate
  • populations rarely stand still
  • a census may be more complex than sampling
7
Q

Non-response bias

A

Bias that arises when only a small fraction of the randomly sampled people choose to respond to the survey.

The respondents may then no longer represent the population appropriately

8
Q

Under-coverage

A

Some groups in the population are left out of the process of choosing the sample

Examples: A household survey will miss homeless people and people in jail or dorms

9
Q

Voluntary response bias

A

The sample consists of people who volunteer to respond because they have strong opinions on the issue

10
Q

Response Bias

A

Bias that arises when respondents give inaccurate answers, for example by lying or by forgetting the truth

11
Q

What is an observational study?

A

A study in which researchers collect data in a way that doesn’t interfere with how the data arise

i.e., they observe rather than intervene: no treatment is assigned, as it would be in an experiment

12
Q

What is a retrospective study?

A

An observational study that collects data after the events have taken place

13
Q

What is a simple random sample?

A

Randomly selecting cases from the population, where there is no implied connection between the points that are selected.

Each member has an equal chance to be selected.
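
A minimal sketch of drawing a simple random sample, assuming the population can be listed as case IDs. The population, the sample size of 10, and the seed are made up for illustration:

```python
import random

# Hypothetical population of 100 case IDs
population = list(range(1, 101))

# Fixed seed only so that the sketch is reproducible
rng = random.Random(42)

# random.sample draws without replacement, and every case
# has the same chance of ending up in the sample
srs = rng.sample(population, k=10)
```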

14
Q

What is a stratified sample?

A

The population is divided into groups called strata, each made up of similar observations

  • We take a simple random sample from each stratum
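
A minimal sketch of stratified sampling, assuming the strata are already known. The strata names, their sizes, and the choice of 5 cases per stratum are hypothetical:

```python
import random

# Hypothetical population, already grouped into strata by year of study
strata = {
    "first_year": list(range(0, 50)),
    "second_year": list(range(50, 80)),
    "third_year": list(range(80, 100)),
}

rng = random.Random(0)  # fixed seed only for reproducibility

# Take a simple random sample of 5 cases from each stratum
stratified_sample = {
    name: rng.sample(cases, k=5) for name, cases in strata.items()
}
```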
15
Q

What is a cluster sample?

A

Take a simple random sample of clusters, then sample all observations in those clusters

  • preferred for economic reasons
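
A minimal sketch of cluster sampling. The clusters (villages), their sizes, and the choice of 3 clusters are made up for illustration:

```python
import random

# Hypothetical population: 10 villages (clusters) with 20 people each
clusters = {
    f"village_{i}": [f"v{i}_person_{j}" for j in range(20)] for i in range(10)
}

rng = random.Random(1)  # fixed seed only for reproducibility

# Simple random sample of clusters
chosen = rng.sample(sorted(clusters), k=3)

# Include every observation from each chosen cluster
cluster_sample = [person for name in chosen for person in clusters[name]]
```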
16
Q

What is a multistage sample?

A

Take a simple random sample of clusters, then take a simple random sample of observations from each sampled cluster
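
A minimal sketch of the two stages, with hypothetical schools as clusters. The names, sizes, and sample sizes are assumptions made for illustration:

```python
import random

# Hypothetical population: 8 schools (clusters) with 30 students each
schools = {
    f"school_{i}": [f"s{i}_student_{j}" for j in range(30)] for i in range(8)
}

rng = random.Random(2)  # fixed seed only for reproducibility

# Stage 1: simple random sample of clusters
chosen_schools = rng.sample(sorted(schools), k=3)

# Stage 2: simple random sample of observations within each sampled cluster
multistage_sample = {
    name: rng.sample(schools[name], k=5) for name in chosen_schools
}
```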

17
Q

What is blocking?

A

Blocking is like stratifying, except that it is used in experiments, when randomly assigning cases to treatments, rather than when sampling

18
Q

How to interpret a histogram

A

describe its:

  • shape
  • centre
  • spread
19
Q

What is modality when referring to histograms

A

The number of peaks in the histogram (unimodal, bimodal, or multimodal)

20
Q

What is the 1st quartile?

A

The median of the observations located to the left of (below) the median in the ordered data

21
Q

What is the 3rd quartile?

A

The median of the observations located to the right of (above) the median in the ordered data

22
Q

What is the formula for the interquartile range (IQR)?

A

IQR = Q3 - Q1

23
Q

What does the five number summary include?

A
Minimum
First Quartile
Median
Third Quartile
Maximum
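
A minimal sketch of the five-number summary, using the quartile definitions from the cards above (the median of each half of the ordered data). It assumes the overall median is left out of both halves when the number of observations is odd, which is one common textbook convention; software packages often use other quartile rules. The function name and data are made up:

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for at least two observations."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # observations to the left of the median
    upper = data[half + (n % 2):]  # observations to the right of the median
    return min(data), median(lower), median(data), median(upper), max(data)

data = [2, 4, 5, 7, 8, 10, 13]   # hypothetical observations
summary = five_number_summary(data)
iqr = summary[3] - summary[1]    # IQR = Q3 - Q1
```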
24
Q

What is the 1.5 IQR Rule?

A

A rule for pointing out outliers

  • if an observation lies more than 1.5 × IQR above the third quartile or below the first quartile, it is flagged as a potential outlier
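
A minimal sketch of the rule, assuming quartiles are computed as the medians of the lower and upper halves of the ordered data (with the overall median left out when the count is odd). The function name and the data are hypothetical:

```python
from statistics import median

def iqr_outliers(values):
    """Return the observations flagged by the 1.5 x IQR rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    q3 = median(data[half + (n % 2):])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr    # anything below this is flagged
    high_fence = q3 + 1.5 * iqr   # anything above this is flagged
    return [x for x in data if x < low_fence or x > high_fence]

outliers = iqr_outliers([2, 4, 5, 7, 8, 10, 13, 40])
```

Here Q1 = 4.5 and Q3 = 11.5, so the IQR is 7 and the fences are -6 and 22; only the value 40 falls outside them.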
25
Q

What is the Standard Deviation?

A

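A measure of spread: how far the observations typically are from their mean

Roughly the average distance of the observations from their mean (computed as the square root of the average of the squared deviations from the mean)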
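
A minimal sketch of the calculation on made-up data. It assumes the sample standard deviation, which divides by n - 1; the cards do not say which divisor is intended:

```python
from math import sqrt
from statistics import mean, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

m = mean(data)
# Squared distance of each observation from the mean
squared_deviations = [(x - m) ** 2 for x in data]

# Sample standard deviation: square root of the sum of squared
# deviations divided by n - 1
s = sqrt(sum(squared_deviations) / (len(data) - 1))

# statistics.stdev uses the same n - 1 definition
s_library = stdev(data)
```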

27
Q

When to use Mean & SD vs Median & IQR

A
  • use median & IQR for skewed distributions, or distributions with outliers
  • use mean & SD for reasonably symmetric distributions that don't have outliers
28
Q

What to do with extremely skewed data?

A

Transform them.

Common transformation is the log transformation.
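
A minimal sketch of a log transformation on made-up, right-skewed income data. It assumes all values are positive, since the logarithm is undefined otherwise; the base-10 logarithm is an arbitrary choice here:

```python
import math
from statistics import mean, median

# Hypothetical right-skewed incomes: one very large value
incomes = [20_000, 25_000, 30_000, 35_000, 40_000, 1_000_000]

# Log transformation pulls in the long right tail
log_incomes = [math.log10(x) for x in incomes]

# Before: the mean is dragged far above the median by the large value
gap_before = mean(incomes) / median(incomes)

# After: mean and median of the transformed data are much closer
gap_after = mean(log_incomes) / median(log_incomes)
```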