LM 01 Flashcards

1
Q

Definition of a data matrix

A

A convenient way to store data. Examples include tables and spreadsheets. Each row is a unique case (observational unit). Each column corresponds to a variable.

2
Q

Types of variables

A

Numerical or categorical.

3
Q

Classifications of numerical variables

A

Numerical variables can be discrete or continuous.

4
Q

Classifications of categorical variables

A

Categorical variables can be ordinal (ordered) or nominal (unordered).

5
Q

Explanatory and response variables

A

Explanatory variables might affect response variables. For example, hours of study per week might affect GPA.

6
Q

Types of data collection

A

Observational studies: researchers collect data passively; they merely observe.
Experiments: researchers actively control the data collection, trying to establish causation.

7
Q

Sample versus population

A

A sample is a subset of the population. The population is the whole group of interest (for example, all people); the sample is the group of cases selected from it.

8
Q

Simple random sample

A

Cases are randomly selected from the population, each with the same chance of being chosen. Example: cars passing through intersections in Kelowna.

9
Q

Stratified sample

A

Cases are grouped into strata, then a simple random sample is taken within each stratum.
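
As a minimal sketch of stratified sampling, the code below groups cases by stratum and then draws a simple random sample inside each one. The function name `stratified_sample`, the `stratum_of` callback, and the car data are hypothetical and used only for illustration.

```python
import random

def stratified_sample(cases, stratum_of, n_per_stratum, seed=0):
    """Group cases into strata, then take a simple random sample in each."""
    rng = random.Random(seed)
    strata = {}
    for case in cases:
        strata.setdefault(stratum_of(case), []).append(case)
    sample = []
    for members in strata.values():
        # Never ask for more cases than the stratum holds.
        k = min(n_per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical data: (car_id, neighbourhood) pairs.
all_cars = [(i, "north" if i % 2 == 0 else "south") for i in range(20)]
picked = stratified_sample(all_cars, stratum_of=lambda c: c[1], n_per_stratum=3)
```

Every stratum contributes cases, which is the point of stratifying: no group can be missed by chance.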

10
Q

Cluster sample

A

The population is divided into clusters, some clusters are randomly selected, and every case in each selected cluster is sampled. Example: all cars at three intersections.
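
A rough sketch of cluster sampling, assuming the clusters are stored as a dictionary from cluster name to its cases. The names `cluster_sample` and `intersections` and the generated car labels are made up for this example.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole clusters and keep every case inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [case for name in chosen for case in clusters[name]]
    return chosen, sample

# Hypothetical data: 10 intersections with 5 cars each.
intersections = {
    f"intersection_{i}": [f"car_{i}_{j}" for j in range(5)] for i in range(10)
}
chosen, sampled_cars = cluster_sample(intersections, n_clusters=3)
```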

11
Q

Multistage sampling

A

Clusters are sampled, then cases are randomly sampled within each selected cluster. Example: cars are randomly sampled at three intersections.
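
A minimal sketch of the two stages, assuming clusters are stored as a dictionary from cluster name to its cases. The function name `multistage_sample` and the intersection data are hypothetical.

```python
import random

def multistage_sample(clusters, n_clusters, n_per_cluster, seed=0):
    """Stage 1: randomly pick clusters. Stage 2: randomly pick cases in each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        k = min(n_per_cluster, len(members))
        sample.extend(rng.sample(members, k))
    return chosen, sample

# Hypothetical data: 10 intersections with 5 cars each.
intersections = {
    f"intersection_{i}": [f"car_{i}_{j}" for j in range(5)] for i in range(10)
}
chosen, sampled_cars = multistage_sample(intersections, n_clusters=3, n_per_cluster=2)
```

Unlike a cluster sample, only some of the cases in each chosen cluster end up in the sample.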

12
Q

Scatterplot

A

A way to provide a case-by-case view of the data. It can visualize the relationship between two numerical variables.

13
Q

Dot plot

A

Visualizes one numerical variable, with each case shown as a dot.

14
Q

Histograms

A

Provides a view of the data density, i.e. the shape of the data distribution.

15
Q

Unimodal

A

A single prominent peak.

16
Q

Bimodal/multimodal

A

Bimodal: two prominent peaks. Multimodal: more than two prominent peaks.

17
Q

Uniform

A

No apparent peaks

18
Q

Right skewed

A

The tail of the distribution is on the right-hand side.

19
Q

Left skewed

A

The tail of the distribution is on the left-hand side.

20
Q

Small variance

A

Sharp, narrow peak

21
Q

Large variance

A

Wide peak

22
Q

Deviation

A

Distance from the mean
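
A small worked example, assuming the usual convention that a deviation is the signed difference between an observation and the mean, and that the sample variance divides by n - 1. The data values are made up.

```python
data = [2, 4, 4, 6, 9]

mean = sum(data) / len(data)

# Signed deviation of each observation from the mean.
deviations = [x - mean for x in data]

# Sample variance: sum of squared deviations divided by n - 1.
variance = sum(d ** 2 for d in deviations) / (len(data) - 1)

# Standard deviation: square root of the variance.
sd = variance ** 0.5
```

The deviations always sum to zero, which is why they are squared before averaging.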

23
Q

How to draw a box plot

A

1) Draw a thick line at the median, Q2.
2) Draw a rectangle bounded by Q1 and Q3.
3) Draw dotted lines at Q1 - 1.5 IQR and Q3 + 1.5 IQR.
4) Label outliers and draw T-shaped upper and lower whiskers. Each whisker extends only as far as the highest or lowest point inside the dotted lines.
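
The numbers behind these steps can be sketched as follows. This assumes quartiles are computed with the "inclusive" method of Python's `statistics.quantiles`; textbooks and software use slightly different quartile conventions, so Q1 and Q3 may differ a little elsewhere. The function name `box_plot_values` and the data are hypothetical.

```python
import statistics

def box_plot_values(values):
    """Return the quantities needed to draw a box plot by hand."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in values if lower_fence <= x <= upper_fence]
    return {
        "q1": q1, "median": q2, "q3": q3, "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        # Whiskers stop at the most extreme points inside the fences.
        "lower_whisker": min(inside), "upper_whisker": max(inside),
        # Anything beyond the fences is labelled as an outlier.
        "outliers": [x for x in values if x < lower_fence or x > upper_fence],
    }

box = box_plot_values([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```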

24
Q

Robust statistics

A

Median and IQR are more robust than mean and standard deviation.
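
To illustrate robustness, the sketch below compares how the mean, standard deviation, median, and IQR respond when one value is replaced by an extreme one. The data are made up, the helper name `summarize` is hypothetical, and quartiles use the "inclusive" method of `statistics.quantiles` (other conventions give slightly different values).

```python
import statistics

def summarize(values):
    """Return mean, standard deviation, median, and IQR of the values."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": q2,
        "iqr": q3 - q1,
    }

clean = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # last value made extreme

before = summarize(clean)
after = summarize(with_outlier)
```

In this example the median and IQR are unchanged by the extreme value, while the mean and standard deviation both increase.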

25
Q

Common practice

A

For symmetric distributions, use the mean and standard deviation.
For skewed distributions, use the median and IQR.