Week 2 - Summarizing Data Flashcards

1
Q

Scatter plot

A

allows us to visualize the nature of the relationship between 2 variables.

2
Q

linear relationship

A

when the gradient (slope) stays the same throughout, so the points on a scatter plot follow a straight line
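
A minimal sketch of the "same gradient throughout" idea, using made-up points (the numbers are illustrative assumptions, not course data). It computes the slope between each pair of neighbouring points and checks whether those slopes are all equal.

```python
def consecutive_slopes(xs, ys):
    """Slope (rise over run) between each pair of neighbouring points."""
    return [
        (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]

# Hypothetical data: one straight-line relationship, one curved one.
xs = [0, 1, 2, 3]
linear_ys = [1, 3, 5, 7]   # y = 2x + 1
curved_ys = [0, 1, 4, 9]   # y = x squared

linear_slopes = consecutive_slopes(xs, linear_ys)
curved_slopes = consecutive_slopes(xs, curved_ys)

# A linear relationship has a single slope value; a curved one does not.
print(linear_slopes, len(set(linear_slopes)) == 1)
print(curved_slopes, len(set(curved_slopes)) == 1)
```

Real data is noisy, so the slopes would only be roughly equal; the exact check here works because the example points lie exactly on a line.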

3
Q

Histogram

A

used to understand the distribution of a single numerical variable, i.e. how the data is spread out or arranged
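
A rough sketch of what a histogram does under the hood: group the values into equal-width bins and count how many fall into each. The values and the bin width of 10 are assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical measurements of one numerical variable.
values = [3, 7, 12, 15, 18, 21, 25, 38]
bin_width = 10  # assumed bin width

# Map each value to the start of its bin, then count values per bin.
bin_counts = Counter((v // bin_width) * bin_width for v in values)

# Text version of the histogram: one row per bin, one '#' per observation.
for start in sorted(bin_counts):
    print(f"{start:>2}-{start + bin_width - 1:<2} {'#' * bin_counts[start]}")
```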

4
Q

mean

A

one of the most common measures of central tendency, i.e. the center or typical value in a set of data. it is the sum of the observations divided by the number of observations.
in a sample the mean is written x̄ (x-bar)

5
Q

one pitfall of mean

A

it's very sensitive to outliers: a single extreme value can pull the mean far away from the typical value of the data
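
A small sketch of this pitfall, with made-up numbers (an assumption for illustration only): adding one extreme value moves the mean well above every other observation.

```python
from statistics import mean

# Hypothetical sample; the values are illustrative only.
values = [10, 12, 11, 13, 12]
with_outlier = values + [100]  # same data plus one extreme value

mean_without = mean(values)        # 58 / 5
mean_with = mean(with_outlier)     # 158 / 6

print(mean_without, round(mean_with, 2))
```

With the outlier included, the mean is larger than every one of the original five values, so it no longer describes a typical observation.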

6
Q

median

A

another measure of central tendency. the median is the middle value of the sorted data: it separates the smallest 50% from the largest 50%
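
A quick sketch with made-up samples (illustrative assumptions): for an odd number of values the median is the middle one after sorting, and for an even number Python's `statistics.median` averages the two middle values.

```python
from statistics import median

# Hypothetical samples, illustrative only.
odd_sample = [3, 1, 7, 5, 9]   # sorted: 1, 3, 5, 7, 9
even_sample = [3, 1, 7, 5]     # sorted: 1, 3, 5, 7

median_odd = median(odd_sample)     # middle value of the sorted data
median_even = median(even_sample)   # average of the two middle values

# One extreme value barely moves the median.
median_with_outlier = median(odd_sample + [1000])

print(median_odd, median_even, median_with_outlier)
```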

7
Q

standard deviation

A

this is a measure of dispersion. it tells us how far observations typically are from the mean
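
A minimal sketch comparing two made-up samples with the same mean but different spread. It assumes the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes; the card itself does not say which version the course uses.

```python
from statistics import mean, stdev

# Hypothetical samples with the same mean (10) but different dispersion.
tight = [9, 10, 10, 11]
spread = [2, 6, 14, 18]

# stdev() is the sample standard deviation (n - 1 in the denominator).
sd_tight = stdev(tight)
sd_spread = stdev(spread)

print(mean(tight), mean(spread))
print(round(sd_tight, 3), round(sd_spread, 3))
```

Both samples are centered on 10, but the second has a much larger standard deviation because its observations sit further from the mean.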

8
Q

interquartile range

A

the difference between the 75th percentile and the 25th percentile in a set of data
this is robust to outliers because it focuses only on the middle 50% of the data
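
A short sketch of the calculation, using made-up data. Quartile conventions differ between textbooks and software; this assumes the "inclusive" method of Python's `statistics.quantiles`, which may not match the course's convention exactly.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical data, and the same data with the largest value made extreme.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
data_with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]

iqr_plain = iqr(data)
iqr_outlier = iqr(data_with_outlier)

print(iqr_plain, iqr_outlier)
```

The extreme value changes the overall range a lot, but the IQR stays the same because the middle 50% of the data is unchanged.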

9
Q

Box whisker plot

A

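the box represents the interquartile range (from the 25th to the 75th percentile)
the whiskers extend to the most extreme observations that lie within 1.5 times the interquartile range of the box; points beyond that are usually plotted individually as outliers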
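
A sketch of how the whisker ends can be worked out, assuming the common convention that whiskers stop at the most extreme observations within 1.5 × IQR of the box and anything beyond is flagged as an outlier. The data and the "inclusive" quartile method are illustrative assumptions.

```python
from statistics import quantiles

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 30]

q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences: the furthest a whisker is allowed to reach.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme observations inside the fences.
inside = [x for x in data if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Observations outside the fences are drawn as individual points.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, whisker_low, whisker_high, outliers)
```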

10
Q

shapes of distributions

A

one important dimension is the symmetry or skewness of a distribution

11
Q

different skews

A

right skewed - long tail to the right; the mean is typically larger than the median
left skewed - long tail to the left; the mean is typically smaller than the median
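
A small sketch with two made-up samples (illustrative assumptions), one with a long right tail and one with a long left tail, comparing mean and median in each.

```python
from statistics import mean, median

# Hypothetical samples: most values are bunched together, one sits far out.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 50]         # long tail to the right
left_skewed = [1, 47, 48, 48, 49, 49, 49, 50]    # long tail to the left

right_mean, right_median = mean(right_skewed), median(right_skewed)
left_mean, left_median = mean(left_skewed), median(left_skewed)

print(right_mean, right_median)  # mean pulled above the median
print(left_mean, left_median)    # mean pulled below the median
```

The tail drags the mean towards it while the median stays near the bulk of the data, which is why comparing the two gives a quick hint about skew.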

12
Q

presenting categorical data

A

this is harder than for numerical data. it's typically summarized using counts or proportions of the different outcomes
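
A minimal sketch of summarizing a categorical variable by counts and proportions. The survey answers are made up for illustration.

```python
from collections import Counter

# Hypothetical categorical responses.
answers = ["yes", "no", "yes", "yes", "maybe", "no"]

# Count how often each outcome occurs.
counts = Counter(answers)

# Proportion = count of an outcome divided by the total number of responses.
total = len(answers)
proportions = {outcome: n / total for outcome, n in counts.items()}

for outcome, n in counts.most_common():
    print(outcome, n, round(proportions[outcome], 2))
```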

13
Q

contingency table

A

this summarizes data for 2 categorical variables by showing the count for each combination of their categories
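
A sketch of building a contingency table by hand from paired observations. The two variables (drink and time of day) and all the values are hypothetical, chosen only to show the layout.

```python
from collections import Counter

# Hypothetical paired observations: (drink, time of day).
observations = [
    ("tea", "morning"), ("coffee", "morning"), ("coffee", "morning"),
    ("tea", "evening"), ("tea", "evening"), ("coffee", "evening"),
]

# Count each (row category, column category) combination.
cell_counts = Counter(observations)

rows = sorted({drink for drink, _ in observations})
cols = sorted({time for _, time in observations})

# table[row][col] holds the count for that combination (0 if never seen).
table = {r: {c: cell_counts[(r, c)] for c in cols} for r in rows}

print("drink", cols)
for r in rows:
    print(r, [table[r][c] for c in cols])
```

Each cell is a count for one combination of categories, and the cells together add up to the total number of observations.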
