Statistical Methods - Lectures 3-4 Flashcards

1
Q

What are scatterplots useful for?

A

Scatterplots are useful for visualizing the relationship between two numerical variables.

2
Q

How do you describe the distribution of a quantitative variable in a graph?

A

Describe the overall pattern (shape, center, spread, etc.) and any deviations from that pattern (e.g. outliers).

3
Q

How do you calculate the sample mean?

A

The sample mean, denoted by x̄, is calculated as
x̄ = (x1 + x2 + … + xn)/n

The sample mean is a sample statistic and serves as a point estimate of the population mean.
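
As a minimal sketch of the formula above, the sample mean can be computed in Python as follows. The function name `sample_mean` and the five observations are made up for illustration.

```python
def sample_mean(values):
    """Return the arithmetic mean (x1 + x2 + ... + xn) / n."""
    if not values:
        raise ValueError("sample must contain at least one observation")
    return sum(values) / len(values)

# Hypothetical sample of five observations (not from the lectures).
sample = [2, 4, 4, 5, 10]
x_bar = sample_mean(sample)
print(x_bar)  # 5.0
```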

4
Q

What is the population mean?

A

The population mean, denoted by μ, is a population parameter. It is computed the same way as the sample mean, but using all values in the population.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What do histograms provide?

A

Histograms provide a view of the data density. Higher bars represent where the data is more common.
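
To illustrate the idea of bar height as data density, here is a rough text-only sketch that counts observations per bin. The data and the bin width of 5 are arbitrary choices for illustration.

```python
from collections import Counter

# Hypothetical data; most observations fall between 0 and 5.
data = [1, 2, 2, 3, 3, 3, 4, 4, 7, 12]
bin_width = 5  # assumed bin width, chosen only for this example

# Map each observation to the left edge of its bin, then count per bin.
counts = Counter((x // bin_width) * bin_width for x in data)

# Print one row of '#' per bin; longer rows mean more common values.
for start in sorted(counts):
    print(f"[{start}, {start + bin_width}): {'#' * counts[start]}")
```

The bin [0, 5) holds 8 of the 10 observations, so its bar is the tallest.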

6
Q

What are commonly observed shapes of distributions?

A

Modality: unimodal, bimodal, multimodal, uniform

Skewness: right skew (tails off to the right), left skew (tails off to the left), symmetric

7
Q

What are outliers?

A

An outlier is an observation that lies an abnormal distance from other values in a random sample from a population.

8
Q

How do you find the sample variance?

A

The sample variance is roughly the average squared deviation from the mean.

s^2 = [ ∑_{i=1}^{n} (x_i - x̄)^2 ] / (n - 1)
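
The formula above can be sketched in Python as follows. The function name `sample_variance` and the data are illustrative; the standard library's `statistics.variance` uses the same n - 1 divisor and is shown as a cross-check.

```python
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

# Hypothetical sample: deviations from the mean (5) are -3, -1, -1, 0, 5.
sample = [2, 4, 4, 5, 10]
s2 = sample_variance(sample)
print(s2)                           # 9.0
print(statistics.variance(sample))  # same value from the standard library
```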

9
Q

Why do we use the n-1 in the calculation of variance?

A

Dividing by n-1 rather than n makes the sample variance an unbiased estimate of the population variance.
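
One way to see this is to enumerate every possible sample from a tiny population and average the variance estimates. The sketch below assumes a made-up population of four values and samples of size 2 drawn with replacement; it is an illustration, not a proof.

```python
from itertools import product

# Hypothetical population of four values.
population = [1, 2, 3, 4]
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance

def variance(values, divisor):
    """Sum of squared deviations from the sample mean, over a chosen divisor."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / divisor

n = 2
# All 16 ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))

avg_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)
avg_n = sum(variance(s, n) for s in samples) / len(samples)

print(sigma2)         # 1.25
print(avg_n_minus_1)  # 1.25, matches the population variance
print(avg_n)          # 0.625, too small on average
```

Averaged over all samples, the n-1 version recovers the population variance exactly, while dividing by n underestimates it.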

10
Q

How do you find the standard deviation?

A

The standard deviation is the square root of the variance
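
A short sketch of this relationship, using Python's standard `statistics` module on made-up data:

```python
import math
import statistics

# Hypothetical sample with sample variance 9.
sample = [2, 4, 4, 5, 10]

s2 = statistics.variance(sample)  # sample variance (n - 1 divisor)
s = math.sqrt(s2)                 # standard deviation as its square root

print(s)                         # 3.0
print(statistics.stdev(sample))  # the library's sample standard deviation
```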

11
Q

How can you describe the centre and spread of a distribution?

A

Centre:

  • Mean
  • Median

Spread:

  • Variance
  • Standard Deviation
  • Range
  • Interquartile Range
12
Q

How do you find the interquartile range and the range of the data?

A

IQR = Q3 - Q1

Range = max value - min value
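
As a rough sketch, both quantities can be computed with Python's standard library. Note that textbooks and software use slightly different conventions for locating Q1 and Q3; this example assumes the default ("exclusive") method of `statistics.quantiles`, which may not match the convention used in the lectures. The data are made up.

```python
import statistics

# Hypothetical data: the integers 1 through 11.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 splits the data into quartiles and returns the cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1                       # IQR = Q3 - Q1
data_range = max(data) - min(data)  # Range = max value - min value

print(q1, q3)      # 3.0 9.0
print(iqr)         # 6.0
print(data_range)  # 10
```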

13
Q

What is the five-number summary of the data?

A

The minimum, Q1, the median, Q3 and the maximum are together called the five-number summary of the data.
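
A minimal sketch of collecting these five values in one place. The helper name `five_number_summary` is hypothetical, the data are made up, and the quartiles again assume the default method of `statistics.quantiles`.

```python
import statistics

def five_number_summary(values):
    """Return min, Q1, median, Q3 and max as a dictionary."""
    # quantiles() sorts its input, so the data need not be ordered.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

# Hypothetical, unsorted data.
summary = five_number_summary([7, 1, 9, 3, 11, 5, 2, 10, 4, 8, 6])
print(summary)
```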

14
Q

What does the box in a box plot represent?

A

The box in a box plot represents the middle 50% of the data, and the thick line in the box is the median

15
Q

What are the whiskers of a box plot?

A

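The whiskers extend from the box to the most extreme observations that lie within the maximum reach:

Max upper whisker reach = Q3 + 1.5 × IQR

Max lower whisker reach = Q1 - 1.5 × IQR

A potential outlier is defined as an observation beyond the maximum reach of the whiskers.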
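
The 1.5 × IQR rule above can be sketched as follows. The function name `whisker_reach` and the data are invented for illustration, and the quartiles assume the default method of `statistics.quantiles`, so results may differ slightly under other quartile conventions.

```python
import statistics

def whisker_reach(values):
    """Return (lower, upper) maximum whisker reach using the 1.5 x IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical data with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]

lower, upper = whisker_reach(data)

# Observations beyond the maximum reach are flagged as potential outliers.
potential_outliers = [x for x in data if x < lower or x > upper]

print(lower, upper)        # -6.0 18.0
print(potential_outliers)  # [30]
```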

16
Q

Why is it important to look for outliers?

A
  • To identify extreme skew in the distribution
  • To identify data collection and entry errors
  • To provide insight into interesting features of the data
17
Q

How do you describe the center and spread in skewed and symmetric distributions?

A

For skewed distributions, it is often more helpful to use the median and IQR to describe the center and spread. For symmetric distributions, the mean and SD are usually the better choice.
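
A small sketch of why this matters: one extreme value pulls the mean toward the tail, while the median stays put. Both data sets are made up for illustration.

```python
import statistics

# Two hypothetical samples that differ only in their largest value.
symmetric = [2, 4, 6, 8, 10]
skewed = [2, 4, 6, 8, 100]  # right-skewed by a single extreme value

mean_sym = statistics.mean(symmetric)
median_sym = statistics.median(symmetric)
mean_skew = statistics.mean(skewed)
median_skew = statistics.median(skewed)

# Symmetric case: mean and median agree (both 6).
print(mean_sym, median_sym)
# Skewed case: the mean jumps to 24, the median is still 6.
print(mean_skew, median_skew)
```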