Intro to Statistics Flashcards

1
Q

What are the two categories used to classify data?

A

Numerical and categorical

2
Q

What are the two types of numerical variables?

A

Continuous and discrete

3
Q

What are the two types of categorical variables?

A

Ordinal and nominal

4
Q

Describe continuous variables.

A

A variable that can take any value along a continuum. For example, height (m): 1.87 m, 1.58 m, 1.77 m.

5
Q

Describe discrete variables.

A

A variable that can take only distinct, countable values (typically whole numbers). For example, number of people: 0, 1, 2.

6
Q

Describe ordinal variables.

A

Categories that have an order. For example, size. E.g. small, medium, large.

7
Q

Describe nominal variables.

A

Categories that have no order. For example, eye color. E.g. brown, blue, hazel.

8
Q

What graph is most suitable to represent nominal data?

A

A Pareto chart.

9
Q

What graph is most suitable to represent ordinal or discrete data?

A

A bar chart.

10
Q

What graph is most suitable to represent continuous data?

A

A histogram or bar chart.

11
Q

What are five ways to describe the shape of the distribution shown in a histogram?

A
  • Symmetrical or bell shaped (uni-modal (one peak))
  • Skewed to the left (left side is the tail)
  • Skewed to the right (right side is the tail)
  • Symmetrical and bi-modal (two peaks)
  • Symmetrical and uniform (flat)
12
Q

What are the three numerical summaries for center or location?

A

Mode, median and mean.

13
Q

What are the three numerical summaries for spread?

A

Range, inter-quartile range (IQR) and standard deviation.

14
Q

What is the mode?

A

The value that occurs most often.

15
Q

What is the median?

A

The middle value once the values are arranged in order. Defined for ordinal, discrete and continuous data. If there is an even number of values, there are two middle values; for numerical data the median (M) is usually taken as their average.

16
Q

What is the mean?

A

The average: the sum of the values divided by the number of values.
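As a quick illustration of the three measures of center, here is a minimal sketch using Python's standard `statistics` module. The sample `values` is made up for this example. With an even number of values, `median` averages the two middle values, while `median_low` and `median_high` return each of them separately.

```python
import statistics

# Hypothetical sample, chosen only for illustration
values = [2, 3, 3, 5, 7, 10]

mode = statistics.mode(values)      # value that occurs most often
median = statistics.median(values)  # average of the two middle values (even count)
mean = statistics.mean(values)      # sum of the values divided by their count

# The two middle values themselves, useful when averaging makes no sense
middle_low = statistics.median_low(values)
middle_high = statistics.median_high(values)

print(mode, median, mean, middle_low, middle_high)
```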

17
Q

What is range when measuring spread?

A

The difference between the largest value and the smallest value.

18
Q

What are quartiles?

A

The values (Q1, Q2 and Q3) that divide ordered data into four equal parts. Q2 is the median.

19
Q

What is inter-quartile range when measuring spread?

A

The range spanned by the first quartile (Q1) and the third quartile (Q3), that is, IQR = Q3 - Q1.

20
Q

What is the 1.5 IQR rule?

A

It identifies outliers. A value is an outlier if it lies below the lower threshold, Q1 - 1.5 × IQR, or above the upper threshold, Q3 + 1.5 × IQR.
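The rule can be sketched in a few lines of Python, placing the thresholds 1.5 × IQR below Q1 and 1.5 × IQR above Q3. The function name `iqr_outliers` and the sample data are assumptions made for this example. Quartiles are computed with `statistics.quantiles` using the "inclusive" method, and other quartile conventions can give slightly different thresholds.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying outside the 1.5 x IQR thresholds."""
    # quantiles(n=4) returns the three cut points Q1, Q2 (median), Q3
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr  # lower threshold
    upper = q3 + 1.5 * iqr  # upper threshold
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample with one unusually large value
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
print(iqr_outliers(sample))
```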

21
Q

What is a 5-number summary?

A

A summary of data using the minimum, Q1, median, Q3 and the maximum.
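A minimal sketch of computing a 5-number summary in Python follows. The helper name `five_number_summary` and the sample data are hypothetical, and the quartile method ("inclusive") is one of several conventions, so other tools may report slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

# Hypothetical sample
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
print(five_number_summary(sample))
```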

22
Q

How are 5-number summaries represented?

A

As a boxplot.

23
Q

What is standard deviation?

A

Describes the variation about the mean.

24
Q

What is standard deviation?

A

Describes the variation about the mean.
Calculated by dividing the sum of squared deviations, (value minus the mean)^2, by the degrees of freedom (n - 1), and then taking the square root of the result.
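The calculation can be followed step by step in a short Python sketch. The function name `sample_std` and the data are assumptions for illustration. The result is compared with `statistics.stdev` from the standard library, which also uses n - 1 in the denominator.

```python
import math
import statistics

def sample_std(values):
    """Sample standard deviation with n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    # Sum of squared deviations: (value minus the mean) squared
    squared_deviations = sum((x - mean) ** 2 for x in values)
    variance = squared_deviations / (n - 1)
    return math.sqrt(variance)

# Hypothetical sample
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_std(data), statistics.stdev(data))
```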

25
Q

Does correlation imply causation?

A

No.

26
Q

What is correlation?

A

The strength of the linear relationship between two continuous variables x and y.
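One common way to measure this is the Pearson correlation coefficient r, which ranges from -1 (perfect negative linear relationship) to +1 (perfect positive linear relationship). The sketch below assumes that is the measure intended here. The function name `pearson_r` and the data are made up for the example.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y rises exactly in step with x, then falls exactly in step
x = [1, 2, 3, 4]
print(pearson_r(x, [2, 4, 6, 8]))
print(pearson_r(x, [8, 6, 4, 2]))
```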