Data Summary Flashcards

1
Q

What is quantitative data?

A

Data measuring some quantity resulting in a numerical value

2
Q

What is qualitative data?

A

Data describing a quality of something, resulting in a non-numerical value (colour, religion, season)

3
Q

What is discrete quantitative data?

A

Data whose possible values form a distinct, countable series of numbers (number of traffic accidents, number of children born to a woman)

4
Q

What is continuous quantitative data?

A

Data whose values can be measured ever more precisely (height, speed)

5
Q

What is ordinal qualitative data?

A

Non-numerical values that have a natural order (poor, fair, good, great)

6
Q

What is nominal qualitative data?

A

Unordered, distinct by name only (red, blue, green)

7
Q

What is a frequency distribution?

A

Used for discrete variables with a limited number of distinct values. Formed by counting the frequency of each distinct value.

8
Q

Meaning: mode

A

Most frequently recorded value
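
A minimal sketch of a frequency distribution and its mode, using only the Python standard library. The `children` sample is made up for illustration and is not from the course material.

```python
from collections import Counter

# Hypothetical sample: number of children per household (discrete data).
children = [0, 2, 1, 2, 3, 2, 0, 1, 2, 1]

# Frequency distribution: count how often each distinct value occurs.
freq = Counter(children)

# Mode: the value with the highest frequency.
mode_value, mode_count = freq.most_common(1)[0]

for value in sorted(freq):
    print(value, freq[value])
print("mode:", mode_value)
```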

9
Q

What are some measures of centre?

A

Mean and median

10
Q

What are some measures of spread?

A

Range, interquartile range, sample variance, standard deviation
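
As a rough illustration, three of these measures can be computed with Python's `statistics` module. The `values` list is invented for the example; `variance` and `stdev` use the sample (n - 1) versions.

```python
from statistics import stdev, variance

# Hypothetical sample, chosen so the arithmetic is easy to check by hand.
values = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 11.0]

# Range: difference between the largest and smallest value.
data_range = max(values) - min(values)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_var = variance(values)

# Standard deviation: square root of the sample variance.
sample_sd = stdev(values)

print(data_range, sample_var, sample_sd)
```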

11
Q

Why do we not usually know the population mean parameter?

A

We would have to measure the whole population, which takes too long or is too expensive

12
Q

What would you use if you didn’t know the true value of the parameter?

A

An estimate of it: the sample mean (mu hat).

13
Q

How would you find the sample mean (mu hat)?

A

Mu hat = the sum of the sampled values divided by the sample size
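
A small sketch of that calculation. The `heights` sample is hypothetical.

```python
# Hypothetical sample of heights in cm.
heights = [160, 165, 170, 175, 180]

# Sample mean: add up the sampled values, then divide by the sample size n.
n = len(heights)
mu_hat = sum(heights) / n

print(mu_hat)
```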

14
Q

How would you find an outcome?

A

Outcome = (mean) + error
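
One way to see this decomposition, sketched with made-up numbers: estimate the mean from the sample, then treat each error as whatever is left over once the mean is subtracted from the outcome.

```python
# Hypothetical observed outcomes.
values = [3.0, 5.0, 4.0, 8.0]

# Estimated mean (mu hat).
mu_hat = sum(values) / len(values)

# Error for each observation: outcome minus the estimated mean.
errors = [y - mu_hat for y in values]

# Adding each error back to the mean recovers the original outcome.
reconstructed = [mu_hat + e for e in errors]

print(mu_hat, errors, reconstructed)
```

Errors measured around the sample mean sum to zero, so their typical size (not their total) is what indicates how well the mean describes the data.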

15
Q

How would you find the density at some i?

A

Density at some time/location i is equal to the mean density plus some error

16
Q

How would you find the sample median?

A

Sort the values into ascending order and take the middle value

17
Q

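What is the calculation for finding the position of the sample median in an ordered set of n values?

A

(n+1)/2. If n is even this falls between two positions, so take the average of the two middle values.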
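
A minimal sketch of the position rule. The function name `sample_median` is just a label for this example, and the even-n case averages the two middle values, which is the usual convention.

```python
def sample_median(values):
    """Return the median of a non-empty list using the (n + 1) / 2 position rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    if n % 2 == 1:
        # Odd n: (n + 1) / 2 is a whole number, the 1-based middle position.
        position = (n + 1) // 2
        return ordered[position - 1]
    # Even n: (n + 1) / 2 falls halfway between positions n/2 and n/2 + 1.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(sample_median([5, 1, 3]))
print(sample_median([4, 1, 3, 2]))
```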

18
Q

What is the range?

A

The difference between the max and min value

19
Q

What is an outlier?

A

A value that is very different to the other values

20
Q

How do you find the interquartile range?

A

The 75th percentile minus the 25th percentile; the middle 50% of the data lies in that range
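
A sketch using `statistics.quantiles` from the Python standard library (Python 3.8+). The data are invented, and the choice of `method="inclusive"` is an assumption: different quartile conventions can give slightly different answers on small samples.

```python
from statistics import quantiles

# Hypothetical sample.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quarters and returns the three cut points:
# the 25th percentile, the median, and the 75th percentile.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

# Interquartile range: 75th percentile minus 25th percentile.
iqr = q3 - q1

print(q1, q3, iqr)
```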

21
Q

What is the error useful for?

A

The size of the error determines whether the model is a good or bad fit for the data

22
Q

When would you use a bar plot?

A

When illustrating frequency information across discrete categories or groups

23
Q

When would you use a histogram?

A

To display continuous data, with the data partitioned into distinct bins
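
To illustrate the binning step only, here is a text-based sketch. The `heights` values and the bin width of 10 are arbitrary choices for the example.

```python
from collections import Counter

# Hypothetical continuous measurements (heights in cm).
heights = [151.2, 158.9, 162.4, 165.0, 167.8, 171.3, 174.6, 179.9, 183.1]
bin_width = 10

# Assign each value to the left edge of its bin, e.g. 162.4 -> 160.
counts = Counter(int(h // bin_width) * bin_width for h in heights)

# Print one row per bin; the bar length is the bin's frequency.
for left in sorted(counts):
    print(f"[{left}, {left + bin_width}): {'#' * counts[left]}")
```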

24
Q

If the tail of the histogram is on the left, which way is it skewed?

A

Left skewed

25
Q

If the mean is less than the median, which way is the graph skewed?

A

Left
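
A quick check of this rule of thumb on made-up data. `skew_direction` is a hypothetical helper, and comparing the mean with the median is only a heuristic, not a formal skewness measure.

```python
from statistics import mean, median


def skew_direction(values):
    """Guess the skew direction by comparing the mean with the median."""
    m, med = mean(values), median(values)
    if m < med:
        return "left"    # a long left tail drags the mean below the median
    if m > med:
        return "right"   # a long right tail drags the mean above the median
    return "roughly symmetric"


# One unusually low value creates a left tail.
print(skew_direction([1, 8, 9, 9, 10, 10, 11]))
# One unusually high value creates a right tail.
print(skew_direction([1, 1, 2, 2, 3, 10]))
```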

26
Q

When would you use a box plot?

A

To convey summary information about a variable (median, quartiles, spread and outliers)

27
Q

What do notched box plots include?

A

Information about the uncertainty in the median (the notches mark an approximate confidence interval around it)

28
Q

What are violin plots a combination of?

A

A box plot and a smoothed sideways histogram

29
Q

What do the x and y variables usually represent?

A

X - explanatory variable

Y - response variable

30
Q

Why would a scatter plot be used?

A

To show the relationship between two continuous variables