Week 4 Flashcards

1
Q

what is frequency

A

how often a value appears in data

2
Q

what does a histogram show

A

how data is distributed

3
Q

what is the mode

A

the most common piece of data
can be used for all types of variables but mostly nominal and ordinal

4
Q

what is the median

A

the middle value when the data are sorted in order
cannot be used for nominal variables

5
Q

what is the mean

A

mean = sum of all values / number of data points
can only be used for interval and ratio variables

6
Q

how similar are median and mean

A

close together when the data are roughly symmetric, but one extreme outlier can hugely affect the mean while barely moving the median
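A small illustration of the outlier effect, using Python's standard `statistics` module. The numbers are made up for the example.

```python
from statistics import mean, median

# Made-up data: a tight cluster, then the same data with one extreme value added.
data = [2, 3, 4, 5, 6]
with_outlier = data + [100]

mean_before, median_before = mean(data), median(data)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)  # mean and median agree on the clean data
print(mean_after, median_after)    # the mean jumps, the median barely moves
```

Here the mean goes from 4 to 20 while the median only goes from 4 to 4.5.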

7
Q

what is spread

A

how widely the data values are scattered, e.g. measured by range, variance or SD

8
Q

what are quantiles, quartiles and percentiles

A
  • quantile: cut points that split sorted data into equal-sized sections
  • quartile: name for the quantiles if there are 4 sections total (3 cut points)
  • percentile: name for the quantiles if there are 100 sections total (99 cut points)
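A minimal sketch of quartiles and percentiles using `statistics.quantiles` from the Python standard library. The data are made up, and different software uses slightly different interpolation rules, so the exact cut points can vary between tools.

```python
from statistics import quantiles

# Made-up data: the whole numbers 1 to 11.
data = list(range(1, 12))

# n=4 gives the 3 cut points that split the data into 4 sections (quartiles).
quartiles = quantiles(data, n=4)

# n=100 gives the 99 cut points that split the data into 100 sections (percentiles).
percentiles = quantiles(data, n=100)

print(quartiles)         # [3.0, 6.0, 9.0]
print(len(percentiles))  # 99
```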
9
Q

what is variance

A
  • 2nd moment about the mean
  • sum of (distance from mean)^2 over all data points / number of data points
10
Q

what is standard deviation

A

the square root of the variance
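A worked sketch of variance and standard deviation computed by hand and checked against the standard library. It assumes the population form (dividing by the number of data points), as on the variance card; the sample form divides by one less. The data are made up.

```python
from math import sqrt
from statistics import pvariance, pstdev

# Made-up data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mu = sum(data) / n

# Variance: sum of squared distances from the mean, divided by the number of points.
variance = sum((x - mu) ** 2 for x in data) / n

# Standard deviation: square root of the variance.
sd = sqrt(variance)

print(variance, sd)                      # 4.0 2.0
print(pvariance(data), pstdev(data))     # the library's population versions agree
```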

11
Q

what is the z-score

A

how many standard deviations a value lies from the mean
z-score = (value - mean) / SD
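A short sketch of the z-score as the distance from the mean measured in standard deviations. The helper name `z_score` and the data are made up for illustration, and the population SD is assumed.

```python
from statistics import mean, pstdev

def z_score(value, data):
    """How many population standard deviations `value` lies from the mean of `data`."""
    return (value - mean(data)) / pstdev(data)

# Made-up data with mean 5 and SD 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

z_high = z_score(9, data)  # 9 is two SDs above the mean
z_mid = z_score(5, data)   # the mean itself has a z-score of zero

print(z_high, z_mid)
```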

12
Q

what is skewness

A

the degree of asymmetry
3rd moment about the mean
sum of (distance from mean)^3 over all data points / number of data points
skewness = 3rd moment / SD^3
zero skewness means data are symmetrically distributed
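The skewness formula can be sketched in a few lines. The function name `skewness` and the data are made up for illustration, and the population form (dividing by the number of data points) is assumed.

```python
def skewness(data):
    """3rd moment about the mean divided by SD cubed (population form)."""
    n = len(data)
    mu = sum(data) / n
    m2 = sum((x - mu) ** 2 for x in data) / n  # variance (2nd moment)
    m3 = sum((x - mu) ** 3 for x in data) / n  # 3rd moment
    return m3 / m2 ** 1.5                      # SD^3 equals variance^1.5

symmetric = skewness([1, 2, 3, 4, 5])      # symmetric data
right_skewed = skewness([1, 2, 3, 4, 10])  # one large value stretches the right tail

print(symmetric, right_skewed)
```

The symmetric data give zero skewness, and the data with a long right tail give a positive value (about 1.14).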

13
Q

what is kurtosis

A

sharpness of the peak of the distribution (also reflects how heavy the tails are)
4th moment about the mean
sum of (distance from mean)^4 over all data points / number of data points
kurtosis = 4th moment / SD^4
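A sketch of the kurtosis formula in the same style. The function name `kurtosis` and the data are made up, and the population form is assumed. For reference, a normal distribution has a kurtosis of 3 under this definition.

```python
def kurtosis(data):
    """4th moment about the mean divided by SD to the 4th power (population form)."""
    n = len(data)
    mu = sum(data) / n
    m2 = sum((x - mu) ** 2 for x in data) / n  # variance (2nd moment)
    m4 = sum((x - mu) ** 4 for x in data) / n  # 4th moment
    return m4 / m2 ** 2                        # SD^4 equals variance^2

flat = kurtosis([1, 2, 3, 4, 5])              # evenly spread data, no sharp peak
peaked = kurtosis([1, 5, 5, 5, 5, 5, 5, 9])   # most values piled up at the centre

print(flat, peaked)
```

The evenly spread data give about 1.7, and the data piled up at the centre give 4.0.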

14
Q

what are outliers

A

extreme values relative to the bulk of values in a data set
- commonly flagged when the z-score is more than 3 or less than -3
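A minimal sketch of the z-score rule for outliers. The data are made up, the population SD is assumed, and the threshold of 3 is the one on the card. Note that a small data set can never reach a z-score of 3, so the rule only makes sense with enough data points.

```python
from statistics import mean, pstdev

# Made-up data: 19 values between 10 and 13, plus one extreme value.
data = [12, 11, 13, 12, 10, 11, 12, 13, 11, 12,
        12, 11, 13, 12, 10, 11, 12, 13, 12, 60]

mu = mean(data)
sd = pstdev(data)

# Keep every value whose z-score is more than 3 or less than -3.
outliers = [x for x in data if abs((x - mu) / sd) > 3]

print(outliers)  # [60]
```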

15
Q

what is a box plot

A

a plot summarising quartile-based statistics of a data set
includes
- location of quartiles
- range of data excluding outliers
- outliers detected from the quartiles
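The numbers behind a box plot can be sketched without any plotting library. This assumes the common convention that points more than 1.5 times the interquartile range beyond the quartiles count as outliers; the card does not state which rule is used, and the data are made up.

```python
from statistics import quantiles

# Made-up data: the whole numbers 1 to 11, plus one extreme value.
data = sorted(list(range(1, 12)) + [30])

q1, q2, q3 = quantiles(data, n=4)  # location of the quartiles
iqr = q3 - q1                      # interquartile range (height of the box)

# Assumed convention: fences sit 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]
inside = [x for x in data if lower_fence <= x <= upper_fence]

# The whiskers show the range of the data excluding outliers.
whisker_low, whisker_high = min(inside), max(inside)

print(q1, q2, q3)
print(whisker_low, whisker_high, outliers)
```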

16
Q

what is probability

A

how likely an event is, as a number between 0 and 1
conditional probability P(A|B): the probability of obtaining A on the condition of B
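A small sketch of the probability of A on the condition of B, using one roll of a fair die. The events A and B and the helper `prob` are made up for illustration; the calculation uses P(A|B) = P(A and B) / P(B).

```python
from fractions import Fraction

outcomes = range(1, 7)                       # faces of a fair six-sided die
A = {x for x in outcomes if x % 2 == 0}      # event A: the roll is even
B = {x for x in outcomes if x > 3}           # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all six faces are equally likely."""
    return Fraction(len(event), 6)

# P(A | B) = P(A and B) / P(B)
p_a_given_b = prob(A & B) / prob(B)

print(prob(A), p_a_given_b)  # 1/2 2/3
```

Knowing that B happened raises the probability of A from 1/2 to 2/3.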

17
Q

what is a decision tree

A

a branching diagram of the decisions a person could make and the possible outcomes of each

18
Q

what is binomial distribution

A

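the probability distribution of the number of successes in a fixed number of independent trials, where each trial has only two possible outcomes and the same chance of success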
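A minimal sketch of the binomial distribution for repeated two-outcome trials, using coin tosses as a made-up example. The function name `binomial_pmf` is hypothetical; `math.comb` is from the Python standard library.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Made-up example: number of heads in 4 tosses of a fair coin.
probs = [binomial_pmf(k, 4, 0.5) for k in range(5)]

print(probs)       # [0.0625, 0.25, 0.375, 0.25, 0.0625]
print(sum(probs))  # 1.0, the probabilities of all possible counts add up to one
```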

19
Q

what is cumulative probability

A

the probability that a value is less than or equal to x, found by adding together the probabilities of all values up to x
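A sketch of cumulative probability as a running sum of individual probabilities, using the number of heads in coin tosses as a made-up example. The function names are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """Cumulative probability: chance of k or fewer successes, adding up 0, 1, ..., k."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Made-up example: chance of at most 2 heads in 4 tosses of a fair coin.
p_at_most_2 = binomial_cdf(2, 4, 0.5)

print(p_at_most_2)  # 0.6875, i.e. 0.0625 + 0.25 + 0.375
```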

20
Q

what is a discrete event

A

an event with outcomes you can count, e.g. how many times something has happened

21
Q

what is a continuous event

A

when you are measuring continuous variables e.g. height, weight, error etc

22
Q

what is normal distribution

A

a symmetric, bell-shaped distribution curve centred on the mean
it can be fully described using its mean and SD
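A short sketch showing that a mean and an SD are enough to work with a normal distribution, using `statistics.NormalDist` from the Python standard library. The mean of 170 and SD of 10 (imagine heights in cm) are made-up values.

```python
from statistics import NormalDist

# Made-up example: a normal distribution with mean 170 and SD 10.
heights = NormalDist(mu=170, sigma=10)

# Share of values within 1 SD and within 2 SDs of the mean.
within_one_sd = heights.cdf(180) - heights.cdf(160)
within_two_sd = heights.cdf(190) - heights.cdf(150)

print(round(within_one_sd, 4))  # 0.6827
print(round(within_two_sd, 4))  # 0.9545
```

The same shares hold for any normal distribution, whatever its mean and SD.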