2- Visualisation and Presentation of Data Flashcards

1
Q

Discrete data

A

Data whose values are quantitative and whose possible values are finite or countable.

2
Q

Continuous data

A

Data that result from infinitely many possible quantitative values, where the collection of values is not countable.

3
Q

Levels of measurement

A

Nominal
Ordinal
Interval
Ratio

4
Q

Nominal level of measurement

A
  • The nominal level of measurement is characterised by data that consist of names, labels, or categories only.
  • The data cannot be arranged in an ordering scheme.
  • Cannot be used for calculations.
  • Numbers are sometimes assigned to the different categories.
5
Q

Example of nominal level of measurement

A

Social security numbers: substitutes for names; they do not count or measure anything.
Survey responses: Yes/No/Undecided.

6
Q

Ordinal level of measurement

A
  • Data are at the ordinal level of measurement if they can be arranged in some order.
  • The differences (obtained by subtraction) between data values either cannot be determined or are meaningless.
7
Q

Examples of ordinal level of measurement

A
  • Course grades: A college professor assigns grades of A, B, C, D or E. These grades can be arranged in order, but we cannot determine the differences.
  • Ranks
8
Q

Interval level of measurement

A
  • Data are at the interval level of measurement if they can be arranged in order, and differences between data values can be found and are meaningful.
  • Data at this level do not have a natural zero starting point at which none of the quantity is present
10
Q

Examples of Interval level of measurement

A

Outdoor temperatures: 18 ℃ and 34 ℃ are examples of data at the interval level of measurement. We can determine their difference of 16 ℃, but there is no natural starting point. Though 0 ℃ seems like a starting point, it is arbitrary and does not represent the total absence of heat.
Years: The years 1492 and 1776 can be arranged in order, and their difference (284 years) can be determined and is meaningful. However, time did not begin in the year zero, so zero is arbitrary rather than a natural zero starting point representing "no time".

11
Q

Ratio level of measurement

A
  • Data are at the ratio level of measurement if they can be arranged in order, differences can be found and are meaningful, and there is a natural zero starting point.
  • Here, zero indicates that none of the quantity is present.
  • For data at this level, differences and ratios are both meaningful.
12
Q

Examples of Ratio Level of measurement

A
  • Car lengths: 106 inches for a Smart car and 212 inches for a Mercury Grand Marquis (0 inches represents no length, and 212 inches is twice as long as 106 inches).
  • Class times: 50 minutes and 100 minutes for a statistics class (0 minutes represents no class time, and 100 minutes is twice as long as 50 minutes).
13
Q

More on the ratio level of measurement

A
  • Ratio level of measurement is called ratio because the zero starting point makes ratios meaningful.
  • Consider two quantities where one number is twice the other, and ask whether "twice" can be used to correctly describe the data.

Example:
- We can say a person with a height of 6 ft is twice as tall as a person with a height of 3 ft, so heights are at the ratio level of measurement.
- However, we cannot say 50 ℃ is twice as hot as 25 ℃, so temperatures in ℃ are not at the ratio level.
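
As a rough illustration of the "twice" test, the sketch below checks whether a ratio survives a change of units. The conversions to centimetres and to kelvin are not part of the card; they are assumptions brought in for the demonstration (kelvin has a true zero, so it is a ratio-level temperature scale).

```python
# Heights are ratio-level: the ratio is the same whatever unit is used.
CM_PER_FOOT = 30.48  # unit conversion, assumed for illustration

tall_ft, short_ft = 6.0, 3.0
ratio_feet = tall_ft / short_ft
ratio_cm = (tall_ft * CM_PER_FOOT) / (short_ft * CM_PER_FOOT)


def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to kelvin (true zero at absolute zero)."""
    return celsius + 273.15


# Celsius is interval-level: its zero is arbitrary, so the ratio is not stable.
hot_c, cool_c = 50.0, 25.0
ratio_celsius = hot_c / cool_c
ratio_kelvin = celsius_to_kelvin(hot_c) / celsius_to_kelvin(cool_c)

print(f"height ratio: {ratio_feet} (feet), {ratio_cm} (cm)")
print(f"temperature ratio: {ratio_celsius} (Celsius), {ratio_kelvin:.3f} (kelvin)")
```

The height ratio stays at 2 in both units, while the apparent factor of 2 in Celsius shrinks to roughly 1.08 on the kelvin scale, which is why "twice as hot" is not a meaningful statement for Celsius readings.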

14
Q

Tables which can represent qualitative data

A

Frequency Table
Relative frequency table
Percentage frequency table
Cumulative frequency table

15
Q

Tables which can represent quantitative data

A

Frequency, Relative frequency, percentage frequency, cumulative frequency tables – instead of using category names we use discrete values here
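
A minimal sketch of how the four tables could be built for discrete values. The function name `frequency_table` and the sample data (number of children per household) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def frequency_table(values):
    """Build frequency, relative, percentage and cumulative columns."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):          # one row per distinct discrete value
        freq = counts[value]
        cumulative += freq                # running total of frequencies
        rows.append({
            "value": value,
            "frequency": freq,
            "relative": freq / n,         # proportion of all observations
            "percentage": 100 * freq / n,
            "cumulative": cumulative,
        })
    return rows


children = [0, 2, 1, 2, 3, 1, 2, 0, 2, 1]  # hypothetical survey data
table = frequency_table(children)
for row in table:
    print(row)
```

The same approach would work for qualitative data by using category names as the keys, although a cumulative column is only meaningful when the categories have an order.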

16
Q

Charts which can represent quantitative data

A

Histogram
Bar chart
Line graph
Scatter graph
Box plot

17
Q

Charts which can represent qualitative data

A

Bar graph
Pie graph

18
Q

Measures of location

A

The mean,
The median,
The mode,
Skewness and kurtosis (strictly, measures of shape rather than location)

19
Q

Measures of dispersion (variability)

A

The range and percentiles
Quartiles and interquartile range
The mean deviation
The variance
The standard deviation.

20
Q

The mean

A
  • Perhaps the most important measure of location is the mean, or average value, for a variable.
  • For sample data, the mean is denoted by x̄.
  • For a population, the Greek letter μ is used to denote the mean.
21
Q

Mode: when there is one mode?

A

Unimodal

22
Q

Mode: when there are two modes?

A

Bimodal

23
Q

Skewness definition

A

A measure of the degree of asymmetry of a distribution.

24
Q

Kurtosis definition

A

A measure of whether the data are peaked or flat relative to a normal distribution.

25
Q

Description of negative skew

A

Has a high frequency of relatively high values and a low frequency of relatively low values, so the mean is dragged toward the left (the low values) of the distribution.

26
Q

Mean, median and mode of negative skew

A

mode > median > mean

27
Q

Description of zero skew (symmetrical distribution)

A

Said to be symmetrical; the mean, median and mode have the same value and thus coincide at the same point of the distribution.

28
Q

Mean, median and mode of zero skew (symmetrical distribution)

A

Mean = median = mode

29
Q

Description of positive skew

A

Has a high frequency of relatively low values and a low frequency of relatively high values, so the mean is dragged toward the right (the high values) of the distribution.

30
Q

Mean, mode and median of positive skew

A

Mean > median > mode
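
The orderings for positive and negative skew can be checked on a small data set. The sketch below uses Python's standard `statistics` module; both data sets are made up for illustration, and the ordering is a typical pattern rather than something guaranteed for every skewed sample.

```python
import statistics

# Hypothetical right-skewed data: many low values, a few high ones.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
# Mirror image of the same data, giving a left-skewed set.
left_skewed = [15 - x for x in right_skewed]


def location_summary(data):
    """Return (mean, median, mode) for a list of numbers."""
    return (statistics.mean(data), statistics.median(data), statistics.mode(data))


pos_mean, pos_median, pos_mode = location_summary(right_skewed)
neg_mean, neg_median, neg_mode = location_summary(left_skewed)

print("positive skew:", pos_mean, pos_median, pos_mode)  # mean > median > mode
print("negative skew:", neg_mean, neg_median, neg_mode)  # mode > median > mean
```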

31
Q

Is the mean a good measure of central tendency with skewed data?

A

No, because it is sensitive to extreme values.
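
A small sketch of that sensitivity, using hypothetical salary figures (in thousands): adding a single extreme value moves the mean a long way but barely changes the median.

```python
import statistics

salaries = [28, 30, 31, 33, 35]      # hypothetical salaries, in thousands
with_outlier = salaries + [250]      # one extreme value added

mean_before = statistics.mean(salaries)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(salaries)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before} -> {mean_after:.1f}")
print(f"median: {median_before} -> {median_after}")
```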

33
Q

Positive kurtosis

A

The most peaked of the three distributions

34
Q

Positive kurtosis

A

The most peaked of the three distributions: leptokurtic

35
Q

Normal kurtosis

A

Mesokurtic

36
Q

Negative kurtosis

A

The flattest of the three distributions, with no obvious peak: platykurtic

37
Q

Mean value, standard deviation and variance across the three kurtosis types

A

The mean value, the standard deviation and therefore the variance are the same for all three distributions.

38
Q

Percentile

A

A percentile provides information about how the data are spread over the interval from the smallest value to the largest value.
The pth percentile is a value such that at least p per cent of the observations are less than or equal to this value and at least (100 − p) per cent of the observations are greater than or equal to this value.

39
Q

How to calculate the pth percentile

A

1- Arrange the data in ascending order (smallest value to largest value).
2- Compute an index i = (p/100) × n, where p is the percentile of interest and n is the number of observations.
3 a) If i is not an integer, round up: the next integer greater than i denotes the position of the pth percentile.
3 b) If i is an integer, the pth percentile is the average of the values in positions i and i + 1.
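
The steps above can be sketched in Python as follows. The function name `percentile` and the sample scores are hypothetical; the sketch assumes non-empty data and 0 < p < 100, and it follows this card's index rule rather than the interpolation methods that statistical libraries often use by default.

```python
import math


def percentile(data, p):
    """Return the pth percentile using the index rule i = (p / 100) * n."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    values = sorted(data)                 # step 1: ascending order
    n = len(values)
    i = p * n / 100                       # step 2: compute the index
    if not i.is_integer():
        position = math.ceil(i)           # step 3a: round up to the next integer
        return values[position - 1]       # positions are 1-based, lists 0-based
    position = int(i)                     # step 3b: average positions i and i + 1
    return (values[position - 1] + values[position]) / 2


scores = [50, 15, 90, 35, 70, 20, 60, 40, 80, 55]  # hypothetical data, n = 10
print(percentile(scores, 25))  # i = 2.5, so take position 3
print(percentile(scores, 50))  # i = 5, so average positions 5 and 6
```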

40
Q

Variance definition

A

The variance is a measure of variability that uses all the data. The variance is based on the difference between each data value and the mean. This difference is called a deviation about the mean.

41
Q

How is the deviation about the mean expressed for a sample?

A

(x_i − x̄)

42
Q

How is the deviation about the mean expressed for the population?

A

(x_i − μ)

43
Q

Standard deviation definition

A

The standard deviation is defined to be the positive square root of the variance.
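
A minimal sketch tying the last few cards together: deviations about the mean, the variance built from them, and the standard deviation as its positive square root. The cards do not state the divisors, so the usual conventions are assumed here (n − 1 for a sample, n for a population); the data values are made up.

```python
import math


def deviations(data, centre):
    """Deviation of each value about the given centre (x̄ or μ)."""
    return [x - centre for x in data]


def sample_variance(data):
    """Sum of squared deviations about the sample mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum(d ** 2 for d in deviations(data, mean)) / (n - 1)


def population_variance(data):
    """Sum of squared deviations about the population mean, divided by n."""
    n = len(data)
    mu = sum(data) / n
    return sum(d ** 2 for d in deviations(data, mu)) / n


data = [46, 54, 42, 46, 32]              # hypothetical observations, mean 44
s2 = sample_variance(data)
s = math.sqrt(s2)                        # standard deviation: positive root
print(deviations(data, 44))              # [2, 10, -2, 2, -12]
print(s2, s, population_variance(data))
```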

44
Q

What does a low standard deviation indicate?

A

Low standard deviation indicates that the values tend to be close to the mean of the data set.

45
Q

What does a high standard deviation indicate?

A

High standard deviation indicates that the values are spread out over a wider range

46
Q

Histogram frequency equation

A

Frequency = class width × frequency density
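
A short sketch of the relationship, rearranged to give the bar heights of a histogram with unequal class widths. The class boundaries and frequencies are hypothetical.

```python
def frequency_density(frequency, lower, upper):
    """Bar height for a class: frequency divided by class width."""
    return frequency / (upper - lower)


# (lower bound, upper bound, frequency) for each class; made-up values
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 16), (40, 70, 9)]

densities = [frequency_density(f, lo, hi) for lo, hi, f in classes]

# Recover each frequency as class width × frequency density (bar area).
recovered = [(hi - lo) * d for (lo, hi, _), d in zip(classes, densities)]

print(densities)
print(recovered)
```

Plotting frequency density rather than raw frequency keeps the area of each bar proportional to its frequency, so wide classes are not visually exaggerated.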