Quiz 2 Flashcards

1
Q

What is a nominal scale?

A

A scale whose numbers serve only as labels or tags for identifying and classifying objects, with a strict one-to-one correspondence between the numbers and the objects, e.g. Medicare numbers.

2
Q

What is an ordinal scale?

A

A ranking scale in which numbers are assigned to objects to indicate the relative extent to which the objects possess some characteristic, e.g. market position.

3
Q

What is an interval scale?

A

A scale in which numerically equal distances on the scale represent equal differences in the characteristic being measured, e.g. ratings of attitudes and opinions.

4
Q

What is a ratio scale?

A

A scale that allows the researcher to identify or classify objects, rank order the objects, compare intervals or differences and compute ratios of scale values.

5
Q

Instead of using one of the four scales of measurement (nominal, ordinal, interval or ratio), what are the two other ways to classify data?

A
  • Metric data
  • Categorical data

6
Q

What is metric data?

A

Data measured on an interval or ratio scale. It is numeric and is measured on some sort of comparative scale, e.g. "How old are you in years?"

7
Q

What is categorical data?

A

Data measured on a nominal or ordinal scale. It groups possible responses into two or more separate categories, e.g. "Are you male or female?"

8
Q

Can data fit into both metric and categorical categories?

A

Yes. Age, for example, can be metric (10 years old) or categorical (placed in a 0-18 years category).

9
Q

What sort of data does a multi-item scale such as a Likert scale produce?

A

Individual questions are categorical (ordinal), but the overall rating obtained by averaging the responses across the items is treated as metric.
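As a rough illustration of that averaging step, the sketch below scores one respondent on a four-item scale. The item names and responses are made up for the example and do not come from the course material.

```python
# Hypothetical responses from one respondent to a four-item Likert scale
# (1 = strongly disagree, 5 = strongly agree); item names are made up.
item_responses = {"item_1": 4, "item_2": 5, "item_3": 3, "item_4": 4}

# Each single item is an ordered category; the average across items
# is the scale score that is treated as metric data.
scale_score = sum(item_responses.values()) / len(item_responses)
print(scale_score)
```

The single items can only take the values 1 to 5, whereas the averaged score can fall anywhere in between, which is why it is usually analysed as metric data.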

10
Q

What is a frequency table?

A

A tabulation of how many times each of the possible responses was recorded.
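A minimal sketch of building such a table, assuming a small set of made-up survey responses, could look like this:

```python
from collections import Counter

# Hypothetical survey responses (made-up data for illustration only)
responses = ["yes", "no", "yes", "unsure", "yes", "no"]

# Count how many times each possible response was recorded
frequency_table = Counter(responses)

# Print each response with its count and its percentage of the sample
for response, count in frequency_table.most_common():
    percentage = 100 * count / len(responses)
    print(f"{response:<8}{count:>3}{percentage:>7.1f}%")
```

Reporting the percentage next to each count also covers two of the items a report should mention: the sample size and the percentages.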

11
Q

What is a pie chart?

A

A circular chart in which each category is shown as a slice proportional to its share of the total. It works best where the number of categories is not too large and no individual category is too small.

12
Q

When writing a report what information should always be mentioned?

A
  • the sample size
  • percentages
  • Interesting aspects of the responses
  • type of tests applied
  • the middle
  • the spread
  • the shape
13
Q

What is important to remember in capturing results?

A

Do not speculate; keep it fact-based. Speculation belongs in the discussion section.

14
Q

What are the rules of using a histogram?

A
  • No space between the bars
  • Each category (class interval) must be the same width

15
Q

What is the difference between a bar chart and a histogram?

A
  • Bar charts use categorical data on the x axis; histograms use metric data grouped into intervals.
  • There are gaps between the bars on a bar chart but not on a histogram.

16
Q

What are the key parts of a histogram or data that need to be described?

A
  • The Middle
  • The Spread
  • The Shape
17
Q

What are the descriptors of the middle of a data set?

A
  • Mean
  • Mode
  • Median
18
Q

What is the Mean?

A

The average - the value obtained by summing all elements in a set and dividing by the number of elements

19
Q

What is the Mode?

A

A measure of central tendency given as the value that occurs most often in the sample distribution.

20
Q

What is the Median?

A

A measure of central tendency given as the value above which half of the values fall and below which half of the values fall.
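The three descriptors of the middle can be computed with Python's standard `statistics` module. The data set below is made up for illustration.

```python
import statistics

# Hypothetical data set (made up for illustration)
values = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(values)      # sum of the values divided by how many there are
mode = statistics.mode(values)      # the value that occurs most often
median = statistics.median(values)  # middle of the sorted values (average of the two
                                    # middle values when the count is even)
print(mean, mode, median)
```

In this example the three measures differ, which is a hint that the data are not symmetrical.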

21
Q

What are the descriptors of the spread of a data set?

A
  • Range
  • Percentile
  • First quartile
  • Interquartile range
  • Variance
  • Standard deviation
22
Q

What is the range?

A

The difference between the largest and smallest values of a distribution.

23
Q

What is the Percentile?

A

These are values below which a certain percentage of the data lies, e.g. the 30th percentile has 30% of the data beneath it.

24
Q

What is First Quartile?

A

This is the 25th percentile: the value below which the lowest quarter of the data lies.

25
Q

What is the interquartile range?

A

It is the range of a distribution encompassing the middle 50% of the observations, i.e. the difference between the 75th percentile and the 25th percentile.
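A short sketch of the range, quartiles and interquartile range, using made-up data, is shown below. Statistical packages use slightly different conventions for locating quartiles; the `method="inclusive"` option of `statistics.quantiles` is just one of them, chosen here as an assumption.

```python
import statistics

# Hypothetical data set (made up for illustration)
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Range: largest value minus smallest value
value_range = max(values) - min(values)

# Quartiles: the 25th, 50th and 75th percentiles
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Interquartile range: spread of the middle 50% of the observations
iqr = q3 - q1
print(value_range, q1, q2, q3, iqr)
```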

26
Q

What is variance?

A

It is the mean squared deviation of all the values from the mean.

27
Q

What is standard deviation and what is the important rule with standard deviation?

A

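It is the square root of the variance. The important rule (Chebyshev's theorem) is that, whatever the shape of the distribution, at least 75% of all data will be within 2 standard deviations of the mean and at least 88.89% of all data will be within 3 standard deviations of the mean.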
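The 75% and 88.89% figures come from the bound 1 - 1/k² for k standard deviations. The sketch below checks that bound on a small made-up data set. It uses the population formulas (`pvariance`, `pstdev`), which match the "mean squared deviation" definition of variance; a sample variance that divides by n - 1 (`statistics.variance`) would give slightly larger values.

```python
import statistics

# Hypothetical data set (made up for illustration)
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)
variance = statistics.pvariance(values)  # mean squared deviation from the mean
std_dev = statistics.pstdev(values)      # square root of the variance

def share_within(k):
    """Fraction of the values lying within k standard deviations of the mean."""
    inside = [v for v in values if abs(v - mean) <= k * std_dev]
    return len(inside) / len(values)

for k in (2, 3):
    minimum_share = 1 - 1 / k**2  # guaranteed minimum: 0.75 for k=2, about 0.8889 for k=3
    print(k, share_within(k), minimum_share)
```

The rule only gives a guaranteed minimum, so the actual share inside each band is often much higher, as it is for this data set.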

28
Q

What are the options for the shape of the data set?

A

Symmetrical - where the mean equals the median
Negatively skewed - where the mean is lower than the median
Positively skewed - where the mean is higher than the median
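The mean-versus-median comparison can be turned into a quick check. This is only the rule of thumb from the card, not a formal skewness statistic, and the function name and `tolerance` parameter are assumptions made for the sketch.

```python
import statistics

def describe_shape(values, tolerance=0.0):
    """Classify the shape of a data set by comparing its mean and median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if abs(mean - median) <= tolerance:
        return "symmetrical"
    # A long right tail pulls the mean above the median, and vice versa
    return "positively skewed" if mean > median else "negatively skewed"

# Hypothetical data sets (made up for illustration)
print(describe_shape([1, 2, 3, 4, 5]))
print(describe_shape([1, 2, 3, 4, 20]))
print(describe_shape([1, 17, 18, 19, 20]))
```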

29
Q

When writing your report, what is the rule to use if the distribution is skewed?

A

Refer to the median and quartiles, not the mean and standard deviation.

30
Q

What is descriptive statistics?

A

The branch of statistics that summarises the main features of a data set, such as its middle, spread and shape. It is the most basic way of describing data.

31
Q

Categorical data can be classified into what two categories?

A
  • Ordinal data - values and observations can be classified and ranked.
  • Nominal data - values and observations can be classified but cannot be ranked.

32
Q

What is the advantage of a box plot graph?

A

It shows spread, interquartile range and the median

33
Q

What is a uniform graph?

A

A graph of a uniform distribution, where all the bars are roughly the same height.

34
Q

What is statistical inference?

A

It means we can use the sample to say things about the population.

35
Q

What is comparing when referring to statistical inference?

A

It involves looking at how responses differ between groups.

36
Q

What is correlation?

A

A measure of the strength and direction of the linear relationship between two metric variables. When the data are plotted on a graph, how closely the points align to a trend line indicates how strongly correlated they are.

37
Q

How is perfect correlation expressed?

A

r = 1 (perfect positive correlation) or r = -1 (perfect negative correlation)

38
Q

What is a perfect linear positive relationship?

A

Where high values of one variable go with high values of the other and all points lie exactly on a straight line (r = 1)

39
Q

What is a perfect linear negative relationship?

A

Where high values of one variable go with low values of the other and all points lie exactly on a straight line (r = -1)

40
Q

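What does a correlation of r = 0 mean?

A

There is no linear relationship: values of one variable do not correlate with values of the other at all (r = 0). A non-linear (e.g. curved) relationship can still exist when r = 0.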
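The correlation coefficient r can be computed by hand to see the three cases. The sketch below uses the usual Pearson formula on made-up data; the function name is an assumption for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])  # points on a rising straight line
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])  # points on a falling straight line
r_curved = pearson_r(xs, [4, 1, 0, 1, 4])     # y = (x - 3)^2, a U-shaped curve
print(r_positive, r_negative, r_curved)
```

The U-shaped example gives r = 0 even though y is completely determined by x, which shows that r only measures the linear part of a relationship.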

41
Q

How would a correlation coefficient of 0.8 be described?

A

Strong positive linear relationship

42
Q

How would a correlation coefficient of -0.5 be described?

A

Moderate negative linear relationship

43
Q

How would a correlation of 0.1 be described?

A

Extremely weak positive linear relationship.

44
Q

What is the purpose of the p-value?

A

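The p-value is the probability of getting a result at least as extreme as the one seen in our sample if there were really no effect or difference in the population (the null hypothesis). A small p-value means the sample result would be unlikely by chance alone, so it is evidence that the result also holds in the population.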
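To make the idea concrete, the sketch below computes an exact two-sided p-value for a yes/no question, assuming a null hypothesis that the population is split 50/50. The scenario (60 "yes" answers out of 100 respondents) and the function name are made up for illustration; this is one simple test, not the only way a p-value is obtained.

```python
from math import comb

def two_sided_p_value(successes, n):
    """Exact binomial p-value under the null hypothesis of a 50/50 split.

    Adds up the probability of every outcome at least as far from n/2
    as the observed number of successes.
    """
    observed_distance = abs(successes - n / 2)
    extreme_outcomes = sum(
        comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= observed_distance
    )
    return extreme_outcomes / 2**n

p_60 = two_sided_p_value(60, 100)  # 60 of 100 respondents said "yes"
p_65 = two_sided_p_value(65, 100)  # 65 of 100 respondents said "yes"
print(round(p_60, 4), round(p_65, 4))
```

Under these assumptions, 60 out of 100 falls just short of the usual 0.05 threshold while 65 out of 100 is well under it, so only the second result would normally be reported as significant.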

45
Q

What is the conventionally accepted p-value threshold for deciding there is enough evidence that a result from the survey holds in the population?

A

A p-value below 0.05 (the 5% significance level)

46
Q

Where a p-value is high, what can the person do to reduce it?

A
  • Increase the sample size
  • Determine the point at which the results are reasonable (under 0.05)

47
Q

When the p-value is high, how would we write this into our report?

A

There is insufficient evidence to conclude that a relationship or difference exists.

48
Q

What are the different types of tests that can be applied?

A
  • One-sample t-test
  • Two-sample t-test
  • Paired t-test
  • ANOVA (Analysis of Variance)
  • Chi-square
  • z-test
  • F-test
49
Q

What is a point estimate?

A

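It is a single value, such as a percentage from the sample, used by itself as the estimate for the population. Reporting it alone is generally considered bad practice in statistics because it gives no indication of how uncertain the estimate is.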

50
Q

What is a confidence interval?

A

It is an interval for a parameter estimate, with a specified level of confidence. For example, if the sample gives 55%, we could say with 95% confidence that the population value lies between 45% and 65%.
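A rough sketch of how such an interval can be computed for a percentage is shown below. It uses the common normal-approximation formula, and the sample size of 100 is an assumption chosen for illustration (the card does not state one).

```python
import math

def proportion_confidence_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion.

    p_hat: proportion observed in the sample (e.g. 0.55)
    n:     sample size
    z:     critical value; 1.96 corresponds to 95% confidence
    """
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 55% observed in an assumed sample of 100 respondents
low, high = proportion_confidence_interval(0.55, 100)
print(f"{low:.3f} to {high:.3f}")
```

With these assumed numbers the interval comes out close to the 45-65% range in the example above.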

51
Q

To provide a higher level of confidence, what happens to the width of the interval?

A

It gets wider.

52
Q

How will a higher sample size affect the confidence interval?

A

It narrows the interval, giving a more precise estimate at the same level of confidence.
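Both effects, a wider interval for higher confidence and a narrower one for a larger sample, can be seen in a small sketch. It assumes the normal-approximation interval for a proportion and uses made-up sample sizes; the critical values are the standard ones for each confidence level.

```python
import math

# Standard normal critical values for common confidence levels
Z_VALUES = {0.90: 1.645, 0.95: 1.960, 0.99: 2.576}

def interval_width(p_hat, n, confidence):
    """Total width of the normal-approximation interval for a proportion."""
    return 2 * Z_VALUES[confidence] * math.sqrt(p_hat * (1 - p_hat) / n)

width_95_n100 = interval_width(0.55, 100, 0.95)
width_99_n100 = interval_width(0.55, 100, 0.99)  # higher confidence, same sample
width_95_n400 = interval_width(0.55, 400, 0.95)  # same confidence, larger sample
print(width_95_n100, width_99_n100, width_95_n400)
```

Because the width depends on the square root of the sample size, quadrupling the sample only halves the interval.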

53
Q

Does correlation mean causation?

A

No. Be very careful: just because two variables are correlated does not mean one causes the other.

54
Q

Correlation tells us about the strength of a relationship; what does regression do?

A

It gives us an equation that characterises the straight-line relationship between the two variables (an independent variable that predicts a dependent variable).
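A minimal sketch of finding that equation by ordinary least squares is shown below. The data and the function name are made up for illustration, and real analyses would normally use a statistics package rather than hand-written code.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: independent variable xs, dependent variable ys
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
predicted = intercept + slope * 6  # use the equation to predict y when x = 6
print(slope, intercept, predicted)
```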

55
Q

What is the difference between correlation and regression?

A

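Correlation only measures the strength and direction of the association and treats both variables the same way. Regression names one variable as independent and the other as dependent and gives an equation for predicting one from the other. Regression assumes a direction of influence, but it does not by itself prove cause.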

56
Q

What is regression analysis?

A

A statistical procedure for analysing associative relationships between a metric dependent variable and one or more independent variables.

57
Q

What is factor analysis?

A

It allows us to find groups of correlated variables and to test how well the items in a multi-item scale correlate with each other. It is an exploratory technique.

58
Q

What does factor analysis achieve?

A

Reduces a large number of intercorrelated variables down to a smaller set of meaningful underlying variables.

59
Q

What are the two main uses of factor analysis?

A
  • Summarising information
  • Creating/testing scales

60
Q

What is a factor analysis table?

A

It shows how strongly each item loads on (correlates with) each factor. A loading of 1.00 equals a perfect correlation, though this is rare.

61
Q

What are the most important items on a factor analysis table?

A

Those with a score greater than 0.3 or less than -0.3, i.e. an absolute value above 0.3.

62
Q

What are the three components to a good recommendation?

A

What - recommending course of action
Why - giving data based evidence for our recommendation
How - further detail about the action.

63
Q

What are the four main steps in the ‘what’ stage of a report?

A
  • Find significant relationships in the data
  • Use this to determine sensible actions to recommend
  • Recommendations generally relate to 1) features and design and 2) marketing
  • Common sense check.
64
Q

What are the three main steps in the ‘why’ stage of a report?

A
  • Give evidence from your data for why we made our recommendation
  • Describe relationships in words (not statistical terms)
  • Common sense check.
65
Q

What goes into the ‘how’ stage of a report?

A

More detail on the recommendation - be clear and concise and ensure it is a recommendation, not a statement of intent.

66
Q

What is the sampling design process?

A
  • Define the target population
  • Determine the sampling frame
  • Select the sampling techniques
  • Determine the sample size
  • Execute the sampling process
67
Q

What is the sampling frame?

A

A representation of the elements of the target population. It consists of a list or set of directions for identifying the target population.

68
Q

What are the two major types of sampling techniques?

A
  • Non-probability sampling techniques - rely on the judgement of the researcher rather than chance.
  • Probability sampling techniques - elements of the population are selected by chance, so each has a known probability of being included.
69
Q

What are the four types of non-probability sampling techniques?

A
  • Convenience sampling
  • Judgemental sampling
  • Quota sampling
  • Snowball sampling
70
Q

What are the five types of probability sampling techniques?

A
  • Simple random sampling
  • Systematic sampling
  • Stratified sampling
  • Cluster Sampling
  • Other sampling techniques
71
Q

What is convenience sampling?

A
  • A non-probability sampling technique that attempts to obtain a sample of convenient elements, with the selection left to the interviewer, e.g. people who are in the right place at the right time, such as shoppers in a shopping centre
72
Q

What is judgemental sampling?

A
  • A form of convenience sampling (non-probability) in which the population elements are selected based on the researcher’s judgement, e.g. going to a university to get student opinions on teaching methods
73
Q

What is quota sampling?

A
  • A non-probability sampling technique consisting of two-stage restricted judgemental sampling. The first stage sets quotas that match the make-up of the population, e.g. if the population is 40% women, then the quota for a sample of 100 people would be 40 women. In the second stage, the people who fill each quota are chosen based on convenience or judgement.
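The first stage is simple arithmetic, sketched below. The 40% share for women comes from the card; the 60% share for men and the function name are assumptions added so the example is complete.

```python
def quotas_for_sample(population_shares, sample_size):
    """Stage one of quota sampling: turn population shares into head counts."""
    return {
        group: round(share * sample_size)
        for group, share in population_shares.items()
    }

# 40% women is from the card; 60% men is assumed so the shares sum to 100%
quotas = quotas_for_sample({"women": 0.40, "men": 0.60}, sample_size=100)
print(quotas)
```

The second stage, filling each quota by convenience or judgement, is done by the interviewer and is not something the code decides.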
74
Q

What is snowball sampling?

A

A non-probability sampling technique in which an initial group of respondents is selected randomly; those respondents then identify others who belong to the target population, and so on. This makes it possible to reach people who have rare characteristics, such as widowed men under 35 years.

75
Q

What is simple random sampling?

A

A probability sampling technique in which each element has a known and equal chance of selection. Every element is selected independently of every other element, and the sample is drawn by a random procedure from the sampling frame.
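As a small sketch, a simple random sample can be drawn with Python's `random` module. The sampling frame of 100 customer IDs is made up, and the fixed seed is only there so the example gives the same result each time it is run.

```python
import random

# Hypothetical sampling frame: a list of 100 made-up customer IDs
sampling_frame = [f"customer_{i:03d}" for i in range(1, 101)]

# Fixed seed so the illustration is reproducible
rng = random.Random(42)

# Draw 10 elements without replacement; every element has the same chance
sample = rng.sample(sampling_frame, k=10)
print(sample)
```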

76
Q

What is systematic sampling?

A

A probability sampling technique in which the sample is chosen by selecting a random starting point and then picking every ‘x’th element in succession from the sampling frame.
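The procedure can be sketched as follows, assuming a made-up sampling frame of 100 numbered elements. The function name and the choice of interval (frame size divided by sample size, rounded down) are assumptions for illustration.

```python
import random

def systematic_sample(frame, sample_size, rng):
    """Pick a random start, then every x-th element of the sampling frame."""
    interval = len(frame) // sample_size  # the 'x' in "every x-th element"
    start = rng.randrange(interval)       # random starting point in the first interval
    return [frame[start + i * interval] for i in range(sample_size)]

# Hypothetical sampling frame: elements numbered 0 to 99
frame = list(range(100))
sample = systematic_sample(frame, sample_size=10, rng=random.Random(7))
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined by the interval.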

77
Q

What is stratified sampling?

A

A probability sampling technique that uses a two-step process to partition the population into subpopulations, or strata. Elements are selected from each stratum by a random procedure.
(Note: the answer must mention strata.)
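The two steps can be sketched as below: first group the population into strata, then draw a random sample within each stratum. The student population, the 10% sampling fraction and the function name are all made up for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, rng):
    """Step 1: partition into strata. Step 2: sample randomly within each."""
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)

    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)  # same fraction from every stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 undergraduates and 40 postgraduates
population = [("undergraduate", i) for i in range(60)]
population += [("postgraduate", i) for i in range(40)]

sample = stratified_sample(
    population, stratum_of=lambda e: e[0], fraction=0.1, rng=random.Random(1)
)
print(sample)
```

Taking the same fraction from each stratum keeps the sample's make-up in line with the population; other allocation schemes are possible.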

78
Q

What is cluster sampling?

A

A two-step probability sampling technique. First the target population is divided into mutually exclusive and collectively exhaustive subpopulations called clusters. Then a random sample of clusters is selected based on a probability sampling technique such as simple random sampling (SRS). For each selected cluster, either all the elements are included in the sample or a sample of elements is drawn.

79
Q

What are examples of a non-sampling error or bias?

A

A non-sampling error is an error caused by human mistakes or flaws in the research process rather than by the sampling itself. Examples include data entry errors, biased questions, biased processes or decision making, inappropriate analysis and incorrect conclusions.