Quiz 2 Flashcards

Question 1

Q

What is a nominal scale?

Answer

A

A scale whose numbers serve only as labels or tags for identifying and classifying objects, with a strict one-to-one correspondence between the numbers and the objects, e.g. Medicare numbers.

Question 2

Q

What is an ordinal scale?

Answer

A

A ranking scale in which numbers are assigned to objects to indicate the relative extent to which the objects possess some characteristic, e.g. market position.

Question 3

Q

What is an interval scale?

Answer

A

A scale in which numerically equal distances represent equal values in the characteristic being measured, e.g. attitudes and opinions.

Question 4

Q

What is a ratio scale?

Answer

A

A scale that allows the researcher to identify or classify objects, rank order the objects, compare intervals or differences and compute ratios of scale values.

Question 5

Q

Instead of using one of the four scales of measurement (nominal, ordinal, interval or ratio), what are the two other ways of classifying data?

Answer

A

Metric data
Categorical data

Question 6

Q

What is metric data?

Answer

A

Data which includes interval and ratio data. It is numeric and is measured on some sort of comparative scale, e.g. how old are you in years?

Question 7

Q

What is categorical data?

Answer

A

Data which includes nominal and ordinal data and groups possible responses into two or more separate categories, e.g. are you male or female?

Question 8

Q

Can data fit into both metric and categorical categories?

Answer

A

Yes. Age, for example, can be metric (10 years old) or categorical (falling in a 0-18 year category).
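
The age example can be sketched in Python. This is only an illustration: the function name `age_category` and every band other than 0-18 are assumptions made up for the example, not part of the course material.

```python
def age_category(age_years: int) -> str:
    """Turn a metric age in years into a categorical age band."""
    if age_years < 0:
        raise ValueError("age cannot be negative")
    if age_years <= 18:
        return "0-18"   # the band named on the card
    if age_years <= 64:
        return "19-64"  # hypothetical band, chosen for illustration
    return "65+"        # hypothetical band, chosen for illustration


metric_age = 10                              # metric: a number on a comparative scale
categorical_age = age_category(metric_age)   # categorical: a group label
print(metric_age, categorical_age)
```

The same respondent is described twice: once by a number that can be averaged, once by a label that can only be counted.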

Question 9

Q

The data produced from a multi-item scale such as a Likert scale produces what sort of data?

Answer

A

Individual questions are categorical, but the rating obtained by averaging the responses is metric.
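
A minimal sketch of the averaging step, assuming a hypothetical three-item scale scored from 1 (strongly disagree) to 5 (strongly agree). The item names and answers are invented for illustration.

```python
from statistics import mean

# One respondent's answers to a hypothetical three-item Likert scale.
item_responses = {"item_1": 4, "item_2": 5, "item_3": 3}

# Each single answer is a category label; the average across the items
# is the rating that gets treated as metric data.
scale_score = mean(list(item_responses.values()))
print(scale_score)
```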

Question 10

Q

What is a frequency table?

Answer

A

A tabulation of how many times each of the possible responses was recorded.
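
As a rough sketch, a frequency table can be built with `collections.Counter` from the Python standard library. The list of answers below is made up for the example.

```python
from collections import Counter

# Hypothetical responses to a single survey question.
answers = ["yes", "no", "yes", "unsure", "yes", "no"]

# Count how many times each possible response was recorded.
frequency_table = Counter(answers)

# Print the table, most common response first, with its percentage.
for response, count in frequency_table.most_common():
    share = 100 * count / len(answers)
    print(f"{response:<8}{count:>3}{share:>8.1f}%")
```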

Question 11

Q

What is a pie chart?

Answer

A

A circular chart divided into slices that show each category's share of the whole. It works best where the number of categories is not too large and no individual category is too small.

Question 12

Q

When writing a report what information should always be mentioned?

Answer

A

The sample size
Percentages
Interesting aspects of the responses
Type of tests applied
The middle
The spread
The shape

Question 13

Q

What is important to remember in capturing results?

Answer

A

Do not speculate - keep it fact based. Speculation is for the discussion section.

Question 14

Q

What are the rules of using a histogram?

Answer

A

No space between the bars
Each category (bin) must be the same width

Question 15

Q

What is the difference between a bar chart and a histogram?

Answer

A

Bar charts use categorical data on the x axis
There are gaps between the bars on a bar chart

Question 16

Q

What are the key parts of a histogram or data that need to be described?

Answer

A

The Middle
The Spread
The Shape

Question 17

Q

What are the descriptors of the middle of a data set?

Answer

A

Mean
Mode
Median

Question 18

Q

What is the Mean?

Answer

A

The average - the value obtained by summing all elements in a set and dividing by the number of elements

Question 19

Q

What is the Mode?

Answer

A

It is a measure of central tendency given as the value that occurs the most in the sample distribution.

Question 20

Q

What is the Median?

Answer

A

It is a measure of central tendency given as the value above which half of the values fall and below which half of the values fall.
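
The three descriptors of the middle can be compared on one small data set. This is a sketch using Python's standard `statistics` module; the numbers are invented.

```python
from statistics import mean, median, mode

data = [2, 3, 3, 5, 7, 10]  # made-up observations

data_mean = mean(data)      # sum of all elements divided by how many there are
data_median = median(data)  # half of the values fall above it, half below
data_mode = mode(data)      # the value that occurs most often

print(data_mean, data_median, data_mode)
```

With an even number of observations, `median` averages the two central values, which is why the median here is not itself one of the observations.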

Question 21

Q

What are the descriptors of the spread of a data set?

Answer

A

Range
Percentile
First quartile
Interquartile range
Variance
Standard deviation

Question 22

Q

What is the range?

Answer

A

The difference between the largest and smallest values of a distribution.

Question 23

Q

What is the Percentile?

Answer

A

These are values below which a certain percentage of the data lies, e.g. the 30th percentile has 30% of the data beneath it.

Question 24

Q

What is First Quartile?

Answer

A

This is the value below which the first quarter of the data lies (the 25th percentile).

Question 25

Q

What is the interquartile range?

Answer

A

It is the range of a distribution encompassing the middle 50% of the observations, i.e. from the 25th to the 75th percentile.
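
A hedged sketch of the range, quartiles, interquartile range and a percentile using `statistics.quantiles` (Python 3.8+). The data are made up, and textbooks and software differ in how they interpolate quartiles, so other tools may give slightly different cut points.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # made-up observations

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# n=4 returns the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(data, n=4)

# Interquartile range: the spread of the middle 50% of the observations.
iqr = q3 - q1

# n=100 returns 99 cut points; index 29 is the 30th percentile.
percentile_30 = quantiles(data, n=100)[29]

print(data_range, q1, q2, q3, iqr, percentile_30)
```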

Question 26

Q

What is variance?

Answer

A

It is the mean squared deviation of all the values from the mean.

Question 27

Q

What is standard deviation and what is the important rule with standard deviation?

Answer

A

It is the square root of the variance. The important rule (Chebyshev's rule) is that at least 75% of all data will be within 2 standard deviations of the mean and at least 88.89% of all data within 3 standard deviations of the mean.
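
The 75% and 88.89% figures follow from the bound 1 - 1/k² for k standard deviations. The sketch below checks this on invented data; the helper names `chebyshev_bound` and `share_within` are hypothetical. It uses the population formulas (`pvariance`, `pstdev`), which match the "mean squared deviation" definition; a sample variance would divide by n - 1 instead.

```python
from statistics import mean, pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations

mu = mean(data)
variance = pvariance(data)  # mean squared deviation from the mean
sd = pstdev(data)           # square root of the variance


def chebyshev_bound(k: float) -> float:
    """Minimum share of data within k standard deviations of the mean."""
    return 1 - 1 / k**2


def share_within(values, k: float) -> float:
    """Actual share of values lying within k standard deviations of the mean."""
    return sum(abs(x - mu) <= k * sd for x in values) / len(values)


for k in (2, 3):
    print(k, round(chebyshev_bound(k), 4), share_within(data, k))
```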

Question 28

Q

What are the options for the shape of the data set?

Answer

A

Symmetrical - where the mean equals the median
Negatively skewed - where the mean is lower than the median
Positively skewed - where the mean is higher than the median
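
The mean-versus-median comparison can be written as a small helper. This is a rule-of-thumb sketch only: the function name `describe_shape` and its `tolerance` parameter are assumptions for illustration, and a histogram should still be checked by eye.

```python
from statistics import mean, median


def describe_shape(values, tolerance: float = 0.0) -> str:
    """Classify the shape of a data set by comparing its mean and median."""
    data_mean = mean(values)
    data_median = median(values)
    if abs(data_mean - data_median) <= tolerance:
        return "symmetrical"
    # A long right tail pulls the mean above the median, and vice versa.
    return "positively skewed" if data_mean > data_median else "negatively skewed"


print(describe_shape([1, 2, 3, 4, 5]))
print(describe_shape([1, 2, 2, 3, 20]))
print(describe_shape([1, 18, 19, 19, 20]))
```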

Question 29

Q

When writing your report, what is the rule to use if the distribution is skewed?

Answer

A

Refer to the median and quartiles, not the mean and standard deviation.

Question 30

Q

What is descriptive statistics?

Answer

A

Statistics that summarise the basic features of a data set, such as its middle, spread and shape. It is the basic way data is described.

Question 31

Q

Categorical data can be classified into what two categories?

Answer

A

Ordinal data - values and observations can be classified and ranked
Nominal data - values and observations can be classified but cannot be ranked

Question 32

Q

What is the advantage of a box plot graph?

Answer

A

It shows spread, interquartile range and the median

Question 33

Q

What is a uniform graph?

Answer

A

Where all bars on a bar graph are roughly the same height.

Question 34

Q

What is statistical inference?

Answer

A

It means we can use the sample to say things about the population.

Question 35

Q

What is comparing when referring to statistical inference?

Answer

A

It involves looking at responses based on groups.

Question 36

Q

What is correlation?

Answer

A

Metric data can be plotted on a graph. How closely the points align to a trend line indicates how correlated the data are.
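
One common way to put a number on this alignment is Pearson's correlation coefficient r. The sketch below computes it by hand with made-up data; the function name `pearson_r` is hypothetical, and the cards do not say which coefficient the course uses.

```python
from math import sqrt


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How the two variables move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable never changes")
    return co_movement / sqrt(ss_x * ss_y)


rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # points lie exactly on a rising line
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # points lie exactly on a falling line
print(rising, falling)
```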

Question 37

Q

How is perfect correlation expressed?

Answer

A

As a correlation coefficient of exactly 1 or -1 (r = 1 or r = -1).

Question 38

Q

What is a perfect linear positive relationship?

Answer

A

Where high values of one variable mean high values of another (r = 1)

Question 39

Q

What is a perfect linear negative relationship?

Answer

A

Where high values of one variable mean low values of another (r = -1)

Question 40

Q

What is a non linear relationship?

Answer

A

Values of one variable show no straight-line correlation with values of the other (r = 0). The variables may still be related in a way that is not a straight line.

Question 41

Q

How would a correlation coefficient of 0.8 be described?

Answer

A

Strong positive linear relationship

Question 42

Q

How would a correlation coefficient of -0.5 be described?

Answer

A

Moderate negative linear relationship

Question 43

Q

How would a correlation of 0.1 be described?

Answer

A

Extremely weak linear relationship.
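
The three descriptions above can be reproduced with a small lookup. The cut-off values (0.7, 0.4 and 0.2) are assumptions picked so that 0.8, -0.5 and 0.1 land on the labels used in these cards; different textbooks draw the lines in different places, and `describe_correlation` is a hypothetical name.

```python
def describe_correlation(r: float) -> str:
    """Put a correlation coefficient into words (cut-offs are assumed)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear relationship"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size >= 0.7:   # assumed threshold
        strength = "strong"
    elif size >= 0.4:   # assumed threshold
        strength = "moderate"
    elif size >= 0.2:   # assumed threshold
        strength = "weak"
    else:
        strength = "extremely weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} linear relationship"


for r in (0.8, -0.5, 0.1):
    print(r, describe_correlation(r))
```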

Question 44

Q

What is the purpose of the p-value?

Answer

A

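The p-value is the probability of getting a result at least as extreme as the one in our sample if there were really no effect (no relationship or difference) in the population. It helps determine whether our sample result is likely to hold in the population.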

Question 45

Q

What p-value is conventionally considered acceptable to indicate there is enough evidence to believe the survey result holds in the population?

Answer

A

A p-value below 0.05

Question 46

Q

Where a p-value is high, what can the person do to reduce it?

Answer

A

Increase the sample size
Determine the point at which the results are reasonable (under 0.05)

Question 47

Q

When the p-value is high, how would we write this into our report?

Answer

A

There is insufficient evidence to make a conclusion

Question 48

Q

What are the different types of tests that can be applied?

Answer

A

One Sample t-test
Two sample t-test
Paired t-test
ANOVA (Analysis of Variance)
Chi-square
z-test
F-test

Question 49

Q

What is a point estimate?

Answer

A

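It is a single value from the sample, such as a percentage, used by itself to estimate the population value. Reporting it alone, without a confidence interval, is generally considered bad practice in statistics.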

Question 50

Q

What is a confidence interval?

Answer

A

It is an interval for a parameter estimate, with a specified level of certainty. For example, if the sample gives 55%, we could say with 95% confidence that the population value lies between 45% and 65%.
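
A rough sketch of how such an interval might be computed for a percentage, using the normal approximation. The sample size of 95 is an assumption chosen because it gives roughly 45% to 65% around a 55% estimate; the card itself does not state a sample size, and `proportion_ci` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat: float, n: int, confidence: float = 0.95):
    """Normal-approximation confidence interval for a sample proportion."""
    if not 0 <= p_hat <= 1 or n <= 0 or not 0 < confidence < 1:
        raise ValueError("check p_hat, n and confidence")
    # z is about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


low, high = proportion_ci(0.55, n=95)              # n=95 is an assumed sample size
low_99, high_99 = proportion_ci(0.55, n=95, confidence=0.99)
low_big, high_big = proportion_ci(0.55, n=400)     # larger hypothetical sample

print(round(low, 3), round(high, 3))
print(round(low_99, 3), round(high_99, 3))
print(round(low_big, 3), round(high_big, 3))
```

Comparing the three printed intervals shows the two effects covered in the next cards: asking for more confidence widens the interval, while a larger sample narrows it.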

Question 51

Q

To provide a higher level of confidence, what happens to the range of the confidence interval?

Answer

A

It gets wider.

Question 52

Q

How will a higher sample size affect the confidence interval?

Answer

A

It narrows the interval - the estimate becomes more precise at the same level of confidence.

Question 53

Q

Does correlation mean causation?

Answer

A

No - be very careful. Just because two variables are correlated does not mean they are linked by cause.

Question 54

Q

Correlation tells us about the strength, what does regression do?

Answer

A

It gives us an equation that characterises the straight line relationship between the two variables (an independent variable that predicts a dependent variable)
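
A minimal sketch of fitting that straight line by ordinary least squares, assuming one independent variable. The data and the function name `fit_line` are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    if ss_x == 0:
        raise ValueError("the independent variable must vary")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


independent = [1, 2, 3, 4, 5]   # made-up predictor values
dependent = [3, 5, 7, 9, 11]    # made-up outcome values

slope, intercept = fit_line(independent, dependent)

# Use the fitted equation to predict the dependent variable for a new value.
predicted = intercept + slope * 6
print(slope, intercept, predicted)
```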

Question 55

Q

What is the difference between correlation and regression?

Answer

A

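Regression assumes a direction, with an independent variable treated as predicting a dependent variable, whereas correlation only measures how strongly two variables move together. The regression result by itself still does not prove cause.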

Question 56

Q

What is regression analysis?

Answer

A

A statistical procedure for analysing associative relationships between a metric dependent variable and one or more independent variables.

Question 57

Q

What is factor analysis?

Answer

A

It allows us to find correlating data and to test the correlation of items in a multi-item scale. It is an exploratory technique.

Question 58

Q

What does factor analysis achieve?

Answer

A

Reduces a large number of intercorrelated variables down to a smaller set of meaningful underlying variables.

Question 59

Q

What are the two main uses of factor analysis?

Answer

A

Summarising information
Creating/testing scales

Question 60

Q

What is a factor analysis table?

Answer

A

It shows how strongly each item correlates with each factor. 1.00 equals a perfect correlation (though this is rare).

Question 61

Q

What are the most important items on a factor analysis table?

Answer

A

Those that have a score greater than 0.3 or less than -0.3.

Question 62

Q

What are the three components to a good recommendation?

Answer

A

What - recommending course of action
Why - giving data based evidence for our recommendation
How - further detail about the action.

Question 63

Q

What are the four main steps in the ‘what’ stage of a report?

Answer

A

Find significant relationships in the data
Use this to determine sensible actions to recommend
Recommendations generally relate to 1) features and design and 2) marketing
Common sense check.

Question 64

Q

What are the three main steps in the ‘why’ stage of a report?

Answer

A

Give evidence from your data for why we made our recommendation
Describe relationships in words (not statistical terms)
Common sense check.

Answer 64

A

More detail on the recommendation - be clear and concise and ensure it is a recommendation, not a statement of intent.

Answer 65

A

Define the target population
Determine the sampling frame
Select the sampling techniques
Determine the sample size
Execute the sampling process

Answer 66

A

A representation of the elements of the target population. It consists of a list or set of directions for identifying the target population.

Answer 67

A

Non-probability sampling techniques - relies on the judgment of the researcher
Probability sampling techniques - each element of the population is selected by chance.

Answer 68

A

Convenience sampling
Judgemental sampling
Quota sampling
Snowball sampling

Answer 69

A

Simple random sampling
Systematic sampling
Stratified sampling
Cluster Sampling
Other sampling techniques

Answer 70

A

A non-probability sampling technique that attempts to obtain a sample of convenient elements chosen by the interviewer, e.g. people who are in the right place at the right time, such as shoppers at a shopping centre.

Answer 71

A

A form of convenience sampling (non-probability) in which the population elements are selected based on the researcher’s judgement, e.g. choosing a university as the place to get student opinions on teaching methods.

Answer 72

A

A non-probability sampling technique consisting of two-stage restricted judgemental sampling. The first stage sets quotas that mirror the make-up of the population, e.g. if the population is 40% women, then the quota for a sample of 100 people would be 40 women. The second stage fills each quota based on convenience or judgement.
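
The first stage is simple arithmetic, sketched below. The 40% share of women and the sample of 100 come from the card; the 60% share for men is inferred as the remainder, and the function name `quotas` is hypothetical.

```python
def quotas(population_shares: dict, sample_size: int) -> dict:
    """Stage one of quota sampling: how many of each group to recruit."""
    return {
        group: round(share * sample_size)
        for group, share in population_shares.items()
    }


# 40% women is the card's example; 60% men is assumed as the remainder.
population_shares = {"women": 0.40, "men": 0.60}
sample_quotas = quotas(population_shares, sample_size=100)
print(sample_quotas)
```

Stage two, choosing the actual people to fill each quota, is left to convenience or judgement and so is not something code can decide.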

Answer 73

A

A non-probability sampling technique in which an initial group is selected randomly and its members then identify others, who in turn identify others, and so on. This makes it possible to reach people who have rare characteristics, such as widowed men under 35 years.

Answer 74

A

A probability sampling technique in which every element is selected independently of every other element, and the sample is drawn by a random procedure from the sampling frame.
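
A short sketch of drawing such a sample with Python's standard `random` module. The sampling frame of 100 named people and the fixed seed are assumptions made so the example is reproducible.

```python
import random

# Hypothetical sampling frame: a list identifying every element of the population.
sampling_frame = [f"person_{i}" for i in range(1, 101)]

# A seeded generator keeps the example reproducible; a real study would not fix the seed.
rng = random.Random(42)

# Draw 10 elements at random, without replacement, from the sampling frame.
simple_random_sample = rng.sample(sampling_frame, k=10)
print(simple_random_sample)
```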

Answer 75

A

A probability sampling technique in which the sample is chosen by selecting a random starting point and then picking every ‘x’th element in succession from the sampling frame.
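
The random start followed by every 'x'th element can be sketched as below. The function name `systematic_sample`, the frame of 100 numbers and the step of 10 are all assumptions for illustration.

```python
import random


def systematic_sample(frame, step: int, rng=None):
    """Pick a random starting point, then every step-th element after it."""
    if step < 1 or step > len(frame):
        raise ValueError("step must be between 1 and the size of the frame")
    rng = rng or random.Random()
    start = rng.randrange(step)  # random starting point within the first step
    return frame[start::step]    # every step-th element in succession


sampling_frame = list(range(100))  # hypothetical frame of 100 elements
sample = systematic_sample(sampling_frame, step=10, rng=random.Random(0))
print(sample)
```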

Answer 76

A

A probability sampling technique that uses a two-step process to partition the population into subpopulations, or strata. Elements are selected from each stratum by a random procedure.
MUST MENTION STRATA
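
The two steps can be sketched as follows: first partition the population into strata, then sample at random within each stratum. Sampling the same fraction from every stratum (proportional allocation) is an assumption of this sketch, as are the function name `stratified_sample` and the made-up population.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction: float, rng=None):
    """Partition into strata, then randomly sample a fraction of each stratum."""
    rng = rng or random.Random()
    # Step 1: partition the population into strata.
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)
    # Step 2: select elements from each stratum by a random procedure.
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population of 40 women and 60 men, stratified by sex.
population = [("woman", i) for i in range(40)] + [("man", i) for i in range(60)]
sample = stratified_sample(
    population, stratum_of=lambda person: person[0], fraction=0.1, rng=random.Random(1)
)
print(sample)
```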

Answer 77

A

A two-step probability sampling technique. First the target population is divided into mutually exclusive and collectively exhaustive subpopulations called clusters. Then a random sample of clusters is selected based on a probability sampling technique such as SRS. For each selected cluster, either all the elements are included in the sample or a sample of elements is drawn.

Answer 78

A

An error caused by human error to which statistical analysis is exposed. Examples include data entry errors, biased questions, biased processes or decision making, inappropriate analysis and incorrect conclusions.