Project Prep Benchtest Flashcards

1
Q

Research question characteristics

A
  • Focused on a single problem
  • Researchable using primary/secondary sources
  • Feasible to answer within the timeframe and practical constraints
  • Specific enough to answer thoroughly
  • Complex enough to develop the answer over the space of a paper or thesis
  • Relevant to your field of study and/or society more broadly
2
Q

Types of Research questions

A
  • Descriptive research
  • Comparative research
  • Correlational research
  • Exploratory research
  • Explanatory research
  • Evaluation research
  • Action research
3
Q

What is in a problem statement?

A
  • Context
  • Specific issue being investigated
  • Why this problem? Why now? Currency?
  • Set objectives (project goals)
4
Q

Descriptive research

A

What are the characteristics of X?

5
Q

Comparative research

A

What are the differences and similarities between X and Y?

6
Q

Correlational research

A

What is the relationship between variable X and variable Y?

7
Q

Exploratory research

A

What are the main factors in X? What is the role of Y in Z?

8
Q

Explanatory research

A

Does X have an effect on Y? What is the impact of Y on Z? What are the causes of X?

9
Q

Evaluation research

A

What are the advantages and disadvantages of X? How well does Y work? How effective or desirable is Z?

10
Q

Action research

A

How can X be achieved? What are the most effective strategies to improve Y?

11
Q

S.M.A.R.T

A

Specific, Measurable, Attainable, Realistic, Timely

12
Q

Inductive vs Deductive research

A

Developing a theory vs testing a theory

13
Q

Exploratory vs Explanatory research

A

Exploring the main aspects of a problem vs explaining the causes and consequences of a well-defined problem

14
Q

Academic critique

A
  • Deep dive into a single body of work
  • Should be a counter argument - need to use external evidence and give counter points
15
Q

Positivist

A
  • Objective study
  • Reductionist (break down complexities into simpler units of study)
  • Verifying theories
  • Can be studied in isolation
16
Q

Critical Theorist

A
  • Knowledge used to empower people
  • Participatory
  • Seeks to bring about change
  • Focus on empowering groups
  • Studied within that context
17
Q

Constructivist

A
  • Truth is relative to context
  • Theory is open to interpretation
  • Generates theories in a given context
  • Cannot be studied in isolation
18
Q

Pragmatist

A
  • All research is biased
  • No objective ‘truth’
  • Works towards practical solutions to problems
  • Multiple answers
  • Seek the best one(s)
19
Q

Reliability

A
  • How consistent are repeated measurements
  • How close together are the measurements
20
Q

Validity

A

Results correspond to the real thing

21
Q

Types of reliability assessments

A
  • Test-retest
  • Inter-rater
  • Internal
22
Q

Types of validity assessments

A
  • Construct
  • Face
  • Concurrent
  • Predictive
23
Q

Test-retest

A

- Determines reliability of the test and results over time
- A good indicator of reliability is a strong correlation (r > 0.8) between the same test given to the same subjects over time
- Only works on consistent attributes
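The r > 0.8 check can be sketched with a hand-rolled Pearson correlation. This is a minimal illustration only: the scores are made up, and `pearson_r` is a hypothetical helper name, not part of the course material.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five subjects, tested on two occasions.
first_sitting = [10, 12, 15, 18, 20]
second_sitting = [11, 12, 16, 17, 21]

r = pearson_r(first_sitting, second_sitting)
is_reliable = r > 0.8  # threshold quoted on this card
print(round(r, 3), is_reliable)
```

The same function could be applied to scores from two different raters to sketch an inter-rater check.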

24
Q

Inter-rater

A
  • Determines reliability of test measurements and results gathered by different researchers
  • Different people should give strongly correlated results
25
Q

Internal

A
  • Do you get same results if you use different tests to measure the same thing
  • Strong correlation supports reliability
26
Q

Construct (Validity assessment)

A

Does the test relate to high level theories

27
Q

Face (Validity assessment)

A

Does test appear to test what it aims to test

28
Q

Concurrent (Validity assessment)

A
  • Does the test relate to an existing similar validated test
  • Work is built on findings of another test and matches their work
29
Q

Predictive (Validity assessment)

A

Does the test predict performance in a later developed test

30
Q

Research Ethics

A

Concerns the responsibility of researchers to be honest and respectful to all individuals who are affected by their research studies or their reports of the studies’ results

31
Q

Research integrity

A

Conducting research in a way that allows others to have trust and confidence in the methods used and findings that result from this

32
Q

Bias

A

Conscious or unconscious influencing of the study and its results

33
Q

Types of Bias

A
  • Recall bias
  • Selection bias
  • Observation bias
  • Confirmation bias
  • Publishing bias
34
Q

Recall Bias

A
  • Survey respondents asked to recall events
  • Some types of events are more likely to be remembered than others
35
Q

Selection bias

A

Samples can sometimes under-represent certain people and over-represent others

36
Q

Observation bias

A
  • Hawthorne Effect
  • When participants are aware that they’re being observed they, either consciously or unconsciously, alter the way they act or the answers they give
37
Q

Confirmation bias

A
  • Occurs during interpretation of study data
  • Researchers consciously or unconsciously look for information or patterns that confirm the ideas or opinions that they already hold
38
Q

Publishing bias

A
  • Studies with negative findings (nothing found) are less likely to be submitted by scientists or published by journals
  • Perceived as less interesting
39
Q

Avoiding bias

A
  • Bias in per-course survey (unbalanced data) - automatic profiling
  • Bias in learning about user instead of type of user (stereotyping) - different users in training and test sets
  • Bias in future data predicting past - train on past, test on future
  • Bias in unbalanced data sample - stratified sampling
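The "train on past, test on future" remedy can be sketched as a date-based split. The record layout, field names and cutoff below are assumptions made purely for illustration.

```python
from datetime import date

def temporal_split(records, cutoff):
    """Train on records before `cutoff`, test on records from `cutoff` onwards."""
    train = [r for r in records if r["when"] < cutoff]
    test = [r for r in records if r["when"] >= cutoff]
    return train, test

# Hypothetical survey records; the field names are illustrative only.
records = [
    {"user": "a", "when": date(2023, 1, 10)},
    {"user": "b", "when": date(2023, 3, 5)},
    {"user": "c", "when": date(2023, 6, 20)},
    {"user": "d", "when": date(2023, 9, 1)},
]

train, test = temporal_split(records, cutoff=date(2023, 6, 1))
print([r["user"] for r in train], [r["user"] for r in test])
```

Because no test record predates any training record, future data cannot leak into the model that is being evaluated.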
40
Q

Literature review

A
  • A survey of scholarly sources on a specific topic(s)
  • Provides an overview of current knowledge, allowing you to identify relevant theories, methods and gaps in existing research
41
Q

Review article

A
  • Summarises current state of understanding on a topic
  • Surveys and summarises previously published studies - rather than report on new facts or analysis
  • Gives roadmap on future research
  • Can be used to back up the validity of your question
42
Q

Surveys

A

Any method focused on asking participants for responses

43
Q

Purpose of Surveys

A
  • Gather information not available from other sources
  • Unbiased representation of the population of interest
  • Collect information from many individuals to understand them as a whole
  • Allows massive information gathering
44
Q

Type of data collected by surveys

A

Mainly quantitative but qualitative methods can be used too

45
Q

Pros of Surveys

A
  • Can get info from large samples
  • Can have different types and numbers of variables
  • Gets info that’s hard to observe
  • Easy and cheap
  • Standardised stimulus - no observer subjectivity
46
Q

Cons of surveys

A
  • Intentional misreporting to hide inappropriate behaviour
  • Poor recall
  • Response rates are critical
  • Can introduce bias from wording of questions
  • Inflexible - can’t be changed during data gathering
  • Not ideal for controversial issues
47
Q

Survey Types by purpose

A
  • Exploratory - form general ideas about research questions
  • Descriptive - collect more specific descriptions of the variables of interest
  • Explanatory - develop understanding of relationships among variables of interest
48
Q

How can you validate surveys?

A
  • Need to validate bias in question design
  • Ask positive and negative questions - should be given opposite answers
  • Validity of survey comes from the representativeness of the sample and the precision of the questions
  • Face validity - Do questions appear reasonable and acquire data you want
  • Content validity - Are questions all about issue and other subjects related to it
  • Internal validity - Do questions imply the outcome you want to achieve
  • External validity - Do questions elicit answers that are generalizable
49
Q

Survey - Research questions

A
  • Correlational questions
  • Less technical questions - usability
  • Exploratory questions
50
Q

Types of sampling

A
  • Random sampling - each member has an equal chance of being picked
  • Stratified sampling - use subsets of the population to sample - lower sampling error
  • Systematic sampling - every Nth name is selected
  • Quota sampling - researcher chooses necessary number of participants per stratum
  • Purposive sampling - researcher selects participants according to criteria
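The first three sampling types can be sketched with the standard `random` module. The population, the strata names and the helper functions `systematic_sample` and `stratified_sample` are hypothetical and chosen only for illustration.

```python
import random

def systematic_sample(population, step, start=0):
    """Select every `step`-th member, beginning at index `start`."""
    return population[start::step]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from each stratum (subset)."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(42)  # fixed seed so the sketch is repeatable
people = [f"person_{i}" for i in range(100)]
strata = {"undergrad": people[:80], "postgrad": people[80:]}

simple = rng.sample(people, 10)                    # random sampling
every_tenth = systematic_sample(people, 10)        # systematic sampling
stratified = stratified_sample(strata, 0.1, rng)   # 8 undergrad + 2 postgrad
```

Stratifying guarantees that the smaller group is represented in proportion, which simple random sampling does not.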
51
Q

Purpose of Observation

A

To understand how people naturally interact with products and people and the challenges they face

52
Q

Pros of Observation

A
  • Can get more subtle data
  • Allows richly detailed description
  • Viewing or participating in unscheduled events
  • Improves quality of data collection
  • Can see things you weren’t expecting
  • Useful for formulating hypotheses
  • Doesn’t depend on information provided by respondents
  • Can deal with infants/animals
53
Q

Cons of Observation

A
  • Less structured responses
  • Get huge amount of data - analysing and not including bias is hard
  • Difficult to replicate - lots of variables you don’t have control of
  • Different researchers gain different understanding of what they observe
  • Male/female researchers have access to different information
  • Many events are uncertain in nature - difficult for researcher to determine time and place
  • Can’t generalise
  • Long and expensive
54
Q

Observation - Research questions

A
  • Exploratory
  • Explanatory
55
Q

Type of data collected by Observation

A

Typically qualitative but can be quantitative

56
Q

Types of Observation

A
  • Complete observer
  • Observer as Participant
  • Participant as Observer
  • Complete Participant
57
Q

Complete observer

A
  • Detached observer
  • Researcher is neither seen nor noticed by participants
  • Minimises Hawthorne effect - participants more likely to act natural
  • Most likely to raise ethical questions
58
Q

Observer as participant

A
  • Researcher is known and recognised by participants
  • Participants know research goals of the observer
  • Some interaction with participants but limited
  • Researcher’s aim is to play a neutral role
59
Q

Participant as observer

A
  • Researcher is fully engaged with the participants
  • More of a friend or colleague than neutral third party
  • Full interaction with participants but they still know it’s a researcher
60
Q

Complete participant

A
  • Fully embedded researcher
  • Observer fully engages with the participants and partakes in their activities
  • Participants aren’t aware that observation and research is being conducted
61
Q

How do you validate an observational study?

A

Use multiple independent researchers to observe

62
Q

Direct Observation

A
  • Quantitative technique
  • Explicitly counting the frequency and/or intensity of specific behaviours
  • Most direct observation data collection done by actual observers
  • Don’t require human data collector - audio/video can be used
  • Ordinal data/ purely factual description
  • Structured form of data collection
63
Q

Participant observation

A
  • Process enabling researchers to learn about the activities of the people under study in the natural setting through observing and participating in those activities
  • Qualitative, interactive and unstructured
  • Information collected is unique to the individual collecting the data
64
Q

Purpose of Interviews

A

Explore the views, experiences, beliefs and/or motivations of individuals on specific matters

65
Q

Purpose of Focus Groups

A
  • Group of respondents are interviewed together
  • Obtain data from purposely selected group of individuals rather than representative sample
66
Q

Pros of Interviews

A
  • Can get qualitative data
  • Preferable when researcher wants subjective perspective rather than generalisable understandings
67
Q

Cons of Interviews

A
  • Time consuming
  • Not the best for researching sensitive topics
68
Q

Pros of Focus Groups

A
  • Better at drawing people out of their shells - increased validity
  • Allows for discovery
  • Can build on each other’s comments for richer contextual data
69
Q

Cons of Focus Groups

A
  • Time consuming
  • Anonymity is hard
  • Less reliable
  • Participants can be influenced by other group members - conformity, social desirability, oppositional behaviours
  • Need skilled interviewer to prevent these problems
70
Q

Interviews and Focus Groups - Research questions

A
  • Exploratory questions
  • Theory testing/creation questions
  • Confirmatory research questions
71
Q

Type of data collected by Focus Groups and Interviews

A
  • Almost always qualitative
72
Q

Structured vs Unstructured questions

A

Structured:
- Quantitative method
- Closed-ended questions
- List of questions
- Everyone asked same questions in the same order
- Easy to replicate
- Easy to test for reliability
- Quick to conduct
- Not flexible
Unstructured:
- Do not use any set questions
- Guided discussion
- Most useful for qualitative research
- Rarely provide valid basis for generalisation
- More flexible
- Increased validity - can probe for deeper understanding
- Time consuming to conduct and analyse the data
- Employing and training interviewers is expensive
Semi-Structured:
- Set questions but can investigate answers more
- Gets qualitative and quantitative data
- Can explore around answers
- Gathers useful info but respondents can answer more on their own terms
- More flexible
- More time-consuming

73
Q

Types of Focus Groups

A
  • Dual moderator - Two moderators
  • Two-way - Two separate groups having discussions at the same time - second group listens to the first before having the discussion
  • Mini - 4-5 participants instead of 6-10
  • Client-involvement - clients ask for focus group and invite those who ask
  • Participant-moderated - one or more participants are moderators
  • Online
74
Q

Purpose of experiments

A

Allows researchers to look at cause-and-effect relationships
Used when:
- There is time priority in a causal relationship
- There is consistency in a causal relationship
- The magnitude of the correlation is great

75
Q

Pros of experiments

A
  • Allows for reproducibility
  • Generalisation is easier
  • Can take bias into account using statistics
76
Q

Cons of experiments

A
  • Equipment might be more expensive
  • Highly prone to human error
  • Errors can reduce validity
  • Eliminating real-life variables can result in inaccurate conclusions
  • Time-consuming process
  • Researchers can control variables to suit personal preferences
  • Results are not descriptive
77
Q

Experiments - Research questions

A

Correlational questions

78
Q

Type of data collected by experiments

A

Quantitative

79
Q

True experiment

A

Researcher manipulates one variable and controls the rest of the variables

80
Q

Ad hoc analysis

A

Hypothesis invented after testing is done to try and explain contrary evidence

81
Q

Independent variable

A

variable manipulated

82
Q

Dependent variable

A

variable measured

83
Q

Control variables

A

not changed

84
Q

Purpose of Secondary data analysis

A
  • Take data from previous research and examine it for new question
  • Look for datasets that other people have created
85
Q

Pros of secondary data analysis

A
  • Discover new things from old data
  • Can use data that you wouldn’t have the resources to gather
  • Access to historical data
  • Ease of Access
  • Inexpensive
  • Time-saving
86
Q

Cons of secondary data analysis

A
  • May be issues with the data e.g. bias
  • Might twist yourself to fit the data you’ve got
  • If you don’t know how the data is collected - don’t know the validity
  • Because data is hugely heterogeneous in many cases - have to make decisions to remove, ignore or add sections - can lead to confirmation bias
  • Many critical decisions in processing the data
  • Irrelevant Data - have to find the relevant data from the irrelevant data
87
Q

Secondary data analysis - research questions

A
  • Often exploratory
  • Every question can be asked
88
Q

What can go wrong in data cleaning

A
  • Because data is hugely heterogeneous in many cases - have to make decisions to remove, ignore or add sections - can lead to confirmation bias
  • Need to know a lot about the data to prove that any changes in adding or ignoring have valid assumptions and rationale
89
Q

How can you validate secondary data analysis

A

To validate secondary data, find the:
- Purpose for which the material was collected/created
- Specific methods used to collect it
- Population studied and validity of the sample
- Credibility of the collector
- Limits
- Historic and/or political circumstances
- And consider how the data is coded/categorised
- Consider whether data must be adapted/adjusted

90
Q

Quantitative data - Research questions

A
  • Correlational
  • Causation
  • The how questions
91
Q

Qualitative data - Research questions

A
  • The why questions
92
Q

Mixed approach

A
  • Mix of qualitative and quantitative data
  • Usually use different methods to collect them
  • When you have a small sample size - want to do quantitative but don’t have enough people
  • Qualitative used to underpin quantitative
  • For exploration
93
Q

Quantitative data

A
  • Expressed in numbers and graphs
  • Used to test or confirm theories and assumptions
  • Can be used to establish generalisable facts about a topic
  • Methods include experiments, observations recorded as numbers and surveys with closed-ended questions
  • At risk of research biases incl. information bias, omitted variable bias, sampling bias or selection bias
94
Q

Qualitative research

A
  • Expressed in words
  • Used to understand concepts through experiences
  • Gather in-depth insights on topics
  • Methods include interviews with open-ended questions, observations described in words, focus groups, ethnographies and literature reviews
  • At risk of research biases incl. Hawthorne effect, observer bias, recall bias and social desirability bias
95
Q

Qualitative data limitations

A
  • Don’t draw samples from large-scale data sets due to time and costs involved
  • Problem of adequate validity or reliability is major concern due to subjective nature
  • Contexts, situations, events, conditions and interactions cannot be replicated
  • Generalisations can’t be made to a wider context than the one studied
  • Lengthy time required
  • Expert knowledge of an area is required to interpret the data
96
Q

Qualitative data advantages

A
  • Researcher gains an insider’s view of the field - can find issues that are often missed
  • Can be important in suggesting possible relationships, causes, effects and dynamic processes
  • Allows for ambiguities/contradictions in the data which reflect social reality
  • Uses a descriptive, narrative style
97
Q

Quantitative data limitations

A
  • Do not take place in natural setting
  • Do not allow participants to explain their choices
  • Poor knowledge of the application of the statistical analysis may negatively affect analysis and subsequent interpretation
  • Large sample sizes needed for more accurate analysis
  • Confirmation bias - researcher might miss observing phenomena because of focus on theory or hypothesis testing rather than on theory/hypothesis generation
98
Q

Quantitative data advantages

A
  • Scientific objectivity - data can be interpreted with statistical analysis
  • Useful for testing and validating already constructed theories
  • Data analysis and collection can be performed quickly
  • Data can be checked by others and replicated
  • Hypotheses can be tested
99
Q

Hypothesis testing

A

Collect data to determine if a claim about the population is true

100
Q

Hypothesis

A

- Testable statement that you want to accept or reject
- You never “prove” a hypothesis

101
Q

Validity of a hypothesis

A
  • Needs to be testable
  • Need to be able to prove it false
  • Be specific - don’t use ambiguous words e.g. “athlete” or “better”
  • Don’t be too specific - overlap with methodology
    “If (one variable) ‘is related to’/‘is affected by’/‘causes’ (other variables) then (comment on relationship)”
102
Q

Alternative hypothesis tails

A
  • Two tailed test - doesn’t state direction
  • One-tailed test - states direction
103
Q

Type 1 error

A

Null Hypothesis is true but is rejected - false positive

104
Q

Type 2 error

A

Null hypothesis is false but is not rejected - false negative

105
Q

P-value

A
  • Compare p-value to a threshold value (significance level/alpha) to decide whether to reject the null hypothesis
  • P > alpha - fail to reject
  • P <= alpha - reject
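The decision rule can be sketched with an exact binomial test of a fair-coin null hypothesis. The coin example and the function names are assumptions made for illustration; real analyses would normally use a statistics library.

```python
from math import comb

def binomial_two_sided_p(successes, trials):
    """Exact two-sided p-value under a fair-coin null hypothesis (p = 0.5)."""
    # Under p = 0.5 the distribution is symmetric, so sum every outcome that
    # is at least as far from trials / 2 as the observed count.
    distance = abs(successes - trials / 2)
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= distance
    )
    return extreme / 2 ** trials

def decide(p_value, alpha=0.05):
    """Apply the rule on this card: reject when p <= alpha."""
    return "reject" if p_value <= alpha else "fail to reject"

p = binomial_two_sided_p(successes=16, trials=20)  # hypothetical: 16 heads in 20 flips
print(round(p, 4), decide(p, alpha=0.05), decide(p, alpha=0.01))
```

The same data can lead to different decisions at different significance levels, which is why alpha should be fixed before the data is examined.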
106
Q

Critical value

A
  • Some tests return a list of critical values and their associated significance levels and a test statistic
  • Test statistic < critical value - fail to reject
  • Test statistic >= critical value - reject
107
Q

Types of data

A
  • Observational data
  • Experimental data
  • Simulation data
  • Derived/Compiled data
108
Q

Observational data

A

Open surveys, observational studies, focus groups etc.

109
Q

Experimental data

A

Collected via experimentation - easier to reproduce

110
Q

Simulation data

A

Scenario simulation allows for generation of predictive data

111
Q

Derived/Compiled data

A

Utilises existing data to generate new data - secondary data analysis

112
Q

Descriptive analysis

A
  • Basic analysis of the data giving a general overview
  • Only describes what the data is or what it shows
  • Allows for simple analyses
  • No extrapolation or inference
  • Measures of frequency
  • Measures of central tendency
  • Measures of dispersion or variation
  • Measures of position
113
Q

Measures of frequency

A

Count, percent, frequency

114
Q

Measures of central tendency

A
  • Mean, median, mode
  • Used to show an average or most commonly indicated response
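A quick sketch of the three measures using Python's standard `statistics` module, on made-up survey ratings:

```python
import statistics

responses = [1, 2, 2, 3, 4, 4, 4, 5]  # hypothetical 1-5 survey ratings

mean = statistics.mean(responses)      # arithmetic average
median = statistics.median(responses)  # middle value of the sorted data
mode = statistics.mode(responses)      # most frequently given response

print(mean, median, mode)
```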
115
Q

Measures of dispersion or variation

A
  • Range, variance, standard deviation
  • Variance/standard deviation - based on the differences between each observed score and the mean
  • When you want to show how spread out the data is
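A minimal sketch of these measures with the standard `statistics` module. The scores are invented, and the population formulas are used, which assumes the list is the whole population rather than a sample.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical test scores

value_range = max(scores) - min(scores)    # largest minus smallest score
variance = statistics.pvariance(scores)    # mean squared deviation from the mean
std_dev = statistics.pstdev(scores)        # square root of the variance

print(value_range, variance, std_dev)
```

For a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.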
116
Q

Measures of position

A
  • Percentile ranks, Quartile ranks
  • Describes how scores fall in relation to one another
  • Relies on standardised scores
  • Use when you need to compare scores to a normalised score
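A small sketch of position measures. `percentile_rank` is a hypothetical helper that uses one common convention (the share of scores at or below a value); other conventions exist and give slightly different numbers.

```python
import statistics

def percentile_rank(scores, value):
    """Percentage of scores at or below `value` (one common convention)."""
    return 100 * sum(s <= value for s in scores) / len(scores)

scores = [50, 60, 70, 80, 90]  # hypothetical exam scores

rank_of_70 = percentile_rank(scores, 70)
quartiles = statistics.quantiles(scores, n=4)  # three cut points: Q1, Q2, Q3

print(rank_of_70, quartiles)
```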
117
Q

Exploratory Analysis

A
  • Examine or explore data and find relationships between variables which were previously unknown
  • Does not describe the cause
  • Useful for discovering new connections
118
Q

Inferential Analysis

A
  • Use statistics to look beyond the collected data to identify new conclusions
  • Using a small sample of data to infer about a larger population
  • Based on laws of probability and confidence intervals
  • Central Limit Theorem
  • T-test
119
Q

Central Limit Theorem

A
  • The distribution of sample means approximates a normal distribution as the sample size gets larger, regardless of the population’s distribution
  • The average of the sample means will equal the population mean, and their standard deviation will equal the population standard deviation divided by the square root of the sample size (the standard error)
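The theorem can be illustrated with a small simulation. This is only a sketch: the exponential population, the sample size of 30 and the seed are arbitrary assumptions.

```python
import random
import statistics
from math import sqrt

rng = random.Random(0)  # fixed seed so the run is repeatable
sample_size = 30
num_samples = 2000

# Population: exponential with rate 1 (mean 1, standard deviation 1),
# which is clearly not normally distributed.
sample_means = [
    statistics.mean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)      # close to the population mean
spread_of_means = statistics.stdev(sample_means)   # close to sigma / sqrt(n)
expected_spread = 1.0 / sqrt(sample_size)

print(round(mean_of_means, 3), round(spread_of_means, 3), round(expected_spread, 3))
```

The spread of the sample means shrinks as the sample size grows, which is what lets a modest sample say something about a large population.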
120
Q

T-test

A
  • Tells how likely it is that the difference between two group means is a real difference rather than a sampling artefact
  • ‘P-value’ - probability of seeing a difference at least as large as the one observed if there were no real difference (i.e. by random chance alone)
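The statistic behind the test can be sketched by hand. This uses Welch's form of the t statistic (it does not assume equal variances), the data is invented, and `welch_t` is a hypothetical helper name. Turning t into a p-value needs the t distribution, which a library such as SciPy (`scipy.stats.ttest_ind`) would normally supply.

```python
import statistics
from math import sqrt

def welch_t(a, b):
    """Welch's t statistic: difference in means divided by its standard error."""
    standard_error = sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error

group_a = [5, 6, 7, 8, 9]  # hypothetical scores under condition A
group_b = [1, 2, 3, 4, 5]  # hypothetical scores under condition B

t = welch_t(group_a, group_b)
print(t)
```

A larger absolute t means the gap between the group means is large relative to the noise within the groups.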
121
Q

Predictive Analysis

A
  • Using historical or current data to find patterns to make predictions about the future
  • Simulations can both generate data for prediction as well as using existing data
  • Accuracy of predictions depends on input variables/data
  • Accuracy depends on types of models - linear model generally works well
  • Using one variable to predict another doesn’t denote a causal relationship
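A linear model of the kind mentioned above can be sketched with ordinary least squares. The monthly sales figures and the helper name `fit_line` are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

months = [1, 2, 3, 4]       # historical time points
sales = [12, 14, 16, 18]    # hypothetical sales in each month

slope, intercept = fit_line(months, sales)
forecast_month_5 = slope * 5 + intercept  # prediction for the next month
print(slope, intercept, forecast_month_5)
```

The fitted line describes how sales moved with time in the data; it does not show that time causes sales.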
122
Q

Causal Analysis

A
  • Step beyond inferential analysis
  • Examines the cause and effect relationships between variables focused on finding the cause of a correlation
  • Generally large, complex and expensive studies
  • Four important components
    1. Correlation
    2. Temporal sequence - cause must occur before effect
    3. Concomitant variation - variation must be systematic between the two variables
    4. Nonspurious association - Any covariation between a cause and an effect must be true and not due to another variable
123
Q

Mechanistic Analysis

A
  • Similar to predictive but instead of general data driven predictions - utilise highly specific changes in variables that lead to changes in linked variables
  • Generally used in high precision disciplines e.g. engineering and physics
  • Often used in high precision computer models
124
Q

5 characteristics of quality data

A
  1. Validity - degree to which data conforms to defined business rules or constraints
  2. Accuracy - ensure data is close to true values
    - E.g. include positively and negatively worded questions in a questionnaire - a person who answers 5 to the positive should answer 1 to the negative
  3. Completeness - degree to which all required data is known
  4. Consistency - ensure data is consistent within the same dataset/ across multiple datasets
  5. Uniformity - degree to which data is specified using the same unit of measure
125
Q

Qualitative data scales

A
  • Nominal (categories, no ordering) e.g. male, female
  • Ordinal (categories, ordered) e.g. small, medium, large
126
Q

Quantitative data scales

A
  • Discrete (countable, integers)
  • Continuous (measurable) e.g. age, temperature - can be subdivided
127
Q

Paired or matched variables

A

Two variables in the individuals of a population that are linked together in order to determine the correlation

128
Q

Choice of statistical test from paired or matched observation

A
  • Nominal variable - McNemar’s Test
  • Ordinal (Ordered categories) - Wilcoxon
  • Quantitative (Discrete or Non-Normal) - Wilcoxon
  • Quantitative (Normal) - Paired t test
129
Q

Parametric test

A
  • Make assumptions about the parameters of the population distribution from which the sample is drawn
  • Often that the population data are normally distributed
  • Can only apply parametric tests (e.g. t-test) if you have a sample big enough (relative to the population) to assume that the central limit theorem applies
130
Q

Non parametric tests

A
  • “distribution-free”
  • Can be used for non-Normal variables
131
Q

Reducing Type 1 and Type 2 errors

A
  • Reducing the chances of a type I error increases the chances of a type II error and vice versa
  • In science it is better to miss something than draw incorrect conclusions - reduce type I errors
  • Bonferroni correction - Reduces instances of type I errors but increases type II errors
  • Type II error reduction is not as easy as Bonferroni:
    - Increase sample size
    - Change alternative value in the alternative hypothesis
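The Bonferroni correction mentioned above can be sketched in a few lines: with m tests, each p-value is compared with alpha / m instead of alpha. The p-values below are invented.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a null hypothesis only if its p-value is <= alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]

p_values = [0.001, 0.02, 0.04]  # hypothetical results of three tests

uncorrected = [p <= 0.05 for p in p_values]  # every test looks significant
corrected = bonferroni_reject(p_values)      # stricter threshold of 0.05 / 3

print(uncorrected, corrected)
```

Two results that looked significant are no longer rejected, which shows the trade-off: fewer type I errors at the cost of more type II errors.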
132
Q

ANOVA (analysis of variance)

A
  • Test looking at 3 or more groups
  • Reduces type I errors compared with running multiple pairwise t-tests
  • Used for comparing the means of three or more groups or variables
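The F statistic at the heart of a one-way ANOVA can be sketched by hand: it is the variance between group means divided by the variance within groups. The data and the function name `one_way_f` are illustrative assumptions, and converting F to a p-value would need the F distribution from a statistics library.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical scores for three groups
f_statistic = one_way_f(groups)
print(f_statistic)
```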
133
Q

Monte Carlo simulation

A
  • In uncertain scenario - allows for exploration of the problem/solution space
  • One of the most popular techniques for calculating effect of unpredictable variables on a specific output variable
  • Ideal for risk analysis
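A toy risk-analysis sketch: estimating the chance that a project of three tasks overruns a deadline when each task's duration is uncertain. The task durations (uniform between 1 and 3 days), the deadline and the function name are all assumptions.

```python
import random

def simulate_overrun_probability(trials, deadline, rng):
    """Estimate P(total duration > deadline) by repeated random sampling."""
    overruns = 0
    for _ in range(trials):
        # Three tasks, each taking between 1 and 3 days (uniformly at random).
        total = sum(rng.uniform(1, 3) for _ in range(3))
        if total > deadline:
            overruns += 1
    return overruns / trials

rng = random.Random(1)  # fixed seed so the run is repeatable
estimate = simulate_overrun_probability(10_000, deadline=6.0, rng=rng)
print(estimate)
```

By symmetry the true probability here is 0.5, so the estimate should land close to that; more trials tighten the estimate.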
134
Q

Factor analysis

A
  • Large well-structured questionnaire
  • Trying to address multiple things
  • Many questions may investigate the same ‘factor’
  • Method allows for grouping variables into set of underlying factors
  • Confirmatory factor analysis - know what the factors are and have set them
  • Exploratory Factor analysis - assume there are factors but not setting them
135
Q

Cohort analysis

A
  • Form of behavioural analytics
  • Ideal for examining user behaviour
  • Allow for exploration between cohorts
  • Group of people who share common characteristics over a given time frame
136
Q

Cluster Analysis

A
  • Works by organising items into groups or clusters based on how closely associated they are
  • K-means clustering - partitions n data points into k clusters
  • Setting different number of clusters gives different results
  • Works at a data-set level - every point is assessed relative to the others - data must be as complete as possible
  • Intracluster distance - distance within clusters
  • Intercluster distance - distance between clusters
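K-means can be sketched in one dimension. This is a simplified illustration, not a production implementation: the initialisation (first k points) is naive, the data is invented, and `k_means_1d` is a hypothetical helper.

```python
def k_means_1d(points, k, iterations=20):
    """Very small 1-D k-means: returns (centroids, clusters)."""
    centroids = list(points[:k])  # naive initialisation: the first k points
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters

points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]  # hypothetical measurements
centroids, clusters = k_means_1d(points, k=2)

# Intracluster: average distance from points to their own centroid (within).
intracluster = [
    sum(abs(p - c) for p in cluster) / len(cluster)
    for c, cluster in zip(centroids, clusters)
]
# Intercluster: distance between the two centroids (between).
intercluster = abs(centroids[0] - centroids[1])
print(centroids, intracluster, intercluster)
```

A good clustering has small intracluster distances and a large intercluster distance; rerunning with a different k changes both.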
137
Q

Time series analysis

A
  • Useful to see how variable changes over time
  • Forecasting via trends
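A moving average is one simple way to expose a trend in a time series. The monthly figures and the window size below are assumptions for illustration.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

monthly_users = [10, 12, 11, 15, 18, 17, 21]  # hypothetical monthly counts
trend = moving_average(monthly_users, window=3)
print(trend)
```

Smoothing removes month-to-month noise so the underlying upward movement is easier to see and extrapolate.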
138
Q

Sentiment Analysis

A
  • Natural language processing technique to determine whether data is positive, negative or neutral
  • Not terribly refined - can’t figure out sarcasm
139
Q

Basic vs applied research

A

Research for curiosity vs research to answer a specific question
