Empirical 09.09.24 Flashcards

Lenze/Varsamou

1
Q

Kinds of Sampling and reasons to do sampling

A
  1. Probability sampling
    - Simple random sampling
    - Stratified sampling
    - Cluster sampling
    - (Multistage sampling)

  2. Nonprobability sampling (especially appropriate in qualitative research, where only a few individual cases are analysed)
    - Typical sampling
    - Extreme case sampling
    - Concentration sampling
    - Quota sampling

Reasons to do sampling:

  1. Cost saving
    ▪ Collecting data on a smaller number of elements is cheaper than conducting a census (complete enumeration).
  2. Time saving
    ▪ Data collection in the case of sampling takes less time than a census.
  3. A census or complete enumeration is practically impossible
    ▪ A census may be theoretically imaginable, but it is often not practically reasonable or possible.
2
Q

What is sample mean?

A

The sample mean is the arithmetic average of the values in a sample. When people use the word ‘average’ in everyday conversation, they are usually referring to the mean. Our sample mean should be close to the population mean.

3
Q

Factors to calculate sample size

A

Population size, parameter variation, degree of accuracy, and degree of confidence

  • Take into account any practical constraints such as budget, time, or feasibility of collecting data. Sometimes you may need to balance statistical considerations with these constraints.
  • If necessary, adjust the sample size based on factors like the complexity of the analysis, potential non-response rates, or the need for subgroup analysis.
  • Validate Sample Size: After collecting data, validate whether your sample size was sufficient to provide reliable estimates.
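
One way the four factors can be combined is sketched below. The cards do not name a formula, so this is an assumption: it uses a common textbook formula for estimating a proportion, with a correction for finite populations. The parameter names are illustrative: `z` stands for the degree of confidence (1.96 corresponds to 95%), `margin_of_error` for the degree of accuracy, and `p` for the parameter variation (0.5 is the most cautious choice).

```python
import math

def sample_size(population_size, margin_of_error=0.05, z=1.96, p=0.5):
    """Rough sample size for estimating a proportion (assumed textbook formula)."""
    # Sample size for a very large population
    n0 = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)
    # Correction for a finite population
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

print(sample_size(10_000))
```

Practical constraints such as budget or expected non-response would then be applied on top of this number.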
4
Q

Degree of Accuracy

A

How accurately our sample represents the population. We typically use the terms biased and unbiased to describe the accuracy of sample statistics.

5
Q

Arbitrary Sampling

A

The researcher selects individuals or units of study without following any specific system, simply because they are available, convenient and represent some characteristic the researcher seeks to study (e.g. street surveys).
- Representativity is problematic.

6
Q

Simple Random Sampling

A

Most basic kind of probability sampling design. Each case in the population has a known and equal probability of being selected for the sample. Goal: construct a sample that is like the population so that we can use what we learn about the sample to generalize to the population.
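
A minimal sketch of drawing such a sample with Python's standard library. The population of 100 numbered cases and the function name are made up for illustration; `random.Random.sample` draws without replacement, so every case has the same chance of being selected.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n cases without replacement, each with equal probability."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)

population = list(range(1, 101))  # hypothetical population of 100 cases
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```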

7
Q

Cluster Sampling

A

In cluster sampling, a cluster is a spatiotemporally defined conglomeration of elements of the population that forms a structurally reduced representation of that population.

Example: GMF experience, defined by time and space (Bonn, June 2024). Population: media people from all over the world, from different regions, genders, ages, etc.

Typical clusters: households, school classes, apartment buildings. Elements included: persons, pupils, households.

Procedure: you choose by random sampling a number of clusters and analyse all units of study that occur in these clusters
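
The procedure can be sketched as follows. The school classes and pupil names are invented for illustration, and the function name is an assumption.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical clusters (school classes) and their elements (pupils)
school_classes = {
    "class_a": ["Ana", "Ben", "Cem"],
    "class_b": ["Dana", "Eli"],
    "class_c": ["Finn", "Gul", "Hana", "Ivo"],
    "class_d": ["Jo", "Kai"],
}
chosen, pupils = cluster_sample(school_classes, 2, seed=1)
print(chosen, pupils)
```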

8
Q

Stratified Sampling

A

Divides the population into smaller groups, or strata, based on shared characteristics. A sample is then drawn from each stratum, and the division should be made in such a way that the sub-samples are still representative.

Each stratum should be representative of a certain group of the population.
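
A minimal sketch, assuming proportional allocation (the same fraction is drawn from every stratum); the card itself does not say how many cases to take per stratum. The population of urban and rural persons is invented.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population: 60 urban and 40 rural persons
people = [(f"p{i}", "urban") for i in range(60)]
people += [(f"p{i}", "rural") for i in range(60, 100)]
sample = stratified_sample(people, lambda person: person[1], 0.1, seed=7)
print(sample)
```

With a fraction of 0.1, the sample keeps the 60:40 split of the population.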

9
Q

Nonprobability Sampling
(Concentration Principle)

A

The researcher focuses on the part of the population where they SUSPECT the predominant part of the relevant elements to be.

− Example: an investigation about German skiers. Suppose 95% of all German skiers live in Bavaria: the researcher then draws a random sample only from the Bavarian population.

Cut-off procedure = the less productive or less rich part of the study population is cut off.

10
Q

Nonprobability (Quota principle)

A

You select individuals or units of study according to some fixed quota (e.g. male, above 50).
▪ Units of study are selected based on pre-specified characteristics so that the total sample has the same distribution of characteristics assumed to exist in the population that is studied.
▪ Quotas generally rely on demographic characteristics.

11
Q

Sample Drop-outs

A

Sample drop-out refers to all cases where an element of a sample could not be analysed. We distinguish between random and systematic drop-outs.

Random drop-out:
Relocation, illness, etc.

Systematic drop-out:
Persons who deliberately refuse to take part in the survey, e.g. highly educated people or people with limited language skills.

12
Q

Sampling Variation: Problems with Sampling

A

Sampling variation is how much a result (for example a mean or a percentage) changes when you take different samples from the same group. If you measure the same thing in several samples, the numbers you get can differ noticeably from one sample to the next.

Because of this, there is always some uncertainty when you draw conclusions about the whole group based on a sample.
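
A small simulation can make this visible. Everything here is invented for illustration: a population of 10,000 values, from which 200 samples of 30 cases each are drawn. Each sample gives a slightly different mean.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population of 10,000 measurements
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Draw 200 samples of 30 cases each and record every sample mean
sample_means = [
    statistics.mean(rng.sample(population, 30)) for _ in range(200)
]

# How much the sample means vary from sample to sample
spread = statistics.stdev(sample_means)
print(min(sample_means), max(sample_means), spread)
```

The sample means scatter around the population mean, and their spread is much smaller than the spread of the individual values.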

13
Q

Variability and Sampling Error

A

A closely related term (almost a synonym) is sampling error. An error in sampling isn’t a mistake; it’s a measure of how much a sample value differs from the “true” population value.

14
Q

Hypothesis testing

A

A hypothesis is a proposed explanation for a phenomenon: a statement about something that is supposed to be true. The logic of a hypothesis test is to compare two sets of data and decide whether the difference between them is larger than chance alone would explain.

A hypothesis test involves two hypotheses:
* the null hypothesis
* and the alternative hypothesis

The null hypothesis assumes there is no difference between two groups (e.g. “Light color has no effect on plant growth”). The researcher tries to disprove or nullify it.

The alternative hypothesis states that there is a difference (e.g. “Light color affects plant growth”). The researcher looks for evidence that supports this type of hypothesis.
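
The logic can be sketched with a permutation test. This technique is chosen here for illustration and is not named on the card; the plant heights are invented. If the null hypothesis is true, the group labels do not matter, so the labels are shuffled many times to see how often a difference at least as large as the observed one arises by chance.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Share of label shufflings with a mean difference >= the observed one."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Hypothetical plant heights (cm) under two light colors
red_light = [10.1, 9.8, 10.4, 10.0, 9.9, 10.3]
blue_light = [12.2, 11.9, 12.5, 12.1, 12.4, 12.0]
p_value = permutation_test(red_light, blue_light)
print(p_value)
```

A small value (conventionally below 0.05) is taken as evidence against the null hypothesis.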

15
Q

Data Analysis (in pre-testing of survey)

A

This involves looking at patterns in responses to see where confusion, hesitation, disengagement, or drop-out has occurred.
* These problems can often be discovered by identifying straight-lining (the same answer is always checked regardless of the question), unanswered questions, and inconsistent or unrealistic responses.
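
A minimal sketch of such a check, assuming each response is a list of answers to rating questions and `None` marks an unanswered question. The respondent ids and answers are invented.

```python
def flag_response(answers):
    """Return warning flags for one respondent's answers."""
    flags = []
    given = [a for a in answers if a is not None]
    if len(given) < len(answers):
        flags.append("unanswered")
    # Same option chosen for every answered question
    if len(given) > 1 and len(set(given)) == 1:
        flags.append("straight-lining")
    return flags

# Hypothetical pre-test responses on a 1-5 scale
responses = {
    "r1": [3, 3, 3, 3, 3],
    "r2": [4, 2, None, 5, 1],
    "r3": [1, 4, 2, 5, 3],
}
flagged = {rid: flag_response(ans) for rid, ans in responses.items()}
print(flagged)
```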

16
Q

Data visualisation

A

Descriptive statistics means describing data quantitatively. You can do that through various means, for example:
* graphical representation
* tabular representation
* summary statistics

It presents the data in a more meaningful way, which allows for simpler interpretation through graphs or through numbers.
* Descriptive statistics is about variables

17
Q

Descriptive Statistics: Charts, Graphs and Plots

A

Which one you choose depends on what kind of data you have and what you want to display.

  • If you want to display relationships between data in categories, you could make a bar graph.
  • A pie chart shows how categories in your data relate to the whole set.
  • Scatter plots are a good way to display data points. They show the relationship between two variables.
18
Q

Function Questions in a questionnaire

A

Function questions control the course of the questionnaire without contributing to the actual research interest.
These questions guarantee that the survey questions are applied correctly.

  1. Ice-breaker questions
  2. Transfer and Resting questions
  3. Filter and Funnel questions
  4. Verification questions
19
Q

Survey Mode

A

All surveys are conducted in one of the three survey modes:
* Face-to-Face Interview
* Written interview
* Telephone Interview / Survey.
Additionally, there is a version of written interviews that has established itself in recent years: the online survey.

Each mode has advantages and disadvantages, e.g. regarding drop-out rates, whether the interviewer can see the respondent's reactions, social desirability, time and costs. Telephone surveys tend to have more drop-outs.

20
Q

Experimental research Design

A

Experimental research design involves comparing two groups on one outcome measure to test some hypothesis regarding causation.

Example:
* If a researcher is interested in the effects of a new medication on headaches, they would randomly divide a group of people with headaches into two groups.
* One of the groups, the experimental group, would receive the new medication being tested.
* The other group (control group) would receive a placebo medication.
* The groups would receive different medications but would otherwise be treated exactly the same, so that the researcher could isolate the effects of the medication.
* Finally, the outcomes of both groups would be compared.
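
The random division into two groups can be sketched as follows. The participant ids and the function name are made up for illustration.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two equal groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical group of 20 people with headaches
participants = [f"person_{i:02d}" for i in range(1, 21)]
experimental_group, control_group = randomly_assign(participants, seed=3)
print(experimental_group)
print(control_group)
```

Because chance decides who lands in which group, the groups should differ only in the medication they receive.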

21
Q

What is Research Design?

A

A design or strategy justifies the logic, structure and principles of the research methodology and methods, and how these relate to the research questions and hypotheses.
* Provides a framework for the collection and analysis of data
* Expresses causal connections between variables
* Offers a temporal appreciation of social phenomena and their interconnections

22
Q

Mean

A

Mathematical average of all terms
The mean is the same as the average value of a data set and is found using a calculation.
Add up all of the numbers and divide by the number of numbers in the data set.

23
Q

Median

A

The median x̃ is the data value separating the upper half of a data set from the lower half.
* Arrange data values from lowest to highest value
* The median is the data value in the middle of the set
* If there are 2 data values in the middle the median is the mean of those 2 values.
* For the data set 1, 1, 2, 5, 6, 6, 9 the median is 5.

24
Q

Mode

A

Mode is the value or values in the data set that occur most frequently.
For the data set 1, 1, 2, 5, 6, 6, 9 the mode is 1 and also 6.

25
Q

Measure of Variability: Interquartile Range

A

The interquartile range (also called the middle fifty or midspread) is the range of the middle 50% of a set of numbers. It ignores the lowest and highest quarters of the data, so outliers have little influence on it.

  • Sort the numbers and divide them evenly into a lower half and a higher half.
  • Then find the median of each of these groups.
  • Find the interquartile range by subtracting the lower median from the higher median.
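
The steps above can be sketched as follows. One assumption is made that the card leaves open: with an odd number of values, the middle value is left out of both halves. Other quartile conventions exist and can give slightly different results.

```python
import statistics

def interquartile_range(values):
    """Median of the upper half minus median of the lower half."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # lower half (middle value excluded if odd)
    upper = ordered[-half:]  # upper half
    return statistics.median(upper) - statistics.median(lower)

print(interquartile_range([1, 1, 2, 5, 6, 6, 9]))
```

For the data set 1, 1, 2, 5, 6, 6, 9 the lower median is 1 and the higher median is 6, so the interquartile range is 5.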
26
Q

Standard deviation

A

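Roughly the average distance of the data values from the mean. More precisely, it is the square root of the average squared distance from the mean.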
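
A short worked example with invented numbers. It computes the population standard deviation step by step and compares it with `statistics.pstdev` from the standard library. For a sample, `statistics.stdev` divides by n - 1 instead of n.

```python
import math
import statistics

def population_sd(values):
    """Square root of the average squared distance from the mean."""
    mean = sum(values) / len(values)
    squared_distances = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_distances) / len(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5
print(population_sd(data))       # 2.0
print(statistics.pstdev(data))   # same result from the standard library
```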

27
Q

Why and how do we pretest a survey?

A

To identify and correct potential issues before full deployment. Pretesting helps ensure that questions are clear, the survey flow is logical, and respondents understand and interpret the questions as intended. This process helps avoid biases, ambiguous questions, and other errors that could compromise data quality.

How to pretest a survey:

  • Pilot Testing: Conduct the survey with a small, representative sample to simulate the actual survey conditions. Rather than ask evaluation questions, you give respondents the survey as-is and ask them to complete it.
  • Cognitive Interviews: Ask participants to verbalize their thought process as they answer the questions to identify misunderstandings.
  • Feedback: Collect feedback on question clarity, survey length, and overall experience.

This process allows for revisions and improvements to enhance the reliability and validity of the survey.

Also possible: expert evaluation, focus group, etc.

28
Q

Pitfalls When Designing a Survey

A

  • Biased or leading questions: Phrasing questions in a way that suggests a particular answer can skew results. Ensure questions are neutral and don’t lead respondents to a specific response.
  • Ambiguous questions: Vague or unclear questions can confuse respondents and lead to inaccurate responses. Questions should be precise and easy to understand.
  • Overly long or complex surveys: Lengthy surveys can lead to respondent fatigue and lower completion rates. Keep surveys concise and focused on relevant topics.
  • Limited response options: Providing insufficient response options can limit the range of answers and fail to capture diverse perspectives. Include a variety of response options, including open-ended questions when appropriate.
  • Order bias: The order in which questions are presented can influence responses. Be mindful of question sequencing and consider randomizing or alternating question order to mitigate bias.
  • Social desirability bias: Respondents may provide answers that they perceive as socially acceptable rather than their true opinions or behaviors. Use anonymous surveys and assure respondents of confidentiality to minimize this bias.
  • Sampling bias: If the survey sample is not representative of the target population, results may not be generalizable. Ensure the survey sample accurately reflects the demographics and characteristics of the population of interest.
  • Non-response bias: If certain groups are more likely to participate in the survey than others, results may be skewed. Employ strategies to encourage participation from underrepresented groups.
29
Q

How to design a codebook?

A

For quantitative:

1) List Variables: Name each variable clearly.
2) Label Variables: Provide a short description for each (variable labels).
3) Define Values: Specify the possible values for categorical variables and what each code means (value labels).
4) Units & Types: Note measurement units and data types.
5) Missing Data: Define codes for missing values.
6) Annotations: Add any necessary notes.
7) Organize: Arrange logically for easy navigation.

This ensures clarity and consistency in data interpretation and analysis.
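
A quantitative codebook following these steps could be written down as a simple data structure. The variables, codes and missing-value codes below are invented for illustration, as is the `decode` helper.

```python
# Hypothetical codebook: one entry per variable
codebook = {
    "gender": {
        "label": "Gender of respondent",
        "type": "categorical",
        "values": {1: "female", 2: "male", 3: "other"},
        "missing": [99],  # code for "no answer"
    },
    "age": {
        "label": "Age of respondent",
        "type": "numeric",
        "unit": "years",
        "missing": [-1],
    },
}

def decode(variable, raw_value):
    """Translate a raw code into its meaning; missing codes become None."""
    entry = codebook[variable]
    if raw_value in entry["missing"]:
        return None
    if entry["type"] == "categorical":
        return entry["values"][raw_value]
    return raw_value

print(decode("gender", 2), decode("age", 34), decode("gender", 99))
```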

Codebook includes (see Adeaba notes):

For qualitative:

Code + Description + Example

30
Q

Creating an Index

A

An index is formed by summing up several single indicators.

Example: Press Freedom Index by Reporters Without Borders.
− The annual ranking list assesses the situation of press freedom in almost 180 countries worldwide.
− Basis for the ranking: an extensive questionnaire and different indicators (pressure, corruption), moderators and impact factors.

  • The index needs to cover all aspects of the variables being studied, and it should do so in a straightforward way, without getting too complicated or including unnecessary elements.
  • The index should be able to measure everything important about the topic in a clear and simple manner.
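
Summing single indicators into an index can be sketched as follows. The scores are invented, `legal_framework` is a hypothetical third indicator, and this does not reproduce the actual method of the Press Freedom Index.

```python
def additive_index(indicator_scores):
    """Form an index by summing up the single indicator scores."""
    return sum(indicator_scores.values())

# Hypothetical indicator scores for one country
country_scores = {"pressure": 3, "corruption": 2, "legal_framework": 4}
index_value = additive_index(country_scores)
print(index_value)
```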
31
Q

What is a Likert Scale?

A

The Likert scale is a five (or seven) point scale that is used to allow an individual to express how much they agree or disagree with a particular statement.
Simple questions: the less thinking required from respondents the better the response rate.

Example:
How satisfied were you with your in-store experience?

32
Q

Finding a Question - Inductive

A

Research questions can be developed inductively by observing social phenomena. Questions are developed based on the observations.

− Example: You observe more men than women in MBA courses but not in other courses.
This casual observation can be the basis of a research question: Are men more likely to take MBAs than women? Or: How does gender socialization affect students’ selection of majors?

Specific observation –> generalisation

33
Q

Finding a Research Question - Deductive

A

A deductive research process begins with theory and generalizations that lead to observation.

Example:
− All dogs have fleas (premise)
− Enno is a dog (premise)
− Enno has fleas (conclusion)

34
Q

What is Research Design?

A

In quantitative research, the design centres on numerical data collection and analysis.

Descriptive Quantitative Research Design
− Measures variables and perhaps establishes associations between variables

Correlational Quantitative Research Design
− Seeks to understand the relationship between the variables

Quasi-Experimental Quantitative Research Design
− Tries to establish a cause-effect relationship from one variable to another, typically without random assignment to groups (does studying more lead to higher grades?)

Experimental Quantitative Research Design
− Establishes procedures that make it possible to test a hypothesis and to systematically/scientifically study causal relationships among variables.

35
Q

Correlational Research

A

The goal of correlational research is to determine whether two or more variables are related.

Example: a researcher is interested in determining whether age is related to weight. The researcher may discover that age is indeed related to weight because, as age increases, weight also increases.

  • If a correlation between two variables is strong enough, knowing about one variable allows a researcher to make a prediction about the other variable.
  • A correlation (or relationship) between two things does not necessarily mean that one thing caused the other.
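
How strongly two variables are related is often expressed with Pearson's correlation coefficient r, which ranges from -1 to +1. The card does not name a specific coefficient, so this choice is an assumption, and the ages and weights are invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(variation_x * variation_y)

# Hypothetical data: ages in years and weights in kg of five children
ages = [5, 8, 11, 14, 17]
weights_kg = [18, 26, 36, 50, 62]
r = pearson_r(ages, weights_kg)
print(r)
```

A value of r close to +1 means weight rises with age in this data; it still says nothing about what causes what.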
36
Q

Measures of central tendency

A

It tells you where the typical or central value of the data lies, rather than describing the extremes on either side.

A measure of central tendency (also referred to as measures of centre or central location):
* Is a summary measure that attempts to describe a whole set of data with a single value that represents the middle or centre of its distribution.

There are three main measures of central tendency: the mode, the median and the mean.

Each of these measures describes a different indication of the typical or central value in the distribution.
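
All three measures can be computed with Python's `statistics` module. The example uses the data set 1, 1, 2, 5, 6, 6, 9 from the median and mode cards; `multimode` (Python 3.8+) returns every value that is tied for most frequent.

```python
import statistics

data = [1, 1, 2, 5, 6, 6, 9]

mean = statistics.mean(data)        # sum divided by count: 30 / 7
median = statistics.median(data)    # middle value of the sorted data
modes = statistics.multimode(data)  # most frequent value(s)

print(mean, median, modes)
```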

37
Q

Data cleaning techniques

A

Standardization
* Ensure consistent formatting for written responses (e.g., all uppercase or lowercase).

Coding
* Create numerical codes for answer choices in multiple-choice questions to simplify analysis.

Outlier Detection
* Identify and potentially exclude extreme values that might skew the results (consider reviewing these cases individually).

Researchers need to be cautious about straight-lining respondents; their responses should be removed before analysis.

  • Straight-lining is when a respondent chooses the same answer option again and again (such as always the first or last option).
  • It makes it more likely that the respondent has not answered honestly.
  • Remove fake or manipulated answers.

!!! Keep a log of the decisions made during data cleaning to ensure transparency and replicability.

38
Q

Conceptualizing your Qualitative Research Question

A

Should be exploratory:

  • Is framed as a question, aim or objective but not as a hypothesis
  • Should focus on a single phenomenon, concept or idea
  • Start with a verb that states your goal (e.g. characterize, understand etc.)
  • Identify your topic of interest
  • Use a language that is non-directional and neutral to be exploratory
  • Define your sample and your setting

Example: Understand interpersonal factors relevant to relationships formed online by retirees - in private homes in Bavaria

39
Q

How to Analyze Surveys?

A

1. Look at the results of your survey as a whole
2. Take a look at the demographics of those who responded
3. Compare responses to different questions to find deviations
4. Find connections between specific data points with layered data
5. Compare new data with past data