PL1010: Research Design and Methods 1B Flashcards
You can email destinee.mbo@forward-college.eu with any questions/suggestions about the flashcards in this deck.
Categorical variable
Variable with scores that are not on a numeric scale
Descriptive statistics –
Summarise samples by giving someone the main points in a simple form. To describe data, we use graphical and numerical (statistical) techniques.
Inferential statistics –
Examine patterns in the data and consider how much data we have. You can then draw conclusions about a population based on the analysis of a sample. -> conceptual replication
Summarising
collecting and summarising data
Statistical inference
the ability to draw general conclusions from samples
How many times does a particular score occur?
Frequency statement: percentages or averages of the scores for a particular variable
Do scores for one variable correlate with scores for the other variable?
Statement about association
How strong is the correlation or association between two variables?
Statement about association
Do I trust that there is a “genuine” association (relationship)?
Statement about relationship between two variables
Frequency Distribution?
Shows the scores in order together with how often each one appears in the sample
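A minimal sketch of a frequency distribution, assuming a small set of made-up scores (the data and variable names are illustrative only):

```python
from collections import Counter

# Hypothetical scores, invented for illustration.
scores = [3, 1, 2, 3, 3, 2, 5]

# Count how often each score occurs, then list the scores in order.
frequency_distribution = sorted(Counter(scores).items())

for score, frequency in frequency_distribution:
    print(f"score {score}: {frequency} time(s)")
```

Each pair holds a score and its frequency, ordered from the lowest to the highest score.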
Negatively skewed
Positively Skewed
When not to describe the skew of data?
When we cannot put our scores in order from lowest to highest, i.e. when we are describing a categorical variable with unordered categories
Unimodal?
One major peak
Bimodal
Two major peaks
Approximately symmetrical
How do outliers and the mean relate to each other?
Outliers are extreme values that differ from most values in the data set. Because all values are used in the calculation of the mean, an outlier can have a dramatic effect on the mean by pulling the mean away from the majority of the values.
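The pull of an outlier on the mean can be checked with a short sketch. The scores below are invented for illustration; the median is shown only as a point of comparison.

```python
from statistics import mean, median

# Hypothetical scores, invented for illustration.
scores = [4, 5, 5, 6, 7]
with_outlier = scores + [60]  # add one extreme value

mean_before, mean_after = mean(scores), mean(with_outlier)
median_before, median_after = median(scores), median(with_outlier)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```

The mean shifts far more than the median because every value, including the extreme one, enters its calculation.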
What happens to the mean, median and mode in a skewed distribution
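In a normal distribution they all take the same value. In a skewed distribution they separate: the mean is pulled furthest toward the tail, the median less so, and the mode stays at the peak.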
Why are histograms good?
effective visual summary of a variable’s central tendency and variability
What is a discrete, continuous, independent and dependent variable?
Discrete: a variable limited to a small set of separate values (e.g. age in whole years, gender). Continuous: a variable that exists on a continuum, with practically infinite values between the lowest and the highest. IV: the variable that is manipulated or changed to see whether it has an effect on the DV. DV: the variable that is measured but not controlled, and that might change because of the manipulation.
What is the role of measurement scales?
The numbers do not necessarily say anything concrete about the objects measured. Example: if I scored high on a test and someone else scored lower, it is not necessarily because they remembered less, even though the data might suggest it; we only assume that the scores mean I remembered more.
What is the purpose of a frequency distribution?
Organising data into a meaningful order that shows how many times each score occurs
Which variable do I usually find on the X- and Y-Axis in histograms vs. line graphs?
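Histogram: the scores of the measured variable (usually the DV) on the X-axis and their frequency on the Y-axis. Line/bar graph: the IV on the X-axis and the DV on the Y-axis.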
What is the mode, median, mean (+formulas)?
Mode: the most frequent score (the highest point in the graph). Median: the 50th percentile (the middle score). Mean: the sum of all scores divided by N (∑X/N).
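A minimal sketch of the three measures, assuming a small set of made-up scores. `multimode` is used because a data set can have more than one mode.

```python
from statistics import median, multimode

# Hypothetical scores, invented for illustration.
scores = [2, 3, 3, 4, 5, 7, 11]

modes = multimode(scores)            # most frequent score(s)
middle = median(scores)              # 50th percentile
average = sum(scores) / len(scores)  # sum of X divided by N

print(modes, middle, average)
```

Here the mean is larger than the median, which fits the long tail of high scores in this made-up sample.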
If the mean is slightly larger than the median, what does it probably say about our distribution?
Positively skewed
When will the mean and the median be equal?
Symmetric distribution
The benefit of the mode is?
It can represent categorical data, where it is the more informative measure, but it is not very reflective of the remaining data set.
The benefit of the median is?
Not affected by outliers. Drawbacks: less stable from sample to sample than the mean, and not useful for further calculation.
What does central tendency refer to?
The scores' tendency to cluster around a typical (central) value
What is the advantage of a bar chart?
Comparing categories. It mirrors other visualisation techniques where the spread is along the X-axis and the frequency or percentage is along the Y-axis, and it already hints at modality and skewness.
What is an alternate name for the y-axis/x-axis?
ordinate/abscissa
Suppose you sell ice cream with three different flavours: chocolate, strawberry and yogurt. The ice cream flavours are measured on a ____________ level. You sell ice cream to children, adults and elderly people. These age groups are measured on a ____________ level.
nominal; ordinal
operational definition
defining a variable in terms of the set of steps or procedures that the researcher goes through in order to manipulate or measure the variable
right skewed
positively skewed
What does a negatively skewed distribution reveal?
A lot of people got close to the maximum score
What does central tendency mean?
average score
Age in months is an example of a variable with a ratio scale of measurement. True or false?
True
What are two ways to visually represent two measurement data variables?
1. Scatter plots 2. Contingency tables/crosstabulation
What is a way to visually represent a mix of categorical and measurement data?
compound histogram
What is a way to visually represent categorical data pairs?
crosstabulation
What are the groupings of scores in histograms called?
bins
Do these images show the same data?
Yes
Which visual representation should you choose if you want to show that variables vary simultaneously?
scatter-plots
What does a boxplot do?
summarises the data while showing the range, interquartile range, as well as the min, max and the median
When is the mean most useful?
best for interval/ratio measurement data (categorical data can hardly be split into 2), needs equal spacing between adjacent values
What is the mode most useful for?
All types of data, but notably nominal/ordinal categorical data, because it shows the most popular choice
Variables are
properties of objects that vary in the values that they take on
A score is
an individual value for a variable
Measurement data describes
scores on a numerical scale
Categorical data describes
scores not on a numerical scale
A Population describes
a complete set of scores that might be of interest
A Sample is
a sub-set of scores from a population which were obtained
A parameter is
a number that summarises the entire set of scores in a population
A statistic is
a number that summarises the scores in a sample
Descriptive Statistics…
summarise samples by presenting the main points in a simplified way
Inferential statistics…
examine patterns in the data and consider the amount of data
Ethnicity and political ideology are examples of
nominal variables
standardised scores (z-scores)
Z-Score Formula: z = (X − mean) / SD
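A minimal sketch of standardising scores, assuming the usual definition z = (X − mean) / SD and made-up data. Using the sample SD (N − 1 in the denominator) is an assumption here; the function name is illustrative.

```python
from statistics import mean, stdev

def z_scores(scores):
    """Convert raw scores to z-scores using the sample mean and sample SD."""
    m = mean(scores)
    sd = stdev(scores)  # sample standard deviation (divides by N - 1)
    return [(x - m) / sd for x in scores]

# Hypothetical scores, invented for illustration.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
standardised = z_scores(raw)
print([round(z, 2) for z in standardised])
```

After standardising, the scores have a mean of 0 and a standard deviation of 1.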
Falsifiability
The capacity for a proposition, statement, theory or hypothesis to be proven wrong (through systematic empiricism); the null hypothesis provides a basis for this.
null hypothesis
states the contrary of the experimental or alternative hypothesis
falsifiable hypothesis
can be logically contradicted by an empirical test that can potentially be executed with existing technologies.
What is meant by dispersion/variability around the mean?
Determining how far the scores are spread out around the mean score
What are measures of variability?
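Range. Interquartile range. Standard deviation. Sample variance: the raw deviations from the mean always sum to 0, so we square them before taking the sum, which makes negative deviations positive. Mean absolute deviation: like the SD, but it uses absolute values instead of squaring, because the signed deviations would otherwise sum to 0.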
What are the mean and standard deviation in the standard normal distribution?
Mean = 0 and SD = 1. A z-score is a standard score that specifies how many standard deviations a score lies from the mean.
What is the mean and the standard deviation of t-scores?
Mean = 50, SD = 10
What is meant by sampling error?
Chance difference: the way a statistic naturally varies from sample to sample, so that it will practically always deviate somewhat from the parameter. It is random variability, and its typical size is given by the standard deviation of the sampling distribution.
What is the standard error of the mean?
The standard deviation of the sampling distribution of the mean, i.e. how much sample means vary around the population mean they estimate. It is estimated as SD/√N.
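A minimal sketch of estimating the standard error of the mean from one sample, assuming the common estimate SD/√N with the sample SD. The data and function name are illustrative only.

```python
from math import sqrt
from statistics import stdev

def standard_error(scores):
    """Estimate the standard error of the mean as sample SD / sqrt(N)."""
    return stdev(scores) / sqrt(len(scores))

# Hypothetical scores, invented for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(sample)
print(round(se, 3))
```

Because N is in the denominator, larger samples give a smaller standard error for the same spread of scores.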
What is the purpose of the standard error?
It indicates how much the sample statistic might differ across samples. It works like a standard deviation for all the chance differences we could observe that would not go against the finding.
What are the logical steps of hypothesis testing?
1. Set up a research hypothesis (H1). 2. Set up a null hypothesis (H0). 3. Get a sample and the sampling distribution of the sample statistic (e.g. the mean) under H0. 4. Calculate the probability of a sample statistic at least as large as the one obtained. 5. Reject or fail to reject H0.
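The steps above can be sketched with a one-sample z-test. This is only one possible test, chosen for simplicity: it assumes the population SD is known, uses a two-sided probability, and takes alpha = .05. The sample, `mu0`, `sigma` and the function name are all invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0 (sigma assumed known)."""
    se = sigma / sqrt(len(sample))   # SD of the sampling distribution under H0
    z = (mean(sample) - mu0) / se    # how far the sample mean is from mu0
    # Probability of a result at least this extreme if H0 is true.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p, p < alpha           # last value: True means "reject H0"

# H1: the population mean differs from 100. H0: the population mean is 100.
sample = [104, 110, 98, 107, 112, 101, 109, 105, 103]  # hypothetical scores
z, p, reject = one_sample_z_test(sample, mu0=100, sigma=15)
print(round(z, 3), round(p, 3), "reject H0" if reject else "fail to reject H0")
```

With these made-up numbers the probability is well above .05, so H0 is not rejected.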
What is the philosophical rationale for the null hypothesis?
H0: μ1 − μ2 = 0. It was proposed by Fisher, with the logic that we can always show that something is false.
How do you calculate the IQR?
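1. Order the scores. 2. Find the median location, (N + 1)/2. 3. Find the median of the lower half (first quartile) and the median of the upper half (third quartile). 4. Subtract the first quartile from the third quartile.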
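A minimal sketch of this procedure, assuming the "median of each half" convention and excluding the middle score when N is odd. Other quartile conventions exist and can give slightly different values; the function name and data are illustrative.

```python
from statistics import median

def iqr(scores):
    """Interquartile range as median(upper half) - median(lower half)."""
    if len(scores) < 2:
        raise ValueError("need at least two scores")
    ordered = sorted(scores)     # step 1: order the scores
    half = len(ordered) // 2     # size of each half (middle score excluded if N is odd)
    lower = ordered[:half]       # scores below the median location
    upper = ordered[-half:]      # scores above the median location
    return median(upper) - median(lower)

# Hypothetical scores, invented for illustration.
print(iqr([1, 2, 3, 4, 5, 6, 7, 8]))
print(iqr([7, 1, 3, 5, 2, 6, 4]))
```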
What do d= 1 and d= .5 indicate?
That the difference between the means is as large as one standard deviation (d = 1) or half a standard deviation (d = .5)
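A minimal sketch of computing d for two groups. Using the pooled sample standard deviation as the denominator is an assumption (other choices exist), and the groups and function name are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Mean difference divided by the pooled sample standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) +
                  (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)

# Hypothetical groups, invented for illustration.
d = cohens_d([3, 4, 5, 6, 7], [1, 2, 3, 4, 5])
print(round(d, 2))
```

The result expresses the mean difference in standard deviation units, so it can be compared across measures with different scales.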
What are the absolute deviations from the mean?
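|X − mean| for each score. The signed deviations (X − mean) are the basis of the standard deviation: if we square them, average them (dividing by N − 1) and take the root, we get the sample standard deviation.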
When you collect data from a sample, the sample variance is used to?
Make estimates or inferences about the population variance; comparing the variances of samples also helps you assess group differences.
How is the sample mean related to variance and standard deviation?
The formulas for variance and standard deviation build on it: both are computed from the deviations of each score from the sample mean.
Which five steps need to be taken to calculate the sample variance?
1. The mean: ∑X/N. 2. The deviation from the mean: X − (∑X/N). 3. The squared deviation from the mean: (X − (∑X/N))^2. 4. Sum the squared deviations over all scores and divide by N − 1; this is the sample variance. 5. Take the square root to find the standard deviation.
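The steps can be followed one by one in a short sketch. The scores and the function name are invented for illustration.

```python
from math import sqrt

def sample_variance_and_sd(scores):
    """Follow the steps: mean, deviations, squares, sum / (N - 1), root."""
    n = len(scores)
    m = sum(scores) / n                     # 1. the mean
    deviations = [x - m for x in scores]    # 2. deviation from the mean
    squared = [dev ** 2 for dev in deviations]  # 3. squared deviations
    var = sum(squared) / (n - 1)            # 4. sum and divide by N - 1
    sd = sqrt(var)                          # 5. square root gives the SD
    return var, sd

# Hypothetical scores, invented for illustration.
var, sd = sample_variance_and_sd([2, 4, 4, 4, 5, 5, 7, 9])
print(round(var, 3), round(sd, 3))
```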
The standard deviation is more informative about the variability than the variance.
False
The standard deviation is expressed in larger units than the variance.
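False: because the square root is taken, the standard deviation is in the original units of the scores, while the variance is in squared units.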
What does the standard deviation tell me?
how far, on average, a value lies from the mean which is why it is derived from the variance (square root)
Which graph describes a correct null hypothesis?
right
The p-value can be defined as
the probability of obtaining a result at least as extreme as the one observed when the null hypothesis is true