Statistics, SPSS, and Intro Methods (Field, Ch. 1–4) Flashcards
Between-subjects design vs. within-subjects design
In a between-subjects design, different individuals each experience only one condition, so scores are compared between groups; in a within-subjects design, the same individuals experience every condition, so each person’s scores are compared across conditions.
Boredom effect / Learning(practice) effect
Boredom effect: refers to the possibility that performance in tasks may be influenced (the assumption is a negative influence) by boredom or lack of concentration if there are many tasks, or the task goes on for a long period of time. Learning (practice) effect: refers to the possibility that participants’ performance in a task may be influenced (positively or negatively) if they repeat the task because of familiarity with the experimental situation and/or the measures being used.
Binary Variable, Categorical Variable
Binary variable: a categorical variable that has only two mutually exclusive categories (e.g., being dead or alive). Categorical variable: any variable made up of distinct categories of objects or entities (e.g., species or occupation); a binary variable is the special case with exactly two categories.
Nominal Variable, Ordinal Variable, Ratio Variable, Continuous Variable (see the PsycStats methods textbook; each also has its own entry below)
to be updated
What are the types of Validity? List and describe them
to be updated
Concurrent validity
a form of criterion validity where there is evidence that scores from an instrument correspond to concurrently recorded external measures conceptually related to the measured construct.
Spurious Relationship
A mathematical relationship in which two variables have no direct causal connection, yet it may be wrongly inferred that they do, due to either coincidence or the presence of a certain third, unseen factor (referred to as a “confounding variable”). Suppose there is found to be a correlation between A and B. Aside from coincidence, there are three possible relationships: A causes B, B causes A, or C causes both A and B.
Confounding variable
a variable (that we may or may not have measured) other than the predictor variables in which we’re interested that potentially affects an outcome variable.
Content validity
evidence that the content of a test corresponds to the content of the construct it was designed to cover.
Continuous Variable
a variable that can be measured to any level of precision. (Time is a continuous variable- no limit on how finely it could be measured.)
Correlational research
a form of research in which you observe what naturally goes on in the world without directly interfering with it. This term implies that data will be analysed so as to look at relationships between naturally occurring variables rather than making statements about cause and effect. Compare with cross-sectional research, longitudinal research and experimental research.
Compare and contrast: correlational research, cross-sectional research, longitudinal research and experimental research.
to be updated
Counterbalancing (***)
systematically varying the order in which participants in experimental conditions see the manipulations. In the simplest case of there being two conditions (A and B), counterbalancing simply implies that half of the participants complete condition A followed by condition B, whereas the remainder do condition B followed by condition A. The aim is to remove systematic bias caused by practice effects or boredom effects.
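The alternating AB/BA scheme described above can be sketched in a few lines of Python (an illustration, not from the text; the function name and participant labels are made up):

```python
def counterbalance(participants, conditions=("A", "B")):
    """Alternate the two possible orders (AB, BA, AB, ...) across
    participants so that roughly half complete each order."""
    orders = [tuple(conditions), tuple(reversed(conditions))]
    return {p: orders[i % 2] for i, p in enumerate(participants)}

assignment = counterbalance(["p1", "p2", "p3", "p4"])
print(assignment["p1"])  # ('A', 'B')
print(assignment["p2"])  # ('B', 'A')
```

With more than two conditions, a Latin square is typically used instead of simple alternation, because the number of possible orders grows factorially.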
Criterion validity
evidence that scores from an instrument correspond with (concurrent validity) or predict (predictive validity) external measures conceptually related to the measured construct.
Cross-sectional research
a form of research in which you observe what naturally goes on in the world
without directly interfering with it, by measuring several variables at a single
time point. In psychology, this term usually implies that data come from
people at different age points, with different people representing each age
point. See also correlational research, longitudinal research.
Dependent variable
another name for outcome variable. This name is usually associated with experimental methodology (which is the only time it really makes sense) and is used because it is the variable that is not manipulated by the experimenter and so its value depends on the variables that have been manipulated. To be honest I just use the term outcome variable all the time - it makes more sense (to me) and is less confusing.
Deviance
the difference between the observed value of a variable and the value of that variable predicted by a statistical model.
Discrete variable
a variable that can only take on certain values (usually whole numbers) on the scale.
Ecological validity
evidence that the results of a study, experiment or test can be applied, and allow inferences, to real-world conditions.
Experimental research
a form of research in which one or more variables are systematically manipulated to see their effect (alone or in combination) on an outcome variable. This term implies that data will be able to be used to make statements about cause and effect. Compare with cross-sectional research and correlational research.
Falsification
the act of disproving a hypothesis or theory.
Frequency distribution
a graph plotting values of observations on the horizontal axis, and the frequency with which each value occurs in the data set on the vertical axis (a.k.a. histogram).
Histogram
a frequency distribution.
Hypothesis
a prediction about the state of the world (see experimental hypothesis and null hypothesis).
Independent design
an experimental design in which different treatment conditions utilize different organisms (e.g., in psychology, this would mean using different people in different treatment conditions) and so the resulting data are independent (a.k.a. between-groups or between-subjects design).
Independent variable
another name for a predictor variable. This name is usually associated with experimental methodology (which is the only time it makes sense) and is used because it is the variable that is manipulated by the experimenter and so its value does not depend on any other variables (just on the experimenter). I just use the term predictor variable all the time because the meaning of the term is not constrained to a particular methodology.
Interquartile range
the limits within which the middle 50% of an ordered set of observations fall. It is the difference between the value of the upper quartile and lower quartile.
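As a concrete illustration (a Python standard-library sketch, not from the text):

```python
import statistics

scores = [1, 2, 3, 4, 5, 6, 7, 8]

# The three quartiles cut the ordered scores into four equal parts
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

iqr = q3 - q1  # limits containing the middle 50% of the scores
print(q1, q2, q3)  # 2.75 4.5 6.25
print(iqr)         # 3.5
```

Note that different software packages interpolate quartiles slightly differently, so values near these (not always identical) are normal.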
Interval variable
data measured on a scale along the whole of which intervals are equal. For example, people’s ratings of this book on Amazon.com can range from 1 to 5; for these data to be interval it should be true that the increase in appreciation for this book represented by a change from 3 to 4 along the scale should be the same as the change in appreciation represented by a change from 1 to 2, or 4 to 5.
Journal
In the context of academia a journal is a collection of articles on a broadly related theme, written by scientists, that report new data, new theoretical ideas or reviews/critiques of existing theories and data. Their main function is to induce learned helplessness in scientists through a complex process of self-esteem regulation using excessively harsh or complimentary peer feedback that has seemingly no obvious correlation with the actual quality of the work submitted.
Kurtosis
this measures the degree to which scores cluster in the tails of a frequency distribution. There are different ways to estimate kurtosis and in SPSS no kurtosis is expressed as 0 (but be careful because outside of SPSS no kurtosis is sometimes a value of 3). A distribution with positive kurtosis (leptokurtic, kurtosis > 0) has too many scores in the tails and is too peaked, whereas a distribution with negative kurtosis (platykurtic, kurtosis < 0) has too few scores in the tails and is quite flat.
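A minimal sketch of the idea (the function below computes the simple population excess kurtosis, so it will differ slightly from SPSS’s small-sample-corrected estimate):

```python
def excess_kurtosis(xs):
    """Population excess kurtosis: 0 for a normal distribution,
    > 0 leptokurtic (heavy tails), < 0 platykurtic (light tails)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    return m4 / m2 ** 2 - 3  # subtracting 3 makes the normal score 0

# An evenly spread set of scores is flatter than normal (platykurtic)
print(excess_kurtosis([1, 2, 3, 4, 5, 6, 7, 8, 9]))  # negative
```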
Leptokurtic
see Kurtosis.
Levels of measurement
the relationship between what is being measured and the numbers obtained on a scale.
Longitudinal research
a form of research in which you observe what naturally goes on in the world without directly interfering with it by measuring several variables at multiple time points. See also correlational research, cross-sectional research.
Lower quartile
the value that cuts off the lowest 25% of the data. If the data are ordered and then divided into two halves at the median, then the lower quartile is the median of the lower half of the scores.
Mean
a simple statistical model of the centre of a distribution of scores. A hypothetical estimate of the ‘typical’ score.
Measurement error
the discrepancy between the numbers used to represent the thing that we’re measuring and the actual value of the thing we’re measuring (i.e., the value we would get if we could measure it directly).
Median
the middle score of a set of ordered observations. When there is an even number of observations the median is the average of the two scores that fall either side of what would be the middle value.
Mode
the most frequently occurring score in a set of data.
Multimodal
description of a distribution of observations that has more than two modes.
Negative skew
see Skew.
Nominal variable
where numbers merely represent names. For example, the numbers on sports players’ shirts: a player with the number 1 on her back is not necessarily worse than a player with a 2 on her back. The numbers have no meaning other than denoting the type of player (full back, centre forward, etc.).
Noniles
a type of quantile; they are values that split the data into nine equal parts. They are commonly used in educational research.
Normal distribution
a probability distribution of a random variable that is known to have certain properties. It is perfectly symmetrical (has a skew of 0), and has a kurtosis of 0.
Ordinal variable
data that tell us not only that things have occurred, but also the order in which they occurred. These data tell us nothing about the differences between values. For example, gold, silver and bronze medals are ordinal: they tell us that the gold medallist was better than the silver medallist, but they don’t tell us how much better (was gold a lot better than silver, or were gold and silver very closely competed?).
Outcome variable
a variable whose values we are trying to predict from one or more predictor variables.
Percentiles
a type of quantile; they are values that split the data into 100 equal parts.
Platykurtic
see Kurtosis.
Positive skew
see skew.
Practice effect
refers to the possibility that participants’ performance in a task may be influenced (positively or negatively) if they repeat the task because of familiarity with the experimental situation and/or the measures being used.
Predictive validity
a form of criterion validity where there is evidence that scores from an instrument predict external measures (recorded at a different point in time) conceptually related to the measured construct.
Predictor variable
a variable that is used to try to predict values of another variable known as an outcome variable.
Probability density function (PDF)
the function that describes the probability of a random variable taking a certain value. It is the mathematical function that describes the probability distribution.
Probability distribution
a curve describing an idealized frequency distribution of a particular variable from which it is possible to ascertain the probability with which specific values of that variable will occur. For categorical variables it is simply a formula yielding the probability with which each category occurs.
Qualitative methods
extrapolating evidence for a theory from what people say or write (cf. quantitative methods).
Quantitative methods
inferring evidence for a theory through measurement of variables that produce numeric outcomes (cf. qualitative methods).
Quantiles
values that split a data set into equal portions. Quartiles, for example, are a special case of quantiles that split the data into four equal parts. Similarly, percentiles are points that split the data into 100 equal parts and noniles are points that split the data into 9 equal parts (you get the general idea).
Quartiles
a generic term for the three values that cut an ordered data set into four equal parts. The three quartiles are known as the lower quartile, the second quartile (or median) and the upper quartile.
Randomization
the process of doing things in an unsystematic or random way. In the context of experimental research the word usually applies to the random assignment of participants to different treatment conditions.
Range
the range of scores is the value of the smallest score subtracted from the highest score. It is a measure of the dispersion of a set of scores. See also variance, standard deviation, and interquartile range.
Ratio variable
an interval variable, but with the additional property that ratios are meaningful. For example, people’s ratings of this book on Amazon.com can range from 1 to 5; for these data to be ratio, not only must they have the properties of interval variables, but in addition a rating of 4 should genuinely represent someone who enjoyed this book twice as much as someone who rated it as 2. Likewise, someone who rated it as 1 should be half as impressed as someone who rated it as 2.
Reliability
the ability of a measure to produce consistent results when the same entities are measured under different conditions.
Repeated-measures design
an experimental design in which different treatment conditions utilize the same organisms (i.e., in psychology, this would mean the same people take part in all experimental conditions) and so the resulting data are related (a.k.a. related design or within-subject design).
Second quartile
another name for the median.
Skew
a measure of the symmetry of a frequency distribution. Symmetrical distributions have a skew of 0. When the frequent scores are clustered at the lower end of the distribution and the tail points towards the higher or more positive scores, the value of skew is positive. Conversely, when the frequent scores are clustered at the higher end of the distribution and the tail points towards the lower more negative scores, the value of skew is negative.
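The sign convention can be illustrated with a small sketch (simple population skew; SPSS uses a small-sample-corrected formula, so its values differ slightly):

```python
def skewness(xs):
    """Population skew: 0 for symmetric data, positive when the tail
    points toward higher scores, negative when it points toward lower ones."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    return m3 / m2 ** 1.5

print(skewness([1, 1, 1, 2, 2, 3, 10]))  # positive: tail toward high scores
print(skewness([2, 4, 6]))               # 0.0: perfectly symmetric
```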
Standard deviation
an estimate of the average variability (spread) of a set of data measured in the same units of measurement as the original data. It is the square root of the variance.
Systematic variation
variation due to some genuine effect (be it the effect of an experimenter doing something to all of the participants in one sample but not in other samples, or natural variation between sets of variables). We can think of this as variation that can be explained by the model that we’ve fitted to the data.
Sum of squared errors
another name for the sum of squares.
Tertium quid
the possibility that an apparent relationship between two variables is actually caused by the effect of a third variable on them both (often called the third-variable problem).
Test-retest reliability
the ability of a measure to produce consistent results when the same entities are tested at two different points in time.
Theory
although it can be defined more formally, a theory is a hypothesized general principle or set of principles that explain known findings about a topic and from which new hypotheses can be generated.
Unsystematic variation
this is variation that isn’t due to the effect in which we’re interested (so could be due to natural differences between people in different samples such as differences in intelligence or motivation). We can think of this as variation that can’t be explained by whatever model we’ve fitted to the data.
Upper quartile
the value that cuts off the highest 25% of ordered scores. If the scores are ordered and then divided into two halves at the median, then the upper quartile is the median of the top half of the scores.
Validity
evidence that a study allows correct inferences about the question it was aimed to answer or that a test measures what it set out to measure conceptually (see also Content validity, Criterion validity).
Variables
anything that can be measured and can differ across entities or across time.
Variance
an estimate of average variability (spread) of a set of data. It is the sum of squares divided by the number of values on which the sum of squares is based minus 1.
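The “sum of squares divided by N − 1” recipe in the definition can be checked directly (an illustrative sketch; Python’s `statistics.variance` uses the same N − 1 formula):

```python
import statistics

scores = [1, 3, 5, 7]
n = len(scores)
mean = sum(scores) / n  # 4.0

sum_of_squares = sum((x - mean) ** 2 for x in scores)  # 20.0
variance = sum_of_squares / (n - 1)  # 20 / 3, roughly 6.67

# The hand-rolled value matches the library's sample variance
print(abs(variance - statistics.variance(scores)) < 1e-12)  # True
```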
Within-subject design
another name for a repeated-measures design.
z-score
the value of an observation expressed in standard deviation units. It is calculated by taking the observation, subtracting from it the mean of all observations, and dividing the result by the standard deviation of all observations. By converting a distribution of observations into z-scores a new distribution is created that has a mean of 0 and a standard deviation of 1.
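The calculation described above, as a short Python sketch (illustrative, not from the text):

```python
import statistics

scores = [10, 12, 14, 16, 18]
mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample standard deviation

# Subtract the mean, divide by the standard deviation
z_scores = [(x - mean) / sd for x in scores]

# The resulting distribution has mean 0 and standard deviation 1
print(statistics.mean(z_scores))   # approximately 0 (floating-point error aside)
print(statistics.stdev(z_scores))  # approximately 1
```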
α-level
the probability of making a Type I error (usually this value is .05).
Alternative hypothesis
the prediction that there will be an effect (i.e., that your experimental manipulation will have some effect or that certain variables will relate to each other).
β-level
the probability of making a Type II error (Cohen, 1992, suggests a maximum value of .2).
Bonferroni correction
a correction applied to the α-level to control the overall Type I error rate when multiple significance tests are carried out. Each test conducted should use a criterion of significance of the α-level (normally .05) divided by the number of tests conducted. This is a simple but effective correction, but tends to be too strict when lots of tests are performed.
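The corrected criterion is just a division, sketched here (illustrative values; not from the text):

```python
alpha = 0.05    # desired overall Type I error rate
n_tests = 5     # number of significance tests on the same data

# Each individual test must beat the stricter corrected criterion
bonferroni_alpha = alpha / n_tests  # roughly 0.01

p_values = [0.001, 0.02, 0.04]
significant = [p < bonferroni_alpha for p in p_values]
print(significant)  # [True, False, False]
```

Note that 0.02 and 0.04 would both have passed the uncorrected .05 criterion, which is exactly the inflation the correction guards against.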
Central limit theorem
this theorem states that when samples are large (above about 30) the sampling distribution will take the shape of a normal distribution regardless of the shape of the population from which the sample was drawn. For small samples the t-distribution better approximates the shape of the sampling distribution. We also know from this theorem that the standard deviation of the sampling distribution (i.e., the standard error of the sample mean) will be equal to the standard deviation of the sample(s) divided by the square root of the sample size (N).
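The last claim, that the standard error is the sample standard deviation divided by the square root of N, is easy to sketch (illustrative numbers, not from the text):

```python
import math
import statistics

sample = [4, 6, 8, 10, 12]

# Standard error of the mean: sample standard deviation / sqrt(N)
se = statistics.stdev(sample) / math.sqrt(len(sample))
print(se)  # roughly 1.41: the mean varies far less across samples than raw scores do
```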
Cohen’s d
An effect size that expresses the difference between two means in standard deviation units. In general it can be estimated as the difference between the two group means divided by a standard deviation (for example, a pooled standard deviation): d = (mean1 − mean2) / s.
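As a concrete sketch (illustrative data; the pooled standard deviation used here is one common choice of standardizer for equal-sized groups):

```python
import statistics

group1 = [10, 12, 14, 16, 18]  # mean 14
group2 = [8, 10, 12, 14, 16]   # mean 12

# Pooled SD for equal-sized groups; both groups here have the same SD
s1, s2 = statistics.stdev(group1), statistics.stdev(group2)
pooled_sd = ((s1 ** 2 + s2 ** 2) / 2) ** 0.5

d = (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd
print(d)  # roughly 0.63: the means differ by about 0.63 standard deviations
```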
Confidence interval
for a given statistic calculated for a sample of observations (e.g., the mean), the confidence interval is a range of values around that statistic that are believed to contain, with a certain probability (e.g., 95%), the true value of that statistic (i.e., the population value).
Degrees of freedom
an impossible thing to define in a few pages, let alone a few lines. Essentially it is the number of ‘entities’ that are free to vary when estimating some kind of statistical parameter. In a more practical sense, it has a bearing on significance tests for many commonly used test statistics (such as the F-ratio, t-test, chi-square statistic) and determines the exact form of the probability distribution for these test statistics. The explanation involving soccer players in Chapter 2 is far more interesting…
Deviance
the difference between the observed value of a variable and the value of that variable predicted by a statistical model.
Effect size
an objective and (usually) standardized measure of the magnitude of an observed effect. Measures include Cohen’s d, Glass’s g and Pearson’s correlation coefficient, r.
Experimental hypothesis
synonym for alternative hypothesis.
Experimentwise error rate
the probability of making a Type I error in an experiment involving one or more statistical comparisons when the null hypothesis is true in each case.
Familywise error rate
the probability of making a Type I error in any family of tests when the null hypothesis is true in each case. The ‘family of tests’ can be loosely defined as a set of tests conducted on the same data set and addressing the same empirical question.
Fit
how sexually attractive you find a statistical test. Alternatively, it’s the degree to which a statistical model is an accurate representation of some observed data. (Incidentally, it’s just plain wrong to find statistical tests sexually attractive.)
Linear model
a model that is based upon a straight line.
Meta-analysis
this is a statistical procedure for assimilating research findings. It is based on the simple idea that we can take effect sizes from individual studies that research the same question, quantify the observed effect in a standard way (using effect sizes) and then combine these effects to get a more accurate idea of the true effect in the population.
Method of least squares
a method of estimating parameters (such as the mean, or a regression coefficient) that is based on minimizing the sum of squared errors. The parameter estimate will be the value, out of all of those possible, that has the smallest sum of squared errors.
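The mean itself is the least-squares estimate of the centre of a distribution, which can be demonstrated directly (an illustrative sketch, not from the text):

```python
def sum_squared_errors(data, estimate):
    """Total squared deviation of the data from a candidate estimate."""
    return sum((x - estimate) ** 2 for x in data)

data = [2, 4, 6, 8, 10]
mean = sum(data) / len(data)  # 6.0

# Among these candidates, the mean gives the smallest sum of squared errors
candidates = [4.0, 5.0, 6.0, 7.0, 8.0]
best = min(candidates, key=lambda c: sum_squared_errors(data, c))
print(best)  # 6.0
```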
Null hypothesis
the reverse of the experimental hypothesis, it says that your prediction is wrong and the predicted effect doesn’t exist.
One-tailed test
a test of a directional hypothesis. For example, the hypothesis ‘the longer I write this glossary, the more I want to place my editor’s genitals in a starved crocodile’s mouth’ requires a one-tailed test because I’ve stated the direction of the relationship (see also two-tailed test).
Parameter
a very difficult thing to describe. When you fit a statistical model to your data, that model will consist of variables and parameters: variables are measured constructs that vary across entities in the sample, whereas parameters describe the relations between those variables in the population. In other words, they are constants believed to represent some fundamental truth about the measured variables. We use sample data to estimate the likely value of parameters because we don’t have direct access to the population. Of course it’s not quite as simple as that.
Population
in statistical terms this usually refers to the collection of units (be they people, plankton, plants, cities, suicidal authors, etc.) to which we want to generalize a set of findings or a statistical model.
Power
the ability of a test to detect an effect of a particular size (a value of .8 is a good level to aim for).
Sample
a smaller (but hopefully representative) collection of units from a population used to determine truths about that population (e.g., how a given population behaves in certain conditions).
Sampling distribution
the probability distribution of a statistic. We can think of this as follows: if we take a sample from a population and calculate some statistic (e.g., the mean), the value of this statistic will depend somewhat on the sample we took. As such the statistic will vary slightly from sample to sample. If, hypothetically, we took lots and lots of samples from the population and calculated the statistic of interest we could create a frequency distribution of the values we got. The resulting distribution is what the sampling distribution represents: the distribution of possible values of a given statistic that we could expect to get from a given population.
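The “lots and lots of samples” thought experiment can actually be run (a simulation sketch with made-up data, not from the text):

```python
import random
import statistics

random.seed(42)  # reproducible illustration
population = [random.uniform(0, 10) for _ in range(10_000)]

# Draw many samples and record each sample's mean; this collection of
# means approximates the sampling distribution of the mean
sample_means = [statistics.mean(random.sample(population, 30))
                for _ in range(2_000)]

# It is centred on the population mean, with far less spread (the standard error)
print(abs(statistics.mean(sample_means) - statistics.mean(population)) < 0.3)  # True
print(statistics.stdev(sample_means) < statistics.stdev(population))           # True
```

Plotting `sample_means` as a histogram would show a roughly normal shape, as the central limit theorem predicts.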
Sampling variation
the extent to which a statistic (the mean, median, t, F, etc.) varies in samples taken from the same population.
Standard error
the standard deviation of the sampling distribution of a statistic. For a given statistic (e.g., the mean) it tells us how much variability there is in this statistic across samples from the same population. Large values, therefore, indicate that a statistic from a given sample may not be an accurate reflection of the population from which the sample came.
Standard error of the mean (SE)
the standard error associated with the mean. Did you really need a glossary entry to work that out?
Test statistic
a statistic for which we know how frequently different values occur. The observed value of such a statistic is typically used to test hypotheses.
Two-tailed test
a test of a non-directional hypothesis. For example, the hypothesis ‘writing this glossary has some effect on what I want to do with my editor’s genitals’ requires a two-tailed test because it doesn’t suggest the direction of the relationship. See also One-tailed test.
Type I error
occurs when we believe that there is a genuine effect in our population, when in fact there isn’t.
Type II error
occurs when we believe that there is no effect in the population, when in fact there is.
Currency variable
a variable containing values of money.
String variables
variables involving words (i.e., letter strings). Such variables could include responses to open-ended questions such as ‘How much do you like writing glossary entries?’; the response might be ‘About as much as I like placing my gonads on hot coals’.
Bar chart
a graph in which a summary statistic (usually the mean) is plotted on the y-axis against a categorical variable on the x-axis (this categorical variable could represent, for example, groups of people, different times or different experimental conditions). The value of the mean for each category is shown by a bar. Different-coloured bars may be used to represent levels of a second categorical variable.
Boxplot (a.k.a. box-whisker diagram)
a graphical representation of some important characteristics of a set of observations. At the centre of the plot is the median, which is surrounded by a box, the top and bottom of which are the limits within which the middle 50% of observations fall (the interquartile range). Sticking out of the top and bottom of the box are two whiskers which extend to the highest and lowest extreme scores, respectively.
Chartjunk
superfluous material that distracts from the data being displayed on a graph.
Density plot
similar to a histogram except that rather than having a summary bar representing the frequency of scores, it shows each individual score as a dot. They can be useful for looking at the shape of a distribution of scores.
Error bar chart
a graphical representation of the mean of a set of observations that includes the 95% confidence interval of the mean. The mean is usually represented as a circle, square or rectangle at the value of the mean (or a bar extending to the value of the mean). The confidence interval is represented by a line protruding from the mean (upwards, downwards or both) to a short horizontal line representing the limits of the confidence interval. Error bars can be drawn using the standard error or standard deviation instead of the 95% confidence interval.
Line chart
a graph in which a summary statistic (usually the mean) is plotted on the y-axis against a categorical variable on the x-axis (this categorical variable could represent, for example, groups of people, different times or different experimental conditions). The value of the mean for each category is shown by a symbol, and means across categories are connected by a line. Different-coloured lines may be used to represent levels of a second categorical variable.
Regression line
a line on a scatterplot representing the regression model of the relationship between the two variables plotted.
Scatterplot
a graph that plots values of one variable against the corresponding value of another variable (and the corresponding value of a third variable can also be included on a 3-D scatterplot).