Statistics Flashcards
Descriptive Statistics
Organizes, summarizes, and communicates a group of numerical observations
Inferential Statistics
Uses sample data to make general estimates about the larger population
Sample
Set of observations drawn from the population of interest
Population
includes all possible observations about which we’d like to know something
Variable
Any observation of a physical, attitudinal, or behavioural characteristic that can take on different values
Discrete observation
Can take only specific values; no other values can exist between the numbers; e.g., the number of times one woke up early in a week
Continuous observation
can take on a full range of values (numbers out to several decimal places); infinite number of potential values exist; A person might complete a task in 12.839 seconds, etc.
Nominal Variable
Variable used for observations that have categories, or names, as their values; e.g., coding 1 for female and 2 for male
Ordinal Variable
A variable used for observations that have rankings as their values; e.g., in team sports, which team placed first, second, or third
Interval Variables
Used for observations that have numbers as their values; the distance (or interval) between pairs of consecutive numbers is assumed to be equal; e.g., temperature, because the interval from one degree to the next is always the same; some interval variables can take only whole numbers; personality and attitude measures are often treated as interval variables
Ratio Variables
Variables that meet the criteria for interval variables but also have meaningful zero points; reaction time; time has a meaningful zero
Scale Variable
Variable that meets the criteria for an interval variable or a ratio variable
Level
Discrete value or condition that a variable can take on
Independent Variable
Has at least two levels that we either manipulate or observe to determine its effects on the dependent variable; e.g., in asking whether gender predicts one’s attitude about politics, gender is the independent variable, with two levels, male and female
Dependent Variable
Outcome variable that we hypothesise to be related to, or caused by, changes in the independent variable
Confounding Variable
Any variable that systematically varies with the independent variable so that we cannot logically determine which variable is at work; also called a confound; e.g., starting a diet drug AND an exercise programme at the same time
Reliable measure
One that is consistent; e.g., if your scale is reliable, your weight now will be the same as your weight an hour from now
Valid measure
One that measures what it was intended to measure; e.g., your scale is valid if its reading matches your weight as measured at the doctor’s office
Hypothesis Testing
process of drawing conclusions about whether a particular relation between variables is supported by evidence
Operational Definition
Specifies the operations or procedures used to measure or manipulate a variable
Correlation
An association between two or more variables
Random assignment
Every participant in the study has an equal chance of being assigned to any of the groups or experimental conditions in a study
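A minimal sketch of random assignment, assuming a hypothetical list of participant IDs and two condition names that are not from the flashcards. Shuffling first and then dealing participants out in turn gives each one an equal chance of landing in any group.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them round-robin into conditions."""
    rng = random.Random(seed)          # a seed makes the illustration repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)              # every ordering is equally likely
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

# Hypothetical example: 20 participant IDs, two conditions
groups = randomly_assign(range(1, 21), ["control", "experimental"], seed=42)
```

Dealing round-robin after the shuffle also keeps the group sizes equal, which is a design choice of this sketch rather than a requirement of random assignment.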
Experiment
A study in which participants are randomly assigned to a condition or level of one or more independent variables
Between-Groups Research Design
Participants experience one, and only one, level of the independent variable
Within-Groups Research Design
The different levels of the independent variable are experienced by all participants in the study; also called a repeated-measures design
Outlier
an extreme score that is either very high or very low in comparison with the rest of the scores in the sample
Outlier Analysis
Studies that examine observations that do not fit the overall pattern of the data, in an effort to understand the factors that influence the dependent variable
Raw Score
Data point that has not yet been transformed or analyzed
Frequency Distribution
Describes the pattern of a set of numbers by displaying a count or proportion for each possible value of a variable
Frequency Table
Visual description of data that shows how often each value occurred, that is, how many scores were at each value
Grouped Frequency Table
Visual depiction of data that reports the frequencies within a given interval rather than the frequencies for a specific value
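As a rough illustration, a frequency table and a grouped frequency table can be built with Python's standard library. The scores and the interval width of 5 are made up for the example, and the interval labels assume whole-number scores.

```python
from collections import Counter

# Hypothetical raw scores (not from the flashcards)
scores = [3, 5, 5, 7, 8, 8, 8, 12, 14, 15, 15, 19]

# Frequency table: how many scores occurred at each value
frequency_table = Counter(scores)

def grouped_frequencies(values, width):
    """Count scores per interval of the given width (whole-number scores assumed)."""
    counts = Counter((value // width) * width for value in values)
    # Label each interval by its lowest and highest whole-number score
    return {(low, low + width - 1): counts[low] for low in sorted(counts)}

grouped_table = grouped_frequencies(scores, width=5)
# grouped_table is {(0, 4): 1, (5, 9): 6, (10, 14): 2, (15, 19): 3}
```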
Normal Distribution
A specific frequency distribution that is a bell-shaped, symmetric, unimodal curve
Skewed distribution
Distributions in which one of the tails of the distribution is pulled away from the centre; lopsided, off-centre, or nonsymmetric
Positively skewed data
The distribution’s tail extends to the right, in a positive direction
Floor effect
Situation in which a constraint prevents a variable from taking values below a certain point
Negatively skewed data
Have a distribution with a tail that extends to the left, in a negative direction
Ceiling effect
situation in which a constraint prevents a variable from taking on values above a given number
Hint to tell whether the data is positively or negatively skewed
The tail tells the tale. Negative scores are to the left: when the long, thin tail of a distribution is to the left of the distribution’s centre, it is negatively skewed. When the long, thin tail is to the right of the centre, it is positively skewed.
Ways to present raw data
Frequency Tables, Grouped Frequency tables, Histograms, and Frequency Polygons
Ways to mislead with graphs
False Face Validity Lie, Biased Scale Lie, Sneaky Sample Lie, Interpolation Lie, Extrapolation Lie, Inaccurate Values Lie
Types of Graphs
Scatterplot, Line Graph, Time Series Plot, Bar Graph, Pictorial Graph, Pie Chart
Central tendency
Refers to the descriptive statistic that best represents the centre of a data set: the particular value that all the other data seem to be gathering around. It is what we mean when we refer to the typical score; can be measured through the mean, median, and mode
Mean
Arithmetic average of a group of scores
Statistic
A number based on a sample taken from a population
Parameter
number based on the whole population
Median
The middle score of all the scores in a sample when the scores are arranged in ascending order; if there is no single middle score, the median is the mean of the two middle scores
Mode
The most common score of all the scores in the sample; used (1) when one particular score dominates a distribution (2) when the distribution is bimodal or multimodal (3) when the data are nominal
Unimodal distribution
has one mode, or most common score
Bimodal distribution
has two modes, or most common scores
Multimodal distribution
has more than two modes, or most common scores
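The three measures of central tendency can be compared on a small, made-up data set using Python's `statistics` module. The scores below are hypothetical, and the extreme value of 40 is included to show how an outlier pulls the mean but not the median.

```python
import statistics

# Hypothetical scores with one extreme value (40)
scores = [2, 3, 3, 4, 5, 6, 40]

mean_score = statistics.mean(scores)      # 63 / 7 = 9
median_score = statistics.median(scores)  # middle of the sorted scores = 4
mode_score = statistics.mode(scores)      # most common score = 3

# multimode returns every most-common score, so it reveals bimodal data
bimodal_modes = statistics.multimode([1, 1, 2, 2, 3])  # [1, 2]
```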
Standard deviation
The square root of the average of the squared deviations from the mean; the typical amount that each score varies, or deviates, from the mean
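Written as a formula, the definition above corresponds to the population standard deviation. The symbols are assumptions chosen for illustration: X_i is an individual score, \mu is the mean, and N is the number of scores.

```latex
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}
```

When the standard deviation of a population is estimated from a sample, many texts divide by N - 1 instead of N.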
Measures of variability
Range
Variance
Standard Deviation
Interquartile Range
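The four measures listed above can be sketched with the standard library on a hypothetical set of scores. Quartile conventions differ between textbooks and software, so the interquartile range here reflects only the "inclusive" method of `statistics.quantiles`.

```python
import statistics

# Hypothetical scores; their mean is 6
scores = [4, 8, 6, 5, 3, 7, 9, 6]

score_range = max(scores) - min(scores)          # 9 - 3 = 6
variance = statistics.pvariance(scores)          # average squared deviation = 3.5
standard_deviation = statistics.pstdev(scores)   # square root of the variance

# Quartiles split the sorted scores into four parts; IQR = Q3 - Q1
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
interquartile_range = q3 - q1
```

The `p` prefix in `pvariance` and `pstdev` treats the scores as a whole population (dividing by N); `statistics.variance` and `statistics.stdev` divide by N - 1 instead.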
Independent measures t-test
Mann-Whitney Test
Repeated measures t-test
Wilcoxon Signed Rank
Independent measures ANOVA
Kruskal-Wallis Test
Repeated measures ANOVA
Friedman Test
Pearson r
Spearman rho
Random sample
One in which every member of the population has an equal chance of being selected into the study
Convenience Sample
One that uses participants who are readily available
Generalizability
Refers to researchers’ ability to apply findings from one sample or in one context to other samples or contexts; also known as external validity
Replication
refers to the duplication of scientific results, ideally in a different context or with a sample that has different characteristics
Volunteer sample
special kind of convenience sample in which participants actively choose to participate in a study; also called a self-selected sample
Control Group
A level of the independent variable that does not receive the treatment of interest in a study; designed to match an experimental group in all ways but the experimental manipulation itself
Experimental Group
Level of the independent variable that receives the treatment or intervention of interest in an experiment
Null hypothesis
a statement that postulates that there is no difference between populations or that the difference is in a direction opposite from that anticipated by the researcher
Research hypothesis
Statement that postulates that there is a difference between populations or sometimes, more specifically, that there is a difference in a certain direction, positive or negative; also called an alternative hypothesis
Making a Decision About Our Hypothesis
We decide to reject the null hypothesis (there is a difference)
We decide to fail to reject the null hypothesis (we cannot conclude that there is a difference)
Rules of Formal Hypothesis Testing
The null hypothesis is that there is no difference between groups; usually, our hypotheses explore the possibility of a mean difference
We either reject or fail to reject the null hypothesis. There are no other options.
We never use the word accept in reference to formal hypothesis testing
Type I Error
Occurs when we reject the null hypothesis but the null hypothesis is correct; false positive; rejecting the null hypothesis falsely; detrimental consequences because people often take action based on a mistaken finding
Type II Error
Occurs when we fail to reject the null hypothesis but the null hypothesis is false; false negative; results in a failure to take action because a research intervention is not supported or a given diagnosis is not received
Standardization
Converts individual scores to standard scores for which we know the percentiles if the data are normally distributed
z Score
The number of standard deviations a particular score is from the mean; can be computed if we know the mean and the standard deviation of a population
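A minimal sketch of the z score calculation and its reverse. The function names are made up for illustration, and the mean of 100 and standard deviation of 15 are hypothetical values.

```python
def z_score(raw_score, population_mean, population_sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw_score - population_mean) / population_sd

def raw_score_from_z(z, population_mean, population_sd):
    """Convert a z score back to a raw score."""
    return z * population_sd + population_mean

# Hypothetical population: mean 100, standard deviation 15
z = z_score(130, 100, 15)                   # (130 - 100) / 15 = 2.0
raw = raw_score_from_z(-1.0, 100, 15)       # -1 * 15 + 100 = 85.0
```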
z scores into percentiles
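Approximate percentages of scores in each one-standard-deviation band of the normal curve, from z = -3 to z = +3: 2-14-34-34-14-2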
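The 2-14-34-34-14-2 pattern can be checked against the standard normal curve with `statistics.NormalDist` (Python 3.8 or later). This sketch assumes the scores are normally distributed; `percentile_from_z` is a hypothetical helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def percentile_from_z(z):
    """Percentage of a normal distribution falling below the given z score."""
    return standard_normal.cdf(z) * 100

# Percentage of scores in each band between consecutive whole z scores
edges = [-3, -2, -1, 0, 1, 2, 3]
band_percentages = [
    round((standard_normal.cdf(high) - standard_normal.cdf(low)) * 100)
    for low, high in zip(edges, edges[1:])
]
# band_percentages is [2, 14, 34, 34, 14, 2]

percentile_for_z_of_1 = percentile_from_z(1.0)  # about 84 (50 + 34)
```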
Central Limit Theorem
Refers to how a distribution of sample means is a more normal distribution than a distribution of scores, even when the population distribution is not normal; with repeated sampling, the distribution of means approximates a normal curve even when the original population is not normally distributed; a distribution of means is less variable than a distribution of individual scores; a sample size of at least 30 is usually recommended
Distribution of means
Distribution composed of many means that are calculated from all possible samples of a given size, all taken from the same population
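The central limit theorem can be illustrated with a small simulation. This is a sketch under stated assumptions: the skewed population is generated artificially from an exponential distribution, and only 1,000 samples of size 30 are drawn rather than all possible samples.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Hypothetical, strongly skewed population of 10,000 scores
population = [rng.expovariate(1.0) for _ in range(10_000)]

def distribution_of_means(population, sample_size, n_samples, rng):
    """Draw repeated samples and record the mean of each one."""
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

sample_means = distribution_of_means(population, 30, 1_000, rng)

population_sd = statistics.pstdev(population)   # spread of individual scores
sd_of_means = statistics.pstdev(sample_means)   # spread of the sample means
```

The means cluster around the population mean and vary far less than the individual scores do, which is the behaviour the two cards above describe.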
Ways to describe the same scores within a normal distribution
Raw Scores
z Scores
Percentile Rankings
Assumptions
The characteristics that we ideally require the population from which we are sampling to have so that we can make accurate inferences
Parametric Tests
Inferential statistical analyses based on a set of assumptions about the population
Nonparametric Tests
Inferential statistical analyses that are not based on a set of assumptions about the population
Assumptions for Conducting Analyses
The dependent variable is assessed using a scale measure; that is, there is an equal distance between the numbers. If the variable is nominal or ordinal, this assumption is not met.
Assume that the participants are randomly selected.
Distribution of the population of interest must be approximately normal.
Steps of Hypothesis Testing
Identify the populations, comparison distribution, and assumptions.
State the null and research hypotheses
Determine the characteristics of the comparison distribution
Determine critical values or cutoffs
Calculate the test statistic
Decide whether to reject or fail to reject the null hypothesis
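The later steps can be sketched for one specific case. This assumes a two-tailed z test of a single sample mean against a population with a known mean and standard deviation; the flashcards do not name a particular test, and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, population_mean, population_sd, sample_size, alpha=0.05):
    """Two-tailed z test of a sample mean against a known population."""
    # Step 3: the comparison distribution is a distribution of means,
    # so its spread is the standard error rather than the population SD
    standard_error = population_sd / sqrt(sample_size)
    # Step 4: critical value (cutoff) for a two-tailed test at level alpha
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 5: test statistic
    z = (sample_mean - population_mean) / standard_error
    # Step 6: decision; note we never "accept" the null hypothesis
    if abs(z) > critical_value:
        decision = "reject the null hypothesis"
    else:
        decision = "fail to reject the null hypothesis"
    return z, critical_value, decision

# Hypothetical data: sample of 36 with mean 105; population mean 100, SD 15
z, critical_value, decision = z_test(105, 100, 15, 36)
# z = 5 / 2.5 = 2.0, which is beyond the cutoff of about 1.96
```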
Statistically significant finding
A finding is statistically significant if the data differ from what we would expect by chance if there were, in fact, no actual difference; statistical significance does not necessarily mean the finding is important or meaningful
Robust hypothesis test
one that produces fairly accurate results even when the data suggest that the population might not meet some of the assumptions
Critical value
Test statistic value beyond which we reject the null hypothesis, also known as a cutoff
Critical region
refers to the area in the tails of the comparison distribution in which we reject the null hypothesis if our test statistic falls there.