Statistics Flashcards
Descriptive Statistics
Organizes, summarizes, and communicates a group of numerical observations
Inferential Statistics
Uses sample data to make general estimates about the larger population
Sample
Set of observations drawn from the population of interest
Population
includes all possible observations about which we’d like to know something
Variable
any observation of physical, attitudinal, or behavioural characteristic that can take on different values
Discrete observation
Can take on only specific values; no other values can exist between the numbers; e.g., the number of times one woke up early in a week
Continuous observation
can take on a full range of values (numbers out to several decimal places); an infinite number of potential values exists; e.g., a person might complete a task in 12.839 seconds
Nominal Variable
variable used for observations that have categories, or names, as their values; e.g., coding 1 for female and 2 for male
Ordinal Variable
A variable used for observations that have rankings as their values; e.g., in team sports, which team placed first, second, or third
Interval Variables
used for observations that have numbers as their values; the distance (or interval) between pairs of consecutive numbers is assumed to be equal; e.g., temperature, because the interval from one degree to the next is always the same; some interval variables, such as personality and attitude measures, can take on only whole numbers
Ratio Variables
Variables that meet the criteria for interval variables but also have meaningful zero points; reaction time; time has a meaningful zero
Scale Variable
Variable that meets the criteria for an interval variable or a ratio variable
Level
Discrete value or condition that a variable can take on
Independent Variable
has at least two levels that we either manipulate or observe to determine its effects on the dependent variable; e.g., does gender predict one’s attitude about politics? Here gender is the independent variable, with two levels: male and female
Dependent Variable
Outcome variable that we hypothesise to be related to, or caused by, changes in the independent variable
Confounding Variable
Any variable that systematically varies with the independent variable so that we cannot logically determine which variable is at work; also called a confound; e.g., someone who starts using a diet drug AND starts exercising, so we cannot tell which caused any weight loss
Reliable measure
One that is consistent; e.g., if your weight now is the same as your weight an hour from now, your scale is reliable
Valid measure
One that measures what it was intended to measure; e.g., your scale is valid if its reading matches your weight as measured at the doctor’s office
Hypothesis Testing
process of drawing conclusions about whether a particular relation between variables is supported by evidence
Operational Definition
Specifies the operations or procedures used to measure or manipulate a variable
Correlation
An association between two or more variables
Random assignment
Every participant in the study has an equal chance of being assigned to any of the groups or experimental conditions in a study
Experiment
A study in which participants are randomly assigned to a condition or level of one or more independent variables
Between-Groups Research Design
Participants experience one, and only one, level of the independent variable
Within-Groups Research Design
The different levels of the independent variable are experienced by all participants in the study, also called a Repeated measures design
Outlier
an extreme score that is either very high or very low in comparison with the rest of the scores in the sample
Outlier Analysis
Studies that examine observations that do not fit the overall pattern of the data, in an effort to understand the factors that influence the dependent variable
Raw Score
Data point that has not yet been transformed or analyzed
Frequency Distribution
Describes the pattern of a set of numbers by displaying a count or proportion for each possible value of a variable
Frequency Table
Visual description of data that shows how often each value occurred, that is, how many scores were at each value
Grouped Frequency Table
Visual depiction of data that reports the frequencies within a given interval rather than the frequencies for a specific value
Normal Distribution
A very specific frequency distribution that is a bell-shaped, symmetric, unimodal curve
Skewed distribution
Distributions in which one of the tails is pulled away from the centre; lopsided, off-centre, or nonsymmetric
Positively skewed data
The distribution’s tail extends to the right, in a positive direction
Floor effect
Situation in which a constraint prevents a variable from taking values below a certain point
Negatively skewed data
Have a distribution with a tail that extends to the left, in a negative direction
Ceiling effect
situation in which a constraint prevents a variable from taking on values above a given number
Hint to tell whether the data is positively or negatively skewed
The tail tells the tale; negative scores are to the left, when the long thin tail of a distribution is to the left of the distribution centre, it is negatively skewed. When the long thin tail of a distribution is to the right of the distribution centre, it is positively skewed.
Ways to present raw data
Frequency Tables, Grouped Frequency tables, Histograms, and Frequency Polygons
Ways to mislead with graphs
False Face Validity Lie, Biased Scale Lie, Sneaky Sample Lie, Interpolation Lie, Extrapolation Lie, Inaccurate Values Lie
Types of Graphs
Scatterplot, Line Graph, Time Series Plot, Bar Graph, Pictorial Graph, Pie Chart
Central tendency
Refers to the descriptive statistic that represents the centre of a data set, the particular value that all the other data seem to be gathering around; it’s what we mean when we refer to the typical score; can be measured through the mean, median, and mode
Mean
Arithmetic average of a group of scores
Statistic
A number based on a sample taken from a population
Parameter
number based on the whole population
Median
the middle score of all the scores in a sample when the scores are arranged in ascending order; if there is no single middle score, the median is the mean of the two middle scores
Mode
The most common score of all the scores in the sample; used (1) when one particular score dominates a distribution (2) when the distribution is bimodal or multimodal (3) when the data are nominal
Unimodal distribution
has one mode, or most common score
Bimodal distribution
has two modes, or most common scores
Multimodal distribution
has more than two modes, or most common scores
Standard deviation
The square root of the average of the squared deviation from the mean, the typical amount that each score varies, or deviates, from the mean
Measures of variability
Range
Variance
Standard Deviation
Interquartile Range
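To make the measures above concrete, here is a minimal sketch (not from the source) using Python’s standard-library statistics module; the scores are hypothetical example data.

    import statistics

    scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

    print(statistics.mean(scores))       # mean: 5.0
    print(statistics.median(scores))     # median: 4.5 (mean of two middle scores)
    print(statistics.mode(scores))       # mode: 4 (most common score)
    print(max(scores) - min(scores))     # range: 7
    print(statistics.pvariance(scores))  # population variance: 4.0
    print(statistics.pstdev(scores))     # population standard deviation: 2.0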
Parametric tests and their nonparametric equivalents
Independent-measures t test → Mann-Whitney U test
Repeated-measures t test → Wilcoxon signed-rank test
Independent-measures ANOVA → Kruskal-Wallis test
Repeated-measures ANOVA → Friedman test
Pearson r → Spearman rho
Random sample
One in which every member of the population has an equal chance of being selected into the study
Convenience Sample
One that uses participants who are readily available
Generalizability
Refers to researchers’ ability to apply findings from one sample or context to other samples or contexts; also known as external validity
Replication
refers to the duplication of scientific results, ideally in a different context or with a sample that has different characteristics
Volunteer sample
special kind of convenience sample in which participants actively choose to participate in a study; also called a self-selected sample
Control Group
A level of the independent variable that does not receive the treatment of interest in a study; designed to match an experimental group in all ways but the experimental manipulation itself
Experimental Group
Level of the independent variable that receives the treatment or intervention of interest in an experiment
Null hypothesis
a statement that postulates that there is no difference between populations or that the difference is in a direction opposite from that anticipated by the researcher
Research hypothesis
Statement that postulates that there is a difference between populations or sometimes, more specifically, that there is a difference in a certain direction, positive or negative; also called an alternative hypothesis
Making a Decision About Our Hypothesis
We decide to reject the null hypothesis (there is a difference)
We decide to fail to reject the null hypothesis (there is no difference)
Rules of Formal Hypothesis Testing
The null hypothesis is that there is no difference between groups and usually, our hypotheses explore the possibility of a mean difference
We either reject or fail to reject the null hypothesis. There are no other options.
We never use the word accept in reference to formal hypothesis testing
Type I Error
Occurs when we reject the null hypothesis but the null hypothesis is correct; false positive; rejecting the null hypothesis falsely; detrimental consequences because people often take action based on a mistaken finding
Type II Error
Occurs when we fail to reject the null hypothesis but the null hypothesis is false; false negative; results in a failure to take action because a research intervention is not supported or a given diagnosis is not received;
Standardization
Converts individual scores to standard scores for which we know the percentiles if the data were normally distributed
z Score
The number of standard deviations a particular score is from the mean; can be computed if we know the mean and the standard deviation of a population
z scores into percentiles
2-14-34-34-14-2: the approximate percentage of scores falling within each standard-deviation segment of the normal curve (2% below z = -2, 14% between z = -2 and z = -1, 34% between z = -1 and 0, and symmetrically above the mean)
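A minimal sketch of the z-score-to-percentile conversion, using the standard library’s statistics.NormalDist; the population mean, standard deviation, and raw score are hypothetical examples.

    from statistics import NormalDist

    mu, sigma = 100, 15             # hypothetical population mean and SD
    x = 115                         # hypothetical raw score
    z = (x - mu) / sigma            # z = (X - mu) / sigma = 1.0
    percentile = NormalDist().cdf(z) * 100
    print(z, round(percentile, 1))  # 1.0 84.1 (2 + 14 + 34 + 34 = 84)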
Central Limit Theorem
Refers to how a distribution of sample means is a more normal distribution than a distribution of scores, even when the population distribution is not normal; repeated sampling approximates a normal curve even when the original population is not normally distributed; a distribution of means is less variable than a distribution of individual scores; a minimum of 30 scores typically comprises each sample
Distribution of means
Distribution composed of many means that are calculated from all possible samples of a given size, all taken from the same population
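A minimal simulation sketch of the central limit theorem, using a hypothetical skewed population: means of repeated samples of 30 pile up in a narrower, more normal distribution than the individual scores.

    import random
    import statistics

    random.seed(1)
    population = [random.expovariate(1.0) for _ in range(10_000)]  # skewed

    # Distribution of means: the mean of each of many samples of size 30
    sample_means = [statistics.mean(random.choices(population, k=30))
                    for _ in range(1_000)]

    print(statistics.pstdev(population))    # spread of individual scores
    print(statistics.pstdev(sample_means))  # far smaller, about sigma / sqrt(30)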
Ways to describe the same scores within a normal distribution
Raw Scores
z Scores
Percentile Rankings
Assumptions
The characteristics that we ideally require the population from which we are sampling to have so that we can make accurate inferences
Parametric Tests
Inferential statistical analyses based on a set of assumptions about the population
Nonparametric Tests
Inferential statistical analyses that are not based on a set of assumptions about the population
Assumptions for Conducting Analyses
The dependent variable is assessed using a scale measure, with an equal distance between the numbers; if the variable is nominal or ordinal, don’t make this assumption.
Assume that the participants are randomly selected.
Distribution of the population of interest must be approximately normal.
Steps of Hypothesis Testing
Identify populations, comparison, distribution, and assumptions.
State the null and research hypothesis
Determine the characteristics of the comparison distribution
Determine critical values or cutoffs
Calculate the test statistic
Decide whether to reject or fail to reject the null hypothesis
Statistically significant finding
A finding is statistically significant if the data differ from what we would expect by chance if there were, in fact, no actual difference; statistical significance does not necessarily mean the finding is important or meaningful
Robust hypothesis test
one that produces fairly accurate results even when the data suggest that the population might not meet some of the assumptions
Critical value
Test statistic value beyond which we reject the null hypothesis, also known as a cutoff
Critical region
refers to the area in the tails of the comparison distribution in which we reject the null hypothesis if our test statistic falls there.
p level/alpha
The probability used to determine the critical values, or cutoffs, in hypothesis testing
Two-tailed test
Hypothesis test in which the research hypothesis does not indicate a direction of the mean difference or change in the dependent variable, but merely indicates that there will be a mean difference
Point estimate
Summary statistic from a sample that is just one number used as an estimate of the population parameter
Interval Estimate
Based on a sample statistic and provides a range of plausible values for the population parameter; used when reporting polls
Confidence Interval
Interval estimate, based on the sample statistic, that would include the population mean a certain percentage of the time if we sampled from the same population repeatedly; centred around the sample mean. The 95% confidence interval is most commonly used: the confidence level is 95% (leaving 2.5% in each tail), and the confidence interval is the range between the two values that surround the sample mean.
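A minimal sketch of a 95% confidence interval around a sample mean, using the t distribution via SciPy’s stats.t.ppf; the scores are hypothetical examples.

    import statistics
    from scipy import stats

    scores = [12, 15, 11, 14, 13, 16]            # hypothetical sample
    n = len(scores)
    mean = statistics.mean(scores)
    se = statistics.stdev(scores) / n ** 0.5     # estimated standard error
    t_crit = stats.t.ppf(0.975, df=n - 1)        # cutoff leaving 2.5% per tail
    print(mean - t_crit * se, mean + t_crit * se)  # lower and upper bounds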
Note on sample size and statistic
As sample size increases, there is a corresponding increase in the test statistic during hypothesis testing; a larger sample size should influence our level of confidence in the finding, but it shouldn’t increase our confidence that the finding is important
Effect Size
Indicates the size of a difference and is unaffected by sample size; tells us how much two populations DO NOT overlap; the less overlap, the bigger the effect size
How to decrease amount of overlap between two distributions
If means are farther apart
If the variation within each population is smaller
Effect Size and Standard Deviation
When two population distributions decrease their spread, the overlap of the distributions is smaller and the effect size is bigger
Cohen’s d
Developed by Jacob Cohen; a measure of effect size that assesses the difference between two means in terms of standard deviation, not standard error; similar to a z statistic
Cohen’s Conventions for Effect Sizes
Small: d = 0.2 (85% overlap)
Medium: d = 0.5 (67% overlap)
Large: d = 0.8 (53% overlap)
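A minimal sketch of Cohen’s d for a single-sample comparison, d = (M - mu) / sigma; all of the numbers are hypothetical examples.

    mu, sigma = 50, 10   # hypothetical population mean and standard deviation
    sample_mean = 55     # hypothetical sample mean
    d = (sample_mean - mu) / sigma
    print(d)             # 0.5, a medium effect by Cohen's conventions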
Statistical Power
Measure of our ability to reject the null hypothesis given that the null hypothesis is false; the probability that we will reject the null hypothesis when we should reject it; the probability that we will not make a Type II error. The generally accepted minimum is .80
Ways to increase power of a statistical Test
- Increase the alpha, e.g., take the p level of 0.05 and increase it to 0.10; side effect: this increases the probability of a Type I error from 5% to 10%
- Turn a two-tailed hypothesis into a one-tailed hypothesis
- Increase N. Increasing sample size leads to an increase in the test statistic, making it easier to reject the null hypothesis because a larger test statistic is more likely to fall beyond the cutoff (see the sketch after this list)
- Exaggerate the levels of the independent variable. Example is to add to the length of group therapy if the study is on the effectiveness of group therapy for social phobia
- Decrease the standard deviation (use reliable measure from the beginning of the study and sampling from a more homogeneous group in which participants’ responses are more likely to be more similar to begin with)
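A minimal sketch of how power grows with N, using a one-tailed one-sample z test; the effect size and alpha are hypothetical choices, not values from the source.

    from statistics import NormalDist

    d, alpha = 0.5, 0.05                      # hypothetical effect size and alpha
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed cutoff, about 1.64
    for n in (10, 20, 50):
        # Power: probability the test statistic falls beyond the cutoff
        # when the true effect is d (the statistic's mean is d * sqrt(n)).
        power = 1 - NormalDist().cdf(z_crit - d * n ** 0.5)
        print(n, round(power, 2))             # power rises as N increases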
Meta-analysis
Study that involves the calculation of a mean effect size from the individual effect sizes of many studies
Ways of analysing data
Hypothesis Testing
Confidence Intervals
Effect Size
Power Analysis
t Distributions
Help us specify precisely how confident we can be in our research findings; The t test, based on t distributions, tells us how confident we can be that our sample differs from the larger population; used instead of a z distribution when sampling requires us to estimate the population standard deviation from the sample standard deviation
t Statistic
Indicates the distance of a sample mean from a population mean in terms of the standard error
Single-Sample t-test
Hypothesis test in which we compare data from one sample to a population for which we know the mean but not the standard deviation
Degrees of freedom
The number of scores that are free to vary when estimating a population parameter from a sample
Reporting a t statistic in APA Format
- Write the symbol for the test statistic
- Write the degrees of freedom, in parentheses
- Write an equal sign and then the value of the test statistic, typically to two decimal places
- Write a comma and then indicate the p value by writing “p=” and then the actual value
t(4) = 2.87, p < 0.05
It appears that counselling centre clients who sign a contract to attend at least 10 sessions do attend more sessions, on average, than do clients who do not sign such a contract, t(4) = 2.87, p < 0.05
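A minimal sketch of running a single-sample t test with SciPy’s ttest_1samp and reporting it in the APA style shown above; the session counts and population mean are hypothetical examples, not the data behind t(4) = 2.87.

    from scipy import stats

    sessions = [6, 9, 7, 10, 8]   # hypothetical sample, n = 5, so df = 4
    t_stat, p_value = stats.ttest_1samp(sessions, popmean=4.6)
    print(f"t({len(sessions) - 1}) = {t_stat:.2f}, p = {p_value:.3f}")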
Dot plot
Graph that displays all the data points in a sample with the range of scores along the x-axis and a dot for each data point above the appropriate value
t Test
Types of t tests:
- Single-sample t test: when we compare a sample mean to a population mean but do not know the population standard deviation
- Paired-samples t test (dependent-samples t test): when we compare two samples and every participant is in both samples; a within-groups design; before-and-after comparisons
- Independent-samples t test: when we compare two samples and every participant is in only one sample; a between-groups design
Assumptions for a paired samples t test
- The dependent variable is scale
- The participants were randomly selected
- The population is normally distributed
Order Effects/Practice Effects
Refer to how a participant’s behaviour changes when the dependent variable is presented for a second time
Counterbalancing
minimises the practice effect by varying the order of presentation of different levels of the independent variable from one participant to the next
Example of a Paired Samples t Test
Salaries for the same position in two different cities; scores of 30 students on two different exams; scores on tests before and after interventions
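A minimal sketch of a paired-samples t test with SciPy’s ttest_rel (before/after scores for the same participants); all scores are hypothetical examples.

    from scipy import stats

    before = [10, 12, 9, 14, 11]  # hypothetical pre-intervention scores
    after = [12, 14, 10, 17, 12]  # same participants, post-intervention
    t_stat, p_value = stats.ttest_rel(after, before)
    print(f"t({len(before) - 1}) = {t_stat:.2f}, p = {p_value:.3f}")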
Independent samples t-test
used to compare two means for a between-groups design, a situation in which each participant is assigned to only one condition; based on the difference between means
Three Assumptions for Independent Samples t-test
1) The dependent variable is a scale variable, e.g., a rating on a liking measure
2) We do not know whether the population is normally distributed, but there are at least 30 participants
3) Participants are randomly selected
Example of independent samples t-test
Group 1: Low trust in leader
Group 2: High trust in leader
Level of agreement with their supervisor from 1 (strongly disagree) to 7 (strongly agree)
Population 1: women exposed to humorous cartoons
Population 2: men exposed to humorous cartoons
Dependent variable: percentage of cartoons characterised as funny (scale)
Ho: u1 = u2
Are women really more talkative than men?
How long, in minutes, do male and female students spend getting ready for a date?
Can women experience “mother hearing,” an increased sensitivity to and awareness of noises, in particular, those of children? Mothers and non mothers.
Taylor and Ste-Marie studied eating disorders in 41 Canadian female figure skaters. They compared the figure skaters’ data on the Eating Disorder Inventory to the means of known populations, including women with eating disorders. On average, the figure skaters were more similar to the population of women with eating disorders than to the population of women without eating disorders
Single sample t test because we have one sample of figure skaters and are comparing that sample to a population (women with eating disorders) for which we know the mean
In an article titled “A Fair and Balanced Look at the News: What Affects Memory for controversial Arguments,” Wiley found that people with a high level of previous knowledge about a given controversial topic (abortion, military intervention) had better average recall for arguments on both sides of that issue than did those with lower levels of knowledge
Independent Samples t test because we have two samples, and no participant can be in both samples. One cannot have both high level and low level of knowledge about a topic
Engle-Friedman and colleagues studied the effects of sleep deprivation. Fifty students were assigned to one night of sleep loss (students were required to call the laboratory every half-hour all night) and then one night of no sleep loss (normal sleep). The next day, the students were offered a choice of math problems with differing levels of difficulty. Following sleep loss, students tended to choose less challenging problems
We would use a paired-samples t test because we have two samples, but every student is assigned to both samples - one night of sleep loss and one night of no sleep loss
Anova Example
To compare three groups, we could run three experiments (Group 1 vs Group 2, Group 1 vs Group 3, Group 2 vs Group 3), but putting all three groups in a single experiment is far more efficient. In one example, average scores on the final exam for Group 1 (control group) and Group 3 (take-responsibility group) were the same, while average scores for Group 2 (self-esteem group) sank to 37%.
Using T tests to compare three groups
Leads to more chances of committing a Type I error: (0.95)(0.95)(0.95) = 0.857, which gives us almost a 15% chance (1 - 0.857 = 0.143) of having at least one Type I error if we run 3 analyses.
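The same arithmetic as a one-line check (three tests, each with a .95 chance of avoiding a Type I error):

    alpha = 0.05
    p_all_correct = (1 - alpha) ** 3   # 0.857375
    print(1 - p_all_correct)           # ~0.143: almost a 15% familywise risk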
F distributions
Allow us to conduct a single hypothesis test with multiple groups; more complex variations of the z distributions and the t distributions
Anova
Analysis of Variance; a hypothesis test typically with one or more nominal independent variables with at least three groups overall and a scale dependent variable
F statistic
Ratio of two measures of variance: (1) between groups variance, which indicates differences among sample means, and (2) within-groups variance, which is essentially an average of the sample variances; a way of measuring whether three or more groups vary from one another; an expansion of the z statistic and t statistic
Between Groups Variance
An estimate of the population variance based on the differences among the sample means
Within Groups Variance
An estimate of the population variance based on the differences within each of the three (or more) sample distributions
When to use z, t, and F statistic
z: one sample, and the population mean and standard deviation are known
t: one sample when only the population mean is known, or two samples
F: three or more samples
One-Way Anova
A hypothesis test that includes one nominal independent variable with more than two levels and a scale dependent variable
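A minimal sketch of a one-way between-groups ANOVA with SciPy’s f_oneway; the three groups’ scores are hypothetical examples.

    from scipy import stats

    group1 = [3, 4, 5, 4]   # hypothetical scores, one list per condition
    group2 = [6, 7, 6, 8]
    group3 = [5, 5, 6, 4]
    f_stat, p_value = stats.f_oneway(group1, group2, group3)
    print(f"F = {f_stat:.2f}, p = {p_value:.3f}")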
Within-Groups Anova
A hypothesis test in which there are more than two samples, and each sample is composed of the same participants; also called a repeated-measures ANOVA
Between-Groups Anova
A hypothesis test in which there are more than two samples, and each sample is composed of different participants
Assumptions for Anova
Samples are randomly selected.
Population distribution is normal.
All samples come from populations with the same variances
External Validity
Ability to generalize beyond the sample
Homoscedasticity
Homoscedastic populations are those that have the same variance
Heteroscedastic Populations
Those that have different variances
Null Hypothesis for Anova
Ho: u1 = u2 = u3 = u4
Source Table
Presents the important calculations and final results of an Anova in a consistent and easy-to-read format
Grand Mean
The mean of every score in a study, regardless of which sample the score came from
R2
Proportion of variance in the dependent variable that is accounted for by the independent variable
Planned comparison
A test that is conducted when there are multiple groups of scores, but specific comparisons have been specified prior to data collection
Post hoc Test
Statistical procedure frequently carried out after we reject the null hypothesis in an analysis of variance; it allows us to make multiple comparisons among several means; often referred to as a follow-up test
A priori comparisons
Guided by an existing theory or a previous finding
Choices a researcher can make
Conducting one or more independent samples t tests with a p level of 0.05
Conducting one or more independent-samples t tests using a more conservative p level as determined by a Bonferroni Test
Tukey HSD Test
Post-hoc test that determines the differences between means in terms of standard error, comparable to a critical value; sometimes referred to as the q test;
Involves (1) calculation of differences between each pair of means (2) division of each difference by the standard error
Within Groups Degrees of Freedom for a one-way between-groups ANOVA
df within = df1 + df2 + df3 + df4
Sum the degrees of freedom across groups; each group’s degrees of freedom is the number of people in that sample minus 1
One way within-groups ANOVA
When there’s just one nominal or ordinal independent variable (type of beer), the independent variable has more than two levels (cheap, mid-range, and high-end), the dependent variable is scale (ratings of beers), and every participant is in every group (each participant tastes the beers in every category)
How to reduce error in a within groups design
Each group includes exactly the same participants, groups are identical on all the relevant variables; same taste preferences, amount of alcohol typically consumed, tendency to be critical or lenient when rating, and so on
Steps of Hypothesis Testing for within-groups ANOVA
Identify the populations, distribution, assumptions
State the null and research hypotheses
Determine the characteristics of the comparison distribution
(F distribution; degrees of freedom: df within = (df between)(df subjects); df total = df between + df subjects + df within)
Determine critical values or cutoffs (F statistic for a p level of 0.05)
Calculate the test statistic
Make a decision
As social scientists
We should critically examine the research design and, regardless of its merits, call for a replication
Problems to watch for when using matched groups
We may not be aware of all of the important variables of interest
If one of the people in a matched pair decides not to complete the study, then we must discard the data for that person’s match
Statistical Interaction
Occurs in a factorial design when two or more independent variables have an effect in combination that we do not see when we examine each independent variable on its own
Two-way ANOVA
Hypothesis test that includes two nominal independent variables, regardless of their numbers of levels, and a scale dependent variable
Factorial ANOVA
A statistical analysis used with one scale dependent variable and at least two nominal independent variables (factors); also called a multifactorial ANOVA
Factor
Term used to describe an independent variable in a study with more than one independent variable
How to name an ANOVA
First, state the number of independent variables (one-way, two-way, three-way); second, state whether participants are in one sample or all samples (between-groups, within-groups, mixed-design); the name always ends with “ANOVA”. Examples: one-way between-groups ANOVA, two-way within-groups ANOVA, three-way mixed-design ANOVA.
Example of ANOVA
Examine (1) the effect of Lipitor versus other medication (2) the effect of grapefruit juice versus other beverages (3) ways in which a drug and a juice might combine to create some entirely new and unexpected effect
         | Lipitor | Zocor | Placebo
GF Juice | L & G   | Z & G | P & G
Water    | L & W   | Z & W | P & W
Main effect
Occurs in a factorial design when one of the independent variables has an influence on the dependent variable
Quantitative interaction
An interaction in which one independent variable exhibits a strengthening or weakening of its effect at one or more levels of the other independent variable, but the direction of the initial effect does not change
Qualitative interaction
Particular type of quantitative interaction of two (or more) independent variables in which one independent variable reverses its effect depending on the level of the other independent variable
Marginal Mean
The mean of a row or a column in a table that shows the cells of a study with a two-way ANOVA design
Six Steps of a Two-Way ANOVA
Identify the populations, distribution, assumptions
State the null and research hypothesis
Determine characteristics of the comparison distribution
Determine the critical values, or cutoffs
Calculate the test statistic
Make a decision
Mixed Design ANOVA
Used to analyse data from a study with at least two independent variables; at least one variable must be between groups. Includes both a between-groups variable and within-groups variable
Multivariate Analysis of Variance (MANOVA)
Form of ANOVA in which there is more than one dependent variable; The word multivariate refers to the number of dependent variables, not the number of independent variables
Analysis of Covariance (ANCOVA)
Type of Anova in which a covariate is included so that statistical findings reflect effects after a scale variable has been statistically removed;
Covariate
scale variable that we suspect associates, or covaries, with the independent variable of interest; statistically subtracts the effect of a possible confounding variable
Multivariate Analysis of Covariance (MANCOVA)
An ANOVA with multiple dependent variables and the inclusion of a covariate
Example of Two-Way Between-Groups ANOVA
An online dating website allows users to post personal ads to meet others. Each person is asked to specify a range from the youngest acceptable age to the oldest acceptable age. Data were randomly selected from ads of 25-year-old people living in the New York City area. Scores represent the youngest acceptable ages listed by those in the sample.
25 y.o. women seeking men
25 y.o. women seeking women
25 y.o. men seeking women
25 y.o. men seeking men
Two independent variables: gender of the seeker (levels: male and female) and gender of the person being sought (levels: male and female); one dependent variable: youngest acceptable age of the person being sought
Correlation
Association or relation between two variables; gives us new ways to measure behaviour and to distinguish among the influences of overlapping variables
Correlation Coefficient
A statistic that quantifies a relation between two variables
Positive Correlation
An association between two variables such that participants with high scores on one variable tend to have high scores on the other variable as well, and those with low scores on one variable tend to have low scores on the other variable
Three main characteristics of the Correlation Coefficient
It can be either positive or negative
It always falls between -1.00 and 1.00
It is the strength (or magnitude) of the coefficient, not its sign, that indicates how large it is
Negative Correlation
An association between two variables in which participants with high scores on one variable tend to have low scores on the other variable
Guidelines on Size of Correlation and Correlation Coefficient
Small: 0.10
Medium: 0.30
Large: 0.50
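A minimal sketch of computing and testing a Pearson correlation with SciPy’s pearsonr; the paired scores are hypothetical examples.

    from scipy import stats

    hours_studied = [1, 2, 3, 4, 5]     # hypothetical variable X
    exam_scores = [55, 62, 71, 68, 80]  # hypothetical variable Y
    r, p_value = stats.pearsonr(hours_studied, exam_scores)
    print(f"r = {r:.2f}, p = {p_value:.3f}")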
Limitations of Correlation
Correlation is Not Causation
Restricted Range
Possible Causal Explanations for a Correlation
The first variable might cause the second variable
The second variable could cause the first variable
A third variable could cause both the first and the second variables
Effect of an extreme outlier on a correlation
A correlation can be dramatically altered by a restricted range or by an extreme outlier
Pearson Correlation Coefficient
Statistic that quantifies a linear relation between two scale variables; a single number is used to describe the direction and strength of the relation between two variables when their overall pattern indicates a straight-line relation
Hypothesis Testing with the Pearson Correlation Coefficient
- Identify the population, distribution, and assumptions
- State the null and research hypotheses (Ho: rho = 0; H1: rho ≠ 0)
- Determine the characteristics of the comparison distribution (df = N - 2)
- Determine the critical/cutoff values. Look up values in r table given the degrees of freedom and the p level
- Calculate the test Statistic
- Make a decision
Coefficient alpha
Estimate of a test measure’s reliability and is calculated by taking the average of all possible split-half correlations
Partial Correlation
Technique that quantifies the degree of association between two variables after statistically removing the association of a third variable with both of those two variables
Simple Linear Regression
A statistical tool that lets us predict a person’s score on the dependent variable from his or her score on one independent variable
Regression
A statistical technique that can provide specific quantitative predictions that more precisely explain relations among variables
Regression to the mean
The tendency of scores that are particularly high or low to drift toward the mean over time; occurs because extreme scores tend to become less extreme
Standardized Regression Coefficient
A standardised version of the slope in a regression equation, is the predicted change in the dependent variable in terms of standard deviations for an increase of 1 standard deviation in the independent variable
Regression Line
The line that best fits the points on the scatterplot; the regression line is the line that leads to the least amount of error in prediction
Standard Error of the Estimate
Statistic indicating the typical distance between a regression line and the actual data points; we are concerned with variability around the line of best fit rather than variability around the mean
Proportionate Reduction in Error/Coefficient of Determination
Statistic that quantifies how much more accurate predictions are when we use the regression line instead of the mean as a prediction tool
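A minimal sketch of simple linear regression with SciPy’s linregress; squaring r gives the proportionate reduction in error. The data are hypothetical examples.

    from scipy import stats

    x = [1, 2, 3, 4, 5]   # hypothetical independent variable
    y = [2, 4, 5, 4, 6]   # hypothetical dependent variable
    result = stats.linregress(x, y)
    print(f"Y-hat = {result.intercept:.2f} + {result.slope:.2f}X")
    print(f"r squared = {result.rvalue ** 2:.2f}")  # proportionate reduction in error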
Orthogonal variable
An independent variable that makes a separate and distinct contribution in the prediction of a dependent variable, as compared with another variable
Multiple regression
A statistical technique that includes two or more predictor variables in a prediction equation
Use of Statistical Techniques
A way of quantifying whether multiple pieces of evidence really are better than one
A way of quantifying precisely how much better each additional piece of evidence actually is
Stepwise Multiple Regression
A type of multiple regression in which computer software determines the order in which independent variables are included in the equation
Hierarchical multiple regression
type of multiple regression in which the researcher adds independent variables to the equation in an order determined by theory
Structural Equation Modeling (SEM)
A statistical technique that quantifies how well sample data “fit” a theoretical model that hypothesises a set of relations among multiple variables
Statistical (or Theoretical) Model
Hypothesised network of relations among multiple variables, often portrayed graphically
Path
A term that statisticians use to describe the connection between two variables in a statistical model
Path Analysis
A statistical method that examines a hypothesised model, usually by conducting a series of regression analyses that quantify the paths at each succeeding step in the model
Manifest Variables
The variables in a study that we can observe and that are measured
Latent Variables
The ideas we want to research but cannot directly measure
Chi Square Statistic
Allows us to test relations between variables when they are nominal
When to use nonparametric test
- When the dependent variable is nominal (whether or not a woman gets pregnant)
- When the dependent variable is ordinal
- When the sample size is small and we suspect that the underlying population of interest is skewed
Chi Square Test for Goodness-of-Fit
A nonparametric hypothesis test used with one nominal variable
Chi-Square Test for Independence
A nonparametric hypothesis test used with two nominal variables
Example of Chi Square
Researchers reported that the best soccer players in the world were more likely to have been born early in the year than later. 52 elite youth players in Germany were born in January, February, or March. Only 4 players were born in October, November, or December
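A minimal sketch of a chi-square goodness-of-fit test with SciPy’s chisquare, patterned on the birth-quarter example above; the two middle counts and the equal expected frequencies are hypothetical fill-ins.

    from scipy import stats

    observed = [52, 24, 16, 4]   # births per quarter; middle two are hypothetical
    expected = [24, 24, 24, 24]  # equal frequencies under the null hypothesis
    chi2, p_value = stats.chisquare(observed, f_exp=expected)
    print(f"chi-square = {chi2:.2f}, p = {p_value:.5f}")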
Steps to conduct chi-square test for goodness-of-fit for Hypothesis Testing
- Identify populations, distribution, and assumption
- State the null and research hypotheses
- Determine the characteristics of the comparison distribution (how many degrees of freedom)
- Determine the critical values or cutoffs (use the chi-square table, basis is degrees of freedom and p level)
- Calculate the test statistic
- Make a decision
Steps in hypothesis testing for chi-square test for independence
- Identify populations, distribution, and assumption
- State the null and research hypotheses
- Determine the characteristics of the comparison distribution
- Determine the critical values or cutoffs using the degrees of freedom and the p level
- Calculate the test statistic
- Make a decision
Example of chi-square test for independence
A 2 × 2 design with rows for the Clown and No Clown conditions and columns for Pregnant and Not Pregnant; the test compares the observed frequencies in each cell with the expected frequencies (computed from the row and column totals).
Relative risk/relative likelihood/relative chance
A measure created by taking a ratio of two conditional proportions
Adjusted Standardized Residual
The difference between the observed frequency and the expected frequency for a cell in a chi-square research design, divided by the standard error
Spearman Rank-Order Correlation Coefficient
A nonparametric statistic that quantifies the association between two ordinal variables; the coefficient can range from -1 to +1; can indicate a strong correlation but not causation
Wilcoxon Signed-Rank Test
Nonparametric hypothesis test used when there are two groups, a within-groups design, and an ordinal dependent variable
Hypothesis Testing for Wilcoxon Signed Rank Test
- Identify the assumptions (differences between pairs must be ranked, random selection, difference scores should come from a symmetric population distribution)
- State the null and research hypotheses (only in words, not symbols)
- Determine the characteristics of the comparison distribution (T statistic; decide the cutoff or critical value; one-tailed or two-tailed test; determine the sample size)
- Determine the critical values (check the table)
- Calculate the test statistic
- Make the decision
Mann-Whitney U Test
Nonparametric hypothesis test used when there are two groups, a between-groups design, and an ordinal dependent variable; U statistic
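A minimal sketch of a Mann-Whitney U test with SciPy’s mannwhitneyu (two groups, between-groups design, ordinal dependent variable); the scores are hypothetical examples.

    from scipy import stats

    group_a = [1, 2, 4, 5, 7]    # hypothetical ordinal scores (ranks)
    group_b = [3, 6, 8, 9, 10]
    u_stat, p_value = stats.mannwhitneyu(group_a, group_b)
    print(f"U = {u_stat:.1f}, p = {p_value:.3f}")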
Hypothesis Testing for Mann-Whitney U Test
- Identify the assumptions
- State the null and research hypotheses
- Determine the characteristics of the comparison distribution
- Determine the critical values, or cutoffs (We want the smaller of the test statistics to be equal or smaller than this critical value)
- Calculate the test statistics
- Make the decision
Kruskal-Wallis H Test
A nonparametric hypothesis test used when there are more than two groups, a between-groups design, and an ordinal dependent variable; H statistic
Hypothesis Testing for Kruskal-Wallis Test
- Identify the assumptions
- State the null and research hypotheses
- Determine the characteristics of the comparison distribution
- Determine the critical values, or cutoffs using a table, based on a chi square distribution with a p level of 0.05, and degrees of freedom
- Calculate the test statistic
- Make a decision
Bootstrapping
Statistical process in which the original sample data are used to represent the entire population, and we repeatedly take samples from the original sample data to form a confidence interval
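A minimal sketch of bootstrapping a 95% confidence interval for the mean with the standard library alone; the original sample is hypothetical.

    import random
    import statistics

    random.seed(0)
    sample = [4, 8, 6, 5, 3, 7, 9, 5]   # hypothetical original sample

    # Resample with replacement many times; each resample stands in for
    # a fresh sample from the population.
    boot_means = sorted(
        statistics.mean(random.choices(sample, k=len(sample)))
        for _ in range(10_000)
    )
    # The middle 95% of the bootstrapped means forms the confidence interval.
    print(boot_means[250], boot_means[9_749])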