1. Descriptive and inferential statistics Flashcards by Michal Pelikan

Classify these variables as NOMINAL or CONTINUOUS:

A) Age

B) Gender

C) Height

A) Age = Continuous

B) Gender = Nominal

C) Height = Continuous

Describe what a confounding variable is

A variable that affects the outcome being measured as well as, or instead of, the independent variable.

A confounding variable is an unforeseen, unaccounted-for variable that jeopardizes the reliability and validity of an experiment’s outcome.

If a test is valid, what does this mean?

The test measures what it claims to measure

If a test is reliable what does this mean?

The test will give consistent results.

The discrepancy between the numbers used to represent something that we are trying to measure and the actual value of what we are measuring is called:

Measurement error

What is the ‘fit’ of the model?

The ‘fit’ of the model is the degree to which a statistical model represents the data collected

What is variance?

The variance is the average squared error (squared deviation) between the mean and the observations made

A frequency distribution in which low scores are most frequent (i.e. bars on the graph are highest on the left hand side) is said to be:

Positively skewed

How can we compensate for practice effects?

Counterbalancing
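
As a rough illustration of counterbalancing, the sketch below alternates the order of two conditions across participants so that any practice effect is spread evenly over both conditions. The condition labels and participant IDs are hypothetical.

```python
from itertools import cycle, permutations

conditions = ["A", "B"]                   # hypothetical condition labels
orders = list(permutations(conditions))   # every possible order: (A, B) and (B, A)

participants = ["P1", "P2", "P3", "P4"]   # hypothetical participant IDs

# Cycle through the orders so each order is used equally often.
assignment = {p: order for p, order in zip(participants, cycle(orders))}

for participant, order in assignment.items():
    print(participant, "->", " then ".join(order))
```

With an even number of participants, half complete A first and half complete B first.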

How can we compensate for boredom effects?

Giving participants a break between tasks

Variation due to variables that have not been measured is known as:

Unsystematic variation

Unsystematic variation results from random factors that exist between the experimental conditions (such as natural differences in ability, the time of day, etc.)

What is the assumption of homogeneity of variance?

That the variance within each of the populations is equal.

Variation due to the experimenter doing something in one condition but not in the other condition is known as:

Systematic variation

What does residual variance tell us?

Residual variance indicates how well a fitted regression line matches the actual data. The smaller the residual variance, the more accurate the predictions are.

The purpose of a control condition is to

Allow inferences about cause

A properly constructed control condition provides you with a reference point to determine what change (if any) occurred when a variable was modified

What helps to control for participant characteristics (thus minimize unsystematic variation)?

Randomization
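
A minimal sketch of random assignment, assuming eight hypothetical participants and two equally sized groups. Shuffling before splitting means participant characteristics are, on average, spread evenly across the groups.

```python
import random

participants = [f"P{i}" for i in range(1, 9)]  # hypothetical participant IDs

rng = random.Random(42)   # fixed seed only so the example is reproducible
shuffled = participants[:]
rng.shuffle(shuffled)     # random order, independent of any participant trait

half = len(shuffled) // 2
control_group = shuffled[:half]
treatment_group = shuffled[half:]

print("Control:  ", control_group)
print("Treatment:", treatment_group)
```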

How are Z scores calculated?

By subtracting the mean from the score and dividing the answer by the standard deviation

SCORE - MEAN = X

X / STDEV = Z-SCORE
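
The two steps above can be sketched in code. The function name and the example values (a score of 130 on a scale with mean 100 and standard deviation 15) are hypothetical.

```python
def z_score(score, mean_value, std_dev):
    """Subtract the mean from the score, then divide by the standard deviation."""
    return (score - mean_value) / std_dev

# Hypothetical example values.
z = z_score(130, 100, 15)
print(z)  # 2.0, i.e. the score lies two standard deviations above the mean
```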

The standard deviation is the square root of the:

Variance
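
A short worked example of how the sum of squares, variance and standard deviation build on one another. The data are hypothetical, and the sketch assumes the sample formula (dividing by N - 1); dividing by N instead gives the population variance.

```python
import math

scores = [1, 2, 3, 4, 5]   # hypothetical data
n = len(scores)
mean_score = sum(scores) / n                                  # 3.0

# Sum of squared errors: total squared deviation from the mean.
sum_of_squares = sum((x - mean_score) ** 2 for x in scores)   # 4 + 1 + 0 + 1 + 4 = 10.0

# Sample variance: average squared error, using N - 1.
variance = sum_of_squares / (n - 1)                           # 2.5

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(sum_of_squares, variance, round(std_dev, 3))
```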

What is the coefficient of determination?

A measure of the amount of variability in one variable that is shared by the other

Calculated as:

correlation coefficient squared
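
As an illustration, the sketch below computes Pearson's r by hand for two hypothetical variables and then squares it. The data and variable names are made up for the example.

```python
import math

x = [1, 2, 3, 4, 5]   # hypothetical variable 1
y = [2, 4, 5, 4, 5]   # hypothetical variable 2

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Pearson's r: cross-product of deviations over the root of the two sums of squares.
cross_product = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
ss_x = sum((a - mean_x) ** 2 for a in x)
ss_y = sum((b - mean_y) ** 2 for b in y)
r = cross_product / math.sqrt(ss_x * ss_y)

r_squared = r ** 2    # coefficient of determination
print(round(r, 3), round(r_squared, 3))
```

Here r is roughly 0.77 and r squared is roughly 0.6, so about 60% of the variability in one variable is shared by the other.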

Complete the following sentence:

A large standard deviation (relative to the value of the mean itself)…

Indicates that the data points are distant from the mean

(i.e. the mean is a poor fit of the data).

The probability is p = 0.80 that a patient with a certain disease will be successfully treated with a new medical treatment. Suppose that the treatment is used on 40 patients. What is the “expected value” of the number of patients who are successfully treated?

32, because 80% of 40 patients is 32 (40 × 0.80 = 32)
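
The same answer can be checked in code. This sketch assumes the number of successes follows a binomial distribution (independent patients, same success probability), so the expected value is n × p; the second calculation sums k × P(k) over all outcomes as a cross-check.

```python
from math import comb

n = 40     # patients treated
p = 0.80   # probability of success for each patient

# Shortcut for a binomial variable: expected value = n * p.
expected_shortcut = n * p

# Cross-check from the definition: sum of k * P(exactly k successes).
expected_from_definition = sum(
    k * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)
)

print(expected_shortcut, round(expected_from_definition, 6))
```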

What is the Confusion of the inverse?

A logical fallacy in which a conditional probability is equated with its inverse.

That is, given two events A and B, the probability of A happening given that B has happened is assumed to be about the same as the probability of B given A, when there is actually no evidence for this assumption.

More formally, P(A|B) is assumed to be approximately equal to P(B|A).
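
A numerical sketch using Bayes' theorem shows how far apart P(A|B) and P(B|A) can be. All probabilities below are hypothetical: A is "has the disease" and B is "tests positive".

```python
# Hypothetical numbers, chosen only for illustration.
p_a = 0.01               # P(A): prevalence of the disease
p_b_given_a = 0.90       # P(B|A): positive test given disease
p_b_given_not_a = 0.05   # P(B|not A): positive test given no disease

# Total probability of a positive test.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
p_a_given_b = p_b_given_a * p_a / p_b

print(f"P(B|A) = {p_b_given_a:.2f}, P(A|B) = {p_a_given_b:.2f}")
```

With these numbers P(B|A) is 0.90 but P(A|B) is only about 0.15, so treating the two as interchangeable would be a serious error.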

The test statistics we use to assess a linear model are usually _______ based on the normal distribution.

Parametric tests

What are the assumptions of the general linear model?

Independence:

The errors in your model should not be related to each other

Additivity/Linearity:

If you have several predictors then their combined effect is best described by adding their effects together
The outcome variable is, in reality, linearly related to any predictors

Normality:

The core element of the Assumption of Normality asserts that the distribution of sample means (across independent samples) is normal.

(In technical terms, the Assumption of Normality claims that the sampling distribution of the mean is normal or that the distribution of means across samples is normal)

Homogeneity of variance:

When testing several groups of participants, samples should come from populations with the same variance

Finish the sentence: The further the values of skewness and kurtosis are from zero, the more likely...

...it is that the data are not normally distributed
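
A minimal sketch of how skewness and kurtosis can be computed from the moments of the data. These are the simple moment-based versions, with kurtosis expressed as excess kurtosis so that a normal distribution scores zero; statistical packages often apply small-sample corrections, so their values may differ slightly. The data sets are hypothetical.

```python
def central_moment(data, order):
    """Average of the deviations from the mean raised to the given power."""
    m = sum(data) / len(data)
    return sum((x - m) ** order for x in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    # Subtracting 3 makes the normal distribution score zero.
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 10]   # hypothetical data with a long right tail

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric))
```

The symmetric data give a skewness of zero, while the data with a long right tail give a positive value (positive skew).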

Parameters are numbers that summarize data for...

an entire population

Statistics are numbers that summarize data from...

a sample

What are the measures of central tendency?

- Mean
- Median
- Mode

What are the measures of spread or dispersion?

- Range
- Variance
- Standard Deviation

What does kurtosis tell us?

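How heavy the tails of a distribution are, and therefore how prone the data are to outliers.

Distributions:
- Leptokurtic = relatively heavy tails (more outliers than a normal distribution)
- Platykurtic = relatively light tails (fewer outliers than a normal distribution)
- Mesokurtic = same kurtosis as the normal distribution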

What is Gambler's fallacy?

The mistaken belief that, if something happens more frequently than normal during some period, it will happen less frequently in the future, or that, if something happens less frequently than normal during some period, it will happen more frequently in the future. The belief is mistaken when the events are independent of one another.
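
The fallacy can be checked by enumeration. The sketch below assumes a fair coin with independent flips, lists every possible sequence of four flips, and compares the chance of heads on the fourth flip overall with the chance of heads after three heads in a row.

```python
from itertools import product

# All 16 equally likely sequences of four fair coin flips.
flips = list(product("HT", repeat=4))

# Sequences that start with a streak of three heads.
after_streak = [seq for seq in flips if seq[:3] == ("H", "H", "H")]

p_heads_overall = sum(seq[3] == "H" for seq in flips) / len(flips)
p_heads_after_streak = sum(seq[3] == "H" for seq in after_streak) / len(after_streak)

print(p_heads_overall, p_heads_after_streak)  # 0.5 0.5
```

The streak makes no difference: the fourth flip is heads half the time either way.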

What is the Law of small numbers?

Exaggerated confidence in the validity of conclusions based on small samples, i.e. misperceiving a small sample to be indicative of the entire population.

What does the Sum of squared errors (SS) indicate?

The total dispersion, or total deviance of scores from the mean

How does an increasing number of participants affect the distribution of the sample?

- The sampling distribution of the mean becomes more normal
- The spread of that distribution (the standard error) decreases
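
A small simulation can illustrate the second point. It assumes a hypothetical, deliberately skewed population and compares the spread of sample means for small and large samples; the population, sample sizes and seed are arbitrary choices for the sketch.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed only so the example is reproducible

# Hypothetical skewed population (exponential distribution).
population = [random.expovariate(1.0) for _ in range(10_000)]

def sample_means(sample_size, n_samples=2_000):
    """Draw many samples of the given size and return the mean of each."""
    return [mean(random.sample(population, sample_size)) for _ in range(n_samples)]

spread_small = stdev(sample_means(5))    # spread of means with 5 participants
spread_large = stdev(sample_means(50))   # spread of means with 50 participants

print(round(spread_small, 3), round(spread_large, 3))
```

The means from the larger samples cluster much more tightly around the population mean.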

What do confidence intervals tell us?

The boundaries within which the population mean is likely to lie. For a 95% confidence interval, about 95% of intervals constructed this way from repeated samples would contain the population mean.

There is a tradeoff between degree of certainty and width of the CI:
- The more certain you want to be, the wider the interval needs to be.
- The goal is to have a high level of confidence paired with a narrow interval.
- One way to help achieve this is to have less variability in your sample (i.e. a smaller standard error of the mean).
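
The tradeoff between certainty and width can be seen in a quick calculation. This sketch uses the normal approximation (mean ± z × standard error); for small samples a t-based critical value would normally be used instead. The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, confidence=0.95):
    """Normal-approximation CI for the mean: mean +/- z * standard error."""
    standard_error = stdev(sample) / math.sqrt(len(sample))
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    centre = mean(sample)
    return centre - z * standard_error, centre + z * standard_error

sample = [98, 102, 101, 97, 105, 99, 100, 103, 96, 104]  # hypothetical scores

low95, high95 = confidence_interval(sample, 0.95)
low99, high99 = confidence_interval(sample, 0.99)

print(f"95% CI: {low95:.2f} to {high95:.2f}")
print(f"99% CI: {low99:.2f} to {high99:.2f}")
```

The 99% interval is wider than the 95% interval computed from the same data.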

Sum of squares, variance and standard deviation represent the same things. What do they represent?

- The 'fit' of the mean to the data
- The variability in the data
- How well the mean represents the observed data
- Error

What does standard error tell you?

It is the standard deviation of the sampling distribution of a statistic. For the mean, it indicates how accurate the mean of any given sample from a population is likely to be compared to the true population mean. When the standard error increases (i.e. the sample means are more spread out), it becomes more likely that any given sample mean is an inaccurate representation of the true population mean.
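
For the mean, the standard error is commonly estimated as the sample standard deviation divided by the square root of the sample size. A minimal sketch, with a hypothetical function name and example values:

```python
import math

def standard_error(std_dev, n):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return std_dev / math.sqrt(n)

# Hypothetical example: same standard deviation, two sample sizes.
se_small_sample = standard_error(10, 25)    # 10 / 5  = 2.0
se_large_sample = standard_error(10, 100)   # 10 / 10 = 1.0

print(se_small_sample, se_large_sample)
```

Quadrupling the sample size halves the standard error, so sample means from larger samples sit closer to the population mean.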

Which t-test has more power to find an effect given that everything else is equal? Repeated measures vs independent measures

The repeated measures t-test. When the same participants are used across conditions, the unsystematic variance (often called the error variance) is reduced dramatically, making it easier to detect any systematic variance.
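
To illustrate, the sketch below analyses the same hypothetical scores twice: once as a repeated measures design (t computed from the difference scores) and once as if the two sets of scores came from independent groups (pooled-variance t). The data are made up for the example.

```python
import math
from statistics import mean, stdev, variance

# Hypothetical scores for five participants in two conditions.
before = [10, 12, 9, 14, 11]
after = [12, 13, 11, 15, 14]
n = len(before)

# Repeated measures: work with each participant's difference score.
differences = [a - b for a, b in zip(after, before)]
t_repeated = mean(differences) / (stdev(differences) / math.sqrt(n))

# Independent measures: pooled variance across two groups of equal size.
pooled_variance = (variance(before) + variance(after)) / 2
t_independent = (mean(after) - mean(before)) / math.sqrt(pooled_variance * 2 / n)

print(round(t_repeated, 2), round(t_independent, 2))
```

The mean difference is identical in both analyses, but the repeated measures t (about 4.81) is much larger than the independent t (about 1.62) because individual differences between participants no longer inflate the error term.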