Introduction and Descriptive Statistics Flashcards

1
Q

What is Descriptive Statistics?

A

A way of describing the data distribution derived from the sample
Used in tabulating, summarizing, and describing data

2
Q

What models are used to capture and simplify the sample data distribution?

A

Central tendency
Variance
Shape

3
Q

Can the models be used to describe the population from which the data were sampled?

A

Yes, if the sample is representative of the population and is sufficiently large

4
Q

What are statistical models?

A

They are simplifications of reality
They are imperfect, but can be valuable for understanding and prediction

5
Q

What are the goals of statistics?

A

Summary of salient characteristics (description) - central tendency (expected value), variability (variance), shape of distribution (skew)
Estimation - infer an unknown parameter of a population using sample data via a probability function
Hypothesis testing - differences among groups (comparative) and relationships among variables (associative)

6
Q

Are population parameters constants at a fixed point in time?

A

Yes
Statistics are estimates that vary from one sample to the next, even when the samples come from the same population
Observations from the sample, as well as the summary statistics generated from the sample, are assumed to be random variates

7
Q

What are random variates?

A

Different outcomes generated by the same random process (margin of error)

8
Q

What are inferential statistics?

A

Used to estimate characteristics (parameters) of a population based on data measured in a (representative) sample from the population

9
Q

Is the standard deviation the square root of the variance?

A

Yes

10
Q

What does capital sigma (Σ) mean?

A

Summing a set of values

11
Q

What are the measures of central tendency?

A

Mode (most frequently observed)
Mean (sum of scores divided by number of scores)
Median (middle score when scores are in rank order, 50th percentile)
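
The three measures can be checked with a minimal sketch using Python's standard statistics module. The scores list is made-up example data, not taken from the deck.

```python
import statistics

# Hypothetical example scores (an assumption for illustration only)
scores = [2, 3, 3, 5, 7, 10]

mode = statistics.mode(scores)      # most frequently observed score
mean = statistics.mean(scores)      # sum of scores divided by number of scores
median = statistics.median(scores)  # middle score in rank order (average of the two middle scores here)

print("mode:", mode, "mean:", mean, "median:", median)
```

With an even number of scores, the median is taken as the average of the two middle values.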

12
Q

What are the measures of variability?

A

Range
Interquartile range (IQR)
Sum of squares (variance and standard deviation)
Coefficient of variation

13
Q

What is range?

A

Maximum - minimum scores
Very gross descriptor, but typically reported for comparative purposes

14
Q

What is interquartile range?

A

75th percentile (P75) - 25th percentile (P25)
Boundaries of the middle 50% of the distribution
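
A small sketch of the range and the interquartile range, assuming made-up scores. Note that software packages use several slightly different rules for computing percentiles, so P25 and P75 (and therefore the IQR) can differ a little from one tool to another; this example uses the default rule of statistics.quantiles.

```python
import statistics

# Hypothetical example scores (an assumption for illustration only)
scores = [1, 2, 4, 5, 7, 8, 9, 12, 15, 20]

# Range: maximum score minus minimum score
score_range = max(scores) - min(scores)

# Quartiles: n=4 returns the three cut points P25, P50, P75
p25, p50, p75 = statistics.quantiles(scores, n=4)

# IQR: width of the middle 50% of the distribution
iqr = p75 - p25

print("range:", score_range)
print("P25:", p25, "P75:", p75, "IQR:", iqr)
```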

15
Q

What do the variance and the standard deviation tell you?

A

The variability
Based on the mean

16
Q

What is the standard deviation?

A

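The square root of the variance (the average squared difference of scores from the mean)
Roughly, the typical distance of a score from the mean, in the original units of the data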
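
A worked sketch of how the sum of squares leads to the variance and the standard deviation, using made-up scores. It also computes the mean absolute deviation, a related but different quantity, to show that the two do not generally agree.

```python
import math
import statistics

# Hypothetical example scores (an assumption for illustration only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = statistics.mean(scores)

# Sum of squares: squared deviations from the mean, added up
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Population variance divides by n; sample variance divides by n - 1
pop_variance = sum_of_squares / n
sample_variance = sum_of_squares / (n - 1)

# Standard deviation is the square root of the variance
pop_sd = math.sqrt(pop_variance)
sample_sd = math.sqrt(sample_variance)

# Mean absolute deviation: average of |x - mean|, not the same as the SD
mean_abs_dev = sum(abs(x - mean) for x in scores) / n

print("population variance:", pop_variance, "population SD:", pop_sd)
print("sample variance:", sample_variance, "sample SD:", sample_sd)
print("mean absolute deviation:", mean_abs_dev)
```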

17
Q

What do narrow confidence intervals mean?

A

The narrower the interval, the more precisely we have estimated the population parameter
The width of the confidence interval is inversely related to the size of the sample: larger samples give narrower intervals
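
A minimal sketch of the link between sample size and interval width. It assumes a normal-based (z) interval for a mean with a known SD, which is a simplification; the function name ci_half_width and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def ci_half_width(sd, n, confidence=0.95):
    """Half-width of a z-based confidence interval for a mean.

    Assumes the SD is known and the sampling distribution of the mean
    is approximately normal (a simplifying assumption).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sd / math.sqrt(n)

# Same SD, growing sample size: the interval gets narrower
for n in (25, 100, 400):
    print(n, round(ci_half_width(sd=10, n=n), 2))
```

Under these assumptions the width shrinks with the square root of n, so quadrupling the sample size halves the interval.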

18
Q

What are the degrees of freedom?

A

The number of independent values that can be estimated in a statistical analysis
How many items can be randomly selected before a constraint must be put in place
If a data set has 10 values and its mean is fixed, 9 of the values are free to vary, but the 10th value is determined
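
A short sketch of the 10-value case, with made-up numbers: once the mean is fixed, the first 9 values can be anything, and the 10th is forced by the constraint that all values must add up to n times the mean.

```python
# Hypothetical data set of 10 values (an assumption for illustration only)
values = [4, 8, 15, 16, 23, 42, 7, 9, 11, 5]
n = len(values)
mean = sum(values) / n

# Treat the first n - 1 values as free to vary
free_values = values[:-1]

# The last value is determined by the fixed mean
determined_value = n * mean - sum(free_values)

degrees_of_freedom = n - 1

print("degrees of freedom:", degrees_of_freedom)
print("determined 10th value:", determined_value)
```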

19
Q

What is the coefficient of variation?

A

Unit-free measure of the precision of an estimate
Useful for comparing the degree of variation (precision) from one distribution to another, even if means are very different
Ratio of the SD to the mean, times 100
CV = (SD / x̄) × 100
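
A minimal sketch of the formula, using the sample SD and two made-up data sets measured in different units. The helper name coefficient_of_variation is hypothetical.

```python
import statistics

def coefficient_of_variation(data):
    """CV as a percentage: (sample SD / mean) * 100."""
    return statistics.stdev(data) / statistics.mean(data) * 100

# Hypothetical measurements (assumptions for illustration only)
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print("CV of heights (%):", round(cv_heights, 2))
print("CV of weights (%):", round(cv_weights, 2))
```

Both lists have the same SD, but the weights have a smaller mean, so their CV is larger. This is why the CV is useful when the means differ a lot.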

20
Q

Has the study with the smaller coefficient of variation estimated the population mean more precisely?

A

Yes
The data used to calculate the mean are less variable relative to the size of the mean

21
Q

What is a symmetric distribution?

A

Mean = median = mode (for a unimodal distribution)

22
Q

What is a positive skew?

A

Mode<median<mean
Mean is most sensitive to skew
Tail is to the right

23
Q

What is a negative skew?

A

Mean<median<mode
Tail to the left
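
The ordering of mode, median, and mean under skew can be seen in a small sketch with made-up data. The second list is a mirror image of the first, so its tail points the other way.

```python
import statistics

# Hypothetical data with a long right tail (an assumption for illustration only)
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

# Mirror the data to get a long left tail
left_skewed = [20 - x for x in right_skewed]

def summarize(data):
    """Return (mode, median, mean) for a list of scores."""
    return statistics.mode(data), statistics.median(data), statistics.mean(data)

pos_mode, pos_median, pos_mean = summarize(right_skewed)
neg_mode, neg_median, neg_mean = summarize(left_skewed)

print("positive skew:", pos_mode, "<", pos_median, "<", pos_mean)
print("negative skew:", neg_mean, "<", neg_median, "<", neg_mode)
```

The extreme values pull the mean furthest toward the tail, while the mode stays at the peak.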

24
Q

What is a normal distribution curve?

A

Also known as a Gaussian curve
Most important distribution in statistics
Many physical measures naturally result in normal distributions (height, weight, reaction times, etc.)
Many statistical methods assume normality, so problems arise if the distribution is not normal

25
Q

What characteristics do normal distributions have?

A

Unimodal
Symmetric
Can be described with 2 parameters (mu - population mean and sigma - population SD)
Have tails that approach the x axis asymptotically (they get ever closer to it but never touch it)

26
Q

Do all normal density curves satisfy the empirical rule?

A

Yes
About 68% of the observations fall within 1 standard deviation of the mean: between mu-1sigma and mu+1sigma
About 95% of the observations fall within 2 standard deviations of the mean: between mu-2sigma and mu+2sigma
About 99.7% of the observations fall within 3 standard deviations of the mean: between mu-3sigma and mu+3sigma
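
The three percentages can be checked with a short sketch using NormalDist from Python's standard library. The helper name share_within and the mu and sigma values in the second call are hypothetical.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution within k SDs of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # The result does not depend on which mu and sigma are used
    standard = share_within(k)
    other = share_within(k, mu=100, sigma=15)
    print(k, round(standard * 100, 1), round(other * 100, 1))
```

The rounded values are the familiar 68, 95, and 99.7 figures; the exact proportions are slightly different (for example, 2 SDs covers a little more than 95%).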

27
Q

In normal distributions, do almost all values lie within 3 SDs of the mean?

A

Yes, about 99.7% of values lie within 3 SDs of the mean (the empirical rule)

28
Q

What do z scores mean?

A

Observations expressed in terms of the number of standard deviation units from the mean

29
Q

What are mu and sigma for the standard normal distribution?

A

mu = 0
sigma = 1

30
Q

How do you compute the z score?

A

(xi - mu)/sigma
This gives the number of SDs from the mean; a further step (a z table or the normal cumulative distribution function) is needed to convert it to a percentage
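
A minimal sketch of both steps: computing the z score, then converting it to a percentile with the standard normal cumulative distribution function. The score, mu, and sigma are made-up values, and the conversion assumes the scores are normally distributed.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical score and population parameters (assumptions for illustration only)
z = z_score(130, mu=100, sigma=15)

# Second step: percentage of the distribution at or below this z score
percentile = NormalDist().cdf(z) * 100

print("z score:", z)
print("percentile:", round(percentile, 1))
```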

31
Q

What do plots allow us to do?

A

View characteristics of the data
Detect oddities in the data
Understand relationships among variables
Make predictions