Modules 1-2 Flashcards

1
Q

What should we identify before gathering and analyzing data?

A

The question we wish to answer

2
Q

What type of graph is useful for examining a data set to reveal patterns and trends?

A

Histogram

3
Q

In a histogram, what does the x-axis represent?

A

Bins corresponding to ranges of data

4
Q

In a histogram, what does the y-axis indicate?

A

The frequency of observations falling into each bin
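
As a rough illustration of the two histogram cards above, the sketch below groups values into bins (the x-axis) and counts how many observations fall into each bin (the y-axis). The data values and the bin width of 5 are made up for the example.

```python
from collections import Counter

# Hypothetical observations, made up for illustration.
data = [2, 3, 5, 7, 8, 11, 12, 13, 17, 19]
bin_width = 5  # assumed bin width; a real analysis would choose this to suit the data

# Map each value to the lower edge of its bin, then count the values per bin.
counts = Counter((value // bin_width) * bin_width for value in data)

for lower in sorted(counts):
    upper = lower + bin_width
    # x-axis: the bin's range; y-axis: the frequency, drawn here as a bar of '#'.
    print(f"[{lower:>2}, {upper:>2}): {'#' * counts[lower]}")
```

Each printed row is one bin, and the length of its bar is the frequency for that bin.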

5
Q

What is an outlier?

A

A value that falls far from the rest of the data

6
Q

What should we do before deciding how to handle an outlier?

A

Carefully investigate it

7
Q

What can graphing two variables on a scatter plot reveal?

A

Relationships between two variables (two data sets)

8
Q

What is a key point regarding correlation and causation?

A

Correlation does not imply causation

9
Q

What should we be alert to when examining relationships between two data sets?

A

The possibility of hidden variables

10
Q

What are descriptive statistics also known as?

A

Summary statistics

11
Q

What three values describe the center of a data set?

A
  • Mean
  • Median
  • Mode
12
Q

How is the mean calculated?

A

Sum of all data points divided by the number of data points
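
A minimal sketch of this calculation, using made-up numbers; the standard-library `statistics.mean` is included only as a cross-check.

```python
from statistics import mean

# Hypothetical data set, made up for illustration.
data = [4, 8, 15, 16, 23, 42]

# Mean = sum of all data points divided by the number of data points.
manual_mean = sum(data) / len(data)

print(manual_mean)                # 108 / 6
print(manual_mean == mean(data))  # the library function agrees
```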

13
Q

What is the median?

A

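The middle value of the data set when the values are sorted in order; for an even number of values, the average of the two middle values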
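
One way to sketch finding the median is shown below. The function name `middle_value` and the sample lists are hypothetical; the sketch sorts the data first and, for an even number of values, averages the two middle ones, which is the usual convention.

```python
def middle_value(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)  # the median is defined on sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: there is a single middle value.
        return ordered[mid]
    # Even count: average the two values around the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_result = middle_value([7, 1, 5])      # sorted: 1, 5, 7
even_result = middle_value([7, 1, 5, 3])  # sorted: 1, 3, 5, 7
print(odd_result, even_result)
```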

14
Q

What does the mode represent in a data set?

A

The value that occurs most frequently

15
Q

Can a data set have multiple modes?

A

Yes
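
To make the last two cards concrete, here is a small sketch using Python's standard-library `statistics.multimode`, which returns every value tied for the highest frequency. The data sets are made up for illustration.

```python
from statistics import multimode

# Hypothetical data sets, made up for illustration.
one_mode = [1, 2, 2, 3, 4]
two_modes = [1, 2, 2, 3, 3, 4]

# multimode lists every most-frequent value, in order of first appearance.
single = multimode(one_mode)
double = multimode(two_modes)

print(single)  # one value occurs most often
print(double)  # two values tie for most frequent, so there are two modes
```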

16
Q

What measures the spread of the data?

A
  • Range
  • Variance
  • Standard deviation
17
Q

How is the standard deviation calculated?

A

The square root of the variance
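
The three measures of spread can be sketched as follows. The data are made up, and the sample versions of variance and standard deviation are assumed here (dividing by n - 1), to match the VAR.S and STDEV.S functions that appear later in this deck.

```python
import math
from statistics import stdev, variance

# Hypothetical data set, made up for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: the distance between the largest and smallest values.
data_range = max(data) - min(data)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_variance = variance(data)

# Standard deviation: the square root of the variance.
sample_std_dev = math.sqrt(sample_variance)

print(data_range, sample_variance, sample_std_dev)
print(math.isclose(sample_std_dev, stdev(data)))  # cross-check with the library
```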

18
Q

What is a conditional mean?

A

A conditional mean is the mean of a subset of the data that includes all values satisfying a certain condition.
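
A minimal sketch of a conditional mean: keep only the values that satisfy a condition, then average that subset. The data and the condition (values of at least 30) are assumptions chosen for the example.

```python
# Hypothetical data set, made up for illustration.
data = [10, 20, 30, 40, 50]

# Keep only the values that satisfy the (assumed) condition.
subset = [value for value in data if value >= 30]

# The conditional mean is the ordinary mean of that subset.
conditional_mean = sum(subset) / len(subset)

print(subset, conditional_mean)
```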

19
Q

What is a percentile?

A

A percentile is a value at or below which a certain percentage of observations fall. For example, 60% of the observations are less than or equal to the 60th percentile.

20
Q

What is the median in terms of percentiles?

A

The median is by definition the 50th percentile of a data set.

21
Q

What is the coefficient of variation?

A

The coefficient of variation measures the size of the standard deviation relative to the size of the mean: it is the standard deviation divided by the mean.
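
As an illustration, the sketch below computes the coefficient of variation as the standard deviation divided by the mean for two made-up data sets that have the same spread but different means. The helper name `coefficient_of_variation` is hypothetical, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (mean must be nonzero)."""
    return stdev(values) / mean(values)


# Hypothetical data sets: identical spread, very different means.
small_scale = [9, 10, 11]
large_scale = [99, 100, 101]

cv_small = coefficient_of_variation(small_scale)
cv_large = coefficient_of_variation(large_scale)

# Same standard deviation, but the spread is relatively larger for small_scale.
print(cv_small, cv_large)
```

Because the result is a ratio, it allows the variability of data sets on different scales to be compared.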

22
Q

What does the correlation coefficient measure?

A

The correlation coefficient quantifies the strength and direction of a linear relationship between two variables.

23
Q

What is the range of the correlation coefficient?

A

The value of the correlation coefficient ranges between -1 and +1.

24
Q

What does a correlation coefficient near zero indicate?

A

A correlation coefficient near zero indicates a weak or nonexistent linear relationship.
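
The three correlation cards above can be illustrated with a hand-written Pearson correlation. The function name `pearson_r` and the data are made up for this sketch. Note the last case: the variables are perfectly related by a curve, yet the coefficient is zero because the relationship is not linear.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(sq_dev_x * sq_dev_y)


# Hypothetical data, made up for illustration.
xs = [-2, -1, 0, 1, 2]
rising = [2 * x + 1 for x in xs]   # perfect increasing line
falling = [-3 * x for x in xs]     # perfect decreasing line
curved = [x ** 2 for x in xs]      # perfect, but nonlinear, relationship

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
r_curved = pearson_r(xs, curved)

print(r_rising, r_falling, r_curved)
```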

25
Q

What is a time series?

A

A time series tracks how a variable changes over time: when one of the variables in a relationship is time, the relationship is known as a time series.

26
Q

What is cross-sectional data?

A

Cross-sectional data provide a snapshot of data across multiple groups at a given point in time.

27
Q

What should you recall about Excel functions and analyses?

A

Familiarize yourself with all of the necessary steps, syntax, and arguments for the Excel functions covered in this course.

28
Q

What does the AVERAGEIF function do?

A

The AVERAGEIF function returns the conditional mean, or average of the cells in a specified range that meet the given criteria.

29
Q

What is criteria in the context of data ranges?

A

Criteria is the condition applied to the range to select which cells are included in the calculation.

30
Q

What does [average range] refer to?

A

[average range] is the range of cells containing the data we wish to average.
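
To show how the three AVERAGEIF arguments fit together, here is a rough Python analogue. It is not Excel's implementation: the function `average_if` is hypothetical, and the criteria are passed as a Python function rather than an Excel criteria string. The region and sales data are made up.

```python
def average_if(criteria_range, criteria, average_range=None):
    """Average the entries of average_range whose matching entry in
    criteria_range satisfies criteria. If average_range is omitted,
    the values in criteria_range itself are averaged."""
    if average_range is None:
        average_range = criteria_range
    selected = [
        value
        for test_value, value in zip(criteria_range, average_range)
        if criteria(test_value)
    ]
    if not selected:
        raise ValueError("no cells meet the criteria")
    return sum(selected) / len(selected)


# Hypothetical data, made up for illustration.
regions = ["East", "West", "East", "West"]
sales = [100, 200, 300, 400]

# Condition tested on one range, average taken over another.
east_average = average_if(regions, lambda region: region == "East", sales)

# Condition tested on the same range that is averaged.
large_sale_average = average_if(sales, lambda amount: amount > 150)

print(east_average, large_sale_average)
```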

31
Q

What does the function PERCENTILE.INC(array, k) do?

A

Returns the k-th percentile of the values in the specified array, where k is a number between 0 and 1, inclusive.

For example, if we want to know the 95th percentile for an array of data, k would be 0.95.
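
The sketch below shows one way an inclusive percentile can be computed: by linear interpolation between sorted values. It is assumed, not confirmed by this deck, to mirror the behavior of PERCENTILE.INC; the function name `percentile_inc` and the data are hypothetical.

```python
def percentile_inc(values, k):
    """k-th percentile (0 <= k <= 1) by linear interpolation between
    neighboring sorted values. Assumed to approximate PERCENTILE.INC."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= k <= 1:
        raise ValueError("k must be between 0 and 1, inclusive")
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted list.
    position = k * (len(ordered) - 1)
    lower = int(position)
    fraction = position - lower
    if lower + 1 >= len(ordered):
        return ordered[-1]  # k == 1: the maximum value
    # Interpolate between the two neighboring sorted values.
    return ordered[lower] + fraction * (ordered[lower + 1] - ordered[lower])


# Hypothetical data, made up for illustration.
data = [1, 2, 3, 4, 5]

median_value = percentile_inc(data, 0.5)   # the 50th percentile is the median
p95 = percentile_inc(data, 0.95)           # k = 0.95 for the 95th percentile

print(median_value, p95)
```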

32
Q

What is the syntax for calculating variance in a sample?

A

=VAR.S(number 1, [number 2], …)

33
Q

What is the syntax for calculating standard deviation in a sample?

A

=STDEV.S(number 1, [number 2], …)

34
Q

What does the function SQRT(number) calculate?

A

Calculates the square root of a number.

35
Q

What is the syntax for counting values?

A

=COUNT(value 1, [value 2], …)

36
Q

What does the function MIN(number 1, [number 2], …) return?

A

Returns the minimum value from the specified numbers.

37
Q

What does the function MAX(number 1, [number 2], …) return?

A

Returns the maximum value from the specified numbers.

38
Q

What is the syntax for summing values?

A

=SUM(number 1, [number 2], …)

39
Q

What does the function CORREL(array 1, array 2) calculate?

A

Calculates the correlation coefficient between two arrays.