Chapter 12: Data-based and Statistical Reasoning Flashcards

1
Q

Mean or average

A

The mean is calculated by adding up all the individual values within the data set and then dividing the result by the number of values. The mean may be a parameter or a statistic depending on whether we are discussing a population or a sample. Having an outlier (an extremely large or extremely small value compared to the other data values) can shift the mean toward one end of the range.
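As a small illustration of how a single outlier pulls the mean, here is a minimal sketch. The data values and variable names are made up for this example.

```python
from statistics import mean

values = [2, 3, 4, 5, 6]        # hypothetical data set
with_outlier = values + [100]   # same data plus one extreme value

mean_plain = mean(values)              # sum of values / number of values
mean_with_outlier = mean(with_outlier)

print(mean_plain, mean_with_outlier)
```

The one added value moves the mean well above every other point in the set.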

2
Q

Median.

A

The median of a data set is its midpoint, where half of the data are greater than the value and half are smaller. In data sets with an odd number of values, the median will be one of the data points. In data sets with an even number of values, the median will be the mean of the two central data points. The data points must first be listed in increasing order.
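The odd and even cases can be sketched as follows. The function name `median` and the sample lists are illustrative only; Python's standard library also provides `statistics.median`.

```python
def median(values):
    """Return the midpoint of a data set."""
    ordered = sorted(values)      # data must be in increasing order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]       # odd count: the middle data point
    # even count: mean of the two central data points
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median([7, 1, 5])
even_case = median([7, 1, 5, 3])
print(odd_case, even_case)
```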

3
Q

Mode.

A

The number that appears most often in a set of data. When we examine distributions, the peaks represent modes.
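A data set can have more than one mode. A minimal sketch, using made-up data and the standard library's `statistics.multimode` (available in Python 3.8 and later):

```python
from statistics import multimode

data = [1, 2, 2, 3, 3, 4]    # hypothetical data: 2 and 3 both appear twice
modes = multimode(data)      # every value tied for the highest count
print(modes)
```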

4
Q

Normal distribution.

A

The normal distribution has been solved in the sense that we can transform any normal distribution to a standard distribution with a mean of 0 and a standard deviation of 1. In a normal distribution, all of the measures of central tendency (mode, mean, and median) are the same. Approximately 68% of the distribution is within one standard deviation of the mean, 95% within two, and about 99.7% within three.
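The transformation to a standard distribution and the percentages above can be checked numerically. This is a sketch only: the helper names `z_score` and `fraction_within` and the example numbers are assumptions for illustration. The fraction of a normal distribution within k standard deviations of the mean equals erf(k / √2).

```python
import math

def z_score(x, mu, sigma):
    """Convert a value to the standard distribution (mean 0, SD 1)."""
    return (x - mu) / sigma

def fraction_within(k):
    """Fraction of a normal distribution within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

z = z_score(130, mu=100, sigma=15)   # hypothetical score, mean, and SD
for k in (1, 2, 3):
    print(k, round(fraction_within(k), 4))
```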

5
Q

Skewed distribution.

A

A skewed distribution is one that contains a tail on one side or the other of the data set. A negatively skewed distribution has a tail on the left side, whereas a positively skewed distribution has a tail on the right. The mean of a negatively skewed distribution will be lower than the median, while the mean of a positively skewed distribution will be higher than the median.
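The relationship between mean and median in skewed data can be seen in a quick sketch. Both data sets below are invented for illustration.

```python
from statistics import mean, median

right_tail = [1, 2, 2, 3, 3, 4, 15]        # positive skew: tail on the right
left_tail = [1, 12, 13, 13, 14, 14, 15]    # negative skew: tail on the left

# The tail drags the mean toward it; the median stays near the bulk of the data.
print(mean(right_tail), median(right_tail))
print(mean(left_tail), median(left_tail))
```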

6
Q

Bimodal distributions.

A

A distribution containing two peaks with a valley in between. It can often be analysed as two separate distributions.

7
Q

Range

A

The range of a data set is the difference between its largest and smallest values. It is heavily affected by the presence of data outliers. When the standard deviation cannot be calculated directly, it can be approximated as 1/4 of the range.
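A brief sketch of the range and the one-fourth rule of thumb, using made-up data. The approximation is rough and may differ noticeably from the calculated standard deviation, as printing both values shows.

```python
from statistics import stdev

data = [4, 8, 15, 16, 23, 42]            # hypothetical data set

data_range = max(data) - min(data)       # largest value minus smallest value
approx_sd = data_range / 4               # rule-of-thumb estimate only

print(data_range, approx_sd, stdev(data))
```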

8
Q

Interquartile range.

A

IQR = Q3 – Q1

Any value that falls more than 1.5 × IQR below the first quartile or above the third quartile is considered an outlier.
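The outlier rule can be sketched as below. The data are invented, and the quartiles come from `statistics.quantiles` with `method="inclusive"`; other quartile conventions exist and can give slightly different cutoffs.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]   # hypothetical data set

q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                            # IQR = Q3 - Q1

low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]
print(iqr, outliers)
```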

9
Q

Standard deviation.

A

It is calculated relative to the mean of the data. We calculate it by taking the difference between each data point and the mean, squaring this value, dividing the sum of all of these squared values by the number of points in the data set minus one, and then taking the square root of the result.
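The steps above can be sketched directly in code. The function name `sample_sd` and the data are illustrative; the result is compared with the standard library's `statistics.stdev`, which uses the same n - 1 denominator.

```python
import math
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation, following the steps one at a time."""
    n = len(values)
    mean = sum(values) / n
    squared_diffs = [(x - mean) ** 2 for x in values]  # difference from mean, squared
    variance = sum(squared_diffs) / (n - 1)            # divide by n - 1
    return math.sqrt(variance)                         # square root of the result

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data set
sd = sample_sd(data)
print(sd, stdev(data))
```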

9
Q

Dependent events.

A

Events that impact each other, such that the outcome of an earlier event changes the probability of a later one (order matters).

10
Q

Independent events.

A

Events that do not impact each other, so their probabilities are never expected to change.
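A card-drawing sketch can contrast dependent and independent events. The scenario (drawing two aces from a standard 52-card deck) is an assumed example, not taken from the cards above.

```python
from fractions import Fraction

# Dependent: draw two aces WITHOUT replacement.
# The first draw changes what is left in the deck for the second draw.
p_dependent = Fraction(4, 52) * Fraction(3, 51)

# Independent: draw two aces WITH replacement.
# The deck is restored, so the second probability is unchanged.
p_independent = Fraction(4, 52) * Fraction(4, 52)

print(p_dependent, p_independent)
```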

11
Q

Mutually exclusive outcomes.

A

Cannot occur at the same time.

12
Q

Exhaustive.

A

A group of outcomes is exhaustive if there are no other possible outcomes.

13
Q

In probability when using the word:

A

And: Multiply the probabilities (for independent events): P(A and B) = P(A) × P(B).

Or: Add the probabilities and subtract the probability of both happening together: P(A or B) = P(A) + P(B) − P(A and B).
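Both rules can be checked with a small sketch. The die and coin scenarios and the helper `p` are assumptions chosen for illustration; the "and" rule as written applies to independent events.

```python
from fractions import Fraction

die = range(1, 7)
even = {n for n in die if n % 2 == 0}   # {2, 4, 6}
high = {n for n in die if n > 4}        # {5, 6}

def p(event):
    """Probability of an event on one roll of a fair six-sided die."""
    return Fraction(len(event), 6)

# Or: add the probabilities, then subtract the overlap (rolling a 6).
p_or = p(even) + p(high) - p(even & high)

# And: two independent fair coin flips both landing heads.
p_and = Fraction(1, 2) * Fraction(1, 2)

print(p_or, p_and)
```

Counting the union directly, `p(even | high)`, gives the same value as the addition rule.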

14
Q

Correlation

A

Correlation refers to a connection between data. If two variables trend together, that is, as one increases, so does the other, there is a positive correlation. If two variables trend in opposite directions, there is a negative correlation. Correlation coefficient: a number between -1 and +1 that represents the strength of the relationship. A correlation coefficient of +1 indicates a strong positive relationship, a value of -1 indicates a strong negative relationship, and a value of 0 indicates no apparent relationship.
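One common correlation coefficient is Pearson's r. The sketch below computes it by hand on invented data; the card itself does not name a specific coefficient, so treating it as Pearson's r is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # variables trend together
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # variables trend in opposite directions
print(r_pos, r_neg)
```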

15
Q

Slope.

A

Change in the y direction divided by the change in the x direction for any two points.

m = rise / run = Δy / Δx
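The formula translates directly into code. The function name `slope` and the two points are illustrative.

```python
def slope(p1, p2):
    """Slope between two (x, y) points: rise over run."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)   # Δy / Δx; undefined for a vertical line

m = slope((1, 2), (3, 8))
print(m)
```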

16
Q

Semilog graphs.

A

A specialized representation of a logarithmic data set. The axes on the graph will determine which type of plot is being used and provide key information about the underlying relationship between the relevant variables. A semilog graph uses a logarithmic scale on one axis and a linear scale on the other, while a log-log graph uses a logarithmic scale on both axes. The change in axis ratio can turn otherwise curved data into a linear plot.
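To see why a logarithmic axis makes curved data linear, consider exponential data. In this sketch, with an assumed relationship y = 3 · 2^x, the raw y values curve upward, but log10(y) rises by the same amount for every unit step in x, which is what a straight line on a semilog plot means.

```python
import math

xs = [0, 1, 2, 3, 4, 5]
ys = [3 * 2 ** x for x in xs]            # hypothetical exponential data

log_ys = [math.log10(y) for y in ys]     # what a logarithmic y-axis displays

# Step between neighbouring points on the log scale
steps = [b - a for a, b in zip(log_ys, log_ys[1:])]
print(steps)
```

Each step is approximately log10(2), so the slope on the semilog plot is constant.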