Midterm 1 Flashcards
Variability
How spread out the data is
CV
stdev/mean
Individual
The objects that are the focus of the study. Basically who/what the data is about
Variable
The characteristic of the individual that is being recorded or measured
Categorical
Variables that function as names/labels for objects. Essentially this is non-numerical data
Quantitative
Variables that are measurements. Essentially this is numerical data.
How do you calculate Percent Frequency?
Percent Frequency (PERCENT) = (Frequency / Size of the dataset) × 100
How do you calculate Relative Frequency?
Relative Frequency (DECIMAL) = Frequency / Size of the dataset
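A minimal Python sketch of the two frequency formulas, for checking answers outside of Excel. The `colors` list is made-up illustration data, not from the course.

```python
from collections import Counter

# Hypothetical categorical data, made up for illustration.
colors = ["red", "blue", "red", "green", "red", "blue", "red", "green"]

n = len(colors)              # size of the dataset
frequency = Counter(colors)  # how many times each category appears

# Relative frequency: a decimal between 0 and 1.
relative = {cat: count / n for cat, count in frequency.items()}

# Percent frequency: the relative frequency times 100.
percent = {cat: 100 * rel for cat, rel in relative.items()}

for cat in frequency:
    print(cat, frequency[cat], relative[cat], percent[cat])
```

The relative frequencies always add up to 1, and the percent frequencies to 100.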
What is the Frequency Distribution?
A table listing each unique category in a dataset next to that category's frequency
How do you calculate the frequency of the number 10 in column A in Excel?
=COUNTIF(A:A,"10")
How do you find the frequency of the class 300-400, using the numbers in column B?
=COUNTIFS($B$1:$B$57,">=300",$B$1:$B$57,"<=400")
Cumulative Frequency
The cumulative frequency of a class (number range) is the frequency of that class, summed together with the frequencies of each class that came before it
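As a quick illustration, cumulative frequency is just a running total of the class frequencies. The sketch below uses made-up classes and counts; none of the numbers come from the course.

```python
from itertools import accumulate

# Hypothetical classes and their frequencies, made up for illustration.
classes = ["100-199", "200-299", "300-399", "400-499"]
frequencies = [3, 7, 5, 2]

# Running total: each entry adds its own class frequency to all earlier ones.
cumulative = list(accumulate(frequencies))

for label, freq, cum in zip(classes, frequencies, cumulative):
    print(label, freq, cum)
```

The last cumulative frequency always equals the size of the dataset (here 17).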
What’s a histogram and how is it different from a bar graph?
A histogram is like a bar graph, but for quantitative data. The huge visual difference between a histogram and a bar graph is that a histogram has NO gaps between the bars.
Positive Correlation
Relatively large X -> Relatively large Y
Negative Correlation
Relatively large X -> Relatively small Y
What is the Mode?
It is the value with the largest frequency.
How do you calculate the mode in Excel with values in column C?
=MODE.MULT(C:C)
What is the median?
The literal center of the dataset. When the values are lined up in order, about half of them are smaller than the median and about half are larger.
How do you calculate the median in Excel?
=MEDIAN(C70:K70)
What is the mean?
The arithmetic average of a set of numbers. It essentially considers the size of each number as its mass, and tries to find a sort of “center of gravity” for that mass
How do you calculate the mean in Excel?
=AVERAGE(C85:K85)
How do you calculate percentiles of a dataset in Excel?
=PERCENTILE.INC(array, k)
How do you calculate the first quartile?
=QUARTILE.INC(dataset, 1)
How do you calculate the second quartile?
=MEDIAN(dataset)
How do you calculate the third quartile?
=QUARTILE.INC(dataset, 3)
How do you calculate the fourth quartile?
=MAX(dataset)
What is the range?
The difference between the maximum and the minimum of a dataset.
How do you calculate the range in Excel for data that’s in row 8, columns D through H?
=MAX(D8:H8)-MIN(D8:H8)
What is the Interquartile Range (IQR)?
It is the difference between the third quartile (Q3) and the first quartile (Q1). This represents the spread of the “middle 50%” of the dataset.
How do you calculate the Interquartile Range (IQR) for row 40, columns D through H?
=QUARTILE.INC(D40:H40,3)-QUARTILE.INC(D40:H40,1)
What is the standard deviation?
Roughly the average distance between the values in the dataset and the mean of the dataset.
What is the long way to compute the stdev? (5 Steps)
1) Compute the mean of the dataset
2) Take the difference between every value in the dataset and the mean
3) Square each difference found in step 2
4) “Average” the squared differences from step 3 by summing them and dividing by (number of values - 1). This gives us the variance
5) Take the square root of the variance to get the standard deviation
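The five steps above can be sketched in Python as follows. The `data` list is made up for illustration; the division by (number of values - 1) corresponds to the sample standard deviation, which is what =STDEV.S computes in Excel.

```python
import math
import statistics

# Hypothetical dataset, made up for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# 1) Compute the mean of the dataset.
mean = sum(data) / len(data)

# 2) Take the difference between every value and the mean.
differences = [x - mean for x in data]

# 3) Square each difference.
squared = [d ** 2 for d in differences]

# 4) Sum the squares and divide by (number of values - 1): the variance.
variance = sum(squared) / (len(data) - 1)

# 5) Take the square root of the variance: the standard deviation.
stdev = math.sqrt(variance)

print(mean, variance, stdev)
# Cross-check against Python's built-in sample standard deviation.
print(statistics.stdev(data))
```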
What is the Excel command for the standard deviation?
=STDEV.S(A:A)
What is the variance?
The square of the standard deviation.
How do you get the variance in Excel?
=VAR.S(A:A)
What is the Coefficient of Variation (CV)?
It represents what percentage of the mean the stdev makes up (CV = stdev / mean)
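A short Python sketch of the CV, using the stdev/mean definition from the CV card near the top of this deck. The `data` list is made up for illustration.

```python
import statistics

# Hypothetical dataset, made up for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
stdev = statistics.stdev(data)  # sample standard deviation, like =STDEV.S

cv = stdev / mean               # CV as a decimal
cv_percent = 100 * cv           # CV as a percentage of the mean

print(cv, cv_percent)
```

Here the stdev is a bit under 43% of the mean, so the data is fairly spread out relative to its average.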
Standard Value (z-score)
The z-score of a value in a dataset represents how far that value is from the mean of the dataset, in terms of standard deviations
How do you calculate the Z-score?
z-score = (Value - Mean) / Stdev
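A minimal Python sketch of the z-score formula. The `data` list and the helper name `z_score` are made up for illustration.

```python
import statistics

# Hypothetical dataset, made up for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)
stdev = statistics.stdev(data)  # sample standard deviation, like =STDEV.S


def z_score(value):
    """How many standard deviations `value` sits above (+) or below (-) the mean."""
    return (value - mean) / stdev


z_scores = [z_score(x) for x in data]
for x, z in zip(data, z_scores):
    print(x, round(z, 2))
```

A value equal to the mean has a z-score of 0, values above the mean have positive z-scores, and values below it have negative ones.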
What do we use to determine if a number is an outlier in Excel?
We use the IQR rule: lower and upper fence formulas.
What are the formulas for lower and upper fence in Excel?
Lower Fence = Q1 - 1.5 × IQR
Upper Fence = Q3 + 1.5 × IQR
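The fence formulas can be sketched in Python as below. The `data` list is made up for illustration, and `method="inclusive"` is used on the assumption that it matches the interpolation Excel uses for QUARTILE.INC.

```python
import statistics

# Hypothetical dataset with one suspiciously large value, made up for illustration.
data = [12, 15, 16, 18, 19, 20, 22, 23, 25, 60]

# Quartiles using the "inclusive" method (assumed to match QUARTILE.INC).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Anything outside the fences is flagged as an outlier.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)               # 16.5 22.75 6.25
print(lower_fence, upper_fence)  # 7.125 32.125
print(outliers)                  # [60]
```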
What is included in a 5 number summary?
Min, Q1, median, Q3, and max.
(TRUE or FALSE) In general, if a value in a dataset has a z-score of 5, we would consider that an extremely large value in comparison to the rest of the dataset
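True. A z-score of 5 means the value is 5 standard deviations above the mean. As a rule of thumb, z-scores beyond about -3 or 3 are already considered extreme.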
What has to be true about a dataset for there to be a standard deviation equal to 0?
Every number in the dataset needs to be the same.