Data Science Math Vocabulary Flashcards

1
Q

Math Vocabulary

Are the “mean” and the “average” of a set of numbers the same thing?

A

Yes, the “mean” and the “average” are the same thing.

mean or average = sum of the numbers / count of the numbers
2
Q

Math Vocabulary

What is the mean, or average of a set of numbers?

A

The mean (also known as the arithmetic mean or average) is a calculated “central” value of a set of numbers. It represents the typical value within the data set and is obtained by adding up all the data points and dividing by the total number of data points. For example, the mean of the set of numbers {4, 1, 7} is 4 because:

(4 + 1 + 7) / 3 = 12 / 3 = 4

From copilot
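
The same calculation can be sketched in Python. This is only an illustration using the set {4, 1, 7} from the card; the variable names are made up for the example.

```python
from statistics import mean

numbers = [4, 1, 7]

# Add up all the data points, then divide by how many there are.
manual_mean = sum(numbers) / len(numbers)

# The standard library's statistics.mean computes the same quantity.
library_mean = mean(numbers)

print(manual_mean)
print(library_mean)
```

Both values equal 4, matching the hand calculation above.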

3
Q

Math Vocabulary

What is the median of a set of numbers?

A

The median is the middle value in a sorted list of numbers. When the data points are arranged from smallest to largest, the median is the middle data point if there is an odd number of data points. If there is an even number, the median is the average of the two middle data points. For example, the median of the data set {1, 4, 2, 5, 0} is 2 because it is the middle value when the set is ordered: {0, 1, 2, 4, 5}. In the ordered set {0, 1, 2, 4, 5, 8}, which has an even number of values, the two middle values are 2 and 4, so the median is their average: (2 + 4) / 2 = 6 / 2 = 3.

Memory trick: “median” is like “middle”

From copilot
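
As a rough sketch, the odd and even cases can be written out in Python. The helper name `median_by_hand` is hypothetical, and the sketch assumes a non-empty list of numbers.

```python
def median_by_hand(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)   # the median is defined on the sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: the average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_set = [1, 4, 2, 5, 0]
even_set = [0, 1, 2, 4, 5, 8]

print(median_by_hand(odd_set))    # 2
print(median_by_hand(even_set))   # 3.0
```

Python's standard library offers `statistics.median`, which should agree with this helper on both sets.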

4
Q

Math Vocabulary

What is the mode of a set of numbers?

A

The mode is the value that appears most frequently in a data set. It represents the data point with the highest frequency of occurrence. For example, in the set {4, 2, 4, 3, 2, 2}, the mode is 2 because it occurs three times (more than any other number). This may be easier to see (although sorting is not required) when the set is ordered: {2, 2, 2, 3, 4, 4}.

If a set has no repeating values, e.g. {1, 2, 3, 4}, there is no mode.

If several values tie for the highest number of occurrences, there are multiple modes. For example, in {1, 2, 3, 3, 4, 4} both 3 and 4 occur twice, so the set has two modes: 3 and 4.

Memory trick: “mode” starts with “M” and “o”, which stands for “Most often”.

From copilot and modified
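
The three cases above (one mode, no mode, several modes) can be sketched in Python. The function name `modes` is hypothetical, and it follows this card's convention that a set with no repeated values has no mode. Note that the standard library's `statistics.multimode` uses a different convention for that case and returns every value.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] if nothing repeats."""
    counts = Counter(values)          # value -> number of occurrences
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Card convention: no repeating values means there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([4, 2, 4, 3, 2, 2]))   # [2]
print(modes([1, 2, 3, 4]))         # []
print(modes([1, 2, 3, 3, 4, 4]))   # [3, 4]
```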

5
Q

Math Vocabulary

What is a Measure of Central Tendency of a data set, and in what ways can it be calculated?

A

A Measure of Central Tendency is a single value that summarizes a dataset by describing the center of its distribution. The three most common measures of central tendency are:

  • Mean or Average: The sum of all values divided by the total number of values.
  • Mode: The most frequently occurring value(s) in the dataset. There can be a single mode, multiple modes if two or more values tie for the highest frequency, or no mode if no value occurs more than once.
  • Median: The middle number in an ordered dataset, or the mean of the two middle numbers if the dataset has an even number of values.

Understanding central tendency is essential for analyzing data and making informed decisions.

From copilot and modified
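
To see the three measures side by side, here is a small sketch using Python's standard `statistics` module on the set {4, 2, 4, 3, 2, 2} from the mode card. The variable names are made up for the example.

```python
from statistics import mean, median, multimode

data = [4, 2, 4, 3, 2, 2]

center_mean = mean(data)         # 17 / 6, roughly 2.83
center_median = median(data)     # middle two of [2, 2, 2, 3, 4, 4] are 2 and 3
center_modes = multimode(data)   # list of the most frequent value(s)

print(center_mean, center_median, center_modes)
```

On this set the three measures give three different answers (about 2.83, 2.5, and 2), which is one reason it helps to know all of them.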

6
Q

Math Vocabulary

What is the Variability of a data set, and how can it be calculated?

A

Variability or Variation in statistics refers to how spread out a set of data is. It describes how far apart data points lie from each other and from the center of a distribution. Measures of variability allow us to summarize and compare data sets. Common ways to describe variability include:

  • Range: The difference between the highest and lowest values in the data set.
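  • Interquartile Range (IQR): The range of the middle half of a distribution (the third quartile minus the first quartile).
  • Standard Deviation: The square root of the variance; roughly, the typical distance of the data points from the mean.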
  • Variance: The average of squared distances from the mean.

Understanding variability is crucial because it affects our ability to generalize results from a sample to a population. Low variability allows better predictions, while high variability makes predictions more challenging. Central tendency (average) and variability together provide a more complete picture of the data.

From copilot and modified
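
The four measures listed above can be sketched with Python's standard `statistics` module. The data values are illustrative and do not come from the cards. The sketch assumes the population definitions (dividing by the number of values), which matches "the average of squared distances"; `statistics.variance` and `statistics.stdev` use the sample definitions instead. Quartiles also have several conventions, so the IQR from `statistics.quantiles` may differ slightly from other tools.

```python
from statistics import pstdev, pvariance, quantiles

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values, not from the cards

# Range: highest value minus lowest value.
value_range = max(data) - min(data)    # 9 - 2 = 7

# IQR: third quartile minus first quartile (default quantile method).
q1, _q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Variance: average of squared distances from the mean (population form).
variance = pvariance(data)             # mean is 5, squared distances sum to 32

# Standard deviation: square root of the variance.
std_dev = pstdev(data)

print(value_range, iqr, variance, std_dev)
```

For this data the variance is 32 / 8 = 4 and the standard deviation is 2.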

7
Q

What is the range of a set of numbers?

A

The range is the difference between the highest and lowest values in the set:

range = highest value - lowest value

or

r = h - l
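
As a quick sketch in Python, using the set {4, 1, 7} borrowed from the mean card and the same letters as the formula above:

```python
numbers = [4, 1, 7]

h = max(numbers)   # highest value
l = min(numbers)   # lowest value
r = h - l          # range

print(r)   # 6
```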