Sections 4-7 Frequency (Distributions and Cumulative), Percentages, and Percentiles Flashcards

1
Q

Frequency (f)

A

The number of individual cases FOR a GIVEN SCORE.

  • Designated by a lowercase, italicized ‘f’. (Some use ‘N’, which stands for Number of cases, but N usually represents the TOTAL NUMBER of OBSERVATIONS/SCORES, while f refers to the frequency of a given score.)
  • Ex: If f = 23 for a score of 99, that means 23 individuals scored 99.
2
Q

Importance of the format of statistical symbols

A

Pay attention to the format of a statistical symbol: whether it is uppercase or lowercase, italicized or not, and so on. An italicized lowercase ‘f’ means something different from an uppercase, non-italicized ‘F’.

3
Q

Percentage

A

Designated by the ‘%’ or an uppercase ‘P’. Indicates the NUMBER PER HUNDRED that have a certain characteristic.

To calculate, divide the number of observations meeting a specific criterion (f) (ex: students scoring 99 on a test) by the total number of observations (N) (ex: total number of students who took the test). Then multiply the result by 100 to get the percent (ex: % of students who scored a 99).

Ex: If f = 23 for a score of 99, that means 23 individuals scored 99. Suppose a total of N = 50 people took the exam…

  • …then 23/50 = 46/100 = 0.46.
  • Multiply that by one hundred (0.46 * 100) = 46%.
    • So 46% of the exam takers scored 99.
    • As a check, go back to the original numbers to see if 46% makes sense.
      • Sure enough, 46% of 50 = 0.46 * 50 = 23. Check!
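
The calculation above can be sketched in a few lines of Python. The function name `percentage` is just an illustrative choice, and the numbers are the ones from this card (f = 23, N = 50).

```python
def percentage(f, n):
    """Percent of the n total observations that the frequency f represents."""
    if n <= 0:
        raise ValueError("N must be a positive number of observations")
    # Multiply before dividing so whole-number inputs stay exact.
    return 100 * f / n

pct = percentage(23, 50)
print(f"{pct:.0f}% of the exam takers scored 99")

# Check: go back from the percent to the original count.
count_back = pct * 50 / 100
print(f"{pct:.0f}% of 50 = {count_back:.0f}")
```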
4
Q

Rounding

A

Remember that a decimal part of 0.5 or higher rounds up to the next whole number; anything lower rounds down (ex: 14.72 rounds to 15, and 68.08 rounds to 68).
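
One thing to watch if you do this in Python: the built-in `round()` sends exact halves to the nearest EVEN number (so `round(2.5)` gives 2), which is not the rule on this card. A minimal sketch of the "0.5 or higher rounds up" rule using the standard `decimal` module follows; the helper name `round_half_up` is an illustrative choice.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, places=0):
    """Round so that a trailing 5 always rounds up (away from zero)."""
    quantum = Decimal(10) ** -places          # places=0 -> 1, places=1 -> 0.1
    # Going through str() avoids carrying binary float noise into Decimal.
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)

print(round_half_up(2.5))    # the rule on this card
print(round(2.5))            # Python's built-in: half goes to the even number
print(round_half_up(14.72))
print(round_half_up(68.08))
```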

5
Q

Using a SAMPLE to Predict the outcome of the entire POPULATION

A

Using the Example from the “Percentage” flashcard, pretend that the 50 test-takers looked at were merely a SAMPLE of 50 randomly chosen test-takers out of the full POPULATION of test-takers, which equaled 11,500.

  • Using the finding that 46% of the SAMPLE got a 99 on the test, we could estimate the number of students in the entire POPULATION that got a 99.
    • Simply multiply the POPULATION by the 46% (11,500 * 0.46 = 5,290).
      • This means that we would expect ABOUT 5,290 (within a MARGIN of ERROR – something we will go over in the future) students to have scored a 99 on the test given what we learned BASED on the SAMPLE of 50 students.
    • IMPORTANT: The larger the random SAMPLE, the more accurate the estimate tends to be and the smaller the MARGIN OF ERROR.
    • LESS ERROR = GREATER CONFIDENCE in the final number.
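
As a rough sketch, the estimate above is just the sample proportion scaled up to the population size. The function name is illustrative, and the code deliberately ignores the margin of error, which this card leaves for later.

```python
def estimate_population_count(sample_f, sample_n, population_n):
    """Scale a sample frequency up to an expected count in the population."""
    # Same as population_n * (sample_f / sample_n), but multiplying first
    # keeps whole-number inputs exact.
    return sample_f * population_n / sample_n

expected = estimate_population_count(sample_f=23, sample_n=50, population_n=11_500)
print(f"About {expected:,.0f} students would be expected to score 99")
```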
6
Q

Proportion

A

A PROPORTION is a fraction of 1.

  • It can be expressed either in fraction form (ex: 4/5) or in decimal form (ex: 0.80).
  • Note that the decimal form is merely the fraction divided out (ex: 4/5 = 0.80).
  • Ex: a PROPORTION of 23/50 = 0.46 = a PERCENTAGE of 0.46 * 100 = 46%
    • Because of their clarity, PERCENTAGES are usually USED in STATISTICAL REPORTING, not proportions.
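
To see the three forms side by side, here is a small sketch using Python's standard `fractions` module with the numbers from this card.

```python
from fractions import Fraction

proportion = Fraction(23, 50)        # fraction form
decimal_form = float(proportion)     # decimal form
percent = proportion * 100           # percentage

print(f"{proportion} = {decimal_form} = {percent}%")
print(float(Fraction(4, 5)))         # the 4/5 example in decimal form
```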
7
Q

Comparing Groups of Different size

A

Percentages are useful in comparing characteristics of groups of different size.

Ex: Suppose you have two groups.

  • Group A has f = 32 students with tattoos.
  • Group B has f = 78 students with tattoos.

But in which group are you more likely to find a person with a tattoo?

A: To find the answer, we need to know how many total students there are in each group. It turns out that:

  • Group A has a total of N = 47 students
  • Group B has a total of N = 530 students.

With this information, we can determine the PERCENT of students in EACH GROUP who have tattoos. The higher the percent, the more likely you are to find a person in that group with a tattoo. [[% of Students with Tattoos = (Students with Tattoos / Total number of Students) * 100]]

  • For Group A: 32/47 = .6809, and .6809 * 100 = 68.09%, or 68% (rounded)
  • For Group B: 78/530 = .1472, and .1472 * 100 = 14.72%, or 15% (rounded)

So you can see that, although Group A has far fewer tattooed students than Group B, you are still FAR more likely to find a tattooed student in Group A because the CONCENTRATION of tattooed students (tattooed students relative to total students) is much HIGHER.
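
The comparison can be sketched as follows, using the f and N values for the two groups on this card. The dictionary layout is just one convenient way to hold them.

```python
# (students with tattoos, total students) for each group
groups = {"A": (32, 47), "B": (78, 530)}

percents = {name: 100 * f / n for name, (f, n) in groups.items()}

for name, pct in percents.items():
    print(f"Group {name}: {pct:.2f}%, or about {pct:.0f}%")

# The group with the highest percent is where a tattooed student is most likely.
most_likely = max(percents, key=percents.get)
print(f"Most likely group: {most_likely}")
```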

8
Q

Frequency Distribution

A

Shows how many observations there are for EACH OUTCOME within a larger sample of outcomes.

Ex: Looking at test scores in a class of 33 students with scores that ranged from 24 to 37, we can see that 8 students scored 37, 4 students scored 36, and so on.
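
A quick way to build a frequency distribution in Python is `collections.Counter`. The scores below are made up for illustration and are not the class data described in the example.

```python
from collections import Counter

# Hypothetical raw scores (not the data from the card's table).
scores = [37, 36, 37, 35, 36, 37, 33, 35, 37]

freq = Counter(scores)   # maps each score X to its frequency f

for score in sorted(freq, reverse=True):
    print(f"X = {score}: f = {freq[score]}")
```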

9
Q

Skewed

A

A SKEWED DISTRIBUTION is one that is NOT symmetrical: the scores pile up toward one end and trail off toward the other. In the distribution below, you can see that the scores are SKEWED to one side rather than being balanced around the AVERAGE.

10
Q

Creating a Frequency Distribution Table

A

Creating a Frequency Distribution Table: Refer to the table at the bottom.

  1. Give each table a number (ex: “Table 1”).
  2. Add a brief descriptive caption (or title).
  3. Then a horizontal line (called a RULE) – there is also a RULE at the bottom of the table to mark the end.
  4. The symbol X stands for the SCORES. Be sure to use an uppercase, italicized X because a lowercase, italicized x has another meaning, discussed later in this book.
  5. The scores are listed in order with the highest score placed at the top, and the lowest score placed at the bottom.
  6. The ‘f’ (in italics) stands for FREQUENCY. (Some people put an ‘N’ where the ‘f’ is)
  7. The symbol ‘N’ stands for NUMBER of CASES (or SUM of the FREQUENCIES).
  8. When no cases exist for a given score, a frequency of zero is entered.
  9. ‘N’, the SUM of the FREQUENCIES, is shown at the bottom of the table.
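
The steps above can be sketched as a small plain-text table builder. The scores, the caption, and the function names are all hypothetical; the layout (rules, X and f columns, highest score first, zero rows, N at the bottom) follows the list.

```python
from collections import Counter

def frequency_rows(scores):
    """Return (X, f) rows from the highest score down to the lowest."""
    counts = Counter(scores)
    # Counter gives 0 for scores nobody earned, so gaps get f = 0.
    return [(x, counts[x]) for x in range(max(scores), min(scores) - 1, -1)]

def format_table(rows, number=1, title="Distribution of quiz scores"):
    """Lay the rows out as text with a rule under the title and at the end."""
    rule = "-" * 10
    lines = [f"Table {number}", title, rule, f"{'X':>3} {'f':>4}"]
    lines += [f"{x:>3} {f:>4}" for x, f in rows]
    n = sum(f for _, f in rows)              # N = sum of the frequencies
    lines += [f"{'N =':>3} {n:>4}", rule]
    return "\n".join(lines)

scores = [30, 28, 30, 27, 30, 28]            # hypothetical raw scores
rows = frequency_rows(scores)
print(format_table(rows))
```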
11
Q

Frequency Distribution for Grouped Data

A

This should follow the same structure as “Creating a Frequency Distribution Table” with the following modifications:

  • If there are more than 20 scores (or categories), group the scores (categories) as shown in Table 1 (below). For instance, the scores 39, 40, and 41 are in the group labeled “39-41” in the table, and the frequency for that group is the SUM of the frequencies for 39, 40, and 41.
  • These groups are called “SCORE INTERVALS”, and the “INTERVAL SIZE” is 3 points.
  • ALL Score Intervals MUST have the SAME Interval Size.
  • ESTIMATE INTERVAL SIZE by subtracting the lowest score from the highest score (ex: 41 - 6 = 35), dividing by 15 (35/15 = 2.33), and rounding up to the next whole number (2.33 rounds up to 3). So use 3 as the INTERVAL SIZE.
    • You divide by 15 because that is the number of SCORE INTERVALS you’ve chosen to use. You could choose a different number, but keep it to one that is easy to work with.
  • USE an ODD INTERVAL SIZE because that makes it easier to plot a distribution on graph paper.
  • TALLY RAW DATA into GROUPS by creating the groups first and THEN marking a check (tally) in the appropriate group for each score, crossing out the raw data as you go along. This is important because the alternative method of trying to count all the occurrences in each score interval leads to lots of mistakes.
  • When finished, CHECK YOUR WORK by ensuring that the SUM of the FREQUENCIES is the same as the total number of scores with which you started.
  • ‘P’ represents the given group’s PERCENTAGE of the TOTAL.
    • P = (f / Σf) * 100 (Divide the FREQUENCY of the GROUP by the TOTAL NUMBER of student scores, then multiply by 100)
      • Ex: in the 24-26 group, (5/34) * 100 = 14.7%. So 14.7% of the total scores were in the SCORE INTERVAL of 24-26.
    • (Note: Σf is a different way to write ‘N’)
  • Note: the percentages for all the groups will not necessarily add up to exactly 100% due to rounding.
  • Grouping the scores in this way keeps the data cleaner, easier to work with, and easier to interpret. What if there were a million possible scores?
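
A minimal sketch of the grouping procedure, assuming intervals start at the lowest score (as with 6-8, 9-11, ... on this card). The raw scores and the function names are hypothetical; only the 41 - 6 range and the divide-by-15 step come from the card.

```python
import math
from collections import Counter

def estimate_interval_size(lowest, highest, n_intervals=15):
    """Range divided by the chosen number of intervals, rounded up."""
    return math.ceil((highest - lowest) / n_intervals)

def group_scores(scores, lowest, size):
    """Return (interval label, f, P) rows, highest interval first.

    Assumes every score is >= lowest.
    """
    counts = Counter((s - lowest) // size for s in scores)
    n = len(scores)
    top = (max(scores) - lowest) // size
    rows = []
    for idx in range(top, -1, -1):
        lo = lowest + idx * size
        f = counts[idx]                       # 0 if the interval is empty
        rows.append((f"{lo}-{lo + size - 1}", f, round(100 * f / n, 1)))
    return rows

size = estimate_interval_size(6, 41)          # 35 / 15 = 2.33, rounded up
scores = [6, 9, 10, 15, 16, 17, 17]           # hypothetical raw scores
rows = group_scores(scores, lowest=6, size=size)

for label, f, p in rows:
    print(f"{label:>6} {f:>3} {p:>6}%")

# Check your work: the frequencies must add back up to the number of scores.
assert sum(f for _, f, _ in rows) == len(scores)
```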
12
Q

Cumulative Frequency (cf)

A

Indicates how many cases are IN AND BELOW A GIVEN SCORE INTERVAL. For instance, refer to the cf column in Table 1 below. It contains CUMULATIVE FREQUENCIES (cf).

  • Ex: In the bottom interval, 6-8, there is one case in the interval (where f = 1) and there are zero cases below the interval. Therefore, the cf for this interval is 1 + 0 = 1. (Remember that the frequency column, f, indicates the number of cases in EACH score interval.)
  • In the next interval, 9-11, there is one case in the interval and one case below the interval. Therefore, the cf for this interval is 1 + 1 = 2.
  • In the next interval, 12-14, there are zero cases in the interval and a total of two cases in all the intervals below it. Therefore, the cf for this interval is 0 + 2 = 2.
  • And so on. The number of observations ACCUMULATES as you add additional score intervals. When you reach the top, the SCORE INTERVALS will have accounted for ALL the observations, so the cf for the TOP SCORE INTERVAL will = N = Σf.
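
The running total described above is what `itertools.accumulate` computes. This sketch uses only the three bottom intervals walked through on this card (f = 1, 1, 0), listed from the lowest interval up; the rest of the table is not reproduced here.

```python
from itertools import accumulate

intervals = ["6-8", "9-11", "12-14"]   # lowest interval first
f = [1, 1, 0]                          # frequency in each interval

cf = list(accumulate(f))               # running total of cases in and below

for label, freq, cum in zip(intervals, f, cf):
    print(f"{label:>6}  f = {freq}  cf = {cum}")

# The cf of the top interval always equals the sum of the frequencies.
assert cf[-1] == sum(f)
```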
13
Q

Cumulative Percentages (Cumulative %)

A

CUMULATIVE PERCENTAGE is simply the PERCENTAGE of the TOTAL that ACCUMULATES IN AND BELOW the specified score (or Score Interval).

Ex: in the Table below:

  • For the 39-41 score interval, 99.8% scored in and below that interval. Wait, why isn’t that 100%? A: rounding error will often keep this number from equalling exactly 100%.
  • For the 36-38 score interval, 96.9% scored in and below that interval.
  • The CUMULATIVE PERCENTAGES are APPROXIMATE PERCENTILE RANKS, which indicate the percentage who scored at or below a given score level.
    • Ex: We could report to students with scores of 27, 28, and 29 that their PERCENTILE RANK is 85 (based on the cumulative percentage of 85.2), meaning that their scores are as high as or higher than those of 85% of the total students.
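
A small sketch of two ways to get cumulative percentages, using made-up frequencies (three intervals with one case each). Adding up already-rounded percentages shows where a top value just short of 100% can come from; computing each value from the cumulative frequency avoids that drift. Both function names are illustrative.

```python
from itertools import accumulate

def cumulative_pct_from_rounded(freqs):
    """Add up each interval's rounded percentage (rounding error can build up)."""
    n = sum(freqs)
    rounded = [round(100 * f / n, 1) for f in freqs]
    return [round(total, 1) for total in accumulate(rounded)]

def cumulative_pct_from_cf(freqs):
    """Compute each cumulative percentage directly from cf / N."""
    n = sum(freqs)
    return [round(100 * cf / n, 1) for cf in accumulate(freqs)]

freqs = [1, 1, 1]   # hypothetical frequencies, lowest interval first

print(cumulative_pct_from_rounded(freqs))
print(cumulative_pct_from_cf(freqs))
```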
14
Q

Percentile Rank

A

The PERCENTILE RANK indicates the percentage of scores at or below a given score level. This is equivalent to a cumulative percentage of the distribution.

  • Ex: If you got a score that was as high as or higher than 90% of the class, then you scored in the 90th PERCENTILE.
  • You could also say that the person’s score had a CUMULATIVE PERCENTAGE of 90%.
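
Using the "at or below" definition on this card, a percentile rank can be sketched directly from a list of raw scores. The ten scores here are hypothetical, and the function name is an illustrative choice.

```python
def percentile_rank(scores, score):
    """Percentage of scores that are at or below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

scores = list(range(1, 11))          # hypothetical class of ten scores: 1..10

rank = percentile_rank(scores, 9)    # nine of the ten scores are 9 or lower
print(f"A score of 9 has a percentile rank of {rank:.0f}")
```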