Frequency and Probability Distribution Flashcards
What is a frequency distribution?
A tabular summary of data showing the number (frequency) of observations in each of several nonoverlapping categories or classes. This definition holds for both quantitative and categorical data.
What are the characteristics of a frequency distribution? (3)
A) They show how frequently each of the different classes occurs
B) They make the pattern of numbers clear at a glance
C) They are made up of 2 principal components:
1) an ordinary frequency distribution (f): the raw count, which is a whole number
2) a relative frequency distribution (relative f): the proportion of all observations that have a particular score
What is (f) or ordinary frequency also referred to as?
The absolute frequency.
What is the purpose of graphing frequency distributions?
To provide a picture of the data distribution
What should you do to avoid distorting the data?
Set the intersection of the 2 axes at zero and then choose scales for the axes such that the height of the graphed data is about 3/4 the width.
Is it possible not to set the intersection of the axes at 0?
Yes, but you have to explain why you made that choice.
What do we mean by “distorting data” by not setting the axes at 0?
You are more likely to see a pattern that is not really there. You might see many peaks when in reality the data are relatively stable. (See slide 9 if needed.)
What is a bar graph?
It’s a graphical display of categorical data (qualitative categories) summarized in a frequency, relative frequency, or percent frequency distribution.
What does the space between the bars of a bar chart emphasize?
The fact that each class is separate.
What are two graphical displays of categorical data?
Bar chart and pie chart. Possibly also a broken-line graph (unconfirmed).
What is a pie chart?
A circle in which each category takes up a proportion x of the circle.
What are things we can include in a tabular display of categorical data?
Frequency, relative f, and % frequency
What are things we can include in a tabular display of quantitative data?
Frequency, relative f, % frequency, cumulative f, cumulative relative f and cumulative %
What are 2 graphical displays of quantitative data?
Histogram, stem and leaf display
What are the 5 steps for making a frequency distribution table?
- Make a table with a list of each possible score (FROM HIGHEST TO LOWEST! And note that the list MUST BE CONTINUOUS, with no values skipped!)
- In the table, show how many times each score occurs (“f”, or the absolute frequency)
- Figure the relative occurrence (aka proportion, p) of each score: p = f/N
- Figure the cumulative frequency of each score
- Figure the cumulative proportion (P) of each score
*** The first 3 steps are the same for categorical data.
What is the formula for relative frequency (also referred to as the relative frequency distribution)?
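The first three steps can be sketched in Python. This is only an illustration under stated assumptions: the scores are integers, the function name `frequency_table` is made up for this example, and the sample data are invented.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows (score, f, p), listed from the highest score to the lowest."""
    counts = Counter(scores)
    n = len(scores)  # N, the total number of scores
    rows = []
    # Step 1: list every possible score from highest to lowest, skipping none.
    for score in range(max(scores), min(scores) - 1, -1):
        f = counts[score]  # Step 2: absolute frequency (0 if the score never occurs)
        p = f / n          # Step 3: relative frequency, p = f / N
        rows.append((score, f, p))
    return rows


scores = [7, 5, 5, 4, 7, 7, 2, 5, 4, 7]  # invented integer scores
for score, f, p in frequency_table(scores):
    print(f"{score:>5} {f:>3} {p:>6.2f}")
```

Scores that never occur (here 6 and 3) still get a row with f = 0, which is what keeps the list continuous.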
Relative f = f/N
What is the cumulative frequency distribution?
It indicates the number of scores that fall below the upper real limit of the desired score (it is a whole number).
What is the cumulative proportion distribution? What is the formula?
It indicates the proportion of scores that fall below the upper real limit of the desired score
Cum. Proportion = cum f / N
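A minimal sketch of both cumulative columns, assuming integer scores, so that "below the upper real limit of score X" means "less than or equal to X". The function name `cumulative_table` and the sample data are invented for illustration.

```python
from collections import Counter
from itertools import accumulate


def cumulative_table(scores):
    """Return rows (score, f, cum_f, cum_p), listed from highest score to lowest."""
    counts = Counter(scores)
    n = len(scores)  # N, the total number of scores
    values = list(range(min(scores), max(scores) + 1))  # lowest to highest
    freqs = [counts[v] for v in values]
    cum_f = list(accumulate(freqs))  # running total, starting at the lowest score
    rows = [(v, f, c, c / n) for v, f, c in zip(values, freqs, cum_f)]
    return rows[::-1]  # display the highest score first


for row in cumulative_table([7, 5, 5, 4, 7, 7, 2, 5, 4, 7]):
    print(row)
```

The cumulative frequency of the highest score always equals N, so its cumulative proportion is 1.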
What does big N represent?
The total number of scores.
What is a grouped frequency distribution?
A) Used when summarizing quantitative data with so many different possible values that a simple frequency distribution table is too cumbersome to give a simple account of the information
B) It groups the values of all cases that fall within a certain interval, giving a listing of nonoverlapping intervals.
What are the 5 rules for grouping data?
- Intervals must be continuous and mutually exclusive
- The lower limit of the lowest interval must be such that the interval contains the lowest score
- The lower limit of the lowest interval must be divisible by i (the interval width)
- The interval width should be an integer number of units of the variable.
- The interval width should be a familiar value (a whole number)
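The rules above can be sketched as follows. This assumes integer scores and a whole-number width i. The function name `grouped_frequency` and the data are hypothetical.

```python
def grouped_frequency(scores, width):
    """Return rows (lower, upper, f) for integer scores, highest interval first."""
    # The lowest interval starts at the largest multiple of `width` that does not
    # exceed the lowest score, so it is divisible by i and contains that score.
    lower = (min(scores) // width) * width
    rows = []
    while lower <= max(scores):
        upper = lower + width - 1  # apparent upper limit for integer scores
        f = sum(lower <= s <= upper for s in scores)
        rows.append((lower, upper, f))
        lower += width  # next interval starts right after: no gaps, no overlap
    return rows[::-1]


scores = [12, 17, 23, 25, 31, 38, 44, 9, 15, 29]  # invented integer scores
for lower, upper, f in grouped_frequency(scores, width=10):
    print(f"{lower}-{upper}: {f}")
```

Because each interval begins exactly where the previous one ends, every score lands in exactly one interval and the frequencies sum to N.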
What should the sum of the relative frequency always be?
1
When are grouped frequency distributions done?
When the difference between the highest and lowest scores is large (about 15 to 20 or more; this will not be on the exam, don’t worry)
Why do we create intervals in grouped frequency distribution tables?
Because there are too many different scores to list each one on its own row.