Frequency and Probability Distribution Flashcards

1
Q

What is a frequency distribution?

A

A tabular summary of data showing the number (frequency) of observations in each of several nonoverlapping categories or classes. This definition holds for both quantitative and categorical data.

2
Q

What are the characteristics of a frequency distribution? (3)

A

A) They show how frequently each of the different classes occurs.

B) They make the pattern of numbers clear at a glance.

C) They are made up of 2 principal components:

1) an ordinary frequency distribution (f): the raw count, which is a whole number

2) a relative frequency distribution (relative f): the proportion of observations that take a particular score

3
Q

What is (f) or ordinary frequency also referred to as?

A

The absolute frequency.

4
Q

What is the purpose of graphing frequency distributions?

A

To provide a picture of the data distribution

5
Q

What should you do to avoid distorting the data?

A

Set the intersection of the 2 axes at zero and then choose scales for the axes such that the height of the graphed data is about 3/4 the width.

6
Q

Is it possible not to set the intersection of the axes at 0?

A

Yes, but you have to explain why you made that choice.

7
Q

What do we mean by “distorting data” by not setting the axes at 0?

A

You are more likely to see a pattern that is not really there. You might see many peaks when in reality the data are relatively stable. (See slide 9 if needed.)

8
Q

What is a bar graph?

A

It’s a graphical display for depicting categorical data (qualitative categories) summarized in a frequency, relative frequency, or percent frequency distribution.

9
Q

What does the space between the bars of a bar chart emphasize?

A

The fact that each class is separate.

10
Q

What are two graphical displays of categorical data?

A

Bar chart and pie chart (possibly a broken-line graph as well).

11
Q

What is a pie chart?

A

A circle in which each category takes up a share of the circle equal to its proportion of the data.

12
Q

What are things we can include in a tabular display of categorical data?

A

Frequency, relative f, and % frequency

13
Q

What are things we can include in a tabular display of quantitative data?

A

Frequency, relative f, % frequency, cumulative f, cumulative relative f and cumulative %

14
Q

What are 2 graphical displays of quantitative data?

A

Histogram, stem and leaf display

15
Q

What are the 5 steps for making a frequency distribution table?

A
  1. Make a table listing each possible score, from highest to lowest. Note that the list must be continuous, with no possible score skipped.
  2. In the table, show how many times each score occurs (“f”, or the absolute frequency).
  3. Figure the relative occurrence (aka proportion, p) of each score.
  4. Figure the cumulative frequency of each score.
  5. Figure the cumulative proportion (P) of each score.

*** The first 3 steps are the same for categorical data.
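The five steps can be sketched in Python. This is only an illustration: the function name `frequency_table` and the sample scores are hypothetical, and it assumes whole-number scores so that "every possible score" means every integer between the lowest and highest score.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows (score, f, p, cum_f, cum_p), highest score first."""
    counts = Counter(scores)
    n = len(scores)  # N, the total number of scores
    rows = []
    cum_f = 0
    # Accumulate from the lowest score upward, visiting every whole-number
    # score in the range so the list stays continuous (f may be 0).
    for score in range(min(scores), max(scores) + 1):
        f = counts.get(score, 0)   # step 2: absolute frequency
        cum_f += f                 # step 4: cumulative frequency
        rows.append((score, f, f / n, cum_f, cum_f / n))  # steps 3 and 5
    rows.reverse()                 # step 1: list from highest to lowest
    return rows


scores = [3, 5, 4, 5, 2, 5, 4, 3, 5, 4]  # hypothetical data
table = frequency_table(scores)
for row in table:
    print(row)
```

The cumulative columns are accumulated from the lowest score upward and the rows are reversed only at the end, so the top row always shows a cumulative frequency of N and a cumulative proportion of 1.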

16
Q

What is the formula of relative frequency (also referred to as relative frequency distribution)?

A

Relative f = f/N

17
Q

What is the cumulative frequency distribution?

A

It indicates the number of scores that fell below the upper real limit of the desired score (it is a whole number).

18
Q

What is the cumulative proportion distribution? What is the formula?

A

It indicates the proportion of scores that fell below the upper real limit of the desired score.

Cum. Proportion = cum f / N

19
Q

What does big N represent?

A

The total number of scores.

20
Q

What is a grouped frequency distribution?

A

A) It is used when summarizing quantitative data that has so many different possible values that a simple frequency distribution table is too cumbersome to give a simple account of the information.

B) It groups the values of all cases that fall within a certain interval, giving a listing of nonoverlapping intervals.

21
Q

What are the 5 rules for grouping data?

A
  1. Intervals must be continuous and mutually exclusive
  2. The lower limit of the lowest interval must be such that the interval contains the lowest score
  3. The lower limit of the lowest interval must be divisible by i (the interval width)
  4. The interval width should be an integer number of the units of the variable.
  5. The interval width should be a familiar value (a whole number).
22
Q

What should the sum of the relative frequencies always be?

A

1 (equivalently, 100% when expressed as percent frequencies).

23
Q

When are grouped frequency distributions done?

A

When the difference between the highest and lowest scores is large (roughly 15 to 20 or more; this threshold will not be on the exam).

24
Q

Why do we create intervals in grouped frequency distribution tables?

A

Because there are too many scores to keep each one individually on its own.

25
What are the steps to constructing a grouped frequency distribution table?
  1. Find the range: range = (highest data value - lowest data value) + 1.
  2. Determine the interval width (i), ideally between 5 and 20 and preferably a whole number. The instructor will specify how many class intervals (the number of groups the data is divided into) to use. Formula: i = range / number of class intervals. Note that class intervals have to be continuous.
  3. Identify the midpoint of each class interval (the point in the middle of the interval). Formula: midpoint = minimum + ((maximum - minimum) / 2), where minimum and maximum are the limits of that interval.
  4. Identify the real limits of each class interval: real limits = midpoint ± (width / 2).
  5. Count the raw scores in each interval to determine the frequencies.
  6. Figure the relative frequency (p) for each interval: relative f = f/N.
  7. Figure the cumulative frequency for each class interval; it indicates the number of scores that fell below the upper real limit of that interval.
  8. Figure the cumulative proportion for each class interval.
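The construction steps can be sketched in Python. This is an illustration under stated assumptions, not the course's official procedure: the function name `grouped_frequency_table` and the sample scores are hypothetical, scores are assumed to be whole numbers, the width is rounded up to a whole number, and the lowest interval starts at a multiple of the width (grouping rule 3). Because of that alignment, the number of intervals produced can differ slightly from the number requested.

```python
import math


def grouped_frequency_table(scores, num_intervals):
    """Group whole-number scores into class intervals of equal width."""
    lowest, highest = min(scores), max(scores)
    data_range = (highest - lowest) + 1              # step 1
    width = math.ceil(data_range / num_intervals)    # step 2, rounded up
    lower = (lowest // width) * width                # lower limit divisible by i
    n = len(scores)
    rows, cum_f = [], 0
    while lower <= highest:
        upper = lower + width - 1                    # apparent upper limit
        midpoint = lower + (upper - lower) / 2       # step 3
        real_limits = (midpoint - width / 2, midpoint + width / 2)  # step 4
        f = sum(1 for s in scores if lower <= s <= upper)           # step 5
        cum_f += f                                   # step 7
        rows.append({
            "interval": (lower, upper),
            "midpoint": midpoint,
            "real_limits": real_limits,
            "f": f,
            "p": f / n,                              # step 6
            "cum_f": cum_f,
            "cum_p": cum_f / n,                      # step 8
        })
        lower += width
    return rows


scores = [12, 15, 21, 24, 27, 30, 33, 18, 22, 29]  # hypothetical data
rows = grouped_frequency_table(scores, num_intervals=5)
for row in rows:
    print(row)
```

With these scores the range is 22, the width rounds up to 5, and the intervals run from 10-14 to 30-34, listed here from lowest to highest.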
26
What is the purpose of the midpoint and the real limits?
Midpoint: It gives a single number to represent the whole interval, which makes calculations (like averages) and graphs simpler.
Real limits: They make sure there is no overlap or gap between intervals and help include borderline values properly, especially for continuous data. "Borderline values" are numbers that fall exactly at the boundary between two class intervals in a frequency distribution. For example, if your class intervals are 10–14 and 15–19, the values 14 and 15 are borderline values because they mark the division between these two intervals.
27
What are histograms?
- They are a kind of bar chart in which the bars are placed right next to each other, because the data being graphed are continuous (interval or ratio) rather than separate categories as in a bar graph.
- The height of each bar corresponds to the frequency of each value or interval in the frequency distribution table.
28
What is one of the most important functions of the histogram?
To provide information about the shape of the distribution.
29
What are three possible shapes of frequency distribution for histograms?
Symmetrical: a curve is symmetrical if, when folded in half, the two sides coincide. If a curve is not symmetrical, it is skewed.
Positively skewed: most of the scores occur at the lower values of the horizontal axis, and the curve tails off (think of an animal's tail) toward the higher end; this may reflect a floor effect.
Negatively skewed: most of the scores occur at the higher values of the horizontal axis, and the curve tails off toward the lower end; this may reflect a ceiling effect.
30
What is a stem and leaf display (even though we don't use it anymore) ? What are two advantages?
* It simultaneously displays the rank order and the shape of the distribution of the data.
* The display is created by rank ordering the leading digit(s) of each data value to the left of a vertical line (the stem) and recording the last digit(s) of each data value to the right of the vertical line (the leaf).
* It has 2 primary advantages over the histogram:
1) It is easier to construct by hand.
2) It provides more information because it shows the actual data.
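A small Python sketch of the construction described above. The function name `stem_and_leaf` and the sample values are hypothetical, and the sketch assumes non-negative whole numbers, using the tens digit(s) as the stem and the units digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return one text line per stem, with leaves in rank order."""
    stems = defaultdict(list)
    for v in sorted(values):          # rank order the data first
        stems[v // 10].append(v % 10)  # stem = leading digit(s), leaf = last digit
    lines = []
    for stem in sorted(stems):
        leaves = " ".join(str(leaf) for leaf in stems[stem])
        lines.append(f"{stem} | {leaves}")
    return lines


data = [68, 72, 75, 81, 69, 77, 84, 72]  # hypothetical scores
for line in stem_and_leaf(data):
    print(line)
```

Each printed row keeps the actual data values readable while the row lengths give a rough picture of the shape of the distribution.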
31
PROBABILITY - what is the definition of probability?
It is a numerical measure of the chance that an event will occur. Probabilities can be used to measure the uncertainty associated with events.
32
What does it mean for the probability of an event to be 0, 0.5 or 1?
0 (or close to 0): the event is very unlikely to occur. 0.5: the event is just as likely to occur as not. 1 (or close to 1): the event is almost certain to occur.
33
What is the possible range of a probability?
From 0 to 1, inclusive. There are no negative values.
34
How do experiments in statistics and in the physical sciences differ?
In a statistical experiment, the outcomes are not completely predictable because they are based on probability. Even though the experiment is repeated in exactly the same way, an entirely different outcome may occur. Ex: coin flip
35
What are statistical experiments also called?
Random experiments
36
What is an experiment (trial)?
An experiment (trial) is any process that generates well-defined outcomes.
* On any single repetition of an experiment, one and only one of the possible experimental outcomes will occur.
Example: experiment: toss a coin; experimental outcomes: head, tail.
Example: experiment: play a football game; experimental outcomes: win, lose, tie.
37
What is "sample space" and "sample point"?
The sample space for an experiment is the set of all experimental outcomes.
* An experimental outcome is also called a sample point to identify it as an element of the sample space.
Example: toss a coin. S (sample space) = {Head, Tail}, where Head and Tail are each a sample point.
38
What are two basic requirements for assigning probabilities?
1. The probability assigned to each experimental outcome must be between 0 and 1, inclusive: 0 ≤ P(E) ≤ 1 for every outcome E.
2. The probabilities of all the experimental outcomes must sum to 1.
39
What are three methods of assigning probabilities?
The classical method, the relative frequency method and the subjective method.
40
What is the classical method of assigning probabilities?
Assigning probabilities based on the assumption of equally likely outcomes. If an experiment has n possible outcomes, the classical method assigns a probability of 1/n to each outcome.
Example: rolling a die.
Experiment: rolling a die
Sample space: S = {1, 2, 3, 4, 5, 6}
Probabilities: each sample point has a 1/6 chance of occurring
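A minimal Python sketch of the classical method, using exact fractions so that 1/6 is not rounded. The helper name `classical_probabilities` is hypothetical.

```python
from fractions import Fraction


def classical_probabilities(sample_space):
    """Assign probability 1/n to each of the n equally likely outcomes."""
    n = len(sample_space)
    return {outcome: Fraction(1, n) for outcome in sample_space}


die = classical_probabilities([1, 2, 3, 4, 5, 6])
print(die[3])             # 1/6
print(sum(die.values()))  # 1
```

The second print checks that the assigned probabilities add up to 1 across the whole sample space.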
41
What is the relative frequency method of assigning probabilities?
Assigning probabilities based on experimentation or historical data.
Example: Lucas Tool Rental. Lucas Tool Rental would like to assign probabilities to the number of car polishers it rents each day. Office records show the frequencies of daily rentals for the last 40 days. Each probability is obtained by dividing the frequency (number of days) by the total frequency (total number of days), which is the same calculation as a relative frequency. For instance, 0 polishers were rented on 4 of the 40 days, so the probability that no polishers are rented on a given day is 4/40 = 0.10.
42
See slide 47. What would be the probability of renting 3 polishers or more per day at Lucas Tool Rental?
0.30
43
What is the subjective method of assigning probabilities?
The subjective method of assigning probabilities means using judgment to estimate the likelihood of an outcome, especially when historical data isn't enough or the situation is changing quickly. It involves considering available data as well as personal experience and intuition (a gut feeling). Ultimately, the probability reflects how confident you are that a specific outcome will happen. Often, the best estimates come from combining the estimate from the classical or relative frequency approach with the subjective estimate.
Example: estimating the probability of a product selling out. Suppose you work for a company that sells a limited edition product, and you want to estimate the probability that it will sell out by the end of the week.
Relative frequency estimate: based on historical data, you know that similar products have sold out 80% of the time over the past few months. This is a data-driven estimate.
Subjective estimate: this time, however, the economic conditions have changed (e.g., there’s a new competitor), and you feel there’s a 50% chance that the product will sell out, based on your knowledge of current market trends and consumer behavior.
Combining both estimates: you could combine the two estimates, perhaps by averaging them or by weighting them according to how confident you are in each approach.
Combined probability (simple average): (80% + 50%) / 2 = 65%
Alternatively, if you trust the historical data more, you could weight it more heavily, for example 0.8 against 0.2:
Combined probability (weighted): (0.8 × 80%) + (0.2 × 50%) = 74%
This gives you a 74% chance of selling out the product.
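The combination of estimates is a weighted average, which can be sketched in Python. The function name `combine_estimates` and the `data_weight` parameter are hypothetical; a weight of 0.8 on the historical estimate is the value that reproduces the 74% figure in the example.

```python
def combine_estimates(data_estimate, subjective_estimate, data_weight=0.5):
    """Weighted average of a data-driven and a subjective probability.

    data_weight is the share of trust placed in the data-driven estimate;
    the remainder goes to the subjective estimate.
    """
    return data_weight * data_estimate + (1 - data_weight) * subjective_estimate


simple_average = combine_estimates(0.80, 0.50)               # equal weights
weighted = combine_estimates(0.80, 0.50, data_weight=0.8)    # trust data more
print(round(simple_average, 2), round(weighted, 2))
```

Choosing the weight is itself a judgment call, so the combined figure is still a subjective probability.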
44
What is an event?
A collection of sample points.
Example: Event A = days on which 3 or more polishers were rented, so A = {(3), (4)}.
In other words, we focus on the days on which 3 polishers were rented and the days on which 4 polishers were rented, because we want to track the days on which the number of polishers rented is 3 or more. The number of polishers rented is the experimental outcome.
45
What is the probability of an event? What are two ways to find it?
1. The probability of any event is equal to the sum of the probabilities of the sample points in the event. If we can identify all the sample points of an experiment and assign a probability to each, we can compute the probability of an event.
Example: Event A = days on which 3 or more polishers were rented, A = {(3), (4)}.
P(A), the probability of event A, = P(3) + P(4) = 0.25 + 0.05 = 0.30
2. The probability of any event can also be found by dividing the number of successful experimental outcomes (Nsuccess) by the total number of experimental outcomes (NSE).
Example with Lucas Tool Rental: Event A = days on which 3 or more polishers were rented.
P(A) = Nsuccess / NSE = (10 + 2) / 40 = 0.30
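Both ways of finding P(A) can be checked with a short Python sketch. It uses only the counts stated in the card (10 days with 3 polishers, 2 days with 4 polishers, 40 days in total); the variable names are hypothetical.

```python
from fractions import Fraction

total_days = 40
days_by_rentals = {3: 10, 4: 2}  # only the counts given for event A
event_a = [3, 4]                 # sample points: 3 or more polishers rented

# Way 1: sum the probabilities of the sample points in the event.
point_probs = {k: Fraction(days, total_days) for k, days in days_by_rentals.items()}
p_a_sum = sum(point_probs[k] for k in event_a)

# Way 2: successful outcomes divided by the total number of outcomes.
n_success = sum(days_by_rentals[k] for k in event_a)
p_a_count = Fraction(n_success, total_days)

print(float(p_a_sum), float(p_a_count))
```

The two results agree because dividing each count by 40 and then adding is the same arithmetic as adding the counts and then dividing by 40.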
46
What are mutually exclusive and independent events? (If this is unclear, watch some videos on the topic.)
Mutually exclusive events are events that cannot co-occur. Example: rolling a 2 and rolling a 3 on a single roll of a die. Events are independent if the occurrence of one does not affect the probability of the other occurring. Example: when rolling a die twice, the outcomes of the first roll and the second roll have no effect on each other.
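The two ideas can be illustrated by enumerating every outcome of two die rolls in Python. This is a sketch with hypothetical names; the independence check uses the standard product rule P(A and B) = P(A) × P(B), which is not stated on the card but is equivalent to saying one event does not change the probability of the other.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) pairs.
rolls = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a true/false test on a pair of rolls."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))


def first_is_2(r):
    return r[0] == 2


def first_is_3(r):
    return r[0] == 3


def second_is_3(r):
    return r[1] == 3


# Mutually exclusive: the same roll cannot be both a 2 and a 3.
p_same_roll = prob(lambda r: first_is_2(r) and first_is_3(r))

# Independent: the joint probability equals the product of the two.
p_joint = prob(lambda r: first_is_2(r) and second_is_3(r))
p_product = prob(first_is_2) * prob(second_is_3)

print(p_same_roll, p_joint, p_product)
```

Note that the mutually exclusive pair is not independent: knowing the first roll is a 2 drops the probability that it is a 3 from 1/6 to 0.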