2.1, 2.2 Visualizing Numerical Data Flashcards

1
Q

UD

A

It doesn't have CVs; rather, it's one graph for ONE of the NV headers (its values and their frequencies)
No other variable to compare it with (that would be correlational); we just observe the variability within the one variable (descriptive research): the WHAT
Organize the sample into different patterns for better visualization + conclusions

  • dot plots, histograms, stemplots
  • summarize: shape- symmetry?
    Center- most common value?
    Spread- any data values farther from the rest?
2
Q

Analyze distributions (dot plot)

A

CHECK what each dot represents
Questions will center around observing the numerical values: patterns, density (popular values)
EX: what percentage are at least 68 mm?- the researcher wants the share of sampled measurements that fall in that range (maybe it's useful for their TOI)
Just because there are more dots on one value than on the rest, watch out for the spread-out dots outside of it: each stack is small, but together they can add up to more than the obvious one
To accurately determine: obvious dots / total dots
On a dot plot, each dot is one person or one country, placed above the numerical value (18 or 20) that it has
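
A rough sketch of the "at least 68 mm" question, in Python. The measurements are made up for illustration (they are not from the course); the point is only that "at least" includes the dots sitting on 68 and that the answer is dots in range / total dots.

```python
from collections import Counter

# Hypothetical measurements in mm (illustrative only); one dot per measurement.
measurements = [64, 65, 66, 66, 67, 67, 67, 68, 68, 69, 70, 72]

dots = Counter(measurements)      # value -> number of dots stacked on it
total = sum(dots.values())        # total dots on the plot (n)

# "At least 68" includes the dots sitting on 68 itself.
at_least_68 = sum(count for value, count in dots.items() if value >= 68)
share = at_least_68 / total

print(f"{at_least_68} of {total} dots = {share:.1%}")
```

Here 67 has the tallest single stack, yet the spread-out dots at 68 and above still add up to more than that one stack.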

3
Q

Categories

A

We are looking at numerical distributions of the numerical variable for a conclusion of TOI

You can't put categories on a dot plot's axis, but DO put categories underneath or beside the graph

4
Q

Outlier

A

Not like the other dots: a value that sits far away from the rest

5
Q

Find true (not obvious) mode

A

Similar to dot plot card
1. Count the dots on the numerical value being asked about (e.g. the dots on age 18)
2. Divide by total dots (all students across every age)
3. Smaller than 0.5%? NOT MODE

6
Q

Relative frequency

A

Proportion in DECIMAL form - frequency / total
Proportion of the observations that exhibit the relevant characteristic (<- numerical variable)
Shown on frequency table (IMP- this frequency table has NUMERICAL HEADERS) - standard deviations too
Shows how FREQUENTly each numerical value shows up on a graph
Frequency-> counts
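
A small sketch of a frequency table with numerical headers, turned into relative frequencies. The ages are hypothetical, chosen only so the arithmetic is easy to follow.

```python
from collections import Counter

# Hypothetical ages of 10 students (illustrative only).
ages = [18, 18, 18, 19, 19, 20, 20, 20, 20, 21]

freq = Counter(ages)   # frequency = counts per numerical value
n = len(ages)          # total number of observations

# Relative frequency = frequency / total, in decimal form.
rel_freq = {value: count / n for value, count in sorted(freq.items())}

for value, proportion in rel_freq.items():
    print(value, freq[value], round(proportion, 2))
```

The relative frequencies always add up to 1 (100% of the observations).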

7
Q

Bimodal distribution

A

When a distribution shows 2 OBVIOUS peaks (more than 2 is multimodal)-> 2 different categorical wholes that are mostly polarized because of CIRCUMSTANCES attached to each
Eyeball-> there is an obvious peak but, unlike in a unimodal distribution, the rest of the bins don't all cascade down in size; ONE other bin is brave enough to stick out
EX: westerners are guaranteed less variability in our life expectancy

8
Q

“More than 100” (dot plot)

A

Dots on 100 do NOT count
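
A tiny sketch of the wording difference, with made-up values: "more than 100" is a strict inequality, while "at least 100" would include the dots on 100.

```python
# Hypothetical dot values (illustrative only).
values = [90, 95, 100, 100, 105, 110]

more_than_100 = sum(1 for v in values if v > 100)    # dots on 100 excluded
at_least_100 = sum(1 for v in values if v >= 100)    # dots on 100 included

print(more_than_100, at_least_100)
```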

9
Q

Center

A

Most obviously the numerical value in the center, not the mean or typical value

10
Q

N=

A

When it says “n=” this gives the total number of dots on the graph
It is ALSO the number of values you divide by when computing the mean

You can also deduce whether there are significantly fewer of one CW by comparing how much less it shows up than the other CW in a study with two graphs (e.g. gender)

11
Q

Eyeballing variability

A

measure variability by eyeballing it
1. Find center
- least variability= MOST dots/ values in center (EVERY girl has this!- nobody’s different smh)
- most variability= dots/ values MOST spread out away from the center (balance= variability)
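
One way to sanity-check an eyeballed comparison, sketched with two made-up data sets: the one with its dots piled in the center should come out with the smaller range and the smaller standard deviation. Using the range and the population SD here is an assumption for illustration, not a method the card prescribes.

```python
from statistics import pstdev

# Hypothetical data sets with the same center (5), illustrative only.
clustered = [4, 5, 5, 5, 5, 6]     # most dots piled in the center
spread_out = [1, 3, 5, 5, 7, 9]    # dots spread away from the center

for name, data in (("clustered", clustered), ("spread_out", spread_out)):
    data_range = max(data) - min(data)
    print(name, "range:", data_range, "SD:", round(pstdev(data), 2))
```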

12
Q

Histogram rules

A

Have a frequency scale on the side IN PLACE of the dots that we could easily count before (for gauging like we did with dots)
The x-axis is a continuous number line, so bars have to touch each other.
Rules:
- the first post of a bin counts, but not its second
- any data value that lands ON a post (first or second) automatically belongs in the bin to its right (therefore it is represented by this bin + classified by it)
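
The post rule can be sketched as a small function. The names `bin_index` and `bin_edges`, the starting point 0 and the bin width 3 are all hypothetical choices for illustration; the only thing taken from the card is that a bin includes its first post and excludes its second.

```python
def bin_index(value, start, width):
    """Index of the bin [left, right) that holds value (left post included)."""
    return int((value - start) // width)


def bin_edges(index, start, width):
    """Left and right posts of the bin with the given index."""
    left = start + index * width
    return left, left + width


# Bins of width 3 starting at 0: [0, 3), [3, 6), [6, 9), ...
for value in (2, 3, 5, 6):
    print(value, "->", bin_edges(bin_index(value, 0, 3), 0, 3))
```

A value of 3 lands on a post and goes to the bin on its right, 3 up to 6; a value of 6 likewise belongs to 6 up to 9, not to 3 up to 6.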

13
Q

Relative frequency histograms

A

How much of the whole each bar represents: frequency / total
Y axis takes the frequency (count) of the numerical VALUES in each bin, then divides that by the total number of entries
Percentage of the numerical data that falls into this bar (the one that matches it in height), using the total as reference
EX: bar at 5 counts -> 5/28 -> it's now at 0.18 (18%) (SAME HEIGHT)
To find the percentage within a certain range of numerical values-> add the relative frequencies of the bars in that range
TIP- 4/100
We do this so we can see each value's accurate LIKELIHOOD of showing up (counts aren't enough)
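
A quick sketch of the 5/28 example. Only the bar with 5 counts and the total of 28 come from the card; the other bar counts are made up so that the total works out.

```python
# Hypothetical bar counts (only the 5 and the total of 28 come from the card).
counts = [3, 5, 9, 7, 4]
total = sum(counts)

# Relative frequency of each bar = count / total.
rel = [count / total for count in counts]
print(round(rel[1], 2))          # the bar with 5 counts

# Share of the data covered by two neighbouring bars: add their relative frequencies.
two_bars = rel[1] + rel[2]
print(round(two_bars, 2))
```

The bars keep the same relative heights; only the scale on the y axis changes from counts to proportions.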

14
Q

Histograms on stat crunch

A

Making bins too narrow-> too much detail to look at
Making bins TOO wide-> distorts the data, because a lot of information gets lumped together inside each bar

15
Q

Average on histogram

A

Locate bins according to what they ask for AND the bin rules
Measure the frequency of those bars (height)
Add up all the frequencies within your range
Then divide by total (will be given or just add up frequencies)
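
The steps above can be sketched as follows. The bins and their frequencies are hypothetical, and `share_in_range` is an illustrative helper name; it assumes the range asked for lines up exactly with bin posts.

```python
# Hypothetical histogram: (left post, right post, frequency).
# Each bin includes its left post and excludes its right post.
bins = [(0, 3, 4), (3, 6, 10), (6, 9, 8), (9, 12, 6)]


def share_in_range(bins, low, high):
    """Share of the data in bins lying entirely inside [low, high)."""
    total = sum(freq for _, _, freq in bins)
    inside = sum(freq for left, right, freq in bins
                 if left >= low and right <= high)
    return inside / total


# "3 up to, but not including, 9" covers the second and third bars.
print(round(share_in_range(bins, 3, 9), 3))
```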

16
Q

Adding up bars/ range, not total

A

3 up to, but not including 6
First post ——> not last post

17
Q

Reading relative frequency

A

4/100

18
Q

Finding typical value/ comparing higher means

A

Eyeball it
Choose the bins that lean towards the other bins: the largest bin plus the SECOND largest bin right next to it (on its right or left side)
EX: don’t choose 0 or 1 if 1 is largest but 0 is significantly smaller
You scope out which bins can approximate an accurate center
Basically MEAN

19
Q

Reading frequencies (not relative)

A

Identify a bar's frequency by “Between 80 and 100”
80- the value you eyeball it at
100- the tick mark value you actually measure it up to

When percentage is asked for this frequency (s) ->
“Between % and %”

20
Q

Predicting shapes of distributions

A

Right-skewed= outliers (the tail) on the right-> they represent MORE of the NV than normal (the typical result is on the left side, so it is low here)
* it's about the context (7 days) -> it depends on whether going farther up the number line actually means going up in your context
Left-skewed= most values are large (on the right, since larger values are farther up the number line), but there are outliers on the left that take up a lot of space

21
Q

Comparing typical values (centers)

A

Look at the tallest bins (typical value card) BUT make sure to compare the values UNDERNEATH those bins

22
Q

Tip

A

Really make sure you're looking at the thing they want you to look at (higher monthly cost-> the NV, NOT the relative frequency)

23
Q

The skinny bars

A

Are just dots

24
Q

Reading graph

A

The bottom tells all CONTEXT needed
Words- NV
Numbers- numerical values of that numerical variable
Y axis: unimportant to the actual NVs; it shows how many responded yes to the NV

25
Q

UD SD

A

“With a SD of 2.24” = ONE SD equals 2.24 inches / distance between each tick mark is 2.24
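
A sketch of an axis marked off in SD steps. The SD of 2.24 comes from the card; the mean of 65 inches is a made-up value used only to place the tick marks.

```python
mean = 65.0   # hypothetical mean in inches (not from the card)
sd = 2.24     # one SD, as stated on the card

# Tick marks from 3 SDs below the mean to 3 SDs above it.
ticks = [round(mean + k * sd, 2) for k in range(-3, 4)]
print(ticks)

# Neighbouring tick marks are exactly one SD apart.
gaps = [round(b - a, 2) for a, b in zip(ticks, ticks[1:])]
print(gaps)
```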