Section 1.2 Flashcards
Stacked data
Every single row represents one person (the categorical whole: the people are wholes and their traits are the variables in these data tables)
Each different variable (categorical, numerical) is represented by a different COLUMN: a Time column and a CODED DATA Weekend column (whether or not, 0 or 1, it’s the weekend). Weekend and weeknight are not headers, just implied by the code
Use this when you want to compare the coded group (TOI) against the others
@@@ Row (one person- categorical whole)-> their categorical V and numerical V. Either one can contain the TOI; the other contrasts/ tests the TOI
Unstacked data
Every single COLUMN lists the values for one group of the categorical variable (the group names are the headers), separate from each other; a row no longer combines the variables IOT represent EACH categorical whole/ person (one CW = certain variables)
Each different number given represents a different data value (weekend (8, 8) and weeknights (7, 6, 8, 7, 7))
It’s basically the same OVERALL representation as stacked, but stacked just reads the categorical variable like a neat little sentence. Unstacked doesn’t frame the TOI as well/ efficiently to really see if the person is being discriminated against in comparison (in stacked, other ppl’s circumstances are displayed neatly next to your TOI)
@@@ the TOI variable, either C or N, is separated out in the headers
The reason for unstacked and stacked formats
Depending on the type of problem, explore the data through different formats (some may present the data more clearly and usefully for your research than others)
Unstacked is basically the same OVERALL representation as stacked, but stacked just reads the categorical variables like a neat little sentence. Unstacked doesn’t frame the TOI as well/ efficiently to really see if the person is being discriminated against in comparison (in stacked, other ppl’s circumstances are displayed neatly next to your TOI)
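A quick sketch (Python, not part of the original notes) of the same weekend/weeknight numbers in both formats. Assumptions: the values are the Time column, and the code rule is 1 = weekend, 0 = weeknight; the function names are made up for illustration.

```python
# Stacked: one row per observation -> (time value, weekend code)
# Assumed code rule: 1 = weekend, 0 = weeknight
stacked = [
    (8, 1), (8, 1),
    (7, 0), (6, 0), (8, 0), (7, 0), (7, 0),
]

def unstack(rows):
    """Turn stacked rows into one column (list) per group."""
    columns = {"weekend": [], "weeknight": []}
    for value, code in rows:
        key = "weekend" if code == 1 else "weeknight"
        columns[key].append(value)
    return columns

def stack(columns):
    """Turn unstacked columns back into (value, code) rows."""
    rows = [(value, 1) for value in columns["weekend"]]
    rows += [(value, 0) for value in columns["weeknight"]]
    return rows

unstacked = unstack(stacked)
print(unstacked)
```

Both formats hold the same seven values; only the layout changes.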
Categorical variables (qualitative- describe qualities/ categories)
Specific DESCRIPTION
Independent variable (but not rlly- this is 4 memory lol)
EX: race, gender, the headers of the chart that have WORDS below them
BEWARE- sometimes they are numbers- weight is numerical for one person (doesn’t change in moment), but weighing 10 people, depending on the group of ppl you pick, your @@@ RANKING (category) in the group could be different (variability changes)
Numerical variables (quantitative/ quantity or how much of the objects of interest)
Characteristics that are measured, counted, or calculated (number-based variables)
Dependent variable/ measured outcome
(not description)
EX: test scores, height, headers that have NUMBERS below them
Helpful tip
Categorical Vs are classified by names and labels given to it (qualities)
Numerical Vs are QUANTIFIED (number data), THEN classified by labels/ CATEGORICAL variables (still numerical Vs nonetheless because keyword MEASURED)
ONLY categorical Vs can be coded
EX: number of aunts- “3” can’t be coded
If you QUALIFY it-> “If you have over 2 aunts” header (variable)- categorical, so it can be coded (0 or 1)
Coded data
It is categorical even if it has numbers (0s and 1s- no and yes RESPECTIVELY); the numbers are categories and have a RULE for what each one represents based on the HEADER VARIABLE
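A small sketch (Python) of coding the aunts example. The list of answers is made up for illustration, and the function name is hypothetical; the rule comes from the header "over 2 aunts".

```python
# Hypothetical answers to "how many aunts do you have?" (numerical data)
aunts = [3, 0, 5, 2, 1]

def code_over_2_aunts(count):
    """Rule from the header 'has over 2 aunts': 1 = yes, 0 = no."""
    return 1 if count > 2 else 0

# The coded column is categorical even though it holds numbers
coded = [code_over_2_aunts(n) for n in aunts]
print(coded)  # [1, 0, 1, 0, 0]
```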
Decimal to percent
Multiply the decimal by 100 (move the decimal point right twice)
A percent is just a fraction (part over whole), but it literally reads as “per 100”: divide part by whole to get the decimal first, then convert to a percent
Percent equation
Part/ whole -> decimal-> percent
Percent is always out of 100-> 14/100 -> 0.14-> 14%
Decimal -> percent: move the decimal point right twice
Percent -> decimal: move it left twice
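The part -> decimal -> percent chain can be sketched like this (Python; function names are made up for illustration, and results are rounded because of floating-point noise):

```python
def decimal_to_percent(decimal):
    """Move the decimal point right twice (multiply by 100)."""
    return decimal * 100

def percent_to_decimal(percent):
    """Move the decimal point left twice (divide by 100)."""
    return percent / 100

def part_whole_to_percent(part, whole):
    """Part / whole -> decimal -> percent."""
    return decimal_to_percent(part / whole)

print(round(part_whole_to_percent(14, 100), 2))  # 14.0
print(round(percent_to_decimal(14), 2))          # 0.14
```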
Finding percent amount of a whole
15% of 400
400(0.15) = 60
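Same calculation as a sketch in Python (the function name is made up; the answer is rounded to avoid floating-point noise):

```python
def percent_of(percent, whole):
    """Find the part: turn the percent into a decimal, then multiply by the whole."""
    decimal = percent / 100
    return whole * decimal

print(round(percent_of(15, 400), 2))  # 60.0
```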
Find percent
35 is what percent of 400
35/400 = 0.0875 -> 8.75%
OR 35/400 = x/100, cross multiply: 400x = 35 x 100 = 3,500, so x = 8.75 (8.75%)
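The cross-multiply step as a sketch in Python (function name is made up for illustration):

```python
def what_percent(part, whole):
    """Solve part/whole = x/100 for x by cross multiplying."""
    # whole * x = part * 100  ->  x = part * 100 / whole
    return part * 100 / whole

print(what_percent(35, 400))  # 8.75
```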
Looking at proportion context
- Identify variables. How MANY and what are they classified by? (Usually once you focus in on names of CVs, identify part and whole faster- things that have a more specific category than other)
Variables: unmarried, married (can both be ICV), total births
- Identify numbers
Numbers: INTEGER CV, percent and whole
Then:
Is % + ICV there but not total? = find the WHOLE: set ICV / whole equal to the given % (as a decimal), so whole = ICV / decimal
% and whole but no ICV? = find the ICV: set ICV / whole equal to the given % (as a decimal), so ICV = whole x decimal
Part and whole given for one CV, but they’re asking for the other CV’s %? = subtract the part from the whole, then use the difference as the Part out of the given whole and solve for %
Are any of the CVs in integer form/ NUMBER value (ICV)? = find its percent form
- always cancel out the denominator
IMP! When in an equation, make the percent a DECI
The total includes all CVs (according to TOI- all is all when it says so)
(Biggest #)
ALWAYS notice the year difference of when data arose.
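The three "what's missing?" cases can be sketched like this (Python). The function names are made up, and the numbers just reuse the 15% of 400 and 35 out of 400 examples from the percent cards, not real birth data.

```python
def find_whole(part, percent):
    """% and part (ICV) given, no total: whole = part / decimal."""
    return part / (percent / 100)

def find_part(percent, whole):
    """% and whole given, no part (ICV): part = whole * decimal."""
    return whole * (percent / 100)

def other_category_percent(part, whole):
    """Part and whole given for one CV, asked for the other CV's %."""
    other_part = whole - part          # subtract the part from the whole
    return other_part * 100 / whole    # difference as the part out of the whole

print(round(find_whole(60, 15), 2))             # 400.0
print(round(find_part(15, 400), 2))             # 60.0
print(round(other_category_percent(35, 400), 2))  # 91.25
```

In every case the percent goes into the equation as a decimal first.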
Categorical data is not categorical variable
Categorical variables give categorical values
EX: unmarried births can be one of the values in the category “births in US”
But the numerical values of this categorical statistic are technically categorical DATA because it is NOTICEABLY attached to a TOI.
@@@ EX: numerical data is only given importance when attached to a categorical value that will aid in a research/ TOI
Rates
If the rate DATA says “birth rate for teens is 18.8 per 1,000”, they want us to EXPECT about 18-19 teen moms for every 1,000 teens
Changed CV- prison pop (will be in whole form before we turn it into a rate) (sampled variable)
“per”- average joe variable
CW/ sampled whole- US pop (make sure it is the TOI/ what the question specifically asks for)
Describes how many people (variables) are LIKELY to get their categorical label (residents) switched to another categorical variable of INTEREST (prisoners) in a whole (all of US/ CW)
EX: estimates within how many average joes in US, are there gonna be teachers. (Describes population of a SPECIFIED TOI category through direct comparison (times 1,000) to the categorical whole)
When people are involved in proportions.
Rate: 18.8 per 1,000
Decimal: 0.0188 (divide 18.8 by 1,000) -> Percent: 1.88%
To FIND RATE:
Identify the changed CV (not the categorical whole, but the prison population) -> (prison pop/ US pop) x reference #
* CHECK what CV comes after “per” (US pop VS US adult pop)
* why is 1,000 the reference number?
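Finding a rate, sketched in Python. The function name is made up and the counts are hypothetical numbers picked only so the answer lands on 18.8 per 1,000; they are not real data.

```python
def rate_per(count, whole, per=1000):
    """Rate = (changed CV count / categorical whole) * reference number."""
    return count / whole * per

# Hypothetical: 188 people in the category of interest out of a whole of 10,000
print(round(rate_per(188, 10_000), 1))  # 18.8
```

Always check which whole comes after "per" before plugging it in (US pop vs US adult pop gives different rates).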
Rate is given (and CW defin.)
Turn the rate into a decimal (rate number / “per” number), e.g. 18.8 / 1,000 = 0.0188
Given categorical whole? (the whole in equation but its the SAMPLED whole- US pop)
The given rate applies to this whole
Use the rate in DECIMAL format (not “per”) so it fits the regular “% and whole, no part” setup as the “given %”, except it is the “given deci.” instead.
*this isn’t the same equation as FINDING the rate (multiply by 1,000); it’s the opposite (dividing) because we’re DISMANTLING the rate to get the categorical VARIABLE (not the whole)
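Going the other way (rate given, find the count), sketched in Python. Function names are made up, and the whole of 10,000 teens is a hypothetical number for illustration only.

```python
def rate_to_decimal(rate, per=1000):
    """Dismantle the rate: divide by the 'per' number."""
    return rate / per

def expected_count(rate, whole, per=1000):
    """Same as '% and whole, no part', using the rate as a decimal."""
    return rate_to_decimal(rate, per) * whole

decimal = rate_to_decimal(18.8)
print(round(decimal, 4))        # 0.0188
print(round(decimal * 100, 2))  # 1.88 (percent)

# Hypothetical whole of 10,000 teens
print(round(expected_count(18.8, 10_000), 1))  # 188.0
```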