Week 8 Flashcards
For continuous variables, which information do we want to look at: M (SD) or N (%)?
Mean and Standard Deviation.
With fixed effects in a random intercept model, what are you estimating between X and Y?
The overall association between X and Y for the WHOLE sample, without allowing the strength or direction of the effect/slope to vary randomly by person.
If you have a fixed effect in a random intercept model, are you assuming the effect of X on Y will be different or the same for everybody?
The same
If you have a random intercept with a fixed effect, and say Y is stress and X is energy, what is the random intercept actually showing?
That the mean level of stress varies between persons (the intercept is random), while the overall effect of energy on stress is fixed for the whole sample and does not vary between persons.
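One common way to write this kind of model, shown only as a sketch: the notation below (i for days, j for persons, the Greek symbols) is an assumption and does not come from the cards.

```latex
\begin{aligned}
\text{stress}_{ij} &= \beta_{0j} + \beta_{1}\,\text{energy}_{ij} + e_{ij} \\
\beta_{0j} &= \gamma_{00} + u_{0j}
\end{aligned}
```

The intercept \beta_{0j} carries a person subscript j, so each person gets their own mean level of stress. The slope \beta_{1} has no person subscript, so the effect of energy is the same for everybody.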
The Int_Fri variable contains information on whether a person interacted with a friend on a day. One = they did, Zero = they didn’t.
Does the 0 and 1 column here tell us that 64% of people interacted with a friend on any day?
No. It tells us that 64% of the observations (days) in the survey featured an interaction with a friend.
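A small Python sketch of why the two percentages differ. The data and variable names are hypothetical and only illustrate the idea.

```python
# Hypothetical long-format diary data: one (person_id, int_fri) pair per day.
rows = [
    (1, 1), (1, 1), (1, 1), (1, 0),
    (2, 1), (2, 0),
    (3, 0), (3, 0),
]

# Share of observations (days) that featured a friend interaction.
share_of_days = sum(value for _, value in rows) / len(rows)

# Share of people who interacted with a friend on at least one day.
ever_interacted = {}
for person_id, value in rows:
    ever_interacted[person_id] = max(ever_interacted.get(person_id, 0), value)
share_of_people = sum(ever_interacted.values()) / len(ever_interacted)

print(share_of_days)
print(share_of_people)
```

The first number describes days and the second describes people. The 0/1 column in a long-format table only gives the first.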
In the Bstress variable, or between stress, the stress scores were first averaged across days within each person
(e.g., ID 10 scored 3, 2, and 5 across three days; the sum, 10, is divided by 3, giving about 3.33).
THEN that score is ...?
Averaged across the sample for all people.
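The two averaging steps can be sketched in Python as below. Only ID 10's scores come from the card; the other IDs and scores are made up for illustration.

```python
from statistics import mean

# Daily stress scores per person ID. ID 10 matches the card; 11 and 12 are hypothetical.
daily_stress = {
    10: [3, 2, 5],
    11: [4, 4],
    12: [1, 2, 3, 2],
}

# Step 1: average across days within each person (the Bstress value).
bstress = {person_id: mean(scores) for person_id, scores in daily_stress.items()}

# Step 2: average those person-level means across the sample.
sample_mean = mean(list(bstress.values()))

print(bstress[10])
print(sample_mean)
```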
What type of variable is being shown here and how has the data been dealt with?
The between stress variable (Bstress), which is the average score per person across days. For it we only want ONE row of data per person (the "not duplicated" operator is being used).
Why? Because we want to avoid inflating the mean with the individuals who provided more observations.
To ensure that repeated baseline variables (between-person variables, such as biological sex in a study of loneliness, whose value is repeated on every row for a person) do not contribute a higher weighting to the mean, what might we do?
We drop the duplicated IDs by using the egltable code but also data = dm[!duplicated(ID)]
If you have a long-format data set and want to report descriptive statistics on variables that do NOT change across time points, this is one way to include only non-duplicated IDs.
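The same idea as a plain-Python sketch, as an analogue of the R idiom rather than the course's own code. The rows and names are hypothetical.

```python
# Hypothetical long-format rows: (ID, sex, daily stress). Sex repeats on every row.
rows = [
    (1, "F", 3), (1, "F", 4), (1, "F", 2), (1, "F", 5),
    (2, "M", 1),
    (3, "M", 2),
]

# Keep only the first row per ID, mirroring what !duplicated(ID) selects.
seen_ids = set()
first_rows = []
for row in rows:
    if row[0] not in seen_ids:
        seen_ids.add(row[0])
        first_rows.append(row)

# Proportion female counted over all rows versus over one row per person.
share_female_all_rows = sum(r[1] == "F" for r in rows) / len(rows)
share_female_per_person = sum(r[1] == "F" for r in first_rows) / len(first_rows)

print(share_female_all_rows)
print(share_female_per_person)
```

Person 1 filled out four days, so counting every row makes the sample look two-thirds female when only one of three people is.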
When looking at descriptives of any between-subjects variables in multilevel data, you need to ask yourself...?
If the variable does NOT change within a person in a dataset, do you want the mean of all observations?
No, as people with more observations will count more towards the descriptives
Better to use one value per ID (for example, the mean of the observations within each ID).
Will the variance of this WITHIN continuous variable reflect both between and within variance, aka total variance?
No, it is just the mean of individual means, so the variance will be between units only.
In this within continuous variable, are we seeing the mean of individual means (variance between units only) or the mean across all observations (variance both between AND within units, also known as TOTAL variance)?
The mean across all observations! Variance in both between and within units, aka TOTAL VARIANCE.
What could be a problem with taking the mean across ALL observations, not just people?
People who provided more data contribute more to the mean and variance, which can potentially bias the results.
In the mean of individual means, what will the variance be taken from?
Between units only
In the mean across ALL observations, what will the variance be taken from?
Variance will be TOTAL variance taken from both between and within units
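A minimal Python sketch contrasting the two approaches. The scores are hypothetical, and sample variance (statistics.variance) is an assumed choice of estimator.

```python
from statistics import mean, variance

# Hypothetical daily scores; person 3 provided many more observations.
daily = {
    1: [1, 3],
    2: [4, 6],
    3: [6, 8, 7, 7, 6, 8],
}

# Mean of individual means: one value per person, so variance is between units only.
person_means = [mean(scores) for scores in daily.values()]
mean_of_means = mean(person_means)
between_variance = variance(person_means)

# Mean across ALL observations: variance mixes between and within units (total).
all_observations = [score for scores in daily.values() for score in scores]
grand_mean = mean(all_observations)
total_variance = variance(all_observations)

print(mean_of_means, between_variance)
print(grand_mean, total_variance)
```

Here the mean across all observations is pulled toward person 3, who contributed six of the ten rows, while the mean of individual means weights each person equally.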