Week 2 - Basic R, Data Structure/Manipulation Flashcards
What are logical operators used in?
Data management
What do logical operators return in R?
TRUE or FALSE (a logical, i.e. Boolean, value)
To find outliers, find values greater than or less than a specific score, check whether values fall within a certain range, or recode continuous variables into categorical variables (low/medium/high), what would you use?
A logical operator
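As a rough illustration of these uses, the sketch below applies logical operators to a made-up vector of ages. The values and the cut-offs (30, 20, 25) are assumptions chosen for the example, not part of the course data.

```r
# Hypothetical ages, invented for illustration
age <- c(18, 19, 20, 25, 31, 47)

# Flag values above a cut-off (one simple way to spot possible outliers)
is_high <- age > 30

# Check which values fall inside a range (inclusive at both ends)
in_range <- age >= 20 & age <= 25

# Recode a continuous variable into low/medium/high categories
age_group <- ifelse(age < 20, "low",
             ifelse(age <= 25, "medium", "high"))
```

Each comparison returns one TRUE/FALSE per value, which is what makes the recoding with ifelse() possible.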
What might a logical operator look like?
How do operators deal with data - where do they pull from?
Operators take the data on the left-hand side and a few option arguments on the right-hand side.
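A minimal sketch of that left/right layout, using an invented vector of ages (the values are assumptions for the example):

```r
# Hypothetical ages, invented for illustration
age <- c(17, 18, 21)

# Data on the left, the value to compare against on the right
res_eq <- age == 18

# Data on the left, a set of candidate values on the right
res_in <- age %in% c(18, 21)

# Both results are logical vectors, one TRUE/FALSE per element of age
res_class <- class(res_eq)
```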
In an analysis, we often subset data. Why would we do this?
Because subsetting lets us select the portion of the data that meets the criteria relevant to the analysis.
If you wanted to exclude outliers or select participants who meet certain criteria in your data, what task would you use?
Subset the data
True or false: logical operators are commonly used in subsetting to pick specific rows of a dataset or specific values from a single variable?
TRUE
If we wanted to run analyses only on people aged 20 to 25, excluding anyone outside that range, which logical operator would we use in subsetting?
%gele%, because it keeps values greater than or equal to the lower bound and less than or equal to the upper bound (here, 20 and 25).
Here we are working with just a single variable, as Age is the only name to the left of the logical operator. With more than one condition, chain them with & (and) or the vertical bar | (or), e.g. [Age == 18 & Female == 1]
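The range subset can be sketched with data.table alone. This assumes %gele% comes from an add-on package (JWileymisc is assumed here), so the example uses plain comparisons and data.table's between() instead. The data values are invented. Only the column names follow the flashcards.

```r
library(data.table)

# Hypothetical data: column names follow the flashcards, values are invented
d <- data.table(UserID = 1:6,
                Age    = c(18, 20, 22, 25, 26, 30),
                Female = c(1, 0, 1, 0, 1, 0))

# Keep rows with Age from 20 to 25 inclusive, using base comparisons
in_range <- d[Age >= 20 & Age <= 25, .(UserID, Age)]

# between() is inclusive by default and selects the same rows
in_range2 <- d[between(Age, 20, 25), .(UserID, Age)]
```

Both forms keep the boundary ages 20 and 25, which matches the "greater than or equal to, less than or equal to" reading of the answer above.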
What operators do we use to chain conditions together?
& (and) or the single vertical bar | (or)
e.g.
d[Age == 18 & Female == 1, .(UserID, Age)]
What if you had several values that you wanted to test within a data set, what operator could do this?
The %in% operator.
## select anyone whose age is in 18, 19, or 20
d[Age %in% c(18, 19, 20), .(UserID, Age)]
This says: in the variable Age on the left, find anyone whose age is one of 18, 19, or 20.
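A small runnable sketch of the %in% subset, with an invented data set (the values are assumptions, and the column names are taken from the flashcards). It also shows the longer chain of == and | that %in% replaces.

```r
library(data.table)

# Hypothetical data, invented for illustration
d <- data.table(UserID = 1:5, Age = c(17, 18, 19, 20, 21))

# Anyone whose age is one of 18, 19, or 20
teens <- d[Age %in% c(18, 19, 20), .(UserID, Age)]

# Equivalent, longer form with chained ORs
teens_or <- d[Age == 18 | Age == 19 | Age == 20, .(UserID, Age)]
```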
If you had two variables to consider - not just Age, but also Gender as an example, how would you chain these together?
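Using brackets (parentheses) to group each set of conditions.
Say you wanted to see 19-year-old female participants or 18-year-old male participants. The vertical bar | below joins the two groups with OR. Using & between the groups instead would require both to hold at once, which no single row can satisfy here.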
d[(Age == 19 & Female == 1) | (Age == 18 & Female == 0),
.(UserID, Age)]
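To make the grouping concrete, here is a hedged sketch on invented data. It assumes Female == 0 codes male participants, as the example above implies. The second query shows why & between the two groups returns nothing.

```r
library(data.table)

# Hypothetical data, invented for illustration
d <- data.table(UserID = 1:5,
                Age    = c(19, 18, 19, 18, 20),
                Female = c(1, 0, 0, 1, 1))

# 19-year-old females OR 18-year-old males
either <- d[(Age == 19 & Female == 1) | (Age == 18 & Female == 0),
            .(UserID, Age)]

# Joining the groups with & asks for Age to be 19 and 18 at once: no rows
both <- d[(Age == 19 & Female == 1) & (Age == 18 & Female == 0),
          .(UserID, Age)]
```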
What is this operator asking the data to pull out?
d[Age < 20, .(UserID, Age)]
Show anyone under the age of 20.
What is this operator asking the data to pull out?
## anyone age 20 or under
d[Age <= 20, .(UserID, Age)]
Show anyone age 20 or under
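What does the ! operator mean?
NOT: it negates a condition, so it selects who is NOT matched by it, e.g. !(Age == 18) means anyone whose age is not 18.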
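A short sketch of negation in a subset, using invented data (the values are assumptions, and the column names follow the flashcards):

```r
library(data.table)

# Hypothetical data, invented for illustration
d <- data.table(UserID = 1:4, Age = c(18, 19, 20, 21))

# Anyone who is NOT 18 (same result as Age != 18)
not_18 <- d[!(Age == 18), .(UserID, Age)]

# Anyone whose age is NOT in 18 or 19
not_in <- d[!(Age %in% c(18, 19)), .(UserID, Age)]
```

Wrapping the condition in parentheses before applying ! keeps it clear which comparison is being negated.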