CODE Flashcards
Number of successes (binomial)
x
Number of trials (binomial)
size
Probability of success in each trial (binomial)
prob
Number of repeats (binomial)
n
Probability of overall outcome (binomial)
p
Probability density at each point (binomial distribution)
dbinom(x, size, prob)
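A minimal sketch of the binomial arguments above, assuming a fair coin (prob = 0.5) tossed 10 times. The numbers are illustrative only.

```r
# Probability of exactly 3 successes in 10 trials, success probability 0.5
p_exact <- dbinom(x = 3, size = 10, prob = 0.5)
print(p_exact)  # 120/1024 = 0.1171875

# The probabilities of all possible outcomes (0 to 10 successes) sum to 1
p_all <- dbinom(0:10, size = 10, prob = 0.5)
print(sum(p_all))
```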
Is explained by
~
Add line of best fit
abline()
Change data class
as.CLASS(DATASET), e.g. as.numeric(DATASET) or as.factor(DATASET)
Boxplot
boxplot()
Print data classes
class()
Create a new data frame
data.frame()
Print the data in a single column
DATASET$COLUMN
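As a rough illustration of data.frame(), class() and the $ operator, here is a small sketch. The data frame `plants` and its columns are made up for the example.

```r
# Hypothetical data frame with two columns
plants <- data.frame(height = c(10.2, 11.5, 9.8),
                     group  = c("control", "treated", "treated"))

class(plants)         # "data.frame"
class(plants$height)  # "numeric"
plants$height         # prints the values stored in the height column
```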
Changing dataset class
DATACLASS(DATASET$COLUMN)
What does the function describeBy() output?
vars (variable number)
n (number of valid observations)
Mean
SD
Median
Trimmed
Mad
Max
Min
Range
Skew
Kurtosis
SE
Display working directory
getwd()
Install a package into working environment
install.packages()
Load the package into the workspace
library(PACKAGE)
Load a saved working environment
load("FILENAME.Rdata")
Give the regression line
lm()
Print objects loaded into the working environment
ls()
Probability of up to a certain value (binomial)
pbinom(x, size, prob)
Plot a explained by b
plot(a~b)
Probability of up to a certain value (normal distribution)
pnorm(x, mean, sd)
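A short sketch of the two cumulative functions, using assumed values (10 trials at prob = 0.5, and a normal distribution with mean 100 and sd 15).

```r
# P(X <= 3) for 10 trials with success probability 0.5
p_binom <- pbinom(3, size = 10, prob = 0.5)   # 176/1024 = 0.171875

# Same value obtained by summing the exact probabilities from dbinom()
p_sum <- sum(dbinom(0:3, size = 10, prob = 0.5))

# P(X <= 100) for a normal distribution centred on 100: half the area
p_norm <- pnorm(100, mean = 100, sd = 15)     # 0.5
```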
Dataset predictions
predict(object, newdata, interval="confidence")
Set working directory
setwd()
Summarise a dataset
summary(DATASET)
Compare two groups (e.g. treatment and control)
Two sample t test
t.test(a~b, data=dataframe)
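One way to try the two-sample form is with R's built-in sleep dataset, chosen here only as a convenient example (extra = response, group = two-level factor).

```r
# Two-sample t test: is extra sleep explained by drug group?
res_two <- t.test(extra ~ group, data = sleep)
print(res_two)      # full test output
res_two$p.value     # just the p value
```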
Compare one group to a mean (determine if they belong to the same population)
One sample t test
t.test(dataframe$output, mu=population_mean)
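A hedged sketch of the one-sample form, again borrowing the built-in sleep dataset. The population mean of 0 is an assumption made for the example.

```r
# One-sample t test: does the mean of extra differ from an assumed mean of 0?
res_one <- t.test(sleep$extra, mu = 0)
print(res_one)
```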
Calculate the critical value of a binomial distribution at a specific alpha (the smallest value whose cumulative probability is at least 1 - alpha)
(inverse of pbinom)
qbinom(1-alpha, size, prob)
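A small worked example, assuming alpha = 0.05, 10 trials and prob = 0.5. It also checks the result against pbinom() to show the two are inverses.

```r
alpha <- 0.05
crit <- qbinom(1 - alpha, size = 10, prob = 0.5)
print(crit)  # 8

# crit is the smallest value whose cumulative probability reaches 1 - alpha
pbinom(crit, size = 10, prob = 0.5)      # at least 0.95
pbinom(crit - 1, size = 10, prob = 0.5)  # below 0.95
```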
Two-tailed hypothesis test
binom.test(number of successes, size, prob)
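For illustration, a two-tailed test of 7 successes in 10 trials against an assumed null probability of 0.5. In binom.test() the arguments are named x, n and p.

```r
res_binom <- binom.test(x = 7, n = 10, p = 0.5)
print(res_binom)
res_binom$p.value  # 0.34375
```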
How is a t-test made paired?
Add paired=TRUE (or paired=T)
Why would a t test be paired?
Samples are in pairs (e.g. before and after measurements on the same subjects)
What two functions give Q-Q plots to test normal distribution?
qqnorm() and qqline()
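A minimal sketch of a Q-Q plot on simulated data. The sample is generated with rnorm() purely for illustration, so the points should fall close to the line.

```r
set.seed(1)                          # make the simulated sample reproducible
x <- rnorm(50, mean = 10, sd = 2)    # hypothetical sample

qq <- qqnorm(x)   # sample quantiles against theoretical normal quantiles
qqline(x)         # reference line; points near it suggest normality
```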
Run an ANOVA test
aov(data~group, data=data_frame)
Adjust p values using Bonferroni or BH
p.adjust(p, method = "BH")
p.adjust(p, method = "bonferroni")
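A quick sketch with three made-up p values, showing how the two adjustment methods differ.

```r
p <- c(0.01, 0.04, 0.03)   # hypothetical raw p values

p_bonf <- p.adjust(p, method = "bonferroni")  # each p multiplied by 3
p_bh   <- p.adjust(p, method = "BH")          # less conservative adjustment

print(p_bonf)  # 0.03 0.12 0.09
print(p_bh)    # 0.03 0.04 0.04
```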
Apply a particular function to every row/column of data
apply()
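A small example of apply() on a made-up matrix. The second argument picks the direction: 1 for rows, 2 for columns.

```r
m <- matrix(1:6, nrow = 2)   # rows are (1, 3, 5) and (2, 4, 6)

row_sums <- apply(m, 1, sum)  # 9 12
col_sums <- apply(m, 2, sum)  # 3 7 11
```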
How do you select specific columns (e.g. 1 to 5)?
[1:5]
To run an ANOVA what must occur first if all columns are continuous data?
Vectors must be stacked such that the categorical variables are in one column (ind) and the continuous data is in the other (values).
Stack vectors
stack()
Tukey's Honest Significant Difference test
TukeyHSD(fit)
(where fit= aov(data~group))
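The stack, ANOVA and Tukey steps can be sketched together as follows. The data frame `wide` and its values are invented for the example; stack() names its output columns values and ind.

```r
# Hypothetical wide data: one continuous column per group
wide <- data.frame(control = c(4.1, 5.0, 4.6, 4.8),
                   low     = c(5.2, 5.9, 5.5, 6.1),
                   high    = c(6.8, 7.1, 6.5, 7.4))

long <- stack(wide)                     # values = measurements, ind = group
fit <- aov(values ~ ind, data = long)   # one-way ANOVA

summary(fit)     # ANOVA table
TukeyHSD(fit)    # pairwise comparisons between groups
```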
How do you make a t test be one-tailed?
Add:
alternative = "greater" or alternative = "less"
How can you make a t test be paired?
Add:
paired = TRUE (or T)
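A sketch of a paired test and its one-tailed variant, using invented before/after measurements. The values must be listed in the same subject order in both vectors.

```r
# Hypothetical measurements on the same five subjects
before <- c(72, 80, 65, 90, 77)
after  <- c(70, 78, 66, 85, 74)

# Paired, two-tailed
res_paired <- t.test(before, after, paired = TRUE)

# Paired, one-tailed: is before greater than after?
res_greater <- t.test(before, after, paired = TRUE, alternative = "greater")

res_paired$estimate   # mean of the differences, 2.2
```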
How can you check for variance?
describeBy()
Correlation coefficient
cor()
Regression equation (linear)
lm()
Predict values of variable a from variable b, where b=15 and a is explained by b
predict(object, newdata, interval = "confidence")
where:
object = lm(a~b) output
newdata = the dataframe with new data in it, output of: data.frame(b=c(15))
interval = "confidence" = the type of intervals we want to calculate
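To make the three arguments concrete, here is a sketch using the built-in cars dataset, where dist plays the role of a and speed the role of b.

```r
fit <- lm(dist ~ speed, data = cars)    # object: the fitted regression
new <- data.frame(speed = c(15))        # newdata: column name must match b

pred <- predict(fit, newdata = new, interval = "confidence")
print(pred)   # columns: fit (prediction), lwr and upr (interval bounds)
```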
How should you output an ANOVA?
Assign it to an object and use the summary() function on said object
Printing a single value from a vector (produce value 3 from vector d)
d[3]
Produce a single value from a dataset by its row and column
dataframe_name[row_number,column_number]
Produce a single row from a dataset by its number
dataframe_name[row_number, ]
Produce a single column from a dataset by its number
dataframe_name[ ,column_number]
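The indexing forms above, sketched on a made-up vector `d` and data frame `scores`.

```r
d <- c(10, 20, 30, 40)
third <- d[3]                 # 30

scores <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
cell <- scores[2, 1]          # row 2, column 1: 2
one_row <- scores[2, ]        # all columns of row 2 (a one-row data frame)
one_col <- scores[, 2]        # all rows of column 2 (a numeric vector)
```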
Print column names of a dataframe
names()
Print column names and top few rows of a dataframe
head()
Print row names of a dataframe
rownames()
Print the first two rows of a dataframe
head(data_frame, n=2)
What data is given by a summary() function?
Min.
1st Qu.
Median
Mean
3rd Qu.
Max.
Convert a numeric variable into a factor variable
factor(data_frame$variable)
Remember to assign the result back to the column or object name so the change is kept
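For example, converting the numeric cyl column of the built-in mtcars dataset. A copy named `cars_copy` (a name chosen for this sketch) is used so the original stays untouched.

```r
cars_copy <- mtcars
class(cars_copy$cyl)                      # "numeric"

# Assign the result back, otherwise the column stays numeric
cars_copy$cyl <- factor(cars_copy$cyl)

class(cars_copy$cyl)                      # "factor"
levels(cars_copy$cyl)                     # "4" "6" "8"
```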
Produce summary statistics by group
describeBy(a, group=b)
where a is the object to be summarised by b
Calculate the probability of observing an exact number of “successes” (binomial distribution)
dbinom(x, size, prob)
Calculate the probability of observing up to a certain number of “successes” or events (binomial distribution)
pbinom(x, size, prob)
Calculate the probability of observing up to a certain value (normal distribution)
pnorm(x, mean, sd)
Define scalar
Scalar <- object storing a single value
Define strings
Strings <- objects storing words
Define vectors
Vectors <- object holding more than one value
Define dataframe
Dataframe <- objects which hold several sets of values
How do you make a binom.test one tailed?
Add:
alternative = "greater" or alternative = "less"
How might you make a non-numeric variable be treated as a numeric one so as to plot it?
plot(a~as.numeric(b), data=data_frame)
where b is a non-numeric variable
What argument makes plots coloured?
col=
What argument changes the shape of points on a scatter diagram?
pch=
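A sketch combining as.numeric(), col= and pch= in one plot call, using the built-in PlantGrowth dataset, whose group column is a factor. The colour and point shape are arbitrary choices.

```r
# Factor levels ctrl, trt1, trt2 become the numbers 1, 2, 3
group_codes <- as.numeric(PlantGrowth$group)

# Scatter plot with blue, filled-circle points
plot(weight ~ as.numeric(group), data = PlantGrowth,
     col = "blue", pch = 16)
```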
What must be true about the arrangement of data for a paired t test to work?
Must be in the same order in both groups
What test can be done to check if a dataset looks normally distributed?
Q-Q plot
qqnorm()
qqline()
What does the table produced by R in an ANOVA represent?
The first column represents the source of variation, both between groups (1st row – treatment) and within groups (second row – Residuals).
In the R output, which value of a summary(ANOVA) refers to SSW?
Sum Sq
Residuals row
In the R output, which value of a summary(ANOVA) refers to SSB?
Sum Sq
Top row
Which function is used to stack data?
stack()
What function gives the scatterplots for pairs of columns?
pairs(data_frame)
Residuals plot
plot(fit$residuals~fit$fitted.values)
where fit<-lm(a~b, data=data_frame)
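A runnable version of the residuals plot, assuming the built-in cars dataset with dist explained by speed. The dashed zero line is an optional extra.

```r
fit <- lm(dist ~ speed, data = cars)

# Residuals against fitted values; no obvious pattern suggests a good fit
plot(fit$residuals ~ fit$fitted.values)
abline(h = 0, lty = 2)   # reference line at zero
```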