Lab 2: Means and Beyond: More Descriptive Statistics, Defining and Making Your Own Functions That Can Be Reused, and Using R Packages + Lab 3: Confidence Interval of a Proportion Flashcards
Can the letter c be used both as a built-in R function and object?
YES
What determines whether the letter c is used as a built-in R function or as an object?
Its location in the code!
- If it precedes the arrow, it's being defined as an object
- If it comes after the arrow, it's being used as a built-in R function
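A minimal sketch of both uses on one line; the values are made up for illustration.

```r
# Left of the arrow: c is the name of the object being defined.
# Right of the arrow: c( ) is the built-in function that combines values.
c <- c(2, 4, 6)

c          # the object: 2 4 6
c(c, 8)    # the function still works when the name is followed by ( ): 2 4 6 8
```

Reusing the name of a built-in function for an object works, but a different object name is usually easier to read.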
If you define an object and then define it again with the same name, what will happen? How would this affect the mean/median?
- The object will be overwritten and will now contain the new data set.
- The mean/median will change to be representative of the new data set
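A short sketch of an object being overwritten; the object name `x` and both data sets are invented for the example.

```r
x <- c(1, 2, 3, 4, 100)   # first definition
mean(x)                   # 22
median(x)                 # 3

x <- c(10, 20, 30)        # same name, new data: the old values are gone
mean(x)                   # 20
median(x)                 # 20
```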
An object is always determined by what?
Its newest definition
What are two ways to enter a positive exponent in R?
Using either the caret (e.g., 10^5) or e notation (1e+5)
What are two ways to enter a negative exponent in R?
Using either the caret (e.g., 10^-5) or e notation (1e-5)
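The two notations can be compared directly in R. This is only a sketch; the object names are arbitrary.

```r
big_caret   <- 10^5    # caret notation, positive exponent
big_e       <- 1e+5    # e notation, positive exponent
small_caret <- 10^-5   # caret notation, negative exponent
small_e     <- 1e-5    # e notation, negative exponent

all.equal(big_caret, big_e)       # both mean 100000
all.equal(small_caret, small_e)   # both mean 0.00001
```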
If there is no built-in R function for performing certain statistical analyses, what can you do?
Either define your own R function or download R packages, which are bundles of reusable code written by other people.
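As a sketch of the first option, here is a small user-defined function. The name `std_error` and the sample data are hypothetical and are not taken from the lab.

```r
# Standard error of the mean: standard deviation divided by sqrt(sample size)
std_error <- function(x) {
  sd(x) / sqrt(length(x))
}

heights <- c(2, 4, 4, 4, 5, 5, 7, 9)   # made-up data
std_error(heights)                     # the function can be reused on any numeric vector
```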
What is another good way to think of an R package?
Like an extra toolbox that contains a variety of tools to add to the built-in tools that come with base R
Do people usually upload the packages they developed and share them with all R users?
YES
What are the two ways in which we can install an R package?
Type in the R command or use the pull-down menu
What is the function you should use to install an R package?
install.packages("")
- With a lowercase i, packages being plural, and the package name inside the quotation marks
What are the “CRAN mirrors?”
Mirror sites with the repositories of the package
What will happen if you try to install a package into base R using command lines and what do you have to do?
The “CRAN mirrors” window will pop up and you’ll have to scroll down and highlight “USA (PA 1)” to install.
What will happen if you try to install a package in RStudio using command lines and what do you have to do?
The “CRAN mirrors” window will NOT pop up and you won’t have to do anything else, as the package will automatically be installed after hitting enter/return.
How do you install an R package in base R, using the pull-down menu?
Follow the selection: “Packages” -> “Install Package(s)” -> select “raster”
How do you install an R package in RStudio, using the “Packages” tab?
- There is a tab named “Packages” in the lower-right quadrant/panel, click it
- Click “Install”
- In the pop-up window, type “raster” in the space below the sentence that reads “Packages (separate multiple with space or comma)”
- Click “Install”
What function must you use to activate a newly installed R package?
library(name of package) with a lowercase l; quotation marks are not needed around the package name
If you stay in the same R session, do you need to reactivate the newly installed R package before using it again?
NO
If you close and restart a new R session, do you need to reactivate the newly installed R package before attempting to use it again?
YES
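The install-once, load-every-session pattern can be sketched as follows. The install line is commented out so that running the block does not download anything; "psych" is used only because it is named elsewhere in this set.

```r
# Run once per computer; the package name goes inside quotation marks.
# install.packages("psych")

# Check whether the package is available, then load it.
# library( ) has to be run again in every new R session.
psych_installed <- requireNamespace("psych", quietly = TRUE)
if (psych_installed) {
  library(psych)
}
```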
What package are the geometric and harmonic mean functions found in?
The “psych” package
What package is the coefficient of variation function found in?
The “raster” package
What function would you use to find the geometric mean?
geometric.mean( )
What function would you use to find the harmonic mean?
harmonic.mean( )
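For reference, both means can also be worked out in base R from their textbook definitions. This is a sketch with made-up data; the package calls are only run if "psych" happens to be installed.

```r
x <- c(1, 2, 4, 8)

geo_by_hand  <- exp(mean(log(x)))   # geometric mean: nth root of the product
harm_by_hand <- 1 / mean(1 / x)     # harmonic mean: reciprocal of the mean reciprocal

geo_by_hand
harm_by_hand

# The same values from the package functions, if the package is available
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::geometric.mean(x))
  print(psych::harmonic.mean(x))
}
```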
What function would you use to find the coefficient of variation (CV)?
cv( )
What does the function “cv( )” from the package “raster” actually return? As a result, how should it be reported?
It returns the CV expressed as a percentage (100 × SD/mean), so it should be reported either with a % sign or, after dividing by 100, as a decimal
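A base R sketch of the same quantity, assuming the usual definition of the CV as the standard deviation divided by the mean. The data are made up, and the package call runs only if "raster" is installed.

```r
x <- c(10, 12, 14, 16, 18)

cv_decimal <- sd(x) / mean(x)    # CV as a decimal (proportion)
cv_percent <- 100 * cv_decimal   # CV as a percentage

cv_decimal
cv_percent

# For comparison with the package function, if available
if (requireNamespace("raster", quietly = TRUE)) {
  print(raster::cv(x))
}
```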
What function would you use to determine the MAD of a data set?
> mad(object,constant= )
When trying to determine the MAD of a data set using the command line “mad(object, constant = )”, what must you specify the constant as if your data set is SMALL?
1
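A sketch of what the constant does, using made-up data. With constant = 1 the result is the plain median of the absolute deviations from the median; R's default constant is 1.4826, which rescales that value.

```r
x <- c(2, 4, 4, 5, 7, 9, 30)

mad_raw     <- mad(x, constant = 1)   # median absolute deviation, unscaled
mad_default <- mad(x)                 # same value multiplied by 1.4826

mad_raw
median(abs(x - median(x)))            # the same calculation written out by hand
mad_default
```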
What are the two ways in which you can find the interquartile range of a data set in R?
Use the information from the 5-number summary or use the built-in R function, “IQR( )”
When trying to determine the interquartile range of a data set and the data set is small, what may happen and what do we do to avoid this?
When the data set is small, different methods may give us different results, so we must specify what method using the argument “type=”
When trying to determine the interquartile range of a data set, if you don’t specify what method to use with the argument “type=”, which one will R choose by default?
“type=7”
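A small illustration of how the “type=” argument can change the answer for a small data set. The data are made up, and type 6 is chosen only as one example of an alternative method.

```r
x <- c(1, 3, 4, 8, 10, 15)

iqr_default <- IQR(x)             # no type given, so R uses type = 7
iqr_type7   <- IQR(x, type = 7)   # identical to the default
iqr_type6   <- IQR(x, type = 6)   # a different quantile method

# The same default result from the quartiles themselves (Q3 minus Q1)
quartiles     <- quantile(x, c(0.25, 0.75))
iqr_from_quar <- unname(quartiles[2] - quartiles[1])

c(iqr_default, iqr_type7, iqr_type6, iqr_from_quar)
```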
When the confidence level is 95%, the z score is?
1.96
What does w stand for?
The margin of error
How do we find the upper CL?
p’+w
How do we find the lower CL?
p’-w
What are the 4 ways in which you could find the CI of a proportion in R?
Performing step-by-step calculations in R, the Modified Wald Method, Standard Wald Method, Exact Method
Out of the 4 ways to calculate the CI of a proportion in R, which is the most accurate and which is the easiest/hardest to do by hand?
- The Standard Wald Method is easiest to compute by hand
- The Exact Method is hardest to do by hand
- The Modified Wald Method is not much harder to do than the Standard Wald Method, but it’s MUCH more accurate
Are the CIs calculated using the Modified Wald Method and the Standard Wald Method going to be the same?
NO, they’ll be slightly different
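A step-by-step sketch comparing the two methods. It assumes the usual textbook formulas: the Standard Wald Method uses p = s/n, while the Modified Wald Method adds 2 successes and 2 failures, so p' = (s + 2)/(n + 4). The values s = 16 and n = 565 are borrowed from the add4ci example in this set; the object names are arbitrary, and the lab's own code may differ.

```r
s <- 16      # number of successes
n <- 565     # number of trials
z <- 1.96    # z score for a 95% confidence level

# Standard Wald Method
p_std     <- s / n
w_std     <- z * sqrt(p_std * (1 - p_std) / n)
lower_std <- p_std - w_std
upper_std <- p_std + w_std

# Modified Wald Method
pprime    <- (s + 2) / (n + 4)
w         <- z * sqrt(pprime * (1 - pprime) / (n + 4))
lower_mod <- pprime - w
upper_mod <- pprime + w

c(lower_std, upper_std)   # standard Wald limits
c(lower_mod, upper_mod)   # modified Wald limits, slightly different
```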
Which R function is used for the Modified Wald Method?
add4ci( )
Which R function is used for the Standard Wald Method?
addz2ci( )
Which R function is used for the Exact Method?
exactci( )
What is the disadvantage of using the Exact Method?
It can’t be easily computed by hand and calculates a CI that is sometimes wider than necessary
If we increase the confidence level, what will happen to the CI?
It will become wider
If we decrease the confidence level, what will happen to the CI?
It will become narrower
What is the relationship between the confidence level and the CI?
It’s a direct relationship: as the confidence level increases, the CI becomes wider, and as the confidence level decreases, the CI becomes narrower.
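The relationship can be checked numerically. This sketch reuses the Modified Wald formula p' = (s + 2)/(n + 4) as an assumption, takes s = 16 and n = 565 from the add4ci example in this set, and gets each z score from the built-in function qnorm( ).

```r
s <- 16
n <- 565
conf_levels <- c(0.90, 0.95, 0.99)

# Two-sided z score for each confidence level (about 1.96 for 95%)
z <- qnorm(1 - (1 - conf_levels) / 2)

pprime <- (s + 2) / (n + 4)
w      <- z * sqrt(pprime * (1 - pprime) / (n + 4))   # one margin of error per level
width  <- 2 * w                                       # full width of each CI

data.frame(conf_levels, z, lower = pprime - w, upper = pprime + w, width)
```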
When should the Rule of Three be used?
When the numerator in s/n AKA the number of successes is 0
When should the Rule of Five be used?
When the numerator in s/n AKA the number of successes is 1
When should the Rule of Seven be used?
When the numerator in s/n AKA the number of successes is 2
When you apply the Rule of Three, Five, or Seven, what is always the lower confidence limit?
0
When can the Rule of Three, Five, and Seven be applied?
When the number of successes is very small and the sample size is large.
You can only take shortcuts like the Rule of Three, Five, and Seven to estimate the CI when the confidence level is what?
95%
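The three rules can be sketched as one small function, assuming the usual form of the shortcut: the upper 95% limit is 3/n, 5/n, or 7/n for 0, 1, or 2 successes, and the lower limit is 0. The function name `rule_upper` and the sample size are hypothetical.

```r
# Approximate upper 95% confidence limit for 0, 1, or 2 successes out of n
rule_upper <- function(s, n) {
  stopifnot(s %in% 0:2)         # the shortcut only covers 0, 1, or 2 successes
  numerators <- c(3, 5, 7)      # Rule of Three, Five, and Seven
  numerators[s + 1] / n
}

n <- 100
rule_upper(0, n)   # Rule of Three
rule_upper(1, n)   # Rule of Five
rule_upper(2, n)   # Rule of Seven
```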
What does every part of this code mean: add4ci(x = 16, n = 565, conf.level = 0.95)?
- Function “add4ci” is for the Modified Wald Method
- x is the number of successes (s); this may change depending on which method is being used
- n is the number of experiments (n)
- conf.level is the confidence level, expressed as a proportion (0.95 for 95%)
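A sketch of running that call with its named arguments. It assumes the “PropCIs” package has already been installed; the call is skipped otherwise.

```r
has_propcis <- requireNamespace("PropCIs", quietly = TRUE)

if (has_propcis) {
  # x = successes, n = number of experiments, conf.level = confidence level as a proportion
  result <- PropCIs::add4ci(x = 16, n = 565, conf.level = 0.95)
  print(result)
}
```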
The functions for the Modified Wald Method, Standard Wald Method, and Exact Method are found under what packages?
“PropCIs” and “devtools”
What line of code can be used to set p’ after defining s and n when trying to determine the CI of a proportion ?
> pprime
What line of code can be used to set w after defining pprime(p’), s, and n when trying to determine the CI of a proportion?
> w
The result from the Rule of Three is closer to the result of which method?
Exact Method