WEEK 3 Flashcards
INDEXING
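In R, we can use a logical vector built from one vector to index another vector of the same length, for example to pick out the state names whose murder rate meets a condition.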
INDEXING EXAMPLE PROGRAM
murders$rate <- murders$total / murders$population * 100000
index <- murders$rate <= 0.71
murders$state[index]
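The same indexing steps can be tried without the murders data. This is a minimal sketch on a made-up table: the data frame toy, its state names, and its numbers are all hypothetical and only there to make the example runnable.

```r
# Hypothetical toy data (not the real murders table)
toy <- data.frame(
  state      = c("Alpha", "Beta", "Gamma", "Delta"),
  population = c(1000000, 2000000, 500000, 4000000),
  total      = c(5, 40, 3, 20),
  stringsAsFactors = FALSE
)

# Rate per 100,000 people, computed element by element
toy$rate <- toy$total / toy$population * 100000

# Logical vector: TRUE where the rate is at most 0.71
index <- toy$rate <= 0.71

# Use the logical vector to index a different vector of the same length
low_states <- toy$state[index]
print(low_states)
#> [1] "Alpha" "Gamma" "Delta"
```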
THE SUM FUNCTION
The function sum returns the sum of the entries of a vector. Logical vectors get coerced to numeric, with TRUE coded as 1 and FALSE as 0.
Thus we can count the states with a rate of 0.71 or less using:
sum(murders$rate <= 0.71)
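To see the coercion itself, here is a small sketch on a hypothetical vector of rates (the numbers are invented for illustration):

```r
# Hypothetical rates, made up for this example
rate <- c(0.5, 2.0, 0.6, 0.5)

# Logical vector: TRUE where the condition holds
low <- rate <= 0.71

# Coercion to numeric: TRUE becomes 1, FALSE becomes 0
as_numbers <- as.numeric(low)
print(as_numbers)
#> [1] 1 0 1 1

# Summing the logical vector therefore counts the TRUE entries
n_low <- sum(low)
print(n_low)
#> [1] 3
```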
LOGICAL OPERATOR PROGRAMMING EXAMPLE
west <- murders$region == "West"
safe <- murders$rate < 1
index <- west & safe
murders$state[index]
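A sketch of how & combines two logical vectors entry by entry, with | shown for contrast. The vectors below are hypothetical and only mimic the shape of the murders columns.

```r
# Hypothetical data, made up for this example
state  <- c("Alpha", "Beta", "Gamma", "Delta")
region <- c("West", "South", "West", "Northeast")
rate   <- c(0.8, 0.5, 2.1, 1.9)

west <- region == "West"   # TRUE FALSE TRUE FALSE
safe <- rate < 1           # TRUE TRUE FALSE FALSE

# & is TRUE only where both conditions are TRUE
both <- west & safe
print(state[both])
#> [1] "Alpha"

# | is TRUE where at least one condition is TRUE
either <- west | safe
print(state[either])
#> [1] "Alpha" "Beta"  "Gamma"
```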
WHICH FUNCTION
The which function converts a logical vector into the indexes of its TRUE entries, which helps us look up a specific entry.
example
index <- which(murders$state == "California")
murders$rate[index]
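A minimal sketch of what which returns, using hypothetical vectors (names and values are invented):

```r
# Hypothetical data, made up for this example
state <- c("Alpha", "Beta", "Gamma")
rate  <- c(1.2, 3.4, 0.9)

# Logical vector with one entry per state
is_beta <- state == "Beta"   # FALSE TRUE FALSE

# which() keeps only the positions of the TRUE entries
index <- which(is_beta)
print(index)
#> [1] 2

# The numeric index selects the matching entry of another vector
beta_rate <- rate[index]
print(beta_rate)
#> [1] 3.4
```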
MATCH
This function tells us which indexes of a second vector match each of the entries of a first vector.
example
index <- match(c("California", "New York", "Florida"), murders$state)
index
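A small sketch of match on hypothetical vectors. It also shows that an entry with no match comes back as NA.

```r
# Hypothetical data, made up for this example
state  <- c("Alpha", "Beta", "Gamma", "Delta")
wanted <- c("Gamma", "Alpha", "Omega")

# For each entry of wanted, the position where it appears in state
index <- match(wanted, state)
print(index)
#> [1]  3  1 NA

# Indexing with the result returns the entries in the order of wanted
found <- state[index]
print(found)
#> [1] "Gamma" "Alpha" NA
```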
%in%
If rather than an index we want a logical that tells us whether or not each element of a first vector is in a second, we can use the operator %in%.
c("Boston", "Dakota", "Washington") %in% murders$state
#> [1] FALSE FALSE TRUE
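A sketch contrasting %in% with match on the same hypothetical vectors: %in% answers yes or no for each entry of the first vector, while match gives the position.

```r
# Hypothetical data, made up for this example
state  <- c("Alpha", "Beta", "Gamma", "Delta")
wanted <- c("Gamma", "Alpha", "Omega")

# One TRUE/FALSE per entry of wanted
present <- wanted %in% state
print(present)
#> [1]  TRUE  TRUE FALSE

# match() reports positions instead, with NA for entries that are absent
positions <- match(wanted, state)
print(positions)
#> [1]  3  1 NA
```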
PLOT
The plot function can be used to make scatterplots.
EXAMPLE
x <- murders$population / 10^6
y <- murders$total
plot(x, y)
ALSO
with(murders, plot(population / 10^6, total))
HISTOGRAM
Histograms are a powerful graphical summary of a list of numbers that gives you a general overview of the values you have.
hist(murders$rate)
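A minimal sketch of hist on a hypothetical vector of rates. The breaks, title, and axis label are choices made for this example, not part of the original card. Besides drawing the plot, hist returns an object whose counts entry holds the number of values in each bin.

```r
# Hypothetical rates, made up for this example
rate <- c(0.5, 0.6, 0.8, 1.2, 1.4, 2.0, 2.1, 3.5)

# Draw the histogram with bins of width 1 and keep the returned object
h <- hist(rate,
          breaks = seq(0, 4, by = 1),
          main = "Toy rates",
          xlab = "Rate per 100,000")

# Bin counts for [0,1], (1,2], (2,3], (3,4]
print(h$counts)
#> [1] 3 3 1 1
```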
BOXPLOT
Boxplots provide a more terse summary than histograms, but they are easier to stack with other boxplots.
murders$rate <- with(murders, total / population * 100000)
boxplot(rate~region, data = murders)
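To see what the formula rate ~ region does, here is a sketch on a hypothetical table with two regions. boxplot draws one box per group and also returns the summary it plotted; row 3 of stats holds each group's median.

```r
# Hypothetical data, made up for this example
toy <- data.frame(
  region = c("West", "West", "West", "South", "South", "South"),
  rate   = c(0.5, 1.0, 1.5, 2.0, 3.0, 4.0)
)

# One box per region, stacked side by side in a single plot
b <- boxplot(rate ~ region, data = toy)

# Groups are ordered alphabetically
print(b$names)
#> [1] "South" "West"

# Median of each group, in the same order
medians <- b$stats[3, ]
print(medians)
#> [1] 3 1
```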
DPLYR
library(dplyr)
MUTATE FUNCTION
This function is used to change the data table by adding new columns or modifying existing ones. It does not add rows.
FILTER FUNCTION
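This is used to filter the data: it keeps only the rows that satisfy a logical condition, for example filter(murders, rate <= 0.71).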
How to select a specific column in a data table?
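By using the select function, which takes the data table followed by the names of the columns to keep, for example select(murders, state, region, rate).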
EXAMPLE FOR MUTATE - ADD A NEW COLUMN CALLED RATE IN MURDERS DATA TABLE
murders <- mutate(murders, rate = total / population * 100000)
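The three dplyr functions from the cards above (mutate, filter, select) can be chained. This is a minimal sketch that assumes the dplyr package is installed; the table toy and all of its values are hypothetical.

```r
library(dplyr)

# Hypothetical toy data (not the real murders table)
toy <- data.frame(
  state      = c("Alpha", "Beta", "Gamma", "Delta"),
  region     = c("West", "South", "West", "Northeast"),
  population = c(1000000, 2000000, 500000, 4000000),
  total      = c(5, 40, 3, 20),
  stringsAsFactors = FALSE
)

# mutate: add a rate column computed from existing columns
toy <- mutate(toy, rate = total / population * 100000)

# filter: keep only the rows where the condition is TRUE
low <- filter(toy, rate <= 0.71)

# select: keep only the named columns
result <- select(low, state, region, rate)
print(result$state)
#> [1] "Alpha" "Gamma" "Delta"

# The same steps written with the pipe, which passes each result
# on as the first argument of the next function
piped <- toy %>%
  filter(rate <= 0.71) %>%
  select(state, region, rate)
```

Both versions give the same table. The pipe form reads top to bottom in the order the steps are applied.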