2.Programming with R & Python - Segment 2 [Week 3 & 4 -Data Visualisation and Transformation] Flashcards
What is tidyverse ?
tidyverse is a meta-library (a package that bundles other packages) for the R language. It is a collection of packages intended for data science applications.
How many packages are in the tidyverse library ?
Nine core packages are attached by library(tidyverse) (version 2.0.0 and later):
ggplot2
tibble
forcats
purrr
dplyr
tidyr
stringr
readr
lubridate
How to create a new notebook in RStudio ?
Go to File > New File > R Notebook
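How to create a new chunk in RStudio ?
The default shortcut is Ctrl + Alt + I (Cmd + Option + I on macOS). Ctrl + R works only if 'Insert Chunk' has been remapped to it under Modify Keyboard Shortcuts.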
How to run a chunk in RStudio ?
Ctrl + Shift + Enter (Cmd + Shift + Enter on macOS) runs the current chunk. Ctrl + Enter runs only the current line or selection.
What is the inbuilt demo data set in RStudio ?
cars is an inbuilt demo data set.
How to get the structure of a data set ?
Use the str() function, for example str(cars).
How to modify keyboard shortcut in RStudio ?
Go to Tools > Modify Keyboard Shortcuts.
Now search for the command for which you want to set a shortcut, such as 'Insert Chunk'.
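How to invoke library in RStudio ?
Use library(packageName) or require(packageName). library() stops with an error if the package is missing; require() gives a warning and returns FALSE.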
Which package is used for data wrangling ?
The dplyr package
How to load package in RStudio ?
To load a package in RStudio we have to use the library() function
How to load specific package ?
To load a specific package, pass its name to library(), for example library(ggplot2).
How to invoke inbuilt data sets in RStudio ?
The data() function is used. Called with no arguments it lists the available data sets; data(cars) loads one by name.
In which package inbuilt data sets are available ?
datasets package
How to know the description and details of an inbuilt data set ?
Type a question mark followed by the dataset name.
?cars
How to load inbuilt data sets ?
Inbuilt data sets are already available; you just have to use them by name.
What is the full form of csv ?
Comma-separated values
How to list environment variables in work space ?
ls()
How to clear workspace environment & console by running code ?
rm(list = ls()) clears the variables
cat('\014') clears the console
How to remove single variable from environment ?
rm(variableName)
How to remove environment variables using GUI ?
Click the broom (brush) icon at the top of the Environment pane on the right-hand side.
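How to add comments in R markdown file ?
Inside a code chunk, write a comment with the # symbol followed by text. Outside a chunk, # starts a heading, so use an HTML comment <!-- like this --> instead.
# This is some text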
How to add a new column to an existing data frame on a condition ?
We have to use the mutate() function
carData = mtcars %>% mutate(cylType = ifelse(cyl > 5 , 'High' , 'Low'))
What is the process of adding a new column to a data frame called ?
This process is called mutating a data frame.
How to save dataset in a variable ?
carData= mtcars
mtcars is an inbuilt Data Set in R
What is pipe operator ?
%>%
The pipe operator passes the object on its left (usually a data frame) as the first argument of the function on its right, so x %>% f(y) is the same as f(x, y).
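A minimal sketch of what the pipe does, assuming the dplyr package is installed (it re-exports %>% from magrittr). The two calls below should give the same result:

```r
library(dplyr)  # provides %>% (re-exported from magrittr)

# Without the pipe: the data frame is the first argument of head().
direct <- head(mtcars, 3)

# With the pipe: the left-hand side becomes the first argument of head().
piped <- mtcars %>% head(3)

identical(direct, piped)
```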
How to add a new column to an existing data set ?
We have two ways to do it ,
Mutate & Direct Method
carData %>% mutate(carColor = 'NotDefined')
carData$LaunchYear = 'NotDefined'
How to add a New column to an existing data set on certain condition ?
carData %>% mutate(cylType = ifelse(cyl > 6 , 'High','Low'))
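Adding a column to an existing data set, is it temporary or permanent ?
With mutate() it is temporary: mutate() returns a new data frame and leaves the original unchanged. To keep the column, assign the result to a variable (a new one, or the same name to overwrite). The direct method, carData$newColumn = value, changes carData immediately.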
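A short sketch of the difference, assuming dplyr is installed. The variable names before and after are only for illustration:

```r
library(dplyr)

carData <- mtcars

# The mutated copy is not assigned, so carData itself is unchanged.
carData %>% mutate(carColor = "NotDefined") %>% head(2)
before <- "carColor" %in% names(carData)

# Assigning the result back keeps the new column.
carData <- carData %>% mutate(carColor = "NotDefined")
after <- "carColor" %in% names(carData)

c(before = before, after = after)
```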
Add a new column to an existing data set by using the values of an existing column ?
carData %>% mutate(wtton = .45 * wt )
Here wt is in units of 1000 lbs; multiplying by 0.45 gives the weight in approximately metric tons (1000 lbs is about 0.4536 t).
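How to use a condition while adding a new column to an existing data set ?
Use the ifelse() function inside mutate(). Its arguments are the condition, the value when TRUE, and the value when FALSE. The new column name goes on the left of = in mutate(), not inside ifelse().
ifelse(condition , yes , no)
~~~
carData %>% mutate(cylType = ifelse(cyl > 6 , 'High' , 'Low'))
~~~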
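To see what ifelse() returns on its own, here is a small sketch on a plain vector. The vector cyl is an illustrative stand-in for the mtcars column of the same name:

```r
cyl <- c(4, 6, 8)   # the three cylinder counts that occur in mtcars

# ifelse() tests every element and returns one answer per element.
cylType <- ifelse(cyl > 6, "High", "Low")
cylType   # "Low" "Low" "High"
```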
Create a new data set by adding two columns to an existing dataSet ?
carData.new = carData %>% mutate(cylType = ifelse(cyl > 6 , 'High','Low'),wtton = wt*.45)
How to get the average of each numeric column ?
Use the summary() function. It reports the mean of each numeric column along with the minimum, quartiles, median and maximum.
summary(carData.new)
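How to group a dataset and get the mean of two columns ?
We have to use group_by() followed by summarise(). The data set must already contain the cylType and wtton columns, so carData.new is used here.
~~~
carData.new %>% group_by(cylType) %>% summarise(mean(wtton) , mean(disp))
~~~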
How to summarise a column & store the value under a name ?
carData.new %>% summarise(dispavg = mean(disp))
How to get the mean of two columns together
carData.new %>% summarise(mean(disp),mean(hp))
How to get the mean and number of elements in a column, and store them under names ?
carData.new %>% summarise(dispAvg=mean(disp) , n = n())
How to group and summarise the values ?
~~~
carData.new %>% group_by(cylType) %>% summarise(mean(wtton),mean(disp))
~~~
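A sketch of the same grouping with named summary columns, assuming dplyr is installed. The names wttonAvg, dispAvg and grouped are illustrative choices, not part of the course material:

```r
library(dplyr)

# Rebuild carData.new so the example is self-contained.
carData.new <- mtcars %>%
  mutate(cylType = ifelse(cyl > 6, "High", "Low"),
         wtton   = wt * 0.45)

# One output row per group; naming the summaries gives readable column headers.
grouped <- carData.new %>%
  group_by(cylType) %>%
  summarise(wttonAvg = mean(wtton),
            dispAvg  = mean(disp),
            n        = n())
grouped
```

Naming each summary makes it easier to refer to the result later, for example grouped$dispAvg.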
How to extract rows on a single condition ?
We have to use the filter() function. Column names are case-sensitive, so the column is cylType.
~~~
carData.new %>% filter(cylType == 'High')
~~~
How to extract a single column ?
We have to use the select() function.
~~~
carData.new %>% select(hp)
~~~
How to extract multiple columns ?
carData.new %>% select(hp, wt)
How to extract rows on multiple conditions ?
carData.new %>% filter(cylType == 'High' & mpg > 15)
How to extract Required columns only from data set ?
mtcars %>% select(mpg,cyl)
If some columns are not required then how can we create a new dataset without them ?
mtcars %>% select(-mpg,-cyl)
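filter() and select() can be chained in one pipeline. A minimal sketch, assuming dplyr is installed; highCars is an illustrative name:

```r
library(dplyr)

carData.new <- mtcars %>% mutate(cylType = ifelse(cyl > 6, "High", "Low"))

highCars <- carData.new %>%
  filter(cylType == "High" & mpg > 15) %>%   # keep the matching rows
  select(mpg, hp, wt)                        # then keep only three columns
head(highCars)
```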
How to print string & variable together ?
There are two methods: cat(), and print() combined with paste0().
~~~
num = 25
cat('Number is :', num, '\n')
print(paste0('Square of Number is : ', num^2))
~~~
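A small sketch comparing the common ways to build the message first and print it afterwards. The names msg1 to msg3 are illustrative:

```r
num <- 25

# paste0() joins pieces with no separator; paste() inserts a space between them.
msg1 <- paste0("Number is: ", num)
msg2 <- paste("Number is:", num)

# sprintf() fills placeholders; %d expects whole numbers, hence as.integer().
msg3 <- sprintf("Square of %d is %d", as.integer(num), as.integer(num^2))

cat(msg1, "\n")
cat(msg3, "\n")
```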
How to use for loop ?
for(i in 1:10){ cat('This is - ',i,'\n') }
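Use dataFrame column length to run a for loop ?
for(i in seq_len(ncol(dataFrame))){ cat('This is - ',i,'\n') }
seq_len() is safer than 1:ncol(dataFrame), which would count 1, 0 for a data frame with no columns.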
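A sketch of a loop that does some work per column. Here dataFrame is a hypothetical example built from three mtcars columns, and colMeansByLoop is an illustrative name:

```r
dataFrame <- mtcars[, c("mpg", "wt", "disp")]   # hypothetical example data

# Pre-size the result vector, one slot per column.
colMeansByLoop <- vector("numeric", ncol(dataFrame))

for (i in seq_len(ncol(dataFrame))) {
  colMeansByLoop[i] <- mean(dataFrame[[i]])     # [[i]] extracts column i as a vector
  cat(names(dataFrame)[i], "mean:", colMeansByLoop[i], "\n")
}
```

The built-in colMeans(dataFrame) gives the same numbers without a loop; the loop form is shown only to practise the syntax.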
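How to print newline ?
cat('\n')
print() does not interpret '\n'; print(paste0('\n')) shows the escape sequence literally, so use cat() or writeLines('').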
How to create an empty vector of five values ?
myvector = vector('integer' , 5)
This gives five zeros. To get one slot per column of a data frame, use vector('integer' , ncol(dataFrame)).
How to create an empty list ?
mylist = list()
How to get sum of vector values ?
We have to use sum() function .
sum(myvector)
How to use conditional statements if else ?
Conditional statements can be written in two ways: the vectorised ifelse() function and the if ... else statement.
ifelse(sum(courtDecision)>=3 , 'New Trial Accepted','New Trial Denied')
if (sum(courtDecision)>=3) { cat('New Trial Accepted') }else{ cat('New Trial Denied') }
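A sketch of when each form fits. The votes in courtDecision are hypothetical (1 = accept, 0 = deny), and verdict and perVote are illustrative names:

```r
courtDecision <- c(1, 0, 1, 1, 0)   # hypothetical votes: 1 = accept, 0 = deny

# if / else expects a single TRUE or FALSE, so it suits one overall decision.
if (sum(courtDecision) >= 3) {
  verdict <- "New Trial Accepted"
} else {
  verdict <- "New Trial Denied"
}

# ifelse() is vectorised: it returns one answer per element.
perVote <- ifelse(courtDecision == 1, "Accept", "Deny")

verdict
perVote
```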
How to create a plot object ?
carplot = ggplot(data = carData)
How to assign variables/features to plot object ?
Use the aes() (aesthetic mapping) function. We use the mtcars dataset.
carplot = ggplot(data = carData , aes(x = wt , y = disp))
How to add geometric elements to plot object ?
Use a geom function such as geom_point() .
~~~
carplot = ggplot(data = carData , aes(x = wt , y = disp))
carplot = carplot + geom_point()
~~~
How to get the row from the mtcars data set which has a weight of 2.2 and a displacement of 78.7 ?
Use the filter() function with conditions that only that row satisfies.
carData %>% filter(wt > 2 & disp < 100)
How to get the row from the mtcars data set which has a weight of 3.46 and a displacement of 225 ?
carData %>% filter( (wt > 3 & wt < 4) & (disp > 200 & disp < 250) )
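When the exact values are known, one alternative to a range is dplyr's near(), which compares numbers with a small tolerance. A minimal sketch, assuming dplyr is installed; matchRow is an illustrative name:

```r
library(dplyr)

carData <- mtcars

# near() is safer than == for decimal numbers, which may not be stored exactly.
matchRow <- carData %>% filter(near(wt, 3.46), near(disp, 225))
matchRow
```

Separating conditions with a comma inside filter() combines them with AND, the same as using &.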
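How to save plot as image ?
In the Plots pane use Export > Save as Image, or right click on the plot and save it. In code, ggsave('carplot.png' , plot = carplot) writes a ggplot object to a file (the file name here is only an example).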
How to add labels to plot object ?
We have to use the labs() function
~~~
carplot = carplot + labs(x = 'Weight (1000 lbs)' , y = 'Displacement (cu. in.)' , title = 'Weight vs Displacement')
~~~
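Putting the plotting steps together, a sketch of the whole plot built in one expression, assuming ggplot2 is installed. Each + adds one component to the plot object:

```r
library(ggplot2)

carData <- mtcars

carplot <- ggplot(data = carData, aes(x = wt, y = disp)) +   # data and mappings
  geom_point() +                                             # scatter layer
  labs(x = "Weight (1000 lbs)",
       y = "Displacement (cu. in.)",
       title = "Weight vs Displacement")

# The plot is drawn when the object is printed, e.g. by typing carplot in the console.
class(carplot)
```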
Which function is used to create plot object ?
ggplot()
Which function is used to assign variables to plot object ?
aes()
aes(x = wt , y = disp)
Which function is used to add geometry to plot object ?
geom_point()
carplot = carplot + geom_point()
Which argument of the plot object creation function is used for dataset assignment ?
The data argument
data = carData
How to rename a column header & overwrite a data frame ?
We have to use the rename() function, written as rename(newName = oldName).
foodData = foodData %>% rename(OilPercentage = Oil)
How to get mean of Oil column in foodtexture dataset ?
mean(foodData$OilPercentage)
How to get mean center of Oil column in foodtexture dataset ?
foodData$OilPercentageMeanCenter = foodData$OilPercentage - mean(foodData$OilPercentage)
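A quick way to check mean centring is that the centred column averages to zero. In this sketch foodData is a hypothetical stand-in with made-up Oil values, since the real food texture file is not included here:

```r
# Hypothetical stand-in for the food texture data; the values are made up.
foodData <- data.frame(Oil = c(16.5, 17.7, 16.2, 16.7, 16.3))

foodData <- dplyr::rename(foodData, OilPercentage = Oil)   # new name = old name
foodData$OilPercentageMeanCenter <-
  foodData$OilPercentage - mean(foodData$OilPercentage)

# Effectively zero, up to floating-point rounding error.
mean(foodData$OilPercentageMeanCenter)
```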
How to add new column to dataset object ?
We have two methods : **mutate & direct method**
- mutate method
foodData = foodData %>% mutate(OilinFood = ifelse(OilPercentage > 16 ,'High','Low'))
- Direct method
foodData$OilPercentageMeanCenter = foodData$OilPercentage - mean(foodData$OilPercentage)
How to get largest & lowest value of vector ?
We have to use min & max functions :
~~~
min(foodData$OilPercentage)
max(foodData$OilPercentage)
~~~
How to add all values of vector ?
We have to use the sum() function
~~~
sum(foodData$OilPercentage)
~~~
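How to find the variance by inbuilt method ?
Use the var() function. The Oil column was renamed to OilPercentage earlier, so the renamed column is used.
var(foodData$OilPercentage)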
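As a cross-check, the sample variance can be computed by hand from the mean-centred values and compared with var(). The vector x holds illustrative values, not real data, and manualVar is an illustrative name:

```r
x <- c(16.5, 17.7, 16.2, 16.7, 16.3)   # illustrative values, not real data

n <- length(x)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
manualVar <- sum((x - mean(x))^2) / (n - 1)

all.equal(manualVar, var(x))
```

Note that var() divides by n - 1 rather than n, so it gives the sample variance.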
How many panes are in RStudio ?
There are four:
1. Source Editor,
2. Console,
3. Workspace Browser (and History),
4. Files (Plots, Packages, Help)