1. Programming with R & Python - Segment 1 [Week 1 & 2 - Introduction to R] Flashcards
What is TIOBE?
A programming language popularity index.
What is data wrangling?
Data wrangling is the process of transforming raw data into a usable format so that it can be visualised and analysed.
Which package is used for data wrangling in the R language?
The tidyverse meta-library.
What does the tidyverse package do?
tidyverse is a meta-library. It bundles many packages intended for data science applications, such as packages for data wrangling, visualisation, and more.
How to know the popularity of a programming language?
Check a programming language popularity index such as TIOBE.
What are the common features of the R language?
- It can handle large data sets (limited mainly by available memory)
- Fast operations on arrays
- Efficient tools for data wrangling (tidyverse)
- High-quality graphics output capability
Which package is used for high-quality graphics in R?
ggplot2
When was the R language developed?
1993
From which language is R derived?
R is derived from the S language.
R is an open-source implementation of S with additional features.
What is the S language?
S is a high-level programming language designed for statistical computation. S stands for Statistics. The goal of the S language is to give the user an interactive experience of statistical computation.
In which language is the source code of R written?
The source code of R is written mainly in C and Fortran (plus R itself), which helps it run on many platforms.
When and where was the S language developed?
The S language was developed in 1976 at Bell Labs
(then part of AT&T, later Lucent Technologies).
What is the full form of .ipynb?
Interactive Python Notebook (file extension)
Who controls the design and evolution of the R language?
R is an open-source language; its design and evolution are controlled by the R Core Team and the R Foundation.
In which language is the R source code written?
C and Fortran
How to install the R programming suite on a desktop?
Download RStudio Desktop from rstudio.com
and the R interpreter from cran.r-project.org (linked from r-project.org).
Install the R interpreter first, then the RStudio IDE.
How to set the working directory in the R IDE (RStudio)?
Go to Session > Set Working Directory
How to set the default working directory in the R IDE (RStudio)?
Go to Tools > Global Options
What are the drawbacks of Google Colab?
You have to reinstall packages every time you start a new Colab session.
Uploaded data is not available the next time you start Colab.
Colab does not directly support the R language.
Which programming languages can be used in Colab?
Python, Julia, R
How to install packages in Google Colab and an R notebook?
install.packages("packageName")
How to install packages in the R IDE console?
install.packages("packageName")
How to check whether a package is available in R?
require("packageName") returns TRUE if the package could be loaded and FALSE otherwise.
How to include an installed package in an R program?
library("packageName")
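The install, check, and load steps from the cards above can be combined into one small sketch. It assumes the base package "stats" as a stand-in for any package name, so that it runs without network access; the variable names pkg and is_available are illustrative only.

```r
# "stats" ships with base R; it stands in for any package name (assumption).
pkg <- "stats"

# Install only when the package is missing (real packages need network access).
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Attach the package; character.only = TRUE lets library() take a variable.
library(pkg, character.only = TRUE)

# require() also attaches the package, but returns TRUE/FALSE
# instead of stopping with an error when the package is missing.
is_available <- require(pkg, character.only = TRUE)
print(is_available)
```

Checking with requireNamespace() before installing avoids reinstalling a package that is already present.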
How to check the R version?
R.version
How to clear the console in RStudio?
Ctrl + L
How to know the current working directory from the console?
Use the getwd() function.
How many basic data types are there in R?
Five basic data types:
1. Character
2. Integer
3. Numeric (real or floating point)
4. Logical
5. Complex
How many types of data structures are there in the R language?
- Atomic vector
- List
- Matrix
- Data frame
- Factor
How to create an integer-only variable?
Add the
L suffix after the value:
my_integer = 48L
L indicates that it is an integer value.
How to create a logical variable?
my_logic = TRUE
What are the ways to create a vector in R?
There are three ways:
myvect1 = c(1L, 2L, 3L)
myvect2 = 1:10
myvect3 = seq(from=1, to=10, by=2)
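As a runnable sketch of the three approaches, the snippet below builds each vector and inspects it. The variable names follow the card; the calls to print(), length(), and typeof() are additions for illustration.

```r
# Three ways to build an atomic vector.
myvect1 <- c(1L, 2L, 3L)                   # combine explicit integer values
myvect2 <- 1:10                            # colon operator: consecutive integers
myvect3 <- seq(from = 1, to = 10, by = 2)  # sequence with a step of 2

print(myvect3)          # 1 3 5 7 9
print(length(myvect2))  # 10
print(typeof(myvect1))  # "integer" because of the L suffix
```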
What is an atomic vector?
An atomic vector is a homogeneous data structure; it is a one-dimensional array whose elements all have the same type.
What are the two functions to create a vector?
c(10L, 20L, 45L)
seq(from=1, to=10, by=1)
Which function is used to create a vector from values of any type?
The c() function
How to know the number of elements in a vector?
Use the length function.
length(vector)
How to know the type of the values in a vector?
Use the typeof function.
typeof(vector)
How to know the type of data structure used?
class(object)
What is the difference between the class and typeof functions in R?
The class function gives the class of the object (the data structure, e.g. list, matrix, data.frame), while the typeof function gives the internal storage type (e.g. integer, double, character, list).
Example:
mylist = list(c(1,2,3))
class(mylist) gives "list", the class of the object.
typeof(mylist[[1]]) gives "double", the type of the values stored in the first element (typeof(mylist) itself is also "list").
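The difference between class and typeof shows more clearly on a data frame, where the two answers differ. This is a minimal sketch; the objects my_list and my_df are made up for illustration.

```r
# A list holding one numeric vector, and a small data frame (illustrative data).
my_list <- list(c(1, 2, 3))
my_df <- data.frame(x = 1:3, y = c("a", "b", "c"))

print(class(my_list))        # "list"
print(typeof(my_list))       # "list"
print(typeof(my_list[[1]]))  # "double": the values inside the first element

print(class(my_df))          # "data.frame": the data structure
print(typeof(my_df))         # "list": a data frame is stored as a list of columns
print(typeof(my_df$x))       # "integer"
```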
How to know the structure of a vector?
str(vector)
How to find all missing (NA) values in a vector?
is.na(vector)
How to know whether a vector has any NA values?
anyNA(vector)
Which functions are used to find missing values in a vector?
is.na(vector)
anyNA(vector)
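A short sketch of both functions on a made-up vector called scores. is.na() answers element by element, while anyNA() gives a single TRUE or FALSE.

```r
# Illustrative vector with two missing values.
scores <- c(10, NA, 30, NA)

print(is.na(scores))         # FALSE TRUE FALSE TRUE
print(anyNA(scores))         # TRUE
print(sum(is.na(scores)))    # 2: number of missing values
print(which(is.na(scores)))  # 2 4: positions of the missing values
```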
What are the special values in the R language?
Inf (infinity), e.g. 1/0
NaN (not a number), e.g. 0/0
These values are treated as special values.
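The special values can be produced and tested directly. This is a minimal sketch; the variable names are illustrative.

```r
pos_inf <- 1 / 0   # Inf
neg_inf <- -1 / 0  # -Inf
not_num <- 0 / 0   # NaN

print(is.infinite(pos_inf))  # TRUE
print(is.finite(neg_inf))    # FALSE
print(is.nan(not_num))       # TRUE
print(is.na(not_num))        # TRUE: NaN also counts as missing for is.na()
```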
What is the starting index value of data structures in the R language?
Indexing starts at 1.
What is a list?
A list is a heterogeneous data structure in which we can store values of different types.
How to access elements of a vector?
Vector elements are accessed by index using square-bracket subscripts.
vector[1]
vector[2]
How to assign a new value to a vector element?
vector[1] = 45
How to create a list?
Use the list() function; elements can themselves be vectors built with c().
my_list2 = list(1, 'Name', c('a', 'b', 'c'))
What are slots in a list or data frame?
Slots are the named elements of a list or the named columns of a data frame.
How to give names to the slots in a list?
names(my_list2) = c('first', 'second', 'third')
How to access values in a list by using slots?
Name the slots, then use either of two methods:
names(my_list2) = c('first', 'second', 'third')
my_list2$first
my_list2[['first']]
How to modify elements of a list?
Two methods:
my_list2$first = 45
my_list2[1] = 10
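The list cards above fit together as one short sketch: create, name, read, and modify. The replacement value "New name" and the variable third_values are illustrative additions.

```r
# A heterogeneous list: a number, a string, and a character vector.
my_list2 <- list(1, "Name", c("a", "b", "c"))

# Name the slots so they can be reached with $ or [[ ]].
names(my_list2) <- c("first", "second", "third")

print(my_list2$first)                # 1
third_values <- my_list2[["third"]]  # "a" "b" "c"
print(third_values)

# Modify by slot name and by position.
my_list2$first <- 45
my_list2[[2]] <- "New name"
print(my_list2$second)               # "New name"
```

Double brackets [[ ]] return the element itself, while single brackets [ ] return a sub-list.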
What is a matrix in the R language?
A matrix is a two-dimensional atomic (homogeneous) data structure.
How to create a matrix?
Use the matrix() function.
myMatrix2 = matrix(c(1,2,3,4,5,6), nrow=3, ncol=2)
How is data organised when creating a matrix?
By default, data is filled column by column.
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6
How to organise data by row while creating a matrix?
Use the byrow option of the matrix function to fill data row by row.
myMatrix2 = matrix(c(1,2,3,4,5,6), nrow=3, ncol=2, byrow=TRUE)
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
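To see the effect of byrow side by side, the sketch below builds both layouts from the same values and reads the element in row 1, column 2 of each. The names values, by_col, and by_row are illustrative.

```r
values <- c(1, 2, 3, 4, 5, 6)

# Default: fill down the columns first.
by_col <- matrix(values, nrow = 3, ncol = 2)

# byrow = TRUE: fill across the rows first.
by_row <- matrix(values, nrow = 3, ncol = 2, byrow = TRUE)

print(by_col[1, 2])  # 4
print(by_row[1, 2])  # 2
print(dim(by_col))   # 3 2: the shape is the same in both cases
```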
How to assign row names and column names to a matrix?
Use the rownames() and colnames() functions (here my_matrix2 is a 2 x 3 matrix filled by row).
rownames(my_matrix2) = c('row1', 'row2')
colnames(my_matrix2) = c('col1', 'col2', 'col3')
     col1 col2 col3
row1    1    2    3
row2    4    5    6
How to access elements of a matrix?
Use the index method [row, column].
my_matrix2[1, 2]
What is a data frame in R?
A data frame is an essential data structure in R.
A data frame is a list of equal-length vectors, one per column.
How to assign row names and column names to a matrix in the R language?
rownames(mymatrix) = c('Row-1','Row-2','Row-3','Row-4')
colnames(mymatrix) = c('Col-1','Col-2','Col-3')
What does the S in the S language stand for?
S means Statistics.
What is the goal of the S language?
The goal of the S language is to give the user an interactive experience in statistical computation.
How to change the directory from the R console?
setwd('path/to/directory')
Which symbol is used for slots?
The $ (dollar) symbol is used for slots.
How to get the structure of a matrix?
Use the str() function.
str(matrix)
How to create a data frame manually?
- First create atomic vectors of the desired values.
- Then create the data frame object with the data.frame function.
- Finally assign row names and column names to the data frame.
ID = c('A','B','C','D')
Height = c(165,170,145,180)
Age = c(40,24,65,34)
studentData = data.frame(ID, Height, Age)
rownames(studentData) = c('Jhon','Sundy','Bob','Ajit')
colnames(studentData) = c('ID','Height','Age')
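The steps above can be sketched as one runnable snippet, followed by a quick inspection with dim(), head(), and str(). The data is taken from the card; the inspection calls are additions for illustration.

```r
# Step 1: one atomic vector per column.
ID <- c("A", "B", "C", "D")
Height <- c(165, 170, 145, 180)
Age <- c(40, 24, 65, 34)

# Step 2: combine the vectors; column names come from the variable names.
studentData <- data.frame(ID, Height, Age)

# Step 3: label the rows.
rownames(studentData) <- c("Jhon", "Sundy", "Bob", "Ajit")

print(dim(studentData))      # 4 3: four rows, three columns
print(head(studentData, 2))  # first two rows
str(studentData)             # column types at a glance
```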
How to get the order (dimensions) of a data frame?
Use the dim() function.
dim(dataFrameObject)
How to show the first N rows of a data frame?
head(dataFrameObject, 5)
How to show the last N rows?
tail(dataFrameObject, 5)
How to get the total row count?
nrow(dataFrameObject)
How to get the total column count?
ncol(dataFrameObject)
How to get the structure of a data frame object?
str(dataFrameObject)
How to get a data frame with a single column only?
To get the column with its header and row names:
studentData['Age']
(studentData$Age returns only the values, without row names.)
How to get a data frame with one or more columns from a data set?
studentData[c('ID','Age')]
How to get a particular row from a data set?
studentData['Ajit', ]
How to get a single column value from a particular row of a data set?
There are two methods:
studentData['Ajit','Age']
studentData['Ajit',c('Age')]
What is the output of the following expression?
studentData['Ajit',c("Age")]
It gives only the value of the Age column, not a data frame. When a single column is given in this way, only the value of that field is returned.
How to get only the values of a particular single column,
from all rows?
There are two methods:
Subscript method - studentData[['Age']]
Slot method - studentData$Age
Output - 40 24 65 34
How to get all values of a single column with row names?
studentData['Age']
carData[c('disp')]
How to get all values of two columns with row names?
studentData[c('Age','ID')]
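The indexing forms on these cards differ mainly in what they return: a data frame or a plain vector. The sketch below rebuilds the student data so that it runs on its own; the intermediate names such as one_col_df are illustrative.

```r
# Rebuild the example data frame from the cards.
studentData <- data.frame(
  ID = c("A", "B", "C", "D"),
  Height = c(165, 170, 145, 180),
  Age = c(40, 24, 65, 34),
  row.names = c("Jhon", "Sundy", "Bob", "Ajit")
)

one_col_df <- studentData["Age"]         # data frame: keeps header and row names
one_col_vec <- studentData[["Age"]]      # plain vector: 40 24 65 34
same_vec <- studentData$Age              # slot method, same vector
one_value <- studentData["Ajit", "Age"]  # single value: 34
one_row <- studentData["Ajit", ]         # one-row data frame
two_cols <- studentData[c("Age", "ID")]  # data frame with two columns

print(class(one_col_df))   # "data.frame"
print(class(one_col_vec))  # "numeric"
print(one_value)           # 34
```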
What is a factor in the R language?
A factor is a data structure: a type of vector that can contain only predefined values (levels) and is used to categorise data.
How to create a factor?
gender = factor(c('Male','Female','Male','Female'))
Which function is used to see the categories of a factor?
Use the levels() function.
levels(factorObject)
How to modify a factor value?
Modification does not allow new values; only predefined values (existing levels) can be assigned. Use the index in square brackets to modify an element.
gender = factor(c('Male','Female','Male','Female'))
gender[1] = 'Female'
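A short sketch of what happens when a factor is modified with an existing level versus a new value. The value "Other" and the copy named trial are hypothetical and used only for illustration.

```r
gender <- factor(c("Male", "Female", "Male", "Female"))
print(levels(gender))  # "Female" "Male" (levels are sorted alphabetically)

# Assigning an existing level works as expected.
gender[1] <- "Female"

# Assigning a value that is not a level produces NA (R also emits a warning,
# suppressed here to keep the output clean).
trial <- gender
suppressWarnings(trial[2] <- "Other")
print(is.na(trial[2]))  # TRUE

# To allow a new value, extend the levels first.
levels(gender) <- c(levels(gender), "Other")
gender[2] <- "Other"
print(as.character(gender[2]))  # "Other"
```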
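How does the R language treat character values in columns?
In R versions before 4.0.0, character columns in a data frame are converted to factors by default; from R 4.0.0 onward the default is stringsAsFactors = FALSE, so they stay as characters.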
How to create a data frame from a file?
- Create a string object holding the path to the data file
- Create a data frame object with the read.csv function
file = 'http://openmv.net/file/food-texture.csv' # Web path
file = 'food-texture.csv' # Local path
foodData = read.csv(file, header=TRUE, row.names=1, stringsAsFactors=FALSE)
header=TRUE means treat the first row as column names
row.names=1 means treat the first column as the row names
stringsAsFactors=FALSE tells R not to treat character columns as factors
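To try the three read.csv options without downloading anything, the sketch below writes a tiny CSV to a temporary file and reads it back. The file contents, the column names, and the name demoData are made up for illustration and are not the real food-texture data.

```r
# Write a small made-up CSV to a temporary file.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Sample,Oil,Density,Label",
  "S1,16.5,2955,low",
  "S2,17.7,2660,high",
  "S3,16.2,2870,low"
), csv_path)

# header = TRUE: first line holds the column names.
# row.names = 1: first column becomes the row names.
# stringsAsFactors = FALSE: keep text columns as character.
demoData <- read.csv(csv_path, header = TRUE, row.names = 1,
                     stringsAsFactors = FALSE)

print(rownames(demoData))     # "S1" "S2" "S3"
print(colnames(demoData))     # "Oil" "Density" "Label"
print(class(demoData$Label))  # "character"
```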
Which option sets the first row as column names?
header=TRUE
Which option sets the first column as row names?
row.names=1
Which option tells R not to treat character columns as factors?
stringsAsFactors = FALSE
How to get the structure of a data frame?
Use the str() function.
str(dataFrameObject)
How to know how many rows and columns are in a data frame?
Use the dim() function.
dim(dataFrameObject)
How to get the total number of rows in a data frame?
Use the nrow() function.
nrow(dataFrameObject)
How to get the total number of columns in a data frame?
Use the ncol() function.
ncol(dataFrameObject)
How to get the mean of all columns or rows, and with which function?
Use the apply() function.
apply(foodData, 2, mean), where 2 means by column and 1 means by row
How to calculate the mean of a numeric column?
mean(foodData$Density)
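A small sketch of the margin argument of apply(), using a made-up all-numeric data frame named demo. The values are illustrative, not taken from the food-texture data.

```r
# Illustrative numeric data frame.
demo <- data.frame(Oil = c(16, 18, 20), Density = c(2900, 2700, 2800))

col_means <- apply(demo, 2, mean)  # margin 2: one mean per column
row_means <- apply(demo, 1, mean)  # margin 1: one mean per row
density_mean <- mean(demo$Density) # mean of a single column

print(col_means)     # Oil 18, Density 2800
print(row_means)     # 1458 1359 1410
print(density_mean)  # 2800
```

apply() converts the data frame to a matrix first, so it is best suited to data frames whose columns are all numeric.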
How to get a factor of the data frame row names?
Use the rownames() function inside factor().
factor(rownames(foodData))
How to get the row names of a data frame object?
rownames(dataFrameObject)
How to get the column names of a data frame object?
colnames(dataFrameObject)
How to get a factor of a particular column?
factor(foodData$Density)
How to convert an .ipynb file to an R Markdown file?
1. First install the 'rmarkdown' package:
install.packages('rmarkdown')
2. rmarkdown::convert_ipynb('xyx.ipynb')
3. The .ipynb file is now converted to an .Rmd file.
How to convert a column to a factor and compare the result?
To compare, use the length function on the factor and on its levels.
length(factor(carData$mpg)) gives 32 (number of values)
length(levels(factor(carData$mpg))) gives 25 (number of distinct levels)
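The comparison can be reproduced with R's built-in mtcars data set, assuming that carData on the card is a copy of mtcars. That is an assumption, although the 32 and 25 counts match.

```r
# Assumption: carData corresponds to the built-in mtcars data set.
carData <- mtcars

mpg_factor <- factor(carData$mpg)

print(length(mpg_factor))   # 32: one entry per car
print(nlevels(mpg_factor))  # 25: number of distinct mpg values
```

nlevels(x) is a shorthand for length(levels(x)).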
What is the output of this expression?
foodData['B907','Oil']
It gives only the value of the Oil column.
What does it mean when a single column is given in the expression?
foodData['B907','Oil']
When a single column is given in the expression, it returns only the value of that field, not a data frame. When two column names are given, as in foodData['B110',c('Oil','Density')], it returns a one-row data frame with column names and row names.
How to create a factor classification from a column field?
dataFrame = head(foodData, 5)
dataFrame
factor(dataFrame$Crispy)
How to extract the first and second rows with all columns?
df[1:2, ]
What is the order of a matrix?
The order of a matrix is its number of rows and columns, written m x n.