00R Basic and Intro Flashcards

1
Q

How do you explicitly specify that you are using a particular function from a package?

A

Use package::function(), e.g. ggplot2::ggplot().
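A minimal sketch of the package::function() form, using the stats and datasets packages that ship with base R (chosen here only so that nothing needs to be installed):

```r
# Call median() from the stats package explicitly, without attaching it
middle <- stats::median(c(1, 3, 5))
middle

# The same :: form also reaches data sets stored in a package
first_rows <- head(datasets::mtcars, 3)
first_rows
```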

2
Q

How do you install a package?

A

install.packages("tidyverse")

3
Q

How do you load a package?

A

library(tidyverse)

4
Q

Which one needs double quotes: loading or installing a package?

A

Installing a package: install.packages("tidyverse") needs the quotes, library(tidyverse) does not.

5
Q

ggplot

A

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy))

To set an aesthetic manually, set the aesthetic by name as an argument of your geom function; i.e. it goes outside of aes(). You'll need to pick a level that makes sense for that aesthetic:
geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

6
Q

What is the first argument of ggplot()?

A

data

7
Q

Which geom creates a scatterplot?

A

geom_point()

8
Q

What is an aesthetic?

A

An aesthetic is a visual property of the objects in your plot. Aesthetics include things like the size, the shape, or the color of your points.

Once you map an aesthetic, ggplot2 takes care of the rest. It selects a reasonable scale to use with the aesthetic, and it constructs a legend that explains the mapping between levels and values.

9
Q

What does glimpse(mpg) do?

A

It displays the type of each column, along with the column names and the first few values.

10
Q

What has happened when you see + and the code does not execute?

A

Sometimes you'll run the code and nothing happens. Check the left-hand side of your console: if it's a +, it means that R doesn't think you've typed a complete expression and it's waiting for you to finish it. In this case, it's usually easy to start from scratch again by pressing ESCAPE to abort processing the current command.

11
Q

“The simple graph has brought more information to the data analyst’s mind than any other device.”

A

— John Tukey

12
Q

What is ggplot2?

A

R has several systems for making graphs, but ggplot2 is one of the most elegant and most versatile. ggplot2 implements the grammar of graphics, a coherent system for describing and building graphs. With ggplot2, you can do more, faster, by learning one system and applying it in many places.

13
Q

What is the relationship between ggplot2 and the tidyverse?

A

ggplot2 is one of the core members of the tidyverse.

14
Q

How do you view your whole dataset in an RStudio pane?

A

View(flights)

15
Q

What are tibbles?

A

Tibbles are data frames, but slightly tweaked to work better in the tidyverse.

16
Q

What is a data frame?

A

A data frame is a rectangular collection of variables (in the columns) and observations (in the rows).

17
Q

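What is the difference between ggplot2::mpg and mpg?

A

ggplot2::mpg names the mpg data frame explicitly through its package, so it works even when ggplot2 has not been loaded. Plain mpg works only after the package has been loaded, for example with library(tidyverse), which is a collection of packages that includes ggplot2 (the package that contains mpg).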

18
Q

What is the graphing template?

A

ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(mapping = aes(<MAPPINGS>))

19
Q

What is an aesthetic?

A

An aesthetic is a visual property of the objects in your plot. Aesthetics include things like the size, the shape, or the color of your points.

20
Q

In an aesthetic, can you write colour instead of color?

A

Yes. If you prefer British English, like Hadley, you can use colour instead of color.

21
Q

What is scaling? How does it relate to colour?

A

To map an aesthetic to a variable, associate the name of the aesthetic to the name of the variable inside aes(). ggplot2 will automatically assign a unique level of the aesthetic (here a unique color) to each unique value of the variable, a process known as scaling.

22
Q

aesthetic

A

For each aesthetic, you use aes() to associate the name of the aesthetic with a variable to display. The aes() function gathers together each of the aesthetic mappings used by a layer and passes them to the layer’s mapping argument. The syntax highlights a useful insight about x and y: the x and y locations of a point are themselves aesthetics, visual properties that you can map to variables to display information about the data.

23
Q

Can you set aes properties manually?

A

Yes:
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

To set an aesthetic manually, set the aesthetic by name as an argument of your geom function; i.e. it goes outside of aes(). You'll need to pick a level that makes sense for that aesthetic:
1. The name of a color as a character string.
2. The size of a point in mm.
3. The shape of a point as a number, as shown in Figure 3.1.

24
Q

What can be an aesthetic?

A

colour, size, shape, etc.

25
Q

What does glimpse() do?

A

glimpse() displays the type of each column.

26
Q

What happens if you map the same variable to multiple aesthetics?

A

In the above plot, hwy is mapped to both location on the y-axis and color, and displ is mapped to both location on the x-axis and size. The code works and produces a plot, even if it is a bad one. Mapping a single variable to multiple aesthetics is redundant, so in most cases avoid it.

27
Q

What are different ways to use ggplot?

A

ggplot(mtcars, aes(wt, mpg)) +
  geom_point(shape = 21, colour = "black", fill = "white", size = 5, stroke = 5)

# This one raises an error: cty is continuous, and a continuous variable cannot be mapped to shape
ggplot(mpg, aes(x = displ, y = hwy, shape = cty)) +
  geom_point()

28
Q

What happens if you map an aesthetic to something other than a variable name, like aes(colour = displ < 5)? Note, you’ll also need to specify x and y.

A

Aesthetics can also be mapped to expressions like displ < 5. The ggplot() function behaves as if a temporary variable was added to the data with values equal to the result of the expression. In this case, the result of displ < 5 is a logical variable which takes values of TRUE or FALSE.

29
Q

One common problem when creating ggplot2 graphics is to put the + in the wrong place. Where is it supposed to go?

A

It has to come at the end of the line, not the start. In other words, make sure you haven't accidentally written code like this:
ggplot(data = mpg)
+ geom_point(mapping = aes(x = displ, y = hwy))

30
Q

Facets

A

One way to add additional variables is with aesthetics. Another way, particularly useful for categorical variables, is to split your plot into facets, subplots that each display one subset of the data.

31
Q

What is a geom?

A

A geom is the geometrical object that a plot uses to represent data. People often describe plots by the type of geom that the plot uses. For example, bar charts use bar geoms, line charts use line geoms, boxplots use boxplot geoms, and so on. Scatterplots break the trend; they use the point geom.

32
Q

Does every geom have the same aesthetics and mapping?

A

Every geom function in ggplot2 takes a mapping argument. However, not every aesthetic works with every geom.

33
Q

How many geoms does ggplot2 provide?

A

ggplot2 provides over 30 geoms, and extension packages provide even more (see https://www.ggplot2-exts.org for a sampling). The best way to get a comprehensive overview is the ggplot2 cheatsheet, which you can find at http://rstudio.com/cheatsheets. To learn more about any single geom, use help: ?geom_smooth.

34
Q

How do you display multiple geoms?

A

To display multiple geoms in the same plot, add multiple geom functions to ggplot():

ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  geom_smooth(mapping = aes(x = displ, y = hwy))

A better version passes the mappings to ggplot() once, as global mappings, so they are not repeated in each geom:

ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth()

If you place mappings in a geom function, ggplot2 will treat them as local mappings for the layer. It will use these mappings to extend or overwrite the global mappings for that layer only. This makes it possible to display different aesthetics in different layers.

ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(color = class)) +
  geom_smooth()

35
Q

Different kinds of geoms

A

line chart: geom_line()
boxplot: geom_boxplot()
histogram: geom_histogram()
area chart: geom_area()

36
Q

What does show.legend = FALSE do? What happens if you remove it? Why do you think I used it earlier in the chapter?

A

Passing show.legend = FALSE to a geom function hides the legend box for that layer.

37
Q

Many graphs, like scatterplots, plot the raw values of your dataset. Other graphs, like bar charts, calculate new values to plot:

A

Bar charts, histograms, and frequency polygons bin your data and then plot bin counts, the number of points that fall in each bin.

Smoothers fit a model to your data and then plot predictions from the model.

Boxplots compute a robust summary of the distribution and then display a specially formatted box.

38
Q

The algorithm used to calculate new values for a graph is called?

A

A stat, short for statistical transformation.

39
Q

What is the relationship between geoms and stats?

A

You can generally use geoms and stats interchangeably. For example, you can recreate the previous plot using stat_count() instead of geom_bar():
ggplot(data = diamonds) +
  stat_count(mapping = aes(x = cut))

40
Q

What does geom_col() do? How is it different to geom_bar()?

A

The geom_col() function has a different default stat than geom_bar(). The default stat of geom_col() is stat_identity(), which leaves the data as is. The geom_col() function expects that the data contains x values and y values which represent the bar height.

The default stat of geom_bar() is stat_count(). The geom_bar() function only expects an x variable. The stat, stat_count(), preprocesses the input data by counting the number of observations for each value of x. The y aesthetic uses the values of these counts.
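A minimal sketch of the difference, assuming ggplot2 is installed. The two toy data frames raw and counts are made up for illustration and are not part of the card:

```r
library(ggplot2)

# Hypothetical toy data: one row per observation, and the same data pre-counted
raw <- data.frame(cut = rep(c("Fair", "Good"), times = c(3, 5)))
counts <- data.frame(cut = c("Fair", "Good"), n = c(3, 5))

# geom_bar() counts the rows for each value of x itself
p_bar <- ggplot(raw) +
  geom_bar(mapping = aes(x = cut))

# geom_col() uses the y values exactly as they are given
p_col <- ggplot(counts) +
  geom_col(mapping = aes(x = cut, y = n))
```

Both plots should show the same bar heights; printing p_bar and p_col draws them.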

41
Q

You can colour a bar chart using either the colour aesthetic or, more usefully, fill. How?

A

ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, colour = cut))

ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = cut))

If you map the fill aesthetic to another variable, like clarity, the bars are automatically stacked. Each colored rectangle represents a combination of cut and clarity.

ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = clarity))

The stacking is performed automatically by the position adjustment specified by the position argument. If you don't want a stacked bar chart, you can use one of three other options: "identity", "dodge" or "fill".

42
Q

position = "identity"

A

position = "identity" will place each object exactly where it falls in the context of the graph. This is not very useful for bars, because it overlaps them. To see that overlapping we either need to make the bars slightly transparent by setting alpha to a small value, or completely transparent by setting fill = NA.

ggplot(data = diamonds, mapping = aes(x = cut, fill = clarity)) +
  geom_bar(alpha = 1/5, position = "identity")
ggplot(data = diamonds, mapping = aes(x = cut, colour = clarity)) +
  geom_bar(fill = NA, position = "identity")

43
Q

position = "fill" works like stacking, but makes each set of stacked bars the same height. This makes it easier to compare proportions across groups.

A

ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = clarity), position = "fill")

44
Q

position = "dodge" places overlapping objects directly beside one another. This makes it easier to compare individual values.

A

ggplot(data = diamonds) +
  geom_bar(mapping = aes(x = cut, fill = clarity), position = "dodge")

45
Q

Most geoms and stats come in pairs that are almost always used in concert. Read through the documentation and make a list of all the pairs. What do they have in common?

A
geom_bar()	stat_count()
geom_bin2d()	stat_bin_2d()
geom_boxplot()	stat_boxplot()
geom_contour()	stat_contour()
geom_count()	stat_sum()
geom_density()	stat_density()
geom_density_2d()	stat_density_2d()
geom_hex()	stat_hex()
geom_freqpoly()	stat_bin()
geom_histogram()	stat_bin()
46
Q

A plot suffers from overplotting because there are multiple observations for each combination of cty and hwy values. How can you improve it?

A

I would improve the plot by using a jitter position adjustment to decrease overplotting.
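A minimal sketch of the jitter adjustment, assuming ggplot2 is installed and using its built-in mpg data. The object names p_overplotted and p_jittered are illustrative only:

```r
library(ggplot2)

# Many points share the same (cty, hwy) pair, so they are drawn on top of each other
p_overplotted <- ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
  geom_point()

# position = "jitter" adds a small amount of random noise to each point
p_jittered <- ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
  geom_point(position = "jitter")
```

geom_jitter() is a shorthand for geom_point(position = "jitter").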

47
Q

What does coord_flip() do?

A

coord_flip() switches the x and y axes. This is useful (for example), if you want horizontal boxplots. It’s also useful for long labels: it’s hard to get them to fit without overlapping on the x-axis.

48
Q

The layered grammar of graphics

A
ggplot(data = <DATA>) +
  <GEOM_FUNCTION>(
     mapping = aes(<MAPPINGS>),
     stat = <STAT>,
     position = <POSITION>
  ) +
  <COORDINATE_FUNCTION> +
  <FACET_FUNCTION>

In practice, you rarely need to supply all seven parameters to make a graph because ggplot2 will provide useful defaults for everything except the data, the mappings, and the geom function.

The seven parameters in the template compose the grammar of graphics, a formal system for building plots. The grammar of graphics is based on the insight that you can uniquely describe any plot as a combination of a dataset, a geom, a set of mappings, a stat, a position adjustment, a coordinate system, and a faceting scheme.

49
Q

What naming style is recommended for object names?

A

I recommend snake_case, where you separate lowercase words with _, e.g. i_use_snake_case.

50
Q

What does Cmd + Up Arrow do?

A

It lists all your previous commands; you can move through the list and select the one you want to re-execute.

51
Q

How do you select a function easily using Tab completion?

A

Type the first few letters, e.g. seq, and press Tab. A popup shows you the possible completions.

52
Q

When you see +, what does R mean?

A

The + tells you that R is waiting for more input; it doesn't think you're done yet. Usually that means you've forgotten either a " or a ). Either add the missing pair, or press ESCAPE to abort the expression and try again.

53
Q

Where can you see all objects in RStudio?

A

In the Environment pane.

54
Q

Error messages of the form “object ‘…’ not found” mean exactly what they say. R cannot find an object with that name. How to solve?

A

The most common scenarios in which I encounter this error message are

I forgot to create the object, or an error prevented the object from being created.

I made a typo in the object’s name, either when using it or when I created it (as in the example above), or I forgot what I had originally named it. If you find yourself often writing the wrong name for an object, it is a good indication that the original name was not a good one.

I forgot to load the package that contains the object using library().

55
Q

What does the filter() function do?

A

It selects the observations that meet certain conditions, e.g. filter(diamonds, carat > 3).

57
Q

How do you show the keyboard shortcuts? With which keys?

A

Alt + Shift + K.

This opens a menu of keyboard shortcuts. It can also be found in the menu under Tools -> Keyboard Shortcuts Help.

58
Q

How do you clear the workspace in R?

A

rm(list = ls())

Or click the broom icon in the Environment pane to remove all objects.

59
Q

What does it mean to layer plots?

A

To layer plots means to combine them in one graph. For example, layering geom_point() and geom_line() draws both the points and the line in the same plot.

60
Q

In ggplot2 code, does + go at the end of a line or at the start of a new line?

A

At the end of a line.

61
Q

What is the shortcut for the pipe operator?

A

Shift + Cmd + M (Ctrl + Shift + M on Windows and Linux).

62
Q

R Programming from Coursera starts here

A

Coursera R Programming

63
Q

Who wrote R?

A

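R was created by Ross Ihaka and Robert Gentleman. It is a dialect of S. S was developed by John Chambers and others at Bell Labs starting in 1976; in 1988 it was rewritten in C, and that version is documented in Statistical Models in S by Chambers and Hastie.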

64
Q

In which country was R created?

A

1991: created in New Zealand
1993: first announcement to the public
1997: the R Core Group formed

65
Q

What are the drawbacks of R?

A
  • Based on a 40-year-old language (S)
  • Functionality is based on user demand and user contributions
  • Objects must generally be stored in physical memory (though advances in available memory have eased this)
  • Not ideal for all possible situations (but this is true of all software packages)
66
Q

The R system is divided into two conceptual parts. What are they?

A
  1. The base R system that you can download from CRAN
    - The "base" R system contains, among other things, the base package, which is required to run R and contains the most fundamental functions.
    - The other packages contained in the "base" system include utils, stats, datasets, graphics, grDevices, grid, methods, tools, parallel, compiler, splines, tcltk, and stats4.
  2. Everything else
    - People often make packages available on their personal websites and Git repositories; there is no reliable way to keep track of how many packages are available in this fashion.
67
Q

How should you ask questions?

A
• Asking questions
	• Provide reproducible output
	• What output do you expect?
	• What do you see instead?
	• The version you are using, e.g. of R and its packages
	• The operating system
• Additional info
	• Subject headers:
		• Smarter: R 3.0.2 lm() function on Mac OS 10.1 … seg fault on large data frame
	• Do:
		• Describe the goal, not the step
		• Be explicit
		• Follow up with the solution if one is found
68
Q

What are attributes in R?

A

• Attributes: R objects can have attributes
□ names, dimensions, class, length, and other user-defined attributes
□ attributes() is the function used to find the attributes of an object
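A minimal sketch of inspecting attributes in base R. The objects x and m are made up for illustration:

```r
x <- c(a = 1, b = 2, c = 3)   # a named numeric vector
x_attrs <- attributes(x)      # a list holding the names attribute
x_attrs

m <- matrix(1:6, nrow = 2)    # a matrix carries a dim attribute
m_dim <- attr(m, "dim")       # attr() fetches one attribute by name
m_dim
```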

69
Q

What is explicit coercion?

A

as.numeric(x) converts x to numeric, as.character(x) converts x to character, and so on.
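A minimal sketch of explicit coercion in base R, using made-up values. Note that the function names are all lowercase, since R is case-sensitive:

```r
x <- 0:3                          # an integer vector
as_num <- as.numeric(x)           # integer to numeric
as_chr <- as.character(x)         # integer to character
as_lgl <- as.logical(x)           # 0 becomes FALSE, any other number TRUE

# A value that cannot be converted becomes NA (R also emits a warning,
# which is suppressed here to keep the output clean)
y <- suppressWarnings(as.numeric(c("1", "b")))
y
```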

70
Q

How do you create a matrix?

A

• Matrix
§ A matrix is a vector with a dim attribute (rows, columns)
§ M
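The card's example is cut off after the first letter. A minimal sketch of two ways to create a matrix in base R, with illustrative values that are not the card's original code:

```r
# matrix() fills the values column by column
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)

# Adding a dim attribute to a plain vector also turns it into a matrix
v <- 1:6
dim(v) <- c(2, 3)
same <- identical(m, v)
same
```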

72
Q

How do you create a matrix with cbind() and rbind()?

A

§ A matrix can be created by column-binding and row-binding with cbind() and rbind(). cbind() combines columns; rbind() combines rows.
§ x
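The card's example is cut off here. A minimal sketch of both functions, using made-up vectors x and y:

```r
x <- 1:3
y <- 10:12

by_col <- cbind(x, y)   # x and y become the columns: 3 rows, 2 columns
by_row <- rbind(x, y)   # x and y become the rows: 2 rows, 3 columns

dim(by_col)
dim(by_row)
```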

73
Q

How are missing values represented?

A

• Missing Values
§ NA, or NaN for undefined mathematical operations
§ is.na() is used to test whether objects are NA
§ is.nan() is used to test for NaN
§ NA values have a class, so there can be an integer NA, a character NA, etc.
§ A NaN value is also NA, but the converse is not true
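A minimal sketch of the two tests on a made-up vector:

```r
x <- c(1, NA, NaN, 4)

na_flags  <- is.na(x)    # TRUE for both the NA and the NaN
nan_flags <- is.nan(x)   # TRUE only for the NaN

# NA comes in typed variants
class(NA_integer_)
class(NA_character_)
```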

74
Q

What are data frames?

A

• Data Frames
§ Store tabular data
§ They are a special type of list where every element has the same length
§ Data frames can store objects of different classes, one class per column
§ Have a special attribute called row.names
§ Data frames are usually created by calling read.table() or read.csv()
§ Can be converted to a matrix by calling data.matrix()
§ Can be created using data.frame()
□ E.g a
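The card's example is cut off here. A minimal sketch with a made-up data frame df:

```r
# Columns may have different classes, but all have the same length
df <- data.frame(id = 1:3, flag = c(TRUE, FALSE, TRUE))

nrow(df)
ncol(df)
row.names(df)            # default row names "1", "2", "3"
col_classes <- sapply(df, class)
col_classes

as_matrix <- data.matrix(df)   # coerces every column to a number
as_matrix
```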

75
Q

What are names for R objects?

A

§ R objects can also have names, which is very useful for writing readable, self-describing code
§ E.g x
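The card's example is cut off here. A minimal sketch of naming the elements of a vector and a list; the names foo, bar and norf are illustrative:

```r
x <- 1:3
names(x)                              # NULL: no names yet
names(x) <- c("foo", "bar", "norf")
x["bar"]                              # elements can now be picked by name

lst <- list(a = 1, b = 2)             # lists can be named when created
names(lst)
```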

76
Q

How do you read tabular data into R?

A

§ read.table() or read.csv() for reading tabular data (rows and columns) from text files
§ readLines() for reading lines of a text file
§ source() for reading in R code files (inverse of dump())
§ dget() for reading in R code files (inverse of dput())
§ load() for reading in saved workspaces

77
Q

How do you read text with read.table() or read.csv()?

A

• read.csv() works like read.table() except that its default separator is a comma and it expects a header line by default. The main arguments are:
§ file, the name of a file or a connection
§ header, a logical indicating whether the file has a header line
§ sep, a string indicating how the columns are separated
§ colClasses, a character vector indicating the class of each column in the dataset
§ nrows, the number of rows in the dataset
§ comment.char, a character string indicating the comment character
§ skip, the number of lines to skip from the beginning
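A minimal sketch of several of these arguments in use. To stay self-contained it reads from an in-memory string through the text argument instead of a file, and the data are made up:

```r
csv_text <- "id,score,grade
1,9.5,A
2,7.0,B
# a comment line
3,8.25,A"

scores <- read.csv(
  text = csv_text,                      # stands in for file = "some_file.csv"
  header = TRUE,                        # the first line holds column names
  colClasses = c("integer", "numeric", "character"),
  comment.char = "#"                    # skip lines starting with #
)
str(scores)
```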

78
Q

How do you write data in R?

A
• Writing Data
			§ write.table()
			§ writeLines()
			§ dump()
			§ dput()
			§ save()
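A minimal sketch of one write/read pair, dput() and dget(), using a temporary file and a made-up data frame y:

```r
y <- data.frame(a = 1, b = "a")

path <- tempfile(fileext = ".R")
dput(y, file = path)     # writes R code that can rebuild y
y_restored <- dget(path) # evaluates that code to get the object back

y_restored
```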
79
Q

How can you make reading large datasets with read.table() easier?

A

• With much larger datasets, doing the following things will make your life easier and will prevent R from choking.
§ Read the help page for read.table(), which contains hints for large datasets
§ Make a rough calculation of the memory required to store your data and check whether your RAM can hold it
§ Set comment.char = "" if there are no commented lines in the file
§ Use the colClasses argument. Specifying this option instead of using the default can make read.table() run MUCH faster, often twice as fast. In order to use this option, you have to know the class of each column in your data frame. If all of the columns are "numeric", for example, then you can just set colClasses = "numeric". A quick and dirty way to figure out the classes of each column is the following:
§
initial
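The card's code is cut off after its first word. A minimal sketch of the idea, assuming a stand-in data file that is written to a temporary path first: read a small sample, let R guess the column classes from it, then reuse those classes for the full read.

```r
# Write a stand-in "large" file so the sketch is self-contained
path <- tempfile(fileext = ".txt")
write.table(data.frame(x = 1:500, y = rnorm(500)), path, row.names = FALSE)

# Read only the first 100 rows and record the class of each column
initial <- read.table(path, header = TRUE, nrows = 100)
classes <- sapply(initial, class)

# Read the whole file with the classes supplied up front
tab_all <- read.table(path, header = TRUE, colClasses = classes)
```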

80
Q

How do you calculate the memory requirement when using R with large data?

A

• Calculating Memory Requirements

I have a data frame with 1,500,000 rows and 120 columns, all of which are numeric data.

Roughly, how much memory is required to store this data frame? 1,500,000 × 120 × 8 bytes/numeric = 1,440,000,000 bytes = 1,440,000,000 / 2^20 bytes/MB = 1,373.29 MB = 1.34 GB
• Rule of Thumb
§ You need about twice as much memory as the raw data requires
Example: in this case 1.34 × 2 = 2.68 GB
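The same arithmetic as a short R sketch; the variable names are illustrative:

```r
n_rows <- 1500000
n_cols <- 120
bytes_per_numeric <- 8

bytes <- n_rows * n_cols * bytes_per_numeric   # total bytes for the raw data
mb <- bytes / 2^20                             # bytes to megabytes
gb <- mb / 2^10                                # megabytes to gigabytes

round(mb, 2)
round(gb, 2)
round(2 * gb, 2)   # rule of thumb: about twice the raw size
```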

81
Q

How do you subset a list?

A

• Subsetting a List
• x$bar gives the value of the element named bar; in this example it returns 0.6
□ Or x[["bar"]]
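A minimal sketch, assuming a list x whose bar element is 0.6 as the card states; the foo element is made up:

```r
x <- list(foo = 1:4, bar = 0.6)

by_dollar  <- x$bar        # the element itself
by_double  <- x[["bar"]]   # the same element, name given as a string
by_single  <- x["bar"]     # a list of length one that contains bar

by_dollar
class(by_single)
```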

82
Q

How do you extract multiple elements from a list?

A

• Extracting multiple elements from a list
• x[c(1, 2)] returns the foo and baz values
You can't use $ or double brackets to extract multiple elements from a list, only single brackets.
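A minimal sketch, assuming a hypothetical list whose first two elements are named foo and baz:

```r
x <- list(foo = 1:4, baz = "hello")

both <- x[c(1, 2)]               # single brackets return a list of both elements
by_name <- x[c("foo", "baz")]    # names work as well as positions

# Double brackets with a vector do something else: recursive indexing,
# i.e. the second value inside the first element
nested <- x[[c(1, 2)]]
nested
```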

83
Q

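How do you remove NA or NaN values from a vector?

A

Create a logical vector a that marks the missing values of x with is.na(), then keep only the other elements with x[!a]:
[1] 1 2 4 5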
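The card's code is truncated, so the input vector below is an assumption chosen to match the printed result:

```r
x <- c(1, 2, NA, 4, NaN, 5)   # hypothetical input with missing values

a <- is.na(x)                 # TRUE where the value is NA or NaN
cleaned <- x[!a]              # keep everything that is not missing
cleaned
```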

84
Q

The capabilities of R reflect the needs of its community. What can you say about that?

A

The capabilities of the R system generally reflect the interests of the R user community. As the community has ballooned in size over the past 10 years, the capabilities have similarly increased.

85
Q

Who is allowed to change the primary source code of R?

A

The R Core Group

86
Q

Who wrote about the design of the R graphics system?

A

Paul Murrell, in the book R Graphics.

87
Q

Springer has the Use R! series.

A

It is a series of R books in specific areas. I think we will write an R book, or our PhDs could be turned into SpringerBriefs.

88
Q

x = 5
x
[1] 5
What does [1] mean?

A

It tells us that x is a vector and that 5 is its first element.

89
Q

Everything in R is a what?

A

An object.

90
Q

What are R's five basic atomic classes of objects?

A

character, numeric, logical, complex, integer

91
Q

What is the shorthand for TRUE and FALSE?

A

T and F.

92
Q

Do a vector and a list print out the same way?

A

No. The elements of a list can be of different types, so R prints each element separately, on its own lines.

93
Q

What does the printout of a list show us: [ or [[ for indexing its elements?

A

It uses [[:
> a
[[1]]
[1] 1

94
Q

In printed output, the elements of a list are labelled with double brackets and the elements of other vectors with single brackets. True or false?

A

True
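A minimal sketch that shows the difference when printing and when subsetting, using a made-up list a and vector v:

```r
a <- list(1, "a", TRUE)
v <- c(1, 2, 3)

print(a)   # each element is labelled [[1]], [[2]], [[3]]
print(v)   # a single line that starts with [1]

piece <- a[1]      # single brackets on a list return a smaller list
value <- a[[1]]    # double brackets return the element itself
class(piece)
class(value)
```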