Program Flashcards

1
Q

What are the core tidyverse packages?

A

The ggplot2, tibble, tidyr, readr, purrr, and dplyr packages.

2
Q

Why master one thing at a time?

A

We strongly believe that it's best to master one tool at a time. You will get better faster if you dive deep, rather than spreading yourself thinly over many topics. This doesn't mean you should only know one thing, just that you'll generally learn faster if you stick to one thing at a time. You should strive to learn new things throughout your career, but make sure your understanding is solid before you move on to the next interesting thing.

3
Q

Where does the pipe come from?

A

The pipe, %>%, comes from the magrittr package by Stefan Milton Bache.

4
Q

If you load the tidyverse, do you need to load magrittr?

A

Packages in the tidyverse load %>% for you automatically, so you don't usually load magrittr explicitly.

5
Q

Which two classes of functions won't the pipe work with?

A
  1. Functions that use the current environment. For example, assign() will create a new variable with the given name in the current environment.
  2. Functions that use lazy evaluation. In R, function arguments are only computed when the function uses them, not prior to calling the function. The pipe computes each element in turn, so you can't rely on this behaviour.
6
Q

Pipes are most useful for rewriting a fairly short linear sequence of operations. When should you not use the pipe?

A
  1. Your pipes are longer than (say) ten steps. In that case, create intermediate objects with meaningful names. That will make debugging easier, because you can more easily check the intermediate results, and it makes it easier to understand your code, because the variable names can help communicate intent.
  2. You have multiple inputs or outputs. If there isn't one primary object being transformed, but two or more objects being combined together, don't use the pipe.
  3. You are starting to think about a directed graph with a complex dependency structure. Pipes are fundamentally linear and expressing complex relationships with them will typically yield confusing code.
7
Q

For assignment, magrittr provides the %<>% operator. What does it allow you to do?

A

It allows you to replace code like:

mtcars <- mtcars %>%
  transform(cyl = cyl * 2)

with:

mtcars %<>% transform(cyl = cyl * 2)

I'm not a fan of this operator because I think assignment is such a special operation that it should always be clear when it's occurring. In my opinion, a little bit of duplication (i.e. repeating the name of the object twice) is fine in return for making assignment more explicit.

8
Q

Why write functions?

A

Functions allow you to automate common tasks in a more powerful and general way than copy-and-pasting. Writing a function has three advantages:
  1. You can give a function an evocative name that makes your code easier to understand.
  2. As requirements change, you only need to update code in one place, instead of many.
  3. You eliminate the chance of making incidental mistakes when you copy and paste (i.e. updating a variable name in one place, but not in another).

9
Q

When writing functions, do we need the tidyverse or just base R?

A

In R4DS, the functions chapter focuses on writing functions in base R, so there is no need to load any library.

10
Q

When should you consider writing a function?

A

You should consider writing a function whenever you've copied and pasted a block of code more than twice (i.e. you now have three copies of the same code).

11
Q

To create a tibble without attaching the package via library(), you need to qualify the call explicitly, as in tibble::tibble().

A

df

12
Q

There are three key steps to creating a function. What are they?

A

There are three key steps to creating a new function:
  1. You need to pick a name for the function. Here I've used rescale01 because this function rescales a vector to lie between 0 and 1.
  2. You list the inputs, or arguments, to the function inside function. Here we have just one argument. If we had more the call would look like function(x, y, z).
  3. You place the code you have developed in the body of the function, a { block that immediately follows function(...).
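
The three steps can be sketched in base R as follows. The name rescale01 and its purpose come from the card; the body is an assumption about how such a rescaling might be written, not necessarily the original code.

```r
# Step 1: the name. Step 2: the argument list. Step 3: the body in { }.
rescale01 <- function(x) {
  # range() returns c(min, max); na.rm = TRUE ignores missing values
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

rescale01(c(0, 5, 10))
#> [1] 0.0 0.5 1.0
```

Computing the range once inside the function also removes the repeated min() and max() calls that a copy-and-paste version would need.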

13
Q

It's easier to start with working code and turn it into a function. What is harder?

A

It's harder to create a function and then try to make it work.

14
Q

Generally, what should function names be? What should arguments be?

A

Generally, function names should be verbs, and arguments should be nouns. There are some exceptions: nouns are ok if the function computes a very well known noun (i.e. mean() is better than compute_mean()), or accessing some property of an object (i.e. coef() is better than get_coefficients()). A good sign that a noun might be a better choice is if you’re using a very broad verb like “get”, “compute”, “calculate”, or “determine”. Use your best judgement and don’t be afraid to rename a function if you figure out a better name later.

15
Q

What is the recommendation for naming functions in R?

A

If your function name is composed of multiple words, I recommend using "snake_case", where each lowercase word is separated by an underscore. camelCase is a popular alternative. It doesn't really matter which one you pick; the important thing is to be consistent:

# Never do this! col_mins

16
Q

How should you name a family of functions that do similar things?

A

If you have a family of functions that do similar things, make sure they have consistent names and arguments. Use a common prefix to indicate that they are connected. That's better than a common suffix because autocomplete allows you to type the prefix and see all the members of the family.

# Good
input_select()
input_checkbox()
input_text()

# Not so good
select_input()
checkbox_input()
text_input()

A good example of this design is the stringr package: if you don't remember exactly which function you need, you can type str_ and jog your memory.

17
Q

How should we use comments in functions?

A

Use comments, lines starting with #, to explain the "why" of your code. You generally should avoid comments that explain the "what" or the "how". If you can't understand what the code does from reading it, you should think about how to rewrite it to be clearer. Another important use of comments is to break up your file into easily readable chunks. Use long lines of - and = to make it easy to spot the breaks.

# Load data --------------------------------------

# Plot data --------------------------------------

18
Q

?if does not open the help page. How do you get help on if?

A

To get help on if you need to surround it in backticks: ?`if`

19
Q

What are || (or) and && (and) used for?

A

You can use || (or) and && (and) to combine multiple logical expressions. You should never use | or & in an if statement: these are vectorised operations that apply to multiple values (that's why you use them in filter()).

20
Q

Why should you be careful when testing equality with ==?

A

Be careful when testing for equality. == is vectorised, which means that it's easy to get more than one output. Either check the length is already 1, collapse with all() or any(), or use the non-vectorised identical(). identical() is very strict: it always returns either a single TRUE or a single FALSE, and doesn't coerce types. This means that you need to be careful when comparing integers and doubles.
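
A small base R illustration of the points above; the vector x is a made-up example.

```r
x <- c(1, 2, 3)

x == 2            # vectorised: one result per element
#> [1] FALSE  TRUE FALSE
any(x == 2)       # collapse to a single value
#> [1] TRUE
all(x == 2)
#> [1] FALSE

identical(0L, 0)  # integer vs double: types differ, so not identical
#> [1] FALSE
identical(0L, 0L)
#> [1] TRUE
```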

21
Q

When should you use near() instead of ==?

A

There's another common problem you might encounter when using ==: floating point numbers. These results might surprise you!

sqrt(2) ^ 2 == 2
#> [1] FALSE
1 / 49 * 49 == 1
#> [1] FALSE

Remember that every number you see is an approximation. Instead of relying on ==, use dplyr::near():

near(sqrt(2) ^ 2, 2)
#> [1] TRUE
near(1 / 49 * 49, 1)
#> [1] TRUE

22
Q

How do you handle multiple conditions?

A

Chain if statements with else if, or use switch(). If you end up with a very long series of chained if statements, you should consider rewriting. One useful technique is the switch() function. It allows you to evaluate selected code based on position or name.
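
A minimal sketch of switch() selecting code by name. The helper apply_op and its operation names are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: choose an arithmetic operation by name.
apply_op <- function(x, y, op) {
  switch(op,
    plus   = x + y,
    minus  = x - y,
    times  = x * y,
    divide = x / y,
    # An unnamed final argument is the fallback when nothing matches
    stop("Unknown op: ", op)
  )
}

apply_op(6, 3, "minus")
#> [1] 3
```

The same logic written with if / else if would need one branch per operation, each repeating the comparison against op.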

23
Q

If you have a logical vector, how can you collapse it to a single value?

A

If you do have a logical vector, you can use any() or all() to collapse it to a single value.

24
Q

What code style should you use with if and function?

A

Both if and function should (almost) always be followed by squiggly brackets ({}), and the contents should be indented by two spaces. An opening curly brace should never go on its own line and should always be followed by a new line. A closing curly brace should always go on its own line, unless it's followed by else. Always indent the code inside curly braces.

# Good
if (y < 0 && debug) {
  message("Y is negative")
}

if (y == 0) {
  log(x)
} else {
  y ^ x
}

25
Q

The arguments of a function fall into which two sets?

A

The arguments to a function typically fall into two broad sets: one set supplies the data to compute on, and the other supplies arguments that control the details of the computation. For example, in mean(), the data is x, and the details are how much data to trim from the ends (trim) and how to handle missing values (na.rm). Generally, data arguments should come first. Detail arguments should go on the end, and usually should have default values.

26
Q

What is the best way to use spaces around = and after commas in function calls?

A

Notice that when you call a function, you should place a space around = in function calls, and always put a space after a comma, not before (just like in regular English) # Good average

27
Q

How should you name arguments?

A

The names of the arguments are also important. R doesn't care, but the readers of your code (including future-you!) will. Generally you should prefer longer, more descriptive names, but there are a handful of very common, very short names. It's worth memorising these:

x, y, z: vectors.
w: a vector of weights.
df: a data frame.
i, j: numeric indices (typically rows and columns).
n: length, or number of rows.
p: number of columns.

28
Q

How should you check values inside a function?

A

It’s good practice to check important preconditions, and throw an error (with stop()), if they are not true: wt_mean
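
A minimal sketch of a precondition check with stop(). The name wt_mean appears on the card; the body and the error message are assumptions about what a weighted-mean helper might look like.

```r
# Weighted mean that refuses inputs of mismatched length.
wt_mean <- function(x, w) {
  if (length(x) != length(w)) {
    # call. = FALSE keeps the function call out of the error message
    stop("`x` and `w` must be the same length", call. = FALSE)
  }
  sum(w * x) / sum(w)
}

wt_mean(c(1, 2, 3), c(1, 1, 2))
#> [1] 2.25
```

Without the check, R's recycling rules would silently reuse the shorter vector and return a plausible but wrong number.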

29
Q

What is NA in R?

A

Missing data in R appears as NA. NA is not a string or a numeric value, but an indicator of missingness. We can create vectors with missing values. x1
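
A short base R illustration of how NA behaves; the vector x here is a made-up example, not the card's original.

```r
x <- c(1, NA, 3)

is.na(x)               # test for missingness with is.na(), not ==
#> [1] FALSE  TRUE FALSE
x == NA                # comparing with NA yields NA
#> [1] NA NA NA
mean(x)                # NA propagates through most computations
#> [1] NA
mean(x, na.rm = TRUE)  # drop missing values explicitly
#> [1] 2
```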

30
Q

What are na.omit and na.exclude?

A

na.omit and na.exclude return the object with observations removed if they contain any missing values. The difference between omitting and excluding NAs shows up in some prediction and residual functions.
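
One place the difference shows up is residuals() on a linear model. This is a sketch with made-up data: both fits drop the incomplete row, but na.exclude pads the residuals back to the original length with NA.

```r
d <- data.frame(x = c(1, 2, 3, 4, 5),
                y = c(2.1, NA, 6.2, 7.9, 10.1))

nrow(na.omit(d))               # the row with the NA is removed
#> [1] 4

fit_omit <- lm(y ~ x, data = d, na.action = na.omit)
fit_excl <- lm(y ~ x, data = d, na.action = na.exclude)

length(residuals(fit_omit))    # one residual per complete row
#> [1] 4
length(residuals(fit_excl))    # padded to match the original rows
#> [1] 5
anyNA(residuals(fit_excl))
#> [1] TRUE
```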

31
Q

How to use R’s ellipsis feature when writing your own function?

A

https://stackoverflow.com/questions/3057341/how-to-use-rs-ellipsis-feature-when-writing-your-own-function
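
As a minimal sketch of the two common uses of ... (the helper names commas and safe_mean are hypothetical): capture the extra arguments yourself, or forward them untouched to another function.

```r
# Capture: c(...) collects whatever the caller passed
commas <- function(...) {
  paste(c(...), collapse = ", ")
}
commas("a", "b", "c")
#> [1] "a, b, c"

# Forward: pass ... on to mean() without naming each option
safe_mean <- function(x, ...) {
  mean(x, ...)
}
safe_mean(c(1, 2, NA), na.rm = TRUE)
#> [1] 1.5
```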

32
Q

What does it mean that R uses lazy evaluation?

A

Arguments in R are lazily evaluated: they’re not computed until they’re needed. That means if they’re never used, they’re never called. This is an important property of R as a programming language, but is generally not important when you’re writing your own functions for data analysis
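
A tiny demonstration, using a made-up function f: the second argument would raise an error if it were evaluated, but because the body never uses it, the call succeeds.

```r
f <- function(x, y) {
  x * 2  # y is never used, so it is never evaluated
}

f(10, stop("this is never evaluated"))
#> [1] 20
```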

33
Q

How many types of vectors are there?

A
  1. Atomic vectors, of which there are six types: logical, integer, double, character, complex, and raw. Integer and double vectors are collectively known as numeric vectors.
  2. Lists, which are sometimes called recursive vectors because lists can contain other lists.

The chief difference between atomic vectors and lists is that atomic vectors are homogeneous, while lists can be heterogeneous.
34
Q

What is the relationship between NULL and NA?

A

NULL is often used to represent the absence of a vector (as opposed to NA, which is used to represent the absence of a value in a vector). NULL typically behaves like a vector of length 0.

35
Q

Every vector has two key properties, what are they?

A
  1. Its type, which you can determine with typeof().
  2. Its length, which you can determine with length().
36
Q

What are the three important types of augmented vector?

A
  1. Factors are built on top of integer vectors.
  2. Dates and date-times are built on top of numeric vectors.
  3. Data frames and tibbles are built on top of lists.
37
Q

What are the four most important types of atomic vector?

A

The four most important types of atomic vector are logical, integer, double, and character. Raw and complex are rarely used during a data analysis.

38
Q

What is a logical vector?

A

Logical vectors are the simplest type of atomic vector because they can take only three possible values: FALSE, TRUE, and NA.

1:10 %% 3 == 0
#> [1] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE

39
Q

What are numeric vectors?

A

Integer and double vectors are known collectively as numeric vectors. In R, numbers are doubles by default. To make an integer, place an L after the number:

typeof(1)
#> [1] "double"
typeof(1L)
#> [1] "integer"
1.5L
#> [1] 1.5

40
Q

What is the difference between integers and doubles? Doubles are approximations: what does that mean?

A

Doubles are approximations. Doubles represent floating point numbers that cannot always be precisely represented with a fixed amount of memory. This means that you should consider all doubles to be approximations. For example, what is the square of the square root of two?

x <- sqrt(2) ^ 2
x
#> [1] 2
x - 2
#> [1] 4.44e-16

This behaviour is common when working with floating point numbers: most calculations include some approximation error. Instead of comparing floating point numbers using ==, you should use dplyr::near(), which allows for some numerical tolerance.

41
Q

How do the special values of integers and doubles differ?

A

Integers have one special value: NA, while doubles have four: NA, NaN, Inf and -Inf. All three special values NaN, Inf and -Inf can arise during division:

c(-1, 0, 1) / 0
#> [1] -Inf  NaN  Inf

Avoid using == to check for these other special values. Instead use the helper functions is.finite(), is.infinite(), and is.nan().
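
To see how the helper functions (plus is.na()) classify each special value, here is a base R check on a made-up vector:

```r
x <- c(0, Inf, -Inf, NA, NaN)

is.finite(x)
#> [1]  TRUE FALSE FALSE FALSE FALSE
is.infinite(x)
#> [1] FALSE  TRUE  TRUE FALSE FALSE
is.na(x)       # TRUE for both NA and NaN
#> [1] FALSE FALSE FALSE  TRUE  TRUE
is.nan(x)      # TRUE only for NaN
#> [1] FALSE FALSE FALSE FALSE  TRUE
```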

42
Q

What are character vectors?

A

Character vectors are the most complex type of atomic vector, because each element of a character vector is a string, and a string can contain an arbitrary amount of data.

43
Q

R uses a global string pool. What does that mean?

A

Here I wanted to mention one important feature of the underlying string implementation: R uses a global string pool. This means that each unique string is only stored in memory once, and every use of the string points to that representation. This reduces the amount of memory needed by duplicated strings.

44
Q

What are the different missing values for each type of atomic vector?

A

Note that each type of atomic vector has its own missing value:

NA            # logical
#> [1] NA
NA_integer_   # integer
#> [1] NA
NA_real_      # double
#> [1] NA
NA_character_ # character
#> [1] NA

Normally you don't need to know about these different types because you can always use NA and it will be converted to the correct type using the implicit coercion rules.

45
Q

What is coercion, and how many kinds are there in R?

A

There are two ways to convert, or coerce, one type of vector to another:
  1. Explicit coercion happens when you call a function like as.logical(), as.integer(), as.double(), or as.character(). Whenever you find yourself using explicit coercion, you should always check whether you can make the fix upstream, so that the vector never had the wrong type in the first place. For example, you may need to tweak your readr col_types specification.
  2. Implicit coercion happens when you use a vector in a specific context that expects a certain type of vector. For example, when you use a logical vector with a numeric summary function, or when you use a double vector where an integer vector is expected.

46
Q

What does implicit coercion look like in code?

A

You may see some code (typically older) that relies on implicit coercion in the opposite direction, from integer to logical:

if (length(x)) {
  # do something
}

47
Q

When you combine different types in one vector with c(), which type wins?

A

The most complex type always wins:

typeof(c(TRUE, 1L))
#> [1] "integer"
typeof(c(1L, 1.5))
#> [1] "double"
typeof(c(1.5, "a"))
#> [1] "character"

48
Q

What test functions are there in R?

A

One option is to use typeof(). Another is to use a test function which returns a TRUE or FALSE. Base R provides many functions like is.vector() and is.atomic(), but they often return surprising results. Instead, it's safer to use the is_* functions provided by purrr:

                 lgl  int  dbl  chr  list
is_logical()      x
is_integer()           x
is_double()                 x
is_numeric()           x    x
is_character()                   x
is_atomic()       x    x    x    x
is_list()                             x
is_vector()       x    x    x    x    x

Each predicate also comes with a "scalar" version, like is_scalar_atomic(), which checks that the length is 1. This is useful, for example, if you want to check that an argument to your function is a single logical value.

49
Q

R does not have scalars. What does it have instead?

A

R doesn't actually have scalars: instead, a single number is a vector of length 1. Because there are no scalars, most built-in functions are vectorised, meaning that they will operate on a vector of numbers. That's why, for example, this code works:

sample(10) + 100
#> [1] 109 108 104 102 103 110 106 107 105 101

50
Q

What happens if you add two vectors of different lengths?

A

1:10 + 1:2
#> [1]  2  4  4  6  6  8  8 10 10 12

Here, R will expand the shortest vector to the same length as the longest, so-called recycling. This is silent except when the length of the longer is not an integer multiple of the length of the shorter:

1:10 + 1:3
#> Warning in 1:10 + 1:3: longer object length is not a multiple of shorter
#> object length
#> [1]  2  4  6  5  7  9  8 10 12 11

51
Q

The tidyverse throws an error if vectors have different lengths. What is the solution?

A

The vectorised functions in the tidyverse will throw errors when you recycle anything other than a scalar. If you do want to recycle, you'll need to do it yourself with rep():

tibble(x = 1:4, y = 1:2)
#> Error: Tibble columns must have consistent lengths, only values of length one are recycled:
#> * Length 2: Column y
#> * Length 4: Column x

tibble(x = 1:4, y = rep(1:2, 2))
#> # A tibble: 4 x 2
#>       x     y
#>   <int> <int>
#> 1     1     1
#> 2     2     2
#> 3     3     1
#> 4     4     2

tibble(x = 1:4, y = rep(1:2, each = 2))
#> # A tibble: 4 x 2
#>       x     y
#>   <int> <int>
#> 1     1     1
#> 2     2     1
#> 3     3     2
#> 4     4     2

52
Q

How do you name vectors in R?

A

All types of vectors can be named. You can name them during creation with c():

c(x = 1, y = 2, z = 4)
#> x y z
#> 1 2 4

Or after the fact with purrr::set_names():

set_names(1:3, c("a", "b", "c"))
#> a b c
#> 1 2 3

53
Q

Can we use dplyr::filter() on vectors rather than tibbles?

A

So far we've used dplyr::filter() to filter the rows in a tibble. filter() only works with tibbles, so we need a new tool for vectors: [. [ is the subsetting function, and is called like x[a].

54
Q

How can you subset a vector with integers?

A

Use a numeric vector containing only integers. The integers must either be all positive, all negative, or zero. Subsetting with positive integers keeps the elements at those positions, while negative integers drop the elements at those positions. It's an error to mix positive and negative values:

x[c(1, -1)]
#> Error in x[c(1, -1)]: only 0's may be mixed with negative subscripts
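
A short base R sketch of integer subsetting; the vector x is an assumed example chosen so that each position is easy to follow.

```r
x <- c("one", "two", "three", "four", "five")

x[c(3, 2, 5)]     # positive integers keep those positions
#> [1] "three" "two"   "five"
x[c(1, 1, 5)]     # repeating a position repeats the element
#> [1] "one"  "one"  "five"
x[c(-1, -3, -5)]  # negative integers drop those positions
#> [1] "two"  "four"
x[0]              # zero returns no values
#> character(0)
```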
55
Q

There are four ways to subset a vector. How do you subset a vector with a logical vector?

A

Subsetting with a logical vector keeps all values corresponding to a TRUE value. This is most often useful in conjunction with the comparison functions.

x <- c(10, 3, NA, 5, 8, 1, NA)

# All non-missing values of x
x[!is.na(x)]
#> [1] 10  3  5  8  1

# All even (or missing!) values of x
x[x %% 2 == 0]
#> [1] 10 NA  8 NA

56
Q

What are the four ways to subset a vector?

A
  1. With an integer vector.
  2. With a logical vector.
  3. With a character vector.
  4. With nothing. The simplest type of subsetting is nothing, x[], which returns the complete x. This is not useful for subsetting vectors, but it is useful when subsetting matrices.
57
Q

How do you subset with a character vector?

A

If you have a named vector, you can subset it with a character vector:

x <- c(abc = 1, def = 2, xyz = 5)
x[c("xyz", "def")]
#> xyz def
#>   5   2

58
Q

What are recursive vectors?

A

Lists are recursive vectors: a list can contain other lists.

59
Q

What makes a list different from an atomic vector, and how can we create one?

A

Lists can contain other lists. This makes them suitable for representing hierarchical or tree-like structures. You create a list with list():

x <- list(1, 2, 3)
x
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 2
#>
#> [[3]]
#> [1] 3

60
Q

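What is the difference between subsetting vectors and lists?

A

[ works in both cases and returns an object of the same kind: x[1] on an atomic vector gives a shorter vector, and x[1] on a list gives a smaller list. To extract a single element from a list, use x[[1]].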

61
Q

What does str() do for a list in R?

A

str() is a compact way to display the structure of an R object, which makes it useful as a diagnostic function and as an alternative to summary(). It aims to produce output for any R object, printing roughly one line per basic structure, and it is especially good for displaying the contents of lists.

str(x)
#> List of 3
#>  $ : num 1
#>  $ : num 2
#>  $ : num 3

x_named <- list(a = 1, b = 2, c = 3)
str(x_named)
#> List of 3
#>  $ a: num 1
#>  $ b: num 2
#>  $ c: num 3

Unlike atomic vectors, list() can contain a mix of objects:

y <- list("a", 1L, 1.5, TRUE)
str(y)
#> List of 4
#>  $ : chr "a"
#>  $ : int 1
#>  $ : num 1.5
#>  $ : logi TRUE

Lists can even contain other lists!

62
Q

How many ways are there to subset a list?

A

There are three ways to subset a list, which I'll illustrate with a list named a:

a <- list(a = 1:3, b = "a string", c = pi, d = list(-1, -5))

[ extracts a sub-list. The result will always be a list. Like with vectors, you can subset with a logical, integer, or character vector.

str(a[1:2])
#> List of 2
#>  $ a: int [1:3] 1 2 3
#>  $ b: chr "a string"

[[ extracts a single component from a list. It removes a level of hierarchy from the list.

str(a[[1]])
#>  int [1:3] 1 2 3
str(a[[4]])
#> List of 2
#>  $ : num -1
#>  $ : num -5

$ is a shorthand for extracting named elements of a list. It works similarly to [[ except that you don't need to use quotes.

a$a
#> [1] 1 2 3
a[["a"]]
#> [1] 1 2 3

63
Q

Difference between [ and [[ in subsetting

A

The distinction between [ and [[ is really important for lists, because [[ drills down into the list while [ returns a new, smaller list.
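
The distinction can be checked directly in base R. The list a below is a small made-up example.

```r
a <- list(a = 1:3, b = "a string")

is.list(a[1])     # [ returns a smaller list
#> [1] TRUE
is.list(a[[1]])   # [[ drills down to the element itself
#> [1] FALSE
a[[1]]
#> [1] 1 2 3
identical(a$b, a[["b"]])  # $ is shorthand for [[ with a name
#> [1] TRUE
```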

64
Q
A
65
Q

What is the difference between subsetting a tibble and a list?

A

Subsetting a tibble works the same way as a list; a data frame can be thought of as a list of columns. The key difference between a list and a tibble is that all the elements (columns) of a tibble must have the same length (number of rows). Lists can have vectors with different lengths as elements.

66
Q

What are augmented vectors?

A

Atomic vectors and lists are the building blocks for other important vector types like factors and dates. I call these augmented vectors, because they are vectors with additional attributes, including class. Because augmented vectors have a class, they behave differently to the atomic vector on which they are built.

The four augmented vectors:

Factors

Dates

Date-times

Tibbles

67
Q

What are factors?

A

Factors are designed to represent categorical data that can take a fixed set of possible values. Factors are built on top of integers, and have a levels attribute:
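
A brief base R sketch showing the integer storage and the levels attribute; the values are made up for illustration.

```r
x <- factor(c("ab", "cd", "ab"), levels = c("ab", "cd", "ef"))

typeof(x)      # stored as integers
#> [1] "integer"
class(x)
#> [1] "factor"
levels(x)      # the fixed set of possible values
#> [1] "ab" "cd" "ef"
as.integer(x)  # each value is an index into levels(x)
#> [1] 1 2 1
```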

68
Q

What are Date and date-time augmented vectors?

A

Dates in R are numeric vectors that represent the number of days since 1 January 1970.

x <- as.Date("1971-01-01")
unclass(x)
#> [1] 365
typeof(x)
#> [1] "double"
attributes(x)
#> $class
#> [1] "Date"
69
Q

Why are tibbles called augmented lists?

A

Tibbles are augmented lists: they have class "tbl_df" + "tbl" + "data.frame", and names (column) and row.names attributes.

The difference between a tibble and a list is that all the elements of a tibble must be vectors with the same length. All functions that work with tibbles enforce this constraint.

Traditional data.frames have a very similar structure.

The main difference is the class. The class of tibble includes "data.frame", which means tibbles inherit the regular data frame behaviour by default.
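
The list structure is easy to inspect on a traditional data frame in base R. This sketch uses a made-up data frame and deliberately avoids loading tibble.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

typeof(df)                     # a data frame is a list underneath
#> [1] "list"
class(df)
#> [1] "data.frame"
sort(names(attributes(df)))    # the attributes that augment the list
#> [1] "class"     "names"     "row.names"
length(df)                     # one list element per column
#> [1] 2
```

Trying to build a data frame from columns whose lengths do not fit together, for example data.frame(x = 1:3, y = 1:2), raises an error, which is the same-length constraint in action.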

70
Q

Reducing code duplication has three main benefits. What are they?

A
  1. It’s easier to see the intent of your code, because your eyes are drawn to what’s different, not what stays the same.
  2. It’s easier to respond to changes in requirements. As your needs change, you only need to make changes in one place, rather than remembering to change every place that you copied-and-pasted the code.
  3. You’re likely to have fewer bugs because each line of code is used in more places.
71
Q
A
72
Q

What is another tool, after functions, for reducing duplication in code?

A

Iteration, which helps you when you need to do the same thing to multiple inputs: repeating the same operation on different columns, or on different datasets.

73
Q
A
74
Q

What are important iteration paradigms in this book?

A

imperative programming and functional programming

On the imperative side you have tools like for loops and while loops, which are a great place to start because they make iteration very explicit, so it’s obvious what’s happening. However, for loops are quite verbose, and require quite a bit of bookkeeping code that is duplicated for every for loop.

Functional programming (FP) offers tools to extract out this duplicated code, so each common for loop pattern gets its own function. Once you master the vocabulary of FP, you can solve many common iteration problems with less code, more ease, and fewer errors.

75
Q

Why learn loops and then discard them?

A

Learn about loops. They give you a detailed view of what is supposed to happen at the elementary level, and they help you understand the data that you're manipulating.

Once you have a clear understanding of loops, get rid of them.

Put your effort into learning about vectorized alternatives. It pays off in terms of efficiency.

76
Q

What is the general structure of a for loop?

A
77
Q

How do you create an empty vector?

A

A general way of creating an empty vector of given length is the vector() function. It has two arguments: the type of the vector ("logical", "integer", "double", "character", etc.) and the length of the vector.

output <- vector("double", length(x))

Before you start the loop, you must always allocate sufficient space for the output.

78
Q

What are the three components of a for loop?

A

output <- vector("double", ncol(df))  # 1. output
for (i in seq_along(df)) {            # 2. sequence
  output[[i]] <- median(df[[i]])      # 3. body
}

79
Q
A
80
Q

What does each component of a for loop do?

A
  1. The output: output <- vector("double", length(x)). Before you start the loop, you must always allocate sufficient space for the output.
  2. The sequence: i in seq_along(df). This determines what to loop over: each run of the for loop will assign i to a different value from seq_along(df). You might not have seen seq_along() before. It's a safe version of the familiar 1:length(l): if the vector has length zero, seq_along() returns an empty sequence, whereas 1:length(l) counts down from 1 to 0.
  3. The body: output[[i]] <- median(df[[i]]). This is the code that does the work, and it runs once for each value of i.
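
Putting the three components together in a runnable base R sketch. The data frame df is made up so that the medians are easy to verify by eye.

```r
df <- data.frame(a = c(1, 5, 9), b = c(2, 4, 6), c = c(10, 20, 60))

output <- vector("double", ncol(df))  # 1. output
for (i in seq_along(df)) {            # 2. sequence
  output[[i]] <- median(df[[i]])      # 3. body
}
output
#> [1]  5  4 20

# Why seq_along() is safer than 1:length() for an empty vector
y <- vector("double", 0)
seq_along(y)
#> integer(0)
1:length(y)
#> [1] 1 0
```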
81
Q

For loop variations?

A

There are four variations on the basic theme of the for loop:

  1. Modifying an existing object, instead of creating a new object.
  2. Looping over names or values, instead of indices.
  3. Handling outputs of unknown length.
  4. Handling sequences of unknown length.
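
Two of these variations, sketched in base R with made-up inputs. For outputs of unknown length, one common approach is to collect the pieces in a list and flatten once at the end; for sequences of unknown length, a while loop runs until a condition is met.

```r
# Outputs of unknown length: save each piece in a list, then combine
lens <- c(1, 3, 2)
out <- vector("list", length(lens))
for (i in seq_along(lens)) {
  out[[i]] <- seq_len(lens[[i]])
}
result <- unlist(out)
result
#> [1] 1 1 2 3 1 2

# Sequences of unknown length: loop until the condition fails
n <- 6
steps <- 0
while (n != 1) {
  n <- if (n %% 2 == 0) n / 2 else 3 * n + 1
  steps <- steps + 1
}
steps
#> [1] 8
```

Growing a vector with c() inside the loop would also work, but it copies the data on every iteration, which is why the list-then-unlist pattern is usually preferred.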
82
Q

How do you get output from a function?

A

A function automatically returns the value of the last statement evaluated in its body, so you don't need an explicit return() or a variable to hold the result.

83
Q
A
84
Q
A
85
Q

What are two tools for reducing duplication?

A

Functions and iteration (which helps you when you need to do the same thing to multiple inputs: repeating the same operation on different columns, or on different datasets).

86
Q
A
87
Q

What makes transforming tidy data feel natural?

A

Most built-in R functions work with vectors of values. That makes transforming tidy data feel particularly natural. dplyr, ggplot2, and all the other packages in the tidyverse are designed to work with tidy data.

88
Q
A
89
Q

How do you use count() to compute cases per year?

A
# Compute cases per year
table1 %>%
  count(year, wt = cases)
#> # A tibble: 2 x 2
#>    year      n
#>   <int>  <int>
#> 1  1999 250740
#> 2  2000 296920
90
Q

How do you use gathering to tidy data?

A

A common problem is a dataset where some of the column names are not names of variables, but values of a variable.

table4a
#> # A tibble: 3 x 3
#>   country     `1999` `2000`
#> * <chr>        <int>  <int>
#> 1 Afghanistan    745   2666
#> 2 Brazil       37737  80488
table4a %>%
  gather(`1999`, `2000`, key = "year", value = "cases")
#> # A tibble: 6 x 3
#>   country     year  cases
#>   <chr>       <chr> <int>
#> 1 Afghanistan 1999    745
#> 2 Brazil      1999  37737
91
Q

How can we combine Tibble?

A

left_join(tidy4a, tidy4b)

92
Q

How do you use the names() function in R?

A

The names() function gets or sets the names of an object.

names(x)

names(x) <- value

x: an R object
value: the names to be assigned to x, with the same length as x, or NULL
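
A quick base R walk-through of getting, setting, and removing names on a made-up vector:

```r
x <- c(1, 2, 3)
names(x)                      # no names yet
#> NULL

names(x) <- c("a", "b", "c")  # set names
x[["b"]]                      # names can then be used for subsetting
#> [1] 2

names(x) <- NULL              # assigning NULL removes the names
is.null(names(x))
#> [1] TRUE
```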

93
Q

How do you comment out multiple lines in R?

A

In RStudio, select the lines and press Command + Shift + C (Ctrl + Shift + C on Windows and Linux).

94
Q
A