Module 1 Flashcards

1
Q

When to use R

A

When the raw data is complex.
R and RStudio are designed to handle large data sets and make it easy to reproduce work on different data sets.
Flexible data visualisation - for example, showing differences across cities effectively using plotting features like facets.
Automatically create summary statistics, or visualised plots, for each group.

1
Q

Fundamentals of programming in RStudio

A

Coding in RStudio
Syntax for performing calculations
Pipes
R packages

2
Q

Basic concepts of R

A

Functions
Comments
Variables
Data Types
Vectors
Pipes

3
Q

Functions (R)

A

A body of reusable code used to perform specific tasks in R

4
Q

Argument (R)

A

Information that a function in R needs in order to run

5
Q

Variable (R)

A

A representation of a value in R that can be stored for use later during the programming

6
Q

Vector (R)

A

A group of data elements of the same type stored in a sequence in R, e.g. c(x, y, z)

7
Q

Pipe (R)

A

A tool in R for expressing a sequence of multiple operations, represented with “%>%”

8
Q

Data Structure

A

A format for organising and storing data (vectors, data frames, matrices, arrays)

9
Q

2 different types of vectors in R

A

Atomic vectors and lists

10
Q

6 different types of atomic vectors

A

logical, integer, double, character (which contains strings), complex, and raw.

11
Q

How can you determine the properties of vectors

A

typeof() returns the type of a vector; length() returns its length
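A minimal sketch of checking vector properties with typeof() and length(). The example values and the names `examples`, `types`, and `sizes` are illustrative assumptions, not part of the original card.

```r
# One example vector per atomic type (values are illustrative only)
examples <- list(
  logical   = c(TRUE, FALSE, NA),
  integer   = c(1L, 2L, 3L),
  double    = c(1.5, 2, 3.25),
  character = c("a", "b"),
  complex   = c(1+2i, 3-1i),
  raw       = as.raw(c(1, 255))
)

# typeof() reports the storage type, length() the number of elements
types <- vapply(examples, typeof, character(1))
sizes <- vapply(examples, length, integer(1))

print(types)
print(sizes)
```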

12
Q

naming vectors

A

The names() function can be used to name the elements of a vector

x <- c("a", "b", "c")
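A short sketch of how names() might be used, assuming an illustrative numeric vector (the values 1, 3, 5 are not from the original card):

```r
x <- c(1, 3, 5)                # illustrative values
names(x) <- c("a", "b", "c")   # attach a name to each element

names(x)   # "a" "b" "c"
x[["b"]]   # look an element up by its name: 3
```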

13
Q

atomic vectors

A

Created with c(x, y, z).
Can only contain elements of the same type.
Homogeneous.

14
Q

how do you determine the structure of lists

A

str(), e.g. str(list(x, y, z))
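A minimal sketch of inspecting a list with str(). The list `info` and its contents are illustrative assumptions.

```r
# A list can mix types; each element here has a different one
info <- list("a", 1L, 1.5, TRUE)

# str() prints a compact summary: the number of elements
# and the type and value of each one
str(info)
```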

15
Q

What type of vectors are numeric

A

Integer and double vectors

16
Q

Lists

A

heterogeneous

17
Q

Logical Vectors

A

The simplest type of atomic vector; can only contain three values: TRUE, FALSE, or NA

18
Q

Three likely ways to create date-time formats

A

From a string
From an individual date
From an existing date/time object

19
Q

Data Frame

A

A collection of columns containing data, similar to a spreadsheet or SQL table.

Data frames summarise data and organise it into a format that is easy to read and use.

Created with data.frame()

20
Q

Characteristics of Data Frames

A

Can include a variety of data types
Only one element in each cell
Columns should be named
Each column consists of elements of the same data type

21
Q

Extract Operator Data Frames

A

Takes two arguments:
1. the row(s)
2. the column(s) you would like to extract

e.g. x[2, 1]

extracts the value in the second row and first column of the data frame assigned to x
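A small sketch of the extract operator on a data frame. The data frame contents (city names and temperatures) are made up for illustration.

```r
# Illustrative data frame; stringsAsFactors = FALSE keeps city as character
# on older R versions too
x <- data.frame(
  city = c("Lagos", "Lima", "Oslo"),
  temp = c(31, 19, 4),
  stringsAsFactors = FALSE
)

x[2, 1]       # second row, first column: "Lima"
x[2, ]        # leave the column blank to get the whole second row
x[, "temp"]   # leave the row blank to get a whole column: 31 19 4
```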

22
Q

Create a file

A

file.create() - if the console returns TRUE, the file has been created. If it returns FALSE, the file has not been created

23
Q

Copy a file

A

file.copy()

24
Delete a file
unlink("some.file.csv")
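A minimal sketch of creating, copying, and deleting a file with base R's file.create(), file.copy(), and unlink(). The file names are hypothetical, and the files live in a temporary directory so nothing outside the session is touched.

```r
# Hypothetical file names inside the session's temporary directory
original <- file.path(tempdir(), "example.csv")
backup   <- file.path(tempdir(), "example_copy.csv")

created <- file.create(original)                           # TRUE if the file was created
copied  <- file.copy(original, backup, overwrite = TRUE)   # TRUE if the copy succeeded

unlink(c(original, backup))                                # delete both files
still_there <- file.exists(c(original, backup))            # FALSE FALSE after deletion
```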
25
26
Matrix
A two-dimensional collection of data elements. Contains both rows and columns, and can only contain a single data type
27
difference between matrices and vectors
matrices are two-dimensional and vectors are one-dimensional
28
create a matrix
matrix()
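A short sketch of matrix(), assuming illustrative values 3 to 8. By default R fills the matrix column by column.

```r
# Six values arranged in 2 rows, so R works out that there are 3 columns
m <- matrix(3:8, nrow = 2)

dim(m)    # 2 3
m[1, 2]   # first row, second column: 5
m[2, 3]   # second row, third column: 8
```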
29
operator
a symbol that names the type of operation or calculation to be performed in a formula
30
assignment operators
used to assign values to variables and vectors
31
arithmetic operators
used to complete math calculations: + (addition), - (subtraction), * (multiplication), / (division)
32
Logical operators and the three primary types of logical operators
Return TRUE or FALSE. The three primary types are AND (& or && in R), OR (| or || in R), and NOT (! in R)
33
AND operator "&"
34
Conditional statements
A declaration that if a certain condition holds, then a certain event must take place. In R: if, else, and else if
35
if statement
Sets a condition, and if the condition is TRUE, the R code associated with the statement is executed. if (condition) { expr }
36
else statement
Used in combination with an if statement. This is how the code is structured in R: if (condition) { expr1 } else { expr2 }
37
else if statement
if (condition1) { expr1 } else if (condition2) { expr2 } else { expr3 }. If the first condition is met, the first expression is executed. If it is FALSE, the second condition is checked and, if met, the second expression is executed. If that is also FALSE, the third expression is executed.
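The if / else if / else structure can be sketched as follows. The helper `describe_sign()` is a hypothetical function written for this example, not something from the course material.

```r
# Hypothetical helper: classifies a single number by its sign
describe_sign <- function(x) {
  if (x > 0) {
    "positive"      # runs when the first condition is TRUE
  } else if (x == 0) {
    "zero"          # checked only when the first condition is FALSE
  } else {
    "negative"      # runs when both conditions are FALSE
  }
}

describe_sign(4)    # "positive"
describe_sign(0)    # "zero"
describe_sign(-1)   # "negative"
```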
38
Packages (R)
Units of reproducible R code
39
CRAN
Comprehensive R Archive Network: an online archive with R packages, source code, manuals, and documentation
40
Tidyverse (R)
A system of packages in R with a common design philosophy for data manipulation, exploration, and visualisation
41
Conflicts
happen when packages have functions with the same names as other functions
42
Tidyverse
collection of packages in R with a common design philosophy. The tidyverse packages are especially useful for data manipulation, exploration, and visualisation
43
Vignette
Documentation that acts as a guide to an R package. A vignette shares details about the problem that the package is designed to solve and how the included functions can help you solve it.
44
browseVignettes
read through vignettes of a loaded package.
45
8 core packages of the tidyverse
ggplot2, tidyr, readr, dplyr, tibble, purrr, stringr, forcats
46
4 packages that are essential for the workflow of data analysts
ggplot2, dplyr, tidyr, readr
47
ggplot2
Create a variety of data visualisations by applying different visual properties to the data variables in R
48
tidyr
A package used for data cleaning to make tidy data. It works with long and wide data to make sure the right data types are used consistently within data frames and across data tables
49
readr
used for importing data e.g. read_csv()
50
dplyr
offers a consistent set of functions that help you complete some common data manipulation tasks
51
Factors
store categorical data in R where the data values are limited and usually based on a finite group like country or year
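A minimal sketch of storing categorical data as a factor. The country values are illustrative assumptions.

```r
# Illustrative categorical data: a finite set of countries, with repeats
countries <- factor(c("Kenya", "Peru", "Kenya", "Japan"))

levels(countries)       # the distinct categories: "Japan" "Kenya" "Peru"
nlevels(countries)      # 3
as.integer(countries)   # each value stored as a level index: 2 3 2 1
```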
52
Pipe
A tool in R for expressing a sequence of multiple operations, represented with "%>%" and read as "and then"
53
Nested
In programming, describes code that performs a particular function and is contained within code that performs a broader function
54
Nested function
A function that is completely contained within another function
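A sketch comparing nested function calls with a pipe. It assumes the dplyr package is installed and uses R's built-in mtcars data set; the choice of filter and summary is illustrative only.

```r
library(dplyr)  # provides %>%, filter(), and summarise()

# Nested version: filter() sits inside summarise(), so it is read inside out
nested <- summarise(filter(mtcars, cyl == 4), mean_mpg = mean(mpg))

# Piped version: read top to bottom as
# "take mtcars, and then filter, and then summarise"
piped <- mtcars %>%
  filter(cyl == 4) %>%
  summarise(mean_mpg = mean(mpg))
```

Both versions should produce the same one-row result; the pipe only changes how the steps are written.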
55
Data frame
A collection of columns
56
Tibbles
Never change the data types of the inputs; never change the names of your variables; never create row names
57
Tidy data
A way of standardising the organisation of data within R. Variables are organised into columns; observations are organised into rows; each value must have its own cell.
58
%%
Modulus (returns the remainder after division) e.g. x <- 2, y <- 5, y %% x = 1
59
%/%
Integer division (returns an integer value after division) y %/% x = 2
60
^
Exponent
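A quick sketch of the modulus, integer division, and exponent operators, assuming x is 2 and y is 5:

```r
x <- 2
y <- 5

remainder <- y %% x    # 1  (5 divided by 2 leaves a remainder of 1)
quotient  <- y %/% x   # 2  (2 goes into 5 two whole times)
power     <- y ^ x     # 25 (5 squared)
```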
61
Relational operators
Also known as comparators; allow you to compare values. Identify how one R object relates to another, like whether an object is less than, equal to, or greater than another object
62
Logical operators
allow you to combine logical values. Return a logical data type or boolean (TRUE or FALSE). & && | || !
63
& and &&
&: element-wise logical AND; &&: logical AND on single values
64
| and ||
|: element-wise logical OR; ||: logical OR on single values; !: logical NOT
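A minimal sketch of the element-wise and single-value logical operators. The vectors `a` and `b` and the value of `x` are illustrative assumptions.

```r
a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

# Element-wise operators compare the vectors position by position
and_result <- a & b   # TRUE FALSE FALSE
or_result  <- a | b   # TRUE TRUE FALSE
not_result <- !a      # FALSE FALSE TRUE

# && and || work on single values, which suits conditions in if statements
x <- 7
in_range <- x > 5 && x < 10   # TRUE
outside  <- x < 5 || x > 10   # FALSE
```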
65
wide data
has observations across several columns. each column contains data from a different condition of the variable
66
long data
has all the observations in a single column, and the variable conditions are placed into separate rows.
67
pivot_longer()
Lengthen the data in a data frame by increasing the number of rows and decreasing the number of columns
68
pivot_wider()
Convert your data to have more columns and fewer rows.
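A sketch of moving between wide and long data, assuming the tidyr package is installed. The data frame `wide` and its city and temperature values are made up for illustration.

```r
library(tidyr)  # provides pivot_longer() and pivot_wider()

# Wide data: one column per year (illustrative values)
wide <- data.frame(
  city      = c("Lima", "Oslo"),
  temp_2020 = c(19, 4),
  temp_2021 = c(20, 5)
)

# Longer: the two year columns become rows (2 rows x 3 cols -> 4 rows x 3 cols)
long <- pivot_longer(
  wide,
  cols         = c(temp_2020, temp_2021),
  names_to     = "year",
  names_prefix = "temp_",   # strip the prefix so year holds "2020" / "2021"
  values_to    = "temp"
)

# Wider: spread the years back out into one column each
back <- pivot_wider(
  long,
  names_from   = year,
  values_from  = temp,
  names_prefix = "temp_"
)
```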
69
Anscombe's quartet
Four datasets that have nearly identical summary statistics but look very different when plotted
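A minimal sketch that checks this claim using the anscombe data set that ships with base R (columns x1 to x4 and y1 to y4). The helper names `x_cols`, `y_cols`, `means_x`, `means_y`, and `cors` are illustrative.

```r
x_cols <- paste0("x", 1:4)
y_cols <- paste0("y", 1:4)

# Mean of each x and each y column
means_x <- sapply(anscombe[x_cols], mean)
means_y <- sapply(anscombe[y_cols], mean)

# Correlation between each x/y pair (x1 with y1, x2 with y2, ...)
cors <- mapply(function(x, y) cor(anscombe[[x]], anscombe[[y]]), x_cols, y_cols)

round(means_x, 2)
round(means_y, 2)
round(cors, 2)
```

The rounded values agree across all four pairs, which is why plotting the data is needed to see how different the datasets are.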
70
Benefits of ggplot2
Create different types of plots; customise the look and feel of plots; create high-quality visuals; combine data manipulation and visualisation
71
Aesthetic
a visual property of an object in your plot
72
Geom
The geometric object used to represent your data
73
Facets
Let you display smaller groups, or subsets, of your data
74
Labels and annotation
Let you customise your plot
75
The basics of ggplot2
A dataset; a set of geoms (e.g. points to create a scatterplot, bars to create a bar chart); and a set of aesthetic attributes. An aesthetic is a visual property of an object in your plot; think of it as a connection, or mapping, between a visual feature in your plot and a variable in your data
76
Mapping
Matching up a specific variable in your dataset with a specific aesthetic
77
Colour
Allows you to change the colour of all the points on your plot, or to colour each data group
78
Size
Allows you to change the size of the points on your plot by data group
79
Shape
allows you to change the shape of the points on your plot by data group
80
Smoothing
Enables the detection of a data trend even when you can't easily notice a trend from the plotted data points. Adds a smoothing line as another layer to a plot: the smoothing line helps the data to make sense to a casual observer
81
Annotate
To add notes to a document or diagram to explain or comment upon it
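The ggplot2 concepts in the cards above (dataset, aesthetic mapping, geom, smoothing, facets, labels, annotation) can be sketched in one plot. This assumes the ggplot2 package is installed and uses its bundled mpg data set; the variable choices, title, and annotation text are illustrative only.

```r
library(ggplot2)

p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +   # dataset + mapping
  geom_point(mapping = aes(colour = class)) +                  # geom, coloured by group
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) + # smoothing line layer
  facet_wrap(~drv) +                                           # one panel per drive type
  labs(
    title = "Engine size vs highway mileage",
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon"
  ) +
  annotate("text", x = 6, y = 40, label = "Illustrative note") # note drawn in each panel

# print(p) draws the plot in an interactive session
```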