R for bio Flashcards

1
Q

What is the purpose of the Shiny package?

A

Shiny is an R package that enables the creation of interactive web applications directly from R.
- Typical uses: web applications for data visualization, analysis, and communication
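
As a minimal sketch of what such an app can look like (assuming the shiny package is installed; the input and output names `n` and `hist` are made up for illustration):

```r
library(shiny)

# UI: one slider input and one plot output
ui <- fluidPage(
  sliderInput("n", "Number of points", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

# Server: redraw the histogram whenever the slider value changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = "Random normal values")
  })
}

# Build the app object; assigning it does not launch anything
app <- shinyApp(ui, server)
# runApp(app)  # uncomment to open the app in a browser
```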

2
Q

Why should you augment the data?

A

It will increase:
- the information in the dataset (added columns such as fitted values and residuals)
- Robustness to variations
- feature learning

and reduce sensitivity

3
Q

When does it make sense to augment the data?

A

It makes sense to use the augment() function or similar approaches to augment the data when you want to enrich your dataset with additional information related to the fitted model.

4
Q

What does it mean to augment the data?

A

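It typically refers to a function or operation, such as augment() from the broom package, that adds new columns to the dataset containing per-observation information from the fitted model
- for example fitted values and residuals (term-level statistics such as p-values come from tidy() instead)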
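
A minimal sketch of augmenting a dataset with model information, assuming the broom package is installed and using the built-in `mtcars` data as a stand-in for real data:

```r
library(broom)

# Fit a simple linear model: fuel economy as a function of weight
fit <- lm(mpg ~ wt, data = mtcars)

# augment() returns the model data with extra per-observation columns,
# whose names start with a dot (e.g. .fitted and .resid)
augmented <- augment(fit)
names(augmented)
```

Each row of `augmented` is still one observation, so the new columns can be plotted or filtered like any other variable.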

5
Q

What are the 3 tidy data rules?

A
  • each variable must have its own column
  • each observation must have its own row
  • each value must have its own cell
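
To illustrate the rules, here is a small sketch that reshapes an untidy table into a tidy one. It assumes the tidyr package is installed; the gene names and values are made up.

```r
library(tidyr)

# Untidy: the variable "time" is hidden in the column names t0 and t1
wide <- data.frame(
  gene = c("geneA", "geneB"),
  t0 = c(1.2, 3.4),
  t1 = c(2.1, 3.9)
)

# Tidy: one column per variable (gene, time, expression),
# one row per observation (a gene measured at a time point)
long <- pivot_longer(
  wide,
  cols = c(t0, t1),
  names_to = "time",
  values_to = "expression"
)
long
```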
6
Q

What does it mean to wrangle the data?

A

Data wrangling is the process of converting raw data into a usable form; it is done before any data analysis.
- It ensures reliable and complete data

7
Q

Why should you wrangle your data?

A

Because it cleans the data, transforming and organizing it into a more suitable and structured format for data analysis.

8
Q

What is the general pipeline for bio data science?

A

import -> tidy -> (transform -> visualize -> model, repeated as a cycle) -> communicate
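
A rough sketch of how these steps can map onto code, assuming the dplyr and ggplot2 packages are installed. The built-in `iris` data stands in for an imported file, and the variable `sepal_ratio` and the output file name are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Import: iris stands in for a file read with e.g. readr::read_csv()
raw <- as_tibble(iris)

# Tidy/clean: drop incomplete rows (iris has none; shown for illustration)
clean <- raw %>% filter(!is.na(Sepal.Length), !is.na(Sepal.Width))

# Transform: derive a new variable
dat <- clean %>% mutate(sepal_ratio = Sepal.Length / Sepal.Width)

# Visualize: compare the derived variable across groups
p <- ggplot(dat, aes(x = Species, y = sepal_ratio)) + geom_boxplot()

# Model: check whether the ratio differs between species
fit <- lm(sepal_ratio ~ Species, data = dat)

# Communicate: save the figure for a report (hypothetical file name)
# ggsave("sepal_ratio.png", p)
```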

9
Q

Why is reproducibility of data analysis important?

A

It allows others to verify and validate the analysis, and makes the analysis reusable.
- reduces errors
- enhances adaptability and learning

10
Q

What are the components of reproducibility?

A
  • Raw data
  • Cleaning data
  • code and script
  • parameterization and tuning
  • explanation
  • documented workflow
  • results
  • accessibility and sharing
11
Q

Which data forms best fit a boxplot?

A
  • several numeric variables (more than 2)
  • several groups in the data
12
Q

Which data forms best fit a heatmap?

A

Unordered numeric variables

13
Q

Which data forms best fit a density plot?

A

Numeric variables, not ordered; works well with many data points

14
Q

Which data forms best fit a histogram?

A

Numeric variables, not ordered; works with few data points

15
Q

What is the purpose of a boxplot?

A

Gives a summary of one or several numeric variables. The line that divides the box into two parts represents the median of the data. Potential outliers are shown as individual points beyond the whiskers.
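
A small base R sketch of the numbers behind a boxplot. The vector `x` is made up, and `mtcars` is used only as a convenient built-in dataset with groups.

```r
# Made-up values; 40 lies far from the rest
x <- c(2, 3, 4, 5, 6, 7, 40)

# boxplot.stats() returns the values a boxplot is drawn from
stats <- boxplot.stats(x)
stats$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
stats$out     # values beyond the whiskers, drawn as individual points

# One box per group: fuel economy split by number of cylinders
boxplot(mpg ~ cyl, data = mtcars)
```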

16
Q

What is the purpose of a heatmap?

A

It is useful for displaying a general overview of numerical data, not for extracting specific data points.
- It can reveal potential correlations

17
Q

What is the purpose of a density plot?

A

Shows the distribution of a numeric variable. It takes only numeric variables as input and is very similar to a histogram. It can be used under exactly the same conditions.

18
Q

What is the purpose of a histogram?

A

Takes only a numeric variable as input. The variable is cut into several bins, and the number of observations in each bin is shown by the height of its bar. It is possible to represent the distribution of several variables on the same axis using this technique.
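
A short base R sketch comparing the two views of the same variable. The data are simulated, and the number of breaks is an arbitrary choice for illustration.

```r
set.seed(1)
x <- rnorm(200)   # simulated numeric variable

# Histogram: cut x into bins and count the observations per bin
h <- hist(x, breaks = 10, plot = FALSE)
h$breaks   # bin edges
h$counts   # observations per bin

# Density plot: a smoothed estimate of the same distribution
d <- density(x)

plot(h, main = "Histogram of x")
plot(d, main = "Density of x")
```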

19
Q

Which different types of augment() are there?

A
  • Linear model
  • Generalized linear model
  • Machine learning models
  • Complex models
  • Prediction task
  • Visualizations
20
Q

How would you unify model output in R?

A

Use the tidy() function from the broom package, which returns model output as a tibble in a consistent format
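
A minimal sketch, assuming the broom package is installed, showing that two different model types give output with the same column layout. The models on `mtcars` are chosen only for illustration.

```r
library(broom)

# Two different model types fitted to the same built-in data
fit_lm  <- lm(mpg ~ wt, data = mtcars)
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial)

# tidy() returns one row per model term, with the same columns for both
tidy_lm  <- tidy(fit_lm)
tidy_glm <- tidy(fit_glm)

names(tidy_lm)
names(tidy_glm)
```

Because the layout is shared, results from many models can be stacked into one table, for example with dplyr::bind_rows().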

21
Q

What does the summary() function from the base package do?

A

summary() is a generic function used to obtain and print a summary of the results of various model-fitting functions. It also works on other objects, such as vectors and data frames.
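
A brief base R sketch of summary() applied to a fitted model and to a plain numeric vector, using the built-in `mtcars` data for illustration:

```r
# Summary of a fitted model: coefficients, R-squared, and more
fit <- lm(mpg ~ wt, data = mtcars)
s <- summary(fit)
s$coefficients   # estimates, standard errors, t values, p-values
s$r.squared

# Summary of a numeric vector: minimum, quartiles, mean, maximum
summary(mtcars$mpg)
```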