QBIO2001 Flashcards

Question

Why are sample sizes constrained?

Answer 1

• Sample size constraints: o Requests must go through ethics committees o Funding

Answer 2

• Graphing data is extremely useful in informing the researcher of the shape and distribution of their data

Answer 3

• A major assumption of most tests is that values are normally distributed

Answer 4

The pointiness of the distribution

Answer 5

• Independent (between subjects) tests: | o The assumption that distributions within treatment groups are normally distributed

Answer 6

o The assumption that distributions of differences between treatment groups are normally distributed

Answer 7

o shapiro.test
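A minimal sketch of how this normality check might be run in R. The built-in mtcars data is only an illustrative stand-in for real measurements, and the 0.05 cut-off is the usual convention rather than something fixed by the test.

```r
# Shapiro-Wilk test of normality on an example vector (illustrative data only)
res <- shapiro.test(mtcars$mpg)
print(res)

# Conventional reading: a p-value below 0.05 suggests the values
# deviate from a normal distribution (the threshold is an assumption)
looks_normal <- res$p.value >= 0.05
print(looks_normal)
```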

Answer 8

o Every value in the comparison must be treated equally and transformed o log(): brings data closer to the mean of values; used for high values o sqrt(): if some values are negative, a constant might have to be added first so that all values become positive o Reciprocal (1/x): very large values become very small

Answer 9

Any value more than 1.5x the interquartile range below the lower quartile or above the upper quartile
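The 1.5x IQR rule can be sketched in R as follows. The data vector is made up for illustration, and the fences are measured from the lower and upper quartiles, which is the convention boxplots use.

```r
# Made-up measurements; 40 looks suspiciously large
x <- c(10, 11, 11, 12, 12, 13, 40)

# Lower and upper quartiles and the interquartile range
q <- unname(quantile(x, c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Fences sit 1.5 x IQR beyond each quartile
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

# Values outside the fences are flagged as outliers
outliers <- x[x < lower | x > upper]
print(outliers)
```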

Answer 10

▪ Remove outliers • mvoutlier ▪ Transform data

Answer 11

Difference between means of treatment groups is zero o Where H0 is the null hypothesis, which assumes that the difference between the observed value (data) and expected value (EV) is due to chance alone. p=p0 for example p=0.5

Answer 12

• Alternative hypothesis: the difference between means of treatment groups is either not zero (two-tailed), or less/greater than zero (one-tailed). Can be given directionality o Where H1 is the alternative hypothesis, which assumes that the difference between the observed value (data) and expected value (EV) is NOT due to chance alone. p < p0 (lower-sided test), p > p0 (upper-sided test) or p ≠ p0 (two-sided)

Answer 13

All comparison groups have the same variance

Answer 14

Levene’s test (leveneTest); p<0.05 indicates that variance differs between treatment groups

Answer 15

• Welch’s t-test (t.test(paired=FALSE)) does not assume homogeneity of variance o Need data to be normally distributed

Answer 16

o Wilcox’s robust tests, including yuen: the trimmed mean is calculated after 20% of scores have been removed from each extreme of the distribution ▪ Can specify the level of trimming: • Makes the distribution closer to normal • If you trim too much, the data lose their effect and a lot more experimental units are needed • The data should be transformed first

Answer 17

Within subjects

Answer 18

Easier to see data trend

Answer 19

Welch two-sample t-test | R command: t.test(x, y, mu = 0, var.equal = FALSE)
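As a rough illustration of the command on this card, the sketch below runs a Welch test on two small made-up groups. The vectors x and y are hypothetical values, not course data.

```r
# Two independent groups with made-up values
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4)
y <- c(6.2, 6.8, 7.1, 5.9, 7.4, 6.6)

# var.equal = FALSE requests the Welch version,
# which does not assume equal variances
res <- t.test(x, y, mu = 0, var.equal = FALSE)

print(res$method)    # name of the test that was run
print(res$p.value)   # p-value for H0: difference in means is 0
```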

Answer 20

Transformations or non-parametric tests | Mann-Whitney-Wilcoxon test: wilcox.test

Answer 21

Paired t-test: sometimes it is desirable to analyse dependent data. We often design an experiment to take advantage of this dependency in order to control variation between experimental groups

Answer 22

o Analysis of variance (ANOVA) is used to compare several means between subjects

Answer 23

o One-way independent ANOVA: aov(outcome ~ predictor), then summary() to see the output
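A short sketch of the aov() then summary() pattern, using the built-in mtcars data as a stand-in. Treating cylinder count as the grouping factor is an illustrative choice, not part of the card.

```r
# One-way independent ANOVA: does mean mpg differ between cylinder groups?
# factor() makes cyl a grouping variable rather than a number
fit <- aov(mpg ~ factor(cyl), data = mtcars)

# summary() shows the ANOVA table (Df, Sum Sq, Mean Sq, F value, p-value)
tab <- summary(fit)
print(tab)
```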

Answer 24

o Both homogeneity of variance and normality are assumed

Answer 25

the variance within one particular sample

Answer 26

• Standard error: the standard deviation of the sample mean across many samples
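In practice the standard error of the mean is usually estimated from a single sample as the standard deviation divided by the square root of the sample size. A minimal sketch with made-up values:

```r
# Made-up sample
x <- c(4.1, 5.0, 4.7, 5.3, 4.9)

# Standard deviation: spread within this one sample
s <- sd(x)

# Standard error of the mean: estimated spread of the sample mean
se <- s / sqrt(length(x))
print(c(sd = s, se = se))
```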

Answer 27

compare abundance across multiple samples (typically expressed as a ratio) without determining number of molecules (e.g. copies, pmol/L, etc) o Most commonly used because it’s easier

Answer 28

Determine the number of molecules in each sample (e.g. copies, pmol/L, etc) o Gives more information o Don’t need to have multiple samples -> don’t need to compare between multiple samples (but generally need to compare against multiple samples) o Can use a known concentration of one thing, then use the abundance of that known thing to calculate the exact number of molecules of an unknown thing

Answer 29

- Needs to be in a vacuum
1. You place the substance you want to study in a vacuum chamber inside the machine (into an inlet)
2. The substance is bombarded with a beam of electrons so the atoms or molecules it contains are turned into ions. This process is called ionization and produces molecular ions. - Ion source
3. The ions shoot out from the vacuum chamber into a powerful electric field (the region that develops between two metal plates charged to high voltages), which makes them accelerate. Ions of different atoms have different amounts of electric charge, and the more highly charged ones are accelerated most, so the ions separate out according to the amount of charge they have. (This stage is a bit like the way electrons are accelerated inside an old-style, cathode-ray television.) - Mass Analysers
4. The ion beam shoots into a magnetic field (the invisible, magnetically active region between the poles of a magnet). When moving particles with an electric charge enter a magnetic field, they bend into an arc, with lighter particles (and more positively charged ones) bending more than heavier ones (and more negatively charged ones). The ions split into a spectrum, with each different type of ion bent a different amount according to its mass and its electrical charge. - Mass Analysers
5. A computerized, electrical detector records a spectrum pattern showing how many ions arrive for each mass/charge.
6. Computer - Instrument control and data acquisition

Answer 30

• Can determine the structure and quantity of molecules by measuring their mass-to-charge ratio and comparing it to that of elements to see which elements are in the molecule.

Answer 31

o Use liquid chromatography to decrease complexity of sample before it’s fed into the mass spectrometer

Answer 32

o Separates out proteins/peptides in time by introducing them more slowly to the instrument --> spreads out the separation of analytes in time so the mass spectrometer has more time to quantify and identify the analytes in the sample ▪ Feed to the mass spectrometer more slowly ▪ Instrument has a finite speed at which it can operate

Answer 33

In the inlet of the mass spectrometer

Answer 34

o In a 2 hour acquisition (normal acquisition time), would have about 40,000 peaks in the separation- 40,000 analytes separated

Answer 35

Electrospray ionization (ESI) is a technique used in mass spectrometry to produce ions using an electrospray in which a high voltage is applied to a liquid to create an aerosol. It is especially useful in producing ions from macromolecules because it overcomes the propensity of these molecules to fragment when ionized.

Answer 36

▪ Applies very high voltage to liquid coming out of the liquid chromatograph ▪ Generates a spray of droplets, which leads to generation of gas phase ions in front of the source region of the mass spectrometer -> ions get attracted and introduced into the machine itself.

Answer 37

o Technique to break down selected ions (precursor ions) into fragments (product ions) o Fragments reveal aspects of the chemical structure of the precursor ions

Answer 38

Gaussian (normally distributed)

Answer 39

• One large peak contains many smaller peaks because all molecules analysed are composed of different elements, each of which has several isotopes o About 1% of all carbon is C13, which makes it a common contaminant o Very important for biological molecules (such as peptides and proteins) o The rate of contamination by these isotopes is quite high • Peaks are mixtures of different isotopes of the elements in the molecule being analysed o Contamination becomes bigger and bigger as peaks decrease to the right of the monoisotopic peak • These distributions are referred to as the isotopic envelope, and they represent the inclusion of isotopes within each particular analyte
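The shape of an isotopic envelope can be roughly sketched with a binomial model. This is a simplification that considers carbon only, uses the 1% C13 figure from the card, and assumes a hypothetical peptide with 50 carbon atoms.

```r
p13 <- 0.01       # fraction of carbon that is C13 (figure from the card)
n_carbon <- 50    # hypothetical number of carbon atoms in the peptide

# Probability that the molecule carries 0, 1, 2 or 3 C13 atoms
envelope <- dbinom(0:3, size = n_carbon, prob = p13)
names(envelope) <- c("M", "M+1", "M+2", "M+3")

# M is the monoisotopic peak; peaks to its right get progressively smaller
print(round(envelope, 3))
```

Under these assumptions only about 60% of the molecules are purely monoisotopic, which is why the extra peaks are so prominent for peptides and proteins.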

Answer 40

A pure peak

Answer 41

• Quantitative mass spectrometry typically utilizes proteins labelled with heavy stable isotopes • Labelled (heavy) peptides maintain the same characteristics as unlabelled or ‘light’ peptides and co-elute into the mass spectrometer from liquid chromatography columns • In the mass spectrometer they are easily distinguished by their mass • Algorithms are then used to extract the light and heavy peptide ion chromatograms, which represent the peptide’s abundance • The light/heavy ratios are used to infer relative abundance • By mixing the same labelled protein standard with different unlabelled protein samples, changes in relative abundance can be determined between biological conditions • For example, if you apply a stimulus to the light medium and keep the heavy medium as the control, you can identify which source each peptide came from and hence the ratio difference between the two conditions.

Answer 42

- Get a sample - Homogenisation - Protein extraction - Protein quantitation - Protein clean-up and digestion - Peptide clean-up - LC-MS/MS analysis - Data processing (identify and quantify) - Data analysis and visualisation

Answer 43

a. Simplify analytical technique b. Digest into smaller components c. Use proteases -> cut proteins up into smaller pieces in a very consistent way d. Very sequence-specific cutting

Answer 44

a. LC: liquid chromatography b. Tandem mass spectrometry -> look at what’s coming off the chromatograph, choose to isolate individual analytes/peptides and determine the masses of their fragments -> work out the structure of each analyte c. Retention time: time of LC separation; want different analytes to come off at different times d. Map -> relative abundance indicated by colour i. Sometimes peaks are not Gaussian, which causes problems e. Can link identification data (masses of fragments) to the abundance of the intact peptide

Answer 45

• TIC- map of precursor ions coming off at any one time | o Can’t use it for quantitative analysis

Answer 46

map of precursor ions coming off at any one time o Can extract a single peak from data if we ask software to look specifically at abundance of one analyte o Integrated area under the peak is proportional to the abundance of that analyte ▪ One way of extracting a piece of quantitative data ▪ But still don’t know what that peak is o Can isolate a peak, fragment it and see a new spectrum (mass spectrum) ▪ Precursor peptide mass ▪ Shows masses of fragments ▪ From gaps between fragment peaks, can determine sequence of peptide

Answer 47

- Abundance of intact precursor ion - Abundance of peptide fragment ion - Number of MS/MS per precursor - Counting number of times you trigger a fragmentation spectrum on that particular analyte, because triggering is abundance-biased - Extracting a three dimensional volume for each peak

Answer 48

Label free> chemical labelling > metabolic labelling

Answer 49

Feed isotopically enriched amino acids to cells | o Enables us to introduce a light population of cells and a heavy population of cells

Answer 50

o Abundance difference between two peaks can be used for relative quantitation

Answer 51

o Any bias in the workflow is applied to both heavy and light samples

Answer 52

• Insulin binds to insulin receptor • Triggers phosphorylation cascades o Phosphorylation of IRS1 o Recruitment of two subunits of PI3Kinase ▪ p85 and p110 o Produces a phospholipid called PIP3 o Recruits PDK1 and PKB/AKT o AKT kinase acts on downstream substrates o Ultimately culminates in movement of the vesicles • Movement of glucose transporter from cytoplasm to plasma membrane • Allows cell to take up glucose, especially in muscle cells

Answer 53

``` • Sample prep o Extremely important • Understand the technique • Understand what makes a good image • Reproducible unbiased analysis ```

Answer 54

``` • A visual representation of something • Digital • A 2d rectilinear array of pixels o An xy table of values • A way of capturing, storing and displaying data • Transform an image using maths to extract data • Made of pixels o Pixel -> individual unit of an image ```

Answer 55

The basis of digital info (0 or 1)

Answer 56

The number of recorded bits per pixel

Answer 57

o Increased bit depth= increased information at the cost of file size o Bit depth captures a lot more of the image
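The trade-off between bit depth and file size can be made concrete with a little arithmetic. The sketch below assumes a hypothetical 1024 x 1024 single-channel image stored without compression.

```r
bit_depth <- c(8, 12, 16)

# Number of distinct intensity values each pixel can record
n_levels <- 2^bit_depth

# Hypothetical image dimensions (assumption for illustration)
width <- 1024
height <- 1024

# Uncompressed size in megabytes: pixels x bits per pixel / 8 bits per byte
size_mb <- width * height * bit_depth / 8 / 1024^2

print(data.frame(bit_depth, n_levels, size_mb))
```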

Answer 58

o Information capture per unit area vs file size

Answer 59

the number of pixels (wxh)

Answer 60

- A subject - Energy source that will interact with the subject - A way to control and focus the energy - A detector

Answer 61

A decrease in energy

Answer 62

o Imaging with UV causes damage to tissue and scatters more easily (doesn’t penetrate very far) o Want to move into the infrared spectrum because of deeper penetration and less damage to tissues ▪ Whilst there are benefits to going to IR, there is a loss of, or limit to, the resolution that can be achieved

Answer 63

Wavelength/2
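As a quick worked example of this rule of thumb, the sketch below applies it to two illustrative visible wavelengths. The specific wavelengths are chosen for illustration only.

```r
# Illustrative wavelengths in nanometres
wavelength_nm <- c(blue = 488, green = 509)

# Rule of thumb from the card: resolution limit is about half the wavelength
resolution_limit_nm <- wavelength_nm / 2
print(resolution_limit_nm)
```

Both values land close to the roughly 240 nm resolution limit quoted for widefield, confocal and TIRF imaging.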

Answer 64

``` o The refractive index changes as light moves through the sample --> dramatically changes the info you can get from the image o Results include ▪ Brightfield ▪ Phase contrast ▪ DIC ▪ Dark field ```

Answer 65

- Transmitted light path microscope | - Reflected light path microscope

Answer 66

For samples such as metals or extremely thick organisms that remain opaque after being ground, which the transmitted light path microscope can't see through; also mostly used for fluorescence

Answer 67

* High magnification, high resolution, large working distance * Typically used for observing cells on coverslips or surfaces close to coverslips submerged in liquid An inverted microscope is a microscope with its light source and condenser on top, above the stage pointing down, while the objectives and turrets are below the stage pointing up.

Answer 68

Color-stained, high contrast sample

Answer 69

Fine structure, tiny sample

Answer 70

Low contrast, transparent sample

Answer 71

Low contrast sample, for surface structure observation

Answer 72

o Illuminates everything in the sample -> excites everything o Illuminates the entire cell and captures info from the entire cell -> image quality is degraded because out-of-focus light is captured o Has low resolution o All molecules out of the plane of focus are excited, which makes the image noisy o Epifluorescence: profound change in size but can’t see much change in intensity -> good for area o Has a 240 nm limit to resolution

Answer 73

o Uses a pinhole near the detector to cut out all out-of-focus light o Get less resolution because 3 times resolution of light o Point scanning o Spinning disk ▪ Allows you to look further into the cell o Has a 240 nm limit to resolution

Answer 74

• Total Internal Reflection Fluorescence (TIRF) o Light at the critical angle gets completely reflected o Generates a weak evanescent wave that passes through the interface and decays exponentially o TIRF is good for things that are close to the surface of the cell, such as the plasma membrane o Only allows imaging of the membrane of the cell o High-resolution image at the basal membrane o Has a 240 nm resolution limit

Answer 75

o Gives better resolution o Stochastic methods- PALM, STORM, GSD ▪ Statistics based o Deconvolution methods- Zeiss Airyscan

Answer 76

``` • Autofluorescence (label free) o NAD and FAD- Redox o Multiphoton harmonics ▪ 2nd harmonics- collagen ▪ 3rd harmonics- RI mismatch; haemoglobin • Antibody labelling • Genetic encoding o Fluorescent proteins ▪ Genes of interest ▪ Biosensors o Dyes ▪ Markers ▪ Sensors ```

Answer 77

- Widefield - Confocal - Total Internal Reflection Fluorescence - Multiphoton - Fluorescence lifetime imaging - Super resolution

Answer 78

``` • Insulin responsive glucose transporter • Necessary for insulin stimulated uptake of glucose • Localization o Perinuclear (around nucleus of a cell) o Peripheral puncta ▪ Partial overlap with endosomal markers ▪ Specialized storage vesicles (GSVs) o Translocate to the plasma membrane upon insulin stimulation • Label with antibody ```

Answer 79

o Noise removal e.g. smoothing ▪ Noise can come from the camera or imaging techniques ▪ Image histogram -> counts number of pixels at different intensities -> shows background noise that needs to be removed o Background subtraction o Segmentation ▪ Thresholding • Simple: converts grayscale images into binary images • Works well on clean data • Horrible for noisy data o Make measurements o Post analysis
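Simple thresholding can be sketched in a few lines of R by treating an image as a matrix of pixel intensities. The tiny 3 x 4 image and the cut-off of 100 are made-up values for illustration.

```r
# Tiny made-up grayscale image: dim background with a bright object
img <- matrix(c(12,  15, 200, 14,
                10, 220, 210, 13,
                11,  16, 205, 12),
              nrow = 3, byrow = TRUE)

threshold <- 100   # hypothetical cut-off between background and object

# Thresholding: every pixel becomes TRUE (object) or FALSE (background)
binary <- img > threshold
print(binary)

# A simple measurement on the segmented image: object area in pixels
object_pixels <- sum(binary)
print(object_pixels)
```

With noisy data a single global cut-off like this misclassifies many pixels, which is why noise removal and background subtraction come first in the workflow.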

Answer 80

``` o Pixel level o Generates a set of descriptors o Uses inferred background levels of noise to train the software o Powerful approach o Harder to implement o Computationally expensive o Also has downstream applications ```

Answer 81

• Beta barrel • 11 beta strands o Fluorophore protected from outside environment by barrel o If exposed to outside loses fluorescence ▪ Made of Ser65-Tyr66-Gly67 ▪ Degrades into HBI • Naturally occurring fluorescent protein from the jellyfish Aequorea victoria

Answer 82

- Monomericity - Brightness - Spectral characteristics - Sensitivity to pH - Temperature sensitivity - Lifetime - Photobleaching - Folding time - Redox sensitivity - Electrostatic potential

Answer 83

o The more light you have to put in, the more toxicity you get and the quicker you kill your cells

Answer 84

Excitation wavelength-488 nm peak which is blue | Emission wavelength- 509 nm peak which is green

Answer 85

Brightness will be affected by temperature

Answer 86

o Probability distribution of how long after excitation a photon is released

Answer 87

o Important from a live cell imaging perspective o Excitation light forces the FP into a dark state (dead state) where it can no longer be excited and no longer releases energy o Not good if you want to image for a long period of time

Answer 88

o Some will fold in minutes, others will take days o Red FP go through phase where they’re green first o Good as timers o Time how a protein/ its abundance changes through time

Answer 89

``` • 2016 • Small ultra-red fluorescent protein o smURFP • Developed into a new range of distinct fluorophores • Require Biliverdin as a cofactor • Ideal for in vivo imaging ```

Answer 90

o Cytoplasmic eGFP o Intracellular trafficking o Exocytosis ▪ Vesicle comes up to membrane and undergoes fusion

Answer 91

o A class of pH sensitive fluorescent proteins ▪ Targeted mutation of eGFP ▪ Theoretical pKa=7.11 o ~10 fold improvement in exocytotic signal o Inside of vesicle is acidic compared to cytosol and outside of cell o Not detected inside the vesicle, but detected as soon as the vesicle fuses and its contents are exposed to the outside of the cell (hence detects when a vesicle undergoes fusion)

Answer 92

o Insulin Responsive Amino Peptidase o Surrogate marker for GLUT4 o Traffics with GLUT4 ▪ ~85% colocalization o Lumenal pHluorin tag o Insulin stimulates the fusion of IRAP-pH containing vesicles (GSVs) o Insulin stimulates a transient burst of GSV fusion events

Answer 93

o Lumenal pHluorin (1st exofacial loop) ▪ pH sensitivity ▪ Marker of fusion ▪ Exposed to external environment or inside the vesicle o Cytoplasmic TdTomato ▪ pKa 4.8 ▪ To see trafficking on the cytoplasmic side of the protein o Can see both trafficking and fusion events o rGLUTpHluor behaves like GLUT4

Answer 94

* Every cell appears to respond differently- heterogeneity * Important component of biology * Seen in single cells and complex organisms * Seen as an outcome of randomness but results can be replicated, so this is not a sufficient explanation * Heterogeneity is present with many doses of insulin * Akt is driving the heterogeneity in GLUT4

Answer 95

* Fluorescent protein at N terminus | * eGFP Akt2 is the classic probe for this

Answer 96

• Green (eGFP) and red (TagRFP-T) Akt behave differently o This is because of their net charges: ▪ eGFP has a net charge of -5 ▪ TagRFP-T has a net charge of 0 o Due to its negative charge, eGFP can’t go to the membrane properly, so it is less efficient than TagRFP-T at showing what’s going on there • Electrostatic potential can alter fusion protein behaviour

Answer 97

• Read files such as data.txt using read.delim, scan or read.table

Answer 98

▪ Numeric ▪ Character ▪ Logical

Answer 99

• A function returns a value resulting from programming statements. A function may or may not have input arguments.

Answer 100

o Mymatrix

Answer 101

▪ Mymatrix[1,2] | • Want the number in the first row, second column
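A minimal sketch of this indexing, using a small made-up matrix. The contents of Mymatrix here are hypothetical; note that matrix() fills column by column by default.

```r
# Hypothetical 2 x 3 matrix, filled column by column with 1..6
Mymatrix <- matrix(1:6, nrow = 2, ncol = 3)
print(Mymatrix)

# Index as [row, column]: first row, second column
value <- Mymatrix[1, 2]
print(value)
```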

Answer 102

o The set of data (numeric or otherwise) corresponding to the entire collection of units about which information is sought

Answer 103

A subset of the population data that are actually collected in the course of a study

Answer 104

o In most studies, it is difficult to obtain information about the whole population which is why we rely on samples to make estimates and inferences related to the whole population

Answer 105

Population -> Probability -> Sample -> Inference

Answer 106

Parameter: number that describes a population; denoted by Greek letters; an unknown fixed number | Statistic: number that describes a sample; denoted by Roman letters; a variable whose value varies from sample to sample

Answer 107

The sum of all the observations divided by the number of observations

Answer 108

The median of a set of data is a value x such that at least one half of the observations are less than or equal to x and at least one half of the observations are greater than or equal to x o Sample median: the ((n+1)/2)th largest observation if n is odd o The average of the (n/2)th and (n/2 + 1)th largest observations if n is even • The median is not influenced as much as the mean by outliers because it is robust

Answer 109

the most frequently occurring value among all observations in a sample o If no entry is repeated, the data set has no mode o If two entries occur with the same greatest frequency, each entry is a mode

Answer 110

* For symmetric data, the mean is usually less variable from sample to sample than the median * For skewed data, the median is a better measure of location

Answer 111

▪ Standard deviation, MAD (median absolute deviation), IQR

Answer 112

• Range of a list is the largest value minus the smallest value o Misleading because it is solely influenced by two most extreme values

Answer 113

the difference between the individual sample points and the average

Answer 114

MAD= median(| Xi- median(X)|)
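The formula can be checked by hand in R. One point worth noting: R's built-in mad() multiplies by a scaling constant of 1.4826 by default, so constant = 1 is needed to match the raw formula on this card. The data are made up.

```r
# Made-up data with one extreme value
x <- c(1, 2, 3, 4, 100)

# MAD exactly as written on the card
mad_manual <- median(abs(x - median(x)))

# Built-in version; constant = 1 switches off the default 1.4826 scaling
mad_builtin <- mad(x, constant = 1)

print(c(manual = mad_manual, builtin = mad_builtin))

# For comparison, the standard deviation is dragged up by the outlier
print(sd(x))
```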

Answer 115

o Find the mean of the data o Make a list of deviations from the mean (value-mean) o Calculate the average of the squares of deviations (var)
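The steps above can be sketched in R with made-up data. Note that R's var() divides the sum of squared deviations by n - 1 rather than n, so it differs slightly from a plain average of the squares.

```r
# Made-up data
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Step 1: mean of the data
m <- mean(x)

# Step 2: deviations from the mean
deviations <- x - m

# Step 3: average of the squared deviations (divides by n)
var_population <- mean(deviations^2)

# Sample variance as computed by var() (divides by n - 1)
var_sample <- sum(deviations^2) / (length(x) - 1)

print(c(population = var_population, sample = var_sample, builtin = var(x)))

# The standard deviation is the square root of the variance
print(sqrt(var_sample))
```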

Answer 116

``` • The three quartiles, Q1, Q2 and Q3, approximately divide an ordered data set into four equal parts o The interquartile range is defined as the upper quartile (75th percentile) minus the lower quartile (25th percentile). Contains middle 50% of data o IQR is robust o quantile(x) ```

Answer 117

boxplot(mpg~cyl,data=mtcars, main="Car Mileage Data", xlab="Number of Cylinders", ylab="Miles Per Gallon")

Answer 118

hist(mtcars$mpg, breaks=12, col="red")

Answer 119

Tableau is a data visualisation tool that allows visualisation across data tables

Answer 120

• Identifier column- primary key column that allows linkage of one table to another o Name of column doesn’t have to be identical but the values do o But make sure you’re not cutting a lot of data when you’re putting the tables together

Answer 121

links columns from 2 or more tables in a relational database. Linked via a primary key

Answer 122

a minimal set of columns that uniquely specify a row in a table

Answer 123

▪ Inner: any entry found in only one table is discarded ▪ Left: keep all entries from the left side of the join, linked to matches in the right ▪ Right: keep all entries from the right side of the join, linked to matches in the left ▪ Full outer: keep all entries from both sides
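The four join types can be illustrated with R's merge() as a stand-in for the join dialog in a visualisation tool. The two small tables and their column names are made up, with id acting as the key.

```r
# Two made-up tables sharing a key column "id"
a <- data.frame(id = c(1, 2, 3), weight = c(5.1, 6.2, 4.8))
b <- data.frame(id = c(2, 3, 4), colour = c("red", "blue", "green"))

inner <- merge(a, b, by = "id")                 # ids found in both tables
left  <- merge(a, b, by = "id", all.x = TRUE)   # every row of a, NA where b has no match
right <- merge(a, b, by = "id", all.y = TRUE)   # every row of b, NA where a has no match
full  <- merge(a, b, by = "id", all = TRUE)     # every row of both tables

# Row counts show how much data each join keeps
print(c(inner = nrow(inner), left = nrow(left),
        right = nrow(right), full = nrow(full)))
```

Comparing row counts like this is one way to check that a join is not silently cutting a lot of data.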

Answer 124

• Union: combines tables by stacking them o Requires the same number of columns and compatible data types o Use it for repetitive experiments

Answer 125

contain qualitative values. You can use dimensions to categorize, segment and reveal the details in your data

Answer 126

contain quantitative values that can be described numerically and then either displayed as they are, aggregated, or used for mathematical operations

Answer 127

summarise multiple data points (e.g. mean, median, sum…)

Answer 128

dimension 1 + "separator" + dimension 2

Answer 129

o IF CONTAINS(criteria, "criteria") = TRUE o THEN "criteria" o ELSE "Something else" o END

Answer 130

Decimal measurements

Answer 131

Follows a sequence

Answer 132

Naming conventions

Answer 133

• Sheets can be integrated in web pages • Tooltips: pop-up data • Can drag measures and dimensions into an already made plot and the plot will change • Discrete filters o Plot will update depending on which filters you tick • Continuous filters o Slide bar • Can make word clouds o Size of a word can be proportional to the number of times the word appears in the table

Small scale data (162 cards)