1. Introduction to Molecular Biology Flashcards by Sophie Wilkinson

Nucleus Definition

-an organelle that contains the genetic material of the cell

Chromosome Definition

-organised structure of DNA in the cell nucleus

DNA Definition

-deoxyribonucleic acid
-a nucleic acid containing the genetic instructions used in the development and functioning of all known living organisms

Gene Definition

-a DNA segment that carries genetic information
-a molecular unit of heredity in a living organism
-a region of DNA that codes for mRNA

Transcription Definition

-the process of creating a complementary RNA copy of a sequence of DNA

Translation Definition

-a process where messenger RNA (mRNA) produced by transcription is decoded by the ribosome to produce a specific amino acid chain, or polypeptide, that will later fold into an active protein

What is RNA abundance used to measure?

-RNA abundance is a measurable indicator of gene expression
-measured using a microarray
-15-30 thousand genes are measured simultaneously
-the number of samples ranges from 4 to several hundred

Gene Expression and Microarray Data

-comparison between two groups, e.g. Control/Treatment or Normal/Disease
-we want to identify differentially expressed genes between the two groups

Microarray Data and Linear Regression

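-a single linear regression with every gene as an explanatory variable cannot be fitted by least squares, since there are far more variables (genes) than observations
-with more columns than rows in X, the matrix XtX is singular and cannot be inverted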
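
The singularity problem can be seen on a tiny example. The sketch below uses made-up integer values (2 samples, an intercept plus 2 hypothetical genes) and only the standard library; it checks that the determinant of XtX is zero, so the inverse needed for least squares does not exist.

```python
# Hypothetical design matrix: n = 2 samples, p = 3 columns
# (intercept, gene A, gene B). Values are invented for illustration.
X = [
    [1, 2, 5],
    [1, 3, 7],
]
n, p = len(X), len(X[0])

# Build XtX (a p x p matrix) entry by entry.
XtX = [
    [sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
    for i in range(p)
]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


det_XtX = det3(XtX)
print("XtX =", XtX)
print("det(XtX) =", det_XtX)  # zero because rank(XtX) <= n < p
```

With real microarray data the same thing happens on a much larger scale: thousands of gene columns but at most a few hundred rows.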

Microarray Definition

-a technology where DNA sequences (from 1000s of genes) are pre-printed onto the array surface as probes
-isolate human mRNA from samples in the lab
-label the mRNA with coloured dye
-mix with the microarray; hybridisation occurs as the RNA binds to complementary probes
-quantification: scan the intensity of the pixels on the array

What are the two types of microarray?

-single colour (Affymetrix arrays)
-two-colour (cDNA arrays)

Two Colour Microarray

-the microarray is hybridised with cDNA from two different samples, each labelled in a different colour, usually red and green
-the two samples are mixed and hybridised to a single array
-the relative intensities of each colour indicate the relative expression of a particular gene in each sample
-generally only able to measure relative expression, not absolute expression
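
Relative expression on a two-colour array is commonly summarised as a log2 ratio of the two channel intensities. The sketch below assumes that convention; the gene names and intensities are invented for illustration.

```python
import math

# Hypothetical red and green channel intensities for three spots.
red = {"geneA": 1200.0, "geneB": 300.0, "geneC": 800.0}
green = {"geneA": 300.0, "geneB": 300.0, "geneC": 1600.0}

# log2(red / green): positive means higher in the red-labelled sample,
# negative means higher in the green-labelled sample, 0 means equal.
log_ratio = {gene: math.log2(red[gene] / green[gene]) for gene in red}

for gene, m in log_ratio.items():
    print(gene, round(m, 3))
```

A ratio only says how the two samples compare on that spot, which is why this design gives relative rather than absolute expression.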

One Colour Microarray

-e.g. Affymetrix gene chip
-RNA from a single sample is hybridised, so each array provides intensity data for each gene from one sample
-batch effects have to be accounted for when comparing different arrays

How is microarray data plotted?

-as a histogram
-intensity on a log scale on the x axis
-frequency on the y axis
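
A minimal text-only sketch of such a histogram, assuming log base 2 and bins one log2 unit wide (both are choices made here, not stated on the card). The intensities are invented.

```python
import math
from collections import Counter

# Hypothetical raw intensities from one array.
intensities = [50, 120, 130, 500, 510, 520, 2000, 16000]

# Bin each value by the integer part of its log2 intensity,
# so bin b covers raw intensities in [2**b, 2**(b + 1)).
counts = Counter(math.floor(math.log2(v)) for v in intensities)

# x axis: log2 bin, y axis (bar length): frequency.
for b in range(min(counts), max(counts) + 1):
    print(f"log2 in [{b}, {b + 1}): " + "#" * counts[b])
```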

The Simple Linear Regression Model

yi = β0 + β1xi + εi, i=1,…,n
-yi is the response or dependent variable
-β0 and β1 are regression parameters
-xi is the independent or explanatory variable, measured without error
-the εi are iid N(0,σ²)

Using Linear Regression for Microarray Data

-possible if we use y as the outcome (e.g. diseased or healthy)
-and use the expression of each gene as an x variable
-then βj represents the relationship between the expression of gene j and the condition

Linear Regression Estimation, β^

-derived by minimising the sum of squared residuals S, which gives: β^ = (Xt X)^(-1) Xt y
-X is the matrix whose ij element is the ith observation of the jth independent variable (the genes)
-Xt indicates the transpose of X
-y is a vector whose ith element is the ith observation of the dependent variable (condition)
-β^ is a vector of estimators for the β parameters, β0^, β1^
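
As a worked sketch of the formula for the simple model, the code below builds XtX and Xt y for X = [1, x] and inverts the 2x2 matrix by hand. The five data points are invented toy values, and the variable names are chosen here for illustration.

```python
# Toy data (hypothetical values).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

# Entries of XtX and Xt y when X has a column of 1s and a column of x.
sum_x = sum(x)
sum_xx = sum(v * v for v in x)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Explicit inverse of the 2x2 matrix XtX = [[n, sum_x], [sum_x, sum_xx]].
det = n * sum_xx - sum_x * sum_x
xtx_inv = [
    [sum_xx / det, -sum_x / det],
    [-sum_x / det, n / det],
]

# beta_hat = (XtX)^(-1) Xt y
beta0_hat = xtx_inv[0][0] * sum_y + xtx_inv[0][1] * sum_xy
beta1_hat = xtx_inv[1][0] * sum_y + xtx_inv[1][1] * sum_xy

print("beta0_hat ~", round(beta0_hat, 4))
print("beta1_hat ~", round(beta1_hat, 4))
```

For these values the estimates come out at roughly 2.2 for the intercept and 0.6 for the slope.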

Linear Regression Estimation, σ²^

σ²^ = [et e]/[n-p]
-et indicates the transpose of e
-p is the number of parameters, 2 for the simple model (β0, β1)
-e is the vector of residuals: e = y - Xβ^
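
A short sketch of this estimate on invented toy data. For brevity it obtains β^ from the closed-form simple-regression expressions (slope = Sxy/Sxx), which give the same values as the matrix formula for this model.

```python
# Toy data (hypothetical values).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n, p = len(x), 2  # p = 2 parameters: intercept and slope

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates for the simple model.
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

# Residual vector e = y - X beta_hat, then sigma2_hat = et e / (n - p).
e = [b - (beta0_hat + beta1_hat * a) for a, b in zip(x, y)]
sigma2_hat = sum(r * r for r in e) / (n - p)

print("residuals:", [round(r, 3) for r in e])
print("sigma2_hat ~", round(sigma2_hat, 4))
```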

Hypothesis Testing for Microarray Data Test Statistic

T = βj^ / se(βj^) ~ t(n-L-1)
-se means the standard error: se(βj^) = sqrt(var(βj^))
-L is the number of covariates, so n-L-1 equals n-p when the model has p = L+1 parameters including the intercept

Simple Linear Regression Model in Matrix Form

yi = β0 + β1xi + εi, i=1,…,n
-matrix form: y = Xβ + ε
-y is a column vector with entries y1, y2, …, yn
-X is an n×2 matrix with all 1s in the first column and x1, x2, …, xn in the second column
-β is a column vector with entries β0, β1
-ε is a column vector with entries ε1, ε2, …, εn

Linear Regression Variance-Covariance Matrix

Σ = cov(β^) = E [(β^-β) (β^-β)t] = σ² (XtX)^(-1)
-the diagonal entries are the variances of each β^
-the other entries are the corresponding covariances

Linear Regression Standard Errors for β^

-the standard errors of β0^ and β1^ are given by the square roots of the corresponding diagonal entries of Σ
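
A sketch of the estimated variance-covariance matrix and the standard errors for the simple model, with σ² replaced by its estimate σ²^ = et e / (n-p). The data are invented toy values and the names are illustrative.

```python
import math

# Toy data (hypothetical values).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n, p = len(x), 2

# Fit the simple model and estimate sigma^2 from the residuals.
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((a - x_bar) ** 2 for a in x)
beta1_hat = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
sigma2_hat = sum(
    (b - beta0_hat - beta1_hat * a) ** 2 for a, b in zip(x, y)
) / (n - p)

# (XtX)^(-1) for X = [1, x], written out for the 2x2 case.
sum_x, sum_xx = sum(x), sum(v * v for v in x)
det = n * sum_xx - sum_x ** 2
xtx_inv = [[sum_xx / det, -sum_x / det], [-sum_x / det, n / det]]

# Estimated covariance matrix: sigma2_hat * (XtX)^(-1).
cov_beta = [[sigma2_hat * v for v in row] for row in xtx_inv]

# Standard errors are the square roots of the diagonal entries.
se_beta0 = math.sqrt(cov_beta[0][0])
se_beta1 = math.sqrt(cov_beta[1][1])

print("cov(beta_hat) ~", [[round(v, 4) for v in row] for row in cov_beta])
print("se(beta0_hat) ~", round(se_beta0, 4))
print("se(beta1_hat) ~", round(se_beta1, 4))
```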

Linear Regression Hypothesis Testing Overview

-our main interest is to test whether any of (or any function of) the individual regression parameters is significantly different from zero
-the hypotheses involved are: H0 : βj=0 against H1 : βj≠0, for any j

Linear Regression Hypothesis Testing Test Statistic

tj = βj^ / SE(βj^)
-under H0: βj=0, tj follows a t-distribution with n-p degrees of freedom
-at a significance level α, in the two-sided case, the decision is to reject H0 if |tj| > Tdf(α/2), the upper α/2 critical value of the t-distribution with df = n-p
-or equivalently, if the p-value satisfies P_H0(|T| > |tj|) < α
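
A sketch of the two-sided test for the slope on invented toy data. The standard library has no t-distribution, so the critical value 3.182 (upper 2.5% point of t with 3 degrees of freedom, for α = 0.05) is taken from a standard t table and hard-coded; that lookup is an assumption of this example, not part of the card.

```python
import math

# Toy data (hypothetical values).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n, p = len(x), 2
df = n - p  # degrees of freedom, here 3

# Fit the simple model.
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((a - x_bar) ** 2 for a in x)
beta1_hat = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / s_xx
beta0_hat = y_bar - beta1_hat * x_bar

# Residual variance and standard error of the slope.
# For the simple model the slope entry of (XtX)^(-1) equals 1 / s_xx.
sigma2_hat = sum(
    (b - beta0_hat - beta1_hat * a) ** 2 for a, b in zip(x, y)
) / df
se_beta1 = math.sqrt(sigma2_hat / s_xx)

# Test statistic for H0: beta1 = 0.
t1 = beta1_hat / se_beta1

# Critical value from a t table for df = 3, two-sided alpha = 0.05 (assumed).
T_CRIT_DF3_ALPHA05 = 3.182
reject_h0 = abs(t1) > T_CRIT_DF3_ALPHA05

print("t1 ~", round(t1, 4))
print("reject H0 at alpha = 0.05:", reject_h0)
```

Here |t1| is about 2.12, below the critical value, so H0 is not rejected for this toy data set.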

Linear Regression Matrix Form for Microarray Data Analysis

-in microarray data analysis, y will be the vector of gene expression on the log scale
-X will be the design matrix; in an analysis X may need to be constructed beforehand (it is not given)
-β would correspond to (possibly) many factors, including differential expression
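
A minimal sketch of constructing such a design matrix for one gene in a two-group comparison. It assumes a 0/1 indicator coding for the group, under which β1 is the difference in mean log expression between treatment and control. Sample labels and log2 values are invented.

```python
# Hypothetical sample labels and log2 expression values for one gene.
groups = ["control", "control", "control",
          "treatment", "treatment", "treatment"]
y = [7.0, 7.5, 8.0, 9.0, 9.5, 10.0]
n = len(y)

# Design matrix: intercept column plus a 0/1 treatment indicator.
X = [[1, 1 if g == "treatment" else 0] for g in groups]

# Least-squares fit of y = beta0 + beta1 * indicator + error,
# using the closed-form simple-regression expressions.
ind = [row[1] for row in X]
ind_bar, y_bar = sum(ind) / n, sum(y) / n
s_xy = sum((a - ind_bar) * (b - y_bar) for a, b in zip(ind, y))
s_xx = sum((a - ind_bar) ** 2 for a in ind)
beta1_hat = s_xy / s_xx                 # treatment minus control mean
beta0_hat = y_bar - beta1_hat * ind_bar  # control mean

print("X =", X)
print("beta0_hat (control mean) ~", round(beta0_hat, 4))
print("beta1_hat (log2 difference) ~", round(beta1_hat, 4))
```

With more factors (batch, sex, time point) the same idea extends by adding further columns to X, one per extra parameter in β.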