LEC43: Intro to Genomics Flashcards

1
Q

what does every human cell contain?

A

complete genetic code

2
Q

when are chromosomes visualizable?

A

during mitosis, when chromosomes condense

3
Q

what was first genome screen technology?

A

karyotyping by G-banding

4
Q

what was reference human genome, its use?

A

traditional sequencing method: took DNA from individual, arranged into pieces of chromosomes, chopped up, individually sequenced, stitched back together, compared to reference human genome

= Sanger sequencing

costly and slow; no longer the standard method

5
Q

what did human genome project do?

A

sequence entirety of human genome

took ~13 yrs (1990-2003), ~$2.7 billion

info acquired allowed cataloging of complete set of human genes

6
Q

what did human genome proj find?

A

number of genes in the complete human set = similar to number observed in other model organisms like mouse, thale cress plant (Arabidopsis), roundworm

explanation: complexity in mammals due to alternative splicing, permitting increased number of potential proteins

7
Q

how much of our DNA is protein coding genes?

A

1.5%

8
Q

what is difference between number of genes, gene density, in humans vs plants?

A

gene number is similar

but genome size is much greater in humans than in model plants like Arabidopsis

thus humans’ avg gene density is much lower; only 1.5% of our DNA encodes proteins

human complexity despite a similar gene number is attributed to alternative splicing

9
Q

what is result of alternative splicing?

A

from 1 single gene, exons’ arrangement can be different, get different resulting proteins!

only 1.5% of our DNA is protein coding, though
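
A minimal sketch of the idea on this card, assuming a hypothetical 4-exon gene in which two "cassette" exons can each be included or skipped. Exon names are made up, and real splicing is regulated, so not every combination necessarily occurs in a cell.

```python
from itertools import combinations

def splice_isoforms(exons, cassette):
    """List the exon orders obtained when each cassette exon is kept or skipped."""
    optional = [e for e in exons if e in cassette]
    isoforms = []
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            # Exon order is preserved; only the skipped exons are left out.
            isoforms.append(tuple(e for e in exons if e not in skipped))
    return isoforms

exons = ["E1", "E2", "E3", "E4"]  # hypothetical gene
isoforms = splice_isoforms(exons, cassette={"E2", "E3"})
for iso in isoforms:
    print("-".join(iso))
```

Two optional exons already give four possible transcripts from one gene, which is the sense in which splicing multiplies the number of potential proteins.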

10
Q

how much of non-gene DNA is conserved?

A

2-5% of non-gene DNA is conserved through evolution

11
Q

if a piece of DNA is conserved, what does that suggest?

A

that it’s important

basis for idea that there’s functionality among non-gene portions of our DNA that’ve been conserved through ages/across animals

suggests these regions have important regulatory role in genome function

12
Q

HOXD gene cluster function?

A

basic body patterning control

example of conserved region of essential proteins that regulate genome function

13
Q

how much of our genome is repeat elements?

what are they relics of?

A

50%

relics of retroviruses (HIV is a present-day example of a retrovirus) and ‘genomic parasites’ that invaded our DNA over evolutionary history; often called ‘junk DNA’

14
Q

what causes finger webbing?

A

mutation in HOXD gene cluster, as HOXD genes control basic body patterning

15
Q

segmental duplications?

A

blocks of DNA 1-500 kb in length that occur at multiple sites in the genome, share a high level of sequence identity

~5% of our DNA

can be intrachromosomal (same chromosome) duplications or interchromosomal (between chromosomes)

16
Q

what role do segmental duplications play in genetic disease?

A

these large, highly identical repeats often flank certain regions of the genome that are thus prone to misalignment during meiosis, leading to improper recombination

if any genes in the flanked region are dosage sensitive, the resulting genomic deletions and/or duplications are associated w/ a particular genetic disease

17
Q

examples of recurrent genomic disorders?

A

velocardiofacial syndrome

Angelman/Prader-Willi syndrome

Charcot-Marie-Tooth disease

X-linked hemophilia

all caused by mechanism of recombination between large high-identity repeats

18
Q

recurrent deletion on chromosome 15 causes what/example of what?

A

causes intellectual disability, dysmorphisms, epilepsy

deletion = most common known genetic cause of epilepsy, present in ~1% of epilepsy patients

example of recurrent genomic disorder caused by aberrant recombination between large high-identity repeats

19
Q

how many bases of difference exist between 2 individuals?

A

avg ~6 million bases, ~0.1% of genome

means 99.9% shared DNA among humans
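
The arithmetic behind these figures, as a quick check. The ~6 billion bp diploid genome size is an assumption not stated on the card (about 3 billion bp per copy, two copies per person).

```python
DIPLOID_GENOME_BP = 6_000_000_000  # assumption: ~3 billion bp x 2 copies
DIFF_FRACTION = 0.001              # ~0.1%, from the card

differing_bases = DIPLOID_GENOME_BP * DIFF_FRACTION
shared_percent = (1 - DIFF_FRACTION) * 100

print(f"{differing_bases:,.0f} differing bases, {shared_percent:.1f}% shared")
```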

20
Q

what are the types of variation in the human genome? from smallest to largest

A

1) single base-pair changes - point mutations/SNVs/SNPs
2) small insertions/deletions & microsatellites
3) mobile elements - retroelement insertions (300 bp - 10 kb in size)
4) large-scale genomic variation (>1 kb) - deletions, duplications
5) chromosomal variation - translocations, inversions, trisomy

21
Q

most common type of genetic variant?

A

SNVs = single nucleotide variants (also called single nucleotide polymorphisms, SNPs, or point mutations)

occurs 1x every 1,000 bp = 3-5 million SNVs in individual genome

22
Q

where do SNVs usually occur?

A

most in non-coding regions - may have regulatory effects, but not well understood

however, 10,000 per genome (0.3%) are in coding regions, & cause changes in protein sequence
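
A quick check of where the 0.3% comes from, using the 1-per-1,000-bp rate from the previous card. The ~3 billion bp genome size is an assumption not stated on the cards.

```python
HAPLOID_GENOME_BP = 3_000_000_000  # assumption: ~3 billion bp

snvs_per_genome = HAPLOID_GENOME_BP // 1_000  # 1 SNV every 1,000 bp
coding_snvs = 10_000                          # from the card
coding_percent = coding_snvs / snvs_per_genome * 100

print(f"{snvs_per_genome:,} SNVs, {coding_percent:.1f}% in coding regions")
```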

23
Q

what do SNPs within coding regions cause?

A

sometimes, no change, since the genetic code is redundant (several codons encode the same amino acid)

sometimes, changes amino acid, different protein results
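
A small sketch of both outcomes, using only four entries of the standard codon table. The function name `classify_snp` is made up for illustration, and stop codons and the other 60 codons are deliberately left out.

```python
# Tiny excerpt of the standard genetic code (DNA codons).
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu",  # two codons, same amino acid
    "GTA": "Val", "GTG": "Val",
}

def classify_snp(codon, position, new_base):
    """Apply a single-base change to a codon and report its effect."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "missense"
    return mutated, before, after, kind

print(classify_snp("GAA", 2, "G"))  # third-base change, Glu stays Glu
print(classify_snp("GAG", 1, "T"))  # second-base change, Glu becomes Val
```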

24
Q

what do SNPs outside of coding regions cause?

how much of SNPs are outside of genes?

A

can influence disease by altering gene regulation

e.g. if a ntd changes within a txn factor binding site, the txn factor may not recognize the site, may not bind to DNA, no activation occurs, gene may be OFF when it should be ON

99.7% of SNPs are outside of genes
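
A toy illustration of the binding-site example. The motif, the sequences, and the exact-match rule are all assumptions for illustration; real transcription factor binding is graded rather than all-or-nothing.

```python
def count_mismatches(site, motif):
    """Number of positions where the site differs from the motif."""
    return sum(1 for a, b in zip(site, motif) if a != b)

def factor_binds(site, motif, max_mismatches=0):
    """Toy rule: the factor binds only if the site matches the motif closely enough."""
    return len(site) == len(motif) and count_mismatches(site, motif) <= max_mismatches

MOTIF = "TATAAA"      # hypothetical recognition sequence
reference = "TATAAA"  # reference allele
variant = "TAGAAA"    # single-base change at the third position

for name, site in [("reference", reference), ("variant", variant)]:
    state = "ON" if factor_binds(site, MOTIF) else "OFF"
    print(name, site, "gene", state)
```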

25
Q

what does microarray on SNP chips show?

A

can genotype millions of SNPs in a single experiment

can find identity of a base pair at an SNP

fluorescently labeled DNA is hybridized to an array of probes immobilized on a glass slide that bind either to normal or variant DNA

26
Q

how does array CGH work?

A

label control DNA and patient DNA w/ different fluorescent dyes

cohybridize them together onto a slide that contains DNA corresponding to different parts of the genome

fluorescently labeled DNA hybridizes to the slide

scan it, get image

YELLOW (equal mix of the 2 dyes) indicates no gain or loss at that position on the array

however, if the patient sample’s color dominates, know there is a duplication in the patient’s DNA at that position; if the control’s color dominates, a deletion
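
The color readout is commonly summarized as a log2 ratio of patient signal to control signal at each probe. A minimal sketch, where the 0.3 threshold and the intensity values are made-up assumptions rather than values from the lecture:

```python
import math

def call_copy_number(patient_signal, control_signal, threshold=0.3):
    """Classify one array position from the two dye intensities."""
    log_ratio = math.log2(patient_signal / control_signal)
    if log_ratio > threshold:
        return "gain"       # patient dye dominates
    if log_ratio < -threshold:
        return "loss"       # control dye dominates
    return "no change"      # balanced signal, seen as yellow

# Hypothetical intensities for three probes (patient, control).
probes = [(1000, 1000), (1500, 1000), (500, 1000)]
calls = [call_copy_number(p, c) for p, c in probes]
print(calls)
```

With two normal copies, one extra copy gives a ratio of 3/2 and one lost copy gives 1/2, which is why the calls are made on the log scale.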

27
Q

what does array CGH enable?

A

detection of copy number changes that’re too small to be seen by karyotyping

28
Q

what do different chip/microarray tests give info about?

A

statistical testing for association between diseases of interest and SNPs at specific chromosomal locations

DNA copy number across genome

detection of sub-microscopic gains or losses of material for rare conditions and common conditions alike

duplications of genomic regions that’re associated w/ protection from disease

29
Q

how can array-based technology be used to inform diagnosis and treatment of cancer?

A

take cancerous tumor DNA and match to control DNA from same person’s blood or non-tumorous site

compare the 2 to see what happened in tumor

see where chromosomes have extra copies, see deletion of known tumor suppressor

e.g. see massive amplification of EGFR gene region (a growth-promoting gene) alongside deletion of tumor suppressor genes; amplification is clearly a key event in tumorigenesis, so can develop drugs to inhibit the amplified gene

30
Q

what are tandem repeats?

A

serial repetition of a short motif, e.g. 2 bases (acacacacac…)

inherently unstable highly repetitive sequences

are rich source of variation in genome b/c polymerase working on DNA at repeat site will add or delete copies of repeats

highly variable regions btwn individuals
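
A short sketch of how such a repeat can be located and its copy number counted. The function name, the toy sequence, and the 4-copy minimum are assumptions for illustration.

```python
import re

def find_tandem_repeats(seq, unit_len=2, min_copies=4):
    """Return (start, unit, copies) for each run of a repeated unit."""
    # A captured unit followed by at least (min_copies - 1) more copies of itself.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // unit_len)
        for m in pattern.finditer(seq)
    ]

seq = "GGTACACACACACTTG"  # toy sequence with an AC repeat
repeats = find_tandem_repeats(seq)
print(repeats)
```

Two individuals can carry the same repeat at the same position with different copy counts, which is what makes these sites useful markers of variation.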

31
Q

what are triplet repeat expansions associated w?

A

neurological diseases

32
Q

what is cause of fragile x?

A

CGG motif repeat has 5-50 copies in healthy individuals; premutation carriers have ~50-200 copies; patients w/ fragile X have >200 copies (hundreds/thousands of repeats)

this switches off nearby gene (FMR1), causes disease

the expanded repeat is hard for DNA polymerase to replicate, producing an apparent break in the chromosome

causes intellectual disability w/ distinct dysmorphic features, accompanied by a ‘fragile site’ on X chromosome (= origin of name)
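
A toy sketch of counting CGG copies and sorting the count into the ranges quoted on this card. The cutoffs of 50 and 200 are the card's round numbers, and the function names and labels are made up for illustration.

```python
import re

def longest_cgg_run(seq):
    """Copy number of the longest uninterrupted CGG run in a sequence."""
    runs = re.findall(r"(?:CGG)+", seq)
    return max((len(run) // 3 for run in runs), default=0)

def classify_repeat(copies):
    """Sort a copy number into the card's three ranges."""
    if copies > 200:
        return "over 200 copies (fragile X range)"
    if copies > 50:
        return "50-200 copies"
    return "up to 50 copies (typical range)"

seq = "AT" + "CGG" * 30 + "TA"  # hypothetical allele with 30 copies
copies = longest_cgg_run(seq)
print(copies, classify_repeat(copies))
```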

33
Q

what can a large tandem repeat contain?

A

entire genes

may be true for genes present in multiple copies, e.g. salivary amylase

34
Q

are genomic duplication regions ever protective from disease?

A

yes

e.g. chemokine CCL3L1, inflammatory signaling molecule

it’s a binding partner of CCR5, major receptor molecule for HIV cell entry

copy number of CCL3L1 gene is inversely correlated w/ susceptibility to HIV infection

35
Q

is complete personal genome sequencing expensive?

A

no! quick and cheap now

36
Q

what is focus of next generation sequencing?

A

whereas old Sanger sequencing focused on 1 gene at a time,

next gen sequencing is massively parallel: many fragments are sequenced simultaneously, giving far more data

37
Q

describe process of next generation genome sequencing

A

1) extract genomic DNA
2) shear DNA into small 200-500 ntd pieces
3) ligate adaptors to ends of fragments
4) enrich and amplify library by PCR
5) sequence on a microscopic scale on the sequencing platform, starting from the adaptor

wash through w/ bases that fluoresce differently; each cluster of DNA will fluoresce

measure that fluorescence or electrochemical energy, determine which base was added during each step of DNA synthesis rxn

38
Q

describe whole genome shotgun sequencing

A

can stitch back together fragments of DNA by mapping onto reference human genome

due to random nature of sequences, depth of coverage at any 1 place in genome is variable

reads also contain errors (1%)

therefore need high redundancy to generate high-quality gap-free sequence (~20x coverage)
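
A rough sketch of why redundancy matters, assuming reads land uniformly at random so that depth at any one position follows a Poisson distribution. That model, the read count, and the read length are assumptions not stated on the card.

```python
import math

def mean_coverage(n_reads, read_len, genome_len):
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_len / genome_len

def prob_depth_below(k, coverage):
    """Poisson probability that a given position is covered by fewer than k reads."""
    return sum(
        math.exp(-coverage) * coverage**i / math.factorial(i) for i in range(k)
    )

# Hypothetical run: 600 million reads of 100 bases on a ~3 billion bp genome.
cov = mean_coverage(600_000_000, 100, 3_000_000_000)
print(f"mean coverage: {cov:.0f}x")
print(f"P(position uncovered) at {cov:.0f}x: {prob_depth_below(1, cov):.2e}")
print(f"P(position uncovered) at 2x: {prob_depth_below(1, 2.0):.2f}")
```

At low average depth a sizeable share of positions gets no reads at all, while at high depth gaps become rare and each base is seen enough times to outvote individual read errors.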

39
Q

what is imperfect about whole genome shotgun sequencing?

A

random errors in sequencing occur

thus, when a base is mismatched, cannot know for certain whether it is a heterozygous SNV or a sequencing error

so SNV calling in genome sequencing is a probabilistic exercise
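
One simple way to make this concrete is to compare how likely the observed reads are under each possible genotype. The sketch below assumes a binomial model, a 1% per-read error rate, and equal prior weight on each genotype. It is an illustration, not the method of any particular variant caller.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def genotype_likelihoods(alt_reads, depth, error_rate=0.01):
    """Likelihood of seeing alt_reads variant reads out of depth, per genotype."""
    return {
        "hom_ref": binom_pmf(alt_reads, depth, error_rate),      # variant reads are errors
        "het": binom_pmf(alt_reads, depth, 0.5),                 # half the reads carry it
        "hom_alt": binom_pmf(alt_reads, depth, 1 - error_rate),  # nearly all reads carry it
    }

def most_likely_genotype(alt_reads, depth):
    likelihoods = genotype_likelihoods(alt_reads, depth)
    return max(likelihoods, key=likelihoods.get)

for alt, depth in [(1, 20), (9, 20), (20, 20), (1, 8)]:
    print(alt, depth, most_likely_genotype(alt, depth))
```

A single mismatched read at 20x is best explained as an error, while at 8x the error and heterozygous explanations are much closer, which is one reason calls at low depth are less reliable.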

40
Q

what are barriers to personalized genomics being the be-all/end-all of medical treatment today?

A

cost is falling rapidly ($1500-2k now)

BUT knowledge of how to interpret consequences of majority of genetic variation is limited

geneticists only know phenotypes caused by mutations in ~3200 of ~25,000 human genes (13%)

each human has ~3 million SNVs, 1200 CNVs - what are effects of these on individual disease risk?

even for the ~10,000 that change a.a. sequence of proteins, currently we can interpret effects of a minority of these, + these are 0.3% of each person’s variation