Lecture 15 Flashcards

1
Q

genome

A

the complete set of genetic material present in a cell or organism

2
Q

genomics

A

the cloning and molecular characterization of entire genomes

3
Q

a haplotype

A

The specific set of SNPs and other genetic variants observed on a single chromosome or part of a chromosome

4
Q

linkage disequilibrium

A

The nonrandom association between genetic variants within a haplotype
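As a rough illustration of nonrandom association, the sketch below computes the disequilibrium coefficient D and r² for two biallelic sites from a list of haplotypes. The haplotype strings, the allele labels, and the function name `ld_stats` are invented for this example and are not from the lecture.

```python
def ld_stats(haplotypes):
    """Return (D, r2) for two biallelic sites.

    Each haplotype is a 2-character string: allele at site 1, then allele at site 2.
    'A' and 'B' are hypothetical labels for the alleles whose frequencies we track.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "A" for h in haplotypes) / n   # frequency of A at site 1
    p_b = sum(h[1] == "B" for h in haplotypes) / n   # frequency of B at site 2
    p_ab = sum(h == "AB" for h in haplotypes) / n    # frequency of the AB haplotype
    d = p_ab - p_a * p_b                             # 0 when the sites are independent
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Complete association: A always travels with B, and a with b.
linked = ["AB"] * 5 + ["ab"] * 5
# No association: all four haplotypes are equally common.
unlinked = ["AB", "Ab", "aB", "ab"] * 3

print(ld_stats(linked))
print(ld_stats(unlinked))
```

An r² near 1 means one site predicts the other almost perfectly, which is the situation in which a single tag-SNP can stand in for its neighbors.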

5
Q

tag-SNPs

A

The few SNPs used to identify a haplotype

6
Q

Genome-wide association studies use

A

numerous SNPs scattered across the genome to find genes of interest
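One simplified way to picture a genome-wide scan is to test each SNP for a difference in allele counts between cases and controls, then rank the SNPs by the test statistic. The sketch below uses a plain 2x2 Pearson chi-square; the SNP names and counts are hypothetical, and real studies add corrections (multiple testing, population structure) that are omitted here.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def rank_snps(allele_counts):
    """allele_counts maps SNP id -> (case_alt, case_ref, control_alt, control_ref)."""
    scores = {snp: chi_square_2x2(*counts) for snp, counts in allele_counts.items()}
    # Highest statistic first: strongest candidate association.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical allele counts for three SNPs.
counts = {
    "snp1": (10, 10, 10, 10),   # identical in cases and controls
    "snp2": (30, 10, 10, 30),   # alternate allele enriched in cases
    "snp3": (22, 18, 18, 22),   # slight difference
}
ranked = rank_snps(counts)
print(ranked)
```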

7
Q

Annotating a gene means

A

linking its sequence information to other information about its function and expression, the protein it encodes, and similar genes in other species.

8
Q

Metagenomics is an emerging field in which

A

the genome sequences of an entire group of organisms that inhabit a common environment are sampled and determined (e.g., from environmental DNA, eDNA).

9
Q

Synthetic biology seeks to

A

design organisms that might provide useful functions

10
Q

Functional genomics

A

characterizes what sequences do—their function

11
Q
A
12
Q

Genome content consists of

A

much more than just protein-coding genes

Intergenic sequences → “non-coding” DNA

Repetitive sequences → short and long sequences that repeat in tandem or are interspersed throughout the genome

13
Q

prokaryotic and eukaryotic genomes differ drastically in

A

size & organization

prokaryote - genome sits directly in the cytosol (no organelles, DNA not in a nucleus)

eukaryote - genome in distinct chromosomes - tightly bound to proteins

14
Q

Anatomy of a prokaryotic genome

A

1) single, circular chromosome

2) Single origin of replication (required by the DNA replication machinery)

3) Genomes are compact
→ ~1-10 million bases (Mb)

4) Most content is genic
→ Minimal intergenic DNA (non-coding)
→ few repetitive sequences
→ No introns

5) Genome size is directly related to gene content
→ larger genomes encode more proteins

15
Q

regulatory consequences of organization of prokaryotic genome

A

Genes in biochemical or signaling pathways often clustered and controlled as operons

Chromosome not sequestered in nucleus

Chromosome not bound by histone proteins
→ No chromatin

16
Q

Eukaryotic genomes

A

1) Genomes divided into multiple linear chromosomes, with telomeres & centromeres

2) DNA complexed with histone proteins (=chromatin) in a nucleus

3) Genome size tends to be much larger, and varies widely, even within a taxonomic group
→ Genes interrupted by introns
→ Copious intergenic DNA
→ Copious repetitive DNA

4) Genomes don’t tend to be compact

5) With rare exceptions, genes not clustered into operons

6) Many genes (most human genes) are interrupted by introns; genes are far apart

17
Q

C-value is the

A

DNA content per haploid cell
→ think of this as genome size (how many bp)

18
Q

G-value is the

A

protein-coding gene number
(the number of genes in the genome that code for proteins)

19
Q

G-value paradox

A

Gene number does not fully correlate with organismal complexity

20
Q

G-value paradox explained by

A

(1) alternative splicing
(2) gene expansion/contraction

21
Q

alternative splicing explanation for G-value paradox

A

Multiple exons from one gene can be spliced in different ways (=alternative splicing) to form distinct mRNAs and proteins

No. of proteins >> no. of protein-coding genes

Explains smaller-than-expected gene count in multicellular spp.
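To see how one gene can yield many mRNAs, the sketch below enumerates the exon combinations produced when some exons can be skipped. The exon names and the choice of which exons are optional are hypothetical, and the model assumes every optional exon is included or skipped independently, which real genes do not necessarily do.

```python
from itertools import product


def isoforms(exons, optional):
    """List exon combinations when each exon in `optional` may be kept or skipped."""
    # Constitutive exons have one choice (keep); optional exons have two.
    choices = [(True, False) if exon in optional else (True,) for exon in exons]
    variants = []
    for keep_flags in product(*choices):
        variants.append(tuple(e for e, keep in zip(exons, keep_flags) if keep))
    return variants


exons = ["E1", "E2", "E3", "E4", "E5"]
variants = isoforms(exons, optional={"E2", "E3", "E4"})
for v in variants:
    print("-".join(v))
print(len(variants), "possible mRNAs from one gene")
```

With n independently skippable exons the count is 2^n, so even a few optional exons give many more transcripts than genes.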

22
Q

expansion/contraction explanation for G-value paradox

A

Gene expansion & contraction is frequent, even among closely related spp.

gene duplication
family duplication
entire genome duplicated

23
Q

C-value paradox

A

Genome size doesn’t fully correlate with organismal complexity

24
Q

C-value paradox explanation

A

expansion of non-genic DNA, largely repetitive DNA

Roughly half or more of the human genome is repetitive DNA (estimates range from about 45% to about two-thirds) → mostly interspersed transposable elements → many are non-autonomous, non-coding transposable elements

25
Q

Assembly of eukaryotic genomes is

A

very challenging

Human genome is 3,200 Mb (million bases; =3.2 Gb) with large amounts of repetitive DNA.

The technology of the time (Sanger sequencing) was not up to the task

26
Q

Draft human genome reference assembly

A

Sequencing is only the beginning, resulting in multiple millions of “reads”

Assembly → sequencing reads must be put in order on chromosomes (we are skipping this aspect)

Draft assembly → unfinished, with lots of “gaps”

Reference assembly → the assembly (usually a working model) is used as a framework to guide interpretation of individual genome variation and functional genome analysis
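One way to see why a draft assembly has gaps is to estimate sequencing coverage. The sketch below uses the 3,200 Mb genome size from the previous card; the read count and read length are hypothetical, and the gap estimate relies on the idealized Lander-Waterman assumption that reads land uniformly at random, which repetitive DNA violates.

```python
import math


def coverage(num_reads, read_length, genome_size):
    """Average number of reads covering each base: (N * L) / G."""
    return num_reads * read_length / genome_size


def expected_uncovered_fraction(cov):
    """Fraction of bases hit by no read, assuming uniformly random read placement."""
    return math.exp(-cov)


GENOME_SIZE = 3_200_000_000   # human genome, 3,200 Mb
NUM_READS = 32_000_000        # hypothetical
READ_LENGTH = 500             # hypothetical read length in bases

cov = coverage(NUM_READS, READ_LENGTH, GENOME_SIZE)
missing = expected_uncovered_fraction(cov)
print(f"coverage: {cov:.1f}x")
print(f"expected uncovered bases: {missing * GENOME_SIZE:,.0f}")
```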

27
Q

HGP brought about

A

radical technological changes in genetics
and
radical conceptual changes in genetics

28
Q

technological changes in genetics brought about by HGP

A

High throughput, massively parallel, genome-wide data collection and functional assays

Sequencing efficiencies
→ 1 human genome: from ~13 years and $2.7 billion (1990-2003) to 1 day and <$999 in 2019

Concomitant strides in computing power and analysis software

29
Q

conceptual changes in genetics brought about by HGP

A

Humans are more variable than we thought

Humans have far fewer protein-coding genes than we thought…
…yet, most of the genome is transcribed → a lot of RNA not turned into proteins

Cells are full of noncoding RNAs

30
Q

Initial predictions of the number of human protein-coding genes before the Human Genome Project were ____
Current estimate is ___

A

~200,000
~20,000 or fewer

caveat: we need to reconsider how a “gene” is defined, as we will see later in the course

31
Q

Functional Genomics

A

how to go from DNA sequence to what the sequences do (their function)

32
Q

Genome controls phenotype through

A

transcription

33
Q

We expect that functional elements in the genome should be

A

1) transcribed, or
2) bind proteins that regulate transcription

34
Q

Bioinformatics involves ___

which can ____

A

using computer technology to collect, store, analyze and disseminate biological data and information

increase our understanding of health and disease and, in certain cases, be used as part of medical care.

35
Q

Homologous genes

A

Genes that share a common evolutionary origin. Likely to have conserved sequence and function.

36
Q

Paralogs

A

Homologous genes in the same species.

e.g. alpha and beta hemoglobin in humans.

37
Q

Orthologs

A

Homologous genes in different species.

e.g. mouse and human alpha hemoglobin
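The paralog/ortholog distinction reduces to one question about two homologous genes: are they in the same species? A minimal sketch, using the hemoglobin examples from these cards and a hypothetical `(species, gene_name)` tuple representation:

```python
def classify_homologs(gene_a, gene_b):
    """Classify two genes that are already known to be homologous.

    Each gene is a (species, gene_name) tuple. Same species -> paralogs,
    different species -> orthologs.
    """
    return "paralogs" if gene_a[0] == gene_b[0] else "orthologs"


human_alpha = ("human", "alpha hemoglobin")
human_beta = ("human", "beta hemoglobin")
mouse_alpha = ("mouse", "alpha hemoglobin")

print(classify_homologs(human_alpha, human_beta))    # same species
print(classify_homologs(human_alpha, mouse_alpha))   # different species
```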

38
Q

Predict function from sequence

A

by how closely the sequence is related to sequences in other genomes (e.g., SARSr-CoV)
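A toy version of predicting function from sequence: compare a new sequence with already-annotated sequences and borrow the annotation of the closest match. The sequences and annotation names below are invented, and the sketch assumes the sequences are already aligned and of equal length; real tools perform the alignment themselves.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def best_match(query, annotated):
    """Return the name of the annotated sequence most similar to the query."""
    return max(annotated, key=lambda name: percent_identity(query, annotated[name]))


# Hypothetical aligned sequences with made-up functional labels.
annotated = {
    "kinase_like": "ATGGCCATTA",
    "transporter_like": "ATGTTTAAAG",
}
query = "ATGGCCATTG"

for name, seq in annotated.items():
    print(name, percent_identity(query, seq))
print("closest annotated sequence:", best_match(query, annotated))
```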

39
Q

Comparative genomics

A

field of genomics that studies similarities and differences in gene content, function, and organization among genomes of different organisms

40
Q

Transcriptome

A

All RNA molecules transcribed from a genome

41
Q

Transcriptomics

A

Techniques used to identify and quantify the transcriptome.

42
Q

protein domains

A

Complex proteins often contain regions, called protein domains, that have specific shapes or functions

(ex. zinc finger)

43
Q

RNA-seq

A

Transcriptomics

identifies all transcribed elements
→ extract all cellular RNA
→ reverse transcribe → cDNA
→ chop up and add adapters → sequence

Relies on next generation sequencing and bioinformatics
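The bioinformatics step can be sketched as counting how many sequenced reads were assigned to each gene and scaling by the total, giving counts per million (CPM). The read assignments below are hypothetical, and the alignment step that produces them is assumed to have happened already.

```python
from collections import Counter


def counts_per_million(read_assignments):
    """Convert a list of per-read gene assignments into counts per million reads."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


# Hypothetical result of aligning 20 reads: each entry is the gene a read mapped to.
read_assignments = ["geneA"] * 5 + ["geneB"] * 15
cpm = counts_per_million(read_assignments)
print(cpm)
```

Scaling by the total makes samples sequenced to different depths comparable; it does not correct for gene length.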

44
Q

Microarrays

A

Transcriptomics

Can be used to determine relative levels of mRNA (i.e. expression levels) for 1000’s of genes.

Employ an array of probes that are complementary to mRNA sequences.
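Relative expression from an array is often summarized as a log2 ratio of probe intensities between two samples. The sketch below assumes hypothetical, already background-corrected intensities for three genes; real analyses add normalization steps that are omitted here.

```python
import math


def log2_ratios(treated, control):
    """log2(treated / control) for every gene with positive intensity in both samples."""
    return {
        gene: math.log2(treated[gene] / control[gene])
        for gene in treated
        if gene in control and treated[gene] > 0 and control[gene] > 0
    }


# Hypothetical probe intensities.
treated = {"gene1": 800, "gene2": 100, "gene3": 300}
control = {"gene1": 200, "gene2": 400, "gene3": 300}

ratios = log2_ratios(treated, control)
print(ratios)
```

A value of +2 means about four times more mRNA in the treated sample, -2 about four times less, and 0 no change.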

45
Q

Proteome

A

All proteins encoded in a genome.

46
Q

Proteomics

A

Techniques used to identify and quantify the proteome.

47
Q

Mass spec

A

Proteomics

is a high throughput method to identify proteins in a cell

→ digest proteins into peptides
→ separate fragments by mass-to-charge ratio
→ match peak profiles to a database of known proteins
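The three steps above can be sketched in miniature: digest with a simplified trypsin rule (cut after K or R unless followed by P), compute each peptide's mass-to-charge ratio from monoisotopic residue masses, and look up an observed peak in a small database. The protein sequences, the database, and the tolerance are hypothetical; real search engines score whole fragment spectra rather than single peaks.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_digest(protein):
    """Cut after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def mass_to_charge(peptide, charge=1):
    """m/z of a peptide carrying `charge` extra protons."""
    mass = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (mass + charge * PROTON) / charge


def match_peak(observed_mz, database, tolerance=0.01):
    """Return (protein, peptide) pairs whose singly charged m/z is within tolerance."""
    hits = []
    for name, seq in database.items():
        for pep in tryptic_digest(seq):
            if abs(mass_to_charge(pep) - observed_mz) <= tolerance:
                hits.append((name, pep))
    return hits


database = {"protX": "AKRPGK", "protY": "GGSKAAR"}   # hypothetical proteins
print(tryptic_digest("AKRPGK"))
print(match_peak(218.15, database))
```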

48
Q

ChIP-seq

A

(Chromatin ImmunoPrecipitation followed by sequencing)
Proteomics
(affinity capture)

identifies DNA bound by known DNA-binding proteins
→ e.g., transcription factors (TFs), RNA pol

antibody binds a specific protein → take genomic DNA with its bound proteins → mix with the antibody → antibody binds the protein (which is bound to DNA) → pull the complex out of solution and sequence the DNA bound by that protein

Requires specific antibodies
→ need to know which protein you are looking for (and have an antibody for it)

high throughput sequencing

49
Q

two-dimensional polyacrylamide gel electrophoresis

A

(2D-PAGE), proteomics

in which the proteins are separated in one dimension by charge, separated in a second dimension by mass, and then stained

50
Q

Protein Microarrays

Employ ___

Can be used to ____

A

Proteomics

Employ an array of proteins immobilized on a solid support.

to identify protein-protein interactions or measure expression of proteins within cells (using immobilized antibodies).

51
Q

Modifications of affinity capture and other techniques can be used to ____ termed the_____

A

determine the complete set of protein interactions in a cell,

interactome.

52
Q

Genome-wide mutagenesis screens

A

can be used to search for all genes affecting a particular function or trait.

two methods—random inducement of mutations on a genome-wide basis and mapping with molecular markers—are coupled and automated

53
Q

segmental duplications

A

duplicated regions greater than 1000 bp that are almost identical in sequence.

Many eukaryotic genomes, especially those of multicellular organisms, are filled with segmental duplications.

54
Q

multigene family is a

A

group of evolutionarily related genes that arose through repeated duplication and evolution of an ancestral gene.

55
Q

gene deserts

A

large chromosomal regions with no protein-encoding genes (studied using genetically engineered mice that were missing such regions)

56
Q

collinearity

A

many genes are present in the same order in related genomes

57
Q

pangenome

A

the entire set of genes possessed by all members of a particular species.
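In set terms, the pangenome is the union of the gene sets of all sequenced members of a species. The sketch below also splits it into genes shared by every member and genes found in only some; the strain names and gene names are hypothetical.

```python
def pangenome(strain_genes):
    """Return (pan, core, accessory) gene sets from a dict of strain -> set of genes.

    Assumes at least one strain is supplied.
    """
    gene_sets = list(strain_genes.values())
    pan = set().union(*gene_sets)            # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes present in all strains
    accessory = pan - core                   # genes present in only some strains
    return pan, core, accessory


# Hypothetical gene content of three strains.
strains = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB", "geneD"},
    "strain3": {"geneA", "geneC", "geneE"},
}
pan, core, accessory = pangenome(strains)
print(sorted(pan))
print(sorted(core))
print(sorted(accessory))
```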

58
Q

single-nucleotide polymorphism (SNP)

A

A site in the genome where individual members of a species differ in a single base pair
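A minimal sketch of spotting SNPs: walk along aligned sequences from several individuals and report the positions where more than one base occurs. The sequences are invented, positions are 0-based, and the sequences are assumed to be aligned and of equal length.

```python
def find_snps(aligned_sequences):
    """Return (position, sorted alleles) for each column with more than one base."""
    snps = []
    for position, column in enumerate(zip(*aligned_sequences)):
        alleles = set(column)
        if len(alleles) > 1:
            snps.append((position, sorted(alleles)))
    return snps


# Hypothetical aligned sequences from three individuals.
individuals = [
    "ATGCCGTA",
    "ATGACGTA",
    "ATGCCGTT",
]
print(find_snps(individuals))
```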