W10LECT - Introduction to genomics, Methods in genomics Flashcards

1
Q

What are the features of a genome?

A
  • The genome is the entirety of an organism’s hereditary information.
  • It is encoded either in DNA or, for many types of viruses, in RNA.
  • The genome includes both the genes and the non-coding sequences of the DNA/RNA. Diploid cells carry two copies of the genome.
2
Q

What is Genomics?

A

Genomics is the study of the function, structure and interactions of the genome.
− Involves: methods, RNA, protein, bioinformatics.
− Can be: structural genomics, comparative genomics, plant genomics, human genomics, pharmacogenomics or medical genomics.

3
Q

Data about the Human Genome
1. What percentage of the genome is protein-coding, and what percentage is non-coding?

A

1.2% is protein-coding (the remaining 98.8% is non-coding)

4
Q

Data about the Human Genome
2. Is recombination higher in males or females?

A

It is higher in females.

5
Q

Data about the Human Genome
3. Are mutations more frequent in male meiosis or female meiosis?

A

Mutations are more frequent in male meiosis (the majority of mutations originate in males)

7
Q

Data about the Human Genome
4. How many new mutations does an offspring carry compared with its parents?

A

60

8
Q

Data about the Human Genome
5. How many loss-of-function mutations does every individual have in the annotated genes, and how many of the affected genes are involved in Mendelian diseases?

A

Every individual has 250-300 loss-of-function mutations in the annotated genes, among which 50-100 genes are involved in Mendelian diseases.

9
Q

Data about the Human Genome
6. What percentage of the human genome consists of repeats?

A

46% is repeats, many of which are transposons (i.e., jumping genes)

10
Q

Data about the Human Genome
7. What are the most frequent repeats?

A

The most frequent repeats are the Alu elements, which occupy 10.6% of the genome

11
Q

Data about the Human Genome
8. What is the largest gene, and what are its role, size and location?

A
  • Largest gene: DMD, which codes for dystrophin
  • size: 2,224,919 bases
  • location: Xp21.2
12
Q

Data about the Human Genome
9. What is the longest coding sequence?

A

TTN, which codes for titin; coding sequence: 104,076 bp; 34,692 amino acids
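
The two figures in this answer agree with the triplet genetic code, in which three bases encode one amino acid. The check below is a minimal sketch: the function name `codon_count` is illustrative, and it only divides the quoted coding length by three without treating a stop codon separately.

```python
def codon_count(cds_length_bp: int) -> int:
    """Return the number of codons in a coding sequence of the given length."""
    if cds_length_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return cds_length_bp // 3


ttn_cds_bp = 104_076  # TTN coding sequence length quoted in the answer
print(codon_count(ttn_cds_bp))  # 34692
```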

13
Q

Data about the Human Genome
10. What is the longest exon?

A

Longest exon: in TTN, 17,106 bp

14
Q

Data about the Human Genome
11. What percentage of the human genome is gene desert?

A

20% of the genome is gene desert (a region >500 kbp without a gene)
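
The definition in this answer (a region >500 kbp without a gene) can be applied to a list of gene coordinates. The following is a rough sketch under stated assumptions: the function name `find_gene_deserts`, the 0-based half-open coordinates and the toy chromosome are all hypothetical and do not come from any real annotation.

```python
DESERT_MIN_BP = 500_000  # threshold from the definition: >500 kbp without a gene


def find_gene_deserts(genes, chrom_length, min_bp=DESERT_MIN_BP):
    """Return (start, end) gaps longer than min_bp that contain no gene.

    genes: list of (start, end) tuples on one chromosome (assumed 0-based, half-open).
    """
    deserts = []
    cursor = 0  # end of the furthest-reaching gene seen so far
    for start, end in sorted(genes):
        if start - cursor > min_bp:
            deserts.append((cursor, start))
        cursor = max(cursor, end)  # handles overlapping genes
    if chrom_length - cursor > min_bp:
        deserts.append((cursor, chrom_length))
    return deserts


# Hypothetical 3 Mb chromosome with three genes.
toy_genes = [(100_000, 150_000), (900_000, 1_000_000), (1_200_000, 1_250_000)]
print(find_gene_deserts(toy_genes, 3_000_000))
# [(150000, 900000), (1250000, 3000000)]
```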

15
Q

Data about the Human Genome
12. Which chromosomes are gene-rich?

A

Gene-rich chromosomes: 17, 19, 22 (the richest is chromosome 19, with 1,458 coding and 980 non-coding genes)

16
Q

Data about the Human Genome
12. Which chromosomes are gene-poor?

A

Gene-poor chromosomes: Y, 4, 13, 18, and X (the poorest is Y, with 72 coding and 137 non-coding genes and < 1.0 gene/Mb)

17
Q

Data about the Human Genome
13. How many imprinted genes are known?

A

At present 156 imprinted genes are known

18
Q

Data about the Human Genome
14. The majority of SNPs found to be associated with disease are located ___

A

outside the coding region.

19
Q

Data about the Human Genome
15. What are examples of diseases in which CNVs can play a role?

A

CNVs can play a role in several diseases, such as Crohn’s disease, Alzheimer’s disease, autism, obesity and AIDS.

20
Q

Data about the Human Genome
16. How can CNVs play a role in transplantation?

A

CNVs can play a role in transplantation.
=> If a gene is missing in the organ recipient because of a CNV but is present in the donor, an immune response can develop against the gene product, so graft-versus-host disease can develop in spite of MHC identity.

21
Q

Data about the Human Genome
17. In every cell type assessed, between 10 and 25% of human and mouse autosomal genes can be subject to ___

A

monoallelic expression (MAE).

22
Q

Data about the Human Genome
18. The majority of genes (75-90%) show BAE, i.e. ___

A

both alleles are active in the cell, with the additional remark from above that at any point in time, a cell contains mostly transcripts from one allele.

23
Q

Data about the Human Genome
19. The majority of genes (75-90%) show ___, i.e. both alleles are active in the cell, with the additional remark from above that at any point in time, a cell contains mostly transcripts from one allele.

A

BAE

24
Q

Human Genome Project (HGP)
1. What are the features of Human Genome Project (HGP)?

A
  • US government project coordinated by the Department of Energy and the National Institutes of Health (NIH)
  • Formally began on 1 October 1990
  • Planned for 15 years
  • Completed April 2003 (after 13 years)
  • Paper published in 2006
25
Q

Human Genome Project (HGP)
2. What are the main roles of HGP?

A
26
Q

Human Genome Project (HGP)
3. How is HGP performed?

A
27
Q

Human Genome Project (HGP)
4. How do we perform sequencing?

A
  • 1st generation (Sanger)
  • 2nd generation (next-generation sequencing, NGS; also called short-read sequencing or massively parallel sequencing)
  • 3rd generation (Long-read sequencing)
28
Q

Encyclopedia of DNA elements (ENCODE)
1. What is Encyclopedia of DNA elements (ENCODE)?

A

A public research project which aims to identify functional elements in the human genome.
- Aim: to determine which regions are transcribed into RNA, which regions are likely to control the genes that are used in a particular type of cell, and which regions are associated with a wide variety of proteins.

29
Q

Encyclopedia of DNA elements (ENCODE)
2. What are the main Results of Encyclopedia of DNA elements (ENCODE)?

A
  • 80% of the genome has biochemical functions, in particular outside of the well-studied protein-coding regions.
  • ENCODE first introduced NGS (next-generation sequencing).
  • Disease-linked regions include enhancers or other functional sequences, and cell type is important.
  • About 75% of the genome is transcribed at some point in some cells, and genes are highly
    interlaced with overlapping transcripts that are synthesized from both DNA strands.
  • 96% of CpGs exhibited differential methylation in at least one cell type or tissue assayed, and levels of
    DNA methylation correlated with chromatin accessibility.
30
Q

ENCODE: Searching for functional sites in the genome
3. How does ENCODE search for functional sites in the genome?

A
  • The genome is digested with DNase. If the enzyme can access and digest a region, that region is open, so other enzymes and molecules (e.g. transcription factors) also have access to it, i.e. it is a functional region.
  • Next, the DNA around the cleavage sites is sequenced.
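
Once the cleavage sites have been sequenced and mapped, open regions show up as places where many cuts cluster. The snippet below is only a toy sketch of that idea, not ENCODE's actual pipeline: the function name `open_windows`, the 200 bp window, the threshold of 5 cuts and the positions are all illustrative assumptions.

```python
from collections import Counter


def open_windows(cut_sites, window_bp=200, min_cuts=5):
    """Return start positions of fixed-size windows holding at least min_cuts cleavage sites.

    window_bp and min_cuts are arbitrary illustrative values, not published parameters.
    """
    counts = Counter(pos // window_bp for pos in cut_sites)
    return sorted(idx * window_bp for idx, n in counts.items() if n >= min_cuts)


# Hypothetical mapped cleavage positions (bp) on one chromosome.
cuts = [1010, 1025, 1040, 1100, 1150, 1190, 5300, 9020]
print(open_windows(cuts))  # [1000]
```

Here six cuts fall in the window starting at 1000 bp, so it is reported as open, while the isolated cuts at 5300 and 9020 are ignored.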
31
Q

Background Studying of Disease
1. What are the two main approaches to studying the background of disease?

A
  1. Hypothesis driven
  2. Hypothesis-free
32
Q

Background Studying of Disease
2. What is Hypothesis driven?

A

Hypothesis-driven: we know which gene we are looking for.
- There is a preconception (i.e., a prior idea).
- For example: candidate gene association studies.

33
Q

Background Studying of Disease
3. What is hypothesis-free?

A

Hypothesis-free: screening the whole population, or part of it, to look for a gene.
- No preconception.
- For example: genome-wide association studies (GWAS), whole genome sequencing, microarray measurements for studying gene expression (genomic methods).
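
To make "no preconception" concrete: in a hypothesis-free scan every variant is tested in the same way, and the significance threshold is tightened to account for the number of tests. The following is a simplified sketch, assuming a basic allelic chi-square test with Bonferroni correction; the SNP names and allele counts are invented, and a real GWAS involves far more (quality control, population structure, millions of variants).

```python
import math


def allelic_chi2_p(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """P-value of a 2x2 chi-square test (1 degree of freedom) on allele counts."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    num = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    den = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
           * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    if den == 0:
        return 1.0  # degenerate table: no evidence either way
    chi2 = num / den
    return math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function for 1 df


def scan(snps, alpha=0.05):
    """Test every SNP and keep those passing a Bonferroni-corrected threshold."""
    threshold = alpha / len(snps)
    return [name for name, counts in snps.items() if allelic_chi2_p(*counts) < threshold]


# Hypothetical allele counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
toy = {
    "snp_a": (120, 80, 80, 120),
    "snp_b": (100, 100, 98, 102),
    "snp_c": (105, 95, 100, 100),
}
print(scan(toy))  # ['snp_a']
```

A hypothesis-driven candidate gene study would instead test only the variants in the gene chosen in advance, so no such genome-wide correction would be needed.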

34
Q

Describe gene deficiencies and knockout (KO) people

A
  • In mice, 30% of gene deficiencies are lethal in utero
  • The majority have phenotypic consequences
    – On average, every individual lacks 20 genes (KO)
    – E.g. genes for smell, or redundant genes
    – Advantageous gene deficiencies: LPA, FUT2, CCR5, PCSK9 KO
    – 43 genes whose inactivation is lethal to mice were found to be inactivated in humans who are alive and apparently well.
35
Q

What is the most variable part of the genome?

A
  • MHC: 4 million bp at 6p21.3, >100 genes
  • 10 times more variation than in other parts of the genome; this is evolutionarily advantageous
36
Q

Explain „Single-cell transcriptomics”

A
  • At any point in time, a cell contains mostly transcripts from one allele. This is independent of the parent of origin of the allele.
  • Transcription in mammals is discontinuous and occurs in transcriptional bursts interspersed with refractory periods of gene inactivity.
38
Q

Explain Monoallelic expression (MAE)

A
  • 10-25% of genes show MAE (not imprinting)
  • It is mitotically stable
  • MAE genes have higher nucleotide diversity than genes with biallelic expression
  • They are enriched for genes encoding proteins that are present on the cell surface and responsible for interactions between the cell and its environment
  • Heterozygote advantage
39
Q

Describe Comparative genomics

A
  • Comparing the genomes of contemporary species.
  • Genes essential for life
  • Genes essential for multicellular organisms.
  • Genome regions conserved through evolution.
40
Q

What are the features of Conserved regions?

A
  1. Small differences between evolutionarily distant species
  2. Probably important functions
  3. 99% of the protein-coding genes in the mouse have human homologs. Difference ≈ 300 genes. At the genomic level: 90%.
  4. Chimpanzee: 96%, Y chromosomes are very
41
Q

What are the features of the genome of modern humans?

A
  1. The genome of Homo sapiens mixed with those of other human species
  2. Genomes of European and Asian people contain 1-4% Neanderthal genome
  3. About 3–5% of the DNA of Melanesians and Aboriginal Australians, and around 7–8% of that of Papuans, derives from Denisovans.
  4. Useful genes survived