GENETIC ASSOCIATION STUDIES Flashcards

1
Q

how did they find disease-causing genes in the pre-genomic era

A

they used pedigree diagrams of 3 generations or more

2
Q

how did they find disease-causing genes in the post-genomic era

A

they ran GWAS (genome-wide association studies) across the entire genome

3
Q

findings from the first human genome generated

A
  1. it was generated from 13 people
  2. found that there are 3 billion base pairs in the haploid genome
  3. around 25000 genes exist
  4. around 18 million SNPs exist, and each person carries around 3.3 million of them
    i.e. there is extensive allelic variation between members of the same species
4
Q

what are the 4 different types of genetic variants and their frequencies

A
  1. SNPs: 18 million of them; occur every 1 kb
  2. InDels: 200,000 of them; occur every 10 kb
  3. SSRs: 100,000 of them; occur every 30 kb
  4. CNVs: 8,600 of them; occur every 3 Mb
5
Q

explain what a SNP is

A

it is a single nucleotide polymorphism i.e. a change in a single nucleotide base from the wild type to a variant type
-more than 1% of the population must have the alternate nucleotide at this position for it to be considered a SNP
-SNPs can be homozygous or heterozygous because we have 2 copies of each chromosome i.e. 2 copies of the DNA
-always take the top nucleotide on each chromosome
-the top nucleotide on chromosome 1 is how you name the SNP
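
a minimal Python sketch of the 1% rule and the homozygous/heterozygous call described above. It assumes genotypes are written as two-letter strings such as "AG"; the function names and the sample of 100 people are made up for illustration and are not from the lecture.

```python
from collections import Counter

def minor_allele_frequency(genotypes):
    """Return (minor_allele, frequency) for a list of two-letter genotypes."""
    counts = Counter(allele for g in genotypes for allele in g)
    if len(counts) < 2:
        # only one allele seen, so there is no variation at this position
        return None, 0.0
    total = sum(counts.values())
    allele, n = min(counts.items(), key=lambda kv: kv[1])
    return allele, n / total

def zygosity(genotype):
    """Call a two-letter genotype homozygous or heterozygous."""
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"

# hypothetical sample: 90 people AA, 9 people AG, 1 person GG
sample = ["AA"] * 90 + ["AG"] * 9 + ["GG"] * 1
allele, maf = minor_allele_frequency(sample)
is_snp = maf > 0.01  # the 1% rule from the card
```

here G is the minor allele (11 of 200 allele copies), so the position passes the 1% rule.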

6
Q

how do you figure out what the wild type SNP is

A

compare the human genome to that of the chimp: the chimp allele is taken to be the wild-type (ancestral) allele, so any other allele must have arisen after the 2 species diverged

7
Q

how do we genotype SNPs

A

use a SNP chip, as in a GWAS
-a GWAS requires the genomes of 1000s of individuals to find SNPs with a high association to a disease
-GWAS genotyping is chip based i.e. microarray based
-the loci on the chip are SNPs
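
a rough sketch of what "association to a disease" can mean for one SNP: a 2x2 chi-square test on allele counts in cases vs controls. This is one common choice, not necessarily the test used in the lecture; the function name and the counts are hypothetical.

```python
import math

def allelic_chi_square(case_counts, control_counts):
    """Chi-square test (1 degree of freedom) on a 2x2 table of allele counts.

    Each argument is (count of allele 1, count of allele 2).
    Returns (chi-square statistic, P value).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # survival function of a chi-square with 1 df, written with the stdlib erfc
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# hypothetical counts: allele 1 is more common in cases than in controls
chi2, p = allelic_chi_square((600, 400), (500, 500))
```

in a real GWAS this test is repeated for every SNP on the chip, which is why the multiple testing cards matter.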

8
Q

explain how a GWAS works

A
  1. generate oligonucleotide probes matching the DNA regions you wish to genotype
  2. stick these oligonucleotides to the chip
  3. then fragment your DNA of interest and wash the fragments over the chip so that fragments bind to their complementary oligo probe
  4. note that the oligo probe ends just before the base you wish to genotype, so a fluorescently labelled nucleotide that is complementary to the fragment can be added to the oligo
    -lit up using a light source
    -shows you if that person is hetero/homozygous at that SNP
9
Q

in what disease is a SSR found

A

in Huntington's disease
-the SSR is a triplet repeat of CAG found in the coding region of the HD gene
-a person who has less than 34 CAG repeats in this region of the gene will have a normal allele
-a person with more than 42 CAG repeats in this region of the gene will have HD
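
a small sketch of how the repeat count could be read off a sequence and compared with the thresholds on this card (fewer than 34 = normal, more than 42 = HD). The function names and the test sequence are made up; counts between the two thresholds are not covered by the card, so the code only flags them.

```python
import re

def longest_cag_run(sequence):
    """Return the number of repeats in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)

def classify_hd_allele(n_repeats):
    """Apply the thresholds stated on the card."""
    if n_repeats < 34:
        return "normal"
    if n_repeats > 42:
        return "HD"
    return "between thresholds (not covered by the card)"

# hypothetical allele with 20 CAG repeats
allele = "ATG" + "CAG" * 20 + "CCG"
repeats = longest_cag_run(allele)
status = classify_hd_allele(repeats)
```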

10
Q

explain what incomplete penetrance is

A

not every person who has the disease genotype will express the disease phenotype

11
Q

explain what genetic heterogeneity is

A

this is when different disease genotypes are responsible for the same disease in different families

12
Q

explain what polygenic determination is

A

this is when mutant alleles at more than one locus influence disease expression in one person

13
Q

explain what complex inheritance is

A

this is when 2 unlinked disease loci are inherited together to predispose someone to the disease
eg: breast cancer
-inheritance of mutations at the 2 unlinked disease loci BRCA1 and BRCA2 can predispose women to breast cancer
-incompletely penetrant disease because not all women who have mutant BRCA1/BRCA2 develop breast cancer
-note that these 2 genes encode tumour suppressor proteins

14
Q

what were the aims of the Kirov paper

A
  1. to find the de novo CNVs associated with SZ
  2. to compare SZ CNVs with CNVs in other diseases
  3. to compare de novo vs inherited CNVs
  4. to understand the pathophysiology behind SZ
15
Q

what were the cohorts used in the Kirov study

A
  1. Bulgarian case-only parent-proband trios
  2. Icelandic control-only parent-proband trios (note that this group has a controlled gene pool because the population is geographically isolated)
  3. data from an ASD case-control study
  4. data from publicly available datasets
16
Q

what is gene ontology

A

this is the process of describing the function of genes in 3 aspects
1. molecular function
2. cellular component
3. biological process
-look specifically at the pathways in which these genes are involved i.e. gene set enrichment analysis (GSEA)
-the GSEA finds pathways that are enriched in the disease of interest
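
to make "enriched" concrete, here is a sketch of a simple over-representation test (hypergeometric tail). Note this is a simpler stand-in, not the full GSEA algorithm, which works on ranked gene lists; the function name and all numbers are hypothetical.

```python
from math import comb

def enrichment_p_value(population, pathway_size, selected, overlap):
    """P(at least `overlap` pathway genes among `selected` genes drawn by chance).

    population:   total number of genes considered
    pathway_size: number of those genes that belong to the pathway
    selected:     number of genes in the disease gene list
    overlap:      number of disease genes that fall in the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# toy example: 10 genes in total, 5 in the pathway, and all 5 listed genes hit it
p_enriched = enrichment_p_value(10, 5, 5, 5)
```

a small P value means the overlap is larger than chance would give, which is the sense in which a pathway is called enriched.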

17
Q

define de novo

A

a genetic variant that arises for the first time in a family due to a mutation in one of the germ cells of either parent
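
a minimal sketch of how this definition is applied to a parent-proband trio: an allele seen in the child but in neither parent is flagged as de novo. Genotypes are assumed to be two-letter strings and the function name is made up. Real pipelines also have to rule out genotyping error, which this ignores.

```python
def is_de_novo(child, mother, father):
    """True if the child carries an allele that neither parent carries."""
    parental_alleles = set(mother) | set(father)
    return any(allele not in parental_alleles for allele in child)

# hypothetical trios at one SNP
new_in_child = is_de_novo("AG", "AA", "AA")   # G is in neither parent
inherited = is_de_novo("AG", "AG", "AA")      # G came from the mother
```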

18
Q

what were the results of the Kirov paper

A
  1. they found more rare/de novo CNVs occurring in SCZ than in controls
  2. they found 34 de novo CNVs associated with SCZ
  3. 8 of these 34 de novo CNVs were found at known SCZ loci but the rest were found at new loci not yet known to be associated with SCZ
  4. some of the CNVs they found in SCZ were also found to be associated with other disorders
  5. they found some CNVs were located in genes that function in the post synaptic density pathway
19
Q

what is a haplotype

A

this is a group of SNPs located on a single chromosome that are statistically associated i.e. inherited together

20
Q

what is a tag SNP

A

a tag SNP is a single SNP that represents a group of SNPs i.e. a haplotype
-it is in high linkage disequilibrium with the other SNPs
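
a sketch of how linkage disequilibrium between two SNPs can be measured with r squared, a standard LD statistic. It assumes phased haplotypes coded 0/1 at each SNP; the function name and the toy data are hypothetical.

```python
def r_squared(haplotypes):
    """LD (r^2) between two SNPs.

    haplotypes: list of (allele at SNP 1, allele at SNP 2), each coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n           # frequency of allele 1 at SNP 1
    p_b = sum(h[1] for h in haplotypes) / n           # frequency of allele 1 at SNP 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n
    d = p_ab - p_a * p_b                              # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom else 0.0

# the two SNPs always travel together: a good tag SNP pair
perfect_ld = r_squared([(1, 1)] * 5 + [(0, 0)] * 5)
# every combination equally common: no LD
no_ld = r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])
```

an r squared near 1 means genotyping one SNP tells you the other, which is what lets a tag SNP stand in for the haplotype.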

21
Q

what is a Manhattan plot

A

this plot shows the position of all the SNPs across all the chromosomes (x axis) against the strength of their association with a disease, plotted as -log10 of the P value (y axis)
-the higher up a SNP sits, the stronger the evidence that it is associated with the disease
-the SNPs sitting at the bottom show no evidence of association with the disease
-in the areas where we see large peaks, the SNPs all give off the same signal and so are associated with one another i.e. they represent a haplotype. a peak indicates that the haplotype in this genomic region has a high association with the disease
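
a sketch of how the points of such a plot could be prepared: each P value becomes -log10(P), and SNPs are ordered by chromosome and position. The 5e-8 cut-off is a widely used genome-wide significance convention and is an assumption here, not something from the cards; the SNP names and P values are invented.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # assumed convention, not from the lecture

def manhattan_points(results):
    """results: list of (snp_id, chromosome, position, p_value).

    Returns (chromosome, position, snp_id, -log10 P) sorted along the genome.
    """
    points = [(chrom, pos, snp, -math.log10(p)) for snp, chrom, pos, p in results]
    return sorted(points)

def significant_snps(points, threshold=GENOME_WIDE_THRESHOLD):
    """SNP ids whose height on the plot reaches the significance line."""
    cutoff = -math.log10(threshold)
    return [snp for _, _, snp, height in points if height >= cutoff]

# hypothetical GWAS output
results = [("rs_a", 1, 100, 0.5), ("rs_b", 2, 50, 1e-9), ("rs_c", 1, 20, 1e-3)]
points = manhattan_points(results)
hits = significant_snps(points)
```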

22
Q

how do we deal with multiple testing

A
  1. need replication
  2. make note of the false discovery rate
  3. use statistical correction
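
a sketch of two standard corrections that fit points 2 and 3: Bonferroni (controls the chance of any false positive) and Benjamini-Hochberg (controls the false discovery rate). The cards do not say which correction was used, so both are shown as examples; the P values are made up.

```python
def bonferroni(p_values, alpha=0.05):
    """Significant if P <= alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            largest_rank = rank
    # everything ranked at or below the largest passing rank is declared significant
    keep = set(order[:largest_rank])
    return [i in keep for i in range(m)]

# hypothetical P values from 5 tests
p_vals = [0.001, 0.012, 0.025, 0.045, 0.5]
strict = bonferroni(p_vals)
fdr = benjamini_hochberg(p_vals)
```

Bonferroni keeps only the first test here, while Benjamini-Hochberg keeps the first three, which shows why the choice of correction matters.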
23
Q

how do we assess a GWAS publication

A
  1. make sure the results have been independently replicated using different methods/sample groups
  2. ensure a big enough sample size has been used
  3. make sure quality control has been done
  4. check for any confounders i.e. variables besides the disease that may differ between the control and case groups
24
Q

what is a QQ plot (quality control)

A

this plot shows the observed P values on the Y axis against the expected P values on the X axis
-if the points deviate above the line, these SNPs have a greater association with the disease than expected by chance
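
a sketch of how the points of a QQ plot could be built: the observed P values are sorted and paired with the P values expected if nothing were associated (evenly spread between 0 and 1), both on the -log10 scale. The i / (n + 1) spacing is one common convention and is an assumption here; the P values are invented.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) pairs on the -log10 scale, most significant first."""
    n = len(p_values)
    pairs = []
    for i, p in enumerate(sorted(p_values), start=1):
        expected = i / (n + 1)  # i-th smallest P value expected under no association
        pairs.append((-math.log10(expected), -math.log10(p)))
    return pairs

# P values that behave exactly as expected: every point lies on the line
on_line = qq_points([0.5, 0.25, 0.75])
# one strongly associated SNP: its point sits far above the line
with_signal = qq_points([1e-8, 0.5, 0.75])
```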

25
Q

what did the Wellcome paper study

A

studied 7 common diseases to find the genetic variants associated with these diseases
1. rheumatoid arthritis
2. bipolar disorder
3. diabetes type 1
4. diabetes type 2
5. hypertension
6. coronary artery disease
7. Crohn's disease

26
Q

what were the cases and controls for the Wellcome study

A

-they took 2000 cases for each of the 7 diseases and then used a shared set of 3000 controls for all 7
-1500 controls came from a recruited blood donor group
-1500 controls came from the 1958 British birth cohort

27
Q

rheumatoid arthritis

A

chronic inflammatory disease

28
Q

Crohn's disease

A

inflammatory bowel disease

29
Q

bipolar

A

episodic, recurrent pathological disturbance in mood
-it is 80-90% heritable

30
Q

DM type 1 and 2

A

type 1 is an autoimmune disease whereas type 2 is a metabolic disease

31
Q

hypertension

A

chronic increase in blood pressure

32
Q

rationale for the Wellcome study

A

-the genetic basis of these 7 diseases was still largely unknown
-despite some successes, earlier results from candidate gene and linkage studies could not be replicated
-finding the genetic basis would aid in developing new therapies and treatment plans

33
Q

what was the cohort used for the Wellcome paper

A

-all the cases and controls came from the British population
-i.e. from people who identified their ancestry as white Caucasian

34
Q

what genotyping method was used in the Wellcome study

A

used GWAS analysis i.e. to find the SNPs associated with these diseases

35
Q

what were the results from the Wellcome GWAS

A
  1. found 24 independent signals
  2. found 58 new loci with association to the diseases
  3. found many SNPs to be associated with more than one disease
36
Q

why was it essential to use the 2 different control groups

A

because these 2 groups used different sampling methods
-it allowed researchers to study the effect of differential genotyping errors on the results
-found that there was no difference between the 2 groups, which justified combining them

37
Q

what quality control was done in the study

A
  1. they excluded any people with non-Caucasian ancestry
  2. checked for relatedness
  3. checked for false identity
  4. checked for contamination
  5. the SNPs were run through a quality control filter and any SNPs with poor clustering were removed
38
Q

effect of geographical variation and population structure on the Wellcome study

A

-found that in the British population, 13 genomic regions show geographical variation along a NW/SE axis
-the impact of this was too small to change the results
-also, because all other ancestries were removed, the results were not skewed by population structure in that regard
-they concluded that the British population is heterogeneous

39
Q

what was the disease association found in the Manhattan plots and the QQ plots

A

found that the more peaks of association there were in the Manhattan plot, the more the QQ plot data deviated from the expected line
-this suggests a greater genetic influence on the disease i.e. greater heritability

40
Q

what did lecture block 3 study

A

-changes to gene expression through DNA methylation
-because not all disorders are caused by genetic variants but some are due to changes in gene expression

41
Q

where does DNA methylation occur

A

-it occurs in CpG islands

42
Q

how common is autism

A

the reported prevalence of autism has increased because there has been an increase in diagnosis recently due to
1. the use of a standardized assessment tool
2. the inclusion of milder forms of neurodevelopmental disorders
3. decreased stigma and increased awareness
4. the introduction of broadened diagnostic criteria

43
Q

what is the assessment tool generally used

A

the DSM-5 (American Psychiatric Association) criteria for ASD
-this tool looks at:
1. repetitive, restrictive behaviour
2. deficiency in social interactions and communication

44
Q

what does autism look like

A

because there are such large differences between individuals with autism, scientists developed endophenotypes to better class the type of autism an individual has

45
Q

what are the genetic causes for ASD

A
  1. 70% unknown
  2. 5% due to rare/de novo mutations
  3. 3% chromosomal abnormalities
  4. 15% Mendelian disorders
-so it was concluded that ASD is due to dysregulation of an interacting network of genes, not just one gene
-these genes are dysregulated through DNA methylation
46
Q

what experiment was done to show how gene expression differs in ASD

A

she took ASD patients and divided them into 4 endophenotypes for language impairment
-she also used a control group
-found that gene expression not only differed between the ASD groups and control group but also differed in 3 of the endophenotypes

47
Q

what are discordant twins

A

these are twins where one twin has autism but the other twin doesn't
-occurs 30% of the time

48
Q

what genotyping method is used to study gene expression differences in ASD

A

EWAS (epigenome-wide association study)
-EWAS looks for differential methylation at specific epigenetic loci between cases and controls
-so the loci on the chip are epigenetic loci
-EWAS can use a smaller sample size compared to GWAS

49
Q

what was the first EWAS study done (in ASD)

A

done on 50 pairs of twins discordant for their autism traits
-the EWAS studied 28000 epigenetic loci
-found that in each pair of twins, 37 genomic regions were differentially methylated (DM); most of these were unique to that twin pair
-the twins discordant in their social traits had DM-GABRB3
-the twins discordant in their communicative behaviour had DM-UBE3A

50
Q

why was it difficult to study ASD in sub saharan africa

A

because there was no ASD database setup yet

51
Q

what was the background to Colleen's study

A

-needed to generate an ASD database in SA
-she used boys only, aged 6-12 years old
-she took cheek swabs from the boys

52
Q

what was the method Colleen used to study ASD

A

-she first assessed the children using the ADOS-2 assessment tool to confirm ASD in the cases and the absence of ASD in the controls
-she took the DNA from the swabs -> bisulfite treatment -> beadchip array (EWAS) -> data analysis -> gene list -> GSEA
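
a sketch of what the bisulfite treatment step does and why it lets methylation be read out: unmethylated C is converted and ends up read as T, while methylated C is protected and stays C. The function names, the toy sequence and the methylated position are all hypothetical.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment: unmethylated C reads as T, methylated C stays C."""
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(sequence.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)

def methylation_calls(reference, converted):
    """For every C in the reference, report whether it survived conversion (methylated)."""
    return {
        i: converted[i] == "C"
        for i, base in enumerate(reference.upper())
        if base == "C"
    }

# toy sequence with two Cs; only the C at index 1 is methylated
reference = "ACGTCG"
treated = bisulfite_convert(reference, {1})
calls = methylation_calls(reference, treated)
```

comparing the treated read back to the reference gives a methylated/unmethylated call per C, which is the signal the beadchip array then measures across many loci.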

53
Q

results from Colleen's paper

A
  1. found 893 DM genes in the ASD cohort
  2. only 39 of these genes overlapped with data from international databases ie; the rest were novel findings
  3. the top DM gene was STOML2
54
Q

what was the GSEA that Colleen did on the 893 DM genes

A

she did a GSEA on these DM genes to find which pathways they are involved in
-they found all the genes were involved in enriched canonical pathways found in the mitochondria
-these pathways were metabolic pathways
-only one pathway was not metabolic; it was a protein ubiquitination pathway

55
Q

how did Colleen replicate her data

A
  1. bisulfite sequencing
    -this method chose 38 of the 893 genes and found 12 of them were DM
  2. DNA pyrosequencing
    -chose 2 genes, one being the central mitochondrial gene PCCB, which was found to be enriched in 3 metabolic pathways
    -they found there was a decrease in methylation of PCCB in ASD
56
Q

how did she find supporting evidence for this mitochondrial dysfunction

A

another study was published that showed this mitochondrial dysfunction is implicated in ASD by using brain tissue

57
Q

how did she find the metabolites involved in ASD

A

-if there is mitochondrial dysfunction then the products of metabolism must also be dysregulated
-they found 3 metabolites to be significantly elevated in ASD
- to study the metabolites she compared the metabolites of her ASD group, the controls and from a group of patients who have impaired mitochondrial respiration
-found that the group with impaired respiration and the ASD group had the same metabolites

58
Q

mtDNA copy number variation

A

found that in ASD, mtDNA copy number is significantly elevated
-this is indicative of mitochondrial dysfunction
-in line with this, the PGC-1a gene, which is responsible for mitochondrial biogenesis, was found to be hypermethylated in ASD
-also found DM-STOML2, FIS1 and OPA1