L7 - Big Data Flashcards
What is big data?
-These are datasets too large or complex to process using traditional data processing methods
What is big data used to measure? (6)
-Big data is gathered from large populations of DNA, RNA, protein molecules as well as cells, tissues and organisms
What is used to analyse the following:
Short read sequence -
Long read sequence -
Proteomics/metabolomics -
Epigenomics -
Short read sequence - Illumina sequencing
Long read sequence - PacBio
Proteomics/metabolomics - Mass spectrometry
Epigenomics - ChIP-Seq
What is genomics used for?
What is transcriptomics used for?
-Genomics - analysing DNA
-Transcriptomics - analysing RNA
What is Fold-change?
What is Significance?
Fold-change - how much gene expression is increased or decreased by the treatment
Significance - the statistical significance of the difference in gene expression
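Worked example (a minimal sketch, not a standard pipeline): the expression values below are made up, and the exact permutation test is just one simple way to get a p-value, chosen here because it needs no extra libraries. The function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def fold_change(control, treated):
    """Ratio of mean treated expression to mean control expression."""
    return mean(treated) / mean(control)


def permutation_p_value(control, treated):
    """Exact two-sided permutation test on the difference in group means.

    Every way of relabelling the pooled samples is tried; the p-value is the
    fraction of relabellings whose difference is at least as large as the
    observed one.
    """
    pooled = list(control) + list(treated)
    observed = abs(mean(treated) - mean(control))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(treated)):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Made-up expression values for one gene (assumption, for illustration only).
control = [10.0, 12.0, 11.0, 9.0]
treated = [40.0, 44.0, 42.0, 42.0]

fc = fold_change(control, treated)            # 4.0 (expression is 4x higher)
log2_fc = math.log2(fc)                       # 2.0 (often reported as log2)
p = permutation_p_value(control, treated)     # 2 of 70 relabellings, about 0.029

print(f"fold-change={fc}, log2 fold-change={log2_fc}, p={p:.3f}")
```

A large fold-change with a small p-value suggests the treatment really changed expression of that gene; a large fold-change with a large p-value may just be noise.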
What happens in single-cell RNA-Seq? (5)
-Collects the transcriptome of many individual cells
-Instead of measuring the pooled mRNA of the whole sample, measures the mRNA of each individual cell in the sample
-Do this by breaking down the tissue into a single cell suspension that contains different cell types from the sample
-Link the sequencing data to particular cells and find out which mRNA belongs to which cell
-Use a UMAP plot to present the data where each dot is a cell
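Worked example (a rough sketch of the "link the data to particular cells" step): it assumes each sequencing read carries a short cell barcode identifying its cell of origin, as in droplet-based protocols. The barcodes, reads and function name are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical reads: (cell barcode, gene the read mapped to).
reads = [
    ("AAAC", "CD3E"),
    ("AAAC", "CD3E"),
    ("AAAC", "ACTB"),
    ("TTTG", "CD19"),
    ("TTTG", "ACTB"),
]


def count_by_cell(reads):
    """Group reads by barcode to get a per-cell gene count table."""
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        counts[barcode][gene] += 1
    return counts


counts = count_by_cell(reads)
for barcode, genes in counts.items():
    print(barcode, dict(genes))
```

Each barcode ends up with its own expression profile, and those per-cell profiles are what a UMAP plot reduces to one dot per cell.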
In a Genome-wide association study (GWAS) what is identified?
-Single nucleotide polymorphisms (SNPs); high-scoring SNPs may be associated with the disease and may play causative roles
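Worked example (a simplified sketch, not a full GWAS method): it assumes a SNP's "score" is -log10 of the p-value from a chi-square test comparing allele counts in cases and controls. The allele counts, SNP names and function names are made up.

```python
import math


def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 allele count table."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def snp_score(p_value):
    """Higher score means stronger evidence of association (assumed scoring)."""
    return -math.log10(p_value)


# Hypothetical allele counts: (case_alt, case_ref, control_alt, control_ref).
snps = {
    "snp_A": (60, 40, 40, 60),   # alternative allele enriched in cases
    "snp_B": (50, 50, 50, 50),   # no difference between cases and controls
}

scores = {}
for name, counts in snps.items():
    chi2, p_value = allelic_chi_square(*counts)
    scores[name] = snp_score(p_value)
    print(f"{name}: chi2={chi2:.2f}, p={p_value:.4f}, score={scores[name]:.2f}")
```

Here snp_A scores higher than snp_B, so it would be the one flagged as possibly associated with the disease. Association alone does not prove a causative role.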