genomic vocab glossary Flashcards

1
Q

ab initio gene discovery

A
  • a method for identifying genes in a sequence when you have no prior information about the gene (no comparisons with other species and no transcript product) - most ab initio approaches use a hidden Markov model to search for sequence motifs that are commonly found in genes, such as long open reading frames, intron-exon boundary signatures, and conserved upstream regulatory motifs
2
Q

ab initio protein structure prediction

A
  • predicting the tertiary structure of a protein (the folding of its helices, sheets, and coils) from its amino acid sequence alone - alternative ways to obtain a structure include X-ray crystallography, NMR spectroscopy, and fitting a model by homology
3
Q

acrylamide

A
  • compound used to make gels for electrophoresis (separation of proteins or nucleic acids)
4
Q

affinity chromatography

A
  • method for purifying proteins and their complexes based on their affinity for some compound - this compound is crosslinked to a matrix in a column - proteins are eluted when a buffer disrupts the interaction between the proteins and the compound on the column
5
Q

alignment

A
  • lining up two or more DNA/protein sequences- maximizing # of identical nt/residues- minimizing # of mismatches and gaps
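Example (a minimal sketch, not a production aligner): the trade-off between matches, mismatches, and gaps can be scored with the Needleman-Wunsch dynamic-programming recurrence. The function name and the scoring values (match +1, mismatch -1, gap -2) are illustrative assumptions, not part of the card.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of sequences a and b."""
    # Row 0: aligning the empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))
```

Raising the gap penalty makes the aligner prefer mismatches over gaps; lowering it does the opposite.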
6
Q

alternative splicing

A
  • combination of different sets of exons to make two or more mature mRNA from the same primary transcript- observed in higher eukaryotes- single gene can create multiple protein isoforms
7
Q

annotation

A
  • linking information from literature to databases for genes/ proteins- in genome sequencing, annotation refers to the identification of likely genes using a combination of ab initio methods, homology searches and physical evidence
8
Q

antibody

A
  • secreted immunoglobulin molecule - recognizes a short stretch of up to about 10 aa (the epitope) - polyclonal Ab = group of different Abs recognizing different epitopes on the same protein - monoclonal Ab = recognizes a single epitope and is made by a hybridoma cell line
9
Q

association mapping

A
  • search for genes that affect disease susceptibility - done by testing alleles at DNA polymorphisms and seeing if they are present in affected individuals more or less commonly than expected by chance - linkage disequilibrium (LD) both complicates and helps the mapping process
10
Q

balancer chromosomes

A
  • chromosomes that have been engineered to contain multiple inversions that suppress crossing over- these are used to maintain recessive mutations in genetic stock- balancers usually have a recessive lethal marker and a dominant visible genetic marker
11
Q

base calling

A
  • process of calling series of nt from a sequence trace- usually automated, but manual work can resolve ambiguities
12
Q

bioconductor project

A
  • project using open-source software written in the R programming language- used for statistical analysis of genomic data
13
Q

case-control association mapping

A
  • screening genetic markers that are associated with disease status - based on comparing allele frequencies in a group of affected people and in a similar control group of unaffected ones
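Example (a sketch under stated assumptions): one simple way to compare allele frequencies between cases and controls is a chi-square test on the 2x2 table of allele counts. The function name and the counts below are hypothetical; real studies also correct for multiple testing and population structure.

```python
def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) for a 2x2 table of allele counts.

    Rows are cases and controls; columns are the two alleles.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    cross = case_a * ctrl_b - case_b * ctrl_a
    denom = ((case_a + case_b) * (ctrl_a + ctrl_b)
             * (case_a + ctrl_a) * (case_b + ctrl_b))
    return n * cross ** 2 / denom


# Hypothetical counts: allele a is more common in cases than in controls.
print(allelic_chi_square(60, 40, 40, 60))
```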
14
Q

cDNA

A
  • DNA that is complementary to mRNA - the first strand of cDNA is synthesized by reverse transcriptase - cDNA can also be ds
15
Q

cDNA clone

A
  • complementary DNA copy of a full length transcript
16
Q

cDNA library

A
  • collection of cDNA clones (usually isolated from a single tissue)
17
Q

cDNA microarray

A
  • an array of cDNA on a glass microscope slide or nitrocellulose filter - they are hybridized to labeled mRNA for profiling gene expression
18
Q

centiMorgan (cM)

A
  • standard unit of genetic map distance- it corresponds to a 1% probability of a crossover occurring between two sites in any meiosis
19
Q

chain termination sequencing

A
  • most commonly used method for sequencing DNA clones (up to 1 kb)- based on a method first devised by Fred Sanger- molecules of all possible lengths are made by random termination of DNA polymerization when a dideoxynucleotide (ddNTP) is incorporated
20
Q

chemometrics

A
  • series of analytical methods for quantifying chemical profiles - this includes principal component analysis (PCA) and artificial neural networks
21
Q

chemostat

A
  • apparatus used for long-term exponential growth of microbial cultures- fresh medium is introduced at the same time as liquid culture waste is removed
22
Q

chromatin immunoprecipitation microarrays (ChIP chips)

A
  • microarrays consisting of DNA that corresponds to potential regulatory regions of genes- this is used to detect sequences that bind to transcription factors
23
Q

chromosome painting

A
  • this procedure aligns the chromosomes of two different eukaryotic species - based on fluorescence in situ hybridization (FISH) - a set of chromosome-specific probes from one species is made using unique combos of fluorescent dyes - these probes paint the chromosomes in a mitotic chromosome spread from cells of the second species
24
Q

chromosome walking

A
  • this procedure clones a large contiguous portion of a chromosome- a probe at end of one clone is used to identify overlapping genomic clones in a library- procedure is repeated until region of interest is covered
25
Q

clusters of orthologous genes (COGs)

A
  • sets of genes from a collection of species- they are hypothesized to encode the same gene product- determined by pairwise best-match sequence similarity
26
Q

complementation group

A
  • a set of alleles that fail to complement (substitute for the function of) one another - this usually indicates that they are mutations in the same locus
27
Q

consensus sequence

A
  • a hypothetical sequence that has the most common nucleotide or amino acid at each position in a multiple alignment of DNA or protein sequences - can be a DNA or an amino acid sequence
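Example (a minimal sketch): a consensus can be read off column by column from sequences that are already aligned to equal length. The function name is hypothetical, and ties are broken by first occurrence in the column, which is an arbitrary choice made here for illustration.

```python
from collections import Counter


def consensus(aligned):
    """Most common symbol in each column of equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*aligned) iterates over alignment columns.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


print(consensus(["ACGT", "ACGA", "TCGT"]))
```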
28
Q

copy number variation (CNV)

A
  • polymorphism in the number of copies of a stretch of DNA- this includes deletions and duplications of whole genes
29
Q

contig

A
  • a contiguous stretch of cloned DNA - may refer to: 1) a scaffold of overlapping clones (physically mapped) 2) a long stretch of DNA sequence assembled by merging two or more sequences
30
Q

cosmid

A
  • large insert plasmids - usually exist as a single copy within host bacterial cells - contain cos sites that allow in vitro packaging of inserts as phage particles if desired
31
Q

CpG islands

A
  • stretches of vertebrate DNA- usually 1-2kb long- contain a 10x higher frequency of doublet nucleotides CG than entire genome- usually found near the 5’ end of genes
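Example (a sketch, assuming a commonly used statistic that the card itself does not mention): CpG enrichment in a window is often summarized as an observed/expected ratio, where the expected CG count is derived from the C and G counts. The function name is hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # no CG dinucleotide is possible
    return seq.count("CG") * len(seq) / (c * g)


print(cpg_obs_exp("CGCGCGCG"))
```

A ratio near 1 means CG occurs about as often as base composition predicts; bulk vertebrate DNA typically scores well below that.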
32
Q

Cre-Lox recombination system

A
  • a combo of site specific recombinase (Cre) and its recognition site (lox) from the bacteriophage P1- engineered into yeast, mouse, and other eukaryotic genomes to facilitate targeted recombination
33
Q

C-Value paradox

A
  • no apparent correlation between the number of genes and the amount of DNA in a genome- there is a range of DNA content even in closely related organisms- no relationship between complexity and DNA content
34
Q

cytological map

A
  • map of the location of genes or other DNA features relative to the banding patterns of the chromosomes of a species
35
Q

data normalization

A
  • process of removing systematic biases from microarray data- these biases cause misinterpretation of apparent differences in transcript abundance
36
Q

deficiency complementation mapping

A
  • method for fine-scale mapping of QTL based on the variable ability of wild-type (WT) alleles to complement a deletion (deficiency) that removes a gene or genes - i.e., how the phenotype differs between WT alleles when each is hemizygous over the deletion
37
Q

dideoxynucleotide

A
  • a nucleotide without OH at both 2’ and 3’ carbon of the sugar backbone- cannot covalently link to the next nucleotide in a growing DNA- used in chain termination sequencing
38
Q

DNA binding motif

A
  • short stretch of DNA (8-12 nt) that can be recognized by a DNA binding protein - motifs can be represented by a profile of the frequency of each nucleotide at each position - these motifs help identify sequences important for gene regulation
39
Q

DNA library

A
  • collection of clones where each clone contains a different segment of genomic DNA or cDNA - if the clones can be transcribed and translated, it’s an expression library
40
Q

ectopic expression

A
  • activation of expression of a gene in a cell or tissue where it is not usually expressed (abnormal expression) - can be caused artificially or by disease
41
Q

embryonic stem (ES) cells

A
  • this cell line can be transformed and manipulated in culture - then injected into the blastula (early embryo), where it integrates and grows to contribute to the development of the adult animal - injected embryos are chimeric - if ES cells populate the germ line, a transgenic organism is produced in the next generation
42
Q

enhancer

A
  • orientation- and distance-independent regulatory sequence - increases transcription levels and can occur anywhere in a genome - can act over 100 kb - one enhancer can affect the transcription of several genes
43
Q

enhancer trap

A
  • this transposable element is modified with a reporter gene - when inserted into the genome adjacent to a gene, the enhancer that drives expression of that gene also drives expression of a reporter gene
44
Q

Ensembl gene browser

A
  • repository and resource for genomic data in Europe - run by the Sanger Centre (Cambridge) and the European Bioinformatics Institute (within the European Molecular Biology Laboratory)
45
Q

epistasis

A
  • two definitions- (Quantitative genetics) an interaction between two or more loci that results in non additive effects of one allele as a function of the genotype at the other locus- (developmental/physiological genetics) describing a mutation whose phenotype is unaffected by another mutation
46
Q

epitope

A
  • portion of a protein, carb, or other molecule that is specifically recognized by an antibody
47
Q

E-value

A
  • expected number of sequences in a database that would by chance produce an equivalent or better alignment score than the one under consideration
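Example (a sketch under stated assumptions): for local alignment scores, the expected number of chance hits is usually modeled with the Karlin-Altschul form E = K * m * n * exp(-lambda * S), where m is the query length, n the database length, and S the raw score. K and lambda depend on the scoring system; the defaults below are placeholders for illustration, not values taken from any real search tool.

```python
import math


def e_value(score, query_len, db_len, K=0.1, lam=0.3):
    """Expected number of chance alignments scoring at least `score`.

    K and lam are illustrative placeholders; real values depend on the
    substitution matrix and gap penalties in use.
    """
    return K * query_len * db_len * math.exp(-lam * score)


# E drops exponentially as the score rises and grows with database size.
print(e_value(50, 300, 1_000_000))
print(e_value(100, 300, 1_000_000))
```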
48
Q

expressed sequence tag (EST)

A
  • sequenced piece of cDNA (subsequence)- full-length cDNA defines structure of transcript, but EST is a tag that indicates that the particular sequence is part of a transcribed gene- (online definition) EST is a tiny portion of an entire gene that can be used to help identify unknown genes and to map their positions within a genome
49
Q

expression library

A
  • a library of cDNA clones in a vector that allows the gene products to be expressed (transcribed and translated) in a controlled manner
50
Q

expression vector

A
  • a cloning vector that allows transcription and translation of a cDNA fragment that is inserted into the multiple cloning site
51
Q

expressivity

A
  • the severity of a disease OR- the degree to which a trait is observed in affected individuals- often affected by the environment
52
Q

F3 design

A
  • a genetic screen designed to isolate recessive mutations- requires that the phenotype be measured in F3 progeny of the mutagenized individual
53
Q

floxing

A
  • a method for inducing a mutation at a precise time and place in an organism- when a mouse has loxP binding sites on both sides of an exon of the gene to be mutated (placed by homologous recombination)- this is crossed to a strain containing Cre recombinase in the tissue of interest- exon is excised only in that tissue
54
Q

fold recognition

A
  • method for predicting the tertiary structure of a protein - predicted secondary structure and limited sequence similarity are compared against previously described domain folds to find the fold that most closely fits the unknown protein structure
55
Q

forward genetics

A
  • genetic analysis that starts with the phenotype and moves towards isolation of gene that causes the phenotype- phenotype => gene
56
Q

functional genomics

A
  • study of the function of each and every gene- (ie) biochemical activity, cell biology function, organismal function- this includes genetic analysis, microarrays, proteomics, and computational biology
57
Q

fusion protein

A
  • hybrid protein made by fusion of two genes in an expression vector - N terminal = tag (polyhistidine or glutathione S-transferase) - C terminal = protein of interest
58
Q

GAL4

A
  • a potent transcription factor from yeast that enhances gene expression only through an upstream activating sequence (UAS) adjacent to the promoter - without a UAS sequence, GAL4 has no effect on transcription in heterologous genomes - the GAL4-UAS system is therefore used to drive expression specifically of transgenes introduced into that genome
59
Q

gene knock-in

A
  • replacement of the endogenous gene with a different functional piece of DNA- inserted gene is expressed in place of the original gene- germline gene therapy uses gene knock-in to replace a defective gene with an active copy- the replacements are performed using positive-negative double selection strategy in ES cells
60
Q

gene knock outs

A
  • a mutation that targets a specific gene, made by using homologous recombination to replace an exon of the target gene with a piece of foreign DNA (e.g., a lacZ reporter gene) - insertional mutations can also cause gene knock outs
61
Q

genetic fingerprinting

A
  • strategy for testing subtle effects of mutations on the fitness of microbial strains in competition with other strains during long-term culture
62
Q

genetic heterogeneity

A
  • the observation that the same disease or phenotype can have multiple different genetic causes- allelic heterogeneity = if different variants are within a single locus
63
Q

genetic map

A
  • in cM (centiMorgans)- map of the order of and distance between genes based on recombination frequency between markers- markers can be physical (molecular variants) or visible (mendelian loci)- mapping populations may be pedigrees, crosses between lines, or radiation hybrid cell panels
64
Q

genome-wide association study (GWAS)

A
  • a study designed to scan the entire genome for SNPs and CNVs that are associated with a disease or trait - at least 500k different genetic variants are measured in several thousand disease cases and a similar number of healthy controls
65
Q

germ line

A
  • the population of cells in eukaryotes that are destined to undergo meiosis to become oocytes or sperm- the germ line is set aside very early in animal development - in plants, the germ line is specified at the time of flowering
66
Q

haplotype

A
  • multi-site genotype of two or more polymorphisms on the same chromosome - (ie) an individual who is homozygous at one site for the G allele and heterozygous at a nearby site for A and T would have GA and GT haplotypes
67
Q

Hardy-Weinberg equilibrium

A
  • expectation that genotype frequencies in a population will tend to be stable and predictable as a simple function of individual allele frequencies- equilibrium is broken by evolutionary forces (such as migration, inbreeding, mutation or selection)- these forces can lead to increase or decrease in the number of heterozygotes
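Example (a minimal sketch): for a locus with two alleles at frequencies p and q = 1 - p, the "simple function" is p^2, 2pq, and q^2 for the three genotypes. The function names and genotype counts below are hypothetical; the chi-square statistic measures departure from the expectation, for instance an excess or deficit of heterozygotes.

```python
def hardy_weinberg_expected(n_AA, n_Aa, n_aa):
    """Expected genotype counts under Hardy-Weinberg, from observed counts."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    return {"AA": p * p * n, "Aa": 2 * p * q * n, "aa": q * q * n}


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed counts to the expectation."""
    exp = hardy_weinberg_expected(n_AA, n_Aa, n_aa)
    obs = {"AA": n_AA, "Aa": n_Aa, "aa": n_aa}
    return sum((obs[k] - exp[k]) ** 2 / exp[k] for k in obs if exp[k] > 0)


print(hardy_weinberg_expected(25, 50, 25))
print(hwe_chi_square(50, 0, 50))  # no heterozygotes at all
```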
68
Q

heavy isotope labeling

A
  • method for quantifying protein expression between two samples - proteins from one sample are labeled with a heavy isotope (deuterium), so each labeled peptide moves more slowly through the TOF spectrometer than the corresponding unlabeled fragment - isotope-coded affinity tag (ICAT) reagents are used for uniform labeling of protein mixes after cell extraction
69
Q

heteroduplex DNA

A
  • dsDNA whose two strands differ at a polymorphic site (contains a mismatch) - formed by renaturing PCR products from two different alleles
70
Q

heuristic search

A
  • algorithms that use time-saving methods to search for the most likely solution - they reduce the search space by excluding unlikely solutions from the analysis - not guaranteed to find the optimal solution, but often the only way to perform phylogenetic analysis or sequence alignment involving a large number of sequences
71
Q

hierarchical sequencing

A
  • whole genome sequencing approach based on the principle that the genome is first divided into an ordered set of clones - the genome is cloned into artificial chromosomes, then cosmid or BAC clones of several hundred kbp, then plasmids of up to 10 kbp - BAC clones are now more commonly sequenced directly using the shotgun strategy
72
Q

hidden Markov model (HMM)

A
  • class of bioinformatic procedures for identifying sequence features - they look for local amino acid or nucleotide usage patterns that are distinct from random sequence
73
Q

HKA test

A
  • statistical test of neutrality - based on the expectation that divergence between species and polymorphism within species are highly correlated in the absence of selection
74
Q

homogeneous assay

A
  • SNP genotyping assay in which all of the steps are performed in a single sample- no transferring of products = save labour and materials
75
Q

homolog

A

  • two definitions - a) biological features (from molecules to traits) that show a similar structure because they derive from a common ancestor - homology = identity by descent - similarity of structure may or may not reflect homology - b) used loosely to imply that two DNA or aa sequences are similar

76
Q

horizontal gene transfer

A
  • transfer of a gene to another individual by means other than vertical (parent-to-offspring) inheritance - used in the context of microbial genomes that can be shown to have incorporated genes from a different species
77
Q

hybridization

A
  • annealing ssDNA to complementary strand to form double helix- usually one strand is labeled to detect transcript/clones
78
Q

immunohistochemistry

A
  • detecting proteins in tissue sample based on antibodies that recognize the protein
79
Q

inbreeding

A
  • process of mating siblings or close relatives repeatedly - loss of genetic variability in the line - (NIL) near isogenic lines = almost homozygous throughout the genome (using 10+ inbreeding generations) - NIL are used in quantitative genetic analysis
80
Q

indel

A
  • insertion or deletion polymorphisms - range from a few bp to several kb - large indels often involve transposable elements
81
Q

in situ hybridization

A
  • detection of a specific mRNA in a tissue sample - done by hybridization of tissue to a DNA or RNA probe that is complementary to the mRNA - this probe is labelled with a fluorescent or radioactive group or with a small compound such as biotin or digoxigenin that can be recognized by an antibody
82
Q

insertional mutagenesis

A
  • creating mutations by controlled insertion of a transposable element (TE) in or near the gene of interest
83
Q

interference

A
  • observation that recombination is suppressed by nearby recombination events - as a result, the genetic map distance between two distant markers does not necessarily equal the sum of the distances between pairs of adjacent markers
84
Q

interval mapping

A
  • method for QTL mapping that uses the genotypes of two adjacent genetic markers to estimate the likely genotype at each point in the interval between the markers
85
Q

introgression

A
  • introduction of a small portion of one genome into another genome, by repeated backcrossing with selection for the region of interest- interspecific hybrid and parent species cross
86
Q

linkage disequilibrium mapping

A
  • an approach to identifying genes that correspond to a QTL - based on detection of LD between a marker, the trait or disease, and the causal polymorphism in an outbred population
87
Q

isogenic

A
  • homozygous throughout the entire genome
88
Q

laboratory information management system (LIMS)

A
  • automated tracking system with barcodes, robotics, and software checkpoints - this is to ensure accurate tracking of samples and data in high throughput genomics labs
89
Q

linkage disequilibrium

A
  • the nonrandom association of alleles at genetic markers - LD decays over time as a result of recombination, and decays faster as the physical distance between the markers increases - LD can be caused by many factors, including founder effect, admixture, and epistatic selection
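Example (a sketch with hypothetical names and numbers): for two biallelic markers, LD is commonly quantified as D = p_AB - p_A * p_B (haplotype frequency minus its expectation under independence) and as r^2, which rescales D^2 to the range 0 to 1. With recombination fraction c per generation, D is expected to shrink by a factor (1 - c) each generation under random mating.

```python
def ld_stats(p_AB, p_A, p_B):
    """Return (D, r_squared) for haplotype AB and alleles A and B."""
    D = p_AB - p_A * p_B
    r2 = D * D / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return D, r2


def ld_after(D0, c, generations):
    """Expected D after some generations with recombination fraction c."""
    return D0 * (1 - c) ** generations


print(ld_stats(0.5, 0.5, 0.5))   # A always travels with B
print(ld_after(0.25, 0.01, 100))  # tightly linked markers keep LD longer
```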
90
Q

LOD score

A
  • logarithm of the odds score - measures statistical significance in association studies and linkage mapping - logarithm of (the ratio of the probability of observing the data given an association to the probability under the null hypothesis) - (online) LOD scores compare the chance of obtaining the test data if two loci are indeed linked to the likelihood of observing the same data purely by chance
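Example (a sketch for the simplest linkage case, assuming phase-known meioses): with k recombinants among n meioses, the likelihood at recombination fraction theta is theta^k * (1 - theta)^(n - k), and the null hypothesis of no linkage is theta = 0.5. The function name and the counts are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """log10 likelihood ratio of linkage at theta versus no linkage (0.5)."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# One recombinant in ten meioses, evaluated at theta = 0.1.
print(round(lod_score(1, 10, 0.1), 3))
```

By convention a LOD of 3 or more (odds of 1000:1) is taken as evidence for linkage.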
91
Q

MALDI-TOF

A
  • matrix assisted laser desorption ionization time of flight spectrometry- ionize/separate peptides/DNA sequences- then identify corresponding sequence
92
Q

mapping function

A
  • mathematical function that converts recombination frequencies to genetic map distances - this is done by accounting for the incidence of double crossovers between markers - common functions include Haldane and Kosambi; Kosambi also adjusts the data for interference
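Example (a minimal sketch; function names are hypothetical): the standard forms of the two mapping functions, with r the recombination frequency (0 <= r < 0.5) and the result in centiMorgans.

```python
import math


def haldane_cM(r):
    """Haldane map distance in cM; assumes no interference."""
    return -50.0 * math.log(1 - 2 * r)


def kosambi_cM(r):
    """Kosambi map distance in cM; allows for crossover interference."""
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


for r in (0.01, 0.1, 0.3):
    print(r, round(haldane_cM(r), 2), round(kosambi_cM(r), 2))
```

For small r both functions give roughly 100 * r cM; they diverge as r grows, with Haldane giving the larger distance because it assumes more undetected double crossovers.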
93
Q

marker assisted selection

A
  • approach to improve animal and plant breeding - based on selecting, in each generation, for DNA markers that are associated with some desired trait - rather than selecting for the trait itself
94
Q

mass spectrometry (MS)

A
  • technique for identifying molecules by comparing the spectrum of molecules separated by mass/charge ratio with a theoretical standard - in genomics/proteomics, separation is based on the time of flight (TOF) of an ionized fragment through a vacuum, which is sensitive to mass differences
95
Q

mate pair sequences

A
  • pair of sequences from the two ends of a single clone - essential in shotgun sequencing - the known distance between the pair helps resolve repetitive DNA sequences and verify the sequence assembly
96
Q

McDonald Kreitman statistic

A
  • test for selection on protein sequences- based on comparison of levels of synonymous and replacement polymorphism and divergence- (online) looks for ancient selection over long periods as opposed to the steady accumulation of mutations that confer no selective advantage predicted by the neutral theory
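Example (a sketch with hypothetical counts): the comparison is usually laid out as a 2x2 table of replacement (n) and synonymous (s) changes, split into polymorphisms within species (P) and fixed differences between species (D). One common summary, not mentioned on the card, is the neutrality index NI = (Pn / Ps) / (Dn / Ds); under neutrality NI is expected to be close to 1.

```python
def neutrality_index(Pn, Ps, Dn, Ds):
    """Ratio of replacement/synonymous polymorphism to divergence."""
    return (Pn / Ps) / (Dn / Ds)


# Hypothetical counts. NI > 1 suggests an excess of replacement
# polymorphism; NI < 1 suggests an excess of replacement divergence.
print(neutrality_index(10, 20, 5, 20))
```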
97
Q

metabolome

A
  • set of metabolites present in a cell or tissue- molecules that mediate physiological properties of organisms include lipids, carbs, steroids, amino acids
98
Q

metabolomics

A
  • high throughput methods for metabolome characterization
99
Q

metabolic control analysis (MCA)

A
  • an approach to modeling of metabolism- based on biophysical and biochemical principles- seek to understand and predict the effects of genetic/environmental disturbance
100
Q

microsatellite

A
  • a stretch of repetitive DNA made up of a variable number of tandem repeats of a small number of nucleotides (up to 100+ repeats) - most common is (AG)n or (CAG)n - microsatellites are highly polymorphic and heterozygous - can occur at high density (100 kb) in higher eukaryotic genomes
101
Q

minimal genome

A
  • the smallest number of genes required to sustain life
102
Q

modifier

A
  • a polymorphism or mutation that modifies (enhances or suppresses) a phenotype associated with a different mutation
103
Q

monoclonal antibodies (MAbs)

A
  • antibodies made from a single immunoglobulin gene - they recognize a single epitope on a protein - MAbs are made by immortal hybrid cell lines (hybridomas) that secrete them
104
Q

motif

A
  • short conserved sequence of nucleotide or amino acid- suggests conservation of function
105
Q

multiple cloning site (mcs) AKA polylinker

A
  • plasmid site for foreign DNA insertions at unique RE sites
106
Q

multiplex PCR

A
  • simultaneous amplification of multiple different DNA fragments- done by using several pairs of specific primers
107
Q

National Center for Biotechnology Information (NCBI)

A
  • source of genomic data in USA- part of National Library of Medicine (NLM) within the National Institutes of Health (NIH)
108
Q

neutral theory

A
  • null hypothesis explaining the distribution of molecular variation in natural populations in the absence of natural selection - factors affecting neutral evolution: 1) mutation pressure 2) migration rate 3) population size 4) breeding structure 5) recombination rate
109
Q

next generation DNA sequencing

A
  • emerging technologies that are replacing traditional dideoxy-based methods - high speed and low cost - (ie) pyrosequencing, reversible termination technology, ligation sequencing
110
Q

Northern blotting

A
  • method for characterizing gene expression: 1) mRNA is separated by gel electrophoresis 2) transferred to nitrocellulose or nylon 3) probed with chemically or radioactively labelled DNA corresponding to the gene of interest
111
Q

nucleotide diversity

A
  • average proportion of nucleotide differences between all pairs of sequences in a sample - a measure of polymorphism that is a function of the number and frequency of variable alleles
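Example (a minimal sketch; the function name and the toy sequences are hypothetical): the definition translates directly into a loop over all pairs of aligned, equal-length sequences.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average per-site proportion of differences over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return diffs / (len(pairs) * length)


print(nucleotide_diversity(["AAAA", "AAAT", "AATT"]))
```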
112
Q

Online Mendelian Inheritance in Man (OMIM)

A
  • website by NCBI and Johns Hopkins University - documents genetic info about human disease
113
Q

open reading frame (ORF)

A
  • a reading frame is a stretch of genomic DNA that encodes at least 20 codons without a stop codon- there are six possible reading frames on any stretch of DNA ( three in each orientation)- a reading frame is open if it supports translation of a peptide sequence and can be assembled from exons (after splicing out introns)
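Example (a sketch under stated assumptions): scanning all six frames for stop-free runs of at least 20 codons, following the card's threshold. It uses the standard stop codons TAA, TAG, and TGA, does not require a start codon, and ignores splicing, so it only illustrates the six-frame idea. All names are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def stop_free_runs(seq, min_codons=20):
    """Yield (strand, frame, codon_count) for each long stop-free run."""
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run = 0
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOPS:
                    if run >= min_codons:
                        yield strand, frame, run
                    run = 0  # a stop codon closes the current run
                else:
                    run += 1
            if run >= min_codons:  # run that reaches the end of the sequence
                yield strand, frame, run


toy = "ATG" + "GCC" * 5 + "TAA"
print(list(stop_free_runs(toy, min_codons=3)))
```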
114
Q

orphan gene

A
  • predicted gene with no sequence similarity to any other gene in the database (cannot be assigned to a family)
115
Q

orthologs

A
  • two genes in separate species that derive from a common ancestor without duplication
116
Q

paralogs

A
  • two genes that arose by duplication of an ancestral gene
117
Q

penetrance

A
  • the frequency of individuals with an allele who show the phenotypic trait
118
Q

PERL

A
  • language to perform bioinformatic procedures such as extracting DNA sequences from a database
119
Q

pharmacogenomics

A
  • the study of the genomic basis of response to drugs, toxins, and other pharmacological agents - (online) how genetic makeup affects an individual’s response to drugs
120
Q

phenocopy

A
  • an environmentally induced phenotype that mimics a known mutation
121
Q

phylogenetic analysis

A
  • (comparative genomics) approach to annotating gene function - based on the assumption that evolutionary history is a more reliable indicator of likely function than sequence similarity alone
122
Q

phylogenetic footprinting

A
  • aligning DNA sequences from various divergent species- purpose is to detect evolutionary conserved elements that may encode genes or other important DNA sequences
123
Q

phylogenetic shadowing

A
  • aligning DNA sequences from several closely related species- purpose is to detect highly conserved DNA elements that may encode regulatory elements
124
Q

physical map

A
  • map of genome with ordered set of large insert clones (shown in kbp)- shows distances between molecular features (RE sites, sequence tagged sites)
125
Q

polony

A
  • PCR colonies- made as cell free clones of single DNA molecules that accumulate as a concentrated spot of amplified DNA within an acrylamide matrix on a glass microscope slide
126
Q

population stratification

A
  • differences in allele or haplotype frequencies between populations - the identity of the populations is not always obvious, as the population structure may be hidden - historically, populations tended to correspond to geographic location or phenotypic attributes
127
Q

positional cloning

A
  • cloning of a gene that is responsible for a disease or trait on the basis of its position in the genome- usually using recombination mapping
128
Q

position effect

A
  • a phenomenon where the site of insertion has a large effect on the level of expression of the transgene- seen in transgenic animals and plants
129
Q

profile

A
  • a list of the frequencies of each amino acid in each position in a multiple alignment of protein sequences
130
Q

promoter

A
  • region 5’ to the start site of transcription of a gene that serves as a binding site for the RNAP initiation complex- usually includes regulatory sequences
131
Q

protein domain

A
  • a structurally distinct region of a protein that performs a particular subset of the function of the whole protein- usually less than 150 aa residues in length- (ie) DNA binding domain, kinase domains, extracellular domains
132
Q

protein interaction map

A
  • description of the network of interactions among proteins- (e.g.) physical associations (detected using Y2H and protein microarrays)- (e.g.) protein interactions (determined by biochemical and genetic analysis)
133
Q

proteome

A
  • profile of the proteins present in a cell or tissue under a specific set of conditions- may describe relative or absolute abundance
134
Q

pseudogene

A
  • DNA sequence with structural features of true genes, but isn’t active- many arise by reverse transcription of an mRNA and so lack introns- a pseudogene may never have been active or may be decaying in the absence of any selection pressure to maintain function
135
Q

psychogenetics

A
  • study of behaviour and psychosis using genomic approaches
136
Q

pyrolysis

A
  • thermal degradation of materials into volatile fragments- used together with spectrometric methods (e.g. pyrolysis mass spectrometry) in profiling the metabolome
137
Q

QTL mapping

A
  • finding the location of QTL- uses statistical procedures for identifying nonrandom associations between genetic markers and trait values- conceptually similar to recombination mapping of several loci at the same time
138
Q

quantitative trait locus (QTL)

A
  • region of genome with a quantitative effect on a trait- responsible for a portion of the genetic variance- QTL may affect continuous traits, or liability to discrete traits including diseases
139
Q

quantitative trait nucleotides (QTN)

A
  • SNPs or CNVs that contribute to the effect of a QTL
140
Q

radiation hybrid mapping

A
  • method for making genetic maps of vertebrates or plants- fragments of the genome of one species are propagated in hybrid cell lines with another species- co-segregation of two sequences in multiple lines indicates that the two sequences are physically linked
141
Q

random mutagenesis

A
  • making a collection of new mutations at random- mutants are recovered by screening for aberrant phenotypes or for TE insertions
142
Q

recombinant inbred line (RIL)

A
  • line derived from two genetically distinct parents that has been bred to be nearly isogenic (homozygous throughout the genome after generations of inbreeding)- each member of a panel of recombinant inbred lines contains a different combination of fragments from the parents- RILs are useful for mapping QTL
143
Q

recombination mapping

A
  • finding the location of a gene causing a particular phenotype, disease or quantitative trait- based on the co-segregation of linked genetic markers with the trait in a pedigree
144
Q

redundant gene

A
  • a gene whose function can be supplied by another gene or genes if it is mutated- this redundancy can be due to: 1) multiple copies of the gene in a genome 2) the protein activity being supplied by a different type of gene 3) the function being carried out through another genetic pathway
145
Q

reverse genetics

A
  • genetic analysis- starts with a gene -> find phenotype it generates
146
Q

restriction fragment length polymorphism (RFLP)

A
  • polymorphism that is detected as a difference in the length of the fragments formed when a piece of DNA is RE digested- detects genetic variation with enzymatic digestion
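A toy sketch of the idea, assuming linear DNA and a single enzyme. EcoRI recognizes GAATTC and cuts after the G; the function name `digest` and both allele sequences are invented for illustration. A single base change that destroys one site changes the fragment lengths.

```python
def digest(sequence, site, cut_offset):
    """Cut `sequence` at every occurrence of `site`; return fragment lengths."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]

ECORI = "GAATTC"   # EcoRI cuts G^AATTC, so the cut offset within the site is 1

# Hypothetical alleles: allele_b carries a substitution in the second site
allele_a = "ATATGAATTCGGGGCCCCGAATTCTTTT"
allele_b = "ATATGAATTCGGGGCCCCGAGTTCTTTT"

fragments_a = digest(allele_a, ECORI, 1)   # two sites, three fragments
fragments_b = digest(allele_b, ECORI, 1)   # one site, two fragments
```

On a gel, the two alleles would be distinguished by the different fragment sizes.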
147
Q

reverse transcriptase

A
  • the enzyme that copies RNA -> ssDNA- usually encoded in the genome of retroviruses (a class of RNA viruses)
148
Q

Rosetta Stone Approach

A
  • bioinformatic technique to make protein interaction maps- based on the idea that two genes are likely to encode interacting proteins if they exist as a single fused gene in another species
149
Q

saturation random mutagenesis program

A
  • forward genetic screen of a sufficiently large number of mutagenized chromosomes- this is to guarantee that all genes affecting the trait of interest will be hit by at least one mutation
150
Q

sequence-contig scaffold

A
  • alignment of sequenced contigs against the physical and cytological map- this step in genome sequencing comes before the finishing stage (in which scaffold gaps are filled in)
151
Q

sequence tagged sites

A
  • sequenced DNA fragment taken from library of clones that is then placed in a physical map of the genome
152
Q

serial analysis of gene expression (SAGE)

A
  • method for profiling gene expression (look at mRNA population)- based on sequencing lots of unique tags corresponding to each gene in the genome- tag = small sequence of nt that can identify the original transcript- linking tags together allows rapid sequencing analysis of multiple transcripts
153
Q

shotgun sequencing

A
  • determining a DNA sequence by randomly breaking it into a redundant set of small clones- these are then sequenced so that each region is represented 5-10 times- contigs are assembled by computer alignment for optimal sequence overlap
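The 5-10 times redundancy is usually described as coverage: total sequenced bases divided by genome size. A rough sketch follows, assuming reads of uniform length placed at random; the genome size and read length are made-up values, and the uncovered fraction uses the standard Poisson approximation e^(-coverage).

```python
import math

def reads_needed(genome_size, read_length, coverage):
    """Number of reads required to reach the requested average coverage."""
    return math.ceil(coverage * genome_size / read_length)

def expected_uncovered_fraction(coverage):
    """Poisson approximation: chance that a given base is hit by no read."""
    return math.exp(-coverage)

# Hypothetical project: 1 Mbp genome, 500 bp reads, 8x coverage
n_reads = reads_needed(1_000_000, 500, 8)
missed = expected_uncovered_fraction(8)
```

Under these assumptions, even at 8x coverage a small fraction of bases is expected to be missed, which is one reason a finishing stage is needed.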
154
Q

whole genome shotgun sequencing

A
  • sequencing an entire genome without first dividing it into large clones as in hierarchical sequencing
155
Q

single base extension (SBE)

A
  • SNP detection method- based on minisequencing reactions that only detect the identity of the base adjacent to the sequencing primer
156
Q

single nucleotide polymorphism (SNP)

A
  • genome site where one nucleotide is found to have two or more states in a collection of individuals of the same species- most SNPs are substitutions involving two nucleotides (e.g. A and G)- the term SNP can also apply to single nucleotide indels
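A small sketch of spotting candidate SNP sites, assuming the sequences from different individuals are already aligned and of equal length. The function name `find_snps` and the sequences are invented, and real pipelines would also filter by allele frequency and sequencing error.

```python
def find_snps(aligned):
    """Return {position: sorted alleles} for columns with more than one state."""
    snps = {}
    for pos, column in enumerate(zip(*aligned)):
        alleles = set(column)
        if len(alleles) > 1:
            snps[pos] = sorted(alleles)
    return snps

# Hypothetical aligned sequences from four individuals of one species
individuals = ["ACGTA", "ACATA", "ACGTA", "TCGTA"]
snps = find_snps(individuals)
```

Positions are 0-based, so the result reports variable sites at the first and third bases.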
157
Q

single sperm typing

A
  • purpose is to measure recombination rates- determining genotypes of multiple markers from a single sperm
158
Q

site directed mutagenesis

A
  • using recombinant DNA technology to alter a cloned sequence so that the protein is altered when expressed in a transgenic organism- allows testing the roles of individual residues in protein function
159
Q

southern blotting

A
  • method for DNA sequence detection using RE digestion and then electrophoresis- fragments are transferred to a nitrocellulose or nylon membrane and hybridized with a labelled probe- SB is used to: 1) detect differences in the genomic DNA encompassing a gene 2) see if a gene is present in the genome of another species
160
Q

synteny

A
  • conservation of gene order between divergent species
161
Q

synthetic lethal

A
  • two mutations that are each homozygous viable on their own- but when combined, the double mutant is inviable- this lethality suggests the two genes function in a similar process
162
Q

systems biology

A
  • an integrative approach to genome biology that involves theoretical modeling of large genomic, transcriptomic, proteomic, and metabolomic datasets to guide the generation and testing of hypotheses
163
Q

tagging SNPs

A
  • SNPs selected to represent haplotypes- chosen so as to capture the majority of DNA sequence variation in a population
164
Q

threading

A
  • method to predict protein structure based on a combination of: 1) secondary structure similarity 2) assessing likely binding energies of potential folds
165
Q

tiling path

A
  • an aligned set of large insert clones that covers the contig with minimal redundancy
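One way to picture choosing a tiling path is as an interval-cover problem. The sketch below uses a greedy strategy (always extend with the clone that reaches furthest) and assumes clone positions on the contig are already known as half-open (start, end) coordinates. The function name `tiling_path` and the clone names and coordinates are hypothetical.

```python
def tiling_path(clones, contig_start, contig_end):
    """Pick a minimal set of clones covering [contig_start, contig_end).

    clones: list of (name, start, end) tuples with half-open coordinates.
    """
    path = []
    covered_to = contig_start
    ordered = sorted(clones, key=lambda c: c[1])
    i = 0
    while covered_to < contig_end:
        best = None
        # Among clones starting inside the covered region, keep the one
        # that extends furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best[0])
        covered_to = best[2]
    return path

# Hypothetical clones on a 320 kbp contig (coordinates in kbp)
clones = [("A", 0, 100), ("B", 50, 180), ("C", 90, 200),
          ("D", 150, 300), ("E", 190, 320)]
path = tiling_path(clones, 0, 320)
```

Clones B and D are skipped because C and E overlap their neighbours and reach further along the contig.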
166
Q

transcriptome

A
  • complete set of transcripts expressed in a cell or tissue under defined conditions (together with the abundance of each class of transcript)
167
Q

transient expression

A
  • activated gene expression for a limited amount of time- examples: 1) plasmids injected into embryos 2) infection of tissue with a replication defective virus
168
Q

transmission disequilibrium testing

A
  • testing for association based on unequal ratios of allele frequencies in the affected children of heterozygous parents
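The usual TDT statistic is a McNemar-type chi-square with one degree of freedom: (b - c)^2 / (b + c), where b and c count how often heterozygous parents transmitted or did not transmit the allele of interest to affected children. A minimal sketch with invented counts follows; the function name `tdt` is illustrative, and the p-value uses the identity that the 1-df chi-square tail equals erfc(sqrt(x / 2)).

```python
import math

def tdt(transmitted, untransmitted):
    """Return (chi-square statistic, p-value) for the TDT with 1 d.f."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / total
    # Survival function of chi-square with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical data: allele transmitted 60 times, not transmitted 40 times
chi2, p_value = tdt(60, 40)
```

Under no association, transmission should be close to 50:50, so a large statistic points to association with the trait.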
169
Q

transposable elements

A
  • mobile pieces of DNA that jump from one location in the genome to another- they usually use an enzyme encoded by the natural TE
170
Q

transposon mutagenesis

A
  • mobilizing of modified TE to create tagged mutations - mutation now contains inserted DNA fragments
171
Q

UAS

A
  • upstream activating sequence: binding site for the GAL4 transcription factor- naturally found only upstream of yeast genes- UAS-GAL4 combo is used to drive transgene expression in plants and animals- (old notes) GAL4 TF = BD and AD (binding domain and activation domain)- in Y2H, BD is fused to the bait and AD to a library of random cDNA- when bait and prey interact, BD and AD come together to reconstitute GAL4 TF activity, which induces reporter gene expression
172
Q

unigene set

A
  • a set of unique cDNA clones- derived from a cDNA library by filtering out duplicate copies of the same transcript
173
Q

unitig

A
  • unique contig- a set of overlapping DNA sequences that correspond to a single piece of genomic DNA that is represented multiple times in a shotgun sequence library
174
Q

western blotting

A
  • technique for detecting protein expression- proteins are separated on an acrylamide gel, transferred to a membrane, and probed with labelled antibodies
175
Q

yeast two hybrid screens Y2H

A
  • method for detecting protein-protein interactions- based on the reconstitution of transcription factor activity when: 1) one protein domain fused to a DNA binding domain (TF part BD, the bait) interacts with 2) another protein domain fused to an activation domain (TF part AD, the prey, from a cDNA library)- developed in and most commonly performed in yeast cells