Genomics Flashcards
DNA
DNA, or deoxyribonucleic acid, is a complex molecule that contains the genetic instructions necessary for the development, functioning, growth, and reproduction of all known living organisms and many viruses. It is often referred to as the “molecule of life” because of its fundamental role in heredity and biological processes.
RNA
RNA, or ribonucleic acid, is a biological molecule that plays a critical role in cellular processes including protein synthesis, gene regulation, and the transmission of genetic information. It is closely related to DNA but differs in structure and function: it is usually single-stranded, contains the sugar ribose, and uses uracil (U) in place of thymine (T). RNA is produced by transcription, the process in which a gene (a segment of DNA) is copied into an RNA molecule.
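As a rough illustration of transcription, the sketch below converts a DNA sequence into the matching RNA sequence. It assumes the input is the coding (sense) strand written 5' to 3', so the RNA has the same sequence with uracil (U) in place of thymine (T). The function name `transcribe` is made up for this example.

```python
def transcribe(coding_strand):
    """Return the RNA sequence for a DNA coding strand (assumed 5' to 3')."""
    dna = coding_strand.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(unexpected)}")
    # RNA uses uracil (U) where DNA uses thymine (T).
    return dna.replace("T", "U")


print(transcribe("ATGGCT"))
```

In a real cell the RNA is built as the complement of the template strand; reading the coding strand and swapping T for U gives the same result in text form.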
Chromosome
Chromosomes are thread-like structures found in the nucleus of a cell that carry genetic information in the form of DNA. They play a crucial role in maintaining and transmitting this genetic information from one generation of cells to the next during cell division.
Allele
An allele is a variant form of a gene that occupies a specific location, or locus, on a chromosome. Genes are segments of DNA that code for specific traits or characteristics in an organism. Alleles contribute to the diversity of traits within a population, because different alleles of the same gene can produce variations in a particular trait.
Variant
In genetics and genomics, a variant is one of the different versions of a particular genetic sequence or genomic element found within a population. Variants arise through natural genetic variation or mutation. Their effects range from no observable change to significant functional or phenotypic differences.
SNP, SNV
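SNP (Single Nucleotide Polymorphism) and SNV (Single Nucleotide Variant) both describe a genetic variation in which a single nucleotide (DNA base) differs at a specific position in the genome. By convention, SNP usually refers to a variant that is common in a population (often defined as a frequency of at least 1%), while SNV covers any single-nucleotide change regardless of frequency.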
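To make the idea of a single-nucleotide difference concrete, the sketch below compares two sequences position by position and reports where they differ. It assumes the two sequences are already aligned and of equal length, which real variant callers do not require; `find_snvs` is a made-up helper name.

```python
def find_snvs(reference, sample):
    """Return (position, ref_base, sample_base) for each single-base difference.

    Positions are 1-based. Both sequences are assumed to be pre-aligned and
    the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    differences = []
    for index, (ref_base, sample_base) in enumerate(zip(reference, sample), start=1):
        if ref_base != sample_base:
            differences.append((index, ref_base, sample_base))
    return differences


print(find_snvs("ACGTAC", "ACCTAA"))
```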
Cell
A biological cell is the fundamental unit of life and the basic structural and functional unit of all living organisms. Cells are the building blocks that make up the tissues, organs, and systems of living organisms, and they carry out the essential processes necessary for life.
Nucleus
The cell nucleus is a membrane-bound organelle found in eukaryotic cells; prokaryotic cells, such as bacteria, have no nucleus. The nucleus serves as the control center of the cell, containing the genetic information in the form of DNA and regulating various cellular activities.
Base Pair
A base pair is two complementary nucleotide bases, one on each strand, joined by hydrogen bonds to form a “rung” of the DNA double helix. In DNA, adenine (A) pairs with thymine (T) and guanine (G) pairs with cytosine (C). DNA (deoxyribonucleic acid) is composed of long chains of nucleotides, each consisting of a sugar molecule, a phosphate group, and a nitrogenous base. Sequence lengths are commonly measured in base pairs (bp).
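Because each base has a fixed partner (A with T, G with C in DNA), one strand determines the other. The sketch below computes the opposite strand, read in its own 5' to 3' direction, which is commonly called the reverse complement. The function name is chosen for this example.

```python
# Translation table for standard DNA base pairing: A<->T, C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence containing only A, C, G, T."""
    complemented = sequence.upper().translate(COMPLEMENT)
    # The two strands run in opposite directions, so reverse the result.
    return complemented[::-1]


print(reverse_complement("ATGC"))
```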
NGS, MPS
NGS (Next-Generation Sequencing) and MPS (Massively Parallel Sequencing) are two names for the same family of technologies, which sequence very large numbers of DNA fragments in parallel. These technologies have revolutionized genomics by enabling rapid and cost-effective sequencing of DNA and RNA, leading to significant advances in biology, medicine, and research.
Long read sequencing
Long-read sequencing refers to DNA sequencing approaches that produce much longer reads than traditional short-read methods. Short-read sequencing typically produces reads a few hundred base pairs long, whereas long-read technologies can generate reads that are thousands to tens of thousands of base pairs long.
FASTA
The FASTA format is a simple and widely recognized text-based format for representing sequences. Each record has two components: a header line that begins with “>” and carries the sequence identifier, followed by one or more lines of sequence data. FASTA files can store a single sequence or many sequences, and they are often used for nucleotide or protein sequences.
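A minimal sketch of reading FASTA text is shown below. It assumes the identifier is the first whitespace-delimited word after the “>” and that sequence data may be wrapped across several lines; `parse_fasta` is a made-up name, and established libraries handle more edge cases.

```python
def parse_fasta(text):
    """Return a dict mapping each record's identifier to its full sequence."""
    records = {}
    current_id = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("header line has no identifier")
            current_id = fields[0]
            records[current_id] = []
        elif current_id is None:
            raise ValueError("sequence data found before any header line")
        else:
            # Sequence lines belonging to the same record are concatenated.
            records[current_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


example = ">seq1 example\nACGT\nACGT\n>seq2\nGGCC\n"
print(parse_fasta(example))
```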
FASTQ
The FASTQ format extends FASTA by storing a quality score for every base alongside the sequence. Each record has four lines: a header beginning with “@”, the sequence, a separator line beginning with “+”, and a quality string with one character per base. It is commonly used in next-generation sequencing (NGS) to represent raw sequence reads and their associated quality scores.
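The sketch below reads FASTQ text in which each record spans exactly four lines (header starting with “@”, sequence, a “+” separator, quality string) and decodes the quality characters into numeric scores. It assumes the common Phred+33 encoding, where the score is the character's ASCII code minus 33; older data may use a different offset. `parse_fastq` is a made-up name.

```python
def parse_fastq(text, offset=33):
    """Return a list of (identifier, sequence, quality_scores) tuples.

    Assumes four lines per record and Phred+33 quality encoding by default.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    records = []
    for start in range(0, len(lines), 4):
        header, sequence, separator, quality = lines[start:start + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        # One quality character per base; subtract the offset to get the score.
        scores = [ord(char) - offset for char in quality]
        records.append((header[1:], sequence, scores))
    return records


print(parse_fastq("@read1\nACGT\n+\nII5!\n"))
```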
SAM / BAM / CRAM
SAM (Sequence Alignment/Map), BAM (Binary Alignment/Map), and CRAM (Compressed Reference-oriented Alignment Map) are file formats commonly used in bioinformatics to store and exchange sequence alignment data generated from DNA or RNA sequencing experiments. SAM is a tab-delimited text format, BAM is its compressed binary equivalent, and CRAM achieves further compression, typically by storing differences from the reference sequence. All three represent the alignment of sequencing reads to a reference genome, including the read sequences, their mapping positions, and quality scores.
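As an illustration of the text form, the sketch below splits one SAM alignment line into its 11 mandatory tab-separated fields (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL) and checks the “unmapped” bit (0x4) of the FLAG. The helper names and the example read are made up; BAM and CRAM are binary and need a dedicated library to read.

```python
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
INT_FIELDS = {"FLAG", "POS", "MAPQ", "PNEXT", "TLEN"}


def parse_sam_line(line):
    """Parse one SAM alignment line into a dict keyed by field name."""
    if line.startswith("@"):
        raise ValueError("header lines are not alignment records")
    columns = line.rstrip("\n").split("\t")
    if len(columns) < len(SAM_FIELDS):
        raise ValueError("an alignment line needs at least 11 fields")
    record = {}
    for name, value in zip(SAM_FIELDS, columns):
        record[name] = int(value) if name in INT_FIELDS else value
    # Anything after the 11 mandatory fields is an optional tag.
    record["TAGS"] = columns[len(SAM_FIELDS):]
    return record


def is_unmapped(record):
    """Return True if the FLAG has the 'segment unmapped' bit (0x4) set."""
    return bool(record["FLAG"] & 0x4)


# Hypothetical read aligned at position 100 on a reference named chr1.
record = parse_sam_line("read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII")
print(record["RNAME"], record["POS"], is_unmapped(record))
```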
VCF
The Variant Call Format (VCF) is a standardized text file format used to represent genetic variations detected in DNA sequencing data. It is widely used in genomics research and clinical genetics to store and exchange information about single nucleotide polymorphisms (SNPs), insertions, deletions, and other genetic variants identified through sequencing experiments.
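The sketch below parses one VCF data line into its eight fixed tab-separated columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO). It assumes lines starting with “#” are metadata or the column header, that multiple ALT alleles are comma-separated, and that INFO entries are separated by semicolons; per-sample genotype columns are ignored. The function name and the example variant are made up.

```python
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line):
    """Parse one VCF data line into a dict of its eight fixed columns."""
    if line.startswith("#"):
        raise ValueError("meta-information and header lines are not variant records")
    columns = line.rstrip("\n").split("\t")
    if len(columns) < len(VCF_COLUMNS):
        raise ValueError("a VCF data line needs at least 8 columns")
    record = dict(zip(VCF_COLUMNS, columns))
    record["POS"] = int(record["POS"])
    # "." marks a missing value; several ALT alleles are comma-separated.
    record["ALT"] = [] if record["ALT"] == "." else record["ALT"].split(",")
    info = {}
    if record["INFO"] != ".":
        for entry in record["INFO"].split(";"):
            key, separator, value = entry.partition("=")
            # Entries without "=" are flags; store them as True.
            info[key] = value if separator else True
    record["INFO"] = info
    return record


# Hypothetical variant: G changed to A or T at position 12345 on chr1.
print(parse_vcf_line("chr1\t12345\t.\tG\tA,T\t50\tPASS\tDP=100;DB"))
```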