Lecture 2 Flashcards
Why health and biomedical informatics?
- modern healthcare and biomedical research are information intensive activities
- the crossover/intersection between health care/biomedical research and information technology
What is the demand for specialised workforce?
- people with the ability to combine ICT and biomedical skills and knowledge are in high demand in Australia and around the world
- A job market study from June 2012 showed that job posts in Health Informatics have increased ten times faster than other health related jobs in recent years
- need people with specific skills: people able to understand both the language of medicine and the language of information technology
- well paid in many countries
What is the rationale?
Background
• health care, biomedical research and public health are information intensive activities:
- medical images and clinical records
- DNA sequencing, molecular data
- literature and public databases
- clinical trials, biobanks, GWAS
• new data types (extremely complex and heterogeneous) are being generated at an unprecedented pace
How long did/does it take to decode the human genome?
- first decoded in 2003 after one decade of work
- nowadays it takes one day
What is PubMed?
- growing exponentially
- 5000 biomedical research articles are published daily
- over 22 million articles
What projects has the Human Genome Project led to?
- human microbiome project
- exposome alliance project
- ENCODE genome regulation
- Human Epigenome Project
- phenome levels (proteome, metabolome)
- inter- and intra-individual genetic variation: 1000 Genomes Project, mapping human genetic variation
What is Big Data?
• global size of “Big Data” in healthcare stood at roughly 150 exabytes (1 exabyte = 10^18 bytes) in 2001, increasing at a rate between 1.2 and 2.4 exabytes per year (SA)
defined by the 4 Vs:
• volume: Data at Rest, terabytes to exabytes of existing data to process
• velocity: data in motion, streaming data, milliseconds to seconds to respond
• variety: data in many forms, structured, unstructured, text, multimedia
• veracity: data in doubt, uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations
What is small data?
- our individual digital traces
- personal devices specifically designed for self-tracking (Fitbit)
- social networks, search engines, mobile operators, online games, and e-commerce sites that we access every day
- our everyday behaviours are becoming data
- data that are just about me, over time
- data about us, but not being provided to us
- from chronic pain to depression to memory enhancement and Crohn’s Disease
- generating evidence where n=me
- issues: open data, privacy, standards and tools
What are issues with informatics?
- how can we efficiently collect, store, search, integrate, analyse and visualise all of this information?
- how can we use research data to model and simulate human physiology and pathology?
- how can we facilitate the translation of research findings into clinical solutions?
- which new information processing methods will be needed to respond to the emerging research approaches?
- the tools you use are the result of our research
How do we connect levels of biomedical information?
- connecting different levels of biomedical information
- bottom-up approaches (from gene/molecule to environment)
- top-down approaches (reverse)
- need to link health informatics (population data, clinical data, patient generated data) with bioinformatics (genomic data, gene expression data, proteomics and metabolomics data)
- relationship between genotype, phenotype
- everything must also be connected with the complex interplay between itself and the environment
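A minimal sketch of what linking the two kinds of data could look like, assuming both sources share a patient identifier. The records, field names and values below are invented for illustration and do not come from the lecture.

```python
# Hypothetical toy records; every field name and value here is made up.
clinical_records = [
    {"patient_id": "P001", "diagnosis": "type 2 diabetes", "hba1c_percent": 8.1},
    {"patient_id": "P002", "diagnosis": "hypertension", "hba1c_percent": 5.4},
]
genomic_records = [
    {"patient_id": "P001", "gene": "GENE_A", "variant_present": True},
    {"patient_id": "P003", "gene": "GENE_A", "variant_present": False},
]


def link_by_patient(clinical, genomic):
    """Return one combined record per patient found in both sources."""
    # Index the genomic data by patient ID for direct lookup.
    genomic_by_id = {rec["patient_id"]: rec for rec in genomic}
    linked = []
    for rec in clinical:
        match = genomic_by_id.get(rec["patient_id"])
        if match is not None:
            # Merge the clinical (phenotype) and genomic (genotype) fields.
            linked.append({**rec, **match})
    return linked


linked = link_by_patient(clinical_records, genomic_records)
for row in linked:
    print(row)
```

Only patients present in both sources end up linked, which hints at one practical difficulty: data held in separate systems rarely covers exactly the same people.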
Health informatics vs bioinformatics?
- bioinformatics is different from health informatics
- increasing opportunities for interaction
- bioinformatics and computational systems biology
- health and biomedical informatics
Why is working with clinical data so hard? Why is healthcare data different?
Humans are a result of evolution - not perfect - many levels
Data about humans that arises from a growing number of sources and contexts:
- clinical research
- clinical practice - EHRs
- patient and disease registries
- mHealth apps
- Smart devices and sensors
- Environmental data
- Social media data
Why different?
• distributed (EMR, clinical departments)
• different formats (text, images, numeric, videos)
• same data exists in different systems
• patient generated data
• data is structured and unstructured
• inconsistent/variable definitions
• new data coming out every day
• complexity of data (the human body)
• changing regulatory requirements
• privacy issues
What is biomedical informatics?
- informatics is the science of information
- information is data plus meaning
- biomedical informatics is the science of information as applied to or studied in the context of biomedicine
- informaticians study information (data + meaning, in contrast to focusing exclusively on data)
- thus, practitioners must understand the context or domain (biomedicine)
- IT is different from biomedical informatics
- IT is basically the technology that we use to process information
- informatics is about providing meaning to data
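The idea that information is data plus meaning can be illustrated with a small sketch. The `Observation` class, its fields and the example values are hypothetical, chosen only to show that the same raw number becomes different information once context is attached.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """A raw value (data) together with the context that gives it meaning."""
    value: float
    quantity: str
    unit: str
    context: str


def describe(obs):
    """Render an observation as a readable statement."""
    return f"{obs.quantity}: {obs.value} {obs.unit} ({obs.context})"


# The same raw number is just data until context is attached.
raw_value = 7.2
as_glucose = Observation(raw_value, "blood glucose", "mmol/L", "fasting")
as_ph = Observation(raw_value, "blood pH", "pH units", "arterial sample")

print(describe(as_glucose))
print(describe(as_ph))
```

Both observations store the identical number, yet they describe different things, which is why a practitioner needs to understand the domain and not only the technology.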
What are some of the big challenges currently being addressed?
- phenome → genome * exposome
- human → environmental sensors, phenomic sensors, genomic sensors
- environmental sensors → environmental risk factors (pollution, radiation, toxic agents,…)
- phenomic sensors → physiological, biochemical parameters (cholesterol, temperature, glucose, heart rate…)
- genomic sensors: biomarkers (DNA sequence, proteins, gene expression, epigenetics)
- all of these combined → integrated personal health record
challenge:
• how can we measure environmental exposure?
• e.g. diabetes
• measuring the exposome
• environment-wide association study on type 2 diabetes mellitus
• 266 environmental factors
• future: combined: GWAS-EWAS?
What is an example of a new way of presenting/visualising data?
- microarrays: analysis of gene expression in cancer
- how this is translated into a survival curve
- ontologies: systems that are expressed in a standardised way, so you could analyse data from two different places
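As a rough sketch of the ontology point, the example below maps local labels from two sites onto shared concept identifiers so the records can be pooled. The site names, labels and `CONCEPT:` identifiers are all invented for illustration and are not taken from any real ontology.

```python
from collections import Counter

# Hypothetical mapping from (site, local label) to a shared concept ID.
# The "CONCEPT:..." identifiers are made up, not from a real ontology.
SHARED_CONCEPTS = {
    ("hospital_a", "T2DM"): "CONCEPT:0001",
    ("hospital_a", "HTN"): "CONCEPT:0002",
    ("hospital_b", "diabetes mellitus type II"): "CONCEPT:0001",
    ("hospital_b", "high blood pressure"): "CONCEPT:0002",
}


def standardise(site, labels):
    """Translate one site's local labels into shared concept IDs."""
    return [SHARED_CONCEPTS[(site, label)] for label in labels]


# Each site records the same conditions under different local names.
site_a = ["T2DM", "HTN", "T2DM"]
site_b = ["diabetes mellitus type II", "high blood pressure"]

# Once both sites use the shared vocabulary, their data can be counted together.
pooled = Counter(standardise("hospital_a", site_a) + standardise("hospital_b", site_b))
print(pooled)
```

Without the shared identifiers, "T2DM" and "diabetes mellitus type II" would be counted as unrelated entries even though they refer to the same condition.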