Informatics 2- Data Information and Knowledge Flashcards
data
- observations
- may or may not be meaningful
- computers do not understand data
- they only input, store, process, and output it
- as zeros (off) and ones (on)
- each zero or one is known as a bit
- a series of eight bits is called a byte
- ex. 112134493 (pt id number)
information
meaningful data used to draw conclusions
- take data and draw a conclusion
- ex. patient ID, interpreting diagnosis codes
knowledge
information justifiably believed to be true
- ex. smokers are more likely to develop lung cancer
bit
- only zeros and ones
- only language a computer understands
- computers do not understand you (they are dumb!)
byte
a series of eight bits is a byte
- ex. 10011101
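A minimal sketch of the bit and byte cards, using the eight-bit pattern from the card above. It assumes the pattern is read as an unsigned binary number; the variable names are illustrative only.

```python
# Interpret the eight-bit pattern from the card as an unsigned integer.
bits = "10011101"
assert len(bits) == 8            # eight bits make one byte

value = int(bits, 2)             # parse the string as a base-2 number
as_byte = bytes([value])         # the same value stored as a single byte

print(value)                     # 157
print(format(value, "08b"))      # 10011101
print(len(as_byte))              # 1
```

The computer stores only the pattern of zeros and ones; reading it as the number 157 is an interpretation applied by the program.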
integers
- numbers
- type of data
floating point numbers
- type of data
- 3.5456
- decimal
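A short sketch of the two numeric data types on the integer and floating point cards. The variable names and the value 42 are made up for illustration; 3.5456 is the example from the card.

```python
count = 42          # an integer: a whole number
reading = 3.5456    # a floating point number: has a decimal part

print(type(count).__name__)     # int
print(type(reading).__name__)   # float

# Converting a float to an integer drops the decimal part.
print(int(reading))             # 3
```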
character
- 8 bits (1 byte) is a character
- ex. “a” and “z”
strings
- putting characters together makes strings
- “hello” or “ball”
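A minimal sketch of how characters and strings relate to bits, assuming the ASCII encoding, where each character occupies one byte (eight bits). The helper name `char_to_bits` is hypothetical.

```python
def char_to_bits(ch: str) -> str:
    """Return the 8-bit pattern for a single ASCII character."""
    code = ord(ch)                   # numeric code for the character
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "08b")

print(char_to_bits("a"))             # 01100001
print(char_to_bits("z"))             # 01111010

word = "hello"
encoded = word.encode("ascii")       # one byte per character
print(len(encoded))                  # 5
print(" ".join(char_to_bits(c) for c in word))
```

A string is the same idea repeated: "hello" is five characters, so it is stored as five bytes in this encoding.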
file formats
from least to most storage (typically):
- text files
- image files
- sound files
- video files
image files
- JPG
- GIF
- PNG
- more clarity -> more storage*
text files
- txt
- doc
sound files
- WAV
- MP3
video files
- MPG
informatics vs. IT and computer scientists
- computer science- software
- informatics- broad spectrum, hardware, software, combines them in a relevant way
- IT- searches or sorts data more efficiently
- takes data and makes it easier to use (sort, filter)
- informatics manipulates information (tools vary, could be computers)
- information retrieval:
- relationship between aspirin and heart attack -> finding correlations
- problem is identifying documents that contain certain meaning
data to information
- vocabularies help convert data into information
- the code 162.9 on its own is a meaningless datum
- interpreting it as the ICD-9-CM code for “Lung neoplasm, not otherwise specified” turns the datum into a unit of information
- human interpretation is necessary
- interoperability- transmission of information
- consistency of interpretation
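The role of a vocabulary can be sketched as a lookup table. This is an illustration only: the `VOCABULARY` table and the `interpret` function are hypothetical, and the single entry is the code and description quoted on the card. A real code set has thousands of entries.

```python
# Hypothetical one-entry vocabulary built from the example on the card.
VOCABULARY = {
    "162.9": "Lung neoplasm, not otherwise specified",
}

def interpret(code: str) -> str:
    """Turn a bare code (datum) into a description (information)."""
    return VOCABULARY.get(code, "unknown code: no meaning can be attached")

print(interpret("162.9"))
print(interpret("999.99"))
```

Two systems exchange information consistently only if both use the same table, which is the interoperability point made above.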
information to knowledge
- information produces knowledge
- in the clinical world, knowledge is supported by evidence that it is true rather than being proven fact
you can't convert data directly to knowledge
- it needs to be put into something meaningful first (information)
clinical data warehouse (CDW)
- clinical data are collected via electronic health records (EHRs)
- clinical records composed of:
- structured data- billing codes, lab results, ICD-9-CM 162.9 = Lung Neoplasm, medication lists etc… -> easier to manage and retrieve
- unstructured data (free text)- clinical notes in natural language; difficult to process -> requires natural language processing (NLP)
- shared database that collects, integrates, and stores clinical data from a variety of sources including electronic health records, radiology and other information systems
- staging: extract, transform and load
- EHR designed for real time updating of individual data
- CDW supports queries (a search) for groups
- take all the information and make a diagnosis
- data cannot be deleted after they enter the data warehouse
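The staging step (extract, transform, load) and a group query can be sketched with Python's built-in sqlite3 module. Everything here is an assumption made for illustration: the table layout, the field names, the patient IDs, and the placeholder code "000.0" are invented, and a real CDW is far larger and more complex.

```python
import sqlite3

# Extract: hypothetical rows as they might arrive from two source systems.
ehr_rows = [
    {"pt_id": "1001", "code": " 162.9"},
    {"pt_id": "1002", "code": "162.9 "},
]
radiology_rows = [
    {"pt_id": "1003", "code": "000.0"},   # made-up placeholder code
]

def transform(row: dict, source: str) -> tuple:
    """Clean one row: trim whitespace and tag where it came from."""
    return (row["pt_id"].strip(), row["code"].strip(), source)

# Load: write the cleaned rows into one shared table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE diagnoses (pt_id TEXT, code TEXT, source TEXT)")
staged = [transform(r, "ehr") for r in ehr_rows]
staged += [transform(r, "radiology") for r in radiology_rows]
conn.executemany("INSERT INTO diagnoses VALUES (?, ?, ?)", staged)

# Query a group rather than an individual: how many patients carry 162.9?
(group_size,) = conn.execute(
    "SELECT COUNT(DISTINCT pt_id) FROM diagnoses WHERE code = ?", ("162.9",)
).fetchone()
print(group_size)   # 2
```

The query at the end is the kind of group-level question the CDW supports, in contrast to the EHR, which is built for updating one patient's record at a time.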
patient charts are made up of
- structured and unstructured data
van der Lei
- data shall be used only for the purpose for which they were collected
- this law has a corollary: if no purpose was defined prior to the collection of the data, then the data should not be used
- if there is no purpose why collect it?
- waste of time! we don't have time
query
- search
- looking for something
- query the x-ray
CDW as a clinical resource
- quality monitoring: query for specific quality measures in specific pt populations
- all the people in the background are communicating with pt, looking that everything is being handled and input correctly
- clinical and translational researchers to identify trends and link research with clinical practice
- hospital infection control specialists track pathogens
- public health agencies conduct surveillance for natural or man-made illnesses
- informatics for integrating biology and the bedside (i2b2) project by Harvard
use of aggregated clinical data
- recognize records for pts with specific conditions
- could use billing codes (controlled vocab -> ICD-10-CM)
- concept extraction- identifying concepts within unstructured data -> extracting information from free text clinical notes (discharge summaries or pathology reports)
- need to map between terms or phrases and controlled vocab with accuracy
- good notes -> better care
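Concept extraction can be sketched as matching phrases in free text against a controlled vocabulary. This is a deliberately naive illustration, not how production NLP systems work: the `PHRASE_TO_CODE` table, the function name, and the sample note are all hypothetical.

```python
import re

# Hypothetical phrase-to-code table; phrases and mapping are illustrative.
PHRASE_TO_CODE = {
    "lung neoplasm": "162.9",
    "neoplasm of the lung": "162.9",
}

def extract_concepts(note: str) -> set:
    """Return the codes whose phrases appear in a free-text note."""
    text = re.sub(r"\s+", " ", note.lower())   # normalize case and spacing
    found = set()
    for phrase, code in PHRASE_TO_CODE.items():
        # \b keeps the match on whole words only
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(code)
    return found

note = "Discharge summary: CT shows a Lung  Neoplasm in the right upper lobe."
print(extract_concepts(note))                    # {'162.9'}
print(extract_concepts("No acute findings."))    # set()
```

A matcher this simple would also flag a note that says "no lung neoplasm", which shows why mapping phrases to a controlled vocabulary with accuracy is hard.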
CDW summary
- CDWs do more than just archive data
- must turn data into information and knowledge
- make sense of clinical data
- make clinical data meaningful (data to information), then turn information into knowledge
classification
- problem of categorizing data into two or more categories
- algorithm can be used to learn a representation of features that characterizes annotated positive (pts with breast cancer) and negative (pts without breast cancer) cases
- new cases can then be categorized automatically
- someone programs the computer -> how do you know they were correct?
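A minimal sketch of learning from annotated positive and negative cases and then categorizing new cases automatically. It uses a nearest-centroid rule, chosen here only because it is short; the card does not name a specific algorithm. The feature vectors are made up and do not represent real patients.

```python
from statistics import mean

def centroid(rows):
    """Average each feature column across a list of feature vectors."""
    return [mean(col) for col in zip(*rows)]

def distance_sq(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train(positive, negative):
    """Learn one representative point (centroid) per annotated class."""
    return {"positive": centroid(positive), "negative": centroid(negative)}

def classify(model, case):
    """Assign a new case to the class with the nearest centroid."""
    return min(model, key=lambda label: distance_sq(model[label], case))

# Made-up feature vectors: [feature_1, feature_2], scaled 0 to 1.
positive_cases = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.9]]
negative_cases = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.2]]

model = train(positive_cases, negative_cases)
print(classify(model, [0.85, 0.75]))   # positive
print(classify(model, [0.15, 0.30]))   # negative
```

The output is only as good as the annotations and features supplied by the people who built it, which is the question the last bullet raises.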
semantic gap
- difference between data and information = meaning (semantics)
- right answer wrong data -> right data wrong answer
- in banking the gap between data and information is narrow -> because it's numbers!
- direct link between data (numbers) and information (account balances)
- did you collect the data correctly, and did you interpret them correctly?
tracking
- everything you do in health care is tracked
concepts are poorly defined
- definition of sick
- systems in the human body are connected
- conceptual and computational models are rarely complete
health information technology is really health data technology
- existing technology stores, manipulates and transmits data (symbols), not information (data + meaning)
- in health care, data do not fully represent the meaning
- you can collect data but it may not answer your question
- all data needs a purpose
interfaces
- in health care systems we use HL7 (Health Level Seven)
- HL7 reference information model (RIM)
- connects multiple systems together
- HEALTH CARE ONLY
- complex
- does not necessarily match all health care environments
- takes the first group of numbers and matches it to a second group of numbers -> tells the system where to route something (the correct area)
- matching fields
- lock and key
- interface people teach other people how to do this -> they don't get paid to do it for other people
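The idea of matching fields between two systems can be sketched as a mapping table. This is a generic illustration and not the HL7 format itself: the field names on both sides and the `translate` function are hypothetical.

```python
# Hypothetical field names for a sending and a receiving system.
FIELD_MAP = {
    "pt_id": "patient_identifier",
    "dx_code": "diagnosis_code",
}

def translate(message: dict) -> dict:
    """Rename each sending-system field to its receiving-system field."""
    unmatched = [name for name in message if name not in FIELD_MAP]
    if unmatched:
        # No "key" for this "lock": refuse rather than guess.
        raise KeyError(f"no mapping for fields: {unmatched}")
    return {FIELD_MAP[name]: value for name, value in message.items()}

incoming = {"pt_id": "112134493", "dx_code": "162.9"}
print(translate(incoming))
```

Refusing unmatched fields instead of passing them through reflects the lock-and-key point: a value is delivered only when both systems agree on where it belongs.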
incomplete information
- information with missing data, but potentially obtainable
uncertain information
- not possible to objectively determine whether it is true or false
imprecise information
- information not as specific as it should be
vague information
- unclear information
inconsistent information
- information that contains 2 or more assertions that cannot simultaneously hold
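Two of the information problems above, incomplete and inconsistent, lend themselves to simple automated checks. The sketch below is illustrative only: the required fields, the record layout, and both function names are assumptions.

```python
REQUIRED_FIELDS = ("pt_id", "smoker")   # hypothetical required fields

def missing_fields(record: dict) -> list:
    """Incomplete: required values that are absent but could be obtained."""
    return [f for f in REQUIRED_FIELDS if record.get(f) is None]

def conflicting_fields(assertions: list) -> set:
    """Inconsistent: the same field asserted with two different values."""
    seen = {}
    conflicts = set()
    for field, value in assertions:
        if field in seen and seen[field] != value:
            conflicts.add(field)
        seen.setdefault(field, value)
    return conflicts

print(missing_fields({"pt_id": "112134493", "smoker": None}))       # ['smoker']
print(conflicting_fields([("smoker", True), ("smoker", False)]))    # {'smoker'}
```

Uncertain, imprecise, and vague information are harder to detect this way, because there is no formal rule that separates them from acceptable entries.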
computer algorithms (spell checks)
- computers and programming languages process discrete symbols according to precise formal rules
- they do not make sense of highly ambiguous information
- the person who wrote the algorithm is smart, not the computer
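A toy spell check shows what processing discrete symbols by precise formal rules looks like. The word list and the function name are hypothetical, and real spell checkers are more sophisticated than this exact-match rule.

```python
# Tiny hypothetical word list standing in for a real dictionary.
KNOWN_WORDS = {"the", "patient", "went", "there", "their", "to", "home"}

def misspelled(sentence: str) -> list:
    """Return words that are not exact matches for a known word."""
    words = sentence.lower().replace(".", "").split()
    return [w for w in words if w not in KNOWN_WORDS]

print(misspelled("The pateint went home."))            # ['pateint']
print(misspelled("The patient went to there home."))   # []
```

The second sentence uses "there" where "their" is meant, yet nothing is flagged: every symbol matches the list, and the program has no access to the meaning.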
health IT: easy to sell?
- no
- in theory: improve health care quality, prevent medical errors, and increase efficiency
- but in practice it can:
- increase mortality, increase errors, and decrease efficiency
- outcomes depend not just on HIT, but on how it is implemented and in what context clinical results are determined
future trends
- transition from “data processing” to “information processing”
- introduce human cognition and abilities