BMI2207C CA2 Flashcards

1
Q

Difference between the EHR and EMR.

A

Electronic Medical Records (EMR) :
- Digital version of the paper charts and structured data kept in a hospital or clinician’s office
- System to capture the medical records of a patient
- Does not travel easily out of the hospital or clinic that created it
- Examples : Epic and Allscripts

Electronic Health Records (EHR) :
- Covers the total health of the patient, going beyond the standard clinical data collected in the provider’s office or hospital
- Reaches beyond the health organisation that originally collected the information
- Includes information from all clinicians involved in the patient’s care
- Built to share information with other healthcare providers such as labs and specialists
- Example : National Electronic Health Record (NEHR)

2
Q

Explain what the term “Meaningful use” means.

A

It is a term for getting healthcare providers (HCPs) to begin storing and sharing health data electronically, in order to improve clinical processes and health outcomes for patients.

3
Q

Functions of “meaningful use”.

A
  • Improve the quality of patient care
  • Engage patients in their health
  • Make care easier to coordinate
  • Improve the overall health of a given patient population
  • Secure and protect people’s health information
4
Q

Explain the stages in meaningful use, and state which stage Singapore’s EHR is in.

A

Stage 1 : Focused on getting healthcare providers to adopt EHRs and store clinical data electronically

Stage 2 : Encourages healthcare professionals and institutions to use the data and technology to improve the quality of care for their patients, and makes it easier to exchange information within and between organisations

Stage 3 : Centred on leveraging EHRs and clinical data to improve health outcomes, and ease reporting requirements to align with other government health programs

Note : Singapore’s EHR meets the Stage 1 definition of “meaningful use”.

5
Q

Explain what “Directed Exchange” means.

A

Health information is sent directly to other providers over an encrypted secure connection.

6
Q

Explain what “Query-based exchange” means.

A

Providers request data from a central health information exchange (HIE).

7
Q

Explain what “Consumer mediated exchange” means.

A

Patients are involved in the collection and transmission of healthcare data to providers.

8
Q

List the 12 persistent risks in AI.

A
  1. Disinformation
  2. Safety and security
  3. Black box problem (the process the AI followed to generate its output cannot be inspected)
  4. Ethical concerns (Non-maleficence, Beneficence, Autonomy, Manipulation)
  5. Bias
  6. Instability
  7. Hallucinations in LLMs (AI giving false outputs and causing unnecessary panic)
  8. Unknown unknowns (we cannot be sure how AI will react; blind spots prevent us from anticipating its behaviour)
  9. Job loss and social inequalities
  10. Environmental impacts
  11. Industry concentration
  12. State overreach
9
Q

Explain the relationship between AI, ML, DL, NLP, LLM and Conversational AI.

A
  • Artificial Intelligence includes ML, DL, Conv. AI, LLM, NLP
  • ML includes DL, LLM, Conv. AI, and can intersect with NLP
  • DL can intersect with Conv. AI and LLM
  • Conv. AI and LLM are both subsets of ML and NLP
  • NLP overlaps with ML, but has independent parts as well
    Note : Data science, data management, descriptive analytics and visualisation are NOT AI and form a separate field
10
Q

List the 2 roadblocks of modern AI.

A
  1. Common sense problems and implicit knowledge – Cognitive and contextual limitations mean an excessive number of exceptions must be fed in to model interactions
  2. Lack of training/labelled data – The amount of input data is never enough, especially for cellular images
11
Q

Differentiate supervised and unsupervised learning in machine learning.

A

Supervised learning (the model is trained on labelled data) :
- Classification is the problem of predicting the correct category/label for an input object
- Regression is the prediction of a continuous response

Unsupervised learning (the model is trained on unlabelled data) :
- Clustering is the problem of identifying implicit groupings in the data

12
Q

Explain regression analysis.

A

Regression analysis is a set of statistical processes for estimating the relationship between a dependent variable and one or more independent variables.

  1. Linear regression : Models the relationship between 2 variables and estimates the value of a response using the line of best fit
  2. Multiple regression : Models the relationship between variables (with >1 independent variable); the method of least squares is used to find the p-dimensional plane of best fit
  3. Logistic regression : Predicts a binary outcome, using a sigmoid function to map predictions to probabilities ; independent variables can be categorical/numeric, but the dependent variable must ALWAYS BE CATEGORICAL because the output is binary
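
As a rough illustration of items 1 and 3, the sketch below fits a line of best fit by least squares and shows the sigmoid mapping used by logistic regression. The function names and the toy data are assumptions made for this example only, not part of the course material.

```python
import math

def fit_line(xs, ys):
    """Least-squares line of best fit for one independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sigmoid(z):
    """Maps any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Toy data (hypothetical) lying exactly on y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# A logistic model would pass a linear score through the sigmoid
# and label the case positive when the probability exceeds 0.5.
probability = sigmoid(0.0)
```

Here `fit_line` recovers a slope of 2 and an intercept of 1, and `sigmoid(0.0)` sits at the 0.5 decision boundary.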
13
Q

Define artificial neural network (ANN).

A

An ANN represents a prediction model as a series of multi-layered, interconnected nodes that function similarly to neurons.

  • Learning in an ANN involves adjusting the weights of the various inputs to minimise errors in the output
  • The model back-propagates the error to correct the weights until the error rate converges
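
The weight-adjustment idea can be sketched with a single linear neuron trained by gradient descent. This is a deliberately simplified assumption: a real ANN has several layers, and back-propagation applies the same kind of update layer by layer using the chain rule. The function name, learning rate and toy data are hypothetical.

```python
def train_neuron(samples, lr=0.05, epochs=500):
    """Train one linear neuron (prediction = w * x + b) on (x, target) pairs."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            prediction = w * x + b
            error = prediction - target
            # Move each weight against the gradient of the squared error
            w -= lr * error * x
            b -= lr * error
    return w, b

# Toy data (hypothetical) generated from target = 2x + 1
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_neuron(samples)
```

After training, `w` and `b` end up close to 2 and 1, which is the point at which the error has converged.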
14
Q

Explain Deep Learning.

A
  • Subset of ANN-based machine learning that uses multi-layered (deep) neural networks
  • Allows for more complex feature detection and requires more training data
  • Applications : image recognition, natural language processing and autonomous driving
15
Q

Define clustering and the 2 categories.

A

Clustering is defined as the grouping of data points based on their similarities, extracting the underlying groupings without labels.

K-means clustering : Clustering based on centroids, which are updated until intra-cluster distance is minimised and inter-cluster distance is maximised

Density-based clustering (DBSCAN)
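
A minimal sketch of the K-means loop described above, restricted to one-dimensional data for brevity. The function name, starting centroids and data points are assumptions chosen for illustration.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Toy data (hypothetical) with two obvious groups
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
```

With these inputs the centroids settle at 2.0 and 11.0 after two passes and stop changing, which is the convergence condition.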

16
Q

How do we conduct evaluation of ML models?

A
  • Choose the best model on the market and use it as the benchmark for comparison with other models
  • Ensure models generalise well, neither over-trained nor under-trained
  • Tune hyperparameters, which can significantly affect model performance
  • Interpret the pitfalls of the models
17
Q

Explain the train-validation-test split.

A

Refers to dividing an entire dataset into portions used to train, validate and test the learning model.
- 70% of the dataset to help the ML model learn and train
- 15% of the dataset to validate the model (fine-tune its hyperparameters)
- 15% of the dataset to test the model and see whether it works
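
The 70/15/15 split can be sketched as follows. The function name, the fixed seed and the use of the integers 0 to 99 as stand-in records are assumptions for this example.

```python
import random

def train_val_test_split(records, train=0.70, val=0.15, seed=0):
    """Shuffle the records, then cut them into train, validation and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    train_set = shuffled[:n_train]
    val_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]  # whatever remains, about 15%
    return train_set, val_set, test_set

train_set, val_set, test_set = train_val_test_split(range(100))
```

Shuffling before cutting matters: if the records are ordered (for example by admission date), an unshuffled split would test the model on a different kind of data from what it was trained on.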

18
Q

Explain what cross validation for model fitting means.

A

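N-fold cross validation evaluates the performance of an ML model through a series of training and testing rounds: the data is split into N random sets (folds), and each fold is used once as the test set while the remaining folds are used for training.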
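
A minimal sketch of how the folds rotate, assuming 10 samples and 5 folds. The function name is hypothetical, and the indices are not shuffled here to keep the example easy to follow; in practice the data would usually be shuffled first.

```python
def n_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) once per fold."""
    indices = list(range(n_samples))
    # Deal the indices into n_folds roughly equal groups
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx

splits = list(n_fold_indices(n_samples=10, n_folds=5))
```

Each sample lands in the test set exactly once, and the model's score is typically averaged over the N rounds.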

19
Q

List the metrics for evaluating ML.

A
  • Accuracy
  • Precision
  • Recall/sensitivity (proportion of all actual positive instances that are correctly predicted)
  • F1 measure (harmonic mean of precision and recall)
  • True positives, false positives
  • Area under curve
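
Several of these metrics follow directly from the counts of true/false positives and negatives. The sketch below uses made-up counts, and the function name is an assumption; it does not guard against division by zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # of predicted positives, how many were right
    recall = tp / (tp + fn)     # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for 100 test cases
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

For these counts, accuracy is 0.7, precision is 0.8, recall is about 0.667 and F1 is about 0.727.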
20
Q

Interpret the area under curve.

A

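Area under curve (AUC) : Area under the ROC curve, ranging from 0 to 1 ; 0.5 corresponds to random guessing and 1 to perfect separation of the classes

Residuals : Errors and deviations between actual and predicted values, hence we use the line of best fit

Overfitting : Model “learns too much” from the training data and is unable to give a generalised trend

Underfitting : Model “learns too little” and gives a low prediction accuracy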

21
Q

Applications of AI in healthcare

A
  • Enhance diagnostic accuracy
  • Improve treatment outcomes and processes
  • Reduce healthcare costs
  • Expand access to healthcare services
    BUT : ethical, regulatory and privacy concerns remain
22
Q

What are Language Models?

A

A language model is a circuit (model) that guesses an output word given a sequence of input words.

23
Q

What does self-supervision mean?

A
  • An encoder-decoder network is trained to output the same word as the input
  • Encoders are circuits that can take in many more words by representing them as vectors
24
Q

What does a masked language model mean?

A
  • Part of a sentence is masked and the LLM is tested to see whether it can fill in the mask with the original word
  • However, when the mask is at the end of the sentence, the model keeps guessing the next word in the sequence and forms an autoregressive model instead
25
Q

Unique points of LLMs

A
  • Uses transformer architecture
  • Self-attention mechanism
  • Embeddings
  • Positional encoding
  • Multi-Head Attention
  • Feedforward Neural Networks
  • Training processes by learning language (pre-training on large amount of texts and fine-tuning by training on specific tasks)
26
Q

Things to know about LLMs

A
  1. Training on random information
  2. Model is data biased (it only generates content based on what it was trained on)
  3. Model cannot differentiate what is right or wrong
  4. LLMs may make mistakes and hallucinate, or generate certain preferences
  5. Trust deficit therefore requires validation
  6. Quality of response is directly proportional to quality of input prompt
  7. Model cannot hold a real conversation as it does not remember everything that was said
27
Q

What is a “prompt” in the context of generative AI?

A

Text or input provided by the user to initiate a response or action from the AI

28
Q

Which parameter changes how random and creative an AI-generated response is?

A

Temperature. The higher the temperature, the more random, diverse and less predictable the outcome is.
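
One common way temperature is applied is to divide the model's raw scores (logits) by the temperature before converting them to probabilities with a softmax. The sketch below assumes that approach; the function name and the three example logits are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    scaled = [z / temperature for z in logits]
    highest = max(scaled)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words
logits = [2.0, 1.0, 0.0]
cool = softmax_with_temperature(logits, temperature=0.5)
warm = softmax_with_temperature(logits, temperature=2.0)
```

At the lower temperature almost all of the probability goes to the top-scoring word, so the output is predictable. At the higher temperature the probabilities flatten out, so less likely words are sampled more often.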

29
Q

What are tokens in LLMs?

A

A token is a word, part of a word, punctuation mark or symbol that represents the smallest unit of text that the LLM can process.

30
Q

Name the framework for prompt generation.

A

CO-STAR framework

31
Q

Explain the CO-STAR acronym.

A
  • Context (background info of scenario)
  • Objective (clear and specific task definition)
  • Style (writing style of response)
  • Tone (sentiment)
  • Audience (intended audience of the response)
  • Response (response format)
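
The six CO-STAR parts can be assembled into a single prompt string. The sketch below is one possible layout; the `# NAME #` section headers, the function name and the sample text are assumptions for illustration rather than a required format.

```python
def build_costar_prompt(context, objective, style, tone, audience, response):
    """Join the six CO-STAR sections into one prompt, in acronym order."""
    sections = [
        ("CONTEXT", context),
        ("OBJECTIVE", objective),
        ("STYLE", style),
        ("TONE", tone),
        ("AUDIENCE", audience),
        ("RESPONSE", response),
    ]
    return "\n\n".join(f"# {name} #\n{text}" for name, text in sections)

# Hypothetical example content
prompt = build_costar_prompt(
    context="A clinic is introducing an online appointment system.",
    objective="Write a notice explaining how to book an appointment.",
    style="Plain language, short sentences.",
    tone="Friendly and reassuring.",
    audience="Elderly patients with little experience of online services.",
    response="A numbered list of at most five steps.",
)
```

Keeping each part in its own labelled section makes it easy to change one element, such as the audience, without rewriting the rest of the prompt.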
32
Q

What are 6 common task-specific prompts for LLM-powered AI?

A
  • Rewriting
  • Extracting
  • Classifying
  • Clustering
  • Summarising
  • Generating
33
Q

What is the purpose of thought-based prompting?

A

Getting the AI to generate a list of the steps by which it came to a certain conclusion.

34
Q

Explain the data challenges in medical AI.

A
  1. Data Quality and Availability
  2. Privacy and Security Concerns
  3. Data bias and representation
35
Q

Explain the technical and integration challenges in medical AI.

A
  1. Integration with existing systems
  2. Scalability issues
  3. Complexity of Medical Conditions
36
Q

Explain the ethical and legal considerations in Medical AI.

A
  1. Ethical concerns
  2. Regulatory challenges
  3. Liability & Accountability
37
Q

Implications and importance of medical data security.

A
  • A breach can lead to identity theft and financial loss
  • A breach can have life-threatening implications if medical records are altered or stolen
  • A breach can result in hefty fines and reputational damage for organisations
38
Q

List the practices in data security.

A
  1. Two-factor authentication
  2. Access control
  3. PDPA guidelines
  4. Guide staff on data handling, access and protection
  5. Have clear SOPs on how to respond to breaches
  6. Monitor and conduct audits to ensure compliance with policies and regulations
  7. Encrypting data in transit

Future work :
- Decentralised data storage
- Privacy considerations from the outset

39
Q

List the regulatory framework for healthcare data privacy and security.

A
  • Cyber and Data security guidelines (CDSG)
  • Personal Data Protection Act (PDPA)
  • Health Information Bill (HIB)