tech enablers - AI/ML Flashcards
development stages of AI
- narrow artificial intelligence
- general artificial intelligence
- super artificial intelligence
supervised learning
- create training data (annotated by human)
- use AI algo to create model
- apply model to unseen data
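The three steps above can be sketched in plain Python. This is only a toy illustration: the nearest-centroid method stands in for the "AI algo", the points and labels are made up, and the names `train` and `predict` are hypothetical.

```python
from collections import defaultdict
from math import dist

# Step 1: training data annotated by a human (made-up points and labels).
training_data = [
    ((1.0, 1.2), "cat"),
    ((0.8, 1.0), "cat"),
    ((4.0, 4.2), "dog"),
    ((4.2, 3.8), "dog"),
]

def train(examples):
    """Step 2: build a model, here one average point (centroid) per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(axis) / len(points) for axis in zip(*points))
        for label, points in grouped.items()
    }

def predict(model, features):
    """Step 3: apply the model to unseen data by picking the closest centroid."""
    return min(model, key=lambda label: dist(model[label], features))

model = train(training_data)
prediction = predict(model, (3.9, 4.0))  # a point not in the training data
```

The model here is just a dictionary of centroids; real algorithms learn far richer models, but the create-data, train, predict flow is the same.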
microworkers (pre read)
- perform tasks that machines cannot
- provide data for machine learning algorithms that are the basis of AI by adding the human element
- workers are often educated individuals; the work is largely unregulated, etc
- eg. Amazon’s Mechanical Turk platform
environmental sustainability?
training an AI model has a high carbon emissions footprint
what is deep learning
- deep layers of neural network
- data: very important but hard to derive features
- representation learning: raw data -> feature learning -> model -> perform actions
pre-trained models and transfer learning
- train from scratch
- fine tune a pre-trained model (transfer learning)
data centric vs model centric AI
- Labelling consistency is key: a lack of consistency can degrade the outcome.
- Systematic improvement of data quality on a basic model is better than chasing state-of-the-art models with low-quality data.
- With a data-centric view, there is significant room for improvement in problems with smaller datasets (<10k examples).
- When working with smaller datasets, tools and services to promote data quality are critical.
GAN
generative adversarial networks
- learn to mimic any distribution of data
generative AI
- textual prompt -> novel content
- powered by foundation models: AI models trained on vast quantities of unlabelled data at scale, then adapted to a wide range of downstream tasks
- eg. ChatGPT is trained using a technique called reinforcement learning from human feedback (tunes/teaches the model to give human-preferred responses)
concerns of AI (2)
- difficult to distinguish real from fake
- replacing humans
Working with smart machines: Insights on the future of work (pre reading)
ecosystems for supporting AI applications
1. technology-based ecosystem:
- platforms: exploration support/transaction support/automated decision platforms
- intelligent case management systems: workflow management/prioritisation/recommendations
2. job role-based ecosystem:
- new job specialisations/hybridisations
aspects of AI readiness index framework
- organisational readiness (AI literacy/talent/governance, management support)
- business value readiness (business use case)
- data readiness (data quality/reference data)
- infrastructure readiness (ML/Data)
AI biases
- biases are learnt from the training data and then amplified by the model
- eg. healthcare, hiring, policing, gender, race
retail AI example
domino’s
obj: use AI for consistent quality and speedier delivery
tech:
- pizza checker (image analysis and ML)
- processing order via voice (NLP)
- autonomous delivery vehicle
results: invest further to fit all kitchens with pizza checker
transport AI example
Tesla - aiming for level 5 (full) autonomy
obj: minimise accidents and death on road
tech:
- IoT, sensors, camera (computer vision)
- cloud computing to analyse all driving data
- siri-style AI assistant (voice control NLP)
results: autopilot (2) cut the rate of airbag deployment from 1.3 to 0.8 per million miles driven
products and privacy AI example
Apple
obj: pioneer on-device AI technology
tech:
- neural engine inside iPhone X model (custom chip)
- AI ecosphere (build algorithms into products)
- smarter app with AI functionality (eg. Homecourt computer vision)
results: prioritise user privacy over cloud data processing; exclusivity of Apple app and AI ecosphere
healthcare AI example
Tencent – AI to power WeChat and Healthcare
obj: capitalise on the use of AI across all industries
tech:
- Tencent Miying: AI medical imaging and diagnosing platform (deep learning for image recognition and NLP on medical documents and case files)
results:
- Leverage WeChat for appointment booking and treatment payment
- Miying can identify symptoms of more than 700 diseases
sustainable AI
alphabet’s AI-powered camera system
entertainment AI
spotify
obj: help to discover new talents and tracks
tech:
- AI fills role of DJ (recommender)
- collaborative filtering (deduce interest from others)
- audio analysis (tempo, pitch)
- NLP (lyrics and reviews on web)
results: subscriber base grew by 8 million and share price rose by 25% in the 3 months after IPO
opportunity and risks of AI (4)
- social & ethical
- legal & regulatory
- technological & implementation-oriented
- economic
- refer to docx for more details
type of recommenders (2)
- collaborative filtering: users A & B are similar because they read the same articles, so recommend an article read by user A to user B
- content-based filtering: similar articles are recommended to users who read one of the articles
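A minimal sketch of the two recommender types, assuming made-up reading histories and topic tags. Jaccard overlap is used here as one possible way to measure similarity between users; the function names are hypothetical.

```python
# Made-up reading history: which articles each user has read.
reads = {
    "A": {"ai_basics", "gan_intro", "bert_guide"},
    "B": {"ai_basics", "gan_intro"},
    "C": {"pizza_recipes"},
}

# Made-up topic tags per article, used only for content-based filtering.
tags = {
    "ai_basics": {"ai"},
    "gan_intro": {"ai", "generative"},
    "bert_guide": {"ai", "nlp"},
    "pizza_recipes": {"food"},
    "pasta_recipes": {"food"},
}

def jaccard(a, b):
    """Overlap between two sets, from 0 (disjoint) to 1 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def collaborative(user):
    """Recommend what the most similar other user read that `user` has not."""
    others = [u for u in reads if u != user]
    nearest = max(others, key=lambda u: jaccard(reads[user], reads[u]))
    return sorted(reads[nearest] - reads[user])

def content_based(user):
    """Recommend unread articles that share a tag with something `user` read."""
    liked_tags = set().union(*(tags[a] for a in reads[user]))
    return sorted(a for a in tags if a not in reads[user] and tags[a] & liked_tags)
```

Collaborative filtering only looks at who read what; content-based filtering only looks at what the articles are about, so it can recommend `pasta_recipes` even though nobody has read it yet.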
complexity of language
- computers mainly treat documents as “bags of words”: good at finding statistical patterns but not the meaning and context
text pre-processing
source sentence -> remove stop words -> stemming -> tokens
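The pipeline above could look roughly like this. It is a sketch under assumptions: the stop-word list is a tiny made-up sample, and `naive_stem` is a crude hypothetical stand-in for a real stemmer (such as the Porter stemmer).

```python
import re

# A tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "are", "on", "in", "of", "and", "to"}

def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(sentence):
    """Lowercase, split into words, drop stop words, then stem what is left."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]

tokens = preprocess("The cats are sleeping on the mats")
# tokens is ["cat", "sleep", "mat"]
```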
text representation
- convert to numbers
- term frequency / count of word
- TF-IDF (term frequency-inverse document frequency)
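A small sketch of turning tokens into numbers. The documents are made up, and the IDF formula used, log(N / document frequency), is only one common variant; libraries often add smoothing.

```python
import math
from collections import Counter

# Made-up, already tokenised documents.
docs = [
    ["pizza", "delivery", "pizza"],
    ["pizza", "music"],
    ["music", "playlist"],
]

def term_frequency(doc):
    """Count of each word in one document."""
    return Counter(doc)

def inverse_document_frequency(corpus):
    """log(N / number of documents containing the word) for every word."""
    n = len(corpus)
    doc_freq = Counter(word for doc in corpus for word in set(doc))
    return {word: math.log(n / df) for word, df in doc_freq.items()}

def tfidf(doc, idf):
    """Weight each count by how rare the word is across the corpus."""
    return {word: count * idf[word] for word, count in term_frequency(doc).items()}

idf = inverse_document_frequency(docs)
weights = tfidf(docs[0], idf)
```

In the first document "pizza" appears twice, yet "delivery" gets the higher TF-IDF weight because it appears in only one document of the corpus.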
similarity calculation
- euclidean distance
- cosine similarity
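Both measures can be written in a few lines. The vectors below are made-up word counts, chosen so that one document is simply the other repeated twice.

```python
import math

def euclidean_distance(u, v):
    """Straight-line distance between two vectors; 0 means identical."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 1 means same direction.

    Assumes neither vector is all zeros (that would divide by zero).
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up word-count vectors: doc_b is doc_a repeated twice.
doc_a = [1, 2, 0]
doc_b = [2, 4, 0]

distance = euclidean_distance(doc_a, doc_b)
similarity = cosine_similarity(doc_a, doc_b)
```

Euclidean distance treats the two documents as different because one is longer, while cosine similarity comes out at (approximately) 1 because the word proportions are the same. That is one reason cosine similarity is often preferred for text.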
word embedding
- neural network’s internal representation for the word
- Each word is represented as a vector and semantically similar words have similar vectors
- eg. the Word2Vec model is trained such that the probability of a word depends on its neighbouring words
pretrained models - eg. BERT
Bidirectional Encoder Representations from Transformers
- text preprocessing is no longer a must
caution when using pretrained models
- fine-tune with task-related data
- can be resource intensive (GPU needed)