ML, Cloud computing and Edge computing Flashcards
Data mining (DM)
Focuses on discovering patterns and knowledge in historical data to find valuable insights.
Machine learning (ML)
The ability of an algorithm to learn from data without being explicitly programmed for specific tasks.
Artificial Intelligence (AI)
Technologies and systems that act and think in a way similar to human behaviour, with the ability to make rational decisions.
Different types of machine learning algorithms
Supervised machine learning, unsupervised machine learning and reinforcement learning.
Supervised machine learning
Classification: when the output variable is a category, such as red or blue, or disease and no disease.
Regression: when the output variable is a real value, such as weight or price, e.g., house prices or stock prices.
Algorithms for classification and regression in supervised machine learning?
Logistic regression: used to predict a binary outcome: yes/no, pass/fail.
Naive Bayes: calculates the probability that a sample X belongs to a certain category.
K-nearest neighbour: uses the training data to find the K closest neighbours of a new example and predicts from their labels or values.
Decision tree: builds tree branches in a hierarchical approach, and each branch can be considered as an if-else statement.
Linear regression: a technique to examine whether there is a linear statistical relationship between a response variable (Y) and one or more explanatory variables (X).
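To make two of the algorithms above concrete, here is a minimal sketch in plain Python: a K-nearest neighbour classifier and a least-squares line fit with a single explanatory variable. The function names (knn_predict, fit_line) and the toy data are assumptions made for illustration, not part of the flashcards.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def fit_line(xs, ys):
    """Ordinary least squares for one explanatory variable: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical toy data: (height_cm, weight_kg) -> label
train = [((150, 50), "small"), ((155, 55), "small"), ((160, 58), "small"),
         ((180, 85), "large"), ((185, 90), "large"), ((190, 95), "large")]
label = knn_predict(train, (182, 88))   # classification: the output is a category
print(label)

# Hypothetical toy data lying exactly on y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)                 # regression: the output is a real value
```

In practice a library such as scikit-learn would normally be used; the point here is only to show what the two algorithms compute.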
Unsupervised machine learning
Splits datasets based on common attributes, detects anomalies that do not fit in any group and simplifies data by reducing dimensions. Clustering, anomaly detection and dimensionality reduction are common techniques. E.g., IoT in agriculture to segment plants based on sensor information.
Reinforcement learning
The machine is given feedback on the decisions it makes, but no information about the possible alternatives. E.g., used in self-driving cars and in various games.
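A minimal sketch of this idea is an epsilon-greedy agent on a multi-armed bandit: it only ever observes the reward of the action it took and has to explore to learn about the others. The function name run_bandit, the fixed reward values and the parameter defaults are assumptions chosen for illustration.

```python
import random

def run_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a multi-armed bandit.

    `rewards[i]` is the (hidden) payoff of action i. The agent only sees the
    reward of the action it actually took, never the alternatives.
    """
    rng = random.Random(seed)
    n_actions = len(rewards)
    estimates = [0.0] * n_actions   # running average reward per action
    counts = [0] * n_actions

    for step in range(steps):
        if step < n_actions:
            action = step                          # try every action once
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)      # explore a random action
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit

        reward = rewards[action]                   # feedback for this decision only
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts

# Hypothetical payoffs for three actions; the agent does not know them in advance
estimates, counts = run_bandit([1.0, 5.0, 2.0])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, counts)
```

Real reinforcement learning problems also involve states and delayed rewards; this sketch keeps only the feedback loop.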
Challenges about ML in general
- Insufficient data: ML algorithms need a lot of data to work properly. Simple problems may take thousands of examples, and complex problems (image/speech recognition) may require millions.
- Poor quality data: if the training data is full of errors, outliers, and noise, it will be hard for the system to detect the underlying patterns.
- Irrelevant features: a critical part is coming up with a good set of features to train on. This process, called feature engineering, involves:
- Feature selection: selecting the most useful features to train on among the existing features.
- Feature extraction: combining existing features to produce more useful ones; dimensionality reduction algorithms can help.
- Creating new features by gathering new data.
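As one possible illustration of feature selection, the sketch below keeps only the features whose correlation with the target is strong enough. This is a simple filter approach picked for the example; the names pearson and select_features, the 0.5 threshold and the data are all assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def select_features(columns, target, threshold=0.5):
    """Keep the names of features whose |correlation| with the target exceeds threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) > threshold]

# Hypothetical sensor features and a target to predict
columns = {
    "temperature": [10, 12, 14, 16, 18],   # rises with the target
    "device_id":   [7, 7, 7, 7, 7],        # constant, carries no information
    "noise":       [3, -1, 0, -1, 3],      # unrelated to the target
}
target = [20, 24, 28, 32, 36]
selected = select_features(columns, target)
print(selected)
```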
What is the role of ML in IoT?
Predictive maintenance: finding signs of issues before a breakdown happens.
Anomaly detection: involves identifying events or data points that are outside the expected range.
Personalization: based on user behaviour and preferences, ML can be used to customize IoT apps.
Environmental monitoring: data from sensors is used to estimate environmental factors.
Resource optimization: ML can be used to optimize the usage of resources like water, electricity and materials.
Smart transportation: cars, aeroplanes.
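Anomaly detection, which also underlies predictive maintenance, can be sketched with a simple z-score rule: flag readings that lie far from the mean of the batch. The function name find_anomalies, the threshold of 2 standard deviations and the vibration values are assumptions for illustration; real systems typically use more robust methods.

```python
from statistics import mean, stdev

def find_anomalies(readings, threshold=2.0):
    """Return (index, value) for readings more than `threshold` standard deviations from the mean."""
    mu = mean(readings)
    sigma = stdev(readings)
    if sigma == 0:
        return []   # all readings identical, nothing stands out
    return [(i, x) for i, x in enumerate(readings)
            if abs(x - mu) / sigma > threshold]

# Hypothetical vibration readings from a motor; one value is far out of range
vibration = [0.51, 0.49, 0.50, 0.52, 0.48, 0.50, 1.90, 0.51, 0.49, 0.50]
anomalies = find_anomalies(vibration)
print(anomalies)
```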
Challenges of using ML in IoT
Data quality: ML algorithms require high-quality data to provide accurate predictions.
Scalability: IoT applications involve large amounts of data and a large number of devices, which can make it difficult to scale ML algorithms.
Latency: real-time or near-real-time decision-making is crucial in many IoT applications, so slow predictions are a problem.
Interoperability: ML algorithms may be challenging to integrate into IoT devices and systems because these are frequently created using various technologies and standards.
Energy efficiency: IoT devices often have limited power and processing resources, which can make it difficult to run complex machine learning algorithms.
Security: IoT devices could be exposed to security risks like viruses or hacking.
What is the role of cloud computing in IoT?
Cloud computing and IoT complement each other. Cloud computing allows IoT devices to record, capture, process and store data at a massive scale.
What are the benefits and limitations of edge computing?
With edge computing, we can better control data, reduce cost, provide faster insights and actions, and enable more continuous and streamlined operations. It is a very good option for sensitive areas and sensitive data, since the data can stay local. The limitation of edge computing is that large, complex models cannot be deployed at the edge.
Edge vs cloud
Cloud Computing uses centralized servers in large, remote data centers to process and analyze data. It is best for applications that don’t need instant responses.
Edge Computing is a distributed system closer to users and devices. It processes data locally and analyzes it in real-time, making it ideal for situations where low latency is important and every millisecond counts.
Summary:
• Cloud: Good for non-time-sensitive tasks, data stored remotely.
• Edge: Best for real-time tasks, data processed locally.
Cloud computing
Cloud computing is a model that provides users with easy and immediate access to shared resources via the internet. These resources can include networks, servers, storage and applications. Users can quickly access and configure these resources with minimal management and without needing much interaction with the service provider. Instead of saving files on a dedicated hard disk or local storage device, cloud-based storage makes it possible to save them remotely. The five key characteristics of cloud computing are: On-Demand Self-Service, Ubiquitous Network Access, Resource Pooling, Rapid Elasticity and Measured Service.
What is MLOps?
The process of automating and productionizing machine learning applications and workflows. It is a methodology used to implement and deploy ML in IoT. It enables faster model development and deployment by automating key steps such as monitoring, validating and re-training models. It covers continuous integration, continuous deployment, and continuous training.
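The monitoring, validation and re-training steps mentioned above could be automated roughly as follows. This is only a sketch under stated assumptions: the names accuracy, monitor_and_retrain and train_threshold, the 0.9 accuracy threshold and the toy threshold classifier are all hypothetical.

```python
def accuracy(model, examples):
    """Fraction of (features, label) examples the model predicts correctly."""
    correct = sum(1 for x, y in examples if model(x) == y)
    return correct / len(examples)

def monitor_and_retrain(model, recent_examples, train_fn, min_accuracy=0.9):
    """Validate the deployed model on recent data; retrain if it has degraded.

    `train_fn` is a hypothetical callable that returns a new model fitted on
    the examples it is given.
    """
    score = accuracy(model, recent_examples)
    if score >= min_accuracy:
        return model, score, False          # keep the current model
    new_model = train_fn(recent_examples)
    # Simplification: a real pipeline would validate on held-out data instead
    return new_model, accuracy(new_model, recent_examples), True

def train_threshold(examples):
    """Pick the midpoint between the largest 'ok' and the smallest 'hot' reading."""
    max_ok = max(x for x, y in examples if y == "ok")
    min_hot = min(x for x, y in examples if y == "hot")
    cut = (max_ok + min_hot) / 2
    return lambda x: "hot" if x > cut else "ok"

# Hypothetical deployed model whose threshold no longer matches recent data
old_model = lambda x: "hot" if x > 30 else "ok"
recent = [(20, "ok"), (22, "ok"), (26, "hot"), (28, "hot"), (35, "hot")]
model, score, retrained = monitor_and_retrain(old_model, recent, train_threshold)
print(score, retrained)
```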
Which are the elements for building ML systems?
Configuration, automation, data collection, data verification, feature engineering, ML code, testing and debugging, model analysis, process management, metadata management, serving infrastructure, and monitoring.
What is the ML life cycle and what challenges are there in the cycle?
The different steps required to develop and implement a machine learning model, from data collection to deployment.
Business understanding → data collection → data analysis → data processing → modelling → model evaluation and testing → model analysis → trained model → repository → model deployment.
Challenges: when done by hand, the cycle is time-consuming, manual, not reusable, and error-prone.
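One way to address these challenges is to express the steps of the cycle as small functions and chain them, so the whole run is repeatable. The sketch below covers only data processing, modelling, evaluation and a deployment gate; every function name, the error limit and the data are assumptions made for illustration.

```python
def clean(rows):
    """Data processing: drop rows with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def split(rows, test_fraction=0.25):
    """Hold out the last part of the data for evaluation."""
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def train(rows):
    """Modelling: fit y = ratio * x through the origin by least squares."""
    ratio = sum(x * y for x, y in rows) / sum(x * x for x, y in rows)
    return lambda x: ratio * x

def evaluate(model, rows):
    """Model evaluation: mean absolute error on held-out rows."""
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)

def run_pipeline(raw_rows, max_error=1.0):
    """Run every step in order; mark the model as deployable only if it passes evaluation."""
    rows = clean(raw_rows)
    train_rows, test_rows = split(rows)
    model = train(train_rows)
    error = evaluate(model, test_rows)
    deployed = error <= max_error
    return model, error, deployed

# Hypothetical data: (hours of pump operation, litres of water used)
raw = [(1, 10), (2, 20), (None, 15), (3, 30), (4, 40), (5, 50)]
model, error, deployed = run_pipeline(raw)
print(error, deployed)
```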
Clustering
Identification of groups with similar characteristics, dividing data into groups (or ‘clusters’) where the items within each group are more similar to each other than they are to the items in other groups.
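A minimal sketch of clustering is k-means on one-dimensional data: assign each value to its nearest centroid, then move each centroid to the mean of its group, and repeat. The function name kmeans_1d, the starting centroids and the soil-moisture readings are assumptions for illustration.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on one-dimensional data, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical soil-moisture readings (%) from two groups of plants
moisture = [12, 14, 13, 41, 43, 40, 15, 42]
centroids, clusters = kmeans_1d(moisture, centroids=[10, 50])
print(centroids)
print(clusters)
```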
Edge computing
Edge Computing processes data close to where it’s created (e.g., sensors, devices, offices) instead of sending it to a central data center. This reduces latency, increases efficiency and handles large amounts of data from sensors and IoT devices. A local gateway can handle certain applications without needing to send all data to the cloud or a central server.
Benefits:
* Real-time insights
* Faster decision-making
* Only important data is sent to the central data center for more analysis.
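The last benefit can be sketched as a local gateway that summarises each batch of readings and forwards only the out-of-range values. The function name process_at_edge, the payload fields and the temperature range are assumptions for illustration.

```python
def process_at_edge(readings, low, high):
    """Handle a batch of sensor readings on a local gateway.

    Returns a small payload for the central data center: a summary of the
    batch plus only the readings outside the expected [low, high] range.
    """
    out_of_range = [r for r in readings if not (low <= r <= high)]
    return {
        "count": len(readings),
        "mean": sum(readings) / len(readings),
        "alerts": out_of_range,
    }

# Hypothetical temperature batch (degrees Celsius) with an expected range of 18 to 26
batch = [21.0, 21.5, 22.0, 35.0, 21.0]
payload = process_at_edge(batch, low=18.0, high=26.0)
print(payload)
```

Only the payload would be sent upstream, while the raw batch stays on the gateway.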
Why cloud computing
Cloud Computing uses advanced networking, storage, and processing to provide hardware and software resources managed together. This makes security, resource management, and fault tolerance simpler. It’s mainly used for business computing and has a big economic impact.
Key benefits:
• Appears to offer unlimited resources for easy scaling
• Pay-as-you-go pricing means no upfront costs
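The two benefits can be illustrated with a small cost comparison: scaling the number of instances with the load and paying per instance-hour, versus keeping enough capacity for the peak running all the time. All numbers and function names are hypothetical, and the sketch assumes the same unit price in both cases, which is a simplification.

```python
import math

def instances_needed(load, capacity_per_instance):
    """Elastic scaling: run just enough instances to cover the current load."""
    return max(1, math.ceil(load / capacity_per_instance))

def pay_as_you_go_cost(hourly_load, capacity_per_instance, price_per_instance_hour):
    """Bill only for the instances actually running in each hour."""
    return sum(instances_needed(load, capacity_per_instance) * price_per_instance_hour
               for load in hourly_load)

def fixed_capacity_cost(hourly_load, capacity_per_instance, price_per_instance_hour):
    """Provision for the peak hour and keep that capacity running all the time."""
    peak = instances_needed(max(hourly_load), capacity_per_instance)
    return peak * price_per_instance_hour * len(hourly_load)

# Hypothetical numbers: requests per hour, 100 requests per instance, 1 cost unit per instance-hour
load = [50, 80, 120, 900, 300, 60]
elastic = pay_as_you_go_cost(load, 100, 1)
fixed = fixed_capacity_cost(load, 100, 1)
print(elastic, fixed)
```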
Give examples of two cloud computing services
Amazon Web Services (AWS) and Microsoft Azure
Different cloud computing types
Private, Hybrid, Community and Public
Describe the private cloud computing type
Used by a single organisation; it can be hosted internally or externally. It may be managed by the organisation or a third party, and may exist on or off premises. E.g., Jetstream, RedCloud. Pros: may be cheaper, you can keep it off the Internet so data can be safe, you can optimize your own hardware, and you control everything. Cons: you are responsible for everything, there are not as many high-level services, it may not be cheaper, and you must manage physical and system security.