AI Project Cycle Flashcards

1
Q

What is the AI Project Cycle?

A

The AI Project Cycle is a framework used to design an AI project. It provides an organized plan for breaking the task of program development into manageable modules. The project cycle consists of five stages, namely: Problem Scoping, Data Acquisition, Data Exploration, Modelling and Evaluation.

2
Q

What is problem scoping?

A

Problem scoping refers to understanding a problem and finding out the various factors that affect it. In this stage of the AI Project Cycle, the 4Ws Problem Canvas is used; it helps the user answer questions related to the problem and thereby arrive at a definite problem statement. The 4Ws are Who, What, Where and Why.

3
Q

What is data acquisition? What are the various ways in which students can collect data?

A

Data acquisition refers to acquiring authentic data, crucial for the AI model, from reliable sources. The acquired data can be divided into two categories: training data and testing data. The AI model is trained on the training data and evaluated on the testing data. The various ways in which students can collect data are: surveys, cameras, sensors, observations, web scraping, Application Programming Interfaces (APIs), etc.

4
Q

What is loopy?

A

Loopy is an open-source tool for understanding the concept of system maps. A system map shows the components and boundaries of a system, and the components of its environment, at a specific point in time. With the help of a system map, one can easily define the relationships among the different elements that make up a system. The map shows the cause-and-effect relationships between elements with the help of arrows: the arrowhead depicts the direction of the effect, and a sign (+ or -) shows the kind of relationship. A + sign depicts a positive relationship and a - sign depicts a negative relationship between elements. Considering the data features of any problem to be solved, a system map can be drawn.

5
Q

What is Data Exploration? Why should we visualize the acquired data in some user-friendly format?

A

Data exploration is the process of refining the gathered data, which needs to be arranged uniformly for better understanding. After acquiring the data comes the need to analyze it. For this, we need to visualize the acquired data in some user-friendly format so that we can:
- Quickly get a sense of the trends, relationships and patterns contained within the data.
- Define a strategy for which model to use at a later stage.
- Communicate the findings to others effectively.
Data exploration basically refers to visualizing the data to determine the patterns, relationships between elements and trends in the dataset, giving a clear understanding of it. Data exploration is important as it helps the user select an AI model, which is crucial in the next stage of the AI Project Cycle. To visualize the data, various types of visual representations can be used, such as diagrams, charts, graphs, flows and so on.
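As a minimal sketch of what "getting a sense of the data" can mean in practice, the pure-Python snippet below summarises a small hypothetical dataset (the salary figures are invented for illustration) and checks for a simple trend:

```python
# Hypothetical monthly salaries used only for illustration.
salaries = [30000, 32000, 34500, 36000, 39000]

# Basic summary statistics: range and average of the data.
minimum = min(salaries)
maximum = max(salaries)
mean = sum(salaries) / len(salaries)

# A simple trend check: is each value larger than the previous one?
increasing = all(a < b for a, b in zip(salaries, salaries[1:]))

print(minimum, maximum, mean, increasing)  # 30000 39000 34300.0 True
```

In a real project this step would typically use charts and graphs, but even a numeric summary like this already reveals the range and upward trend of the data.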

6
Q

Explain the “Who” block.

A

The “Who” block helps in analyzing the people who are getting affected, directly or indirectly, by the problem. Under this, we find out who the stakeholders of the problem are and what we know about them. Stakeholders are the ones being affected by the problem, who would benefit from the solution.

7
Q

Explain the “What” block.

A

At this stage, you determine the nature of the problem. Under this block, you gather evidence that the selected problem actually exists. Newspaper articles, media reports and announcements are some examples of such evidence.

8
Q

Explain the “Where” block.

A

This block helps you look into the situation where the problem arises, the context of it, and the locations where it is prominent.

9
Q

Explain the “Why” block.

A

In this block, we think about the benefits the solution would bring to the stakeholders and how it would benefit society at large.

10
Q

What is data?

A

Data can be a piece of information or facts and statistics collected together for reference or analysis. Whenever we want an AI project to predict an output, we need to train it first using data.
For example, if you want to make an artificially intelligent system that can predict the salary of an employee based on his previous salaries, you would feed the data of his previous salaries into the machine. This is the data with which the machine is trained. Once it is ready, it can predict his next salary efficiently. The previous salary data here is known as the training data, while the next-salary prediction data set is known as the testing data.
For better efficiency of an AI project, the training data needs to be relevant and authentic. In the previous example, if the training data was not of the previous salaries but of his expenses, the machine would not have predicted his next salary correctly, since the whole training went wrong. Similarly, if the previous salary data was not authentic, that is, it was not correct, then too the prediction could have gone wrong. Hence, for any AI project to be efficient, the training data should be authentic and relevant to the problem statement scoped.

11
Q

What are data features?

A

Data features refer to the type of data you want to collect. In our previous example, the data features would be salary amount, increment percentage, increment period, bonus, etc. After listing the data features, you know what type of data is to be collected.

12
Q

What is one of the most authentic and reliable sources of information?

A

One of the most authentic and reliable sources of information is the open-source websites hosted by governments. These websites have general information collected in a suitable format, which can be accessed and used wisely.

13
Q

What is Modelling?

A

Data is the fuel of artificial intelligence. A machine is said to be artificially intelligent if it gets trained with data, can make decisions/predictions on its own, and can learn and improve from its own experience and mistakes. In the modelling stage, the data is split into a training set and a testing set. The model is trained on the training set, from which it derives its own rules that help it produce an output, and is then evaluated on the testing set. To build an AI-based project, we need to work with AI models, or intelligent algorithms.
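A minimal sketch of splitting a dataset into training and testing sets, assuming a conventional 80/20 ratio (the records and the fixed seed below are hypothetical choices for illustration):

```python
import random

def train_test_split(data, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data and split it into (train, test)."""
    rng = random.Random(seed)     # fixed seed so the split is repeatable
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

records = list(range(10))         # ten hypothetical data records
train, test = train_test_split(records)
print(len(train), len(test))      # 8 2
```

Shuffling before splitting matters: if the data were ordered (say, by date), a plain cut would give the model a testing set unlike anything it trained on.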

14
Q

Differentiate between Rule Based Approach and Learning Based Approach.

A

Generally, AI models can be classified as follows:
- Rule Based Approach: Refers to AI modelling where the rules are defined by the developer. The machine follows the rules or instructions given by the developer and performs its task accordingly. A decision tree is a rule-based AI model for solving classification or regression problems; it helps the machine predict an element with the help of the various rules fed to it.
- Learning Based Approach: Refers to AI modelling where the machine learns by itself. Under the learning-based approach, the AI model gets trained on the data fed to it and is then able to adapt to changes in the data. For example, if the machine is fed X type of data, the model forms an algorithm around it and modifies itself when the data changes, so that exceptions are handled.
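A minimal sketch of the rule-based approach: a hand-written decision tree in which every rule is fixed by the developer. The fruit categories and thresholds below are hypothetical, chosen only to illustrate the idea:

```python
def classify_fruit(weight_g, colour):
    """Classify a fruit using fixed, developer-written rules
    (a tiny decision tree: first branch on colour, then on weight)."""
    if colour == "yellow":
        if weight_g < 200:
            return "banana"
        return "melon"
    else:
        if weight_g < 150:
            return "plum"
        return "apple"

print(classify_fruit(120, "yellow"))  # banana
print(classify_fruit(180, "red"))     # apple
```

The contrast with the learning-based approach is that here the machine never changes these rules; if the data changes (say, a new fruit appears), the developer must rewrite the tree by hand.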

15
Q

Explain supervised learning. What are its types?

A

In a supervised learning model, the dataset fed into the machine is labelled. In other words, the dataset is known to the person training the machine; only then is he/she able to label the data. A label is some information which can be used as a tag for the data. For example, students get grades according to the marks they obtain in their examinations; in this case, the grades are the labels.
There are two types of supervised learning models:
a) Classification: Where the data is classified according to the labels. For example, in the grading system, students get grades according to the marks they secure in the examination. This model works on a discrete dataset, meaning the data need not be continuous.
b) Regression: Such models work on continuous data. For example, if you wish to predict your next salary, you would feed in data about your previous salary, any increments, etc., and train the model. Here, the data fed into the machine is continuous.
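A minimal regression sketch in pure Python: fitting a straight line by least squares to a few previous salaries and predicting the next one. The salary figures are hypothetical and deliberately follow a perfect linear pattern so the prediction is easy to check:

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
            / sum((x - mean_x) ** 2 for x in xs)
    intercept = mean_y - slope * mean_x
    return slope, intercept

years = [1, 2, 3, 4]
salaries = [30000, 32000, 34000, 36000]   # rises by 2000 each year

slope, intercept = fit_line(years, salaries)
next_salary = slope * 5 + intercept       # predict year 5
print(next_salary)  # 38000.0
```

Because the label here (salary) is a continuous number rather than a category, this is regression; a classification model would instead output a discrete label such as a grade.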

16
Q

What is unsupervised learning?

A

In unsupervised learning, the dataset fed to the machine is unlabelled. This means the data is random, and it is possible that the person training the model has no information about it. Unsupervised learning models are used to identify relationships, patterns and trends in the data fed into them. They help the user understand what the data is about and what its major features are, as identified by the machine. For example, you might have random data of 1000 dog images and wish to find some pattern in it. Unsupervised learning algorithms can be further classified into two categories:
a) Clustering: Refers to an unsupervised learning algorithm which can cluster unknown data according to the trends and patterns identified in it. The patterns observed might be ones already known to the developer, or the algorithm might come up with some unique patterns of its own.
b) Dimensionality Reduction: We humans can visualize only up to 3 dimensions, but according to many theories and algorithms, various entities exist beyond 3 dimensions. For example, in Natural Language Processing, words are considered to be N-dimensional entities, which means we cannot visualize them as they exist beyond our visualization ability. Hence, to make sense of them, we need to reduce their dimensions; this is where a dimensionality reduction algorithm is used. As we reduce the dimensions of an entity, the information it contains starts getting distorted. For example, a ball in your hand is 3-dimensional, but if we click its picture, the data transforms to 2-D, as an image is a 2-dimensional entity. As soon as we reduce one dimension, at least 50% of the information is lost, since we no longer know about the back of the ball: was it the same colour at the back, or was it just a hemisphere? If we reduce the dimensions further, more and more information is lost. Hence, dimensionality reduction tries to reduce the dimensions while still keeping the data meaningful.
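A minimal clustering sketch: one-dimensional k-means with k = 2, written in pure Python. The data points and starting centres below are hypothetical; the point is to show how the algorithm groups unlabelled data by itself:

```python
def kmeans_1d(points, centres, steps=10):
    """Alternate between assigning each point to its nearest centre and
    moving each centre to the mean of the points assigned to it."""
    for _ in range(steps):
        clusters = {c: [] for c in centres}
        for p in points:
            nearest = min(centres, key=lambda c: abs(c - p))
            clusters[nearest].append(p)
        centres = [sum(ps) / len(ps) for ps in clusters.values() if ps]
    return sorted(centres)

# Two obvious groups: values near 2 and values near 11 (no labels given).
data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
print(kmeans_1d(data, centres=[0.0, 9.0]))  # [2.0, 11.0]
```

Note that the algorithm was never told which group each number belongs to; the two clusters emerge purely from the structure of the data, which is the essence of unsupervised learning.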

17
Q

What is Reinforcement Learning?

A

This is a reward-based AI model. It is a self-teaching system that essentially learns by trial and error. It performs various actions with the aim of maximizing positive rewards. For every right action or decision, the algorithm receives positive reinforcement; for every wrong action, it receives negative reinforcement. In this way, it learns which actions need to be performed and which do not. This type of learning can assist in industrial automation.
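A minimal reward-based sketch: an agent repeatedly tries two actions, receives +1 (positive reinforcement) or -1 (negative reinforcement), and keeps a running value estimate for each action. The action names, reward rules and learning rate are all hypothetical:

```python
rewards = {"good_action": 1, "bad_action": -1}   # the environment's responses
values = {"good_action": 0.0, "bad_action": 0.0} # the agent's estimates
alpha = 0.5                                      # learning rate

for _ in range(4):                     # a few trial-and-error episodes
    for action, reward in rewards.items():
        # Nudge the estimate toward the reward actually observed.
        values[action] += alpha * (reward - values[action])

best = max(values, key=values.get)
print(best)  # good_action
```

After a few episodes the estimates drift toward +1 and -1 respectively, so the agent learns to prefer the rewarded action, which is the trial-and-error idea described above in miniature.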

18
Q

What is Evaluation?

A

Model Evaluation is the last stage of the AI project development cycle. Evaluation is the process of understanding the reliability of an AI model by feeding the test dataset into the model and comparing its outputs with the actual answers. Model evaluation is an integral part of the model development process: it helps to find the best model that represents our data and shows how well the chosen model will work in the future. There can be different evaluation techniques, depending on the type and purpose of the model.
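A minimal evaluation sketch: comparing a model's predictions on the test set with the actual answers and computing accuracy, one of the simplest evaluation metrics. The labels below are hypothetical:

```python
# Actual answers from the test set vs. the model's predictions.
actual    = ["cat", "dog", "cat", "cat", "dog"]
predicted = ["cat", "dog", "dog", "cat", "dog"]

# Accuracy = fraction of predictions that match the actual answers.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)
print(accuracy)  # 0.8
```

Accuracy is only one possible technique; as the card notes, the right metric depends on the type and purpose of the model.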

19
Q

What is a Neural network?

A

Neural networks are loosely modelled on how neurons in the human brain behave. The key advantage of neural networks is that they are able to extract features automatically, without needing input from the programmer. A neural network is essentially a system of organizing machine learning algorithms to perform certain tasks. It is a fast and efficient way to solve problems for which the dataset is very large, such as with images.
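A minimal sketch of a single neural-network node: it computes a weighted sum of its inputs plus a bias, then applies an activation function. The weights below are hand-picked for illustration, not learned by training:

```python
def step(x):
    """A simple threshold activation: fire (1) if the sum is positive."""
    return 1 if x > 0 else 0

def neuron(inputs, weights, bias):
    """One node: weighted sum of inputs plus bias, passed through the activation."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return step(weighted_sum)

# With these hand-chosen weights the node behaves like a logical AND gate:
# it fires only when both inputs are 1.
print(neuron([1, 1], [0.5, 0.5], -0.7))  # 1
print(neuron([1, 0], [0.5, 0.5], -0.7))  # 0
```

A real network stacks many such nodes in layers and learns the weights from data, which is how it extracts features automatically rather than relying on the programmer to choose them.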

20
Q

List the features of a Neural Network.

A
  • Neural network systems are modelled on the human brain and nervous system.
  • Every neural network node is essentially a machine learning algorithm.
  • They are able to automatically extract features without needing input from the programmer.
  • They are useful when solving problems for which the dataset is very large.
21
Q

What is validating data?

A

It is also called the secondary dataset. This data is used to check whether the newly developed model is correctly identifying data when making predictions. This step ensures that the model has not become specific to the primary dataset's values when making predictions. If that is the case, corrections and tweaks are made to the project. The primary and secondary datasets are also rerun through the model until the desired accuracy is achieved.

22
Q

What is Testing data?

A

All primary and secondary data come with relevant label tags. The testing data is the final dataset, which provides no help in the form of tags to the model produced. This dataset paves the way for the machine model to come into the real world and start making predictions.

23
Q

What is data warehousing?

A

Data is always collected in bulk from various sources and stored in various formats. This is called data warehousing.