Data Science Terms Flashcards

Question 1

Q

What is data science?

Answer

A

The combination of business, analytical, and programming skills used to extract meaningful insights from raw data.

Question 2

Q

Deep learning

Answer

A

The application of multi-layered computational (neural) networks. Deep learning is a subset of machine learning that trains a computer to perform human-like tasks, such as speech recognition, image identification, and prediction making.

Question 3

Q

Artificial intelligence

Answer

A

A set of approaches that enable computers to emulate, and thus automate, cognitive behaviour, often based on learning from data.

Question 4

Q

Machine learning

Answer

A

Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.

Question 5

Q

Benefits of data science

Answer

A

-enable organizations to make better decisions
-enhance operational efficiency, business routines and workflows
-identify target audiences and inform companies about them
-assist the automated aspects of HR

Question 6

Q

Training set

Answer

A

The dataset from which a machine learning model learns its desired task.

Question 7

Q

Testing set

Answer

A

Data held out from training and used to measure the performance of the developed machine learning model.
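As a minimal sketch of how a training set and a testing set can be produced from one dataset, the records can be shuffled and then split. The helper name `train_test_split`, the 20% test fraction, and the fixed seed are assumptions made for illustration, not part of the card.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into training and testing sets."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)               # copy, so the input is not reordered
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

records = list(range(10))                  # hypothetical data records
train_set, test_set = train_test_split(records, test_fraction=0.2)
print(len(train_set), len(test_set))
```

The model is fitted on `train_set` only, so its score on `test_set` reflects data it has not seen.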

Question 8

Q

Outlier

Answer

A

A data record that is seen as exceptional and lies outside the distribution of the normal input data.
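One common rule of thumb for spotting outliers flags values that lie more than 1.5 interquartile ranges beyond the quartiles. The card does not prescribe this rule; it is an assumption chosen here for illustration, as are the sample values.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k interquartile ranges outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

values = [10, 12, 11, 13, 12, 11, 95]  # hypothetical measurements
outliers = iqr_outliers(values)
print(outliers)
```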

Question 9

Q

Data cleansing

Answer

A

The process of removing redundant data, handling missing data entries, and removing, or at least alleviating, other data quality issues.
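A small sketch of two of these steps, dropping exact duplicate records and filling missing entries. The record layout, the `height` field, and the choice to fill gaps with the median are all assumptions for illustration; real cleansing rules depend on the dataset.

```python
import statistics

raw = [
    {"id": 1, "height": 170.0},
    {"id": 2, "height": None},    # missing entry
    {"id": 1, "height": 170.0},   # exact duplicate of the first record
    {"id": 3, "height": 180.0},
]

def cleanse(records):
    """Drop exact duplicates, then fill missing heights with the median height."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))   # hashable fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(dict(rec))       # copy, so the raw data stays untouched
    observed = [r["height"] for r in unique if r["height"] is not None]
    fill = statistics.median(observed)
    for r in unique:
        if r["height"] is None:
            r["height"] = fill
    return unique

clean = cleanse(raw)
print(clean)
```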

Question 10

Q

Feature

Answer

A

An observable, measurable property of the data, e.g. height or length. Other terms used instead of feature include property, characteristic, and attribute.

Question 11

Q

Dimensionality reduction

Answer

A

The process of reducing a dataset to fewer dimensions while ensuring that it conveys similar information.

Question 12

Q

Feature selection

Answer

A

The process of selecting the relevant features from the provided dataset.

Question 13

Q

Supervised learning

Answer

A

The subset of machine learning that is based on labelled data. It can be further divided into regression and classification.

Question 14

Q

Unsupervised learning

Answer

A

The subset of machine learning that is based on unlabelled data. Typical unsupervised tasks are clustering and dimensionality reduction.

Question 15

Q

Probability

Answer

A

A quantification of how likely it is that a certain event occurs, or of the degree of belief in a given proposition.

Question 16

Q

Standard deviation

Answer

A

A measure of how spread out the data values are
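To make the idea of spread concrete, the population standard deviation is the square root of the average squared distance from the mean. The sample values below are made up for illustration.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data values

mean = sum(values) / len(values)
# Average squared deviation from the mean (population variance).
variance = sum((v - mean) ** 2 for v in values) / len(values)
std_dev = math.sqrt(variance)

print(std_dev)
# The standard library gives the same population value:
print(statistics.pstdev(values))
```

For a sample rather than a full population, `statistics.stdev` divides by n - 1 instead of n.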

Question 17

Q

Type I error

Answer

A

A false positive: the actual value was negative, but it was predicted as positive.

Question 18

Q

Type II error

Answer

A

A false negative: the actual value was positive, but it was predicted as negative.
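Both error types can be counted by comparing actual labels with predicted ones. In this sketch the labels are invented, with 1 standing for positive and 0 for negative.

```python
# Hypothetical labels: 1 = positive, 0 = negative.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 1, 0, 1, 0, 1, 1, 0]

pairs = list(zip(actual, predicted))

# Type I error: actually negative, predicted positive.
false_positives = sum(1 for a, p in pairs if a == 0 and p == 1)
# Type II error: actually positive, predicted negative.
false_negatives = sum(1 for a, p in pairs if a == 1 and p == 0)

print(false_positives, false_negatives)
```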

Question 19

Q

Decision model

Answer

A

A model that assesses the relationships between the elements of the provided data to recommend a possible decision for a given situation.

Question 20

Q

Regression

Answer

A

A forecasting technique that estimates the functional relationship between input and output variables.
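The simplest case is a straight line fitted by ordinary least squares. This sketch assumes one input variable and uses invented data; the helper name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]      # hypothetical input variable
ys = [3, 5, 7, 9, 11]     # hypothetical output variable
slope, intercept = fit_line(xs, ys)

# Forecast the output for an unseen input.
forecast = slope * 6 + intercept
print(slope, intercept, forecast)
```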

Question 21

Q

Cluster analysis

Answer

A

A type of unsupervised learning used to partition a set of data records into clusters. Records in a cluster are more similar to each other than to records in another cluster.
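k-means is one widely used clustering method; the card does not name a specific algorithm, so its use here is an assumption. The sketch below works on one-dimensional values with hand-picked starting centroids, both chosen to keep the example short.

```python
def k_means_1d(values, centroids, iterations=10):
    """Partition 1-D values into clusters around the given starting centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]   # hypothetical data records
centroids, clusters = k_means_1d(values, centroids=[0.0, 5.0])
print(centroids, clusters)
```

No labels are supplied: the grouping comes only from how close the values are to each other.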

Question 22

Q

Classification

Answer

A

A machine learning approach that categorises entities into predefined classes.

Question 23

Q

Data science related activities

Answer

A

-understand the problem
-collect enough data
-process the raw data
-explore the data
-analyze the data
-communicate the results

(23 cards)