Introduction To Data Mining Flashcards

1
Q

4 Reasons for Using Data Mining

A

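Existing solutions that need heavy fine-tuning or long lists of rules
Complex problems with no good traditional solution
Fluctuating environments that require adapting to new data
Getting insights into complex problems and large amounts of data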

2
Q

4 Motivating Challenges for Data Mining

A

Large-scale
High dimensional
Heterogeneous and Complex
Distributed

3
Q

2 Types of Data Mining Tasks

A

Predictive Methods
Descriptive Methods

4
Q

8 Main Challenges of Data Mining

A

Insufficient Quantity of Training Data
Irrelevant Features
Nonrepresentative Training Data
Poor-Quality Data - Outliers, Missing values, Noise, Errors
Overfitting the Training Data
Underfitting the Training Data
Hyperparameter Tuning and Model Selection
Testing and Validating
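
The "Testing and Validating" challenge is usually handled by holding back part of the data and measuring performance only on that part. A minimal sketch follows, assuming a simple random holdout split; the function name `train_test_split`, the 20% ratio, and the seed are illustrative choices, not something the deck specifies.

```python
import random


def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows and hold back a fraction for testing."""
    shuffled = rows[:]
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_ratio)
    cut = len(shuffled) - test_size
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))
train, test = train_test_split(rows)
```

A model that scores well on `train` but poorly on `test` is likely overfitting; one that scores poorly on both is likely underfitting.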

5
Q

Labelled Data

A

Data tagged with one or more labels that indicate specific attributes, characteristics, classes, or contained objects.
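
As a small illustration, labelled data can be pictured as pairs of features and a label. The fruit records below are made up for this sketch and do not come from any real dataset.

```python
# Each labelled example pairs a feature dictionary with a class label.
labelled = [
    ({"weight_g": 150, "colour": "red"}, "apple"),
    ({"weight_g": 120, "colour": "yellow"}, "banana"),
    ({"weight_g": 160, "colour": "green"}, "apple"),
]

# Dropping the labels leaves unlabelled data: the features alone.
unlabelled = [features for features, _ in labelled]

# The distinct labels are the classes a supervised model would learn to predict.
classes = sorted({label for _, label in labelled})
```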

6
Q

Clustering

A

Grouping data points so that points in the same group are similar to each other and different from points in other groups.
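
One common way to do this is k-means, shown here as a rough sketch on one-dimensional data. The choice of k-means, the function name `kmeans_1d`, and the starting centroids are assumptions made for illustration; the card does not name a specific algorithm.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Group 1-D points around the nearest centroid (a basic k-means loop)."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
```

With these inputs the small values end up in one cluster and the large values in the other, which matches the "similar within, different between" idea on the card.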

7
Q

4 Types of Machine Learning Systems Based on Amount of Supervision

A

Supervised Learning
Unsupervised Learning
Semi-supervised Learning
Reinforcement Learning

8
Q

3 Features of Online Learning

A

The system is trained incrementally
Data instances are fed sequentially
Instances arrive individually or in small groups (mini-batches)
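
The three features above can be sketched with a one-parameter linear model updated by gradient descent on each mini-batch. The class `OnlineLinearModel`, the method name `partial_fit`, the learning rate, and the synthetic stream (where y = 2x) are all hypothetical choices for this example.

```python
def mini_batches(stream, size):
    """Yield successive small groups of items from a data stream."""
    batch = []
    for item in stream:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


class OnlineLinearModel:
    """Fits y = w * x, updating w a little after every mini-batch."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0
        self.learning_rate = learning_rate

    def partial_fit(self, batch):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (self.w * x - y) * x for x, y in batch) / len(batch)
        self.w -= self.learning_rate * grad


# Synthetic stream for illustration: the true relationship is y = 2x.
stream = ((x, 2.0 * x) for x in [1.0, 2.0, 3.0] * 50)

model = OnlineLinearModel()
for batch in mini_batches(stream, size=3):
    model.partial_fit(batch)  # each batch is seen once, then discarded
```

Because each batch is used once and then dropped, the full dataset never has to fit in memory, and the model keeps adjusting as new data arrives.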

9
Q

2 Situations Where Online Learning Is Useful

A

Huge datasets
Data as a continuous flow

10
Q

3 Characteristics of Batch Learning

A

Trained from scratch on the full dataset
Requires a lot of computing resources
Cannot adapt to rapidly changing data

11
Q

3 Steps of Knowledge Discovery in Databases (KDD)

A

Preprocessing
Data Mining
Postprocessing
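
The three KDD steps can be pictured as a small pipeline. In this sketch the "mining" step is just a frequency count, chosen for brevity; the function names, the shopping items, and the `min_count` threshold are illustrative assumptions rather than part of the KDD definition.

```python
from collections import Counter


def preprocess(raw_records):
    """Preprocessing: drop missing or empty values and normalise the text."""
    return [r.strip().lower() for r in raw_records if r is not None and r.strip()]


def mine(records):
    """Data mining: here, simply count how often each item occurs."""
    return Counter(records)


def postprocess(patterns, min_count):
    """Postprocessing: keep only the patterns frequent enough to report."""
    return {item: n for item, n in patterns.items() if n >= min_count}


raw = [" Milk", "bread", None, "milk ", "", "Eggs", "BREAD", "milk"]
knowledge = postprocess(mine(preprocess(raw)), min_count=2)
```

Each stage takes the output of the previous one, so the raw input is cleaned before any patterns are extracted, and weak patterns are filtered out before results are reported.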

12
Q

6 Steps of the Cross-Industry Standard Process for Data Mining (CRISP-DM)

A

Business Understanding
Data Understanding
Data Preparation
Modelling
Evaluation
Deployment
