Chapter 3 Data Preprocessing Flashcards

1
Q

What are the methods for imputation?

A

Imputation replaces missing values with estimates. Methods range from simple techniques, such as the mean, median,
and mode, to more complex techniques, such as regression, interpolation, and K-Nearest Neighbors (KNN).
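
The simple end of that range can be sketched in a few lines. This is a minimal illustration, assuming missing values are marked with `None`; the function name `impute` and the sample data are hypothetical and not part of the course material.

```python
from statistics import mean, median, mode

def impute(values, strategy="mean"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    # Pick the summary statistic by name and compute it on observed values only
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]

data = [2, None, 4, 4, None, 10]  # hypothetical column with two missing entries
print(impute(data, "mean"))
print(impute(data, "median"))
print(impute(data, "mode"))
```

Regression or KNN imputation follows the same pattern, except that the fill value is predicted per row from other columns rather than shared across all gaps.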

2
Q

What is a moving average?

A

A moving average smooths time-series data by averaging a sliding window of adjacent data points over a specified time period.

3
Q

What techniques are used to calculate a moving average?

A

Moving averages can be calculated using a variety of techniques, including simple moving average (SMA), weighted moving average (WMA), and exponential moving average (EMA).
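
The three variants differ only in how they weight the points in the window. The sketch below is one possible plain-Python formulation; the linearly increasing weights in `wma` and the smoothing factor `alpha` in `ema` are common conventions assumed here, not definitions taken from the course material.

```python
def sma(xs, window):
    """Simple moving average: every point in the window has equal weight."""
    return [sum(xs[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(xs))]

def wma(xs, window):
    """Weighted moving average: weights 1..window, newest point weighted most."""
    weights = list(range(1, window + 1))
    total = sum(weights)
    return [sum(w * x for w, x in zip(weights, xs[i - window + 1:i + 1])) / total
            for i in range(window - 1, len(xs))]

def ema(xs, alpha):
    """Exponential moving average: older points decay by a factor of (1 - alpha)."""
    out = [xs[0]]  # seed with the first observation
    for x in xs[1:]:
        out.append(alpha * x + (1 - alpha) * out[-1])
    return out

series = [1, 2, 3, 4, 5]  # hypothetical time series
print(sma(series, 3))
print(wma(series, 3))
print(ema(series, 0.5))
```

SMA and WMA return one value per full window, so their output is shorter than the input; EMA returns one value per input point.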

4
Q

What is Gaussian smoothing?

A

Gaussian smoothing is a data-smoothing method that convolves the data with a
Gaussian kernel, a bell-shaped curve that assigns larger weights to nearby
data points and smaller weights to more distant ones.
The method is also known as Gaussian blur or Gaussian filtering.
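
For a one-dimensional signal, the convolution can be sketched as follows. The parameter names `sigma` and `radius`, the default radius of three standard deviations, and the edge handling (repeating the boundary value) are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(sigma, radius):
    """Bell-shaped weights for offsets -radius..radius, normalized to sum to 1."""
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_smooth(xs, sigma=1.0, radius=None):
    """Convolve xs with a Gaussian kernel, repeating edge values at the borders."""
    if radius is None:
        radius = max(1, int(3 * sigma))  # assumed cutoff: about three sigmas
    kernel = gaussian_kernel(sigma, radius)
    n = len(xs)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to valid range
            acc += w * xs[j]
        out.append(acc)
    return out

spike = [0, 0, 0, 10, 0, 0, 0]  # hypothetical signal with a single spike
print(gaussian_smooth(spike, sigma=1.0))
```

A larger `sigma` widens the bell curve, so more distant neighbors contribute and the result is smoother.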

5
Q

Describe the characteristics of ETL (Extract, Transform, Load).

A

Maintenance
Requires more maintenance and more knowledge.

Processing time
Processing time increases with data volume, because all transformations must take place before the data is loaded.

Infrastructure
Typically requires an on-premises environment, which is expensive and difficult to scale.

Costs
High initial and running costs.

6
Q

Describe the characteristics of ELT (Extract, Load, Transform).

A

Maintenance
Virtually maintenance-free, because raw data is moved as-is.

Processing time
Processing time depends much less on data volume, because raw data is loaded without being transformed first.

Infrastructure
Uses cloud services such as SaaS or PaaS, which do not need to be installed and which scale dynamically.

Costs
Low start-up costs; ongoing costs depend on data volume.
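
The structural difference between ETL and ELT is the order of the load and transform steps. The toy sketch below makes that order visible; the records, the list standing in for a warehouse, and all function names are hypothetical.

```python
def extract():
    # Hypothetical raw records from a source system
    return [{"name": " Ada ", "amount": "10"}, {"name": "Bob", "amount": "32"}]

def transform(rows):
    # Clean up whitespace and convert amounts from text to integers
    return [{"name": r["name"].strip(), "amount": int(r["amount"])} for r in rows]

def etl(warehouse):
    """ETL: transform first, so only cleaned rows reach the warehouse."""
    warehouse.extend(transform(extract()))
    return warehouse

def elt(warehouse):
    """ELT: load raw rows first, then transform them inside the target."""
    warehouse.extend(extract())
    return transform(warehouse)

print(etl([]))
print(elt([]))
```

Both paths yield the same cleaned rows here. In the ELT path the warehouse itself keeps the raw data, which is why loading stays fast regardless of how heavy the later transformations are.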

7
Q

How is data integration helpful in business?

A

In the business world, data integration helps organizations to gain a unified view of their operations, customers, and markets, which can then be used for reporting, analysis, and decision-making purposes. Data integration can help businesses to streamline their processes, reduce costs, and improve their overall performance.

8
Q

How is data integration helpful in healthcare?

A

In healthcare, data integration helps to combine data from various sources such as electronic health records, lab reports, and medical imaging, to provide a comprehensive view of a patient’s health history. This can help healthcare providers to make more informed decisions about patient care, improve patient outcomes, and reduce healthcare costs.

9
Q

How is data integration helpful in manufacturing?

A

In manufacturing, data integration is used to combine data from various
sources, such as production systems, sensors, and supply chain systems, to provide a unified view of the manufacturing process. This can help manufacturers to optimize their
production processes, reduce costs, and improve product quality.
