Chapter 20 Flashcards

1
Q

What are the types of data duplication caused by a non-unique primary key?

A
  • Duplicate identification numbers
  • Householding
  • Individualization
2
Q

What are the two subtypes of duplicate identification numbers?

A
  • Multiple customer numbers
  • Multiple employee numbers

3
Q

What is householding as a data duplication problem?

A

Multiple people live in the same house. For example, several people or families share one home and all of them have bank accounts. When the bank sends an advertisement to its account holders, it ends up sending multiple brochures to the same house.
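
A minimal sketch of how the brochure problem could be avoided, assuming each account is a (holder, address) pair and that lowercasing plus whitespace cleanup is enough to make addresses comparable. The function name and the sample data are hypothetical; real address matching is usually much harder.

```python
from collections import defaultdict


def group_by_household(accounts):
    """Group account holders by a naively normalized address."""
    households = defaultdict(list)
    for holder, address in accounts:
        # Assumption: lowercase and collapse repeated spaces so that
        # "12  Canal Road" and "12 canal road" count as the same house.
        normalized = " ".join(address.lower().split())
        households[normalized].append(holder)
    return dict(households)


# Hypothetical sample data: two account holders share one address.
accounts = [
    ("Ahad", "12 Canal Road"),
    ("Sara", "12  canal road"),
    ("Bilal", "7 Mall Road"),
]
households = group_by_household(accounts)
# One brochure per key in `households` instead of one per account holder.
```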

4
Q

What is individualization as a data duplication problem?

A

There is one person behind the records, but the data makes it look like more than one (e.g. "Mr. Ahad" and "Major Ahad").

5
Q

What does purge mean?

A

Removing duplicate records.

6
Q

What do 0 and 1 represent in the degree of similarity?

A

1 - near (the two records are very similar)

0 - far (the two records are very different)
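
As an illustration of a score on this 0-to-1 scale, here is a small sketch using `SequenceMatcher.ratio()` from Python's standard `difflib` module. This is only one possible measure, chosen as an assumption for the example; the card does not say which measure is intended, and the helper name `similarity` is hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a score in [0, 1]: 1 means near (identical), 0 means far."""
    # Lowercase first so that differences in case do not lower the score.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


same = similarity("Ahad", "ahad")           # identical after lowercasing
unrelated = similarity("abc", "xyz")        # no characters in common
close = similarity("Mr Ahad", "Major Ahad")  # somewhere in between
```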

7
Q

What is the BSN method?

A

The Basic Sorted Neighborhood (BSN) method is a method for removing duplicate records. It has 3 steps.
1- Identify the attributes of a record that will make up the key of each record.
2- Sort the records on the basis of that key.
3- Move a fixed-size set of neighboring records, called a window, through the sorted list. Comparing only the records that fall inside the same window helps us identify duplicates.
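
The three steps can be sketched as follows. This is a minimal illustration, not a reference implementation: the record fields, the key recipe in `make_key`, the name-similarity test in `is_match`, and the 0.8 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def make_key(record):
    """Step 1: build a key from parts of the record's attributes."""
    # Hypothetical recipe: 3 letters of the surname + 3 letters of the city.
    return (record["surname"][:3] + record["city"][:3]).upper()


def is_match(a, b, threshold=0.8):
    """Assumed matching rule: full names are similar enough."""
    name_a = f'{a["first"]} {a["surname"]}'.lower()
    name_b = f'{b["first"]} {b["surname"]}'.lower()
    return SequenceMatcher(None, name_a, name_b).ratio() >= threshold


def bsn_duplicates(records, window=3):
    """Return pairs of records flagged as likely duplicates."""
    if window < 2:
        raise ValueError("window size must be at least 2")
    ordered = sorted(records, key=make_key)  # Step 2: sort on the key
    pairs = []
    for i, current in enumerate(ordered):    # Step 3: slide the window
        # Compare only with the (window - 1) records just before this one.
        for previous in ordered[max(0, i - window + 1):i]:
            if is_match(previous, current):
                pairs.append((previous, current))
    return pairs


# Hypothetical sample data containing one repeated customer.
customers = [
    {"first": "Ahad", "surname": "Khan", "city": "Lahore"},
    {"first": "Sara", "surname": "Malik", "city": "Karachi"},
    {"first": "Ahad", "surname": "Khan", "city": "Lahore"},
    {"first": "Bilal", "surname": "Shah", "city": "Multan"},
]
duplicate_pairs = bsn_duplicates(customers, window=2)
```

Sorting brings the two "Ahad Khan" records next to each other, so even the smallest window of 2 is enough to flag them here.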

8
Q

What is the BSN window?

A

The sorted data is organized into sets of neighboring records. Each set is called a window. Comparing the records inside each window helps us identify duplicates. The minimum BSN window size is 2 and the maximum is the entire list.
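
To see why the window size matters, the sketch below counts how many record comparisons a sliding window needs. It assumes each record is compared with the (window - 1) records just before it in sorted order; the function name is hypothetical.

```python
def window_comparisons(n_records, window):
    """Count pairwise comparisons for a sliding window of the given size."""
    if window < 2:
        raise ValueError("window size must be at least 2")
    # The window cannot usefully be larger than the whole list.
    window = min(window, max(n_records, 2))
    # Record i (0-based) is compared with at most (window - 1) predecessors.
    return sum(min(i, window - 1) for i in range(n_records))


smallest = window_comparisons(5, 2)  # adjacent pairs only: 4 comparisons
largest = window_comparisons(5, 5)   # every pair: 5 * 4 / 2 = 10 comparisons
```

With the window equal to the entire list, every record is compared with every other record, which is the cost the sorted window is meant to avoid.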

9
Q

What is selection of keys in the BSN method?

A

In this step, we select parts of the records' attributes and combine those parts to make a key. The records are then sorted on that key and the BSN method is applied.
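
A small sketch of key selection, assuming records are (first name, surname) tuples and that the key takes the first three letters of each attribute. The recipe is hypothetical; it only shows that a key built from parts of attributes can place a misspelled record next to the correct one after sorting.

```python
def make_key(first, surname):
    """Hypothetical key: 3 letters of the surname + 3 letters of the first name."""
    return (surname[:3] + first[:3]).upper()


# Hypothetical sample data: "Khaan" is a misspelling of "Khan".
people = [
    ("Sara", "Malik"),
    ("Ahad", "Khan"),
    ("Ahad", "Khaan"),
]
keys = [make_key(first, surname) for first, surname in people]
ordered = sorted(people, key=lambda person: make_key(*person))
```

Both spellings produce the same key, so the two records become neighbors in `ordered` and will fall inside the same window.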

10
Q

What is matching candidates in the BSN method?

A

Merging duplicate records is a complex inferential process.
