4.5 Database definitions Flashcards
What is data consistency?
Data consistency refers to the accuracy, reliability, and uniformity of data across a database or system. It ensures that data is correct and coherent throughout its lifecycle.
What is data redundancy?
Data redundancy occurs when a piece of data is stored in multiple places across a database or system.
What is data independence?
Data independence refers to the separation of data from the applications or systems that use it.
What is a relational database?
A relational database organises data into tables, which consist of rows (records) and columns (attributes). Tables have relationships with each other that are established through keys.
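As a minimal sketch (using Python's built-in sqlite3 module; the table and column names are invented for illustration), two tables can be linked through a key and queried together:

```python
import sqlite3

# In-memory database purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,  -- each row (record) is identified by a key
    Name       TEXT NOT NULL         -- each column (attribute) holds one type of data
)""")
conn.execute("""CREATE TABLE CustomerOrder (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES Customer(CustomerID)  -- foreign key links the tables
)""")
conn.execute("INSERT INTO Customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO CustomerOrder VALUES (100, 1)")

# The relationship lets us combine records from both tables via the shared key.
for row in conn.execute("""SELECT Name, OrderID
                           FROM Customer JOIN CustomerOrder USING (CustomerID)"""):
    print(row)  # ('Ada', 100)
```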
What is data normalisation?
Data normalisation is the process of organising data in a database to reduce redundancy, improve data integrity, and make the database more efficient. It involves breaking down tables into smaller, more manageable structures while ensuring that relationships between data are preserved.
What are the features of first normal form?
- It contains only atomic values
- Each column has a unique name
- All entries in a column are of the same data type
What are the features of second normal form?
- It is already in 1NF
- It has no partial dependencies - all non-key attributes (columns) depend on the entire primary key, not just part of it.
What are the features of third normal form?
- Data is already in 2NF
- All non-key attributes depend only on the primary key, not on other non-key attributes (no transitive dependencies).
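To make the normal forms concrete, here is a sketch (in Python with the built-in sqlite3 module; all table and column names are invented) of decomposing an unnormalised orders table into 3NF:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalised: the customer's details and the product's price are repeated on
# every order row, so one fact lives in many places (redundancy):
#   Orders(OrderID, CustomerName, CustomerAddress, Product, ProductPrice, Quantity)

# 3NF decomposition: every non-key attribute depends on the key, the whole key,
# and nothing but the key of its own table.
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY,
                       Name TEXT, Address TEXT);
CREATE TABLE Product  (ProductID INTEGER PRIMARY KEY,
                       Description TEXT, Price REAL);
CREATE TABLE "Order"  (OrderID INTEGER PRIMARY KEY,
                       CustomerID INTEGER REFERENCES Customer(CustomerID));
CREATE TABLE OrderLine (OrderID INTEGER REFERENCES "Order"(OrderID),
                        ProductID INTEGER REFERENCES Product(ProductID),
                        Quantity INTEGER,
                        -- Quantity depends on the ENTIRE composite key (2NF)
                        PRIMARY KEY (OrderID, ProductID));
""")
```

Each customer's address and each product's price are now stored exactly once, so they only need to be updated in one place.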
3 Advantages of normalisation
- Resulting database will take up less storage space
- Information retrieval will be more efficient because data is structured effectively.
- Less redundancy means fewer inconsistencies in data, because each item of data only needs to be entered once.
3 disadvantages of normalisation
- It is a complex process to create the database structure.
- Can generate more tables than an unnormalised database which will mean a more complex database.
- More relationships must be defined to link the larger number of tables.
What is validation?
Validation ensures that data/input is sensible and reasonable. It does not check that data is accurate.
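As a sketch, a simple validation routine in Python might combine a type check and a range check (the field and its limits are invented for illustration):

```python
def validate_age(raw: str) -> int:
    """Check that an entered age is sensible - not that it is actually true."""
    if not raw.isdigit():                 # type/format check
        raise ValueError("age must be a whole number")
    age = int(raw)
    if not 0 <= age <= 120:               # range check
        raise ValueError("age must be between 0 and 120")
    return age

print(validate_age("34"))   # passes: sensible, though not proven accurate
# validate_age("200")       # would fail the range check
```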
What is verification?
Verification is used to ensure that the data entered exactly matches the original source.
2 Methods of Verification
- Double entry - entering the data twice and comparing the two copies.
- Proofreading data - someone checking the data entered against the original document.
What is referential integrity?
Referential integrity means that, for two tables that are linked together, a record containing a foreign key can only exist if there is a corresponding primary key in the linked table.
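A minimal sketch of referential integrity being enforced, using Python's built-in sqlite3 module (table names are invented; note that SQLite only enforces foreign keys once the pragma is switched on):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE CustomerOrder (
    OrderID INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES Customer(CustomerID));
""")
conn.execute("INSERT INTO Customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO CustomerOrder VALUES (100, 1)")   # OK: customer 1 exists
try:
    conn.execute("INSERT INTO CustomerOrder VALUES (101, 9)")  # no customer 9
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # FOREIGN KEY constraint failed
```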
What is a data dictionary?
A file containing descriptions of the structure and attributes of data items stored in a database. It is a tool used by data managers.
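As a rough illustration, most DBMSs expose data-dictionary-style information through a system catalogue. SQLite's PRAGMA table_info, for example, describes each column of a table (not a full data dictionary, but the same idea):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Name TEXT NOT NULL)")

# Each row describes one attribute: position, name, data type, NOT NULL flag,
# default value, and whether it is part of the primary key.
for column in conn.execute("PRAGMA table_info(Student)"):
    print(column)
# (0, 'StudentID', 'INTEGER', 0, None, 1)
# (1, 'Name', 'TEXT', 1, None, 0)
```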
What is data mining?
Data mining refers to the process of discovering patterns in large data sets (big data). It combines AI, statistics, and database systems.
What is big data?
Big data is a term associated with data sets that are so complex that traditional databases and other processing applications are unable to capture, curate, manage and process them within an acceptable time frame.
What are the 3 stages of data mining?
Big data gathering
Big data storage
Big data processing and analysis
Explain big data gathering.
For example, consumer companies actively scan social media websites to decipher user preferences, choices, and perceptions of their brands.
Explain big data storage
Using big data may require an organisation to store data sets ranging from terabytes to many petabytes. Big data practitioners such as Google and Facebook run what are known as hyperscale computing environments, which consist of a vast number of servers, each with Direct Attached Storage (DAS) - essentially lots of hard drives or flash storage devices.
Explain big data processing and analysis.
Big data processing techniques analyse data sets at terabyte or even petabyte scale. Common methods include cluster analysis, anomaly detection, and summarisation.
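As a toy sketch of one of these methods, anomaly detection can be as simple as flagging values that lie far from the mean. The readings and threshold below are invented, and a real big-data pipeline would distribute this work across a cluster of servers:

```python
from statistics import mean, stdev

def anomalies(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean (z-score)."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) > threshold * sigma]

readings = [21.0, 20.5, 21.2, 20.8, 98.6, 21.1, 20.9]
print(anomalies(readings, threshold=2.0))  # [98.6] - the outlying reading
```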