2.5 Databases Flashcards

1
Q

What is meant by Big Data?

A

Data sets so large and complex that they become difficult to process using standard database techniques

2
Q

Data mining

A

The analysis of large amounts of data, typically held in a data warehouse, to discover patterns and relationships

3
Q

Predictive analysis and Example

A

Consists of a variety of statistical techniques, including modelling, machine learning and data mining, used to make predictions from existing data.

Example: In business, predictive models analyse historical and transactional data to identify patterns that may present risks or opportunities.

4
Q

Data normalisation (1NF,2NF,3NF)

A

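Moving from:
· unnormalised data to 1NF involves ensuring there are no repeating attributes or groups; every attribute should be atomic
· 1NF -> 2NF involves ensuring there are no partial dependencies (no non-key attribute depends on only part of a composite primary key)
· 2NF -> 3NF involves ensuring there are no transitive dependencies, so every non-key attribute depends on nothing but the primary key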
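The three steps above can be sketched on a tiny data set. This is only an illustration: the order, customer and product data are invented, and using the customer name as a key is a simplifying assumption, not something the card specifies.

```python
# Hypothetical unnormalised records: the "items" attribute repeats.
unnormalised = [
    {"order_id": 1, "customer": "Ali", "customer_town": "Leeds",
     "items": [("P1", "Pen", 3), ("P2", "Pad", 1)]},
    {"order_id": 2, "customer": "Ali", "customer_town": "Leeds",
     "items": [("P1", "Pen", 5)]},
]

# 1NF: atomic values only, one row per (order_id, product_id).
first_nf = [
    (o["order_id"], o["customer"], o["customer_town"], pid, pname, qty)
    for o in unnormalised
    for (pid, pname, qty) in o["items"]
]

# 2NF: remove partial dependencies on the composite key (order_id, product_id).
# The product name depends only on product_id; customer details only on order_id.
products = {pid: pname for (_, _, _, pid, pname, _) in first_nf}
orders_2nf = {oid: (cust, town) for (oid, cust, town, _, _, _) in first_nf}
order_lines = [(oid, pid, qty) for (oid, _, _, pid, _, qty) in first_nf]

# 3NF: remove transitive dependencies. The town depends on the customer,
# not directly on the order, so customers get their own table.
customers = {cust: town for (cust, town) in orders_2nf.values()}
orders = {oid: cust for oid, (cust, _) in orders_2nf.items()}
```

After the last step each fact (a product name, a customer's town) is stored exactly once.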

5
Q

Primary key and foreign key

A

PK - a field (or combination of fields) that uniquely identifies a record in a table

FK - a field in one table which links to the primary key in another table; it enables data in different tables to be linked together
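A minimal sketch of both keys using Python's built-in sqlite3 module. The course and student tables are hypothetical examples; note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute(
    "CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        course_id  INTEGER NOT NULL REFERENCES course(course_id)
    )
""")
conn.execute("INSERT INTO course VALUES (1, 'Computer Science')")
conn.execute("INSERT INTO student VALUES (10, 'Sam', 1)")

# A foreign key that matches no primary key in course is rejected.
try:
    conn.execute("INSERT INTO student VALUES (11, 'Alex', 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

# The foreign key lets the two tables be linked in a query.
rows = conn.execute("""
    SELECT s.name, c.title
    FROM student s JOIN course c ON s.course_id = c.course_id
""").fetchall()
```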

6
Q

Indexes definition

A

An index is a list of key field values, each with a pointer to the matching record, used to improve access times to records and to retrieve them in sorted order
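The idea can be sketched in plain Python, assuming a small in-memory list stands in for the records on disk. The records and the helper names (find_by_surname, in_surname_order) are hypothetical; a real DBMS would typically use a tree structure rather than a sorted list.

```python
import bisect

# Records in the order they were added (not sorted).
records = [
    {"id": 3, "surname": "Patel"},
    {"id": 1, "surname": "Jones"},
    {"id": 2, "surname": "Adams"},
]

# The index: (key value, position of the record), kept sorted by key.
surname_index = sorted((r["surname"], pos) for pos, r in enumerate(records))
keys = [key for key, _ in surname_index]

def find_by_surname(surname):
    """Binary-search the index instead of scanning every record."""
    i = bisect.bisect_left(keys, surname)
    if i < len(keys) and keys[i] == surname:
        return records[surname_index[i][1]]
    return None

def in_surname_order():
    """Walk the index to read the records in sorted order."""
    return [records[pos] for _, pos in surname_index]
```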

7
Q

Data consistency definition

A

Data is consistent when it satisfies all the rules of the database, so new data must be added only if it obeys those rules.
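As one possible illustration, a rule can be attached to a table so the database itself refuses data that breaks it. The result table, the 0 to 100 mark range and the add_result helper are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE result (
        student_id INTEGER NOT NULL,
        mark       INTEGER NOT NULL CHECK (mark BETWEEN 0 AND 100)
    )
""")

def add_result(student_id, mark):
    """Add the row only if it satisfies the table's rules."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO result VALUES (?, ?)", (student_id, mark)
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Calling add_result(1, 75) stores the row, while add_result(2, 140) is refused and leaves the table unchanged.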

8
Q

Different views of the data

3 points

A

Allow users to access (read, write to, amend or delete) only part of the database

Allow database users to access only certain records or certain fields

May link tables together so that, from the user's view, the data appears to be in a single table
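A rough sketch of a view in SQLite, with hypothetical tables and a hypothetical view name (tutor_view). It hides the home_address field and joins two tables so they read as one. SQLite has no user accounts, so granting the view to particular users is something a multi-user DBMS would add on top; SQLite views are also read-only by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        name         TEXT,
        home_address TEXT,
        course_id    INTEGER REFERENCES course(course_id)
    );
    INSERT INTO course VALUES (1, 'Computer Science');
    INSERT INTO student VALUES (10, 'Sam', '1 High Street', 1);

    -- The view exposes only some fields and links the two tables.
    CREATE VIEW tutor_view AS
        SELECT s.student_id, s.name, c.title AS course_title
        FROM student s JOIN course c ON s.course_id = c.course_id;
""")

cursor = conn.execute("SELECT * FROM tutor_view")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
```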

9
Q

Describe three problems associated with paper-based systems and how computerised databases could solve these problems

A
  1. Difficult and time-consuming to amend, and easy to make mistakes
    S. Data in a database is easy to amend / update, which minimises errors
  2. Difficult to encrypt, so the data is accessible if stolen
    S. Easy to encrypt, so the data is not compromised if stolen
  3. Difficult for multiple people to look at the same record
    S. Many people can view the same record at once (only one can update it at a time)
10
Q

Describe four benefits (for the college) of using a computerised database system.

A

· A database would be easy and quick to search for student or course details

· Easy to back up student or course details in a computerised database

· It is easy to overwrite / amend / update student or course details in a database

· Database allows different access rights for different college staff

11
Q

Verification and its purpose

A

Verification checks are carried out when data is being entered and when data is being transferred from one place to another

The purpose is to ensure that the data is consistent with its source and has not been corrupted
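One common way to verify a transfer is to compare checksums; the card does not name a technique, so this is an assumed example. The function names and the sample data are hypothetical.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return a SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def verify_transfer(received: bytes, expected_checksum: str) -> bool:
    """True only if the received data matches the sender's checksum."""
    return checksum(received) == expected_checksum

# The sender computes a checksum and sends it along with the data.
original = b"student_id=10;name=Sam"
sent_checksum = checksum(original)
```

The receiver recomputes the checksum on what arrived; any mismatch means the data changed in transit.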

12
Q

Double entry verification

A

Ask the customer to type the password twice, then compare both inputs to check that they are the same
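A minimal sketch of the comparison, assuming the two entries have already been read from the user. The function name is hypothetical.

```python
def double_entry_ok(first_entry: str, second_entry: str) -> bool:
    """Accept the value only if both typed entries are identical."""
    return first_entry == second_entry

def choose_password(first_entry: str, second_entry: str) -> str:
    """Return the password, or raise so the user can be asked again."""
    if not double_entry_ok(first_entry, second_entry):
        raise ValueError("Entries do not match, please re-enter")
    return first_entry
```

The check only catches typing slips; it says nothing about whether the password itself is a sensible one.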

13
Q

Data validation techniques

A

14
Q

Data dictionary definition

A

A list of information about all the fields used in a database.
It will usually include the table names, field names, primary keys and field validation rules.
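As a rough illustration, SQLite can report much of this information about itself. The student table and the data_dictionary helper are hypothetical; PRAGMA table_info does not report CHECK rules, so validation is not covered by this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        year_group INTEGER
    )
""")

def data_dictionary(connection):
    """List every field of every table with its type and key status."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )]
    entries = []
    for table in tables:
        # table_info rows are: cid, name, type, notnull, dflt_value, pk
        for _, field, col_type, notnull, _, pk in connection.execute(
            f"PRAGMA table_info({table})"
        ):
            entries.append({
                "table": table,
                "field": field,
                "type": col_type,
                "required": bool(notnull),
                "primary_key": bool(pk),
            })
    return entries
```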

15
Q

Outline the role of a database administrator

A

The person in a company who is responsible for the structure, security and management of the database system and the data in it

16
Q

Describe why distributed databases are often used and identify one difficulty associated with using distributed databases. Explain what is actually distributed in a distributed database.

A

It is often more efficient to store data across a number of different computers, to maximise performance.

It is difficult to ensure that the data on every computer is always up to date, i.e. to maintain integrity.

Both processing and data are distributed across the different computers on which the data is stored.

17
Q

How a random access file operates

6 Marks

A

· The physical location for a new record is calculated from the key field
· A hashing algorithm is used for this calculation to find the location
· If there is a collision (a record is already stored at that location), the record is stored in an overflow area instead
· Data in the overflow area is normally stored and searched in a linear manner
· The file may need reorganising if the overflow area becomes too large
· Existing records are accessed in the same way: the key is hashed to find the location, then the overflow area is searched if needed
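The points above can be sketched with an in-memory list standing in for the file. The slot count, the modulo hashing function and the class name are assumptions chosen to keep the example small.

```python
NUM_SLOTS = 7  # hypothetical size of the main file area

def hash_key(key: int) -> int:
    """Hashing algorithm: map a key field to a location."""
    return key % NUM_SLOTS

class RandomAccessFile:
    def __init__(self):
        self.slots = [None] * NUM_SLOTS  # main area, addressed by hash
        self.overflow = []               # overflow area, searched linearly

    def store(self, key, data):
        pos = hash_key(key)
        occupant = self.slots[pos]
        if occupant is None or occupant[0] == key:
            self.slots[pos] = (key, data)      # free slot, or same key updated
        else:
            self.overflow.append((key, data))  # collision: use overflow area

    def fetch(self, key):
        occupant = self.slots[hash_key(key)]
        if occupant is not None and occupant[0] == key:
            return occupant[1]
        for stored_key, data in self.overflow:  # linear search of overflow
            if stored_key == key:
                return data
        return None
```

Keys 10 and 17 both hash to location 3 here, so whichever is stored second ends up in the overflow area.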

18
Q

Explain what is meant by data normalisation in a relational database, including its benefits

A

Normalisation:
· is a way of structuring data according to theoretical rules
· usually reduces data duplication / redundancy
· avoids the danger of inconsistency / maintains integrity
· avoids the danger of data being lost during an update
· avoids wasting processing time
· is likely to make the database easier to maintain
· allows different views of the data

19
Q

Describe the difference between flat file and relational database systems

A

A flat file system may contain a number of single tables with no links between them, whereas a relational database normally contains a number of linked tables

20
Q

Advantages of using a distributed database

5 Marks

A

· Resilience: a problem at one site will not stop other sites from working.
· Security: staff access can be limited to only their portion of the database.
· Network traffic is reduced, which reduces bandwidth costs.
· Each site's database still works even if the connection between sites is temporarily broken.
· Expense: may be either cheaper or more expensive, but any claim has to be properly qualified.

21
Q

Advantages of a relational over a flat file

A

Redundancy is reduced

Risk of inconsistent data is reduced

Data independence allows different views of the same data

Allows easy extension to the structure of the database
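The first two advantages can be illustrated with a small invented data set, using plain Python structures in place of real files and tables. All names and values are hypothetical.

```python
# Flat file: the tutor's name is repeated on every student row.
flat_file = [
    {"student": "Sam", "course": "Computing", "tutor": "Ms Khan"},
    {"student": "Alex", "course": "Computing", "tutor": "Ms Khan"},
]

# Updating only one of the duplicated values leaves the data inconsistent.
flat_file[0]["tutor"] = "Mr Lee"
flat_tutors = {row["tutor"] for row in flat_file if row["course"] == "Computing"}

# Relational: the tutor is stored once and students link to the course.
courses = {"Computing": {"tutor": "Ms Khan"}}
students = [
    {"student": "Sam", "course": "Computing"},
    {"student": "Alex", "course": "Computing"},
]

# A single update is seen by every linked student record.
courses["Computing"]["tutor"] = "Mr Lee"
relational_tutors = {courses[s["course"]]["tutor"] for s in students}
```

The flat file ends up with two different tutors recorded for the same course, while the relational version has exactly one.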