4.5 Database definitions Flashcards
What is data consistency?
Data consistency refers to the accuracy, reliability, and uniformity of data across a database or system. It ensures that data is correct and coherent throughout its lifecycle.
What is data redundancy?
Data redundancy occurs when a piece of data is stored in multiple places across a database or system.
What is data independence?
Data independence refers to the separation of data from the applications or systems that use it.
What is a relational database?
A relational database organises data into tables, which consist of rows (records) and columns (attributes). Tables have relationships with each other that are established through keys.
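As a minimal sketch (using Python's built-in sqlite3 module; the table and column names are invented for illustration), two tables can be linked through a key and queried together:

```python
import sqlite3

# In-memory database purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Customer (
    CustomerID INTEGER PRIMARY KEY,  -- each row (record) is identified by a key
    Name       TEXT NOT NULL         -- each column (attribute) holds one type of data
)""")
conn.execute("""CREATE TABLE CustomerOrder (
    OrderID    INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES Customer(CustomerID)  -- foreign key links the tables
)""")
conn.execute("INSERT INTO Customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO CustomerOrder VALUES (100, 1)")

# The relationship lets us combine records from both tables via the shared key.
for row in conn.execute("""SELECT Name, OrderID
                           FROM Customer JOIN CustomerOrder USING (CustomerID)"""):
    print(row)  # ('Ada', 100)
```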
What is data normalisation?
Data normalisation is the process of organising data in a database to reduce redundancy, improve data integrity, and make the database more efficient. It involves breaking down tables into smaller, more manageable structures while ensuring that relationships between data are preserved.
What are the features of first normal form?
- It contains only atomic values
- Each column has a unique name
- All entries in a column are of the same data type
What are the features of second normal form?
- It is already in 1NF
- It has no partial dependencies - all non-key attributes (columns) depend on the entire primary key, not just part of it.
What are the features of third normal form?
- Data is already in 2NF
- All non-key attributes depend only on the primary key, not on other non-key attributes (no transitive dependencies).
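To make the normal forms concrete, here is a sketch (in Python with the built-in sqlite3 module; all table and column names are invented) of decomposing an unnormalised orders table into 3NF:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalised: the customer's details and the product's price are repeated on
# every order row, so one fact lives in many places (redundancy):
#   Orders(OrderID, CustomerName, CustomerAddress, Product, ProductPrice, Quantity)

# 3NF decomposition: every non-key attribute depends on the key, the whole key,
# and nothing but the key of its own table.
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY,
                       Name TEXT, Address TEXT);
CREATE TABLE Product  (ProductID INTEGER PRIMARY KEY,
                       Description TEXT, Price REAL);
CREATE TABLE "Order"  (OrderID INTEGER PRIMARY KEY,
                       CustomerID INTEGER REFERENCES Customer(CustomerID));
CREATE TABLE OrderLine (OrderID INTEGER REFERENCES "Order"(OrderID),
                        ProductID INTEGER REFERENCES Product(ProductID),
                        Quantity INTEGER,
                        -- Quantity depends on the ENTIRE composite key (2NF)
                        PRIMARY KEY (OrderID, ProductID));
""")
```

Each customer's address and each product's price are now stored exactly once, so they only need to be updated in one place.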
3 Advantages of normalisation
- Resulting database will take up less storage space
- Information retrieval will be more efficient because data is structured effectively.
- Less redundancy means fewer inconsistencies in data, because each item of data only needs to be entered once.
3 disadvantages of normalisation
- It is a complex process to create the database structure.
- Can generate more tables than an unnormalised database which will mean a more complex database.
- More relationships must be defined to link the larger number of tables.
What is validation?
Validation ensures that data/input is sensible and reasonable. It does not check that data is accurate.
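As a sketch, a simple validation routine in Python might combine a type check and a range check (the field and its limits are invented for illustration):

```python
def validate_age(raw: str) -> int:
    """Check that an entered age is sensible - not that it is actually true."""
    if not raw.isdigit():                 # type/format check
        raise ValueError("age must be a whole number")
    age = int(raw)
    if not 0 <= age <= 120:               # range check
        raise ValueError("age must be between 0 and 120")
    return age

print(validate_age("34"))   # passes: sensible, though not proven accurate
# validate_age("200")       # would fail the range check
```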
What is verification?
Verification is used to ensure that the data entered exactly matches the original source.
2 Methods of Verification
- Double entry - entering the data twice and comparing the two copies.
- Proofreading data - someone checking the data entered against the original document.
What is referential integrity?
Referential integrity means that, for two tables that are linked together, a record containing a foreign key can only exist if there is a corresponding primary key in the linked table.
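A minimal sketch of referential integrity being enforced, using Python's built-in sqlite3 module (table names are invented; note that SQLite only enforces foreign keys once the pragma is switched on):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE CustomerOrder (
    OrderID INTEGER PRIMARY KEY,
    CustomerID INTEGER REFERENCES Customer(CustomerID));
""")
conn.execute("INSERT INTO Customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO CustomerOrder VALUES (100, 1)")   # OK: customer 1 exists
try:
    conn.execute("INSERT INTO CustomerOrder VALUES (101, 9)")  # no customer 9
except sqlite3.IntegrityError as e:
    print("rejected:", e)   # FOREIGN KEY constraint failed
```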
What is a data dictionary?
A file containing descriptions of the structure and attributes of data items stored in a database. It is a tool used by data managers.
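As a rough illustration, most DBMSs expose data-dictionary-style information through a system catalogue. SQLite's PRAGMA table_info, for example, describes each column of a table (not a full data dictionary, but the same idea):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Name TEXT NOT NULL)")

# Each row describes one attribute: position, name, data type, NOT NULL flag,
# default value, and whether it is part of the primary key.
for column in conn.execute("PRAGMA table_info(Student)"):
    print(column)
# (0, 'StudentID', 'INTEGER', 0, None, 1)
# (1, 'Name', 'TEXT', 1, None, 0)
```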
What is data mining?
Data mining refers to the process of discovering patterns in large data sets (big data). It combines AI, statistics, and database systems.
What is big data?
Big data is a term associated with data sets that are so complex that traditional databases and other processing applications are unable to capture, curate, manage and process them within an acceptable time frame.
What are the 3 stages of data mining?
Big data gathering
Big data storage
Big data processing and analysis
Explain big data gathering.
For example, consumer companies actively scan social media websites to decipher user preferences, choices, and perceptions of their brands.
Explain big data storage
Using big data may require an organisation to store data sets ranging from terabytes to many petabytes. Big data practitioners such as Google and Facebook run what are known as hyperscale computing environments, which consist of a vast number of servers, each with Direct Attached Storage (DAS) - essentially lots of hard drives or flash storage devices.
Explain big data processing and analysis.
Big data processing techniques analyse data sets at terabyte or even petabyte scale. Common methods include cluster analysis, anomaly detection, and summarisation.
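As a toy sketch of one of these methods, anomaly detection can be as simple as flagging values that lie far from the mean. The readings and threshold below are invented, and a real big-data pipeline would distribute this work across a cluster of servers:

```python
from statistics import mean, stdev

def anomalies(values, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean (z-score)."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) > threshold * sigma]

readings = [21.0, 20.5, 21.2, 20.8, 98.6, 21.1, 20.9]
print(anomalies(readings, threshold=2.0))  # [98.6] - the outlying reading
```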