Module 2 stars Flashcards
What is data wrangling?*
+The process of retrieving, cleansing, integrating, transforming and enriching data to support subsequent analysis
+It transforms raw data into a format that is more appropriate and easier to analyse
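+Illustration: a minimal Python sketch of cleansing and transforming raw records into an analysis-ready form. The sample records, the field names (`customer`, `amount`) and the `wrangle` helper are all invented for this example; real wrangling rules depend on the data at hand.

```python
# Hypothetical raw records, as they might arrive from a source system.
raw_records = [
    {"customer": "  Alice ", "amount": "120.50"},
    {"customer": "BOB", "amount": "80"},
    {"customer": "", "amount": "15.00"},      # missing name: dropped below
    {"customer": "carol", "amount": "n/a"},   # unparseable amount: dropped below
]


def wrangle(records):
    """Cleanse and transform raw records into a consistent format."""
    cleaned = []
    for record in records:
        # Cleanse: trim whitespace and normalise the case of the name.
        name = record["customer"].strip().title()
        if not name:
            continue  # cleanse: skip records with no customer name
        try:
            amount = float(record["amount"])  # transform: text to number
        except ValueError:
            continue  # cleanse: skip records whose amount is not numeric
        cleaned.append({"customer": name, "amount": amount})
    return cleaned


clean_records = wrangle(raw_records)
total = sum(r["amount"] for r in clean_records)
print(clean_records)
print(total)
```

+Only the two usable records survive, so the total can be computed without special-casing bad rows during analysis.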
What are the objectives of data wrangling?*
+Improve data quality
+Reduce time/effort required to perform analysis
+Reveal the true intelligence in the data
What is data modelling?*
+The process of defining the structure of a database
+Relational databases are modelled in a way to offer flexibility and ease of data retrieval.
What is an ERD?*
+Entity relationship diagrams (ERDs) are graphical representations used to illustrate the structure of the data
What is an entity?*
+A person, place, thing, event, etc.
What is an instance?*
+A single occurrence of an entity
+Represented as a record in a database
What are the 2 types of keys in an ERD?*
1.Primary key
2.Foreign key
What are a primary key and a composite primary key?*
1.Primary key:+Attribute that uniquely identifies each instance of the entity
+Used for fast retrieval and searches
2.Composite primary key:+A primary key that consists of more than one attribute
+Used when none of the individual attributes alone can uniquely identify each instance of the entity
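+Illustration: a minimal sketch using Python's built-in `sqlite3` module. The `student` and `enrolment` tables and their columns are hypothetical, chosen only to contrast a single-attribute primary key with a composite one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-attribute primary key: student_id alone identifies each instance.
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# Composite primary key: neither student_id nor course_code is unique on its
# own, but the pair identifies each enrolment.
conn.execute(
    """
    CREATE TABLE enrolment (
        student_id  INTEGER NOT NULL,
        course_code TEXT    NOT NULL,
        grade       TEXT,
        PRIMARY KEY (student_id, course_code)
    )
    """
)

conn.execute("INSERT INTO student VALUES (1, 'Alice')")
conn.execute("INSERT INTO enrolment VALUES (1, 'BUS101', 'A')")
# Same student, different course: allowed, because the pair is new.
conn.execute("INSERT INTO enrolment VALUES (1, 'BUS102', 'B')")

try:
    # Repeating an existing (student_id, course_code) pair violates the key.
    conn.execute("INSERT INTO enrolment VALUES (1, 'BUS101', 'C')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

enrolment_count = conn.execute("SELECT COUNT(*) FROM enrolment").fetchone()[0]
print(duplicate_rejected, enrolment_count)
```

+The database itself refuses the duplicate pair, which is what "uniquely identifies each instance" means in practice.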
What is a foreign key?*
+A primary key from another entity that the focal entity also contains
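+Illustration: a minimal sketch using Python's built-in `sqlite3` module. The `customer` and `sales_order` tables are hypothetical; here `sales_order` is the focal entity and `customer_id` is the foreign key it carries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE sales_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer (customer_id),
        total       REAL
    )
    """
)

conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.execute("INSERT INTO sales_order VALUES (100, 1, 59.90)")

try:
    # There is no customer 42, so this order would point at nothing.
    conn.execute("INSERT INTO sales_order VALUES (101, 42, 10.00)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign key is what lets the two entities be joined.
rows = conn.execute(
    "SELECT o.order_id, c.name FROM sales_order AS o "
    "JOIN customer AS c ON c.customer_id = o.customer_id"
).fetchall()
print(orphan_rejected, rows)
```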
What are the 3 main aspects of an attribute?*
1.Name:+A unique name for the attribute
2.Values/meanings:+List of acceptable values for the attribute and what these values mean
3.Description:+The definition of the attribute in relation to the solution.
How do relationships between entities work in an ERD?*
+The relationships between entities provide structure for the data model
+They indicate which entities relate to other entities and how
+Specifications show the minimum and maximum number of occurrences allowed on each side of the relationship
+Relationships may be read in either direction
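+Illustration: a minimal Python sketch that treats the minimum and maximum occurrences on one side of a relationship as a counting rule. The `check_cardinality` helper and the customer/order sample data are hypothetical and are not an ERD tool's API.

```python
from collections import Counter


def check_cardinality(parent_ids, child_parent_ids, minimum, maximum=None):
    """Return the parent ids whose number of related children falls outside
    the allowed range. maximum=None stands for 'many' (no upper limit)."""
    counts = Counter(child_parent_ids)
    violations = []
    for parent_id in parent_ids:
        n = counts[parent_id]  # Counter gives 0 for parents with no children
        if n < minimum or (maximum is not None and n > maximum):
            violations.append(parent_id)
    return violations


customers = [1, 2, 3]
# Customer id recorded on each order: customer 1 has two orders,
# customer 2 has one, customer 3 has none.
order_customer_ids = [1, 1, 2]

# "A customer places zero or many orders": every customer complies.
zero_or_many = check_cardinality(customers, order_customer_ids, minimum=0)
# "A customer places one or many orders": customer 3 breaks the minimum.
one_or_many = check_cardinality(customers, order_customer_ids, minimum=1)
# "A customer places zero or one order": customer 1 breaks the maximum.
zero_or_one = check_cardinality(customers, order_customer_ids, minimum=0, maximum=1)

print(zero_or_many, one_or_many, zero_or_one)
```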
What are the different relationship indicators for an ERD?*
+Read the slides for the 6 indicators (Slide 15, Chapter 2)
What is an (enterprise) data warehouse?*
+A central repository of data from multiple departments within an organisation
What are 5 reasons for an enterprise data warehouse system?*
+It is integrated and accurate
+Supports managerial decision making
+It helps keep enterprise-wide data clean and organised around subjects such as sales, customers or products
+Gives a historical and comprehensive view of the entire organisation.
+Note: The volume of data can become very large very quickly
How is data integrated from different databases in different departments?
+Using the ETL (extraction, transformation and load) process
+Data must be retrieved from every source, reconciled and transformed into a consistent format
+After this, the final data must be loaded into the data warehouse
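+Illustration: a minimal Python sketch of the three ETL steps, using only the standard library. The two department sources, their field names, the `fact_transaction` table and the helper functions are all hypothetical; a real ETL pipeline would read from actual source databases rather than in-memory lists.

```python
import sqlite3
from datetime import datetime

# Extract: hypothetical rows as two departments might store them,
# each with its own field names, units and date format.
sales_rows = [{"cust": "alice", "amount": "120.50", "date": "2024-01-15"}]
support_rows = [{"customer_name": "ALICE", "cost_cents": 2500, "day": "15/01/2024"}]


def transform_sales(row):
    """Reconcile a sales row into (source, customer, amount, ISO date)."""
    return ("sales", row["cust"].strip().title(), float(row["amount"]), row["date"])


def transform_support(row):
    """Reconcile a support row into the same shape as a sales row."""
    iso_date = datetime.strptime(row["day"], "%d/%m/%Y").date().isoformat()
    customer = row["customer_name"].strip().title()
    return ("support", customer, row["cost_cents"] / 100, iso_date)


def load(rows):
    """Load the unified rows into an in-memory stand-in for the warehouse."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE fact_transaction "
        "(source TEXT, customer TEXT, amount REAL, txn_date TEXT)"
    )
    warehouse.executemany("INSERT INTO fact_transaction VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()
    return warehouse


# Transform: bring both sources into one consistent format.
unified = [transform_sales(r) for r in sales_rows] + [
    transform_support(r) for r in support_rows
]
warehouse = load(unified)
loaded = warehouse.execute(
    "SELECT source, customer, amount, txn_date FROM fact_transaction ORDER BY source"
).fetchall()
print(loaded)
```

+After loading, both departments' records share the same customer spelling, currency unit and date format, so they can be queried together.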