CPA ISC - S2 M6 Data Storage and Database Design Flashcards
What is a Star Schema?
A star schema is the most common schema for dimensional modeling, and is the simplest schema used for dimensional modeling. It is organized into a central fact table with associated dimension tables surrounding it, where the diagram of the schema looks like a star with the fact table at the center and the dimension tables arranged around it. It requires fewer foreign keys to link tables together when compared to the snowflake schema.
What is a Snowflake Schema?
The snowflake schema has further normalized dimension tables compared to the star schema. The dimension tables are broken down into multiple related tables rather than a single table. The snowflake schema can be more complex than the star schema since it requires more foreign keys to link the tables together. The snowflake schema can be more flexible, as it allows for more detailed information to be stored about the dimensions.
What is a Flat Model Schema?
The flat model is generally considered the simplest type of database schema where the data is held in an individual table. A more complex schema requiring multiple tables would least likely be a flat model.
This is a single, two-dimensional array where elements in each column are the same type of data, and elements in the same row relate to each. This is like an Excel spreadsheet.
What is a Hierarchical Model Schema?
A hierarchical model is generally considered a set of data with relationships whereas one piece of data may be considered a parent of numerous offspring pieces of data.
This is a tree-like structure, with a “root” node of data and child nodes that branch out from thar root. There is a one-to-many relationships between parent and child nodes.
What is a Composite Primary Key?
A composite primary key combines attributes to create unique identifiers in a table.
Operational Data Store (ODS)
An ODS is a repository of transactional data from multiple sources and is often a temporary waypoint area between a data source and a data warehouse.
Data Warehouse
A data warehouse is a large data repository that is centralized and used for reporting analysis rather than transactional purposes.
Data Mart
A data mart is like a warehouse but is for a specific purpose such as marketing or logistics, and is often a subset of a data warehouse.
Data Lake
A data lake is a repository similar to a warehouse but contains structured and unstructured data.
What are flat files?
Flat files are files that contain plain text with no structural interrelationships within that file (such as excel or CSV file).
First Normal Form (1NF)
1Nf is the first step in normalizing the data.
- Each cell in a table must contain only one piece of information
- Each record in every table must be uniquely identified (aka ensuring that there is a primary key).
What is normalization in database design?
Normalization is reducing data redundancy and eliminates undesirable characteristics like insertion, update, and delegation anomalies.
Second Normal Form (2NF)
2NF is the second step in normalizing the data.
- Ensure that all non-key attributes in a table directly depend on the entire primary key.
Third Normal Form (3NF)
3NF is the third and last step in normalizing the data.
- Ensure that each column in a table describes only the primary key.
What is a Conceptual Data Model?
A conceptual data model shows the high level structure and meaning of the data.