Chapter 4 - Preparing data for Analysis Flashcards

1
Q

What are relational databases?

What are the parts of a relational database?

A

Best means of storing data in one place:

  • Tables: Data organzied into fields and rows
  • Field: Columns that contain descriptive characteristics about each observation
  • Records: The rows that contain each observation
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What are primary keys of relational databases?

Why are they important?

A

Unique identifier of each record that allows connection to other tables (think customer ID)

  • Every table must have a primary key to ensure unique listings.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are foreign keys of relational databases?

A

Primary key used in a separate table to create relationships between the tables.

  • Not required for every table
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are relational database data dictionaries?

How do they differ from data dictionaries of chapter 3?

A

Table that

  • relates each attribute (transaction ID, date, etc..) to their respective tables and
  • explains useful info about each attribute

Contain more fields and desriptive elements of each attribute.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Use this slide to understand relational database data dictionaries.

A

Note the table field and primary/foreign key field.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are entity relationship diagrams?

A

Graphical representation of each table in a database that shows the fields contained within,

  • Connects tables through separate diagrams:
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What are the advantages of using relational databases for storing data.

A

Puts all data in one place and ensures:

  • Preventive internal controls
  • Security of data entry
  • Reduced redundancy costs
  • Version control
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What are the three characteristics that ensure data integrity (faithful representation)?

A

Data that is:

  • free from error
  • Complete (no missing info).
  • Neutral (not biased)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Why is it better to connect files to a relational data base rather than extracting and creating different version?

A

Ensures info is up to date.

Extracting data makes data out of date as soon as new transactions are posted and damages the representation.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How do relational databases improve internal controls?

A

Relational databases ensures that parent data must exist before creating new data:

a foreign key cannot be made without first creating the parent key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

How do relational databases improve security?

A

Data entry can be verified and show who has entered and viewed new data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

How do relational databases reduce redundancy costs?

A

Ensures that there is a unique listing of each observation in only one location.

  • maintains one version of the truth.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Use this slide to show how to use excel to connect to a database:

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Use this slide to show how to refresh connections to databases in excel:

A

From data tab:

  • Select “refresh all” dropdown
  • “connection properties…”
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is Structured Query Language (SQL) and why is it useful for extracting data?

A

Universale database language.

Used to create, update, and delete records in relational databases.

Necessary for connecting to databases too large for excel to handle.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

How do you use SQL to extract data?

A

Need two lines of code:

  • SELECT: identifies the fields (columns) you would like to extract
  • FROM: identifies the table(s) that store those columns

May need third line to restrict rows:

WHERE: acts as a filter to specify rows associated with observation

17
Q

Write SQL code to retrieve the date, transaction_type, and amount columns from the Transaction table of Returned items

A

Query:

  • SELECT Date, Transaction_Type, Amount*
  • FROM Transaction*
  • WHERE Transaction_Type = ‘Return’*
18
Q

If you want to select ALL columns/fields when writing SQL, what do you place in the SELECT code?

A

SELECT *

note: asterisks will return all fields

19
Q

If you want to connect to multiple tables when writing SQL code, what lines of code must you include?

A
  • Place the first table in the FROM line as normal:
  • Place the second table in third row: INNER JOIN
  • Denote the primary and foreign keys in a fourth row: ON

  • FROM Transactions*
  • INNER JOIN Customers*
  • ON Transaction.CustomerID = Customers.CustomerID*