Data Types Flashcards

1
Q

What is structured data?

A
  1. Organised🗂️
  2. Labelled🏷️
  3. Table format 🔢
  4. Quick retrieval💨
2
Q

What can you do with structured data?

A
  1. Sort 🔼
  2. Aggregate ➕
  3. Query with SQL📟
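A minimal sketch of all three operations in one SQL query, using Python's built-in sqlite3 module. The `sales` table and its rows are made up for illustration and are not from the source.

```python
import sqlite3

# Hypothetical sales table, held in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("Shirt", 2), ("Jeans", 1), ("Shirt", 3)],
)

# Aggregate (SUM per product) and sort (largest total first) in one query.
totals = conn.execute(
    "SELECT product, SUM(qty) AS total FROM sales "
    "GROUP BY product ORDER BY total DESC"
).fetchall()
print(totals)  # [('Shirt', 5), ('Jeans', 1)]
```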
3
Q

What's an example of structured data?

A

CSV file

A Collibra export to Excel that was already in a structured format, brought into Google Sheets for analysis
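A small sketch of why a CSV counts as structured: every row carries the same labelled columns, so it can be read straight into records. The column names and values here are invented for illustration.

```python
import csv
import io

# Hypothetical CSV content; a real file would be opened with open(path, newline="").
csv_text = "product,qty\nShirt,2\nJeans,1\n"

# DictReader uses the header row as labels for every record.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Values arrive as strings, so convert before doing arithmetic.
total_qty = sum(int(r["qty"]) for r in rows)
print(rows[0]["product"], total_qty)
```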

4
Q

What is unstructured data?

A

no set structure/format

5
Q

How can you analyse unstructured data?

A

Pre-processing methods:

Text mining ⚒️
Natural Language Processing🗣️
Image Recognition🖼️
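As a rough illustration of the simplest kind of text mining, the sketch below turns free-text reviews into word counts. The reviews and the `tokenize` helper are hypothetical; real NLP pipelines usually add steps such as stop-word removal.

```python
import re
from collections import Counter

# Hypothetical customer reviews (unstructured free text).
reviews = [
    "Great service, great price!",
    "Delivery was slow but the service was friendly.",
]

def tokenize(text):
    """Lowercase the text and return its words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

# Pre-processing gives the text a structure (word -> count) that can be analysed.
counts = Counter(word for review in reviews for word in tokenize(review))
print(counts["great"], counts["service"])
```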

6
Q

What’s an example of unstructured data?

A

customer reviews on social media

7
Q

What is a relational database?

A

database structured into tables made with rows and columns

8
Q

What are the tables within a relational database joined by?

A

Keys. Primary Key: unique identifier for each record. Foreign Key: a column in another table that references that primary key
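A minimal sketch of a join, assuming two made-up tables: `customers` has a primary key, and `sales` points back to it through a matching column (a foreign key). Table names, column names and rows are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    qty INTEGER
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO sales VALUES (10, 1, 3), (11, 2, 1), (12, 1, 2);
""")

# Join each sale to its customer by matching the key columns.
rows = conn.execute(
    "SELECT s.sale_id, c.name, s.qty FROM sales AS s "
    "JOIN customers AS c ON c.customer_id = s.customer_id "
    "ORDER BY s.sale_id"
).fetchall()
print(rows)  # [(10, 'Ada', 3), (11, 'Grace', 1), (12, 'Ada', 2)]
```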

9
Q

What is normalisation?

A

process of organising data into related tables to reduce duplication and keep it consistent

10
Q

What are the steps in normalisation called?

A

Normal forms

11
Q

What are the 3 normal forms?

A

First, second, third - standards used to structure tables

12
Q

What is the first normal form (1NF)?

A
  • Unique column names🏷️
  • Indivisible columns ➗

E.g. in a Products table, you may have started with “Clothing, Casual” in the “Categories” column, but 1NF means “Clothing” and “Casual” are split into separate rows
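The split described above can be sketched as follows. The product rows are hypothetical; the point is that each output row holds exactly one category value.

```python
# Hypothetical Products rows where "categories" packs several values into one cell.
products = [
    {"product_id": 1, "name": "T-shirt", "categories": "Clothing, Casual"},
    {"product_id": 2, "name": "Blazer", "categories": "Clothing"},
]

# 1NF: one indivisible value per cell, so each category gets its own row.
rows_1nf = [
    {"product_id": p["product_id"], "name": p["name"], "category": c.strip()}
    for p in products
    for c in p["categories"].split(",")
]

for row in rows_1nf:
    print(row)
```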

13
Q

What is the second normal form (2NF)?

A
  • 1NF plus…
  • Divided into tables 🪑
  • With primary keys 🗝️

E.g. ProductName depends only on ProductID, so products get a separate table from the Sales table, which contains SaleID, Qty and CustomerName

14
Q

What is the third normal form (3NF)?

A
  • 2NF plus…
  • Columns dependent only on the PK
  • Not inferred from each other

E.g. A sales table with SaleID, ProductID, Qty, CustomerName and CustomerID - CustomerName can be derived from CustomerID, so we only need CustomerID in the table and the Name is removed
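A rough sketch of that example, with invented rows: the customer name moves to its own lookup keyed by CustomerID, and the sales rows keep only the ID.

```python
# Hypothetical Sales rows before 3NF: customer_name repeats for every sale.
sales_raw = [
    {"sale_id": 1, "product_id": 7, "qty": 2, "customer_id": 100, "customer_name": "Ada"},
    {"sale_id": 2, "product_id": 9, "qty": 1, "customer_id": 100, "customer_name": "Ada"},
    {"sale_id": 3, "product_id": 7, "qty": 5, "customer_id": 101, "customer_name": "Grace"},
]

# Customers table: each name is stored once, keyed by customer_id.
customers = {r["customer_id"]: r["customer_name"] for r in sales_raw}

# Sales table: the name column is dropped because customer_id determines it.
sales = [{k: v for k, v in r.items() if k != "customer_name"} for r in sales_raw]

# The name is still recoverable through the key when needed.
print(customers[sales[1]["customer_id"]])  # Ada
```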

15
Q

What is NoSQL?

A

not only SQL - flexible databases that store and manage both structured and unstructured data

16
Q

What are examples of NoSQL?

A
  1. Graphs📈
  2. Key-value pairs (Cust ID = 87654)🗝️🍐
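To illustrate the key-value idea only, a plain Python dict can stand in for a key-value store. This is not a real NoSQL client; the `basket` key and its contents are hypothetical.

```python
# A dict as a stand-in for a key-value store: look-ups go by key, not by SQL query.
store = {}

# The pair from the card: key "cust_id", value 87654.
store["cust_id"] = 87654

# Values need not share a schema; this one is a list (hypothetical data).
store["basket"] = ["shirt", "jeans"]

print(store["cust_id"])  # 87654
```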
17
Q

What database system design does JLP use?

A
  1. Mainly relational🔗
  2. Aligned to the PLDM🧬
  3. Multiple Sources - ETL📚
  4. Warehouses in 80s🏭
  5. Snowflake handles both structured and unstructured data, but it is only used for relational databases❄️
  6. Cloud Based Tools: Google Cloud Storage; Tableau; Collibra
  7. Data ingested into Snowflake and accessed through RBAC
18
Q

What are the challenges of maintaining relational databases, and how do you overcome them?

A
  1. Security and governance - classification/RBAC👮
  2. Skills - Training/External support🎓
  3. DQ/consistency - data validation/data lineage🌳
19
Q

What is a Data Warehouse?

A
  • Centralised system 🌐
  • Storing🗄️
  • Managing👔
  • Analysing📊
  • Large amounts of data 📈
  • From Various Sources📚