DBUD Flashcards

1
Q

What is relational data

A

Relational databases organize data like tables in a spreadsheet, ensuring everything stays organized and structured. They’re great for things like keeping track of money, managing company information, or handling online shopping.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is non-relational data

A

Non-relational databases are like big storage bins where you can throw in all sorts of things without worrying too much about organization. They’re perfect for handling lots of different kinds of information quickly, like social media posts, website clicks, or online gaming data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Why should companies use databases instead of flat files?

A

While flat files like Excel can be useful for small-scale, simple tasks, databases provide the necessary infrastructure, security, scalability, and functionality required for robust data management in a business environment.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a Data Lake?

A

A data lake is a system of data that is stored in its natural/raw format. No information or format is changed while storing the data in the lake.

Good with big data, needing flexibility in data storage

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is data Warehouse

A

A data warehouse is designed to store structured data from multiple sources, typically used for reporting and data analysis. It organizes data into schemas and tables optimized for query performance and analytics.

Ideal when you need to match structured data from diffrent sources, like reports, historical data and others.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is data mart?

A

A data mart is a subset of a data warehouse, focused on a specific business line, department, or user group. It contains a smaller, more specific set of data tailored for specific analytical needs.

Is used when specific departments or user groups need quick access to relevant datasets tailored to their needs.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What SQL codes are part of the Data Definition Language (DDL)

A

DDL is used to define and manage all database objects, such as tables, indexes, and views.
The common DDL commands include CREATE, ALTER, DROP, and TRUNCATE.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What codes are part of the Data manipulation language (DML)

A

DML is used for querying and modifying data within existing database objects.

Common DML commands include SELECT, INSERT, UPDATE, and DELETE.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are data models and how are they different?

A

A data model is a representation of some data.It has a subject and attributes that describe the relationship of the data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is a Relational Model?

A

Data structured and represented as tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is a Semi-structured Model?

A

Data structured as a tree hierarchy, such as XML (eXtensible Markup Language)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What are socio-economic factors?

A

It ensures that DATABASES we build are inclusive and equitable. Under here are all the diffrent kinds of Socio-economics:

  • Multilingual Support
  • Low Bandwidth Optimization:
  • Screen Reader Compatibility:
  • Affordability in Pricing Data:
  • Accessible Education Resources:
  • Disability-Friendly Interfaces:
  • Elderly-Friendly Navigation:
  • Income-Based Services and Features:
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What Socioeconomic factors are rellivandt in our subject

A
  • Multilingual Support:
    Even though we have a business rule that all our customers are danish with a danish phone number, it is still worth that we mention that our website should be including for non-danish/english speakers.
    • This is important because not all danish citizens speak nor read danish, but should still have access to the information within the database!
  • Affordability:
    • pros: A big part of what we do, is providing power for a good price. We should have an option that highlights the low pricing and what you get for it.
    • cons: we actually only provide one price for each house so maybe this is irrelevant
  • Low Bandwidth options: We transport power to everyone in Denmark, even tho who lives in the middle of nowhere like in Skjern. Therefor the power_broker (Who we work for) needs to be able to turn off the live pricing update, so the website does not need as much data to load.

Screen reader compatibility:

  • important hence not all customers is expected to have a good sight, so having the ability to get information read aloud, could help visually impaired individuals
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Explain how your database could be part of a larger multi-tier client server system - Data-application.

A
  • Presentation Tier (Often the Client)

This is what the customers receive. For example, if we create an app for the customers, this is what we present to them. They can interact with our data, manage bills, and see prices.

  • Application tier (or Business logic)

Processes and manages business operations like power distribution, billing, and customer management.

  • Data tier (our database)

Databases holding customer information, power supply contracts, billing records, and usage data.

We (Power_broker) have our own data on all our customers, while our Power_supplier has data on the gathered power, etc.
I admit I am not sure if they meet in the Data tier or the Application tier, but I have arguments for both.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What does ACID mean?

A

Atomicity, Consistency, Isolation, and Durability

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Explain how your DB project practices the ACID (Atomicity, Consistency, Isolation, and Durability) test for databases!

A
  • “A”tomicity: essential data integrity principle. Data must be atomic.
    • “Atomicity” is a reliability rule where either any database transaction (e.g. registration, query, ATM withdrawal/deposit, erasing data etc.) must either be completed in full or not be executed at all.
  • “C”onsistency: constraints that ensure that transactions are reliable and proper.
    • Consistency sort of work as the rules we created for the database!!
    • Databases must have specific constraints that ensure certain business and technical rules. A part of data validation.
    • Account cannot have a negative balance.
    • Phone number must follow a certain numerical format
  • “I”solation is a database transaction rule based on FIFO (First in, First out).
    • Every single database transaction, whether a query or adding or deleting an entry, must be done in independently. Two transactions can be run at the same time.
  • “D”urability” is to ensure that once a transaction is run, the transaction remains in the database and is not lost or erased.
    • Every transaction must be durable and stored in non-volatile memory.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

How are databases, views, and applications related?

A

Databases store and organize data

views provide customized perspectives of the data

Applications interact with databases and views to fulfill specific business requirements and user needs.

Together, they form an interconnected ecosystem that supports data-driven operations and applications.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

How do you structure a user story?

A

As a (…), I want (…), so that I can (…)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Why are user stories useful in DBUD

A
  • User stories illustrate end-user needs, providing relatable examples that highlight the importance of a database.
  • They enhance communication and collaboration between clients and the development team, fostering informed decision-making.
  • User stories allow clients to visualize database functionality within their workflow, facilitating incremental development tailored to evolving requirements.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What questions do you typically ask during requirements engineering (during eliclication and refinement)

A
  • General understanding:
    • What is the purpose of the system?
    • Who are the primary stakeholders?
      • Who will be affected by this system and who will be interested in the system and its development?
    • What problems are you trying to solve with this system?
      • our report:
        energy poverty
    • What are the key business processes that the system needs to support?
  • Functional requirements:
    • What specific functionalities do you expect the system to provide?
    • What are the main tasks the users need to perform with the system?
    • Are there existing system that this new system needs to integrate with?
  • non-functional requirements
    • What are the performance requirements?
    • What are the security requirements?
      • Two-factor Authentication
    • What are the usability requirements?
    • What are the availability and reliability requirements?
  • Constraints and limitations:
    • Are there any budget constraints?
    • What is the project timeline?
    • Are there any legal or regulatory requirements?
    • What are the technical constraints?
  • User perspective:
    • Who are the end-users of the system?
    • What are the user experience goals?
    • What are the common user scenarios or use cases?
    • What are the pain points for users with the current system? (if any)
      • Understand current challenges and areas for improvement
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What would be a new user story for a data analytics dashboard for management?

A

As a data analytics, I want a database where I can store information about my employees names, their account number for their salary and employee ID, so that I can keep them paid and happy

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What are your Project-ERD Business Rules?

A
  1. A customer can only have one contract with the broker.
  2. An energy company can have many customers, both private customers and business customers.
  3. An employee at an energy company can be a customer at the same company that they are employed at.
  4. Each energy company can have their own database showcasing the prices of energy, one database can contain data from many different energy companies.
  5. There needs to be a broker between the individual, private customer and a Power Supplier. A customer cannot buy energy directly from the energy supplier.
  6. An address can only belong to one customer and zip code.
  7. A supplier is only able to deliver one type of energy. The type of energy that the supplier can deliver is either green or fossil fuels
  8. Contract numbers can only consist of letters a-z and numbers 0-9
  9. All ID’s can only consist of numbers between 0-9, and cannot be more or less than 10 numbers
  10. All customers have residency in Denmark and therefore have a Danish phone number.
  11. The cost of delivering energy will not be considered since we have no information on how the energy is transported from the supplier to the broker and to the customers.
  12. Energy has been divided into green/fossil - putting aside the different types of green and fossil.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Which columns can be Null vs. Not Null? Why?

A
  • NOT NULL means that the column needs to contain data and cannot be left empty e.g., primary keys, ID, or emails
  • NULL means that the column is optional and it is permissible to have records were this column has not value
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

How did you decide on the unique attribute for each table?

A

How we choose which of the attributes that should be the primary key was how unique those single attributes could be - this was often the ID of the different stakeholders. Having the ID’s as the primary key ensured that we easier could retrieve the correct data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

What is a candidate keys?

A

Candidate Keys are columns or sets of columns that can uniquely identify rows in a table.

Often a PK and the other uniques.

26
Q

In the ERD which once are candidate keys, and which once are composite Key, made by candidate keys?

A

In the diagram:

Power_Supplier:

Power_supplier_name (Primary Key)
Customer:

Customer_id (Primary Key)
Customer_mail (Unique Key)
Customer_pnumber (Unique Key)
Power_Broker:

Broker_name (Primary Key)
Contract_Supplier_Broker:

CSB_contract_id (Primary Key)
Customer_Contract:

Customer_contract_id (Primary Key)
City:

City_name (Primary Key)
Region:

Region_name (Primary Key)
Customer_address:

Composite Key, made by candidate keys

Composite Key: (Road_name, House_number, Zipcode) (Primary Key)

27
Q

What is a Composite Key?

A

A Composite Key is a key that consists of two or more columns in a table that together uniquely identify a row in that table. Each column in a composite key can contain duplicate values, but the combination of values across these columns must be unique for each row. Composite keys are used when a single column is not sufficient to guarantee the uniqueness of a row.

28
Q

Does our ERD offer any level of database normalization? Which level and why?

A

We follow the rules and all tables are in 3NF. Two examples

Power_Supplier:

It is in 3NF. Each attribute is functionally dependent on the primary key (Power_supplier_name).

Customer_address:

In 3NF. Each attribute is dependent on the composite primary key (Road_name, House_number, Zipcode).

29
Q

What is Atomic values

A

Atomic values are like the smallest pieces of data that can’t be divided further. In a database, this means each cell in a table should hold just one piece of information, like a single phone number or a single name, not a list or a combination of multiple items.

30
Q

What are the three normal form rules (database normalization)

A

First Normal Form (1NF):

All tables have unique column names.
Each cell contains a single value (no repeating groups or arrays).
Primary keys are unique and not null.

Second Normal Form (2NF):

All requirements for 1NF are met.
All non-key attributes are fully functional dependent on the primary key.
No partial dependencies.

Third Normal Form (3NF):

All requirements for 2NF are met.
Elimination of transitive dependencies.
No non-primary key column is dependent on another non-primary key column.

31
Q

What is an Entity

A

Entities refer to tables used in databases.

32
Q

what is strong and weak entities?
Do we have weak and which once?

A

strong entities are those that have their own primary key and can exist independently of other entities, while weak entities rely on the existence of a related strong entity.

So, in our ERD, the strong entities are Power Supplier, Customer, Power Broker, City, Region, and Customer Address, while the weak entities are Customer Contract and Contract Supplier Broker.

33
Q

How to ensured relationships, cardinalities, and referential integrity?

A

Foreign key constraints ensure referential integrity by linking the primary key of one table to a foreign key in another, while unique constraints ensure uniqueness within a column, both vital for maintaining data consistency and accuracy in a relational database.

34
Q

When would you use an inner join?

A

Inner Joins: You would use inner joins when you only want to retrieve rows that have matching values in both tables being joined. This is useful when you need to combine data from multiple tables based on a common key.

35
Q

When would you use a Outer Joins?

A

Outer joins are used when you want to retrieve all rows from one or both tables being joined, regardless of whether there is a matching row in the other table. This is useful when you want to include unmatched rows from one or both tables in the result set.

36
Q

How do you assign PK and Composite Keys?

A

With PK you Use the PRIMARY KEY constraint to designate the chosen colum.

CREATE TABLE TableName (
PK_Column INT AUTO_INCREMENT PRIMARY KEY,
Other_Columns DATATYPE
);

With composite you do kind of the same, just referring to two gollum’s

CREATE TABLE TableName (
Column1 DATATYPE,
Column2 DATATYPE,
PRIMARY KEY (Column1, Column2),
Other_Columns DATATYPE
);

We did it like this

CREATE TABLE Customer_address (
Customer_id INTEGER NOT NULL,
Road_name NVARCHAR(135) NOT NULL,
House_number NVARCHAR NOT NULL,
Zipcode NVARCHAR NOT NULL,
FOREIGN KEY (Customer_id) REFERENCES Customer(Customer_id),
CONSTRAINT Customer_address_Composite_Key PRIMARY KEY (Road_name, House_number, Zipcode)
);

37
Q

What options does our platform Sustainadata have for its users?

A
  1. The user has an account with the needed information.
  2. Tell if the power supplied by certain suppliers is either green or fossil fuels.
  3. Tell the power price in 6-hour intervals.
  4. See the amount of energy at disposal compared with the amount needed.
  5. Let the user know when the power is the cheapest during the day.
  6. Let the broker know how much power each supplier can supply.
38
Q

What is a associative entity?

A

Decomposes a many-to-many relationship
Creates a one-to-many-many-to-one (OMMO) to each entity

The PK of the strong entity will become the FK of the weak entity

39
Q

SQL CODE: What does SELECT mean?

A

SELECT is a statement that enables a selection of specific data from the database

It is often followed by a FROM statement

40
Q

SQL CODE: Give an example of WHERE statement

A

SELECT CC_morning_price, CC_afternoon_price
FROM Customer_Contract
WHERE Customer_id = 6

– Selecting two time intervals
– specifying the table “from where”

41
Q

How do you connect to Python?

A

importing the SQLITE3 package
import sqlite3
# creating a varibale ‘conn’ to connect to the database path
conn=sqlite3.connect(“/Users/amaliecarlsen/Desktop/Studie/2. Semester /SustainaData.db”)
# using ‘cursor’ to connect to the database
cursor=conn.cursor()

Inserting a new customer (when new customers create a profile)
cursor.execute(“INSERT INTO Customer(Customer_id, Customer_fname, Customer_mname, Customer_lname, Customer_mail, Customer_pnumber, Customer_Gender, Energy_consumption_kwh, Carbon_footprint_kg)VALUES(11,’Soren’,’Smith’,’Nielsen’,’SSM@mail.com’,’0045 31629808’,’Male’,4700,12300)”)
# committing it to the database
conn.commit()

42
Q

Give an example of how to INSERT data

A

INSERT INTO Region (Region_name, City_count, Percentage_off_green_energy)
VALUES
(‘Region Nordjylland’, 11, 37),
(‘Region Midtjylland’, 19, 42),
(‘Region Syddanmark’, 22, 46),
(‘Region Hovedstaden’, 29, 76),
(‘Region Sjælland’, 17, 58);

43
Q

How do you create a view

A

CREATE VIEW COMBADD
AS
SELECT Road_name || ‘ ‘ || House_number || ‘, ‘ || Zipcode AS “Customer Address”
FROM Customer_address;

44
Q

Scalability, how is it possible to expand the database to the entirety of Denmark?

A

We will execute vertical scaling by enhancing the CPU, RAM, and disk space.

45
Q

How do you rename a Entity/table

A

ALTER TABLE __ RENAME TO __

Ex. Department table

  • suppose we want to rename the table “department” to “Dept”

ALTER TABLE department RENAME TO Dept

46
Q

what are the 3 phases in requirements Engineering

A

The start of all database projects
Every recuirements engineering is a user story

Process modeling: representing the system with processes (what to do when and why)

Logic modeling: represents logic of user stories processes

Data modeling: representing the code to build, and queries

47
Q

ACID. What is Atomicity

A

Essential data integrity principle
Data must be atomic (either something runs or it doesn’t run)
Either all or none of the database actions occur
A good rule in any database transaction: registration, query, ATM, deposit, withdrawal, etc.
Ex. Submitting, you have to press submit else nothing happens at all

48
Q

ACID. What is Consistency

A

Constraints that ensure that transactions are reliable and proper
- Data validation: databases must have specific constraints ensuring certain business and technical rules
- Our business rules make the database consistent
Ex. Phone number must follow a certain numerical format
Either statement level or transaction level consistent…

49
Q

ACID. What is Durability

A

Ensures that when a transaction is run, it remains in the database (not lost or erased)  database commited changes are permanent
- Transactions must be durable and stored in memory
- If you do something it stays

50
Q

ACID. What is Isolation

A

A rule based on FIFO: First in, First out
Every database transaction (queries, adding, deleting) must be isolated/ done independently.
- What is written first runs first, what is written after runs after that

51
Q

What is Data-application independence

A

The ability to maintain data separate from application
If aplicators change we don’t have to worry about changes in the data (data is saved if programs crash)  ensures no corruption of data

52
Q

What is Database Schema

A

Database blueprints describing the relationship between tables

Physical: storing the actual digital data
Logical: how data is organized within the database, which data is stored in which tables? which are the primary keys establishing relationships between tables?

53
Q

SQL: What is BOOLEAN

A

Denoted a binary attribute constraint such as TRUE/ FALSE

54
Q

What is Cardinality

A

Modelled relationship between each entity, defined by the number of instances two entities can “participate” with each other

= the number of instances of entity B that can or must be associated with each instance of entity A

One-to-one (1:1)

OMMO: One to many – many to one
Many-to-many cardinality (M:M),
One-to-many cardinality (1:M)

Max: 0,1 or many

Optimal participation: minimum cardinality is zero
Required participation: minimum cardinality is one

55
Q

How to make an Inner join

A

SELECT [attribute], [attribute], [attribute], [attribute]
FROM [tablename]
INNER JOIN [tablename]
ON [tablename].[key] = [tablename].[key];

56
Q

What is a Trigger

A
  • Designed to check or change data based on data modification or definition statement
  • Should not return data to the user
  • Trigger logic starts with BEGIN and ends with END

Ex. Insert trigger (triggered by insert statement)
CREATE TRIGGER [triggername] AFTER INSERT
ON [tablename]
BEGIN
INSERT INTO tablename
VALUES(new.[attribute], new.[attribute]);
– using new to add new entities
END;

57
Q

What is DBMS?

A

Database Management Systems (DBMS) are software systems used to store, retrieve, and run queries on data. A DBMS serves as an interface between an end-user and a database, allowing users to create, read, update, and delete data in the database.

58
Q

What is Role-Based Access Control (RBAC)

A

Assigns users to roles based on their job functions, and each role has specific permissions associated with it. This simplifies management of permissions and enhances security.

59
Q

what does it mean to be Scaling Up (Vertical Scaling)

A

Includes:
Same infrastructure but more power

Add more CPU, RAM, and Disk space to cope with an increasing workload while maintaining control and existing design
(you can only increase to the limits of your server)

Risks:
Too much complexity and downtime
More chances of failure from adding to much

60
Q

what does it mean to be scaling Out (Horizontal Scaling)

A

Includes:
Distribute database processing across many different machines’
Balance load and complexity (like balancing a scale by shifting weight)

Needs:
A lot of data from different places has to be read fast without high accuracy, consistency, security
Combination with noSQL can be usefull

61
Q

NoSQL

A

(Not only SQL) – using non-relational databases

  • These are often distributed, replicated databases
  • Used in recognized web applications: facebook, twitter

Document, key value, graph or wide column stores
Horizontally scalable, while SQL is vertically scalable

Why instead of SQL:
- SQL takes more time: set up tables
- Not effective for storing and quering unstructured data with unknown formats
- noSQL has big scalability benefits

Weakness:
- Eventual consistency, delayed data updates