Introduction to databases Flashcards

1
Q

What is Data, Information and Knowledge?

A

Data can be defined as symbols or facts, of a qualitative or quantitative type, that
represent properties of objects, e.g., the number 10

▪ Information is data compiled to derive meaningful inferences. It is structured,
processed and presented with an assigned meaning
• E.g., the number 10 from the data example gains a context if we say “10 km”
————————————–
10 km on its own doesn’t mean much though. Is it the length of a road? The distance
between two points? Or something else entirely?
Hence, we apply knowledge. Knowledge is your own (or a collective) expertise used to
infer results from information
• E.g., 10 km is a walkable distance
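The three levels can be sketched in a few lines of Python. This is only an illustration: the "route length" label and the 15 km walking limit are assumptions made up for the example, not figures from the cards.

```python
# Data: a bare fact with no context.
data = 10

# Information: the same value with an assigned meaning.
# The "route length" label is a hypothetical context chosen for illustration.
information = {"value": data, "unit": "km", "measures": "route length"}

# Knowledge: expertise applied to information to reach a conclusion.
# The 15 km limit is an assumption, not a value from the flashcards.
WALKABLE_LIMIT_KM = 15


def is_walkable(info: dict) -> bool:
    """Judge whether a distance is walkable, based on the assumed limit."""
    return info["unit"] == "km" and info["value"] <= WALKABLE_LIMIT_KM


print(is_walkable(information))
```

The number itself never changes; only the context (information) and the rule applied to it (knowledge) are added on top.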

2
Q

What is the main difference between Data, Information and Knowledge?

A

The main difference between the three is their level of abstraction: data is
the lowest (most concrete) level and knowledge is the highest (most abstract)
level.

3
Q

Why is it important to make distinctions between Data, Information and Knowledge?

A

These distinctions are important:
• Because we need to understand the relationship between each type to make sense of
the purpose of a database
• E.g., we use databases to store data. However, this means that we still need someone
to make sense of said data (information) as well as utilize it in the intended way
(knowledge).

4
Q

What different types of data exist, and what is the difference between them?

A

Structured data and Unstructured data.

▪ Data that resides in a fixed field within a record or file is called structured data.
This includes data contained in relational databases and spreadsheets. E.g., a cell
in a spreadsheet may contain the number 10

Structured data depends on creating a data model – a model of the types of data
that will be recorded as well as how they will be stored, processed and accessed

We might for example have a spreadsheet containing students and their attributes. The
attributes in a row together make up a single student, following our data model

▪ Unstructured data, however, is all those things that cannot be so readily classified
and fit into a neat box: photos/graphic images, videos, webpages, documents etc.
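As a rough sketch of the difference, the Python below parses spreadsheet-like text into records that follow a data model, and contrasts that with a blob of bytes. The `Student` fields and the sample rows are hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Student:
    # The data model: which fields exist and what type each one has.
    student_id: int
    name: str
    course: str


# Structured data: every value sits in a fixed field of a row.
# The rows are made-up sample data.
SHEET = "student_id,name,course\n1,Ada,Databases\n2,Linus,Networks\n"


def load_students(text: str) -> list[Student]:
    """Parse spreadsheet-like CSV text into Student records."""
    reader = csv.DictReader(io.StringIO(text))
    return [Student(int(r["student_id"]), r["name"], r["course"]) for r in reader]


students = load_students(SHEET)

# Unstructured data: just bytes, with no fields we can address by name.
photo = bytes([0x89, 0x50, 0x4E, 0x47])  # the first four bytes of a PNG file
```

We can ask for `students[0].name`, but there is no equivalent question to ask of `photo` without first interpreting the bytes.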

5
Q

What language do you use to manage structured data?

A

▪ Structured data, in the context of databases, is often managed using Structured
Query Language (SQL)
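A minimal sketch of what an SQL query looks like, run through Python's built-in `sqlite3` module. SQLite is used here only because it needs no setup; the `student` table and its rows are hypothetical.

```python
import sqlite3

# An in-memory SQLite database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, course TEXT)")
conn.executemany(
    "INSERT INTO student (name, course) VALUES (?, ?)",
    [("Ada", "Databases"), ("Linus", "Networks"), ("Grace", "Databases")],
)

# The SQL statement describes WHAT we want; the DBMS works out how to get it.
rows = conn.execute(
    "SELECT name FROM student WHERE course = ? ORDER BY name", ("Databases",)
).fetchall()
print(rows)
```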
6
Q

Where can unstructured data be stored?

A

▪ Unstructured data can be stored in non-relational databases (MongoDB,
JanusGraph etc.)

7
Q

What is semi-structured data?

A

Semi-structured data is data that isn’t purely structured but still retains some
structure. E.g., we could store an employee in JSON format rather than in separate
columns of a spreadsheet

▪ “Some structure” basically means that we’ll have some type of markers or tags
which can be used to identify elements within the data (much like structured
data), but it doesn’t have the same rigid structure

▪ For example, an email will have a sender, a recipient, a subject, a message text
and other fixed fields (structured data). But we can also attach an image or a file
(unstructured data) to our email before sending it

▪ As it’s neither entirely structured nor entirely unstructured, we call it
semi-structured data
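To make the JSON example concrete, here is a small sketch using Python's `json` module. The employee records and their fields are invented for illustration.

```python
import json

# Two hypothetical employee records. Both have keys (the "markers or tags"),
# but they do not share exactly the same set of fields.
RAW = """
[
  {"name": "Ada", "role": "Engineer", "skills": ["SQL", "Python"]},
  {"name": "Linus", "role": "Analyst", "phone": "555-0100"}
]
"""
employees = json.loads(RAW)

# Keys let us identify elements within the data...
names = [e["name"] for e in employees]

# ...but since the structure is not rigid, missing fields must be handled.
phones = [e.get("phone") for e in employees]
```

In a spreadsheet, every row would need a `phone` and a `skills` column whether or not a value exists. Here each record carries only the fields it has.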

8
Q

What defines lists or spreadsheets?

A

In lists or spreadsheets, each row of data is intended to stand on its own. I.e., we’re allowed to
have rows with duplicated information

9
Q

What are spreadsheets good and bad for?

A

▪ Spreadsheets are good for:

• Sorting. We can easily sort a column based on the values in its cells. E.g., we can sort by
student or course names if we wanted to

• Storing data. The example used is a small one, but we could add thousands of rows
without a problem

▪ Spreadsheets are bad for:

• Very large data sets. At some point we will run out of RAM

• Speed. Well before we run out of RAM, we’ll start to notice just how much longer it
takes to simply open the spreadsheet, never mind searching for a specific cell

10
Q

What less obvious purposes do databases fulfill?

A

• To provide an organizational structure for data
• To provide a mechanism for creating (C), reading (R), updating (U) and deleting (D)
data, i.e., CRUD operations
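The four CRUD operations map onto four SQL statements. The sketch below runs them against an in-memory SQLite database through Python's `sqlite3` module; the `student` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, course TEXT)")

# Create: INSERT adds a new row.
conn.execute("INSERT INTO student (name, course) VALUES (?, ?)", ("Ada", "Databases"))

# Read: SELECT fetches rows.
created = conn.execute("SELECT id, name, course FROM student").fetchall()

# Update: UPDATE changes existing rows.
conn.execute("UPDATE student SET course = ? WHERE name = ?", ("Networks", "Ada"))
updated = conn.execute("SELECT course FROM student WHERE name = ?", ("Ada",)).fetchone()

# Delete: DELETE removes rows.
conn.execute("DELETE FROM student WHERE name = ?", ("Ada",))
remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```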

11
Q

What does CRUD stand for?

A

Creating (C), reading (R), updating (U) and deleting (D) data

12
Q

How much data can be stored in a relational database?

A

There is no real limit to how much data can be stored in a relational database. You can
generally add more storage as you go

13
Q

What are the downsides to using a database over spreadsheets?

A

▪ In short, it requires more technical expertise to work with (it is quite easy to
simply open a new sheet in Google Sheets and list away, comparatively)

▪ In some cases, it might also be unnecessary. If you know you’re going to be working
with a small set of data, why bother establishing a database for that purpose
when you might as well work with a small CSV file (or an equivalent to that)?

14
Q

What is a DBMS (database management system)?

A

▪ A DBMS (database management system) is a program used to manage a database. This means that, among
other things, it provides a user with an interface to perform various CRUD operations on a database

• It also provides protection and security to the database itself and makes sure we don’t run
into issues when, e.g., multiple users are working with the database at the same time

• I.e., it helps us maintain data integrity (the opposite of data corruption)
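One way a DBMS protects data integrity is by rejecting rows that break declared rules. The sketch below shows this with SQLite via Python's `sqlite3` module; the `student` table, the `email` column and the helper `try_insert` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL UNIQUE)"  # rules the DBMS enforces for us
)
conn.execute("INSERT INTO student (email) VALUES (?)", ("ada@example.com",))


def try_insert(email):
    """Return True if the row was accepted, False if the DBMS rejected it."""
    try:
        conn.execute("INSERT INTO student (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert("ada@example.com")  # violates UNIQUE
missing_ok = try_insert(None)                 # violates NOT NULL
count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Both bad rows are refused, so the table still holds only the one valid student. A spreadsheet would have accepted either row without complaint.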

15
Q

What is it that an RDBMS (relational database management system) does in addition to the DBMS (database management system) functionality?

A

An RDBMS will, in addition to the DBMS functionality, also automatically
keep track of the relationships between our data (or rather, our tables/relational
data models)
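A sketch of what "keeping track of relationships" can look like in practice, assuming two hypothetical tables, `course` and `student`, linked by a foreign key. SQLite (via Python's `sqlite3`) stands in for a full RDBMS here, and it only enforces foreign keys once the pragma below is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " course_id INTEGER NOT NULL REFERENCES course(id))"  # the relationship
)
conn.execute("INSERT INTO course (id, title) VALUES (1, 'Databases')")
conn.execute("INSERT INTO student (name, course_id) VALUES ('Ada', 1)")

# A student pointing at a course that does not exist is rejected.
try:
    conn.execute("INSERT INTO student (name, course_id) VALUES ('Linus', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The relationship also lets us combine the two tables in one query.
joined = conn.execute(
    "SELECT student.name, course.title FROM student"
    " JOIN course ON course.id = student.course_id"
).fetchall()
```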

16
Q

What is essential when working with databases?

A

Planning is essential when working with databases

17
Q

What are the three general phases of designing a database?

A

The three general phases of designing a database are conceptual design, logical
design and physical design

18
Q

In the first phase of designing a database, conceptual design, what do we focus on?

A

Conceptual design:

• A design phase where we focus on developing conceptual models of the data used in an organization.

I.e., we conceptualize the relational data models and their attributes as entities

19
Q

In the second phase of designing a database, logical design, what do we focus on?

A

Logical design:

• In the second database design phase (i.e., logical design) we focus on translating
the ER (entity-relationship) model, i.e., the conceptual model, into a set of relational table definitions called schemas

20
Q

In the third and last phase of designing a database, physical design, what do we focus on?

A

Physical design:

• In this final phase, we create actual tables in an actual database using SQL
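As an illustration of the physical design step, the sketch below turns two hypothetical schemas, `course(id, title)` and `student(id, name, course_id)`, into actual tables. SQLite through Python's `sqlite3` module is an assumed choice of database; the column types would differ in another DBMS.

```python
import sqlite3

# SQL that creates the tables described by the (hypothetical) logical schemas.
DDL = """
CREATE TABLE course (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE student (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    course_id INTEGER REFERENCES course(id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)  # runs several SQL statements in one call

# Ask SQLite which tables now exist.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```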

21
Q

What does a three-tier/layer architecture consist of?

A

A three-tier/layer architecture consists of a presentation layer, a logic layer and a
data layer

• The presentation layer represents the UI or view, i.e., what the end-user sees and interacts
with

• The logic layer acts as the “business layer”. It is here that we’ll coordinate the application by
running calculations, fetching data from and sending data to the database etc. I.e., it acts as the middleman between our UI and the database

• The data layer represents the information source, e.g., our database or file system
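The three layers can be sketched as three groups of functions in one small Python program. Everything here is hypothetical (the `grade` table, the function names, the sample rows), and in a real application each layer would usually live in its own module or service.

```python
import sqlite3


# Data layer: the only code that talks to the database.
def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE grade (student TEXT, points INTEGER)")
    conn.executemany(
        "INSERT INTO grade VALUES (?, ?)",
        [("Ada", 90), ("Ada", 70), ("Linus", 60)],
    )
    return conn


def fetch_points(conn, student):
    rows = conn.execute("SELECT points FROM grade WHERE student = ?", (student,))
    return [r[0] for r in rows]


# Logic layer: runs calculations on data fetched from the data layer.
def average_points(conn, student):
    points = fetch_points(conn, student)
    return sum(points) / len(points) if points else None


# Presentation layer: turns a result into what the end-user sees.
def render(student, average):
    if average is None:
        return f"{student}: no grades recorded"
    return f"{student}: average {average:.1f} points"


conn = open_database()
print(render("Ada", average_points(conn, "Ada")))
```

Note that `render` never touches the database and `fetch_points` never formats text; the logic layer sits between them as the middleman.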