Module 6 Flashcards

Data mangement, modelling and manipulation

1
Q

Data

A

Raw facts and figures without any meaning.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Information

A

Data shaped into a meaningful form through context

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Knowledge

A

Results from experience and education. The info collected by an entity and the ability to locate additional info

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

DNA of a Tweet

A
  • data on hashtags used and users mentioned
  • geo-data
  • data on videos and pictures
  • user-data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Relational Database

A

Database organises data into one or more tables. Tables consist of columns called attributes or fields and rows called tuples or records.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Keys

A

Unique Primary Key identify each row. Keys can be used to link different tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

SQL

A

Standard Query Language is the defacto standard language used to communicate with database systems. We can create, delete and manipulate databases using SQL.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Categories of commands in SQL

A
  • DDL data definition language: creating, deleting, altering databases and tables
  • DML data manipulation language: inserting, deleting, changing, retreiving tuples
  • DCL data control language: access management
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

SQLite

A

SQLite stores the whole database in a single file. Once you have access to the file, you have access to the database. DCL is not part of SQLite

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Creating tables

A

CREATE TABLE Table Name (Column1 DataType (optional length), Column2 DataType (optional length),… PRIMARY KEY (KeyColumn1, KeyColumn2)

Creates a table and declares a composite primary key across multiple columns.
You can’t have spaces in table name and column names unless you put [].

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Data Types

A

TEXT: character string
INTEGER/INT: signed integer
BLOB: binary large object, videos?photos stored in binary language
REAL: floating point numbers
NUMERIC/NUM: anything not covered by the rest, includes BOOLEAN, DATE, DATETIME

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Redundancy

A

Redundancy=repetition should be avoided in databases since it:
- Increases the required amount of storage
- Is a source for data inconsistencies
BUT: it can improve databases since the join operator becomes uneccessary

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Data Normalization

A

Normalization is a technique that manages the trade-off between redundancy and ease of use/speed of database. It is a standardized technique that transforms a database iteriatively accordingly to a strict set of rules.
A database satisfies a normal form, if all tables in that database satisfy the normal form.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

First Normal Form 1NF

A
  1. The domain (space from wich attributes take their value from) of each attribute contains only atomic values. Atomic values: values that can’t be split into multiple logically distinct parts
  2. Each attribute contains only one value from the domain.
    After first normalization redundancies remain
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

How to solve when there is multiple players with the same name?

A

a) create a composite key (player, date of birth)
b) create a unique player ID

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Second Normal Form

A
  1. It satisfies 1NF
  2. In case there is a composite key there are no partial dependencies (a non-key attribute always depends on all components of the key and not just a subset.
    [place depends on tournament, but not on year, it only depends on part of the key, 2NF not satisfied]
17
Q

How to solve partial dependencies?

A

a) we can create a new column that provides an artificial key to the table such as ID (the primary key consists of a single value so 2NF is satisfied)-> doesn’t reduce redundancy
b) we split the table into 2 (with the second table containing the tournament information)-> reduces redundancy

18
Q

Third Normal Form

A
  • It satisfies 2NF
  • There are no transitive dependencies (attributes may not be functionally dependant on other non-key attributes)
    E.g: Country and date of birth are both dependant on player, which is not a primary key. They are only linked to the primary key because the player is linked to it. (this is called transitive dependency)
19
Q

How to solve transitive dependencies?

A

By splitting the table (with the second table containing player info)

20
Q

Results of Normalization

A
  • Less suceptible to human error
  • More complex SQL queries to join the tables to produce a complete statement
    Normalization also resolves many-to-many relationships between entity sets
21
Q

Entity Relationship Models

A

Entity Relationship Models describe the relationship between things of interest. They are visualized as Entity Relationship Diagrams. ERD

22
Q

Components of ERD

A
  • Entities are things of interest to the organization, may be physical or immaterial concepts.
  • Attributes describe an entity (chracteristics of a person may be gender, age, name…)
  • An entity set corresponds to the table notion, while an entity corresponds to a row,
23
Q

Relationships and Cardinalities

A

Relationships explain the link between two entities. Cardinalities refer to the number or entities in one entity set that are related to entities in another entity set.

24
Q

Types of Cardinalities

A
  1. One-to-one (1:1): Exactly one entity in set A is linked to one in set B. These are very rare as they usually get combined in one entity set.
  2. One-to-many (1:M): One entity in set A is linked to many in set B
  3. Many-to-many (M:N): Many entities in set A are linked to many (not always the same nr.) entities in set B. These are eliminated through normalization.
25
Q

Optional Relationship

A

One or both ends of the relationship can have zero entities. Optionality refers to the minimum nr.of cardinalities (1 or 0)

26
Q

Upper cardinality

A

Or multiplicities are about the maximum number of cardinalities.
E.g: Not all employees are given a car.
Not all players have won a tournament. (but every tournament needs a champion, so this is not optional).

27
Q

Sets & Keys

A
  • an entity set contains all entities of a similar kind
  • a relationship set contains all relationships of a similar kind
    To uniquely identify each entity in an entity set we need a key.
  • A key is an attribute that uniquely identifies each entity in an entity set, respectively each record
28
Q

Phases of an ERD

A
  • Conceptual model: identify the things of interest (entities)
  • Logical model: identify the attributes and keys, decide the data type for each entity
  • Physical model: database proof, no many to many relationships, implementable
29
Q

How to solve M:N relationships?

A