Databases Flashcards

1
Q

What are databases?

A

Large collection of data organised for rapid search + retrieval (by comp). E.g. card index (punch cards storing data, like rolodex but in comp lang), flat file (Files store data in hierarchical data structure, analogue of physical storage) or relational (Entities can ref each other, more efficient as reduces dup, quicker search, can be hierarchal but also reps complex relationships).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What does a database system involve?

A

Schema (design/structure of data + how organised), what store + how related), DBMS (Database Management System, software used to store data efficiently, e.g. Access, MySQL), Query Language (domain specific language [DSL] to talk to DBMS, Insert, manipulate, search, extract e.g. SQL) + data itself.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are the 3 levels of design?

A

Conceptual, logical, physical.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What are entities?

A

People, things, events + locations. Distinct thing with indep existence which we store data about. Logical thing, not specific e.g. dog = entity but Danny = instance of entity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What are entity relationships?

A

We rep entities + relationships in Entity Relationship Diagram (ERD). Relationship is connection between entities. Cardinality of relationship is numerical relationship (3 flavours of cardinality) i.e. 1:1 (each entity has 1 related entity), 1:n (1 entity may have multiple related entities but entity on other side has only 1) + m:n (both entities can have many related entities)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is an attribute?

A

Properties/characteristics of an entity, i.e. required/optional, simple/composite, single-valued/multi-valued, stored/derived, keys.
If an attribute is required, it cannot be blank or have a null value.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Simple vs Composite Attribute

A

Simple is self contained e.g. gender, whilst composite is made up of other attributes e.g. address.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Single vs Multi-Valued Attribute

A

Single means only 1 value e.g. DOB. Multi means multiple values e.g. skills.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Stored vs Derived Attribute

A

Stored are data we actually store e.g. DOB whereas derived are derived from stored e.g. age.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is a primary key?

A

Attribute which uniquely identifies individual instances of an entity, quick way to find instance. Can’t be blank/null. E.g. student ID num.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is a candidate key?

A

Attribute that could be a key, identified before choosing a primary key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is a composite key?

A

Combo of attributes that uniquely identifies individual instances of entity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What does open source mean?

A

Software source code is provided + can be modified by anyone.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is the difference between MySQL + MariaDB?

A

It’s a binary replacement with add+ features + dif licencing. MySQL more supported + MariaDB freer + more innovative.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is a NoSQL DBMS?

A

‘Not Just SQL’ is increasingly pop approach to data storage, moves away from trad model of large tables where state (data consistency) closely controlled to large flexible databases focusing on availability. V. pop for large data + high availability (lots of users) applications. e.g. MongoDB.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is ORM?

A

Object-Relational Mapping. Mapping of entities within software to entities within database (e.g. class in Java maps to table in database). Often done automatically – tables automatically created in database once code written.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What are the main data sources?

A

Social media platforms, pharmaceuticals, medicine & e-commerce (transport, banking, retail, construction, financial, insurance, energy).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What’s wrong with unstructured data?

A

Hard to use + hard to turn into info.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is hierarchical structure?

A

Graph (tree-like structure). Each node may have 0+ children, each child has 1 parent, root note has no parent. 1st database model, created by IBM in 60s.
E.g. Windows registry stores low-level settings for Windows OS + apps using registry.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What are the pros + cons of hierarchical structure?

A

Pros: Clean + logical, fast to navigate, little data duplication, everything in place.
Cons: Many to Many not supported, rarely models real world.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What is set theory?

A

More flexible, items of data (nodes) can belong to 0+ sets, sets can contain 0+ items. Like Venn Diagram. Union (either/both), intersection (both) + set diff (\ in one, not other).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is data mining?

A

Computerised process to discover patterns in data using combo of techniques from data science, machine learning, stats + DB systems. Involves pre-processing, data mining, validation & visualisation.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is pre-processing?

A

Assembly of data into set + cleansing (removal of noise + fitting in missing data).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is data mining as a process within data mining?

A

Finding anomalies (Unusual records/interesting errors), association rules (searching for relationships between variables), clustering (groups & objects with similarities), regression (find function/model which fits with least error) & summarisation (compacted/visualised rep/report).

25
Q

What are the issues with data mining?

A

Suitable available data, privacy concerns, processing power, accuracy of training/baseline data + false patterns/findings, correlation != causation. Data mining IRL involves risk models, fraud detection + trend predictions.

26
Q

What is normalisation?

A

Organising data into multiple related tables to min data redundancy + improve data integrity.

27
Q

What is data replication?

A

Increases database size + creates issues with insertion, deletion + updating. Data duplicated unnecessarily.

28
Q

What is un-normalised form?

A

List of un-normalised attributes showing key field + repeating groups in (…)* Data may be repeated (multiple attributes/columns with same name) + non-atomic (e.g. address). May have no primary key.

29
Q

What is 1NF?

A

Broken into repeated groups. Identify keys of each group + foreign key links using @, repeating group key will be compound 1 of at least 2 fields. No attributes with repeated/similar data, atomic attributes (e.g. Name -> Forename, Surname). Has primary key, each attribute has unique name.

30
Q

What is 2NF?

A

Remove partial-key dependencies. Identify keys of each group + foreign key links using @.

31
Q

What is 3NF?

A

Remove non-key dependencies. All groups have key, foreign links @ clearly shown.

32
Q

What is atomic data?

A

Data composed to lowest level, attribute that can’t be broken down any further.

33
Q

What is a foreign key?

A

Attribute, or set of attributes, within 1 relation that matches candidate key of some relation.

34
Q

What is referential integrity?

A

E.g. Any student added to enrollment must first exist in student.

35
Q

What is a schema?

A

Standard notation for documenting entity names, attributes + identifiers. i.e. Entity […Attributes…], Primary key underlined. (Attributes) means repeated fields. E.g. Registration [Student_Number, Surname, Forename, Address, Gender, (Module_Code, Module_Title, Lecturer_Code, Lecture)]

36
Q

What are some of the keywords + their meanings in databases? How should attributes be named?

A

Entity = table, attribute = field, instance of entity = row in table. When naming attributes, remove spaces + don’t use reserved words or special characters.

37
Q

What are the MySQL data types?

A

MySQL data types: TINYINT (-128 to 127 signed, 0 – 255 unsigned), SMALLINT, MEDIUMINT, INT, BIGINT, FLOAT, DOUBLE (larger than float), BOOLEAN, DATE, DATETIME (ms precision), TIMESTAMP (us precision), TIME (us precision), YEAR, CHAR (fixed length of characters), VARCHAR (variable length of characters), TEXT (TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT), ENUM (set of options).

38
Q

What are some of the additional details necessary in MySQL regarding data types?

A

Whether it’s required/can be null, if indexed, if a default value available, if must be unique.

39
Q

What is a data dictionary?

A

Rep of physical database design.

40
Q

What is SQL?

A

Domain-specific language (DSL), used in programming + designed for managing data (organising + retrieving info) held in RDBMS. ANSI standard 1986, ISO standard 1987, not entirely portable. Designed to be human-readable.

41
Q

What are some of the key words used in SQL?

A

CREATE TABLE, ADD INDEX, INSERT, DELETE, UPDATE, SELECT. E.g. SELET name FROM pet WHERE pet id = 40255. Use * for all. Also SUM(), COUNT() + AVG(). Can rename fields e.g. SELECT name AS “Pet Name” FROM Pet. In text, there’s = (exactly equals, case insensitive) or LIKE and % (wildcard), esp for last letter.

42
Q

How do we order SQL results?

A

ORDER BY field ASC/DESC. Can do multiple fields in order of priority just use comma.

43
Q

How do we limit num of SQL results returned?

A

We can limit num of fields using LIMIT start, num from start.

44
Q

What is concurrency?

A

Introduces possibility of wrong info being written, updates overwriting each other. Can use locks in SQL to tell DB we only want to be able to read/write data. Can be done on tables or rows. Similar to feature called transactions (group of statements grouped + executed together which can be rolled back if fail). E.g. LOCK TABLES table READ/WRITE. UNLOCK TABLES.

45
Q

How do READ locks work?

A

Multiple READ locks can be held by diff sessions. While READ locked, locking session can read data, other sessions can start READ locks + read from tables, other sessions without locks can read. Nobody can write.

46
Q

How do WRITE locks work?

A

Session holding WRITE lock can read + write from table, only 1 WRITE lock held at time. Nobody else can read/write from table.

47
Q

What is VIEW?

A

Virtual table (like pre-built query output), used for commonly used outputs + access control, appear as real tables. Once created, stored as part of schema. GRANT access to resource in DB to specific USER for specific OPERATIONS. E.g. GRANT SELECT ON db.table TO user@host IDENTIFIED BY password.

48
Q

Why is db security important?

A

Commercial sensitivity (financial losses through frauds by employees) + personal privacy + data protection (leaking bad). Key issues: availability, authenticity, integrity, confidentiality.

49
Q

Describe availability in terms of db security.

A

Data must be available at all times, to appropriate users + must track who has access to it + who has accessed it.

50
Q

Describe authenticity in terms of db security.

A

Ensure data has been edited by authorised source, ensure users accessing system are who they say they are, verify all report requests are from authorised users + verify any outbound data is going to expected receiver.

51
Q

Describe integrity in terms of db security.

A

Integrity: Verify external data has correct formatting + other metadata, verify input data accurate + verifiable, ensure data following correct workflow rules for company, be able to report on all data changes + who authored them to ensure compliance with corporate rules + privacy laws.

52
Q

Describe confidentiality in terms of db security.

A

Ensure confidential data only available to correct people, ensure db is secure from external + internal system breaches, provide for reporting on who has accessed what data + what they’ve done with it. Mission critical + legal sensitive data must be highly secure at risk of lost business + litigation.

53
Q

What are security models?

A

Access control (method/system module to ensure users are who they say they are + have appropriate access to data e.g. log-in/off, password management. Authentication + authorisation).

54
Q

What is authentication?

A

DB authentication (process/act of confirming user who is attempting to log into db is authorised to do so + accorded rights to perform only activities they’ve been authorised to do). Authentication is prerequisite for authorisation.

55
Q

What is authorisation?

A

Process where db manager gets info about authenticated user.

56
Q

What is GRANT?

A

GRANT ACTION ON db.table TO person IDENTIFIED BY password. WITH GRANT OPTION allows user to pass grant onto other people.

57
Q

What are stored procedures?

A

Ways of creating code blocks you can reuse. Procedures do task but may not return a value, function does something but can return data. Can provide API (application programming interface for db app), easy maintenance (centralise complex code in single maintainable instance). Used for business logic, can receive + send parameters identified as IN, OUT, INOUT. Can read data from IN parameters + write data via OUT parameters + both from INOUT parameters.

58
Q

What are triggers?

A

When event occurs, do something. Allows us to check + fix integrity of data at db level, can cascade data creation automatically, can audit changes made, can run events every time change occurs. E.g BEFORE/AFTER INSERT/UPDATE/DELETE