introduction Flashcards

1
Q

what is data

A

anything that can be represented in binary

2
Q

why do we collect data

A

so that we can retrieve information and understand it

3
Q

what is data engineering

A

the process of designing and building systems that allow people to collect, manage, and analyse data

4
Q

what do data engineers do

A

make raw data usable for data scientists and other third parties

5
Q

which 6 things are data engineers responsible for

A

pipelines
integration
quality
analysis
security
automation

6
Q

data pipelines

A

flows that manage and process large data sets

7
Q

data integration

A

ensuring that data from different sources is integrated seamlessly

8
Q

data quality

A

making sure the data infrastructure is reliable, efficient, and of high quality

9
Q

data analysis

A

analysing raw data to show trends and provide predictive models

10
Q

data security

A

protecting data against loss/theft

11
Q

automation

A

automating tasks within the data pipeline, which improves efficiency

12
Q

what is a database

A

a structured system for storing, retrieving, and managing data

13
Q

what is raw data

A

unprocessed data as it was collected from its source, e.g. data kept in an Excel file

14
Q

how is data stored in a database

A

organised in a structured format using data models

15
Q

data model

A

defines how data will be related and stored

16
Q

what are some negatives of file processing applications

A

they were hard to implement, so you would have to hire a developer, which is expensive
issues with optimisation, performance, reliability, and reuse

17
Q

what is a file processing application

A

a set of files that are processed to retrieve information

18
Q

logical model

A

a logical representation of the data that aids processing; it can be shown graphically as an entity relationship diagram

19
Q

dbms

A

database management system; uses the logical model to control the database and provides efficient, reliable, multi-user storage of and access to large amounts of persistent data

20
Q

what is the difference between source code, a program, and software

A

source code; human-readable code written by developers that isn't directly executable
program; the executable form of the source code, produced by compiling or interpreting it
software; the entire collection of programs, libraries, and related data

21
Q

ddl

A

data definition language; defines the structure of the database, e.g. creates, alters, and drops tables

22
Q

dml

A

data manipulation language; modifies the data held in tables, e.g. inserts, updates, and deletes rows
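
The DDL/DML split can be seen in a minimal sketch using Python's built-in sqlite3 module. The `student` table and its rows are hypothetical, chosen only for illustration; other database systems use the same kinds of statements with small syntax differences.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# DDL: defines structure (a hypothetical "student" table).
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# DML: changes the rows held in that structure.
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO student (id, name) VALUES (2, 'Alan')")
conn.execute("UPDATE student SET name = 'Ada L.' WHERE id = 1")
conn.execute("DELETE FROM student WHERE id = 2")

# Read back what is left after the DML statements.
rows = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()
print(rows)
```

Only the CREATE TABLE statement is DDL here; the INSERT, UPDATE, and DELETE statements are DML.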

23
Q

what is an entity and how is it represented in an er diagram

A

an object with distinguishable attributes
rectangle

24
Q

entity set

A

a representation of entities with the same set of attributes

25
Q

what is an attribute and how is it represented in an er diagram

A

information that describes an aspect of an entity
oval

26
Q

primary key

A

unique identifier for an entity

27
Q

what is a derived attribute and how is it represented in an er diagram

A

an attribute that is not stored in the database but computed from other stored attributes using system resources (e.g. age from date of birth)
dashed/broken oval
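
A derived attribute can be sketched as a function of stored values. The example below assumes a stored date of birth and derives an age from it; the function name `age_on` and the sample dates are hypothetical.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in whole years from a stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


# Only the date of birth would be stored; the age is computed on demand.
stored_dob = date(2000, 6, 15)
print(age_on(stored_dob, date(2024, 6, 14)))
print(age_on(stored_dob, date(2024, 6, 15)))
```

Computing the value on demand avoids storing something that would go stale, at the cost of a little processing each time it is read.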

28
Q

what are multi-valued attributes and how are they represented in an er model

A

attributes with more than one value
double oval
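
One common way to store a multi-valued attribute in a relational database is a separate table with one row per value. This is a sketch of that approach, not necessarily the one the course uses; the `person` and `person_phone` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The entity itself, with its single-valued attributes.
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT)")

# The multi-valued attribute "phone": one row per (person, phone) pair.
conn.execute(
    """
    CREATE TABLE person_phone (
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        phone TEXT NOT NULL,
        PRIMARY KEY (person_id, phone)
    )
    """
)

conn.execute("INSERT INTO person VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO person_phone VALUES (?, ?)",
    [(1, "555-0100"), (1, "555-0101")],
)

# Collect every phone value recorded for person 1.
phones = [
    row[0]
    for row in conn.execute(
        "SELECT phone FROM person_phone WHERE person_id = 1 ORDER BY phone"
    )
]
print(phones)
```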

29
Q

composite attributes

A

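an attribute made by grouping together existing attributes (e.g. an address made of street, city, and postcode); the group itself is not a column in the table, only its component attributes are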

30
Q

relation

A

a logical model for associating 2 or more entities

31
Q

what is a relationship set and how is it represented in an er model

A

a set of relationships of the same type; it can have its own attributes and a primary key
diamond

32
Q

degree of relationship set

A

number of different entity sets in the relationship

33
Q

one to one relationship

A

each entity in one set relates to at most one entity in the other set; created whilst creating the table and can lead to new tables

34
Q

total participation (must)

A

every entity in the entity set must relate to at least one entity in the other set

35
Q

partial participation (may)

A

not all entities have to map to another entity in the other set

36
Q

what is combined to create the subject table key

A

owner table primary key + subject table weak key
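
The combined key can be sketched as a composite primary key in SQLite. The `employee` (owner) and `dependent` (subject) tables are hypothetical, and `dependent_name` stands in for the subject table's weak key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Owner table with its own primary key.
conn.execute("CREATE TABLE employee (employee_id INTEGER PRIMARY KEY)")

# Subject table: its key is the owner's primary key plus its own weak key.
conn.execute(
    """
    CREATE TABLE dependent (
        employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
        dependent_name TEXT NOT NULL,
        PRIMARY KEY (employee_id, dependent_name)
    )
    """
)

conn.executemany("INSERT INTO employee VALUES (?)", [(1,), (2,)])

# The same weak key value is fine under different owners.
conn.execute("INSERT INTO dependent VALUES (1, 'Sam')")
conn.execute("INSERT INTO dependent VALUES (2, 'Sam')")

# Repeating the full (owner key, weak key) pair violates the primary key.
try:
    conn.execute("INSERT INTO dependent VALUES (1, 'Sam')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

dependent_count = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
print(duplicate_rejected, dependent_count)
```

The weak key alone cannot identify a row, since 'Sam' appears twice; only the pair is unique.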

37
Q

overlap constraints

A

restrictions on whether an entity may belong to more than one subclass at once, ensuring that entities don't simultaneously occupy conflicting roles

38
Q

covering constraints

A

ensures that every entity in the superclass is classified into an appropriate subclass, leaving no gaps in the er model

39
Q

what are the 2 reasons that we use ISA

A

identifies entities that participate in relationships
adds descriptive attributes to a specific subclass