Introduction to Modelling Flashcards

(42 cards)

1
Q

what is metadata

A

data about data

2
Q

what is data

A

unprocessed information

3
Q

what is information

A

data associated together

4
Q

what is knowledge

A

understanding information

5
Q

what sort of software manages data

A

application-specific file formats, e.g. .xls, .doc, .mp4, .jpg

specialist data management applications, e.g. a COVID tracker

group project last year

6
Q

4 ways of adding structure to data files

A

delimited text field
fixed length field
length-based field
identified field

7
Q

what is a delimited text field

A

choose a special character, e.g. a comma or question mark, that will not appear as a legitimate character within the information field, and use it to separate the individual data entries, e.g. a comma-separated values (CSV) file
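
A minimal Python sketch of the idea, assuming a comma as the delimiter. The function names and the sample record are made up for illustration; the writer refuses any value that contains the delimiter, which is the restriction the card describes.

```python
DELIMITER = ","  # assumed delimiter; any character absent from the data would do


def write_delimited(fields):
    """Join the fields with the delimiter, rejecting values that contain it."""
    for value in fields:
        if DELIMITER in value:
            raise ValueError(f"field {value!r} contains the delimiter")
    return DELIMITER.join(fields)


def read_delimited(line):
    """Split a delimited line back into its individual fields."""
    return line.split(DELIMITER)


# Hypothetical record used only to show the round trip.
record = write_delimited(["Ada", "Lovelace", "1815"])
fields = read_delimited(record)
```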

8
Q

What is a fixed length field

A

use a fixed length for each information field, e.g. 20 characters, padding the value out when it is shorter than the fixed length
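
A minimal Python sketch, assuming the card's example width of 20 characters and space padding. The function names are hypothetical. Note that stripping the padding on read would also strip any genuine trailing spaces in a value.

```python
FIELD_WIDTH = 20  # the example width from the card


def write_fixed(fields, width=FIELD_WIDTH):
    """Pad every field with spaces up to the fixed width and concatenate them."""
    for value in fields:
        if len(value) > width:
            raise ValueError(f"field {value!r} is longer than {width} characters")
    return "".join(value.ljust(width) for value in fields)


def read_fixed(line, width=FIELD_WIDTH):
    """Cut the line into width-sized slices and strip the padding."""
    return [line[i:i + width].rstrip() for i in range(0, len(line), width)]


# Hypothetical record used only to show the round trip.
record = write_fixed(["Ada", "Lovelace"])
fields = read_fixed(record)
```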

9
Q

what is the disadvantage of delimited text

A

the delimiter character cannot be used legitimately within the information itself

10
Q

what is a length-based field

A

writing the length of the information field before the information so we know exactly how much space it takes up
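
A minimal Python sketch, assuming the length is written as a three-digit decimal prefix. That prefix format and the function names are assumptions made for illustration. Because the reader counts characters, a value may contain any character, commas included.

```python
PREFIX_DIGITS = 3  # assumed: lengths are written as three decimal digits


def write_length_based(fields):
    """Write each field as its zero-padded length followed by the value."""
    parts = []
    for value in fields:
        if len(value) >= 10 ** PREFIX_DIGITS:
            raise ValueError("field too long for the length prefix")
        parts.append(f"{len(value):0{PREFIX_DIGITS}d}{value}")
    return "".join(parts)


def read_length_based(line):
    """Read a length prefix, then exactly that many characters, and repeat."""
    fields = []
    pos = 0
    while pos < len(line):
        length = int(line[pos:pos + PREFIX_DIGITS])
        start = pos + PREFIX_DIGITS
        fields.append(line[start:start + length])
        pos = start + length
    return fields


# Hypothetical record; the second value contains a comma on purpose.
record = write_length_based(["Ada", "a,b"])
fields = read_length_based(record)
```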

11
Q

what is an identified field

A

write the name of the information field and then its value, both represented as delimited text fields
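
A minimal Python sketch, assuming `=` separates a name from its value and `;` separates one pair from the next. Both delimiters, the function names and the sample record are assumptions made for illustration.

```python
PAIR_DELIMITER = ";"   # assumed separator between name/value pairs
VALUE_DELIMITER = "="  # assumed separator between a name and its value


def write_identified(record):
    """Write each field as name=value and join the pairs with ';'."""
    return PAIR_DELIMITER.join(
        f"{name}{VALUE_DELIMITER}{value}" for name, value in record.items()
    )


def read_identified(line):
    """Rebuild the name -> value mapping from a line of identified fields."""
    record = {}
    for pair in line.split(PAIR_DELIMITER):
        name, value = pair.split(VALUE_DELIMITER, 1)
        record[name] = value
    return record


# Hypothetical record used only to show the round trip.
line = write_identified({"name": "Ada", "born": "1815"})
record = read_identified(line)
```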

12
Q

what are the two approaches to turning data into information

A

structured and unstructured

13
Q

what is a structured way of turning data into information

A

deliberately associate data together into information, e.g. Excel, databases, data warehouses

14
Q

examples of structured approaches of turning data into information

A

Excel
databases
data warehouses

15
Q

what is an unstructured way of turning data into information

A

loosely manage data together to serve a specific information need, e.g. a search engine

16
Q

example of an unstructured approach of turning data into information

A

search engines

17
Q

example of structured querying

A

SQL, e.g. a SELECT with exact criteria on typed data
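
A minimal sketch of a structured query, run from Python with the standard-library `sqlite3` module. The `student` table and its rows are made up for illustration. The query states an exact criterion on a typed column and gets back every matching row.

```python
import sqlite3

# In-memory database with a hypothetical table, used only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Ada", 1), ("Grace", 2), ("Alan", 2)],
)

# Exact criterion on an INTEGER column: all second-year students.
rows = conn.execute(
    "SELECT name FROM student WHERE year = ? ORDER BY name", (2,)
).fetchall()
```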

18
Q

What does SQL stand for

A

structured query language

19
Q

example of unstructured querying

A

keyword-based or phrase-based queries, e.g. a search engine
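
A minimal Python sketch of keyword-based querying. The documents and the scoring rule (count how often the query words occur) are assumptions made for illustration, not how any real search engine ranks pages. The result is an ordering by estimated relevance, not an exact answer.

```python
# Hypothetical document collection.
DOCUMENTS = {
    "doc1": "relational databases store data in tables",
    "doc2": "search engines rank web pages by relevance",
    "doc3": "data warehouses store timestamped data",
}


def keyword_search(query, documents):
    """Rank documents by how many times the query keywords occur in them."""
    terms = query.lower().split()
    scored = []
    for doc_id, text in documents.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


results = keyword_search("store data", DOCUMENTS)
```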

20
Q

examples of structured results

A

exact; no need to estimate relevance; returns the complete set of data that matches the query criteria

21
Q

example of unstructured results

A

unsure of relevance; we must estimate relevance ourselves, e.g. a Google results page

22
Q

DBs stands for

A

databases

23
Q

ACID stands for

A

atomic
consistent
isolated
durable
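
A minimal sketch of the atomic property using Python's standard-library `sqlite3` module. The `account` table, the balances and the `transfer` helper are hypothetical. A transfer either applies both updates or neither: if the second update breaks the balance check, the first is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Alice cannot afford 500, so the credit to Bob is undone as well.
ok = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM account"))
```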

24
Q

different types of database models

A

relational
network
hierarchical
object-oriented

25
Q

most popular type of database model

A

relational

26
Q

why didn't the other types of database models take off

A

most investment was made in relational databases

27
Q

DWs stands for

A

data warehouses

28
Q

what is a data warehouse

A

a subject-oriented, non-volatile, time-variant collection of data in support of management decisions

29
Q

uses of data warehouses

A

data mining
decision support
OLAP (online analytical processing)

30
Q

what do data warehouses allow

A

a trend view of the data, because it is timestamped

31
Q

information retrieval cycle in the unstructured approach

A

``` information needed >>> query >>> query indexing >>> refined query >>> matching/retrieval >>> recommended objects >>> user review >>> ```
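
The indexing, matching and refinement steps of the cycle above can be sketched in Python with a small inverted index. The documents and function names are hypothetical, and matching on "contains every query word" is a simplifying assumption. The user reviews the first result set and then narrows the query.

```python
from collections import defaultdict

# Hypothetical document collection.
DOCUMENTS = {
    "doc1": "relational databases store data in tables",
    "doc2": "search engines rank web pages by relevance",
    "doc3": "data warehouses store timestamped data",
}


def build_index(documents):
    """Map every word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def retrieve(query, index):
    """Return the sorted ids of documents containing every query word."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set(index.get(terms[0], set()))
    for term in terms[1:]:
        matches &= index.get(term, set())
    return sorted(matches)


index = build_index(DOCUMENTS)
first_pass = retrieve("data", index)          # initial query
refined = retrieve("data warehouses", index)  # refined after user review
```
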
32
Q

what are four common challenges in managing data for enterprises and individuals

A

volume
validity
variety
velocity

33
Q

how is volume a challenge in managing data

A

data volumes keep getting bigger and bigger

34
Q

what are legacy systems

A

old information systems; as technology changes, information about enterprises, customers etc. stays constant, so this information remains stored in a legacy system

35
Q

how is velocity a challenge in managing data

A

data is often time-sensitive, so it must be processed in real time as it streams in, in order to maximize its value
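
A minimal Python sketch of processing data as it streams in, one reading at a time, rather than waiting for the whole data set. The running average and the sample readings are assumptions chosen only to illustrate the idea.

```python
def running_average(stream):
    """Yield the average of all readings seen so far, one per incoming reading."""
    total = 0.0
    count = 0
    for reading in stream:
        total += reading
        count += 1
        yield total / count


# Hypothetical readings; in practice the stream could be unbounded.
readings = iter([10, 20, 30])
averages = list(running_average(readings))
```
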
36
Q

how is variety a challenge in managing data

A

data comes in many forms, e.g. text, audio, video, click streams, log files and more; it is difficult to label it all and to agree on the labels

37
Q

how is validity a challenge in managing data

A

trade-offs between data privacy and protection

38
Q

solution for coping with variety challenges

A

natural language processing
semantic web technologies

39
Q

what is the difference between natural language processing and semantic web technologies

A

NLP is about understanding the context and matching based on that; semantic web technologies are more about labelling things correctly and agreeing on the labels

40
Q

solution for coping with volume challenge

A

outsource information management and technologies, e.g. to the cloud

41
Q

solution for coping with validity challenges

A

GDPR; lots of work is going on about this at EU level: data ethics, data privacy

42
Q

what does GDPR stand for

A

General Data Protection Regulation