Introduction Flashcards

1
Q

Why is it good to use big data in corporate decision-making, according to a study by Brown et al. (2011)?

A

Companies that use data and business analytics to guide decision-making are more productive and achieve a higher return on equity than competitors that do not.

2
Q

What are “the 5 Vs of Big Data”?

A

Volume, value, velocity, veracity & variety

3
Q

What is meant by the first V, “volume”?

A

Volume refers to the amount of all types of data generated from different sources, which continues to expand. The benefit of gathering large amounts of data includes uncovering hidden information and patterns through data analysis.

4
Q

What does the second V, “variety”, mean?

A

Variety refers to the different types of data collected via sensors, smartphones, or social networks. Such data types include video, image, text, audio, and data logs, in either structured or unstructured format.

5
Q

What does the third V, “velocity”, mean?

A

Velocity refers to the speed of data transfer. The contents of data constantly change because of the absorption of complementary data collections, the introduction of previously archived data or legacy collections, and streamed data arriving from multiple sources.

6
Q

What is included in “data sources”?

A

Social media, machine-generated data, sensing, transactions, and IoT (“IoT represents a set of objects that are uniquely identifiable as a part of the Internet.”)

6
Q

What does the fourth V, “value”, mean?

A

Value is the most important aspect of big data; it refers to the process of discovering large hidden values in large datasets of various types that are generated rapidly.

6
Q

Into which 5 categories is big data classified according to Hashem et al. (2015)?

A

(i) data sources, (ii) content format, (iii) data stores, (iv) data staging, and (v) data processing.

7
Q

What is included in “content format”?

A

Structured data, unstructured data & semi-structured data

8
Q

What is structured data?

A

Structured data are often managed with SQL, a programming language created for managing and querying data in an RDBMS. Structured data are easy to input, query, store, and analyze. Examples of structured data include numbers, words, and dates.
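As a rough illustration of why structured data are easy to query, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `orders` table, its columns, and the sample rows are hypothetical and chosen only for this example.

```python
import sqlite3

# Hypothetical table and rows, used only to illustrate the idea.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount, order_date) VALUES (?, ?, ?)",
    [
        ("Alice", 120.0, "2024-01-05"),
        ("Bob", 80.5, "2024-01-06"),
        ("Alice", 40.0, "2024-01-09"),
    ],
)

# Every row has the same typed columns, so an aggregate is a single query.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()
```

Because the schema is fixed up front, the query can name its columns directly without checking whether a field exists.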

9
Q

What is semi-structured data?

A

Semi-structured data are data that do not follow a conventional database system. Semi-structured data may be in the form of structured data that are not organized in relational database models, such as tables.
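One common case of semi-structured data is JSON, where each record describes its own fields but no table-like schema is enforced. The sketch below is a minimal illustration with hypothetical records; it is not drawn from the source material.

```python
import json

# Hypothetical records: each is self-describing, but the fields differ.
raw = """
[
  {"id": 1, "name": "Alice", "email": "alice@example.com"},
  {"id": 2, "name": "Bob", "phones": ["555-0100", "555-0101"]},
  {"id": 3, "name": "Carol", "address": {"city": "Lund"}}
]
"""
records = json.loads(raw)

# There is no fixed schema, so collect whichever keys actually occur.
all_keys = sorted({key for record in records for key in record})

# Missing fields have to be handled explicitly, here with dict.get.
emails = [record.get("email") for record in records]
```

The records share some structure (`id`, `name`) yet would not fit a single relational table without adding nullable columns or extra tables.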

10
Q

What is unstructured data?

A

Unstructured data, such as text messages, location information, videos, and social media data, are data that do not follow a specified format.

11
Q

Which 4 things are included in “data stores”?

A

Document-oriented, column-oriented, graph database, key-value

12
Q

What is a document-oriented data store?

A

Document-oriented data stores are mainly designed to store and retrieve collections of documents or information and support complex data forms in several standard formats, such as JSON, XML, and binary forms (e.g., PDF and MS Word).

13
Q

What is a column-oriented data store?

A

A column-oriented database, such as BigTable, stores its content in columns rather than rows, with attribute values belonging to the same column stored contiguously. This differs from classical database systems, which store entire rows one after the other [16].
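The difference between the two layouts can be sketched with plain Python containers. This is only a conceptual model with a hypothetical three-row table; real column stores add compression, partitioning, and on-disk formats that are not shown here.

```python
# Hypothetical table with columns id, city, sales.
rows = [
    (1, "Lund", 10.0),
    (2, "Malmo", 20.0),
    (3, "Lund", 30.0),
]

# Row-oriented layout: the values of one record sit together.
row_store = list(rows)

# Column-oriented layout: the values of one attribute sit together.
column_store = {
    "id": [r[0] for r in rows],
    "city": [r[1] for r in rows],
    "sales": [r[2] for r in rows],
}

# Summing one attribute reads a single contiguous list in the column layout,
total_from_columns = sum(column_store["sales"])
# whereas the row layout has to walk through every full record.
total_from_rows = sum(r[2] for r in row_store)
```

Both totals are equal; the layouts differ only in how much unrelated data a single-attribute scan has to touch.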

14
Q

What is a graph database?

A

A graph database, such as Neo4j, is designed to store and represent data that utilize a graph model with nodes, edges, and properties related to one another through relations.
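The node, edge, and property model can be sketched in plain Python. This does not use Neo4j or its query language; the node names, relation types, and helper functions below are hypothetical and exist only to show the shape of the data.

```python
# Nodes keyed by id, each carrying a dictionary of properties.
nodes = {
    "alice": {"label": "Person", "age": 34},
    "bob": {"label": "Person", "age": 29},
    "acme": {"label": "Company"},
}

# Edges are (source, relation, target, properties) tuples.
edges = [
    ("alice", "KNOWS", "bob", {"since": 2019}),
    ("alice", "WORKS_AT", "acme", {"role": "analyst"}),
    ("bob", "WORKS_AT", "acme", {"role": "engineer"}),
]


def neighbors(node_id, relation):
    """Return targets reachable from node_id over edges of one relation."""
    return [dst for src, rel, dst, _ in edges if src == node_id and rel == relation]


def colleagues(node_id):
    """Return other nodes that share a WORKS_AT target with node_id."""
    employers = set(neighbors(node_id, "WORKS_AT"))
    return sorted(
        src
        for src, rel, dst, _ in edges
        if rel == "WORKS_AT" and dst in employers and src != node_id
    )
```

For example, `colleagues("alice")` follows two relations in sequence, the kind of traversal a graph model is meant to make cheap.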

15
Q

What is a key-value store?

A

A key-value store is an alternative to relational database systems that stores and accesses data as key-value pairs and is designed to scale to a very large size.
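One reason key-value stores scale well is that each key can be assigned to a partition independently of all others. The toy class below sketches that idea with in-memory dictionaries; `ShardedKeyValueStore` and its methods are hypothetical names, not the API of any real product.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store split across a fixed number of in-memory shards."""

    def __init__(self, num_shards=4):
        self.shards = [{} for _ in range(num_shards)]

    def _shard_for(self, key):
        # A stable hash of the key decides which shard owns it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self.shards)
        return self.shards[index]

    def put(self, key, value):
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)


store = ShardedKeyValueStore()
store.put("user:42", {"name": "Alice"})
store.put("user:43", {"name": "Bob"})
```

Lookups need only the key, so adding shards spreads the load without any join or cross-record coordination.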

16
Q

Which 3 things are included in “data staging”?

A

Cleaning, transform, normalization

17
Q

What does cleaning (a step in data staging) involve?

A

Cleaning is the process of identifying incomplete and unreasonable data.
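A minimal sketch of such a check, assuming hypothetical sensor readings where `None` marks a missing value and an assumed plausible temperature range:

```python
# Hypothetical sensor readings; None marks a missing value.
readings = [
    {"sensor": "a", "temp_c": 21.5},
    {"sensor": "b", "temp_c": None},
    {"sensor": "c", "temp_c": 999.0},
    {"sensor": None, "temp_c": 19.0},
]

# Plausibility range assumed for this example only.
MIN_TEMP_C, MAX_TEMP_C = -50.0, 60.0


def problems(record):
    """Return the list of issues found in one record."""
    issues = []
    if any(value is None for value in record.values()):
        issues.append("incomplete")
    temp = record["temp_c"]
    if temp is not None and not (MIN_TEMP_C <= temp <= MAX_TEMP_C):
        issues.append("unreasonable")
    return issues


flagged = [(i, problems(r)) for i, r in enumerate(readings) if problems(r)]
clean = [r for r in readings if not problems(r)]
```

The sketch only identifies problem records; whether to drop, repair, or impute them is a separate decision.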

18
Q

What does transform (a step in data staging) involve?

A

Transform is the process of transforming data into a form suitable for analysis.
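As a small illustration, the sketch below turns text fields into typed values that can be analyzed. The input convention (space as thousands separator, comma as decimal mark) and the field names are assumptions made for this example.

```python
from datetime import date

# Hypothetical raw rows as they might arrive from an export file.
raw_rows = [
    {"date": "2024-01-05", "amount": "1 200,50"},
    {"date": "2024-01-06", "amount": "80,00"},
]


def transform(row):
    """Convert text fields into a date and a float."""
    # Assumed convention: space as thousands separator, comma as decimal mark.
    amount = float(row["amount"].replace(" ", "").replace(",", "."))
    return {"date": date.fromisoformat(row["date"]), "amount": amount}


prepared = [transform(row) for row in raw_rows]
```

After this step the amounts can be summed and the dates compared, which the raw strings did not allow.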

19
Q

What does normalization (a step in data staging) involve?

A

Normalization is the method of structuring a database schema to minimize redundancy.
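The effect on redundancy can be sketched with plain Python structures standing in for tables. The flat order list and its fields are hypothetical.

```python
# Denormalized: the customer's city is repeated on every order row.
orders_flat = [
    {"order_id": 1, "customer": "Alice", "city": "Lund", "amount": 120.0},
    {"order_id": 2, "customer": "Bob", "city": "Malmo", "amount": 80.5},
    {"order_id": 3, "customer": "Alice", "city": "Lund", "amount": 40.0},
]

# Normalized: each customer fact is stored once; orders refer to it by key.
customers = {}
orders = []
for row in orders_flat:
    customers[row["customer"]] = {"city": row["city"]}
    orders.append(
        {
            "order_id": row["order_id"],
            "customer": row["customer"],
            "amount": row["amount"],
        }
    )
```

If Alice moves, the normalized form needs one update in `customers` instead of one per order.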

20
Q

Which 2 things are included in “data processing”?

A

Batch & real time

21
Q

What is batch in the context of data processing?

A

Batch processing is the method computers use to periodically complete high-volume, repetitive data jobs.
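A minimal sketch of the pattern: records accumulate first, then one job processes the whole set in a single pass. The transaction data and the function name `run_batch_job` are hypothetical.

```python
from collections import defaultdict

# Hypothetical day of accumulated transactions as (account, amount) pairs.
accumulated = [("acc1", 10.0), ("acc2", 5.0), ("acc1", -3.0), ("acc2", 7.5)]


def run_batch_job(records):
    """Process the whole accumulated set in one pass and return a summary."""
    balances = defaultdict(float)
    for account, amount in records:
        balances[account] += amount
    return dict(balances)


# In practice a scheduler would trigger this periodically, e.g. nightly.
nightly_report = run_batch_job(accumulated)
```

Results are only available after the job has run, which is the main trade-off against real-time processing.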

22
Q

What is real time in the context of data processing?

A

Real-time processing is a method of processing data at a near-instant rate, requiring a constant flow of data intake and output to maintain real-time insights.
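The constant intake and output can be sketched with a Python generator that emits an updated result after every incoming value. The event values and the function name `running_average` are hypothetical; a real system would read from an unbounded feed rather than a short list.

```python
def running_average(stream):
    """Yield the updated mean after every incoming value."""
    count, total = 0, 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count


# Hypothetical event source standing in for a continuous feed.
events = iter([4.0, 8.0, 6.0])
insights = list(running_average(events))
```

Each event produces an output immediately, so the insight is always current instead of waiting for a periodic job.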

23
Q

What does “veracity” mean? (one of the 5 Vs)

A

Veracity refers to the truthfulness of the data, for example how recently it was updated.