Introduction Flashcards
Why is it good to use big data in corporate decision-making, according to a study by Brown et al. (2011)?
Companies that use data and business analytics to guide decision-making are more productive and experience higher returns on equity than competitors that do not.
What are “the 5 Vs of Big Data”?
Volume, value, velocity, veracity & variety
What is meant by the first V, “volume”?
refers to the amount of all types of data generated from different sources, an amount that continues to expand. The benefit of gathering large amounts of data includes the ability to uncover hidden information and patterns through data analysis.
What does the second V, “variety”, mean?
refers to the different types of data collected via sensors, smartphones, or social networks. Such data types include video, image, text, audio, and data logs, in either structured or unstructured format.
What does the third V, “velocity”, mean?
refers to the speed of data transfer. The contents of data constantly change because of the absorption of complementary data collections, the introduction of previously archived data or legacy collections, and streamed data arriving from multiple sources.
What is included in “data sources”?
Social media, machine-generated data, sensing, transactions, and IoT (“IoT represents a set of objects that are uniquely identifiable as a part of the Internet.”)
What does the fourth V, “value”, mean?
is the most important aspect of big data; it refers to the process of discovering huge hidden values from large datasets with various types and rapid generation.
Into which 5 categories is big data classified, according to Hashem et al. (2015)?
(i) data sources, (ii) content format, (iii) data stores, (iv) data staging, and (v) data processing.
What is included in “content format”?
Structured data, unstructured data & semi-structured data
What is structured data?
Structured data are often managed with SQL, a programming language created for managing and querying data in an RDBMS. Structured data are easy to input, query, store, and analyze. Examples of structured data include numbers, words, and dates.
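To make the card above concrete, here is a minimal sketch of structured data being stored and queried with SQL. It uses Python's built-in `sqlite3` module; the `orders` table, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3


def query_structured_orders():
    """Store a few rows in a fixed schema and aggregate them with SQL."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    # Every row must fit this schema: that is what makes the data "structured".
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, amount, order_date) VALUES (?, ?, ?)",
        [
            ("Alice", 20.0, "2015-01-03"),
            ("Bob", 35.5, "2015-01-04"),
            ("Alice", 10.0, "2015-01-05"),
        ],
    )
    # Total amount per customer, sorted by customer name.
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    conn.close()
    return rows
```

Because each column has a known name and type, the aggregation is a single declarative query rather than custom parsing code.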
What is semi-structured data?
Semi-structured data are data that do not follow a conventional database system. Semi-structured data may be in the form of structured data that are not organized in relational database models, such as tables.
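As an illustration of the card above, JSON is a common example of semi-structured data: each record is self-describing, but records need not share the same fields. The sketch below assumes a hypothetical list of customer records (`RAW`) and counts how often each field appears.

```python
import json

# Hypothetical records: same kind of entity, but different fields per record.
RAW = """[
  {"id": 1, "name": "Alice", "email": "alice@example.com"},
  {"id": 2, "name": "Bob", "tags": ["vip", "newsletter"]},
  {"id": 3, "name": "Carol", "address": {"city": "Lund"}}
]"""


def field_usage(raw_json):
    """Count how many records contain each top-level field."""
    records = json.loads(raw_json)
    counts = {}
    for record in records:
        for key in record:
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Only `id` and `name` occur in every record here; the remaining fields are optional, which is why such data does not map cleanly onto a single relational table.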
What is unstructured data?
Unstructured data, such as text messages, location information, videos, and social media data, are data that do not follow a specified format.
Which 4 things are included in “data stores”?
Document-oriented, column-oriented, graph database, key-value
What is a document-oriented data store?
Document-oriented data stores are mainly designed to store and retrieve collections of documents or information and support complex data forms in several standard formats, such as JSON, XML, and binary forms (e.g., PDF and MS Word).
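The idea on the card above can be sketched with a toy in-memory store. This is not the API of any real document database; `TinyDocumentStore` and its methods are hypothetical names, and documents are simply kept as JSON strings keyed by an integer id.

```python
import json


class TinyDocumentStore:
    """Toy document store: whole documents are saved and retrieved as units."""

    def __init__(self):
        self._docs = {}    # doc_id -> JSON string
        self._next_id = 1

    def insert(self, document):
        """Serialize a document (a dict) to JSON and return its new id."""
        doc_id = self._next_id
        self._docs[doc_id] = json.dumps(document)
        self._next_id += 1
        return doc_id

    def get(self, doc_id):
        """Return the document stored under doc_id."""
        return json.loads(self._docs[doc_id])

    def find(self, **criteria):
        """Return all documents whose top-level fields match the criteria."""
        results = []
        for raw in self._docs.values():
            doc = json.loads(raw)
            if all(doc.get(key) == value for key, value in criteria.items()):
                results.append(doc)
        return results
```

For example, after inserting `{"type": "invoice", "total": 120}` and `{"type": "note", "text": "call back"}`, calling `find(type="note")` returns only the second document, even though the two documents have different fields.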
What is a column-oriented data store?
A column-oriented database, such as BigTable, stores its content by column rather than by row, with attribute values belonging to the same column stored contiguously. This differs from classical database systems, which store entire rows one after the other [16].
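The difference between the two layouts can be sketched in a few lines. The example below is a simplified in-memory illustration, not how any particular database is implemented; the `ROWS` sample data and the helper names are hypothetical, and `to_columns` assumes every row has the same fields.

```python
# Row-oriented layout: each record is stored as one unit.
ROWS = [
    {"id": 1, "city": "Lund", "sales": 120},
    {"id": 2, "city": "Malmo", "sales": 80},
    {"id": 3, "city": "Lund", "sales": 50},
]


def to_columns(rows):
    """Regroup row records so that each attribute's values sit together."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented layout: one contiguous list per attribute.
COLUMNS = to_columns(ROWS)


def total_sales(columns):
    """Aggregate one attribute by reading only its column."""
    return sum(columns["sales"])
```

Summing `sales` touches a single list in the columnar layout, whereas the row layout would require visiting every full record; this is the usual motivation for column stores in analytical workloads.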