Data Storage and Processing Flashcards

1
Q

What challenge do most existing IoT solutions face?

A

Most IoT solutions are tailored to specific verticals, leading to separate data silos, which makes it difficult to capture the full potential of IoT across multiple domains.

2
Q

Why is handling IoT data from different domains challenging?

A

IoT data vary in structure, source, and description, which makes them complex to integrate and process properly across different domains.

3
Q

What is required to ensure interoperability of IoT devices?

A

IoT data must be stored in different network databases, shared among multiple nodes, analyzed by various tools, and interpreted by different machines to ensure interoperability.

4
Q

What is the Semantic Web, and how does it help with IoT data?

A

The Semantic Web, or linked data web, provides reasoning engines and tools to analyze and link IoT data meaningfully across various domains.

5
Q

What role does complex event processing play in IoT data analysis?

A

Complex event processing searches for dependencies and patterns in streaming IoT data, creating real-time insights to help businesses identify opportunities and threats early.

6
Q

Why is a single server insufficient for handling IoT data?

A

IoT data are often too large for a single server or database to handle, requiring distributed processing approaches like MapReduce.

7
Q

How does the MapReduce programming model help manage IoT data?

A

MapReduce distributes datasets across multiple databases to process the data separately and then recombines the results, making it possible to handle large volumes of structured and unstructured IoT data.

8
Q

How did the web evolve from its initial phase to the Semantic Web (Web 3.0)?

A

The web started as a collection of documents linked to each other and gradually evolved into the Semantic Web, where documents and pieces of data are meaningfully connected.

9
Q

What was unclear about the relationships between documents in the early phases of the web?

A

In the early phases, relationships between documents were unclear because they were not linked to specific pieces of data.

10
Q

What does the Semantic Web enable for users and machines?

A

The Semantic Web provides meaningful links between data, allowing users (both humans and machines) to explore and understand connections between pieces of information.

11
Q

What is linked data, and what does it create?

A

Linked data refers to semantically linking and integrating pieces of information across domains, creating a global web that connects data on topics like books, companies, and social media.

12
Q

How do machines use linked data in the Semantic Web?

A

Machines can connect distributed data sources, process new data as they appear on the web, and produce integrated results, enhancing applications like data browsers and search engines.

13
Q

What does a generic linked data browser allow users to do?

A

A generic linked data browser lets users browse a data source and travel along links to related sources, enhancing data exploration.

14
Q

What capability do linked data search engines provide?

A

Linked data search engines allow expressive query capabilities over aggregated data by crawling the global web of linked data.

15
Q

What is linked data?

A

Linked data refers to machine-readable, well-defined information published on the web that can be connected to external datasets from various sources.

16
Q

What format is used in linked data technologies to connect information?

A

Linked data technologies use the Resource Description Framework (RDF) format to create a web of data by linking different things.

17
Q

What kinds of data sources can linked data technologies connect?

A

Linked data can connect data sources ranging from geographically distributed databases to heterogeneous systems that cannot interoperate at the data level.

18
Q

Who specified the rules for publishing data as part of the global web of data?

A

Tim Berners-Lee, the inventor of the World Wide Web, specified the rules for publishing data as part of the global web of data.

19
Q

What are the four linked data principles as specified by Tim Berners-Lee?

A
  • Use Uniform Resource Identifiers (URIs) as names for things.
  • Use HTTP URIs so that people can look up those names.
  • Use RDF and SPARQL standards to provide useful data.
  • Include links to other URIs to help people discover more things.
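
The first two principles are easy to see in practice: an HTTP URI that names a thing can be dereferenced to fetch data about it. Below is a minimal Python sketch, assuming network access and that the public DBpedia resource URI still serves Turtle via content negotiation:

```python
# Dereference a linked data URI, asking for RDF (Turtle) instead of HTML.
import urllib.request

uri = "http://dbpedia.org/resource/Berlin"  # a URI naming a thing, not a page

req = urllib.request.Request(uri, headers={"Accept": "text/turtle"})
with urllib.request.urlopen(req) as resp:
    # DBpedia redirects the resource URI to a data document describing it.
    print(resp.geturl())                           # final URL after redirects
    print(resp.read(300).decode("utf-8", "replace"))  # first bytes of Turtle
```
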
20
Q

Which two fundamental web technologies are relied on by the first two linked data principles?

A

The first two linked data principles rely on Uniform Resource Identifiers (URIs) and Hypertext Transfer Protocol (HTTP).

21
Q

How does RDF enhance linked data?

A

RDF provides a generic, graph-based data model for structuring and linking data that describe things in the world, which is what makes linked data possible at scale.

22
Q

What does the Resource Description Framework (RDF) syntax encode and represent?

A

RDF encodes and represents web resources and data in a structure known as triples.

23
Q

What are the three components of an RDF triple?

A
  • Subject: A resource identified by a URI.
  • Predicate: A URI specifying the relationship between the subject and object.
  • Object: The resource (identified by a URI) or literal (a basic value, such as a string) to which the subject is related.
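
As a concrete sketch of these three components, the Python snippet below builds a tiny graph with the rdflib library (the example.org namespace and property names are hypothetical):

```python
from rdflib import Graph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/")  # hypothetical vocabulary namespace
g = Graph()

berlin = URIRef("http://example.org/Berlin")    # subject: a resource
is_capital_of = EX.isCapitalOf                  # predicate: the relationship
germany = URIRef("http://example.org/Germany")  # object: another resource

g.add((berlin, is_capital_of, germany))
g.add((berlin, EX.label, Literal("Berlin")))    # object as a literal value

print(g.serialize(format="turtle"))
```
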
24
Q

What does the predicate represent in an RDF triple?

A

The predicate specifies the relationship between the subject and the object, represented by a URI.

25
Q

What can the object in an RDF triple be?

A

The object can be either a resource identified by a URI or a literal (a basic value, such as a string).

26
Q

How are subjects and objects in RDF triples similar to hypertext links?

A

Like hypertext links that connect documents, subjects and objects in RDF triples link items in various datasets, contributing to the web of data.

27
Q

Give an example of an RDF triple relationship.

A

An example is “Berlin” (subject) and “Germany” (object) being related through the predicate “is the capital of,” showing that Berlin is the capital of Germany.

28
Q

What type of relationship exists in RDF between subject and object resources?

A

RDF defines a unidirectional relationship from the subject to the object resource.

29
Q

Can a resource in RDF be used in multiple triples? If yes, in what roles?

A

Yes, a resource can be used in various triples with different roles: as a subject, predicate, or object.

30
Q

What do multiple connections between RDF triples create?

A

Multiple connections between RDF triples create a connected graph of data.

31
Q

In an RDF graph, how are resources and predicates represented?

A
  • Resources are represented as nodes.
  • Predicates (relationships between nodes) are depicted with lines connecting the nodes.
32
Q

What is the significance of the connected graph in RDF?

A

The connected graph allows for multiple relationships between data points, enabling more complex and meaningful data linkages across different datasets.

33
Q

What is a major benefit of using centralized systems for RDF datasets?

A

Benefit: No communication overhead between different nodes, as all data storage and queries are processed on a single machine.

34
Q

What limits the capabilities of centralized systems in handling RDF datasets?

A

Limitation: The system is restricted by the memory and computational capacity of the single node.

35
Q

How do distributed systems improve over centralized systems for RDF datasets?

A

Improvement: Distributed systems offer larger memory and computational power by utilizing multiple machines.

36
Q

What are the potential drawbacks of distributed systems when processing RDF data?

A

Drawback 1: Expensive communication between machines. 
Drawback 2: Intermediate data shuffling during complex queries can degrade system performance.

37
Q

What is the DBpedia project, and what does it aim to do?

A

DBpedia Project: It extracts the structured content of Wikipedia and makes it available in RDF. It allows users to semantically query properties and relationships and to link to related datasets.
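
A brief sketch of such a semantic query, assuming the SPARQLWrapper Python package and the public DBpedia SPARQL endpoint:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
# Ask which country Berlin belongs to, using Wikipedia-derived RDF data.
sparql.setQuery("""
    SELECT ?country WHERE {
        <http://dbpedia.org/resource/Berlin>
            <http://dbpedia.org/ontology/country> ?country .
    }
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["country"]["value"])  # e.g., http://dbpedia.org/resource/Germany
```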

38
Q

How does the DBpedia project improve user experience in applications?

A

Improvement: Applications can exploit information from other datasets to enhance the user experience by linking related information in RDF triples.

39
Q

Why is RDF Schema (RDFS) used in conjunction with RDF?

A

RDF Schema (RDFS) is used to define classes of resources in RDF, enabling the categorization of things into hierarchical classes, which RDF alone does not support.
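
A minimal rdflib sketch of such a hierarchy (the namespace and class names are hypothetical):

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")  # hypothetical namespace
g = Graph()

# Class hierarchy: every TemperatureSensor is also a Sensor.
g.add((EX.Sensor, RDF.type, RDFS.Class))
g.add((EX.TemperatureSensor, RDF.type, RDFS.Class))
g.add((EX.TemperatureSensor, RDFS.subClassOf, EX.Sensor))

# A resource is an instance of a class and can carry extra descriptions.
g.add((EX.sensor42, RDF.type, EX.TemperatureSensor))
g.add((EX.sensor42, RDFS.label, Literal("rooftop temperature sensor")))

print(g.serialize(format="turtle"))
```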

40
Q

What is a resource in RDFS, and how is it classified?

A

A resource in RDFS is an instance of a certain class, and each class can have subclasses with additional descriptions.

41
Q

Does RDF Schema (RDFS) specify how applications should use the class descriptions?

A

No, RDFS does not specify how an application should use the descriptions of resources in the classes.

42
Q

How does linked data facilitate data abstraction in IoT?

A

Linked data uses common identifiers, such as Internationalized Resource Identifiers (IRIs), to integrate data from various IoT sensors into common data structures, enhancing data abstraction.

43
Q

What role do machines play in interpreting linked data in IoT?

A

Machines can interpret data descriptions by extracting the origin, attributes, and understanding the relationships between the data and other related information.

44
Q

What is the main purpose of the Internet of Things (IoT)?

A

The main purpose of IoT is to interpret the semantic data captured from various sources and sensors and transform it into actionable knowledge.

45
Q

Why are IoT data considered useless?

A

IoT data are considered useless if they cannot be understood or interpreted, as they must provide meaningful insights to be actionable.

46
Q

What challenges arise from the heterogeneous nature of IoT data?

A

The heterogeneous nature of IoT data presents challenges in ensuring interoperability among IoT devices due to the support for different protocols and data formats.

47
Q

How does the Semantic Web contribute to IoT?

A

The Semantic Web provides analytical tools and best practices that facilitate data reasoning, help satisfy interoperability requirements, and enable effective integration and analysis of different sources of IoT data.

48
Q

What is the relationship between the Internet of Things and the Semantic Web?

A

The relationship between IoT and the Semantic Web results in global interoperability between devices, enabling the generation of new services through effective data integration and analysis.

49
Q

What are the key open approaches developed by the Semantic Web community for data analytics?

A

The key open approaches include sharing and reusing open data through linked data, linked vocabularies, and linked services.

50
Q

How are semantic IoT data stored and managed in the Semantic Web?

A

Semantic IoT data are stored and managed in RDF databases as RDF graphs.

51
Q

What language is used for querying and reasoning over RDF graphs?

A

SPARQL is used for querying and reasoning over the stored RDF graphs.
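
A small sketch of a SPARQL query over an in-memory RDF graph, again using rdflib and a hypothetical sensor vocabulary:

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/")  # hypothetical sensor vocabulary
g = Graph()
g.add((EX.sensor42, EX.hasReading, Literal(21.5)))
g.add((EX.sensor43, EX.hasReading, Literal(38.0)))

# SPARQL over the stored graph: find readings above a threshold.
results = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?sensor ?value
    WHERE { ?sensor ex:hasReading ?value . FILTER(?value > 30) }
""")
for sensor, value in results:
    print(sensor, value)  # http://example.org/sensor43 38.0
```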

52
Q

What is the role of semantic technologies in data analytics?

A

Semantic technologies help derive meaning from collected data, transforming it into actionable information.

53
Q

What are some well-known methods and technologies employed by the Semantic Web to process IoT data?

A

Some well-known methods include linking data, real-time and linked stream processing, logic-based approaches, machine learning, distributed semantic reasoning, and cross-domain recommender systems.

54
Q

What advantage does linking data provide in the context of the Semantic Web?

A

Linking data allows for meaningful connections not just between documents, but also between machine-readable and interpretable datasets, enhancing data interoperability and insight extraction.

55
Q

What extension has been added to SPARQL to handle stream sensor data?

A

An extension called linked stream data has been added to SPARQL to help handle stream sensor data.

56
Q

What does linked stream data allow SPARQL to do?

A

Linked stream data allows SPARQL to enrich stream sensor data with linked open data, which is freely used and distributed.

57
Q

What mechanisms does the Semantic Web provide to ensure the consistency of IoT data?

A

The Semantic Web provides mechanisms to develop rules, check data consistency, and ensure that IoT data are logically valid.

58
Q

What is the purpose of the Linked Edit Rules (LER) approach?

A

The Linked Edit Rules (LER) approach checks the consistency of data to ensure its validity, such as verifying that relative humidity values cannot be negative.
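
LER expresses such constraints as machine-readable rules over linked data; the plain-Python sketch below conveys only the underlying idea (the rule set and ranges are illustrative):

```python
# Hypothetical edit rules: each maps a measurement field to a validity check.
EDIT_RULES = {
    "relative_humidity": lambda v: 0.0 <= v <= 100.0,   # percent, never negative
    "temperature_c":     lambda v: -90.0 <= v <= 60.0,  # plausible surface range
}

def violated_rules(observation):
    """Return the names of every edit rule the observation breaks."""
    return [field for field, is_valid in EDIT_RULES.items()
            if field in observation and not is_valid(observation[field])]

print(violated_rules({"relative_humidity": -3.0, "temperature_c": 21.0}))
# ['relative_humidity']
```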

59
Q

How is logic-based reasoning utilized in the analysis of IoT data?

A

Logic-based reasoning is used to analyze simple sensor data, such as temperature or humidity, and is characterized by being fast and easy to implement.

60
Q

When are machine learning techniques and data mining approaches applied in the context of IoT?

A

Machine learning techniques and data mining approaches are applied to reason about complex semantic IoT data (e.g., electrocardiography [ECG] signals) where logic-based reasoning alone is insufficient.

61
Q

Where is time-sensitive data processed in many IoT network architectures?

A

Time-sensitive data are processed at the edge of the network to reduce latency and save bandwidth.

62
Q

What new challenges are introduced by processing data at the edge of the network?

A

New challenges include the need for reasoning at every layer of the IoT data management and computational stack (i.e., cloud, fog, and edge) and the difficulty for smart nodes to understand and interpret data from heterogeneous IoT sensors.

63
Q

How does distributed reasoning improve reasoning latency in large datasets?

A

Distributed reasoning can improve reasoning latency by allowing processing to occur at the sensors and edge devices, thus reducing the amount of data sent to centralized locations.

64
Q

What are the advantages of distributed reasoning over centralized reasoning?

A
Distributed reasoning is advantageous when:
  • Data are distributed both logically and physically.
  • Communication costs are negligible compared to problem solution costs.
  • There is collaboration between the system’s components to solve problems.
65
Q

When is distributed reasoning particularly beneficial?

A
Distributed reasoning is particularly beneficial when:
  • Data are dynamic with ambiguous content.
  • Data size exceeds the computational capacity of IoT devices.
  • Sharing data and reasoning tasks can yield comprehensive intelligence.
66
Q

What impact does distributed reasoning have on the performance of knowledge systems?

A

Distributed reasoning can improve knowledge system performance by splitting large computational tasks into sub-tasks that can be solved more efficiently.

67
Q

What is the primary focus of traditional recommender systems?

A

Traditional recommender systems focus on a single vertical and assist users in finding their topic of interest among vast amounts of data in a specific domain.

68
Q

What do Cross-Domain Recommender Systems (CDRS) use to provide recommendations?

A

CDRS use data and knowledge gained from multiple source domains to provide recommendations in a target domain, assuming there is information overlap between items and users across different domains.

69
Q

How are cross-domain recommendation approaches classified?

A
Cross-domain recommendation approaches are classified into two categories based on how knowledge is exploited:
  • Knowledge linking approach: User preferences are merged, and recommendations from both domains are combined.
  • Knowledge sharing approach: The source domain transfers its data to the target domain for producing recommendations.
70
Q

What is the purpose of Prefix.cc?

A

Prefix.cc simplifies the RDF development process by letting developers look up common URI prefixes.

71
Q

What is the role of rdf-vocab?

A

rdf-vocab is an open-source project used by RDF developers to look up and search for linked data vocabularies.

72
Q

What does the W3C RDF Validator do?

A

The W3C RDF Validator is an online service that checks and visualizes RDF documents.

73
Q

What are some examples of data reasoners used in IoT analytical applications?

A

Examples of data reasoners include:
  • CEL Description Logic (DL)
  • Euler
  • FaCT++
  • HermiT Reasoner
  • Java Expert System Shell (JESS)
  • Jena Eyeball (a command-line semantics validator).

74
Q

How can data reasoners be classified?

A

Data reasoners can be classified based on their linkage and discovery mechanisms or their usability.

75
Q

What is complex event processing (CEP)?

A

CEP is a set of techniques used to aggregate, process, and analyze large amounts of streaming data to generate real-time insights from those events as they happen, even before the data is stored in databases.
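
As an illustrative sketch only (not any particular CEP engine), the Python below scans a stream of simple events for one complex pattern: a temperature spike followed by a pressure drop within a short time window:

```python
from collections import deque

WINDOW = 5.0  # max seconds between the two events forming the pattern

def detect(stream):
    """Yield an alert whenever a temp_spike is followed by a pressure_drop
    within WINDOW seconds: a complex event derived from two simple ones,
    produced before anything is written to a database."""
    spikes = deque()
    for ts, kind in stream:
        if kind == "temp_spike":
            spikes.append(ts)
        elif kind == "pressure_drop":
            while spikes and ts - spikes[0] > WINDOW:
                spikes.popleft()  # expire spikes outside the window
            if spikes:
                yield f"ALERT: spike at {spikes[0]}s, drop at {ts}s"

events = [(0.0, "temp_spike"), (2.5, "pressure_drop"), (9.0, "pressure_drop")]
print(list(detect(events)))  # one alert, for the pair at 0.0s and 2.5s
```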

76
Q

When does CEP generate real-time insights from data?

A

CEP generates real-time insights as the events happen, before storing the data in databases.

77
Q

How are insights generated in CEP?

A

Insights are generated by searching for dependencies and complex patterns in the incoming raw data.

78
Q

What is the purpose of CEP in businesses and organizations?

A

CEP helps businesses and organizations identify opportunities and threats, enabling systems and applications to respond in real time, as quickly as possible.

79
Q

How does CEP identify meaningful events?

A

CEP identifies meaningful events by continuously processing raw data and finding correlations among events before the data are stored in databases.

80
Q

What does CEP search for in the incoming raw data?

A

CEP searches for dependencies and complex patterns in the raw data to generate insights.

81
Q

Why is it important for CEP to process data before it is stored in databases?

A

Processing data before storing it allows for real-time insights and enables faster responses to opportunities and threats.

82
Q

What kinds of responses can CEP insights trigger in systems and applications?

A

CEP insights help systems and applications respond to opportunities and threats in real time, as quickly as possible.

83
Q

Why are “complex event processing” and “stream processing” used interchangeably?

A

They are used interchangeably because they rely on the same underlying technologies.

84
Q

What is the main focus of complex event processing (CEP)?

A

CEP is focused on searching for complex patterns and dependencies between different events to identify a particular event.

85
Q

What does stream processing primarily focus on?

A

Stream processing focuses on aggregating data in time windows and responding to a single event, often using time series data.

86
Q

How does stream processing handle data from a surveillance camera?

A

It collects and processes the images (data) captured by the camera and responds to that event by analyzing the time series data.

87
Q

What is Apache Kafka, and how is it related to stream processing?

A

Apache Kafka is an open-source distributed event streaming platform; it is widely used for stream processing, streaming analytics, data integration, and mission-critical applications.
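
A minimal producer/consumer sketch using the kafka-python client, assuming a broker at localhost:9092 and a topic name chosen for illustration:

```python
from kafka import KafkaConsumer, KafkaProducer

# Producer side: an IoT gateway publishes readings to a topic.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("sensor-readings", b'{"sensor": "s42", "temp_c": 21.5}')
producer.flush()

# Consumer side: a stream processing job reads the same topic.
consumer = KafkaConsumer(
    "sensor-readings",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating once the topic goes idle
)
for record in consumer:
    print(record.value)  # raw bytes of each published reading
```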

88
Q

What are some applications of CEP in business monitoring?

A

CEP is used to monitor business processes and resources, helping businesses identify opportunities and problems at early stages.

89
Q

How is CEP applied in sensor networks?

A

CEP is used in sensor networks to measure physical parameters, such as temperature, for predictive maintenance in industrial and manufacturing facilities.

90
Q

What role does CEP play in predictive maintenance for industrial facilities?

A

CEP analyzes patterns from IoT devices to predict when equipment may need to be shut down or repaired.

91
Q

How is RFID data used in smart retail stores with the help of CEP?

A

CEP processes RFID data to help management optimize store layout and inventory tracking.

92
Q

How is complex event processing (CEP) applied in the stock market?

A

CEP is used to derive useful data about the stock market by analyzing real-time data streams and detecting complex patterns.

93
Q

What are some well-known tools for complex event processing?

A

Some well-known tools include Hadoop/MapReduce, Amazon Kinesis Analytics, and Microsoft Azure Stream Analytics. LinkedIn uses Apache Samza, and Twitter uses Apache Storm.

94
Q

How does CEP process data to generate high-level business information?

A

CEP detects complex patterns from real-time data streams, transforming low-level data into high-level business information.

95
Q

Why is CEP considered a time-sensitive task?

A

CEP is time-sensitive because it requires ultra-low latency (typically less than a few milliseconds) to handle real-time data effectively.

96
Q

What is an example of a connected vehicle IoT use case involving CEP?

A

An example is a vehicle where ice sensors warn of a slippery road, the weather forecast shows a high chance of precipitation, and other sensors in the car indicate unusual conditions. CEP processes all these events to provide critical information to the driver, connected cars, and road units.

97
Q

Why is ultra-low latency important in CEP?

A

Ultra-low latency is crucial in CEP because it enables the system to process and respond to real-time events almost instantly, which is necessary in scenarios like connected vehicles.

98
Q

What type of data is CEP typically handling in real-time?

A

CEP handles low-level data streams that are transformed into meaningful, high-level business information.

99
Q

What role does CEP play in connected vehicle systems?

A

CEP processes data from various sensors (e.g., ice sensors, brakes, steering) to provide crucial information to drivers, other vehicles, and road infrastructure in real time.

100
Q

What is big data in the context of IoT?

A

Big data refers to the large volumes of structured and unstructured data produced by billions of connected IoT devices and sensors. This data is often too complex to be processed using conventional tools.

101
Q

What are the five characteristics of big data, also known as the 5Vs?

A
The 5Vs are:
  • Volume: Size of the data.
  • Velocity: Speed at which data is generated.
  • Variety: Types of data (structured, unstructured, semi-structured).
  • Veracity: Trustworthiness and accuracy of the data.
  • Value: Usefulness of the data in gaining insights.
102
Q

What role does Apache Hadoop play in processing big data?

A

Apache Hadoop is an open-source software utility that allows for distributed storage and processing of large datasets using the MapReduce programming model.

103
Q

What is the Map function in the MapReduce model?

A

The map function breaks down a dataset into key-value pairs and transforms it into a structured set of data. It operates on one key-value pair at a time.

104
Q

What does the Reduce function do in the MapReduce model?

A

The reduce function combines the output of the map function into smaller sets of data tuples. It groups values associated with the same key and outputs reduced key-value pairs.
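
The two stages can be sketched on a single machine in a few lines of Python; a real Hadoop job distributes the same logic across many nodes. The classic word-count example:

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    # Map: emit one (key, value) pair per word, one record at a time.
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    # Group values by key (the shuffle), then combine each group.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}

documents = ["to be or not to be", "to do is to be"]
pairs = chain.from_iterable(map_phase(d) for d in documents)
print(reduce_phase(pairs))  # {'to': 4, 'be': 3, 'or': 1, 'not': 1, ...}
```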

105
Q

Explain how the MapReduce model can be compared to a shopping scenario.

A

In a shopping scenario, the grocery list is split into smaller lists (e.g., bakery, seafood), and multiple shopping carts collect items in parallel. They all meet at the cashier, speeding up the process compared to using a single cart.

106
Q

What is the key limitation of traditional relational databases when dealing with big data?

A

Traditional relational databases are limited in handling the complexity and volume of unstructured data typically found in big data because they rely on tabular relations and SQL for querying structured data.

107
Q

What are NoSQL databases, and how do they differ from traditional relational databases?

A

NoSQL databases are non-relational, highly scalable databases designed to store and retrieve unstructured data. They don’t use tabular relations, and they support various models such as key-value, document, column-oriented, and graph models.

108
Q

What are key-value store databases, and what are their advantages and limitations?

A

Key-value store databases store data as key-value pairs, with the key being a unique identifier and the value a large data field. They offer high performance but do not support complex queries, as only keys can be queried, not the values.
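
A toy in-memory sketch of the access pattern (a plain dictionary standing in for a real key-value database):

```python
# Key-value access: fast lookup by key, but values are opaque to the store.
store = {}
store["device:42:latest"] = '{"temp_c": 21.5, "battery": 0.87}'  # put
print(store["device:42:latest"])                                 # get by key

# There is no "find all devices with battery < 0.2" query: answering that
# would mean fetching and decoding every value, which is why key-value
# stores trade query power for raw speed.
```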

109
Q

What is a column-oriented database, and how does it structure data?

A

Column-oriented databases store data in columns instead of rows. They use a key space that contains column families, each having rows and columns.

110
Q

How do document-oriented databases store and retrieve data?

A

Document-oriented databases store data as JSON documents. They allow fast querying and are flexible, making them suitable for use cases such as IoT applications in healthcare.
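
In contrast to a key-value store, a document store can query fields inside the stored documents. A toy in-memory sketch (the records and helper are hypothetical):

```python
patients = [  # each record is a JSON-style document with queryable fields
    {"_id": 1, "name": "Ada", "heart_rate": 62},
    {"_id": 2, "name": "Grace", "heart_rate": 118},
]

def find(collection, **criteria):
    """Return every document whose fields match all the given criteria."""
    return [doc for doc in collection
            if all(doc.get(k) == v for k, v in criteria.items())]

print(find(patients, heart_rate=118))  # [{'_id': 2, 'name': 'Grace', ...}]
```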

111
Q

What is the primary use case for graph-oriented databases?

A

Graph-oriented databases are used to store graph-based data, such as social network information. They focus on the relationships between data points, storing the data as originally produced without a predefined structure.

112
Q

What is the main benefit of using the MapReduce programming model in data processing?

A

The main benefit of the MapReduce model is its ability to scale data processing across multiple machines, making it efficient for handling large datasets in distributed environments.

113
Q

What does veracity refer to in the context of big data?

A

Veracity refers to the uncertainty, accuracy, and trustworthiness of the data in big data analytics.

114
Q

What is the significance of value in the 5Vs of big data?

A

Value refers to the ability of data to provide useful insights or information that can be acted upon.

115
Q

What tools are typically required to process big data?

A

Tools like Apache Hadoop are needed to process big data using distributed computing over several machines and servers.

116
Q

What is the function of the Map stage in the MapReduce model?

A

The Map stage breaks down a dataset into key-value pairs, converting unstructured data into structured data.

117
Q

What is the main challenge that IoT devices create in terms of data?

A

IoT devices produce large volumes of structured and unstructured data, often exceeding the processing capabilities of traditional databases.

118
Q

Why do traditional relational databases struggle with big data?

A

Relational databases are designed for structured data with predefined relationships and cannot handle the complexity and volume of unstructured big data.

119
Q

What are NoSQL databases, and why are they important for big data?

A

NoSQL databases are non-relational databases that are highly scalable and can store unstructured data. They are important because they can handle the complexities of big data without requiring tabular relations.

120
Q

What are the four types of NoSQL databases?

A
  • Key-value store: Stores data as key-value pairs.
  • Column-oriented: Stores data in columns rather than rows.
  • Document-oriented: Stores data as JSON documents.
  • Graph-oriented: Stores data in graph format, focusing on relationships.
121
Q

Describe key-value store databases and their limitations.

A

Key-value store databases store data as key-value pairs and offer high performance. However, they do not support querying or searching values, only the keys.

122
Q

How do column-oriented databases structure their data?

A

Column-oriented databases store sparse tabular data in columns instead of rows, organized by key spaces that contain column families, rows, and columns.

123
Q

What is a document-oriented database, and why is it suitable for IoT use cases?

A

A document-oriented database stores data as JSON documents. It provides flexibility and fast queries, making it ideal for IoT applications like healthcare that need real-time data access.

124
Q

What is a graph-oriented database used for?

A

Graph-oriented databases are used to store graph-based data, such as social networks, where relationships between data are as important as the data itself.

125
Q

Why are traditional data storage and analytical tools insufficient for handling data in the Semantic Web of Things?

A

Traditional tools cannot handle the massive, heterogeneous data produced by IoT devices, which requires special tools and techniques like semantic technologies and NoSQL databases.

126
Q

What role do semantic technologies play in the Semantic Web of Things?

A

Semantic technologies, such as linking data, machine learning, distributed semantic reasoning, and cross-domain recommender systems, help derive meaning from the large amounts of data collected from IoT devices.

127
Q

What is complex event processing (CEP)?

A

Complex event processing is a set of techniques used to aggregate, process, and analyze large amounts of streaming data to provide real-time insights as events occur.

128
Q

How does complex event processing benefit businesses?

A

CEP helps businesses uncover patterns, identify opportunities, and detect potential threats in their early stages by analyzing real-time data streams.

129
Q

Why are relational databases unsuitable for storing IoT data?

A

Relational databases struggle to handle complex, semi-structured, and unstructured data typical in IoT environments, making them unsuitable for big data storage.

130
Q

How do NoSQL databases store IoT data, and what programming model do they rely on?

A

NoSQL databases store IoT data in documents, key-value pairs, or graph models and rely on the MapReduce programming model for processing large datasets.