Data collection in IoT systems Flashcards

1
Q

Data

A

Data is information, such as facts and figures, used to analyze something or make decisions. When data is collected, for example from a wireless sensor, it is processed several times before it reaches a backend system. This is done to customize how the data is presented so that different applications can use it, or to perform calculations that help companies get value from the data and link it to specific business needs.

2
Q

Structured data

A

Data organized in tables (rows and columns), making it easy to analyse. Examples are database tables with rows and columns, where each column has a specific property or attribute, and each row is a data record.

3
Q

Unstructured data

A

is a type of data that is difficult to organize in tables, such as text documents, images or audio files. This data is often referred to as qualitative data and cannot be easily processed or analyzed with standard data tools. Instead, unstructured data is used in fields such as natural language processing (NLP) and text mining, which analyze and draw insights from text and other forms of data.

4
Q

Semi-structured data

A

is a type of data that lies between structured and unstructured data. Unlike structured data, which is organized in rows and columns (as in a database), semi-structured data lacks a fixed form. However, it does have elements such as ‘tags’ or labels that indicate what each part of the data means and help to organize the information. A common example of semi-structured data is XML (Extensible Markup Language), which uses tags to describe data. E.g., the tags <book>, <title>, <author>, and <year> help us understand what the information is about.
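
The role of the tags can be illustrated with a minimal sketch using Python's standard `xml.etree.ElementTree` module. The book record and its values are made up for illustration; only the tag names come from the card.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the values are invented for illustration.
xml_text = """
<book>
  <title>Example Title</title>
  <author>Jane Doe</author>
  <year>2020</year>
</book>
""".strip()

book = ET.fromstring(xml_text)

# The tags say what each value means, even though there is no fixed table schema.
record = {child.tag: child.text for child in book}
print(record)
```

Note that every value arrives as text; turning `year` into a number is left to the application, which is part of what makes the data only semi-structured.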

5
Q

Metadata

A

data about data. It provides information that describes or summarises other data, helping to understand, organize, or manage it more effectively.

6
Q

Types of data collections in IoT

A

using sensors to collect data and track the status of the smart devices connected to the IoT.

Equipment data: information on the status of IoT devices, often in real-time, for maintenance and optimization. Example: Predictive maintenance.

Environmental data: information on the physical environment, e.g. humidity, temperature, movement, air quality.

Submeter data: data collected from different users of shared resources, e.g. water and electricity in multi-tenant buildings, and sent to the cloud.

Location data: information on the location and movement of people, objects or vehicles, collected and sent to the cloud.

7
Q

Which layers interplay to make the IoT data collection process work (IoT data collection architecture)?

A

Device layer, communication layer, IT edge layer, event processing layer, client communication layer.

8
Q

Device layer

A

layer with sensors to collect data.

9
Q

Communication layer

A

this layer manages how data is transferred between IoT devices and other systems. It defines the protocols that devices need in order to send data, e.g. HTTP/HTTPS, MQTT, CoAP. The communication layer is used in many embedded systems, for example in automation, security systems and household appliances (smart home devices).
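
As a rough sketch of the HTTPS option, the snippet below packages a sensor reading as a JSON POST request using only the standard library. The endpoint URL, the `device_id` field and the `build_request` helper are assumptions made for illustration, and nothing is actually sent.

```python
import json
import urllib.request

# Hypothetical endpoint and reading; neither comes from the cards.
ENDPOINT = "https://example.com/api/readings"
reading = {"device_id": "sensor-01", "temperature_c": 21.5}

def build_request(url, payload):
    """Package a sensor reading as an HTTPS POST request with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_request(ENDPOINT, reading)
print(request.get_method(), request.full_url)
# Sending is left out on purpose: urllib.request.urlopen(request) would transmit it.
```

Constrained devices often prefer MQTT or CoAP because they carry less overhead than HTTP, but those need third-party libraries and are not shown here.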

10
Q

IT Edge layer

A

the layer where data is stored close to its collection point. It includes the hardware, firmware and operating systems of IoT devices. It plays a crucial role in IoT data processing by performing preliminary processing and analysis of data collected from connected devices.
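
One possible form of preliminary processing is to condense a window of raw readings into a single summary before forwarding it. The sketch below assumes temperature readings and a hypothetical `summarize_window` helper; real edge software may do something quite different.

```python
from statistics import mean

def summarize_window(readings):
    """Reduce a window of raw readings to one compact summary record."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": round(mean(readings), 2),
    }

# Hypothetical window of temperature readings in degrees Celsius.
window = [21.0, 21.5, 22.0, 21.5]
summary = summarize_window(window)
print(summary)
```

Sending one summary instead of every raw reading reduces the amount of data that has to travel to the back end.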

11
Q

Event processing layer

A

layer where data is cleansed, metadata is added and insights are generated.
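
The three steps (cleansing, adding metadata, generating insights) can be sketched as follows. The field names, the `gateway-01` source label and the 30 °C threshold are all assumptions chosen for illustration.

```python
from datetime import datetime, timezone

def process_event(raw, source="gateway-01"):
    """Cleanse one raw event and attach metadata; return None if unusable."""
    value = raw.get("temperature_c")
    # Cleansing: drop events with a missing or non-numeric value.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return {
        "temperature_c": float(value),
        # Metadata: data about the reading rather than the reading itself.
        "meta": {
            "source": source,
            "processed_at": datetime.now(timezone.utc).isoformat(),
            "unit": "celsius",
        },
        # Insight: a simple derived flag based on a hypothetical threshold.
        "overheated": value > 30.0,
    }

print(process_event({"temperature_c": 35}))
print(process_event({"temperature_c": None}))
```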

12
Q

Client communication layer

A

a bridge between back-end databases and front-end interfaces, typically an API. It is the layer that communicates the results of data analysis to the end user via interfaces such as a mobile application, which is the interface that the user interacts with. The layer translates raw data into data the end user can easily understand.

13
Q

What principles do IoT data collection systems use to work properly?

A

Scalability: robust IoT data collection systems must be scalable enough to gather and store large volumes of data.

Security: IoT-based data collection systems must provide top-notch security to prevent data breaches or unauthorized access.

Interoperability: IoT systems must be able to work together and exchange data, regardless of manufacturers or technological platforms.

Flexibility: IoT data collection systems must accept different data formats and adapt to changing requirements.

14
Q

Why is the right data needed when collecting and how can proper data collection from IoT devices benefit businesses?

A
  • Improved operational efficiency: IoT data collection automates the collection of sensor information, increasing productivity and eliminating the need for manual data collection.
  • Accurate real-time insights: IoT data collection allows businesses to monitor and solve problems in real time, enabling faster response and control.
  • Better decision-making: Collected IoT data provides insights into customer behavior, market trends, and business performance, facilitating strategic planning, predictive maintenance, and decisions.
  • Saved costs: IoT data can identify inefficiencies in processes, allowing companies to optimize their operations, reduce costs and increase profitability.
15
Q

Challenges involved in IoT data collection

A
  • Security and privacy: strict data security is required.
  • Compatibility: different types of data need to be integrated and compatible to work effectively.
  • Large data sets: a huge amount of data is produced, and not all of it is useful.
  • Consistency: it is costly to ensure the right communication between devices and systems.
16
Q

How should we handle the raw data that is collected?

A

Raw data from IoT devices cannot be directly used to build models. The data must be cleaned and processed before it can be used.

17
Q

Why do we need to pre-process collected data?

A
  • Noisy data: it could include errors or outliers.
  • Incomplete data: data could have missing values.
  • Inconsistent data: data may contain internal inconsistencies or be incompatible with other systems. Poor-quality data leads to poor decisions.
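
A minimal sketch of handling the first two problems: readings that are missing (`None`) or outside a plausible range are replaced with the median of the valid readings. The range limits and the median-fill strategy are assumptions; other imputation methods may suit a given sensor better.

```python
from statistics import median

def preprocess(readings, low=-40.0, high=85.0):
    """Replace missing or out-of-range readings with the median of valid ones."""
    def is_valid(r):
        return r is not None and low <= r <= high

    valid = [r for r in readings if is_valid(r)]
    if not valid:
        raise ValueError("no valid readings to impute from")
    fill = median(valid)
    return [r if is_valid(r) else fill for r in readings]

# Hypothetical raw series with one missing value and one outlier.
raw = [21.0, None, 22.0, 500.0, 23.0]
clean = preprocess(raw)
print(clean)
```
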
18
Q

Why could data be missing in IoT applications?

A

Sensor problem, connection error, outside attacker.

19
Q

What is data augmentation?

A

it is a technique to increase the diversity of your training data. It is a way to deal with a lack of data in IoT data collection when building machine learning models.
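
One simple augmentation technique for sensor series is jittering, i.e. adding small random noise to create extra training examples. The sketch below is only an illustration; the `jitter` helper, the noise level `sigma` and the number of copies are assumptions, not something the cards prescribe.

```python
import random

def jitter(series, sigma=0.1, copies=3, seed=0):
    """Create noisy copies of a sensor series (jittering)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return [[x + rng.gauss(0.0, sigma) for x in series] for _ in range(copies)]

# Hypothetical temperature series.
original = [20.0, 20.5, 21.0, 21.5]
augmented = jitter(original)
print(len(augmented), "new series of length", len(augmented[0]))
```

The noise level should stay small relative to the sensor's normal variation, otherwise the copies stop resembling realistic readings.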

20
Q

Why is data reduction important?

A
  • Reduces overfitting: less redundant data means less opportunity to make decisions based on noise.
  • Improves accuracy: less misleading data means modelling accuracy improves.
  • Reduces training time: fewer data points reduce algorithm complexity and the algorithm trains faster.
21
Q

Data reduction techniques

A

Feature selection → supervised feature selection:
  • Intrinsic: algorithms that perform automatic feature selection during training, e.g. decision trees.
  • Filter method: select subsets of features based on their relationship with the target, e.g. statistical methods, feature importance methods.
  • Wrapper: search for well-performing subsets of features, e.g. recursive feature elimination, heuristic algorithms.
Dimensionality reduction → techniques to reduce the number of variables in the data without losing important information.
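
The filter method can be sketched with a plain statistical score: keep only the features whose Pearson correlation with the target is strong enough. The feature names, the data and the 0.5 threshold below are hypothetical, and correlation is just one of several possible filter criteria.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / (sx * sy)

def select_features(features, target, threshold=0.5):
    """Keep feature names whose |correlation| with the target meets the threshold."""
    return [name for name, col in features.items()
            if abs(pearson(col, target)) >= threshold]

# Hypothetical feature columns and target.
features = {
    "temperature": [20.0, 22.0, 24.0, 26.0, 28.0],
    "device_id_digit": [3.0, 1.0, 3.0, 1.0, 3.0],
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],
}
target = [1.0, 2.0, 3.0, 4.0, 5.0]
selected = select_features(features, target)
print(selected)
```

Because each feature is scored on its own, this approach is fast but can miss features that are only useful in combination, which is what wrapper methods try to address.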

22
Q

Why is data collection important?

A
  • Better user experience: Automation helps you understand the needs and habits of your end user.
  • Asset maintenance: live data enables developers and managers to monitor the state of equipment.
  • Efficient use of resources: automated data collection supports informed decision-making.
23
Q

How is data collected in IoT devices?

A

Data is collected via sensing modules and transferred through the embedded software and electronics, using communication modules (e.g. Wi-Fi and Bluetooth), to the central processing unit for analysis.

24
Q

Heterogeneous data

A

Data is produced by a large number of different sensors and is itself highly heterogeneous, with different sampling rates, quality of collected values, etc.
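
One common way to cope with different sampling rates is to resample every stream onto a shared time grid before combining them. The sketch below averages samples into fixed-size time buckets; the `resample` helper, the two streams and the 2-second bucket are assumptions for illustration.

```python
from collections import defaultdict

def resample(samples, bucket_s):
    """Average (timestamp_s, value) samples into buckets of bucket_s seconds."""
    buckets = defaultdict(list)
    for t, v in samples:
        buckets[int(t // bucket_s) * bucket_s].append(v)
    return {start: sum(vs) / len(vs) for start, vs in sorted(buckets.items())}

# Hypothetical streams: temperature every second, humidity every two seconds.
temperature = [(0, 20.0), (1, 20.2), (2, 20.4), (3, 20.6)]
humidity = [(0, 40.0), (2, 42.0)]

temp_2s = resample(temperature, 2)
hum_2s = resample(humidity, 2)

# Keep only the buckets for which both sensors reported a value.
aligned = {t: (temp_2s[t], hum_2s[t]) for t in temp_2s if t in hum_2s}
print(aligned)
```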