class 4 Flashcards

1
Q

what is big data?

A

extremely large and complex data collections that traditional data management software, hardware and analysis processes are incapable of dealing with

2
Q

what are the 3 characteristics of big data?

A

volume (amount of data)
velocity (speed at which we collect data)
variety (in what format we receive the data)

3
Q

what are the 2 formats of big data?

A

structured
unstructured

4
Q

what is an example of structured big data?

A

corporate databases containing customer, product and inventory data in tables

5
Q

what is an example of unstructured big data?

A

word-processing documents, social media, email, photos, surveillance video

6
Q

what are the 8 sources of big data?

A

documents (Word, email, PowerPoint)
data from business apps
social media (Twitter, Facebook, LinkedIn, Pinterest)
sensor data (process control devices)
media (images, audio, video, live data feeds, podcasts)
machine log data (business process logs)
public data (local, state, federal government websites)
archives (historical records)

7
Q

what are the five Vs of big data?

A

volume (how much data is generated)
velocity (how fast data is generated)
variety (the different forms of data)
value
veracity

8
Q

explain the V of big data “value”?

A

having access to good quality data

9
Q

explain the V of big data “veracity”?

A

how often there are discrepancies found in the data

10
Q

what is the importance of big data?

A

data can be fetched from any source and analyzed to solve problems, which can lead to cost reduction, time reduction, new product development and smarter decision making

the combination of big data with high-powered analytics can have great impact on business strategy

11
Q

what are 4 areas of business strategy that big data can have a great impact on?

A

finding the root cause of failures in real time operations

generating coupons at the point of sale based on customers' buying habits

recalculating an entire risk portfolio

detecting fraudulent behaviour

12
Q

what are 3 examples of people who use big data?

A

retail organizations (monitor social networks)

hospitals (analyze medical data and patient records to get their medical history)

advertising and marketing agencies (track comments on social media to understand consumers' responsiveness to ads)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

what are the 5 challenges of big data?

A

how to choose what subset of the data to store

where and how to store the data

how to find the nuggets of data that are relevant

how to derive value from the relevant data

how to identify which data needs to be protected

14
Q

what are 3 technologies used to process big data?

A

data warehouse
data marts
data lakes

15
Q

what are data warehouses?

A

a large database that collects business information from many sources in the enterprise in support of management decision making

16
Q

what is the primary purpose of data warehouses?

A

to relate information in innovative ways and help managers and executives make better decisions; the data is brought in through the ETL process

17
Q

what is the ETL process?

A

the process of: Extract, Transform, Load
Extract:
extracting the source data from all the various sources

Transform:
a series of rules or algorithms is applied to the extracted data to derive the valuable data

Load:
extracted and transformed data is loaded into the data warehouse
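
a minimal sketch of the three ETL steps, assuming two made-up CSV exports as the sources and an in-memory SQLite database standing in for the warehouse. the table name `sales_fact`, the column names and the "skip rows with no quantity" rule are all hypothetical, chosen only to show where each step sits:

```python
import csv
import io
import sqlite3

# Hypothetical source data: two "systems" exporting CSV text.
STORE_SALES_CSV = "sku,qty,unit_price\nA1,2,9.99\nB2,,4.50\n"
WEB_SALES_CSV = "sku,qty,unit_price\nA1,1,9.99\nC3,3,2.00\n"


def extract(sources):
    """Extract: read rows from every source into one list of dicts."""
    rows = []
    for channel, text in sources.items():
        for row in csv.DictReader(io.StringIO(text)):
            row["channel"] = channel  # remember which source the row came from
            rows.append(row)
    return rows


def transform(rows):
    """Transform: apply rules (drop rows with no quantity, compute revenue)."""
    cleaned = []
    for row in rows:
        if not row["qty"]:
            continue  # assumed rule: skip incomplete records
        qty = int(row["qty"])
        revenue = round(qty * float(row["unit_price"]), 2)
        cleaned.append((row["channel"], row["sku"], qty, revenue))
    return cleaned


def load(conn, records):
    """Load: write the transformed records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact "
        "(channel TEXT, sku TEXT, qty INTEGER, revenue REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", records)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
sources = {"store": STORE_SALES_CSV, "web": WEB_SALES_CSV}
load(warehouse, transform(extract(sources)))

# Units sold per product across both sources, queried from the warehouse.
total_by_sku = dict(
    warehouse.execute(
        "SELECT sku, SUM(qty) FROM sales_fact GROUP BY sku ORDER BY sku"
    ).fetchall()
)
```

the row for `B2` has no quantity, so the transform step drops it before anything reaches the warehouse.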

18
Q

what is an example of a company using a data warehouse?

A

Walmart operates separate data warehouses for Walmart and Sam's Club and allows suppliers access to almost any data they could possibly need to determine how their products are selling and when they need more

19
Q

why is it important for data warehouses to have quality data?

A

because data warehouses are used for decision making, maintaining high data quality is vital so that organizations avoid wrong conclusions

20
Q

what is the biggest issue in data warehousing?

A

data quality: because of the wide range of data inconsistencies and the sheer volume of data, it is considered one of the biggest issues in data warehousing

21
Q

what are data marts?

A

a subset of a data warehouse that is used by small and medium-sized businesses and departments within large companies to support their decision making

22
Q

what are data lakes?

A

a “store-everything” approach to big data, saving all data in its raw and unaltered form so raw data is available when users decide just how they want to use it, typically for specific analysis

23
Q

what is another term for a data lake?

A

enterprise data hub

24
Q

what is a NoSQL database?

A

provides a means to store and retrieve data that is modelled using some means other than the simple two-dimensional tabular relations used in relational databases
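
to illustrate the "other than tables" idea, here is a small sketch of a document-style data model using plain Python dicts. it is not a real NoSQL engine; the helper names `put` and `find` and the sample customers are invented for illustration. the point is only that each record can carry its own nested fields instead of fitting fixed rows and columns:

```python
import json

# A tiny in-memory "document store": keys map to JSON-like documents.
# This only illustrates the data model; it is not a real NoSQL engine.
customers = {}


def put(doc_id, document):
    """Store a document under a key; documents need not share a schema."""
    customers[doc_id] = json.loads(json.dumps(document))  # deep copy via JSON


def find(field, value):
    """Return ids of documents whose top-level field equals value."""
    return sorted(
        doc_id for doc_id, doc in customers.items() if doc.get(field) == value
    )


# Each document has a different shape: nested orders, a social handle, or neither.
put("c1", {"name": "Ana", "city": "Toronto", "orders": [{"sku": "A1", "qty": 2}]})
put("c2", {"name": "Ben", "city": "Toronto", "social": {"twitter": "@ben"}})
put("c3", {"name": "Cy", "city": "Ottawa"})

toronto_ids = find("city", "Toronto")
```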

25
Q

what is data mining?

A

the process of searching and analyzing a large batch of raw data in order to identify patterns and extract useful information

26
Q

what are the 4 steps in data mining?

A

setting an objective that data mining will be used for

data preparation: finding valuable data and cleaning it

running the data through data mining algorithms

evaluating results

27
Q

do corporations use data mining?

A

yes, it is used by corporations for everything from learning what customers are interested in or want to buy to fraud detection and spam filtering

28
Q

how does data mining work?

A

data mining programs break down patterns and connections in data based on what information users request or provide and then show the results to the user

29
Q

what is an example of a company using data mining?

A

a grocery chain used data mining to analyze local buying patterns. they discovered that when men bought diapers on Thursdays and Saturdays, they also tended to buy snacks. this led them to move the snacks closer to the diapers and make sure snacks and diapers were sold at full price on Thursdays and Saturdays
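
one way to find a "bought together" pattern like this is to count how often items share a basket. the sketch below uses made-up baskets (not the grocery chain's data) and a hypothetical helper named `confidence`; real tools would use a full association-rule algorithm on far larger data:

```python
from collections import Counter
from itertools import combinations

# Made-up transactions for illustration only.
baskets = [
    {"diapers", "snacks", "milk"},
    {"diapers", "snacks"},
    {"bread", "milk"},
    {"diapers", "snacks", "bread"},
    {"diapers", "milk"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Count each unordered pair of items bought together once per basket.
    pair_counts.update(frozenset(p) for p in combinations(sorted(basket), 2))


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    together = pair_counts[frozenset((antecedent, consequent))]
    return together / item_counts[antecedent]


# Of the baskets with diapers, what fraction also had snacks?
diapers_to_snacks = confidence("diapers", "snacks")
```

a high value suggests the two items sell together, which is the kind of signal that could justify a shelf change.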

30
Q

what is predictive analysis?

A

a form of data mining that combines historical data with assumptions about future conditions to predict outcomes of events
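
a rough sketch of "historical data plus an assumption about the future": fit a straight-line trend to past sales, then scale the forecast by an assumed promotion uplift. the sales figures and the `assumed_uplift` parameter are hypothetical, and a straight-line fit is just one simple choice of model:

```python
# Hypothetical monthly unit sales (historical data).
history = [100, 110, 120, 130, 140, 150]


def fit_line(values):
    """Ordinary least-squares fit of y = intercept + slope * t, t = 0, 1, 2..."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    return mean_y - slope * mean_t, slope


def forecast(values, steps_ahead, assumed_uplift=1.0):
    """Extend the fitted trend, then scale by an assumed future condition."""
    intercept, slope = fit_line(values)
    t = len(values) - 1 + steps_ahead
    return (intercept + slope * t) * assumed_uplift


baseline = forecast(history, steps_ahead=1)  # trend only
with_promo = forecast(history, steps_ahead=1, assumed_uplift=1.2)  # assumed +20%
```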

31
Q

what is an example of predictive analysis being used?

A

retailers use it to turn occasional customers into frequent shoppers

32
Q

what are 6 data mining applications?

A

branding and positioning of products and services
customer churn
direct marketing
fraud detection
market segmentation
trend analysis

33
Q

explain the data mining application “branding and positioning of products and services”?

A

enables the strategist to visualize product behaviour in different markets, while condensing the data into dimensions that are easily analyzed

34
Q

explain the data mining application “customer churn”?

A

predicts which current customers are likely to switch to a competitor
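
a very simplified sketch of flagging likely churners with hand-set warning signs. the field names, cut-offs and threshold are assumptions made up for illustration; in practice the rules would be learned by a model from data on customers who actually left:

```python
# Hypothetical customer records; field names and cut-offs are assumptions.
customers = [
    {"id": "c1", "days_since_last_order": 5, "complaints": 0, "orders_last_90d": 6},
    {"id": "c2", "days_since_last_order": 70, "complaints": 2, "orders_last_90d": 1},
    {"id": "c3", "days_since_last_order": 40, "complaints": 0, "orders_last_90d": 2},
]


def churn_score(customer):
    """Add one point per warning sign; higher means more likely to leave."""
    score = 0
    if customer["days_since_last_order"] > 30:
        score += 1
    if customer["complaints"] >= 1:
        score += 1
    if customer["orders_last_90d"] < 3:
        score += 1
    return score


def likely_churners(records, threshold=2):
    """Return ids of customers whose score meets the threshold."""
    return [c["id"] for c in records if churn_score(c) >= threshold]


at_risk = likely_churners(customers)
```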

35
Q

explain the data mining application “direct marketing”?

A

helps identify customer prospects most likely to respond to direct marketing practices