Exam 2 Flashcards

1
Q

A report is a premade analytical view of sometimes complex data.
A. True
B. False

A

A. True

2
Q

A report that contains all the important facts, but not more than are necessary.
A. Complete
B. Accurate
C. Timely
D. Accessible

A

A. Complete

3
Q

Put these report development steps in the correct order.
A. Finalizing the report structure
B. Building the layout for readability
C. Binding the analytics to the data
D. Identify data sources

A

D. Identify data sources
B. Building the layout for readability
C. Binding the analytics to the data
A. Finalizing the report structure

4
Q

What is the most important step in authoring a well-designed report?
A. Identifying data sources
B. Identifying the needs of the report user
C. Deployment
D. Building the layout for readability

A

B. Identifying the needs of the report user

5
Q

What does “binding” data mean?
A. Creating a preliminary crosstab report
B. Drawing user attention to specific values
C. Connecting a report component (e.g. a chart) to its data sources
D. Combining two or more star schemas into one virtual schema

A

C. Connecting a report component (e.g. a chart) to its data sources

6
Q

In which report authoring step is “types of data the report is intended to convey” considered?
A. Identifying the needs of the report user
B. Identifying data sources
C. Building the layout for readability
D. Binding analytical components to the data sources

A

C. Building the layout for readability

7
Q

Ad Hoc reports are created to meet a specific one-time need.
A. True
B. False

A

A. True

8
Q

What is the difference between a report and a dashboard?
A. Reports allow users to interact directly with the data unlike a dashboard.
B. To change a report, users must rerun it with new parameters such as date.
C. Dashboards are static and only show data at a point in time.
D. Dashboards do not offer animation or interactivity like a report can.

A

B. To change a report, users must rerun it with new parameters such as date.

9
Q

What is a data dashboard?
A. Report prepared so that information requirements can be defined
B. Report written around the data outcomes
C. Collection of graphs and tables displayed together on a screen
D. Data in a tabular format

A

C. Collection of graphs and tables displayed together on a screen

10
Q

In most circumstances, the terms dashboard and cockpit can be used interchangeably.
A. True
B. False

A

A. True

11
Q

To make dashboards more responsive to user choices, designers can add interactive features shown below EXCEPT:
A. Radio buttons
B. Dials
C. Text boxes
D. Sliders

A

C. Text boxes

12
Q

What does a balanced scorecard do?
A. Measures strategic progress of a firm
B. Balances Human Resources and Payroll
C. Helps to communicate Human Resources policies
D. Balances revenue and expenses by goal

A

A. Measures strategic progress of a firm

13
Q

Both a balanced scorecard and a dashboard can display:
A. KPIs
B. Financial data
C. Non-financial data

A

A. KPIs
B. Financial data
C. Non-financial data

14
Q

Display-only dashboards allow users to choose inputs that modify the visualization to meet their specific needs.
A. True
B. False

A

B. False

15
Q

What is the process of adding interactive functionality using a programming language?
A. Server-side scripting
B. Responsive programming
C. Interactive programming
D. Client-side scripting

A

D. Client-side scripting

16
Q

On a dashboard, accordions are best used
A. To select nested values
B. In a scorecard analysis
C. Add interactive wave files to a dashboard
D. To graphically represent data

A

A. To select nested values

17
Q

The traditional V’s of Big Data are
A. Volume, Value, Vision
B. Velocity, Variety, Very big
C. Volume, Variety, Velocity
D. Volume, Variety, Virtual

A

C. Volume, Variety, Velocity

18
Q

In terms of big data, what is meant by data volatility?
A. Speed at which data are generated and collected
B. Changes in the meaning of data over time or in context
C. The lifespan of the data/how long it should be stored
D. Reliability or truthfulness of data

A

C. The lifespan of the data/how long it should be stored

19
Q

Which of the following is NOT true of data mining?
A. It is the process of turning large amounts of data into useful information
B. It is a tool used to extract patterns and correlations from data
C. It is no longer supported with current software
D. It requires carefully analyzing data from various dimensions

A

C. It is no longer supported with current software

20
Q

In big data, unstructured and structured data is related to:
A. Variety
B. Value
C. Veracity
D. Volume

A

A. Variety

21
Q

One of the drivers of big data is the world becoming increasingly digital.
A. True
B. False

A

A. True

22
Q

Data velocity is
A. The many types of data collected
B. Massive amounts of data collected
C. Pace at which data is collected
D. The quality or trustworthiness of data

A

C. Pace at which data is collected

23
Q

How do firms gather data through sentiment mining?
A. Evaluating customer comments from social media (Facebook and Twitter)
B. Examine purchases through video cameras
C. Uncover unknown patterns of databases and variables
D. Obtain data from UPC scanner codes

A

A. Evaluating customer comments from social media (Facebook and Twitter)

24
Q

Which of the following statements about data mining (DM) is FALSE?
A. If the benefits outweigh the cost, DM activities are justified
B. Analysts believe that big data and DM are driving advances in stat methods
C. DM involves sifting through large volumes of data to obtain insights
D. DM is unique to the business world

A

D. DM is unique to the business world

25
Q

The role of data analytics becomes more important the _____ predictable a system is
A. More
B. Less

A

B. Less

26
Q

Which system classification is the most predictable?
A. Nondeterministic
B. Chaotic
C. Deterministic
D. Stochastic

A

C. Deterministic

27
Q

Which system classification would benefit the most from data mining?
A. Nondeterministic
B. Chaotic
C. Deterministic

A

A. Nondeterministic

28
Q

Place the data mining process steps in the correct order from top to bottom
A. Deployment
B. Validation
C. Data Staging
D. Data Mining Model

A

C. Data Staging
D. Data Mining Model
B. Validation
A. Deployment

29
Q

Which type of data mining model is unsupervised?
A. Prescriptive
B. Predictive
C. Anticipatory
D. Descriptive

A

D. Descriptive

30
Q

Which type of data mining model is classification?
A. Prescriptive
B. Predictive
C. Anticipatory
D. Descriptive

A

B. Predictive

31
Q

What is the purpose of the validation data partition?
A. Train the model based on existing data
B. Check how the model performs on predicting the outcomes
C. Provides an unbiased estimate of how the model will perform with new data

A

B. Check how the model performs on predicting the outcomes
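
Note: the partitions named in this card can be sketched in a few lines. This is a minimal illustration, assuming a 60/20/20 split and a fixed shuffle seed; the function name and the ratios are hypothetical choices, not taken from the course material.

```python
import random

def partition(records, train=0.6, validation=0.2, seed=0):
    """Shuffle records, then split them into train / validation / test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],                   # used to fit the model
        shuffled[n_train:n_train + n_valid],  # used to check predictions while tuning
        shuffled[n_train + n_valid:],         # held back for a final estimate on new data
    )

train_set, valid_set, test_set = partition(range(100))
print(len(train_set), len(valid_set), len(test_set))
```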

32
Q

What benefit do in-memory databases provide over traditional databases?
A. Discover patterns in data for future forecasts
B. Allow for unsupervised data mining
C. All relevant data are kept in memory (RAM) all the time
D. Ability to process data mining on chaotic systems

A

C. All relevant data are kept in memory (RAM) all the time

33
Q

What method involves processing large volumes of data almost instantaneously to provide feedback ASAP?
A. Real-time analytics
B. Monitoring data models
C. Anticipatory data models
D. Deterministic systems

A

A. Real-time analytics

34
Q

_______ is used to form groups or clusters of similar records based on measurements made on these records.
A. Time series analysis
B. Cluster analysis
C. Discriminant analysis
D. Association analysis

A

B. Cluster analysis

35
Q

What type of analytic is clustering?
A. Descriptive
B. Diagnostic
C. Predictive
D. Prescriptive

A

A. Descriptive

36
Q

K-means clustering is an unsupervised analytic.
A. True
B. False

A

A. True

37
Q

A member of a cluster is
A. Most similar to members within other clusters
B. Dissimilar to other members within the same cluster
C. Most similar to other members within the same cluster

A

C. Most similar to other members within the same cluster

38
Q

K-means cluster analysis is not sensitive to outliers.
A. True
B. False

A

B. False

39
Q

K-means clustering
A. Can be used with qualitative data
B. Converges to a global optimum
C. Can be used with quantitative data
D. Does not require a number of clusters to be defined beforehand

A

C. Can be used with quantitative data

40
Q

In K-means clustering K is
A. the number of clusters
B. the number of decimals
C. the number of times the algorithm iterates
D. name of the inventor

A

A. the number of clusters

41
Q

What is the rule of thumb for calculating the number of clusters?
A. Sqrt(N+2)
B. Sqrt(N)
C. Sqrt(N-2)
D. Sqrt(N/2)

A

D. Sqrt(N/2)
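
Note: the rule of thumb is simple arithmetic. A small sketch, assuming N is the number of records and that the result is rounded to a whole number of clusters (the rounding and the function name are illustrative assumptions):

```python
import math

def rule_of_thumb_k(n_records):
    """Starting guess for K in k-means: the square root of N / 2."""
    return max(1, round(math.sqrt(n_records / 2)))

print(rule_of_thumb_k(200))  # sqrt(200 / 2) = sqrt(100) = 10
```

The result is only a starting point; K is still chosen by the analyst before the algorithm runs.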

42
Q

In clustering, intra-cluster distances …& inter-cluster distances…
A. Are maximized — Are minimized
B. Are minimized — are maximized
C. Are minimized — are maximized
D. Are minimized — are maximized

A

B. Are minimized — are maximized
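
Note: a bare-bones k-means run shows how points end up close to their own centroid (small intra-cluster distance) and far from the other one. This is a sketch only, assuming 2-D points, Euclidean distance, and hand-picked starting centroids; the sample data and function names are made up for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

def within_cluster_ss(centroids, clusters):
    """Sum of squared distances from each point to its own centroid (intra-cluster spread)."""
    return sum(math.dist(p, c) ** 2 for c, cluster in zip(centroids, clusters) for p in cluster)

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
print(centroids)
print(within_cluster_ss(centroids, clusters))
print(math.dist(centroids[0], centroids[1]))  # inter-cluster distance
```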

43
Q

What is the main goal of association rule mining?
A. To find hidden patterns in data
B. To predict future outcomes
C. To classify data into categories
D. To cluster data into groups

A

A. To find hidden patterns in data

44
Q

Association Rule Mining is
A. Supervised
B. Hybrid
C. Unsupervised
D. None

A

C. Unsupervised

45
Q

What is association rule mining?
A. Same as frequent itemset mining
B. Finding of strong association rules using frequent itemsets
C. Using association to analyze correlation rules
D. Grouping together data points that are similar to one another

A

B. Finding of strong association rules using frequent itemsets

46
Q

Which metrics are used in association rule mining to filter itemsets?
A. Support
B. Confidence
C. Lift
D. All of the above

A

D. All of the above

47
Q

What is the “confidence” in association rule mining?
A. The number of times a rule appears in the data
B. The strength of the association between two items
C. The probability that a rule will appear in future data
D. The probability of an event given that another event has occurred

A

D. The probability of an event given that another event has occurred

48
Q

What is the “support” in association rule mining?
A. How common the rule is in the dataset
B. The strength of the association between two items
C. The probability that a rule will appear in future data
D. The confidence of a rule

A

A. How common the rule is in the dataset

49
Q

What is the purpose of computing “lift” in association rule mining?
A. To measure the strength of association between two items
B. To determine the rarity of an itemset
C. To estimate the probability of an itemset
D. To find the most frequent itemset

A

A. To measure the strength of association between two items

50
Q

How is “lift” interpreted in association rule mining?
A. A lift value of 1 indicates no association between the items
B. A lift value greater than 1 indicates a positive association between items
C. A lift value less than 1 indicates a negative association between the items
D. A lift value of 0 indicates a perfect association between the items

A

A. A lift value of 1 indicates no association between the items
B. A lift value greater than 1 indicates a positive association between items
C. A lift value less than 1 indicates a negative association between the items
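
Note: support, confidence, and lift can be computed directly from a list of transactions. The sketch below uses a made-up five-basket dataset and hypothetical function names, and follows the usual definitions: support is the share of transactions containing the itemset, confidence is support(A and B) / support(A), and lift is confidence / support(B).

```python
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    """Values above 1 suggest a positive association, below 1 a negative one."""
    return confidence(antecedent, consequent) / support(consequent)

print(support({"bread", "milk"}))       # 3 of 5 baskets
print(confidence({"bread"}, {"milk"}))  # about 0.75
print(lift({"bread"}, {"milk"}))        # about 0.94, slightly below 1
```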

51
Q

Which is an example of using association rule mining?
A. Bundling of insurance products bought by customers
B. Identify a classification of customers
C. Indicate rules for assigning new cases to classes for identification
D. Credit approval (i.e., good or bad credit risk)

A

A. Bundling of insurance products bought by customers

52
Q

An Apriori algorithm will
A. Generate itemset rules candidates and prune frequent rules
B. Generate lists of itemset candidates
C. Prune infrequent itemset rules
D. Generate itemset rules candidates and prune infrequent rules

A

D. Generate itemset rules candidates and prune infrequent rules
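
Note: the generate-and-prune idea can be sketched as follows. This is a simplified illustration of the frequent-itemset stage only (it does not derive rules), assuming transactions are Python sets and a minimum support threshold chosen by the analyst; the data and names are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(c <= t for t in transactions) for c in current}
        # prune: keep only candidates that are frequent enough
        survivors = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(survivors)
        k += 1
        # generate: join surviving itemsets into candidates with k items
        candidates = {a | b for a, b in combinations(survivors, 2) if len(a | b) == k}
        # prune again: every (k-1)-item subset must itself be frequent
        current = [
            c for c in candidates
            if all(frozenset(s) in survivors for s in combinations(c, k - 1))
        ]
    return frequent

baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
result = apriori(baskets, min_support=0.5)
print(sorted(sorted(itemset) for itemset in result))
```

With a threshold of 0.5, butter (2 of 5 baskets) is pruned in the first pass, so no candidate containing butter is ever generated.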

53
Q

What is time-series analysis?
A. Analysis that discusses the similarities between previous years
B. Analysis of the relationship between variables at one point in time
C. Analysis that discusses the differences between previous years
D. Analysis of the relationship between variables over a period of time

A

D. Analysis of the relationship between variables over a period of time

54
Q

What is true about a forecast or a prediction?
A. A forecast is used to estimate or determine future behavior
B. A forecast is an estimate of the value of a variable in the future
C. A prediction will help us estimate revenue next year
D. Only a prediction can be used to consider multiple independent variables

A

B. A forecast is an estimate of the value of a variable in the future

55
Q

What can time-series analysis be used for?
A. Telling the time
B. Forecasting sales next month
C. Predicting past sales
D. Finding out how much profit a business is making

A

B. Forecasting sales next month

56
Q

A time-series is made up of a trend, seasonality, cyclic component and randomness.
A. True
B. False

A

A. True

57
Q

What type of data does time-series analysis use?
A. Historical
B. Raw
C. Old
D. New

A

A. Historical

58
Q

What does the x-axis represent in time-series analysis?
A. Any dimension
B. Time dimension
C. Any measure

A

B. Time dimension

59
Q

What is cyclical variation?
A. A yearly pattern in sales
B. General increase or decrease in sales figures over time
C. Variations in sales over a number of years
D. Random anomalies in the sales data

A

C. Variations in sales over a number of years

60
Q

What are seasonal variations in data?
A. Pattern or regular periodic fluctuations in data in a year, months or weeks
B. Irregular variations in data values over times
C. Pattern of data variation which occur over a number of years
D. General increase or decrease in data values over time

A

A. Pattern or regular periodic fluctuations in data in a year, months or weeks

61
Q

In time-series analysis, what is a gradual upward or downward movement of the data over time?
A. Trend
B. Cycles
C. Random Variations
D. Seasonality

A

A. Trend

62
Q

In time-series analysis, what are “blips” in the data caused by chance and unusual situations?
A. Trend
B. Cycles
C. Random Variations
D. Seasonality

A

C. Random Variations

63
Q

How many variables are involved in univariate data analysis?
A. 0
B. 1
C. 2
D. 3

A

B. 1

64
Q

“Holt-Winters” is a forecasting method that takes into account…
A. Only trends
B. Trend & seasonality
C. Only seasonality
D. Trend, seasonality, & randomness

A

D. Trend, seasonality, & randomness

65
Q

In forecasting Alpha represents the smoothing constant for trend.
A. True
B. False

A

B. False

66
Q

Exponential smoothing averages give more weight to recent data
A. True
B. False

A

A. True

67
Q

When alpha is close to 0, ___ data points have more weight in determining the forecast.
A. Older data points
B. Newer data points
C. All data points
D. No data points

A

A. Older data points
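
Note: the effect of alpha is easy to see with single exponential smoothing. A minimal sketch, assuming the common form smoothed = alpha * newest value + (1 - alpha) * previous smoothed value, with the first observation as the starting value; the series is made up.

```python
def exponential_smoothing(series, alpha):
    """Single exponential smoothing; returns the smoothed series."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

series = [10, 10, 10, 20]  # a jump in the newest observation
low_alpha = exponential_smoothing(series, alpha=0.1)
high_alpha = exponential_smoothing(series, alpha=0.9)
print(low_alpha[-1])   # stays near 10: older points dominate
print(high_alpha[-1])  # moves close to 20: the newest point dominates
```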

68
Q

What does “Beta” represent in the Holt-Winters method of exponential smoothing?
A. Seasonal smoothing
B. Basic smoothing
C. Trend smoothing
D. Trend and seasonal smoothing

A

C. Trend smoothing

69
Q

When beta is close to 1, _____ data points have more weight in determining the forecast.
A. Older data points
B. Newer data points
C. All data points
D. No data points

A

B. Newer data points

70
Q

Which smoothing technique is appropriate when data has both trend and seasonality?
A. Double exponential smoothing
B. Single exponential smoothing
C. Simple moving average
D. Triple exponential smoothing

A

D. Triple exponential smoothing

71
Q

If the given time series has a trend and no seasonality, what type of smoothing is needed?
A. Single exponential smoothing
B. Holt-Winters no trend smoothing
C. Double exponential smoothing
D. Holt-Winters additive

A

C. Double exponential smoothing
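
Note: double exponential smoothing keeps two running estimates, a level (smoothed with alpha) and a trend (smoothed with beta). The sketch below assumes Holt's linear form with a simple initialization from the first two observations; the initialization and the sample series are illustrative assumptions.

```python
def holt_forecast(series, alpha, beta, horizon=1):
    """Double exponential smoothing: forecast `horizon` steps past the end of the series."""
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)           # alpha smooths the level
        trend = beta * (level - previous_level) + (1 - beta) * trend    # beta smooths the trend
    return level + horizon * trend

sales = [2, 4, 6, 8, 10]  # steady upward trend, no seasonality
print(holt_forecast(sales, alpha=0.5, beta=0.5))             # continues the trend to 12
print(holt_forecast(sales, alpha=0.5, beta=0.5, horizon=3))  # three steps ahead: 16
```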

72
Q

The sum of squared errors (SSE) represents…
A. Variation in the independent variable explained by the regression
B. Variation in the dependent variable not explained by the regression

A

B. Variation in the dependent variable not explained by the regression

73
Q

What is the difference between simple linear regression and multiple linear regression?
A. In multiple regression, the lines can be curvilinear
B. In simple regression, there are multiple dependent variables
C. In multiple regression, there are multiple independent variables
D. In simple regression, the lines can be curvilinear

A

C. In multiple regression, there are multiple independent variables

74
Q

For every one unit increase in the _____, there is a [slope] unit increase/decrease in the _____.
A. Dependent variable; predicted variable
B. Predicted variable; dependent variable
C. Independent variable; dependent variable
D. Predicted variable; independent variable

A

C. Independent variable; dependent variable
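
Note: the slope interpretation can be checked with an ordinary least-squares fit. A minimal sketch, assuming one independent variable x and one dependent variable y; the data and the function name are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4]  # independent variable
ys = [3, 5, 7, 9]  # dependent variable
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # each one-unit increase in x adds `slope` units to y
```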

75
Q

Supervised Learning trains a model on known input and output data to predict future outputs
A. True
B. False

A

A. True

76
Q

Which of the following represents the coefficient of linear determination?
A. r
B. r/2
C. r^2
D. r^3

A

C. r^2
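
Note: r^2 is the square of the correlation coefficient r. The sketch below computes Pearson's r by hand and squares it, assuming a simple linear regression with one independent variable; the sample data are hypothetical.

```python
import math

def r_squared(xs, ys):
    """Square of Pearson's correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)
    return r ** 2

print(r_squared([1, 2, 3, 4], [3, 5, 7, 9]))  # points on a straight line
print(r_squared([1, 2, 3, 4], [2, 1, 4, 3]))  # a weaker linear relationship
```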

77
Q

The root-mean-square error is a measure of
A. Forecast accuracy
B. Damping factor
C. Central tendency
D. Normality

A

A. Forecast accuracy
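
Note: RMSE is the square root of the average squared forecast error, so smaller values mean more accurate forecasts. A short sketch with made-up actual and forecast values:

```python
import math

def rmse(actual, forecast):
    """Root-mean-square error between actual values and forecasts."""
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

actual = [10, 12, 14]
print(rmse(actual, [10, 12, 14]))  # perfect forecast
print(rmse(actual, [11, 12, 12]))  # errors of 1, 0 and 2
```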

78
Q

What is the difference between supervised and unsupervised learning?
A. Supervised learning has target/labels, unsupervised does not.
B. Unsupervised learning has target/labels, supervised does not
C. Supervised learning for classification, unsupervised for regression
D. Supervised learning requires less training data than unsupervised

A

A. Supervised learning has target/labels, unsupervised does not.

79
Q

Which is true about variables in logistic regression?
A. There can only be two independent variables
B. There are two dependent variables
C. The dependent variable assumes one of two discrete values
D. The dependent variable assumes one of many discrete values

A

C. The dependent variable assumes one of two discrete values

80
Q

What is a false negative in a confusion matrix?
A. Predicted - yes, Actual - no
B. Predicted - no, Actual - yes
C. Predicted - yes, Actual - yes
D. Predicted - no, Actual - no

A

B. Predicted - no, Actual - yes
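
Note: the four cells of a confusion matrix can be tallied by comparing actual and predicted labels. A minimal sketch, assuming 1 means "yes" and 0 means "no"; the labels and the function name are illustrative.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = yes, 0 = no)."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1  # predicted yes, actual yes
        elif p and not a:
            counts["FP"] += 1  # predicted yes, actual no
        elif a:
            counts["FN"] += 1  # predicted no, actual yes
        else:
            counts["TN"] += 1  # predicted no, actual no
    return counts

actual = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
print(confusion_counts(actual, predicted))
```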

81
Q

Dummy variables are…
A. Variables not used in the analysis
B. Nominal variables converted to 0 or 1
C. Variables that can assume flexible values

A

B. Nominal variables converted to 0 or 1
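
Note: creating dummy variables means turning each category of a nominal variable into its own 0/1 column. A small sketch, assuming one column per category and an "is_" naming scheme that is purely illustrative:

```python
def dummy_encode(values):
    """Turn a list of category labels into one 0/1 indicator per category."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

for row in dummy_encode(["red", "blue", "red"]):
    print(row)
```

In regression practice one category is often dropped as the baseline; this sketch keeps all of them for clarity.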

82
Q

Naive Bayes classifiers…
A. Assume that features (inputs) are independent
B. Assume that output classes are independent
C. Assume that features (inputs) are real-valued
D. Assume that output classes are equally balanced

A

A. Assume that features (inputs) are independent

83
Q

Binning is a transformation method for converting categorical data to numeric data
A. True
B. False

A

B. False

84
Q

Naive Bayes Classification is often used to output categorical variables
A. True
B. False

A

A. True

85
Q

What is the ‘k’ in k-nearest neighbor algorithm?
A. Maximum distance between 2 points
B. Number of neighbors considered
C. Number of neighbors not considered
D. The type of neighbors (there’s also n and p type)

A

B. Number of neighbors considered
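
Note: a tiny k-nearest-neighbor classifier shows what k controls. This is a sketch, assuming Euclidean distance and a simple majority vote; the training points and labels are made up.

```python
import math
from collections import Counter

def knn_predict(training, query, k):
    """Label the query by majority vote among its k nearest training points."""
    neighbors = sorted(training, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

training = [
    ((1, 1), "low"), ((1, 2), "low"), ((2, 1), "low"),
    ((8, 8), "high"), ((9, 8), "high"),
]
print(knn_predict(training, query=(2, 2), k=3))
print(knn_predict(training, query=(8, 7), k=1))
```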

86
Q

Decision tree nodes are where:
A. A conscious decision between two or more options is made
B. No decision is made, likelihoods are attached to each outcome
C. Utilities are attached to nodes

A

A. A conscious decision between two or more options is made

87
Q

In a decision tree, all decisions must be binary (yes/no)
A. True
B. False

A

B. False

88
Q

A deep neural network is a neural network with many hidden layers
A. True
B. False

A

A. True

89
Q

Support Vector Machine can be used for both regression and classification
A. True
B. False

A

A. True

90
Q

Data analytics and data-driven decisions
A. Help companies meet their goals
B. Help identify trends, patterns, predict future, gain competitive advantage
C. Create a level of uncertainty for managers
D. Create direct value for the customers

A

A. Help companies meet their goals
B. Help identify trends, patterns, predict future, gain competitive advantage

91
Q

Data driven decision making…
A. Promotes decisions based on opinion
B. Promotes decision based on intuition
C. Promotes decisions based on data and evidence
D. Promotes decisions based on what the person with more knowledge knows

A

C. Promotes decisions based on data and evidence

92
Q

Put these decision cycle steps in the correct order starting at the top.
A. Analysis
B. Data
C. Insight
D. Decision

A

B. Data
A. Analysis
C. Insight
D. Decision

93
Q

Put these decision cycle steps in the correct order starting at the top.
A. Outcome
B. Action
C. Improvement
D. Assessment

A

B. Action
A. Outcome
D. Assessment
C. Improvement

94
Q

Data analysis results can provide insights to a question or sometimes to questions that haven’t been asked yet.
A. True
B. False

A

A. True

95
Q

What is the difference between outcomes and assessments?
A. Outcomes are data compared to a desired goal
B. Assessments occur before outcomes are determined
C. Outcomes are the collected results of actions
D. Assessments lead to insights

A

C. Outcomes are the collected results of actions

96
Q

A negative feedback loop
A. is really not a loop but a straight line
B. Is trying to get back to a set point
C. Is trying to avoid a set point
D. Moves away from the set point

A

B. Is trying to get back to a set point

97
Q

Negative feedback loops promote
A. Amplification
B. Stabilization

A

B. Stabilization

98
Q

Positive feedback loops
A. Are always beneficial
B. Amplify the effects of a disturbance
C. Diminish the effects of a disturbance
D. Have an odd number of negative couplings

A

B. Amplify the effects of a disturbance

99
Q

Positive feedback loops…
A. Return something to its original state
B. Are stabilizing
C. Reduce change
D. Destabilize as they increase change

A

D. Destabilize as they increase change
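
Note: the two loop types can be imitated with a one-line update rule. This is a toy model only, assuming the change at each step is proportional to the gap from the set point: a negative gain pulls the value back toward the set point, a positive gain pushes it further away. The numbers and the function name are hypothetical.

```python
def simulate(start, set_point, gain, steps):
    """Toy feedback loop: each step changes the value by gain * (value - set_point)."""
    value = start
    history = [value]
    for _ in range(steps):
        value += gain * (value - set_point)
        history.append(value)
    return history

negative_loop = simulate(start=30.0, set_point=20.0, gain=-0.5, steps=5)
positive_loop = simulate(start=21.0, set_point=20.0, gain=0.5, steps=5)
print(negative_loop)  # gap to the set point shrinks (stabilizing)
print(positive_loop)  # gap to the set point grows (amplifying)
```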

100
Q

Analysis paralysis refers to the inaction that can result from overanalyzing when making a decision.
A. True
B. False

A

A. True

101
Q

Which of the following is a challenge to optimization of the data-driven decision cycle?
A. Assessment of metrics
B. Disruptive technologies
C. Multiple actions to meet a goal
D. Improvements required due to missed goals

A

B. Disruptive technologies

102
Q

When does overfitting often happen?
A. When noise and outliers are not included in the training model
B. Too few parameters compared to the number of data points
C. Bad or dirty data are used for training
D. When noise and outliers are included in the training model

A

D. When noise and outliers are included in the training model

103
Q

Where can bias occur in analytics?
A. Data collection
B. Analysis design
C. Outcome assessment
D. All of the above

A

D. All of the above

104
Q

Which step is NOT a way to avoid bias in data analytics?
A. Ask about the analysis method
B. Question the conclusion
C. Think critically
D. Don’t care about the bias and accept it

A

D. Don’t care about the bias and accept it