Stats Flashcards

1
Q

What is the primary focus of statistics?

Group of answer choices

Data mining

Application of algorithms to inform strategic decisions

Collection, analysis, interpretation, presentation, and organization of data

Predictive modeling

A

Collection, analysis, interpretation, presentation, and organization of data

2
Q

Which of the following methods is commonly used in statistics to understand data distributions and relationships?

Group of answer choices

Algorithm application

Predictive modeling

Hypothesis testing and regression analysis

Data mining

A

Hypothesis testing and regression analysis

3
Q

What does analytics emphasize in addition to statistical methods?

Group of answer choices

Data presentation

Predictive modeling and data mining

Data collection

Data interpretation

A

Predictive modeling and data mining

4
Q

Which of the following best describes the scope of analytics?

Group of answer choices

Limited to data collection and presentation

Integrates statistical methods with advanced computational techniques

Focuses solely on hypothesis testing

Only involves data organization

A

Integrates statistical methods with advanced computational techniques

5
Q

What is the first step in the data analysis process?

Group of answer choices

Prepare data

Extract patterns

Apply machine learning techniques

Get actionable information

A

Prepare data

6
Q

Which of the following is not listed as a data source from the chart?

Group of answer choices

Printed Books

Audio

Email

Social Media Posts

A

Printed Books

7
Q

In which step would you apply machine learning techniques according to this flowchart?

Group of answer choices

Step 3 - Get Actionable Information

Step 2 - Extract Patterns

None of the above steps explicitly mention applying machine learning techniques

Step 1 - Prepare Data

A

Step 2 - Extract Patterns

8
Q

What outcome does this flowchart suggest as a result of following these steps?

Group of answer choices

Gaining insights or making informed decisions based on analyzed data

Creation of new databases

Learning how to code in various programming languages

Development of new software programs

A

Gaining insights or making informed decisions based on analyzed data

9
Q

Which of the following is an example of transactional data?

Group of answer choices

Weather forecasts

Credit card payment

Movie reviews

Social media posts

A

Credit card payment

10
Q

What type of information is included in contractual, subscription, or account data?

Group of answer choices

Weather patterns

Social media interactions

General market trends

Information about the type of product combined with customer characteristics

A

Information about the type of product combined with customer characteristics

11
Q

What is the primary aim of surveys?

Group of answer choices

To entertain a particular group of people

To provide financial assistance to people

To organize social events for communities

To extract sociodemographic and behavioral data from a particular group of people

A

To extract sociodemographic and behavioral data from a particular group of people

12
Q

What is unstructured data?

Group of answer choices

Information that resides in a traditional row-column database

Information that does not reside in a traditional row-column database

Data that is always numerical

Data that is always textual

A

Information that does not reside in a traditional row-column database

13
Q

Which of the following is an example of a purpose for which data poolers gather data?

Group of answer choices

Event planning

Marketing and credit risk assessment

Cooking recipes

Weather forecasting

A

Marketing and credit risk assessment

14
Q

What is the first phase in the data analytics process?

Group of answer choices

Modelling

Data Preparation

Evaluation

Business Understanding

A

Business Understanding

15
Q

What is the primary goal of the Business Understanding phase?

Group of answer choices

Evaluating the model

Identifying the business task to be solved

Applying machine learning algorithms

Cleaning data for better quality

A

Identifying the business task to be solved

16
Q

Which phase involves selecting related data from various databases?

Group of answer choices

Deployment

Modelling

Data Understanding

Data Preparation

A

Data Understanding

17
Q

What is the primary focus of the Modelling phase?

Group of answer choices

Identifying business tasks

Applying statistical and machine learning algorithms

Cleaning data

Selecting related data

A

Applying statistical and machine learning algorithms

18
Q

Which phase involves evaluating the performance of the model?

Group of answer choices

Data Preparation

Business Understanding

Deployment

Evaluation

A

Evaluation

19
Q

What type of data can be found in a Temporal, Sequence or Time-Series Database?

Group of answer choices

Time-based data

Categorical data

Aggregated data

Static data

A

Time-based data

20
Q

Which phase involves selecting the related data from many available databases to correctly describe a given business task?

Group of answer choices

Evaluation

Modelling

Data Understanding

Data Preparation

A

Data Understanding

21
Q

What is the definition of the Mean?

Group of answer choices

The average value of a dataset

The middle value in a dataset

The most frequently occurring value in a dataset

The range of values in a dataset

A

The average value of a dataset

22
Q

How is the Mean calculated?

Group of answer choices

By identifying the most frequent value

By summing all values and dividing by the number of values

By subtracting the smallest value from the largest value

By finding the middle value

A

By summing all values and dividing by the number of values
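
A minimal Python sketch of this calculation. The function name `arithmetic_mean` and the sample values are illustrative assumptions, not part of the deck; the standard library's `statistics.mean` is shown only as a cross-check.

```python
from statistics import mean


def arithmetic_mean(values):
    """Sum all values, then divide by the number of values."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


data = [11, 13, 4, 30, 9, 15]  # illustrative values
print(arithmetic_mean(data))   # hand-rolled version
print(mean(data))              # standard-library cross-check
```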

23
Q

What does the Median represent?

Group of answer choices

The most frequently occurring value in a dataset

The difference between the highest and lowest values

The average value of a dataset

The middle value when arranged in order

A

The middle value when arranged in order
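
As a rough sketch, the median can be found by sorting and taking the middle element, or the average of the two middle elements when the count is even. The name `middle_value` is a hypothetical helper; `statistics.median` from the standard library follows the same convention.

```python
from statistics import median


def middle_value(values):
    """Return the middle value of the data once arranged in order."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average the two middle values


print(middle_value([5, 1, 3]))      # 3
print(middle_value([4, 1, 3, 2]))   # 2.5
print(median([4, 1, 3, 2]))         # standard-library cross-check
```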

24
Q

Which measure of central tendency can have multiple values?

Group of answer choices

Range

Mode

Mean

Median

A

Mode
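
A short illustration of why the mode can have multiple values, using `statistics.multimode` from the Python standard library (Python 3.8+). The sample lists are made up for the example.

```python
from statistics import multimode

# Two values tie for the highest frequency, so both are modes.
bimodal = [1, 2, 2, 3, 3, 4]
print(multimode(bimodal))    # [2, 3]

# A single most frequent value gives a one-element list.
unimodal = [1, 2, 2, 3]
print(multimode(unimodal))   # [2]
```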

25
Q

What is the primary purpose of measures of central tendency?

Group of answer choices

Measuring dispersion

Calculating probability

Solving equations

Organizing, summarizing, and visualizing data

A

Organizing, summarizing, and visualizing data

26
Q

What is the midrange of the data set 11, 13, 4, 30, 9, 15?

A

17
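
The arithmetic behind this answer: the midrange is the average of the smallest and largest values, here (4 + 30) / 2 = 17. A small sketch, where `midrange` is a hypothetical helper name:

```python
def midrange(values):
    """Average of the minimum and maximum values."""
    if not values:
        raise ValueError("the midrange of an empty dataset is undefined")
    return (min(values) + max(values)) / 2


data = [11, 13, 4, 30, 9, 15]
print(midrange(data))  # 17.0
```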

27
Q

It describes the current performance of data

A

Descriptive Analytics

28
Q

It identifies data anomalies

A

Diagnostic Analytics

29
Q

Refers to the general relationship
between two random variables

A

Association

30
Q

Refers to a linear relationship between two variables

A

Correlation

31
Q

It measures the linear correlation between two numerical variables

A

Pearson Correlation Coefficient
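
One way to compute this coefficient from its textbook definition (covariance divided by the product of the standard deviations) is sketched below. The function name `pearson_r` and the sample data are assumptions for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (spread_x * spread_y)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))   # close to +1: perfect positive linear relationship
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))   # close to -1: perfect negative linear relationship
```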

32
Q

It measures the strength and direction of the monotonic (not necessarily linear) relationship between two variables.

A

Spearman Rank Correlation

33
Q

It is correlation based on the ranks of the data

A

Spearman Rank Correlation
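
A sketch of the rank-based idea: replace each value by its rank (ties share the average rank), then compute the Pearson coefficient on the ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical, and the cubic data is invented to show a relationship that is monotonic but not linear.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient (assumes neither input is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]    # monotonic but not linear
print(spearman_rho(x, y))  # close to 1: the ranks agree perfectly
print(pearson_r(x, y))     # below 1: the relationship is not a straight line
```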

34
Q

It refers to a consistent trend between two variables: as one variable varies, the other also consistently varies, but not necessarily at a constant rate.

A

Monotonic relationship

35
Q

It is a branch of statistics that involves using sample data to make conclusions or predictions about a larger population

A

Inferential Statistics

36
Q

It allows us to infer or generalize findings from a sample to the population it
represents.

A

Inferential Statistics

37
Q

It is the process of using sample data to
calculate a single value (point estimate) that serves as an
estimate of a population parameter.

A

Point Estimation
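
Two common point estimates, sketched with made-up sample data: the sample mean as an estimate of the population mean, and the sample proportion as an estimate of the population proportion. The variable names are illustrative assumptions.

```python
from statistics import mean

# Hypothetical sample of measurements drawn from a larger population.
heights_cm = [162, 170, 168, 175, 165]
estimated_population_mean = mean(heights_cm)      # single-value estimate of the population mean
print(estimated_population_mean)                  # 168

# Hypothetical yes/no survey responses (1 = yes, 0 = no).
responses = [1, 0, 1, 1, 0]
estimated_population_proportion = sum(responses) / len(responses)
print(estimated_population_proportion)            # 0.6
```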

38
Q

__________ is a fundamental concept in inferential statistics used to make decisions about a population based on sample data

A

Hypothesis Testing

39
Q

The independent variable does not influence the dependent variable.

A

Null Hypothesis

40
Q

The independent variable does influence the dependent variable.

A

Alternative Hypothesis

41
Q

_______ is a range or a “guess” about where a population value (like a mean or proportion) might fall

A

Confidence interval
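
A minimal sketch of an interval for a population mean, assuming a normal approximation (reasonable for larger samples; for small samples a t critical value would usually replace the z value). The function name `mean_confidence_interval` and the sample are hypothetical; `statistics.NormalDist` is part of the Python standard library (3.8+).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    center = mean(sample)
    standard_error = stdev(sample) / sqrt(n)          # sample SD estimates the spread
    z = NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    margin = z * standard_error
    return center - margin, center + margin


sample = [48, 52, 50, 47, 53, 51, 49, 50]  # illustrative values
low, high = mean_confidence_interval(sample)
print(low, high)  # range of plausible values for the population mean
```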

42
Q

It guides our decision-making in hypothesis testing by setting a threshold for rejecting or failing to reject the null hypothesis based on the observed data.

A

Significance Level

43
Q

Type of error that rejects the null hypothesis when it is actually true

A

Type I Error (False Positive)

44
Q

Failing to reject the null hypothesis when the alternative hypothesis is true, that is, failing to detect a significant effect or difference that actually exists.

A

Type II Error (False Negative)

45
Q

This test is used to determine whether there is a
significant association between two categorical variables

A

Chi-Square test
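
A sketch of the test statistic for a contingency table of two categorical variables: compare each observed count with the count expected under independence. The function name `chi_square_statistic` and the counts are invented for illustration, and no continuity correction is applied. For a 2x2 table (1 degree of freedom) the usual critical value at the 0.05 level is about 3.841.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows are two groups, columns are two outcomes.
observed = [[10, 20],
            [20, 10]]
stat = chi_square_statistic(observed)
print(stat)            # about 6.67
print(stat > 3.841)    # True: evidence of association at the 0.05 level (df = 1)
```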

46
Q

This test is used to determine whether there is a significant
difference between sample and population means, or between the means of two samples.

A

Z-test
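
A rough sketch of a one-sample, two-sided z-test comparing a sample mean with a known population mean, assuming the population standard deviation is known. The function name, the numbers, and the 0.05 significance level are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, population_mean, population_sd):
    """Return (z statistic, two-sided p-value) for a one-sample z-test."""
    n = len(sample)
    standard_error = population_sd / sqrt(n)
    z = (mean(sample) - population_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 36 observations that all equal 106, tested against mean 100, SD 15.
sample = [106] * 36
z, p_value = one_sample_z_test(sample, population_mean=100, population_sd=15)
print(z, p_value)
alpha = 0.05               # assumed significance level
print(p_value < alpha)     # True here, so the null hypothesis would be rejected
```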

47
Q

Use ______ when the population standard deviation (σ) is unknown and must be estimated from the sample.

A

t-test
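
A sketch of the one-sample t statistic, where the sample standard deviation stands in for the unknown σ. The function name and data are hypothetical. The Python standard library has no t distribution, so turning the statistic into a p-value would need a t table or a library such as SciPy (for example `scipy.stats.ttest_1samp`).

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t_statistic(sample, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    sample_sd = stdev(sample)                 # estimates the unknown population SD
    standard_error = sample_sd / sqrt(n)
    t = (mean(sample) - hypothesized_mean) / standard_error
    return t, n - 1


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, sample mean is 5
t, df = one_sample_t_statistic(sample, hypothesized_mean=4)
print(t, df)  # compare t against a t table with df degrees of freedom
```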

48
Q

If you know the population standard deviation and have a large sample
size (n > 30), you can use a ________

A

z-test

49
Q

If the population standard deviation is unknown or the sample size is small (n < 30), use a _______

A

t-test
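
The rule of thumb from the last two cards can be written as a tiny helper. The function name `choose_test` is hypothetical, and treating the boundary case n = 30 as a t-test is an assumption, since the cards only cover n > 30 and n < 30.

```python
def choose_test(population_sd_known, sample_size):
    """Pick a test following the deck's rule of thumb for comparing means."""
    if population_sd_known and sample_size > 30:
        return "z-test"
    # Unknown population SD or a small sample; n == 30 is treated as small here (assumption).
    return "t-test"


print(choose_test(population_sd_known=True, sample_size=50))    # z-test
print(choose_test(population_sd_known=False, sample_size=50))   # t-test
print(choose_test(population_sd_known=True, sample_size=12))    # t-test
```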