Analyzing Data Flashcards

1
Q

Data can be…

A

discrete or continuous

2
Q

How is the distribution of a dataset measured?

A

The distribution of the dataset is measured using frequency distribution and is presented in graphical form using a bar chart or a histogram.
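As a rough illustration of a frequency distribution, the sketch below counts values and prints text bars. All data values and names (`responses`, `heights_cm`, `bin_width`) are made up for the example; a plotting library would normally draw the actual bar chart or histogram.

```python
from collections import Counter

# Hypothetical survey responses (nominal data), invented for illustration.
responses = ["red", "blue", "red", "green", "blue", "red"]

# Frequency distribution: how many times each category occurs.
frequency = Counter(responses)

# Text version of a bar chart: one bar per category, most frequent first.
for category, count in frequency.most_common():
    print(f"{category:<6} {'#' * count}")

# Hypothetical continuous measurements, grouped into bins of width 10,
# which is the grouping a histogram performs before drawing its bars.
heights_cm = [151.2, 158.9, 163.4, 167.0, 168.5, 172.3, 179.8]
bin_width = 10
bins = Counter(int(h // bin_width) * bin_width for h in heights_cm)

for lower in sorted(bins):
    print(f"{lower}-{lower + bin_width}: {'#' * bins[lower]}")
```

The categories need no binning because they are already discrete, while the continuous heights must be grouped into intervals first.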

3
Q

The histogram is preferred in the case of what type of data?

A

continuous

4
Q

The bar chart is preferred for what type of data?

A

nominal or ordinal

5
Q

Measures of central tendency are used to show…

A

the center of the data set

6
Q

What is the mean?

A

The arithmetic average of all values in the data set (their sum divided by their count).

7
Q

What is the median?

A

The middle value of the entire data set.
Usually, the data is sorted in ascending order and the middle value is taken as the median. With an even number of values, the median is the average of the two middle values.

8
Q

What is the mode?

A

Mode indicates the most commonly occurring value in the data set.
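A minimal sketch of the three measures of central tendency, using Python's standard `statistics` module. The `scores` list is hypothetical data chosen so that the three measures differ.

```python
import statistics

# Hypothetical data set, already sorted in ascending order.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum / count = 30 / 6
median = statistics.median(scores)  # average of the two middle values, 3 and 5
mode = statistics.mode(scores)      # most frequently occurring value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
```

Because the list has an even number of values, the median falls between the two middle values rather than on one of them.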

9
Q

If the mean, median, and mode are equal then data is said to be…

A

symmetrically distributed, as in a normal distribution

10
Q

The estimates of central tendency are used to compute the next set of characteristics of the data set, which is…

A

dispersion or variation

11
Q

The objective of measure of dispersion or variation is to identify…

A

the extent to which the entire data set is spread from the central tendency, specifically the mean

12
Q

What are common measures of dispersion or variation?

A

Range
Variance
Standard deviation

13
Q

What is variance?

A

The variance is the square of the standard deviation and measures the spread of the data. It differs from the standard deviation, which indicates how concentrated the data is around the mean and is expressed in the same units as the data.

14
Q

What is range?

A

The range is the simplest of the three measures of dispersion (range, variance, and standard deviation). It is the difference between the maximum and minimum values in the data set.
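The three measures of dispersion can be sketched with the standard `statistics` module. The `data` list is hypothetical, and the population formulas (`pvariance`, `pstdev`) are an assumption made for the example; `statistics.variance` and `statistics.stdev` give the sample versions, which divide by n - 1 instead of n.

```python
import statistics

# Hypothetical data set with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: difference between the maximum and minimum values.
data_range = max(data) - min(data)

# Population variance: average squared distance from the mean.
variance = statistics.pvariance(data)

# Population standard deviation: square root of the variance,
# so it is expressed in the same units as the data.
std_dev = statistics.pstdev(data)

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```

Squaring `std_dev` recovers `variance`, which is the relationship between the two measures that the variance card describes.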

15
Q

When do we use descriptive stats?

A

When descriptive statistics can give straightforward answers about a sample, usually when the question is about describing a sample or comparing two samples. Two separate samples can be compared if the researcher can establish that they are large enough to be compared adequately.
The objective is to answer a direct question and not to infer about an entire population. For any kind of inferential statistics in research, the first step is to establish the descriptive statistics from the data.

16
Q

Why do we use descriptive stats?

A

To create an overview of the entire data set by summarizing it
To generate an actionable set of information from the large data set having multiple variables
To segregate the data into homogeneous groups to enable comparison
To provide the basis on which inferential statistical tests operate to generalize about a population or forecast outcomes

17
Q

Why do we use Excel spreadsheets for data?

A

Excel provides information on summary statistics that includes Mean, Standard Error, Median, Mode, Standard Deviation, Variance, Kurtosis, Skewness, Range, Minimum, Maximum, Sum, and Count. In other words, it consists of measures of central tendency, variability, skewness and kurtosis.

18
Q

What is SPSS?

A

Statistical Product and Service Solutions
Runs descriptive statistics and regression analyses, shows patterns of missing data, and summarizes variable distributions in an integrated interface.
IBM® SPSS® Statistics is a powerful statistical software platform. It offers a user-friendly interface and a robust set of features that lets your organization quickly extract actionable insights from your data. Advanced statistical procedures help ensure high accuracy and quality decision making. All facets of the analytics lifecycle are included, from data preparation and management to analysis and reporting.

19
Q

What are the 2 main uses of inferential statistics?

A

Making estimates about populations
Testing hypotheses to draw conclusions about populations

20
Q

What is inferential statistics?

A

Inferential statistics uses a small sample of data to draw inferences about the larger population that the sample came from.
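One common form of inference is estimating a population mean from a sample, with a confidence interval around the estimate. The sketch below uses hypothetical measurements and assumes a normal approximation for the 95% interval; for a sample this small, a t-distribution critical value would usually be more appropriate.

```python
import math
import statistics

# Hypothetical sample drawn from a larger population.
sample = [4.9, 5.1, 5.0, 5.3, 4.8, 5.2, 5.0, 5.1, 4.7, 5.4]
n = len(sample)

# Point estimate of the population mean.
sample_mean = statistics.mean(sample)

# Standard error: sample standard deviation divided by sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Critical value for a 95% interval under the normal approximation
# (an assumption of this sketch, roughly 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

print(f"estimated population mean: {sample_mean:.2f}")
print(f"approximate 95% interval: {lower:.2f} to {upper:.2f}")
```

The interval expresses the uncertainty that comes from seeing only a sample rather than the whole population.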

21
Q

What are the 5 main steps in hypothesis testing?

A

State your research hypothesis as a null hypothesis and alternate hypothesis.
Collect data in a way designed to test the hypothesis.
Perform an appropriate statistical test.
Decide whether to reject or fail to reject your null hypothesis.
Present the findings in your results and discussion section.
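The five steps can be sketched end to end with a permutation test on the difference in group means. The test choice, the data, the seed, and the 0.05 significance level are all assumptions made for illustration; the appropriate test depends on the data and the study design.

```python
import random
import statistics

# Step 1: H0: both groups have the same mean. H1: the means differ.

# Step 2: hypothetical data collected for each group.
control = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3]
treated = [12.9, 13.1, 12.7, 13.4, 12.8, 13.0]

# Step 3: permutation test. Shuffle the group labels many times and count
# how often the shuffled difference is at least as large as the observed one.
observed = statistics.mean(treated) - statistics.mean(control)
pooled = control + treated
rng = random.Random(0)  # fixed seed so the run is repeatable
n_permutations = 5000
extreme = 0
for _ in range(n_permutations):
    rng.shuffle(pooled)
    diff = statistics.mean(pooled[:6]) - statistics.mean(pooled[6:])
    if abs(diff) >= abs(observed):
        extreme += 1
p_value = (extreme + 1) / (n_permutations + 1)

# Step 4: reject or fail to reject the null hypothesis.
alpha = 0.05
reject_null = p_value < alpha

# Step 5: report the finding.
decision = "reject" if reject_null else "fail to reject"
print(f"difference = {observed:.2f}, p = {p_value:.4f}, {decision} H0")
```

A permutation test is used here only because it needs nothing beyond the standard library; a t-test would follow the same five steps.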

22
Q

What is hypothesis testing?

A

A formal procedure for investigating our ideas about the world using statistics. It is most often used by scientists to test specific predictions, called hypotheses, that arise from theories.

23
Q

What is a p-value?

A

Describes how likely you would be to find observations at least as extreme as yours if the null hypothesis were true.

The smaller the p-value, the more likely you are to reject the null hypothesis.
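As a worked example, the sketch below computes an exact two-sided p-value for a coin-flip experiment. The scenario (60 heads in 100 flips, null hypothesis of a fair coin) and the helper `binomial_pmf` are hypothetical, introduced only for illustration.

```python
from math import comb

# Hypothetical experiment: 100 flips, 60 heads observed.
n_flips = 100
observed_heads = 60


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n flips when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Under the null hypothesis (fair coin) the expected count is 50 heads.
expected = n_flips * 0.5
distance = abs(observed_heads - expected)

# Two-sided p-value: total probability of every outcome at least as far
# from the expected count as the observed one (60 or more, or 40 or fewer).
p_value = sum(
    binomial_pmf(k, n_flips)
    for k in range(n_flips + 1)
    if abs(k - expected) >= distance
)

print(f"p-value: {p_value:.4f}")
```

The sum covers outcomes on both sides of the expected count because the alternative hypothesis here is simply that the coin is not fair, in either direction.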
