Stats Midterm 1 Flashcards

1
Q

What is Statistics

A

Collecting Data
e.g. Survey

Characterizing Data
e.g. Mean and Median

Analyzing Data
e.g. Trends and Patterns

Interpreting Data
e.g. Conclusions and Decisions

2
Q

What is Statistics (cont’d)

A

Statistics is the science of data. It involves
collecting, classifying, summarizing,
organizing, analyzing, and interpreting
numerical information.

3
Q

Descriptive Statistics

A

consists of methods for organizing and summarizing information.

includes the construction of graphs, charts, and tables and the calculation of various descriptive measures such as averages, measures of variation, and percentiles

4
Q

Population

A

The collection of all individuals or
items under consideration in a statistical study

5
Q

Sample

A

That part of the population from which
information is obtained.

6
Q

Census

A

The collection of data from every member
of a population

7
Q

Inferential statistics

A

consists of methods for drawing and
measuring the reliability of conclusions about a population based on information obtained from a sample of the population.

(draw conclusions)

8
Q

observational study

A

is a data-collection method where the experimental units sampled are observed in their natural setting

9
Q

designed experiment

A

is a data-collection method where the researcher exerts full control over the characteristics of the experimental units sampled.

10
Q

Simple random sampling

A

A sampling procedure for which each possible sample of a given size is equally likely to be the one obtained. It is one type of probability sampling.

11
Q

Simple random sample

A

A sample obtained by simple random sampling

12
Q

representative sample

A

exhibits characteristics typical of those possessed by the population of interest

13
Q

Simple random sampling with replacement
(SRSWR)

A

whereby a member of the population can be selected more than once

14
Q

Simple random sampling without
replacement (SRS)

A

whereby a member of the population can be selected at most once
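
The difference between sampling with and without replacement can be sketched in Python. This is only an illustration: the function names and the `seed` parameter are assumptions made for the example, and the population is assumed to be a list.

```python
import random

def srs_without_replacement(population, n, seed=None):
    """Draw n members; each member can appear at most once."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def srs_with_replacement(population, n, seed=None):
    """Draw n members; the same member may be drawn more than once."""
    rng = random.Random(seed)
    return rng.choices(population, k=n)

students = ["Ana", "Ben", "Cy", "Dee", "Eli"]
print(srs_without_replacement(students, 3, seed=1))  # three distinct names
print(srs_with_replacement(students, 3, seed=1))     # repeats are possible
```

Without replacement, n cannot exceed the population size; with replacement, it can.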

15
Q

experimental units

A

In a designed experiment, the individuals or items on which the experiment is performed are

16
Q

subject

A

When the experimental units are humans

17
Q

Principles of Experimental Design

A

Control: Two or more treatments should be compared.

Randomization: The experimental units should be randomly divided into groups to avoid unintentional selection bias in constituting the groups.

Replication: A sufficient number of experimental units should be used to ensure that randomization creates groups that resemble each other closely and to increase the chances of detecting any differences among the treatments.

18
Q

The group receiving the specified treatment is

A

treatment group

19
Q

the group receiving placebo is

A

control group

20
Q

Response variable

A

The characteristic of the experimental outcome that is to be measured or observed

21
Q

Factor

A

A variable whose effect on the response variable is of interest in the experiment

22
Q

Levels

A

The possible values of a factor

23
Q

Treatment

A

Each experimental condition

24
Q

completely randomized design

A

all the experimental units are assigned randomly among all the treatments
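
One possible way to carry out such an assignment is to shuffle the units and deal them out to the treatments in turn. The sketch below assumes roughly equal group sizes are wanted; the function name and `seed` parameter are made up for the example.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Randomly assign every unit to one of the treatments."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # random order removes selection bias
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # Deal the shuffled units out like cards, one treatment at a time.
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

print(completely_randomized(range(1, 9), ["drug", "placebo"], seed=0))
```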

25
Q

randomized block design

A

the experimental units are assigned randomly among all the treatments separately within each block
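
A rough sketch of the same idea with blocks: the shuffle-and-deal step is repeated inside each block, so every block gets its own random split. The block names, unit labels, and function name below are hypothetical.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign units to treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # randomize within this block only
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block_name, treatments[i % len(treatments)])
    return assignment

blocks = {"men": ["m1", "m2", "m3", "m4"], "women": ["w1", "w2"]}
print(randomized_block(blocks, ["drug", "placebo"], seed=3))
```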

26
Q

histogram

A

displays the classes of the quantitative
data on a horizontal axis and the frequencies (relative frequencies, percents) of those classes on a vertical axis

27
Q

Important Uses of a Histogram

A
  • Visually displays the shape of the distribution of the data
  • Shows the location of the center of the data
  • Shows the spread of the data
  • Identifies outliers
28
Q

dotplot

A

is a graph in which each observation is
plotted as a dot at an appropriate place above a horizontal axis

29
Q

Features of Dotplot

A

– Displays the shape of distribution of data.
– It is usually possible to recreate the original list of data values

30
Q

stem-and-leaf diagram (or stemplot)

A

each observation is separated into two parts: a stem (all but the rightmost digit) and a leaf (the rightmost digit).
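
As a small illustration, the stem and leaf split can be done with integer division by 10. The sketch assumes the observations are non-negative whole numbers; the function name is made up for the example.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its remaining digits (stem)."""
    stems = defaultdict(list)
    for v in sorted(values):          # sorting keeps the leaves in order
        stem, leaf = divmod(v, 10)    # e.g. 23 -> stem 2, leaf 3
        stems[stem].append(leaf)
    return dict(stems)

for stem, leaves in stem_and_leaf([23, 25, 31, 8, 37, 21]).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```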

31
Q

Features of Stem-and-leaf

A

– Shows the shape of the distribution of the data.
– Retains the original data values.
– The sample data are sorted (arranged in order).

32
Q

distribution of a data set

A

is a table, graph, or formula that provides the values of the observations and how often they occur

33
Q

Population data

A

The values of a variable for the entire population

34
Q

Sample data

A

The values of a variable for a sample of the population

35
Q

population distribution, or the distribution of the variable

A

The distribution of population data is

36
Q

sample distribution

A

The distribution of sample data is

37
Q

Drawings of objects

A

pictographs

38
Q

Variable

A

a characteristic that varies from one person (item/object) to another

39
Q

Data

A

all the values of the variable

40
Q

Data set

A

the collection of all observations for a variable; a data set may also contain observations for more than one variable

41
Q

Quantitative (or numerical) data

A

consist of numbers representing counts or
measurements

42
Q

Categorical (or qualitative or attribute) data

A

consist of names or labels (not numbers that represent counts or measurements)

43
Q

Discrete data

A

data values are quantitative, and the number of values is finite, or “countable.”

44
Q

Example of Discrete data

A

Examples: The number of tosses of a coin before getting tails or the number of students in this class

45
Q

Continuous data

A

result from infinitely many possible quantitative values, where the collection of values is not countable

46
Q

Examples of Continuous Data

A

Examples: Heights, weights, lengths, temperature

47
Q

frequency distribution

A

of qualitative data is a listing of the distinct values and their frequencies

48
Q

relative-frequency distribution

A

of qualitative data is a listing of the distinct values and their relative frequencies
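
A frequency distribution and its relative-frequency counterpart can be tallied in a few lines of Python. This is a minimal sketch; the blood-type values are invented sample data and the function names are assumptions for the example.

```python
from collections import Counter

def frequency_distribution(values):
    """Map each distinct value to the number of times it occurs."""
    return dict(Counter(values))

def relative_frequency_distribution(values):
    """Map each distinct value to its share of all observations."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}

blood_types = ["A", "B", "A", "O"]
print(frequency_distribution(blood_types))
print(relative_frequency_distribution(blood_types))
```

The relative frequencies always add up to 1 (up to floating-point rounding).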

49
Q

pie chart

A

is a disk divided into wedge-shaped pieces proportional to the relative frequencies of
the qualitative data

50
Q

bar chart

A

displays the distinct values of the qualitative data on a horizontal axis and the relative frequencies (or frequencies or percents) of those values on a vertical axis.

51
Q

Lower class limits

A

The smallest numbers that can belong to each of the different classes (categories)

52
Q

Upper class limits

A

The largest numbers that can belong to each of the different classes (categories)

53
Q

Class mark or midpoint

A

– The values in the middle of the classes.
– Each class midpoint can be found by:
▪ adding the lower class limit to the upper class limit and dividing the sum by 2.

54
Q

Class width

A

The difference between two consecutive lower class limits (or two consecutive lower class boundaries) in a frequency distribution.
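
For example, with hypothetical classes 10-19, 20-29, 30-39, the class midpoint and class width defined from class limits could be computed as follows. The function names are assumptions for the sketch, and equal-width classes are assumed.

```python
def class_midpoint(lower_limit, upper_limit):
    """Average of a class's lower and upper class limits."""
    return (lower_limit + upper_limit) / 2

def class_width(lower_limits):
    """Difference between two consecutive lower class limits."""
    return lower_limits[1] - lower_limits[0]

print(class_midpoint(10, 19))     # midpoint of the class 10-19
print(class_width([10, 20, 30]))  # width shared by all three classes
```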

55
Q

Lower class cutpoint

A

The smallest value that could go in a class

56
Q

Upper class cutpoint

A

The smallest value that could go in the next-higher class (equivalent to the lower cutpoint of the next-higher class)

57
Q

Class width

A

The difference between the cutpoints of a class

58
Q

Class midpoint

A

The average of the two cutpoints of a
class

59
Q

Gaps

A

The presence of gaps can show that the data are from two or more different populations.

60
Q

central tendency

A

of the set of measurements, that is, the tendency of the data to cluster, or center, about certain numerical values

61
Q

mean

A

is the sum of the measurements divided by the number of measurements for the variable

62
Q

A statistic is resistant

A

if the presence of extreme values (outliers) does not cause it to change very much.
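
A quick way to see resistance is to compare the mean and the median before and after adding one extreme value. The numbers below are invented for illustration; the calls are from Python's standard `statistics` module.

```python
import statistics

typical = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]  # the 5 is replaced by an extreme value

# The mean is pulled toward the outlier, so it is not resistant.
mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outlier)

# The median stays put, so it is resistant.
median_before = statistics.median(typical)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)      # 3 and 22
print(median_before, median_after)  # 3 and 3
```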

63
Q

mode

A

is the measurement that occurs most frequently in the data set

64
Q

bimodal

A

When two data values occur with the same greatest frequency, each one is a mode, and the data set is

65
Q

multimodal

A

When more than two data values occur with the same greatest frequency, each is a mode, and the data set is

66
Q

no mode

A

When no data value is repeated, we say
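
The mode cards above can be summarized in one small function. This sketch follows the convention on these cards (a data set with no repeated value has no mode) and assumes a non-empty list; the function name is made up for the example.

```python
from collections import Counter

def modes(values):
    """Return the value(s) with the greatest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no data value is repeated: no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3]))        # one mode
print(modes([1, 1, 2, 2, 3]))     # two modes: bimodal
print(modes([1, 1, 2, 2, 3, 3]))  # more than two modes: multimodal
print(modes([1, 2, 3]))           # no mode
```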

67
Q

A data set is said to be skewed

A

if one tail of the distribution has more extreme observations than the other tail.

68
Q

variability

A

the spread of the data

69
Q

range

A

Max - Min

70
Q

sample variance

A

for a sample of n measurements (n is the number of data values in the sample) is equal to the sum of the squared deviations from the mean divided by (n – 1).

71
Q

standard deviation

A

is a measure of how much data values
deviate away from the mean
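
To make the arithmetic concrete, here is a minimal sketch of the sample variance and the sample standard deviation (taken here as the square root of the sample variance). The function names are assumptions, and the sample is assumed to have at least two values.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))

data = [1, 2, 3, 4, 5]        # mean is 3
print(sample_variance(data))  # (4 + 1 + 0 + 1 + 4) / 4 = 2.5
print(sample_std(data))
```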

72
Q

Smaller values tell us that our data is not spread out

A

They indicate that most of the data values are clustered around the mean

73
Q

Larger values tell us that our data is spread out

A

They indicate that most of the data values are not clustered around the mean

74
Q

Three-Standard-Deviations Rule

A

Almost all the observations in any data set lie within three standard deviations to either side of the mean
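
The rule can be checked on any data set by counting how many observations fall within three standard deviations of the mean. The sketch below uses the sample standard deviation from Python's `statistics` module; the function name and the example data are invented.

```python
import statistics

def fraction_within_three_sd(values):
    """Share of observations within 3 standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    inside = [x for x in values if abs(x - mean) <= 3 * sd]
    return len(inside) / len(values)

print(fraction_within_three_sd(list(range(1, 21))))  # every value is inside
print(fraction_within_three_sd([0] * 19 + [100]))    # the single 100 falls outside
```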