Exam 1 Flashcards

1
Q

Range = ?

A

Range = Maximum - Minimum

2
Q

Define Sample

A

A sample is a set of data drawn from the population.

3
Q

Define Population

A

A population is the group of all items of interest to a statistics practitioner.

4
Q

Define Parameter

A

A parameter is a descriptive measure of a population.

5
Q

Define Statistic

A

A statistic is a descriptive measure of a sample.

6
Q

Define Descriptive Statistics

A

Descriptive statistics deals with methods of organizing, summarizing, and presenting data in a convenient and informative way.

7
Q

Define inferential statistics

A

Inferential statistics is a body of methods used to draw conclusions or inferences about characteristics of populations based on sample data.

8
Q

We use __________ to make inferences about _____________.

A

We use statistics to make inferences about parameters.

9
Q

Define confidence level

A

The confidence level is the proportion of times that an estimating procedure will be correct.

10
Q

Define Significance Level

A

The significance level measures how frequently the conclusion will be wrong in the long run.

11
Q

_______ and _________ are popular numerical techniques to describe the location of the data.

A

The mean and median are popular numerical techniques to describe the location of the data.

12
Q

The _______, ________, and ______ _______ measure the variability of the data

A

The range, variance, and standard deviation measure the variability of the data

13
Q

Define Variable

A

A variable is some characteristic of a population or sample. It is usually represented by an uppercase letter such as X, Y, or Z.

14
Q

Define Values of a Variable

A

The values of the variable are the range of possible values for a variable.

15
Q

Three types of data and information

A

Interval Data, Nominal Data, Ordinal Data

16
Q

Define Interval Data

A

Interval data are real numbers, e.g. heights, weights, and prices. The intervals between values are equal, so arithmetic operations can be performed on interval data.

17
Q

Define Nominal Data

A

The values of nominal data are categories. Example: marital status coded as Single = 1, Married = 2, Divorced = 3, Widowed = 4. Each observation usually fits into one classification category.

18
Q

Nominal data are also called _________ or _________.

A

Nominal data are also called qualitative or categorical.

19
Q

Interval data are also called _________ or ____________.

A

Interval data are also called quantitative or numerical.

20
Q

Define Ordinal Data

A

Ordinal data appear to be categorical in nature, but their values have an order (a ranking). Example: a college course rating system with poor = 1, fair = 2, good = 3, very good = 4, excellent = 5.

21
Q

______ _____ refers to quantities that have a natural ordering.

A

Ordinal data refers to quantities that have a natural ordering. With ordinal data you cannot state with certainty whether the intervals between values are equal. Example: Small, Medium, Large (small may not be the same distance from medium as medium is from large).

22
Q

Interval Data Summary

A

Interval Values are real numbers. All calculations are valid. Data may be treated as ordinal or nominal.

23
Q

Ordinal Data Summary

A

Ordinal Values must represent the ranked order of the data. Calculations based on an ordering process are valid. Data may be treated as nominal but not as interval.

24
Q

Nominal Data Summary

A

Nominal Values are the arbitrary numbers that represent categories. Only calculations based on the frequencies of occurrence are valid. Data may not be treated as ordinal or interval.

25
Q

The only allowable calculation on nominal data is to ______ ___ ________ of each value of the variable.

A

The only allowable calculation on nominal data is to count the frequency of each value of the variable.

26
Q

What does a relative frequency distribution do? (%)

A

A relative frequency distribution lists the categories and the proportion with which each occurs.

27
Q

What is a frequency distribution? (How frequently each category was chosen)

A

We can summarize the data in a table that presents the categories and their counts called a frequency distribution.
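
As a minimal sketch of the frequency and relative frequency distributions described on these cards, the Python below counts categories and converts the counts to proportions. The marital-status responses and the function names are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical nominal data: marital status of eight respondents.
responses = ["Single", "Married", "Married", "Divorced",
             "Single", "Married", "Widowed", "Married"]

def frequency_distribution(values):
    """Return {category: count of observations in that category}."""
    return dict(Counter(values))

def relative_frequency_distribution(values):
    """Return {category: proportion of all observations in that category}."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

freq = frequency_distribution(responses)
rel_freq = relative_frequency_distribution(responses)

# Print one row per category: name, count, proportion as a percentage.
for category in freq:
    print(f"{category:<10}{freq[category]:>3}{rel_freq[category]:>8.1%}")
```

Counting is the only calculation used here, which is why the same approach works for nominal data.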

28
Q

Bar Charts show ___________.

A

Bar Charts show frequencies.

29
Q

Pie Charts show __________.

A

Pie Charts show relative frequencies.

30
Q

Histograms and stem & leaf displays are used to graphically describe ________ ____.

A

Histograms and stem & leaf displays are used to graphically describe interval data.

31
Q

Define a Histogram

A

A histogram is a graphical display of data using bars of different heights. It is similar to a bar chart, but a histogram groups numbers into ranges. Histograms are well suited to illustrating the frequency of continuous data (bars with no gaps); if the data is categorical, use a bar chart (bars with gaps).

32
Q

Observations measured at successive points in time are called _________ data. _________ data are graphed on a line chart.

A

Observations measured at successive points in time are called time-series data. Time-series data are graphed on a line chart.

33
Q

What does a scatter diagram do?

A

A scatter diagram plots two variables against one another. It describes the relationship between two variables, i.e. how two interval variables are related.

34
Q

The independent variable is labeled ___ and is plotted on the __________ axis.

A

X; the horizontal axis

35
Q

The dependent variable is labeled ___ and is plotted on the __________ axis.

A

Y; the vertical axis

36
Q

Three patterns of scatter diagrams

A

positive linear relationship, negative linear relationship, weak or non-linear relationship

37
Q

What kind of data do you use histograms for?

A

Interval data

38
Q

Measures of central location

A

Mean, Median, Mode

39
Q

Measures of Variability

A

Range, Standard Deviation, Variance, Coefficient of Variation

40
Q

Measures of relative standing

A

Percentiles, Quartiles

41
Q

Measures of Linear Relationship

A

Covariance, Coefficient of Correlation, Coefficient of Determination, Least Squares Line

42
Q

Mean = ?

A

Mean = Sum of the Observations/Number of observations

43
Q

When referring to the number of observations in a population, we use ___________

A

When referring to the number of observations in a population, we use uppercase letter N

44
Q

When referring to the number of observations in a sample, we use __________

A

When referring to the number of observations in a sample, we use lower case letter n

45
Q

The arithmetic mean for a population is denoted with Greek letter “mu”:

A

The arithmetic mean for a population is denoted with Greek letter “mu”: μ

46
Q

The arithmetic mean for a sample is denoted with an “x-bar”:

A

The arithmetic mean for a sample is denoted with an “x-bar”: x̄

47
Q

Population mean Formula

A

Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$

48
Q

Sample Mean Formula

A

Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

49
Q

The _______ is calculated by placing all the observations in order; the observation that falls in the middle is the ________.

A

The median is calculated by placing all the observations in order; the observation that falls in the middle is the median.

50
Q

The ____ of a set of observations is the value that occurs most frequently. ____ is useful for all data types, though mainly used for nominal data.

A

The mode of a set of observations is the value that occurs most frequently. Mode is useful for all data types, though mainly used for nominal data.

51
Q

Compute the Mean to

A

Describe the central location of a single set of interval data

52
Q

Compute the Median to

A

Describe the central location of a single set of interval or ordinal data

53
Q

Compute the Mode to

A

Describe a single set of nominal data
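
The three measures of central location can be sketched in a few lines of Python. The observations are hypothetical, and the standard-library `statistics` module is used for the median and mode.

```python
import statistics

observations = [2, 4, 4, 5, 7, 9, 11]  # hypothetical interval data

# Mean: sum of the observations divided by the number of observations.
mean = sum(observations) / len(observations)

# Median: the middle observation once the data are placed in order.
median = statistics.median(observations)

# Mode: the value that occurs most frequently.
mode = statistics.mode(observations)

# The mode is the only one of the three that also works on nominal data.
colors = ["red", "blue", "red", "green"]  # hypothetical nominal data
most_common_color = statistics.mode(colors)

print(mean, median, mode, most_common_color)
```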

54
Q

The range is the simplest measure of ______, calculated as: Range = ?

A

The range is the simplest measure of variability, calculated as: Range = Largest observation – Smallest observation

55
Q

_______ and its related measure, _______ ________, are arguably the most important statistics. Used to measure variability, they also play a vital role in almost all statistical inference procedures.

A

Variance and its related measure, standard deviation, are arguably the most important statistics. Used to measure variability, they also play a vital role in almost all statistical inference procedures.

56
Q

Population variance is denoted by

A

Population variance is denoted by σ² (lowercase Greek letter “sigma” squared).

57
Q

Sample variance is denoted by

A

Sample variance is denoted by s² (lowercase “s” squared).

58
Q

The variance of a population is: EQUATION

A

The variance of a population is: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

59
Q

The Variance of a sample is: EQUATION

A

The variance of a sample is: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

60
Q

The _______ __________ is simply the square root of the __________

A

The standard deviation is simply the square root of the variance
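
A minimal sketch of the measures of variability, assuming the usual definitions: the population variance divides by N, the sample variance divides by n - 1, and the standard deviation is the square root of the variance. The data and function names are hypothetical.

```python
import math

observations = [2, 4, 4, 5, 7, 9, 11]  # hypothetical interval data

# Range: largest observation minus smallest observation.
data_range = max(observations) - min(observations)

def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the sample mean, dividing by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)

# Standard deviation: the square root of the matching variance.
sigma = math.sqrt(population_variance(observations))
s = math.sqrt(sample_variance(observations))

print(data_range, population_variance(observations), sample_variance(observations))
print(sigma, s)
```

The same list is treated once as a whole population and once as a sample, only to show how the two divisors change the result.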

61
Q

Population standard deviation looks like

A

Population standard deviation looks like

σ

62
Q

Sample standard deviation looks like:

A

Sample standard deviation looks like: s

63
Q

Empirical Rule, which states:

A

Approximately 68% of all observations fall within one standard deviation of the mean.
Approximately 95% of all observations fall within two standard deviations of the mean.
Approximately 99.7% of all observations fall within three standard deviations of the mean.
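
As a rough illustration, the Empirical Rule intervals can be computed from a mean and standard deviation. The values 100 and 15 are hypothetical, and the rule is only an approximation that is normally applied to bell-shaped data.

```python
def empirical_rule_intervals(mean, std_dev):
    """Map each approximate share of observations to its (lower, upper) interval."""
    shares = {1: 0.68, 2: 0.95, 3: 0.997}  # standard deviations -> approximate share
    return {share: (mean - k * std_dev, mean + k * std_dev)
            for k, share in shares.items()}

intervals = empirical_rule_intervals(mean=100, std_dev=15)

for share, (low, high) in intervals.items():
    print(f"about {share:.1%} of observations fall between {low} and {high}")
```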

64
Q

_______: the Pth percentile is the value for which P percent are less than that value and (100-P)% are greater than that value.

A

Percentile

65
Q

We have special names for the 25th, 50th, and 75th percentiles, namely __________.

A

quartiles

66
Q

The three quartiles are as follows:

A

The first or lower quartile is labeled Q1 = 25th percentile.

The second quartile, Q2 = 50th percentile (which is also the median).

The third or upper quartile, Q3 = 75th percentile.

67
Q

Location of Percentiles: EQUATION

A

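Location of Percentiles: $L_P = (n + 1)\frac{P}{100}$, where $L_P$ is the location of the Pth percentile in the ordered data.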

68
Q

Interquartile Range = ?

A

Interquartile Range = Q3 - Q1
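
The quartiles and interquartile range can be sketched as follows. This assumes the location convention $L_P = (n + 1)P/100$ with linear interpolation between neighbouring ordered observations; other textbooks and software use slightly different conventions, so the exact values may differ elsewhere. The data set is hypothetical.

```python
def percentile(values, p):
    """Pth percentile, assuming location L_P = (n + 1) * P / 100 in the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    location = (n + 1) * p / 100      # 1-based position, may be fractional
    lower = int(location)             # whole part of the position
    fraction = location - lower       # how far to move toward the next observation
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]  # hypothetical observations

q1 = percentile(data, 25)   # first (lower) quartile
q2 = percentile(data, 50)   # second quartile, also the median
q3 = percentile(data, 75)   # third (upper) quartile
iqr = q3 - q1               # interquartile range

print(q1, q2, q3, iqr)
```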

69
Q

Two numerical measures of linear relationship that provide information about the strength and direction of a linear relationship between two variables

A

They are the covariance and the coefficient of correlation.

70
Q

Population Covariance looks like

A

Population covariance: $\sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$

71
Q

Sample Covariance Looks like

A

Sample covariance: $s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

72
Q

When two variables move in the same direction (both increase or both decrease), the covariance will be a _____ _______ number.

A

When two variables move in the same direction (both increase or both decrease), the covariance will be a large positive number.

73
Q

When two variables move in opposite directions, the covariance is a ______ _______ number.

A

When two variables move in opposite directions, the covariance is a large negative number.

74
Q

When there is no particular pattern, the covariance is a ______ number.

A

When there is no particular pattern, the covariance is a small number.

75
Q

Define Coefficient of Correlation

A

The coefficient of correlation is defined as the covariance divided by the product of the standard deviations of the variables: $\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$

76
Q

Sample Coefficient of Correlation looks like:

A

Sample coefficient of correlation: $r = \frac{s_{xy}}{s_x s_y}$

77
Q

The coefficient of correlation is

A

The advantage of the coefficient of correlation over covariance is that it has fixed range from -1 to +1, thus:

If the two variables are very strongly positively related, the coefficient value is close to +1 (strong positive linear relationship).

If the two variables are very strongly negatively related, the coefficient value is close to -1 (strong negative linear relationship).

No straight line relationship is indicated by a coefficient close to zero.
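
A minimal sketch of the sample covariance and sample coefficient of correlation, assuming the usual n - 1 divisor for both the covariance and the standard deviations. The data and function names are hypothetical.

```python
import math

def sample_covariance(xs, ys):
    """Sum of products of paired deviations from the means, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_std_dev(values):
    """Square root of the sample variance."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def sample_correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return sample_covariance(xs, ys) / (sample_std_dev(xs) * sample_std_dev(ys))

x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # moves in the same direction as x
y_down = [10, 8, 6, 4, 2]   # moves in the opposite direction

print(sample_covariance(x, y_up), sample_correlation(x, y_up))
print(sample_covariance(x, y_down), sample_correlation(x, y_down))
```

The covariance changes if the units of x or y change, while the correlation stays within -1 to +1, which is the advantage the card describes.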

78
Q

Symbol Table:

A

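Symbol Table:

Measure | Population | Sample
Size | N | n
Mean | μ | x̄
Variance | σ² | s²
Standard deviation | σ | s
Covariance | σxy | sxy
Coefficient of correlation | ρ | r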

79
Q

A survey ……

A

A survey solicits information from people

80
Q

Key design principles of a survey:

A

Key design principles of a survey:

Keep the questionnaire as short as possible
Ask short, simple, and clearly worded questions
Start with demographic questions to help respondents get started comfortably
Use dichotomous (yes/no) and multiple choice questions
Use open-ended questions cautiously
Avoid using leading questions

81
Q

The ______ population and the ______ population should be similar to one another.

A

The sampled population and the target population should be similar to one another.

82
Q

A ______ ______ is a method or procedure for specifying how a sample will be taken from a population.

A

A sampling plan is a method or procedure for specifying how a sample will be taken from a population.

83
Q

3 common methods of sampling plans

A

Simple random sampling
Stratified random sampling
Cluster sampling

84
Q

Define Simple Random Sampling

A

A simple random sample is a sample selected in such a way that every possible sample of the same size is equally likely to be chosen.

Example:
 Drawing three names from a hat containing all the names of the students in the class is an example of a simple random sample (any group of three names is as equally likely as picking any other group of three names)
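
The names-from-a-hat example can be sketched with Python's standard `random` module. The student names and the seed are hypothetical; `Random.sample` draws without replacement, so any group of three names is as likely as any other.

```python
import random

# Hypothetical class list (the "hat").
students = ["Ana", "Ben", "Chloe", "Dev", "Eli", "Fay", "Gus", "Hana"]

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Draw three distinct names; every group of three is equally likely.
sample = rng.sample(students, k=3)

print(sample)
```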
85
Q

Define Stratified Random Sampling

A

A stratified random sample is obtained by separating the population into mutually exclusive sets, or strata, and then drawing simple random samples from each stratum.

Divide population into two or more subgroups (called strata) according to some common characteristic
A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes
Samples from subgroups are combined into one
This is a common technique when sampling population of voters, stratifying across racial or socio-economic lines
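
The steps above can be sketched as follows, assuming proportional allocation with simple rounding (so the combined sample may differ from the requested size by a unit or two when the proportions do not divide evenly). The strata names, sizes, and the function name are hypothetical.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """strata maps a stratum name to its list of members.

    Draws a simple random sample from each stratum, with sample sizes
    proportional to stratum sizes (rounded to whole numbers).
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(total_sample_size * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of 100 people split into three strata.
strata = {
    "urban": [f"u{i}" for i in range(60)],
    "suburban": [f"s{i}" for i in range(30)],
    "rural": [f"r{i}" for i in range(10)],
}

sample = stratified_sample(strata, total_sample_size=10, seed=1)

# Combine the per-stratum samples into one sample.
combined = [person for members in sample.values() for person in members]
print({name: len(members) for name, members in sample.items()}, len(combined))
```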

86
Q

Define Cluster Sampling

A

A cluster sample is a simple random sample of groups or clusters of elements (vs. a simple random sample of individual objects).

This method is useful when it is difficult or costly to develop a complete list of the population members or when the population elements are widely dispersed geographically
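
A minimal sketch of cluster sampling: whole clusters are chosen by simple random sampling, and every element of a chosen cluster is included. The city blocks, household labels, and function name are hypothetical.

```python
import random

def cluster_sample(clusters, number_of_clusters, seed=None):
    """clusters maps a cluster name to its list of members.

    Randomly selects whole clusters and keeps every member of each one.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical city blocks, each holding a few households.
blocks = {
    "block_a": ["a1", "a2", "a3"],
    "block_b": ["b1", "b2"],
    "block_c": ["c1", "c2", "c3", "c4"],
    "block_d": ["d1", "d2"],
}

selected = cluster_sample(blocks, number_of_clusters=2, seed=7)
print(selected)
```

Only a list of clusters is needed up front, not a list of every household, which is where the cost saving comes from.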

87
Q

Compare the sampling methods

A

Simple random sample
Simple to use
May not be a good representation of the population’s underlying characteristics

Stratified random sample
Ensures representation of individuals across the entire population

Cluster sample
More cost effective
Less efficient (need larger sample to acquire the same level of precision)

88
Q

The ______ the sample size is, the more accurate we can expect the sample estimates to be

A

The larger the sample size is, the more accurate we can expect the sample estimates to be

89
Q

Define Sampling Error

A

Sampling error refers to differences between the sample and the population that exist only because of the observations that happened to be selected for the sample.

Increasing the sample size will reduce this error
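
As a rough simulation of this idea, the sketch below draws repeated simple random samples from a hypothetical population and averages the gap between the sample mean and the population mean. The population, seed, sample sizes, and function name are all assumptions made for illustration.

```python
import random

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical population of 10,000 integer values.
population = [rng.randint(0, 100) for _ in range(10_000)]
mu = sum(population) / len(population)

def average_sampling_error(sample_size, repetitions=200):
    """Average absolute gap between the sample mean and the population mean."""
    total = 0.0
    for _ in range(repetitions):
        sample = rng.sample(population, sample_size)
        x_bar = sum(sample) / sample_size
        total += abs(x_bar - mu)
    return total / repetitions

for n in (10, 100, 1000):
    print(n, round(average_sampling_error(n), 2))
```

The printed error should tend to shrink as n grows, and it is zero when the sample is the whole population.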

90
Q

Define Nonsampling errors

A

Nonsampling errors are more serious and are due to mistakes made in the acquisition of data or due to the sample observations being selected improperly.

(Note: increasing the sample size will not reduce this type of error.)

91
Q

3 types of nonsampling errors:

A

Errors in data acquisition
Nonresponse errors
Selection bias

92
Q

Errors in data acquisition

A

…arises from the recording of incorrect responses

93
Q

Define Selection Bias

A

…occurs when the sampling plan is such that some members of the target population cannot possibly be selected for inclusion in the sample

94
Q

______ occurs when the sampling plan is such that some members of the target population cannot possibly be selected for inclusion in the sample

A

Selection Bias occurs when the sampling plan is such that some members of the target population cannot possibly be selected for inclusion in the sample