Review Slides Flashcards

1
Q

Statistics

A

A collection of methods for collecting, displaying, analyzing and drawing conclusions from data.

2
Q

Descriptive Statistics

A

the branch of statistics that involves organizing, displaying and describing data.

3
Q

Inferential Statistics

A

the branch of statistics that involves drawing conclusions about a population based on information contained in a sample taken from that population.

4
Q

Population

A

any specific collection of objects of interest.

5
Q

Sample

A

any subset of the population.

6
Q

Census

A

A sample that consists of the whole population.

7
Q

Measurement

A

a number or attribute computed for each member of a population or sample.

8
Q

Sample data

A

The collective measurements of sample elements.

9
Q

Parameter

A

number that summarizes some aspect of the population as a whole.

10
Q

Statistic

A

number computed from the sample data.

11
Q

Individuals

A

the individual units of observation (members of a population or a sample)

12
Q

variable

A

any characteristic of an individual.

13
Q

Distribution

A

tells us what values the variable takes, and how often it takes these values.

14
Q

Qualitative data/variables

A

measurements for which there is no natural numerical scale, but which consist of attributes or other non-numerical characteristics.

15
Q

Quantitative variables/data

A

measurements for which there is a numerical scale.

16
Q

Categorical Variable

A

codes whether each one in a set of observations is in a particular category.

17
Q

Nominal variable

A

assigns numerical labels to qualitative variables that represent different categories that cannot be ranked.

18
Q

Ordinal variable

A

assigns numerical labels to qualitative variables that represent different categories that can be ranked.

19
Q

data list

A

explicit listing of all the individual measurements made on a sample.

20
Q

data frequency table

A

table listing each distinct value (x) and its frequency.

21
Q

Frequency of a value x

A

is the number of times it appears in the data set.

22
Q

Sturges’ Rule

A

the desirable number of classes, k, is the closest integer to:

1 + 3.3 log10(n), where n is the sample size
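
As a rough illustration, Sturges' rule can be evaluated directly. This sketch assumes that log means the base-10 logarithm (which is consistent with the coefficient 3.3); the function name sturges_classes is hypothetical.

```python
import math

def sturges_classes(n: int) -> int:
    """Number of classes suggested by Sturges' rule for n measurements."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # 1 + 3.3 * log10(n), rounded to the closest integer
    return round(1 + 3.3 * math.log10(n))

# For n = 100: 1 + 3.3 * 2 = 7.6, so the closest integer is 8
print(sturges_classes(100))
```

The rule is only a guideline; a nearby number of classes is usually just as acceptable.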

23
Q

Sample size

A

The number of individuals in a sample

24
Q

Absolute class frequency (or class frequency)

A

the number of measurements in the data set that are in the class

25
Q

absolute frequency distribution

A

a tabular summary of a data set that shows the absolute class frequency for each class.

26
Q

Relative Frequency

A

proportion of all measurements in the data set that are in the class.

27
Q

Relative frequency distributions

A

tabular summary of a data set that shows the relative class frequency for each class.

28
Q

cumulative class frequency

A

the sum of all class frequencies up to and including the class in question
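
The absolute, relative and cumulative frequencies defined in the cards above can be computed together. This is a minimal sketch that treats each distinct value as its own class; the data and the function name frequency_table are made up for illustration.

```python
from collections import Counter
from itertools import accumulate

def frequency_table(data):
    """Rows of (value, frequency, relative frequency, cumulative frequency)."""
    counts = Counter(data)                  # absolute frequency of each value
    values = sorted(counts)
    n = len(data)                           # sample size
    freqs = [counts[v] for v in values]
    cumulative = list(accumulate(freqs))    # running total of the frequencies
    return [(v, f, f / n, c) for v, f, c in zip(values, freqs, cumulative)]

scores = [2, 3, 3, 5, 5, 5, 7, 7]           # hypothetical sample
for row in frequency_table(scores):
    print(row)
```

The relative frequencies always sum to 1, and the last cumulative frequency equals the sample size.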

29
Q

Frequency Histogram

A

Graphical device showing how data are distributed across the range of their values by collecting them into classes and indicating the number of measurements in each class.

30
Q

Relative Frequency histogram

A

Graphical device showing how data are distributed across the range of their values by collecting them into classes and indicating the proportion of measurements in each class.

31
Q

Internal Data

A

Data created as a by-product of regular activities

32
Q

External Data

A

Data that is created by entities other than the person, firm or government that wants to use the data.

33
Q

Survey

A

To survey is to ask individuals a question or a series of questions in order to gather information about what they do or what they believe.

34
Q

Nonprobability Sample

A

a sample taken from a population in a haphazard fashion, without the use of some randomizing device assigning each member a known probability of selection.

35
Q

probability sample

A

is a sample taken with the help of a randomizing device that assures each member a known probability of selection.

36
Q

random sample

A

is a sample obtained using a randomizing device that assures each member of the population has an equal chance of being in the sample.

37
Q

Nonresponse Bias

A

a systematic tendency for elementary units with particular characteristics not to contribute to data.

38
Q

response bias

A

the tendency for answers to survey questions to be systematically wrong.

39
Q

sample mean

A

x̄ (x bar) = ΣX / n, where n is the sample size

40
Q

Population mean

A

μ (mu) = ΣX / N, where N is the population size

41
Q

Sample median:
odd
even

A

Odd number of measurements: the middle measurement when the data are arranged in numerical order.
Even number of measurements: the mean of the middle two measurements when the data are arranged in numerical order.

42
Q

Sample mode

A

most frequently occurring value.
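
The three measures of center from the cards above (mean, median, mode) can be sketched as follows. The function names and the sample values are hypothetical; when several values tie for most frequent, this version simply reports the one that appears first in the data.

```python
from collections import Counter

def sample_mean(values):
    """Sum of the measurements divided by the sample size n."""
    return sum(values) / len(values)

def sample_median(values):
    """Middle measurement (odd n) or mean of the two middle ones (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def sample_mode(values):
    """Most frequently occurring value (first one found if there is a tie)."""
    return Counter(values).most_common(1)[0][0]

data = [4, 8, 6, 5, 3, 8]       # hypothetical sample, n = 6
print(sample_mean(data))        # 34 / 6
print(sample_median(data))      # sorted: 3 4 5 6 8 8, so (5 + 6) / 2 = 5.5
print(sample_mode(data))        # 8 appears twice
```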

43
Q

Range

A

R = Xn - X1 (the largest measurement minus the smallest, with the data sorted in ascending order)

44
Q

Skewness

A

a measure of the frequency distribution's deviation from symmetry.

45
Q

Kurtosis

A

a measure of the heavy-tailedness of a frequency distribution.

46
Q

Coefficient of variation is the ratio…

A

of the standard deviation to the mean
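
A short sketch of two of the spread measures above, the range and the coefficient of variation. It assumes the sample standard deviation (divisor n - 1); the cards do not say which version is intended, and the function names are hypothetical.

```python
import statistics

def value_range(values):
    """Largest measurement minus smallest measurement."""
    return max(values) - min(values)

def coefficient_of_variation(values):
    """Ratio of the (sample) standard deviation to the mean."""
    return statistics.stdev(values) / statistics.mean(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical sample, mean = 5
print(value_range(data))                 # 9 - 2 = 7
print(coefficient_of_variation(data))    # sqrt(32 / 7) / 5
```

Because it is a ratio, the coefficient of variation has no units, which makes it convenient for comparing the variability of data sets measured on different scales.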

47
Q

Sample Proportion (P)

A

the frequency of observations in a particular category as a fraction of the sample size.

48
Q

Given an observed value of X in a data set, X is the Pth percentile of the data if

A

the percentage of the data that are less than or equal to X is P.

49
Q

If X is the Pth percentile of the data, then the number P

A

is the percentile rank of X.
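
The percentile-rank definition above translates almost word for word into code. This is a minimal sketch with made-up exam scores and a hypothetical function name.

```python
def percentile_rank(data, x):
    """Percentage of the data that are less than or equal to x."""
    at_or_below = sum(1 for value in data if value <= x)
    return 100 * at_or_below / len(data)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]   # hypothetical scores
# 6 of the 10 scores are <= 80, so 80 is the 60th percentile
print(percentile_rank(scores, 80))
```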

50
Q

Quartiles

A

3 percentiles that cut the data into fourths.

51
Q

Q2
Q1
Q3?

A

Q2 is the median
Q1 is the 25th percentile (first quartile)
Q3 is the 75th percentile (third quartile)

52
Q

IQR

A

Q3-Q1

53
Q

Five number summary consists of

A

smallest, Q1, median (Q2), Q3, largest.
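
The quartiles, IQR and five-number summary can be sketched as below. Quartile conventions differ between textbooks and software; this version assumes the common "median of each half" method, in which Q1 and Q3 are the medians of the values below and above the median position. The names are hypothetical.

```python
import statistics

def five_number_summary(data):
    """Return (smallest, Q1, median, Q3, largest) for at least two values."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # values below the median position
    upper = ordered[half + n % 2:]      # values above the median position
    return (
        ordered[0],
        statistics.median(lower),       # Q1
        statistics.median(ordered),     # Q2, the median
        statistics.median(upper),       # Q3
        ordered[-1],
    )

data = [1, 3, 4, 6, 7, 8, 10, 12, 15]   # hypothetical sample, n = 9
smallest, q1, q2, q3, largest = five_number_summary(data)
print(smallest, q1, q2, q3, largest)
print("IQR =", q3 - q1)                 # Q3 - Q1
```

Other conventions (for example, interpolated percentiles) can give slightly different quartiles for the same data.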

54
Q

Outlier

A

a measurement that is far removed from most or all of the remaining measurements.

55
Q

Boxplot

A

Graphical summary of the distribution of the data based on the five-number summary.

56
Q

z-score

A

z = (X - μ) / σ, where μ is the mean and σ is the standard deviation.
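
A one-line sketch of the z-score formula above. The function name and the numbers are illustrative only.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations by which x lies above the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev

# A score of 85 when the mean is 70 and the standard deviation is 10
print(z_score(85, 70, 10))   # (85 - 70) / 10 = 1.5
```

A negative z-score means the value lies below the mean.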

57
Q

empirical rule says that if a data set

hint: 68, 95, 99.7

A

has an approximately bell-shaped frequency histogram, then approximately

  • 68% of the data lie within 1 standard deviation of the mean
  • 95% of the data lie within 2 standard deviations of the mean
  • 99.7% of the data lie within 3 standard deviations of the mean.
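
The three percentages in the empirical rule come from the normal (bell) curve. As a hedged check, the sketch below uses Python's statistics.NormalDist to compute the exact proportions for a perfectly normal distribution; real data that are only roughly bell-shaped will match these figures only approximately.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * proportion_within(k), 2), "%")
```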
58
Q

Chebyshev’s Theorem says that for any numerical data set

A

at least

  • 3/4 of the data lie within 2 standard deviations of the mean
  • 8/9 of the data lie within 3 standard deviations of the mean
  • 1 - 1/k^2 of the data lie within k standard deviations of the mean, where k is any positive whole number greater than 1.
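
Unlike the empirical rule, Chebyshev's bound holds for any data set, including skewed ones. The sketch below compares the guaranteed minimum 1 - 1/k^2 with the actual fraction in a deliberately skewed, made-up data set. It assumes the mean and the population standard deviation of the data set itself; the function names are hypothetical.

```python
import statistics

def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    return 1 - 1 / k**2

def fraction_within(data, k):
    """Actual fraction of data within k standard deviations of the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)        # population standard deviation
    inside = sum(1 for x in data if abs(x - mean) <= k * sd)
    return inside / len(data)

data = [1, 2, 2, 3, 3, 3, 4, 4, 20]     # hypothetical, with one extreme value
for k in (2, 3):
    print(k, chebyshev_bound(k), fraction_within(data, k))
```

The actual fraction is never below the bound, and for most data sets it is well above it.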

59
Q

Probability Theory

A

a branch of mathematics concerned with the analysis of random phenomena.

60
Q

Statistical Inference

A

a set of techniques to turn sample evidence into valid conclusions about populations of interest.

61
Q

Experiment

A

any repeatable process from which an outcome, measurement or result is obtained.

62
Q

Random Experiment

A

An experiment that produces a definite outcome that cannot be predicted with certainty

63
Q

Trial

A

one repetition of a random experiment

64
Q

sample space (outcome space) associated with a random experiment

A

is the set of all possible outcomes

65
Q

event

A

a subset of the sample space.

66
Q

an event E is said to occur on a particular trial of an experiment if

A

the outcome observed is an element of the set E.

67
Q

simple event

A

is any basic outcome from a random experiment

68
Q

composite event

A

any combination of 2 or more basic outcomes from a random experiment.

69
Q

probability of an outcome e in a sample space S is

A

the number p between 0 and 1 that measures the likelihood that e will occur on a single trial of the experiment.

70
Q

the probability of an event A (p(A)) is

A

the sum of the probabilities of the individual outcomes of which it is composed.

71
Q

Factorials

A

n! = n x (n-1) x (n-2) x … x 1

0! = 1

72
Q

A permutation of n different things taken x at a time is

A

an arrangement in a specific order of any x of the n things

nPx = n! / (n-x)!

73
Q

Combination of n things taken x at a time is

A

a selection of any x of these things without regard to order

nCx = n! / [(n-x)! x!]
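
The factorial, permutation and combination formulas above can be checked with a few lines of code. This sketch computes them from the factorial definitions and compares the results with math.perm and math.comb, which exist in Python 3.8 and later; the helper names are hypothetical.

```python
import math

def permutations_count(n, x):
    """nPx = n! / (n - x)!  (ordered arrangements of x out of n things)."""
    return math.factorial(n) // math.factorial(n - x)

def combinations_count(n, x):
    """nCx = n! / ((n - x)! * x!)  (selections where order is ignored)."""
    return math.factorial(n) // (math.factorial(n - x) * math.factorial(x))

# Choosing 2 of 5 things: 20 ordered arrangements, 10 unordered selections
print(permutations_count(5, 2), math.perm(5, 2))
print(combinations_count(5, 2), math.comb(5, 2))
```

Each combination of x things can be ordered in x! ways, which is why nPx = nCx times x!.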

74
Q

Intersection of events A and B is the collection

A

of all outcomes that are elements of both A and B

75
Q

Events A and B are mutually exclusive or disjoint if

A

they have no elements in common

76
Q

probability rule for mutually exclusive events is that events A and B are

A

mutually exclusive only if p(A ∩ B) = 0

77
Q

union of events A and B is the

A

collection of all outcomes that are elements of one or the other of the sets A and B or both of them

78
Q

the special addition law says that

A

for any 2 mutually exclusive events A and B,

p(A ∪ B) = p(A) + p(B)
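
As an illustration of the special addition law, the sketch below models one roll of a fair die (an assumed example, not from the cards). Events are Python sets of outcomes, and the probability of an event is the sum of the probabilities of its outcomes.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
# Assumption: a fair die, so every outcome has probability 1/6
outcome_probability = {outcome: Fraction(1, 6) for outcome in sample_space}

def p(event):
    """Probability of an event: the sum of its outcomes' probabilities."""
    return sum(outcome_probability[outcome] for outcome in event)

A = {1, 2}      # roll is 1 or 2
B = {5, 6}      # roll is 5 or 6

print(A.intersection(B))            # empty set: A and B are mutually exclusive
print(p(A.union(B)), p(A) + p(B))   # both equal 2/3
```

If A and B shared outcomes, p(A) + p(B) would count those outcomes twice, so the special law would no longer apply.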

79
Q

unconditional probability is the

A

likelihood that a particular event will occur regardless of whether another event occurs

80
Q

Joint probability p(A ∩ B) is the

A

likelihood that 2 or more events will simultaneously occur (jointly)

81
Q

Conditional probability of A given B is the

A

probability that A has occurred in a trial of a random experiment, given that B has also occurred.

82
Q

collectively exhaustive events

A

events whose union contains all the basic outcomes of the sample space

83
Q

partition

A

a set of events that are mutually exclusive and collectively exhaustive

84
Q

unconditional from joint rule

A

to obtain an unconditional probability from joint probabilities, we sum the joint probabilities over all possible events in a partition

85
Q

joint probability table shows

A

Frequencies or relative frequencies for joint events

86
Q

in the context of joint probability tables, we also refer to _____ as _______

A

unconditional probability as marginal probability

87
Q

General multiplication law

A

p(A ∩ B) = p(A) x p(B|A)

joint = unconditional x conditional
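
The relationship between joint, marginal (unconditional) and conditional probabilities can be traced through a small joint probability table. The table values below are invented for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint probabilities for the four combinations of A and B
joint = {
    ("A", "B"): Fraction(3, 20),
    ("A", "not B"): Fraction(5, 20),
    ("not A", "B"): Fraction(9, 20),
    ("not A", "not B"): Fraction(3, 20),
}

def marginal_of_first(a):
    """Unconditional probability: sum the joint probabilities over the partition of B."""
    return sum(prob for (first, _), prob in joint.items() if first == a)

def conditional_second_given_first(b, a):
    """p(b | a) = joint probability divided by the unconditional probability of a."""
    return joint[(a, b)] / marginal_of_first(a)

p_A = marginal_of_first("A")                                # 3/20 + 5/20 = 2/5
p_B_given_A = conditional_second_given_first("B", "A")      # (3/20) / (2/5) = 3/8
print(p_A, p_B_given_A, p_A * p_B_given_A)                  # the product is the joint, 3/20
```

Multiplying the unconditional probability by the conditional probability recovers the joint probability in the table, which is what the general multiplication law states.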

88
Q

events A and B are dependent when

A

the probability of occurrence of A is affected by the occurrence of B so that p(A) not equal to p(A|B)

89
Q

A and B are independent only if

A

p(A) = p(A|B)

90
Q

special multiplication law

A

for independent events A and B, the joint probability of A and B is the product of the unconditional probabilities of A and B: p(A ∩ B) = p(A) x p(B)
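
Independence and the special multiplication law can be checked by enumerating a small sample space. The example below assumes two fair dice (an illustration, not from the cards), with A = "first die is even" and B = "second die shows 6".

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die)
outcomes = list(product(range(1, 7), repeat=2))

def p(event):
    """Probability of an event given as a true/false test on an outcome."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

def first_is_even(outcome):
    return outcome[0] % 2 == 0

def second_is_six(outcome):
    return outcome[1] == 6

def both(outcome):
    return first_is_even(outcome) and second_is_six(outcome)

p_A, p_B, p_joint = p(first_is_even), p(second_is_six), p(both)
print(p_A, p_B, p_joint)            # 1/2, 1/6, 1/12
print(p_joint / p_B == p_A)         # p(A|B) equals p(A), so A and B are independent
```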

91
Q

posterior probability

A

is a revised probability of an event, obtained by updating a prior probability in light of new information

92
Q

RV (random variable)

A

a numerical quantity that is generated by a random experiment

93
Q

a set of possible values is countable if

A

all possible values can be listed one after the other

94
Q

Discrete RV

A

a RV that has either a finite or countable number of possible values

95
Q

continuous RV

A

a RV for which the possible values contain a whole interval of real numbers

96
Q

cumulative distribution function (CDF) of a RV X is

A

the probability that X is less than or equal to a particular value of x

97
Q

the probability distribution of a discrete RV X is a

A

list of each possible value of X together with the probability that X takes that value in one trial of the experiment

98
Q

a Bernoulli process with parameters p and n consists of a sequence of n identical trials of a random experiment such that each trial

A
  1. Produces one of two possible complementary outcomes: success, with probability p, or failure, with probability q = 1 - p
  2. Is independent of every other trial
99
Q

Binomial RV with parameters n and p

A

a discrete RV that counts the number of successes in a Bernoulli process with parameters n and p

100
Q

binomial probability distribution is a

A

list of each possible value of a binomial RV X together with the probability that X takes that value
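
A binomial probability distribution can be listed explicitly. The cards do not state the formula, so this sketch relies on the standard binomial formula p(X = x) = nCx * p^x * (1 - p)^(n - x); the function names are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_distribution(n, p):
    """Each possible value of X paired with its probability."""
    return {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# Number of heads in 3 tosses of a fair coin (n = 3, p = 0.5)
for x, prob in binomial_distribution(3, 0.5).items():
    print(x, prob)
```

The probabilities over all values 0, 1, …, n add up to 1, as they must for any probability distribution.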

101
Q

probability distribution of a continuous RV X is an assignment of probabilities to intervals of decimal numbers using a function f(x) called a ________ in the following way:

A

density function
the probability that X assumes a value in the interval [a, b] = the area under the curve of f(x), bounded below by the x-axis and bounded on the left and right by x = a and x = b

102
Q

normal distribution with mean μ and standard deviation σ is

A

the probability distribution corresponding to the density function for the bell curve with parameters μ and σ.

103
Q

normally distributed RV

A

a continuous RV whose probabilities are described by the normal distribution with mean μ and standard deviation σ

104
Q

standard normal RV is a

A

normally distributed RV with mean = 0 and st dev = 1
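
The idea that a continuous probability is an area under the density curve can be illustrated numerically. This sketch approximates the area under the standard normal density between -1 and 1 with a midpoint sum and compares it with the value from statistics.NormalDist; the mean of 70 and standard deviation of 10 in the second part are made-up values.

```python
from statistics import NormalDist

def area_under_density(dist, a, b, steps=10_000):
    """Approximate P(a <= X <= b) as the area under dist.pdf using a midpoint sum."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width

standard = NormalDist(mu=0, sigma=1)
approx = area_under_density(standard, -1, 1)
exact = standard.cdf(1) - standard.cdf(-1)
print(approx, exact)          # both are about 0.6827

# Standardizing: for a normal RV with mean 70 and standard deviation 10,
# P(60 <= X <= 80) matches the standard normal probability P(-1 <= Z <= 1)
scores = NormalDist(mu=70, sigma=10)
print(scores.cdf(80) - scores.cdf(60))
```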