Midterm Flashcards

1
Q

VARIABLES

A

VARIABLE ASPECTS OF REALITY

(In statistical research, a variable is defined as an attribute of an object of study.)

2
Q
VARIABLES CONSIST OF
A

VALUES

3
Q

σ

A

Sigma

4
Q

Sigma represents

A

population standard deviation

5
Q

Population standard deviation formula

A
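A minimal sketch of the usual population formula, σ = √(Σ(X – µ)² / N), written out step by step. The `scores` list is a hypothetical population chosen only for illustration, and `population_sd` is an illustrative helper name.

```python
import math

def population_sd(values):
    """Population standard deviation: sqrt(sum((x - mu)^2) / N)."""
    n = len(values)
    mu = sum(values) / n                      # population mean
    ss = sum((x - mu) ** 2 for x in values)   # sum of squared deviations
    return math.sqrt(ss / n)

scores = [4, 3, 6, 7]  # hypothetical population
sigma = population_sd(scores)
print(round(sigma, 3))
```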
6
Q

µ means

A

the population mean

7
Q

VARIABLE ASPECTS OF REALITY ARE CALLED

A

VARIABLES

8
Q

VARIABLES CONSIST OF

A

VALUES

9
Q

VALUES ARE TAKEN ON BY

A

OBSERVATIONS (SUBJECTS)

10
Q

VALUES ARE TAKEN ON BY OBSERVATIONS (SUBJECTS) IN

A

TIME AND IN SPACE

11
Q

WE MAY WANT TO DO TWO THINGS WITH VALUES OF OBSERVATIONS:

A
  1. WE MAY WANT TO KNOW IF THERE IS A PATTERN IN A LIMITED NUMBER OF VALUES AVAILABLE TO US “HERE AND NOW” (IS THERE A PATTERN OF SCORING BY A BASKETBALL TEAM OVER A SEASON?)
    • THIS GOAL CAN BE ACCOMPLISHED WITH A SET OF STATISTICAL PROCEDURES, CALLED DESCRIPTIVE STATISTICS.
  2. WE MAY ALSO WANT TO KNOW IF A PATTERN OBSERVED IN A LIMITED NUMBER OF OBSERVATIONS IS LIKELY TO HOLD WITH OTHER OBSERVATIONS UNDER SIMILAR CONDITIONS. (ARE OTHER TEAMS IN THE LEAGUE LIKELY TO DISPLAY A SCORING PATTERN OVER A SEASON SIMILAR TO THAT OF THE TEAM WE HAVE OBSERVED?)
    • THIS GOAL CAN BE ACCOMPLISHED WITH A SET OF STATISTICAL PROCEDURES, KNOWN AS INFERENTIAL STATISTICS.
12
Q

DESCRIPTIVE STATISTICS

A

is a means of describing the features of a data set by generating summaries of data samples. It is often presented as a summary that explains what the data contain.

13
Q

INFERENTIAL STATISTICS

A

describe the many ways in which statistics derived from observations on samples from study populations can be used to deduce whether or not those populations are truly different.

14
Q

OBSERVATIONS THAT WE OBSERVE “HERE AND NOW” MAKE UP A

A

SAMPLE

15
Q

TO DESCRIBE A SAMPLE WE USE

A

SAMPLE STATISTICS

16
Q

SAMPLE STATISTICS ARE REFERRED TO BY

A

LATIN LETTERS

17
Q

A SET OF ALL RELEVANT OBSERVATIONS FROM WHICH YOUR SAMPLE WAS TAKEN IS CALLED A

A

POPULATION

18
Q

TO DESCRIBE A POPULATION, WE USE

A

POPULATION PARAMETERS

19
Q

POPULATION PARAMETERS ARE REFERRED TO BY

A

GREEK LETTERS

20
Q

THE PROBLEM WITH A POPULATION IS THAT IT’S DIFFICULT TO OBSERVE. THEREFORE, WE USUALLY OBSERVE PATTERNS IN SAMPLES AND DECIDE IF THESE PATTERNS ARE LIKELY TO

A

HOLD IN POPULATIONS

21
Q

SAMPLES MUST BE

A

REPRESENTATIVE

22
Q

SAMPLES MUST BE REPRESENTATIVE:

A

THEY MUST REFLECT GENERAL COMPOSITION OF POPULATION

23
Q

SAMPLES MUST BE SELECTED VIA

A

RANDOM SAMPLING

24
Q

SAMPLES MUST BE SELECTED VIA RANDOM SAMPLING:

A

WHERE EACH OBSERVATION IN A POPULATION HAS IDENTICAL PROBABILITY OF BEING SELECTED INTO A SAMPLE.

25
Q

SAMPLING WITH/ WITHOUT REPLACEMENT

A

SAY WE HAVE 5 RED BALLS, 5 WHITE ONES, AND SELECT A SAMPLE OF 2 BALLS. PROBABILITY OF A 2ND BALL BEING RED DEPENDS ON THE COLOR OF THE 1ST BALL PICKED INTO THE SAMPLE. THIS VIOLATES EQUAL PROBABILITY PRINCIPLE FOR THE SECOND BALL. TO AVOID THE VIOLATION WE REPLACE THE 1ST BALL BEFORE PICKING THE 2ND ONE. REPLACEMENT IS NOT NECESSARY WITH LARGE POPULATIONS.
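The ball example can be checked with exact fractions. This is a small sketch under the card's setup (5 red, 5 white, sample of 2); `p_second_red` is an illustrative helper, not a standard function.

```python
from fractions import Fraction

def p_second_red(first_color, replace, red=5, white=5):
    """Probability that the 2nd ball is red, given the color of the 1st ball."""
    if not replace:
        # Without replacement, the 1st ball is no longer in the urn.
        if first_color == "red":
            red -= 1
        else:
            white -= 1
    return Fraction(red, red + white)

print(p_second_red("red", replace=False))    # 4/9
print(p_second_red("white", replace=False))  # 5/9
print(p_second_red("red", replace=True))     # 1/2
print(p_second_red("white", replace=True))   # 1/2
```

Without replacement the probability depends on the first draw; with replacement it stays at 1/2 either way.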

26
Q

A PERFECT FIT BETWEEN A SAMPLE AND A POPULATION DOES NOT EXIST. THERE’S ALWAYS A

A

SAMPLING ERROR

27
Q

A SAMPLING ERROR IS

A

THE DIFFERENCE BETWEEN A POPULATION PARAMETER AND A SAMPLE STATISTIC

28
Q

TWO TYPES OF SAMPLING ERRORS:

A

• THE RELATIVELY “HARMLESS” SAMPLING ERROR IS UNBIASED

• THE “HARMFUL” SAMPLING ERROR IS BIASED

29
Q

THE RELATIVELY “HARMLESS” SAMPLING ERROR IS UNBIASED:

A

OVER MULTIPLE SAMPLES SOME SAMPLE STATISTICS WILL BE GREATER AND SOME – SMALLER THAN POPULATION PARAMETER.

30
Q

THE “HARMFUL” SAMPLING ERROR IS BIASED:

A

OVER MULTIPLE SAMPLES, THE SAMPLE STATISTICS WILL ALL BE EITHER GREATER OR SMALLER THAN THE POPULATION PARAMETER.

31
Q

THE BIAS OF A SAMPLING ERROR CAN BE DETECTED BY

A

INVESTIGATING THE SAMPLING PROCEDURE (BECAUSE FULL POPULATIONS AND THEIR PARAMETERS ARE USUALLY UNOBSERVABLE). FOR A SAMPLING ERROR TO BE UNBIASED, THE SAMPLING PROCEDURE MUST ENSURE EQUAL PROBABILITY OF SELECTION FOR EACH OBSERVATION IN THE POPULATION.

32
Q

TYPES OF VARIABLES

A

NATURE VARIABLES
DISCRETE VARIABLES
CONTINUOUS VARIABLES

33
Q

NATURE VARIABLES CAN BE

A

DISCRETE OR CONTINUOUS

34
Q

DISCRETE VARIABLES HAVE MEASUREMENT UNITS THAT ARE

A

CLEARLY DEFINED WITH NO INTERIM VALUES FALLING BETWEEN TWO SMALLEST POSSIBLE UNITS

35
Q

DISCRETE VALUES ARE OFTEN USED TO

A

DENOTE QUALITIES (FEMALE / MALE = 1 / 2)

36
Q

DISCRETE VARIABLES USUALLY HAVE

A

RELATIVELY FEW VALUES (TYPES OF A MEDAL: GOLD, SILVER, BRONZE), BUT SOME CAN HAVE A LARGER NUMBER OF VALUES (THE NUMBER OF ONE-CENT COINS IN YOUR POCKET).

37
Q

CONTINUOUS VARIABLES DO NOT HAVE A

A

CLEARLY DEFINED SMALLEST VALUE. (TIME, TEMPERATURE, ETC.) VALUES COULD IN PRINCIPLE CONTINUE TO INFINITY IN BETWEEN ANY TWO GIVEN OBSERVATIONS.

38
Q

CONTINUOUS VARIABLES NEVER HAVE

A

THE SAME VALUE FOR ANY TWO OBSERVATIONS. NO TWO PEOPLE ARE EXACTLY 170 CM TALL. WE ONLY HAVE SAME-SOUNDING VALUES BECAUSE OUR MEASUREMENT DEVICES CANNOT PICK UP FINER SUB-UNITS.

39
Q

TO GIVE A VALUE OF A CONTINUOUS VARIABLE PRECISELY, YOU SHOULD

A

INDICATE ITS UPPER AND LOWER REAL LIMITS AT A DESIRED INTERVAL. LET’S SAY THAT A DESIRED INTERVAL IS 1 CM. A PERSON WITH A HEIGHT OF 170 CM IS THEN SAID TO BE BETWEEN LRL = 169.5 CM & URL = 170.5 CM
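The real limits are simply the reported value plus and minus half the measurement interval. A short sketch, where `real_limits` is an illustrative helper name:

```python
def real_limits(value, interval=1.0):
    """Return (lower real limit, upper real limit) for a reported value."""
    half = interval / 2
    return value - half, value + half

lrl, url = real_limits(170, 1)
print(lrl, url)  # 169.5 170.5
```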

40
Q

BY DEFINITION 170.5 IS THE URL OF AN INTERVAL

A

“170”

41
Q

SAME VARIABLE CAN BE MEASURED WITH DIFFERENT DEGREE OF PRECISION WITH DISTINCT

A

MEASUREMENT SCALES

42
Q

NOMINAL SCALE (NAMES) VALUES SIMPLY PERFORM

A

THE FUNCTION OF NAMES. NO MATHEMATICAL OPERATIONS CAN BE ACCOMPLISHED WITH NOMINAL SCALE. (NUMBERS ON BASKETBALL JERSEYS, RANDOMLY ASSIGNED TO PLAYERS)

43
Q

ORDINAL SCALE (RANKINGS). VALUES CAN BE USED TO

A

RANK OBSERVATIONS IN ORDER OF MAGNITUDE. (NUMBERS ON BASKETBALL JERSEYS, ASSIGNED ACCORDING TO HEIGHT). NO MATHEMATICAL OPERATIONS, EXCEPT FOR RANKING, CAN BE CONDUCTED ON THIS SCALE.

44
Q

NOMINAL AND ORDINAL SCALES CAN BE USED TO

A

MEASURE BOTH DISCRETE AND CONTINUOUS VARIABLES (WHILE BLOOD PRESSURE IS A CONTINUOUS VARIABLE, ITS MEASURE: SYSTOLIC OR DIASTOLIC IS A NOMINAL SCALE MEASURE, WHILE ANOTHER MEASURE: LOW, MEDIUM AND HIGH PRESSURE IS AN ORDINAL MEASURE).

45
Q

STILL YOU SHOULD AVOID MEASURING CONTINUOUS VARIABLES ON NOMINAL OR ORDINAL SCALE, BECAUSE

A

THIS WAY YOU LOSE PRECISION THAT CAN BE OBTAINED WITH MORE SOPHISTICATED SCALES.

46
Q

INTERVAL SCALE. ENABLES NOT ONLY RANKING, BUT

A

MEASURING A MEANINGFUL DIFFERENCE BETWEEN VALUES OF A VARIABLE.

47
Q

INTERVAL SCALE MEASUREMENTS DO NOT HAVE

A

AN ABSOLUTE ZERO (SOMETIMES KNOWN AS A TRUE ZERO), AT WHICH A VARIABLE CEASES TO EXIST.

48
Q

ALL MATHEMATICAL OPERATIONS CAN BE DONE WITH

A

VARIABLES MEASURED ON INTERVAL SCALE, EXCEPT FOR TAKING A RATIO (CANNOT DIVIDE). CONSIDER WAKING UP AT 4AM WHILE YOU NORMALLY WAKE UP AT 8 AM. DOES THAT MEAN THAT YOU WOKE UP TWICE AS EARLY? NO (BECAUSE TIME DID NOT START AT MIDNIGHT).

49
Q

RATIO SCALE. APPLIES TO

A

CONTINUOUS VARIABLES THAT HAVE AN ABSOLUTE ZERO. ALL MATHEMATICAL OPERATIONS POSSIBLE.

50
Q

INTERVAL AND RATIO SCALES USUALLY MEASURE

A

CONTINUOUS VARIABLES. YOU SHOULD USE THESE TWO SCALES TO MEASURE CONTINUOUS VARIABLES, INSTEAD OF NOMINAL OR ORDINAL SCALES, FOR THE RICHNESS OF INFORMATION THEY PROVIDE.

51
Q

VARIABLES ARE USUALLY REFERRED TO WITH

A

LATIN UPPER-CASE LETTERS (X, Y, Q…)

52
Q

VALUES ARE USUALLY REFERRED TO WITH

A

LATIN LOWER-CASE LETTERS WITH SUBSCRIPTS (x1, x2, x3 … xn).

53
Q

THE NUMBER OF OBSERVATIONS IN A POPULATION IS MARKED WITH

A

UPPER CASE N

54
Q

A SUM OF VALUES OF A PARTICULAR VARIABLE IS KNOWN BY

A

UPPER CASE GREEK LETTER SIGMA: Σ. SIGMA MUST ALWAYS BE FOLLOWED BY WHATEVER IS BEING ADDED.
a. LET’S SAY WE HAVE A SAMPLE OF n = 4 VALUES: 4, 3, 6, 7.
• Σ(X) = 20
• Σ(X – 1)² = 9 + 4 + 25 + 36 = 74
• (ΣX)² = 20² = 400.
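The three sums can be reproduced directly, which makes the order of operations explicit: subtract and square before adding for Σ(X – 1)², add before squaring for (ΣX)². The variable names are illustrative.

```python
X = [4, 3, 6, 7]  # the sample from the card

sum_x = sum(X)                          # Σ(X)
sum_sq_dev = sum((x - 1) ** 2 for x in X)  # Σ(X – 1)²: transform each value, then add
sq_of_sum = sum(X) ** 2                 # (ΣX)²: add first, then square

print(sum_x, sum_sq_dev, sq_of_sum)  # 20 74 400
```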

55
Q

FREQUENCY DISTRIBUTIONS

A

THE FIRST TOOL FOR DESCRIPTIVE STATISTICS

56
Q

FD SHOW

A

WHICH VALUES IN A VARIABLE OCCUR FREQUENTLY, AND WHICH ARE RARE

57
Q

USUALLY, FD ARE

A

GRAPHIC REPRESENTATIONS OF DATA, BUT THEY BEGIN WITH A FREQUENCY TABLE.

58
Q

FREQUENCY TABLES LIST VALUES OF A VARIABLE IN THE

A

LEFTMOST COLUMN. ALL POSSIBLE VALUES MUST BE LISTED.

59
Q

AN ADJACENT COLUMN CONTAINS

A

FREQUENCIES (f) OF EACH VALUE: NUMBERS OF OBSERVATIONS IN A SAMPLE THAT HAVE A PARTICULAR VALUE

60
Q

A FREQUENCY TABLE MAY CONTAIN

A

RELATIVE FREQUENCIES (rf, %): SHARES OF OBSERVATIONS (FROM THE TOTAL n) THAT HAVE A PARTICULAR VALUE.

61
Q

A FREQUENCY TABLE MAY CONTAIN CUMULATIVE FREQUENCIES (cf):

A

NUMBERS OF OBSERVATIONS THAT HAVE VALUES THAT ARE EQUAL TO OR LOWER THAN A GIVEN VALUE.

62
Q

A FREQUENCY TABLE MAY CONTAIN CUMULATIVE RELATIVE FREQUENCIES (crf, c%):

A

SHARES OF OBSERVATIONS THAT HAVE VALUES THAT ARE EQUAL TO OR LOWER THAN THE VALUE.

63
Q

CUMULATIVE RELATIVE FREQUENCY IS USEFUL FOR

A

SHOWING A RELATIVE STANDING OF AN OBSERVATION WITH A PARTICULAR VALUE VIS-À-VIS OTHER OBSERVATIONS.

64
Q

THE CONCEPT OF PERCENTILE RANK.

A

PERCENTILE RANK SHOWS RELATIVE STANDING OF AN OBSERVATION’S VALUE AMONG OTHER VALUES.

65
Q

PERCENTILE RANK SHOWS

A

THE PERCENT OF OBSERVATIONS WITH VALUES EQUAL TO OR LOWER THAN A GIVEN VALUE. A STUDENT EARNING A GRADE WITH PERCENTILE RANK OF 70 HAS DONE AS WELL AS OR BETTER THAN 70% OF OTHER STUDENTS.
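A frequency table with f, rf, cf and crf columns can be sketched in a few lines; the percentile rank of a value is then its crf expressed as a percent. The `grades` data are hypothetical and `frequency_table` is an illustrative helper.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, f, rf, cf, crf), sorted by value."""
    counts = Counter(values)
    n = len(values)
    rows, cf = [], 0
    for value in sorted(counts):
        f = counts[value]
        cf += f  # running total of observations at or below this value
        rows.append((value, f, f / n, cf, cf / n))
    return rows

grades = [5, 6, 6, 7, 7, 7, 8, 8, 9, 10]  # hypothetical grades
table = frequency_table(grades)
for row in table:
    print(row)

# Percentile rank = cumulative relative frequency as a percent.
percentile_rank = {value: 100 * crf for value, _, _, _, crf in table}
print(percentile_rank[8])
```

Note that this sketch lists only values that actually occur; a full table would also list possible values with f = 0.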

66
Q

FOR CONTINUOUS VARIABLES, OR DISCRETE ONES WITH MANY POSSIBLE VALUES, THE CONTENT OF THE LEFT COLUMN IN F.T. HAS TO BE

A

CLUSTERED INTO GROUPS OF EQUAL SIZE WITH APPROXIMATELY 8 – 10 SUCH GROUPS.

67
Q

INTERPOLATION:

A

MAKING AN EDUCATED GUESS ABOUT THE LIKELY CRF OF A VALUE IN THE MIDDLE OF AN INTERVAL.

68
Q

WHAT IF YOU HAVE A CONTINUOUS VARIABLE?

A

• USE A POLYGON
• USE A HISTOGRAM (SHOWN FOR A SEPARATE SET OF VALUES)

69
Q

FOR STARTERS: STEM AND LEAF DIAGRAM

A

AN ALTERNATIVE WAY OF VISUALIZING F.D. OF CONTINUOUS VARIABLES (JOHN TUKEY).

70
Q

Just understand this table:

A
71
Q

FREQUENCY DISTRIBUTIONS CAN HAVE A GREAT VARIETY OF SHAPES. LET’S LEARN SOME WORDS TO DESCRIBE THEM.

A
72
Q

THE KEY POINT OF DEPARTURE, TALKING ABOUT SHAPES IS THE CONCEPT OF

A

SYMMETRY

73
Q

ONE DEPARTURE FROM SYMMETRY IS

A

SKEWNESS

74
Q

SKEWNESS

A

A SKEW EXISTS, WHEN F.D. HAS A “TAIL” IN ONE DIRECTION OR ANOTHER FROM THE CENTER.

75
Q

A TAIL CONSISTS OF INFREQUENT VALUES ON THE SIDE OF A F.D., ALSO KNOWN AS

A

OUTLIERS

76
Q

A TAIL STRETCHING IN THE DIRECTION OF POSITIVE NUMBERS SHOWS A

A

POSITIVE SKEW

77
Q

A TAIL IN THE DIRECTION OF NEGATIVE NUMBERS

A

A NEGATIVE SKEW

78
Q

SKEWNESS CAN BE MEASURED:

A

WHEN SKEWNESS STATISTIC IS 0, WE HAVE PERFECT SYMMETRY. WHEN SKEWNESS IS > 0, WE HAVE A POSITIVE SKEW; WHEN SKEWNESS IS > 2, WE HAVE AN EXTREME POSITIVE SKEW. ANALOGOUS INTERPRETATION FOR THE NEGATIVE SKEW.

79
Q

ANOTHER ASPECT OF A SHAPE OF F.D. IS

A

KURTOSIS

80
Q

KURTOSIS:

A

KURTOSIS: THIS IS THE RELATIVE HEAVINESS (THICKNESS) OF THE TAILS. THICK TAILS SHOW THAT EXTREME VALUES ARE RATHER FREQUENT RELATIVE TO “CENTRAL” MORE COMMON VALUES.

81
Q

MEASUREMENT OF KURTOSIS:

A

IF KURTOSIS STATISTIC = 3, WE HAVE TAILS THAT ARE NOT TOO THICK, AND NOT TOO THIN. KURTOSIS > 3 SHOWS A DISTRIBUTION WITH FAT TAILS. KURTOSIS < 3 MEANS A DISTRIBUTION WITH THIN TAILS.
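One common way to compute skewness and kurtosis is from central moments. The sketch below uses the plain moment-based definitions (an assumption: statistical packages often apply small-sample corrections, and some report excess kurtosis, i.e. kurtosis minus 3). The data sets are hypothetical.

```python
def moment(values, k):
    """k-th central moment: average of (x - mean)^k."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n

def skewness(values):
    """Third central moment scaled by the variance to the power 1.5."""
    return moment(values, 3) / moment(values, 2) ** 1.5

def kurtosis(values):
    """Fourth central moment scaled by the squared variance (normal = 3)."""
    return moment(values, 4) / moment(values, 2) ** 2

symmetric = [1, 2, 3, 4, 5]      # hypothetical symmetric data
right_tail = [1, 1, 2, 2, 3, 10]  # hypothetical data with a tail toward large values

print(skewness(symmetric))       # 0.0 for perfectly symmetric data
print(skewness(right_tail) > 0)  # True: tail toward positive numbers
print(kurtosis(symmetric))
```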

82
Q

CENTRAL TENDENCY REPRESENTS VALUES (USUALLY A SINGLE VALUE) THAT IS

A

MOST COMMON IN A FREQUENCY DISTRIBUTION. CLEARLY C.T. IS NOT ALWAYS AT THE CENTER OF SAMPLE VALUES. THEREFORE WE HAVE SEVERAL ALTERNATIVE MEASURES OF C.T.

83
Q

THE MEAN AKA AVERAGE (MARKED µ FOR A POPULATION, M FOR A SAMPLE) –

A

THE MOST COMMON, AND, WHEN POSSIBLE, PREFERRED MEASURE OF C.T., BECAUSE IT TAKES INTO CONSIDERATION VALUES OF EACH OBSERVATION IN A F.D.

84
Q

POPULATION AND SAMPLE MEANS ARE GIVEN BY FOLLOWING FORMULA :

A
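The standard formulas, sketched in the notation used on other cards (X for values, N for population size, n for sample size):

```latex
\mu = \frac{\sum X}{N}
\qquad
M = \frac{\sum X}{n}
```

Both divide the sum of all values by the number of observations; only the symbols differ between population and sample.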
85
Q

MEAN IS

A

THE PREFERRED MEASURE OF C.T. FOR ANY BELL-SHAPED (I.E. MORE OR LESS SYMMETRIC) F.D. WITH SKEWED DISTRIBUTIONS, MEAN IS AN UNRELIABLE REPRESENTATION OF A CENTRAL TENDENCY, BECAUSE IT TENDS TO “MOVE” IN THE DIRECTION OF EXTREME VALUES IN A TAIL. LIKEWISE, IN A DISTRIBUTION WITH SEVERAL “PEAKS” A MEAN DOES NOT CONVEY INFORMATION ABOUT THE MOST COMMON VALUES.

86
Q

BASIC FEATURES OF A MEAN:

A

• IF YOU CHANGE ONE VALUE IN A SAMPLE, MEAN CHANGES.
• IF YOU ADD / REMOVE AN OBSERVATION TO / FROM A SAMPLE, MEAN CHANGES, UNLESS THAT OBSERVATION HAS THE VALUE OF THE MEAN.
• IF YOU MULTIPLY/DIVIDE ALL VALUES IN A SAMPLE BY A CONSTANT, THE MEAN WILL ALSO BE MULTIPLIED/DIVIDED BY THAT CONSTANT.
• IF YOU ADD / SUBTRACT A CONSTANT TO ALL VALUES IN A SAMPLE, YOU ADD / SUBTRACT THAT SAME CONSTANT TO THE MEAN.
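The four features can be verified on a small hypothetical sample (the numbers below are chosen only for illustration):

```python
from statistics import mean

sample = [4, 3, 6, 7]  # hypothetical sample
m = mean(sample)       # 5

changed = mean([4, 3, 6, 11])           # one value changed: mean moves to 6
added_at_mean = mean(sample + [5])      # new observation equals the mean: still 5
scaled = mean([x * 2 for x in sample])  # every value doubled: mean doubles to 10
shifted = mean([x + 10 for x in sample])  # 10 added to every value: mean becomes 15

print(m, changed, added_at_mean, scaled, shifted)
```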

87
Q

THE MEDIAN (MD):

A

IS A VALUE FROM A SAMPLE OR A POPULATION DIVIDING ALL OBSERVATIONS INTO TWO EQUAL HALVES.

88
Q

MEDIAN IS USEFUL UNDER FOLLOWING CONDITIONS:

A

• WITH SKEWED FREQUENCY DISTRIBUTIONS.
• WHEN A SAMPLE HAS VALUES THAT ARE INCOMPLETE (“DID NOT FINISH” IN A RACE)
• WHEN YOU HAVE OPEN ENDED DISTRIBUTIONS (“FIVE OR MORE” AS AN ANSWER TO A QUESTION HOW MANY PIZZAS DO YOU EAT IN A WEEK).

89
Q

THE MODE:

A

A MOST COMMON VALUE IN YOUR SAMPLE / POPULATION.

90
Q

A MODE IS USEFUL WITH:

A

• MULTIMODAL DISTRIBUTIONS (THE ONES WITH SEVERAL “PEAKS”)
• NOMINAL / ORDINAL VALUES
• WHEN YOU WANT TO USE A WHOLE NUMBER, AND NOT A FRACTION TO SHOW C.T.

91
Q

IN A SYMMETRIC UNIMODAL F.D. MEAN, MEDIAN AND MODE COINCIDE. IN A SKEWED F.D. MEAN MOVES

A

TOWARDS OUTLIERS, WHILE MEDIAN AND MODE STAY CLOSER TO COMMON VALUES. IN A MULTIMODAL F.D. MEAN AND MEDIAN TEND TOWARDS THE MIDDLE VALUES OF ALL OBSERVATIONS, WHILE MODES SHOW THE MOST FREQUENT ONES.

92
Q

VARIABILITY:

A

ARE OBSERVATIONS WITHIN A F.D. SIMILAR TO ONE ANOTHER OR NOT?

93
Q

VARIABILITY: ARE OBSERVATIONS WITHIN A F.D. SIMILAR TO ONE ANOTHER OR NOT?

A
  1. ANSWER TO THIS QUESTION IS CENTRAL TO ISSUES OF INFERENTIAL STATISTICS. (THE LOWER THE VARIABILITY IN A POPULATION, THE MORE REPRESENTATIVE A RANDOM SAMPLE FROM THAT POPULATION WILL BE, AND THE MORE LIKELY PATTERNS IN A SAMPLE WILL HOLD IN THE POPULATION).
  2. A BASIC MEASURE OF VARIABILITY IS THE RANGE: xMAX – xMIN. SADLY IT GIVES NO INFORMATION ABOUT VARIABILITY “INSIDE” THE RANGE. ARE VALUES DISTRIBUTED EVENLY BETWEEN xMAX AND xMIN OR ARE THEY CLUSTERED SOMEWHERE IN THE CENTER?
  3. STANDARD DEVIATION AND VARIANCE.
94
Q

STANDARD DEVIATION

A

IS AN AVERAGE DISTANCE OF ALL OBSERVATIONS FROM THE MEAN.

95
Q

A DEVIATION SCORE IS

A

A DISTANCE FROM AN INDIVIDUAL OBSERVATION TO THE MEAN.

96
Q

DIVIDING SS BY THE NUMBER OF OBSERVATIONS IN POPULATION GIVES A POPULATION

A

VARIANCE (σ² FOR A POPULATION, s² FOR A SAMPLE)

97
Q

VARIANCE

A
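A sketch of how variance is built from SS (the sum of squared deviation scores): divide by N for a population, by n – 1 for a sample. The `data` values are hypothetical.

```python
import math

def sum_of_squares(values):
    """SS: sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

data = [4, 3, 6, 7]  # hypothetical observations
ss = sum_of_squares(data)

population_variance = ss / len(data)    # sigma^2 = SS / N
sample_variance = ss / (len(data) - 1)  # s^2 = SS / (n - 1)

print(ss, population_variance, round(sample_variance, 3))
print(round(math.sqrt(population_variance), 3), round(math.sqrt(sample_variance), 3))
```

Taking the square root of either variance gives the corresponding standard deviation.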
98
Q

WHY MUST WE DIVIDE BY n – 1 FOR SAMPLE VARIANCE?

A

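BECAUSE SAMPLE VARIABILITY TENDS TO BE SMALLER THAN POPULATION VARIABILITY. THIS IS A BIASED SAMPLING ERROR: DEVIATIONS ARE MEASURED FROM THE SAMPLE MEAN, WHICH LIES CLOSER TO THE SAMPLE’S OWN VALUES THAN THE POPULATION MEAN DOES. TO CORRECT, WE MANUALLY DECREASE THE DENOMINATOR, WHICH INCREASES THE VARIANCE.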
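The effect can be seen exactly on a tiny hypothetical population by listing every possible sample of n = 2 (drawn with replacement) and averaging the two versions of the variance. This is an illustration under those assumptions, not a proof.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]  # hypothetical population
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance

def ss(sample):
    """Sum of squared deviations from the sample's own mean."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample)

n = 2
samples = list(product(population, repeat=n))  # all 9 ordered samples

avg_div_n = sum(ss(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(ss(s) / (n - 1) for s in samples) / len(samples)

print(sigma2)             # 2/3
print(avg_div_n)          # 1/3: dividing by n underestimates on average
print(avg_div_n_minus_1)  # 2/3: dividing by n - 1 matches the population variance
```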

99
Q

MATHEMATICAL PROPERTIES OF STANDARD DEVIATION:

A
  • ADDING / SUBTRACTING A CONSTANT TO EACH VALUE DOES NOT AFFECT STD.
  • MULTIPLYING / DIVIDING EACH VALUE BY A CONSTANT, STD ALSO GETS MULTIPLIED / DIVIDED BY THAT CONSTANT.
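Both properties can be checked on a hypothetical data set. One detail worth noting: with a negative constant, the STD is multiplied by the constant's absolute value, since a standard deviation cannot be negative.

```python
import math
from statistics import pstdev

data = [4, 3, 6, 7]  # hypothetical values
sd = pstdev(data)

shifted_sd = pstdev([x + 10 for x in data])  # adding a constant: unchanged
scaled_sd = pstdev([x * 3 for x in data])    # multiplying by 3: three times larger

print(math.isclose(shifted_sd, sd))     # True
print(math.isclose(scaled_sd, 3 * sd))  # True
```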
100
Q

THE CONCEPT OF DEGREES OF FREEDOM

A

THIS n – 1 IN THE DENOMINATOR OF SAMPLE VARIANCE AND STD CAN BE INTERPRETED AS DEGREES OF FREEDOM. D.F. SHOW HOW MANY OBSERVATIONS IN A SAMPLE WE ARE FREE TO VARY (INDEPENDENT OF OTHER VALUES AND STATISTICS).

101
Q

Look at these rocks (this table)

A
102
Q

WE CAN SAY THAT D.F. SHOW THE

A

EXTENT (THE SIZE) OF THE PROBLEM OF NON-INDEPENDENT SAMPLING. THE LARGER YOUR n, THE SMALLER THE PROBLEM (THE GREATER YOUR D.F.)

103
Q

ALTERNATIVELY, WE CAN THINK OF DEGREES OF FREEDOM AS AN ANSWER TO THE QUESTION: HOW FREE ARE YOU TO ESTIMATE VARIABILITY OF A POPULATION WELL?

A

WITH n = 1 YOU’RE NOT FREE AT ALL. WITH n = 2 YOU HAVE THE MINIMAL AMOUNT OF FREEDOM. WITH A LARGE n YOU ARE MORE FREE (MORE CONFIDENT) TO OBTAIN A GOOD MEASURE OF VARIABILITY IN THE POPULATION.

104
Q

IN MOST SYMMETRIC UNIMODAL F.D.S APPROXIMATELY

A

70% OF ALL OBSERVATIONS FALL WITHIN + / - ONE STD AROUND THE MEAN. AND APPROXIMATELY 95% OF ALL OBSERVATIONS FALL WITHIN + / - TWO STD AROUND THE MEAN. OBSERVATIONS THAT ARE REMOVED FROM THE MEAN BY MORE THAN TWO STD ARE CONSIDERED OUTLIERS.

105
Q

IN MOST SYMMETRIC UNIMODAL F.D.S APPROXIMATELY 70% OF ALL OBSERVATIONS FALL WITHIN + / - ONE STD AROUND THE MEAN. AND APPROXIMATELY 95% OF ALL OBSERVATIONS FALL WITHIN + / - TWO STD AROUND THE MEAN. OBSERVATIONS THAT ARE REMOVED FROM THE MEAN BY MORE THAN TWO STD ARE CONSIDERED OUTLIERS.

A
106
Q

IF YOUR EXAM GRADE, COMPARED TO YOUR CLASSMATES IS ONE STD ABOVE AVERAGE, THEN YOU DID BETTER THAN

A

50% + (70% / 2) = 85% OF YOUR FRIENDS.

107
Q

SEVERAL WAYS TO DETERMINE LOCATION OF AN OBSERVATION IN A F.D.

A

a. INTERPOLATION
b. TUKEY’S S&L DIAGRAM
c. S.T.D.

108
Q

AN EXAMPLE, USING STD:

A

SUPPOSE A THREE-SEASON SCORING AVERAGE FOR A BASKETBALL TEAM IS 85 POINTS WITH s = 13. HOW DOES A SCORE OF 72 POINTS COMPARE TO OTHER SCORES BY THE TEAM? ASSUME THAT SCORES ARE DISTRIBUTED IN A SYMMETRIC BELL-SHAPED FORM.
FIND OUT HOW MANY STANDARD DEVIATIONS THE SCORE DIFFERS FROM THE TEAM AVERAGE.
(72 – 85) / 13 = –1
LOCATE –1 STANDARD DEVIATIONS ON THE BELL-SHAPED F.D.

CALCULATE THE SHARE OF OBSERVATIONS THAT HAVE VALUES LOWER THAN 1 STANDARD DEVIATION BELOW THE MEAN.
50% – (70% / 2) = 15% OF ALL SCORES FALL BETWEEN THE MINIMUM AND 72 POINTS.
IN OTHER WORDS, BY SCORING 72 POINTS THE TEAM PLAYED A GAME THAT IS BETTER THAN 15% AND WORSE THAN 85% OF ITS GAMES.
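The same steps can be sketched in code. Here the share below the score is read from an exact normal curve (an assumption that goes slightly beyond the card's 70% rule of thumb), so the result is about 15.9% rather than the rounded 15%.

```python
from statistics import NormalDist

mean_score, s = 85, 13  # values from the example
score = 72

z = (score - mean_score) / s       # distance from the mean in standard deviations
share_below = NormalDist().cdf(z)  # share of a standard normal curve below z

print(z)                      # -1.0
print(round(share_below, 4))  # 0.1587
```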
109
Q

A DISTANCE BETWEEN VALUE 72 AND THE MEAN IN TERMS OF STANDARD DEVIATIONS. SUCH A DISTANCE IS CALLED A

A

z SCORE

110
Q

WE CAN CALCULATE z SCORES FOR ALL VALUES IN A POPULATION USING THIS FORMULA:

A
111
Q

IF WE z-STANDARDIZED AN ENTIRE POPULATION, IT WOULD HAVE FOLLOWING FEATURES:

A

a. SAME SHAPE AS ORIGINAL UNSTANDARDIZED POPULATION.
b. A MEAN, EQUAL TO ZERO.
c. A STD EQUAL TO ONE.
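A quick check of features b and c, applying z = (X – µ) / σ to every value of a small hypothetical population:

```python
from statistics import mean, pstdev

population = [4, 3, 6, 7]  # hypothetical population
mu, sigma = mean(population), pstdev(population)

z_scores = [(x - mu) / sigma for x in population]

# Up to floating-point rounding, the standardized values have mean 0 and STD 1.
print(round(mean(z_scores), 10))
print(round(pstdev(z_scores), 10))
```

Because every value is shifted and rescaled by the same constants, the ordering and relative spacing of the observations (the shape) are unchanged.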

112
Q

PROBABILITY OF SAMPLING (I.E. OBTAINING) AN OBSERVATION WITH A PARTICULAR VALUE FROM A POPULATION IS EQUAL TO

A

THE SHARE OF OBSERVATIONS WITH THAT VALUE IN THE POPULATION.

113
Q

Probability of a value=

A

(number of observations with that value)÷
(total number of observations)

114
Q

CALCULATE PROBABILITIES OF RANDOMLY SELECTING OBSERVATIONS FROM A CERTAIN REGION OF A F.D.

A
115
Q

A NORMAL F.D. IS A SYMMETRIC UNIMODAL F.D. WITH CERTAIN SHARES OF OBSERVATIONS IN ITS REGIONS.

A
116
Q

CONNECTING NORMAL F.D. WITH PROBABILITY. SUPPOSE THAT WINTER TEMPERATURES ARE NORMALLY DISTRIBUTED. ASSUME AN AVERAGE WINTER TEMPERATURE µ = –3C AND σ = 8C. GIVEN THIS INFORMATION, WHAT IS A PROBABILITY OF OBSERVING A WINTER DAY COLDER THAN MINUS 19C?

A
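One way to work this out: convert –19C to a z score, then find the share of the normal curve below it. The sketch uses Python's `statistics.NormalDist` in place of a printed unit normal table.

```python
from statistics import NormalDist

mu, sigma = -3, 8  # from the question
threshold = -19

z = (threshold - mu) / sigma                     # (-19 - (-3)) / 8
p_colder = NormalDist(mu, sigma).cdf(threshold)  # share of days below -19C

print(z)                   # -2.0
print(round(p_colder, 4))  # 0.0228
```

So roughly 2.3% of winter days would be colder than –19C under these assumptions.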
117
Q

UNIT NORMAL TABLE

A

a. LEFT COLUMN OF U.N.T. CONSISTS OF SINGLE DIGITS AND A FIRST DECIMAL OF A z SCORE
b. A TOP ROW OF U.N.T. CONTAINS THE SECOND DECIMAL OF A z SCORE.
c. A CELL AT THE INTERSECTION OF A COLUMN AND A ROW GIVES PROBABILITY OF SELECTING AN OBSERVATION FROM A SHADED AREA UNDER A NORMAL F.D. (SAME AS THE SHARE OF OBSERVATIONS IN THAT SHADED AREA.)
d. U.N.T. EXPRESSES PROBABILITIES (SHARES) AS PROPORTIONS, NOT PERCENTAGES

118
Q

LET’S SAY THAT WE HAVE A NORMAL F.D. WITH µ = 10 AND σ = 2. WHAT IS THE PROBABILITY OF OBTAINING AN OBSERVATION THAT IS GREATER THAN 7?

A
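A sketch of the calculation: the value 7 has z = (7 – 10) / 2 = –1.5, and the probability of a larger value is the area above that z score.

```python
from statistics import NormalDist

dist = NormalDist(mu=10, sigma=2)  # the distribution from the question

z = (7 - 10) / 2
p_greater = 1 - dist.cdf(7)  # area to the right of 7

print(z)                    # -1.5
print(round(p_greater, 4))  # 0.9332
```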
119
Q

ASSUME THE SAME NORMAL F.D. AS ABOVE. NOW YOU WANT TO KNOW PROBABILITY OF RANDOMLY SELECTING AN OBSERVATION WITH VALUE THAT FALLS BETWEEN 8 AND 13.

A
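For a region between two values, subtract the area below the lower value from the area below the upper one. With µ = 10 and σ = 2 (as stated on the previous card), 8 and 13 correspond to z = –1 and z = 1.5.

```python
from statistics import NormalDist

dist = NormalDist(mu=10, sigma=2)

z_low = (8 - 10) / 2    # -1.0
z_high = (13 - 10) / 2  # 1.5
p_between = dist.cdf(13) - dist.cdf(8)

print(z_low, z_high)
print(round(p_between, 4))  # 0.7745
```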
120
Q

CONSIDER A FOLLOWING EXERCISE
A. OBTAIN A LARGE NUMBER OF SAMPLES FROM A POPULATION (SAME n).
B. FOR EACH SAMPLE, CALCULATE M.
C. ARRANGE THESE MEANS INTO A F.D. FROM MMIN TO MMAX.
D. WHAT SHAPE, µ AND σ WILL F.D. OF THIS NEW VARIABLE HAVE?

A
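The exercise can be tried on a small hypothetical population by listing every possible sample of n = 2 (with replacement) instead of drawing many random ones. In this sketch the mean of the sample means equals µ, and their standard deviation equals σ / √n; the shape approaches a normal curve as n grows.

```python
import math
from itertools import product
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5, 6]  # hypothetical population (one die)
mu, sigma = mean(population), pstdev(population)

n = 2
# M for every possible ordered sample of size n.
sample_means = [mean(s) for s in product(population, repeat=n)]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)

print(mean_of_means, mu)
print(round(sd_of_means, 4), round(sigma / math.sqrt(n), 4))
```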
121
Q

HYPOTHESIS TESTS ARE A PART OF

A

SCIENTIFIC METHODOLOGY

122
Q

HYPOTHESIS IS A STATEMENT IN THE FORM:

A

X (INDEPENDENT VARIABLE) CAUSES Y (DEPENDENT VARIABLE).

123
Q

HYPOTHESIS TEST VERIFIES IF

A

THIS PROPOSED RELATIONSHIP IS LIKELY TO HOLD IN REALITY

124
Q

INFERENTIAL STATISTICS IS A KEY COMPONENT OF A HYPOTHESIS TEST AS IT ALLOWS TO

A

DETERMINE IF PATTERNS OF DEPENDENT VARIABLE (Y) AFTER EXPOSURE TO AN INDEPENDENT VARIABLE (X) IN A LIMITED AMOUNT OF DATA ARE LIKELY TO BE SEEN IN A LARGE POPULATION.

125
Q

HYPOTHESIS TESTS CAN FOLLOW MANY DIFFERENT RESEARCH STRATEGIES OR DESIGNS. WE BEGIN STUDY OF H.T. WITH THE SIMPLEST POSSIBLE DESIGN:

A

ONE SAMPLE HYPOTHESIS TEST

126
Q

ONE SAMPLE HYPOTHESIS TEST

A

A. SUPPOSE WE WANT TO KNOW IF EXPOSURE TO SOME VALUE OF VARIABLE X CHANGES VALUES OF VARIABLE Y.
B. SAY X IS AN EXPERIMENTAL MEDICINE TO REDUCE BLOOD PRESSURE, AND Y IS BLOOD PRESSURE.
C. WE KNOW THE MEAN AND THE STD OF VARIABLE Y FOR A POPULATION.
D. WE SELECT A SAMPLE FROM THAT POPULATION AND EXPOSE IT TO VARIABLE X (MEDICATION).
• NOTE THAT AFTER EXPOSURE TO X, THE SAMPLE NO LONGER REPRESENTS THE ORIGINAL POPULATION FROM WHICH IT CAME, BUT RATHER IT REPRESENTS A NEW POPULATION THAT WOULD EXIST IF ALL ORIGINAL POPULATION WERE EXPOSED TO X.
E. WE OBTAIN A MEAN FOR THE SAMPLE, MY-NEW. SUPPOSE THAT MY-NEW < µY-OLD. HERE MY-NEW IS THE OBSERVED SAMPLE MEAN REPRESENTING THE IMAGINARY POPULATION WHERE EVERYONE HAS TAKEN THE MEDICINE, AND µY-OLD IS THE EXPECTED VALUE OF A SAMPLING DISTRIBUTION OF MEANS THAT COULD BE TAKEN FROM THE ORIGINAL POPULATION UNEXPOSED TO X (MEDICATION).
F. ULTIMATELY WE WANT TO KNOW IF THE DIFFERENCE BETWEEN MY-NEW AND µY-OLD IS SUFFICIENTLY LARGE THAT WE CAN CONCLUDE THAT MEDICATION (AND NOT A SAMPLING ERROR) WAS BEHIND THE REDUCTION IN BLOOD PRESSURE IN THE SAMPLE.

127
Q

A STATISTICAL HYPOTHESIS TEST PROCEEDS THROUGH

A

FIVE STEPS

128
Q

A STATISTICAL HYPOTHESIS TEST PROCEEDS THROUGH FIVE STEPS:

A

A. STATE NULL AND ALTERNATIVE HYPOTHESES
B. CHOOSE A CRITICAL LEVEL (AKA ALPHA LEVEL)
C. OBTAIN CRITICAL VALUES OF z.
D. CALCULATE TEST VALUE OF z
E. COMPARE

129
Q

A. STATE NULL AND ALTERNATIVE HYPOTHESES.

A

• A NULL HYPOTHESIS STATES THAT THERE IS NO RELATIONSHIP BETWEEN X AND Y AS PROPOSED BY THE RESEARCHER.
• STATISTICALLY, THIS MEANS THAT µY-NEW IS EQUAL TO µY-OLD.
• H0 CAN BE DIRECTIONAL (ONE-TAILED) OR NON-DIRECTIONAL (TWO-TAILED):
- NON-DIRECTIONAL H0: µY-NEW = µY-OLD
- DIRECTIONAL H0: µY-NEW ≥ µY-OLD OR µY-NEW ≤ µY-OLD
- THERE MUST ALWAYS BE AN “EQUAL” SIGN IN A NULL HYPOTHESIS.
• ALTERNATIVE HYPOTHESIS STATES THAT THERE IN FACT IS A RELATIONSHIP BETWEEN X AND Y AS PROPOSED BY THE RESEARCHER.
• STATISTICALLY, THIS MEANS THAT µY-NEW IS NOT EQUAL TO µY-OLD.
• LIKE H0, HA CAN BE DIRECTIONAL (ONE-TAILED) OR NON-DIRECTIONAL (TWO-TAILED):
- NON-DIRECTIONAL HA: µY-NEW ≠ µY-OLD
- DIRECTIONAL HA: µY-NEW < µY-OLD OR µY-NEW > µY-OLD
- THERE IS NEVER AN “EQUAL” SIGN IN AN ALTERNATIVE HYPOTHESIS.

130
Q

H0, HA CAN BE

A

DIRECTIONAL (ONE-TAILED) OR NON-DIRECTIONAL (TWO-TAILED)

131
Q

B. CHOOSE A CRITICAL LEVEL (AKA ALPHA LEVEL)

A

THIS α LEVEL DETERMINES HOW FAR MY-NEW HAS TO BE REMOVED FROM µY-OLD, TO COUNT AS EVIDENCE AGAINST H0 AND IN FAVOR OF HA.
• IN THE CHART BELOW, SHADED AREAS REPRESENT ALPHA LEVEL.
• SPECIFICALLY α IS THE SHARE OF SAMPLE MEANS THAT ARE SO FAR REMOVED FROM THE EXPECTED VALUE (µY-OLD) AS TO BE CONSIDERED EVIDENCE FOR REJECTING H0.
• A SAMPLE MEAN REPRESENTED BY THE ORANGE LINE IN CHART BELOW WOULD COUNT AS EVIDENCE THAT µY-NEW ≠ µY-OLD, WHILE A SAMPLE MEAN REPRESENTED BY A BLUE LINE WOULD INDICATE THAT µY-NEW = µY-OLD.
• TYPICALLY ALPHA LEVELS ARE 0.01, 0.05 OR (IN SOME SOCIAL SCIENCES) 0.1.
• NOTE THAT AN ALPHA LEVEL NEEDS TO BE SPLIT IN HALF (AND “PLACED” ON EACH TAIL) FOR A NON-DIRECTIONAL H.T., WHILE THE ENTIRE CRITICAL AREA MUST BE CONCENTRATED IN THE TAIL GIVEN BY THE ALTERNATIVE HYPOTHESIS FOR A DIRECTIONAL H.T.

132
Q

C. OBTAIN CRITICAL VALUES OF z

A

THESE ARE z SCORES MARKING BOUNDARIES BETWEEN CRITICAL AREA(S) IN THE TAIL(S) AND THE REMAINING BODY OF THE SAMPLING DISTRIBUTION.

133
Q

D. CALCULATE TEST VALUE OF z

A

ASSOCIATED WITH SAMPLE MY-NEW

134
Q

E. COMPARE

A

TEST z AGAINST CRITICAL z. IF |TEST z| > |CRITICAL z|, REJECT H0

135
Q

A SAMPLE PROBLEM.

A

A HISTORIC, AVERAGE GRADE IN A COURSE, EARNED BY LOCAL STUDENTS IS µ = 7, WITH σ = 2. WE SELECT A RANDOM SAMPLE OF n = 9 ERASMUS STUDENTS, AND FIND OUT THAT THEIR M = 6. DOES THIS MEAN, THAT ERASMUS STUDENTS EARN ON AVERAGE DIFFERENT GRADES THAN LOCAL STUDENTS?

136
Q

A HISTORIC, AVERAGE GRADE IN A COURSE, EARNED BY LOCAL STUDENTS IS µ = 7, WITH σ = 2. WE SELECT A RANDOM SAMPLE OF n = 9 ERASMUS STUDENTS, AND FIND OUT THAT THEIR M = 6. DOES THIS MEAN, THAT ERASMUS STUDENTS EARN ON AVERAGE DIFFERENT GRADES THAN LOCAL STUDENTS?

A
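A sketch of the five steps for this problem as a non-directional (two-tailed) z test. The alpha level of 0.05 is an assumption, since the card does not state one; the critical z comes from `statistics.NormalDist` rather than a printed unit normal table.

```python
import math
from statistics import NormalDist

mu, sigma = 7, 2  # local students (population)
n, M = 9, 6       # Erasmus sample
alpha = 0.05      # assumed alpha level (not given on the card)

# A. H0: mu_erasmus = 7, HA: mu_erasmus != 7 (two-tailed)
# C. Critical z: alpha is split in half between the two tails.
critical_z = NormalDist().inv_cdf(1 - alpha / 2)

# D. Test z: distance of M from mu in standard errors.
standard_error = sigma / math.sqrt(n)
test_z = (M - mu) / standard_error

# E. Compare.
reject_h0 = abs(test_z) > critical_z

print(round(test_z, 2), round(critical_z, 2), reject_h0)
```

With these numbers |test z| = 1.5 is smaller than the critical 1.96, so H0 is not rejected at the assumed alpha level.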
137
Q

ASKING IF ERASMUS STUDENTS EARN LOWER GRADES THAN LOCAL STUDENTS?

A
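Asking about "lower" grades makes the test directional (one-tailed): the whole critical area sits in the lower tail. A sketch using the numbers from the sample problem (µ = 7, σ = 2, n = 9, M = 6) and an assumed alpha level of 0.05, which the card does not state:

```python
import math
from statistics import NormalDist

mu, sigma = 7, 2
n, M = 9, 6
alpha = 0.05  # assumed alpha level (not given on the card)

# H0: mu_erasmus >= 7, HA: mu_erasmus < 7 (one-tailed, lower tail)
critical_z = NormalDist().inv_cdf(alpha)  # entire alpha in the lower tail
test_z = (M - mu) / (sigma / math.sqrt(n))

reject_h0 = test_z < critical_z

print(round(test_z, 2), round(critical_z, 3), reject_h0)
```

Here test z = –1.5 does not reach the critical value of about –1.645, so H0 is again not rejected under this assumption.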
138
Q

A HYPOTHESIS TEST CAN COMMIT TWO TYPES OF ERRORS

A

TYPE ONE ERROR: REJECTING A CORRECT H0 (STATING AN EFFECT, WHEN THERE IS NONE).
• A RATHER DANGEROUS ERROR (PRESCRIBING MEDICINE WHEN IT DOESN’T WORK).
• T1E IS DIRECTLY RELATED TO THE SIZE OF YOUR ALPHA LEVEL. THE LARGER THE CRITICAL AREA, THE MORE LIKELY WILL OUR SAMPLE M “JUMP” INTO IT. EVEN IF ANY DIFFERENCE BETWEEN M AND µ IS DUE ONLY TO SAMPLING ERROR.
• FOR THIS REASON, CHOOSE SMALLER OF THE TWO ELIGIBLE CRITICAL AREAS IN UNT.
TYPE TWO ERROR (T2E): FAILING TO REJECT A WRONG H0.
• T2E IS RELATED TO:
- A SMALL EFFECT SIZE OF X ON Y.
- A SMALL n.
- THERE IS NOT MUCH A RESEARCHER CAN DO ABOUT T2E, EXCEPT FOR INCREASING n.

139
Q

WHAT DETERMINES THE LIKELIHOOD OF REJECTING H0?

A

A. EFFECT SIZE: THE MORE DIFFERENT IS MY-NEW FROM µY-OLD THE MORE LIKELY MY-NEW IS TO GET INTO CRITICAL AREA.
B. SAMPLE SIZE: THE LARGER n, THE SMALLER THE STANDARD ERROR, THE LARGER THE TEST z SCORE, AND THE MORE LIKELY THAT z SCORE IS TO GET INTO THE CRITICAL AREA.
C. ALPHA LEVEL (NOT TO BE INCREASED FOR THE PURPOSE OF REJECTING H0)

140
Q

Mean (μ) =

A

Σxi / N

141
Q

Σ is

A

the summation (addition) sign

142
Q

xi is

A

each individual number

143
Q

N is

A

the population size

144
Q

A sampling distribution is a

A

probability distribution of a statistic obtained from a large number of samples drawn from a specific population.

145
Q

MEAN OF A SAMPLING DISTRIBUTION IS CALLED

A

THE EXPECTED VALUE

146
Q

AN ADJACENT COLUMN CONTAINS FREQUENCIES (f) OF EACH VALUE:

A

NUMBERS OF OBSERVATIONS IN A SAMPLE THAT HAVE A PARTICULAR VALUE

147
Q

FREQUENCY TABLE MAY CONTAIN RELATIVE FREQUENCIES (rf, %):

A

SHARES OF OBSERVATIONS (FROM THE TOTAL n) THAT HAVE A PARTICULAR VALUE.

148
Q

A FREQUENCY TABLE MAY CONTAIN CUMULATIVE FREQUENCIES (cf):

A

NUMBERS OF OBSERVATIONS THAT HAVE VALUES THAT ARE EQUAL TO OR LOWER THAN A GIVEN VALUE.

149
Q

A FREQUENCY TABLE MAY CONTAIN CUMULATIVE RELATIVE FREQUENCIES (crf, c%):

A

SHARES OF OBSERVATIONS THAT HAVE VALUES THAT ARE EQUAL TO OR LOWER THAN THE VALUE.

150
Q

STANDARD DEVIATION IS

A

AN AVERAGE DISTANCE OF ALL OBSERVATIONS FROM THE MEAN.

151
Q

s^2=

A

SS / (n – 1)

152
Q

IF YOU CHANGE ONE VALUE IN A SAMPLE,

A

MEAN CHANGES.

153
Q

IF YOU ADD / REMOVE AN OBSERVATION TO / FROM A SAMPLE, MEAN

A

CHANGES, UNLESS THAT OBSERVATION HAS THE VALUE OF THE MEAN.

154
Q

IF YOU MULTIPLY/DIVIDE ALL VALUES IN A SAMPLE BY A CONSTANT, THE MEAN WILL

A

ALSO BE MULTIPLIED/DIVIDED BY THAT CONSTANT.

155
Q

IF YOU ADD / SUBTRACT A CONSTANT TO ALL VALUES IN A SAMPLE,

A

YOU ADD / SUBTRACT THAT SAME CONSTANT TO THE MEAN.

156
Q

What is Mean?

A

The mean is the average of a collection of numbers: the sum of the values divided by how many there are. (The most common value is the mode, not the mean.)

157
Q

WITH SKEWED FREQUENCY DISTRIBUTIONS. A MEAN SHIFTS

A

QUITE STRONGLY IN THE DIRECTION OF OUTLIERS.

158
Q

MEDIAN CAN BE AN AVERAGE OF TWO VALUES WHEN A SAMPLE OR A POPULATION HAS

A

AN EVEN NUMBER OF OBSERVATIONS.

159
Q

GOOD EXAMPLES OF INTERVAL-SCALE VARIABLES ARE

A

TIME AND TEMPERATURE.

160
Q

VARIANCE (σ² FOR A POPULATION, s² FOR A SAMPLE):

A

AN AVERAGE SQUARED DISTANCE OF ALL OBSERVATIONS FROM THE MEAN .

161
Q

Mean and median are close to each other, because

A

distribution is symmetric.

162
Q

CENTRAL TENDENCY REPRESENTS VALUES (USUALLY A SINGLE VALUE) THAT IS

A

MOST COMMON IN A FREQUENCY DISTRIBUTION.

163
Q

with symmetric distributions mean is a preferred measure of

A

Central tendency

164
Q

MEAN IS THE PREFERRED MEASURE OF C.T. FOR

A

ANY BELL-SHAPED F.D.

165
Q

WITH SKEWED FREQUENCY DISTRIBUTIONS. A MEAN SHIFTS QUITE STRONGLY IN THE DIRECTION OF

A

OUTLIERS

166
Q

IF YOU CHANGE ONE VALUE IN A SAMPLE, MEAN

A

CHANGES

167
Q

IF YOU ADD / REMOVE AN OBSERVATION TO / FROM A SAMPLE, MEAN

A

CHANGES, UNLESS THAT OBSERVATION HAS THE VALUE OF THE MEAN.

168
Q

IF YOU MULTIPLY/DIVIDE ALL VALUES IN A SAMPLE BY A CONSTANT, THE MEAN WILL

A

ALSO BE MULTIPLIED/DIVIDED BY THAT CONSTANT.

169
Q

IF YOU ADD / SUBTRACT A CONSTANT TO ALL VALUES IN A SAMPLE, YOU ADD / SUBTRACT THAT SAME CONSTANT TO THE

A

MEAN

170
Q

IN A SYMMETRIC UNIMODAL F.D. MEAN, MEDIAN, AND MODE

A

COINCIDE

171
Q

IN A SKEWED F.D. MEAN MOVES TOWARDS OUTLIERS, WHILE MEDIAN AND MODE

A

STAY CLOSER TO COMMON VALUES.

172
Q

IN A MULTIMODAL F.D. MEAN AND MEDIAN TEND TOWARDS THE MIDDLE VALUES OF ALL OBSERVATIONS, WHILE MODES SHOW

A

THE MOST FREQUENT ONES.

173
Q

The population would have greater variability than a sample, because of

A

biased sampling error

174
Q

df =

A

n – 1

175
Q

DF means

A

degrees of freedom

176
Q

What does df = n – 1 mean?

A

This means that n – 1 observations in a sample could have taken on any value, while one observation would be predetermined by the values of the others and of the mean.

177
Q

OBSERVATIONS THAT ARE REMOVED FROM THE MEAN BY MORE THAN TWO STD ARE

A

CONSIDERED OUTLIERS.

178
Q

When the question is about a decrease or an increase (less / more) ->

A

one tail

179
Q

If it is simply about a difference or a change ->

A

two tail

180
Q
A
181
Q

ESTIMATED STANDARD ERROR:

A
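A sketch of the usual definition: when σ is unknown, the standard error of the mean is estimated from the sample as s / √n, where s is the sample STD (computed with n – 1). The `sample` values are hypothetical.

```python
import math
from statistics import stdev

sample = [4, 3, 6, 7]  # hypothetical sample
n = len(sample)

s = stdev(sample)                # sample STD, uses n - 1 in the denominator
estimated_se = s / math.sqrt(n)  # estimated standard error of the mean

print(round(s, 4), round(estimated_se, 4))
```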