final Flashcards

1
Q

WHAT IS STATISTICS?

A

A set of tools used to describe, organize, summarize, and interpret data, to draw conclusions, and to relate one data set to another (e.g., school scores, level of stress). Statistics helps us understand the world around us.

2
Q

What is descriptive statistics?

A

Tools used to organize and describe characteristics of a collection of data.

3
Q

What is Inferential statistics?

A

The next step after descriptive tools: used to infer findings from a smaller group (the sample) to a larger group (the population).

4
Q

What is an average?

A

The one value that best represents an entire group of scores.
Averages are also called measures of central tendency.

5
Q

Define the mean

A

The MOST USED type of average. The sample mean is the measure that MOST ACCURATELY reflects the population mean. It is very SENSITIVE TO EXTREME SCORES, which can pull the mean in one direction or the other and make it less representative of the set of scores and less useful.
= TYPICAL, AVERAGE, MOST CENTRAL SCORE

6
Q

formula to obtain the mean

A

The sum of all the values in a group, divided by the number of values in the group
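
As a quick illustration of the formula, here is a minimal Python sketch. The function name `mean` and the scores are made up for this example and are not part of the original card.

```python
def mean(values):
    """Sum of all the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty group is undefined")
    return sum(values) / len(values)

scores = [3, 4, 5, 6, 7]  # made-up scores
print(mean(scores))       # 25 / 5
```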

7
Q

What is the difference between statistics and parameters?

A

PARAMETERS describe POPULATIONS
E.g., the average height of all WSU students

STATISTICS describe SAMPLES
E.g., the average height of the students in our sample

8
Q

What types of sampling methods do we have?

A

BIASED sampling: e.g., just asking your friends

RANDOM sampling: everyone in the group has an equal chance of being selected

9
Q

Why is random sampling the best method?

A
  • maximizes the chance of getting a sample that is BEST REPRESENTATIVE of the population
  • a representative sample allows us to GENERALIZE OUR RESULTS to the population much more easily
10
Q

Define variable

A

A condition or characteristic that can have different values

11
Q

Define value

A

A possible number or category that a score can have

12
Q

Define score

A

A particular person’s value on a variable

13
Q

difference between mean and median

A

The mean is the MIDDLE POINT OF A SET OF VALUES, and the median is the MIDDLE POINT OF A SET OF CASES: the median depends on how many cases there are, not on the values of those cases.

14
Q

define median

A

The MIDPOINT in a set of scores: 50% of the scores fall ABOVE it and 50% fall BELOW it. It is the MIDDLEMOST VALUE. When there is an even number of values, the median is the mean of the two middle values.
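
The odd and even cases can be sketched in Python as follows. The function name `median` and the example numbers are illustrative only.

```python
def median(values):
    """Middle value of the sorted scores; mean of the two middle values if n is even."""
    if not values:
        raise ValueError("the median of an empty group is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: mean of the two middle values

print(median([5, 1, 3]))      # odd number of values
print(median([1, 2, 3, 4]))   # even number of values
```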

15
Q

define mode

A
  • Most GENERAL AND LEAST PRECISE
  • helps understand the characteristics of a set of scores.
  • value that OCCURS MOST FREQUENTLY
16
Q

what is an extreme score?

A

Also known as outliers
Scores that do not “look like” the rest of the data/observations
Are “very different” from the group to which they belong (much higher or much lower)
PULL the value of the mean in either direction and make it less useful as a summary
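
To see the pull of an outlier, the sketch below compares the mean and the median before and after adding one extreme score. It uses Python's standard `statistics` module; the scores are made up.

```python
import statistics

scores = [10, 11, 12, 13, 14]      # made-up scores, no outlier
with_outlier = scores + [100]      # same scores plus one extreme score

mean_before = statistics.mean(scores)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(scores)
median_after = statistics.median(with_outlier)

# The mean jumps toward the outlier; the median barely moves.
print(mean_before, mean_after)
print(median_before, median_after)
```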

17
Q

characteristics of median

A

-Cares about how many data points there are, not the value of each data point.
-Insensitive to extreme scores (outliers)
-Has a relationship with percentile points: the median is the 50th percentile
“At the 50th percentile”: what does that mean?
Your score is at the median: half of the scores fall below it.

18
Q

characteristics of mode

A

Possible to have no mode
Possible to have more than one mode
E.g. “bimodal distributions”
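
A small Python sketch of these cases follows. The function name `modes` is hypothetical, and treating "every value occurs exactly once" as "no mode" is a convention assumed here for illustration.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return []  # assumed convention: every value occurs once, so no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3]))      # one mode
print(modes([1, 1, 2, 2, 3]))   # two modes (bimodal)
print(modes([1, 2, 3]))         # no mode
```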

19
Q

characteristics of mean

A
  • “BALANCES” the numbers (the values on either side are equal in weight)
  • Same “TOTAL DISTANCE”: the deviations above the mean and the deviations below it add up to the same amount
  • Does not have to be a number in the set
20
Q

When to use what?

A

The mean is more precise than the median, and the median is more precise than the mode. ALL OTHER THINGS BEING EQUAL, USE THE MEAN.
•Use the mode for categorical data
•Use the median when you have extreme scores
•Use the mean when your data are not categorical and have no extreme scores.

21
Q

Define variability

A

Provides the FULL PICTURE, as it reflects HOW SCORES DIFFER FROM ONE ANOTHER or, more precisely, FROM THE MEAN, since the mean is the best representation of a set of scores.

22
Q

What are the measures of variability?

A

Three measures: range, standard deviation, and variance.

23
Q

Define the range

A

MOST GENERAL measure of variability; tells how far apart the scores are from one another

  • subtract the lowest score from the highest score: r = h - l
  • not to be used as a conclusion, but as part of a process
24
Q

Define the standard deviation

A

The MOST COMMONLY used measure of variability. It represents the AVERAGE AMOUNT OF VARIABILITY in a set of scores: roughly, the average distance of the scores from the mean.

25
Q

Can we just add up the deviations from the mean?

A

-No, because they always sum to 0
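
A short Python check of this fact, using made-up scores: the raw deviations cancel out, while the squared deviations do not.

```python
scores = [2, 4, 6, 8]                      # made-up scores
mean = sum(scores) / len(scores)           # 20 / 4

deviations = [x - mean for x in scores]    # negative below the mean, positive above
total = sum(deviations)                    # positives and negatives cancel
squared_total = sum(d ** 2 for d in deviations)

# With scores whose mean is not exactly representable in floating point,
# the total may come out as a tiny number very close to 0 instead of exactly 0.
print(total, squared_total)
```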

26
Q

Why do we square the deviations?

A
  • to get rid of the negative signs
  • to check validity of answer

27
Q

Why do we take the square root?

A

-To return to the same units we started with.

28
Q

Why do we divide by n-1 instead of n?

A

Because the sample SD is used as an estimate of the population SD, and dividing by n-1 gives the unbiased estimate (dividing by n gives the biased one). We do this TO FORCE THE SD TO BE ARTIFICIALLY LARGER THAN IT WOULD OTHERWISE BE.
All other things being equal, the larger the sample size, the smaller the difference between the biased and unbiased estimates of the SD. The closer the sample is to the size of the population, the more accurate the estimate will be.
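
The effect of the denominator can be sketched in Python as below. The function name `standard_deviation`, its `unbiased` flag, and the scores are assumptions made for this illustration.

```python
import math

def standard_deviation(values, unbiased=True):
    """SD of the scores; divides by n-1 when unbiased=True, by n otherwise."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two scores")
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)  # squaring removes negative signs
    denominator = n - 1 if unbiased else n
    return math.sqrt(sum_of_squares / denominator)         # square root restores original units

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores
biased = standard_deviation(scores, unbiased=False)
unbiased = standard_deviation(scores, unbiased=True)
print(biased, unbiased)            # the n-1 version is slightly larger
```

Python's standard library offers the same pair as `statistics.pstdev` (divide by n) and `statistics.stdev` (divide by n-1).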

29
Q

What if s = 0?

A

There is no variability: all the scores are identical in value. Rare to find.

30
Q

What is the Variance?

A

The SD squared. Not commonly reported by itself in research articles, as a squared number is difficult to interpret. It is still important because it is used both as a concept and as a practical measure of variability.

31
Q

What is the difference between the SD (s) and the variance (s²)?

A

They are both measures of variability, dispersion, or spread, but the SD is expressed in the original units and the variance is expressed in squared units.

32
Q

Why do we use both variability and central tendency?

A

It is possible to have the same mean but very different amounts of variability. That is why we need to know and report BOTH.

33
Q

Why is the range the most convenient measure of dispersion? When should we use it?

A

Because you only need to do a simple subtraction. It considers only the two extreme scores, not the values in between. USE IT WHEN YOU NEED A GROSS ESTIMATE.

34
Q

Why does the SD get smaller as the individuals in a group score more similarly on a test?

A

As individuals score more similarly, their scores are closer to the mean, so the deviations from the mean are smaller and the SD is smaller too.

35
Q

inclusive range

A

r = h - l + 1 (the +1 counts both the highest and the lowest score as part of the range; otherwise one end point is left out)
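
The two versions of the range differ by exactly one, as this small Python sketch shows. The function names and scores are made up for illustration.

```python
def exclusive_range(scores):
    """r = h - l"""
    return max(scores) - min(scores)

def inclusive_range(scores):
    """r = h - l + 1"""
    return max(scores) - min(scores) + 1

scores = [31, 42, 35, 55, 54, 34]   # made-up scores: h = 55, l = 31
print(exclusive_range(scores), inclusive_range(scores))
```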

36
Q

Why n-1 as a denominator?

A

To make the sample SD slightly larger, giving a conservative (over)estimate of the SD of the population

37
Q

Graphs

A

Graphs help examine how DIFFERENCES in measures of CENTRAL TENDENCY and those of VARIABILITY can RESULT in different looking distributions. Graphs are a VISUAL REPRESENTATION of a distribution of scores.
Tip: a graph should communicate only one idea.

38
Q

What is a frequency distribution?

A

A method of tallying and representing how often scores occur. In a FD, scores are grouped into class intervals (ranges of numbers).

39
Q

steps to create a frequency distribution

A
  • Order the data sequentially
  • Look at the range of values
  • Decide how many intervals you want
  • Divide the range by the number of intervals to get the width of each interval
  • List the intervals, largest to smallest
  • Place the actual data points into the buckets (intervals)
  • Calculate the frequencies
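
The steps above can be sketched in Python roughly as follows. This is one possible reading, assuming whole-number scores and using the inclusive range to set the interval width; the function name `frequency_distribution` and the scores are hypothetical.

```python
def frequency_distribution(scores, num_intervals):
    """Return [((low, high), frequency), ...] listed from largest interval to smallest."""
    if num_intervals < 1 or not scores:
        raise ValueError("need scores and at least one interval")
    low, high = min(scores), max(scores)
    span = high - low + 1                         # inclusive range of whole-number scores
    width = -(-span // num_intervals)             # ceiling division: width of each interval
    table = []
    for i in range(num_intervals):
        start = low + i * width
        end = start + width - 1
        freq = sum(1 for s in scores if start <= s <= end)  # tally scores in this bucket
        table.append(((start, end), freq))
    return list(reversed(table))                  # largest interval first

scores = [1, 2, 2, 3, 5, 6, 6, 6, 9, 10]          # made-up whole-number scores
for interval, freq in frequency_distribution(scores, 5):
    print(interval, freq)
```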
40
Q

histogram

A

A VISUAL REPRESENTATION of the FD in which frequencies are represented by bars

41
Q

What do you need to create a histogram ?

A

i) Place values at equal distances on the x-axis, then identify the midpoint of each class interval
ii) Draw a bar around each midpoint that spans the entire class interval, up to the height representing its frequency.

42
Q

What is a frequency polygon?

A

A frequency polygon is a continuous line that represents the frequencies of scores within each class interval

43
Q

What is a cumulative frequency distribution, and how do we create it?

A

A visual representation of the CUMULATIVE FREQUENCY OF OCCURRENCES by class interval. It is created by adding the frequency in a class interval to all the frequencies below it.
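
The running total can be sketched with Python's `itertools.accumulate`. The frequencies below are made up and listed from the lowest class interval to the highest.

```python
from itertools import accumulate

frequencies = [3, 1, 4, 0, 2]                # made-up frequencies, lowest interval first
cumulative = list(accumulate(frequencies))   # each entry adds all the frequencies below it

print(cumulative)  # the last entry equals the total number of scores
```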

44
Q

Frequency distributions differences

A

1) AVERAGE VALUE: the middle point of a distribution is the average only when the curve is a mirror image of itself (symmetric)
2) VARIABILITY
3) SKEWNESS: a lack of symmetry; one tail of the distribution is longer than the other
4) KURTOSIS: how flat or peaked a distribution appears

45
Q

Types of Kurtosis

A

Two kinds

  1. PLATYKURTIC: the distribution is relatively FLAT compared to a normal bell-shaped distribution. Such distributions are more DISPERSED than those that are not.
  2. LEPTOKURTIC: the distribution is relatively PEAKED compared to a normal bell-shaped distribution. Such distributions are LESS VARIABLE or dispersed relative to others.
46
Q

Define correlation

A

How the VALUE OF ONE variable CHANGES when THE VALUE of another variable changes. It reflects the dynamic quality of the relationship between two variables.

47
Q

Correlation values

A

Values range between -1 and +1
A correlation between two variables is called bivariate

48
Q

Types of correlations

A

-Direct (positive): when x increases in value, y increases in value;
when x decreases in value, y decreases in value
-Indirect (negative): when x increases in value, y decreases in value;
when x decreases in value, y increases in value

49
Q

Absolute value

A

The absolute value REFLECTS THE STRENGTH of the correlation

50
Q

correlation coefficient?

A

PEARSON PRODUCT-MOMENT CORRELATION
(r) is any value between -1 and +1
A number that reflects the STRENGTH AND DIRECTION OF THE RELATIONSHIP between 2 variables
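
For reference, here is a minimal Python sketch of r computed from deviation scores (the sum of cross-products divided by the square root of the product of the two sums of squares). The function name `pearson_r` and the data are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable has no variability")
    return cross / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # perfect direct (positive) relationship
print(pearson_r(x, [10, 8, 6, 4, 2]))   # perfect indirect (negative) relationship
```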

51
Q

correlation matrix?

A

A tool for organizing BIVARIATE correlations among a set of variables

52
Q

coefficient of determination?

A

PERCENTAGE OF VARIANCE IN one variable that is accounted for by the variance in the other variable

53
Q

what happens when we square r?

A

We find out how much of the VARIABILITY in one variable can be accounted for by the other variable

54
Q

advantage of stronger correlation

A

THE STRONGER (LARGER) THE CORRELATION, the more shared variance, and the more INFORMATION about PERFORMANCE on one score can be explained by the other score

55
Q

are all correlations linear?

A

No. Not all relationships are linear, so not all correlations have to be linear.

56
Q

correlation versus causation

A

Correlation does not mean that a change in one variable causes a change in the other. Ice cream consumption and the crime rate increase together and decrease together, but the only thing they share is the outside temperature.

57
Q

What are the signs in correlation for?

A

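The sign shows the DIRECTION of the relationship: positive (+) means the variables change in the same direction, negative (-) means they change in different directions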

58
Q

what does a perfect relationship mean?

A

If you know the value of one variable, you know the value of the other (r = +1 or -1)

59
Q

What is the coefficient of alienation?

A

The amount of UNEXPLAINED VARIANCE between variables (1 - r²); also called the coefficient of nondetermination
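
As a small worked example, squaring r gives the explained (shared) variance, and subtracting that from 1 gives the unexplained variance, following the card's definition. The value r = 0.7 is made up for illustration.

```python
r = 0.7                          # made-up correlation

explained = r ** 2               # coefficient of determination
unexplained = 1 - explained      # unexplained variance

print(round(explained, 2), round(unexplained, 2))  # the two always add up to 1
```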