Week 14: Hypothesis Testing II Flashcards

1
Q

descriptive stats definition

A

a way of summarizing data to convey main information
—instead of presenting raw scores, researchers present a few summary measures that capture their data’s key characteristics

2
Q

descriptive stats can be

A
  • frequencies and percentages
  • measures of location (central tendency)
  • measures of variability
  • measures of individual location
3
Q

frequencies and percentages

A
  • preferred when researchers want to convey how often particular phenomena occurred (nominal measures)
  • divide # in a category by # in group and x100
  • proportion is an alternative to percentage: the same calculation, just don't multiply by 100
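The calculation above can be sketched in a few lines of Python. The eye-color data and variable names are hypothetical, chosen only to illustrate a nominal measure.

```python
from collections import Counter

# Hypothetical nominal data: eye color reported by 8 participants.
responses = ["brown", "blue", "brown", "green", "brown", "blue", "brown", "blue"]

counts = Counter(responses)   # frequency: how often each category occurred
n = len(responses)            # number of participants in the group

# Proportion: number in a category divided by number in the group.
proportions = {category: k / n for category, k in counts.items()}
# Percentage: the same ratio multiplied by 100.
percentages = {category: 100 * k / n for category, k in counts.items()}

print(counts["brown"], proportions["brown"], percentages["brown"])  # 4 0.5 50.0
```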
4
Q

measures of location or central tendency

A

single values that describe an entire set of data

  • include two broad categories
  • –central location
  • —–mean, median, mode
  • –fractiles
5
Q

mean

A

average

  • the sum of all scores divided by n (the sample size)
  • most appropriate for interval or ratio levels
  • most reflective of data when the distribution of scores fits a normal distribution
  • –outliers make it less representative
  • x̄ = Σx / n
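As a rough illustration of the formula and of the outlier point, here is a minimal Python sketch. The function name `mean` and the score lists are made up for the example.

```python
def mean(values):
    """Sum of the scores divided by the number of scores (n)."""
    return sum(values) / len(values)

# Hypothetical interval-level scores.
scores = [2, 4, 4, 5, 5]
print(mean(scores))        # 4.0

# Adding one extreme score pulls the mean well away from the typical values.
with_outlier = scores + [40]
print(mean(with_outlier))  # 10.0
```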
6
Q

median (mdn)

A
  • the midpoint or center of the data or set of numbers
  • computing the median requires ordinal, interval, or ratio level data (scores must be arranged from low to high)
  • if there is an even number of scores, average the 2 middle scores
  • if data are not normally distributed, the median may be preferred over the mean
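The rules above (sort the scores, take the middle one, average the two middle ones when n is even) might be sketched like this. The function name and data are illustrative only.

```python
def median(values):
    """Midpoint of the scores after arranging them from low to high."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of scores: the single middle score.
        return ordered[mid]
    # Even number of scores: average the two middle scores.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 2, 4, 4, 5]))      # 4
print(median([2, 4, 4, 5, 5, 40]))  # 4.5
```

In the second list the extreme score of 40 barely moves the median, which is one reason it may be preferred for skewed data.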
7
Q

the mode

A
  • based on frequency info, the primary measure of central tendency for nominal measures
  • –category, response, or score that happens most often
  • relatively uninformative but is the only measure of central tendency available for nominal level data
  • bimodal–when two values share the highest frequency (2 humps like a camel); a normal distribution is unimodal
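A small Python sketch of finding the mode, including the bimodal case. The helper name `modes` and the sample data are hypothetical.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, k in counts.items() if k == top)

# Nominal data with a single most frequent category (unimodal).
print(modes(["a", "b", "a", "c"]))  # ['a']

# Two values tie for the highest frequency (bimodal).
print(modes([1, 1, 2, 2, 3]))       # [1, 2]
```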
8
Q

fractiles

A

statistical procedures are called fractiles when they divide a set of data into two or more nearly equal parts

  • identify the proportion of observations above and below them (median divides in half)
  • can be quartiles, quintiles, deciles, percentiles
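One way to compute fractiles in Python is the standard library's `statistics.quantiles`, which returns the cut points that divide the data into `n` groups. The data below are made up, and the exact cut points depend on the interpolation method (the default here is `'exclusive'`), so other software may give slightly different values.

```python
import statistics

# Hypothetical scores: the integers 1 through 11.
data = list(range(1, 12))

# Quartiles: 3 cut points dividing the data into 4 nearly equal parts.
quartiles = statistics.quantiles(data, n=4)

# Deciles: 9 cut points dividing the data into 10 nearly equal parts.
deciles = statistics.quantiles(data, n=10)

# The second quartile is the median, which divides the data in half.
print(quartiles, statistics.median(data))
```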
9
Q

measures of individual location

A

used to specify the location of one participant in relation to a group of participants, they include: rank, percentile rank, and standard scores

10
Q

rank

A

orders a group of participants in terms of their performance on some measure from low to high or high to low

11
Q

percentile rank

A

shows the relative position for an individual within a group of participants

12
Q

standard score

A
  • aka z score
  • raw scores that are converted to standard deviation units; they are useful as indicators of individual location when data are normally distributed
  • z=(x-mean)/standard deviation
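The formula translates directly into code. In this sketch the function name `z_score` and the test scale (mean 100, standard deviation 15) are assumptions for illustration.

```python
def z_score(x, mean, sd):
    """Convert a raw score to standard deviation units."""
    return (x - mean) / sd

# Hypothetical scale with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))  # 2.0  (two SDs above the mean)
print(z_score(85, 100, 15))   # -1.0 (one SD below the mean)
```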
13
Q

measures of variability

A

the degree of dispersion in a set of data (aka spread)

  • measures:
  • –range
  • –variance
  • –standard deviation
14
Q

range

A

the largest observed value minus the smallest

  • useful when comparing variation between two sets of data; however, it tells nothing about the variability of the scores falling between the two extremes
  • requires scores that can be ordered, so it cannot be computed for nominal data
15
Q

variance

A

considers the dispersion of individual values around the mean

  • can be calculated from either interval or ratio level data
  • to calculate
    1) list all scores in a column
    2) compute the mean for the set of scores
    3) obtain the difference between each score and the mean
    4) square the difference between each score and the mean
    5) add the squared differences together–sum the squares
    6) divide the sum of squares by the number of participants minus one (n-1)
  • s² = Σ(x − x̄)² / (n − 1)
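The six steps can be followed one at a time in Python. This is a minimal sketch with made-up scores; the function name `sample_variance` is hypothetical, and the result is checked against the standard library's `statistics.variance`, which also divides by n-1.

```python
import math
import statistics

def sample_variance(scores):
    n = len(scores)                                   # 1) the list of scores
    mean = sum(scores) / n                            # 2) mean of the scores
    deviations = [x - mean for x in scores]           # 3) score minus mean
    squared = [d ** 2 for d in deviations]            # 4) square each difference
    sum_of_squares = sum(squared)                     # 5) sum the squares
    return sum_of_squares / (n - 1)                   # 6) divide by n - 1

scores = [2, 4, 4, 5, 5]
print(sample_variance(scores))  # 1.5
print(math.isclose(sample_variance(scores), statistics.variance(scores)))  # True
```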
16
Q

variance short comings

A

its value is not expressed in the same unit of measurement as the sample scores
not easily interpreted on its own; converting it to the standard deviation (its square root) makes it easy to interpret

17
Q

standard deviation

A
  • square root of variance
  • reflects the dispersion of scores around the mean
  • most commonly used measure of variability for interval or ratio levels of data
  • easily interpreted
  • –large=lots of spread
  • –small= close to mean
  • uses all scores in the set, so extremes have less influence than on the range
18
Q

coefficient of variation

A

a measure of relative variation; in some cases it is more meaningful to express the SD as a percentage of the mean of what is being measured

  • abbreviated CV
  • expressed as (SD/mean)*100
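A short sketch of the CV calculation using the standard library. The function name and the height data are illustrative assumptions; `statistics.stdev` is the sample standard deviation (n-1 in the denominator).

```python
import math
import statistics

def coefficient_of_variation(values):
    """Standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical heights in cm: mean 160, sample SD 10.
heights = [150, 160, 170]
cv = coefficient_of_variation(heights)
print(round(cv, 2))  # 6.25
```

Because the CV has no units, it can be used to compare spread across measures recorded on different scales.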
19
Q

distribution

A

a representation of the pattern of scores

  • for categorical data, it is displayed in a bar graph
  • for continuous variables, it is displayed in a line graph or histogram
20
Q

normal distribution

A
  • unimodal meaning one mode in the center
  • symmetrical
  • continuous from one tail to the other
  • asymptotic, meaning the curve gets closer to zero as it moves further from the center but never reaches zero; most scores are at the center and few are at the edges
21
Q

z scores

A

a normal distribution is standardized with z scores

  • converted to a common distribution with a mean of 0 and a standard deviation of 1
22
Q

standard units

A

a single score in the distribution of scores can be represented by a standard unit of measure known as a z-score

  • –the z-score identifies where an individual score is located within the distribution of scores
  • –it is derived by dividing the difference between the target score and the mean by the standard deviation
  • calculates how many standard deviations a single value is away from the mean, so that it can be compared to any other value taken from a standard normal distribution (where the mean is 0)
23
Q

negatively skewed (left skewed)

A

most data is on the right and the tail is on the left

24
Q

positively skewed

A

most data is on the left and the tail is on the right

25
Q

kurtosis

A

a measure of peakedness

26
Q

mesokurtic

A

normal distribution

27
Q

leptokurtic

A

more peaked than a normal distribution, with more data in the center

28
Q

platykurtic

A

flatter than a normal distribution, with less data concentrated in the center

29
Q

standard error of the mean

A

an estimate of how much a sample mean is likely to differ from the population mean; it is estimated because it is impractical to collect all possible samples

  • two hypotheses
    1) the mean of the sampling distribution = the population’s mean, and the sampling distribution is less variable than the population
  • –allows us to calculate an estimate of the standard error for any given sample
    2) the second hypothesis, known as the central limit theorem (CLT) assumption, is that the sampling distribution is normal if the sample is large enough (over 30)
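The card does not give a formula, so the sketch below assumes the usual estimate of the standard error of the mean: the sample standard deviation divided by the square root of the sample size. The function name and the data are hypothetical.

```python
import math
import statistics

def standard_error(sample):
    """Estimated SEM, assuming the common formula SD / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical sample of 5 scores (sample variance 1.5).
sample = [2, 4, 4, 5, 5]
sem = standard_error(sample)
print(round(sem, 3))  # 0.548
```

Under this formula, a larger sample with the same spread gives a smaller standard error, which matches the idea that the sampling distribution is less variable than the population.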
30
Q

point estimates

A

a single number derived from a sample and used to estimate a population value (for example, the sample mean as an estimate of the population mean)
*point estimates are problematic because the sample statistic inevitably contains sampling error

31
Q

interval estimates

A

build on several point estimates to establish a range of values
*the range of values is called the confidence interval (CI)

32
Q

student’s t distributions

A

for sample sizes below 30, an alternative distribution must be used

  • this family of distributions is known as Student’s t distributions
  • as sample size increases, the sampling distributions increasingly approximate the normal distribution
  • they are symmetrical, bell shaped, and centered on the mean, but their shape changes as the sample size changes