Descriptive Statistics Flashcards

1
Q

describe/summarize the data a researcher has

A

descriptive statistics

2
Q

helps a researcher understand the data they have, while descriptive statistics help them explain to other people what is happening in the data

A

Exploratory data analysis (EDA)

3
Q

The first thing to describe is the distribution of the data,
to show the kinds of numbers that we have.

A

describing data

4
Q
  • Different ways of describing the distribution
  • Each is used to present the pattern in the data.
A
  • Frequency Table
  • Charts (e.g., histograms, bar charts, etc.)
5
Q

frequency distributions of nominal or ordinal data are customarily plotted using a ______

A

bar graph

6
Q

A ____ is drawn for each category, where the height of the
bar represents the frequency or number of members of
that category.

A

Bar

7
Q

used to represent frequency distributions
composed of interval or ratio data. A bar is drawn for each
class interval.

  • Class intervals are plotted on the horizontal axis such
    that each class bar begins and terminates at the real
    limits of the interval.
A

histogram

8
Q

also used to represent interval or
ratio data.

Instead of using bars, a point is plotted over the midpoint
of each interval at a height corresponding to the
frequency of the interval. Points are joined by a straight
line.

A

frequency polygon

9
Q

Don’t draw a bar chart for ___

A

Continuous measures

10
Q

presents the score values and
their frequency of occurrence.

When presented in a table, the score values are listed in
rank order, with the lowest score value usually at the
bottom of the table.

A

Frequency distribution
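
As a rough illustration of the card above, an ungrouped frequency distribution can be sketched in Python. The scores are made-up data, and `collections.Counter` is just one convenient way to do the counting, not something the cards prescribe.

```python
from collections import Counter

# Made-up scores, for illustration only.
scores = [3, 5, 4, 5, 2, 3, 5, 4, 5, 3]

# Count how often each score value occurs.
freq = Counter(scores)

# List score values in rank order, highest first,
# so the lowest score value ends up at the bottom of the table.
rows = sorted(freq.items(), reverse=True)

print("score  f")
for score, f in rows:
    print(f"{score:>5}  {f}")
```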

11
Q

in grouping data

A

how wide should each interval be?

12
Q

When data are grouped

A

some information is lost

13
Q

The wider the interval,

A

the more information is lost.

14
Q

Constructing a frequency distribution of grouped scores

A
  1. Find the range of the scores.
  2. Determine the width of each class interval (i).
  3. List the limits of each class interval, placing the interval
    containing the lowest score value at the bottom.
  4. Tally the raw scores into the appropriate class intervals.
  5. Add the tallies for each interval to obtain the interval
    frequency.
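
The five steps above can be sketched in Python. This is a minimal sketch under stated assumptions: the scores are made-up integers, the interval width is chosen by hand, and the helper name `grouped_frequencies` is hypothetical. The limits shown are the apparent limits of each interval, not the real limits.

```python
from collections import Counter

def grouped_frequencies(scores, width):
    """Tally integer scores into class intervals of the given width.

    Returns ((lower, upper), frequency) pairs, lowest interval first.
    """
    lowest, highest = min(scores), max(scores)   # step 1: ends of the range
    start = (lowest // width) * width            # align the first interval to a multiple of width
    # step 4: tally each raw score into its interval index
    tallies = Counter((s - start) // width for s in scores)
    table = []
    k = 0
    while start + k * width <= highest:          # step 3: list every interval up to the highest score
        lower = start + k * width
        upper = lower + width - 1
        table.append(((lower, upper), tallies.get(k, 0)))  # step 5: interval frequency
        k += 1
    return table

scores = [2, 3, 5, 7, 8, 8, 9, 11, 12, 14, 15, 17]  # made-up data
table = grouped_frequencies(scores, width=5)        # step 2: width chosen by hand

# Print with the interval containing the lowest score at the bottom.
for (lower, upper), f in reversed(table):
    print(f"{lower:>2}-{upper:<2}  {f}")
```

Trying a larger `width` on the same scores gives fewer, coarser intervals, which is one way to see that wider intervals lose more information.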
15
Q

indicates the
proportion of the total number of scores in each interval.

A

Relative Frequency Distribution

16
Q

indicates the
number of scores that fall below the upper limit of each
interval.

A

Cumulative Frequency Distribution

16
Q

indicates the
percentage of scores that fall below the upper limit of
each interval.

A

Cumulative Percentage Distribution

17
Q

what is this symbol?

f/N

A

Relative Frequency

18
Q

frequency of interval + frequencies of all class intervals below it.

A

Cumulative Frequency

19
Q

what is this formula?

(cum f / N) × 100

A

cumulative percentage
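
The three formulas on the preceding cards (relative frequency, cumulative frequency, cumulative percentage) can be worked through together. The interval frequencies below are hypothetical and listed lowest interval first; the variable names are illustrative only.

```python
from itertools import accumulate

# Hypothetical interval frequencies, lowest interval first.
freqs = [2, 5, 3, 2]
N = sum(freqs)                             # total number of scores

relative = [f / N for f in freqs]          # relative frequency: f / N
cum_f = list(accumulate(freqs))            # frequency of interval + all intervals below it
cum_pct = [cf / N * 100 for cf in cum_f]   # cumulative percentage: (cum f / N) x 100

for f, rel, cf, cp in zip(freqs, relative, cum_f, cum_pct):
    print(f"f={f}  rel f={rel:.2f}  cum f={cf}  cum %={cp:.1f}")
```

The last cumulative frequency always equals N, so the last cumulative percentage is 100.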

20
Q

_____ are very important in data analysis, because
they allow us to examine the shape of the distribution of
a variable.

The shape is a pattern that forms when a _____ is
plotted and is known as the distribution.

A

histogram

21
Q

the normal distribution is also known as the

A

Gaussian Distribution

22
Q

_____ symmetrical and bell shaped. It
curves outwards at the top and then inwards nearer the
bottom, the tails getting thinner and thinner.

A

normal distribution

23
Q

do the data ever form a perfect normal distribution?

A

Never, but as long as the distribution is close to a normal
distribution, it will not matter too much.

24
Q

A very ___ of naturally occurring variables are
normally distributed.

A _____ of statistical tests make the assumption
that the data form a normal distribution.

A

large number

25
Q

Don’t refer to the normal distribution as any of the
following:

A

usual, regular, standard, or even distribution.

25
Q

Wrong Shape

Distributions can be the wrong shape for two reasons.

First, because the distribution is not symmetrical.

Second, because it is not the characteristic bell shape.

A
  • SKEW
  • KURTOSIS
26
Q

A non-symmetrical distribution is said to be _____.

A

skewed

26
Q

the curve rises rapidly and then drops off
slowly.

A

positive skew

26
Q

the curve rises slowly and then
decreases rapidly.

A

negative skew
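
The direction of skew can also be checked numerically. The sketch below uses the moment-based skewness coefficient (the mean cubed z-score), which is a common convention and an assumption here, since the cards only describe skew by the shape of the curve. The data and the helper name `skewness` are made up.

```python
from statistics import mean, pstdev

def skewness(scores):
    """Moment-based skewness: the average of the cubed z-scores."""
    m = mean(scores)
    sd = pstdev(scores)
    return sum(((x - m) / sd) ** 3 for x in scores) / len(scores)

# Made-up scores that rise quickly and then tail off slowly to the right.
right_tailed = [1, 1, 2, 2, 2, 3, 3, 4, 6, 10]
# Mirror image: a long tail to the left.
left_tailed = [-x for x in right_tailed]
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed) > 0)   # positive skew
print(skewness(left_tailed) < 0)    # negative skew
print(skewness(symmetric))          # zero for a symmetrical set
```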

27
Q

Skewness has some serious implications for some types
of data analysis.

Skew often happens because of ____ or _____

A

floor effect or ceiling effect

28
Q

occurs when only a few of the subjects are
strong enough to get off the floor.

A

floor effect

29
Q

causes negative skew and is much less
common in Psychology.

occurs most commonly when we
are trying to ask questions to measure the range of some
variable, and the questions are all too easy, or too low
down the scale.

A

ceiling effect

29
Q

Much trickier than Skew but is usually less of a problem.

Occurs when there are either too many people at the
extremes of the scale, or not enough people at the
extremes.

A

kurtosis

30
Q

when there are insufficient people in
the tail (ends) of the scores to make the distribution
normal.

A

negative kurtosis

31
Q

when there are too many people,
too far away, in the tails of the distribution.

A

positive kurtosis

32
Q

_____ is just a “posh” way of saying average.

In some way refers to the most central value of a data
set with different interpretations of the sense of
“central”.

Loosely known as the average. In statistical description,
though, we have to be more precise about just what sort
of average we mean.

A

central tendency

32
Q

Small number of data points that lie outside the
distribution when the distribution is approximately
normal.

Usually easily spotted in histograms.

______ are easy to spot but deciding what to do with
them can be much trickier.

A

outliers

33
Q

The mean is very sensitive to _____

A

extreme scores

33
Q

Called the arithmetic mean.

Calculated by adding up all the scores and dividing by the
number of individual scores.

Equation: x̄ = ∑x / N

A

Mean

33
Q

Under most circumstances, of the measures used for
central tendency, the mean is least subject to ______

A

sampling variation

33
Q

For statistics to be correct, we need to make some _____

A

assumptions

34
Q

The sum of the squared deviations of all the scores
about their mean is a ______

A

minimum
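
A quick numerical check of this property, using made-up scores: the sum of squared deviations is computed about the mean and about a few other candidate values, and the mean gives the smallest total. The helper name `sum_sq_dev` is hypothetical.

```python
from statistics import mean

def sum_sq_dev(scores, centre):
    """Sum of squared deviations of the scores about a chosen centre."""
    return sum((x - centre) ** 2 for x in scores)

scores = [2, 4, 4, 5, 10]          # made-up data
m = mean(scores)

at_mean = sum_sq_dev(scores, m)
# Shift the centre away from the mean in both directions.
elsewhere = [sum_sq_dev(scores, m + shift) for shift in (-2, -0.5, 0.5, 2)]

print(at_mean, elsewhere)
print(all(at_mean < value for value in elsewhere))
```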

35
Q

the _____ is equal to the sum of the mean of each
group times the number of scores in the group, divided
by the sum of the number of scores in each group.

A

overall mean
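
The overall mean described above is a weighted mean of the group means. A small sketch with two made-up groups; the function name `overall_mean` is hypothetical.

```python
def overall_mean(group_means, group_sizes):
    """Sum of (group mean x group size), divided by the total number of scores."""
    weighted_sum = sum(m * n for m, n in zip(group_means, group_sizes))
    return weighted_sum / sum(group_sizes)

# Two made-up groups of scores.
group_a = [4, 6, 8]     # mean 6, n = 3
group_b = [10, 20]      # mean 15, n = 2

combined = overall_mean([6, 15], [3, 2])

# Check against the mean of all the raw scores pooled together.
pooled = sum(group_a + group_b) / len(group_a + group_b)
print(combined, pooled)
```

Simply averaging the two group means would give 10.5, which differs from the pooled mean because the groups are of unequal size.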

36
Q

Second most common measure of central tendency.

It is the middle score in a set of scores.

Used when the mean is not valid, which might be
because the data are not symmetrically or normally
distributed, or because the data are measured at an
ordinal level.

A

Median

37
Q

The median is _____ than the mean to extreme
scores.

A

less sensitive
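
To see the difference in sensitivity, one extreme score can be added to a small made-up data set and the mean and median compared before and after.

```python
from statistics import mean, median

scores = [3, 4, 5, 6, 7]           # made-up data
with_extreme = scores + [95]       # same scores plus one extreme score

print(mean(scores), median(scores))               # both are 5
print(mean(with_extreme), median(with_extreme))   # mean jumps to 20, median moves to 5.5
```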

38
Q

The most frequent score in the distribution or the most
common observation among a group of scores.

Best measure of central tendency for CATEGORICAL data
(although it is not even very useful for that)

Rarely used in research.

A

mode
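
For categorical data the mode can be found with Python's `statistics` module. The categories below are made up; `multimode` is shown because a data set can have more than one most frequent value.

```python
from statistics import mode, multimode

colours = ["blue", "green", "blue", "red", "blue", "green"]   # made-up categories
most_common = mode(colours)

# With a tie, multimode returns every most frequent value.
tied = multimode(["a", "a", "b", "b", "c"])

print(most_common, tied)
```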

39
Q

In a frequency distribution it is very easy to see because
it is the _______ of the distribution.

The problem with it is it does not tell us very much.

A

highest point

40
Q

The _____ is the simplest measure of dispersion.

It is the distance between the highest score and the
lowest score.

It can be expressed as a single number, or sometimes it is
expressed as the highest and lowest scores.

A

range

41
Q

To find the range we find the lowest value (2) and the
highest value (17). Sometimes the range is expressed as
a single figure, calculated as:

A

Range = Highest Value – Lowest Value
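
A short worked example of the formula. Only the lowest value (2) and highest value (17) come from the card; the scores in between are made up.

```python
# Made-up scores; only the 2 and the 17 come from the card.
scores = [2, 5, 9, 9, 12, 17]

lowest, highest = min(scores), max(scores)
score_range = highest - lowest      # Range = Highest Value - Lowest Value

print(f"range = {score_range} (from {lowest} to {highest})")
```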

42
Q

Used with ordinal data or with non-normal distributions.

If median is used as a measure of central tendency, the ___ is probably used as a measure of dispersion.

It is the distance between the upper and lower quartiles.

A

inter-quartile range

43
Q

There are ____ quartiles in a variable; they are the ____ values that divide the variable into four groups.

A

three

44
Q

The ____ quartile happens one quarter of the way up the
data, which is also the 25th centile.

A

1st quartile

45
Q

The _____ quartile is the half-way point, which is the
median, and is also the 50th centile.

A

2nd quartile

46
Q

The ____ quartile is the three-quarter-way point or the 75th
centile.

A

third quartile
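
The three quartiles and the inter-quartile range can be sketched with `statistics.quantiles`. The scores are made up, and the function's default interpolation method is only one of several conventions for locating quartiles, so other software may give slightly different values on the same data.

```python
from statistics import median, quantiles

# Made-up scores.
scores = [2, 4, 5, 7, 8, 9, 11, 12, 14, 15, 17]

# n=4 asks for the three cut points that divide the data into four groups.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1                       # distance between upper and lower quartiles

print(q1, q2, q3, iqr)
print(q2 == median(scores))         # the 2nd quartile is the median
```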

47
Q

symbol

s

A

sample standard deviation

47
Q

______ is like the mean, in that it
takes all of the values in the dataset into account when
it is calculated.

It is also like the mean in that it needs to make some
assumptions about the shape of the distribution.

To calculate the _____, we must assume that we have a
normal distribution.

A

Standard Deviation

48
Q

symbol

σ

A

population standard deviation

49
Q

the _____ of a set of scores is just the square of the standard deviation

A

variance
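
The relationship between variance and standard deviation, and between the population (σ) and sample (s) versions, can be illustrated with Python's `statistics` module. The scores are made up; `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by N - 1.

```python
from statistics import pstdev, pvariance, stdev, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]    # made-up data, mean 5

sigma = pstdev(scores)               # population standard deviation
sigma_sq = pvariance(scores)         # population variance
s = stdev(scores)                    # sample standard deviation
s_sq = variance(scores)              # sample variance

print(sigma, sigma_sq)
print(s, s_sq)
```

In both cases the variance is the square of the corresponding standard deviation, and for the same data s comes out slightly larger than σ.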

50
Q

The variance is not used much in descriptive statistics because it gives us squared units of measurement. However, it is used quite frequently in ___________

A

inferential statistics

51
Q
  1. the SD gives us a measure of dispersion relative to the mean
  2. the SD is sensitive to each score in the distribution
  3. like the mean, the SD is stable with regard to sampling fluctuations
A

properties of the standard deviation

52
Q

population standard deviation

A

boxplot or box and whisker plot