Descriptive Statistics Flashcards

1
Q

observations on one variable may be shown visually by putting the variable's values on one axis and the frequency on the other

A

visual presentation of data

2
Q

they are best used to interpret the frequency distribution visually

A

histogram:
y axis - number of units (frequency)
x axis - measurement level
bar heights are proportional to the frequencies they represent

frequency polygon:
a shorthand presentation of a histogram
a dot is placed at the top midpoint of each bar, then the dots are connected to form a polygon (the area under it may be shaded)
shows the shape of the data more clearly
the graph starts and ends at zero to “close” the shape

line graph:
can illustrate more than one data set in one graph
- arithmetic line graph
- semilogarithmic line graph
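
As a rough illustration of how a histogram and a frequency polygon relate, the sketch below counts frequencies per measurement level and then builds the polygon points from the bar tops. The data values are made up for the example, and the text-based "bars" stand in for a real plot.

```python
from collections import Counter

# Hypothetical measurements (assumed data, not from the flashcards).
values = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Histogram: frequency (y axis) for each measurement level (x axis).
counts = Counter(values)
levels = list(range(min(values), max(values) + 1))
histogram = [(level, counts.get(level, 0)) for level in levels]

# Frequency polygon: one point at the top of each bar, plus a
# zero-frequency point on each side so the shape is "closed".
polygon = [(levels[0] - 1, 0)] + histogram + [(levels[-1] + 1, 0)]

# Crude text rendering of the bars.
for level, freq in histogram:
    print(level, "#" * freq)
```

The polygon reuses the histogram's points, which is why it can be read as a shorthand for the same distribution.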

3
Q

differences between histogram, frequency polygon and line graph

A

histogram - data distribution
frequency polygon - connects the midpoints of the bar tops with lines
line graph - trends/ changes over time

3
Q

briefly explain
arithmetic line graph:
semilogarithmic line graph:

A

arithmetic line graph:
both the x and y axes have an arithmetic (numerical) scale

semilogarithmic line graph: the y axis has a logarithmic scale

arithmetic - evenly spaced intervals
semilogarithmic - the scale increases by multiples, which suits exponential changes (e.g., bacterial growth)
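
A small sketch of why a logarithmic y axis suits exponential change: the bacterial counts below are hypothetical and assumed to double every hour. On an arithmetic scale the step between points keeps growing, while on a log scale the step is constant, so the curve plots as a straight line.

```python
import math

# Hypothetical bacterial counts that double every hour (assumed data).
counts = [100 * 2 ** hour for hour in range(6)]

# Arithmetic scale: the differences between successive points keep growing.
arithmetic_steps = [b - a for a, b in zip(counts, counts[1:])]

# Logarithmic scale: the differences are constant (log10 of the growth factor).
log_counts = [math.log10(c) for c in counts]
log_steps = [b - a for a, b in zip(log_counts, log_counts[1:])]

print(arithmetic_steps)
print([round(step, 3) for step in log_steps])
```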

4
Q

it describes how the instances of a data set are distributed

A

frequency distribution

4
Q

frequency distribution from ____ data is defined by …

A

continuous data
types of descriptors aka parameters

5
Q

what are the types of parameters of a frequency distribution

A

central tendency
dispersion

6
Q

it is defined as the value used to represent the center or the middle of a set of data values

A

central tendency

7
Q

it locates observations on a measurement scale

A

central tendency

8
Q

it describes the spread of values in a given data set

A

dispersion

9
Q

it suggests how widely spread out the observations are

A

dispersion

10
Q

high SD =
low SD =

A

high SD = scattered data, spread out far from the mean
low SD = data clumped around the mean
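
To make the contrast concrete, here is a minimal sketch with two made-up data sets that share the same mean but differ in spread. It uses the population standard deviation (`statistics.pstdev`); that choice is an assumption, since the card does not say whether population or sample SD is meant.

```python
import statistics

# Two hypothetical data sets with the same mean (50).
clumped = [49, 50, 50, 50, 51]
scattered = [10, 30, 50, 70, 90]

low_sd = statistics.pstdev(clumped)     # values sit close to the mean
high_sd = statistics.pstdev(scattered)  # values lie far from the mean

print(round(low_sd, 2), round(high_sd, 2))
```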

11
Q

it is the average: the sum (∑) of all observed values (xi) divided by the total number of observations (N)

A

mean, x̄
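
The definition above can be sketched directly in code. The observations are hypothetical; the variable names simply mirror the symbols on the card.

```python
# Hypothetical observations x_i (assumed data).
observations = [2, 4, 6, 8]

n = len(observations)          # N, the total number of observations
mean = sum(observations) / n   # sum of all x_i divided by N

print(mean)  # 5.0
```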

12
Q

it has the most mathematical properties and is the most representative of the data set when there are no outliers

A

mean, x̄
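
The caveat about outliers can be illustrated with a small sketch. The numbers are invented; the point is only that a single extreme value pulls the mean noticeably while the median, shown for comparison, stays put.

```python
import statistics

# Hypothetical data without and with one extreme value.
typical = [10, 12, 11, 13, 12]
with_outlier = typical + [100]

mean_before = statistics.mean(typical)
mean_after = statistics.mean(with_outlier)

median_before = statistics.median(typical)
median_after = statistics.median(with_outlier)

print(mean_before, round(mean_after, 2))
print(median_before, median_after)
```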

13
Q

for the median, the middle value is taken after the data have been arranged from ____ to ____

A

highest to lowest
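
A minimal sketch of finding the median by ordering the data and taking the middle. The helper name `median_of` and the sample values are made up for illustration. Note that the middle value is the same whether the data are sorted highest to lowest or lowest to highest.

```python
def median_of(values, descending=True):
    """Return the middle of the ordered values (hypothetical helper)."""
    ordered = sorted(values, reverse=descending)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([7, 1, 5]))
print(median_of([7, 1, 5, 3]))
```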

14
Q

median is
frequently used in -
rarely -

A
  • healthcare and economics
  • used to make inferential conclusion from
15
Q

it is the most commonly observed value

A

mode

16
Q

true or false:
mode is frequently used in statistics

A

false - it is seldom used

17
Q

arithmetic mean:
weighted mean:

A

arithmetic mean: every individual observation counts equally
weighted mean: calculated by multiplying each value by the weight associated with that particular outcome, then dividing by the total weight (e.g., a grading system)
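
Using the grading-system example, the sketch below compares the two means. The scores and weights are hypothetical (exams weighted 5, quizzes 3, homework 2).

```python
# Hypothetical component scores and their weights (assumed data).
scores = [90, 80, 70]   # exam, quiz, homework
weights = [5, 3, 2]     # relative importance of each component

# Arithmetic mean: every score counts equally.
arithmetic_mean = sum(scores) / len(scores)

# Weighted mean: multiply each score by its weight, then divide by total weight.
weighted_mean = sum(s * w for s, w in zip(scores, weights)) / sum(weights)

print(arithmetic_mean, weighted_mean)  # 80.0 83.0
```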

18
Q

what are the downsides of using the mode as a measure of central tendency

A

a data set may have no mode
a data set may have more than one mode
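
Both downsides show up quickly with Python's `statistics.multimode` (available in Python 3.8+), which returns every value tied for the highest count. The data sets are invented for the example.

```python
import statistics

# One clear mode.
single = statistics.multimode([1, 2, 2, 3])

# Two values tie for most frequent: more than one mode.
bimodal = statistics.multimode([1, 2, 2, 3, 3])

# Every value appears once, so all of them tie; in practice
# such a data set is usually described as having no mode.
no_clear_mode = statistics.multimode([1, 2, 3])

print(single, bimodal, no_clear_mode)
```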

19
Q

it is a statistical measurement of the spread between numbers in a data set

A

variance

20
Q

The difference between the observed value of a data point and the expected value is known as deviation in statistics.

A

mean deviation

it is the average absolute deviation of the data points from the mean, median or mode of the data set.

20
Q

It measures how far each number in the set is from the mean and thus from every other in the set

A

variance
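
A short sketch of the variance as the average squared distance from the mean. The data are hypothetical, and the population form (dividing by N) is assumed here; the sample variance would divide by N - 1 instead.

```python
import math
import statistics

# Hypothetical data set (assumed).
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(values) / len(values)

# Average of the squared deviations from the mean (population variance).
variance = sum((x - mean) ** 2 for x in values) / len(values)

# The standard deviation is the square root of the variance.
sd = math.sqrt(variance)

print(variance, sd)  # 4.0 2.0
```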

21
Q

it is the average amount of variability in your dataset.

A

SD

22
Q

mean deviation is aka

A

mean absolute deviation
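
The mean absolute deviation can be sketched as follows, taking deviations from the mean (the median or mode could be used as the reference point instead). The data are made up.

```python
# Hypothetical data set (assumed).
values = [2, 4, 6, 8]

center = sum(values) / len(values)  # here the mean is the reference point

# Average of the absolute distances from the center.
mean_abs_dev = sum(abs(x - center) for x in values) / len(values)

print(mean_abs_dev)  # 2.0
```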

23
Q

values that split sorted data or a probability distribution into equal parts

A

quantiles

24
Q

a statistical term that describes a division of observations into four defined intervals based on the values of the data and how they compare to the entire set of observations.

A

quartiles

lower quartile (Q1)
median (Q2)
upper quartile (Q3)
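
As an illustration, `statistics.quantiles` with `n=4` returns the three cut points that split the data into four intervals. The data are hypothetical, and the default "exclusive" method is assumed; other quartile conventions can give slightly different cut points.

```python
import statistics

# Hypothetical data set (assumed).
data = [1, 2, 3, 4, 5, 6, 7]

# Three cut points divide the sorted data into four intervals.
q1, q2, q3 = statistics.quantiles(data, n=4)

print(q1, q2, q3)  # 2.0 4.0 6.0
```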

25
Q

how to calculate percentiles

A

data ordered from lowest to highest
divided into 100 equal parts
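
The two steps above can be sketched with `statistics.quantiles` and `n=100`, which yields the 99 cut points between the 100 parts. The data (the integers 0 to 100) are chosen only to make the result easy to read, and the "inclusive" method is an assumption; other percentile conventions exist.

```python
import statistics

# Hypothetical data, ordered from lowest to highest.
data = sorted(range(101))  # 0, 1, ..., 100

# 99 cut points split the data into 100 equal parts.
percentiles = statistics.quantiles(data, n=100, method="inclusive")

p25 = percentiles[24]  # 25th percentile
p50 = percentiles[49]  # 50th percentile, the median
p90 = percentiles[89]  # 90th percentile

print(p25, p50, p90)
```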

26
Q

how to find range

A

highest value minus lowest value

27
Q

In descriptive statistics, the range of a set of data is the size of the narrowest interval which contains all the data.

A

range
IQR

28
Q

it is defined as the difference between the third and the first quartile.

A

IQR
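
A minimal sketch of the IQR next to the range, using invented data with one extreme value. Quartiles come from `statistics.quantiles` with its default method, which is an assumption; other quartile conventions may give a slightly different IQR.

```python
import statistics

# Hypothetical data with one extreme value (assumed).
data = [1, 2, 3, 4, 5, 6, 70]

q1, _median, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1                       # third quartile minus first quartile
value_range = max(data) - min(data)  # highest value minus lowest value

print(iqr, value_range)  # 4.0 69
```

The extreme value inflates the range but leaves the IQR unchanged, since the IQR only looks at the middle half of the data.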

29
Q

a measure of asymmetry of a distribution
explain why the distribution is asymmetric

A

skewness (horizontal imbalance)
because the left and right sides of the distribution are not mirror images
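
One common way to put a number on skewness is the moment-based coefficient: the third central moment divided by the 1.5 power of the second. The sketch below assumes that definition (others exist) and uses made-up data; the helper name `skewness` is hypothetical.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (one of several definitions)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]              # mirror image around 3
right_tail = [1, 1, 2, 2, 10]            # long tail to the right
left_tail = [-x for x in right_tail]     # the same data mirrored

print(skewness(symmetric), skewness(right_tail), skewness(left_tail))
```

A symmetric data set gives zero, a long right tail gives a positive value, and a long left tail gives a negative one.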

30
Q

it is used to help measure how data disperse between a distribution’s center and tails, with larger values indicating a data distribution may have “heavy” tails that are thickly concentrated with observations or that are long with extreme observations

A

kurtosis (vertical imbalance)
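
Kurtosis is often computed as the fourth central moment divided by the square of the second; a normal distribution scores 3 on this measure. The sketch below assumes that definition and uses invented data; the helper name `kurtosis` is hypothetical.

```python
def kurtosis(values):
    """Moment-based kurtosis: m4 / m2**2 (not the 'excess' form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2

# Evenly spread values: light tails.
light_tails = [1, 2, 3, 4, 5]

# Mostly central values with two extreme observations: heavy tails.
heavy_tails = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]

print(kurtosis(light_tails), kurtosis(heavy_tails))
```

The data set with extreme observations in its tails produces the larger value.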