Module 3 Flashcards

1
Q

when does the process of data analysis begin?

A

at the very start of any project, long before any data are collected or analyzed.

2
Q

what are the principles of measurement?

A
  • reliability
  • validity
3
Q

define reliability

A

the repeatability and consistency of a series of measurements

4
Q

define validity

A

the accuracy of a measurement: whether it measures what it is intended to measure

5
Q

what is a descriptive technique?

A

helps you find outliers or missing information that may harm your analysis and bias your interpretation

6
Q

what is inferential statistics?

A

allows you to make generalizations about populations from the sample that is collected

**used to make inferences

7
Q

what is descriptive statistics?

A

- used to organize, summarize, and report data
- can summarize large amounts of data in just a few numbers or a simple graphic display
- descriptive statistics = data reduction

8
Q

what is the first step in any data analysis?

A
  • describe the data
    **describing the data is like performing the history and physical exam on a patient before prescribing treatment and further testing
    **descriptive statistics are the 'vital signs' of data
9
Q

what does the level of measurement dictate?

A

-dictates what types of descriptive statistics can be performed

10
Q

what are the types of categorical data?

A
  • Nominal (names)
    • gender, race, disease status
  • Ordinal (ranked)
    • satisfaction with care, pain, cancer stages
11
Q

descriptive statistics for categorical data (3)

A
  • Frequencies (counts) (n)
  • Relative frequencies presented as percentages (%)
  • graphic display
    • bar chart for nominal
    • histogram for ordinal; if a histogram is not available, use a bar chart
12
Q

what does a frequency table do?

A

it takes a disorganized set of scores and groups together all individuals who have the same scores

13
Q

relative frequency

A

measures the fraction of the entire group that is associated with each score
-> to compute the percentage associated with each score, first find the relative frequency (the frequency divided by the total number of scores), then multiply by 100
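
A minimal sketch of a frequency table with relative frequencies and percentages. The pain ratings and the helper name `frequency_table` are invented for illustration and are not part of the module.

```python
from collections import Counter

# Hypothetical pain ratings (ordinal) for ten patients; invented for illustration.
scores = ["mild", "severe", "mild", "moderate", "mild",
          "moderate", "severe", "mild", "moderate", "mild"]

def frequency_table(values):
    """Return {score: (count, relative frequency, percentage)}."""
    counts = Counter(values)   # groups together everyone with the same score
    total = len(values)
    return {
        score: (n, n / total, 100 * n / total)
        for score, n in counts.items()
    }

table = frequency_table(scores)
for score, (n, rel, pct) in table.items():
    print(f"{score}: n={n}, relative frequency={rel:.2f}, {pct:.0f}%")
```

Each percentage is the relative frequency multiplied by 100, so the percentages across all scores add up to 100.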

14
Q

what are types of continuous data?

A
  • interval level
  • ratio
15
Q

what is interval level?

A
  • equal intervals between each number on the scale
  • no ‘true’ zero
16
Q

what is ratio level?

A
  • equal intervals between each number on the scale
  • there is a true zero
17
Q

description of continuous data (3)

A
  • measures of central location (central tendency)
  • measures of spread (dispersion)
  • graphic display (histograms, stem-and-leaf plots, box-and-whisker plots)
18
Q

what are the measures of central location?

A

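- mean: add all scores and divide by the number of scores
- median: 50% of the values are above the median and 50% are below the median
- mode: the most frequently occurring value
- geometric mean: the nth root of the product of n values
- midrange: the average of the smallest and largest values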
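
A minimal sketch of these measures using Python's standard `statistics` module. The lengths of stay are invented, and the value 30 is a deliberate outlier to show how it pulls the mean away from the median.

```python
import statistics

# Hypothetical lengths of stay in days; the value 30 is an invented outlier.
days = [2, 3, 3, 4, 5, 6, 30]

mean_days = statistics.mean(days)      # sum of scores divided by the number of scores
median_days = statistics.median(days)  # middle value of the sorted scores
mode_days = statistics.mode(days)      # most frequently occurring value
geometric_mean_days = statistics.geometric_mean(days)  # nth root of the product
midrange_days = (min(days) + max(days)) / 2            # average of smallest and largest

print(mean_days, median_days, mode_days, geometric_mean_days, midrange_days)
```

With these values the single outlier raises the mean well above the median, which is one reason the median is preferred when outliers are present.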

19
Q

measures of central location: the decision of whether to use the mean or median is based on:

A
  • the shape of the distribution (normal or asymmetrical)
  • the presence or absence of outliers
20
Q

measures of central location: mean is used when

A
  • the data are normally distributed AND
  • there are no outliers
21
Q

measures of central location: median is used when

A
  • the data are NOT normally distributed OR
  • there are any outliers (even 1)
22
Q

measures of spread (7)

A
  • statistical range
  • epidemiological range
  • percentiles
  • quartiles
  • interquartile range (IQR)
  • standard deviation (SD)
  • coefficient of variation (CV)
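
A minimal sketch of quartiles, the interquartile range, and the coefficient of variation. The values are invented, and the quartile method (`method="inclusive"`) is an assumption: statistical packages use several quartile conventions that can give slightly different results.

```python
import statistics

# Hypothetical measurements; invented for illustration.
values = [20, 25, 30, 35, 40, 45, 50]

# Quartiles are the 25th, 50th, and 75th percentiles.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Interquartile range: the spread of the middle 50% of the data.
iqr = q3 - q1

# Coefficient of variation: the standard deviation as a percentage of the mean.
cv = statistics.stdev(values) / statistics.mean(values) * 100

print(q1, q2, q3, iqr, round(cv, 2))
```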
23
Q

statistical range

A
  • one number
  • calculated as the largest value minus the smallest value
  • if the largest value was 50 and the smallest was 20, the statistical range would be 30
24
Q

epidemiological range

A
  • two numbers
  • both the minimum and the maximum values are reported
  • if the largest value was 50 and the smallest was 20, the epidemiological range would be presented as 20, 50
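
A minimal sketch contrasting the two ranges on the example of a smallest value of 20 and a largest value of 50. The function names and the intermediate values are illustrative only.

```python
def statistical_range(values):
    """One number: the largest value minus the smallest value."""
    return max(values) - min(values)

def epidemiological_range(values):
    """Two numbers: the minimum and the maximum, reported together."""
    return (min(values), max(values))

# Hypothetical data whose smallest value is 20 and largest value is 50.
values = [20, 35, 50, 42, 28]

print(statistical_range(values))
print(epidemiological_range(values))
```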
25
Q

standard deviation

A

a measure of the average distance between the observations and the mean
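
A minimal sketch of the calculation, with invented values. Strictly, the standard deviation is the square root of the averaged squared distances from the mean; the sample version divides by n - 1 and the population version divides by n.

```python
import math
import statistics

# Hypothetical observations; invented for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(values) / len(values)

# Squared distance of each observation from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]

# Sample SD divides by n - 1; population SD divides by n.
sample_sd = math.sqrt(sum(squared_deviations) / (len(values) - 1))
population_sd = math.sqrt(sum(squared_deviations) / len(values))

# The standard library gives the same results.
print(sample_sd, statistics.stdev(values))
print(population_sd, statistics.pstdev(values))
```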

26
Q

what are the most common values used to describe a set of data?

A

the mean and standard deviation

27
Q

the most commonly used measures of spread are: (3)

A
  • standard deviation
  • interquartile range
  • epidemiological range
28
Q

what is the measure of spread determined by?

A

the choice of measure of central tendency (standard deviation with the mean, interquartile range with the median)

29
Q

data display for continuous data

A
  • histogram
  • stem-and-leaf plot: similar to a histogram
    • shows the overall shape of the distribution
    • shows individual data values
  • box-and-whisker plot: conveys more information than either the histogram or the stem-and-leaf plot
    • depicts the:
      • overall distribution
      • center of the distribution
      • quartiles
      • outliers
    **good for comparison across groups
30
Q

which tests are used to assess normality?

A
  • Shapiro-Wilk test
  • Kolmogorov-Smirnov test
31
Q

description of continuous data (4)

A
  • frequency (number of observations)
  • measures of central location
  • measures of spread
  • graphic display (histogram, stem and leaf plot, box and whisker plots)
32
Q

how is the presence of outliers assessed?

A

with the box-and-whisker plot with Tukey fences
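
A minimal sketch of Tukey fences, which flag values more than 1.5 times the interquartile range below the first quartile or above the third quartile. The data, the function name `tukey_outliers`, and the quartile method are assumptions for illustration; software packages may compute quartiles slightly differently.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return the values that fall outside the Tukey fences."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical lengths of stay in days; the value 30 is an invented outlier.
days = [2, 3, 3, 4, 5, 6, 30]

print(tukey_outliers(days))
```

On a box-and-whisker plot these flagged values are the points drawn beyond the whiskers.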

33
Q

normality of the distribution: p-value ≥ .05

A

the distribution is treated as normal (the test does not reject normality)

34
Q

normality of the distribution: p-value <.05

A

the distribution is not normal

35
Q

what should be reported if the data follow a normal distribution and there are no outliers?

A

the mean, standard deviation, and epidemiological range should be reported

36
Q

what should be reported if the data are not distributed normally or if there are outliers?

A

the median, interquartile range and epidemiological range should be reported
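
The reporting rules in these cards can be sketched as a small decision function. The function name `choose_summary`, its parameters, and the default cutoff of .05 are illustrative; the p-value is assumed to come from a normality test such as Shapiro-Wilk, and the outlier flag from a box-and-whisker plot with Tukey fences.

```python
def choose_summary(normality_p_value, has_outliers, alpha=0.05):
    """Pick the statistics to report for a continuous variable."""
    # p-value >= .05: the distribution is treated as normal.
    is_normal = normality_p_value >= alpha
    if is_normal and not has_outliers:
        return ("mean", "standard deviation", "epidemiological range")
    # Not normal, or any outlier at all (even one): use the median.
    return ("median", "interquartile range", "epidemiological range")

print(choose_summary(0.20, has_outliers=False))
print(choose_summary(0.20, has_outliers=True))
print(choose_summary(0.01, has_outliers=False))
```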