Module 3 Flashcards

1
Q

when does the process of data analysis begin?

A

at the very start of any project, long before any data are collected or analyzed.

2
Q

what are the principles of measurement?

A
  • reliability
  • validity
3
Q

define reliability

A

the repeatability and consistency of a series of measurements

4
Q

define validity

A

the accuracy of a measurement: whether it measures what it is intended to measure

5
Q

what is descriptive technique

A

helps you find outliers or missing information, which may harm your analysis and bias your interpretation

6
Q

what is inferential statistics

A

allows you to make generalizations about populations from the sample that is collected

**to make inferences

7
Q

what is descriptive statistics?

A

- used to organize, summarize, and report data
- can summarize large amounts of data in just a few numbers or a simple graphic display
- descriptive statistics = data reduction

8
Q

what is the first step in any data analysis?

A
  • describe the data
    **describing the data is like performing the history and physical exam on a patient before prescribing treatment and further testing
    **descriptive statistics are the 'vital signs' of data
9
Q

what does the level of measurement dictate?

A

-dictates what types of descriptive statistics can be performed

10
Q

what are the types of categorical data?

A
  • Nominal (names)
    • gender, race, disease status
  • Ordinal (ranked)
    • satisfaction with care, pain, cancer stages
11
Q

descriptive statistics for categorical data (3)

A
  • Frequencies (counts) (n)
  • Relative frequencies presented as percentages (%)
  • graphic display
    • bar chart for nominal
    • histogram for ordinal; if a histogram is not available, use the bar chart
12
Q

what does a frequency table do?

A

it takes a disorganized set of scores and groups together all individuals who have the same scores

13
Q

relative frequency

A

measures the fraction of the entire group that is associated with each score
→ to compute the percentage associated with each score, first find the relative frequency, then multiply by 100
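
As a rough illustration of a frequency table with relative frequencies and percentages, here is a minimal Python sketch. The pain ratings and the helper name `frequency_table` are made up for the example and are not part of the course material.

```python
from collections import Counter

# Hypothetical ordinal data: pain ratings for ten patients.
scores = ["mild", "severe", "mild", "moderate", "mild",
          "moderate", "severe", "mild", "moderate", "mild"]

def frequency_table(values):
    """Group identical scores and return {score: (n, relative frequency, percent)}."""
    counts = Counter(values)   # frequency (n) of each distinct score
    total = len(values)
    return {
        score: (n, n / total, 100 * n / total)   # percent = relative frequency * 100
        for score, n in counts.items()
    }

table = frequency_table(scores)
for score, (n, rel, pct) in table.items():
    print(f"{score:<10} n={n}  relative={rel:.2f}  percent={pct:.0f}%")
```

The relative frequencies across all scores add up to 1, and the percentages add up to 100.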

14
Q

what are types of continuous data?

A
  • interval level
  • ratio
15
Q

what is interval level?

A
  • equal intervals between each number on the scale
  • no ‘true’ zero
16
Q

what is ratio

A
  • equal intervals between each number on the scale
  • there is a true zero
17
Q

description of continuous data (3)

A
  • measures of central location (central tendency)
  • measures of spread (dispersion)
  • graphic display (histograms, stem-and-leaf plots, box-and-whisker plots)
18
Q

what are the measures of central location

A

-mean: add all scores and divide by the number of scores
-median: 50% of the values are above the median and 50% are below the median
-mode: the most frequently occurring value
-geometric mean
-midrange: the midpoint between the smallest and largest values
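
The measures of central location can be sketched with Python's standard `statistics` module. The data values are hypothetical, and the midrange is computed by hand as the midpoint of the smallest and largest values.

```python
import statistics

# Hypothetical scores, made up for illustration.
values = [2, 4, 4, 8, 16]

mean = statistics.mean(values)                      # sum of scores / number of scores
median = statistics.median(values)                  # middle value of the sorted scores
mode = statistics.mode(values)                      # most frequently occurring score
geometric_mean = statistics.geometric_mean(values)  # nth root of the product of n scores
midrange = (min(values) + max(values)) / 2          # midpoint of smallest and largest

print(mean, median, mode, round(geometric_mean, 3), midrange)
```

Note how the single large value (16) pulls the mean above the median, which is why the median is preferred when outliers are present.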

19
Q

measures of central location: the decision of whether to use the mean or median is based on:

A
  • the shape of the distribution (normal or asymmetrical)
  • the presence or absence of outliers
20
Q

measures of central location: mean is used when

A
  • the data are normally distributed AND
  • there are no outliers
21
Q

measures of central location: median is used when

A
  • the data are NOT normally distributed OR
  • there are any outliers (even 1)
22
Q

measures of spread (7)

A
  • statistical range
  • epidemiological range
  • percentiles
  • quartiles
  • interquartile range (IQR)
  • standard deviation (SD)
  • coefficient of variation (CV)
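
A minimal sketch of quartiles, the IQR, and the coefficient of variation using Python's standard `statistics` module. The sample is hypothetical, and the CV is taken here as the SD divided by the mean, expressed as a percentage.

```python
import statistics

# Hypothetical sample; the values are made up for illustration.
values = [20, 25, 30, 35, 40, 45, 50]

# Quartiles are the 25th, 50th, and 75th percentiles.
# statistics.quantiles uses the "exclusive" method by default; other
# software may interpolate differently and give slightly different cut points.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1                        # interquartile range: Q3 minus Q1

# Coefficient of variation: SD relative to the mean, as a percentage.
sd = statistics.stdev(values)        # sample standard deviation
cv_percent = 100 * sd / statistics.mean(values)

print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
print(f"SD={sd:.2f}, CV={cv_percent:.1f}%")
```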
23
Q

statistical range

A
  • one number
  • calculated as the largest value minus the smallest value
  • if the largest value was 50 and the smallest was 20, the statistical range would be 30
24
Q

epidemiological range

A
  • two numbers
  • both the minimum and the maximum values are reported
  • if the largest value was 50 and the smallest was 20, the epidemiological range would be presented as 20, 50
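
The difference between the two ranges can be sketched in a few lines of Python. The function names and the two middle data values are assumptions for illustration; the minimum of 20 and maximum of 50 follow the cards above.

```python
def statistical_range(values):
    """One number: largest value minus smallest value."""
    return max(values) - min(values)

def epidemiological_range(values):
    """Two numbers: the minimum and the maximum, reported as a pair."""
    return (min(values), max(values))

values = [20, 35, 42, 50]   # hypothetical data with minimum 20 and maximum 50

print(statistical_range(values))       # 30
print(epidemiological_range(values))   # (20, 50)
```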
25
standard deviation
a measure of the average distance between the observations and the mean
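
As a sketch of how the standard deviation is usually computed, the steps below use the sample formula (dividing by n - 1), which is an assumption, since the card does not say whether the sample or population version is meant. Strictly, the SD is the square root of the average squared distance from the mean rather than a plain average of distances.

```python
import math
import statistics

def sample_sd(values):
    """Sample standard deviation computed step by step."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))   # n - 1 for a sample

values = [20, 25, 30, 35, 40, 45, 50]   # hypothetical data
sd = sample_sd(values)

# The hand-rolled result should agree with the standard library.
print(round(sd, 3), round(statistics.stdev(values), 3))
```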
26
what are the most common values used to describe a set of data?
the mean and standard deviation
27
the most commonly used measures of spread are: (3)
  • standard deviation
  • interquartile range
  • epidemiological range
28
what is the measure of spread determined by?
choice of a measure of central tendency
29
data display for continuous data
  • histogram
  • stem-and-leaf plot: similar to a histogram
    • shows the overall shape of the distribution
    • shows individual data values
  • box-and-whisker plot: conveys more information than both the histogram and the stem-and-leaf plot
    • depicts the overall distribution, the center of the distribution, the quartiles, and outliers
    **good for comparison across groups
30
what tests are used to assess normality?
  • Shapiro-Wilk test
  • Kolmogorov-Smirnov test
31
description of continuous data (4)
  • frequency (number of observations)
  • measures of central location
  • measures of spread
  • graphic display (histogram, stem-and-leaf plot, box-and-whisker plot)
32
how is the presence of outliers assessed?
with a box-and-whisker plot using Tukey fences
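
A minimal sketch of Tukey fences, assuming the conventional multiplier of 1.5 times the IQR (the card does not state the multiplier). Values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR are flagged as outliers. The function name and data are hypothetical.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return the values that fall outside the Tukey fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)   # default "exclusive" method
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]

values = [20, 25, 30, 35, 40, 45, 120]   # hypothetical data with one extreme value

print(tukey_outliers(values))   # [120]
```

Here Q1 = 25 and Q3 = 45, so the IQR is 20 and the fences sit at -5 and 75; only 120 falls outside them.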
33
normality of the distribution: p-value ≥ .05
normal distribution
34
normality of the distribution: p-value <.05
the distribution is not normal
35
what should be reported if the data follow a normal distribution and there are no outliers?
the mean, standard deviation, and epidemiological range should be reported
36
what should be reported if the data are not distributed normally or if there are outliers?
the median, interquartile range and epidemiological range should be reported
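
The reporting rule from the last few cards can be sketched as one function. The function name `summarize`, its parameters, and the example p-values are assumptions for illustration; the normality p-value is expected to come from a test such as Shapiro-Wilk that is run elsewhere.

```python
import statistics

def summarize(values, normality_p, has_outliers, alpha=0.05):
    """Pick the summary statistics to report.

    normality_p: p-value from a normality test, computed elsewhere.
    has_outliers: True if any outlier was found (e.g. with Tukey fences).
    """
    epi_range = (min(values), max(values))   # epidemiological range
    if normality_p >= alpha and not has_outliers:
        # Normal distribution and no outliers: mean and standard deviation.
        return {"center": ("mean", statistics.mean(values)),
                "spread": ("SD", statistics.stdev(values)),
                "range": epi_range}
    # Not normal, or at least one outlier: median and interquartile range.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {"center": ("median", statistics.median(values)),
            "spread": ("IQR", q3 - q1),
            "range": epi_range}

normal_case = summarize([20, 25, 30, 35, 40, 45, 50], normality_p=0.40, has_outliers=False)
skewed_case = summarize([20, 25, 30, 35, 40, 45, 120], normality_p=0.01, has_outliers=True)
print(normal_case)
print(skewed_case)
```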